<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
<meta name="generator" content="Docutils 0.4.1: http://docutils.sourceforge.net/" />
<title>XPath and XSLT with lxml</title>
<link rel="stylesheet" href="style.css" type="text/css" />
</head>
<body>
<div class="document" id="xpath-and-xslt-with-lxml">
<h1 class="title">XPath and XSLT with lxml</h1>
<p>lxml supports both XPath and XSLT through libxml2 and libxslt in a
standards-compliant way.</p>
<div class="contents topic">
<p class="topic-title first"><a id="contents" name="contents">Contents</a></p>
<ul class="simple">
<li><a class="reference" href="#xpath" id="id1" name="id1">XPath</a><ul>
<li><a class="reference" href="#the-xpath-method" id="id2" name="id2">The <tt class="docutils literal"><span class="pre">xpath()</span></tt> method</a></li>
<li><a class="reference" href="#xpath-return-values" id="id3" name="id3">XPath return values</a></li>
<li><a class="reference" href="#generating-xpath-expressions" id="id4" name="id4">Generating XPath expressions</a></li>
<li><a class="reference" href="#the-xpath-class" id="id5" name="id5">The <tt class="docutils literal"><span class="pre">XPath</span></tt> class</a></li>
<li><a class="reference" href="#the-xpathevaluator-classes" id="id6" name="id6">The <tt class="docutils literal"><span class="pre">XPathEvaluator</span></tt> classes</a></li>
<li><a class="reference" href="#etxpath" id="id7" name="id7"><tt class="docutils literal"><span class="pre">ETXPath</span></tt></a></li>
<li><a class="reference" href="#error-handling" id="id8" name="id8">Error handling</a></li>
</ul>
</li>
<li><a class="reference" href="#xslt" id="id9" name="id9">XSLT</a><ul>
<li><a class="reference" href="#xslt-result-objects" id="id10" name="id10">XSLT result objects</a></li>
<li><a class="reference" href="#stylesheet-parameters" id="id11" name="id11">Stylesheet parameters</a></li>
<li><a class="reference" href="#the-xslt-tree-method" id="id12" name="id12">The <tt class="docutils literal"><span class="pre">xslt()</span></tt> tree method</a></li>
<li><a class="reference" href="#profiling" id="id13" name="id13">Profiling</a></li>
</ul>
</li>
</ul>
</div>
<p>The usual setup procedure:</p>
<pre class="literal-block">
&gt;&gt;&gt; from lxml import etree
&gt;&gt;&gt; from StringIO import StringIO
</pre>
<div class="section">
<h1><a id="xpath" name="xpath">XPath</a></h1>
<p>lxml.etree supports the simple path syntax of the <a class="reference" href="http://effbot.org/zone/element.htm#searching-for-subelements">find, findall and
findtext</a> methods on ElementTree and Element, as known from the original
ElementTree library (<a class="reference" href="http://effbot.org/zone/element-xpath.htm">ElementPath</a>).  As an lxml specific extension, these
classes also provide an <tt class="docutils literal"><span class="pre">xpath()</span></tt> method that supports expressions in the
complete XPath syntax, as well as <a class="reference" href="extensions.html">custom extension functions</a>.</p>
<p>There are also specialized XPath evaluator classes that are more efficient for
frequent evaluation: <tt class="docutils literal"><span class="pre">XPath</span></tt> and <tt class="docutils literal"><span class="pre">XPathEvaluator</span></tt>.  See the <a class="reference" href="performance.html#xpath">performance
comparison</a> to learn when to use which.  Their semantics when used on
Elements and ElementTrees are the same as for the <tt class="docutils literal"><span class="pre">xpath()</span></tt> method described
here.</p>
<div class="section">
<h2><a id="the-xpath-method" name="the-xpath-method">The <tt class="docutils literal"><span class="pre">xpath()</span></tt> method</a></h2>
<p>For ElementTree, the xpath method performs a global XPath query against the
document (if absolute) or against the root node (if relative):</p>
<pre class="literal-block">
&gt;&gt;&gt; f = StringIO('&lt;foo&gt;&lt;bar&gt;&lt;/bar&gt;&lt;/foo&gt;')
&gt;&gt;&gt; tree = etree.parse(f)

&gt;&gt;&gt; r = tree.xpath('/foo/bar')
&gt;&gt;&gt; len(r)
1
&gt;&gt;&gt; r[0].tag
'bar'

&gt;&gt;&gt; r = tree.xpath('bar')
&gt;&gt;&gt; r[0].tag
'bar'
</pre>
<p>When <tt class="docutils literal"><span class="pre">xpath()</span></tt> is used on an Element, the XPath expression is evaluated
against the element (if relative) or against the root tree (if absolute):</p>
<pre class="literal-block">
&gt;&gt;&gt; root = tree.getroot()
&gt;&gt;&gt; r = root.xpath('bar')
&gt;&gt;&gt; r[0].tag
'bar'

&gt;&gt;&gt; bar = root[0]
&gt;&gt;&gt; r = bar.xpath('/foo/bar')
&gt;&gt;&gt; r[0].tag
'bar'

&gt;&gt;&gt; tree = bar.getroottree()
&gt;&gt;&gt; r = tree.xpath('/foo/bar')
&gt;&gt;&gt; r[0].tag
'bar'
</pre>
<p>The <tt class="docutils literal"><span class="pre">xpath()</span></tt> method has support for XPath variables:</p>
<pre class="literal-block">
&gt;&gt;&gt; expr = "//*[local-name() = $name]"

&gt;&gt;&gt; print root.xpath(expr, name = "foo")[0].tag
foo

&gt;&gt;&gt; print root.xpath(expr, name = "bar")[0].tag
bar

&gt;&gt;&gt; print root.xpath("$text", text = "Hello World!")
Hello World!
</pre>
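<p>One practical use of variables is passing values that would be awkward to
quote inside the expression itself.  The following is a minimal sketch,
assuming Python 3 and a recent lxml release (the rest of this page uses
Python 2 syntax); the document content and the variable name <tt class="docutils literal"><span class="pre">title</span></tt>
are invented for illustration:</p>

```python
from lxml import etree

# Hypothetical document: one title contains both kinds of quote characters.
root = etree.XML(
    '<books>'
    '<book><title>Plain</title></book>'
    '<book><title>It\'s a "test"</title></book>'
    '</books>')

wanted = 'It\'s a "test"'

# The value is bound to $title, so it never has to be quoted as an
# XPath string literal inside the expression.
matches = root.xpath("//book[title = $title]", title=wanted)

print(len(matches))
print(matches[0][0].text)
```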
<p>Optionally, you can provide a <tt class="docutils literal"><span class="pre">namespaces</span></tt> keyword argument, which should be
a dictionary mapping the namespace prefixes used in the XPath expression to
namespace URIs:</p>
<pre class="literal-block">
&gt;&gt;&gt; f = StringIO('''\
... &lt;a:foo xmlns:a="http://codespeak.net/ns/test1"
...        xmlns:b="http://codespeak.net/ns/test2"&gt;
...    &lt;b:bar&gt;Text&lt;/b:bar&gt;
... &lt;/a:foo&gt;
... ''')
&gt;&gt;&gt; doc = etree.parse(f)

&gt;&gt;&gt; r = doc.xpath('/t:foo/b:bar',
...               namespaces={'t': 'http://codespeak.net/ns/test1',
...                           'b': 'http://codespeak.net/ns/test2'})
&gt;&gt;&gt; len(r)
1
&gt;&gt;&gt; r[0].tag
'{http://codespeak.net/ns/test2}bar'
&gt;&gt;&gt; r[0].text
'Text'
</pre>
<p>There is also an optional <tt class="docutils literal"><span class="pre">extensions</span></tt> argument, which is used to define
<a class="reference" href="extensions.html">custom extension functions</a> in Python that are local to this evaluation.</p>
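<p>As a rough sketch of what such a local extension can look like, the example
below registers one Python function for a single <tt class="docutils literal"><span class="pre">xpath()</span></tt> call.  The
function name <tt class="docutils literal"><span class="pre">shout</span></tt> and the sample document are hypothetical, and the
code assumes Python 3 with a recent lxml release:</p>

```python
from lxml import etree

def shout(context, text):
    # Extension functions receive the evaluation context as their first
    # argument, followed by the XPath arguments (here: one string).
    return text.upper() + "!"

root = etree.XML("<root><a>hello</a></root>")

# Keys are (namespace URI, function name) pairs.  Using None as the
# namespace makes the function callable without a prefix.
extensions = {(None, "shout"): shout}

result = root.xpath("shout(string(//a))", extensions=extensions)
print(result)
```

<p>The function is only known to this one evaluation; other calls to
<tt class="docutils literal"><span class="pre">xpath()</span></tt> do not see it unless the same mapping is passed again.</p>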
</div>
<div class="section">
<h2><a id="xpath-return-values" name="xpath-return-values">XPath return values</a></h2>
<p>The return values of XPath evaluations vary, depending on the XPath expression
used:</p>
<ul class="simple">
<li>True or False, when the XPath expression has a boolean result</li>
<li>a float, when the XPath expression has a numeric result (integer or float)</li>
<li>a 'smart' string (as described below), when the XPath expression has
a string result</li>
<li>a list of items, when the XPath expression has a list as result.
The items may include Elements (also comments and processing
instructions), strings and tuples.  Text nodes and attributes in the
result are returned as 'smart' string values.  Namespace
declarations are returned as tuples of strings: <tt class="docutils literal"><span class="pre">(prefix,</span> <span class="pre">URI)</span></tt>.</li>
</ul>
<p>XPath string results are 'smart' in that they provide a
<tt class="docutils literal"><span class="pre">getparent()</span></tt> method that knows their origin:</p>
<ul class="simple">
<li>for attribute values, <tt class="docutils literal"><span class="pre">result.getparent()</span></tt> returns the Element
that carries them.  An example is <tt class="docutils literal"><span class="pre">//foo/@attribute</span></tt>, where the
parent would be a <tt class="docutils literal"><span class="pre">foo</span></tt> Element.</li>
<li>for text nodes selected with <tt class="docutils literal"><span class="pre">text()</span></tt> (as in <tt class="docutils literal"><span class="pre">//text()</span></tt>), it returns the
Element that contains the text or tail that was returned.</li>
</ul>
<p>You can distinguish between different text origins with the boolean
properties <tt class="docutils literal"><span class="pre">is_text</span></tt>, <tt class="docutils literal"><span class="pre">is_tail</span></tt> and <tt class="docutils literal"><span class="pre">is_attribute</span></tt>.</p>
<p>Note that <tt class="docutils literal"><span class="pre">getparent()</span></tt> may not always return an Element.  For
example, the XPath functions <tt class="docutils literal"><span class="pre">string()</span></tt> and <tt class="docutils literal"><span class="pre">concat()</span></tt> will
construct strings that do not have an origin.  For them,
<tt class="docutils literal"><span class="pre">getparent()</span></tt> will return None.</p>
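<p>The result types and the string properties described above can be checked
with a small document.  This is a sketch under the assumption of Python 3 and
a recent lxml release; the element names are made up:</p>

```python
from lxml import etree

root = etree.XML('<root><a id="x">text<b/>tail</a></root>')
a = root[0]
b = a[0]

is_present = root.xpath("boolean(//b)")   # boolean result
n_elements = root.xpath("count(//*)")     # numeric result, returned as float
texts = root.xpath("//a/text()")          # list of 'smart' strings
ids = root.xpath("//a/@id")               # list of 'smart' strings
joined = root.xpath("string(//a)")        # string constructed by XPath

first, second = texts

# Text content: the parent is the element that contains the text.
print(first, first.is_text, first.getparent().tag)

# Tail text: the parent is the element that the tail follows.
print(second, second.is_tail, second.getparent().tag)

# Attribute value: the parent is the element that carries the attribute.
print(ids[0], ids[0].is_attribute, ids[0].getparent().tag)

# Constructed strings have no origin in the tree.
print(joined, joined.getparent())
```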
</div>
<div class="section">
<h2><a id="generating-xpath-expressions" name="generating-xpath-expressions">Generating XPath expressions</a></h2>
<p>ElementTree objects have a method <tt class="docutils literal"><span class="pre">getpath(element)</span></tt>, which returns a
structural, absolute XPath expression to find that element:</p>
<pre class="literal-block">
&gt;&gt;&gt; a  = etree.Element("a")
&gt;&gt;&gt; b  = etree.SubElement(a, "b")
&gt;&gt;&gt; c  = etree.SubElement(a, "c")
&gt;&gt;&gt; d1 = etree.SubElement(c, "d")
&gt;&gt;&gt; d2 = etree.SubElement(c, "d")

&gt;&gt;&gt; tree = etree.ElementTree(c)
&gt;&gt;&gt; print tree.getpath(d2)
/c/d[2]
&gt;&gt;&gt; tree.xpath(tree.getpath(d2)) == [d2]
True
</pre>
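<p>A generated path can be fed back into <tt class="docutils literal"><span class="pre">xpath()</span></tt> for every element of a
tree, not just one.  The sketch below assumes Python 3 and a recent lxml
release, and uses an invented document:</p>

```python
from lxml import etree

root = etree.XML("<a><b/><c><d/><d/></c></a>")
tree = etree.ElementTree(root)

# One structural path per element, in document order.
paths = [tree.getpath(element) for element in root.iter()]

# Evaluating a path again selects exactly the element it was built from.
round_trips = [tree.xpath(tree.getpath(element)) == [element]
               for element in root.iter()]

print(paths[0], paths[-1])
print(all(round_trips))
```

<p>Positional predicates such as <tt class="docutils literal"><span class="pre">[2]</span></tt> make these paths sensitive to later
changes in the tree, so they are best used right after they were generated.</p>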
</div>
<div class="section">
<h2><a id="the-xpath-class" name="the-xpath-class">The <tt class="docutils literal"><span class="pre">XPath</span></tt> class</a></h2>
<p>The <tt class="docutils literal"><span class="pre">XPath</span></tt> class compiles an XPath expression into a callable function:</p>
<pre class="literal-block">
&gt;&gt;&gt; root = etree.XML("&lt;root&gt;&lt;a&gt;&lt;b/&gt;&lt;/a&gt;&lt;b/&gt;&lt;/root&gt;")

&gt;&gt;&gt; find = etree.XPath("//b")
&gt;&gt;&gt; print find(root)[0].tag
b
</pre>
<p>The compilation takes as much time as in the <tt class="docutils literal"><span class="pre">xpath()</span></tt> method, but it is
done only once per class instantiation.  This makes it especially efficient
for repeated evaluation of the same XPath expression.</p>
<p>Just like the <tt class="docutils literal"><span class="pre">xpath()</span></tt> method, the <tt class="docutils literal"><span class="pre">XPath</span></tt> class supports XPath
variables:</p>
<pre class="literal-block">
&gt;&gt;&gt; count_elements = etree.XPath("count(//*[local-name() = $name])")

&gt;&gt;&gt; print count_elements(root, name = "a")
1.0
&gt;&gt;&gt; print count_elements(root, name = "b")
2.0
</pre>
<p>This supports very efficient evaluation of modified versions of an XPath
expression, as compilation is still only required once.</p>
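<p>A compiled expression is not tied to one document either.  As a minimal
sketch (assuming Python 3 and a recent lxml release, with invented sample
documents), the same <tt class="docutils literal"><span class="pre">XPath</span></tt> object can be applied to several trees:</p>

```python
from lxml import etree

# Compiled once, then reused for every document and every value of $name.
count_by_name = etree.XPath("count(//*[local-name() = $name])")

documents = [
    etree.XML("<root><a><b/></a><b/></root>"),
    etree.XML("<root><b/><b/><b/></root>"),
]

counts = [int(count_by_name(doc, name="b")) for doc in documents]
print(counts)
```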
<p>Prefix-to-namespace mappings can be passed as the second parameter, <tt class="docutils literal"><span class="pre">namespaces</span></tt>:</p>
<pre class="literal-block">
&gt;&gt;&gt; root = etree.XML("&lt;root xmlns='NS'&gt;&lt;a&gt;&lt;b/&gt;&lt;/a&gt;&lt;b/&gt;&lt;/root&gt;")

&gt;&gt;&gt; find = etree.XPath("//n:b", namespaces={'n':'NS'})
&gt;&gt;&gt; print find(root)[0].tag
{NS}b
</pre>
<p>By default, <tt class="docutils literal"><span class="pre">XPath</span></tt> supports regular expressions in the <a class="reference" href="http://www.exslt.org/">EXSLT</a> namespace:</p>
<pre class="literal-block">
&gt;&gt;&gt; regexpNS = "http://exslt.org/regular-expressions"
&gt;&gt;&gt; find = etree.XPath("//*[re:test(., '^abc$', 'i')]",
...                    namespaces={'re':regexpNS})

&gt;&gt;&gt; root = etree.XML("&lt;root&gt;&lt;a&gt;aB&lt;/a&gt;&lt;b&gt;aBc&lt;/b&gt;&lt;/root&gt;")
&gt;&gt;&gt; print find(root)[0].text
aBc
</pre>
<p>You can disable this with the boolean keyword argument <tt class="docutils literal"><span class="pre">regexp</span></tt>, which
defaults to True.</p>
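<p>The effect of the switch can be observed by running the same expression with
and without regular expression support.  This sketch assumes Python 3 and a
recent lxml release, and further assumes that the unavailable function is
reported through a subclass of <tt class="docutils literal"><span class="pre">XPathError</span></tt>; the helper name
<tt class="docutils literal"><span class="pre">try_regexp</span></tt> is hypothetical:</p>

```python
from lxml import etree

regexp_ns = "http://exslt.org/regular-expressions"
root = etree.XML("<root><a>aB</a><b>aBc</b></root>")

def try_regexp(enabled):
    # Catch the common superclass, so it does not matter whether the
    # problem is reported at compile time or at evaluation time.
    try:
        find = etree.XPath("//*[re:test(., '^abc$', 'i')]",
                           namespaces={"re": regexp_ns}, regexp=enabled)
        return [element.tag for element in find(root)]
    except etree.XPathError:
        return None

with_regexp = try_regexp(True)
without_regexp = try_regexp(False)

print(with_regexp)
print(without_regexp)
```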
</div>
<div class="section">
<h2><a id="the-xpathevaluator-classes" name="the-xpathevaluator-classes">The <tt class="docutils literal"><span class="pre">XPathEvaluator</span></tt> classes</a></h2>
<p>lxml.etree provides two other efficient XPath evaluators that work on
ElementTrees or Elements respectively: <tt class="docutils literal"><span class="pre">XPathDocumentEvaluator</span></tt> and
<tt class="docutils literal"><span class="pre">XPathElementEvaluator</span></tt>.  They are automatically selected if you use the
XPathEvaluator helper for instantiation:</p>
<pre class="literal-block">
&gt;&gt;&gt; root = etree.XML("&lt;root&gt;&lt;a&gt;&lt;b/&gt;&lt;/a&gt;&lt;b/&gt;&lt;/root&gt;")
&gt;&gt;&gt; xpatheval = etree.XPathEvaluator(root)

&gt;&gt;&gt; print isinstance(xpatheval, etree.XPathElementEvaluator)
True

&gt;&gt;&gt; print xpatheval("//b")[0].tag
b
</pre>
<p>This class provides efficient support for evaluating different XPath
expressions on the same Element or ElementTree.</p>
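<p>For comparison, the following sketch passes an ElementTree instead of an
Element and evaluates two different expressions with one evaluator.  It
assumes Python 3 and a recent lxml release; the namespace URI and prefixes are
invented:</p>

```python
from lxml import etree

root = etree.XML("<root xmlns:n='NS'><n:a><b/></n:a><b/></root>")
tree = etree.ElementTree(root)

# Passing an ElementTree selects the document evaluator.
doc_eval = etree.XPathEvaluator(tree, namespaces={"p": "NS"})
is_doc_eval = isinstance(doc_eval, etree.XPathDocumentEvaluator)

# Different expressions share the evaluator and its prefix mapping.
b_count = doc_eval("count(//b)")
a_tags = [element.tag for element in doc_eval("//p:a")]

print(is_doc_eval, b_count, a_tags)
```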
</div>
<div class="section">
<h2><a id="etxpath" name="etxpath"><tt class="docutils literal"><span class="pre">ETXPath</span></tt></a></h2>
<p>ElementTree supports a language named <a class="reference" href="http://effbot.org/zone/element-xpath.htm">ElementPath</a> in its <tt class="docutils literal"><span class="pre">find*()</span></tt> methods.
One of the main differences between XPath and ElementPath is that the XPath
language requires an indirection through prefixes for namespace support,
whereas ElementTree uses the Clark notation (<tt class="docutils literal"><span class="pre">{ns}name</span></tt>) to avoid prefixes
completely.  The other major difference regards the capabilities of both path
languages.  Where XPath supports various sophisticated ways of restricting the
result set through functions and boolean expressions, ElementPath only
supports pure path traversal without nesting or further conditions.  So, while
the ElementPath syntax is self-contained and therefore easier to write and
handle, XPath is much more powerful and expressive.</p>
<p>lxml.etree bridges this gap through the class <tt class="docutils literal"><span class="pre">ETXPath</span></tt>, which accepts XPath
expressions with namespaces in Clark notation.  It is identical to the
<tt class="docutils literal"><span class="pre">XPath</span></tt> class, except for the namespace notation.  Normally, you would
write:</p>
<pre class="literal-block">
&gt;&gt;&gt; root = etree.XML("&lt;root xmlns='ns'&gt;&lt;a&gt;&lt;b/&gt;&lt;/a&gt;&lt;b/&gt;&lt;/root&gt;")

&gt;&gt;&gt; find = etree.XPath("//p:b", namespaces={'p' : 'ns'})
&gt;&gt;&gt; print find(root)[0].tag
{ns}b
</pre>
<p><tt class="docutils literal"><span class="pre">ETXPath</span></tt> allows you to change this to:</p>
<pre class="literal-block">
&gt;&gt;&gt; find = etree.ETXPath("//{ns}b")
&gt;&gt;&gt; print find(root)[0].tag
{ns}b
</pre>
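<p>Because Element tags are already in Clark notation, they can be reused
directly when building an <tt class="docutils literal"><span class="pre">ETXPath</span></tt> expression.  A small sketch, assuming
Python 3 and a recent lxml release:</p>

```python
from lxml import etree

root = etree.XML("<root xmlns='ns'><a><b/></a><b/></root>")

# The tag of an existing element can be pasted into the expression
# without inventing a prefix for its namespace.
second_child = root[1]                      # tag is '{ns}b'
find_same = etree.ETXPath("//" + second_child.tag)
same_tag_count = len(find_same(root))

# XPath functions remain available around the Clark names.
count_b = etree.ETXPath("count(//{ns}b)")
b_count = count_b(root)

print(same_tag_count, b_count)
```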
</div>
<div class="section">
<h2><a id="error-handling" name="error-handling">Error handling</a></h2>
<p>lxml.etree raises exceptions when errors occur while parsing or evaluating an
XPath expression:</p>
<pre class="literal-block">
&gt;&gt;&gt; find = etree.XPath("\\")
Traceback (most recent call last):
  ...
XPathSyntaxError: Invalid expression
</pre>
<p>lxml will also try to give you a hint about what went wrong, so if you pass a more
complex expression, you may get a somewhat more specific error:</p>
<pre class="literal-block">
&gt;&gt;&gt; find = etree.XPath("//*[1.1.1]")
Traceback (most recent call last):
  ...
XPathSyntaxError: Invalid predicate
</pre>
<p>During evaluation, lxml will raise an XPathEvalError on errors:</p>
<pre class="literal-block">
&gt;&gt;&gt; find = etree.XPath("//ns:a")
&gt;&gt;&gt; find(root)
Traceback (most recent call last):
  ...
XPathEvalError: Undefined namespace prefix
</pre>
<p>This distinction applies to the <tt class="docutils literal"><span class="pre">XPath</span></tt> class, which compiles and evaluates in separate steps.  The other evaluators (including
the <tt class="docutils literal"><span class="pre">xpath()</span></tt> method) are one-shot operations that do parsing and evaluation
in one step.  They therefore raise evaluation exceptions in all cases:</p>
<pre class="literal-block">
&gt;&gt;&gt; root = etree.Element("test")
&gt;&gt;&gt; find = root.xpath("//*[1.1.1]")
Traceback (most recent call last):
  ...
XPathEvalError: Invalid predicate

&gt;&gt;&gt; find = root.xpath("//ns:a")
Traceback (most recent call last):
  ...
XPathEvalError: Undefined namespace prefix

&gt;&gt;&gt; find = root.xpath("\\")
Traceback (most recent call last):
  ...
XPathEvalError: Invalid expression
</pre>
<p>Note that lxml versions before 1.3 always raised an <tt class="docutils literal"><span class="pre">XPathSyntaxError</span></tt> for
all errors, including evaluation errors.  The best way to support older
versions is to catch the superclass <tt class="docutils literal"><span class="pre">XPathError</span></tt>.</p>
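<p>Catching the superclass can be wrapped in a small helper.  The function
<tt class="docutils literal"><span class="pre">safe_xpath</span></tt> below is hypothetical and only meant as a sketch, written
for Python 3 and a recent lxml release:</p>

```python
from lxml import etree

def safe_xpath(element, expression):
    """Return the XPath result, or None if the expression cannot be used."""
    try:
        return element.xpath(expression)
    except etree.XPathError:
        # Superclass of both XPathSyntaxError and XPathEvalError.
        return None

root = etree.XML("<root><a/></root>")

good = safe_xpath(root, "//a")
bad_predicate = safe_xpath(root, "//*[1.1.1]")
bad_prefix = safe_xpath(root, "//ns:a")

print(good[0].tag, bad_predicate, bad_prefix)
```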
</div>
</div>
<div class="section">
<h1><a id="xslt" name="xslt">XSLT</a></h1>
<p>lxml.etree introduces a new class, lxml.etree.XSLT. The class can be
given an ElementTree object to construct an XSLT transformer:</p>
<pre class="literal-block">
&gt;&gt;&gt; f = StringIO('''\
... &lt;xsl:stylesheet version="1.0"
...     xmlns:xsl="http://www.w3.org/1999/XSL/Transform"&gt;
...     &lt;xsl:template match="/"&gt;
...         &lt;foo&gt;&lt;xsl:value-of select="/a/b/text()" /&gt;&lt;/foo&gt;
...     &lt;/xsl:template&gt;
... &lt;/xsl:stylesheet&gt;''')
&gt;&gt;&gt; xslt_doc = etree.parse(f)
&gt;&gt;&gt; transform = etree.XSLT(xslt_doc)
</pre>
<p>You can then run the transformation on an ElementTree document by simply
calling it, and this results in another ElementTree object:</p>
<pre class="literal-block">
&gt;&gt;&gt; f = StringIO('&lt;a&gt;&lt;b&gt;Text&lt;/b&gt;&lt;/a&gt;')
&gt;&gt;&gt; doc = etree.parse(f)
&gt;&gt;&gt; result_tree = transform(doc)
</pre>
<p>By default, XSLT supports all extension functions from libxslt and libexslt as
well as Python regular expressions through the <a class="reference" href="http://www.exslt.org/regexp/">EXSLT regexp functions</a>.
Also see the documentation on <a class="reference" href="extensions.html">custom extension functions</a> and <a class="reference" href="resolvers.html">document
resolvers</a>.  There is a separate section on <a class="reference" href="resolvers.html#i-o-access-control-in-xslt">controlling access</a> to external
documents and resources.</p>
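<p>As an illustration of the regular expression support, the following sketch
filters elements with <tt class="docutils literal"><span class="pre">re:test()</span></tt>.  The stylesheet, the element names and
the <tt class="docutils literal"><span class="pre">re</span></tt> prefix are made up for this example; only the namespace URI is
fixed by EXSLT:</p>

```python
from lxml import etree

# The prefix "re" is arbitrary; the namespace URI is the EXSLT one.
stylesheet = etree.XML('''\
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:re="http://exslt.org/regular-expressions"
    exclude-result-prefixes="re">
    <xsl:template match="/">
        <hits>
            <xsl:for-each select="/a/b[re:test(string(.), '^T')]">
                <hit><xsl:value-of select="." /></hit>
            </xsl:for-each>
        </hits>
    </xsl:template>
</xsl:stylesheet>''')
transform = etree.XSLT(stylesheet)

doc = etree.XML('<a><b>Text</b><b>Other</b></a>')
result = transform(doc)

# Collect the text of every <hit> element in the result.
hit_texts = [hit.text for hit in result.getroot()]
```

<p>Here only the first <tt class="docutils literal"><span class="pre">b</span></tt> element starts with <tt class="docutils literal"><span class="pre">T</span></tt>, so it is the only one
copied into the result.</p>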
<div class="section">
<h2><a id="xslt-result-objects" name="xslt-result-objects">XSLT result objects</a></h2>
<p>The result of an XSL transformation can be accessed like a normal ElementTree
document:</p>
<pre class="literal-block">
&gt;&gt;&gt; f = StringIO('&lt;a&gt;&lt;b&gt;Text&lt;/b&gt;&lt;/a&gt;')
&gt;&gt;&gt; doc = etree.parse(f)
&gt;&gt;&gt; result = transform(doc)

&gt;&gt;&gt; result.getroot().text
'Text'
</pre>
<p>but, as opposed to normal ElementTree objects, can also be turned into an (XML
or text) string by applying the str() function:</p>
<pre class="literal-block">
&gt;&gt;&gt; str(result)
'&lt;?xml version="1.0"?&gt;\n&lt;foo&gt;Text&lt;/foo&gt;\n'
</pre>
<p>The result is always a plain string, encoded as requested by the
<tt class="docutils literal"><span class="pre">xsl:output</span></tt> element in the stylesheet.  If you want a Python unicode string
instead, you should set this encoding to <tt class="docutils literal"><span class="pre">UTF-8</span></tt> (unless the <tt class="docutils literal"><span class="pre">ASCII</span></tt> default
is sufficient).  This allows you to call the builtin <tt class="docutils literal"><span class="pre">unicode()</span></tt> function on
the result:</p>
<pre class="literal-block">
&gt;&gt;&gt; unicode(result)
u'&lt;?xml version="1.0"?&gt;\n&lt;foo&gt;Text&lt;/foo&gt;\n'
</pre>
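<p>The sketch below shows the same idea with non-ASCII content.  It assumes a
current lxml release running on Python 3, where the <tt class="docutils literal"><span class="pre">unicode()</span></tt> builtin no
longer exists and <tt class="docutils literal"><span class="pre">str()</span></tt> returns the decoded text instead.  The sample
document is made up for illustration:</p>

```python
from lxml import etree

stylesheet = etree.XML('''\
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:output encoding="UTF-8"/>
    <xsl:template match="/">
        <foo><xsl:value-of select="/a/b/text()" /></foo>
    </xsl:template>
</xsl:stylesheet>''')
transform = etree.XSLT(stylesheet)

# Sample input containing two non-ASCII characters.
doc = etree.XML('<a><b>Gr\u00fc\u00dfe</b></a>')
result = transform(doc)

# Assumption: Python 3, where str() yields a text (unicode) string.
text = str(result)
```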
<p>You can use other encodings, but the result then has to be recoded more than
once.  Encodings that are not supported by Python will result in an error:</p>
<pre class="literal-block">
&gt;&gt;&gt; xslt_tree = etree.XML('''\
... &lt;xsl:stylesheet version="1.0"
...     xmlns:xsl="http://www.w3.org/1999/XSL/Transform"&gt;
...     &lt;xsl:output encoding="UCS4"/&gt;
...     &lt;xsl:template match="/"&gt;
...         &lt;foo&gt;&lt;xsl:value-of select="/a/b/text()" /&gt;&lt;/foo&gt;
...     &lt;/xsl:template&gt;
... &lt;/xsl:stylesheet&gt;''')
&gt;&gt;&gt; transform = etree.XSLT(xslt_tree)

&gt;&gt;&gt; result = transform(doc)
&gt;&gt;&gt; unicode(result)
Traceback (most recent call last):
  [...]
LookupError: unknown encoding: UCS4
</pre>
</div>
<div class="section">
<h2><a id="stylesheet-parameters" name="stylesheet-parameters">Stylesheet parameters</a></h2>
<p>It is possible to pass parameters, in the form of XPath expressions, to the
XSLT template:</p>
<pre class="literal-block">
&gt;&gt;&gt; xslt_tree = etree.XML('''\
... &lt;xsl:stylesheet version="1.0"
...     xmlns:xsl="http://www.w3.org/1999/XSL/Transform"&gt;
...     &lt;xsl:template match="/"&gt;
...         &lt;foo&gt;&lt;xsl:value-of select="$a" /&gt;&lt;/foo&gt;
...     &lt;/xsl:template&gt;
... &lt;/xsl:stylesheet&gt;''')
&gt;&gt;&gt; transform = etree.XSLT(xslt_tree)
&gt;&gt;&gt; f = StringIO('&lt;a&gt;&lt;b&gt;Text&lt;/b&gt;&lt;/a&gt;')
&gt;&gt;&gt; doc = etree.parse(f)
</pre>
<p>The parameters are passed as keyword arguments to the transform call. First
let's try passing in a simple string expression:</p>
<pre class="literal-block">
&gt;&gt;&gt; result = transform(doc, a="'A'")
&gt;&gt;&gt; str(result)
'&lt;?xml version="1.0"?&gt;\n&lt;foo&gt;A&lt;/foo&gt;\n'
</pre>
<p>Let's try a non-string XPath expression now:</p>
<pre class="literal-block">
&gt;&gt;&gt; result = transform(doc, a="/a/b/text()")
&gt;&gt;&gt; str(result)
'&lt;?xml version="1.0"?&gt;\n&lt;foo&gt;Text&lt;/foo&gt;\n'
</pre>
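<p>Because every parameter value is evaluated as an XPath expression, a plain
string must carry its own quotes.  The sketch below contrasts a quoted literal,
a numeric expression and an arbitrary string.  The last case assumes a later
lxml release that provides <tt class="docutils literal"><span class="pre">XSLT.strparam()</span></tt>, which is not described in this
document:</p>

```python
from lxml import etree

stylesheet = etree.XML('''\
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:template match="/">
        <foo><xsl:value-of select="$a" /></foo>
    </xsl:template>
</xsl:stylesheet>''')
transform = etree.XSLT(stylesheet)
doc = etree.XML('<a><b>Text</b></a>')

# A quoted literal is evaluated as the XPath string 'A'.
literal = transform(doc, a="'A'").getroot().text

# An unquoted value is evaluated as an expression, here a number.
number = transform(doc, a="1 + 2").getroot().text

# Assumption: XSLT.strparam() exists in the installed lxml.  It passes
# the value as a plain string, so embedded quotes need no escaping.
raw = """It's "quoted" text"""
tricky = transform(doc, a=etree.XSLT.strparam(raw)).getroot().text
```

<p>Wrapping untrusted or user-supplied strings this way avoids building XPath
string literals by hand.</p>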
</div>
<div class="section">
<h2><a id="the-xslt-tree-method" name="the-xslt-tree-method">The <tt class="docutils literal"><span class="pre">xslt()</span></tt> tree method</a></h2>
<p>There's also a convenience method on ElementTree objects for doing XSL
transformations.  This is less efficient if you want to apply the same XSL
transformation to multiple documents, but is shorter to write for one-shot
operations, as you do not have to instantiate a stylesheet yourself:</p>
<pre class="literal-block">
&gt;&gt;&gt; result = doc.xslt(xslt_tree, a="'A'")
&gt;&gt;&gt; str(result)
'&lt;?xml version="1.0"?&gt;\n&lt;foo&gt;A&lt;/foo&gt;\n'
</pre>
<p>This is a shortcut for the following code:</p>
<pre class="literal-block">
&gt;&gt;&gt; transform = etree.XSLT(xslt_tree)
&gt;&gt;&gt; result = transform(doc, a="'A'")
&gt;&gt;&gt; str(result)
'&lt;?xml version="1.0"?&gt;\n&lt;foo&gt;A&lt;/foo&gt;\n'
</pre>
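<p>When several documents need the same transformation, the stylesheet can be
compiled once and the resulting transformer reused, as in this sketch.  The
input documents are made up for illustration:</p>

```python
from lxml import etree

stylesheet = etree.XML('''\
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:template match="/">
        <foo><xsl:value-of select="/a/b/text()" /></foo>
    </xsl:template>
</xsl:stylesheet>''')

# Compile the stylesheet a single time ...
transform = etree.XSLT(stylesheet)

documents = [
    etree.XML('<a><b>One</b></a>'),
    etree.XML('<a><b>Two</b></a>'),
]

# ... and apply the same transformer to every document.
texts = [transform(doc).getroot().text for doc in documents]
```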
</div>
<div class="section">
<h2><a id="profiling" name="profiling">Profiling</a></h2>
<p>If you want to know how your stylesheet performed, pass the <tt class="docutils literal"><span class="pre">profile_run</span></tt>
keyword to the transform:</p>
<pre class="literal-block">
&gt;&gt;&gt; result = transform(doc, a="/a/b/text()", profile_run=True)
&gt;&gt;&gt; profile = result.xslt_profile
</pre>
<p>The value of the <tt class="docutils literal"><span class="pre">xslt_profile</span></tt> property is an ElementTree with profiling
data about each template, similar to the following:</p>
<pre class="literal-block">
&lt;profile&gt;
  &lt;template rank="1" match="/" name="" mode="" calls="1" time="1" average="1"/&gt;
&lt;/profile&gt;
</pre>
<p>Note that this is a read-only document.  You must not move any of its elements
to other documents.  Please deep-copy the document if you need to modify it.
If you want to free it from memory, just do:</p>
<pre class="literal-block">
&gt;&gt;&gt; del result.xslt_profile
</pre>
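<p>Putting these steps together, a sketch of a profiling run could look like
this.  The stylesheet and the <tt class="docutils literal"><span class="pre">stats</span></tt> list are illustrative, and the
attribute names are taken from the sample profile shown above:</p>

```python
import copy

from lxml import etree

stylesheet = etree.XML('''\
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:template match="/">
        <foo><xsl:value-of select="/a/b/text()" /></foo>
    </xsl:template>
</xsl:stylesheet>''')
transform = etree.XSLT(stylesheet)
doc = etree.XML('<a><b>Text</b></a>')

result = transform(doc, profile_run=True)

# The profile document is read-only, so work on a deep copy.
profile = copy.deepcopy(result.xslt_profile)

# Collect (match pattern, call count) pairs for each profiled template.
stats = [(template.get("match"), template.get("calls"))
         for template in profile.getroot().iter("template")]

# Release the original profile data once it has been copied.
del result.xslt_profile
```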
</div>
</div>
</div>
<div class="footer">
<hr class="footer" />
Generated on: 2008-12-12.

</div>
</body>
</html>