<?xml version="1.0" encoding="ascii"?> <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "DTD/xhtml1-transitional.dtd"> <html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en"> <head> <title>Martel.LAX</title> <link rel="stylesheet" href="epydoc.css" type="text/css" /> <script type="text/javascript" src="epydoc.js"></script> </head> <body bgcolor="white" text="black" link="blue" vlink="#204080" alink="#204080"> <!-- ==================== NAVIGATION BAR ==================== --> <table class="navbar" border="0" width="100%" cellpadding="0" bgcolor="#a0c0ff" cellspacing="0"> <tr valign="middle"> <!-- Tree link --> <th> <a href="module-tree.html">Trees</a> </th> <!-- Index link --> <th> <a href="identifier-index.html">Indices</a> </th> <!-- Help link --> <th> <a href="help.html">Help</a> </th> <th class="navbar" width="100%"></th> </tr> </table> <table width="100%" cellpadding="0" cellspacing="0"> <tr valign="top"> <td width="100%"> <span class="breadcrumbs"> <a href="Martel-module.html">Package Martel</a> :: Module LAX </span> </td> <td> <table cellpadding="0" cellspacing="0"> <!-- hide/show private --> <tr><td align="right"><span class="options">[<a href="javascript:void(0);" class="privatelink" onclick="toggle_private();">hide private</a>]</span></td></tr> <tr><td align="right"><span class="options" >[<a href="frames.html" target="_top">frames</a >] | <a href="Martel.LAX-module.html" target="_top">no frames</a>]</span></td></tr> </table> </td> </tr> </table> <!-- ==================== MODULE DESCRIPTION ==================== --> <h1 class="epydoc">Module LAX</h1><p class="nomargin-top"><span class="codelink"><a href="Martel.LAX-pysrc.html">source code</a></span></p> <pre class="literalblock"> A simple way to read lists of fields from flat XML records. Many XML formats are very simple: all the fields are needed, there is no tree hierarchy, all the text inside of the tags is used, and the text is short (it can easily fit inside of memory). 
SAX is pretty good for this but it's still somewhat complicated to
use.  DOM is designed to handle tree structures so is a bit too much
for a simple flat data structure.

This module implements a new, simpler API, which I'll call LAX.  It
only works well when the elements are small and non-hierarchical.  LAX
has three callbacks.

  start() -- the first method called

  element(tag, attrs, text) -- called once for each element, after the
      element has been fully read.  (I.e., called when endElement
      would be called.)  The 'tag' is the element name, 'attrs' is the
      attribute object that would be passed to startElement, and
      'text' is all the text between the two tags.  The text is the
      concatenation of all the characters() calls.

  end() -- the last method called (unless there was an error)

LAX.LAX is a content handler which converts the SAX events to LAX
events.  Here is an example use:

  >>> from Martel import Word, Whitespace, Group, Integer, Rep1, AnyEol
  >>> format = Rep1(Group("line", Word("name") + Whitespace() +
  ...                             Integer("age")) + AnyEol())
  >>> parser = format.make_parser()
  >>>
  >>> from Martel import LAX
  >>> class PrintFields(LAX.LAX):
  ...     def element(self, tag, attrs, text):
  ...         print tag, "has", repr(text)
  ...
  >>> parser.setContentHandler(PrintFields())
  >>> text = "Maggie 3\nPorter 1\n"
  >>> parser.parseString(text)
  name has 'Maggie'
  age has '3'
  line has 'Maggie 3'
  name has 'Porter'
  age has '1'
  line has 'Porter 1'
  >>>

Callbacks take some getting used to.  Many people prefer an iterative
solution which returns all of the fields of a given record at one
time.  The default implementation of LAX.LAX helps this case.

The 'start' method initializes a local variable named 'groups', which
is a dictionary.  When the 'element' method is called, the information
is added to groups; the key is the element name and the value is the
list of text strings.  It's a list because the same field name may
occur multiple times.
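
The callback-to-dictionary bookkeeping described above can be sketched
with the standard library alone.  This is only an illustration under
stated assumptions, not Martel's implementation: GroupingHandler,
collect and EVENTS are hypothetical names, the code targets Python 3,
and the SAX events are fed in by hand rather than produced by a Martel
parser.

```python
from xml.sax.handler import ContentHandler


class GroupingHandler(ContentHandler):
    """Hypothetical stand-in for the LAX idea; not Martel's own code."""

    def __init__(self, fields=None):
        ContentHandler.__init__(self)
        # Optional collection of element names to keep; None keeps all.
        self.fields = None if fields is None else set(fields)

    def startDocument(self):
        # Plays the role of start(): reset the per-record state.
        self.groups = {}
        self._stack = []

    def startElement(self, name, attrs):
        # Open a fresh text buffer for the element that just started.
        self._stack.append([])

    def characters(self, content):
        # Text belongs to every element that is currently open.
        for buffer in self._stack:
            buffer.append(content)

    def endElement(self, name):
        # Plays the role of element(): the full text is now known.
        text = "".join(self._stack.pop())
        if self.fields is None or name in self.fields:
            self.groups.setdefault(name, []).append(text)


def collect(events, fields=None):
    """Feed (kind, value) pairs to a handler and return its groups."""
    handler = GroupingHandler(fields)
    handler.startDocument()
    for kind, value in events:
        if kind == "start":
            handler.startElement(value, {})
        elif kind == "text":
            handler.characters(value)
        else:
            handler.endElement(value)
    handler.endDocument()
    return handler.groups


# Hand-written events for one record: a line holding a name and an age.
EVENTS = [
    ("start", "line"),
    ("start", "name"), ("text", "Maggie"), ("end", "name"),
    ("text", " "),
    ("start", "age"), ("text", "3"), ("end", "age"),
    ("end", "line"),
]
```

Calling collect(EVENTS) returns a dictionary that maps 'name', 'age'
and 'line' to one-element lists of text; passing ["name"] as the second
argument keeps only that field.
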
If you need the element attributes as well as the name, use the
LAX.LAXAttrs class, which stores a list of 2-tuples (text, attrs)
instead of just the text.

For example:

  >>> iterator = format.make_iterator("line")
  >>> for record in iterator.iterateString(text, LAX.LAX()):
  ...     print record.groups["name"][0], "is", record.groups["age"][0]
  ...
  Maggie is 3
  Porter is 1
  >>>

If you only want a few fields, you can pass the list to the
constructor, as in:

  >>> lax = LAX.LAX(["name", "sequence"])
  >>>
</pre> <!-- ==================== CLASSES ==================== --> <a name="section-Classes"></a> <table class="summary" border="1" cellpadding="3" cellspacing="0" width="100%" bgcolor="white"> <tr bgcolor="#70b0f0" class="table-header"> <td colspan="2" class="table-header"> <table border="0" cellpadding="0" cellspacing="0" width="100%"> <tr valign="top"> <td align="left"><span class="table-header">Classes</span></td> <td align="right" valign="top" ><span class="options">[<a href="#section-Classes" class="privatelink" onclick="toggle_private();" >hide private</a>]</span></td> </tr> </table> </td> </tr> <tr class="private"> <td width="15%" align="right" valign="top" class="summary"> <span class="summary-type"> </span> </td><td class="summary"> <a href="Martel.LAX._IsIn-class.html" class="summary-name" onclick="show_private();">_IsIn</a> </td> </tr> <tr> <td width="15%" align="right" valign="top" class="summary"> <span class="summary-type"> </span> </td><td class="summary"> <a href="Martel.LAX.LAX-class.html" class="summary-name">LAX</a> </td> </tr> <tr> <td width="15%" align="right" valign="top" class="summary"> <span class="summary-type"> </span> </td><td class="summary"> <a href="Martel.LAX.LAXAttrs-class.html" class="summary-name">LAXAttrs</a> </td> </tr> <tr> <td width="15%" align="right" valign="top" class="summary"> <span class="summary-type"> </span> </td><td class="summary"> <a href="Martel.LAX.ElementInfo-class.html" class="summary-name">ElementInfo</a> </td> </tr> <tr> <td 
width="15%" align="right" valign="top" class="summary"> <span class="summary-type"> </span> </td><td class="summary"> <a href="Martel.LAX.LAXPositions-class.html" class="summary-name">LAXPositions</a> </td> </tr> </table> <!-- ==================== NAVIGATION BAR ==================== --> <table class="navbar" border="0" width="100%" cellpadding="0" bgcolor="#a0c0ff" cellspacing="0"> <tr valign="middle"> <!-- Tree link --> <th> <a href="module-tree.html">Trees</a> </th> <!-- Index link --> <th> <a href="identifier-index.html">Indices</a> </th> <!-- Help link --> <th> <a href="help.html">Help</a> </th> <th class="navbar" width="100%"></th> </tr> </table> <table border="0" cellpadding="0" cellspacing="0" width="100%"> <tr> <td align="left" class="footer"> Generated by Epydoc 3.0.1 on Mon Sep 15 09:26:29 2008 </td> <td align="right" class="footer"> <a target="mainFrame" href="http://epydoc.sourceforge.net" >http://epydoc.sourceforge.net</a> </td> </tr> </table>
<script type="text/javascript">
  <!--
  // Private objects are initially displayed (because if
  // javascript is turned off then we want them to be
  // visible); but by default, we want to hide them.  So hide
  // them unless we have a cookie that says to show them.
  checkCookie();
  // -->
</script>
</body>
</html>