<?xml version="1.0" encoding="ascii"?> <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "DTD/xhtml1-transitional.dtd"> <html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en"> <head> <title>Martel.Iterator</title> <link rel="stylesheet" href="epydoc.css" type="text/css" /> <script type="text/javascript" src="epydoc.js"></script> </head> <body bgcolor="white" text="black" link="blue" vlink="#204080" alink="#204080"> <!-- ==================== NAVIGATION BAR ==================== --> <table class="navbar" border="0" width="100%" cellpadding="0" bgcolor="#a0c0ff" cellspacing="0"> <tr valign="middle"> <!-- Tree link --> <th> <a href="module-tree.html">Trees</a> </th> <!-- Index link --> <th> <a href="identifier-index.html">Indices</a> </th> <!-- Help link --> <th> <a href="help.html">Help</a> </th> <th class="navbar" width="100%"></th> </tr> </table> <table width="100%" cellpadding="0" cellspacing="0"> <tr valign="top"> <td width="100%"> <span class="breadcrumbs"> <a href="Martel-module.html">Package Martel</a> :: Module Iterator </span> </td> <td> <table cellpadding="0" cellspacing="0"> <!-- hide/show private --> <tr><td align="right"><span class="options">[<a href="javascript:void(0);" class="privatelink" onclick="toggle_private();">hide private</a>]</span></td></tr> <tr><td align="right"><span class="options" >[<a href="frames.html" target="_top">frames</a >] | <a href="Martel.Iterator-module.html" target="_top">no frames</a>]</span></td></tr> </table> </td> </tr> </table> <!-- ==================== MODULE DESCRIPTION ==================== --> <h1 class="epydoc">Module Iterator</h1><p class="nomargin-top"><span class="codelink"><a href="Martel.Iterator-pysrc.html">source code</a></span></p> <p>Iterate over records of a XML parse tree.</p> <p>The standard parser is callback based over all the elements of a file. If the file contains records, many people would like to be able to iterate over each record and only use the callback parser to analyze the record.</p> <p>If the expression is a 'ParseRecords', then the code to do this is easy; use its make_reader to grab records and its record_expression to parse them. However, this isn't general enough. The use of a ParseRecords in the format definition should be strictly a implementation decision for better memory use. So there needs to be an API which allows both full and record oriented parsers.</p> <p>Here's an example use of the API: >>> import sys >>> import swissprot38 # one is in Martel/test/testformats >>> from xml.dom import pulldom >>> iterator = swissprot38.format.make_iterator("swissprot38_record") >>> text = open("sample.swissprot").read() >>> for record in iterator.iterateString(text, pulldom.SAX2DOM()): .. print "Read a record with the following AC numbers:" ... for acc in record.document.getElementsByTagName("ac_number"): ... acc.writexml(sys.stdout) ... sys.stdout.write(" ") ...</p> <p>There are several parts to this API. First is the 'Iterator</p> <p>There are two parts to the API. One is the EventStream. This contains a single method called "next()" which returns a list of SAX events in the 2-ple (event_name, args). It is called multiple times to return successive event lists and returns None if no events are available.</p> <p>The other is the Iterator</p> <p>Sean McGrath has a RAX parser (Record API for XML) which uses a concept similar to this.</p> <!-- ==================== CLASSES ==================== --> <a name="section-Classes"></a> <table class="summary" border="1" cellpadding="3" cellspacing="0" width="100%" bgcolor="white"> <tr bgcolor="#70b0f0" class="table-header"> <td colspan="2" class="table-header"> <table border="0" cellpadding="0" cellspacing="0" width="100%"> <tr valign="top"> <td align="left"><span class="table-header">Classes</span></td> <td align="right" valign="top" ><span class="options">[<a href="#section-Classes" class="privatelink" onclick="toggle_private();" >hide private</a>]</span></td> </tr> </table> </td> </tr> <tr> <td width="15%" align="right" valign="top" class="summary"> <span class="summary-type"> </span> </td><td class="summary"> <a href="Martel.Iterator.StoreEvents-class.html" class="summary-name">StoreEvents</a> </td> </tr> <tr> <td width="15%" align="right" valign="top" class="summary"> <span class="summary-type"> </span> </td><td class="summary"> <a href="Martel.Iterator.EventStream-class.html" class="summary-name">EventStream</a> </td> </tr> <tr> <td width="15%" align="right" valign="top" class="summary"> <span class="summary-type"> </span> </td><td class="summary"> <a href="Martel.Iterator.Iterator-class.html" class="summary-name">Iterator</a> </td> </tr> <tr> <td width="15%" align="right" valign="top" class="summary"> <span class="summary-type"> </span> </td><td class="summary"> <a href="Martel.Iterator.RecordEventStream-class.html" class="summary-name">RecordEventStream</a> </td> </tr> <tr> <td width="15%" align="right" valign="top" class="summary"> <span class="summary-type"> </span> </td><td class="summary"> <a href="Martel.Iterator.IteratorRecords-class.html" class="summary-name">IteratorRecords</a> </td> </tr> <tr> <td width="15%" align="right" valign="top" class="summary"> <span class="summary-type"> </span> </td><td class="summary"> <a href="Martel.Iterator.HeaderFooterEventStream-class.html" class="summary-name">HeaderFooterEventStream</a> </td> </tr> <tr> <td width="15%" align="right" valign="top" class="summary"> <span class="summary-type"> </span> </td><td class="summary"> <a href="Martel.Iterator.IteratorHeaderFooter-class.html" class="summary-name">IteratorHeaderFooter</a> </td> </tr> <tr> <td width="15%" align="right" valign="top" class="summary"> <span class="summary-type"> </span> </td><td class="summary"> <a href="Martel.Iterator.Iterate-class.html" class="summary-name">Iterate</a> </td> </tr> </table> <!-- ==================== FUNCTIONS ==================== --> <a name="section-Functions"></a> <table class="summary" border="1" cellpadding="3" cellspacing="0" width="100%" bgcolor="white"> <tr bgcolor="#70b0f0" class="table-header"> <td colspan="2" class="table-header"> <table border="0" cellpadding="0" cellspacing="0" width="100%"> <tr valign="top"> <td align="left"><span class="table-header">Functions</span></td> <td align="right" valign="top" ><span class="options">[<a href="#section-Functions" class="privatelink" onclick="toggle_private();" >hide private</a>]</span></td> </tr> </table> </td> </tr> <tr class="private"> <td width="15%" align="right" valign="top" class="summary"> <span class="summary-type"> </span> </td><td class="summary"> <table width="100%" cellpadding="0" cellspacing="0" border="0"> <tr> <td><span class="summary-sig"><a name="_get_next_text"></a><span class="summary-sig-name">_get_next_text</span>(<span class="summary-sig-arg">reader</span>)</span></td> <td align="right" valign="top"> <span class="codelink"><a href="Martel.Iterator-pysrc.html#_get_next_text">source code</a></span> </td> </tr> </table> </td> </tr> </table> <!-- ==================== NAVIGATION BAR ==================== --> <table class="navbar" border="0" width="100%" cellpadding="0" bgcolor="#a0c0ff" cellspacing="0"> <tr valign="middle"> <!-- Tree link --> <th> <a href="module-tree.html">Trees</a> </th> <!-- Index link --> <th> <a href="identifier-index.html">Indices</a> </th> <!-- Help link --> <th> <a href="help.html">Help</a> </th> <th class="navbar" width="100%"></th> </tr> </table> <table border="0" cellpadding="0" cellspacing="0" width="100%%"> <tr> <td align="left" class="footer"> Generated by Epydoc 3.0.1 on Mon Sep 15 09:26:29 2008 </td> <td align="right" class="footer"> <a target="mainFrame" href="http://epydoc.sourceforge.net" >http://epydoc.sourceforge.net</a> </td> </tr> </table> <script type="text/javascript"> <!-- // Private objects are initially displayed (because if // javascript is turned off then we want them to be // visible); but by default, we want to hide them. So hide // them unless we have a cookie that says to show them. checkCookie(); // --> </script> </body> </html>