
Module Martel.Iterator

Iterate over records of an XML parse tree.

The standard parser is callback-based and runs over every element of a file. When the file contains records, many people would rather iterate over the records one at a time and use the callback parser only to analyze each record.

If the expression is a 'ParseRecords', then the code to do this is easy: use its make_reader to grab records and its record_expression to parse them. However, this isn't general enough. Using a ParseRecords in the format definition should be strictly an implementation decision made for better memory use. So there needs to be an API which supports both full and record-oriented parsers.

Here's an example use of the API:

    >>> import sys
    >>> import swissprot38   # one is in Martel/test/testformats
    >>> from xml.dom import pulldom
    >>> iterator = swissprot38.format.make_iterator("swissprot38_record")
    >>> text = open("sample.swissprot").read()
    >>> for record in iterator.iterateString(text, pulldom.SAX2DOM()):
    ...     print "Read a record with the following AC numbers:"
    ...     for acc in record.document.getElementsByTagName("ac_number"):
    ...         acc.writexml(sys.stdout)
    ...         sys.stdout.write(" ")
    ...

There are several parts to this API. First is the 'Iterator

There are two parts to the API. One is the EventStream. It has a single method, "next()", which returns a list of SAX events, each a 2-tuple (event_name, args). It is called repeatedly to return successive event lists, and returns None when no more events are available.
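The EventStream protocol described above can be illustrated with a small stand-alone sketch. It does not use Martel itself: ListEventStream, replay and TextCollector are hypothetical names invented for this example, and it assumes that each event_name is the name of a SAX ContentHandler method and that args is a tuple of positional arguments for that method. The real classes in this module may differ.

```python
from xml.sax.handler import ContentHandler


class ListEventStream(object):
    """Hypothetical stand-in for an EventStream (not Martel's class).

    It hands out prepared event lists, one list per call to next().
    """

    def __init__(self, batches):
        self._batches = list(batches)

    def next(self):
        # Return the next list of (event_name, args) pairs,
        # or None once no more events are available.
        if not self._batches:
            return None
        return self._batches.pop(0)


def replay(stream, handler):
    """Forward every event from the stream to a SAX-style handler."""
    while True:
        events = stream.next()
        if events is None:
            break
        for event_name, args in events:
            # Assumption: event_name names a ContentHandler method.
            getattr(handler, event_name)(*args)


class TextCollector(ContentHandler):
    """Collect the character data found inside ac_number elements."""

    def __init__(self):
        ContentHandler.__init__(self)
        self.current = None
        self.found = []

    def startElement(self, name, attrs):
        self.current = name

    def characters(self, content):
        if self.current == "ac_number":
            self.found.append(content)

    def endElement(self, name):
        self.current = None


# Two event lists, as two successive calls to next() might return them.
# The accession values are made-up sample data.
batches = [
    [("startElement", ("ac_number", {})),
     ("characters", ("P12345",)),
     ("endElement", ("ac_number",))],
    [("startElement", ("ac_number", {})),
     ("characters", ("Q67890",)),
     ("endElement", ("ac_number",))],
]

collector = TextCollector()
replay(ListEventStream(batches), collector)
```

After the call to replay, collector.found holds the text of both ac_number elements in the order they were delivered. Returning None from next() rather than raising an exception keeps the consumer loop to a single check per batch.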

The other is the Iterator

Sean McGrath has a RAX parser (Record API for XML) which uses a concept similar to this.

Classes
EventStream  
HeaderFooterEventStream  
Iterate  
Iterator  
IteratorHeaderFooter  
IteratorRecords  
RecordEventStream  
StoreEvents  

Generated by Epydoc 2.1 on Thu Aug 10 20:01:07 2006 http://epydoc.sf.net