Module Parser

Implement Martel parsers.

The classes in this module are used by other Martel modules and not typically by external users.

There are two major parsers, 'Parser' and 'RecordParser'. The first is the standard one: it holds the whole file in memory as one string, parses it, and then generates the SAX events. The second reads one record at a time using a RecordReader and generates events after each read. Both parsers generate identical event callbacks.
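The difference between the two strategies can be illustrated with a toy example. This is a minimal sketch, not Martel code: the RecordingHandler class and the emit_record, parse_whole and parse_by_record functions are hypothetical names, and "one line is one record" is an assumption made only for the illustration. It uses the standard library's xml.sax ContentHandler to show that both strategies deliver the same events.

```python
import io
from xml.sax.handler import ContentHandler


class RecordingHandler(ContentHandler):
    """Collects the SAX events it receives so two runs can be compared."""

    def __init__(self):
        super().__init__()
        self.events = []

    def startElement(self, name, attrs):
        self.events.append(("start", name))

    def endElement(self, name):
        self.events.append(("end", name))

    def characters(self, content):
        self.events.append(("chars", content))


def emit_record(line, handler):
    # Hypothetical helper: one input line becomes one <record> element.
    handler.startElement("record", {})
    handler.characters(line)
    handler.endElement("record")


def parse_whole(text, handler):
    # Whole-input strategy: hold the complete string, then generate events.
    for line in text.splitlines(keepends=True):
        emit_record(line, handler)


def parse_by_record(infile, handler):
    # Record strategy: read one record, generate its events, read the next.
    for line in infile:
        emit_record(line, handler)


text = "ID 1\nID 2\n"
whole, by_record = RecordingHandler(), RecordingHandler()
parse_whole(text, whole)
parse_by_record(io.StringIO(text), by_record)
same_events = whole.events == by_record.events
```

The record-at-a-time version never needs more than one record in memory, which is the practical reason to choose it for large inputs.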

Both parsers ultimately rely on "_do_callback" to convert mxTextTools tags into SAX events.

XXX finish this documentation

XXX need a better way to get closer to the likely error position when parsing.

XXX need to implement Locator

Classes
  ParserException
used when a parse cannot be done
  ParserPositionException
  ParserIncompleteException
  ParserRecordException
used by the RecordParser when it can't read a record
  MartelAttributeList
  Parser
Parse the input data all in memory
  RecordParser
Parse the input data a record at a time
  HeaderFooterParser
Header followed by 0 or more records followed by a footer
Functions
_do_callback(s, begin, end, taglist, cont_handler, attrlookup)
internal function to convert the tagtable into ContentHandler events
_do_dispatch_callback(s, begin, end, taglist, start_table_get, cont_handler, save_stack, end_table_get, attrlookup)
internal function to convert the tagtable into ContentHandler events
_parse_elements(s, tagtable, cont_handler, debug_level, attrlookup)
parse the string with the tagtable and send the ContentHandler events
Variables
  _match_group = {}
  _attribute_list = {}
Function Details

_do_callback(s, begin, end, taglist, cont_handler, attrlookup)

internal function to convert the tagtable into ContentHandler events

's' is the input text
'begin' is the current position in the text
'end' is 1 past the last position of the text allowed to be parsed
'taglist' is the tag list from mxTextTools.parse
'cont_handler' is the SAX ContentHandler
'attrlookup' is a dict mapping the encoded tag name to the element info
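The conversion from a tag list to ContentHandler events can be sketched as a recursive walk. This is a simplified illustration, not the module's implementation: walk_taglist and Recorder are hypothetical names, the tag list is assumed to hold (tag, left, right, subtags) tuples in the usual mxTextTools layout, and the 'attrlookup' decoding step is left out, so the tag is used directly as the element name with no attributes.

```python
from xml.sax.handler import ContentHandler


class Recorder(ContentHandler):
    """Hypothetical handler that stores events for inspection."""

    def __init__(self):
        super().__init__()
        self.events = []

    def startElement(self, name, attrs):
        self.events.append(("start", name))

    def endElement(self, name):
        self.events.append(("end", name))

    def characters(self, content):
        self.events.append(("chars", content))


def walk_taglist(s, begin, end, taglist, cont_handler):
    """Emit events for s[begin:end] from (tag, left, right, subtags) tuples."""
    pos = begin
    for tag, left, right, subtags in taglist:
        if pos < left:
            # Untagged text between the previous element and this one.
            cont_handler.characters(s[pos:left])
        cont_handler.startElement(tag, {})
        if subtags:
            # Nested tags: recurse over the span this element covers.
            walk_taglist(s, left, right, subtags, cont_handler)
        elif left < right:
            cont_handler.characters(s[left:right])
        cont_handler.endElement(tag)
        pos = right
    if pos < end:
        # Untagged text after the last element.
        cont_handler.characters(s[pos:end])


s = "ID 42\n"
recorder = Recorder()
walk_taglist(s, 0, len(s), [("id", 3, 5, None)], recorder)
```

Here the span 3..5 is reported as an "id" element, and the text before and after it is passed through as plain characters events.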

_do_dispatch_callback(s, begin, end, taglist, start_table_get, cont_handler, save_stack, end_table_get, attrlookup)

internal function to convert the tagtable into ContentHandler events

THIS IS A SPECIAL CASE FOR Dispatch.Dispatcher objects

's' is the input text
'begin' is the current position in the text
'end' is 1 past the last position of the text allowed to be parsed
'taglist' is the tag list from mxTextTools.parse
'start_table_get' is the Dispatcher._start_table
'cont_handler' is the Dispatcher, which serves as the SAX ContentHandler
'end_table_get' is the Dispatcher._end_table
'attrlookup' is a dict mapping the encoded tag name to the element info
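The parameter names suggest a table-driven dispatch, where a handler is looked up per element name instead of calling a generic startElement or endElement. The sketch below shows that general technique only. MiniDispatcher and dispatch_tags are hypothetical, and passing each table's bound get method is an assumption drawn from the "_get" suffix, not something this page confirms about Dispatch.Dispatcher.

```python
class MiniDispatcher:
    """Hypothetical stand-in; not Martel's Dispatch.Dispatcher."""

    def __init__(self):
        self.seen = []
        # Tables mapping element names to the methods that handle them.
        self._start_table = {"id": self.start_id}
        self._end_table = {"id": self.end_id}

    def start_id(self, name, attrs):
        self.seen.append(("start", name))

    def end_id(self, name):
        self.seen.append(("end", name))


def dispatch_tags(tags, start_table_get, end_table_get):
    # Look up a handler for each name; names without a handler are skipped.
    for name in tags:
        start = start_table_get(name)
        if start is not None:
            start(name, {})
        end = end_table_get(name)
        if end is not None:
            end(name)


dispatcher = MiniDispatcher()
dispatch_tags(
    ["id", "unknown"],
    dispatcher._start_table.get,
    dispatcher._end_table.get,
)
```

Binding the get methods once, outside the loop, avoids repeating the attribute lookups for every tag, which is one plausible reason for passing them as separate arguments.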

_parse_elements(s, tagtable, cont_handler, debug_level, attrlookup)

parse the string with the tagtable and send the ContentHandler events

Specifically, it sends the startElement, endElement and characters events but not startDocument and endDocument.
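Because the document-level events are not sent, a caller has to bracket the element events itself. The following is a minimal sketch of that pattern, assuming a hypothetical fake_parse_elements stand-in with a shortened signature. The names EventLog and parse_document are also invented for the illustration and are not part of this module.

```python
from xml.sax.handler import ContentHandler


class EventLog(ContentHandler):
    """Hypothetical handler that records the name of each event received."""

    def __init__(self):
        super().__init__()
        self.log = []

    def startDocument(self):
        self.log.append("startDocument")

    def endDocument(self):
        self.log.append("endDocument")

    def startElement(self, name, attrs):
        self.log.append("startElement")

    def endElement(self, name):
        self.log.append("endElement")

    def characters(self, content):
        self.log.append("characters")


def fake_parse_elements(s, cont_handler):
    # Stand-in for an element-level parse: element events only.
    cont_handler.startElement("record", {})
    cont_handler.characters(s)
    cont_handler.endElement("record")


def parse_document(s, cont_handler):
    # The caller supplies the document-level events around the element events.
    cont_handler.startDocument()
    fake_parse_elements(s, cont_handler)
    cont_handler.endDocument()


handler = EventLog()
parse_document("ID 42\n", handler)
```

Keeping the document events in the caller lets one document be assembled from several element-level parses, for example one per record.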