Class | Nokogiri::XML::Reader |
In: |
lib/nokogiri/xml/reader.rb
ext/nokogiri/xml_sax_parser.c |
Parent: | Object |
Nokogiri::XML::Reader parses an XML document by moving through it like a cursor. The Reader is given an XML document and yields each node in turn to the block passed to each.
Here is an example of usage:
  reader = Nokogiri::XML::Reader(<<-eoxml)
    <x xmlns:tenderlove='http://tenderlovemaking.com/'>
      <tenderlove:foo awesome='true'>snuggles!</tenderlove:foo>
    </x>
  eoxml

  reader.each do |node|
    # node is an instance of Nokogiri::XML::Reader
    puts node.name
  end
Note that Nokogiri::XML::Reader#each can only be called once. Once the cursor has moved through the entire document, you must parse the document again to read it a second time, so capture any information you need during the first iteration.
The Reader is a good fit when you need the speed of a SAX parser but do not want to write a Document handler.
TYPE_NONE | = | 0 | |
TYPE_ELEMENT | = | 1 | Element node type |
TYPE_ATTRIBUTE | = | 2 | Attribute node type |
TYPE_TEXT | = | 3 | Text node type |
TYPE_CDATA | = | 4 | CDATA node type |
TYPE_ENTITY_REFERENCE | = | 5 | Entity Reference node type |
TYPE_ENTITY | = | 6 | Entity node type |
TYPE_PROCESSING_INSTRUCTION | = | 7 | PI node type |
TYPE_COMMENT | = | 8 | Comment node type |
TYPE_DOCUMENT | = | 9 | Document node type |
TYPE_DOCUMENT_TYPE | = | 10 | Document Type node type |
TYPE_DOCUMENT_FRAGMENT | = | 11 | Document Fragment node type |
TYPE_NOTATION | = | 12 | Notation node type |
TYPE_WHITESPACE | = | 13 | Whitespace node type |
TYPE_SIGNIFICANT_WHITESPACE | = | 14 | Significant Whitespace node type |
TYPE_END_ELEMENT | = | 15 | Element end node type |
TYPE_END_ENTITY | = | 16 | Entity end node type |
TYPE_XML_DECLARATION | = | 17 | XML Declaration node type |
empty_element? | -> | self_closing? |
encoding | [R] | The encoding for the document |
errors | [RW] | A list of errors encountered while parsing |
source | [R] | The XML source |
Read the contents of the current node, including child nodes and markup. Returns a UTF-8 encoded string.