XML Pull Parser is an interface that defines the parsing functionality provided in the XMLPULL V1 API (visit http://www.xmlpull.org to learn more about the API and its implementations).
There are the following kinds of parser, depending on which features are set:
There are only two key methods, next() and nextToken(), which provide access to high-level parsing events and to lower-level tokens respectively.
The parser is always in some event state, and the type of the current event can be determined by calling the getEventType() method. Initially the parser is in the START_DOCUMENT state.
The method next() returns an int that identifies the parsing event. It can return the following events (and changes the parser state to the returned event):
import java.io.IOException;
import java.io.StringReader;

import org.xmlpull.v1.XmlPullParser;
import org.xmlpull.v1.XmlPullParserException;
import org.xmlpull.v1.XmlPullParserFactory;

public class SimpleXmlPullApp {

    public static void main(String[] args) throws XmlPullParserException, IOException {
        XmlPullParserFactory factory = XmlPullParserFactory.newInstance();
        factory.setNamespaceAware(true);
        XmlPullParser xpp = factory.newPullParser();

        xpp.setInput(new StringReader("<foo>Hello World!</foo>"));
        int eventType = xpp.getEventType();
        // Pull events one at a time until the end of the document is reached.
        while (eventType != XmlPullParser.END_DOCUMENT) {
            if (eventType == XmlPullParser.START_DOCUMENT) {
                System.out.println("Start document");
            } else if (eventType == XmlPullParser.END_DOCUMENT) {
                // Never reached: the loop condition exits before END_DOCUMENT is handled.
                System.out.println("End document");
            } else if (eventType == XmlPullParser.START_TAG) {
                System.out.println("Start tag " + xpp.getName());
            } else if (eventType == XmlPullParser.END_TAG) {
                System.out.println("End tag " + xpp.getName());
            } else if (eventType == XmlPullParser.TEXT) {
                System.out.println("Text " + xpp.getText());
            }
            eventType = xpp.next();
        }
    }
}
When run, it produces the following output:
Start document
Start tag foo
Text Hello World!
End tag foo
For more details on the use of the API, please read the Quick Introduction available at http://www.xmlpull.org
See Also:
XmlPullParserFactory, defineEntityReplacementText(java.lang.String, java.lang.String), getName(), getNamespace(java.lang.String), getText(), next(), nextToken(), setInput(java.io.Reader), FEATURE_PROCESS_DOCDECL, FEATURE_VALIDATION, START_DOCUMENT, START_TAG, TEXT, END_TAG, END_DOCUMENT
Field Summary
static byte | CDSECT | TOKEN: a CDATA section was just read (this token is available only from nextToken()).
static int | COMMENT | TOKEN: an XML comment was just read and getText() will return the value inside the comment (this token is available only from nextToken()).
static int | DOCDECL | TOKEN: an XML DOCTYPE declaration was just read and getText() will return the text that is inside the DOCDECL (this token is available only from nextToken()).
static int | END_DOCUMENT | EVENT TYPE and TOKEN: logical end of the XML document (available from next() and nextToken()).
static int | END_TAG | EVENT TYPE and TOKEN: an end tag was just read (available from next() and nextToken()).
static byte | ENTITY_REF | TOKEN: an entity reference was just read (this token is available only from nextToken()).
static java.lang.String | FEATURE_PROCESS_DOCDECL | FEATURE: processing of DOCDECL is false by default; if a DOCDECL is encountered it is reported by nextToken() and ignored by next().
static java.lang.String | FEATURE_PROCESS_NAMESPACES | FEATURE: processing of namespaces is false by default.
static java.lang.String | FEATURE_REPORT_NAMESPACE_ATTRIBUTES | FEATURE: also report namespace attributes. They can be distinguished by looking for prefix == "xmlns", or prefix == null and name == "xmlns". It is off by default and only meaningful when the FEATURE_PROCESS_NAMESPACES feature is on.
static java.lang.String | FEATURE_VALIDATION | FEATURE: report all validation errors as defined by the XML 1.0 specification (implies that FEATURE_PROCESS_DOCDECL is true and that both internal and external DOCDECL will be processed).
static byte | IGNORABLE_WHITESPACE | TOKEN: ignorable whitespace was just read (this token is available only from nextToken()).
static java.lang.String | NO_NAMESPACE | This constant represents the lack of a namespace or the default namespace (empty string "").
static byte | PROCESSING_INSTRUCTION | TOKEN: an XML processing instruction declaration was just read and getText() will return the text that is inside the processing instruction (this token is available only from nextToken()).
static int | START_DOCUMENT | EVENT TYPE and TOKEN: signals that the parser is at the very beginning of the document and nothing has been read yet; the parser is before the first call to next() or nextToken() (available from next() and nextToken()).
static int | START_TAG | EVENT TYPE and TOKEN: a start tag was just read (available from next() and nextToken()).
static int | TEXT | EVENT TYPE and TOKEN: character data was read and will be available by a call to getText() (available from next() and nextToken()).
static java.lang.String[] | TYPES | Use this array to convert an event type number (such as START_TAG) to a string giving the event name, e.g. "START_TAG" == TYPES[START_TAG]. This array contains all event types and token types and represents them as concise strings.
Method Summary
void | defineEntityReplacementText(java.lang.String entityName, java.lang.String replacementText) | Sets a new value for entity replacement text as defined in XML 1.0 Section 4.5 Construction of Internal Entity Replacement Text.
int | getAttributeCount() | Returns the number of attributes on the current element; -1 if the current event is not START_TAG.
java.lang.String | getAttributeName(int index) | Returns the local name of the specified attribute if namespaces are enabled, or just the attribute name if namespaces are disabled.
java.lang.String | getAttributeNamespace(int index) | Returns the namespace URI of the attribute at the specified index (starting from 0).
java.lang.String | getAttributePrefix(int index) | Returns the prefix of the specified attribute. Returns null if the element has no prefix.
java.lang.String | getAttributeType(int index) | Returns the type of the specified attribute. If the parser is non-validating it MUST return CDATA.
java.lang.String | getAttributeValue(int index) | Returns the given attribute's value.
java.lang.String | getAttributeValue(java.lang.String namespace, java.lang.String name) | Returns the attribute's value identified by namespace URI and namespace localName.
int | getColumnNumber() | Current column: numbering starts from 0 (zero should be returned when the parser is in the START_DOCUMENT state!). It must return -1 if the parser does not know the current column number or cannot determine it (for example in the case of WBXML).
int | getDepth() | Returns the current depth of the element.
int | getEventType() | Returns the type of the current event (START_TAG, END_TAG, TEXT, etc.).
boolean | getFeature(java.lang.String name) | Returns the current value of the feature with the given name.
java.lang.String | getInputEncoding() | Returns the input encoding if known, or null if unknown.
int | getLineNumber() | Current line number: numbering starts from 1.
java.lang.String | getName() | For START_TAG or END_TAG, returns the (local) name of the current element when namespaces are enabled, or the raw name when namespaces are disabled.
java.lang.String | getNamespace() | Returns the namespace URI of the current element (the default namespace is represented as an empty string).
java.lang.String | getNamespace(java.lang.String prefix) | Returns the URI for the given prefix.
int | getNamespaceCount(int depth) | Returns the position in the stack of the first namespace slot for the element at the passed depth.
java.lang.String | getNamespacePrefix(int pos) | Returns the namespace prefix for position pos in the namespace stack. If pos is out of range it throws an exception.
java.lang.String | getNamespaceUri(int pos) | Returns the namespace URI for position pos in the namespace stack. If pos is out of range it throws an exception.
java.lang.String | getPositionDescription() | Short text describing the parser position, including a description of the current event, the data source if known and, if possible, what the parser last saw in the input.
java.lang.String | getPrefix() | Returns the prefix of the current element, or null if the element has no prefix (is in the default namespace).
java.lang.Object | getProperty(java.lang.String name) | Looks up the value of a property.
java.lang.String | getText() | Reads the text content of the current event as a String.
char[] | getTextCharacters(int[] holderForStartAndLength) | Gets the buffer that contains the text of the current event; the start offset of the text is passed in the first slot of the input int array and its length in the second slot.
boolean | isAttributeDefault(int index) | Returns true if the specified attribute was not in the input but was declared in XML.
boolean | isEmptyElementTag() | Returns true if the current event is START_TAG and the tag is degenerated (an empty element tag such as <tag/>).
boolean | isWhitespace() | Checks whether the current TEXT event contains only whitespace characters.
int | next() | Gets the next parsing event. Element content will be coalesced and only one TEXT event must be returned for the whole element content (comments and processing instructions will be ignored, and entity references must be expanded or an exception must be thrown if an entity reference cannot be expanded).
int | nextTag() | Calls next() and returns the event if it is START_TAG or END_TAG; otherwise throws an exception.
java.lang.String | nextText() | If the current event is START_TAG, then if the next event is TEXT the element content is returned, or if the next event is END_TAG an empty string is returned; otherwise an exception is thrown.
int | nextToken() | Works similarly to next() but exposes additional event types (COMMENT, CDSECT, DOCDECL, ENTITY_REF, PROCESSING_INSTRUCTION, or IGNORABLE_WHITESPACE) if they are available in the input.
void | require(int type, java.lang.String namespace, java.lang.String name) | Tests whether the current event is of the given type and whether the namespace and name match.
void | setFeature(java.lang.String name, boolean state) | Use this call to change the general behaviour of the parser, such as namespace processing or doctype declaration handling.
void | setInput(java.io.InputStream inputStream, java.lang.String inputEncoding) | Sets the input stream for the parser.
void | setInput(java.io.Reader in) | Sets the input for the parser.
void | setProperty(java.lang.String name, java.lang.Object value) | Sets the value of a property.
Field Detail
public static final java.lang.String NO_NAMESPACE
public static final int START_DOCUMENT
See Also:
next(), nextToken(), Constant Field Values
public static final int END_DOCUMENT
NOTE: calling next() or nextToken() again will result in an exception being thrown.
See Also:
next(), nextToken(), Constant Field Values
public static final int START_TAG
See Also:
next(), nextToken(), getName(), getPrefix(), getNamespace(java.lang.String), getAttributeCount(), getDepth(), getNamespaceCount(int), FEATURE_PROCESS_NAMESPACES, Constant Field Values
public static final int END_TAG
See Also:
next(), nextToken(), getName(), getPrefix(), getNamespace(java.lang.String), FEATURE_PROCESS_NAMESPACES, Constant Field Values
public static final int TEXT
NOTE: next() will (in contrast to nextToken()) accumulate multiple events into one TEXT event, skipping IGNORABLE_WHITESPACE, PROCESSING_INSTRUCTION and COMMENT events.
NOTE: if the state was reached by calling next(), the text value will be normalized; if the token was returned by nextToken(), then getText() will return unnormalized content (no end-of-line normalization: the content is exactly as in the input XML).
See Also:
next(), nextToken(), getText(), Constant Field Values
public static final byte CDSECT
See Also:
nextToken(), getText(), Constant Field Values
public static final byte ENTITY_REF
See Also:
nextToken(), getText(), Constant Field Values
public static final byte IGNORABLE_WHITESPACE
NOTE: this is different from calling the isWhitespace() method, as element content may be whitespace without being ignorable whitespace.
See Also:
nextToken(), getText(), Constant Field Values
public static final byte PROCESSING_INSTRUCTION
See Also:
nextToken(), getText(), Constant Field Values
public static final int COMMENT
See Also:
nextToken(), getText(), Constant Field Values
public static final int DOCDECL
See Also:
nextToken(), getText(), Constant Field Values
public static final java.lang.String[] TYPES
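As a small illustration of the TYPES lookup described in the field summary, the sketch below mirrors a few event constants in a stand-in class so that it runs without an XMLPULL implementation on the classpath. The class name EventNames, the helper describe() and the numeric values are assumptions made for illustration only; real code would use XmlPullParser.TYPES together with the constants listed under Constant Field Values.

```java
public class EventNames {
    // Stand-in constants; the numeric values are assumed for this sketch only.
    static final int START_DOCUMENT = 0;
    static final int END_DOCUMENT = 1;
    static final int START_TAG = 2;
    static final int END_TAG = 3;
    static final int TEXT = 4;

    // Index i holds the name of the event whose constant equals i.
    static final String[] TYPES = {
        "START_DOCUMENT", "END_DOCUMENT", "START_TAG", "END_TAG", "TEXT"
    };

    // Returns a readable name, guarding against values outside the table.
    static String describe(int eventType) {
        if (eventType < 0 || eventType >= TYPES.length) {
            return "UNKNOWN(" + eventType + ")";
        }
        return TYPES[eventType];
    }

    public static void main(String[] args) {
        System.out.println(describe(START_TAG)); // prints START_TAG
        System.out.println(describe(42));        // prints UNKNOWN(42)
    }
}
```

A lookup of this kind is mostly useful in log and error messages, which is how require() uses TYPES further down this page.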
public static final java.lang.String FEATURE_PROCESS_NAMESPACES
NOTE: cannot be changed during parsing!
See Also:
getFeature(java.lang.String), setFeature(java.lang.String, boolean), Constant Field Values
public static final java.lang.String FEATURE_REPORT_NAMESPACE_ATTRIBUTES
NOTE: cannot be changed during parsing!
See Also:
getFeature(java.lang.String), setFeature(java.lang.String, boolean), Constant Field Values
public static final java.lang.String FEATURE_PROCESS_DOCDECL
NOTE: if the DOCDECL was ignored, a fatal exception may occur later in parsing when an undeclared entity is encountered!
NOTE: cannot be changed during parsing!
See Also:
getFeature(java.lang.String), setFeature(java.lang.String, boolean), Constant Field Values
public static final java.lang.String FEATURE_VALIDATION
NOTE: cannot be changed during parsing!
See Also:
getFeature(java.lang.String), setFeature(java.lang.String, boolean), Constant Field Values
Method Detail
public void setFeature(java.lang.String name, boolean state) throws XmlPullParserException
Example: call setFeature(FEATURE_PROCESS_NAMESPACES, true) to switch on namespace processing. Default settings correspond to the properties requested from the XML Pull Parser factory (if none were requested, all features are false by default).
Throws:
XmlPullParserException - if the feature is not supported or cannot be set
java.lang.IllegalArgumentException - if the feature string is null
public boolean getFeature(java.lang.String name)
NOTE: unknown features are
Parameters:
name - The name of the feature to be retrieved.
Throws:
java.lang.IllegalArgumentException - if the feature string is null
public void setProperty(java.lang.String name, java.lang.Object value) throws XmlPullParserException
Throws:
XmlPullParserException
public java.lang.Object getProperty(java.lang.String name)
NOTE: unknown features are
Parameters:
name - The name of the property to be retrieved.
public void setInput(java.io.Reader in) throws XmlPullParserException
Throws:
XmlPullParserException
public void setInput(java.io.InputStream inputStream, java.lang.String inputEncoding) throws XmlPullParserException
NOTE: calling this function will not result in reading any input bytes, even when the parser has to determine the input encoding, including byte order marks and detection of the <?xml encoding declaration (when inputEncoding is null). The XMLPULL implementation MUST postpone reading of input bytes until the first call to one of the next() methods.
NOTE: if inputEncoding is passed it MUST be used; otherwise, if inputEncoding is null, the parser SHOULD try to determine the input encoding following the XML 1.0 specification (see below), but this is not required (for example when the parser is constrained by memory footprint, as in J2ME environments). If encoding detection is supported then the feature http://xmlpull.org/v1/doc/features.html#detect-encoding MUST be true; otherwise it must be false.
Parameters:
inputStream - contains a raw byte input stream of possibly unknown encoding (when inputEncoding is null); in that case the parser must derive the encoding from the <?xml declaration or assume UTF8 or UTF16 as described in XML 1.0 Appendix F.1 Detection Without External Encoding Information. Otherwise, if inputEncoding is present, it must be used (this is consistent with XML 1.0 Appendix F.2 Priorities in the Presence of External Encoding Information, which allows an exception only for files; in such cases inputEncoding should be null to trigger autodetection). If inputStream is null, an IllegalArgumentException must be thrown.
inputEncoding - if not null it MUST be used as the encoding for inputStream
Throws:
XmlPullParserException
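The encoding rules above (an explicit inputEncoding wins, otherwise the parser may inspect the leading bytes) can be sketched as follows. This is a simplified illustration, not how any particular XMLPULL implementation works: the class EncodingSniffer and its methods are hypothetical, only byte order marks and the BOM-less "<?" patterns for UTF-8 and UTF-16 are covered, and reading the encoding name out of the <?xml declaration is left out. In a real parser this step would run at the first call to next() or nextToken(), not inside setInput().

```java
public class EncodingSniffer {
    // Guesses an encoding name from the first bytes of a document.
    // UCS-4 and EBCDIC detection from XML 1.0 Appendix F.1 is deliberately omitted.
    static String guessEncoding(byte[] head, String inputEncoding) {
        if (inputEncoding != null) {
            return inputEncoding; // an explicitly passed encoding must be used
        }
        if (startsWith(head, 0xEF, 0xBB, 0xBF)) return "UTF-8";          // UTF-8 BOM
        if (startsWith(head, 0xFE, 0xFF)) return "UTF-16BE";             // big-endian BOM
        if (startsWith(head, 0xFF, 0xFE)) return "UTF-16LE";             // little-endian BOM
        if (startsWith(head, 0x00, 0x3C, 0x00, 0x3F)) return "UTF-16BE"; // "<?" without BOM
        if (startsWith(head, 0x3C, 0x00, 0x3F, 0x00)) return "UTF-16LE"; // "<?" without BOM
        // Default; a fuller implementation would now read the <?xml declaration.
        return "UTF-8";
    }

    // True if data begins with the given unsigned byte values.
    static boolean startsWith(byte[] data, int... prefix) {
        if (data.length < prefix.length) return false;
        for (int i = 0; i < prefix.length; i++) {
            if ((data[i] & 0xFF) != prefix[i]) return false;
        }
        return true;
    }

    public static void main(String[] args) {
        byte[] utf16le = {(byte) 0xFF, (byte) 0xFE, 0x3C, 0x00};
        System.out.println(guessEncoding(utf16le, null));         // prints UTF-16LE
        System.out.println(guessEncoding(utf16le, "ISO-8859-1")); // prints ISO-8859-1
    }
}
```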
public java.lang.String getInputEncoding()
public void defineEntityReplacementText(java.lang.String entityName, java.lang.String replacementText) throws XmlPullParserException
The motivation for this function is to allow very small implementations of XMLPULL that will work in J2ME environments. They may not be able to process DOCDECL, but can still be made to work with predefined DTDs by using this function to define entities that are well known in advance. Additionally, as XML Schemas are replacing DTDs, allowing parsers not to process DTDs makes it possible to create more efficient parser implementations that can be used as an underlying layer for XML Schema validation.
NOTE: this is replacement text and it is not allowed to contain any other entity reference.
NOTE: the list of pre-defined entities will always contain the standard XML entities (such as &amp; &lt; &gt; &quot; &apos;) and they cannot be replaced!
Throws:
XmlPullParserException
See Also:
setInput(java.io.Reader), FEATURE_PROCESS_DOCDECL, FEATURE_VALIDATION
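The two notes above can be pictured with a small entity table. The sketch below is a stand-in, not part of the XMLPULL API: the class EntityTable and its define() and expand() methods are hypothetical, and rejecting a redefinition with IllegalArgumentException is a choice made here (the interface itself declares XmlPullParserException). It shows the standard entities being always present and protected, and replacement text being inserted verbatim without a second scan for references.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class EntityTable {
    // The standard XML entities; per the notes above they cannot be replaced.
    private static final List<String> PREDEFINED =
            Arrays.asList("amp", "lt", "gt", "quot", "apos");

    private final Map<String, String> entities = new HashMap<>();

    public EntityTable() {
        entities.put("amp", "&");
        entities.put("lt", "<");
        entities.put("gt", ">");
        entities.put("quot", "\"");
        entities.put("apos", "'");
    }

    // Registers replacement text for an entity name.
    public void define(String entityName, String replacementText) {
        if (PREDEFINED.contains(entityName)) {
            throw new IllegalArgumentException("cannot redefine &" + entityName + ";");
        }
        entities.put(entityName, replacementText);
    }

    // Expands &name; references in a single pass. Replacement text is
    // appended verbatim and is never scanned again for further references.
    public String expand(String text) {
        StringBuilder out = new StringBuilder();
        int i = 0;
        while (i < text.length()) {
            char c = text.charAt(i);
            int end = (c == '&') ? text.indexOf(';', i) : -1;
            if (end < 0) { // ordinary character, or a stray '&' with no ';'
                out.append(c);
                i++;
                continue;
            }
            String name = text.substring(i + 1, end);
            String value = entities.get(name);
            if (value == null) {
                throw new IllegalArgumentException("undeclared entity: " + name);
            }
            out.append(value);
            i = end + 1;
        }
        return out.toString();
    }

    public static void main(String[] args) {
        EntityTable table = new EntityTable();
        table.define("copy", "(c)");
        System.out.println(table.expand("a &copy; &lt; b")); // prints: a (c) < b
    }
}
```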
public int getNamespaceCount(int depth) throws XmlPullParserException
NOTE: the default namespace is included in the namespace table and is available by using a null prefix, as in getNamespace(null) (which may return null if xmlns="..." is not present), as well as by calling getNamespace() (which never returns null but "").
NOTE: when the parser is on END_TAG, it is allowed to call this function with a getDepth()+1 argument to retrieve the position of the namespace prefixes and URIs that were declared on the corresponding START_TAG.
Throws:
XmlPullParserException
See Also:
getNamespacePrefix(int), getNamespaceUri(int), getNamespace(), getNamespace(String)
public java.lang.String getNamespacePrefix(int pos) throws XmlPullParserException
NOTE: when the parser is on END_TAG, the namespace prefixes that were declared in the corresponding START_TAG are still accessible even though they are not in scope.
Throws:
XmlPullParserException
public java.lang.String getNamespaceUri(int pos) throws XmlPullParserException
NOTE: when the parser is on END_TAG, the namespace prefixes that were declared in the corresponding START_TAG are still accessible even though they are not in scope.
Throws:
XmlPullParserException
public java.lang.String getNamespace(java.lang.String prefix)
It will return null if the namespace could not be found.
Convenience method for
for (int i = getNamespaceCount(getDepth()) - 1; i >= 0; i--) {
    if (getNamespacePrefix(i).equals(prefix)) {
        return getNamespaceUri(i);
    }
}
return null;
NOTE: a parser implementation can do a more efficient lookup (using a Hashtable, for example).
NOTE: The 'xml' prefix is bound, as defined in the Namespaces in XML specification, to "http://www.w3.org/XML/1998/namespace".
NOTE: The 'xmlns' prefix must be resolved to the following namespace: http://www.w3.org/2000/xmlns/ (visit this URL for a description!).
See Also:
getNamespaceCount(int), getNamespacePrefix(int), getNamespaceUri(int)
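To make the relationship between getDepth(), getNamespaceCount(int), getNamespacePrefix(int), getNamespaceUri(int) and getNamespace(String) more concrete, here is a minimal stand-in for the namespace stack. The class NamespaceStack and its startTag()/endTag() methods are hypothetical and exist only to drive the example; the special END_TAG behaviour (declarations still reachable at getDepth()+1) is not modelled. The lookup uses a null-safe comparison so that the default namespace, stored under a null prefix, can be found with getNamespace(null).

```java
import java.util.ArrayList;
import java.util.List;

public class NamespaceStack {
    private final List<String> prefixes = new ArrayList<>();
    private final List<String> uris = new ArrayList<>();
    // counts.get(d) = number of declarations in scope at depth d (0 = outside root).
    private final List<Integer> counts = new ArrayList<>();

    public NamespaceStack() {
        counts.add(0);
    }

    // Opens an element; declarations are passed as prefix/URI pairs
    // (a null prefix stands for the default namespace).
    public void startTag(String... prefixUriPairs) {
        for (int i = 0; i < prefixUriPairs.length; i += 2) {
            prefixes.add(prefixUriPairs[i]);
            uris.add(prefixUriPairs[i + 1]);
        }
        counts.add(prefixes.size());
    }

    // Closes the current element and drops the declarations it made.
    public void endTag() {
        counts.remove(counts.size() - 1);
        int keep = counts.get(counts.size() - 1);
        while (prefixes.size() > keep) {
            prefixes.remove(prefixes.size() - 1);
            uris.remove(uris.size() - 1);
        }
    }

    public int getDepth() { return counts.size() - 1; }
    public int getNamespaceCount(int depth) { return counts.get(depth); }
    public String getNamespacePrefix(int pos) { return prefixes.get(pos); }
    public String getNamespaceUri(int pos) { return uris.get(pos); }

    // Searches from the innermost declaration outwards, as in the loop above.
    public String getNamespace(String prefix) {
        for (int i = getNamespaceCount(getDepth()) - 1; i >= 0; i--) {
            String candidate = getNamespacePrefix(i);
            if (prefix == null ? candidate == null : prefix.equals(candidate)) {
                return getNamespaceUri(i);
            }
        }
        return null;
    }

    public static void main(String[] args) {
        NamespaceStack ns = new NamespaceStack();
        ns.startTag("a", "urn:one");
        ns.startTag("a", "urn:two", null, "urn:default");
        System.out.println(ns.getNamespace("a"));  // prints urn:two
        ns.endTag();
        System.out.println(ns.getNamespace("a"));  // prints urn:one
    }
}
```

Searching from the top of the stack downwards is what lets an inner declaration of a prefix shadow an outer one.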
public int getDepth()
<!-- outside -->     0
<root>               1
  sometext           1
    <foobar>         2
    </foobar>        2
</root>              1
<!-- outside -->     0
public java.lang.String getPositionDescription()
public int getLineNumber()
public int getColumnNumber()
public boolean isWhitespace() throws XmlPullParserException
NOTE: non-validating parsers are not able to distinguish whitespace and ignorable whitespace, except for whitespace outside the root element. Ignorable whitespace is reported as a separate event, which is exposed via nextToken() only.
NOTE: this function can only be called for element-content-related events (TEXT, CDSECT or IGNORABLE_WHITESPACE); otherwise an exception will be thrown!
Throws:
XmlPullParserException
public java.lang.String getText()
NOTE: in the case of ENTITY_REF this method returns the entity replacement text (or null if not available); it is the only case in which getText() and getTextCharacters() return different values.
public char[] getTextCharacters(int[] holderForStartAndLength)
NOTE: this buffer must not be modified and its content MAY change after a call to next() or nextToken().
NOTE: this method must always return the same value as getText(), except in the case of ENTITY_REF (where getText() is the replacement text and this method returns the actual input buffer with the entity name, the same as getName()). If getText() returns null then this method returns null as well, and the values returned in the holder MUST be -1 (both start and length).
Parameters:
holderForStartAndLength - the 2-element int array into which the values of start offset and length will be written, in the first and second slot of the array.
See Also:
getText()
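The holder convention described above can be exercised with a stand-in. In the sketch below, getTextCharacters() is a hypothetical static method that only imitates the documented contract (start in slot 0, length in slot 1, shared buffer returned); the class TextHolderDemo and the helper copyText() are likewise assumptions. The point is that a caller should copy the characters out before requesting the next event, because the buffer content may change.

```java
public class TextHolderDemo {
    // Stand-in for a parser's internal buffer; the current event's text is "Hello".
    private static final char[] BUFFER = "<foo>Hello</foo>".toCharArray();

    // Hypothetical stand-in following the documented contract:
    // the start offset goes into slot 0 and the length into slot 1.
    static char[] getTextCharacters(int[] holderForStartAndLength) {
        holderForStartAndLength[0] = 5; // index of 'H' in BUFFER
        holderForStartAndLength[1] = 5; // length of "Hello"
        return BUFFER;
    }

    // Copies the text out immediately, since the buffer must not be modified
    // and may change after the next parsing event.
    static String copyText(char[] buf, int[] holder) {
        if (buf == null) {
            return null; // in this case the holder contains -1 in both slots
        }
        return new String(buf, holder[0], holder[1]);
    }

    public static void main(String[] args) {
        int[] holder = new int[2];
        char[] buf = getTextCharacters(holder);
        System.out.println(copyText(buf, holder)); // prints Hello
    }
}
```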
public java.lang.String getNamespace()
public java.lang.String getName()
NOTE: to reconstruct the raw element name when namespaces are enabled, you will need to add the prefix and a colon to localName if the prefix is not null.
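The reconstruction mentioned in the note is a one-liner. The helper below is a sketch with hypothetical names (RawNames, rawName); with a real parser the arguments would typically come from getPrefix() and getName().

```java
public class RawNames {
    // Rebuilds the raw (qualified) name from a prefix and a local name.
    // A null prefix means the element has no prefix, so the local name is returned as is.
    static String rawName(String prefix, String localName) {
        return prefix == null ? localName : prefix + ":" + localName;
    }

    public static void main(String[] args) {
        System.out.println(rawName("soap", "Envelope")); // prints soap:Envelope
        System.out.println(rawName(null, "foo"));        // prints foo
    }
}
```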
public java.lang.String getPrefix()
public boolean isEmptyElementTag() throws XmlPullParserException
NOTE: if the parser is not on START_TAG, an exception will be thrown.
Throws:
XmlPullParserException
public int getAttributeCount()
See Also:
getAttributeNamespace(int), getAttributeName(int), getAttributePrefix(int), getAttributeValue(int)
public java.lang.String getAttributeNamespace(int index)
NOTE: if FEATURE_REPORT_NAMESPACE_ATTRIBUTES is set, then namespace attributes (xmlns:ns='...') must be reported with the namespace http://www.w3.org/2000/xmlns/ (visit this URL for a description!). The default namespace attribute (xmlns="...") will be reported with an empty namespace.
NOTE: The xml prefix is bound, as defined in the Namespaces in XML specification, to "http://www.w3.org/XML/1998/namespace".
public java.lang.String getAttributeName(int index)
public java.lang.String getAttributePrefix(int index)
public java.lang.String getAttributeType(int index)
public boolean isAttributeDefault(int index)
public java.lang.String getAttributeValue(int index)
NOTE: the attribute value must be normalized (including entity replacement text if PROCESS_DOCDECL is false) as described in XML 1.0 section 3.3.3 Attribute-Value Normalization.
See Also:
defineEntityReplacementText(java.lang.String, java.lang.String)
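As a rough picture of what the normalization note implies, the sketch below applies only the whitespace part of XML 1.0 section 3.3.3 for CDATA attributes: line breaks are normalized first, and every remaining tab or line feed becomes a single space. Character references and entity references, which the real rules also cover, are not handled here. The class AttributeNormalizer and the method normalizeCdata() are hypothetical names.

```java
public class AttributeNormalizer {
    // Whitespace handling only; references such as &#10; or &amp; are NOT processed.
    static String normalizeCdata(String raw) {
        StringBuilder out = new StringBuilder(raw.length());
        for (int i = 0; i < raw.length(); i++) {
            char c = raw.charAt(i);
            if (c == '\r') {
                // Line-end normalization: "\r\n" counts as one line break.
                if (i + 1 < raw.length() && raw.charAt(i + 1) == '\n') {
                    i++;
                }
                out.append(' ');
            } else if (c == '\n' || c == '\t') {
                out.append(' ');
            } else {
                out.append(c); // ordinary characters, including plain spaces
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(normalizeCdata("a\r\nb\tc\nd")); // prints: a b c d
    }
}
```

Leading, trailing and repeated spaces are left untouched, which matches the CDATA case; non-CDATA attribute types would need an extra trimming and collapsing step.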
public java.lang.String getAttributeValue(java.lang.String namespace, java.lang.String name)
NOTE: the attribute value must be normalized (including entity replacement text if PROCESS_DOCDECL is false) as described in XML 1.0 section 3.3.3 Attribute-Value Normalization.
Parameters:
namespace - Namespace of the attribute if namespaces are enabled, otherwise must be null
name - If namespaces are enabled, the local name of the attribute, otherwise just the attribute name
See Also:
defineEntityReplacementText(java.lang.String, java.lang.String)
public int getEventType() throws XmlPullParserException
Throws:
XmlPullParserException
See Also:
next(), nextToken()
public int next() throws XmlPullParserException, java.io.IOException
NOTE: an empty element (such as <tag/>) will be reported with two separate events, START_TAG and END_TAG. It must be so to preserve parsing equivalency of an empty element to <tag></tag> (see isEmptyElementTag()).
Throws:
XmlPullParserException
java.io.IOException
See Also:
isEmptyElementTag(), START_TAG, TEXT, END_TAG, END_DOCUMENT
public int nextToken() throws XmlPullParserException, java.io.IOException
If the special feature FEATURE_XML_ROUNDTRIP (identified by URI: http://xmlpull.org/v1/doc/features.html#xml-roundtrip) is true, then it is possible to do an XML document round trip, i.e. reproduce the XML input exactly on output using getText().
Here is the list of tokens that can be returned from nextToken() and what getText() and getTextCharacters() return:
" titlepage SYSTEM "http://www.foo.bar/dtds/typo.dtd" [<!ENTITY % active.links "INCLUDE">]"
for an input document that contained:
<!DOCTYPE titlepage SYSTEM "http://www.foo.bar/dtds/typo.dtd" [<!ENTITY % active.links "INCLUDE">]>
NOTE: the returned text of a token is not end-of-line normalized.
Throws:
XmlPullParserException
java.io.IOException
See Also:
next(), START_TAG, TEXT, END_TAG, END_DOCUMENT, COMMENT, DOCDECL, PROCESSING_INSTRUCTION, ENTITY_REF, IGNORABLE_WHITESPACE
public void require(int type, java.lang.String namespace, java.lang.String name) throws XmlPullParserException, java.io.IOException
Essentially it does this:
if (type != getEventType()
    || (namespace != null && !namespace.equals(getNamespace()))
    || (name != null && !name.equals(getName())))
    throw new XmlPullParserException("expected " + TYPES[type] + getPositionDescription());
Throws:
XmlPullParserException
java.io.IOException
public java.lang.String nextText() throws XmlPullParserException, java.io.IOException
The motivation for this function is to allow consistent parsing of both empty elements and elements that have non-empty content. For example, an empty <tag/> and a <tag> with text content can both be parsed with the same code:
p.nextTag();
p.require(p.START_TAG, "", "tag");
String content = p.nextText();
p.require(p.END_TAG, "", "tag");
This function together with nextTag() makes it very easy to parse XML that has no mixed content.
Essentially it does this:
if (getEventType() != START_TAG) {
    throw new XmlPullParserException(
        "parser must be on START_TAG to read next text", this, null);
}
int eventType = next();
if (eventType == TEXT) {
    String result = getText();
    eventType = next();
    if (eventType != END_TAG) {
        throw new XmlPullParserException(
            "event TEXT it must be immediately followed by END_TAG", this, null);
    }
    return result;
} else if (eventType == END_TAG) {
    return "";
} else {
    throw new XmlPullParserException(
        "parser must be on START_TAG or TEXT to read text", this, null);
}
Throws:
XmlPullParserException
java.io.IOException
public int nextTag() throws XmlPullParserException, java.io.IOException
Essentially it does this:
int eventType = next();
if (eventType == TEXT && isWhitespace()) { // skip whitespace
    eventType = next();
}
if (eventType != START_TAG && eventType != END_TAG) {
    throw new XmlPullParserException("expected start or end tag", this, null);
}
return eventType;
Throws:
XmlPullParserException
java.io.IOException