com.google.streamhtmlparser.util
Class EntityResolver

java.lang.Object
  extended by com.google.streamhtmlparser.util.EntityResolver

public class EntityResolver
extends Object

Decodes (unescapes) HTML entities. Because the characters of an entity are received one at a time, they must be stored temporarily until the entity is complete. Any "junk" characters received before the actual entity are discarded.

This class is designed to be 100% compatible with the corresponding logic in the C version of the com.google.security.streamhtmlparser.HtmlParser, found in htmlparser.c. There are, however, a few intentional differences, outlined below:

Valid HTML entities have one of the following three forms:

A reset method is provided to facilitate object re-use.
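
The character-at-a-time decoding described above can be illustrated with a minimal sketch. SketchResolver is a hypothetical stand-in written for this example, not the library class: its IN_PROGRESS status, its rule that an entity ends at ';', and the small set of named entities it knows are all assumptions, and it makes no attempt to match htmlparser.c.

```java
// Hypothetical stand-in for illustration only; not the library's EntityResolver.
final class SketchResolver {
    // NOT_STARTED and COMPLETED appear in the documentation; IN_PROGRESS is assumed.
    enum Status { NOT_STARTED, IN_PROGRESS, COMPLETED }

    private final StringBuilder buffer = new StringBuilder();
    private Status status = Status.NOT_STARTED;
    private String entity = "";

    // Feeds one character. Characters seen before '&' are discarded as junk.
    Status processChar(char input) {
        if (status == Status.COMPLETED) {
            return status; // the caller is expected to reset() before re-use
        }
        if (status == Status.NOT_STARTED) {
            if (input == '&') {
                status = Status.IN_PROGRESS;
            }
            return status;
        }
        if (input == ';') {
            entity = decode(buffer.toString());
            status = Status.COMPLETED;
        } else {
            buffer.append(input); // store until the terminator arrives
        }
        return status;
    }

    // Returns the decoded entity, or an empty String unless status is COMPLETED.
    String getEntity() {
        return status == Status.COMPLETED ? entity : "";
    }

    // Returns the object to its original state, dropping stored characters.
    void reset() {
        buffer.setLength(0);
        entity = "";
        status = Status.NOT_STARTED;
    }

    private static String decode(String body) {
        switch (body) {
            case "lt": return "<";
            case "gt": return ">";
            case "amp": return "&";
            case "quot": return "\"";
            case "apos": return "'";
            default: break;
        }
        try {
            if (body.startsWith("#x") || body.startsWith("#X")) {
                return new String(Character.toChars(Integer.parseInt(body.substring(2), 16)));
            }
            if (body.startsWith("#")) {
                return new String(Character.toChars(Integer.parseInt(body.substring(1), 10)));
            }
        } catch (IllegalArgumentException e) {
            // Malformed or out-of-range number: fall through to the raw text.
        }
        return "&" + body + ";"; // unknown entity: hand back the raw text
    }

    // Feeds a whole string one character at a time and returns the first entity.
    static String decodeFirst(String text) {
        SketchResolver resolver = new SketchResolver();
        for (char c : text.toCharArray()) {
            if (resolver.processChar(c) == Status.COMPLETED) {
                break;
            }
        }
        return resolver.getEntity();
    }

    public static void main(String[] args) {
        System.out.println(decodeFirst("junk&lt;")); // prints "<"
    }
}
```

The caller drives the loop and checks the returned status after every character, which mirrors the processChar / getEntity contract documented below. How the real class treats unterminated or unknown entities is not shown here.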


Nested Class Summary
static class EntityResolver.Status
          Returned by the processChar method.
 
Constructor Summary
EntityResolver()
          Constructs an entity resolver that is initially empty and has status NOT_STARTED; see EntityResolver.Status.
EntityResolver(EntityResolver aEntityResolver)
          Constructs an entity resolver that is an exact copy of the one provided.
 
Method Summary
 String getEntity()
          Returns the decoded HTML entity.
 EntityResolver.Status processChar(char input)
          Processes a character from the input stream and decodes any HTML entity contained in the characters processed so far.
 void reset()
          Returns the object to its original state for re-use, deleting any stored characters that may be present.
 String toString()
          Returns the full state of the EntityResolver in a human-readable form.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait
 

Constructor Detail

EntityResolver

public EntityResolver()
Constructs an entity resolver that is initially empty and has status NOT_STARTED; see EntityResolver.Status.


EntityResolver

public EntityResolver(EntityResolver aEntityResolver)
Constructs an entity resolver that is an exact copy of the one provided. In particular it has the same contents and status.

Parameters:
aEntityResolver - the entity resolver to copy
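
An "exact copy" with the same contents and status implies that the buffered characters are duplicated and not shared. The sketch below shows that idea on a hypothetical miniature class, SnapshotSketch; its fields, the feed method, and the IN_PROGRESS status are assumptions made for illustration and are not taken from the library.

```java
// Hypothetical miniature of a resolver's state; not the library class.
final class SnapshotSketch {
    enum Status { NOT_STARTED, IN_PROGRESS, COMPLETED }

    private final StringBuilder buffer;
    private Status status;

    SnapshotSketch() {
        buffer = new StringBuilder();
        status = Status.NOT_STARTED;
    }

    // Copy constructor: duplicates the buffered characters and the status,
    // so the copy shares no mutable state with the original.
    SnapshotSketch(SnapshotSketch other) {
        buffer = new StringBuilder(other.buffer);
        status = other.status;
    }

    // Stores one character and marks the object as in progress.
    void feed(char c) {
        buffer.append(c);
        status = Status.IN_PROGRESS;
    }

    String contents() {
        return buffer.toString();
    }

    Status status() {
        return status;
    }

    public static void main(String[] args) {
        SnapshotSketch original = new SnapshotSketch();
        original.feed('l');
        SnapshotSketch copy = new SnapshotSketch(original);
        original.feed('t'); // does not affect the copy
        System.out.println(copy.contents() + " " + original.contents());
    }
}
```

A copy taken in the middle of an entity can then be advanced independently of the original, which is useful when a parser needs to save its state and continue along two paths.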
Method Detail

reset

public void reset()
Returns the object to its original state for re-use, deleting any stored characters that may be present.


toString

public String toString()
Returns the full state of the EntityResolver in a human-readable form. The format of the returned String is not specified and is subject to change.

Overrides:
toString in class Object
Returns:
full state of this object

getEntity

public String getEntity()
Returns the decoded HTML entity. Call this method only after processChar has returned status COMPLETED.

Returns:
the decoded HTML entity, or an empty String if the status is anything other than COMPLETED

processChar

public EntityResolver.Status processChar(char input)
Processes a character from the input stream and decodes any HTML entity contained in the characters processed so far.

Parameters:
input - the char to process
Returns:
the status of the resolver after processing this character, see EntityResolver.Status. The status is COMPLETED once a full entity has been decoded; until then the resolver is still awaiting more characters.


Copyright © 2010-2012 Google. All Rights Reserved.