org.w3c.dom.ls
Interface DOMWriter


public interface DOMWriter

DOM Level 3 WD Experimental: The DOM Level 3 specification is at the stage of Working Draft, which represents work in progress and thus may be updated, replaced, or obsoleted by other documents at any time.

DOMWriter provides an API for serializing (writing) a DOM document out as an XML document. The XML data is written to an output stream, the type of which depends on the specific language bindings in use.

During serialization of XML data, namespace fixup is done when possible as defined in [DOM Level 3 Core], Appendix B. [DOM Level 2 Core] allows empty strings as a real namespace URI. If the namespaceURI of a Node is the empty string, the serialization will treat it as null, ignoring the prefix if any. Editorial note: should the remark on DOM Level 2 namespace URIs be included in the namespace algorithm in Core instead?

DOMWriter accepts any node type for serialization. For nodes of type Document or Entity, well formed XML will be created if possible. The serialized output for these node types is either as a Document or an External Entity, respectively, and is acceptable input for an XML parser. For all other types of nodes the serialized form is not specified, but should be something useful to a human for debugging or diagnostic purposes. Note: rigorously designing an external (source) form for stand-alone node types that don't already have one defined in [XML 1.0] seems a bit much to take on here.

Within a Document, DocumentFragment, or Entity being serialized, Nodes are processed as follows:
Document nodes are written, including the XML declaration and a DTD subset, if one exists in the DOM. Writing a Document node serializes the entire document.
Entity nodes, when written directly by DOMWriter.writeNode, output the entity expansion, but no namespace fixup is done. The resulting output will be valid as an external entity.
EntityReference nodes are serialized as an entity reference of the form "&entityName;" in the output. Child nodes (the expansion) of the entity reference are ignored.
CDATA sections containing content characters that cannot be represented in the specified output encoding are handled according to the "split-cdata-sections" boolean parameter. If the boolean parameter is true, CDATA sections are split, and the unrepresentable characters are serialized as numeric character references in ordinary content. The exact position and number of splits is not specified. If the boolean parameter is false, unrepresentable characters in a CDATA section are reported as errors. The error is not recoverable: there is no mechanism for supplying alternative characters and continuing with the serialization.
DocumentFragment nodes are serialized by serializing the children of the document fragment in the order they appear in the document fragment.
All other node types (Element, Text, etc.) are serialized to their corresponding XML source form.

The serialization of a Node does not always generate a well-formed XML document, i.e. a DOMBuilder might throw fatal errors when parsing the resulting serialization.
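
The "split-cdata-sections" behavior described above can be illustrated with a minimal sketch. The class and method names (CDataSplitter, serialize) are hypothetical and are not part of DOMWriter. Since the exact position and number of splits is not specified, the sketch simply closes the current section before each unrepresentable character and reopens one afterwards, which is one permissible choice among many. It does not handle a "]]>" sequence inside the content.

```java
import java.nio.charset.Charset;
import java.nio.charset.CharsetEncoder;
import java.util.Locale;

// Hypothetical helper, not part of the DOMWriter API.
final class CDataSplitter {
    private CDataSplitter() {}

    /**
     * Serializes CDATA content for the given output charset.
     * With splitCdataSections == true, unrepresentable characters are written
     * as numeric character references between CDATA sections; with false,
     * they are reported as an unrecoverable error (here: an exception).
     */
    static String serialize(String content, Charset outputCharset, boolean splitCdataSections) {
        if (content.isEmpty()) {
            return "<![CDATA[]]>";
        }
        CharsetEncoder encoder = outputCharset.newEncoder();
        StringBuilder out = new StringBuilder();
        boolean sectionOpen = false;
        for (int i = 0; i < content.length(); ) {
            int cp = content.codePointAt(i);
            int len = Character.charCount(cp);
            if (encoder.canEncode(content.subSequence(i, i + len))) {
                if (!sectionOpen) {
                    out.append("<![CDATA[");
                    sectionOpen = true;
                }
                out.appendCodePoint(cp);
            } else {
                if (!splitCdataSections) {
                    throw new IllegalStateException(
                            "Unrepresentable character in CDATA section: U+"
                                    + Integer.toHexString(cp).toUpperCase(Locale.ROOT));
                }
                if (sectionOpen) {
                    out.append("]]>");
                    sectionOpen = false;
                }
                // Numeric character reference in ordinary content.
                out.append("&#x")
                   .append(Integer.toHexString(cp).toUpperCase(Locale.ROOT))
                   .append(';');
            }
            i += len;
        }
        if (sectionOpen) {
            out.append("]]>");
        }
        return out.toString();
    }
}
```

Under these assumptions, serializing the content "añb" to US-ASCII with splitting enabled would produce two CDATA sections with a character reference for "ñ" between them.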

Within the character data of a document (outside of markup), any characters that cannot be represented directly are replaced with character references. Occurrences of '<' and '&' are replaced by the predefined entities &lt; and &amp;. The other predefined entities (&gt;, &apos;, and &quot;) are not used; these characters can be included directly. Any character that cannot be represented directly in the output character encoding is serialized as a numeric character reference.
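
As a rough illustration of the character data rule above, the following sketch escapes a string for a given output charset. CharDataEscaper and escape are hypothetical names, not part of DOMWriter, and the use of hexadecimal character references is an assumption; the text only requires a numeric character reference.

```java
import java.nio.charset.Charset;
import java.nio.charset.CharsetEncoder;
import java.util.Locale;

// Hypothetical helper, not part of the DOMWriter API.
final class CharDataEscaper {
    private CharDataEscaper() {}

    /** Escapes character data (text outside of markup) for the output charset. */
    static String escape(String text, Charset outputCharset) {
        CharsetEncoder encoder = outputCharset.newEncoder();
        StringBuilder out = new StringBuilder(text.length());
        for (int i = 0; i < text.length(); ) {
            int cp = text.codePointAt(i);
            int len = Character.charCount(cp);
            if (cp == '<') {
                out.append("&lt;");
            } else if (cp == '&') {
                out.append("&amp;");
            } else if (encoder.canEncode(text.subSequence(i, i + len))) {
                // '>', apostrophes and quotes are written directly.
                out.appendCodePoint(cp);
            } else {
                out.append("&#x")
                   .append(Integer.toHexString(cp).toUpperCase(Locale.ROOT))
                   .append(';');
            }
            i += len;
        }
        return out.toString();
    }
}
```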

Attributes not containing quotes are serialized in quotes. Attributes containing quotes but no apostrophes are serialized in apostrophes (single quotes). Attributes containing both forms of quotes are serialized in quotes, with quotes within the value represented by the predefined entity &quot;. Any character that can not be represented directly in the output character encoding is serialized as a numeric character reference.
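
The delimiter choice for attribute values can be sketched as follows, reading "quotes" as the double quote character. AttrQuoter and quote are hypothetical names; the sketch covers only the choice of delimiter and the &quot; replacement, and deliberately leaves out the numeric character references for unrepresentable characters and any other escaping.

```java
// Hypothetical helper, not part of the DOMWriter API.
final class AttrQuoter {
    private AttrQuoter() {}

    /** Wraps an attribute value in the delimiters described above. */
    static String quote(String value) {
        if (value.indexOf('"') < 0) {
            // No double quotes: serialize in double quotes.
            return "\"" + value + "\"";
        }
        if (value.indexOf('\'') < 0) {
            // Double quotes but no apostrophes: serialize in apostrophes.
            return "'" + value + "'";
        }
        // Both present: double quotes outside, &quot; inside.
        return "\"" + value.replace("\"", "&quot;") + "\"";
    }
}
```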

Within markup, but outside of attributes, any occurrence of a character that cannot be represented in the output character encoding is reported as an error. An example would be serializing the element <LaCañada/> with encoding="us-ascii".

When requested by setting the normalize-characters boolean parameter on DOMWriter, all data to be serialized, both markup and character data, is W3C Text normalized according to the rules defined in [CharModel]. The W3C Text normalization process affects only the data as it is being written; it does not alter the DOM's view of the document after serialization has completed.

Namespaces are fixed up during serialization: the serialization process will verify that namespace declarations, namespace prefixes and the namespace URIs associated with elements and attributes are consistent. If inconsistencies are found, the serialized form of the document will be altered to remove them. The method used for doing the namespace fixup while serializing a document is the algorithm defined in Appendix B.1 "Namespace normalization" of [DOM Level 3 Core]. Editorial note: the previous paragraph is to be defined more closely here.

Any changes made affect only the namespace prefixes and declarations appearing in the serialized data. The DOM's view of the document is not altered by the serialization operation, and does not reflect any changes made to namespace declarations or prefixes in the serialized output.

While serializing a document, the serializer will write out non-specified values (such as attributes whose specified is false) if the discard-default-content boolean parameter is set to true. If the discard-default-content flag is set to false and a schema is used for validation, the schema will also be used to determine if a value is specified or not. If no schema is used, the specified flag on attribute nodes is used to determine if attribute values should be written out.

Editorial note: reference the Core specification (1.1.9, XML namespaces, 5th paragraph) for the entity reference description and its warning about unbound entity references. Entity references are always serialized as &foo;; this should also be mentioned in the load part of this specification.

See also the Document Object Model (DOM) Level 3 Load and Save Specification.


Method Summary
 org.apache.xerces.dom3.DOMConfiguration getConfig()
          The configuration used when a document is serialized.
 java.lang.String getEncoding()
          The character encoding in which the output will be written.
 DOMWriterFilter getFilter()
          When the application provides a filter, the serializer will call out to the filter before serializing each Node.
 java.lang.String getNewLine()
          The end-of-line sequence of characters to be used in the XML being written out.
 void setEncoding(java.lang.String encoding)
          The character encoding in which the output will be written.
 void setFilter(DOMWriterFilter filter)
          When the application provides a filter, the serializer will call out to the filter before serializing each Node.
 void setNewLine(java.lang.String newLine)
          The end-of-line sequence of characters to be used in the XML being written out.
 boolean writeNode(java.io.OutputStream destination, org.w3c.dom.Node wnode)
          Write out the specified node as described above in the description of DOMWriter.
 java.lang.String writeToString(org.w3c.dom.Node wnode)
          Serialize the specified node as described above in the description of DOMWriter.
 

Method Detail

getConfig

public org.apache.xerces.dom3.DOMConfiguration getConfig()
The configuration used when a document is serialized.
In addition to the boolean parameters and parameters recognized in the Core module, the DOMConfiguration object for DOMWriter adds, or modifies, the following boolean parameters:
"entity-resolver"
This parameter is equivalent to the "entity-resolver" parameter defined in DOMBuilder.config.
"xml-declaration"
true
[required] (default) If a Document node or an Entity node is serialized, the XML declaration, or text declaration, should be included if Document.version and/or an encoding is specified.
false
[required] Do not serialize the XML and text declarations.
"canonical-form"
true
[optional] This formatting writes the document according to the rules specified in [Canonical XML]. Setting this boolean parameter to true will set the boolean parameter "format-pretty-print" to false.
false
[required] (default) Do not canonicalize the output.
"format-pretty-print"
true
[optional] Format the output by adding whitespace to produce a pretty-printed, indented, human-readable form. The exact form of the transformations is not specified by this specification. Setting this boolean parameter to true will set the boolean parameter "canonical-form" to false.
false
[required] (default) Don't pretty-print the result.
"normalize-characters"
This boolean parameter is equivalent to the one defined by DOMConfiguration in [DOM Level 3 Core]. Unlike in the Core, the default value for this boolean parameter is true. While DOM implementations are not required to implement the W3C Text Normalization defined in [CharModel], this boolean parameter must be activated by default if supported.
"unknown-characters"
true
[required] (default) If, while verifying full normalization when [XML 1.1] is supported, a character is encountered for which the normalization properties cannot be determined, then ignore any possible denormalizations caused by these characters.
false
[optional] Report a fatal error if a character is encountered for which the processor cannot determine the normalization properties.

getEncoding

public java.lang.String getEncoding()
The character encoding in which the output will be written.
The encoding to use when writing is determined as follows:
If the encoding attribute has been set, that value will be used.
If the encoding attribute is null or empty, but the item to be written, or the owner document of the item, specifies an encoding (i.e. the "actualEncoding" from the document), that value will be used.
If neither of the above provides an encoding name, a default encoding of "UTF-8" will be used.
The default value is null.
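
The fallback order above can be sketched as a small helper. EncodingResolver and resolve are hypothetical names, and both arguments are assumed to be supplied by the caller: the value of the encoding attribute and the encoding recorded on the item or its owner document.

```java
// Hypothetical helper, not part of the DOMWriter API.
final class EncodingResolver {
    private EncodingResolver() {}

    /**
     * @param writerEncoding   value of the writer's encoding attribute (may be null or empty)
     * @param documentEncoding encoding specified by the item or its owner document (may be null or empty)
     */
    static String resolve(String writerEncoding, String documentEncoding) {
        if (writerEncoding != null && !writerEncoding.isEmpty()) {
            return writerEncoding;
        }
        if (documentEncoding != null && !documentEncoding.isEmpty()) {
            return documentEncoding;
        }
        return "UTF-8";
    }
}
```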

setEncoding

public void setEncoding(java.lang.String encoding)
The character encoding in which the output will be written.
The encoding to use when writing is determined as follows:
If the encoding attribute has been set, that value will be used.
If the encoding attribute is null or empty, but the item to be written, or the owner document of the item, specifies an encoding (i.e. the "actualEncoding" from the document), that value will be used.
If neither of the above provides an encoding name, a default encoding of "UTF-8" will be used.
The default value is null.

getNewLine

public java.lang.String getNewLine()
The end-of-line sequence of characters to be used in the XML being written out. Any string is supported, but these are the recommended end-of-line sequences (using other character sequences than these recommended ones can result in a document that is either not serializable or not well-formed):
null
Use a default end-of-line sequence. DOM implementations should choose the default to match the usual convention for text files in the environment being used. Implementations must choose a default sequence that matches one of those allowed by "End-of-Line Handling" ([XML 1.0], section 2.11) if the serialized content is XML 1.0 or "End-of-Line Handling" ([XML 1.1], section 2.11) if the serialized content is XML 1.1.
CR
The carriage-return character (#xD).
CR-LF
The carriage-return and line-feed characters (#xD #xA).
LF
The line-feed character (#xA).

The default value for this attribute is null.

setNewLine

public void setNewLine(java.lang.String newLine)
The end-of-line sequence of characters to be used in the XML being written out. Any string is supported, but these are the recommended end-of-line sequences (using other character sequences than these recommended ones can result in a document that is either not serializable or not well-formed):
null
Use a default end-of-line sequence. DOM implementations should choose the default to match the usual convention for text files in the environment being used. Implementations must choose a default sequence that matches one of those allowed by "End-of-Line Handling" ([XML 1.0], section 2.11) if the serialized content is XML 1.0 or "End-of-Line Handling" ([XML 1.1], section 2.11) if the serialized content is XML 1.1.
CR
The carriage-return character (#xD).
CR-LF
The carriage-return and line-feed characters (#xD #xA).
LF
The line-feed character (#xA).

The default value for this attribute is null.

getFilter

public DOMWriterFilter getFilter()
When the application provides a filter, the serializer will call out to the filter before serializing each Node. Attribute nodes are never passed to the filter. The filter implementation can choose to remove the node from the stream or to terminate the serialization early.

setFilter

public void setFilter(DOMWriterFilter filter)
When the application provides a filter, the serializer will call out to the filter before serializing each Node. Attribute nodes are never passed to the filter. The filter implementation can choose to remove the node from the stream or to terminate the serialization early.

writeNode

public boolean writeNode(java.io.OutputStream destination,
                         org.w3c.dom.Node wnode)
Write out the specified node as described above in the description of DOMWriter. Writing a Document or Entity node produces a serialized form that is well formed XML, when possible (Entity nodes might not always be well formed XML in themselves). Writing other node types produces a fragment of text in a form that is not fully defined by this document, but that should be useful to a human for debugging or diagnostic purposes.
If the specified encoding is not supported the error handler is called and the serialization is interrupted.
Parameters:
destination - The destination for the data to be written.
wnode - The Document or Entity node to be written. For other node types, something sensible should be written, but the exact serialized form is not specified.
Returns:
Returns true if the node was successfully serialized and false if a failure occurred and the failure wasn't canceled by the error handler.

writeToString

public java.lang.String writeToString(org.w3c.dom.Node wnode)
                               throws org.w3c.dom.DOMException
Serialize the specified node as described above in the description of DOMWriter. The result of serializing the node is returned as a DOMString (this method completely ignores all the encoding information available). Writing a Document or Entity node produces a serialized form that is well formed XML. Writing other node types produces a fragment of text in a form that is not fully defined by this document, but that should be useful to a human for debugging or diagnostic purposes.
The error handler is called if the encoding is not supported...
Parameters:
wnode - The node to be written.
Returns:
Returns the serialized data, or null if a failure occurred and the failure wasn't canceled by the error handler.
Throws:
org.w3c.dom.DOMException - DOMSTRING_SIZE_ERR: Raised if the resulting string is too long to fit in a DOMString.
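
For a runnable illustration of serializing to a string, the sketch below does not use DOMWriter itself, which is the experimental Working Draft interface. It uses org.w3c.dom.ls.LSSerializer, the name this interface received in the final DOM Level 3 Load and Save Recommendation and the form shipped with the JDK. The class name LsSerializerDemo and the sample document are made up for the example, and behavior may differ in detail from the draft described here.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.ls.DOMImplementationLS;
import org.w3c.dom.ls.LSSerializer;

// Illustrative only: uses LSSerializer (final Recommendation), not DOMWriter.
final class LsSerializerDemo {
    private LsSerializerDemo() {}

    static String serialize() {
        Document doc;
        try {
            doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
        Element root = doc.createElement("root");
        root.setTextContent("a < b");
        doc.appendChild(root);

        // Ask the DOM implementation for its Load and Save feature.
        DOMImplementationLS ls =
                (DOMImplementationLS) doc.getImplementation().getFeature("LS", "3.0");
        LSSerializer serializer = ls.createLSSerializer();
        // Suppress the XML declaration, as with "xml-declaration" = false above.
        serializer.getDomConfig().setParameter("xml-declaration", Boolean.FALSE);
        return serializer.writeToString(doc);
    }
}
```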


Copyright © 1999-2003 Apache XML Project. All Rights Reserved.