/*
* Copyright (c) 2002 World Wide Web Consortium,
* (Massachusetts Institute of Technology, Institut National de
* Recherche en Informatique et en Automatique, Keio University). All
* Rights Reserved. This program is distributed under the W3C's Software
* Intellectual Property License. This program is distributed in the
* hope that it will be useful, but WITHOUT ANY WARRANTY; without even
* the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR
* PURPOSE.
* See W3C License http://www.w3.org/Consortium/Legal/ for more details.
*/
package org.w3c.dom.ls;
import org.w3c.dom.Node;
import org.w3c.dom.DOMException;
import org.apache.xerces.dom3.DOMErrorHandler;
/**
* DOM Level 3 WD Experimental:
* The DOM Level 3 specification is at the stage
* of Working Draft, which represents work in
* progress and thus may be updated, replaced,
* or obsoleted by other documents at any time.
* <p>
* <code>DOMWriter</code> provides an API for serializing (writing) a DOM
* document out as an XML document. The XML data is written to an output
* stream, the type of which depends on the specific language bindings in
* use.
* <p>
* During serialization of XML data, namespace fixup is done when possible
* as defined in [DOM Level 3 Core], Appendix B. [DOM Level 2 Core] allows
* empty strings as a real namespace URI. If the <code>namespaceURI</code>
* of a <code>Node</code> is the empty string, the serialization will treat
* it as <code>null</code>, ignoring the prefix if any. Open issue: should
* the remark on the DOM Level 2 namespace URI be included in the namespace
* algorithm in Core instead?
* <p>
* <code>DOMWriter</code> accepts any node type for serialization. For
* nodes of type <code>Document</code> or <code>Entity</code>, well-formed
* XML will be created if possible. The serialized output for these node
* types is either as a Document or an External Entity, respectively, and is
* acceptable input for an XML parser. For all other types of nodes the
* serialized form is not specified, but should be something useful to a
* human for debugging or diagnostic purposes. Note: rigorously designing an
* external (source) form for stand-alone node types that don't already have
* one defined in [XML 1.0] seems a bit much to take on here.
* <p>
* Within a Document, DocumentFragment, or Entity being serialized, Nodes
* are processed as follows:
* <ul>
* <li>Documents are written including an XML declaration and a DTD subset,
* if one exists in the DOM. Writing a document node serializes the entire
* document.</li>
* <li>Entity nodes, when written directly by <code>writeNode</code> defined
* in the <code>DOMWriter</code> interface, output the entity expansion but
* no namespace fixup is done. The resulting output will be valid as an
* external entity.</li>
* <li>Entity reference nodes are serialized as an entity reference of the
* form "<code>&amp;entityName;</code>" in the output. Child nodes (the
* expansion) of the entity reference are ignored.</li>
* <li>CDATA sections containing content characters that cannot be
* represented in the specified output encoding are handled according to
* the "split-cdata-sections" feature. If the feature is <code>true</code>,
* CDATA sections are split, and the unrepresentable characters are
* serialized as numeric character references in ordinary content. The
* exact position and number of splits is not specified. If the feature is
* <code>false</code>, unrepresentable characters in a CDATA section are
* reported as errors. The error is not recoverable: there is no mechanism
* for supplying alternative characters and continuing with the
* serialization.</li>
* <li>DocumentFragment nodes are serialized by serializing the children of
* the document fragment in the order they appear in the document
* fragment.</li>
* <li>All other node types (Element, Text, etc.) are serialized to their
* corresponding XML source form.</li>
* </ul>
* Note: The serialization of a DOM Node does not always generate a
* well-formed XML document, i.e. a <code>DOMBuilder</code> might throw
* fatal errors when parsing the resulting serialization.
* <p>
* Within the character data of a document (outside of markup), any
* characters that cannot be represented directly are replaced with
* character references. Occurrences of '&lt;' and '&amp;' are replaced by
* the predefined entities &amp;lt; and &amp;amp;. The other predefined
* entities (&amp;gt;, &amp;apos;, etc.) are not used; these characters can
* be included directly. Any character that cannot be represented directly
* in the output character encoding is serialized as a numeric character
* reference.
* <p>
* Attributes not containing quotes are serialized in quotes. Attributes
* containing quotes but no apostrophes are serialized in apostrophes
* (single quotes). Attributes containing both forms of quotes are
* serialized in quotes, with quotes within the value represented by the
* predefined entity &amp;quot;. Any character that cannot be represented
* directly in the output character encoding is serialized as a numeric
* character reference.
* <p>
* Within markup, but outside of attributes, any occurrence of a character
* that cannot be represented in the output character encoding is reported
* as an error. An example would be serializing the element
* &lt;LaCañada/&gt; with <code>encoding="us-ascii"</code>.
* <p>
* When requested by setting the <code>normalize-characters</code> feature
* on <code>DOMWriter</code>, all data to be serialized, both markup and
* character data, is W3C Text normalized according to the rules defined in
* [CharModel]. The W3C Text normalization process affects only the data as
* it is being written; it does not alter the DOM's view of the document
* after serialization has completed.
* <p>
* Namespaces are fixed up during serialization: the serialization process
* will verify that namespace declarations, namespace prefixes and the
* namespace URIs associated with Elements and Attributes are consistent. If
* inconsistencies are found, the serialized form of the document will be
* altered to remove them. The algorithm used for doing the namespace fixup
* while serializing a document is a combination of the algorithms used for
* <code>lookupNamespaceURI</code> and <code>lookupNamespacePrefix</code>.
* Open issue: the previous paragraph is to be defined more closely here.
* <p>
* Any changes made affect only the namespace prefixes and declarations
* appearing in the serialized data. The DOM's view of the document is not
* altered by the serialization operation, and does not reflect any changes
* made to namespace declarations or prefixes in the serialized output.
* <p>
* While serializing a document the serializer will write out
* non-specified values (such as attributes whose <code>specified</code> is
* <code>false</code>) if the <code>discard-default-content</code> feature
* is set to <code>true</code>. If the <code>discard-default-content</code>
* flag is set to <code>false</code> and a schema is used for validation,
* the schema will also be used to determine if a value is specified or not.
* If no schema is used, the <code>specified</code> flag on attribute nodes
* is used to determine if attribute values should be written out.
* <p>
* Open issue: reference the Core spec (1.1.9, XML namespaces, 5th
* paragraph) entity reference description about warning about unbound
* entity references. Entity references are always serialized as
* <code>&amp;foo;</code>; also mention this in the load part of this spec.
* <p>
* <code>DOMWriter</code>s have a number of named features that can be
* queried or set. The names of <code>DOMWriter</code> features must be valid
* XML names. Implementation-specific features (extensions) should choose an
* implementation-dependent prefix to avoid name collisions.
* <p>
* Here is a list of features that must be recognized by all
* implementations. Using these features does not affect the
* <code>Node</code> being serialized; only its serialized form is affected.
*
* -
*
"discard-default-content"
* - This feature is equivalent to the
* one provided on
Document.setNormalizationFeature
in .
* -
*
"canonical-form"
* -
*
* true
* - [optional] This formatting
* writes the document according to the rules specified in . Setting this
* feature to true will set the feature
"format-pretty-print"
* to false.
* false
* - [required] (default) Do not canonicalize the
* output.
*
* "entities"
* - This feature is equivalent to the one
* provided on
Document.setNormalizationFeature
in .
* -
*
"format-pretty-print"
* -
*
* true
* - [optional] Formatting
* the output by adding whitespace to produce a pretty-printed, indented,
* human-readable form. The exact form of the transformations is not
* specified by this specification. Setting this feature to true will set
* the feature "canonical-form" to false.
* false
* - [required] (
* default) Don't pretty-print the result.
*
* "namespaces"
* - This
* feature is equivalent to the one provided on
*
Document.setNormalizationFeature
in .
* -
*
"normalize-characters"
* - This feature is equivalent to the one
* provided on
Document.setNormalizationFeature
in . Unlike in
* the Core, the default value for this feature is true
.
* -
*
"split-cdata-sections"
* - This feature is equivalent to the one
* provided on
Document.setNormalizationFeature
in .
* -
*
"validate"
* - This feature is equivalent to the one provided on
*
Document.setNormalizationFeature
in .
* -
*
"unknown-characters"
* -
*
* true
* - [required] (default)
* If, while verifying full normalization when XML 1.1 is supported, a
* character is encountered for which the normalization properties cannot be
* determined, then ignore any possible denormalizations caused by these
* characters.
* false
* - [optional] Report an fatal error if a
* character is encountered for which the processor can not determine the
* normalization properties.
*
* "whitespace-in-element-content"
* -
* This feature is equivalent to the one provided on
*
Document.setNormalizationFeature
in .
*
* <p>
* See also the Document Object Model (DOM) Level 3 Load and Save
* Specification.
*/
public interface DOMWriter {
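/*
* A minimal sketch of the escaping rules given in the interface description
* above for character data and attribute values. It is a stand-alone
* illustration, not part of any DOMWriter implementation: the names
* EscapingSketch, escapeText and quoteAttribute are hypothetical. Escaping
* '<' and '&' inside attribute values is an assumption taken from XML
* well-formedness rules; the description above states it only for
* character data.
*
```java
import java.nio.charset.Charset;
import java.nio.charset.CharsetEncoder;
import java.nio.charset.StandardCharsets;

// Hypothetical helper; not part of the DOMWriter interface.
final class EscapingSketch {

    // Appends the code point as-is when the output encoding can represent it,
    // otherwise as a hexadecimal numeric character reference.
    private static void appendOrReference(StringBuilder out, int cp, CharsetEncoder encoder) {
        String s = new String(Character.toChars(cp));
        if (encoder.canEncode(s)) {
            out.append(s);
        } else {
            out.append("&#x").append(Integer.toHexString(cp).toUpperCase()).append(';');
        }
    }

    // Character data: only '<' and '&' become predefined entities.
    static String escapeText(String text, Charset charset) {
        CharsetEncoder encoder = charset.newEncoder();
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < text.length(); ) {
            int cp = text.codePointAt(i);
            i += Character.charCount(cp);
            if (cp == '<') {
                out.append("&lt;");
            } else if (cp == '&') {
                out.append("&amp;");
            } else {
                appendOrReference(out, cp, encoder);
            }
        }
        return out.toString();
    }

    // Attribute values: quotes by default, apostrophes when the value holds
    // quotes but no apostrophes, quotes plus &quot; when it holds both.
    static String quoteAttribute(String value, Charset charset) {
        boolean hasQuote = value.indexOf('"') >= 0;
        boolean hasApostrophe = value.indexOf('\'') >= 0;
        char delimiter = (hasQuote && !hasApostrophe) ? '\'' : '"';
        CharsetEncoder encoder = charset.newEncoder();
        StringBuilder out = new StringBuilder().append(delimiter);
        for (int i = 0; i < value.length(); ) {
            int cp = value.codePointAt(i);
            i += Character.charCount(cp);
            if (cp == '<') {
                out.append("&lt;");            // assumption: XML well-formedness
            } else if (cp == '&') {
                out.append("&amp;");           // assumption: XML well-formedness
            } else if (cp == '"' && delimiter == '"') {
                out.append("&quot;");
            } else {
                appendOrReference(out, cp, encoder);
            }
        }
        return out.append(delimiter).toString();
    }

    public static void main(String[] args) {
        System.out.println(escapeText("a < b & c > d", StandardCharsets.US_ASCII));
        System.out.println(quoteAttribute("say \"hi\"", StandardCharsets.US_ASCII));
    }
}
```
* Note that '>' and apostrophes pass through unchanged in character data,
* matching the statement that the other predefined entities are not used.
*/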
/**
* Set the state of a feature.
* <p>
* The feature name has the same form as a DOM hasFeature string.
* <p>
* It is possible for a <code>DOMWriter</code> to recognize a feature
* name but to be unable to set its value.
* @param name The feature name.
* @param state The requested state of the feature (<code>true</code> or
* <code>false</code>).
* @exception DOMException
* NOT_SUPPORTED_ERR: Raised when the <code>DOMWriter</code> recognizes
* the feature name but cannot set the requested value.
* <br>NOT_FOUND_ERR: Raised when the <code>DOMWriter</code> does not
* recognize the feature name.
*/
public void setFeature(String name,
boolean state)
throws DOMException;
/**
* Query whether setting a feature to a specific value is supported.
* <p>
* The feature name has the same form as a DOM hasFeature string.
* @param name The feature name, which is a DOM has-feature style string.
* @param state The requested state of the feature (<code>true</code> or
* <code>false</code>).
* @return <code>true</code> if the feature could be successfully set to
* the specified value, or <code>false</code> if the feature is not
* recognized or the requested value is not supported. The value of
* the feature itself is not changed.
*/
public boolean canSetFeature(String name,
boolean state);
/**
* Look up the value of a feature.
* <p>
* The feature name has the same form as a DOM hasFeature string.
* @param name The feature name, which is a string with DOM has-feature
* syntax.
* @return The current state of the feature (<code>true</code> or
* <code>false</code>).
* @exception DOMException
* NOT_FOUND_ERR: Raised when the <code>DOMWriter</code> does not
* recognize the feature name.
*/
public boolean getFeature(String name)
throws DOMException;
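/*
* A sketch of the query-then-set pattern that canSetFeature and setFeature
* support. The draft DOMWriter interface is not shipped with the JDK, so
* this example assumes its successor from the final DOM Level 3 Load and
* Save Recommendation: org.w3c.dom.ls.LSSerializer, where the named
* features are exposed as DOMConfiguration parameters. FeatureProbe,
* newSerializer and trySet are hypothetical names.
*
```java
import org.w3c.dom.DOMConfiguration;
import org.w3c.dom.bootstrap.DOMImplementationRegistry;
import org.w3c.dom.ls.DOMImplementationLS;
import org.w3c.dom.ls.LSSerializer;

// Hypothetical helper built on the JDK's LSSerializer, not on DOMWriter itself.
final class FeatureProbe {

    // Creates a serializer through the standard DOM bootstrap registry.
    static LSSerializer newSerializer() {
        try {
            DOMImplementationRegistry registry = DOMImplementationRegistry.newInstance();
            DOMImplementationLS ls = (DOMImplementationLS) registry.getDOMImplementation("LS");
            if (ls == null) {
                throw new IllegalStateException("No DOM Load and Save implementation found");
            }
            return ls.createLSSerializer();
        } catch (ReflectiveOperationException | ClassCastException e) {
            throw new IllegalStateException("Could not bootstrap a DOM implementation", e);
        }
    }

    // Sets the parameter only when the implementation reports that it can,
    // so no exception is raised for unknown or unsupported settings.
    static boolean trySet(DOMConfiguration config, String name, boolean state) {
        if (!config.canSetParameter(name, state)) {
            return false;
        }
        config.setParameter(name, state);
        return true;
    }

    public static void main(String[] args) {
        DOMConfiguration config = newSerializer().getDomConfig();
        System.out.println("format-pretty-print set: " + trySet(config, "format-pretty-print", true));
        System.out.println("no-such-feature set: " + trySet(config, "no-such-feature", true));
    }
}
```
* Checking first avoids relying on exceptions for control flow; which
* features can actually be set depends on the implementation.
*/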
/**
* The character encoding in which the output will be written.
* <p>
* The encoding to use when writing is determined as follows:
* <ul>
* <li>If the encoding attribute has been set, that value will be used.</li>
* <li>If the encoding attribute is <code>null</code> or empty, but the
* item to be written, or the owner document of the item, specifies an
* encoding (i.e. the "actualEncoding" from the document), that value
* will be used.</li>
* <li>If neither of the above provides an encoding name, a default
* encoding of "UTF-8" will be used.</li>
* </ul>
* The default value is <code>null</code>.
*/
public String getEncoding();
/**
* The character encoding in which the output will be written.
* <p>
* The encoding to use when writing is determined as follows:
* <ul>
* <li>If the encoding attribute has been set, that value will be used.</li>
* <li>If the encoding attribute is <code>null</code> or empty, but the
* item to be written, or the owner document of the item, specifies an
* encoding (i.e. the "actualEncoding" from the document), that value
* will be used.</li>
* <li>If neither of the above provides an encoding name, a default
* encoding of "UTF-8" will be used.</li>
* </ul>
* The default value is <code>null</code>.
*/
public void setEncoding(String encoding);
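/*
* The three-step encoding lookup described for the encoding attribute can
* be sketched as a small pure function. EncodingResolver and resolve are
* hypothetical names, and treating an empty document encoding the same as
* a missing one is an assumption.
*
```java
// Hypothetical helper; not part of the DOMWriter interface.
final class EncodingResolver {

    // writerEncoding: the value of the DOMWriter encoding attribute.
    // documentEncoding: the encoding recorded on the item or its owner
    // document (the "actualEncoding"), or null if none is recorded.
    static String resolve(String writerEncoding, String documentEncoding) {
        if (writerEncoding != null && !writerEncoding.isEmpty()) {
            return writerEncoding;          // step 1: explicit setting wins
        }
        if (documentEncoding != null && !documentEncoding.isEmpty()) {
            return documentEncoding;        // step 2: fall back to the document
        }
        return "UTF-8";                     // step 3: default
    }

    public static void main(String[] args) {
        System.out.println(resolve(null, "ISO-8859-1"));
        System.out.println(resolve("", null));
    }
}
```
*/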
/**
* The end-of-line sequence of characters to be used in the XML being
* written out. Any string is supported, but these are the recommended
* end-of-line sequences (using character sequences other than these
* recommended ones can result in a document that is either not
* serializable or not well-formed):
* <dl>
* <dt><code>null</code></dt>
* <dd>Use a default end-of-line sequence. DOM implementations should
* choose the default to match the usual convention for text files in the
* environment being used. Implementations must choose a default sequence
* that matches one of those allowed by "End-of-Line Handling"
* ([XML 1.0], section 2.11) if the serialized content is XML 1.0 or
* "End-of-Line Handling" ([XML 1.1], section 2.11) if the serialized
* content is XML 1.1.</dd>
* <dt>CR</dt>
* <dd>The carriage-return character (#xD).</dd>
* <dt>CR-LF</dt>
* <dd>The carriage-return and line-feed characters (#xD #xA).</dd>
* <dt>LF</dt>
* <dd>The line-feed character (#xA).</dd>
* </dl>
* The default value for this attribute is <code>null</code>.
*/
public String getNewLine();
/**
* The end-of-line sequence of characters to be used in the XML being
* written out. Any string is supported, but these are the recommended
* end-of-line sequences (using character sequences other than these
* recommended ones can result in a document that is either not
* serializable or not well-formed):
* <dl>
* <dt><code>null</code></dt>
* <dd>Use a default end-of-line sequence. DOM implementations should
* choose the default to match the usual convention for text files in the
* environment being used. Implementations must choose a default sequence
* that matches one of those allowed by "End-of-Line Handling"
* ([XML 1.0], section 2.11) if the serialized content is XML 1.0 or
* "End-of-Line Handling" ([XML 1.1], section 2.11) if the serialized
* content is XML 1.1.</dd>
* <dt>CR</dt>
* <dd>The carriage-return character (#xD).</dd>
* <dt>CR-LF</dt>
* <dd>The carriage-return and line-feed characters (#xD #xA).</dd>
* <dt>LF</dt>
* <dd>The line-feed character (#xA).</dd>
* </dl>
* The default value for this attribute is <code>null</code>.
*/
public void setNewLine(String newLine);
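/*
* A small sketch of how a caller or implementation might handle the
* newLine attribute. NewLineResolver, isRecommended and resolve are
* hypothetical names. Using System.lineSeparator() as the default is an
* assumption about "the usual convention for text files in the
* environment"; an implementation is free to pick another allowed default.
*
```java
// Hypothetical helper; not part of the DOMWriter interface.
final class NewLineResolver {

    // True for null (use the default) and for the three recommended sequences.
    static boolean isRecommended(String newLine) {
        return newLine == null
                || newLine.equals("\r")
                || newLine.equals("\r\n")
                || newLine.equals("\n");
    }

    // Resolves null to the platform's text-file convention (assumed default).
    static String resolve(String newLine) {
        return newLine != null ? newLine : System.lineSeparator();
    }

    public static void main(String[] args) {
        System.out.println(isRecommended("\r\n"));
        System.out.println(isRecommended("\t"));
    }
}
```
*/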
/**
* When the application provides a filter, the serializer will call out
* to the filter before serializing each Node. Attribute nodes are never
* passed to the filter. The filter implementation can choose to remove
* the node from the stream or to terminate the serialization early.
*/
public DOMWriterFilter getFilter();
/**
* When the application provides a filter, the serializer will call out
* to the filter before serializing each Node. Attribute nodes are never
* passed to the filter. The filter implementation can choose to remove
* the node from the stream or to terminate the serialization early.
*/
public void setFilter(DOMWriterFilter filter);
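/*
* A sketch of a filter that removes nodes from the serialized stream. The
* DOMWriterFilter type is not shipped with the JDK, so this example assumes
* its successor from the final Recommendation, LSSerializerFilter, which
* uses the NodeFilter constants from DOM Traversal. CommentDroppingFilter
* and newDocument are hypothetical names.
*
```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.ls.LSSerializerFilter;
import org.w3c.dom.traversal.NodeFilter;

// Hypothetical filter that drops comments and keeps every other node.
final class CommentDroppingFilter implements LSSerializerFilter {

    @Override
    public short acceptNode(Node node) {
        return node.getNodeType() == Node.COMMENT_NODE
                ? NodeFilter.FILTER_REJECT
                : NodeFilter.FILTER_ACCEPT;
    }

    @Override
    public int getWhatToShow() {
        // Ask to be consulted for every node type the serializer offers.
        return NodeFilter.SHOW_ALL;
    }

    // Convenience for experiments: an empty document to create nodes from.
    static Document newDocument() {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Document doc = newDocument();
        CommentDroppingFilter filter = new CommentDroppingFilter();
        System.out.println(filter.acceptNode(doc.createComment("internal note")));
        System.out.println(filter.acceptNode(doc.createElement("item")));
    }
}
```
* With the JDK's LSSerializer, such a filter would be installed through
* its setFilter method before serializing.
*/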
/**
* The error handler that will receive error notifications during
* serialization. The node where the error occurred is passed to this
* error handler. Any modification to nodes from within an error
* callback should be avoided since this will result in undefined,
* implementation-dependent behavior.
*/
public DOMErrorHandler getErrorHandler();
/**
* The error handler that will receive error notifications during
* serialization. The node where the error occurred is passed to this
* error handler. Any modification to nodes from within an error
* callback should be avoided since this will result in undefined,
* implementation-dependent behavior.
*/
public void setErrorHandler(DOMErrorHandler errorHandler);
/**
* Write out the specified node as described above in the description of
* <code>DOMWriter</code>. Writing a Document or Entity node produces a
* serialized form that is well-formed XML, when possible (Entity nodes
* might not always be well-formed XML in themselves). Writing other
* node types produces a fragment of text in a form that is not fully
* defined by this document, but that should be useful to a human for
* debugging or diagnostic purposes.
* <p>
* If the specified encoding is not supported the error handler is
* called and the serialization is interrupted.
* @param destination The destination for the data to be written.
* @param wnode The <code>Document</code> or <code>Entity</code> node to
* be written. For other node types, something sensible should be
* written, but the exact serialized form is not specified.
* @return Returns <code>true</code> if <code>wnode</code> was
* successfully serialized and <code>false</code> in case a failure
* occurred and the failure wasn't canceled by the error handler.
*/
public boolean writeNode(java.io.OutputStream destination,
Node wnode);
/**
* Serialize the specified node as described above in the description of
* <code>DOMWriter</code>. The result of serializing the node is
* returned as a DOMString (this method completely ignores all the
* encoding information available). Writing a Document or Entity node
* produces a serialized form that is well-formed XML. Writing other
* node types produces a fragment of text in a form that is not fully
* defined by this document, but that should be useful to a human for
* debugging or diagnostic purposes.
* <p>
* Error handler is called if encoding not supported...
* @param wnode The node to be written.
* @return Returns the serialized data, or <code>null</code> in case a
* failure occurred and the failure wasn't canceled by the error
* handler.
* @exception DOMException
* DOMSTRING_SIZE_ERR: Raised if the resulting string is too long to
* fit in a <code>DOMString</code>.
*/
public String writeToString(Node wnode)
throws DOMException;
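/*
* A usage sketch for the two write operations. Because the draft DOMWriter
* interface is not shipped with the JDK, this example assumes the JDK's
* LSSerializer (the name the final Recommendation gave this interface),
* where writing to a stream goes through an LSOutput instead of a bare
* OutputStream. SerializeDemo and its methods are hypothetical names.
*
```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.ls.DOMImplementationLS;
import org.w3c.dom.ls.LSOutput;

// Hypothetical demo built on the JDK's LSSerializer, not on DOMWriter itself.
final class SerializeDemo {

    // Builds a one-element document whose text needs escaping.
    static Document sampleDocument() {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
            Element root = doc.createElement("root");
            root.setTextContent("a < b");
            doc.appendChild(root);
            return doc;
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
    }

    private static DOMImplementationLS loadSave(Document doc) {
        return (DOMImplementationLS) doc.getImplementation().getFeature("LS", "3.0");
    }

    // Counterpart of writeToString: returns the serialized form as a String.
    static String writeToString(Document doc) {
        return loadSave(doc).createLSSerializer().writeToString(doc);
    }

    // Counterpart of writeNode: serializes to a byte stream as UTF-8.
    static String writeAsUtf8(Document doc) {
        DOMImplementationLS ls = loadSave(doc);
        LSOutput output = ls.createLSOutput();
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        output.setByteStream(bytes);
        output.setEncoding("UTF-8");
        if (!ls.createLSSerializer().write(doc, output)) {
            throw new IllegalStateException("serialization failed");
        }
        return new String(bytes.toByteArray(), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        Document doc = sampleDocument();
        System.out.println(writeToString(doc));
        System.out.println(writeAsUtf8(doc));
    }
}
```
* The boolean result of write mirrors the return value documented for
* writeNode: false signals a failure that the error handler did not cancel.
*/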
}