SAX (Simple API for XML) is an event-driven parser API that is supported by
most of the widely available XML parsers. It does not provide all the
information that an editor might want, such as comments, physical
structure information (such as unexpanded entities), CDATA sections,
or DTD support.
SAX promotes a modular design of XML systems, since it places no
requirement on parsers to provide particular data structures (such as W3C's
DOM) representing XML documents. Since parsers plug in to the API through
a driver-style interface, different parsers can be used for different
application environments.
For more information on SAX, see
the SAX page.
Basic Use of a SAX Parser
To use SAX, first obtain a Parser instance; you will
normally use the org.xml.sax.helpers.ParserFactory class, which
instantiates the parser class named by the org.xml.sax.parser system property.
Then provide an implementation of DocumentHandler
to receive your parsing events. (For example, you can build a DOM
parse tree by using a com.sun.xml.tree.XmlDocumentBuilder instance
as your document handler.) A handler that keeps attribute data after
startElement returns must copy it, because the AttributeList provided
by the parser is only valid for the duration of that call.
Finally, package your XML data as an InputSource
(or get its URL) and
call Parser.parse to send a stream of parse events to your
document handler.
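The three steps above can be sketched as follows. This is a minimal illustration under stated assumptions, not a definitive implementation: SaxBasicUse and RecordingHandler are hypothetical names, and the fallback to javax.xml.parsers.SAXParserFactory is an assumption made so that the sketch runs on a stock JDK when no SAX 1 driver is named by the org.xml.sax.parser system property that ParserFactory reads. The SAX 1 types used here are deprecated in current JDKs.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.AttributeList;
import org.xml.sax.HandlerBase;
import org.xml.sax.InputSource;
import org.xml.sax.Parser;
import org.xml.sax.helpers.AttributeListImpl;
import org.xml.sax.helpers.ParserFactory;

@SuppressWarnings("deprecation") // SAX 1 types are deprecated but still ship with the JDK
final class SaxBasicUse {

    /** Hypothetical document handler that records element events as strings. */
    static final class RecordingHandler extends HandlerBase {
        final List<String> events = new ArrayList<>();

        @Override
        public void startElement(String name, AttributeList atts) {
            // Copy the attributes: the parser's AttributeList is only valid during this call.
            AttributeList copy = new AttributeListImpl(atts);
            StringBuilder event = new StringBuilder("start:").append(name);
            for (int i = 0; i < copy.getLength(); i++) {
                event.append(' ').append(copy.getName(i)).append('=').append(copy.getValue(i));
            }
            events.add(event.toString());
        }

        @Override
        public void endElement(String name) {
            events.add("end:" + name);
        }
    }

    /** Step 1: obtain a Parser, preferring the driver named by the system property. */
    static Parser newParser() throws Exception {
        if (System.getProperty("org.xml.sax.parser") != null) {
            return ParserFactory.makeParser();
        }
        // Assumption: no driver configured, so use the JDK's default parser through JAXP.
        return SAXParserFactory.newInstance().newSAXParser().getParser();
    }

    /** Steps 2 and 3: register the handler, wrap the data in an InputSource, and parse. */
    static List<String> parse(String xml) throws Exception {
        Parser parser = newParser();
        RecordingHandler handler = new RecordingHandler();
        parser.setDocumentHandler(handler);
        parser.parse(new InputSource(new StringReader(xml)));
        return handler.events;
    }

    public static void main(String[] args) throws Exception {
        parse("<doc id=\"1\"><item/></doc>").forEach(System.out::println);
    }
}
```

To parse a document by location instead of from memory, the same Parser also accepts a system identifier (URL) directly through parse(String).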
Advanced Features
Many applications will want to customize the parser by providing
an ErrorHandler to handle errors, perhaps providing
diagnostics using the Locator that the parser passes to the document
handler. For example, when using a validating parser you will often
want to stop processing documents that have errors.
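As an illustration of stopping on errors, the sketch below registers an ErrorHandler that rethrows every error and turns it into a one-line diagnostic. StrictErrorHandling and its members are hypothetical names, and obtaining a validating Parser through javax.xml.parsers.SAXParserFactory is an assumption made for runnability. The position is read from the SAXParseException, which carries the same line and column data a Locator would report.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.ErrorHandler;
import org.xml.sax.HandlerBase;
import org.xml.sax.InputSource;
import org.xml.sax.Parser;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;

@SuppressWarnings("deprecation") // Parser and HandlerBase are SAX 1 types
final class StrictErrorHandling {

    /** Hypothetical handler: warnings are logged, errors of any kind abort the parse. */
    static final class StrictErrorHandler implements ErrorHandler {
        @Override
        public void warning(SAXParseException e) {
            System.err.println("warning: " + describe(e));
        }

        @Override
        public void error(SAXParseException e) throws SAXException {
            throw e; // validity errors are recoverable by default; rethrowing stops processing
        }

        @Override
        public void fatalError(SAXParseException e) throws SAXException {
            throw e;
        }
    }

    /** Formats the position information carried by the exception. */
    static String describe(SAXParseException e) {
        return "line " + e.getLineNumber() + ", column " + e.getColumnNumber()
                + ": " + e.getMessage();
    }

    /** Returns null when the document parses cleanly, otherwise a diagnostic string. */
    static String validate(String xml) throws Exception {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setValidating(true); // assumption: DTD validation via the JDK's default parser
        Parser parser = factory.newSAXParser().getParser();
        parser.setDocumentHandler(new HandlerBase());
        parser.setErrorHandler(new StrictErrorHandler());
        try {
            parser.parse(new InputSource(new StringReader(xml)));
            return null;
        } catch (SAXParseException e) {
            return describe(e);
        }
    }
}
```

Rethrowing from error() is a design choice: an application that wants to collect all validity errors would instead record them and let the parse continue.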
Applications can also provide an EntityResolver
to participate in resolving the entities required by the XML document
being parsed, arranging to use local or replicated copies. Some
resolvers can consult catalogs that map XML public identifiers to URIs
other than the system ID (perhaps stored in a local repository or as a
Java resource). They may also use all the available MIME type
information, such as character encodings.
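A resolver of this kind might look like the sketch below, which maps public identifiers to in-memory copies and falls back to the parser's default behavior for anything it does not know. LocalCopyResolver, register, and parseWith are hypothetical names, the map stands in for a real catalog, and the Parser is again obtained through javax.xml.parsers.SAXParserFactory as an assumption.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.EntityResolver;
import org.xml.sax.HandlerBase;
import org.xml.sax.InputSource;
import org.xml.sax.Parser;

@SuppressWarnings("deprecation") // Parser and HandlerBase are SAX 1 types
final class LocalCopyResolver implements EntityResolver {
    // Hypothetical in-memory catalog: public identifier -> local copy of the entity text.
    private final Map<String, String> localCopies = new HashMap<>();
    // Public identifiers that were served from the catalog, kept for inspection.
    final List<String> resolved = new ArrayList<>();

    void register(String publicId, String content) {
        localCopies.put(publicId, content);
    }

    @Override
    public InputSource resolveEntity(String publicId, String systemId) {
        String content = (publicId == null) ? null : localCopies.get(publicId);
        if (content == null) {
            return null; // unknown entity: let the parser open the system ID itself
        }
        resolved.add(publicId);
        InputSource source = new InputSource(new StringReader(content));
        source.setPublicId(publicId);
        source.setSystemId(systemId); // keep the original ID as the base for relative URIs
        return source;
    }

    /** Parses xml with the given resolver and returns the public IDs it served locally. */
    static List<String> parseWith(LocalCopyResolver resolver, String xml) throws Exception {
        Parser parser = SAXParserFactory.newInstance().newSAXParser().getParser();
        parser.setDocumentHandler(new HandlerBase());
        parser.setEntityResolver(resolver);
        parser.parse(new InputSource(new StringReader(xml)));
        return resolver.resolved;
    }
}
```

Because the returned InputSource wraps a character stream, the parser reads the local copy and does not need to contact the location named by the system ID.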
Some XML documents refer to unparsed entities or
to notations in their attributes. When working with such
documents, you will need to provide a DTDHandler object
in order to be notified of the notations and unparsed entities
which were defined in the document type definition (DTD).
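The following sketch shows one way to receive those notifications. DeclarationRecorder is a hypothetical name, and the Parser is obtained through javax.xml.parsers.SAXParserFactory as an assumption so the example runs on a stock JDK. Only declaration names are recorded, since a parser may report system identifiers in resolved (absolute) form.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.DTDHandler;
import org.xml.sax.HandlerBase;
import org.xml.sax.InputSource;
import org.xml.sax.Parser;

@SuppressWarnings("deprecation") // Parser and HandlerBase are SAX 1 types
final class DeclarationRecorder implements DTDHandler {
    // Declarations in the order the parser reports them.
    final List<String> declarations = new ArrayList<>();

    @Override
    public void notationDecl(String name, String publicId, String systemId) {
        declarations.add("notation:" + name);
    }

    @Override
    public void unparsedEntityDecl(String name, String publicId,
                                   String systemId, String notationName) {
        declarations.add("unparsed-entity:" + name + " notation=" + notationName);
    }

    /** Parses xml and returns the notation and unparsed entity declarations it contains. */
    static List<String> parse(String xml) throws Exception {
        Parser parser = SAXParserFactory.newInstance().newSAXParser().getParser();
        DeclarationRecorder recorder = new DeclarationRecorder();
        parser.setDocumentHandler(new HandlerBase());
        parser.setDTDHandler(recorder);
        parser.parse(new InputSource(new StringReader(xml)));
        return recorder.declarations;
    }
}
```

An application would typically keep these declarations in a lookup table so that, when an ENTITY or NOTATION attribute value arrives in startElement, it can find the matching system identifier and notation.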