/*--

 Copyright (C) 2011-2012 Jason Hunter & Brett McLaughlin.
 All rights reserved.

 Redistribution and use in source and binary forms, with or without
 modification, are permitted provided that the following conditions
 are met:

 1. Redistributions of source code must retain the above copyright
    notice, this list of conditions, and the following disclaimer.

 2. Redistributions in binary form must reproduce the above copyright
    notice, this list of conditions, and the disclaimer that follows
    these conditions in the documentation and/or other materials
    provided with the distribution.

 3. The name "JDOM" must not be used to endorse or promote products
    derived from this software without prior written permission.  For
    written permission, please contact .

 4. Products derived from this software may not be called "JDOM", nor
    may "JDOM" appear in their name, without prior written permission
    from the JDOM Project Management .

 In addition, we request (but do not require) that you include in the
 end-user documentation provided with the redistribution and/or in the
 software itself an acknowledgement equivalent to the following:
     "This product includes software developed by the
      JDOM Project (http://www.jdom.org/)."
 Alternatively, the acknowledgment may be graphical using the logos
 available at http://www.jdom.org/images/logos.

 THIS SOFTWARE IS PROVIDED ``AS IS'' AND ANY EXPRESSED OR IMPLIED
 WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES
 OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
 DISCLAIMED.  IN NO EVENT SHALL THE JDOM AUTHORS OR THE PROJECT
 CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
 SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
 LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF
 USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND
 ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY,
 OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT
 OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
 SUCH DAMAGE.

 This software consists of voluntary contributions made by many
 individuals on behalf of the JDOM Project and was originally
 created by Jason Hunter  and
 Brett McLaughlin .  For more information
 on the JDOM Project, please see .

 */

/**
 Support classes for building JDOM documents and content using SAX parsers.

 

Introduction

Skip to the Examples section for a quick bootstrap.

The {@link org.jdom2.input.SAXBuilder} class parses input and produces JDOM output. It does this using three 'pillars' of functionality, which when combined constitute a 'parse'.

The three pillars are:

  1. The SAX Parser - this is a 'third-party' parser such as Xerces.
  2. The SAX Event Handler - which reads the data produced by the parser.
  3. The JDOMFactory - which converts the resulting data into JDOM content.

There are many different ways of parsing the document from its input state (DocType-validating, etc.), and there are also different ways to interpret the SAX events. Finally, there are different ways to produce JDOM Content using different implementations of the JDOMFactory.

SAXBuilder provides a central location where these three pillars are configured. Some configuration settings require coordinated changes to both the SAX parser and the SAX handler, and SAXBuilder ensures the coordination is maintained.
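
The division of labour between the three pillars can be sketched with plain JDK classes. This is a toy stand-in, not JDOM code: `ThreePillarsSketch`, `NodeFactory` and `CollectingHandler` are hypothetical names, and the "content" produced is just a list of strings.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

class ThreePillarsSketch {

    // Pillar 3 stand-in (hypothetical): decides what content is created.
    interface NodeFactory {
        String element(String name);
    }

    // Pillar 2 stand-in (hypothetical): receives SAX events and asks the
    // factory to create content for each element it sees.
    static final class CollectingHandler extends DefaultHandler {
        private final NodeFactory factory;
        final List<String> nodes = new ArrayList<>();

        CollectingHandler(NodeFactory factory) {
            this.factory = factory;
        }

        @Override
        public void startElement(String uri, String localName, String qName, Attributes atts) {
            nodes.add(factory.element(qName));
        }
    }

    static List<String> parse(String xml) throws Exception {
        // Pillar 1: the SAX parser, obtained here through JAXP.
        SAXParserFactory parserFactory = SAXParserFactory.newInstance();
        parserFactory.setNamespaceAware(true);
        XMLReader reader = parserFactory.newSAXParser().getXMLReader();

        CollectingHandler handler = new CollectingHandler(name -> "<" + name + ">");
        reader.setContentHandler(handler);
        reader.parse(new InputSource(new StringReader(xml)));
        return handler.nodes;
    }
}
```

Swapping the parser configuration, the handler, or the factory changes one aspect of the result without touching the other two, which is the separation SAXBuilder coordinates.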

Setting the Pillars

SAXBuilder provides a number of different mechanisms for stipulating what the three pillars will be:
  • A Constructor: {@link org.jdom2.input.SAXBuilder#SAXBuilder(XMLReaderJDOMFactory, SAXHandlerFactory, org.jdom2.JDOMFactory)}
  • A default constructor, {@link org.jdom2.input.SAXBuilder#SAXBuilder()}, that chooses a non-validating, JAXP-sourced XMLReader factory ({@link org.jdom2.input.sax.XMLReaders#NONVALIDATING}), which it pairs with a default {@link org.jdom2.input.sax.SAXHandler} factory and the {@link org.jdom2.DefaultJDOMFactory}.
  • A number of other constructors, most of which exist for backward compatibility with JDOM 1.x. These constructors affect which {@link org.jdom2.input.sax.XMLReaderJDOMFactory} is used, but they still use the default SAXHandler and JDOMFactory values.
  • Methods to change whatever was constructed:
    • {@link org.jdom2.input.SAXBuilder#setXMLReaderFactory(XMLReaderJDOMFactory)}
    • {@link org.jdom2.input.SAXBuilder#setSAXHandlerFactory(SAXHandlerFactory)}
    • {@link org.jdom2.input.SAXBuilder#setJDOMFactory(org.jdom2.JDOMFactory)}
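
As a minimal sketch of the mechanisms listed above (it assumes JDOM 2 is on the classpath; `PillarSetup` and its method names are hypothetical), the pillars can be given explicitly at construction time or changed afterwards:

```java
import org.jdom2.DefaultJDOMFactory;
import org.jdom2.UncheckedJDOMFactory;
import org.jdom2.input.SAXBuilder;
import org.jdom2.input.sax.DefaultSAXHandlerFactory;
import org.jdom2.input.sax.XMLReaders;

class PillarSetup {

    // All three pillars named explicitly; these are the same choices the
    // text above attributes to the default constructor.
    static SAXBuilder explicitPillars() {
        return new SAXBuilder(
                XMLReaders.NONVALIDATING,
                new DefaultSAXHandlerFactory(),
                new DefaultJDOMFactory());
    }

    // Start from the defaults, then replace individual pillars.
    static SAXBuilder adjustedPillars() {
        SAXBuilder builder = new SAXBuilder();
        builder.setXMLReaderFactory(XMLReaders.DTDVALIDATING);
        builder.setJDOMFactory(new UncheckedJDOMFactory());
        return builder;
    }
}
```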

The XMLReaderJDOMFactory Pillar

SAX parsers have been exposed as different things during the evolution of the SAX API, but in SAX 2.0 they are XMLReader instances. Thus SAXBuilder needs an XMLReader to process the input. To get an XMLReader, SAXBuilder delegates to the {@link org.jdom2.input.sax.XMLReaderJDOMFactory} by calling {@link org.jdom2.input.sax.XMLReaderJDOMFactory#createXMLReader()}.

XMLReader instances can be created in a few different ways, and they can also be configured to perform the SAX parse in a number of different ways. The classes in this package are designed to make it easier and faster to locate the XMLReader that is suitable for the XML parsing you intend to do. At the same time, if the parsing you intend to do is outside the normal bounds of how JDOM is used, you still have the functionality to create a completely custom mechanism for setting the XMLReader for SAXBuilder.

There are two typical ways to specify and create an XMLReader instance: using JAXP, and using the SAX 2.0 API. If necessary you can also create direct instances of XMLReader implementations using 'new' constructors, but each SAX implementation uses different class names for its SAX drivers, so calling constructors directly is not portable and not recommended.

Where possible it is recommended that you use the JAXP mechanism for obtaining XMLReaders because:

  • It is more 'modern'.
  • It provides a more consistent interface to different SAX implementations.
  • It provides cleaner and more portable support for validating using the {@link javax.xml.validation.Validator} mechanisms.
  • It allows you to create differently-configured 'factories' that create XMLReaders in a pre-specified format (SAX 2.0 has a single global factory that creates raw XMLReader instances that then need to be re-configured for your task).
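
The difference between the two routes can be seen with JDK classes alone. The sketch below is illustrative rather than JDOM's own code; `ReaderSources` and its method names are hypothetical, and `XMLReaderFactory` is deprecated in recent JDKs, which is one more reason to prefer JAXP.

```java
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.XMLReaderFactory;

class ReaderSources {
    static final String NAMESPACES = "http://xml.org/sax/features/namespaces";
    static final String NAMESPACE_PREFIXES = "http://xml.org/sax/features/namespace-prefixes";

    // JAXP route: the factory is configured once, and every XMLReader it
    // produces already carries that configuration.
    static XMLReader viaJaxp() throws Exception {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setNamespaceAware(true);
        factory.setValidating(false);
        return factory.newSAXParser().getXMLReader();
    }

    // SAX 2.0 route: the global factory hands out a raw XMLReader, which
    // has to be configured feature by feature after creation.
    @SuppressWarnings("deprecation")
    static XMLReader viaSax2() throws Exception {
        XMLReader reader = XMLReaderFactory.createXMLReader();
        reader.setFeature(NAMESPACES, true);
        reader.setFeature(NAMESPACE_PREFIXES, true);
        return reader;
    }
}
```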

JAXP Factories

JDOM exposes five factories that use JAXP to source XMLReaders. These factories cover almost all conditions under which you would want a SAX parser:
  1. A simple non-validating SAX parser.
  2. A validating parser that uses the DOCTYPE references in the XML to validate against.
  3. A validating parser that uses the XML Schema (XSD) references embedded in the XML to validate against.
  4. A validating parser that uses an external Schema (XML Schema, Relax NG, etc.) to validate the XML against.
  5. A special case of the Schema-validating factory that specialises in XML Schema (XSD) validation and provides an easy way to create validating XMLReaders based on single or multiple input XSD documents.

The first three are all relatively simple, and are available as members of the {@link org.jdom2.input.sax.XMLReaders} enumeration. These members are 'singletons' that can be used in a multi-threaded and concurrent way to provide XMLReaders that are configured correctly for the respective behaviour.

To validate using an arbitrary external Schema you can use the {@link org.jdom2.input.sax.XMLReaderSchemaFactory} to create an instance for the particular Schema you want to validate against. Because this requires an input Schema it cannot be constructed as a singleton like the others.

{@link org.jdom2.input.sax.XMLReaderXSDFactory} is a special case of XMLReaderSchemaFactory which internally uses an efficient mechanism to compile Schema instances from one or many input XSD documents which can come from multiple sources.
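
The underlying JAXP mechanism (compile a Schema once, then attach it to a parser factory) can be sketched with JDK classes only. This illustrates the general technique under that assumption; it is not a copy of what the JDOM factories do internally. `SchemaReaderSketch`, its method names and the inline XSD are hypothetical.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.XMLConstants;
import javax.xml.parsers.SAXParserFactory;
import javax.xml.transform.Source;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXParseException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

class SchemaReaderSketch {

    // Illustrative schema: a <root> holding one or more integer <item>s.
    static final String XSD =
            "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>"
          + "<xs:element name='root'><xs:complexType><xs:sequence>"
          + "<xs:element name='item' type='xs:int' maxOccurs='unbounded'/>"
          + "</xs:sequence></xs:complexType></xs:element>"
          + "</xs:schema>";

    // Compile one Schema from one or many XSD documents.
    static Schema compile(String... xsdTexts) throws Exception {
        Source[] sources = new Source[xsdTexts.length];
        for (int i = 0; i < xsdTexts.length; i++) {
            sources[i] = new StreamSource(new StringReader(xsdTexts[i]));
        }
        return SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI).newSchema(sources);
    }

    // Parse with a schema-validating XMLReader and collect validation errors.
    static List<String> validationErrors(Schema schema, String xml) throws Exception {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setNamespaceAware(true);
        factory.setSchema(schema);
        XMLReader reader = factory.newSAXParser().getXMLReader();

        final List<String> errors = new ArrayList<>();
        reader.setErrorHandler(new DefaultHandler() {
            @Override
            public void error(SAXParseException e) {
                errors.add(e.getMessage());
            }
        });
        reader.parse(new InputSource(new StringReader(xml)));
        return errors;
    }
}
```

Because the compiled Schema is attached to the factory rather than to a single reader, every XMLReader the factory creates validates against it.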

SAX 2.0 Factory

JDOM supports using the SAX 2.0 API for creating XMLReaders, using either the 'default' SAX 2.0 implementation or a particular SAX Driver class. SAX 2.0 support is available by creating instances of the {@link org.jdom2.input.sax.XMLReaderSAX2Factory} class.

It is preferable to use JAXP in JDOM because it is a more flexible API that allows more portable code to be created. The JAXP interface in JDOM also supports a wider array of functionality out-of-the-box, whereas the same functionality through SAX 2.0 would require configuration that is specific to the SAX implementation.

JDOM does not provide a pre-configured way to do XML Schema validation through the SAX 2.0 API. The SAX 2.0 API does not expose a convenient way to configure different SAX implementations in a consistent way, so it is up to the JDOM user to wrap the XMLReaderSAX2Factory in such a way that it reconfigures the XMLReader to be appropriate for the task at hand.

Custom Factories

If your circumstances require it you can create your own implementation of the {@link org.jdom2.input.sax.XMLReaderJDOMFactory} to provide XMLReaders configured as you like them. In most cases the quickest route to a good result is to wrap an existing implementation with your custom code.

Note that the existing JDOM implementations described above all set the generated XMLReaders to be namespace-aware and to supply namespace-prefixes. Custom implementations should also ensure that this is set unless you absolutely know what you are doing.
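
A wrapping factory might look like the sketch below. It assumes JDOM 2 on the classpath and a Xerces-derived parser such as the JDK's built-in one, because the `disallow-doctype-decl` feature name is Xerces-specific. `NoDoctypeReaderFactory` is a hypothetical name.

```java
import org.jdom2.JDOMException;
import org.jdom2.input.sax.XMLReaderJDOMFactory;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;

class NoDoctypeReaderFactory implements XMLReaderJDOMFactory {
    // Xerces-specific feature (an assumption about the parser in use).
    private static final String DISALLOW_DOCTYPE =
            "http://apache.org/xml/features/disallow-doctype-decl";

    private final XMLReaderJDOMFactory delegate;

    NoDoctypeReaderFactory(XMLReaderJDOMFactory delegate) {
        this.delegate = delegate;
    }

    @Override
    public XMLReader createXMLReader() throws JDOMException {
        // Let the wrapped factory do the standard set-up (including the
        // namespace settings), then add the custom configuration on top.
        XMLReader reader = delegate.createXMLReader();
        try {
            reader.setFeature(DISALLOW_DOCTYPE, true);
        } catch (SAXException e) {
            throw new JDOMException("Unable to configure the XMLReader", e);
        }
        return reader;
    }

    @Override
    public boolean isValidating() {
        return delegate.isValidating();
    }
}
```

It would then be used as `new SAXBuilder(new NoDoctypeReaderFactory(XMLReaders.NONVALIDATING))`.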

The SAXHandlerFactory Pillar

The SAXHandler interprets the SAX calls and provides the information to the JDOMFactory to create JDOM content. SAXBuilder creates a SAXHandler from the {@link org.jdom2.input.sax.SAXHandlerFactory} pillar. It is unusual for a JDOM user to need to customise the manner in which this happens, but if you do, you can create a subclass of the SAXHandler class and then create an implementation of the SAXHandlerFactory that returns new instances of that subclass. This new factory can become a pillar in SAXBuilder and supply custom SAXHandlers to the parse process.
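
A minimal sketch of that approach, assuming JDOM 2 on the classpath: a SAXHandler subclass that ignores comment events, plus a factory that supplies it. `CommentDroppingHandlerFactory` and its nested class are hypothetical names, and dropping comments is only an illustrative customisation.

```java
import org.jdom2.JDOMFactory;
import org.jdom2.input.sax.SAXHandler;
import org.jdom2.input.sax.SAXHandlerFactory;

class CommentDroppingHandlerFactory implements SAXHandlerFactory {

    // Custom handler: behaves like the standard SAXHandler except that
    // comment events are ignored, so no Comment content is created.
    private static final class CommentDroppingHandler extends SAXHandler {
        CommentDroppingHandler(JDOMFactory factory) {
            super(factory);
        }

        @Override
        public void comment(char[] ch, int start, int length) {
            // Intentionally empty: skip the comment.
        }
    }

    // Called by SAXBuilder whenever it needs a fresh handler.
    @Override
    public SAXHandler createSAXHandler(JDOMFactory factory) {
        return new CommentDroppingHandler(factory);
    }
}
```

The factory is installed with `builder.setSAXHandlerFactory(new CommentDroppingHandlerFactory())`.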

The JDOMFactory Pillar

There are a couple of reasons for changing the JDOMFactory pillar in SAXBuilder. The default JDOMFactory used is the {@link org.jdom2.DefaultJDOMFactory}. This factory validates the values being used to create JDOM content. There is also the {@link org.jdom2.UncheckedJDOMFactory}, which does not validate the data, so it should only be used if you are absolutely certain that your SAX source can never provide illegal content. You may also have other reasons for supplying a custom JDOMFactory, for example if you need to create custom versions of JDOM Content such as a custom Element subclass.

Configuring the Pillars

The JDOMFactory pillar is not configurable; you can only replace it entirely. The other two pillars are configurable; inspect the getters and setters on {@link org.jdom2.input.SAXBuilder} to identify what can (by default) be changed easily. Remember, if anything needs to be customised beyond what SAXBuilder offers, you can always replace a pillar with a custom implementation.

Execution Model

Once all the pillars are set and configured to your satisfaction you can 'build' a JDOM Document from a source. The actual parse process consists of a 'setup', 'parse', and 'reset' phase.

The setup process involves obtaining an XMLReader from the XMLReaderJDOMFactory and a SAXHandler (configured to use the JDOMFactory) from the SAXHandlerFactory. These two instances are then configured to meet the settings specified on SAXBuilder, and once configured they are 'compiled' into a SAXBuilderEngine.

The SAXBuilderEngine is a non-configurable 'embodiment' of the configuration of the SAXBuilder when the engine was created, and it contains the entire 'workflow' necessary to parse the input into JDOM content. Further, it is guaranteed that the XMLReader and SAXHandler instances in the SAXBuilderEngine are never shared with any other engine or entity (assuming that the respective factories never issue the same instances multiple times). There is no guarantee made for the JDOMFactory being unique for each SAXBuilderEngine, but JDOMFactory instances are supposed to be reentrant/thread-safe.

The 'parse' phase starts once the setup phase is complete and the SAXBuilderEngine has been created. The created engine is used to parse the input, and the resulting Document is returned to the client.

The 'reset' phase happens after the completion of the 'parse' phase, and it resets the SAXBuilderEngine to its initial state, ready to process the next parse request.

Parser Reuse

A large amount of the effort involved in parsing the document is actually the creation of the XMLReader and the SAXHandler instances, as well as applying the configuration to those instances (the 'setup' phase).

JDOM2 uses the new SAXBuilderEngine to represent the state of the SAXBuilder at the moment prior to the parse. SAXBuilder will then 'remember' and reuse this exact SAXBuilderEngine until something changes in the SAXBuilder configuration. As soon as the configuration changes in any way the engine will be forgotten and a new one will be created when the SAXBuilder next parses a document.

If you turn off parser reuse with {@link org.jdom2.input.SAXBuilder#setReuseParser(boolean)} then SAXBuilder will immediately forget the engine, and it will also forget it after each build (i.e. SAXBuilder will create a new SAXBuilderEngine each parse).

It follows that as long as you do not change the SAXBuilder configuration, the SAXBuilder will always reuse the same SAXBuilderEngine. This is very efficient because there is no configuration management between parses, and the procedure completely eliminates the 'setup' component for all but the first parse.
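
The remember-and-forget behaviour described in this section can be modelled in a few lines. The sketch below is a toy model of the described logic, not JDOM source: `ReuseSketch`, `Engine` and `Builder` are hypothetical stand-ins, and the "parse" just tags a string.

```java
class ReuseSketch {

    // Stand-in for SAXBuilderEngine: an immutable snapshot of the
    // configuration taken at the moment it is created.
    static final class Engine {
        private final boolean validating;

        Engine(boolean validating) {
            this.validating = validating;
        }

        String build(String input) {
            return (validating ? "validated:" : "parsed:") + input;
        }
    }

    // Stand-in for SAXBuilder: caches the engine until the configuration
    // changes or reuse is switched off.
    static final class Builder {
        private boolean validating;
        private boolean reuse = true;
        private Engine cached;
        int enginesCreated;

        void setValidating(boolean validating) {
            this.validating = validating;
            cached = null; // any configuration change forgets the engine
        }

        void setReuseParser(boolean reuse) {
            this.reuse = reuse;
            cached = null;
        }

        String build(String input) {
            Engine engine = cached;
            if (engine == null) {
                engine = new Engine(validating); // the 'setup' phase
                enginesCreated++;
                if (reuse) {
                    cached = engine;
                }
            }
            return engine.build(input); // the 'parse' phase
        }
    }
}
```

In this model two consecutive builds share one engine, a call to `setValidating` forces a new one on the next build, and with reuse switched off every build pays the setup cost again.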

Parser Pooling

In order to facilitate Parser pooling it is useful to export the SAXBuilderEngine as a stand-alone reusable parser. At any time you can call {@link org.jdom2.input.SAXBuilder#buildEngine()} to get a newly created SAXBuilderEngine instance. The SAXBuilderEngine has the same 'build' methods as SAXBuilder, and these are exposed as the {@link org.jdom2.input.sax.SAXEngine} interface. Both SAXBuilder and SAXBuilderEngine implement the SAXEngine interface. Thus, if you use Parser pools you can pool either the SAXBuilder or the SAXBuilderEngine in the same pool.

Most likely, though, you will want to create a single SAXBuilder that represents the configuration you want, and then use it to create multiple SAXEngines as you need them in the pool by calling the buildEngine() method.
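
One possible shape for such a pool, assuming JDOM 2 on the classpath, is sketched below. `EnginePool` is a hypothetical class and the fixed-size blocking queue is just one simple pooling strategy.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import org.jdom2.Document;
import org.jdom2.JDOMException;
import org.jdom2.input.SAXBuilder;
import org.jdom2.input.sax.SAXEngine;

class EnginePool {
    private final BlockingQueue<SAXEngine> pool;

    // The template SAXBuilder carries the configuration; each pooled
    // engine is an independent snapshot created by buildEngine().
    EnginePool(SAXBuilder template, int size) throws JDOMException {
        pool = new ArrayBlockingQueue<>(size);
        for (int i = 0; i < size; i++) {
            pool.add(template.buildEngine());
        }
    }

    // Borrow an engine, parse, and always hand the engine back.
    Document parse(String xml) throws JDOMException, IOException, InterruptedException {
        SAXEngine engine = pool.take();
        try {
            return engine.build(new StringReader(xml));
        } finally {
            pool.add(engine);
        }
    }
}
```

Each thread holds an engine only for the duration of one parse, so no engine is ever used by two threads at once.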

Examples

Create a simple SAXBuilder and parse a document:

 SAXBuilder sb = new SAXBuilder();
 Document doc = sb.build(new File("file.xml"));
 

Create a DTD validating SAXBuilder and parse a document:

 SAXBuilder sb = new SAXBuilder(XMLReaders.DTDVALIDATING);
 Document doc = sb.build(new File("file.xml"));
 
Create an XSD (XML Schema) validating SAXBuilder using the XSD references inside the XML document and parse a document:

 SAXBuilder sb = new SAXBuilder(XMLReaders.XSDVALIDATING);
 Document doc = sb.build(new File("file.xml"));
 

Create an XSD (XML Schema) validating SAXBuilder the hard way (see the next example for an easier way) using an external XSD and parse a document (see {@link org.jdom2.input.sax.XMLReaderSchemaFactory}):

 SchemaFactory schemafac =
     SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
 Schema schema = schemafac.newSchema(new File("myschema.xsd"));
 XMLReaderJDOMFactory factory = new XMLReaderSchemaFactory(schema);
 SAXBuilder sb = new SAXBuilder(factory);
 Document doc = sb.build(new File("file.xml"));
 

Create an XSD (XML Schema) validating SAXBuilder the easy way (see {@link org.jdom2.input.sax.XMLReaderXSDFactory}):

 File xsdfile = new File("myschema.xsd");
 XMLReaderJDOMFactory factory = new XMLReaderXSDFactory(xsdfile);
 SAXBuilder sb = new SAXBuilder(factory);
 Document doc = sb.build(new File("file.xml"));
 
*/
package org.jdom2.input.sax;



