Transformation API For XML

Introduction

This overview describes the set of APIs contained in javax.xml.transform. For the sake of brevity, these interfaces are referred to as TrAX (Transformations for XML).

There is a broad need for Java applications to be able to transform XML and related tree-shaped data structures. In fact, XML is not normally very useful to an application without going through some sort of transformation, unless the semantic structure is used directly as data. Almost all XML-related applications need to perform transformations. Transformations may be described by Java code, Perl code, XSLT stylesheets, other types of script, or by proprietary formats. The inputs to a transformation, of which there may be one or several, may be a URL, an XML stream, a DOM tree, SAX events, or a proprietary format or data structure. The output types are much the same as the input types, but different inputs may need to be combined with different outputs.

The great challenge of a transformation API is how to deal with all the possible combinations of inputs and outputs, without becoming specialized for any of the given types.

The Java community will greatly benefit from a common API that will allow them to understand and apply a single model, write to consistent interfaces, and apply the transformations polymorphically. TrAX attempts to define a model that is clean and generic, yet fills general application requirements across a wide variety of uses.

General Terminology

This section will explain some general terminology used in this document. Technical terminology will be explained in the Model section. In many cases, the general terminology overlaps with the technical terminology.

  • Tree
    This term, as used within this document, describes an abstract structure that consists of nodes or events that may be produced by XML. Physically, a Tree may be a DOM tree, a series of well-balanced parse events (such as those coming from a SAX2 ContentHandler), a series of requests (the result of which can describe a tree), or a stream of marked-up characters.

  • Source Tree(s)
    One or more trees that are the inputs to the transformation.

  • Result Tree(s)
    One or more trees that are the output of the transformation.

  • Transformation
    The process of consuming a stream or tree to produce another stream or tree.

  • Identity (or Copy) Transformation
    The process of transformation from a source to a result, making as few structural changes as possible and no informational changes. The term is somewhat loosely used, as the process is really a copy from one "format" (such as a DOM tree, stream, or set of SAX events) to another.

  • Serialization
    The process of taking a tree and turning it into a stream. In some sense, a serialization is a specialized transformation.

  • Parsing
    The process of taking a stream and turning it into a tree. In some sense, parsing is a specialized transformation.

  • Transformer
    A Transformer is the object that executes the transformation.

  • Transformation instructions
    Describes the transformation. A form of code, script, or simply a declaration or series of declarations.

  • Stylesheet
    The same as "transformation instructions," except it is likely to be used in conjunction with XSLT.

  • Templates
    Another form of "transformation instructions." In the TrAX interface, this term is used to describe processed or compiled transformation instructions. The Source flows through a Templates object to be formed into the Result.

  • Processor
    A general term for the thing that may both process the transformation instructions, and perform the transformation.

  • DOM
    Document Object Model, specifically referring to the Document Object Model (DOM) Level 2 Specification.

  • SAX
    Simple API for XML, specifically referring to the SAX 2.0.2 release.
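
The identity transformation and serialization terms above can be illustrated with a short sketch. It assumes the default processor bundled with the JDK; the class name IdentitySketch, its helper methods, and the sample <note> document are hypothetical and exist only for illustration. A Transformer obtained without any transformation instructions copies a DOM tree (the Source tree) into a character stream (the Result tree).

```java
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

// Hypothetical class name; a sketch of serialization via an identity transformation.
class IdentitySketch {

    // Serializes a DOM tree to a string by copying it through an identity Transformer.
    static String serialize(Document doc) throws TransformerException {
        // newTransformer() with no arguments yields an identity (copy) Transformer.
        Transformer identity = TransformerFactory.newInstance().newTransformer();
        // Leave out the XML declaration so only the copied tree is written.
        identity.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        StringWriter out = new StringWriter();
        identity.transform(new DOMSource(doc), new StreamResult(out));
        return out.toString();
    }

    // Builds a tiny in-memory DOM tree (sample data invented for this sketch).
    static Document sampleDocument() throws ParserConfigurationException {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
        Element root = doc.createElement("note");
        root.setTextContent("hello");
        doc.appendChild(root);
        return doc;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(serialize(sampleDocument()));
    }
}
```

The same call with a StreamSource as input and a DOMResult as output would perform the opposite copy, which corresponds to the Parsing entry above.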

Model

This section defines the abstract model for TrAX, apart from the details of the interfaces.

A TrAX TransformerFactory is an object that processes transformation instructions, and produces Templates (in the technical terminology). A Templates object provides a Transformer, which transforms one or more Sources into one or more Results.

To use the TrAX interface, you create a TransformerFactory, which may directly provide a Transformer, or which can provide Templates from a variety of Sources. The Templates object is a processed or compiled representation of the transformation instructions, and provides a Transformer. The Transformer processes a Source according to the instructions found in the Templates, and produces a Result.
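
The sequence described above (TransformerFactory, then Templates, then Transformer, then Source to Result) might look like the following minimal sketch. It assumes an XSLT 1.0 processor such as the one bundled with the JDK; the class name TraxModelSketch, the inline stylesheet, and the <greeting> and <message> element names are hypothetical and exist only for illustration.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Templates;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical class name; a sketch of the TrAX model, not a reference implementation.
class TraxModelSketch {

    // Illustrative transformation instructions: rewrap the text of <greeting> in <message>.
    static final String XSLT =
        "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
      + "<xsl:output method='xml' omit-xml-declaration='yes'/>"
      + "<xsl:template match='/greeting'>"
      + "<message><xsl:value-of select='.'/></message>"
      + "</xsl:template>"
      + "</xsl:stylesheet>";

    static String transform(String xml) throws TransformerException {
        // 1. The factory processes the transformation instructions...
        TransformerFactory factory = TransformerFactory.newInstance();
        // 2. ...and produces Templates, the compiled form of those instructions.
        Templates templates = factory.newTemplates(new StreamSource(new StringReader(XSLT)));
        // 3. The Templates object provides a Transformer.
        Transformer transformer = templates.newTransformer();
        // 4. The Transformer processes a Source and produces a Result.
        StringWriter out = new StringWriter();
        transformer.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws TransformerException {
        System.out.println(transform("<greeting>hello</greeting>"));
    }
}
```

When the instructions are only needed once, factory.newTransformer(Source) skips the explicit Templates step; keeping the Templates object is mainly useful when the same instructions are applied repeatedly.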

The process of transformation from a tree, either in the form of an object model, or in the form of parse events, into a stream, is known as serialization. We believe this is the most suitable term for this process, despite the overlap with Java object serialization.

TrAX Patterns

    Processor

    Intent: Generic concept for the set of objects that implement the TrAX interfaces.
    Responsibilities: Create compiled transformation instructions, transform sources, and manage transformation parameters and properties.
    Thread safety: Only the Templates object can be used concurrently in multiple threads. The rest of the processor does not synchronize internally, and so may not be used to perform multiple concurrent operations. Different Processors can be used concurrently by different threads.

    TransformerFactory

    Intent: Serve as a vendor-neutral Processor interface for XSLT and similar processors.
    Responsibilities: Serve as a factory for a concrete implementation of a TransformerFactory, serve as a direct factory for Transformer objects, serve as a factory for Templates objects, and manage processor-specific features.
    Thread safety: A TransformerFactory may not perform multiple concurrent operations.

    Templates

    Intent: The runtime representation of the transformation instructions.
    Responsibilities: A data bag for transformation instructions; act as a factory for Transformers.
    Thread safety: Thread-safe for concurrent usage over multiple threads once construction is complete.

    Transformer

    Intent: Act as a per-thread execution context for transformations, and act as an interface for performing the transformation.
    Responsibilities: Perform the transformation.
    Thread safety: A Transformer instance may not be used by multiple threads concurrently; use one instance per thread.
    Notes: The Transformer is bound to the Templates object that created it.
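
The thread-safety notes for Templates and Transformer suggest a common usage pattern, sketched below under stated assumptions: the transformation instructions are compiled once into a shared Templates object, and every task obtains its own Transformer from it. The class name SharedTemplatesSketch, the inline stylesheet, the <item> and <done> element names, and the pool size of 4 are all hypothetical.

```java
import java.io.StringReader;
import java.io.StringWriter;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import javax.xml.transform.Templates;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical class name; a sketch of sharing Templates across threads.
class SharedTemplatesSketch {

    // Illustrative transformation instructions: rewrap the text of <item> in <done>.
    static final String XSLT =
        "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
      + "<xsl:output method='xml' omit-xml-declaration='yes'/>"
      + "<xsl:template match='/item'>"
      + "<done><xsl:value-of select='.'/></done>"
      + "</xsl:template>"
      + "</xsl:stylesheet>";

    static List<String> transformAll(List<String> inputs) throws Exception {
        // The factory is used from a single thread only; the resulting Templates is shared.
        Templates templates = TransformerFactory.newInstance()
            .newTemplates(new StreamSource(new StringReader(XSLT)));

        ExecutorService pool = Executors.newFixedThreadPool(4); // pool size chosen arbitrarily
        try {
            List<Future<String>> futures = new ArrayList<>();
            for (String xml : inputs) {
                Callable<String> task = () -> {
                    // Each task creates its own Transformer, so none is shared between threads.
                    Transformer transformer = templates.newTransformer();
                    StringWriter out = new StringWriter();
                    transformer.transform(
                        new StreamSource(new StringReader(xml)), new StreamResult(out));
                    return out.toString();
                };
                futures.add(pool.submit(task));
            }
            // Collect results in submission order.
            List<String> results = new ArrayList<>();
            for (Future<String> future : futures) {
                results.add(future.get());
            }
            return results;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        List<String> inputs = new ArrayList<>();
        inputs.add("<item>a</item>");
        inputs.add("<item>b</item>");
        System.out.println(transformAll(inputs));
    }
}
```

Creating a Transformer per task is slightly more conservative than one per thread, but it keeps the sketch free of any per-thread caching logic.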

    Source

    Intent: Serve as a single vendor-neutral object for multiple types of input.
    Responsibilities: Act as simple data holder for System IDs, DOM nodes, streams, etc.
    Thread safety: Safe for concurrent read-only use by multiple threads; access must be synchronized for edit operations.

    Result

    Potential alternate name: ResultTarget
    Intent: Serve as a single object for multiple types of output, so there can be simple process method signatures.
    Responsibilities: Act as simple data holder for output stream, DOM node, ContentHandler, etc.
    Thread safety: Safe for concurrent read-only use by multiple threads; access must be synchronized for edit operations.
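
Because Source and Result are plain data holders, a single method signature can serve several input and output combinations. The sketch below assumes the JDK's default processor; the class name SourceResultSketch and its copy, toDom, and toText helpers are hypothetical. One copy(Source, Result) method is used first with a stream as input and a DOM tree as output, then with the DOM tree as input and a stream as output.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Result;
import javax.xml.transform.Source;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMResult;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;
import org.w3c.dom.Document;
import org.w3c.dom.Node;

// Hypothetical class name; a sketch of Source and Result as interchangeable holders.
class SourceResultSketch {

    // One signature for every input/output combination: an identity copy.
    static void copy(Source source, Result result) throws TransformerException {
        TransformerFactory.newInstance().newTransformer().transform(source, result);
    }

    // Stream in, DOM tree out.
    static Document toDom(String xml) throws TransformerException {
        DOMResult result = new DOMResult(); // no node set, so the processor creates a Document
        copy(new StreamSource(new StringReader(xml)), result);
        return (Document) result.getNode();
    }

    // DOM tree in, stream out.
    static String toText(Node node) throws TransformerException {
        StringWriter out = new StringWriter();
        copy(new DOMSource(node), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws TransformerException {
        Document doc = toDom("<a>x</a>");
        System.out.println(doc.getDocumentElement().getNodeName());
        System.out.println(toText(doc));
    }
}
```

Other holders, such as SAXSource and SAXResult in javax.xml.transform.sax, could be passed to the same copy method without changing its signature.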




