
loci.poi.hpsf.package.html

Java API to handle Microsoft OLE 2 Compound Document format (Word, Excel). Based on poi-2.5.1-final-20040804.jar, with bugfixes for OLE v2 and memory efficiency improvements. Used by Bio-Formats for OLE support (cxd, ipw, oib, zvi). Used by VisBio overlays logic for XLS export feature.

HPSF

Processes streams in the Horrible Property Set Format (HPSF) in POI filesystems. Microsoft Office documents, i.e. POI filesystems, usually contain metadata such as the author, the title, and the time of the last save. These items are called properties and are stored in property set streams alongside the document itself. These streams are commonly named \005SummaryInformation and \005DocumentSummaryInformation. However, a POI filesystem may contain further property sets with other names or types.

In order to extract the properties from a POI filesystem, a property set stream's contents must be parsed into a {@link loci.poi.hpsf.PropertySet} instance. Its subclasses {@link loci.poi.hpsf.SummaryInformation} and {@link loci.poi.hpsf.DocumentSummaryInformation} deal with the well-known property set streams \005SummaryInformation and \005DocumentSummaryInformation. (However, the streams' names are irrelevant. What counts is the property set's first section's format ID - see below.)

The factory method {@link loci.poi.hpsf.PropertySetFactory#create} creates a {@link loci.poi.hpsf.PropertySet} instance. It always returns the most specific property set class: if it identifies the stream data as a Summary Information or a Document Summary Information stream, it returns an instance of the corresponding class; otherwise it returns a general {@link loci.poi.hpsf.PropertySet}.
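The dispatch described above can be pictured with a minimal standalone sketch that uses only the JDK, not this library. It assumes the standard property set stream header layout (little-endian, byte-order mark 0xFFFE at offset 0, section count at offset 24, first section's format ID at offset 28) and the two well-known format IDs in their on-disk byte order. The class FormatIdSketch, its Kind enum, and the helper sampleHeader are hypothetical names introduced for illustration; the real factory may perform further checks.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.util.Arrays;

// Hypothetical illustration, not part of loci.poi.hpsf.
final class FormatIdSketch {
    enum Kind { SUMMARY_INFORMATION, DOCUMENT_SUMMARY_INFORMATION, GENERIC }

    // F29F85E0-4FF9-1068-AB91-08002B27B3D9 as serialized in the stream.
    static final byte[] SUMMARY_FMTID = bytes(
        0xE0, 0x85, 0x9F, 0xF2, 0xF9, 0x4F, 0x68, 0x10,
        0xAB, 0x91, 0x08, 0x00, 0x2B, 0x27, 0xB3, 0xD9);

    // D5CDD502-2E9C-101B-9397-08002B2CF9AE as serialized in the stream.
    static final byte[] DOC_SUMMARY_FMTID = bytes(
        0x02, 0xD5, 0xCD, 0xD5, 0x9C, 0x2E, 0x1B, 0x10,
        0x93, 0x97, 0x08, 0x00, 0x2B, 0x2C, 0xF9, 0xAE);

    private static byte[] bytes(int... values) {
        byte[] out = new byte[values.length];
        for (int i = 0; i < values.length; i++) {
            out[i] = (byte) values[i];
        }
        return out;
    }

    // Classifies a property set stream by its first section's format ID.
    static Kind classify(byte[] stream) {
        // 28-byte header + 16-byte format ID + 4-byte section offset.
        if (stream.length < 48) {
            throw new IllegalArgumentException("too short for a property set header");
        }
        ByteBuffer buf = ByteBuffer.wrap(stream).order(ByteOrder.LITTLE_ENDIAN);
        if ((buf.getShort(0) & 0xFFFF) != 0xFFFE) {
            throw new IllegalArgumentException("missing 0xFFFE byte-order mark");
        }
        if (buf.getInt(24) < 1) {
            return Kind.GENERIC; // no section, so nothing to inspect
        }
        byte[] formatId = Arrays.copyOfRange(stream, 28, 44);
        if (Arrays.equals(formatId, SUMMARY_FMTID)) {
            return Kind.SUMMARY_INFORMATION;
        }
        if (Arrays.equals(formatId, DOC_SUMMARY_FMTID)) {
            return Kind.DOCUMENT_SUMMARY_INFORMATION;
        }
        return Kind.GENERIC;
    }

    // Builds a minimal 48-byte header with one section, for experimentation.
    static byte[] sampleHeader(byte[] formatId) {
        if (formatId.length != 16) {
            throw new IllegalArgumentException("format ID must be 16 bytes");
        }
        ByteBuffer buf = ByteBuffer.allocate(48).order(ByteOrder.LITTLE_ENDIAN);
        buf.putShort(0, (short) 0xFFFE); // byte-order mark
        buf.putInt(24, 1);               // section count
        System.arraycopy(formatId, 0, buf.array(), 28, 16);
        return buf.array();
    }
}
```

Under these assumptions, classify(sampleHeader(SUMMARY_FMTID)) yields SUMMARY_INFORMATION, while a header carrying any other format ID yields GENERIC. The stream's name plays no part in the decision.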

A {@link loci.poi.hpsf.PropertySet} contains a list of {@link loci.poi.hpsf.Section}s which can be retrieved with {@link loci.poi.hpsf.PropertySet#getSections}. Each {@link loci.poi.hpsf.Section} contains a {@link loci.poi.hpsf.Property} array which can be retrieved with {@link loci.poi.hpsf.Section#getProperties}. Since the vast majority of {@link loci.poi.hpsf.PropertySet}s contain only a single {@link loci.poi.hpsf.Section}, the convenience method {@link loci.poi.hpsf.PropertySet#getProperties} returns the properties of the {@link loci.poi.hpsf.PropertySet}'s only {@link loci.poi.hpsf.Section}. It throws a {@link loci.poi.hpsf.NoSingleSectionException} if the {@link loci.poi.hpsf.PropertySet} does not contain exactly one {@link loci.poi.hpsf.Section}.

Each {@link loci.poi.hpsf.Property} has an ID, a type, and a value which can be retrieved with {@link loci.poi.hpsf.Property#getID}, {@link loci.poi.hpsf.Property#getType}, and {@link loci.poi.hpsf.Property#getValue}, respectively. The value's class depends on the property's type. The current implementation does not yet support all property types and restricts the values' classes to {@link java.lang.String}, {@link java.lang.Integer} and {@link java.util.Date}. A value of an as yet unsupported type is returned as a byte array containing the value's original bytes from the property set stream.
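How a property's type can determine the Java class of its value is sketched below using only the JDK. This is not the library's code. It assumes the usual on-disk encoding (a 4-byte little-endian type code followed by the value) and the variant type codes VT_I4 = 3, VT_LPSTR = 30 and VT_FILETIME = 64. It also assumes ISO-8859-1 for strings, whereas a real stream declares its own codepage. The class name VariantDecodeSketch is hypothetical, and the sketch does no validation of lengths.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.Date;

// Hypothetical illustration, not part of loci.poi.hpsf.
final class VariantDecodeSketch {
    static final int VT_I4 = 3;        // 32-bit signed integer
    static final int VT_LPSTR = 30;    // length-prefixed, NUL-terminated string
    static final int VT_FILETIME = 64; // 100 ns ticks since 1601-01-01 UTC

    // Milliseconds between 1601-01-01 and 1970-01-01.
    private static final long EPOCH_DIFF_MILLIS = 11_644_473_600_000L;

    // Decodes one property: 4-byte type code, then the type-specific value.
    static Object decode(byte[] property) {
        ByteBuffer buf = ByteBuffer.wrap(property).order(ByteOrder.LITTLE_ENDIAN);
        int type = buf.getInt();
        switch (type) {
            case VT_I4:
                return Integer.valueOf(buf.getInt());
            case VT_LPSTR: {
                int length = buf.getInt();
                byte[] raw = new byte[length];
                buf.get(raw);
                // Drop the trailing NUL terminator(s) counted in the length.
                int end = length;
                while (end > 0 && raw[end - 1] == 0) {
                    end--;
                }
                // Assumption: single-byte codepage approximated by ISO-8859-1.
                return new String(raw, 0, end, StandardCharsets.ISO_8859_1);
            }
            case VT_FILETIME: {
                long ticks = buf.getLong();
                return new Date(ticks / 10_000L - EPOCH_DIFF_MILLIS);
            }
            default:
                // Unsupported type: hand back the raw value bytes unchanged.
                return Arrays.copyOfRange(property, 4, property.length);
        }
    }
}
```

A caller would typically test the result with instanceof before casting, because the return type is only Object. A byte[] result signals a type this sketch does not decode.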

To retrieve the value of a specific {@link loci.poi.hpsf.Property}, use {@link loci.poi.hpsf.Section#getProperty} or {@link loci.poi.hpsf.Section#getPropertyIntValue}.

The {@link loci.poi.hpsf.SummaryInformation} and {@link loci.poi.hpsf.DocumentSummaryInformation} classes provide convenience methods for retrieving well-known properties. For example, an application that wants to retrieve a document's title string just calls {@link loci.poi.hpsf.SummaryInformation#getTitle} instead of going through the hassle of first finding out what the title's property ID is and then using this ID to get the property's value.
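The contrast between looking a value up by ID and calling a convenience method can be shown with a small JDK-only stand-in. It assumes the title's property ID is 2, as given in Microsoft's summary information documentation. The map-backed class TitleLookupSketch and its put method are hypothetical and do not mirror the library's internals. Returning null for a missing or non-string title is likewise an assumption of this sketch.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a summary information section.
final class TitleLookupSketch {
    // Assumed property ID of the title in the summary information set.
    static final int PID_TITLE = 2;

    private final Map<Integer, Object> properties = new HashMap<>();

    void put(int id, Object value) {
        properties.put(id, value);
    }

    // Generic access: the caller must know the property ID.
    Object getProperty(int id) {
        return properties.get(id);
    }

    // Convenience access: the ID is hidden behind a descriptive name.
    String getTitle() {
        Object value = getProperty(PID_TITLE);
        return value instanceof String ? (String) value : null;
    }
}
```

After put(PID_TITLE, "Quarterly report"), both getProperty(2) and getTitle() return the same string, but only the second spares the caller the ID lookup and the cast.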

Writing properties can be done with the classes {@link loci.poi.hpsf.MutablePropertySet}, {@link loci.poi.hpsf.MutableSection}, and {@link loci.poi.hpsf.MutableProperty}.

Public documentation from Microsoft can be found in the appropriate section of the MSDN Library.

History

2003-09-11:

{@link loci.poi.hpsf.PropertySetFactory#create(InputStream)} no longer throws an {@link loci.poi.hpsf.UnexpectedPropertySetTypeException}.

To Do

The following items are still left to be implemented. Sponsoring could speed up work on them considerably.

  • Convenience methods for setting summary information and document summary information properties

  • Better codepage support

  • Support for more property (variant) types

@author Rainer Klute ([email protected])
@version $Id: package.html 496526 2007-01-15 22:46:35Z markt $
@since 2002-02-09




