
Transformer documentation: se_tpb_wordml2dtbook
Transformer Purpose
This transformer automatically converts structured Word files into DTBook.
Input Requirements
A Word 2003 XML file with embedded images. The document must comply with the conditions stated in the script documentation.
Output
On success
A DTBook 2005-1 file, or a DTBook 2005-1 fileset if the input contains embedded images and the images parameter is set to true (extract images).
On error
Configuration/Customization
Parameters (tdf)
- xml
- The WordML XML file
- out
- Path to output
- filename
- Optional file name for the dtbook file. Default is dtbook.xml
- images
- Set to true to extract images or false to only extract the DTBook file
- convertImages
- Set to true to convert all images to JPEG format. ImageMagick must be installed. When running from the Daisy Pipeline GUI, set the path to the conversion program in the paths dialog (under preferences). When running from the command prompt, include the path in a pipeline property named "pipeline.imageMagick.converter.path", e.g. <entry key="pipeline.imageMagick.converter.path">C:\\Program\\ImageMagick-6.3.4-Q16\\convert.exe</entry>
- overwrite
- Set to true to overwrite existing files or false to abort processing
- uid
- Set dtb:uid. Leave blank to generate a random uid
- stylesheet
- Sets the name of the stylesheet assigned in the output. Specify '' to assign none.
- title
- The name of the publication. If no value is supplied, the information is extracted from the original file
- author
- The author of the publication. If no value is supplied, the information is extracted from the original file
- customStyle
- The custom style identifier. Should match the desired style-id in the custom tagset.
- customTagset
- Path to the custom tagset
- defaultStyle
- WordML version number. The default value is the only one supported.
- defaultTagset
- Path to the default tagset
- forceJPEG
- When set to 'true' the transform will use a ".jpg" file extension regardless of what the original extension was. Otherwise, the original file extension will be preserved.
- uniquePageID
- When set to 'true' the transform will use a unique string as id. Otherwise, the page number will be used as id.
- dtbook-version
- Sets the dtbook doctype and value of the version attribute. Selectable values are 2005-1 or 2005-2.
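The constraints in the parameter list above can be checked before a run. The following Python sketch is an illustration only and is not part of the Daisy Pipeline: the helper name check_parameters is hypothetical, treating xml and out as required is an assumption, and so is the assumption that boolean parameters are passed as the strings 'true' and 'false'. Only the default file name (dtbook.xml) and the two selectable dtbook-version values come from the descriptions above.

```python
# Hypothetical pre-flight check for se_tpb_wordml2dtbook parameters.
# This is an illustrative sketch, not Daisy Pipeline code.

# Default documented above: the output file is named dtbook.xml.
DEFAULTS = {"filename": "dtbook.xml"}

# Parameters described above as taking true/false values.
BOOLEAN_PARAMS = ("images", "convertImages", "overwrite",
                  "forceJPEG", "uniquePageID")

# Selectable values documented for dtbook-version.
DTBOOK_VERSIONS = ("2005-1", "2005-2")

# Assumption: the input file and output path must always be supplied.
REQUIRED_PARAMS = ("xml", "out")


def check_parameters(params):
    """Return (merged, problems): params with defaults applied, plus
    a list of human-readable problems (empty if none were found)."""
    merged = dict(DEFAULTS)
    merged.update(params)
    problems = []

    for name in REQUIRED_PARAMS:
        if not merged.get(name):
            problems.append(f"missing required parameter: {name}")

    for name in BOOLEAN_PARAMS:
        if name in merged and merged[name] not in ("true", "false"):
            problems.append(
                f"{name} must be 'true' or 'false', got {merged[name]!r}")

    version = merged.get("dtbook-version")
    if version is not None and version not in DTBOOK_VERSIONS:
        problems.append(
            f"dtbook-version must be one of {DTBOOK_VERSIONS}, got {version!r}")

    return merged, problems


if __name__ == "__main__":
    merged, problems = check_parameters(
        {"xml": "book.xml", "out": "output", "images": "true"})
    print(merged["filename"])
    print(problems)
```

Returning a list of problems rather than raising on the first one lets all mistakes in a parameter set be reported at once.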
Extended configurability
/custom/new-custom-tagset.xsl
Modify a copy of this file to support other Word styles or to change the behaviour of a supported style. A description of the grammar is provided separately. To use the edited file, set the parameters customTagset and customStyle accordingly.
Further development
Dependencies
Saxon.jar
Author
Joel Håkansson, TPB
Licensing
LGPL