
Pipeline Script: Narrator
Overview
This script creates a DAISY/NISO 2005 DTB and a Daisy 2.02 DTB from a DTBook 2005 document. A TTS system is used to create
the audio for the book. The xml:lang attributes in the DTBook input document are used
to switch the language of the TTS.
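To see which languages a given input would ask the TTS to switch between, the xml:lang values can be listed before a run. The following is a minimal sketch and not part of the Pipeline: the helper name languages_used and the sample document are hypothetical, and only Python's standard xml.etree.ElementTree module is used.

```python
import xml.etree.ElementTree as ET
from typing import List

# ElementTree exposes the reserved "xml:" prefix under this namespace URI.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def languages_used(dtbook_xml: str) -> List[str]:
    """Return the distinct xml:lang values in document order (hypothetical helper)."""
    root = ET.fromstring(dtbook_xml)
    seen: List[str] = []
    for element in root.iter():
        lang = element.get(XML_LANG)
        if lang and lang not in seen:
            seen.append(lang)
    return seen


# Hypothetical, heavily shortened DTBook fragment used only for illustration.
sample = """<dtbook xmlns="http://www.daisy.org/z3986/2005/dtbook/" xml:lang="en">
  <book><bodymatter><level1>
    <h1>Greetings</h1>
    <p>Hello. <span xml:lang="sv">Hej.</span></p>
  </level1></bodymatter></book>
</dtbook>"""

print(languages_used(sample))
```

Each value reported here is a language for which a TTS voice would need to be available when multi-language support is enabled.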
Note - at the time of writing, the default distribution of the Daisy Pipeline
supports running the Narrator script on the Windows™, Linux, and Mac OS X™ operating systems.
Note - this script requires that you have the Lame MP3 encoder installed on your system. Please refer to the LAME installation instructions.
Input Requirements
Any valid DTBook 2005 document with the following restrictions:
- Note references and annotation references are only allowed within the same DTBook document.
In order to conform to the Narrator internal requirements, the following transformations can be automatically applied to the input document:
- Lists (<list>) and definition lists (<dl>) inside paragraphs (<p>) are extracted and moved as siblings of the paragraph.
- The dc:Language, dc:Date and dc:Publisher metadata may be generated if they were not already set.
- An xml:lang attribute will be added to the DTBook root element if it was not already present.
- A doctitle element will be added to the front matter of the DTBook document if it was not already present.
- Dummy headings will be added to the DTBook document in the following cases:
  - A level has no heading but at least one of its descendant sub-levels has a heading.
  - No h1 or hd was found in the entire document.
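The first transformation in the list above (moving lists out of paragraphs) can be illustrated with a short sketch. This is not the Pipeline's own implementation. It assumes that each extracted list is placed directly after its paragraph and that any text following the list stays inside the paragraph; the real routine may behave differently, for example by splitting the paragraph. The function names are hypothetical.

```python
import xml.etree.ElementTree as ET


def local_name(element):
    """Strip any '{namespace}' prefix from an element tag."""
    return element.tag.rsplit("}", 1)[-1]


def hoist_lists(root):
    """Move <list>/<dl> children of each <p> to just after that <p> (assumed placement)."""
    for parent in list(root.iter()):
        for para in [c for c in parent if local_name(c) == "p"]:
            insert_at = list(parent).index(para) + 1
            for child in [c for c in para if local_name(c) in ("list", "dl")]:
                position = list(para).index(child)
                # Keep the text that followed the list inside the paragraph.
                if child.tail:
                    if position == 0:
                        para.text = (para.text or "") + child.tail
                    else:
                        previous = para[position - 1]
                        previous.tail = (previous.tail or "") + child.tail
                child.tail = None
                para.remove(child)
                parent.insert(insert_at, child)
                insert_at += 1
    return root


doc = ET.fromstring(
    "<level1><p>Intro <list><li>a</li></list> outro</p><p>next</p></level1>"
)
hoist_lists(doc)
print(ET.tostring(doc, encoding="unicode"))
```

After the call, the list is a sibling of the first paragraph rather than a child of it, which is the shape the input requirements describe.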
When this script is run, the input document is validated to make sure it complies with the above rules.
Output
A valid Daisy 2.02 full text full audio book with full skippability (page numbers, sidebars,
production notes and footnotes). The book is synchronized at sentence level and the audio
is encoded in MP3 format.
A valid DAISY/NISO 2005 full text full audio book with full skippability and escapability.
As a part of the Narrator chain, the book is validated using ZedVal.
Configuration
- Input DTBook file
- Required. A DTBook 2005 document that is valid against the rules described in the input
requirements section.
- Output directory
- Required. The base directory of the output. Since Narrator stores some intermediate files in this directory,
the resulting DAISY/NISO 2005 book will be located in the z3986-mp3 subdirectory while the Daisy 2.02 book
will be located in the daisy202 subdirectory.
- DTBook Fix
- Optional. Selects whether to apply DTBook Fix routines to the input document.
- Apply sentence detection
- Optional. Selects whether to apply sentence detection to the input document.
- Multi-language support
- Optional. Selects whether to use different TTS voices depending on the xml:lang attributes.
- MP3 Bitrate
- Optional. The bitrate of the generated MP3 files. A higher value will result in better sound quality but the audio files will be larger.
- 2.02 href target
- Optional. The SMIL element (text or par) to target by href URIs in the content document and NCC of the DAISY 2.02 book.
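As a small illustration of the output directory layout described above, the two result locations can be derived from the base directory. This is a sketch only: the function name narrator_output_dirs, the dictionary keys, and the example path are hypothetical. The subdirectory names z3986-mp3 and daisy202 are the ones given in this document.

```python
from pathlib import Path


def narrator_output_dirs(output_dir):
    """Map each generated book to its subdirectory of the Narrator output directory."""
    base = Path(output_dir)
    return {
        "daisy3": base / "z3986-mp3",   # DAISY/NISO 2005 book
        "daisy202": base / "daisy202",  # Daisy 2.02 book
    }


# Hypothetical base directory, used only for illustration.
for name, path in narrator_output_dirs("/tmp/narrator-out").items():
    print(name, path)
```

Other files found directly under the base directory after a run are the intermediate files mentioned above, not part of either book.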
Running Narrator on Linux
This script can be run out of the box on Linux, provided that the following two components are installed.
- The eSpeak TTS library
- If you are using later versions of Ubuntu, eSpeak and its dependencies should be installed by default. Verify this by issuing the command espeak "testing" or something similar.
If you do not have eSpeak installed, get it from the eSpeak SourceForge homepage.
Note for advanced users - you can configure which eSpeak voice to use for which language by changing the Speech Generator TTS builder configuration. This is done in transformers/speechgen2/tts/ttsbuilder.xml.
- Lame MP3 Encoder
- The Lame MP3 encoder needs to be installed, and the pipeline.user.properties file needs to point to the installation path.
Using Aptitude (for example on Ubuntu), install lame through the shell via the command sudo aptitude install lame.
In pipeline.user.properties (or through the Pipeline GUI if you are using that) change pipeline.lame.path to /usr/bin/lame.
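The two prerequisites above can be checked with a short script before running Narrator. This is a hedged sketch, not a Pipeline tool: the function names are hypothetical, the properties parser is deliberately simplified to plain key=value lines (the full properties syntax allows more), and the example text stands in for a real pipeline.user.properties file.

```python
import shutil


def missing_tools(names):
    """Return the tool names that cannot be found on the PATH."""
    return [name for name in names if shutil.which(name) is None]


def read_property(properties_text, key):
    """Return the value of `key` from simple key=value lines, or None if absent."""
    for line in properties_text.splitlines():
        line = line.strip()
        if not line or line.startswith(("#", "!")):
            continue  # skip blank lines and comments
        name, separator, value = line.partition("=")
        if separator and name.strip() == key:
            return value.strip()
    return None


# Stand-in for the content of pipeline.user.properties (assumed content).
example = "# example content\npipeline.lame.path=/usr/bin/lame\n"

print("configured lame path:", read_property(example, "pipeline.lame.path"))
print("not found on PATH:", missing_tools(["espeak", "lame"]))
```

An empty list from missing_tools means both executables were found; the configured path should then match where lame is actually installed.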
Support for the MathML in DAISY extension
This script supports the MathML in DAISY extension.
The following key points should be taken into account when using this script to generate MathML-enabled DTBs:
- MathML in DAISY requires that the MathML islands have an alternate image and alternate text for use by reading systems that do not support MathML natively.
If your input document does not have the alternate image and alternate text, you will need to install an add-in to the Pipeline that provides the functionality to add these values.
- MathML in DAISY is an extension to the DAISY 3 (DAISY/NISO) standard. This script will still output a DAISY 2.02 DTB when MathML is present in the input, but in the 2.02 version of the DTB, MathML islands are converted to plain images with alternate text.
Support for multi-language TTS configuration
The separate document Speechgen2 multi-language support describes how to refine the configuration of per-language TTS engines in the Narrator.
Note - this requires manual editing of XML configuration files.
Configuring Narrator for concurrent TTS rendering
The separate document Running speechgen2 using multiple TTS slaves describes how to configure Narrator to use multiple concurrent TTS engines. Note - this requires high technical proficiency.
Appendix: List of Transformers used
The documents linked below are parts of the Transformer technical documentation. These documents are aimed at developers and systems administrators.