IMPORT MAIN HELP
Import Main Help File
This file describes the available modules of the voice import tool.
Running all the modules in order from top to bottom will create
a new voice from your database. However, you might not need to run all of
them. The modules that are absolutely necessary for voice building
are marked with a star (*).
Modules processing the raw acoustic data:
- *PraatPitchmarker:
- Computes the pitchmarks with the help of Praat.
Requires a working Praat installation. You can download Praat from here:
http://www.fon.hum.uva.nl/praat/
- *MCEPMaker:
- Computes the mel-cepstral coefficients (MCEP). You need the
Edinburgh Speech Tools for this. You can get them here:
http://www.cstr.ed.ac.uk/projects/speech_tools/
Modules processing unit labels and features:
- Mary2FestvoxTranscripts:
- Produces a transcript file in the Festvox style
from the transcript files used by Mary. This file is needed for the
SphinxLabelingPreparator.
- SphinxLabelingPreparator:
- Prepares the labeling with Sphinxtrain. For this,
you need Sphinxtrain and Sphinx2. You can download these programs here:
http://cmusphinx.sourceforge.net/webpage/html/download.php
Furthermore, you need the Edinburgh Speech Tools. You can get them here:
http://www.cstr.ed.ac.uk/projects/speech_tools/
This module requires a running
Maryserver, by default on the same machine using port 59125. You can
connect to a different server by changing the settings. See the settings
help for more information on this.
- SphinxTrainer:
- Trains the models used for labeling with Sphinxtrain.
Depending on the size of the data, this can take a long time.
- SphinxLabeler:
- Produces labels with the help of the models built by the
SphinxTrainer. Uses Sphinx2.
- MRPALabelConverter:
- If you have labeled data in the Festvox format that uses
the MRPA phoneset, use this module to convert the phones into the phoneset
used by Mary.
- Festvox2MaryTranscripts:
- Converts a transcript in the Festvox style into
the Mary format: individual text files, each containing one sentence.
- LabelledFilesInspector:
- Allows you to browse through your labels and listen
to the corresponding wave file.
- *PhoneUnitLabelComputer/HalfPhoneUnitLabelComputer:
- Converts the label files
into the label files used by Mary. PhoneUnitLabelComputer produces phone
labels, HalfPhoneUnitLabelComputer produces halfphone labels. You will need
both to build the voice.
- *PhoneUnitFeatureComputer/HalfPhoneUnitFeatureComputer:
- Computes the
features of the phone/halfphone units. This module requires a running
Maryserver, by default on the same machine using port 59125. You can
connect to a different server by changing the settings. See the settings
help for more information on this.
Which features are computed depends on your server: the configuration
files in the Marybase/conf/ directory ending in "targetfeatures" determine
which features will be computed. "english-halfphone-targetfeatures.config",
for example, determines which features are used for English halfphone targets.
- *PhoneLabelFeatureAligner/HalfPhoneLabelFeatureAligner:
- Tries to align the
labels and the features. If alignment fails, you can start the automatic pause
correction.
This works as follows:
- Pauses that are in the label file but not in the feature file
are deleted from the label file, and the durations of the previous
and next labels are stretched.
- Pauses that are in the feature file but not in the label file
are inserted into the label file with length zero.
If there are still errors after the pause correction, you are prompted for each
error. You can skip the error or remove the corresponding file from the
basename list (the list of files that are used for your voice). "skip all"
and "remove all" do this for all problematic files. "Edit unit labels" allows
you to edit the label file. "Edit RAWMARYXML" lets you edit the MaryXML that is
the input for computing the features. A Maryserver must be running in
order to recompute the features from the MaryXML. You can change the host and
port of the server in the settings for the
UnitFeatureComputer.
Modules producing basic data files:
- *WaveTimelineMaker:
- Produces a file containing all wave files. This file is
needed for various voice building steps and for synthesis.
- BasenameTimelineMaker:
- Produces a file containing all basenames and the
absolute times.
- *MCepTimelineMaker:
- Produces a file containing all mcep files. The file is
used for the join cost computation.
Modules building acoustic models:
- *PhoneUnitfileWriter:
- Produces a file containing all phone sized units.
- *PhoneFeatureFileWriter:
- Produces a file containing all the target cost
features for the phone sized units. The module needs a file defining which
features are to be used and what weights are given to them. They must be
the same features as the ones that the PhoneUnitFeatureComputer used. If you
do not have a feature definition, the module tries to create one. For
more information, see the example file:
Marybase/modules/import/examples/PhoneUnitFeatureDefinition.txt
- *DurationCARTTrainer:
- Builds an acoustic model of durations in the database
using the program "wagon" from the Edinburgh Speech Tools. You can get them
from here:
http://www.cstr.ed.ac.uk/projects/speech_tools/
Furthermore, the files produced by the two previous components are needed.
- *F0CARTTrainer:
- Builds acoustic models of F0 in the database. As with the
DurationCARTTrainer, the program "wagon" and the files produced by
PhoneUnitfileWriter and PhoneFeatureFileWriter are needed.
Modules building unit selection files:
- *HalfPhoneUnitfileWriter:
- Produces a file containing all halfphone sized units.
- *HalfPhoneFeatureFileWriter:
- Produces a file containing all the target cost
features for the halfphone sized units. The module needs a file defining which
features are to be used and what weights are given to them. They must be
the same features as the ones that the HalfPhoneUnitFeatureComputer used. If you
do not have a feature definition, the module tries to create one. For
more information, see the example file:
Marybase/modules/import/examples/HalfPhoneUnitFeatureDefinition.txt
- *JoinCostFileMaker:
- Produces a file containing all the join cost features for the
halfphone sized units.
- JoinCostPrecomputer:
- Precomputes the join costs for each unit pair. Use of this
module is discouraged, since it does not speed up the synthesis.
- *AcousticFeatureFileWriter:
- Produces a file containing all the target cost features
plus two acoustic target cost features for the halfphone sized units. Also produces
a feature definition containing those features.
- *CARTBuilder:
- Builds a preselection tree for the target cost features using the
program "wagon" from the Edinburgh Speech Tools. You can get them from here:
http://www.cstr.ed.ac.uk/projects/speech_tools/
Additionally, you need to specify either a feature sequence or a top level tree.
They are used to build a basic tree that is extended by wagon. This way, wagon runs
several times on smaller subsets of units rather than the whole set. It might still take
some time to run this module.
- Feature sequence: a file containing a list of features for which to build the tree.
- Top level tree: a file containing the basic tree.
For more information on these two possibilities of specifying the basic tree,
see the example files in Marybase/lib/modules/import/examples/.
If you give the CARTBuilder neither a feature sequence nor a top level tree file,
a default feature sequence is created which contains only "mary_phoneme" as a feature.
If the basic tree contains leaves that contain more units than the maximum number
of units allowed, the leaves are pruned and a warning message is printed. It is recommended that
you make sure that no leaves are too big.
- *CARTPruner:
- Prunes the preselection tree. This module removes outliers from the preselection tree.
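To check in advance whether a feature sequence for the CARTBuilder would leave some leaves too big, a sketch like the following may help. It is not part of the voice import tool. It assumes the units are available as dictionaries of feature values, and it represents the basic tree in flattened form: each distinct combination of values for the features in the sequence stands for one leaf. The feature name "mary_stressed", the function names and the max_units parameter are hypothetical; only "mary_phoneme" is taken from the text above.

```python
from collections import defaultdict


def build_basic_tree(units, feature_sequence):
    """Group unit indices into leaves, one leaf per combination of values.

    units:            list of dicts mapping feature name -> value
    feature_sequence: list of feature names to split on, in order
    Returns a dict mapping a tuple of feature values to a list of unit indices.
    """
    leaves = defaultdict(list)
    for index, unit in enumerate(units):
        # The path through the basic tree is the unit's value for each feature.
        key = tuple(unit[feature] for feature in feature_sequence)
        leaves[key].append(index)
    return dict(leaves)


def oversized_leaves(leaves, max_units):
    """Return the leaves holding more than max_units units, with their sizes."""
    return {key: len(indices) for key, indices in leaves.items()
            if len(indices) > max_units}
```

If oversized_leaves reports any leaves for the default sequence ["mary_phoneme"], adding a further feature to the sequence splits those leaves into smaller ones before wagon is run on them.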
Module installing the voice:
- VoiceInstaller:
- Copies all the necessary files to a new subdirectory
in the lib/voices/ directory of your Mary installation. Furthermore,
a file that specifies the properties of the voice is created and
stored in the conf/ directory of your Mary installation. Next time
you start the Mary server, the voice is loaded. You can also do this by
hand if you know what you are doing.
If you have problems:
- Take a look at the README file: Marybase/modules/import/README
- Write to the Mary mailing list: [email protected]
Anna Hunecke, June 2007.