IMPORT MAIN HELP



Import Main Help File

This file describes the available modules of the voice import tool. Running all the modules successively from top to bottom will create a new voice from your database. However, you might not need to run all of them. The modules that are absolutely necessary for building a voice are marked with a star (*).

Modules processing the raw acoustic data:

*PraatPitchmarker:
Computes the pitchmarks with the help of Praat. Requires a valid Praat installation. You can download Praat from here: http://www.fon.hum.uva.nl/praat/
*MCEPMaker:
Computes the mel-cepstral coefficients. You need the Edinburgh Speech Tools for this. You can get them here: http://www.cstr.ed.ac.uk/projects/speech_tools/

Modules processing unit labels and features:

Mary2FestvoxTranscripts:
Produces a transcript file in the Festvox style from the transcript files used by Mary. This file is needed for the SphinxLabelingPreparator.
SphinxLabelingPreparator:
Prepares the labeling with Sphinxtrain. For this, you need Sphinxtrain and Sphinx2. You can download these programs here: http://cmusphinx.sourceforge.net/webpage/html/download.php Furthermore, you need the Edinburgh Speech Tools. You can get them here: http://www.cstr.ed.ac.uk/projects/speech_tools/ This module requires a running Mary server, by default on the same machine using port 59125. You can connect to a different server by altering the settings. See the settings help for more information on this.
SphinxTrainer:
Trains the models used for labeling with Sphinxtrain. Depending on the size of the data, this can take a long time.
SphinxLabeler:
Produces labels with the help of the models built by the SphinxTrainer. Uses Sphinx2.
MRPALabelConverter:
If you have labeled data in the Festvox format that uses the MRPA phoneset, use this module to convert the phones into the phoneset used by Mary.
Festvox2MaryTranscripts:
Converts a transcript in the Festvox style into the Mary format: individual text files, each containing one sentence.
LabelledFilesInspector:
Allows you to browse through your labels and listen to the corresponding wave file.
*PhoneUnitLabelComputer/HalfPhoneUnitLabelComputer:
Converts the label files into the label files used by Mary. PhoneUnitLabelComputer produces phone labels, HalfPhoneUnitLabelComputer produces halfphone labels. You will need both to build the voice.
*PhoneUnitFeatureComputer/HalfPhoneUnitFeatureComputer:
Computes the features of the phone/halfphone units. This module requires a running Mary server, by default on the same machine using port 59125. You can connect to a different server by altering the settings. See the settings help for more information on this.
Which features are computed depends on your server: the configuration files in the Marybase/conf/ directory whose names end in "targetfeatures" determine which features will be computed. "english-halfphone-targetfeatures.config", for example, determines which features are used for English halfphone targets.
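Before starting a module that needs the Mary server, it can save time to confirm that something is listening on the configured host and port. The following is a minimal sketch, not part of the voice import tool: the function name mary_server_reachable is hypothetical, and it only tests that a TCP connection can be opened, not that the listener is really a Mary server.

```python
import socket


def mary_server_reachable(host="localhost", port=59125, timeout=2.0):
    """Return True if a TCP connection to host:port can be opened.

    The defaults mirror the default server location named in this help
    file (same machine, port 59125). This only checks that the port
    accepts connections; it does not speak the Mary protocol.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unresolvable hosts.
        return False
```

Calling mary_server_reachable() with no arguments checks the default location; pass the host and port from your settings if you have pointed the modules at a different server.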
*PhoneLabelFeatureAligner/HalfPhoneLabelFeatureAligner:
Tries to align the labels and the features. If alignment fails, you can start the automatic pause correction.
This works as follows:
- Pauses that are in the label file but not in the feature file are deleted from the label file, and the durations of the previous and next labels are stretched.
- Pauses that are in the feature file but not in the label file are inserted into the label file with length zero.
If there are still errors after the pause correction, you are prompted for each error. You can skip the error or remove the corresponding file from the basename list (the list of files that are used for your voice). "Skip all" and "remove all" do this for all problematic files. "Edit unit labels" allows you to edit the label file. "Edit RAWMARYXML" lets you edit the MaryXML that is the input for computing the features. You need a running Mary server in order to recompute the features from the MaryXML. You can change the host and port of the server by altering the settings for the UnitFeatureComputer.
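The two pause correction rules can be sketched as follows. This is an illustration under stated assumptions, not the aligner's actual code: the pause symbol "_", the (phone, duration) representation of labels, and the even split of a deleted pause's duration between its neighbours are all assumptions made for the example.

```python
PAUSE = "_"  # assumed pause symbol, for illustration only


def correct_pauses(labels, feature_phones):
    """Align a label sequence to a feature sequence by fixing pauses.

    labels: list of (phone, duration) pairs from the label file.
    feature_phones: list of phone symbols from the feature file.
    Returns a new list of [phone, duration] entries.
    """
    out = []
    carry = 0.0  # duration waiting to be added to the next kept label
    i = j = 0
    while i < len(labels) or j < len(feature_phones):
        lab = labels[i] if i < len(labels) else None
        feat = feature_phones[j] if j < len(feature_phones) else None
        if lab is not None and lab[0] == PAUSE and feat != PAUSE:
            # Pause only in the label file: delete it and stretch the
            # neighbours (assumed: half to each, all to one at an edge).
            if out:
                out[-1][1] += lab[1] / 2.0
                carry += lab[1] / 2.0
            else:
                carry += lab[1]
            i += 1
        elif feat == PAUSE and (lab is None or lab[0] != PAUSE):
            # Pause only in the feature file: insert it with length zero.
            out.append([PAUSE, 0.0])
            j += 1
        else:
            if lab is None or feat is None or lab[0] != feat:
                # A mismatch that is not a pause cannot be auto-corrected.
                raise ValueError("cannot align at label %d / feature %d" % (i, j))
            out.append([lab[0], lab[1] + carry])
            carry = 0.0
            i += 1
            j += 1
    if carry and out:
        out[-1][1] += carry  # deleted pause was the last label
    return out
```

Any mismatch that is not a pause raises an error here, which corresponds to the cases where the module prompts you to skip, remove or edit the file.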

Modules producing basic data files:

*WaveTimelineMaker:
Produces a file containing all wave files. This file is needed for various voice building steps and for synthesis.
BasenameTimelineMaker:
Produces a file containing all basenames and the absolute times.
*MCepTimelineMaker:
Produces a file containing all mcep files. The file is used for the join cost computation.

Modules building acoustic models:

*PhoneUnitfileWriter:
Produces a file containing all phone sized units.
*PhoneFeatureFileWriter:
Produces a file containing all the target cost features for the phone sized units. The module needs a file defining which features are to be used and what weights are given to them. They must be the same features as the ones that the PhoneUnitFeatureComputer used. If you do not have a feature definition, the module tries to create one. For more information, see the example file: Marybase/modules/import/examples/PhoneUnitFeatureDefinition.txt
*DurationCARTTrainer:
Builds an acoustic model of durations in the database using the program "wagon" from the Edinburgh Speech Tools. You can get them from here: http://www.cstr.ed.ac.uk/projects/speech_tools/ Furthermore, the files produced by the two previous modules (PhoneUnitfileWriter and PhoneFeatureFileWriter) are needed.
*F0CARTTrainer:
Builds acoustic models of F0 in the database. As with the DurationCARTTrainer, the program "wagon" and the files produced by PhoneUnitfileWriter and PhoneFeatureFileWriter are needed.

Modules building unit selection files:

*HalfPhoneUnitfileWriter:
Produces a file containing all halfphone sized units.
*HalfPhoneFeatureFileWriter:
Produces a file containing all the target cost features for the halfphone sized units. The module needs a file defining which features are to be used and what weights are given to them. They must be the same features as the ones that the HalfPhoneUnitFeatureComputer used. If you do not have a feature definition, the module tries to create one. For more information, see the example file: Marybase/modules/import/examples/HalfPhoneUnitFeatureDefinition.txt
*JoinCostFileMaker:
Produces a file containing all the join cost features for the halfphone sized units.
JoinCostPrecomputer:
Precomputes the join costs for each unit pair. Use of this module is discouraged, since it does not speed up the synthesis.
*AcousticFeatureFileWriter:
Produces a file containing all the target cost features plus two acoustic target cost features for the halfphone sized units. Also produces a feature definition containing those features.
*CARTBuilder:
Builds a preselection tree for the target cost features using the program "wagon" from the Edinburgh Speech Tools. You can get them from here: http://www.cstr.ed.ac.uk/projects/speech_tools/
Additionally, you need to specify either a feature sequence or a top level tree. They are used to build a basic tree that is extended by wagon. This way, wagon runs several times on smaller subsets of units rather than on the whole set. It might still take some time to run this module.
- Feature sequence: a file containing a list of features for which to build the tree.
- Top level tree: a file containing the basic tree.
For more information on these two ways of specifying the basic tree, see the example files in Marybase/lib/modules/import/examples/. If you give the CARTBuilder neither a feature sequence nor a top level tree file, a default feature sequence is created which contains only "mary_phoneme" as a feature. If the basic tree contains leaves that hold more units than the maximum number of units allowed, the leaves are pruned and a warning message is printed. It is recommended that you make sure that there are no leaves that are too big.
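To see why the choice of feature sequence affects leaf size, the idea of a basic tree can be sketched as repeated grouping of units by each feature in the sequence. This is only an illustration, not the CARTBuilder's implementation or file format: the function names, the dictionary-based tree, and the second feature name "mary_stressed" used in the usage note are assumptions made for the example.

```python
from collections import defaultdict


def build_basic_tree(units, feature_sequence):
    """Split units by each feature in order.

    units: list of dicts mapping feature name to value.
    Returns nested dicts keyed by feature value; each leaf is a list
    of the units that share all values along the path to it.
    """
    if not feature_sequence:
        return list(units)
    first, rest = feature_sequence[0], feature_sequence[1:]
    groups = defaultdict(list)
    for unit in units:
        groups[unit[first]].append(unit)
    return {value: build_basic_tree(group, rest)
            for value, group in groups.items()}


def oversized_leaves(tree, max_units, path=()):
    """Return (path, size) for every leaf holding more than max_units."""
    if isinstance(tree, list):
        return [(path, len(tree))] if len(tree) > max_units else []
    found = []
    for value, subtree in tree.items():
        found.extend(oversized_leaves(subtree, max_units, path + (value,)))
    return found
```

With only "mary_phoneme" in the sequence, every unit of a frequent phoneme lands in a single leaf, so oversized_leaves is likely to report it. Adding a further feature such as the illustrative "mary_stressed" splits those leaves into smaller ones.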
*CARTPruner:
Prunes the preselection tree. This module removes outliers from the preselection tree.

Module installing the voice:

VoiceInstaller:
Copies all the necessary files to a new subdirectory in the lib/voices/ directory of your Mary installation. Furthermore, a file that specifies the properties of the voice is created and stored in the conf/ directory of your Mary installation. Next time you start the Mary server, the voice is loaded. You can also do this by hand if you know what you are doing.

If you have problems:

- Take a look at the README file: Marybase/modules/import/README
- Write to the Mary mailing list: [email protected]


Anna Hunecke, June 2007.



