




Import Main Help File
The following file describes the available modules of the voice import tool.
Running all the modules successively from top to bottom will create 
a new voice from your database. However, you might not need to run all of 
them. The modules that are absolutely necessary for building the voice
are marked with a star (*).
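
As a small aid for keeping track of where you are, the module list of this
help file can be written down as data. The following is only a sketch in
Python: the names and star markers are copied from the sections below, while
the helper functions required_modules and next_required are hypothetical and
not part of the import tool.

```python
# Module names and star markers as listed in this help file, top to bottom.
# True means the module is marked with a star (*), i.e. it is mandatory.
MODULES = [
    ("PraatPitchmarker", True),
    ("MCEPMaker", True),
    ("Mary2FestvoxTranscripts", False),
    ("SphinxLabelingPreparator", False),
    ("SphinxTrainer", False),
    ("SphinxLabeler", False),
    ("MRPALabelConverter", False),
    ("Festvox2MaryTranscripts", False),
    ("LabelledFilesInspector", False),
    ("PhoneUnitLabelComputer", True),
    ("HalfPhoneUnitLabelComputer", True),
    ("PhoneUnitFeatureComputer", True),
    ("HalfPhoneUnitFeatureComputer", True),
    ("PhoneLabelFeatureAligner", True),
    ("HalfPhoneLabelFeatureAligner", True),
    ("WaveTimelineMaker", True),
    ("BasenameTimelineMaker", False),
    ("MCepTimelineMaker", True),
    ("PhoneUnitfileWriter", True),
    ("PhoneFeatureFileWriter", True),
    ("DurationCARTTrainer", True),
    ("F0CARTTrainer", True),
    ("HalfPhoneUnitfileWriter", True),
    ("HalfPhoneFeatureFileWriter", True),
    ("JoinCostFileMaker", True),
    ("JoinCostPrecomputer", False),
    ("AcousticFeatureFileWriter", True),
    ("CARTBuilder", True),
    ("CARTPruner", True),
    ("VoiceInstaller", False),
]


def required_modules(modules=MODULES):
    """Return the names of the starred modules in running order."""
    return [name for name, required in modules if required]


def next_required(done, modules=MODULES):
    """Return the first starred module that is not in `done`, or None."""
    for name in required_modules(modules):
        if name not in done:
            return name
    return None
```

For example, next_required({"PraatPitchmarker"}) names the starred module to
run after the pitchmarker.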

Modules processing the raw acoustic data:

*PraatPitchmarker: 
Computes the pitchmarks with the help of Praat. 
  Requires a valid Praat installation. You can download Praat from here:
http://www.fon.hum.uva.nl/praat/
*MCEPMaker: 
Computes the mel-cepstral coefficients. You need the 
  Edinburgh Speech Tools for this. You can get them here:
http://www.cstr.ed.ac.uk/projects/speech_tools/
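
Before starting, it can help to check that the external programs are
installed. The sketch below assumes that the executables are on your PATH
under the names "praat" and "wagon". These names are assumptions and may
differ on your system. "wagon" serves here only as an indicator that the
Edinburgh Speech Tools are installed. The function missing_tools is
hypothetical and not part of the import tool.

```python
import shutil

# Assumed executable names; adjust them to match your installation.
DEFAULT_TOOLS = ("praat", "wagon")


def missing_tools(tools=DEFAULT_TOOLS):
    """Return the entries of `tools` that cannot be found on the PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


if __name__ == "__main__":
    for tool in missing_tools():
        print(f"not found on PATH: {tool}")
```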



Modules processing unit labels and features:

Mary2FestvoxTranscripts: 
Produces a transcript file in the Festvox style 
  from the transcript files used by Mary. This file is needed for the 
  SphinxLabelingPreparator.
SphinxLabelingPreparator:
Prepares the labeling with Sphinxtrain. For this,
  you need Sphinxtrain and Sphinx2. You can download these programs here:
http://cmusphinx.sourceforge.net/webpage/html/download.php
Furthermore, you need the Edinburgh Speech Tools. You can get them here:
http://www.cstr.ed.ac.uk/projects/speech_tools/
  This module requires a running 
  Maryserver, by default on the same machine using port 59125. You can 
  connect to a different server by altering the settings. See the settings
  help for more information on this.
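
To find out whether a Maryserver is reachable before you start a module that
needs one, you can try to open a TCP connection to it. This is a minimal
sketch: it only tests that something accepts connections on the given host and
port, not that it is really a Maryserver. The function name
mary_server_reachable is hypothetical; the default port 59125 is the one named
above.

```python
import socket


def mary_server_reachable(host="localhost", port=59125, timeout=2.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Connection refused, timeout, unknown host and similar errors.
        return False


if __name__ == "__main__":
    print("server reachable:", mary_server_reachable())
```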
SphinxTrainer:
Trains the models used for labeling with Sphinxtrain. 
  Depending on the size of the data, this can take a long time.
SphinxLabeler:
Produces labels with the help of the models trained by the 
  SphinxTrainer. Uses Sphinx2.

MRPALabelConverter:
If you have labeled data in the Festvox format that uses
  the MRPA phoneset, use this module to convert the phones into the phoneset
  used by Mary.
Festvox2MaryTranscripts:
Converts a transcript in the Festvox style into
  the Mary format: individual text files, each containing one sentence.
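
The conversion described above can be pictured with the following sketch. It
is not the code of Festvox2MaryTranscripts. It assumes that each line of the
Festvox transcript has the form ( basename "sentence text" ) and that the
individual files are named basename.txt; both the regular expression and the
file extension are assumptions, and split_transcript is a hypothetical name.

```python
import re
from pathlib import Path

# Assumed entry format, one per line: ( basename "sentence text" )
ENTRY = re.compile(r'^\(\s*(\S+)\s+"(.*)"\s*\)\s*$')


def split_transcript(festvox_file, out_dir):
    """Write one text file per transcript entry; return the file names."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for line in Path(festvox_file).read_text(encoding="utf-8").splitlines():
        match = ENTRY.match(line.strip())
        if not match:
            continue  # skip blank or malformed lines
        basename, text = match.groups()
        target = out_dir / f"{basename}.txt"
        # Festvox escapes double quotes inside the sentence as \"
        target.write_text(text.replace('\\"', '"') + "\n", encoding="utf-8")
        written.append(target.name)
    return written
```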

LabelledFilesInspector:
Allows you to browse through your labels and listen
  to the corresponding wave file.

*PhoneUnitLabelComputer/HalfPhoneUnitLabelComputer:
Converts the label files 
  into the label files used by Mary. PhoneUnitLabelComputer produces phone 
  labels, HalfPhoneUnitLabelComputer produces halfphone labels. You will need 
  both to build the voice.
*PhoneUnitFeatureComputer/HalfPhoneUnitFeatureComputer:
Computes the 
  features of the phone/halfphone units. This module requires a running 
  Maryserver, by default on the same machine using port 59125. You can 
  connect to a different server by altering the settings. See the settings
  help for more information on this.

  Which features are computed depends on your server: the configuration 
  files in the Marybase/conf/ directory whose names contain "targetfeatures" 
  determine which features will be computed. For example, 
  "english-halfphone-targetfeatures.config" determines which features are 
  used for English halfphone targets.
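
To see which of these configuration files your server installation has, you
can list them. This sketch assumes that you pass the path of your Mary
installation (called Marybase in this file) and simply matches file names
containing "targetfeatures". The function target_feature_configs is
hypothetical and not part of Mary.

```python
from pathlib import Path


def target_feature_configs(mary_base):
    """Return the names of the files in <mary_base>/conf whose names
    contain "targetfeatures", sorted alphabetically."""
    conf_dir = Path(mary_base) / "conf"
    return sorted(
        path.name for path in conf_dir.glob("*targetfeatures*") if path.is_file()
    )
```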
*PhoneLabelFeatureAligner/HalfPhoneLabelFeatureAligner:
Tries to align the 
  labels and the features. If alignment fails, you can start the automatic pause 
  correction.

  This works as follows:
 
  - Pauses that are in the label file but not in the feature file 
    are deleted from the label file, and the durations of the previous
    and next labels are stretched.

  - Pauses that are in the feature file but not in the label file 
    are inserted into the label file with length zero.
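
The two rules above can be illustrated with the following sketch. It is not
the code used by the aligner modules. It assumes that pauses are written as
"_", that labels are (phone, duration) pairs, and that the duration of a
deleted pause is split evenly between the previous and the next label. All of
these are assumptions made for the illustration, and correct_pauses is a
hypothetical name.

```python
def correct_pauses(labels, feature_phones, pause="_"):
    """Align (phone, duration) labels with the phone sequence of the features.

    Extra pauses in the labels are deleted and their duration is given to the
    neighbouring labels; pauses missing from the labels are inserted with
    duration zero. Any other mismatch raises ValueError.
    """
    out = []
    carry = 0.0  # duration still to be added to the next kept label
    i = j = 0
    while i < len(labels) or j < len(feature_phones):
        lab = labels[i] if i < len(labels) else None
        feat = feature_phones[j] if j < len(feature_phones) else None
        if lab is not None and feat is not None and lab[0] == feat:
            out.append([lab[0], lab[1] + carry])
            carry = 0.0
            i += 1
            j += 1
        elif lab is not None and lab[0] == pause:
            # Pause only in the labels: delete it, stretch the neighbours.
            if out:
                out[-1][1] += lab[1] / 2
                carry += lab[1] / 2
            else:
                carry += lab[1]
            i += 1
        elif feat == pause:
            # Pause only in the features: insert it with length zero.
            out.append([pause, 0.0])
            j += 1
        else:
            raise ValueError(f"cannot align label {lab} with feature {feat}")
    if carry and out:
        out[-1][1] += carry  # a deleted final pause has no next label
    return [tuple(item) for item in out]
```

For the labels [("h", 0.25), ("_", 0.5), ("e", 0.25)] and the feature phones
["h", "e", "_"], the pause between "h" and "e" is deleted and a zero-length
pause is appended, so the total duration stays the same.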

  If there are still errors after the pause correction, you are prompted for each 
  error. You can skip the error or remove the corresponding file from the 
  basename list (the list of files that are used for your voice). "skip all" 
  and "remove all" do this for all problematic files. "Edit unit labels" allows 
  you to edit the label file. "Edit RAWMARYXML" lets you edit the MaryXML that is 
  the input for computing the features. A Maryserver must be running in 
  order to recompute the features from the MaryXML. You can change the host and 
  port for the server in the settings for the 
  UnitFeatureComputer.


                
Modules producing basic data files:

*WaveTimelineMaker:
Produces a file containing all wave files. This file is 
  needed for various voice building steps and for synthesis.
BasenameTimelineMaker:
Produces a file containing all basenames and the
  absolute times.
*MCepTimelineMaker: 
Produces a file containing all mcep files. The file is 
  used for the join cost computation.



Modules building acoustic models:

*PhoneUnitfileWriter:
Produces a file containing all phone sized units.
*PhoneFeatureFileWriter: 
Produces a file containing all the target cost 
  features for the phone sized units. The module needs a file defining which 
  features are to be used and what weights are given to them. They must be 
  the same features as the ones that the PhoneUnitFeatureComputer used. If you
  do not have a feature definition, the module tries to create one. For 
  more information, see the example file: 
  Marybase/modules/import/examples/PhoneUnitFeatureDefinition.txt
*DurationCARTTrainer: 
Builds an acoustic model of durations in the database
  using the program "wagon" from the Edinburgh Speech Tools. You can get them 
  from here:
http://www.cstr.ed.ac.uk/projects/speech_tools/
  Furthermore, the files produced by the two previous components are needed.
*F0CARTTrainer: 
Builds acoustic models of F0 in the database. As with the
  DurationCARTTrainer, you need the program "wagon" and the files produced by 
  PhoneUnitfileWriter and PhoneFeatureFileWriter.



Modules building unit selection files:

*HalfPhoneUnitfileWriter: 
Produces a file containing all halfphone sized units.
*HalfPhoneFeatureFileWriter: 
Produces a file containing all the target cost 
  features for the halfphone sized units. The module needs a file defining which 
  features are to be used and what weights are given to them. They must be 
  the same features as the ones that the HalfPhoneUnitFeatureComputer used. If you
  do not have a feature definition, the module tries to create one. For 
  more information, see the example file: 
  Marybase/modules/import/examples/HalfPhoneUnitFeatureDefinition.txt
*JoinCostFileMaker: 
Produces a file containing all the join cost features for the 
  halfphone sized units.
JoinCostPrecomputer: 
Precomputes the join costs for each unit pair. Use of this 
  module is discouraged, since it does not speed up the synthesis.
*AcousticFeatureFileWriter: 
Produces a file containing all the target cost features 
  plus two acoustic target cost features for the halfphone sized units. Also produces
  a feature definition containing those features.
*CARTBuilder: 
Builds a preselection tree for the target cost features using the 
  program "wagon" from the Edinburgh Speech Tools. You can get them from here:
http://www.cstr.ed.ac.uk/projects/speech_tools/ 

  Additionally, you need to specify either a feature sequence or a top level tree. 
  They are used to build a basic tree that is extended by wagon. This way, wagon runs
  several times on smaller subsets of units rather than once on the whole set. It might still take
  some time to run this module. 

  - Feature sequence: a file containing a list of features for which to build the tree.

  - Top level tree: a file containing the basic tree. 

  For more information on these two possibilities of specifying the basic tree,
  see the example files in Marybase/lib/modules/import/examples/.
  If you give the CARTBuilder neither a feature sequence nor a top level tree file, 
  a default feature sequence is created which only contains "mary_phoneme" as feature.
  If the basic tree contains leaves that contain more units than the maximum number
  of units allowed, the leaves are pruned and a warning message is printed. It is recommended that
  you make sure that there are no leaves that are too big.
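
The idea of a basic tree built from a feature sequence can be sketched as
follows. This is only an illustration, not the code of the CARTBuilder: units
are represented as dictionaries of feature values, the tree as nested
dictionaries with lists of units as leaves, and both function names are
hypothetical. The second function reports the leaves that hold more than a
given number of units, which helps to check for leaves that are too big.

```python
def build_basic_tree(units, feature_sequence):
    """Split the units by each feature in turn.

    Returns nested dicts keyed by feature value; the leaves are lists of
    units. With an empty feature sequence, all units end up in one leaf.
    """
    if not feature_sequence:
        return list(units)
    feature, rest = feature_sequence[0], feature_sequence[1:]
    groups = {}
    for unit in units:
        groups.setdefault(unit[feature], []).append(unit)
    return {value: build_basic_tree(group, rest) for value, group in groups.items()}


def oversized_leaves(tree, max_units, path=()):
    """Return (path, size) for every leaf with more than max_units units."""
    if isinstance(tree, list):
        return [(path, len(tree))] if len(tree) > max_units else []
    found = []
    for value, subtree in tree.items():
        found.extend(oversized_leaves(subtree, max_units, path + (value,)))
    return found
```

With the default feature sequence, which contains only "mary_phoneme", the
tree has one leaf per phoneme. Adding further features to the sequence makes
the leaves smaller.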
*CARTPruner: 
Prunes the preselection tree. This module removes outliers from the preselection tree.



Module installing the voice:

VoiceInstaller:
Copies all the necessary files to a new subdirectory 
  in the lib/voices/ directory of your Mary installation. Furthermore, 
  a file that specifies the properties of the voice is created and
  stored in the conf/ directory of your Mary installation. Next time
  you start the Mary server, the voice is loaded. You can also do this by
  hand if you know what you are doing.



If you have problems:
- Take a look at the README file: Marybase/modules/import/README

- Write to the Mary mailing list: [email protected]




Anna Hunecke, June 2007.