
net.sf.okapi.steps.gcaligner.CharDist.properties
## This property file defines the character distribution between the
## languages to align. The character distribution is the average number
## of target characters per source character. For example, English to
## French is roughly 1 and English to Japanese is roughly 0.5. This
## means that English and French sentences are roughly the same length,
## while a Japanese sentence is roughly half as long as the English
## sentence it translates.
## The properties define the character distribution of each language
## relative to an imaginary neutral language. The neutral language is
## always the source language: each language defines its character
## distribution as a target language of the neutral source language.
## The character distribution between two languages is calculated from
## the character distribution of each language to the neutral
## language. For example, if the character distribution of English to
## the neutral language is 1, Japanese is 0.5 and Chinese is also 0.5,
## then the character distribution of English to Japanese is 0.5,
## English to Chinese is also 0.5 and Japanese to Chinese is 1.
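## Worked example (a sketch, assuming the pairwise value is the target
## language's entry divided by the source language's entry; this is
## consistent with the figures above but is not stated explicitly here):
##   English  -> Japanese = 0.5 / 1   = 0.5
##   Japanese -> Chinese  = 0.5 / 0.5 = 1
##   Japanese -> English  = 1 / 0.5   = 2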
## The key of a property is either a two-letter lowercase language
## code or a locale code consisting of a two-letter lowercase language
## code + '_' (underscore) + a two-letter uppercase country code. A
## property with a locale code overrides a property with a language
## code.
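## Illustration only (the zh_TW entry and its value are hypothetical and
## are not defined in this file). Given the two entries
##   zh = 0.5
##   zh_TW = 0.6
## the override rule means the locale zh_TW would use 0.6, while a zh
## locale without its own entry would use the language-level 0.5.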
## The value of a property is the character distribution of that
## language to the neutral language. If a language has no definition,
## the default value is 1.
en = 1
fr = 1.1
it = 1.1
de = 1.1
es = 1.1
ja = 0.5
zh = 0.5
mn = 1.1
th = 1.5
id = 1.2
to = 1.1
km = 1.1
ton = 1.1
khm = 1.1