
Support for processing Russian documents
A named entity recognition pipeline that identifies basic entity types such as Person, Location, Organization, Money amounts, and Time and Date expressions. It works on documents in the Russian language.
Default annotations
:Person
Standard named entity types
:Location
:Organization
:Date
:Address
Includes email and IP addresses as well as street addresses
Additional annotations available if selected
:Money
Monetary amounts
:Percent
Expressions representing percentages
:Token
The individual tokens of the text, with a "category" feature holding the part-of-speech (POS) tag
:SpaceToken
The spaces between tokens
:Sentence
Sentences detected by the sentence splitter
:Lookup
Individual gazetteer lookups
:MSD
"Morpho-Syntactic Description" for selected tokens, including the features "lemma" (the base form of an inflected word) and "type" (roughly equivalent to a part-of-speech tag in English, though more complex because it also encodes features such as gender and grammatical case)
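
The split between default and additional annotations can be pictured with a small sketch. It only models the two lists given above. The helper name resolve_selectors and the idea of passing the extra selectors as a list are assumptions made for illustration, not part of the pipeline or of any GATE API.

```python
# Annotation selectors exactly as listed in the description above.
DEFAULT_ANNOTATIONS = (":Person", ":Location", ":Organization", ":Date", ":Address")
ADDITIONAL_ANNOTATIONS = (
    ":Money", ":Percent", ":Token", ":SpaceToken", ":Sentence", ":Lookup", ":MSD",
)


def resolve_selectors(extra=()):
    """Return the default selectors plus any requested additional ones.

    Hypothetical helper: it models the "default vs. selected" split only
    and does not run or configure the pipeline itself.
    """
    unknown = [s for s in extra if s not in ADDITIONAL_ANNOTATIONS]
    if unknown:
        raise ValueError(f"unknown additional annotation(s): {unknown}")
    # Keep the additional selectors in the order the description lists them.
    chosen = [s for s in ADDITIONAL_ANNOTATIONS if s in extra]
    return list(DEFAULT_ANNOTATIONS) + chosen


if __name__ == "__main__":
    print(resolve_selectors([":MSD", ":Money"]))
```

Called with no arguments the helper returns only the five default selectors; asking for a selector outside the additional list raises ValueError rather than silently ignoring it.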