
Support for processing Russian documents
A named entity recognition pipeline that identifies basic entity types, such as Person, Location, Organization, Money amounts, and Time and Date expressions. It works on documents in the Russian language.
This version of the pipeline includes an inflectional gazetteer to
recognise more morphological variants of target names.
Default annotations
:Person, :Location, :Organization, :Date
Standard named entity types
:Address
Includes email and IP addresses as well as street addresses
Additional annotations available if selected
:Money
Monetary amounts
:Percent
Expressions representing percentages
:Token
The individual tokens of the text, with "category" feature for POS
:SpaceToken
The spaces between tokens
:Sentence
Sentences detected by the sentence splitter
:Lookup
Individual gazetteer lookups. Lookups that come from the inflectional gazetteer include a "lemma" feature giving the base word form
:MSD
"Morpho-Syntactic Description" for selected tokens, including the features "lemma" (the base form of an inflected word) and "type" (roughly equivalent to a part-of-speech tag in English, though more complex, as it also encodes features such as gender and grammatical case)
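
To illustrate how the "lemma" feature on :Lookup annotations might be used downstream, here is a minimal Python sketch that groups inflected surface forms under their base form. It does not use the pipeline's real API: the `Annotation` class, the `span` helper, and the sample annotations are all hypothetical stand-ins, assuming only that each annotation has a type, character offsets, and a feature map as described above.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class Annotation:
    """Hypothetical stand-in for a pipeline annotation (not the real API)."""
    type: str
    start: int
    end: int
    features: dict = field(default_factory=dict)


def span(text, word, begin=0):
    """Return (start, end) offsets of the first occurrence of word in text."""
    start = text.index(word, begin)
    return start, start + len(word)


def group_lookups_by_lemma(annotations, text):
    """Map each lemma to the surface forms of the Lookups that carry it.

    Lookups without a "lemma" feature (assumed here to come from a
    non-inflectional gazetteer) and all other annotation types are skipped.
    """
    groups = defaultdict(list)
    for ann in annotations:
        if ann.type != "Lookup":
            continue
        lemma = ann.features.get("lemma")
        if lemma is None:
            continue
        groups[lemma].append(text[ann.start:ann.end])
    return dict(groups)


# Made-up sample: two inflected forms of "Москва" (Moscow) in one sentence.
text = "Он приехал из Москвы и живёт в Москве."
annotations = [
    Annotation("Token", *span(text, "Он"), {"category": "P"}),  # illustrative tag
    Annotation("Lookup", *span(text, "Москвы"), {"lemma": "Москва"}),
    Annotation("Lookup", *span(text, "Москве"), {"lemma": "Москва"}),
    Annotation("Lookup", *span(text, "Он")),  # no lemma feature
]

lemma_groups = group_lookups_by_lemma(annotations, text)
```

Grouping by lemma rather than by surface string is what lets the genitive "Москвы" and the prepositional "Москве" count as mentions of the same name.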