
resources..russie-inflex-metadata.long-desc.html


A named entity recognition pipeline that identifies basic entity types, such as Person, Location, Organization, Money amounts, Time and Date expressions. It works on documents in the Russian language.

This version of the pipeline includes an inflectional gazetteer to recognise more morphological variants of target names.

Default annotations
:Person Standard named entity type
:Location Standard named entity type
:Organization Standard named entity type
:Date Standard named entity type
:Address Includes email and IP addresses as well as street addresses
Additional annotations available if selected
:Money Monetary amounts
:Percent Expressions representing percentages
:Token The individual tokens of the text, with "category" feature for POS
:SpaceToken The spaces between tokens
:Sentence Sentences detected by the sentence splitter
:Lookup Individual gazetteer lookups – for those lookups that come from the inflectional gazetteer this includes a "lemma" feature giving the base word form
:MSD "Morpho-Syntactic Description" for selected tokens, including features for "lemma" (the base form of inflected words) and "type" (roughly equivalent to a part of speech tag in English, though more complex as it encodes features such as gender, grammatical case, etc.)
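
The split between default and optional annotation types, and the use of the "lemma" feature, can be illustrated with a small sketch. The Annotation record, the select and lemmas_within helpers, and the sample offsets below are all assumptions made for illustration; they are not part of the pipeline or of any GATE API. Only the type names and the "lemma" feature come from the description above.

```python
from dataclasses import dataclass, field

# Type names taken from the two lists above.
DEFAULT_TYPES = {"Person", "Location", "Organization", "Date", "Address"}
OPTIONAL_TYPES = {"Money", "Percent", "Token", "SpaceToken",
                  "Sentence", "Lookup", "MSD"}


@dataclass
class Annotation:
    # Hypothetical in-memory record, not a GATE class.
    type: str
    start: int  # character offset, inclusive
    end: int    # character offset, exclusive
    features: dict = field(default_factory=dict)


def select(annotations, extra=()):
    """Keep the default types plus any explicitly requested optional types."""
    unknown = set(extra) - OPTIONAL_TYPES
    if unknown:
        raise ValueError(f"not an optional annotation type: {sorted(unknown)}")
    wanted = DEFAULT_TYPES | set(extra)
    return [a for a in annotations if a.type in wanted]


def lemmas_within(entity, annotations):
    """Lemmas of MSD annotations lying inside the entity span, in text order."""
    covered = [
        a for a in annotations
        if a.type == "MSD"
        and a.start >= entity.start
        and a.end <= entity.end
        and "lemma" in a.features
    ]
    return [a.features["lemma"] for a in sorted(covered, key=lambda a: a.start)]


# Hand-made sample: "Москве" is an inflected form of "Москва".
text = "в Москве"
annotations = [
    Annotation("Token", 0, 1),
    Annotation("SpaceToken", 1, 2),
    Annotation("Location", 2, 8),
    Annotation("MSD", 2, 8, {"lemma": "Москва"}),
]

location = annotations[2]
print([a.type for a in select(annotations)])
print([a.type for a in select(annotations, extra=["MSD"])])
print(text[location.start:location.end], lemmas_within(location, annotations))
```

Requesting the optional MSD type alongside the defaults is what makes the base form of an inflected name available next to the entity span itself.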



