org.opensextant.annotations.package-info Maven / Gradle / Ivy
Show all versions of opensextant-xponents-core Show documentation
An information extraction toolkit focused on geography and temporal entities
/**
*
* IIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIII
*
* OpenSextant/Xponents sub-project
*      __
*  ___/ /___  ___  ___  ___  __ __ ___
* / _  // -_)/ -_)/ _ \/ -_)/ // // -_)
* \_,_/ \__/ \__// .__/\__/ \_, / \__/
*               /_/        /___/
*
* IIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIII
* Copyright 2013, 2019 MITRE Corporation
*
*
* DeepEye is an approach for simplifying typical NLP annotation exchanges. It provides a practical
* data model for annotations -- any span of text tagged with some metadata in the context of a
* document. The resulting annotation can be serialized as JSON, stored in a database, and later
* deserialized or retrieved from that database. Each of these transformations from Java object
* state to representational state incurs some loss and some interpretation.
*
* DeepEye offers some best practices and some conveniences that support rapid prototyping where NLP
* is invoked natively or RESTfully and the outputs are persisted in databases.
*
* The key concepts are the Record and the Annotation. A Record object represents the original data
* and any associated metadata. A Record must have an identifier and usually relates to a single
* source. An Annotation is any key-value pair derived from a Record by some processing routine.
*
* Data structure:
*
* - Records have an id, text, and attributes. Other fields are optional, such as a processing
* state indicating that a processor has run and contributed annotations.
* - Annotations link to a Record by rec_id and have a name, value, offset(s), and attributes. As
* processing may yield spurious, repetitive annotations, the AnnotationHelper can be used to cache
* the same name/value annotation as it appears over many span offsets. We call this convenience
* annotation compression.
*
*
* Additional machinery here helps in pipelines:
*
*
* - AnnotationHelper is a utility class that can be used to formulate common OpenSextant
* annotations from Java classes. This utility class also helps with annotation compression and
* distilling large results in memory.
* - DeepEyeStore is a NoSQL-style API for finding and updating Records, saving and updating
* Annotations, and recording Record state. MongoDB, PostgreSQL and SQLite implementations have
* been attempted, of which MongoDB has been the most successful. DeepEyeStore itself is only an
* interface specification, without an implementation.
*
*
*/
package org.opensextant.annotations;
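
The annotation compression idea described in the package comment can be sketched roughly as follows. This is only an illustration under stated assumptions: the class names (AnnotationCompressionSketch, RawAnnotation, CompressedAnnotation) and the compress method are hypothetical and are not the actual org.opensextant.annotations API. The sketch groups repeated name/value annotations for a record into one entry that carries every offset where the pair occurred.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch of annotation compression; not the real DeepEye classes. */
public class AnnotationCompressionSketch {

    /** One raw annotation: a name/value pair found at a single offset in a record. */
    static final class RawAnnotation {
        final String recId;
        final String name;
        final String value;
        final int offset;

        RawAnnotation(String recId, String name, String value, int offset) {
            this.recId = recId;
            this.name = name;
            this.value = value;
            this.offset = offset;
        }
    }

    /** One compressed annotation: a name/value pair plus all offsets where it occurred. */
    static final class CompressedAnnotation {
        final String recId;
        final String name;
        final String value;
        final List<Integer> offsets = new ArrayList<>();

        CompressedAnnotation(String recId, String name, String value) {
            this.recId = recId;
            this.name = name;
            this.value = value;
        }
    }

    /** Groups raw annotations by (recId, name, value), keeping first-seen order. */
    static List<CompressedAnnotation> compress(List<RawAnnotation> raw) {
        Map<List<String>, CompressedAnnotation> cache = new LinkedHashMap<>();
        for (RawAnnotation a : raw) {
            List<String> key = Arrays.asList(a.recId, a.name, a.value);
            CompressedAnnotation c = cache.get(key);
            if (c == null) {
                c = new CompressedAnnotation(a.recId, a.name, a.value);
                cache.put(key, c);
            }
            c.offsets.add(a.offset);
        }
        return new ArrayList<>(cache.values());
    }

    public static void main(String[] args) {
        List<RawAnnotation> raw = Arrays.asList(
                new RawAnnotation("doc-1", "place", "Boston", 12),
                new RawAnnotation("doc-1", "place", "Boston", 87),
                new RawAnnotation("doc-1", "date", "2019-05-01", 40));
        for (CompressedAnnotation c : compress(raw)) {
            // Prints "place=Boston @ [12, 87]" then "date=2019-05-01 @ [40]"
            System.out.println(c.name + "=" + c.value + " @ " + c.offsets);
        }
    }
}
```

Keying on the record id as well as the name and value keeps annotations from different records apart, which matches the description of Annotations linking to a Record by rec_id. How the real AnnotationHelper caches entries may differ.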