edu.stanford.nlp.ie.machinereading.Extractor Maven / Gradle / Ivy
Stanford CoreNLP provides a set of natural language analysis tools that take raw English text as input. The tools give the base forms of words, their parts of speech, and whether they are names of companies, people, etc.; normalize dates, times, and numeric quantities; mark up the structure of sentences in terms of phrases and word dependencies; and indicate which noun phrases refer to the same entities. It provides the foundational building blocks for higher-level text understanding applications.
package edu.stanford.nlp.ie.machinereading;
import java.io.IOException;
import java.io.Serializable;
import java.util.logging.Level;
import edu.stanford.nlp.pipeline.Annotation;
// TODO make this an abstract class instead so setLoggerLevel doesn't have to be implemented by all subclasses.
// also, should add the load() method here -- though it can't be static, so maybe we need a different approach.
// Extractors should have a logger as an instance attribute.
public interface Extractor extends Serializable {
/**
* Trains one extractor model using the given dataset.
*
* @param dataset
* dataset to train from (this should already have annotations and
* will typically be created by a reader)
*/
public void train(Annotation dataset);
/**
* Annotates the given dataset with the current model. This works in place,
* i.e., it adds ExtractionObject objects to the sentences in the dataset. To
* make sure you are not messing with gold annotations, create a copy of the
* ExtractionDataSet first!
*
* @param dataset
* dataset to annotate
*/
public void annotate(Annotation dataset);
/**
* Serializes this extractor to a file.
*
* @param path
* where to save the extractor
* @throws IOException
* if the extractor cannot be written to the given path
*/
public void save(String path) throws IOException;
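/**
* Sets the logging level used by this extractor.
*
* @param level
* the java.util.logging level to apply
*/
public void setLoggerLevel(Level level);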
}
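
The TODO at the top of the interface suggests an abstract base class that owns the logger and, possibly, a load() counterpart to save(). A minimal sketch of that idea follows, under stated assumptions: the names AbstractExtractorSketch, ThresholdExtractorSketch, logger() and load() are hypothetical and not part of CoreNLP, plain Java serialization is assumed as the storage format (the interface only requires Serializable), and the CoreNLP types are left out so the block compiles on its own. In a real project the base class would declare `implements Extractor` and leave train and annotate abstract.

```java
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.logging.Level;
import java.util.logging.Logger;

// Hypothetical base class, not part of CoreNLP. It implements the two
// methods every extractor would otherwise repeat: setLoggerLevel and save.
abstract class AbstractExtractorSketch implements Serializable {
  private static final long serialVersionUID = 1L;

  // java.util.logging.Logger is not Serializable, so the field is transient
  // and is looked up again on first use after deserialization.
  private transient Logger logger;

  protected Logger logger() {
    if (logger == null) {
      logger = Logger.getLogger(getClass().getName());
    }
    return logger;
  }

  public void setLoggerLevel(Level level) {
    logger().setLevel(level);
  }

  // Writes this object to the given path using Java serialization.
  public void save(String path) throws IOException {
    try (ObjectOutputStream out = new ObjectOutputStream(new FileOutputStream(path))) {
      out.writeObject(this);
    }
  }

  // Assumed counterpart to save(). It is static on the base class because
  // no instance exists yet when a model is being loaded.
  public static <T extends AbstractExtractorSketch> T load(String path, Class<T> type)
      throws IOException, ClassNotFoundException {
    try (ObjectInputStream in = new ObjectInputStream(new FileInputStream(path))) {
      return type.cast(in.readObject());
    }
  }
}

// Toy subclass used only to show the round trip; the single field stands in
// for whatever model state a real extractor would hold.
class ThresholdExtractorSketch extends AbstractExtractorSketch {
  private static final long serialVersionUID = 1L;
  final double threshold;

  ThresholdExtractorSketch(double threshold) {
    this.threshold = threshold;
  }
}
```

A caller would save with `extractor.save(path)` and restore with `AbstractExtractorSketch.load(path, ThresholdExtractorSketch.class)`. Passing the expected class to load() keeps the cast checked at run time instead of relying on an unchecked generic cast.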