edu.stanford.nlp.ling.Document

Stanford CoreNLP provides a set of natural language analysis tools that take raw English text as input and give the base forms of words and their parts of speech; identify whether they are names of companies, people, etc.; normalize dates, times, and numeric quantities; mark up the structure of sentences in terms of phrases and word dependencies; and indicate which noun phrases refer to the same entities. It provides the foundational building blocks for higher-level text understanding applications.

There is a newer version: 4.5.7

package edu.stanford.nlp.ling;

import java.util.List;

/**
 * Represents a text document as a list of Words with a String title.
 *
 * @author Sepandar Kamvar
 * @author Joseph Smarr
 * @author Sarah Spikes (Templatization - added another parameter)
 *
 * @param <L> The type of the labels in the Datum
 * @param <F> The type of the features in the Datum
 * @param <T> The type stored in the List
 */
public interface Document<L, F, T> extends Datum<L, F>, List<T> {

  /**
   * Returns title of document, or "" if the document has no title.
   * Implementations should never return null.
   *
   * @return The document's title
   */
  public abstract String title();

  /**
   * Returns a new empty Document with the same meta-data (title, labels, etc)
   * as this Document. Subclasses that store extra state should provide custom
   * implementations of this method. This method is primarily used by the
   * processing API, so the input document can be preserved and the output
   * document can maintain the meta-data of the input document.
   *
   * @param <OUT> The type stored in the new, empty Document
   * @return An empty document of the right sort.
   */
  public <OUT> Document<L, F, OUT> blankDocument();

}
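
The contract described in the Javadoc above (a non-null title, and a blankDocument() that copies meta-data but no elements) can be sketched as follows. This is only a minimal illustration under stated assumptions: TitledDocument and SketchDocument are hypothetical names, not CoreNLP classes, the Datum side of the interface (labels and features) is omitted, and the generic signature of blankDocument() is assumed rather than taken from the listing.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the title()/blankDocument() part of the contract.
// The real Document interface also extends Datum, which is left out here.
interface TitledDocument<T> extends List<T> {
  String title();

  // Assumed signature: the blank copy may hold a different element type.
  <OUT> TitledDocument<OUT> blankDocument();
}

// Illustrative implementation backed by ArrayList (not a CoreNLP class).
class SketchDocument<T> extends ArrayList<T> implements TitledDocument<T> {
  private final String title;

  SketchDocument(String title) {
    // Normalize null to "" so that title() never returns null.
    this.title = (title == null) ? "" : title;
  }

  @Override
  public String title() {
    return title;
  }

  @Override
  public <OUT> SketchDocument<OUT> blankDocument() {
    // Copy the meta-data (here only the title) but none of the elements.
    return new SketchDocument<>(title);
  }
}
```

A processing step could call blankDocument() on its input and fill the result with transformed elements, which leaves the input document unchanged while the output keeps the same title.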