edu.stanford.nlp.process.WordSegmenter Maven / Gradle / Ivy
Stanford Parser processes raw text in English, Chinese, German, Arabic, and French, and extracts constituency parse trees.
package edu.stanford.nlp.process;
import java.io.Serializable;
import java.util.List;
import java.util.Collection;
import edu.stanford.nlp.ling.HasWord;
import edu.stanford.nlp.ling.TaggedWord;
import edu.stanford.nlp.trees.Tree;
/** An interface for segmenting strings into words
* (in unwordsegmented languages).
*
* @author Galen Andrew
*/
public interface WordSegmenter extends Serializable {
  void initializeTraining(double numTrees);
  void train(Collection<Tree> trees);
  void train(Tree trees);
  void train(List<TaggedWord> sentence);
  void finishTraining();
  void loadSegmenter(String filename);
  List<HasWord> segment(String s);
}
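
The interface above only declares the contract, so a small illustration of what a `segment(String)` implementation might do can help. The sketch below is not Stanford code and does not implement `WordSegmenter` itself: `SimpleSegmenter`, `MaxMatchSegmenter`, and `SegmenterDemo` are hypothetical names, the result is a plain `List<String>` rather than a list of `HasWord`, and greedy forward maximum matching against a fixed lexicon is an assumed stand-in technique chosen for brevity. The training methods are omitted entirely.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical stand-in for the segment(String) part of the contract; not the Stanford type. */
interface SimpleSegmenter {
    List<String> segment(String s);
}

/** Greedy forward maximum matching against a fixed lexicon (an assumed technique). */
class MaxMatchSegmenter implements SimpleSegmenter {
    private final Set<String> lexicon;
    private final int maxLen; // length of the longest lexicon entry, at least 1

    MaxMatchSegmenter(Set<String> lexicon) {
        this.lexicon = new HashSet<>(lexicon);
        int longest = 1;
        for (String w : lexicon) {
            longest = Math.max(longest, w.length());
        }
        this.maxLen = longest;
    }

    @Override
    public List<String> segment(String s) {
        List<String> words = new ArrayList<>();
        int i = 0;
        while (i < s.length()) {
            // Start with the longest candidate that fits in the remaining input.
            int end = Math.min(s.length(), i + maxLen);
            // Shrink it until it is a lexicon entry or a single character.
            while (end > i + 1 && !lexicon.contains(s.substring(i, end))) {
                end--;
            }
            words.add(s.substring(i, end));
            i = end;
        }
        return words;
    }
}

class SegmenterDemo {
    public static void main(String[] args) {
        SimpleSegmenter seg =
            new MaxMatchSegmenter(new HashSet<>(Arrays.asList("the", "table", "tab")));
        System.out.println(seg.segment("thetable")); // prints [the, table]
    }
}
```

Characters that match no lexicon entry fall through as single-character words, so the sketch always consumes the whole input. A real implementation of `WordSegmenter` would additionally wrap each word in a `HasWord` and either learn its model through the `train` methods or restore it with `loadSegmenter`.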