edu.berkeley.nlp.tokenizer.ChineseRetokenizer
The Berkeley parser analyzes the grammatical structure of natural language using probabilistic context-free grammars (PCFGs).
package edu.berkeley.nlp.tokenizer;

import java.util.LinkedList;
import java.util.List;

public class ChineseRetokenizer implements LineTokenizer {
    public List tokenizeLine(String line) {
        String replaced = replaceChars(line);
        String[] tokens = replaced.split(" ");
        boolean rightDoubleQuote = false;
        boolean rightSingleQuote = false;
        LinkedList newTokens = new LinkedList();
        for (int i = 0; i
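
The listing breaks off at the start of the loop, so the original loop body is not shown here. As a rough illustration of what a flag such as `rightDoubleQuote` could be used for, the sketch below splits a line on spaces and alternates between an opening and a closing form each time it meets a bare double-quote token. Everything in it is an assumption made for illustration: the class name `QuoteToggleSketch`, the method `retokenize`, and the choice of Penn Treebank style `` and '' as replacement tokens are hypothetical and are not taken from the Berkeley parser source.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch, NOT the Berkeley parser implementation.
// It only illustrates toggling a "next quote is a right quote" flag.
class QuoteToggleSketch {

    static List<String> retokenize(String line) {
        String[] tokens = line.split(" ");
        // false: the next " opens a quotation; true: the next " closes one.
        boolean rightDoubleQuote = false;
        List<String> newTokens = new ArrayList<>();
        for (int i = 0; i < tokens.length; i++) {
            String token = tokens[i];
            if (token.isEmpty()) {
                continue; // consecutive spaces produce empty strings; skip them
            }
            if (token.equals("\"")) {
                // Assumed output convention: `` for opening, '' for closing.
                newTokens.add(rightDoubleQuote ? "''" : "``");
                rightDoubleQuote = !rightDoubleQuote;
            } else {
                newTokens.add(token);
            }
        }
        return newTokens;
    }

    public static void main(String[] args) {
        System.out.println(retokenize("he said \" ok \""));
    }
}
```

The flag is local state inside a single call, so each line starts by expecting an opening quote. A quotation that spans several lines would need the flag carried across calls, which this sketch does not attempt.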