
edu.berkeley.nlp.tokenizer.ChineseRetokenizer


The Berkeley parser analyzes the grammatical structure of natural language using probabilistic context-free grammars (PCFGs).

package edu.berkeley.nlp.tokenizer;

import java.util.LinkedList;
import java.util.List;

public class ChineseRetokenizer implements LineTokenizer {
	public List<String> tokenizeLine(String line) {
		String replaced = replaceChars(line);
		String[] tokens = replaced.split(" ");
		boolean rightDoubleQuote = false;
		boolean rightSingleQuote = false;
		LinkedList<String> newTokens = new LinkedList<String>();
		for (int i = 0; i



