edu.stanford.nlp.io.Lexer
Stanford Parser processes raw text in English, Chinese, German, Arabic, and French, and extracts constituency parse trees.
package edu.stanford.nlp.io;
import java.io.IOException;
import java.io.Reader;
/**
 * A lexer interface for use with {@link edu.stanford.nlp.process.LexerTokenizer}. Use the
 * {@link #yyreset} method to give a Lexer a {@link Reader} to read from. An easy way to
 * build classes implementing this interface is with JFlex (http://jflex.de). Just make sure
 * to include the following in the Options and Macros section of the JFlex source file:
 *
 * <pre>
 * %class JFlexDummyLexer
 * %standalone
 * %unicode
 * %int
 *
 * %implements edu.stanford.nlp.io.Lexer
 *
 * %{
 *   public void pushBack(int n) {
 *     yypushback(n);
 *   }
 *
 *   public int getYYEOF() {
 *     return YYEOF;
 *   }
 * %}
 * </pre>
 *
 * Alternatively, you can write your own lexer by hand, which gives you more flexibility.
 *
 * @author Roger Levy
 */
public interface Lexer {
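  /** Return code for {@link #yylex()}: the matched token should be accepted. */
  public int ACCEPT = 1;

  /** Return code for {@link #yylex()}: the matched text should be ignored. */
  public int IGNORE = 0;
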
  /**
   * Gets the next token from the input and returns an integer code signalling what to do
   * with it, such as {@link #ACCEPT} or {@link #IGNORE}.
   */
  public int yylex() throws IOException;

  /**
   * Returns the matched input text region.
   */
  public String yytext();

  /**
   * Pushes back {@code length} character positions in the lexer.
   * Conventionally used to push back exactly one token.
   */
  public void pushBack(int length);

  /**
   * Returns the value of YYEOF, the code that signals end of input.
   */
  public int getYYEOF();

  /**
   * Puts a {@link Reader} inside the Lexer, so that subsequent input is read from it.
   */
  public void yyreset(Reader r) throws IOException;
}
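
The class comment mentions that a lexer can also be written by hand instead of generated with JFlex. The following is a minimal sketch of that option, not code from the Stanford distribution. The names `WordLexerSketch`, `WordLexer`, and `tokenize` are hypothetical, the nested `Lexer` interface is a local copy of the one above so the example compiles without the Stanford jar, and the end-of-input code of -1 is an assumption chosen to match JFlex's `YYEOF`. The driver loop illustrates one plausible way a caller might consume the return codes; it is not the actual `LexerTokenizer` logic.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

/** Hand-written lexer sketch; every name except the Lexer members is hypothetical. */
class WordLexerSketch {

  /** Local copy of the interface above, so the sketch compiles on its own. */
  interface Lexer {
    int ACCEPT = 1;
    int IGNORE = 0;
    int yylex() throws IOException;
    String yytext();
    void pushBack(int length);
    int getYYEOF();
    void yyreset(Reader r) throws IOException;
  }

  /** Returns ACCEPT for runs of non-whitespace and IGNORE for runs of whitespace. */
  static class WordLexer implements Lexer {
    private static final int EOF = -1; // assumption: same value as JFlex's YYEOF
    private String input = "";         // whole input, buffered to keep pushBack simple
    private int pos = 0;               // index of the next unread character
    private String text = "";          // most recently matched region

    @Override public void yyreset(Reader r) throws IOException {
      StringBuilder sb = new StringBuilder();
      char[] buf = new char[1024];
      for (int n; (n = r.read(buf)) != -1; ) {
        sb.append(buf, 0, n);
      }
      input = sb.toString();
      pos = 0;
      text = "";
    }

    @Override public int yylex() {
      if (pos >= input.length()) {
        text = "";
        return EOF;
      }
      int start = pos;
      boolean space = Character.isWhitespace(input.charAt(pos));
      // Consume the longest run of characters of the same kind.
      while (pos < input.length() && Character.isWhitespace(input.charAt(pos)) == space) {
        pos++;
      }
      text = input.substring(start, pos);
      return space ? IGNORE : ACCEPT;
    }

    @Override public String yytext() { return text; }

    @Override public void pushBack(int length) {
      // Give back at most the characters of the current match, and shorten the match.
      int n = Math.max(0, Math.min(length, text.length()));
      pos -= n;
      text = text.substring(0, text.length() - n);
    }

    @Override public int getYYEOF() { return EOF; }
  }

  /** Driver loop: collect ACCEPT matches, skip everything else, stop at end of input. */
  static List<String> tokenize(String s) {
    try {
      Lexer lexer = new WordLexer();
      lexer.yyreset(new StringReader(s));
      List<String> tokens = new ArrayList<>();
      for (int code = lexer.yylex(); code != lexer.getYYEOF(); code = lexer.yylex()) {
        if (code == Lexer.ACCEPT) {
          tokens.add(lexer.yytext());
        }
      }
      return tokens;
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  public static void main(String[] args) {
    System.out.println(tokenize("Stanford  Parser\nlexer")); // prints [Stanford, Parser, lexer]
  }
}
```

Buffering the whole input in `yyreset` keeps `pushBack` to a simple index adjustment; a lexer meant for large inputs would more likely read incrementally, as JFlex-generated scanners do.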