package edu.stanford.nlp.io;

import java.io.IOException;
import java.io.Reader;


/**
 * A Lexer interface to be used with {@link edu.stanford.nlp.process.LexerTokenizer}.
 * You can put a {@link Reader} inside a Lexer with the {@link #yyreset} method.
 * An easy way to build classes implementing this interface is with JFlex
 * (http://jflex.de).  Just make sure to include the following in the
 * JFlex source file.
 * <p>
 * In the Options and Macros section of the source file, include:
 *
 * <pre>
 * %class JFlexDummyLexer
 * %standalone
 * %unicode
 * %int
 *
 * %implements edu.stanford.nlp.io.Lexer
 *
 * %{
 *   public void pushBack(int n) {
 *     yypushback(n);
 *   }
 *
 *   public int getYYEOF() {
 *     return YYEOF;
 *   }
 * %}
 * </pre>
 *
 * Alternatively, you can write your own lexer by hand, which gives you
 * more flexibility.
 *
 * @author Roger Levy
 */

public interface Lexer {

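  /** Value returned by {@link #yylex()} to signal that the matched token should be accepted. */
  public int ACCEPT = 1;
  /** Value returned by {@link #yylex()} to signal that the matched token should be ignored. */
  public int IGNORE = 0;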

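  /**
   * Gets the next token from the input and returns an integer value
   * signalling what to do with the token.
   *
   * @return a code such as {@link #ACCEPT} or {@link #IGNORE}
   * @throws IOException if reading from the underlying {@link Reader} fails
   */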
  public int yylex() throws IOException;

  /**
   * Returns the matched input text region.
   *
   * @return the text matched by the most recent call to {@link #yylex()}
   */
  public String yytext();

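  /**
   * Pushes back {@code length} character positions in the lexer, so that
   * those characters are read again by the next call to {@link #yylex()}.
   * Conventionally used to push back exactly one token.
   *
   * @param length the number of characters to push back
   */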
  public void pushBack(int length);

  /**
   * Returns the value of YYEOF, the end-of-input code.
   *
   * @return the value of YYEOF for this lexer
   */
  public int getYYEOF();

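  /**
   * Puts a {@link Reader} inside the Lexer.
   *
   * @param r the {@link Reader} to read subsequent input from
   * @throws IOException if the lexer cannot be reset on the new input
   */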
  public void yyreset(Reader r) throws IOException;

}
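
The class comment mentions writing a lexer by hand as an alternative to JFlex. The following is a minimal sketch of that idea, not code from the Stanford distribution. It assumes a hypothetical `WhitespaceLexer` that ignores whitespace runs and accepts everything else, a hypothetical driver method `tokenize`, and an end-of-input value of -1 (the value JFlex uses for `YYEOF`). The nested `Lexer` interface is a local copy of the interface above so that the example compiles on its own.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

public class WhitespaceLexerDemo {

  /** Local copy of edu.stanford.nlp.io.Lexer, so the sketch is self-contained. */
  interface Lexer {
    int ACCEPT = 1;
    int IGNORE = 0;
    int yylex() throws IOException;
    String yytext();
    void pushBack(int length);
    int getYYEOF();
    void yyreset(Reader r) throws IOException;
  }

  /** Hypothetical hand-written lexer: whitespace runs are ignored, other runs accepted. */
  static class WhitespaceLexer implements Lexer {
    private static final int YYEOF = -1; // assumption: same value JFlex uses
    private String input = "";
    private int pos = 0;
    private String text = "";

    public int yylex() {
      if (pos >= input.length()) {
        text = "";
        return YYEOF;
      }
      int start = pos;
      boolean space = Character.isWhitespace(input.charAt(pos));
      // Consume a maximal run of characters of the same kind.
      while (pos < input.length() && Character.isWhitespace(input.charAt(pos)) == space) {
        pos++;
      }
      text = input.substring(start, pos);
      return space ? IGNORE : ACCEPT;
    }

    public String yytext() {
      return text;
    }

    public void pushBack(int length) {
      if (length < 0 || length > text.length()) {
        throw new IllegalArgumentException("cannot push back " + length + " characters");
      }
      // Rewind the read position and shorten the matched text accordingly.
      pos -= length;
      text = text.substring(0, text.length() - length);
    }

    public int getYYEOF() {
      return YYEOF;
    }

    public void yyreset(Reader r) throws IOException {
      // Read the whole input up front; adequate for a sketch, not for large streams.
      StringBuilder sb = new StringBuilder();
      char[] buf = new char[1024];
      int n;
      while ((n = r.read(buf)) != -1) {
        sb.append(buf, 0, n);
      }
      input = sb.toString();
      pos = 0;
      text = "";
    }
  }

  /** Hypothetical driver: collects the text of every ACCEPTed token. */
  static List<String> tokenize(String s) {
    Lexer lexer = new WhitespaceLexer();
    List<String> out = new ArrayList<>();
    try {
      lexer.yyreset(new StringReader(s));
      int code;
      while ((code = lexer.yylex()) != lexer.getYYEOF()) {
        if (code == Lexer.ACCEPT) {
          out.add(lexer.yytext());
        }
      }
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
    return out;
  }

  public static void main(String[] args) {
    System.out.println(tokenize("The  quick fox")); // prints [The, quick, fox]
  }
}
```

The driver loop skips `IGNORE` results and stops when `yylex()` returns the value of `getYYEOF()`, which is one plausible way for a tokenizer to consume this interface.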