JavaCC API Documentation
JavaCC [tm]: API Routines
This web page is a comprehensive list of all classes, methods,
and variables available for use by a JavaCC [tm] user. These classes,
methods, and variables are typically used from the actions that
are embedded in a JavaCC grammar. In the sample code used below,
it is assumed that the name of the generated parser is "TheParser".
Non-Terminals in the Input Grammar
For each non-terminal NT in the input grammar file, the following method
is generated into the parser class:
-
returntype NT(parameters) throws ParseException;
Here, returntype and parameters are what were specified
in the JavaCC input file in the definition of NT (where NT occurred on the
left-hand side).
When this method is called, the input stream is parsed to match this non-terminal.
On a successful parse, this method returns normally. On detection of a parse
error, an error message is displayed and the method returns by throwing an exception
of the type ParseException.
Note that all non-terminals in a JavaCC input grammar have equal status;
it is possible to parse to any non-terminal by calling the non-terminal's method.
API for Parser Actions
- Token token;
This variable holds the last token consumed by the parser and can be used
in parser actions. This is exactly the same as the token returned by
getToken(0).
In addition, the two methods getToken(int i) and getNextToken() can be used in
actions to traverse the token list.
The Token Manager Interface
Typically, the token manager interface should not be used directly; all access
should go through the parser interface. However, in certain situations,
such as when you are building only a token manager and no parser,
the token manager interface is useful.
The token manager provides the following routine:
-
Token getNextToken() throws ParseError;
Each call to this method returns the next token in the input stream. This
method throws a ParseError exception when there is a lexical error, i.e.,
it could not find a match for any of the specified tokens from the input
stream. The type Token is described later.
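The typical way to drive a token manager on its own is to call getNextToken() repeatedly until the <EOF> token (kind 0) comes back. The following is a minimal sketch of that loop, not generated code: Token and WordTokenManager are hypothetical stand-ins that split the input on white space, WORD is an invented token kind, and error handling is not modeled.

```java
import java.util.ArrayList;
import java.util.List;

class TokenLoopSketch {
    /** Stand-in for the generated Token class; only the fields used here. */
    static class Token {
        final int kind;
        final String image;
        Token(int kind, String image) { this.kind = kind; this.image = image; }
    }

    static final int EOF = 0;   // kind 0 represents <EOF>
    static final int WORD = 1;  // hypothetical token kind for this sketch

    /** Hypothetical stand-in for a token manager: splits input on white space. */
    static class WordTokenManager {
        private final String[] words;
        private int pos = 0;

        WordTokenManager(String input) {
            String trimmed = input.trim();
            words = trimmed.isEmpty() ? new String[0] : trimmed.split("\\s+");
        }

        /** Returns the next token, or an <EOF> token once the input is exhausted. */
        Token getNextToken() {
            if (pos < words.length) {
                return new Token(WORD, words[pos++]);
            }
            return new Token(EOF, "");
        }
    }

    /** Drains the token manager, collecting token images until <EOF>. */
    static List<String> images(WordTokenManager tm) {
        List<String> out = new ArrayList<>();
        for (Token t = tm.getNextToken(); t.kind != EOF; t = tm.getNextToken()) {
            out.add(t.image);
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(images(new WordTokenManager("a + b")));  // prints [a, +, b]
    }
}
```

With a real generated token manager, the same loop shape applies; only the construction of the token manager and the token kinds differ.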
Constructors and Other Initialization Routines
-
TheParser.TheParser(java.io.InputStream stream)
This creates a new parser object, which in turn creates a new token manager object
that reads its tokens from "stream". This constructor is available only
when both the options USER_TOKEN_MANAGER and USER_CHAR_STREAM are false.
If the option STATIC is true, this constructor (along with other constructors)
can be called exactly once to create a single parser object.
-
TheParser.TheParser(CharStream stream)
Similar to the previous constructor, except that this one is available only
when the option USER_TOKEN_MANAGER is false and USER_CHAR_STREAM is true.
-
void TheParser.ReInit(java.io.InputStream stream)
This reinitializes an existing parser object, along with the existing token
manager object that corresponds to it. The result is a parser object with
exactly the same functionality as one created with the constructor above;
the only difference is that no new objects are created. This method is
available only when both the options USER_TOKEN_MANAGER and USER_CHAR_STREAM
are false. If the option STATIC is true, this (along with the other ReInit
methods) is the only way to restart a parse operation, because there is only
one parser and all one can do is reinitialize it.
-
void TheParser.ReInit(CharStream stream)
Similar to the previous method, except that this one is available only
when the option USER_TOKEN_MANAGER is false and USER_CHAR_STREAM is true.
-
TheParser(TheParserTokenManager tm)
This creates a new parser object which uses an already created token manager object "tm" as
its token manager. This constructor is only available if option USER_TOKEN_MANAGER is
false. If the option STATIC is true, this constructor (along with other constructors)
can be called exactly once to create a single parser object.
-
TheParser(TokenManager tm)
Similar to the previous constructor, except that this one is available only
when the option USER_TOKEN_MANAGER is true.
-
void TheParser.ReInit(TheParserTokenManager tm)
This reinitializes an existing parser object with the token manager object "tm" as its
new token manager. This method is only available if option USER_TOKEN_MANAGER is
false. If the option STATIC is true, this (along with the other ReInit methods)
is the only way to restart a parse operation, because there is only one parser
and all one can do is reinitialize it.
-
void TheParser.ReInit(TokenManager tm)
Similar to the previous method, except that this one is available only
when the option USER_TOKEN_MANAGER is true.
-
TheParserTokenManager.TheParserTokenManager(CharStream stream)
Creates a new token manager object initialized to read input from "stream". When
the option STATIC is true, this constructor may be called only once.
This is available only when USER_TOKEN_MANAGER is false and USER_CHAR_STREAM
is true. When USER_TOKEN_MANAGER is false and USER_CHAR_STREAM is false (the default situation),
a constructor similar to the one above is available with the type CharStream
replaced as follows:
-
When JAVA_UNICODE_ESCAPE is false and UNICODE_INPUT is false, CharStream is
replaced by ASCII_CharStream.
-
When JAVA_UNICODE_ESCAPE is false and UNICODE_INPUT is true, CharStream is
replaced by UCode_CharStream.
-
When JAVA_UNICODE_ESCAPE is true and UNICODE_INPUT is false, CharStream is
replaced by ASCII_UCodeESC_CharStream.
-
When JAVA_UNICODE_ESCAPE is true and UNICODE_INPUT is true, CharStream is
replaced by UCode_UCodeESC_CharStream.
-
void TheParserTokenManager.ReInit(CharStream stream)
Reinitializes the current token manager object to read input from "stream". When
the option STATIC is true, this is the only way to restart a token manager operation.
This is available only when USER_TOKEN_MANAGER is false and USER_CHAR_STREAM
is true. When USER_TOKEN_MANAGER is false and USER_CHAR_STREAM is false (the default situation),
a ReInit method similar to the one above is available with the type CharStream
replaced as follows:
-
When JAVA_UNICODE_ESCAPE is false and UNICODE_INPUT is false, CharStream is
replaced by ASCII_CharStream.
-
When JAVA_UNICODE_ESCAPE is false and UNICODE_INPUT is true, CharStream is
replaced by UCode_CharStream.
-
When JAVA_UNICODE_ESCAPE is true and UNICODE_INPUT is false, CharStream is
replaced by ASCII_UCodeESC_CharStream.
-
When JAVA_UNICODE_ESCAPE is true and UNICODE_INPUT is true, CharStream is
replaced by UCode_UCodeESC_CharStream.
The Token Class
The Token class is the type of token objects that are created by the token manager
after a successful scanning of the token stream. These token objects are then
passed to the parser and are accessible to the actions in a JavaCC grammar usually
by grabbing the return value of a token. The methods getToken and getNextToken
described below also give access to objects of this type.
Each Token object has the following fields and methods:
-
int kind;
This is the index for this kind of token in the internal representation scheme
of JavaCC. When tokens in the JavaCC input file are given labels, these labels
are used to generate "int" constants that can be used in actions.
The value 0 is always used to represent the predefined token <EOF>. A
constant "EOF" is generated for convenience in the ...Constants file.
-
int beginLine, beginColumn, endLine, endColumn;
These indicate the beginning and ending positions of the token as it appeared
in the input stream.
-
String image;
This represents the image of the token as it appeared in the input stream.
-
Token next;
A reference to the next regular (non-special) token from the input
stream. If this is the last token from the input stream, or if the
token manager has not read tokens beyond this one, this field is
set to null.
The description in the above paragraph holds only if this token is itself a
regular token. Otherwise, see the description of the specialToken field below
for the contents of this field.
Note: There are two kinds of tokens, regular and special. Regular tokens are
the normal tokens that are fed to the parser. Special tokens are other useful
tokens (such as comments) that, unlike white space, are not discarded. For more
information on the different kinds of tokens,
please see the minitutorial on the token manager.
-
Token specialToken;
This field is used to access special tokens that occur prior to this
token, but after the immediately preceding regular (non-special) token.
If there are no such special tokens, this field is set to null.
When there is more than one such special token, this field refers
to the last of these special tokens, which in turn refers to the
previous special token through its specialToken field, and so on
back to the first special token (whose specialToken field is null).
The next field of a special token refers to the special token that
immediately follows it (without an intervening regular token). If there
is no such token, this field is null.
-
public Object getValue();
An optional attribute value of the Token.
Tokens which are not used as syntactic sugar will often contain
meaningful values that will be used later on by the compiler or
interpreter. This attribute value is often different from the image.
Any subclass of Token that actually wants to return a non-null value can
override this method as appropriate.
-
static final Token newToken(int ofKind);
static final Token newToken(int ofKind, String image);
Returns a new token object as its default behavior. If you wish to
perform special actions when a token is constructed or create subclasses
of class Token and instantiate them instead, you can redefine this method
appropriately. The only constraint is that this method returns a new
object of type Token (or a subclass of Token).
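The specialToken and next links described above can be combined to recover, in input order, the special tokens (for example comments) that precede a regular token. The following is a minimal sketch under stated assumptions: Token here is a stand-in that carries only the image, next, and specialToken fields, and the tokens are linked by hand the way the text says the token manager links them.

```java
import java.util.ArrayList;
import java.util.List;

class SpecialTokenSketch {
    /** Stand-in for the generated Token class; only the fields used here. */
    static class Token {
        final String image;
        Token next;          // for a special token: the special token that follows it
        Token specialToken;  // the special token immediately before this one, or null
        Token(String image) { this.image = image; }
    }

    /** Returns the images of the special tokens that precede t, in input order. */
    static List<String> leadingSpecials(Token t) {
        List<String> images = new ArrayList<>();
        Token s = t.specialToken;
        if (s == null) {
            return images;  // no special tokens before t
        }
        // Walk backwards to the first special token in the run.
        while (s.specialToken != null) {
            s = s.specialToken;
        }
        // Walk forwards through the next links, which end in null.
        for (; s != null; s = s.next) {
            images.add(s.image);
        }
        return images;
    }

    public static void main(String[] args) {
        // Hand-built equivalent of: /* one */ /* two */ x
        Token first = new Token("/* one */");
        Token second = new Token("/* two */");
        Token x = new Token("x");
        first.next = second;          // special tokens chain forward via next
        second.specialToken = first;  // and backward via specialToken
        x.specialToken = second;      // the regular token points at the last special token
        System.out.println(leadingSpecials(x));
    }
}
```

Walking back first and then forward avoids reversing a list, at the cost of traversing the run twice.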
Reading Tokens from the Input Stream
There are two methods available for this purpose:
-
Token TheParser.getNextToken() throws ParseError
This method returns the next available token in the input stream and moves
the token pointer one step in the input stream (i.e., this changes the state
of the input stream). If there are no more tokens available in the input
stream, the exception ParseError is thrown. Care must be taken when calling
this method since it can interfere with the parser's knowledge of the state
of the input stream, current token, etc.
-
Token TheParser.getToken(int index) throws ParseError
This method returns the token that is index tokens ahead of the current token
in the token stream. If index is 0, it returns the current token (the last token
returned by getNextToken or consumed by the parser); if index is 1, it returns
the next token (the next token that will be returned by getNextToken or consumed
by the parser), and so on. The index parameter cannot be negative. This method
does not change the input stream pointer (i.e., it does not change the
state of the input stream). If an attempt is made to access a token beyond the
last available token, the exception ParseError is thrown.
If this method is called from a semantic lookahead specification, which in turn
is called during a lookahead determination process, the current token is temporarily
adjusted to be the token currently being inspected by the lookahead process.
For more details,
please see the minitutorial on using lookahead.
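The difference between the two methods is that getNextToken() moves the token pointer while getToken(index) only peeks. The following is a minimal sketch of those semantics, not the generated parser: MiniParser and Token are hypothetical stand-ins over a pre-built token list, and IllegalStateException stands in for the error raised when no further token is available.

```java
class LookaheadSketch {
    /** Stand-in for the generated Token class; only the fields used here. */
    static class Token {
        final String image;
        Token next;
        Token(String image) { this.image = image; }
    }

    /** Hypothetical stand-in parser that holds a pre-built list of tokens. */
    static class MiniParser {
        Token token;  // the current token (the last one consumed)

        MiniParser(String... images) {
            // A dummy first token plays the role of "nothing consumed yet".
            token = new Token("");
            Token tail = token;
            for (String s : images) {
                tail.next = new Token(s);
                tail = tail.next;
            }
        }

        /** Consumes and returns the next token; this moves the token pointer. */
        Token getNextToken() {
            if (token.next == null) {
                throw new IllegalStateException("no more tokens");
            }
            token = token.next;
            return token;
        }

        /** Returns the token index steps ahead without moving the token pointer. */
        Token getToken(int index) {
            if (index < 0) {
                throw new IllegalArgumentException("index cannot be negative");
            }
            Token t = token;
            for (int i = 0; i < index; i++) {
                if (t.next == null) {
                    throw new IllegalStateException("no more tokens");
                }
                t = t.next;
            }
            return t;
        }
    }

    public static void main(String[] args) {
        MiniParser p = new MiniParser("x", "=", "1");
        System.out.println(p.getToken(1).image);      // peek: pointer does not move
        System.out.println(p.getNextToken().image);   // consume: pointer moves to "x"
        System.out.println(p.getToken(0).image);      // current token is now "x"
    }
}
```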
Working with Debugger Tracing
When you generate parsers with the options DEBUG_PARSER or DEBUG_LOOKAHEAD, these
parsers produce a trace of their activity which is printed to the user console.
You can insert calls to the following methods to control this tracing activity:
-
void TheParser.enable_tracing()
-
void TheParser.disable_tracing()
For convenience, these methods are available even when you build parsers without
the debug options. In this case, these methods are no-ops. Hence you can
permanently leave these methods in your code and they automatically kick in when
you use the debug options.
Customizing Error Messages
The facilities described in this section help you customize the error messages
generated by the parser and lexer. In the case of the
parser, these facilities are only available if the option ERROR_REPORTING is
true, while in the case of the lexer, these facilities are always available.
The parser contains the following method definition:
-
protected void token_error() { ... }
To customize error reporting by the parser, the parser class must be subclassed
and this method redefined in the subclass. To help with creating your
error reporting scheme, the following variables are available:
-
protected int error_line, error_column;
The line and column where the error was detected.
-
protected String error_string;
The image of the offending token or set of tokens. When a lookahead of more
than 1 is used, more than one token may be present here.
-
protected String[] expected_tokens;
An array of images of legitimate token sequences. Here again, each legitimate
token sequence may be more than just one token when a lookahead of more than
1 is used.
The lexer contains the following method definition:
-
protected void LexicalError() { ... }
To customize error reporting by the lexer, the lexer class must be subclassed
and this method redefined in the subclass. To help with creating your
error reporting scheme, the following variables are available:
-
protected int error_line, error_column;
The line and column where the error was detected.
-
protected String error_after;
The partial string that has been read since the last successful token
match was performed.
-
protected char curChar;
The offending character.
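The subclass-and-redefine pattern described in this section can be sketched as follows. This is an illustration only: BaseParser is a hypothetical stand-in that exposes just the parser members named above, the fail helper and the lastMessage field are invented so the example can run without a generated parser, and the message format is arbitrary.

```java
class ErrorMessageSketch {
    /** Hypothetical stand-in exposing only the parser members described above. */
    static class BaseParser {
        protected int error_line, error_column;
        protected String error_string;
        protected String[] expected_tokens;

        // Not part of the described API: captures the message for this demo.
        protected String lastMessage;

        /** Default reporting hook; subclasses redefine this. */
        protected void token_error() {
            lastMessage = "Parse error at line " + error_line + ", column " + error_column;
        }

        /** Invented helper that simulates the parser detecting an error. */
        void fail(int line, int column, String found, String... expected) {
            error_line = line;
            error_column = column;
            error_string = found;
            expected_tokens = expected;
            token_error();
        }
    }

    /** Subclass that redefines the hook to produce a custom message. */
    static class FriendlyParser extends BaseParser {
        @Override
        protected void token_error() {
            lastMessage = error_line + ":" + error_column
                    + ": unexpected " + error_string
                    + ", expected one of: " + String.join(", ", expected_tokens);
        }
    }

    public static void main(String[] args) {
        FriendlyParser p = new FriendlyParser();
        p.fail(3, 7, "foo", "<ID>", "<NUM>");
        System.out.println(p.lastMessage);
    }
}
```

The same shape applies to the lexer: subclass it, redefine LexicalError(), and build the message from error_line, error_column, error_after, and curChar.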
The ErrorHandler Interface (C++ only)
Since the generated C++ parser does not use exceptions, an ErrorHandler interface is provided to handle the errors encountered during the parse.
int error_count;
- This protected field indicates the number of errors. If you are subclassing this class, it's your responsibility to update this field.
void handleUnexpectedToken()
- This public function is called when the parser expects to consume a specific kind of token but encounters a different one. Parameters:
- int expectedKind - token kind that the parser was trying to consume.
- string expectedToken - the image of the token - tokenImages[expectedKind].
- Token* actual - the actual token that the parser got instead.
void handleParseError()
- This public function is called when the parser cannot continue parsing any further. Parameters:
- Token* last - the last token successfully parsed.
- Token* unexpected - the token at which the error occurs.
- string production - the name of the production in which this error occurs.
int getErrorCount()
- This public function returns the number of errors.
JavaCC [tm]: JJTree
JJTree has two APIs: it adds some parser
methods; and it requires all node objects to implement the Node interface.
JJTree parser methods
JJTree maintains some state in the parser object itself. It
encapsulates all this state with an object that can be referred to via
the jjtree field.
The parser state implements an open stack where nodes are held
until they can be added to their parent node. The jjtree
state object provides methods for you to manipulate the contents of
the stack in your actions if the basic JJTree mechanisms are not
sufficient.
void reset()
- Call this to reinitialize the node stack. All nodes
currently on the stack are thrown away. Don't call this from
within a node scope, or terrible things will surely happen.
Node rootNode();
- Returns the root node of the AST. Since JJTree operates
bottom-up, the root node is only defined after the parse has
finished.
boolean nodeCreated();
- Determines whether the current node was actually closed and
pushed. Call this in the final action within a conditional node
scope.
int arity();
- Returns the number of nodes currently pushed on the node
stack in the current node scope.
void pushNode(Node n);
- Pushes a node on to the stack.
Node popNode();
- Returns the node on the top of the stack, and removes it from
the stack.
Node peekNode();
- Returns the node currently on the top of the stack.
The Node Interface
All AST nodes must implement this interface. It provides basic
machinery for constructing the parent and child relationships between
nodes.
public void jjtOpen();
- This method is called after the node has been made the
current node. It indicates that child nodes can now be added to
it.
public void jjtClose();
- This method is called after all the child nodes have been
added.
public void jjtSetParent(Node n);
public Node jjtGetParent();
- This pair of methods is used to inform the node of its
parent.
public void jjtAddChild(Node n, int i);
- This method tells the node to add its argument to the node's
list of children.
public Node jjtGetChild(int i);
- This method returns a child node. The children are numbered
from zero, left to right.
int jjtGetNumChildren();
- Returns the number of children the node has.
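A class that satisfies this interface needs little more than a parent reference and an indexed list of children. The following is a minimal sketch, not the node class that JJTree generates: the Node interface is restated from the method list above so the example is self-contained, and BasicNode, its label field, and the attach helper are hypothetical.

```java
import java.util.Arrays;

class NodeSketch {
    /** The Node interface, restated from the method list above. */
    interface Node {
        void jjtOpen();
        void jjtClose();
        void jjtSetParent(Node n);
        Node jjtGetParent();
        void jjtAddChild(Node n, int i);
        Node jjtGetChild(int i);
        int jjtGetNumChildren();
    }

    /** Hypothetical minimal implementation. */
    static class BasicNode implements Node {
        private final String label;  // for display only; not part of the interface
        private Node parent;
        private Node[] children = new Node[0];

        BasicNode(String label) { this.label = label; }

        public void jjtOpen() { }   // nothing to prepare in this sketch
        public void jjtClose() { }  // nothing to finalize in this sketch

        public void jjtSetParent(Node n) { parent = n; }
        public Node jjtGetParent() { return parent; }

        /** Stores n at index i, growing the array so any index order works. */
        public void jjtAddChild(Node n, int i) {
            if (i >= children.length) {
                children = Arrays.copyOf(children, i + 1);
            }
            children[i] = n;
        }

        public Node jjtGetChild(int i) { return children[i]; }
        public int jjtGetNumChildren() { return children.length; }

        @Override
        public String toString() { return label; }
    }

    /** Invented helper: links child to parent in both directions. */
    static void attach(Node parent, Node child, int index) {
        child.jjtSetParent(parent);
        parent.jjtAddChild(child, index);
    }

    public static void main(String[] args) {
        BasicNode plus = new BasicNode("+");
        attach(plus, new BasicNode("b"), 1);  // children may arrive out of order
        attach(plus, new BasicNode("a"), 0);
        System.out.println(plus.jjtGetNumChildren());
        System.out.println(plus.jjtGetChild(0) + " " + plus + " " + plus.jjtGetChild(1));
    }
}
```

Growing the array on demand in jjtAddChild means the caller does not have to add children in left-to-right order.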