



  
 JavaCC Features




JavaCC [tm]: Features

JavaCC [tm] is a Java parser generator written in the Java programming language. It produces pure Java code. Both JavaCC and the parsers it generates have been run on a variety of Java platforms. JavaCC ships with a number of grammars, including Java 1.0.2, Java 1.1, and Java 2, as well as a couple of HTML grammars.

Specific features of JavaCC are listed below:

  • TOP-DOWN: JavaCC generates top-down (recursive descent) parsers, as opposed to the bottom-up parsers generated by YACC-like tools. This allows the use of more general grammars (although left recursion is disallowed). Top-down parsers have other advantages as well: they are easier to debug, can parse to any non-terminal in the grammar, and can pass values (attributes) both up and down the parse tree during parsing.
  • LARGE USER COMMUNITY: JavaCC is by far the most popular parser generator used with Java applications. We've had hundreds of thousands of downloads and estimate serious users in the many thousands (maybe even tens of thousands). Our mailing list and newsgroups together have a few thousand participants.
  • LEXICAL AND GRAMMAR SPECIFICATIONS IN ONE FILE: The lexical specifications (regular expressions, strings, etc.) and the grammar specifications (the BNF) are written together in the same file. This makes grammars easier to read, since regular expressions can be used inline in the grammar specification, and easier to maintain.
  • TREE BUILDING PREPROCESSOR: JavaCC comes with JJTree, an extremely powerful tree building preprocessor.
  • EXTREMELY CUSTOMIZABLE: JavaCC offers many different options to customize its behavior and the behavior of the generated parsers. Examples of such options are the kinds of Unicode processing to perform on the input stream and the number of tokens of ambiguity checking to perform.
  • CERTIFIED TO BE 100% PURE JAVA: JavaCC runs on all Java-compliant platforms, version 1.1 or later. It has been used on countless different machines with no special porting effort, a testimonial to the "Write Once, Run Everywhere" aspect of the Java [tm] programming language.
  • DOCUMENT GENERATION: JavaCC includes a tool called JJDoc that converts grammar files to documentation files (optionally in HTML).
  • MANY MANY EXAMPLES: The JavaCC release includes a wide range of examples including Java and HTML grammars. The examples, along with their documentation, are a great way to get acquainted with JavaCC.
  • INTERNATIONALIZED: The lexical analyzer of JavaCC can handle full Unicode input, and lexical specifications may also include any Unicode character. This facilitates descriptions of language elements such as Java identifiers that allow certain Unicode characters (that are not ASCII), but not others.
  • SYNTACTIC AND SEMANTIC LOOKAHEAD SPECIFICATIONS: By default, JavaCC generates an LL(1) parser. However, there may be portions of the grammar that are not LL(1). JavaCC offers syntactic and semantic lookahead to resolve shift-shift ambiguities locally at these points. That is, the parser is LL(k) only at such points and remains LL(1) everywhere else for better performance. Shift-reduce and reduce-reduce conflicts are not an issue for top-down parsers.
  • PERMITS EXTENDED BNF SPECIFICATIONS: JavaCC allows extended BNF specifications, such as (A)*, (A)+, etc., within the lexical and the grammar specifications. Extended BNF relieves the need for left recursion to some extent. It is also often easier to read: compare A ::= y(x)* with A ::= Ax|y.
  • LEXICAL STATES AND LEXICAL ACTIONS: JavaCC offers lex-like lexical state and lexical action capabilities. Where JavaCC improves on other tools is in the first-class status it gives to concepts such as TOKEN, MORE, SKIP, and state changes. This allows cleaner specifications as well as better error and warning messages from JavaCC.
  • CASE-INSENSITIVE LEXICAL ANALYSIS: Lexical specifications can declare tokens to be case-insensitive, either globally for the entire lexical specification or for individual lexical specifications.
  • EXTENSIVE DEBUGGING CAPABILITIES: Using options DEBUG_PARSER, DEBUG_LOOKAHEAD, and DEBUG_TOKEN_MANAGER, one can get in-depth analysis of the parsing and the token processing steps.
  • SPECIAL TOKENS: Tokens that are defined as special tokens in the lexical specification are ignored during parsing, but these tokens are available for processing by the tools. A useful application of this is in the processing of comments.
  • VERY GOOD ERROR REPORTING: JavaCC error reporting is among the best in parser generators. JavaCC-generated parsers are able to clearly point out the location of parse errors with complete diagnostic information.
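To make the TOP-DOWN and EXTENDED BNF points concrete, here is a minimal hand-written sketch of a recursive descent recognizer for A ::= y(x)*. It is not JavaCC output; the class name EbnfLoopSketch and its methods are hypothetical and chosen only for illustration. The left-recursive form A ::= Ax|y would make the method for A call itself before consuming any input, which is why top-down parsers disallow it, while the (x)* form becomes an ordinary loop.

```java
// Hand-written sketch, not JavaCC output. Recognizes: A ::= "y" ( "x" )*
final class EbnfLoopSketch {
    private final String input;
    private int pos = 0;

    private EbnfLoopSketch(String input) {
        this.input = input;
    }

    // Parses the whole input as A and returns how many x's followed the y.
    static int parse(String input) {
        EbnfLoopSketch p = new EbnfLoopSketch(input);
        int count = p.a();
        if (p.pos != input.length()) {
            throw new IllegalArgumentException(
                "Unexpected '" + input.charAt(p.pos) + "' at position " + p.pos);
        }
        return count;
    }

    // A ::= "y" ( "x" )*
    // The value returned here is an attribute passed up to the caller.
    private int a() {
        expect('y');
        int count = 0;
        // The EBNF repetition ( "x" )* maps directly onto a loop.
        while (pos < input.length() && input.charAt(pos) == 'x') {
            pos++;
            count++;
        }
        return count;
    }

    private void expect(char c) {
        if (pos >= input.length() || input.charAt(pos) != c) {
            throw new IllegalArgumentException(
                "Expected '" + c + "' at position " + pos);
        }
        pos++;
    }

    public static void main(String[] args) {
        System.out.println(parse("yxxx")); // prints 3
    }
}
```

Each non-terminal maps to one method, so a debugger stack trace mirrors the derivation, and any such method could serve as the entry point for parsing that non-terminal alone.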

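The idea of local lookahead can be sketched by hand as follows. This is an illustration of the technique under simplifying assumptions, not the code JavaCC generates: the toy grammar (an assignment versus a call, both starting with an identifier), the class LocalLookaheadSketch, and the "<EOF>" marker are all hypothetical. The parser inspects two tokens at the single choice point where one token cannot decide, and one token everywhere else.

```java
import java.util.Arrays;
import java.util.List;

// Hand-written sketch, not JavaCC output.
// statement  ::= assignment | call
// assignment ::= ID "=" ID
// call       ::= ID "(" ")"
final class LocalLookaheadSketch {
    private final List<String> tokens;
    private int pos = 0;

    private LocalLookaheadSketch(List<String> tokens) {
        this.tokens = tokens;
    }

    // Parses one statement and reports which alternative was taken.
    static String classify(String... tokens) {
        return new LocalLookaheadSketch(Arrays.asList(tokens)).statement();
    }

    // Returns the token k positions ahead (k = 1 is the next token).
    private String peek(int k) {
        int i = pos + k - 1;
        return i < tokens.size() ? tokens.get(i) : "<EOF>";
    }

    private static boolean isId(String t) {
        return !t.isEmpty() && Character.isLetter(t.charAt(0));
    }

    // Consumes one token; the pseudo-kind "ID" matches any identifier.
    private void consume(String expected) {
        String t = peek(1);
        boolean ok = expected.equals("ID") ? isId(t) : expected.equals(t);
        if (!ok) {
            throw new IllegalArgumentException(
                "Expected " + expected + " but found " + t + " at token " + pos);
        }
        pos++;
    }

    private String statement() {
        // Both alternatives start with ID, so one token cannot decide.
        // Only this choice point looks at the second token.
        String second = peek(2);
        if (second.equals("=")) {
            assignment();
            return "assignment";
        }
        if (second.equals("(")) {
            call();
            return "call";
        }
        throw new IllegalArgumentException(
            "Expected '=' or '(' but found " + second);
    }

    private void assignment() { consume("ID"); consume("="); consume("ID"); }

    private void call() { consume("ID"); consume("("); consume(")"); }

    public static void main(String[] args) {
        System.out.println(classify("x", "=", "y")); // prints assignment
        System.out.println(classify("f", "(", ")")); // prints call
    }
}
```

Once the alternative is chosen, assignment() and call() proceed with a single token of lookahead, which is the performance point the feature description makes.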

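The special-token mechanism can be pictured with the sketch below, which keeps comments out of the token stream the parser sees while leaving them reachable from the following regular token. The field name specialToken is borrowed from JavaCC's generated Token class; the Tok class, the line-based scanner, and the "//" comment convention are simplified, hypothetical stand-ins and do not reproduce the generated token manager.

```java
import java.util.ArrayList;
import java.util.List;

// Hand-written sketch, not JavaCC output.
final class SpecialTokenSketch {
    // Simplified stand-in for a token.
    static final class Tok {
        final String image;
        Tok specialToken; // special token just before this one, or null

        Tok(String image) {
            this.image = image;
        }
    }

    // Returns only the regular tokens. Lines starting with "//" become
    // special tokens chained onto the next regular token. Comments after
    // the last regular token are dropped in this sketch.
    static List<Tok> scan(String source) {
        List<Tok> regular = new ArrayList<>();
        Tok pending = null;
        for (String line : source.split("\n")) {
            String trimmed = line.trim();
            if (trimmed.startsWith("//")) {
                Tok comment = new Tok(trimmed);
                comment.specialToken = pending; // link to earlier comment
                pending = comment;
                continue;
            }
            for (String word : trimmed.split("\\s+")) {
                if (word.isEmpty()) {
                    continue;
                }
                Tok t = new Tok(word);
                t.specialToken = pending;
                pending = null;
                regular.add(t);
            }
        }
        return regular;
    }

    // Walks the chain backwards and returns the comments in source order.
    static List<String> commentsBefore(Tok t) {
        List<String> out = new ArrayList<>();
        for (Tok s = t.specialToken; s != null; s = s.specialToken) {
            out.add(0, s.image);
        }
        return out;
    }

    // Renders each regular token followed by the comments attached to it.
    static String describe(String source) {
        List<String> parts = new ArrayList<>();
        for (Tok t : scan(source)) {
            parts.add(t.image + commentsBefore(t));
        }
        return String.join(" ", parts);
    }

    public static void main(String[] args) {
        System.out.println(describe("// first\n// second\nint x\n// tail\ny"));
        // prints int[// first, // second] x[] y[// tail]
    }
}
```

A parser built on scan() never sees the comments, yet a tool such as a pretty-printer or documentation extractor can recover them by following the chain from any regular token.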

