All Downloads are FREE. Search and download functionalities are using the official Maven repository.

weka.core.expressionlanguage.package-info Maven / Gradle / Ivy

/*
 *   This program is free software: you can redistribute it and/or modify
 *   it under the terms of the GNU General Public License as published by
 *   the Free Software Foundation, either version 3 of the License, or
 *   (at your option) any later version.
 *
 *   This program is distributed in the hope that it will be useful,
 *   but WITHOUT ANY WARRANTY; without even the implied warranty of
 *   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
 *   GNU General Public License for more details.
 *
 *   You should have received a copy of the GNU General Public License
 *   along with this program.  If not, see .
 */

/*
 *    package-info.java
 *    Copyright (C) 2015 University of Waikato, Hamilton, New Zealand
 *
 */

/**
 * Package for a framework for simple, flexible and performant expression
 * languages

* *

Introduction & Overview

* * The {@link weka.core.expressionlanguage} package provides functionality to * easily create simple languages.

* * It does so through creating an AST (abstract syntax tree) that can then be * evaluated.

* * At the heart of the AST is the {@link weka.core.expressionlanguage.core.Node} * interface. It's an empty interface to mark types to be an AST node.
* Thus there are no real constraints on AST nodes so that they have as much * freedom as possible to reflect abstractions of programs.

* * To give a common base to build upon the {@link weka.core.expressionlanguage.common.Primitives} * class provides the subinterfaces for the primitive boolean * ({@link weka.core.expressionlanguage.common.Primitives.BooleanExpression}), * double ({@link weka.core.expressionlanguage.common.Primitives.DoubleExpression}) * and String ({@link weka.core.expressionlanguage.common.Primitives.StringExpression}) * types.
* It furthermore provides implementations of constants and variables of those * types.

* * Most extensibility is achieved through adding macros to a language. Macros * allow for powerful meta-programming since they directly work with AST nodes. *
* The {@link weka.core.expressionlanguage.core.Macro} interface defines what a * macro looks like.

* * Variable and macro lookup is done through * {@link weka.core.expressionlanguage.core.VariableDeclarations} and * {@link weka.core.expressionlanguage.core.MacroDeclarations} resp. Furthermore, * both can be combined through * {@link weka.core.expressionlanguage.common.VariableDeclarationsCompositor} * and {@link weka.core.expressionlanguage.common.MacroDeclarationsCompositor} * resp.
* This really allows to add built-in variables and powerful built-in functions * to a language.

* * Useful implementations are:
*
    *
  • {@link weka.core.expressionlanguage.common.SimpleVariableDeclarations}
  • *
  • {@link weka.core.expressionlanguage.common.MathFunctions}
  • *
  • {@link weka.core.expressionlanguage.common.IfElseMacro}
  • *
  • {@link weka.core.expressionlanguage.common.JavaMacro}
  • *
  • {@link weka.core.expressionlanguage.common.NoVariables}
  • *
  • {@link weka.core.expressionlanguage.common.NoMacros}
  • *
  • {@link weka.core.expressionlanguage.weka.InstancesHelper}
  • *
  • {@link weka.core.expressionlanguage.weka.StatsHelper}
  • *
* * The described framework doesn't touch the syntax of a language so far. The * syntax is seen as a separate element of a language.
* If a program is given in a textual representation (e.g. "A + sqrt(2.0)" is a * program in a textual representation), this textual representation declares * how the AST looks like. That's why the parser's job is to build the AST.
* There is a parser in the {@link weka.core.expressionlanguage.parser} package.
* However the framework allows for other means to construct an AST if needed.

* * Built-in operators like (+, -, *, / etc) are a special case, since they can * be seen as macros, however they are strongly connected to the parser too.
* To separate the parser and these special macros there is the * {@link weka.core.expressionlanguage.common.Operators} class which can be used * by the parser to delegate operator semantics elsewhere. * *

A word on parsers

* * Currently the parser is generated through the CUP parser generator and jflex * lexer generator. While parser generators are powerful tools they suffer from * some unfortunate drawbacks:

*
    *
  • The parsers are generated. So there is an additional indirection between * the grammar file (used for parser generation) and the generated code.
  • *
  • The grammar files usually have their own syntax which may be quite * different from the programming language otherwise used in a project.
  • *
  • In more complex grammars it's easy to introduce ambiguities and unwanted * valid syntax.
  • *
* It's for these reasons why the parser is kept as simple as possible and with * as much functionality delegated elsewhere as possible. * *

Summary

* * A flexible AST structure is given by the * {@link weka.core.expressionlanguage.core.Node} interface. The * {@link weka.core.expressionlanguage.core.Macro} interface allows for powerful * meta-programming which is an important part of the extensibility features. The * {@link weka.core.expressionlanguage.common.Primitives} class gives a good * basis for the primitive boolean, double & String types.
* The parser is responsible for building up the AST structure. It delegates * operator semantics to {@link weka.core.expressionlanguage.common.Operators}. * Symbol lookup is done through the * {@link weka.core.expressionlanguage.core.VariableDeclarations} and * {@link weka.core.expressionlanguage.core.MacroDeclarations} interfaces which * can be combined with the * {@link weka.core.expressionlanguage.common.VariableDeclarationsCompositor} * and {@link weka.core.expressionlanguage.common.MacroDeclarationsCompositor} * classes resp.

* *

Usage

* * With the described framework it's possible to create languages in a declarative * way. Examples can be found in * {@link weka.filters.unsupervised.attribute.MathExpression}, * {@link weka.filters.unsupervised.attribute.AddExpression} and * {@link weka.filters.unsupervised.instance.SubsetByExpression}.

* * A commonly used language is:

* *

 * // exposes instance values and 'ismissing' macro
 * InstancesHelper instancesHelper = new InstancesHelper(dataset);
 * 
 * // creates the AST
 * Node node = Parser.parse(
 *   // expression
 *   expression, // textual representation of the program
 *   // variables
 *   instancesHelper,
 *   // macros
 *   new MacroDeclarationsCompositor(
 *     instancesHelper,
 *     new MathFunctions(),
 *     new IfElseMacro(),
 *     new JavaMacro()
 *   )
 * );
 *
 * // type checking is neccessary, but allows for greater flexibility
 * if (!(node instanceof DoubleExpression))
 *   throw new Exception("Expression must be of boolean type!");
 *    
 * DoubleExpression program = (DoubleExpression) node;
 * 
* *

History

* * Previously there were three very similar languages in the * weka.core.mathematicalexpression package, * weka.core.AttributeExpression class and the * weka.filters.unsupervised.instance.subsetbyexpression package.
* Due to their similarities it was decided to unify them into one expressionlanguage. * However backwards compatibility was an important goal, that's why there are * some quite redundant parts in the language (e.g. both 'and' and '&' are operators * for logical and). * * @author Benjamin Weber ( benweber at student dot ethz dot ch ) * @version $Revision: 1000 $ */ package weka.core.expressionlanguage;




© 2015 - 2025 Weber Informatics LLC | Privacy Policy