com.gemstone.gemfire.cache.query.package.html Maven / Gradle / Ivy
Show all versions of gemfire-core Show documentation
com.gemstone.gemfire.cache.query
GemFire's query service, an implementation of an OQL/SQL-style
query processor with supporting indexes. The query service enables
data to be retrieved efficiently from a GemFire cache through the use
of a high-level query language. The query language allows you to
access the data based on multiple attributes and to tranform the
results. The service also allows you to create indexes in order to
optimize the performance of data retrieval.
The
{@linkplain com.gemstone.gemfire.cache.query.QueryService
QueryService} interface provides the public entry point to query
execution, query compilation, and index creation. A QueryService
instance is obtained from {@linkplain
com.gemstone.gemfire.cache.Cache#getQueryService}. Queries can also
be performed by using the shortcut methods {@linkplain
com.gemstone.gemfire.cache.Region#query}, {@linkplain
com.gemstone.gemfire.cache.Region#selectValue}, and {@linkplain
com.gemstone.gemfire.cache.Region#existsValue}.
This package description contains the following sections:
Terminology
- Attribute
-
A named member of a type. An attribute is mapped to a Java class
member in three possible ways with the following priority until a
match is found. If the attribute is named x, then:
public method getX()
public method x()
public field x
(attribute) path
-
A sequence of attributes separated by a dot (.) that is applied to
objects where the value of one attribute is used to apply the next
attribute in the sequence. For example, the path x.y.z as
it applies to object A is equivalent to
A.getX().getY().getZ().
-
collection
-
In the context of the query language, a collection includes not only
instances of java.util.Collection, but arrays and maps as
well. Iteration over a map results in the iteration over its entries
as Map.Entry.
-
region path
-
A fully-qualified specification of a region in the cache. The query
language only supports full paths starting from a cache root region.
The minimum region path is a forward slash ('/') followed by a root
region name. The region path for a non-root region, A, begins with a
forward slash and region A' s root region name followed by the
series of region names required to navigate to region A, with a
forward slash delminiter between region names.
-
struct
-
A data type that has a fixed number of elements, each of which has a
field name and can contain an object value. An element of a struct
is typically referred to by an attribute name, for example, if an
address were represented by a struct, then you might access the
address elements in this manner: address.zipCode = 12345;
address.city = 'San Francisco'. Query evaluation may
generate structs whose elements can be referred to as an attribute
in the query language. Structures that are returned to Java are
immutable and implement the {@linkplain
com.gemstone.gemfire.cache.query.Struct Struct} interface.
Query Language
-
-
-
-
-
-
-
-
-
-
The Query Language supported in GemFire is essentially a
subset of OQL (ODMG 3.0), which in turn is based on SQL-92.
The query language provides the basic set of clauses, expressions,
and operators that allow you to retrieve elements from objects stored
in cache regions. This section describes the operators, keywords, and
operations that the query language supports for this release.
NOTE: Query language keywords such as SELECT, DATE, and NULL, are
case-insensitive. Identifiers in query strings, such as attribute
names, method names, and path expressions, are case-sensitive.
Query language keywords are capitalized in the documentation to
identify them as keywords.
Region Paths
In the context of a query, the name of a region is specified by
its full path starting with a forward slash (/), and delimited by the
forward slash (/) between regions names. Region names in a region
path are restricted to alphanumeric characters or underscore
characters only.
Many queries need to operate on the values of a Region. Some
queries, however, need to operate on the Region itself, e.g. to get
the keys or entries of the region. To facilitate this, when a region
path evaluates to a Region in the context of a query, the type of
object it refers to has the following interface:
interface QRegion extends com.gemstone.gemfire.cache.Region, java.util.Collection { }
Thus, the interface inherits all the attributes and methods from both
Region and Collection. When used as a Collection, the elements are
the values of the region. This enables a query to use a Region as
either a collection of its values or as a Region itself.
Example:
/root/employees
Can be used as a Collection of values in region /root/employees
/root/employees.keys
Returns the set of keys in region /root/employees
SELECT Statement
The SELECT statement lets you filter data from collection(s) of
objects based on a predicate and optionally transform the results.
SELECT DISTINCT projectionList FROM fromClause [WHERE expression]
The DISTINCT keyword is required in this version of GemFire.
The semantics of the SELECT statement are as follows. The FROM
clause specifies the collections to iterate over, the WHERE clause
specifies a boolean expression to filter the elements in the
iteration(s), and the projectionAttributes performs an optional
transformation on each element in the result.
The FROM clause establishes a naming scope for the rest of the
fromExpression (if more collections are listed) as well as for the
WHERE clause and the projection attributes. Explicit variables may
also be declared in the FROM clause. These are added to the current
scope, one variable for each collection in the list. The variable
becomes an "iterator variable" in
that it is bound to
the elements of the associated collection as they are iterated over.
Each expression in the FROM clause can be any expression that
evaluates to a collection. For example, an expression in the FROM
clause is typically a path expression that resolves to a region in
the Cache so that the values in the region become the collection of
objects to filter. Other possibilities include a query parameter that
is bound to a collection, or a nested SELECT statement.
Result of a Select Statement
The result of a SELECT statement is a collection that implements
the {@linkplain com.gemstone.gemfire.cache.query.SelectResults
SelectResults} interface.
The SelectResults is a simple collection of unstructured objects
in two cases:
If only one expression is specified by the projection list
and no field name is specified
If the SELECT list is *
and there is a single collection in the FROM clause
In all other cases, it is a collection of structs containing the
data.
The name of each field in the struct is either specified
explicitly by an identifier (using the fieldname:expr
syntax in the SELECT list or an explicit variable name in the FROM
clause) or it defaults to the last attribute name specified in the
path expression of the associated expression. If default names can
not be decided, arbitrary unique names are generated by the query
processor.
Typing of Iteration
Elements in the FROM clause
In order for the query processor to be able to resolve implicit
attribute names in a select statement, the types of the elements in
the iterator collections specified in the FROM clause must be
provided.
In the following example, the type of
the values in the region and the type of the elements in the
positions.values collection must be available to the query
processor in order to resolve the implicit attributes id and
sharesOutstanding:
SELECT DISTINCT id FROM /portfolios, positions.values WHERE
sharesOutstanding > 100
There are three possible ways of providing this type information:
If it is a Region, provide the type in the valueConstraint
or keyConstraint RegionAttributes.
Use the TYPE keyword to specify the type of the elements.
Note that application-defined types must be imported,
e.g.:
IMPORT com.acme.Portfolio;
IMPORT
com.acme.Position;
SELECT DISTINCT id
FROM /portfolios TYPE
Portfolio, positions.values TYPE Position
WHERE sharesOutstanding
> 100
Use typecasting:
IMPORT com.acme.Portfolio;
IMPORT
com.acme.Position;
SELECT DISTINCT id
FROM
(collection<Portfolio>)/portfolios,
(collection<Position>)positions.values
WHERE
sharesOutstanding > 100
If explicit iterator variables are
used, this typing is not necessary since the names can be resolved
without requiring type information. However, in the future GemFire
may allow queries to be compiled into bytecodes, which will require
this type information. It is therefore recommended that this type
information always be provided to ensure forward compatibility.
Joins
Since the collections in the FROM clause are not necessarily
related to each other, SELECT statements could be used to join data
between two different regions:
SELECT p FROM /Persons p, /Flowers f WHERE p.name = f.name
This returns all the Persons from the /Persons region that has the
same name as some Flower in the /Flowers region. There could be
indexes on both regions for the name attribute, but only one index
could be used during query evaluation. GemFire does not currently
support the creation of indexes targeted for joins across more than
one region.
Method Invocation
The query language supports method invocation in query
expressions. The query processor maps attributes in query strings
using the rules described under Attribute
in the terminology section. If you know that the attribute name maps
to a public method that takes no parameters, you can simply include
the method name in the query string as an attribute. For example,
emps.isEmpty is equivalent to emps.isEmpty(). In
the following example, the query invokes isEmpty() on
subordinates, and
returns the set of all employees with no
subordinates.
SELECT DISTINCT * FROM /root/employees WHERE subordinates.isEmpty
GemFire also supports invoking public methods with parameters. To
invoke methods with parameters, provide the method arguments between
parentheses. For overloaded methods, the query processor matches the
runtime argument types with the parameter types required by the
method to resolve which method to call. If more than one method can
be
invoked, the query processor chooses the method whose
parameter types are the most specific for the given arguments.
Methods that are declared with a void return type return NULL when
executed in a query.
Operators
GemFire supports the following operator types in expressions:
Comparison Operators
Logical Operators
Unary Operators
Map and Index Operators
Dot and Forward Slash Operators
Comparison Operators
Comparison operators compare two values and return the results,
either true or false. GemFire query language supports the following
comparison operators:
=
<> < <=
> >=
The operator != is also
accepted as equivalent to <>.
Logical Operators
The operators AND and OR allow you to create more complex
expressions by combining expressions to produce a boolean result
(true or false).
Unary Operators
GemFire supports the unary operator NOT.
Map and Index Operators
If expr is an expression that evaluates to an integer i,
then a[expr] extracts the (i+1)th element of array,
list, or string a. (The first element has rank 0). The same
operator is used to access a value by key in a Map or Region, in
which case the type of expr can be any Object. In the case of a
Region, the map operator performs a non-distributed get that does not
cause a netSearch if the value is not present in the region locally.
(i.e. equivalent to getEntry(expr).getValue()).
Dot and Forward Slash Operators
The dot operator separates attribute names in a path expression,
and specifies the navigation through object attributes. The forward
slash is used to navigate through subregions as described above under
Region Paths.
Functions
The query language supports the functions:
- ELEMENT(query)
-
Extracts a single element from a collection or array. This function
throws a FunctionDomainException if the argument is not a collection
or array with exactly one element.
-
IS_DEFINED(query)
-
Returns true if the query evaluates to a non-null value.
-
IS_UNDEFINED(query)
-
Returns true if the query cannot be evaluated or evaluates to null.
Construction Expressions
The construction expression currently
implemented is the set constructor. A set can be constructed using
SET(e1, e2, ..., en)
where e1, e2, ..., en
are expressions. This constructor creates and returns the set
containing the elements e1, e2, ..., en
.
IN Expression
If e1 and e2 are expressions, e2 is a collection, and e1 is an
object or a literal having the same type or a subtype as the elements
of e2, then
e1 IN e2
is an expression of type boolean. It returns true if element e1 is
not UNDEFINED and belongs to collection e2, it returns false if e1 is
not UNDEFINED and does not belong to collection e2, and it returns
UNDEFINED if e1 is UNDEFINED.
Example:
2 IN SET(1, 2, 3)
This returns true.
As another example where the collection membership is a subquery:
If, in addition to a region with employees in it there is a
separate region named /payroll that contains payroll objects with
attributes empId for employee id and rate for payrate, this query
returns the names of the employees earning a rate of 8:
SELECT name FROM /root/employees WHERE empId IN (SELECT empId FROM /payroll WHERE rate = 8)
Literals
Query language expressions can contain literals as well as
operators and attribute names. The following lists the literal types
that GemFire supports.
- boolean
-
A boolean value, either TRUE or FALSE
-
integer and long
-
An integer literal is of type long if it is suffixed with the ASCII
letter L. Otherwise it is of type int.
-
floating point
-
A floating-point literal is of type float if it is suffixed with an
ASCII letter F. Otherwise its type is double and it can optionally
be suffixed with an ASCII letter D. A double or floating point
literal can optionally include an exponent suffix of E or e,
followed by a signed or unsigned number.
-
string
-
String literals are delimited by single quotation marks. Embedded
single-quotation marks are doubled. For example, the character
string 'Hello' evaluates to the value Hello, while the character
string 'He said, ''Hello''' evaluates to He said, 'Hello'. Embedded
newlines are kept as part of the string literal.
-
char
-
A literal is of type char if it is a string literal prefixed by the
keyword CHAR, otherwise it is of type string. The CHAR literal for
the single-quotation mark character is CHAR '''' (four single
quotation marks).
-
date
-
A java.sql.Date object that uses the JDBC format prefixed with the
DATE keyword: DATE yyyy-mm-dd In the Date, yyyy represents the year,
mm represents the month, and dd represents the day. The year must be
represented by four digits; a two-digit shorthand for the year is
not allowed.
-
time
-
A java.sql.Time object that uses the JDBC format (based on a 24-hour
clock) prefixed with the TIME keyword: TIME hh:mm:ss In the Time, hh
represents the hours, mm represents the minutes, and ss represents
the seconds. The hours field is based on a 24-hour clock.
-
timestamp
-
A java.sql.Timestamp object that uses the JDBC format with a
TIMESTAMP prefix: TIMESTAMP yyyy-mm-dd hh:mm:ss.fffffffff In the
Timestamp, yyyy-mm-dd represents the date, hh:mm:ss represents the
time, and fffffffff represents the fractional seconds (up to nine
digits).
-
NULL
-
UNDEFINED
Comparing Values with java.util.Date
You can compare the temporal literal values DATE, TIME, and
TIMESTAMP with java.util.Date values. However, there is no literal
for java.util.Date values in the query language.
Rules for UNDEFINED
The special value UNDEFINED behaves according to the following
rules:
Accessing an attribute of a NULL
value results in UNDEFINED.
IS_UNDEFINED(UNDEFINED) returns TRUE; IS_DEFINED(UNDEFINED)
returns FALSE
If the predicate that is defined by the WHERE clause of a
SELECT statement returns UNDEFINED, this is handled as if the
predicate returns false.
For AND and OR expressions:
UNDEFINED AND FALSE // returns false
UNDEFINED OR TRUE // returns true
UNDEFINED AND TRUE // returns UNDEFINED
UNDEFINED OR FALSE // returns UNDEFINED
UNDEFINED AND UNDEFINED // returns UNDEFINED
UNDEFINED OR UNDEFINED // returns UNDEFINED
Any other operation with any UNDEFINED operands results in
UNDEFINED. This includes the dot operator, method invocations with
UNDEFINED arguments, and comparison operations.
Query Parameter
References
Query parameters are identified by a dollar sign ($) followed by a
digit that represents the parameter's position in the parameter array
passed in the execute() method. Counting begins at one (1). That is,
$1 references the first bound attribute, and $2 references the second
bound attribute, and so on.
Comments
You can include comments in the query string. To insert a one-line
comment, begin the line with two dashes (--). To insert a
multiple-line comment, begin the comment block with /* and
end it with */.
Indexes
An index provides significant performance gains for query
execution by enabling the query evaluation to avoid the often
substantial work of iterating through every object in a collection of
objects, instead looking up results in a preconstructed index.
GemFire provides functional indexes that support attribute (or more
complex functions of attributes) comparison with a constant using any
of the relational operators. An index is kept current with respect to
a region automatically. The RegionAttribute
synchronousIndexMaintenance specifies whether the indexes
for the region are kept current atomically when the region is
modified (synchronous), or whether they are brought up to date in a
background thread (asynchronous). The default is asynchronous since
this provides the best performance. Asynchronous index maintenance
may batch up multiple updates to the same key in the region and only
apply the most recent.
The QueryService provides the following methods for creating an
index:
Index createIndex(String name, IndexType indexType, String indexedExpression,
String fromClause);
Index createIndex(String name, IndexType indexType, String indexedExpression,
String fromClause, String imports);
The indexType currently must be either IndexType.FUNCTIONAL
or PRIMARY_KEY
.
The fromClause resolves to a collection or list
of collections which will correspond to the fromClause or part of a
fromClause in a SELECT statement. The fromClause must contain one and
only one region path, and the collections that the expressions
evaluate to must have the characteristic that each struct generated
by those collections is dependent on one and only one entry in the
referenced region (otherwise and exception is thrown since the index
wouldn't be able to be maintained on single entry updates).
References to query parameters are not allowed.
The indexedExpression should be a function of the
elements of the collection (or collection of structs) referenced in
the fromExpression, and is used to optimize the comparison of the
same path found in a WHERE clause when used to compare against a
constant expression (i.e. an expression that does not depend on the
iterator context) using the relational operators. The exact use and
specification of the indexedExpression varies depending on the index
type. See {@linkplain com.gemstone.gemfire.cache.query.IndexType}. No
query parameters or region paths are permitted.
The imports provides packages and classes used in variable
typing and typecasts.
In addition to functional indexes, GemFire also supports a primary
key index. A primary key index uses the keys in the region
itself. By creating a primary key index, you make the query service
aware of the relationship between the values in the region and the
keys in the region and enable the relationship to be used to optimize
the execution of queries. For example, if the values in a region are
employee objects and the keys in the region is the attribute empId on
those employees, then you can create a primary key index on that
region with the indexedExpression "empId". Note that
primary key indexes are not sorted so they will only be used for
equality tests. To obtain a sorted index on the region keys, create a
functional index instead.
Index creation is also supported in the cache.xml for creating
indexes declaratively at the same time regions are created via the
cache.xml.
RESERVED WORDS
abs all and andthen any array as asc avg bag bag boolean by by byte char collection count date declare define desc dictionary distinct double element enum except exists false first flatten float for from group having import in int intersect interval is_defined is_undefined last like list listtoset long map max min mod nil not null octet or order orelse query select set short some string struct sum time timestamp true type undefine undefined union unique unsigned where
LANGUAGE GRAMMAR
symbol ::= expression
n is a nonterminal symbol that has to appear at some place within the grammar on the left side of a rule, all nonterminal symbols have to be derived to be terminal symbols.
t represents the terminal symbol t
x y represents x followed by y
x | y
(x| y ) represents x or y,
[ x ] x or empty
{ x } possibly empty sequence of x
query_program ::= [ imports ; ] query [;]
imports ::= import { ; import }
import ::= IMPORT qualifiedName [ AS identifier ]
query ::= selectExpr | expr
selectExpr ::= SELECT DISTINCT projectionAttributes fromClause [ whereClause ]
projectionAttributes ::= *
| projectionList
projectionList ::= projection { , projection }
projection ::= field
| expr [ AS identifier ]
field ::= identifier : expr
fromClause ::= FROM iteratorDef { , iteratorDef }
iteratorDef ::= expr [ [ AS ] identifier ] [ TYPE type ]
| identifier IN expr [ TYPE type ]
whereClause ::= WHERE expr
expr ::= castExpr
castExpr ::= orExpr
| left_paren type right_paren castExpr
orExpr ::= andExpr { OR andExpr }
andExpr ::= equalityExpr { AND equalityExpr }
equalityExpr ::= relationalExpr { ( = | <> | != ) relationalExpr }
relationalExpr ::= inExpr { ( < | <= | > | >= ) inExpr }
inExpr ::= unaryExpr { IN unaryExpr }
unaryExpr ::= [ NOT ] unaryExpr
postfixExpr ::= primaryExpr { left_bracket expr right_bracket }
| primaryExpr { dot identifier [ argList ] }
argList ::= left_paren [ valueList ] right_paren
qualifiedName ::= identifier { dot identifier }
primaryExpr ::= conversionExpr
| identifier [ argList ]
| undefinedExpr
| collectionConstruction
| queryParam
| literal
| ( query )
conversionExpr ::= element ( query )
undefinedExpr ::= IS_UNDEFINED ( query )
| IS_DEFINED ( query )
collectionConstruction ::= SET left_paren [ valueList ] right_paren
valueList ::= expr { , expr }
queryParam ::= $ integerLiteral
identifier ::= letter { letter | digit | _ }
literal ::= booleanLiteral
| integerLiteral
| longLiteral
| doubleLiteral
| floatLiteral
| charLiteral
| stringLiteral
| dateLiteral
| timeLiteral
| timestampLiteral
| NULL
| UNDEFINED
booleanLiteral ::= TRUE | FALSE
integerLiteral ::= [ - ] digit { digit }
longLiteral ::= integerLiteral L
floatLiteral ::= [ - ] digit { digit } dot digit { digit }
[ ( E | e ) [ + | - ] digit { digit } ] F
doubleLiteral ::= [ - ] digit { digit } dot digit { digit }
[ ( E | e ) [ + | - ] digit { digit } ] [ D ]
charLiteral ::= CHAR ' character '
stringLiteral ::= ' { character } '
dateLiteral ::= DATE ' integerLiteral – integerLiteral – integerLiteral '
timeLiteral ::= TIME
' integerLiteral : integerLiteral : integerLiteral'
timestampLiteral ::= TIMESTAMP
' integerLiteral – integerLiteral - integerLiteral
integerLiteral : integerLiteral : digit { digit } [ . digit
{ digit } ] '
letter ::= (any unicode letter)
character ::= (any character)
digit ::= 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9
dot ::= .
left_paren ::= (
right_paren ::= )
left_bracket ::= [
right_bracket ::= ]
Language Notes:
1. Query language keywords
(for example, SELECT, NULL, DATE) are
case-insensitive;
identifiers are case-sensitive.
2. Comment lines begin
with --
3. Comment blocks are
delimited by /* and */
4. String literals are
delimited by single-quotes; embedded single-quotes
are doubled. Examples:
'Hello' value = Hello
'He said, ''Hello''' value
= He said, 'Hello'
5. Character literals
begin with the CHAR keyword followed by the character
in single quotation marks.
The single-quotation mark character itself is
represented as CHAR ' ' '
' (with four single quotation marks).
6. In the TIMESTAMP
literal, there is amaximum of nine digits after the
decimal point.