
Design: Indexes in GemFire Querying

This document describes the creation, use, and maintenance of indexes in the GemFire query processor, as designed for the 4.0 release. The index types that will be supported in that release include “functional sorted” indexes and “primary key” indexes.

  1. Types of Indexes

    1. Functional Sorted Indexes
      A functional index is so named because it can be used to index on data using any function of the region entries that make up the data. It is a sorted index, so it supports comparisons using any of the relational operators (<, >, <=, >=, =, <>).

    2. Primary Key Indexes
      A primary key index is a cover for the keys that are already in a Region. Creating a primary key index allows the query processor to use the keys in a region to improve performance in query evaluation. A primary key index provides the query service with information about the relationship between the keys in the region and the values in the region. Since the keys are not sorted in a region, a primary key index is only used for queries using the = operator. As an example of a query that can use a primary key index, say the /Portfolios region has Portfolio objects keyed by their ID. The primary key index is created with the parameters fromClause=“/Portfolios” and indexedExpression=“ID”. The projectionAttributes must default to “*”. This primary key index could then be used for a query such as:
      SELECT DISTINCT * FROM /Portfolios WHERE ID = '3434'

      In GemFire, the primary key index does not require any extra structure or maintenance since it is implicit in the Region implementation itself. The rest of this document, therefore, describes the functional sorted indexes only.

  2. Structure of Indexes

    1. Indexes are stored in instances of IndexManager. Each region in the cache has an IndexManager. When a region is destroyed, so is its IndexManager and the indexes stored therein.

    2. An index contains two maps, a forward map and a reverse map. The forward map is used when evaluating a query, and the reverse map is used when updating the index when a region is modified.

      1. Forward Map

        1. The forward map has the structure:
          SortedMap<key: Object, value: Map<key: RegionEntry, value: Object>>
          The keys of the SortedMap are the values of the indexedExpression as evaluated for each element in the base collection. The value is a Map that maintains an association between the context object from the Region and the targets derived from that context.

        2. The context object is a RegionEntry. This object is used to determine which entry in the region the target objects are derived from.

        3. The target objects are values that will be used in the results of the SELECT expression. This target object is an element from the base collection if the projectionAttributes is *, or is an element from the base collection transformed by the function defined by the projectionAttributes. The target is conceptually a Set of values, but to conserve on resources in the index, it is represented as follows:

          • When it is a single object, it is stored as such in the index.

          • If it is a set of objects, it is stored in a collection which itself is an instance of SelectResults, either a ResultsBag or a StructBag.

      2. Reverse Map

        1. The reverse map has the structure:
          Map<key: RegionEntry, value: Object>
          where the value is the key(s) in the forward map. If more than one key in the forward map references a RegionEntry, then the value of this map is a collection of all those keys. The reverse map is used to quickly discover which keys in the forward map have a reference to a particular RegionEntry. It is used when a region entry is modified and the indexes of the region need to be updated.
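The two-map structure described above can be sketched in Java as follows. This is only an illustration under stated assumptions, not the product's implementation: SketchEntry and SortedIndexSketch are hypothetical names, targets are always kept in a List (rather than a single object or a SelectResults), and the reverse map always holds a Set of keys.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

// Hypothetical stand-in for RegionEntry; identity equality is enough for this sketch.
final class SketchEntry {
    final Object key;
    SketchEntry(Object key) { this.key = key; }
}

// Sketch of the forward and reverse maps of one functional sorted index.
final class SortedIndexSketch<K extends Comparable<K>> {
    // Forward map: indexed value -> (context entry -> targets derived from that entry).
    final TreeMap<K, Map<SketchEntry, List<Object>>> forward = new TreeMap<>();
    // Reverse map: context entry -> forward-map keys that reference it.
    final Map<SketchEntry, Set<K>> reverse = new HashMap<>();

    void add(K indexedValue, SketchEntry entry, Object target) {
        forward.computeIfAbsent(indexedValue, k -> new HashMap<>())
               .computeIfAbsent(entry, e -> new ArrayList<>())
               .add(target);
        reverse.computeIfAbsent(entry, e -> new HashSet<>()).add(indexedValue);
    }

    // Range lookup for "indexedExpression > bound", in ascending key order.
    List<Object> greaterThan(K bound) {
        List<Object> out = new ArrayList<>();
        for (Map<SketchEntry, List<Object>> perEntry : forward.tailMap(bound, false).values()) {
            for (List<Object> targets : perEntry.values()) {
                out.addAll(targets);
            }
        }
        return out;
    }
}
```

A sorted map is what makes the relational operators cheap here: a comparison such as > becomes a single range view over the forward map instead of a scan of the region.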

  3. Index Creation
    Indexes are specified with three parameters: the fromClause, the indexedExpression, and the projectionAttributes. These parameters are analogous and equivalent to the corresponding parts of the SELECT expression that the indexes will be used to evaluate.

    1. FromClause
      The fromClause defines the base collection of objects that are being selected from, and optionally defines iterator variables that can be referred to in the indexedExpression or the projectionAttributes. The fromClause can be a single expression that specifies a collection, or it can be a list of expressions that drill down into or join across a complex object structure and define a namespace of objects that can be referenced in the query. Each expression in the fromClause sets up a nested iteration.

      1. Base Collection

        1. If there is one expression in the fromClause, then the base collection is the value of that expression.

        2. If there are multiple, comma-delimited, expressions in the fromClause, then the base collection is a struct that consists of a field for each of the expressions in order. These structs represent the cartesian product of each collection specified in the fromClause.

      2. Examples of fromClauses that could be used to create an index include:

        /root/employees
        /root/employees e
        /portfolios ptfo, ptfo.positions pos
        (in the last example the base collection will be of type struct<Portfolio, Position>)
    2. IndexedExpression
      The indexedExpression specifies the value that is indexed on. For a sorted index, it specifies the expression that will be used to compare with another value that is independent of objects in the from clause (i.e. a constant) using the relational operators (<, =, >, <=, >=, <>). Example indexedExpressions that could be used to create an index include:

      empId
      ptfo.active
      pos.sharesOutstanding
      ptfo.someMethodReturningAComparable(element(select distinct * from
      positions.values where sharesOutstanding > 10000))
    3. ProjectionAttributes
      [Open issue: Is it really practical to put projectionAttributes on an index? Does Sybase/Oracle support this at all? Considering the extra complexity it introduces in the implementation (as seen later) this may be a lower priority than other features].

      The projectionAttributes is an expression that does a transformation on the results of a query, and this projection can be pre-computed for each element in the results and stored in the index as well. If there is a comma-delimited list of expressions in the projectionAttributes, then a struct is created with the value of each expression as a field in the struct. An identifier can also be included to explicitly name the fields in the struct. If no identifiers are provided, then the field names are derived from the tail attribute names or generated by the query processor. If the projectionAttributes is * then there is no transformation on the results. Example projectionAttributes that could be used to create an index include:

      *
      empId
      e.key, e.value, e.value.id
      key: e.key, value: e.value, id: e.value.id
      e.key AS k, e.value AS v, e.value.id AS id

    4. Algorithm for Creating Indexes
      For the purposes of this section, assume we have a region of Portfolios with Positions as defined in the use cases section of the functional specification with region path “/portfolios”. An index is created to speed up queries that are similar to:

      SELECT DISTINCT posn
      FROM /portfolios ptfo, ptfo.positions.values posn
      WHERE ptfo.status = 'active' AND posn.sharesOutstanding > 10000

      The createIndex method is called with the following parameters:

      fromClause = “/portfolios ptflo, ptflo.positions.values posn”;
      indexedExpression = “posn.sharesOutstanding”;
      projectionAttributes = “posn”;

      1. Extract region information from the fromClause.
        In order to create an index, the fromClause must reference one and only one region using a regionPath, and the fromClause, indexedExpression, and projectionAttributes must not have any query parameters in them. If these restrictions do not hold, then the createIndex method will throw an exception.

        Determine the one and only one region path in the fromClause.

        For the above example, the region information from the fromClause determines that the region path is “/portfolios”.

      2. Transform the parameters.
        Make the following transformation on the fromClause:

        1. Where it references the regionPath, substitute $1.

        2. For the working example, we now have the fromClause
          $1 ptflo, ptflo.positions.values posn

      3. Construct and Execute the Index Maintenance Query (IMQ)
        A special Query is created for an index which we will call the Index Maintenance Query (IMQ). This query is constructed as follows.

        1. Construct IMQ
          Using the transformed fromClause as described above and the other index parameters, construct the query as:
          SELECT DISTINCT idxExpr: indexedExpression, target: projectionAttributes
          FROM fromClause
          Save this query with the index and compile into bytecodes if possible, as it will be reused for index maintenance as well as for index creation.
          For our working example, the IMQ would be:
          SELECT DISTINCT idxExpr: posn.sharesOutstanding, target: posn
          FROM $1 ptflo, ptflo.positions.values posn

        2. Execute IMQ.
          Iterate through each RegionEntry in the region, and for each RegionEntry:

          • Create a special instance of QRegion that contains exactly one entry, the current RegionEntry. Execute the IMQ using this QRegion as the $1 query parameter. The result of this query provides structs that contain the index values and target values needed for the index for this region entry.
            [Note: We need to implement a light-weight read-only Region implementation that has just one RegionEntry in it, and use this Region to construct this special QRegion instance.]

      4. Use the results to build the index.
        These structs can now be used directly to build the index by iterating over them and collecting the indexed values and constructing the map of RegionEntry=>target objects, and adding the RegionEntry to the reverse map.
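The parameter transformation and IMQ construction steps above amount to simple text assembly, which can be sketched as follows. ImqSketch and its method names are hypothetical, and the sketch assumes the region path has already been identified and occurs exactly once in the fromClause; it does not parse or validate the query.

```java
// Sketch of building the Index Maintenance Query (IMQ) text from the index parameters.
final class ImqSketch {
    // Replaces the single region path in the fromClause with the $1 query parameter.
    static String transformFromClause(String fromClause, String regionPath) {
        int at = fromClause.indexOf(regionPath);
        if (at < 0) {
            throw new IllegalArgumentException("fromClause does not reference " + regionPath);
        }
        return fromClause.substring(0, at) + "$1" + fromClause.substring(at + regionPath.length());
    }

    // Assembles: SELECT DISTINCT idxExpr: <indexedExpression>, target: <projectionAttributes> FROM <fromClause>
    static String buildImq(String fromClause, String regionPath,
                           String indexedExpression, String projectionAttributes) {
        return "SELECT DISTINCT idxExpr: " + indexedExpression
            + ", target: " + projectionAttributes
            + " FROM " + transformFromClause(fromClause, regionPath);
    }
}
```

Called with the working example's parameters, buildImq yields the IMQ text shown earlier (on one line), with $1 standing in for the one-entry QRegion supplied at execution time.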

      5. Concurrency during Index Creation
        For indexes that are specified in the cache.xml, the indexes are created during initialization before a reference to the region is released to application threads.
        Whenever an index is being created, modifications to the region by other threads must be blocked, i.e. a local region write lock is obtained. Threads that only read from the region are not blocked. [TBD – do we already have a local region write lock, or does the query group need to implement this?]

  4. Index Use while Executing a Query
    To determine whether an index can be used for a particular query, the fromClause, indexedExpression, and projectionAttributes of the index must be compatible with the query as described below. In the future, histograms should be added to indexes, and query transformations and cost-based estimation should be added to the query processor, so that more intelligent heuristics can be used to determine whether the use of a particular index is actually worthwhile. In 4.0, if an index is deemed to be compatible then it will be used. The algorithms described here make some simplifying assumptions, so an index may not be used in every case where it could be if the algorithms were more sophisticated. This could be improved on in future releases.

    1. Canonicalization
      To facilitate the matching algorithms described later, the index parameters (fromClause, indexedExpression, and projectionAttributes) and the query being executed are first put into a canonicalized form so that there are no variables in the query. Canonicalization is done as follows:

      1. Queries and index parameters are first compiled into a tree of nodes which are instances of CompiledValue. The term “compiled” here should not be confused with byte-code compilation, which is a lower level of compilation that will most likely not be implemented in this release due to resource restrictions. See the package.html for org.apache.geode.cache.query for further details on byte-code compilation.

      2. Each iterator definition in the fromClause is assigned a placeholder that represents its runtime iteration. The product currently has an internal class that implements these placeholders, named RuntimeIterator. For the purpose of this document, we will refer to these placeholders as itr1, itr2, ..., itrN. All identifiers in the query and index parameters that refer to explicitly declared iterator variables are replaced with references to these placeholders. Any implicit references to attributes or methods are resolved to determine which iterator they operate on, and these implicit references are made explicit and given a reference to the appropriate placeholder.

      3. Remove any unreferenced iterator definitions in the fromClause, i.e. any definitions that are not referenced anywhere else in the query or index parameters. Note, however, that a * projectionAttributes implicitly references all iterator definitions in the fromClause.

      4. Given the example query
        SELECT DISTINCT posn
        FROM /portfolios, positions.values posn
        WHERE status = 'active' AND sharesOutstanding > $1

        the canonicalized form would be a tree of CompiledValue nodes which could be transcribed as:
        SELECT DISTINCT
        itr2
        FROM /portfolios itr1, itr1.positions.values itr2
        WHERE itr1.status = 'active' AND itr2.sharesOutstanding > $1

    2. Compatible fromClause
      The fromClause of an index is compatible with a query if the index fromClause is a sublist of the query fromClause (a sublist includes the case of being equivalent lists).

    3. Compatible projectionAttributes
      The projectionAttributes of an index is automatically compatible if it is * (no projection), or if it is equivalent to the projectionAttributes in the query.

    4. Compatible indexedExpression

      1. Equivalence: The indexedExpression (tree) is passed into the whereClause (tree) of the query using a method that calculates compatibility by potentially recursively visiting children nodes that represent subexpressions. By default, a node in the whereClause will answer true only if the indexedExpression is equivalent to the node itself. E.g., the expression element(sub_expr1) in the indexedExpression is compatible with element(sub_expr2) in the whereClause if and only if sub_expr1 and sub_expr2 are equivalent.

      2. Some types of expression nodes, however, define compatibility in other ways.

        1. An AND node in the whereClause is compatible with an indexedExpression not only if it is equivalent based on its unordered terms, but also if any of its terms are compatible.

        2. An OR node in this release is compatible only if it is equivalent based on its unordered terms.

        3. A relational expression (using one of the operators <, >, <=, >=, or <>) is compatible if equivalent or if both of the following are true:

          • one of its terms is compatible

          • the other term is a constant, i.e. is not dependent on any of the iterators.
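The compatibility rules for the indexedExpression can be sketched as a recursive method on a small expression tree. ExprNode and its Kind values are hypothetical and much simpler than the product's CompiledValue nodes; paths are assumed to be already canonicalized (e.g. itr2.sharesOutstanding), and query parameters such as $1 are modeled as constants.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical expression node used only to illustrate the compatibility rules.
final class ExprNode {
    enum Kind { AND, OR, RELATIONAL, PATH, CONSTANT }

    final Kind kind;
    final String text;           // operator, canonical path, or literal
    final List<ExprNode> terms;  // child terms; empty for leaves

    ExprNode(Kind kind, String text, ExprNode... terms) {
        this.kind = kind;
        this.text = text;
        this.terms = Arrays.asList(terms);
    }

    // Structural equivalence; the terms of AND and OR are compared without regard to order.
    boolean equivalent(ExprNode other) {
        if (kind != other.kind || !text.equals(other.text) || terms.size() != other.terms.size()) {
            return false;
        }
        boolean unordered = kind == Kind.AND || kind == Kind.OR;
        boolean[] used = new boolean[terms.size()];
        for (int i = 0; i < terms.size(); i++) {
            boolean matched = false;
            for (int j = 0; j < terms.size() && !matched; j++) {
                boolean candidate = unordered ? !used[j] : j == i;
                if (candidate && terms.get(i).equivalent(other.terms.get(j))) {
                    used[j] = true;
                    matched = true;
                }
            }
            if (!matched) return false;
        }
        return true;
    }

    // True if this whereClause node can make use of an index on indexedExpression.
    boolean compatibleWith(ExprNode indexedExpression) {
        if (equivalent(indexedExpression)) return true;
        switch (kind) {
            case AND:
                // Compatible if any one of its terms is compatible.
                for (ExprNode term : terms) {
                    if (term.compatibleWith(indexedExpression)) return true;
                }
                return false;
            case RELATIONAL:
                // One term must be compatible and the other must be a constant.
                if (terms.size() != 2) return false;
                ExprNode left = terms.get(0);
                ExprNode right = terms.get(1);
                return (left.compatibleWith(indexedExpression) && right.kind == Kind.CONSTANT)
                    || (right.compatibleWith(indexedExpression) && left.kind == Kind.CONSTANT);
            default:
                return false; // OR and leaf nodes are compatible only when equivalent
        }
    }
}
```

For the canonicalized example query, the AND node is compatible with an index on itr2.sharesOutstanding because its second term is, while the same two terms joined by OR would not be.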

    5. Compatible Index
      If all of these compatibility tests for a query pass for a particular index, then the query will use that index. Note that it is possible for a query to use multiple compatible indexes as explained below.

    6. Query Evaluation
      There are two ways a query can be evaluated, by iteration or “filtering” (for the lack of a better term).

      1. Iteration
        If there are no compatible indexes and the whereClause is dependent on at least one of the iterators, the query is evaluated by brute force, most likely by iteration over the cartesian product of the collections in the fromClause. The whereClause tree is visited for each element in the iteration across the cartesian product. Those elements for which the whereClause evaluates to true are kept in the result set and the projection attributes are applied to them; those elements for which the whereClause does not evaluate to true are discarded.

      2. Filtered Evaluation
        When there is at least one compatible index, then the query is evaluated by “filtering”, which means it does intersections or iterations on intermediate result sets obtained from indexes instead of the entire base collection. For filtered evaluation, the compiled whereClause tree is visited recursively, but instead of doing this for each iteration of the base collection, the entire result set is built up as it visits the nodes in the tree. An expression that is compatible with an index will produce a result set using that index. When combined with other terms in an AND expression, either other result sets will be intersected with each other to produce a result set for the entire AND expression, or some terms will produce result sets that are intersected and other terms that cannot use an index will be evaluated against the intermediate results from other terms by iteration, causing elements in the intermediate results to be dropped if they don't evaluate to TRUE for the other term(s). Terms in AND expressions that use indexes should always be evaluated first before terms that require iteration; this minimizes the size of the iteration required.

        1. Projections on Indexes
          When computing the result set from an index lookup that contains a projection, the RegionEntry where the target objects are derived from should be kept as well. This result set may need to be intersected with the results of another index lookup that may or may not be projected, or may need to be iterated across with an expression that refers to data that is not in the projection. Each of these cases is described as follows.

          • If two index lookup result sets are intersected and they both have the same projection or they both have no projection, then do the intersection normally, keeping the context information (i.e. the RegionEntries) in the result.

          • If two index lookup result sets need to be intersected and one has a projection and the other does not, then apply the projection to the result set that does not have a projection and then do the intersection.

          • If an index lookup result set has no projection and an expression is being applied through iteration, then nothing special needs to be done.

          • If an index lookup result set does have a projection and an expression is being applied through iteration, then first determine if the expression refers to any iterator variable that is not available in the projection. If so, then the full (unprojected) base collection element(s) (i.e. the part of the cartesian product this entry contributes to) should be computed from the index results for each element using the RegionEntry instead of the projection retrieved from the index before the expression is applied to determine whether the projection(s) from the index should be added to the intermediate results.

        2. Choice of multiple Indexes
          Where there is an expression that is compatible with more than one available index, then if one index has a projection and the other does not, then prefer the index with the projection. In some cases an index for a complex expression is available as well as an index for a simpler expression that is part of the more complex expression. In this case the index for the more complex expression is preferred. [to do – provide example]

      3. Independent whereClause. In the corner case where the whereClause is not dependent at all on any of the iterators then iteration is also not necessary as the result will simply be the entire projected base collection or the empty set.
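The difference between iteration and filtered evaluation can be sketched for the example whereClause (status = 'active' AND sharesOutstanding > some bound). This is a simplified illustration: PosnRow and EvaluationSketch are hypothetical names, each row is assumed to be one already-flattened element of the portfolio/position cartesian product, and the index is a plain sorted map on sharesOutstanding with no projection.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

// Hypothetical flattened element of the base collection (portfolio x position).
final class PosnRow {
    final String id;
    final String portfolioStatus;
    final int sharesOutstanding;
    PosnRow(String id, String portfolioStatus, int sharesOutstanding) {
        this.id = id;
        this.portfolioStatus = portfolioStatus;
        this.sharesOutstanding = sharesOutstanding;
    }
}

final class EvaluationSketch {
    // Iteration: the whole whereClause is evaluated for every element of the base collection.
    static List<String> byIteration(List<PosnRow> base, int minShares) {
        List<String> out = new ArrayList<>();
        for (PosnRow row : base) {
            if (row.portfolioStatus.equals("active") && row.sharesOutstanding > minShares) {
                out.add(row.id);
            }
        }
        return out;
    }

    // Builds a sorted index on sharesOutstanding.
    static TreeMap<Integer, List<PosnRow>> buildIndex(List<PosnRow> base) {
        TreeMap<Integer, List<PosnRow>> index = new TreeMap<>();
        for (PosnRow row : base) {
            index.computeIfAbsent(row.sharesOutstanding, k -> new ArrayList<>()).add(row);
        }
        return index;
    }

    // Filtering: the indexed term runs first, then the other term iterates over its results only.
    static List<String> byFiltering(TreeMap<Integer, List<PosnRow>> index, int minShares) {
        List<PosnRow> intermediate = new ArrayList<>();
        for (List<PosnRow> rows : index.tailMap(minShares, false).values()) {
            intermediate.addAll(rows);
        }
        List<String> out = new ArrayList<>();
        for (PosnRow row : intermediate) {
            if (row.portfolioStatus.equals("active")) {
                out.add(row.id);
            }
        }
        return out;
    }
}
```

Both methods select the same elements; the filtered version simply iterates over the index's range result rather than the full base collection, which is why indexed terms are evaluated first.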

  5. Index Maintenance

    1. Synchronous
      Synchronous index maintenance implies that the thread that makes a modification to region data does not return until the indexes for that region are updated. This guarantees that a thread that makes modifications to a region and then does a query will get results that reflect the changes.

    2. Asynchronous
      Asynchronous index maintenance uses a background thread that does the index maintenance. Operations to update the index are queued as Runnables added to a QueuedExecutor, and a background thread takes operations off the queue and updates the indexes. Although the operations should be done in order with respect to a particular region, we should try to avoid using a thread per region. One thread per cache, however, may not be sufficient, so some compromise may need to be made with respect to the number of threads. The size of the index maintenance queue may need to be made configurable by the user to prevent index maintenance from lagging behind region updates too much. Other than the use of background threads, index maintenance is the same for both synchronous and asynchronous.
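As a rough illustration of queued, in-order maintenance, the sketch below uses the standard java.util.concurrent single-thread executor as a stand-in for the QueuedExecutor mentioned above (an assumption; the design does not name this class). AsyncMaintenanceSketch is a hypothetical name, the "index update" is reduced to recording the operation, and the executor's queue is unbounded, so the configurable queue size discussed above is not modeled.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Sketch: index updates are queued as Runnables and applied in order by one background thread.
final class AsyncMaintenanceSketch {
    private final ExecutorService worker = Executors.newSingleThreadExecutor();
    // Stand-in for the index: records the operations in the order they were applied.
    final List<String> applied = new CopyOnWriteArrayList<>();

    // Called by the thread that modified the region; returns without waiting for the update.
    void enqueue(String operation) {
        worker.execute(() -> applied.add(operation));
    }

    // Stops accepting work and waits for queued updates to finish; true if all completed in time.
    boolean drain() {
        worker.shutdown();
        try {
            return worker.awaitTermination(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }
}
```

A single worker preserves ordering trivially; using several threads, as the text anticipates, would require some way of keeping operations for the same region in order.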

    3. Upon region modification. When a region that has indexes is modified, an updateIndexes call is made to the region's index manager to update the indexes. A reference to the RegionEntry that is being modified is provided, and information regarding whether the modification was a create, destroy, or update. If it was an update, then including the “old value” is not necessary since the old data in the indexes is completely identified by the RegionEntry.

      1. Remove old data.
        If the operation is not a create, the old data that is associated with the RegionEntry is removed from the indexes, using the reverse map of the index.

      2. Add new data.
        If the operation is not a destroy, the new data is calculated and added to the index in both the forward and reverse maps. If it is a destroy, then skip the next section on computing the new data.

      3. New Data: Compute the new index values and target values.
        Execute the IMQ using the same procedure that was used for index creation, but only use the one created or updated RegionEntry to construct a QRegion. This provides the index values and target values to use to update the index for this entry.
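The remove-old/add-new procedure can be sketched as follows. The names MaintEntry and MaintainedIndexSketch are hypothetical, the indexed values are integers, and executing the IMQ against a one-entry QRegion is replaced by simply reading a list of values from the entry, with the entry's key as the target.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

// Hypothetical stand-in for RegionEntry with a mutable value.
final class MaintEntry {
    final String key;
    List<Integer> shares; // sharesOutstanding of each position in this entry's value
    MaintEntry(String key, List<Integer> shares) {
        this.key = key;
        this.shares = shares;
    }
}

final class MaintainedIndexSketch {
    enum Op { CREATE, UPDATE, DESTROY }

    final TreeMap<Integer, Map<MaintEntry, Object>> forward = new TreeMap<>();
    final Map<MaintEntry, Set<Integer>> reverse = new HashMap<>();

    void updateIndexes(MaintEntry entry, Op op) {
        if (op != Op.CREATE) {
            // Remove old data: the reverse map says which forward keys reference this entry.
            Set<Integer> oldKeys = reverse.remove(entry);
            if (oldKeys != null) {
                for (Integer k : oldKeys) {
                    Map<MaintEntry, Object> perEntry = forward.get(k);
                    perEntry.remove(entry);
                    if (perEntry.isEmpty()) {
                        forward.remove(k);
                    }
                }
            }
        }
        if (op != Op.DESTROY) {
            // Add new data: this loop stands in for executing the IMQ on a one-entry region.
            for (Integer k : entry.shares) {
                forward.computeIfAbsent(k, x -> new HashMap<>()).put(entry, entry.key);
                reverse.computeIfAbsent(entry, x -> new HashSet<>()).add(k);
            }
        }
    }
}
```

Because the reverse map identifies every stale forward-map key for the entry, the caller never needs to supply the old value.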

    4. Concurrency
      Each index has a ReadWriteLock. An instance of the backport of the JDK 1.5 ReentrantReadWriteLock can be used for this lock. The read lock allows multiple readers and the write lock is exclusive. During index maintenance, all the indexes are write-locked up front. When a query needs to use an index, it obtains a read lock on the index while it uses it.
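The locking discipline can be sketched with the standard java.util.concurrent.locks.ReentrantReadWriteLock (used here in place of the backport, as an assumption). LockedIndexSketch is a hypothetical name and the index is reduced to a single sorted map.

```java
import java.util.TreeMap;
import java.util.concurrent.locks.ReadWriteLock;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Sketch: one ReadWriteLock per index; maintenance writes exclusively, queries read concurrently.
final class LockedIndexSketch {
    private final ReadWriteLock lock = new ReentrantReadWriteLock();
    private final TreeMap<Integer, String> forward = new TreeMap<>();

    // Index maintenance takes the exclusive write lock.
    void maintain(int indexedValue, String target) {
        lock.writeLock().lock();
        try {
            forward.put(indexedValue, target);
        } finally {
            lock.writeLock().unlock();
        }
    }

    // A query holds the shared read lock for as long as it uses the index.
    int countGreaterThan(int bound) {
        lock.readLock().lock();
        try {
            return forward.tailMap(bound, false).size();
        } finally {
            lock.readLock().unlock();
        }
    }
}
```

Releasing each lock in a finally block keeps the index usable even if maintenance or a lookup throws.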

  6. Futures

    1. Improved Concurrency During Region Maintenance

      1. Multiple Reader, Multiple Writer ReadWriteLock
        <TBD>




