com.bigdata.btree.proc.IIndexProcedure Maven / Gradle / Ivy

Go to download
Show more of this group Show more artifacts with this name
Show all versions of bigdata-core Show documentation
Blazegraph(TM) DB Core Platform. It contains all Blazegraph DB dependencies other than Blueprints.
There is a newer version: 2.1.4
/*

Copyright (C) SYSTAP, LLC DBA Blazegraph 2006-2016.  All rights reserved.

Contact:
     SYSTAP, LLC DBA Blazegraph
     2501 Calvert ST NW #106
     Washington, DC 20008
     [email protected]

This program is free software; you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation; version 2 of the License.

This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
GNU General Public License for more details.

You should have received a copy of the GNU General Public License
along with this program; if not, write to the Free Software
Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA  02111-1307  USA

*/
/*
 * Created on Mar 15, 2007
 */

package com.bigdata.btree.proc;

import java.io.Serializable;

import com.bigdata.btree.IIndex;
import com.bigdata.btree.IRangeQuery;
import com.bigdata.btree.ISimpleBTree;
import com.bigdata.journal.IReadOnly;
import com.bigdata.service.DataService;
import com.bigdata.service.IDataService;
import com.bigdata.service.ndx.ClientIndexView;
import com.bigdata.sparse.SparseRowStore;

/**
 * An arbitrary procedure run against a single index.
 * 
 * There is a fairly direct correspondence between map / reduce processing and
 * the use of index procedures. The "map" phase corresponds to the
 * {@link IIndexProcedure} while the "reduce" phase corresponds to the
 * {@link IResultHandler}. The main difference is that index procedures operate
 * on ordered data. {@link IIndexProcedure}s MAY use
 * {@link IParallelizableIndexProcedure} to declare that they can be executed in
 * parallel when mapped across multiple index partitions - parallelization is
 * handled automatically by {@link ClientIndexView}. Otherwise it is assumed
 * that the procedure requires sequential execution as it is applied to the
 * relevant index partition(s) - again this is handled automatically.
 * 

 * There are basically three kinds of index procedures, each of which determines
 * where an index procedure will be mapped in a different manner:
 * 

 * 
 * A single key, byte[] key
 * 
 * Examples are when reading a logical row from a {@link SparseRowStore} or
 * performing a datum {@link ISimpleBTree#lookup(Object)}. These procedures are
 * always directed to a single index partition. Since they are never mapped
 * across index partitions, there is no "aggregation" phase. Likewise, there is
 * no {@link AbstractKeyArrayIndexProcedureConstructor} since the procedure instance is
 * always created directly by the application.
 * 
 * A key range, byte[] toKey, byte[] fromKey
 * 
 * Examples are {@link IRangeQuery#rangeCount(byte[], byte[])} and
 * {@link IRangeQuery#rangeIterator(byte[], byte[]). Since the same data is sent
 * to each index partition there is no requirement for an object to construct
 * instances of these procedures. However, the procedures do need a means to
 * combine or aggregate the results from each index partition. See
 * {@link IKeyRangeIndexProcedure}.
 * 
 * A set of keys, byte[][] keys
 * 
 * Examples are {@link BatchInsert}, {@link BatchLookup}, etc. For this
 * case the procedure is not only mapped across one or more index partitions,
 * but the procedure's data must be split such that each index partition is sent
 * only the data that is relevant to that index partition. See
 * {@link IKeyArrayIndexProcedure}. 
 * 
 * 
 * 
 * Note: this interface extends {@link Serializable}, however that provides
 * only for communicating state to the {@link IDataService}. If an instance of
 * this procedure will cross a network interface, then the implementation class
 * MUST be available to the {@link IDataService} on which it will execute. This
 * can be as simple as bundling the procedure into a JAR that is part of the
 * CLASSPATH used to start a {@link DataService} or you can use downloaded code
 * with the JINI codebase mechanism (java.rmi.server.codebase).
 * 
 * @author Bryan Thompson
 */
public interface IIndexProcedure extends IReadOnly, Serializable {
    
    /**
     * Run the procedure.
     * 
     * Note: Unisolated procedures have "auto-commit" ACID properties for a
     * local index only. In order for a distributed procedure to be ACID, the
     * procedure MUST be executed within a fully isolated transaction.
     * 
     * @param ndx
     *            The index.
     * 
     * @return The result, which is entirely defined by the procedure
     *         implementation and which MAY be null. In general,
     *         this MUST be {@link Serializable} since it may have to pass
     *         across a network interface.
     */
    public T apply(IIndex ndx);
	
}