



bigdata®



bigdata® is a scale-out data and computing fabric designed for commodity hardware. Scale-out is achieved using key-range partitioned B+Trees and distributed computing. The architecture supports both embedded and scale-out database applications. Unisolated transactions are supported and provide extremely high read-write concurrency when used as a sparse row store. In addition, read-committed, read-only, and fully isolated read-write transactions are supported using Multi-Version Concurrency Control (MVCC).
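
To make the MVCC claim concrete, here is a minimal sketch of write-write conflict detection with a per-key version counter, the mechanism the {@link com.bigdata.isolation} description below attributes to its values. The class and method names (VersionedStore, Tx, begin, put, commit) are hypothetical and are not bigdata's API; the sketch is single-threaded and records the base version at the first write rather than at transaction start.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical in-memory store; NOT the bigdata API. Not thread-safe. */
final class VersionedStore {
    private final Map<String, String> values = new HashMap<>();
    // Committed version counter per key; an absent key counts as version 0.
    private final Map<String, Long> versions = new HashMap<>();

    /** A transaction that buffers its writes until commit. */
    final class Tx {
        private final Map<String, String> writes = new HashMap<>();
        private final Map<String, Long> baseVersions = new HashMap<>();

        void put(String key, String value) {
            // Remember the committed version seen when this key was first written.
            baseVersions.putIfAbsent(key, versions.getOrDefault(key, 0L));
            writes.put(key, value);
        }

        /** Returns false, applying nothing, if another commit changed a written key. */
        boolean commit() {
            for (Map.Entry<String, Long> e : baseVersions.entrySet()) {
                long current = versions.getOrDefault(e.getKey(), 0L);
                if (current != e.getValue()) {
                    return false; // write-write conflict detected
                }
            }
            for (Map.Entry<String, String> e : writes.entrySet()) {
                values.put(e.getKey(), e.getValue());
                versions.merge(e.getKey(), 1L, Long::sum); // bump the counter
            }
            return true;
        }
    }

    Tx begin() {
        return new Tx();
    }

    String get(String key) {
        return values.get(key);
    }
}
```

When two transactions write the same key, the first to commit succeeds and the second fails validation because the committed counter no longer matches the version it recorded.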

Services

The bigdata architecture is broken down into several services:

data service
The data service provides an API for reading from and writing to index partitions.
metadata service
The metadata service manages and locates index partitions.
transaction service
The transaction service coordinates transaction starts and commits and provides the integration point for both unisolated and isolated transactions.
map/reduce service
The map/reduce service provides an API for decomposing a problem across the distributed database.
The current release supports distributed services using JINI, but interest has been expressed in supporting other distributed application architectures as well, including OSGi and Service Component Architecture (SCA).
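
As an illustration of what "locating index partitions" can mean for a key-range partitioned index, the sketch below maps the first key of each partition to the name of the data service that holds it and finds the owner of a key with a floor lookup. PartitionLocator, addPartition, and locate are hypothetical names chosen for this example, not the {@link com.bigdata.service.IMetadataService} API, and the unsigned byte ordering of keys is an assumption.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.Comparator;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical key-range locator; NOT the bigdata metadata service API. */
final class PartitionLocator {
    // Assumption: keys are ordered as unsigned bytes (requires Java 9+).
    private static final Comparator<byte[]> UNSIGNED = Arrays::compareUnsigned;

    // Maps the first key of each partition to the data service holding it.
    private final TreeMap<byte[], String> partitions = new TreeMap<>(UNSIGNED);

    /** Registers a partition covering keys from fromKey up to the next registered key. */
    void addPartition(byte[] fromKey, String dataService) {
        partitions.put(fromKey.clone(), dataService);
    }

    /** Returns the data service whose key range contains the given key. */
    String locate(byte[] key) {
        // floorEntry finds the partition with the greatest first key <= key.
        Map.Entry<byte[], String> e = partitions.floorEntry(key);
        if (e == null) {
            throw new IllegalStateException("no partition covers this key");
        }
        return e.getValue();
    }

    static byte[] utf8(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }
}
```

Registering the empty key as the first partition guarantees that every key has an owner; splitting a partition then amounts to registering one more separator key.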

Note: Readers familiar with Google's research publications or with the Apache Hadoop effort will recognize some similarities and some differences. For example, both Google and Hadoop use a distributed file system for failover. While bigdata may be deployed in a similar manner using a third-party distributed file system, it also offers a store-level media replication strategy for addressing failover.

Packages

{@link com.bigdata.cache}
A set of utility classes for creating weak reference object caches.
{@link com.bigdata.io}
A set of utility classes for I/O.
{@link com.bigdata.util}
A set of utility classes.
{@link com.bigdata.rawstore}
A set of interfaces and utility classes defining the low-level protocol for operations on a persistence store. Operations at this level are expressed in terms of byte[] records and an "address" combining both the offset at which a record was written and the length of the record.
{@link com.bigdata.btree}
This package provides implementations of both a mutable B+Tree and a read-only B+Tree. The mutable {@link com.bigdata.btree.BTree} supports variable length byte[] keys, uses a copy-on-write strategy for nodes and leaves to support transactional isolation, and remains balanced under both insert and delete operations. A B+Tree may be exported into a read-only {@link com.bigdata.btree.IndexSegment} using an efficient bulk index build utility.
{@link com.bigdata.isolation}
This package provides specialized B+Tree classes designed to support transactional isolation. This builds on the features of the base B+Tree package, which already supports copy-on-write semantics, and on the Journal package, which already supports a policy in which valid data are never overwritten. The primary contribution of this package is a set of extensions and wrapper classes that manage {@link com.bigdata.isolation.IValue} objects wrapping application data values. Each {@link com.bigdata.isolation.IValue} encapsulates a version counter, which is used to detect write-write conflicts, and a deleted flag, which is used to mark keys that have been deleted until a full compacting merge can be performed.
{@link com.bigdata.sparse}
This package provides a sparse row store data model similar to Google's Bigtable or the HBase component in the Apache Hadoop project. A sparse row store is a data model in which the B+Tree keys are formed as: [schemaName][primaryKey][columnName][timestamp].
{@link com.bigdata.journal}
This package provides a fast append-only persistence store. The journal is designed to minimize disk head movement and maximize the opportunity for sequential IO. Typically, multiple indices are mapped onto the same journal in order to minimize the number of distinct disk files and disk seeks on a server platform.
{@link com.bigdata.service}
This package realizes the services for a distributed scale-out database. The basic components of the scale-out architecture are the {@link com.bigdata.service.IDataService} and {@link com.bigdata.service.IMetadataService}.
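
The {@link com.bigdata.rawstore} entry above describes an "address" that combines a record's offset and length. A minimal sketch of that idea over an append-only in-memory buffer follows. SimpleRawStore is a hypothetical class, and the bit layout (offset in the high 32 bits, length in the low 32 bits of a long) is an assumption made for illustration; the source does not state bigdata's actual encoding.

```java
import java.io.ByteArrayOutputStream;
import java.util.Arrays;

/** Hypothetical append-only record store; NOT the bigdata rawstore API. */
final class SimpleRawStore {
    private final ByteArrayOutputStream buf = new ByteArrayOutputStream();

    /** Appends the record and returns an address packing its offset and length. */
    long write(byte[] record) {
        if (record.length == 0) {
            throw new IllegalArgumentException("empty record");
        }
        int offset = buf.size(); // records are only ever appended
        buf.write(record, 0, record.length);
        // Assumed layout: high 32 bits = offset, low 32 bits = length.
        return ((long) offset << 32) | record.length;
    }

    /** Decodes the address and returns a copy of the record it names. */
    byte[] read(long addr) {
        int offset = (int) (addr >>> 32);
        int length = (int) addr; // low 32 bits
        byte[] all = buf.toByteArray();
        if (offset < 0 || length <= 0 || (long) offset + length > all.length) {
            throw new IllegalArgumentException("invalid address: " + addr);
        }
        return Arrays.copyOfRange(all, offset, offset + length);
    }
}
```

Because the length travels with the offset, a reader needs no per-record header to know how many bytes to fetch, and existing records are never overwritten by later writes.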




