cern.colt.list.package.html

Parallel Colt is a multithreaded version of Colt - a library for high performance scientific computing in Java. It contains efficient algorithms for data analysis, linear algebra, multi-dimensional arrays, Fourier transforms, statistics and histogramming.


Resizable lists holding objects or primitive data types such as int, 
  double, etc. For non-resizable lists (1-dimensional matrices) see 
  package {@link cern.colt.matrix}.
Getting Started
1. Overview
The list package offers flexible, object-oriented abstractions that model dynamically 
  resizing lists holding objects or primitive data types such as int, 
  double, etc. It is designed to scale in terms of both performance 
  and memory requirements.
Features include: 


  Lists operating on objects as well as all primitive data types such as int, 
    double, etc.
  Compact representations
  A number of general purpose list operations including: adding, inserting, 
    removing, iterating, searching, sorting, extracting ranges and copying. All 
    operations are designed to perform well on mass data.
  Support for quick access to list elements. This is achieved by bounds-checking 
    and non-bounds-checking accessor methods as well as zero-copy transformations 
    to primitive arrays such as int[], double[], etc. 
  
  High-level algorithms can be used on primitive data types without any 
    space or time overhead. Operations on primitive arrays, Colt lists and JAL 
    algorithms can be freely mixed with zero copying overhead.

File-based I/O can be achieved through the standard Java built-in serialization 
  mechanism. All classes implement the {@link java.io.Serializable} interface. 
  However, the toolkit is entirely decoupled from advanced I/O. It provides data 
  structures and algorithms only. 
This toolkit borrows concepts and terminology from the Javasoft 
  Collections framework written by Josh Bloch and introduced in JDK 1.2. 
2. Introduction
Lists are fundamental to virtually any application. Large-scale resizable lists 
  are used, for example, in scientific computations, simulations, and database 
  management systems, to name just a few.

A list is a container holding elements that can be accessed via zero-based 
  indexes. Lists may be implemented in different ways (most commonly with arrays). 
  A resizable list automatically grows as elements are added. The lists of this 
  package do not automatically shrink. Shrinking must be triggered explicitly 
  by calling trimToSize().
Growing policy: A list implemented with arrays initially has a certain 
  initialCapacity (10 elements by default, but customizable upon instance 
  construction). As elements are added, this capacity may no longer be sufficient. 
  When a list grows automatically, its capacity is expanded to 1.5*currentCapacity. 
  Because the capacity grows geometrically, excessive resizing (which involves copying) is avoided.
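The growing policy can be sketched in a few lines. The class below is a hypothetical, self-contained illustration written against the JDK only; GrowableDoubleList and its method bodies are assumptions made for this example and are not Colt's implementation, whose exact rounding of the new capacity may differ.

```java
import java.util.Arrays;

/** Hypothetical illustration of the growing policy; not a Colt class. */
final class GrowableDoubleList {
    private double[] elements; // backing array; its length is the capacity
    private int size;          // number of valid elements

    GrowableDoubleList() {
        this(10); // default initial capacity stated in the text
    }

    GrowableDoubleList(int initialCapacity) {
        if (initialCapacity < 0) {
            throw new IllegalArgumentException("negative capacity: " + initialCapacity);
        }
        elements = new double[initialCapacity];
    }

    /** Appends a value, growing the backing array first if it is full. */
    void add(double value) {
        ensureCapacity(size + 1);
        elements[size++] = value;
    }

    /** Grows to 1.5 * currentCapacity, or to minCapacity if that is larger. */
    void ensureCapacity(int minCapacity) {
        if (minCapacity > elements.length) {
            int grown = elements.length + elements.length / 2;
            elements = Arrays.copyOf(elements, Math.max(grown, minCapacity));
        }
    }

    /** Shrinking never happens automatically; it must be requested. */
    void trimToSize() {
        if (size < elements.length) {
            elements = Arrays.copyOf(elements, size);
        }
    }

    int size() {
        return size;
    }

    int capacity() {
        return elements.length;
    }

    double get(int index) {
        if (index < 0 || index >= size) {
            throw new IndexOutOfBoundsException("index: " + index);
        }
        return elements[index];
    }
}
```

In this sketch, adding an eleventh element to a default-constructed instance grows the capacity from 10 to 15, and a subsequent trimToSize() reduces it to 11.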
Copying
 
Any list can be copied. A copy is equal to the original but entirely 
  independent of it, so changes in the copy are not reflected in the 
  original, and vice versa. 
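The "equal but independent" contract can be illustrated with a small sketch. CopyDemoList is a hypothetical fixed-size class invented for this example (it is not part of Colt); it only shows that a copy duplicates the backing array rather than sharing it.

```java
import java.util.Arrays;

/** Hypothetical fixed-size list used only to illustrate copy semantics. */
final class CopyDemoList {
    private final double[] elements;

    CopyDemoList(double... values) {
        // Clone so that the list never shares its array with the caller.
        this.elements = values.clone();
    }

    /** Returns a list with equal content and its own backing array. */
    CopyDemoList copy() {
        return new CopyDemoList(elements); // the constructor clones the array
    }

    double get(int index) {
        return elements[index];
    }

    void set(int index, double value) {
        elements[index] = value;
    }

    @Override
    public boolean equals(Object other) {
        return other instanceof CopyDemoList
                && Arrays.equals(elements, ((CopyDemoList) other).elements);
    }

    @Override
    public int hashCode() {
        return Arrays.hashCode(elements);
    }
}
```

Right after copying, original.equals(copy) holds. Once copy.set(0, 99.0) is called, the original still returns its old value at index 0 and the two lists are no longer equal.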
3. Organization of this package
Class naming follows the schema <ElementType><ImplementationTechnique>List. 
  For example, we have a {@link cern.colt.list.tdouble.DoubleArrayList}, which is a list 
  holding double elements implemented with double[] arrays. 

The classes for lists of a given value type are derived from a common abstract 
  base class named Abstract<ElementType>List. For example, 
  all lists operating on double elements are derived from {@link cern.colt.list.tdouble.AbstractDoubleList}, 
  which in turn is derived from an abstract base class tying together all lists 
  regardless of value type, {@link cern.colt.list.AbstractList}, which finally 
  is rooted in {@link cern.colt.list.AbstractCollection}. The abstract 
  base classes provide skeleton implementations for all but a few methods, so experimental 
  data layouts (such as compressed, sparse, linked, etc.) can easily be implemented 
  and inherit a rich set of functionality. Have a look at the javadoc tree 
  view to get the broad picture.
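The skeleton-implementation idea can be sketched as follows. Both class names and the choice of abstract methods are assumptions made for this illustration; the real Colt base classes define a different and much larger method set. The point is only that a new data layout implements a few primitive operations and inherits the rest.

```java
/** Hypothetical skeleton base class; only three methods are left abstract. */
abstract class SketchAbstractDoubleList {
    abstract int size();
    abstract double get(int index);
    abstract void set(int index, double value);

    // Everything below is inherited by every concrete data layout.

    /** Returns the index of the first occurrence of value, or -1 if absent. */
    int indexOf(double value) {
        for (int i = 0; i < size(); i++) {
            if (get(i) == value) {
                return i;
            }
        }
        return -1;
    }

    boolean contains(double value) {
        return indexOf(value) >= 0;
    }

    /** Reverses the list in place using only get and set. */
    void reverse() {
        for (int i = 0, j = size() - 1; i < j; i++, j--) {
            double tmp = get(i);
            set(i, get(j));
            set(j, tmp);
        }
    }
}

/** Hypothetical array-backed layout: implements only the three primitives. */
final class SketchDoubleArrayList extends SketchAbstractDoubleList {
    private final double[] elements;

    SketchDoubleArrayList(double[] elements) {
        this.elements = elements;
    }

    @Override
    int size() {
        return elements.length;
    }

    @Override
    double get(int index) {
        return elements[index];
    }

    @Override
    void set(int index, double value) {
        elements[index] = value;
    }
}
```

A sparse or linked layout would subclass the same base class in this sketch and get indexOf, contains and reverse without further work.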
4. Example usage
The following snippet fills a list, randomizes it, extracts the first half 
  of the elements, sums them up and prints the result. It is implemented entirely 
  with accessor methods.

     
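    import cern.colt.list.tdouble.AbstractDoubleList;
    import cern.colt.list.tdouble.DoubleArrayList;

    int s = 1000000;
    AbstractDoubleList list = new DoubleArrayList();
    for (int i=0; i<s; i++) { list.add((double)i); }
    list.shuffle();
    // partFromTo takes both indexes inclusive, hence the "- 1"
    AbstractDoubleList part = list.partFromTo(0,list.size()/2 - 1);
    double sum = 0.0;
    for (int i=0; i<part.size(); i++) { sum += part.get(i); }
    System.out.println(sum);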
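The snippet passes list.size()/2 - 1 as the upper index because partFromTo treats both indexes as inclusive. The sketch below mimics that convention on a plain double[] using only the JDK; InclusiveRangeSketch and its helpers are hypothetical names introduced for illustration and are not Colt methods.

```java
import java.util.Arrays;

/** Hypothetical helper showing inclusive-inclusive range extraction. */
final class InclusiveRangeSketch {
    private InclusiveRangeSketch() {
    }

    /** Copies source[from..to], with both bounds inclusive. */
    static double[] partFromTo(double[] source, int from, int to) {
        if (from < 0 || to >= source.length || from > to + 1) {
            throw new IndexOutOfBoundsException("from: " + from + ", to: " + to);
        }
        // Arrays.copyOfRange uses an exclusive upper bound, hence to + 1.
        return Arrays.copyOfRange(source, from, to + 1);
    }

    static double sum(double[] values) {
        double total = 0.0;
        for (double v : values) {
            total += v;
        }
        return total;
    }
}
```

For a six-element array, partFromTo(values, 0, values.length/2 - 1) yields the first three elements.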



For efficiency, all classes provide back doors for getting and setting the 
  backing array directly. In this way, the high-level operations of these classes 
  can be used where appropriate, and one can switch to [] array indexing 
  where necessary. The key methods for this are public <ElementType>[] 
  elements() and public void elements(<ElementType>[]). The 
  former trustingly returns the array the list keeps internally to store its elements. 
  With this array in hand, we can use the [] operator to 
  iterate over large lists without copying the array or paying 
  the performance penalty introduced by accessor methods. Alternatively, any JAL 
  algorithm (or other algorithm) can operate on the returned primitive array. 
  The latter method forces a list to internally hold a user-provided array, 
  which avoids copying the elements into the list. 
As a consequence, operations on primitive arrays, Colt lists and JAL algorithms 
  can freely be mixed at zero-copy overhead.
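The two back-door methods can be sketched as below. BackedDoubleList is a hypothetical stand-in written against the JDK only, not the Colt class. It assumes that the returned array has the list's capacity as its length, so it may be longer than size(), and that an adopted array sets both size and capacity to the array's length.

```java
import java.util.Arrays;

/** Hypothetical list exposing its backing array; not a Colt class. */
final class BackedDoubleList {
    private double[] elements;
    private int size;

    BackedDoubleList(int initialCapacity) {
        elements = new double[initialCapacity];
    }

    void add(double value) {
        if (size == elements.length) {
            int grown = elements.length + elements.length / 2;
            elements = Arrays.copyOf(elements, Math.max(grown, size + 1));
        }
        elements[size++] = value;
    }

    /** Back door: returns the internal array itself, not a copy. */
    double[] elements() {
        return elements;
    }

    /** Back door: adopts the given array as storage, without copying. */
    void elements(double[] array) {
        elements = array;
        size = array.length;
    }

    int size() {
        return size;
    }

    double get(int index) {
        if (index < 0 || index >= size) {
            throw new IndexOutOfBoundsException("index: " + index);
        }
        return elements[index];
    }
}
```

Because the array is shared, a write through either side is visible through the other. When iterating over the raw array, loop up to size() rather than to the array's length, since slots beyond size() hold no valid elements. A reference obtained from elements() also becomes stale once the list reallocates its storage.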
 Note that such special treatment certainly breaks encapsulation. This functionality 
  is provided for performance reasons only and should only be used when absolutely 
  necessary. Here is the above example in mixed notation: 

     
    import cern.colt.list.tdouble.DoubleArrayList;

    int s = 1000000;
    DoubleArrayList list = new DoubleArrayList(s); // list.size()==0, capacity==s
    list.setSize(s); // list.size()==s
    double[] values = list.elements(); // zero copy, values.length==s
    for (int i=0; i<s; i++) { values[i]=(double)i; }
    list.shuffle();
    double sum = 0.0;
    int limit = values.length/2;
    for (int i=0; i<limit; i++) { sum += values[i]; }
    System.out.println(sum);



Or, even more compactly, using lists as algorithm objects: 

     
    import cern.colt.list.tdouble.DoubleArrayList;

    int s = 1000000;
    double[] values = new double[s];
    for (int i=0; i<s; i++) { values[i]=(double)i; }
    new DoubleArrayList(values).shuffle(); // zero-copy, shuffle via back door
    double sum = 0.0;
    int limit = values.length/2;
    for (int i=0; i<limit; i++) { sum += values[i]; }
    System.out.println(sum);

    

 
5. Notes 
The quicksorts and mergesorts are the JDK 1.2 V1.26 algorithms, modified as 
  necessary to operate on the given data types.