
Parallel Colt is a multithreaded version of Colt - a library for high performance scientific computing in Java. It contains efficient algorithms for data analysis, linear algebra, multi-dimensional arrays, Fourier transforms, statistics and histogramming.


Function Objects

Example 4: Sorting by user specified order 
Assume we would like to sort the rows of a 2d matrix by the last column (representing "age"). This can be done with
// sort by last column
sorted = matrix.viewSorted(matrix.columns()-1);

Or assume we would like to sort the columns of a 2d matrix by the last row.
Unfortunately, there is no convenience method to sort directly by row. So we need to view columns as rows and rows as columns, then sort, then swap the view back.
// sort by last row
int lastRow = matrix.rows()-1;
sorted = matrix.viewDice().viewSorted(lastRow).viewDice();
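
The transpose, sort, transpose pattern above can be illustrated without the library. The following is a minimal sketch on plain double[][] arrays, assuming a non-empty rectangular matrix; the class and method names are hypothetical, and unlike viewDice() and viewSorted(), which return views, it copies the data.

```java
import java.util.Arrays;
import java.util.Comparator;

// Hypothetical helper, not part of Colt: sorts the columns of a matrix by one row.
final class SortColumnsByLastRow {

    /** Returns a transposed copy: result[c][r] == m[r][c]. */
    static double[][] transpose(double[][] m) {
        int rows = m.length, cols = m[0].length;
        double[][] t = new double[cols][rows];
        for (int r = 0; r < rows; r++) {
            for (int c = 0; c < cols; c++) {
                t[c][r] = m[r][c];
            }
        }
        return t;
    }

    /** Returns a copy whose rows are ordered by the value in the given column. */
    static double[][] sortRowsByColumn(double[][] m, int column) {
        double[][] copy = new double[m.length][];
        for (int r = 0; r < m.length; r++) {
            copy[r] = m[r].clone();
        }
        Arrays.sort(copy, Comparator.comparingDouble((double[] row) -> row[column]));
        return copy;
    }

    /** Sorts columns by the given row: transpose, sort rows, transpose back. */
    static double[][] sortColumnsByRow(double[][] m, int row) {
        return transpose(sortRowsByColumn(transpose(m), row));
    }

    public static void main(String[] args) {
        double[][] m = { {1, 2, 3}, {30, 10, 20} };
        // prints [[2.0, 3.0, 1.0], [10.0, 20.0, 30.0]]
        System.out.println(Arrays.deepToString(sortColumnsByRow(m, m.length - 1)));
    }
}
```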

Next, we would like to sort the rows of a 2d matrix by the aggregate sum 
  of values in a row. A comparator object is used to do the job: 
// sort by sum of values in a row
DoubleMatrix1DComparator comp = new DoubleMatrix1DComparator() {
	public int compare(DoubleMatrix1D a, DoubleMatrix1D b) {
		double as = a.zSum(); double bs = b.zSum();
		return as < bs ? -1 : as == bs ? 0 : 1;
	}
};
sorted = cern.colt.matrix.tdouble.algo.DoubleSorting.quickSort.sort(matrix, comp);

Further, we would like to sort the rows of a 2d matrix by the aggregate sum of 
  logarithms in a row (which is a way to achieve sorting by geometric mean 
  when viewing a row as a series of samples). A slightly more complex comparator 
  object is needed: 
// sort by sum of logarithms in a row
DoubleMatrix1DComparator comp = new DoubleMatrix1DComparator() {
	public int compare(DoubleMatrix1D a, DoubleMatrix1D b) {
		double as = a.aggregate(cern.jet.math.tdouble.DoubleFunctions.plus, cern.jet.math.tdouble.DoubleFunctions.log);
		double bs = b.aggregate(cern.jet.math.tdouble.DoubleFunctions.plus, cern.jet.math.tdouble.DoubleFunctions.log);
		return as < bs ? -1 : as == bs ? 0 : 1;
	}
};
sorted = cern.colt.matrix.tdouble.algo.DoubleSorting.quickSort.sort(matrix, comp);
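
Why the sum of logarithms orders rows by geometric mean: for a row of n positive values the geometric mean equals exp(sumLog / n), and exp is strictly increasing, so rows of equal length compare the same way under both measures. The sketch below checks this on plain arrays; the class and method names are hypothetical, and strictly positive entries are assumed.

```java
// Hypothetical helper, not part of Colt: compares ordering by sum of logs
// with ordering by geometric mean for rows of equal length.
final class GeometricMeanOrder {

    /** Sum of natural logarithms of the row entries (entries assumed > 0). */
    static double sumLog(double[] row) {
        double s = 0.0;
        for (double v : row) {
            s += Math.log(v);
        }
        return s;
    }

    /** Geometric mean, computed as exp of the mean logarithm. */
    static double geometricMean(double[] row) {
        return Math.exp(sumLog(row) / row.length);
    }

    public static void main(String[] args) {
        double[] a = {1, 4, 16}; // product 64, geometric mean 4
        double[] b = {2, 2, 2};  // product 8, geometric mean 2
        int bySumLog = Double.compare(sumLog(a), sumLog(b));
        int byGeoMean = Double.compare(geometricMean(a), geometricMean(b));
        System.out.println("same ordering: " + (bySumLog == byGeoMean));
    }
}
```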

This is certainly not the most efficient approach, since the row aggregates are recomputed many times
(about 2*rows*log(rows) times, on average), but it will suffice as an example.
An efficient application will precompute the aggregates and use cern.colt.GenericSorting
to sort the matrix. In general, if comparisons are expensive, precomputation boosts
performance by a factor of about 2*log(rows).
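
The effect of precomputation can be seen in a small sketch that does not use the library at all. It assumes plain double[][] rows and java.util.Arrays.sort in place of cern.colt.GenericSorting; the class, method, and counter names are hypothetical. The counter records how often a row sum is evaluated under each strategy.

```java
import java.util.Arrays;
import java.util.Comparator;

// Hypothetical helper, not part of Colt: contrasts recomputing the sort key
// in every comparison with computing it once per row.
final class PrecomputedRowSort {

    /** Number of row-sum evaluations since the last reset. */
    static int keyEvaluations = 0;

    static double rowSum(double[] row) {
        keyEvaluations++;
        double s = 0.0;
        for (double v : row) {
            s += v;
        }
        return s;
    }

    /** Comparator-based sort: evaluates two row sums per comparison. */
    static double[][] sortRecomputing(double[][] m) {
        double[][] copy = m.clone(); // shallow copy; the rows themselves are shared
        Arrays.sort(copy, (a, b) -> Double.compare(rowSum(a), rowSum(b)));
        return copy;
    }

    /** Evaluates each row sum exactly once, then sorts row indexes by those keys. */
    static double[][] sortPrecomputed(double[][] m) {
        int n = m.length;
        double[] keys = new double[n];
        Integer[] order = new Integer[n];
        for (int i = 0; i < n; i++) {
            keys[i] = rowSum(m[i]);
            order[i] = i;
        }
        Arrays.sort(order, Comparator.comparingDouble((Integer i) -> keys[i]));
        double[][] sorted = new double[n][];
        for (int i = 0; i < n; i++) {
            sorted[i] = m[order[i]];
        }
        return sorted;
    }

    public static void main(String[] args) {
        double[][] m = { {8}, {3}, {5}, {1}, {7}, {2}, {6}, {4} };
        keyEvaluations = 0;
        sortRecomputing(m);
        System.out.println("recomputing: " + keyEvaluations + " evaluations");
        keyEvaluations = 0;
        sortPrecomputed(m);
        System.out.println("precomputed: " + keyEvaluations + " evaluations");
    }
}
```

With precomputation the number of evaluations equals the number of rows; the comparator version performs two evaluations per comparison, so its count grows with the number of comparisons the sort needs.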
 Recently,
  two methods that do exactly that were added to cern.colt.matrix.tdouble.algo.DoubleSorting.
  One of them works by filling a row into a so-called "bin", which is a multi-set
  with statistics operations defined on it. Aggregate measures over the row are
  then computed via a DoubleBinFunction1D.
  Some prefabricated functions are contained in DoubleBinFunctions1D.
  Here is how to solve the problem efficiently:
// sort by sum of logarithms in a row
sorted = cern.colt.matrix.tdouble.algo.DoubleSorting.quickSort.sort(matrix, hep.aida.tdouble.bin.DoubleBinFunctions1D.sumLog);

// sort by median in a row
sorted = cern.colt.matrix.tdouble.algo.DoubleSorting.quickSort.sort(matrix, hep.aida.tdouble.bin.DoubleBinFunctions1D.median);

// sort by maximum in a row
sorted = cern.colt.matrix.tdouble.algo.DoubleSorting.quickSort.sort(matrix, hep.aida.tdouble.bin.DoubleBinFunctions1D.max);
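
The idea of passing an aggregate function to the sort can be mimicked in plain Java. The sketch below is only an analogy, assuming double[][] rows and java.util.function.ToDoubleFunction as a stand-in for DoubleBinFunction1D; it is not how the library implements bins. All names are hypothetical, and the median here averages the two middle values for rows of even length, which may differ from the library's definition.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.function.ToDoubleFunction;

// Hypothetical helper, not part of Colt: sorts rows by a pluggable aggregate
// that is evaluated once per row.
final class AggregateRowSort {

    /** Largest entry of a non-empty row. */
    static final ToDoubleFunction<double[]> MAX =
            row -> Arrays.stream(row).max().getAsDouble();

    /** Median of a non-empty row (mean of the two middle values if the length is even). */
    static final ToDoubleFunction<double[]> MEDIAN = row -> {
        double[] s = row.clone();
        Arrays.sort(s);
        int n = s.length;
        return n % 2 == 1 ? s[n / 2] : 0.5 * (s[n / 2 - 1] + s[n / 2]);
    };

    /** Sum of natural logarithms (entries assumed > 0). */
    static final ToDoubleFunction<double[]> SUM_LOG =
            row -> Arrays.stream(row).map(Math::log).sum();

    /** Returns the rows of m ordered by the precomputed aggregate of each row. */
    static double[][] sortByAggregate(double[][] m, ToDoubleFunction<double[]> aggregate) {
        int n = m.length;
        double[] keys = new double[n];
        Integer[] order = new Integer[n];
        for (int i = 0; i < n; i++) {
            keys[i] = aggregate.applyAsDouble(m[i]); // one evaluation per row
            order[i] = i;
        }
        Arrays.sort(order, Comparator.comparingDouble((Integer i) -> keys[i]));
        double[][] sorted = new double[n][];
        for (int i = 0; i < n; i++) {
            sorted[i] = m[order[i]];
        }
        return sorted;
    }

    public static void main(String[] args) {
        double[][] m = { {9, 1, 2}, {3, 4, 5}, {0, 8, 0} };
        System.out.println(Arrays.deepToString(sortByAggregate(m, MAX)));
        System.out.println(Arrays.deepToString(sortByAggregate(m, MEDIAN)));
    }
}
```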