Parallel Colt is a multithreaded version of Colt - a library for high performance scientific computing in Java. It contains efficient algorithms for data analysis, linear algebra, multi-dimensional arrays, Fourier transforms, statistics and histogramming.


Results of Single and Dual Processor Colt Matrix Benchmark
using the matrix package. For 
  more explanation of how to interpret the results and run the benchmarks on your own machines, 
  see the documentation of class BenchmarkMatrix.

   
    OS
    Linux
    Your config.
  
   
    OS Config.
    Red Hat 6.1, Kernel 2.2.12-20smp
     
  
   
    HW
    2 x PentiumIII@600 MHz, 512 MB, 32 KB L1, 
      2x256 KB L2 (lxplus012.cern.ch) 
     
  
   
    VM
    IBMJDK1.3, Classic VM, build cxdev-20000502, 
      jitc
     
  
   
    Performance
    here
     
  

Here are the results for the matrix-matrix multiply with one 
  thread and for the parallel version with two 
  threads.
Each operation is timed while varying the following parameters:

  Implementation type - DenseDoubleMatrix2D, SparseDoubleMatrix2D
  Density - the fraction of cells in non-zero state (randomly assigned)
  Size - all matrices are square with the given number of rows and 
    columns
  Computer Architecture, Operating System and Virtual Machine
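The density and size parameters can be illustrated with a small sketch. It uses plain double[][] arrays rather than the Colt matrix classes, and the class name DensitySketch, the method names, and the seed are hypothetical. It also assumes that "randomly assigned" means an exact number of randomly chosen non-zero cells, which may differ from what BenchmarkMatrix actually does.

```java
import java.util.Random;

// Hypothetical helper, not part of Colt: builds square test matrices of a given density.
class DensitySketch {

    /** Returns an n x n matrix in which round(density * n * n) randomly chosen cells are non-zero. */
    static double[][] randomSquare(int n, double density, long seed) {
        int cells = n * n;
        int nonZero = (int) Math.round(density * cells);
        int[] order = new int[cells];
        for (int i = 0; i < cells; i++) {
            order[i] = i;
        }
        Random rnd = new Random(seed);
        double[][] m = new double[n][n];
        // Partial Fisher-Yates shuffle: after step i, order[0..i] is a uniform random sample of cell indices.
        for (int i = 0; i < nonZero; i++) {
            int j = i + rnd.nextInt(cells - i);
            int tmp = order[i];
            order[i] = order[j];
            order[j] = tmp;
            // Values are drawn from [1, 2) so a chosen cell is never zero.
            m[order[i] / n][order[i] % n] = 1.0 + rnd.nextDouble();
        }
        return m;
    }

    /** Fraction of cells that are non-zero. */
    static double measuredDensity(double[][] m) {
        int count = 0;
        for (double[] row : m) {
            for (double v : row) {
                if (v != 0.0) {
                    count++;
                }
            }
        }
        return (double) count / ((double) m.length * m.length);
    }

    public static void main(String[] args) {
        double[][] m = randomSquare(100, 0.01, 42L);
        System.out.println("density = " + measuredDensity(m));
    }
}
```

At low densities most cells stay zero, which is the case a sparse implementation such as SparseDoubleMatrix2D is meant for.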

Methodology

  Measurements are given in Mops/sec (10^6 ops/sec) and Mflops/sec 
    (10^6 flops/sec). A[i,j]=B[k,l] counts as 1 op, whereas sum 
    += A[i,j]*B[k,l] counts as 2 flops. For sparse implementations, 
    Mops and Mflops are expressed relative to the dense baseline implementation: 
    if an operation on a dense matrix executes at 10 Mflops/sec but takes twice 
    as long to complete on a sparse matrix, the sparse matrix is said to have a 
    performance of 10/2=5 Mflops/sec.

  
  All machines are otherwise idle (no other load).

  
  Garbage collection is not invoked explicitly within or between runs (there 
    is not much to collect).

  
  Each operation is repeated for at least 2 seconds (see command line); the 
    mean of all repetitions is reported.

  
  Some parameter combinations that do not occur in practice (and would take 
    lots of memory and time) are not benchmarked; they appear in the tables as 
    NaNs (this is not an error). For example, it is possible to multiply 
    two matrices of type SparseDoubleMatrix2D that are in fact very 
    dense. However, this makes little sense; one would use DenseDoubleMatrix2D 
    for such purposes. 
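The counting and timing rules above can be sketched in a few lines of Java. This is an illustration under stated assumptions, not the BenchmarkMatrix implementation: the class MflopsSketch and all of its method names are hypothetical, the multiply is a naive triple loop over plain arrays, and System.nanoTime is used for timing although it was not available in the JDKs listed on this page.

```java
import java.util.Arrays;

// Hypothetical sketch of the measurement rules; not taken from the Colt sources.
class MflopsSketch {

    /** Flops for C = A x B with n x n operands: n^3 steps of sum += a*b at 2 flops each. */
    static double multiplyFlops(int n) {
        return 2.0 * n * n * n;
    }

    /** Converts a flop count and the mean seconds per call into Mflops/sec (10^6 flops/sec). */
    static double mflopsPerSec(double flops, double secondsPerCall) {
        return flops / secondsPerCall / 1e6;
    }

    /** Sparse rating relative to the dense baseline: the dense rate divided by the slowdown factor. */
    static double relativeRating(double denseRate, double denseSeconds, double sparseSeconds) {
        return denseRate / (sparseSeconds / denseSeconds);
    }

    /** Repeats the task until at least minSeconds have elapsed and returns the mean seconds per call. */
    static double meanSeconds(Runnable task, double minSeconds) {
        long start = System.nanoTime();
        long runs = 0;
        double elapsed;
        do {
            task.run();
            runs++;
            elapsed = (System.nanoTime() - start) / 1e9;
        } while (elapsed < minSeconds);
        return elapsed / runs;
    }

    /** Naive dense multiply c = a x b for square matrices of equal size. */
    static void multiply(double[][] a, double[][] b, double[][] c) {
        int n = a.length;
        for (int i = 0; i < n; i++) {
            for (int j = 0; j < n; j++) {
                double sum = 0.0;
                for (int k = 0; k < n; k++) {
                    sum += a[i][k] * b[k][j];
                }
                c[i][j] = sum;
            }
        }
    }

    public static void main(String[] args) {
        int n = 200;
        double[][] a = new double[n][n];
        double[][] b = new double[n][n];
        double[][] c = new double[n][n];
        for (int i = 0; i < n; i++) {
            Arrays.fill(a[i], 1.0);
            Arrays.fill(b[i], 1.0);
        }
        // Time the multiply for at least 2 seconds and report the mean rate.
        double secs = meanSeconds(() -> multiply(a, b, c), 2.0);
        System.out.printf("dense multiply: %.1f Mflops/sec%n", mflopsPerSec(multiplyFlops(n), secs));
    }
}
```

With the figures from the text, relativeRating(10.0, 1.0, 2.0) reproduces the 10/2=5 rating for a sparse run that takes twice as long as the dense one.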

  

Command line: java -Xmx400m cern.colt.matrix.bench.BenchmarkMatrix -file all
Below are some results from an old version, 1.0Beta4-1. They are of historical interest only.

   
    OS
    Linux
    Linux
    Linux
    Solaris
  
   
    OS Config.
    Red Hat 6.1, Kernel 2.2.12-20
    Red Hat 6.1, Kernel 2.2.12-20
    Red Hat 6.1, Kernel 2.2.12-20
    Solaris 2.6 (aka SunOS 5.6) 
  
   
    HW
    1 x PentiumIII@600 MHz, 128 MB, 32 KB L1, 256 KB L2 
      (linuxosdev.cern.ch) 
    1 x PentiumIII@600 MHz, 128 MB, 32 KB L1, 256 KB L2 
      (linuxosdev.cern.ch) 
    1 x PentiumIII@600 MHz, 128 MB, 32 KB L1, 256 KB L2 
      (linuxosdev.cern.ch) 
    Sun 450, 2 x Ultrasparc-II@400 MHz (1 CPU used), 256 
      MB, 32 KB L1, 4 MB L2 (shd70.cern.ch) 
  
   
    VM
    IBMJDK1.1.8 
    BlackdownJDK1.2.2RC3, Classic VM, native threads, sunwjit 
    
    SunInpriseJDK1.2.2RC1, Classic VM (build 1.2.2-I, green 
      threads, javacomp)
    SunJDK1.2.2, Classic VM
  
   
    Performance
    here
    here
    here
    here
