Parallel Colt is a multithreaded version of Colt - a library for high performance scientific computing in Java. It contains efficient algorithms for data analysis, linear algebra, multi-dimensional arrays, Fourier transforms, statistics and histogramming.
Results of Single and Dual Processor Colt Matrix Benchmark
The benchmark was run using the matrix package. For more explanation of how
to interpret the results and run the benchmark on your own machines,
see the documentation of class BenchmarkMatrix.
| OS | Linux | Your config. |
|---|---|---|
| OS Config. | Red Hat 6.1, Kernel 2.2.12-20smp | |
| HW | 2 x PentiumIII@600 MHz, 512 MB, 32 KB L1, 2x256 KB L2 (lxplus012.cern.ch) | |
| VM | IBMJDK1.3, Classic VM, build cxdev-20000502, jitc | |
| Performance | here | |
Here are the results for the matrix-matrix multiply with one
thread and for the parallel version with two
threads.
Each operation is timed while varying the following parameters:
- Implementation type - DenseDoubleMatrix2D, SparseDoubleMatrix2D
- Density - the fraction of cells in non-zero state (randomly assigned)
- Size - all matrices are square with the given number of rows and
columns
- Computer Architecture, Operating System and Virtual Machine
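To make the density parameter concrete, here is a minimal sketch in plain Java that builds a square matrix in which a given fraction of randomly chosen cells is non-zero. It does not use the Colt classes and is not the code BenchmarkMatrix runs; the class name `DensityFill`, its methods, and the choice to fill an exact number of cells (rather than deciding per cell at random) are assumptions made for illustration.

```java
import java.util.Random;

// Hypothetical helper, not part of Colt: illustrates "density" on a plain double[][].
class DensityFill {

    /** Returns a size x size matrix with round(density * size^2) randomly chosen non-zero cells. */
    static double[][] randomSquare(int size, double density, long seed) {
        if (density < 0.0 || density > 1.0) {
            throw new IllegalArgumentException("density must be in [0, 1]");
        }
        int cells = size * size;
        int nonZero = (int) Math.round(density * cells);

        // Fisher-Yates shuffle of all cell indices; the first nonZero entries get values.
        int[] idx = new int[cells];
        for (int i = 0; i < cells; i++) idx[i] = i;
        Random rnd = new Random(seed);
        for (int i = cells - 1; i > 0; i--) {
            int j = rnd.nextInt(i + 1);
            int t = idx[i];
            idx[i] = idx[j];
            idx[j] = t;
        }

        double[][] m = new double[size][size];
        for (int k = 0; k < nonZero; k++) {
            // 1.0 - nextDouble() lies in (0, 1], so the stored value is never exactly zero.
            m[idx[k] / size][idx[k] % size] = 1.0 - rnd.nextDouble();
        }
        return m;
    }

    /** Fraction of cells that are non-zero. */
    static double density(double[][] m) {
        int nonZero = 0;
        int cells = 0;
        for (double[] row : m) {
            for (double v : row) {
                cells++;
                if (v != 0.0) nonZero++;
            }
        }
        return cells == 0 ? 0.0 : (double) nonZero / cells;
    }

    public static void main(String[] args) {
        double[][] m = randomSquare(100, 0.01, 42L);
        System.out.println("measured density = " + density(m));
    }
}
```

At low densities most cells stay zero, which is the case a sparse implementation such as SparseDoubleMatrix2D is meant for.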
Methodology
- Measurements given in Mops/sec (10^6 ops/sec) and Mflops/sec
(10^6 flops/sec). A[i,j]=B[k,l] counts as 1 op whereas sum
+= A[i,j]*B[k,l] counts as 2 flops. For sparse implementations
Mops and Mflops are expressed relative to the dense baseline implementation:
If an operation on a dense matrix executes at 10 Mflops/sec but takes 2 times
longer to complete on a sparse matrix, the sparse matrix is said to have a
performance of 10/2=5 Mflops.
- All machines are otherwise idle (no other jobs are running).
- No explicit invocation of garbage collection within and between runs (there
is not much to collect).
- Each operation is repeated for at least 2 seconds (see command line); the
mean of all repetitions is reported.
- Some parameter combinations that do not occur in practice (but would take
lots of memory and time) are not benchmarked; they appear in the tables as
NaN (this is not an error). For example, it is possible to multiply
two matrices of type SparseDoubleMatrix2D which are in fact very
dense. However, that makes little sense; one would use DenseDoubleMatrix2D
for such purposes.
Command line: java -Xmx400m cern.colt.matrix.bench.BenchmarkMatrix -file all
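The measurement rules above can be sketched in plain Java as follows. This is an illustration under stated assumptions, not the BenchmarkMatrix implementation: the class `MflopsSketch` and all of its methods are hypothetical, the multiply is a naive triple loop on `double[][]`, and the flop count of 2n^3 for an n x n multiply follows from counting each `sum += A[i,j]*B[k,l]` as 2 flops. The `relativeRate` method reproduces the sparse convention: a dense rate of 10 Mflops/sec and a sparse run that takes twice as long give 10/2 = 5 Mflops.

```java
import java.util.Random;

// Hypothetical sketch of the timing methodology; not the actual benchmark code.
class MflopsSketch {

    /** Naive dense multiply C = A * B for square matrices: n^3 multiply-adds, i.e. 2 * n^3 flops. */
    static double[][] multiply(double[][] a, double[][] b) {
        int n = a.length;
        double[][] c = new double[n][n];
        for (int i = 0; i < n; i++) {
            for (int k = 0; k < n; k++) {
                double aik = a[i][k];
                for (int j = 0; j < n; j++) {
                    c[i][j] += aik * b[k][j]; // one multiply and one add: 2 flops
                }
            }
        }
        return c;
    }

    /** Repeats op until at least minSeconds have elapsed and returns the mean seconds per repetition. */
    static double meanSeconds(Runnable op, double minSeconds) {
        long start = System.nanoTime();
        long runs = 0;
        double elapsed;
        do {
            op.run();
            runs++;
            elapsed = (System.nanoTime() - start) / 1e9;
        } while (elapsed < minSeconds);
        return elapsed / runs;
    }

    /** Converts a flop count and a run time into Mflops/sec (10^6 flops/sec). */
    static double mflops(double flopsPerRun, double secondsPerRun) {
        return flopsPerRun / secondsPerRun / 1e6;
    }

    /** Sparse figure expressed relative to the dense baseline rate. */
    static double relativeRate(double denseRate, double denseSeconds, double sparseSeconds) {
        return denseRate * denseSeconds / sparseSeconds;
    }

    public static void main(String[] args) {
        int n = 200;
        Random rnd = new Random(1L);
        double[][] a = new double[n][n];
        double[][] b = new double[n][n];
        for (int i = 0; i < n; i++) {
            for (int j = 0; j < n; j++) {
                a[i][j] = rnd.nextDouble();
                b[i][j] = rnd.nextDouble();
            }
        }
        double[] sink = new double[1]; // keeps the result live so the JIT cannot drop the work
        double secs = meanSeconds(() -> sink[0] += multiply(a, b)[0][0], 2.0);
        System.out.println("dense multiply: " + mflops(2.0 * n * n * n, secs) + " Mflops/sec");
        System.out.println("sparse taking 2x as long as a 10 Mflops dense run: "
                + relativeRate(10.0, 1.0, 2.0) + " Mflops");
    }
}
```

Like the benchmark described above, the sketch does not invoke garbage collection explicitly and includes JIT warm-up in the timed repetitions, so short runs on a modern VM may understate steady-state throughput.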
Below are some results from an old version, 1.0Beta4-1. They are of historic interest only.
| OS | Linux | Linux | Linux | Solaris |
|---|---|---|---|---|
| OS Config. | Red Hat 6.1, Kernel 2.2.12-20 | Red Hat 6.1, Kernel 2.2.12-20 | Red Hat 6.1, Kernel 2.2.12-20 | Solaris 2.6 (aka SunOS 5.6) |
| HW | 1 x PentiumIII@600 MHz, 128 MB, 32 KB L1, 256 KB L2 (linuxosdev.cern.ch) | 1 x PentiumIII@600 MHz, 128 MB, 32 KB L1, 256 KB L2 (linuxosdev.cern.ch) | 1 x PentiumIII@600 MHz, 128 MB, 32 KB L1, 256 KB L2 (linuxosdev.cern.ch) | Sun 450, 2 x Ultrasparc-II@400 MHz (1 CPU used), 256 MB, 32 KB L1, 4 MB L2 (shd70.cern.ch) |
| VM | IBMJDK1.1.8 | BlackdownJDK1.2.2RC3, Classic VM, native threads, sunwjit | SunInpriseJDK1.2.2RC1, Classic VM (build 1.2.2-I, green threads, javacomp) | SunJDK1.2.2, Classic VM |
| Performance | here | here | here | here |