A few more words about sparse matrices. In practice, sparse matrices are used
for one of two reasons: to save memory or to speed up computation. Hash-based
sparse matrices are neither the smallest possible matrix representation nor
the fastest. They implement a reasonable trade-off between performance and memory:
very good average performance on get/set operations at a quite small memory footprint.
However, they are not particularly suited for special-purpose algorithms that exploit
explicit knowledge about which regions are zero and non-zero. For example, sparse
linear algebraic operations such as matrix multiplication and inversion work better
on other sparse matrix representations, such as Harwell-Boeing, which also has
a smaller memory footprint. However, those alternative sparse matrix representations
are really only usable for special purposes, because their get/set performance
is typically very bad. In contrast, hash-based sparse matrices are more generally
applicable data structures.
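
As a minimal sketch of typical usage: cells that are never set read as zero and
consume no memory, while get/set remain O(1) on average. The package path below
assumes the Parallel Colt layout (cern.colt.matrix.tdouble.impl); in the original
Colt the class lives in cern.colt.matrix.impl.

import cern.colt.matrix.tdouble.impl.SparseDoubleMatrix2D;

public class SparseDemo {
    public static void main(String[] args) {
        // A 1000 x 1000 matrix; only explicitly set cells consume memory.
        SparseDoubleMatrix2D m = new SparseDoubleMatrix2D(1000, 1000);
        m.set(3, 7, 42.0);          // bounds-checked write
        double v = m.get(3, 7);     // bounds-checked read, O(1) on average
        double zero = m.get(0, 0);  // unset cells read as 0.0
        System.out.println(v + " " + zero);
        // cardinality() reports the number of non-zero cells.
        System.out.println("non-zeros: " + m.cardinality());
    }
}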
Finally, note that some algorithms exploiting sparsity can be expressed in
a generic manner, without needing to know or dictate a special internal storage
format. For example, in many linear algebraic operations (like matrix multiplication)
the dot product sits in the innermost loop of a cubic or quadratic loop nest, where
one operand of the dot product "changes slowly". Detecting sparsity
in the blocked "slowly changing" operand and using a quick generic dot
product algorithm summing only non-zero cells can drastically improve performance
without resorting to special storage formats. Imagine a 500 x 500 DenseDoubleMatrix2D
or SparseDoubleMatrix2D that is in fact populated with only one (or a
few) non-zero cells per row. The innermost loop of the cubic matrix multiply
is reduced from 500 steps to 1 step, resulting in an algorithm that in benchmarks
runs about 50 times faster (up to 500 "virtual" Mflops on a now outdated
200 MHz Pentium running NT, Sun JDK 1.2.2, java -classic, DenseDoubleMatrix2D;
the theoretical speedup of 500 cannot be achieved). Because the performance
overhead of sparsity detection is negligible (some 5%), this is how the
linear algebraic matrix-matrix and matrix-vector multiplications of this toolkit
are implemented.
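
The idea can be sketched as follows. This is an illustrative sketch on plain
double[][] arrays, not the toolkit's actual implementation: per row of A (the
slowly changing operand), the non-zero column indices are detected once, and
every dot product against a column of B then sums only those cells.

// Sketch: C = A * B, exploiting sparsity in the rows of A.
public class SparseAwareMultiply {
    static double[][] multiply(double[][] a, double[][] b) {
        int m = a.length, k = b.length, n = b[0].length;
        double[][] c = new double[m][n];
        int[] nz = new int[k]; // scratch buffer for non-zero column indices

        for (int i = 0; i < m; i++) {
            // Detect once per row of A which cells are non-zero.
            int count = 0;
            for (int l = 0; l < k; l++) {
                if (a[i][l] != 0.0) nz[count++] = l;
            }
            // Generic dot product summing only non-zero cells: with one
            // non-zero out of 500 per row, the inner loop shrinks from
            // 500 steps to 1.
            for (int j = 0; j < n; j++) {
                double sum = 0.0;
                for (int t = 0; t < count; t++) {
                    sum += a[i][nz[t]] * b[nz[t]][j];
                }
                c[i][j] = sum;
            }
        }
        return c;
    }
}

The detection pass costs O(k) per row, negligible against the O(k*n) of the
row's dot products, which is why the overhead stays in the few-percent range.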
To summarize, generic algorithms can often detect and exploit sparsity with
insignificant overhead, without needing to know or dictate a special matrix
storage format.