5.0.1
- New EliasFanoMonotoneLongBigList16 class supporting very large
lists with fixed lower-bits width 16 (bypassing the LongArrayBitVector
128Gib limit).
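As a rough illustration of the fixed lower-bits width in the 5.0.1 entry: an Elias-Fano representation splits each value into upper and lower bits, and with a width of exactly 16 each lower part fits in one Java short. The sketch below shows only that split and its inverse. The class name LowerBits16Sketch and its methods are hypothetical, and nothing here is taken from the actual layout of EliasFanoMonotoneLongBigList16.

```java
// Hypothetical sketch, not Sux4J code: splitting a nonnegative value into
// 16 lower bits (one short) and the remaining upper bits.
final class LowerBits16Sketch {
    static final int LOWER_BITS = 16;

    /** The lower 16 bits of the value, stored as a short. */
    static short lower(long value) {
        return (short) value;
    }

    /** The remaining upper bits of the value. */
    static long upper(long value) {
        return value >>> LOWER_BITS;
    }

    /** Rebuilds the value; the mask undoes the sign extension of the short. */
    static long combine(long upper, short lower) {
        return (upper << LOWER_BITS) | (lower & 0xFFFFL);
    }

    public static void main(String[] args) {
        long[] values = { 3L, 65_535L, 65_536L, 1L << 40 };
        for (long v : values) {
            long rebuilt = combine(upper(v), lower(v));
            System.out.println(v + " -> " + rebuilt);
        }
    }
}
```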
5.0.0
- Major revamp of the it.unimi.dsi.sux4j.mph package.
4.3.0
- Several fixes and improvements.
4.2.0
- Java 8-only.
- New compressed functions (GV3CompressedFunction and
GV4CompressedFunction) which store a mapping key/value using a number of
bits per key close to the empirical entropy of the list of values.
- Parallel, multithreaded construction for GOV3Function, GOV4Function,
GOVMinimalPerfectHashFunction, GV3CompressedFunction and GV4CompressedFunction.
4.1.0
- We now use in all the new structures of the mph package a modulo-free
range reduction based on an integer-simulated floating-point computation
(see http://www.drdobbs.com/tools/fast-high-quality-parallel-random-number/231000484).
This has improved lookup performance by 5-10%, but it has made it necessary
to bump the serial version UIDs.
- Fixed wrong mask in Rank12. It could lead to unpredictable
errors.
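The modulo-free range reduction mentioned in the 4.1.0 entry can be sketched as a fixed-point multiplication: treat the upper 32 bits of a hash as a fraction in [0, 1), scale it by the range size, and drop the fractional bits. This is a minimal sketch of that general idea, assuming a 64-bit hash and a range size that fits in a positive int. The class and method names are hypothetical, and the exact form used in the mph package may differ.

```java
// Hypothetical sketch, not Sux4J code: mapping a hash to [0, n) without %.
final class RangeReductionSketch {
    /** Maps a 64-bit hash to the interval [0, n); requires n > 0. */
    static int reduce(long hash, int n) {
        // Upper 32 bits of the hash as an unsigned value in [0, 2^32).
        long x = hash >>> 32;
        // x / 2^32 is a fraction in [0, 1): multiply by n, then shift away
        // the 32 fractional bits. The product is below 2^63, so it cannot
        // overflow a signed long.
        return (int) ((x * n) >>> 32);
    }

    public static void main(String[] args) {
        long[] hashes = { 0L, 1L << 63, -1L };
        for (long h : hashes) {
            System.out.println(Long.toHexString(h) + " -> " + reduce(h, 10));
        }
    }
}
```

Unlike hash % n, this uses only a multiplication and a shift, which is where a lookup-speed gain of the kind described would plausibly come from.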
4.0.0
- Complete revamp of the mph package. New algorithms provide
better space and faster lookup. Please read carefully the
package documentation.
- Older maps such as MWHCFunction, MinimalPerfectHashFunction
and TwoStepsMWHCFunction have been deprecated, and will be
removed sooner or later.
- All serial version UIDs have been updated.
- We now use SpookyHash.
- We now support, for all functions for which it is possible,
raw (i.e., non-lexicographical) transformation strategies
(as they are faster).
- Fixed a lot of documentation, main methods, etc.
- Several functions can now be built from the command line
using byte-arrays as keys; strings are read without any
decoding.
3.2.2
- Improved speed for selection in a word thanks to an idea of Giuseppe
Ottaviano. Now we refer to the implementation found in the DSI
utilities (it.unimi.dsi.bits.Fast.select()).
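For readers unfamiliar with the operation named in the 3.2.2 entry, selection in a word returns the position of the one bit with a given rank inside a 64-bit word. The sketch below is a deliberately simple loop that clears the lowest set bit rank times. It is not the faster technique the entry refers to, and the class name and the -1 return convention are assumptions made for illustration.

```java
// Hypothetical sketch, not Sux4J or DSI utilities code: naive select in a word.
final class SelectInWordSketch {
    /**
     * Returns the position of the set bit of the given rank (0-based) in word,
     * or -1 if word contains fewer than rank + 1 ones.
     */
    static int select(long word, int rank) {
        if (rank < 0 || rank >= Long.bitCount(word)) return -1;
        // Clear the lowest set bit once for every one we need to skip.
        for (int i = 0; i < rank; i++) word &= word - 1;
        return Long.numberOfTrailingZeros(word);
    }

    public static void main(String[] args) {
        long word = 0b10110L; // ones at positions 1, 2 and 4
        for (int rank = 0; rank < 3; rank++) {
            System.out.println("rank " + rank + " -> position " + select(word, rank));
        }
    }
}
```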
3.2.1
- Fixed dependencies.
3.2.0
- New Rank11 and Rank12 implementations with new space/speed tradeoffs.
Taken from Simon Gog and Matthias Petri's paper.
- New ranking structure for MinimalPerfectHashFunction based on Rank11,
which made it necessary to bump the serialVersionUID.
- New HashDisplaceCompressMinimalPerfectHashFunction class implementing a
new technique by Belazzougui, Botelho and Dietzfelbinger, providing
minimal perfect hashing at 2.05 bits per key. Construction time is an
order of magnitude larger than MinimalPerfectHashFunction, and queries
are ~50% slower, so we will keep the other implementation as a reference
for the time being.
3.1.2
- Revised adapter (now renamed SignedFunctionStringMap) to accept
generic Object2LongFunction<? extends CharSequence> functions, discovering
dynamically whether they implement Size64.
3.1.1
- New thin adapter SignedObject2LongFunctionStringMap to adapt the new
Sux4J self-signing functions to the StringMap (big) interface. The
adapter also implements a constructor based on an Iterable that
can be used directly with MG4J's IndexBuilder.
- New bulk get() methods for EliasFanoLongBigList,
EliasFanoMonotoneLongBigList and select() for SimpleSelect. They are
built one upon the other and they are an order of magnitude faster than
picking consecutive elements using getLong()/select().
3.1.0
- MinimalPerfectHashFunction, MWHCFunction,
LcpMonotoneMinimalPerfectHashFunction,
TwoStepsLcpMonotoneMinimalPerfectHashFunction and
ZFastTrieDistributorMonotoneMinimalPerfectHashFunction have now built-in
signing. The effort was relatively small, and the gain in speed is huge,
as we can use part of the hash computed by the ChunkedHashStore as a
signature.
- New "dictionary" option in MWHCFunction which stores the signature as a
value. It gives the fastest and most compact probabilistic dictionary.
- Big switch to builders for MinimalPerfectHashFunction, MWHCFunction,
TwoStepsMWHCFunction, LcpMonotoneMinimalPerfectHashFunction,
TwoStepsLcpMonotoneMinimalPerfectHashFunction and
ZFastTrieDistributorMonotoneMinimalPerfectHashFunction. As a final goal,
all multiple constructors will be eliminated in favour of Guava-style static
nested Builder classes.
- Major speed improvements in
ZFastTrieDistributorMonotoneMinimalPerfectHashFunction.
3.0.11
- Now we use XorShift1024StarRandom everywhere.
3.0.10
- New parameter to MWHCFunction makes it possible to build arbitrary
functions from the command line.
3.0.9
- Added for efficiency a lowerBitsMask field to
EliasFanoMonotoneLongBigList, which made it necessary to bump the
serialVersionUID.
- Aligned the behaviour of EliasFanoMonotoneLongBigList with that of
MG4J's implementation--the upper bound is not necessarily strict.
- Fixed subtle bug in ChunkedHashStore that could cause an out-of-bounds
access to an array in the (very rare) case of a retry.
- The documentation of Jenkins hashes was still stating, erroneously,
that we return the first computed value (we return the third one).
- Deprecated length() methods have been removed.
- Added UTF-32 support to all classes.
3.0.8
- Significantly improved speed of SimpleSelect and all related
classes (Elias-Fano lists, etc.).
- Improved speed of MinimalPerfectHashFunction.
- Integer overflow bug fixes (thanks to Tim Potter).
3.0.7
- Some monotone hash functions would crash during construction due to an
integer overflow. Thanks to Tim Potter for reporting this bug.
3.0.6
- Switched to SLF4J for logging.
3.0.5
- The Ivy dependency file has been Maven-normalized (now we have a
default "compile" scope).
3.0.3
- Replaced methods from it.unimi.dsi.bits.Fast with equivalent methods in
Integer/Long, as recent JVMs intrinsify such methods.
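The 3.0.3 entry does not say which methods were replaced. As an illustration of the kind of bit-level helper that java.lang.Long already provides, the sketch below builds three common word operations from Long.bitCount(), Long.numberOfLeadingZeros() and Long.numberOfTrailingZeros(). The class and method names are hypothetical.

```java
// Hypothetical sketch, not Sux4J code: word operations built on JDK methods.
final class WordOpsSketch {
    /** Number of ones in word strictly below position pos (0 <= pos < 64). */
    static int rank(long word, int pos) {
        return Long.bitCount(word & ((1L << pos) - 1));
    }

    /** Index of the most significant set bit, or -1 if word is zero. */
    static int mostSignificantBit(long word) {
        // numberOfLeadingZeros(0) is 64, which yields -1 here.
        return 63 - Long.numberOfLeadingZeros(word);
    }

    /** Index of the least significant set bit, or -1 if word is zero. */
    static int leastSignificantBit(long word) {
        return word == 0 ? -1 : Long.numberOfTrailingZeros(word);
    }

    public static void main(String[] args) {
        long word = 0b10110L; // ones at positions 1, 2 and 4
        System.out.println(rank(word, 3));
        System.out.println(mostSignificantBit(word));
        System.out.println(leastSignificantBit(word));
    }
}
```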
3.0.2
- Fixed minor inconsistencies in the values returned on empty functions
(some implementations would actually throw an exception). Thanks to
Valentin Tablan for reporting the problem.
3.0
- WARNING: This release has minor binary incompatibilities with previous
releases, mainly due to the move from the interface
it.unimi.dsi.util.LongBigList to the now standard
it.unimi.dsi.fastutil.longs.LongBigList. It is part of a parallel release
of fastutil, the DSI Utilities, Sux4J, MG4J, WebGraph, etc. that were
all modified to fit the new interface, and that prepare the way for our
"big" versions, that is, supporting >2^31 entries in arrays (simulated),
elements in lists, terms, documents, nodes, etc. Please read our (short)
"Moving Java to Big Data" document (JavaBig.pdf) for details.
- We now require Java 6.
- it.unimi.dsi.util.LongBigList is dead. Long live
it.unimi.dsi.fastutil.longs.LongBigList. We're sorry for the
nuisance--adapting the code should be very easy (and we warned you
anyway :).
- MWHCFunction has been re-engineered so that it will use very little
space besides that actually required by the function. Previously, a
number of large bit vectors were allocated at the same time, and they
have been replaced by a judicious use of OfflineIterable. All classes
that use MWHCFunction will benefit (at the expense of slightly increased
disk usage and access).
- All classes should support big collections. They use the new Size64
interface from fastutil. Implementations still support the old
(deprecated) length() method for backward compatibility.
- New FileLinesBigList class.
- We now have a MurmurHash3 full implementation.
2.0.1
- Sux4J is now distributed under the GNU Lesser General Public
License 3.
- Major rewriting of the hypergraph peeling code. Now we use less memory
and we are definitely faster.
- MWHCFunction and MinimalPerfectHashFunction accept a temporary directory
for the chunked hash store files.
- Improved ChunkedHashStore architecture that allows arbitrary values,
so functions can be built without keeping values in memory.
- Fixed bug in MinimalPerfectHashFunction that was causing exceptions
when using more than a billion keys (thanks to Wei Liu for reporting
and fixing this bug).
- Fixed bug in some rank/select classes that was causing integer overflow
errors when building structure over bit vectors with >2Gi bits.
- AbstractLongBitVector.equals() now uses getLong() on word boundaries.
- Fixed pernicious bug in Select9.
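The integer overflow errors on bit vectors with more than 2Gi bits, listed among the 2.0.1 fixes, belong to a familiar class of Java bug: a bit index computed in int arithmetic wraps around before it is widened to long. The sketch below reproduces that general pattern only. It is not the code that was fixed, and the class and method names are hypothetical.

```java
// Hypothetical sketch, not Sux4J code: int overflow when computing bit indices.
final class BitIndexOverflowSketch {
    static final int BITS_PER_WORD = 64;

    /** Buggy: the product is computed in int arithmetic and wraps around. */
    static long firstBitOfWordInt(int wordIndex) {
        return wordIndex * BITS_PER_WORD;
    }

    /** Correct: widening to long before multiplying avoids the wrap-around. */
    static long firstBitOfWordLong(int wordIndex) {
        return (long) wordIndex * BITS_PER_WORD;
    }

    public static void main(String[] args) {
        int wordIndex = 40_000_000; // first bit is at 2,560,000,000 > 2^31
        System.out.println(firstBitOfWordInt(wordIndex));
        System.out.println(firstBitOfWordLong(wordIndex));
    }
}
```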
2.0
- General revamp, restructuring, improvements, new coherent names for
classes. Most of the code has been rewritten or improved.
- New (partial) structures for balanced parentheses.
- New build system based on ChunkedHashStore that works for billions
of keys.
- New structures for balanced parentheses (partially implemented).
- Faster Select9 operation: some broadword operations were implemented in
a redundant way.
- SparseSelect/SparseRank would give an incorrect (or at least not very
useful) value for numBits() when using shared data.
- Fixed problem with SparseSelect: some methods inherited from
EliasFanoMonotoneLongBigList were causing exceptions because the size of the
SparseSelect is the number of bits, not the number of ones.
- New hollow trie implementation based on balanced parentheses.
- Fixed an old and severe bug in MinimalPerfectHashFunction, that
was causing the generated functions not to be perfect.
1.0.4
- New progressive hash-computation methods that provide Jenkins hash
in constant time on all prefixes (after a linear-time preprocessing).
1.0.3
- New TwoStepsMWHCFunction that records in s