com.sleepycat.bind.tuple.package.html
Bindings that use sequences of primitive fields, or tuples.
For a general discussion of bindings, see the
Getting Started Guide.
Tuple Formats
The serialization formats for tuple bindings are designed for compactness,
serialization speed and proper default sorting.
When a format is used for database keys, it is important to use default
sorting for best performance. Although a custom comparator may be specified
for a {@link com.sleepycat.je.DatabaseConfig#setBtreeComparator database} or
entity index, custom comparators often reduce performance because comparators
are called very frequently during Btree operations.
For proper default sorting, the byte array of the stored format must be
designed so that a byte-by-byte unsigned comparison results in the natural sort
order, as defined by the {@link java.lang.Comparable#compareTo} method of the
data type. For example, the natural sort order for integers is the standard
mathematical definition, and is implemented by {@code Integer.compareTo},
{@code Long.compareTo}, etc. This is called default natural
sorting.
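The following is a minimal sketch of a byte-by-byte unsigned comparison, and of why the stored format has to be designed for it. The class and method names are hypothetical and are not part of this package. The tie-breaking rule for arrays of different lengths (the shorter array sorts first) is an assumption made for the sketch.

```java
final class UnsignedBytesOrderSketch {

    /** Compares two byte arrays byte by byte, treating each byte as unsigned (0..255). */
    static int compareUnsigned(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int x = a[i] & 0xff;
            int y = b[i] & 0xff;
            if (x != y) {
                return x - y;
            }
        }
        // Assumption for this sketch: on an equal prefix, the shorter array sorts first.
        return a.length - b.length;
    }

    /** Naive big-endian two's-complement encoding, with no adjustment for sorting. */
    static byte[] rawBigEndian(int v) {
        return new byte[] {(byte) (v >>> 24), (byte) (v >>> 16), (byte) (v >>> 8), (byte) v};
    }

    public static void main(String[] args) {
        // Positive values compare as expected with the naive encoding...
        System.out.println(compareUnsigned(rawBigEndian(1), rawBigEndian(2)) < 0);
        // ...but -1 (FF FF FF FF) sorts after 1 (00 00 00 01), unlike Integer.compareTo.
        System.out.println(compareUnsigned(rawBigEndian(-1), rawBigEndian(1)) > 0);
    }
}
```

With the naive encoding, negative integers sort after positive ones, so that encoding does not provide default natural sorting.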
Although most tuple formats provide default natural sorting, not all of them
do. Certain formats do not provide default natural sorting for historical
reasons (see the discussion of packed integer and float formats below). Other
formats sacrifice default natural sorting for other performance factors (see
the discussion of BigDecimal formats below).
Another performance factor is the amount of memory used by keys in
the Btree. Keys are stored in their serialized form in the Btree. If keys are
small (currently 16 bytes or less), Btree memory can be optimized. Optimized
memory storage is based on the maximum size of all keys in a single Btree
node. A single Btree node holds N adjacent key values, where N is 128 by
default and can be {@link com.sleepycat.je.DatabaseConfig#setNodeMaxEntries
configured} for each database or index.
String Formats
All {@code String} formats support default natural sorting.
Strings are stored as a byte array of UTF encoded characters. Either the byte
array is zero-terminated, or its length must be known by the application. The
UTF encoding is described below.
- Null strings are UTF encoded as { 0xFF }, which is not allowed in a
standard UTF encoding. This allows null strings, as distinct from empty or
zero length strings, to be represented. Using default sorting, null strings
will be ordered last.
- Zero (0x0000) character values are UTF encoded as non-zero values, and
therefore embedded zeros in the string are supported. The sequence { 0xC0,
0x80 } is used to encode a zero character. This UTF encoding is the same one
used by the native Java UTF libraries and is called
Modified UTF-8.
However, this encoding of zero does impact the lexicographical ordering, and
zeros will not be sorted first (the natural order) or last.
- For all character values other than zero, the standard UTF encoding is
used, and the default sorting is the same as the Unicode lexicographical
character ordering.
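The encoding of the zero character can be observed with the JDK alone. The sketch below uses {@code java.io.DataOutputStream#writeUTF}, which writes Modified UTF-8 preceded by a 2-byte length, and strips that length prefix. The class name is hypothetical, and the sketch only illustrates the character encoding; it does not reproduce how this package terminates strings or stores nulls.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.util.Arrays;

final class ModifiedUtf8Sketch {

    /** Returns the Modified UTF-8 bytes of s, without the 2-byte length that writeUTF prepends. */
    static byte[] encode(String s) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(bytes);
            out.writeUTF(s);
            out.flush();
            byte[] all = bytes.toByteArray();
            return Arrays.copyOfRange(all, 2, all.length);
        } catch (IOException e) {
            // Only expected if the encoded string exceeds 65535 bytes.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        for (byte b : encode("a\0b")) {
            System.out.printf("%02X ", b & 0xff);
        }
        System.out.println();
    }
}
```

Because the zero character becomes the two bytes { 0xC0, 0x80 }, it compares greater than any ASCII character in an unsigned byte comparison, which is why embedded zeros are not sorted first.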
Binding classes and methods are provided for zero-terminated and
known-length {@code String} values.
- Single-value binding classes for zero-terminated {@code String}
values.
- {@link com.sleepycat.bind.tuple.StringBinding}
- Multi-value binding methods for zero-terminated and known-length {@code
String} values.
- {@link com.sleepycat.bind.tuple.TupleOutput#writeString(String)}
- {@link com.sleepycat.bind.tuple.TupleInput#readString}
- {@link com.sleepycat.bind.tuple.TupleInput#getStringByteLength}
- {@link com.sleepycat.bind.tuple.TupleOutput#writeString(char[])}
- {@link com.sleepycat.bind.tuple.TupleInput#readString(char[])}
- {@link com.sleepycat.bind.tuple.TupleInput#readString(int)}
Integer Formats
Fixed Size Integer Formats
All fixed size integer formats support default natural sorting.
The size of the stored value depends on the type, and ranges (as one would
expect) from 1 byte for type {@code byte} and class {@code Byte}, to 8 bytes for
type {@code long} and class {@code Long}.
Signed numbers are stored in the buffer in MSB (most significant byte first)
order with their sign bit (high-order bit) inverted to cause negative numbers
to be sorted first when comparing values as unsigned byte arrays, as done in a
database.
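As an illustration, the sketch below applies the sign-bit inversion to a 4-byte {@code int} and checks the resulting order with an unsigned byte comparison. It is a stand-alone approximation of the technique described above, not the code of this package, and all names in it are hypothetical.

```java
final class SortableIntSketch {

    /** Encodes value in MSB-first order with its sign bit inverted. */
    static byte[] encode(int value) {
        int v = value ^ 0x80000000; // invert the high-order (sign) bit
        return new byte[] {(byte) (v >>> 24), (byte) (v >>> 16), (byte) (v >>> 8), (byte) v};
    }

    /** Reverses encode: reassembles the MSB-first bytes and restores the sign bit. */
    static int decode(byte[] b) {
        int v = ((b[0] & 0xff) << 24) | ((b[1] & 0xff) << 16) | ((b[2] & 0xff) << 8) | (b[3] & 0xff);
        return v ^ 0x80000000;
    }

    /** Byte-by-byte unsigned comparison of two arrays of equal length. */
    static int compareUnsigned(byte[] a, byte[] b) {
        for (int i = 0; i < a.length; i++) {
            int diff = (a[i] & 0xff) - (b[i] & 0xff);
            if (diff != 0) {
                return diff;
            }
        }
        return 0;
    }

    public static void main(String[] args) {
        // -1 encodes as 7F FF FF FF and 0 as 80 00 00 00, so -1 sorts first.
        System.out.println(compareUnsigned(encode(-1), encode(0)) < 0);
        System.out.println(decode(encode(Integer.MIN_VALUE)) == Integer.MIN_VALUE);
    }
}
```

Inverting the sign bit maps {@code Integer.MIN_VALUE} to all-zero bytes and {@code Integer.MAX_VALUE} to all-one bytes, so the unsigned byte order matches {@code Integer.compareTo}.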
- Single-value binding classes for signed, fixed size integers.
- {@link com.sleepycat.bind.tuple.ByteBinding}
- {@link com.sleepycat.bind.tuple.ShortBinding}
- {@link com.sleepycat.bind.tuple.IntegerBinding}
- {@link com.sleepycat.bind.tuple.LongBinding}
- Multi-value binding methods for signed, fixed size integers.
- {@link com.sleepycat.bind.tuple.TupleOutput#writeByte}
- {@link com.sleepycat.bind.tuple.TupleInput#readByte}
- {@link com.sleepycat.bind.tuple.TupleOutput#writeShort}
- {@link com.sleepycat.bind.tuple.TupleInput#readShort}
- {@link com.sleepycat.bind.tuple.TupleOutput#writeInt}
- {@link com.sleepycat.bind.tuple.TupleInput#readInt}
- {@link com.sleepycat.bind.tuple.TupleOutput#writeLong}
- {@link com.sleepycat.bind.tuple.TupleInput#readLong}
Unsigned numbers, including characters, are stored in MSB order with no
change to their sign bit. Arrays of characters and unsigned bytes may also be
stored and may be treated as {@code String} values. For booleans, {@code true}
is stored as the unsigned byte value one and {@code false} as the unsigned byte
value zero.
- Single-value binding classes for characters and booleans.
- {@link com.sleepycat.bind.tuple.BooleanBinding}
- {@link com.sleepycat.bind.tuple.CharacterBinding}
- Multi-value binding methods for unsigned, fixed size integers, characters
and booleans.
- {@link com.sleepycat.bind.tuple.TupleOutput#writeBoolean}
- {@link com.sleepycat.bind.tuple.TupleInput#readBoolean}
- {@link com.sleepycat.bind.tuple.TupleOutput#writeChar}
- {@link com.sleepycat.bind.tuple.TupleInput#readChar}
- {@link com.sleepycat.bind.tuple.TupleOutput#writeUnsignedByte}
- {@link com.sleepycat.bind.tuple.TupleInput#readUnsignedByte}
- {@link com.sleepycat.bind.tuple.TupleOutput#writeUnsignedShort}
- {@link com.sleepycat.bind.tuple.TupleInput#readUnsignedShort}
- {@link com.sleepycat.bind.tuple.TupleOutput#writeUnsignedInt}
- {@link com.sleepycat.bind.tuple.TupleInput#readUnsignedInt}
- Multi-value binding methods for character arrays and unsigned byte arrays
that may be treated as {@code String} values.
- {@link com.sleepycat.bind.tuple.TupleOutput#writeChars(String)}
- {@link com.sleepycat.bind.tuple.TupleInput#readChars(int)}
- {@link com.sleepycat.bind.tuple.TupleOutput#writeChars(char[])}
- {@link com.sleepycat.bind.tuple.TupleInput#readChars(char[])}
- {@link com.sleepycat.bind.tuple.TupleOutput#writeBytes(String)}
- {@link com.sleepycat.bind.tuple.TupleInput#readBytes(int)}
- {@link com.sleepycat.bind.tuple.TupleOutput#writeBytes(char[])}
- {@link com.sleepycat.bind.tuple.TupleInput#readBytes(char[])}
Packed Integer Formats
The packed integer format stores integers with small absolute values in a
single byte. The size increases as the absolute value increases, up to a
maximum of 5 bytes for {@code int} values and 9 bytes for {@code long}
values.
The packed integer format can be used for integer values between {@link
java.lang.Long#MIN_VALUE} and {@link java.lang.Long#MAX_VALUE}. However,
different bindings and methods are provided for type {@code int} and {@code
long}, to avoid unsafe casting from {@code long} to {@code int} when {@code
int} values are used.
Because the same packed format is used for {@code int} and {@code long}
values, stored {@code int} values may be expanded to {@code long} values
without introducing a format incompatibility. In other words, you can treat
previously stored packed {@code int} values as packed {@code long} values.
Packed integer formats come in two varieties: those that support default
natural sorting and those that don't. The formats of the two varieties are
incompatible. For new applications, the format that supports default natural
sorting should normally be used. There is no performance advantage to using
the unsorted format.
The format with support for default natural sorting stores values in the
inclusive range [-119,120] in a single byte.
- Single-value binding classes for packed integers with default natural
sorting.
- {@link com.sleepycat.bind.tuple.SortedPackedIntegerBinding}
- {@link com.sleepycat.bind.tuple.SortedPackedLongBinding}
- Multi-value binding methods for packed integers with default natural
sorting.
- {@link com.sleepycat.bind.tuple.TupleOutput#writeSortedPackedInt}
- {@link com.sleepycat.bind.tuple.TupleInput#readSortedPackedInt}
- {@link com.sleepycat.bind.tuple.TupleInput#getSortedPackedIntByteLength}
- {@link com.sleepycat.bind.tuple.TupleOutput#writeSortedPackedLong}
- {@link com.sleepycat.bind.tuple.TupleInput#readSortedPackedLong}
- {@link com.sleepycat.bind.tuple.TupleInput#getSortedPackedLongByteLength}
The unsorted packed integer format is an older, legacy format that is used
internally and supported for compatibility. It stores values in the inclusive
range [-119,119] in a single byte. Because default natural sorting is not
supported, this format should not be used for keys. However, it so happens
that packed integers in the inclusive range [0,630] are sorted correctly by
default, and this may be useful for some applications.
- Single-value binding classes for legacy, unsorted packed integers.
- {@link com.sleepycat.bind.tuple.PackedIntegerBinding}
- {@link com.sleepycat.bind.tuple.PackedLongBinding}
- Multi-value binding methods for legacy, unsorted packed integers.
- {@link com.sleepycat.bind.tuple.TupleOutput#writePackedInt}
- {@link com.sleepycat.bind.tuple.TupleInput#readPackedInt}
- {@link com.sleepycat.bind.tuple.TupleInput#getPackedIntByteLength}
- {@link com.sleepycat.bind.tuple.TupleOutput#writePackedLong}
- {@link com.sleepycat.bind.tuple.TupleInput#readPackedLong}
- {@link com.sleepycat.bind.tuple.TupleInput#getPackedLongByteLength}
BigInteger Formats
All {@code BigInteger} formats support default natural sorting.
{@code BigInteger} values are variable length and are stored as signed
values with a preceding byte length. The length has the same sign as the
value, in order to support default natural sorting.
The length is stored as a 2-byte (short), fixed size, signed integer.
Supported values are therefore limited to those with a byte array ({@link
java.math.BigInteger#toByteArray}) representation with a size of 0x7fff bytes
or less. The maximum {@code BigInteger} value is (2^0x3fff7 - 1) and
the minimum value is (-2^0x3fff7).
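The bounds follow from the byte limit: 0x7fff bytes hold 0x3fff8 bits, one of which is the sign bit, leaving 0x3fff7 bits of magnitude. The sketch below checks this arithmetic with {@code BigInteger#toByteArray}. The class name, the constant and the {@code fits} helper are hypothetical and only restate the size rule given above.

```java
import java.math.BigInteger;

final class BigIntegerLimitSketch {

    /** Largest byte array length allowed by a 2-byte signed length field. */
    static final int MAX_BYTES = 0x7fff;

    /** Returns true if the two's-complement byte array of v fits in MAX_BYTES. */
    static boolean fits(BigInteger v) {
        return v.toByteArray().length <= MAX_BYTES;
    }

    public static void main(String[] args) {
        BigInteger max = BigInteger.ONE.shiftLeft(0x3fff7).subtract(BigInteger.ONE);
        BigInteger min = BigInteger.ONE.shiftLeft(0x3fff7).negate();
        System.out.println(fits(max) + " " + fits(max.add(BigInteger.ONE)));
        System.out.println(fits(min) + " " + fits(min.subtract(BigInteger.ONE)));
    }
}
```

Values one step beyond either bound need 0x8000 bytes and therefore cannot be represented by the length field.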
- Single-value binding classes for {@code BigInteger} values.
- {@link com.sleepycat.bind.tuple.BigIntegerBinding}
- Multi-value binding methods for {@code BigInteger} values.
- {@link com.sleepycat.bind.tuple.TupleOutput#writeBigInteger}
- {@link com.sleepycat.bind.tuple.TupleInput#readBigInteger}
- {@link com.sleepycat.bind.tuple.TupleOutput#getBigIntegerByteLength}
Floating Point Formats
Floats and doubles are stored in a fixed size, 4 and 8 byte format,
respectively. Floats and doubles are stored using two different
representations: a representation with default natural sorting, and an
unsorted, integer-bit (IEEE 754) representation. For new applications, the
format that supports default natural sorting should normally be used. There is
no performance advantage to using the unsorted format.
For {@code float} values, {@code Float.floatToIntBits} and the following bit
manipulations are used to convert the signed float value to a representation
that is sorted correctly by default.
int intVal = Float.floatToIntBits(val);
intVal ^= (intVal < 0) ? 0xffffffff : 0x80000000;
For {@code double} values, {@code Double.doubleToLongBits} and the following
bit manipulations are used to convert the signed double value to a
representation that is sorted correctly by default.
long longVal = Double.doubleToLongBits(val);
longVal ^= (longVal < 0) ? 0xffffffffffffffffL : 0x8000000000000000L;
In both cases, the resulting {@code int} or {@code long} value is stored as
an unsigned value.
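The two conversions can be wrapped in small helpers and checked against {@code Float.compare} and {@code Double.compare}. In the sketch below, the forward conversions are the bit manipulations shown above. The class name, the method names and the reverse conversions are additions for illustration and are not taken from this package.

```java
final class SortedFloatSketch {

    /** Converts a float to bits whose unsigned order matches the float order. */
    static int floatToSortable(float val) {
        int intVal = Float.floatToIntBits(val);
        intVal ^= (intVal < 0) ? 0xffffffff : 0x80000000;
        return intVal;
    }

    /** Reverses floatToSortable (derived for this sketch). */
    static float sortableToFloat(int stored) {
        // A set high bit now marks an originally non-negative value.
        stored ^= (stored < 0) ? 0x80000000 : 0xffffffff;
        return Float.intBitsToFloat(stored);
    }

    /** Converts a double to bits whose unsigned order matches the double order. */
    static long doubleToSortable(double val) {
        long longVal = Double.doubleToLongBits(val);
        longVal ^= (longVal < 0) ? 0xffffffffffffffffL : 0x8000000000000000L;
        return longVal;
    }

    /** Reverses doubleToSortable (derived for this sketch). */
    static double sortableToDouble(long stored) {
        stored ^= (stored < 0) ? 0x8000000000000000L : 0xffffffffffffffffL;
        return Double.longBitsToDouble(stored);
    }

    public static void main(String[] args) {
        float[] values = {-2.5f, -1.5f, -0.0f, 0.0f, 1.5f, 2.5f};
        for (int i = 1; i < values.length; i++) {
            int cmp = Integer.compareUnsigned(
                floatToSortable(values[i - 1]), floatToSortable(values[i]));
            System.out.println(values[i - 1] + " < " + values[i] + " : " + (cmp < 0));
        }
    }
}
```

Negative values have all of their bits inverted, which reverses their order and places them below the non-negative values, whose sign bit alone is set.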
- Single-value binding classes for {@code float} and {@code double} values
with default natural sorting.
- {@link com.sleepycat.bind.tuple.SortedFloatBinding}
- {@link com.sleepycat.bind.tuple.SortedDoubleBinding}
- Multi-value binding methods for {@code float} and {@code double} values
with default natural sorting.
- {@link com.sleepycat.bind.tuple.TupleOutput#writeSortedFloat}
- {@link com.sleepycat.bind.tuple.TupleInput#readSortedFloat}
- {@link com.sleepycat.bind.tuple.TupleOutput#writeSortedDouble}
- {@link com.sleepycat.bind.tuple.TupleInput#readSortedDouble}
The unsorted floating point format is an older, legacy format that is
supported for compatibility. With this format, only zero and positive values
have default natural sorting; negative values do not.
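The behavior for negative values can be reproduced by comparing raw IEEE 754 bits as unsigned integers. The sketch below assumes that the legacy format stores the unmodified result of {@code Float.floatToIntBits} as an unsigned value; the class and method names are hypothetical.

```java
final class LegacyFloatOrderSketch {

    /** Compares two floats the way their raw IEEE 754 bits compare as unsigned integers. */
    static int compareStored(float a, float b) {
        return Integer.compareUnsigned(Float.floatToIntBits(a), Float.floatToIntBits(b));
    }

    public static void main(String[] args) {
        // Zero and positive values keep their natural order.
        System.out.println(compareStored(1.0f, 2.0f) < 0);
        // Negative values sort after positive values...
        System.out.println(compareStored(-1.0f, 1.0f) > 0);
        // ...and in reverse order among themselves.
        System.out.println(compareStored(-2.0f, -1.0f) > 0);
    }
}
```

The sign bit is the high-order bit, so every negative value compares greater than every positive value, and a larger magnitude yields larger bits regardless of sign.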
- Single-value binding classes for legacy, unsorted {@code float} and {@code
double} values.
- {@link com.sleepycat.bind.tuple.FloatBinding}
- {@link com.sleepycat.bind.tuple.DoubleBinding}
- Multi-value binding methods for legacy, unsorted {@code float} and {@code
double} values.
- {@link com.sleepycat.bind.tuple.TupleOutput#writeFloat}
- {@link com.sleepycat.bind.tuple.TupleInput#readFloat}
- {@link com.sleepycat.bind.tuple.TupleOutput#writeDouble}
- {@link com.sleepycat.bind.tuple.TupleInput#readDouble}
BigDecimal Formats
{@code BigDecimal} values are stored using two different, variable length
representations: a representation that supports default natural sorting, and an
unsorted representation. Differences between the two formats are:
- The {@code BigDecimal} format with default natural sorting should normally
be used for database keys.
- Default natural sorting is supported.
- The stored value is roughly 3 bytes larger than in the unsorted format,
and is a minimum of 8 bytes.
- More computation is required for serialization than for the unsorted
format.
- Trailing zeros after the decimal place are stripped, meaning that
precision is not preserved.
- The unsorted {@code BigDecimal} format should normally be used for non-key
values.
- Default natural sorting is not supported.
- The stored value is roughly 3 bytes smaller than in the sorted format,
and is a minimum of 3 bytes.
- Less computation is required for serialization than for the sorted
format.
- Trailing zeros after the decimal place are preserved, meaning that
precision is preserved.
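The effect of stripping trailing zeros can be seen with {@code java.math.BigDecimal} alone. The sketch below uses {@code BigDecimal#stripTrailingZeros} as a stand-in for what the sorted format does to a value. Whether the sorted binding calls this exact method is an assumption, and the class and method names are hypothetical.

```java
import java.math.BigDecimal;

final class TrailingZerosSketch {

    /** Stand-in for the value that survives a round trip through the sorted format. */
    static BigDecimal afterSortedRoundTrip(BigDecimal v) {
        return v.stripTrailingZeros();
    }

    public static void main(String[] args) {
        BigDecimal original = new BigDecimal("1.500");
        BigDecimal restored = afterSortedRoundTrip(original);
        // Numerically equal, but the scale (number of decimal places) differs.
        System.out.println(original.compareTo(restored) == 0);
        System.out.println(original.scale() + " -> " + restored.scale());
        System.out.println(original.equals(restored));
    }
}
```

An application that relies on {@code BigDecimal#equals} or on the stored scale, for example to remember that a price was entered with three decimal places, would therefore see a difference after a round trip through the sorted format.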
Both formats store the scale or exponent separately from the unscaled value,
and the stored size does not increase proportionally as the absolute value of
the scale or exponent increases.
- Single-value binding classes for {@code BigDecimal} values with default
natural sorting.
- {@link com.sleepycat.bind.tuple.SortedBigDecimalBinding}
- Multi-value binding methods for {@code BigDecimal} values with default
natural sorting.
- {@link com.sleepycat.bind.tuple.TupleOutput#writeSortedBigDecimal}
- {@link com.sleepycat.bind.tuple.TupleOutput#getSortedBigDecimalMaxByteLength}
- {@link com.sleepycat.bind.tuple.TupleInput#readSortedBigDecimal}
- {@link com.sleepycat.bind.tuple.TupleInput#getSortedBigDecimalByteLength}
- Single-value binding classes for unsorted {@code BigDecimal} values.
- {@link com.sleepycat.bind.tuple.BigDecimalBinding}
- Multi-value binding methods for unsorted {@code BigDecimal} values.
- {@link com.sleepycat.bind.tuple.TupleOutput#writeBigDecimal}
- {@link com.sleepycat.bind.tuple.TupleOutput#getBigDecimalMaxByteLength}
- {@link com.sleepycat.bind.tuple.TupleInput#readBigDecimal}
- {@link com.sleepycat.bind.tuple.TupleInput#getBigDecimalByteLength}