/*
 * Hibernate Search, full-text search for your domain model
 *
 * License: GNU Lesser General Public License (LGPL), version 2.1 or later
 * See the lgpl.txt file in the root directory.
 *
 * This class specifically was written by Doug Lea with assistance from members of JCP JSR-166
 * Expert Group and released to the public domain, as explained at
 * http://creativecommons.org/licenses/publicdomain
 */

package org.hibernate.search.util.impl;

import java.io.IOException;
import java.io.Serializable;
import java.lang.ref.Reference;
import java.lang.ref.ReferenceQueue;
import java.lang.ref.SoftReference;
import java.lang.ref.WeakReference;
import java.util.AbstractCollection;
import java.util.AbstractMap;
import java.util.AbstractSet;
import java.util.Collection;
import java.util.ConcurrentModificationException;
import java.util.EnumSet;
import java.util.Enumeration;
import java.util.HashMap;
import java.util.Hashtable;
import java.util.IdentityHashMap;
import java.util.Iterator;
import java.util.Map;
import java.util.NoSuchElementException;
import java.util.Set;
import java.util.concurrent.locks.ReentrantLock;

/**
 * An advanced hash table supporting configurable garbage collection semantics
 * of keys and values, optional referential-equality, full concurrency of
 * retrievals, and adjustable expected concurrency for updates.
 *
 * This table is designed around specific advanced use-cases. If there is any
 * doubt whether this table is for you, you most likely should be using
 * {@link java.util.concurrent.ConcurrentHashMap} instead.
 *
 * This table supports strong, weak, and soft keys and values. By default keys
 * are weak, and values are strong. Such a configuration offers similar behavior
 * to {@link java.util.WeakHashMap}: entries of this table are periodically
 * removed once their corresponding keys are no longer referenced outside of
 * this table. In other words, this table will not prevent a key from being
 * discarded by the garbage collector. Once a key has been discarded by the
 * collector, the corresponding entry is no longer visible to this table;
 * however, the entry may occupy space until a future table operation decides to
 * reclaim it. For this reason, summary functions such as {@code size} and
 * {@code isEmpty} might return a value greater than the observed number of
 * entries. In order to support a high level of concurrency, stale entries are
 * only reclaimed during blocking (usually mutating) operations.
 *
 * Enabling soft keys allows entries in this table to remain until their space
 * is absolutely needed by the garbage collector. This is unlike weak keys which
 * can be reclaimed as soon as they are no longer referenced by a normal strong
 * reference. The primary use case for soft keys is a cache, which ideally
 * occupies memory that is not in use for as long as possible.
 *
 * By default, values are held using a normal strong reference. This provides
 * the commonly desired guarantee that a value will always have at least the
 * same life-span as its key. For this reason, care should be taken to ensure
 * that a value never refers, either directly or indirectly, to its key, thereby
 * preventing reclamation. If this is unavoidable, then it is recommended to use
 * the same reference type in use for the key. However, it should be noted that
 * non-strong values may disappear before their corresponding key.
 *
 * While this table does allow the use of both strong keys and values, it is
 * recommended to use {@link java.util.concurrent.ConcurrentHashMap} for such a
 * configuration, since it is optimized for that case.
 *
 * Just like {@link java.util.concurrent.ConcurrentHashMap}, this class obeys
 * the same functional specification as {@link java.util.Hashtable}, and
 * includes versions of methods corresponding to each method of
 * {@code Hashtable}. However, even though all operations are thread-safe,
 * retrieval operations do not entail locking, and there is
 * not any support for locking the entire table in a way that
 * prevents all access. This class is fully interoperable with
 * {@code Hashtable} in programs that rely on its thread safety but not on
 * its synchronization details.
 *
 *
 * Retrieval operations (including {@code get}) generally do not block, so
 * may overlap with update operations (including {@code put} and
 * {@code remove}). Retrievals reflect the results of the most recently
 * completed update operations holding upon their onset. For
 * aggregate operations such as {@code putAll} and {@code clear},
 * concurrent retrievals may reflect insertion or removal of only some entries.
 * Similarly, Iterators and Enumerations return elements reflecting the state of
 * the hash table at some point at or since the creation of the
 * iterator/enumeration. They do not throw
 * {@link ConcurrentModificationException}. However, iterators are designed to
 * be used by only one thread at a time.
 *
 * The allowed concurrency among update operations is guided by the optional
 * {@code concurrencyLevel} constructor argument (default {@code 16}),
 * which is used as a hint for internal sizing. The table is internally
 * partitioned to try to permit the indicated number of concurrent updates
 * without contention. Because placement in hash tables is essentially random,
 * the actual concurrency will vary. Ideally, you should choose a value to
 * accommodate as many threads as will ever concurrently modify the table. Using
 * a significantly higher value than you need can waste space and time, and a
 * significantly lower value can lead to thread contention. But overestimates
 * and underestimates within an order of magnitude do not usually have much
 * noticeable impact. A value of one is appropriate when it is known that only
 * one thread will modify and all others will only read. Also, resizing this or
 * any other kind of hash table is a relatively slow operation, so, when
 * possible, it is a good idea to provide estimates of expected table sizes in
 * constructors.
 *
 * This class and its views and iterators implement all of the optional
 * methods of the {@link Map} and {@link Iterator} interfaces.
 *
 * Like {@link Hashtable} but unlike {@link HashMap}, this class does
 * not allow {@code null} to be used as a key or value.
 *
 * This class is a member of the
 * Java Collections Framework.
 *
 * @param <K> the type of keys maintained by this map
 * @param <V> the type of mapped values
 *
 * @author Doug Lea
 * @author Jason T. Greene
 */
class ConcurrentReferenceHashMap<K, V> extends AbstractMap<K, V>
		implements java.util.concurrent.ConcurrentMap<K, V>, Serializable {

	private static final long serialVersionUID = 7249069246763182397L;

	/*
	 * The basic strategy is to subdivide the table among Segments,
	 * each of which itself is a concurrently readable hash table.
	 */

	/**
	 * An option specifying which Java reference type should be used to refer
	 * to a key and/or value.
	 */
	public static enum ReferenceType {
		/**
		 * Indicates a normal Java strong reference should be used
		 */
		STRONG,
		/**
		 * Indicates a {@link WeakReference} should be used
		 */
		WEAK,
		/**
		 * Indicates a {@link SoftReference} should be used
		 */
		SOFT
	}

	public static enum Option {
		/**
		 * Indicates that referential-equality (== instead of .equals()) should
		 * be used when locating keys. This offers similar behavior to {@link IdentityHashMap}
		 */
		IDENTITY_COMPARISONS
	}

	/* ---------------- Constants -------------- */

	static final ReferenceType DEFAULT_KEY_TYPE = ReferenceType.WEAK;

	static final ReferenceType DEFAULT_VALUE_TYPE = ReferenceType.STRONG;

	/**
	 * The default initial capacity for this table,
	 * used when not otherwise specified in a constructor.
	 */
	static final int DEFAULT_INITIAL_CAPACITY = 16;

	/**
	 * The default load factor for this table, used when not
	 * otherwise specified in a constructor.
	 */
	static final float DEFAULT_LOAD_FACTOR = 0.75f;

	/**
	 * The default concurrency level for this table, used when not
	 * otherwise specified in a constructor.
	 */
	static final int DEFAULT_CONCURRENCY_LEVEL = 16;

	/**
	 * The maximum capacity, used if a higher value is implicitly
	 * specified by either of the constructors with arguments. MUST
	 * be a power of two <= 1<<30 to ensure that entries are indexable
	 * using ints.
	 */
	static final int MAXIMUM_CAPACITY = 1 << 30;

	/**
	 * The maximum number of segments to allow; used to bound
	 * constructor arguments.
	 */
	static final int MAX_SEGMENTS = 1 << 16; // slightly conservative

	/**
	 * Number of unsynchronized retries in size and containsValue
	 * methods before resorting to locking. This is used to avoid
	 * unbounded retries if tables undergo continuous modification
	 * which would make it impossible to obtain an accurate result.
	 */
	static final int RETRIES_BEFORE_LOCK = 2;

	/* ---------------- Fields -------------- */

	/**
	 * Mask value for indexing into segments. The upper bits of a
	 * key's hash code are used to choose the segment.
	 */
	final int segmentMask;

	/**
	 * Shift value for indexing within segments.
	 */
	final int segmentShift;

	/**
	 * The segments, each of which is a specialized hash table
	 */
	final Segment<K, V>[] segments;

	boolean identityComparisons;

	transient Set<K> keySet;
	transient Set<Map.Entry<K, V>> entrySet;
	transient Collection<V> values;

	/* ---------------- Small Utilities -------------- */

	/**
	 * Applies a supplemental hash function to a given hashCode, which
	 * defends against poor quality hash functions. This is critical
	 * because ConcurrentReferenceHashMap uses power-of-two length hash tables,
	 * that otherwise encounter collisions for hashCodes that do not
	 * differ in lower or upper bits.
	 */
	private static int hash(int h) {
		// Spread bits to regularize both segment and index locations,
		// using variant of single-word Wang/Jenkins hash.
		h += ( h << 15 ) ^ 0xffffcd7d;
		h ^= ( h >>> 10 );
		h += ( h << 3 );
		h ^= ( h >>> 6 );
		h += ( h << 2 ) + ( h << 14 );
		return h ^ ( h >>> 16 );
	}

	/**
	 * Returns the segment that should be used for key with given hash
	 *
	 * @param hash the hash code for the key
	 *
	 * @return the segment
	 */
	final Segment<K, V> segmentFor(int hash) {
		return segments[( hash >>> segmentShift ) & segmentMask];
	}

	private int hashOf(Object key) {
		return hash( identityComparisons ?
				System.identityHashCode( key ) : key.hashCode() );
	}

	/* ---------------- Inner Classes -------------- */

	static interface KeyReference {
		int keyHash();

		Object keyRef();
	}

	/**
	 * A weak-key reference which stores the key hash needed for reclamation.
	 */
	static final class WeakKeyReference<K> extends WeakReference<K> implements KeyReference {
		final int hash;

		WeakKeyReference(K key, int hash, ReferenceQueue<Object> refQueue) {
			super( key, refQueue );
			this.hash = hash;
		}

		@Override
		public final int keyHash() {
			return hash;
		}

		@Override
		public final Object keyRef() {
			return this;
		}
	}

	/**
	 * A soft-key reference which stores the key hash needed for reclamation.
	 */
	static final class SoftKeyReference<K> extends SoftReference<K> implements KeyReference {
		final int hash;

		SoftKeyReference(K key, int hash, ReferenceQueue<Object> refQueue) {
			super( key, refQueue );
			this.hash = hash;
		}

		@Override
		public final int keyHash() {
			return hash;
		}

		@Override
		public final Object keyRef() {
			return this;
		}
	}

	static final class WeakValueReference<V> extends WeakReference<V> implements KeyReference {
		final Object keyRef;
		final int hash;

		WeakValueReference(V value, Object keyRef, int hash, ReferenceQueue<Object> refQueue) {
			super( value, refQueue );
			this.keyRef = keyRef;
			this.hash = hash;
		}

		@Override
		public final int keyHash() {
			return hash;
		}

		@Override
		public final Object keyRef() {
			return keyRef;
		}
	}

	static final class SoftValueReference<V> extends SoftReference<V> implements KeyReference {
		final Object keyRef;
		final int hash;

		SoftValueReference(V value, Object keyRef, int hash, ReferenceQueue<Object> refQueue) {
			super( value, refQueue );
			this.keyRef = keyRef;
			this.hash = hash;
		}

		@Override
		public final int keyHash() {
			return hash;
		}

		@Override
		public final Object keyRef() {
			return keyRef;
		}
	}

	/**
	 * ConcurrentReferenceHashMap list entry. Note that this is never exported
	 * out as a user-visible Map.Entry.
	 *
	 * Because the value field is volatile, not final, it is legal wrt
	 * the Java Memory Model for an unsynchronized reader to see null
	 * instead of initial value when read via a data race. Although a
	 * reordering leading to this is not likely to ever actually
	 * occur, the Segment.readValueUnderLock method is used as a
	 * backup in case a null (pre-initialized) value is ever seen in
	 * an unsynchronized access method.
	 */
	static final class HashEntry<K, V> {
		final Object keyRef;
		final int hash;
		volatile Object valueRef;
		final HashEntry<K, V> next;

		HashEntry(K key, int hash, HashEntry<K, V> next, V value,
				ReferenceType keyType, ReferenceType valueType,
				ReferenceQueue<Object> refQueue) {
			this.hash = hash;
			this.next = next;
			this.keyRef = newKeyReference( key, keyType, refQueue );
			this.valueRef = newValueReference( value, valueType, refQueue );
		}

		final Object newKeyReference(K key, ReferenceType keyType, ReferenceQueue<Object> refQueue) {
			if ( keyType == ReferenceType.WEAK ) {
				return new WeakKeyReference<K>( key, hash, refQueue );
			}
			if ( keyType == ReferenceType.SOFT ) {
				return new SoftKeyReference<K>( key, hash, refQueue );
			}
			return key;
		}

		final Object newValueReference(V value, ReferenceType valueType, ReferenceQueue<Object> refQueue) {
			if ( valueType == ReferenceType.WEAK ) {
				return new WeakValueReference<V>( value, keyRef, hash, refQueue );
			}
			if ( valueType == ReferenceType.SOFT ) {
				return new SoftValueReference<V>( value, keyRef, hash, refQueue );
			}
			return value;
		}

		@SuppressWarnings("unchecked")
		final K key() {
			if ( keyRef instanceof KeyReference ) {
				return ( (Reference<K>) keyRef ).get();
			}
			return (K) keyRef;
		}

		final V value() {
			return dereferenceValue( valueRef );
		}

		@SuppressWarnings("unchecked")
		final V dereferenceValue(Object value) {
			if ( value instanceof KeyReference ) {
				return ( (Reference<V>) value ).get();
			}
			return (V) value;
		}

		final void setValue(V value, ReferenceType valueType, ReferenceQueue<Object> refQueue) {
			this.valueRef = newValueReference( value, valueType, refQueue );
		}

		@SuppressWarnings("unchecked")
		static final <K, V> HashEntry<K, V>[] newArray(int i) {
			return new HashEntry[i];
		}
	}

	/**
	 * Segments are specialized versions of hash tables. This
	 * subclasses from ReentrantLock opportunistically, just to
	 * simplify some locking and avoid separate construction.
	 */
	static final class Segment<K, V> extends ReentrantLock implements Serializable {
		/*
		 * Segments maintain a table of entry lists that are ALWAYS
		 * kept in a consistent state, so can be read without locking.
		 * Next fields of nodes are immutable (final). All list
		 * additions are performed at the front of each bin. This
		 * makes it easy to check changes, and also fast to traverse.
		 * When nodes would otherwise be changed, new nodes are
		 * created to replace them. This works well for hash tables
		 * since the bin lists tend to be short. (The average length
		 * is less than two for the default load factor threshold.)
		 *
		 * Read operations can thus proceed without locking, but rely
		 * on selected uses of volatiles to ensure that completed
		 * write operations performed by other threads are
		 * noticed. For most purposes, the "count" field, tracking the
		 * number of elements, serves as that volatile variable
		 * ensuring visibility. This is convenient because this field
		 * needs to be read in many read operations anyway:
		 *
		 *   - All (unsynchronized) read operations must first read the
		 *     "count" field, and should not look at table entries if
		 *     it is 0.
		 *
		 *   - All (synchronized) write operations should write to
		 *     the "count" field after structurally changing any bin.
		 *     The operations must not take any action that could even
		 *     momentarily cause a concurrent read operation to see
		 *     inconsistent data. This is made easier by the nature of
		 *     the read operations in Map. For example, no operation
		 *     can reveal that the table has grown but the threshold
		 *     has not yet been updated, so there are no atomicity
		 *     requirements for this with respect to reads.
		 *
		 * As a guide, all critical volatile reads and writes to the
		 * count field are marked in code comments.
		 */

		private static final long serialVersionUID = 2249069246763182397L;

		/**
		 * The number of elements in this segment's region.
		 */
		transient volatile int count;

		/**
		 * Number of updates that alter the size of the table. This is
		 * used during bulk-read methods to make sure they see a
		 * consistent snapshot: If modCounts change during a traversal
		 * of segments computing size or checking containsValue, then
		 * we might have an inconsistent view of state so (usually)
		 * must retry.
		 */
		transient int modCount;

		/**
		 * The table is rehashed when its size exceeds this threshold.
		 * (The value of this field is always {@code (int)(capacity * loadFactor)}.)
		 */
		transient int threshold;

		/**
		 * The per-segment table.
		 */
		transient volatile HashEntry<K, V>[] table;

		/**
		 * The load factor for the hash table. Even though this value
		 * is same for all segments, it is replicated to avoid needing
		 * links to outer object.
		 *
		 * @serial
		 */
		final float loadFactor;

		/**
		 * The collected weak-key reference queue for this segment.
		 * This should be (re)initialized whenever table is assigned.
		 */
		transient volatile ReferenceQueue<Object> refQueue;

		final ReferenceType keyType;

		final ReferenceType valueType;

		final boolean identityComparisons;

		Segment(int initialCapacity, float lf, ReferenceType keyType,
				ReferenceType valueType, boolean identityComparisons) {
			loadFactor = lf;
			this.keyType = keyType;
			this.valueType = valueType;
			this.identityComparisons = identityComparisons;
			setTable( HashEntry.<K, V>newArray( initialCapacity ) );
		}

		@SuppressWarnings("unchecked")
		static final <K, V> Segment<K, V>[] newArray(int i) {
			return new Segment[i];
		}

		private boolean keyEq(Object src, Object dest) {
			return identityComparisons ? src == dest : src.equals( dest );
		}

		/**
		 * Sets table to new HashEntry array.
		 * Call only while holding lock or in constructor.
		 */
		void setTable(HashEntry<K, V>[] newTable) {
			threshold = (int) ( newTable.length * loadFactor );
			table = newTable;
			refQueue = new ReferenceQueue<Object>();
		}

		/**
		 * Returns properly casted first entry of bin for given hash.
		 */
		HashEntry<K, V> getFirst(int hash) {
			HashEntry<K, V>[] tab = table;
			return tab[hash & ( tab.length - 1 )];
		}

		HashEntry<K, V> newHashEntry(K key, int hash, HashEntry<K, V> next, V value) {
			return new HashEntry<K, V>( key, hash, next, value, keyType, valueType, refQueue );
		}

		/**
		 * Reads value field of an entry under lock. Called if value
		 * field ever appears to be null. This is possible only if a
		 * compiler happens to reorder a HashEntry initialization with
		 * its table assignment, which is legal under memory model
		 * but is not known to ever occur.
		 */
		V readValueUnderLock(HashEntry<K, V> e) {
			lock();
			try {
				removeStale();
				return e.value();
			}
			finally {
				unlock();
			}
		}

		/* Specialized implementations of map methods */

		V get(Object key, int hash) {
			if ( count != 0 ) { // read-volatile
				HashEntry<K, V> e = getFirst( hash );
				while ( e != null ) {
					if ( e.hash == hash && keyEq( key, e.key() ) ) {
						Object opaque = e.valueRef;
						if ( opaque != null ) {
							return e.dereferenceValue( opaque );
						}

						return readValueUnderLock( e ); // recheck
					}
					e = e.next;
				}
			}
			return null;
		}

		boolean containsKey(Object key, int hash) {
			if ( count != 0 ) { // read-volatile
				HashEntry<K, V> e = getFirst( hash );
				while ( e != null ) {
					if ( e.hash == hash && keyEq( key, e.key() ) ) {
						return true;
					}
					e = e.next;
				}
			}
			return false;
		}

		boolean containsValue(Object value) {
			if ( count != 0 ) { // read-volatile
				HashEntry<K, V>[] tab = table;
				int len = tab.length;
				for ( int i = 0; i < len; i++ ) {
					for ( HashEntry<K, V> e = tab[i]; e != null; e = e.next ) {
						Object opaque = e.valueRef;
						V v;

						if ( opaque == null ) {
							v = readValueUnderLock( e ); // recheck
						}
						else {
							v = e.dereferenceValue( opaque );
						}

						if ( value.equals( v ) ) {
							return true;
						}
					}
				}
			}
			return false;
		}

		boolean replace(K key, int hash, V oldValue, V newValue) {
			lock();
			try {
				removeStale();
				HashEntry<K, V> e = getFirst( hash );
				while ( e != null && ( e.hash != hash || !keyEq( key, e.key() ) ) ) {
					e = e.next;
				}

				boolean replaced = false;
				if ( e != null && oldValue.equals( e.value() ) ) {
					replaced = true;
					e.setValue( newValue, valueType, refQueue );
				}
				return replaced;
			}
			finally {
				unlock();
			}
		}

		V replace(K key, int hash, V newValue) {
			lock();
			try {
				removeStale();
				HashEntry<K, V> e = getFirst( hash );
				while ( e != null && ( e.hash != hash || !keyEq( key, e.key() ) ) ) {
					e = e.next;
				}

				V oldValue = null;
				if ( e != null ) {
					oldValue = e.value();
					e.setValue( newValue, valueType, refQueue );
				}
				return oldValue;
			}
			finally {
				unlock();
			}
		}

		V put(K key, int hash, V value, boolean onlyIfAbsent) {
			lock();
			try {
				removeStale();
				int c = count;
				if ( c++ > threshold ) { // ensure capacity
					int reduced = rehash();
					if ( reduced > 0 ) { // adjust from possible weak cleanups
						count = ( c -= reduced ) - 1; // write-volatile
					}
				}

				HashEntry<K, V>[] tab = table;
				int index = hash & ( tab.length - 1 );
				HashEntry<K, V> first = tab[index];
				HashEntry<K, V> e = first;
				while ( e != null && ( e.hash != hash || !keyEq( key, e.key() ) ) ) {
					e = e.next;
				}

				V oldValue;
				if ( e != null ) {
					oldValue = e.value();
					if ( !onlyIfAbsent ) {
						e.setValue( value, valueType, refQueue );
					}
				}
				else {
					oldValue = null;
					++modCount;
					tab[index] = newHashEntry( key, hash, first, value );
					count = c; // write-volatile
				}
				return oldValue;
			}
			finally {
				unlock();
			}
		}

		int rehash() {
			HashEntry<K, V>[] oldTable = table;
			int oldCapacity = oldTable.length;
			if ( oldCapacity >= MAXIMUM_CAPACITY ) {
				return 0;
			}

			/*
			 * Reclassify nodes in each list to new Map. Because we are
			 * using power-of-two expansion, the elements from each bin
			 * must either stay at same index, or move with a power of two
			 * offset. We eliminate unnecessary node creation by catching
			 * cases where old nodes can be reused because their next
			 * fields won't change. Statistically, at the default
			 * threshold, only about one-sixth of them need cloning when
			 * a table doubles.
			 * The nodes they replace will be garbage
			 * collectable as soon as they are no longer referenced by any
			 * reader thread that may be in the midst of traversing table
			 * right now.
			 */

			HashEntry<K, V>[] newTable = HashEntry.newArray( oldCapacity << 1 );
			threshold = (int) ( newTable.length * loadFactor );
			int sizeMask = newTable.length - 1;
			int reduce = 0;
			for ( int i = 0; i < oldCapacity; i++ ) {
				// We need to guarantee that any existing reads of old Map can
				// proceed. So we cannot yet null out each bin.
				HashEntry<K, V> e = oldTable[i];

				if ( e != null ) {
					HashEntry<K, V> next = e.next;
					int idx = e.hash & sizeMask;

					// Single node on list
					if ( next == null ) {
						newTable[idx] = e;
					}
					else {
						// Reuse trailing consecutive sequence at same slot
						HashEntry<K, V> lastRun = e;
						int lastIdx = idx;
						for ( HashEntry<K, V> last = next; last != null; last = last.next ) {
							int k = last.hash & sizeMask;
							if ( k != lastIdx ) {
								lastIdx = k;
								lastRun = last;
							}
						}
						newTable[lastIdx] = lastRun;
						// Clone all remaining nodes
						for ( HashEntry<K, V> p = e; p != lastRun; p = p.next ) {
							// Skip GC'd weak refs
							K key = p.key();
							if ( key == null ) {
								reduce++;
								continue;
							}
							int k = p.hash & sizeMask;
							HashEntry<K, V> n = newTable[k];
							newTable[k] = newHashEntry( key, p.hash, n, p.value() );
						}
					}
				}
			}
			table = newTable;
			return reduce;
		}

		/**
		 * Remove; match on key only if value null, else match both.
		 */
		V remove(Object key, int hash, Object value, boolean refRemove) {
			lock();
			try {
				if ( !refRemove ) {
					removeStale();
				}
				int c = count - 1;
				HashEntry<K, V>[] tab = table;
				int index = hash & ( tab.length - 1 );
				HashEntry<K, V> first = tab[index];
				HashEntry<K, V> e = first;
				// a ref remove operation compares the Reference instance
				while ( e != null && key != e.keyRef
						&& ( refRemove || hash != e.hash || !keyEq( key, e.key() ) ) ) {
					e = e.next;
				}

				V oldValue = null;
				if ( e != null ) {
					V v = e.value();
					if ( value == null || value.equals( v ) ) {
						oldValue = v;
						// All entries following removed node can stay
						// in list, but all preceding ones need to be
						// cloned.
						++modCount;
						HashEntry<K, V> newFirst = e.next;
						for ( HashEntry<K, V> p = first; p != e; p = p.next ) {
							K pKey = p.key();
							if ( pKey == null ) { // Skip GC'd keys
								c--;
								continue;
							}

							newFirst = newHashEntry( pKey, p.hash, newFirst, p.value() );
						}
						tab[index] = newFirst;
						count = c; // write-volatile
					}
				}
				return oldValue;
			}
			finally {
				unlock();
			}
		}

		final void removeStale() {
			KeyReference ref;
			while ( ( ref = (KeyReference) refQueue.poll() ) != null ) {
				remove( ref.keyRef(), ref.keyHash(), null, true );
			}
		}

		void clear() {
			if ( count != 0 ) {
				lock();
				try {
					HashEntry<K, V>[] tab = table;
					for ( int i = 0; i < tab.length; i++ ) {
						tab[i] = null;
					}
					++modCount;
					// replace the reference queue to avoid unnecessary stale cleanups
					refQueue = new ReferenceQueue<Object>();
					count = 0; // write-volatile
				}
				finally {
					unlock();
				}
			}
		}
	}

	/* ---------------- Public operations -------------- */

	/**
	 * Creates a new, empty map with the specified initial
	 * capacity, reference types, load factor and concurrency level.
	 *
	 * Behavior-changing options such as {@link Option#IDENTITY_COMPARISONS}
	 * can also be specified.
	 *
	 * @param initialCapacity the initial capacity. The implementation
	 * performs internal sizing to accommodate this many elements.
	 * @param loadFactor the load factor threshold, used to control resizing.
	 * Resizing may be performed when the average number of elements per
	 * bin exceeds this threshold.
	 * @param concurrencyLevel the estimated number of concurrently
	 * updating threads. The implementation performs internal sizing
	 * to try to accommodate this many threads.
	 * @param keyType the reference type to use for keys
	 * @param valueType the reference type to use for values
	 * @param options the behavioral options
	 *
	 * @throws IllegalArgumentException if the initial capacity is
	 * negative or the load factor or concurrencyLevel are
	 * nonpositive.
	 */
	public ConcurrentReferenceHashMap(int initialCapacity,
			float loadFactor, int concurrencyLevel,
			ReferenceType keyType, ReferenceType valueType,
			EnumSet