net.sf.ehcache.distribution.package.html Maven / Gradle / Ivy
Go to download
Show more of this group Show more artifacts with this name
Show all versions of ehcache-core Show documentation
Show all versions of ehcache-core Show documentation
This is the ehcache core module. Pair it with other modules for added functionality.
This package is for cache replication.
Overview
Problems with Instance Caches in a Clustered Environment
Many production applications are deployed in clusters. If each application maintains its own cache, then updates
made to one cache will not appear in the others. A workaround for web based applications is to use sticky sessions,
so that a user, having established a session on one server, stays on that server for the rest of the session. A
workaround for transaction processing systems using Hibernate is to do a session.refresh on each persistent object
as part of the save. session.refresh explicitly reloads the object from the database, ignoring any cache values.
Replicated Cache
Another solution is to replicate data between the caches to keep them consistent. This is sometimes called cache
coherency. Applicable operations include:
- put
- update (put which overwrites an existing entry)
- remove
Replicated Cache Terms
Replicated Cache - a cache instance that notifies others when its contents change
Notification - a mechanism to replicate changes
Topology - a layout for how replicated caches connect with and notify each other
Notification Strategies
The best way of notifying of put and update depends on the nature of the cache.
If the Element is not available
anywhere else then the Element itself should form the payload of the notification. An example is a cached web page.
This notification strategy is called copy.
Where the cached data is available in a database, there are two choices. Copy as before, or invalidate the data. By
invalidating the data, the application tied to the other cache instance will be forced to refresh its cache from the
database, preserving cache coherency. Only the Element key needs to be passed over the network.
ehcache supports notification through copy and invalidate, selectable per cache.
Topology Choices
Peer Cache Replicator
Each replicated cache instance notifies every other cache instance when its contents change. This requires n-1
notifications per change, where n is the number of cache instances in the cluster.
Centralised Cache Replicator
Each replicated cache instance notifies a master cache instance when its contents change. The master cache then
notifies the other instances. This requires n-1
notifications per change, where n is the number of cache instances in the cluster.
ehcache uses a peer replication topology. It adds a twist with CachePeerProvider, an interface which supplies a list
of cache instance peers, so as to handle peers entering and leaving the cluster. Some ideas for peer provider
implementations are: configuration time list, multicast discovery, application specific cluster list.
Replication Drawbacks and Solutions in ehcache's implementation
Some potentially significant obstacles have to be overcome if replication is to provide a net benefit.
Chatty Protocol
n-1 notifications need to happen each time a a cache instance change occurs. A very large amount of network traffic
can be generated.
ehcache will buffer changes to lower chattiness.
Redundant Notifications
The cache instance that initiated the change should not receive its own notifications. To do so would add additional
overhead. Also, notifications should not endlessly go back and forth as each cache listener gets changes caused by
a remote replication.
ehcache CachePeerProvider indentifies the local cache instance and excludes it from the notification list. Each Cache
has a GUID. That GUID can be compared with list of cache peers and the local peer excluded.
Infinite notifications are prevented by having each CacheReplicatorListener call putQuiet and removeQuite methods
on their decorated caches, so as not to nofify listeners.
Potential for Inconsisent Data
Timing scenarios, race conditions, delivery and reliability constraints, and concurrent updates to the same cached
data can cause inconsistency (and thus a lack of coherency) across the cache instancies.
Acknowledgement: Much of the material here was drawn from Data Access Patterns, by Clifton Nock.