com.gemstone.org.jgroups.primarychanges.txt Maven / Gradle / Ivy
Go to download
Show more of this group Show more artifacts with this name
Show all versions of gemfire-jgroups Show documentation
Show all versions of gemfire-jgroups Show documentation
SnappyData store based off Pivotal GemFireXD
The newest version!
Here are some of the larger Jgroups issues addressed by GemStone:
1) concurrent programming errors
* lack of use of volatile or synchronization in many places
* removal of (thread != null && thread.isAlive()) anitpattern
* notify/wait races, especially in PingSender/PingWaiter during discovery
* lack of interrupt handling (inability to forcibly shut down the stack)
2) discovery problems, primarily in pbcast.ClientGMSImpl.join(), but also
in PingWaiter response collection. These usually lead to the formation
of multiple subgroups when there are large numbers of new members joining.
3) failure to contact GossipServer during startup must disallow start of stack
4) race condition between GossipServer.getmbrs and GossipServer.join
(atomicity of join operation)
5) lack of SSL support in GossipServer/GossipClient
6) GossipServer restart can cause discovery problems - should persist state
and attempt recovery from other servers
7) NIC failure in client can cause GossipServer to hang
8) UNICAST is far too slow, requiring context switches to receive an ACK
(added new DirAck protocol). There were also race conditions in UNICAST
handling of messages from new or departed members. These changes were
given to the JGroups dev group.
9) FC (flow control) protocol is subject to hangs, has skewed perception of
debt
10) too many threads in stack, requiring context switches (stack configuration)
11) too many clock probes, causing user/kernel delays
12) a member that is kicked out of the group fails to realize the fact,
leading to hang
13) delivery of messages to application is too slow (added Receiver interface
and contributed this to the jgroups dev group)
14) FD failure detection prone to false positives when stack isn't being used
much
15) FD_SOCK failure detection should have controlled shutdown protocol
(solution contributed to jgroups dev group)
16) jgroups shutdown does not stop timer thread (thread leak)
17) jgroups threads missing last-chance exception handlers (solution contributed
to jgroups dev group)
18) jgroups does not handle out-of-memory (being addressed in GemFire Fraser
release)
19) In the FD protocol, a member that is not in the group should not be able
to suspect a member that is in the group
20) mixed use of FD/FD_SOCK should raise an alert
21) record statistics on JGroups stack performance for customer support
analysis
22) view management thread should allow multiple join/leave in one view
change (this has also been addressed by the JGroups dev group)
23) need ability to find a random available port for automated testing
24) JGroups payload serialization is too slow (tie in GemFire DataSerializable
support)
25) FRAG/FRAG2 protocols subject to race conditions at startup, may hold onto
fragments that are never delivered to applications. Similar problem in
UNICAST.
26) FD_SOCK should have a faster failure mode when socket connections are broken
27) jgroups stack startup is too slow (xml parsing)