com.sleepycat.je.rep.impl.node
Class RepNode

java.lang.Object
  extended by java.lang.Thread
      extended by com.sleepycat.je.utilint.StoppableThread
          extended by com.sleepycat.je.rep.impl.node.RepNode
All Implemented Interfaces:
ExceptionListenerUser, Runnable

public class RepNode
extends StoppableThread

Represents a replication node. This class is the locus of operations that manage the state of the node: master, replica, etc. Once the state of a node has been established, the thread of control passes over to the Replica or FeederManager instance. Note that both Feeders and the Replica instance may be active in the future, when r2r replication is supported in addition to m2r replication. For now, however, either the FeederManager or the Replica is active, so the same thread of control can be shared between the two.


Nested Class Summary
static class RepNode.Clock
           
 
Nested classes/interfaces inherited from class java.lang.Thread
Thread.State, Thread.UncaughtExceptionHandler
 
Field Summary
(package private)  LocalCBVLSNTracker cbvlsnTracker
           
(package private)  GlobalCBVLSN globalCBVLSN
           
(package private)  Logger logger
           
(package private)  RepGroupDB repGroupDB
           
 
Fields inherited from class com.sleepycat.je.utilint.StoppableThread
envImpl
 
Fields inherited from class java.lang.Thread
MAX_PRIORITY, MIN_PRIORITY, NORM_PRIORITY
 
Constructor Summary
RepNode()
           
RepNode(NameIdPair nameIdPair)
           
RepNode(NameIdPair nameIdPair, ServiceDispatcher serviceDispatcher)
           
RepNode(RepImpl repImpl, Replay replay, NodeState nodeState)
           
 
Method Summary
 void configLogFlusher(DbConfigManager configMgr)
           
 void currentCommitVLSN(VLSN commitVLSN)
          Notes the VLSN associated with the latest commit.
 String dumpState()
           
 FeederManager feederManager()
           
 void forceMaster(boolean force)
           
(package private)  LocalCBVLSNTracker getCBVLSNTracker()
           
 ChannelTimeoutTask getChannelTimeoutTask()
           
 long getCleanerBarrierFile()
          Returns the file number that forms a barrier for the cleaner's file deletion activities.
 RepNode.Clock getClock()
           
(package private)  DbConfigManager getConfigManager()
           
 VLSN getCurrentCommitVLSN()
          Returns the latest VLSN associated with a replicated commit.
(package private)  int getDbTreeCacheClearingOpCount()
           
 QuorumPolicy getElectionPolicy()
           
 int getElectionPriority()
           
 int getElectionQuorumSize(QuorumPolicy quorumPolicy)
          Returns the number of nodes needed to form a quorum for elections
 Elections getElections()
           
 FeederTxns getFeederTxns()
           
 RepGroupImpl getGroup()
           
 VLSN getGroupCBVLSN()
          May return NULL_VLSN
 int getHeartbeatInterval()
           
 LogFlusher getLogFlusher()
           
 Logger getLogger()
           
 LogManager getLogManager()
           
 RepNodeImpl[] getLogProviders()
          Returns a list of nodes suitable for feeding log files for a network restore.
 int getLogVersion()
           
 String getMasterName()
           
 MasterStatus getMasterStatus()
           
 MonitorEventManager getMonitorEventManager()
           
 NameIdPair getNameIdPair()
           
 int getNodeId()
          Returns the nodeId associated with this replication node.
 String getNodeName()
          Returns the nodeName associated with this replication node.
 RepUtils.ExceptionAwareCountDownLatch getReadyLatch()
           
 RepGroupDB getRepGroupDB()
           
 RepImpl getRepImpl()
           
(package private)  Replica getReplica()
           
(package private)  long getReplicaCloseCatchupMs()
           
 ServiceDispatcher getServiceDispatcher()
           
 InetSocketAddress getSocket()
           
 ReplicatedEnvironmentStats getStats(StatsConfig config)
          Returns the accumulated statistics for this node.
 int getThreadWaitInterval()
           
 Timer getTimer()
          Returns the timer associated with this RepNode
 UUID getUUID()
          Returns the UUID associated with the replicated environment.
 CommitFreezeLatch getVLSNFreezeLatch()
           
 VLSNIndex getVLSNIndex()
           
protected  int initiateSoftShutdown()
          Soft shutdown for the RepNode thread.
 boolean isActivePrimary()
          Returns true if the node is a designated Primary that has been activated.
 boolean isAuthoritativeMaster()
          Returns a definitive answer to whether this node is currently the master by checking both its status as a master and that at least a simple majority of nodes agrees that it's the master based on the number of feeder connections to it.
 boolean isMaster()
           
 ReplicatedEnvironment.State joinGroup(ReplicaConsistencyPolicy consistency, QuorumPolicy initialElectionPolicy)
          JoinGroup ensures that a RepNode is actively participating in a replication group.
 int minAckNodes(Durability.ReplicaAckPolicy ackPolicy)
          Returns the minimum number of replication nodes required to implement the ReplicaAckPolicy for a given group size.
 int minAckNodes(Durability durability)
           
 void passivatePrimary()
           
(package private)  void recalculateGlobalCBVLSN()
          Recalculate the Global CBVLSN, provoked by Replay, to ensure that the replica's global CBVLSN is up to date.
 RepGroupImpl refreshCachedGroup()
          This method must be invoked when a RepNode is first initialized and subsequently every time there is a change to the replication group.
(package private)  void reinitSelfElect()
          Establishes this node as the master, after re-initializing the group with this as the sole node in the group.
 void removeMember(String nodeName)
          Removes a node so that it's no longer a member of the group.
 Replica replica()
           
 void resetReadyLatch(Exception exception)
           
 void resetStats()
           
 void restartNetworkBackup()
          Restarts the network backup service *after* a rollback has been completed and the log files are once again in a consistent state.
 void run()
          The top level Master/Feeder or Replica loop in support of replication.
 void setElectableGroupSizeOverride(int override)
           
 void setVersionHook(TestHook<Integer> versionHook)
           
 void shutdown()
          Used to shut down all activity associated with this replication stream.
 void shutdownGroupOnClose(long timeoutMs)
          Must be invoked on the Master via the last open handle.
 void shutdownNetworkBackup()
          Shuts down the Network backup service *before* a rollback is initiated as part of syncup, thus ensuring that NetworkRestore does not see an inconsistent set of log files.
 void syncupEnded()
           
 void syncupStarted()
          Returns the group-wide CBVLSN.
 void trackSyncableVLSN(VLSN syncableVLSN, long lsn)
          Should be called whenever a new VLSN is associated with a log entry suitable for Replica/Feeder syncup.
 boolean tryActivatePrimary()
          Tries to activate this node as a Primary, if it has been configured as such and if the group size is two.
 void updateAddress(String nodeName, String newHostName, int newPort)
          Update the network address of a node.
 void updateGroupInfo(NameIdPair updateNameIdPair, RepGroupImpl.BarrierState barrierState)
          Updates the cached group info for the node, avoiding a database read.
 
Methods inherited from class com.sleepycat.je.utilint.StoppableThread
cleanup, getSavedShutdownException, getTotalCpuTime, getTotalUserTime, isShutdown, saveShutdownException, setExceptionListener, shutdownDone, shutdownThread
 
Methods inherited from class java.lang.Thread
activeCount, checkAccess, countStackFrames, currentThread, destroy, dumpStack, enumerate, getAllStackTraces, getContextClassLoader, getDefaultUncaughtExceptionHandler, getId, getName, getPriority, getStackTrace, getState, getThreadGroup, getUncaughtExceptionHandler, holdsLock, interrupt, interrupted, isAlive, isDaemon, isInterrupted, join, join, join, resume, setContextClassLoader, setDaemon, setDefaultUncaughtExceptionHandler, setName, setPriority, setUncaughtExceptionHandler, sleep, sleep, start, stop, stop, suspend, toString, yield
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait
 

Field Detail

repGroupDB

final RepGroupDB repGroupDB

cbvlsnTracker

final LocalCBVLSNTracker cbvlsnTracker

globalCBVLSN

final GlobalCBVLSN globalCBVLSN

logger

final Logger logger
Constructor Detail

RepNode

public RepNode(RepImpl repImpl,
               Replay replay,
               NodeState nodeState)
        throws IOException,
               DatabaseException
Throws:
IOException
DatabaseException

RepNode

public RepNode(NameIdPair nameIdPair)

RepNode

public RepNode()

RepNode

public RepNode(NameIdPair nameIdPair,
               ServiceDispatcher serviceDispatcher)
Method Detail

getLogger

public Logger getLogger()
Specified by:
getLogger in class StoppableThread
Returns:
a logger to use when logging uncaught exceptions.

getTimer

public Timer getTimer()
Returns the timer associated with this RepNode


getServiceDispatcher

public ServiceDispatcher getServiceDispatcher()

getStats

public ReplicatedEnvironmentStats getStats(StatsConfig config)
Returns the accumulated statistics for this node. The method encapsulates the statistics associated with its two principal components, the FeederManager and the Replica.


resetStats

public void resetStats()

getReadyLatch

public RepUtils.ExceptionAwareCountDownLatch getReadyLatch()

getVLSNFreezeLatch

public CommitFreezeLatch getVLSNFreezeLatch()

resetReadyLatch

public void resetReadyLatch(Exception exception)

feederManager

public FeederManager feederManager()

replica

public Replica replica()

getClock

public RepNode.Clock getClock()

getReplica

Replica getReplica()

getRepGroupDB

public RepGroupDB getRepGroupDB()

getGroup

public RepGroupImpl getGroup()

getUUID

public UUID getUUID()
Returns the UUID associated with the replicated environment.


getNodeName

public String getNodeName()
Returns the nodeName associated with this replication node.

Returns:
the nodeName

getNodeId

public int getNodeId()
Returns the nodeId associated with this replication node.

Returns:
the nodeId

getNameIdPair

public NameIdPair getNameIdPair()

getSocket

public InetSocketAddress getSocket()

getMasterStatus

public MasterStatus getMasterStatus()

isAuthoritativeMaster

public boolean isAuthoritativeMaster()
Returns a definitive answer to whether this node is currently the master by checking both its status as a master and that at least a simple majority of nodes agrees that it's the master based on the number of feeder connections to it. Such an authoritative answer is needed in a network partition situation to detect a master that may be isolated on the minority side of a network partition.

Returns:
true if the node is definitely the master. False if it's not or we cannot be sure.
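The majority check described above can be sketched as follows. This is a minimal illustration and not the JE implementation: the class AuthoritativeMasterCheck, its method, and the assumption that the master counts itself alongside its connected feeders are all hypothetical.

```java
/** Hypothetical sketch; not part of the JE API. */
final class AuthoritativeMasterCheck {
    private AuthoritativeMasterCheck() {}

    /**
     * @param isMaster      whether the node currently believes it is the master
     * @param activeFeeders number of replicas with live feeder connections
     * @param groupSize     number of nodes in the group (assumed to be at least 1)
     * @return true only if the node is the master and a strict majority agrees
     */
    static boolean isAuthoritative(boolean isMaster, int activeFeeders, int groupSize) {
        if (!isMaster) {
            return false;
        }
        // Assumption: the master counts itself toward the majority.
        int agreeing = activeFeeders + 1;
        // Strict majority: more than half of the group.
        return agreeing > groupSize / 2;
    }
}
```

Under these assumptions, a master in a three-node group with no feeder connections is not authoritative, which is the isolated-minority case the method is meant to detect.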

getHeartbeatInterval

public int getHeartbeatInterval()

setVersionHook

public void setVersionHook(TestHook<Integer> versionHook)

getLogVersion

public int getLogVersion()

getElectionPriority

public int getElectionPriority()

getThreadWaitInterval

public int getThreadWaitInterval()

getDbTreeCacheClearingOpCount

int getDbTreeCacheClearingOpCount()

getRepImpl

public RepImpl getRepImpl()

getLogManager

public LogManager getLogManager()

getConfigManager

DbConfigManager getConfigManager()

getVLSNIndex

public VLSNIndex getVLSNIndex()

getFeederTxns

public FeederTxns getFeederTxns()

getElections

public Elections getElections()

getElectionPolicy

public QuorumPolicy getElectionPolicy()

getLogProviders

public RepNodeImpl[] getLogProviders()
Returns a list of nodes suitable for feeding log files for a network restore.

Returns:
a list of hostPort pairs

getLogFlusher

public LogFlusher getLogFlusher()

configLogFlusher

public void configLogFlusher(DbConfigManager configMgr)

getChannelTimeoutTask

public ChannelTimeoutTask getChannelTimeoutTask()

isMaster

public boolean isMaster()

currentCommitVLSN

public void currentCommitVLSN(VLSN commitVLSN)
Notes the VLSN associated with the latest commit. The updates are done in ascending order.

Parameters:
commitVLSN - the commit VLSN

getMonitorEventManager

public MonitorEventManager getMonitorEventManager()

getMasterName

public String getMasterName()

getCurrentCommitVLSN

public VLSN getCurrentCommitVLSN()
Returns the latest VLSN associated with a replicated commit.


forceMaster

public void forceMaster(boolean force)
                 throws InterruptedException,
                        DatabaseException
Throws:
InterruptedException
DatabaseException

refreshCachedGroup

public RepGroupImpl refreshCachedGroup()
                                throws DatabaseException
This method must be invoked when a RepNode is first initialized and subsequently every time there is a change to the replication group.

The Master should invoke this method each time a member is added or removed, and a replica should invoke it each time it detects the commit of a transaction that modifies the membership database.

In addition, it must be invoked after a syncup operation, since it may revert changes made to the membership table.

Throws:
DatabaseException

removeMember

public void removeMember(String nodeName)
Removes a node so that it's no longer a member of the group. Note that names referring to deleted nodes cannot be reused.

Parameters:
nodeName - identifies the node to be deleted.
Throws:
MemberNotFoundException - if the node denoted by nodeName is not a member of the replication group.
MasterStateException - if the member being removed is currently the Master
See Also:
Member Deletion

updateAddress

public void updateAddress(String nodeName,
                          String newHostName,
                          int newPort)
Update the network address of a node. Note that the address of a live node cannot be updated; a ReplicaStateException is thrown in that case.

Parameters:
nodeName - identifies the node to be updated
newHostName - the new host name of this node
newPort - the new port of this node

updateGroupInfo

public void updateGroupInfo(NameIdPair updateNameIdPair,
                            RepGroupImpl.BarrierState barrierState)
Updates the cached group info for the node, avoiding a database read.

Parameters:
updateNameIdPair - the node whose localCBVLSN must be updated.
barrierState - the new node syncup state

recalculateGlobalCBVLSN

void recalculateGlobalCBVLSN()
Recalculate the Global CBVLSN, provoked by Replay, to ensure that the replica's global CBVLSN is up to date.


getCBVLSNTracker

LocalCBVLSNTracker getCBVLSNTracker()

reinitSelfElect

void reinitSelfElect()
               throws IOException
Establishes this node as the master, after re-initializing the group with this as the sole node in the group. This method is used solely as part of the DbResetRepGroup utility.

Throws:
IOException

run

public void run()
The top level Master/Feeder or Replica loop in support of replication. It's responsible for driving the node level state changes resulting from elections initiated either by this node, or by other members of the group.

The thread is terminated via an orderly shutdown initiated as a result of an interrupt issued by the shutdown() method. Any exception that is not handled by the run method itself is caught by the thread's uncaught exception handler, and results in the RepImpl being made invalid. In that case, the application is responsible for closing the Replicated Environment, which will provoke the shutdown.

Note: This method currently runs either the feeder loop or the replica loop. With R to R support, it would be possible for a Replica to run both. This will be a future feature.

Specified by:
run in interface Runnable
Overrides:
run in class Thread

shutdown

public void shutdown()
              throws InterruptedException,
                     DatabaseException
Used to shut down all activity associated with this replication stream. If the method is invoked from a different thread of control, it will wait until the rep node thread exits. If it is invoked from the same thread, it is the caller's responsibility to exit the thread upon return from this method.

Throws:
InterruptedException
DatabaseException

initiateSoftShutdown

protected int initiateSoftShutdown()
Soft shutdown for the RepNode thread. Note that since the thread is shared by the FeederManager and the Replica, the FeederManager or Replica specific soft shutdown actions should already have been done earlier.

Overrides:
initiateSoftShutdown in class StoppableThread
Returns:
the amount of time in ms that the shutdownThread method will wait for the thread to exit. A negative value means that the method will not wait. A zero value means it will wait indefinitely.
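The three cases of the return value can be illustrated with a small sketch. It is an assumption about how a caller might interpret the value, not the StoppableThread implementation; the class SoftShutdownWaitSketch and the method awaitExit are hypothetical names.

```java
/** Hypothetical sketch; not part of the JE API. */
final class SoftShutdownWaitSketch {
    private SoftShutdownWaitSketch() {}

    /**
     * Waits for the thread according to the soft-shutdown wait value.
     *
     * @param waitMs negative: do not wait; zero: wait indefinitely;
     *               positive: wait at most that many milliseconds
     * @return true if the thread has exited by the time this method returns
     */
    static boolean awaitExit(Thread thread, int waitMs) {
        if (waitMs < 0) {
            // Negative value: report the current state without waiting.
            return !thread.isAlive();
        }
        try {
            // Thread.join(0) waits until the thread exits.
            thread.join(waitMs);
        } catch (InterruptedException e) {
            // Preserve the interrupt for the caller and fall through.
            Thread.currentThread().interrupt();
        }
        return !thread.isAlive();
    }
}
```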

shutdownGroupOnClose

public void shutdownGroupOnClose(long timeoutMs)
                          throws IllegalStateException
Must be invoked on the Master via the last open handle. Note that the method itself does not shut down the group. It merely sets replicaCloseCatchupMs, indicating that the ensuing handle close should shut down the Replicas. The actual coordination with the closing of the handle is implemented by ReplicatedEnvironment.shutdownGroup().

Throws:
IllegalStateException
See Also:
ReplicatedEnvironment.shutdownGroup(long, TimeUnit)

joinGroup

public ReplicatedEnvironment.State joinGroup(ReplicaConsistencyPolicy consistency,
                                             QuorumPolicy initialElectionPolicy)
                                      throws ReplicaConsistencyException,
                                             DatabaseException,
                                             IOException
JoinGroup ensures that a RepNode is actively participating in a replication group. It's invoked each time a replicated environment handle is created. If the node is already participating in a replication group, because it's not the first handle to the environment, it will return without having to wait. Otherwise it will wait until a master is elected and this node is active, either as a Master, or as a Replica. If the node joins as a replica, it will wait further until it has become sufficiently consistent as defined by its consistency argument. By default it uses PointConsistencyPolicy to ensure that it is at least as consistent as the master as of the time the handle was opened.

Returns:
MASTER or REPLICA
Throws:
ReplicaConsistencyException
DatabaseException
IOException

trackSyncableVLSN

public void trackSyncableVLSN(VLSN syncableVLSN,
                              long lsn)
Should be called whenever a new VLSN is associated with a log entry suitable for Replica/Feeder syncup.


getGroupCBVLSN

public VLSN getGroupCBVLSN()
May return NULL_VLSN


getElectionQuorumSize

public int getElectionQuorumSize(QuorumPolicy quorumPolicy)
Returns the number of nodes needed to form a quorum for elections

Parameters:
quorumPolicy - the quorum policy used to determine the quorum size
Returns:
the number of nodes required for a quorum

minAckNodes

public int minAckNodes(Durability.ReplicaAckPolicy ackPolicy)
Returns the minimum number of replication nodes required to implement the ReplicaAckPolicy for a given group size.

Returns:
the number of nodes that are needed
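As a rough illustration of how an acknowledgment policy might map to a node count, consider the sketch below. It is not the JE implementation: the local enum AckPolicy merely mirrors the value names of Durability.ReplicaAckPolicy, the group size is passed in explicitly, and the assumption that the returned count includes the master itself is hypothetical.

```java
/** Hypothetical sketch; not part of the JE API. */
final class MinAckNodesSketch {
    private MinAckNodesSketch() {}

    /** Local stand-in that mirrors the names in Durability.ReplicaAckPolicy. */
    enum AckPolicy { ALL, NONE, SIMPLE_MAJORITY }

    /**
     * @param policy    the acknowledgment policy
     * @param groupSize number of nodes in the group (assumed to be at least 1)
     * @return the assumed minimum number of nodes, counting the master itself
     */
    static int minAckNodes(AckPolicy policy, int groupSize) {
        switch (policy) {
            case ALL:
                // Every node in the group must participate.
                return groupSize;
            case NONE:
                // Only the master itself is required.
                return 1;
            case SIMPLE_MAJORITY:
                // More than half of the group.
                return groupSize / 2 + 1;
            default:
                throw new IllegalArgumentException("Unknown policy: " + policy);
        }
    }
}
```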

minAckNodes

public int minAckNodes(Durability durability)

syncupStarted

public void syncupStarted()
Returns the group-wide CBVLSN. The group CBVLSN is computed as the minimum of CBVLSNs after discarding CBVLSNs that are obsolete. A CBVLSN is considered obsolete if it has not been updated within a configurable time interval relative to the time that the most recent CBVLSN was updated.

Throws:
DatabaseException
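The computation described above (minimum of the CBVLSNs that are not obsolete) can be sketched as follows. This is an illustration under stated assumptions, not the JE code: VLSNs are modelled as plain long sequence numbers, and the names GroupCbvlsnSketch, Entry, NULL_VLSN (here -1) and maxStalenessMs are hypothetical.

```java
import java.util.List;

/** Hypothetical sketch; not part of the JE API. */
final class GroupCbvlsnSketch {
    private GroupCbvlsnSketch() {}

    /** Assumed sentinel for "no usable CBVLSN". */
    static final long NULL_VLSN = -1L;

    /** One node's barrier: a VLSN sequence number and its last update time. */
    static final class Entry {
        final long vlsn;
        final long updatedAtMs;

        Entry(long vlsn, long updatedAtMs) {
            this.vlsn = vlsn;
            this.updatedAtMs = updatedAtMs;
        }
    }

    /** Returns the minimum VLSN among entries that are not obsolete. */
    static long groupCbvlsn(List<Entry> entries, long maxStalenessMs) {
        // Find the time of the most recent update across all nodes.
        long newest = Long.MIN_VALUE;
        for (Entry e : entries) {
            newest = Math.max(newest, e.updatedAtMs);
        }
        boolean found = false;
        long min = Long.MAX_VALUE;
        for (Entry e : entries) {
            // Discard entries not updated within the interval
            // relative to the most recent update.
            if (newest - e.updatedAtMs <= maxStalenessMs) {
                min = Math.min(min, e.vlsn);
                found = true;
            }
        }
        return found ? min : NULL_VLSN;
    }
}
```

With entries (100 at t=1000), (50 at t=100) and (80 at t=900) and an interval of 200 ms, the second entry is discarded as obsolete and the result is 80.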

syncupEnded

public void syncupEnded()

getCleanerBarrierFile

public long getCleanerBarrierFile()
                           throws DatabaseException
Returns the file number that forms a barrier for the cleaner's file deletion activities. Files with numbers >= this file number cannot be deleted by the cleaner without disrupting the replication stream.

Returns:
the file number that's the barrier for cleaner file deletion
Throws:
DatabaseException
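A caller might apply the barrier as in the sketch below, which keeps only the files numbered below it. The class CleanerBarrierSketch and the method deletableFiles are hypothetical and only illustrate the >= rule stated above; they are not how the JE cleaner is structured.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch; not part of the JE API. */
final class CleanerBarrierSketch {
    private CleanerBarrierSketch() {}

    /**
     * @param candidates  file numbers the cleaner would like to delete
     * @param barrierFile the barrier file number
     * @return the candidates that lie strictly below the barrier
     */
    static List<Long> deletableFiles(List<Long> candidates, long barrierFile) {
        List<Long> deletable = new ArrayList<>();
        for (long file : candidates) {
            // Files at or above the barrier must be retained for replication.
            if (file < barrierFile) {
                deletable.add(file);
            }
        }
        return deletable;
    }
}
```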

getReplicaCloseCatchupMs

long getReplicaCloseCatchupMs()

isActivePrimary

public boolean isActivePrimary()
Returns true if the node is a designated Primary that has been activated.


tryActivatePrimary

public boolean tryActivatePrimary()
Tries to activate this node as a Primary, if it has been configured as such and if the group size is two. This method is invoked when an operation falls short of quorum requirements and is ready to trade durability for availability. More specifically it's invoked when an election fails, or there is an insufficient number of replicas during a begin transaction or a transaction commit. The Primary is passivated again when the Secondary contacts it.

Returns:
true if the primary was activated -- the quorum value is 1
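The activation rule can be modelled with a small state holder. This sketch is an assumption-laden illustration, not the JE implementation: the class PrimaryActivationSketch, its fields, and the use of a simple-majority quorum when the Primary is passive are all hypothetical.

```java
/** Hypothetical sketch; not part of the JE API. */
final class PrimaryActivationSketch {
    private final boolean designatedPrimary;
    private final int groupSize;
    private boolean active;

    PrimaryActivationSketch(boolean designatedPrimary, int groupSize) {
        this.designatedPrimary = designatedPrimary;
        this.groupSize = groupSize;
    }

    /** Activates the Primary only if it is designated and the group has two nodes. */
    synchronized boolean tryActivate() {
        if (!designatedPrimary || groupSize != 2) {
            return false;
        }
        active = true;
        return true;
    }

    /** Called when the Secondary makes contact again. */
    synchronized void passivate() {
        active = false;
    }

    /** Quorum is 1 while active; otherwise an assumed simple majority. */
    synchronized int quorumSize() {
        return active ? 1 : groupSize / 2 + 1;
    }
}
```

In this model a two-node group normally needs both nodes for a quorum, and an activated Primary can proceed alone until passivate() is called.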

passivatePrimary

public final void passivatePrimary()

shutdownNetworkBackup

public final void shutdownNetworkBackup()
Shuts down the Network backup service *before* a rollback is initiated as part of syncup, thus ensuring that NetworkRestore does not see an inconsistent set of log files. Any network backup operations that are in progress at this node are aborted. The client of the service will experience network connection failures and will retry with this node (when the service is re-established at this node), or with some other node.

restartNetworkBackup() is then used to restart the service after it was shut down.


restartNetworkBackup

public final void restartNetworkBackup()
Restarts the network backup service *after* a rollback has been completed and the log files are once again in a consistent state.
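The intended ordering of the two calls around a rollback can be sketched as below. This is a simplified model, not the JE syncup code: the class NetworkBackupBracketSketch, the boolean service flag, and the rollback(Runnable) helper are hypothetical. The service is restarted only after the rollback action returns normally, on the assumption that a failed rollback may leave the log files inconsistent.

```java
/** Hypothetical sketch; not part of the JE API. */
final class NetworkBackupBracketSketch {
    private boolean serviceRunning = true;

    synchronized void shutdownNetworkBackup() {
        serviceRunning = false;
    }

    synchronized void restartNetworkBackup() {
        serviceRunning = true;
    }

    synchronized boolean isServiceRunning() {
        return serviceRunning;
    }

    /** Runs the rollback with the backup service stopped for its duration. */
    void rollback(Runnable rollbackAction) {
        // Stop serving log files *before* they become inconsistent.
        shutdownNetworkBackup();
        rollbackAction.run();
        // Reached only if the rollback completed without throwing.
        restartNetworkBackup();
    }
}
```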


dumpState

public String dumpState()

setElectableGroupSizeOverride

public void setElectableGroupSizeOverride(int override)


Copyright (c) 2004-2010 Oracle. All rights reserved.