com.sleepycat.je.rep.vlsn
Class VLSNBucket

java.lang.Object
  extended by com.sleepycat.je.rep.vlsn.VLSNBucket
Direct Known Subclasses:
GhostBucket

public class VLSNBucket
extends Object

A VLSNBucket instance represents a set of VLSN->LSN mappings. Buckets are usually not updated, except at times when the replication stream may have been reduced in size, by log cleaning or syncup. The VLSNBuckets in the VLSNIndex's VLSNTracker are written to disk and are persistent. There are also VLSNBuckets in the temporary recovery-time tracker that are used for collecting mappings found in the log during recovery.

Each VLSNBucket only holds mappings from a single log file, though a single log file may be mapped by several VLSNBuckets.

As a tradeoff in space vs time, a VLSNBucket only stores a sparse set of mappings, and the caller must use a VLSNReader to scan the log file and find any log entries not mapped directly by the bucket. In addition, the VLSN is not actually stored. Only the offset portion of the LSN is stored, and the VLSN is inferred from a stride field.

For example, suppose a node had these VLSN->LSN mappings:

  VLSN    LSN (file/offset)
  9       10/100
  10      10/110
  11      10/120
  12      10/130
  13      10/140
  14      11/100
  15      11/120

The mappings in file 10 could be represented by a VLSNBucket with a stride of 4. That means the bucket would hold the mappings for

  9       10/100
  13      10/140

And since the target log file number and the stride are known, the mappings can be represented by the offsets alone in this array: {100, 140}, rather than storing the whole lsn.

Each bucket can also provide the mapping for the first and last VLSN it covers, even if the lastVLSN does not fall on a stride boundary. This is done to support forward and backward scanning. For instance, if file 10 had also held VLSNs 14 and 15, with VLSN 15 at offset 160, the completed bucket could provide 9->10/100, 13->10/140, 15->10/160 even though 15 is not a stride's worth away from 13.

Because registering a VLSN->LSN mapping is done outside the log write latch, inserts into the VLSNBucket may not be in order. However, when any VLSN is registered, we can assume that all VLSNs < that value do exist in the log. It's just an accident of timing that they haven't yet been registered.

Note that out of order inserts into the buckets can create holes in the bucket's offset array, or cause the array to be shorter than anticipated. For example, if the insertion order into the bucket is vlsns 9, 15, we'll actually only keep an offset array of size 1. We have to be able to handle holes in the bucket, and can't count on filling them in when the lagging vlsn arrives, because it is possible that a reading thread will access the bucket before the laggard inserter arrives, or that the bucket might be flushed to disk and become immutable.
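
The sparse representation described above can be sketched roughly as follows. This is only an illustration under stated assumptions: the class name SparseBucketSketch, its fields, and the method offsetFor are hypothetical and are not the actual VLSNBucket internals, and VLSNs are modeled as plain long values.

```java
import java.util.OptionalLong;

/** Illustrative only: a stride plus an offset array standing in for VLSN->LSN pairs. */
final class SparseBucketSketch {
    final long fileNumber;   // the single log file this bucket maps
    final int stride;        // distance between directly mapped VLSNs
    final long firstVlsn;    // VLSN that owns offsets[0]
    final long[] offsets;    // offsets[i] belongs to VLSN firstVlsn + i * stride

    SparseBucketSketch(long fileNumber, int stride, long firstVlsn, long[] offsets) {
        this.fileNumber = fileNumber;
        this.stride = stride;
        this.firstVlsn = firstVlsn;
        this.offsets = offsets.clone();
    }

    /** Returns the stored file offset for vlsn, or empty if it is not mapped directly. */
    OptionalLong offsetFor(long vlsn) {
        long delta = vlsn - firstVlsn;
        if (delta < 0 || delta % stride != 0) {
            return OptionalLong.empty();   // not on a stride boundary
        }
        long index = delta / stride;
        if (index >= offsets.length) {
            return OptionalLong.empty();   // beyond the recorded mappings
        }
        return OptionalLong.of(offsets[(int) index]);
    }
}
```

With the file 10 example (stride 4, first VLSN 9, offsets {100, 140}), offsetFor(9) and offsetFor(13) yield 100 and 140, while VLSN 11 has no direct mapping and would have to be found by scanning the log.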


Field Summary
(package private)  boolean dirty
           
protected  VLSN firstVLSN
           
protected  VLSN lastVLSN
           
 
Constructor Summary
VLSNBucket(long fileNumber, int stride, int maxMappings, int maxDistance, VLSN firstVLSN)
           
 
Method Summary
(package private)  void close()
           
 void dump(PrintStream out)
          For debugging and tracing.
(package private)  boolean empty()
           
(package private)  void fillDataEntry(DatabaseEntry data)
           
(package private)  boolean follows(VLSN vlsn)
           
(package private)  VLSN getFirst()
           
(package private)  long getGTELsn(VLSN vlsn)
          Returns the mapping whose VLSN is >= the VLSN parameter.
(package private)  VLSN getLast()
           
(package private)  long getLastLsn()
           
 long getLsn(VLSN vlsn)
           
(package private)  long getLTEFileNumber()
          Return a file number that is less than or equal to the file of the first lsn mapped by this bucket.
(package private)  long getLTELsn(VLSN vlsn)
          Returns the lsn whose VLSN is <= the VLSN parameter.
(package private)  int getNumOffsets()
           
(package private)  boolean isGhost()
           
(package private)  boolean owns(VLSN vlsn)
           
(package private)  boolean precedes(VLSN vlsn)
           
(package private)  boolean put(VLSN vlsn, long lsn)
          Record the LSN location for this VLSN.
static VLSNBucket readFromDatabase(DatabaseEntry data)
          Instantiate this from the database.
(package private)  VLSNBucket removeFromHead(EnvironmentImpl envImpl, VLSN lastDuplicate)
          Remove the mappings from this bucket that are for VLSNs <= lastDuplicate.
(package private)  void removeFromTail(VLSN startOfDelete, long prevLsn)
          Remove the mappings from this bucket that are for VLSNs >= startOfDelete.
 String toString()
           
(package private)  void writeToDatabase(EnvironmentImpl envImpl, Cursor cursor)
          Write this bucket to the mapping database using a cursor.
(package private)  void writeToDatabase(EnvironmentImpl envImpl, DatabaseImpl bucketDbImpl, Txn txn)
          Write this bucket to the mapping database.
(package private)  void writeToTupleOutput(TupleOutput to)
           
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait
 

Field Detail

firstVLSN

protected VLSN firstVLSN

lastVLSN

protected VLSN lastVLSN

dirty

boolean dirty
Constructor Detail

VLSNBucket

VLSNBucket(long fileNumber,
           int stride,
           int maxMappings,
           int maxDistance,
           VLSN firstVLSN)
Method Detail

put

boolean put(VLSN vlsn,
            long lsn)
Record the LSN location for this VLSN.

One key issue is that calls to put() are not synchronized, and the VLSNs may arrive out of order. If an out of order VLSN does arrive, we can still assume that the earlier VLSNs have been successfully logged. If a VLSN arrives that falls on a stride boundary (that is, lies a whole number of strides from the first VLSN) and should be recorded in the fileOffsets, but is not the next VLSN that should be recorded, we'll pad out the fileOffsets list with placeholders.

For example, suppose the stride is 3, and the first VLSN is 2. Then this bucket should record VLSN 2, 5, 8, ... etc. If VLSN 8 arrives before VLSN 5, VLSN 8 will be recorded, and VLSN 5 will have an offset placeholder of NO_OFFSET. It is a non-issue if VLSNs 3, 4, 6, 7 arrive out of order, because they would not have been recorded anyway. This should not happen often, because the stride should be fairly large, and the calls to put() should be close together.

If the insertion order is vlsn 2, 5, 9, then the file offsets array will be a little short, and will only have 2 elements instead of 3. We follow this policy because we must always have a valid begin and end point for the range.

We must handle placeholders in all cases, and can't count on later vlsn inserts, because a bucket can become immutable at any time if it is flushed to disk.

Returns:
false if this bucket will not accept this VLSN. Generally, a refusal might happen because the bucket was full or the mapping is too large a distance away from the previous mapping. In that case, the tracker will start another bucket.
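
The padding behaviour described for put() might look roughly like the sketch below. Everything here is an assumption made for illustration: the class PutSketch, the method record, the integer offsets, and the value chosen for NO_OFFSET are hypothetical. The sketch also fills a placeholder if the lagging VLSN does turn up, and it leaves out the refusal cases (bucket full, mapping too far away) that make the real method return false.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative only: out-of-order inserts with NO_OFFSET placeholders. */
final class PutSketch {
    static final int NO_OFFSET = -1;   // placeholder value assumed for this sketch

    final int stride;
    final long firstVlsn;
    final List<Integer> fileOffsets = new ArrayList<>();
    long lastVlsn = -1;                // highest VLSN seen so far
    int lastOffset = NO_OFFSET;        // offset belonging to lastVlsn

    PutSketch(int stride, long firstVlsn) {
        this.stride = stride;
        this.firstVlsn = firstVlsn;
    }

    void record(long vlsn, int offset) {
        long delta = vlsn - firstVlsn;
        if (delta >= 0 && delta % stride == 0) {
            int index = (int) (delta / stride);
            // Pad with placeholders for stride VLSNs that have not arrived yet.
            while (fileOffsets.size() < index) {
                fileOffsets.add(NO_OFFSET);
            }
            if (index == fileOffsets.size()) {
                fileOffsets.add(offset);
            } else if (fileOffsets.get(index) == NO_OFFSET) {
                fileOffsets.set(index, offset);   // a laggard fills its hole
            }
        }
        // The end point of the range is always tracked, on a stride or not.
        if (vlsn > lastVlsn) {
            lastVlsn = vlsn;
            lastOffset = offset;
        }
    }
}
```

With a stride of 3 and a first VLSN of 2, recording VLSN 2 and then VLSN 8 leaves a placeholder in the slot for VLSN 5. Recording 2, 5, 9 leaves only two array entries, with VLSN 9 kept as the end point.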

owns

boolean owns(VLSN vlsn)
Returns:
true if this bucket contains this mapping.

getFirst

VLSN getFirst()

getLast

VLSN getLast()

getLTEFileNumber

long getLTEFileNumber()
Return a file number that is less than or equal to the file of the first lsn mapped by this bucket. In standard VLSNBuckets, only one file is covered, so there is only one possible value. In GhostBuckets, multiple files could be covered.

Returns:
a file number that is less than or equal to the file of the first lsn mapped by this bucket.

empty

boolean empty()

follows

boolean follows(VLSN vlsn)

precedes

boolean precedes(VLSN vlsn)

getGTELsn

long getGTELsn(VLSN vlsn)
Returns the mapping whose VLSN is >= the VLSN parameter. For example, if the bucket holds mappings for vlsn 10, 13, 16,
 - the greater than or equal mapping for VLSN 10 is 10/lsn
 - the greater than or equal mapping for VLSN 11 is 13/lsn
 - the greater than or equal mapping for VLSN 13 is 13/lsn

File offsets may be null in the middle of the file offsets array because of out of order mappings. This method must return a non-null lsn, and must account for null offsets.

Returns:
the mapping whose VLSN is >= the VLSN parameter. Will never return NULL_LSN, because the VLSNRange begin and end points are always mapped.
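
A greater-than-or-equal lookup that skips holes could be sketched as below. The names GteSketch and gteOffset and the NO_OFFSET value are hypothetical, the sketch returns only the file offset rather than a full lsn, and it assumes the caller has already checked that the bucket owns the VLSN.

```java
/** Illustrative only: find the first mapped stride entry at or after a VLSN. */
final class GteSketch {
    static final long NO_OFFSET = -1;   // placeholder value assumed for this sketch

    static long gteOffset(long vlsn, long firstVlsn, int stride,
                          long[] offsets, long lastOffset) {
        long delta = Math.max(0, vlsn - firstVlsn);
        // Round up to the next stride boundary at or after vlsn.
        int index = (int) ((delta + stride - 1) / stride);
        for (int i = index; i < offsets.length; i++) {
            if (offsets[i] != NO_OFFSET) {
                return offsets[i];       // first real mapping at or after vlsn
            }
        }
        // Nothing usable in the array: fall back to the always-mapped end point.
        return lastOffset;
    }
}
```

For a bucket mapping VLSNs 10, 13, 16, a request for VLSN 11 rounds up to the entry for 13. If that entry is a hole, the scan continues to the entry for 16.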

getLTELsn

long getLTELsn(VLSN vlsn)
Returns the lsn whose VLSN is <= the VLSN parameter. For example, if the bucket holds mappings for vlsn 10, 13, 16,
 - the less than or equal mapping for VLSN 10 is 10/lsn
 - the less than or equal mapping for VLSN 11 is 10/lsn
 - the less than or equal mapping for VLSN 13 is 13/lsn

File offsets may be null in the middle of the file offsets array because of out of order mappings. This method must return a non-null lsn, and must account for null offsets.

Returns:
the lsn whose VLSN is <= the VLSN parameter. Will never return NULL_LSN, because the VLSNRange begin and end points are always mapped.
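
The less-than-or-equal direction can be sketched by scanning backward instead, relying on the first VLSN always being mapped. As before, LteSketch, lteOffset and NO_OFFSET are hypothetical names for illustration, and the sketch assumes firstVlsn <= vlsn.

```java
/** Illustrative only: find the last mapped stride entry at or before a VLSN. */
final class LteSketch {
    static final long NO_OFFSET = -1;   // placeholder value assumed for this sketch

    static long lteOffset(long vlsn, long firstVlsn, long lastVlsn, int stride,
                          long[] offsets, long lastOffset) {
        if (vlsn >= lastVlsn) {
            return lastOffset;           // the end point is always mapped
        }
        // Round down to the stride boundary at or before vlsn, staying in the array.
        int index = (int) Math.min((vlsn - firstVlsn) / stride, offsets.length - 1);
        for (int i = index; i >= 0; i--) {
            if (offsets[i] != NO_OFFSET) {
                return offsets[i];       // last real mapping at or before vlsn
            }
        }
        throw new IllegalStateException("the first VLSN must always be mapped");
    }
}
```

The clamp to offsets.length - 1 covers the short-array case, where the last VLSN arrived before a stride entry that was never recorded.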

getLsn

public long getLsn(VLSN vlsn)
Returns:
the lsn whose VLSN is == the VLSN parameter or DbLsn.NULL_LSN if there is no mapping. Note that because of out of order puts, there may be missing mappings that appear later on.

getLastLsn

long getLastLsn()

removeFromHead

VLSNBucket removeFromHead(EnvironmentImpl envImpl,
                          VLSN lastDuplicate)
                    throws IOException
Remove the mappings from this bucket that are for VLSNs <= lastDuplicate. If this results in a broken stride interval, package all those mappings into their own bucket and return it as a remainder bucket.

For example, suppose this bucket has a stride of 5 and maps VLSN 10-23. Then it has mappings for 10, 15, 20, 23. If we need to remove mappings <= 16, we'll end up without a bucket that serves as a home base for vlsns 17, 18, 19. Those will be spun out into their own bucket, and this bucket will be adjusted to start at VLSN 20.

This bucket should end up with
 - firstVLSN = 20
 - fileOffset is an array of size 1, for the LSN for VLSN 20
 - lastVLSN = 23
 - lastLsn = the same as before

The spun-off bucket should be:
 - firstVLSN = 17
 - fileOffset is an array of size 1, for the LSN for VLSN 17
 - lastVLSN = 19
 - lastLsn = lsn for 19

Returns:
the newly created bucket that holds mappings from a broken stride interval, or null if there was no need to create such a bucket.
Throws:
IOException
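
The range arithmetic behind the remainder bucket can be sketched as follows. This covers only the VLSN ranges. Finding the lsns for the spun-off bucket is left out entirely, and HeadSplitSketch and split are hypothetical names. The sketch assumes lastDuplicate lies within the bucket's range.

```java
/** Illustrative only: where a bucket restarts after trimming its head. */
final class HeadSplitSketch {
    /**
     * Returns {newFirst, remainderFirst, remainderLast}. The two remainder
     * values are -1 when no spun-off bucket is needed.
     */
    static long[] split(long firstVlsn, long lastVlsn, int stride, long lastDuplicate) {
        long keepFrom = lastDuplicate + 1;            // first VLSN that survives
        long delta = keepFrom - firstVlsn;
        // Next stride boundary at or after keepFrom.
        long newFirst = firstVlsn + ((delta + stride - 1) / stride) * stride;
        if (newFirst == keepFrom) {
            return new long[] {newFirst, -1, -1};     // stride interval is intact
        }
        // VLSNs between keepFrom and the next boundary lose their home bucket.
        long remainderLast = Math.min(newFirst - 1, lastVlsn);
        return new long[] {newFirst, keepFrom, remainderLast};
    }
}
```

For the example above (stride 5, VLSNs 10-23, lastDuplicate 16), the bucket restarts at 20 and the remainder covers 17 through 19. Had lastDuplicate been 14, the bucket would restart cleanly at 15 with no remainder.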

removeFromTail

void removeFromTail(VLSN startOfDelete,
                    long prevLsn)
Remove the mappings from this bucket that are for VLSNs >= startOfDelete. Unlike removing from the head, we need not worry about breaking a bucket stride interval. If prevLsn is NULL_LSN, we don't have a good value to cap the bucket. Instead, we'll have to delete the bucket back to whatever was the next available lsn.

For example, suppose the bucket has these mappings. This strange bucket (the stride entry for VLSN 25 is missing) is possible if vlsn 26 arrived early, out of order.

  in fileOffset: 10 -> 101
  in fileOffset: 15 -> no offset
  in fileOffset: 20 -> 201
  lastVLSN->lastLsn mapping 26 -> 250

If we have a prevLsn and the startOfDelete is 17, then we can cap the bucket at the preceding VLSN, 16, using prevLsn (190 here):

  in fileOffset: 10 -> 101
  in fileOffset: 15 -> no offset
  lastVLSN->lastLsn mapping 16 -> 190

If we don't have a prevLsn, then we know that we have to cut the bucket back to the largest known mapping, losing many mappings along the way.

  in fileOffset: 10 -> 101
  lastVLSN->lastLsn mapping 10 -> 101

If we are deleting in the vlsn area between the last stride and the last offset (i.e. vlsn 23 is the startOfDelete), the cases with and without a prevLsn would look like this:

(there is a prevLsn, and 23 is startOfDelete. No need to truncate anything)

  in fileOffset: 10 -> 101
  in fileOffset: 15 -> no offset
  in fileOffset: 20 -> 201
  lastVLSN->lastLsn mapping 22 -> prevLsn

(there is no prevLsn, and 23 is startOfDelete)

  in fileOffset: 10 -> 101
  in fileOffset: 15 -> no offset
  in fileOffset: 20 -> 201
  lastVLSN->lastLsn mapping 20 -> 201

Parameters:
startOfDelete - is the VLSN that begins the range to delete, inclusive
prevLsn - is the lsn of startOfDelete.getPrev(). We'll be using it to cap off the end of the bucket, by assigning it to the lastLsn field.
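
The tail truncation can be sketched along these lines. This is a guess at the shape of the logic, not the actual implementation: TailTrimSketch, its fields and the NO_OFFSET value are hypothetical, offsets stand in for full lsns, and the sketch assumes startOfDelete is greater than the bucket's first VLSN. Following the prevLsn parameter description, the cap is placed at the VLSN just before startOfDelete.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative only: trimming a bucket's tail with or without a known prevLsn. */
final class TailTrimSketch {
    static final long NO_OFFSET = -1;   // placeholder / "no prevLsn" value assumed here

    final long firstVlsn;
    final int stride;
    final List<Long> fileOffsets;
    long lastVlsn;
    long lastOffset;

    TailTrimSketch(long firstVlsn, int stride, List<Long> fileOffsets,
                   long lastVlsn, long lastOffset) {
        this.firstVlsn = firstVlsn;
        this.stride = stride;
        this.fileOffsets = new ArrayList<>(fileOffsets);
        this.lastVlsn = lastVlsn;
        this.lastOffset = lastOffset;
    }

    void removeFromTail(long startOfDelete, long prevOffset) {
        if (startOfDelete > lastVlsn) {
            return;                                   // nothing to delete
        }
        // Number of stride entries whose VLSN is < startOfDelete.
        long delta = startOfDelete - firstVlsn;
        int keep = (int) ((delta + stride - 1) / stride);
        while (fileOffsets.size() > keep) {
            fileOffsets.remove(fileOffsets.size() - 1);
        }
        if (prevOffset != NO_OFFSET) {
            lastVlsn = startOfDelete - 1;             // cap with the known predecessor
            lastOffset = prevOffset;
            return;
        }
        // No cap available: fall back to the largest entry that is really mapped.
        while (fileOffsets.get(fileOffsets.size() - 1) == NO_OFFSET) {
            fileOffsets.remove(fileOffsets.size() - 1);
        }
        lastVlsn = firstVlsn + (long) (fileOffsets.size() - 1) * stride;
        lastOffset = fileOffsets.get(fileOffsets.size() - 1);
    }
}
```

Starting from the bucket in the example (entries for 10, 15, 20 with a hole at 15, end point 26), deleting from 17 without a prevLsn cuts the bucket back to the single entry for VLSN 10, because the entry for 15 was never filled in.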

getNumOffsets

int getNumOffsets()

close

void close()

writeToDatabase

void writeToDatabase(EnvironmentImpl envImpl,
                     DatabaseImpl bucketDbImpl,
                     Txn txn)
Write this bucket to the mapping database.


writeToDatabase

void writeToDatabase(EnvironmentImpl envImpl,
                     Cursor cursor)
Write this bucket to the mapping database using a cursor. Note that this method must disable critical eviction. Critical eviction makes the calling thread search for a target IN node to evict. That target IN node may or may not be in the internal VLSN db.

For example, when a new, replicated LN is inserted or modified, a new VLSN is allocated. To do so, the app thread that is executing the operation
 A1. Takes a BIN latch on a BIN in a replicated db
 A2. Takes the VLSNIndex mutex

Anyone calling writeToDatabase() has to take these steps:
 B1. Take the VLSNIndex mutex
 B2. Get a BIN latch for a BIN in the internal vlsn db.

This difference in locking hierarchy could cause a deadlock except for the fact that A1 and B2 are guaranteed to be in different databases. If writeToDatabase() also did critical eviction, it would have a step where it tried to get a BIN latch on a replicated db, and we'd have a deadlock. [#18475]


readFromDatabase

public static VLSNBucket readFromDatabase(DatabaseEntry data)
Instantiate this from the database. Assumes that this bucket will not be used for insertion in the future.


fillDataEntry

void fillDataEntry(DatabaseEntry data)

toString

public String toString()
Overrides:
toString in class Object
See Also:
Object.toString()

dump

public void dump(PrintStream out)
For debugging and tracing.


isGhost

boolean isGhost()

writeToTupleOutput

void writeToTupleOutput(TupleOutput to)


Copyright (c) 2004-2010 Oracle. All rights reserved.