it.unimi.dsi.mg4j.index
Class MemoryMappedIndex
java.lang.Object
it.unimi.dsi.mg4j.index.Index
it.unimi.dsi.mg4j.index.BitStreamIndex
it.unimi.dsi.mg4j.index.MemoryMappedIndex
- All Implemented Interfaces:
- Serializable
public class MemoryMappedIndex
- extends BitStreamIndex
A local memory-mapped bitstream index.
Memory-mapped indices are created by mapping the index file into memory using a MappedByteBuffer. The main advantage over an InMemoryIndex is that only the most frequently used parts of the index will be loaded in core memory.
Note that, because a Java MappedByteBuffer cannot span more than Integer.MAX_VALUE bytes, it is impossible to map an index larger than
2GiB. However, you can lexically partition an index so that
the resulting segments are smaller than 2GiB, and modify the property file of the resulting
cluster so that the URIs of the local indices require memory mapping. This effectively memory-maps
the whole index.
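The mapping step and the 2GiB limit described above can be sketched with the standard java.nio API alone. This is a minimal illustration, not the code MG4J uses: the class MappedFileSketch and its methods fitsInOneMapping and mapReadOnly are hypothetical names introduced here, and the temporary file merely stands in for an index file.

```java
import java.io.IOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical helper class, for illustration only (not part of MG4J).
public class MappedFileSketch {

    /** Returns true if a file of the given length fits in a single MappedByteBuffer. */
    public static boolean fitsInOneMapping(long length) {
        // FileChannel.map() rejects sizes greater than Integer.MAX_VALUE.
        return length <= Integer.MAX_VALUE;
    }

    /** Maps a whole file read-only; the operating system typically loads pages on demand. */
    public static MappedByteBuffer mapReadOnly(Path file) throws IOException {
        try (FileChannel channel = FileChannel.open(file, StandardOpenOption.READ)) {
            long length = channel.size();
            if (!fitsInOneMapping(length)) {
                throw new IllegalArgumentException(
                        "File too large to map in one buffer: " + length + " bytes");
            }
            // The mapping remains valid after the channel is closed.
            return channel.map(FileChannel.MapMode.READ_ONLY, 0, length);
        }
    }

    public static void main(String[] args) throws IOException {
        // A small temporary file stands in for an index file.
        Path file = Files.createTempFile("index-sketch", ".bin");
        file.toFile().deleteOnExit();
        Files.write(file, new byte[] { 1, 2, 3, 4 });

        MappedByteBuffer buffer = mapReadOnly(file);
        System.out.println(buffer.remaining() + " bytes mapped, first byte " + buffer.get(0));
    }
}
```

A file that fails the fitsInOneMapping check is the case where the index would have to be partitioned into segments smaller than 2GiB, each mapped separately.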
- Since:
- 1.2
- Author:
- Sebastiano Vigna
- See Also:
- Serialized Form
Fields inherited from class it.unimi.dsi.mg4j.index.BitStreamIndex
bufferSize, countCoding, DEFAULT_BUFFER_SIZE, DEFAULT_HEIGHT, DEFAULT_QUANTUM, FIXED_POINT_BITS, FIXED_POINT_MULTIPLIER, frequencyCoding, height, offsets, pointerCoding, positionCoding, prefixMap, quantum, readerConstructor, termMap
Fields inherited from class it.unimi.dsi.mg4j.index.Index
field, hasCounts, hasPayloads, hasPositions, keyIndex, maxCount, numberOfDocuments, numberOfOccurrences, numberOfPostings, numberOfTerms, payload, properties, singletonSet, sizes, termProcessor
Constructor Summary
MemoryMappedIndex(ByteBufferInputStream index,
int numberOfDocuments,
int numberOfTerms,
long numberOfPostings,
long numberOfOccurrences,
int maxCount,
Payload payload,
CompressionFlags.Coding frequencyCoding,
CompressionFlags.Coding pointerCoding,
CompressionFlags.Coding countCoding,
CompressionFlags.Coding positionCoding,
int quantum,
int height,
TermProcessor termProcessor,
String field,
Properties properties,
StringMap<? extends CharSequence> termMap,
PrefixMap<? extends CharSequence> prefixMap,
IntList sizes,
LongList offsets)
Methods inherited from class it.unimi.dsi.mg4j.index.Index
documents, documents, getEmptyIndexIterator, getEmptyIndexIterator, getEmptyIndexIterator, getEmptyIndexIterator, getInstance, getInstance, getInstance, getInstance, getReader, getTermProcessor, keyIndex
index
protected final ByteBufferInputStream index
- The byte buffer containing the index.
MemoryMappedIndex
public MemoryMappedIndex(ByteBufferInputStream index,
int numberOfDocuments,
int numberOfTerms,
long numberOfPostings,
long numberOfOccurrences,
int maxCount,
Payload payload,
CompressionFlags.Coding frequencyCoding,
CompressionFlags.Coding pointerCoding,
CompressionFlags.Coding countCoding,
CompressionFlags.Coding positionCoding,
int quantum,
int height,
TermProcessor termProcessor,
String field,
Properties properties,
StringMap<? extends CharSequence> termMap,
PrefixMap<? extends CharSequence> prefixMap,
IntList sizes,
LongList offsets)
getInputBitStream
public InputBitStream getInputBitStream(int bufferSizeUnused)
- Description copied from class: BitStreamIndex
- Returns an input bit stream over the index.
- Specified by:
- getInputBitStream in class BitStreamIndex
- Parameters:
- bufferSizeUnused - a suggested buffer size.
- Returns:
- an input bit stream over the index.
getInputStream
public ByteBufferInputStream getInputStream()
- Description copied from class: BitStreamIndex
- Returns an input stream over the index.
- Specified by:
- getInputStream in class BitStreamIndex
- Returns:
- an input stream over the index.