it.unimi.dsi.mg4j.index
Interface IndexReader
- All Superinterfaces:
- Closeable, SafelyCloseable
- All Known Implementing Classes:
- AbstractIndexClusterIndexReader, AbstractIndexReader, BitStreamHPIndexReader, BitStreamIndexReader, DocumentalClusterIndexReader, GammaDeltaGammaDeltaBitStreamHPIndexReader, GammaDeltaGammaDeltaBitStreamIndexReader, LexicalClusterIndexReader, RemoteIndexReader, SkipGammaDeltaGammaDeltaBitStreamIndexReader
public interface IndexReader
- extends SafelyCloseable
Provides access to an inverted index.
An Index contains global read-only metadata. To get actual data from an index, you need to get an index reader via a call to Index.getReader(). Once you have an index reader, you can ask for the documents matching a term.
Alternatively, you can perform a read-once scan of the index by calling nextIterator(), which will return, in order, the index iterators of all terms of the underlying index. More generally, nextIterator() returns an iterator positioned at the start of the inverted list of the term after the current one. When called just after the reader's creation, it returns an index iterator for the first term.
Warning: An index reader is exactly what it looks like: a reader. It cannot be used by several threads at the same time, and all its access methods are exclusive: if you obtain a document iterator, the previous one is no longer valid. However, you can create many readers and use them concurrently.
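The threading rule above can be illustrated with a small, self-contained sketch. It does not use MG4J classes: `ToyIndex`, `ToyReader`, `countDocuments` and `totalPostings` are hypothetical stand-ins invented for this example, assuming only that the index data is read-only and that each reader carries mutable state. Each concurrent task asks the shared index for its own reader instead of sharing one.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class ReaderPerTaskSketch {
    /** Toy stand-in for an index (NOT the MG4J class): read-only data, safe to share. */
    public static final class ToyIndex {
        private final List<int[]> postings; // postings.get(t) = documents containing term t
        public ToyIndex(List<int[]> postings) { this.postings = postings; }
        public int termCount() { return postings.size(); }
        /** Hands out a fresh reader on every call. */
        public ToyReader getReader() { return new ToyReader(postings); }
    }

    /** Toy stand-in for a reader: it keeps a mutable cursor, so it must not be shared. */
    public static final class ToyReader {
        private final List<int[]> postings;
        private int[] current; // overwritten by every lookup
        ToyReader(List<int[]> postings) { this.postings = postings; }
        public int countDocuments(int termNumber) {
            current = postings.get(termNumber);
            return current.length;
        }
    }

    /** Sums posting-list lengths, giving every concurrent task its own reader. */
    public static int totalPostings(ToyIndex index, int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<Integer>> results = new ArrayList<>();
            for (int t = 0; t < index.termCount(); t++) {
                final int termNumber = t;
                // A new reader per task: no reader is ever touched by two threads.
                Callable<Integer> task = () -> index.getReader().countDocuments(termNumber);
                results.add(pool.submit(task));
            }
            int total = 0;
            for (Future<Integer> f : results) total += f.get();
            return total;
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        ToyIndex index = new ToyIndex(Arrays.asList(new int[] {0, 2}, new int[] {1}, new int[] {0, 1, 2}));
        System.out.println(totalPostings(index, 3));
    }
}
```

With a real index the same shape would apply under the same assumption: share the index, never the reader.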
Warning: Invoking the DocumentIterator.dispose() method on iterators returned by an instance of this class will invoke Closeable.close() on the instance, thus making the instance no longer accessible. This behaviour is necessary to handle cases in which a reader is created on-the-fly just to create an iterator.
Warning: As of MG4J 1.2, direct (i.e., bit-level) access to an inverted index is no longer possible.
- Since:
- 1.0
- Author:
- Paolo Boldi, Sebastiano Vigna
documents
IndexIterator documents(int termNumber)
throws IOException
- Returns a document iterator over the documents containing a term.
Note that the index iterator returned by this method will return null on a call to term().
Note that it is always possible
to call this method with argument 0, even if the underlying index
does not provide random access.
- Parameters:
termNumber - the number of a term.
- Throws:
UnsupportedOperationException - if this index reader is not accessible by term number.
IOException
documents
IndexIterator documents(CharSequence term)
throws IOException
- Returns an index iterator over the documents containing a term; the term is given explicitly.
Unless the term processor of the associated index is null, words coming from a query will have to be processed before being used with this method.
Note that the index iterator returned by this method will return term on a call to term().
- Parameters:
term - a term (the term will be downcased if the index is case insensitive).
- Throws:
UnsupportedOperationException - if the term map is not available for the underlying index.
IOException
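The note about term processing in this method's description can be sketched as follows. This is only an illustration under stated assumptions, not MG4J code: the term map is modelled as a plain `Map`, the term processor as a `UnaryOperator<String>` that merely downcases, and `toyTermMap`, `DOWNCASE` and this `documents` helper are hypothetical names. The point is that a raw query word is passed through the processor, when one is set, before the lookup.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;
import java.util.function.UnaryOperator;

public class ProcessedLookupSketch {
    /** Hypothetical stand-in for a term processor: here it only downcases. */
    public static final UnaryOperator<String> DOWNCASE = w -> w.toLowerCase(Locale.ROOT);

    /** Toy term map (NOT an MG4J structure): term -> documents containing it. */
    public static Map<String, int[]> toyTermMap() {
        Map<String, int[]> map = new HashMap<>();
        map.put("apple", new int[] {0, 3});
        map.put("banana", new int[] {1});
        return map;
    }

    /** Looks up a raw query word, applying the processor first when one is set. */
    public static int[] documents(Map<String, int[]> termMap, UnaryOperator<String> processor, String queryWord) {
        String term = processor == null ? queryWord : processor.apply(queryWord);
        int[] docs = termMap.get(term);
        return docs == null ? new int[0] : docs; // unknown terms match no documents in this toy
    }

    public static void main(String[] args) {
        // Same query word, with and without processing.
        System.out.println(documents(toyTermMap(), DOWNCASE, "Apple").length);
        System.out.println(documents(toyTermMap(), null, "Apple").length);
    }
}
```

In this toy, skipping the processing step makes the capitalised query word miss the indexed term, which is the kind of mismatch the note warns about.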
nextIterator
IndexIterator nextIterator()
throws IOException
- Returns an IndexIterator on the term after the current one (optional operation).
Note that after creation there is no current term. Thus, the first call to this method will return an IndexIterator on the first term. As a consequence, repeated calls to this method provide a way to scan an index sequentially.
- Returns:
- the index iterator of the next term, or null if there are no more terms after the current one.
- Throws:
IOException
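The sequential-scan contract described for nextIterator() (keep calling until null is returned) can be sketched with toy types. Nothing here is MG4J code: `ToyIndexReader`, `ToyIndexIterator`, its `label()` method and the `readerOver` helper are hypothetical stand-ins, assuming only the "next term, or null when exhausted" behaviour stated above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

public class SequentialScanSketch {
    /** Toy stand-in for an index iterator; label() is a made-up accessor for this example. */
    public interface ToyIndexIterator {
        String label();
    }

    /** Toy stand-in for a reader: mirrors only the nextIterator() contract. */
    public interface ToyIndexReader {
        ToyIndexIterator nextIterator(); // null when no terms remain
    }

    /** Builds an in-memory reader over a fixed, ordered term list (hypothetical helper). */
    public static ToyIndexReader readerOver(List<String> terms) {
        Iterator<String> it = terms.iterator();
        return () -> {
            if (!it.hasNext()) return null; // no term after the current one
            String t = it.next();
            return () -> t;
        };
    }

    /** Scans the whole index once, collecting one label per term in index order. */
    public static List<String> scan(ToyIndexReader reader) {
        List<String> seen = new ArrayList<>();
        ToyIndexIterator current;
        // The first call yields the first term; the loop stops on the null sentinel.
        while ((current = reader.nextIterator()) != null) {
            seen.add(current.label());
        }
        return seen;
    }

    public static void main(String[] args) {
        System.out.println(scan(readerOver(Arrays.asList("apple", "banana", "cherry"))));
    }
}
```

Because the scan is read-once, a second pass in this sketch needs a fresh reader, which matches the description of nextIterator() as a forward-only traversal.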