java.lang.Object
java.io.InputStream
it.unimi.dsi.fastutil.io.MeasurableInputStream
it.unimi.dsi.fastutil.io.FastBufferedInputStream
it.unimi.dsi.mg4j.document.InputStreamDocumentSequence
public class InputStreamDocumentSequence
A document sequence obtained by breaking an input stream at a specified separator.
This document sequence blindly passes to the indexer sequences of characters read in a specified encoding and separated by a specified byte.
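The idea of breaking a stream at a separator byte can be illustrated with a minimal, self-contained sketch. It does not use MG4J classes and is not this class's implementation: the names `SeparatorSplitSketch` and `split` are hypothetical, the encoding is passed in directly rather than through a document factory, and treating trailing bytes without a closing separator as a final document is an assumption.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.Charset;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, for illustration only; not part of MG4J.
final class SeparatorSplitSketch {

    /**
     * Reads bytes from the stream and starts a new document at each
     * occurrence of the separator byte, returning at most maxDocs documents.
     */
    static List<String> split(InputStream in, int separator, Charset encoding, int maxDocs)
            throws IOException {
        List<String> docs = new ArrayList<>();
        ByteArrayOutputStream current = new ByteArrayOutputStream();
        int b;
        while (docs.size() < maxDocs && (b = in.read()) != -1) {
            if (b == separator) {
                // The separator closes the current document; decode its bytes.
                docs.add(new String(current.toByteArray(), encoding));
                current.reset();
            } else {
                current.write(b);
            }
        }
        // Assumption: leftover bytes with no closing separator form a last document.
        if (docs.size() < maxDocs && current.size() > 0) {
            docs.add(new String(current.toByteArray(), encoding));
        }
        return docs;
    }
}
```

Calling `split` with `'\n'` as the separator on a stream containing three lines would yield three strings; passing a smaller `maxDocs` caps the result, mirroring the role of the `maxDocs` constructor argument.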
Nested Class Summary |
---|
Nested classes/interfaces inherited from class it.unimi.dsi.fastutil.io.FastBufferedInputStream |
---|
FastBufferedInputStream.LineTerminator |
Field Summary |
---|
Fields inherited from class it.unimi.dsi.fastutil.io.FastBufferedInputStream |
---|
ALL_TERMINATORS, avail, buffer, DEFAULT_BUFFER_SIZE, is, pos, readBytes |
Constructor Summary | |
---|---|
InputStreamDocumentSequence(InputStream inputStream, int separator, DocumentFactory factory) | Creates a new document sequence based on a given input stream and separator. |
InputStreamDocumentSequence(InputStream inputStream, int separator, DocumentFactory factory, int maxDocs) | Creates a new document sequence based on a given input stream and separator; the sequence will not return more than the given number of documents. |
Method Summary | |
---|---|
void | close(): Closes this document sequence, releasing all resources. |
DocumentFactory | factory(): Returns the factory used by this sequence. |
void | flush() |
DocumentIterator | iterator(): Returns an iterator over the sequence of documents. |
void | mark(int readlimit) |
boolean | markSupported() |
boolean | noMoreBytes() |
int | read() |
int | read(byte[] b) |
int | read(byte[] b, int offset, int length) |
void | reset(): Deprecated. |
long | skip(long skip) |
Methods inherited from class it.unimi.dsi.fastutil.io.FastBufferedInputStream |
---|
available, length, noMoreCharacters, position, position, readLine, readLine, readLine, readLine |
Methods inherited from class java.lang.Object |
---|
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |
Constructor Detail |
---|
public InputStreamDocumentSequence(InputStream inputStream, int separator, DocumentFactory factory, int maxDocs)
Parameters:
inputStream - the input stream containing all documents.
separator - the separator.
factory - the factory that will be used to create documents.
maxDocs - the maximum number of documents returned.

public InputStreamDocumentSequence(InputStream inputStream, int separator, DocumentFactory factory)
Parameters:
inputStream - the input stream containing all documents.
separator - the separator.
factory - the factory that will be used to create documents.

Method Detail |
---|
public DocumentIterator iterator()
Description copied from interface: DocumentSequence
Warning: this method can safely be called only once. For instance, implementations based on standard input will usually throw an exception if this method is called twice.
Implementations may decide to override this restriction (in particular, if they implement DocumentCollection). Usually, however, it is not possible to obtain two iterators at the same time on a collection.
Specified by: iterator in interface DocumentSequence
See Also: DocumentCollection
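The single-call restriction can be pictured with a small, self-contained sketch of a sequence that hands out its iterator once and rejects a second request. The class `OneShotSequence` is hypothetical and uses only `java.util` types; whether this class throws on a second call, and which exception it would use, is not stated here, so `IllegalStateException` is an assumption.

```java
import java.util.Iterator;

// Hypothetical illustration of a sequence whose iterator can be obtained only once.
final class OneShotSequence<T> implements Iterable<T> {
    private final Iterator<T> source;
    private boolean consumed;

    OneShotSequence(Iterator<T> source) {
        this.source = source;
    }

    @Override
    public Iterator<T> iterator() {
        // The underlying data can be read only once, so refuse a second iterator.
        if (consumed) {
            throw new IllegalStateException("iterator() may be called only once");
        }
        consumed = true;
        return source;
    }
}
```

Callers that need several passes over the same documents would have to keep their own copy or use a collection-style implementation that lifts the restriction.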
public DocumentFactory factory()
Description copied from interface: DocumentSequence
Every document sequence is based on a document factory that transforms raw bytes into a sequence of characters. The factory contains useful information such as the number of fields.
Specified by: factory in interface DocumentSequence
public boolean noMoreBytes() throws IOException
Throws: IOException

public int read() throws IOException
Overrides: read in class FastBufferedInputStream
Throws: IOException

public int read(byte[] b) throws IOException
Overrides: read in class InputStream
Throws: IOException

public int read(byte[] b, int offset, int length) throws IOException
Overrides: read in class FastBufferedInputStream
Throws: IOException

public void mark(int readlimit)
Overrides: mark in class InputStream

public boolean markSupported()
Overrides: markSupported in class InputStream

public long skip(long skip)
Overrides: skip in class FastBufferedInputStream

@Deprecated public void reset()
Deprecated.
Overrides: reset in class FastBufferedInputStream

public void flush()
Overrides: flush in class FastBufferedInputStream

public void close() throws IOException
Description copied from interface: DocumentSequence
You should always call this method after having finished with this document sequence.
Implementations are invited to call this method in a finaliser as a safety net (even better, implement SafelyCloseable), but since there is no guarantee as to when finalisers are invoked, you should not depend on this behaviour.
Specified by: close in interface DocumentSequence
Specified by: close in interface Closeable
Overrides: close in class FastBufferedInputStream
Throws: IOException
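Because `close()` is specified by `Closeable`, one way to guarantee the call without relying on finalisers is a try-with-resources block (Java 7 and later). The sketch below shows the pattern with a hypothetical stand-in class, `TrackedSequence`, rather than a real document sequence, so that it is self-contained.

```java
import java.io.Closeable;
import java.io.IOException;

// Hypothetical stand-in for a closeable document sequence, for illustration only.
final class CloseSketch {

    static final class TrackedSequence implements Closeable {
        boolean closed;

        @Override
        public void close() throws IOException {
            closed = true; // a real sequence would release its stream here
        }
    }

    /** Uses the sequence inside try-with-resources, so close() runs even on exceptions. */
    static TrackedSequence useAndClose() throws IOException {
        TrackedSequence sequence = new TrackedSequence();
        try (TrackedSequence s = sequence) {
            // Iterate over the documents of s here.
        }
        return sequence;
    }
}
```

The same pattern should apply to any object that implements `Closeable`, which is why it is preferable to a finaliser-based safety net.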