org.apache.lucene.index.memory

Class SynonymTokenFilter


public class SynonymTokenFilter
extends TokenFilter

Injects additional tokens for synonyms of token terms fetched from the underlying child stream. The child stream must deliver lowercase tokens for synonyms to be found.
Author:
whoschek.AT.lbl.DOT.gov

Field Summary

static String
SYNONYM_TOKEN_TYPE
The Token.type used to indicate a synonym to higher-level filters.

Fields inherited from class org.apache.lucene.analysis.TokenFilter

input

Constructor Summary

SynonymTokenFilter(TokenStream input, SynonymMap synonyms, int maxSynonyms)
Creates an instance for the given underlying stream and synonym table.

Method Summary

protected Token
createToken(String synonym, Token current)
Creates and returns a token for the given synonym of the current input token. Override for custom (stateless or stateful) behaviour, if desired.
Token
next()
Returns the next token in the stream, or null at end of stream (EOS).

Methods inherited from class org.apache.lucene.analysis.TokenFilter

close

Methods inherited from class org.apache.lucene.analysis.TokenStream

close, next

Field Details

SYNONYM_TOKEN_TYPE

public static final String SYNONYM_TOKEN_TYPE
The Token.type used to indicate a synonym to higher-level filters.

Constructor Details

SynonymTokenFilter

public SynonymTokenFilter(TokenStream input,
                          SynonymMap synonyms,
                          int maxSynonyms)
Creates an instance for the given underlying stream and synonym table.
Parameters:
input - the underlying child token stream
synonyms - the map used to extract synonyms for terms
maxSynonyms - the maximum number of synonym tokens to return per underlying token (a value of Integer.MAX_VALUE indicates unlimited)
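
The effect of maxSynonyms and of the lowercase requirement can be illustrated with a small, library-free model. This is only a sketch and not the filter's actual implementation: SynonymInjectionSketch, inject, and the "SYNONYM" label are hypothetical names, a plain Map stands in for SynonymMap, and taking synonyms in map order is an assumption (the real filter may select them differently).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Library-free model of synonym injection. This is NOT Lucene's implementation;
// every name here is hypothetical and exists only for illustration.
final class SynonymInjectionSketch {

    // Placeholder type label; the real value of SYNONYM_TOKEN_TYPE is not stated here.
    static final String SYNONYM_TYPE = "SYNONYM";

    // Emits each input term, followed by at most maxSynonyms of its synonyms.
    // Assumption: synonyms are taken in map order; the real filter may choose differently.
    static List<String> inject(List<String> terms,
                               Map<String, List<String>> synonyms,
                               int maxSynonyms) {
        List<String> out = new ArrayList<>();
        for (String term : terms) {
            out.add(term); // the original token is always emitted
            // Exact-match lookup: a term that is not lowercase misses a lowercase key.
            List<String> found =
                    synonyms.getOrDefault(term, Collections.<String>emptyList());
            int limit = Math.min(maxSynonyms, found.size());
            for (int i = 0; i < limit; i++) {
                out.add(found.get(i) + "/" + SYNONYM_TYPE);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        Map<String, List<String>> table = new HashMap<>();
        table.put("quick", Arrays.asList("fast", "rapid", "speedy"));
        // "Quick" is not lowercase, so only the second term gains synonyms.
        System.out.println(inject(Arrays.asList("Quick", "quick"), table, 2));
        // prints [Quick, quick, fast/SYNONYM, rapid/SYNONYM]
    }
}
```

Passing Integer.MAX_VALUE as the cap in this model returns every synonym in the table, which mirrors the "unlimited" case described for maxSynonyms.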

Method Details

createToken

protected Token createToken(String synonym,
                            Token current)
Creates and returns a token for the given synonym of the current input token. Override for custom (stateless or stateful) behaviour, if desired.
Parameters:
synonym - a synonym for the current token's term
current - the current token from the underlying child stream
Returns:
a new token, or null to indicate that the given synonym should be ignored
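
The override contract (return a token, or null to ignore the synonym) can be sketched without the library as follows. The classes HookedSynonymSketch and SingleWordSynonymSketch are hypothetical, and plain strings stand in for Token; the sketch only models the hook pattern and makes no claim about how the real filter invokes createToken internally.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Library-free model of an overridable createToken hook; NOT Lucene's classes.
class HookedSynonymSketch {

    // Hypothetical hook: return the text to emit, or null to skip this synonym.
    protected String createToken(String synonym, String current) {
        return synonym;
    }

    // Emits the current term followed by every synonym the hook accepts.
    public final List<String> expand(String current, List<String> synonyms) {
        List<String> out = new ArrayList<>();
        out.add(current);
        for (String synonym : synonyms) {
            String token = createToken(synonym, current);
            if (token != null) { // null means "ignore this synonym"
                out.add(token);
            }
        }
        return out;
    }
}

// Example override (stateless): ignore multi-word synonyms.
class SingleWordSynonymSketch extends HookedSynonymSketch {
    @Override
    protected String createToken(String synonym, String current) {
        return synonym.indexOf(' ') >= 0 ? null : super.createToken(synonym, current);
    }

    public static void main(String[] args) {
        List<String> synonyms = Arrays.asList("auto", "motor vehicle", "automobile");
        System.out.println(new SingleWordSynonymSketch().expand("car", synonyms));
        // prints [car, auto, automobile]
    }
}
```

A stateful variant could keep a counter or a set of already-emitted terms in the subclass and consult it inside the hook before deciding whether to return null.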

next

public Token next()
            throws IOException
Returns the next token in the stream, or null at end of stream (EOS).
Overrides:
next in class TokenStream

Copyright © 2000-2007 Apache Software Foundation. All Rights Reserved.