org.apache.lucene.analysis.fr
Class ElisionFilter
java.lang.Object
org.apache.lucene.analysis.TokenStream
org.apache.lucene.analysis.TokenFilter
org.apache.lucene.analysis.fr.ElisionFilter
public class ElisionFilter
- extends TokenFilter
Removes elisions from a token stream. For example, "l'avion" (the plane) is
tokenized as "avion" (plane).
Note that StandardTokenizer treats the apostrophe (') as a space and removes it.
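The behavior described above can be illustrated with a minimal standalone sketch. It does not use any Lucene classes. The class name ElisionSketch, the method stripElision, the article list, and the choice of apostrophe characters are all assumptions made for illustration, not the filter's confirmed defaults.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

public class ElisionSketch {
    // Assumed article set, for illustration only.
    public static final Set<String> ARTICLES =
        new HashSet<>(Arrays.asList("l", "m", "t", "qu", "n", "s", "j"));

    /**
     * Returns the term with its elided prefix removed when the text before the
     * first apostrophe is a known article; otherwise returns the term unchanged.
     */
    public static String stripElision(String term, Set<String> articles) {
        int pos = -1;
        for (int i = 0; i < term.length(); i++) {
            char c = term.charAt(i);
            // Assumption: both the ASCII and the typographic apostrophe count.
            if (c == '\'' || c == '\u2019') {
                pos = i;
                break;
            }
        }
        if (pos >= 0
                && articles.contains(term.substring(0, pos).toLowerCase(Locale.ROOT))) {
            return term.substring(pos + 1);
        }
        return term;
    }

    public static void main(String[] args) {
        System.out.println(stripElision("l'avion", ARTICLES));     // prints "avion"
        System.out.println(stripElision("aujourd'hui", ARTICLES)); // prints "aujourd'hui"
    }
}
```

Only a prefix found in the article set is stripped, so a word such as "aujourd'hui" passes through unchanged.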
- See Also:
- Elision in Wikipedia
Method Summary
Token next(Token reusableToken)
- Returns the next input Token, with any leading elision removed from its term().
void setArticles(Set articles)
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
Constructor Detail
ElisionFilter
protected ElisionFilter(TokenStream input)
- Constructs an elision filter with standard stop words
ElisionFilter
public ElisionFilter(TokenStream input, Set articles)
- Constructs an elision filter with a Set of stop words
ElisionFilter
public ElisionFilter(TokenStream input, String[] articles)
- Constructs an elision filter with an array of stop words
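Method Detail
setArticles
public void setArticles(Set articles)
- Sets the articles (the stop words accepted by the constructors) that this filter uses.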
next
public Token next(Token reusableToken)
throws IOException
- Returns the next input Token, with any leading elision removed from its term().
- Overrides:
next in class TokenStream
- Parameters:
reusableToken - a Token that the callee may or may not reuse as its return
value; this parameter should never be null (the callee is not required to
check for null before using it, but it is a good idea to assert that it is
not null).
- Returns:
- next token in the stream or null if end-of-stream was hit
- Throws:
IOException
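The calling convention described for next can be sketched as a consumer loop. This is a standalone illustration under stated assumptions: Tok and TokSource are hypothetical stand-ins, not Lucene's Token and TokenStream, and fromTerms and drain are helper names invented for this sketch. It shows the caller passing a non-null token, reading only from the returned token, and stopping when null is returned.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

public class ReusableTokenLoopSketch {
    /** Hypothetical stand-in for a token; not Lucene's Token class. */
    public static final class Tok {
        public String term;
    }

    /** Hypothetical stand-in for a stream with a next(reusable) style method. */
    public interface TokSource {
        Tok next(Tok reusable);
    }

    /** Builds a source that fills the caller's token and returns null at end of stream. */
    public static TokSource fromTerms(String... terms) {
        final Iterator<String> it = Arrays.asList(terms).iterator();
        return reusable -> {
            assert reusable != null; // the caller must never pass null
            if (!it.hasNext()) {
                return null;         // end of stream
            }
            reusable.term = it.next();
            return reusable;         // this source happens to reuse the caller's token
        };
    }

    /** Drains the source, always reading from the returned token, not the argument. */
    public static List<String> drain(TokSource source) {
        List<String> out = new ArrayList<>();
        Tok reusable = new Tok();
        for (Tok t = source.next(reusable); t != null; t = source.next(reusable)) {
            out.add(t.term);
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(drain(fromTerms("avion", "arbre"))); // prints "[avion, arbre]"
    }
}
```

Reading from the returned token rather than the argument matters because the callee may or may not use the token it was given.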
Copyright © 2000-2008 Apache Software Foundation. All Rights Reserved.