Uses of Class
org.apache.lucene.analysis.Tokenizer

Packages that use Tokenizer
org.apache.lucene.analysis API and code to convert text into indexable/searchable tokens. 
org.apache.lucene.analysis.cjk Analyzer for Chinese, Japanese and Korean. 
org.apache.lucene.analysis.cn Analyzer for Chinese. 
org.apache.lucene.analysis.ngram Character n-gram tokenizers. 
org.apache.lucene.analysis.ru Analyzer for Russian. 
org.apache.lucene.analysis.standard A grammar-based tokenizer constructed with JavaCC. 
 

Uses of Tokenizer in org.apache.lucene.analysis
 

Subclasses of Tokenizer in org.apache.lucene.analysis
 class CharTokenizer
          An abstract base class for simple, character-oriented tokenizers.
 class KeywordTokenizer
          Emits the entire input as a single token.
 class LetterTokenizer
          A LetterTokenizer is a tokenizer that divides text at non-letters.
 class LowerCaseTokenizer
          LowerCaseTokenizer performs the function of LetterTokenizer and LowerCaseFilter together.
 class WhitespaceTokenizer
          A WhitespaceTokenizer is a tokenizer that divides text at whitespace.
 

Uses of Tokenizer in org.apache.lucene.analysis.cjk
 

Subclasses of Tokenizer in org.apache.lucene.analysis.cjk
 class CJKTokenizer
          CJKTokenizer was adapted from StopTokenizer, which does a decent job for most European languages.
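For runs of CJK characters, CJKTokenizer pairs adjacent characters into overlapping two-character tokens. A rough sketch of that bigram strategy (illustrative only; the real tokenizer also handles Latin words, digits, and stream buffering):

```java
import java.util.ArrayList;
import java.util.List;

// Sketch of the overlapping-bigram strategy applied to a run of
// CJK characters: for input c1 c2 c3 it emits c1c2, c2c3.
public class CjkBigramSketch {

    static List<String> bigrams(String cjkRun) {
        List<String> tokens = new ArrayList<>();
        for (int i = 0; i + 1 < cjkRun.length(); i++) {
            tokens.add(cjkRun.substring(i, i + 2)); // overlapping pair
        }
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(bigrams("一二三四")); // -> [一二, 二三, 三四]
    }
}
```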
 

Uses of Tokenizer in org.apache.lucene.analysis.cn
 

Subclasses of Tokenizer in org.apache.lucene.analysis.cn
 class ChineseTokenizer
          Extracts tokens from the stream using Character.getType(), treating each Chinese character as a single token. The difference between the ChineseTokenizer and the CJKTokenizer is that they have different token-parsing logic.
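The one-character-per-token rule can be sketched with Character.getType(), which reports CJK ideographs as OTHER_LETTER. This is a simplified illustration of the rule, not the ChineseTokenizer implementation itself:

```java
import java.util.ArrayList;
import java.util.List;

// Sketch: each CJK ideograph (Character.OTHER_LETTER) becomes its own
// token, while runs of basic letters and digits stay together.
public class ChineseTokenizerSketch {

    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        StringBuilder word = new StringBuilder();
        for (char c : text.toCharArray()) {
            if (Character.getType(c) == Character.OTHER_LETTER) {
                if (word.length() > 0) {            // flush any pending run
                    tokens.add(word.toString());
                    word.setLength(0);
                }
                tokens.add(String.valueOf(c));      // one CJK char = one token
            } else if (Character.isLetterOrDigit(c)) {
                word.append(c);                     // accumulate Latin/digit run
            } else if (word.length() > 0) {
                tokens.add(word.toString());        // delimiter ends the run
                word.setLength(0);
            }
        }
        if (word.length() > 0) tokens.add(word.toString());
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(tokenize("abc中文123")); // -> [abc, 中, 文, 123]
    }
}
```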
 

Uses of Tokenizer in org.apache.lucene.analysis.ngram
 

Subclasses of Tokenizer in org.apache.lucene.analysis.ngram
 class EdgeNGramTokenizer
          Tokenizes the input from an edge into n-grams of given size(s).
 class NGramTokenizer
          Tokenizes the input into n-grams of the given size(s).
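The two n-gram strategies above can be sketched in plain Java (not the Lucene classes themselves): a full n-gram tokenizer slides a fixed-size window over the input, while an edge n-gram tokenizer grows grams from one edge only. The minGram/maxGram parameter names below mirror the style of the Lucene classes but are illustrative:

```java
import java.util.ArrayList;
import java.util.List;

// Sketch of full n-grams (sliding window) vs. edge n-grams
// (growing prefixes from the front edge).
public class NGramSketch {

    // all n-grams of length n from the input
    static List<String> ngrams(String text, int n) {
        List<String> grams = new ArrayList<>();
        for (int i = 0; i + n <= text.length(); i++) {
            grams.add(text.substring(i, i + n));
        }
        return grams;
    }

    // prefixes of length minGram..maxGram (the front edge)
    static List<String> edgeNgrams(String text, int minGram, int maxGram) {
        List<String> grams = new ArrayList<>();
        for (int n = minGram; n <= maxGram && n <= text.length(); n++) {
            grams.add(text.substring(0, n));
        }
        return grams;
    }

    public static void main(String[] args) {
        System.out.println(ngrams("lucene", 2));        // -> [lu, uc, ce, en, ne]
        System.out.println(edgeNgrams("lucene", 1, 3)); // -> [l, lu, luc]
    }
}
```

Edge n-grams are the basis of prefix-style matching (e.g. search-as-you-type), while full n-grams support substring matching at the cost of many more tokens.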
 

Uses of Tokenizer in org.apache.lucene.analysis.ru
 

Subclasses of Tokenizer in org.apache.lucene.analysis.ru
 class RussianLetterTokenizer
          A RussianLetterTokenizer is a tokenizer that extends LetterTokenizer by additionally looking up letters in a given "russian charset".
 

Uses of Tokenizer in org.apache.lucene.analysis.standard
 

Subclasses of Tokenizer in org.apache.lucene.analysis.standard
 class StandardTokenizer
          A grammar-based tokenizer constructed with JavaCC.
 



Copyright © 2000-2008 Apache Software Foundation. All Rights Reserved.