com.ibm.icu.text

Class ComposedCharIter


public final class ComposedCharIter
extends Object

ComposedCharIter is an iterator class that returns all of the precomposed characters defined in the Unicode standard, along with their decomposed forms. This is often useful when building data tables (e.g. collation tables) that need to treat composed and decomposed characters equivalently.

For example, imagine that you have built a collation table with ordering rules for the canonically decomposed forms of all characters used in a particular language. When you process input text using this table, the text must first be decomposed so that it matches the form used in the table. This can impose a performance penalty that may be unacceptable in some situations.

You can avoid this problem by ensuring that the collation table contains rules for both the decomposed and composed versions of each character. To do so, use a ComposedCharIter to iterate through all of the composed characters in Unicode. For each composed character whose decomposition consists solely of characters that are listed in your ruleset, you can add a new rule that makes the composed character equivalent to its decomposition sequence.
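The rule-augmentation step described above can be sketched as follows. This is only an illustration under stated assumptions: it does not use ComposedCharIter itself but substitutes the JDK's java.text.Normalizer, scanning the Basic Multilingual Plane for characters that have a canonical decomposition. The class name ComposedRuleSketch, the method composedEquivalents, and the representation of a ruleset as a Set of characters are hypothetical and are not part of ICU.

```java
import java.text.Normalizer;
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch: java.text.Normalizer stands in for ComposedCharIter.
class ComposedRuleSketch {

    // Maps each composed BMP character to its canonical decomposition,
    // keeping only those whose decomposition uses characters in ruleChars.
    static Map<Character, String> composedEquivalents(Set<Character> ruleChars) {
        Map<Character, String> result = new LinkedHashMap<>();
        for (int code = 0; code <= 0xFFFF; code++) {
            char c = (char) code;
            if (Character.isSurrogate(c)) {
                continue; // lone surrogates are not characters
            }
            String composed = String.valueOf(c);
            String decomposed = Normalizer.normalize(composed, Normalizer.Form.NFD);
            if (decomposed.equals(composed)) {
                continue; // no canonical decomposition
            }
            boolean covered = true;
            for (int i = 0; i < decomposed.length(); i++) {
                if (!ruleChars.contains(decomposed.charAt(i))) {
                    covered = false;
                    break;
                }
            }
            if (covered) {
                result.put(c, decomposed);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Assumed ruleset: the letter e and the combining acute accent (U+0301).
        Set<Character> ruleChars = new HashSet<>(Arrays.asList('e', '\u0301'));
        for (Map.Entry<Character, String> e : composedEquivalents(ruleChars).entrySet()) {
            System.out.printf("U+%04X is equivalent to a sequence of %d characters%n",
                    (int) e.getKey().charValue(), e.getValue().length());
        }
    }
}
```

Each entry in the returned map corresponds to one candidate rule: the key is the composed character and the value is the decomposition sequence it should be made equivalent to.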

Note that ComposedCharIter iterates over a static table of the composed characters in Unicode. If you want to iterate over the composed characters in a particular string, use Normalizer instead.
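For the per-string case, a minimal sketch might look like the following. It assumes the JDK's java.text.Normalizer as a stand-in for the ICU Normalizer class named above, and the class and method names (StringDecompositionSketch, decompose, countComposed) are hypothetical.

```java
import java.text.Normalizer;

// Hypothetical sketch: works on one input string rather than a static table.
class StringDecompositionSketch {

    // Returns the canonical decomposition (NFD) of the whole string.
    static String decompose(String text) {
        return Normalizer.normalize(text, Normalizer.Form.NFD);
    }

    // Counts the code points in text that have a canonical decomposition.
    static int countComposed(String text) {
        int count = 0;
        for (int i = 0; i < text.length(); ) {
            int cp = text.codePointAt(i);
            String single = new String(Character.toChars(cp));
            if (!Normalizer.normalize(single, Normalizer.Form.NFD).equals(single)) {
                count++;
            }
            i += Character.charCount(cp);
        }
        return count;
    }

    public static void main(String[] args) {
        String input = "r\u00E9sum\u00E9";
        System.out.println("composed characters: " + countComposed(input));
        System.out.println("decomposed length: " + decompose(input).length());
    }
}
```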

When constructing a ComposedCharIter there is one optional feature that you can enable or disable:

ComposedCharIter is currently based on version 2.1.8 of the Unicode Standard. It will be updated as later versions of Unicode are released.

Field Summary

static char
DONE
Deprecated. ICU 2.2

Constructor Summary

ComposedCharIter()
Deprecated. ICU 2.2
ComposedCharIter(boolean compat, int options)
Deprecated. ICU 2.2

Method Summary

String
decomposition()
Deprecated. ICU 2.2
boolean
hasNext()
Deprecated. ICU 2.2
char
next()
Deprecated. ICU 2.2

Field Details

DONE

public static final char DONE

Deprecated. ICU 2.2

Constant that indicates the iteration has completed. next() returns this value when there are no more composed characters over which to iterate.
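The sentinel contract described above can be illustrated with a small stand-in iterator. This is a sketch, not the ICU implementation: SentinelIterSketch is a hypothetical class that mimics the hasNext(), next(), and decomposition() shape listed in the Method Summary, it derives its data from java.text.Normalizer, and the sentinel value U+FFFF is an assumption made for this example.

```java
import java.text.Normalizer;

// Hypothetical stand-in that mimics a DONE-terminated iteration.
class SentinelIterSketch {
    // Assumed sentinel value for this sketch only.
    static final char DONE = '\uFFFF';

    private int cursor = 0;            // next code unit to examine
    private char pending = DONE;       // next character to hand out
    private String pendingDecomp = null;
    private String currentDecomp = null;

    SentinelIterSketch() {
        advance();
    }

    // Finds the next BMP character that has a canonical decomposition.
    private void advance() {
        pending = DONE;
        pendingDecomp = null;
        while (cursor < 0xFFFF) {
            char c = (char) cursor++;
            if (Character.isSurrogate(c)) {
                continue;
            }
            String s = String.valueOf(c);
            String d = Normalizer.normalize(s, Normalizer.Form.NFD);
            if (!d.equals(s)) {
                pending = c;
                pendingDecomp = d;
                return;
            }
        }
    }

    boolean hasNext() {
        return pending != DONE;
    }

    // Returns the next composed character, or DONE when none remain.
    char next() {
        char result = pending;
        currentDecomp = pendingDecomp;
        if (result != DONE) {
            advance();
        }
        return result;
    }

    // Decomposition of the character most recently returned by next().
    String decomposition() {
        return currentDecomp;
    }

    public static void main(String[] args) {
        SentinelIterSketch iter = new SentinelIterSketch();
        int count = 0;
        // Loop until the sentinel is returned instead of calling hasNext().
        for (char c = iter.next(); c != DONE; c = iter.next()) {
            count++;
        }
        System.out.println("composed characters seen: " + count);
    }
}
```

Checking the return value against DONE and calling hasNext() before each next() are two ways of writing the same loop; the sentinel form saves one call per character.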

Constructor Details

ComposedCharIter

public ComposedCharIter()

Deprecated. ICU 2.2

Constructs a new ComposedCharIter. The iterator will return all Unicode characters with canonical decompositions, including Korean Hangul characters.

ComposedCharIter

public ComposedCharIter(boolean compat,
                        int options)

Deprecated. ICU 2.2

Constructs a non-default ComposedCharIter with optional behavior.

Method Details

decomposition

public String decomposition()

Deprecated. ICU 2.2


hasNext

public boolean hasNext()

Deprecated. ICU 2.2


next

public char next()

Deprecated. ICU 2.2


Copyright (c) 2006 IBM Corporation and others.