com.ibm.icu.lang

Interface UProperty

public interface UProperty

Selection constants for Unicode properties.

These constants are passed to functions such as UCharacter.hasBinaryProperty(int, int) to select one of the Unicode properties.

The properties APIs are intended to reflect Unicode properties as defined in the Unicode Character Database (UCD) and Unicode Technical Reports (UTR).

For details about the properties see http://www.unicode.org.

For names of Unicode properties see the UCD file PropertyAliases.txt.

Important: If ICU is built with UCD files from a Unicode version earlier than 3.2, properties marked "new" are unavailable or only partially available. Check UCharacter.getUnicodeVersion() to be sure.

Author:
Syn Wee Quek
See Also:
UCharacter

Nested Class Summary

static interface
UProperty.NameChoice
Selector constants for UCharacter.getPropertyName() and UCharacter.getPropertyValueName().

Field Summary

static int
AGE
String property Age.
static int
ALPHABETIC
Binary property Alphabetic.
static int
ASCII_HEX_DIGIT
Binary property ASCII_Hex_Digit (0-9 A-F a-f).
static int
BIDI_CLASS
Enumerated property Bidi_Class.
static int
BIDI_CONTROL
Binary property Bidi_Control.

Format controls which have specific functions in the Bidi Algorithm.

static int
BIDI_MIRRORED
Binary property Bidi_Mirrored.

Characters that may change display in RTL text.

Property for UCharacter.isMirrored().

See Bidi Algorithm; UTR 9.

static int
BIDI_MIRRORING_GLYPH
String property Bidi_Mirroring_Glyph.
static int
BINARY_LIMIT
One more than the last constant for binary Unicode properties.
static int
BINARY_START
First constant for binary Unicode properties.
static int
BLOCK
Enumerated property Block.
static int
CANONICAL_COMBINING_CLASS
Enumerated property Canonical_Combining_Class.
static int
CASE_FOLDING
String property Case_Folding.
static int
CASE_SENSITIVE
Binary property Case_Sensitive.

Either the source of a case mapping or _in_ the target of a case mapping.

static int
DASH
Binary property Dash.

Variations of dashes.

static int
DECOMPOSITION_TYPE
Enumerated property Decomposition_Type.
static int
DEFAULT_IGNORABLE_CODE_POINT
Binary property Default_Ignorable_Code_Point (new).
static int
DEPRECATED
Binary property Deprecated (new).

The usage of deprecated characters is strongly discouraged.

static int
DIACRITIC
Binary property Diacritic.

Characters that linguistically modify the meaning of another character to which they apply.

static int
DOUBLE_LIMIT
One more than the last constant for double Unicode properties.
static int
DOUBLE_START
First constant for double Unicode properties.
static int
EAST_ASIAN_WIDTH
Enumerated property East_Asian_Width.
static int
EXTENDER
Binary property Extender.

Extend the value or shape of a preceding alphabetic character, e.g. length and iteration marks.

static int
FULL_COMPOSITION_EXCLUSION
Binary property Full_Composition_Exclusion.

CompositionExclusions.txt + Singleton Decompositions + Non-Starter Decompositions.

static int
GENERAL_CATEGORY
Enumerated property General_Category.
static int
GENERAL_CATEGORY_MASK
Bitmask property General_Category_Mask.
static int
GRAPHEME_BASE
Binary property Grapheme_Base (new).

For programmatic determination of grapheme cluster boundaries.

static int
GRAPHEME_CLUSTER_BREAK
Enumerated property Grapheme_Cluster_Break (new in Unicode 4.1).
static int
GRAPHEME_EXTEND
Binary property Grapheme_Extend (new).

For programmatic determination of grapheme cluster boundaries.

Me+Mn+Mc+Other_Grapheme_Extend-Grapheme_Link-CGJ

static int
GRAPHEME_LINK
Binary property Grapheme_Link (new).

For programmatic determination of grapheme cluster boundaries.

static int
HANGUL_SYLLABLE_TYPE
Enumerated property Hangul_Syllable_Type, new in Unicode 4.
static int
HEX_DIGIT
Binary property Hex_Digit.

Characters commonly used for hexadecimal numbers.

static int
HYPHEN
Binary property Hyphen.

Dashes used to mark connections between pieces of words, plus the Katakana middle dot.

static int
IDEOGRAPHIC
Binary property Ideographic.

CJKV ideographs.

static int
IDS_BINARY_OPERATOR
Binary property IDS_Binary_Operator (new).

For programmatic determination of Ideographic Description Sequences.

static int
IDS_TRINARY_OPERATOR
Binary property IDS_Trinary_Operator (new).
static int
ID_CONTINUE
Binary property ID_Continue.

Characters that can continue an identifier.

ID_Start+Mn+Mc+Nd+Pc

static int
ID_START
Binary property ID_Start.

Characters that can start an identifier.

Lu+Ll+Lt+Lm+Lo+Nl

static int
INT_LIMIT
One more than the last constant for enumerated/integer Unicode properties.
static int
INT_START
First constant for enumerated/integer Unicode properties.
static int
ISO_COMMENT
String property ISO_Comment.
static int
JOINING_GROUP
Enumerated property Joining_Group.
static int
JOINING_TYPE
Enumerated property Joining_Type.
static int
JOIN_CONTROL
Binary property Join_Control.

Format controls for cursive joining and ligation.

static int
LEAD_CANONICAL_COMBINING_CLASS
Enumerated property Lead_Canonical_Combining_Class.
static int
LINE_BREAK
Enumerated property Line_Break.
static int
LOGICAL_ORDER_EXCEPTION
Binary property Logical_Order_Exception (new).

Characters that do not use logical order and require special handling in most processing.

static int
LOWERCASE
Binary property Lowercase.

Same as UCharacter.isULowercase(), different from UCharacter.isLowerCase().

Ll+Other_Lowercase

static int
LOWERCASE_MAPPING
String property Lowercase_Mapping.
static int
MASK_LIMIT
One more than the last constant for bit-mask Unicode properties.
static int
MASK_START
First constant for bit-mask Unicode properties.
static int
MATH
Binary property Math.

Sm+Other_Math

static int
NAME
String property Name.
static int
NFC_INERT
Binary property NFC_Inert.
static int
NFC_QUICK_CHECK
Enumerated property NFC_Quick_Check.
static int
NFD_INERT
Binary property NFD_Inert.
static int
NFD_QUICK_CHECK
Enumerated property NFD_Quick_Check.
static int
NFKC_INERT
Binary property NFKC_Inert.
static int
NFKC_QUICK_CHECK
Enumerated property NFKC_Quick_Check.
static int
NFKD_INERT
Binary property NFKD_Inert.
static int
NFKD_QUICK_CHECK
Enumerated property NFKD_Quick_Check.
static int
NONCHARACTER_CODE_POINT
Binary property Noncharacter_Code_Point.

Code points that are explicitly defined as illegal for the encoding of characters.

static int
NUMERIC_TYPE
Enumerated property Numeric_Type.
static int
NUMERIC_VALUE
Double property Numeric_Value.
static int
PATTERN_SYNTAX
Binary property Pattern_Syntax (new in Unicode 4.1).
static int
PATTERN_WHITE_SPACE
Binary property Pattern_White_Space (new in Unicode 4.1).
static int
POSIX_ALNUM
Binary property alnum (a C/POSIX character class).
static int
POSIX_BLANK
Binary property blank (a C/POSIX character class).
static int
POSIX_GRAPH
Binary property graph (a C/POSIX character class).
static int
POSIX_PRINT
Binary property print (a C/POSIX character class).
static int
POSIX_XDIGIT
Binary property xdigit (a C/POSIX character class).
static int
QUOTATION_MARK
Binary property Quotation_Mark.
static int
RADICAL
Binary property Radical (new).

For programmatic determination of Ideographic Description Sequences.

static int
SCRIPT
Enumerated property Script.
static int
SEGMENT_STARTER
Binary property Segment_Starter.
static int
SENTENCE_BREAK
Enumerated property Sentence_Break (new in Unicode 4.1).
static int
SIMPLE_CASE_FOLDING
String property Simple_Case_Folding.
static int
SIMPLE_LOWERCASE_MAPPING
String property Simple_Lowercase_Mapping.
static int
SIMPLE_TITLECASE_MAPPING
String property Simple_Titlecase_Mapping.
static int
SIMPLE_UPPERCASE_MAPPING
String property Simple_Uppercase_Mapping.
static int
SOFT_DOTTED
Binary property Soft_Dotted (new).

Characters with a "soft dot", like i or j.

An accent placed on these characters causes the dot to disappear.

static int
STRING_LIMIT
One more than the last constant for string Unicode properties.
static int
STRING_START
First constant for string Unicode properties.
static int
S_TERM
Binary property STerm (new in Unicode 4.0.1).
static int
TERMINAL_PUNCTUATION
Binary property Terminal_Punctuation.

Punctuation characters that generally mark the end of textual units.

static int
TITLECASE_MAPPING
String property Titlecase_Mapping.
static int
TRAIL_CANONICAL_COMBINING_CLASS
Enumerated property Trail_Canonical_Combining_Class.
static int
UNICODE_1_NAME
String property Unicode_1_Name.
static int
UNIFIED_IDEOGRAPH
Binary property Unified_Ideograph (new).

For programmatic determination of Ideographic Description Sequences.

static int
UPPERCASE
Binary property Uppercase.

Same as UCharacter.isUUppercase(), different from UCharacter.isUpperCase().

Lu+Other_Uppercase

static int
UPPERCASE_MAPPING
String property Uppercase_Mapping.
static int
VARIATION_SELECTOR
Binary property Variation_Selector (new in Unicode 4.0.1).
static int
WHITE_SPACE
Binary property White_Space.

Same as UCharacter.isUWhiteSpace(), different from UCharacter.isSpace() and UCharacter.isWhitespace().

Space characters+TAB+CR+LF-ZWSP-ZWNBSP

static int
WORD_BREAK
Enumerated property Word_Break (new in Unicode 4.1).
static int
XID_CONTINUE
Binary property XID_Continue.

ID_Continue modified to allow closure under normalization forms NFKC and NFKD.

static int
XID_START
Binary property XID_Start.

ID_Start modified to allow closure under normalization forms NFKC and NFKD.

Field Details

AGE

public static final int AGE
String property Age. Corresponds to UCharacter.getAge(int).
Field Value:
16384

ALPHABETIC

public static final int ALPHABETIC
Binary property Alphabetic.

Property for UCharacter.isUAlphabetic(), different from the letter test in UCharacter.isLetter() (u_isalpha() in the C API).

Lu + Ll + Lt + Lm + Lo + Nl + Other_Alphabetic.

Field Value:
0

ASCII_HEX_DIGIT

public static final int ASCII_HEX_DIGIT
Binary property ASCII_Hex_Digit (0-9 A-F a-f).
Field Value:
1
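
The parenthetical "(0-9 A-F a-f)" fully defines this property, so it can be illustrated without any library. The following is a minimal sketch only; the class AsciiHexSketch and its method are hypothetical names and not part of ICU4J. In ICU4J the property would be queried through UCharacter.hasBinaryProperty with this constant.

```java
/** Hypothetical helper, not part of ICU4J: tests the ASCII_Hex_Digit ranges by hand. */
final class AsciiHexSketch {
    /** Returns true only for 0-9, A-F and a-f, the set quoted in the description above. */
    static boolean isAsciiHexDigit(int codePoint) {
        return (codePoint >= '0' && codePoint <= '9')
            || (codePoint >= 'A' && codePoint <= 'F')
            || (codePoint >= 'a' && codePoint <= 'f');
    }

    public static void main(String[] args) {
        System.out.println(isAsciiHexDigit('f'));     // true
        System.out.println(isAsciiHexDigit('g'));     // false
        // U+FF11 FULLWIDTH DIGIT ONE has Hex_Digit but not ASCII_Hex_Digit.
        System.out.println(isAsciiHexDigit(0xFF11));  // false
    }
}
```

The fullwidth case is the practical difference from HEX_DIGIT, which also covers the fullwidth forms of the same digits and letters.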

BIDI_CLASS

public static final int BIDI_CLASS
Enumerated property Bidi_Class. Same as UCharacter.getDirection(int), returns UCharacterDirection values.
Field Value:
4096

BIDI_CONTROL

public static final int BIDI_CONTROL
Binary property Bidi_Control.

Format controls which have specific functions in the Bidi Algorithm.

Field Value:
2

BIDI_MIRRORED

public static final int BIDI_MIRRORED
Binary property Bidi_Mirrored.

Characters that may change display in RTL text.

Property for UCharacter.isMirrored().

See Bidi Algorithm; UTR 9.

Field Value:
3

BIDI_MIRRORING_GLYPH

public static final int BIDI_MIRRORING_GLYPH
String property Bidi_Mirroring_Glyph. Corresponds to UCharacter.getMirror(int).
Field Value:
16385

BINARY_LIMIT

public static final int BINARY_LIMIT
One more than the last constant for binary Unicode properties.
Field Value:
49

BINARY_START

public static final int BINARY_START
First constant for binary Unicode properties.
Field Value:
0
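
The *_START and *_LIMIT constants partition the selector values by property type, with each range half-open: START is included and LIMIT is excluded. The sketch below shows how a selector could be classified by range. It assumes the field values printed on this page, which may differ in other ICU releases, and the class PropertyKind is a hypothetical helper that does not exist in ICU4J. Real code would reference the UProperty constants instead of copying the numbers.

```java
/** Hypothetical helper, not part of ICU4J: classifies a UProperty selector by value range. */
final class PropertyKind {
    // Values copied from the Field Value entries on this page; other releases may differ.
    static final int BINARY_START = 0,     BINARY_LIMIT = 49;
    static final int INT_START    = 4096,  INT_LIMIT    = 4117;
    static final int MASK_START   = 8192,  MASK_LIMIT   = 8193;
    static final int DOUBLE_START = 12288, DOUBLE_LIMIT = 12289;
    static final int STRING_START = 16384, STRING_LIMIT = 16397;

    /** Returns the kind of property a selector denotes, or "unknown" if it is in no range. */
    static String kindOf(int property) {
        if (property >= BINARY_START && property < BINARY_LIMIT) return "binary";
        if (property >= INT_START && property < INT_LIMIT) return "int";
        if (property >= MASK_START && property < MASK_LIMIT) return "mask";
        if (property >= DOUBLE_START && property < DOUBLE_LIMIT) return "double";
        if (property >= STRING_START && property < STRING_LIMIT) return "string";
        return "unknown";
    }

    public static void main(String[] args) {
        System.out.println(kindOf(0));      // ALPHABETIC    -> binary
        System.out.println(kindOf(4106));   // SCRIPT        -> int
        System.out.println(kindOf(12288));  // NUMERIC_VALUE -> double
        System.out.println(kindOf(49));     // BINARY_LIMIT itself is outside the range -> unknown
    }
}
```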

BLOCK

public static final int BLOCK
Enumerated property Block. Same as UCharacter.UnicodeBlock.of(int), returns UCharacter.UnicodeBlock values.
Field Value:
4097

CANONICAL_COMBINING_CLASS

public static final int CANONICAL_COMBINING_CLASS
Enumerated property Canonical_Combining_Class. Same as UCharacter.getCombiningClass(int), returns 8-bit numeric values.
Field Value:
4098

CASE_FOLDING

public static final int CASE_FOLDING
String property Case_Folding. Corresponds to UCharacter.foldCase(String, boolean).
Field Value:
16386

CASE_SENSITIVE

public static final int CASE_SENSITIVE
Binary property Case_Sensitive.

Either the source of a case mapping or _in_ the target of a case mapping. Not the same as the general category Cased_Letter.

Field Value:
34

DASH

public static final int DASH
Binary property Dash.

Variations of dashes.

Field Value:
4

DECOMPOSITION_TYPE

public static final int DECOMPOSITION_TYPE
Enumerated property Decomposition_Type. Returns UCharacter.DecompositionType values.
Field Value:
4099

DEFAULT_IGNORABLE_CODE_POINT

public static final int DEFAULT_IGNORABLE_CODE_POINT
Binary property Default_Ignorable_Code_Point (new).

Indicates that a code point is ignorable in most processing.

Codepoints (2060..206F, FFF0..FFFB, E0000..E0FFF) + Other_Default_Ignorable_Code_Point + (Cf + Cc + Cs - White_Space)

Field Value:
5

DEPRECATED

public static final int DEPRECATED
Binary property Deprecated (new).

The usage of deprecated characters is strongly discouraged.

Field Value:
6

DIACRITIC

public static final int DIACRITIC
Binary property Diacritic.

Characters that linguistically modify the meaning of another character to which they apply.

Field Value:
7

DOUBLE_LIMIT

public static final int DOUBLE_LIMIT
One more than the last constant for double Unicode properties.
Field Value:
12289

DOUBLE_START

public static final int DOUBLE_START
First constant for double Unicode properties.
Field Value:
12288

EAST_ASIAN_WIDTH

public static final int EAST_ASIAN_WIDTH
Enumerated property East_Asian_Width. See http://www.unicode.org/reports/tr11/ . Returns UCharacter.EastAsianWidth values.
Field Value:
4100

EXTENDER

public static final int EXTENDER
Binary property Extender.

Extend the value or shape of a preceding alphabetic character, e.g. length and iteration marks.

Field Value:
8

FULL_COMPOSITION_EXCLUSION

public static final int FULL_COMPOSITION_EXCLUSION
Binary property Full_Composition_Exclusion.

CompositionExclusions.txt + Singleton Decompositions + Non-Starter Decompositions.

Field Value:
9

GENERAL_CATEGORY

public static final int GENERAL_CATEGORY
Enumerated property General_Category. Same as UCharacter.getType(int), returns UCharacterCategory values.
Field Value:
4101

GENERAL_CATEGORY_MASK

public static final int GENERAL_CATEGORY_MASK
Bitmask property General_Category_Mask. This is the General_Category property returned as a bit mask. When used in UCharacter.getIntPropertyValue(c), returns bit masks for UCharacterCategory values where exactly one bit is set. When used with UCharacter.getPropertyValueName() and UCharacter.getPropertyValueEnum(), a multi-bit mask is used for sets of categories like "Letters".
Field Value:
8192
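
The difference between a single-bit mask for one category and a multi-bit mask for a set such as "Letters" can be sketched with plain bit operations. This is an illustration under stated assumptions: it uses java.lang.Character.getType(int) and the JDK's category numbering as a stand-in for UCharacter and UCharacterCategory, and the class CategoryMask is hypothetical, not an ICU4J type.

```java
/** Hypothetical sketch; java.lang.Character stands in for UCharacter here. */
final class CategoryMask {
    /** Multi-bit mask for the category set "Letters" (Lu, Ll, Lt, Lm, Lo). */
    static final int LETTER_MASK =
          (1 << Character.UPPERCASE_LETTER)
        | (1 << Character.LOWERCASE_LETTER)
        | (1 << Character.TITLECASE_LETTER)
        | (1 << Character.MODIFIER_LETTER)
        | (1 << Character.OTHER_LETTER);

    /** Single-bit mask: exactly one bit is set, at the index of the code point's category. */
    static int maskOf(int codePoint) {
        return 1 << Character.getType(codePoint);
    }

    /** A code point belongs to the set when its single bit overlaps the set's mask. */
    static boolean isLetter(int codePoint) {
        return (maskOf(codePoint) & LETTER_MASK) != 0;
    }

    public static void main(String[] args) {
        System.out.println(Integer.bitCount(maskOf('A'))); // 1
        System.out.println(isLetter('A'));                 // true
        System.out.println(isLetter('7'));                 // false
    }
}
```

Testing membership in a set of categories then costs one shift and one AND, which is the usual reason to expose a category as a mask.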

GRAPHEME_BASE

public static final int GRAPHEME_BASE
Binary property Grapheme_Base (new).

For programmatic determination of grapheme cluster boundaries. [0..10FFFF]-Cc-Cf-Cs-Co-Cn-Zl-Zp-Grapheme_Link-Grapheme_Extend-CGJ

Field Value:
10

GRAPHEME_CLUSTER_BREAK

public static final int GRAPHEME_CLUSTER_BREAK
Enumerated property Grapheme_Cluster_Break (new in Unicode 4.1). Used in UAX #29: Text Boundaries (http://www.unicode.org/reports/tr29/). Returns UGraphemeClusterBreak values.
Field Value:
4114

GRAPHEME_EXTEND

public static final int GRAPHEME_EXTEND
Binary property Grapheme_Extend (new).

For programmatic determination of grapheme cluster boundaries.

Me+Mn+Mc+Other_Grapheme_Extend-Grapheme_Link-CGJ

Field Value:
11

GRAPHEME_LINK

public static final int GRAPHEME_LINK
Binary property Grapheme_Link (new).

For programmatic determination of grapheme cluster boundaries.

Field Value:
12

HANGUL_SYLLABLE_TYPE

public static final int HANGUL_SYLLABLE_TYPE
Enumerated property Hangul_Syllable_Type, new in Unicode 4. Returns HangulSyllableType values.
Field Value:
4107

HEX_DIGIT

public static final int HEX_DIGIT
Binary property Hex_Digit.

Characters commonly used for hexadecimal numbers.

Field Value:
13

HYPHEN

public static final int HYPHEN
Binary property Hyphen.

Dashes used to mark connections between pieces of words, plus the Katakana middle dot.

Field Value:
14

IDEOGRAPHIC

public static final int IDEOGRAPHIC
Binary property Ideographic.

CJKV ideographs.

Field Value:
17

IDS_BINARY_OPERATOR

public static final int IDS_BINARY_OPERATOR
Binary property IDS_Binary_Operator (new).

For programmatic determination of Ideographic Description Sequences.

Field Value:
18

IDS_TRINARY_OPERATOR

public static final int IDS_TRINARY_OPERATOR
Binary property IDS_Trinary_Operator (new). For programmatic determination of Ideographic Description Sequences.
Field Value:
19

ID_CONTINUE

public static final int ID_CONTINUE
Binary property ID_Continue.

Characters that can continue an identifier.

ID_Start+Mn+Mc+Nd+Pc

Field Value:
15

ID_START

public static final int ID_START
Binary property ID_Start.

Characters that can start an identifier.

Lu+Ll+Lt+Lm+Lo+Nl

Field Value:
16
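
The two formulas "Lu+Ll+Lt+Lm+Lo+Nl" and "ID_Start+Mn+Mc+Nd+Pc" can be read as unions of general categories. The sketch below follows those formulas literally, using java.lang.Character.getType(int) as a stand-in for ICU's category lookup. It is an approximation: the class IdentifierSketch is hypothetical, and the derivations in the UCD include further adjustments that these formulas do not show.

```java
/** Hypothetical sketch of the category unions quoted above; not the ICU4J implementation. */
final class IdentifierSketch {
    /** Lu+Ll+Lt+Lm+Lo+Nl */
    static boolean isIdStartApprox(int codePoint) {
        switch (Character.getType(codePoint)) {
            case Character.UPPERCASE_LETTER:
            case Character.LOWERCASE_LETTER:
            case Character.TITLECASE_LETTER:
            case Character.MODIFIER_LETTER:
            case Character.OTHER_LETTER:
            case Character.LETTER_NUMBER:
                return true;
            default:
                return false;
        }
    }

    /** ID_Start+Mn+Mc+Nd+Pc */
    static boolean isIdContinueApprox(int codePoint) {
        if (isIdStartApprox(codePoint)) {
            return true;
        }
        switch (Character.getType(codePoint)) {
            case Character.NON_SPACING_MARK:
            case Character.COMBINING_SPACING_MARK:
            case Character.DECIMAL_DIGIT_NUMBER:
            case Character.CONNECTOR_PUNCTUATION:
                return true;
            default:
                return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(isIdStartApprox('x'));    // true
        System.out.println(isIdStartApprox('9'));    // false
        System.out.println(isIdContinueApprox('9')); // true
        System.out.println(isIdContinueApprox('_')); // true (Pc)
    }
}
```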

INT_LIMIT

public static final int INT_LIMIT
One more than the last constant for enumerated/integer Unicode properties.
Field Value:
4117

INT_START

public static final int INT_START
First constant for enumerated/integer Unicode properties.
Field Value:
4096

ISO_COMMENT

public static final int ISO_COMMENT
String property ISO_Comment. Corresponds to UCharacter.getISOComment(int).
Field Value:
16387

JOINING_GROUP

public static final int JOINING_GROUP
Enumerated property Joining_Group. Returns UCharacter.JoiningGroup values.
Field Value:
4102

JOINING_TYPE

public static final int JOINING_TYPE
Enumerated property Joining_Type. Returns UCharacter.JoiningType values.
Field Value:
4103

JOIN_CONTROL

public static final int JOIN_CONTROL
Binary property Join_Control.

Format controls for cursive joining and ligation.

Field Value:
20

LEAD_CANONICAL_COMBINING_CLASS

public static final int LEAD_CANONICAL_COMBINING_CLASS
Enumerated property Lead_Canonical_Combining_Class. ICU-specific property for the ccc of the first code point of the decomposition, or lccc(c)=ccc(NFD(c)[0]). Useful for checking for canonically ordered text; see Normalizer.FCD and http://www.unicode.org/notes/tn5/#FCD . Returns 8-bit numeric values like CANONICAL_COMBINING_CLASS.
Field Value:
4112
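
To make the expression NFD(c)[0] concrete, the sketch below extracts the first and the last code point of a character's canonical decomposition. It uses java.text.Normalizer from the JDK rather than ICU's Normalizer, and the class NfdEnds is a hypothetical helper. The JDK does not expose combining classes, so the sketch only shows which code point the lead and trail properties look at, not the ccc value itself.

```java
import java.text.Normalizer;

/** Hypothetical helper: finds the code points at either end of NFD(c). */
final class NfdEnds {
    private static String nfd(int codePoint) {
        String s = new String(Character.toChars(codePoint));
        return Normalizer.normalize(s, Normalizer.Form.NFD);
    }

    /** First code point of NFD(c), the one whose ccc the lead property reports. */
    static int leadOfNfd(int codePoint) {
        return nfd(codePoint).codePointAt(0);
    }

    /** Last code point of NFD(c), the one whose ccc the trail property reports. */
    static int trailOfNfd(int codePoint) {
        String d = nfd(codePoint);
        return d.codePointBefore(d.length());
    }

    public static void main(String[] args) {
        // U+00E9 decomposes to U+0065 U+0301 under NFD.
        System.out.println(Integer.toHexString(leadOfNfd(0x00E9)));  // 65
        System.out.println(Integer.toHexString(trailOfNfd(0x00E9))); // 301
    }
}
```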

LINE_BREAK

public static final int LINE_BREAK
Enumerated property Line_Break. Returns UCharacter.LineBreak values.
Field Value:
4104

LOGICAL_ORDER_EXCEPTION

public static final int LOGICAL_ORDER_EXCEPTION
Binary property Logical_Order_Exception (new).

Characters that do not use logical order and require special handling in most processing.

Field Value:
21

LOWERCASE

public static final int LOWERCASE
Binary property Lowercase.

Same as UCharacter.isULowercase(), different from UCharacter.isLowerCase().

Ll+Other_Lowercase

Field Value:
22

LOWERCASE_MAPPING

public static final int LOWERCASE_MAPPING
String property Lowercase_Mapping. Corresponds to UCharacter.toLowerCase(String).
Field Value:
16388

MASK_LIMIT

public static final int MASK_LIMIT
One more than the last constant for bit-mask Unicode properties.
Field Value:
8193

MASK_START

public static final int MASK_START
First constant for bit-mask Unicode properties.
Field Value:
8192

MATH

public static final int MATH
Binary property Math.

Sm+Other_Math

Field Value:
23

NAME

public static final int NAME
String property Name. Corresponds to UCharacter.getName(int).
Field Value:
16389

NFC_INERT

public static final int NFC_INERT
Binary property NFC_Inert. ICU-specific property for characters that are inert under NFC, i.e., they do not interact with adjacent characters. Used for example in normalizing transforms in incremental mode to find the boundary of safely normalizable text despite possible text additions.
Field Value:
39

NFC_QUICK_CHECK

public static final int NFC_QUICK_CHECK
Enumerated property NFC_Quick_Check. Returns numeric values compatible with Normalizer.QuickCheckResult.
Field Value:
4110

NFD_INERT

public static final int NFD_INERT
Binary property NFD_Inert. ICU-specific property for characters that are inert under NFD, i.e., they do not interact with adjacent characters. Used for example in normalizing transforms in incremental mode to find the boundary of safely normalizable text despite possible text additions.

There is one such property per normalization form. These properties are computed as follows. An inert character is:

a) unassigned, or ALL of the following:
b) of combining class 0.
c) not decomposed by this normalization form.
AND if NFC or NFKC,
d) can never compose with a previous character.
e) can never compose with a following character.
f) can never change if another character is added.

Example: a-breve might satisfy all but f, but if you add an ogonek it changes to a-ogonek + breve.

See also com.ibm.text.UCD.NFSkippable in the ICU4J repository, and icu/source/common/unormimp.h .

Field Value:
37

NFD_QUICK_CHECK

public static final int NFD_QUICK_CHECK
Enumerated property NFD_Quick_Check. Returns numeric values compatible with Normalizer.QuickCheckResult.
Field Value:
4108

NFKC_INERT

public static final int NFKC_INERT
Binary property NFKC_Inert. ICU-specific property for characters that are inert under NFKC, i.e., they do not interact with adjacent characters. Used for example in normalizing transforms in incremental mode to find the boundary of safely normalizable text despite possible text additions.
Field Value:
40

NFKC_QUICK_CHECK

public static final int NFKC_QUICK_CHECK
Enumerated property NFKC_Quick_Check. Returns numeric values compatible with Normalizer.QuickCheckResult.
Field Value:
4111

NFKD_INERT

public static final int NFKD_INERT
Binary property NFKD_Inert. ICU-specific property for characters that are inert under NFKD, i.e., they do not interact with adjacent characters. Used for example in normalizing transforms in incremental mode to find the boundary of safely normalizable text despite possible text additions.
Field Value:
38

NFKD_QUICK_CHECK

public static final int NFKD_QUICK_CHECK
Enumerated property NFKD_Quick_Check. Returns numeric values compatible with Normalizer.QuickCheckResult.
Field Value:
4109

NONCHARACTER_CODE_POINT

public static final int NONCHARACTER_CODE_POINT
Binary property Noncharacter_Code_Point.

Code points that are explicitly defined as illegal for the encoding of characters.

Field Value:
24
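
The set of noncharacters is small and fixed by the Unicode Standard: U+FDD0..U+FDEF plus the last two code points of every plane (those ending in FFFE or FFFF). A minimal sketch of that definition follows. The class NoncharacterSketch is a hypothetical helper, not an ICU4J API; ICU4J code would query this property through UCharacter.hasBinaryProperty with this constant.

```java
/** Hypothetical helper, not part of ICU4J: tests the noncharacter ranges directly. */
final class NoncharacterSketch {
    static boolean isNoncharacter(int codePoint) {
        if (codePoint < 0 || codePoint > 0x10FFFF) {
            return false; // outside the Unicode code space
        }
        // U+FDD0..U+FDEF, or the last two code points of any plane (xxFFFE, xxFFFF).
        return (codePoint >= 0xFDD0 && codePoint <= 0xFDEF)
            || (codePoint & 0xFFFE) == 0xFFFE;
    }

    public static void main(String[] args) {
        System.out.println(isNoncharacter(0xFFFE));   // true
        System.out.println(isNoncharacter(0x10FFFF)); // true
        System.out.println(isNoncharacter(0xFFFD));   // false (REPLACEMENT CHARACTER)
    }
}
```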

NUMERIC_TYPE

public static final int NUMERIC_TYPE
Enumerated property Numeric_Type. Returns UCharacter.NumericType values.
Field Value:
4105

NUMERIC_VALUE

public static final int NUMERIC_VALUE
Double property Numeric_Value. Corresponds to UCharacter.getUnicodeNumericValue(int).
Field Value:
12288

PATTERN_SYNTAX

public static final int PATTERN_SYNTAX
Binary property Pattern_Syntax (new in Unicode 4.1). See UAX #31: Identifier and Pattern Syntax (http://www.unicode.org/reports/tr31/).
Field Value:
42

PATTERN_WHITE_SPACE

public static final int PATTERN_WHITE_SPACE
Binary property Pattern_White_Space (new in Unicode 4.1). See UAX #31: Identifier and Pattern Syntax (http://www.unicode.org/reports/tr31/).
Field Value:
43

POSIX_ALNUM

public static final int POSIX_ALNUM
Binary property alnum (a C/POSIX character class). Implemented according to the UTS #18 Annex C Standard Recommendation. See the UCharacter class documentation.
Field Value:
44

POSIX_BLANK

public static final int POSIX_BLANK
Binary property blank (a C/POSIX character class). Implemented according to the UTS #18 Annex C Standard Recommendation. See the UCharacter class documentation.
Field Value:
45

POSIX_GRAPH

public static final int POSIX_GRAPH
Binary property graph (a C/POSIX character class). Implemented according to the UTS #18 Annex C Standard Recommendation. See the UCharacter class documentation.
Field Value:
46

POSIX_PRINT

public static final int POSIX_PRINT
Binary property print (a C/POSIX character class). Implemented according to the UTS #18 Annex C Standard Recommendation. See the UCharacter class documentation.
Field Value:
47

POSIX_XDIGIT

public static final int POSIX_XDIGIT
Binary property xdigit (a C/POSIX character class). Implemented according to the UTS #18 Annex C Standard Recommendation. See the UCharacter class documentation.
Field Value:
48

QUOTATION_MARK

public static final int QUOTATION_MARK
Binary property Quotation_Mark.
Field Value:
25

RADICAL

public static final int RADICAL
Binary property Radical (new).

For programmatic determination of Ideographic Description Sequences.

Field Value:
26

SCRIPT

public static final int SCRIPT
Enumerated property Script. Same as UScript.getScript(int), returns UScript values.
Field Value:
4106

SEGMENT_STARTER

public static final int SEGMENT_STARTER
Binary property Segment_Starter. ICU-specific property for characters that are starters in terms of Unicode normalization and combining character sequences. They have ccc=0 and do not occur in non-initial position of the canonical decomposition of any character (like the combining diaeresis in NFD(a-umlaut) and a Jamo T in an NFD(Hangul LVT)). ICU uses this property for segmenting a string for generating a set of canonically equivalent strings, e.g. for canonical closure while processing collation tailoring rules.
Field Value:
41

SENTENCE_BREAK

public static final int SENTENCE_BREAK
Enumerated property Sentence_Break (new in Unicode 4.1). Used in UAX #29: Text Boundaries (http://www.unicode.org/reports/tr29/). Returns USentenceBreak values.
Field Value:
4115

SIMPLE_CASE_FOLDING

public static final int SIMPLE_CASE_FOLDING
String property Simple_Case_Folding. Corresponds to UCharacter.foldCase(int, boolean).
Field Value:
16390

SIMPLE_LOWERCASE_MAPPING

public static final int SIMPLE_LOWERCASE_MAPPING
String property Simple_Lowercase_Mapping. Corresponds to UCharacter.toLowerCase(int).
Field Value:
16391

SIMPLE_TITLECASE_MAPPING

public static final int SIMPLE_TITLECASE_MAPPING
String property Simple_Titlecase_Mapping. Corresponds to UCharacter.toTitleCase(int).
Field Value:
16392

SIMPLE_UPPERCASE_MAPPING

public static final int SIMPLE_UPPERCASE_MAPPING
String property Simple_Uppercase_Mapping. Corresponds to UCharacter.toUpperCase(int).
Field Value:
16393

SOFT_DOTTED

public static final int SOFT_DOTTED
Binary property Soft_Dotted (new).

Characters with a "soft dot", like i or j.

An accent placed on these characters causes the dot to disappear.

Field Value:
27

STRING_LIMIT

public static final int STRING_LIMIT
One more than the last constant for string Unicode properties.
Field Value:
16397

STRING_START

public static final int STRING_START
First constant for string Unicode properties.
Field Value:
16384

S_TERM

public static final int S_TERM
Binary property STerm (new in Unicode 4.0.1). Sentence Terminal. Used in UAX #29: Text Boundaries (http://www.unicode.org/reports/tr29/).
Field Value:
35

TERMINAL_PUNCTUATION

public static final int TERMINAL_PUNCTUATION
Binary property Terminal_Punctuation.

Punctuation characters that generally mark the end of textual units.

Field Value:
28

TITLECASE_MAPPING

public static final int TITLECASE_MAPPING
String property Titlecase_Mapping. Corresponds to UCharacter.toTitleCase(String).
Field Value:
16394

TRAIL_CANONICAL_COMBINING_CLASS

public static final int TRAIL_CANONICAL_COMBINING_CLASS
Enumerated property Trail_Canonical_Combining_Class. ICU-specific property for the ccc of the last code point of the decomposition, or tccc(c)=ccc(NFD(c)[last]). Useful for checking for canonically ordered text; see Normalizer.FCD and http://www.unicode.org/notes/tn5/#FCD . Returns 8-bit numeric values like CANONICAL_COMBINING_CLASS.
Field Value:
4113

UNICODE_1_NAME

public static final int UNICODE_1_NAME
String property Unicode_1_Name. Corresponds to UCharacter.getName1_0(int).
Field Value:
16395

UNIFIED_IDEOGRAPH

public static final int UNIFIED_IDEOGRAPH
Binary property Unified_Ideograph (new).

For programmatic determination of Ideographic Description Sequences.

Field Value:
29

UPPERCASE

public static final int UPPERCASE
Binary property Uppercase.

Same as UCharacter.isUUppercase(), different from UCharacter.isUpperCase().

Lu+Other_Uppercase

Field Value:
30

UPPERCASE_MAPPING

public static final int UPPERCASE_MAPPING
String property Uppercase_Mapping. Corresponds to UCharacter.toUpperCase(String).
Field Value:
16396

VARIATION_SELECTOR

public static final int VARIATION_SELECTOR
Binary property Variation_Selector (new in Unicode 4.0.1). Indicates all those characters that qualify as Variation Selectors. For details on the behavior of these characters, see StandardizedVariants.html in the UCD and Section 15.6, Variation Selectors, of the Unicode Standard.
Field Value:
36

WHITE_SPACE

public static final int WHITE_SPACE
Binary property White_Space.

Same as UCharacter.isUWhiteSpace(), different from UCharacter.isSpace() and UCharacter.isWhitespace().

Space characters+TAB+CR+LF-ZWSP-ZWNBSP

Field Value:
31
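
The gap between the White_Space property and a Java-style whitespace test shows up on non-breaking spaces. The sketch below uses only the JDK as a stand-in: java.util.regex's \p{IsWhite_Space} class plays the role of the Unicode property, and Character.isWhitespace(int) plays the role of the Java-style test. The class WhiteSpaceSketch is hypothetical, and the assumption that the UCharacter methods behave like their JDK counterparts is not confirmed by this page.

```java
import java.util.regex.Pattern;

/** Hypothetical sketch contrasting the White_Space property with Character.isWhitespace. */
final class WhiteSpaceSketch {
    private static final Pattern WHITE_SPACE = Pattern.compile("\\p{IsWhite_Space}");

    /** True if the JDK regex engine reports the Unicode White_Space property. */
    static boolean hasWhiteSpaceProperty(int codePoint) {
        String s = new String(Character.toChars(codePoint));
        return WHITE_SPACE.matcher(s).matches();
    }

    public static void main(String[] args) {
        int nbsp = 0x00A0; // NO-BREAK SPACE
        System.out.println(hasWhiteSpaceProperty(nbsp)); // true
        System.out.println(Character.isWhitespace(nbsp)); // false: non-breaking spaces are excluded
    }
}
```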

WORD_BREAK

public static final int WORD_BREAK
Enumerated property Word_Break (new in Unicode 4.1). Used in UAX #29: Text Boundaries (http://www.unicode.org/reports/tr29/). Returns UWordBreakValues values.
Field Value:
4116

XID_CONTINUE

public static final int XID_CONTINUE
Binary property XID_Continue.

ID_Continue modified to allow closure under normalization forms NFKC and NFKD.

Field Value:
32

XID_START

public static final int XID_START
Binary property XID_Start.

ID_Start modified to allow closure under normalization forms NFKC and NFKD.

Field Value:
33

Copyright (c) 2006 IBM Corporation and others.