CrystalSpace

Public API Reference


csUnicodeTransform Class Reference
[Utilities]

Contains functions to convert between several UTF encodings.

#include <csuctransform.h>


Static Public Methods

UTF Decoders
int UTF8Decode (const utf8_char *str, size_t strlen, utf32_char &ch, bool *isValid=0)
 Decode a Unicode character encoded in UTF-8.

int UTF16Decode (const utf16_char *str, size_t strlen, utf32_char &ch, bool *isValid=0)
 Decode a Unicode character encoded in UTF-16.

int UTF32Decode (const utf32_char *str, size_t strlen, utf32_char &ch, bool *isValid=0)
 Decode a Unicode character encoded in UTF-32.

UTF Encoders
int EncodeUTF8 (const utf32_char ch, utf8_char *buf, size_t bufsize)
 Encode a Unicode character to UTF-8.

int EncodeUTF16 (const utf32_char ch, utf16_char *buf, size_t bufsize)
 Encode a Unicode character to UTF-16.

int EncodeUTF32 (const utf32_char ch, utf32_char *buf, size_t bufsize)
 Encode a Unicode character to UTF-32.

Converters between strings in different UTF encodings
size_t UTF8to16 (utf16_char *dest, size_t destSize, const utf8_char *source, size_t srcSize=(size_t)-1)
 Convert UTF-8 to UTF-16.

size_t UTF8to32 (utf32_char *dest, size_t destSize, const utf8_char *source, size_t srcSize=(size_t)-1)
 Convert UTF-8 to UTF-32.

size_t UTF16to8 (utf8_char *dest, size_t destSize, const utf16_char *source, size_t srcSize=(size_t)-1)
 Convert UTF-16 to UTF-8.

size_t UTF16to32 (utf32_char *dest, size_t destSize, const utf16_char *source, size_t srcSize=(size_t)-1)
 Convert UTF-16 to UTF-32.

size_t UTF32to8 (utf8_char *dest, size_t destSize, const utf32_char *source, size_t srcSize=(size_t)-1)
 Convert UTF-32 to UTF-8.

size_t UTF32to16 (utf16_char *dest, size_t destSize, const utf32_char *source, size_t srcSize=(size_t)-1)
 Convert UTF-32 to UTF-16.

Converters between UTF encodings and platform-specific wchar_t
size_t UTF8toWC (wchar_t *dest, size_t destSize, const utf8_char *source, size_t srcSize)
 Convert UTF-8 to platform-specific wide chars.

size_t UTF16toWC (wchar_t *dest, size_t destSize, const utf16_char *source, size_t srcSize)
 Convert UTF-16 to platform-specific wide chars.

size_t UTF32toWC (wchar_t *dest, size_t destSize, const utf32_char *source, size_t srcSize)
 Convert UTF-32 to platform-specific wide chars.

size_t WCtoUTF8 (utf8_char *dest, size_t destSize, const wchar_t *source, size_t srcSize)
 Convert platform-specific wide chars to UTF-8.

size_t WCtoUTF16 (utf16_char *dest, size_t destSize, const wchar_t *source, size_t srcSize)
 Convert platform-specific wide chars to UTF-16.

size_t WCtoUTF32 (utf32_char *dest, size_t destSize, const wchar_t *source, size_t srcSize)
 Convert platform-specific wide chars to UTF-32.

Helpers to skip encoded chars in different UTF encodings
int UTF8Skip (const utf8_char *str, size_t maxSkip)
 Determine how many characters in a UTF-8 buffer need to be skipped to get to the next encoded char.

int UTF8Rewind (const utf8_char *str, size_t maxRew)
 Determine how many characters in a UTF-8 buffer need to be skipped back to get to the start of the previous encoded character.

int UTF16Skip (const utf16_char *str, size_t maxSkip)
 Determine how many characters in a UTF-16 buffer need to be skipped to get to the next encoded char.

int UTF16Rewind (const utf16_char *str, size_t maxRew)
 Determine how many characters in a UTF-16 buffer need to be skipped back to get to the start of the previous encoded character.

int UTF32Skip (const utf32_char *str, size_t maxSkip)
 Determine how many characters in a UTF-32 buffer need to be skipped to get to the next encoded char.

int UTF32Rewind (const utf32_char *str, size_t maxRew)
 Determine how many characters in a UTF-32 buffer need to be skipped back to get to the start of the previous encoded character.


Detailed Description

Contains functions to convert between several UTF encodings.

Definition at line 41 of file csuctransform.h.


Member Function Documentation

int csUnicodeTransform::EncodeUTF16 (const utf32_char ch, utf16_char *buf, size_t bufsize) [inline, static]

Encode a Unicode character to UTF-16.

Parameters:
ch  Character to encode.
buf  Pointer to the buffer receiving the encoded character.
bufsize  Number of chars in the buffer.
Returns:
The number of characters needed to encode ch.
Remarks:
The buffer will be filled up as much as possible. Check the returned value to see whether the encoded character fit into the buffer.

Definition at line 304 of file csuctransform.h.

References CS_UC_CHAR_HIGH_SURROGATE_FIRST, CS_UC_CHAR_LOW_SURROGATE_FIRST, CS_UC_IS_INVALID, CS_UC_IS_SURROGATE, utf16_char, and utf32_char.

Referenced by UTF32to16(), and UTF8to16().
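
The return-value contract described above (fill what fits, report what is needed) can be illustrated with a minimal, self-contained sketch. It does not use the library: SketchEncodeUTF16 and the sketch_utf16/sketch_utf32 typedefs are hypothetical stand-ins, and returning 0 for unencodable values is an assumption of the sketch, not documented behavior of EncodeUTF16().

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Stand-ins for the library's utf16_char and utf32_char (widths assumed).
typedef std::uint16_t sketch_utf16;
typedef std::uint32_t sketch_utf32;

// Hypothetical helper, not the library routine. Writes as many code units
// as fit into buf and returns how many code units ch needs in total.
// Assumption of this sketch: unencodable values (surrogates, values above
// 0x10FFFF) yield 0.
inline int SketchEncodeUTF16 (sketch_utf32 ch, sketch_utf16* buf,
                              std::size_t bufsize)
{
  assert (buf != nullptr || bufsize == 0);
  if (ch > 0x10FFFF || (ch >= 0xD800 && ch <= 0xDFFF)) return 0;
  if (ch < 0x10000)
  {
    // Basic Multilingual Plane: one code unit, stored as-is.
    if (bufsize >= 1) buf[0] = static_cast<sketch_utf16> (ch);
    return 1;
  }
  // Supplementary plane: split the 20 payload bits into a surrogate pair.
  const sketch_utf32 v = ch - 0x10000;
  if (bufsize >= 1) buf[0] = static_cast<sketch_utf16> (0xD800 + (v >> 10));
  if (bufsize >= 2) buf[1] = static_cast<sketch_utf16> (0xDC00 + (v & 0x3FF));
  return 2;
}
```

A caller would compare the returned count with the buffer size it passed in; a larger return value means the output was cut short.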

int csUnicodeTransform::EncodeUTF32 (const utf32_char ch, utf32_char *buf, size_t bufsize) [inline, static]

Encode a Unicode character to UTF-32.

Parameters:
ch  Character to encode.
buf  Pointer to the buffer receiving the encoded character.
bufsize  Number of chars in the buffer.
Returns:
The number of characters needed to encode ch.
Remarks:
The buffer will be filled up as much as possible. Check the returned value to see whether the encoded character fit into the buffer.

Definition at line 331 of file csuctransform.h.

References CS_UC_IS_INVALID, CS_UC_IS_SURROGATE, and utf32_char.

Referenced by UTF16to32(), and UTF8to32().

int csUnicodeTransform::EncodeUTF8 (const utf32_char ch, utf8_char *buf, size_t bufsize) [inline, static]

Encode a Unicode character to UTF-8.

Parameters:
ch  Character to encode.
buf  Pointer to the buffer receiving the encoded character.
bufsize  Number of chars in the buffer.
Returns:
The number of characters needed to encode ch.
Remarks:
The buffer will be filled up as much as possible. Check the returned value to see whether the encoded character fit into the buffer.

Definition at line 250 of file csuctransform.h.

References CS_UC_IS_INVALID, CS_UC_IS_SURROGATE, utf32_char, and utf8_char.

Referenced by UTF16to8(), and UTF32to8().
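
As an illustration of what a UTF-8 encoder with this contract has to do, here is a hedged sketch that works without the library. SketchEncodeUTF8 and the typedefs are hypothetical names; the real EncodeUTF8() may treat invalid input differently (the sketch simply returns 0 for it).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Stand-ins for the library's utf8_char and utf32_char (widths assumed).
typedef std::uint8_t  sketch_utf8;
typedef std::uint32_t sketch_utf32;

// Hypothetical helper, not the library routine. Encodes ch into a scratch
// array, copies as many bytes as fit into buf, and returns the number of
// bytes the complete encoding needs. Unencodable values yield 0 (assumed).
inline int SketchEncodeUTF8 (sketch_utf32 ch, sketch_utf8* buf,
                             std::size_t bufsize)
{
  assert (buf != nullptr || bufsize == 0);
  if (ch > 0x10FFFF || (ch >= 0xD800 && ch <= 0xDFFF)) return 0;

  sketch_utf8 tmp[4];
  int needed;
  if (ch < 0x80)
  {
    tmp[0] = static_cast<sketch_utf8> (ch);
    needed = 1;
  }
  else if (ch < 0x800)
  {
    tmp[0] = static_cast<sketch_utf8> (0xC0 | (ch >> 6));
    tmp[1] = static_cast<sketch_utf8> (0x80 | (ch & 0x3F));
    needed = 2;
  }
  else if (ch < 0x10000)
  {
    tmp[0] = static_cast<sketch_utf8> (0xE0 | (ch >> 12));
    tmp[1] = static_cast<sketch_utf8> (0x80 | ((ch >> 6) & 0x3F));
    tmp[2] = static_cast<sketch_utf8> (0x80 | (ch & 0x3F));
    needed = 3;
  }
  else
  {
    tmp[0] = static_cast<sketch_utf8> (0xF0 | (ch >> 18));
    tmp[1] = static_cast<sketch_utf8> (0x80 | ((ch >> 12) & 0x3F));
    tmp[2] = static_cast<sketch_utf8> (0x80 | ((ch >> 6) & 0x3F));
    tmp[3] = static_cast<sketch_utf8> (0x80 | (ch & 0x3F));
    needed = 4;
  }
  // Fill the buffer as far as possible; the caller compares the result
  // with bufsize to detect a short write.
  for (int i = 0; i < needed && static_cast<std::size_t> (i) < bufsize; i++)
    buf[i] = tmp[i];
  return needed;
}
```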

int csUnicodeTransform::UTF16Decode (const utf16_char *str, size_t strlen, utf32_char &ch, bool *isValid = 0) [inline, static]

Decode a Unicode character encoded in UTF-16.

Parameters:
str  Pointer to the encoded character.
strlen  Number of chars in the string.
ch  Decoded character.
isValid  If an error occurs during decoding, ch contains the replacement character (CS_UC_CHAR_REPLACER) and the bool pointed to by isValid is set to false. The parameter can be 0, but then there is no way to tell whether a decoded replacement character stems from erroneous source data.
Returns:
The number of characters in str that have to be skipped to retrieve the next encoded character.

Definition at line 164 of file csuctransform.h.

References CS_UC_IS_HIGH_SURROGATE, CS_UC_IS_INVALID, CS_UC_IS_LOW_SURROGATE, CS_UC_IS_SURROGATE, utf16_char, and utf32_char.

Referenced by UTF16to32(), and UTF16to8().
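
The interplay of ch, isValid and the return value can be sketched as follows. This is a stand-alone illustration, not the library code: SketchUTF16Decode, the typedefs and kSketchReplacer are hypothetical, the replacement value U+FFFD is an assumption about CS_UC_CHAR_REPLACER, and setting *isValid to true on success is a choice of the sketch (the reference above only documents the false case).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Stand-ins for the library's utf16_char and utf32_char (widths assumed).
typedef std::uint16_t sketch_utf16;
typedef std::uint32_t sketch_utf32;

// Assumed replacement character; the library's CS_UC_CHAR_REPLACER may differ.
const sketch_utf32 kSketchReplacer = 0xFFFD;

// Hypothetical helper, not the library routine. Decodes one character from
// str and returns the number of code units consumed.
inline int SketchUTF16Decode (const sketch_utf16* str, std::size_t len,
                              sketch_utf32& ch, bool* isValid = nullptr)
{
  assert (str != nullptr || len == 0);
  if (isValid) *isValid = true;   // sketch choice, see lead-in
  if (len == 0)
  {
    ch = kSketchReplacer;
    if (isValid) *isValid = false;
    return 0;
  }
  const sketch_utf16 first = str[0];
  if (first < 0xD800 || first > 0xDFFF)
  {
    ch = first;                   // plain BMP character
    return 1;
  }
  // A high surrogate must be followed by a low surrogate.
  if (first <= 0xDBFF && len >= 2 && str[1] >= 0xDC00 && str[1] <= 0xDFFF)
  {
    ch = 0x10000
      + ((static_cast<sketch_utf32> (first) - 0xD800) << 10)
      + (static_cast<sketch_utf32> (str[1]) - 0xDC00);
    return 2;
  }
  // Unpaired surrogate: report the error and skip one code unit.
  ch = kSketchReplacer;
  if (isValid) *isValid = false;
  return 1;
}
```

Passing a non-null isValid is the only way for a caller to distinguish a genuine U+FFFD in the input from a decoding error.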

int csUnicodeTransform::UTF16Rewind (const utf16_char *str, size_t maxRew) [inline, static]

Determine how many characters in a UTF-16 buffer need to be skipped back to get to the start of the previous encoded character.

Parameters:
str  Pointer to buffer with encoded character.
maxRew  The number of characters to go back at max. Typically, this is the number of chars from str to the start of the buffer.
Returns:
Number of chars to skip back in the buffer. Returns 0 if maxRew is 0.

Definition at line 761 of file csuctransform.h.

References CS_UC_IS_HIGH_SURROGATE, CS_UC_IS_SURROGATE, and utf16_char.

int csUnicodeTransform::UTF16Skip (const utf16_char *str, size_t maxSkip) [inline, static]

Determine how many characters in a UTF-16 buffer need to be skipped to get to the next encoded char.

Parameters:
str  Pointer to buffer with encoded character.
maxSkip  The number of characters to skip at max. Usually, this is the number of chars from str to the end of the buffer.
Returns:
Number of chars to skip in the buffer. Returns 0 if maxSkip is 0.

Definition at line 748 of file csuctransform.h.

References CS_UC_IS_HIGH_SURROGATE, and utf16_char.

size_t csUnicodeTransform::UTF16to32 (utf32_char *dest, size_t destSize, const utf16_char *source, size_t srcSize = (size_t)-1) [inline, static]

Convert UTF-16 to UTF-32.

Parameters:
dest  Destination buffer.
destSize  Number of characters the destination buffer can hold.
source  Source buffer.
srcSize  Number of characters contained in the source buffer. If this is -1, the length will be determined automatically.
Returns:
Number of characters in the complete converted string, including null terminator.
Remarks:
If the complete converted string does not fit into the destination buffer, it is truncated but still null-terminated. Hence, if the destination buffer has a size of 1, you get an empty string. The returned value is the number of characters needed for the *whole* converted string.

Definition at line 432 of file csuctransform.h.

References EncodeUTF32(), utf16_char, UTF16Decode(), UTF16to32(), and utf32_char.

Referenced by UTF16to32(), and WCtoUTF32().
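
Because the string converters return the size of the whole converted string even when the output is truncated, a caller can try a small buffer first and grow it only when needed. The sketch below shows that pattern against a stand-in converter that follows the documented contract. SketchUTF16to32, SketchConvert and the typedefs are hypothetical, and the use of U+FFFD for unpaired surrogates is an assumption of the sketch, not a statement about UTF16to32().

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Stand-ins for the library's utf16_char and utf32_char (widths assumed).
typedef std::uint16_t sketch_utf16;
typedef std::uint32_t sketch_utf32;

// Stand-in with the documented contract: truncates if necessary, always
// null-terminates (when destSize > 0), and returns the size of the complete
// result including the terminator.
inline std::size_t SketchUTF16to32 (sketch_utf32* dest, std::size_t destSize,
                                    const sketch_utf16* source,
                                    std::size_t srcSize = (std::size_t)-1)
{
  if (srcSize == (std::size_t)-1)
  {
    // Determine the length from the null terminator.
    srcSize = 0;
    while (source[srcSize] != 0) srcSize++;
  }
  std::size_t needed = 0;
  for (std::size_t i = 0; i < srcSize; )
  {
    sketch_utf32 ch = source[i++];
    if (ch >= 0xD800 && ch <= 0xDBFF && i < srcSize
        && source[i] >= 0xDC00 && source[i] <= 0xDFFF)
      ch = 0x10000 + ((ch - 0xD800) << 10) + (source[i++] - 0xDC00);
    else if (ch >= 0xD800 && ch <= 0xDFFF)
      ch = 0xFFFD;                       // unpaired surrogate (assumed policy)
    // Keep the last slot free for the terminator.
    if (needed + 1 < destSize) dest[needed] = ch;
    needed++;
  }
  if (destSize > 0) dest[needed < destSize ? needed : destSize - 1] = 0;
  return needed + 1;
}

// Calling pattern: convert into a small buffer, then retry with the exact
// size if the first attempt was truncated.
inline std::vector<sketch_utf32> SketchConvert (const sketch_utf16* source)
{
  std::vector<sketch_utf32> buf (4);
  const std::size_t needed = SketchUTF16to32 (buf.data (), buf.size (), source);
  const bool truncated = needed > buf.size ();
  buf.resize (needed);
  if (truncated) SketchUTF16to32 (buf.data (), buf.size (), source);
  return buf;
}
```

The retry relies only on the documented return value, so the same pattern should carry over to the other converters in this class.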

size_t csUnicodeTransform::UTF16to8 (utf8_char *dest, size_t destSize, const utf16_char *source, size_t srcSize = (size_t)-1) [inline, static]

Convert UTF-16 to UTF-8.

Parameters:
dest  Destination buffer.
destSize  Number of characters the destination buffer can hold.
source  Source buffer.
srcSize  Number of characters contained in the source buffer. If this is -1, the length will be determined automatically.
Returns:
Number of characters in the complete converted string, including null terminator.
Remarks:
If the complete converted string does not fit into the destination buffer, it is truncated but still null-terminated. Hence, if the destination buffer has a size of 1, you get an empty string. The returned value is the number of characters needed for the *whole* converted string.

Definition at line 427 of file csuctransform.h.

References EncodeUTF8(), utf16_char, UTF16Decode(), UTF16to8(), and utf8_char.

Referenced by UTF16to8(), and WCtoUTF8().

size_t csUnicodeTransform::UTF16toWC (wchar_t *dest, size_t destSize, const utf16_char *source, size_t srcSize) [inline, static]

Convert UTF-16 to platform-specific wide chars.

Parameters:
dest  Destination buffer.
destSize  Number of characters the destination buffer can hold.
source  Source buffer.
srcSize  Number of characters contained in the source buffer. If this is -1, the length will be determined automatically.
Returns:
Number of characters in the complete converted string, including null terminator.
Remarks:
If the complete converted string does not fit into the destination buffer, it is truncated but still null-terminated. Hence, if the destination buffer has a size of 1, you get an empty string. The returned value is the number of characters needed for the *whole* converted string.

Definition at line 531 of file csuctransform.h.

References utf16_char.

int csUnicodeTransform::UTF32Decode (const utf32_char *str, size_t strlen, utf32_char &ch, bool *isValid = 0) [inline, static]

Decode a Unicode character encoded in UTF-32.

Parameters:
str  Pointer to the encoded character.
strlen  Number of chars in the string.
ch  Decoded character.
isValid  If an error occurs during decoding, ch contains the replacement character (CS_UC_CHAR_REPLACER) and the bool pointed to by isValid is set to false. The parameter can be 0, but then there is no way to tell whether a decoded replacement character stems from erroneous source data.
Returns:
The number of characters in str that have to be skipped to retrieve the next encoded character.

Definition at line 209 of file csuctransform.h.

References CS_UC_IS_INVALID, and utf32_char.

Referenced by UTF32to16(), and UTF32to8().

int csUnicodeTransform::UTF32Rewind (const utf32_char *str, size_t maxRew) [inline, static]

Determine how many characters in a UTF-32 buffer need to be skipped back to get to the start of the previous encoded character.

Parameters:
str  Pointer to buffer with encoded character.
maxRew  The number of characters to go back at max. Typically, this is the number of chars from str to the start of the buffer.
Returns:
Number of chars to skip back in the buffer. Returns 0 if maxRew is 0.

Definition at line 792 of file csuctransform.h.

References utf32_char.

int csUnicodeTransform::UTF32Skip (const utf32_char *str, size_t maxSkip) [inline, static]

Determine how many characters in a UTF-32 buffer need to be skipped to get to the next encoded char.

Parameters:
str  Pointer to buffer with encoded character.
maxSkip  The number of characters to skip at max. Usually, this is the number of chars from str to the end of the buffer.
Returns:
Number of chars to skip in the buffer. Returns 0 if maxSkip is 0.

Definition at line 782 of file csuctransform.h.

References utf32_char.

size_t csUnicodeTransform::UTF32to16 (utf16_char *dest, size_t destSize, const utf32_char *source, size_t srcSize = (size_t)-1) [inline, static]

Convert UTF-32 to UTF-16.

Parameters:
dest  Destination buffer.
destSize  Number of characters the destination buffer can hold.
source  Source buffer.
srcSize  Number of characters contained in the source buffer. If this is -1, the length will be determined automatically.
Returns:
Number of characters in the complete converted string, including null terminator.
Remarks:
If the complete converted string does not fit into the destination buffer, it is truncated but still null-terminated. Hence, if the destination buffer has a size of 1, you get an empty string. The returned value is the number of characters needed for the *whole* converted string.

Definition at line 443 of file csuctransform.h.

References EncodeUTF16(), utf16_char, utf32_char, UTF32Decode(), UTF32to16(), and utf8_char.

Referenced by UTF32to16(), and UTF32toWC().

size_t csUnicodeTransform::UTF32to8 (utf8_char *dest, size_t destSize, const utf32_char *source, size_t srcSize = (size_t)-1) [inline, static]

Convert UTF-32 to UTF-8.

Parameters:
dest  Destination buffer.
destSize  Number of characters the destination buffer can hold.
source  Source buffer.
srcSize  Number of characters contained in the source buffer. If this is -1, the length will be determined automatically.
Returns:
Number of characters in the complete converted string, including null terminator.
Remarks:
If the complete converted string does not fit into the destination buffer, it is truncated but still null-terminated. Hence, if the destination buffer has a size of 1, you get an empty string. The returned value is the number of characters needed for the *whole* converted string.

Definition at line 438 of file csuctransform.h.

References EncodeUTF8(), utf32_char, UTF32Decode(), UTF32to8(), and utf8_char.

Referenced by UTF32to8().

size_t csUnicodeTransform::UTF32toWC (wchar_t *dest, size_t destSize, const utf32_char *source, size_t srcSize) [inline, static]

Convert UTF-32 to platform-specific wide chars.

Parameters:
dest  Destination buffer.
destSize  Number of characters the destination buffer can hold.
source  Source buffer.
srcSize  Number of characters contained in the source buffer. If this is -1, the length will be determined automatically.
Returns:
Number of characters in the complete converted string, including null terminator.
Remarks:
If the complete converted string does not fit into the destination buffer, it is truncated but still null-terminated. Hence, if the destination buffer has a size of 1, you get an empty string. The returned value is the number of characters needed for the *whole* converted string.

Definition at line 554 of file csuctransform.h.

References utf16_char, utf32_char, and UTF32to16().

int csUnicodeTransform::UTF8Decode (const utf8_char *str, size_t strlen, utf32_char &ch, bool *isValid = 0) [inline, static]

Decode a Unicode character encoded in UTF-8.

Parameters:
str  Pointer to the encoded character.
strlen  Number of chars in the string.
ch  Decoded character.
isValid  If an error occurs during decoding, ch contains the replacement character (CS_UC_CHAR_REPLACER) and the bool pointed to by isValid is set to false. The parameter can be 0, but then there is no way to tell whether a decoded replacement character stems from erroneous source data.
Returns:
The number of characters in str that have to be skipped to retrieve the next encoded character.

Definition at line 82 of file csuctransform.h.

References CS_UC_IS_INVALID, CS_UC_IS_SURROGATE, utf32_char, and utf8_char.

Referenced by UTF8to16(), and UTF8to32().
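
For UTF-8, decoding errors can take several forms (truncated sequences, stray continuation bytes, overlong encodings). The following stand-alone sketch shows one way a decoder with the interface above might report them. SketchUTF8Decode, the typedefs and kSketchReplacer are hypothetical, U+FFFD is an assumed value for CS_UC_CHAR_REPLACER, and the exact error policy of the library's UTF8Decode() is not confirmed by this reference.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Stand-ins for the library's utf8_char and utf32_char (widths assumed).
typedef std::uint8_t  sketch_utf8;
typedef std::uint32_t sketch_utf32;

// Assumed replacement character; the library's CS_UC_CHAR_REPLACER may differ.
const sketch_utf32 kSketchReplacer = 0xFFFD;

// Hypothetical helper, not the library routine. Decodes one character and
// returns the number of bytes consumed. On error, ch is the replacement
// character and *isValid (if given) is false.
inline int SketchUTF8Decode (const sketch_utf8* str, std::size_t len,
                             sketch_utf32& ch, bool* isValid = nullptr)
{
  assert (str != nullptr || len == 0);
  if (isValid) *isValid = true;   // sketch choice: also report success
  ch = kSketchReplacer;
  if (len == 0)
  {
    if (isValid) *isValid = false;
    return 0;
  }
  const sketch_utf8 lead = str[0];
  int total;                      // bytes in the sequence per the lead byte
  sketch_utf32 value, minValue;   // minValue detects overlong encodings
  if (lead < 0x80) { ch = lead; return 1; }
  else if ((lead & 0xE0) == 0xC0) { total = 2; value = lead & 0x1F; minValue = 0x80; }
  else if ((lead & 0xF0) == 0xE0) { total = 3; value = lead & 0x0F; minValue = 0x800; }
  else if ((lead & 0xF8) == 0xF0) { total = 4; value = lead & 0x07; minValue = 0x10000; }
  else
  {
    // Stray continuation byte or invalid lead byte.
    if (isValid) *isValid = false;
    return 1;
  }
  int used = 1;
  bool ok = true;
  while (used < total)
  {
    if (static_cast<std::size_t> (used) >= len || (str[used] & 0xC0) != 0x80)
    {
      ok = false;                 // truncated or malformed sequence
      break;
    }
    value = (value << 6) | (str[used] & 0x3F);
    used++;
  }
  if (ok && (value < minValue || value > 0x10FFFF
      || (value >= 0xD800 && value <= 0xDFFF)))
    ok = false;                   // overlong, out of range, or surrogate
  if (ok) ch = value;
  else if (isValid) *isValid = false;
  return used;
}
```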

int csUnicodeTransform::UTF8Rewind (const utf8_char *str, size_t maxRew) [inline, static]

Determine how many characters in a UTF-8 buffer need to be skipped back to get to the start of the previous encoded character.

Parameters:
str  Pointer to buffer with encoded character.
maxRew  The number of characters to go back at max. Typically, this is the number of chars from str to the start of the buffer.
Returns:
Number of chars to skip back in the buffer. Returns 0 if maxRew is 0.

Definition at line 721 of file csuctransform.h.

References utf8_char.
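
Stepping backwards in UTF-8 amounts to walking over continuation bytes (bit pattern 10xxxxxx) until a lead byte is reached. A minimal sketch of that idea, assuming well-formed input, is shown here; SketchUTF8Rewind and sketch_utf8 are hypothetical names and the code is not taken from the library.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Stand-in for the library's utf8_char (width assumed).
typedef std::uint8_t sketch_utf8;

// Hypothetical helper, not the library routine. str points just past the
// character to step over; maxRew is the distance from str back to the start
// of the buffer. Returns how many bytes to go back, 0 if maxRew is 0.
inline int SketchUTF8Rewind (const sketch_utf8* str, std::size_t maxRew)
{
  assert (str != nullptr);
  if (maxRew == 0) return 0;
  int n = 1;
  // Keep stepping back while the byte at str - n is a continuation byte.
  while (static_cast<std::size_t> (n) < maxRew && (*(str - n) & 0xC0) == 0x80)
    n++;
  return n;
}
```

With a buffer `buf` and a cursor `p`, a caller would typically write `p -= SketchUTF8Rewind (p, p - buf);`.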

int csUnicodeTransform::UTF8Skip (const utf8_char *str, size_t maxSkip) [inline, static]

Determine how many characters in a UTF-8 buffer need to be skipped to get to the next encoded char.

Parameters:
str  Pointer to buffer with encoded character.
maxSkip  The number of characters to skip at max. Usually, this is the number of chars from str to the end of the buffer.
Returns:
Number of chars to skip in the buffer. Returns 0 if maxSkip is 0.

Definition at line 681 of file csuctransform.h.

References utf8_char.
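
The forward direction can be sketched the same way: skip the current byte, then any continuation bytes that follow, without running past maxSkip. SketchUTF8Skip and sketch_utf8 are hypothetical names for this illustration; how the library's UTF8Skip() handles malformed input is not specified here.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Stand-in for the library's utf8_char (width assumed).
typedef std::uint8_t sketch_utf8;

// Hypothetical helper, not the library routine. str points at the start of
// an encoded character; maxSkip is the distance from str to the end of the
// buffer. Returns how many bytes to advance, 0 if maxSkip is 0.
inline int SketchUTF8Skip (const sketch_utf8* str, std::size_t maxSkip)
{
  assert (str != nullptr || maxSkip == 0);
  if (maxSkip == 0) return 0;
  int n = 1;
  // Continuation bytes have the form 10xxxxxx; they belong to the
  // character that started at str.
  while (static_cast<std::size_t> (n) < maxSkip && (str[n] & 0xC0) == 0x80)
    n++;
  return n;
}
```

Counting the characters in a buffer then reduces to repeatedly advancing by the returned value until the end of the buffer is reached.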

size_t csUnicodeTransform::UTF8to16 (utf16_char *dest, size_t destSize, const utf8_char *source, size_t srcSize = (size_t)-1) [inline, static]

Convert UTF-8 to UTF-16.

Parameters:
dest  Destination buffer.
destSize  Number of characters the destination buffer can hold.
source  Source buffer.
srcSize  Number of characters contained in the source buffer. If this is -1, the length will be determined automatically.
Returns:
Number of characters in the complete converted string, including null terminator.
Remarks:
If the complete converted string does not fit into the destination buffer, it is truncated but still null-terminated. Hence, if the destination buffer has a size of 1, you get an empty string. The returned value is the number of characters needed for the *whole* converted string.

Definition at line 416 of file csuctransform.h.

References EncodeUTF16(), utf16_char, utf8_char, UTF8Decode(), and UTF8to16().

Referenced by UTF8to16(), and UTF8toWC().

size_t csUnicodeTransform::UTF8to32 (utf32_char *dest, size_t destSize, const utf8_char *source, size_t srcSize = (size_t)-1) [inline, static]

Convert UTF-8 to UTF-32.

Parameters:
dest  Destination buffer.
destSize  Number of characters the destination buffer can hold.
source  Source buffer.
srcSize  Number of characters contained in the source buffer. If this is -1, the length will be determined automatically.
Returns:
Number of characters in the complete converted string, including null terminator.
Remarks:
If the complete converted string does not fit into the destination buffer, it is truncated but still null-terminated. Hence, if the destination buffer has a size of 1, you get an empty string. The returned value is the number of characters needed for the *whole* converted string.

Definition at line 421 of file csuctransform.h.

References EncodeUTF32(), utf32_char, utf8_char, UTF8Decode(), and UTF8to32().

Referenced by UTF8to32().

size_t csUnicodeTransform::UTF8toWC (wchar_t *dest, size_t destSize, const utf8_char *source, size_t srcSize) [inline, static]

Convert UTF-8 to platform-specific wide chars.

Parameters:
dest  Destination buffer.
destSize  Number of characters the destination buffer can hold.
source  Source buffer.
srcSize  Number of characters contained in the source buffer. If this is -1, the length will be determined automatically.
Returns:
Number of characters in the complete converted string, including null terminator.
Remarks:
If the complete converted string does not fit into the destination buffer, it is truncated but still null-terminated. Hence, if the destination buffer has a size of 1, you get an empty string. The returned value is the number of characters needed for the *whole* converted string.

Definition at line 521 of file csuctransform.h.

References utf16_char, utf8_char, and UTF8to16().

size_t csUnicodeTransform::WCtoUTF16 (utf16_char *dest, size_t destSize, const wchar_t *source, size_t srcSize) [inline, static]

Convert platform-specific wide chars to UTF-16.

Parameters:
dest  Destination buffer.
destSize  Number of characters the destination buffer can hold.
source  Source buffer.
srcSize  Number of characters contained in the source buffer. If this is -1, the length will be determined automatically.
Returns:
Number of characters in the complete converted string, including null terminator.
Remarks:
If the complete converted string does not fit into the destination buffer, it is truncated but still null-terminated. Hence, if the destination buffer has a size of 1, you get an empty string. The returned value is the number of characters needed for the *whole* converted string.

Definition at line 574 of file csuctransform.h.

References utf16_char.

size_t csUnicodeTransform::WCtoUTF32 (utf32_char *dest, size_t destSize, const wchar_t *source, size_t srcSize) [inline, static]

Convert platform-specific wide chars to UTF-32.

Parameters:
dest  Destination buffer.
destSize  Number of characters the destination buffer can hold.
source  Source buffer.
srcSize  Number of characters contained in the source buffer. If this is -1, the length will be determined automatically.
Returns:
Number of characters in the complete converted string, including null terminator.
Remarks:
If the complete converted string does not fit into the destination buffer, it is truncated but still null-terminated. Hence, if the destination buffer has a size of 1, you get an empty string. The returned value is the number of characters needed for the *whole* converted string.

Definition at line 597 of file csuctransform.h.

References utf16_char, UTF16to32(), and utf32_char.

size_t csUnicodeTransform::WCtoUTF8 (utf8_char *dest, size_t destSize, const wchar_t *source, size_t srcSize) [inline, static]

Convert platform-specific wide chars to UTF-8.

Parameters:
dest  Destination buffer.
destSize  Number of characters the destination buffer can hold.
source  Source buffer.
srcSize  Number of characters contained in the source buffer. If this is -1, the length will be determined automatically.
Returns:
Number of characters in the complete converted string, including null terminator.
Remarks:
If the complete converted string does not fit into the destination buffer, it is truncated but still null-terminated. Hence, if the destination buffer has a size of 1, you get an empty string. The returned value is the number of characters needed for the *whole* converted string.

Definition at line 564 of file csuctransform.h.

References utf16_char, UTF16to8(), and utf8_char.


The documentation for this class was generated from the following file:
csuctransform.h
Generated for Crystal Space by doxygen 1.2.18