com.ibm.icu.text
Class UnicodeDecompressor
- com.ibm.icu.text.SCSU
public final class UnicodeDecompressor
implements com.ibm.icu.text.SCSU
A decompression engine implementing the Standard Compression Scheme
for Unicode (SCSU) as outlined in
Unicode Technical
Report #6.
USAGE
The static methods on
UnicodeDecompressor may be used in a
straightforward manner to decompress simple strings:
byte [] compressed = ... ; // get compressed bytes from somewhere
String result = UnicodeDecompressor.decompress(compressed);
The static methods have a fairly large memory footprint.
For finer-grained control over memory usage,
UnicodeDecompressor offers more powerful APIs allowing
iterative decompression:
// Decompress an array "bytes" of length "len" using a buffer of 512 chars
// to the Writer "out"
UnicodeDecompressor myDecompressor = new UnicodeDecompressor();
final int BUFSIZE = 512;
char [] charBuffer = new char [ BUFSIZE ];
int charsWritten = 0;
int [] bytesRead = new int [1];
int totalBytesDecompressed = 0;
int totalCharsWritten = 0;
do {
// do the decompression
charsWritten = myDecompressor.decompress(bytes, totalBytesDecompressed,
len, bytesRead,
charBuffer, 0, BUFSIZE);
// do something with the current set of chars
out.write(charBuffer, 0, charsWritten);
// update the no. of bytes decompressed
totalBytesDecompressed += bytesRead[0];
// update the no. of chars written
totalCharsWritten += charsWritten;
} while (totalBytesDecompressed < len);
myDecompressor.reset(); // reuse decompressor
Decompression is performed according to the standard set forth in
Unicode Technical
Report #6.
Fields inherited from interface com.ibm.icu.text.SCSU:
ARMENIANINDEX, COMPRESSIONOFFSET, GREEKINDEX, HALFWIDTHKATAKANAINDEX, HIRAGANAINDEX, INVALIDCHAR, INVALIDWINDOW, IPAEXTENSIONINDEX, KATAKANAINDEX, LATININDEX, MAXINDEX, NUMSTATICWINDOWS, NUMWINDOWS, RESERVEDINDEX, SCHANGE0, SCHANGE1, SCHANGE2, SCHANGE3, SCHANGE4, SCHANGE5, SCHANGE6, SCHANGE7, SCHANGEU, SDEFINE0, SDEFINE1, SDEFINE2, SDEFINE3, SDEFINE4, SDEFINE5, SDEFINE6, SDEFINE7, SDEFINEX, SINGLEBYTEMODE, SQUOTE0, SQUOTE1, SQUOTE2, SQUOTE3, SQUOTE4, SQUOTE5, SQUOTE6, SQUOTE7, SQUOTEU, SRESERVED, UCHANGE0, UCHANGE1, UCHANGE2, UCHANGE3, UCHANGE4, UCHANGE5, UCHANGE6, UCHANGE7, UDEFINE0, UDEFINE1, UDEFINE2, UDEFINE3, UDEFINE4, UDEFINE5, UDEFINE6, UDEFINE7, UDEFINEX, UNICODEMODE, UQUOTEU, URESERVED, sOffsetTable, sOffsets
static String | decompress(byte[] buffer) - Decompress a byte array into a String.
static char[] | decompress(byte[] buffer, int start, int limit) - Decompress a byte array into a Unicode character array.
int | decompress(byte[] byteBuffer, int byteBufferStart, int byteBufferLimit, int[] bytesRead, char[] charBuffer, int charBufferStart, int charBufferLimit) - Decompress a byte array into a Unicode character array.
void | reset() - Reset the decompressor to its initial state.
UnicodeDecompressor
public UnicodeDecompressor()
Create a UnicodeDecompressor.
Sets all windows to their default values.
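The "default values" mentioned above refer to the initial window positions defined by the SCSU scheme. The following is a minimal sketch of those defaults, assuming the dynamic window offsets tabulated in Unicode Technical Report #6; the class name ScsuDefaultWindows and the method mapThroughWindow are hypothetical and are not part of the ICU API.

```java
/** Illustrative only: not ICU code. Offsets follow the defaults listed in UTR #6. */
final class ScsuDefaultWindows {
    // Initial offsets of dynamic windows 0..7.
    static final int[] DYNAMIC_OFFSETS = {
        0x0080, // window 0: Latin-1 Supplement (active at start)
        0x00C0, // window 1: Latin-1 letters plus part of Latin Extended-A
        0x0400, // window 2: Cyrillic
        0x0600, // window 3: Arabic
        0x0900, // window 4: Devanagari
        0x3040, // window 5: Hiragana
        0x30A0, // window 6: Katakana
        0xFF00  // window 7: Fullwidth ASCII
    };

    /** Maps a single-byte-mode value in 0x80..0xFF through the given dynamic window. */
    static char mapThroughWindow(int window, int b) {
        if (b < 0x80 || b > 0xFF) {
            throw new IllegalArgumentException("expected a value in 0x80..0xFF");
        }
        return (char) (DYNAMIC_OFFSETS[window] + (b - 0x80));
    }

    public static void main(String[] args) {
        System.out.printf("U+%04X%n", (int) mapThroughWindow(0, 0xE9)); // prints U+00E9
        System.out.printf("U+%04X%n", (int) mapThroughWindow(2, 0x90)); // prints U+0410
    }
}
```

A freshly constructed decompressor starts in single-byte mode with window 0 active, which is why plain Latin-1 text needs no tag bytes at all.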
decompress
public static String decompress(byte[] buffer)
Decompress a byte array into a String.
Parameters:
buffer - The byte array to decompress.
Returns:
A String containing the decompressed characters.
See Also:
decompress(byte[], int, int)
decompress
public static char[] decompress(byte[] buffer,
int start,
int limit)
Decompress a byte array into a Unicode character array.
Parameters:
buffer - The byte array to decompress.
start - The start of the byte run to decompress.
limit - The limit of the byte run to decompress.
Returns:
A character array containing the decompressed characters.
decompress
public int decompress(byte[] byteBuffer,
int byteBufferStart,
int byteBufferLimit,
int[] bytesRead,
char[] charBuffer,
int charBufferStart,
int charBufferLimit)
Decompress a byte array into a Unicode character array.
This function will either completely fill the output buffer,
or consume the entire input.
Parameters:
byteBuffer - The byte buffer to decompress.
byteBufferStart - The start of the byte run to decompress.
byteBufferLimit - The limit of the byte run to decompress.
bytesRead - A one-element array. If not null, on return it holds the number of bytes read from byteBuffer.
charBuffer - A buffer to receive the decompressed data. This buffer must be at minimum two characters in size.
charBufferStart - The starting offset to which to write decompressed data.
charBufferLimit - The limiting offset for writing decompressed data.
Returns:
The number of Unicode characters written to charBuffer.
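The fill-or-consume contract described for this method is what makes a simple driving loop terminate. The sketch below illustrates that loop shape with a hypothetical stand-in, decodeChunk, which has the same parameter layout but only copies ASCII bytes; it is not an SCSU decoder and not ICU code. The names ChunkedDecodeSketch and decodeAll are also assumptions made for illustration.

```java
import java.nio.charset.StandardCharsets;

final class ChunkedDecodeSketch {
    /**
     * Hypothetical stand-in with the same parameter shape as the
     * seven-argument decompress method. It copies ASCII bytes only.
     */
    static int decodeChunk(byte[] in, int inStart, int inLimit, int[] bytesRead,
                           char[] out, int outStart, int outLimit) {
        int i = inStart;
        int o = outStart;
        // Stop when the output range is full or the input run is exhausted.
        while (i < inLimit && o < outLimit) {
            out[o++] = (char) (in[i++] & 0x7F);
        }
        if (bytesRead != null) {
            bytesRead[0] = i - inStart; // bytes consumed by this call
        }
        return o - outStart;            // chars written by this call
    }

    /** Drives decodeChunk with a small reusable buffer until all input is consumed. */
    static String decodeAll(byte[] in, int bufSize) {
        if (bufSize < 2) {
            throw new IllegalArgumentException("buffer must hold at least two chars");
        }
        char[] buf = new char[bufSize];
        int[] bytesRead = new int[1];
        StringBuilder sb = new StringBuilder();
        int consumed = 0;
        while (consumed < in.length) {
            int written = decodeChunk(in, consumed, in.length, bytesRead, buf, 0, buf.length);
            sb.append(buf, 0, written);
            consumed += bytesRead[0]; // advance by what the call reports, not by bufSize
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        byte[] input = "hello, world".getBytes(StandardCharsets.US_ASCII);
        System.out.println(decodeAll(input, 4)); // prints hello, world
    }
}
```

The caller advances its input offset by bytesRead[0] rather than by the buffer size, because the number of input bytes per output character is not fixed in a compressed stream.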
reset
public void reset()
Reset the decompressor to its initial state.
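To see why a reset matters when an instance is reused, consider a toy decoder that tracks only the active dynamic window. This is a sketch under stated assumptions: it handles just the pass-through bytes and the SC0..SC7 window-change tags (0x10..0x17) of single-byte mode as described in Unicode Technical Report #6, and the class WindowStateSketch is hypothetical, not ICU code.

```java
final class WindowStateSketch {
    // Default dynamic window offsets from UTR #6.
    private static final int[] DEFAULT_OFFSETS = {
        0x0080, 0x00C0, 0x0400, 0x0600, 0x0900, 0x3040, 0x30A0, 0xFF00
    };

    private int activeWindow = 0; // state that survives between decode calls

    String decode(byte[] in) {
        StringBuilder sb = new StringBuilder();
        for (byte raw : in) {
            int b = raw & 0xFF;
            if (b >= 0x10 && b <= 0x17) {
                activeWindow = b - 0x10;          // SCn: select dynamic window n
            } else if (b >= 0x80) {
                sb.append((char) (DEFAULT_OFFSETS[activeWindow] + (b - 0x80)));
            } else if (b >= 0x20 || b == 0x00 || b == 0x09 || b == 0x0A || b == 0x0D) {
                sb.append((char) b);              // ASCII and basic controls pass through
            } else {
                throw new IllegalArgumentException(
                    "tag not handled by this sketch: 0x" + Integer.toHexString(b));
            }
        }
        return sb.toString();
    }

    /** Returns to the initial state: window 0 active. */
    void reset() {
        activeWindow = 0;
    }

    public static void main(String[] args) {
        WindowStateSketch d = new WindowStateSketch();
        // SC2 selects the Cyrillic window, so 0x90 decodes to U+0410.
        System.out.printf("U+%04X%n", (int) d.decode(new byte[] {0x12, (byte) 0x90}).charAt(0));
        // Without a reset, the Cyrillic window is still active for the next input.
        System.out.printf("U+%04X%n", (int) d.decode(new byte[] {(byte) 0x90}).charAt(0));
        d.reset();
        // After the reset, 0x90 maps through window 0 again.
        System.out.printf("U+%04X%n", (int) d.decode(new byte[] {(byte) 0x90}).charAt(0));
    }
}
```

Without the reset, state left over from one stream would change how the first bytes of an unrelated stream are interpreted.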
Copyright (c) 2006 IBM Corporation and others.