public abstract class UTF8Convert extends Object
The difference between utf8 and pseudo-utf8 is the special treatment of the null character (U+0000). In utf8, it is encoded directly as a single zero byte, whereas in pseudo-utf8 it is encoded as the two-byte sequence 0xC0 0x80. See the JVM specification for more information.
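The difference can be observed with JDK classes alone. The sketch below does not use UTF8Convert; it assumes that pseudo-utf8 treats the null character the same way as the modified UTF-8 written by java.io.DataOutputStream.writeUTF. The class and method names are illustrative only.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

final class NullEncodingDemo {
    /** Standard UTF-8 bytes for the given string. */
    static byte[] standardUtf8(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }

    /** Modified UTF-8 bytes as written by writeUTF, without its 2-byte length prefix. */
    static byte[] modifiedUtf8(String s) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (DataOutputStream data = new DataOutputStream(out)) {
            data.writeUTF(s);
        } catch (IOException e) {
            // Not expected for an in-memory stream; rethrown unchecked for brevity.
            throw new UncheckedIOException(e);
        }
        byte[] withLength = out.toByteArray();
        return Arrays.copyOfRange(withLength, 2, withLength.length);
    }

    public static void main(String[] args) {
        String s = "a\u0000b";
        System.out.println(Arrays.toString(standardUtf8(s))); // [97, 0, 98]
        System.out.println(Arrays.toString(modifiedUtf8(s))); // [97, -64, -128, 98]
    }
}
```

The values -64 and -128 are the signed byte forms of 0xC0 and 0x80.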
Modifier and Type | Class and Description |
---|---|
private static class | UTF8Convert.ByteArrayStringEncoderVisitor: Visitor that builds up a char[] as characters are decoded |
private static class | UTF8Convert.ByteBufferStringEncoderVisitor: Visitor that builds up a char[] as characters are decoded |
private static class | UTF8Convert.StringHashCodeVisitor: Visitor that builds up a hash code in String.hashCode form as characters are decoded |
private static class | UTF8Convert.UTF8CharacterVisitor: UTF8 character visitor abstraction |
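The visitor idea behind these nested classes can be sketched as follows. This is not the class's actual code: the CharVisitor interface, the decode loop, and all names here are hypothetical, and the decoder handles only well-formed 1-, 2-, and 3-byte sequences. It shows how a hash equal to String.hashCode can be accumulated while decoding, without building a String first.

```java
import java.nio.charset.StandardCharsets;

final class HashVisitorSketch {
    /** Hypothetical callback invoked once per decoded character. */
    interface CharVisitor {
        void visit(char c);
    }

    /** Accumulates h = 31 * h + c, the same recurrence String.hashCode() uses. */
    static final class HashVisitor implements CharVisitor {
        int hash = 0;

        @Override
        public void visit(char c) {
            hash = 31 * hash + c;
        }
    }

    /**
     * Decodes 1-, 2- and 3-byte sequences and reports each character.
     * Continuation bytes are not validated, and truncated input fails with
     * ArrayIndexOutOfBoundsException; a real decoder would check both.
     */
    static void decode(byte[] utf8, CharVisitor visitor) {
        int i = 0;
        while (i < utf8.length) {
            int b = utf8[i] & 0xFF;
            if (b < 0x80) {
                visitor.visit((char) b);
                i += 1;
            } else if ((b & 0xE0) == 0xC0) {
                visitor.visit((char) (((b & 0x1F) << 6) | (utf8[i + 1] & 0x3F)));
                i += 2;
            } else if ((b & 0xF0) == 0xE0) {
                visitor.visit((char) (((b & 0x0F) << 12)
                        | ((utf8[i + 1] & 0x3F) << 6)
                        | (utf8[i + 2] & 0x3F)));
                i += 3;
            } else {
                throw new IllegalArgumentException("unsupported lead byte at index " + i);
            }
        }
    }

    /** Computes the String.hashCode-style hash directly from encoded bytes. */
    static int hashOf(byte[] utf8) {
        HashVisitor visitor = new HashVisitor();
        decode(utf8, visitor);
        return visitor.hash;
    }

    public static void main(String[] args) {
        String s = "h\u00e9llo\u20ac";
        byte[] bytes = s.getBytes(StandardCharsets.UTF_8);
        System.out.println(hashOf(bytes) == s.hashCode());
    }
}
```

A second visitor that appends each character to a char[] would reuse the same decode loop, which is presumably why the decoding and the per-character action are separated.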
Modifier and Type | Field and Description |
---|---|
(package private) static boolean | ALLOW_NORMAL_UTF8: Set fromUTF8 to not throw an exception when given a normal utf8 byte array. |
(package private) static boolean | ALLOW_PSEUDO_UTF8: Set fromUTF8 to not throw an exception when given a pseudo utf8 byte array. |
(package private) static boolean | STRICTLY_CHECK_FORMAT: Strictly check the format of the utf8/pseudo-utf8 byte array in fromUTF8. |
(package private) static boolean | WRITE_PSEUDO_UTF8: Set toUTF8 to write in pseudo-utf8 (rather than normal utf8). |
Constructor and Description |
---|
UTF8Convert() |
Modifier and Type | Method and Description |
---|---|
static boolean | check(byte[] bytes): Check whether the given sequence of bytes is valid (pseudo-)utf8. |
static int | computeStringHashCode(byte[] utf8): Convert the given sequence of (pseudo-)utf8 formatted bytes into a String hashCode. |
static String | fromUTF8(byte[] utf8): Convert the given sequence of (pseudo-)utf8 formatted bytes into a String. |
static String | fromUTF8(ByteBuffer utf8): Convert the given sequence of (pseudo-)utf8 formatted bytes into a String. |
private static void | throwDataFormatException(String message, int location) |
static byte[] | toUTF8(String s): Convert the given String into a sequence of (pseudo-)utf8 formatted bytes. |
static void | toUTF8(String s, ByteBuffer b): Convert the given String into a sequence of (pseudo-)utf8 formatted bytes. |
static int | utfLength(String s) |
private static void | visitUTF8(byte[] utf8, UTF8Convert.UTF8CharacterVisitor visitor): Visit all bytes of the given utf8 string calling the visitor when a character is decoded. |
private static void | visitUTF8(ByteBuffer utf8, UTF8Convert.UTF8CharacterVisitor visitor): Visit all bytes of the given utf8 string calling the visitor when a character is decoded. |
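As a rough illustration of the encoding direction, the sketch below converts a String to bytes with a boolean parameter standing in for the WRITE_PSEUDO_UTF8 flag. It is an assumption-laden sketch, not the implementation of toUTF8: the class name, the method name, and the decision to reject surrogate characters are all choices made here for brevity.

```java
import java.io.ByteArrayOutputStream;

final class EncodeSketch {
    /**
     * Encodes s as 1-, 2- or 3-byte sequences. When writePseudo is true, the
     * null character is written as 0xC0 0x80 instead of a single zero byte.
     * Surrogates are rejected because standard and modified UTF-8 treat them
     * differently and this sketch does not model either behaviour.
     */
    static byte[] encode(String s, boolean writePseudo) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (Character.isSurrogate(c)) {
                throw new IllegalArgumentException("surrogate at index " + i);
            }
            if (c == 0 && writePseudo) {
                out.write(0xC0);
                out.write(0x80);
            } else if (c < 0x80) {
                out.write(c);
            } else if (c < 0x800) {
                out.write(0xC0 | (c >> 6));
                out.write(0x80 | (c & 0x3F));
            } else {
                out.write(0xE0 | (c >> 12));
                out.write(0x80 | ((c >> 6) & 0x3F));
                out.write(0x80 | (c & 0x3F));
            }
        }
        return out.toByteArray();
    }
}
```

For strings without a null character or surrogates, both settings of the parameter produce the same bytes.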
static final boolean STRICTLY_CHECK_FORMAT
static final boolean ALLOW_NORMAL_UTF8
static final boolean ALLOW_PSEUDO_UTF8
static final boolean WRITE_PSEUDO_UTF8
public UTF8Convert()
public static String fromUTF8(byte[] utf8) throws UTFDataFormatException
The acceptable input formats are controlled by the STRICTLY_CHECK_FORMAT, ALLOW_NORMAL_UTF8, and ALLOW_PSEUDO_UTF8 flags.
Parameters:
utf8 - (pseudo-)utf8 byte array
Throws:
UTFDataFormatException - if the (pseudo-)utf8 byte array is not valid (pseudo-)utf8
public static String fromUTF8(ByteBuffer utf8) throws UTFDataFormatException
Parameters:
utf8 - (pseudo-)utf8 byte array
Throws:
UTFDataFormatException - if the (pseudo-)utf8 byte array is not valid (pseudo-)utf8
public static int computeStringHashCode(byte[] utf8) throws UTFDataFormatException
The acceptable input formats are controlled by the STRICTLY_CHECK_FORMAT, ALLOW_NORMAL_UTF8, and ALLOW_PSEUDO_UTF8 flags.
Parameters:
utf8 - (pseudo-)utf8 byte array
Throws:
UTFDataFormatException - if the (pseudo-)utf8 byte array is not valid (pseudo-)utf8
private static void throwDataFormatException(String message, int location) throws UTFDataFormatException
Throws:
UTFDataFormatException
private static void visitUTF8(byte[] utf8, UTF8Convert.UTF8CharacterVisitor visitor) throws UTFDataFormatException
The acceptable input formats are controlled by the STRICTLY_CHECK_FORMAT, ALLOW_NORMAL_UTF8, and ALLOW_PSEUDO_UTF8 flags.
Parameters:
utf8 - (pseudo-)utf8 byte array
visitor - called when characters are decoded
Throws:
UTFDataFormatException - if the (pseudo-)utf8 byte array is not valid (pseudo-)utf8
private static void visitUTF8(ByteBuffer utf8, UTF8Convert.UTF8CharacterVisitor visitor) throws UTFDataFormatException
The acceptable input formats are controlled by the STRICTLY_CHECK_FORMAT, ALLOW_NORMAL_UTF8, and ALLOW_PSEUDO_UTF8 flags.
Parameters:
utf8 - (pseudo-)utf8 byte array
visitor - called when characters are decoded
Throws:
UTFDataFormatException - if the (pseudo-)utf8 byte array is not valid (pseudo-)utf8
public static byte[] toUTF8(String s)
The output format is controlled by the WRITE_PSEUDO_UTF8 flag.
Parameters:
s - String to convert
public static void toUTF8(String s, ByteBuffer b)
The output format is controlled by the WRITE_PSEUDO_UTF8 flag.
Parameters:
s - String to convert
b - Byte buffer to hold result
public static boolean check(byte[] bytes)
Parameters:
bytes - byte array to check
Returns:
true iff the given sequence is valid (pseudo-)utf8.
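A validity check of this kind might look like the sketch below. It is not the code behind check: the two boolean parameters are stand-ins that assume ALLOW_NORMAL_UTF8 governs a raw zero byte and ALLOW_PSEUDO_UTF8 governs the 0xC0 0x80 pair, and the class and method names are hypothetical. Only 1-, 2-, and 3-byte sequences are recognised; 4-byte lead bytes are reported as invalid, and overlong forms other than 0xC0 0x80 are not detected.

```java
final class CheckSketch {
    /**
     * Returns true if bytes consist only of well-formed 1-, 2- or 3-byte
     * sequences. A raw zero byte is accepted only when allowNormal is true;
     * the pair 0xC0 0x80 is accepted only when allowPseudo is true.
     */
    static boolean isValid(byte[] bytes, boolean allowNormal, boolean allowPseudo) {
        int i = 0;
        while (i < bytes.length) {
            int b = bytes[i] & 0xFF;
            int extra; // number of continuation bytes expected after the lead byte
            if (b == 0x00) {
                if (!allowNormal) {
                    return false;
                }
                extra = 0;
            } else if (b < 0x80) {
                extra = 0;
            } else if ((b & 0xE0) == 0xC0) {
                extra = 1;
            } else if ((b & 0xF0) == 0xE0) {
                extra = 2;
            } else {
                return false; // stray continuation byte or unsupported lead byte
            }
            if (i + extra >= bytes.length) {
                return false; // sequence is truncated
            }
            for (int k = 1; k <= extra; k++) {
                if ((bytes[i + k] & 0xC0) != 0x80) {
                    return false; // continuation bytes must look like 10xxxxxx
                }
            }
            if (b == 0xC0 && (bytes[i + 1] & 0xFF) == 0x80 && !allowPseudo) {
                return false; // two-byte null is only valid in pseudo-utf8
            }
            i += extra + 1;
        }
        return true;
    }
}
```

With both parameters set to true, the sketch accepts either encoding of the null character, which mirrors the "(pseudo-)utf8" wording used throughout this page.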