| public final class java.lang Character
|
Java SE 6 |
The Character class wraps a value of the primitive
type char in an object. An object of type
Character contains a single field whose type is
char.
In addition, this class provides several methods for determining a character's category (lowercase letter, digit, etc.) and for converting characters from uppercase to lowercase and vice versa.
Character information is based on the Unicode Standard, version 4.0.
The methods and data of class Character are defined by
the information in the UnicodeData file that is part of the
Unicode Character Database maintained by the Unicode
Consortium. This file specifies various properties including name
and general category for every defined Unicode code point or
character range.
The file and its description are available from the Unicode Consortium.
The char data type (and therefore the value that a
Character object encapsulates) is based on the
original Unicode specification, which defined characters as
fixed-width 16-bit entities. The Unicode standard has since been
changed to allow for characters whose representation requires more
than 16 bits. The range of legal code points is now
U+0000 to U+10FFFF, known as Unicode scalar value.
(Refer to the
definition of the U+n notation in the Unicode
standard.)
The set of characters from U+0000 to U+FFFF is sometimes
referred to as the Basic Multilingual Plane (BMP). Characters whose code points are greater
than U+FFFF are called supplementary characters. The Java
2 platform uses the UTF-16 representation in char
arrays and in the String and StringBuffer
classes. In this representation, supplementary characters are
represented as a pair of char values, the first from
the high-surrogates range, (\uD800-\uDBFF), the
second from the low-surrogates range
(\uDC00-\uDFFF).
A char value, therefore, represents Basic
Multilingual Plane (BMP) code points, including the surrogate
code points, or code units of the UTF-16 encoding. An
int value represents all Unicode code points,
including supplementary code points. The lower (least significant)
21 bits of int are used to represent Unicode code
points and the upper (most significant) 11 bits must be zero.
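As a rough illustration of the representation described above, the sketch below computes a surrogate pair by hand and compares it with the result of Character.toChars. The arithmetic is the standard UTF-16 encoding formula, which this page does not spell out, and the class and method names are hypothetical.

```java
import java.util.Arrays;

public class SurrogateMathDemo {
    // Hand-computes the UTF-16 surrogate pair for a supplementary code point.
    // Assumes 0x10000 <= codePoint <= 0x10FFFF; the value is not validated here.
    static char[] encode(int codePoint) {
        int offset = codePoint - 0x10000;               // 20 significant bits remain
        char high = (char) (0xD800 + (offset >> 10));   // top 10 bits
        char low = (char) (0xDC00 + (offset & 0x3FF));  // bottom 10 bits
        return new char[] { high, low };
    }

    public static void main(String[] args) {
        int clef = 0x1D11E; // MUSICAL SYMBOL G CLEF, a supplementary character
        char[] manual = encode(clef);
        char[] library = Character.toChars(clef);
        System.out.println(Arrays.equals(manual, library)); // true
        System.out.printf("%04X %04X%n", (int) manual[0], (int) manual[1]); // D834 DD1E
    }
}
```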
Unless otherwise specified, the behavior with respect to
supplementary characters and surrogate char values is
as follows:
- The methods that only accept a char value cannot support
  supplementary characters. They treat char values from the
  surrogate ranges as undefined characters. For example,
  Character.isLetter('\uD840') returns false, even though
  this specific value, if followed by any low-surrogate value in a string,
  would represent a letter.
- The methods that accept an int value support all
  Unicode characters, including supplementary characters. For
  example, Character.isLetter(0x2F81A) returns
  true because the code point value represents a letter
  (a CJK ideograph).
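The difference between the two method families can be seen in a short sketch. The class and helper names are hypothetical; the example reuses the code point 0x2F81A mentioned above and inspects the first element of a string both as a char and as a code point.

```java
public class LetterCheckDemo {
    // Looks only at the first char (UTF-16 code unit) of the string.
    static boolean firstCharIsLetter(String s) {
        return Character.isLetter(s.charAt(0));
    }

    // Looks at the first code point, combining a surrogate pair if present.
    static boolean firstCodePointIsLetter(String s) {
        return Character.isLetter(s.codePointAt(0));
    }

    public static void main(String[] args) {
        String ideograph = new String(Character.toChars(0x2F81A));
        System.out.println(ideograph.length());                // 2 (a surrogate pair)
        System.out.println(firstCharIsLetter(ideograph));      // false
        System.out.println(firstCodePointIsLetter(ideograph)); // true
    }
}
```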
In the Java SE API documentation, Unicode code point is
used for character values in the range between U+0000 and U+10FFFF,
and Unicode code unit is used for 16-bit
char values that are code units of the UTF-16
encoding. For more information on Unicode terminology, refer to the
Unicode Glossary.
| since | 1.0 |
| Fields | |||
|---|---|---|---|
| final public static int | MIN_RADIX Details
The minimum radix available for conversion to and from strings.
The constant value of this field is the smallest value permitted
for the radix argument in radix-conversion methods such as the
digit method, the forDigit
method, and the toString method of class
Integer.
| ||
| final public static int | MAX_RADIX Details
The maximum radix available for conversion to and from strings.
The constant value of this field is the largest value permitted
for the radix argument in radix-conversion methods such as the
digit method, the forDigit
method, and the toString method of class
Integer.
| ||
| final public static char | MIN_VALUE Details
The constant value of this field is the smallest value of type
char, '\u0000'.
| ||
| final public static char | MAX_VALUE Details
The constant value of this field is the largest value of type
char, '\uFFFF'.
| ||
| final public static Class | TYPE Details
The Class instance representing the primitive type
char.
| ||
| final public static byte | UNASSIGNED Details
General category "Cn" in the Unicode specification.
| ||
| final public static byte | UPPERCASE_LETTER Details
General category "Lu" in the Unicode specification.
| ||
| final public static byte | LOWERCASE_LETTER Details
General category "Ll" in the Unicode specification.
| ||
| final public static byte | TITLECASE_LETTER Details
General category "Lt" in the Unicode specification.
| ||
| final public static byte | MODIFIER_LETTER Details
General category "Lm" in the Unicode specification.
| ||
| final public static byte | OTHER_LETTER Details
General category "Lo" in the Unicode specification.
| ||
| final public static byte | NON_SPACING_MARK Details
General category "Mn" in the Unicode specification.
| ||
| final public static byte | ENCLOSING_MARK Details
General category "Me" in the Unicode specification.
| ||
| final public static byte | COMBINING_SPACING_MARK Details
General category "Mc" in the Unicode specification.
| ||
| final public static byte | DECIMAL_DIGIT_NUMBER Details
General category "Nd" in the Unicode specification.
| ||
| final public static byte | LETTER_NUMBER Details
General category "Nl" in the Unicode specification.
| ||
| final public static byte | OTHER_NUMBER Details
General category "No" in the Unicode specification.
| ||
| final public static byte | SPACE_SEPARATOR Details
General category "Zs" in the Unicode specification.
| ||
| final public static byte | LINE_SEPARATOR Details
General category "Zl" in the Unicode specification.
| ||
| final public static byte | PARAGRAPH_SEPARATOR Details
General category "Zp" in the Unicode specification.
| ||
| final public static byte | CONTROL Details
General category "Cc" in the Unicode specification.
| ||
| final public static byte | FORMAT Details
General category "Cf" in the Unicode specification.
| ||
| final public static byte | PRIVATE_USE Details
General category "Co" in the Unicode specification.
| ||
| final public static byte | SURROGATE Details
General category "Cs" in the Unicode specification.
| ||
| final public static byte | DASH_PUNCTUATION Details
General category "Pd" in the Unicode specification.
| ||
| final public static byte | START_PUNCTUATION Details
General category "Ps" in the Unicode specification.
| ||
| final public static byte | END_PUNCTUATION Details
General category "Pe" in the Unicode specification.
| ||
| final public static byte | CONNECTOR_PUNCTUATION Details
General category "Pc" in the Unicode specification.
| ||
| final public static byte | OTHER_PUNCTUATION Details
General category "Po" in the Unicode specification.
| ||
| final public static byte | MATH_SYMBOL Details
General category "Sm" in the Unicode specification.
| ||
| final public static byte | CURRENCY_SYMBOL Details
General category "Sc" in the Unicode specification.
| ||
| final public static byte | MODIFIER_SYMBOL Details
General category "Sk" in the Unicode specification.
| ||
| final public static byte | OTHER_SYMBOL Details
General category "So" in the Unicode specification.
| ||
| final public static byte | INITIAL_QUOTE_PUNCTUATION Details
General category "Pi" in the Unicode specification.
| ||
| final public static byte | FINAL_QUOTE_PUNCTUATION Details
General category "Pf" in the Unicode specification.
| ||
| final public static byte | DIRECTIONALITY_UNDEFINED Details
Undefined bidirectional character type. Undefined char
values have undefined directionality in the Unicode specification.
| ||
| final public static byte | DIRECTIONALITY_LEFT_TO_RIGHT Details
Strong bidirectional character type "L" in the Unicode specification.
| ||
| final public static byte | DIRECTIONALITY_RIGHT_TO_LEFT Details
Strong bidirectional character type "R" in the Unicode specification.
| ||
| final public static byte | DIRECTIONALITY_RIGHT_TO_LEFT_ARABIC Details
Strong bidirectional character type "AL" in the Unicode specification.
| ||
| final public static byte | DIRECTIONALITY_EUROPEAN_NUMBER Details
Weak bidirectional character type "EN" in the Unicode specification.
| ||
| final public static byte | DIRECTIONALITY_EUROPEAN_NUMBER_SEPARATOR Details
Weak bidirectional character type "ES" in the Unicode specification.
| ||
| final public static byte | DIRECTIONALITY_EUROPEAN_NUMBER_TERMINATOR Details
Weak bidirectional character type "ET" in the Unicode specification.
| ||
| final public static byte | DIRECTIONALITY_ARABIC_NUMBER Details
Weak bidirectional character type "AN" in the Unicode specification.
| ||
| final public static byte | DIRECTIONALITY_COMMON_NUMBER_SEPARATOR Details
Weak bidirectional character type "CS" in the Unicode specification.
| ||
| final public static byte | DIRECTIONALITY_NONSPACING_MARK Details
Weak bidirectional character type "NSM" in the Unicode specification.
| ||
| final public static byte | DIRECTIONALITY_BOUNDARY_NEUTRAL Details
Weak bidirectional character type "BN" in the Unicode specification.
| ||
| final public static byte | DIRECTIONALITY_PARAGRAPH_SEPARATOR Details
Neutral bidirectional character type "B" in the Unicode specification.
| ||
| final public static byte | DIRECTIONALITY_SEGMENT_SEPARATOR Details
Neutral bidirectional character type "S" in the Unicode specification.
| ||
| final public static byte | DIRECTIONALITY_WHITESPACE Details
Neutral bidirectional character type "WS" in the Unicode specification.
| ||
| final public static byte | DIRECTIONALITY_OTHER_NEUTRALS Details
Neutral bidirectional character type "ON" in the Unicode specification.
| ||
| final public static byte | DIRECTIONALITY_LEFT_TO_RIGHT_EMBEDDING Details
Strong bidirectional character type "LRE" in the Unicode specification.
| ||
| final public static byte | DIRECTIONALITY_LEFT_TO_RIGHT_OVERRIDE Details
Strong bidirectional character type "LRO" in the Unicode specification.
| ||
| final public static byte | DIRECTIONALITY_RIGHT_TO_LEFT_EMBEDDING Details
Strong bidirectional character type "RLE" in the Unicode specification.
| ||
| final public static byte | DIRECTIONALITY_RIGHT_TO_LEFT_OVERRIDE Details
Strong bidirectional character type "RLO" in the Unicode specification.
| ||
| final public static byte | DIRECTIONALITY_POP_DIRECTIONAL_FORMAT Details
Weak bidirectional character type "PDF" in the Unicode specification.
| ||
| final public static char | MIN_HIGH_SURROGATE Details
The minimum value of a Unicode high-surrogate code unit in the
UTF-16 encoding. A high-surrogate is also known as a
leading-surrogate.
| ||
| final public static char | MAX_HIGH_SURROGATE Details
The maximum value of a Unicode high-surrogate code unit in the
UTF-16 encoding. A high-surrogate is also known as a
leading-surrogate.
| ||
| final public static char | MIN_LOW_SURROGATE Details
The minimum value of a Unicode low-surrogate code unit in the
UTF-16 encoding. A low-surrogate is also known as a
trailing-surrogate.
| ||
| final public static char | MAX_LOW_SURROGATE Details
The maximum value of a Unicode low-surrogate code unit in the
UTF-16 encoding. A low-surrogate is also known as a
trailing-surrogate.
| ||
| final public static char | MIN_SURROGATE Details
The minimum value of a Unicode surrogate code unit in the UTF-16 encoding.
| ||
| final public static char | MAX_SURROGATE Details
The maximum value of a Unicode surrogate code unit in the UTF-16 encoding.
| ||
| final public static int | MIN_SUPPLEMENTARY_CODE_POINT Details
The minimum value of a supplementary code point.
| ||
| final public static int | MIN_CODE_POINT Details
The minimum value of a Unicode code point.
| ||
| final public static int | MAX_CODE_POINT Details
The maximum value of a Unicode code point.
| ||
| final public static int | SIZE Details
The number of bits used to represent a char value in unsigned
binary form.
| ||
| Constructors | |||
|---|---|---|---|
| public | Character(char value) Details
Constructs a newly allocated Character object that
represents the specified char value.
| ||
| Methods | |||||||||||||||||||
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| public static int | charCount(int codePoint) Details
Determines the number of char values needed to
represent the specified character (Unicode code point). If the
specified character is equal to or greater than 0x10000, then
the method returns 2. Otherwise, the method returns 1.
This method does not check that the specified character is a
valid Unicode code point. The caller must validate the
character value using isValidCodePoint if necessary.
| ||||||||||||||||||
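Because charCount does not validate its argument, a caller may want to guard it. The following is a minimal sketch with a hypothetical helper that returns -1 for values outside the legal code point range; the -1 convention is an assumption made for this example only.

```java
public class CharCountDemo {
    // Returns the number of char values needed for the code point,
    // or -1 (a convention chosen for this sketch) if it is not valid.
    static int safeCharCount(int codePoint) {
        if (!Character.isValidCodePoint(codePoint)) {
            return -1;
        }
        return Character.charCount(codePoint);
    }

    public static void main(String[] args) {
        System.out.println(safeCharCount('A'));               // 1
        System.out.println(safeCharCount(0x10000));           // 2
        System.out.println(safeCharCount(0x110000));          // -1
        System.out.println(Character.charCount(0x110000));    // 2, input not validated
    }
}
```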
| public char | charValue() Details
Returns the value of this Character object.
| ||||||||||||||||||
| public static int | codePointAt(CharSequence seq, int index) Details
Returns the code point at the given index of the
CharSequence. If the char value at
the given index in the CharSequence is in the
high-surrogate range, the following index is less than the
length of the CharSequence, and the
char value at the following index is in the
low-surrogate range, then the supplementary code point
corresponding to this surrogate pair is returned. Otherwise,
the char value at the given index is returned.
| ||||||||||||||||||
| public static int | codePointAt(char[] a, int index) Details
Returns the code point at the given index of the
char array. If the char value at
the given index in the char array is in the
high-surrogate range, the following index is less than the
length of the char array, and the
char value at the following index is in the
low-surrogate range, then the supplementary code point
corresponding to this surrogate pair is returned. Otherwise,
the char value at the given index is returned.
| ||||||||||||||||||
| public static int | codePointAt(char[] a, int index, int limit) Details
Returns the code point at the given index of the
char array, where only array elements with
index less than limit can be used. If
the char value at the given index in the
char array is in the high-surrogate range, the
following index is less than the limit, and the
char value at the following index is in the
low-surrogate range, then the supplementary code point
corresponding to this surrogate pair is returned. Otherwise,
the char value at the given index is returned.
| ||||||||||||||||||
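The three codePointAt variants can be compared on a small array. This is an illustrative sketch only; the class name and sample data are hypothetical.

```java
public class CodePointAtDemo {
    // 'a', then the surrogate pair for U+1D11E, then 'b'.
    static final char[] TEXT = { 'a', '\uD834', '\uDD1E', 'b' };

    public static void main(String[] args) {
        // Index 1 holds a high surrogate followed by a low surrogate.
        System.out.println(Integer.toHexString(Character.codePointAt(TEXT, 1)));    // 1d11e
        // Index 2 holds a low surrogate on its own, so the char value is returned.
        System.out.println(Integer.toHexString(Character.codePointAt(TEXT, 2)));    // dd1e
        // With limit 2 the low surrogate is out of reach, so the pair is not combined.
        System.out.println(Integer.toHexString(Character.codePointAt(TEXT, 1, 2))); // d834
        // The CharSequence variant behaves like the two-argument array variant.
        System.out.println(Integer.toHexString(Character.codePointAt(new String(TEXT), 1))); // 1d11e
    }
}
```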
| public static int | codePointBefore(CharSequence seq, int index) Details
Returns the code point preceding the given index of the
CharSequence. If the char value at
(index - 1) in the CharSequence is in
the low-surrogate range, (index - 2) is not
negative, and the char value at (index -
2) in the CharSequence is in the
high-surrogate range, then the supplementary code point
corresponding to this surrogate pair is returned. Otherwise,
the char value at (index - 1) is
returned.
| ||||||||||||||||||
| public static int | codePointBefore(char[] a, int index) Details
Returns the code point preceding the given index of the
char array. If the char value at
(index - 1) in the char array is in
the low-surrogate range, (index - 2) is not
negative, and the char value at (index -
2) in the char array is in the
high-surrogate range, then the supplementary code point
corresponding to this surrogate pair is returned. Otherwise,
the char value at (index - 1) is
returned.
| ||||||||||||||||||
| public static int | codePointBefore(char[] a, int index, int start) Details
Returns the code point preceding the given index of the
char array, where only array elements with
index greater than or equal to start
can be used. If the char value at (index -
1) in the char array is in the
low-surrogate range, (index - 2) is not less than
start, and the char value at
(index - 2) in the char array is in
the high-surrogate range, then the supplementary code point
corresponding to this surrogate pair is returned. Otherwise,
the char value at (index - 1) is
returned.
| ||||||||||||||||||
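A short sketch of how codePointBefore looks backwards from an index, including the effect of the start bound. The class name and sample data are hypothetical.

```java
public class CodePointBeforeDemo {
    // 'a', then the surrogate pair for U+1D11E, then 'b'.
    static final char[] TEXT = { 'a', '\uD834', '\uDD1E', 'b' };

    public static void main(String[] args) {
        // The two chars before index 3 form a surrogate pair.
        System.out.println(Integer.toHexString(Character.codePointBefore(TEXT, 3)));    // 1d11e
        // The char before index 2 is a high surrogate, so it is returned as is.
        System.out.println(Integer.toHexString(Character.codePointBefore(TEXT, 2)));    // d834
        // With start 2 the high surrogate at index 1 may not be used.
        System.out.println(Integer.toHexString(Character.codePointBefore(TEXT, 3, 2))); // dd1e
    }
}
```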
| public static int | codePointCount(CharSequence seq, int beginIndex, int endIndex) Details
Returns the number of Unicode code points in the text range of
the specified char sequence. The text range begins at the
specified beginIndex and extends to the
char at index endIndex - 1. Thus the
length (in chars) of the text range is
endIndex-beginIndex. Unpaired surrogates within
the text range count as one code point each.
| ||||||||||||||||||
| public static int | codePointCount(char[] a, int offset, int count) Details
Returns the number of Unicode code points in a subarray of the
char array argument. The offset
argument is the index of the first char of the
subarray and the count argument specifies the
length of the subarray in chars. Unpaired
surrogates within the subarray count as one code point each.
| ||||||||||||||||||
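The following sketch contrasts the length in chars with the number of code points, and shows an unpaired surrogate counting as one code point. The class and method names are hypothetical.

```java
public class CodePointCountDemo {
    // Counts the code points in the whole sequence.
    static int count(CharSequence s) {
        return Character.codePointCount(s, 0, s.length());
    }

    public static void main(String[] args) {
        String text = "a\uD834\uDD1Eb"; // 'a', U+1D11E, 'b'
        System.out.println(text.length());  // 4 chars
        System.out.println(count(text));    // 3 code points
        // A range that cuts the pair leaves an unpaired high surrogate,
        // which counts as one code point.
        System.out.println(Character.codePointCount(text, 0, 2)); // 2
    }
}
```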
| public int | compareTo(Character anotherCharacter) Details
Compares two Character objects numerically.
| ||||||||||||||||||
| public static int | digit(char ch, int radix) Details
Returns the numeric value of the character ch in the
specified radix.
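If the radix is not in the range MIN_RADIX <= radix <= MAX_RADIX, or if
ch is not a valid digit in the specified radix, -1 is returned.
Note: This method cannot handle supplementary characters. To support
all Unicode characters, including supplementary characters, use
the digit(int, int) method.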
| ||||||||||||||||||
| public static int | digit(int codePoint, int radix) Details
Returns the numeric value of the specified character (Unicode
code point) in the specified radix.
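If the radix is not in the range MIN_RADIX <= radix <= MAX_RADIX, or if
the character is not a valid digit in the specified radix, -1 is returned.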
| ||||||||||||||||||
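A few sample calls to both digit overloads, as an informal sketch. The class name is hypothetical; the code point 0x1D7CF (MATHEMATICAL BOLD DIGIT ONE) is chosen here only as an example of a supplementary digit.

```java
public class DigitDemo {
    public static void main(String[] args) {
        System.out.println(Character.digit('7', 10));     // 7
        System.out.println(Character.digit('f', 16));     // 15
        System.out.println(Character.digit('Z', 36));     // 35
        System.out.println(Character.digit('8', 8));      // -1, not an octal digit
        System.out.println(Character.digit('7', 1));      // -1, radix below MIN_RADIX
        System.out.println(Character.digit('\u0663', 10)); // 3, ARABIC-INDIC DIGIT THREE
        // Only the int overload can see a supplementary digit.
        System.out.println(Character.digit(0x1D7CF, 10)); // 1
    }
}
```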
| public boolean | equals(Object obj) Details
Compares this object against the specified object.
The result is true if and only if the argument is not
null and is a Character object that
represents the same char value as this object.
| ||||||||||||||||||
| public static char | forDigit(int digit, int radix) Details
Determines the character representation for a specific digit in
the specified radix. If the value of radix is not a
valid radix, or the value of digit is not a valid
digit in the specified radix, the null character
('\u0000') is returned.
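The radix argument is valid if it is greater than or equal to MIN_RADIX
and less than or equal to MAX_RADIX. The digit argument is valid if
0 <= digit < radix.
If the digit is less than 10, then '0' + digit is returned. Otherwise,
the value 'a' + digit - 10 is returned.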
| ||||||||||||||||||
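One way to see forDigit at work is a small radix formatter. This is a sketch with a hypothetical helper, restricted to non-negative values; in practice Integer.toString(int, int) already does this.

```java
public class ForDigitDemo {
    // Formats a non-negative value in the given radix using forDigit.
    static String toRadix(int value, int radix) {
        if (value < 0 || radix < Character.MIN_RADIX || radix > Character.MAX_RADIX) {
            throw new IllegalArgumentException("unsupported value or radix");
        }
        if (value == 0) {
            return "0";
        }
        StringBuilder sb = new StringBuilder();
        while (value > 0) {
            sb.append(Character.forDigit(value % radix, radix)); // least significant digit first
            value /= radix;
        }
        return sb.reverse().toString();
    }

    public static void main(String[] args) {
        System.out.println(toRadix(255, 16)); // ff
        System.out.println(toRadix(10, 2));   // 1010
        // An invalid digit or radix yields the null character.
        System.out.println((int) Character.forDigit(16, 16)); // 0
    }
}
```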
| public int | hashCode() Details
Returns a hash code for this Character.
| ||||||||||||||||||
| public static int | offsetByCodePoints(CharSequence seq, int index, int codePointOffset) Details
Returns the index within the given char sequence that is offset
from the given index by codePointOffset
code points. Unpaired surrogates within the text range given by
index and codePointOffset count as
one code point each.
| ||||||||||||||||||
| public static int | offsetByCodePoints(char[] a, int start, int count, int index, int codePointOffset) Details
Returns the index within the given char subarray
that is offset from the given index by
codePointOffset code points. The
start and count arguments specify a
subarray of the char array. Unpaired surrogates
within the text range given by index and
codePointOffset count as one code point each.
| ||||||||||||||||||
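A common use of offsetByCodePoints is stepping through text one code point at a time. The sketch below uses a hypothetical helper that collects the code points of a sequence.

```java
import java.util.ArrayList;
import java.util.List;

public class OffsetDemo {
    // Collects every code point, advancing the index by one code point per step.
    static List<Integer> codePoints(CharSequence s) {
        List<Integer> out = new ArrayList<Integer>();
        for (int i = 0; i < s.length(); i = Character.offsetByCodePoints(s, i, 1)) {
            out.add(Character.codePointAt(s, i));
        }
        return out;
    }

    public static void main(String[] args) {
        String text = "a\uD834\uDD1Eb"; // 'a', U+1D11E, 'b'
        System.out.println(codePoints(text).size());                 // 3
        System.out.println(Character.offsetByCodePoints(text, 0, 2)); // 3
        System.out.println(Character.offsetByCodePoints(text, 4, -1)); // 3
    }
}
```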
| public static char | reverseBytes(char ch) Details
Returns the value obtained by reversing the order of the bytes in the
specified char value.
| ||||||||||||||||||
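For example, reversing the bytes of a char swaps its high and low bytes. The class name in this small sketch is hypothetical.

```java
public class ReverseBytesDemo {
    public static void main(String[] args) {
        char swapped = Character.reverseBytes('\u1234');
        System.out.println(Integer.toHexString(swapped)); // 3412
        // Applying the operation twice restores the original value.
        System.out.println(Character.reverseBytes(swapped) == '\u1234'); // true
    }
}
```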
| public static int | toChars(int codePoint, char[] dst, int dstIndex) Details
Converts the specified character (Unicode code point) to its
UTF-16 representation. If the specified code point is a BMP
(Basic Multilingual Plane or Plane 0) value, the same value is
stored in dst[dstIndex], and 1 is returned. If the
specified code point is a supplementary character, its
surrogate values are stored in dst[dstIndex]
(high-surrogate) and dst[dstIndex+1]
(low-surrogate), and 2 is returned.
| ||||||||||||||||||
| public static char[] | toChars(int codePoint) Details
Converts the specified character (Unicode code point) to its
UTF-16 representation stored in a char array. If
the specified code point is a BMP (Basic Multilingual Plane or
Plane 0) value, the resulting char array has
the same value as codePoint. If the specified code
point is a supplementary code point, the resulting
char array has the corresponding surrogate pair.
| ||||||||||||||||||
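The three-argument toChars is convenient for filling a buffer, since its return value is the number of chars written. A minimal sketch with a hypothetical helper:

```java
public class ToCharsDemo {
    // Builds a String from code points by writing into a shared char buffer.
    static String build(int... codePoints) {
        char[] buf = new char[codePoints.length * 2]; // worst case: all surrogate pairs
        int len = 0;
        for (int cp : codePoints) {
            len += Character.toChars(cp, buf, len);   // returns 1 or 2
        }
        return new String(buf, 0, len);
    }

    public static void main(String[] args) {
        String s = build('a', 0x1D11E);
        System.out.println(s.length());                      // 3
        System.out.println(Character.toChars('a').length);   // 1
        System.out.println(Character.toChars(0x1D11E).length); // 2
    }
}
```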
| public static int | toCodePoint(char high, char low) Details
Converts the specified surrogate pair to its supplementary code
point value. This method does not validate the specified
surrogate pair. The caller must validate it using isSurrogatePair if necessary.
| ||||||||||||||||||
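Since toCodePoint does not validate its arguments, a guarded wrapper is one possible pattern. The helper below is hypothetical, and returning -1 for an invalid pair is a convention chosen for this sketch.

```java
public class ToCodePointDemo {
    // Combines a surrogate pair, or returns -1 if the chars do not form one.
    static int combine(char high, char low) {
        if (!Character.isSurrogatePair(high, low)) {
            return -1;
        }
        return Character.toCodePoint(high, low);
    }

    public static void main(String[] args) {
        System.out.println(Integer.toHexString(combine('\uD834', '\uDD1E'))); // 1d11e
        System.out.println(combine('a', 'b'));                                // -1
        System.out.println(combine('\uDD1E', '\uD834'));                      // -1, wrong order
    }
}
```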
| public static char | toLowerCase(char ch) Details
Converts the character argument to lowercase using case
mapping information from the UnicodeData file.
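Note that Character.isLowerCase(Character.toLowerCase(ch)) does not always
return true for some ranges of characters, particularly those that are
symbols or ideographs.
In general, String.toLowerCase() should be used to map characters to
lowercase. String case mapping methods can perform locale-sensitive
mappings, context-sensitive mappings, and 1:M character mappings, whereas
the Character case mapping methods cannot.
Note: This method cannot handle supplementary characters. To support
all Unicode characters, including supplementary characters, use
the toLowerCase(int) method.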
| ||||||||||||||||||
| public static int | toLowerCase(int codePoint) Details
Converts the character (Unicode code point) argument to
lowercase using case mapping information from the UnicodeData
file.
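Note that Character.isLowerCase(Character.toLowerCase(codePoint)) does not
always return true for some ranges of characters, particularly those that
are symbols or ideographs.
In general, String.toLowerCase() should be used to map characters to
lowercase. String case mapping methods can perform locale-sensitive
mappings, context-sensitive mappings, and 1:M character mappings, whereas
the Character case mapping methods cannot.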
| ||||||||||||||||||
| public String | toString() Details
Returns a String object representing this
Character's value. The result is a string of
length 1 whose sole component is the primitive
char value represented by this
Character object.
| ||||||||||||||||||
| public static String | toString(char c) Details
Returns a String object representing the
specified char. The result is a string of length
1 consisting solely of the specified char.
| ||||||||||||||||||
| public static char | toTitleCase(char ch) Details
Converts the character argument to titlecase using case mapping
information from the UnicodeData file. If a character has no
explicit titlecase mapping and is not itself a titlecase char
according to UnicodeData, then the uppercase mapping is
returned as an equivalent titlecase mapping. If the
char argument is already a titlecase
char, the same char value will be
returned.
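Note that Character.isTitleCase(Character.toTitleCase(ch)) does not always
return true for some ranges of characters.
Note: This method cannot handle supplementary characters. To support
all Unicode characters, including supplementary characters, use
the toTitleCase(int) method.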
| ||||||||||||||||||
| public static int | toTitleCase(int codePoint) Details
Converts the character (Unicode code point) argument to titlecase using case mapping
information from the UnicodeData file. If a character has no
explicit titlecase mapping and is not itself a titlecase char
according to UnicodeData, then the uppercase mapping is
returned as an equivalent titlecase mapping. If the
character argument is already a titlecase
character, the same character value will be
returned.
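Note that Character.isTitleCase(Character.toTitleCase(codePoint)) does not
always return true for some ranges of characters.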
| ||||||||||||||||||
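The titlecase rules above are easiest to see with a digraph that has a distinct titlecase form. The sketch uses the Latin digraph DZ with caron (U+01C4 uppercase, U+01C5 titlecase, U+01C6 lowercase); the class name is hypothetical.

```java
public class TitleCaseDemo {
    public static void main(String[] args) {
        // Lowercase and uppercase digraphs both map to the titlecase form.
        System.out.println(Character.toTitleCase('\u01C6') == '\u01C5'); // true
        System.out.println(Character.toTitleCase('\u01C4') == '\u01C5'); // true
        // A titlecase char maps to itself.
        System.out.println(Character.toTitleCase('\u01C5') == '\u01C5'); // true
        // Without an explicit titlecase mapping, the uppercase mapping is used.
        System.out.println(Character.toTitleCase('a'));                  // A
    }
}
```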
| public static char | toUpperCase(char ch) Details
Converts the character argument to uppercase using case mapping
information from the UnicodeData file.
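Note that Character.isUpperCase(Character.toUpperCase(ch)) does not always
return true for some ranges of characters, particularly those that are
symbols or ideographs.
In general, String.toUpperCase() should be used to map characters to
uppercase. String case mapping methods can perform locale-sensitive
mappings, context-sensitive mappings, and 1:M character mappings, whereas
the Character case mapping methods cannot.
Note: This method cannot handle supplementary characters. To support
all Unicode characters, including supplementary characters, use
the toUpperCase(int) method.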
| ||||||||||||||||||
| public static int | toUpperCase(int codePoint) Details
Converts the character (Unicode code point) argument to
uppercase using case mapping information from the UnicodeData
file.
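Note that Character.isUpperCase(Character.toUpperCase(codePoint)) does not
always return true for some ranges of characters, particularly those that
are symbols or ideographs.
In general, String.toUpperCase() should be used to map characters to
uppercase. String case mapping methods can perform locale-sensitive
mappings, context-sensitive mappings, and 1:M character mappings, whereas
the Character case mapping methods cannot.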
| ||||||||||||||||||
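Character case mapping is one-to-one, which the following sketch contrasts with String case mapping. The sample characters (the German sharp s, U+00DF, and the Deseret letters U+10400 and U+10428) are chosen here for illustration, and the class name is hypothetical.

```java
import java.util.Locale;

public class CaseMappingDemo {
    public static void main(String[] args) {
        System.out.println(Character.toUpperCase('a')); // A
        // U+00DF has no single-char uppercase form, so it is returned unchanged.
        System.out.println(Character.toUpperCase('\u00DF') == '\u00DF'); // true
        // String case mapping can expand one char into several.
        System.out.println("\u00DF".toUpperCase(Locale.ENGLISH)); // SS
        // The int overloads also cover supplementary characters.
        System.out.println(Integer.toHexString(Character.toLowerCase(0x10400))); // 10428
        System.out.println(Integer.toHexString(Character.toUpperCase(0x10428))); // 10400
    }
}
```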
| public static Character | valueOf(char c) Details
Returns a Character instance representing the specified
char value.
If a new Character instance is not required, this method
should generally be used in preference to the constructor
Character(char), as this method is likely to yield
significantly better space and time performance by caching
frequently requested values.
| ||||||||||||||||||
| Properties | ||||||||||||||||||||||||
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| public static boolean | isDefined(char ch) Details
Determines if a character is defined in Unicode.
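A character is defined if at least one of the following is true:
it has an entry in the UnicodeData file, or it has a value in a range
defined by the UnicodeData file.
Note: This method cannot handle supplementary characters. To support
all Unicode characters, including supplementary characters, use
the isDefined(int) method.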
| |||||||||||||||||||||||
| public static boolean | isDefined(int codePoint) Details
Determines if a character (Unicode code point) is defined in Unicode.
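A character is defined if at least one of the following is true:
it has an entry in the UnicodeData file, or it has a value in a range
defined by the UnicodeData file.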
| |||||||||||||||||||||||
| public static boolean | isDigit(char ch) Details
Determines if the specified character is a digit.
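A character is a digit if its general category type, provided
by Character.getType(ch), is DECIMAL_DIGIT_NUMBER.
Some Unicode character ranges that contain digits:
'\u0030' through '\u0039', ISO-LATIN-1 digits ('0' through '9');
'\u0660' through '\u0669', Arabic-Indic digits;
'\u06F0' through '\u06F9', Extended Arabic-Indic digits;
'\u0966' through '\u096F', Devanagari digits;
'\uFF10' through '\uFF19', Fullwidth digits.
Many other character ranges contain digits as well.
Note: This method cannot handle supplementary characters. To support
all Unicode characters, including supplementary characters, use
the isDigit(int) method.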
| |||||||||||||||||||||||
| public static boolean | isDigit(int codePoint) Details
Determines if the specified character (Unicode code point) is a digit.
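A character is a digit if its general category type, provided
by getType(codePoint), is DECIMAL_DIGIT_NUMBER.
Some Unicode character ranges that contain digits:
'\u0030' through '\u0039', ISO-LATIN-1 digits ('0' through '9');
'\u0660' through '\u0669', Arabic-Indic digits;
'\u06F0' through '\u06F9', Extended Arabic-Indic digits;
'\u0966' through '\u096F', Devanagari digits;
'\uFF10' through '\uFF19', Fullwidth digits.
Many other character ranges contain digits as well.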
| |||||||||||||||||||||||
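A brief sketch of isDigit on a few sample characters. The Roman numeral U+2167 is included as an assumption-free counterexample: it is numeric, but its general category is LETTER_NUMBER rather than DECIMAL_DIGIT_NUMBER. The class name is hypothetical.

```java
public class IsDigitDemo {
    public static void main(String[] args) {
        System.out.println(Character.isDigit('5'));      // true
        System.out.println(Character.isDigit('\u0663')); // true, Arabic-Indic digit
        System.out.println(Character.isDigit('\uFF19')); // true, fullwidth digit
        System.out.println(Character.isDigit('a'));      // false
        System.out.println(Character.isDigit('\u2167')); // false, a letter number
        // A supplementary digit is only visible to the int overload.
        System.out.println(Character.isDigit(0x1D7CE));  // true
    }
}
```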
| public static byte | getDirectionality(char ch) Details
Returns the Unicode directionality property for the given
character. Character directionality is used to calculate the
visual ordering of text. The directionality value of undefined
char values is DIRECTIONALITY_UNDEFINED.
Note: This method cannot handle supplementary characters. To support
all Unicode characters, including supplementary characters, use
the getDirectionality(int) method.
| |||||||||||||||||||||||
| public static byte | getDirectionality(int codePoint) Details
Returns the Unicode directionality property for the given
character (Unicode code point). Character directionality is
used to calculate the visual ordering of text. The
directionality value of undefined characters is DIRECTIONALITY_UNDEFINED.
| |||||||||||||||||||||||
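The directionality constants listed under Fields are the values this method returns. The sketch below checks a few sample characters; the characters and the class name are chosen for illustration only.

```java
public class DirectionalityDemo {
    public static void main(String[] args) {
        // Latin letter: strong left-to-right.
        System.out.println(Character.getDirectionality('a')
                == Character.DIRECTIONALITY_LEFT_TO_RIGHT);        // true
        // Hebrew letter alef: strong right-to-left.
        System.out.println(Character.getDirectionality('\u05D0')
                == Character.DIRECTIONALITY_RIGHT_TO_LEFT);        // true
        // Arabic letter alef: right-to-left Arabic.
        System.out.println(Character.getDirectionality('\u0627')
                == Character.DIRECTIONALITY_RIGHT_TO_LEFT_ARABIC); // true
        // ASCII digit: European number.
        System.out.println(Character.getDirectionality('1')
                == Character.DIRECTIONALITY_EUROPEAN_NUMBER);      // true
    }
}
```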
| public static boolean | isHighSurrogate(char ch) Details
Determines if the given char value is a
high-surrogate code unit (also known as leading-surrogate
code unit). Such values do not represent characters by
themselves, but are used in the representation of supplementary characters in the
UTF-16 encoding.
This method returns true if and only if ch >= '\uD800' && ch <= '\uDBFF' is true.
| |||||||||||||||||||||||
| public static boolean | isIdentifierIgnorable(char ch) Details
Determines if the specified character should be regarded as
an ignorable character in a Java identifier or a Unicode identifier.
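The following Unicode characters are ignorable in a Java identifier or a Unicode identifier:
ISO control characters that are not whitespace ('\u0000' through '\u0008',
'\u000E' through '\u001B', and '\u007F' through '\u009F'), and all
characters that have the FORMAT general category value.
Note: This method cannot handle supplementary characters. To support
all Unicode characters, including supplementary characters, use
the isIdentifierIgnorable(int) method.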
| |||||||||||||||||||||||
| public static boolean | isIdentifierIgnorable(int codePoint) Details
Determines if the specified character (Unicode code point) should be regarded as
an ignorable character in a Java identifier or a Unicode identifier.
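The following Unicode characters are ignorable in a Java identifier or a Unicode identifier:
ISO control characters that are not whitespace ('\u0000' through '\u0008',
'\u000E' through '\u001B', and '\u007F' through '\u009F'), and all
characters that have the FORMAT general category value.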
| |||||||||||||||||||||||
| public static boolean | isISOControl(char ch) Details
Determines if the specified character is an ISO control
character. A character is considered to be an ISO control
character if its code is in the range '\u0000'
through '\u001F' or in the range
'\u007F' through '\u009F'.
Note: This method cannot handle supplementary characters. To support
all Unicode characters, including supplementary characters, use
the isISOControl(int) method.
| |||||||||||||||||||||||
| public static boolean | isISOControl(int codePoint) Details
Determines if the referenced character (Unicode code point) is an ISO control
character. A character is considered to be an ISO control
character if its code is in the range '\u0000'
through '\u001F' or in the range
'\u007F' through '\u009F'.
| |||||||||||||||||||||||
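One possible use of isISOControl is removing control characters from text. The helper below is a hypothetical sketch; it works char by char, which is sufficient here because both control ranges lie within the BMP.

```java
public class IsoControlDemo {
    // Returns a copy of the input with ISO control characters removed.
    static String stripControls(String s) {
        StringBuilder sb = new StringBuilder(s.length());
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (!Character.isISOControl(c)) {
                sb.append(c);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(stripControls("a\tb\u007Fc\u0085d")); // abcd
        System.out.println(Character.isISOControl(' '));         // false
        System.out.println(Character.isISOControl(0x9F));        // true
        System.out.println(Character.isISOControl(0xA0));        // false
    }
}
```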
| public static boolean | isJavaIdentifierPart(char ch) Details
Determines if the specified character may be part of a Java
identifier as other than the first character.
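A character may be part of a Java identifier if any of the following are true:
it is a letter; it is a currency symbol (such as '$'); it is a connecting
punctuation character (such as '_'); it is a digit; it is a numeric letter
(such as a Roman numeral character); it is a combining mark; it is a
non-spacing mark; or isIdentifierIgnorable returns true for the character.
Note: This method cannot handle supplementary characters. To support
all Unicode characters, including supplementary characters, use
the isJavaIdentifierPart(int) method.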
| |||||||||||||||||||||||
| public static boolean | isJavaIdentifierPart(int codePoint) Details
Determines if the character (Unicode code point) may be part of a Java
identifier as other than the first character.
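A character may be part of a Java identifier if any of the following are true:
it is a letter; it is a currency symbol (such as '$'); it is a connecting
punctuation character (such as '_'); it is a digit; it is a numeric letter
(such as a Roman numeral character); it is a combining mark; it is a
non-spacing mark; or isIdentifierIgnorable(codePoint) returns true for the
character.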
| |||||||||||||||||||||||
| public static boolean | isJavaIdentifierStart(char ch) Details
Determines if the specified character is
permissible as the first character in a Java identifier.
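A character may start a Java identifier if and only if one of the following conditions is true:
isLetter(ch) returns true; getType(ch) returns LETTER_NUMBER; ch is a
currency symbol (such as "$"); or ch is a connecting punctuation character
(such as "_").
Note: This method cannot handle supplementary characters. To support
all Unicode characters, including supplementary characters, use
the isJavaIdentifierStart(int) method.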
| |||||||||||||||||||||||
| public static boolean | isJavaIdentifierStart(int codePoint) Details
Determines if the character (Unicode code point) is
permissible as the first character in a Java identifier.
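A character may start a Java identifier if and only if one of the following conditions is true:
isLetter(codePoint) returns true; getType(codePoint) returns LETTER_NUMBER;
the referenced character is a currency symbol (such as "$"); or the
referenced character is a connecting punctuation character (such as "_").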
| |||||||||||||||||||||||
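The two identifier predicates are typically combined to check a whole string. The following is a minimal sketch with a hypothetical helper; it iterates by code point so that supplementary characters are handled, and it deliberately does not reject reserved words such as class or int.

```java
public class IdentifierDemo {
    // Checks identifier syntax only; keywords and literals are not excluded.
    static boolean isIdentifier(String s) {
        if (s.length() == 0) {
            return false;
        }
        int cp = s.codePointAt(0);
        if (!Character.isJavaIdentifierStart(cp)) {
            return false;
        }
        for (int i = Character.charCount(cp); i < s.length(); i += Character.charCount(cp)) {
            cp = s.codePointAt(i);
            if (!Character.isJavaIdentifierPart(cp)) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(isIdentifier("foo1")); // true
        System.out.println(isIdentifier("_x"));   // true
        System.out.println(isIdentifier("$y"));   // true
        System.out.println(isIdentifier("1foo")); // false
        System.out.println(isIdentifier("a-b"));  // false
    }
}
```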
| public static boolean | isJavaLetter(char ch) Details
Determines if the specified character is permissible as the first
character in a Java identifier.
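A character may start a Java identifier if and only if one of the following is true:
isLetter(ch) returns true; getType(ch) returns LETTER_NUMBER; ch is a
currency symbol (such as "$"); or ch is a connecting punctuation character
(such as "_").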
| |||||||||||||||||||||||
| public static boolean | isJavaLetterOrDigit(char ch) Details
Determines if the specified character may be part of a Java
identifier as other than the first character.
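A character may be part of a Java identifier if and only if any of the following are true:
it is a letter; it is a currency symbol (such as '$'); it is a connecting
punctuation character (such as '_'); it is a digit; it is a numeric letter
(such as a Roman numeral character); it is a combining mark; it is a
non-spacing mark; or isIdentifierIgnorable returns true for the character.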
| |||||||||||||||||||||||
| public static boolean | isLetter(char ch) Details
Determines if the specified character is a letter.
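A character is considered to be a letter if its general
category type, provided by Character.getType(ch), is any of the following:
UPPERCASE_LETTER, LOWERCASE_LETTER, TITLECASE_LETTER, MODIFIER_LETTER,
or OTHER_LETTER.
Not all letters have case. Many characters are letters but are neither
uppercase nor lowercase nor titlecase.
Note: This method cannot handle supplementary characters. To support
all Unicode characters, including supplementary characters, use
the isLetter(int) method.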
| |||||||||||||||||||||||
| public static boolean | isLetter(int codePoint) Details
Determines if the specified character (Unicode code point) is a letter.
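A character is considered to be a letter if its general
category type, provided by getType(codePoint), is any of the following:
UPPERCASE_LETTER, LOWERCASE_LETTER, TITLECASE_LETTER, MODIFIER_LETTER,
or OTHER_LETTER.
Not all letters have case. Many characters are letters but are neither
uppercase nor lowercase nor titlecase.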
| |||||||||||||||||||||||
| public static boolean | isLetterOrDigit(char ch) Details
Determines if the specified character is a letter or digit.
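A character is considered to be a letter or digit if either
Character.isLetter(char ch) or Character.isDigit(char ch) returns true
for the character.
Note: This method cannot handle supplementary characters. To support
all Unicode characters, including supplementary characters, use
the isLetterOrDigit(int) method.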
| |||||||||||||||||||||||
| public static boolean | isLetterOrDigit(int codePoint) Details
Determines if the specified character (Unicode code point) is a letter or digit.
A character is considered to be a letter or digit if either
isLetter(codePoint) or isDigit(codePoint) returns true for the character.
| |||||||||||||||||||||||
| public static boolean | isLowerCase(char ch) Details
Determines if the specified character is a lowercase character.
A character is lowercase if its general category type, provided
by Character.getType(ch), is LOWERCASE_LETTER.
The following are examples of lowercase characters:
a b c d e f g h i j k l m n o p q r s t u v w x y z
'\u00DF' '\u00E0' '\u00E1' '\u00E2' '\u00E3' '\u00E4' '\u00E5' '\u00E6'
'\u00E7' '\u00E8' '\u00E9' '\u00EA' '\u00EB' '\u00EC' '\u00ED' '\u00EE'
'\u00EF' '\u00F0' '\u00F1' '\u00F2' '\u00F3' '\u00F4' '\u00F5' '\u00F6'
'\u00F8' '\u00F9' '\u00FA' '\u00FB' '\u00FC' '\u00FD' '\u00FE' '\u00FF'
Many other Unicode characters are lowercase too.
Note: This method cannot handle supplementary characters. To support
all Unicode characters, including supplementary characters, use
the isLowerCase(int) method.
| |||||||||||||||||||||||
| public static boolean | isLowerCase(int codePoint) Details
Determines if the specified character (Unicode code point) is a
lowercase character.
A character is lowercase if its general category type, provided
by getType(codePoint), is LOWERCASE_LETTER.
The following are examples of lowercase characters:
a b c d e f g h i j k l m n o p q r s t u v w x y z
'\u00DF' '\u00E0' '\u00E1' '\u00E2' '\u00E3' '\u00E4' '\u00E5' '\u00E6'
'\u00E7' '\u00E8' '\u00E9' '\u00EA' '\u00EB' '\u00EC' '\u00ED' '\u00EE'
'\u00EF' '\u00F0' '\u00F1' '\u00F2' '\u00F3' '\u00F4' '\u00F5' '\u00F6'
'\u00F8' '\u00F9' '\u00FA' '\u00FB' '\u00FC' '\u00FD' '\u00FE' '\u00FF'
Many other Unicode characters are lowercase too.
| |||||||||||||||||||||||
| public static boolean | isLowSurrogate(char ch) Details
Determines if the given char value is a
low-surrogate code unit (also known as trailing-surrogate code
unit). Such values do not represent characters by themselves,
but are used in the representation of supplementary characters in the UTF-16 encoding.
This method returns true if and only if ch >= '\uDC00' && ch <= '\uDFFF' is true.
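The range test can be sketched by hand and compared with the library method. The example below splits one supplementary code point into its two UTF-16 code units and checks each; LowSurrogateCheck and inLowSurrogateRange are hypothetical names used only for illustration.

```java
public class LowSurrogateCheck {
    // Hand-written version of the range test described above.
    static boolean inLowSurrogateRange(char ch) {
        return ch >= '\uDC00' && ch <= '\uDFFF';
    }

    public static void main(String[] args) {
        // U+1D11E (MUSICAL SYMBOL G CLEF) needs two char values in UTF-16.
        char[] units = Character.toChars(0x1D11E);
        for (char unit : units) {
            System.out.println(Integer.toHexString(unit)
                    + " low surrogate: " + Character.isLowSurrogate(unit)
                    + ", range test: " + inLowSurrogateRange(unit));
        }
    }
}
```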
| |||||||||||||||||||||||
| public static boolean | isMirrored(char ch) Details
Determines whether the character is mirrored according to the
Unicode specification. Mirrored characters should have their
glyphs horizontally mirrored when displayed in text that is
right-to-left. For example, '\u0028' LEFT
PARENTHESIS is semantically defined to be an opening
parenthesis. This will appear as a "(" in text that is
left-to-right but as a ")" in text that is right-to-left.
Note: This method cannot handle supplementary characters. To support
all Unicode characters, including supplementary characters, use
the isMirrored(int) method.
| |||||||||||||||||||||||
| public static boolean | isMirrored(int codePoint) Details
Determines whether the specified character (Unicode code point)
is mirrored according to the Unicode specification. Mirrored
characters should have their glyphs horizontally mirrored when
displayed in text that is right-to-left. For example,
'\u0028' LEFT PARENTHESIS is semantically
defined to be an opening parenthesis. This will appear
as a "(" in text that is left-to-right but as a ")" in text
that is right-to-left.
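As a small illustration, the sketch below picks out the mirrored characters in a string. MirroredDemo and mirroredOnly are hypothetical names, and the sample text is an arbitrary choice.

```java
public class MirroredDemo {
    // Collects the mirrored code points found in the text, in order.
    static String mirroredOnly(String text) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < text.length(); ) {
            int cp = text.codePointAt(i);
            if (Character.isMirrored(cp)) {
                out.appendCodePoint(cp);
            }
            i += Character.charCount(cp);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Parentheses and square brackets are mirrored; letters are not.
        System.out.println(mirroredOnly("f(x) = [a, b]"));
    }
}
```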
| |||||||||||||||||||||||
| public static int | getNumericValue(char ch) Details
Returns the int value that the specified Unicode
character represents. For example, the character
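'\u216C' (the Roman numeral fifty) will return
an int with a value of 50.
The letters A-Z in their uppercase ('\u0041' through '\u005A'), lowercase ('\u0061' through '\u007A'), and full width variant ('\uFF21' through '\uFF3A' and '\uFF41' through '\uFF5A') forms have numeric values from 10 through 35. This is independent of the Unicode specification, which does not assign numeric values to these char values.
If the character does not have a numeric value, then -1 is returned. If the character has a numeric value that cannot be represented as a nonnegative integer (for example, a fractional value), then -2 is returned.
Note: This method cannot handle supplementary characters. To support
all Unicode characters, including supplementary characters, use
the getNumericValue(int) method.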
| |||||||||||||||||||||||
| public static int | getNumericValue(int codePoint) Details
Returns the int value that the specified
character (Unicode code point) represents. For example, the character
'\u216C' (the Roman numeral fifty) will return
an int with a value of 50.
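The letters A-Z in their uppercase ('\u0041' through '\u005A'), lowercase ('\u0061' through '\u007A'), and full width variant ('\uFF21' through '\uFF3A' and '\uFF41' through '\uFF5A') forms have numeric values from 10 through 35. This is independent of the Unicode specification, which does not assign numeric values to these char values.
If the character does not have a numeric value, then -1 is returned. If the character has a numeric value that cannot be represented as a nonnegative integer (for example, a fractional value), then -2 is returned.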
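The three kinds of return value (an ordinary value, -1, and -2) can be told apart as in the following sketch. NumericValueDemo and describe are hypothetical names, and the sample characters are chosen only to exercise each case.

```java
public class NumericValueDemo {
    // Describes the three kinds of result getNumericValue can produce.
    static String describe(int codePoint) {
        int value = Character.getNumericValue(codePoint);
        if (value == -1) {
            return "no numeric value";
        }
        if (value == -2) {
            return "not a nonnegative integer";
        }
        return Integer.toString(value);
    }

    public static void main(String[] args) {
        // Roman numeral fifty, letters, a digit, the fraction one half,
        // and a punctuation mark.
        int[] samples = {'\u216C', 'A', 'z', '7', '\u00BD', '!'};
        for (int cp : samples) {
            System.out.println(Integer.toHexString(cp) + " -> " + describe(cp));
        }
    }
}
```

Because letters map to 10 through 35, callers that want only decimal digits would need a different test, such as Character.isDigit.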
| |||||||||||||||||||||||
| public static boolean | isSpace(char ch) Details
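Determines if the specified character is ISO-LATIN-1 white space.
This method returns true for the following five
characters only: '\t' ('\u0009', HORIZONTAL TABULATION), '\n' ('\u000A', NEW LINE), '\f' ('\u000C', FORM FEED), '\r' ('\u000D', CARRIAGE RETURN), and ' ' ('\u0020', SPACE).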
| |||||||||||||||||||||||
| public static boolean | isSpaceChar(char ch) Details
Determines if the specified character is a Unicode space character.
A character is considered to be a space character if and only if
it is specified to be a space character by the Unicode standard. This
method returns true if the character's general category type is any of
the following: SPACE_SEPARATOR, LINE_SEPARATOR, or PARAGRAPH_SEPARATOR.
Note: This method cannot handle supplementary characters. To support
all Unicode characters, including supplementary characters, use
the isSpaceChar(int) method.
| |||||||||||||||||||||||
| public static boolean | isSpaceChar(int codePoint) Details
Determines if the specified character (Unicode code point) is a
Unicode space character. A character is considered to be a
space character if and only if it is specified to be a space
character by the Unicode standard. This method returns true if
the character's general category type is any of the following:
SPACE_SEPARATOR, LINE_SEPARATOR, or PARAGRAPH_SEPARATOR.
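The category test can be sketched by hand with getType and compared against isSpaceChar. The sketch assumes the three separator categories SPACE_SEPARATOR, LINE_SEPARATOR, and PARAGRAPH_SEPARATOR; SpaceCharDemo and isSpaceByCategory are hypothetical names.

```java
public class SpaceCharDemo {
    // Hand-written category test: true for the three Unicode
    // separator categories, false for everything else.
    static boolean isSpaceByCategory(int codePoint) {
        int type = Character.getType(codePoint);
        return type == Character.SPACE_SEPARATOR
                || type == Character.LINE_SEPARATOR
                || type == Character.PARAGRAPH_SEPARATOR;
    }

    public static void main(String[] args) {
        // Space, no-break space, line separator, paragraph separator,
        // tab (a control character), and a letter.
        int[] samples = {' ', 0x00A0, 0x2028, 0x2029, '\t', 'a'};
        for (int cp : samples) {
            System.out.println(Integer.toHexString(cp) + ": "
                    + Character.isSpaceChar(cp) + " / " + isSpaceByCategory(cp));
        }
    }
}
```

A tab is a control character rather than a separator, so it is not a space character in this sense even though it is white space in the everyday sense.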
| |||||||||||||||||||||||
| public static boolean | isSupplementaryCodePoint(int codePoint) Details
Determines whether the specified character (Unicode code point)
is in the supplementary character range. The method call is
equivalent to the expression:
codePoint >= 0x10000 && codePoint <= 0x10FFFF
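The equivalence can be checked at the range boundaries with a short sketch. SupplementaryRangeDemo and inSupplementaryRange are hypothetical names introduced for this example.

```java
public class SupplementaryRangeDemo {
    // Hand-written form of the equivalent expression given above.
    static boolean inSupplementaryRange(int codePoint) {
        return codePoint >= 0x10000 && codePoint <= 0x10FFFF;
    }

    public static void main(String[] args) {
        // Values just below, at, and just above the range limits.
        int[] samples = {0xFFFF, 0x10000, 0x10FFFF, 0x110000};
        for (int cp : samples) {
            System.out.println(Integer.toHexString(cp) + ": "
                    + Character.isSupplementaryCodePoint(cp)
                    + " / " + inSupplementaryRange(cp));
        }
    }
}
```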
| |||||||||||||||||||||||
| public static boolean | isSurrogatePair(char high, char low) Details
Determines whether the specified pair of char
values is a valid surrogate pair. This method is equivalent to
the expression:
isHighSurrogate(high) && isLowSurrogate(low)
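A typical use is to validate two char values before combining them with Character.toCodePoint, which does not validate its arguments itself. In this sketch SurrogatePairDemo and decode are hypothetical names, and returning -1 for an invalid pair is a convention chosen only for the example.

```java
public class SurrogatePairDemo {
    // Decodes a surrogate pair to its code point, or returns -1
    // if the two values do not form a valid pair.
    static int decode(char high, char low) {
        if (!Character.isSurrogatePair(high, low)) {
            return -1;
        }
        return Character.toCodePoint(high, low);
    }

    public static void main(String[] args) {
        // A valid pair in the correct order.
        System.out.println(Integer.toHexString(decode('\uD834', '\uDD1E')));
        // The same two values in the wrong order.
        System.out.println(decode('\uDD1E', '\uD834'));
    }
}
```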
| |||||||||||||||||||||||
| public static boolean | isTitleCase(char ch) Details
Determines if the specified character is a titlecase character.
A character is a titlecase character if its general
category type, provided by Character.getType(ch), is TITLECASE_LETTER.
Some characters look like pairs of Latin letters. For example, there is an uppercase letter that looks like "LJ" and has a corresponding lowercase letter that looks like "lj". A third form, which looks like "Lj", is the appropriate form to use when rendering a word in lowercase with initial capitals, as for a book title.
These are some of the Unicode characters for which this method returns true:
LATIN CAPITAL LETTER D WITH SMALL LETTER Z WITH CARON
LATIN CAPITAL LETTER L WITH SMALL LETTER J
LATIN CAPITAL LETTER N WITH SMALL LETTER J
LATIN CAPITAL LETTER D WITH SMALL LETTER Z
Many other Unicode characters are titlecase too.
Note: This method cannot handle supplementary characters. To support
all Unicode characters, including supplementary characters, use
the isTitleCase(int) method.
| |||||||||||||||||||||||
| public static boolean | isTitleCase(int codePoint) Details
Determines if the specified character (Unicode code point) is a titlecase character.
A character is a titlecase character if its general
category type, provided by getType(codePoint), is TITLECASE_LETTER.
Some characters look like pairs of Latin letters. For example, there is an uppercase letter that looks like "LJ" and has a corresponding lowercase letter that looks like "lj". A third form, which looks like "Lj", is the appropriate form to use when rendering a word in lowercase with initial capitals, as for a book title.
These are some of the Unicode characters for which this method returns true:
LATIN CAPITAL LETTER D WITH SMALL LETTER Z WITH CARON
LATIN CAPITAL LETTER L WITH SMALL LETTER J
LATIN CAPITAL LETTER N WITH SMALL LETTER J
LATIN CAPITAL LETTER D WITH SMALL LETTER Z
Many other Unicode characters are titlecase too.
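The "LJ", "Lj", "lj" example corresponds to the code points U+01C7, U+01C8, and U+01C9. The following sketch classifies each of them; TitleCaseDemo and caseOf are hypothetical names.

```java
public class TitleCaseDemo {
    // Reports which of the three case categories a code point falls in.
    static String caseOf(int codePoint) {
        if (Character.isUpperCase(codePoint)) {
            return "upper";
        }
        if (Character.isTitleCase(codePoint)) {
            return "title";
        }
        if (Character.isLowerCase(codePoint)) {
            return "lower";
        }
        return "other";
    }

    public static void main(String[] args) {
        // The three forms of the LJ digraph, then an ordinary capital.
        int[] samples = {0x01C7, 0x01C8, 0x01C9, 'A'};
        for (int cp : samples) {
            System.out.println(Integer.toHexString(cp) + ": " + caseOf(cp));
        }
    }
}
```

An ordinary capital such as 'A' is uppercase, not titlecase, so isTitleCase returns false for it.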
| |||||||||||||||||||||||
| public static int | getType(char ch) Details
Returns a value indicating a character's general category.
Note: This method cannot handle supplementary characters. To support
all Unicode characters, including supplementary characters, use
the getType(int) method.
| |||||||||||||||||||||||
| public static int | getType(int codePoint) Details
Returns a value indicating a character's general category.
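The returned value is meant to be compared with the general category constants defined in Character, as in this sketch. GetTypeDemo and categoryName are hypothetical names, and only a handful of the available categories are handled here.

```java
public class GetTypeDemo {
    // Maps a few general category values to their Unicode abbreviations.
    static String categoryName(int codePoint) {
        int type = Character.getType(codePoint);
        if (type == Character.UPPERCASE_LETTER) {
            return "Lu";
        }
        if (type == Character.LOWERCASE_LETTER) {
            return "Ll";
        }
        if (type == Character.DECIMAL_DIGIT_NUMBER) {
            return "Nd";
        }
        if (type == Character.SPACE_SEPARATOR) {
            return "Zs";
        }
        if (type == Character.SURROGATE) {
            return "Cs";
        }
        return "other";
    }

    public static void main(String[] args) {
        // 0x10400 is a supplementary uppercase letter.
        int[] samples = {'A', 'a', '5', ' ', 0xD800, 0x10400, '!'};
        for (int cp : samples) {
            System.out.println(Integer.toHexString(cp) + ": " + categoryName(cp));
        }
    }
}
```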
| |||||||||||||||||||||||
| public static boolean | isUnicodeIdentifierPart(char ch) Details
Determines if the specified character may be part of a Unicode
identifier as other than the first character.
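A character may be part of a Unicode identifier if and only if one of the following statements is true:
it is a letter;
it is a connecting punctuation character (such as '_');
it is a digit;
it is a numeric letter (such as a Roman numeral character);
it is a combining mark;
it is a non-spacing mark;
isIdentifierIgnorable returns true for this character.
Note: This method cannot handle supplementary characters. To support
all Unicode characters, including supplementary characters, use
the isUnicodeIdentifierPart(int) method.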
| |||||||||||||||||||||||
| public static boolean | isUnicodeIdentifierPart(int codePoint) Details
Determines if the specified character (Unicode code point) may be part of a Unicode
identifier as other than the first character.
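A character may be part of a Unicode identifier if and only if one of the following statements is true:
it is a letter;
it is a connecting punctuation character (such as '_');
it is a digit;
it is a numeric letter (such as a Roman numeral character);
it is a combining mark;
it is a non-spacing mark;
isIdentifierIgnorable returns true for this character.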
| |||||||||||||||||||||||
| public static boolean | isUnicodeIdentifierStart(char ch) Details
Determines if the specified character is permissible as the
first character in a Unicode identifier.
A character may start a Unicode identifier if and only if one of the following conditions is true:
isLetter(ch) returns true;
getType(ch) returns LETTER_NUMBER.
Note: This method cannot handle supplementary characters. To support
all Unicode characters, including supplementary characters, use
the isUnicodeIdentifierStart(int) method.
| |||||||||||||||||||||||
| public static boolean | isUnicodeIdentifierStart(int codePoint) Details
Determines if the specified character (Unicode code point) is permissible as the
first character in a Unicode identifier.
A character may start a Unicode identifier if and only if one of the following conditions is true:
isLetter(codePoint) returns true;
getType(codePoint) returns LETTER_NUMBER.
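The start and part tests are usually combined to validate a whole identifier. The sketch below does this by code point; UnicodeIdentifierDemo and isUnicodeIdentifier are hypothetical names, and treating the empty string as invalid is an assumption of this example rather than something the API specifies.

```java
public class UnicodeIdentifierDemo {
    // Returns true if the first code point may start a Unicode identifier
    // and every following code point may be part of one.
    static boolean isUnicodeIdentifier(String text) {
        if (text.length() == 0) {
            return false; // assumption: an empty string is not an identifier
        }
        int first = text.codePointAt(0);
        if (!Character.isUnicodeIdentifierStart(first)) {
            return false;
        }
        for (int i = Character.charCount(first); i < text.length(); ) {
            int cp = text.codePointAt(i);
            if (!Character.isUnicodeIdentifierPart(cp)) {
                return false;
            }
            i += Character.charCount(cp);
        }
        return true;
    }

    public static void main(String[] args) {
        String[] samples = {"foo_1", "1foo", "_foo", "a b"};
        for (String s : samples) {
            System.out.println(s + ": " + isUnicodeIdentifier(s));
        }
    }
}
```

An underscore may appear inside a Unicode identifier but may not start one, which differs from the rules applied by isJavaIdentifierStart.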
| |||||||||||||||||||||||
| public static boolean | isUpperCase(char ch) Details
Determines if the specified character is an uppercase character.
A character is uppercase if its general category type, provided by
Character.getType(ch), is UPPERCASE_LETTER.
The following are examples of uppercase characters:
A B C D E F G H I J K L M N O P Q R S T U V W X Y Z '\u00C0' '\u00C1' '\u00C2' '\u00C3' '\u00C4' '\u00C5' '\u00C6' '\u00C7' '\u00C8' '\u00C9' '\u00CA' '\u00CB' '\u00CC' '\u00CD' '\u00CE' '\u00CF' '\u00D0' '\u00D1' '\u00D2' '\u00D3' '\u00D4' '\u00D5' '\u00D6' '\u00D8' '\u00D9' '\u00DA' '\u00DB' '\u00DC' '\u00DD' '\u00DE'
Many other Unicode characters are uppercase too.
Note: This method cannot handle supplementary characters. To support
all Unicode characters, including supplementary characters, use
the isUpperCase(int) method.
| |||||||||||||||||||||||
| public static boolean | isUpperCase(int codePoint) Details
Determines if the specified character (Unicode code point) is an uppercase character.
A character is uppercase if its general category type, provided by
getType(codePoint), is UPPERCASE_LETTER.
The following are examples of uppercase characters:
A B C D E F G H I J K L M N O P Q R S T U V W X Y Z '\u00C0' '\u00C1' '\u00C2' '\u00C3' '\u00C4' '\u00C5' '\u00C6' '\u00C7' '\u00C8' '\u00C9' '\u00CA' '\u00CB' '\u00CC' '\u00CD' '\u00CE' '\u00CF' '\u00D0' '\u00D1' '\u00D2' '\u00D3' '\u00D4' '\u00D5' '\u00D6' '\u00D8' '\u00D9' '\u00DA' '\u00DB' '\u00DC' '\u00DD' '\u00DE'
Many other Unicode characters are uppercase too.
| |||||||||||||||||||||||
| public static boolean | isValidCodePoint(int codePoint) Details
Determines whether the specified code point is a valid Unicode
code point value in the range of 0x0000 to
0x10FFFF inclusive. This method is equivalent to
the expression:
codePoint >= 0x0000 && codePoint <= 0x10FFFF
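One place this check is useful is before calling Character.toChars, which throws IllegalArgumentException for an invalid code point. In the sketch below, ValidCodePointDemo and toStringOrEmpty are hypothetical names, and returning an empty string for invalid input is a choice made only for this example.

```java
public class ValidCodePointDemo {
    // Converts a code point to a String, or returns "" if it is not
    // a valid Unicode code point.
    static String toStringOrEmpty(int codePoint) {
        if (!Character.isValidCodePoint(codePoint)) {
            return "";
        }
        return new String(Character.toChars(codePoint));
    }

    public static void main(String[] args) {
        // Below the range, at both limits, and above the range.
        int[] samples = {-1, 0x0000, 0x0041, 0x10FFFF, 0x110000};
        for (int cp : samples) {
            System.out.println(cp + ": valid=" + Character.isValidCodePoint(cp)
                    + ", chars=" + toStringOrEmpty(cp).length());
        }
    }
}
```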
| |||||||||||||||||||||||
| public static boolean | isWhitespace(char ch) Details
Determines if the specified character is white space according to Java.
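A character is a Java whitespace character if and only if it satisfies
one of the following criteria:
It is a Unicode space character (SPACE_SEPARATOR, LINE_SEPARATOR, or PARAGRAPH_SEPARATOR) but is not also a non-breaking space ('\u00A0', '\u2007', '\u202F').
It is '\u0009', HORIZONTAL TABULATION.
It is '\u000A', LINE FEED.
It is '\u000B', VERTICAL TABULATION.
It is '\u000C', FORM FEED.
It is '\u000D', CARRIAGE RETURN.
It is '\u001C', FILE SEPARATOR.
It is '\u001D', GROUP SEPARATOR.
It is '\u001E', RECORD SEPARATOR.
It is '\u001F', UNIT SEPARATOR.
Note: This method cannot handle supplementary characters. To support
all Unicode characters, including supplementary characters, use
the isWhitespace(int) method.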
| |||||||||||||||||||||||
| public static boolean | isWhitespace(int codePoint) Details
Determines if the specified character (Unicode code point) is
white space according to Java. A character is a Java
whitespace character if and only if it satisfies one of the
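following criteria:
It is a Unicode space character (SPACE_SEPARATOR, LINE_SEPARATOR, or PARAGRAPH_SEPARATOR) but is not also a non-breaking space ('\u00A0', '\u2007', '\u202F').
It is '\u0009', HORIZONTAL TABULATION.
It is '\u000A', LINE FEED.
It is '\u000B', VERTICAL TABULATION.
It is '\u000C', FORM FEED.
It is '\u000D', CARRIAGE RETURN.
It is '\u001C', FILE SEPARATOR.
It is '\u001D', GROUP SEPARATOR.
It is '\u001E', RECORD SEPARATOR.
It is '\u001F', UNIT SEPARATOR.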
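As a sketch of how this test behaves in practice, the following example collapses runs of Java whitespace into single spaces and trims the ends. WhitespaceDemo and collapseWhitespace are hypothetical names. The example relies on the no-break space U+00A0 being a Unicode space character that is not Java whitespace.

```java
public class WhitespaceDemo {
    // Replaces each run of Java whitespace with one space and drops
    // leading and trailing whitespace. Scans by code point.
    static String collapseWhitespace(String text) {
        StringBuilder out = new StringBuilder();
        boolean pendingSpace = false;
        for (int i = 0; i < text.length(); ) {
            int cp = text.codePointAt(i);
            i += Character.charCount(cp);
            if (Character.isWhitespace(cp)) {
                // Only remember the gap once some output exists,
                // so leading whitespace is dropped.
                pendingSpace = out.length() > 0;
            } else {
                if (pendingSpace) {
                    out.append(' ');
                }
                out.appendCodePoint(cp);
                pendingSpace = false;
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println("[" + collapseWhitespace("  a\t\n b  ") + "]");
        // The no-break space is kept because isWhitespace rejects it.
        System.out.println(collapseWhitespace("a\u00A0b").length());
    }
}
```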
| |||||||||||||||||||||||
Java SE 6 · Copyright © 1994-2009 Sun Microsystems, Inc. All rights reserved. Use is subject to license terms.