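月 is a bright mirror, crystal clear, standing for a blank sheet of paper (knowing nothing).
央 is an ocean that takes in every river, standing for a sponge (absorbing everything).
泽 is a sharp sword, tempered a thousand times, standing for repeated practice (input and output).
月央泽 describes a learning process: start as a blank sheet -> absorb all kinds of knowledge -> keep taking in and putting out until the material becomes your own.
I hope we can all stick with this process, and that in the end everyone goes from zero back to zero: growing knowledge from thin to thick, then distilling it from thick back to thin!
Let's go straight to the source and its comments (my translation may not be accurate; if you have a better reading, leave a comment or send me a message).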
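2. Class diagram of Character:
a). Comparable: this interface lets objects of the implementing class be ordered according to a defined rule. For details, follow the link below.
......(pretend this is a link; to be added later)
b). Serializable: this interface marks objects of the implementing class as serializable. For details, follow the link below.
......(pretend this is a link; to be added later)
3. Member variables: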
public static final int MIN_RADIX = 2;
public static final int MAX_RADIX = 36;
public static final char MIN_VALUE = '\u0000';
public static final char MAX_VALUE = '\uFFFF';
@SuppressWarnings("unchecked")
public static final Class<Character> TYPE = (Class<Character>) Class.getPrimitiveClass("char");
Below are the Unicode general category constants; a passing familiarity is enough:
public static final byte UNASSIGNED = 0;
public static final byte UPPERCASE_LETTER = 1;
public static final byte LOWERCASE_LETTER = 2;
public static final byte TITLECASE_LETTER = 3;
public static final byte MODIFIER_LETTER = 4;
public static final byte OTHER_LETTER = 5;
public static final byte NON_SPACING_MARK = 6;
public static final byte ENCLOSING_MARK = 7;
public static final byte COMBINING_SPACING_MARK = 8;
public static final byte DECIMAL_DIGIT_NUMBER = 9;
public static final byte LETTER_NUMBER = 10;
public static final byte OTHER_NUMBER = 11;
public static final byte SPACE_SEPARATOR = 12;
public static final byte LINE_SEPARATOR = 13;
public static final byte PARAGRAPH_SEPARATOR = 14;
public static final byte CONTROL = 15;
public static final byte FORMAT = 16;
public static final byte PRIVATE_USE = 18;
public static final byte SURROGATE = 19;
public static final byte DASH_PUNCTUATION = 20;
public static final byte START_PUNCTUATION = 21;
public static final byte END_PUNCTUATION = 22;
public static final byte CONNECTOR_PUNCTUATION = 23;
public static final byte OTHER_PUNCTUATION = 24;
public static final byte MATH_SYMBOL = 25;
public static final byte CURRENCY_SYMBOL = 26;
public static final byte MODIFIER_SYMBOL = 27;
public static final byte OTHER_SYMBOL = 28;
public static final byte INITIAL_QUOTE_PUNCTUATION = 29;
public static final byte FINAL_QUOTE_PUNCTUATION = 30;
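To make these constants less abstract, here is a small sketch that maps a few of them to readable labels through Character.getType. The class name GeneralCategoryDemo and the label strings are my own, and only a handful of categories are covered.

```java
final class GeneralCategoryDemo {
    // Translates the category constant returned by getType into a label.
    // Only a few categories are handled; everything else becomes "other".
    static String describe(int codePoint) {
        switch (Character.getType(codePoint)) {
            case Character.UPPERCASE_LETTER:     return "uppercase letter";
            case Character.LOWERCASE_LETTER:     return "lowercase letter";
            case Character.DECIMAL_DIGIT_NUMBER: return "decimal digit";
            case Character.SPACE_SEPARATOR:      return "space separator";
            case Character.CONTROL:              return "control";
            case Character.CURRENCY_SYMBOL:      return "currency symbol";
            default:                             return "other";
        }
    }

    public static void main(String[] args) {
        for (char c : new char[] {'A', 'z', '7', ' ', '\n', '$'}) {
            System.out.println((int) c + " -> " + describe(c));
        }
    }
}
```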
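Below are the bidirectional character types from the Unicode specification (strong, weak, neutral, and explicit formatting types); a passing familiarity is enough: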
public static final byte DIRECTIONALITY_UNDEFINED = -1;
public static final byte DIRECTIONALITY_LEFT_TO_RIGHT = 0;
public static final byte DIRECTIONALITY_RIGHT_TO_LEFT = 1;
public static final byte DIRECTIONALITY_RIGHT_TO_LEFT_ARABIC = 2;
public static final byte DIRECTIONALITY_EUROPEAN_NUMBER = 3;
public static final byte DIRECTIONALITY_EUROPEAN_NUMBER_SEPARATOR = 4;
public static final byte DIRECTIONALITY_EUROPEAN_NUMBER_TERMINATOR = 5;
public static final byte DIRECTIONALITY_ARABIC_NUMBER = 6;
public static final byte DIRECTIONALITY_COMMON_NUMBER_SEPARATOR = 7;
public static final byte DIRECTIONALITY_NONSPACING_MARK = 8;
public static final byte DIRECTIONALITY_BOUNDARY_NEUTRAL = 9;
public static final byte DIRECTIONALITY_PARAGRAPH_SEPARATOR = 10;
public static final byte DIRECTIONALITY_SEGMENT_SEPARATOR = 11;
public static final byte DIRECTIONALITY_WHITESPACE = 12;
public static final byte DIRECTIONALITY_OTHER_NEUTRALS = 13;
public static final byte DIRECTIONALITY_LEFT_TO_RIGHT_EMBEDDING = 14;
public static final byte DIRECTIONALITY_LEFT_TO_RIGHT_OVERRIDE = 15;
public static final byte DIRECTIONALITY_RIGHT_TO_LEFT_EMBEDDING = 16;
public static final byte DIRECTIONALITY_RIGHT_TO_LEFT_OVERRIDE = 17;
public static final byte DIRECTIONALITY_POP_DIRECTIONAL_FORMAT = 18;
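As a rough illustration of how these values are used, the sketch below checks whether a code point has one of the two strong right-to-left types. DirectionalityDemo and isStrongRightToLeft are hypothetical names of mine, not part of the JDK.

```java
final class DirectionalityDemo {
    // True when the code point is strongly right-to-left
    // (Hebrew-style R or Arabic-style AL in Unicode terms).
    static boolean isStrongRightToLeft(int codePoint) {
        byte d = Character.getDirectionality(codePoint);
        return d == Character.DIRECTIONALITY_RIGHT_TO_LEFT
            || d == Character.DIRECTIONALITY_RIGHT_TO_LEFT_ARABIC;
    }

    public static void main(String[] args) {
        System.out.println(isStrongRightToLeft('A'));    // Latin letter
        System.out.println(isStrongRightToLeft(0x05D0)); // Hebrew letter alef
        System.out.println(isStrongRightToLeft(0x0627)); // Arabic letter alef
    }
}
```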
Other member variables:
static final int ERROR = 0xFFFFFFFF;
public static final char MIN_HIGH_SURROGATE = '\uD800';
public static final char MAX_HIGH_SURROGATE = '\uDBFF';
public static final char MIN_LOW_SURROGATE = '\uDC00';
public static final char MAX_LOW_SURROGATE = '\uDFFF';
public static final char MIN_SURROGATE = MIN_HIGH_SURROGATE;
public static final char MAX_SURROGATE = MAX_LOW_SURROGATE;
public static final int MIN_SUPPLEMENTARY_CODE_POINT = 0x010000;
public static final int MIN_CODE_POINT = 0x000000;
public static final int MAX_CODE_POINT = 0x10FFFF;
public static final int SIZE = 16;
public static final int BYTES = SIZE / Byte.SIZE;
4. Inner classes:
Subset
public static class Subset {
// name of the subset
private String name;
protected Subset(String name) {
if (name == null) {
throw new NullPointerException("name");
}
this.name = name;
}
public final boolean equals(Object obj) {
return (this == obj);
}
public final int hashCode() {
return super.hashCode();
}
public final String toString() {
return name;
}
}
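Because the constructor is protected and equals/hashCode are final, Subset is meant to be extended, with each instance compared by identity. A minimal sketch of a custom subset, assuming a made-up VowelSubset class and contains method (neither exists in the JDK):

```java
final class VowelSubset extends Character.Subset {
    // One shared instance; identity is what equals() compares.
    static final VowelSubset ASCII_VOWELS = new VowelSubset("ASCII_VOWELS");

    VowelSubset(String name) {
        super(name); // throws NullPointerException when name is null
    }

    // Hypothetical membership test; Subset itself defines no such method.
    boolean contains(char c) {
        return "aeiouAEIOU".indexOf(c) >= 0;
    }

    public static void main(String[] args) {
        System.out.println(ASCII_VOWELS + " contains e: " + ASCII_VOWELS.contains('e'));
    }
}
```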
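UnicodeBlock and UnicodeScript are the Unicode blocks and scripts that Character maintains internally. They are long, so if you are interested, browse the source yourself; I won't paste them here.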
CharacterCache
// internally maintains a cache array of Character objects
private static class CharacterCache {
private CharacterCache(){}
// a Character array of length 128
static final Character cache[] = new Character[127 + 1];
static {
for (int i = 0; i < cache.length; i++)
// values range from 0 to 127
cache[i] = new Character((char)i);
}
}
5. Constructor:
public Character(char value) {
this.value = value;
}
6. Methods:
valueOf(char c)
public static Character valueOf(char c) {
if (c <= 127) { // must cache
return CharacterCache.cache[(int)c];
}
return new Character(c);
}
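The effect of the cache is easy to observe: for values 0 to 127, valueOf hands back the same object every time. A small sketch (CharacterCacheDemo is my own name); for values above 127 the code above allocates a new object, but I only rely on the cached range here, and equals remains the right way to compare boxed characters.

```java
final class CharacterCacheDemo {
    // Identity comparison on purpose: true only if both calls
    // return the very same cached object.
    static boolean sameInstance(char c) {
        return Character.valueOf(c) == Character.valueOf(c);
    }

    public static void main(String[] args) {
        System.out.println(sameInstance('a'));        // within the cached range
        System.out.println(sameInstance((char) 127)); // last cached value
        // Value equality holds regardless of caching.
        System.out.println(Character.valueOf('\u4E2D').equals(Character.valueOf('\u4E2D')));
    }
}
```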
charValue
public char charValue() {
return value;
}
hashCode
@Override
public int hashCode() {
return Character.hashCode(value);
}
public static int hashCode(char value) {
return (int)value;
}
equals(Object obj)
public boolean equals(Object obj) {
if (obj instanceof Character) {
return value == ((Character)obj).charValue();
}
return false;
}
toString()
public String toString() {
// put the Character's value into a char array, then call String.valueOf()
char buf[] = {value};
return String.valueOf(buf);
}
public static String toString(char c) {
return String.valueOf(c);
}
A series of methods for working with code points:
public static boolean isValidCodePoint(int codePoint) {
// Optimized form of:
// codePoint >= MIN_CODE_POINT && codePoint <= MAX_CODE_POINT
int plane = codePoint >>> 16;
return plane < ((MAX_CODE_POINT + 1) >>> 16);
}
public static boolean isBmpCodePoint(int codePoint) {
return codePoint >>> 16 == 0;
// Optimized form of:
// codePoint >= MIN_VALUE && codePoint <= MAX_VALUE
// We consistently use logical shift (>>>) to facilitate
// additional runtime optimizations.
}
public static boolean isSupplementaryCodePoint(int codePoint) {
return codePoint >= MIN_SUPPLEMENTARY_CODE_POINT
&& codePoint < MAX_CODE_POINT + 1;
}
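The shift-based checks above are described in the comments as optimized forms of plain range comparisons. One way to convince yourself is to compare the library result with the naive range check on boundary values. CodePointRangeCheck and its methods are hypothetical helpers of mine.

```java
final class CodePointRangeCheck {
    // Straightforward range checks, written the way the source comments describe.
    static boolean naiveIsValid(int cp) {
        return cp >= Character.MIN_CODE_POINT && cp <= Character.MAX_CODE_POINT;
    }

    static boolean naiveIsBmp(int cp) {
        return cp >= Character.MIN_VALUE && cp <= Character.MAX_VALUE;
    }

    // True when the library methods agree with the naive checks for cp.
    static boolean agreesOn(int cp) {
        return naiveIsValid(cp) == Character.isValidCodePoint(cp)
            && naiveIsBmp(cp) == Character.isBmpCodePoint(cp);
    }

    public static void main(String[] args) {
        int[] samples = {Integer.MIN_VALUE, -1, 0, 0xFFFF, 0x10000,
                         0x10FFFF, 0x110000, Integer.MAX_VALUE};
        for (int cp : samples) {
            System.out.println(Integer.toHexString(cp) + " agrees: " + agreesOn(cp));
        }
    }
}
```

Negative inputs are the interesting case: the unsigned shift turns them into a large plane number, so they fail both checks without an explicit lower-bound comparison.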
public static int toCodePoint(char high, char low) {
// Optimized form of:
// return ((high - MIN_HIGH_SURROGATE) << 10)
// + (low - MIN_LOW_SURROGATE)
// + MIN_SUPPLEMENTARY_CODE_POINT;
return ((high << 10) + low) + (MIN_SUPPLEMENTARY_CODE_POINT
- (MIN_HIGH_SURROGATE << 10)
- MIN_LOW_SURROGATE);
}
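A worked example of the arithmetic in the comment helps here. Take U+1F600, which UTF-16 stores as the pair D83D DE00: (0xD83D - 0xD800) << 10 is 0xF400, 0xDE00 - 0xDC00 is 0x200, and adding 0x10000 gives 0x1F600. The sketch below writes out the unoptimized formula; SurrogateMath and combine are names I made up.

```java
final class SurrogateMath {
    // Unoptimized form from the source comment: each surrogate contributes
    // 10 bits, then the supplementary base 0x10000 is added.
    static int combine(char high, char low) {
        return ((high - Character.MIN_HIGH_SURROGATE) << 10)
             + (low - Character.MIN_LOW_SURROGATE)
             + Character.MIN_SUPPLEMENTARY_CODE_POINT;
    }

    public static void main(String[] args) {
        char high = '\uD83D';
        char low = '\uDE00';
        System.out.println(Integer.toHexString(combine(high, low)));
        System.out.println(combine(high, low) == Character.toCodePoint(high, low));
    }
}
```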
public static int codePointAt(CharSequence seq, int index) {
char c1 = seq.charAt(index);
if (isHighSurrogate(c1) && ++index < seq.length()) {
char c2 = seq.charAt(index);
if (isLowSurrogate(c2)) {
return toCodePoint(c1, c2);
}
}
return c1;
}
public static int codePointAt(char[] a, int index) {
return codePointAtImpl(a, index, a.length);
}
public static int codePointAt(char[] a, int index, int limit) {
if (index >= limit || limit < 0 || limit > a.length) {
throw new IndexOutOfBoundsException();
}
return codePointAtImpl(a, index, limit);
}
// throws ArrayIndexOutOfBoundsException if index out of bounds
static int codePointAtImpl(char[] a, int index, int limit) {
char c1 = a[index];
if (isHighSurrogate(c1) && ++index < limit) {
char c2 = a[index];
if (isLowSurrogate(c2)) {
return toCodePoint(c1, c2);
}
}
return c1;
}
public static int codePointBefore(CharSequence seq, int index) {
char c2 = seq.charAt(--index);
if (isLowSurrogate(c2) && index > 0) {
char c1 = seq.charAt(--index);
if (isHighSurrogate(c1)) {
return toCodePoint(c1, c2);
}
}
return c2;
}
public static int codePointBefore(char[] a, int index) {
return codePointBeforeImpl(a, index, 0);
}
public static int codePointBefore(char[] a, int index, int start) {
if (index <= start || start < 0 || start >= a.length) {
throw new IndexOutOfBoundsException();
}
return codePointBeforeImpl(a, index, start);
}
// throws ArrayIndexOutOfBoundsException if index-1 out of bounds
static int codePointBeforeImpl(char[] a, int index, int start) {
char c2 = a[--index];
if (isLowSurrogate(c2) && index > start) {
char c1 = a[--index];
if (isHighSurrogate(c1)) {
return toCodePoint(c1, c2);
}
}
return c2;
}
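In practice these methods are what you reach for when walking a string one code point at a time instead of one char at a time. A minimal sketch, assuming a hypothetical CodePointWalk helper:

```java
import java.util.ArrayList;
import java.util.List;

final class CodePointWalk {
    // Collects the code points of seq, stepping by 1 or 2 chars as needed.
    static List<Integer> codePoints(CharSequence seq) {
        List<Integer> out = new ArrayList<>();
        for (int i = 0; i < seq.length(); ) {
            int cp = Character.codePointAt(seq, i);
            out.add(cp);
            i += Character.charCount(cp);
        }
        return out;
    }

    public static void main(String[] args) {
        // 'a', U+1F600 (stored as a surrogate pair), 'b'
        String s = "a\uD83D\uDE00b";
        for (int cp : codePoints(s)) {
            System.out.println("U+" + Integer.toHexString(cp).toUpperCase());
        }
    }
}
```

Note that pointing codePointAt at the second half of a pair, or codePointBefore just after the first half, simply returns that lone surrogate value, exactly as the fall-through `return c1` / `return c2` lines above suggest.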
public static int codePointCount(CharSequence seq, int beginIndex, int endIndex) {
int length = seq.length();
if (beginIndex < 0 || endIndex > length || beginIndex > endIndex) {
throw new IndexOutOfBoundsException();
}
int n = endIndex - beginIndex;
for (int i = beginIndex; i < endIndex; ) {
if (isHighSurrogate(seq.charAt(i++)) && i < endIndex &&
isLowSurrogate(seq.charAt(i))) {
n--;
i++;
}
}
return n;
}
public static int codePointCount(char[] a, int offset, int count) {
if (count > a.length - offset || offset < 0 || count < 0) {
throw new IndexOutOfBoundsException();
}
return codePointCountImpl(a, offset, count);
}
static int codePointCountImpl(char[] a, int offset, int count) {
int endIndex = offset + count;
int n = count;
for (int i = offset; i < endIndex; ) {
if (isHighSurrogate(a[i++]) && i < endIndex &&
isLowSurrogate(a[i])) {
n--;
i++;
}
}
return n;
}
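The counting logic starts from the number of chars and subtracts one for each well-formed surrogate pair, so an unpaired surrogate still counts as one code point. A quick sketch to see the difference from length(); CodePointCountDemo is my own wrapper name.

```java
final class CodePointCountDemo {
    // Number of code points in the whole string.
    static int count(String s) {
        return Character.codePointCount(s, 0, s.length());
    }

    public static void main(String[] args) {
        String paired = "a\uD83D\uDE00b"; // 4 chars, one surrogate pair
        String lone = "a\uD83Db";         // 3 chars, unpaired high surrogate
        System.out.println(paired.length() + " chars, " + count(paired) + " code points");
        System.out.println(lone.length() + " chars, " + count(lone) + " code points");
    }
}
```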
public static int offsetByCodePoints(CharSequence seq, int index,
int codePointOffset) {
int length = seq.length();
if (index < 0 || index > length) {
throw new IndexOutOfBoundsException();
}
int x = index;
if (codePointOffset >= 0) {
int i;
for (i = 0; x < length && i < codePointOffset; i++) {
if (isHighSurrogate(seq.charAt(x++)) && x < length &&
isLowSurrogate(seq.charAt(x))) {
x++;
}
}
if (i < codePointOffset) {
throw new IndexOutOfBoundsException();
}
} else {
int i;
for (i = codePointOffset; x > 0 && i < 0; i++) {
if (isLowSurrogate(seq.charAt(--x)) && x > 0 &&
isHighSurrogate(seq.charAt(x-1))) {
x--;
}
}
if (i < 0) {
throw new IndexOutOfBoundsException();
}
}
return x;
}
public static int offsetByCodePoints(char[] a, int start, int count,
int index, int codePointOffset) {
if (count > a.length-start || start < 0 || count < 0
|| index < start || index > start+count) {
throw new IndexOutOfBoundsException();
}
return offsetByCodePointsImpl(a, start, count, index, codePointOffset);
}
static int offsetByCodePointsImpl(char[]a, int start, int count,
int index, int codePointOffset) {
int x = index;
if (codePointOffset >= 0) {
int limit = start + count;
int i;
for (i = 0; x < limit && i < codePointOffset; i++) {
if (isHighSurrogate(a[x++]) && x < limit &&
isLowSurrogate(a[x])) {
x++;
}
}
if (i < codePointOffset) {
throw new IndexOutOfBoundsException();
}
} else {
int i;
for (i = codePointOffset; x > start && i < 0; i++) {
if (isLowSurrogate(a[--x]) && x > start &&
isHighSurrogate(a[x-1])) {
x--;
}
}
if (i < 0) {
throw new IndexOutOfBoundsException();
}
}
return x;
}
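A typical use of offsetByCodePoints is cutting a string at a code point boundary so that a surrogate pair is never split. The sketch below is one way to do that; CodePointSubstring and firstCodePoints are hypothetical names.

```java
final class CodePointSubstring {
    // Returns the first n code points of s. Throws IndexOutOfBoundsException
    // when s holds fewer than n code points, mirroring the checks above.
    static String firstCodePoints(String s, int n) {
        int end = Character.offsetByCodePoints(s, 0, n);
        return s.substring(0, end);
    }

    public static void main(String[] args) {
        String s = "\uD83D\uDE00ab"; // U+1F600 followed by "ab"
        System.out.println(firstCodePoints(s, 2).length()); // pair + 'a' = 3 chars
        // A negative offset walks backwards from the given index.
        System.out.println(Character.offsetByCodePoints("ab\uD83D\uDE00", 4, -1));
    }
}
```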
A series of methods for working with surrogates:
public static boolean isHighSurrogate(char ch) {
// Help VM constant-fold; MAX_HIGH_SURROGATE + 1 == MIN_LOW_SURROGATE
return ch >= MIN_HIGH_SURROGATE && ch < (MAX_HIGH_SURROGATE + 1);
}
public static boolean isLowSurrogate(char ch) {
return ch >= MIN_LOW_SURROGATE && ch < (MAX_LOW_SURROGATE + 1);
}
public static boolean isSurrogate(char ch) {
return ch >= MIN_SURROGATE && ch < (MAX_SURROGATE + 1);
}
public static boolean isSurrogatePair(char high, char low) {
return isHighSurrogate(high) && isLowSurrogate(low);
}
public static char highSurrogate(int codePoint) {
return (char) ((codePoint >>> 10)
+ (MIN_HIGH_SURROGATE - (MIN_SUPPLEMENTARY_CODE_POINT >>> 10)));
}
public static char lowSurrogate(int codePoint) {
return (char) ((codePoint & 0x3ff) + MIN_LOW_SURROGATE);
}
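highSurrogate and lowSurrogate are the inverse of toCodePoint. For U+1F600: 0x1F600 >>> 10 is 0x7D, and 0xD800 - 0x40 is 0xD7C0, so the high surrogate is 0xD83D; the low 10 bits are 0x200, giving 0xDE00. A round-trip sketch, with SurrogateSplit being a name of my own:

```java
final class SurrogateSplit {
    // Splits a supplementary code point into its {high, low} surrogate pair.
    static char[] split(int codePoint) {
        if (!Character.isSupplementaryCodePoint(codePoint)) {
            throw new IllegalArgumentException("not a supplementary code point");
        }
        return new char[] {
            Character.highSurrogate(codePoint),
            Character.lowSurrogate(codePoint)
        };
    }

    public static void main(String[] args) {
        char[] pair = split(0x1F600);
        System.out.println(Integer.toHexString(pair[0]) + " " + Integer.toHexString(pair[1]));
        System.out.println(Character.toCodePoint(pair[0], pair[1]) == 0x1F600);
    }
}
```

The explicit check matters because the two library methods do no validation themselves; for a BMP input they would return values that are not a real surrogate pair.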
charCount(int codePoint)
public static int charCount(int codePoint) {
return codePoint >= MIN_SUPPLEMENTARY_CODE_POINT ? 2 : 1;
}
Converting the specified character (code point) to its UTF-16 representation:
public static int toChars(int codePoint, char[] dst, int dstIndex) {
if (isBmpCodePoint(codePoint)) {
dst[dstIndex] = (char) codePoint;
return 1;
} else if (isValidCodePoint(codePoint)) {
toSurrogates(codePoint, dst, dstIndex);
return 2;
} else {
throw new IllegalArgumentException();
}
}
public static char[] toChars(int codePoint) {
if (isBmpCodePoint(codePoint)) {
return new char[] { (char) codePoint };
} else if (isValidCodePoint(codePoint)) {
char[] result = new char[2];
toSurrogates(codePoint, result, 0);
return result;
} else {
throw new IllegalArgumentException();
}
}
static void toSurrogates(int codePoint, char[] dst, int index) {
// We write elements "backwards" to guarantee all-or-nothing
dst[index+1] = lowSurrogate(codePoint);
dst[index] = highSurrogate(codePoint);
}
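The three-argument toChars is handy when building a string from code points, since one two-element buffer can be reused. A rough sketch under that assumption (ToCharsDemo and fromCodePoints are my names):

```java
final class ToCharsDemo {
    // Builds a String from code points, reusing a two-char scratch buffer.
    static String fromCodePoints(int... codePoints) {
        StringBuilder sb = new StringBuilder();
        char[] buf = new char[2];
        for (int cp : codePoints) {
            int written = Character.toChars(cp, buf, 0); // 1 or 2
            sb.append(buf, 0, written);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String s = fromCodePoints('H', 'i', 0x1F600);
        System.out.println(s.length());                   // chars, not code points
        System.out.println(Character.charCount(0x1F600)); // chars needed for one code point
    }
}
```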
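There are also many methods that check what kind of character a value is. There are a lot of them, so have a look yourself if you are interested (I certainly learned something).
Converting a character to lowercase, uppercase, or title case: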
public static char toLowerCase(char ch) {
return (char)toLowerCase((int)ch);
}
public static int toLowerCase(int codePoint) {
return CharacterData.of(codePoint).toLowerCase(codePoint);
}
public static char toUpperCase(char ch) {
return (char)toUpperCase((int)ch);
}
public static int toUpperCase(int codePoint) {
return CharacterData.of(codePoint).toUpperCase(codePoint);
}
public static char toTitleCase(char ch) {
return (char)toTitleCase((int)ch);
}
public static int toTitleCase(int codePoint) {
return CharacterData.of(codePoint).toTitleCase(codePoint);
}
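Title case differs from upper case only for a few characters, such as the digraph U+01C6, whose upper case is U+01C4 and whose title case is U+01C5. The sketch below title-cases the first code point of a string using the int overloads so supplementary characters work too; CaseMappingDemo and titleCaseFirst are hypothetical names.

```java
final class CaseMappingDemo {
    // Title-cases the first code point and leaves the rest untouched.
    static String titleCaseFirst(String s) {
        if (s.isEmpty()) {
            return s;
        }
        int first = s.codePointAt(0);
        int restStart = Character.charCount(first);
        return new StringBuilder()
            .appendCodePoint(Character.toTitleCase(first))
            .append(s, restStart, s.length())
            .toString();
    }

    public static void main(String[] args) {
        System.out.println(titleCaseFirst("hello"));
        // Deseret capital U+10400 maps to small U+10428 via the int overload.
        System.out.println(Integer.toHexString(Character.toLowerCase(0x10400)));
    }
}
```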
digit()
public static int digit(char ch, int radix) {
return digit((int)ch, radix);
}
public static int digit(int codePoint, int radix) {
return CharacterData.of(codePoint).digit(codePoint, radix);
}
getNumericValue
public static int getNumericValue(char ch) {
return getNumericValue((int)ch);
}
public static int getNumericValue(int codePoint) {
return CharacterData.of(codePoint).getNumericValue(codePoint);
}
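digit is the building block for parsing numbers in an arbitrary radix: it returns -1 when the character is not a digit in that radix. A minimal parser sketch follows; DigitDemo and parseUnsigned are my own names, overflow is deliberately not handled, and an empty string yields 0.

```java
final class DigitDemo {
    // Parses a non-negative number in the given radix, one char at a time.
    // Simplified: no sign handling and no overflow detection.
    static int parseUnsigned(String s, int radix) {
        int result = 0;
        for (int i = 0; i < s.length(); i++) {
            int d = Character.digit(s.charAt(i), radix);
            if (d < 0) {
                throw new NumberFormatException("bad digit at index " + i);
            }
            result = result * radix + d;
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(parseUnsigned("ff", 16));
        System.out.println(parseUnsigned("101", 2));
        // getNumericValue is looser: letters count as 10..35 with no radix.
        System.out.println(Character.getNumericValue('x'));
    }
}
```

getNumericValue also understands characters such as Roman numerals, and returns -2 when the value is not a non-negative integer (a fraction like U+00BD, for instance).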
Checking for spaces:
public static boolean isSpaceChar(char ch) {
return isSpaceChar((int)ch);
}
public static boolean isSpaceChar(int codePoint) {
return ((((1 << Character.SPACE_SEPARATOR) |
(1 << Character.LINE_SEPARATOR) |
(1 << Character.PARAGRAPH_SEPARATOR)) >> getType(codePoint)) & 1)
!= 0;
}
public static boolean isWhitespace(char ch) {
return isWhitespace((int)ch);
}
public static boolean isWhitespace(int codePoint) {
return CharacterData.of(codePoint).isWhitespace(codePoint);
}
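The bit trick in isSpaceChar builds a mask with one bit per accepted category, shifts it right by the character's type, and tests the lowest bit. The sketch below rebuilds that mask by hand and also shows where isSpaceChar and isWhitespace disagree. SpaceCheckDemo and isSpaceCharByMask are hypothetical names.

```java
final class SpaceCheckDemo {
    // One bit per accepted general category, as in isSpaceChar above.
    static final int SPACE_MASK =
          (1 << Character.SPACE_SEPARATOR)
        | (1 << Character.LINE_SEPARATOR)
        | (1 << Character.PARAGRAPH_SEPARATOR);

    static boolean isSpaceCharByMask(int codePoint) {
        return ((SPACE_MASK >> Character.getType(codePoint)) & 1) != 0;
    }

    public static void main(String[] args) {
        int[] samples = {' ', '\n', 0x00A0, 0x2028};
        for (int cp : samples) {
            System.out.println("U+" + Integer.toHexString(cp)
                + " spaceChar=" + isSpaceCharByMask(cp)
                + " whitespace=" + Character.isWhitespace(cp));
        }
    }
}
```

The two notions are not interchangeable: a newline is whitespace but is a control character rather than a Unicode space, while the no-break space U+00A0 is a Unicode space that isWhitespace rejects.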
isISOControl
public static boolean isISOControl(char ch) {
return isISOControl((int)ch);
}
public static boolean isISOControl(int codePoint) {
// Optimized form of:
// (codePoint >= 0x00 && codePoint <= 0x1F) ||
// (codePoint >= 0x7F && codePoint <= 0x9F);
return codePoint <= 0x9F &&
(codePoint >= 0x7F || (codePoint >>> 5 == 0));
}
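The optimized expression is compact enough that I wanted to check it against the two ranges from the comment. A brute-force comparison over every code point (plus a margin on both sides) is cheap; IsoControlCheck is a helper name I made up.

```java
final class IsoControlCheck {
    // The unoptimized condition quoted in the source comment.
    static boolean naive(int cp) {
        return (cp >= 0x00 && cp <= 0x1F) || (cp >= 0x7F && cp <= 0x9F);
    }

    // Compares the naive form with the library for all code points,
    // including some negative and too-large values.
    static boolean matchesLibraryEverywhere() {
        for (int cp = -256; cp <= Character.MAX_CODE_POINT + 256; cp++) {
            if (naive(cp) != Character.isISOControl(cp)) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(matchesLibraryEverywhere());
    }
}
```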
getType
public static int getType(char ch) {
return getType((int)ch);
}
public static int getType(int codePoint) {
return CharacterData.of(codePoint).getType(codePoint);
}
forDigit
public static char forDigit(int digit, int radix) {
if ((digit >= radix) || (digit < 0)) {
return '\0';
}
if ((radix < Character.MIN_RADIX) || (radix > Character.MAX_RADIX)) {
return '\0';
}
// digit is less than 10
if (digit < 10) {
return (char)('0' + digit);
}
// digit is 10 or greater
return (char)('a' - 10 + digit);
}
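forDigit is the reverse of digit, so it is enough to format a number in any radix from 2 to 36. A minimal sketch for non-negative values, assuming a hypothetical RadixFormat.toRadix helper:

```java
final class RadixFormat {
    // Formats a non-negative value in the given radix using forDigit.
    static String toRadix(int value, int radix) {
        if (value < 0 || radix < Character.MIN_RADIX || radix > Character.MAX_RADIX) {
            throw new IllegalArgumentException("unsupported value or radix");
        }
        if (value == 0) {
            return "0";
        }
        StringBuilder sb = new StringBuilder();
        while (value > 0) {
            sb.append(Character.forDigit(value % radix, radix)); // least significant first
            value /= radix;
        }
        return sb.reverse().toString();
    }

    public static void main(String[] args) {
        System.out.println(toRadix(255, 16));
        System.out.println(toRadix(5, 2));
    }
}
```

The explicit argument check is there because forDigit signals bad input by returning the null character rather than throwing.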
getDirectionality(char ch)
public static byte getDirectionality(char ch) {
return getDirectionality((int)ch);
}
public static byte getDirectionality(int codePoint) {
return CharacterData.of(codePoint).getDirectionality(codePoint);
}
isMirrored
public static boolean isMirrored(char ch) {
return isMirrored((int)ch);
}
public static boolean isMirrored(int codePoint) {
return CharacterData.of(codePoint).isMirrored(codePoint);
}
compare()
public int compareTo(Character anotherCharacter) {
return compare(this.value, anotherCharacter.value);
}
public static int compare(char x, char y) {
return x - y;
}
toUpperCaseEx
static int toUpperCaseEx(int codePoint) {
assert isValidCodePoint(codePoint);
return CharacterData.of(codePoint).toUpperCaseEx(codePoint);
}
static char[] toUpperCaseCharArray(int codePoint) {
// As of Unicode 6.0, 1:M uppercasings only happen in the BMP.
assert isBmpCodePoint(codePoint);
return CharacterData.of(codePoint).toUpperCaseCharArray(codePoint);
}
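These two package-private methods exist because some characters uppercase to more than one character, which a single char return value cannot express. The German sharp s (U+00DF) is the usual example: Character.toUpperCase leaves it unchanged, while String.toUpperCase expands it. A small sketch showing the difference; OneToManyUppercase is my own class name, and the link to toUpperCaseEx is my reading of the source, not something the public API states.

```java
import java.util.Locale;

final class OneToManyUppercase {
    // String-level uppercasing can grow the text; char-level cannot.
    static String upper(String s) {
        return s.toUpperCase(Locale.ROOT);
    }

    public static void main(String[] args) {
        char sharpS = '\u00DF';
        System.out.println(Character.toUpperCase(sharpS) == sharpS); // unchanged
        System.out.println(upper("stra\u00DFe"));                    // one char longer
    }
}
```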
reverseBytes
public static char reverseBytes(char ch) {
return (char) (((ch & 0xFF00) >> 8) | (ch << 8));
}
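reverseBytes swaps the two bytes of a char, which is what you need when converting a UTF-16 code unit between big-endian and little-endian order. For example, 0x1234 becomes 0x3412. A tiny sketch (ByteSwapDemo is a made-up name):

```java
final class ByteSwapDemo {
    // Swaps the byte order of every UTF-16 code unit in place.
    static void swapAll(char[] units) {
        for (int i = 0; i < units.length; i++) {
            units[i] = Character.reverseBytes(units[i]);
        }
    }

    public static void main(String[] args) {
        char[] units = {'\u1234', '\u00FF'};
        swapAll(units);
        System.out.println(Integer.toHexString(units[0]));
        System.out.println(Integer.toHexString(units[1]));
    }
}
```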
getName()
public static String getName(int codePoint) {
if (!isValidCodePoint(codePoint)) {
throw new IllegalArgumentException();
}
String name = CharacterName.get(codePoint);
if (name != null)
return name;
if (getType(codePoint) == UNASSIGNED)
return null;
UnicodeBlock block = UnicodeBlock.of(codePoint);
if (block != null)
return block.toString().replace('_', ' ') + " "
+ Integer.toHexString(codePoint).toUpperCase(Locale.ENGLISH);
// should never come here
return Integer.toHexString(codePoint).toUpperCase(Locale.ENGLISH);
}
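A quick way to see getName in action is to print the Unicode name of each code point in a string. The sketch below is just that; CharNameDemo and describe are hypothetical names, and the exact names for newer characters depend on the Unicode version your JDK ships with.

```java
final class CharNameDemo {
    // Formats a code point as "U+XXXX NAME"; the name is "<unassigned>"
    // when getName returns null.
    static String describe(int codePoint) {
        String name = Character.getName(codePoint);
        return String.format("U+%04X %s", codePoint, name == null ? "<unassigned>" : name);
    }

    public static void main(String[] args) {
        String s = "Az";
        for (int i = 0; i < s.length(); ) {
            int cp = s.codePointAt(i);
            System.out.println(describe(cp));
            i += Character.charCount(cp);
        }
    }
}
```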
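7. Summary:
This is low-level territory indeed: I ran into many concepts and details I did not understand.
The more you know, the more you realize you don't know. Keep going!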



