How to get a character from another character in Java

Francis Galiegue
Mar 01, 2015
<p>This post is not an answer but a "confusion cleanup" post.</p>
<p>First of all, Unicode code points range from U+0000 to U+10FFFF, but not every value in this range is a usable character: the surrogate range U+D800 to U+DFFF, for instance, is reserved for UTF-16 encoding, and many code points are simply unassigned.</p>
<p>Java's <code>char</code> is, in essence, a UTF-16 code unit. For Unicode code points outside the Basic Multilingual Plane (BMP), that is, above U+FFFF, two <code>char</code>s are needed to encode one code point; see <a href="http://docs.oracle.com/javase/8/docs/api/java/lang/Character.html#toChars-int-" rel="nofollow"><code>Character.toChars()</code></a>. For code points inside the BMP, there is a one-to-one mapping between code point and <code>char</code>.</p>
<p>Other than that, despite its particular role, a <code>char</code> is a numeric type; it is also the only unsigned primitive integer type in Java. You can use arithmetic operations on it, and for a code point inside the BMP, adding 7 to a <code>char</code> which is <code>'\u9876'</code> gives code point U+987D (the arithmetic is hexadecimal: 0x9876 + 7 = 0x987D). Note that the result of the addition is an <code>int</code> and must be cast back to <code>char</code>.</p>
<p>But given the above, this is a dangerous manoeuvre: the result may be an unassigned code point or a lone surrogate, and the cast back to <code>char</code> silently wraps around past U+FFFF.</p>
<p>(FWIW, above the BMP, <code>Character.toChars()</code> returns a pair of <code>char</code>s; the first element of the returned array is a lead surrogate and the second is a trail surrogate; the Java API calls them a "high" and a "low" surrogate respectively. For more details see the <a href="http://en.wikipedia.org/wiki/UTF-16" rel="nofollow">Wikipedia article on UTF-16</a>, which does a pretty good job of explaining what is what.)</p>
<p>This tip was originally posted on <a href="http://stackoverflow.com/questions/23863660/How%20to%20get%20a%20character%20from%20another%20character%20in%20java/23863737">Stack Overflow</a>.</p>
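<p>To make the points about <code>char</code> arithmetic and surrogate pairs concrete, here is a minimal sketch. The class <code>CharShift</code> and its <code>shift</code> helper are hypothetical names introduced only for illustration; they are not part of the Java API or of the original answer. The helper works on <code>int</code> code points rather than on <code>char</code>, which is one possible way to sidestep the wrap-around, not the only one.</p>

```java
class CharShift {

    // Hypothetical helper (not part of the Java API): shifts a code point by
    // delta and returns the result as a String, refusing results that fall
    // outside the Unicode range or inside the surrogate range.
    static String shift(int codePoint, int delta) {
        int shifted = codePoint + delta;
        boolean isSurrogate = shifted >= Character.MIN_SURROGATE
                && shifted <= Character.MAX_SURROGATE;
        if (!Character.isValidCodePoint(shifted) || isSurrogate) {
            throw new IllegalArgumentException(
                    "not a usable code point: 0x" + Integer.toHexString(shifted));
        }
        // toChars() returns one char inside the BMP, two chars above it.
        return new String(Character.toChars(shifted));
    }

    public static void main(String[] args) {
        // char + int is promoted to int, so an explicit cast is required.
        char c = '\u9876';
        char d = (char) (c + 7);
        System.out.println(Integer.toHexString(d)); // 987d

        // The cast wraps silently: U+FFFF + 1 becomes U+0000, not U+10000.
        char wrapped = (char) ('\uFFFF' + 1);
        System.out.println((int) wrapped); // 0

        // Above the BMP, one code point needs a surrogate pair.
        char[] pair = Character.toChars(0x1F600);
        System.out.println(pair.length); // 2
        System.out.println(Character.isHighSurrogate(pair[0])); // true
        System.out.println(Character.isLowSurrogate(pair[1]));  // true

        // Working on code points avoids the wrap-around.
        String s = shift(0xFFFF, 1);
        System.out.println(s.length()); // 2
        System.out.println(Integer.toHexString(s.codePointAt(0))); // 10000
    }
}
```

<p>Note that this sketch only rejects surrogates and out-of-range values; it does not check whether the resulting code point is actually assigned to a character.</p>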