Unicode Consortium Releases Unicode 8.0.0
An anonymous reader writes: The newest version of the Unicode standard adds 7,716 new characters to the existing 21,499 – that's more than 35% growth! Most of them are Chinese, Japanese and Korean ideographs, but among the changes Unicode also adds support for new languages like Ik, used in Uganda.
> That slashdot didn't support unicode
However Soylent News has had full unicode support since last year.
Here is a recent thread with lots of Greek.
There were already 113K characters in Unicode version 7.0, which is more than 2^16, so remember:
perl -e 'fork||print for split//,"hahahaha"'
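On the 2^16 point above: once there are more characters than 16 bits can count, some of them cannot fit in a single 16-bit unit, so UTF-16 has to spend two units (a surrogate pair) on them. A minimal Python sketch of that, where the helper name `utf16_units` is made up for illustration:

```python
BMP_LIMIT = 2 ** 16  # 65,536 values fit in one 16-bit unit

def utf16_units(ch):
    """Count the 16-bit code units needed to encode one character in UTF-16."""
    # utf-16-le writes no byte-order mark, so bytes // 2 is the unit count.
    return len(ch.encode("utf-16-le")) // 2

print(113_000 > BMP_LIMIT)        # the 113K figure from the post exceeds 2^16
print(utf16_units("A"))           # a character below U+10000: one unit
print(utf16_units("\U0001F600"))  # a character above U+FFFF: two units
```

So anything that treats one 16-bit unit as one character will miscount text containing the higher code points.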
Your post... has a Facebook icon next to it.
I knew Slashdot was going in a different direction, but... Facebook? The alt-text says "From Facebook". I'm not even completely sure what that means, but I don't want anything "from Facebook" in here. I hope you know your post has fucked up my world. I'm going to have a hard time sleeping now. Bro... your post has a Facebook icon on it! However you managed to get that to appear, don't do it ever again! And tell all your friends not to. Together, we can make Slashdot sane again...
In short:
To render text properly in Japanese, you need a Japanese font. To render text properly in Chinese, you need a Chinese font. That's not just a matter of character coverage; it's a consequence of Han unification, a decision the consortium made.
The Unicode consortium decided to map similar characters to the same code point. Personally, I'm not particularly bothered by this, but it leads to the technical problem that each text must be supplied with a language tag so that the correct font can be selected.
And this is problematic when two CJK languages are mixed in the same document -- in the GP's case, Chinese and Japanese -- or when a program must automatically decide which font to render things in.
Take a web browser for example. It reaches a random Chinese web page, encoded in UTF-8. The page's author never bothered adding a language tag. Now the web browser must guess whether to render the page in a Chinese font or a Japanese one. And a "guess" is really all that it can do.
(Typically, software bases its guess on the user's locale. That's pretty accurate -- Chinese users tend to view Chinese documents, Japanese users Japanese ones. But the problems start when someone tries viewing a 'foreign' document...)
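To make the guessing concrete, here is a rough Python sketch of the fallback described above: use the document's language tag if there is one, otherwise fall back to the user's locale. The font names, the `FONTS_BY_LANG` table and the function names are all made up for illustration, and real browsers do something more elaborate.

```python
# Hypothetical font table; the family names are placeholders, not real fonts.
FONTS_BY_LANG = {
    "zh": "ExampleSans SC",
    "ja": "ExampleSans JP",
    "ko": "ExampleSans KR",
}

def is_unified_han(ch):
    """True if ch is in the main CJK Unified Ideographs block (U+4E00..U+9FFF)."""
    return 0x4E00 <= ord(ch) <= 0x9FFF

def pick_font(lang_tag, user_locale):
    """Prefer the document's language tag, then fall back to the user's locale."""
    for candidate in (lang_tag, user_locale):
        if candidate:
            # "zh_CN.UTF-8" and "zh-Hant-TW" both reduce to "zh".
            primary = candidate.replace("_", "-").split("-")[0].lower()
            if primary in FONTS_BY_LANG:
                return FONTS_BY_LANG[primary]
    return "default"

ch = "\u76f4"  # U+76F4, one code point shared by Chinese and Japanese text
print(is_unified_han(ch))
print(pick_font("ja", "zh_CN"))  # tagged page: the tag wins
print(pick_font(None, "zh_CN"))  # untagged page: a guess from the locale
```

The second call is the case from the browser example: the code point alone says nothing about the language, so an untagged Japanese page viewed by a user with a Chinese locale ends up in the Chinese font.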
It's really quite ironic that the consortium decided on codepoint unification for the three languages that would most benefit from Unicode.
"Slow Down Cowboy! It's been 58 minutes since you last successfully posted a comment" -- slashdot, driving users away.