New Unicode Bug Discovered For Common Japanese Character "No"
AmiMoJo writes: Some users have noticed that the Japanese character "no", which is extremely common in the Japanese language (forming parts of many words, or meaning something similar to the English word "of" on its own). The Unicode standard has apparently marked the character as sometimes being used in mathematical formulae, causing it to be rendering in a different font to the surrounding text in certain applications. Similar but more widespread issues have plagued Unicode for decades due to the decision to unify dissimilar characters in Chinese, Japanese and Korean.
The bug is that the Japanese, Chinese, Korean and mathematical versions of this character all share a common code point. There is no reliable way for an application to select the right character and render it properly.
You can't mix C/J/K and mathematics in Unicode, which is a new bug beyond just the failure to support mixing C/J/K.
const int one = 65536; (Silvermoon, Texture.cs)
SJW, n: "Someone I don't like, and by the way I'm a fuckwit" - AC
As I pointer out elsewhere here, Chinese can be written with a latin alphabet and a few accents. Likewise languages such as Sanskrit. Just as there is a difference between English handwriting and what can be represented in Ascii, we face a related issue with ideograph based writing systems. We would be better of writing Chinese webpages in pinyin, and developing a separate system for calligraphy and ideographs.
Except that there are so many homonyms in pinyin that a strong sense of the context is needed to read it. The logograms are much harder to write but reading is quite a bit easier, which is why they are still in use. That's not the same as English handwriting vs printing, where the differences are only in rendering and there is a 1:1 correspondence between a handwritten and a printed character.
Just write chinese in pinyin and speak it normally. (the number of Chinese speakers does not matter, the issue is with how it is written down.)
"Chinese" is not a single spoken language. A passage written in one Chinese language, such as Mandarin, is often readable in another Chinese language, such as Cantonese, so long as they're written with Han characters. It's as if French could be read as Italian or Spanish with the same characters. In addition, different words that sound the same in a given Chinese language due to historic sound changes usually have different Han characters. They may end up sounding different in a different Chinese language whose different historic sound changes produced different homophone sets. Pinyin, on the other hand, depends on Mandarin and confuses homophones.