ICANN Approves Internationalized Chinese Domain Names
philalethiac writes "Millions of Chinese language users will soon be able to access the Internet using Chinese script following a decision today by ICANN's Board of Directors to approve a set of Chinese language internationalized domain names."
! ("shou3" = number one; "biao1" = to announce/post)
404555974007725459910684486621289147856453481154 in hex is "You sank my Battleship?"
[GPG key in journal]
ICANN haz internationalized Chinese domain name?
I guess, until Slashdot enables the UTF character set like everyone else has for the past decade or so,
1. There will be some domain names that we can't link to on Slashdot
2. No one will get my First Post joke.
With all the non-Latin character sets being approved for addresses, I imagine there is a world of new opportunities that completely void all the "inspect the address bar" education which was pushed on the general public for so many years. ICANN has managed to turn the net into a pretty much anything-goes place. Almost every major company is practically extorted into buying the new extension flavour of the month to prevent spammers and fraudsters from sending seemingly legitimate email, and the general public is left completely confused, with no guiding address principles.
This looks like a perfect opportunity to highlight this recent post at the Pinyin News blog, closely related to the issue at hand! (Disclaimer: I'm not affiliated with the blog in any way, but as a former student of Japanese I can relate to the general message.)
Looks like the domain names will be encoded using Punycode instead of the cleaner UTF-8 encoding:
http://en.wikipedia.org/wiki/Internationalized_domain_name
http://en.wikipedia.org/wiki/Punycode
However, my biggest concern is that the use of non-ASCII characters in domain names breaks the whole international nature of the web and imposes regional barriers. Your mail client and mail server software might not be too happy with you trying to send an e-mail to "joe@.jp", or to "joe@.jp-r14k153opxc" in Punycode. (Crap, it looks like Slashdot does not accept international characters in comment submission, so you can't read this: "日本人".)
Remember that very few people have rendering and fonts for every written language on the planet, so most people will be cut off from many websites. With the current IPv4 shortage, one can no longer reliably just use an IP address to access a specific website, e-mail address, etc., since a single IP address can host many domain names.
Personally I think that the best compromise solution would be to only allow non-ASCII characters for domain names in different languages if they are submitted with a paired-up romanization version that can be equally accepted for the same domain. So using my previous example, one could equally specify ".jp" in Japanese kanji, ".jp-yn9d427hcvb" in Punycode, or "nihonjin.jp" in romaji. That way you can still cater to a local/regional audience, and still allow everyone else on the planet to reach you.
For those who argue that it does not matter if a domain name is only specified in a foreign language when all of the hosted content is in that same language: you are forgetting about all of the current international collaboration in mathematics, science, engineering, programming, and other fields. (You can write an entire math proof or software program using only symbols, without a single human word.)
Even for individual one-on-one e-mail communications between people in different countries who are able to communicate in a common language, this would still be a problem, since a large percentage of e-mail accounts are hosted with a user's local ISP, which in the future may leave them stuck with a non-ASCII e-mail address that would cut them off from the rest of the world.
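For anyone curious what the Punycode form mentioned earlier in this comment actually looks like, here is a minimal sketch using Python's built-in "idna" codec (which implements the older IDNA 2003 rules). The label "日本人" is the one from the example above; the helper names `to_ascii` and `to_unicode` are made up for illustration. Note that real encoded labels carry an "xn--" prefix on each label, rather than the suffix form I typed from memory above.

```python
# Sketch: round-trip a Unicode domain through its ASCII (Punycode) form.
# Relies only on the standard library "idna" codec (IDNA 2003 rules).

def to_ascii(domain: str) -> bytes:
    """Encode every non-ASCII label of a domain as 'xn--' + Punycode."""
    return domain.encode("idna")

def to_unicode(ascii_domain: bytes) -> str:
    """Decode an ASCII-compatible domain back to its Unicode form."""
    return ascii_domain.decode("idna")

unicode_domain = "日本人.jp"          # example label taken from the comment
ascii_domain = to_ascii(unicode_domain)

# The ASCII form is what DNS resolvers and old mail software actually see.
print(ascii_domain)
# The mapping is lossless, so the original name can always be recovered.
print(to_unicode(ascii_domain) == unicode_domain)
```

The point is that the wire format stays plain ASCII, so the DNS itself does not change; the breakage I am worried about is in the software and the people that have to display or type the Unicode form.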
While I don't like to raise too much sturm und drang about it, as a native English speaker I must still take some affront at the chutzpah with which these dirty foreigners waltz into our tongue, thinking they have carte blanche to sully our language.
10 PRINT CHR$(205.5+RND(1)); : GOTO 10