Slashdot Mirror


Registrations Now Accepted For Asian Domain Names

Eric Sun was among the first to point out that as of Thursday evening, VeriSign has begun accepting Chinese, Japanese and Korean domain names. "This increases the possible characters from 37 (26 letters, 10 numerals, and hyphen) to 40,282. Find more information [see this AP story]." snrsamy points to the same story as featured on C|Net. jamie suggests reading the technical lowdown at VeriSign.

51 of 138 comments

  1. Re:Big5 or Unicode by Rob Parkhill · · Score: 3

    Since nobody seems to want to read the article, or research any of the info, here is the quick low-down (since I have to deal with this at work right now...)

    - This solution is only for web browsers. It requires a special version of a web browser, or a plugin, to be able to use the new encoding scheme. It won't work for email, ftp, telnet, gopher, etc, unless a special version of the program is written.

    - DNS doesn't break. DNS still uses ASCII. This scheme uses RACE to encode the multi-lingual character set into ASCII. NSI will put a small prefix at the start of the domain name to identify it as multi-lingual (for example, eq- would be found at the start of the domain name; the exact prefix has not yet been released, to prevent squatters from snapping names up).

    - The special browsers will detect the prefix, and translate the ASCII gibberish into the specified multi-lingual character set. The browser also does the conversion back to ASCII to allow a DNS lookup.

    - WHOIS does not/will not support this. You can only use WHOIS with the ASCII encoded gibberish.

    - This is not supported by the IETF. This is a custom solution implemented by NSI. But it looks like they are going to be WAY behind schedule in actually rolling this out.

    - They are accepting registrations right now, but none of these names will resolve for at least a month, probably much longer. In other words, the system isn't usable yet, but NSI can collect money.

    - The IETF is working on their own, probably completely incompatible, system to do the same thing.
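
    The prefix check described in the points above can be sketched in a few lines of Python. This is only an illustration: the default prefix value "bq--" is an assumption (the production prefix was not public at the time), and has_encoded_label is a made-up helper name, not part of any NSI or browser software.

```python
def has_encoded_label(hostname: str, prefix: str = "bq--") -> bool:
    """Return True if any dot-separated label starts with the (assumed) prefix."""
    # Hostnames are case-insensitive; a trailing dot marks a fully qualified name.
    labels = hostname.lower().rstrip(".").split(".")
    return any(label.startswith(prefix) for label in labels)


if __name__ == "__main__":
    for host in ("www.bq--gcrmxyi.com", "www.slashdot.org"):
        print(host, has_encoded_label(host))
```

    A plugin would use a check like this to decide which labels to translate for display; a mail filter could use the same check to decide what to block.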

    --
    "Tomorrow's forecast: a few sprinkles of genius with a chance of doom!" - Stewie Griffin
  2. Re:What a lot of whining! by Stiletto · · Score: 2

    Well, if your only connection to the Asian population is spam email, this should make your isolationism even more simple: the standard uses a standard prefix for RACE-encoded domain names; block those and you're in arrogant English/USian bliss.

    Blah. Spare us your arrogant anti-English/US attitude.

    Fact is, it is convenient to be able to block certain top-level country codes at the business gateway (or ISP) in order to cut down on spam.

    Incidentally, someone's connection to the Asian population is most likely NOT through spam, since most spam coming from Asian top-levels is actually just U.S. spam, either routed through someone else's mail system or sent with spoofed headers.

  3. Re:RFC by jafuser · · Score: 2
    The > and < symbols are not part of the RACE string. I tried typing in anime () into their "Multilingual Conversion Tool" and got the following result:

    Input String
    Utf-8

    Prepared String
    Utf-8

    Registration String
    RACE
    bq--gcrmxyi
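
    For the curious, the registration string above can be unpacked by hand. The following is a rough sketch, assuming the RACE layout as I understand it from the draft: after the prefix, the label is base32 text (a-z, 2-7) whose first decoded byte is the shared Unicode row and whose remaining bytes are the low bytes of each character. It deliberately handles only that simple case; race_decode_label is a hypothetical name, not the conversion tool's code.

```python
import base64


def race_decode_label(label: str, prefix: str = "bq--") -> str:
    """Decode a RACE label whose characters all share one Unicode row.

    Only the simple compressed form is handled (one row byte followed by
    one low byte per character). Escaped bytes (0xFF) and the uncompressed
    form (leading 0xD8) are rejected rather than guessed at.
    """
    if not label.lower().startswith(prefix):
        raise ValueError("label does not carry the expected prefix")
    body = label[len(prefix):].upper()
    body += "=" * (-len(body) % 8)  # base64.b32decode insists on padding
    raw = base64.b32decode(body)
    row, lows = raw[0], raw[1:]
    if row == 0xD8 or 0xFF in lows:
        raise ValueError("form not handled by this sketch")
    return "".join(chr((row << 8) | low) for low in lows)


if __name__ == "__main__":
    print(race_decode_label("bq--gcrmxyi"))
```

    Run on the string above, this yields the three katakana U+30A2 U+30CB U+30E1, i.e. "anime".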

    --
    EFF Member #11254

    --
    Please consider making an automatic monthly recurring donation to the EFF
  4. Re:Thats a lot of characters... by Sensor · · Score: 2

    The problem isn't necessarily with buffer overflows; read Bugtraq...

    There was a report a couple of weeks ago regarding a problem with internationalised versions of IIS, where Unicode representations of directory traversal characters (., /, \, etc.) were being substituted after access checks had been applied...

    Now imagine domain-based trust relationships - these will be implemented in numerous sub-systems (tcp wrappers, .rhosts, sendmail.cf, etc...), each of which may perform the normalisation/access checks slightly differently.

    I imagine that this will lead to numerous security issues due to slight differences in systems' support for multi-byte characters.
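
    To make the check-then-decode pitfall concrete, here is a small sketch. It is not the IIS code; it just shows how a byte-level check can pass input that a lenient decoder later turns into "/". The overlong two-byte UTF-8 form C0 AF is used as the example, and both function names are invented for illustration.

```python
def naive_is_safe(raw: bytes) -> bool:
    """Access check done on raw bytes, before any decoding."""
    return b"/" not in raw and b"\\" not in raw


def lenient_two_byte_decode(b0: int, b1: int) -> str:
    """Decode a 2-byte UTF-8 sequence WITHOUT rejecting overlong forms."""
    return chr(((b0 & 0x1F) << 6) | (b1 & 0x3F))


overlong_slash = bytes([0xC0, 0xAF])

if __name__ == "__main__":
    print(naive_is_safe(overlong_slash))             # check passes
    print(lenient_two_byte_decode(*overlong_slash))  # ...but this is "/"
    try:
        overlong_slash.decode("utf-8")               # a strict decoder refuses
    except UnicodeDecodeError as err:
        print("strict decoder:", err.reason)
```

    The fix is ordering: decode strictly (or normalise) first, then run the access check on the result. Any subsystem that does it the other way round is a candidate for this class of bug.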

    Another question (which I suspect will be answered in the FAQ) is do you need to register the same domain name several times to take account of the differing unicode byte widths?

  5. Re:RFC by Megane · · Score: 4
    Okay, for the PDF challenged, it seems to not be an RFC, but to be compliant with the current RFC spec, in consideration of RFC2825, which points out that there is simply too much software out there which will break when given UTF-8 domain names.

    How it works is there is a special prefix "<rp>" (or maybe this just represents the prefix, I can't really tell from the PDF, but I didn't think < and > were valid domain name characters) that indicates a part of the domain is encoded, followed by the encoded name which only uses ASCII characters, and includes information about which character set was used (Unicode, SJIS, etc.). The algorithm is called RACE, Row-based ASCII Compatible Encoding.

    A couple of examples were given for both a domain name and a server name:

    <rp>45dfg62de34432.COM
    <rp>3df45gd345.<rp>45dfg62de34432.COM

    So I guess you can set your spam filters to block any domain starting with <rp>! :)

    --
    #naabhaprzrag, #sverubfr-000, #agi-fcbafberq, negvpyr[pynff*=' negvpyr-ary-'] { qvfcynl: abar !vzcbegnag; }
  6. Not a troll, but... by Speare · · Score: 3

    Will moderators shoot down the fact that I mention Microsoft?

    Windows has had a CJK-capable kanji input scheme for years. CJK: Chinese, Japanese, Korean. Windows also has had bidi (bidirectional) support for right-left and/or top-bottom languages, including Hebrew.

    If you have the appropriate cjk-input features installed, it's just a funky keyboard shortcut to open it up to enter kanji. If not, you'll probably be limited to clicking on visible links, not entering domain names or other text by hand.

    I don't know what features Linux has to handle EFIGSS (English, French, Italian, Swedish, Spanish) differences, never mind bidi or kanji input.

    --
    [ .sig file not found ]
    1. Re:Not a troll, but... by BJH · · Score: 3

      Kanji are usually input under Linux with kinput2 (although Netscape has always had a few... problems... in dealing with them). Luckily, Mozilla is much better in this respect.
      Some programs, like Emacs, communicate directly with the Japanese conversion server (canna, Wnn[4|6], ATOK, etc.), but there are very few apps which can do this.

  7. RFC by Megane · · Score: 2

    So is there an RFC on how this works?

    --
    #naabhaprzrag, #sverubfr-000, #agi-fcbafberq, negvpyr[pynff*=' negvpyr-ary-'] { qvfcynl: abar !vzcbegnag; }
    1. Re:RFC by Speare · · Score: 3

      The rp is a variable. The first couple of pages note that the implementation-testers should assume that the "RACE Prefix," or rp, should be "bq-".

      --
      [ .sig file not found ]
    2. Re:RFC by Megane · · Score: 2
      Though what you say is true, it would still be interesting to see how they deal with the fact that, say, Japanese character sets provide for full-width alphanumeric characters, which, although they look the same as A,B,C,etc... except for their width, have a different encoding.

      True, they say that any name part consisting entirely of US-ASCII characters is not allowed to be encoded this way, but they would have to go out of their way if they wanted to ensure that double-wide SJIS romaji were not confusingly registered. Then again, we can already do "s1ashdot.org" with just plain ASCII.

      In addition, there's the inherent difficulty in the fact that a Chinese website using a Simplified Chinese set of ideographs could hijack surfers wanting to go to a site with the same name, but with Traditional Chinese ideographs.

      IIRC, in Unicode, Chinese and Japanese ideographs all map to the same code if they're basically the same character, with the differences considered font-specific. In the extreme case, one common radical is rendered with one less stroke in Japanese, which could have created hundreds of extra codes.

      Most simplified kanji/hanzi should be unique, but a few, at least in Japanese, use an already existing, more common character. Generally, though, this won't be a problem if Unicode is used.
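
      As an aside, the full-width problem is one that Unicode normalisation can handle. A minimal sketch, assuming a registry chose to fold names with NFKC before comparing them (nothing in the VeriSign material says they do); fold_width is just an illustrative name:

```python
import unicodedata


def fold_width(label: str) -> str:
    """Apply NFKC, which maps full-width Latin letters and digits to ASCII."""
    return unicodedata.normalize("NFKC", label)


# "slashdot" spelled with full-width letters (U+FF41..U+FF5A block)
fullwidth = "\uff53\uff4c\uff41\uff53\uff48\uff44\uff4f\uff54"

if __name__ == "__main__":
    print(fullwidth == "slashdot")              # different code points
    print(fold_width(fullwidth) == "slashdot")  # identical after folding
```

      Note that this does nothing for look-alikes such as "s1ashdot", and it does not fold Simplified into Traditional Chinese; those are separate problems.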

      --
      #naabhaprzrag, #sverubfr-000, #agi-fcbafberq, negvpyr[pynff*=' negvpyr-ary-'] { qvfcynl: abar !vzcbegnag; }
    3. Re:RFC by truthsearch · · Score: 3

      Kind of ironic the algorithm is called RACE, isn't it? Can we filter by RACE? Can we browse domains of only a certain RACE? Can it be enhanced with RACISM, Row-based ASCII Compatible Interface for Stereotyping Mayhem?

  8. Re:Oh great, Japanese URLs, just what we need. by DrWiggy · · Score: 2

    Had you ever actually considered what using the Internet must be like for non-English speaking countries? Probably something equally unpleasing to the eye.

    Seeing as the Internet is supposed to be the medium that allows a break-down of barriers between nations and a free flow of information, don't you think that it might be a good idea to include as many languages as possible rather than exclude anybody who doesn't use a language that conforms to your standards?

    I think you need to realise now, that English is not the only language in the world - in fact we're in a vast minority. It's possible that at some point enough people will undertake the task of learning enough foreign languages to free up communication between ourselves, and perhaps ultimately one language will be considered the accepted standard - however, don't expect that to be English.

  9. Spamming floodgate by AntiPasto · · Score: 3
    Man I thought the long IP http://2034890234890294 thing was annoying... now I won't be able to make sense of *anything* in their damn spam. Oh well... another clue to hit delete.

    ----

  10. Re:IMO about time by Tet · · Score: 2
    It's nice to see that the global part of the Internet is still spreading...

    No, it's not. This is one of the most brain dead decisions ever made, in the name of political correctness, with complete disregard for the practical issues. The effect of this will be to reduce the global appeal of the web, not increase it. Western surfers will now effectively be cut off from many far Eastern domains. Sure, there's a reasonable workaround for entering non-ASCII domains on an ASCII keyboard, but it's too complex for the general public, and far Eastern companies are unlikely to publish the ASCII-fied domain anyway. This is a very black day for the net...

    --
    "The invisible and the non-existent look very much alike." -- Delos B. McKown
  11. Re:Ideogrammatic languages are a pain by chrischow · · Score: 2

    the commies tried with pinyin but it doesn't work very well because of the many homophones in Chinese. hanzi are much cooler anyway and a more compact way of writing and representing data.

  12. Oh great, Japanese URLs, just what we need. by AFCArchvile · · Score: 2
    So will we have to extend ASCII to 65,536 from 256? Will legacy Japanese URLs look like "http://%0077%0077%0077.%0073%006F%006E%0075.%0063%006F.%006A%0070/"?

    And what will the new ones look like to us Americans? Ugh, I can't bear to think of it.

    --
    "Ancillary does not mean you get to rule the world." --U.S. Circuit Judge Harry Edwards, speaking to the FCC's lawyer
  13. Re:Big5 or Unicode by Yardley · · Score: 3

    This is probably an attempt to force migration over to Unicode. Anyways, why is Verisign behind this? Didn't we learn from Network Solutions that a privately-owned, commercial company is not the solution to internet domain name databases (and their "ownership")?

    How can one company be granted the monopoly rights to something so important to the world's economy and everyone on the Internet again? Should this be assigned to a not-for-profit entity under the auspices of ICANN?

    --
    He lives in a world where those who do not run the client software of the omnipresent meme are unacceptable.
  14. Re:Big5 or Unicode by Speare · · Score: 2

    Since the majority of Chinese users input their Chinese as Big5, (eg www.ê.com) will not be the same as the Unicode equivalent

    I think it's probably not too difficult for the Chinese browsers to do the conversion behind the scenes. Kinda like ASCII/EBCDIC conversions; you don't need to change the keyboard to enter text of the other variety.

    Now, which one does the registrar accept, and the DNS servers cache? Read the article? From the first couple pages, it appeared that the domain name is actually not in Unicode nor Big5; it's translated to an ugly ASCII encoding.

    --
    [ .sig file not found ]
  15. Re:Unicode would be better. by Guy Harris · · Score: 2
    Unless you want to register domain names in Klingon.

    Michael Everson of Everson Gunn Teoranta has proposed an encoding of Klingon in Plane 1 of ISO/IEC 10646-2; if it gets adopted, future versions of Unicode may adopt it (Everson's one of the editors and authors of Unicode 3.0).

  16. I'd argue the other way by Galvatron · · Score: 2
    Most (I would say all, but I'm not entirely certain of that) have roman alphabet representations, usually without using accents, umlauts, or what have you. So, they can represent their languages in URLs, just in a less commonly used form. German, Spanish, French, etc, often have words that, stripped of special characters, are written identically. On top of this, it's relatively easy to write special roman-alphabet characters on a QWERTY keypad (I managed to figure it out through trial and error), but quite difficult to type Asian characters, so Asian character URLs will serve to make the Internet more regional.

    I have occasion to buy an international airline ticket this year, and I refuse to use priceline because they have Will Shitner doing their ads. Give me Nemoy, Stewart, Dorn, Spiner, McFadden, anyone but shitner. Blow me priceline.

    Man, you have got some real problems, don't you? Did Shatner beat you as a child or something? I mean, I'm not crazy about Troi, but it's not like I carry some kind of grudge. And you manually typed in a .sig as an anonymous coward? That's just weird.

    --
    "The question of whether a computer can think is no more interesting than that of whether a submarine can swim" -EWD
  17. Re:Why asian character sets? by AlanStokes · · Score: 2

    The proposal includes umlauts - it's based on a mapping to US-ASCII from any Unicode string. (Admittedly if you only wanted to represent a handful of European languages you'd come up with a different scheme, but it would obviously be less general.)

    Presumably they're pitching it at the asian market cos that's where they expect to make money.

    There are apparently good reasons for not allowing 8-bit characters not in US-ASCII in domain names - it would break too much.

    --
    - Alan
  18. Re:It breaks the dns-rfc. by lizrd · · Score: 2
    If I remember correctly, it does NOT allow special chars in domain names.

    Damn you're quick. Of course the whole point of this is to provide a work-around to that problem. All it does is make an ASCII representation of a different character set. These representations are flagged by having the hostname start with bq-. So if you run across a hostname that looks like bq-safjdlfaqwue72819.bq-hewaguifuifdajhks.co.jp you'll know that the hostname probably makes good sense to anyone who has a Japanese web browser. If you are in the habit of reading such pages you'll get the appropriate plugin. If you don't have the plugin, you probably couldn't read the content anyway and believe you me, there is a LOT of content on the web that's written in a language you can't read. (I'm not saying that you're stupid or anything, I'm just making the bet that there isn't anyone here who knows every language in which material has been posted to the internet, and this includes Klingon.)
    _____________

    --
    I don't want free as in beer. I just want free beer.
  19. Re:Quick (maybe stupid) question... by truthsearch · · Score: 3

    My Chinese co-worker has informed me that to type Chinese, he sets the desired language in whatever app to Chinese and then types phonetically. The problem is that even phonetically there are many similar words, so he basically types a few English letters to verbally spell out a word, then Chinese characters appear on the screen which he must then choose. He tells me there are also special keyboards where you hold down multiple keys.

  20. Re:IMO about time by Tet · · Score: 2
    So what you're saying is that it's ok for non-English-speaking people to try and use our ASCII system but totally wrong and inappropriate for them to have their own native language system and for us to try and learn how to use that?

    Yes, that's *exactly* what I'm saying. I'm not saying it because I happen to use ASCII, but because ASCII is a more natural system for computers to deal with. If Western European and American languages consisted of 30000+ characters, and those in the East consisted of some 100 or so, I'd suggest using the Eastern system at the drop of a hat, even if it wasn't my native system. This has nothing to do with whether or not it's my native character set that's chosen, and everything to do with whether a good decision is made from a technical perspective.

    --
    "The invisible and the non-existent look very much alike." -- Delos B. McKown
  21. What about Cyrillic and ISO-Latin? by vlax · · Score: 2

    I want to be able to register domain names in French, German and Russian too. If they are going to support all three zillion kanji and Chinese characters, they need to at least support the various Cyrillic and eastern European Roman alphabets, and the rest of ISO-Latin-1 (which covers all the major and most of the minor Western European languages.) The Arabic-based alphabets (Arabic, Farsi, Urdu, etc.) and Hebrew are written right-to-left, and Thai has its own rendering complications, so I suppose those won't be implemented right away, but they need to be on the drawing board.

    If all those other languages are accounted for, I view this as a good thing. If this is part of an overall shift to Unicode on the web, then all these languages are automatically supported, and I would think it an even better thing.

  22. Re:What a lot of whining! by Alan Shutko · · Score: 2

    Actually, most of my spam is from Asian top-levels (mostly cn) and in some CJK encoding. (Not being able to read it, I don't know if it's _really_ US spam in a foreign language, but....)

    Furthermore, much of that spam comes through the same set of systems which never seem to do anything about it.

  23. Re:Quick (maybe stupid) question... by fatphil · · Score: 2

    There was such a thing as a Chinese Typewriter. It had 300 keys and required multiple presses (Shift, Ctrl, Meta, Alt, Hyper etc. style) to generate characters.

    This is a really crap picture of one:

    http://acc6.its.brooklyn.cuny.edu/~phalsall/images/typewrit.gif

    So many keys each one is barely distinguishable from the next (that's also poor photo quality though)

    It fell into disuse fairly swiftly because it was slower than script.

    Our typewriters were invented so that they could be faster than script.

    They lose.

    FatPhil

    --
    Also FatPhil on SoylentNews, id 863
  24. What a lot of whining! by Speare · · Score: 2

    Within a few minutes of this story being posted, most of the posts are along the following lines.

    • Why not get European hacks like umlauts working first?
      I dunno; maybe because the Japanese don't know enough German? Why should the Asians wait for Europe to get its act together before they solve the issues they face every day?
    • Great, now I have to see even more ugly spam!
      Well, if your only connection to the Asian population is spam email, this should make your isolationism even more simple: the standard uses a standard prefix for RACE-encoded domain names; block those and you're in arrogant English/USian bliss.
    • How can I enter these funky characters?
      I dunno, just a guess, but maybe someone's already thought of this? Perhaps the people who work in kanji all day know something about entering kanji, and have hardware or software solutions around. If you don't normally have to type it, I'm sure your browser will let you CLICK on encoded links just fine.

    Missed anything?

    --
    [ .sig file not found ]
  25. Re:Thats a lot of characters... by Malc · · Score: 2

    If it's implemented properly, surely it shouldn't matter. It's not just the size of the Unicode chars, but also the big- and little-endianness. If it's implemented properly, the DNS would just determine what you're using (UCS-2BE, UCS-2LE, UCS-4BE, UCS-4LE) and convert it to its internal representation for the lookup.
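
    A quick sketch of that idea in Python: four different wire forms, one internal string. The assumption here is that the sender's encoding is known up front (how it would be signalled is a separate question). Python has no codec literally named UCS-2, so utf-16-be/le stand in for it, which is equivalent for characters in the Basic Multilingual Plane; to_internal is an illustrative name.

```python
def to_internal(data: bytes, encoding: str) -> str:
    """Convert a wire form to one internal representation (a Python str)."""
    return data.decode(encoding)


name = "\u65e5\u672c"  # two kanji, U+65E5 U+672C

# Same two characters in four byte layouts.
wire_forms = {
    enc: name.encode(enc)
    for enc in ("utf-16-be", "utf-16-le", "utf-32-be", "utf-32-le")
}

if __name__ == "__main__":
    for enc, data in wire_forms.items():
        print(enc, data.hex(), to_internal(data, enc) == name)
```

    The byte strings all differ, but every one converts back to the same two characters, so the lookup key is the same.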

  26. Spam spam spam spam... by sdo1 · · Score: 2

    I guess I can look at this two ways...

    1) Oh God, there's gonna be a MASSIVE amount of spam coming from domains with characters outside of the standard 37.

    2) I can block anything and everything coming from domains with characters outside of the standard 37.

    -S

    --
    --- What parts of "shall make no law", "shall not be infringed", and "shall not be violated" don't you understand?
  27. Re:Big5 or Unicode by spitzak · · Score: 2
    Gad. We should just say that "bytes with the high bit set must be sent unchanged" through everything and scrap everything that does not obey this.

    This would allow all transports to ignore the character encoding as long as the encoding only uses bytes with the high bit for non-ascii. It also means that case-independence of non-ascii would be illegal, thus stopping the emergence of a dangerous (for security) mess of incompatible implementations of equality tests for URLs.

    This would allow us to use UTF-8 for the URL, for the page contents, for email, for everything, and we would not have this horrid mess of prefixes and mime types.

    Yes, some programs, routers, etc, would not pass this stuff through. Well, tough, those should be obsolete!
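
    For anyone who doubts that UTF-8 meets the "high bit for non-ascii" condition, here is a small check, written as a sketch with a made-up function name. It confirms that ASCII characters encode as themselves and that every byte of a non-ASCII character has the high bit set.

```python
def high_bit_property_holds(text: str) -> bool:
    """Check the UTF-8 property: ASCII stays ASCII, everything else is >= 0x80."""
    for ch in text:
        encoded = ch.encode("utf-8")
        if ord(ch) < 0x80:
            if encoded != bytes([ord(ch)]):
                return False
        elif any(b < 0x80 for b in encoded):
            return False
    return True


if __name__ == "__main__":
    print(high_bit_property_holds("www.\u65e5\u672c.com"))
```

    That is why dots, slashes and colons in a UTF-8 URL can still be found by 8-bit-clean software that knows nothing about the encoding.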

  28. Other issues: ASCII fallbacks by vlax · · Score: 2

    According to the article, they're working on a substitution scheme so ASCII-only users can still type in the URLs. Does this mean that ASCII equivalents will be arbitrary and unintuitive? If so, that's a problem. Let me propose something slightly different:

    Unicode is not supposed to over-unify characters, so the ASCII fallback for Japanese could be the romaji transcription - and therefore registering a Japanese domain name automatically registers the romaji equivalent, except that some kanji have more than one possible romaji transcription.

    However, some kanji are unified with Chinese characters, which have a different pinyin transcription.

    Chinese is another problem. The logical ASCII equivalent is pinyin stripped of its diacritical marks. But then, many different characters may have the same transcription.

    All Cyrillic languages have an ASCII transcription scheme too, but it isn't unified. One character may be transcribed one way in Russian and another way in Bulgarian. Is there a unified transcription scheme for all Cyrillic languages, and is it truly one-to-one? I don't think so. Look at the character usually transcribed as "j" in Russian, and the one usually transcribed that way in Serbian.

    ISO-Latin-1 and -2 fallbacks: For ISO-Latin-1, the fallbacks are pretty obvious: "Champs-Élysée" ==> "Champs-Elysee" or in German "Düsseldorf" ==> "Duesseldorf", but in Czech it's a little less obvious. Does "C hacek" map to "Cz" or "Ch" or "Cs"?
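
    The two Latin-1 examples can be sketched in Python to show why the fallback has to be per-language. This is an illustration only: the GERMAN table and both function names are my own, not part of any registry scheme. Stripping accents generically gives the French result, but German needs its own digraph table first.

```python
import unicodedata

# Hypothetical per-language table: German convention writes umlauts as digraphs.
GERMAN = {"ä": "ae", "ö": "oe", "ü": "ue", "Ä": "Ae", "Ö": "Oe", "Ü": "Ue", "ß": "ss"}


def strip_marks(text: str) -> str:
    """Generic fallback: decompose, then drop combining accent marks."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(c for c in decomposed if not unicodedata.combining(c))


def german_fallback(text: str) -> str:
    """German fallback: apply the digraph table, then strip anything left."""
    composed = unicodedata.normalize("NFC", text)
    return strip_marks("".join(GERMAN.get(c, c) for c in composed))


if __name__ == "__main__":
    print(strip_marks("Champs-Élysée"))
    print(german_fallback("Düsseldorf"))
    print(strip_marks("Düsseldorf"))  # generic rule gives a different answer
```

    The last line is the point: the generic rule and the German rule disagree on the same name, and for Czech the generic rule simply turns "C hacek" into "C", which may or may not be what anyone wants.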

    So, here is a possible solution: devise unified ASCII transcription schemes for each language, admitting whatever ambiguities exist in Japanese or similar languages. Then, when you register a non-ASCII name, you are asked on the form to fill out the transcribed ASCII name that corresponds to it and it is also automatically registered to you.

    There is some potential for conflict here, if the ASCII transcription corresponds to an existing registered domain or, as in the case of Chinese more than one foreign name corresponds to the same transcription, but I think the problem is manageable.

  29. Re:Quick (maybe stupid) question... by Malc · · Score: 2

    It's easy under Windows. For everything but Win2K (and ME?) you will have to download and install Global IME from MSFT. I don't know how you do this under X, or for Lynx users, in a console. I have to admit, MSFT makes it quite easy for us developers to internationalise our products.

  30. Re:English Based Systems sending E-Mail? by Speare · · Score: 4

    So how's this gonna work for systems not set up to handle the asian character set?

    Read the links.

    The proposal implements an ASCII encoding scheme, called RACE. A certain prefix (they list the debugging prefix as "bq-") indicates a RACE-encoded domain name.

    The rest of the ASCII encoding either appears in ASCII for dumb browsers, or is converted to Unicode or Big5 or whatever character set it wants.

    For "dumb browsers" (not a flame, just an indication of character-set-awareness), you'd see some crazy domain like http://www.bq-ag0970ag00ah07h.or.jp/; for "smart browsers," it would appear in your own kanji font.

    --
    [ .sig file not found ]
  31. It breaks the dns-rfc. by arcade · · Score: 2

    Has there been an update to the DNS RFC allowing this? If I remember correctly, it does NOT allow special chars in domain names.

    Furthermore, does this limit those domains to 32 chars of length? (unicode, 2 bytes per char, dns system allows a maximum of 64 chars for domainnames .. but, that should probably be interpreted as bytes).
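
    A back-of-the-envelope answer, as a sketch. The per-label limit in RFC 1035 is 63 octets. The rest rests on assumptions about the RACE layout as I understand the draft: a 4-character prefix, then base32 (5 bits per output character) over either one row byte plus one byte per character when all characters share a Unicode row, or one marker byte plus two bytes per character otherwise. Escapes and the mixed two-row case are ignored, and the function names are illustrative.

```python
MAX_LABEL_OCTETS = 63  # per-label limit from RFC 1035


def race_label_length(n_chars: int, single_row: bool = True, prefix_len: int = 4) -> int:
    """ASCII length of an encoded label under the assumptions stated above."""
    payload = 1 + (n_chars if single_row else 2 * n_chars)
    return prefix_len + (payload * 8 + 4) // 5  # ceil(bits / 5) base32 chars


def max_chars(single_row: bool = True, prefix_len: int = 4) -> int:
    """Largest character count whose encoded label still fits in 63 octets."""
    n = 0
    while race_label_length(n + 1, single_row, prefix_len) <= MAX_LABEL_OCTETS:
        n += 1
    return n


if __name__ == "__main__":
    print(max_chars(single_row=True))
    print(max_chars(single_row=False))
```

    Under those assumptions a label holds about 35 characters when they share a row and about 17 when they don't, so the worst case is tighter than the 32 guessed above.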

    Also, doesn't it kinda suck to make large parts of the net unavailable for most?

    --paddy
    --
    "Rune Kristian Viken" - http://www.nwo.no - arca
    1. Re:It breaks the dns-rfc. by Speare · · Score: 2

      Such an authoritarian title. Are you sure? It proposes ASCII encoding, not a Unicode or other mbcs usage directly.

      Also, doesn't it kinda suck to make large parts of the net unavailable for most?

      Don't you think the Chinese and Japanese people could say the same thing about English?

      --
      [ .sig file not found ]
  32. Multilingual Domain Names by Smuj · · Score: 3

    A few notes...

    The Internet Society probably isn't too happy about this. They released a statement on November 8th encouraging NSI to back off and let the IETF IDN WG do its job.

    Also, there are companies that are already currently operating in this market, including WALID, which is taking registrations for Arabic domain names (AND RESOLVING THEM), and will soon be adding Hindi, Tamil, and two Chinese scripts before moving into other markets.

  33. Ideogrammatic languages are a pain by swb · · Score: 2

    Because the Mediterraneans figured out that if they came up with simple symbols that represented sounds (an alphabet) and could be strung together to transcribe those spoken words instead of separate ideograms for each spoken word, you could not only learn to read and write much more easily, you could also write down other languages with the same written symbols.

    One of the major reasons this happened was that they were trading with different peoples who used ideograms instead of alphabets. Since learning one ideogrammatic written language is hard enough and learning 5 is a single lifetime's achievement, a simpler way was found.

    The Chinese were homogeneous and didn't need to deal with anyone other than the Chinese and hence kept their ideogrammatic written language.

    It's a simple fact that it's far easier to implement the Roman alphabet on a computer than a zillion independent symbols -- you need less RAM, simpler displays and so on.

    What the Chinese need to do is settle on a single way to transliterate spoken Chinese into the Roman alphabet (or even the Cyrillic, Hebraic or Greek if that's what they want). Ideograms are neat, but they're a pain in the ass.

    Sorry, it's not cultural imperialism, just pragmatism!

  34. Re:Why asian character sets? by FigWig · · Score: 3

    Wouldn't it make more sense to implement umlauts like ö/ü/ä first?

    I have dibs on släshdot.org!!

    --
    Scuttlemonkey is a troll
  35. Kanji by dizee · · Score: 2

    w3m, the console web browser that can format tables, frames, etc, was written by Akinori Ito. He includes support for kanji. I know because there is a #ifdef PC_KANJI that is misplaced every time I go to download and compile it without Japanese character support.

    I believe there is also an xterm counterpart for kanji.

    Mike

    "I would kill everyone in this room for a drop of sweet beer."

  36. Re:This is not meant to sound xenophobic, but by Malc · · Score: 2

    It's called evolution. Things weren't implemented properly the first time. Now we're correcting that. A lot of modern computing was invented in English-speaking countries, so it's hardly any wonder our systems can't cater for the rest of the world. It seems rather unfair to put them at a disadvantage. Besides, they will eventually force a change, and we don't want incompatibilities now, do we? Personally, I can't wait for everybody to move to Unicode - it will make life as a software developer easier.

  37. canonicalisation issues by jbert · · Score: 3

    Hmm. This could lead to fun. Some character sets/character encodings allow different byte sequences to map to the same character.
    (See the Unicode bugs recently in IIS, where a unicode representation of '../' is used to navigate upwards in the directories of the server to view files outside of the server root.)
    Now, does a company have to register all possible permutations of byte sequences which all map to the same character sequence? As well as doing so in .com, .net and .org.
    We'll see.
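
    A concrete case of "different sequences, same character", sketched with Python's standard unicodedata module. The example character is my own choice, and whether a registry would normalise before comparing is an assumption, not something the VeriSign material states.

```python
import unicodedata

composed = "caf\u00e9"     # e-acute as one code point, U+00E9
decomposed = "cafe\u0301"  # plain e followed by combining acute, U+0301


def canonical(label: str) -> str:
    """Canonicalise to NFC so equivalent spellings compare equal."""
    return unicodedata.normalize("NFC", label)


if __name__ == "__main__":
    print(composed == decomposed)                        # raw comparison
    print(composed.encode("utf-8").hex())
    print(decomposed.encode("utf-8").hex())
    print(canonical(composed) == canonical(decomposed))  # after normalising
```

    If the registry and every resolver normalise the same way before encoding, one registration covers both spellings. If they don't, you get exactly the permutation problem described above.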

  38. English Based Systems sending E-Mail? by tomjgroves · · Score: 4

    So how's this gonna work for systems not set up to handle the Asian character set? Let's say I want to send to joe.bloggs@somechinesename.net from my FBSD or Linux boxes? Not too much fun, I think...

  39. Great, if not already blocked by strredwolf · · Score: 2
    This would be great for China, if half (if not all) of its mail servers didn't relay spam back to the US (and therefore get blocked independently by ISPs and by the MAPS RSS). There's been no response out of those admins who don't have the latest software (come on! Sendmail 8.10 is free! Why are you running the broken SMI Sendmail?!?).



    --
    WolfSkunks for a better Linux Kernel
    $Stalag99{"URL"}="http://stalag99.keenspace.com";

    --
    # Canmephians for a better Linux Kernel
    $Stalag99{"URL"}="http://stalag99.net";
  40. Why asian character sets? by Ashran · · Score: 3

    Wouldn't it make more sense to implement umlauts like ö/ü/ä first?
    Easier to test etc..

    --

    Before you email me, remember: "There is no god!"
  41. Big5 or Unicode by Giant Robot · · Score: 4

    How is this going to work? Since the majority of Chinese users input their Chinese as Big5, (eg www.ê.com) will not be the same as the Unicode equivalent.

  42. Appearance of names by ce25254 · · Score: 2

    The general FAQ answers how the names will appear in a web browser, but they use a GIF to show the Chinese name. So I'm still wondering how it will look to someone without an OS that displays the characters properly. Never mind that you can download extensions to display the content in the web browser; the location will be garbage, right?

    Will this be a good kick in the butt for internationalization of your OS?

  43. IMO about time by RCobbett · · Score: 2

    I'm surprised it took so long for somebody to do this. I don't relish trying to learn a whole new set of shortcuts (my grasp of the 255 odd ASCII set is slipping fast, never mind kanji!). I did a story about this yesterday called over at http://www.t3.co.uk. It's nice to see that the global part of the Internet is still spreading...

  44. Re:Quick (maybe stupid) question... by darthaya · · Score: 2

    It is easy: you use CXterm, a special program developed to input Chinese under X. And there are a number of other programs you can use to input Chinese under UNIX's console mode as well.

  45. RACE Encoding scheme is not very PC by ers81239 · · Score: 2

    Isn't it odd that the encoding scheme for Asian domains is called RACE? Who's in charge over there at Verisign, the Ku Klux Klan?

    --
    there are 2 kinds of people. those who divide people into 2 kinds, and those who don't.
  46. TLD by Fjord · · Score: 2

    I noticed a promotion for this on the Network Solutions website a week or two ago. I think that this is great, but we need TLDs in these characters as well: one with the Chinese character for commercial, one for organization, one for educational. I wonder if that new TLD system that they are testing will allow these characters. For 50,000, you could register one of these Chinese TLDs and probably make a lot of money.

    --
    -no broken link