Slashdot Mirror


ICANN Under Pressure Over Non-Latin Characters

RidcullyTheBrown writes "A story from the Sydney Morning Herald is reporting that ICANN is under pressure to introduce non-Latin characters into DNS names sooner rather than later. The effort is being spearheaded by nations in the Middle East and Asia. Currently there are only 37 characters usable in DNS entries, out of an estimated 50,000 that would be usable if ICANN changed naming restrictions. Given that some BIND implementations still barf on an underscore, is this really premature?" From the article: "Plans to fast-track the introduction of non-English characters in website domain names could 'break the whole internet', warns ICANN chief executive Paul Twomey ... Twomey refuses to rush the process, and is currently conducting 'laboratory testing' to ensure that nothing can go wrong. 'The internet is like a fifteen story building, and with international domain names what we're trying to do is change the bricks in the basement,' he said. 'If we change the bricks there's all these layers of code above the DNS ... we have to make sure that if we change the system, the rest is all going to work.'" Given that some societies have used non-Latin characters for thousands of years, is this a bit late in coming?

4 of 471 comments

  1. Re:Watch out for attacks by gsasha · · Score: 5, Informative

    It's called a "Homograph Attack". See http://en.wikipedia.org/wiki/IDN_homograph_attack

  2. you couldn't be more wrong by Srin Tuar · · Score: 5, Informative

    much even when Windows solved the problem soooo long ago

    i18n on Windows is far from "solved".
    I do admit that MS had a huge benefit when they started pushing Unicode.
    (It takes a company with Microsoft's level of clout to push around national governments.)


    And the ASCII problem isn't just bad because it forces people to use inefficient encodings like UTF-8 (THREE bytes per character?)


    Perhaps you don't realize that UTF-8 is on its way to becoming the dominant character encoding,
    and that legacy cruft such as UTF-16 (designed to deal with design flaws in Windows) is being phased out.

    Even languages whose text would end up as mostly 3-byte characters tend to benefit from the savings on the
    single-byte characters used for control and formatting markup.

    I'm not going to harp on about it, but a few basic web searches could enlighten you here.

    if(string[index] == '.' || string[index] == '?' || string[index] == '!') sentenceEnd = true;

    Code like that *works* in UTF-8, which is one of the things that makes it beautiful (among many others).

    It allows you to deal with the world's character sets when it matters, and allows you to ignore them when it does not.
    (For example, a lexical analyzer that specifies its own tokens does not want to support punctuation from every language ever conceived.)

    And if you think code like that doesn't exist in the Windows world, you are sadly quite naive.
    In my experience internationalizing applications, it's typically far easier to update Unix applications, which
    on occasion need nearly no changes at all, compared to the laborious grind and near-total rewrite often needed
    for MS Windows applications.

  3. Re:Changing a system by Anonymous Coward · · Score: 5, Informative

    What's this? I've been able to use the Norwegian characters in domain names for a long time. There are screenshots over at http://en.wikipedia.org/wiki/Internationalized_domain_name

  4. Re:Changing a system by cortana · · Score: 3, Informative

    It depends on your operating system. The "standard" way is to hold Ctrl+Shift and then type the hexadecimal representation of the Unicode code point that you want, but that conflicts with a lot of keyboard shortcuts that people use, so implementors often alter it a bit (for example, with GTK+ you press Ctrl+Shift+U and then type the code point).

    If your keyboard has a compose key then you can often compose a glyph from two similar-looking glyphs. For example, for an o with an umlaut, press Compose and then type " followed by o to get ö (though I expect Slashdot will filter that character out).

    Macintosh users have an Option key that they can use to make weird glyphs (option-5 for the infinity symbol, option-g for the copyright symbol, etc). On most operating systems, various other combinations of the Ctrl/Shift/Meta/Alt/AltGr modifier keys and regular keys will allow you to type more glyphs. Most desktop environments also have an on-screen keyboard program that eases experimentation in this area.

    Users of complex (e.g., Asian) scripts have a host of input methods to choose from and configure.

    Finally, if all else fails, create a text file full of your favourite non-ASCII characters and resort to the tried and tested method of copying and pasting! :)