Slashdot Mirror


ICANN Under Pressure Over Non-Latin Characters

RidcullyTheBrown writes "A story from the Sydney Morning Herald is reporting that ICANN is under pressure to introduce non-Latin characters into DNS names sooner rather than later. The effort is being spearheaded by nations in the Middle East and Asia. Currently there are only 37 characters usable in DNS entries, out of an estimated 50,000 that would be usable if ICANN changed naming restrictions. Given that some bind implementations still barf on an underscore, is this really premature?" From the article: "Plans to fast-track the introduction of non-English characters in website domain names could 'break the whole internet', warns ICANN chief executive Paul Twomey ... Twomey refuses to rush the process, and is currently conducting 'laboratory testing' to ensure that nothing can go wrong. 'The internet is like a fifteen story building, and with international domain names what we're trying to do is change the bricks in the basement,' he said. 'If we change the bricks there's all these layers of code above the DNS ... we have to make sure that if we change the system, the rest is all going to work.'" Given that some societies have used non-Latin characters for thousands of years, is this a bit late in coming?

24 of 471 comments

  1. Changing a system by Kamineko · · Score: 5, Insightful

    Changing a system which works is a very, very bad idea.

    Won't this open up the system to many more phishing attacks involving addresses which include non-Latin characters which look similar to Latin ones?

    1. Re:Changing a system by Daniel_Staal · · Score: 4, Insightful

      That's one possible problem. Then there are characters that are technically equivalent but have different representations. (Accented vowels for instance: you can code them directly, or you can code the accent and the vowel separately.) You need some way to make sure they both go to the same place, no matter whether it's UTF-8, -16, -32 or whatever else people throw at it.
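
      A minimal sketch of the "two representations" problem, assuming Unicode NFC normalization is applied before names are compared. The helper name `canonical` is hypothetical and not part of any DNS software; this only illustrates the idea, it is not how any resolver is known to work.

```python
import unicodedata

# The same visible text "é" in two encodings: one precomposed code point,
# versus a plain "e" followed by a combining acute accent.
precomposed = "\u00e9"
decomposed = "e\u0301"

def canonical(label: str) -> str:
    """Hypothetical helper: fold a label to NFC and lowercase it before comparing."""
    return unicodedata.normalize("NFC", label).lower()

# Compared code point by code point, the two forms differ.
print(precomposed == decomposed)                        # False
# After normalization they compare equal.
print(canonical(precomposed) == canonical(decomposed))  # True
```

      UTF-8, UTF-16 and UTF-32 are only byte serializations of the same code points, so the comparison has to happen after decoding and normalizing, not on raw bytes.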

      And, of course, you need to make sure when someone types this into a browser some major DNS server someplace won't crash.

      I'm all for adding non-latin characters. But I do recognize that it should be a slow process.

      --
      'Sensible' is a curse word.
    2. Re:Changing a system by KingJoshi · · Score: 5, Insightful

      But it's not working. Mainly for all those people that want non-Latin characters. It's been broken from the beginning. Sure, there are historical reasons why we have the system we do, but change is definitely needed. Twomey is right that a change can't be rushed and it needs to be done right (for reasons of security, compatibility, stability, etc). However, the change does need to occur and there needs to be some level of pressure to ensure that it happens.

      --
      In times like these, it is helpful to remember that there have always been times like these. - Paul Harvey
    3. Re:Changing a system by jmorris42 · · Score: 4, Insightful

      > Wont this open up the system to many more phishing attacks involving addresses which include non-latin characters which look similar to latin ones?

      Even worse, although your problem is reason enough to postpone doing this change. It will break the very idea of the Internet as a commons when URLs can't even be typed in on all keyboards. There are good reasons why DNS didn't even include the whole ASCII set. Least common denominator is a good design decision. Every character currently allowed is easy to generate on ALL keyboards, can be printed in an unambiguous way by EVERY printing system, etc. Remember that a lot of wire services aren't even 7-bit ASCII clean; email addresses on a lot of news wires have to use (at) instead of @.

      More bluntly, of what use is the parts of the Internet I can't even type the domain name for? As things now stand I CAN, and have, snarfed firmware directly from .com.tw sites where I couldn't read any of the text. Learned things from sites where I couldn't read anything but the code text and command lines. Seen images and understood even when the captions were meaningless to me. I'm sure the reverse is equally true, that those who do not speak English still benefit from the English majority of the Internet the same way. All this because DNS is currently universal. Break that universal access feature and, frankly, they can just as easily ignore ICANN and just get the hell off the Internet and make their own walled garden network based on IPv6 technology.

      At a minimum, unicode DNS should be restricted to IPv6 ONLY. No sense wasting scarce IPv4 resources on supporting walled off ghettos.

      --
      Democrat delenda est
    4. Re:Changing a system by ericlondaits · · Score: 4, Insightful

      Accented vowels would be a problem, at least in spanish. Though their use is "mandatory", people with mediocre spelling don't use them in the internet. Even people who use them don't always do it: even though the use of accents is mostly regular, there are many (and very common) irregular placements.

      Let's say for instance we have an online shop for tea called "Sólo Té" (Tea Only). Both accents are due to irregular rules ("Sólo" = "Only" and "Solo" = "Alone", "Te" is a personal pronoun and "Té" = Tea). Some people would try the current www.solote.com, others would try the correct www.sóloté.com, some would try www.sólote.com and yet others www.soloté.com depending on their spelling capabilities.

      What this basically means is that in order to make sure everybody finds your domain and to avoid phishing you have to register four different domains.

      A solution to this problem could be what Google does right now with accents: map them to the unaccented vowel. Thus "Solo Te" and "Sólo Té" would both find the "Sólo Té" store.
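
      A rough sketch of that accent-folding idea, assuming it is done by decomposing each character and dropping the combining marks. The function name `fold_accents` is made up for illustration, and nothing here is claimed to be how Google or any registry actually does it.

```python
import unicodedata

def fold_accents(text: str) -> str:
    """Hypothetical helper: map accented letters to their unaccented base letter."""
    # NFD splits "ó" into "o" plus a combining acute accent.
    decomposed = unicodedata.normalize("NFD", text)
    # Drop every combining mark, keeping only the base characters.
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return unicodedata.normalize("NFC", stripped).lower()

# The four spellings a visitor might try for the "Sólo Té" shop.
variants = ["solote", "sóloté", "sólote", "soloté"]
print({fold_accents(v) for v in variants})   # {'solote'}
```

      One caveat of this blunt approach: it also folds "ñ" to "n", and in Spanish those are different letters, so a real mapping would need per-language rules.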

      --
      As a Slashdot discussion grows longer, the probability of an analogy involving cars approaches one.
    5. Re:Changing a system by Sin+Nombre · · Score: 5, Insightful

      'when URLs can't even be typed in on all keyboards'
      As far as Japanese go, there are very usable technologies that allow to type in kanji. Using a standard latin keyboard. It works pretty well, and i'm not sure what other languages have such options available, but since most of Asia uses the same kanji system I'm pretty sure that at least Asia has viable typing options.
      'of what use is the parts of the Internet I can't even type the domain name for?'
      It's of no use... to you. But then again, can you read Japanese, Korean, Arabic, Sanskrit or any other non-Latin language? No? Then your usability isn't in question here.

      --
      "Im such a nonconformist I'm going to not conform to the rest of you!"
      "Dude I think we just got goth-served"
    6. Re:Changing a system by teh+kurisu · · Score: 4, Insightful

      Just because the letters aren't printed on your keyboard doesn't mean it won't type them. Have a look at the list of keyboard layouts in your OS. Sure, it's an inconvenience for you, but less of an inconvenience than it is to the people for whom it is a barrier to entry. Or you could use Google - a lot of people don't even bother typing in domain names any more, they just search.

      The whole point about this is that it avoids walled gardens, because the DNS records are still held by ICANN. The alternative is that China decides it's had enough, and creates its own root servers, causing a very real split.

    7. Re:Changing a system by Zaatxe · · Score: 3, Insightful

      > As far as Japanese go, there are very usable technologies that allow to type in kanji. Using a standard latin keyboard. It works pretty well, and i'm not sure what other languages have such options available, but since most of Asia uses the same kanji system I'm pretty sure that at least Asia has viable typing options.

      I wonder how you got +4 mod points... this makes no sense at all!!

      Let's suppose you are a Japanese person and you travel to Brazil. Never mind whether you can speak Portuguese or not, but then you need to send an e-mail using your company's webmail server from a computer at the hotel. And suppose this webmail server has kanji characters in its URL. How are you going to type them? Believe me, Brazilian Portuguese Windows has no support for Asian languages (at least not by default, and actually I don't know if it's even possible with a regular Brazilian Windows XP). What now?

      --
      So say we all
    8. Re:Changing a system by dasunt · · Score: 4, Insightful

      > As far as Japanese go, there are very usable technologies that allow to type in kanji. Using a standard latin keyboard. It works pretty well, and i'm not sure what other languages have such options available, but since most of Asia uses the same kanji system I'm pretty sure that at least Asia has viable typing options.

      I must have missed where Japan conquered 51%+ of the area east of the Ural mountains.

      AFAIK (and I'm not an expert), China, Japan, Korea and Vietnam used very similar writing systems descended from Chinese Hanji characters. Vietnam and Korea (South Korea at least) later adopted other alphabets. So really, only China and Japan commonly use Hanji/Kanji, and even then, the CJK unification of hanji/hanja/kanji characters really annoyed a few purists when similar hanji/hanja/kanji were merged in Unicode.

      So, other than hanji/kanji, there is hangul (S. Korea), hana/kana (Japan -- yes, they have more than one writing system!), the Thai alphabet, the Cyrillic alphabet (former USSR), the Arabic alphabet (Middle East), Hebrew (Israel), the Brahmic scripts (India) and the Georgian alphabet. (And this is just off the top of my head, I wouldn't be surprised if there were a few more writing systems in use in Asia!).

      And then, just to confuse the problem, there are the various forms of encoding. Admittedly, unicode would probably be one of the better methods, but there are a lot of pre-unicode encodings in common use.

      When you expand the problem to be worldwide, there's also the Ethiopian and Greek alphabets that are used in their respective regions. There's also a ton of latin-based alphabets, which introduces many more characters than are currently used in the DNS system. (Including characters that look a lot like existing characters!)

      And then you have the problem of alphabets used only by very small groups, such as Cherokee (Oh, I'm going to get flamed!). There are very few people who can write in Cherokee, but does that mean that the Cherokee language shouldn't be part of the DNS system?

      Now, can you see why this is a mess?

    9. Re:Changing a system by Znork · · Score: 3, Insightful

      "It will break the very idea of the Internet as a common when URLs can't even be typed in on all keyboards"

      You know, when one sees comments like that, it's not strange that non-7bit ascii countries find themselves rather exasperated with the rate of progress. If you take a few seconds to actually research the issue you'll find both a suggestive lack of multi-thousand key keyboards, as well as a whole host of solutions to that problem.

      I mean, I can cut'n'paste chinese and japanese into vi, save the file with a unicode filename, and it'll just work. Earlier valid technical reasons are gone, everyone else has solved this; now the excuses start sounding really hollow.

      It's time to drag DNS kicking and screaming out of the dark ages.

    10. Re:Changing a system by dcam · · Score: 3, Insightful

      You mean: It can only be better for me, in the long run, if we all end up using my alphabet.

      --
      meh
  2. Yes and No by Aadain2001 · · Score: 4, Insightful

    Yes, countries that use non-English characters should be able to interact with the rest of the world using their natural language. No, they shouldn't rush the change and risk a possible crash of a large portion of the Internet. Be patient, young padawans: soon you will be able to have DNS names with any character you can think of, but it will be reliable and actually work.

    --
    Space for rent, inquire within
  3. When you've built on a foundation of straw- by Bonker · · Score: 4, Insightful

    - Don't be too surprised when people around you start building their own houses rather than choosing to pay rent.

    DNS upheaval has been a long time coming, and the current anti-American sentiment worldwide isn't exactly helping to stabilize it. We're already seeing all sorts of ad hoc routing setups that deal with shortcomings of an Ameri-centric DNS. My guess is that within the next few years, ICANN's 'control' of the internet will be in name only as everyone else in the world will have moved on to alternative routing and domain systems.

    --
    The next Slashdot story will be ready soon, but subscribers can beat the rush and slashdot the links early!
    1. Re:When you've built on a foundation of straw- by t0tAl_mElTd0wN · · Score: 3, Insightful

      I think that might be jumping the gun. American or not, the internet plays a huge role in the functionality of the modern world. Just imagine the chaos if international office networks went from "I can't open this word document you sent me because it's in a different format" to "I can't get email from you because you're on a different internet". American DNS control or not, decentralizing the internet like you suggest might happen could be one of the worst things that could happen for global communications.

    2. Re:When you've built on a foundation of straw- by benoitg · · Score: 3, Insightful

      Please, there have been complaints about DNS not supporting most languages' (even Latin) character sets since the birth of the web, so it's completely untrue that we waited till everything was built. After well over a decade of patient waiting, it seems that actual pressure was required to get this change through.

  4. Stupid question by VENONA · · Score: 3, Insightful

    "Given that some societies have used non-Latin characters for thousands of years, is this a bit late in coming?"

    No.

    Zonk either knows zero about the histories of the Internet or DNS, or is so enamored of finishing stories with questions that he'll tack on the truly ridiculous.

    --
    What you do with a computer does not constitute the whole of computing.
  5. Watch out for attacks by Agelmar · · Score: 5, Insightful

    For all you people saying "There's no problem, just do it" - I say watch out... there will be a rush of attacks and spoofs as soon as this is opened up. The letter "a" appears in the Unicode character set multiple times, and some of the variants are almost indistinguishable. I'm not just talking about someone registering släshdot.org, I'm talking about someone registering slashdot.org (the a is FF41 instead of the normal a). Good luck telling the attacks apart from the real sites.
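
    A small sketch of the FF41 case, assuming a registry or browser folds names with Unicode NFKC compatibility normalization before comparing them. The helper name `compat_fold` is hypothetical. This is one possible mitigation for this specific lookalike, not a complete defense.

```python
import unicodedata

genuine = "slashdot.org"
spoofed = "sl\uff41shdot.org"   # the "a" is U+FF41, not U+0061

def compat_fold(name: str) -> str:
    """Hypothetical helper: apply NFKC compatibility folding, then lowercase."""
    return unicodedata.normalize("NFKC", name).lower()

print(unicodedata.name("\uff41"))        # FULLWIDTH LATIN SMALL LETTER A
print(spoofed == genuine)                # False
print(compat_fold(spoofed) == genuine)   # True

# NFKC does not help with letters from another script that merely look alike:
# U+0430 is CYRILLIC SMALL LETTER A and survives folding unchanged.
cyrillic_spoof = "sl\u0430shdot.org"
print(compat_fold(cyrillic_spoof) == genuine)   # False
```

    So folding catches the fullwidth "a" but not a Cyrillic one, which is why the lookalike problem needs more than normalization.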

  6. Sure, go 'head by kahei · · Score: 4, Insightful


    I'd be in favor of the change just because anything that undermines the Unix Tower of Babel -- the dependency on ASCII which complicates text handling sooooo much even when Windows solved the problem soooo long ago -- is good. Even Java gets it. Even Apple (finally) get it. Unix Is Teh Problem.

    And the ASCII problem isn't just bad because it forces people to use inefficient encodings like UTF-8 (THREE bytes per character?) It's bad because it allows people to write code like:

    if(string[index] == '.' || string[index] == '?' || string[index] == '!') sentenceEnd = true;

    (a line repeated, with subtle variations, several hundred times in the code of a certain ubiquitous editor).

    And, lo and behold, the above does not work, but once it appears in a few thousand places it's impossible to fix, and a vast towering structure of fixes made by people who don't really understand why it's an issue is built.
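
    To make the failure concrete, here is a hedged sketch in Python rather than the editor's own language. The set `SENTENCE_END` is an illustrative and deliberately incomplete list chosen for this example, and `is_sentence_end` is a hypothetical name.

```python
# Illustrative, incomplete set of sentence-ending marks from several scripts.
SENTENCE_END = {
    ".", "?", "!",
    "\u3002",  # IDEOGRAPHIC FULL STOP
    "\uff1f",  # FULLWIDTH QUESTION MARK
    "\uff01",  # FULLWIDTH EXCLAMATION MARK
    "\u061f",  # ARABIC QUESTION MARK
    "\u0964",  # DEVANAGARI DANDA
}

def is_sentence_end(text: str, index: int) -> bool:
    """Check the character (code point) at index, not a raw byte."""
    return text[index] in SENTENCE_END

sample = "これは文です。"
print(is_sentence_end(sample, len(sample) - 1))   # True

# The byte-array view of the same text: three UTF-8 bytes per character here,
# and the last byte is not '.', '?' or '!', so the ASCII-only test misses it.
raw = sample.encode("utf-8")
print(len(sample), len(raw))   # 7 21
print(raw[-1] in b".?!")       # False
```

    Even this is only a step: indexing by code point still ignores combining sequences, and proper sentence segmentation needs rules rather than a lookup table.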

    So, even though the proposed change would be hugely inconvenient for a huge number of people, I'm in favor, because I want the world to grow the fork up and understand that text != byte array some time while I'm still alive.

    --
    Whence? Hence. Whither? Thither.
  7. URL goldmine. by emmagsachs · · Score: 4, Insightful

    Imagine the land rush that'll ensue if DNS allows non-Latin characters. Trademark transliteration? A heaven for domain squatters and an upcoming surge of legal fees for trademark lawyers, if you ask me.

    Nice for localising, sure, but how usable will Japanese, Indian, or Arabic script URLs -- for example -- be for those who do not have access to the respective sets or keyboard layouts?

  8. Not a trivial job by turnipsatemybaby · · Score: 4, Insightful

    The internet was originally conceived, designed, and implemented in the USA at a time where hardware was at a premium, and corners were cut to conserve that limited resource. DNS was just one of the results of that era. However, it is the most visible because it is the front end means for people to find each other. That means there is now a very well established standard, used by people across the entire globe, that is very difficult to change.

    Changing all the DNS servers in the world to switch from ASCII to Unicode is NOT trivial. The fact that some societies have used non-latin characters for thousands of years is completely and utterly irrelevant. THEY didn't make the internet. They simply bolted themselves on to an existing infrastructure.

    I agree that progress needs to be made to accommodate non-Latin characters, but to have people whining about "how they want it, and want it now"... That's just ridiculous. It's like waltzing into a house that was built 40 years ago and having a tantrum because the stairs are too steep and the house is too squished. Major structural renovations take time, effort, and careful planning. And there is nothing you can do to avoid that, short of implementing cheap stop-gap measures that are virtually guaranteed to cause even bigger unintended headaches later on.

  9. thousands of years? by sexyrexy · · Score: 3, Insightful

    > Given that some societies have used non-Latin characters for thousands of years, is this a bit late in coming?

    Those societies did not build an entire economic and social infrastructure using all 50,000 of those characters in a few decades, though.

    --

    Rex is 09 F9 11 02 9D 74 E3 5B D8 41 56 C5 63 56 88 C0
  10. Re:Can't trust your browser's address bar anymore. by NoMoreNicksLeft · · Score: 4, Insightful

    Why not have the browser fail to render them outside of the user's preferred alphabet?

    Cyrillic users would see www.**c******.com, latin users would see www.mi*rosoft.com?

    Or better yet, put up a big warning that it's using mixed alphabets?
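
    A crude sketch of such a mixed-alphabet check, assuming the first word of each character's Unicode name is a good enough stand-in for its script. That is a simplification made for this example; the helper names `scripts_in` and `is_mixed` are hypothetical, and no browser is claimed to work this way.

```python
import unicodedata

def scripts_in(label: str) -> set:
    """Hypothetical helper: rough script guess from the first word of the Unicode name."""
    return {
        unicodedata.name(ch, "UNKNOWN").split()[0]
        for ch in label
        if ch.isalpha()          # ignore digits and hyphens
    }

def is_mixed(label: str) -> bool:
    """True when the letters of a label come from more than one script."""
    return len(scripts_in(label)) > 1

real = "microsoft"
fake = "mi\u0441rosoft"   # third letter is U+0441 CYRILLIC SMALL LETTER ES

print(scripts_in(real))           # {'LATIN'}
print(sorted(scripts_in(fake)))   # ['CYRILLIC', 'LATIN']
print(is_mixed(real), is_mixed(fake))   # False True
```

    A label written entirely in Cyrillic lookalikes would pass this check, so a warning on mixed alphabets only covers part of the spoofing problem.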

  11. Re:English, not latin languages by brusk · · Score: 3, Insightful

    True, but the English subset of the alphabet has another feature that matters in this regard: it's a lowest common denominator that all computers on the planet are capable of producing. I can type any letter easily on a computer in China, Israel, Jordan, Russia, Spain, India, etc. I can't necessarily input a given Chinese character, Arabic letter, or Cyrillic letter.

    Why does this matter? Well, one argument is that it doesn't, much: if I want to view a Chinese website I'm probably in China and can input Chinese characters on my computer. But what about a Chinese person visiting an English-speaking country and surfing at a public computer (e.g. in a web cafe)? If the computer isn't set up for input of Chinese, he/she won't be able to view certain sites if they can only be accessed by inputting a non-latin URI. Thus to serve all possible customers, the computer would need dozens of input systems installed. That simply isn't going to happen. The alternative of just inputting Unicode codes is unworkable.

    Hence it makes more sense to have a requirement that any non-Latin DNS registration ALSO be accompanied by a pure ASCII one, so that any computer will be able to access it. This also helps people who don't know a given language very well: if you don't know Chinese well, and are just learning it, you may find it hard to type in a web address with unfamiliar characters, even if your computer has Chinese input enabled. That shouldn't keep you from visiting a site.

    In fact, there are some Chinese systems that do this, by creating a registry of Chinese names for websites. But they involve kludgy workarounds like browser bars that are not universal and are otherwise evil.

    --
    .sig withheld by request
  12. I think the whole idea is a mistake by msobkow · · Score: 4, Insightful

    The fundamental DNS should be left alone: it is a programmer's and administrator's tool, not an advertising medium. It is founded, like programming languages, on a fundamental 7-bit ASCII character set, and is not intended to be used for NLS text.

    A far better solution is some form of VDNS that translates NLS text names into the proper domain name at the system level. That also allows the same domain to have multiple language translations to reflect localized product and service names.
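
    "VDNS" is this comment's own term, but the IDNA standard (RFC 3490) follows a similar idea: the application converts each non-ASCII label to an ASCII "xn--" form, and the DNS itself only ever sees ASCII. A minimal sketch using Python's built-in `idna` codec, with `bücher.example` as a made-up sample domain:

```python
# Sample name chosen for illustration only.
domain = "bücher.example"

# Convert to the ASCII-compatible form that is actually sent to DNS.
ascii_form = domain.encode("idna")
print(ascii_form)                  # b'xn--bcher-kva.example'

# The conversion is reversible, so software can display the original text.
print(ascii_form.decode("idna"))   # bücher.example

# Every byte of the wire form is plain 7-bit ASCII.
print(all(byte < 128 for byte in ascii_form))   # True
```

    This does not give one domain several localized names as suggested above. It only shows that the translation layer can live outside the DNS servers.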

    We seriously need to kick the general political community in the arse. They keep trying to impose technical decisions, and it fails as miserably as any corporate PHB's uninformed decisions. ASK the techies to propose solutions instead of shoving ill-conceived ideas down our throats.

    For example -- once you mandate multibyte domains, you implicitly mandate multibyte URL components. Goodbye direct mapping of names to the directories, file systems, and servers.

    Bad idea. Very bad idea.

    --
    I do not fail; I succeed at finding out what does not work.