Slashdot Mirror


ICANN Plans Non-English Character Domain Testbed

Wanted writes: "This article reveals ICANN's plan to open registration of domain names with national characters. Actually, it's Network Solutions who are responsible for the technical side of implementing the project. Initially they want to support CJK (Chinese, Japanese, Korean), then Spanish and other European languages. I don't know why they like Spaniards; I'd rather talk about supporting ISO-8859-1 than particular languages. Nevertheless, the Internationalized Domain Names IETF Working Group should be pretty happy about it. I wonder how you would type www.wong-kar-wai.org in Chinese with a classic keyboard :)"

138 comments

  1. Re:Questions by Anonymous Coward · · Score: 1

    Second, if you can't type the url because you don't understand it, how do you expect to understand the information on the page that it points to?

    Well, for example, assume that there's a new brand called El Perro Rabioso selling, say, spiced cucumbers. They have a web page www.el-perro-rabioso.com, which they are advertising all over the place. I don't understand Spanish, and have simply memorized the link, so I'll just go there and see if they have an English section. All works fine since it's Spanish, I understand the alphabet.

    Substitute Korean for Spanish and maybe you'll see the problem.

    Besides, not all information needs be written. Some Chinese site by Mr. Wu might just contain nice images of sunsets in the Chinese countryside or something. But if I'm unable to type the URL of the web page, I cannot view the information, even though I would understand the images (but not necessarily the image captions).

    Your argument about software is true and interesting. Maybe we'll soon have a Babelfish-kind function in our web browser, paint-n-translate! Now that would be cool.

  2. Ýou people just don't get it by claes · · Score: 2

    I guess most of the posters here are Americans, because you just can't grasp that there is a need for this in most other countries. If you can't type ö on your keyboard - too bad. You know what? There will be software that helps you with this, or web services. I promise: the majority of times you want to visit a webpage on a URL with a >7 bit character in it, you will have a link for it. Just point and click! What is so difficult about that? The big problem will occur when you try to read the page. It will consist of letters in combinations that you cannot understand. They will not make up English words. You will have to take a course in another language to get it. That will be the hard part. But just relax, because the English/American web will still be the dominant one, and you will not feel you have missed anything.

    1. Re:Ýou people just don't get it by dadragon · · Score: 1

      Speaking personally as a Canadian who is fully bilingual (speaking only English at home and only French at school), I think these Unicode/high-ASCII/national letters should be allowed only under the TLDs of their respective countries, i.e., montréal.ca, DeutschegegenfranzösischeFreiheit.de, etc. This would make it easy for yankees to go to their treasured com/org/net sites while still allowing national letters.

      --
      God save our Queen, and Heaven bless The Maple Leaf Forever!
    2. Re:Ýou people just don't get it by claes · · Score: 1

      Good point

    3. Re:Ýou people just don't get it by Kickasso · · Score: 1

      Bad point. What if I'm making a site for Norwegian ex-pats in Thailand?

    4. Re:Ýou people just don't get it by CmdrTHAC0 · · Score: 1

      ...you will not feel you have missed anything.

      I'm not sure the point is to see everything in the first place. There is no way you can view all the content of the Internet in a single lifetime.

      The problem comes when somebody is picking out the good hits from a search. They remember they want to go to fahrvergnügen.com, but they have no way to type the ü. Then what?

      The new domain name system should be designed with the following key points in mind:

      1. Backwards compatibility.
      2. Backwards compatibility.
      3. Backwards compatibility.
      Just because you want to have a domain with ü in it doesn't mean you have to implement it in such a way that I'm locked out. (Hmm, we could try escaping it with '\', as in \XXXXX\, where the X's are the Unicode character.) Just as the current system shouldn't lock out people who want to register with international characters.

      As you said, web services and software might help, but how many clueless users are going to be able to find them? How many of those are going to bother? In the "Internet economy" everyone keeps blathering about, they'll go somewhere else. Then all your "the Internet is global" propaganda is worthless...
      <<< CmdrTHAC0 >>>

      --
      __CmdrTHAC0__
      In Soviet Russia, Spanish Inquisition doesn't expect YOU!!
  3. Think. by maur · · Score: 2

    www.ràÐÊMåRk-LâwSÜïTs-ÄRè-ÙS.com

    [ maur_at_technologist.com ] "For a sufficiently powerful message,
    [ http://maur.litestep.com ] the medium is irrelevant."

  4. Re:EXCELLENT POINT... by Almighty+Akbar · · Score: 1

    Spoken by a man whose country invented the fine art of Paki-bashing.

  5. Re:Bad idea, NOT by Alex+Belits · · Score: 2

    First, 8-bit computers are still in use now, and their bus width does not prevent them from dealing with data of any format.

    Second, DNS is already in use, and NO ONE BUT "UNICODERS" EVER COMPLAINED about ASCII use in it. There is no demand for this feature, only some people's desire to break all existing software to sell "updates".

    Domain name is an address. Address should be reachable from everywhere and everything. This works even for postal addresses -- I can write them in English, and they will reach the intended destination in any country, be it US, Spain, Russia or Japan. The same functionality is available now with DNS, but if this proposal is implemented, it won't be available from every computer unless everyone switches to Unicode -- and that won't happen until Hell freezes over (being Russian I have every reason to be sure about that).

    If someone is too concerned about "good-looking" addresses, they should implement some name-translation service like AOL keywords for people who don't like DNS, but the basic architecture of the Internet should not lose interoperability just because someone wants to add one useless feature to his shitty software.

    --
    Contrary to the popular belief, there indeed is no God.
  6. Re:Bad idea, NOT by guran · · Score: 2
    Domain name is an address. Address should be reachable from everywhere and everything.

    Not completely true. The domain name is an alias. The dotted quad is the address.
    Tell me why I can't put a name server out there that supports more characters? Yes, compatibility will be a problem, but not an impossible one. The DNS request must inform the server which charset it is using. Default to UTF-8 of course.

    So we have the following alternatives. (Let's accept "unicode" for "standardized extended charset" in the following OK?)
    1) I run a web server with an ASCII domain name. No problem, a Unicode DNS will be able to handle the ASCII subset.
    2) I run a web server with a Unicode domain name. I must register my domain with a Unicode DNS, and I'd be wise to also register an ASCII domain name as an alias if I think my domain name will cause trouble.

    This works even for postal addresses -- I can write them in English, and they will reach the intended destination in any country, be it US, Spain, Russia or Japan

    Well, but can you write an address in Russian and expect the letter to be delivered in the US? Or in Russia, if mailed in the US?

    If someone is too concerned about "good-looking" addresses, they should implement some name-translation service like AOL keywords for people who don't like DNS, but the basic architecture of the Internet should not lose interoperability just because someone wants to add one useless feature to his shitty software.

    Someone sure is "too concerned", and I'd rather have ICANN set a standard than wait until there is an AOL/MS proprietary name space.

    I do see your point. Clueless fiddling with DNS is not a good thing. *My* point is that that is exactly what will happen unless it is done the proper way.

    --

    All opinions are my own - until criticized

  7. Chinese input is rather easy by PhatBhuda · · Score: 1

    All you need is MacOS 9 or Windows 98SE. Both are capable of Chinese input. With Windows 98SE, I believe you have to download it from Microsoft, but it's no big deal. There are two basic input methods for Traditional Chinese characters: Pinyin and BoPoMoFo. Pinyin spells out the sounds with Latin characters, and BoPoMoFo maps the Mandarin phonetic alphabet (which is named after its first four 'letters') to the standard keyboard layout.

  8. Re:ugg by jmd! · · Score: 1

    dont fscking assume

    the purpose of a domain is global accessibility, not so users with keyboard type x and input method y can access it.

  9. Re:A split on the internet? by webcrafter · · Score: 1

    Why not for .com, .org and .edu? What makes you think non-US netizens are not entitled to use such TLD's?

    the target audience for a given domain can always have the input method to support accessing hosts in that domain

    What if the target audience is on vacation in some other foreign country? Will they have to go to a cyber-ethnic-café to browse their local paper?

    The DNS system is the same for all domains, so if one TLD can use a certain encoding, what's to stop the rest of the TLDs from using it? Nameservers don't know the difference, they just hold the data. Many .es, .se, .de etc. domains are hosted in the US, and vice versa.

    The fact that you can't make sense of the names is irrelevant. As long as someone stands to benefit (domain registrars, for example) it will eventually happen.

    The problem is not that we have language-specific characters, is that you don't. We don't consider them special, except for the fact that we can't use them normally (without kludges like i18n features) on a computer.

    i18n should be about content, not about presentation.

  10. Re:Why Spanish before other European languages. by webcrafter · · Score: 1

    Well, we don't host all of our domains in Spanish ISP's, you know.

    Ummm... I thought there were a lot of people of European descent in the US. Wouldn't the Irish Americans, for example, prefer to type fáilté.com?

  11. Re:How about *dis*-allowing characters? by webcrafter · · Score: 1

    So would you use something like vvvvvv.domain.se? :-) (assuming you use vv instead of w)

  12. Why Spanish before other European languages. by SEE · · Score: 4

    Numbers. 417 million people speak Spanish, 191 million speak Portuguese, 128 million each speak French and German, and no other Latin-alphabet European language has as many as 100 million speakers. It isn't that NSI prefers Spaniards, it's that it prefers larger markets over smaller ones.

    CJK has a similar "numbers" vibe. Since the CJK character sets are generally handled by a single solution in software (esp. since written forms of Japanese and Korean include both native syllabic/alphabetic [respectively] scripts and Chinese ideographic script), you get Japan, Korea, and Greater China in one fell swoop. (Greater China here not only including the PRC and Taiwan, but the Chinese-speaking groups in Malaysia, Singapore, and Indonesia.)

    So why not Devanagari too? Because 1) there are a lot more CJK and Spanish language customers than Hindi/Bengali customers due to internet penetration and financial factors, and 2) the people who would buy the domains in India generally are of the educated classes that speak English. So there's less demand for Devanagari.

    Steven E. Ehrbar

    1. Re:Why Spanish before other European languages. by Cygnus+Rosebud · · Score: 2

      Wrong on a couple of points. I would say (I have no concrete figures) that the majority of people in South America (excluding Brazil, they don't speak Spanish) DO have access to the Net. Because of tourism, there are Internet stores all over the place. And that's not just in the big cities either. Even in the smaller cities near the jungle, there are stores providing access. Whether everyone can afford it is another issue (it costs about the equivalent of 1 US dollar an hour) but A LOT of people have access...

      Just so ya know...

      --
      // Brought to you by letters Q and E and by the number 7.
    2. Re:Why Spanish before other European languages. by David+A.+Madore · · Score: 2

      That's only part of the story. The non-ASCII characters used in Spanish are all contained in the ISO-8859-1 (aka Latin-1) character set. Since most programs are already configured to work correctly with Latin-1, supporting that (in browsers and such, that is) should be rather easy.

      Chinese is moderately complicated. Yes, it does have a huge number of characters, but on the other hand they are fixed-width, and the difficulty of rendering Chinese is rather small once you have the appropriate font (I'm talking of rendering, e.g. in the URL bar); in fact, Chinese is simpler to render than the Latin script.

      Devanagari (just like every other Indic script), on the other hand, is hugely complicated. The crazy ligature system means that we are going to have to wait a looong time before we see software that correctly handles the Nagari script in any non-trivial situation.

      For example, consider the Unicode test page I keep referring to: you have a sample of Russian, a sample of classical (polytonic) Greek, a sample of Sanskrit (written in Devanagari script) and a sample of Chinese. Many browsers will handle the Russian and Chinese correctly, and the non-polytonic Greek characters; very few will handle the full Greek text correctly; none is known that correctly displays the Sanskrit text.

    3. Re:Why Spanish before other European languages. by Fred+Ferrigno · · Score: 2

      Numbers. 417 million people speak Spanish ...

      No one is questioning that there are a lot of Spanish speakers out there. I would doubt, however, that there are many native Spanish speakers with access to the Internet, or that those who do have access don't understand the English alphabet well enough to use domain names. (Couldn't Spanish ISPs patch BIND to resolve domains typed with accents to their low ASCII counterparts?)

      The Internet right now is dominated by rich people in rich countries which, for better or worse, seems to disproportionately exclude more of those 417 million people than other languages. I think ICANN is expecting those 417 million people to 'wake up' to the Internet and wonder why it's not in Spanish. Frankly I don't see this happening soon unless Spanish speaking people have an overwhelming need for porn and illegal mp3s.

    4. Re:Why Spanish before other European languages. by chrischow · · Score: 2

      the vast majority of the people on the net don't bother or could care less about pr0n or MP3s, surprising as that may sound

  13. We wont be able to access half the domains then... by Anaplexian · · Score: 1

    Yeah, like if you have a domain in, say, Swahili that's the official site for the country's government, how does the Japanese community type the domain to go to the site?

    I see the future with this domain system as pretty stupid.
    The American Embassy, for instance, would need 24 sets of keyboards per computer to access the sites of the other countries.

  14. Re:Unicode Limitations / BIND by Anonymous Coward · · Score: 1

    You are mistaken: it does not always use TWO bytes per character; the number is variable. You are also mistaken in the assumption of 65536 characters in Unicode, there actually are more than 65536 codes and the provision is for up to 20 bits. Finally you are mistaken in the assumption that 31 characters would be limiting for Chinese names. Those chinese characters are much more powerful than letters or digits, so far fewer of them are required to form a name.
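
    To make the "variable number" point concrete, here is a minimal Python sketch. It assumes UTF-8 as the encoding, which is my choice for illustration and not something the parent comment named; the sample characters are arbitrary.

```python
# Byte length of single characters in UTF-8 (assumed encoding, chosen for illustration).
SAMPLES = {
    "a": "ASCII letter",
    "ö": "Latin-1 letter",
    "中": "CJK ideograph",
    "\U00010000": "first code point past 65535",
}


def utf8_length(ch: str) -> int:
    """Return how many bytes one character occupies in UTF-8."""
    return len(ch.encode("utf-8"))


for ch, label in SAMPLES.items():
    print(f"U+{ord(ch):04X} ({label}): {utf8_length(ch)} byte(s)")

# A four-ideograph name: few characters, but each one costs several bytes.
print(len("中文域名"), "characters,", len("中文域名".encode("utf-8")), "bytes")
```

    So whether a length limit bites depends on whether it counts characters or bytes.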

  15. example... by Barbarian · · Score: 2

    http://domän.nu/

    Interesting, BIND 8 works with it (my nameserver), but when I enter that in nslookup it pukes (i.e. I can use a webbrowser [IE 5.5], but can't type it into nslookup).

    1. Re:example... by toriver · · Score: 1
      ... except a conforming browser should escape the ä so that what gets sent is http://dom%e4n.nu/, which of course is not the same.
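
      A small Python sketch of that escaping, using the standard urllib.parse module. The %e4 form corresponds to the Latin-1 byte for ä; the UTF-8 line is added only for comparison and is not something this thread's browsers are known to have sent.

```python
from urllib.parse import quote, unquote

host = "domän.nu"

# Percent-encode the same name under two different byte encodings.
latin1_form = quote(host, encoding="latin-1")  # ä is the single byte E4
utf8_form = quote(host, encoding="utf-8")      # ä is the two bytes C3 A4

print(latin1_form)  # dom%E4n.nu
print(utf8_form)    # dom%C3%A4n.nu

# Decoding only gives back the original if both sides agree on the encoding.
print(unquote("dom%e4n.nu", encoding="latin-1"))  # domän.nu
```

      The two escaped forms differ, so a resolver has to know which encoding the client used.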

      Once internet protocols stop catering to whatever archaic systems out there are limited to 7-bit transport, the scheme would work.

    2. Re:example... by Barbarian · · Score: 2

      That's how I entered the URL in the a href= field: dom%e4n.nu.

    3. Re:example... by mbyte · · Score: 1

      Squid does the same:

      While trying to retrieve the URL: http://domän.nu/

      The following error was encountered:

      Invalid URL
      Some aspect of the requested URL is incorrect. Possible problems:

      Missing or incorrect access protocol (should be `http://'' or similar)
      Missing hostname
      Illegal double-escape in the URL-Path
      Illegal character in hostname; underscores are not allowed


      Samba Information HQ

    4. Re:example... by hwaara · · Score: 1

      FYI, domän is Swedish for domain. So, this is intended for Swedish or other Nordic countries to use. But that doesn't explain why you can't nslookup it!?

      --
      -Håkan
    5. Re:example... by ZanshinWedge · · Score: 1
      Wow, I can't believe that worked.

      Though it shows it as dom%e4n.nu in IE's address bar.

  16. Re:Questions by claes · · Score: 2
    My point is that even though the greatest feature of the internet is that it spans the globe, it is not always used that way. There is content on it directed only at a local audience. That kind of content should be able to exist on its own terms.

    If you sell a product internationally, and use a lot of strange letters in the url, you are just stupid. Of course you should use a "classic" url for this. But if you have a company that delivers shrimp sandwiches within a Swedish town, you should be able to use the domain räksmörgåsar.se instead of raksmorgasar.se.

  17. Re:Conflict with existing names by Leper · · Score: 1
    While I'm not saying that companies should always be able to get their domain names back from squatters, everyone can agree that a company has the right -- and the need -- to maintain its identity without being taken advantage of by pornographers or warez traders.

    No, not everyone. I don't agree that a company has that right, and I bet I'm not alone. As far as I'm concerned, in the chaos of the net you get to sink or swim on your own merits and companies don't get any special perks. The real irony here is that some of these so-called pornographers actually supply more content than the domains they ape, but you suggest we should take measures to prevent that.

    (UNFAIR Term applied to advantages enjoyed by other people which we tried to cheat them out of and didn't manage. See also DISHONESTY, SNEAKY, UNDERHAND, and JUST LUCKY I GUESS.
    --The Hipcrime Vocab by Chad C. Mulligan)

  18. Re:Conflict with existing names by bk1e · · Score: 1

    One word: Paypal (don't go there, it's fake!), the site that was set up to divert Paypal customers and get them to leak their credit card info to some HaX0R d00d. Note that the 'I' and the 'l' look very similar on the bottom line of Netscape...

  19. Possible solutions by uriyan · · Score: 1

    It seems a cool idea. I think that many problems surrounding URL i18n can be solved.

    • Accented letters can be resolved into plain letters. For example, A with circle into plain A.
    • In languages that use a non-Latin character set, a name can be aliased to a transcribed Latin version of it. A flexible aliasing system can be invented so that it won't cost too much.
    • The biggest problem is with BIDI text. I think that a logical standard should be used (the character which is read first appears first in the binary stream). There are a few algorithms (some of them detailed in certain RFCs) that can help to store and transmit names correctly. They can also be transcribed to Latin (as I wrote in the previous bullet).
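
    The first bullet (accented letters resolved into plain letters) can be sketched in Python with the standard unicodedata module. This is only one possible folding rule, assumed here for illustration: it drops combining marks after canonical decomposition, so letters without a decomposition (ø, ß) pass through unchanged.

```python
import unicodedata


def strip_accents(name: str) -> str:
    """Fold accented letters to their base letters by dropping combining marks."""
    decomposed = unicodedata.normalize("NFD", name)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


print(strip_accents("räksmörgåsar.se"))   # raksmorgasar.se
print(strip_accents("fahrvergnügen.com")) # fahrvergnugen.com
```

    Note that this is not the same as a language's own convention (German would write ü as ue), so a real aliasing scheme would need per-language rules.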
  20. great! by nomadic · · Score: 3

    Does this mean I can register micrösoft.com and yàhoo.com and släshdot.org?
  21. Re:Um, why would they have to register those? by TSN · · Score: 1

    They don't have to by law, obviously. They would have to do it if they want to avoid people registering a similar name and putting up a porn site, or something.

  22. Re:Not good by TSN · · Score: 1
    Well, just looking at the Windows character map, the letter 'a' has seven forms (six different accents plus regular), 'c' has two, 'e' has five, 'i' has five, and 'n' has two. In "galeriacentral", the letter 'a' appears three times, 'c' once, 'e' twice, 'i' once, and 'n' once.

    (7^3) * (2^1) * (5^2) * (5^1) * (2^1) == 171,500.

    How do you decide which of those 171,500 possible accented domains Docrates gets first refusal rights on?
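
    A quick Python check of the count, treating every occurrence of a letter as an independent choice among its variants. The variant table is taken from the figures in this comment; letters not listed are assumed to have one form only.

```python
from collections import Counter
from math import prod

# Forms per letter (accented variants plus the plain one), as listed above.
VARIANTS = {"a": 7, "c": 2, "e": 5, "i": 5, "n": 2}


def count_spellings(name: str) -> int:
    """Number of distinct spellings when each occurrence varies independently."""
    occurrences = Counter(name)
    return prod(VARIANTS.get(ch, 1) ** n for ch, n in occurrences.items())


print(count_spellings("galeriacentral"))  # 171500
```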

  23. Re:Conflict with existing names by David+A.+Madore · · Score: 2

    Conflicts with existing names are mostly dangerous when the user might type the name and make a typo. Evidently you would not type such a thing as "amazоn.com" (here the "o" before the "n" is replaced by the cyrillic form of the same letter which is supposedly indistinguishable from it). When you are following links, well, you are following links, and you are therefore trusting the site with the links to some extent. After all, non-power users rarely read the URL written at the bottom of the page, in any case: if someone writes a site which looks very much like a well-known site and links to it, whatever the URL, many users will be fooled. I don't think "internationalized URLs" will be a major change in this respect.
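
    For what it's worth, a look-alike name of this kind is easy for software to flag even if the eye cannot. The Python sketch below is a rough heuristic of my own, not an established check: it uses the first word of each character's Unicode name as a stand-in for its script and flags labels that mix more than one.

```python
import unicodedata


def scripts_used(label: str) -> set:
    """Collect a rough script tag (first word of the Unicode name) per letter."""
    return {unicodedata.name(ch).split()[0] for ch in label if ch.isalpha()}


def looks_mixed(label: str) -> bool:
    """True when letters from more than one script appear in the same label."""
    return len(scripts_used(label)) > 1


spoof = "amaz\u043en"  # U+043E is CYRILLIC SMALL LETTER O
print(scripts_used(spoof), looks_mixed(spoof))
print(scripts_used("amazon"), looks_mixed("amazon"))
```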

    Slashdot's handling of accented characters in nicknames was completely grotesque, in any case. It was done naïvely by taking the 8-bit data as submitted and using it in the URL. But this is not how it works: the data should have been encoded in UTF-8 beforehand.

    --
    Here you should see an upper-case E with an acute accent: É. Here you should see an upper-case Y with two dots on it: Ÿ. Here you should see a capital greek Gamma: Γ. Here you should see a Hebrew aleph and a Hebrew beth: אב; of course, the aleph should be on the right because it is first (unless there was a line split between the two). Here you should see the Devanagari "OM" sign: ॐ. Here you should see a smiling face: ☺. Here you should see the Chinese (or Japanese) character for "sun": 日. None of this should depend on your selected "document encoding". If you did not see all that, then your browser is broken and you should change it.

  24. Questions by Anonymous Coward · · Score: 1

    I think this is a great idea. But I also think it has some problems, such as how does one write the address?

    For example, I can't tell a Chinese symbol for dog from that of a cat, so how am I supposed to write the address from memory, supposing I want to visit the English section of Wang-Chan's Noodle Soup Test Drive or something?

    And vice versa, how will some Chinese person write, for example, umlauted characters (å, ö, ä, ë and so on)

    1. Re:Questions by Anonymous Coward · · Score: 1

      Don't worry, I'm sure Wang-Chan's Noodle Soup comes in both dog AND cat flavors.

    2. Re:Questions by chrischow · · Score: 1

      i hope not, dog tastes good but cat sure as hell doesn't

    3. Re:Questions by giggls · · Score: 1

      I don't think this would be a problem. Browsers could adopt an encoding scheme like the one used in the Tcl scripting language: any Unicode character is just escaped as \uXXXX, where XXXX is the Unicode number of the character. So for an "A" one could use \u0041 as well.
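
      A minimal Python sketch of such a scheme. The function names are made up for illustration, and the sketch only handles characters that fit in four hex digits; anything above U+FFFF is rejected rather than guessed at.

```python
import re


def escape_label(label: str) -> str:
    """Replace every non-ASCII character with a \\uXXXX escape (hypothetical helper)."""
    parts = []
    for ch in label:
        code = ord(ch)
        if code < 128:
            parts.append(ch)
        elif code <= 0xFFFF:
            parts.append(f"\\u{code:04x}")
        else:
            raise ValueError(f"U+{code:X} does not fit in four hex digits")
    return "".join(parts)


def unescape_label(escaped: str) -> str:
    """Turn \\uXXXX escapes back into the characters they stand for."""
    return re.sub(
        r"\\u([0-9a-fA-F]{4})",
        lambda m: chr(int(m.group(1), 16)),
        escaped,
    )


print(escape_label("domän.nu"))
print(unescape_label(escape_label("räksmörgåsar.se")))
```

      The escaped form is pure ASCII, so anyone with a plain keyboard can type it, which is the point of the suggestion.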

    4. Re:Questions by nickol · · Score: 1

      And vice versa, how will some Chinese person write, for example, umlauted characters (å, ö, ä, ë and so on)
      YOU see umlauted characters here, but I see normal capital cyrillic letters. This problem comes out of the default code page used. Currently it is necessary to place
      meta content="text/html; charset=" http-equiv="Content-Type"
      and
      meta content="" http-equiv="Content-Language"
      only into pages in languages other than English. The question is: if I have a web page in Russian, and I want to provide a link to a page in Sweden, what language should I declare for my page? AFAIK there are no means in HTML 4.0 (except using Unicode) to put multi-language content on one page (correct me if I'm wrong). At least all or most of the original English-language pages with the default character set would have to be updated. So the real problem is not how one will type the link, but just seeing it more or less correctly.
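
      The code page effect is easy to reproduce in Python. KOI8-R is assumed here as the Russian default code page purely for illustration; the comment does not say which one is actually in use.

```python
import unicodedata

# The four "umlauted" characters as single bytes, the way a Latin-1 page sends them.
raw = "åöäë".encode("latin-1")

as_latin1 = raw.decode("latin-1")  # what a Western European reader sees
as_koi8r = raw.decode("koi8_r")    # the same bytes under the assumed Russian code page

print(raw.hex(" "))
print(as_latin1)
print(as_koi8r)
for ch in as_koi8r:
    print(f"U+{ord(ch):04X} {unicodedata.name(ch)}")
```

      The bytes never change; only the declared (or assumed) charset decides which letters appear.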

    5. Re:Questions by claes · · Score: 2

      I doubt that this will be a big problem for you. First of all, most URLs you will encounter as links, so you don't have to type anything at all. Second, if you can't type the url because you don't understand it, how do you expect to understand the information on the page that it points to? And third: there will probably be software that helps you with this.

    6. Re:Questions by tfxx · · Score: 1

      well here 's a start.

      Japanese characters recognition. Katakana, hiragana AND kanji.

  25. Re:Unicode Limitations / BIND by waldoj · · Score: 2

    Finally you are mistaken in the assumption that 31 characters would be limiting for Chinese names. Those chinese characters are much more powerful than letters or digits, so far fewer of them are required to form a name

    That's interesting -- that had entirely escaped me. (This from the kid who spent years studying Egyptian and Mayan ideograms. *smack*) And I thought 63 characters was incredibly long in English. You could have several haikus in Japanese ideograms as a domain name!

    I like ICANN but
    NSI really pisses
    me off a lot

    -Waldo

    -------------------

  26. A split on the internet? by Stoutlimb · · Score: 2

    I could see this as a possible way of the internet "cleaving" into national groups. What I mean is that there is no easy way for me to type in Asian characters to get to a site, and if someone is used to an Asian-only computer system, how do they go to Russian sites without clicking on a link or knowing an IP address?

    At the risk of sounding anglo-centric, isn't this a big blow against interoperability?

    1. Re:A split on the internet? by smack.addict · · Score: 2
      You really don't have the first clue about namespace issues. You hide your ignorance as an attack on my multi-cultural sensitivities. TLDs exist to break up the namespace specifically in terms of the user base of computers exactly because it is beyond absurdity to think that all computers can fit into a single namespace. If a given namespace is geared at Americans, the namespace should be usable by Americans. If it is aimed at Russians, it should be usable by Russians, if it is aimed at a global audience, then it should strive for a commonality in the global audience.

      Why not for .com, .org and .edu? What makes you think non-US netizens are not entitled to use such TLD's?

      Because those TLDs are American TLD's. A historical quirk of the fact that the USA started the Internet. Now, if you want to co-opt them for world use, then they should use the lowest-common denominator character set usable by all users and input methods: ISO-8859-1. If they are to retain their US origins, then it should be ASCII. There is absolutely no use for forcing domains aimed at an American audience to deal with the inefficiencies of two, three, and multi-byte character sets or of conversions among character sets.

      The problem is not that we have language-specific characters, is that you don't. We don't consider them special, except for the fact that we can't use them normally (without kludges like i18n features) on a computer.

      You are making an ass out of yourself. My machine supports entering characters in several alphabets, including many accented European languages, Russian, Arabic, and Thai. The issue is nothing about considering certain characters "special". It is about understanding your audience. I spend a lot of time on I18N issues and am a huge advocate of I18N concerns, but opening every domain to every character set on earth is not I18N: it is reckless multiculturalism.

    2. Re:A split on the internet? by titus-g · · Score: 2
      .edu is a US tld.

      .com/net/org are g(lobal)TLDs.

      From the US DOC White paper at ICANN:

      A small set of gTLDs do not carry any national identifier, but denote the intended function of that portion of the domain space. For example, .com was established for commercial users, .org for not-for-profit organizations, and .net for network service providers.

      Could everyone Please stop making this mistake now, it's really starting to bug me.

      --

      ~ppppppppö

    3. Re:A split on the internet? by smack.addict · · Score: 1
      Then do it for .se, not for .com, .org, and .edu.

      Nevertheless, I think all domain names should be Unicode or UCS, with each registry restricted to some subset of the Unicode spectrum appropriate to the TLD in question: .com as 8859-1, .ru as 8859-1 + Cyrillic ISO, etc. That way all DNS servers can easily resolve names, but the target audience for a given domain can always have the input method to support accessing hosts in that domain.
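
      A rough Python sketch of what such a per-TLD rule could look like. The policy table is hypothetical: it encodes only the two examples given here (with ISO 8859-5 standing in for "Cyrillic ISO"), and falling back to plain ASCII for unlisted TLDs is my assumption.

```python
# Hypothetical registry policy: which legacy charsets each TLD accepts.
TLD_CHARSETS = {
    "com": ["latin-1"],
    "ru": ["latin-1", "iso8859_5"],
}


def char_allowed(ch: str, tld: str) -> bool:
    """A character is allowed if any charset permitted for the TLD can encode it."""
    for codec in TLD_CHARSETS.get(tld, ["ascii"]):
        try:
            ch.encode(codec)
            return True
        except UnicodeEncodeError:
            continue
    return False


def label_allowed(label: str, tld: str) -> bool:
    """Check every character of a label against the TLD's permitted subset."""
    return all(char_allowed(ch, tld) for ch in label)


print(label_allowed("víctor", "com"))   # True
print(label_allowed("москва", "com"))   # False
print(label_allowed("москва", "ru"))    # True
```

      The names stay Unicode on the wire; the registry just refuses labels outside its subset.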

    4. Re:A split on the internet? by webcrafter · · Score: 1

      Last time I checked, there were non-US universities under .edu
      Check www.unica.edu for an example

    5. Re:A split on the internet? by webcrafter · · Score: 1

      Because those TLDs are American TLD's. A historical quirk of the fact that the USA started the Internet.
      Then, all TLD's are also american. Or was there a non-american internet (non capitalized so as not to confuse it with the actual Internet) that was later merged in?

      TLD's existed to break up the namespace based on geographic location, not user base. Of course, domain names and IP's no longer represent actual geographic location.

      they should use the lowest-common denominator character set usable by all users and input methods: ISO-8859-1
      Well, I don't use any chars that are not in this set. If i wanted to register víctor.com (which I don't) I would just need ISO-8859-1.

      My machine supports entering characters in several alphabets
      when I said you don't have special characters, I was not speaking about your machine, but about your language. If English had had accented characters in the first place, ASCII would have them by now. And not only just those that were necessary but the ones pertaining to other languages as well, because the creators would have had the sensibility to consider them.
      BTW, I'm not attacking you. Please refrain from doing so yourself.

    6. Re:A split on the internet? by claes · · Score: 2

      No, of course not. This is a much needed feature in non-English-speaking countries. For example, in Sweden where I live, there are three letters that can't be used in URLs: å, ä and ö. There are lots of local companies, government organizations etc. that only direct themselves to a local/national audience. They should be able to do this in a localized way. There are so many words that have these letters in them, and not being able to use them in domains and URLs is a very big disadvantage. The web is not only for English-speaking people. If, on the other hand, you direct yourself to an international audience, you of course avoid characters outside of a-z. Pretty simple, isn't it?

    7. Re:A split on the internet? by mvl · · Score: 1

      > At the risk of sounding anglo-centric, isn't this a big blow against interoperability?

      No. First, interoperability is about technical systems working together. I'm certain that any solution will provide interoperability: Any client will be capable of working with any name server.

      If you can't type the Asian characters easily, there is a good chance that you can't read the page you get, either. If they also provide content in English, they may choose to provide an all-ASCII domain name, too.

    8. Re:A split on the internet? by titus-g · · Score: 1
      Yup, right you are, I had the feeling I should have checked a bit deeper...

      I guess most unis etc go for ccTLDs, because well, they don't move very much, or start international franchise (yet).

      hmm sunsite.doc.ic.edu doesn't seem to work:)

      It's even in the FAQ

      --

      ~ppppppppö

  27. Re:Ýou people just don't get it by smack.addict · · Score: 1
    I never said they were stupid americans. I said that they were americans and that they obviously could not understand that this is an important issue in other countries.

    Again, your assumption is that people who do not agree with you do not understand those issues and therefore must be American. Furthermore, the issue is not whether you should be able to have a domain name with a Jönsson component, but whether you should be able to stick that in .com, .org, or .net. I do not see the problem with it for ISO 8859-1 characters, but I do understand the argument against it. Furthermore, I don't think .com, .org, or .net should be extended beyond the extended latin characters for reasons I have already enumerated. And I really do not like your dismissal of any American point of view on the matter.

  28. resume.com by Anonymous Coward · · Score: 2
    Some people will be pissed off. For example, resume.com will have to register:

    {resume, résume, resumé, résumé}.{com, net, org, new TLDs}

    Costly!

  29. If they can do this, then why not? by SEE · · Score: 2

    The only thing I'm worried about is that infrastructure/backbone-level software might break.

    Because:
    1) I can't read a Japanese-language site whether or not I can get to it.

    2) If I could read it, I'd use software that let me input it.

    3) A rational web designer will register a non-accented Roman/ASCII character name if they intend to reach an audience that may include people who can't input other characters. The irrational deserve to have their sites fail anyway.

    Steven E. Ehrbar

  30. Re:Conflict with existing names by Another+MacHack · · Score: 1
    Japanese romaji characters --the pictographic characters used to write English words -- could also be used to confuse Japanese readers.

    Romaji is the roman alphabet transliteration of hiragana and katakana. You've confused it with a mixture of katakana, the phonetic script used for imported words, and kanji, the pictographic characters.

  31. domain names should be like soundex by kevin805 · · Score: 3

    Domain names should map from something like: "Señor Hussong's Cantina.com" to "senorhussongscantina.com". Spaces, punctuation, and hyphens should be deleted. Special characters should be translated into the closest low ascii character.

    This way, you can write your domain name however you want, and there isn't so much of a potential for people registering something similar.
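
    A minimal sketch of that folding in Python, assuming "closest low ascii character" means "drop the accent". The name `fold_domain` is made up for illustration; no registry or resolver works this way.

```python
import unicodedata

def fold_domain(name: str) -> str:
    """Hypothetical folding: strip accents, lowercase, keep only a-z, 0-9 and dots."""
    # NFKD splits "ñ" into "n" plus a combining tilde, which is then dropped.
    decomposed = unicodedata.normalize("NFKD", name)
    no_marks = "".join(c for c in decomposed if not unicodedata.combining(c))
    # Spaces, hyphens, apostrophes and anything still non-ASCII are deleted.
    return "".join(
        c for c in no_marks.lower()
        if c.isascii() and (c.isalnum() or c == ".")
    )

print(fold_domain("Señor Hussong's Cantina.com"))  # senorhussongscantina.com
print(fold_domain("cool-shit.org") == fold_domain("coolshít.org"))  # True
```

    Letters with no accent-free decomposition (ø, ß, anything CJK) are simply deleted by this sketch, which is where "closest low ascii character" stops being simple.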

    Hyphens have got to be the dumbest idea of all time. If you have a multi-word name, you almost have to register both with and without the hyphen or you will lose visitors.

    Even better would be using something like soundex, which makes a "hash" of a name so that similar sounding words map to the same value. Memorizing exact spelling is not something people are used to doing.
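
    For reference, a sketch of classic American Soundex in Python (first letter plus three digits). Using it as a domain-name hash is only the idea floated above, not anything DNS does.

```python
# Consonant classes of classic Soundex; vowels, H, W and Y carry no code.
CODES = {}
for digit, letters in enumerate(["BFPV", "CGJKQSXZ", "DT", "L", "MN", "R"], start=1):
    for letter in letters:
        CODES[letter] = str(digit)

def soundex(name: str) -> str:
    letters = [c for c in name.upper() if "A" <= c <= "Z"]
    if not letters:
        return ""
    code = letters[0]
    prev = CODES.get(letters[0], "")
    for c in letters[1:]:
        if c in "HW":
            continue  # H and W do not separate letters with the same code
        digit = CODES.get(c, "")
        if digit and digit != prev:
            code += digit
        prev = digit  # a vowel resets prev, so a repeated code counts again
    return (code + "000")[:4]

print(soundex("Robert"), soundex("Rupert"))  # R163 R163
```

    Note that Soundex was designed for English surnames, so similar-sounding names in other languages may not collide the way you would want.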

    They shouldn't do CJK domain names, they should just define a standard translation, which can then be incorporated into client software and possibly into DNS systems. What's next, I'll be unable to get to a site unless I also choose the correct encoding? Let's see, was that "cool-shit.org in 8859-1, or coolshit.org in japanese encoding, or maybe cool-shit.net eastern european encoding. Or was it coolshít.org?"

    1. Re:domain names should be like soundex by chrischow · · Score: 1
      They shouldn't do CJK domain names, they should just define a standard translation, which can then be incorporated into client software and possibly into DNS systems. What's next, I'll be unable to get to a site unless I also choose the correct encoding? Let's see, was that "cool-shit.org in 8859-1, or coolshit.org in japanese encoding, or maybe cool-shit.net eastern european encoding. Or was it coolshít.org?"

      er i don't think they'll be writing the domains in english using CJK encoding, they'll be writing the domains in CJK languages

    2. Re:domain names should be like soundex by DeamonGorgos · · Score: 1

      Actually, there are valid reasons that two different sites should have the same name. For example, I see no reason why either American Naughtical Institute or Autism Network International would have more of a right to the domain "ani.org." (Note, the former had the domain at one point, the latter went to "ani.ac.") True, some hold names hostage, and many (porn and otherwise) use similar names to attract users. And some names are household words which would be deceptive for anyone else to use. But there are valid reasons for it.

      I suppose that from a business perspective the goal is to attract as many viewers as possible, and keep them from competitors. Still, it always seems like an exercise in wanton selfishness when the doctrine of "take every variation of your name as a URL" is preached.

      I see hyphens as a good thing, both because they make domains easier to read, and because they increase the amount of namespace for meaningful (to human readers) domains. I find it unfortunate that this is sacrificed out of greed, paranoia, and similar motivations.

  32. Re:Ýou people just don't get it by smack.addict · · Score: 1
    I said:
    "If .com is aimed at a global audience, then domains registered under that TLD should support a global audience: i.e. ASCII or ISO-8859-1. NOTHING ELSE." To which you replied:
    Yeah, cause we all know everyone in the whole fuckin' world can read English! (Well, those who can't aren't worth shit anyway.)

    ISO-8859-1 is not an English-specific character set. There is nothing english-centric about it. And again, my comments are directed towards .com, .org, and .net which are American TLDs in spite of the trend of foreign companies using them.

    I suggest you get a fucking clue.

  33. Re:Unicode Limitations / BIND by scrytch · · Score: 2

    > Unless I'm mistaken, Unicode is a combination of two ASCII characters to create a single one

    You are mistaken.

    --
    I've finally had it: until slashdot gets article moderation, I am not coming back.
  34. Re:There is Need for a Non-DNS URL System by pryan · · Score: 1


    See my comment about just such a system.

  35. Another attempt to screw us over. by d.valued · · Score: 1

    Several points that ICANN should keep in mind:

    1. America built the foundations of the internet as we know it. The core, as ARPANET, was our government's baby.

    2. Most languages can transliterate to Latin characters. (It may be messy, but it works.)

    3. Other nations that have sites on the Internet have decided to play by the rules we (Americans) laid out. Only one person per IP at a time, Latin-character domain names.

    4. I don't care what you'll tell me; the Net is way too new to be fragmented by this sort of gross nationalization.


    "And they said onto the Lord.. How the hell did you do THAT?!"

    --
    I used to be someone else. Now I'm someone better.
    Real life is underrated.
  36. Give people some credit by w00ly_mammoth · · Score: 3

    Slashdot has had to ban accented characters to prevent this kind of abuse; ICANN should do the same lest a similar outbreak of mimicry infect the entire Web.

    History of the "Something must be done to control the outbreak" syndrome.

    Early 1990s: OMG! People are making up their own web sites in large numbers. Thousands of people will see them and be unable to distinguish fact from fiction.

    Mid 1990s: OMG! People are now making up their own news sites. Millions of people are reading them and can't tell the difference between real and fake news.

    Late 1990s: OMG! People are posting stock market tips which are causing market fluctuations. People will be unable to tell the difference between real and fake stock market news!

    Early 2000s: OMG! People are allowed to use accented chars. Millions of people will be diverted to fake sites which use similar accented chars in their domain name, and thus be unable to tell the difference between real and fake sites!

    Here, take a chill pill. Welcome to the internet, my friend.

    w/m

    1. Re:Give people some credit by Elvis+Maximus · · Score: 2

      Late 1990s: OMG! People are posting stock market tips which are causing market fluctuations. People will be unable to tell the difference between real and fake stock market news!

      You're right -- people are way too smart to get taken in by that kind of thing. I mean, it's not like some 23 year-old could release fake stock market news that would cause the stock of some electronics company to plummet, knocking 2.5 billion from its market cap.

      As for people being unable to tell fact from fiction on the Internet in general, Slashdot is more than enough to reassure us in that regard.

      So yeah, give people some credit!

      --
      Give me liberty or give me something of equal or lesser value from your glossy 32-page catalog.

    2. Re:Give people some credit by w00ly_mammoth · · Score: 2

      You're right -- people are way too smart to get taken in by that kind of thing. I mean, it's not like some 23 year-old could release fake stock market news that would cause the stock of some electronics company to plummet

      As you eloquently pointed out, the masses are confused by some young guy posting a stock market rumor. And they are confused by fake sites pretending to be something else. What would your solution be? Can you really think of any fix to this, other than letting the thing settle down and everyone chill out and understand how the net works?

      Whenever something new happens on the net, there is always mass panic and hysteria, but eventually people settle down. The way to deal with issues of confusion is not to PREVENT abuses from happening, or to put in "safeguards" to keep the dumb masses from getting confused.

      Regulation should have to clear a high threshold before it is allowed to slow down the anarchy and vibrancy of the net. Stuff like market rumors or fake sites or accented characters...geez, let millions of people panic and get confused and learn in the process. They will get over it and figure it out. There's no way to hold everyone's hand and regulate panic and confusion.

      I also think there's a misunderstanding about accented chars. People who type something know what they want. For instance, someone who didn't know English at all might wonder if people get confused by some fake site registering yah00.com or m1cros0ft.com - they all look the same to him.

      The o and the o with 2 dots above it may seem the same to us, but it's about as different to a Norwegian as a zero is to an 'o'. So don't worry, "they" won't get confused by similar looking domains. Even if they do, they'll get over it.

      w/m

  37. Bad idea by Alex+Belits · · Score: 4

    This is a bad idea -- domain names must be interoperable on all systems, with or without Unicode or any other charset support, with or without keyboard capable of entering certain characters. The ASCII subset allowed in DNS now is the only subset supported by absolutely all computers (even ones that natively use EBCDIC), and no matter how the use of other charsets (and/or Unicode) will expand, this is not going to change. I see it as an attempt to just promote "unicodefication" of existing standards for no good reason.
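
    For concreteness, the ASCII subset in question is the "letters, digits, hyphen" rule for host names. A rough Python check, assuming the usual reading of that rule (1 to 63 characters per label, no leading or trailing hyphen); `is_ldh_hostname` is an illustrative name, not a library function.

```python
import re

# One label: letters/digits at both ends, hyphens allowed inside, 63 chars max.
LDH_LABEL = re.compile(r"[A-Za-z0-9](?:[A-Za-z0-9-]{0,61}[A-Za-z0-9])?")

def is_ldh_hostname(name: str) -> bool:
    labels = name.rstrip(".").split(".")
    return all(LDH_LABEL.fullmatch(label) is not None for label in labels)

print(is_ldh_hostname("www.wong-kar-wai.org"))  # True
print(is_ldh_hostname("www.jönsson.com"))       # False
```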

    And if anyone cares, my native language has nothing to do with ASCII.

    --
    Contrary to the popular belief, there indeed is no God.
    1. Re:Bad idea by chrischow · · Score: 1

      ni shi bu shi troll?

  38. what a poorly written piece of trash by koax · · Score: 1

    The squishy-wishy writing style is reminiscent of Omni magazine. Bleh!
    And to answer the
    Q: why are today's laptops so damn slow?
    A: 5400rpm IDE drives.

    http://koax.org

  39. But. Isn't english really the language of the net? by maunleon · · Score: 1

    Considering monumental failures such as Esperanto and others, personally I can consider english the closest we have to an international language. I know that at least in europe and many asian countries, you can get around at least a little with english. No other language can claim that.

    The internet has silently standardized on english. I always thought standards were a good thing. The english alphabet is relatively easy to learn, and no one forces you to actually spell english words.

    (As a disclaimer: I'm not a native english speaker, but I don't have a problem with english being the language of the 'net)

    I think this would be a huge step backwards. People will start memorizing IP addresses again instead of names.

    Probably the first thing I'd do is write an application that, for every funky-charactered domain name I see, automatically aliases it to an english name in /etc/hosts so it won't break any of the existing tools.

    So the problems are that DNS servers won't support it for a while (if ever) and the people with such domain names will get a rude awakening when they won't be able to use most of the existing tools with their funky new domain names.

  40. Re:Conflict with existing names by ralmeida · · Score: 2

    Slashdot has had to ban accented characters to prevent this kind of abuse; ICANN should do the same lest a similar outbreak of mimicry infect the entire Web.

    People should realize that it's a world wide web. It's not only american, and it should not only be in english -- diversity is important. And if you want to support other languages, you have to accept accented characters; they are not only "decorative", they make a whole difference.

    Sure people will abuse it. But we already have slahsdot.org and other similar sites. It's already being abused.

    Você não acha?

    --
    This space left intentionally blank.
  41. Re:Unicode Limitations / BIND by David+A.+Madore · · Score: 2

    In the UTF-8 encoding (defined by RFC2279), it takes between one and six octets (bytes) to encode one character, although no currently assigned character needs more than three. UTF-8 can address all the 2147483648 characters of ISO-10646-1.

    In the UTF-16 encoding (RFC2781), it takes either two or four octets (bytes) to encode one character, although no currently assigned character needs four. UTF-16 can access only the first 1114112 characters of ISO-10646-1 (the first 17 planes), which form the Unicode range proper.

    Both these encodings use characters outside the ASCII range (i.e. 8-bit characters), which are not supported by current BIND versions, but which are still permitted by the DNS standards (RFC1034&1035).
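
    The octet counts are easy to check in Python. One caveat: Python's "utf-8" codec follows the later, stricter definition of UTF-8, which stops at four octets and at U+10FFFF, so the five- and six-octet forms of RFC2279 cannot be produced this way. The helper name `octets` is just for this example.

```python
def octets(ch: str):
    """Return (UTF-8 length, UTF-16 length) in octets for one character."""
    utf8 = ch.encode("utf-8")
    utf16 = ch.encode("utf-16-be")  # big-endian, no byte-order mark
    return len(utf8), len(utf16)

for ch in ["a", "ö", "中", "\U00010000"]:
    print(f"U+{ord(ch):04X}", octets(ch))

# Anything outside ASCII produces octets with the high bit set,
# which is what trips up ASCII-only DNS software.
print(any(b > 0x7F for b in "ö".encode("utf-8")))  # True
```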

    However, the proposed IDNS standard does not use either of these encodings (IMHO not using UTF-8 is a terrible mistake) but yet another one, called UTF-5 (see "draft-jseng-utf5-00" in Internet Drafts).

    In the UTF-5 encoding (defined by the aforementioned draft), it takes between one and eight octets (bytes) to encode one character, although no currently assigned character needs more than four. UTF-5 can address all the 2147483648 characters of ISO-10646-1.

    If UTF-5 is used on DNS labels, you can have up to 15 Chinese characters in such a label.
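
    Where the 15 comes from: a DNS label is limited to 63 octets and a Chinese character takes four octets in UTF-5, so 63 // 4 = 15. Below is a sketch of the encoding as I understand the draft (one output symbol per hex digit of the code point, with the high bit of the 5-bit value set on the first digit of each character); treat the details as my reading, not as a reference implementation.

```python
DIGITS = "0123456789ABCDEFGHIJKLMNOPQRSTUV"  # 32 symbols, one per 5-bit value

def utf5_encode(text: str) -> str:
    out = []
    for ch in text:
        nibbles = f"{ord(ch):X}"             # hex digits, no leading zeros
        first = int(nibbles[0], 16) | 0x10   # mark the start of a character
        out.append(DIGITS[first] + nibbles[1:])
    return "".join(out)

print(utf5_encode("中国"))               # KE2DL6FD
print(len(utf5_encode("中" * 15)))       # 60, fits in a 63-octet label
print(len(utf5_encode("中" * 16)))       # 64, does not
```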

  42. Re:Conflict with existing names by kindbud · · Score: 1

    It's a minor miracle that I can read this and reply to it at all. I'm at the airport waiting for a flight and using my PalmIIIc with Avantgo, hooked up to a Motorola Timeport PCS phone. Most of the Unicode characters in your post did not show up. There are few alternatives to Avantgo for the Palm. Eudora Web doesn't do any better...

    --
    Edith Keeler Must Die
  43. Re:ugg by drewness · · Score: 1

    To input CJK chars, you don't just poke in alt codes until you get it. Asians would go insane if that was how you had to type. If you have to type in CJK characters there is a frontend that converts them (I use kinput2 with wnn on *nix, and the built in frontend in BeOS.), and I'm sure that if the idea to have multinational chars succeeds, then some input method will be written to put them in unicode (as BeOS does by default). If you were going to a site that used CJK chars, then presumably you know the language, and know how to input them, and I bet that that assumption is built into their plan.

  44. 8-bit DNS software by kindbud · · Score: 1

    Here is some DNS server software that supports 8-bit data, no problem. Finding a compliant client resolver library is another matter, however...

    --
    Edith Keeler Must Die
  45. ICAN'T by Homer+Sexual · · Score: 1

    oooo! i've test bedded a few non-english characters in my day! i just love those turks.

    --

    NAMBLA. Because Scouting can only take you so far.

  46. Network Solutions? by z-axis · · Score: 1

    Why is Network Solutions in charge of a potentially significant project? Does anybody trust them?

  47. Evidently, no one bothers to check the links. by Kickasso · · Score: 2
    If you did, you'd quickly find out that all such schemes in development have an ASCII-only fallback mechanism. It is the requirement. Otherwise older programs will fail miserably. Quoth the doc:
    The ASCII Compatible Encoding (ACE) is used to support older software expecting only ASCII and to support downgrading from 8-bit to 7-bit ASCII in other protocols (like SMTP). It is a transition mechanism and will no longer be supported at some future time when it is so decided.

    All software following this specification MUST recognise ACE and decode them into their true name when doing matching and handling. A DNS server must recognise ACE in a query.
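
    To get a feel for what an ASCII-compatible fallback looks like: the ACE in the draft quoted above is not something I can reproduce here, but Python ships an "idna" codec that implements the scheme standardized later (IDNA with Punycode). So the sketch below shows the general idea, not the draft's exact encoding; `to_ace` and `from_ace` are names made up for the example.

```python
def to_ace(name: str) -> bytes:
    """Encode a name into an ASCII-only form that old software can carry."""
    return name.encode("idna")

def from_ace(ace: bytes) -> str:
    """Decode the ASCII-only form back into the real name."""
    return ace.decode("idna")

ace = to_ace("jönsson.com")
print(ace)                  # an ASCII-only name with an "xn--" prefixed label
print(from_ace(ace))        # jönsson.com
```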


    --
  48. Sounds great, but... by Kanasta · · Score: 1

    I hope it doesn't crash my browser any more than normal. It crashes enough with multilanguage sites and on JS sites.


    ---

  49. this is basically just NSI trying to make more $$$ by keithmoore · · Score: 1
    NSI is doing this because they see a potential market for non-ASCII names in the domains that they control. which is hardly surprising. the problem is that there are still a lot of technical details that need to be worked out regarding how internationalized domain names should work. some of these details might end up limiting the kinds of names that should be registered, or specify that registrations for two different spellings/encodings of what looks like the same name should not both be accepted. (the technical definition of "looks like the same" is still being debated)

    if NSI starts accepting registration of internationalized domain names before these details are worked out, they are inviting trouble.

    this is just another example of why .COM, .NET and .ORG need to be taken away from NSI and put in the hands of a group that will act responsibly.

  50. Spam woes... by sdo1 · · Score: 1

    Oh this is just frickin' great...

    Now I can get spam from domains with oddball characters that will just be about impossible to trace using standard tools.

    --
    --- What parts of "shall make no law", "shall not be infringed", and "shall not be violated" don't you understand?
    1. Re:Spam woes... by martyb · · Score: 1

      Good point, but I think it's even worse than just that.

      web bugs - web sites use these to track customers. It's difficult enough to ferret them out in the first place, in English, let alone when the 1X1 .gif is located on some site whose name is indecipherable to an English-reading person. Yes, maybe the DNS can resolve it, and the internet infrastructure will sort things out correctly, but it sure seems to me that the average user is going to be even more impeded in their efforts to protect their privacy.

      And THAT's just in web sites and HTML e-mail, what about Microsoft Word Documents That "Phone Home" , and as others pointed out in that thread, the same can be done with Excel, and other apps, too?

      I'm not against the idea of this internationalizing (I18Ning?), but I sure do hope that a long, hard look is taken at these potential, non-technical pitfalls that pertain to people's perceptions and abilities, and what steps can reasonably be taken to protect users.

  51. chinese/japan characters? by hwaara · · Score: 1

    What!? It's far more important to first of all support the nordic characters like å,ä and ö. And all keyboards can type it so why not?

    Chinese and Japanese characters can't be typed with Nordic/American keyboards, so that's just silly. For example, look at the .nu domains. They support all usual alphanumeric characters + å, ä and ö.

    --
    -Håkan
  52. Well duh by Captain+Pillbug · · Score: 1

    It's not the Spaniards that ICANN is catering to. It's those Uruguayans.

  53. Re:just one problem... by smack.addict · · Score: 1
    This is such a hypocritical point of view. Americans are supposed to design the systems they build for the rest of the world.

    The fact is EVERYONE builds their systems for their needs. The fact also is that Americans lead the world in driving issues of I18N and L10N.

  54. Good Idea! by syf0n · · Score: 1

    The whole idea behind this isn't to confuse the gigantic english-speaking crowd who uses the internet, it's to help the smaller group of non-english speaking people who have trouble navigating the internet or registering domains that are useful to them.

  55. It's not ICANN, and it's not non-English by paulhoffman · · Score: 2
    As pointed out in the main link, ICANN has nothing to do with the testbed. NSI Registry (not the registrar) decided to do it because, if the IETF makes this available, it would be nice if the registrars in com/net/org were able to register the names. There are already other testbeds run by other (less newsy) groups of people; you can even set up your own. It's not that big of a deal.

    And it's "non-ASCII", not "non-English". There are already plenty of domain names that are non-English, as others have pointed out already. ASCII is a character set (of sorts); English is a human language. The differences are defined in detail in the requirements document for the IETF's working group.

    Full details on the working group can be found at http://www.i-d-n.net. Maybe folks should consider reading the copious archives before declaring that it can't be done. It can be done, and hopefully it can be done right. We're quite sure that the Powers That Be in the IETF won't allow it to become a standard if it isn't right.

  56. Reason behind it by fssd · · Score: 1

    I am working as an IT tech in a multinational company, which happens to handle all sorts of domain registration. The reason for us to register under a different character set is that it is easy for the people in that particular country to remember such an address; but we also register an equivalent English name as well. The main point of registering in a different character set is simplicity for typical users who don't use English as their first or second language; for them, an English address is hard to remember in the first place.

  57. Re:Not good by fm6 · · Score: 1
    Since a user from a non-English-speaking country was good enough to make this point, I can expand on it without seeming a bigot.

    Rather than trying to shoehorn international functionality into a system with limited, English-oriented semantics, the standards folks should be working on an intuitive, international, squat-resistant layer above the current domain-name/URL conventions. It won't be easy to do, but there's got to be a better way to distinguish vaguely similar sites than panamahats.com versus panamá-se-expatria.com. (Apologies for my Spanish.)

    By the way, does this mean that domain names will now be case sensitive? I've always thought that the resource part of a URL should never be considered case sensitive, even when it maps to files on case-sensitive file systems. This was actually considered, but rejected on the grounds that case-mapping is ambiguous in some languages. If this is a legitimate issue, then internationalizing domain names is even more problematic than Docrates suggests.
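
    The case-mapping ambiguity is real and easy to demonstrate. A small Python check, using German ß and Turkish dotless ı as examples (`round_trips` is just a helper for this illustration):

```python
def round_trips(word: str) -> bool:
    """True if upper-casing and then lower-casing gives the word back."""
    return word.upper().lower() == word

print(round_trips("amazon"))   # True
print(round_trips("straße"))   # False: "STRASSE" lowers to "strasse"
print(round_trips("ı"))        # False: dotless i uppercases to plain "I"
```

    So "compare names case-insensitively" needs a language-independent folding rule before it means anything outside ASCII.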

    Well, excuse me, I have to go register Amazon.com, aMazon.com, AMazon.com....

  58. Re:Bad idea, NOT by guran · · Score: 2
    Hard to implement, yes
    Expensive, yes
    A pain for hard core geeks to get used to, yes
    Necessary, hell yes! Perhaps not today, but soon.

    I do see your point, but the same argument could be used against 16-bit computers (8-bits is the current standard and programs and data must be interoperable...)

    Do you know how much creative spelling there is, simply to force non a-z characters into the DNS? Simply removing dots, rings and accents is not good enough. (oops suddenly my domain name became equivalent to "www.faggot.com" or someone else's brand name)

    Ever tried enforcing an "8 character a-z only" file name policy on a network where *some* servers and programs could not handle other names? Forget it. It was cheaper to dump those, buy a new network and microsoft products (as you can see this was after the dos days:-) even if technically inferior, than to handle the constant hassle.

    People *hate* modifying spelling to comply with stupid limits. There is no standard way to map non a-z chars onto a DNS name.

    ASCII is outdated, get rid of it!
    Either it will be done in a standardized way, or it will be done by Microsoft. I prefer the former.

    --

    All opinions are my own - until criticized

  59. Very good by Mask · · Score: 1

    Not all of us are as lucky as European language speakers. Languages written in the Latin script have the privilege of being able to drop the accents from their characters and use plain Latin letters. Some languages, like Greek and Russian, are a bit less lucky, but they still have a usable (if ugly) mapping. Not all Russians can understand this mapping (most of the Russian population knows only one language). Hebrew and Arabic are completely out of luck: no reasonable mapping exists from these languages to the Latin script. The same word can be written in several (wrong) ways if Latin letters are used (remember squatting?). For example, the Hebrew word for shopping mall is Kenyon, but it is written the same way as Canyon, so many (most) people pronounce it as Canyon. So if an internet mall wants to register itself, it will have to register Canyon, Kenyon and Kanyon. Allowing Hebrew registration will help here.

    You may wonder why this has to be implemented in the COM and ORG TLDs. I think it should not be implemented at this level (yet). Instead, a standard should be prepared so that browsers, servers and other tools can handle the new domains (which will be deployed in local domains).

  60. Re:How about *dis*-allowing characters? by iktos · · Score: 1

    Well, actually it'd be vvv.domain.se which happens to be pronounced the same way as www.domain.se ("ve ve ve punkt domain punkt ess e"), because W is sort of treated as a "different looking" V when it occurs in proper names and such, for pronunciation and alphabetization purposes.

  61. How you would type www.wong-kar-wai.org by keithso · · Score: 2

    It depends on which Chinese character set you use (either traditional for Hong Kong and Taiwan or simplified for China)

    For each character set there's a choice between a couple of input methods to map keystrokes from a QWERTY keyboard to the actual Chinese characters. I normally use a method called traditional Cangjie and here's how you type the URL:

    twlb vfog vfbtv .mg jmso hodqn .dvii dttb
    w w w .wong kar wai .(--org--)

    Of course there are rules to generate the above if you know what the word looks like :-). However as you can see it's much more inconvenient that way, and anyone who thinks that the average person who doesn't know Chinese typing would be able to reach their Chinese domain is being silly at the least.

    --
    Keith So GnuPG fingerprint = 168F 874B 4E26 DCA8 B8BF 57F4 80F9 412E F82B AE4C
  62. [Off-topic] Non-profit organizations by Faré · · Score: 1
    [...] the fact that the U.S. government is a non-profit organization [...]

    Uh? Who are you kidding? Looks like you have been smoking something fairly strong, lately (such as political propaganda).

    "Non-profit organization": this term is such an oxymoron, anyway! As if any organization could last without somehow benefitting its members!

    The whole DNS monopoly is a huge racket on the internet, anyway.

    -- Faré @ TUNES.org

    --

    -- Faré @ TUNES.org
    Reflection & Cybernet

  63. Re:Ýou people just don't get it by smack.addict · · Score: 2
    People don't agree with you, so you just dismiss them as stupid Americans? How enlightened!

    There are many Americans who understand internationalization issues very thoroughly, and some of them disagree with this proposal. It is a bad proposal because, first off, it really does not seem to understand internationalization issues. You do not accomplish I18N by using national character sets. Using an NCS is not making your content supported for an international environment. It is doing exactly the opposite. In many cases, there are half a dozen NCS that support the same damn alphabet. If you really want I18N, you need to use Unicode (preferably UTF8) or UCS.

    If you are going to support multilingual domain names, resolution must occur in either Unicode or UCS. Let DNS lookup libraries handle the conversion from KOI-8 to UTF8. The user enters the domain in their NCS and the DNS server only has to handle one character set.
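
    A sketch of that conversion step in Python, assuming the lookup library knows which national character set the user typed in. `to_wire` is a hypothetical helper, not part of any resolver.

```python
def to_wire(raw: bytes, user_charset: str) -> bytes:
    """Convert a name from the user's national charset to UTF-8."""
    return raw.decode(user_charset).encode("utf-8")

word = "россия"
from_koi8 = to_wire(word.encode("koi8_r"), "koi8_r")
from_cp1251 = to_wire(word.encode("cp1251"), "cp1251")

# The two national encodings use different octets for the same letters...
print(word.encode("koi8_r") == word.encode("cp1251"))  # False
# ...but after conversion the server sees a single representation.
print(from_koi8 == from_cp1251)                        # True
```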

    Beyond the issue of I18N, however, is the issue of who a TLD is targeted at. If .com is aimed at a global audience, then domains registered under that TLD should support a global audience: i.e. ASCII or ISO-8859-1. NOTHING ELSE. Let .ru use more of the Unicode spectrum. Or even allow for a .ро (those are the Cyrillic letters for the first two characters of the Russian word Rossiya, in case your browser cannot render them) for domains aimed specifically at Russian speakers.

  64. Wow, this is amazing... by The_Messenger · · Score: 1
    ICANN't believe it, myself.

    ---------///----------
    All generalizations are false.

    --
    I like to watch.

  65. Re:w00ly_mammoth vs. wooly mammoth by Anonymous Coward · · Score: 1

    DISCLAIMER I am a fag. But you are seriously *gay*.

  66. How it's supposed to work by David+A.+Madore · · Score: 4

    Recently I posted this comment mentioning the fact that there's really no reason why a domain such as www.中国.com (you should see two Chinese ideograms meaning "China" between the "www." and the ".com" parts; further, if you click on this link, your browser should open a window telling you that the domain "www.中国.com" does not exist, with the same two Chinese ideograms) doesn't exist.

    Let us recall: first, as specified by the HTML specification, every HTML document, no matter what character set it is "encoded" as, is written in the all-encompassing Unicode character set. So when you write something like "&#20013;&#22269;" in HTML, it refers to the Unicode characters (decimal) 20013 and 22269, no matter what the current character encoding and font are. So that's how you write the link text. Second, as for the URL itself, well, although it is not (as far as I know) formally recommended by an Internet standard, it is widely recognized that URLs are written in the UTF-8 encoding format (which is afterward %-encoded into ASCII).
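
    Both steps can be reproduced with the Python standard library: resolving the numeric character references, then UTF-8 encoding and %-escaping the result.

```python
import html
from urllib.parse import quote, unquote

# Numeric character references name Unicode code points directly.
label = html.unescape("&#20013;&#22269;")
print(label, [ord(c) for c in label])   # 中国 [20013, 22269]

# quote() encodes to UTF-8 by default and %-escapes each octet.
escaped = quote(label)
print(escaped)                          # %E4%B8%AD%E5%9B%BD
print(unquote(escaped) == label)        # True
```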

    The whole process is described in this Internet Draft ("Internationalized Uniform Resource Identifiers"; WORK IN PROGRESS!) by Larry Masinter and Martin Duerst where the relationship between URIs and IURIs (Internationalized URIs) is discussed in detail.

    The DNS is the toughest part of all. The DNS specification (RFC1034) states (section 3.1) that DNS data is to be taken as binary for possible upward compatibility (this was wonderful foresight on Mockapetris' part!). Consequently, there is nothing as per standards wrong with using (UTF-8 encoded Unicode) 8-bit data in DNS labels. Except, of course, that many "buggy" implementations will have to be corrected for broken assumptions, *sigh*. The IDNS working group suggests using a UTF-5 encoding to avoid going beyond the current domain name limits: I think this is not a good thing and we should stick to UTF-8 and repair broken software.
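
    As a rough sketch of what "just use UTF-8 in the labels" would mean on the wire: a name is a sequence of length-prefixed labels ended by a zero octet, each label at most 63 octets and the whole name at most 255. The function below is a hypothetical illustration of that layout, not code from BIND or from the working group, and it ignores case folding and compression.

```python
MAX_LABEL_OCTETS = 63
MAX_NAME_OCTETS = 255


def encode_name(name: str) -> bytes:
    """Encode a dotted name as length-prefixed labels, each label as raw UTF-8."""
    out = bytearray()
    for label in name.rstrip(".").split("."):
        raw = label.encode("utf-8")
        # The limit applies to octets, not to characters.
        if not 1 <= len(raw) <= MAX_LABEL_OCTETS:
            raise ValueError(f"label is {len(raw)} octets; must be 1..{MAX_LABEL_OCTETS}")
        out.append(len(raw))
        out += raw
    out.append(0)  # the root label terminates the name
    if len(out) > MAX_NAME_OCTETS:
        raise ValueError("encoded name exceeds 255 octets")
    return bytes(out)


print(encode_name("www.slashdot.org"))
print(encode_name("www.\u4e2d\u56fd.com"))
```

    Note that the two ideograms cost six octets, so multi-byte labels hit the 63-octet limit sooner than ASCII ones.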

    Oh, and incidentally, see this page to find out how broken your browser's Unicode support is.

  67. I can see where this is going... by batobin · · Score: 2

    Now when I buy a product and need technical support, they can tell me that my keyboard isn't compatible with their web site! I knew this day would come! Brian Tobin

  68. ugg by jmd! · · Score: 2

    I don't have anything against other cultures, and don't mind other languages existing, in writing or on web pages... but DNS is NOT the place for them.

    a domain name is supposed to be universally accessible. this is going to make a great many pains in the ass.

    old browsers wont work
    english keyboards lack accented characters
    it's not fun changing your charset, then punching in random alt+XXX codes until you match the CJK symbol you're looking at.

    the internet is really becoming dumb.

    1. Re:ugg by chrischow · · Score: 1

      as the population on the net who can't speak english well or at all increases (and it is), we need stuff like i18n domains. just because u can't see a need for it doesn't mean there is no need for it. actually there is a great need for it. or does "globally accessible" mean written in english?

    2. Re:ugg by ZeroData00 · · Score: 1

      Well, if you have a Mac it is a simple matter to type accented characters using the Option key or the Key Caps program, or the Microsoft equivalent. I am an American but have the name Bjørn, yes, with an "ø", and Bjorn.com/org/net is in use, along with Bjoern, the English equivalent of Bjørn. Plus .no, .se, .de and so on can use their accented characters. Now as for non-Roman characters, you're on your own. But as for Roman characters, you could make it so you type Bjørn.com and it goes to Bjorn.com, so you could type bjørn.com or bjorn.com. By the way, I sign my name Bjørn and I was born in the USA and live there.

      --
      When I was a boy the government stole everything from us.
    3. Re:ugg by claes · · Score: 1

      What rubbish!!! Is it so hard to understand that other languages should be able to use the internet on their own terms? The whole world does not revolve around the US alone...

  69. ñ? by Anonymous Coward · · Score: 1

    There was a nice (funny) scandal here in Spain a couple of years ago when the health ministry bought a new "computer system" which couldn't cope with the ñ (enye,\tilde{n},ñ,whatever). I don't know the details but it makes me wonder if it's worth extending the system if (even) the government(s) will still be too clueless to use it. But at least it will be one less excuse for those who can't be bothered to accent correctly and blame their laziness on the machines.

  70. just one problem... by jmd! · · Score: 2

    i dont see how this would be possible without the modification of every name server in the world to support multibyte domains... since BIND 9 is in feature freeze... this might get in to BIND 10... look for betas in about 10 years.

    wouldn't this choke most applications? i'm not entirely sure how CJK are handled... doesn't seem to me like it would be a pop-in transition.

  71. Ummm... by dagoalieman · · Score: 1

    I assume we'll have some english letter translations? For example, ss (or some letter combo) for a symbol like the German ess-stet (or however you spell it.)? That looks like a capital B, with a long tail on the left... alt-225 for those with ascii working...

    If we don't, there will be a lot of people who can't access web pages. HOWEVER, this may be a good thing.. For example, Asians can keep Americans off their sites, and in the same context, we can use some spanish characters and keep them off. Interesting possibilities, but where do we go from here?

    --
    We don't need no Net Explorer We don't need no Thought control
    1. Re:Ummm... by ariux · · Score: 1

      Most Americans couldn't be bothered with such foreign sewage.

      Ugh. At this point I can't tell the U.S. trolls from the European trolls.

  72. Internationalized Domain Names by jseng · · Score: 1

    For those who are interested in IDN, here are some URLs.

    IETF IDN WG
    http://www.i-d-n.net/

    NSI Registry Testbed
    http://www.nsiregistry.com/

    i-DNS.net (Technology Provider for NSI ML.com testbed)
    http://www.i-DNS.net/

    Multilingual.COM Promotion
    http://we-multilingual.com/

  73. Re:Also great. OSes now forced to support CJK inpu by chrischow · · Score: 1

    Mac OS 9 does this already, CJK (and other languages - indian ones and some others) are an optional install

  74. not a good thing, at the moment, by delmoi · · Score: 1

    I really don't think this is a good thing right now, unless you did something like map a unicode domain to a standard, limited characterset domain name.

    That's what I'd do anyway, but as the poster suggested, the people who decided to do this probably don't know anything, since they decided to go by specific languages, rather than Unicode all at once.

    But anyway, in my opinion, this could really negatively impact the global nature of the internet, at least right now. I've got my computer rigged up to let me enter Chinese characters, both simplified and traditional, but although I know how to read a little traditional Chinese, I can't figure out how to type it in. This system would make it impossible for me to visit internet sites with those characters in the domain name, or it would at least make it difficult for me to type those URLs in. I, and many other people, could be cut off from parts of the internet that use character sets we don't know how to enter.

    Domain names were designed the way they were for a reason, and I don't think it's a good idea to go back on that....

    --

    ReadThe ReflectionEngine, a cyberpunk style n
  75. Re:EXCELLENT POINT... by chrischow · · Score: 1

    a true stereotype judging from some of the posters here however

  76. Re:Not good by TulioSerpio · · Score: 1

    I'm Argentine (I speak Spanish). In the case of Spanish, I think the solution is to allow accented chars in the name, but MAPPED to the unaccented chars, so you can type galeríacentral and go to galeriacentral. That way people can type the real word in Spanish and get the real site.
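
    One way such a mapping could be sketched is with Unicode decomposition: split each accented letter into a base letter plus combining marks, then drop the marks. The function name is made up, and this is only an illustration of the idea, not how any registry does it.

```python
import unicodedata


def fold_accents(label: str) -> str:
    """Map accented Latin letters to their unaccented base letters."""
    # NFD splits "í" into "i" followed by a combining acute accent.
    decomposed = unicodedata.normalize("NFD", label.lower())
    # Keep only the characters that are not combining marks.
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


for typed in ["galer\u00edacentral", "galer\u00ecacentral", "Panam\u00e1"]:
    print(fold_accents(typed))
```

    Two caveats: this also folds ñ to n, which Spanish speakers may not want since ñ is a separate letter, and letters such as ø or ß have no decomposition, so they pass through unchanged and would need an explicit table.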

    --

    I'm from Argentina: Tango, Asado, Mate, Gaucho, Maradona, YPF

  77. bad example.. but good point.. by radja · · Score: 1

    the ß is a bad example, because it actually is 2 letters: a long and a short 's', which is why it can be written as 'ss'.

    //rdj

    --

    No one can understand the truth until he drinks of coffee's frothy goodness.
    --Sheikh Abd-Al-Kadir, 1587
    1. Re:bad example.. but good point.. by afc · · Score: 1

      Actually, I think it's an 's' (es) and a 'z' (zet, pronounced 'tset') tied together by the liaison in old German (gothic) script. Don't know why it came to be written alternatively as an 'ss', but then again, IANAG (but I imitate accents quite well :-)
      --

      --
      Information wants to be beer, or something like that.
  78. mapping back to ASCII? by Goonie · · Score: 2
    I'm all for making the web accessible to everyone, including those whose native language is not English, but this could lead to websites that are virtually inaccessible to somebody who doesn't have the correct keyboard and/or input software. This is a bad thing.

    However, if you had some kind of translation software that automatically mapped the local character set back to ASCII (and of course disallowed name clashes for the mapped names when registration occurred), it could be a win/win situation both for making the DNS more useful for non-english speakers, and keeping the net globally accessible.
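
    A toy sketch of that registration rule, under the assumption that the mapping is simple accent stripping on Latin letters (CJK or Cyrillic would need a real transliteration table, which is not attempted here). The Registry class and its methods are hypothetical, invented for this example.

```python
import unicodedata


def fold_key(label: str) -> str:
    """Reduce a label to the unaccented, lower-case form used for clash checks."""
    decomposed = unicodedata.normalize("NFD", label.lower())
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


class Registry:
    """Toy registry: one owner per folded name, however the name is accented."""

    def __init__(self):
        self._owners = {}

    def register(self, label: str, owner: str) -> bool:
        # The first registrant claims every accent variant of the name.
        holder = self._owners.setdefault(fold_key(label), owner)
        return holder == owner

    def resolve(self, label: str):
        return self._owners.get(fold_key(label))


registry = Registry()
print(registry.register("galer\u00edacentral", "docrates"))  # first claim succeeds
print(registry.register("galer\u00ecacentral", "squatter"))  # variant is refused
print(registry.resolve("galeriacentral"))                    # plain ASCII still works
```

    Someone with an ASCII-only keyboard can still reach the site, and the accent variants cannot be registered by a third party.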

    --

    Any sufficiently advanced technology is indistinguishable from a rigged demo
    --Andy Finkel (J. Klass?)
  79. Re:Unicode Limitations / BIND by chrischow · · Score: 1

    exactly, for example chinese.com... if written in chinese it could be 2 characters

  80. Re:Not good by Mr+Z · · Score: 1

    One possible "solution" would be to treat differently-accented versions of a registered name as reserved. That is, if you register galeríacentral.com, you have first dibs (or more likely, right of first refusal) for galerìacentral.com and galeríacentrál.com.

    That would help stop the copycatters a little.

    --Joe
    --
  81. Re:Ýou people just don't get it by claes · · Score: 1
    I never said they were stupid americans. I said that they were americans and that they obviously could not understand that this is an important issue in other countries. Most people here have complained about (in my opinion) very trivial problems. Like how they should be able to type the characters in an URL on their keyboard that lacks key for these. These problems are much smaller than the problems non-americans have to deal with, and that this suggestion tries to solve. (Why are these problems smaller? Think about it: If you do not live in a country where you encounter these URLs in advertising on television or in newspapers etc, you will only encounter them as text on your screen, in an email for example, or as a result from a search engine. And then you can just click on them or copy and paste them. And in the rare cases where even this does not work, you will have to use software that can make you type them. Probably integrated in future browsers. And also: if you use these characters when turning to a global audience, you are somewhat stupid. Most people will realize this. I think this will not be a problem after all.)

    If my last name is Jönsson, I want to be able to use that in a domain, and not have to use Jonsson which is a completely different name. The same applies to lots of words, I estimate about 1/3 or 1/4 of all words in the Swedish language have å ä or ö in them. These words are like second class citizens on the internet today, even when only used in a Swedish context.

  82. How about *dis*-allowing characters? by iktos · · Score: 1

    Those who run the .nu domain want to let those who register domains use the å ä and ö characters (which are letters in their own right in Swedish and not accented letters), because .nu is used by lots of Swedes. The major operators in Sweden don't like it, as it's not an international standard, so I think it's too early to say it "works". If this becomes a standard, to be logical, then nobody from Sweden should be allowed to register a domain with "w" in it, as we don't have that character as a separate letter in our alphabet (any more).

  83. There is Need for a Non-DNS URL System by winterstorm · · Score: 2

    I think it is about time we tossed out DNS when it comes to URLs. It is ridiculous that so many millions of non-technical users are expected to use DNS. The further absurdity of the DNS system's application to URLs is realized by the endless "property" claims made by rich litigious corporations.

    Why not use some kind of distributed, non-exclusive labeling system that lets IBM have the name "IBM". Maybe something LDAP based?

    We are not going to get anywhere by patching up the DNS system a problem at a time. We need to engineer a new solution. I'm all for evolution but I don't want to wait for it to come up with something that works.

  84. Re:.nu by ~MegamanX~ · · Score: 1
    No, i don't always follow links.
    No, i don't want to bookmark every site i visit.
    No, i won't go to the site if i can't type its address.

    • What if they decide to host an english section?
    • These addresses would be limited to small community sites and would not help the fact that you have the opportunity to reach a global public as you said.
    Actually, modern desktop computers all share a common character set: ASCII. Let's use it. Actually, this address is only an easier thing to remember than a numerical IP (XXX.XXX.XXX.XXX)...

    phobos% cat .sig
    --
    phobos% cat .sig
    cat: .sig: No such file or directory
  85. Re:That's funny... by generic-man · · Score: 1

    Yes, wooly mammoth does exist. You just have to use the correct, standard space character.

    --
    For more information, click here.
  86. Thought it would be a joke... by charon.de · · Score: 2



    I have read about it on http://www.spiegel.de, but as far as I remember, it was only a doubtful claim by some politicians who have special German characters like ä, ö or ü in their names and could not register them without using substitutions like ä --> ae and so on.

    As far as I remember, the chairman of http://www.denic.de only laughed about this claim.

    I think it would lead us to more problems than it would solve! Or is it just again about making $$?

    Michael

  87. Re:like latin for science by eastern · · Score: 1

    English is not required in schools in India. (I'm an Indian, schooled in India). It's a different matter that most schools perceived as 'good' not only teach English but also teach everything else like science and arts in English. This is probably a colonial hangover but clearly a great advantage for those who can afford an English education. There's a strong business and employment bias against those who can't handle English but almost everyone I know would rather learn English and join in rather than struggle to change the world.

  88. Unicode Limitations / BIND by waldoj · · Score: 3

    Unless I'm mistaken, Unicode is a combination of two ASCII characters to create a single one, which is how Japanese, Chinese, etc., characters are created. 255^2 is a lot of characters. (65025, to be exact.) Doesn't this mean that these domains are limited to 31 characters? Further, can BIND *support* using characters beyond [a-z0-9-.]? I sure wouldn't think that it could.

    I didn't find these questions answered anywhere on ICANN or NSI's sites. Anybody have any ideas?
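
    For a rough feel of the numbers, here is a small sketch that assumes labels would be carried as raw UTF-8 (one of the proposals in this thread, not something ICANN or NSI have confirmed) and that the usual 63-octet label limit still applies. In UTF-8 a character takes a variable number of bytes rather than a fixed two, so the character limit depends on the script. The helper names are made up.

```python
MAX_LABEL_OCTETS = 63  # per-label limit in the DNS


def utf8_len(text: str) -> int:
    """Number of octets the text occupies once encoded as UTF-8."""
    return len(text.encode("utf-8"))


def max_chars(sample_char: str) -> int:
    """How many copies of this character fit in one 63-octet label."""
    return MAX_LABEL_OCTETS // utf8_len(sample_char)


# An ASCII letter, an accented Latin letter, and a Chinese ideogram.
for ch in ["a", "\u00f1", "\u4e2d"]:
    print(f"U+{ord(ch):04X}: {utf8_len(ch)} octet(s), up to {max_chars(ch)} per label")
```

    Under those assumptions an all-Chinese label tops out at 21 characters, though each ideogram carries far more meaning than a single Latin letter.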

    -Waldo

    -------------------

  89. Not good by Docrates · · Score: 5

    Look, I'm Panamanian. Spanish is my first language (it is Panamá, not Panama), but I just can't agree with this because I don't think it's practical at the moment. Take for example this web site we're building called galeriacentral.com. Everyone knows automatically how to access it when they hear an ad for it on the radio, but with the international characters allowed, I would have to register galeriacentral.com, galeríacentral.com (correct form) and galerìacentral.com. And then someone would register galeríacentrál.com and I'd be screwed (cybersquatting is allowed in most parts of the world)...

    My recommendation would be to leave it up to the country TLDs. So if I want to register cualquiercosa.com.pa then OK, but the regular .com/net/org are already abused enough to leave more room for stuff like slashdog.org.

    --

    There are two kinds of people in the world: Those with good memory.
    1. Re:Not good by Mr+Z · · Score: 1

      Here's one simple solution: All of them. Turn accents into icing, don't make them the cake. Make the comparison accent-insensitive.

      --Joe
      --
  90. Re:Also great. OSes now forced to support CJK inpu by tfxx · · Score: 1

    What's preventing you from inputing CJK characters right now?

  91. There's a problem with this... by Millennium · · Score: 2

    ...namely, it's not yet practical. It may not be for a long time.

    One, people talk about accented characters as being harder to recognize when spoken. While this is true, there's another problem, and one that's a lot tougher: there is no standard way to type these characters. On a Mac it's done one way (fairly intuitive, based on the character over which a given mark most frequently appears), on Windows it's done another way (an unnecessarily difficult process involving a four-digit keycode), and on Linux/Unix it's still another (I don't even know how it's done there).

    Part of the reason the keyboard works so well is that it's at least semi-standardized; for the basic Roman character set I can move across platforms effortlessly. But when you start throwing diacriticals into the mix, I'm lost when I move from platform to platform. We need to solve that problem before we can even think of putting such characters in URL's. Can it be done? I think so.

    Now, there's the problem of CJK characters in URL's. First of all, most computers aren't even capable of recognizing these without special software. As a result, the characters come out as a sequence of ASCII chars which if you're really lucky might all be printable. If you're not so lucky, the characters won't even be printable, or they'll be indistinguishable from one another so you still don't know what to type.

    The answer here? Unicode (specifically UTF-8) helps, but many computers still don't support Unicode. Even in the case of those that do, I doubt there are any fonts which support every single character in the CJK set yet (remember, the Chinese character set in particular is truly vast; a two-byte encoding system is still insufficient for encoding all the possible characters). While all current operating systems can manage Unicode, many people are unable or unwilling to upgrade to current technology, and that's going to be a huge barrier to overcome (it may even prove insurmountable).

    Supporting all the world's languages in URL's is a Good Thing. However, we have more than a few problems that we have to get through before we can accomplish that goal. The resources currently being spent on this project would be better spent solving those problems first.
    ----------

  92. Re:barf. by linuxgod · · Score: 1

    Yes, that sucks.
    I get that crap too.

  93. Re:Ýou people just don't get it by ariux · · Score: 1

    very trivial problems. Like how they should be able to type the characters in an URL on their keyboard that lacks key for these.

    Like it or not, multinational domain names will tend to fragment the Internet, by making it harder for people from different backgrounds to communicate.

    As it stands, English is a kind of common language for the Internet.

    Yes, this is to some extent unfair - it gives an advantage to native English speakers over everyone else. I can see why this would bother people with other native languages.

    It also brings the same advantage in international communication that Latin did to the international community of scientists hundreds of years ago - it makes it possible for everyone to communicate after learning only one new language, not fifty.

    To some extent what is easy is what gets done; and making it harder for people from different countries to reach each other's Internet hosts will make it happen less.

    Don't get me wrong - this is not an argument that international domain names are bad or should not be used. I have not discussed at all their formidable advantages. But the fragmentation I speak of is an unavoidable consequence; to ignore it is to fool oneself.

  94. ICANN Plans Non-English Character Domain Testbed by Anonymous Coward · · Score: 1
    After following a few links, I have some questions:

    1. Why is NSI in the loop? Haven't they caused enough damage already?
    2. Why not start by allowing full ISO 10646, encoded with UTF-8, and say that other country-dependent or language-dependent issues may need to be addressed?
    3. Some of the posters seem to think that keyboards are an issue? Why?

    BTW, I'm a native Anglophone, and I could get along just fine without all of those accented characters. But there is no reason to expect the rest of the world to be happy with a character set defined by and for Anglophones, and I regard internationalization as long overdue.

    English may be the lingua franca of the Internet, but that doesn't mean that we should be linguistic <godwin>fascists</godwin>.

  95. Re:EXCELLENT POINT... by HarryZink · · Score: 1

    Currently, we have a global lingua franca in the form of English, which requires that domain names, and in some way URLs, use English and the appropriate character set. In many ways, this brings people together around a common set of rules, yet still gives tremendous individual leeway.

    By introducing domain names with national character sets (and, make no mistake about it, the reason behind this is purely profit-driven), you are introducing the equivalent of national barriers which will make it annoyingly difficult, if not outright impossible, for westerners to reach oriental websites (for example) - more so for Windows users than Mac users (Macs have a majority of western characters accessible via pretty much any alphabetic keyboard through simple and intuitive keyboard combinations).

    This must be the most incredibly stupid suggestion I have ever read.

    Harry

  96. Re:like latin for science by claes · · Score: 1

    Most people don't use Latin. They use their native language for flowers and species. They should be able to do that on the internet as well.

  97. .nu by mayar · · Score: 3
    If you want to register a .nu domain you can already use some of these features: "Swedish, Danish, Norwegian, German, Spanish and other Western European language characters, such as å, ä, ö, ç and ñ."

    And worldsnames.net also features Japanese characters, Chinese, Korean, Arabic, Cyrillic.

    Though one of the nice "features" of the internet is the fact that you have the opportunity to reach a global public, which is rather hard when you have country/language specific characters. My 0.02.
  98. Re:Looks NSI/ICANN just discovered... by Anonymous Coward · · Score: 1

    That's not the point...

    revisionism is a state sponsored pastime in the
    majority of the countries that will be using new
    domain names.

    The second point is that those "blockheaded
    westerners" got and still get a lot of help from
    people all over the world. To think the internet
    is an American invention is to be ignorant of
    both history and reality.

    Thirdly, those "blockheaded westerners" (NSI
    primarily) just don't get it... so it's time
    for the innovation to happen where the need is.
    As it happens, I-DNS arose out of Singaporean
    research money.

    Finally, if you think everything Internet has
    already been thought of, you're so desperately
    wrong that you're probably not capable of
    contemplating your belly-button.