Slashdot Mirror


Internationalized Domain Names Coming Soon

rduke15 writes "You think you know how to parse a domain name for validity? Well, in case you haven't noticed, things are getting tougher as registrars keep adopting IDN (Internationalized Domain Names), which uses a weird encoding named Punycode to enable accented characters in domain names. The Register reports on Switzerland, Germany and Austria's joint move to enable IDN. See the overview in English from Switch. But I guess it would be difficult to talk about this on /., since it does not even support basic Latin-1 ... :-)"

526 comments

  1. FINALLY! by penguinrenegade · · Score: 1, Flamebait

    I'm glad to see that people other than Americans are being recognized on the internet. Which originally started as an American military project...

    1. Re:FINALLY! by DoorFrame · · Score: 0, Redundant

      Wait, huh? How was your second point linked to your first point?

    2. Re:FINALLY! by arcanumas · · Score: 5, Funny
      I'm glad to see that people other than Americans are being recognized on the internet. Which originally started as an American military project...

      I am glad to see others than the Mesopotamians using the wheel, which was originally invented for use in Mesopotamia.

      --
      Slashdot Sig. version 0.1alpha. Use at your own risk.
    3. Re:FINALLY! by wo1verin3 · · Score: 0

      *ROBBLE ROBBLE WHAT ABOUT THE GALGAMECKS?!?!?*

      +1 Elite SouthPark Reference places to avoid lameness filter

    4. Re:FINALLY! by Anonymous Coward · · Score: 0

      > I'm glad to see that people other than Americans are being recognized on the internet. Which originally started as an American military project...

      >> I am glad to see others than the Mesopotamians using the wheel, which was originally invented for use in Mesopotamia.

      and I'm even more glad that people other than Mexicans are using the number 0, which was invented (discovered?) by the Mayas. :-)

    5. Re:FINALLY! by Anonymous Coward · · Score: 1, Informative

      Elsewhere in the world, the Arabic numeral system (0123456789) had zero, and before that, so did ancient India.

      I don't think the Mayans even used a base-ten system like the rest of the world, so attributing zero to them seems odd to me.

    6. Re:FINALLY! by cynicalmoose · · Score: 4, Funny

      The internet was built as a highly decentralised, noncontrolled network, so that, in the event of a nuclear war, military leaders would have unrivalled access to pornography. (3DTIAB)

      --
      Exercise your right not to vote. thinkoutside.org
    7. Re:FINALLY! by Anonymous Coward · · Score: 0

      Yeah, but the Sumerians freewared it.

    8. Re:FINALLY! by Anonymous Coward · · Score: 0

      well the mesopotamians didn't use galvanized rubber or water channeling tread on wheels, so why attribute the invention of the wheel to them?

    9. Re:FINALLY! by jea6 · · Score: 2, Interesting

      The last time I checked, binary had zero, so an off-hand, uninformed (slightly prejudiced) comment like yours is even dumber when you actually think about it.

      For the Maya, zero was not just a placeholder. It signified the concept of an absence of value, a.k.a. an empty set.

      http://en.wikipedia.org/wiki/Zero

      History
      The numeral or digit zero is used in numeral systems, where the position of a digit signifies its value, with successive positions having higher values, and the digit zero is used to skip a position. By about 300 BCE the Babylonians used two slanted wedges to mark an empty place in a given sequence of positional digits. It did not function in the true sense of a number. The use of zero as a number unto itself was introduced into mathematics relatively late by Indian mathematicians. An early study of the zero by Brahmagupta dates to 628.

      Zero was also used as a numeral in Pre-Columbian Mesoamerica. It was used by the Olmec and subsequent civilizations; see also: Maya numerals.

      The ancient Maya civilization used a vigesimal (base-20) numeral system.

      A vigesimal numeral system has a base of twenty.

      --

      sarchasm: The gulf between the author of sarcastic wit and the person who doesn't get it.
    10. Re:FINALLY! by mgcsinc · · Score: 1

      The infrastructure we use may have been derived from ARPANET, but the network grew out of networked research institutions, and that's an important distinction!

    11. Re:FINALLY! by McDutchie · · Score: 2, Insightful
      I'm glad to see that people other than Americans are being recognized on the internet. Which originally started as an American military project...

      I'm glad to see that people other than the Swiss are being recognized on the web. Which originally started as a Swiss scientific project...

      Without the rest of the world, the Internet would have been obsolete and irrelevant by now. Deal.

    12. Re:FINALLY! by Anonymous Coward · · Score: 0

      Wheels typically don't contain any rubber, or have any tread at all.

      No one said that the Mesopotamians invented the tire.

    13. Re:FINALLY! by claes · · Score: 1

      Swiss? Because of CERN? Hardly. CERN is an international project.

    14. Re:FINALLY! by McDutchie · · Score: 1
      Swiss? Because of CERN? Hardly. CERN is an international project.

      I know. But thanks.

    15. Re:FINALLY! by znesic · · Score: 1

      LOL :-) Your reply AND your sig ended up in my "interesting quotes" notebook. Thanks! Zoran

    16. Re:FINALLY! by cynicalmoose · · Score: 1

      Sorry, the sig should end "category" (see below) And yes, I know this post is off-topic.

      --
      Exercise your right not to vote. thinkoutside.org
  2. Ah great... by Worminater · · Score: 5, Insightful

    More ways for trolls to disguise goatse.cx links...

    1. Re:Ah great... by Hanzie · · Score: 2, Insightful

      The parent post is absolutely not flamebait. It actually brings up an extremely good point. There will unquestionably be domain squatting and misdirection with use of accented characters.

      --
      ********* sig: If you don't like the law, get filthy stinking rich, and buy a better one.
    2. Re:Ah great... by MikeXpop · · Score: 4, Insightful

      Heh. Worse than that. Imagine http://www.paypal.com/enteryourcreditcardnumberhere.php! How many people do you think that would fool? I'd guess a lot more than current sites do.

      --
      Etiquette is etiquette. He kills his mother but he can't wear grey trousers.
    3. Re:Ah great... by nsebban · · Score: 1, Funny

      c'mon man ! you forgot the link ! BTW, DON'T CLICK HERE :)

      --
      ____
      nico
      Nico-Live
    4. Re:Ah great... by Basehart · · Score: 1

      Aaaw for the love of God. I'd just about forgotten about that picture and then clicked the link.

      Thank you very much!!

    5. Re:Ah great... by LouisZepher · · Score: 1

      And who's the dumbass to blame for that, eh?

    6. Re:Ah great... by ceejayoz · · Score: 2, Funny

      I have modpoints right now, but I can't find the "-1, Dumbass" one... hmm...

    7. Re:Ah great... by Basehart · · Score: 1

      I do too, but if I mod you -1 Offtopic my original post will disappear and we'll all cease to exist.

    8. Re:Ah great... by Trejkaz · · Score: 3, Insightful

      Okay, you got me. That domain name is 100% pixel identical to www.paypal.com ... which letter is changed?

      --
      Karma: It's all a bunch of tree-huggin' hippy crap!
    9. Re:Ah great... by Anonymous Coward · · Score: 0

      Slashdot only supports 7-bit US-ASCII. If you enter accented characters it will remove the accents. Hence, the point of the original post was devoured by Slashcode.

    10. Re:Ah great... by lisany · · Score: 2, Funny

      Maybe he works for Verisign and is planning to hijack the domain "For the good of the Internet."

    11. Re:Ah great... by Trejkaz · · Score: 1

      Or at least, devoured by someone who couldn't find the "Preview" button, which would have highlighted the limitation instantly.

      --
      Karma: It's all a bunch of tree-huggin' hippy crap!
    12. Re:Ah great... by jrumney · · Score: 1

      Although domain names are internationaliSed, only good old fashioned AMERICAN letters are accepted on Slashdot. [watch for the "-1 UnAmerican" mods to prove my point]

    13. Re:Ah great... by JackRabbitSlims · · Score: 1

      That's a fact. In Spain it happened a couple of years ago, when a local registrar started to take pre-orders for internationalized domains (in Spain that means accented vowels and our beloved n-tilde). Domain squatting was crazy. Someone quickly registered "elcorteingle's" (that's the last e, accented) -a Harrod's type department store in Spain-.
      I even registered (and paid for!) one myself to prevent someone from taking over a domain I have (bolsagra'fica.com), only to see the registrar itself admit a year later that those domains would hardly ever make it through.

  3. sure, whatever. by Anonymous Coward · · Score: 0


    Will this complicate some pages a little? probably.
    Will it make a lot of people a little happier? Sure.
    Is it a big deal? *shrugs*

    1. Re:sure, whatever. by spinspin · · Score: 1

      This kind of United States-centric (I was going to say amero-centric, but then I remembered the rest of the continent) thought is getting pretty old. Just because you don't have to write accented characters doesn't mean it's not news (for nerds, no less).

  4. sounds like by Anonymous Coward · · Score: 3, Insightful

    Sounds like a job for Unicode.
    Unicode.org

    1. Re:sounds like by Daniel_Staal · · Score: 1

      The problem is Unicode support is not present in the places it is needed. So they have a way of encoding Unicode in ASCII.

      --
      'Sensible' is a curse word.
    2. Re:sounds like by Anonymous Coward · · Score: 0

      It is in fact a subset of Unicode. The current DNS system imposes restrictions: the case of a character is not significant, the number of characters is limited, some characters are restricted, and there must be only one way to write the domain at the registrar.

      Space characters would make visual transcription of URLs nearly impossible and could lead to user entry errors in many ways. So they are removed.

      The following characters are also removed:

      -All the control characters
      -Private-use code points
      -Non-characters
      -Surrogate codes
      -Annotation codes
      -Codes that change display properties (right-to-left, symmetric swapping, national digit shapes ...)
      -Unassigned codes
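      For anyone who wants to poke at those categories: Python's standard `stringprep` module exposes the RFC 3454 tables that Nameprep builds on. This is only a rough sketch; the label strings and the `prohibited_reason` helper are made up for illustration, and real Nameprep handles ASCII controls separately and also applies mapping and normalization steps not shown here.

```python
import stringprep

# (label, table check) pairs. The labels are illustrative wording;
# the in_table_* functions are the stdlib's RFC 3454 table lookups.
CHECKS = [
    ("non-ASCII space", stringprep.in_table_c12),
    ("control character", stringprep.in_table_c21_c22),
    ("private use", stringprep.in_table_c3),
    ("non-character", stringprep.in_table_c4),
    ("surrogate", stringprep.in_table_c5),
    ("inappropriate for plain text", stringprep.in_table_c6),
    ("changes display properties", stringprep.in_table_c8),
    ("unassigned in Unicode 3.2", stringprep.in_table_a1),
]

def prohibited_reason(ch):
    """Return the first matching category label, or None if ch passes."""
    for label, in_table in CHECKS:
        if in_table(ch):
            return label
    return None

for ch in ["a", "\u00fc", "\u00a0", "\u0085", "\ue000", "\u200f"]:
    print(f"U+{ord(ch):04X}: {prohibited_reason(ch)}")
```

      Ordinary letters such as "a" or "ü" fall through every check, while the no-break space, a control code, a private-use code point and the right-to-left mark are each flagged.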

      In English you have lowercase a and uppercase A, but once you add all the other languages, some characters may have up to four Unicode code points that represent the same character.

      There are several different ways to write one character in Unicode, like the A with a ring above:

      1) U+00C5 LATIN CAPITAL LETTER A WITH RING ABOVE
      2) U+0041 LATIN CAPITAL LETTER A followed by U+030A COMBINING RING ABOVE
      3) U+212B ANGSTROM SIGN

      This is the reason why they did not use plain Unicode. By using an encoding, they kept backward compatibility and followed the restrictions of DNS.
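      The three spellings of A-with-ring can be collapsed with Unicode normalization, which is roughly what IDN preparation does before encoding. A minimal sketch, assuming NFKC plus a plain `lower()` as a stand-in for the real case-folding table (the `canonical` helper name is invented here):

```python
import unicodedata

forms = [
    "\u00c5",    # LATIN CAPITAL LETTER A WITH RING ABOVE
    "A\u030a",   # LATIN CAPITAL LETTER A + COMBINING RING ABOVE
    "\u212b",    # ANGSTROM SIGN
]

def canonical(label):
    # NFKC merges canonically and compatibly equivalent sequences;
    # lower() is a simplification of Nameprep's case-folding step.
    return unicodedata.normalize("NFKC", label).lower()

results = {canonical(f) for f in forms}
print(results)  # a single entry: U+00E5, small a with ring above
```

      All three inputs end up as the same one-character string, so the registrar only ever sees one way to write the name.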

  5. Isn't there a better way? by CTalkobt · · Score: 4, Interesting

    It looks to me like the problem is that the DNS servers don't support unicode so they're using a bad implementation of it.

    Why not extend DNS to support Unicode? That way there'd be no translation or other crap to go through.

    Granted, software would need changing, but that would also be the case with the mangled crap that's mentioned in the article.

    What am I not understanding here? Or is this just an implementation dreamed up to make life complicated?

    --
    There's a gorilla from Manilla whose a fella that stinks of vanilla and has salmonella.
    1. Re:Isn't there a better way? by Anonymous Coward · · Score: 2, Funny
      Better still, why not just suck it up and get by with ascii? It only requires 1 byte per character, and is easy to memorize.

      Accented characters are so Old World and passe, anyway.

    2. Re:Isn't there a better way? by Horny+Smurf · · Score: 2, Interesting
      Paul Vixie (of BIND fame) has stated that supporting Unicode in BIND would probably require at least a year to implement, and could introduce new buffer overflow exploits.

      djbdns doesn't support Unicode either, although it doesn't rely on the standard C libraries, so Unicode support might only take a few weeks to add.

      Unicode would be better than punycode, but punycode works with existing DNS client and server software.

    3. Re:Isn't there a better way? by Moebius+Loop · · Score: 1, Informative

      It looks as if the goal is to implement this without breaking existing implementations. I did RTFA, although I might be missing something, but it seems to me that the translation is done by the client/local nameserver.

      i would imagine it probably attempts to query with the unicode first, and upon failure tries the munged address. since both versions are in the whois db, as DNS servers become unicode compliant, this would be naturally phased out.

      however, it means that any accent-containing domains would actually have two entries; i wonder, would you have to actually register twice (i.e., pay twice)?

      one good thing is that it does look like sufficiently undesirable names are the result of the conversion, so i don't think there would be much overlap between existing domains and the converted form of new accent-containing domains...

      --
      have you been seen on slash?
    4. Re:Isn't there a better way? by Anonymous Coward · · Score: 0

      Why would it try Unicode first? No server is going to implement Unicode and Unicode-reinterpreted-as-a-string-of-ASCII-bytes is going to be full of NULs, control characters, etc. Basically, noncompliant.

      It's much better just to use Punycode all the time. And the bonus is that a Punycode encoding of any current domain name is exactly the same stream of bytes as the ASCII encoding.
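      That "same stream of bytes" property is easy to check with Python's built-in `idna` codec (which implements the 2003-era IDNA rules). A small sketch; `to_ace` is just a hypothetical wrapper name:

```python
def to_ace(name):
    """Encode a domain name to its ASCII-compatible form."""
    return name.encode("idna")

for name in ["slashdot.org", "www.example.com"]:
    # Pure ASCII names come out byte-for-byte identical.
    print(name, to_ace(name) == name.encode("ascii"))
```

      So a resolver could run every name through the encoder unconditionally, and existing all-ASCII domains would not notice.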

    5. Re:Isn't there a better way? by the+morgawr · · Score: 1

      Maybe I'm being naive here, but wouldn't UTF-8 have worked just as well? It's backwards compatible with ASCII, and allows Unicode characters.

      --
      The policy of the United States is worse than bad---it is insane. -- Ludwig von Mises, Economic Policy(1959)
    6. Re:Isn't there a better way? by Anonymous Coward · · Score: 0

      Bravo! Masterfully hidden sarcasm - I LOVE it!

    7. Re:Isn't there a better way? by CaptnMArk · · Score: 1

      In a way I agree with you.

      All this I18N and L10N is anti-globalization.

    8. Re:Isn't there a better way? by wmshub · · Score: 4, Insightful

      Unicode-reinterpreted-as-a-string-of-ASCII-bytes (taken literally) can only mean UTF-7, which never really got much traction, but had no NULs or control characters in it - all pure, readable ASCII. Its problem in DNS would be that it treats upper and lower case as distinct, which is not true for current DNS queries. If you meant "UTF-8" when you said "unicode-reinterpreted-as-a-string-of-ASCII-bytes", that also has no NUL or control codes in it, and unlike UTF-7 it lets you treat upper/lower case any way you want. Its drawback is that it will insert bytes in the 128..255 (i.e., non-ASCII) range into the data stream, which will probably cause trouble for current DNS servers.

      So, to sum it up, you are right that current Unicode encodings will not meet current DNS RFCs, but the reason you gave wasn't quite right. Punycode does solve the problem, but ugh, punycode is an awful hack of a character encoding system. I'd hate to see it live on forever, but it might be useful getting us started on i18n-ified DNS.
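      To make the UTF-7 versus UTF-8 trade-off concrete, here is a quick look at what each encoding does to one accented label, using only Python's built-in codecs. The sample label is arbitrary:

```python
label = "b\u00fccher"  # "bücher"

utf8 = label.encode("utf-8")
utf7 = label.encode("utf-7")

# UTF-8: no NUL bytes, but the accented letter becomes bytes >= 0x80.
high_bytes = [b for b in utf8 if b >= 0x80]
print(utf8, [hex(b) for b in high_bytes])

# UTF-7: pure ASCII, but the base64 run inside it is case-sensitive,
# so a case-insensitive DNS comparison would not preserve it.
print(utf7, utf7.isascii(), utf7.lower() == utf7)
```

      The UTF-8 form carries two high bytes for the ü, while the UTF-7 form stays in ASCII but changes if anything along the way lowercases it.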

    9. Re:Isn't there a better way? by Rob+Riggs · · Score: 4, Informative
      wouldn't UTF-8 have worked just as well?

      No. The problem that punycode solves is that the encoded DNS names are themselves valid RFC1034 DNS names. That is, even when encoded, standard DNS validity checkers will accept the name.

      UTF-8 does not have this property.
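      A sketch of the kind of "standard validity checker" meant here, assuming the usual letters-digits-hyphen rule with labels of at most 63 octets (this follows the relaxed form that allows a leading digit; the regex and `is_valid_hostname` are illustrative, not taken from any particular resolver):

```python
import re

# One label: letters, digits, hyphens; no leading or trailing hyphen.
LDH_LABEL = re.compile(rb"[A-Za-z0-9](?:[A-Za-z0-9-]{0,61}[A-Za-z0-9])?")

def is_valid_hostname(name: bytes) -> bool:
    """Check a dotted name against the letters-digits-hyphen rule."""
    labels = name.rstrip(b".").split(b".")
    return len(name) <= 253 and all(
        LDH_LABEL.fullmatch(label) is not None for label in labels
    )

name = "b\u00fccher.ch"
print(is_valid_hostname(name.encode("idna")))   # Punycode form
print(is_valid_hostname(name.encode("utf-8")))  # raw UTF-8 form
```

      The Punycode form passes an unmodified checker, while the UTF-8 bytes are rejected because of the non-ASCII octets.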

      --
      the growth in cynicism and rebellion has not been without cause
    10. Re:Isn't there a better way? by pawal · · Score: 4, Interesting

      There are _so_ many applications using the domain name system that feeding UTF-8 through it will break most of them. Except for perhaps Internet Explorer.

      The registries using UTF-8 (most notably .NU) are running IDN in parallel with UTF-8 now.

      The Swedish registry is only using IDN. The reason for that is that UTF-8 in DNS is not an internet supported standard at all.

      http://www.xn--rksmrgs-5wao1o.se/ will work if you are using a recent Mozilla. (Slashdot should upgrade to at least ISO-8859-1 or UTF-8... I couldn't write raksmorgas.se correctly.)
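      Since the accented form can't be typed here, the ASCII form from the URL above can be decoded locally. A minimal sketch using Python's built-in `idna` codec; only the hostname is taken from the post, the variable names are arbitrary:

```python
ace = b"www.xn--rksmrgs-5wao1o.se"

# Decode each xn-- label back to Unicode, then re-encode as a sanity check.
name = ace.decode("idna")
print(name)
print(name.encode("idna") == ace)
```

      The decoded name should be the Swedish word with its three accented vowels restored, and encoding it again gives back the same ASCII string.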

      Microsoft are extremely slow in supporting IDN, and will probably not launch it until the next OS release, which is in 2006... There are plugins from Verisign.

      Do a good thing, release an open source plugin for MSIE.

    11. Re:Isn't there a better way? by defMan · · Score: 2, Funny

      i18n-ified

      internationalization-ified? Wow.

    12. Re:Isn't there a better way? by Anguo · · Score: 3, Insightful
      It looks to me like the problem is that the DNS servers don't support unicode so they're using a bad implementation of it. Why not extend dns to support unicode? That way they'd be no translation or other crap to go through.

      It's not as simple as you may think. I am all for Unicode, but to use it for domain names can lead to unwanted consequences.

      There already exist some internationalized domain names in Chinese, so instead of having chopsticks.com we can have [insert chinese characters for chopsticks here].com.

      The problem comes from the fact that there are tens of thousands of different Chinese characters, each of those having a different Unicode code but many of those being only slight variations of each other, or even so similar that a regular Chinese-reading user wouldn't notice the difference at first glance. Thus you could have two very different websites having seemingly exactly the same name in Chinese but being different nonetheless because the Unicode for their names is different.

      With only 26 letters and a few more characters, there have been many abuses of domain names, like www.microsoft-.com instead of www.microsoft.com (or some similar abuse), but the possibilities for abuse in Chinese are almost infinite. The same would be true in many European languages: many will not pay enough attention to the difference between an acute accent and a grave accent in French and might be misled to a different site than the one they were looking for. Imagine the credit card payment page of the bank Societe Generale: with the accents written backwards, a lookalike site can be created under a domain name that looks the same. The same in Polish, with the ogoneks under some of the letters or the L, with or without the bar across it.

      Technically, Unicode may be feasible, but human beings cannot distinguish between the hundreds of thousands (and more!) different letters, characters and other signs that it offers...
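      One cheap defence a browser could try is flagging labels that mix scripts. A rough sketch, assuming the first word of each character's Unicode name is a good enough proxy for its script (a heuristic chosen for illustration, not how any real browser is known to do it; `scripts` and `looks_suspicious` are made-up names):

```python
import unicodedata

def scripts(label):
    """Collect the script-ish prefix of each letter's Unicode name."""
    found = set()
    for ch in label:
        if ch.isalpha():
            found.add(unicodedata.name(ch).split()[0])
    return found

def looks_suspicious(label):
    # More than one script in a single label is worth a warning.
    return len(scripts(label)) > 1

print(scripts("paypal"))
print(scripts("p\u0430ypal"))  # second letter is CYRILLIC SMALL LETTER A
print(looks_suspicious("soci\u00e9t\u00e9"))
```

      Note the limits: this catches a Cyrillic letter hiding in a Latin name, but it does nothing for the swapped-accent case above, since acute and grave variants are all Latin letters.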

      --
      http://www.masquilier.org/republic/election/ Condorcet, Plurality voting and alternative voting enabled bulletin board.
    13. Re:Isn't there a better way? by TheAJofOZ · · Score: 1

      If you meant "UTF-8" when you said "unicode-reinterpreted-as-a-string-of-ASCII-bytes" , that also has no NUL or control codes in it

      Erm, I'm not sure which control codes you're missing in UTF-8, but all the control codes from ASCII are present, because UTF-8 is precisely equivalent to ASCII for strings which only include characters in ASCII's range. i.e., the lower 128 characters in UTF-8 are the set of characters in the ASCII charset, in the same order.

      If you're looking for a NUL character in UTF-8, its code is 0 just like in ASCII.

      Also, no form of unicode, nor ASCII, inherently allows you to ignore capitalization. The comparison algorithm you use completely defines the characteristics of the search, not the set of numbers you're searching. 'A' and 'a' are just as different in UTF-8 as in UTF-7 as in ASCII. You essentially need a mapping of uppercase and lowercase equivalents as part of your search algorithm. With ASCII that's easy because you've only got the basic english alphabet, with UTF-8 that's hard because you start getting into all kinds of other languages and having to deal with accents too. There is however plenty of information around (including in the form of standards) which defines how to do that.
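      To illustrate that the comparison rule, not the encoding, decides case sensitivity, here are two comparison functions side by side. A sketch only: `ascii_equal` mimics the A-Z-only folding DNS uses today, and `unicode_equal` leans on Python's `str.casefold()` as one example of a standardised Unicode mapping (both helper names are invented here):

```python
def ascii_fold(s):
    # Fold only A-Z to a-z, leaving every other character untouched.
    return "".join(chr(ord(c) + 32) if "A" <= c <= "Z" else c for c in s)

def ascii_equal(a, b):
    return ascii_fold(a) == ascii_fold(b)

def unicode_equal(a, b):
    # casefold() applies the Unicode case-folding mapping.
    return a.casefold() == b.casefold()

print(ascii_equal("SlashDot.ORG", "slashdot.org"))
print(ascii_equal("B\u00dcCHER", "b\u00fccher"))
print(unicode_equal("B\u00dcCHER", "b\u00fccher"))
print(unicode_equal("STRASSE", "stra\u00dfe"))

# And NUL is the same single zero byte in UTF-8 as in ASCII.
print("\x00".encode("utf-8") == b"\x00")
```

      The ASCII-only rule treats Ü and ü as different, while the Unicode mapping matches them and even equates "SS" with the German sharp s, which shows how much more policy the comparison has to carry.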

    14. Re:Isn't there a better way? by Matrix272 · · Score: 1

      Paul Vixie (of BIND fame) has stated that supporting Unicode in BIND would probably require at least a year to implement, and could introduce new buffer overflow exploits.

      Oh, we should abandon hope for it then, because, as we all know, there are NO buffer overflow exploits now!

      --
      "It's better to have a gun and not need it than need a gun and not have it." ~ Christian Slater, True Romance
    15. Re:Isn't there a better way? by caluml · · Score: 2

      So if he'd started working on it over a year ago, it would be ready by now.

    16. Re:Isn't there a better way? by Zeinfeld · · Score: 3, Insightful
      Paul Vixie (of BIND fame) has stated that supporting Unicode in BIND would probably require at least a year to implement, and could introduce new buffer overflow exploits.

      If Paul Vixie did say that, it would kinda argue for choosing that route rather than trying to get the IETF to agree to anything; so far it has been over five years since the start of this effort, and counting.

      The real problem is not fixing Bind, that is easy. Deploying bind updates and deploying compatible client updates is the real problem. It just isn't feasible.

      --
      Looking for an Information Security student project suggestion?
      Try http://dotcrimeManifesto.com/
    17. Re:Isn't there a better way? by Anonymous Coward · · Score: 0

      I'm not worried about fraud so much as the sheer impossibility of getting anyone outside China to even be able to type in the domain name in the first place. I have this concern when it comes to these Europeanized names, as well. While I am perfectly capable of writing in Japanese hiragana, katakana, and kanji on my GNU/Linux systems, I am at a loss to do so on Windows. Neither can I put an umlaut on a "u" on any system except my old Mac OS 8.6-running iMac.

    18. Re:Isn't there a better way? by Minna+Kirai · · Score: 4, Insightful

      Why not extend dns to support unicode?

      DNS should never get Unicode support, or any form of "internationalization" for that matter!

      DNS is supposed to be a way for humans to communicate with computers about internet hosts. The intent is not for some human to be able to read it, but for all humans. This has worked until now because hostnames were limited to only ~37 characters. Regardless of native language, any computer operator can quickly learn to handle the [a-z][0-9] glyphs. Basically anyone literate in one language can copy ASCII characters from a signpost onto a notepad, and then punch those into a keyboard. Even if her culture doesn't use the ASCII set in normal daily activities (which about everyone in America, Europe, and Japan does), then the shapes are at least simple enough to copy geometrically.

      But if 16-bit charsets are allowed in DNS, we could get hostnames composed of 3 Chinese characters and two Arabic ones, and which a Russian or Briton will be incapable of processing without tremendous pain.

      DNS is something that should be left in a "lowest common denominator" form, so that it's accessible to all of humanity (if they meet the low hurdle of operating a normal PC)

      Internationalized host identifiers in URLs will be important, of course. But they should be a separate layer implemented on top of DNS. DNS is a standard that already exists. Rather than changing the standard and breaking every single internet-using computer (the "flag day" scenario), a new system should be rolled out for people who want host identifiers in funny-looking squiggles.

    19. Re:Isn't there a better way? by Anonymous Coward · · Score: 0

      Could someone please explain how the hell the abbreviation "i18n" came to be? I have learnt it's supposed to mean "internationalisation", but WTF?

      Eye-eighteennnnn? Eye-one-eight-enn?

      And what's with ell-tennnnnn? Ell-ten-enn?

    20. Re:Isn't there a better way? by Anonymous Coward · · Score: 0

      I have to partially agree. I live in Sweden, where we have three "special" characters (AAO, there's supposed to be dots and circles and stuff over those letters). I think in most cases the spelling could be changed to AE, AO or OE or something like that. In the long run, though, I see all languages converging into a single one...

    21. Re:Isn't there a better way? by Anonymous Coward · · Score: 0

      If you haven't figured it out yet, it's I and N with 18 letters in between.
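      Spelled out as code, for the curious. A tiny sketch of the abbreviation rule; the `numeronym` name and the choice to leave very short words alone are my own:

```python
def numeronym(word):
    """First letter, count of letters in between, last letter."""
    if len(word) < 4:
        return word  # nothing worth abbreviating
    return f"{word[0]}{len(word) - 2}{word[-1]}"

print(numeronym("internationalization"))
print(numeronym("localization"))
```

      The same rule gives the ell-ten-enn asked about above: "localization" has ten letters between its L and its N.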

    22. Re:Isn't there a better way? by Anonymous Coward · · Score: 0
      "It's better to have a gun and not need it than need a gun and not have it."

      But it's best to not have a gun and not need it either; every other possibility is dystopic.

    23. Re:Isn't there a better way? by SoupaFly · · Score: 1

      WinXP actually seems to have pretty good language support (yeah, I was surprised too). It's not perfect. I have used it though (for Japanese), and it works pretty good.

      Here's a link to a MS KB article.

    24. Re:Isn't there a better way? by neurojab · · Score: 1

      Yeah. It's dumb. Other posts have covered what these dumbass abbreviations mean.

      It's basically a variant of "l33tsp33k" that somehow made its way into the corporate jargon. I refuse to use it. I spell out words IN ENGLISH, because that's the language I write in.

      Fight the current. Stop this idiotic intentional mis-spelling. Let's save the language.

    25. Re:Isn't there a better way? by Anonymous Coward · · Score: 0

      Then do Unicode -> UTF-8 -> yEnc !

    26. Re:Isn't there a better way? by Uerige · · Score: 1

      Yup you're soo right.
      Why don't these people think? I'd never get one of these domains for sure. Imagine you're telling someone over the phone to "go to my website, at www.[somethingfrench].fr", only to discover that he/she is using an american keyboard layout...
      Hope that was no business partner...

    27. Re:Isn't there a better way? by HiThere · · Score: 1

      Not really a good answer, but almost. Every language should adopt a mapping of the 37 characters uniquely specifiable in a DNS name. There is *NO* requirement that it match any particular set of characters in that language. Just the 37 that they think fit. This will cause some sites to adopt names that look really weird in English, but *what's wrong with that?* So a German mapping decides that it prefers not to use numbers, but to have some accented characters and a DM sign? No problem.

      Unicode would be a better answer, but if we must live in a small code space, let each language pick its own keyboard mapping. (Browsers should be nice and allow folk to specify what mapping they are using, and accept Unicode strings for conversion.) Some strings will be just unprintable, but that's the cost of not using Unicode.

      --

      I think we've pushed this "anyone can grow up to be president" thing too far.
    28. Re:Isn't there a better way? by Spellbinder · · Score: 1

      why should French people use an American keyboard layout?
      why should French people care if an American or a German can go to their webpage??
      if it is an English webpage it will not be using those characters!!! if not, and you are able to read French, it should be possible to figure out how to enter those fancy characters
      try to read the articles next time!!!
      they have an alternative way to enter the URLs in ASCII
      b(ue)cher.ch would look like xn--bcher-kva.ch in ASCII
      why are you against globalisation of the web?
      maybe all the English speakers could start to use all that fancy stuff again that they lost in their language
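      For the b(ue)cher example, the ASCII form can be reproduced with Python's built-in `punycode` codec. A simplified sketch: `to_ace_label` is a hypothetical helper that only adds the "xn--" prefix and skips the normalization a real IDN library performs first.

```python
def to_ace_label(label):
    """Punycode-encode one label and add the ACE prefix if needed."""
    if label.isascii():
        return label  # plain ASCII labels are left alone
    return "xn--" + label.encode("punycode").decode("ascii")

print(to_ace_label("b\u00fccher"))
print(to_ace_label("ch"))

# The full-name codec in the standard library agrees:
print("b\u00fccher.ch".encode("idna"))
```

      The ASCII letters stay readable at the front ("bcher"), and the part after the last hyphen encodes which non-ASCII character goes where.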

      --


      stop supporting microsoft with pirating their software!!!!!
    29. Re:Isn't there a better way? by MagicM · · Score: 1

      !

      It'll work in the latest version of Opera (7.22) too.

    30. Re:Isn't there a better way? by mattjb0010 · · Score: 1

      Basically anyone literate in one language can copy ASCII characters from a signpost onto a notepad, and then punch those into a keyboard. Even if her culture doesn't use the ASCII set in normal daily activities (which about everyone in America, Europe, and Japan does), then the shapes are at least simple enough to copy geometrically.

      Why should they have to learn another character set? The same argument could be made in reverse, which you in fact do:

      But if 16-bit charsets are allowed in DNS, we could get hostnames composed of 3 Chinese characters and two Arabic ones, and which a Russian or Briton will be incapable of processing without tremendous pain.

      But those characters could be very simple to copy geometrically!

    31. Re:Isn't there a better way? by rduke15 · · Score: 1

      So a German mapping decides [...] to have some accented characters and a DM sign? No problem.

      Except the DM disappeared a few years ago. If you never get out, maybe you should at least watch some TV? :-)

    32. Re:Isn't there a better way? by iburrell · · Score: 1
      DNS could be extended to support UTF-8. The protocol is 8-bit clean. The problem is with the applications, protocols, and formats that include domain names. They assume that domain names are composed of only letters, numbers, and hyphens. For example, putting UTF-8 domains in an email message would confuse mail servers, mail readers, and lots of other stuff.

      The advantage of IDNs is that only software that understand the extra characters need to be changed. All the old pieces of software can keep working and treat the encoded IDNs as odd-looking but valid domain names.
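      The "8-bit clean" point can be seen in how a name is laid out inside a DNS message: each label is a length byte followed by raw octets, ending with a zero byte. A minimal sketch of that layout (`encode_qname` is an illustrative name, and compression pointers and the overall 255-octet limit are ignored):

```python
def encode_qname(name: bytes) -> bytes:
    """Encode a dotted name as length-prefixed labels plus a root byte."""
    out = bytearray()
    for label in name.rstrip(b".").split(b"."):
        if not 0 < len(label) <= 63:
            raise ValueError("each label must be 1..63 octets")
        out.append(len(label))
        out += label  # any octet values fit here, ASCII or not
    out.append(0)     # zero-length root label terminates the name
    return bytes(out)

print(encode_qname("b\u00fccher.ch".encode("idna")))
print(encode_qname("b\u00fccher.ch".encode("utf-8")))
```

      Both forms fit the wire format equally well; the difference only shows up later, in mail software, log parsers and everything else that assumes letters, digits and hyphens.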

    33. Re:Isn't there a better way? by Anonymous Coward · · Score: 0

      And the "human readability" of said encoding becomes increasingly theoretical.

    34. Re:Isn't there a better way? by Anonymous Coward · · Score: 0

      Correct me if I am way off base but doesn't the A in ASCII stand for American?
      And as a sidenote Americans tend to simplify things like ignoring the strange characters in a foreign language... umlauts and whatevers.

      So instead of ANSI, don't we need an INSI (I standing for international) standard? Or does the rest of the world run on ASCII (again, doesn't the first word mean "American")? Hmm. If the standards are foreign, doesn't that mean you have to adjust your "standards" to conform to the ones already in place for, say, 50% of the world? Or do you need some obscure character that can't be reproduced using the "standard" 255-char set? Me being an American and holding the equivalent of a gun to others' heads comes down to just this ...
      Do it our way or go fsck yourselves....
      I also apologize for my America-centric ramble, but .... To quote someone wiser than I ....
      He who holds the gold makes the rules. I think that you should all follow the standards that the community has gratefully allowed you to follow.

    35. Re:Isn't there a better way? by newhoggy · · Score: 1
      DNS is supposed to be a way for humans to communicate with computers about internet hosts. The intent is not for some human to be able to read it, but for all humans.

      If that's a goal for DNS, it's already failed.

      But if 16-bit charsets are allowed in DNS, we could get hostnames composed of 3 Chinese characters and two Arabic ones, and which a Russian or Briton will be incapable of processing without tremendous pain.

      Using Chinese characters in a DNS name meant for Russians is just plain stupid. If the site really was meant for you to view, it will have a corresponding DNS name in English. If not, you probably won't be able to read the contents of that site anyway.

    36. Re:Isn't there a better way? by Anonymous Coward · · Score: 2, Insightful

      While I agree with the spirit of your post -- mainly that there should be internationalized domain names -- I do find fault with your argument.

      What the grandparent post was saying was that what makes the current DNS scheme universally accessible is its small codespace, not just that Latin letters are used. While he did take a very anglocentric tone in his post -- which believe me, I have some issue with -- you failed to address the main issue here, which is a 16-bit codespace and its relative inaccessibility.

      I live in China and let me tell you, it took me several years to become appreciably literate in Chinese. 37 glyphs is not a lot to learn, and if DNS were based on 37 Chinese characters or whatever that would be fine, we could all learn that. But 37,000?

      Also, while I dislike the notion of non-globalized DNS, consider the facts: every keyboard on every computer in every nation can type those 37 characters. Moreover, the dominance of the western world today has ensured that there almost always exists some romanization system which the locals are vaguely familiar with. These systems may discard possibly vital information (for example, tones in Pinyin or umlauts on the Swedish town of Horby as mentioned by another poster), but they remain a) universally accessible and b) basically fairly easy to remember for all people.

      Let's be honest, computers and the computer world are (at least currently) very anglocentric. This is not right. In the future, hopefully, it will not be this way. And there are some places where using ascii is a pain in the butt for the locals.

      The argument that "if you can't type the domain name, you can't read the content" isn't a bad one, although it does require multi-lingual sites to register redundant domain names.

      I think it would be simpler, at least for those countries which use some superset of ascii as their local writing system, to have DNS simply map intelligently to ascii.

      For example, in Germany, an umlauted letter could be transparently mapped to the same letter, sans umlaut, followed by an e, and the sharp s could be mapped to two ss.

      In French, where no established system for representing accents exists, letters with accents could be simply mapped to their respective non-accented counterparts.
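
      A rough sketch of those two mappings in Python, purely as an illustration of the idea. The function names and the `GERMAN_MAP` table are made up for this example and are not part of any DNS or IDN standard.

```python
import unicodedata

# Assumed German convention from the paragraph above: umlaut -> vowel + "e", sharp s -> "ss".
GERMAN_MAP = {
    "ä": "ae", "ö": "oe", "ü": "ue",
    "Ä": "Ae", "Ö": "Oe", "Ü": "Ue",
    "ß": "ss",
}

def map_german(name: str) -> str:
    """Replace umlauts and sharp s with their conventional ASCII spellings."""
    return "".join(GERMAN_MAP.get(ch, ch) for ch in name)

def strip_accents(name: str) -> str:
    """Decompose accented letters and drop the combining marks (the French case)."""
    decomposed = unicodedata.normalize("NFD", name)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))

print(map_german("müller"))      # mueller
print(map_german("straße"))      # strasse
print(strip_accents("château"))  # chateau
```

      Note that the two rules disagree on the same letter: `strip_accents("müller")` gives `muller`, so a resolver doing this would have to know which language's convention applies.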

      Because sometimes (and this often happens to me) you're on a computer that can display the characters but cannot type them; I'm French and when I'm in the States and I write an e-mail home I just don't use the accents. It's annoying, yes. But it's comprehensible.

      But what if I needed to type an accent just to get to a news site, or something? I would need to either figure out how to type the character -- not always easy -- or I would have to find the character and copy/paste it. Annoying, to say the least.

      Much better would be the option of typing without the accent, and the option of typing with. Both would map "internally" to the same domain name. And we Europeans could get our accent fix.

      But for non-latin character sets (which IDN doesn't aim to support anyway, is eurocentrism really better than anglocentrism?) this system would of course not work, and so the locals would need to rely on officialized transliteration systems. But actually, you could certainly use a localized DNS system that did automatic transliteration (Chinese characters to pinyin, for example). That way the locals could use their character set, but internally, it would still just be ascii, ensuring that typing the URL for both the locals when abroad and for others (clients) would be possible without registering multiple domain names.

      What do you guys think?

    37. Re:Isn't there a better way? by CAIMLAS · · Score: 1

      Yes, you're correct. Except for the fact that 99% of civilization has no idea how to access characters that don't exist on the keyboard, that is. WTF is wrong with you?

      "Yes, my web site is ache tee tee pee colon slash slash dub dub dub dot, then "alternate 123235", ampersand, "alt-2245", dot com.!"

      --
      ~/ssh slashdot.org ssh: connect to host slashdot.org port 22: too many beers
    38. Re:Isn't there a better way? by Anonymous Coward · · Score: 0

      DNS is supposed to be a way for humans to communicate with computers about internet hosts.

      So those, who cannot read Roman characters are apparently not humans? Thanks!

      Besides, if site name is in Chinese, so is its contents. Which you could not read anyway!

      Or did you insinuate that the contents must be written in Roman transliteration too?
      No, I did not think so ...

    39. Re:Isn't there a better way? by darco · · Score: 1

      Minor point I figured I would make here...

      Unicode doesn't imply wide characters. UTF-8 is a superset of ASCII, meaning that all current domain names are valid UTF-8. Gnome uses this encoding heavily. So does MacOS X.

      UTF-8 "characters" are not necessarily one byte long. Obviously, all of the ascii characters are one byte. To get to the other character sets, you use a non-ascii character byte (ie: >127) followed by another byte. You can also have three byte characters, or larger in some cases. You can still enforce limitations regarding the absence of whitespace and character "upper/lowercase" issues.
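
      A quick way to see the variable byte lengths for yourself, sketched in Python. The sample characters are arbitrary picks, one for each encoded length.

```python
# One sample character per UTF-8 length: ASCII, Latin-1 range, Euro sign, musical G clef.
samples = ["a", "\u00fc", "\u20ac", "\U0001d11e"]

# Map each character to the number of bytes its UTF-8 encoding takes.
lengths = {ch: len(ch.encode("utf-8")) for ch in samples}
for ch, n in lengths.items():
    print(f"U+{ord(ch):04X} -> {n} byte(s)")

# Every byte of a non-ASCII character is above 127, so plain ASCII bytes
# never appear in the middle of a multi-byte sequence.
all_high = all(b > 127 for ch in samples[1:] for b in ch.encode("utf-8"))
print(all_high)  # True
```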

      And hell, if we are going to go ahead and switch over to UTF-8, we might as well extend the size of a hostname so that UTF-8 names can be useful in any language.

      This all means that switching to "internationalized DNS" would be much easier than you might otherwise think. Still a pain in the ass, but at least not impossible.

      Of course, fat chance of any of this happening. Your point is a good one--I may be for this now, but the first time I have to try to type in an internationalized hostname, I'll be pissed.

      --
      — darco
    40. Re:Isn't there a better way? by ispeters · · Score: 1

      This might be the first Insightful post I've seen that should have been (-1, Flamebait) or maybe (-1, Troll).

      DNS is supposed to be a way for humans to communicate with computers about internet hosts. The intent is not for some human to be able to read it, but for all humans.

      You claim that all humans should be able to read a random hostname. What about an average grandmother who happens to only read Chinese? She can't read cnn.com, and that's about as simple as possible when it comes to domain names. Your scheme has already failed. As an aside, if you can't read the URL because it's written in a traditional Chinese charset, what are the chances you can make any use of the content? What's the point in making sure that you can read this URL? I'm assuming you can't read the content of that page--it's not English.

      This has worked until now because hostnames were limited to only ~37 characters. Regardless of native language, any computer operator can quickly learn to handle the [a-z][0-9] glyphs.

      This has worked until now because computers still have high barriers to entry for non-English-speakers. Characters in the ASCII set are just as much "funny-looking squiggles" to an average Chinese, Arabic, Russian, Israeli, or Indian person as any of their native character sets would be to the average English-speaker! At the moment, maybe every computer-user in the world can recognize and copy ASCII characters, even if they don't understand them, but the current trend of bringing computers to anyone and everyone means that there will eventually be more computer-users who can't read ASCII than who can.

      Basically anyone literate in one language can copy ASCII characters from a signpost onto a notepad, and then punch those into a keyboard. Even if her culture doesn't use the ASCII set in normal daily activities (which about everyone in America, Europe, and Japan does), then the shapes are at least simple enough to copy geometrically.

      If your criterion for acceptable charsets is that they are "geometrically simple", then why leave out any of the accented characters in the French or Spanish alphabets? They're just as easy as any ASCII character. Or for that matter, go have a look at the Russian version of Google. Those characters are certainly "funny-looking squiggles" to me, but they're also certainly "geometrically simple", and if I had a pencil and paper, I could copy them down.

      Just because the internet is currently mostly English (it still is, right?) there's no rational reason to hinder its expansion into other languages--other than some kind of ethnocentric blindness to other people's needs.

      Ian

    41. Re:Isn't there a better way? by Anonymous Coward · · Score: 0

      On a trip to Hyderabad, India, most of the signs, billboards, and menus I saw were in English. There are too many people who either don't know Telugu or don't know Hindi, so English is the fallback. That's actually what made India so attractive for offshoring--many other countries have well-educated workers with low costs of living, but few know English so well.

    42. Re:Isn't there a better way? by Anonymous Coward · · Score: 0

      HCI is still pretty lousy. It's perfectly possible I can read an IDN but don't know how to type it on whatever box I find myself in front of.

    43. Re:Isn't there a better way? by Anonymous Coward · · Score: 0

      DNS should be revised to allow UTF-8 and longer names. But you need to maintain some way for an RFC 1035-only client to resolve those names, and Punycode fits the bill.

    44. Re:Isn't there a better way? by tetrode · · Score: 1
      Tested with Firebird 0.7/Win32 - that works. On the site there are the two links

      * www.raksmorgas.se

      * www.xn--rksmrgs-5wao1o.se
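
      If you want to see what the second link decodes to, Python's built-in `idna` codec can do the round trip. This is only a sketch of the decoding step; it does not look anything up in DNS.

```python
ace = "www.xn--rksmrgs-5wao1o.se"

# Decode the ASCII-compatible form back to Unicode, label by label.
decoded = ace.encode("ascii").decode("idna")
print(decoded)  # www.räksmörgås.se

# Encoding the Unicode form again yields the original ASCII name.
round_trip = decoded.encode("idna").decode("ascii")
print(round_trip == ace)  # True
```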

    45. Re:Isn't there a better way? by Anonymous Coward · · Score: 0

      For example, in Germany ... the sharp s could be mapped to two ss.

      Sharp S? In Germany? No such thing mate. They revised it out of the language years ago. You must be thinking of Swiss German.

    46. Re:Isn't there a better way? by Anonymous Coward · · Score: 0

      Yes, I can just see all those Arabs suddenly abandoning their script and using our alphabet, which can't represent many of their sounds. I can just see the Russians dropping the Cyrillic alphabet, which can't represent many of their sounds. I can just see the Indians abandoning all their scripts, and adopting ours, which... are you getting the message yet?

    47. Re:Isn't there a better way? by Anonymous Coward · · Score: 0

      "corproate"?

      Look to the log in your own eye, mister.

    48. Re:Isn't there a better way? by Anonymous Coward · · Score: 0

      You essentially need a mapping of uppercase and lowercase equivalents as part of your search algorithm. With ASCII that's easy because you've only got the basic english alphabet...

      Er... the basic Latin alphabet. As in, the one that lots of non-English languages use, and some of them have different conventions about?
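
      A few Python one-liners show why case mapping stops being a simple 26-entry table once you leave ASCII. These use Python's default Unicode case rules, which are not locale-aware, so take them as an illustration of the problem rather than what any DNS software does.

```python
# German sharp s has no single-character uppercase in the default mapping.
sharp_s_upper = "ß".upper()
print(sharp_s_upper)  # SS

# Case folding makes the two spellings compare equal.
folded_equal = "straße".casefold() == "STRASSE".casefold()
print(folded_equal)  # True

# Turkish capital dotted I lowercases to two code points: "i" plus a combining dot.
dotted_lower = "\u0130".lower()
print(len(dotted_lower))  # 2

# Plain ASCII stays a simple one-to-one mapping.
ascii_equal = "Example.COM".lower() == "example.com"
print(ascii_equal)  # True
```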

    49. Re:Isn't there a better way? by Mr+Europe · · Score: 1

      The word "Humans" means of course American people, in a wider interpretation also other English-speaking people.

      DNS is supposed to be a way for humans (=American people) to communicate with computers about internet hosts. The intent is not for some human to be able to read it, but for all humans (=also other English-speaking people). This has worked until now because hostnames were limited to only ~37 characters. Regardless of native language (=English), any computer operator can quickly learn to handle the [a-z][0-9] glyphs. Basically anyone literate in one language (=English) can copy ASCII characters from a signpost onto a notepad, and then punch those into a keyboard. Even if her culture doesn't use the ASCII set in normal daily activities (which about everyone in America, Europe, and Japan does (=all the World!)), then the shapes are at least simple enough to copy geometrically. Even for their smaller brains..

      You don't get it, do you? Chinese people really don't use the English alphabet! Nor do Thai, Greek, Korean, Arabic, Vietnamese, or Hebrew speakers. You do remember that in China there are more internet users than in the USA. Even in "Western Europe" only part of the alphabet is the same. We constantly have to point out that the letter "A with two dots" is in this case replaced with an "A" and not with "AE" as Germans often do.
      How would you like to make a compromise, let's use only the common letters ?! Between original English, German, French, Swedish and Finnish the common letters would be: A,D,E,G,H,I,J,K,L,M,N,O,P,R,S,T,U,V,Y Make your favourite domain from those ! vvv.slashtot.kom

    50. Re:Isn't there a better way? by Anonymous Coward · · Score: 0

      But it's best to not have a gun and not need it either; every other possibility is dystopic.

      And while you don't have a gun, thinking you won't need it, I'll shoot your sorry peace-loving tree-hugging hippy ass, faggot.

    51. Re:Isn't there a better way? by neurojab · · Score: 1

      That was unintentional. That's different.

    52. Re:Isn't there a better way? by Minna+Kirai · · Score: 1

      You claim that all humans should be able to read a random hostname. What about an average grandmother who happens to only read Chinese? She can't read cnn.com, and that's about as simple as possible when it comes to domain names. Your scheme has already failed.

      Ahem. She can look at a printout for cnn.com, look down at her keyboard, which has letters "C" "N" "O" "M" plainly printed on them, do some pattern-matching, and punch it in.

      Or did you not know that well more than 99% of computer keyboards are based on English ASCII? Regardless of whether someone is typing in Japanese, Chinese, Hebrew, or Korean, they've all got a slight variant of an IBM 101 keyboard under their hands. Those 1% that don't are using some software inputs, and that software has an easily accessible panel to pick out the 37 critical ascii characters.

      I got a lot of upmods for that post. And I got a lot of negative responses for people who evidently either didn't read it completely (and don't understand that resisting the internationalization of DNS in no way prevents use of native-charset website names), or don't know that English is the international language of technology, and that the 37 char ASCII set is the easiest possible alphabet for all people to learn.

      ASCII37 is a subset of the alphabets of 50% of the people on earth. Of those not using it natively, the majority of their computer-using population can already recognize ASCII37. And compared with any other candidate alphabet system, ASCII37 is usually much simpler or at worst equal in abstract difficulty. The combination of it being a small, easy system, and already being the most commonly understood in the world makes ASCII37 the indisputable choice for universal human readability.

    53. Re:Isn't there a better way? by Minna+Kirai · · Score: 1

      On a trip to Hyderabad, India, most of the signs, billboards, and menus I saw were in English.

      India is not very representative of non-English nations. It was part of England for decades, and had a peaceful separation. Even now, English is one of the about 15 official national languages, and is one of the 2 languages which important government business can be conducted in.

      When two Indians from different regions meet, chances are their local Indian dialects will not be mutually comprehensible. They will attempt falling back to either Hindi or English.

      And in an area like Hyderabad, which attracts so many foreigners, sign-makers will be even more inclined to English over Hindi.

    54. Re:Isn't there a better way? by Cow+Jones · · Score: 1
      Sharp S? In Germany? No such thing mate. They revised it out of the language years ago.

      You're wrong, the "sharp s" is still very much alive in Germany (as well as in Austria and Switzerland, the other IDN nations).
      If you don't believe me, this 10 volume dictionary even has a "sharp s" in the title.

      --

      Ah, arrogance and stupidity, all in the same package. How efficient of you. -- Londo Mollari
    55. Re:Isn't there a better way? by imroy · · Score: 1
      djbdns ... doesn't rely on standard c-libraries

      Now why doesn't that surprise me in the least?

    56. Re:Isn't there a better way? by Uerige · · Score: 1

      I am not against globalization. That's why I want everyone to speak english, especially on the web.
      I don't really see what you're trying to tell us.

    57. Re:Isn't there a better way? by Rick+the+Red · · Score: 1

      And when the rest of us catch you, we'll collectively kill you by lethal injection.

      --
      If all this should have a reason, we would be the last to know.
  6. really dumb sounding by happyfrogcow · · Score: 4, Interesting

    I'm sorry, is it just me or do they seem to be taking a bad shortcut to get to a good end? It doesn't seem like they are doing this correctly. Why not plan to migrate to unicode? Their choice seems shortsighted and flawed. I hope they at least considered unicode and came up with real reasons why not to use it.

    1. Re:really dumb sounding by Anonymous Coward · · Score: 0

      Considered? Yes. Reasons? Yes. Real reasons? I dunno.

    2. Re:really dumb sounding by Anonymous Coward · · Score: 0

      One advantage of this scheme is that DNS servers do not need to be updated, only clients that want to use new characters -- and even old clients can use the punycode form (xn--...). If links in html code use the xn-- form, they will work with old clients, too. Thus this allows for transition without waiting for everybody to upgrade at once.

      The obvious (?) alternative would have been using some mapping of Unicode into ASCII. That would allow the full Unicode character set and make for an easier transition to Unicode-aware DNS servers. The only reason against it I can think of is that the coding would necessarily be longer and thus harder to remember - although I'm not at all sure xn--bcher-kva is any easier than, say, xn--b_cher-bcc3 would be (using 16-bit hex coding with _ as placeholder).
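
      To compare the two approaches by length, here is a small Python sketch. `hex_label` is one guess at what a "16-bit hex coding with _ as placeholder" could look like (it appends the code points as four hex digits each); it is a hypothetical scheme made up for this comparison, not anything that was standardized. The `punycode` codec is Python's built-in one.

```python
def hex_label(label: str) -> str:
    """Hypothetical scheme: '_' marks each non-ASCII character, hex code points follow."""
    basic = "".join(ch if ord(ch) < 128 else "_" for ch in label)
    codes = "".join(f"{ord(ch):04x}" for ch in label if ord(ch) >= 128)
    return f"{basic}-{codes}" if codes else label

def puny_label(label: str) -> str:
    """Punycode form of a single label, without the 'xn--' prefix."""
    return label.encode("punycode").decode("ascii")

for word in ["bücher", "räksmörgås"]:
    p, h = puny_label(word), hex_label(word)
    print(f"{word}: punycode={p} ({len(p)}), hex={h} ({len(h)})")
```

      For these two words the hex form comes out longer, and the gap widens with every extra non-ASCII character, which matches the "necessarily longer" guess above.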

    3. Re:really dumb sounding by bill_mcgonigle · · Score: 1

      Why not plan to migrate to unicode?

      Because if you do it this way, you can sell domain names today.

      So, yeah, it's a kludge. Follow the money.

      --
      My God, it's Full of Source!
      OUTSIDE_IP=$(dig +short my.ip @outsideip.net)
    4. Re:really dumb sounding by Anonymous Coward · · Score: 0

      > One advantage of this scheme is that DNS servers do not need to be updated

      That is not an advantage. DNS servers need to be regularly updated anyway. How else would you solve security problems, support IPv6, etc.?

    5. Re:really dumb sounding by Pflipp · · Score: 1

      The obvious (?) alternative would have been using some mapping of Unicode into ASCII.

      Isn't that what they call UTF-8/ UTF-7?

      I mean, true, it's not ASCII, but it's a mapping from Unicode to 8-bits systems.

      Now if you don't support Punycode, you also can't practically reach these new domains -- because their names appear totally clueless to you. So again, why not just upgrade DNS?

      --
      "We can confirm that Debian does *not* ship the version with the trojan horse. Our version predates it." [CA-2002-28]
    6. Re:really dumb sounding by x+mani+x · · Score: 2, Informative

      They did obviously consider unicode, perhaps you did not RTFA. However their solution uses unicode at a different layer.

      I think the *real* solution here is to reimplement ALL top level DNS servers to support unicode. But the overhead in doing this, when you really think about it, seems difficult (ICANN approval, unicode related bugs, getting everyone to use a new DNS server, etc). At least, since the ASCII text supported by DNS is exactly the same in Unicode, backwards compatibility should not be a problem.

      This solution is a workaround that uses unicode at the client level, encodes it to "Punycode" (which only contains characters supported by DNS, unlike, say, BASE-64 or Quoted-Printable), and sends the request to the DNS server. It is a quick and easy solution to a messy problem. But its hacky-ness makes me doubt it will be supported by whatever governing body influences this stuff (IETF, ICANN, etc).

      -Mani

    7. Re:really dumb sounding by hchaos · · Score: 1
      I'm sorry, is it just me or do they seem to be taking a bad shortcut to get to a good end? It doesn't seem like they are doing this correctly. Why not plan to migrate to unicode? Their choice seems shortsighted and flawed. I hope they at least considered unicode and came up with real reasons why not to use it.
      It would have been a bad shortcut 15 years ago. Today it is a necessity. There's more than just DNS servers at stake. Web browsers would have to be rewritten to accept Unicode URLs, and everybody would have to get the new version. Tens of thousands of databases that store URLs would have to be rewritten to support Unicode. Standard Internet utilities, such as telnet, ftp, and ping, would have to be reviewed to ensure compatibility, and every networked application that transports data across the Internet would be affected. All of these systems would need to be tested to ensure that there's sufficient buffer space for the new characters, and that there are no buffer overflows or other newly introduced bugs. Individually, this may be simple, but there are a lot of these programs out there, and converting all of them would be a huge effort.
  7. Why not by Pingular · · Score: 4, Funny

    But I guess it would be difficult to talk about this on /., since it does not even support basic Latin-1
    Just say the ascii number?

    --

    When anger rises, think of the consequences.
    Confucius (551 BC - 479 BC)
    1. Re:Why not by Anonymous Coward · · Score: 0

      119104097116063

    2. Re:Why not by lokedhs · · Score: 1, Informative
      Right... What's the ASCII code for the Euro sign? Or even accented "a"? How about the russian Gze?

      Hint: ascii is 7-bit.
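
      For anyone who wants to check the hint, a short Python sketch: it tries a few encodings on the Euro sign and an accented "a" and reports which ones have a code for them. `try_encode` is just a helper written for this example.

```python
def try_encode(ch: str, codec: str):
    """Return the encoded bytes, or None if the codec has no code for the character."""
    try:
        return ch.encode(codec)
    except UnicodeEncodeError:
        return None

for ch in ["\u20ac", "\u00e1"]:  # Euro sign, a with acute accent
    for codec in ["ascii", "latin-1", "iso8859-15", "cp1252", "utf-8"]:
        print(f"U+{ord(ch):04X} in {codec}: {try_encode(ch, codec)}")
```

      Neither character has an ASCII code at all, and the Euro sign is also missing from Latin-1; it only shows up in later 8-bit sets such as ISO 8859-15 and Windows-1252, and at different byte values in each.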

    3. Re:Why not by Golias · · Score: 2, Funny
      Euro: 0x450x750x24

      Accented a: 0x61

      "gze" sound: 0x670x7a0x65

      That was easy!

      --

      Information wants to be anthropomorphized.

    4. Re:Why not by gnu-generation-one · · Score: 2, Funny

      "Right... What's the ASCII code for the Euro sign? Or even accented "a"? How about the russian Gze?"

      Simple, just cut-and-paste them from Word, like those fantastically useful ?intelligent quotes? you keep seeing on people?s websites

    5. Re:Why not by I8TheWorm · · Score: 1

      Most people mistakenly use ASCII and ANSI interchangeably. Although ANSI doesn't make up for all non-latin characters, it does encompass quite a few.

      --
      Saying Android is a family of phones is akin to saying Linux is a family of PCs.
    6. Re:Why not by stesch · · Score: 1
      The only "ANSI" I know of is still 7-bit ASCII: ANSI X3.4

      I think you mean the codesets Microsoft calls "ANSI", although this isn't the right name.

    7. Re:Why not by I8TheWorm · · Score: 1

      You're absolutely right.. I should have clarified that further. The MS ANSI set is 8-bit while the real ANSI set is only 7. Sorry if I caused any confusion.

      --
      Saying Android is a family of phones is akin to saying Linux is a family of PCs.
    8. Re:Why not by Fastolfe · · Score: 1
      This is actually a perfectly legitimate suggestion. The problems people come across are caused by:
      1. sites that fail to declare a character set properly, where the browser fails to auto-detect it (e.g. slashdot, which has no declared character set and where users, as a result, end up pasting things in a multitude of character sets, causing the browser to auto-detect one of them and misrender the rest)
      2. sites that declare a character set that's too constrictive for the characters someone is trying to paste
      A variation of #1 involves sites that may appear on the face to be all UTF-8 and internationalized, but utilize databases that store text in, say, 7-bit ASCII, or otherwise fail to preserve their Unicode data on its way to/from any back-ends.
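
      Both failure modes are easy to reproduce. A minimal Python sketch, with an arbitrary sample string: the first half shows what a browser displays when UTF-8 bytes are read as Latin-1, the second what survives a trip through a 7-bit ASCII back-end that substitutes unknown characters.

```python
original = "naïve café"

# Problem 1: UTF-8 bytes on the wire, but the reader guesses Latin-1.
wire = original.encode("utf-8")
misread = wire.decode("latin-1")
print(misread)  # naÃ¯ve cafÃ©

# Variation: a back-end that can only hold 7-bit ASCII and replaces the rest.
stored = original.encode("ascii", errors="replace").decode("ascii")
print(stored)  # na?ve caf?

# Reading the same bytes with the declared (correct) charset is lossless.
print(wire.decode("utf-8") == original)  # True
```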
    9. Re:Why not by Random832 · · Score: 1

      there's also ANSI X3.110 which appears to be very different from iso 8859-anything or windows 1252

      --
      We've secretly replaced Slashdot with new Folgers Crystals - let's see if it notices.
    10. Re:Why not by lokedhs · · Score: 1

      I honestly hope that no one is using ANSI character encodings anymore. I believed that the last remnants went out with DOS?

    11. Re:Why not by I8TheWorm · · Score: 1

      You can guarantee it's still in use somewhere, along with COBOL, 8088's, IBM System 36's, any development software written by CA, token rings, and dumb terminals.

      --
      Saying Android is a family of phones is akin to saying Linux is a family of PCs.
  8. Useful? Naw. by grub · · Score: 4, Interesting


    I'm not sure what all the accents are on the alphabet, will I have to know to type them to access a simple website? Sorry, this doesn't make using the net easier.

    --
    Trolling is a art,
    1. Re:Useful? Naw. by Anonymous Coward · · Score: 1, Interesting

      I'm not sure what all the accents are on the alphabet, will I have to know to type them to access a simple website? Sorry, this doesn't make using the net easier.

      I'm sure it will make it easier if that is your native language!

      That said, this looks like a stupid kludgy implementation of accented characters. Use unicode!!

    2. Re:Useful? Naw. by tuffy · · Score: 4, Insightful

      If you don't know how to type the characters necessary to access the web site, chances are you won't be able to read the content anyway. So I think it's a moot point.

      --

      Ita erat quando hic adveni.

    3. Re:Useful? Naw. by ColdCuts · · Score: 1

      Option1:
      cntl-c cntl-v

      Option2:
      click on a link.

    4. Re:Useful? Naw. by grub · · Score: 1

      google/babelfish do a half-decent job at making the text somewhat understandable (at least enough to get the gist of the page across) but if I'm unable to access these pages in the first place...

      --
      Trolling is a art,
    5. Re:Useful? Naw. by Anonymous Coward · · Score: 1, Interesting

      that's sort of ridiculous. What if content behind the domain is in readable text?

    6. Re:Useful? Naw. by ShecoDu · · Score: 2, Insightful

      As others have pointed out, if you don't use the accents, why would you want to visit a foreign-language page? If you happen to like the language, you can find a way to type the characters... besides, there is always a way to use Google to locate the page and click on the link or something like that. You don't have to be so closed-minded: just because you don't find it useful doesn't mean everybody will see it the same way you do...

      Just as the moderator guideline says "focus on promoting instead of modding down", the same applies here: focus on the things you like and ignore those that don't mean a thing.

      If I happened to see a slashdot post about war and I couldn't give a f*ck less, I would certainly just avoid looking inside. It's obvious I'm going to get irritated with the comments inside, but that doesn't mean they are wrong...

      By the way, I'm not trolling or being aggressive... I just express my point of view. =)

    7. Re:Useful? Naw. by tuffy · · Score: 0
      that's sort of ridiculous. What if content behind the domain is in readable text?

      Then you'll have to figure out how to type internationalized characters in your OS of choice. But I don't see why everyone else should have to suffer with ASCII-only domain names just because you or I might not know how to type them on our keyboards.

      --

      Ita erat quando hic adveni.

    8. Re:Useful? Naw. by PitaBred · · Score: 1

      Problem being that everyone can type ASCII. It's a segregationist move. Unless we want to start going to 200 key keyboards or something equally stupid.

    9. Re:Useful? Naw. by Anonymous Coward · · Score: 0

      I completely agree. As far as I'm concerned, those addresses don't exist. In fact, I'd be pretty pissed if my nameserver got fubar'd on account of the Swiss adding 7M f'd up hostnames... those damn canadians too...

    10. Re:Useful? Naw. by Anonymous Coward · · Score: 0

      Option1: cntl-c cntl-v

      Yep...you have no idea how many times I've typed:

      perl -e "print chr(241)"

    11. Re:Useful? Naw. by Anonymous Coward · · Score: 0

      Ever heard of dead keys?

      You can type any western-european character using a correctly set up US keyboard, that is out of the box on almost all Linux distributions (except RedHat which always had atrocious i18n support.)

      Of course non-Latin characters will still be a problem, but websites with only non-Latin server names have a good chance of having non-Latin-only content.

    12. Re:Useful? Naw. by Fastolfe · · Score: 1

      Bear in mind that hostnames and URLs were never truly meant to be consumer-friendly. The goal of things like HTML was to hide those as implementation details. You didn't need to know what HTTP was so long as you could click on a link. So long as applications try to keep that implementation detail away, things should still be OK in that regard, but everyone knows that's a dream that will never be fully realized. Eventually URLs with scripts not in your native language will make their way to other formats (print and broadcast media, for example), requiring the viewer to manually enter them. So long as we're relying on URLs and DNS to be our primary means of naming content for the general public, this will always be a problem. It really sounds like we need a better, logical directory to sit atop DNS that maps real-world names (in whatever languages and scripts) to the one, single, exclusive DNS domain for that entity. This, along with better use of search engines (e.g. a formal engine for searching trademarks), could allow us to search for things in our native scripts and follow links to URLs and DNS domains using a different script entirely.

    13. Re:Useful? Naw. by EinarH · · Score: 1

      Ahh, the classic "this is not usefull for me so it can not be usefull for anyone else either" attitude.

      --

      Melius mori in libertate quam vivere in servitute.

    14. Re:Useful? Naw. by Anonymous Coward · · Score: 0

      Ahh, the classic "I can't spell 'usefull'(sic) but will try to look smart" attitude.

    15. Re:Useful? Naw. by jwr · · Score: 1

      I guess not all websites, then, are built to be accessible to Americans...

      I suggest suing for discrimination.

    16. Re:Useful? Naw. by Just+Some+Guy · · Score: 2, Insightful
      Bzzzt - wrong. You may not've travelled to countries with different "standard" keyboard layouts, but that's not going to help a Japanese businessman on a trip to Los Angeles figure out how to type the name of his company's website on a PC-104 setup. Put him on a Kanji keyboard and he'll be there in seconds. Give him a nice en.US layout and see what happens.

      What was your point again?

      --
      Dewey, what part of this looks like authorities should be involved?
    17. Re:Useful? Naw. by grub · · Score: 1


      I guess not all websites, then, are built to be accessible to Americans

      You assume I'm an American, which I'm not.

      --
      Trolling is a art,
    18. Re:Useful? Naw. by Anonymous Coward · · Score: 0

      > Problem being that everyone can type ASCII. It's a segregationist move.

      The 'segregationist move' was putting the 'A' in ASCII.

    19. Re:Useful? Naw. by poot_rootbeer · · Score: 1


      Zuh? I can read Spanish and German text and comprehend the contents quite well, but I wouldn't count on myself being able to construct a grammatically correct sentence on my own without reference materials. Being able to read a language and being able to write or speak a language are fairly distinct skill sets.

      It's kind of like saying that if you don't know how to play guitar, you don't have any reason to listen to a Jimi Hendrix album.

    20. Re:Useful? Naw. by Bio · · Score: 1

      I'm Swiss and unlike 99.9% of the people living here I don't use a keyboard with the Swiss(-German/-French/-Italian) layout. I prefer the US layout because most special characters used for programming are accessible in an easier way, and for entering accents etc. there are other ways: in LaTeX use \"a, in HTML use &auml;, or else use Ctrl-C, Ctrl-V.

      I can imagine that there are situations where typing special characters in the URL can be difficult, say when you're travelling or you have the "wrong" keyboard.

      The article talks about the domain name (that is the 2nd level domain name). Will the same encoding be used for other parts of the URL, e.g. the 3rd level name?
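
      On the question of other levels: IDNA works on each dot-separated label of a hostname separately, so a 3rd-level label is encoded the same way as the 2nd-level one. A minimal sketch using Python's built-in "idna" codec (which implements the older IDNA 2003 rules); the hostname is made up for illustration:

```python
# Sketch only: Python's built-in "idna" codec (IDNA 2003) converts a
# hostname label by label. The hostname below is hypothetical.

def to_ascii_labels(hostname: str) -> list[str]:
    """Return the ASCII (Punycode) form of each dot-separated label."""
    ascii_host = hostname.encode("idna").decode("ascii")
    return ascii_host.split(".")

labels = to_ascii_labels("büro.bücher.example")
for label in labels:
    # Non-ASCII labels come back with the "xn--" prefix; ASCII labels are unchanged.
    print(label)
```

      The path and query parts of a URL are not covered by this; they use percent-encoding instead.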

    21. Re:Useful? Naw. by McDutchie · · Score: 3, Insightful
      I'm not sure what all the accents are on the alphabet, will I have to know to type them to access a simple website?

      Never fear, oh monolingual one, I found this very handy site that will help solve this pesky problem for you. Try it some time and let us know what you think!

    22. Re:Useful? Naw. by Vann_v2 · · Score: 1

      Almost every Japanese person I know either uses a standard English keyboard and converts from Romaji -> Kana -> Kanji, or uses a kana keyboard and skips the romaji step.

      It would be sort of crazy to have a keyboard with kanji on it. What would you do, have a key for each basic radical

    23. Re:Useful? Naw. by Minna+Kirai · · Score: 1
      As others have pointed out, if you dont use the accents, why would you want to visit a foreign language page?
      • To look at the pictures
      • To download the audio
      • To read text which you're personally able to understand, even if you don't know how to generate those symbols with the computer you're sitting at
      • To click the little GIF reading "English" that many foreign websites already use
      • To file a complaint that one of the system's users is DOSing you
      • To feed the text into your Machine Translation software


      focus on the things you like and ignore those that dont mean a thing.

      I (and everyone) can currently look at any DNS hostname, scribble it onto a pad, read that over the phone to a friend, and have him connect to the computer. Internationalized DNS will take away that ability.
    24. Re:Useful? Naw. by lfm_the_couch · · Score: 1

      >>I'm not sure what all the accents are on the alphabet, will I have to know to type them to access a simple website? Sorry, this doesn't make using the net easier.

      >I'm sure it will make it easier if that is your native language!

      Spoken like a true monolingual. Get this: just like English-native people can't spell "to loose something" or "definatly", people native to other languages can't spell either. My personal experience with Spanish was that I didn't really always know where to accent a word until I learned the explicit rules in high school, and I still get mixed up sometimes. And it gets much worse in languages with larger, more confusing character sets, like Armenian, Arabic, Japanese or Chinese (all of which I've studied, thank you, so yes I do know a little about it).

      At least ASCII-only DNS keeps the problem-set down a bit.

    25. Re:Useful? Naw. by mijok · · Score: 2, Interesting

      Well, for non-English speakers it will make quite a big difference. Let me give you two funny and/or embarrassing examples: two municipality names in Sweden, Mönsterås and Hörby. As you (hopefully) can see, the first one has two dots over the "o" (called "umlaut" in German, i.e. a form of the letter "o"; in Swedish it is considered a different letter in the alphabet) and a ring above the a, and the latter name has two dots over the "o". Well, these municipalities have websites, and since they can't get the dots and the rings, the names are as follows: www.monsteras.se and www.horby.se. Now comes the funny and embarrassing part, since the stripped names have become words which mean something. Translations: www.monstercarcass.se and www.hookervillage.se. Now, try to tell the not-so-internet-literate people what to type in their web browser and get some reactions :)

      --
      Karma. Moderation. Is my .sig good now?
    26. Re:Useful? Naw. by Anonymous Coward · · Score: 0

      [I]If you don't know how to type the characters necessary to access the web site, chances are you won't be able to read the content anyway. So I think it's a moot point.[/I]

      I don't read the words, I just look at the pictures! ;)

    27. Re:Useful? Naw. by Anonymous Coward · · Score: 0

      I have been living in the States for the past 4 years. Spanish is my first language and English my second. However, over time I tend to "forget" Spanish. What I do to memorize the accent rules is to remember how to accent the word "cabro'n".

      Right there you see all the rules:
      word stressed on the last syllable and word ending in N, S or vowel = accent, otherwise no accent

      word stressed on the second to last syllable and word ending in N, S or vowel = NO accent, otherwise accent

      any other stressing beyond the second to last = always accent.

    28. Re:Useful? Naw. by Bakaneko · · Score: 1

      I could see several American concerns using accented e's and a's though, and that'll get annoying quick. And that would be in English too, but companies/organizations that think accent marks are "hip."

      I have several concrete examples in mind...

    29. Re:Useful? Naw. by Anonymous Coward · · Score: 0
      You make many assumptions.

      I can speak and read English, French and Romanian fluently (all three are my "native" language). I can converse and write in Spanish, Italian and Russian. But I would be hard-pressed to type a Russian domain name into the computer I'm currently using (a Windows box, not my own).

      My preferred method for inputting text with diacriticals is emacs with a postfix input method. I struggle when placed in front of a non-American keyboard. I struggle when forced to use a Windows or Macintosh input method/keyboard layout, and most of the computers I use don't have alternative input methods installed. I don't have the authority to install an alternative input method on lots of computers I use - how are you going to input cyrillic in the Internet cafe down the street?

      I'm not unique. Most Europeans can converse in multiple languages although they don't have multiple keyboards on one machine. Consider that Chinese requires special software to be input efficiently: what happens when your Chinese friend tries to use your Linux box?

      You assume that anyone who can read a language with diacriticals always has access to their preferred input method. This assumption is very American.

    30. Re:Useful? Naw. by Lacton · · Score: 1

      How can this post be considered +5, Insightful? What it says isn't true. You may be fluent in a language but not have a properly configured computer to type something in this language.

      For instance, I can read English, French and Japanese, but my computer at work has been locked down and won't let me write in Japanese.

    31. Re:Useful? Naw. by julesh · · Score: 1

      If you don't know how to type the characters necessary to access the web site, chances are you won't be able to read the content anyway. So I think it's a moot point.

      Huh? Are you joking? I can read French fairly easily, and Spanish not quite so easily. It would probably take me a lot more effort to figure out how to type \'e or \~n or any of the other non-ASCII characters that are used in either of those languages. I would probably have to resort to finding them in Windows' "character map" (which isn't installed by default on all systems, I've noticed in the past, so I may have to install it for the purpose) and copying and pasteing them into the address bar. This isn't ideal...

  9. Oh great... by JoeLinux · · Score: 2, Funny

    Now the European Union will want everyone to click on the left side of the mouse, left-handers be damned.

    The French will demand that "bandwidth exceeded" errors be renamed to "(web page) surrenders"

    The Germans will try to take over the internet.

    In a sneak attack, the Iraqis will launch a massive DDOS attack, but accidentally hard-code localhost in the trojan. The Iraqi information guru will deny everything.

    1. Re:Oh great... by Anonymous Coward · · Score: 0

      And the US will secretly be behind it all.

    2. Re:Oh great... by Anonymous Coward · · Score: 0


      The Germans will try to take over the internet.

      Ah yes, I can see them sewing little stars on the Microsoft CDs and forcing the filthy Cisco equipment into those tiny showers for debugging.

    3. Re:Oh great... by elvum · · Score: 0, Troll

      Presumably if I pointed out that many Europeans would find your post offensively racist you'd tell me to get a sense of humour?

    4. Re:Oh great... by Anonymous Coward · · Score: 0

      And where did the OP mention anything about race.

      Clue: German, French, etc are not races (unless you believe the 'superior race' nonsense of the Nazi's)

    5. Re:Oh great... by Anonymous Coward · · Score: 0
      Nope.

      M$ is a good Aryan company run by an egomaniac dictator. It supports the anti-communist Western ideology. Now Open Source, as we all know, is a untermensch conspiracy to undermine the new world order. Any programmer from any filthy little non-white country is free to contribute and take credit. It's one thing to employ the subhumans in our code work-camps and package the results as good aryan work. It's another to allow the wogs to publicly take credit for their own work.

      Not only that, having the source open is tanatamount to allow any old person into the Reichstag with a can of gasoline.... Who knows what the security implications will be?

      We will celebrate another kristallnacht as Linux cds are smashed into a million glittering pieces! Those that survive will be branded with the yellow star! Macintosh will carry the pink triangle as it is the OS of Homosexuals!

      One World!, One Web!, One Program!

    6. Re:Oh great... by el-spectre · · Score: 1

      It's more, ah... nationist (yuck) actually...

      I kinda wonder when Germany will stop getting shit for the wars. We'll probably have to wait for the grandchildren of the combatants to die.

      On the other hand, most american folks are OK with Japan these days... odd (nothing against the japanese... it's just that they did lotsa nasty stuff too. I guess it's because they aren't a military threat).

      Americans are just tweaked at france cuz it didn't fall in line like Britain did. It'll pass.

      --
      "Faith: Belief without evidence in what is told by one who speaks without knowledge, of things without parallel." - A.B.
    7. Re:Oh great... by Anonymous Coward · · Score: 0

      Get a sense of humor.

    8. Re:Oh great... by Anonymous Coward · · Score: 0

      It's more, ah... nationist

      Like your post? Or does generalization not fall under that umbrella?

      Your post is nearly identical to the grandparent's, minus the humor. So why is yours okay?

    9. Re:Oh great... by beebware · · Score: 2, Funny

      Whilst us poor Brits will just do everything President Bush's lapdog (aka Tony Blair) tells us to do.

    10. Re:Oh great... by Anonymous Coward · · Score: 0
      Ahem:
      Race \'rAs\ n. 1. The descendants of a common ancestor; a family, tribe, people, OR NATION, believed or presumed to belong to the same stock; a lineage; a breed.
      (Emphasis mine)
      Racist \'rA-"sist\ n. 1. based on racial intolerance; "racist remarks"

      2. discriminatory especially on the basis of race or religion
      So, STFU on this "discrimination based on nationality isn't racist" line.
    11. Re:Oh great... by el-spectre · · Score: 1

      Well, being an american, I tend to reflect that viewpoint. The original poster slammed several countries, trying to be funny. I wasn't attempting to be funny, I was stating facts:

      1) Much of the world (and many americans) still look on germany with suspicion, despite all the efforts by germany to move away from its past. Hell, germany is one of the U.S.'s strong allies.
      2) Most of the world (excluding SE asia, which was most affected) doesn't harbor such feelings towards japan. Japan did really nasty things in WW2 (go ask an old chinese or korean fellow). Japan is STILL not allowed to have an offensive army, hence, no military threat.
      3) France didn't respond the way many americans wanted it to. So some got mad. It happens, and it's transitory. No big deal.

      So far as "OK", that is meaningless. I really don't give a fuck if someone 'approves' of my thoughts, I was just bringing up something I found interesting. Deal.

      --
      "Faith: Belief without evidence in what is told by one who speaks without knowledge, of things without parallel." - A.B.
    12. Re:Oh great... by Anonymous Coward · · Score: 0

      I'm a left-hander and I click on the left side of the mouse, using my right hand. I also play guitar right-handed. I think it would be stupid for me to get used to doing it otherwise, since then I couldn't use any public computers or play my friend's guitar etc.

    13. Re:Oh great... by Anonymous Coward · · Score: 0

      There are two worlds now? Or for that matter, two World Wide Webs? Interesting.

    14. Re:Oh great... by Anonymous Coward · · Score: 0

      Programmieren macht frei!

    15. Re:Oh great... by Anonymous Coward · · Score: 0

      As a German, let me assure you that those war jokes are 100% okay. I mean, it's better than being called a cheese eating surrender monkey for sure. ;-)

      (Besides, I always side with the bad guys in the movies anyway >:-))

    16. Re:Oh great... by Anonymous Coward · · Score: 0

      Deal.

      Try taking your own advice some time.

    17. Re:Oh great... by Anonymous Coward · · Score: 0
      In a sneak attack, the Iraqis will launch a massive DDOS attack, but accidently hard-code localhost in the trojan.

      You're one of the 80% of Americans who believe the World Trade Centre attacks were by Iraqis, aren't you?
    18. Re:Oh great... by brocheck · · Score: 1

      And thats why we love you!

      --

      suddenly I feel very tired

  10. Re:English runs the net by Anonymous Coward · · Score: 0

    Have you ever been to a non-english speaking country?

    always think more, talk less

  11. Re:English runs the net by Anonymous Coward · · Score: 0

    Yes, but this has nothing to do with physical space. We're talking about Teh IntarWeb, remember?

  12. Taco, why did you remove the accents from slashdot by Anonymous Coward · · Score: 5, Funny

    ,
    Taco est un mechant garcon.
    '

  13. Maybe not as useful as one might believe by Ryu2 · · Score: 4, Interesting

    While it's logical for, say, Chinese companies to have a Chinese domain name and Chinese e-mail addresses, it may not be the best choice if the company wishes to expand overseas.

    Unfortunate but true, if a company has a Chinese domain name, it would probably be only used within China, Taiwan, Hong Kong, Singapore, Japan (since it's unicode), and maybe South Korea. The company would be pretty much limited to the East Asia market.

    However, I suppose the company could get both a Chinese domain and an English, or rather Pinyin, domain so they could make their Chinese, or maybe other Asian clients feel "closer" while also being able to reach clients outside of East Asia.

    I also think that it'd be great to give people the option of having a native-language email address. It's not too hard to set up a romanized email alias for it. An "X-Roman-Address" mail header could even be added to outgoing messages in case a recipient can't read the default "From" line.

    --
    There's 10 types of people in this world, those who understand binary and those who don't.
    1. Re:Maybe not as useful as one might believe by Councilor+Hart · · Score: 1
      The company would be pretty much limited to the East Asia market.

      With over a billion (potential) customers, I'll gladly be limited to that part of the world.

    2. Re:Maybe not as useful as one might believe by Scrameustache · · Score: 2, Interesting

      Unfortunate but true, if a company has a Chinese domain name, it would probably be only used within China, Taiwan, Hong Kong, Singapore, Japan (since it's unicode), and maybe South Korea. The company would be pretty much limited to the East Asia market.

      Yeah, they would "limit" themselves to the fastest growing economy in the world and a market of about 2 billion people...who'd want that?

      P.S. Why can't that company have a chineese domain name and a roman-character domain name? Is there a law I don't know about?

      --

      You can't take the sky from me...

    3. Re:Maybe not as useful as one might believe by HeghmoH · · Score: 1

      Yeah, it's not like Chinese companies don't already have this exact same problem when dealing with postal addresses. And it's not like they have it solved by having the Chinese post office understand Romanized addresses.

      You solved the problem in your post; get two domains. Not too hard.

      --
      Mod down posts with a "Free Mac Mini/iPod" sig, they're spam!
    4. Re:Maybe not as useful as one might believe by dabadab · · Score: 1

      Well, you MUST be American, otherwise you would have heard that there are two-letter TLDs, which are country-specific and mostly used that way. So, for example, you can count on siemens.de being in German, but that does not hinder the international expansion of Siemens in any way. And these country-specific domains are the ones that most probably would use localized domain names.

      Anyway, if you used any language other than English you would know how silly it sounds to have e.g. penzugyminiszterium.hu instead of pénzügyminisztérium.hu - and I really don't expect our Ministry of Finance to expand overseas :)

      --
      Real life is overrated.
    5. Re:Maybe not as useful as one might believe by kisak · · Score: 1
      Even though Chinese domain names makes these pages less accessible to you, it makes it more accessible to a lot of people.

      The world does not center around us in the west, something it is important to remember.

      --

      --- guns don't kill people, people with guns kill people ---

    6. Re:Maybe not as useful as one might believe by Fastolfe · · Score: 1

      P.S. Why can't that company have a chineese domain name and a roman-character domain name? Is there a law I don't know about?

      No law, but best DNS practice usually suggests selecting a single exclusive DNS domain for an organization. Units within that organization get subdomains within your primary DNS domain, not their own independent domain. Usually network administrators prefer to have all of their computers on a single DNS domain anyway, it's just the web and marketing guys that want to have a "presence" with all of those other names.

      In practice, going with one exclusive DNS domain isn't possible, since DNS domains now carry so much intellectual property weight, and companies feel they have to snatch up every DNS domain that's remotely similar to any mark they have or will ever own. But that's not the way things were meant to be, and certainly not in the best interests of DNS.

      But until we come up with anything better, and a company wants to market their URLs and DNS domains "out of band" (on printed media or broadcast) on a global scale, they'll want to have more than one domain in as many native scripts as they can.

      I would personally prefer to see a directory atop DNS that would map "logical" real-world names (in as many scripts as someone wants) to an organization's single exclusive DNS domain. As soon as we stop (ab)using DNS to be a content label or a yellow pages, the easier it will be to cope with domains in other scripts.

    7. Re:Maybe not as useful as one might believe by Anonymous Coward · · Score: 0

      boy, more than one post by you that sounds a bit anti-american.

      Don't worry, we pay much less credence to the validity of your post when you come off like an asshole.

    8. Re:Maybe not as useful as one might believe by Anonymous Coward · · Score: 0

      west not equal g.w. bushes arse either... er i mean america :)

    9. Re:Maybe not as useful as one might believe by Anonymous Coward · · Score: 0
      boy, more than one post by you that sounds a bit anti-american.

      And every bit is deserved.

  14. URLs that you cannot type by HermanZA · · Score: 3, Insightful
    That is sure to improve your hit rate no end...

    I sure hope this harebrained idea doesn't take off.

    1. Re:URLs that you cannot type by Scrameustache · · Score: 3, Insightful

      That is sure to improve your hit rate no end...

      URLs that you cannot type. But why would they want your hits if you can't even type their domain name? Its not like you'll be able to read the content if you get there, or understand their ads.

      --

      You can't take the sky from me...

    2. Re:URLs that you cannot type by WegianWarrior · · Score: 2, Interesting

      Or how about URLs you have to spell differently than you spell the name of the company in question? That's a pretty harebrained idea, but one very many* people online face today. Take for instance Norwegians (as I happen to be one myself). The Norwegian alphabet consists of 29 letters: the old 26 from Latin (a-z) as well as three I can't show you here on /. since the site for some bizarre reason doesn't support them**. Therefore we're forced to use 'ae', 'oe' and 'aa'*** instead, opening the door to plenty more misunderstandings for _Norwegian_ websites catering to the _Norwegian_ public. And since I have yet to discover any online translator that can translate Norwegian into English, I dare say that the chance of any non-Norwegian needing to type the URL is slim at best.

      So frankly, you can have a big serving of STFU. If you don't see the point of this, you prolly will never use it anyway, or even notice. For those of us who actually care, this is pretty good news.

      __*) I would - wihtout seeing any proof - guess that the majority of people online today does not speak english as their native tounge.
      _**) Other US sites do...
      ***) For those interested, the ascii-codes are 230, 248 and 229 for small letters, and 198, 216 and 197 for capitals.

      --
      Everything in the world is controlled by a small, evil group to which, unfortunately, no one you know belongs.
    3. Re:URLs that you cannot type by bill_mcgonigle · · Score: 1

      But why would they want your hits if you can't even type their domain name?

      There are some websites, really popular ones in fact, where the text on the site has almost no meaning to those visiting the site. :)

      --
      My God, it's Full of Source!
      OUTSIDE_IP=$(dig +short my.ip @outsideip.net)
    4. Re:URLs that you cannot type by penultimatepost · · Score: 1

      There are plenty of corporate websites that offer their content in multiple languages (bbc.co.uk comes to mind). of course they could always register several names or opt for the ascii equivalent. In any event it is not as cut and dry as you put it.

    5. Re:URLs that you cannot type by pommiekiwifruit · · Score: 1
      babelfish.altavista.com

      Ohmigod, the world suddenly got bigger!

      And guess what, some electronic stuff is made in china or japan or taiwan, and I might find the (english) pdfs stored on their corporate websites useful if I could find the damn things (they assume no-one accessing their english sites wants to know anything technical).

    6. Re:URLs that you cannot type by MSG · · Score: 1

      the ascii-codes are 230, 248 and 229 for small letters, and 198, 216 and 197 for capitals.

      ASCII only covers 7 bit values. Those are the decimal values of an encoding other than ASCII.

      Most likely, slashdot has chosen not to support non-ASCII text because there's no context that conveys what non-ASCII encoding they're actually in. The only way that slashdot can fix this, and remain readable to everyone, would be to allow characters that compose valid UTF-8 strings to be included in posts. Your character codes "230, 248 and 229" would not be allowed as single bytes. However, the octal byte pairs "303 246", "303 270", and "303 245" would.
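
      For anyone who wants to check the byte values, here is a small sketch in Python. It assumes the three letters in question are the Norwegian æ, ø and å, and that the single-byte codes quoted above are Latin-1 (ISO-8859-1); the helper names are made up for illustration.

```python
# Sketch: compare the single-byte Latin-1 code of a character with the
# bytes of its UTF-8 encoding (shown in octal, as in the comment above).

def latin1_code(ch: str) -> int:
    """Decimal value of the character's single Latin-1 byte."""
    return ch.encode("latin-1")[0]

def utf8_octal(ch: str) -> str:
    """Space-separated octal values of the character's UTF-8 bytes."""
    return " ".join(format(byte, "o") for byte in ch.encode("utf-8"))

for ch in "æøå":
    print(ch, latin1_code(ch), utf8_octal(ch))
# æ 230 303 246
# ø 248 303 270
# å 229 303 245
```

      A lone byte such as 230 is not valid UTF-8 on its own, which is why the two-byte forms are needed.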

    7. Re:URLs that you cannot type by Sycraft-fu · · Score: 1

      "I would - wihtout seeing any proof - guess that the majority of people online today does not speak english as their native tounge"

      You would be correct. However I KNOW that, of all the languages they speak, the majority of people online do speak English, in at least some capacity. English is NOT the largest first language, not even close. It is, however, by far the largest second language. If you are a Japanese business doing business in China, it is highly likely the talks are conducted in English. Why? Well, they don't learn Chinese in Japan and they don't learn Japanese in China. They do learn English in both places and so can use it to converse.

      Thus there is some logic to an English-based domain system. Though you aren't hitting most people's primary language, you are at least hitting something most people can understand.

    8. Re:URLs that you cannot type by Anonymous Coward · · Score: 0

      I go to a lot of good Japanese porn pages even though I can't read Japanese.

    9. Re:URLs that you cannot type by The+Wing+Lover · · Score: 1
      But why would they want your hits if you can't even type their domain name? Its not like you'll be able to read the content if you get there, or understand their ads.

      I read French fluently, but I've never had the need to type French accents on the computer, so never have learned how to input them (any my keyboard sure isn't French).

      --

      - In Capitalist America, law violates YOU!

    10. Re:URLs that you cannot type by Anonymous Coward · · Score: 0
      Not the only way. Slashcode could easily accept multipart/form-data with a text/plain;charset="..." header for each field and then send either UTF-8 or entity markup to clients.

      For that matter, I think entities used to be allowed in comments, though at the moment they seem to be silently deleted.

    11. Re:URLs that you cannot type by Scrameustache · · Score: 1

      I read French fluently, but I've never had the need to type French accents on the computer, so never have learned how to input them (any my keyboard sure isn't French).

      Set your keyboard to Canadian French, which is probably close to what you are used to, and check the layout. You can usually switch to it and back with a mouse click.

      Le e accent aigu est a cote du SHIFT de droite.

      --

      You can't take the sky from me...

    12. Re:URLs that you cannot type by rduke15 · · Score: 1
      Most likely, slashdot has chosen not to support non-ASCII text because there's no context

      There is!

      The server says it's sending latin-1 (iso-8859-1):
      head slashdot.org
      Content-Type :text/html; charset=iso-8859-1
      Server :Apache/1.3.29 (Unix) mod_gzip/1.3.26.1a mod_perl/1.29
      Unfortunately, some coder was bored and decided to kill some time by writing a conversion routine that would transcode all posts by removing the accents. I hope (s)he enjoyed that rainy Sunday afternoon, spent on breaking stuff that works flawlessly on the rest of the web.

      (you may have noticed I would favor accents on /. But for DNS, I'm still not sure if I like it.)
    13. Re:URLs that you cannot type by Thomas+Miconi · · Score: 1

      I read French fluently,

      Congratulations. Ever tried Marcel Proust ? :-)

      but I've never had the need to type French accents on the computer,

      Be warned that it may make your prose somewhat difficult to read.

      so never have learned how to input them (any my keyboard sure isn't French).

      The most important accented letter in French is "E with acute accent". Windows allows you to produce it by typing Ctrl-Alt-e (at least with UK layout).

      (SUBLIMINAL MESSAGE TO LINUX CODERS)And it would be damn nice if I could do the same, not necessarily in any linux app, but at least with KDE/Gnome

      Thomas Miconi

  15. Companies will shell out more to registrars now by Arcturax · · Score: 4, Insightful

    After all, now they need not only worry about registering say...

    Microsoft.com
    Microsoft.net
    Microsoft.org
    Microsoft.tv
    etc..

    But also
    Microsoft.com
    Microsoft.com

    Well, you get the picture.

    --

    --Won't that be grand? Computers and the programs will start thinking and the people will stop. - Dr. Walter Gibbs
    1. Re:Companies will shell out more to registrars now by Arcturax · · Score: 1

      Well ok, it looks like /. doesn't support these characters either... That or my font settings need adjusting. Just imagine that the last two Microsoft.com entries had a collection of accents, double dots, the o with a slash through it and such.

      --

      --Won't that be grand? Computers and the programs will start thinking and the people will stop. - Dr. Walter Gibbs
    2. Re:Companies will shell out more to registrars now by grazzy · · Score: 1

      Nah, not true, microsoft and microsoft for instance are two completely different words. There is no way you could accidentally mistype that.

    3. Re:Companies will shell out more to registrars now by Anonymous Coward · · Score: 0

      RTFS. But I guess it would be difficult to talk about this on /., since it does not even support basic Latin-1 ... :-)

    4. Re:Companies will shell out more to registrars now by Delirium+Tremens · · Score: 1
      It's less about mistyping than it is about luring you to click on a link that, at first glance, you would think is legit, such as http://www.paypal.com.
      Not as bad as DNS hijacking, but very similar outcome for the untrained eye.

      Everytime I have to click on a link that takes me to Paypal, American Express, my bank's web site, my broker, etc ..., I make sure the URL is not spoofed. Unless there is a referer id you need to reuse in the link that is presented to you, it is always better to go to your familiar sites via bookmarks.

    5. Re:Companies will shell out more to registrars now by Fastolfe · · Score: 1

      And this is why DNS is not (and never was) appropriate for use as a directory or a Yellow Pages. It's simply there to create a naming hierarchy. I would have hoped that the creation of a bunch of new generic top-level domains would show everyone the futileness of trying to control every possible form of their name in DNS, but I should never underestimate the will of a legal department to make work for themselves.

      This just takes things to a new level entirely.

      It is not appropriate to use DNS as a directory ("search term dot com"), and it's not appropriate to use it as a form of authentication ("if it has ebay in its name, it must be eBay, right?"). Better technology needs to be pushed for these needs.

    6. Re:Companies will shell out more to registrars now by grazzy · · Score: 1

      you don't need to have someone lure you to lose your money at paypal.

      that company sucks.

    7. Re:Companies will shell out more to registrars now by Anonymous Coward · · Score: 0

      futileness = futility

  16. Race you... by D-Cypell · · Score: 1

    www.slashdot.org, she will be mine... oh yes... she will be mine.

    1. Re:Race you... by D-Cypell · · Score: 1

      You're right... it doesn't work =o\

  17. Re:English runs the net by Anonymous Coward · · Score: 0


    yep. They all spoke English too, although they didn't seem to enjoy it...
    (France, Germany, Denmark, and of course GB)

  18. IDN? Mozilla supports it by ospirata · · Score: 3, Informative

    I'm delighted to report that Mozilla is one step ahead again, and has supported IDN since version 0.9.5: http://www.mozilla.org/projects/intl/idn_mozilla.html

    1. Re:IDN? Mozilla supports it by Psychic+Burrito · · Score: 1
      From the website:
      IDN as HREF is not supported but patch is available
      <sarcasm mode=on>Woww, coool! So people can enter the domains, but when Google indexes the page and you'd like to visit the page through a search result, you're f*cked... And Mozilla is the only browser this advanced, right? And this starts next spring, everybody will buy these new domains, and then everybody will be disappointed that it won't work in 95% of all circumstances, right? Wow, cool.<sarcasm mode=off>
    2. Re:IDN? Mozilla supports it by hey · · Score: 1

      If you read your link you'll see that it's supported in the title bar but not in HREF. HREF is a pretty darn important place to support it, i.e. no linking!

    3. Re:IDN? Mozilla supports it by Anonymous Coward · · Score: 0

      URLs are restricted to US-ASCII anyway, and I'd rather work with Punycode than %-encoded UTF-8.
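      To make the comparison concrete, here is a minimal sketch of the two ASCII-only spellings of the same label. It assumes Python's standard library (`urllib.parse.quote` and the built-in `idna` codec); the label "bücher" is just the example that appears elsewhere in this thread, not anything taken from a spec.

```python
from urllib.parse import quote

label = "b\u00fccher"  # "bücher", used here purely as an illustration

# Percent-encoded UTF-8: each non-ASCII byte becomes a %XX escape.
percent_form = quote(label, safe="")

# IDNA form: Punycode plus the "xn--" prefix, via Python's built-in codec.
idna_form = label.encode("idna").decode("ascii")

print(percent_form)  # b%C3%BCcher
print(idna_form)     # xn--bcher-kva
```

      Both forms are pure ASCII; which one is nicer to read is a matter of taste.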

  19. How Long Till the Scams Start? by GeekLife.com · · Score: 0, Offtopic

    I give it a day before we're deluged with emails asking to send credit card numbers to a paypal.com site or domain registration renewal notices linking to networksolutions.com.

    1. Re:How Long Till the Scams Start? by Fastolfe · · Score: 1

      We already see scams with domains like "ebayfake.com" (replace 'fake' with some other plausible term). People need to stop doing their authentication with a cursory visual check of the domain name and start using technologies designed for the purpose (e.g. TLS).

  20. Mixed feelings by f97tosc · · Score: 5, Informative

    I have mixed feelings about this. I am from Sweden, and it always looks kind of ugly when names lose their dots and circles in the domain name.

    On the other hand, this is also quite convenient. I live in the US now, and I travel around quite a bit. I often surf on Swedish Internet sites, typically without access to a Swedish keyboard. It would not be very convenient if the domain names used non-English symbols.

    Sometimes I go to Japanese sites also, and I am really glad that I don't have to install a Japanese word processor to do this...

    Tor

    1. Re:Mixed feelings by HerbieStone · · Score: 2, Insightful
      That's why website owners will register their sites under two domains: the current one for English-keyboard users, and the (original) foreign-named domain.

      And that's also why registrars love it.

    2. Re:Mixed feelings by Troed · · Score: 1

      .. and I can't say we like it. Since we already have sangberg.se we "had" to get sångberg.se (that's an a with a little circle above it if Slashdot removes the character I wrote) as well to avoid confusion if someone else registered it.

    3. Re:Mixed feelings by Rich+Dougherty · · Score: 1

      On the other hand, this is also quite convenient. I live in the US now, and I travel around quite a bit. I often surf on Swedish Internet sites, typically without access to a Swedish keyboard. It would not be very convenient if the domain names used non-English symbols.

      Just type it in punycode. Easy!

  21. It's a bad solution good idea by edubarr · · Score: 1

    This will enable more domains, and people from non-English-speaking countries will be able to register their domains with their correct spelling.

    One thing that kinda bugs me is that this will not be a full port to Unicode (apparently it'll be hard to port it all), but a work-around. Kind of reminds me of the entire Y2K problem... "Why write it like it's supposed to be if we can make it shorter?" Then, in a decade everybody will be worried because the work-around no longer works and they'll have lots more to do in order to port it all to Unicode.

  22. Re:English runs the net by Anonymous Coward · · Score: 0

    they have their own keyboards with their alphabet, and if they don't want to go to an English site, why would they need to switch keyboard mode? think asian languages, think middle eastern, russian, greek, ...

  23. Does this mean by gregarican · · Score: 0, Flamebait

    That www.dell.com will automatically forward to www.dell.in?

    1. Re:Does this mean by Anonymous Coward · · Score: 0

      If I had mod points I'd mod this up as funny.

  24. complexity for nothing by Anonymous Coward · · Score: 0

    this is going to add another level of complexity to things for functionality which a) isn't needed b) nobody wants.

  25. Not to be Overly American... by mse61 · · Score: 1

    I realize that it's important to some people to have their native language represented in what they type and in their communication, but isn't this more trouble than it's worth? As it stands now, the system works well. Sure you may not be able to get an umlaut in your domain, but is that really a just cause to change the entire fscking DNS system?

    --
    ++mse61--
    1. Re:Not to be Overly American... by Mnemia · · Score: 4, Insightful

      Yes, it is. Because it's not just a few "umlauts". When you're talking about Asian or other non-Romanized languages then the Romanization may be totally incomprehensible to even some speakers of that language. It's one thing to lose a few accent marks and such but it's quite another to translate your language into a totally incomprehensible and unrelated format. In fact in kanji based languages at the very least Romanization actually LOSES information. It's not just a matter of transcribing the sounds into another format because the kanji carry additional meaning not present in just the phonetic language. If you've ever seen two native Chinese or Japanese speakers talk to each other they frequently will "write" kanji in the air or on the palm of the other person's hand with their fingers because their spoken language is imprecise. These changes are very necessary for the Internet to become a truly international phenomenon.

    2. Re:Not to be Overly American... by geoffspear · · Score: 1

      "Sure our computers use only ugly, hard to read, uppercase letters, but as long as we can spell 'God' correctly everyone should be ok with it."

      --
      Don't blame me; I'm never given mod points.
    3. Re:Not to be Overly American... by defMan · · Score: 1

      This system will not support Chinese/Japanese charsets for this.
      This just adds the Latin-1 part of Unicode to the DNS system.

    4. Re:Not to be Overly American... by Anonymous Coward · · Score: 0

      How about we start supporting tribes of africa also that speak in clicks and pops?

    5. Re:Not to be Overly American... by Mnemia · · Score: 1

      Yes, but that's a problem with the solution being proposed more than it's a problem with the entire concept of adding non-English domain names. I agree that this solution sucks but only because it's a sloppy fix to the overall problem. Domain names should definitely have full Unicode/UTF-8 support so that they are equally usable for everyone. And that should be done now so that the problem doesn't become worse to fix in the future as these "local" workarounds proliferate. And they will proliferate, because people do not like to have to learn a foreign language just to use the Internet.

      Internationalization issues like this are going to be more and more of a problem unless global fixes get applied. All computers should have the ability to input/display arbitrary human languages in my opinion.

    6. Re:Not to be Overly American... by Tailhook · · Score: 1

      If you've ever seen two native Chinese or Japanese speakers talk to each other they frequently will "write" kanji in the air or on the palm of the other person's hand with their fingers because their spoken language is imprecise.

      If this actually goes on, doesn't it behoove the victims to fix their language? What good does it do to have a language that can't be properly spoken? How does purely audio communication, such as radio (!), function? For better or worse, English essentially won the language war. It will take another century or so for that to set in fully, but it's a done deal. Just watch some sci-fi; even the aliens have English nailed.

      --
      Maw! Fire up the karma burner!
    7. Re:Not to be Overly American... by dieresis · · Score: 1

      In the ten years that I have known her, I have never seen my Chinese friend gesture characters when she talks to other Chinese speakers.

    8. Re:Not to be Overly American... by caranha · · Score: 1

      If you've ever seen two native Chinese or Japanese speakers talk to each other they frequently will "write" kanji in the air or on the palm of the other person's hand with their fingers because their spoken language is imprecise.

      And I call bullshit :-)

      Having lived in Japan for one year now, I have yet to see this curious behaviour you spoke of. (Coincidentally, a group of Chinese live in the same building as me, and they don't seem to draw kanji to each other while speaking either.)

      However, you are right to say that romanization may lead to lost information. There is indeed a lot of phonetical redundancy (in Japanese at least), but most homophones can be solved by context, which is not the case in the short internet addresses.

    9. Re:Not to be Overly American... by Anonymous Coward · · Score: 0

      >> If you've ever seen two native Chinese or Japanese speakers talk to each other they
      >> frequently will "write" kanji in the air or on the palm of the other person's hand with
      >> their fingers because their spoken language is imprecise.

      If you've ever seen two native English speakers talk to each other, they frequently will "spell" ambiguous words to each other because their spoken language is imprecise.

    10. Re:Not to be Overly American... by Mnemia · · Score: 1

      Yes...usually the time you see this is when they are talking about something like a place or person's name. That is by no means possible to clear up from context. Fortunately, it doesn't matter very often in regular conversation. If you're talking about a place or person that one person is unfamiliar with it's definitely necessary to write it down or point out which kanji it is if you want to be able to read it though. There are many many kanji for common names.

      I also lived in Japan for about 6 months.

  26. read the motherfucking article, bunghole by Anonymous Coward · · Score: 0
  27. Re:Bad idea but bound to happen with todays thinki by Anonymous Coward · · Score: 0

    africa has a flag?

  28. Super Monkeys! by Speare · · Score: 5, Funny

    Any Internet RFC which includes the phrase, -with-SUPER-MONKEYS, has GOT to be good. (And in case you think I'm trolling, check the link.)

    --
    [ .sig file not found ]
    1. Re:Super Monkeys! by Saeger · · Score: 1
      -with-SUPER-MONKEYS

      That's sooooo "unprofessional". *MY* megacorp would NEVER ever associate ourselves with anything that hasn't had every last drop of fun and humanity removed from it with the Politically-Correct-Hammer(R) ... unless we're faking our humanity for advertising purposes only, of course.

      --
      Power to the Peaceful
  29. Re:Bad idea but bound to happen with todays thinki by Anonymous Coward · · Score: 0

    Is it just me, or does this totally sound like an article in The Onion?

  30. Re:Bad idea but bound to happen with todays thinki by Anonymous Coward · · Score: 0

    Uh oh. You're speaking "the plain truth". People might be offended.

    LOOK OUT!

  31. A Step In The Right Direction by kavachameleon · · Score: 1

    Seriously, while a good portion of the Internet is English speaking, there's a need for this. Accents are notoriously hard to get into computer programs, and even some languages. For instance, ancient Greek. While this may elicit scorn and laughter, I do in fact need to type Ancient Greek into my browser and my word processor on a daily basis, and the Symbol font just won't cut it. Why? Because I need accents, both stress accents and pitch accents. Even Unicode can't really help me out. I'm glad someone's finally making a little bit of a step in the right direction, even though this probably won't help me at all.

    1. Re:A Step In The Right Direction by geoffspear · · Score: 2, Funny

      Bah. The ancient Greeks didn't need any accents, why should we?

      --
      Don't blame me; I'm never given mod points.
    2. Re:A Step In The Right Direction by ybpizw · · Score: 1

      Not sure if this qualifies as off-topic or not, but I do a lot of typing Ancient Greek myself and have found some very useful and very free ways to do it in unicode. Check out Tavultesoft Keyman and Keyman developer (http://www.tavultesoft.com/). With a list of the codes for the various glyphs you can write your own keyboard. I hope this is useful, and I'd be eager to know if you're using something better.

  32. wonderful solution by Anonymous Coward · · Score: 0

    Great

    Whatever we do, let's not really solve this problem; let's just add an ill thought out work-around to complicate our lives and introduce misery and unnecessary complexity into the world.

    With this carefully thought out lunacy, we can struggle with bugs and problems related to alternate character sets for years.

    *Sigh*

  33. USA! by ekephart · · Score: 2, Funny

    U.S.A.!!! U.S.A.!!! U.S.A.!!!

    If it wasn't for us we'd all be speaking German. Wait.

    [ducks]

    --
    sig
    1. Re:USA! by lemsip · · Score: 1

      Yeah, and if it wasn't for us Brits you'd still be speaking Apache Indian.... :-)

    2. Re:USA! by Anonymous Coward · · Score: 0

      Yeah, and if it wasn't for us Vandals you'd all be speaking Latin!

    3. Re:USA! by tsmccaff · · Score: 1

      And if it wasn't for us Proto-Indo-European nomadic tribesmen, you'd all be speaking a different variation of Proto-Indo-European.

      --
      "the starry sky above and the moral law within"-Kant
    4. Re:USA! by Anonymous Coward · · Score: 0

      Gond-wa-na! Gond-wa-na!

      If it weren't for us brachiopods, you trilobites wouldn't have survived!

      Keep going, keep going! Show the American how stupid he is!

    5. Re:USA! by Araneas · · Score: 1

      I think a couple of million Russians _might_ have had something to do with it too.....

    6. Re:USA! by Anonymous Coward · · Score: 0

      You == idiot without a sense of humour.

    7. Re:USA! by Araneas · · Score: 1
      I guess it was too subtle a comment for you then.

      "idiot" was uncalled for but then you may be hiding behind AC for a reason....

  34. Why so late? by zdzichu · · Score: 1

    Poland (.pl) has officially had IDN domains since 11th September 2003.

    --
    :wq
    1. Re:Why so late? by Anonymous Coward · · Score: 0

      Where is my Porsche?

  35. Punycode *is* a Unicode encoding. by Speare · · Score: 4, Informative

    Punycode *is* a Unicode encoding.

    Unicode has many encodings; UTF-8 is one encoding and Punycode is another. UTF-8 aims for efficiency when the majority of the text is ASCII, and Punycode aims for completeness when you must fit in a 63-character DNS label and use only the ASCII characters to do it.
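    A small sketch of that difference, assuming Python's built-in `utf-8` and `punycode` codecs (the sample string is only an illustration): the same Unicode text comes out as two different byte strings, and only the Punycode one stays inside ASCII.

```python
text = "b\u00fccher"  # "bücher": five ASCII letters plus one non-ASCII letter

utf8_bytes = text.encode("utf-8")      # ASCII stays one byte each, ü takes two
puny_bytes = text.encode("punycode")   # ASCII letters first, then an ASCII suffix

print(utf8_bytes)  # b'b\xc3\xbccher'
print(puny_bytes)  # b'bcher-kva'

# Only the Punycode form is safe to store in a label that must be pure ASCII.
utf8_is_ascii = all(byte < 128 for byte in utf8_bytes)
puny_is_ascii = all(byte < 128 for byte in puny_bytes)
```

    Both byte strings decode back to the same text, which is the sense in which each is "a Unicode encoding".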

    --
    [ .sig file not found ]
    1. Re:Punycode *is* a Unicode encoding. by dmelomed · · Score: 1

      So does most DNS software support Punycode? If not, then how is Punycode a smooth plan for transition? Fix the problem, not the symptoms.

    2. Re:Punycode *is* a Unicode encoding. by killenheladagen · · Score: 1

      > So does most DNS software support Punycode?

      RTFRFC - The good thing with Punycode is that it is completely transparent to the DNS servers. It all becomes a matter of upgrading the clients (which would be necessary anyway). Old software will still work, but the Punycode-encoded "raw" ASCII strings would be visible.
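      Roughly what that looks like from the client side, as a sketch assuming Python's built-in `idna` codec (the hostname is illustrative): an upgraded client converts before the lookup and converts back for display, while an old client simply shows the raw string.

```python
typed_name = "b\u00fccher.ch"  # what the user types in an upgraded client

# What actually goes to the resolver: plain ASCII, nothing new for DNS servers.
wire_name = typed_name.encode("idna").decode("ascii")

# An old client has no decoding step, so it would display wire_name as-is.
legacy_display = wire_name

# An upgraded client decodes the "xn--" labels back for display.
upgraded_display = wire_name.encode("ascii").decode("idna")

print(wire_name)  # xn--bcher-kva.ch
```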

    3. Re:Punycode *is* a Unicode encoding. by dmelomed · · Score: 1

      This is a backwards approach to fixing the problem. What software that depends on DNS supports Punycode right now? All that software will be broken. Why not fix the other software first (MTA, MUA, browsers, etc.) to support UTF-8, then fix the servers?

    4. Re:Punycode *is* a Unicode encoding. by Anonymous Coward · · Score: 0

      Does it strike anyone as wrong that *Uni*code has many encodings? Its name just seems to say *ONE* code, one encoding...

    5. Re:Punycode *is* a Unicode encoding. by Speare · · Score: 1
      Unicode describes a canonical mapping of tens of thousands of characters in many languages. That character will always be that number, and that number will always be that character.

      The problem is, bytes only fit integers from 0 to 255. Some situations (like DNS entries) only allow a subset of those byte values. Larger numbers must be described with more than one byte. That's where the encodings come in. How will you refer to the number 10000 if you can only use A-Z, a-z, 0-9 ASCII values? How will you refer to the number 10000 if you can use any bit pattern, on a little-endian computer? If you always use four bytes to allow for any legal Unicode number, do you waste about 75% of the space when you have a text file that's *almost* entirely composed of plain old ASCII characters?
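      Those questions can be tried out directly. The sketch below assumes Python's built-in codecs and takes the number 10000 from the paragraph above literally as a code point (U+2710); the ASCII sample string is made up for the size comparison.

```python
ch = chr(10000)  # code point 10000, i.e. U+2710

# One code point, several byte-level spellings.
sizes = {enc: len(ch.encode(enc)) for enc in ("utf-8", "utf-16-le", "utf-32-le")}
print(sizes)  # {'utf-8': 3, 'utf-16-le': 2, 'utf-32-le': 4}

# Little-endian puts the low byte first: 10000 == 0x2710 -> 0x10, 0x27.
little_endian = ch.encode("utf-16-le")
big_endian = ch.encode("utf-16-be")

# Punycode spells the same number using ASCII bytes only.
puny = ch.encode("punycode")
puny_is_ascii = all(byte < 128 for byte in puny)

# Fixed four-byte units on plain ASCII text: three of every four bytes are zero.
ascii_text = "slashdot.org"
wasted = 1 - len(ascii_text.encode("utf-8")) / len(ascii_text.encode("utf-32-le"))
print(wasted)  # 0.75
```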

      --
      [ .sig file not found ]
  36. Subject to Approval by The_Systech · · Score: 3, Funny

    Yeah, but did anybody get Al Gore's approval to make these changes?

    --
    To err is human, but to really foul things up requires a computer
    1. Re:Subject to Approval by james_orr · · Score: 1

      Al Gore never claimed to have invented the internet. A few minutes on Google would reveal the truth about this lie.

    2. Re:Subject to Approval by Anonymous Coward · · Score: 0

      He claimed to have created the Internet. I fail to see the distinction between creating an invention and inventing it.

    3. Re:Subject to Approval by Anonymous Coward · · Score: 0

      Read this and see how wrong you are!

  37. Japanese Domains work... by Anonymous Coward · · Score: 0

    Maybe I am confused, but, Japanese domains already exist.

    http://www.lSOaZY.com...
    Wow, Japanese characters don't seem to show up properly on slashdot...

    Anyway, if you enter http://www.ningen-isan.com, where ningen-isan is the kanji equivalent (human treasure), both URLs will resolve correctly.

    I don't know the technical details of how it works, but is this a different case?

  38. Taking 1337-speek to a new level by RobertB-DC · · Score: 2, Informative

    Now I won't have to be limited to using a hyphen! I can register d[i-circ]xiechicks.com, or dixi[e-grave]chicks.com, or maybe dixie[c-cedil]hicks.com!

    That last one would be doubly good, because if I understand the Punycode spec correctly, it'll get translated to ASCII as dixiehicks-XXXX.com. Not my opinion of the group, but maybe it would attract hits from the Toby Keith crowd.
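    For what it's worth, that reading of the spec can be checked with a quick sketch, assuming Python's built-in `punycode` codec. The label is the hypothetical one from this comment, and the XXXX suffix is deliberately left uncomputed here.

```python
label = "dixie\u00e7hicks"  # "dixie" + c-cedilla + "hicks", a made-up label

encoded = label.encode("punycode").decode("ascii")

# Punycode copies the plain ASCII letters first, then a hyphen, then a suffix
# that says which non-ASCII character goes where.
basic_part, _, suffix = encoded.rpartition("-")

print(basic_part)  # dixiehicks
```

    The registered form would additionally carry the "xn--" prefix in front of the whole label.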

    --
    Stressed? Me? Of course not. Stress is what a rubber band feels before it breaks, silly.
  39. mod parent flamebait by Anonymous Coward · · Score: 0
    or at very least as a troll

    you think keyboards the world over look exactly like yours? why should they?

    technology is not inherently english-speaking. and even if it starts out that way, it shouldn't have to remain so. just because we made it doesn't mean others don't have the right to use it in a manner more suited to their needs.

    Hell, why have screen readers? Is it too much to ask for blind people to print out braille versions of webpages? Why not make everyone use hex lookup tables to enter and read information? or punchcards?!

  40. it works fine on /. by GillBates0 · · Score: 4, Funny

    - - - - ..
    I, for one, welcome our new European overlords.

    --
    An Indian-American Hindu committed to non-violent thought/speech/action alarmed by the global explosion of radical Islam
  41. Re:Taco, why did you remove the accents from slash by Anonymous Coward · · Score: 1, Informative

    don't understand a lick of french? 'taco is a mean man'

  42. Because by SweetAndSourJesus · · Score: 2, Funny

    &#73;&#116;&#39;&#115;&#32;&#118;&#101;
    &#114;&#1 21;&#32;&#100;&#105;&#102;&#102;
    &#105;&#99;&#117 ;&#108;&#116;&#32;&#116;
    &#111;&#32;&#116;&#121;& #112;&#101;&#32;
    &#108;&#105;&#107;&#101;&#32;&#1 16;&#104;
    &#105;&#115;

    --
    the strongest word is still the word "free"
    1. Re:Because by Tackhead · · Score: 2, Funny
      > &#73;&#116;&#39;&#115;&#32;&#118;&#101 ;
      &#114;&#1 21;&#32;&#100;&#105;&#102;&#102;
      &#105;&#99;&#117 ;&#108;&#116;&#32;&#116;
      &#111;&#32;&#116;&#121;&; #112;&#101;&#32;
      &#108;&#105;&#107;&#101;&#32;& #1 16;&#104;
      &#105;&#115;

      An opportunity to quote one of my favorite bits of .sigfodder of all time:

      Now, I knew this was coming, but that still didn't prepare me to actually see it. I'm looking at this thinking "You know, that couldn't be ANY MORE WRONG if it was in HTML with a .GIF of a psychotic nun in a bondage outfit clubbing a baby seal to death with an Al Gore doll." I mean, _ew_. Is that supposed to mean anything to ANYBODY? Can I put that address on an envelope and have it get delivered somewhere other than "Ampersand Incorporated"? WHAT IDIOT THINKS THAT THIS IS A GOOD IDEA?

      Huey, on news.admin.net-abuse.email, commenting on the same issue, over two and a half years ago.

    2. Re:Because by el-spectre · · Score: 1

      It's just as hard to read it... damned whitespace...

      --
      "Faith: Belief without evidence in what is told by one who speaks without knowledge, of things without parallel." - A.B.
    3. Re:Because by caluml · · Score: 1
      a .GIF of a psychotic nun in a bondage outfit clubbing a baby seal to death with an Al Gore doll.

      Please post a link to this pic, as I think you just made it up.

    4. Re:Because by Anonymous Coward · · Score: 0

      I think he did make it up, sic.

    5. Re:Because by Anonymous Coward · · Score: 0

      translation = It's very difficult to type like this
      yes i did it by hand
      yes i want my time back
      yes i am going to steal someones soul to suck that time back from them real vampire like
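      For anyone who wants their time back: a sketch of doing the translation mechanically, assuming Python's standard `html` module. The helper name `to_entities` is made up for this example, and the input is only the first few entities of the post.

```python
import html

# First two words of the post, written as decimal character references.
entities = "&#73;&#116;&#39;&#115;&#32;&#118;&#101;&#114;&#121;"

decoded = html.unescape(entities)
print(decoded)  # It's very


def to_entities(text: str) -> str:
    """Spell every character as a decimal reference (hypothetical helper)."""
    return "".join(f"&#{ord(ch)};" for ch in text)


round_trip = to_entities(decoded)
```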

    6. Re:Because by caluml · · Score: 1

      Why the "sic" ?

    7. Re:Because by Tackhead · · Score: 1
      > > a .GIF of a psychotic nun in a bondage outfit clubbing a baby seal to death with an Al Gore doll.
      >
      > Please post a link to this pic, as I think you just made it up.

      Dude, this is Slashdot, not Fark! (But now that you mention it, perhaps we should send a request to Fark and SomethingAwful.com, and let 'em have at it.)

  43. Unicode is too bulky and not secure by Anonymous Coward · · Score: 0

    There are many Unicode representations for the same character. (Different problem than multiple Unicode characters with similar names, symbols, or meanings.)

    This _is_ Unicode, but with a non-standard encoding that forces one and only one representation for each Unicode character.

    Also, Unicode UTF8, UTF16, or UDF16 tend to create very long byte strings for non-western languages. The new encoding is designed to be compact.

    Also the 16 or 32 bit encoding variants of Unicode take up 100% more room for Western languages and are not backwards compatible -- only UTF8 and this new encoding preserve ASCII-only domains as readable ASCII strings and preserve backwards compatibility with old software (client and server) for all domains.

  44. Re:Bad idea but bound to happen with todays thinki by Scrameustache · · Score: 0, Flamebait

    To accomodate people who are "insulted" or "offended" that thier native language is not fully "respected" by the internet is ludacris.[...]As much as the information on the web should be free, if you can't handle a little learning curce to access the info, IMO you aren't capable mentaly of doing anything with the info once you access it

    Like, for instance, figuring out what the hell YOU are saying requires a steep learning curve.

    Your inability to master one language must be the root of your jingoist hatred of the idea that people from other cultures might get full access to the internet's potential too. I mean, it would be outrageous to expect you to have a bit of a learning curve to use the internet. The rest of the world, yes, but that you might need to learn a lil' something new? Folly!

    --

    You can't take the sky from me...

  45. Compatible Domains by Talrias · · Score: 1

    It seems sensible to me that, in a similar fashion to how domains are case-insensitive, accented characters, etc. should be based on an original letter, e.g. "a acute" or "a grave" should all be based on the "a" letter, as "A" is based on "a". This way, it's possible to have the domain name with accented characters or any other non-ASCII letters, in exactly the same way we *can* have http://TALrIaS.Net/ which is exactly the same as http://talrias.net/ (shameless plug there).

    Obviously this wouldn't work for non-Latin alphabets like Greek, Chinese, Japanese. Thoughts on this anyone?

    --
    aterr - an open source threaded discussion board.
    1. Re:Compatible Domains by TomV · · Score: 1

      There's a B-shaped thing in German that's actually a double-S, and a thing like an L with a diagonal slash that sounds like W in Polish, so there would be room for confusion even in Latin-based alphabets only. And there are an awful lot of people whose primary script isn't Latin-based, probably a majority.
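      A sketch of the folding idea and of where it stops working, assuming Python's `unicodedata` module. The function `fold_accents` is a made-up illustration of the proposal in the parent post, not anything IDN actually does.

```python
import unicodedata


def fold_accents(label: str) -> str:
    """Lower-case a label and drop combining accents (hypothetical scheme)."""
    decomposed = unicodedata.normalize("NFD", label)
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return stripped.lower()


# Accented vowels fold onto their base letter, as the parent post wants.
print(fold_accents("\u00e1"), fold_accents("\u00e0"))  # a a

# But these two are letters in their own right, not a base letter plus accent,
# so this kind of folding leaves them untouched.
sharp_s = fold_accents("\u00df")   # German sharp s
l_stroke = fold_accents("\u0142")  # Polish l with stroke
```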

    2. Re:Compatible Domains by Anonymous Coward · · Score: 0

      I agree. This is a much better idea.

  46. Re:Bad idea but bound to happen with todays thinki by dREI · · Score: 1

    I totally agree with you.

    dREI from Rome (IT)

  47. Same character in different character sets by suso · · Score: 1

    What happens when someone registers the domain cnn.com where the c or n is actually a character in a different character set? Then it would be difficult for 99% of the population to tell the difference, say when they follow a link to http://www.cnn.com/the_world_is_ending_sell_your_soul.html

    1. Re:Same character in different character sets by TKinias · · Score: 1

      scripsit suso:

      What happens when someone registers the domain cnn.com where the c or n is actually a character in a different character set? Then it would be difficult for 99% of the population to tell the difference, say when they follow a link to http://www.cnn.com/the_world_is_ending_sell_your_soul.html

      Ewww, yuck! I do not want to have to guess whether I'm really looking at <cyrillic s>nn.com, or whether this is slashd<omicron>t.org. It's bad enough already with some of the typo-grabbing pr0n sites, etc.

      --
      In principio creauit Linus Linucem.
    2. Re:Same character in different character sets by Anonymous Coward · · Score: 0

      There is only one character set used in IDN, and that's Unicode.

      Now some characters have several representations in Unicode, that's why domain names are normalized first. That ensures that there is only one way to represent a given character string.
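      A sketch of what "several representations" means in practice, assuming Python's `unicodedata` module and NFKC, the normalization form that Nameprep (RFC 3491) is built on. The sample word is only an illustration.

```python
import unicodedata

precomposed = "\u00e9lys\u00e9e"    # each é is a single code point
decomposed = "e\u0301lyse\u0301e"   # each é is "e" plus a combining acute accent

# They render identically but are different code point sequences.
same_before = precomposed == decomposed

# After normalization there is only one spelling left.
same_after = (
    unicodedata.normalize("NFKC", precomposed)
    == unicodedata.normalize("NFKC", decomposed)
)

print(same_before, same_after)  # False True
```

      Note that this only collapses different spellings of the same character; it does nothing about different characters that merely look alike.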

    3. Re:Same character in different character sets by Fastolfe · · Score: 1

      Precisely why DNS isn't appropriate to authenticate an organization. We need to push technologies like TLS and discourage users from giving so much weight to DNS hostnames (and URLs for that matter).

    4. Re:Same character in different character sets by sonamchauhan · · Score: 1

      Ewww, yuck! I do not want to have to guess whether I'm really looking at nn.com, or whether this is slashdt.org. It's bad enough already with some of the typo-grabbing pr0n sites, etc.

      There's a sore need for browsers to start displaying the DNS name in the address bar using *only* the logged-in user's default character set. Hopefully, this would show "nn.com" as "?nn.com".

      To the poster who suggested TLS - do you mean X.509 server certificates?
      They won't help - the person running "nn.com" would simply have a certificate for "nn.com" (as indeed, he has a right to have).

      So it comes down to what you see on the browser address bar.
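      A minimal sketch of that address-bar idea, assuming the user's default character set is plain ASCII. The function name `address_bar_text` is made up, and the spoofed name uses a Cyrillic letter purely as an example.

```python
def address_bar_text(hostname: str) -> str:
    """Show only what ASCII can represent; everything else becomes '?'."""
    return hostname.encode("ascii", errors="replace").decode("ascii")


spoofed = "\u0441nn.com"  # first letter is CYRILLIC SMALL LETTER ES, not Latin c

print(address_bar_text(spoofed))    # ?nn.com
print(address_bar_text("cnn.com"))  # cnn.com
```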

    5. Re:Same character in different character sets by sonamchauhan · · Score: 1

      Sorry, the "nn.com" mentions above really should have been "nn.com"

    6. Re:Same character in different character sets by TKinias · · Score: 1

      scripsit sonamchauhan:

      There's a sore need for browsers to start displaying the DNS name in the address bar using *only* the logged-in user's default character set. Hopefully, this would show "[<cyrillic s>]nn.com" as "?nn.com".

      That won't help users like me who use UTF-8 locales. I routinely visit Web sites in more than one language, including the occasional Russian one.

      --
      In principio creauit Linus Linucem.
    7. Re:Same character in different character sets by sonamchauhan · · Score: 1

      The website would still display properly - only the address bar would be affected.

      In the case of a user using the default ASCII character set, this is so he is alerted that the DNS name in the UTF-8 URL he cut and pasted in his address bar really translates to a different DNS name in ASCII. Perhaps the UTF-8 rendering could be displayed in a floating popup next to the "ASCII-translated" URL. (IIRC, DNS names are ASCII.)

      In the case of a Russian web user - his default character set would be Cyrillic, so typing Cyrillic URLs would be no problem. The problem would be typing ASCII URLs, but they can always change their default character set on the fly. I guess in this case, the popup data would be interchanged.
      This would also allow, say, a Russian web user to use Kanji URLs. :)

    8. Re:Same character in different character sets by TKinias · · Score: 1

      scripsit sonamchauhan:

      In the case of a user using the default ASCII character set, this is so he is alerted that the DNS name in the UTF-8 URL he cut and pasted in his address bar really translates to a different DNS name in ASCII.

      This still doesn't help the bloke whose default character encoding is UTF-8. For example, I use the locales en_US.UTF-8 and fr_FR.UTF-8. In both of those, Cyrillic and CJK characters are ``native.'' If non-ASCII domain names become common, I don't want to be pestered every time I try to go to a Web site with an acute accent in the domain name--for example, the Web site of the French President would presumably add the accent to elysee.fr.

      --
      In principio creauit Linus Linucem.
    9. Re:Same character in different character sets by sonamchauhan · · Score: 1

      Well, for a regular user, the choice really boils down to these alternatives:

      1. Be pestered (actually an unobtrusive tooltip floating next to the address bar is what I had in mind),

      2. Have no way of defeating a homograph attack.

      If you cut and paste the two "microsoft.com" URLs from the article I linked, you'll see one works and the other doesn't, but both look identical. The dummy link could be sent in an email, or even put up temporarily on a website - and point to an exactly mirrored fake Microsoft site with a dummy "virus patch".

      Note, regular X.509 certificates don't help (Verisign will just issue one to the dummy site). Only code-signing procedures that check for Microsoft's signature will work.
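
      A minimal sketch of why the two look-alike names differ, and of what a tooltip could show. It assumes the spoof swaps one Latin "o" for CYRILLIC SMALL LETTER O (U+043E); the helper names `to_ascii` and `non_ascii_chars` are made up for illustration, and Python's built-in `idna` codec follows the 2003 IDNA rules.

```python
import unicodedata

# Two names that may render identically. The second one (an assumption for
# this sketch) replaces the first Latin "o" with CYRILLIC SMALL LETTER O.
genuine = "microsoft.com"
spoofed = "micr\u043esoft.com"


def to_ascii(hostname: str) -> str:
    """Return the ASCII form that actually goes into the DNS query."""
    return hostname.encode("idna").decode("ascii")


def non_ascii_chars(hostname: str) -> list:
    """Name every non-ASCII character, e.g. as text for a tooltip."""
    return [unicodedata.name(ch) for ch in hostname if ord(ch) > 127]


print(to_ascii(genuine))         # unchanged, already ASCII
print(to_ascii(spoofed))         # a different name with an xn-- label
print(non_ascii_chars(spoofed))  # names the Cyrillic letter
```

      Showing the ASCII form next to the address bar is enough to tell the two apart, even when the rendered glyphs are identical.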

  48. No change needed... by JohnGrahamCumming · · Score: 5, Informative

    > You think you know how to parse a domain name for validity?

    Yes, I do, and if you _read_ the RFC you'll see that nothing changes: these domain names are encoded into the same character set as the current DNS system. Hence, if you give me a URL, I can validate it with existing scripts. There's an example which shows that Bucher.ch (with an umlaut on the u) would be translated to xn--bcher-kva.ch, which looks totally parseable to me.
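
    To illustrate the point, here is a rough sketch: the name is converted with Python's built-in `idna` codec (which follows the 2003 IDNA rules) and then run through an ordinary letters-digits-hyphen check. The regex and `is_valid_hostname` are hypothetical stand-ins for whatever "existing scripts" you already have.

```python
import re

# Classic letters-digits-hyphen rule for a single DNS label (1 to 63 chars,
# no leading or trailing hyphen). Hypothetical stand-in for an existing check.
LDH_LABEL = re.compile(r"[A-Za-z0-9](?:[A-Za-z0-9-]{0,61}[A-Za-z0-9])?")


def is_valid_hostname(name: str) -> bool:
    """Return True if every label passes the plain ASCII LDH rule."""
    labels = name.rstrip(".").split(".")
    return len(name) <= 253 and all(
        LDH_LABEL.fullmatch(label) is not None for label in labels
    )


# "bücher.ch", written with an escape so the source file stays pure ASCII.
unicode_name = "b\u00fccher.ch"
ace_name = unicode_name.encode("idna").decode("ascii")

print(ace_name)                         # xn--bcher-kva.ch
print(is_valid_hostname(ace_name))      # True
print(is_valid_hostname(unicode_name))  # False: raw non-ASCII fails the old rule
```

    The old validator only ever needs to see the converted form; the conversion itself happens in the application before the name reaches DNS.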

    John.

    1. Re:No change needed... by Psychic+Burrito · · Score: 1
      Does anybody know if this will just work "out of the box" with every computer that can produce umlauts?

      I'm asking because today, I've tried out the Netsol way of doing umlauts and they don't work at all with my Mac OS X and Safari: None of the listed domains work. The page lists a "plugin" that every web user is supposed to install, but it's Win only (of course...) and it's quite silly to have a domain with umlauts if you have to tell all your customers "before visiting me, please install this plugin"...

      Any idea if this new way works in all circumstances where the user has an international keyboard? Thanks!

    2. Re:No change needed... by Anonymous Coward · · Score: 0

      xn--slashdt-???.org

  49. MOD PARENT UP!! too funny! by Anonymous Coward · · Score: 0

    MOD PARENT UP!!

  50. I can't wait by nizo · · Score: 5, Funny

    Personally I can't wait to see funky Chinese character domain names in my web logs (mostly from infected Windows machines trying to attack my Apache server).

  51. Re:Bad idea but bound to happen with todays thinki by TomV · · Score: 2, Funny

    thier
    ludacris
    femail
    curce
    mentaly


    ...after all, some people find just 26 letters and 0-9 hard enough already ;)

  52. No time for that, man! by Anonymous Coward · · Score: 0

    There is evidence the Mesopotamians are building weapons of mass destruction!

    1. Re:No time for that, man! by a+whoabot · · Score: 1

      (Pssst! No one knows their geo-political history!)

  53. Re:Bad idea but bound to happen with todays thinki by Quixo-tastic · · Score: 1

    Errr, congratulations. Your Ethnocentric garbage has been modded up. English was not the first and is not the only language in the world, and just because someone who spoke English designed the standards we use today does not justify those standards excluding all other languages for all time. I'd expect someone as gung-ho on the English language to be able to spell "ludicrous" correctly. Turn off the MTV and turn on something educational.

  54. Reason by ajnlth · · Score: 3, Insightful
    I would guess that the reason for this, rather than redesigning DNS to use Unicode, is because of the still rather dominant presence of the USA on the internet.

    Since this solution doesn't break any old implementation, just the countries that need it will have to modify their software, and not wait for the slow and expensive process of changing all of DNS, which a large part of the 'net isn't motivated to pay for.

    1. Re:Reason by Chibi+Merrow · · Score: 1

      Actually it doesn't really have anything to do with the USA. Most of the world uses BIND and the BIND guys say it'd take them a year to get full UNICODE support, not to mention the security problems they'd have to worry about then. After reading the RFC, I have to say this is a pretty nifty hack to make it work since every string gets encoded into a unique punycode string that's compliant with the current DNS RFCs.

      Plus the guy uses Amuro Namie in one of his examples, so he's gotta be cool. :P

      --
      Maxim: People cannot follow directions.
      Increases in truth directly with the length of time spent explaining them
  55. Re:Bad idea but bound to happen with todays thinki by Mod+Me+God · · Score: 1

    "26 letters and 0-9 are the best" but what about the punctuation marks? And spaces? Imagine communicating with the computer without spaces? Well, if we all wrote Chinese we wouldn't need spaces, we wouldn't need letters either, just particles (Chinese characters can be constructed from arrangement of 1, 2 or 3 of a small amount of simple shapes), nor would we need numbers.

    What's next? Everyone must accept the Latin script (or rather the English bastardisation of it)? What will I do when I want to code my APL, I'll have no Greek letters to use. And damn, think about the amount of money the US will lose from the lack of expansion of IT in countries which do not use the bastardised Latin script??? Hell, computers were _designed_ to run in English after all. No accent marks? Tough sh!t I say.

    --
    --

    FreeNET user? Comfortable with the adverse selection?
  56. I wish I had mod points by Anonymous Coward · · Score: 0
    You are so right.

    If these people knew how ugly a domain name becomes when one can't use all characters in the alphabet. Wonder what Microsoft would use if "f" was not available in domain names. "Microsopht"? It's things like that we have to live with.

  57. Just use Google by bstadil · · Score: 2, Insightful
    The whole issue of convenient domain names is a bit passé.

    Often-used URLs I have as bookmarks, and when I need some other site, it is much easier to make a guess via Google. What I am looking for is almost always on page one of Google's choices.

    Surely Google could find a way to handle the special characters and make an intelligent suggestion, if nothing else based on the IP address of the request. If it is from Burundi, chances of needing a German umlaut are slim.

    --
    Help fight continental drift.
  58. similar domain registrations... by barryfandango · · Score: 1

    let me be the first to call www.hotmail.com !

    --
    In all matters of opinion, our adversaries are insane. -Oscar Wilde
    1. Re:similar domain registrations... by barryfandango · · Score: 1

      d'oh... i had the i with the two little dots over it... it was really quite clever, i swear... looks like slash filtered it out... With this explanation you can now feel free to mod my parent post as funny and insightful. thank-you for your time.

      --
      In all matters of opinion, our adversaries are insane. -Oscar Wilde
    2. Re:similar domain registrations... by Anonymous Coward · · Score: 0

      If you don't read the summary you won't get your mod!! How can you have your mod if you don't read the summary?!

  59. Warning! Typo in parent! by Anonymous Coward · · Score: 0

    You meant to say: "a) isn't needed by english speaking people b) americans don't want".

  60. Wrong way on a one-way track... by mishehu · · Score: 2, Insightful

    Let's assume (and I might not be correct in this assertion) that every computer in every country can at least type & see the 26 letters used in the English language plus digits 0-9 and the dash & period signs. However, I have no idea how to type anything coherent in Chinese Simplified or Traditional (hell, it's all Chinese to me...)...

    In the interest of fostering the best method to communicate your ideas, products, services, etc., would you not want to use the characters that most everybody can type?

    Oh, and this begs the next question - what about languages that go right-to-left instead of left-to-right? How about Thai, Arabic, and Hebrew? Personally, I don't want to see any domain names outside of the 26 chars used in English, 0-9, and the period & dash signs.

    1. Re:Wrong way on a one-way track... by Anonymous Coward · · Score: 0

      Don't worry you probably won't see them. Because you know there is a whole world outside of the English speaking world, and there is no reason why the English speaking world will be affected by that change. I don't see CNN suddenly changing its domain name to c~n~n.com just because they can.

    2. Re:Wrong way on a one-way track... by fiddlesticks · · Score: 1

      > Let's assume [snip]

      let's not assume, let's STFW

      > However, I have no idea how to type anything coherent in Chinese Simplified or Traditional (hell, it's all Chinese to me...)...

      big deal. this doesn't affect you. You dont have to change anything

      >what about languages that go right-to-left instead of left-to-right? How about Thai, Arabic, and Hebrew?

      what of it? maybe this change will allow for that, maybe it won't. why do you care?

      >However, I have no idea how to type anything coherent in Chinese Simplified or Traditional (hell, it's all Chinese to me...)...

      big deal. children and babies would like the domain name system to be at a level that they could understand. Should we limit domain names to less than 5 chars, simple words, possibly chocolate flavoured?

      It's a big fucking world out there. > 1 billion Chinese people couldn't give 2 flying fucks whether or not you think it would be better to have no-non-'English language' letters (whatever they are..) available in a domain name

      >In the interest of fostering the best method to communicate your ideas, products, services, etc[snip] ....

      shouldn't everyone just learn ENGLISH, dammit?

      oh and BZZZZ, you can't have both a 'dash & period' in a domain name

      i don't know if you mean underscore and hyphen, but you can't use underscores in a domain name.

      in a file name, maybe

      If you used another charset than the one you obviously do, you might have different opinions about this issue.

  61. Sorry, but this is really stupid... by coene · · Score: 2, Insightful

    "Yeah, let's make sure that every normal english domain name can easily be spoofed with accented characters, not to mention having everyone open up and hunt around charmap to get to these new domains"...

    This isn't going to be abused, AT ALL. Worst idea ever.

    The Internet (domain names, top-tier nameservers, nameserver software, web and e-mail server software, all markup documents) runs on english, there's no way to i18n it without opening up a world of hurt. Sorry, but I don't want to have to upgrade BIND to a whole new series of bugs and exploits just so that some jagoff can open up his own go~o`le'.com.

    1. Re:Sorry, but this is really stupid... by pawal · · Score: 2, Informative

      Nothing in the DNS infrastructure needs to be upgraded. There is only US-ASCII in the zones. BUT, you have to upgrade your applications in order to display the names the way they are supposed to be read, otherwise you will end up with www.xn--rksmrgs-5wao1o.se instead of "www.räksmörgås.se".
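
      As a small sketch of what "upgrading the application" amounts to: the ASCII form from the zone is decoded for display, and encoded again before lookup. This uses Python's built-in `idna` codec (2003 IDNA rules); the variable names are only for illustration.

```python
# What is actually stored in the zone and sent in DNS queries.
ace_name = "www.xn--rksmrgs-5wao1o.se"

# An IDN-aware application decodes each xn-- label before showing it.
display_name = ace_name.encode("ascii").decode("idna")
print(display_name)  # the Swedish spelling with its three accented letters

# Going the other way gives back exactly the ASCII name used on the wire.
round_trip = display_name.encode("idna").decode("ascii")
print(round_trip == ace_name)  # True
```

      An application that skips the decode step still resolves the name correctly; it just shows the user the xn-- form.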

    2. Re:Sorry, but this is really stupid... by dabadab · · Score: 4, Insightful

      You know, this arrogant, self-centric view does not help the discussion.
      Anyway, the current infrastructure DOES NOT have to be updated and this change is NOT intended to be "some jagoff's playground", but rather for the non-English speaking people - there are quite a few of them.

      --
      Real life is overrated.
    3. Re:Sorry, but this is really stupid... by Fastolfe · · Score: 1

      The bulk of the world does not run on the Latin alphabet. Either they go off and create their own Internets that follow their rules, at everyone's expense, or we resolve to use one single root and find ways to make it work for everyone's rules.

      There are worse ways to approach this problem, and I don't see any better suggestions.

      Your information is also somewhat dated or not completely accurate.

      DNS, collectively, operates on a standardized set of Latin characters to identify country codes. This is the crux of the issue, obviously. I'll speak more to this later.

      Web markup languages are currently moving to those based on XML. XML allows Unicode anywhere, including the use of Unicode characters in XML elements and attributes. It's pretty easy to create an XML schema that only uses characters from non-Latin scripts. HTML and its XML-based children continue to use tags with a clearly visible English background, but who cares? It's trivial to create an XML schema with a CSS style sheet that allows Chinese authors to create markup using elements in Chinese, and XML- and CSS-aware browsers will actually render this correctly today. The presence of English is no longer a design issue, it's just that our standards bodies speak English and by this point, it's easier for non-Latin developers to deal with Latin scripts than it is for Latin developers to deal with non-Latin scripts.

      E-mail is, from the user's perspective, completely internationalized as well. Only the mail header names carry English-language words in them, but these shouldn't matter. The values of these headers can be internationalized, along with the content of the e-mail itself. The header names themselves are an implementation detail that can be completely isolated not only from the user, but from the developer as well, in the form of a library abstracting the implementation away.

      You will quickly find that many other protocols share these properties. Things like header names can be treated simply as opaque tokens (and frequently are, in the case of programmers that don't speak English). Their values are usually more opaque tokens, or internationalized characters.

      We run into issues when the stuff that isn't an implementation detail needs to be internationalized. DNS domains are the most critical. Given that our society places so much intellectual property and "first line of search" weight on DNS domains, it's only natural that change like this is going to make things difficult for a lot of people. But remember that in the end, it's going to make things a lot less difficult for a lot more people.

      There will absolutely be issues with people "spoofing" similar-looking domains, and you can bet that companies are probably doubling the size of their Internet legal departments for the next rounds of litigation. But seriously, if you're relying on the appearance of DNS domains as some form of authentication, your security model is badly broken to begin with.

      It is my hope that this internationalization effort is a catalyst to:

      a) make people realize that DNS domains do not make a good Yellow Pages
      b) spur development on a better form of directory to supplement search engines to identify the Internet location of a real-world entity
      c) promote the use of other technologies to better establish "identity" online (e.g. SSL/TLS or some public key infrastructure)

      My two cents.

    4. Re:Sorry, but this is really stupid... by toriver · · Score: 1

      runs on english

      That's English, and it's thanks to such thinking that the rather large portion of the world that writes non-English as their native language had to battle with the stupid 7-bit limitation in transport protocols.

      Thanks for nothing.

    5. Re:Sorry, but this is really stupid... by jacobito · · Score: 1
      The Internet (domain names, top-tier nameservers, nameserver software, web and e-mail server software, all markup documents) runs on english, there's no way to i18n it without opening up a world of hurt. (emphasis mine)

      Huh? You better not tell the W3C, who have put a great deal of work into i18n support for web protocols and markup languages. You better not tell the browser makers, the majority of whom include support for multiple character encodings and the HTTP Accept-Language header. You better not tell Google, who have localized their search interface to support an impressive number of languages, using the HTTP Accept-Language header that your browser sends to determine which language to present. In fact, you better ignore the thousands, if not millions, of documents on the web right now that include non-English content, and the existing infrastructure that serves and presents those documents.

      I can't speak for DNS or the email infrastructure, but the WWW is already internationalized.

  62. Re:Bad idea but bound to happen with todays thinki by Anonymous Coward · · Score: 0

    > 26 letters and 0-9 are the best, most simple way to use and communicate with a computer, IMO, other than speaking binary at the CPU with a f*cking megaphone.

    Just because you don't need them doesn't mean everyone else doesn't. I doubt any American websites would ever need to take advantage of this (but then again, why not?), but many European languages simply cannot be rendered properly without accents. It just makes a lot of sense to allow these for URL's as well.

    > I'm all for information being free, and the web remaining a pace for a free flow of information to the whole world, but complicating the very foundation of the way the tech works to avoid some learning curve is just plain stupid.

    I don't see what a learning curve has to do with anything. Most Europeans are quite adept with English (as opposed to many Americans who can't speak any foreign languages). The point is that they simply want to be able to type URL's correctly in their own language. You don't speak Hungarian? Fine, you probably won't need this, but why not let Hungarians use their own language with their own accents?

  63. Umlauts? by Wun+Hung+Lo · · Score: 0

    Get ready for a slew of new hair band websites...it just wasn't the same without the umlauts!

    Beavis: heh heh, he said umlaut, heh heh

  64. Re:Bad idea but bound to happen with todays thinki by isaac338 · · Score: 2, Insightful

    The funny part is you'd probably be the first to complain had the Internet been designed by some foreign country and you couldn't register a plain English URL. Learning a whole new language isn't a "little learning curve", it's actually pretty hard.

    if you can't handle a little learning curce to access the info, IMO you aren't capable mentaly of doing anything with the info once you access it.

    Next time you go to a country the native language of which you can't understand, try planning your whole trip without once reading an English translation of any map or sign. Then you possibly might see how ignorant that statement sounds.

    The Internet is a world-wide resource, and like it or not, people who speak other languages have a say in how it works too.

    isaac

  65. Re:Taco, why did you remove the accents from slash by Anonymous Coward · · Score: 0

    No, that's boy.

    Homme is man, or gentleman really.

  66. Yes, as a matter of fact by joggle · · Score: 1

    Does Texas count?

    1. Re:Yes, as a matter of fact by Anonymous Coward · · Score: 0

      Naw, Texians speak TexMex.

    2. Re:Yes, as a matter of fact by Anonymous Coward · · Score: 0

      > Naw, Texians speak TexMex.

      Okay. But what do Texans speak?

      <duck>

  67. Just like name-mangling by Anonymous Coward · · Score: 0


    Reminds me of name-mangling, which I saw a lot back when I was assisting on a cfront porting project (the old C++ preprocessor).

    You still see mangling in C++ object file symbol tables -- that's because they wanted to keep the linker's name-space nice and simple and flat.

    (I think the new politically-correct term for it now is "name decoration".)

  68. Re:Bad idea but bound to happen with todays thinki by Anonymous Coward · · Score: 0

    You, sir, are a fucking idiot. I'd say it in the six languages I know if it weren't for the fact that three of them aren't representable without Unicode.

  69. Well... it's still not perfect by Krach42 · · Score: 3, Interesting

    Ok, so you're mostly guaranteed a domain name if you own the trademark on the name. (To prevent cybersquatters, right?)

    Well, what about the .jp domain? How can they possibly handle this, since in Japan you cannot trademark Latin characters. (Or at least as far as I've heard)

    This is the reasoning I've heard, as to why IBM is ai-bi-emu in Japan. And maikurosofuto, souni, etc. (roomaji transliteration there, sorry if you don't get why ai=I)

    So what do you do in this case? Unless they can enter Shift-JIS or Unicode URLs, then you're stuck having people enter roomaji versions of your name, which remember, aren't technically trademarkable.

    I'd love to hear I'm wrong on some point here, could anyone with more info clue me in?

    --

    I am unamerican, and proud of it!
    1. Re:Well... it's still not perfect by Xeger · · Score: 1

      The point of the Punycode encoding is that end-users can register Unicode domain names and type Unicode domain names into their browsers. Punycode then provides a translation from those pesky Unicode strings into nice 7-bit ASCII, which is compatible with the DNS protocol.

      For example: in Japan, Namie Amuro probably has a trademark on the Kanji characters of her name. When it becomes possible for her to register her name under .com.jp she will be legally entitled to register that name. What that effectively means is that she will be legally entitled to register the *Punycode encoding* of that name...although if everything is implemented correctly, she (or her web designers) will never know this is going on.

      The only problem that might occur, is if someone else wants to innocently register the same domain name and isn't aware that it is the Punycode encoding for a trademarked name. This is highly unlikely, however, since Punycode-encoded strings tend to have long sequences of random-looking digits. Furthermore, unless this person has a *trademark* on the Punycode-encoded string for Namie Amuro's name, there is no legal conflict -- just a disappointed would-be domain name holder.

      So the only conceivable problem I can think of is a generation of i18n-squatters, who attempt to legally squat by trademarking Punycode-encoded versions of foreign stars' names. And I'm sure there's a legal precedent for this -- it would be equivalent to my trying to register "Maikurusofuto" as a trademark in Japan.

    2. Re:Well... it's still not perfect by Krach42 · · Score: 1

      Actually, I read the Punycode spec.

      It does cover the full unicode (yay)

      and according to the spec it uses characters, which are currently invalid to use in domain name registration.

      Thus, it should be impossible to register a punycode address that would conflict with a currently standard address.

      --

      I am unamerican, and proud of it!
    3. Re:Well... it's still not perfect by Anonymous Coward · · Score: 0

      They are missing the boat!

      Minimum Term Registration: 1 Year
      Domain Name: www.maikurosofuto.com

      Domain Availability: Yes

      In order to purchase this domain, please sign in if you are already a member

  70. Oh please. by Anonymous Coward · · Score: 0

    Yes because it's so important for Hans to have an email address:

    sexymutha21354828246@ge~'rm\a|n.gmn
    that has the correct accents.

  71. Well, it had to happen sometime...I guess by The+Spanish+Ninja · · Score: 2, Insightful

    It looks to me like this isn't really going to be such a big deal. Their domain names are going to be converted for DNS anyway, so it's not like we would have to type in a complicated string of characters that aren't on our keyboards. So we can't remember what to type so easily, so what? That's why we have bookmarks. Besides, this isn't really for us anyway. It's purpose seems to be to allow the people in other countries to use their own native languages for their own domain names. Easier for them, right? And if we want to access their domains, we just have to remember a few extra letters and dashes. No big deal. They get to do stuff in their language, we translate to ours, the whole world speaks, and maybe something gets done.

    --
    "I like you, but I wouldn't want to see you working with subatomic particles."
  72. Re:Bad idea but bound to happen with todays thinki by mkiesila · · Score: 2, Interesting

    Good day to answer to a troll, here goes...

    26 letters and 0-9 are not the best way to communicate with computer if your native language has more than 26 letters in its alphabet. It's not about being insulted or offended, it's about being understood. The computer speaks all natural languages equally badly, after all.

    Let's think for a minute about an average Nordic webshop owner who sells beds online, operating for example in Finland or Sweden. He wants to sell stuff to the native dwellers and hence needs a domain name that has an "a" with two dots on top of it, so that the domain name for bed is spelled correctly in Swedish or Finnish. It might surprise some people, but there are quite a lot of people who don't speak a single word of English. So the people he wishes to sell beds to A) know how to spell "bed" in their native language and B) have a key like that on their keyboards, and, *gasp* prefer to use correct spelling when referring to things!

    So you don't have an "a" with two dots on your keyboard? That's just too bad, but then again you probably don't speak Finnish too well either. Why would you want to visit that e-bedshop then?

  73. am I the only one who sees this by dknight · · Score: 1

    as a precursor to a much greater problem?

    This is a step in a direction I don't think we want to go. Imagine if this goes through, if you will. What will follow?

    Next you're going to hear about programming languages being developed in other languages. Think outsourcing to India is so great? Wait till your next batch of outsourced code cannot be read, because it's not in English anymore!

    One of the things about computing has been the language standardization. Sure, you can do things in other languages, but it's generally been accepted that English is the way to go for things like programming languages and domain names. Granted, this only happened because of the involvement of the US in the creation of the net, but still, it's primarily a Very Good Thing (TM).

    Perhaps an international language will come out of this? That would be nice, but I see this as the first step in splintering the internet and the computing world at large.

    1. Re:am I the only one who sees this by Anonymous Coward · · Score: 0

      Well, well, ever heard of Microsoft Office?

      Their macro language used to be localized, you know? However, they dropped it sometime around Office 95 or 97; it was probably a royal pain to maintain.

      You don't want to go there, and still, many countries have already made the switch. Have you been affected in any way? Do you care if some Romanian news site is accessible only with characters you can't type?

      Oh my Gosh, this is the fall of the Internet as we know it!

    2. Re:am I the only one who sees this by Haeleth · · Score: 1

      Next you're going to hear about programming languages being developed in other languages.

      I deduce you haven't heard of APL?

  74. Re:Bad idea but bound to happen with todays thinki by Calaf · · Score: 1, Flamebait

    I'm all for information being free, and the web remaining a pace for a free flow of information to the whole world

    As long as the language is US English, eh?

    The world is a big place. You ought to get out and see more of it.

  75. Use utf-8 instead of 'punycode'. by blitz487 · · Score: 2

    That's what utf-8 is for. Why on earth invent yet another encoding?

    1. Re:Use utf-8 instead of 'punycode'. by toriver · · Score: 1

      Backward compatibility. Lots of UTF-8 "character" values are outside the range existing DNS servers deal with.
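
      A quick sketch of the difference, using the label "bücher" as an assumed example: its UTF-8 form contains bytes at or above 0x80, while the Punycode-based form (via Python's built-in `idna` codec, 2003 IDNA rules) stays inside plain ASCII.

```python
label = "b\u00fccher"  # "bücher", escaped so this file stays pure ASCII

utf8_bytes = label.encode("utf-8")
ace_bytes = label.encode("idna")

# Bytes that fall outside the 7-bit range old DNS software expects.
high_bytes = [hex(b) for b in utf8_bytes if b >= 0x80]

print(utf8_bytes)                        # b'b\xc3\xbccher'
print(high_bytes)                        # ['0xc3', '0xbc']
print(ace_bytes)                         # b'xn--bcher-kva'
print(all(b < 0x80 for b in ace_bytes))  # True
```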

  76. You RTFA by Krach42 · · Score: 4, Insightful
    The introduction of the new IDN (Internationalised Domain Name) standard does much more than permit umlauts. A total of 92 additional characters, from the French e to the Danish o, will adorn domains.


    This means that it can't possibly include ALL of the unicode spectrum, as Unicode supports far more than just 92 extra characters.

    Also, the way the coding is going to work, you still can't register a name with B.

    According to international rules, this is equivalent to its transcription as ss. It would simply not be possible to distinguish between the domains straBe.de and strasse.de.
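
    The quoted rule can be seen in a short sketch with Python's built-in `idna` codec, which implements the 2003 IDNA rules: the German sharp s is folded to "ss" during name preparation, so both spellings end up as the same ASCII name. The variable names are just for illustration.

```python
# "straße.de" with the sharp s (U+00DF) written as an escape.
with_sharp_s = "stra\u00dfe.de"
with_double_s = "strasse.de"

encoded_sharp = with_sharp_s.encode("idna")
encoded_double = with_double_s.encode("idna")

print(encoded_sharp)                    # b'strasse.de'
print(encoded_sharp == encoded_double)  # True: no way to register them separately
```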
    --

    I am unamerican, and proud of it!
    1. Re:You RTFA by Anonymous Coward · · Score: 0
      Also, the way the coding is going to work, you still can't register a name with B.
      Are you a troll or are you just feeble-minded?

      The German "eszett" character is not a capital B.
    2. Re:You RTFA by Anonymous Coward · · Score: 0

      feeble minded. Just like the Troll that responded to me.

    3. Re:You RTFA by Anonymous Coward · · Score: 0

      This is all bullshit.
      I am german, and have that silly B (actually exactly like a beta, why?) in my surname, and I live in spain. So I have dealt with umlauts and accents.
      While they have a good purpose, in texts and such, they do very very bad in names. Any names. It braught me a lot of beaurocracy trouble having a letter in my surname thats not in the spanish alphabet. I eventually decided to leave the beta as a double 's'.

      And, especially with computers, its very bad having them in anything you would need to type in somewhere. It is good that we have different encondings for websites, but it is absolutely obstrusive having them in urls. In fact, in anything we'd need to refer to precisely, correctly spelled.
      How are we gonna get to www.informaci(O-w/-accent)n-espa(N-w/-tilde)a.es ? www.tr(umlaut-u)bsal.de ? www.hagend(o-stroken)del.sw ?
      I had a spanish mandrake, and to complete cfdisk you had to type in 'si' with an accent on the 'i'.
      I use american keyboard layout, cus spanish and german screwd it too much up.

      So, international urls is bullshit. The eu should worry about other stuff (sw patents).

      greetings,
      steels

    4. Re:You RTFA by Anonymous Coward · · Score: 0
      This is all bullshit.
      If I imagine you saying this with an Arnold Schwarzenegger voice, it sounds really cool!

      (I know, I know, he's Austrian)
    5. Re:You RTFA by Anonymous Coward · · Score: 0

      Austria = Germany for all practical purposes. And if it weren't forbidden by the so-and-so treaty, the two would have reunited long ago. Which would mean I could study in Vienna and still receive my student loans from the German government. :-\

    6. Re:You RTFA by Anonymous Coward · · Score: 0

      hehe sorry, i have a spanish accent, if any.
      andale hermanito!

    7. Re:You RTFA by Krach42 · · Score: 2, Informative

      Actually, I'm aware of that, but Slashdot seems to have stripped out the accents from my stuff...

      I am aware that the German scharf s is not a capital B. I had it correctly in my submission, but someone who was working on the slashcode thought it would be a good idea to eliminate accents, rather than to possibly HTMLize them.

      Try it yourself: put a scharf s into a Slashdot comment, and see what happens.

      I notice that you DIDN'T complain about the missing accent on the French e, or the missing slash through the Swedish o.

      Now, as a speaker of German for 10 years, I'm going to leave it at that.

      --

      I am unamerican, and proud of it!
    8. Re:You RTFA by Krach42 · · Score: 0

      I wholly agree with you on many points, because it's mostly stupid for languages written in the Latin alphabet.

      But what about Chinese, Korean and Japanese? Should they be forced to use a foreign alphabet when doing any url on their computer?

      That doesn't seem right to me.

      On the other hand, I feel your sorrow for having the scharf s in your name and being anywhere but Germany, Austria or another German-speaking country. I have "oe" in my name, because when my ancestor came to America, they didn't have the umlaut.

      I just wonder what it'll be like for me to go back. "Ich heisse Daniel Foesch... nein, mit O-E, nicht O-umlaut."

      --

      I am unamerican, and proud of it!
    9. Re:You RTFA by Anonymous Coward · · Score: 0

      The article is incomplete: it talks about the extra characters being allowed in the German/Swiss and Austrian domains. IDN supports the full spectrum of Unicode, not limited even to the Basic Multilingual Plane.

    10. Re:You RTFA by MidnightBrewer · · Score: 1

      Japanese won't mind, they've been romanizing their words for 500 years, thanks to Portuguese missionaries. The Japanese use four writing methods in their everyday lives already, so they're pretty flexible.

      There are a lot more than just Chinese, Korean and Japanese to worry about. What about those scripts that are entered right-to-left instead of left-to-right? Do you have the ability to type in Sanskrit URLs?

      The IDN is not a solution to the problem, it merely compounds it (and makes life more difficult for the guys responsible for the router tables.) Even with the animosity that some obviously feel towards the English language (and now the Roman alphabet, by association), it is at least a standard. It has the benefit of not including any special characters right off the bat, without losing the ability to transcribe most, if not all, of them.

      Also, regardless of country, you're probably going to have those particular 26 characters on your keyboard (I'm typing this on a Japanese keyboard, no sweat.)

      --
      "Give a man fire, and he'll be warm for a day; set a man on fire, and he'll be warm for the rest of his life
    11. Re:You RTFA by spyfrog · · Score: 1

      It is the Danes and Norwegians who put a slash through the o. In Sweden we put dots above it instead. =)

    12. Re:You RTFA by KD5YPT · · Score: 1

      Perhaps they should start designing keyboards with an accent key on them?
      Example:
      Type i with the accent key held down.
      Get i with an accent.
      Kind of like getting a capital letter with Shift.

      --
      In US, you can easily buy enough major firearms to wipe out your neighbourhood but a few little fireworks are banned.
    13. Re:You RTFA by rduke15 · · Score: 1

      You already have that!

      Look at your keyboard layouts. In Windows it's called "English (United States) - United States International" or something. Never looked for it in Linux, but it certainly exists.

    14. Re:You RTFA by Haeleth · · Score: 1

      What about those scripts that are entered right-to-left instead of left-to-right? Do you have the ability to type in Sanskrit URLs?

      Sanskrit, like most Indic languages, runs left to right. You're probably thinking of Hebrew and Arabic.

    15. Re:You RTFA by MidnightBrewer · · Score: 1

      My bad. Besides that, though, the point still stands - without the ability to type in the characters in question, we will be unable to visit sites that use characters beyond what we can type with our given keyboard.

      Macs are pretty good about being able to easily access characters within the Western European alphabets (option+u gets you a German umlaut, followed by a, o or u), but it's a nightmare on Windows - a different nonsensical number sequence for each character (alt+0246 - or was it alt+0253?) I would be interested to know what Linux's options are.
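
      For what it's worth, the Windows numbers are less nonsensical than they look. A minimal sketch, assuming a Western European Windows setup where Alt+0nnn selects byte nnn of the Windows-1252 code page (the helper name alt_code_char is made up for illustration):

```python
def alt_code_char(code: int) -> str:
    """Character produced by Alt+0<code>, assuming the Windows-1252 code page."""
    return bytes([code]).decode("cp1252")

for code in (246, 253):
    ch = alt_code_char(code)
    # Show the character and its Unicode code point side by side.
    print(f"alt+0{code} -> {ch} (U+{ord(ch):04X})")
```

      Under that assumption, alt+0246 is the o with umlaut and alt+0253 is a y with an acute accent.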

      --
      "Give a man fire, and he'll be warm for a day; set a man on fire, and he'll be warm for the rest of his life
    16. Re:You RTFA by Krach42 · · Score: 1

      O. du har ratt... jag har mistippt :P

      --

      I am unamerican, and proud of it!
    17. Re:You RTFA by Krach42 · · Score: 1

      I'm aware of Roomaji... read my post.

      My point was that in Japan you are not able to trademark a word that has in it a character not written in either Hiragana, Katakana, or Kanji. This means you can't trademark Roomaji.

      I've also used a Japanese keyboard, and I'm aware of how easy it is to type English with it. (I own a Japanese IBM365 Laptop).

      --

      I am unamerican, and proud of it!
    18. Re:You RTFA by MidnightBrewer · · Score: 1
      I read your post. You referred to Latin alphabets as "foreign" in regards to Korean, Chinese and Japanese.

      You made absolutely no reference in your post to trademarking a word in Japan. Your point, insofar as you made one, was that it was unfair for countries using non-Latin alphabets to have to enter URLs that included them. I've been living in Osaka for over a year and a half now, and studying Japanese for two; as far as I can tell, the Japanese consider the Latin alphabet part of their modern culture; they don't have issues with it. You might want to check your post again.

      And just FYI: there is no specific statement made in the Japanese trademark laws, under "Unregistrable Trademarks," which states that you cannot use Romaji. There is specific mention later that you can protest a registration if it happens to sound like a trademark written in any of the alphabets, including the Latin alphabet. Check here for the latest revision.

      Nur zum Hinweis, Herr Foesch. Das naechstem Mal sollen Sie besser auf Ihrem eigenen Zeug aufpassen, wie?

      --
      "Give a man fire, and he'll be warm for a day; set a man on fire, and he'll be warm for the rest of his life
    19. Re:You RTFA by nutshell42 · · Score: 1
      (actually exactly like a beta, why?)

      look here

      --
      Don't think of it as a flame---it's more like an argument that does 3d6 fire damage
    20. Re:You RTFA by Krach42 · · Score: 1

      I reread my post... I got confused between this post and another that I wrote on the subject: The Other Comment.

      Not until after I sent the comment out did I realize that this could be a different thread from the other one.

      Of significance, I've heard that both IBM, and Microsoft have "ai-bi-emu" and "maikurosofuto" (written in katakana) as their trademarks, respectively, in Japan. The reason given was that one cannot trademark roomaji in Japan.

      Perhaps it's just that they couldn't register as a business in Japan in pure Latin characters.

      But, to get to my points: I've studied Japanese for two years, also (although not anywhere near Japan). And I can read Roomaji, Hiragana, Katakana, and just short of a thousand Kanji.

      If you'll check my above linked post, I asked for real input on the matter, but still gave the understanding I was under.

      And, point of fact, Microsoft is certainly "maikurosofuto" (in Katakana) in Japan. I'm certain of this, because I own a Japanese laptop, and that's all they ever refer to themselves as. Never "Microsoft", no matter how "accustomed" the Japanese may be to Roomaji and foreign words.

      Danke fuer den Hinweis. Ich werde versuchen.

      --

      I am unamerican, and proud of it!
    21. Re:You RTFA by MidnightBrewer · · Score: 1

      Weird. When Windows boots here, it's always with the standard Microsoft Windows boot-up screen. Same for my Mac running OSX. Sony, Panasonic, etc. all refer to themselves by their "English" names. Every Japanese appliance I own has a Latin alphabet name on it.

      --
      "Give a man fire, and he'll be warm for a day; set a man on fire, and he'll be warm for the rest of his life
    22. Re:You RTFA by Krach42 · · Score: 1

      *sigh* Yes, the boot screen doesn't change.

      I'm referring to the material in the programs.

      Like in the "About" boxes.

      --

      I am unamerican, and proud of it!
    23. Re:You RTFA by Krach42 · · Score: 1

      To expand: there exists official Microsoft documentation or text which contains the katakana version of Microsoft (maikurosofuto) as opposed to the actual word Microsoft.

      As for IBM, I got Ai-Bi-Emu from a guy who actually worked for IBM. So there must exist somewhere evidence to support IBM as Ai-Bi-Emu.

      --

      I am unamerican, and proud of it!
    24. Re:You RTFA by Krach42 · · Score: 1

      Found it really fast, too...

      http://www.microsoft.co.jp/

      The title of the page, is "maikurosofuto - hoomu" in katakana.

      --

      I am unamerican, and proud of it!
  77. Babylon 5 by uberdave · · Score: 3, Funny

    microsoft and microsoft for instance are two completely different words.

    Reminds me of that Babylon 5 episode when they find a person named Zathras down on this planet. Ivanova thought she had been talking to Zathras:

    "No, that was not Zathras, that was Zathras. There are 10 of us, all of family Zathras, each one named Zathras. Slight differences in how you pronounce. Zathras, Zathras, Zathras.. You are seeing now?" - Zathras, Babylon 5: Conflicts of Interest

    1. Re:Babylon 5 by geoffspear · · Score: 1

      Which was totally ripped off from Robert Asprin.

      --
      Don't blame me; I'm never given mod points.
    2. Re:Babylon 5 by Anonymous Coward · · Score: 0

      Which was totally ripped-off from George Foreman.

  78. Why not put English as IETF standard? by tjstork · · Score: 1


    If http can be a standard, xml can be a standard, posix can be a standard, why stop there? Why not have English be the standard too? If developers have to wade through the confused babble that is the W3C recommendations, then certainly the rest of the world can drop their own native languages just as surely as we drop our own native implementations of rendering and networking engines.

    English as the world language is surely as efficient as a single standards based unix as a world operating system.

    --
    This is my sig.
    1. Re:Why not put English as IETF standard? by The+Spanish+Ninja · · Score: 1

      English as the world language is surely as efficient as a single standards based unix as a world operating system. Yeah, and about as likely too.

      --
      "I like you, but I wouldn't want to see you working with subatomic particles."
    2. Re:Why not put English as IETF standard? by smcv · · Score: 1

      American English is already used like technical standards on the Internet. People don't drop their native languages (just like app authors don't often drop their native formats), but they do use English in contexts where "cross-platform compatibility" is important (newsgroups/mailing lists/forums - wherever they don't know whether the people involved will understand their first language).

      I quite like having Unicode formats myself, ASCII isn't quite good enough. Never mind transcribing foreign words, ASCII can't cope with correct English typography either :-)

    3. Re:Why not put English as IETF standard? by dajak · · Score: 1

      English is not comparable to W3C recommendations. There is no body maintaining the specification of standard English, unlike for some other European languages.

      English is the M$ windows of languages on the internet. Just because lots of people use it, doesn't mean it is any good. Sentence constructions are often ambiguous, requiring semantics to parse the sentence, and the relation between English phonemes and graphemes is a complete mess. English should be replaced by a neutral exchange language designed for that purpose.

    4. Re:Why not put English as IETF standard? by The+Spanish+Ninja · · Score: 1

      Not only that, but the English language itself isn't even standardized. So do we use American English, or British English, or Australian English, or what? I'm a big fan of "Engrish" myself. Let's face it, a "common world language" is never going to happen, nor should it. We should be able to learn to communicate effectively, as civilized people, without resorting to forcing everyone to accept one single language, whichever one it might be.

      --
      "I like you, but I wouldn't want to see you working with subatomic particles."
    5. Re:Why not put English as IETF standard? by Anonymous Coward · · Score: 0

      >> We should be able to learn to communicate effectively, as civilized people, without resorting
      >> to forcing everyone to accept one single language, whichever one it might be.

      We should all be able to swallow a whole goat, as civilized people, without resorting to forcing everyone to use a knife.

      Your suggestion is about as reasonable. Noble intention, but of no practical value whatsoever.

    6. Re:Why not put English as IETF standard? by mijok · · Score: 1

      Well, if your language uses a different character set - how can you tell someone (in your language) the name of a website? In Swedish I can say the name and people are clever enough to figure out to turn a:s and o:s with dots over them and a:s with rings over them into simple a:s,o:s and a:s (the meaning changes at times, though, as I've pointed out in another post, examples: municipality names Horby and Monsteras - without the dots and rings: www.horby.se and www.monsteras.se mean www.hookervillage.se and www.monstercarcass.se).

      But what do Japanese/Chinese/Arabic/Hebrew/Korean/... (a long list) speakers do? Try to tell people to spell it using English letters? That's impossible since there is no simple "pronounced like this in my language"="write like this in English" conversion.

      --
      Karma. Moderation. Is my .sig good now?
  79. Can we have punctuation while we are at it? by Andy_R · · Score: 1

    Why on earth are hyphens the only allowable punctuation at the moment?

    Is there really any reason to continue to disallow things like:

    10%.com
    10off.com
    #dot.org
    and most importantly Andy_R.com

    while allowing motorhead.com to have their umlauts?

    --
    A pizza of radius z and thickness a has a volume of pi z z a
    1. Re:Can we have punctuation while we are at it? by Tazzy531 · · Score: 2, Informative
      There are technical reasons for disallowing certain characters: they are "reserved characters" in URLs.
      • The ? marks the end of the path and the beginning of the parameters (the query string).
      • The & delimits the parameters.
      • The % is used for escapes (e.g. %20 is a space in URL parameters).
      • The = is the assignment operator in URL parameters.
      • The # marks link anchors.


      There are a couple others, but I don't remember them offhand... So in other words, these characters are unusable for a reason.
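
      To see the roles listed above in action, here is a small sketch using Python's standard urllib.parse. The URL itself is made up for illustration; only the library calls are real.

```python
from urllib.parse import urlsplit, parse_qs, unquote

# Hypothetical URL, chosen only to exercise each reserved character.
url = "http://example.com/search?q=10%25+off&lang=en#results"

parts = urlsplit(url)
host = parts.hostname          # the host part, before the path
query = parse_qs(parts.query)  # '?' starts the query, '&' separates pairs, '=' assigns
fragment = parts.fragment      # '#' starts the fragment (link anchor)

print(host)      # example.com
print(query)     # {'q': ['10% off'], 'lang': ['en']}
print(fragment)  # results
print(unquote("%20") == " ")  # True: %XX is a percent-escape
```

      Note that a literal percent sign has to be written as %25 precisely because % introduces an escape.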
      --


      _______________________________
      "I'm not Conceited...I'm just a realist..."
  80. programs programmed in foreign languages by Tumbleweed · · Score: 1

    > Think outsourcing to india is so great? Wait til your next batch of outsourced code cannot be read, because it's not in english anymore!

    You've obviously never seen a program coded in Perl. *cringe*

    1. Re:programs programmed in foreign languages by TKinias · · Score: 1

      scripsit Tumbleweed:

      You've obviously never seen a program coded in Perl. *cringe*

      Oh, don't be such a killjoy. CJK ideograms as variable names are k3wl!

      --
      In principio creauit Linus Linucem.
    2. Re:programs programmed in foreign languages by zeath · · Score: 1

      Oh, Perl can be perfectly readable as long as you use strict. And keep your variable names long. And avoid anything with regexps. And fully qualify all package references. And avoid interpolation. And avoid default variables. And don't use DBI, CGI, or any GUI. And don't use references or typeglobs. And use perltidy religiously.

      By this point you'd just be using Java or VB or some other equivalently castrated language. I'm not sure what the moral of the story is. I suppose it could be that Perl is messy because it's useful. And tasty. Like a burrito.

  81. Re:Bad idea but bound to happen with todays thinki by Anonymous Coward · · Score: 0

    I agree that the world has many languages and many nationalities, cultures, etc. that should have ready access to the Net, and for whom English is not their native tongue. But I must make two important points that go hand in hand.

    1) English is the most widely used language in the world, using a ROMAN LATIN alphabet that many other languages largely hold in common with it.

    2) Complicating matters is antithetical to the very nature of the Net. We need a lowest common denominator for addressing. Otherwise large sections of the Net will become segmented along national and linguistic boundaries.

    I like being able to browse Japanese web sites, for instance. But if I had to use kanji to get there? I'd have a hard time doing so. Most English speakers would likewise. This is simply because they already speak the most common tongue. Conversely, numerous Japanese citizens are familiar with the Roman characters, and even some basic English.

    We should consider this carefully.

    As my Pappy used to say, "you just can't get here from there" :)

  82. Well now by Anonymous Coward · · Score: 0

    isn't slashdot rendering html like shit today. shouldn't you guys be testing on a test server or something?

  83. Why not do it right? It's only a little extra work by Anonymous Coward · · Score: 0


    As far as I can tell, every client's "gethostbyname()" is going to have to be modified to support this, no matter how it's implemented. Right?

    That's a huge number of machines to be upgraded anyway. So why not do a clean design? The added cost of upgrading the DNS servers will be minuscule compared to the client-upgrade costs.

    This reminds me of the flaw with Verisign's SiteFinder so-called "service" -- where they mistakenly put a client-side feature on the server-side. With Punycode translation, they seem to be making the opposite mistake -- they're applying the character translation on the client-side instead of the server-side where it belongs -- the server already provides translation services, so why not simply add in the Unicode translation as well?

  84. Server support by dmelomed · · Score: 1

    It's a crappy solution to get rid of the symptoms - 8-bit brokenness of DNS and SMTP servers and web browsers. Those apps should be fixed in the first place to be 8-bit clean so we can finally use something other than ASCII that actually makes sense, such as UTF-8.

    1. Re:Server support by LordNimon · · Score: 1

      Then what about UTF-7?

      --
      And the men who hold high places must be the ones who start
      To mold a new reality... closer to the heart
    2. Re:Server support by Anonymous Coward · · Score: 0

      That's a "boil the ocean" solution. There are thousands of programs out there that know what legal DNS labels look like, and nobody will adopt i18n if some random subset of their apps and embedded hardware stops working. I agree we should revise the DNS protocol (IPv6 would be a good time to do it) but you have to maintain interop with RFC 1034 clients basically forever--this is a fairly painless way, when i18n-aware DNS servers or config file editors are available.

    3. Re:Server support by BJH · · Score: 1

      He said something that makes sense, dude.

  85. It's very simple to divide the world by HarveyBirdman · · Score: 1
    *.Dar-al-Islam

    *.Dar-al-Kufr

    *.Dar-al-Harb

    What more do you need? :-)

    --
    --- Ban humanity.
  86. Backwards compatability by Stephen+Samuel · · Score: 3, Informative
    Why not extend DNS to support Unicode? That way there'd be no translation or other crap to go through.

    Sounds like a great idea.... If you're willing to re-implement the DNS code in my Win-95 box.... or on my Amiga-4000. How about my 10 year old Apollo workstation or the SUN-3 that's still working just fine, thank you. etc. etc.

    A lot of old DNS implementations would choke (and properly so) on UTF-8 encoded DNS names. We probably could have seeded the needs of the future by saying that IPv6 DNS servers should support Unicode, but I think that even that boat has been missed (or is quickly leaving the dock).

    In the meantime the old DNS and its anglo-centric presumptions and restrictions are with us for the next few years (or decades, as the case may be). Clearly some people feel the need to live within those restrictions.

    --
    Free Software: Like love, it grows best when given away.
    1. Re:Backwards compatability by dmelomed · · Score: 1

      "A lot of old DNS implementations would choke (and properly so) on UTF-8 encoded DNS names."

      Those old implementations need to be fixed instead of forcing new broken standards on the rest of us.

      "In the meantime the old DNS and it's anglo-centric presumptions and restrictions are with us for the next few years (or decades, as the case may be). Clearly some people feel the need to live within those restrictions."

      Aaaarg!

      Those people need to wake up and redesign their broken software instead of introducing yet more brokenness into the standards as workarounds. The standards unfortunately now follow broken implementations. Cough BIND cough.

    2. Re:Backwards compatability by Anonymous Coward · · Score: 0

      Go back and read the parent again: Win-95, SUN-3, these are old obsolete systems that will never be upgraded. They are not OSS, so unless the original vendor does the upgrade . . . and they have no ($$$) incentive to do so. You are asking for the impossible.

      If you had said upgrade then ok, (stupid but possible).

      In this case the broken software is the unicode DNS servers etc.

    3. Re:Backwards compatability by Anonymous Coward · · Score: 0

      It amazes me how unbelievably arrogant and stupid all Anglo-Saxons are.

      You are pushing some 99.9% broken system to everyone only because you are not willing to fix your systems.

      i18n and l10n should prohibited by castration by all who think ascii (ponyfuckingcode) is enough. Seems like most of the americans, btw.

      The time for UTF-8 is NOW! We should start using it everywhere.

      Besides, UTF-8 WILL come, whether you and ICANN and other ascii maniacs want it or not. There is nothing you can do, except fix your broken 7-bit-only scrapware.

      Let the luddites who want to live in their caves live in their caves!

    4. Re:Backwards compatability by Anonymous Coward · · Score: 0
      "WILL come"? If ICANN's IDN committee doesn't approve it, it's Not Going To Happen.

      STD 13 still defines DNS, and software that conforms to it is not "broken". IDNA is already a Proposed Standard--you'd better hurry up and find a better solution to the interoperability problem if you expect to convince the IETF.

    5. Re:Backwards compatability by Stephen+Samuel · · Score: 1
      Those people need to wake up and redesign their broken software instead of introducing yet more brokenness to the standards for work-arounds that (the standards) unfortunately now follow broken implementations. Cough BIND cough.

      It's not the software that's broken. The protocol defines (a subset of) printable ASCII, and that's what properly designed software filters for. DNS was (if I remember correctly) designed in the pre-UTF days. At that time, international character defs were something of a hodge-podge, and I think that printable ASCII-7 was chosen as a least-common-denominator.

      Even now, redefining the protocol to accept other printable characters without such hacks as they're using now could lead to some interesting problems (i.e. what's printable??? Am I going to start seeing domain names with smiley faces and Chinese characters in them?). In many ways, limiting domain names to European scripts would be no less (perhaps even more) ego-centric than the old decision to use ASCII-7.
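
      As a rough illustration of the kind of filter properly designed software applies, here is a sketch of the classic letters-digits-hyphen rule for a single hostname label. The function name is_ldh_label is invented for this example, and real resolvers check more than this (total name length, for instance).

```python
import re

# Classic hostname label rule (letters, digits, hyphen): 1 to 63 characters,
# not starting or ending with a hyphen.
LDH_LABEL = re.compile(r"(?!-)[A-Za-z0-9-]{1,63}(?<!-)")

def is_ldh_label(label: str) -> bool:
    """True if the label contains only letters, digits and inner hyphens."""
    return LDH_LABEL.fullmatch(label) is not None

for label in ("motorhead", "xn--bcher-kva", "Andy_R", "10%", "#dot"):
    print(label, is_ldh_label(label))
```

      An encoded label such as xn--bcher-kva passes this filter unchanged, which is the whole point of the hack: software written to the old rule never notices.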

      --
      Free Software: Like love, it grows best when given away.
  87. Who types URLs? by Royster · · Score: 2, Insightful

    Geeks do, but your average surfer does not. They go clicky clicky on the results returned by the search engine or clicky clicky on the link someone emailed them or clicky clicky on the link from some other website.

    Most users don't even *know* that you can type stuff in the Address field.

    --
    I have discovered a truly marvelous sig, unfortunately the sig limit is too small to contain i
    1. Re:Who types URLs? by Minna+Kirai · · Score: 1

      your average surfer does not. They go clicky clicky on the results returned by the search engine

      If that were true, it would be an argument against changing DNS in any way. Since normal users never look at URLs, who cares what character set they use?

  88. It's not for trolls. by dmelomed · · Score: 3, Insightful

    "djbdns doesn't support unicode either, although it doesn't rely on standard c-libraries, so unicode support might only take a few weeks to add."

    djbdns is 8-bit clean. Use UTF-8 all you want right now.

    1. Re:It's not for trolls. by jannotti · · Score: 1

      Are you claiming that if I use UTF-8 to encode a string, I will never get a bytestring that contains a 46 (that is, a dot ".")?

      If you are claiming that, then I claim you are wrong. If you are not claiming that, then trying to look up XXX.example.com, where XXX encodes to something that contains a dot, will not work (because it will look like YYY.ZZZ.example.com, and ZZZ.example.com is not going to exist).

      You can't just say "Such and such is 8-bit clean, so encode however you want" if "such and such" interprets some characters specially. You need to ensure those characters don't show up in your encoding.

    2. Re:It's not for trolls. by dmelomed · · Score: 1

      "You can't just say "Such and such is 8-bit clean, so encode however you want" if "such and such" interprets some characters specially. You need to ensure those characters don't show up in your encoding."

      No. It's 8-bit clean, so you can serve UTF-8 data. Other software needs to be fixed, including web browsers.

    3. Re:It's not for trolls. by divec · · Score: 2, Insightful
      Are you claiming that if I use UTF-8 to encode a string, I will never get a bytestring that contains a 46 (that is, a dot ".")?

      That's correct - no unicode codepoint apart from [FULL STOP] will cause a \x2E to appear in a UTF-8 stream. UTF-8 encodes the first 128 code points of Unicode using the identical ASCII values (which all have the eighth bit set to 0), and then only using combinations of the other 128 byte values (which all have the eighth bit set to 1) to encode every other character. It's very cool - that's why existing software doesn't usually need much modification to support UTF-8.
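
      If you don't want to take that on trust, it can be checked by brute force. A small sketch (the helper name utf8_has_stray_ascii is made up; the name zürich.example is just an illustration):

```python
def utf8_has_stray_ascii(ch: str) -> bool:
    """True if a non-ASCII character encodes to any byte below 0x80."""
    return ord(ch) >= 0x80 and any(b < 0x80 for b in ch.encode("utf-8"))

# Scan every Unicode code point (surrogates cannot be encoded, so skip them).
offenders = [
    cp for cp in range(0x110000)
    if not 0xD800 <= cp <= 0xDFFF and utf8_has_stray_ascii(chr(cp))
]
print(len(offenders))  # 0

# The only 0x2E byte in the encoded name is the real dot.
print("zürich.example".encode("utf-8").count(b"."))  # 1
```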
      --

      perl -e 'fork||print for split//,"hahahaha"'

    4. Re:It's not for trolls. by Carewolf · · Score: 2, Insightful

      Yes, he is claiming that, and he is right. You should check your facts.

      UTF-8 only uses non-ASCII byte values to encode non-ASCII characters. That's one of the things that makes it really neat and easy to convert to. It also means that you can jump into a UTF-8 stream at any point without getting out of sync and receiving garbage. This makes it more powerful than UTF-16.
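
      The resynchronization part can be sketched in a few lines. The function name resync is hypothetical; the only fact it relies on is that UTF-8 continuation bytes always have the bit pattern 10xxxxxx.

```python
def resync(data: bytes, pos: int) -> int:
    """Advance pos to the next byte that starts a character.

    Continuation bytes always look like 0b10xxxxxx, so they are easy to skip.
    """
    while pos < len(data) and (data[pos] & 0xC0) == 0x80:
        pos += 1
    return pos

data = "zürich.日本".encode("utf-8")
# Index 2 is the second byte of 'ü', i.e. the middle of a character.
start = resync(data, 2)
print(start)                        # 3
print(data[start:].decode("utf-8")) # rich.日本
```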

    5. Re:It's not for trolls. by BJH · · Score: 1

      Those characters don't show up in UTF-8. It was designed that way.

      In fact, the only encoding that I know of that was stupid enough to overlap with ASCII is SJIS - used by (you guessed it) Microsoft. (SJIS includes characters that contain the ASCII code for \.)
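
      A quick way to see the overlap, assuming Python's built-in shift_jis codec is a fair stand-in for what your software uses:

```python
text = "表"  # U+8868, a very common kanji
sjis = text.encode("shift_jis")
utf8 = text.encode("utf-8")

print(sjis.hex())                   # 955c
print(0x5C in sjis)                 # True: second byte equals ASCII backslash
print(any(b < 0x80 for b in utf8))  # False: UTF-8 stays out of the ASCII range
```

      Any code that scans such a byte string for a backslash without decoding it first will trip over the second byte.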

  89. Compatibility question by Psychic+Burrito · · Score: 2, Interesting
    Does anybody know if this will just work "out of the box" with every computer that can produce umlauts?

    I'm asking because today, I've tried out the Netsol way of doing umlauts and they don't work at all with my Mac OS X and Safari: None of the listed domains work. The page lists a "plugin" that every web user is supposed to install, but it's Win only (of course...) and it's quite silly to have a domain with umlauts if you have to tell all your customers "before visiting me, please install this plugin"...

    Any idea if this new way works in all circumstances where the user has an international keyboard? Thanks!

  90. How about by Wolfier · · Score: 1

    Base64-ed UTF-8? String comparisons can still be carried out in the encoded form anyway. Nothing except the browsers needs to be updated?

    1. Re:How about by cicadia · · Score: 1
      Base64 isn't acceptable for DNS -- it has to be case-sensitive, while DNS isn't, and it uses '+', '/', and '=', which are all illegal in domain name components.

      Also, the RFC requires that a domain name consisting entirely of low-ASCII characters be left unchanged by the new standard. A straight Base64 or similar encoding would end up mangling existing ASCII domain names as well.
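
      Both objections are easy to demonstrate. A sketch using Python's base64 module and its built-in "idna" codec (which implements the 2003 version of the IDN spec); the label "example" is arbitrary:

```python
import base64

label = "example"
b64 = base64.b64encode(label.encode("utf-8")).decode("ascii")
print(b64)  # ZXhhbXBsZQ==  (mixed case, plus '=' padding)

# DNS compares names case-insensitively, so a resolver may treat the label as
# if it were lowercased. Lowercasing Base64 changes what it decodes to.
folded = base64.b64decode(b64.lower())
print(folded == label.encode("utf-8"))  # False

# The "idna" codec leaves plain ASCII labels alone.
print(label.encode("idna"))  # b'example'
```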

      --
      Living better through chemicals
  91. I for one... by corebreech · · Score: 2, Redundant

    ...most certainly do not welcome our new Unicode-munging overlords.

    I don't care what the issues are. I have had it up to HERE with charset issues! ENOUGH ALREADY!

    If you can't do it using UTF-8, don't do it at all!

    Dammit.

  92. Japanese, for one, might prove interesting by minkeyboodle · · Score: 1

    One thing that's nice about using roman-character-based languages for DNS is that there's only one REAL way to spell things. But what about in Japanese where you can use kanji, hiragana/katakana, or romaji?

    For example, what if someone wants to have the domain "nanisore.com" (translated: "what is that?"). Do they get the domain in kanji/hiragana, all hiragana, or all romaji? I guess they have to buy all combinations if they want the domain exclusively. Ugh.

    (That must be the point of this whole exercise. Make the domain registrars more money by creating the need to register multiple domains where one would have sufficed previously! :) )

  93. And also... by Anonymous Coward · · Score: 0

    Microsoft.com
    Microsoft.com

    and
    Microsoft.com

    It's tough to tell the difference, but it's there for those who can see it.

    1. Re:And also... by Anonymous Coward · · Score: 0

      There is no difference. I did a view source to verify.

    2. Re:And also... by Anonymous Coward · · Score: 0

      Look again!

  94. Let me get this straight, please? by Anonymous Coward · · Score: 0
    Punycode, as described in the RFC referenced in the story, is a Unicode transformation into the subset of octet values between 0 and 127 which are permitted in the (relevant-RFC-compliant-) DNS domain names.

    To signal use of this encoding, Switch et al. propose prefixing domain names that use it with the sequence "xn--".

    Does anyone else see a problem here?

    <thememusic>
    Your assignment, should you choose to accept it, is to use the xn-- prefix and punycode to register a domain in the Swiss, German, or Austrian country domains which will transliterate to g*****.** in the asian script of your choice. ....
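
    For anyone who wants to try the scheme described at the top of this comment, Python ships "punycode" and "idna" codecs (the latter follows the 2003 IDN spec). A minimal sketch, with bücher.example as an arbitrary test name:

```python
name = "bücher"

# Raw Punycode: ASCII characters first, then a delimiter and the encoded rest.
print(name.encode("punycode"))                  # b'bcher-kva'

# The "idna" codec adds the xn-- prefix to each label that needs encoding.
encoded = "bücher.example".encode("idna")
print(encoded)                                  # b'xn--bcher-kva.example'

# Decoding reverses the transformation.
print(encoded.decode("idna"))                   # bücher.example
```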

  95. Spoof with accented characters by Cardbox · · Score: 2, Informative

    There's no need to put accents on things, you can spoof just as well without. For example: the Greek omicron, Russian lowercase o, and Latin lowercase o all look identical... but they are all different Unicode characters!
    Unless the registries all implement some sort of canonicalization, owners of domain names containing the letter "o" are going to have a combinatorial explosion!
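
    To put a number on that explosion, here is a sketch that swaps each "o" in a name for any of the three lookalikes. The helper name variants is invented, and the raw "punycode" codec is used only to show that every spelling becomes a different registrable string.

```python
import unicodedata
from itertools import product

# Three visually similar letters: Latin o, Greek omicron, Cyrillic o.
LOOKALIKES = ["o", "\u03bf", "\u043e"]

def variants(label: str) -> list:
    """Every spelling of label with each 'o' swapped for any lookalike."""
    slots = [LOOKALIKES if ch == "o" else [ch] for ch in label]
    return ["".join(combo) for combo in product(*slots)]

for ch in LOOKALIKES:
    print(f"U+{ord(ch):04X} {unicodedata.name(ch)}")

spoofs = variants("microsoft")
print(len(spoofs))                                  # 9, i.e. 3 ** 2
print(len({s.encode("punycode") for s in spoofs}))  # 9 distinct encodings
```

    A name with five such letters would already have 243 spellings, and that is with only one confusable letter considered.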

  96. This is important.. by k98sven · · Score: 4, Interesting

    Just to diverge, I'd like to represent the non-english speaker view here.

    In most of the languages with 'funny accents' like umlauts, these characters often have a completely different pronunciation, and are often considered to be a completely different letter than without the 'accent'.

    Simply 'brushing off the dirt' and removing the 'accent' thus changes the word. Sometimes with weird results.
    Just ask someone from the town of Moensteraas, Sweden.
    Their website contains mostly municipal information intended for Swedes, but due to the restrictions of DNS, the name is instead spelt 'monsteras', which means 'monster-carcass' in Swedish.

    Obviously, these people would be happier spelling it with umlauts on the o, and a ring over the a.

    1. Re:This is important.. by Anonymous Coward · · Score: 0

      Why monsteras instead of moensteraas?

    2. Re:This is important.. by Anonymous Coward · · Score: 0

      Let me give the evil example from the Turkish language. We use ISO-8859-9 Latin...

      SIKILDIM means "I am bored". Yes, that is the dotless "i" character of the Turkish alphabet.

      sikildim means= I am fucked

      This topic also shows how US-centric the Slashdot community is. It's also amazing that most of those clever geeks have no respect for others...

      The first thing that comes to mind on such a subject is goatse! With 5 points...

  97. Just wonderful. by Geekenstein · · Score: 1

    I've had a hard enough time trying to educate my family to check that a URL in an email is actually from the domain they say it is from (ebay, etc).

    Now I have to deal with teaching them to recognize domains with accented characters as fraudulent. Here's to another wave of harder-to-detect social engineering.

  98. This article reminded me by hookedup · · Score: 1

    of an article I read a while back, about "URL hiding by using alternate character set". I did a little bit of searching, and came up with this:

    One problem with non-Latin scripts is that cybersquatters could begin registering non-Latin versions of popular domain names in order to divert viewers from intended destinations. Two Israeli students did just that in order to make an international point: They registered microsoft.com using the Russian Cyrillic "o" and "c," an international domain that looks exactly like microsoft.com in English even though it is in fact a different domain name.

    Whole text can be found here
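
A rough illustration of why the two names are different domains, using Python's built-in `idna` codec (which implements the older IDNA 2003 rules). The spoofed string below is a made-up reconstruction with Cyrillic "o" and "c", not the students' actual registration:

```python
# Hypothetical spoof: the "c" and both "o" letters are Cyrillic.
genuine = "microsoft"
spoof = "mi\u0441r\u043es\u043eft"

def to_ace(label):
    """Encode one label to its ASCII form with Python's IDNA (2003) codec."""
    return label.encode("idna").decode("ascii")

print(genuine == spoof)   # compares code points, not glyphs
print(to_ace(genuine))    # plain ASCII passes through unchanged
print(to_ace(spoof))      # gets an xn-- form, so DNS sees another name
```

On the wire the spoof is just another ASCII label starting with "xn--"; the confusion only exists where the decoded form is displayed.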

  99. Becuase Esperanto already exists by Overzeetop · · Score: 1

    Here we are arguing about new standards supplanting old ones. We've already got Esperanto, and - being technical people here on Slashdot - we already understand it. Let's just make it the standard.

    Oh, it didn't catch on? Damn, I thought I had this one licked.

    Anyway, if the characters won't render properly on every machine, then it'll be a great day for the crooks out there who already do a pretty good job of fooling the masses (can you say paypa1.com? I know you could.)

    FWIW, when talking about languages, especially the rise of English, I view most of the comments from the French as sour grapes. I mean, how insulting is it to have to admit that the Lingua Franca of the world is now American (which is somewhat distinct from, but clearly similar to "English," which is spoken by very few - mostly in primary education classes devoted to that topic).

    --
    Is it just my observation, or are there way too many stupid people in the world?
    1. Re:Becuase Esperanto already exists by dajak · · Score: 1

      American is not a language. It is a dialect of English, with only minor differences. Brits and Americans even understand each other's spoken and written language.

  100. Did they really do this? by Anonymous Coward · · Score: 0

    Did they really require fags to wear a pink triangle in Nazi Germany? Those crazy krauts, gotta love their sense of humor!

  101. Know your French by Anonymous Coward · · Score: 0

    Only the English drive on the left side, and they aren't in the EU last I checked (they don't even use euros)

    1. Re:Know your French by Anonymous Coward · · Score: 0

      The Irish drive on the left side too, and both *are* in the EU. The English don't use the euro, just like the Swedish, but they're still part of the EU.

  102. There is a sort of computer attack based on this by Maimun · · Score: 1
    As far as I understood, they are augmenting the set of symbols only with letters from West European languages that differ visually from the letters in the English alphabet.

    Now, suppose that Cyrillic letters are added to the DNS in the future. Many people say that Unicode should be used, and that implies Cyrillic too. It is impossible to visually distinguish Cyrillic "o" from Latin "o", yet their Unicode code points are different. In other words, you cannot be sure that the URL containing "o" is the one you want -- it may be one that is visually the same, yet the codes behind it may be different. Thus an attacker can lure people with URLs that look proper, yet resolve to completely different IP addresses, namely the attacker's.
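
One possible countermeasure is to flag labels that mix scripts. The following is only a rough sketch: it guesses a letter's script from the first word of its Unicode name, which is a shortcut and not a real script lookup, and the function names are made up for this example.

```python
import unicodedata

def scripts_in(label):
    """Rough script guess: first word of each letter's Unicode name."""
    return {unicodedata.name(ch).split()[0] for ch in label if ch.isalpha()}

def looks_mixed(label):
    """True if the label appears to combine letters from several scripts."""
    return len(scripts_in(label)) > 1

print(scripts_in("microsoft"))          # all Latin
print(scripts_in("micr\u043esoft"))     # one Cyrillic "o" slipped in
print(looks_mixed("micr\u043esoft"))
```

Note that a name written entirely in Cyrillic look-alikes would pass this check, so it narrows the attack without closing it.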

  103. Multiple Non-Technical Problems by angedinoir · · Score: 2, Interesting

    First of all, this opens a huge hole for url hijacking and obfuscation.

    Say for instance, you get a spam that has a url to http://www.microsoft.com/freeoffers

    You too were tricked: if you look closely, the normal i has been replaced with an i carrying an acute or grave accent (Slashdot strips these, btw). Anyone who doesn't use accents (English, Japanese, Chinese, etc.) probably won't catch the minor detail and will probably think that it's really pointing to www.microsoft.com.

    This is very similar to, but less obvious than using:

    http://www.microsoft.com@via.gra.biz/offers

    Most non-tech internet users will also believe this to be Microsoft's web-site. Spammers will have a field day with all of the new opportunities.
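
For the "@" trick above, a short Python check with the standard `urllib.parse` module shows which part of such a URL is actually the host (the helper name `real_host` is made up for this example):

```python
from urllib.parse import urlsplit

def real_host(url):
    """Return the host a client would actually connect to."""
    return urlsplit(url).hostname

url = "http://www.microsoft.com@via.gra.biz/offers"
parts = urlsplit(url)
print(parts.username)   # everything before "@" is only user info
print(real_host(url))   # this is where the request goes
```

Everything before the "@" is treated as login information, so the familiar name never takes part in the DNS lookup.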

    The second non-technical problem: say I want to go to a Japanese web-site that doesn't have an English URL. If I don't know kana/kanji (as most people outside Japan don't), then I don't know what to type to get the correct Japanese. I would have to get a dictionary and look up each character to figure out what to type.

    I agree that it's lame to only have it in English, but at this point, any country that uses the internet already has the ability to type English, whereas now everyone will need to be able to type in Japanese, Chinese, Russian, Greek, etc, etc, etc....

  104. Better usability by neves · · Score: 1
    It will be a great usability advance for non-ASCII languages. A big problem with Portuguese domains arises when you have an accented word in a domain name. A lot of people try to visit the accented address and get a "domain not found" error. Now everyone can register the accented and the unaccented version and have their due number of visitors. We will have a lot more good domains to choose from.

    If you have a domain name like www.samba-choro.com.br here in Brazil, the registrar won't let anyone but you register www.sambachoro.com.br (without the hyphen). Hope they are smart enough not to let anyone else register the accented version too.

  105. Next step, Chinese, Klingon, and Heiroglyphics by extrasolar · · Score: 1

    Why not? We need more fonts, better input methods, and ideally, better keyboards. Our 26 letters, and the ten or so other symbols isn't enough anymore--for the internet guru, this restriction should feel restraining. 101-key keyboards? Please. Keyboards with more keys aren't exactly hard to make.

    The only real obstacle is getting past the sentiment that English or Latin-based languages are the only important natural languages. The foundation technology for true multi-lingual access is already here.

  106. Re:Bad idea but bound to happen with todays thinki by Anonymous Coward · · Score: 0

    > It might surprise some people, but there are quite a lot of people who don't speak a single word of english.
    Really? I have never spoken to any of them ;-)

  107. Re:Taco, why did you remove the accents from slash by Pflipp · · Score: 1

    Oh, the typewriter days.

    I'm getting all warm inside.

    You know, I'm only 23 and I can still recall doing those accent tricks on a typewriter? (A good ole Dutch one with a letter "ij" under the left little finger -- now why did we have to loose that letter in the Information Revolution when the Germans still got their Umlaut? We have been walked over!)

    And having a C= 64 of course. You know, my grampa has seen some history passing in his life, but so have I! ;-)

    --
    "We can confirm that Debian does *not* ship the version with the trojan horse. Our version predates it." [CA-2002-28]
  108. A few questions by Psychic+Burrito · · Score: 1
    A few questions, maybe somebody could help me here...
    • Since domains can have up to 63 chars, and the encoding takes away 5 chars plus 3 chars per umlaut, the longest domain name only consisting of umlauts is 19, right? And registrars will have the tedious task to explain to every customer that the length of their domain name is no longer fixed, right?
    • I have a basic understanding of entering a domain name and what happens after it. But will this work with every OS and browser out of the box?
    • Since there are two ways of writing those domain names, which layer should do the transcoding?
      • If it is the application layer, how are hyperlinks supposed to work with not-yet-upgraded browsers?
      • If it is the display layer (links are always used in their complicated form unless the end user sees them - I know this is no real network layer but bear with me), should we create hyperlinks to the complicated form?
    • So far, I've only heard marketing talk about "how cool this is", but has anyone really ensured that absolutely nothing is broken? Because I've tried out Verisign's extended charset examples and not a single one of the provided links worked on my machine. Can anybody tell me if it worked for them? Thanks!
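
The length question in the first bullet can be checked empirically. Below is a minimal sketch using Python's built-in `punycode` codec. It skips the nameprep step (case folding and normalization) that a real IDNA implementation applies first, and the helper names `ace_label` and `fits` are made up for this example. Punycode is a variable-length encoding, so the cost per character is not a fixed number; the loop simply prints what each label costs.

```python
MAX_LABEL = 63  # DNS limit on a single label, in octets

def ace_label(label):
    """ASCII form of one label: 'xn--' plus its Punycode encoding."""
    if label.isascii():
        return label  # plain ASCII labels are left alone
    return "xn--" + label.encode("punycode").decode("ascii")

def fits(label):
    """True if the encoded label still fits within the DNS label limit."""
    return len(ace_label(label)) <= MAX_LABEL

# Typed length, encoded length, and whether the result still fits.
for name in ["bücher", "ü" * 19, "ü" * 40]:
    print(len(name), len(ace_label(name)), fits(name))
```

A registrar's web form would presumably run a check like `fits` on the encoded form rather than on what the customer typed.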
    1. Re:A few questions by Psychic+Burrito · · Score: 1

      Oh yeah, another question: Since domains normally have to have at least 3 chars (in Switzerland, at least), but since the domain o.ch will be encoded in quite a long fashion, will we be able to register o.ch? Would be cool, nice and short!

    2. Re:A few questions by The+Spanish+Ninja · · Score: 1

      None of their links worked for me either, but then again, I'm not running Windows. If the transcoding is handled by the application layer, it becomes a situation of "wait for the upgrade" or, to look at it from a M$ point of view, "push now, patch later." However, that's still more feasible than creating hyperlinks to the complicated form, because then the webmaster still has to remember all that crap, and some of those transcoded domain names (especially the Asian ones) look so much like a random string of babble that it just becomes ludicrous. So what you have here is a string of crap with a neat little "xn--" in front of it. So...why not use that as some kind of little subroutine trigger, use a standardized assignation of a fifth and possibly sixth character as some kind of country code, the servers modify the packet accordingly, and the routers handle the rest? Or maybe I'm just stupid...

      --
      "I like you, but I wouldn't want to see you working with subatomic particles."
    3. Re:A few questions by Anonymous Coward · · Score: 0

      The registrar will provide an online form where people can write their domain name in their own language. If it is too long then the name won't be registered. That way people won't even have to bother with the encoded form of their domain name.

      If your OS does not support it you will see the encoded ASCII characters instead of the Chinese characters in your browser. If you want to write an email to an address that you cannot type on your keyboard you can still type the encoded address made of ASCII characters.

      The hyperlink will contain the encoded ASCII characters in the web page but the browser will display the decoded characters. So if you open the web page with Notepad you will see the encoded hyperlink.

      Some companies have introduced proprietary encodings. So your problem with Verisign may come from a proprietary Verisign design.

    4. Re:A few questions by Psychic+Burrito · · Score: 1
      Thank you, you've been very helpful.

      A few more questions:
      As I understand it, I have to use a new browser in order to use these URLs, right? Because Mozilla says it already supports it. So I guess a hyperlink can be in both forms: the encoded or the decoded version, right? Does this mean:

      • Unless I upgrade my mail client, I cannot send emails to these new domains?
      • A hyperlink can be encoded in two ways, and unless it's encoded in the long form, any not-yet-upgraded browser won't be able to follow links?
      • Since browsers need to be upgraded, doesn't this mean that many other apps need to be upgraded as well, like IM-clients, web-grabbers, filesharing apps and search-engine spiders?
    5. Re:A few questions by Anonymous Coward · · Score: 0

      Client support is only needed to display and edit IDNs conveniently. The Punycode-encoded version of an IDN will be ugly but work fine everywhere.
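
As a rough illustration of that split between the wire form and the display form, here is a round trip with Python's built-in `idna` codec (IDNA 2003 rules); the domain is a made-up example:

```python
# Hypothetical example domain in its encoded (wire) form.
encoded = "xn--bcher-kva.ch"

# What an IDN-aware client would show to the user.
decoded = encoded.encode("ascii").decode("idna")
print(decoded)

# What any client, upgraded or not, hands to the resolver.
print(decoded.encode("idna"))
```

An old client never performs the first step: it passes the "xn--" string through untouched, which is why the encoded form keeps working.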

  109. Re:Bad idea but bound to happen with todays thinki by TKinias · · Score: 1

    scripsit isaac338:

    Next time you go to a country the native language of which you can't understand, try planning your whole trip without once reading an English translation of any map or sign.

    I would add that anything Indo-European using the Latin, Greek, or Cyrillic alphabets doesn't count. A smart American can fake his or her way through a lot of signs in anything from Spanish to Russian. Try that with Hebrew or Korean.

    --
    In principio creauit Linus Linucem.
  110. Interesting, but... by dacarr · · Score: 1

    Why not just incorporate this into the IPv6 standards?

    --
    This sig no verb.
  111. Re:Bad idea but bound to happen with todays thinki by Anonymous Coward · · Score: 0

    I learned that your a confrontational twit from MTV. Oh, baby, baby...

  112. Re:Oh great... Another American... by Anonymous Coward · · Score: 0

    Another ignorant American? The American historical perspective is constrained to the last century, it seems. The world has been going on for a lot longer than you think, 'Joe'.
    The most glaring evidence of this conclusion is the confusion in the parent about who is the historical aggressor and who does the majority of the surrendering.
    France is one of the all-time most aggressive countries. They've taken on almost all of Europe at once, and prior to the last century, no single country managed to defeat them. And the only reason France collapsed so badly in WW2 is that they were hurting from the double impact of the depression and massive losses from WW1. Surrender and subversion was their only option, seeing as how the Americans did not see fit to provide backup for them.
    Does this matter to the high-school-history-challenged American? Nope. Apparently not...
    And Germany, a traditionally pacifist country, was involved in just a few wars in its history (and only in the two of recent times were they aggressors)
    sigh...

  113. FFFFF.... by burbilog · · Score: 1

    FUCK!!!!!!

    I thought this idea was put into the grave a year ago. It required a crazy plugin and of course nobody had it.

    Now to the problem. In Russian some letters look like Latin ones and others don't. So if you type aroma.ru you get to the company site. Now imagine that they registered the name apoma.ORG. In Cyrillic. I.e. that 'p' is Cyrillic 'r'. And the other letters are Cyrillic too. But .ORG ISN'T!

    What do I have to enter when I see (and remember later) that URL from a street ad? If I see the Latin .ORG then the whole URL must be written in Latin letters (there are no 'R' or 'G' letters in Russian), no? I go to apoma.org and see something really different from a Russian wine supplier.

    Even if everyone got used to domain names containing both (and I don't know how to explain that to the average dumb net user), the confusion would happen too often.

  114. As if it wasn't already by UnConeD · · Score: 1

    The domain name system is already messed up.

    A month ago an old domain name of mine (that I hadn't renewed in time) was re-registered by someone else hours after it was released. The domain name in question was a long, Flemish name with a very specific meaning, so it was a deliberate takeover. The squatters are simply comparing snapshots of the domain name DBs to find expired domain names.

    The new owner was a shady company... they look like a registrar, but to buy a domain name you first have to fill in a form where you enter nothing but the wanted domain name and your email address: no doubt so they can register it before you can, and then force you to buy it at any price they like (namegiant.com).

    Domain name spoofing has already been done through simpler means (m1crosoft, http://www.microsoft.com@1.2.3.4/, ...) which are more than enough to fool Joe Average.

    It might not be important for English speakers, but even accents can cause differences in meaning. Plus all those 'squiggly Chinese characters' are the normal way of writing that particular language. If it bothers you that you cannot read it, learn the language.

    I don't see the web getting much worse from this, yet it offers significant advantages to the rest of the world.

    1. Re:As if it wasn't already by TheMidget · · Score: 1
      buy a domain name you first have to fill in a form where you enter nothing but the wanted domainname and your email address: no doubt so they can register it before you can, and then force you to buy it at any price they like (namegiant.com).

      If they are really doing this, they are begging to be taught a lesson!

      Indeed what's to stop a vengeful user from deliberately "suggesting" thousands of names to them, in which he is not interested at all? Stoopid namegiant will register all those, and will be stuck with the (first year) bill if they can't sell them...

      Note: be careful, use a throwaway (hotmail...) e-mail address, you never know...

  115. Re:Oh great... Another American... by Anonymous Coward · · Score: 0

    And Germany, a traditionally pacifist country

    Doh?! You obviously have never heard of Prussia, have you? They practically invented militarism. Living there basically meant being in the army.

  116. Punycode by sharkey · · Score: 1

    Enlarge your code Today!!! And inches in length and diameter!! Make her scream with delight!

    --

    --
    "Outlook not so good." That magic 8-ball knows everything! I'll ask about Exchange Server next.
  117. careful with your ratios by vortigern00 · · Score: 1

    my favorite quote from the spec is "The ratio of basic string length to extended string length is small."

    That means the extended string length is much longer than the basic string length....

    Somehow I don't think that's what they meant.

  118. Explanation by siskbc · · Score: 0, Offtopic
    I kinda wonder when Germany will stop getting shit for the wars.

    Never. ;) Actually, it's not so much the Holocaust that the g'parent was mentioning - more respect for Germany's war abilities, like winning a war with France as soon as it even existed as a unified nation. ;)

    Americans are just tweaked at france cuz it didn't fall in line like Britain did.

    Not the case - if that were true, you wouldn't find a litany of France jokes before 2001. But you do, because France has been a military joke for the last 200 years, approx. It's also the massive ingratitude following WWII displayed by France that tweaked Americans, combined with France's complete military ineptitude.

    It's not France's not supporting the war that angers Americans so, at least me. It's the fact that they were more interested in playing politics simply to get revenge on America for some perceived slight, which prompted the move they made. That and France was trying to become a friend of Saddam, trying to get sanctions lifted so they could trade with Saddam, etc. I truly think that with France's support early on, the whole scenario could and would have been resolved without a war.

    On the other hand, most american folks are OK with Japan these days... odd

    I would say it's the way that Japan, in the wake of getting nuked, turned into a very peaceful, hard-working society. Hard not to respect that a bit.

    --

    -Looking for a job as a materials chemist or multivariat

    1. Re:Explanation by Minna+Kirai · · Score: 1

      I truly think that with France's support early on, the whole scenario could and would have been resolved without a war.

      Maybe. But that's not what GW Bush wanted. He wanted a war. He didn't want to "resolve" the situation- he wanted to invade Iraq and kill Saddam. Finishing "family business" had been a goal ever since his election.

      The US would've gotten international assistance if they'd taken the approach "We need to solve the Iraq problem". But they didn't; they said "We will invade Iraq", and proceeded to do so.

    2. Re:Explanation by siskbc · · Score: 1
      Maybe. But that's not what GW Bush wanted. He wanted a war. He didn't want to "resolve" the situation- he wanted to invade Iraq and kill Saddam. Finishing "family business" had been a goal ever since his election.

      That's possible, certainly. Note that I didn't defend Bush here - but I really believe France deserves as much culpability. I think that if France had gotten behind inspections, Bush never would have had a real reason to invade, not enough to sell to America.

      The US would've gotten international assistance if they'd taken the approach "We need to solve the Iraq problem".

      For what it's worth, we tried for 10 years and the entire world wanted to pretend the problem didn't exist. That's no excuse for the outcome, necessarily, but it certainly mitigates. Honestly, France was trying to get sanctions *lifted*, it would have meant billions to them.

      I think that the main problem is that no country save the US takes any responsibility for the global political situation, and it's reflected by countries like North Korea who beg us to get involved even when we try to ignore them. They know we're the only ones active in anything that happens outside our borders. And when you have one country unilaterally implementing solutions, they'll screw up a lot.

      The world would be a better place if Europe and the stronger parts of Asia would take such responsibilities more seriously, instead of trying to flex their political muscle in the UN to prove how tough they are.

      And no, this isn't a troll, just an observation.

      --

      -Looking for a job as a materials chemist or multivariat

  119. MOD PARENT UP by Anonymous Coward · · Score: 0

    It's not just the anglo-chauvinists who want to leave domain names in ASCII. There are strong arguments for it. Some things ... word-processors come to mind ... obviously must be able to use a large character set like Unicode. But the argument for expanding the character set that domain names can be written in is much weaker. And if any move in that direction breaks a lot of existing implementations, maybe it just shouldn't be done.

  120. lose by Anonymous Coward · · Score: 0

    use lose

    not loose

    you know it makes sense kids

  121. Oh no. by mcpkaaos · · Score: 1

    You realize umlauts in domain names will only bring about another wave of hair metal bands, don't you?

    --
    It goes from God, to Jerry, to me.
  122. Moot point? by Anonymous Coward · · Score: 0

    For Gawd's sake, this is SLASHDOT, Man! You must mean mute point!

  123. general idea sucks by Anonymous Coward · · Score: 0

    Personally, the current implementation of IDNs is just a cop out to provide a marketable option for the folks out there. It's all just following the horrendous RFC guideline for timeliness. Wouldn't it have been better that every system just switches to accepting 8 bit, and hence unicode?

  124. Will these be... by Beolach · · Score: 1

    Will these domain names be Funkifiable?
    For example, http://slashdot.org == http://1109654166/.
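
For anyone wondering where that number comes from: it is just the 32-bit integer form of an IPv4 address. A small sketch with Python's standard `ipaddress` module (the helper names are made up for this example):

```python
import ipaddress

def dotted(n):
    """32-bit integer -> dotted-quad IPv4 string."""
    return str(ipaddress.IPv4Address(n))

def as_integer(dotted_quad):
    """Dotted-quad IPv4 string -> 32-bit integer."""
    return int(ipaddress.IPv4Address(dotted_quad))

print(dotted(1109654166))           # 66.35.250.150
print(as_integer("66.35.250.150"))  # 1109654166
```

Since this form is an address and not a name, it bypasses DNS entirely, so the IDN encoding should have no bearing on it.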

    --
    Join moola.com, play games to earn money.
  125. Accecents like case? by davburns · · Score: 3, Insightful
    Perhaps I'm showing grave naivete, but it seems like it would be better to treat accents (dots, slashes and stuff) like case. DNS names are case insensitive, but case preserving. So, you can type all your fancy European characters if you want, but you don't have to mess with them if you're on a keyboard where that's difficult, and there's no additional opportunity for squatting or visual name hijacking. Naturally, you would want the accents to appear on reverse lookups (just like mixed case domain names work.)

    I know there are times when different accents indicate different words -- but I'm under the impression that it is unlikely that more than one of them would be a "good" domain name. (Am I wrong about that?)

    This won't work for non-Latin characters, obviously. But UTF-8 seems like a better solution to that. (I understand that most Chinese words are 2-3 characters of 2-3 bytes (unified is U-430 to U-9fa and up to U-7ff is 2 characters) for 4-9 bytes -- clearly less than 63 bytes) The obvious downside is that it means that all DNS servers and resolvers must (at least!) be 8-bit clean.
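
A rough sketch of what "accents like case" could mean in practice: decompose each character, drop the combining marks, and case-fold the rest. This is only one possible reading of the idea, written in Python with the standard `unicodedata` module; `fold_accents` is a made-up name.

```python
import unicodedata

def fold_accents(label):
    """Compare-key for a label: strip combining accents, ignore case."""
    decomposed = unicodedata.normalize("NFD", label)
    stripped = "".join(
        ch for ch in decomposed
        if unicodedata.category(ch) != "Mn"  # Mn = nonspacing combining mark
    )
    return stripped.casefold()

print(fold_accents("Mönsterås"))
print(fold_accents("Mönsterås") == fold_accents("MONSTERAS"))
```

Letters that have no accent decomposition (for instance "ø") pass through unchanged, so a real scheme would need an explicit mapping table on top of this.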

    1. Re:Accecents like case? by Haeleth · · Score: 1

      (I understand that most chineese words are 2-3 characters of 2-3 bytes (unified is U-430 to U-9fa and upto U-7ff is 2 characters)

      Try U+4300 to U+9fa0. All CJK characters are three octets in UTF-8.
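
The byte counts are easy to verify. A quick Python check (the sample characters are picked arbitrarily for illustration):

```python
def utf8_len(text):
    """Number of octets the text occupies when encoded as UTF-8."""
    return len(text.encode("utf-8"))

# One ASCII letter, one accented Latin letter, one CJK ideograph.
for ch in ["o", "\u00f6", "\u4e2d"]:
    print(f"U+{ord(ch):04X}", utf8_len(ch))

# How many three-octet characters fit in a 63-octet label.
print(63 // utf8_len("\u4e2d"))
```

Code points from U+0800 to U+FFFF take three octets in UTF-8, so a 63-octet label would hold at most 21 such characters.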

    2. Re:Accecents like case? by davburns · · Score: 1

      Erm... Yes, you're right -- my error. (I was only looking at the top of the table at unicode.org, which leaves me missing one hex digit.)

  126. Good for English speaking Metalheads... by Anonymous Coward · · Score: 0

    Motorhead and Motley Crue rejoice!

    - DRFSR

  127. So lemme get this straight... by Cid+Highwind · · Score: 1

    This system:
    -Requires a browser plugin (therefore doesn't work with mail, IRC, etc)
    -The plugin is Windows only and IE 5+ only
    -shortens domain names by 4 chars + at least 2 more per every accent/umlaut/kanji
    -Is mostly unreadable in its ASCII-fied form

    ...and this is supposed to be *better* than upgrading our DNS servers to support UTF-8?

    --
    0 1 - just my two bits
  128. Source code for Punycode encode/decode by LinuxParanoid · · Score: 1

    I found the technical section of the Switch article didn't fully explain the encoding. After thinking that I should code up the RFC to figure it out (and perhaps gain a name for myself on CPAN), I found to my mixed delight that someone had already beaten me to it: IDNA::Punycode and Encode::Punycode.

    --LP

    1. Re:Source code for Punycode encode/decode by vacuum_tuber · · Score: 1

      LinuxParanoid wrote:

      I found the technical section of the Switch article didn't fully explain the encoding. After thinking that I should code up the RFC to figure it out...

      Only look at that RFC if you want to become blind and brain damaged in a single sitting.

      --
      Look at the bright side: there's always seppuku.
  129. i18n domains means i18n SPAM! by MexicanMenace · · Score: 1

    No, wait.

    Already getting that.

    ' ' .. .. ..
    Digital Cable Booster - only 49.95 izc x uxqdshl

    ..
    D'oh!

  130. Re:Bad idea but bound to happen with todays thinki by Anonymous Coward · · Score: 0

    If "trilingual" means someone who speaks three languages, and "bilingual" means someone who speaks two language, what's English for someone who can only speak one language?

    American.

    ...it loses a lot in translation, but you get the idea.

  131. Damn! Slashdot left out the characters... by mijok · · Score: 1

    My post should be understandable anyway but slashdot left out the non-English characters o with two dots over it and a with a ring over it in the names Monsteras and Horby.

    --
    Karma. Moderation. Is my .sig good now?
  132. Re:Oh great... Another American... by JeffTL · · Score: 1

    Prussia hardly invented militarism... Sparta, anyone?

  133. latin_1? by Filik · · Score: 1
    JUST BE GLAD WE SUPPORT LOWERCASE!


    -BIFF

    (antilamenessfilter lowercases and some more)

  134. Somebody MOD ME UP by penguinrenegade · · Score: 0

    HEY! Who's modding a legit first post as FLAMEBAIT???

    The internet is now worldwide, but it originally started as ARPANET - an American military project!

    I'm American, but honestly most Americans can't see past the nose on their face!

    Somebody mod me back up!

  135. Faking Our Humanity by handy_vandal · · Score: 1

    ... unless we're faking our humanity for advertising purposes only, of course.

    Thank God a mega-corp isn't actually demonstrating REAL humanity for advertising purposes. That would cause devastating confusion.

    Let's not even get started with genuine corporate humanity: my cynicism is perfectly functional under the current arrangement.

    -kgj

    --
    -kgj
  136. Re:Maybe not as useful... (+5, interesting ????) by cobbaut · · Score: 1

    And here I was thinking that South East Asia is where most people live and where the fastest growing economy is.

    But of course you mean that USA companies also need a Chinese domain name... to not be limited to the few people in this world who only speak English.

    --
    European Linux user, living in Antwerp
  137. Not AGAIN! by Montreal+Geek · · Score: 1
    This is, again, the fault of the marketing droids!

    First they destroy the very concept of a clean, simple and reasonably useful markup language because they couldn't care less about organisation and contents as long as they can show shiny bits to mesmerize the consumer.

    Took years to begin to fix that one.

    Having DNS support other character sets is a consequence of the second of their monstrous brain farts: confusing addressing and indexing.

    A domain name is NOT an index keyword. It has no meaning beyond setting up a delegation structure. 'foo.com' need have nothing to do with foos; it's just a frigging address.

    My postal address is "40 Some Street, Laval, Qc, CA" Canada -> Quebec -> Laval -> exact location. Repeat after me. De-le-ga-tion.

    Let's do a little brain experiment. Let's say all the postal services of the world decided to go the DNS way:

    My address could be marc-andre-pelletier.programmer.

    It would be a nightmare for any postal service, because rather than delegate the information to find where the house is to increasingly regional authorities, they'd have to look me up in a huge database to figure out where to send my mail.

    It would also be confusing because of mark-andre-pelletier.programmer and marc-andre-peletier.programmer ad nauseam.

    Of course, then there is the conflict that will ensue because there is inevitably another programmer with my name who may or may not be better known than I and will want my 'address'.

    Sounds familiar?

    This is what is inexorably happening with DNS. It's too late to change any of this; the marketroids already imposed that fuck-up and domain names are already worthless technically.

    So what do we do? Let's ask the marketroids. "Let's increase the breadth of the flat namespace even further!"

    Sigh.

    Allowing more (conceptual) codepoints in domain names isn't being nice to non-English speakers. It's just adding to the confusion with no useful value.

    Maybe we should petition the phone companies to get Unicode phone numbers. It'll make your phone ring much less often -- not only will many people not be able to remember your number right (or even successfully write it down (how many people know how to write an uppercase epsilon?)) but many people won't even have a phone with the right symbols.

    Who knows; maybe telemarketers will stop calling when they can't figure out your number.

    -- MG

  138. Just to be the devil's advocate... by popo · · Score: 1

    I'll just be the devil's advocate here:

    The premise of this entire project is that we really *need* European letters for domain names. I say we don't. (I'm not being U.S.-centric... hell, I'm not even American.)

    The point is:

    Let's say I claim to need a ' ' (space) in my domain name. Or let's say J.Crew needs a '.' for their URL. Or let's say I need an exclamation point for my domain name here in the U.S. in order to correctly reflect my copyright.

    The point is that domain names were *never intended* to completely reflect spoken or written language -- English or any other. They have and have *always* been a 'semi-representational' system.

    The undercurrent of most of the discussion here is that the current lack of unicode characters reflects a sort of digital American unilateralism.

    This is a crock.

    Let's not forget that what the current lack of unicode actually represents is a *non-capitalist* system where brand names and precise spellings are less important.

    I for one am quite happy to approximate German letters with the closest equivalent. And if Volkswagen gets upset about it then maybe their advertising agency should have been thinking globally and rebranded without an umlaut.

    Pthtp!

    --
    ------ The best brain training is now totally free : )
    1. Re:Just to be the devil's advocate... by rduke15 · · Score: 1

      The undercurrent of most of the discussion here is that the current lack of unicode characters reflects a sort of digital American unilateralism.

      It certainly does when Slashcode rejects accented characters (making this discussion difficult at times).

      But as far as DNS is concerned, I'm also not sure this Ponycode ride is worth the trouble.

      I for one am quite happy to approximate German letters with the closest equivalent.

      That is not a very good example, because in German, it is very easy and "looks/feels" ok. They happen to use only one type of accent on a few vowels, always producing the same transformation which can be decently approximated with an "e".

      It's not as simple with many other (even Western European) languages.

      See this post or a few others

  139. Re:Punycode, not Unicode by rduke15 · · Score: 1

    I guess the pun(s) was(were) intended by the author of the coding? Will have to Google around a bit...

  140. Re:Taco, why did you remove the accents from slash by Anonymous Coward · · Score: 0

    Loose is loose because it has two o's.

    Lose has lost because it has one o.

  141. Duh by autopr0n · · Score: 1

    The solution here is to not use domain names for verification rather than trying to force the whole world to use Latin characters for naming internet sites. As you mentioned, there are already problems with using Latin. A bank system should already have a digital certificate, and that's mostly what we're worried about.

    The reason not to use Unicode is that it won't be backwards compatible with the old DNS standard, which was meant to be compatible across ASCII and non-ASCII systems.

    --
    autopr0n is like, down and stuff.
    1. Re:Duh by julesh · · Score: 1

      The solution here is to not use domain names for verification rather than trying to force the whole world to use Latin characters for naming internet sites. As you mentioned, there are already problems with using Latin. A bank system should already have a digital certificate, and that's mostly what we're worried about.

      You don't get it, do you?

      The point is that this would allow people to create two different domain names that looked, when displayed, so close to identical as to be impossible to distinguish from each other. But they could refer to two different sites. How would a user know which one was which? How would a user know which one to use?

      Also, you could create domains which do seem to be identical. Unicode has many symbols which are actually identical and equivalent to each other in all senses, but which are defined in multiple places in the code.
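
      A small sketch of the "identical but defined in multiple places" point, using Python's standard unicodedata module. The particular characters are just convenient illustrations, not ones taken from any real domain; Unicode normalization (NFC here) is one way to fold such canonically equivalent spellings together before comparing.

```python
import unicodedata

precomposed = "\u00e9"   # LATIN SMALL LETTER E WITH ACUTE, one code point
decomposed = "e\u0301"   # 'e' followed by COMBINING ACUTE ACCENT, two code points

# They render the same but are different strings until normalized.
print(precomposed == decomposed)                                 # False
print(unicodedata.normalize("NFC", decomposed) == precomposed)   # True

ohm_sign = "\u2126"      # OHM SIGN
omega = "\u03a9"         # GREEK CAPITAL LETTER OMEGA

# The ohm sign is canonically equivalent to capital omega.
print(unicodedata.normalize("NFC", ohm_sign) == omega)           # True
```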

      Furthermore, you say a digital certificate should do the job. But what do digital certificates guarantee? At present, they tend to give a good degree of confidence that the person operating the site you are communicating with is the person who owns the domain name it is running on, because that is the information that is checked by the issuer, and that is the information in the CN field which your browser verifies matches what it expects. Therefore, even with digital certificates, you're still relying on being able to distinguish different organisations via their domain name.

      The reason not to use Unicode is that it won't be backwards compatible with the old DNS standard, which was meant to be compatible across ASCII and non-ASCII systems.

      That, of course, is also a good reason. But it doesn't invalidate the first...

  142. Arrrgggh by nnnneedles · · Score: 2, Funny
    The intent is not for some human to be able to read it, but for all humans.

    Ohhh the arrogance of Americans.

    Here's an example of why this is good. In Sweden there is a town called Horby. That's 'o' with two dots over it. Their site has to be named 'horby' as it is now (without the dots). Horby means 'the village of whores' in Swedish.

    Do you think that billions of people who use other alphabets than the American one are going to agree with anything you said in your post?

    This change IS a big deal, not only for small towns, but for loads of big companies, government websites and all kinds of sites you can think of.

    --
    Will code a sig generator for food
    1. Re:Arrrgggh by lucas+teh+geek · · Score: 0, Flamebait

      Yeah great, so if this ever gets off the ground you want to block large portions of the world from ever visiting some stupid town's website?

      ... come to think of it, that's not such a bad idea! Maybe if we make all the stupid websites use stupid characters in their domain names the internet will no longer be shit, sort of like a filter

      --
      TIAEAE!
  143. Goodie. More efficient SPAM filters :) by saikou · · Score: 1

    Anything that has a link to a non-"old-standard" site in the body of the message, yet got sent to a plain old Latin US domain, can easily be thrown into the SPAM folder.
    Might not work for companies but certainly will for individuals.

  144. Transliteration by k98sven · · Score: 2, Informative

    Why monsteras instead of moensteraas?

    Good question. Basically people don't think, or are too lazy to transliterate the letters properly.

    Some places have the forethought to register both:
    Munich in Germany has registered both "munchen.de" and "muenchen.de".
    (But it's really a u with an umlaut)

  145. Re:Example by rduke15 · · Score: 2, Informative

    http://www.xn--rksmrgs-5wao1o.se/ will work if you are using a recent Mozilla

    Thanks for the example. Let's do a few quick tests.

    The encoded version always works, and leads to a page where you have an unencoded link (normal spelling with the accents).

    Copied the unencoded version, and tried:

    On WinXP:

    - Mozilla 1.4 : OK
    - MSIE 6, Opera 6.2 : NO

    On Linux - Red Hat 6.2 (of course, that's a pretty old system):

    - lynx, ping, host, dig, ... : NO
    (cannot test Mozilla, since this server has no GUI.)

    Well, I guess we'll have to live with that horrible Punycode.
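
    For anyone who wants to convert between the two spellings by hand, here is a minimal sketch using Python 3's built-in "idna" codec, which implements the original RFC 3490 rules. The helper names to_ascii and to_unicode are made up for this example; the host name is the one quoted above.

```python
def to_ascii(hostname):
    """Encode a Unicode host name into its ASCII-compatible (xn--) form."""
    return hostname.encode("idna").decode("ascii")


def to_unicode(hostname):
    """Decode an ASCII-compatible host name back into Unicode."""
    return hostname.encode("ascii").decode("idna")


encoded = "www.xn--rksmrgs-5wao1o.se"
decoded = to_unicode(encoded)

# ascii() keeps the output printable on terminals without Unicode support.
print(ascii(decoded))
print(to_ascii(decoded))   # www.xn--rksmrgs-5wao1o.se
```

    An old resolver (ping, host, dig on that Red Hat box) only ever needs the output of to_ascii; the Unicode form is purely a display matter for the application.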

  146. djb's comments - http://cr.yp.to/djbdns/idn.html by Anonymous Coward · · Score: 0

    fleshes out some of the issues.

  147. This will become normal when... by WoTG · · Score: 1

    we've all switched to IPV6. =)

    Seriously, who's going to use one of these domains when big chunks of your visitors can't call up your site, or at least can't call it up with the "same" address (I gather from one of the earlier posts that there is some sort of encoding to make it magically work for everyone).

  148. Punycode? by tuxlove · · Score: 1

    Why not UTF-8? I suppose that might have broken apps that expect ASCII-only, which I guess is one of the reasons we now have base64 or quoted-printable encoding for email. But it sure would have been nice. Reading domain names in Punycode in many cases will not be a lot easier than reading binary... :(

  149. Backward compatibility. by Anonymous Coward · · Score: 0

    This method allows new domain names to work with existing DNS servers and clients. While old clients won't show internationalized domains they will still be able to use them. In other words, it doesn't break anything that currently works. While Unicode is ultimately a good solution, it requires every piece of software that uses domain names to be rewritten to handle Unicode. This means everything from mail clients, web browsers, web servers, DNS servers, mail servers etc.
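
    A rough sketch of why the ASCII encoding does not break old software. The legacy_valid check below is a hypothetical stand-in for the kind of letters-digits-hyphen test an old client might apply, not code from any real resolver, and the host name is only an illustration. Raw UTF-8 fails the check; the Punycode form passes.

```python
import re

# Letters, digits and hyphens only, 1-63 characters, no leading/trailing hyphen.
LDH_LABEL = re.compile(r"^(?!-)[A-Za-z0-9-]{1,63}(?<!-)$")


def legacy_valid(hostname):
    """Validate raw host name bytes the way an ASCII-only client might."""
    try:
        text = hostname.decode("ascii")
    except UnicodeDecodeError:
        return False  # any byte above 0x7F is rejected outright
    return all(LDH_LABEL.match(label) for label in text.split("."))


name = "m\u00fcnchen.de"            # illustrative name with a u-umlaut
utf8_form = name.encode("utf-8")    # contains bytes above 0x7F
ace_form = name.encode("idna")      # plain ASCII

print(legacy_valid(utf8_form))      # False
print(legacy_valid(ace_form))       # True
print(ace_form.decode("ascii"))     # xn--mnchen-3ya.de
```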

  150. Re:Bad idea but bound to happen with todays thinki by droleary · · Score: 1

    The world is a big place. You ought to get out and see more of it.

    This is an odd statement. The truth is that those who actually do get out and see the world are those most exposed to the problems that miscommunication causes. Far too many people tie their culture to their communication and are unwilling to change. If most countries can seem to standardize on the metric system without much problem, why can't they all standardize on a language (any single language; not necessarily English)? If we can all share Euros, why can't we all share the same word for money?

  151. And to make it worse by Anonymous Coward · · Score: 0

    When you get to the web page it's all in this funny Kanji stuff. Clearly they should all just write in American English ;)

  152. Re:ASCII ANSI INSI ... by rduke15 · · Score: 1

    doesn't the A in ASCII stand for American?

    So instead of ANSI don't we need an INSI (I standing for international) standard?


    That sounds like a very good idea. To make it shorter, we could call it ISO (International Standards Organization). Yeah, what a terrific idea. Let's get a domain name straight away. Maybe iso.org? Sounds cool? Well someone took that already: ISO.

  153. lol mod parent UP UP UP by Anonymous Coward · · Score: 0

    "Here's an example of why this is good. In Sweden there is a town called Horby. That's 'o' with two dots over it. Their site has to be named 'horby' as it is now (without the dots). Horby means 'the village of whores' in Swedish."
    rofl

    those two lines convinced me why you're right

  154. There is no universal capitalization by Anonymous Coward · · Score: 0

    Capitalization is language dependent.

    Just as the quality of being a "different letter" is language dependent.

    But, certainly we can make a good compromise that most people will like, for global capitalization.
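
    To make the "language dependent" part concrete, a short sketch with Python's built-in string methods, which apply language-neutral Unicode rules. The German and Turkish cases are just well-known illustrations; the "compromise" here is Unicode case folding, which is one possible global rule and not necessarily what any DNS standard picks.

```python
sharp_s = "\u00df"                    # LATIN SMALL LETTER SHARP S

# Uppercasing can change the length of a string.
print(sharp_s.upper())                # SS

# Plain lower() does not make the two spellings of "street" compare equal...
street = "stra" + sharp_s + "e"
print("STRASSE".lower() == street)                     # False
# ...but case folding, a locale-free compromise, does.
print("STRASSE".casefold() == street.casefold())       # True

# Turkish would lowercase 'I' to a dotless i (U+0131); the neutral rule gives 'i'.
print("I".lower() == "\u0131")        # False
```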

  155. re: But if 16-bit charsets are allo by Anonymous Coward · · Score: 0

    re: But if 16-bit charsets are allowed...

    16-bit charsets ?

    Perhaps you confuse Microsoft's implementation of outdated Unicode 2.1, and their deep confusion between character sets and character byte encodings.

    But if the essence of your argument is that we should not allow Korean domain names which contain both Chinese and Hangul, your argument seems to me to be badly misguided. I argue we should give people what will please them, rather than what seems easy to implement.

    AFAICT, your argument is that we should not allow anything outside of ASCII, because it will complicate your life. I assume this means denying even basic Korean domain names (say, in Hangul only). That seems worse than misguided to me -- that seems to be what we started with -- the mantra of "English only and forever".

  156. the English are from another planet by Anonymous Coward · · Score: 0

    I always find it annoying when that happens, how come all aliens speak English? Babylon five handled it pretty well as far as I can remember, in fact in a Crusade episode some English speaking aliens show up and the humans get freaked because of that. I won't even get into the humanoid aliens thing (Star Trek being the biggest offender of all).
    Alien languages are always severely mistreated, it's a pity, really.

    1. Re:the English are from another planet by crabpeople · · Score: 1

      but a translator from any of those languages wouldn't work unless they were able to at least get the base idea of what they were saying.

      what i think the parent should have said is that if it's an idea it can be translated. you might just have to make up some new words in english to translate them. maybe someone who speaks alien or even kanji can say the contents of a sentence in one word but hey, you can always fall back on the sentence.

      --
      I'll just use my special getting high powers one more time...
  157. Re:djb's comments - http://cr.yp.to/djbdns/idn.htm by Anonymous Coward · · Score: 0

    Unfortunately, djb's "proposal" boils down to "use UTF-8" with no interoperability for RFC 1035 clients (the idea that "all relevant software has been upgraded" will ever happen is simply ludicrous). Only allowing registration of dissimilar characters is a good move, but he should have gone further and mandated a mapping from prohibited to allowed characters (if the problem is that the user can't distinguish them, how are they supposed to know which one to type?)

  158. Direction by dieresis · · Score: 1

    The script direction would seem to me to be just a display issue. However the user enters the domain name, it would be encoded and transmitted in just one direction. I think that even scripts written vertically (as mainland Chinese was until the Communist Party adopted the horizontal as a standard) should be supportable.
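
    A tiny sketch of that claim, assuming three Hebrew letters as a stand-in label. The string is stored in logical (typing) order, each character merely carries a right-to-left property that the display layer uses, and the Punycode round trip preserves the stored order.

```python
import unicodedata

label = "\u05d0\u05d1\u05d2"    # alef, bet, gimel in the order they are typed

# Direction is a per-character property consulted at display time.
print([unicodedata.bidirectional(ch) for ch in label])   # ['R', 'R', 'R']
print(unicodedata.bidirectional("a"))                    # L

# The wire form is plain ASCII and decodes back to the same logical order.
wire = label.encode("punycode")
print(wire.decode("punycode") == label)                  # True
```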

  159. Re:Bad idea but bound to happen with todays thinki by Anonymous Coward · · Score: 0

    Because standardizing on a language is just slightly more complex than standardizing on a measurement system or currency--not to mention the cultural implications (though I wouldn't expect an American to understand that).

  160. Cool, i want a domain "/..com" by GNUALMAFUERTE · · Score: 0

    Actually, /.ters, we should ask them for a domain name "/..org" ... I will add this: /..com. IN A 66.35.250.151

    to my company's DNS server ... if I don't come back in 5 minutes, I know why : )

    --
    WTF am I doing replying to an AC at 5 A.M on a Friday night?
  161. Not good from a security perspective by ddent · · Score: 1

    I am actually quite concerned about the push to internationalize DNS.

    It is not that I don't have things to gain from it -- people would be buying more domains, and my company, among other things, sells domains. I also speak two languages; one of them requires accents in some situations. It would be nice to be able to include them.

    So it's not that I don't understand the attractiveness to the various stakeholders.

    BUT, from a practical perspective, I think it is a nightmare. We've already seen situations where people register paypa1.com (the last character there is a one, not the character l) and use it to grab people's info. Additional possibilities include spammers registering domains similar to others' and sending spam with a URL on that domain. Or entries in syslog. With the limited character set currently allowed, the only thing that can happen is people who aren't looking closely, or are using certain fonts that don't necessarily distinguish things as well as they could/should, get burnt. But if we implement international domains, there will be a LOT of ways to register names that are incredibly similar -- and depending on how much of Unicode/UTF-8 we implement, it would actually be possible that there would be two different encodings for a character that is *supposed* to appear exactly the same on screen.
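
    A sketch of the lookalike problem in Python 3. The second label swaps in a Cyrillic letter chosen purely for illustration, and the scripts helper is a crude heuristic made up for this example (it takes the first word of each character's Unicode name), not a real confusables check.

```python
import unicodedata


def scripts(label):
    """Rough guess at the scripts in a label, from Unicode character names."""
    return {unicodedata.name(ch).split()[0] for ch in label if ch.isalpha()}


genuine = "paypal"
lookalike = "p\u0430ypal"    # second letter is CYRILLIC SMALL LETTER A

print(genuine == lookalike)          # False, although many fonts draw them alike
print(sorted(scripts(genuine)))      # ['LATIN']
print(sorted(scripts(lookalike)))    # ['CYRILLIC', 'LATIN']

# On the wire the two names are clearly different ASCII strings.
print(genuine.encode("idna"))        # b'paypal'
print(lookalike.encode("idna").startswith(b"xn--"))   # True
```

    Flagging labels that mix scripts, as the last lines hint, is one possible defense a registrar or browser could apply; it says nothing about lookalikes within a single script.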

    What a nightmare.

  162. happy for Motley Crue by thomas_klopf · · Score: 1

    I'm just happy that I can finally start my Motley Crue fan site:
    www.motleycruerox.org

    They put the "uber" in "umlaut"! Rock on!

    p.s.: this was, of course, the real reason why unicode was invented.
    (slashdot isn't allowing posting of html entities/unicode, :P )

  163. Re: Al Gore by Anonymous Coward · · Score: 0

    > He claimed to have created the Internet. I fail to see the distinction between creating an invention and inventing it.

    No, he claimed to have taken the initiative in Congress in creating the Internet.

    Get it? Congress created the Internet; Al Gore led the charge. That much is true, it's on the public record. Without Gore's efforts, the Internet as we know it today would not exist.

    Yes, he could have been less boastful and he could have worded it better. But he didn't lie, and he didn't say he 'invented the Internet.'

    This has been a public service announcement.

  164. Because it's already patented !!! by AftanGustur · · Score: 1


    Why not extend dns to support unicode? That way they'd be no translation or other crap to go through.

    Because some total fuckhead at the patent office allowed the idea to be Patented.

    --
    echo '[q]sa[ln0=aln80~Psnlbx]16isb572CCB9AE9DB03273snlbxq' |dc
  165. Bollocks to IDN by gfreeman · · Score: 1

    Just get everyone to remember the IP address of all sites they are ever likely to visit.

    Roll on IPv6!

    --
    Ceci n'est pas un sig.
  166. Then let's have a standardized english! by tjstork · · Score: 1

    Saying that English is not standardized does not address the problem. What we really need to do is have a worldwide standard language. Esperanto sucks, but something English-like could gain considerable traction.

    Some principles of this language:

    a) a phonetic language. a symbol in writing should represent a sound produced by humans.

    b) one sound, one symbol. I agree that the relationship between phonemes and graphemes is a complete mess. Many to many relations between the two cause needless confusion.

    If all we did was to have a single set of characters represent a single set of sounds, we would be infinitely farther along than we are now. I think UNICODE solves the wrong problem.

    We argue in favor of standards for all levels of human work - suggestions that some other protocol than http be carried on the web are oft met with derision, that some other database language than SQL ought to be introduced decried as heresy, and often cultural arguments for technology are dismissed as laziness. But, those same arguments can be applied to human language, and arguably, they should carry greater weight, as, what's the point of any standard if half the world cannot read, or if cultural differences in languages render meanings subtly different?

    One World == One Language.

    Everything else is unimportant.

    --
    This is my sig.
    1. Re:Then let's have a standardized english! by dajak · · Score: 1

      The cultural differences are there regardless of language. If I speak English, I still have to be aware of subtle differences in meaning.

      On one sound, one symbol: Just because we can have one symbol for the spirant guttural g or velar nasal ng doesn't have to mean you know what phoneme corresponds with it, of course. Dissecting words into phonemes is also an acquired skill.

      Still, a simple controlled English along the lines of COBOL or ACE for user interfaces, specifications, manuals, etc. as a standard would be nice.

      My objection was mainly to the analogy between English and Unix. I wanted to point out that the process by which English has become a de facto standard is comparable to the process by which Windows has become a de facto standard. Clear specification or "fitness" of the language as such are clearly not relevant factors.

      I use a keyboard with US layout (cheapest and most choice) and always install English (preferably UK) versions of software now, because not everything is available in all languages and very often only the English version of an application is patched regularly. In the past I had a bug-ridden and confusing mixture of Dutch, German, and English applications on my computer. This is the method by which English becomes a de facto standard, isn't it?

      UNICODE (and punycode for domain names) is mostly relevant for reducing the number of different spellings used as workarounds (e after umlaut, prefix hyphen instead of trema, etc.) I have to try to find names of people, companies, etc in some languages. I never considered domain names a problem (just IP numbers would be fine with me), but phasing out obsolete character sets in standard protocols is a sensible idea.

    2. Re:Then let's have a standardized english! by toph42 · · Score: 1

      We all know that by the 24th century, when we are the first members of the United Federation of Planets, we will all be speaking English (not to mention the rest of the entire universe).

      Why fight it, convert now for the good of the Federation. :P

  167. I have a better one!! by Anonymous Coward · · Score: 0

    For the love of GOD, or just plain your eyesight, do NOT visit: this site.

    If you do visit this site, please post your experiences with the rest of slashdot so we can laugh at your stupidity.

    Go on. I double. Naaa. Triple. Naaaaa. Pentiple dare you to visit the above site.

    Thank you. HAND.

  168. alphabet! by Anonymous Coward · · Score: 0

    Do foreign languages have alphabets, or is the concept generalized from its original meaning? alphabet. aka. alpha beta. aka. the first two letters of the latin, for want of a better word, alphabet.

    So should other languages have squiggle1squiggle2 as their name for an alphabet?

    Curious, or brain dead, people want to know.

    1. Re:alphabet! by Anonymous Coward · · Score: 0

      > alpha beta. aka. the first two letters of the latin, for want of a better word, alphabet.

      I hate to break it to you, but 'alphabet' comes from Greek! Alpha and beta are Greek letters.

      It's an interesting factoid (to me anyway) that our 'Latin' alphabet has 2 letters (J and W) that the Romans didn't have. :)

  169. Re: Al Gore by Anonymous Coward · · Score: 0

    DARPA and USC contributed money and goals, but Bob Kahn, Vint Cerf, Jon Postel, and their contemporaries had already created the Internet well before Gore's involvement.

  170. Re: Al Gore by james_orr · · Score: 1
    Again, google is your friend. Let's quote Vint Cerf ...
    Bob [Kahn] and I believe that the vice president [Al Gore] deserves significant credit for his early recognition of the importance of what has become the Internet.
    Here's the link
  171. Re: Al Gore by Anonymous Coward · · Score: 0

    Of course Gore deserves a lot of credit for what he did. But "creating the Internet" is not what he did (that was already done). Cerf's well-earned respect is leading him to overlook the plain meaning of Gore's statement.

  172. Re: Al Gore by Anonymous Coward · · Score: 0

    And the point remains, he didn't claim to have 'created the Internet.' Congress did. Vint Cerf et al did their research with money approved by Congress. Al Gore was the major legislative backer for NSFnet, which is the basis for our modern Internet.

    Of course the technologies were already invented, but the original ARPANET didn't become the "Internet" (using TCP/IP) until 1983. That was pretty much a government/academic network, not a commercial one, and the "old" Internet was subsumed by NSFNet in 1990. Thanks, in large part, to Gore.

    He sort of fumbled his comment, I'll grant you. But he was talking about his congressional record at the time, and his statement was mostly on-target.

    Here's the story if you have the desire to read the whole thing.

  173. Re: Al Gore by Anonymous Coward · · Score: 0

    We'll just have to disagree. IMHO "the Internet" is a protocol suite and an address registry, which Cerf et al are the creators of even though the US Congress sponsored their work.

  174. This *is* Unicode by zenzen667 · · Score: 1

    This *is* Unicode. To deal with Unicode, participants need to agree on an encoding, such as UTF-8, UTF-16, ISO-8859-1, etc. In this case, they have chosen a 7-bit clean encoding method, which means no changes need to be made to the DNS infrastructure and the names are still usable by people using applications without Unicode domain name support.

    An application that wants to support Unicode domains needs to implement RFC 3490 (Internationalizing Domain Names in Applications, IDNA), RFC 3491 (Nameprep: A Stringprep Profile for Internationalized Domain Names (IDN)) and RFC 3492 (Punycode). Nameprep is sort of like an i18n version of lower(), used to compare domains unambiguously. IDNA defines the mechanism for encoding a Unicode domain into the ASCII clean representation, using Punycode for the individual labels. If people want to play, Python 2.3 has this out of the box:

    % python
    Python 2.3 (#1, Sep 13 2003, 00:49:11)
    [GCC 3.3 20030304 (Apple Computer, Inc. build 1495)] on darwin
    Type "help", "copyright", "credits" or "license" for more information.
    >>> ascii = 'www.xn--rksmrgs-5wao1o.se'
    >>> uni = ascii.decode('idna')
    >>> print uni.encode('latin1') # Or whatever your terminal uses
    www.r?ksm?rg?s.se
    >>> uni.encode('idna')
    'www.xn--rksmrgs-5wao1o.se'
    >>> from encodings import idna
    >>> idna.nameprep(u'WWW.MICROS\N{LATIN CAPITAL LETTER O WITH DIAERESIS}FT.COM')
    u'www.micros\xf6ft.com'
    >>> idna.nameprep(u'WWW.MICROS\N{LATIN SMALL LETTER O WITH DIAERESIS}FT.COM')
    u'www.micros\xf6ft.com'
    (erk - Python's triple > prompt confuses SlashDot...)

    And yes, Virginia, there is nothing stopping you registering Unicode domain names in .com, .net, .org etc. right now, if you know how to encode them yourself. To the DNS system and most registrars they are just perfectly valid ASCII domain names, with the decoding left to the applications and/or network libraries. Dealing with US keyboards and obsolete browsers is left as an exercise for the reader.
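
    For readers on Python 3, a rough equivalent of the nameprep part of the session above. This is a sketch only: strings are Unicode by default there, the standard encodings.idna module still follows the original IDNA rules, and the helper name same_domain is invented for this example.

```python
from encodings import idna


def same_domain(a, b):
    """Compare two Unicode labels after nameprep (case folding, normalization)."""
    return idna.nameprep(a) == idna.nameprep(b)


upper = "MICROS\N{LATIN CAPITAL LETTER O WITH DIAERESIS}FT"
lower = "micros\N{LATIN SMALL LETTER O WITH DIAERESIS}ft"

print(same_domain(upper, lower))        # True
print(ascii(idna.nameprep(upper)))      # 'micros\xf6ft'

# Both spellings end up as the same ASCII label on the wire.
print(upper.encode("idna") == lower.encode("idna"))   # True
```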

  175. They'd register both, of course. by jtheory · · Score: 1

    Actually, that makes me suspect that domain registrars must be pushing pretty hard for this.

    Think about it -- a French site would have to register their domain w/ the accents AND the unaccented domain: one to be correct, and the other for the foreigners. An international Japanese company might need dozens of domains, because of the different ways their name is transliterated into various languages.

    Hm. It seems like there are going to be a lot of problems that this change can cause, mostly because it's NOT easy to type the chars that will be in some of these URLs. It's not obvious at all how to type an accented e (plus it's different by OS). Why? It's a cause of frustration even for English-only speakers! "So... you will resume sending me your what??" When I was studying French in school, I used to mark all the accents by hand into papers I typed up, after printing them out. It was just so much faster than any other method I found. I can't even imagine how I'd manage to input a URL in cyrillic or pictographs.

    The thing is, though... this is the lesser of two evils. I really don't see everyone sharing a common language anytime soon, which would be the other solution. The internet all over the world is swiftly moving from academics and geeks to businesses to common folks, all over the world. Don't you WANT to be able to fully translate your website, so any bumpkin in Siberia can order your products? They aren't all going to learn a new alphabet.

    Think about it. Just about every language has its own keyboard layout. All my emails from Europe looked pretty funny until I figured out how to change the keyboard layout... (yes, you can do this in many web cafes. Damn! I can't spell caf-ay without an accent!). Someone somewhere is going to have a tough time typing any given characters.

    It's only fair if accessing foreign language sites is an equal pain-in-the-ass for all. The internet is not destined to be a single language medium.

    Maybe the operating systems and keyboards need to improve. What a shocking thought that is. Hey, it would be a huge help to lots of frustrated kids trying to do their foreign language homework on a computer.

    And really, if you think about it, how many foreign-language URLs do you type in during your day? Sometimes you get a search result that looks interesting so you translate it. Okay... you don't type that URL, you copy it and paste in the translator. Where else do we get URLs? When won't you be able to just copy it? Very rarely...

    I think we can deal with it for a while.

    --
    There are only 10 types of people: those who understand decimal, those who don't, and, uh, 8 other types I forget.
  176. I think I could fix the accents! by jtheory · · Score: 1

    / / ..
    I emailed taco my resume from the web cafe. Am I being naive?

    (they should hire me; I know how to fix it!)

    --
    There are only 10 types of people: those who understand decimal, those who don't, and, uh, 8 other types I forget.