Slashdot Mirror


ICANN Approves Non-Latin ccTLDs

Several readers including alphadogg tipped the news that ICANN has approved non-Latin ccTLDs at its meeting in Seoul. "Starting in mid-November, countries and territories will be able to apply to show domain names in their native language, a major technical tweak to the Internet designed to increase language accessibility. On Friday, the Internet's addressing authority approved a Fast-Track Process for applying for an IDN (Internationalized Domain Name) and will begin accepting applications on Nov. 16. The move comes after years of technical testing and policy development... Currently, domain names can only be displayed using the Latin alphabet letters A-Z, the digits 0-9 and the hyphen, but in future countries will be able to display country-code Top Level Domains (cc TLDs) in their native language. ... 'The usability of IDNs may be limited, as not all application software is capable of working with IDNs,' ICANN said in a 59-page proposal (PDF) dated Sept. 30 that describes the [application] process." Reader dhermann adds, "Great, now even less chance I can identify NSFW links before they are blocked by my work's big brother app and my boss is notified... again."

284 comments

  1. terrorist level domain by czarangelus · · Score: 4, Funny

    Arabic TLDs are a threat to national security

    --
    When a true genius appears, you can know him by this sign: that all the dunces are in a confederacy against him.
    1. Re:terrorist level domain by Canazza · · Score: 1

      it's only a matter of time before someone registers bánkófámérícá.com or llóydstsb.co.uk for their phishing schemes

      --
      It pays to be obvious, especially if you have a reputation for being subtle.
    2. Re:terrorist level domain by pablo.cl · · Score: 1

      bánkófámérícá is not a TLD. This was allowed several years ago.

    3. Re:terrorist level domain by orangesquid · · Score: 1

      Oh hell, just register [add-accent-to-previous-letter]bankofamerica.com. There are tons of ways to abuse unicode, as many security papers have already discussed.

      --
      --TheOrangeSquid Is it any wonder things seem so awry? We swim in a sea of confusion and don't have to think to survive
    4. Re:terrorist level domain by Anonymous Coward · · Score: 3, Informative

      That has been possible for years.
      This is about registering bankofamerica.cõm or lloydstsb.cø.ûk

      The part AFTER the dot.

    5. Re:terrorist level domain by selven · · Score: 1

      Or, even worse, bankofamerica.com where the o in com is actually a Cyrillic o.

  2. Hmm... by Anonymous Coward · · Score: 1, Insightful

    micrösöft.cöm?

    1. Re:Hmm... by Anonymous Coward · · Score: 1, Insightful

      Only if you can name a country whose native language ccTLD would be .cöm .

    2. Re:Hmm... by tepples · · Score: 1

      You're assuming that this isn't a trial run before ICANN expands IDN to gTLDs.

    3. Re:Hmm... by 4D6963 · · Score: 1

      No, fortunately they're not going for a full Unicode thing yet, only a few select character sets like Chinese or Arabic. So for the moment that shouldn't be a problem.

      --
      You just got troll'd!
    4. Re:Hmm... by Chris+Mattern · · Score: 1

      You're assuming that this isn't a trial run before ICANN expands IDN to gTLDs.

      IDK, but IMHO it could be a real FUBAR if they do it ASAP. YMMV.

    5. Re:Hmm... by MadKeithV · · Score: 1

      vi nöt tri a höliday in svéden this year?
      Sëe the löveli lakes?

    6. Re:Hmm... by xaxa · · Score: 2, Funny

      micrösöft.cöm?

      That's Microsoft with the volume turned up to 11?

    7. Re:Hmm... by SEWilco · · Score: 1

      OK, I'll name the country Cömet. Now to find a landmass to park it on.

    8. Re:Hmm... by Valdrax · · Score: 1

      Despite how metal that sounds, it's not where you'll find Chrome.

      --
      If it's for-profit but free, you're not the customer -- you're the product (e.g., the Slashdot Beta's "audience").
    9. Re:Hmm... by gnud · · Score: 1

      A Micrösöft product once BSODd my sisters computer.

    10. Re:Hmm... by allcaps · · Score: 1

      Go claim somewhere on Mars.

    11. Re:Hmm... by badkarmadayaccount · · Score: 1

      ROFL

      --
      I know tobacco is bad for you, so I smoke weed with crack.
    12. Re:Hmm... by Anonymous Coward · · Score: 0

      Is it a phishing trip?

  3. I took Latin in high school by Anonymous Coward · · Score: 2, Funny

    I'm glad we're going with Non-Latin TLDs now, I never understood going to the website "e.pluribus.unm"

  4. Perdire by SEWilco · · Score: 2, Funny

    There go my plans for world domination through venividivici.vvv

    1. Re:Perdire by dkleinsc · · Score: 1

      What about fahrvergnugen.vw?

      --
      I am officially gone from /. Long live http://www.soylentnews.com/
  5. first urls, then slashdot by azior · · Score: 5, Funny

    ï höpé thãt slâshðõt wìll dö thís töø wìth ÜRLs!

    www.íçáñn.örg

    ìt wörkéð!

    1. Re:first urls, then slashdot by Anonymous Coward · · Score: 5, Funny

      Here's a demonstration of how non-Latin characters show on /., starting with Arabic:

      Hindi:
      Russian:
      Japanese:
      Korean:
      Chinese:

    2. Re:first urls, then slashdot by camperdave · · Score: 1

      Here's a demonstration of how non-Latin characters show on /., starting with Arabic:

      Hindi:
      Russian:
      Japanese:
      Korean:
      Chinese:

      Just because the characters don't show up in the edited text doesn't mean that they won't be handled in anchor tags or Slashdot's URL tag.

      --
      When our name is on the back of your car, we're behind you all the way!
    3. Re:first urls, then slashdot by mrdoogee · · Score: 1

      A Møøse once bit my sister...

    4. Re:first urls, then slashdot by rxmd · · Score: 2, Insightful

      Just because the characters don't show up in the edited text doesn't mean that they won't be handled in anchor tags or Slashdot's URL tag.

      Well, Slashdot mangles them anyway. The URL should end in .com.

      Slashdot's web interface is quite embarrassing in this respect. Having a non-Unicode-capable page in 2009 is like having one that is optimized for Netscape 0.9, no matter what amount of JavaScript and Web 2.0 bling they put in there.

      If international URLs will finally force Slashdot to implement a triviality such as string parsing, so much the better.

      --
      As a state gets corrupt, its laws multiply; the most corrupt states have the most numerous laws. (Tacitus, Annales 3:27)
    5. Re:first urls, then slashdot by Mister+Whirly · · Score: 1

      Having a non-Unicode-capable page in 2009 is like having one that is optimized for Netscape 0.9,

      Damn it! It took me this long to get used to Netscape 0.8, and now you want me to upgrade again? No way! All of my Geocities and Angelfire pages load perfectly.

      --
      "But this one goes to 11!"
    6. Re:first urls, then slashdot by Bigbutt · · Score: 1

      ... in bed

      [John]

      --
      Shit better not happen!
    7. Re:first urls, then slashdot by Anonymous Coward · · Score: 0

      slashdot is not using utf-8? God.

    8. Re:first urls, then slashdot by Anonymous Coward · · Score: 0

      You must have been browsing in offline mode for a while as Geocities shut down a few days ago.

    9. Re:first urls, then slashdot by primus1024 · · Score: 1

      I'm pretty sure Geocities pages don't load anymore. They were closed on 26th ;-)

  6. FORSTUS POSTUS by Anonymous Coward · · Score: 0

    First non-latin top level post.

    1. Re:FORSTUS POSTUS by twistedcubic · · Score: 2, Funny

      That's PRIMUS POSTUS.

    2. Re:FORSTUS POSTUS by mrdoogee · · Score: 1

      Proximus Primus Postus.

    3. Re:FORSTUS POSTUS by Anonymous Coward · · Score: 0

      That's PRIMUS POSTUS.

      No, the GP did say non-latin.

  7. Toto, you are not in nss any more. by Anonymous Coward · · Score: 0

    nss or Kansas for that matter

  8. ICANN has lost it! by RiotingPacifist · · Score: 4, Insightful

    Far too much software makes the assumption that TLDs only contain [a-z0-9-], so if you want to go changing that there needs to be a damn good reason, and there is not. There are ~1369 two-letter TLDs to be shared between ~200 sovereign states and 49284 three-letter generic ones to be split between uses (.xxx, .nws, .org, .edu, etc.); there doesn't seem to be any good reason to expand that and make lots of software more complex.
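
One way to arrive at the two figures in this comment is sketched below. The comment does not show its working, so the 37-symbol alphabet and the "no trailing hyphen" rule are assumptions chosen because they reproduce the numbers.

```python
import string

# Assumed alphabet: a-z, 0-9 and the hyphen (37 symbols).
ALPHABET = string.ascii_lowercase + string.digits + "-"
n = len(ALPHABET)

# Two-character strings over that alphabet: 37 * 37.
two_char = n ** 2

# Three-character strings, assuming the last character may not be a hyphen:
# 37 * 37 * 36.
three_char = n * n * (n - 1)

# For comparison: two-letter codes using letters only.
letters_only = 26 ** 2

print(two_char, three_char, letters_only)
```

Under the letters-only rule that real country codes follow, the two-letter space is considerably smaller than the first figure.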

    --
    IranAir Flight 655 never forget!
    1. Re:ICANN has lost it! by Anonymous Coward · · Score: 0

      That's no reason to stop progress. If everyone thought like this, we would still all be programming in COBOL.

    2. Re:ICANN has lost it! by imagoon · · Score: 2, Informative

      If everyone in the world liked those latin characters, then sure. But maybe someone else in the world prefers yahoo.(nihon*)? Wanted to write it in kanji but /. doesn't seem to take unicode.

    3. Re:ICANN has lost it! by Idiomatick · · Score: 1

      Domain names have been muddy for quite some time. Think of all the non-commercial dot coms. Or government sites on anything other than their .gov or their country code. del.icio.us? They've been mostly ignored. People get .com to look professional, .net at random (though it is supposed to be for ISPs), and .org if you want to stand for some ideal.

      Though TBH I'm not certain WHY we need TLDs anyways. It isn't like there is some commercial slashdot.com it just redirects. I imagine that any big name will own all of the tlds with its name. Google is the only company that differentiates but it does it completely wrong. It shouldn't be google.com and google.org ... it could just be www.google vs org.google. I guess it is something that lets companies sell the same product to people multiple times.

    4. Re:ICANN has lost it! by Anonymous Coward · · Score: 2, Insightful

      This is about letting people use characters from their frickin' own language instead of just english.

      Just like so many other things in programming.. if the software doesn't do international, it doesn't do international.

      This has nothing to do with making more TLDs.

    5. Re:ICANN has lost it! by Jorgensen · · Score: 3, Insightful

      Yeah right. Because everybody in the whole world only uses ASCII right?

      Sorry for sounding flippant, but such US-myopia is far too prevalent for my liking.... Come on guys: Wake up and smell the coffee! There is more to the world than the US! There is no reason to make most of South East Asia and China 2nd-rate citizens on the internet.

      I agree that there is a lot of software that needs changing as a result though. But that just means more work, right? You could probably sell this as an anti-recession measure too.

    6. Re:ICANN has lost it! by cdrudge · · Score: 1

      Far too much software makes the assumption that TLDs only contain [a-z0-9-]

      It's not really an assumption, is it, if until now the "standard" only called for [a-z0-9-]?

    7. Re:ICANN has lost it! by Anonymous Coward · · Score: 0

      what about working with people around the planet?
      what if a french email has a cedille in it? how do you compose this cedille on your qwerty keyboard?
      how do you compose arabic url's from your qwerty keyboard?
      tell me it isn't splitting the internet again.

    8. Re:ICANN has lost it! by pablo.cl · · Score: 1

      What about .museum or .info? (more than three letters).

    9. Re:ICANN has lost it! by Tanktalus · · Score: 2, Funny

      And now, with today's progress, that'd be CØBÖL.

    10. Re:ICANN has lost it! by Anonymous Coward · · Score: 0

      We are not stopping them from building their own separate networks if they don't like the one that is already in place. But the US built the groundwork for the first computer network that evolved into the internet. Why would you be surprised that something that was invented in the US would be in English? And why do you expect the people that the system is working for to fix it for the people who don't like it? I would figure that it is up to the people who don't like it to fix it.

    11. Re:ICANN has lost it! by defaria · · Score: 1

      Huh? Wouldn't software be simplified by not applying any regex to the TLD?!? IOW why not simply assume it can be anything (short of a null byte).

    12. Re:ICANN has lost it! by agnosticnixie · · Score: 1

      You look in the table of characters, it's not like there's thousands of french words that have that. Hell, the cedille is common enough to be added to the characters you can hit with a modifier on at least apple's version of US qwerty. Also, US ASCII isn't even considered good enough for british english because of loan words - it can do three languages right: Latin, US English and Hawai'ian.

    13. Re:ICANN has lost it! by Jorgensen · · Score: 1

      I'm not surprised that it is in English. But I am surprised at the reluctance to change it. If it is not changed, then sooner or later we will end up with two separate "internets" and we will all be poorer as a result.

    14. Re:ICANN has lost it! by teh+kurisu · · Score: 1

      I can't speak for Windows or Linux, but on a Mac:

      how do you compose this cedille on your qwerty keyboard?

      Option-C.

      how do you compose arabic url's from your qwerty keyboard?

      System Preferences > Language & Text > Input Sources, and select 'Arabic'. There's a keystroke that will let you switch instantly between layouts.

    15. Re:ICANN has lost it! by Anonymous Coward · · Score: 0

      > And why do you expect the people that the system is working for to fix it for the people who don't like it?

      Because they might actually take you up on this threat and tell ICANN & the US Government to fuck off. And that's not ideal for anybody, least of all the US.

    16. Re:ICANN has lost it! by jayme0227 · · Score: 2, Insightful

      You know, except for ease of use for those who don't use Latin characters in their daily lives. But who cares about them? They should just go back to their own country and create their own internet.

      --
      But then I realized the cable was blue, so I only gave it one star. I hate blue.
    17. Re:ICANN has lost it! by xaxa · · Score: 1

      Slashdot won't accept my links, but some examples are on this page

    18. Re:ICANN has lost it! by xaxa · · Score: 1

      Arabic, Hindi etc is more of a problem, since the characters change depending on adjacent characters. I can copy Russian or Greek without a problem, but I can't copy Arabic or Hindi.

      However, I doubt I'll need to. If I do, the company will hopefully give two URLs -- one in Roman script, and one in the complex Asian script.

    19. Re:ICANN has lost it! by Toonol · · Score: 1

      Or everybody knowing English, which would be a good thing. Not because English is superior, but because having ANY language as a truly universal lingua franca would be wonderful.

    20. Re:ICANN has lost it! by bill_mcgonigle · · Score: 2, Informative

      if you want to go changing that there needs to be a damn good reason

      I don't have any first-hand experience, but according to the BBC story when one enters a native-script domain name into one's browser, the domain name is entered normally (for the locale) and then to enter, e.g., ".in", one needs to press a key combination to shift the keyboard into latin-mode, then, enter the two letters, then shift the keyboard back into native mode.

      It's a usability problem. I sure would be annoyed if .com had to be rendered in Kanji on my system.

      --
      My God, it's Full of Source!
      OUTSIDE_IP=$(dig +short my.ip @outsideip.net)
    21. Re:ICANN has lost it! by raju1kabir · · Score: 1

      US ASCII isn't even considered good enough for british english because of loan words - it can do three languages right: Latin, US English and Hawai'ian.

      Bahasa Malaysia, Bahasa Indonesia, Tagalog, and I'm sure many many more.

      --
      "Patriotism is your conviction that this country is superior to all other countries because you were born in it." -- GBS
    22. Re:ICANN has lost it! by agnosticnixie · · Score: 1

      Yeah I corrected myself lower - the Austronesian family mostly works.

    23. Re:ICANN has lost it! by agnosticnixie · · Score: 1

      Admittedly, yeah, not that arabic is that terrible (I grew up on arabic windows even if I only have very cursory basics in it). It's when they'll start expanding to UTF-16 and UTF-32 that they may have trouble getting everything in in such a way that you can make key modifiers work well enough to cover as much as possible.

    24. Re:ICANN has lost it! by dominious · · Score: 1

      FYI there is an option to change the keyboard layout in your OS and there is every language in there. Including French. You can even swap between languages by pressing ALT+SHIFT together:)

      Now excuse me, I will switch my layout to type in some Greek blog...

    25. Re:ICANN has lost it! by Anonymous Coward · · Score: 0

      Actually, this is why Dutch (and many other countries') domains end in .nl instead of .co.nl (like .co.uk) - ideally you'd want to be able to leave off that bit too of course, but ICANN wouldn't stand for that and this is the next best thing. Still, I think we should move on from TLDs (and of course the omnipresent www.). Forget two slashes for a time waster, think about all the keystrokes wasted on www. .com, which carries no meaning whatsoever.

    26. Re:ICANN has lost it! by Valdrax · · Score: 1

      If it is not changed, then sooner or later we will end up with two separate "internets" and we will all be poorer as a result.

      How will we be poorer? If we end up splitting the domain system, there will be interoperability bridges created, and I don't really see any huge difference between the current system where Chinese people read Chinese pages and Americans read English pages than a system where you have a slightly harder time getting to pages 90%+ of the people who speak the same language as you can't read anyway. Those who have a special interest will go to the minor technical trouble to get both DNS systems, and those who don't won't notice the difference.

      --
      If it's for-profit but free, you're not the customer -- you're the product (e.g., the Slashdot Beta's "audience").
    27. Re:ICANN has lost it! by DarkOx · · Score: 1

      There is no reason to make most of South East Asia and China 2nd-rate citizens on the internet.

      Sure there is a reason. The internet is most likely the most influential media outlet in existence today. It can do more to spread culture and ideas than any other tools we have. I for one enjoy living in the dominant culture where global communication and commerce are concerned. As an American I very much want American hegemony CONTINUED! I also understand not cooperating with other cultures at all will lead them to form their own little clubs which exclude us; and if that happens and enough of them get together then we would find ourselves on the outside.

      I am not totally against letting them have their funny little characters in URLs for stuff only they care about anyway. I just hope it's part of a larger plan to keep them on our network doing things for the most part on our terms. I am all for marginalizing other cultures and making them feel good about it at the same time if we can get away with it.

      --
      Repeal the 17th Amendment TODAY! Also Please Read http://www.gnu.org/philosophy/right-to-read.html
    28. Re:ICANN has lost it! by Anonymous Coward · · Score: 0

      Yea, boy, that Internet would suck if all those legitimate non-US pages went away ... like, uhm ...

  9. Encoding? by mewsenews · · Score: 1, Interesting

    The encoding seems weird to me:

    In reality, the new domain names will be stored in the DNS as sequences of letters and numbers beginning xn-- in order to maintain compatibility with the existing infrastructure. The characters following the xn-- will be used to encode a sequence of Unicode characters representing the country name.

    Any DNS gurus care to explain why they wouldn't simply use UTF8?

    1. Re:Encoding? by Psx29 · · Score: 3, Informative

      "in order to maintain compatibility with the existing infrastructure." Tons (dare I say, a majority) of software would break if they used UTF8

    2. Re:Encoding? by DamonHD · · Score: 5, Informative

      To avoid breaking all the DNS-related code out there that assumes (ie correctly, based on the current spec) only alphanumerics and '-' in each component.

      If you wish to rewrite every single bit of DNS-dependent code, in every laptop, server, embedded network device, etc, etc, ... well assume that it can't be done, and with this mechanism it doesn't need to be. Though I bet a few bits of code will barf at the '--' anyhow...

      Rgds

      Damon

      --
      http://m.earth.org.uk/
    3. Re:Encoding? by tokul · · Score: 2, Informative

      Any DNS gurus care to explain why they wouldn't simply use UTF8?

      I am not DNS guru, but guessing. RFC882 - November 1983. RFC2044 - October 1996.

    4. Re:Encoding? by NevarMore · · Score: 1

      Backwards compatibility with existing systems that don't support UTF-8 but still need to make DNS queries. Ranges from basic tools like dig, to un-updated browsers, to embedded devices like routers.

      Are there any public DNS servers that support this to see what happens with my existing software??

    5. Re:Encoding? by Looce · · Score: 1

      Since software makes the assumption that TLDs only contain [a-z0-9-], UTF-8 can't be used in the DNS. Internationalised domain names, even before these new ccTLDs, used that xn-- system, called Punycode. For instance, the site tinyarro.ws, which provides short URLs via a Unicode domain name, already used .ws for that purpose. It turns into xn--hgi.ws when the DNS request is issued.

      ccTLDs using Punycode is just an extension of that mechanism for second-level domains.
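
A minimal sketch of the xn-- mechanism described above, using Python's built-in `punycode` and `idna` codecs. The label `bücher` and the domain `bücher.example` are hypothetical illustrations, not names from the thread, and the built-in `idna` codec follows the older IDNA 2003 rules.

```python
# "punycode" is the raw encoding of a single label;
# "idna" works on a whole name and adds the xn-- prefix per label.
label = "bücher"            # hypothetical example label
name = "bücher.example"     # hypothetical example domain

raw = label.encode("punycode")
wire = name.encode("idna")

print(raw)    # b'bcher-kva'
print(wire)   # b'xn--bcher-kva.example'

# Decoding restores the Unicode form for display.
print(wire.decode("idna"))   # bücher.example
```

The encoded form contains only ASCII letters, digits, hyphens and dots, which is why software that assumes [a-z0-9-] keeps working.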

    6. Re:Encoding? by jeffasselin · · Score: 1

      Good question. The field size for DNS requests is in double words (16bits) increments, so I don't see why it couldn't have been.

      --
      If he explores all forms and substances Straight homeward to their symbol-essences; He shall not die.
    7. Re:Encoding? by bradley13 · · Score: 1

      The goal is to encode international characters as the characters currently accepted by the standard (a-z, 0-9, etc.) UTF does not do this. Also, each label in a domain name is limited to 63 characters (and it is the encoded length that counts), so the coding has to be efficient. This is precisely what Punycode is designed to do. Software can recognize an encoded name by the fact that it begins with the special sequence of letters "xn--".

      --
      Enjoy life! This is not a dress rehearsal.
    8. Re:Encoding? by Delwin · · Score: 1

      because UTF8 only solves the null term problem, not the readable character issue.

    9. Re:Encoding? by amorsen · · Score: 1

      (ie correctly, based on the current spec)

      Only hostnames are restricted. Other than that, DNS is almost 8-bit-clean (it case folds A-Z to a-z and dot is special) so UTF-8 is fine.

      Punycode only exists because some people have puny ...

      --
      Finally! A year of moderation! Ready for 2019?
    10. Re:Encoding? by Anonymous Coward · · Score: 0

      But it's not compatible with URLs that contain xn--, intended to show as xn--. Punycode is going to cause those URLs to not render as intended (with the associated risk of lawsuits from webmasters), while UTF8 encoding wouldn't have this problem.

    11. Re:Encoding? by ObsessiveMathsFreak · · Score: 1

      Any DNS gurus care to explain why they wouldn't simply use UTF8?

      Because they know full well that the vast majority of web developers don't really know what unicode is or how it works. Moreover the unicode spec is forever in flux and complete overkill for the international url problem. Latin-only urls are a fly, we don't need a bazooka.

      Frankly, the current Punycode based system is truly inspired, giving the best of both worlds. Newer browsers can display and use international urls seamlessly, but older systems need never know they exist. People get what they wanted and the entire system chugs on as before. This is exactly what was needed. A simple and effective system that sits on top of existing infrastructure.

      If only IPv6 had been designed to be this transition friendly.

      --
      May the Maths Be with you!
    12. Re:Encoding? by DamonHD · · Score: 1

      Can you be sure that the DNS code in the WinME that runs your building's lifts is 8-bit clean, just for example? Or your old-but-good HP laser printer with embedded networking?

      This is pragmatically addressing the probability of code still in use but written long ago when UTF-8 and 8-bit-clean were woolly notions and twinkles in academic eyes, or just badly slapped together by some junior lowest-bid developer who thought "oh, just (ASCII) letters and numbers" and it seemed to work...

      I'd bet you a whole dollar that at least one piece of DNS software that you use, explicit or embedded, would break if fed, say, ISO-Latin-1, never mind the UTF-8 control codes, etc.

      Rgds

      Damon

      --
      http://m.earth.org.uk/
    13. Re:Encoding? by ReallyEvilCanine · · Score: 1

      To prvnt phishing and other abus. provids an identifiabl stpgap to prvnt m ding smthing with th URL that I'v just dn in this mmnt.

    14. Re:Encoding? by petermgreen · · Score: 1

      But it's not compatible with URLs that contain xn--, intended to show as xn--.
      But how many of those were there? and afaict it's not xn-- anywhere just xn-- at the start of a part of a DNS name.

      --
      note: i'm known as plugwash most places but i screwd up registering that here somehow in the past and now can't register
    15. Re:Encoding? by Creepy · · Score: 2, Insightful

      Actually, UTF-8 can and is being used in DNS - as long as you stick to basic Latin characters, that is. Also it is Unicode - as I posted earlier, Unicode is a blanket for UTF-8, UTF-16 and UTF-32 which makes it ambiguous.

      UTF-8 bits 0-7 is ASCII as long as bit 8 isn't set, so to fully support it you'd need to still exclude bits below 7 that are not valid html characters and include support for multiple bytes and bit 8. The reason existing DNS servers won't work with it is because bit 8 indicates multibyte and the second byte may carry an invalid character from the 0-7 bits and the first byte may have a language encoding for the second byte (indicated by bit 8). For instance character 43 is + and that is invalid in a URL. If character 1 had bit 8 set and indicated the language as French in the language encoding (which I believe is done in the first 7 bits and can in some cases be extended to the second or even third byte, but its been a while since I read the spec - I do know there is an encoding that does this and I'm pretty sure it is UTF-8), the second byte 43 would (probably - I'm not going to look it up) mean something entirely different and be perfectly valid.

    16. Re:Encoding? by kc8apf · · Score: 1

      Actually, all labels are restricted to the characters allowed for ARPANET hosts. The spec does state that implementations should store labels as a length octet followed by a sequence of octets, thus implying that any compliant software _should_ handle UTF8, but no one wants to take that chance.
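
The "length octet followed by a sequence of octets" layout can be sketched as below. `encode_name` is a hypothetical helper written for illustration; the 63-octet label limit and the terminating zero-length label come from RFC 1035, while encoding labels as UTF-8 is an assumption made here to show that the layout itself does not care about the byte values.

```python
def encode_name(name: str) -> bytes:
    """Encode a dotted name as a sequence of length-prefixed labels."""
    out = bytearray()
    for label in name.rstrip(".").split("."):
        data = label.encode("utf-8")  # raw octets; no ASCII check is made here
        if not 1 <= len(data) <= 63:
            raise ValueError(f"label must be 1-63 octets: {label!r}")
        out.append(len(data))  # one length octet
        out += data            # followed by the label's octets
    out.append(0)              # zero-length root label ends the name
    return bytes(out)


print(encode_name("slashdot.org"))  # b'\x08slashdot\x03org\x00'
```

Nothing in this framing breaks on bytes above 0x7F; the risk the thread is arguing about lies in what downstream code does with those bytes.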

      --
      kc8apf
    17. Re:Encoding? by tokul · · Score: 1

      The field size for DNS requests is in double words (16bits) increments, so I don't see why it couldn't have been

      UTF8 is variable length 8bit charset. Symbol length varies from 1 to 6 bytes. Three and five byte characters included.
      You confused UTF8 with UCS2 and UTF16.

    18. Re:Encoding? by IcyWolfy · · Score: 1

      UTF-8 at the very least is ASCII-safe. If bit 8 is set, then it is part of a multi-byte character (all bytes of a multibyte character have bit 8 set). But if bit 8 is not set, then it's a plain ASCII character, no matter its position. It is this self-correcting property of UTF-8, which marks whether you are on an initial byte or a single byte, that makes it good for synchronising in a stream.

      Encodings like S-JIS where the lead byte changes the meaning of all bytes ahead of it, does have the issue you mentioned. But, that's the reason everyone switched to UTF-8.
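
The ASCII-safety property described above can be checked directly. This is a small sketch; the sample string is a hypothetical mix of ASCII and non-ASCII characters.

```python
text = "rødgröd+日本"  # hypothetical sample mixing ASCII and non-ASCII

for ch in text:
    encoded = ch.encode("utf-8")
    if len(encoded) == 1:
        # A single byte is always plain ASCII (high bit clear).
        assert encoded[0] < 0x80
    else:
        # Every byte of a multi-byte sequence has the high bit set.
        assert all(b >= 0x80 for b in encoded)
        # The lead byte looks like 11xxxxxx, continuation bytes like 10xxxxxx.
        assert encoded[0] & 0xC0 == 0xC0
        assert all(b & 0xC0 == 0x80 for b in encoded[1:])

# Consequently the byte 0x2B can only ever be a literal '+'.
data = text.encode("utf-8")
plus_positions = [i for i, b in enumerate(data) if b == 0x2B]
print(plus_positions)
```

The lead/continuation bit patterns are what let a reader dropped into the middle of a stream find the next character boundary.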

    19. Re:Encoding? by Anonymous Coward · · Score: 0

      It's a myth that all DNS related code would break. DNS has been 8-bit clean from the very beginning. UTF8 is the obvious choice for international domain names but the IETF screwed up. Now the IAB is trying to fix the mess.

      http://tools.ietf.org/html/draft-iab-idn-encoding-00

    20. Re:Encoding? by amorsen · · Score: 1

      Why would I care whether the HP laser printer or the elevator can look up the new domain names?

      If the mere existence of new UTF-8 domain names causes problems for those devices, then I can already cause problems for them today. I can add whichever names I want to my own subdomains.

      --
      Finally! A year of moderation! Ready for 2019?
    21. Re:Encoding? by spitzak · · Score: 1

      In UTF-8 if the second byte does not have the high bit set, it is an invalid encoding, and therefore should be treated as an error byte followed by that ASCII letter. You can be certain that if you see the byte 43 anywhere that it means a '+' , you do not ever have to examine surrounding bytes.

      UTF-8 is much much easier to deal with than people think. The best way to deal with UTF-8 is to treat all sequences of bytes with the high bit set as some kind of "foreign word" that you should not interpret and not split. The fact that it might actually be a single Unicode code point or multiple ones or contain errors really should have no effect on your software. Only in extremely rare circumstances, such as when you actually need to draw a string on the display for the user, do you need to do anything more complicated such as detect errors and figure out what Unicode code points are represented.

      Unfortunately there are far too many programmers who get completely flummoxed when presented with UTF-8 and go crazy trying to treat the Unicode decoding as "characters". This seems to be due to decades of being exposed to documentation that says "character" instead of "byte" so that they start to think there is some magical property so that the computer is incapable of pointing to a byte inside a character or that existence of such a pointer will somehow make the string unreadable. The best exercise is to imagine "words" in text and figure out why a huge amount of text processing is possible without having to always find the word boundaries, and why you can iterate over the "pieces" of the "words" without crashing and without making the string somehow "wrong", and that the word 'I' and 'a' have only one letter but are still "words", and then hit yourself with the clue-by-4 until you realize that there is no difference between a "word" and a "character".
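
A rough sketch of the "foreign word" approach: work on bytes, split only on ASCII delimiters, and leave high-bit runs uninterpreted. `split_runs` is a hypothetical helper, and the sample name is the one used elsewhere in this thread.

```python
import re


def split_runs(data: bytes) -> list[bytes]:
    """Split into alternating runs of ASCII bytes and high-bit bytes."""
    return re.findall(rb"[\x00-\x7f]+|[\x80-\xff]+", data)


name = "rødgröd.amorsen.dk".encode("utf-8")

# Splitting on the ASCII dot never cuts through a multi-byte sequence,
# so no decoding is needed to find the labels.
labels = name.split(b".")
print(len(labels))

# High-bit runs can be carried around as opaque units.
runs = split_runs(name)
print(len(runs))

# Decoding is only needed at the point of display.
print([label.decode("utf-8") for label in labels])
```

Because no ASCII byte ever appears inside a multi-byte sequence, every label still decodes cleanly after the byte-level split.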

    22. Re:Encoding? by amorsen · · Score: 1

      Well, as I said, everyone apparently has puny balls.

      If the DNS software itself breaks by being fed UTF-8, then I can break it today by simply putting rødgröd.amorsen.dk into my DNS server and asking the broken host to fetch that record. So security isn't an argument. What is left is a fear that some people might not be able to access the new domains. Well guess what happened with punycode: No one could access the new domains because browsers didn't implement punycode. And of course punycode didn't actually prevent security problems.

      Fail.

      --
      Finally! A year of moderation! Ready for 2019?
    23. Re:Encoding? by pwfffff · · Score: 1

      "Why would I care whether the HP laser printer or the elevator can look up the new domain names?"

      Judging by the latest trends in technology, I'm guessing that pushing your floor's button no longer sends an analog signal to the motor, but rather sends it a twitter.

    24. Re:Encoding? by DamonHD · · Score: 1

      Either way you might or might not be able to look them up, but shoving binary in those fields might make old implementations *crash*.

      Rgds

      Damon

      --
      http://m.earth.org.uk/
    25. Re:Encoding? by Anonymous Coward · · Score: 0

      I'm not a DNS guru, but I've had lots of experience with UTF-8 characters and problems that involve them (not UTF8's fault, but the fault of bad systems that don't accept them properly).

      In addition to a wonderfully rich set of characters that can represent almost any written text in the world's history, UTF-8 also has lots of repeated and easily confusable characters. E.g. "your–bank–website.com" looks just like "your-bank-website.com", but it's not! Several other characters have the same thing going on. This is a Cyrillic "x": "". Can you tell the difference? UTF8 will lead to abuse in this way, unless you try to restrict the set of UTF8 characters that can be used in DNS, but that's obviously going to lead to very messy implementations.

    26. Re:Encoding? by Anonymous Coward · · Score: 0

      Hmm, Slashdot ate my Cyrillic 'x'. Try this link and check out the result. Compare to a search for a regular ASCII 'x'.

    27. Re:Encoding? by amorsen · · Score: 1

      Again, I can do the exact same thing today by putting garbage into my own domain. Waving the standard at black hats won't accomplish much.

      --
      Finally! A year of moderation! Ready for 2019?
    28. Re:Encoding? by fm6 · · Score: 1

      Because they'll stop working if they can't. And yes, there are printers that need to access the network. Home printers tend to be pretty simple-minded (though many now have their own network addresses) but office printers often double as fax machines and scanners -- which means they have to look up the name of a mail server to forward those. And the mail server might well be named .

      (Oops, Slashdot still barfs on non-Latin characters. Here's the cute Chinese name I came up with: http://tiny.cc/MUhUL )

      Elevators too. Some of these are pretty smart and probably have the ability to report malfunctions, usage patterns, etc.

    29. Re:Encoding? by jc42 · · Score: 1

      Backwards compatibility with existing systems that don't support UTF-8 but still need to make DNS queries.

      Well, on a couple of projects I've tested the software for the ability to handle UTF-8, and I've been duly impressed by how hard it has been to find anything that fails. In the few rare cases where there's a problem, it has usually taken a one-line change to fix it: You add code to classify all the bytes above 0x7F as a "letter". Typically that's just one pattern or range check, only a few extra characters in one line of code.

      A curious case: A year or so back, I was working on some CSS that handled Chinese and Japanese text, mostly by defining classes that used a bigger font for those chunks. Just for fun, I gave those classes Chinese names, spelled with Chinese chars of course. I tested it with all the browsers I could get my hands on. I couldn't find one that didn't handle those class names correctly. I was surprised (and impressed) by this. But I eventually realized that, given the previous paragraph, I shouldn't have been surprised. The code that parses the CSS probably just dumbly looks for all the syntax chars in CSS, and everything else is "just text". Maybe I'll test this by writing some CSS with ASCII control chars as the names of classes. My prediction is that if I don't use any of the usual (\b, \n, \r, \t or \0), it'll work.

      Sometimes the dumbest approach works best. If your code just thinks everything above 0x7F is a letter (or unanalyzed text), it'll probably work with any encoding that preserves the meaning of ASCII chars. The only likely exceptions are for code that has to actually render the text for human legibility, and most of the world's software doesn't ever do that.
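
      For what it's worth, the "one range check" fix can be sketched in a few lines of Python. This is only an illustration of the idea, not code from any of the projects mentioned; is_word_byte and tokenize are hypothetical names.

```python
def is_word_byte(b: int) -> bool:
    """Treat ASCII letters, digits, and every byte above 0x7F as part of a word."""
    return b > 0x7F or chr(b).isalnum()


def tokenize(data: bytes) -> list:
    """Split raw bytes into words without decoding them."""
    tokens, current = [], bytearray()
    for b in data:
        if is_word_byte(b):
            current.append(b)
        elif current:
            tokens.append(bytes(current))
            current.clear()
    if current:
        tokens.append(bytes(current))
    return tokens


text = "rødgröd med fløde, 日本語 ok".encode("utf-8")
words = [t.decode("utf-8") for t in tokenize(text)]
# words is ['rødgröd', 'med', 'fløde', '日本語', 'ok']
```

      The tokenizer never needs to know where one code point ends and the next begins; the decode at the end is only there to show that every token is still valid UTF-8.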

      (I also experimentally wrote some C code that included variables with Chinese, Japanese and Arabic names, as UTF-8 Unicode. The couple of compilers I fed it to compiled it without even any warnings. But somehow I suspect that this won't be true for all current C compilers. ;-)

      UTF-8 Unicode is easy to code for. It's usually harder to not handle it correctly.

      --
      Those who do study history are doomed to stand helplessly by while everyone else repeats it.
    30. Re:Encoding? by amorsen · · Score: 1

      It's fine that they need to access the network, but that doesn't mean they are going to look up a random UTF-8 domain name which doesn't have anything to do with them.

      I repeat myself again: if this was a problem, I could already crash the world's printers by adding a UTF-8 record to my own domain.

      --
      Finally! A year of moderation! Ready for 2019?
  10. NSFW by Anonymous Coward · · Score: 0

    ===Reader dhermann adds, "Great, now even less chance I can identify NSFW links before they are blocked by my work's big brother app and my boss is notified... again."===

    Seriously? You think people shouldn't be able to use internet in their native language because you are afraid of getting in trouble for browsing the web at work when you already know you shouldn't? I'd fire you right now if I was your boss.

    1. Re:NSFW by Anonymous Coward · · Score: 0

      Says the guy browsing Slashdot at work.

    2. Re:NSFW by Anonymous Coward · · Score: 0

      Some of us are jobless and living on welfare, you insensitive clod!

    3. Re:NSFW by Anonymous Coward · · Score: 0

      The difference is that OP is trying to deceive his boss, while I work for myself and can read Slashdot all day long if I want.

    4. Re:NSFW by Zontar+The+Mindless · · Score: 1

      Are you really that dense? This has nothing to do with any deception by dhermann. Let me spell it out for you.

      His boss may very well have no problem with him reading tech or news sites while he's at work.

      Rather dhermann would prefer not to follow a link posted in a discussion at "forums.megaculenewhardwarereviews.com" like this one (yes, this really is NSFW!):

      Here's a post on my blog with some test results I got using the new BrandX SSD.

      And have the boss red-alerted that dhermann's just paid a visit to PornTube using the company's Net connection.

      This is bad enough with URL-shorteners (and one reason why I happen to detest them), but gets even worse when what appears in your status bar uses characters you can't even read.

      *Now* do you get it?

      --
      Il n'y a pas de Planet B.
    5. Re:NSFW by Anonymous Coward · · Score: 0

      I don't normally browse websites written in a language I can't understand. I still see an ignorant american that thinks the whole world should read and write english for people like dhermann. NSFW has nothing to do with supporting more internationalization and it's all a cop out.

    6. Re:NSFW by Zontar+The+Mindless · · Score: 2, Insightful

      I don't normally browse websites written in a language I can't understand.

      1. The link text in the example I provided was in English.

      2. I am not aware of any requirement that only one language may be used on a given website. If there is such a requirement, please inform my contacts on Facebook of this, because they post messages there in about 15 different languages using at least 4 different writing systems. (And I've posted there myself in 4 languages, including English.)

      I still see an ignorant american that thinks the whole world should read and write english for people like dhermann.

      1. See above.

      2. So you are saying that you can read my mind? Perhaps this ability of yours needs some fine-tuning, since I never made any such assertion.

      3. It's true that I still carry a US passport, but I've not lived there in many years.

      NSFW has nothing to do with supporting more internationalization and it's all a cop out.

      Nobody is "copping out", and if you seriously think I am opposed to internationalisation, you're barking up the wrong tree.

      Nevertheless, dhermann is voicing what I believe is a legitimate concern, even more so for less sophisticated and experienced Internet users.

      The answer to such concerns is, of course, education. Many people are not even aware that services like Google Translate are available.

      In the meantime, I suggest you remove the chip from your shoulder. Not all Americans are alike, you know.

      --
      Il n'y a pas de Planet B.
  11. And the answer to that... by Looce · · Score: 4, Interesting

    ... of course, is Punycode.

    A comment before yours has www.íçáñn.örg, which, when entered into Firefox, turns into

    www.xn--n-tfarxw.xn--rg-eka

    . Looks like the software will still live :)
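
    The same conversion can be reproduced with Python's built-in "idna" codec (which, as far as I know, implements the older IDNA 2003 rules). A minimal sketch, using the names quoted in this thread:

```python
name = "www.íçáñn.örg"

# Each non-ASCII label is converted to Punycode and prefixed with "xn--".
ascii_form = name.encode("idna")
# ascii_form is b'www.xn--n-tfarxw.xn--rg-eka'

# The conversion is reversible, so software can show the Unicode form again.
round_trip = ascii_form.decode("idna")

# The bare Punycode step, without the "xn--" prefix:
bare = "örg".encode("punycode")
# bare is b'rg-eka'
```

    Anything that only ever sees ascii_form (resolvers, old libraries) is still dealing with plain letters, digits and hyphens.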

    1. Re:And the answer to that... by Anonymous Coward · · Score: 0

      When you enter "www.íçáñn.örg" into the chrome browser (at least in windows) it automagically thinks you're searching for the term and not actually entering it in as a valid address...

    2. Re:And the answer to that... by Firehed · · Score: 1

      Probably because up until about five minutes ago, it wasn't a valid address.

      --
      How are sites slashdotted when nobody reads TFAs?
    3. Re:And the answer to that... by Clairvoyant · · Score: 1

      FYI: in my Opera it's just www.íçáñn.örg. Exactly the way you wanted it.

    4. Re:And the answer to that... by Looce · · Score: 4, Informative

      You don't understand. Punycode is how second-level domains are already implemented, even on top of relatively old browsers. This is an extension of Punycode to be usable in the TLD as well.

      In other words, your current version of Firefox will be able to visit pages in IDN TLDs when they're implemented, and so if someone does create a .örg TLD today, you can go to www.anysite.örg to your heart's content already.

      Note that this doesn't mean you can go to www.anysite.örg in NCSA Mosaic or anything, because these old browsers were around when Punycode wasn't even a standard. You can go to www.anysite.xn--rg-eka and NCSA Mosaic will recognise that, though. The seamless IDN TLD usage is just going to be present in the more modern browsers. I expect that Opera 8+, IE 6+, Firefox 2+ and recent Safari/Konqueror/Epiphany are going to be able to visit www.anysite.örg and 'hide' the xn--etc- access details from you, the user.

      Happy surfing!

    5. Re:And the answer to that... by aztracker1 · · Score: 1

      So long as my email client doesn't hide it, I'd be okay with that.

      --
      Michael J. Ryan - tracker1.info
    6. Re:And the answer to that... by ciggieposeur · · Score: 1

      I expect that Opera 8+, IE 6+, Firefox 2+ and recent Safari/Konqueror/Epiphany are going to be able to visit www.anysite.örg and 'hide' the xn--etc- access details from you, the user.

      They had better make that user configurable. I do not want to click on "https://paypal.com" and be silently redirected to "paypal.xn--w-h-a-t-ev-er" -- which would have a perfectly valid SSL cert -- instead.

    7. Re:And the answer to that... by Anonymous Coward · · Score: 0

      I expect that Opera 8+, IE 6+, Firefox 2+ and recent Safari/Konqueror/Epiphany are going to be able to visit www.anysite.örg and 'hide' the xn--etc- access details from you, the user.

      VERY BAD idea.

    8. Re:And the answer to that... by Anonymous Coward · · Score: 0

      Yeah, just like they hide %20 %etc

  12. Phishing aid by querist · · Score: 5, Insightful

    This will only make phishing attacks easier unless there are SERIOUS checks on domain name registrations. There are letters in the Cyrillic alphabet that have different character codes than their look-alike letters in the Latin alphabet. I'm sure there are other collisions as well. I'm sure they accounted for this in the proposal, but the problem always lies in the implementation. From a security standpoint, this is a VERY bad idea without proper regulation of domain name registrations, and so far it has been demonstrated that we cannot manage them properly even with only the Latin alphabet. From a cultural and usability standpoint, this is a good thing. It will be easier for someone whose native language uses a non-Latin alphabet to recognize the supposed purpose of a web site by its domain name if some of those domain names can be in their native language. A hypothetical native Tamil speaker who speaks no English will be able to recognize the purpose of a site with an appropriate domain name in Tamil, for example.

    1. Re:Phishing aid by nsayer · · Score: 3, Informative

      I think the limitation that nationalized character sets will be restricted to the country TLDs where that language is native is a good first step. Additionally, I believe you're not allowed to use the Latin alternative form characters from Unicode (like 0xFF20-0xFF5F).

      If you're really paranoid, you could just be extra suspicious of domains that end in two letters (and yes, I am including .us), particularly when the 2nd level name is something you recognize, like paypal, ebay, etc. If you're in China, there may indeed be a legitimate paypal.cn, but I suspect it would set off my spidey sense to see a URL like that show up in my e-mail.
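
      On the alternative-form characters: a small Python sketch of why they are the easy case. Compatibility normalization (NFKC, which I understand the IDNA 2003 preparation step applies) folds fullwidth Latin letters back to plain ASCII, while a Cyrillic look-alike survives untouched. The sample strings are made up for the illustration.

```python
import unicodedata

# "paypal" spelled with fullwidth Latin letters (U+FF41 onwards).
fullwidth = "\uff50\uff41\uff59\uff50\uff41\uff4c"
folded = unicodedata.normalize("NFKC", fullwidth)
# folded is 'paypal'

# A Cyrillic small "a" (U+0430) is a distinct letter, not a compatibility
# variant, so normalization leaves it alone.
cyrillic_a = "\u0430"
unchanged = unicodedata.normalize("NFKC", cyrillic_a)
```

      So normalization deals with the fullwidth forms, but cross-script look-alikes need a separate check.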

    2. Re:Phishing aid by dkf · · Score: 2, Insightful

      If you're really paranoid, you could just be extra suspicious of domains that end in two letters (and yes, I am including .us), particularly when the 2nd level name is something you recognize, like paypal, ebay, etc. If you're in China, there may indeed be a legitimate paypal.cn, but I suspect it would set off my spidey sense to see a URL like that show up in my e-mail.

      That won't work. There really are a lot of big companies that have country-specific sites that use the two-letter global domains. For example, if you're after books in German then you might be very interested in visiting amazon.de, which is totally legit.

      --
      "Little does he know, but there is no 'I' in 'Idiot'!"
    3. Re:Phishing aid by petermgreen · · Score: 1

      I don't think it's a big deal for TLDs since afaict those are created manually anyway.

      For lower level domains (which are already using IDN) it's a bigger issue; Firefox resorted to using a whitelist to get around irresponsible registrars.

      --
      note: i'm known as plugwash most places but i screwd up registering that here somehow in the past and now can't register
    4. Re:Phishing aid by pablo.cl · · Score: 2, Informative

      There are letters in the Cyrillic alphabet that have different character codes than their look-alike letters in the Latin alphabet.

      Remember we are talking about ccTLDs. There are no more than 200 countries that would like to use non ASCII ccTLD, and they can be inspected manually. Russia wasn't awarded Cyrillic .ru because it looks like Latin .py (Paraguay). They will get .fr (Russian Federation) that looks like 0p (0 with vertical bar).

    5. Re:Phishing aid by nsayer · · Score: 2, Insightful

      Yeah, but if you know that you want that, then you'll be expecting it. We're talking about being on the lookout for 2 letter TLDs in places you don't expect them.

    6. Re:Phishing aid by Mathieu+Lu · · Score: 4, Insightful

      This risk can be greatly reduced if they limit domain names to only one alphabet, i.e. Russian domain with Cyrillic ccTLD should have only Cyrillic letters in it.

      In many of these countries, they often have two domain names for a website: one that is easy to remember by foreigners, one that is easy to remember by locals (i.e. cyrillic name transliterated to Latin alphabet). The transliterated domain name is usually horrible, sounds weird, and often people transliterate stuff in different ways, so it's often not easy to remember anyway.

      I think non-latin ccTLDs is a good thing.

      matt
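
      A rough Python sketch of such a one-alphabet rule, assuming the first word of each character's Unicode name is a good enough stand-in for its script (a heuristic for illustration, not how any registry actually does it). scripts_in and is_single_script are hypothetical helpers.

```python
import unicodedata


def scripts_in(label: str) -> set:
    """Collect the script of every letter, guessed from its Unicode name."""
    scripts = set()
    for ch in label:
        if ch.isalpha():
            # e.g. 'CYRILLIC SMALL LETTER A' gives 'CYRILLIC'
            scripts.add(unicodedata.name(ch).split()[0])
    return scripts


def is_single_script(label: str) -> bool:
    """Accept a label only if all of its letters come from one script."""
    return len(scripts_in(label)) <= 1


spoof = "sl\u0430shdot"   # the "a" is Cyrillic U+0430
honest = "slashdot"
russian = "\u0440\u0444"  # two Cyrillic letters
```

      Here is_single_script(spoof) is False while the other two pass. Digits and hyphens are ignored, and a label written entirely in look-alike Cyrillic letters would still get through, so this only narrows the problem.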

    7. Re:Phishing aid by shutdown+-p+now · · Score: 1

      Russia wasn't awarded Cyrillic .ru because it looks like Latin .py (Paraguay). They will get .fr (Russian Federation) that looks like 0p (0 with vertical bar).

      Are you sure it's not .rf - which doesn't clash with anything either, and makes much more sense to Russians themselves (since that is the standard abbreviation for Russian Federation in Russian).

    8. Re:Phishing aid by Yvanhoe · · Score: 1

      That was my first thought. I'll go register "slshdot.org" now (the first 'a' is in fact the Cyrillic version). I'm sure .net will be popular as well.

      --
      The Wise adapts himself to the world. The Fool adapts the world to himself. Therefore, all progress depends on the Fool.
    9. Re:Phishing aid by Yvanhoe · · Score: 1

      Ow, Slashdot is really non utf8 then...
      I was first mentioning U+0430 ('a' in Cyrillic, decimal 1072) and U+0FCC (a swastika)

      --
      The Wise adapts himself to the world. The Fool adapts the world to himself. Therefore, all progress depends on the Fool.
    10. Re:Phishing aid by Tony+Hoyle · · Score: 1

      They will get .fr (Russian Federation) that looks like 0p (0 with vertical bar).

      That's gonna piss off the french.

    11. Re:Phishing aid by pablo.cl · · Score: 1

      The French use .fr, the Russians will use .øp (ø with a vertical bar, not diagonal).

      And since I made a mistake, it's actually .pø

    12. Re:Phishing aid by pablo.cl · · Score: 1

      Are you sure it's not .rf - which doesn't clash with anything either, and makes much more sense to Russians themselves (since that is the standard abbreviation for Russian Federation in Russian).

      You are right. I checked and it's .pø (ø with a vertical bar).

    13. Re:Phishing aid by Anonymous Coward · · Score: 0

      an idea i've thought about was to colour-code the url in the address bar to what language it's using. for example, if it has cyrillic letters they appear against a red background while english ones appear against a blue one.

      getting what language range a certain character is in with most utf8 (i know this isn't) systems is easy enough.

      just £0.01.

    14. Re:Phishing aid by Anonymous Coward · · Score: 0

      What? Since when are TLDs semantic? Most TLDs are freely available, so you cannot infer anything from them, other than the image the domain holder tries to convey. I don't know if anyone will be fooled by a Cyrillic .ru domain into thinking it must be a website about Python. And if they are, what is the damage?

    15. Re:Phishing aid by smoker2 · · Score: 1

      What like in the url bar ? Maybe the answer is to actually pay attention to the world around you.

  13. Re:Speeding the path to IPv6? by Anonymous Coward · · Score: 2, Insightful

    I wonder what impact this will have on the ever decreasing amount of IPv4 addresses available.

    This will have absolutely no effect on IPv4/IPv6. This is a DNS change to allow additional characters in domain names.

    The domain names get translated to ip addresses by DNS servers.

    I doubt that individuals & companies said, "No! We refuse to go on the internet until we can have TLDs with non-Latin characters."

  14. Here comes the Phishers! by ircmaxell · · Score: 1, Insightful

    Yay!!! The door is open for an even harder to detect phishing scheme! Imagine the emails linking to http://slashd/öt.org/something...

    I'm all for internationalization, but perhaps limit it to internationalized domain extensions (.jp or .es for example)...

    --
    If a man isn't willing to take some risk for his opinions, either his opinions are no good or he's no good
    1. Re:Here comes the Phishers! by nsayer · · Score: 1

      You not only didn't read TFA, but you didn't even read the summary very well, did you?

    2. Re:Here comes the Phishers! by pablo.cl · · Score: 1

      slashdöt is not a ccTLD, and currently is allowed. Click here in Firefox or Opera: http://xn--slashdt-f1a.de/

    3. Re:Here comes the Phishers! by Anonymous Coward · · Score: 0

      Hi, you must be new here...

    4. Re:Here comes the Phishers! by mjwx · · Score: 2, Informative

      You do know that this is for the TLD part of the URL only? The first part of a domain can already be written in non-Latin scripts, Korean for example, but the TLD must be Latin; this decision just enables the .com.kr to be turned into Hangul.

      If ICANN did not standardise this then nations will just implement their own systems which will be different and incompatible with each other, much like China and Thailand have already done.

      --
      Calling someone a "hater" only means you can not rationally rebut their argument.
  15. Re:how to... by Anonymous Coward · · Score: 0, Insightful

    So build your own damn internet.

  16. Are we going to have to update the URL RFC? by adamy · · Score: 1

    The current RFC 1738 (http://www.faqs.org/rfcs/rfc1738.html) only allows URLs to be composed of:

    " Within those parts, an octet may be represented by the chararacter which has that octet as its code within the US-ASCII [20] coded character set. In addition, octets may be encoded by a character triplet consisting of the character "%" followed by the two hexadecimal digits (from "0123456789ABCDEF") which forming the hexadecimal value of the octet. (The characters "abcdef" may also be used in hexadecimal encodings.)"

    So A-Z and %XX just ain't gonna cut it.

    Currently URLs are in the ASCII subset of utf-8. What are they going to be in in the future?

    What about languages that go from right to left like Hebrew and Arabic?
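
    For the non-hostname parts of a URL, the usual way to stay inside that ASCII-only rule is to percent-encode the UTF-8 octets. A minimal Python sketch with a made-up path:

```python
from urllib.parse import quote, unquote

path = "/wiki/rødgröd"

# quote() encodes the string as UTF-8 and escapes every non-ASCII octet
# as "%" plus two hex digits; "/" is left alone by default.
encoded = quote(path)
# encoded is '/wiki/r%C3%B8dgr%C3%B6d'

decoded = unquote(encoded)
```

    The result is pure US-ASCII, so it fits the quoted rule. The hostname is handled separately (Punycode), and the later RFC 3986 and RFC 3987 documents cover URIs and internationalized identifiers in more detail.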

    --
    Open Source Identity Management: FreeIPA.org
  17. don't forget who we're talking about here... by damn_registrars · · Score: 5, Insightful

    There are letters in the Cyrillic alphabet that have different character codes than their look-alike letters in the Latin alphabet. I'm sure there are other collisions as well. I'm sure they accounted for this in the proposal, but the problem always lies in the implementation

    This is a decision made by ICANN. We've known for some time that they will willingly approve really tremendously bad ideas, if enough money is presented to them. They recently moved on a motion to start selling gTLDs, after all.

    From a security standpoint, this is a VERY bad idea without proper regulation of domain name registrations, and so far it has been demonstrated that we cannot manage them properly even with only the Latin alphabet

    Security is not of any concern for ICANN. Never has been, never will be. As long as they keep making money they're happy; security, spam, phishing, etc, be damned.

    --
    Damn_registrars has no butt-hole. Damn_registrars has no use for a butt-hole.
  18. Erratum by Looce · · Score: 1

    Yeah, Slashdot apparently needs to be internationalised too. That ".ws" should be "[U+27A1].ws" (BLACK RIGHTWARDS ARROW).

    1. Re:Erratum by tepples · · Score: 1

      Slashdot apparently needs to be internationalised too.

      Slashdot uses a Unicode character whitelist due to past abuse, and U+27A1 isn't on that whitelist. The euro sign € is though.

    2. Re:Erratum by koolfy · · Score: 1

      And what's the story of the "" ( ... ) unicode character ? Is it supported already ? (looks like it isn't)

      --
      Segmentation Fault in "Life, Universe and Everything" at line 42. Don't Panic.
    3. Re:Erratum by shvytejimas · · Score: 1

      I'm curious. Where can i find this list? Or does it only exist in the SLASH source code?
      It looks to me they just threw the baby out with the bathwater. If the problem was unicode's direction control characters, why not just blacklist those few control chars? Instead we now have a whitelist so ridiculously small, it's useless.

  19. TLDs only? by Sockatume · · Score: 1

    Is it my imagination, or does this proposal only apply for TLDs, like .uk and .jp? I don't see any mention of supporting it for the rest of the domain name. That seems a logical extension, but it's not been announced.

    --
    No kidding!!! What do you say at this point?
    1. Re:TLDs only? by petermgreen · · Score: 2, Informative

      It's already been in use for the rest of the domain name under certain TLDs for some time.

      --
      note: i'm known as plugwash most places but i screwd up registering that here somehow in the past and now can't register
  20. Not trying to be harsh but... by aceofspades1217 · · Score: 1

    cmon how could you think "but in future countries" sounds okay.

    it should be "but in the future countries"

    great info though. I mean it's nice to see that the internet is starting to become more international, especially as the US cuts mandatory ties to ICANN.

    1. Re:Not trying to be harsh but... by Zontar+The+Mindless · · Score: 1

      Speaking of 'international'... FYI, outside the US, 'in future' is perfectly acceptable usage. In the UK and Australia it's generally preferred over 'in the future'.

      --
      Il n'y a pas de Planet B.
    2. Re:Not trying to be harsh but... by Anonymous Coward · · Score: 0

      Currently, domain names can only be displayed using the Latin alphabet letters A-Z, the digits 0-9 and the hyphen, but in future countries will be able to display country-code Top Level Domains (cc TLDs) in their native language.

      Inside the US, at least in the "General American" dialect, which I speak, "but in future countries" (pronounced with a rising pitch on 'future' and an accent on countries) means "but in countries that do not currently exist, though will in the future" (like "future cars" or "future operating systems").

      The phrase "but in the future, countries" (with or without a comma) would typically be used in the quoted sentence. However, because of the word "currently," I re-read and interpreted it as "but in future, countries" (with a falling pitch and an accent on the first syllable of 'future', followed by a slight pause) meaning basically the same as "but in the future, countries." However if the subject was implied, "but in future will be able to..." would be fine with or without "the" and would use no comma.

      The difference is mainly that American dialects take "in future NOUN" to be a modified noun with adjective preceded by a preposition, or (in (future NOUN)), but other dialects take "in future NOUN" to be a noun preceded by an adverb, or ((in future) NOUN). The comma would have made the usage much clearer and disambiguated it, even for American speakers.

      Compare:
      "Currently, Windows requires complicated input devices called the "mouse" and "keyboard" to operate, but in future may be given input entirely from the brain, requiring no external hardware." (makes perfect sense to me)
      "Currently, Windows requires complicated input devices called the "mouse" and "keyboard" to operate, but in future, input may be given entirely from the brain, requiring no external hardware." (ditto, "the" is unnecessary, but would typically be used)
      "Currently, Windows requires complicated input devices called the "mouse" and "keyboard" to operate, but in future input may be given entirely from the brain, requiring no external hardware." (sounds weird, but I understand the meaning from context)

    3. Re:Not trying to be harsh but... by Zontar+The+Mindless · · Score: 1

      My point was that 'in future' did not require the 'correction' offered by aceofspades1217, since it was not incorrect to begin with. He said,

      cmon how could you think "but in future countries" sounds okay. it should be...

      This implies that 'in future' is wrong, and only an unsophisticated American would make such a claim. (If he'd complained about it being ambiguous, that would be a different matter. But he didn't.)

      As for the refresher course in General American, I grew up speaking it -- well enough that I spent most of the 1990s working as a radio announcer. (I even still have my copy of the NBC Handbook.)

      --
      Il n'y a pas de Planet B.
  21. Raise your hand if you're surprised... by damn_registrars · · Score: 1, Redundant

    ... Yeah, I didn't think so.

    ICANN just made another move to make everyday life on the internet slightly more difficult for many users, while making life for con artists, spammers, phishers, etc, much much easier (and more profitable). It is safe to expect that someone (probably more than one actually) at ICANN made some money on the deal.

    Hell it wouldn't surprise me if they were working with some financiers to try to find a way to sell internet subprime mortgages for profit as well.

    --
    Damn_registrars has no butt-hole. Damn_registrars has no use for a butt-hole.
    1. Re:Raise your hand if you're surprised... by Anonymous Coward · · Score: 0

      You're obviously in the minority of people on this planet for whom English is their first language. Also, going by your username, you have an axe to grind.

      Mod +1 Troll

    2. Re:Raise your hand if you're surprised... by damn_registrars · · Score: 1

      I shouldn't be feeding the trolls, but sometimes the low-hanging fruit needs to be knocked down...

      You're obviously in the minority of people on this planet for whom English is their first language

      If you honestly believe for some reason that ICANN is doing this out of the kindness of their hearts ... well ... I wish my world was as rosy-colored as yours. This has nothing to do with accessibility, and everything to do with money.

      Also, going by your username, you have an axe to grind.

      I wasn't discussing registrar problems in this topic. I was discussing ICANN problems. While registrar problems do in part come from ICANN being overly complacent, not all ICANN problems are inherently related to registrars.

      And judging by your own username ... oh, wait. You can't be bothered to even put a made-up name behind your comments.

      Mod +1 Troll

      Apparently you haven't been here long, or you would know that a "troll" mod automatically carries a "-1". A mod of "+1 troll" doesn't make any sense. But thank you for providing some amusement, my day was rather drab so far.

      --
      Damn_registrars has no butt-hole. Damn_registrars has no use for a butt-hole.
    3. Re:Raise your hand if you're surprised... by Estanislao+Martínez · · Score: 1

      ICANN just made another move to make everyday life on the internet slightly more difficult for many users, while making life for con artists, spammers, phishers, etc, much much easier (and more profitable).

      Why do I get the feeling that there's a group of people affected by this change that you're not mentioning...

    4. Re:Raise your hand if you're surprised... by Anonymous Coward · · Score: 0

      They also just made everyday life on the internet less difficult for an even larger number of people. You should already be paying close attention to the links you're clicking - maybe you should just stick to latin based domains?

    5. Re:Raise your hand if you're surprised... by Zontar+The+Mindless · · Score: 1

      Mod +1 Troll

      Apparently you haven't been here long, or you would know that a "troll" mod automatically carries a "-1". A mod of "+1 troll" doesn't make any sense.

      Whoosh

      --
      Il n'y a pas de Planet B.
  22. Excellent idea by ugen · · Score: 3, Insightful

    Now those countries, organizations and businesses that wish to become inaccessible to most of the world (except the native speakers of their own language) can finally do so as easily as possible. Create their own little Internet reservations and stay there :)

    As long as my software (such as Firefox) obligingly converts these IDN urls into the dash-hex notation making them obviously unreadable, I am ok with that.

    Disclaimer: I am a native of a non-English-speaking country. I am sure a few of my countrymen will use this feature based on misplaced patriotism. I am also sure that the vast majority will ignore it, just like they ignore the potential to use non-latin domain names that exists right now.

    1. Re:Excellent idea by Alioth · · Score: 1

      How is this any different today?

      If the content pointed to by a domain written in Latin characters points to a site written in Chinese, non-speakers of Chinese still can't actually do anything useful with the site.

    2. Re:Excellent idea by pablo.cl · · Score: 1

      Compare http://xn--slashdt-f1a.com/ and http://xn--slashdt-f1a.de/ In Firefox the first one shows punycode and the German one correctly appears as slashdöt.de, because Germans are known to forbid scammers, and .com is notoriously the opposite.
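The xn-- names above are the ASCII (Punycode) forms that actually go over DNS; the browser only decides whether to display them decoded. As a rough sketch of the conversion itself, Python's built-in idna codec can map between the two forms. Note the assumptions: that codec follows the older IDNA 2003 rules, it says nothing about how Firefox chooses which TLDs to trust, and the helper names are made up for illustration.

```python
def to_unicode(hostname: str) -> str:
    """Decode an ASCII 'xn--' hostname into its Unicode display form."""
    return hostname.encode("ascii").decode("idna")


def to_ascii(hostname: str) -> str:
    """Encode a Unicode hostname into the ASCII form used on the wire."""
    return hostname.encode("idna").decode("ascii")


if __name__ == "__main__":
    # The two hostnames from the comment differ only in their TLD.
    for host in ("xn--slashdt-f1a.com", "xn--slashdt-f1a.de"):
        print(host, "->", to_unicode(host))
```

Both hostnames decode to the same second-level label; which of the two forms gets shown in the address bar is a browser policy decision, not something the encoding settles.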

    3. Re:Excellent idea by Petaris · · Score: 1

      Why? There are translators. The problem is that you are cutting people off as they can't type in Chinese characters (I know it's possible but you have to install extra bits to do so, and know what you're typing and how to type it on a latin keyboard).

      I have friends in a few different non-latin alphabet countries and family in one. If their email addresses were in their alphabet I likely wouldn't be able to email them easily even though we, mostly anyway, correspond in English.

      --
      ~Petaris "The world is open. Are you?"
    4. Re:Excellent idea by wvmarle · · Score: 1

      For example in China this will be used a lot. Other countries using non-Latin scripts will do so as well.

      Actually it was possible already for a few years to register domain names in Chinese characters in Hong Kong, but still ending in .hk. Now that part can also become Chinese characters as replacement for .hk, .cn or .tw.

      The catch was that a Chinese URL would work only within HK/China. Now this will also start to work worldwide.

      One big issue for many lower-educated Chinese is that the Latin script is as strange as Chinese characters to us. Of course you can look at the shape and recognise them, but that's it: the letters do not carry sounds to them. So it's impossible to remember a URL for those people. Even for Chinese that can read/understand the Latin script it is far harder to remember a URL that they see in say an advertisement, than if that URL were in Chinese.

      I bet the same goes for many other languages. Japanese, Indian languages, Vietnamese (Latin letters with lots and lots of accents to make it pronounceable for them), Korean, Russian. Even many West-European languages such as French would be happy with adding just the accents.

      This is a major leap forward for the Internet at large, don't underestimate it just because your language (group) is doing fine with just a-z, 0-9 and hyphens. Already more than half of the Internet users worldwide are using a non-Latin script for their native language. And those are the users that are going to benefit most.

    5. Re:Excellent idea by Hythlodaeus · · Score: 1

      They make sense as aliases to sites that also have standard domain names.

      --
      For great justice.
    6. Re:Excellent idea by chord.wav · · Score: 1

      Your point may be valid for some non-latin languages but as soon as you put China into the equation the figures change radically.

      China IS the world. They are zillions! If they start using Chinese names it'll be us, "latin speakers", who'll be confined to our "own language and make our sites inaccessible to the rest of the world".

    7. Re:Excellent idea by koxkoxkox · · Score: 1

      Well, you can copy and paste, follow a link, use a search engine, etc. How often do you really type an address anyway ?

      Your friends and family can write to you and you just have to answer to them, how difficult is that ?

    8. Re:Excellent idea by w000t · · Score: 1

      Believe it or not, it is possible to register multiple domain names and make them point to the same site. I mean, who would have thought of that, right? I've even heard it's possible to have more than one site to your name. Maybe this other little-known fact could be used to have 2 sites, perhaps a local one and an international one (but who am I to tell what people will come up with when they find out all this?). But, yeah, other than that, I totally agree with you. Non-latin ccTLDs can't possibly be used for anything else than fulfilling people's "misplaced patriotism". It's not like everybody in the world can't understand the latin alphabet already. That's only to be expected, after all, we too would feel right at home if asked to type gibberish in other alphabets as well. I for one would certainly have no trouble typing anything you throw at me, regardless of alphabet. Chinese, Arabic, Cyrillic, Japanese, you name it, I've mastered them all. What's more, thanks to my great observation skills, I can guess the general meaning of it simply by staring at said gibberish for a minute or so. Anyways, I hope to have contributed to your in-depth analysis of the situation.

    9. Re:Excellent idea by koxkoxkox · · Score: 1

      Exactly, a lot of Chinese people are accustomed to addresses meaning absolutely nothing to them. You can see websites whose names are composed only of numbers, or addresses composed of the first letter of the romanisation of each character. It gives a string of meaningless letters, which you can guess from the real name of the website but not the other way around.

    10. Re:Excellent idea by ugen · · Score: 1

      This is certainly NOT the case with Russian. While alphabets are different, and majority of Russians do NOT speak a foreign language, acceptance of latin alphabet is high (nearly universal). There is absolutely no issue with using latin-based URLs or addresses and very little drive to change that.

      Trying to remove universal access and Babylonize the internet under the fairly flimsy pretext of internationalization seems a very misplaced effort to me.

      Imagine what would happen if instead of converging on the same "arabic" numerals, each country would keep using a different counting/numeric system? Domain names are no different - they are NOT general purpose words and they should be built using one universal approach.
       

    11. Re:Excellent idea by Estanislao+Martínez · · Score: 1

      One big issue for many lower-educated Chinese is that the Latin script is as strange as Chinese characters to us. Of course you can look at the shape and recognise them, but that's it: the letters do not carry sounds to them.

      Actually, many of them have problems even in the "look at the shape and recognize them" part. Spend some time looking at some of those "Engrish" websites featuring incorrect and/or nonsensical English in Asian countries, and you can spot this regularly: people mixing up "p" and "q", "b" and "d", "r" and "n", "t" and "f", "i" and "l", etc., because the shapes are similar.

    12. Re:Excellent idea by raju1kabir · · Score: 1

      Actually it was possible already for a few years to register domain names in Chinese characters in Hong Kong, but still ending in .hk. Now that part can also become Chinese characters as replacement for .hk, .cn or .tw. The catch was that a Chinese URL would work only within HK/China. Now this will also start to work worldwide.

      There has been nothing stopping you from visiting a Chinese-character .hk domain name for lo these many years, they've worked fine.

      What's changing is that now the .hk part itself can be in Chinese characters.

      --
      "Patriotism is your conviction that this country is superior to all other countries because you were born in it." -- GBS
    13. Re:Excellent idea by wvmarle · · Score: 1

      Trying to remove universal access and Babylonize the internet under the fairly flimsy pretext of internationalization seems a very misplaced effort to me.

      Sorry but this sounds very much like arrogant neo-colonialism.

      Why would everyone have to use the Latin script? Just because it's convenient for you and the language of the people that happened to have set up the Internet? How about all those people that do not understand Latin scripts in the first place? Forcing them to use Latin scripts instead of their own is NOT universal access. Latin script is just one of the many scripts used in this world.

      On top of that you don't even need to know how to enter Chinese or Arabic scripts, as the underlying tech is still using the legacy Latin script. You can enter the Chinese URL in that way as well.

    14. Re:Excellent idea by jc42 · · Score: 1

      If they start using Chinese names it'll be us, "latin speakers", who'll be confined to our "own language and make our sites inaccessible to the rest of the world".

      Not really. English has been the official second language in China for some time now, a required part of the curriculum in the school systems. The young people in China routinely sprinkle English into their speech, often in the form of prefixes and suffixes attached to Chinese words. In Chinese text, it's common to see little bursts of Latin characters here and there.

      A "Chinese" DNS system would have no problems at all with Latin inclusions, and at least the younger people wouldn't be the least bit puzzled by such things. Our Latin-only sites would be quite accessible to them. It's only our own insularity that would make their part of the Internet inaccessible to us. (And don't look too closely, or you may notice the people in the West who have been studying those inscrutable Eastern languages for some time. ;-)

      --
      Those who do study history are doomed to stand helplessly by while everyone else repeats it.
  23. Good news by qtriangle · · Score: 1

    Good news for the non-english speaking users. Though a challenge for search engines.

    --
    QTriangle Infotech Best web design, Web Hosting and Domain Registration
    1. Re:Good news by dominious · · Score: 1

      Though a challenge for search engines.

      Not really, check google.gr google.de google.ru and many more...

      Go on, search for something in non-english characters:)

  24. Re:how to... by TeknoHog · · Score: 1

    With blackjack and hookers! In fact, forget about the internet.

    Seriously though, it is nice to have a lowest common denominator in characters, so that everyone can type every address on the Internet.

    --
    Escher was the first MC and Giger invented the HR department.
  25. Um, can they be more specific than "Unicode"? by Creepy · · Score: 1

    Unicode can mean many things - UTF-8, UTF-16, UTF-32 - so specifying Unicode is not detailed enough to implement and by not specifying, it is opening a can of worms IMO. UTF-8 tends to be slower and larger for non-ASCII but has wide acceptance. It would also be the favorite for Linux/UNIX because it is very common there (my Linux box has LANG=en_US.UTF-8) and also for communication with databases (in my experience, UTF-8 is what most enterprise companies use for their database settings if they need multi-language databases). UTF-16 is worse for ASCII because it always has a second byte, but is generally faster and smaller for multibyte languages. It is also the default character encoding for MacOS and Windows (and contrary to its name, it can, in fact, contain 4 bytes of characters - the older format, UCS-2 was 2-byte only). It would be possible to support multiple encodings maybe on the URL, but this needs to be specified (for instance you could do something like http8:// or http16://).

    To further throw a wrench in the works, wchar_t in C has unspecified length and can be 8, 16, or 32 bits wide. On Windows it is 16 bits. On Linux, Mac and BSD UNIX it is generally 32 bits. This makes multi-platform programming using wide characters in C/C++ a bitch (and I say that from experience).
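Since the width of wchar_t is left to the platform, the simplest thing is to ask the platform. A minimal sketch using Python's ctypes, which exposes the C compiler's wchar_t as ctypes.c_wchar; the function name is just an illustrative choice.

```python
import ctypes


def wchar_bits() -> int:
    """Return the width in bits of the platform's C wchar_t."""
    return ctypes.sizeof(ctypes.c_wchar) * 8


if __name__ == "__main__":
    print(f"wchar_t is {wchar_bits()} bits wide on this platform")
```

Running this on each target platform is a cheap sanity check before assuming that wide strings can be copied between them byte for byte.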

    1. Re:Um, can they be more specific than "Unicode"? by petermgreen · · Score: 1

      TFA is badly written and factually inaccurate.

      All that is actually going on here is that ICANN is allowing use of IDN (which is already in use at lower levels of the hierarchy) in the root. The standards for IDN already specify exactly how the names are encoded.

      http://tools.ietf.org/html/rfc3490

      --
      note: i'm known as plugwash most places but i screwd up registering that here somehow in the past and now can't register
    2. Re:Um, can they be more specific than "Unicode"? by maxume · · Score: 1

      The article doesn't say so, but the standard uses Punycode, which is pretty much equivalent to UTF-8 and such (in that it encodes some Unicode codepoints in a specific way).

      --
      Nerd rage is the funniest rage.
    3. Re:Um, can they be more specific than "Unicode"? by Kickasso · · Score: 1

      Unicode does not necessarily mean any of this crap. International domain names don't use UTF-8 or UCS2 or anything like that, they are represented with a scheme called Punycode. Being a software developer, you may want to know a bit more about it. Just stop by any information kiosk marked with big rainbow-coloured GOOGLE sign and ask the friendly staff. Don't hesitate to ask about the difference between Unicode and the UTFs too, while you're at it.
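To make the difference concrete, here is a small sketch contrasting the UTF-8 bytes of a label with its Punycode form, using Python's built-in punycode codec. It is a simplification: real IDNA processing also normalizes and case-folds the label first (nameprep), which this skips, and the helper names are invented for the example.

```python
import string

# Letters, digits and hyphen: the only characters a DNS label may contain.
LDH = set(string.ascii_lowercase + string.digits + "-")


def ace_label(label: str) -> str:
    """Return the ASCII-compatible form of a single domain label.

    Simplified: no nameprep step, so callers should pass an already
    lowercased, normalized label.
    """
    if label.isascii():
        return label.lower()
    return "xn--" + label.encode("punycode").decode("ascii")


def is_ldh(label: str) -> bool:
    """True if the label uses only letters, digits and hyphens."""
    return set(label) <= LDH


if __name__ == "__main__":
    for label in ("ñandú", "рф"):
        print(label, "utf-8:", label.encode("utf-8"), "ace:", ace_label(label))
```

The UTF-8 form contains bytes above 0x7F, while the Punycode form stays within the letters-digits-hyphen set that existing DNS software already accepts.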

      The C programming language and sizeof(wchar_t) has absolutely nothing to do with this discussion. Internet standards are not defined in terms of C and its data types.

    4. Re:Um, can they be more specific than "Unicode"? by spitzak · · Score: 2, Informative

      Several mistakes there.

      First of all any domain name is going to have to be encoded as a stream of bytes somehow because far too much stuff is already implemented to handle the string that way. As others pointed out punycode is used.

      Second, UTF-8 is smaller than UTF-16 for all languages, even Chinese. This is because all the ASCII 0x00-0x7F characters are smaller, and therefore the encoding will be smaller if there are more of these than there are Unicode 0x800-0xFFFF characters. This seems incorrect for Chinese but you have to realize that ASCII includes spaces, newlines, numbers, and all XML and HTML markup and therefore any reasonable sized Chinese document will be smaller in UTF-8.

      Translating encodings to "wide characters" is a mistake, as you have noticed. You should write your software to deal with it in its original encoding because that is the only way to intelligently deal with errors in the string. The fact that Windows uses UTF-16 for an encoding a lot seems to be confusing people no end, but please check exactly what they do when that UTF-16 has surrogate pairs, or even "invalid" surrogate halves. They are handling the original encoding, they are not "translating it to Unicode".
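The size argument is easy to check for any given text. The sketch below counts encoded bytes for a string; the sample snippet in it is invented for illustration, and whether UTF-8 wins for a real document depends on how much ASCII markup and whitespace it carries.

```python
def encoded_sizes(text: str) -> dict:
    """Return the byte length of text in UTF-8 and UTF-16 (no BOM)."""
    return {
        "utf-8": len(text.encode("utf-8")),
        "utf-16": len(text.encode("utf-16-le")),
    }


if __name__ == "__main__":
    bare = "中文"                           # Chinese text with no markup
    marked_up = '<p class="intro">中文</p>'  # the same text inside HTML
    print("bare:     ", encoded_sizes(bare))
    print("marked up:", encoded_sizes(marked_up))
```

Bare Chinese text is smaller in UTF-16 (2 bytes per character versus 3), but once it is wrapped in ASCII markup the balance tips toward UTF-8, which is the case the comment is describing.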

    5. Re:Um, can they be more specific than "Unicode"? by raju1kabir · · Score: 1

      Unicode can mean many things - UTF-8, UTF-16, UTF-32 - so specifying Unicode is not detailed enough to implement and by not specifying, it is opening a can of worms IMO.

      You seem to know enough to sling a lot of words (and boy have you slung a lot of irrelevant words) but not enough to understand what you're talking about.

      I'll help. You're talking about different encodings. Well, punycode is also an encoding, in the category of UTF8 and UTF16. And it's specified. So that's all they need to do.

      --
      "Patriotism is your conviction that this country is superior to all other countries because you were born in it." -- GBS
  26. Re:Speeding the path to IPv6? by expat.iain · · Score: 1

    I doubt that individuals & companies said, "No! We refuse to go on the internet until we can have TLDs with non-Latin characters."

    You think that companies have only a single domain? You think that they use only a single IP?

    iain@expat-tc ~ $ host www.microsoft.com.au
    www.microsoft.com.au has address 203.19.66.74

    iain@expat-tc ~ $ host www.microsoft.it
    www.microsoft.it is an alias for microsoft.it.
    microsoft.it has address 207.46.232.182
    microsoft.it has address 207.46.197.32
    microsoft.it mail is handled by 10 maila.microsoft.com.

  27. The problem is switching keyboard input by boef · · Score: 1

    Most people here seem to miss one of the big reasons for this. Just imagine what a pain it would be for you if it was required that you type 2 or 3 Kanji characters at the end of every URL that you type out manually. These are not characters that are generally available on your keyboard and you have to switch the keyboard input to try and type them, or use a software keyboard etc. Even if you are fluent in both languages, it is a pain in the arse.

    1. Re:The problem is switching keyboard input by CityZen · · Score: 1

      Indeed. And most cell phones and video game consoles and other specialized devices have no way at all to enter foreign alphabets.

    2. Re:The problem is switching keyboard input by Lord+Lode · · Score: 1

      Hmm, I rarely, if ever, manually type URL's. Do you often have to?

    3. Re:The problem is switching keyboard input by Kickasso · · Score: 1

      Big surprise here! Cell phones and consoles sold in foreign lands have ways to enter their respective foreign characters. Who'd think.

    4. Re:The problem is switching keyboard input by Locke2005 · · Score: 1

      Even if you are fluent in both languages, it is a pain in the arse.

      If typing is a pain in the arse for you... well then, you're obviously doing it wrong!

      In general, I really don't see any benefit to non-latin URLs; most webpages are accessed through links, and most people type URLs into Google anyway to let Google correct their typos for them. People can't type URLs correctly now; how are they going to type mixed character set URLs correctly?

      --
      I've abandoned my search for truth; now I'm just looking for some useful delusions.
    5. Re:The problem is switching keyboard input by raju1kabir · · Score: 1

      most cell phones and video game consoles and other specialized devices have no way at all to enter foreign alphabets.

      Huh? Most cell phones are sold in non-ASCII markets and have ample ways to enter foreign alphabets. Maybe you were just talking about most cell phones in your pocket.

      --
      "Patriotism is your conviction that this country is superior to all other countries because you were born in it." -- GBS
    6. Re:The problem is switching keyboard input by CityZen · · Score: 1

      You are reading my comment from a US-centric point of view ("foreign" = "not US"). My comment is mostly correct from a global point of view ("foreign" = "not local"). If you are in Japan and buy a Japanese phone, for instance, you can most likely enter Japanese (local) characters, but not likely Arabic (foreign) ones. You can always enter Latin letters, though.

    7. Re:The problem is switching keyboard input by raju1kabir · · Score: 1

      Latin seems foreign to most Japanese people, I think.

      Here in Malaysia, almost all phones support Latin and Chinese, and some also support Thai.

      I suspect my phone (Nokia E61i) has the capacity to support most major scripts; as when I travel elsewhere I often get SMS spam in local character sets (Arabic, Hebrew, Russian, etc). Complex scripts are correctly rendered.

      --
      "Patriotism is your conviction that this country is superior to all other countries because you were born in it." -- GBS
  28. NSFW links by TheGreatOrangePeel · · Score: 1

    "Great, now even less chance I can identify NSFW links before they are blocked by my work's big brother app and my boss is notified... again."

    If this is a common problem for you, turn off your browser's "load images" setting. Not a perfect solution, but better than a flashing neon animated GIF of bouncing boobs right as your boss walks by. Myself, I've a number of people I follow on twitter who post links and often fail to mention if they're work appropriate, so I set up PuTTY to be an SSH tunnel/SOCKS proxy (scroll down to, "PuTTY for WindowsXP") to my home file server.

  29. IDNs are dangerous. by harry666t · · Score: 1
  30. Re:how to... by Mister+Whirly · · Score: 1

    Finally, someone who gets it! Now can you please explain that to all your other non-English speaking brethren, because we only speak English here..

    --
    "But this one goes to 11!"
  31. Latin =/= Support for English only. by mano.m · · Score: 3, Insightful

    A lot of the debate here seems to be about English-speaking countries vs. the rest of the world, but English isn't the only language that uses the Latin alphabet. Also, the unavailability of non-Latin scripts hasn't hampered the flourishing of home-grown websites in India and China named in their many local languages - what makes the ICANN think this is even necessary?

    --
    Karma fed to this user will be promptly burnt. Be warned; be wary.
    1. Re:Latin =/= Support for English only. by agnosticnixie · · Score: 1

      There's only a handful languages that use strict ascii, one is dead, and a bunch is a small family of closely related languages spoken by about 2 million people in the Pacific, which is what TFS and TFA describes TLDs as being able to do.

    2. Re:Latin =/= Support for English only. by w000t · · Score: 1

      English uses a simple form of Latin. Other languages using the Latin alphabet, such as Spanish, French or Portuguese, German, etc. contain symbols not available in the English version. It is generally not a problem though, as most used symbols are available and missing ones can generally be replaced by similar ones.

    3. Re:Latin =/= Support for English only. by pablo.cl · · Score: 3, Insightful

      Actually we are talking about the English alphabet, with j, u and w, which Latin didn't have.

    4. Re:Latin =/= Support for English only. by minion · · Score: 0

      It's not necessary, it's purely political. The world hates the US, and what better way to tell the world that the US "does not run the internet" than by making non-latin character sets valid?

      --

      -- If we don't stand up for our rights, now, there will be no right to stand up for them later.
    5. Re:Latin =/= Support for English only. by Estanislao+Martínez · · Score: 2, Insightful

      Also, the unavailability of non-Latin scripts hasn't hampered the flourishing of home-grown websites in India and China named in their many local languages - what makes the ICANN think this is even necessary?

      And how exactly do you claim to know this? It certainly makes it difficult to market the website among the potential user base who have only a shaky command of the Latin alphabet.

    6. Re:Latin =/= Support for English only. by raju1kabir · · Score: 1

      There's only a handful languages that use strict ascii, one is dead, and a bunch is a small family of closely related languages spoken by about 2 million people in the Pacific, which is what TFS and TFA describes TLDs as being able to do.

      Strict-ASCII nations Indonesia, the Philippines, and Malaysia collectively amount to 370 million people, 185 times your estimate. And that's before we get to all the little countries floating around here.

      --
      "Patriotism is your conviction that this country is superior to all other countries because you were born in it." -- GBS
    7. Re:Latin =/= Support for English only. by Anonymous Coward · · Score: 0

      English is one of the few languages that use only 26 letters. For example, Spanish, French and German all have special letters or accents such as: éèçôâüì

    8. Re:Latin =/= Support for English only. by mano.m · · Score: 1

      By being Indian, maybe? Even film posters are printed in the Latin script in India to be accessible to speakers of all languages who may be unfamiliar with each other's scripts.
      Latin-only URLs haven't stopped anandabazar.com (Bengali), bartamanpatrika.com (Bengali), eenadu.net (Telugu), and navbharattimes.com (Hindi) from being successful and popular among their target audiences. I imagine the same is true of baidu.com (Chinese) and al-khaleej.com (Arabic) in their home markets.

      --
      Karma fed to this user will be promptly burnt. Be warned; be wary.
  32. Re:Are we going to have to update the URL RFC? by nsayer · · Score: 1

    RTFA. Internationalized characters in domains are encoded. See also RFC 3492.

  33. Squatting by Anonymous Coward · · Score: 0

    Regardless of implementation, once/if this does go through, my biggest question is what (if anything) is being done about domain squatting? We are talking about opening up potentially millions of domain names that have never been registered and I assume the moment this begins to be possible there would be some mad dash to register everything imaginable...

    1. Re:Squatting by pablo.cl · · Score: 1

      We are talking about opening up potentially millions of domain names that have never been registered

      There are fewer than two hundred, not millions, of potential ccTLDs.

    2. Re:Squatting by raju1kabir · · Score: 1

      There are fewer than two hundred, not millions, of potential ccTLDs.

      How do you come by that calculation? There are thousands of characters in the Chinese space alone. Is your desk perhaps being affected by a factorial dampening field?

      --
      "Patriotism is your conviction that this country is superior to all other countries because you were born in it." -- GBS
  34. it just got easier to phish by Nadaka · · Score: 3, Informative

    Yay. Now you can register yourbankname.com with some funky characters that render in exactly the same way as the letters you are used to.
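One crude defence against this kind of lookalike name is to flag labels that mix letters from more than one script. The sketch below is only a heuristic under stated assumptions: it guesses the script from the first word of each character's Unicode name, it is not how any particular browser or registry does it, and it will not catch a lookalike written entirely in a single script. The function names are hypothetical.

```python
import unicodedata


def scripts_in_label(label: str) -> set:
    """Approximate the set of scripts used by the letters in a label.

    Uses the first word of the Unicode character name (e.g. 'LATIN',
    'CYRILLIC') as a stand-in for the script property.
    """
    return {
        unicodedata.name(ch, "UNKNOWN").split()[0]
        for ch in label
        if ch.isalpha()
    }


def is_mixed_script(label: str) -> bool:
    """True if the label's letters come from more than one script."""
    return len(scripts_in_label(label)) > 1


if __name__ == "__main__":
    # The second name swaps in U+0430 CYRILLIC SMALL LETTER A for the 'a'.
    for name in ("yourbankname", "yourb\u0430nkname"):
        print(repr(name), scripts_in_label(name), is_mixed_script(name))
```

The two names printed by the demo look identical in most fonts, yet only the second one is reported as mixed-script.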

    1. Re:it just got easier to phish by Rudolf · · Score: 1

      Now you can register yourbankname.com with some funky characters that render in exactly the same way as the letters you are used to.

      From the Summary:
      countries will be able to display country-code Top Level Domains (cc TLDs) in their native language

      .com isn't a country-code TLD is it?

  35. What's new here? by nstrom · · Score: 1

    Aren't IDNs already available via Punycode encoding? (For example the ones at http://www.w3.org/2003/Talks/0425-duerst-idniri/slide12-0.html) Or am I missing something?

    1. Re:What's new here? by shutdown+-p+now · · Score: 1

      You are. This story is about applying Punycode to TLDs (like .cn, .jp or .ru)

  36. Just what we needed... by sajuuk · · Score: 1

    More places for those damn domain squatters to snatch up before we can.

  37. Re:Speeding the path to IPv6? by twistedcubic · · Score: 1


    This will have absolutely no effect on IPv4/IPv6.

    It's not as clear as you think. The post you respond to probably thinks that having non-Latin TLDs will increase domain registrations, which might require more IP addresses. Not all new registrations will be redundant.

  38. A complete mess. by Anonymous Coward · · Score: 0

    As I said, a complete mess.

    God knows what will happen with all DNS caches full of xn--, all security risks, bugs, all unreachable websites (unless you have unicode in your system), confusion, the exponential growth of phishing, scams, and domain theft.

    Unbelievable. Today Internet was such an orderly, quiet medium and now this. One day they will allow people to call each other through the Internet without using their home telephone! Can you imagine?

  39. Its to do with people with the wrong keyboard ... by Viol8 · · Score: 1

    ... not being able to enter the URL!

    How exactly do you think you'll be able to type in a URL in mandarin or russian on west european keyboard?

  40. Don't by Anonymous Coward · · Score: 0

    Don't most countries already have a country code TLD? What will they do with the old code for their country? Sell it like Tonga and Tuvalu?
    (And those two countries just mentioned don't have a different alphabet - I don't think they had a written language before contact with Europeans.)

  41. Re:how to... by andreyvul · · Score: 1

    apart from typing google into google?

    --
    proud caffeine whore
  42. Re:Its to do with people with the wrong keyboard . by LordAndrewSama · · Score: 1

    how is that a problem? if you can't get to the website because it's in a funny language, what makes you think you can read the contents?

  43. so chase.com by circletimessquare · · Score: 1

    could be chàse.com or cháse.com

    every website i go to from now on, i need to study the url with a magnifying glass to make sure i am getting the actual site i wanted. not even as a security precaution, but just to avoid phony sites that might be spoofing a real one for all sorts of purposes, even if just humor, not all of them nefarious, but all of them certainly annoying

    an a with an accent mark may be easy to see, but there are some subtle unicode characters that are so completely like the lowercase "L" or upper case "I" or upper and lowercase "O", etc., and each different font might render the different characters in so many subtle variations, that it's almost impossible anymore to guarantee that the link you followed actually went to the site you think it did

    so we have to type addresses by hand to make sure they are genuine from now on?

    its not cultural imperialism to support only 30 or so characters for website addresses. think of it as a universal routing system, that is purposefully limited, simply for the sake of security and peace of mind

    characters for website addresses should remain small in number. simplicity means security. now we have opened a can of worms, and i think the spoofing will actually be worse for those who use nonlatin alphabets, as they are more likely to be mixing latin and nonlatin characters in their address bar

    --
    intellectual property law is philosophically incoherent. it is your moral duty to ignore it or sabotage it
    1. Re:so chase.com by IcyWolfy · · Score: 1

      You do realise that you could already have registered IDN domain names for years?
      All this new update does is extend that to the TLD portion (the part where .com sits), so countries without Latin scripts don't have to settle for .cn or .jp.

    2. Re:so chase.com by pablo.cl · · Score: 1

      could be chàse.com or cháse.com

      every website i go to from now on, i need to study the url with a magnifying glass to make sure i am getting the actual site i wanted.

      The new characteristic is that ccTLDs are allowed. Second level domains have been available for several years. For example http://www.xn--and-6ma2c.cl/ ñandú .cl

  44. Re:Its to do with people with the wrong keyboard . by jason.sweet · · Score: 2, Insightful

    There are a lot of websites where the words don't matter.

  45. Re:Its to do with people with the wrong keyboard . by mea37 · · Score: 3, Insightful

    Uh, yeah, because the keyboard you're using is a clear indicator of which language(s) you understand.

  46. The Internet has to evolve by PinkyDead · · Score: 4, Funny

    ....although obviously not ... in Kansas.

    --
    Genesis 1:32 And God typed :wq!
    1. Re:The Internet has to evolve by steelfood · · Score: 1

      We stopped being in Kansas a long time ago. Or maybe Kansas stopped being in us, for most of us anyway.

      --
      "If a nation expects to be ignorant and free in a state of civilization, it expects what never was and never will be."
  47. How far will they allow this.... by PinkyDead · · Score: 1

    One word: Klingon.

    --
    Genesis 1:32 And God typed :wq!
  48. Search Engines ^ by EgNagRah · · Score: 1

    This will make guys like Google seem like the definitive gateway to the internet... Everything seems to be shifting. I don't think phishing will become a huge problem for people who get some PC training. On the other hand, the countries more likely to have problems with this also have less of a chance to get people trained. I would also hope that browsers will adopt a feature to protect us in a more realistic way.

  49. Re:Its to do with people with the wrong keyboard . by shutdown+-p+now · · Score: 3, Informative

    How exactly do you think you'll be able to type in a URL in mandarin or russian on west european keyboard?

    You enable Chinese keyboard layout (dunno what's it called), and type it. The letters printed on the keys of your keyboard aren't some sort of magic that lets your computer input languages written in them, you know.

    I don't have any keyboards with Russian characters on them, but I happily type in Russian regardless (in fact, I only first realized that I do actually truly touch type when I first ran into this problem, which turned out to not be a problem in the end).

  50. Re:Its to do with people with the wrong keyboard . by koxkoxkox · · Score: 1

    There is no Mandarin keyboard; you have to use an input method to input Chinese characters. Computers sold in China are in qwerty. You type the romanisation and then choose the character you want from a dictionary. Of course you should not have to do that to type a URL to visit a page in English, but I expect all Mandarin speakers to have a way to type Chinese on their computers, so it should not be a problem.

    You may have more problems with European languages, for example French and its accents. Input methods are available but not widely used, because computers sold in France have the accents directly on the keyboard.

  51. Re:Speeding the path to IPv6? by Anonymous Coward · · Score: 0

    You think that companies have only a single domain? You think that they use only a single IP?

    I am well aware that companies have many domain names and many ip addresses.

    But like I said, I doubt that individuals & companies said, "No! We refuse to go on the internet until we can have TLDs with non-Latin characters."

    The limiting factor was not non-latin domain names.

  52. Now the web is truly geek by Anonymous Coward · · Score: 0

    Finally we can do web addresses in Klingon!

  53. Re:Its to do with people with the wrong keyboard . by OolimPhon · · Score: 1

    how is that a problem? if you can't get to the website because it's in a funny language, what makes you think you can read the contents?

    Ever go on holiday? Ever need to use an internet cafe in your holiday country?

  54. Need ability to block this by Anonymous Coward · · Score: 0

    I look forward to browser plugins that block or auto translate these urls for the sake of security.

  55. Actually might make security easier by Anonymous Coward · · Score: 0

    You should be able to use all non-Latin TLDs as an element to filter against.
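    A rough sketch of that kind of filter, assuming the rule is simply "flag any hostname whose last label is non-ASCII or is in the xn-- encoded form". The function name and the sample hostnames are made up for illustration.

```python
def has_idn_tld(hostname: str) -> bool:
    """Return True if the top-level label is non-ASCII or an "xn--" label."""
    # Drop an optional trailing dot, then take the last dot-separated label.
    tld = hostname.rstrip(".").rsplit(".", 1)[-1].lower()
    return tld.startswith("xn--") or not tld.isascii()


# Illustrative hostnames only.
for host in ("www.chase.com", "example.\u0440\u0444", "example.xn--p1ai"):
    print(host, has_idn_tld(host))
```

    Note that this looks only at the TLD, so an IDN second-level name under an ASCII TLD such as xn--and-6ma2c.cl would pass straight through.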

  56. right by Anonymous Coward · · Score: 0

    'Cause as a website visitor, I *really* want to learn all those other languages and switch my keyboard every other website I go to. *REALLY!* I've wanted to for *years*!!

    And as a website owner, I want to pay to register 248+ domain names just to cover all the new TLD languages.. instead of having http://www.mywebsite.com/jp/index.php, et al.

    Smells like teen scam to me. I really had to laugh when they showed the South Korean Pensioners learning how to use the internet. And complaining about having to learn english.

  57. Re:how to... by Locke2005 · · Score: 1

    English: the de facto Lingua Franca of the web!

    --
    I've abandoned my search for truth; now I'm just looking for some useful delusions.
  58. Re:Its to do with people with the wrong keyboard . by Abreu · · Score: 1

    Last time I was in Thailand, the keyboards at the internet cafe had both latin and thai characters on them, and it was trivial to switch the keyboard language

    --
    No sig for the moment.
  59. Re:Are we going to have to update the URL RFC? by fahrbot-bot · · Score: 1

    What about languages that go from right to left like Hebrew and Arabic?

    Then there are the ones written vertically, like Japanese and Chinese - yikes! :-)

    --
    It must have been something you assimilated. . . .
  60. Boo hoo by Anonymous Coward · · Score: 0

    How about just put the icon in the URL bar (like there is for FAVICON etc) with the country flag of the non-ASCII components in a URL if it exists.

    The only reason not to use a flag in the case of plain old ASCII 7 bit is because it should be the UK flag, but the USians wouldn't like that.

    So if you see www.chase.com and a french flag, YOU run away. Unless you know they host a french site at that URL.

  61. Re:Its to do with people with the wrong keyboard . by Mister+Whirly · · Score: 1

    No. Why do you ask?

    --
    "But this one goes to 11!"
  62. Please... by shentino · · Score: 1

    At least restrict the character set to UTF-8

    1. Re:Please... by raju1kabir · · Score: 1

      At least restrict the character set to UTF-8

      What does that even mean? UTF-8 isn't a character set.

      --
      "Patriotism is your conviction that this country is superior to all other countries because you were born in it." -- GBS
  63. Most Slashdot stories are not about i18n by tepples · · Score: 1

    Where can i find this list? Or does it only exist in the SLASH source code?

    The SLASH source code is published in a Git repository. However, SLASH exposes several settings to the site owner, such as how much karma a "Funny" is worth (+1 in stock SLASH, 0 on Slashdot), and I'm guessing this character whitelist is one of those.

    It looks to me they just threw the baby out with the bathwater.

    Given how few articles on Slashdot are explicitly about internationalization, there is only enough baby to count as "acceptable collateral damage". A SLASH-based site directly about i18n issues would obviously have a wider whitelist.

    If the problem was unicode's direction control characters, why not just blacklist those few control chars?

    Because we don't know what additional control characters Unicode Consortium will define in the future. Also because Slashdot admins want to discourage, say, ASCII art made out of Japanese characters.

    Instead we now have a whitelist so ridiculously small, it's useless.

    The success of Slashdot shows that the character whitelist on Slashdot is useful for everything but talking about i18n.

  64. Re:Are we going to have to update the URL RFC? by pablo.cl · · Score: 1

    h
    t
    t
    p
    :
    /
    /
    w
    w
    w
    .

  65. Nobody's stopping you from anything. by Estanislao+Martínez · · Score: 1

    Trying to remove universal access and Babylonize the internet under the fairly flimsy pretext of internationalization seems a very misplaced effort to me.

    Your computer can almost certainly display Chinese and support the same text input methods that the Chinese do. Your browser, if it's a recent version, already implements Punycode. And nobody's stopping you from learning Chinese, you know. Or from hiring people who know it to browse the web for you and help you deal with Chinese-language sites with Chinese-character URLs.

    In fact, what all these standards are doing is to make it possible for you to access the same websites as everybody else, that they're going to write in a foreign language anyway. It's not like they need your permission to use their language, you know.

  66. dhermann by Anonymous Coward · · Score: 0

    You know, dhermann makes a really good point. I think we should hold off on any further changes to the internet or the web, so that he can continue shirking his duties at work. Why should he be inconvenienced, just so that all these barbarians with their crazy moon languages can have domain names that make sense to them? The audacity of these people!

    What a dumb asshole comment to make. Grr.

  67. Re:Its to do with people with the wrong keyboard . by minion · · Score: 1

    How exactly do you think you'll be able to type in a URL in mandarin or russian on west european keyboard?

    You enable Chinese keyboard layout (dunno what's it called), and type it. The letters printed on the keys of your keyboard aren't some sort of magic that lets your computer input languages written in them, you know.

    I don't have any keyboards with Russian characters on them, but I happily type in Russian regardless

    I'm happy you'll do this. I won't, and the majority of the internet users won't either. It'll just further separate nations, because I won't go through the hassle of typing in a foreign character domain name - it'll just be a site I won't visit.

    --

    -- If we don't stand up for our rights, now, there will be no right to stand up for them later.
  68. Re:Its to do with people with the wrong keyboard . by shutdown+-p+now · · Score: 2, Interesting

    I'm happy you'll do this. I won't, and the majority of the internet users won't either. It'll just further separate nations, because I won't go through the hassle of typing in a foreign character domain name - it'll just be a site I won't visit.

    Presumably, if a site is designed to be visited by someone who only understands English, it will use an English TLD. If it uses a TLD with national characters, then most likely the content is in a language other than English as well, and you'd need to have a means to input that language to fully interact with the site anyway.

  69. Did anyone tell the IETF yet? by help_cecil_help · · Score: 1

    I used to look here (http://www.ietf.org/rfc/rfc1738.txt) for this kind of thing...

  70. The Emperor's New Clothes by ub3r+n3u7r4l1st · · Score: 1

    Yup, I see jackshit.

    Here is a demonstration of how non-latin characters really show here:

  71. Re:Are we going to have to update the URL RFC? by Jesus_666 · · Score: 1

    What about languages that go from right to left like Hebrew and Arabic?

    What about them? The URL is still encoded exactly the same; it's just displayed differently in the browser. Of course that would require that LTR and RTL characters aren't combined but someone already suggested that domain names should only be allowed to contain characters from one alphabet (more or less; for example there are special Unicode code points for Latin letters used in Japanese text).
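    A crude sketch of that one-alphabet-per-name check, assuming the first word of each character's Unicode name is a good enough stand-in for its script. Real registries and browsers use proper script tables; the function names here are hypothetical.

```python
import unicodedata


def scripts_in(label: str) -> set:
    """Approximate the scripts used in a label from Unicode character names."""
    return {
        unicodedata.name(ch, "UNKNOWN").split()[0]
        for ch in label
        if ch.isalpha()  # ignore digits and hyphens
    }


def is_mixed_script(label: str) -> bool:
    """True if the label appears to draw letters from more than one script."""
    return len(scripts_in(label)) > 1


print(scripts_in("chase"))            # {'LATIN'}
print(is_mixed_script("ch\u0430se"))  # True: U+0430 is a Cyrillic letter
```

    The heuristic is too blunt for the special cases mentioned above: a fullwidth Latin letter has a name starting with FULLWIDTH, so it would be counted as a separate script.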

    --
    USE HOT GRITS WITH STATUE OF NATALIE PORTMAN (NAKED AND PETRIFIED)
  72. Re:how to... by dominious · · Score: 1

    Seriously though, it is nice to have a lowest common denominator in characters, so that everyone can type every address on the Internet.

    Um... Why? Do you know every address on the internet? There are already web sites in other languages that you don't know about. What is the problem if their address is in some other language? Does it really bother you?

  73. Fragmentation. by MaWeiTao · · Score: 1

    Initially I thought this was cool. But then I started thinking about this and I realized that all this is going to do is fragment the internet. The existing system ensured a convenient standard that anyone could access. How the hell are non-Chinese, for example, ever going to figure out how to type a Chinese address? Unless someone provides you with an address it's not likely you'll ever figure it out.

    Even being able to speak Chinese, this would be a challenge for me. I expect even Chinese natives are going to have a hard time with this. I could tell someone my web address, but then I also have to explain which character I mean, because there might be multiple characters for that particular sound. And let's not get into all the languages out there with their own unique writing systems.

    The fact is that certain languages aren't quite as conducive to use with computers as others. In many cases it's probably just that nobody has made the effort to optimize input devices and system interfaces. But then when you do that you also alienate the rest of the world. It's entirely possible most foreigners wouldn't ever end up on these sites anyway but I don't like this fragmentation by what I see as dumping a standard. Technology will eventually reach a point when this is not an issue, but we're not there yet. I really don't see what was wrong with the Latin alphabet and Arabic numerals. Every computer in the world supports this by default so how exactly does this move enhance accessibility?

    1. Re:Fragmentation. by edschurr · · Score: 1

      How the hell are non-Chinese, for example, ever going to figure out how to type a Chinese address?

      Visit a dictionary site, look up the words they need via Pinyin or English, and copy-paste into the URL field. Or, use a Chinese Javascript IME website. Or, write it down somewhere and type it directly via Punycode (probably possible). Or, find the site via Google based on a term they can remember (that's the only time I use "I'm Feeling Lucky", e.g. I search "ascii table" rather than remember if it's asciitable.com, asciitable.org, ascii-table.com, etc.).

      People will figure out strategies to make use of these URLs.

      The simplest one: write it down. It's not like the spelling of Hudong or Zhongguo is obvious when you have only heard it.

  74. Re:Its to do with people with the wrong keyboard . by blackraven14250 · · Score: 1

    ...nobody in the world speaks Chinese and Arabic? Last time I checked, they're not on the same keyboard. There's a multitude of languages that aren't on the same keyboard. Chinese keyboards don't always even have English on them.

  75. Re:Its to do with people with the wrong keyboard . by Runaway1956 · · Score: 2, Insightful

    "the majority of the internet users won't either."

    Sorry, but that sounds like typical American ethnocentricity. The MAJORITY of internet users actually are people who don't natively speak English. Chinese speakers, Russian speakers, European people, many of whom use Cyrillic alphabets, Arabs, South Americans, Indians, and others that I'm surely missing.

    How can you possibly speak for "the majority of internet users", when people who speak English as their native language constitute a pretty small percentage of the world's people? I could google, but I'm almost willing to bet that more people on this earth grow up speaking Chinese, than people who grow up speaking English as their first language.

    If a guy is more comfortable using his own language, I'm all for him doing so.

    --
    "Windows is like the faint smell of piss in a subway: it's there, and there's nothing you can do about it." - Charlie Br
  76. Re:Its to do with people with the wrong keyboard . by Shrike82 · · Score: 1

    You enable Chinese keyboard layout (dunno what's it called), and type it.

    You forgot the part where you strap on around 40,000 extra keys to cover all the different pictograms. Typing Mandarin, Cantonese and other Chinese dialects (or at least one way a colleague showed me) involves using latin characters to spell the start of the pictogram sound, then selecting from sub-menus the actual word or part word you want.

    --
    You can advertise in this sig from as little as £99.99 a month!
  77. Re:Its to do with people with the wrong keyboard . by Anonymous Coward · · Score: 0

    Not a chance spic.

  78. Re:how to... by morgauxo · · Score: 1

    I agree! Therefore I suggest that all tlds and the whole domain for that matter be binary, 1s and 0s only. In preparation for my obviously superior domain scheme being implemented I am registering 01110011 01101100 01100001 01110011 01101000 01100100 01101111 01110100 00101110 01101111 01110010 01100111 today just to be an a$$ so there!
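    For anyone who doesn't read binary, a quick sketch that decodes the octets above, assuming each group is one 8-bit ASCII code.

```python
bits = (
    "01110011 01101100 01100001 01110011 01101000 01100100 "
    "01101111 01110100 00101110 01101111 01110010 01100111"
)

# Parse each octet as a base-2 integer, then decode the bytes as ASCII.
domain = bytes(int(octet, 2) for octet in bits.split()).decode("ascii")

print(domain)  # slashdot.org
```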

  79. Re:Its to do with people with the wrong keyboard . by morgauxo · · Score: 1

    Oh, man, how will us westerners get our fill of burkini babe pics?

  80. .com/.net/.org should be off-limits to Unicode by Miamicanes · · Score: 1

    IMHO, the worst decision ever made in the history of the internet was ICANN's decision to allow non-ASCII subdomains of .com, .net, and .org. Those three gTLDs, if not others as well, should be forever off-limits to any characters besides the original 26 letters a..z, the digits 0..9, and hyphen.

    For other TLDs, and for national TLDs, DNS should be extended to allow a TLD's authoritative top-level registrar to authoritatively indicate which UTF8 characters (or range(s) of characters) beyond those historically-allowed for ASCII DNS are valid for its subdomains. The registrar for Spain might decide it needs accented vowels and tilde+n, but has no reason for Turkish vowels that conveniently (for phishers) look identical to ASCII characters.

    The next step would have been the creation of a few brand new TLDs, like ".Zhong" (U+4E2D) and ".Nihongo" (U+65E5 + U+672C) -- think ".(Chinese)" and ".(Japanese)", not to mention similar TLDs for Hindi, Korean, Russian, and other languages that use non-Roman alphabets.

    The point is, ICANN could have done a much, much better job with this whole mess. I think everyone can agree that international domain names are something that needs to exist, but trying to staple them onto .com/.net/.org was an incredibly bad idea.

  81. Re:Its to do with people with the wrong keyboard . by mea37 · · Score: 1

    Thank you, Captain Obvious, I had no idea!

    Since you missed the sarcasm in my previous post, I assume you'll miss the sarcasm in this one as well. As such I will translate:

    "Yes, I am aware of that, as is everyone involved in the thread."

  82. Dear slashdotters, by Anonymous Coward · · Score: 0

    before posting any more comments to this story be sure you understand what these terms mean:
    * TLD = top level domain
    * punycode

    <grumble>...used to be news for nerds...</grumble>

  83. parent not troll by reiisi · · Score: 1

    I suppose I should go read the friendly A and see if ICANN has already specified all the native TLDs allowed as equivalents for country codes, and probably for .com, .org, .net, .mil, .edu, .etc., and mapped them to the equivalents.

    Somehow, I doubt it.

    --
    Computer memory is just fancy paper, CPUs just fancy pens with fancy erasers; the 'net is just a fancy backyard fence.
  84. Not necessarily me, by reiisi · · Score: 1

    It may be that I am not directly affected in some cases, but I'm pretty sure I'm going to hit a wall sometime trying to figure out whether the uri in some cryptographic siggy is valid or not.

    --
    Computer memory is just fancy paper, CPUs just fancy pens with fancy erasers; the 'net is just a fancy backyard fence.
  85. Re:Its to do with people with the wrong keyboard . by reiisi · · Score: 1

    So, you're telling me that there will be no documents I need to read on the website ".."?

    In my case, both my keyboard and my eyeballs have no problem with the characters (unlike the slashdot software).

    It is true that I could probably dig out as many as I could find of the relevant (English language) pages on the Japanese government's tax office websites and send them to my sister, were I to ask her to help me with my taxes, but even that is not always an available option.

    To say nothing of the potential need to verify a uri or url written native.

    We at least need to be able to map the TLDs to something more or less commonly legible.

    --
    Computer memory is just fancy paper, CPUs just fancy pens with fancy erasers; the 'net is just a fancy backyard fence.
  86. The closest thing to a majority in this world? by reiisi · · Score: 1

    So, you're planning to learn Chinese as your second or third language?

    So you can get to all the important (by majority reasoning) websites?

    Yeah, I know I'm being obtuse. There's a reason. Majority has nothing to do with the argument, on either side.

    --
    Computer memory is just fancy paper, CPUs just fancy pens with fancy erasers; the 'net is just a fancy backyard fence.
  87. non-ccTLDs by reiisi · · Score: 1

    Well, for the problems with strange variations of .com, .org, .etc., don't forget that they are opening up the whole TLD space.

    --
    Computer memory is just fancy paper, CPUs just fancy pens with fancy erasers; the 'net is just a fancy backyard fence.
  88. Re:Its to do with people with the wrong keyboard . by _merlin · · Score: 1

    Ever heard of wubi, cangjie or daiyi? There are plenty of stroke-based input methods where keys are assigned groups of strokes and you compose the character that way - no Latin involved. Then there's the zhuyin/bopomofo phonetic input used in Taiwan, which uses Chinese phonetics. Once again, no Latin involved.

  89. Re:Its to do with people with the wrong keyboard . by AaronLawrence · · Score: 1

    I know what you mean, but I suspect it won't make much difference.
    Most of us find new sites either by a search engine, which is only going to look for sites with content in the language we are using to search (and mostly ignoring the domain name), or by a link from somewhere, in which case it won't matter at all. The only case that would matter is where links are printed so that you have to type them out again (or occasionally, the moronic designers who put them in graphic images).

    --
    For every expert, there is an equal and opposite expert. - Arthur C. Clarke
  90. Re:Are we going to have to update the URL RFC? by jc42 · · Score: 1

    Then there are the ones written vertically, like Japanese and Chinese - yikes! :-)

    Actually, the national specs for how Japanese and Chinese are written include horizontal left-to-right as one of the two standard layouts. True, both were primarily written vertically (starting at the upper right), and there are still publications that do that. But the European horizontal printing convention has long since been decreed legal and standard in all the countries that use Chinese characters, and it's widely used.

    It's no big deal, actually. Consider that it's not unusual to see English written vertically, mostly on signs hanging above the sidewalk in front of buildings. Few English-speaking people have any trouble reading those signs. Why would you think that the Japanese or Chinese would have any trouble with their language written horizontally?

    (Well, OK; the Chinese used to also write horizontally from right to left. But you mostly only see that in museums and a few historic buildings nowadays, plus the equivalent of "Ye Olde ____ Shoppe" signs. ;-)

    --
    Those who do study history are doomed to stand helplessly by while everyone else repeats it.
  91. Re:Its to do with people with the wrong keyboard . by jc42 · · Score: 1

    If you ask google about "most widely spoken languages", you can find a number of good articles on the topic. Currently the first hit is http://www2.ignatius.edu/faculty/turner/languages.htm, which gives a number of rankings of the top languages, depending on just how you phrase the question. They point out that the number of native speakers isn't necessarily the best way to judge the importance of a language. By that simple measure, Mandarin is the top language. But it isn't used much outside of east Asia. English, French and Spanish have fewer native speakers, but are more important in most of the world, for a number of reasons.

    Anyway, you can learn a lot of interesting stuff about the topic by reading a few of the things in the above google search. It's a lot more complex than you might think, especially if you live in one of the parts of the world (e.g. the US) where most everyone speaks the same language.

    --
    Those who do study history are doomed to stand helplessly by while everyone else repeats it.
  92. Obligatory by Anonymous Coward · · Score: 0

    ICANN haz UTF-8 domainz?

  93. Homonyms? by Anonymous Coward · · Score: 0

    The landgrab is going to be a mess, especially in places like China. Here domains use romanization to represent domain names, but each pinyin (romanized) syllable maps to sometimes hundreds of different chinese characters... definitely will be a lot of jostling.

    LS

  94. Re:Are we going to have to update the URL RFC? by fahrbot-bot · · Score: 1

    Then there are the ones written vertically, like Japanese and Chinese - yikes! :-)

    Actually, the national specs for how Japanese and Chinese are written include horizontal left-to-right as one of the two standard layouts.

    It was a joke. :-) Although, using Kanji or Han -like characters might be problematic as some (many, most?) can mean entire words or concepts and are context dependent... Perhaps they'll use Romaji -like characters instead.

    --
    It must have been something you assimilated. . . .
  95. Re:Its to do with people with the wrong keyboard . by Anonymous Coward · · Score: 0

    I live in Taiwan, I have no idea what you are talking about. "Chinese" keyboards are nothing more than your standard keyboards. They have all the same keys and everything. All you have is a different typing system. Why would their keyboards not have English, how do they use the internet right now? Most people who use simplified Chinese use roman pinyin for character entry.

    I use Mandarin Phonetic Symbols for character entry. From Windows I just add that keyboard from the regional settings and I can type. In Linux I use SCIM and I can type. All on the same keyboard. I also type in Dvorak when I type English, guess what, it uses the same keyboard. I bought a keyboard in Taiwan, and guess what, I can type in both Chinese AND English. I know not of a language that exists that uses a keyboard different than our standard 104 or 108 key keyboard. It's all about the keyboard mapping. I don't know how this thread got so long without someone realizing everyone here is just a bunch of idiots.

    To address a real problem:
    What if I am on a computer that I cannot add the correct language keymapping on to visit the site? That is the problem that I see. It is a software, not a hardware issue. /end rant

  96. Re:Its to do with people with the wrong keyboard . by pmontra · · Score: 1

    If the webmaster wants people of other languages to access the site, he'll also register a Latin domain name. If he doesn't care, it's mainly his problem, as he loses audience and possibly money.

    Second: services to map non-Latin domain names into Latin ones will appear, similar to URL shortener services. That will solve the problem for most of us.

    But also think about this: Chinese-language sites have Latin domain names and Chinese-speaking users typing on Chinese keyboards now. They'll be able to let their main users type URLs in their native language. That's definitely a good thing.

  97. Why should Americans care? by tjstork · · Score: 1

    I'm going to just throw it out there, but seriously, why should an American web site care about the rest of the world? Honestly, I could put a big filter on any domain that has any non-8 bit ASCII character out there and I would be utterly happy. While it might be nice to talk to the rest of the world, it's not worth the extra byte for unicode, and it's certainly not worth f---ing up polymorphism between strings and vectors just so we can have dumbass umlauts and other crap in our text.

    Call me a flamebait, but seriously, for consumers, if you carved up the whole internet into 8 bit character fiefdoms, and had just the asians deal with utf-16 or even utf-32, then, wouldn't that just actually be smarter for end users? Sure, monolithic corporations might balk at the cost of this, but why should I need to give the likes of Exxon a goddamn doubling of all of my strings just to make it easier for them to do world wide operations.

    I'm in favor of ASCII, that's what I'm saying.

    --
    This is my sig.
    1. Re:Why should Americans care? by Anonymous Coward · · Score: 0

      I'm going to just throw it out there, but seriously, why should an American web site care about the rest of the world.

      Because a significant percentage of its userbase is from said rest of the world?

    2. Re:Why should Americans care? by leenks · · Score: 1

      Unicode isn't two bytes.
      ASCII isn't 8-bits.
      UTF-8 (the most popular encoding for Unicode) uses 1 byte for each ASCII character.

      if you carved up the whole internet into 8 bit character fiefdoms, and had just the asians deal with utf-16 or even utf-32, then, wouldn't that just actually be smarter for end users?

      So what about those users outside of America that speak multiple languages? Or countries that have multiple languages with different scripts that can't be expressed in a single 8bit code page?

      Seriously, storage space is cheap. Most modern programming languages support unicode just fine. What's the problem?
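      If you want to check the byte counts yourself, here is a small sketch using Python's standard encoders. The sample strings are arbitrary picks for illustration.

```python
def sizes(text: str) -> dict:
    """Encoded length in bytes of `text` under three Unicode encodings."""
    return {enc: len(text.encode(enc)) for enc in ("utf-8", "utf-16-le", "utf-32-le")}


ascii_only = "chase"
accented = "ch\u00e0se"     # one accented letter (U+00E0)
chinese = "\u4e2d\u6587"    # two Chinese characters

print(sizes(ascii_only))  # {'utf-8': 5, 'utf-16-le': 10, 'utf-32-le': 20}
print(sizes(accented))    # {'utf-8': 6, 'utf-16-le': 10, 'utf-32-le': 20}
print(sizes(chinese))     # {'utf-8': 6, 'utf-16-le': 4, 'utf-32-le': 8}

# ASCII is a 7-bit code: every byte value is below 128.
print(all(b < 0x80 for b in ascii_only.encode("ascii")))  # True
```

      Pure ASCII text costs nothing extra in UTF-8; strings only double in size if you pick UTF-16 for everything.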

  98. Re:Its to do with people with the wrong keyboard . by Shrike82 · · Score: 1

    Ok, I should have put "one of the ways of inputting Chinese characters is..." as obviously there are others. My point was it's not just as simple as changing from UK input to French so you can do all those little curly things under your c's!

    --
    You can advertise in this sig from as little as £99.99 a month!
  99. This could actually be useful. by KritonK · · Score: 1

    Although we already have non-Latin domain names, these were rather inconvenient for languages that are based on alphabets that are not derived from the Latin alphabet. E.g., if you wanted a Greek domain, you'd get something like <bunch_of_Greek_characters>.gr, which requires keyboard switching to type, and is more inconvenient than simply typing an all-Latin name. Turning that "gr" into Greek may actually make browsing Greek sites easier, as the keyboard may be left permanently switched to Greek, while typing.

    The same goes for Cyrillic, Arabic, Chinese, etc.

    It goes without saying that there is a lot of money to be made here. Not only are non-English web sites now going to have an incentive to actually register non-Latin domain names, they're still going to keep renewing the old Latin domain names as well, so that the sites remain accessible from the English-speaking parts of the world.