ICANN Under Pressure Over Non-Latin Characters
RidcullyTheBrown writes "A story from the Sydney Morning Herald is reporting that ICANN is under pressure to introduce non-Latin characters into DNS names sooner rather than later. The effort is being spearheaded by nations in the Middle East and Asia. Currently there are only 37 characters usable in DNS entries, out of an estimated 50,000 that would be usable if ICANN changed naming restrictions. Given that some BIND implementations still barf on an underscore, is this really premature?" From the article: "Plans to fast-track the introduction of non-English characters in website domain names could 'break the whole internet', warns ICANN chief executive Paul Twomey ... Twomey refuses to rush the process, and is currently conducting 'laboratory testing' to ensure that nothing can go wrong. 'The internet is like a fifteen story building, and with international domain names what we're trying to do is change the bricks in the basement,' he said. 'If we change the bricks there's all these layers of code above the DNS ... we have to make sure that if we change the system, the rest is all going to work.'" Given that some societies have used non-Latin characters for thousands of years, is this a bit late in coming?
Changing a system which works is a very, very bad idea.
Won't this open up the system to many more phishing attacks involving addresses which include non-Latin characters that look similar to Latin ones?
Yes, countries that use non-English characters should be able to interact with the rest of the world using their natural language. No, they shouldn't rush the change and risk a possible crash of a large portion of the Internet. Be patient, young padawans: soon you will be able to have DNS names with any character you can think of, and it will be reliable and actually work.
Space for rent, inquire within
Perhaps, but I can't fault ICANN for this one, as much as I might like to. Like it or not, most internet technologies have their roots in countries that use the Latin alphabet, which means systems developed there may not be tweaked to work with outside language schemes.
If the fault lies with anyone, it's with the individual contributors of the tech. Or better, with the non-Latin countries' apparent lack of interest in some of the core projects needed to push this through ICANN (specifically DNS, httpd).
Mod me down with all of your hatred and your journey towards the dark side will be complete!
- Don't be too surprised when people around you start building their own houses rather than choosing to pay rent.
DNS upheaval has been a long time coming, and the current anti-American sentiment worldwide isn't exactly helping to stabilize it. We're already seeing all sorts of ad hoc routing setups that deal with shortcomings of an Ameri-centric DNS. My guess is that within the next few years, ICANN's 'control' of the internet will be in name only, as everyone else in the world will have moved on to alternative routing and domain systems.
"Given that some societies have used non-Latin characters for thousands of years, is this a bit late in coming?"
No.
Zonk either knows zero about the histories of the Internet or DNS, or is so enamored of finishing stories with questions that he'll tack on the truly ridiculous.
What you do with a computer does not constitute the whole of computing.
For all you people saying "There's no problem, just do it" - I say watch out... there will be a rush of attacks and spoofs as soon as this is opened up. The letter "a" appears in the Unicode character set multiple times, and some of the variants are almost indistinguishable. I'm not just talking about someone registering släshdot.org, I'm talking about someone registering slashdot.org (the a is U+FF41 instead of the normal a). Good luck telling the attacks apart from the real sites.
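To make the look-alike problem concrete, here is a minimal Python sketch using only the standard unicodedata module. The helper names and sample characters are illustrative assumptions, not part of any registrar's or browser's actual checks. It suggests that compatibility normalization (NFKC) folds the fullwidth a (U+FF41) back to a plain ASCII a, but does nothing for the Cyrillic a (U+0430), which is a separate letter that merely looks the same.

```python
import unicodedata

# Three characters that can all render as something like "a".
lookalikes = ["\u0061", "\uFF41", "\u0430"]

def describe(ch):
    """Return (code point, Unicode name, NFKC-normalized form) for one character."""
    return (
        f"U+{ord(ch):04X}",
        unicodedata.name(ch),
        unicodedata.normalize("NFKC", ch),
    )

def folds_to_ascii(name):
    """True if NFKC normalization turns the whole name into plain ASCII."""
    return unicodedata.normalize("NFKC", name).isascii()

for ch in lookalikes:
    print(describe(ch))
```

So normalization alone would catch the fullwidth variant but not a cross-script look-alike; that case needs some separate policy.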
I'd be in favor of the change just because anything that undermines the Unix Tower of Babel -- the dependency on ASCII which complicates text handling sooooo much even when Windows solved the problem soooo long ago -- is good. Even Java gets it. Even Apple (finally) gets it. Unix Is Teh Problem.
And the ASCII problem isn't just bad because it forces people to use inefficient encodings like UTF-8 (THREE bytes per character?). It's bad because it allows people to write code like:
if(string[index] == '.' || string[index] == '?' || string[index] == '!') sentenceEnd = true;
(a line repeated, with subtle variations, several hundred times in the code of a certain ubiquitous editor).
And, lo and behold, the above does not work, but once it appears in a few thousand places it's impossible to fix, and a vast towering structure of fixes made by people who don't really understand why it's an issue is built.
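As a rough illustration of why that kind of check falls over, here is a small Python sketch (not the editor's actual code; the function names and the terminator set are assumptions made up for this example). It compares a byte-oriented test with a character-oriented one on a Japanese sentence whose final character, the ideographic full stop U+3002, occupies three bytes in UTF-8.

```python
# Byte-array thinking: compare the last byte against ASCII punctuation.
def ends_sentence_bytes(data: bytes) -> bool:
    return data[-1:] in (b".", b"?", b"!")

# Text thinking: compare the last whole character against a set of terminators.
# This set is a small illustrative sample, not a complete list.
TERMINATORS = {".", "?", "!", "\u3002", "\uFF1F", "\uFF01", "\u061F"}

def ends_sentence_text(text: str) -> bool:
    return text[-1:] in TERMINATORS

japanese = "これは文です\u3002"      # ends with IDEOGRAPHIC FULL STOP
encoded = japanese.encode("utf-8")  # the final character becomes three bytes

print(ends_sentence_bytes(encoded))  # the byte test misses the sentence end
print(ends_sentence_text(japanese))  # the character test finds it
```

Even the character-based version is only as good as its terminator list, which is part of the point: the decision belongs in one text-aware place, not in hundreds of copies.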
So, even though the proposed change would be hugely inconvenient for a huge number of people, I'm in favor, because I want the world to grow the fork up and understand that text != byte array some time while I'm still alive.
Whence? Hence. Whither? Thither.
Nice for localising, sure, but how usable will Japanese, Indian, or Arabic script URLs -- for example -- be for those who do not have access to the respective sets or keyboard layouts?
Of course it's late in coming.
But that doesn't mean it should be done hastily and badly.
http://alternatives.rzero.com/
"Given that some societies have used non-Latin characters for thousands of years, is this a bit late in coming?"
Let's be clear. The domain name system only uses English characters. There are lots of languages in Europe (Italian, Spanish, French...) which are closer to Latin than English (which isn't really a Latin language at all) and which are not currently represented, because you can't use accents in domain names, or other letters such as the Spanish eñe (ñ: n with a squiggle, actually a distinct letter). English speakers often think accents aren't important, but they can completely change a word's meaning.
The internet was originally conceived, designed, and implemented in the USA at a time when hardware was at a premium, and corners were cut to conserve that limited resource. DNS was just one of the results of that era. However, it is the most visible because it is the front-end means for people to find each other. That means there is now a very well established standard, used by people across the entire globe, that is very difficult to change.
Changing all the DNS servers in the world to switch from ASCII to Unicode is NOT trivial. The fact that some societies have used non-Latin characters for thousands of years is completely and utterly irrelevant. THEY didn't make the internet. They simply bolted themselves on to an existing infrastructure.
I agree that progress needs to be made to accommodate non-Latin characters, but to have people whining about "how they want it, and want it now"... That's just ridiculous. It's like waltzing into a house that was built 40 years ago and having a tantrum because the stairs are too steep and the house is too squished. Major structural renovations take time, effort, and careful planning. And there is nothing you can do to avoid that, short of implementing cheap stop-gap measures that are virtually guaranteed to cause even bigger unintended headaches later on.
"Given that some societies have used non-Latin characters for thousands of years, is this a bit late in coming?"
Those societies did not build an entire economic and social infrastructure using all 50,000 of those characters in a few decades, though.
Rex is 09 F9 11 02 9D 74 E3 5B D8 41 56 C5 63 56 88 C0
Why not have the browser fail to render them outside of the user's preferred alphabet?
Cyrillic users would see www.**c******.com, Latin users would see www.mi*rosoft.com?
Or better yet, put up a big warning that it's using mixed alphabets?
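A minimal Python sketch of both ideas, masking and a mixed-alphabet warning, might look like this. Everything here is an assumption for illustration: the function names are made up, and because the standard library exposes no real Unicode Script property, the script is guessed from the first word of each character's Unicode name, which is only a rough approximation (it would, for instance, label fullwidth letters as their own "script"). Unlike the example above, this version also masks the www and com labels for a Cyrillic reader.

```python
import unicodedata

def script_of(ch):
    """Rough script guess: the first word of the character's Unicode name."""
    if not ch.isalpha():
        return None  # digits, dots and hyphens are treated as script-neutral
    return unicodedata.name(ch, "UNKNOWN").split()[0]

def mask_foreign(host, preferred):
    """Replace letters outside the preferred script with '*'."""
    return "".join(
        ch if script_of(ch) in (None, preferred) else "*" for ch in host
    )

def is_mixed(host):
    """True if the host name uses letters from more than one script."""
    return len({script_of(ch) for ch in host} - {None}) > 1

spoof = "www.mi\u0441rosoft.com"  # U+0441 is CYRILLIC SMALL LETTER ES

print(mask_foreign(spoof, "LATIN"))
print(mask_foreign(spoof, "CYRILLIC"))
print(is_mixed(spoof))
```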
In general, browsers ought to make users more aware of the parts of their current URL, and maybe also of link destinations (also mail client).
For example, separate the URL into its parts (scheme, host, path). Display some of the WHOIS info below the hostname, and some info from the SSL certificate if it has one.
This would help people spot phishing scams or other suspicious activity.
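The URL-splitting part is easy to sketch with Python's standard urllib.parse module. The url_parts helper and the sample URL are hypothetical, and the WHOIS and certificate lookups are left out since they need network access.

```python
from urllib.parse import urlsplit

def url_parts(url):
    """Split a URL into the pieces a browser could display separately."""
    parts = urlsplit(url)
    return {"scheme": parts.scheme, "host": parts.hostname, "path": parts.path}

# A deceptive-looking address: the real host is everything before the first "/".
example = url_parts("http://www.example.com.login.example.net/account")
print(example)
```

Shown on its own line, the host makes it much clearer that the page does not belong to www.example.com.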
Reed
That's a good start.
Registrars shouldn't accept such names in the first place though: Is there a valid reason to ever have a domain name with stray characters mixed in from different languages?
If a standard were to specify that a domain name must use a subset of Unicode that is self-consistent, and that browsers should turn the address bar red as a warning any time a domain uses characters not in the subsets for the user's selected languages, that would go a long way towards minimizing the phishing problem.
There would still be issues between users of the same orthography, but in general there is no way to prevent phishing-style attacks completely, since they fundamentally rely upon people being careless. Even the current DNS system is vulnerable:
spoofing "cnn.com" with "cnn-news.com" or "cnn.newsnetwork.com" doesn't need i18n support to work at all.
The fundamental DNS should not be changed: it is a programmer's and administrator's tool, not an advertising medium. It is founded, like programming languages, on a fundamental 7-bit ASCII character set, and is not intended to be used for NLS text.
A far better solution is some form of VDNS that translates NLS text names into the proper domain name at the system level. That also allows the same domain to have multiple language translations to reflect localized product and service names.
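For a sense of what such a translation layer could look like, here is a hedged Python sketch. It does not implement the "VDNS" described above, which is only a proposal here; it borrows the existing IDNA approach, in which a Unicode name is mapped to an ASCII "xn--" form before it reaches DNS. Python's built-in "idna" codec follows the older IDNA 2003 rules, and the helper names are made up for this example.

```python
# Translate between the name a person types and the ASCII name DNS sees.
def to_wire_name(display_name: str) -> str:
    """Unicode display name -> ASCII-compatible name for the resolver."""
    return display_name.encode("idna").decode("ascii")

def to_display_name(wire_name: str) -> str:
    """ASCII-compatible name -> Unicode name for showing to the user."""
    return wire_name.encode("ascii").decode("idna")

wire = to_wire_name("b\u00fccher.example")  # "bücher.example"
print(wire)
print(to_display_name(wire))
```

The DNS servers themselves keep handling plain ASCII labels; only the software at the edges needs to know about the mapping.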
We seriously need to kick the general political community in the arse. They keep trying to impose technical decisions, and it fails as miserably as any corporate PHB's uninformed decisions. ASK the techies to propose solutions instead of shoving ill-conceived ideas down our throats.
For example -- once you mandate multibyte domains, you implicitly mandate multibyte URL components. Goodbye direct mapping of names to the directories, file systems, and servers.
Bad idea. Very bad idea.
I do not fail; I succeed at finding out what does not work.
It's been obvious since the Europeans got DNS for their ftp and email that there was a problem, even before they invented the web, and even aside from myopic silliness like having
DNS has a couple of restrictions that may have made sense in 1985, long before Unicode was invented. Some of them are easy to fix, especially since most DNS servers in the world use versions of one of three or four server programs, but there's a lot more resolver software out there that deliberately casefolds (though you could fix most of that in two or three generations of Microsoft releases, if you knew what you wanted it to do), and you can fix some of it administratively, by having the people who register UPPERCASE-EXAMPLE.COM also register uppercase-example.com and maybe Uppercase-example.com and do a few similar things for munged Unicode.
Bill Stewart
New Fast-Compression-only CPR http://preview.tinyurl.com/dy575ks
It's amazing, krell. How do you type so much with Rush Limbaugh's cock rammed down your throat?
I feel like death on a soda cracker.