ICANN Mulling Multilingual URLs
griffjon writes "The Washington Post is reporting that ICANN is testing out fully multilingual domain names. These won't just be [non-western-language].com, but would have TLDs translated into other scripts, fixing annoyances for non-English speaking audiences. An example: 'Speakers of Hebrew, Arabic and any other language written from right to left must type half of the URL in one direction and the other half — the .com, .net or .org postscript — the opposite way.' Let's hope it goes better this time around: 'Next week's experiments use the domain name "example.test" translated into 11 languages. A previous model, however, used "hippopotamus" instead of "test." These plans went awry when an Israeli registrar realized the Hebrew word ICANN thought meant "hippopotamus" was an expletive and threatened to involve the Israeli government.'"
Well hippopotamus me, what will they think of next?
The NSA: The only part of the US government that actually listens.
We only need one language, and it's English. Take my international words for it!
--
Some Norwegian guy.
A URL is an entire address, including the protocol, local path and fragment identifier. This is a URL:
A domain name does not include the protocol, the local path or the fragment identifier. This is a domain name:
This is talking about domain names, not URLs. If anybody would talk about multilingual URLs, it would be the IETF, not ICANN, and they already have, they are called IRIs.
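To make the URL versus domain name distinction concrete, here is a minimal Python sketch using the standard library's urllib.parse. The URL is a made-up example (it is not from the article or the post above), and the variable names are only for illustration.

```python
from urllib.parse import urlsplit

# Hypothetical URL, chosen only for illustration.
url = "http://www.example.test/path/page.html#section"
parts = urlsplit(url)

# A URL carries the protocol, the local path and the fragment identifier...
print("protocol:", parts.scheme)
print("path:    ", parts.path)
print("fragment:", parts.fragment)

# ...while the domain name is just the host part.
print("domain:  ", parts.hostname)
```

Only the last value is what ICANN's tests are about; everything else in the URL is outside the DNS.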
Seriously, multilingual domain names are a pain (for the whole of humanity). Visiting Japan last year, I saw a lot of sites with names in Japanese script. As a foreigner, I hadn't the slightest idea what a site was about without clicking on it, and clicking on it didn't help either. Yes, a lot of Japanese have the same problem with English domain names, but multilingual names add more complexity to the whole thing. I would like to see the face of a Chinese guy trying to decipher some URL in Ukrainian characters... or... trying to type it on his Japanese keyboard...
It's time to realise that Abble's products are the biggest abomination these days. Just say NO to the dumb iAbble way!!
I'd love to know what Hebrew word for hippo is an expletive. All my life I've only ever heard "hipopotam" in Hebrew for hippo, which is not a very dirty word. In any case, Hebrew URLs have been the norm at the Hebrew Wikipedia for as long as I've been using it. Hebrew domain names, on the other hand, would be interesting (even though I'm sure this is what the poster meant).
It is dangerous to be right when the government is wrong.
xc.estaog//:ptth
I wonder what test translates to... I hope they hired a translator who doesn't like practical jokes.
ICANN reads NNACI
What do the israelis think about this?
http://org.slashdot/ or is it org.dotslash://http or org.dotslashcolon://http or.... ah, hippo it!
If it's going to use characters not present on normal keyboards, what's the point? Why not just use IP addresses?
Mulling != Testing.
"It's the height of ridiculousness to say for those 9 lines you get hundreds of millions."
Those countries are, understandably, uncomfortable with a US controlled DNS system.
ssa lamron a tub htuom eguh a htiw yug a fo etis kcohs a si tI
"I'm glad to say I was there to see the day the US government sold out the Internet in Berlin" - Don Telage.
"The response was basically, 'I'm too busy. Go learn English."
That's about right. In the day ICANN was concentrating on trademark issues, and the reasons it got to exist in the first place (new TLDs, international domain names) were back-benched. It's not like we didn't have laws against trademark infringement, but the trademark lobby wanted greater rights in cyberspace than it has in the real world, and it can be argued they got 'em. You folks would shit if you knew how much money the TM guys spent on this. It's greater than hundreds of millions. Just to prevent a few new lines of text from going into a file called db.root on a computer in Virginia.
Keep in mind the mantra ICANN likes to chant is "stability" which in this case means "no growth".
"There's . . . a little anti-American rock-throwing in that description," said Mike Roberts, the first president and chief executive of ICANN. "The engineers thought that trying to do the non-Roman alphabet thing with all this growth would destabilize the Internet and cause crashes."
Show me an engineer that ever said this. How has the DNS changed since then? It hasn't of course. But then Mike Roberts has lied to me before.
Before they rush on with alphabets that read right to left and use alternative character sets they really should try English words with greater than 8 bit characters. Are they gonna actually work? There's still a lot of old DNS code out there *cough*BIND4*cough*.
But, if they don't work, it's not like the existing 8 bit characters names will suddenly stop working or that sparks will fly out the back of your computer.
Working models of IDN ("International Domain Names") have been shown for over a decade. Whenever some alternative to the US controlled DNS starts to get "legs" ICANN does something. In all cases it's been too little too late.
Fasten your seat belts, this is gonna be a rough ride.
Need Mercedes parts ?
All I have to say is about time. Yes, I'm a native English speaker, and yes I see some technical problems with this, but I'm also fairly cosmopolitan (not the magazine) and do think that multi-lingual domains are the way to go.
One request I would have of ICANN is to limit the use of accented characters to help prevent phishing scams.
- I voted for Nintendo and against Bush
I live in France and in France we have accented letters. They are
present on our keyboards too, and we naturally use them when typing
text. But I would hate it to have to use accents in domain
names. This is plain silly. Uppercase letters do not have the accents,
lower case letters have. What character should I type for a given
domain name? And given that the domain names remain case insensitive,
the ambiguity remains forever. Not to mention having to type characters
you can't type from your keyboard, or that you can't even name or
identify!
Let me repeat it, this is plain stupid. I would like the domain names
to remain english 7bit-only. Yes it has limitations and it is inconvenient
to many people, but those people have overcome the problem (otherwise they
would not even connect to the net). Now we will enlarge the problem, and
for everyone.
Plain stupid move. Sounds like marketing from registrars.
Willy
Maybe if I did a search for something, and the answer is in one of those "other" languages written by those "other" people, maybe I could somehow click some kind of--I don't know--maybe a representation of that site, using my rat or squirrel or whatever these new-fangled devices are called. Then of course I'd like to be able to save this transportation capability for future use; if only there were a way to save some kind of cyber-bookmark in my browser, to keep my place without having to type in all those funny characters ever again. I think I have some ideas, but I need to contact my patent attorney first.
Oh, no. Wait. I just thought of something bad. You know, when I actually get to this site, it's probably going to be really hard to understand what's written on the page. Funny squiggles and such. I suppose there's really just no reason for me to go to such a page, if I can't read it anyway, so why even bother? Plus "they" probably don't know anything good anyway, but there's always a chance that "they" might be more intelligent than we thought. If only there were some site that provided a service that could help me translate this page, then maybe, just maybe, I'd be Ok with allowing these foreign-speaking visitors to spread their native language like some kind of disease all over "my" Internet. If only...
Oh, and I was thinking it was something like, "I seem to be having tremendous difficulty with my lifestyle," a phrase that is known to have started at least two wars.
Oh, say does that Star-Spangled Banner entwine / The myrtle of Venus with Bacchus's vine?
And besides, links can use IP addresses just as easily.
If a big concern truly is the direction text is written, shouldn't this be taken care of by the programs? Why can't the Arabic version of Firefox/IE/etc. just reverse the flow of text input in the URL bar and then switch it to left-to-right behind the scenes? Or have people gotten accustomed to writing in their languages backwards? Then, if you really want to be nice, maybe add translations for the TLDs in different scripts.
What next will they think of? The idea of a "Russian *Internet*" or a "Chinese *Internet*" just makes me laugh!! As for accusations of "cultural imperialism" - can I just point out that English speaking people developed the Internet at their own time and expense (and a lot of tax-payers money) - so they are entitled to have it in English if they want ..
this is daft - just plain daft ... when we look back at this moment and realise that we opened the can of worms that led to a 21st C. "Tower of Babel" - we will weep ...
You're making the assumption that the computer should display characters in the order in which they were typed. You need to lose that assumption. A good technical solution would be simply to make the computer able to display the characters in the opposite order to which they were typed. That way, languages that are written backwards can be displayed correctly even when their users type the text in the correct order.
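This is roughly how Unicode text already works: characters are stored in the order they were typed, and each one carries a directionality class that the display layer uses to decide where to draw it. A minimal Python sketch with the standard unicodedata module follows; the helper name bidi_classes and the sample word are my own, for illustration only.

```python
import unicodedata

def bidi_classes(text):
    """Return the Unicode bidirectional class of each character, in typed order."""
    return [unicodedata.bidirectional(ch) for ch in text]

# The Hebrew word "shalom", stored in the order the letters are typed.
hebrew = "\u05e9\u05dc\u05d5\u05dd"

print(bidi_classes(hebrew))  # every letter is class 'R' (right-to-left)
print(bidi_classes("com"))   # every letter is class 'L' (left-to-right)
```

The stored string never gets reversed; only the rendering changes. What the sketch does not show is the awkward part the article complains about, namely a single name that mixes 'R' letters with an 'L' TLD.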
Are you adequate?
To use characters present in abnormal keyboards. You know, the kind of keyboards that people have in countries where these sites' audiences would be.
I am impressed by the fact that you managed to frame the question in exactly the relevant terms, what characters are present in your keyboard, and not the facile, easy way out that those crazy foreigners do, when they insist in framing this in terms of presenting addresses in their own native language and script.
Are you adequate?
So just have DNS zones set up equivalence rules between domain names, so that the difference between, for example, E, e, É, é, È, è, Ê and ê is ignored for DNS requests within zones where it is appropriate to do so.
The fact that multilingual domain names would require some policy decisions in the parts of registrars doesn't have nearly as much importance to the task of providing a mechanism for providing such names.
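As a rough sketch of what such an equivalence rule could look like, assuming a zone simply wants accents and case ignored: decompose each character, drop the combining marks, and lowercase the rest. The function name fold_label is hypothetical, and this is not how any real registry is known to do it.

```python
import unicodedata

def fold_label(label):
    """Map a label to an accent-free, lowercase form (illustrative policy only)."""
    # NFD splits 'é' into 'e' plus a combining acute accent.
    decomposed = unicodedata.normalize("NFD", label)
    # Drop the combining marks, keep the base letters.
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return stripped.lower()

variants = ["E", "e", "É", "é", "È", "è", "Ê", "ê"]
print({fold_label(v) for v in variants})  # a single equivalence class
```

A zone could then treat two names as the same registration whenever their folded forms match. Letters with no decomposition (ø or ß, for instance) pass through unchanged, so a real policy would need explicit rules for those.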
Are you adequate?
Their characters are not available to me. By using them in the DNS system, a part of the net is effectively being segregated.
hmmmm.... How much would, xxx.madamiamadam.xxx, sell for?
In a net cafe on the other side of the world?
I don't think the computing world is ready for this yet, and it may never be a good idea.
Internationalization in software and operating systems is in a horrible state of excess
complexity right now. When everything top to bottom runs unicode UTF8 as its default
mode, then MAYBE.
But even then, there is a single language for Aviation communications (happens
to be English) but that is done so that there is some hope that everyone will know what
everyone is talking about, because everyone can learn the aviation subset of a single
natural language.
Also, most programming languages retain a small set of keywords in a single natural
language, so that most people will have a chance of learning that small set.
Simplicity-and-universality-first arguments maybe should win the day
for domain names too.
"Nationalized" domain names are one more step in the very unfortunate
trend toward balkanization of the Internet. The Internet is to some extent and
should continue to be one place where all people around the world start working
and communicating and trading and problem solving together. A Lingua Franca
is clearly needed if this is to remain true.
Where are we going and why are we in a handbasket?
This is the worst idea ever invented. There are a lot of languages like Russian where some letters look the same as normal English letters. So if they manage to roll out multilingual domains, it means they will roll out a lot of possibilities to spoof domains. For example, the Russian letter for 'R' (Р) looks like the English 'P'. So suddenly any jerk could register a well-known domain, substituting one letter for another, and capture a lot of passwords...
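One partial defence is to flag labels that mix scripts. Python's standard library has no direct script lookup, so this sketch uses the first word of each character's Unicode name as a stand-in, which is a heuristic and an assumption on my part; the function names and the spoofed label are made up for illustration.

```python
import unicodedata

def scripts_in(label):
    """Approximate the scripts used in a label from Unicode character names."""
    # e.g. 'CYRILLIC SMALL LETTER ER' -> 'CYRILLIC', 'LATIN SMALL LETTER A' -> 'LATIN'
    return {unicodedata.name(ch).split()[0] for ch in label if ch.isalpha()}

def looks_mixed(label):
    """True if the label draws letters from more than one script."""
    return len(scripts_in(label)) > 1

# Cyrillic small 'er' (U+0440) followed by Latin "aypal": renders like "paypal".
spoof = "\u0440aypal"

print(spoof == "paypal")       # the strings differ even though they look alike
print(scripts_in(spoof))
print(looks_mixed(spoof), looks_mixed("paypal"))
```

This only catches mixed-script names. A name written entirely in lookalike Cyrillic letters would pass, so it is a mitigation and not a fix.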
Comparing the situation with aviation is a very bad analogy. Computers don't crash.
Are you adequate?
Did they solve the problem where the same or indistinguishably similar character appears multiple times in the character set?
Moderating "-1, Disagree" is simple censorship. Have the guts to post your opinion.
That one's easy. I'd read it just for the pictures.
Are you adequate?
Have the registrars and/or administrators of DNS zones enforce policies that decide which domain names are allowed in their zones, and also, which are equivalent to each other. That way, the .com zone administrators can just forbid registration of 'ébay', and in addition, enforce that queries for 'ébay' must resolve the same way as queries for 'ebay' do.
We already do this: you can't register 'EBAY.COM' separately from 'ebay.com', and if you type the former, you get the same answer as the latter.
Are you adequate?
1 And the whole earth was of one language, and of one speech. 2 And it came to pass, as they journeyed from the east, that they found a plain in the land of Shinar; and they dwelt there. 3 And they said one to another, Come, let us make brick, and burn them thoroughly. And they had brick for stone, and slime had they for mortar. 4 And they said, Come, let us build us a city and a tower, whose top may reach unto heaven; and let us make us a name, lest we be scattered abroad upon the face of the whole earth. 5 And the Lord came down to see the city and the tower, which the children builded. 6 And the Lord said, "If as one people speaking the same language they have begun to do this, then nothing they plan to do will be impossible for them. 7 Come, let us go down, and there confound their language, that they may not understand one another's speech. 8 So the Lord scattered them abroad from thence upon the face of all the earth: and they left off to build the city. 9 Therefore is the name of it called Babel (confusion); because the Lord did there confound the language of all the earth: and from thence did the Lord scatter them abroad upon the face of all the earth.
-fragbait
In the British JANET, machine names looked like UK.AC.HATFIELD.STAR .
__
Men with no respect for life must never be allowed to control the ultimate instruments of death.
GW Bu
Oh come on . . . everyone should be giving up their native tongue and learning English anyway. You WILL sooner or later. all your brain are belong to us
URLs weren't meant to be exposed. They weren't meant to be branded. It was a mistake for the first browsers to have an address bar displaying the URL once they left alpha-testing. Most people navigate by selecting bookmarks or clicking links to other sites. Seldom do people type in a URL, and if they need to, a pop-up dialog, which went away after taking input, would have been enough. There never was a need to show the URL of every page element flying by in the status bar, or to show the site's URL in the unnecessary address bar. People weren't supposed to need to know those things to use the WWW.
Now that the visible address bar has led to domain branding, of course countries with languages that can't be represented in left-right ASCII want to brand their superfluous address bars in their native scripts, too.
All this needless internationalization of the DNS could have been avoided if only the first browsers had hidden the URLs, like they were supposed to do.
Edith Keeler Must Die
There was an article here on Slashdot last week on multilingualism, and all the anglo slashdotters were talking about some wonderful "universal" language (which of course meant English, because none of them ever bothered to even try to learn anything else). Then they get seriously confused when they go to another country and discover that, shock, gasp, the locals not only speak another language but also use another script entirely. Especially in countries with big populations, where there are enough resources to have their own digital script systems (Chinese, Japanese, Hindi, Arabic or Cyrillic), this means that the previously imperious "universal language" advocate is suddenly up shit street without a paddle.
What this means is that just because all you know is English, doesn't mean that all the others are going to suddenly run to learn your language and kiss your ass. Those days are over.
In almost all cases, they'll simply be unable to do so.
I'm supporting a standard which can be easily implemented on all available hardware.
On any keyboard, so it makes sense to limit URLs to that.
In the ancestor post, I pointed out that this nullifies the point of DNS. So yes, they may as well be using IP addresses.
Unless I'm mistaken, all keyboards can do the basic ASCII characters. From a standardization point of view, it makes sense to limit something as crucial as DNS to characters available to everyone.
Really, is there a pressing need for änderungsaufträge.de to be distinct from anderungsauftrage.de?
How are you going to enter these URLs on a standard keyboard from a different region? What about Blackberry/Treo style devices?
Not everyone will have access to system configuration on the machines they use (especially in net cafes), nor will all systems be guaranteed to be so configurable in any event (especially mobile/embedded devices).
About the only thing you can depend on is support for ASCII. This isn't an issue of language. It doesn't matter what you can or cannot speak if the machine you're using doesn't support your character set.
Do you really want a world where people can only check their email if they bring their own laptop?
How do you change the keyboard layout on a kiosk with no configuration available to the user?
Will these new TLDs translate into punycode or will they just be inaccessible for us without perusing a Unicode table?
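For reference, this is what the punycode translation looks like for the internationalized names that already exist; whether the new TLDs will work the same way is not something the article says. The sketch uses Python's built-in "idna" codec (which implements the older IDNA 2003 rules), and the domain name is a made-up example.

```python
# Hypothetical internationalized name, for illustration only.
name = "bücher.example"

# Each non-ASCII label becomes an ASCII-only "xn--" label.
ascii_form = name.encode("idna")
print(ascii_form)

# The ASCII form can be typed on any keyboard and decodes back to the original.
print(ascii_form.decode("idna"))
```

So the name stays reachable from an ASCII-only keyboard, provided you know the xn-- form, which nobody is going to memorise.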
Complaining to the manufacturer of a kiosk in a foreign country isn't going to help you check your email. If you can even figure out who the manufacturer is.
I'd much rather a world where you have to use a few ASCII characters (which aren't really foreign anywhere on this planet), than one in which the internet is segregated by country. That's what you're advocating.
As soon as China opens up the Great Firewall of China and adopts a free and
open access to information and communications policy, with no "jail-time" benefits
for curiosity or initiative.
Where are we going and why are we in a handbasket?
People wouldn't be using locked down kiosks if there was another option.
Alternative scripts should be *optionally* supported as widely as is at all feasible, but they should never be mandatory, especially not in something as crucial as DNS. That still allows native scripts to be used the vast majority of the time.
Your solution locks people out of certain segments of the net based on the hardware they have access to. There's something very wrong with you if you see that as anything but completely unacceptable.
How would the ASCII-to-special-character mappings work? Administrators aren't my primary concern; I'm more worried about, say, a (possibly unilingual) Cantonese speaker trying to use a net cafe in Vancouver.
Wingdings isn't in Unicode, AFAIK, it's just a font.
As for änderungsaufträge, the most sensible thing to do would be to map ä to a silently. Allowing ä is just asking for phishing problems (ie, citibänk.com).
I prioritize the ability to access all of the internet by everyone, everywhere, and you prioritize the ability to use local scripts everywhere.
The mapping strategy takes some of the bite out of IDN, but it doesn't help our stereotypical Wingdings-using grandma. She'll just look at you blankly if you tell her she has to do that, whereas if you map accented characters to their non-accented ASCII lookalikes, everything works automagically for her.
Given time, probably less than a generation, people will adapt to ascii-only domain names. I don't think you can say the same for language support in every hotel netcafe in the world. There are just too many different scripts for them all to be supported everywhere.
I don't really see a problem with google.de and göögle.de colliding. The latter is almost certain to be some sort of scam site if it's allowed to exist in any event.
When I suggested silently mapping accented characters to their lookalikes, I meant that someone who requested mélangerie.com would in fact get melangerie.com. The clueless user still gets to type what the word sounds like to them, but we aren't obligated to handle dangerous characters that can be abused by phishers.
If we want to use them in a universally usable system, every machine *does* need to be able to use all characters in play. I'm not suggesting that we ban unicode and universally apply ascii to everything, just that we do so on critical, low level infrastructure. Again, it comes down to preference. You favour the aesthetic use of local scripts over universal access.
There's no money in installing Arabic interface methods in a Kansas hotel net cafe. You can't depend on the market for that sort of thing, it'll always leave some people out in the cold.
It isn't about *enforcing* universal access, but *allowing* it. We can't do anything about a website doing silly things like not having an ascii only version, but we can at least make sure people can enter the domain name.
It doesn't matter what the accented characters mean. I'm just suggesting that we map them, entirely silently, to their lookalikes. A user could type göögle.de, have göögle.de packed in the DNS request and displayed in their browser. Only on the server side would göögle.de become google.de. Uneducated users would only encounter an issue if they tried to register a name with an accented character which has a non-accented equivalent. That's an *extremely* minor infringement for a very major reduction in phishing potential.
I'll bring up the hotel in Kansas example again, because you haven't satisfactorily dealt with it. You cannot, in general, make any modifications to netcafe computers, so the power adaptor analogy doesn't fly. You have to accept its capabilities as they are, and that means ASCII only. Maybe they won't be able to read a page anyway, but at least they have a chance. Further, it provides guidance to the people registering the domain that they should have an ASCII-only version available.
Of course you can't map the kanji for water to an ascii lookalike. You also can't use it to spoof paypal.com. This is a separate issue from IDN in general. Even if we do implement them, only one of similar characters should be permitted.
:P
The goal isn't to stop *all* phishing, but simply to stop certain specific forms of it which similar characters cause. Speed limits don't stop traffic accidents, but they do stop some of them.
It's not stupid to have no ASCII version of a page. A Chinese community does not need any ASCII. Their domain does not need any ASCII, either. If you want to access it from the hotel computer in Kansas and anticipate traveling there and doing so, log on to del.icio.us and click on your link, put in your flash drive and use your bookmarks, use google and search for it, etc. There's a chance that it will work. So yeah, you can prepare for that eventuality, JUST like you can prepare for a different power supply. The power adaptor analogy flies.
Again we come back to the original point. IDN negates the purpose of DNS completely. If we're going to have to depend on USB keys (which may not be allowed in the Kansas netcafe) or social bookmark sites, we may as well just use IP addresses.
Further, your "power adaptors" necessarily exclude economically disadvantaged people. A Chinese dissident fleeing execution isn't going to have time to back up all his bookmarks, and may not have the means to acquire his own system when seeking asylum.
ASCII is available on all machines, therefore it is universal.
It's a choice between language and accessibility. It should be obvious to everyone of good will which has priority.