Internationalized Domain Names Coming Soon
rduke15 writes "You think you know how to parse a domain name for validity? Well, in case you haven't noticed, things are getting tougher as registrars keep adopting IDN (Internationalized Domain Names), which uses a weird encoding named Punycode to enable accented characters in domain names. The Register reports about Switzerland, Germany and Austria's joint move to enable IDN. See the overview in English from Switch. But I guess it would be difficult to talk about this on /., since it does not even support basic Latin-1 ... :-)"
More ways for trolls to disguise goatse.cx links...
It looks to me like the problem is that the DNS servers don't support Unicode, so they're using a bad implementation of it.
Why not extend DNS to support Unicode? That way there'd be no translation or other crap to go through.
Granted, software would need changing, but that would also be the case with the mangled crap that's mentioned in the article.
What am I not understanding here? Or is this just an implementation dreamed up to make life complicated?
There's a gorilla from Manilla who's a fella that stinks of vanilla and has salmonella.
I'm sorry, is it just me or do they seem to be taking a bad shortcut to get to a good end? It doesn't seem like they are doing this correctly. Why not plan to migrate to Unicode? Their choice seems shortsighted and flawed. I hope they at least considered Unicode and came up with real reasons not to use it.
But I guess it would be difficult to talk about this on /., since it does not even support basic Latin-1
Just say the ascii number?
When anger rises, think of the consequences.
Confucius (551 BC - 479 BC)
I'm not sure what all the accents on the alphabet are. Will I have to know how to type them to access a simple website? Sorry, this doesn't make using the net easier.
Trolling is a art,
Taco is a naughty boy.
While it's logical for, say, Chinese companies to have a Chinese domain name and Chinese e-mail addresses, it may not be the best choice if the company wishes to expand overseas.
Unfortunate but true: if a company has a Chinese domain name, it would probably only be used within China, Taiwan, Hong Kong, Singapore, Japan (since it's Unicode), and maybe South Korea. The company would be pretty much limited to the East Asian market.
However, I suppose the company could get both a Chinese domain and an English, or rather Pinyin, domain so they could make their Chinese, or maybe other Asian clients feel "closer" while also being able to reach clients outside of East Asia.
I also think that it'd be great to give people the option of having a native-language email address. It's not too hard to set up a romanized email alias for it. An SMTP "X-Roman-Address" header could even be added to outgoing messages in case a recipient can't read the default "From" line.
There's 10 types of people in this world, those who understand binary and those who don't.
After all, now they need not only worry about registering say...
c rosoft.tv
Microsoft.com
Microsoft.net
Microsoft.org
Mi
etc..
But also
Microsoft.com
Microsoft.com
Well, you get the picture.
--Won't that be grand? Computers and the programs will start thinking and the people will stop. - Dr. Walter Gibbs
I have mixed feelings about this. I am from Sweden, and it always looks kind of ugly when names lose their dots and circles in the domain name.
On the other hand, this is also quite convenient. I live in the US now, and I travel around quite a bit. I often surf on Swedish Internet sites, typically without access to a Swedish keyboard. It would not be very convenient if the domain names used non-English symbols.
Sometimes I go to Japanese sites also, and I am really glad that I don't have to install a Japanese word processor to do this...
Tor
Any Internet RFC which includes the phrase, -with-SUPER-MONKEYS, has GOT to be good. (And in case you think I'm trolling, check the link.)
I am glad to see others than the Mesopotamians using the wheel, which was originally invented for use in Mesopotamia.
Slashdot Sig. version 0.1alpha. Use at your own risk.
Punycode *is* a Unicode encoding.
Unicode has many encodings; UTF-8 is one encoding and Punycode is another. UTF-8 aims for efficiency when the majority of the text is ASCII, and Punycode aims for completeness when you must fit in a 63-character DNS label and use only ASCII characters to do it.
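To make the comparison concrete, here is a minimal Python sketch that encodes the same word both ways using the codecs that ship with the standard library. The sample word is only an illustration, not something taken from the RFC.

```python
# Compare two encodings of the same Unicode string.
word = "b\u00fccher"  # "bucher" with an umlaut on the u, written as an escape

utf8_bytes = word.encode("utf-8")         # the umlaut becomes a two-byte sequence
punycode_bytes = word.encode("punycode")  # only ASCII letters, digits and hyphen

print(utf8_bytes)      # b'b\xc3\xbccher'
print(punycode_bytes)  # b'bcher-kva'

# Both encodings are lossless: decoding returns the original string.
assert utf8_bytes.decode("utf-8") == word
assert punycode_bytes.decode("punycode") == word
```

The UTF-8 form contains bytes above 127, which is why it cannot be dropped into a classic DNS label, while the Punycode form stays within plain ASCII.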
I, for one, welcome our new European overlords.
An Indian-American Hindu committed to non-violent thought/speech/action alarmed by the global explosion of radical Islam
> You think you know how to parse a domain name for validity?
Yes, I do, and if you _read_ the RFC you'll see that nothing changes: these domain names are encoded into the same character set the current DNS already uses. Hence, if you give me a URL, I can validate it with existing scripts. There's an example which shows that Bucher.ch (with an umlaut on the u) would be translated to xn--bcher-kva.ch, which looks totally parseable to me.
John.
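The claim above can be checked with a short Python sketch. It assumes Python's built-in "idna" codec (which implements the 2003 IDNA rules) for the conversion, and the regular expression is only a hypothetical stand-in for an "existing script" that accepts letters, digits and hyphens in labels of 1 to 63 characters.

```python
import re

# Assumed stand-in for an existing validator: letters, digits and hyphens,
# 1 to 63 characters per label, no leading or trailing hyphen.
LABEL_RE = re.compile(r"(?!-)[A-Za-z0-9-]{1,63}(?<!-)")


def is_valid_hostname(name: str) -> bool:
    """Return True if every dot-separated label passes the ASCII-only check."""
    labels = name.rstrip(".").split(".")
    return all(LABEL_RE.fullmatch(label) is not None for label in labels)


native = "b\u00fccher.ch"  # the example name, with an umlaut on the u
ascii_form = native.encode("idna").decode("ascii")

print(ascii_form)                     # xn--bcher-kva.ch
print(is_valid_hostname(ascii_form))  # True
print(is_valid_hostname(native))      # False
```

The old-style check passes the encoded form unchanged, but it rejects the name as a user would actually type it, so software that accepts typed input still needs the conversion step first.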
Personally, I can't wait to see funky Chinese character domain names in my web logs (mostly from infected Windows machines trying to attack my Apache server).
I Am My Own Worst Enemy
The internet was built as a highly decentralised, noncontrolled network, so that, in the event of a nuclear war, military leaders would have unrivalled access to pornography. (3DTIAB)
Exercise your right not to vote. thinkoutside.org
This means that it can't possibly include ALL of the Unicode spectrum, as Unicode supports far more than just 92 extra characters.
Also, the way the coding is going to work, you still can't register a name with B.
I am unamerican, and proud of it!
Yes, it is. Because it's not just a few "umlauts". When you're talking about Asian or other non-Romanized languages, the Romanization may be totally incomprehensible even to some speakers of that language. It's one thing to lose a few accent marks and such, but it's quite another to translate your language into a totally incomprehensible and unrelated format. In fact, in kanji-based languages at the very least, Romanization actually LOSES information. It's not just a matter of transcribing the sounds into another format, because the kanji carry additional meaning not present in the phonetic language alone. If you've ever seen two native Chinese or Japanese speakers talk to each other, they will frequently "write" kanji in the air or on the palm of the other person's hand with their fingers because their spoken language is imprecise. These changes are very necessary for the Internet to become a truly international phenomenon.
Just to diverge, I'd like to represent the non-English speaker view here.
In most of the languages with 'funny accents' like umlauts, these characters often have a completely different pronunciation, and are often considered to be a completely different letter than without the 'accent'.
Simply 'brushing off the dirt' and removing the 'accent' thus changes the word. Sometimes with weird results.
Just ask someone from the town of Moensteraas, Sweden.
Their website contains mostly municipal information intended for Swedes, but due to the restrictions of DNS, the name is instead spelt 'monsteras', which means 'monster-carcass' in Swedish.
Obviously, these people would be happier spelling it with umlauts on the o, and a ring over the a.
You know, this arrogant, self-centric view does not help the discussion.
Anyway, the current infrastructure DOES NOT have to be updated and this change is NOT intended to be "some jagoff's playground", but rather for the non-English speaking people - there are quite a few of them.
Real life is overrated.