International URLs Pass First Test
Off the Rails writes "The BBC reports on the results of a successful test of non-ASCII domain names on Internet-equivalent hardware (pdf) carried out last October. The next stage is to plug the system into the net, and if it still works, it could go live sometime next year. 'Early work on the technical feasibility of using non-English character sets suggested that the address system would cope with the introduction of international characters, but tests were called for to ensure this was the case ... Also needed are policy decisions by Icann on how the internationalised domain names fit in and work with the existing rules governing the running of the address books. Icann is under pressure to get the international domain names working because some nations, in particular China, are working on their own technology to support their own character sets.'"
now I have to learn second languages to look at asian porn.
In a world of acronyms, the words are the real victims.
Imagine all the new ways to spell bank0famerlca.com.
Best Windows Freeware
Non-ASCII? This is awesome! I can't wait for the ANSI addresses to start showing up.
I got dibs on sêx.com!
Developers: We can use your help.
I don't see this as being very popular. Does the average Internet user know how to get an umlaut to display?
All it's going to do is open the door for more domains for the squatters to sit on.
In my skim through the various links, I didn't see what they are proposing to do about practical real-world problems such as phishing. What are they going to do to ensure that a phisher doesn't register a domain with characters that look almost indistinguishable from different characters in a different language, so as to trick users into visiting the phisher's site instead of the legitimate version of the site?
Oolite: Elite-like game. For Mac, Linux and Windows
While browsers can't even properly show non-English alphabets, this doesn't seem to be a good idea. My native language contains many special characters and I usually end up deciphering the emails my mom sends me, because along the way, servers replace these characters with funny things.
The concern I have with IDNs is that they will make it too easy to produce "lookalike" domains, like "mcrosoft.com".
Testing functionality and behaviour with "good" names is an easy bar to hurdle.
I look forward to www.paypa|.com etc etc
We should simply invade any country that doesn't use the latin alphabet and teach them English.
If you or your company/organisation have any international contacts then you will NOT be using these international URLs. So you still need the old-style URLs, or you'll need to explain how to get those umlauts etc. typed into the URL. On their national keyboard... not yours that has them. And if you've done any support you know how hard it is even to get someone to READ what's already on the screen...
And quit posting as AC...we know who you are.
I'm going to be the only slashdotter in history to have se×.com
It can be cool (?).
-- Rastignac was here.
They are internationalized urls. If they were international urls I would be able to enter them in my browser without doing funky stuff.
Pardon my ignorance, but couldn't they have just thought of an encoding scheme? Similar to how certain characters are encoded in the path of a URL ("&"-style or "%20"-style). Possibly a more complicated scheme would have been necessary, but surely it would have been possible without requiring changes to the ASCII nature of domains.
Slashdot social media options: AIM, ICQ, Yahoo, Jabber and Mobile Text. Why no MySpace?
Call them, say, "character sets."
Then only allow names and queries all from the same character set.
This is just common sense -- there's no reason why Chinese, Greeks, and Russians should have to use a character set meant for the English language. But any given URL should have a language associated with it and any character in that URL not associated with its language should be color coded. So English language URLs would get "omicron" flagged while Greek URLs would get "O" flagged. The "default" language could be English so that existing URLs are unchanged, for other languages their ISO code could precede the URL. Now this particular scheme might have some fatal flaw but something similar ought to be workable.
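A rough sketch of the flagging idea above, in Python. This is only an illustration under loud assumptions: the function names are made up, and the "script" of a character is guessed from the first word of its Unicode name (e.g. "GREEK SMALL LETTER OMICRON"), which is a crude heuristic and not how any browser or registry is known to do it.

```python
import unicodedata


def script_of(ch):
    """Guess a character's script from the first word of its Unicode name.

    Heuristic only (an assumption of this sketch), not an official property.
    """
    name = unicodedata.name(ch, "")
    return name.split(" ")[0] if name else "UNKNOWN"


def flag_foreign(label, expected="LATIN"):
    """Return (index, char, script) for each letter not in the expected script."""
    flagged = []
    for i, ch in enumerate(label):
        if ch.isalpha():
            script = script_of(ch)
            if script != expected:
                flagged.append((i, ch, script))
    return flagged


# "micr" + GREEK SMALL LETTER OMICRON + "soft": looks like "microsoft"
print(flag_foreign("micr\u03bfsoft"))
# A Latin "o" inside a name declared as Greek gets flagged the other way round
print(flag_foreign("o", expected="GREEK"))
```

A GUI could color the flagged positions instead of printing them. A real implementation would probably want the proper Unicode script data rather than character names.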
Also needed is automatic translation by, say, a Firefox extension, from the domain name's registered home language (if any) into the user's default language. How do you say "goatse" in Urdu?
A good complement to the new system to preempt the huge coming problem of "glyph masquerade" would be registrations including a list of the domain name translated into different languages. Or at least a declaration of the home language. Without enforcement (ICANN doesn't even enforce name/address veracity) it won't be proof of anything, but it would be a start. And 3rd party databases could include in trust ratings the completeness of the name entry, as well as cross-checks.
I'd like my GUI to at least indicate when a domain name is rendered in foreign glyphs, so I can try to tell whether it's really just foreign glyphs that look like a familiar English word, fooling me into clicking on something totally unrelated.
Opening the system to foreign scripts and languages will get even more worthwhile people and orgs onto the Net, so it's well worth the risks of misidentification. But the risks are real, and largely predictable. We should roll out the new, inclusive system with risk mitigations to welcome those new people in greater security.
--
make install -not war
With the exception of the phishing possibilities that others have already noted, there really shouldn't be any change for English speaking internet users. Most English websites aren't going to want to use special characters. My parents have a hard enough time grasping ctrl-c and ctrl-v for copy and paste. Good luck to anyone explaining alt-145 for them to get to æon.com
"Always forgive your enemies; nothing annoys them so much." - Oscar Wilde
I saw this domain.. ©.com (http://©.com) for me it is accessible in firefox but not IE
66.35.250.150 is non-ASCII.
Will having non-ASCII data in FQDN's open us up to buffer-overflow attacks in various network-aware services?
It's true no man is an island, but if you take a bunch of dead guys and tie 'em together, they make a good raft.
Below is a quick copy and paste from one of my posts on DNForum regarding IDNs ... I own some IDNs and believe they have much potential, but there are still many unanswered questions...
...
... it's among the reasons that English dominates in some areas; some natives, even if they can understand a particular dialect, will sometimes speak a totally non-native language, such as English, instead to avoid risk of offending the other party. One can't assume one language dominates an entire region - languages can also overlap many areas ... it's one of the reasons some are pushing for language / culture based TLDs, such as .CAT (among the dumbest ideas ever, but that's another discussion for the .CAT thread running here on DNF).
... ie. cafe.com versus café.com ... what happens? Will the IDN be highlighted / blocked by default? ... likely an easy UDRP target? ... introduction of a new IDN specific dispute procedure? -perhaps there already is one?
... ie. an IDN that is similar / exact to a trademark in another country ... less obvious, what about an IDN that translates to that of a trademarked word / phrase? -I believe there's a thread discussing such an issue now on one of the other boards here.
... how good / stable are the various language variant tables?
... does the current registrant get first dibs? ... even if yes, it may not be quite that simple if a character variant occurs in numerous permutations.
... probably not a biggie compared to some other issues, but one to be aware of.
... IDN resolution depends heavily on client-side APIs.
... I can easily envision scenarios in which a web browser and/or other applications (email, IM, etc) implement resolution differently ... ie. adding and/or ignoring one or more valid language associations for a particular IDN / converting similar-looking western european characters to standard A-Z characters, etc. A related concern is language table management - I'm a little hazy on whether the tables will be internally stored by each app or remotely loaded for each session, etc.
Excerpt from a post of mine on DNForum regarding IDNs:
http://www.dnforum.com/showthread.php?p=732080
I'm running into a lot of issues that many IDN folks aren't discussing, probably because they haven't considered them.
Various issues / threats / questions:
- The existence of numerous diverse dialects, even totally different languages, etc. in the same country
- An IDN containing western European characters that very closely matches a non-IDN
- Trademark issues
- Language variant (more applicable to Asian languages, etc.) related issues
- What happens when a language variant table changes? How are conflicts handled?
- What happens if a character variant (an IDN [IDL package] technically can comprise multiple character variants [code points]) is released?
- What happens if a reserved character variant is changed to a preferred character variant? While such a change would have little to no effect on affected IDNs (IDL packages), it could result in the appearance of some IDNs changing
- How reliable, especially for those in languages with numerous character variants, will IDN domain resolution be?
- How well will IDN resolution APIs be regulated?
Rambling on, but there are a lot of things that one needs to be aware of with IDNs.
http://www.145/|-|D07.org
Imagine it with different ANSI colors for each char.
Would this lead to segregation of the internet into zones defined by the language used for the domain name? At the moment, I can access e.g. Japanese websites easily, even if the content of that site is in a language I don't understand [1].
If non-Roman domain names become popular, will I still be able to access them, or will they disappear behind untypeable URLs? A search engine may be able to mitigate this problem somewhat, but ATM I sometimes get search results for Japanese-language pages only because my search term is present in the URL.
1: yes, a site can still be useful in this case and no, despite the stereotype it's not just for porn.
What's with the stupid dates - eg 7 March 2007 on that site?!
You'd think they'd use the ISO 2007.Mar.7
As far as I know, Japanese URLs have been working and in use for quite some time. I've visited several myself. Mind you, I'm surprised anyone in the anglophone sphere takes notice.
He who lights his taper at mine, receives light without darkening me.
Couldn't these linguistically heterogeneous domain spaces still be universally linked through romanization? I see one possible solution: an intermediary DNS conversion server; i.e. type "[those were supposed to be Japanese kanji].co.jp" and your DNS request is treated the same as "rakuten.co.jp". Beyond the inability to rake in tons of money for new registrations, what might be the disadvantages of such a system?
Your mind is clear / The things that you fear / Will fade with how much you / Believe what you hear
Now we're just letting the terrorists win! They'll hide behind their exotic non-ASCII URL names, hold secret forum meetings, etc., and there is nothing the USA can do to see them! Hopefully the NSA will get special training ("Okay. Hold down ALT. Now press these numbers on the numeric pad...")
And there are quite a few solutions to it. One of them (I think this is the one we're talking about) is converting the characters to ASCII and serializing them. Quite simple, let the browser do it.
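For the curious, here is a minimal sketch of what "convert to ASCII and let the client do it" can look like, using the "idna" codec from Python's standard library. It encodes each label with Punycode and adds the "xn--" prefix. The helper names to_ascii and to_unicode are just for this example, and the codec follows the older IDNA 2003 rules, so treat it as an illustration rather than a statement of what the test in the article used.

```python
def to_ascii(hostname):
    """Encode a Unicode hostname into its ASCII ("xn--") form, label by label."""
    return hostname.encode("idna").decode("ascii")


def to_unicode(hostname):
    """Decode an ASCII ("xn--") hostname back into Unicode."""
    return hostname.encode("ascii").decode("idna")


ascii_name = to_ascii("caf\u00e9.com")
print(ascii_name)              # xn--caf-dma.com
print(to_unicode(ascii_name))  # café.com
print(to_ascii("example.com")) # plain ASCII names pass through unchanged
```

The DNS itself only ever sees the ASCII form, which is why existing servers don't need to change.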
Custom electronics and digital signage for your business: www.evcircuits.com
So it's about non-ASCII top level domains, not just non-ASCII domains, i.e. . instead of
http://pi.cr.yp.to/
As a side note, it's interesting that Slashdot says this link is at cr.yp.to.
so how am i, on my gb keyboard, supposed to conveniently type in all sorts of foreign characters?
if there is going to be some traditional ASCII alternative url.. then just what are we doing?
i am all for versatility, but there is always talk about unification, this would just segregate the web into 'things i can type' and 'things i can't'
and considering that html is in american, and that most people take into account that english is a very common language when designing a page, are we not just creating some novelty, which after a while will annoy all but a few?
of course, dns is only a convenience anyway, we could solve all this and all start memorising ip addresses, especially when IPv6 should soon be in play. XD
Prov 9:8 Do not rebuke mockers or they will hate you; rebuke the wise and they will love you.
Once again, committees lag behind actual problems and actual solutions.
Now if you'll excuse me I'll go back to browsing
(I seem to recall that
Whence? Hence. Whither? Thither.
If ASCII was good enough for the Apostles Peter and Paul then it ought to be good enough for everyone.
You are in a maze of twisty little passages, all alike.
Okay, I'll bite. I have what I think amounts to a fairly good, if basic, understanding of how internationalized character sets and encodings work, but I don't understand how you'd encode multiple character sets into one URL.
I mean, first of all, in order to use non-Latin characters at all, you have to have some way of transmitting which character set / codepage you want to use. I can't find any place in TFA where they actually describe how this is going to work (although I didn't read the PDF, so perhaps it's in there), but my assumption was that it would be transmitted outside the actual stream of bytes that represent the URL.
So, a "URL block" might consist of some metadata about the URL that's going to be transmitted -- e.g., what character set it's written with, etc. -- and then the stream of bytes that actually represent the address. Doing it that way would by definition only allow one character set per URL, because there's no way of changing it mid-stream.
Allowing people to change character sets in the middle of the address, so as to have an address where one part was written in ASCII or Latin-1, then another byte or two in UTF-8, and then the remainder in Latin again, would hugely complicate the standard from both an implementation and a use perspective.
As long as all the alternative (that is, alternative to ASCII) encodings include within them a minimalist Latin charset, enough so that you can type the ".com" and other TLDs, then there doesn't seem to be any reason to allow mixed-charset URLs.
"Ladies and gentlemen, my killbot features Lotus Notes and a machine gun. It is the finest available."
Workaround
English is not my native language (as if you didn't notice), but I still think this is not exactly the best idea ever; I actually think it is pretty bad... Phishing has been named, but it also seems like a huge overcomplication. Most sites (e.g. YouTube) already get by with totally cryptic URLs, so I don't really think this is a problem at all.
Copyright infringement is "piracy" in the same way DRM is "consumer rape"
Actually, as the abstract of the paper correctly states, it's about non-ascii characters in TLDs. International characters already exist in the domain names, as some posters have pointed out.
In this article they applied the same encoding used for domain names to TLDs, and they noticed it works fine. So to summarize, it's not about miçrósoft.com, it's about microsoft.çóm . That's much more fun!
The day that goes on-line I'll be able to filter scads of spam simply by refusing to resolve international domain names. Woot!
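A filter like that could be as small as the sketch below. It assumes internationalized labels show up either in their ASCII-encoded "xn--" form or as raw non-ASCII text; is_idn is a made-up name for this example, not part of any mail or DNS package.

```python
def is_idn(hostname):
    """True if any label is an "xn--" (ASCII-encoded) label or contains non-ASCII text."""
    labels = hostname.lower().rstrip(".").split(".")
    return any(
        label.startswith("xn--") or not label.isascii()
        for label in labels
    )


for host in ["example.com", "xn--caf-dma.com", "example.xn--p1ai"]:
    verdict = "refuse" if is_idn(host) else "resolve"
    print(host, "->", verdict)
```

Whether throwing away every such name is wise is another matter, since it also drops legitimate non-English sites.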
Given the number of passwords that the average person who does a lot of stuff online needs to remember, unless they're doing something hideously insecure already (like using the same password everywhere), they can probably only sign on from a single computer anyway, because that's where their passwords are stored or written down.
The problem of certificate management is, IMO, actually more tractable than the problem of password management. There are lots of ways that you could allow people to move certificates around, if you really wanted to; you could issue USB sticks or smartcards that they could jack in to public machines (although preferably you'd create some method that never actually let the unsecure machine 'see' the certificate itself; you'd just do some sort of challenge/response with the USB key or smartcard).
Passwords really aren't all that convenient; if you're using passwords properly (not reusing the same ones in multiple places), and you're not using a crutch like iterative generation, or just writing the things down (which basically makes it a very insecure "analog certificate"), you're probably way out on the tail-end of the bell curve of what a normal person can remember. Passwords are only "user friendly" because the way that most people use them is hideously insecure.
"Ladies and gentlemen, my killbot features Lotus Notes and a machine gun. It is the finest available."
How will they translate "www."?
How then are English visitors supposed to visit one of these sites? Purely by links?
:), and not one of these poncy half wars, where we stop fighting before we've won and 'peacekeep' for the next century. At least then there would only be the conquering language to speak.
Although I can kinda see the point, I can't see how this will work...all I can see is the internet fragmenting, which seems to be against the whole spirit of things!
For those that don't see why someone who can't read the language would want to visit the site...the reason is simple: pictures tell a thousand words. Secondly, technology and science are often language independent, so the specs on a Korean phone site are useful.
What we really need is a massive all-out war
----- I refuse to have an argument with an unarmed person