Spoofing URLs With Unicode

Our Task is Obvious by donnacha · 2002-05-27 13:53 · Score: 4, Funny

So, what would be the Cyrillic for Slashdot.org?

old trick by krokodil · 2002-05-27 13:53 · Score: 3, Interesting

It is widely used on Russian-language IRC networks like RusNet. http://www.irc.net.ru/

Done in DOS a long time ago by aoihai · 2002-05-27 13:56 · Score: 4, Interesting

Anyone else remember using alt+255 and other special characters to make hard to open directories (idiot proof anyway) on shared command line systems?

--
You were eaten by a grue.

Re:Done in DOS a long time ago by Hollinger · 2002-05-27 14:39 · Score: 2

There was something from way back when that would effectively "lock" a folder in Explorer in Win95 and make it damn hard to open in DOS. I think you had to go to a prompt and tack an Alt+255 or something onto the leading or trailing edge of the folder name.

--
Michael C. Hollinger
Re:Done in DOS a long time ago by chabotc · 2002-05-27 15:31 · Score: 3

Tabs, spaces, dots and even backspaces... *sigh* those were the days when directories could still be confusing, hidden and full of illegal software.. On second thought, nothing changed really, except that now nothing on the net is hidden anymore ;-)
Re:Done in DOS a long time ago by achurch · 2002-05-27 16:30 · Score: 2

Yes! I remember making a whole tree of directories like that on my parents' 286 back when I was a kid, to keep my stuff hidden (I very conveniently ignored the fact that my dad had about 15 years of computer experience even then, because I was obviously being so clever). In order to remember where it was myself, I used the digits of pi, so the top directory had 3 Alt-255's, the next one had 1, the next 4, and so on. For a fifth grader I sure was proud of myself--my very own passworded directory, that no one could possibly guess at!
. . .
Sure is a good thing that computer isn't still hanging around . . .
Re:Done in DOS a long time ago by GutBomb · 2002-05-27 19:18 · Score: 2

Boot from a Linux floppy, mount the hard drive, and delete it from there.
Re:Done in DOS a long time ago by Jugalator · 2002-05-27 20:22 · Score: 2

You mean this one: ""

Ha! Still possible in Windows 2000. That's so much MS when you think about it. :)

--
Beware: In C++, your friends can see your privates!
Re:Done in DOS a long time ago by LadyLucky · 2002-05-27 20:30 · Score: 2

That's brilliance. One hopes you turned out ok, not some broken, hollow, shell of a man(?).

--
dominionrd.blogspot.com - Restaurants on
Re:Done in DOS a long time ago by Dahan · 2002-05-27 20:35 · Score: 2

The directory's probably not "com", since there's nothing wrong at all with that as a file or directory name. Maybe you mean "com1" or "con"? In any case, those names are reserved, since they're DOS device names--you can't name a directory that. However, you can make "com1 " (trailing space), which Explorer doesn't know how to handle.
Anyways, an easy way to get rid of directories with leading/trailing spaces in their names is to open a command prompt and do "dir /x" to show the short names, then "rd /s shortn~1"

Re:I fail to see by NanoGator · 2002-05-27 13:56 · Score: 2

It's important because at least with yahii.com you know that there's something wrong with the address, ie the typo.

But with this unicode spoof, then you could go to a site you think is legitimate, and you'd have no way of knowing it's not.

--
"Derp de derp."

I gave m1cr0s0ft.com my credit card number!!!! by Anonymous Coward · 2002-05-27 13:57 · Score: 4, Funny

Should I be concerned?

Re:I gave m1cr0s0ft.com my credit card number!!!! by Roosey · 2002-05-27 14:07 · Score: 2, Funny

No, not at all. In fact, it's probably more secure there than with Passport. :]
Re:I gave m1cr0s0ft.com my credit card number!!!! by NoMoreNicksLeft · 2002-05-27 14:52 · Score: 2

No. Thieves are a lot less likely to steal from you than M$, and if they do, it will still be for less.
Re:I gave m1cr0s0ft.com my credit card number!!!! by Have+Blue · 2002-05-27 15:17 · Score: 2

Don't worry, I'll take good care of it.
Re:I gave m1cr0s0ft.com my credit card number!!!! by Dr.+Awktagon · 2002-05-27 15:56 · Score: 4, Insightful
Whew, good thing you caught it in time! Don't worry, the credit card companies can take care of it, no worries, just enter your name, credit card number, social security number, and mother's maiden name at each of the following URLs:
- AMERlCANEXPRESS.COM
- ClTlBANK.COM
- FlRSTUSA.COM
- DlSCOVERCARDS.COM
(Those all use "ell" instead of "eye" when possible.. they look exactly the same with my fonts.. Since there are already "homographs" in plain ASCII, and plus Javascript mouseovers can be used to change the browser status area, and plus many people don't even fully understand the difference between "microsoft.com" and "microsoft.evil.com", this Unicode trick is nothing to worry (more) about!)
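
The plain-ASCII look-alikes above can be told apart by code point even when the font renders them identically. A minimal Python sketch, assuming the real name is CITIBANK.COM and the fake swaps in a lowercase "ell"; the helper names `describe` and `differing_positions` are made up for illustration:

```python
import unicodedata

def describe(ch):
    """Return (character, code point, Unicode name) for one character."""
    return (ch, f"U+{ord(ch):04X}", unicodedata.name(ch, "UNKNOWN"))

def differing_positions(a, b):
    """Indexes at which two equal-length strings differ."""
    return [i for i, (x, y) in enumerate(zip(a, b)) if x != y]

real = "CITIBANK.COM"
fake = "ClTlBANK.COM"  # lowercase "ell" in place of capital "eye"

# Show exactly which characters were swapped, and for what.
for i in differing_positions(real, fake):
    print(i, describe(real[i]), "vs", describe(fake[i]))
```

The same comparison works unchanged for the Cyrillic look-alikes, since it only looks at code points and never at how the glyphs are drawn.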
Re:I gave m1cr0s0ft.com my credit card number!!!! by Tet · 2002-05-28 00:36 · Score: 2

Javascript mouseovers can be used to change the browser status area
Not in my browser they can't. Edit -> Preferences -> Advanced -> Scripts & Windows -> Allow webpages to change status bar text.

--
"The invisible and the non-existent look very much alike." -- Delos B. McKown

Workaround by neolazer · 2002-05-27 13:57 · Score: 3, Insightful

What is InterNic and such doing in the meantime to help prevent spoofs such as this? The Legal ramifications of this are interesting. One could also post stories with false links, that most people would never even realize weren't true.

Re:Workaround by wo1verin3 · 2002-05-27 15:13 · Score: 2

It doesn't take that much effort to fool people.

I took a call from a customer who got Windows with his pc and wanted a refund because he was going to use linux.... (i think he's smart)

While we're talking about my recent install of XFree86 4.2.0 he mentions MS is taking over Linux now with MS Linux and believes it's real. (i realize he's dumb)

I spent the required time to guide him to the domain whois to show him this page isn't owned by Microsoft. He thought the quotes on the page are real....

sigh, anyway it's not hard to fool people

WHY THIS IS IMPORTANT by Anonymous Coward · 2002-05-27 14:05 · Score: 5, Informative

people seem to be missing the point in this thread. Here is why this is very important.

When you pay money, say with paypal.com, you always want to check the URL. Of course someone could have a fake link like: "click here to pay with paypal" and then redirect you to their bogus site with the intention of stealing your passwords. But it would be fairly obvious from the location bar in the browser that the URL was not paypal.com. But if unicode can be used to spoof the location bar then it will rope in even cautious users.
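
One way a browser or a cautious user could catch this is to flag any hostname label that mixes scripts. A rough Python sketch, assuming the first word of each character's Unicode name is a good enough stand-in for its script (a simplification, not a real script database); `scripts_in` and `has_mixed_label` are hypothetical helpers:

```python
import unicodedata

def scripts_in(label):
    """Approximate the set of scripts used by the letters in one label.

    Assumption: the first word of the Unicode character name
    ("LATIN", "CYRILLIC", "GREEK", ...) identifies the script.
    """
    return {
        unicodedata.name(ch, "UNKNOWN").split()[0]
        for ch in label
        if ch.isalpha()
    }

def has_mixed_label(hostname):
    """True if any dot-separated label mixes more than one script."""
    return any(len(scripts_in(label)) > 1 for label in hostname.split("."))

genuine = "microsoft.com"
spoofed = "micr\u043es\u043eft.com"  # both "o"s are CYRILLIC SMALL LETTER O

print(has_mixed_label(genuine))
print(has_mixed_label(spoofed))
```

This does nothing against a name written entirely in one look-alike script, so it is only a partial defence.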

Re:WHY THIS IS IMPORTANT by Compuser · 2002-05-27 16:53 · Score: 2

I wish browsers showed IPs as well as URLs in the location box. Can I get Mozilla to do this?
Re:WHY THIS IS IMPORTANT by cybermage · 2002-05-27 18:30 · Score: 2

But if unicode can be used to spoof the location bar then it will rope in even cautious users

Ewww.....

I wonder if unicode is even supposed to be allowed in domain names. If not, maybe this will prompt Microsoft, Mozilla, and the like to error when the host/domain contains unicode.

Until then, I guess I'll retype the host/domain of any site I intend to log into before doing so. What a pain.

If unicode is valid, heaven help us. With registrars charging as little as $8 for a domain, you know they aren't checking these things.

You're my hero, AC.

--
Some people have a way with words, and some people, um, thingy.
Re:WHY THIS IS IMPORTANT by RussGarrett · 2002-05-27 20:21 · Score: 2

Unicode IS valid in domains now, so the Russians and the Chinese, etc. can have domains in their own character set, instead of having to live with Latin characters.
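
For a feel of what such a name looks like on the wire, here is a small Python sketch using the interpreter's built-in `idna` codec, which converts a Unicode hostname to its ASCII-compatible form. Whether a given registrar or browser applies the same encoding is an assumption; `to_ascii` is a made-up helper name:

```python
def to_ascii(hostname):
    """Encode a Unicode hostname into its ASCII-compatible DNS form."""
    return hostname.encode("idna").decode("ascii")

# A plain ASCII name passes through unchanged.
print(to_ascii("slashdot.org"))

# A well-known test name with a non-ASCII letter.
print(to_ascii("b\u00fccher.example"))  # xn--bcher-kva.example

# The spoofed name with Cyrillic "o"s becomes an obviously different label.
print(to_ascii("micr\u043es\u043eft.com"))
```

Showing the encoded form in the location bar would make the spoof visible, at the cost of making legitimate non-Latin names unreadable.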
Re:WHY THIS IS IMPORTANT by Permission+Denied · 2002-05-27 23:05 · Score: 2

I wonder if unicode is even supposed to be allowed in domain names.
Absolutely. click here for a demonstration. I haven't yet used a browser that could handle the links on that page. I share your disgust.
Re:WHY THIS IS IMPORTANT by GMontag451 · 2002-05-28 03:40 · Score: 2

Wouldn't work for me. I'm using Opera for Mac. I have to say, Opera for Mac has got to be one of the shittiest browsers I've ever used. The UI is so slow that half the time I make spelling errors when typing URLs because I can't see what I'm typing until I stop typing and wait for Opera to catch up.
The only reason I use it at all is because my favorite browser, iCab, doesn't display nested comments on Slashdot correctly for some strange reason. They show up unindented.
Re:WHY THIS IS IMPORTANT by Compuser · 2002-05-28 10:58 · Score: 2

No, I'd just like this feature for myself.

I would have thought it wasn't a problem except... by SwellJoe · 2002-05-27 14:05 · Score: 4, Informative

I recently received an email from a confused user who had received an email that appeared to be from Apple, and was selling Apple products using Apple logos, Apple website concepts and images, etc., but was not from Apple. He didn't sign up for the list, and though it appeared to be a legitimate Apple affiliate as far as I could tell (though perhaps one that used somewhat shaky methods to reach customers), he was confused why Apple was sending him email that he didn't ask for. It was his belief that the mail had actually come from Apple, because it looked like it was from Apple.

Non-nerds have proven to be extremely difficult to educate on the concept that "what email claims to be is not always what email is, and where it claims to come from is not always where it really came from". During the recent Klez outbreak, I even received a message from a nerd-friend saying that he thought my machine might be infected, because he received an infected message from "me". Of course it was spoofed, because I happen to be in a lot of people's address books, but since I haven't used Windows on the desktop in over three years, it clearly didn't actually originate with my box.

Folks are just kinda thick about questioning the veracity of claims (hell, astrology still sells books and 900-number phone calls). And this could definitely be used for nasty purposes...and certainly will. Spammers will have a field day with this, because they can't help but seem 'fly by night' because they cannot establish a real brand name due to the disgusting nature of their business. If they stand still, they'll get lynched. But if they can, even for a short time, hijack a real name that people trust, and offer up a too-good-to-be-true scam under that trusted name...well, you see where I'm going with this.

Of course, everyone here knows that unsolicited "business offers" by email are always scams run by filthy people...but my grandmother doesn't know it, nor do my parents or many of my non-nerd friends for that matter.

Just a thought. We'll see how it plays out, I reckon...

Re:I fail to see by Gaccm · 2002-05-27 14:07 · Score: 2, Insightful

There is a key failure. If someone tried to copy and paste the text into the URL and they weren't using the trick language, it wouldn't work. However if there is a link that says "microsoft.com" then that could send you to a different page. And as everyone knows, people are much more likely to click a link than to copy & paste it in the address bar.

--

Only dead fish swim with the stream...

Unicode Environments by saveth · 2002-05-27 14:10 · Score: 4, Insightful

I develop applications for a DSP company, and we've recently switched to using Unicode in our products. Unicode certainly has its quirks, and this is one of the more obvious ones. I fail to see why it has been implemented so widely, without very, very rigorous testing.

Actions like the one described in this article could bring down a company, if a person tried hard enough. Of course, Microsoft could just call Verisign and ask them to remove the Cyrillic domain, with no problems. But, for a small company, it could be hell. An entire user group using the same character set to access a certain website would be sent to a different site. In a worst case scenario, anti-company propaganda might be posted on the spoofing site, and it would deter people from visiting the "real" site in the future.

The only solution I can imagine is to simply prevent the translation of characters among character sets, especially in this sort of environment.

A Russian site, such as The Moscow Times, could have its site spoofed in exactly the same manner, and everyone using the Cyrillic character set (obviously, widely used in Russia, for example) would be sent to some other site, possibly indefinitely, knowing how registrars have been acting lately. This would create havoc for the newspaper and significantly hurt revenue.

Re:Unicode Environments by bani · 2002-05-27 15:20 · Score: 2

"we've recently switched to using Unicode in our products."

...why?
Re:Unicode Environments by saveth · 2002-05-27 15:49 · Score: 2

"we've recently switched to using Unicode in our products."

...why?

It started out with a weirdness in Windows 2000 we had to work around, and it involved using the Win32 API TCHAR data type, so that it could compile on both Unicode-enabled systems and ANSI character systems.

To make a long story short, we were forced to enable Unicode in one of our products; then, we thought it a good idea to have all our products capable of internationalised data.

Yeah. That. :P
Re:Unicode Environments by dvdeug · 2002-05-27 17:02 · Score: 2

Unicode certainly has its quirks, and this is one of the more obvious ones. I fail to see why it has been implemented so widely, without very, very rigorous testing.

It was tested; this is considered acceptable, as there are no workarounds.

There will be look-alikes in Unicode, just like there are in ASCII. Prior character sets, including KOI8-R, ISO-8859-5, ISO-8859-7, and JIS X0213 - pretty much every character set with either Cyrillic or Greek in it - have the Cyrillic or the Greek A separate from the Latin A. Besides backward compatibility, proper multilingualization calls for them to be kept separate; what would the lowercase A look like if the Greek and Latin A were merged?
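
The lowercase question is easy to check. A short Python sketch (the `CAPITAL_A` table and `lowercase_forms` helper are illustrative names) that lowercases the three look-alike capitals and reports what each one becomes:

```python
import unicodedata

# Three capitals that most fonts draw identically.
CAPITAL_A = {
    "Latin": "\u0041",
    "Greek": "\u0391",
    "Cyrillic": "\u0410",
}

def lowercase_forms():
    """Map each script to the Unicode name of its lowercased capital A."""
    return {
        script: unicodedata.name(ch.lower())
        for script, ch in CAPITAL_A.items()
    }

for script, name in lowercase_forms().items():
    print(f"{script:8} -> {name}")
```

The Greek capital lowercases to alpha, which looks nothing like a Latin "a", so a merged code point could not have a single correct lowercase form.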

Comment removed by account_deleted · 2002-05-27 14:12 · Score: 2, Flamebait

Comment removed based on user account deletion

Re:Terminology whine by os2fan · 2002-05-27 14:17 · Score: 3, Insightful

Russian Cyrillic is not redundant. The other languages that use cyrillic letters have different letters, (eg Ukrainian has an "I", instead of the back-to-front N), and some of the Russian letters are uniquely Russian.

--
OS/2 - because choice is a terrible thing to waste.

DNS was, and is, an ugly kludge by Sanity · 2002-05-27 14:17 · Score: 4, Interesting

Amazing how many comments betray the fact that people haven't read the article.

At the moment these unicode domain names will not be displayed correctly by web-browsers, rather you will see a bunch of confusing control codes, so this threat isn't really a problem yet.

Of course, the underlying problem is that DNS is an ugly kludge which has long-outgrown itself. The administrative cost of constructing a massive global namespace is vast, and we can all see the opportunities for cyber-squatting it creates, to the detriment of the public interest.

These days I am more likely to go to Google and type in a few words, rather than try to guess the URL. The task of finding the website you are interested in should be left to the specialists (like Google and other search engines), we shouldn't try to maintain an ugly, broken, monopolistic, and expensive "first come first serve" architecture like DNS.

There is no good reason why a web user should ever need to see a URL (except perhaps momentum), any more than they need to see the HTML which makes up a document.

Re:DNS was, and is, an ugly kludge by Mandelbrute · 2002-05-27 14:54 · Score: 2

Of course, the underlying problem is that DNS is an ugly kludge
Will IPv6 use DNS or something different?
Obviously with IPv7 we'll just have to ask lain to send us to the right site.
Re:DNS was, and is, an ugly kludge by Sanity · 2002-05-27 15:12 · Score: 2

Oh, and we should instead rely on a search engine scheme, where a company may never get the users that are searching for it, because of a million idiots (Sadly, they turn out to be non-idiots more often than idiots. My apologies) ranting about XYZ Inc. ?
If there is a demand for a service which locates the authoritative websites of corporations, then capitalism will provide. This is a lame argument specific to the way Google happens to work.
First off, it's laziness on the part of morons like yourself, that lust after AOL keywords and are pissed that the internet doesn't bend itself to fit your warped little design philosophies.
Actually, it is insulting wannabe-elitist morons like you that are ruining Slashdot - but that is a different argument. If you think my design philosophy is flawed, why don't you explain why rather than wasting my time with Ad Hominem (look it up) attacks.
Secondly, not everything is the web. Not even close. DNS and domain names aren't about identifying your lousy porn site, they are about identifying a particular host. Done well though (which isn't the case), it's pretty decent at getting you within a few clicks of where you want to be.
What about the cyber-squatting, cost, and creation of private monopolies? DNS is an ugly ugly solution to the problem of finding IP addresses.
Thirdly, how the fuck do you expect to ever type in the first URL, google.com or whatnot, if it's hidden from you on your brand new Dell? I can see the horror that would be inevitable in such a scheme. microsoft-search.com as a nice little button on the toolbar, that never ever brings up a link to click on for google or yahoo, no matter how you phrase the keywords.
Market forces will create a demand for comprehensive search-engines which aren't biased, in fact, they already have.
Finally, the problem is the fact that the vast majority of ISP's view their customers as users of content that they provide, rather than participants in the first, and largest, p2p network ever devised. At best, you'll receive a lousy homepage with no ftp, cgibin, or any other goodies, and a lousy url like http://www.smalltown-isp.net/users/~dumbfart/.
What the hell are you ranting about? This has nothing to do with whether your ISP supports cgi.
Sen. Hollings wants to know why there isn't enough compelling content to drive demand for broadband?
Are you just ranting mindlessly or did you actually have a point?
Re:DNS was, and is, an ugly kludge by Sanity · 2002-05-27 15:14 · Score: 2

This is one of the things that "RealNames" was trying to fix/exploit. Of course it has since gone out of business.
The main problem with developing a replacement for the DNS function/service, is getting everyone to agree on how the service will be provided and operate.

Not really, for example, Google is a great replacement for DNS, not functionally equivalent, but basically does the same thing but much more effectively.
Re:DNS was, and is, an ugly kludge by great+throwdini · 2002-05-27 15:37 · Score: 2

For example, Google is a great replacement for DNS, not functionally equivalent, but basically does the same thing but much more effectively.

No, not really. You seem to support replacing one 'middleman' with another [DNS -> Google]. Google is good and all as a search engine, but I don't really understand why you think indexing existing pages and ranking content based on some scheme (a la Google) somehow improves upon rational DNS entries or eliminates the risk of underhanded manipulation of whatever system is in use.

Yes, there are flaws with most DNS naming schemes. No, it's not perfect. I'm skeptical of your claims that a Google-ish system somehow fixes things for everyone. It certainly would eliminate the ability to go directly to a known resource without there first being a 'search' ... and a resource that was 'known' to be within the first n results might not be a day, week, or month later. Maybe I'm missing something, but where's the progress there?

Meanwhile, hosts would be identified how? Simply by numeric address? I think earlier in-thread comments need to be emphasized: DNS is for naming hosts, not Web sites.

Much as I favor Google for blind Web searches, it really hasn't attained a level of perfection as a search engine alone and nothing else. I'm at a loss to grasp what insight you possess into the workings of Google and the nature of the Web as a whole to believe -- as you must -- that there is complete certainty/symmetry in what is 'out there' on the Web and what Google presents to users.

Google certainly doesn't function in a way analogous to DNS, and I don't think that you should claim that something "not functionally equivalent" could be seen fairly as a "great replacement."

Think about it.
Re:DNS was, and is, an ugly kludge by NoMoreNicksLeft · 2002-05-27 15:50 · Score: 3, Insightful

If there is a demand for a service which locates the authoritative websites of corporations, then capitalism will provide. This is a lame argument specific to the way Google happens to work.

If there is a demand for something that we already have at this time, for free and with no effort? In other words, you would like it if I paid for something I already get now for free... well, if you can't find a good business model, why not create an artificial one?

What about the cyber-squatting, cost, and creation of private monopolies? DNS is an ugly ugly solution to the problem of finding IP addresses.

Cyber-squatting is simple. Outlaw domain parking, domain transfers, false advertising (which is what registering www.books.com and pointing it at a porn site is), and enforce trademarks. If you want a domain, then use it. Use it for something other than pointing yet another name at your lame web site. Only allow registrations and de-registrations... if someone wants to try and sell the domain and someone else wants to pay money for it fine. But they don't get it, it just goes back into the unregistered pool. And if someone has a valid trademark (microsoft is valid, computers.com isn't) by all means give it back to the trademark holder. Duh. DNS is pretty handy for finding IP's, actually. It just isn't as good at making websurfing as effortless as you'd prefer. Or for keeping people from being assholes and polluting the namespace, I should add.

Market forces will create a demand for comprehensive search-engines which aren't biased, in fact, they already have.

Dumbass. On a fresh install of the browser of your choice (or lack thereof), you can't get everywhere you want to go only by clicking links. If the url field is hidden or disabled, which you advocate, you'll be reduced to clicking a toolbar button or a pre-loaded bookmark. I'm sure one such will be a search engine... but with M$ can you count on its integrity?

What the hell are you ranting about? This has nothing to do with whether your ISP supports cgi.

So sorry, I thought you might have the ability to understand non-monosyllabic words. Let me try again...

I-S-P bad. No like us have nice web names. Must use bad homepage **DAMN* ... It can't be done.

I'm tired, so I'll try to make this clearer. If users are only ever allowed to use crappy homepage webspace, of course half the URL's on the net will be long and ugly. I also failed to mention that many commercial sites have bad web design... this accounts for the other half being ugly.

And if I got off on a rant, so what? I see someone like you talk out of your ass, I become a little bit upset. Well, guess what? If you want to add another protocol, pick a port number and get to work. I won't stop you. But stop ranting yourself about how the current ones are ugly, when you have no clue why they are even like they are.

DNS isn't broken, and it isn't ugly. As a protocol, it is highly distributable, robust, and solves the IP-human readable name problem as well as anything that has ever been published. It is the foundation of many protocols and services available on the internet, only one of which is the web. We don't need a separate, incompatible system for the web, and you've offered nothing that would suffice for anything but that, and even then only poorly.
Re:DNS was, and is, an ugly kludge by Permission+Denied · 2002-05-27 15:59 · Score: 2
There are potentially two uses for domain names:
1. Guessing a company's domain. Eg, I want to see if there are any recalls on my Pontiac Grand Am, so I type in pontiac.com. As you've noted, this usage of domain names is pointless - instead, I'll go to google and type in "pontiac grand am recalls."
2. Recalling a domain name. I go to a cyber café and want to check slashdot. I type in slashdot.org into the browser and I read slashdot.
Now, I agree that use (1) is dead. However, I don't want to have to remember 64.28.67.150 to read slashdot, nor do I want to be dependent on google to find slashdot. Think of the pontiac example, where I'm looking for a specific page: google rankings change, but domain names change less often. If google decides they don't like the American Communist Party, I may have a hard time finding their website without DNS, whereas google does not control the cpusa.org domain name.
There are also other, less obvious, uses for DNS. For example, I can type in ftp11.freebsd.org and see if that's faster than ftp6.freebsd.org, without having to search for the FreeBSD mirrors page. You can also publish spammers' IP addresses to DNS tables, like what RBL does. That means when I write my MTA, I don't need a full HTTP engine in it along with an XML/SGML/HTML/WHATEVERML parser, but I can just do a simple "gethostbyname()" and see if that returns an error. There are lots of other creative abuses for DNS.
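
The blacklist lookup really is about that small. A minimal Python sketch, assuming the common DNS blacklist convention of reversing the IPv4 octets and appending the list's zone; the zone `dnsbl.example.org` is a placeholder, not a real service, and `dnsbl_query_name` and `is_listed` are made-up helper names:

```python
import ipaddress
import socket

def dnsbl_query_name(ip, zone):
    """Build the hostname to look up: reversed IPv4 octets plus the zone."""
    octets = str(ipaddress.IPv4Address(ip)).split(".")
    return ".".join(reversed(octets)) + "." + zone

def is_listed(ip, zone):
    """Return True if the blacklist zone has an entry for this address.

    A listed address resolves; an unlisted one makes the lookup fail,
    which is the "returns an error" case.
    """
    try:
        socket.gethostbyname(dnsbl_query_name(ip, zone))
        return True
    except OSError:
        return False

# Only builds the name; no network traffic happens here.
print(dnsbl_query_name("192.0.2.99", "dnsbl.example.org"))
```

An MTA would call something like `is_listed(peer_ip, zone)` once per incoming connection, with `zone` set to whichever list the administrator trusts.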
Anyway, I think there's still a real need for DNS; however, DNS administration leads to so many politics...this article mentions a technical problem, but the real problems are social/political. These problems are much harder to solve.
Re:DNS was, and is, an ugly kludge by NoMoreNicksLeft · 2002-05-27 16:12 · Score: 2

Generic example. Idiot.
Re:DNS was, and is, an ugly kludge by NoMoreNicksLeft · 2002-05-27 16:47 · Score: 2

If you bothered to log in, why not post as such, so they could see this wasn't just a lame attempt by me to make it look like I had some support? Besides, I garbled half my arguments... been up 26 hours straight at this point, not exactly articulate. Maybe not even coherent.
Re:DNS was, and is, an ugly kludge by sparcv9 · 2002-05-27 17:00 · Score: 2

Will IPv6 use DNS or something different?
IPv6 won't use DNS any more than IPv4 uses DNS. In other words, neither IPv4 nor IPv6 "use" DNS at all. DNS is just a single mechanism for resolving hostnames to IP addresses, and vice-versa. I think what you may have meant to ask was if DNS will be used to resolve IPv6 addresses/hosts, and the answer is, at least on the Internet, yes. The RFCs for DNS have included IPv6 record types (type AAAA) for a long time, and most DNS servers support them. However, anyone is still free to use DNS, NIS/NIS+ or even /etc/hosts (or any other name-resolution service you can think of) on their own networks. Just don't expect the world to be able to see it.
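
To make the "vice-versa" direction concrete, here is a small Python sketch using the standard `ipaddress` module to build the names a resolver would query for a reverse lookup. The addresses come from documentation-only ranges, and `reverse_name` and `forward_record_type` are hypothetical helper names:

```python
import ipaddress

def reverse_name(ip):
    """Return the DNS name queried to map an address back to a hostname."""
    return ipaddress.ip_address(ip).reverse_pointer

def forward_record_type(ip):
    """Record type that carries this kind of address in a forward lookup."""
    return "AAAA" if ipaddress.ip_address(ip).version == 6 else "A"

for addr in ("192.0.2.1", "2001:db8::1"):
    print(forward_record_type(addr), addr, "->", reverse_name(addr))
```

The IPv6 reverse name is just the 32 hex digits of the address in reverse order, so the same DNS machinery handles both address families.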

--

This is not a Fugazi .sig
Re:DNS was, and is, an ugly kludge by dvdeug · 2002-05-27 19:18 · Score: 2

[...]pretty-much done all of the things that would allow me to be as arrogant as you try to be, that you haven't.

And at the grand old age of 25 (according to your webpage), too.
Re:DNS was, and is, an ugly kludge by NoMoreNicksLeft · 2002-05-27 19:35 · Score: 2

He's also incredibly humble, if you haven't noticed.
Re:DNS was, and is, an ugly kludge by NoMoreNicksLeft · 2002-05-27 19:50 · Score: 2

Not to belittle Google, which is the best damn search engine ever, but it's not the answer to the non-problem you're trying to invent.

Your opinion itself speaks volumes about what you know and understand. You're the clueless suit touring the factory, wondering why the steam pipes aren't chromed.

No one has ever charged me for typing in "ibm.com". My use of DNS is either free, or depending on how you look at it, the cost is rolled into my ISP subscription. They're not going to give me a refund if somehow stooges like you gut DNS.

DNS doesn't solve your "URL's are ugly" problem, because 1) it isn't DNS's problem to solve, 2) it is largely the result of either bad web design or bad ISP policy and 3) only idiots are complaining about this.

The expense of the domain name squabbling solutions I outlined is either already being paid for (we have a system to settle trademark disputes), or could be done relatively inexpensively with what amount to a few shell scripts (do the parked domains all bounce to a single site). All that would have to happen is for ICANN to pull its collective ass, and make some sensible policy. Simple as that.

Oh, and one last thing. How do you expect google to index websites, if there is no DNS? Are we all supposed to go back to using IP's? Do we embed those in the hyperlinks instead of domain names?
Re:DNS was, and is, an ugly kludge by PatientZero · 2002-05-27 23:32 · Score: 2

No worries, you've more than covered belligerent. ;)
Your arguments make sense so far, but it took a lot of restraint on my part not to dismiss you entirely due to your attitude. You'll convince far more people in the future if you drop the name-calling.

--
Freedom to fear. Freedom from thought. Freedom to kill.
I guess the War on Terror really is about freedom!
Re:DNS was, and is, an ugly kludge by Just+Some+Guy · 2002-05-28 04:12 · Score: 2

Of course, the underlying problem is that DNS is an ugly kludge which has long-outgrown itself.

This blatantly ignorant post got modded up?

Search engines only map content to hosts, and DNS provides the "last mile" mapping of hosts to addresses. What would you do when your IP changes if DNS no longer exists? My server has moved at least 4 times in the last three years, but thanks to DNS, all content is still available at the exact URL it's always been at. What's your workaround? Telling Google about every page I host every time I switch providers, move, or upgrade my connectivity?

Second, what will you use to replace MX record functionality? Or are you also advocating replacing SMTP with something new?

No, DNS is still pretty damn handy. You may not like it, but those of us working in non-ideal worlds can still find a use or two for it.

--
Dewey, what part of this looks like authorities should be involved?
Re:DNS was, and is, an ugly kludge by HiThere · 2002-05-28 06:04 · Score: 2

There is no good reason why a web user should ever need to see a URL ...

When I see some of the URL's, I almost agree with you, e.g.:
...&op=Reply&threshold=-1&commentsort=3&tid = . .

But normal URLs are an important concept in learning to use the web, and hiding it from people does them a disservice. Those who want to can easily ignore them (I usually do), but they can be a crucial part of the learning experience. So people need to have the exposure to them, so they know that they exist, and have the chance to learn what they mean.

It is, in a way, cognate to your example of HTML (though at an earlier part of the learning curve). If the view option removed the ability to observe the HTML of web pages, then learning to code HTML would be much more difficult. And learning URLs is right at the beginning of the process. By hiding them, you would be doing a gross disservice to everyone who is starting to learn how to use the web.

--

I think we've pushed this "anyone can grow up to be president" thing too far.
Re:DNS was, and is, an ugly kludge by Sanity · 2002-05-28 18:32 · Score: 2

Ever wondered why books of names (the kind that expecting parents buy) are so popular? To put it into computing jargon: we like to give our children 'meaningful identifiers'. So it is with computers, not least because system administrators in charge of large sites would have a hard time remembering all the individual IP addresses.
The problem is not where DNS is being used as some form of UID, but where DNS is being used as a tool to allow people to locate information relevant to a particular concept, such as "www.dictionary.com" etc. That was never DNS's intended purpose. It is the fact that DNS is being used well beyond its initial intended purpose that makes it a kludge.
It is a testament to the inventors that they came up with such an expandable, distributed, hierarchical system.
Perhaps your head has been buried in the sand, but haven't you been paying attention to the problems created by the Internet's centralized architecture? (Do a search for ICANN here on slashdot for some good pointers.)
Re:DNS was, and is, an ugly kludge by Sanity · 2002-05-28 18:35 · Score: 2

Your opinion itself speaks volumes about what you know and understand. You're the clueless suit touring the factory, wondering why the steam pipes aren't chromed.
Don't be a dick, you have no idea what I know and understand, but if you did, you would be feeling rather stupid right now. I have been a qualified software engineer for years, on slashdot for a lot longer than you (judging by your user number), so please don't pull that patronising crap on me.
Re:DNS was, and is, an ugly kludge by Sanity · 2002-05-28 18:44 · Score: 2

He's also incredibly humble, if you haven't noticed.
Hey, you are the one who started with the accusations that I am some sort of newbie, unworthy of debating with your holiness, yet even by your own admission (your journal) you are a poor programmer. I simply pointed out the stupidity of taking an elitist attitude with someone who, for all you know, could be a well respected software developer in the open source community.

Easy by devphil · 2002-05-27 14:20 · Score: 2

If you're serious about typing in Russian, you don't type the control-meta-alt-whacky sequences.

You spend $15 and buy a plastic keyboard overlay, one of those little flexible jobs with the alternate characters printed on them. Change your keymapping -- they make keymap files to match the popular overlay's plastic sheets, I'm told -- and you're done.

--
You cannot apply a technological solution to a sociological problem. (Edwards' Law)

Re:Terminology whine by RelliK · 2002-05-27 14:21 · Score: 4, Insightful

The Cyrillic alphabet was developed a long time ago by a religious man (guess what his name was), because the Russian peoples he was trying to convert had no written alphabet

That is false. Russian people had an alphabet long before Cyrillic. Incidentally, that should really be proto-Russian, or Eastern Slavic since the people diverged into Russian, Ukrainian, and Belorussian much later.

So it could be said that "Russian Cyrillic" is redundant.

It is not. There are several "dialects" of the Cyrillic alphabet. They are mostly the same but a few letters are different. I already mentioned three of them above. There's also Bulgarian, Serbian, and I'm not sure what else.

I seriously doubt the "c" and "o" characters mentioned in the article are unique to the K018R charset

The charset is called KOI8-R. Or are you using the l33t sp3lling?

--
___
If you think big enough, you'll never have to do it.

Re:The site by Servo5678 · 2002-05-27 14:23 · Score: 3, Funny

Hey, that URL is infringing on my copyrights! It's similar to my business's name, Bq--at77w373jih7xepx7om7p6zx7oq Enterprises, Inc.

Lousy cybersquatters...

i know you're being funny, but... by Anonymous Coward · 2002-05-27 14:29 · Score: 5, Interesting

I believe it would be something along the lines of .

Re:i know you're being funny, but... by Anonymous Coward · 2002-05-27 15:36 · Score: 2, Insightful

you'd need an accent mark over the 'o' to keep it from sounding like an 'a', fyi. other than that, you're absolutely correct.
Re:i know you're being funny, but... by arivanov · 2002-05-27 19:53 · Score: 2

That is in Russian. In Bulgarian and Serbian you do not need it. The stress is on the other syllable, so it sounds correct.

The spoofable letters are a and o.
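
A quick way to see the difference is to ask for the code points. This is only a sketch: the look-alike host name below is a made-up example, and the Cyrillic letters are written as escapes so they cannot be confused with the Latin ones in the source.

```python
import unicodedata

# Latin letters and their Cyrillic look-alikes (different code points).
pairs = [("a", "\u0430"), ("o", "\u043e")]

for latin, cyrillic in pairs:
    print(f"U+{ord(latin):04X} {unicodedata.name(latin)}")
    print(f"U+{ord(cyrillic):04X} {unicodedata.name(cyrillic)}")

# Hypothetical host names: they look alike in many fonts but are
# different strings as far as any comparison or lookup is concerned.
real = "slashdot.org"
fake = "sl\u0430shd\u043et.org"
same = real == fake
print("identical:", same)
```

The two names have the same length and may render the same, yet they never compare equal.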

--
Baker's Law: Misery no longer loves company. Nowadays it insists on it
http://www.sigsegv.cx/

Re:I would have thought it wasn't a problem except by SwellJoe · 2002-05-27 14:30 · Score: 2

No offense intended, but if your associates aren't smart enough to distinguish between a scam and a legitimate e-mail, then you need to let them get burnt a few times. Either that or get them off the Internet.

Yep, you're right. Let's make all the grandmothers stay in their rocking chairs where they belong. The internet is for young, savvy nerds. Knitting is for old people.

Seriously, I understand your perspective, and it isn't as though I'm suggesting legislation or something stupid like that (I'm anti-government on all issues)...I'm just saying I think people will get scammed using this method. And I think it may be damaging to legitimate companies as well. This is unfortunate on two counts...it is bad for my grandmother, and yours, and it is bad for honest businesses who would never use spam marketing or pull some kind of bait-and-switch, or just plain ol' scam.

That's all...I don't have solutions. I'm just griping about the problem. Isn't that what slashdot is for, hand-wringing and griping?

Re:Terminology whine by tulare · 2002-05-27 14:34 · Score: 2

That is false. Russian people had an alphabet long before Cyrillic. Incidentally, that should really be proto-Russian, or Eastern Slavic since the people diverged into Russian, Ukrainian, and Belorussian much later.

Fair 'nough. The good bishop simply wanted a written language that he understood, so that he could teach his religion. So the creation of the Cyrillic alphabet is a matter of convenience for the religious powers-that-be of the time. Not a new story, unfortunately. And your point about proto-Russian is well-taken.

It is not. There are several "dialects" of the Cyrillic alphabet. They are mostly the same but a few letters are different. I already mentioned three of them above. There's also Bulgarian, Serbian, and I'm not sure what else.

While in the broadest sense, you are right (I have a great story outside the context of this article on a miscommunication on my part with a Ukrainian individual who I mistakenly thought was speaking Russian), in the context of my point about those two specific characters, I disagree. Again, a Unicode geek could prove me wrong.

The charset is called KOI8-R. Or are you using the l33t sp3lling?

Lol, heh. You are right on there. I was just dashing off a reply to the article, and wasn't paying enough attention to the niceties. l33t sp311ing was farthest from my mind, b3 a55ur3d.

--
political_news.c: warning: comparison is always true due to limited range of data type

Re:It shouldn't really be a problem. by GigsVT · 2002-05-27 14:34 · Score: 4, Informative

Most people just blindly click OK, because it is usually OK.

A lot of small e-business sites want to use their hosting provider's cert, but don't want the user's browser to display the hosting company's domain rather than their own. (Yes I know it's stupid, people are picky as fuck when you are making web pages).

Anyway, that causes the browser to warn that the cert is not valid for the domain it is being used in.

It's kinda possible to get around this using frames, but then the browser might say something about mixed secure and unsecure items on a page. The only real way to do it right is to just let the users see the hosting provider's address, as far as I know, or have the site buy their own cert.

--
I've had enough abrasive sigs. Kittens are cute and fuzzy.

Re:Terminology whine by VP · 2002-05-27 14:35 · Score: 2, Offtopic

St. Cyril developed the Glagolitic alphabet, based on the Slavic dialects spoken on the Balkan peninsula, and used it in translating the Christian holy scriptures for the Slavic tribes in Moravia (today's Hungary/Slovakia). His student, St. Clement, developed the improved Cyrillic alphabet and spread its use in Bulgaria, from where it was adopted by Russia, Serbia, and others...

Today there are several variants of Cyrillic - Bulgarian, Serbian, Macedonian, Russian, Ukrainian, and it was used even in some of the former Soviet republics and Mongolia, whose languages are very far from Slavic.

Also, KOI8 is not considered the Cyrillic code set by other Cyrillic-using nations; it is rather considered the Russian Cyrillic code set. Other code sets are Windows-1251 and ISO-8859-5. The latter would arguably be the standard Cyrillic code set.
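
As a rough illustration of why the choice of code set matters, the sketch below encodes one Cyrillic letter with Python's built-in codecs for the three code sets named above (`koi8_r`, `cp1251`, `iso8859_5` are the Python codec names, not anything from this thread).

```python
# The same Cyrillic letter gets a different byte value in each code set.
letter = "\u0430"  # CYRILLIC SMALL LETTER A

encoded = {
    name: letter.encode(name)
    for name in ("koi8_r", "cp1251", "iso8859_5")
}
for name, raw in encoded.items():
    print(name, raw.hex())

# Decoding with the wrong code set does not fail; it silently
# produces a different letter.
misread = letter.encode("koi8_r").decode("cp1251")
print(repr(misread))
```

Text tagged with the wrong code set therefore shows up as plausible-looking but wrong Cyrillic rather than as an error.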

Are international domain names even necessary? by ukryule · 2002-05-27 14:36 · Score: 4, Insightful

From the article:

But are international domain names even necessary? Kuhn, who is German, doesn't think so: "Familiarity with the ASCII repertoire and basic proficiency in entering these ASCII characters on any keyboard are the very first steps in computer literacy worldwide."

That's like saying basic numeracy is the first step for computer literacy worldwide, so we should go back to using IP addresses!

Currently email addresses and URLs are the only reason a native Chinese speaker needs to use ASCII. For someone from Germany, ASCII is pretty easy to handle, but for a lot of languages, Unicode URLs & email addresses are very necessary ...

Re:Are international domain names even necessary? by Val314 · 2002-05-27 19:45 · Score: 2, Interesting

but how can you (or an average user) send an email to say müller@müller.de using an english keyboard?

i think we should stick to ASCII
Re:Are international domain names even necessary? by dvdeug · 2002-05-27 20:21 · Score: 2

but how can you (or an average user) send an email to say müller@müller.de using an english keyboard?

Yes, with the right ALT- combinations. Or you can cut and paste.

Sure, one might want to think twice about using müller@müller.de as an email address if you want to communicate with people who won't find that easy to type from their keyboard. But I don't think that's a choice that should be imposed from the outside.
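
For the domain half of such an address, one way around the keyboard problem is an ASCII-compatible form that software can convert back. The sketch below uses Python's built-in `idna` codec, which follows the IDNA rules standardized after this discussion; that choice is an assumption for illustration, and it says nothing about the mailbox name before the @.

```python
# Encode the host part of the address to an ASCII-compatible form.
host = "m\u00fcller.de"           # "müller.de"
ace = host.encode("idna")         # bytes containing only ASCII
print(ace)

# The mapping is reversible, so software can show the Unicode form again.
round_trip = ace.decode("idna")
print(round_trip)
```

Anyone with a plain ASCII keyboard can type the encoded form, while users with the right keyboard keep the native spelling.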
Re:Are international domain names even necessary? by plumby · 2002-05-27 20:58 · Score: 4, Insightful

What if the Internet had started in China? Would you be happy to learn the Chinese alphabet in order to enter URLs?
Re:Are international domain names even necessary? by dvdeug · 2002-05-28 03:02 · Score: 2

if the Internet had started in China (which would have been absolutely bizarre)

What's so bizarre about the idea? It's an alternate history idea, and there will of course be different events (I'd make the divergence China a republic sometime in the 50 years before 1930 . . .)

they probably would have used ascii characters, or some set of latin characters.

If your alternate history still calls for the predominance of US computing and ASCII/ISO-646, then they probably would have used some ASCII based system. Otherwise I see no reason for that; if an alphabet was needed, they could have used Bopomofo (Chinese phonetic characters), or Cyrillic or Katakana.

if they used Chinese symbols, then it wouldn't have gone outside the country until a different format came about.

Depends on the economic importance of the Chinese net. If Chinese was the language of computing, there wouldn't be a problem. Even if it wasn't, it's possible that people would have put up with it and added ad-hoc changes to make it work with their languages.

Latin characters are easier to recognize and use than Chinese symbols (there are only 26 of them!). And just about every computer-aware culture is familiar with these 26 letters (since before the Internet).

But Chinese characters are more concise than Roman characters. In any case, everything you've said applies with equal power to Cyrillic; a Sino-Russian web would probably have used Cyrillic as a base.
Re:Are international domain names even necessary? by BlueGecko · 2002-05-28 03:11 · Score: 2

No, but I'm sure we would. The dominant culture at the time of a field's inception tends to define that particular field. Have you noticed that biology is still heavily Latin, that music is heavily Italian, that a number of math symbols are French (natural log as ln being the best example), and that martial arts are almost all Asian languages? In those fields, Americans must familiarize themselves at least with the foreign terminology, if not with the languages themselves, and I'm sure that if the Chinese had developed the Internet that we would add whatever Chinese words to our language that we had to in order to use the technology.

As for the alphabet itself: I think that the Chinese use the Roman alphabet when they need to write a word phonetically, and at any rate know that there is a specific way to write Chinese sounds in roman. Regardless, however, I suspect that we would probably simply familiarize ourselves with their alphabet, or, better yet, nonsensically map our characters onto theirs, so that an "m" was this glyph and a "y" was that one, thus allowing us to use the system but with our words. In other words, do exactly what we are doing, only the other way around...
Re:Are international domain names even necessary? by Dephex+Twin · 2002-05-28 04:05 · Score: 2

It's an alternate history idea, and there will of course be different events (I'd make the divergence China a republic sometime in the 50 years before 1930 . . .)

Well, I wasn't really considering going back that far in history. What I thought would be bizarre would be if other events happened along the same lines, except China got an equivalent to the Internet going first.

If your alternate history still calls for the predominance of US computing and ASCII/ISO-646, then they probably would have used some ASCII based system.

I was simply thinking of Latin-based dominance in terms of international communication.

But Chinese characters are more concise than Roman characters.

Certainly not in terms of the number of characters!

In any case, everything you've said applies with equal power to Cyrillic; a Sino-Russian web would probably have used Cyrillic as a base.

Or Korea could have used its letters, yes. Or what if German had been made the national language of the United States? What if? Again, when responding to the comment about "what if China had created the Internet", I didn't know it was fair game to also imply "what if China became a Republic in the 1800s", "what if Russia/China/etc. had pioneered computing" and so on.

In response to Cyrillic and such, I went by the notion of "what language(s)/alphabet(s) are/were most widespread during the development of the Internet?" In the early 1900s, one might say that French was the "international language", and nowadays it is English-- both are Latin-based alphabets.

Yes, English dominated computing, as well as the Internet, but don't you think part of the reason for that could be because just about everyone was familiar in some way with these characters? On the other hand, if all of Western Europe and the Americas had to start learning katakana, cyrillic or Korean letters (let alone Chinese(!), which is what I was responding to), I think they would have rather developed their own way of handling the Internet, keeping the Chinese(or whatever) "Internet" confined within China.

mark

--

If you want to make an apple pie from scratch, you must first create the universe. -- Carl Sagan
Re:Are international domain names even necessary? by Dephex+Twin · 2002-05-28 05:11 · Score: 2

Reducing the symbol set is not always making things easier. Even if a binary dump of text uses only 2 symbols, it is much harder to read.

Hey, that's a slippery-slope argument.

My point was not even readability anyway. It was being able to tell things apart and use them. Think about this situation. If I speak Japanese as my native language, I probably see English characters often, and am at least familiar with the 26 letters (and probably have studied a Latin-based language at some point). And this holds true before the Internet's popularity. Also, I probably have some known way to easily type Roman characters into my keyboard. The Internet comes around, I just go along with these English characters without much extra learning, although it can be a bit of a pain.

Now take Chinese as the language of the Internet. Say I speak English/Russian/Swedish/Portuguese/whatever as a native language. It's likely that I can't tell Chinese from Japanese (or even Korean for some people), and it's also likely I can't even write a single Chinese character (maybe I memorized one or two for fun). Imagine the undertaking required to even tell characters apart, regardless of meaning. There would have to be a fundamental restructuring of schooling in the West just to get people acquainted enough with the language to use it for web addresses, email, etc. I mean, people wouldn't even be able to transcribe an email address. More likely would be the creation of a Latin-based Internet in the West.

This is the problem that causes it to be very unlikely to have ever been universally adopted in the Internet.

I don't think one language is better than another, but it's just a fact that Chinese (as well as Japanese) takes much longer to learn than Latin-based languages and is much less widespread geographically.

mark

--

If you want to make an apple pie from scratch, you must first create the universe. -- Carl Sagan
Re:Are international domain names even necessary? by dvdeug · 2002-05-28 06:07 · Score: 2

Certainly not in terms of the number of characters!

But in terms of number of characters per text or sq. cm. of paper. The "What is Unicode" page is smaller byte-size in Chinese than any other language, IIRC, even with each Chinese character taking 3 bytes and most Latin characters taking 1.
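
The per-character byte counts are easy to check, assuming UTF-8 is the encoding meant here. The sample strings are arbitrary picks for illustration, and whether a whole page comes out smaller depends on the text, which this sketch does not try to measure.

```python
# UTF-8 byte counts: ASCII letters take 1 byte each, while common
# Chinese characters (in the Basic Multilingual Plane) take 3 each.
latin = "Unicode"
chinese = "\u7edf\u4e00\u7801"    # three Chinese characters

for text in (latin, chinese):
    raw = text.encode("utf-8")
    print(len(text), "chars ->", len(raw), "bytes")
```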

Again, when responding to the comment about "what if China had created the Internet", I didn't know it was fair game to also imply "what if China became a Republic in the 1800s", "what if Russia/China/etc. had pioneered computing" and so on.

Any scenario where China created the Internet naturally presupposes major changes in what China was, as China was in no position to build a large network of computers at the time the Internet was created.

On the other hand, if all of Western Europe and the Americas had to start learning katakana, cyrillic or Korean letters (let alone Chinese(!), which is what I was responding to), I think they would have rather developed their own way of handling the Internet, keeping the Chinese(or whatever) "Internet" confined within China.

Had the shoe been on the other foot, and April 1st, 1983 carried a message about WashVAX instead of KremVAX, I think they would have happily jumped on board whatever was in place. What self-respecting geek is going to let a language - much less a writing system - get in the way of getting hooked up to the Internet?
Re:Are international domain names even necessary? by Dephex+Twin · 2002-05-28 07:43 · Score: 2

But in terms of number of characters per text or sq. cm. of paper. The "What is Unicode" page is smaller byte-size in Chinese than any other language, IIRC, even with each Chinese character taking 3 bytes and most Latin characters taking 1.

I know that this is what you meant, but I wasn't talking about complexity in terms of size on paper or byte-wise. I'm sure you are correct on that. But the complexity of learning the language is far greater. And that complexity is the crux of my viewpoint.

What self-respecting geek is going to let a language - much less a writing system - get in the way of getting hooked up to the Internet?

Maybe a few geeks would learn Chinese in order to get on the Chinese Internet. Just like some really hardcore gamers study up on Japanese because Japan has great games that don't come out in the US. Geeks make up a *very* small minority, especially ones that learn a language like this. How long was the Internet around in some form to computer geeks before it was really a "thing"? I'd say 96-97 was when the Internet really took off (although it was building throughout the 90s). It would have just stagnated like this if one had to learn Chinese to even see what was going on. Do you know how much more time and effort it takes to learn (even basic) Chinese compared to learning (even fairly advanced) Perl? It is for this reason that Western geeks would learn protocols and technology to build their own Internet rather than study languages for years!

Here's the big difference I was trying to explain before. Lots of people in China and Japan already know English or are in some way familiar with Roman characters. They see and use them every day. When the Internet was coming about, even if they didn't speak English, they could at least see more than incomprehensible gibberish (of course it isn't really this at all, but it is to most Westerners, honestly).

How would the Chinese Internet ever take off in the West, if the West would not even be able to recognize a single character, of which there are over 10,000 (5,000 common)?

Yes, people would begin to learn Chinese, *if* the Internet was a big thing. But the Internet wouldn't be big enough to warrant years of study if only Chinese people used it. So you have a catch-22. At this point in time, instead of people saying "we want to be part of China's Internet, so let's all learn Chinese!" (which requires huge changes in social and educational structures), I think it is far more likely that people would say "we want an Internet too, let's make one (and connect it with China?)"

This is why I think a Chinese Internet that is a world standard almost could not exist.

Unless you go back and completely change the political and economic structures of the entire world for the past few hundred years. But then there would be so many changed factors that it's almost pointless to speculate. I mean, being expansionist goes against the very nature of China, which has always been isolationist (sometimes to great extremes). Would the USA even exist? Would widespread use of Chinese cause a total reform of the language (I feel it would)?

Although Chinese may not be harder for the native Chinese speaker than English is to the native English speaker, I still believe it is much easier and more productive for a Chinese speaker to "dabble" in English than it is for an English speaker to "dabble" in Chinese. Much higher learning curve.

mark

--

If you want to make an apple pie from scratch, you must first create the universe. -- Carl Sagan
Re:Are international domain names even necessary? by ukryule · 2002-05-28 14:14 · Score: 2

if the Internet had started in China (which would have been absolutely bizarre)

What's so bizarre about the idea? It's an alternate history idea, and there will of course be different events (I'd make the divergence China a republic sometime in the 50 years before 1930 . . .)

Well of course the Republic of China was formed in 1912, so that fits in with your timescale :-). So perhaps your divergence is that the communist revolution failed.

But if the internet had been Chinese based, then the character set would have been multi-byte based, and so would almost certainly have included Latin characters along with the Chinese characters, and Chinese phonetic characters. Thus it would have been much easier for a Westerner to use, as they just use the Latin subset and ignore the rest. The development of something akin to unicode would have been fundamental, not an afterthought.

What we're doing now, trying to expand from a small (1-byte) set of characters is much more complicated ...
Re:Are international domain names even necessary? by dvdeug · 2002-05-28 16:25 · Score: 2

Well of course the Republic of China was formed in 1912, so that fits in with your timescale :-).

Oops. Asian history isn't really my forte . . .

So perhaps your divergence is that the communist revolution failed.

I'd think it be more important that whatever problems let it get started disappear.

The development of something akin to unicode would have been fundamental, not an afterthought.

One of my personal annoyances, and something I don't entirely understand, is that China, Taiwan and Japan all had multibyte character sets since at least 1980, but none of them really tried to be multilingual; the impetus behind Unicode was largely American. (To be fair, America had multibit character sets since the 50's, but expanding the number of bits to include everyone didn't really occur to us until the mid-80s.) Looking at history, a multibyte character set would have been used, but I don't think something like Unicode would have appeared until the world just plain shrunk enough and people got tired of dealing with dozens of different character sets everywhere.
Re:Are international domain names even necessary? by plumby · 2002-05-28 19:11 · Score: 2

Think about this situation. If I speak Japanese as my native language, I probably see English characters often, and am at least familiar with the 26 letters (and probably have studied a Latin-based language at some point).

Japanese, probably. Chinese, not necessarily. Most highly educated Chinese will have studied English (or some other Western language), but there are still vast amounts of Chinese people who have never seen a Western alphabet (or at least any more than you have seen Chinese on a Chinese menu etc). The Chinese equivalent of Mom & Dad USA usually speaks not one word of English.

More likely would be the creation of a Latin-based Internet in the West.

Which is almost exactly the route that is being taken by creating Unicode URLs.

but it's just a fact that Chinese (as well as Japanese) takes much longer to learn

If you want an easier alphabet, would you expect everyone in the US to learn Korean (24 letters, my wife picked up the basics in a week or so)?

but it's just a fact that Chinese (as well as Japanese) takes much longer to learn than Latin-based languages and is much less widespread geographically.

Geographically, possibly; population-wise, Chinese is spoken as a first (and usually only) language by almost 1/4 of the world's population. When you take into account other non-Western languages (Cyrillic, Japanese, Korean, Arabic, etc), it's probably getting on for (if not more than) half the world's population.
Re:Are international domain names even necessary? by plumby · 2002-05-28 19:25 · Score: 2

No, but I'm sure we would. The dominant culture at the time of a field's inception tends to define that particular field. Have you noticed that biology is still heavily Latin, that music is heavily Italian,

This is true, for people who work within that field. I suspect that your average person in the street wouldn't know what adagio meant. Also a better comparison is actually Greek symbols in maths, as they are in a different alphabet. I suspect that most people would recognise the pi symbol, but few of them (unless they are mathematicians) would recognise, or be prepared to learn, any of the others. This use of other languages/symbols in particular fields is almost universally seen by outsiders as a mechanism to stop them understanding that particular field. And if you had to learn a different alphabet/set of symbols to get on the internet, this would be a huge barrier to the average person ever figuring out how to use it. I suspect that if we had to use ip addresses instead of urls to access sites, my parents (and most people like them) would never be able to get anywhere on the internet, and ip addresses are at least made up of symbols that they can read.
Re:Are international domain names even necessary? by ukryule · 2002-05-28 20:04 · Score: 2

One of my personal annoyances, and something I don't entirely understand, is that China, Taiwan and Japan all had multibyte character sets since at least 1980, but none of them really tried to be multilingual;

I think the reason for this is the same reason that ASCII doesn't cover the French/German/Scandiwegian extra characters despite it being easy to do. Any country is only going to standardise on their own character set until there's a reason to be multilingual. That reason is the internet: You don't see other languages until the internet takes off, by which time you've already standardised on the protocols it uses (DNS,HTTP,SMTP et al) - and it's too late!

So whatever language the internet was developed in, it would need internationalisation after it has become successful.
Re:Are international domain names even necessary? by Dephex+Twin · 2002-05-29 02:54 · Score: 2

but there are still vast amounts of Chinese people who have never seen a Western alphabet (or at least any more than you have seen Chinese on a Chinese menu etc). The Chinese equivalent of Mom & Dad USA usually speaks not one word of English.

That might be true, although it is a lot less likely that the poorer and more rural areas would have much exposure to computers or the Internet. And I would say a lot more Chinese are reasonably familiar with the Roman letters than non-Chinese are with Chinese characters.

More likely would be the creation of a Latin-based Internet in the West.

Which is almost exactly the route that is being taken by creating Unicode URLs.

I'm not arguing that. I'm only saying if China had developed an Internet before the Internet as we know it existed, it would not have gotten as widespread as the "English" Internet has throughout the world.

If you want an easier alphabet, would you expect everyone in the US to learn Korean (24 letters, my wife picked up the basics in a week or so)?

Well, I happen to have studied Korean myself, and it isn't very difficult to learn. However, Korean has some slight disadvantages. The first is that it is mostly spoken only by Koreans (either within the country or elsewhere in the world). So, while Roman character languages are already widespread, 99% of those who would already be familiar with Roman characters would now have to learn Korean characters. The other problem with Korean is that it is a more complicated matter to display. Roman characters go left to right, and the letters do not need to cross or connect (except maybe oe, ae...). Korean groups each syllable together into a single "symbol" where the letters within that symbol can connect right to left and/or up to down.

For the sake of having just 2 less letters, I don't think these problems make Korean a simpler or more convenient option for anyone, except Koreans of course.

I'm not trying to do a nickel-and-dime comparison like that (Hawaiian has only 12 letters, but it's not twice as easy as Korean or something). But 26 (English) compared to 5-10,000 (Chinese)-- that is a significant difference and I see meaning in that.

mark

--

If you want to make an apple pie from scratch, you must first create the universe. -- Carl Sagan
Re:Are international domain names even necessary? by plumby · 2002-05-29 03:24 · Score: 2

That might be true, although it is a lot less likely that the poorer and more rural areas would have much exposure to computers or the Internet.

To some extent true, but I have at least one Chinese friend whose parents speak not one word of any language except Chinese, but who keep in contact with him via email.

There will always be people that the Internet won't reach, in the same way that many people have never used a phone, but seeing the way it has taken off amongst people like my parents (the first time my father ever used a mouse was 6 months ago, but he's now addicted to the Internet), and the way new technology like mobile phones have taken off in China (and not just amongst the 'educated'), it's not unreasonable to expect that there is a good likelihood that usage could start to pick up amongst the 'average' Chinese in the near future, and having to learn another alphabet in order to use it would be a barrier to this.

For the sake of having just 2 less letters, I don't think these problems make Korean a simpler or more convenient option for anyone, except Koreans of course.

The point about Korean was not a suggestion that it becomes the default language for the internet, just that if it had been the language that needed to be used to access anything, then this would be a barrier to the takeup of the Internet by average people in the West, and most people would not be prepared to learn it, even though it is not that complex, just for this purpose.
Re:Are international domain names even necessary? by Dephex+Twin · 2002-05-29 03:54 · Score: 2

The point about Korean was not a suggestion that it becomes the default language for the internet, just that if it had been the language that needed to be used to access anything, then this would be a barrier to the takeup of the Internet by average people in the West, and most people would not be prepared to learn it, even though it is not that complex, just for this purpose.

Oh, so you are kind of agreeing with me but saying that even an easy language like Korean would be a barrier if it was a Korean Internet? Yes, this might be true.

However, I did originally say that English worked not because it is easy to learn, but because a lot of people already knew it or were regularly exposed to it by the time the Internet came about. And there is a good chance that those who don't know English know some Roman-based language (this doesn't cover everybody of course).

So it is this advantage of being widespread and similar to other widespread languages that makes it IMO easier to adopt in other countries than Korean. That is to say, if Roman-based languages weren't already so widespread and Korean dialects were, and a Korean Internet came about, it would have no more difficulty in the West.

Anyway, this is getting very far from reality. But you have an interesting point about Korean.

mark

--

If you want to make an apple pie from scratch, you must first create the universe. -- Carl Sagan

IDNC3 by Russ+Nelson · 2002-05-27 14:48 · Score: 5, Informative

Dan Bernstein has a proposal for internationalized domain names which solves this problem and many other problems. It's called IDNC3. IDN stands for ``internationalized domain name.'' C3 stands for ``clean, careful, conservative.''

--
Don't piss off The Angry Economist

Re:Lunix saves the day! by John+Hasler · 2002-05-27 14:49 · Score: 2

"...this "superior" Lunix operating system's complete lack of Unicode support..."

Try Linux. It's had Unicode for years.

--
Warning: this article may contain humor, sarcasm, parody, and perhaps even irony. Read at your own risk.

Who needs a paper... this is irrelevant by wadetemp · 2002-05-27 14:52 · Score: 4, Informative

1) Some people are not good at spelling, and wouldn't know microsoft.com from microssoft.com, especially if it's just seen in a few quick glances.

2) There are more TLDs out now, and the same name at a .biz or .info TLD does not mean it is the same company... but no doubt a lot of people think that's true.

3) There's always the old numeral "1" swapped for the lowercase "L" or the uppercase "I" trick, among other similar things that never involved Unicode, but rather human vision and high resolutions.

4) The "@" symbol in the URL trick, like http:\\microsoft.com\moneyfrombil@haxor.com?action=allyourmoneyarebelongtous

So if you haven't figured out my point yet, a good percentage of people that use the internet are going to be fooled by far simpler feats of social engineering. Who needs Unicode to do it?
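For what it's worth, the "@" trick from point 4 can be sketched with Python's standard urllib.parse. Everything before the "@" in the host part of a URL is treated as user info, and the real host is whatever follows it. The host names below are made up for illustration, and how a given browser displays such a link is not something this sketch tries to show.

```python
from urllib.parse import urlsplit

# Hypothetical spoofed link: the text before "@" is only user info,
# so a client actually connects to the host that comes after it.
spoofed = "http://www.microsoft.com@haxor.example/login?action=update"

parts = urlsplit(spoofed)
print(parts.username)  # www.microsoft.com
print(parts.hostname)  # haxor.example
```

Note that the "@" has to appear before the first "/" after the scheme for this to work; once the path has started, it is just another character.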

Re:Who needs a paper... this is irrelevant by weinerdog · 2002-05-28 02:59 · Score: 2

3) There's always the old numeral "1" swapped for the lowercase "L" or the uppercase "I" trick, among other similar things that never involved Unicode, but rather human vision and high resolutions.

Since browsers (and possibly Outlook and other HTML-enabled mail manglers) support downloadable fonts, it shouldn't be all that tricky to just create a custom font with glyphs in the 'wrong' position. microsoft.com might be taken, but I bet that mzcyoqoxp.com is free. You wouldn't have to rely on similarities between glyphs in sans-serif fonts; you could spoof most any address through selective use of downloaded fonts.

This capability has existed for a while.

--
There's no such thing as Scotchtoberfest!

Re:Terminology whine by tulare · 2002-05-27 14:52 · Score: 2

Er, no. Cyril developed the "Cyrillic" alphabet, although your statement of his intentions is correct. (I don't believe he was much of a saint, btw)
I do thank you for the correction on the charsets. I kind of knew that would happen :)

--
political_news.c: warning: comparison is always true due to limited range of data type

Re:Terminology whine by markov_chain · 2002-05-27 14:58 · Score: 2, Informative

Actually, no. Glagolitic was indeed invented by Cyril and Methodius, in the 9th century. I don't know where the previous poster got the St. Clement reference. See here for the character set and a bit of history.

These two also invented cyrillic. The difference is that glagolitic didn't survive very long, while the cyrillic is still in use today. The last country to use glagolitic in any quantity is Croatia, up to the end of the 19th century.

--
Tsunami -- You can't bring a good wave down!

Isn't fraud illegal? by anthony_dipierro · 2002-05-27 15:15 · Score: 2

If you buy something online without using a credit card, you deserve to get scammed.

If you buy something with a credit card, not only will you get your money back (actually never lose it in the first place), but the scammers will likely go to jail.

Besides, why are you clicking on links in your spam anyway?

Re:WHY THIS IS IMPORTANT - It's already been done by JesterOne · 2002-05-27 15:15 · Score: 4, Informative

Even better... I seem to recall a scam that did just that with paypal. They sent out bulk mail about updating your account or something but the link was not paypa(lower case 'L').com but paypa(Capital 'I').com and had made a carbon-copy of paypal's website, hoping you would log in. The address in the location bar looks identical for both. This sounds like the same kind of thing but using Unicode to make the spoof.

Reminds me of my friend by quantaman · 2002-05-27 15:20 · Score: 2

My friend told me that a few years ago he was looking for a domain name to register. After some poking around he discovered that microsoft.net was up for grabs. He then proceeded to go to his dad to ask for the $10-$15 (don't remember the exact amount) he needed to register the domain, needless to say his dad refused!!

--
I stole this Sig

Think of the fun you could have with this! by chabotc · 2002-05-27 15:20 · Score: 3, Funny

Ok, first take microsoft.com (alternate spelling), name your mail gateways identical to microsoft's, and then send out emails (as balmer@microsoft.com?) to a lot of MS employees, telling them to remove IE from XP ..

From there on, it only gets better and better. Think of the countries you would be able to influence, technology development you could steer, and leaked memos you could fabricate..

Damn i wish i had thought of it ;-)

Re:Terminology whine by VP · 2002-05-27 15:28 · Score: 2

Er, no. Cyril developed the "Cyrillic" alphabet, although your statement of his intentions is correct. (I don't believe he was much of a saint, btw)

I am sorry, but you are wrong - please see my other post for some links. Here is another: http://education.yahoo.com/search/be?lb=t&p=url%3Ac/cyrillic_alphabet

IMO, the major contribution of St. Cyril and Methodius is not the creation of an alphabet, but their disputes with the Western church and the Pope regarding the right of the different peoples to learn and practice Christianity in their own language. Up to that point only Latin, Greek and Hebrew were used in church services...

Re:Client-side fix? by bani · 2002-05-27 15:29 · Score: 2

Not necessarily.

Unicode defines character code points, but doesn't specify their appearance.

There's nothing preventing an application from using lame fonts for glyphs, and in fact many do.

On average, unicode implementations vary from bad to utterly horrible.

Different behaviour on different TLDs by ukryule · 2002-05-27 15:41 · Score: 3, Interesting

One way to control this would be to restrict the valid characters based on the TLD.

So for example '.uk'/'.au'/'.us' etc. can ONLY have ASCII 2nd level domains. '.de' can only have German characters, '.fr' only French, and so on ...

Then for completely different character sets, you have new Unicode TLDs (Arabic, Greek, Chinese), which can only have their relevant characters.

I guess you leave .com/.org/.net as ASCII: although they are meant to be global, they are based on the Latin character set.

Of course, this adds complexity - but you can do all the testing for validity when the domain is registered (i.e. a web client can request any URL, but dodgy mixed character set domain names cannot be registered).
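A rough sketch of what that registration-time check might look like, in Python. The per-TLD character sets here are invented for the example and are not any registry's real policy; the function name may_register is hypothetical too.

```python
import string

ASCII_LABEL = set(string.ascii_lowercase + string.digits + "-")

# Hypothetical per-TLD policy: extra letters allowed on top of ASCII.
# These sets are illustrative only, not real registry rules.
EXTRA_BY_TLD = {
    "uk": set(),
    "de": set("äöüß"),
    "fr": set("àâçéèêëîïôùûüÿœæ"),
}

def may_register(domain: str) -> bool:
    """True if the second-level label uses only characters allowed for its TLD."""
    label, _, tld = domain.lower().rpartition(".")
    if not label or tld not in EXTRA_BY_TLD:
        return False
    allowed = ASCII_LABEL | EXTRA_BY_TLD[tld]
    return all(ch in allowed for ch in label)

print(may_register("müller.de"))  # True
print(may_register("müller.uk"))  # False
```

Since the check runs once at registration, clients would not need to know the table at all.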

still encrypted, but... by MenTaLguY · 2002-05-27 15:43 · Score: 2

It's impossible to prove that someone hasn't inserted themselves in between you and the server, giving you a bogus cert, and pretending to be you to the server.

This is the reason for trusted signatures on certs.

Hit google for "man in the middle attack" if you want to know more.

--

DNA just wants to be free...

Re:I would have thought it wasn't a problem except by cabbey · 2002-05-27 15:47 · Score: 2

if your associates aren't smart enough to distinguish between a scam and a legitimate e-mail, then you need to let them get burnt a few times.

This is family we're talking about, not "associates"... you let family get burnt and you're getting fruitcake for Christmas... for life.

Either that or get them off the Internet.

Ah, but then you couldn't get the pictures of the cousin's sister's kids emailed every time they get an award at school. Or the forward of the forward of the quoted forward of the latest monster joke to wander the 'net.

Nothing new by Florian+Weimer · 2002-05-27 15:51 · Score: 2

The same risks exist today with ASCII domain names: transposed letters "1lI", "O0", playing tricks with "@" and most user agents.

You just must not take anything for granted which you see or read on the web.

Re:The site by Indras · 2002-05-27 15:54 · Score: 2, Funny

Yes, but you're forgetting, "Bq--at77w373jih7xepx7om7p6zx7oq" cannot be trademarked, because it is a common word, like "door" and "window."

--
The speed of time is one second per second.

Re:Obligatory observation by BrookHarty · 2002-05-27 16:18 · Score: 2

If a domain needed to be hijacked, that's it.
-
The only way to get rid of a temptation is to yield to it. Resist it, and your soul grows sick with longing for the things it has forbidden to itself. - Oscar Wilde (1854 - 1900)

Re:Obligatory observation by aardvarkjoe · 2002-05-27 16:22 · Score: 2

Well, there goes all the security of being able to find your gay porn when you want it.

--

How can we continue to believe in a just universe and freedom to eat crackers if we have no ale?

Re:The site by PurpleBob · 2002-05-27 16:24 · Score: 2

Domain names starting with bq-- are Unicode domain names mangled back into ASCII, so you were probably in the right place.
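The bq-- prefix belongs to the RACE encoding used in the IDN testbed. As a rough illustration of the same idea, Python's built-in "idna" codec implements the later Punycode-based scheme, which uses an xn-- prefix instead of bq--. The domain below is just an example, not one from the article.

```python
# Round-trip a non-ASCII name through its ASCII-compatible form.
name = "bücher.example"

ascii_form = name.encode("idna")
print(ascii_form)                 # b'xn--bcher-kva.example'
print(ascii_form.decode("idna"))  # bücher.example
```

Either way, DNS itself only ever sees the mangled ASCII string; the Unicode form exists for display.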

--
Win dain a lotica, en vai tu ri silota

Re:cyrillic trivia Re:Terminology whine by os2fan · 2002-05-27 16:34 · Score: 3, Offtopic

I'm aware of all of this. But even in the Soviet empire, there were extra letters. Compare this with the West, where Icelandic still uses thorn and eth. Thorn was used in English before the Latin alphabet arrived, and continued afterwards. Eth (also spelled edh) is a crossed d. Capital thorn looked something like a Y with a vertical left stroke. Hence "Ye Olde Shoppe".

Another English letter to fade is yogh [looks like a 3] - this is the z in Menzies = Men3ies "Menges".

Also of note is digamma. In the Greek number system, this is 6, that is, the 6th letter of the alphabet. As a letter, it appears between epsilon and zeta. Since our alphabet is derived from the Greek, one notes the letter here not only looks like digamma, but preserves much of the original sound: F. Phi was an aspirated p.

Cyrillic bears a much closer resemblance to the classical Greek letters, and the theta indeed represents an f here.

Unicode reflects current realities. There is more than one Cyrillic Alphabet, just as there is more than one Latin alphabet.

--
OS/2 - because choice is a terrible thing to waste.

You gotta love this quote... by ncc74656 · 2002-05-27 16:44 · Score: 2

Certification agencies (which include VeriSign) ensure that encoded names are not misleading and that the registration corresponds with the correct real-world entity.

Yeah, that's why a couple of Israeli college students were unable to register miсrоsoft.com (spelled "miсrоsoft")...oh wait a minute, what were they saying again?

--
20 January 2017: the End of an Error.

Verisign -- the company you can trust! by Corgha · 2002-05-27 16:56 · Score: 3, Interesting

Verisign never ceases to amaze me. The first sentence on their website is:

VeriSign, Inc. (Nasdaq:VRSN) is the leading provider of digital trust services that enable businesses and consumers to engage in commerce and communications with confidence.

... so it seems safe to say that trust is the foundation of their business. Essentially, we trust Verisign to ensure that we're communicating with whom we think we're communicating, and to protect us from various forms of spoofing. They should therefore, IMHO, actively avoid even the appearance of impropriety.

However, we all remember the Microsoft certificates they mistakenly gave out to a third party.

Now we've got them registering another domain to someone that looks just like "microsoft.com." While it's tempting to absolve Verisign of guilt in this, I think they were asking for it. After all, even I thought of this possibility when I first heard about Unicode domain names, and I'm not the sharpest knife in the drawer. You've got to think someone at Verisign raised the possibility, but they chose not to deal with it.

Again, one might be tempted to say that this isn't their problem, if not for the fact that they are in the trust business. As the article says, "Certification agencies (which include VeriSign) ensure that encoded names are not misleading and that the registration corresponds with the correct real-world entity." It should not be technically difficult, for instance, to build a set of lists of visually similar Unicode characters and to refuse to register domains visually identical to existing ones. Maybe they should decide to forgo a relatively small amount of revenue and to refuse to sully their reputation with such inevitably deceptive domain registrations, especially considering that they interfere with Verisign's core business.
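To make the "lists of visually similar characters" idea concrete, here is a minimal Python sketch. The look-alike table is tiny and hand-picked, and the function names skeleton and collides are hypothetical; a registrar would need a far larger table, so treat this as the shape of the check rather than a working policy.

```python
# Tiny, hand-picked table of look-alikes mapped to a canonical Latin letter.
# Illustrative only; a real list would be much longer.
LOOKALIKES = {
    "\u0430": "a",  # Cyrillic a
    "\u0441": "c",  # Cyrillic es
    "\u0435": "e",  # Cyrillic ie
    "\u043e": "o",  # Cyrillic o
    "\u0440": "p",  # Cyrillic er
    "\u0445": "x",  # Cyrillic ha
    "\u0443": "y",  # Cyrillic u
}

def skeleton(domain: str) -> str:
    """Fold every known look-alike to the letter it imitates."""
    return "".join(LOOKALIKES.get(ch, ch) for ch in domain.lower())

def collides(candidate: str, registered: set) -> bool:
    """True if candidate is a different name that looks like a registered one."""
    existing = {skeleton(d): d.lower() for d in registered}
    match = existing.get(skeleton(candidate))
    return match is not None and match != candidate.lower()

taken = {"microsoft.com"}
print(collides("mi\u0441r\u043esoft.com", taken))  # True
print(collides("example.com", taken))              # False
```

The comparison is done on the folded "skeleton", so it does not matter which of the two look-alike names was registered first.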

Of course, none of this compares to the letters they sent out trying to fool people into switching their domains over to Verisign. The other two were negligence and foolishness, but that was an active attempt to deceive from a company that's selling trust.

It all leaves me in a bit of shock. It's not that I'm shocked to see a company doing stupid and deceitful things; it's that trust is Verisign's primary asset. Hearing about these (colossally, in my mind) stupid decisions is like hearing that GM decided to torch all its manufacturing plants and assassinate all its employees. It leaves me with two questions: "what the hell are they thinking?" and "why does anyone continue to do business with Verisign?"

Re:I fail to see by AndyElf · 2002-05-27 17:48 · Score: 3, Interesting

Domain spoofing is one area. But what if you see an email address on a business card, say са@miсrоsоft.com? How do you know what encodings those 'c', 'a' and 'o' are in (for those with UNICODE brain-damaged browsers the address above should look like ca@microsoft.com)? Same goes for URLs, etc. Another option -- say a Swedish company registers a URL that perfectly represents the name of the company in Swedish. With all those umlauts and whatever-they-are-called-those-circles-over-A. And you are sitting there with a US_en keyboard -- how are you expected to type that URL into a location field in your browser?

For the use-cases like this I think that multilingual URLs are a Bad Idea (TM).

--

--AP

Shush, the filter here doesn't know about cekc!!!! by Slashamatic · 2002-05-27 18:00 · Score: 2

We have a http filter here that protects against things such as www.sex.com (being PC, it also stops theOnion and fuckedcompany). They even filter out the fish, in case you use it as a proxy to get non-PC pages.

Unfortunately, it doesn't protect against 'cekc' (I can't be bothered to type this in Cyrillic here).

Also discussed in "Secure Programming" HOWTO by dwheeler · 2002-05-27 18:06 · Score: 2

This issue was also discussed in my book Secure Programming for Linux and Unix HOWTO. Look at the section on semantic attacks.

--
- David A. Wheeler (see my Secure Programming HOWTO)

Re:Terminology whine by RelliK · 2002-05-27 18:09 · Score: 2

Can you perhaps explain why KOI8 characters are out of order? This is so stupid and I'm amazed KOI8 is still in use. How do you sort stuff alphabetically if you can't just do an integer comparison? Would be really slow to use some funky custom sorting routine.

--
___
If you think big enough, you'll never have to do it.

Re:I fail to see by dvdeug · 2002-05-27 18:13 · Score: 2

But what if you see an email address on a business card, say са@miсrоsоft.com? How do you know what encodings those 'c', 'a' and 'o' are in[...]?

Since the surrounding characters are Latin, I think it safe to assume they are 'c', 'a' and 'o'. (BTW: encodings are things like ISO-8859-*, KOI8-R, and so on, whereas IDN will only use Unicode. The question should be what script they are in.)

Same goes for URLs, etc.

You've never been prohibited from using non-ASCII stuff in URLs.

Another option -- say a Swedish company registers a URL that perfectly represents the name of the company in Swedish. With all those umlauts and whatever-they-are-called-those-circles-over-A. And you are sitting there with a US_en keyboard -- how are you expected to type that URL into a location field in your browser?

Depending on your system, you can use ALT- or SHIFT-CTRL- combinations and the character numbers. Character Map or the equivalent will also let you enter the characters.

OTOH, why is this a problem? If they have a large non-Swedish audience, they ought to register an all-ASCII name. If they chose not to, then that's their problem. Odds are any such site will be in Swedish for Swedes.

Think of the enormous lawsuit! by mindstrm · 2002-05-27 18:16 · Score: 2

Just because it's a technical no-brainer doesn't mean it's legal, and doesn't mean it even treads on laws that have anything to do with the internet.

If you pretend to be someone else, or if someone registered an alternate lookalike domain for microsoft.com and used it in any way whatsoever to benefit from the fact.. they'd be in deep sheep.

Paper Online by AstroMage · 2002-05-27 18:20 · Score: 5, Informative

In spite of what the heading says, the original paper is online; you can find it on Evgeniy Gabrilovich's homepage.

That is, if you are interested in the dry, technical details... ;-)

You are mixing things up. by mindstrm · 2002-05-27 18:20 · Score: 2

Verisign's activities as a domain registrar are NOT the same thing as their CA business.

They are not required to, nor do they claim to, verify domain registrants UNLESS those registrants apply for digital certificates.

Yes, verisign are scum.. but you are barking up the wrong tree here. They are not at all required or expected to verify domain registrants.

Hey. I wish they were. Imagine how many domains would have to be revoked? Literally millions.

Re:You are mixing things up. by Corgha · 2002-05-28 03:17 · Score: 2

They are not required to, nor do they claim to, verify domain registrants

I'm not mixing things up. You're misunderstanding me. Where did I say that they are required to verify domain registrants?

I said that maybe they should refuse to register domains that are visually similar to existing domains. That has nothing to do with verification of the identity of the person attempting to register it -- if you're refusing, who cares?

Nor did I try to make the claim that there was some legal requirement for them to do so. While I'm at it, note that I did not say that people should not be allowed to register such domains, only that Verisign should refuse to sully their hands with them.

Yes, verisign are scum.

And the fact that their lack of integrity is without question is what is so weird! Since I've pointed out what my point isn't, here's what it is:

If I were Verisign, I would work very hard to ensure that my integrity was above question, and to that end, I would refuse to facilitate obvious attempts to deceive. I would do this not because of some legal or technical requirement, but because of an ethical one.

Normally, businesses do not have to be ethical, and even with Verisign, ethical behavior is not a legal or technical requirement. However, since unethical behavior makes them less trustworthy, and trust is their primary asset and business (as they themselves say), unethical behaviour should carry a special cost for them.

Verisign's activities as a domain registrar are NOT the same thing as their CA business.

Neither was Caesar's wife's supposed affair with Clodius directly related to Caesar's ability to govern. Trust works in funny ways. Consider this: I am a sysadmin. People have to trust me not to snoop through their files and email. If my users discovered that I were engaging in some shady scam on the side, or even that I hung around with a bunch of con artists, it might make them trust me a bit less, even though those activities have nothing, prima facie, to do with my role as a sysadmin.

Similarly, Bill Clinton's sexual activities or George Bush's supposed connection to Enron may have very little to do with their activities as President, but engaging in deceitful behavior or even associating with those that do damages trust, an important asset for politicians as well. That's why their opponents are always searching for such scandals.

The fact that Verisign has allowed themselves to be involved in yet another scandal that says "you can't trust that you're talking to whom you think you're talking" is sort of crazy, considering that they identify that very trust as their core business. That the news came from another division of the company is not really that mitigating -- it's still the Verisign name in the headline: "Verisign gives away microsoft.com domain" (printed with Cyrillic "c" and "o" ;)

Re:Terminology whine by dvdeug · 2002-05-27 18:26 · Score: 2

How do you sort stuff alphabetically if you can't just do an integer comparison?

Unicode Collation Algorithm.

Would be really slow to use some funky custom sorting routine.

What are you running? There are massive databases that use binary compare, and bitty boxes that use binary compare, but even my 386 should be able to do decent sorting in a negligible amount of time.

I don't know of many character sets that put the characters in sort order. ASCII doesn't work for English, because capital letters and lower case letters don't sort together. Latin-1 puts all its characters after ASCII, when some of them should sort with the ASCII characters.

As for why, the fact is it's not an option in a multilingual environment. Lithuanian sorts y after j; Swedish, German and Danish use some of the same accented characters, but sort them differently. The whole concept of binary sorting fails for some languages; Maltese and traditional Spanish both sort two letters ("ch" and "ll" for Spanish) as if they were one, and German sorts one letter ("ß") as if it were two ("ss").
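The capital-letter problem is easy to see in a few lines of Python. This is only a sketch of the point about code point order; the word list is made up, and case folding is a workaround rather than real language-aware collation.

```python
words = ["banana", "Zebra", "apple", "Apple"]

# Plain comparison follows code point order, so every capital letter
# sorts ahead of every lowercase one.
plain = sorted(words)
print(plain)   # ['Apple', 'Zebra', 'apple', 'banana']

# A sort key that folds case is the usual workaround.
folded = sorted(words, key=str.casefold)
print(folded)  # ['apple', 'Apple', 'banana', 'Zebra']

# Case folding also expands the German sharp s into two letters.
print("Straße".casefold())  # strasse
```

Rules like Lithuanian y or Spanish "ch" cannot be expressed this way; they need per-language collation tables.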

What needs to be done to solve this by Hellkitten · 2002-05-27 18:38 · Score: 3, Insightful

Solution: Make browsers default to displaying links to sites with non-ASCII addresses differently from regular links

Also, since link display may be overridden by style sheets, make the browser override stylesheets for these links.

Display a warning when user follows one of these links

If this warning is displayed as a popup and the user checks the "never show this warning again" box, display a text that explains why this is a bad idea

The only true way to security is to annoy your users into submission

--
- We are the slashdot. Resistance is futile. Prepare to be moderated -

Re:What needs to be done to solve this by dvdeug · 2002-05-27 19:06 · Score: 2

Solution: Make browsers default to displaying links to sites with non-ASCII addresses differently from regular links

You're treating Unicode URL's as something wrong. They aren't, and a program should not usually annoy users using normal features of the standard.

A much better solution is to detect anomalous URL's (those with mixed scripts) and display them differently in the address box. Since a website can't do anything bad to your computer (and if it can, then that's a serious bug in the browser that needs to be fixed), you don't need to worry about links; just telling whether a site isn't authentic. If your users won't check the address bar, then ASCII links will have no problem spoofing them, either.
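Detecting a mixed-script name can be sketched in Python along these lines. The standard library does not expose the Unicode Script property directly, so this guesses the script from the first word of each character's Unicode name; that heuristic, and the function names, are assumptions of the sketch.

```python
import unicodedata

def scripts_in(label: str) -> set:
    """Rough script guess: first word of each letter's Unicode character name."""
    found = set()
    for ch in label:
        if ch.isalpha():
            found.add(unicodedata.name(ch, "UNKNOWN").split()[0])
    return found

def is_mixed_script(label: str) -> bool:
    return len(scripts_in(label)) > 1

print(is_mixed_script("microsoft"))              # False
print(is_mixed_script("mi\u0441r\u043esoft"))    # True (Latin plus Cyrillic)
```

A browser could run a check like this on the host name only and flag it in the address box, leaving all-Cyrillic or all-Latin names alone.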

The only true way to security is to annoy your users into submission

It tends to make people not want to use your products. It induces random fear and doubt in others. What popping up random boxes for normal events tends not to do, is help users be any safer. It's amazing how fast I can click [yes] on Windows' dialog boxes, before I can even really check what they say.

Re:Terminology whine by VP · 2002-05-27 18:47 · Score: 3, Informative

Can you perhaps explain why KOI8 characters are out of order?

Because they were ordered as a transliteration for the Latin alphabet (sorry, can't put it in Cyrillic): ABCDEF instead of ABVGDE.

My guess is that this was done to easily transform Russian text written using the Latin alphabet into Cyrillic by simply flipping a bit.....
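The bit trick is easy to try in Python, assuming the KOI8-R variant is the one meant (Python ships a "koi8_r" codec). Clearing the top bit of each byte leaves a readable Latin transliteration, with upper and lower case swapped.

```python
text = "Русский"

# Encode as KOI8-R, then clear the top bit of every byte.
stripped = bytes(b & 0x7F for b in text.encode("koi8_r"))
print(stripped.decode("ascii"))  # rUSSKIJ
```

So mail passing through a 7-bit-only system would come out garbled in case but still legible, which is a plausible reason for the odd ordering.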

Re:I fail to see by GutBomb · 2002-05-27 19:15 · Score: 2

actually as it stands right now you can not have åöä in your address, a large complaint of people in sweden. å is not just an a with a ring over it. å is actually another letter entirely, something lots of english speakers can't get the grasp of.

Re:Right.. excpet.. SSL by Alan · 2002-05-27 19:20 · Score: 4, Insightful

Isn't the point of the article that now you can go to a Verisign approved website for (unicode of some big company) and have it check out properly because there is a verisign cert for the site (unicode of some big company)?

People now seem to be good at knowing that if you get funny pop ups about self signed certs or certificates not matching the url that they don't put in their credit card number... now suddenly that doesn't apply, because you won't get that, and the differences aren't as obvious as those for something like paypaI.com or micros0ft.com :)

Re:I fail to see by dvdeug · 2002-05-27 19:23 · Score: 2

actually as it stands right now you can not have åöä in your address

You can't have it in the domain name. You can have it in the part of the URL following the domain name.
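In practice the non-ASCII characters in the path travel percent-encoded. A small Python sketch, assuming UTF-8 as the byte encoding (which is what urllib.parse.quote uses by default); whether a given server maps those bytes back to the right file name depends on how it and its filesystem are set up.

```python
from urllib.parse import quote, unquote

path = "/pål"

encoded = quote(path)    # UTF-8 bytes, percent-escaped; "/" is left alone
print(encoded)           # /p%C3%A5l
print(unquote(encoded))  # /pål
```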

å is actually another letter entirely, something lots of english speakers can't get the grasp of.

I would be surprised to find many English speakers who couldn't learn that. I would expect that most of them just don't know that fact right now, and that many of them really don't care.

sure they get the domain name, what about hosting? by GutBomb · 2002-05-27 19:37 · Score: 2

Unless they run their own servers, hosting is gonna be a little hard to get. I run a web hosting company. When a user signs up for hosting they are immediately ushered to the credit card processor, then after that it asks them what password they wish to use on the system. After that the domain name, password, and other stuff are stuck into a database and an email is fired off to me to let me know someone signed up, containing the URL of the page that will give me the details. Anyway, I open up an ssh session to the server and start setting it up. When I enter the domain name into the httpd.conf I am not typing in Cyrillic. I simply fire up vi, and type the domain name in there using regular Latin characters. Same when I set up the DNS zone files, email, and other such stuff. Sure they can get the domain name there, but actually getting the page to show up is another matter altogether. I believe even Russian ISPs would assume the letters were Latin characters and not their Cyrillic counterparts if they are used to spell English words (as in known company names to be used in some sort of scam)

Re:I fail to see by GutBomb · 2002-05-27 19:46 · Score: 2

Actually you cannot have it anywhere in the URL. I tried it. First, using CuteFTP, I tried to create a directory on my web site called "pål". Cute wouldn't let me, so I had to ssh in and do it. I created the directory and put a file in there, but when I go to http://www.mydomain.com/pål I get a 404. So you can't have those symbols in URLs at all.

Re:Still... by arivanov · 2002-05-27 20:03 · Score: 3, Interesting

In windows (the EU edition) - anyone. Just add the language. Your only problem is that the idiots in Redmond have yet to add a keyboard editor (something that has been present in all third party internationalisation packages since Windows 3.10). As a result you will be stuck with some extremely obscene keymap inherited from a Cyrillic typewriter. Alternatively you can pick up dlls from third party cyrillisation packages made for older windows versions and violate the sanctity of the MSFT certificate by slapping it on top of the current ones. It usually works. And you get a proper keymap.

Under unix it is usually a bit more p*** in the a*** because most internationalisations rely on Xmodmap and it no longer works nowadays. Once again by default you will get stuck with something you cannot use unless you have a keyboard that is engraved with the alternate characters. Once again you will need to spend half an hour with vi swearing at whoever made Xmodmap not work any more in order to get a less obscene keymap.

--
Baker's Law: Misery no longer loves company. Nowadays it insists on it
http://www.sigsegv.cx/

Re:Still... by ncc74656 · 2002-05-27 20:12 · Score: 2

How many people can type the Cyrillic letters?

The Alt-keypad trick only works for 8-bit characters, AFAIK. You can copy characters out of Character Map (in Win2K/XP, not Win9x...Win9x's Character Map doesn't grok Unicode) and paste them into whatever you're typing, though: да, нет, etc. (I think the first is "da" and the second is "nyet"...saw something that looked like that in a banner ad on a Russian website recently.)

If all else fails and you're editing HTML, you can escape the character entries, so that (for instance) да gets entered as &#1076;&#1072;.
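If you would rather not look the numbers up by hand, the escapes can be generated. A small Python sketch; the helper name to_char_refs is made up for the example.

```python
import html

def to_char_refs(text: str) -> str:
    """Replace every non-ASCII character with a decimal HTML character reference."""
    return "".join(ch if ord(ch) < 128 else f"&#{ord(ch)};" for ch in text)

escaped = to_char_refs("да")
print(escaped)                 # &#1076;&#1072;
print(html.unescape(escaped))  # да
```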

--
20 January 2017: the End of an Error.

Why not stick with English? by evilviper · 2002-05-27 20:16 · Score: 2

I'm trying not to sound like a lingual elite-ist by any means, but can anyone really say that we shouldn't standardize on English/ASCII? Just about every country where English is not the native language, English is taught to their school children from early on.

The internet has shrunk the barrier to exchange information, which has made diverse languages even more significant of a barrier. If we use UNICODE and just accept that everyone wants to use their own language, then the internet will end up as a group of national islands of information. Each group will surf their set of native language web sites. When you search the web, the information on that Nokia phone might not be readable by you (Babelfish isn't a solution).

Language has always been a barrier, and I hope the internet will be the tool by which that barrier is torn down; not the tool which escalates the problem.

--
Slashdot gets worse every day... Pipedot: News for nerds, without the corporate slant

Re:Why not stick with English? by dvdeug · 2002-05-27 20:35 · Score: 5, Insightful

I'm trying not to sound like a lingual elite-ist by any means, but can anyone really say that we shouldn't standardize on English/ASCII?

The 5 billion people in the world who don't have English as their native language might. Some would argue that language is a cornerstone of culture, and that when a society loses their language, they lose a significant part of their culture. I've read parts of Shakespeare in German, and was very unhappy about the destruction of the writing. I know several poets of my native tongue (Poe, in particular) would be lost completely in translation. I have no interest in condemning other people to reading the great literature of their cultures in translation.

In any case, ASCII isn't good enough for English writing. French accents are used in English writing, as well as the ae and oe ligatures. Even in modern writing, proper quotes and apostrophes are needed, and footnote daggers often show up in English writing. For specialized work, mathematics, linguistics (even of English), historical English writing and APL all have their own body of characters outside ASCII that need to be supported.
Re:Why not stick with English? by ukryule · 2002-05-27 21:42 · Score: 3, Informative

I'm trying not to sound like a lingual elite-ist by any means, but can anyone really say that we shouldn't standardize on English/ASCII?

Yes. It's ridiculous to ask people to learn (admittedly a small part of) a new language to use a computer. Just because English is taught in a lot (not all) of schools around the world, it doesn't mean that everyone is comfortable using it. A truly usable computer should be one which allows you to interact with it 100% in your own language.

The internet has shrunk the barrier to exchange information, which has made diverse languages even more significant of a barrier.

The main barrier to computer usage in a large part of the world is that it is still an elitist medium - only useable (and affordable) by the well-educated. If you are actually interested in making it easier for everyone to communicate, then the main technical issue to be solved is how to make the internet useable by anyone from any background.

If we use UNICODE and just accept that everyone wants to use their own language, then the internet will end up as a group of national islands of information. Each group will surf their set of native language web sites.

This already happens. Of course people surf websites in their own language! Because you (and I) only surf the English-speaking fraction of the web, you don't see it. All that international domain names adds is that a Russian accessing a Russian website can do so via a Russian URL. What could be more sensible or obvious than that?

If no standard is agreed upon, proprietary standards will pop up all over the place, and it'll be a huge mess. In fact this is already happening - although he's the current anti-Christ of Slashdot, the big selling point of RealNames was for non-English languages, and if you believe Keith Teare's account, he was shafted by Microsoft because they wanted to control (via their browser) the translation of non-ASCII names to ASCII URLs.
Re:Why not stick with English? by david+duncan+scott · 2002-05-28 02:17 · Score: 2

Shakespeare in German? But it's so much better in the original Klingon!

--
This next song is very sad. Please clap along. -- Robin Zander
Re:Why not stick with English? by HiThere · 2002-05-28 06:26 · Score: 2

But perhaps it would be reasonable to require that URLs should be interpreted in ASCII (or at least that the ASCII interpretation should be the official one).

There is a reason why all airline pilots are required to speak the same language. It happens to be English, but that's not what's important.

I don't see anything wrong with requiring all URLs to be in ASCII. Conventions could be adopted to convert their display into other character sets, but the ASCII version should be the official one. So a URL like:
http://chnEi3ReY/...
would have, say, an interpretation in Hiragana, but the ASCII version would be the official one.

--

I think we've pushed this "anyone can grow up to be president" thing too far.
Re:Why not stick with English? by dvdeug · 2002-05-28 09:16 · Score: 2

nothing like a good bash of the perceived 'English Language Oligarchy' that rules the planet.

You're saying the _American_ Standard Code for Information Interchange should be the _universal_ standard for URLs, and so far, more or less, it has been. Of any non-invented language in the world, English has the most people who say "why don't they all speak my language". How many American journals publish in a non-English language? How many Japanese journals publish in a non-Japanese language? And what language is that usually? It's not an overt thing; no oligarchy exists per se. But there are distinct biases towards English, and it's very frustrating when those biases are gratuitous.

URLs must be standardized on a universally recognizable (and unconfused) character-set.

Why must the character set be universally recognizable? Yes, I'm not likely to type in www..com. So what? If they want my business, they probably need to register an ASCII domain name. If they don't want my business, they can go on being www..com.

Every literate person on the planet can read them.

Um, no? At best, _many_ people literate in languages using non-Latin scripts can recognize them. That's a far cry from reading them. That's important; part of the point of a URL is that www.slashdot.org is memorable. But if I'm not fluent in English, even if I can recognize individual characters, it won't be. How frustrating would it be if you had to transliterate English into Greek or Cyrillic every time you wanted to write a URL? I want my name in my email address, not some mangled foreign version.
Re:Why not stick with English? by dvdeug · 2002-05-28 09:35 · Score: 2

There is a reason why all airline pilots are required to speak the same language. It happens to be English, but that's not what's important.

Yes, but airline pilots are highly trained professionals who may kill people if there's a communication problem. The Internet should be accessible to everyone, not just those who know English.

Conventions could be adopted to convert their display into other character sets, but the ASCII version should be the official one.

That's what they're doing - I believe a leading bq-- will signal that it's supposed to be Unicode. How is bq--asdiv8nern.com much better than 129.22.21.99, though?
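
For a feel of what such an ASCII-compatible name looks like, here is a rough Python sketch. Note the assumptions: it uses Python's built-in `idna` codec, which implements the later-standardized IDNA/Punycode scheme with an `xn--` prefix rather than the draft-era `bq--` prefix mentioned above, and `bücher.example` is just an illustrative name, not one from this thread.

```python
# Convert an internationalized name to its ASCII-compatible form and back.
# Assumption: Python's built-in "idna" codec (xn-- prefix), not the bq-- draft.
name = "bücher.example"

ace = name.encode("idna")      # what would actually go over the wire in DNS
print(ace)                     # b'xn--bcher-kva.example'

print(ace.decode("idna"))      # the Unicode form, recovered for display
```

The mangled form is not meant to be read by people; it only has to survive software that expects ASCII host names.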
Re:Why not stick with English? by HiThere · 2002-05-28 13:16 · Score: 2

That's what they're doing - I believe a leading bq-- will signal that it's supposed to be Unicode. How is bq--asdiv8nern.com much better than 129.22.21.99, though?

It wouldn't be better, if you were browsing in English. But it also wouldn't get confused with something else. E.g. bq--rusMicroSoft.com wouldn't be confused with Microsoft.com. But you wouldn't usually go to a page with a Russian URL unless you were conversant with Russian. The thing is, if a unicode representation is the official version, then one will need to use hex editors to pry apart similar looking glyphs. If it's ASCII, all you need is a text box, with an option to force the URL into lower (or upper) case. Much more convenient for most people than parsing the hex.

--

I think we've pushed this "anyone can grow up to be president" thing too far.
Re:Why not stick with English? by dvdeug · 2002-05-28 16:11 · Score: 2

E.g. bq--rusMicroSoft.com wouldn't be confused with Microsoft.com

Yes, but bq--asdiv8nern.com could easily get confused with bq--adiv8narn.com. No one's going to want to look at the ASCII mangled names.
Re:Why not stick with English? by evilviper · 2002-05-29 17:47 · Score: 2

The ae and oe to which you refer are indeed ASCII... just because they don't appear on your keyboard doesn't mean they don't exist.

Mathematics has its own language which I doubt Unicode can rival.

As far as culture, you can keep your native language... Just keep one language standard on the internet!

--
Slashdot gets worse every day... Pipedot: News for nerds, without the corporate slant
Re:Why not stick with English? by evilviper · 2002-05-29 18:05 · Score: 2

It's ridiculous to ask people to learn (admittedly a small part of) a new language to use a computer.

Why? We require all the sciences to use the metric system.

A truly usable computer should be one which allows you to interact with it 100% in your own language.

I happen to know a 4 year-old who knows how to use Windows better than some post-grad students I know. Knowing the definition of a word is not even necessary for computer use.
Besides, you haven't proven that those who speak a different language have a harder time using a computer in English. I learned how to use DOS with no idea what DIR or CD meant...I learned how to program in C without C as my native language!

The main barrier to computer usage in a large part of the world is that it is still an elitist medium - only useable (and affordable) by the well-educated.

Send me $10 and I'll send you a 486. A little more for a 100MHz Pentium. As far as well educated, I refer back to the 4 year-old. Besides, I had no training on how to use a computer, I was taught how to double-click and the rest I learned because of weak restrictions.

Of course people surf websites in their own language!

Well, not exclusively in most cases. At least because most information is in English, people learn that they need to know English well enough to ask a question, read a manual, read a book on programming, etc.

--
Slashdot gets worse every day... Pipedot: News for nerds, without the corporate slant
Re:Why not stick with English? by dvdeug · 2002-05-29 19:24 · Score: 2

The ae and oe to which you refer are indeed ASCII...

If you use Linux, try man ascii. That will show you all the characters in ascii. See Roman Czyborra's character set page for more detail than you could ever want. As a point, ae and oe are not in ASCII; oe isn't even in Latin-1.
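
If you'd rather check than take my word for it, a small Python sketch along these lines will do. The helper name `fits` is invented here; it simply tries to encode a character and reports whether the target character set has a slot for it.

```python
def fits(ch: str, encoding: str) -> bool:
    """Return True if the character can be represented in the given encoding."""
    try:
        ch.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False


# Plain letter, the ae ligature (U+00E6) and the oe ligature (U+0153).
for ch in ("a", "\u00e6", "\u0153"):
    print(f"U+{ord(ch):04X}  ascii={fits(ch, 'ascii')}  latin-1={fits(ch, 'latin-1')}")
```

Only the plain letter fits in ASCII; the ae ligature fits in Latin-1 but the oe ligature does not.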

Mathematics has its own language which I doubt Unicode can rival.

Huh? What's that supposed to mean?

As far as culture, you can keep your native language... Just keep one language standard on the internet!

You can keep your language, as long as we don't see it, then? Use our language where we can see it. That's not arrogant? What if you were asked to learn Mandarin Chinese (the language of more people than any other) before you could use the Internet?

One of the amazing things about the Internet is that grandparents can communicate with grandchildren; forcing old dogs to learn new tricks and young kids to pick up a second language before they've fully learned their first isn't going to make the Internet more accessible to everyone.
Re:Why not stick with English? by dvdeug · 2002-05-30 08:43 · Score: 2

We require all the sciences to use the metric system.

The metric system - all the average person needs to use - is describable on one small page at normal size print. No language comes close, especially not English. In any case, if you want to be a scientist, you have to take 10 years of college; there should be no unnecessary requirements on what you have to learn to use a computer.

most information is in English,

Most people use the Internet more for communication than information. It's like comparing the amount of water in two lakes when you're looking to water your roses. It doesn't matter how much information there is, so long as there's enough. I would guess there's enough information out there in Spanish or German or Chinese or most other major languages.

people learn that they need to know english well enough to ask a question, read a manual, read a book on programming, etc.

Right. Most people don't go around reading manuals (that didn't come with their computer) or books on programming (at all).

Again, we aren't talking about hackers here. People seriously into any field are going to pick up the language common in that field, and computers are no exception. You're asking the masses to learn a language to use computers, which is entirely different.

You don't sound like someone who's tried to learn languages here. I've taken years of classes to learn German, and can still only stutter through. Reading a half-page of technical information in German took me an hour, at which point I gave up trying to read it. I can't imagine ordering everyone in the world to learn a language to use a computer.

Re:Terminology whine by arivanov · 2002-05-27 20:25 · Score: 2

Excuse me...

I think your knowledge of the subject is a bit off...

It was not developed for Russian use at all. It was developed in Moravia, which spanned most of the current Czech Republic and bits of Slovakia. In other words it was developed for what has become Czech nowadays. The people who developed it were fairly high in the hierarchy of the Moravian church but got nailed for heresy by their superiors in Rome.

After that they fled to Bulgaria and from there on the alphabet spread to Russia. Considering that at that time the Bulgarian Empire spanned most of the Balkan peninsula and both Bulgarians and Serbians claim it to be in their ancestry, I will skip on where Serbians got the alphabet to avoid a Balkan flame war. Let's say once upon a time it was one country.

After that the alphabet went through at least several simplifications and changes of the writing. One around the 9th-10th century, one during the church reform in the middle ages in Russia and one more in most Slavic countries just before the first world war.

In any case:

It took several centuries between the invention of the alphabet and it landing in Russia. Russians actually started celebrating the origins of their alphabet in 1991. Before that it was a topic that was usually skipped in their history books.
All nations using it have different versions. That includes Serbs, Bulgarians, Russians, Belorussians, Ukrainians, etc. and the Eastern Orthodox church (the latter uses an ancient dialect corresponding to 9th-10th century South Slavic). So saying Russian Cyrillic is not redundant. It is a different alphabet from Bulgarian, Ukrainian, etc. Most of them contain letters specific only to the particular language. The computer encodings are different as well.

--
Baker's Law: Misery no longer loves company. Nowadays it insists on it
http://www.sigsegv.cx/

Re:I fail to see by dvdeug · 2002-05-27 20:26 · Score: 2

W3C has a page on the subject. I don't know why it doesn't work in your case; I suspect you're dealing with character set issues, but you didn't mention what webbrowser you were using or anything, so it's hard to tell what went wrong.

Re:sure they get the domain name, what about hosti by Edmund+Blackadder · 2002-05-27 20:27 · Score: 2

Many ISPs do the whole sign up process automatically.

Maybe you would like to save some time as well - check out - www.rodopi.com.

Discussed on Unicode in February by absurd_spork · 2002-05-27 20:29 · Score: 2

This has been known for quite some time. It was in February discussed in a series of threads on the Unicode mailing list, started by this message by Gaspár Sinai who developed the Yudit Unicode editor.

Basically, the consensus in the end was that it is impossible to avoid this sort of problem as long as you have a standard that encodes characters instead of glyphs (that means that Latin "o" and Cyrillic "o" are different characters, even though they look the same).
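
To see the character-versus-glyph distinction concretely, here is a small Python sketch using the standard `unicodedata` module. The two variable names are just labels for this example.

```python
import unicodedata

latin_o = "o"            # U+006F
cyrillic_o = "\u043e"    # U+043E, usually rendered with the same glyph

for ch in (latin_o, cyrillic_o):
    print(f"U+{ord(ch):04X}  {unicodedata.name(ch)}")

# Same appearance on screen, but different characters as far as software goes.
print(latin_o == cyrillic_o)
```

The comparison is false, which is exactly why a name containing the Cyrillic letter resolves to a different host than the one it appears to spell.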

A character set that encoded glyphs instead of characters could avoid this. However, such charsets are extremely tedious to implement. It has been tried with the Adobe glyph registry and has been found insufficient.

In practice, glyph-based character sets are unusable. The reason is that they cannot be made fully round-trip compatible with existing character sets, such as ISO 8859 or the Windows codepages, because these legacy character sets encode characters instead of glyphs. If URLs were encoded in such a glyph-based character set, it would be impossible to embed URLs in any document in a legacy character set. No URLs in e-mails.

As a result, the only solution is to have application and operating system vendors implement checks for such situations and to have URL registries reject such obvious spoofing attempts (e.g. no mixed-alphabet URLs). Since the problem is not fundamentally different from registering slashdot.org, it is not even a problem that we weren't already aware of.
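
A very rough Python sketch of what such a mixed-alphabet check could look like. This is an illustration under loud assumptions, not how any registry actually does it: Python's standard library has no script property, so the first word of the Unicode character name ("LATIN", "CYRILLIC", "GREEK", ...) is used as a crude stand-in, and the function names are invented here.

```python
import unicodedata


def scripts_in_label(label: str) -> set:
    """Crude script detection: first word of each letter's Unicode name."""
    scripts = set()
    for ch in label:
        if ch.isalpha():  # ignore digits, hyphens and dots
            scripts.add(unicodedata.name(ch, "UNKNOWN").split()[0])
    return scripts


def looks_mixed(label: str) -> bool:
    """Flag labels whose letters come from more than one script."""
    return len(scripts_in_label(label)) > 1


print(looks_mixed("slashdot"))          # all Latin letters
print(looks_mixed("slashd\u043et"))     # Cyrillic o hidden among Latin letters
```

A check like this can only flag a name for closer inspection. It does nothing about a name written entirely in look-alike letters from a single script.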

--

There is absolutely no reason to panic.

Re:Discussed on Unicode in February by pne · 2002-05-28 02:02 · Score: 2

Mixed-alphabet URLs need not necessarily imply spoofing. That would be like saying URLs with both upper and lower case are all spoofs -- but some people write URLs with mixed case for legibility (e.g. www.SomeLongDomainName.com vs www.somelongdomainname.com). There are undoubtedly legitimate uses for mixed-alphabet domain names as well (perhaps something like "www.team-xxxxxxx.com", with the x's standing for Greek/Cyrillic/whatever letters?).

--
Esli epei etot cumprenan, shris soa Sfaha.

Re:sure they get the domain name, what about hosti by GutBomb · 2002-05-27 21:42 · Score: 2

yeah, i really did not think of that. i prefer to do it manually instead of something doing it automatically, A) because i don't want to pay for a tool that helps me do it automatically B) I am too lazy to do one up for myself C) I use plesk server administrator on some of the servers and i don't think i want to pay plesk to develop something for me since all their php source code is encrypted.

Google is your friend by Luminous+Coward · 2002-05-27 21:46 · Score: 2, Informative

The work was done for a paper in the Communications of the ACM (the paper itself is not online).

I doubted that statement as I'd read the paper online several days ago. I think it was linked to from Bruce Schneier's Crypto-Gram Newsletter. Anyway a simple Google search with homographic attack dns yields one and only one result:

The Homograph Attack

tangential ask slashdot... by Graspee_Leemoor · 2002-05-27 22:54 · Score: 2

This is slightly tangential, but seems a good place to ask: does anyone know how to get Microsoft IME under Windows XP to use a Dvorak layout for romaji input when typing Japanese?

For English I just use the US Dvorak input method, but when the language is set to Japanese there seems to be no way to use Dvorak other than tediously modifying the romaji->kana input table, which is clearly the wrong way to go about things.

graspee

goatse by diesel_jackass · 2002-05-28 00:59 · Score: 2

oh geez, i can see the creative Goatse links now.

--
THERE IS NO DATA. THERE IS O

Its not the same by Srin+Tuar · 2002-05-28 01:53 · Score: 2

The average literate Chinese person has to know upwards of 3000 unique characters. Picking up the ~30 ASCII glyphs needed to use the current internet is trifling in comparison.

Knowing a sufficient number of English words is much more difficult, but completely unnecessary for using email/DNS.

Also, I imagine if the "internet started in China", they would have included the measly 26 uppercase Latin letters, as they are kanji too. Most of the sites you'd be interested in as an English speaker would stick to those anyway...

Farsight seems to be scarce by gweihir · 2002-05-28 02:01 · Score: 2

I have had numerous discussions (or better: fights) with people about this. Usually they feel the security problems can be solved without real effort (by somebody else of course), but feel what I really wanted is to discriminate against them.

It never ceases to amaze me that some people would rather risk an entirely working system, like the DNS, than accept that technology cannot accommodate their personal needs that fast and that some of their personal needs may be very difficult to fulfill, and that this is not the fault of the engineers but rather a consequence of the fact that the technology they now want adapted to their needs was invented by people from another culture! If the WWW was a Russian invention, of course everybody participating in it would have to learn the Russian language at first! Maybe even still some decades later. Now it was mostly American, so it is ASCII and English. Those that cannot adapt to that should wait until their needs can be safely and cost-effectively accommodated or do the needed extensions from their own resources!

But obviously many people just "want" without any willingness to contribute or invent or implement by themselves. I foresee interesting times for anybody using text-based identities, like names.

--
Most ACs are not even worth the keystrokes to insult them. Be generically insulted and ignored otherwise.

Re:I fail to see by GutBomb · 2002-05-28 02:08 · Score: 2

how the fuck do i ssh in and create a directory with fucking IIS. i tried mozilla and netscape on linux and i tried mozilla, opera, and IE on win32. none worked. oh yeah, i used apache.

Re:I would have thought it wasn't a problem except by Tony-A · 2002-05-28 03:58 · Score: 2

Taken to its logical conclusion, if you can't handle life (all of it?), then you shouldn't be alive.
The thing is that people do have to cope with things that they do not understand. Societal norms should be such that minimal damage is inflicted due to lack of understanding of consequences. This applies to adults as well as children and infants.

Re:Terminology whine by Reziac · 2002-05-28 04:06 · Score: 2

As an aside, to me the various flavours of Cyrillic look like the character set was ultimately derived mostly from Greek, which seems reasonable if it was developed in the Balkan region. Anyone know any further-back history on it??

--
~REZ~ #43301. Who'd fake being me anyway?

Re:homograph by GMontag451 · 2002-05-28 04:26 · Score: 2

Homograph is a real word (words spelled identically but with different meanings; think "fair": just or morally right (life isn't fair), appealing in appearance (fair skinned), a market place), but they are using it in a new context. The occurrence of using Unicode to do bad things with domains is so uncommon there is no word for it, so they coined a word to make it something they could actually talk about.

I would think a better term to coin would be "homoglyph", because that is what it is. Two different characters with the same glyph. Plus this has the advantage of not being a word already in use (to my knowledge).

[OT] not quite correct by brokeninside · 2002-05-28 05:12 · Score: 3, Interesting

IMO, the major contribution of St. Cyrill and Methodius is not the creation of an alphabet, but their disputes with the Western church and the Pope regarding the right for the different peoples to learn and practice Christianity in their own language. Up to that point only Latin, Greek and Hebrew was used in church services...

This was only true in Western Christendom and then only true to a limited extent. For example, in the west, the first Christian missionaries to the British Isles translated the service books of the early Church to Gaelic and other Celtic languages. In the east, the generally accepted practice was to use the vernacular. This is why some of the oldest extant copies of the Bible are in one of the Ethiopic languages, Coptic, Syrian, etc.

The Roman canon that the liturgy could only be practiced in one of the tongues spoken by the apostles was of relatively late invention and only applied to congregations under the sole apostolic see of the west, Rome. Congregations under the apostolic sees of the east always used the vernacular.

Hence it is somewhat ironic that many eastern Churches refuse to update the liturgy from being in liturgical Greek or old Slavonic into their modern equivalents.

Regards,

-l

Re:I would have thought it wasn't a problem except by HiThere · 2002-05-28 05:58 · Score: 2

The first time I got a Klez message, I sent a reply saying that I thought their machine was infected. I only discovered the forgery problem when I started reading up on it. That's probably what happened to your friend.

If you aren't really bothered by viruses (i.e., keep your system reasonably secure and don't use MS), then their new tricks can sneak up on you.

--

I think we've pushed this "anyone can grow up to be president" thing too far.

Because they're smart by GCP · 2002-05-28 06:31 · Score: 2

Nobody who understands text data would use anything other than Unicode except for legacy handling. Using different encodings for different languages is as ridiculous today as using different encodings for English on different platforms used to be before everyone agreed to exchange data in ASCII.

--
"Those who have never entered upon scientific pursuits know not a tithe of the poetry by which they are surrounded."

Re:Because they're smart by bani · 2002-06-01 09:54 · Score: 3, Interesting

For many language encodings the conversion to unicode is a one-way ticket, there is no roundtrip possible -- so you sometimes lose critical information about the characters.

It's also disappointing that unicode forum dropped their official JISUTF tables. There is no longer any official translation table for japanese encodings to unicode. It's the wild west for asian languages in unicode (ever wonder why no asian data systems use unicode?)
Re:Because they're smart by dvdeug · 2002-06-03 17:52 · Score: 2

For many language encodings the conversion to unicode is a one-way ticket, there is no roundtrip possible

Really? Name a few. The Unicode consortium and ISO 10646 went through a lot of trouble to round trip everything through Unicode. They recently added ARABIC TAIL FRAGMENT to roundtrip some ancient IBM Arabic character sets through Unicode. There are hundreds of characters in Unicode whose only point is roundtrip other character sets through Unicode.
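
For what it's worth, "round trip" here is something you can test mechanically. A minimal Python sketch, assuming Python's bundled `koi8_r` and `iso8859_5` codecs as stand-ins for legacy Cyrillic encodings (the helper name `roundtrips` is made up): decode every possible byte to Unicode, encode it back, and compare.

```python
def roundtrips(data: bytes, encoding: str) -> bool:
    """Return True if legacy bytes -> Unicode -> legacy bytes loses nothing."""
    return data.decode(encoding).encode(encoding) == data


all_bytes = bytes(range(256))
for enc in ("koi8_r", "iso8859_5"):
    print(enc, roundtrips(all_bytes, enc))
```

This only exercises two simple 8-bit character sets as shipped with Python; it says nothing either way about the multi-byte Japanese encodings, where the disagreement is over which mapping table is the right one.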

There is no longer any official translation table for japanese encodings to unicode.

Those were never official. Two different systems didn't always agree on the definition of various characters in those encodings, making it hard to make an accurate universal translation table. Also, Unicode doesn't consider the definition of various Japanese standards its business.
Re:Because they're smart by bani · 2002-06-04 16:49 · Score: 2

"Those were never official. Two different systems didn't always agree on the definition of various characters in those encodings"

Thank you for making my point for me! You can't have a reliable roundtrip conversion, when you don't even have a reliable freaking one way conversion!

"making it hard to make an accurate universal translation table."

And there is no longer any translation table at all. The impetus for asians to use unicode has all but evaporated, the only people using unicode for asian data interchange are westerners. The asians have largely abandoned it, and it is hardly surprising given these problems.

"Also, Unicode doesn't consider the definition of various Japanese standards its business."

And this has absolutely nothing to do with translation tables. Unicode isn't defining JIS or EUC.
Re:Because they're smart by dvdeug · 2002-06-04 17:26 · Score: 2

You can't have a reliable roundtrip conversion, when you don't even have a reliable freaking one way conversion!

How can you have a reliable conversion if the origin is ill-defined?

And there is no longer any translation table at all.

All Unicode did was move it into the OBSOLETE directory. They certainly didn't search the web and destroy every copy they could find.

The impetus for asians to use unicode has all but evaporated,

Which is why all operating systems sold in China must support GB18030, a format of Unicode.

the only people using unicode for asian data interchange is westerners.

Sure, the email may be in ISO-2022-JP; but the emailer that sent it probably stored it in Unicode and displayed it using a Unicode text display engine. This is true for Outlook Express, Mozilla, and anything written for KDE or Gnome 2.

Sideswiped with a limerick! by NoMoreNicksLeft · 2002-05-28 07:12 · Score: 2

MrHat, I've missed you. I had started to think I was unworthy of your limericks.

Tell me the truth though, is it, or is it not incredibly sad, that nearly every topic/conversation on this site can be reduced to a 5 line poem? It tells the lie of just how shallow most of this is...

Unicode is not the problem... by GCP · 2002-05-28 07:34 · Score: 2

...but it will have to be part of the solution.

The problem is the diversity of characters used by people around the world, regardless of how they are encoded. Encoding them in anything other than Unicode would make the problem dramatically worse because no group will sit back for long and allow their language to be excluded from global naming protocols on this shared "worldwide" platform.

Having everyone share an ASCII-only system is no longer a viable option, so either everyone shares a single system that covers all languages (Unicode is the only viable option), or the system breaks up into a composite of conflicting encodings. (.com could be registered as half a dozen different byte sequences by different registrars.)

The Unicode solution is the only one that makes sense, then you have to look at rules for the use of characters. You would have to look at the rules for the use of characters even without Unicode. It's just that Unicode makes it so much simpler than the composite alternative that a solution is probably possible.

This IDNC3 proposal is a good start, but there are even more issues. People who wave their arms about the "problems of Unicode" aren't helping, though. Almost all of them are really just advocating "let's keep it simple by limiting it to the characters I need and disallowing yours", and that won't fly any longer.

--
"Those who have never entered upon scientific pursuits know not a tithe of the poetry by which they are surrounded."

Re:Unicode is not the problem... by Russ+Nelson · 2002-05-30 16:58 · Score: 2

IDNC3 prohibits the use of visually identical characters in domain names. There's really no alternative to doing that.
-russ

--
Don't piss off The Angry Economist

Keyboards by GCP · 2002-05-28 08:08 · Score: 2

Yes, and it's a lot harder for you to write the characters needed for programming in C++ or Perl. I'd rather have my English keyboard.

HOWEVER, what I'd like best of all would be to replace the dumb keyboard (hit a key, get the character printed on the key cap) with smart input methods at the OS level (maybe keyboard driver level if you don't have a GUI).

For example, I should be able to type user-defined abbreviations and have the OS replace them with what they represent. I should be able to type "deja vu" and have the OS input dictionary automatically replace it with "déjà vu" and so on. We should be able to use the tab key for autocompletion and substitution, so if I type e/ then tap the tab key, it might replace e/ with é, and so on.

Yes, I know we have some of this functionality in unix shells like bash, some in emacs, some in word processors like MS-Word, etc. I'd like it at the OS level so that no matter what I was typing into, I would have a virtual keyboard much more powerful than my simple physical keyboard and one that I could optimize for the characters/words/phrases I needed most often.
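
Just to make the idea concrete, here is a toy Python sketch of the substitution step such an input layer might perform when the tab key is tapped. Everything in it is hypothetical: the table contents come from my examples above, and the names `EXPANSIONS` and `expand` are invented for illustration.

```python
# User-defined abbreviations and what they stand for (illustrative entries).
EXPANSIONS = {
    "deja vu": "déjà vu",
    "e/": "é",
}


def expand(buffer: str, table: dict = EXPANSIONS) -> str:
    """Replace the longest abbreviation that ends the typed buffer, if any."""
    for abbr in sorted(table, key=len, reverse=True):
        if buffer.endswith(abbr):
            return buffer[: -len(abbr)] + table[abbr]
    return buffer  # nothing to expand


print(expand("caf" + "e/"))            # café
print(expand("a feeling of deja vu"))  # a feeling of déjà vu
```

The hard part is not this lookup but hooking it in below every application, so that the same table applies no matter what you are typing into.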

--
"Those who have never entered upon scientific pursuits know not a tithe of the poetry by which they are surrounded."

Re:Keyboards by dvdeug · 2002-05-28 09:20 · Score: 2

HOWEVER, what I'd like best of all would be to replace the dumb keyboard (hit a key, get the character printed on the key cap) with smart input methods at the OS level (maybe keyboard driver level if you don't have a GUI).

Will the X input manager framework not support this? I've looked at it repeatedly, but I read none of the CJK languages XIM is primarily designed for and primarily documented in.

I know it's only X, but I've given up any hope of Linux and X agreeing on the keyboard.

Re:I fail to see by GutBomb · 2002-05-28 09:49 · Score: 2

Actually, I clearly stated that cute didn't work, so I had to ssh in. Read it again before you complain about it. Second, I don't really care what the W3C has to say about it. Using standard server software and standard client software, it simply does not work.

Re:It shouldn't really be a problem. by GigsVT · 2002-05-28 10:44 · Score: 2

You are absolutely correct, but try explaining that to a customer that insists they want their own domain name to be on the "check out" screen, and not their hosting provider, but also refuses to buy their own cert. They won't understand, and they won't listen. Maybe allowing hosting providers to sign certs for the domains they host could be a solution.

--
I've had enough abrasive sigs. Kittens are cute and fuzzy.

Done in Unix also... by dghcasp · 2002-05-28 11:44 · Score: 2

Back in my old Unix admin days, working at a company where everyone knew the root password, people would sometimes try to hide directories by putting non-printing control characters in the name, e.g. ". ^hfoo"

Of course, this is easy to defeat with a simple combination of backticks, ls -1 and wc.
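
For anyone without the shell incantation handy, a sketch of the same idea in Python (a different technique from the backticks/ls/wc trick, not a transcription of it): list a directory and flag any name containing a non-printable character, printing it with repr() so the control characters show up as escapes. The function names are made up for illustration.

```python
import os

def suspicious_names(names):
    """Return the names that contain at least one non-printable
    character, such as a backspace (^H, 0x08)."""
    return [n for n in names if not n.isprintable()]

def scan(path="."):
    """Print every entry in path whose name hides control characters."""
    for name in suspicious_names(os.listdir(path)):
        print(repr(name))  # repr() makes '\x08' and friends visible

# ". ^hfoo" from the example above, with ^h written as the backspace byte.
sample = ["docs", ". \x08foo", "notes.txt"]
print(suspicious_names(sample))  # ['. \x08foo']
```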

The best way I discovered to hide the contents of a directory in unix is:

use fsdb(8) to change the first character of the directory name to "/"

Unix is rather unhappyful trying to cd to a directory that has a / as part of its file name. Shell quoting tricks won't get you past it, since it's the kernel handling the /

Of course, you had to un-/-ify the directory every time you wanted in, but hey, the price of security...

Re:Right.. excpet.. SSL by mindstrm · 2002-05-28 13:19 · Score: 2

No. The point of the article is to try to blame this on Verisign when in fact they are doing nothing wrong.

It sounds like you don't really understand how certificates work. Verisign will NOT issue you a certificate for www.microsoft.com using some Cyrillic characters. So there is no way a site can present a certificate, signed by Verisign, indicating the site is microsoft.com.

The article tries to make it out that because Verisign issues certificates, it should ALSO be verifying domains people register.

This is not a scandal. by mindstrm · 2002-05-28 13:25 · Score: 2

It is a totally legitimate domain. There is nothing WRONG with it.

It's particular uses of it that can be wrong, but not the domain itself.

And as to what you said, you, directly or indirectly, implied that Verisign should not allow domains like this to be registered because they are in the certificate authority business.
Totally different things.

I don't see the connection you are drawing.
No, Bill Clinton's relationship with his wife has nothing to do with his ability to govern, and I cannot *believe* that people actually think it has an effect.
Actually, what I really think (read this carefully) is that it's a big deal because people THINK that other people think that it has some effect, and don't want to appear different.

Re:Telling Japanese from Korean [OT] by Dephex+Twin · 2002-05-28 14:57 · Score: 2

Ha, I can definitely believe that.

The other day at work I scribbled a couple of words in Korean on my notepad (I studied Korean a bit). Later my boss came by and saw it and asked what it meant. I told him and then showed it to the *Chinese* guys who work in my area to see if I had it right. They glanced and said "oh that's Korean" and my boss said "oh, I see, but can't you read it at all?" Errrr, no.

With Chinese, I can understand there is a huge learning curve, but a lot of people don't know that they could pick up basic Korean (alphabet and making words) in about a day.

I guess when people look at Chinese/Japanese/Korean they see "Chicken Scratch language" and don't really look for the obvious distinctions among them.

mark

--

If you want to make an apple pie from scratch, you must first create the universe. -- Carl Sagan

Re:WHY THIS IS IMPORTANT - It's already been done by bay43270 · 2002-05-28 15:43 · Score: 2

Someone once sent an email to my Yahoo account that looked just like the Yahoo login message. I would have fallen for it, but IE didn't auto-fill my login into their fake text field.

Comm of the ACM article online by plcurechax · 2002-05-28 16:38 · Score: 2

The Communications of the ACM article is available online at <http://www.csl.sri.com/users/neumann/insiderisks.html#140> (Inside Risks 140, CACM 45, 2, February 2002).

Why not discredit Christianity and Judaism, too? by Karma+To+Burn · 2002-05-28 21:57 · Score: 2, Insightful

Folks are just kinda thick about questioning the veracity of claims (hell, astrology still sells books and 900-number phone calls).

So... you can't respect other people's personal decisions on spirituality? Granted, the 900-numbers are gimmicky. But why should Astrology books be discredited as nonsense? Most mature people respect others' religious beliefs.

Although Astrology isn't a religion, it is faith-based, as religion is. Is Astrology scientific? No. Neither is the Bible (etc.). You might as well have worded that sentence to say "hell, astrology, christianity, and paganism still sell books...".

All I ask is that you respect other people's personal spiritual beliefs, whether that involves Astrology, Judaism, Wicca, or what have you. An exception is when you're discussing/debating spirituality or religion, but this isn't the case.

I don't believe in Christianity, but I don't attack a Christian's personal beliefs because I don't agree with them. I expect others to respect my personal beliefs the same way.

Re:Client-side fix? by dvdeug · 2002-05-29 07:25 · Score: 2

(1) Which glyphs are the same are entirely font dependent.

(2) Greek letter A lowercased should look like α; Latin letter A lowercased should look like a. There are 23 o-like characters: some symbols, some alphabetic characters, and 1 ideograph. Each of them has its own properties, and many of them may or may not look the same depending on the font.
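
A quick sketch of point (2) using Python's standard unicodedata module: the two capital letters are distinct code points with distinct names, and they lowercase to different characters, whatever a given font draws for them.

```python
import unicodedata

latin_a = "A"            # U+0041
greek_alpha = "\u0391"   # U+0391, often drawn identically to Latin A

# Same-looking glyphs, different characters.
print(latin_a == greek_alpha)          # False
print(unicodedata.name(latin_a))       # LATIN CAPITAL LETTER A
print(unicodedata.name(greek_alpha))   # GREEK CAPITAL LETTER ALPHA

# Lowercasing exposes the difference even in a font where the capitals match.
print(latin_a.lower())                 # a
print(greek_alpha.lower())             # α
```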
