Spoofing URLs With Unicode
Embedded Geek writes: "Scientific American has an interesting article about how a pair of students at the Technion-Israel Institute of Technology registered "microsoft.com" with Verisign, using the Russian Cyrillic letters "c" and "o". Even though it is a completely different domain, the two display identically (the article uses the term "homograph"). The work was done for a paper in the Communications of the ACM (the paper itself is not online). The article characterizes attacks using this spoof as "scary, if not entirely probable," assuming that a hacker would have to first take over a page at another site. I disagree: sending out a mail message with the URL waiting to be clicked ("Bill Gates will send you ten dollars!") is just one alternate technique. While security problems with Unicode have been noted here before, this might be a new twist."
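For anyone who wants to see the homograph effect for themselves, here is a minimal Python sketch. It assumes the spoof replaces Latin "c" and "o" with Cyrillic U+0441 and U+043E, as the summary describes. The exact string the students registered is not given, so `SPOOF` is a hypothetical reconstruction, and the function names are made up for illustration. Taking the first word of a character's Unicode name as its script is only a rough heuristic, not a real script-detection API.

```python
import unicodedata

LATIN = "microsoft.com"
# Hypothetical spoof: Cyrillic es (U+0441) and o (U+043E) stand in for Latin c and o.
SPOOF = "mi\u0441r\u043es\u043eft.com"

def non_ascii(name):
    """List (code point, Unicode name) for every character outside ASCII."""
    return [(f"U+{ord(ch):04X}", unicodedata.name(ch)) for ch in name if ord(ch) > 127]

def rough_scripts(name):
    """Approximate the scripts used by taking the first word of each letter's Unicode name."""
    return {unicodedata.name(ch).split()[0] for ch in name if ch.isalpha()}

print(LATIN == SPOOF)        # the strings differ even if they render alike
print(non_ascii(SPOOF))      # shows which characters are not ASCII
print(rough_scripts(SPOOF))  # a name mixing scripts is a warning sign
```

A name that mixes Latin and Cyrillic letters within one label is a reasonable thing for a mail client or browser to flag, though that check alone would not catch a spoof written entirely in one script.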
St. Cyril developed the Glagolitic alphabet, based on the Slavic dialects spoken on the Balkan peninsula, and used it in translating the Christian holy scriptures for the Slavic tribes in Moravia (today's Hungary/Slovakia). His student, St. Clement, developed the improved Cyrillic alphabet and spread its use in Bulgaria, from where it was adopted by Russia, Serbia, and others...
Today there are several variants of Cyrillic (Bulgarian, Serbian, Macedonian, Russian, Ukrainian), and it has even been used in some of the former Soviet republics and in Mongolia, whose languages are very far from Slavic.
Also, KOI8 is not considered the Cyrillic code set by other Cyrillic-using nations; it is rather considered the Russian Cyrillic code set. Other code sets are Windows-1251 and ISO-8859-5. The latter would arguably be the standard Cyrillic code set.
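To make the difference between these code sets concrete, here is a small Python sketch that encodes the same two Cyrillic letters (es and o) in each of them. It uses Python's built-in codecs `koi8_r`, `cp1251`, and `iso8859_5`. I am taking KOI8-R as the KOI8 variant meant above, which is an assumption, and the helper name is just for illustration.

```python
# Python codec names for the three Cyrillic code sets mentioned above.
CODE_SETS = ("koi8_r", "cp1251", "iso8859_5")

def encode_everywhere(text):
    """Encode the same text in every code set and return the raw bytes per codec."""
    return {name: text.encode(name) for name in CODE_SETS}

# Cyrillic small letters es (U+0441) and o (U+043E).
encoded = encode_everywhere("\u0441\u043e")

for name, raw in encoded.items():
    # Same letters, different byte values in each code set.
    print(f"{name:10} {raw.hex()}")
```

The same two letters come out as different bytes in each code set, which is why text written in one and read as another turns into garbage.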
Another English letter that faded away is yogh [looks like a 3]; this is the z in Menzies = Men3ies "Menges".
Also of note is digamma. In the Greek number system, this is 6, that is, the 6th letter of the alphabet. As a letter, it appears between epsilon and zeta. Since our alphabet is derived from the Greek, one notes that the letter in this position not only looks like digamma, but preserves much of the original sound: F. Phi was an aspirated p.
Cyrillic bears a much closer resemblance to the classical Greek letters, and the theta indeed represents an f here.
Unicode reflects current realities. There is more than one Cyrillic Alphabet, just as there is more than one Latin alphabet.
OS/2 - because choice is a terrible thing to waste.