Unicode Encoding Flaw Widespread
LordNikon writes "According to this CERT advisory: 'Full-width and half-width encoding is a technique for encoding Unicode characters. Various HTTP content scanning systems fail to properly scan full-width/half-width Unicode encoded HTTP traffic. By sending specially-crafted HTTP traffic to a vulnerable content scanning system, an attacker may be able to bypass that content scanning system.' A proof of concept affecting IIS is already being posted to security mailing lists. Cisco IPS and other IDS products are also affected." The CERT advisory lists 93 systems, with 6 reported as vulnerable (including 3Com, Cisco, and Snort), 5 known not vulnerable (including Apple and HP), and the rest unknown.
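For anyone wondering what a bypass like this looks like in principle, here is a minimal Python sketch. It is not the proof of concept from the mailing lists and does not reproduce how any listed product works. The signature string and the helper names (`to_fullwidth`, `naive_scan`, `fold_width`) are made up for illustration. It relies only on the fact that the full-width forms U+FF01 to U+FF5E mirror ASCII 0x21 to 0x7E, and that NFKC normalization folds them back.

```python
import unicodedata

FULLWIDTH_OFFSET = 0xFEE0  # U+FF01..U+FF5E mirror ASCII 0x21..0x7E


def to_fullwidth(text: str) -> str:
    """Replace printable ASCII characters with their full-width forms."""
    return "".join(
        chr(ord(c) + FULLWIDTH_OFFSET) if 0x21 <= ord(c) <= 0x7E else c
        for c in text
    )


def naive_scan(text: str, signature: str = "<script>") -> bool:
    """Hypothetical scanner: plain substring match, no normalization."""
    return signature in text.lower()


def fold_width(text: str) -> str:
    """Fold compatibility forms (including full-width) via NFKC."""
    return unicodedata.normalize("NFKC", text)


payload = to_fullwidth("<script>")

print(naive_scan(payload))              # scanner sees only full-width code points
print(naive_scan(fold_width(payload)))  # scanner sees the folded ASCII text
```

The point of the sketch: if the scanner matches on raw code points while the server behind it folds full-width forms to ASCII, the two disagree about what the request says. Normalizing before matching is one way to close that gap, assuming the scanner normalizes the same way the server does.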
You'd prefer securing against vulnerabilities in dozens, if not hundreds, of different encodings? The only people who are against Unicode are those who have never had to work with more than one written language in the same project. Yes, it's a lot easier to secure stuff when you only accept ASCII or ISO 8859-1/Windows CP-1252, but then you're limiting your software to about a third of the world (if that). Crappy engineers are going to write crappy code no matter what the encoding. No sense compromising for the sake of poorly written software.
To think that English doesn't fit in 7-bit ASCII is naïve.
Down below this post, there's a troll writing something like 'lol if u cant just use ASCII u shud let ur language die u foreign creeps lol k thx'.
And a whole bunch of people then jump on the troll and criticize him for his US-centrism, and so on, and the troll is at -1.
Yet the post I'm replying to, which is at +4, really comes to the same thing as this troll; it's simply UNIX 8-bit centric rather than USA ASCII centric.
The fact is, computers are used for text, and much if not most text is non-ASCII. How would you rather represent that text:
--With Unicode
--With KOI8, KOI8-R, KOI8-RU, EBCDIC, EUC-KR, EUC-JP, Shift-JIS, Shift-JIS-the-Jphone-version, ISCII, VISCII, ISO-2022-*, and the many, many other encodings that have evolved in different times and environments.
Seriously, which is going to be easier to secure (and otherwise manage) -- one encoding (which is HEAVILY documented and discussed) or a large number of encodings (the actual number being ever-changing and impossible to really know), many of which are poorly documented and carry forgotten ramifications and assumptions?
Right -- so now you know why people use Unicode so much.
But the interesting question is, why is one error ("All teh world is teh USA lol! Shouldn't you learn to speak English?") rightly jumped on and pounded flat, whereas another form that's actually more problematic ("All teh world is C on UNIX lolz!! Shouldn't you stop wanting dangerous extra features?") isn't?
Actually, I see in another window that some people have indeed been pounding the parent poster flat, so perhaps my question isn't valid after all.
Whence? Hence. Whither? Thither.