Unicode Encoding Flaw Widespread

← Back to Stories (view on slashdot.org)

Unicode Encoding Flaw Widespread

Posted by kdawson on Monday May 21, 2007 @06:33PM from the sneaking-past-the-IDS dept.

LordNikon writes "According to this CERT advisory: 'Full-width and half-width encoding is a technique for encoding Unicode characters. Various HTTP content scanning systems fail to properly scan full-width/half-width Unicode encoded HTTP traffic. By sending specially-crafted HTTP traffic to a vulnerable content scanning system, an attacker may be able to bypass that content scanning system.' A proof of concept affecting IIS is already being posted to security mailing lists. Cisco IPS and other IDS products are also affected." The CERT advisory lists 93 systems, with 6 reported as vulnerable (including 3com, Cisco, and Snort), 5 known not vulnerable (including Apple and HP), and the rest unknown.

15 of 184 comments (clear)

Limited impact. by shird · 2007-05-21 18:38 · Score: 3, Informative

This appears to be limited to content scanning, and isn't really a vulnerability in itself. Relying on content scanning to prevent an exploit to reach an exploitable system is a pretty bad idea, much better to fix the system than the extra layer of defense on the outside.

Content scanning is mostly useful against filtering known exploits, and is hardly meant to be your primary defense. Being able to bypass this scanning won't buy you much. If the content scanner is aware of an exploit it scans for, chances are so are the systems being targeted and are patched to protect against it.

--
I.O.U One Sig.
1. Re:Limited impact. by TheRaven64 · 2007-05-21 22:27 · Score: 4, Informative
  
  Windows makes no distinction between privileged and unprivileged ports, so any application that can open sockets can listen on port 80. That said, every port number (and every other object in the NT kernel) has an associated ACL, so it is possible to limit them on an individual basis. I've never seen this exposed to the UI though, so I've no idea how you'd go about doing it. Filesystem objects also have ACLs, so I'd imagine that IIS is not allowed access to the filesystem outside the tree it is sharing.
  The NT kernel provides a lot of facilities that are very useful for writing secure code. I often wonder if the application developers at Microsoft ever noticed that they weren't writing code on top of DOS anymore...
  
  --
  I am TheRaven on Soylent News
2. Re:Limited impact. by fatphil · 2007-05-21 23:32 · Score: 4, Insightful
  
  I think you've missed his point. There are now two ways that, for example, a quote character can be passed as user input to your program: either as " or as %ublah.
  
  Your program, sitting below the layer performing the unicode translations, doesn't need to do anything differently from before, as it doesn't matter which of the two methods were used. If you _relied on_ the layers above you to strip out, reject, escape, or whatever, quote characters, then you're writing teabag code, and should get a job selling flowers instead, as software engineering is beyond you.
  
  Always validate user input to your own specification. Never rely on something external to do it.
  
  This exploit hasn't changed the rules one little bit, it's just highlighted the fact that some idiots don't follow them.
  
  --
  Also FatPhil on SoylentNews, id 863
3. Re:Limited impact. by rabtech · 2007-05-22 03:11 · Score: 4, Informative
  
  The NT kernel has a root namespace for everything in the system (from local filesystems to network drives to sockets to synchronization objects like mutexes), and in fact treats everything as a file (just like Unix) underneath.
  
  Using the Native (NT Executive) API you can read or set the ACL on any object in the namespace, assuming you have the appropriate user rights and you own the object (or the ACL allows you to modify the permissions). NT kernel objects can also be case-sensitive (though that can confuse some Win32 programs). Often, you can delete, move, etc files that are locked by the Win32 subsystem, which can be useful in certain situations (though in Vista they made the IO system capable of cancelling outstanding IOs on its own so the zombie process bug that ends up locking files doesn't happen anymore. Its unfortunate Vista is so DRM-laden, or I'd try upgrading.)
  
  The APIs are NtQuerySecurityObject and NtSetSecurityObject and I believe the devices are in \Device\Tcp, \Device\Ip, \Device\RawIp, \Device\Udp, etc. Check out http://undocumented.ntinternals.net/ for more details on what is in the native API (ntdll). This API provides everything necessary to implement a full POSIX layer, which is exactly what Services for Unix does, installing itself as a new runtime subsystem right next to the Win32 subsystem. (With Server 2003 R2 SP2 they shipped it as an available component as part of the install; I've even got setuid support and GCC installed as part of the package.)
  
  --
  Natural != (nontoxic || beneficial)
Re:Send your claim in now by QuantumG · 2007-05-21 18:47 · Score: 4, Funny

IIS 6 hasn't had a public remotely exploitable bug in it. Ever. That's bullshit anyway, I've got dozens of remote exploits for IIS 6.

Oh, you said public.. hehe, forget I said anything.

--
How we know is more important than what we know.
Incident response by Anonymous Coward · 2007-05-21 19:11 · Score: 4, Interesting

I work incident response in a large web company (hence anonymous posting, natch) and currently we're treating this as "interesting, but case not proven". We test our web apps filter all input so I'm adding double-width unicode to our security regression test cases; however I'm happy to let the FD posters lab it out between them in the short term. These alleged IIS exploits don't work for us - which is not to say that we don't have some system, somewhere, for which this is an issue. At the end of the day it's just a clear restatement of something that's obvious to anyone - you need to filter input carefully, and you need to be aware of issues around alternative encodings. But it's not a "BRB" (big-red-button, ie emergency stop and all hands to the pumps to fix a vulnerability) issue for us - yet. The last time we had one of those, it was the Microsoft DNS server remote root... because most of our internal domain controllers were also running DNS servers.
Re:Not a surprise... by etnu · 2007-05-21 20:42 · Score: 5, Insightful

You'd prefer securing against vulnerabilities in dozens, if not hundreds of different encodings? The only people who are against Unicode are those that have never had to work with more than one written language in the same project. Yes, it's a lot easier to secure stuff when you only accept ASCII or ISO8859-1/Windows CP-1252, but then you're limiting your software to about a third of the world (if that). Crappy engineers are going to write crappy code no matter what the encoding. No sense compromising for the sake of poorly written software.
Re:Smelly foreigners by ettlz · 2007-05-21 20:57 · Score: 5, Funny

To think that English doesn't fit in 7-bit ASCII is na\"ive.
Re:Not a surprise... by KiloByte · 2007-05-21 21:23 · Score: 3, Insightful

Wrong, the flaw in Cisco's "security" software and IIS is due to them converting things to 8-bit charsets, not due to Unicode. In fact, the whole idea of "code pages" is fundamentally broken, as it assumes all data ever moves to another places only in the same region.

The idea of double-width characters is broken too, yeah, and they are there only to appease the users of some broken Chinese/Japanese software -- but there's nothing wrong with having strange characters in file names. They don't match any file they are not supposed to unless you try to shoehorn them into a limited character set.

So, it's a flaw in the software, not Unicode by itself.

--
The creatures outside looked from Alt-Right to Antifa; but already it was impossible to say which was which.
Re:Not a surprise... by kahei · 2007-05-21 22:19 · Score: 5, Insightful

Down below this post, there's a troll writing something like 'lol if u cant just use ASCII u shud let ur language die u foreign creeps lol k thx'.

And a whole bunch of people then jump on the troll and criticize him for his US-centrism, and so on, and the troll is at -1.

Yet the post I'm replying to, which is at +4, really comes to the same thing as this troll; it's simply UNIX 8-bit centric rather than USA ASCII centric.

The fact is, computers are used for text, and much if not most text is non-ASCII. How would you rather represent that text:

--With Unicode
--With KOI-8, KOI-8R, KOI-8RU, EBCDIC, EUC-KR, EUC-JP, shift-JIS, Shift-JIS-the-Jphone-version, ISCII, VISCII, ISO-2022-*, and the many many other encodings that have evolved in different times and environments.

Seriously, which is going to be easier to secure (and otherwise manage) -- one encoding (which is HEAVILY documented and discussed) or a large number of encodings (the actual number being ever-changing and impossible to really know) many of which are not well documented and have forgotten ramifications and assumptions?

Right -- so now you know why people use Unicode so much.

But the interesting question is, why is one error ("All teh world is teh USA lol! Shouldn't you learn to speak English?") rightly jumped on and pounded flat, whereas another form that's actually more problematic ("All teh world is C on UNIX lolz!! Shouldn't you stop wanting dangerous extra features?") isn't?

Actually, I see in another window that some people have indeed been pounding the parent poster flat, so perhaps my question isn't valid after all.

--
Whence? Hence. Whither? Thither.
Nothing to see, move along ... by udippel · 2007-05-21 22:24 · Score: 4, Insightful

It is a vulnerability, in the strict sense.
It is a self-inflicted misbehaviour as in common sense.
It is like those silly Cisco content inspectors on port 25, that try to avoid attacks on flimsy MTAs.
It is like someone dying from a jab against measles: the jab protected that person from contracting measles, actually.
It is like those stupid anti-virus programs that are more vulnerable than the daemons they profess to protect.

When the attacker uses a codepage different from the one that you think she ought to use, she can circumvent your content filter. Which ought not be an attack vector, in any case.

As I said: nothing to see, move along ...
Re:Smelly foreigners by TempeTerra · 2007-05-21 22:51 · Score: 3, Interesting

The notable difference between Chinese and English (or most other written languages) is that several English characters combine to form syllables, which combine to form words (i.e., we use an alphabet). In Chinese, each character corresponds directly with a word (each character is a logogram). If you're interested you can look up Alphabet on Wikipedia as a starting point, although I must admit I find the article hard to follow even though I know what it should be saying.

The practical result of this is that English is normally encoded as a long sequence of 0-25 values (a-z), whereas Chinese would be encoded as a shorter sequence of 0-~100,000 values (Wikipedia reports Chinese dictionaries with 85,000 characters). Naturally, there would be fewer Chinese characters required for a message as each character corresponds to an entire word.

I guess that since morse code is rather like binary and English letters can be encoded using 5 bits, Chinese morse codes would need to be... about 20 bits long? It's late at night, brain not work so good. It seems to me that morse codes using 20 dots/dashes would be extremely difficult to learn; but on the other hand it shouldn't be any more difficult than learning Chinese characters in the first place.

I wouldn't be surprised if English morse codes were more robust against poor data, siny Englxsh is stvll reahible even if sew2eral cheracter; are wrong.

Disclaimer: I don't know anything about the subject, I'm talking out of my elbow for the sake of discussion.

--
.evom ton seod gis eht
Re:Hmmmm.... by peragrin · 2007-05-21 22:55 · Score: 4, Interesting

1) unicode is better than having a hundred other encodes to debug
2)there's is nearly two billion chinese and Indians, who can't use your encoding.
3)I get just as much spam from US companies as I do foreign ones

--
i thought once I was found, but it was only a dream.
Re:Depends on alphabet size by rabtech · 2007-05-22 03:27 · Score: 4, Interesting

IIRC, China was on its way to moving to an alphabet system (certain characters can be used for their alphabetic sounds in various circumstances) and so was Japan (look at Katakana/Hirigana).

It is likely that the introduction of the printing press (and later mass media like TV/radio and computers) have "arrested" this natural evolution. It may also be possible that the development of a national identity and cohesive society tends to put the brakes on some developments as well - if a single unified language is mandated by culture or a central authority then local variations are much less important.

Romanji (and to a certain extent English itself) is definitely influencing the Japanese; the younger generations even moreso. Japan may end up using an alphabet for day to day needs almost exclusively within the next 100 years. The situation in China is much less clear but it will probably happen eventually.

If we look into the past, nearly all societies with ideographic/logographic writing systems eventually moved to an alphabetic system. Hell, even Ancient Egyptian Hieroglyphs were partially syllabic much like Katakana. Much as previous posters have pointed out, changing to an alphabetic system from Chinese-characters has allowed Korea to dramatically raise literacy rates. There is only so much time for schooling and memorization, and only so much effort to expend on literacy. If a simpler writing system is more accessible then that is a net gain, even if there are a few things that logographic writing systems do better than alphabetic ones.

--
Natural != (nontoxic || beneficial)
Re:Depends on alphabet size by loyukfai · 2007-05-22 11:35 · Score: 3, Interesting

IIRC, China was on its way to moving to an alphabet system (certain characters can be used for their alphabetic sounds in various circumstances)...
I'm a Chinese but I have never heard of this. Would you be so kind to educate me on this...? Where did you hear such things?
I'm serious.