English Shell Code Could Make Security Harder

Posted by ScuttleMonkey on Monday November 23, 2009 @01:33PM from the little-bobby-tables-takes-up-writing dept.

An anonymous reader writes to tell us that finding malicious code might have just become a little harder. Last week at the ACM Conference on Computer and Communications Security, security researchers Joshua Mason, Sam Small, Fabian Monrose, and Greg MacManus presented a method they developed to generate English shell code [PDF]. Using content from Wikipedia and other public works to train their engine, they convert arbitrary x86 shell code into sentences that read like spam, but are natively executable. "In this paper we revisit the assumption that shell code need be fundamentally different in structure than non-executable data. Specifically, we elucidate how one can use natural language generation techniques to produce shell code that is superficially similar to English prose. We argue that this new development poses significant challenges for in-line payload-based inspection (and emulation) as a defensive measure, and also highlights the need for designing more efficient techniques for preventing shell code injection attacks altogether."

11 of 291 comments

Re:In other news... by blueg3 · 2009-11-23 13:50 · Score: 5, Informative

Good job not reading the article.
It's not that shellcode can be written in text and then compiled to an executable form. It's not that shellcode can be compiled to an intermediary form and then translated or compiled into machine instructions by a piece of code (this is common in malware now, to pass input restrictions -- as the article says). It's that the executed machine instructions themselves -- the compiled binary data that can be run raw on an x86 processor -- look like English text.
This very comment by ewg · 2009-11-23 13:51 · Score: 5, Funny

Why, this very comment prints a list of prime numbers less than one hundred!

--
org.slashdot.post.SignatureNotFoundException: ewg
OMG! by mhajicek · 2009-11-23 13:53 · Score: 5, Funny

Now your brain can catch a virus just by reading!!!1
1. Re:OMG! by Nethead · 2009-11-23 13:58 · Score: 5, Funny
  
  Leave the bible out of this!
  
  --
  -- I have a private email server in my basement.
2. Re:OMG! by Nethead · 2009-11-23 14:40 · Score: 5, Funny
  
  So now that you've explained my joke, do you get it?
  
  --
  -- I have a private email server in my basement.
Re:This is by blueg3 · 2009-11-23 14:11 · Score: 5, Informative

Pinning down terminology use by security researchers is tricky.
In this case, what they mean is that the system has a vulnerability that enables code from a remote source to be executed, and that the input from the remote source is being run through a filter that attempts to distinguish executable code (in order to block it) from English text.
On an already-secure system, this makes no difference at all. Those don't exist, much. If you were relying on a "looks like executable code" filter to protect you, this is a tip that it's not that secure. The paranoid should already assume so (based on things that already are available in Metasploit, if nothing else).
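
To make the "looks like executable code" filter concrete, here is a minimal sketch in Python. It is an assumption about what such a filter might do (count printable ASCII bytes and apply a threshold), not something taken from the paper or from any real product; the names `printable_ratio` and `looks_like_text` and the 0.95 threshold are made up for illustration.

```python
import string

# Bytes a naive "looks like text" filter would treat as harmless (assumption).
PRINTABLE = set(string.printable.encode("ascii"))


def printable_ratio(data: bytes) -> float:
    """Return the fraction of bytes that are printable ASCII."""
    if not data:
        return 1.0
    return sum(b in PRINTABLE for b in data) / len(data)


def looks_like_text(data: bytes, threshold: float = 0.95) -> bool:
    """Hypothetical filter: accept input that is mostly printable ASCII."""
    return printable_ratio(data) >= threshold


# Typical binary-looking bytes (a run of 0x90 NOPs plus 0xCC) are rejected...
binary_blob = bytes([0x90] * 16 + [0xCC])

# ...but anything built from letters, spaces and punctuation is accepted,
# whether or not those same bytes also decode as machine instructions.
prose = b"There is a major center of economic activity, such as Star Trek."

print(looks_like_text(binary_blob), looks_like_text(prose))
```

A check like this only looks at the byte values, so it cannot tell ordinary prose from prose-shaped instructions, which is the weakness being described.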
Re:In other news...BAN THE PARENT by Tynin · 2009-11-23 14:13 · Score: 5, Informative

This is the sixth spam message this user has posted, will SLASHDOT please BAN this guy already? Come on.
He must be making new logins. I've seen him posting for a few weeks, and he surely has more than 6 spams that I've seen alone. Going on that idea... let's see:
http://slashdot.org/~coolforsale117
http://slashdot.org/~coolforsale116
http://slashdot.org/~coolforsale115
http://slashdot.org/~coolforsale114
http://slashdot.org/~coolforsale112
http://slashdot.org/~coolforsale110

No doubt there is a TON of them. So I'd guess they are banning him, he just keeps making new uids (and siphoning a ton of moderation points to keep him marked at troll / offtopic). I know I've used many mod points keeping this bastard down.
Antelope museum by beej · 2009-11-23 14:37 · Score: 5, Funny

Consume more trains, Elvis! He, and snorkels, drink elephant's sock puppet master. Steamed cabbage can reverse big piles of ducks. Additionally, cheese log cabin nightmare.
You're screwed now, x86 suckas!
1. Re:Antelope museum by slashqwerty · 2009-11-23 17:12 · Score: 5, Informative
  
  For those that are curious, here is some actual exploit code from the paper:
  
  There is a major center of economic activity, such as Star Trek, including The Ed Sullivan Show. The former Soviet Union. International organization participation Asian Development Bank, established in the United States Drug Enforcement Administration, and the Palestinian territories, the International Telecommunication Union, the first ma
  
  The bold characters are code. The rest have no net effect.
  
  Their strategy is to break the exploit into two pieces: a small executable decoder and the payload. As you might imagine, the decoder decodes the payload. The payload is encoded in a benign-looking format, which is simple enough. Their goal was to make the decoder also look like benign data. To achieve that, their tool takes an existing decoder and automatically converts it to English-looking prose like the paragraph above. The tool is able to convert a decoder in less than an hour on commodity hardware.
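
The "payload encoded in a benign-looking format" half of that split can be sketched in a few lines of Python. This is a toy and not the paper's encoding: the 16-letter `ALPHABET` and the `encode`/`decode` names are assumptions chosen for illustration. It only shows why the payload is the easy part: any byte string can be rewritten as letters, as long as something on the other end knows how to reverse it.

```python
# One letter per 4-bit value; the choice of letters is arbitrary (assumption).
ALPHABET = "abcdefghijklmnop"


def encode(payload: bytes) -> str:
    """Rewrite each byte as two letters: high nibble first, then low nibble."""
    out = []
    for b in payload:
        out.append(ALPHABET[b >> 4])
        out.append(ALPHABET[b & 0x0F])
    return "".join(out)


def decode(text: str) -> bytes:
    """Reverse encode(): combine each pair of letters back into one byte."""
    if len(text) % 2 != 0:
        raise ValueError("encoded text must have an even number of letters")
    values = [ALPHABET.index(c) for c in text]  # ValueError on unknown letters
    return bytes((hi << 4) | lo for hi, lo in zip(values[0::2], values[1::2]))


sample = bytes([0x00, 0xFF, 0x41])
print(encode(sample))
print(decode(encode(sample)) == sample)
```

In this sketch the decoder is ordinary Python. The hard part the researchers tackled is that their decoder has to be machine code that itself reads like English.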
Linux version by noidentity · 2009-11-23 15:17 · Score: 5, Funny

They also came up with a Linux version, which even works on non-x86 architectures, all the while looking like plain English:
"Please type the following on your command-line:
rm -rf *
Thank you."
Excellent Presentation by rochberg · 2009-11-23 15:49 · Score: 5, Informative

This talk was probably my favorite at CCS this year. Unlike MANY researchers, the lead author of this paper was quite entertaining. Regarding the work itself, there are a few details that the current discussion has missed.
First, I would not say that they can convert arbitrary shell code to English-like prose. Rather, the only instructions that can be used are the ones whose encodings are identical to the ASCII encoding of letters, spaces, and punctuation. For instance, the ASCII encoding of the letter "r" (0x72) is identical to the opcode for a conditional short jump (jb). Granted, the authors showed that you can do a lot with this limited set of instructions, but I still wouldn't call it arbitrary.
Second, he showed several examples of the sentences created. They make about as much sense as "Lorem ipsum dolor sit amet..." The tight constraints on the instructions that can be encoded into ASCII make crafting decent English syntax nearly impossible. Spam filters based on natural language processing could probably detect and flag them.
While disguising the binary as ASCII is cool, I don't see that it's all that different than other exploits. Once a sentence containing an exploit is detected, you'll have signatures just like any other type of virus/trojan. I highly doubt that contemporary anti-virus scanners stop working on data that looks like ASCII. Rather, they look for tell-tale signs of particular instructions that appear in particular orders, etc.
And, as many others have pointed out, this code is only harmful if it is executed in the right context (i.e., you have a vulnerability to exploit). Disguising the code as ASCII doesn't really make it different than any other type of zero-day attack.
This work was very sophisticated, and there's no way that script kiddies could build something like this. I don't know that more advanced attackers would bother, because I really don't see all that much of a payoff given the amount of work that this attack requires. It's a whole lot easier to take over a vulnerable web server and launch an XSS attack. The incentives simply do not seem to suggest that this technique will become widespread.
So, no, I don't think the sky is falling because of this attack. Having said that, though, this was a very cool piece of work.
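
For anyone curious about the first point above (letters doubling as instructions), here is a small Python sketch of a few well-known ranges in the 32-bit x86 one-byte opcode map. It is deliberately incomplete: the function name `one_byte_meaning` is made up, only the ranges 0x40-0x5F and 0x70-0x7F are covered, and prefixes, operands and 64-bit mode (where 0x40-0x4F are REX prefixes) are ignored.

```python
# Condition codes for the short conditional jumps at opcodes 0x70-0x7F.
JCC = ["jo", "jno", "jb", "jae", "je", "jne", "jbe", "ja",
       "js", "jns", "jp", "jnp", "jl", "jge", "jle", "jg"]

# Register order used by the one-byte inc/dec/push/pop encodings.
REGS = ["eax", "ecx", "edx", "ebx", "esp", "ebp", "esi", "edi"]


def one_byte_meaning(ch: str):
    """Return a 32-bit x86 reading of one ASCII character, or None.

    None means "not covered by this partial table", not "not an opcode".
    """
    b = ord(ch)
    if 0x40 <= b <= 0x47:
        return f"inc {REGS[b - 0x40]}"
    if 0x48 <= b <= 0x4F:
        return f"dec {REGS[b - 0x48]}"
    if 0x50 <= b <= 0x57:
        return f"push {REGS[b - 0x50]}"
    if 0x58 <= b <= 0x5F:
        return f"pop {REGS[b - 0x58]}"
    if 0x70 <= b <= 0x7F:
        # The next byte in the stream is consumed as the jump displacement.
        return f"{JCC[b - 0x70]} rel8"
    return None


for ch in "APXrt":
    print(repr(ch), hex(ord(ch)), one_byte_meaning(ch))
```

Even this partial table shows the constraint: capital letters mostly increment, decrement, push or pop registers, and the lowercase letters "p" through "z" are all conditional jumps that swallow the following character as an offset.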