Computer Scientists Develop 'Mathematical Jigsaw Puzzles' To Encrypt Software

← Back to Stories (view on slashdot.org)

Computer Scientists Develop 'Mathematical Jigsaw Puzzles' To Encrypt Software

Posted by samzenpus on Wednesday July 31, 2013 @04:32PM from the looking-for-edges dept.

another random user writes "The claim here is that the encrypted software can be executed, but not reverse-engineered. To quote from the article: 'UCLA computer science professor Amit Sahai and a team of researchers have designed a system to encrypt software so that it only allows someone to use a program as intended while preventing any deciphering of the code behind it. According to Sahai, previously developed techniques for obfuscation presented only a "speed bump," forcing an attacker to spend some effort, perhaps a few days, trying to reverse-engineer the software. The new system, he said, puts up an "iron wall," making it impossible for an adversary to reverse-engineer the software without solving mathematical problems that take hundreds of years to work out on today's computers — a game-change in the field of cryptography.'"

1 of 245 comments (clear)

Min score:

Reason:

Sort:

Re:I Call BS by greg1104 · 2013-07-31 20:46 · Score: 5, Informative

You can make the algorithm as complex as you'd like, but at the end of the day, you have an input and output(s). You may decide to take a long time to get there, but at the end of the day, I know what you did when the code ran.
This is referred to as "black-box" reverse engineering in the paper. You know what the code did for one input. And if you injected every possible input to the program, at the end you'd have worked out a complete specifications for the function. But how long will that take? It's not always "the end of the day". For functions with a wide range of inputs, it could take the life of the universe to map them all unless you know what you're looking for in advance.
Right now, obfuscation approaches for software usually have some small chocking point to attack. It might be an encryption wrapper around the main binary. Bust that with a debugger, you can get to unobfuscated code for the main system, and then really start to work your way through the program.
But if you have to fight this every step of the way, where all you do is inject inputs and get an output to figure out what the program does, it will take you forever to reverse engineer things. That's the claim of the paper. The code itself is so obfuscated that you can't read it straight and understand it, ever. It looks like random junk. All you can do is run it with an input and see what comes back, and from that reverse engineer what it does. Assemble enough of those and you can see how the program really works. But that black box teardown process--determining possible inputs, injecting them, collecting outputs, and then deriving the function behavior--is time intensive enough that it may not be practical to actually do it anymore. You don't learn anything from reversing any component that speeds handling the next; you have to attack them all like this.
There's a great line from the seminal paper on this subjectOn the (Im)possibility of Obfuscating Programs: "Any programmer knows that total unintelligibility is the natural state of computer programs (and one must work hard in order to keep a program from deteriorating into this state)" Extracting meaning from source code, being able to predict what some lines of code will do, it's hard. Ideally you'll just be able to read the code, make sense of it, and then reverse from there. Most systems that are thoroughly cracked have some sections where it's hard, but once you get those the remaining code is straightforward to read. And in fact, it's impossible to make something where you cannot reverse it. The question though is how long it will take.
If no sense is ever made of the code, you can only apply the "black-box" reverse engineering, where you inject inputs, watch outputs, and from there determine what the code does. You can't prevent that, but you can make the box so big that such work is impossible to do. That's what this technique tries to accomplish. You never find an easy part to the code you can read; all you ever find are ones where you have to map every input into an output to figure out what the code does.