Morphing Code to Prevent Reverse Engineering?
ptolemu writes "Cringely's latest article discusses a new obfuscation technique currently being researched called PSCP (Program State Code Protection). An informative read that concludes with some interesting insight on the software giants that heavily depend on this kind of technology."
Java (and subsequently .NET) bytecode made a reverse engineer's life a bit easier on the whole, because it could be decompiled into source that was extremely similar to the original.
All this seems to do is remove that benefit and force the reverse engineer to approach it the same old way one would approach a compiled C program (as you described, with a debugger and hooks on syscalls). Or bust out a new type of disassembler that emulates execution, records traces, and dumps them to an assembly listing.
But you're right, it's not really that mind-blowing if the reverse engineer has worked on non-Java/non-.NET binaries before.
I wonder if they've seen the proof of the impossibility of obfuscating programs?
I have found that most code generation tools (the kind you program with bubbles and arrows, like this one) will give you C code that looks like it's been obfuscated on purpose.
E.g. all states and variables are in an array called n[][] and the program is basically a big loop.
It's quite impossible to know what's going on.
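To make that concrete, here is a rough sketch of what such table-driven output tends to look like, written in Java rather than C so it runs as-is. The traffic-light example, the class name TableDrivenLight, and the layout of n[][] are all made up for illustration; no particular tool emits exactly this.

```java
// Hypothetical sketch of "generated-looking" code: a three-state machine
// flattened into a table called n[][] and driven by one big loop.
class TableDrivenLight {
    // n[state][0] = next state, n[state][1] = ticks to stay in that state.
    // The states were once green (0), yellow (1) and red (2), but nothing
    // in the table itself says so, which is why it is hard to read.
    static final int[][] n = {
        {1, 3},
        {2, 1},
        {0, 2},
    };

    // Runs the machine and returns the state that was active on each tick.
    static int[] run(int ticks) {
        int[] trace = new int[ticks];
        int s = 0;        // current state
        int c = n[s][1];  // ticks remaining in the current state
        for (int t = 0; t < ticks; t++) {  // the "big loop"
            trace[t] = s;
            if (--c == 0) {   // time is up: follow the table to the next state
                s = n[s][0];
                c = n[s][1];
            }
        }
        return trace;
    }

    public static void main(String[] args) {
        System.out.println(java.util.Arrays.toString(run(8)));
    }
}
```

All the meaning lived in the diagram; once it is reduced to integers in a table, a reader has to rebuild the state names and transitions by hand.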
Once the virus writers get hold of this, viruses will be much harder to catch, unless anti-virus writers start looking more for virus-like activity.
Of course, virus writers have been using this since the early 1990s. One particular virus called Ontario III (there might be others before it) used this trick. An interesting part from the virus writeup: "The Ontario III virus uses a very complex form of encryption with no more than two bytes remaining constant in replicated samples."
Want to improve your Karma? Instead of "Post Anonymously", try the "Post Humously" option.
Having worked with Java bytecodes when I took compilers, I will say that you can get really close to the original program by looking at the bytecodes. You can't tell if someone used a while loop or a for loop, but you can still reconstruct the loop from the code.
The Java Virtual Machine is a stack machine: its instructions work on an operand stack rather than on CPU registers. There's a separate memory store for local variables. That tends to make it easy to tell exactly what data is being operated on at any given time.
I've seen Java decompilers that return very clear, readable code.
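The stack-machine point is easier to see with a toy. Below is a minimal sketch of a stack interpreter with a separate local-variable store. The opcodes (PUSH, LOAD, STORE, ADD, JGE, JMP, HALT) are invented for this example and only loosely modeled on the JVM's load/store/add/branch instructions; this is not real JVM bytecode.

```java
// Hypothetical toy stack machine: operands live on a stack, named data
// lives in a separate locals array, so every instruction says exactly
// which local it reads or writes.
class TinyStackMachine {
    static final int PUSH = 0, LOAD = 1, STORE = 2, ADD = 3, JGE = 4, JMP = 5, HALT = 6;

    // Executes the program and returns the final local-variable store.
    static int[] run(int[] code, int numLocals) {
        int[] locals = new int[numLocals];
        int[] stack = new int[16];
        int sp = 0, pc = 0;
        while (true) {
            int op = code[pc++];
            switch (op) {
                case PUSH:  stack[sp++] = code[pc++]; break;          // push a constant
                case LOAD:  stack[sp++] = locals[code[pc++]]; break;  // push local k
                case STORE: locals[code[pc++]] = stack[--sp]; break;  // pop into local k
                case ADD: {
                    int b = stack[--sp], a = stack[--sp];
                    stack[sp++] = a + b;
                    break;
                }
                case JGE: {  // pop b, pop a, jump to target if a >= b
                    int target = code[pc++];
                    int b = stack[--sp], a = stack[--sp];
                    if (a >= b) pc = target;
                    break;
                }
                case JMP:  pc = code[pc]; break;
                case HALT: return locals;
                default: throw new IllegalStateException("bad opcode " + op);
            }
        }
    }

    // Encodes: sum = 0; i = 0; while (i < 5) { sum += i; i++; }
    static final int[] SUM_LOOP = {
        PUSH, 0, STORE, 0,                 //  0: sum = 0   (local 0)
        PUSH, 0, STORE, 1,                 //  4: i = 0     (local 1)
        LOAD, 1, PUSH, 5, JGE, 30,         //  8: if (i >= 5) goto 30
        LOAD, 0, LOAD, 1, ADD, STORE, 0,   // 14: sum = sum + i
        LOAD, 1, PUSH, 1, ADD, STORE, 1,   // 21: i = i + 1
        JMP, 8,                            // 28: back to the loop test
        HALT                               // 30
    };

    public static void main(String[] args) {
        int[] locals = run(SUM_LOOP, 2);
        System.out.println("sum=" + locals[0] + " i=" + locals[1]);  // prints sum=10 i=5
    }
}
```

The backward JMP to a conditional test is all that remains of the loop: you can rebuild a loop from it, but nothing in the instruction stream says whether the source used while or for.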
Cloakware also has some nice obfuscation technologies
The GPL says "The source code for a work means the preferred form of the work for making modifications to it." Some obfuscated derivative of the source code doesn't count.