Machine Learning Used For JavaScript Code De-obfuscation
New submitter velco writes: "ETH Zurich Software Reliability Lab announced JSNice, a statistical de-obfuscation and de-minification tool for JavaScript. The interesting thing about JSNice is that it combines program analysis with machine learning techniques to build a database of name and type regularities from large amounts of available open source code on GitHub. Then, given new JavaScript code, JSNice tries to infer the most likely names and types for that code by basing its decision on the learned regularities in the training phase."
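To make the "learned regularities" idea concrete, here is a toy sketch, not JSNice's actual model (which is surely more sophisticated than a lookup table). It assumes each variable can be described by a single usage-context string and simply picks the name most often seen with that context in the training data. The functions `train` and `predictName`, the context strings, and the tiny corpus are all made up for illustration.

```javascript
// Toy illustration only: names, contexts and corpus are hypothetical,
// and this frequency table is NOT how JSNice itself is implemented.

// Training phase: build a table mapping context -> (name -> count).
function train(samples) {
  const table = new Map();
  for (const { context, name } of samples) {
    if (!table.has(context)) table.set(context, new Map());
    const counts = table.get(context);
    counts.set(name, (counts.get(name) || 0) + 1);
  }
  return table;
}

// Prediction phase: return the most frequent name seen for this context,
// or the fallback (e.g. the minified name) if the context was never seen.
function predictName(table, context, fallback) {
  const counts = table.get(context);
  if (!counts) return fallback;
  let best = fallback;
  let bestCount = 0;
  for (const [name, count] of counts) {
    if (count > bestCount) {
      best = name;
      bestCount = count;
    }
  }
  return best;
}

// Made-up "open source corpus" of (usage context, chosen name) pairs.
const corpus = [
  { context: "assigned to .width", name: "width" },
  { context: "assigned to .width", name: "w" },
  { context: "assigned to .width", name: "width" },
  { context: "compared with .length", name: "i" },
];

const model = train(corpus);
console.log(predictName(model, "assigned to .width", "a")); // width
console.log(predictName(model, "never seen before", "a"));  // a
```

The fallback case hints at why heavily obfuscated input may see little improvement: if the usage contexts don't resemble anything in the training data, there is nothing to base a better name on.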
Minification is obfuscation; if you try running some _really_ obfuscated code through it, nothing really improves.
Mine! All Mine! The javascript! It's all Mine!
this and that
Since this is pretty similar to duck typing, I nominate "UnDuck" since it can also be used as a verb. As in, "I was finally able to tell what that obfuscated JS library does after I unducked it."
The development of tools like these started out of necessity for figuring out old COBOL code.
I hope to christ you are trolling.
Now we just run every JavaScript program through an obfuscator then JSNice and we have consistent naming.
I tried it on my obfuscated code and it made no improvement.
I tried it on a minified jQuery 1.7.2 and got:
Error compiling input:
Line 3: Parse error. missing ) after condition
Line 3: Parse error. unterminated string literal
Line 4: Parse error. missing ; before statement
Line 4: Parse error. syntax error
Line 4: Parse error. missing ) in parenthetical
Line 4: Parse error. missing } after property list
Line 4: Parse error. illegal character
Line 4: Parse error. syntax error
Line 4: Parse error. illegal character
Line 4: Parse error. illegal character
This could be a good candidate for JavaScript code refactoring when people are building large-scale JS-based applications.
I wonder what R. M. Stallman would say about this. Maybe he will feel 60% freer to browse the web now?
How is this different than jsunpack-n?
This tool looks very intriguing, so I took it for a spin with some malicious code (all samples are from malicious drive-by sites seen in the last 24 hours).
Sort of useful, I guess, but ultimately not an essential feature for malicious JavaScript analysis. I think the tool would be more useful for legitimate JS reverse-engineering tasks, as the obfuscated JS involved is much, much bigger.
function loadVideo(css,src,version,flashvars,width,height){
$(css).flash({src:src,
width:width,height:height,
menu:'false',
allowfullscreen:'true',
allowscriptaccess:'always',
flashvars:flashvars
},{version:version});}
becomes:
/**
* @param {?} data
* @param {string} src
* @param {string} browserVersion
* @param {?} dataAndEvents
* @param {number} w
* @param {number} rowHeight
* @return {undefined}
*/
function loadVideo(data, src, browserVersion, dataAndEvents, w, rowHeight) {
$(data).flash({
src : src,
width : w,
height : rowHeight,
menu : "false",
allowfullscreen : "true",
allowscriptaccess : "always",
flashvars : dataAndEvents
}, {
version : browserVersion
});
};
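For what it's worth, the two versions above differ only in parameter names and formatting. A rough way to check that is to run both against a stub, as sketched below. The `$` here is a hypothetical stand-in that just records its arguments (not real jQuery or the Flash plugin), and the function names `loadVideoOriginal` / `loadVideoRenamed` and the sample arguments are made up so both versions can live in one file.

```javascript
// Hypothetical stand-in for jQuery plus the Flash plugin: it only records
// the arguments it receives so the two calls can be compared.
const calls = [];
function $(selector) {
  return {
    flash(options, pluginOptions) {
      calls.push(JSON.stringify({ selector, options, pluginOptions }));
    },
  };
}

// Original version, as posted (renamed only to avoid a clash).
function loadVideoOriginal(css, src, version, flashvars, width, height) {
  $(css).flash({
    src: src, width: width, height: height,
    menu: 'false', allowfullscreen: 'true', allowscriptaccess: 'always',
    flashvars: flashvars
  }, { version: version });
}

// JSNice output, as posted (renamed only to avoid a clash).
function loadVideoRenamed(data, src, browserVersion, dataAndEvents, w, rowHeight) {
  $(data).flash({
    src: src, width: w, height: rowHeight,
    menu: "false", allowfullscreen: "true", allowscriptaccess: "always",
    flashvars: dataAndEvents
  }, { version: browserVersion });
}

// Made-up sample arguments; any values would do.
const args = ["#player", "movie.swf", "9.0.0", "autoplay=1", 640, 480];
loadVideoOriginal(...args);
loadVideoRenamed(...args);
console.log(calls[0] === calls[1]); // true
```

So the behavior is preserved; whether `dataAndEvents` is a better name than `flashvars` is another matter.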
Wow there are still idiots who fall for that ancient and obvious trollbait? Hopefully your excuse is that you're autistic.
Yeah, I don't think it added much value to this code: https://gist.github.com/nickwb/3d95944feaa2d9409a57
I hear there's a bug in the string length() method that miscounts by 1.
systemd is Roko's Basilisk.
Yeah, but how seriously can you take "i". It's just a line with a dot. It's like some spastic tried to draw a straight line and then his pen jumped from the paper. It's not really a letter; it doesn't deserve to be counted.
On the other hand, the S... look at that S. That's some big fat S, guys! It's got two curves flowing into capital perfection. It's an S so nice, it should count twice!
So I guess you're right.