Machine Learning Used For JavaScript Code De-obfuscation

Posted by Soulskill on Tuesday June 3, 2014 @10:10AM from the cleaning-up-the-digital-streets dept.

New submitter velco writes: "ETH Zurich Software Reliability Lab announced JSNice, a statistical de-obfuscation and de-minification tool for JavaScript. The interesting thing about JSNice is that it combines program analysis with machine learning techniques to build a database of name and type regularities from large amounts of available open source code on GitHub. Then, given new JavaScript code, JSNice tries to infer the most likely names and types for that code by basing its decision on the learned regularities in the training phase."
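
The summary describes a two-step idea: learn name regularities from existing code, then pick the most likely name for a variable in new code. As a rough illustration only, here is a toy frequency-count version. It is not JSNice's actual model, which this thread does not describe. The "context" strings, `train`, and `predictName` are all hypothetical names made up for this sketch, and the training data is invented.

```javascript
// Toy sketch only: JSNice's real model is not described in this thread.
// A "context" here is just a string describing how a variable is used
// (a hypothetical feature format), and prediction is a simple vote.

// Count how often each name appears with each usage context.
function train(examples) {
  const counts = new Map(); // context -> Map(name -> count)
  for (const { context, name } of examples) {
    if (!counts.has(context)) counts.set(context, new Map());
    const byName = counts.get(context);
    byName.set(name, (byName.get(name) || 0) + 1);
  }
  return counts;
}

// Score candidate names by summing counts over all observed contexts,
// then return the highest-scoring name (or the fallback if none match).
function predictName(counts, contexts, fallback) {
  const scores = new Map();
  for (const context of contexts) {
    const byName = counts.get(context);
    if (!byName) continue;
    for (const [name, n] of byName) {
      scores.set(name, (scores.get(name) || 0) + n);
    }
  }
  let best = fallback;
  let bestScore = 0;
  for (const [name, score] of scores) {
    if (score > bestScore) {
      best = name;
      bestScore = score;
    }
  }
  return best;
}

// Made-up training data standing in for a corpus of open source code.
const model = train([
  { context: "compared-with:.length", name: "i" },
  { context: "compared-with:.length", name: "i" },
  { context: "compared-with:.length", name: "index" },
  { context: "index-into-array", name: "i" },
  { context: "receiver-of:getTime", name: "date" },
]);

// Guess names for three minified variables from how they are used.
const guessForA = predictName(model, ["compared-with:.length", "index-into-array"], "a");
const guessForB = predictName(model, ["receiver-of:getTime"], "b");
const guessForC = predictName(model, ["never-seen"], "c");
console.log(guessForA, guessForB, guessForC);
```

A variable with no recognizable usage keeps its minified name, which may be one reason heavily obfuscated input sees little improvement.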

31 comments

Not de-obfuscation by Anonymous Coward · 2014-06-03 10:21 · Score: 0

Minification is obfuscation; if you try running some _really_ obfuscated code through it, nothing really improves.
1. Re:Not de-obfuscation by Anonymous Coward · 2014-06-03 13:25 · Score: 0
  
  I am genuinely confused. If minification is obfuscation, and this thing de-minifies, how is that not de-obfuscation?
2. Re:Not de-obfuscation by Anonymous Coward · 2014-06-03 17:14 · Score: 0
  
  It may be able to de-minify by adding back some whitespace and indentation, which is the easy part because that's based purely on the language syntax. I think the hard part is restoring meaningful names to the obfuscated vars, funcs, and objs.
3. Re:Not de-obfuscation by hotcut · 2014-06-03 18:48 · Score: 2
  
  Yes, the hard part is getting meaningful names back - which is exactly what the article is about; they claim to have found a way to do it. Granted, I have my doubts about how good it can possibly be - but on the other hand, it is an interesting project that may prove useful.
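
For the "easy part" mentioned in this subthread (putting whitespace and indentation back), a deliberately naive sketch might look like the following. The function name `reindent` and its parameters are made up for illustration. It only looks at braces, semicolons, and quoted strings, so it is nowhere near what a real pretty-printer built on a parser does.

```javascript
// Naive sketch: re-indents code using only braces and semicolons.
// It tracks quoted strings so braces inside them are ignored, but it
// does not understand comments, regex literals, template strings, or
// for (;;) headers. A real pretty-printer works from a proper parser.
function reindent(src, indentUnit = "  ") {
  let out = "";
  let depth = 0;
  let quote = null; // the active quote character while inside a string

  // Start a fresh line at the current nesting depth.
  const newline = () => {
    out = out.trimEnd() + "\n" + indentUnit.repeat(depth);
  };

  for (let i = 0; i < src.length; i++) {
    const ch = src[i];
    if (quote) {
      out += ch;
      if (ch === "\\") {
        // Copy the escaped character verbatim so \" does not end the string.
        if (i + 1 < src.length) out += src[++i];
      } else if (ch === quote) {
        quote = null;
      }
    } else if (ch === '"' || ch === "'") {
      quote = ch;
      out += ch;
    } else if (ch === "{") {
      out = out.trimEnd() + " {";
      depth++;
      newline();
    } else if (ch === "}") {
      depth = Math.max(0, depth - 1);
      newline();
      out += "}";
      newline();
    } else if (ch === ";") {
      out += ";";
      newline();
    } else {
      out += ch;
    }
  }
  return out.trim();
}

const pretty = reindent("function f(a){return a+1;}");
console.log(pretty);
```

Note that nothing here touches the identifiers: `a` stays `a`, which is exactly the gap the naming discussion above is about.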
Muhhahahah! by Anonymous Coward · 2014-06-03 10:24 · Score: 0

Mine! All Mine! The javascript! It's all Mine!
So, what are you good at, JSNice? by albacrankie · 2014-06-03 10:29 · Score: 2

this and that
Re:Biggest beneficiary: Minecraft mods by Anaerin · 2014-06-03 10:32 · Score: 2
1. Minecraft is written in Java, not JavaScript.
2. The MCP (Minecraft Coder Pack) already has a deobfuscator built in (kinda sorta)
Needs Better Name by Anonymous Coward · 2014-06-03 10:32 · Score: 0

Since this is pretty similar to duck typing, I nominate "UnDuck" since it can also be used as a verb. As in, "I was finally able to tell what that obfuscated JS library does after I unducked it."
Hahahaha! by pigiron · 2014-06-03 10:34 · Score: 4, Funny

The development of tools like these started out of necessity for figuring out old COBOL code.
1. Re:Hahahaha! by K. S. Kyosuke · 2014-06-03 11:49 · Score: 2
  
  If DIVIDE X BY Y GIVING Z REMAINDER W is the minified version, I'm not sure I want to see the un-minified one!
  
  --
  Ezekiel 23:20
2. Re:Hahahaha! by Anonymous Coward · 2014-06-03 13:32 · Score: 2, Funny
  
  That would be
  "DIVIDE REC-WORKER-TOTAL-ANNUAL-SALARY BY WS-HOURS-IN-FISCAL-YEAR
  GIVING WS-HOURLY-RATE REMAINDER WS-ANNUAL-BONUS."
  or something similar.
3. Re: Hahahaha! by Anonymous Coward · 2014-06-04 08:43 · Score: 0
  
  Some of the early development tools were created out of necessity for programming the original COBOL code. Everything progresses based on earlier work...
Re:Biggest beneficiary: Minecraft mods by Anonymous Coward · 2014-06-03 10:47 · Score: 0

I hope to christ you are trolling.
Finally consistent naming by orionpi · 2014-06-03 10:50 · Score: 5, Funny

Now we just run every JavaScript program through an obfuscator then JSNice and we have consistent naming.
1. Re:Finally consistent naming by Anonymous Coward · 2014-06-03 11:11 · Score: 1
  
  "Now we just run every JavaScript program through an obfuscator then JSNice and we have consistent naming."
  You laugh, but I have tried it.
  The naming isn't as good as you would like, but for some projects, it may be an improvement. o.O
Didn't work for me by Anonymous Coward · 2014-06-03 10:55 · Score: 0

I tried it on my obfuscated code and it made no improvement.
Fail by viperidaenz · 2014-06-03 11:10 · Score: 0

I tried it on a minified jquery 1.7.2 and got:
Error compiling input:
Line 3: Parse error. missing ) after condition
Line 3: Parse error. unterminated string literal
Line 4: Parse error. missing ; before statement
Line 4: Parse error. syntax error
Line 4: Parse error. missing ) in parenthetical
Line 4: Parse error. missing } after property list
Line 4: Parse error. illegal character
Line 4: Parse error. syntax error
Line 4: Parse error. illegal character
Line 4: Parse error. illegal character
1. Re:Fail by marchosancho · 2014-06-03 11:53 · Score: 5, Informative
  
  Hi, Thanks for trying the tool out. I tried http://code.jquery.com/jquery-... (from here: http://blog.jquery.com/2012/03...) and it worked fine. best, Martin
2. Re:Fail by Menkhaf · 2014-06-03 18:38 · Score: 1
  
  Now that you're here...
  I tried it on this hunk of JavaScript: http://pastebin.com/miGDVkdf , but all I got was a parse error:
  "// Error contacting the server...
  parsererror
  SyntaxError: Unexpected token :"
  
  --
  A proud member of the Onion-in-Hand alliance
potential tool for JS code refactoring by Anonymous Coward · 2014-06-03 11:13 · Score: 0

This could be a good candidate for JavaScript code refactoring when people are building large-scale JS-based applications.
RMS by Anonymous Coward · 2014-06-03 11:25 · Score: 0

I wonder what R. M. Stallman would say about this. Maybe he will feel 60% freer to browse the web now?
1. Re:RMS by tepples · 2014-06-03 12:43 · Score: 1
  
  I don't think he would. Code distributed under terms that prohibit modification is still distributed under terms that prohibit modification, whether or not it's possible to convert it into a form suitable for making modifications.
jsunpack? by Anonymous Coward · 2014-06-03 12:34 · Score: 0

How is this different than jsunpack-n?
As an exploit kit researcher.... by guardiangod · 2014-06-03 13:28 · Score: 3, Interesting

This tool looks very intriguing, so I took it for a spin with some malicious code (all of it from malicious drive-by sites in the last 24 hours).

/** @type {function (string): *} */
e = eval;
/** @type {string} */
v = "0" + "x";
/** @type {number} */
a = 0;
try {
  a *= 2;
} catch (q) {
  /** @type {number} */
  a = 1;
}
if (!a) {
  try {
    document["bod" + "y"]++;
  } catch (q$$1) {
    /** @type {string} */
    a2 = "_";
  }
  z = "2f_6d_*snip*"["split"](a2);
  /** @type {string} */
  za = "";
  /** @type {number} */
  i = 0;
  for (;i < z.length;i++) {
    za += String["fromCharCode"](e(v + z[i]) - sa);
  }
  zaz = za;
  e(zaz);
}
/**
 * @param {string} n
 * @param {string} k
 * @param {number} v
 * @param {string} reason
 * @return {undefined}
 */
function SetCookie(n, k, v, reason) {
  /** @type {Date} */
  var defaultCenturyStart = new Date;
  /** @type {Date} */
  var expiryDate = new Date;

Sort of useful, I guess. But ultimately not an essential feature for malicious JavaScript analysis. I think the tool would be more useful for legitimate JS reverse-engineering tasks, as the obfuscated JS there is much, much bigger.
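
For anyone curious what the loop in that sample is doing: it splits a "_"-separated string of hex values, subtracts an offset from each, and turns the results into characters before passing them to eval. Below is a rough sketch of decoding such a payload statically, without calling eval at all. `decodePayload` is a made-up helper name, the `offset` parameter stands in for the sample's `sa` (whose value is not shown in the post), and the test input is invented rather than taken from the sample.

```javascript
// Sketch of a static decoder for the pattern in the snippet above.
// Assumptions: the payload is separator-delimited hex, and `offset`
// plays the role of the sample's `sa`, whose value is not shown.
function decodePayload(encoded, separator, offset) {
  return encoded
    .split(separator)
    // parseInt(hex, 16) replaces the sample's eval("0x" + hex).
    .map((hex) => String.fromCharCode(parseInt(hex, 16) - offset))
    .join("");
}

// Made-up input (not from the sample): "hi" with every code shifted by 1.
const decoded = decodePayload("69_6a", "_", 1);
console.log(decoded);
```

Using `parseInt` instead of `eval` means the decoded string can be inspected without ever executing the payload.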
Try some unobfuscated JS for fun! by Anonymous Coward · 2014-06-03 16:09 · Score: 0

function loadVideo(css,src,version,flashvars,width,height){
  $(css).flash({src:src,
    width:width,height:height,
    menu:'false',
    allowfullscreen:'true',
    allowscriptaccess:'always',
    flashvars:flashvars
  },{version:version});}
becomes:
/**
 * @param {?} data
 * @param {string} src
 * @param {string} browserVersion
 * @param {?} dataAndEvents
 * @param {number} w
 * @param {number} rowHeight
 * @return {undefined}
 */
function loadVideo(data, src, browserVersion, dataAndEvents, w, rowHeight) {
  $(data).flash({
    src : src,
    width : w,
    height : rowHeight,
    menu : "false",
    allowfullscreen : "true",
    allowscriptaccess : "always",
    flashvars : dataAndEvents
  }, {
    version : browserVersion
  });
};
Re:Biggest beneficiary: Minecraft mods by Anonymous Coward · 2014-06-03 17:02 · Score: 0

Wow there are still idiots who fall for that ancient and obvious trollbait? Hopefully your excuse is that you're autistic.
As an exploit kit researcher.... by Anonymous Coward · 2014-06-03 17:53 · Score: 0

Yeah, I don't think it added much value to this code: https://gist.github.com/nickwb/3d95944feaa2d9409a57
Re:Biggest beneficiary: Minecraft mods by wonkey_monkey · 2014-06-03 19:23 · Score: 2

I hear there's a bug in the string length() method that miscounts by 1.

--
systemd is Roko's Basilisk.
Re:Biggest beneficiary: Minecraft mods by Anonymous Coward · 2014-06-03 20:16 · Score: 0

Yeah, but how seriously can you take "i". It's just a line with a dot. It's like some spastic tried to draw a straight line and then his pen jumped from the paper. It's not really a letter; it doesn't deserve to be counted.
On the other hand, the S... look at that S. That's some big fat S, guys! It's got two curves flowing into capital perfection. It's an S so nice, it should count twice!
So I guess you're right.