Slashdot Mirror


Machine Learning Used For JavaScript Code De-obfuscation

New submitter velco writes: "ETH Zurich Software Reliability Lab announced JSNice, a statistical de-obfuscation and de-minification tool for JavaScript. The interesting thing about JSNice is that it combines program analysis with machine learning techniques to build a database of name and type regularities from large amounts of available open source code on GitHub. Then, given new JavaScript code, JSNice tries to infer the most likely names and types for that code by basing its decision on the learned regularities in the training phase."
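As a rough illustration of the idea in the summary, here is a toy sketch that learns name regularities from example code and then picks the most likely name for an unnamed variable. It is not JSNice's actual algorithm, which the summary does not detail. It uses a simple frequency count, and the feature strings, the tiny corpus, and the function names `train` and `predictName` are all hypothetical.

```javascript
// Toy illustration only: count which names co-occur with which usage
// "features" in training code, then pick the highest-scoring name for a
// new set of features. All names and features here are hypothetical.

// Build a table: feature -> (name -> number of times seen together).
function train(examples) {
  const counts = new Map();
  for (const { name, features } of examples) {
    for (const f of features) {
      if (!counts.has(f)) counts.set(f, new Map());
      const byName = counts.get(f);
      byName.set(name, (byName.get(name) || 0) + 1);
    }
  }
  return counts;
}

// Sum the counts for every candidate name across the observed features
// and return the best one, or `fallback` if nothing matches.
function predictName(counts, features, fallback) {
  const scores = new Map();
  for (const f of features) {
    const byName = counts.get(f);
    if (!byName) continue;
    for (const [name, c] of byName) {
      scores.set(name, (scores.get(name) || 0) + c);
    }
  }
  let best = fallback;
  let bestScore = 0;
  for (const [name, s] of scores) {
    if (s > bestScore) {
      best = name;
      bestScore = s;
    }
  }
  return best;
}

// Made-up "training corpus" standing in for names mined from open source code.
const corpus = [
  { name: "i", features: ["loop-counter", "compared-to:length"] },
  { name: "i", features: ["loop-counter", "index-into-array"] },
  { name: "len", features: ["assigned-from:length"] },
  { name: "expiryDate", features: ["new:Date", "passed-to:setTime"] },
];

const model = train(corpus);
// A minified variable `a` that is used as a loop counter and array index.
const guess = predictName(model, ["loop-counter", "index-into-array"], "a");
console.log(guess);
```

A variable whose usage matches nothing in the table keeps its original (minified) name, which is why the fallback argument is passed in.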

10 of 31 comments

  1. So, what are you good at, JSNice? by albacrankie · · Score: 2

    this and that

  2. Re:Biggest beneficiary: Minecraft mods by Anaerin · · Score: 2
    1. Minecraft is written in Java, not JavaScript.
    2. The MCP (Minecraft Coder Pack) already has a deobfuscator built in (kinda sorta)
  3. Hahahaha! by pigiron · · Score: 4, Funny

    The development of tools like these started out of necessity for figuring out old COBOL code.

    1. Re:Hahahaha! by K. S. Kyosuke · · Score: 2

      If DIVIDE X BY Y GIVING Z REMAINDER W is the minified version, I'm not sure I want to see the un-minified one!

      --
      Ezekiel 23:20
    2. Re:Hahahaha! by Anonymous Coward · · Score: 2, Funny

      That would be
          "DIVIDE REC-WORKER-TOTAL-ANNUAL-SALARY BY WS-HOURS-IN-FISCAL-YEAR
          GIVING WS-HOURLY-RATE REMAINDER WS-ANNUAL-BONUS."
      or something similar.

  4. Finally consistent naming by orionpi · · Score: 5, Funny

    Now we just run every JavaScript program through an obfuscator then JSNice and we have consistent naming.

  5. Re:Fail by marchosancho · · Score: 5, Informative

    Hi, Thanks for trying the tool out. I tried http://code.jquery.com/jquery-... (from here: http://blog.jquery.com/2012/03...) and it worked fine. best, Martin

  6. As an exploit kit researcher... by guardiangod · · Score: 3, Interesting

    This tool looks very intriguing, so I took it for a spin with some malicious code (all of it from malicious drive-by sites seen in the last 24 hours). Here is the output:

    /** @type {function (string): *} */
    e = eval;
    /** @type {string} */
    v = "0" + "x";
    /** @type {number} */
    a = 0;
    try {
      a *= 2;
    } catch (q) {
      /** @type {number} */
      a = 1;
    }
    if (!a) {
      try {
        document["bod" + "y"]++;
      } catch (q$$1) {
        /** @type {string} */
        a2 = "_";
      }
      z = "2f_6d_*snip*"["split"](a2);
      /** @type {string} */
      za = "";
      /** @type {number} */
      i = 0;
      for (;i < z.length;i++) {
        za += String["fromCharCode"](e(v + z[i]) - sa);
      }
      zaz = za;
      e(zaz);
    }
    /**
     * @param {string} n
     * @param {string} k
     * @param {number} v
     * @param {string} reason
     * @return {undefined}
     */
    function SetCookie(n, k, v, reason) {
      /** @type {Date} */
      var defaultCenturyStart = new Date;
      /** @type {Date} */
      var expiryDate = new Date;

    Sort of useful, I guess, but ultimately not an essential feature for malicious JavaScript analysis. I think the tool would be more useful for legitimate JS reverse-engineering tasks, since the obfuscated JS involved there is much, much bigger.
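For anyone following along, the decode loop in the output above can be reproduced for analysis without calling `eval` on anything. This is only a sketch: the function name `decodePayload` is made up, the `offset` parameter stands in for `sa` (which is not defined in the part of the output shown, so its real value is unknown), and the sample payload is constructed for the example, not taken from the snipped string.

```javascript
// Sketch: decode a delimiter-separated hex payload without using eval.
// Mirrors the loop above: split on the delimiter, read each chunk as a
// hex number, subtract an offset, and turn the result into a character.
// `offset` is a stand-in for the undefined `sa`; its real value is unknown.
function decodePayload(encoded, delimiter, offset) {
  return encoded
    .split(delimiter)
    .map((hex) => String.fromCharCode(parseInt(hex, 16) - offset))
    .join("");
}

// Hypothetical payload built for this example: the text "hi" with every
// character code shifted up by 1 (0x68 -> 0x69, 0x69 -> 0x6a).
const sample = "69_6a";
const decoded = decodePayload(sample, "_", 1);
console.log(decoded);
```

Using `parseInt(hex, 16)` in place of `eval("0x" + hex)` yields the same number for valid hex chunks, and the decoded string is only printed, never executed.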

  7. Re:Not de-obfuscation by hotcut · · Score: 2

    Yes, the hard part is getting meaningful names back, which is exactly what the article is about; they claim to have found a way to do it. Granted, I have my doubts about how good it can possibly be, but it is an interesting project that may prove useful.

  8. Re:Biggest beneficiary: Minecraft mods by wonkey_monkey · · Score: 2

    I hear there's a bug in the string length() method that miscounts by 1.

    --
    systemd is Roko's Basilisk.