Slashdot Mirror


Microsoft Builds JavaScript Malware Detection Tool

Trailrunner7 writes "As browser-based exploits and specifically JavaScript malware have shouldered their way to the top of the list of threats, browser vendors have been scrambling to find effective defenses to protect users. Few have been forthcoming, but Microsoft Research has developed a new tool called Zozzle that can be deployed in the browser and can detect JavaScript-based malware on the fly at a very high effectiveness rate. Zozzle is designed to perform static analysis of JavaScript code on a given site and quickly determine whether the code is malicious and includes an exploit. In order to be effective, the tool must be trained to recognize the elements that are common to malicious JavaScript, and the researchers behind it stress that it works best on de-obfuscated code."

19 of 88 comments

  1. Too little, too late? by Nialin · · Score: 2

    Firefox for 4+ years, and never looked back.

  2. The Question by Dunbal · · Score: 3, Funny

    Does this malware tool come with its own exploits built in like all the other Microsoft software?

    --
    Seven puppies were harmed during the making of this post.
  3. De-obfuscated code? by aneroid · · Score: 5, Insightful

    and the researchers behind it stress that it works best on de-obfuscated code.

    ...because all sites infecting visitors' machines with malware through JavaScript have JS code in clear, reading-friendly syntax.

    1. Re:De-obfuscated code? by clang_jangle · · Score: 2

      and the researchers behind it stress that it works best on de-obfuscated code.

      ...because all sites infecting visitors' machines with malware through JavaScript have JS code in clear, reading-friendly syntax.

      Exactly. IOW, once a human finds the malicious js and marks it this zoozle thingie can "find" it. Hoo boi, looks like Redmond's back in the innovatin' game!

      --
      Caveat Utilitor
    2. Re:De-obfuscated code? by v1 · · Score: 2

      and the researchers behind it stress that it works best on de-obfuscated code.

      So we're safe until they start obfuscating their code? Wait, aren't they doing that already?

      This needs to fall squarely under "defective by design", right along with "somebody, please ask the malware makers to not obfuscate their code."

      --
      I work for the Department of Redundancy Department.
    3. Re:De-obfuscated code? by sydneyfong · · Score: 2

      This is valid Javascript equivalent to alert('hello');


      1['\164\157\123\164\162\151\156\147']
      ['\143\157\156\163\164\162\165\143\164\157\162']
      ('$','\141\154\145\162\164($)')('\150\145\154\154\157')

      Try to beautify that.
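For what it's worth, the octal escapes in the snippet above can be undone mechanically. The sketch below is only an illustration: `decodeOctalEscapes` is a hypothetical helper, not part of any tool mentioned in the article, and it handles only `\NNN` octal escapes, not obfuscation in general.

```javascript
// Hypothetical helper: decode \NNN octal escapes the way a JavaScript
// string literal would, leaving everything else untouched.
function decodeOctalEscapes(text) {
  return text.replace(/\\([0-7]{1,3})/g, (_, digits) =>
    String.fromCharCode(parseInt(digits, 8)));
}

// The snippet from the comment, with backslashes doubled so the escapes
// reach the decoder as literal text instead of being decoded by the parser.
const obfuscated =
  "1['\\164\\157\\123\\164\\162\\151\\156\\147']" +
  "['\\143\\157\\156\\163\\164\\162\\165\\143\\164\\157\\162']" +
  "('$','\\141\\154\\145\\162\\164($)')('\\150\\145\\154\\154\\157')";

const readable = decodeOctalEscapes(obfuscated);
console.log(readable);
// prints: 1['toString']['constructor']('$','alert($)')('hello')
```

Read that way, `1['toString']` is `Number.prototype.toString`, its `constructor` is `Function`, and `Function('$','alert($)')` builds a one-argument function that is immediately called with `'hello'`.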

      --
      Don't quote me on this.
    4. Re:De-obfuscated code? by LordLimecat · · Score: 2

      Was going to make the same snarky comment, but then kept reading the article where it states:

      We start by augmenting the JavaScript engine in a browser with a “deobfuscator” that extracts and collects individual fragments of JavaScript. As discussed above, exploits are frequently buried under multiple levels of JavaScript eval.

      Looks like they're aware of that little problem; supposedly they can deal with it (at least in theory).
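A rough idea of what "collecting fragments buried under multiple levels of eval" could look like, as a page-level sketch. The article describes a hook inside the JavaScript engine; this version only shadows `eval` from script code, so it is an assumption-laden illustration and not the researchers' implementation. `makeCollector` and `hookedEval` are hypothetical names, and the sketch handles only fragments that are expressions.

```javascript
// Hypothetical sketch: record every string handed to eval, including
// strings produced by nested eval calls, before running it.
function makeCollector() {
  const fragments = [];

  function hookedEval(src) {
    fragments.push(src); // this is what a classifier would get to inspect
    // Run the fragment with `eval` shadowed by the hook, so that inner
    // eval(...) calls are recorded too. Expression fragments only.
    const run = new Function("eval", "return (" + src + ");");
    return run(hookedEval);
  }

  return { hookedEval, fragments };
}

const { hookedEval, fragments } = makeCollector();

// Three layers: eval of a string that evals a string that computes 1 + 2.
const layered = "eval('eval(\"1 + 2\")')";
const result = hookedEval(layered);

console.log(result);           // 3
console.log(fragments.length); // 3
console.log(fragments[2]);     // 1 + 2
```

The innermost fragment shows up in clear text only because the code actually ran, which is the point of doing this at run time rather than by inspecting the page source.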

  4. Wrong direction by a_claudiu · · Score: 5, Insightful

    What is a malicious JavaScript? I assume that for them it is JavaScript that takes advantage of your browser's flaws. Good luck analyzing a language that has an eval function.

    You should just sandbox the JavaScript properly instead of adding an extra layer of bloatware.

    1. Re:Wrong direction by wierd_w · · Score: 2

      I agree, but analyzing what is being run in the sandbox would be nice also, since it could help detect escalation and jailbreak attempts from the sandboxed execution.

      Sandboxing is a great idea, but actively looking at the executed code to catch and halt escalations would be good too. You just need to be frugal and smart in your analysis methods so you don't slow down the JavaScript's execution to a snail's pace.

      Something simple like validating heap allocations to ensure an overflow doesn't happen, or doing a quick hash check against the sandbox memory for known binary signatures of escalation payloads, might be a good approach to catch the most common kinds of attack. (The exploit causes a jump in execution, but there must be binary code ready to jump to for the escalation to work. You look for this binary blob with a quick hash check against the sandbox's memory before starting execution.)
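One way to picture the "quick hash check" idea is a rolling-hash scan (Rabin-Karp style) over a buffer. This is a sketch under assumptions: the memory is modelled as a `Uint8Array`, the signature bytes are made up for illustration, and `findSignature` is a hypothetical name. A real monitor would work on the engine's own heap and a real signature set.

```javascript
const BASE = 257;
const MOD = 1000003; // small prime so all products stay well below 2^53

// Polynomial hash of `length` bytes starting at `start`.
function hashBytes(bytes, start, length) {
  let h = 0;
  for (let i = 0; i < length; i++) h = (h * BASE + bytes[start + i]) % MOD;
  return h;
}

// Byte-by-byte comparison, used to confirm a hash match.
function bytesEqual(memory, offset, signature) {
  for (let i = 0; i < signature.length; i++) {
    if (memory[offset + i] !== signature[i]) return false;
  }
  return true;
}

// Returns the first offset where `signature` occurs in `memory`, or -1.
function findSignature(memory, signature) {
  const n = signature.length;
  if (n === 0 || memory.length < n) return -1;
  const target = hashBytes(signature, 0, n);
  let highPower = 1; // BASE^(n-1) mod MOD, weight of the outgoing byte
  for (let i = 1; i < n; i++) highPower = (highPower * BASE) % MOD;

  let h = hashBytes(memory, 0, n);
  for (let pos = 0; ; pos++) {
    if (h === target && bytesEqual(memory, pos, signature)) return pos;
    if (pos + n >= memory.length) return -1;
    // Slide the window: drop memory[pos], append memory[pos + n].
    h = (h - (memory[pos] * highPower) % MOD + MOD) % MOD;
    h = (h * BASE + memory[pos + n]) % MOD;
  }
}

// Made-up example data, not a real payload.
const memory = Uint8Array.from([0x00, 0x90, 0x90, 0xcc, 0xde, 0xad, 0xbe, 0xef, 0x00]);
const signature = Uint8Array.from([0xde, 0xad, 0xbe, 0xef]);
console.log(findSignature(memory, signature)); // 4
```

The hash comparison keeps the scan close to one pass over the buffer; the byte comparison only runs on candidate matches.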

      Granted, being too nosey about what the sandbox is doing opens up the potential for new escalation paths, (What happens if the monitor detects too many exploit packages in the execution stream to properly report? Can you get the monitor itself to cause the escalation by feeding it certain inputs? etc.) so the less it checks the specifics of execution, the better.

    2. Re:Wrong direction by sakdoctor · · Score: 2

      var evilbit=new Boolean(true);

    3. Re:Wrong direction by Jahava · · Score: 3, Interesting

      Hear Hear. Rather than fixing the flaws in their browser, MS has chosen to add even more code that blocks the code that exploits those flaws. Talk about wallpapering over the sledgehammer holes in their drywall - and blaming the paper-er for their flaws - not the hammer-er - in the process.

      Have you ever heard of defense in depth? Microsoft will (likely) continue to fix bugs in their browser, just like everyone else, and will hopefully learn from their mistakes and improve their process for doing so. However, you cannot patch a bug you don't know about. Having something intelligent enough to block un-patched exploits until the bug is fixed seems worthwhile.

      Then again, if this tool is ever distributed to users, malware authors will just revise their code until the tool can't detect it. This tool, if ever distributed, will just make malware authors' lives harder (which I'm fine with). Microsoft's idea seems poorly thought out, but so is your comment.

    4. Re:Wrong direction by HiThere · · Score: 2

      The thing is, trying to figure out what code is doing is essentially the halting problem, and that has been proven to be insoluble. (Of course, it's useful to be able to solve it for partitions of the domain, but you can't solve it for an arbitrary program in a Turing complete language...and even solving it for the partitions that are normal programs is probably impossible. [Normal programs have limitations like finite memory store, etc.])

      With that said, for any particular boundary of size x it is possible to solve the halting problem with a program that uses no more than a*2^x + b*x + c units of memory, where a, b, and c are constants chosen so that the equation works for all of x == 0, x == 1, and x == 5G. (Note that I didn't specify that this is a minimal bound, nor what unit x measures: bits, bytes, words, kilobytes, it doesn't really matter.) The point is that the system solving the problem has to be immensely more powerful (both in RAM available and in cycles used) than the system that is being solved for.

      E.g., for a small state space one can just trace all paths to every state that halts. This obviously only works in a finite space. And you have to allow for cycles where only one variable is changing (e.g.: for (x = 0; x < 2^25; x++) continue; ), which will eventually halt.

      Clearly smarter programs can handle simple degenerate cases in much less space, but to handle all of them requires tracing the possible paths of execution (forwards or backwards, or both, it doesn't matter).

      So while programs that attempt this are interesting, they can't reasonably be general. And sandboxing is a much better and easier solution.
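The finite-state case described above can be sketched in a few lines. This assumes a deterministic machine whose whole state fits in one number or string, with `step` returning the next state or `null` on halt; `haltsOnFiniteMachine` is a hypothetical name. The checker remembers every state it has seen, which is why it needs far more memory than the machine it is checking.

```javascript
// Decide halting for a deterministic machine with finitely many states.
// A repeated state means the run has entered a cycle and never halts.
function haltsOnFiniteMachine(step, start) {
  const seen = new Set(); // grows with the number of distinct states visited
  let state = start;
  while (state !== null) {
    if (seen.has(state)) return false; // same state again: loops forever
    seen.add(state);
    state = step(state);
  }
  return true; // reached the halt marker
}

// A counting loop like the one in the comment, with a small bound so the
// sketch runs quickly: only one variable changes, and it eventually halts.
const countUp = (x) => (x < 1000 ? x + 1 : null);

// A machine that cycles through eight states and never halts.
const spin = (x) => (x + 1) % 8;

console.log(haltsOnFiniteMachine(countUp, 0)); // true
console.log(haltsOnFiniteMachine(spin, 0));    // false
```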

      --

      I think we've pushed this "anyone can grow up to be president" thing too far.
  5. Questionable by Mathinker · · Score: 3, Interesting

    FTA: "ZOZZLE makes use of a statistical classifier to efficiently identify malicious JavaScript. The classifier needs training data to accurately classify JavaScript source"

    It seems that they're using Bayesian (or other) classification techniques like those in spam identification tools. One wonders what percentage of false alarms are going to be set off. When I use NoScript to disable JS for a website, at least I have control over it.

    My guess is that this isn't going to be that much more effective than current tools, unless, perhaps, there is some kind of fast data sharing going on between users via a global database used for classification. Frankly, I think it would be more useful to have the tool interact with an existing anti-malware/anti-virus (so it could use its alarms as part of the classification process --- something like, "Hmm, the A/V says something suspicious happened right after executing this JS code, maybe we should flag it").

    That's not going to help much on Linux now, since practically no one runs A/V. OTOH, most Linux JS malware would probably infect the browser itself rather than the OS, I suspect.
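To make the spam-filter comparison concrete, here is a toy naive Bayes classifier over identifier tokens. It is a guess at the general technique only: the article does not say this is how Zozzle extracts features or scores code, the four training samples are invented, and the names `tokenize`, `train` and `classify` are hypothetical.

```javascript
// Split source text into identifier-like tokens.
function tokenize(source) {
  return source.match(/[A-Za-z_$][\w$]*/g) || [];
}

// Count token occurrences per label ("malicious" or "benign").
function train(samples) {
  const model = {
    counts: { malicious: new Map(), benign: new Map() },
    totals: { malicious: 0, benign: 0 },
    docs: { malicious: 0, benign: 0 },
    vocab: new Set(),
  };
  for (const { source, label } of samples) {
    model.docs[label]++;
    for (const tok of tokenize(source)) {
      model.counts[label].set(tok, (model.counts[label].get(tok) || 0) + 1);
      model.totals[label]++;
      model.vocab.add(tok);
    }
  }
  return model;
}

// Log prior plus log likelihoods with add-one (Laplace) smoothing.
function logScore(model, label, tokens) {
  const totalDocs = model.docs.malicious + model.docs.benign;
  let score = Math.log(model.docs[label] / totalDocs);
  const denom = model.totals[label] + model.vocab.size;
  for (const tok of tokens) {
    score += Math.log(((model.counts[label].get(tok) || 0) + 1) / denom);
  }
  return score;
}

function classify(model, source) {
  const tokens = tokenize(source);
  return logScore(model, "malicious", tokens) > logScore(model, "benign", tokens)
    ? "malicious"
    : "benign";
}

// Invented training data, far too small to mean anything in practice.
const model = train([
  { label: "malicious", source: "var s = unescape('%u9090'); eval(s);" },
  { label: "malicious", source: "eval(unescape(payload)); spray(heap);" },
  { label: "benign", source: "document.getElementById('menu').focus();" },
  { label: "benign", source: "var total = price * quantity; render(total);" },
]);

console.log(classify(model, "eval(unescape(x))")); // malicious
console.log(classify(model, "render(total)"));     // benign
```

The false-alarm worry is visible even here: any benign script that happens to use `eval` and `unescape` would score as malicious with this training set.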

    1. Re:Questionable by VortexCortex · · Score: 4, Insightful

      FTA: "ZOZZLE makes use of a statistical classifier to efficiently identify malicious JavaScript. The classifier needs training data to accurately classify JavaScript source"

      It seems that they're using Bayesian (or other) classification techniques like those in spam identification tools. One wonders what percentage of false alarms are going to be set off. When I use NoScript to disable JS for a website, at least I have control over it.

      This is a useless endeavor. There is an infinite number of ways to do the same damn thing in JS.

      Consider the following:

      javascript:function%20u64%28s%29%7Bvar%20h%2Co%2Cb%2Cc%2Cp%3Bb%3Dc%3Dp%3D0%3Bo%3D%22%22%3Bwhile%28p%3Cs.length%29%7Bh%3Ds.charCodeAt%28p%29-47%3Bif%28h%3C0%29h%3D0%3Bif%28h%3E10%29h-%3D7%3Bif%28h%3E36%29h-%3D4%3Bif%28h%3E37%29h-%3D1%3Bb%3D%28b%3C%3C6%29%7Ch%3Bc+%3D6%3Bp++%3Bwhile%28c%3E6%29%7Bo+%3DString.fromCharCode%28%28b%3E%3E%28c-7%29%29%26127%29%3Bc-%3D7%7D%7Dreturn%20o%7D%3Beval%28u64%28%22kvBjAccIoBEId_tQZ3IAomsyabUiFHTP1bouwCDGRbJik%22%29%29%3Bvoid%280%29%3B

      This is valid JavaScript. It is equivalent to pasting the following into your address bar:
      javascript:alert('Pattern Detection Is Stupid.');

      JS has an eval() function. Game over folks. You can encrypt your code, and decrypt it on the fly, then eval it. The above code uses URL encoding and Base64. The above code contains a Base64 decoder along with the data to decode. A base64 encoder/decoder pair can be generated on the fly; each will use a non standard scrambled alphabet.
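For readers who would rather not paste that into an address bar, here is a readable rewrite of the decoder. Strictly speaking it is Base64-like rather than standard Base64: working through the arithmetic in `u64`, each character maps to a 6-bit value in the alphabet `/`, `0-9`, `A-Z`, `_`, `a-z`, and the bit stream is cut into 7-bit character codes. `decode6to7` is a hypothetical name; unlike the original, it throws on characters outside the alphabet instead of silently mapping some of them to 0, and it returns the string instead of passing it to eval.

```javascript
// Alphabet implied by the charCode arithmetic in the original u64 function.
const ALPHABET =
  "/0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ_abcdefghijklmnopqrstuvwxyz";

function decode6to7(encoded) {
  let bits = 0;     // pending bits, newest at the low end
  let bitCount = 0; // number of valid pending bits
  let out = "";
  for (const ch of encoded) {
    const value = ALPHABET.indexOf(ch);
    if (value < 0) throw new Error("unexpected character: " + ch);
    // At most 6 bits are pending before each append, so 12 bits suffice.
    bits = ((bits << 6) | value) & 0xfff;
    bitCount += 6;
    if (bitCount >= 7) {
      // Emit the oldest 7 pending bits as one character code.
      out += String.fromCharCode((bits >> (bitCount - 7)) & 127);
      bitCount -= 7;
    }
  }
  return out;
}

const payload = "kvBjAccIoBEId_tQZ3IAomsyabUiFHTP1bouwCDGRbJik";
console.log(decode6to7(payload));
// prints: alert('Pattern Detection Is Stupid.');
```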

      Base64 was used for simplicity, but RSA, multiple "URL escape" passes, or any other combination of ciphers can be used. Bonus: Ajax can be used to fetch the decryption key, which makes it impossible to decode the JS unless the JS is running. Any solution complex enough to detect all JS malware would be as complex as the JS engine itself.

      I can hear some gears beginning to turn: "just intercept the eval calls".

      Wrong.

      Consider this:
      document.write( 'alert("Pattern Detection Is Stupid.");' );

      You can use document.write to output more JS, that will then be interpreted after the current script block.
      The output JS can decode a bit more JS and run it via eval and/or output it again. As many layers as you like can be used. Code can also be obfuscated server side on the fly.

      Fix the engine, don't add a filter for it because it's insecure! This is more security theater, just like TSA. We protect against known threats, the evil doers just think of a new way each time that we aren't protecting ourselves against yet. MS should be hardening their JS engine, but code auditing seems to be too much work for them (too bad it's not open source). The solution to terrorist bombs is not TSA, it is Explosion Proof Planes & fully automated cockpits (big red button to enable full autopilot). The solution to JS exploits is an Exploit Proof JS Engine & fully isolated VM.

      IMO, JS should be properly sandboxed or ditched altogether. For the sake of speed modern browsers compile JS into machine code and run it directly on the metal... That's right folks, all your JS code is inherently a remote code execution!

      Hint: Any code running directly on the metal can not be properly sandboxed unless you use a VM.
      If we're not going to use the hardware VM features we shouldn't be running JS on the metal or you risk an error causing a remote execution exploit.

      A simple software VM would be ideal, but a software VM for JS must be complex, and almost as inefficient as interpreting and executing the code inline.

      tl;dr: Ditch JS or sandbox it in a proper VM. Until then our human errors in the JS engines will always lead to vulnerabilities.

      Note: If it's not in a VM, I won't ever consider the code to be "sandboxed".

  6. Punchline by Anonymous Coward · · Score: 3, Funny

    The app is called Internet Explorer. And it finds ALL the javascript malware!

  7. Too much malware protection may alter good scripts by hcs_$reboot · · Score: 3, Insightful

    I think it was in IE7 that Microsoft decided to block, by default, the use of "prompt" in JavaScript to help fight phishing.
    Technically this was probably not a good idea, as programmers with a minimum of skills can emulate the "prompt" behavior via a DIV.
    What happened anyway is that many people could not use some pages normally, and were looking for remedies on the Net (like disabling the "feature").
    MS should not go against the standards, but work with them instead, and build a secure approach more smartly.

    Let's hope this new tool will not cause more problems than it can solve.

    --
    Slashdot, fix the reply notifications... You won't get away with it...
  8. No obfuscation please by irober02 · · Score: 3, Funny

    Dear Malware Writer, I've just installed this cool MS malware/JS detector but it doesn't work with obfuscated code so, please don't hide your tricky JS code otherwise I won't be able to stop you abusing my computer. thanks, much appreciated. ;-)

  9. Problem is that JavaScript obfuscation is easy by seifried · · Score: 2
  10. Sandbox is why Java was better than JavaScript by billstewart · · Score: 2

    Sure, there were other reasons, but fundamentally, Javascript has been a big hole in browsers since it was introduced. If you're going to let unknown people run untrusted code on your machine, you need to run it in a sandbox where it can't do any damage. It's possible to write clean, safe, reliable Javascript, but it's also possible to write malicious or broken Javascript, and if you've got Javascript turned on, then you're allowing malware to find whatever holes your browser has.

    It helps to run NoScript, and ad-blockers, and Ghostery, but even with that, the amount of ostensibly-non-malicious Javascript and flash out there on pages I want to see is enough that Firefox often tries to burn the entire CPU (and one of the nice things about dual-core machines is that now when that happens, FF is stuck on one core and the rest of my machine is still working fine.)

    --

    Bill Stewart
    New Fast-Compression-only CPR http://preview.tinyurl.com/dy575ks