Slashdot Mirror


Avast Launches Open-Source Decompiler For Machine Code (techspot.com)

Greg Synek reports via TechSpot: To help with the reverse engineering of malware, Avast has released an open-source version of its machine-code decompiler, RetDec, that has been under development for over seven years. RetDec supports a variety of architectures aside from those used on traditional desktops including ARM, PIC32, PowerPC and MIPS. As Internet of Things devices proliferate throughout our homes and inside private businesses, being able to effectively analyze the code running on all of these new devices becomes a necessity to ensure security. In addition to the open-source version found on GitHub, RetDec is also being provided as a web service.

Simply upload a supported executable or machine code and get a reasonably rebuilt version of the source code. It is not possible to retrieve the exact original code of any executable compiled to machine code but obtaining a working or almost working copy of equivalent code can greatly expedite the reverse engineering of software. For any curious developers out there, a REST API is also provided to allow third-party applications to use the decompilation service. A plugin for IDA disassembler is also available for those experienced with decompiling software.

9 of 113 comments

  1. Re:Wow! So many architectures! by Anonymous Coward · · Score: 5, Insightful

    Get over yourself and stop complaining about things being given away to you for free. It's a shame that people complain about open source software when it's being given to them for free. The decompiler could have never been released to the public or released as a closed source program. Your complaint about the architectures it supports or doesn't support totally rings hollow.

  2. Re:Wow! So many architectures! by J053 · · Score: 4, Informative

    ...but no x86_64.

  3. Should crossref with github. by shess · · Score: 4, Interesting

    Perhaps if you built a fingerprint based on the structure of calls across functions, you could map it back to source code from github. Not that malware is generally posted to github, but I'd be surprised if they didn't use a TON of third_party libraries, and factoring all of those out would make what's left easier to understand and also let you focus better.
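One rough way to sketch the call-structure fingerprint idea: hash the shape of the call graph while ignoring function names, so a stripped binary and a named library build can still match. Everything here is an assumption for illustration: the `{caller: [callees]}` input format, the function name `call_graph_fingerprint`, and the choice of a Weisfeiler-Lehman-style label refinement. It is not how RetDec or any GitHub index works.

```python
import hashlib

def call_graph_fingerprint(call_graph, rounds=2):
    """Name-independent hash of a call graph given as {caller: [callees]}."""
    nodes = set(call_graph)
    for callees in call_graph.values():
        nodes.update(callees)
    # Start each function with a label that depends only on how many calls it makes.
    labels = {n: str(len(call_graph.get(n, []))) for n in nodes}
    for _ in range(rounds):
        new_labels = {}
        for n in nodes:
            # Mix in the labels of the callees, sorted so call order is ignored.
            callee_labels = sorted(labels[c] for c in call_graph.get(n, []))
            combined = labels[n] + "(" + ",".join(callee_labels) + ")"
            new_labels[n] = hashlib.sha256(combined.encode()).hexdigest()[:8]
        labels = new_labels
    # The fingerprint is a hash of the sorted multiset of final labels.
    summary = ",".join(sorted(labels.values()))
    return hashlib.sha256(summary.encode()).hexdigest()

named = {"main": ["parse", "emit"], "parse": ["read"]}
stripped = {"sub_1": ["sub_3", "sub_2"], "sub_2": ["sub_4"]}
print(call_graph_fingerprint(named) == call_graph_fingerprint(stripped))  # True
```

A real matcher would fingerprint small subgraphs rather than a whole program, so that one linked-in library can be recognized and factored out on its own.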

  4. encryption by bugs2squash · · Score: 4, Funny

    unfortunately it de-compiles the machine code to perl.

    --
    Nullius in verba
  5. Re:A debugger does this by Tony Isaac · · Score: 3, Insightful

    One problem with a lot of those old debuggers and disassemblers was that they weren't that smart about what they were looking at. You often had to tell them a range of memory to disassemble, and they would blindly treat everything they saw as code, even if it was actually data. This was partly a problem because in those days, code and data weren't so neatly divided from one another; everything could live anywhere in memory. It was actually common for software to "poke" data into memory and then execute it. Ah, the good old days.

  6. Re:decompiles INTO WHAT ? by Anonymous Coward · · Score: 3, Informative

    "no mention in the article of what the decompiler actually decompiles to .."

    According to https://github.com/avast-tl/retdec:
    Output in two high-level languages: C and a Python-like language.

  7. Re:Wow! So many architectures! by jonwil · · Score: 3, Insightful

    The thing is open source, if you really want x86-64, grab the code and write something :)

  8. Re:A debugger does this by AmiMoJo · · Score: 4, Interesting

    Indeed, poking code is often the fastest way to do stuff on those older systems where memory bandwidth and CPU clocks are very limited.

    We called it speedcode back in the day. Say you wanted to calculate and plot a load of points on the screen. Normally you would calculate the coordinates, store them and then later pass a reference to some plotting function. To do it faster you could turn calls to the plot function into an unrolled series of instructions, and instead of reading the coordinates every time just poke them directly into the immediate instruction op-codes.
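A toy illustration of the speedcode trick, assuming a made-up byte-coded machine rather than any real CPU: the `PLOT` and `HALT` opcodes, the 3-byte instruction layout, and all function names are hypothetical. The point is only that the coordinates live inside the instruction stream as immediate operands and are poked in place before the unrolled code runs.

```python
# Hypothetical machine: PLOT is a 3-byte instruction (opcode, x, y); HALT is 1 byte.
PLOT, HALT = 0x01, 0xFF

def build_speedcode(n_points):
    """Unrolled program: one PLOT per point, with operands left as zero for now."""
    return bytearray([PLOT, 0, 0] * n_points + [HALT])

def poke_points(program, points):
    """Write each coordinate straight into the operand bytes of its PLOT."""
    for i, (x, y) in enumerate(points):
        program[3 * i + 1] = x
        program[3 * i + 2] = y

def run(program):
    """Execute PLOT instructions until something else is reached; return the pixels."""
    screen, pc = set(), 0
    while program[pc] == PLOT:
        screen.add((program[pc + 1], program[pc + 2]))
        pc += 3
    return screen

program = build_speedcode(3)
poke_points(program, [(10, 20), (11, 21), (12, 22)])
print(sorted(run(program)))  # [(10, 20), (11, 21), (12, 22)]
```

On real hardware the saving comes from skipping the loads and the call overhead; this sketch only shows the data layout, not the speed-up.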

    --
    const int one = 65536; (Silvermoon, Texture.cs)
    SJW, n: "Someone I don't like, and by the way I'm a fuckwit" - AC
  9. Decompilers are not simple debuggers/dumpers by DrYak · · Score: 3, Informative

    x86 is hard to decompile. It doesn't have fixed length instructions, so it is difficult to figure out where opcodes begin and end. It is even possible to write code that can execute two different sequences of instructions by offsetting the instruction pointer by a byte. I don't think any decompiler could deobfuscate that.
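A small sketch of the one-byte-offset effect, using three real 32-bit x86 encodings (`B8 imm32` is `mov eax, imm32`, `90` is `nop`, `C3` is `ret`). The `decode` helper is a hypothetical toy that knows only those opcodes; it is nowhere near a full disassembler.

```python
def decode(code, offset):
    """Decode from `offset` to the end, knowing only mov eax/nop/ret."""
    out = []
    while offset < len(code):
        op = code[offset]
        if op == 0xB8 and offset + 5 <= len(code):
            # mov eax, imm32: the four bytes after the opcode are the operand.
            imm = int.from_bytes(code[offset + 1:offset + 5], "little")
            out.append(f"mov eax, 0x{imm:08x}")
            offset += 5
        elif op == 0x90:
            out.append("nop")
            offset += 1
        elif op == 0xC3:
            out.append("ret")
            offset += 1
        else:
            out.append(f"db 0x{op:02x}")  # unknown byte, emit as raw data
            offset += 1
    return out

code = bytes([0xB8, 0x90, 0x90, 0x90, 0xC3])
print(decode(code, 0))  # ['mov eax, 0xc3909090']
print(decode(code, 1))  # ['nop', 'nop', 'nop', 'ret']
```

The same five bytes are one instruction from offset 0 and four instructions from offset 1, which is why the starting point matters so much on a variable-length ISA.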

    The simple code dumper that comes with a garden-variety debugger won't easily deobfuscate that. (You need to manually ask the debugger to start dumping from each of the 2 overlapping points.)

    That's why the best decompilers available in the 90s used some sort of virtual machine to follow the execution flow, so they could distinguish such "frame shifts" (that's actually a biology term; I've forgotten what the proper CS term is) and also understand a bit of self-modifying code.
    (Basically, the decompiler will notice that various parts of the code make calls into the same region but at an odd offset, and will automatically try dumping from each overlapping point.)
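To make the flow-following idea concrete, here is a minimal sketch that compares a blind linear dump with one that follows jumps. It assumes a tiny subset of 32-bit x86 (`EB rel8` short jump, `B8 imm32`, `90`, `C3`) and handles only unconditional jumps; the helper names are hypothetical, and this is not a claim about how Sourcer or RetDec is implemented.

```python
def decode_one(code, pc):
    """Return (text, length, jump_target) for a tiny subset of 32-bit x86."""
    op = code[pc]
    if op == 0xEB:  # jmp short: rel8 is relative to the next instruction
        rel = int.from_bytes(code[pc + 1:pc + 2], "little", signed=True)
        return f"jmp 0x{pc + 2 + rel:x}", 2, pc + 2 + rel
    if op == 0xB8:  # mov eax, imm32
        imm = int.from_bytes(code[pc + 1:pc + 5], "little")
        return f"mov eax, 0x{imm:08x}", 5, None
    if op == 0x90:
        return "nop", 1, None
    if op == 0xC3:
        return "ret", 1, None
    return f"db 0x{op:02x}", 1, None

def linear_sweep(code):
    """Decode byte ranges back to back, ignoring where jumps actually go."""
    pc, out = 0, []
    while pc < len(code):
        text, size, _ = decode_one(code, pc)
        out.append((pc, text))
        pc += size
    return out

def follow_flow(code, entry=0):
    """Decode only what execution would reach, taking jumps as they come."""
    out, pc, seen = [], entry, set()
    while 0 <= pc < len(code) and pc not in seen:
        seen.add(pc)
        text, size, target = decode_one(code, pc)
        out.append((pc, text))
        if text == "ret":
            break
        pc = target if target is not None else pc + size
    return out

# The jump lands one byte into what a linear dump reads as a mov.
code = bytes([0xEB, 0x01, 0xB8, 0x90, 0x90, 0x90, 0xC3])
print(linear_sweep(code))  # [(0, 'jmp 0x3'), (2, 'mov eax, 0xc3909090')]
print(follow_flow(code))   # [(0, 'jmp 0x3'), (3, 'nop'), (4, 'nop'), (5, 'nop'), (6, 'ret')]
```

A real tool would also queue both sides of conditional branches and every call target, which is how it ends up dumping the same region from several overlapping starting points.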

    It also makes it possible to give variables actually useful labels/names (calling something "sound_frequency" instead of "var184" because, by following the data flow, the decompiler realizes that this is the parameter that is output to the PC-Speaker tone generator).

    Sourcer by V-Com was one such good decompiler.
    (I managed to learn quite a ton of tricks like PCM play on the PC Speaker, tweaked graphical modes, etc. simply by using sr to inspect interesting executables.
    I even managed to disinfect a cracked game that was sadly being distributed infected with some virus.)

    --
    "Sufficiently advanced satire is indistinguishable from reality." - [Tips: 1DrYakQDKCQ6y52z6QbnkxHXAocMZJE61o ]