Slashdot Mirror


Morphing Code to Prevent Reverse Engineering?

ptolemu writes "Cringely's latest article discusses a new obfuscation technique currently being researched called PSCP (Program State Code Protection). An informative read that concludes with some interesting insight on the software giants that heavily depend on this kind of technology."

20 of 507 comments

  1. Are folks really using obfuscation for Java? by tcopeland · · Score: 5, Insightful

    I've done mostly server-side work where:

    - the jar files were secure because they were on the server and
    - bytecode optimization and jar size were the least of our problems

    Obfuscation seems to be useful only for client-side Java applications that contain super-secret, valuable algorithms. I mean, who cares if somebody decompiles your code to see how you did sortable JTables or whatever?

    1. Re:Are folks really using obfuscation for Java? by Grishnakh · · Score: 5, Insightful

      "If you need a tamper-resistant client-side binary, don't use Java. It's that simple. A good engineer understands many different tools and selects the best one for the job."

      You're obviously not living in the "real world". Here, an engineer uses the tool that the PHB management selects for him, based on buzzwords, what competitors are doing, and what schmoozing vendors have sold to them.

  2. Reverse engineering is not the problem by geoffspear · · Score: 5, Insightful
    It's not the ability to reverse engineer code that creates security problems; if it were, open source code, which you don't even need to reverse engineer, would be much less secure. The problem is just badly written code.

    This technique might be interesting for stopping people from stealing your closed source code, but as far as security goes it's pretty much worthless. 99% of the vulnerabilities in MS's code were found before their code was leaked, and if you believe them, even the major exploit found after it was leaked had more to do with bad code than someone finding the existing problem by reading the code.

    --
    Don't blame me; I'm never given mod points.
    1. Re:Reverse engineering is not the problem by Dukael_Mikakis · · Score: 5, Insightful

      It's just like the axiom about divorce that goes something like "It's not the fact that divorce is legal that's killing our marriages, it's the bad marriages that are causing so much divorce."

      With n million lines of code in Redmond, it's certainly daunting to actually go through and make good code out of the mess rather than rely on obscurity.

      The fact that there's an open vulnerable port is a flaw, and the FIX is to make the port secure, rather than to shift its address every five seconds or whatever, which is only a Band-Aid.

      MS is just lucky that the bulk of its customers don't truly know what's going on, otherwise the business model they have wouldn't work.

      I.e. since I'm not a doctor, my doctor can prescribe whatever for me, or insist that I do whatever, and I'll take it as scripture. If what he recommends is the stupidest thing in the world, or he's blatantly a horrible doctor, I would have no idea and suffer the consequences. If I were also a doctor, though, I'd be able to call shenanigans the very second he did something wrong. That's why educating the consumer is the most crucial point of this whole issue.

  3. It's ironic by Dukael_Mikakis · · Score: 5, Insightful

    The medical profession deals with viruses by identifying our weaknesses, and exposing them to the viruses (the ultimate "reverse engineering"?). If there were a biological DMCA, developing vaccines would certainly violate it on the illegality of "hacking into the body".

    With software, though, people still insist on trying to hide and pretend that there are no viruses out there and that we are impervious to them.

    Can we finally just open all of our code so we can vaccinate it against all these exploits?

  4. Isn't this just self-modifying code? by mveloso · · Score: 5, Insightful

    This looks vaguely like self-modifying code, like back in the old days of copy protection.

    The thing I don't understand about the article (and how it describes the PSCP process) is this: how will this make reverse engineering more difficult?

    When you're starting to crack something, you work backwards from system calls, library calls, and known behaviors. "Known behaviors" are, well, patterns of code that people (or compilers) use to do things. Anyone good at low-level stuff can probably identify the compiler used to build the code. Likewise, if you think about something enough, you can probably figure out three or four ways to do something, and look for that pattern in the code.

    PSCP prevents this... how? By making this process happen as the program runs? How else do you reverse engineer something?

    Anyway, it sounds like this thing sits right before the .NET runtime engine (or maybe it's loaded and spews bytecode to the runtime); if so, it can be removed... or the output intercepted.

    What am I not getting here?

    1. Re:Isn't this just self-modifying code? by hazee · · Score: 5, Insightful

      Yeah, and self-modifying code was eventually abandoned because it played havoc with the then-new CPU caches and pipelines.

      Have these people learned nothing?

  5. Just need to tap the Analog Out... by Speare · · Score: 5, Insightful

    Just like all the hubbub over proprietary signal encryption to "protect" digital audio streams, all you need here would be the CPU-equivalent of the old Analog Out jack.

    Break it down to the Universal Turing Machine and tape analogy. The program code is the tape, and the state of the machine is in the tape-executing device. If the tape were to somehow morph itself dynamically, and yet execute properly by morphing to a well-designed program at the moment it is read for execution, all you have to do is to watch the read/write head of the UTM itself.

    If they find ways to monkey around with bytecodes so that they're shifted around between disk and executor, just run it with a special version of the executor. Shouldn't be hard... what the unencrypted bytecodes are capable of accomplishing is standardized. Execute the code once, and take "notes" of what is being accomplished. Run through a code coverage test suite, even a crude black-box analysis, and you should get an unscrambled bytecode equivalent.
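
    A minimal sketch of the "special executor" idea above, in Python rather than real JVM or CLR bytecode. The four-opcode stack machine, the position-dependent XOR scrambling, and every name here (`scramble`, `fetch`, `run`, `KEY`) are assumptions made up for illustration; nothing in the article says how PSCP actually transforms code. The only point is that whatever decodes an instruction for execution can also write it down.

```python
from typing import List, Optional, Tuple

Instruction = Tuple[str, int]           # (opcode name, operand)
OPCODES = ["PUSH", "ADD", "MUL", "HALT"]
KEY = 0x5A                              # hypothetical scrambling key


def scramble(program: List[Instruction]) -> List[Tuple[int, int]]:
    """Store each instruction XORed with a key that depends on its position."""
    stored = []
    for pc, (op, arg) in enumerate(program):
        k = KEY + pc
        stored.append((OPCODES.index(op) ^ k, arg ^ k))
    return stored


def fetch(stored: List[Tuple[int, int]], pc: int) -> Instruction:
    """Unscramble one instruction at the moment it is read for execution."""
    code, arg = stored[pc]
    k = KEY + pc
    return OPCODES[code ^ k], arg ^ k


def run(stored: List[Tuple[int, int]],
        notes: Optional[List[Instruction]] = None) -> int:
    """Execute the stored program; if notes is given, log each decoded instruction."""
    stack: List[int] = []
    pc = 0
    while True:
        op, arg = fetch(stored, pc)
        if notes is not None:
            notes.append((op, arg))     # the "analog out" tap
        pc += 1
        if op == "PUSH":
            stack.append(arg)
        elif op == "ADD":
            right, left = stack.pop(), stack.pop()
            stack.append(left + right)
        elif op == "MUL":
            right, left = stack.pop(), stack.pop()
            stack.append(left * right)
        elif op == "HALT":
            return stack.pop()


# (2 + 3) * 4 as a straight-line program
program = [("PUSH", 2), ("PUSH", 3), ("ADD", 0),
           ("PUSH", 4), ("MUL", 0), ("HALT", 0)]
stored = scramble(program)              # what would sit on disk
notes: List[Instruction] = []
result = run(stored, notes)             # notes now holds the plain instructions
```

    The notes only contain instructions that actually ran, which is why a coverage-style test suite is needed to recover branches that a single run never takes.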

    It just doesn't make sense. If obfuscation, i.e. obscurity, is your only security, it is no security at all.

    --
    [ .sig file not found ]
  6. Wow by Anonymous Coward · · Score: 5, Insightful

    Cringely has really outdone himself this time. I can't even follow this poorly thought out mess. He seems to totally misunderstand every single concept he touches on.

    Compilation to bytecode and an "interpreted language" are NOT THE SAME THING. Both the CLR's bytecode and a compiled Java class are effectively machine code for a machine that doesn't exist. These abstract machines have machine code that reveals *MORE* information to a disassembler/reverse engineer than, say, x86 or PPC assembly, but it is still far, far from being source code. This is the first reaction I have. The rest of the article is so confused I don't even know how to respond to it.

  7. performance by happyfrogcow · · Score: 5, Insightful

    "When a computer program runs, the computer can follow millions of paths to get the job done. We leverage those millions of paths and transform them into billions of paths instead."

    Millions of paths implies some sort of jump instruction. Whether or not that translates to millions of function calls, I don't know; assume it does. Then instead of making millions of function calls, you're making billions of function calls. Going from millions to billions is a large step, bigger than just swapping an "m" for a "b" in marketingspeak. So are they planning on passing this performance hit to the legitimate consumer? No thanks, I'll take my Free source code and like it.

  8. Great. by Anonymous Coward · · Score: 5, Insightful

    So legitimate software is going to take on the functionality that virus software has been using for years? And companies are patenting these techniques as if they are somehow new? Virus writers are the true innovators here. They pioneered the infamous Mutation Engine. I would not consider off-the-shelf software that used those techniques innovative; in fact, I find it creepy. Honestly, if the time wasted trying to protect so-called intellectual property was used instead to invent things to simplify our lives, we (as in humanity) would be better off.

  9. I can see a market for this. by nicophonica · · Score: 5, Insightful
    I have worked on a couple of projects where the 'higher ups' (COO, CEO) were obsessed with the value of the intellectual property that their code represented. Woe be to the developer that tried to explain to them that their code was crap, written by a team of programmers obviously just learning VB and trying to write it like a dumbed down version of Java. Most of the programming was developing solutions to straightforward programming problems, which they still implemented in nearly the worst possible way.

    Yet, I have no doubt that if someone came up to them and warned them about the dangers of IP theft and showed them this solution, they would bite.

    If they really wanted to do maximum damage to their competition they should have just released the source code and hoped their competitors tried to use that as guidance.

    There are probably some rare instances when a specialized software technique is developed and you want to keep its implementation specifics secret. I have yet to run into a single instance of this after many years in the industry.

  10. Re:Won't work by jfengel · · Score: 5, Insightful

    Sure, you can reverse engineer it. But is it worth the effort?

    Most of the time it's not even worth reverse engineering unencrypted code, because it's really hard. There are open source projects that go undone because people don't want to expend the effort.

    The trick is not to make it impossible, but to make it hard enough that it isn't done. That level is different for different projects, but it's always finite.

  11. The software arms race. by kyz · · Score: 5, Insightful

    There is nothing new under the sun. These Java and .NET obfuscators are just the same old anti-SoftICE sections, which were just the same old Amiga/Atari copylocks, which were just the same Spectrum/C64 turboloaders, and so on.

    Every single one of these is broken. Almost all good programmers are capable of deciphering the standardised, retail-boxed algorithm used for the obfuscation, and can easily un-obfuscate it. Are all the Java variables named "a"? Diddums! You don't have a Java decompiler with the option to ignore that simple tweak.
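
    A small sketch of why the rename-everything-to-"a" tweak costs a reverse engineer so little. It is in Python rather than Java, and both functions (`invoice_total` and its renamed twin `a`) are hypothetical examples, not output from any real obfuscator. Identifier names are labels for the reader; the instructions that run do not depend on them.

```python
def invoice_total(unit_price, quantity, tax_rate):
    """Readable version."""
    subtotal = unit_price * quantity
    return subtotal + subtotal * tax_rate


# The same function after a hypothetical rename-only obfuscation pass.
def a(b, c, d):
    e = b * c
    return e + e * d


# Behaviour is unchanged by the renaming.
same_result = invoice_total(10, 3, 0.5) == a(10, 3, 0.5)

# CPython refers to local variables by slot number, so the raw
# instruction bytes of the two functions are identical as well.
same_bytecode = invoice_total.__code__.co_code == a.__code__.co_code

print(same_result, same_bytecode)
```

    What is lost is the hint the names gave a human reader, which is an inconvenience rather than a barrier.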

    All that matters is:

    1) How important is the code behind the obfuscation?

    2) How much time and effort is the reverse engineer willing to spend?

    If you use a company's retail-box obfuscator, anyone with the "'Brand X obfuscator' deobfuscator v1.0" can get straight at your code. It's a technological arms race, nothing more.

    --
    Does my bum look big in this?
  12. Disagreements with the Premise by no soup for you · · Score: 5, Insightful

    I don't love microsoft, but I think this article makes several claims without backing them up or offering any explanation as to their merits. Such as:

    1. .NET, on the other hand, is Microsoft's chosen successor to Visual BASIC, and effectively exposes source code at the very heart of Microsoft consumer and enterprise applications.
    2. If .NET is Such a Security Nightmare (It Is)...

    And "You can write a program in C# or Visual Basic.NET.", while factually accurate, ignores Delphi.NET, C++ managed code using the CLR, and other language implementations for the CLR (COBOL, etc.).

    I think the basic premise of the article, where if someone is using your objects it is obviously a bad thing/security breach, is flawed. If you need to secure your objects, SECURE them! Seal them, see who is calling you, etc.

    Lastly, as shown by previous posts, obfuscation is not the end-all panacea for security. In my opinion, it's barely a detour. Otherwise, Open Source literally could not be secure.

    --
    If you blog it...
  13. Re:Not enough eyes to make the bugs shallow... by dtio · · Score: 5, Insightful
    Nonsense. They don't see the forest for the trees? I beg your pardon?

    You're assuming that there is a Microsoft way to look at code and that every MS developer is a robot brainwashed to think that way. MS hires very capable and brilliant people; you couldn't tell the difference between a bunch of NT kernel hackers and a bunch of Linux kernel hackers. Both groups are extremely knowledgeable and produce high-quality code.

    MS has the biggest industrial quality assurance infrastructure in the world. Every developer should go through an internship in Redmond to see this.

    Large software *is* complex, period. Given a finite amount of talent and time, bugs are dependent on the size of the project; it really doesn't make a difference whether your code is open source or not.

    How many people do you think actually look at open source code to look for bugs?

    Moral: if MS releases buggy, exploitable and potentially unsafe code, it is *not* because they are sloppy or because proprietary code is inherently worse than open source; it is because large software is complex and takes a lot of time to do right.

  14. Sounds like a way to make things more unstable by Sporkinum · · Score: 5, Insightful

    If it changes how it executes every time, it sounds like it would be a fantastic way to introduce unreproducible bugs.

    I'm sure this would make QA testing a nightmare.

    --
    "He's lost in a 'floyd hole"
  15. Re:do what i do by __past__ · · Score: 5, Insightful
    i is always an integer with local scope, used as a counter in a loop and/or an index into an array or a similar collection. j, k and l are the same, if you need more than one variable that would qualify for being "i". This convention is perfectly clear and has been used for more than 40 years; calling "i" "index", "count" or "currentEmployeeIndex" does not carry any interesting surplus information. The same could be said for "n", which always is an integral number denoting the number of elements in some collection to operate on.

    tmp is less clear, but it certainly would have local scope. It only exists because of shortcomings in the implementation language (like not having a primitive operation for swapping the values of two variables without introducing a temporary variable) and has no real significance in the problem domain.
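
    A short illustration of the conventions above, written in Python as an assumption of convenience; the function names are made up for this sketch. The first version uses n, i, j and tmp exactly as described. The second shows what happens in a language that can swap two values directly: tmp simply disappears, which supports the claim that it never meant anything in the problem domain.

```python
def reverse_in_place(items):
    """Reverse a list using the conventional short names n, i, j and tmp."""
    n = len(items)                  # number of elements to operate on
    for i in range(n // 2):         # i: loop counter and index
        j = n - 1 - i               # j: the mirrored index
        tmp = items[i]              # tmp: needed only for the swap
        items[i] = items[j]
        items[j] = tmp
    return items


def reverse_in_place_no_tmp(items):
    """Same algorithm where the language offers a direct swap."""
    n = len(items)
    for i in range(n // 2):
        j = n - 1 - i
        items[i], items[j] = items[j], items[i]
    return items
```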

    These variable names are perfectly acceptable and clear - unless you abuse them, of course, but you can abuse all naming schemes. Nothing stops you from calling a global integer m_pszHelloKitty.

    Hungarian notation on the other hand is problematic because a) it is just a non-functional workaround for the weak typing in C and C++ (and their habit of making type errors crash your program in random unrelated places, or just corrupt your data) and b) there aren't actually enough rules, and if there were, nobody could remember them all. "iSomeInteger" and "sSomeString" are pretty common, but if you happen to use more interesting types, or even a whole C++ class hierarchy, it just doesn't work anymore. The only use of Hungarian Notation is to make clueless middle managers happy, similar to a long-winded format for mandatory comments preceding any trivial function or multi-page e-mail disclaimers. Source code is readable when you can actually read it out loud and people would understand what's going on, not when you encrypt redundant information in variable names.

  16. Re:do what i do by Zooks! · · Score: 5, Insightful

    Of course, if you don't know what the type of a variable is you can also just look at the type declaration.

    Unless you're using something like BASIC where variables just suddenly appear out of the ether I really can't see how Hungarian notation is necessary. Especially in an age where we have advanced editors with split windows, and powerful search tools like glimpse, cscope, and ctags.

    Besides, why should I trust some agglutinated letters on a variable name when I can do the same thing the compiler will do and look at the type declaration and be totally _sure_ of the type of the variable? What if some doofus changed the type of the variable in the declaration but was too lazy to update all the instances of Hungarian notation? Hungarian notation can only lead to a code maintenance nightmare!
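
    A rough sketch of the stale-prefix scenario. Python has no type declarations, so the runtime type of the value stands in for "what the compiler sees"; the prefix table, the variable names and `hungarian_mismatches` are all hypothetical and exist only for this example.

```python
# Hypothetical prefix convention, for illustration only.
PREFIX_TYPES = {"i": int, "s": str, "f": float}


def hungarian_mismatches(variables):
    """Return the names whose Hungarian prefix disagrees with the real type."""
    mismatches = []
    for name, value in variables.items():
        expected = PREFIX_TYPES.get(name[0])
        if expected is not None and type(value) is not expected:
            mismatches.append(name)
    return mismatches


variables = {
    "iRetryCount": 3,
    "sUserName": "kitty",
    "iTimeout": 2.5,    # type was changed to float, the name never was
}
stale = hungarian_mismatches(variables)   # names that now lie about their type
```

    The name and the type are two copies of the same fact, and only one of them is checked by the tools.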

    --

    "I'm too old to use Emacs." -- Rod MacDonald

  17. Re:do what i do by angst_ridden_hipster · · Score: 5, Insightful

    The problem with Hungarian, of course, is that it lies.

    It's like the comments. They tell you what the programmer *meant* to do, not what he or she did.

    Similarly, Hungarian notation tells you the *intended* scope, type, etc, but the compiler may have a very different view of things.

    --
    Eloi, Eloi, lema sabachtani?
    www.fogbound.net