Slashdot Mirror


Famous Last Words: You can't decompile a C++ program

The Great Jack Schitt writes "I've always heard that you couldn't decompile a program written with C++. This article describes how to do it. It's a bit lengthy and it doesn't seem like the author usually writes in English, but it might just work (haven't tried it, but will when I have time)."

479 comments

  1. Why by Bruha · · Score: 0, Insightful

    Why would you want to do this unless you were stealing source?

    I'd just leave it alone. ;)

    1. Re:Why by flyneye · · Score: 1

      you best leave it alone you source stealer.
      your conscience betrays you. I'm tellin dad!

      --
      *Repent!Quit Your Job!Slack Off!The World Ends Tomorrow and You May Die!
    2. Re:Why by Pingular · · Score: 1

      stealing source

      It's called 'research' :)

      --

      When anger rises, think of the consequences.
      Confucius (551 BC - 479 BC)
    3. Re:Why by ankit · · Score: 1

      Simple, because it is fun !!

      --
      Don't Panic
    4. Re:Why by czion3 · · Score: 2

      Because you lost the source and forgot to make a backup.

    5. Re:Why by Morologous · · Score: 3, Informative

      I can't count the number of times I've been frustrated with the performance or process of an application that I had to interface with, and just wondered: *why* in god's name, or *what* in god's name are they doing in there.

    6. Re:Why by Anonymous Coward · · Score: 5, Insightful

      You need reasons?

      1) Finding backdoors
      2) Testing security
      3) Fixing bugs
      4) Adding features
      5) Discovering copyright violations
      6) Interfacing to non-supported clients

      Pretty much anything and everything you would do if you had the source.

    7. Re:Why by joonasl · · Score: 1

      This is exactly why I love Java. I have even debugged the WebLogic app server a few times by decompiling the class files. (You would be surprised to know what you can find in there :)

      --
      "There is a terrorist behind every bush"
    8. Re:Why by p4ul13 · · Score: 4, Insightful

      You could be updating a program for your company for which the source is lost.

      --
      Paul Lenhart writes words!
    9. Re:Why by Grapes4Buddha · · Score: 1

      Not sure why this was modded 'funny' since I have encountered a few occasions when this would have been useful...

    10. Re:Why by Kadagan AU · · Score: 2, Insightful

      I agree, that should have been modded insightful, not funny. We have a ton of in-house apps that we don't have source for anymore, and it would be really nice to be able to update them without having to rewrite the entire thing.

      --
      This space for rent, inquire within.
    11. Re:Why by Anonymous Coward · · Score: 0

      But its illegal!

    12. Re:Why by anthony_dipierro · · Score: 1

      Isn't stealing source reason enough?

    13. Re:Why by Call Me Black Cloud · · Score: 4, Informative
      As a Java programmer I find it very useful to decompile class files from time to time. Reasons I've done so:

      A library we were basing a major portion of our code on had a bug in it (a Listener class failed to implement EventListener if I remember correctly) which kept our code from working. Removed offending classes from archive, decompiled, fixed, and recompiled.

      It's educational...the ol' "how'd they do that?". I've never taken code and used it but I found it instructional to look at how someone made a Swing text area from scratch, e.g.

      The challenge...one program I installed had an "enter registration key" prompt and I was curious how that was handled (turned out to be a static string). Then there was this applet that was the core of a company's business. Free, or pay and get more features. As it turns out the control of the features all resided in the applet, so change a couple of switch and if/then statements and voila, administrative privileges. Didn't use it for evil, much... :) They've since come out with a new version and I've been too busy using my mad java skillz on contract work to take a look at their code.

      Looking at security was instructional too, though, for when I was project lead on a commercial Java app I knew what worked and what didn't (we ended up using the Wibu key).

    14. Re:Why by neonstz · · Score: 2, Insightful

      What you need is a decent source control/backup system, not a decompiler.

    15. Re:Why by Anonymous Coward · · Score: 1, Funny

      Because you're one of the "innovators" at Micro$oft!

    16. Re:Why by IamTheRealMike · · Score: 2, Insightful

      This might be useful for when trying to make apps run in Wine. Occasionally disassembly is the only way to figure out why the app crashes light years away from the nearest API call etc.

    17. Re:Why by isorox · · Score: 2, Insightful

      shutting the barn door after the horse bolted?

    18. Re:Why by sik puppy · · Score: 1

      Not quite - don't know who to credit with this quote:

      "If you steal from one person it's plagiarism. Stealing from many is research."

      --
      The first thing we do, let's kill all the lawyers. Shakespeare, Henry VI, Part 2, Act 4, Scene 2
    19. Re:Why by Lumpy · · Score: 4, Insightful

      Why would you want to do this unless you were stealing source?


      nice try.

      You must be either Bill Gates, Steve Ballmer or someone who works for the BSA.

      How am I to tell if your closed source program isn't full of my GPL code that you blatantly stole and are trying to rob me blind by STEALING my IP? Being a closed source advocate, as you seem to be, you are for me trying to detect IP theft and the illegal STEALING of my code by PIRATES, right?

      Ok, I'm going overboard to make my point... I have EVERY right to use tools in a good and legal way. Why not outlaw hammers as anyone can perform a very grisly and horrible murder with one... Or better yet only allow licensed contractors to have hammers! as we know that the unlicensed public is only going to do very evil things with tools!

      see my point now? A tool is exactly what it looks like.... a tool. it can be used for good and evil. and I dont have any respect for the self righteous like you condemning what I do before I even do it.

      people with attitudes like you are what cause all the pain and suffering in this world...... STOP IT!

      --
      Do not look at laser with remaining good eye.
    20. Re:Why by Anonymous Coward · · Score: 0

      There are lots of reasons why you would do this; an example that I just used was cracking the copy protection on GTA Vice City. Christ teaches us to be frugal and to make use of the tools that we have available; only a bleeping idiot would pay $50 for something that you can download for free.

    21. Re:Why by Brymouse · · Score: 1

      Plagiarize,
      Let no one else's work evade your eyes,
      Remember why the good Lord made your eyes,
      So don't shade your eyes,
      But plagiarize, plagiarize, plagiarize...
      Only be sure always to call it please, "research".

    22. Re:Why by Phleg · · Score: 1

      Actually, IIRC, discovering copyright violations is easier done looking at the assembly code. Once you decompile, the C++ code will effectively do the same thing, but it will probably look nothing like the original source.

      --
      No comment.
    23. Re:Why by stuuf · · Score: 0

      So you can decompile your favorite closed-source unstable operating system and FIX WINDOZE! Ok, maybe you would need to completely rewrite it.

      --

      Everyone is born right-handed; only the greatest overcome it

    24. Re:Why by Crashmarik · · Score: 3, Interesting

      That list can also double as 6 things your vendors dont want you to be able to do.

      I have always felt the greatest problem with closed source was it forced you to trust someone who you were fairly certain had only one skill and that was salesmanship.

      It of course raises an interesting question: if you find a copyright violation in commercial software, is your evidence void because the license agreement usually excludes all reverse engineering?

    25. Re:Why by Crashmarik · · Score: 2, Interesting

      No its not.

      It may be a violation of the license agreement, which would be a violation of a civil contract. The enforceability and applicability of said agreements have been a point of contention for nearly 30 years now.

    26. Re:Why by idlethought · · Score: 1

      Well that would make sure the cows didn't follow the horse..

    27. Re:Why by Anonymous Coward · · Score: 0

      Can a license be enforced if it doesn't have the right to license that which it is covering?

    28. Re:Why by Dylan Zimmerman · · Score: 4, Interesting

      Nope. It (probably) wouldn't be admissible because of the part that says no reverse compiling. Reverse engineering is something totally different.

      Reverse engineering is taking a black box and figuring out what it contains by giving it test inputs and watching the outputs. There are a few other things considered reverse engineering, but that describes most of it.

      Of course, all of this ignores the fact that EULAs have never been tested in court. They could be proven invalid as contracts fairly easily since the exchange of goods occurs before you ever see the EULA and most stores don't accept returns of opened software. Therefore, if you don't agree to the EULA, you still have the right to use what you purchased.

      On an interesting side note, various free trade laws specifically protect reverse engineering.

    29. Re:Why by abreauj · · Score: 1
      Why would you want to do this unless you were stealing source?

      There's a branch of archaeology where researchers reverse-engineer ancient technologies and actually build the tools and techniques in a historically authentic manner. There are many such studies building and testing things like medieval siege engines, ancient Greek warships, Polynesian transoceanic sailboats, stone-age tools and weapons, and Egyptian pyramids. It's one of the most fascinating areas of archaeological research today.

      By your logic, these researchers are merely IP thieves.

    30. Re:Why by bluethundr · · Score: 1


      No its not.

      It may be a violation of the license agreement, which would be a violation of a civil contract. The enforceability and applicability of said agreements have been a point of contention for nearly 30 years now.


      This surely doesn't sound illegal to me. But then, neither does just about anything that's prohibited by the DMCA. What does the DMCA have, if anything, to say about this practice? If it doesn't, how long, one might wonder, until it does?

      --
      Quod scripsi, scripsi.
    31. Re:Why by Waffle Iron · · Score: 1

      Hey, didn't you "research" that verse from Tom Lehrer?

    32. Re:Why by darien · · Score: 1

      It was Wilson Mizner (1876-1933). He was an American screenwriter who co-wrote, among other things, the script to Hard to Handle (1933) starring James Cagney.

      There's your answer - and it only took ten seconds' plagiarism with Google. :)

    33. Re:Why by Morologous · · Score: 1

      JAD is a wondrous tool, isn't it?

    34. Re:Why by i_am_nitrogen · · Score: 3, Insightful

      One really good reason I haven't seen mentioned yet is writing a Linux driver for a piece of hardware only supported in Windows, such as the DXR3/Hollywood+ or the MyHD/WinTV-HD/etc. For these projects where the hardware manufacturers either can't or won't offer any help, the only way to support the hardware is by disassembling the Windows driver and figuring out the algorithms used by reading the disassembly and/or watching the interactions between the driver and the code. Fortunately for the MyHD driver project, the MyHD software is distributed without any EULA.

      BTW: Nice job getting all those responses with two lines...

    35. Re:Why by en4ca · · Score: 1

      Sounds perfect for all the Comp Sci students out there.

    36. Re:Why by Anonymous Coward · · Score: 0

      penis

    37. Re:Why by smittyoneeach · · Score: 1

      You might be doing digital forensics. It could well be your job to poke about at the lowest level. Part of a lawsuit, for example.
      You may have a hard performance or space requirement, and need to dig just a little deeper...

      --
      Get thee glass eyes, and, like a scurvy politician, seem to see things thou dost not.--King Lear
    38. Re:Why by Anonymous Coward · · Score: 0

      The tool argument is interesting : what if I choose to use an uzi as a hammer, does that allow me to possess an uzi ?
      The practical use of a tool is not the only thing to consider; we have to think about the purpose of it also. That said, I don't believe that decompilers are purposely made to harm.

    39. Re:Why by Anonymous Coward · · Score: 0

      Jeff Hopkins, is that you?! Duck. Afflac. /me wonders why everybody is staring

    40. Re:Why by kasperd · · Score: 1

      It may be a violation of the license agreement

      But then again such a license agreement could be a violation of the copyright law. In the country where I live, the law explicitly says the user has the right to decompile if that is the only way to find the necessary information to make a program you are developing interoperable. And the law says that right cannot be given up by agreement.

      --

      Do you care about the security of your wireless mouse?
    41. Re:Why by matrix29 · · Score: 1

      The tool argument is interesting : what if I choose to use an uzi as a hammer, does that allow me to possess an uzi ?

      The practical use of a tool is not the only thing to consider; we have to think about the purpose of it also. That said, I don't believe that decompilers are purposely made to harm.


      And yet some tools ARE naturally dangerous if left in the hands of the irresponsible.

      Shall I leave a hand grenade to be used as a baby's teething ring?

      What about rat poison as a sandbox filler?

      Some tools have an inherently high danger potential for casual misuse and others have a high danger potential even with exceptionally proper usage. Nukes for clearing mountain ranges anyone? The waste would be vast in the result, but this once was proposed as a reasonable civilian use of nuclear bomb tech.

      Nobody sane would fear the misuse that a single hammer alone would kill millions or level mountains. The casual misuse COULD accomplish these goals, but the time and energy required would be enormous compared to the casual misuse of a nuclear bomb.

      --
      "Face it, a nation that maintains a 72% approval rating on George W. Bush is a nation with a very loose grip on reality.
    42. Re:Why by le_jfs · · Score: 1

      Nukes for clearing mountain ranges anyone?
      Nobody sane would fear the misuse that a single hammer alone would kill millions

      What if I use the hammer to detonate the nuke?
      I know it's not supposed to work... But has anyone tried it? :-)

      --
      main(char O){O++&&(((O-291)*O+27788)*O-868020?1:putchar(O++) )&&main(O);}
    43. Re:Why by Anonymous Coward · · Score: 0
      (f) REVERSE ENGINEERING-
      1. Notwithstanding the provisions of subsection (a)(1)(A), a person who has lawfully obtained the right to use a copy of a computer program may circumvent a technological measure that effectively controls access to a particular portion of that program for the sole purpose of identifying and analyzing those elements of the program that are necessary to achieve interoperability of an independently created computer program with other programs, and that have not previously been readily available to the person engaging in the circumvention, to the extent any such acts of identification and analysis do not constitute infringement under this title.
      2. Notwithstanding the provisions of subsections (a)(2) and (b), a person may develop and employ technological means to circumvent a technological measure, or to circumvent protection afforded by a technological measure, in order to enable the identification and analysis under paragraph (1), or for the purpose of enabling interoperability of an independently created computer program with other programs, if such means are necessary to achieve such interoperability, to the extent that doing so does not constitute infringement under this title.
      3. The information acquired through the acts permitted under paragraph (1), and the means permitted under paragraph (2), may be made available to others if the person referred to in paragraph (1) or (2), as the case may be, provides such information or means solely for the purpose of enabling interoperability of an independently created computer program with other programs, and to the extent that doing so does not constitute infringement under this title or violate applicable law other than this section.
      4. For purposes of this subsection, the term 'interoperability' means the ability of computer programs to exchange information, and of such programs mutually to use the information which has been exchanged.
    44. Re:Why by Lumpy · · Score: 1

      The tool argument is interesting : what if I choose to use an uzi as a hammer, does that allow me to possess an uzi ?

      dont be an idiot. has an Uzi ever been marketed for anything other than causing the rapid fire of lead projectiles? do you know of anyone good enough to use an Uzi correctly to shoot a nail into a board?

      No.

      So let's go on your gun theory... damn... Nail guns are fricking evil... I could kill and maim more people with it than your uzi.

      An Uzi has never been made for or represented as a tool for anything but killing of people. They were not built for hunting, exactly like an atomic bomb... (New from RonCo.. the deer annihilator! kill entire acres of deer easily!)

      show me something that is sold as a utility item or tool.. like a rifle, bow and arrow, Bowie knife. THOSE are tools. an UZI has never been a tool.

      so yes, the purpose of the tool is to be considered.. and I have yet to see a de-compiler that is designed specifically for wanton stealing.

      Hmmm... kinda like the DeCSS case... it was not written for copying.. it was written for viewing, can it be used to copy? I am pretty sure it can be a part of a copy process. just like lead can be forged into a bullet that can be coated in teflon to create a cop-killer bullet.

      I don't think that DuPont invented Teflon to murder police.

      --
      Do not look at laser with remaining good eye.
  2. No comments and slashdotted already? by chefbimbo · · Score: 0, Redundant

    Or is it just my ISP?

    1. Re:No comments and slashdotted already? by whatparadox · · Score: 1

      It is not just you.

    2. Re:No comments and slashdotted already? by sweeney37 · · Score: 0, Offtopic

      I tried viewing it in the "mysterious future" and it still crapped out on me.

      Mike

    3. Re:No comments and slashdotted already? by Anonymous Coward · · Score: 0

      The article doesn't exist. It's a joke. Laugh!

    4. Re:No comments and slashdotted already? by Minna Kirai · · Score: 1

      Hi, the URL posted is obviously wrong.

      There's an extra "/" character that breaks the link. Here's the article.

    5. Re:No comments and slashdotted already? by kasperd · · Score: 1

      Hi, the URL posted is obviously wrong.

      Is this a new approach to prevent slashdotting?

      --

      Do you care about the security of your wireless mouse?
  3. uses by Anonymous Coward · · Score: 0

    to find security holes

  4. New evolution by cheezycrust · · Score: 0, Troll

    Now even the story posters don't read or verify the articles they're posting...

    --
    Teenagers these days don't have as much sex as they want each other to think they do.
    1. Re:New evolution by joto · · Score: 1
      Now even the story posters don't read or verify the articles they're posting...

      No, that's not a new evolution. It has been this way forever...

  5. You can't by Anonymous Coward · · Score: 5, Insightful

    Information is lost in compilation. You can never reconstruct the exact original source. You end up with valid C++ that has no more human-understandable information than the equivilent machine code.

    Like turning hamburgers into cows...

    1. Re:You can't by Anonymous Coward · · Score: 0

      Equivalent, that is. I fix my typos - grammar trolls can all suck on my shit.

    2. Re:You can't by Morologous · · Score: 5, Funny

      Like turning hamburgers into cows...

      I'm going to use that line.
    3. Re:You can't by jezzgoodwin · · Score: 2, Informative

      He's quite right.

      Take a sum within a program, for example (a+b)=1000 ... now there are infinite possible combinations of what a and b can be ... but without the correct variable names, or the commenting that went along with the code (assuming there was some) ... the decompiled output is going to be pretty much useless / extremely difficult to understand
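
A rough way to see the point above, sketched in Python rather than C++ for brevity. The toy "compiler" below is not from the thread or the article; `compile_expr`, the slot numbering, and the instruction names are all made up for illustration. It does two things real compilers also do: it replaces variable names with numbered slots, and it folds constant subexpressions. Two different source expressions then produce identical output, so nothing in the output can tell you which one was written.

```python
import ast


def compile_expr(source):
    """Compile an arithmetic expression into toy stack-machine code.

    Hypothetical illustration only: variable names become numbered slots
    and constant subexpressions are folded, so information is discarded.
    """
    slots = {}   # variable name -> slot number, in order of first use
    code = []    # emitted instructions, e.g. ("LOAD", 0) or ("PUSH", 1000)

    def emit(node):
        if isinstance(node, ast.Constant):
            code.append(("PUSH", node.value))
        elif isinstance(node, ast.Name):
            code.append(("LOAD", slots.setdefault(node.id, len(slots))))
        elif isinstance(node, ast.BinOp) and isinstance(node.op, (ast.Add, ast.Mult)):
            emit(node.left)
            emit(node.right)
            is_add = isinstance(node.op, ast.Add)
            if len(code) >= 2 and code[-1][0] == "PUSH" and code[-2][0] == "PUSH":
                # Both operands are known constants: fold them at compile time.
                right = code.pop()[1]
                left = code.pop()[1]
                code.append(("PUSH", left + right if is_add else left * right))
            else:
                code.append(("ADD",) if is_add else ("MUL",))
        else:
            raise ValueError("unsupported expression")

    emit(ast.parse(source, mode="eval").body)
    return code


original = compile_expr("price * (400 + 600)")
lookalike = compile_expr("x * 1000")
print(original)               # [('LOAD', 0), ('PUSH', 1000), ('MUL',)]
print(original == lookalike)  # True
```

A decompiler working from that output can emit something like `v0 * 1000`, which is valid and equivalent, but the name `price` and the fact that 1000 was once `400 + 600` are gone for good.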

    4. Re:You can't by NewbieProgrammerMan · · Score: 5, Funny

      Heh. You're assuming that you're attempting to decompile something that had human-understandable source to start with. :)

      --
      [b.belong('us') for b in bases if b.owner() == 'you']
    5. Re:You can't by cperciva · · Score: 5, Funny

      We're talking about C++ here, not perl.

      Compiled C++ code can't be decompiled into anything approximating the readability of the original; compiled perl code can.

    6. Re:You can't by antis0c · · Score: 4, Informative

      What's to say you need something as readable as the original? I worked at InterAct Accessories/GameShark for a few years before they went under as essentially a 'reverse engineer'. Without getting yet another C&D from them in the mail due to a post on Slashdot (I don't even think they could send one now they're out of business?), all I can say is sometimes when hacking a game it benefits an engineer to decompile the application and be able to set breakpoints and watch execution flow while the game is running on, for example, a PlayStation 2. Sure it's going to be a lot of nearly unreadable C++ mixed with Assembly, but if you can watch the execution flow as you do something, it can be useful.

      Of course a lot of naive people think decompiling would allow you to take an application and start writing patches for it, in that case you are right, it's going to be pretty useless. However it's not entirely useless for all situations. I'm sure the WINE guys might get some use out of it.

      --

      ..There's a-dooin's a-transpirin'
    7. Re:You can't by Anonymous Coward · · Score: 0

      Unable to view said article, but here are some important points:

      1) You have to know the exact compiler used and understand its intricacies.
      2) You need to know and understand the optimizations that were compiled with (unrolling loops, etc.)
      3) You won't be able to get human readable variable names or comments

      So you *could* decompile a program (C or C++, doesn't matter), but you'd end up with something totally obfuscated. This could still be useful if you could recompile it with debugging options and trace through while running the code, to crack games, find out which memory is used in password verification, or just debug broken code.

      Better still: learn assembly language.

    8. Re:You can't by capnjack41 · · Score: 4, Insightful
      And then on top of that, the compiler optimizes that code, so calculations are no longer the straightforward and intuitive things they used to be, now they're a series of out-of-order, smaller calculations that are harder to recognize. They're efficient as hell but barely reversible.

      I'll RTFA when it comes back to life :).

    9. Re:You can't by hackstraw · · Score: 1

      Not to mention that no two C++ compilers can even agree on how to compile C++ code.

      The article is slashdotted, so I couldn't read it, but I would think that C++ would be extremely difficult to decompile because of the use of inlined functions, and what would you do with templates? Also, I don't see how a class could be recreated from binary. I would be more likely to believe that a C++ binary could be decompiled into (ugly) C code, but not necessarily C++ code.

    10. Re:You can't by Anonymous Coward · · Score: 0

      Go ahead - I already used yours.

    11. Re:You can't by zackbar · · Score: 1

      Considering some of the code I've had to support, I could probably deal with it.

      As opposed to code by authors from the school of copy & paste, who don't include comments, and are generally confused as to what they are trying to do, I'll take the decompiled code that actually works but needs commenting.

    12. Re:You can't by Anonymous Coward · · Score: 0
      We're talking about C++ here, not perl.

      There's a difference? I thought Perl was C++ with lame garbage collection bolted on.

    13. Re:You can't by Anonymous Brave Guy · · Score: 2, Funny
      Compiled C++ code can't be decompiled into anything approximating the readability of the original; compiled perl code can.

      Yep; compiled Perl already approximates the readability of the original pretty well anyway. :-p

      --
      If you disagree, post your argument. (-1, Overrated) isn't your personal censorship tool for views you don't like.
    14. Re:You can't by kindofblue · · Score: 1

      I'm a vegetarian and a foulmouth. I'd prefer... Like turning shit backing into pizza.

    15. Re:You can't by Anonymous Coward · · Score: 2, Informative

      The thing is, the point of a decompiler is to make the code readable. If you don't particularly care how readable the code is, then your standard disassembler is usually good enough.

      Incidentally, you can't even theoretically create a perfect disassembler, at least on the x86 instruction set. The nature of the complex instruction set means that an arbitrary string of bytes can be decoded into a wide variety of programs, especially when you throw in the possibility of self-modifying code, and all that other garbage. It's a little better on RISC with fixed, word-aligned instruction sizes. Some minor problems would still exist, but they wouldn't be much of a hindrance to a practical "good-enough" disassembler.

      Not to say that creating a workable disassembler is impossible. However, usually more valuable is a debugger with a disassembled output. In this case, you know the program counter's value, so you can deterministically disassemble the program (up to a point). This is generally all you really need to do reverse engineering. Throwing in a decompiler on top of all this generally doesn't help somebody who is fairly experienced reading a disassembly, although I suppose it could be of help to somebody who's more familiar with C++ than assembly mnemonics.

      On the other hand, it's not that hard for somebody to pick up just enough assembly to figure out what's going on, especially if they're technically sophisticated enough to be going to all the trouble of stepping through the program to try and figure out how it works.

      So just to reiterate, decompilers are generally not all that valuable.
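
To make the variable-length instruction point above concrete, here is a minimal sketch in Python. It is not a real disassembler: the function name `disassemble` and the six-byte `blob` are invented for illustration, and it knows only three opcodes, assuming 32-bit mode (`0x90` nop, `0xC3` ret, and `0xB8` mov eax, imm32, which is five bytes long). The same bytes decode into two different programs depending on where decoding starts, which is why knowing the program counter matters.

```python
def disassemble(code, start):
    """Decode a byte string from `start`, stopping after the first `ret`.

    Toy decoder for illustration: only three x86 opcodes are handled,
    and anything else is listed as a raw data byte.
    """
    listing = []
    pc = start
    while pc < len(code):
        opcode = code[pc]
        if opcode == 0x90:                      # nop, 1 byte
            listing.append("nop")
            pc += 1
        elif opcode == 0xC3:                    # ret, 1 byte
            listing.append("ret")
            break
        elif opcode == 0xB8 and pc + 5 <= len(code):
            # mov eax, imm32: opcode followed by a 4-byte little-endian value
            imm = int.from_bytes(code[pc + 1:pc + 5], "little")
            listing.append("mov eax, " + format(imm, "#010x"))
            pc += 5
        else:
            listing.append("db " + format(opcode, "#04x"))
            pc += 1
    return listing


blob = bytes([0xB8, 0x90, 0x90, 0x90, 0xC3, 0xC3])
print(disassemble(blob, 0))  # ['mov eax, 0xc3909090', 'ret']
print(disassemble(blob, 1))  # ['nop', 'nop', 'nop', 'ret']
```

Both listings are legitimate readings of the bytes. A static tool has to guess which offsets are real entry points, while a debugger simply starts from the current program counter.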

    16. Re:You can't by Paradise Pete · · Score: 5, Funny
      I'd prefer... Like turning shit backing into pizza.

      Clearly you haven't tried Domino's.

    17. Re:You can't by ryanr · · Score: 2, Insightful

      Who said the point of the exercise was to turn the code back into the original C++?

    18. Re:You can't by deanpole · · Score: 1

      Like turning hamburgers into cows...

      Easy, feed it to calves...

      Sure the recovered C code will have garbage variable names, but it will compile on a different architecture.

    19. Re:You can't by bumby · · Score: 1

      Information is lost in compilation. You can never reconstruct the exact original source. You end up with valid C++ that has no more human-understandable information than the equivilent machine code.

      Of course you can! You just make a program that generates random C++ source, compiles it, and diffs the output with the original file. If they match, you have the exact original sourcecode ;)

      --
      Hey! That's my sig you're smoking there!
    20. Re:You can't by Anonymous Coward · · Score: 0

      Not to mention that it is usually against the TOS agreement. Almost every purchased piece of software I've seen in the past 20 years has had that. It's usually something like 'you shall not reverse engineer, decompile, or disassemble this code'.

      Like that will stop us :)

      With Visual Studio I'm sure it would be fairly easy to write something that does this. However, it may be slightly compiler-specific; you would need to know it was VC5.0 or something like that. But other than that, the structures VC produces are VERRRY regular. Hell, I've seen games that SHIP with the debug stuff still in them. You could probably even get back the original class names.

    21. Re:You can't by len_harms · · Score: 2, Interesting

      You probably could get very close.

      With straight C++ classes you probably could get something back resembling them. VC is a very regular compiler, and it's the one he used. I haven't looked at what VC does to templates, but I would be willing to bet it transforms them into type-specific classes and then into C. You would just need to use the preprocessor and see what it did to them.

      Inline functions, though, would be impossible to get back. But then again, they are inlined, so the code would be there, just not necessarily in the original form.

      The VC compiler is just a transform engine. It transforms from C++ to C to PCODE to ASM. Of course, that's five-year-old info, from when I used to care about what the compiler was doing to my code. Templates are probably similar.

      I'm sure the code that came back out of this thing would be UGLY. But if you look at the end of most exes shipped these days, most developers do not even bother stripping the exe anymore. You probably could even get back MOST of the class names and function names, maybe even the variables.

    22. Re:You can't by Waffle Iron · · Score: 4, Funny
      Like turning shit backing into pizza

      Here's how:

      Flush shit down toilet -> let shit mellow at sewage plant -> strain shit residue out of bottom of sewage vat -> haul to field -> spread on grass -> grass grows -> cow eats grass -> pull cow's udder, direct milk into bucket -> ferment milk to cheese -> shred cheese -> spread on dough -> Pizza!

    23. Re:You can't by Anonymous Coward · · Score: 0

      You forgot: -> Profit!

    24. Re:You can't by 42forty-two42 · · Score: 1

      Actually, using the Deparse mode, the readability increases. Try it yourself - perl -MO=Deparse something.pl

    25. Re:You can't by Llywelyn · · Score: 3, Funny

      This reminds me of a statement I saw on /. a long time ago:

      Python: Executable Pseudocode
      Perl: Executable line-noise

      --
      Integrate Keynote and LaTeX
    26. Re:You can't by mystran · · Score: 1

      You're sure the compiled version isn't easier to read after all ?

      --
      Software should be free as in speech, but if we also get some free beer, all the better.
    27. Re:You can't by Anonymous Coward · · Score: 0

      Ooookeeey... I'm gonna puke now.

    28. Re:You can't by jkorty · · Score: 4, Insightful
      Information is lost in compilation. You can never reconstruct the exact original source

      So what? Doing reasonable interpolations in context is what brains are for. Example: IIRC, when the Morris Worm appeared in 1988, Gene Spafford examined the binary and reverse-engineered the C code, sprinkling it with meaningful comments and good variable and function names. When the original source became available, his turned out to be a cleaner program than the original. That is, he not only recreated the original in every way that counts, he overshot and did better than the original.

    29. Re:You can't by Anonymous Coward · · Score: 0

      That is exactly what the article says. You don't need to read the article to guess that.

      Incidentally, the way classes compile under most compilers does allow you to recreate them somewhat. If there is a virtual function table that gets populated, you will notice function calls have the jump then jump signature. Inlined calls are much harder, especially if the optimizer allows branching to be optimized (call a function with a boolean parameter set to a literal false. If the compiler optimizes away any if statement that uses it, then you have almost no chance of even guessing which function it came from). However, I would find that ugly C++ code still more readable than assembly. YMMV
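
      A minimal sketch of the two source patterns described above. The class and function names are made up for illustration, and what a given compiler actually emits for them depends on the compiler and the optimization level.

```cpp
#include <string>

// Hypothetical classes, invented for this example.
struct Shape {
    virtual ~Shape() = default;
    virtual std::string name() const { return "shape"; }
};

struct Circle : Shape {
    std::string name() const override { return "circle"; }
};

// A call through a base reference. Compilers typically implement this as a
// load from the object's virtual function table followed by an indirect call,
// which is the recognizable pattern in the binary.
std::string describe(const Shape& s) { return s.name(); }

// A small helper that branches on a boolean flag.
inline int scale(int value, bool doubled) {
    if (doubled) return value * 2;
    return value;
}

// Called with the literal false: an optimizer is free to inline scale() and
// delete the branch, so nothing of scale() may survive in the binary here.
int passthrough(int value) { return scale(value, false); }
```

      After optimization, passthrough() may well be indistinguishable from a function that just returns its argument, which is the "no chance of guessing where it came from" case.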

    30. Re:You can't by JoeCommodore · · Score: 1

      Now if compilers can take human readable code and correlate, optimise and reconfigure it for more machine oriented code, what is to stop someone from writing a 'machine code interpreter' to take the object and re-work it into a human understandable source.

      No - it won't be 'the original,' but it will work as the original and the owner of the object can make any necessary observations and/or corrections and re-compile it (who knows, maybe it will compile tighter?)

      I myself have used decompilers on my own works to (thankfully) resurrect projects I had lost source material for. If you think you will *always* have your source code on file for you or your customers, you are living in a dream world.

      --
      "Enjoy what you're doing! If it becomes drudgery, you're doing it wrong!" - Jim Butterfield
    31. Re:You can't by Anonymous Coward · · Score: 0

      Like turning hamburgers into cows...

      This is exactly what the farmers tried to do... they ended up with a new virus that melts your brain.

      Don't say you weren't warned.

    32. Re:You can't by Minna+Kirai · · Score: 1

      you are living in a dream world.

      Or you publish Free Software.

    33. Re:You can't by ForsakenRegex · · Score: 1

      The common use of templates for generic programming makes C++ a lot harder to read than Perl. I'd be a lot less concerned about someone decompiling my C++ app than someone reading my Perl source.

      --
      "A man talking sense to himself is no madder than a man talking nonsense not to himself."
    34. Re:You can't by humming · · Score: 1
      You wrote:

      Like turning hamburgers into cows...

      Easy, feed it to calves...


      Which, IIRC, is exactly how the 'Mad Cow Disease' started to spread in the first place.
      --
      I'm too stupid to preview.
    35. Re:You can't by Anonymous Coward · · Score: 0

      Well, you seem to be pretty inside Perl after all.

      The Perl interpreter (sic) does turn the source code into a compiled optree internally, yes.
      But to actually see the compiled opcode byte stream you just might have to dump its memory pages.

      I don't think people should judge readability on the used language but on the moron who wrote it.

    36. Re:You can't by Anonymous Coward · · Score: 0

      thanx... was looking for a new cool signature anyway :)

    37. Re:You can't by FuzzyDaddy · · Score: 1
      I had an idea about this. Suppose you had a decompiler that worked like a compiler in reverse. That is, the first pass is to take the machine code and generate syntactically valid source code. Then, go through a series of optimizations - but instead of optimizing for performance, optimize for human readability.


      You wouldn't get the same source code you put in (all your local variable names would be different, for one thing) - but you could get something useful that could be modified.


      It would be non-trivial to implement something like this, of course. But I wonder if it could be done.

      --
      It's not wasting time, I'm educating myself.
    38. Re:You can't by Anonymous Coward · · Score: 0

      But grammar trolls whine about typos, assfarmer.

    39. Re:You can't by jo42 · · Score: 1

      9) Profit!

    40. Re:You can't by kevquinn · · Score: 2, Insightful
      It is perfectly possible to reverse-engineer a meaningful source from a given binary. It's certainly not easy, and of course you won't end up with the same variable names etc (unless the author kindly left in heaps of debug symbols etc), but that hardly matters. The point is that it is possible. Even templates are possible to decompile, given enough incentive; after all it's just fancy pattern matching.

      With regard to the original article - well, that was a bunch of obvious guff really; what you'd expect from high-school geeks of the type I was, some number of years ago. Of note is that it claimed to decompile C++, when actually it talked only of rather trivial C constructs, something that is a well understood practice already.

      Some relatively recent classic decompilation work was done by Cristina Cifuentes who put together a C decompiler that worked to a significant degree for common DOS-based compilers of the time. Effectively the job of "decompilation" can be thought of as "compilation" - instead of compiling C into ASM, you think of compiling ASM into C. Not as daft as it sounds, honest. You can download "dcc" from the above site to investigate further.

      Boomerang is a sourceforge project attempting to create a decompiler. Worth a look, as well.

      It's worth noting, that there are a number of ways to "cheat". For example, it's often trivial to discover what compiler was used to generate a given object code, and there are usually masses of common library-type code that gives you a leg up. Add to that, the fact that a piece of code was generated by a compiler, and the problem of discovering what a given piece of object code does is drastically simplified - compilers add huge amounts of structure and predictability to the generated object code that can be absent in free-form handwritten assembler (and few people do that anymore!), and much can be made of this.

      On the code/data issue mentioned by others in this thread - although separating code/data in general from mixed binaries can be considered hard, in reality it's often quite feasible and even simple. After all, the CPU manages to work it out. Again, the fact that there are so many short-cuts you can take really helps.

      Of course, a quick cruise around the cracking community will turn up all sorts of ways and means to shortcut this sort of problem...

      Here are the results of a quick googling:

    41. Re:You can't by CAIMLAS · · Score: 1

      This could simply be for the fact that Spafford was a better programmer. However, you're likely correct that the decompilation helped him substantially.

      --
      ~/ssh slashdot.org ssh: connect to host slashdot.org port 22: too many beers
    42. Re:You can't by evilviper · · Score: 1
      You can never reconstruct the exact original source.

      Even in freakin' ASSEMBLY, you can't reconstruct the original source from the binary...

      Decompilation doesn't strictly require exactly perfect output.

      You end up with valid C++ that has no more human-understandable information than the equivalent machine code.

      If you understand C++, but don't grok machine code, it will be lightyears more understandable. Nuffsaid.

      Besides, it takes far less knowledge to decompile it and change some string than to change the same string in machine code, since jumps and the like are hard-coded to the addresses, which you will likely have changed.

      Like turning hamburgers into cows...

      Actually, hamburger is fed to cows.

      It is really not a good analogy. Decompiling is more like RECYCLING computers... You won't get the same thing out as you put in, but you will get something very useful out of it anyhow.
      --
      Slashdot gets worse every day... Pipedot: News for nerds, without the corporate slant
    43. Re:You can't by mystran · · Score: 1
      Agreed.

      Take frontpage, write a document, try to read the source. Then check my webpage, read the source, compare.

      Another thing is that it really depends on what one is used to. Most people think LISP is hard to read, but for someone who works a lot with LISP it's just as easy to read as Ruby.

      I've seen C (mostly small fragments) code which is easiest to understand by piping it through objdump.

      --
      Software should be free as in speech, but if we also get some free beer, all the better.
    44. Re:You can't by Anonymous Coward · · Score: 0

      No. Given an X86 binary, it can be disassembled into the instructions the CPU will execute.

      If you can't disassemble it, the CPU can't execute it. And it's not like x86 instruction encoding is secret.

      What you're probably thinking of is the ability for users of fixed-instruction-width instruction sets to disassemble backwards from a breakpoint. You can't do this on x86, since you don't know the length of the last instruction (nor the last-1, last-2, etc).
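
      A rough sketch of why stepping backwards is ambiguous with variable-length instructions. The three-opcode instruction set below is invented for the example (it is not x86), and the function names are hypothetical.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Toy instruction set, invented for this example:
//   0x01 = 1-byte instruction, 0x02 = 2 bytes, 0x03 = 3 bytes.
// Returns 0 for bytes that are not opcodes.
std::size_t toyLength(std::uint8_t opcode) {
    switch (opcode) {
        case 0x01: return 1;
        case 0x02: return 2;
        case 0x03: return 3;
        default:   return 0;
    }
}

// Given the offset just past an instruction (say, a breakpoint address),
// list every offset at which a valid instruction could start and end
// exactly there.
std::vector<std::size_t> previousInstructionStarts(const std::vector<std::uint8_t>& code,
                                                   std::size_t end) {
    std::vector<std::size_t> starts;
    if (end > code.size()) return starts;
    for (std::size_t len = 1; len <= 3 && len <= end; ++len) {
        std::size_t start = end - len;
        if (toyLength(code[start]) == len) starts.push_back(start);
    }
    return starts;
}
```

      For the bytes 03 02 01 and end = 3, all three offsets (2, 1 and 0) are plausible starts of "the previous instruction". With a fixed instruction width there would be exactly one candidate.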

    45. Re:You can't by Anonymous Coward · · Score: 0

      To be fair, Mike Muuss, Ed Wang and many others helped out with the decompilation of the rtm worm...

  6. Oop by Suffering+Bastard · · Score: 5, Funny

    it doesn't seem like the author usually writes in English

    Surely he now understands the English infinitive "to be Slashdotted".

    --
    "Molest me not with this pocket calculator stuff."
    - Deep Thought
    1. Re:Oop by ischorr · · Score: 1

      I don't know, when I use the correct URL I get in just fine... What slashdotting were you talking about?

    2. Re:Oop by Anonymous Coward · · Score: 0

      The kind that gives you +4 Funny for nothing, by reusing that slashdot joke over and OVER AND OVER AGAIN. :(

  7. Why not? by bazik · · Score: 5, Insightful

    I've always heard that you couldn't decompile a program written with C++.

    Well, you can decompile every binary program at least to assembler code, so why shouldn't it be possible with C++?

    Maybe he meant "you can't decipher the source of a C++ program" ;)

    --
    One by one the penguins steal my sanity...
    1. Re:Why not? by Anonymous Coward · · Score: 0

      Yeah I had a program several years ago that decompiled DOS apps into ASM, and converted the ASM to C. Fun stuff.

    2. Re:Why not? by Anonymous Coward · · Score: 0

      Insightful? You cannot compile assembler, hence you cannot decompile to assembler. You can disassemble binary to assembler (if you're lucky, it never works for me). However compilation != assembly.

    3. Re:Why not? by Anonymous Coward · · Score: 0

      You can disassemble binary to assembler (if you're lucky, it never works for me).

      Just out of curiosity, why doesn't it work for you? What happens when you try it?

      I've only done it on snippets of code in the IDE, to compare how different algorithms get compiled down to ASM.

    4. Re:Why not? by Anonymous Coward · · Score: 0

      Any data in the code, strings, blank areas to put code on 8 byte boundaries, etc... tend to get disassembled too. This upsets the offsets and means opcodes and operands can get mixed up.
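
      To make the offset problem concrete, here is a small sketch of a naive straight-line disassembler running into an embedded data byte. The instruction set is a toy one invented for the example, not a real CPU, and all names are hypothetical.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Toy instruction set, invented for this example:
//   0x01 = 1-byte instruction, 0x02 = opcode + 1 operand byte,
//   0x03 = opcode + 2 operand bytes. Anything else is not an opcode.
std::size_t toyLength(std::uint8_t opcode) {
    switch (opcode) {
        case 0x01: return 1;
        case 0x02: return 2;
        case 0x03: return 3;
        default:   return 0;
    }
}

// Naive linear sweep: decode one instruction after another from `start`
// and return the offsets at which an instruction was decoded.
std::vector<std::size_t> linearSweep(const std::vector<std::uint8_t>& bytes,
                                     std::size_t start) {
    std::vector<std::size_t> starts;
    std::size_t pc = start;
    while (pc < bytes.size()) {
        std::size_t len = toyLength(bytes[pc]);
        if (len == 0) { ++pc; continue; }      // not an opcode: skip one byte
        if (pc + len > bytes.size()) break;    // truncated instruction
        starts.push_back(pc);
        pc += len;
    }
    return starts;
}

// Layout the programmer intended:
//   offset 0: 02 10   instruction
//   offset 2: 03      a DATA byte that happens to look like an opcode
//   offset 3: 01      instruction
//   offset 4: 02 20   instruction
std::vector<std::uint8_t> sampleImage() {
    return {0x02, 0x10, 0x03, 0x01, 0x02, 0x20};
}
```

      Sweeping from offset 0 decodes "instructions" at 0 and 2 and swallows the real ones at 3 and 4 as operands. Sweeping from offset 3 finds the real ones at 3 and 4.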

    5. Re:Why not? by mindstrm · · Score: 1

      A proper multi-pass disassembler takes care of this usually.

      Data and code do not usually end up in the same segment.

    6. Re:Why not? by glwtta · · Score: 0
      you can decompile every binary program at least to assembler code

      Um, no. If your program is in assembly it is compiled, that's what compilation does - produce assembly code.

      +5 Insightful my ass, back to Compilers 101 with you.

      --
      sic transit gloria mundi
    7. Re:Why not? by BJH · · Score: 2, Interesting

      Actually, not quite true. Assembly code is usually considered to mean the mnemonic code intended for human (well, semi-human) consumption, whereas machine language is the actual binary opcodes and arguments.

      So, he's sort of right - you can decompile any binary program to assembler. It's usually called disassembly rather than decompilation, though.

    8. Re:Why not? by GC · · Score: 0

      Compilation produces binary code, not assembly.

      Assembly is often considered a mid-level language between binary and high-level languages like C/C++.

    9. Re:Why not? by Anonymous Coward · · Score: 0

      Der, I'm a moron who doesn't know anything about computers.

      So assembly language is the final form? So your assembly language file is executable or linkable? Yeah, uh huh...

      And all this time I was thinking you compiled to binary. Boy was I wrong... wow, what a mind blower.

    10. Re:Why not? by NoMoreNicksLeft · · Score: 2, Informative

      Uh, no. Compilation produces assembly, and then the (sometimes integrated) assembler assembles it into machine language (not binary). Forget what switch it is, but gcc even lets you see what asm code it is generating.

    11. Re:Why not? by BJH · · Score: 1

      -S

    12. Re:Why not? by joto · · Score: 1
      Uh, no. Compilation produces assembly, and then the (sometimes integrated) assembler assembles it into machine language (not binary). Forget what switch it is, but gcc even lets you see what asm code it is generating.

      No, that's not right. While a compiler could produce assembly as its final stage (as e.g. lcc), gcc, and most other compilers do not. Just because gcc and most other compilers are able to produce assembly code in the same way they produce object code, does not mean that that is what they usually do!

      On the other hand, there is nothing wrong in generating assembly code, and I would probably use that approach if I were to write a native-code compiler myself (something that seems less and less likely the more I learn about it...)

    13. Re:Why not? by cduffy · · Score: 1

      So assembly language is the final form?

      No, it's not the final form, it's the compiled (but unassembled and unlinked) form. These are all discrete stages, as much as some higher-level tools may hide that fact (or even do multiple steps at once).

    14. Re:Why not? by leviramsey · · Score: 1

      No, you're misunderstanding.

      In most C/C++ compilers, the compilation step is simply translating it to asm. Then the assembler is used to get to object code.

      However, to the user, compilation implies the compilation and assembly.

    15. Re:Why not? by Anonymous Coward · · Score: 0

      Ahh, so that's why GCC spits out assembler errors if an older version of 'as' (at least I think that's the command...) is installed...it doesn't actually use it, it just runs generated assembly through it for the hell of it, just to see what happens.

    16. Re:Why not? by GlassHeart · · Score: 2, Insightful
      you can decompile every binary program at least to assembler code

      No. Assuming we're talking about software disassemblers here, not every program can be reliably disassembled. Disassemblers work by mainly following the execution paths of already disassembled code, so that it knows exactly where a subroutine begins. In many instruction sets, instructions have variable length, and not starting your decoding on the right byte will be a big mistake that cascades on to the next instructions. Now, knowing this, all we have to do is to change the execution path without the disassembler knowing. A function pointer (address loaded at run-time) already presents a serious problem to a disassembler, but simply asking the user to enter the instruction address to jump to will completely defeat the automatic disassembler. There's no way for the disassembler to know what the user will enter, and hence where the program will go to next.

      Humans will still be able to disassemble your program, of course. However, you still won't get the original assembly source back. Assembly languages usually support macros and pseudo-instructions that improve readability, but have no correspondence in assembled form.
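
      A minimal sketch of the path-following approach and the point where it gives up. The five-opcode instruction set is invented for the example (it is not a real CPU), and traverse() is a hypothetical name.

```cpp
#include <cstddef>
#include <cstdint>
#include <set>
#include <vector>

// Toy instruction set, invented for this example:
//   0x01        NOP   (1 byte)
//   0x10 addr   JMP   (2 bytes, unconditional, absolute target)
//   0x11 addr   BR    (2 bytes, conditional: falls through or jumps)
//   0x12        JMPR  (1 byte, jumps to an address held in a register)
//   0xFF        HALT  (1 byte)
// Follows every statically known path from `entry` and returns the offsets
// of all instructions reached.
std::set<std::size_t> traverse(const std::vector<std::uint8_t>& code, std::size_t entry) {
    std::set<std::size_t> seen;
    std::vector<std::size_t> work{entry};
    while (!work.empty()) {
        std::size_t pc = work.back();
        work.pop_back();
        if (pc >= code.size() || seen.count(pc)) continue;
        switch (code[pc]) {
            case 0x01:                       // NOP: continue with next byte
                seen.insert(pc);
                work.push_back(pc + 1);
                break;
            case 0x10:                       // JMP: follow the target only
                if (pc + 1 < code.size()) {
                    seen.insert(pc);
                    work.push_back(code[pc + 1]);
                }
                break;
            case 0x11:                       // BR: follow both sides
                if (pc + 1 < code.size()) {
                    seen.insert(pc);
                    work.push_back(pc + 2);
                    work.push_back(code[pc + 1]);
                }
                break;
            case 0x12:                       // JMPR: target known only at run time
            case 0xFF:                       // HALT: path ends
                seen.insert(pc);
                break;
            default:                         // not an opcode: abandon this path
                break;
        }
    }
    return seen;
}

// offset 0: BR 4 | 2: NOP | 3: JMPR | 4: HALT | 5: NOP | 6: HALT
// The code at offsets 5 and 6 is reachable only through the JMPR at offset 3.
std::vector<std::uint8_t> sampleProgram() {
    return {0x11, 0x04, 0x01, 0x12, 0xFF, 0x01, 0xFF};
}
```

      Starting from offset 0, the traversal finds the instructions at 0, 2, 3 and 4 and never discovers 5 and 6, because the only way there is the run-time jump.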

    17. Re:Why not? by KarmaPolice · · Score: 1

      This is true. In MIPS assembly the nop and move instructions are pseudo-instructions. The nop is merely a sll $0, $0, 0 (shift reg0 left 0 times and store in reg0) and move is addu $dst, $src, $0 ($dst = $src + 0).

      Also, some load instructions are not possible directly and therefore must be replaced with 2-3 other instructions to actually do the desired operation.

      The assembly produced by e.g. gcc is meant for hand-optimization and curiosity but the assembler will do optimizations and various translations.
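
      A small sketch of how those two pseudo-instructions come out as ordinary R-type words, assuming the standard MIPS32 R-type layout (opcode 0, then rs, rt, rd, shamt, funct) with funct 0x00 for sll and 0x21 for addu. The helper names are made up, and some assemblers may expand move differently.

```cpp
#include <cstdint>

// MIPS32 R-type word: opcode(6)=0 | rs(5) | rt(5) | rd(5) | shamt(5) | funct(6)
constexpr std::uint32_t encodeRType(std::uint32_t rs, std::uint32_t rt, std::uint32_t rd,
                                    std::uint32_t shamt, std::uint32_t funct) {
    return (rs << 21) | (rt << 16) | (rd << 11) | (shamt << 6) | funct;
}

// sll rd, rt, shamt  (funct 0x00, rs unused)
constexpr std::uint32_t sll(std::uint32_t rd, std::uint32_t rt, std::uint32_t shamt) {
    return encodeRType(0, rt, rd, shamt, 0x00);
}

// addu rd, rs, rt  (funct 0x21)
constexpr std::uint32_t addu(std::uint32_t rd, std::uint32_t rs, std::uint32_t rt) {
    return encodeRType(rs, rt, rd, 0, 0x21);
}

// The pseudo-instructions, expanded the way the comment above describes.
constexpr std::uint32_t nop() { return sll(0, 0, 0); }
constexpr std::uint32_t moveReg(std::uint32_t rd, std::uint32_t rs) { return addu(rd, rs, 0); }

static_assert(nop() == 0x00000000u, "nop is the all-zero word");
```

      A disassembler looking at the word 0x00000000 cannot tell whether the programmer wrote nop or sll $0, $0, 0. Both are the same bits.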

    18. Re:Why not? by SN74S181 · · Score: 1

      Umm, I've never heard of a program that prompted the user to enter the instruction address to jump to. Well, I have, but only in some of my Assembly Language textbooks. I've certainly never heard of any user code that does so. Are you sure you're not just making this up as a hypothetical case?

    19. Re:Why not? by arkanes · · Score: 2, Interesting

      Not directly, but inputting, say, the name of a function or command to call, looking that up in a table of function pointers, and executing the pointed-to function amounts to the same thing.
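
      Roughly what that looks like in source form. The command names and functions are hypothetical, and a real program would read the name from user input rather than take it as an argument.

```cpp
#include <map>
#include <string>

// Two hypothetical commands.
int startEngine() { return 1; }
int stopEngine()  { return 2; }

using Command = int (*)();

// Table mapping command names to function pointers.
const std::map<std::string, Command>& commandTable() {
    static const std::map<std::string, Command> table{
        {"start", &startEngine},
        {"stop",  &stopEngine},
    };
    return table;
}

// Looks the name up and calls through the pointer. Which function runs is
// decided by the string, so a static tool sees only an indirect call here.
int dispatch(const std::string& name) {
    auto it = commandTable().find(name);
    if (it == commandTable().end()) return -1;  // unknown command
    return it->second();
}
```

      In this tiny case the table itself gives the possible targets away. The harder cases are the ones where the pointers are computed or loaded at run time.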

    20. Re:Why not? by a+hollow+voice · · Score: 1
      Well put. Now, before anyone else gets confused about what's compilation and what's not:

      (I'm thinking back to old CS classes here, so correct me if I'm wrong...)

      Compilation, in common usage, means converting a source file in whatever language into an executable, since most people rarely if ever compile without intending to immediately generate an executable.

      Compilation, in the more technical definition most commonly given by CS professors and people who wrote code before 1980 ;), is the process of converting a higher-level language into assembly code, which is then assembled into an executable. (I'll avoid the thornier issues of linking here.)

      The term Compilation gets used in both of the above senses, but if you're going to use one, keep the other one in mind so you know what other people are talking about.

      And on a side rant, contrary to what some people seem to believe, assembler is not just a prettier form of machine language, though it's close. For example, in x86 assembly, the assembly opcode and arguments MOV x,y can translate into several different machine language instructions, depending on whether x and y are registers, memory locations, memory locations with segment offsets, etc. in any combination (i.e. MOV reg,mem is one instruction, MOV mem,reg is another, MOV reg,[seg]mem is yet another, etc.). Other instructions work the same way, which is why the x86 instruction set has about 3,987,236,231,235 valid instructions.

      Hope I remembered all that right. If not, correct me, because I'm getting rusty on my low-level stuff. ;)
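
      A sketch of the one-mnemonic, several-encodings point, using 32-bit x86 register operands. The opcode bytes (0x89, 0x8B, 0xB8+r) are the documented MOV forms; the helper names are made up, and memory operands, prefixes and segment overrides are left out.

```cpp
#include <cstdint>
#include <vector>

// Register numbers as used in the ModRM byte (32-bit general registers).
enum Reg : std::uint8_t { EAX = 0, ECX = 1, EDX = 2, EBX = 3 };

// ModRM byte with mod = 11 (both operands are registers).
constexpr std::uint8_t modrmRegReg(Reg reg, Reg rm) {
    return static_cast<std::uint8_t>(0xC0 | (reg << 3) | rm);
}

// mov dst, src encoded as "MOV r/m32, r32" (opcode 0x89).
std::vector<std::uint8_t> movViaOpcode89(Reg dst, Reg src) {
    return {0x89, modrmRegReg(src, dst)};
}

// The same mov dst, src encoded as "MOV r32, r/m32" (opcode 0x8B).
std::vector<std::uint8_t> movViaOpcode8B(Reg dst, Reg src) {
    return {0x8B, modrmRegReg(dst, src)};
}

// mov dst, imm32 uses yet another opcode: 0xB8 + register number,
// followed by the immediate in little-endian byte order.
std::vector<std::uint8_t> movImm32(Reg dst, std::uint32_t imm) {
    return {static_cast<std::uint8_t>(0xB8 + dst),
            static_cast<std::uint8_t>(imm & 0xFF),
            static_cast<std::uint8_t>((imm >> 8) & 0xFF),
            static_cast<std::uint8_t>((imm >> 16) & 0xFF),
            static_cast<std::uint8_t>((imm >> 24) & 0xFF)};
}
```

      So "mov eax, ebx" can legitimately be either 89 D8 or 8B C3, and the assembler picks one. Going back from the bytes, both disassemble to the same text.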

    21. Re:Why not? by seanadams.com · · Score: 2, Informative

      This is such a grossly misinformed statement, I don't even know where to begin. Assembler and machine language ("binary") are semantically identical. You can go back and forth from assembler to machine code all day and still have the same thing. All you lose when going from human/compiler generated (vs disassembled machine code) is labels and comments.

      With C++ or any high-level language, there are zillions of ways a compiler might interpret the code - just as long as the machine code effectively does what the C code says. Even identifying what compiler was used will not help - there are just so many ways to say the same thing in C. for, while, goto, case, it's all syntactic sugar that disappears when you compile.

      You can make a decompiler which identifies various code structures and converts them to high-level representations, but it can't EVER know what the original source looked like.

    22. Re:Why not? by Theodore+Logan · · Score: 1

      Well, you can decompile every binary program at least to assembler code

      Because of the equivalence of this problem to the halting problem, this is not strictly true.

      --

      "If you think education is expensive, try ignorance" - Derek Bok

    23. Re:Why not? by joto · · Score: 1
      No, you're misunderstanding.

      No, I'm not. But I seem to be arguing with someone with a disability for learning.

      In most C/C++ compilers, the compilation step is simply translating it to asm. Then the assembler is used to get to object code.

      "Most" is a relative term. I can only think of one (lcc) C compiler that works that way, whereas most every other compiler I've heard about or used does not work that way. If you are willing to post a reasonably large list of (real-world, not student-project) compilers that verifyably works that way, I'll be willing to change my viewpoint. But anyway, gcc is not one of them.

      However, to the user, compilation implies the compilation and assembly.

      The user doesn't care either way. This is only of interest to nerds that have to know how everything works, or to compiler hackers.

    24. Re:Why not? by Anonymous Coward · · Score: 0


      whereas most every other compiler I've heard about or used does not work that way


      well, al most all americans I've encountered know al most nothing about grammar.

    25. Re:Why not? by GlassHeart · · Score: 1
      Are you sure you're not just making this up as a hypothetical case?

      The line I specifically quoted in my post was:

      you can decompile every binary program at least to assembler code

      which is what I am trying to refute. Yes, the example was hypothetical, but it presents an impossibility (rather than just a very very difficult problem) to the disassembler, unlike things like self-modifying code or function pointers.

  8. hmm by Graspee_Leemoor · · Score: 5, Informative

    A c/c++ decompiler that totally worked would be the Holy Grail of crackers. Unfortunately it is actually impossible to get everything back because lots of info is lost on compilation.

    Nevertheless there are tools out there that attempt to decompile programs; I think of them more as ways of making assembly more readable.

    Note, a lot of them wouldn't work on hand-written assembly, because they rely on knowledge of how certain compilers compile various things - e.g. there was a Delphi decompiler available.

    graspee

    1. Re:hmm by deranged+unix+nut · · Score: 2, Insightful

      The problem is that there are quite a few people out there that assume that just because it is in binary form, that it can't be figured out. For example, they will use XOR to "encrypt" data stored inside the program, or assume that their secret algorithm is safe because it is compiled.

      The barrier to entry is definitely raised, but it is always possible to figure out what the compiled code is doing given enough time and effort. In fact, I've even heard of people who patch operating system kernel code without the source...

    2. Re:hmm by jackb_guppy · · Score: 5, Interesting

      I wrote reverse compilers on IBM midrange equipment, where there are no stacks and self-modifying code is VERY commonplace. It is easy to do:

      Create a program that performs / understands the opcodes for the processor and addressing, and that follows both sides of a branch.

      Now "run" the program; that maps out all the opcode and data areas.

      Once done, look at the assembler equivalent and map out common subroutines and function calls. Data storage becomes very clear. Lastly, common storage will show external and internal common structures, so the naming of fields becomes visible.

      It is straightforward, can be time consuming, and is very helpful in understanding hidden or lost information.

    3. Re:hmm by Anonymous Coward · · Score: 0

      I hear you.

      Example: Those with GTA: Vice City wanting to play the soundtrack files - adf = mp3 XOR 0x22.

      Two seconds, one hex editor, no brain. :)
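
      For anyone who would rather not use the hex editor: a minimal sketch of single-byte XOR "encryption". The key 0x22 is the one claimed in the comment above (not verified here), and xorWithKey is a made-up name.

```cpp
#include <cstdint>
#include <vector>

// XORs every byte with the same one-byte key. Applying it twice with the
// same key gives back the original, so "encrypt" and "decrypt" are the
// same function.
std::vector<std::uint8_t> xorWithKey(std::vector<std::uint8_t> data, std::uint8_t key) {
    for (auto& b : data) b ^= key;
    return data;
}
```

      For example, the ASCII bytes of "ID3" (49 44 33) XORed with 0x22 come out as 6B 66 11, and XORing those with 0x22 again restores them. Anyone who knows what the start of the plaintext should look like can recover a one-byte key from a single byte.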

    4. Re:hmm by Anonymous Coward · · Score: 0
      The barrier to entry is definitely raised, but it is always possible to figure out what the compiled code is doing given enough time and effort.


      This is what a CPU does. It takes compiled code, and it figures out what to do based on it. :) It even does it at several billion cycles per second these days.

      If you're trying to reverse engineer some code, duplicating the work that your CPU does isn't an efficient use of your brain. Your general strategy needs to be more along the lines of figuring out where the stuff you're interested in happens, then drilling down through each layer without actually reading every opcode.
  9. Software industry, fear no more... by CoolVibe · · Score: 1

    Slashdot has DDoS'ed the damn thing into oblivion.

    On the other hand, did anyone get to mirror it?

  10. sure you can go from asm - c++ by Anonymous Coward · · Score: 5, Informative

    but it'll look like this

    class a
    {
    public:
    void b(int c);
    void d(int e);
    private:
    int g;
    int h;
    };

    int main()
    {
    a f;
    f.b(23);

    int x; x=0; x++;
    if(x > 3) goto j;
    f.d(x); x++;
    if(x > 3) goto j;
    f.d(x); x++;
    if(x > 3) goto j;
    f.d(x);
    j: f.b(42);

    return 0;
    }

    1. Re:sure you can go from asm - c++ by Anonymous Coward · · Score: 0

      And even if it does, this is much better than just an assembler listing. And it would be possible to replace those gotos with blocks in if statements, I am sure. And then all you need is a tool that lets you rename the members and classes as you go and makes these changes to the rest of the source.

      On the other hand: I have seen a lot of genuine C++ sources which looked like this. No decompiler involved, just a guy who tried to write "smart" code.

    2. Re:sure you can go from asm - c++ by Anonymous Coward · · Score: 0

      That looks like most KDE source code.

    3. Re:sure you can go from asm - c++ by occupant4 · · Score: 1

      A good decompiler would recognize the assembly output for loops, and convert them accordingly.
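
      As a sketch of what that conversion amounts to for the goto listing at the top of this thread: the two functions below make the same calls in the same order. Recorder is a hypothetical stand-in for the anonymous class a, with d() simply logging its argument.

```cpp
#include <vector>

// Hypothetical stand-in for class `a`: d() records the values it is called with.
struct Recorder {
    std::vector<int> calls;
    void d(int e) { calls.push_back(e); }
};

// The goto-style body, as a decompiler might emit it for an unrolled loop.
std::vector<int> gotoVersion() {
    Recorder f;
    int x; x = 0; x++;
    if (x > 3) goto j;
    f.d(x); x++;
    if (x > 3) goto j;
    f.d(x); x++;
    if (x > 3) goto j;
    f.d(x);
j:
    return f.calls;
}

// The same behaviour written as the loop a human would have typed.
std::vector<int> loopVersion() {
    Recorder f;
    for (int x = 1; x <= 3; ++x) f.d(x);
    return f.calls;
}
```

      Both versions call d() with 1, 2 and 3. Whether the original source was a for loop, a while loop, or three hand-written calls is not something the binary can tell you.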

    4. Re:sure you can go from asm - c++ by rsheridan6 · · Score: 4, Funny

      My girlfriend just read that over my shoulder and said "Is that a poem?"

      --
      Don't drop the soap, Tommy!
    5. Re:sure you can go from asm - c++ by Corporate+Troll · · Score: 1

      You *must* be kidding? Right? Tell me I'm right...

    6. Re:sure you can go from asm - c++ by Imperator · · Score: 1

      Hey--how did you find the source of my program!?

      --

      Gates' Law: Every 18 months, the speed of software halves.
    7. Re:sure you can go from asm - c++ by Alsee · · Score: 1

      Of course he's kidding, he said he had a girlfriend. But he's right, it looks like a poem. Then again, almost any odd arrangement of text looks like a poem.

      -

      --
      - - You can't take something off the Internet! That's like trying to take pee out of a swimming pool.
    8. Re:sure you can go from asm - c++ by rsheridan6 · · Score: 1

      No, I'm serious, she says it looks like e.e. cummings.
      I can kind of see her point. The weird structure of that poem looks kind of like code in some obfuscated language.

      --
      Don't drop the soap, Tommy!
    9. Re:sure you can go from asm - c++ by rsheridan6 · · Score: 1

      This is the link to the poem I meant to post. Oops.

      --
      Don't drop the soap, Tommy!
    10. Re:sure you can go from asm - c++ by Corporate+Troll · · Score: 1

      Hey, that's actually a pretty cool poem. Had to read it a couple of times till the words got through to me.
      Let me guess. Your girlfriend is an English graduate or something.

    11. Re:sure you can go from asm - c++ by Anonymous Coward · · Score: 0

      I don't believe you

      Everybody knows slashdot folk don't have girlfriends.
      Thank god I don't read slashdot.

  11. Decompile this! SlashDot Effect! by lems1 · · Score: 2, Funny

    Yeah, but they should know how to decompile the Slashdot effect first... another one down. Anybody with a Mirror or Google Cache link?

    --
    This sig can be distributed under the LGPL license
  12. Re:Intresting by Pingular · · Score: 0

    Last time I checked 'Intresting' was in the English dictionary.

    --

    When anger rises, think of the consequences.
    Confucius (551 BC - 479 BC)
  13. Too busy? by destiney · · Score: 0


    (haven't tried it, but will when I have time)

    Yeah I dunno how you have time for anything anymore.. having to post duplicate articles and all.

  14. Re:Intresting by Anonymous Coward · · Score: 1, Funny


    I would find both your educational background and the dictionary you're using very intresting.

  15. Re:Intresting by Anonymous Coward · · Score: 0
  16. Re:Intresting by Morologous · · Score: 1

    *BBBRRRRTTTT*

    Incorrect! Intresting is *NOT* in the dictionary. Interesting, however, is.

    Check it out.

  17. Idiot by Anonymous Coward · · Score: 0
  18. Re:Intresting by Pingular · · Score: 0

    Ohhh grammar nazis, right.

    --

    When anger rises, think of the consequences.
    Confucius (551 BC - 479 BC)
  19. Re:Intresting by Morologous · · Score: 3, Funny

    *BBBBRRRRRRTTTT*

    Incorrect! Spelling Nazi may have been the answer you're looking for.

  20. The cow never lies! by Anonymous Coward · · Score: 0

    She always sleeps standing.

  21. Re:Intresting by Pingular · · Score: 0

    Grammar: The system of rules implicit in a language, viewed as a mechanism for generating all sentences possible in that language. I think you'll find that includes spelling.

    --

    When anger rises, think of the consequences.
    Confucius (551 BC - 479 BC)
  22. Inline functions, templates and decompilation by truth_revealed · · Score: 4, Insightful

    Sure you can decompile an optimized and symbol-stripped C++ program, but you'd never get it back in the original compact form of the source as you do with the Java class file decompilers, due to the heavy use of inline functions and templates in C++. A C program, sure, but decompiling C++ is not terribly useful.

    1. Re:Inline functions, templates and decompilation by Dominic_Mazzoni · · Score: 1

      Sure you can decompile an optimized and symbol-stripped C++ program, but you'd never get it back in the original compact form of the source as you do with the Java class file decompilers, due to the heavy use of inline functions and templates in C++. A C program, sure, but decompiling C++ is not terribly useful.

      Actually if you could deduce the class hierarchy and distinguish objects from other variables, it might be more useful to decompile a C++ program than a C program, because information about encapsulation and inheritance might help you understand the structure of the program.

    2. Re:Inline functions, templates and decompilation by truth_revealed · · Score: 1

      You would not be able to deduce much, considering that template code and inline class member functions would simply become part of other unrelated functions or classes, or be collapsed into constants or a word in memory. Consider C++ template metaprogramming, where a killer 500K block of source code is crunched down into a single number or simple expression. vector::size() might be reduced to the 8th byte from the beginning of a block of memory - there's no function. There's no way to retrieve the original due to the intense optimization.
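
      A toy-sized sketch of both effects. Factorial is the usual textbook metaprogram and IntBuffer is a made-up stand-in for a vector-like class; how far a particular compiler goes in collapsing them depends on its settings.

```cpp
#include <cstddef>

// Compile-time factorial: the whole recursion is evaluated by the compiler.
template <unsigned N>
struct Factorial {
    static constexpr unsigned long long value = N * Factorial<N - 1>::value;
};

template <>
struct Factorial<0> {
    static constexpr unsigned long long value = 1;
};

// All that is left of the templates in the binary is a constant.
static_assert(Factorial<10>::value == 3628800ULL, "folded at compile time");
unsigned long long answer() { return Factorial<10>::value; }

// Hypothetical minimal vector-like class with an inline size().
struct IntBuffer {
    int* data;
    std::size_t count;
    std::size_t size() const { return count; }
};

// Once size() is inlined, this is typically just a load of one field at a
// fixed offset from the object's address. No call to size() remains.
std::size_t length(const IntBuffer& b) { return b.size(); }
```

      A decompiler looking at answer() sees "return 3628800" and has no way to know a template ever existed.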

  23. Re:But (Mod this post up) by Anonymous Coward · · Score: 0
  24. Re:Intresting by kirun · · Score: 0, Offtopic

    It's right here in my Oxbridge English Dictionary. What are you on about?

    --
    I'm scared of numbers that can't be written as a fraction. It's an irrational fear.
  25. yeesh. by digitalsushi · · Score: 0

    that thing was slashdotted even in "the mysterious future". hrm. i left the ads on cause that's the only feature i wanted :D

    --
    slashdot: where everyone yells sarcastic metaphors to themselves to understand the issue
    1. Re:yeesh. by po8 · · Score: 1

      Fastest /.ing ever. What a pain. I'm seriously tired of this.

      Has anyone thought seriously about creating a proxy site for /. that automatically caches all the links? I think that if the site and caching was done as a proxy, there would be no copyright issues?

      Heck, if anyone wants to put together a demo of this, I'd be willing to host it on an experimental basis. I've got access to a ton of free BW through my academic institution for non-profit stuff. My theory is that if the proxy becomes popular, we can bully VA into hosting it, like they should have done for the past several years...

      Alternatively, and maybe even cooler, one could build a proxy that browses through the Google cache, to save bandwidth and storage and further blur the copyright question.

    2. Re:yeesh. by Minna+Kirai · · Score: 1

      It's not slashdotted. In fact, the guy probably hasn't gotten many extra hits at all, because the posted URL is wrong.

  26. Re:Intresting by Pingular · · Score: 0

    Would you like me to point out the incorrect structure of your grammar?

    --

    When anger rises, think of the consequences.
    Confucius (551 BC - 479 BC)
  27. progr? by Anonymous Coward · · Score: 0

    what the hell is a progr?

    1. Re:progr? by inaeldi · · Score: 0

      Someone in the middle of saying "program" but dying before they can finish. Famous last words. Get it? *nudge nudge*

    2. Re:progr? by SharpFang · · Score: 1

      Any kind of software, modulated either by frequency or amplitude (program or progrfm)

      --
      45 5F E1 04 22 CA 29 C4 93 3F 95 05 2B 79 2A B2
  28. Re:Intresting by Anonymous Coward · · Score: 2, Funny


    Its write hear in my Oxbrige Enlish Dictionairy. What are you on about?

  29. Re:This is nonsense by Anonymous Coward · · Score: 0

    "But binary can only sometimes be translated into slightly-readable assembly code."

    I don't think you understand what assembly code is, dumbass.

  30. let's get back to basics by 1nv4d3r · · Score: 5, Funny

    Hell, I'd be happy if the people working for me could consistently compile their c/c++. I need a new job...

    1. Re:let's get back to basics by Anonymous Coward · · Score: 0

      It's probably typos.

    2. Re:let's get back to basics by Anonymous+Brave+Guy · · Score: 1

      Most of the guys I work with can compile their own C++. Then again, they all use VC++ 6 and don't know how scope and for loops are supposed to mix, so no-one else can compile their code. Who would have thought such a simple non-standard behaviour could waste so much time... <sigh>

      --
      If you disagree, post your argument. (-1, Overrated) isn't your personal censorship tool for views you don't like.
    3. Re:let's get back to basics by Anonymous Coward · · Score: 0

      int i = 0;
      for (int i = 0; i < 8; i++) {
      if (a[i] < 0) break;
      }
      // now i equals the first value for which a[i] < 0
      // right? right?

    4. Re:let's get back to basics by CwazyWabbit · · Score: 1

      ... don't know how scope and for loops are supposed to mix ...

      Lint will catch that (almost certainly, but I have not checked), so you could try to make a clean Lint part of your coding standard or checkin requirements. If you can show whoever is in charge that the time spent Linting is more than recovered from the time spent fixing silly errors it should be a doddle to convince them. The only thing is, I think most projects need a list of lint warnings/errors that should be turned off for your situation, which goes into the configuration file your project will use. Guard this list with your sanity :)

    5. Re:let's get back to basics by arkanes · · Score: 1

      Not to mention that the non-standard behavior (which pre-dates the standard that said it's wrong) is well known and was implemented in more compilers than VC++, and most of the compilers I know of have switches to support it.

    6. Re:let's get back to basics by CwazyWabbit · · Score: 1

      Keeping to standards helps with future portability though, which I think is important to keep in mind. You say yourself "most" and not "all" compilers can support it. My own experience and Y2K just shows software tends to live longer than intended. Thanks for the explanation of why - I didn't know that.

    7. Re:let's get back to basics by miu · · Score: 1
      Hell, I'd be happy if the people working for me could consistently compile their c/c++.

      Just hope you don't get what you ask for. Otherwise you'll wind up in a pissing match with QA cause your developers are tossing code over the wall every time it compiles.

      --

      [Set Cain on fire and steal his lute.]
    8. Re:let's get back to basics by Anonymous+Brave+Guy · · Score: 1
      Lint will catch that (almost certainly, but I have not checked), so you could try to make a clean Lint part of your coding standard or checkin requirements.

      Funnily enough, I'm currently arguing for exactly that, and citing this very example as a good reason for it. :-)

      --
      If you disagree, post your argument. (-1, Overrated) isn't your personal censorship tool for views you don't like.
    9. Re:let's get back to basics by Anonymous+Brave+Guy · · Score: 1
      Not to mention that the non-standard behavior (which pre-dates the standard that said it's wrong) is well known

      That's a cop out. The draft standard was indicating the final behaviour way before VC++ 6 was out, and Microsoft knew damn well they weren't going to follow it. The behaviour was left as it was by default because changing it would break Microsoft's library implementations, which is also the reason why the options you mention were useless with that particular compiler and its supplied libraries.

      BTW, we compile on around 15 different platforms at work, from various Windows and Mac through to various flavours of UNIX. VC++ 6 is the only one, even including the UNIXy ones several years older, that has this problem, and the only one without a useful switch to disable it.

      --
      If you disagree, post your argument. (-1, Overrated) isn't your personal censorship tool for views you don't like.
    10. Re:let's get back to basics by Anonymous+Brave+Guy · · Score: 1

      No, i == 0, just as it did immediately after you initialised it. The i changed in the for loop is a different i in a different scope, and has no bearing outside that loop. But I'm sure you knew that, or you wouldn't have posted the code sample. At least most compilers will warn about the hiding of the outer-scope i here, though. The problem with the non-standard scoping generally is that code can compile cleanly on VC++ 6, get checked in, and promptly break the overnight builds on 15 other standard-conforming platforms. :-(
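
A small sketch of the two cases under standard scoping rules (function names are made up for illustration). The first function reproduces the snippet above and returns the untouched outer i; the second drops the inner declaration so the loop and the code after it share one variable.

```cpp
// Mirrors the posted snippet: the loop declares its own i.
int outer_after_shadowing_loop(const int* a, int n) {
    int i = 0;
    for (int i = 0; i < n; i++) {   // this i hides the outer one
        if (a[i] < 0) break;
    }
    return i;                       // standard C++: the outer i, still 0
}

// Portable variant: a single i, declared before the loop and reused by it.
int first_negative_index(const int* a, int n) {
    int i = 0;
    for (i = 0; i < n; i++) {
        if (a[i] < 0) break;
    }
    return i;                       // index of the first negative value, or n if none
}
```

The second form means the same thing under both the old and the standard scoping rules, which is why it is the safe choice for code shared across compilers.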

      --
      If you disagree, post your argument. (-1, Overrated) isn't your personal censorship tool for views you don't like.
    11. Re:let's get back to basics by arkanes · · Score: 1

      VC 6 can disable this behavior, and I don't see how it affects MSs libraries (unless the STL that ships with them uses that construct? It's not even MS's STL, but okay...). MS kept with it because earlier versions of VC used it. It's an extension. If people in your project using it bother you, then disable it. It's a syntactical issue, one of coding style, not one of efficiency.

    12. Re:let's get back to basics by Anonymous+Brave+Guy · · Score: 1
      VC 6 can disable this behavior, and I don't see how it affects MSs libraries (unless the STL that ships with them uses that construct? It's not even MS's STL, but okay...)

      Large amounts of MS source code wouldn't compile with the option switched on, allegedly including libraries such as the Dinkumware STL implementation that shipped with VC6 where the library code is effectively supplied as source.

      If people in your project using it bother you, then disable it. It's a syntactical issue, one of coding style, not one of efficiency.

      Yes; in fact, at my request as the only person currently using VS.NET on the project, the master makefiles were adjusted so that when building with VC7 the scoping option is set to standard compliant. However, and this was my original point, most people on the project haven't bothered to, or can't (because of the library issue), set this option with VC6, and thus they check in "clean" code that doesn't build on any other platform we support.

      --
      If you disagree, post your argument. (-1, Overrated) isn't your personal censorship tool for views you don't like.
    13. Re:let's get back to basics by angelo-flores · · Score: 1
      I don't necessarily see how this problem in VC would affect a standards-compliant compiler; in fact, wouldn't the converse hold true, ie. code that compiles in a standards-compliant compiler would break VC, at least in regard to this kind of scoping? For example:


      for (int i = 0; i < 8; i++)
      ; // ... fun stuff in here

      for (int i = 0; i < 8; i++)
      ; // ... even more fun stuff


      This would compile correctly in a standards-compliant compiler, but VC would complain that this involved some sort of redeclaration of 'i'.

      Now, if I was working in VC and needed to do a quick workaround, couldn't I just wrap these in their own independent set of braces? Or maybe, in a more hackish fashion, declare some sort of macro that attached 'if (true)' to the beginning of 'for' loops? Even if I did the old VC standby, for example:


      int i;
      for (i = 0; i < 8; i++)
      ; // ... stuff

      for (i = 0; i < 8; i++)
      ; // ... more stuff


      The only code I could imagine that this would break, at least in a standards-compliant compiler, would be code involving a previous declaration of 'i' (before or after this example section of code), but this would also break the code within VC, so it would have been caught beforehand, correct?
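
For what it's worth, the macro trick mentioned above is usually seen in an if/else form rather than a plain 'if (true)', so that a stray else after the loop cannot attach to the hidden if. A rough sketch, using a hypothetical SCOPED_FOR name instead of redefining the for keyword itself (which is not allowed once standard headers are involved):

```cpp
// Hypothetical helper macro: the for statement becomes the else-branch of a
// dummy if, so even under pre-standard scoping the loop variable would end
// with that branch instead of leaking into the enclosing block.
#define SCOPED_FOR if (false) {} else for

int count_twice() {
    int total = 0;
    SCOPED_FOR (int i = 0; i < 8; i++) { total += 1; }
    // Declaring i again is fine: the previous i is already out of scope.
    SCOPED_FOR (int i = 0; i < 8; i++) { total += 1; }
    return total;
}
```

On a standard-conforming compiler the macro changes nothing; it only matters on a compiler that still uses the old rule.
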
    14. Re:let's get back to basics by Anonymous+Brave+Guy · · Score: 1

      VC++ 6 will compile the following quite happily. Not a lot else will.

      for (int i = 0; i < 10; ++i) {
      // Do something with i
      }
      // Do something else with i, perhaps another for-loop
      --
      If you disagree, post your argument. (-1, Overrated) isn't your personal censorship tool for views you don't like.
    15. Re:let's get back to basics by arkanes · · Score: 1
      With all due respect, that's a project management problem, not an issue with C++. Even with the switch on, it's easy enough to write code that doesn't rely on either behavior - it's a relatively minor syntax change. You just need to control your project better.

      On a side note, you can upgrade the STL that ships with VC6 - either with a new Dinkumware one, or by using something better like STLPort, which is probably worth doing, compatibility switch or no.

    16. Re:let's get back to basics by Anonymous Coward · · Score: 0
      I believe the code you're thinking of is:
      int i=0;
      for(i=0;i<8;i++)
      {
      if(a[i]<0) break;
      }
      The "int" in the for-line declares a for-loop-scope i, which takes over for the i declared outside the loop. Once you leave the loop, the original i is once again in scope, and the value is unchanged since before the for-loop. When that "int" is removed, the for-loop uses the current in-scope i, which happens to be the one that is declared on the lines just above the loop.
    17. Re:let's get back to basics by Anonymous+Brave+Guy · · Score: 1
      With all due respect, that's a project management problem, not an issue with C++.

      Of course; I never suggested otherwise. My point was simply that certain project management issues seem to apply to the vast majority of projects using a certain common development platform, because an awful lot of C++ developers don't actually know what the rule is, having learned the behaviour of their current platform and not the standard.

      Of course if it were my choice, we'd just switch on all the warnings and upgrade to a good STL implementation. Unfortunately, it's not. I've been campaigning for some time for people to fix this issue, and made some progress, but still this problem and others like it waste incredible amounts of time. As I originally said, it's not just whether you can compile your code, it's whether the rest of your dev team can as well. The problems in this area aren't inherently due to C++, but they are part and parcel of working in the C++ world.

      --
      If you disagree, post your argument. (-1, Overrated) isn't your personal censorship tool for views you don't like.
  31. Server's Famous Last Words ... by Anonymous Coward · · Score: 0

    .. You Can't Slashdot me!

  32. It's the other way around by photon317 · · Score: 1


    When you think about it, the higher level the language is, the easier it should be to "decompile". The closer the original source was to asm, the more the individual coder's style will be reflected in the asm - the higher level it is, the more the obvious patterns the compiler uses every time for given constructs will be present. Reverse engineering a program written in asm to human readable source is a nightmare, but if you knew for instance that the source was C++ and it was compiled by gcc 3.2 (easy enough to tell), it's probably pretty easy to see from the asm patterns the classes and whatnot, to see the structure of the source.. then you just have to combine that with what the program actually does to give human meaning back to the variable and class names and whatnot.

    --
    11*43+456^2
    1. Re:It's the other way around by Vengie · · Score: 1

      Dude.....C is practically ASM. C is *not* a "high level" language according to most theoretical professors. (c.f. Yale's Stan Eisenstat, Arvind Krishnamurthy, Zhong Shao, Richard Yang, Columbia's Belhauser, Harvard's Smith...) C _is_ readable asm code......
      have you ever taken a compilers course?

      having written a compiler for a toy language (tiger) [google for princeton professor appel's "tiger" language and his collaboration with z. shao, who implemented the heap-activation in SML-NJ....] i can assure you, it is nowhere NEAR as easy as you'd think.

      push ebx ;)
      -b

      --
      When in doubt, parenthesize. At the very least it will let some poor schmuck bounce on the % key in vi. (Larry Wall)
    2. Re:It's the other way around by photon317 · · Score: 1


      Where did I say C was a high level language? I used C++ as a reference because like it or not, it is high enough level to have its own structure. I didn't use C because I well understand that C is basically portable assembler.

      Beyond the current OOP language like C++ and Java, the only things higher level are the toy languages for braindead programmers (think VB, Delphi, FoxPro, etc) - and the various real attempts at 4GL, which never seem to work right for general cases, but can be useful in application-specific situations.

      --
      11*43+456^2
    3. Re:It's the other way around by photon317 · · Score: 0, Flamebait


      Oh and aside from the point of the thread, I feel it's important to point out that I haven't taken a compilers course. I've never taken any college course. I operate in the real world. I don't use useless theoretical constructs written by your above-mentioned big fat name-dropping list. Your post is pathetic in its attempt to feel superior because of your academic background.

      And writing compilers is every bit as easy as I'd think. Where do you get off thinking you know how I'd think to begin with? I'll code circles around you from asm to java and back again.

      --
      11*43+456^2
    4. Re:It's the other way around by Anonymous Coward · · Score: 0
      Point #1. You know, if you want to work out if C is "readable asm code" or not all you have to do is try to map C to asm and check for one-one-ness, you don't need to name drop. The latter makes you sound like an Arts student, not an engineer nor a scientist.

      Point #2. Everyone's written a real compiler for a toy language, or a toy compiler for a real language. Why do all kids who go to Uni think they've just crossed the grand canyon on a tight rope when they've written their first compiler?

    5. Re:It's the other way around by Minna+Kirai · · Score: 2, Insightful

      When you think about it, the higher level the language is, the easier it should be to "decompile".

      No, no, no. This is both empirically untrue, (Do you see many ML or even C++ decompilers out there?) and theoretically insensible.

      The higher level a language is, the more changes there will be between the original source code and the assembly. Thus the more source data that will have been discarded by the original compiler, which is data the decompiler cannot reconstruct.

      The reason Java decompilers work relatively well is not because Java's a high-level language (it isn't, really), but because the output program is at such a high level! Instead of working from binary code, a Java decompiler gets a more presentable bytecode, packed with the names of classes and methods. (Also, because optimization of Java programs is supposed to happen after compilation at a JIT stage, the bytecodes won't be as obfuscated as the output of a normal C++ optimizing compiler)

      The closer the original source was to asm, the more the individual coder's style will be reflected in the asm

      When decompiling, the "individual coder's style" is exactly what you're trying to get!

      the more the obvious patterns the compiler uses every time for given constructs will be present.

      Good compilers don't use "obvious patterns". Their transformation functions are very sensitive, so a tiny change in the input source (expanding a loop from 3 times to 4) can cross an optimization threshold and totally change the appearance of the output.

    6. Re:It's the other way around by Anonymous Coward · · Score: 0

      Right up until you need to formulate a complex algorithm. Then you're completely fucked because you don't know the theory.

    7. Re:It's the other way around by almaw · · Score: 1

      Congratulations. This is the most self-evidently false comment I've seen on here in a long while...

      "Decompiling" to ASM is a simple matter of replacing machine codes with mnemonics. It's trivial. Of course, you lose the comments your ASM coder (hopefully) put in the source code, which makes things difficult to understand on some level.

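As a toy illustration of "replacing machine codes with mnemonics", here is a sketch that maps a handful of one-byte 32-bit x86 opcodes to text. The function names are made up, and a real disassembler also has to decode operands and variable instruction lengths, which this deliberately ignores.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Toy lookup for a few one-byte 32-bit x86 opcodes; anything else is
// printed as an unknown data byte.
std::string mnemonic_for(unsigned char opcode) {
    switch (opcode) {
        case 0x50: return "push eax";
        case 0x53: return "push ebx";
        case 0x55: return "push ebp";
        case 0x5B: return "pop ebx";
        case 0x5D: return "pop ebp";
        case 0x90: return "nop";
        case 0xC3: return "ret";
        default:   return "db ?";
    }
}

// Translates each byte independently (valid only for the opcodes above,
// which take no operand bytes).
std::vector<std::string> disassemble(const std::vector<unsigned char>& code) {
    std::vector<std::string> out;
    for (std::size_t i = 0; i < code.size(); ++i) {
        out.push_back(mnemonic_for(code[i]));
    }
    return out;
}
```

Feeding it the bytes 53 90 5B C3 gives push ebx, nop, pop ebx, ret. The mapping is mechanical, which is the point: no meaning is recovered, only notation.
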
      "Decompiling" to C code is trivial. You lose the names of variables, function names, etc, but that's about it. You get a slightly different set of (equivalent) logic, too, given the compiler will probably have optimised things a fair bit. A good C coder (heck, a moderately reasonable C coder) should be able to pretty much see what assembly the C code he writes is going to generate, though. It's difficult to write efficient code otherwise.

      "Decompiling" to C++, I wouldn't even know where to start. The suggestion you make of "recognising patterns" should ring alarm bells even if you don't know anything about the subject. Recognising patterns is about the hardest thing you can ask a computer to do. I'd say that this is probably more difficult than attempting face-recognition, for example. At least with that there's established research and various biometrics you can use.

      Of course, some languages like Java compile to intermediate byte code which doesn't lose quite so much information, making it easier to "decompile". This is one reason why C# will have issues penetrating the desktop app market - it's too easy to nick other people's code.

      Note that I put "decompile" in quotes. That's because there's really no such thing.

      As another poster points out, you really should do a course in optimising compiler construction. If you did you'd realise just how silly you sound. :)

    8. Re:It's the other way around by KingRamsis · · Score: 2, Informative

      excuse me..!!
      just leave Delphi out of it, Delphi is a true OOP language you should do some research before coming up with a gross generalization like that.

    9. Re:It's the other way around by Anonymous Coward · · Score: 0

      Yea, why write something the easy way when you can, when you can do it the hard way!

    10. Re:It's the other way around by photon317 · · Score: 1


      Oh yeah, that's right, you can only learn complex theories in a classroom, I forgot. I guess some people need structure in order to learn, but I'm not one of them, thanks.

      --
      11*43+456^2
  33. Re:Intresting by Anonymous Coward · · Score: 0


    Would you like me to pull my dick out of your mom's asshole, and scrape off the blood, feces, and sperm and put it in a glass for you? Add a little milk, put it in a blender, and you'll have a nice shake.

  34. Spectulation Code by Davak · · Score: 5, Informative
    Considering the entire post is evidently based on speculation...

    Here is some code that supposedly decompiles... not that I've tried it.

    Quote from the FAQ:


    [35.4] How can I decompile an executable program back into C++ source code?

    You gotta be kidding, right?

    Here are a few of the many reasons this is not even remotely feasible:
    * What makes you think the program was written in C++ to begin with?
    * Even if you are sure it was originally written (at least partially) in C++,
    which one of the gazillion C++ compilers produced it?
    * Even if you know the compiler, which particular version of the compiler was
    used?
    * Even if you know the compiler's manufacturer and version number, what
    compile-time options were used?
    * Even if you know the compiler's manufacturer and version number and
    compile-time options, what third party libraries were linked-in, and what
    was their version?
    * Even if you know all that stuff, most executables have had their debugging
    information stripped out, so the resulting decompiled code will be totally
    unreadable.
    * Even if you know everything about the compiler, manufacturer, version
    number, compile-time options, third party libraries, and debugging
    information, the cost of writing a decompiler that works with even one
    particular compiler and has even a modest success rate at generating code
    would be significant -- on the par with writing the compiler itself from
    scratch.

    But the biggest question is not how you can decompile someone's code, but why
    do you want to do this? If you're trying to reverse-engineer someone else's
    code, shame on you; go find honest work. If you're trying to recover from
    losing your own source, the best suggestion I have is to make better backups
    next time.

    I would have posted AC but they have me blocked out for some reason...


    Davak

    1. Re:Spectulation Code by Davak · · Score: 1

      btw... the C++ FAQ quote and the code link have nothing to do with each other... I should have made that a little clearer. Sorry. Davak

    2. Re:Spectulation Code by daaan · · Score: 1

      "I would have posted AC but that have me blocked out for some reason..."

      This is a straightforward, honest answer that I was honestly going to mod insightful until I saw that. I'm just wondering, what would have been the point of posting AC?

    3. Re:Spectulation Code by Davak · · Score: 1
      Because after I previewed it, it looked like a "Karma Whore" post to me. So as prophylaxis against the potential flamebait, I wanted to post it AC. But I couldn't, so I didn't.

      That's all.

      Davak

    4. Re:Spectulation Code by Anonymous Coward · · Score: 0

      If you're trying to reverse-engineer someone else's code, shame on you; go find honest work.

      eh. fuck you, buddy.

      if you're writing code that you care people are picking apart, I don't see that as being particularly "honest work".

      and I certainly wouldn't call bullshitting about code honest work. so there.

    5. Re:Spectulation Code by Anonymous Coward · · Score: 0

      But the biggest question is not how you can decompile someone's code, but why
      do you want to do this? If you're trying to reverse-engineer someone else's
      code, shame on you; go find honest work. If you're trying to recover from
      losing your own source, the best suggestion I have is to make better backups
      next time.


      IMHO, it would be very useful for people making open source products that must interface proprietary protocols/file formats/whatnot. That it would be useful doesn't make the problem solvable though.
    6. Re:Spectulation Code by Anonymous Coward · · Score: 0

      If you're trying to reverse-engineer someone else's
      code, shame on you; go find honest work.


      I get paid to reverse engineer code for a large security company. Real geeks don't need source code.

      -AC

    7. Re:Spectulation Code by teklob · · Score: 1

      BTW, the link you posted to Planet-Source-Code just links back to the original article

    8. Re:Spectulation Code by Anonymous Coward · · Score: 0

      Recent addition to slashdot...
      If you're logged in, you can't post as AC for the first 30 minutes of a discussion's life. Mind you, you can still log out and post anonymously. You just have to click through at least one preview of your comment before submitting it. It's an ineffective and rather silly anti-troll measure, really.

    9. Re:Spectulation Code by Anonymous Coward · · Score: 0
      If you're trying to reverse-engineer someone else's code, shame on you; go find honest work

      WTF? There are many many honest reasons to reverse-engineer code. Compatibility and bugs have both forced me to reverse-engineer stuff. Actually, I can't think of a time I've dishonestly reverse-engineered something, but I bet I've done that too.

    10. Re:Spectulation Code by Anonymous Coward · · Score: 0

      You'll drop that journal of yours if you know what's good for you, Davak.

    11. Re:Spectulation Code by Anonymous Coward · · Score: 0

      Or what, you'll go over to his house with a baseball bat? Sheesh... you can't even make a decent threat.

    12. Re:Spectulation Code by Anonymous Coward · · Score: 0

      The first five points in that FAQ answer are nonsense, at least for the last ten years. Every good disassembler, such as IDA Pro, already has library identification code with signatures from dozens, if not hundreds, of compilers and libraries. When I disassemble programs, the compiler and libraries are identified instantly, and all of the uses of library routines are named and tagged. A decompiler would do the same thing. The real problem with decompiling C++ code is the trend towards the generic programming idiom, with a heavy reliance on templates and inline functions. All of that information is lost when compiling. The FAQ answer that you quoted is so wildly out of date as to be laughable.
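
The general idea behind signature-based identification can be sketched as byte patterns with wildcards. This is only an illustration of the concept, not how IDA Pro or any particular tool implements it; the types, function names, and the example signature label are all hypothetical.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// One byte of a signature: a fixed value, or a wildcard that matches anything
// (used for bytes that vary, such as immediates or relocated addresses).
struct PatternByte {
    bool wildcard;
    unsigned char value;
};

// A named byte pattern; the name is whatever label the tool should attach.
struct Signature {
    std::string name;
    std::vector<PatternByte> pattern;
};

// True if the signature matches the code starting at the given offset.
bool matches(const Signature& sig, const std::vector<unsigned char>& code,
             std::size_t offset) {
    if (offset > code.size() || sig.pattern.size() > code.size() - offset) {
        return false;
    }
    for (std::size_t i = 0; i < sig.pattern.size(); ++i) {
        const PatternByte& p = sig.pattern[i];
        if (!p.wildcard && code[offset + i] != p.value) return false;
    }
    return true;
}

// Returns the name of the first matching signature, or an empty string.
std::string identify(const std::vector<Signature>& sigs,
                     const std::vector<unsigned char>& code,
                     std::size_t offset) {
    for (std::size_t i = 0; i < sigs.size(); ++i) {
        if (matches(sigs[i], code, offset)) return sigs[i].name;
    }
    return std::string();
}
```

For example, the pattern 55 8B EC 83 EC ?? (push ebp; mov ebp, esp; sub esp, some 8-bit constant) matches a common 32-bit x86 function prologue whatever the size of the stack frame. Real signature databases would need far longer and more specific patterns to tell library routines apart.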

    13. Re:Spectulation Code by Anonymous Coward · · Score: 0

      Nah, the editors will probably just super-downmod and $rtbl him.

  35. Re:Intresting by Pingular · · Score: 0

    Yes please.

    --

    When anger rises, think of the consequences.
    Confucius (551 BC - 479 BC)
  36. Re:Text of the article by kingkade · · Score: 1, Insightful

    dear anonymous c,

    please stop breathing and kill any offspring you may have inexplicably fathered for the sake of our gene pool. Thanks.

    Respectfully,
    -- Human Race

  37. Re:Intresting by handybundler · · Score: 0, Funny

    I don't know about you but I pee in the shower all the time!!! intrestingh!!!

    --


    a/s/l here. Sorry, adding domain tags to your s
  38. To all those, who think it's useless... by SharpFang · · Score: 4, Interesting

    Well, it isn't. Sure, if you're so lazy you want to have source rebuilt from binaries with one click, complete with comments, makefile and documentation, that's of no use. But imagine the program does some very clever trick. Something you ooh about, "How the hell does he do that? It's impossible?". You want to include that trick in your code. You need it. So - you have three options: 1) Try to design it from scratch. Helluva work, you don't know where to start. 2) Look into the binary. If you're ASM guru, you MAY succeed. But ASM from high-level languages is hell to read. 3) Decompile the puppy, look for that piece through what looks like piles of junk, but is way more readable than ASM and find it. Then just rewrite it in pretty fashion, changing variable names and functions to your needs and include in your own software. It's "the best of the worst", last resort at finding a solution to a small problem. Not a way to edit the source and add a single feature to the original program, like remove print protection from Acrobat Reader. The decompiled program most probably won't be possible to compile. You won't make a cow from hamburgers. But with some luck you may find out the cow was a bull and got killed by a truck.

    --
    45 5F E1 04 22 CA 29 C4 93 3F 95 05 2B 79 2A B2
    1. Re:To all those, who think it's useless... by pVoid · · Score: 3, Insightful
      Neat tricks are generally either one of these three things:

      A hidden API call - which can be easily found via ASM listings

      A nice little algorithm - which can be found in comp sci books

      An elegant piece of code - which can *not* be decompiled from ASM

      So no, I disagree with you.

    2. Re:To all those, who think it's useless... by janda · · Score: 1

      Quoteth the original poster

      Then just rewrite it in pretty fashion, changing variable names and functions to your needs and include in your own software. It's "the best of the worst", last resort at finding a solution to a small problem.

      And exposes you to possible trade secret and copyright infringement claims.

      Really, if you know somebody else can take input "a", do "something magical" with it, and get output "b", are you really willing to admit that they are smarter than you?

      --
      Karma: Food Fight (Mostly affected by Date Plate).
    3. Re:To all those, who think it's useless... by magic · · Score: 1
      4) e-mail the person who wrote it and ask how the trick works.



      -m

    4. Re:To all those, who think it's useless... by Anonymous Coward · · Score: 0

      Am I the only one who's getting really annoyed by people who use the word "puppy" to refer to things like code, stocks, CPUs, video cards, etc. ?!

    5. Re:To all those, who think it's useless... by Anonymous Coward · · Score: 0

      No. I could write a book on all the trite, cringeworthy phrases slashbots spew out. In fact, one day I may just do that.
      Another one that particularly makes me ill is "the pot calling the kettle black". It would seem that this perfectly useful phrase now has to be written as a story, involving the characters Pot, a pot, and Kettle, a kettle, and the former's observation of the latter's low albedo.
      "Can anyone say x", where x is something extremely easy to say... Aaarrgh!

    6. Re:To all those, who think it's useless... by Anonymous Coward · · Score: 0

      3) Decompile the puppy, look for that piece through what looks like piles of junk, but is way more readable than ASM and find it. Then just rewrite it in pretty fashion, changing variable names and functions to your needs and include in your own software.

      The ethical version of this (granted, still dubious) is to decompile it, examine the source code, understand what is being done, and then, based only upon the understanding you gained, write it from scratch.

      Maybe that's what you meant, but from the sounds of it you advocate taking the source code and just renaming the variables and tweaking it so that it's different enough.

    7. Re:To all those, who think it's useless... by Anonymous Coward · · Score: 0

      Ohhh that last one really gets my goat.

  39. Re:This is nonsense by Anonymous Coward · · Score: 0

    ..sic:

    "I work for SGI. I make nearly 6 figures. I know programming, and I damn well know computers. I've been working since 1968 in the computing industry."

    While I'm inclined to agree with you in just about every way, your immodesty makes me want to smack you.

    I work for a very profitable information services entity. I've been working in the IT Security Industry for five years, after having left the film industry. I know my way around computing, (I started with a Commodore PET in the 70s), but I'm not afraid to say that I don't know something when I need help, or I have an opportunity to learn.

    Oh... and I make a comfy bit, that's well into the six figure range. There are plenty of folks who make a bunch more than I do, who know a bunch less. Salary is not an indicator of expertise, in this arena, except maybe where self-promotion is concerned.

    You forgot to say that you are a jackass.

  40. Re:This is nonsense by Anonymous Coward · · Score: 0

    FUCK YOU FUCKING ASSHOLE!

    For the clueless mod that modded this clownshoes up, he's no other than ekrout

    Btw, FUCK BILL FUMEROLA and FUCK SCOTT LONG

    Thank you

    Brett Glass

  41. Re:Intresting by Anonymous Coward · · Score: 0


    Please film this activity. It would make an intresting video.

  42. Re:This is nonsense (on good authority) by Anonymous Coward · · Score: 0

    I work for SGI. I make nearly 6 figures. I know programming, and I damn well know computers. I've been working since 1968 in the computing industry.

    And just cuz you're an arrogant, self-important sonofabitch doesn't mean you're wrong!

  43. You're right, that is nonsense. by Anonymous Coward · · Score: 5, Funny

    I damn well know computers. I have been working with them since 1904, when the Black Man made the first computer out of a peanut. I now work for Cray research making 18 figures.

    I can scratch a superscalar CPU out of silicon with a pocket knife. I even have friends who can write major programs in binary code (yes, just 1s and 0s)... even though writing a simple "hello world" program can amount to 92,752 bits. I fail to realize that this ability does not a good computer scientist make. Things like intelligent design and research make a CS good.

    The parent post is fluff. It's stupid, the man is flamboyant and exaggerating. He clearly has no real education in computer engineering and does not recognize that any executable code can be reverse-engineered or decompiled. Especially since every language (save interpreted languages like Java) is compiled to machine code -- specific, unambiguous, structured code. "Decompiling" this is only really a matter of translating it into your language of choice.

    So, Mr. Proud American, please get off your imaginary high horse. You're not fooling anyone.

    1. Re:You're right, that is nonsense. by Anonymous Coward · · Score: 0

      I am pretty sure he isn't American considering the poster said "it doesn't seem like the author usually writes in English". Yeah...

    2. Re:You're right, that is nonsense. by Nerull · · Score: 1

      He is not replying to the article, he is replying to another post, which you probably didn't see, as it's currently at -1, Troll.

    3. Re:You're right, that is nonsense. by Anonymous Coward · · Score: 0
      even though writing a simple "hello world" program can amount to 92,752 bits


      92,752 bits = 11,594 bytes. I realize you're joking and all, but did you pull that number from a quick compile? There's pretty much no way in hell that you're going to need 11,594 bytes to write "hello world" by hand, even if you have to write everything from initial reset. A compiler might, but no human would. Heck, my compiler doesn't even generate that much code for "hello world" (although it can take advantage of the standard libraries).
  44. GNU/Agent GNU/Smith by Anonymous Coward · · Score: 0

    Think about it: In the Matrix Reloaded, GNU/Smith touches anyone else, and they become GNU/Smiths. He is just as viral as the GPL!

    And it's not like they only get a GNU/Smith arm in their otherwise normal bodies, they turn completely GNU/Smith.

    1. Re:GNU/Agent GNU/Smith by Pingular · · Score: 0

      You just ruined part of The Matrix Reloaded for me, thanks.

      --

      When anger rises, think of the consequences.
      Confucius (551 BC - 479 BC)
    2. Re:GNU/Agent GNU/Smith by Anonymous Coward · · Score: 0

      it's been out a week now... why don't you just watch it?

    3. Re:GNU/Agent GNU/Smith by Anonymous Coward · · Score: 0

      np

  45. Re:Intresting by Anonymous Coward · · Score: 0

    Intresting is a perfectly cromulent word.

  46. Re:This is nonsense by Anonymous Coward · · Score: 0
    I work for SGI. I make nearly 6 figures. I know programming, and I damn well know computers. I've been working since 1968 in the computing industry.

    I've been working in the computing industry since 1995 - school doesn't count - and make over 6 figures (well over if you want to include bonuses, options and restricted shares). What does this have to do with what I do or don't know...? Nothing!

    I agree with your comments about the article, but there's no reason to throw salary around to make a point. I make more than most of my friends and try not to rub anyone's nose in the dollars. (The reason I'm posting AC right now.)

  47. Re:This is nonsense by spiz21 · · Score: 1

    6 figures...riiight.
    You don't even know what you are talking about.

  48. Mirror!!! by Anonymous Coward · · Score: 1, Funny


    Here's the text from the original article:

    1. Make a copy of the program you want to decompile. Let's assume it's PROG.EXE. Copy it to PROGBACK.EXE.

    2. Copy PROGBACK.EXE to a DOS PC if you're not using one.

    3. Type EDIT PROGBACK.EXE from C:\ (or wherever you copied it to).

    4. Enjoy the source code! You can print it out or change it or just look at it.

    5. If you change it, use FILE SAVE.

    1. Re:Mirror!!! by Anonymous Coward · · Score: 0

      Personally, I laughed out loud at the parent post. Mod up, funny as hell. =)

  49. Reverse-engineering programs written in C/C++ by Anonymous Coward · · Score: 2, Interesting

    I've done some reverse-engineering on programs written in C/C++ (Intel x86). After a while you learn how to recognize different things like virtual function calls, while/for-loops, switch and stuff like that. However, it's a totally different thing to decompile to C++. It may be possible to decompile compiled code to C, but don't expect that it will look much like the original source, especially if the code was optimized by the compiler :)

  50. Templates by ucblockhead · · Score: 4, Informative
    He won't be able to regenerate any templates. If a program makes heavy use of templates, the "C++" he "decompiles" to is going to be hideously ugly.

    [insert joke about it being hideously ugly with templates here.]

    (I did not read the article itself because it is, of course, slashdotted)

    --
    The cake is a pie
    1. Re:Templates by jacobm · · Score: 1

      You could probably guess. Worst case, the decompiled version is refactored to be cleaner than the original, which doesn't sound all that bad to me. :)

      --
      -jacob
    2. Re:Templates by ryanr · · Score: 1

      You can recognize some C++isms in the disassembly... things like nearly identical functions that take a different number of parameters, etc.

      Not that the article in question will help you with that, since it has nothing C++ specific in it at all. In fact, it has enough inaccuracies and oversimplifications that I'm not sure it isn't detrimental to someone trying to learn disassembly for the first time.

  51. eh... by rebelcool · · Score: 1

    all modern compilers are optimizing compilers, and they reorganize code completely to suit themselves in the most efficient manner. The compiler will reorganize modules and rewrite lines of code in order to make better use of registers, processor features/limitations that
    You cannot really see a programmer's style as a result. When you decompile, you'll get it returned as whatever the compiler shifted the code around as.

    --

    -

    1. Re:eh... by photon317 · · Score: 1


      Exactly, you backed up my point while trying not to :)

      The fact that you cannot see the programmer's style, only the compiler's style, is what makes decompiling source much easier. It's easier to learn the thinking patterns of the compiler by observing its output in various cases than it is to write software that can guess random human patterns.

      --
      11*43+456^2
  52. Smart Compilers by czion3 · · Score: 1

    Don't smart compilers change recursion to iteration automatically? If so, how could the decompiled code be the same as the source? That's all I know from my little Java knowledge.

    1. Re:Smart Compilers by janda · · Score: 1

      I hope not. If I want it to be recursive, I'll code it that way. If I want it to be iterative, I'll code it that way.

      This is not to say that compilers shouldn't optimize things, such as dead code, register optimization, and stuff like that, but I know what I want, not the compiler writer(s).

      --
      Karma: Food Fight (Mostly affected by Date Plate).
    2. Re:Smart Compilers by jacobm · · Score: 1

      ... yeah ... compilers do little rewriting tricks that preserve semantics while improving runtime characteristics all the time, and converting recursion to iteration is definitely a good trick in that category. That's the essence of code optimization.

      --
      -jacob
    3. Re:Smart Compilers by larry+bagina · · Score: 1
      i think newer gcc's have a flag to check for tail recursion, and replace it with iteration. I've never tried it, though.

      Most functional language (scheme, lisp, haskell, ocaml etc) compilers will convert recursion into tail recursion, which can be converted into iteration in the compiled code.

      --
      Do you even lift?

      These aren't the 'roids you're looking for.

    4. Re:Smart Compilers by spydir31 · · Score: 1

      tail recursion only, IIRC.
      this has a nice list of optimizations.
      and this is just plain interesting. (go look at CiteSeer anyway, lots of interesting CS stuff)

    5. Re:Smart Compilers by Midajo · · Score: 1

      There're an infinite number of constructions that will never be the same in the decompilation as they were in the original source code. I don't know a single C++ programmer who will readily admit to using a goto, but guess what? Chances are their compiled code is fraught with unconditional branches, the low-level equivalent of goto. (This mentality predates C/C++. One of my first year professors jokingly referred to "The Cult of Nogoto".) Perhaps the decompiler in question is able to transform them back into more realistic code, but I don't know because I haven't been able to read the article yet.
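
      To make the point about branches concrete, here is a rough sketch with made-up function names: the same loop written once in structured form and once with explicit labels and gotos, which is closer in shape to what the compiled code contains. It only illustrates the idea and is not the output of any particular compiler.

```cpp
// Structured form, as a programmer would normally write it.
int count_down_structured(int n) {
    int steps = 0;
    while (n > 0) {
        --n;
        ++steps;
    }
    return steps;
}

// The same logic expressed with explicit branches.
int count_down_branches(int n) {
    int steps = 0;
loop_top:
    if (!(n > 0)) goto loop_exit;  // conditional branch out of the loop
    --n;
    ++steps;
    goto loop_top;                 // unconditional branch back to the test
loop_exit:
    return steps;
}
```

      A decompiler that wants readable output has to recognize the second shape and fold it back into the first.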

    6. Re:Smart Compilers by Minna+Kirai · · Score: 1

      Um, yes, if you want recursion, code it that way... but recursion is just a mechanism for you to communicate your intent to the computer (an aspect of a programming language).

      It's not instructions on how that intent must be carried out. As long as the final output of running the program is the same, the compiler is allowed to swap recursion and iteration or do many other things.

      In fact, since assembly language and binary code don't support recursion, you'd better hope that the compiler replaces it with iteration for you!

      When a compiler changes code from a language supporting recursion into one that doesn't, it can use two approaches. The more general approach is to use a stack to hold prior recursive states (which works adequately, especially if your target system is normal and already has a stack for function calls). There's also the special case of "tail recursion". When the recursion happens as the last statement of a function, then the prior function state can actually be completely discarded, and no stack is needed. So the recursion becomes a cheap loop.
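
      A minimal sketch of both approaches, assuming nothing beyond standard C++. The function names are invented for illustration: `sum_to_recursive` is tail recursive, `sum_to_loop` is the loop an optimizer could reduce it to, and `factorial_with_stack` handles a non-tail call by keeping pending state on an explicit stack.

```cpp
#include <cstdint>
#include <vector>

// Tail-recursive: the recursive call is the last thing the function does,
// so nothing from the current call is needed once it is made.
std::uint64_t sum_to_recursive(std::uint64_t n, std::uint64_t acc = 0) {
    if (n == 0) return acc;
    return sum_to_recursive(n - 1, acc + n);
}

// The equivalent loop: the call has become a jump back to the top.
std::uint64_t sum_to_loop(std::uint64_t n) {
    std::uint64_t acc = 0;
    while (n != 0) {
        acc += n;
        --n;
    }
    return acc;
}

// n * factorial(n - 1) is not a tail call, because the multiply happens
// after the inner call returns. An explicit stack holds the pending values.
std::uint64_t factorial_with_stack(std::uint64_t n) {
    std::vector<std::uint64_t> pending;
    while (n > 1) {
        pending.push_back(n);
        --n;
    }
    std::uint64_t result = 1;
    while (!pending.empty()) {
        result *= pending.back();
        pending.pop_back();
    }
    return result;
}
```

      Whether a given compiler actually performs the first transformation depends on the compiler and its optimization settings.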

    7. Re:Smart Compilers by Anonymous Coward · · Score: 0

      Wow you certainly are arrogant (and wrong). You'll fit in well here.

    8. Re:Smart Compilers by janda · · Score: 1

      To quote the previous poster:

      In fact, since assembly language and binary code don't support recursion, you'd better hope that the compiler replaces it with iteration for you!

      This is obviously some definition of the word "recursion" that I wasn't aware of before. Please elaborate.

      --
      Karma: Food Fight (Mostly affected by Date Plate).
  53. Java Decompiler? by mindstrm · · Score: 2, Interesting

    Anyone recommend a java decompiler known to work on the most recent versions of java, properly?

    Something that will literally give me code I can re-compile immediately?

    1. Re:Java Decompiler? by bricriu · · Score: 1

      You want JAD

      And for everyone that whines about "Oh, the decompiled code doesn't have pretty names...!" Who cares? You can puzzle through. Say some method in your app server throwing a NullPointerException... "well, where in the method could that be happening... decompile, put some debug here, and here... ah, that's weird, it's needs this obscure session variable, how did that go missing?" Now isn't that better than screaming "GODDAMN IT WHY DOES THIS CRAP KEEP BREAKING!!" and distressing your co-workers?

      --

      AHHHHHHH! I'm burning with goodness again!
      - Reakk, Sluggy Freelance

    2. Re:Java Decompiler? by Anonymous Coward · · Score: 0

      Who cares? You can puzzle through.

      Duh... That's why you should care.

      So you don't have to "puzzle through" by adding debug statements all over the code. :-P

    3. Re:Java Decompiler? by anonymous+loser · · Score: 2, Interesting

      JAD is a godsend. I wrote a very complex optimization method that was extremely effective in a couple of circumstances. A couple of years later, those circumstances turn up again only in a different language. I can't find the source code anywhere, just the class file that had my great method in it. So, JAD comes to the rescue; it gave me a bunch of code that used d1,d2,d3,... as my variables, but I already had a basic understanding of what each variable's role was, so it wasn't a problem for me to reverse-engineer my own code and finally port it to another language. I also made several back-ups of the source code this time. :-)

    4. Re:Java Decompiler? by arunkv · · Score: 1

      Use JAD. It's the best one for Java. If you want a decent GUI front end, get DJ Java Decompiler.

  54. Reverse engineering has its uses... by sheetsda · · Score: 4, Insightful

    There seem to be a lot of people in this story saying "shame on you for reverse engineering". It has its uses, how else would viruses, worms, and trojans be analyzed to figure out what they do and how they do it.

    1. Re:Reverse engineering has its uses... by Anonymous Coward · · Score: 0

      You mean there are viruses, worms and trojans written in C++ now? Boy, the days of "less is more" are long gone in hax0r-land.

    2. Re:Reverse engineering has its uses... by BitwizeGHC · · Score: 1

      The most publicized viruses and trojans have been written in... Visual Basic. Like, ew, man.

      --
      N4st0r, trixx0r h0bb1tz0rz! Th3y st0l3 0ur pr3c10uzz!
    3. Re:Reverse engineering has its uses... by stevenp · · Score: 1

      >> It has its uses, how else would viruses, worms, and trojans be analyzed to figure out what they do and how they do it.

      This is a DMCA violation, don't you know it!
      It is illegal to analyze how the virus works - it's called "theft of trade secrets"

  55. Re:This is nonsense by scottking · · Score: 2, Funny

    well, when SGI lays you off this week, you're going to have plenty of time to learn how to create programs in binary, just like your friends.

    --
    scott king
  56. A good decompiler shows you what was written by crovira · · Score: 4, Interesting

    not the source's lies.

    Losing source code and var names (name spaced globals aka statics and scoped locals) allows the cracker (these are rarely hacking tools, they're mostly cracking tools,) to focus on what the machine actually was told to do instead of smothering it with shades of meaning which interfere with understanding the code.

    C++ or Java or Smalltalk, or almost any highly structured language using machine code libraries or virtual machines result in structured blocks of code and heap and stack allocation.

    A good decompiler can take the machine code, peel away the name spaces and code calls, extract the patterns in the code and the hacker/cracker can read the patterns instead of wasting time on the code.

    Forensic analysis work is extremely useful at telling you what happened when something dies but it is no good at telling you how something worked. For that you need code traces.

    Map those code traces onto the structure the decompiler reveals and you understand the program better than the authors/coders.

    --
    MSBPodcast.com The opinions expressed here are my own. If you don't like 'em... Think up your own stuff.
  57. Anyone want to decompile SCO? by pchown · · Score: 4, Funny
    You might decompile one file and find a comment like this at the top:

    * This program is free software; you can redistribute it and/or
    * modify it under the terms of the GNU General Public License
    * as published by the Free Software Foundation; either version
    * 2 of the License, or (at your option) any later version.
    ;-)
    1. Re:Anyone want to decompile SCO? by Anonymous Coward · · Score: 1, Funny

      Nice try but c/c++ comments are removed by the pre-processor, so you won't find this in there unless it's in a string constant. And if so, then all you have to do is run 'strings' on the binary. ;)
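
      For anyone who hasn't used it, the core idea behind 'strings' is just scanning for runs of printable characters. A rough sketch follows; the function name `printable_runs` and the minimum length of 4 are choices made for this illustration, and the real tool has more options than this.

```cpp
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

// Collect runs of printable ASCII that are at least min_len characters long.
std::vector<std::string> printable_runs(const std::string& bytes,
                                        std::size_t min_len = 4) {
    std::vector<std::string> runs;
    std::string current;
    for (unsigned char c : bytes) {
        if (std::isprint(c)) {
            current.push_back(static_cast<char>(c));
        } else {
            // A non-printable byte ends the current run.
            if (current.size() >= min_len) runs.push_back(current);
            current.clear();
        }
    }
    if (current.size() >= min_len) runs.push_back(current);
    return runs;
}
```

      Fed the raw bytes of an executable, this would surface string constants such as a license notice, but never a source comment.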

    2. Re:Anyone want to decompile SCO? by Anonymous Coward · · Score: 0

      Oops, forgot to remove the -g option when I compiled.

  58. come on , hack a fuckin school? what a moron by Anonymous Coward · · Score: 0

    OrgName: University of Oklahoma, Health Sciences Center
    OrgID: UOHSC
    Address: P.O.Box 26901
    City: Oklahoma City
    StateProv: OK
    PostalCode: 73190
    Country: US

    NetRange: 156.110.0.0 - 156.110.255.255
    CIDR: 156.110.0.0/16
    NetName: UOKHSC
    NetHandle: NET-156-110-0-0-1
    Parent: NET-156-0-0-0-0
    NetType: Direct Assignment
    NameServer: DNS.ONENET.NET
    NameServer: TERRA.OSRHE.EDU
    Comment:
    RegDate: 1992-01-06
    Updated: 2002-04-19

    TechHandle: ZO24-ARIN
    TechName: OUHSC
    TechPhone: +1-405-271-1905
    TechEmail: networking@ouhsc.edu

  59. Decompilation by bolix · · Score: 1

    Source level access is at the core of the digital rights debate. I do not seek to recycle the arguments presented better elsewhere.

    I highly recommend the following website for those seeking to investigate decompilation/reverse engineering.

    Fravia
  60. misleading... by bismarck2 · · Score: 4, Informative

    Even with complete original source code, understanding a non-trivial C++ application is very difficult. Source derived from an optimized executable is going to be a LOT rougher. No real function names, module names, variable names, or comments. Use of standard libraries (STL, MFC, Boost) is likely highly obscured as well. A tool like this would probably produce source that looks more like a C/machine language hybrid rather than normal C++. The primary use of something like this is if you are looking for a very specific piece of logic such as a password check or an encryption operation or protocol details. When were these famous last words anyway?

  61. Decompiling to C++ is like... by Call+Me+Black+Cloud · · Score: 3, Insightful

    ...trying to rebuild a wrecked sand castle just by looking at the grains of sand. You can't. Compilers throw away a lot of information needed by people but not necessary for the machine. Compilers optimize the code to run more efficiently and that's a one-way street. Sorry to burst your bubble but trying to reconstruct original source is like trying to herd cats.

    Thank you, thank you. I'm Mr. Metaphor and I'll be here all week.

    1. Re:Decompiling to C++ is like... by davidstrauss · · Score: 2, Funny
      Thank you, thank you. I'm Mr. Metaphor and I'll be here all week.

      Calling yourself Mr. Metaphor is like using metaphor instead of analogy, which, in your case, is as incorrect as a cow marking its territory with cow pies and instituting an elaborate cow-tipping territory defense program.

    2. Re:Decompiling to C++ is like... by Call+Me+Black+Cloud · · Score: 1

      Thank you Mr. Simile...

  62. There are decompilers available by Anonymous Coward · · Score: 0
    dcc is one example of a decompiler that's available:
    http://www.itee.uq.edu.au/~cristina/dcc.html

    In reply to those who think that reconstructing the original source is impossible: That's not even the point. The goal is to take assembly and construct C/C++ that's readable, not exactly what you started with.

    Executables often have a lot of info in the symbol table and other places that often let you reconstruct even the same variable names for the decompiled source!

  63. Re:You can't-It's a small "code" after all. by Anonymous Coward · · Score: 0

    Bingo. Yes, the odds of getting the original are practically null, but the problem isn't quite like a simple mathematical multiplication/factoring in which all numbers are equal. I'm finding with my research that most likely combined with a minimalization goal can carry one far. Intent (and hence structure) can be morphed by the compiler but still must remain intact if the code is to work as the programmer intended.

  64. Oh yea by Anonymous Coward · · Score: 0

    I'd rather decompile my bare bum than C++

  65. Re:i just registered slashdot.tv by Anonymous Coward · · Score: 0

    slashdot.tv has not been registered and is available for $50.00/ year*.

  66. Why stop there? Copy someone's whole life! by Wills · · Score: 1
    If you are a competent well-educated programmer, why would you need to take somebody else's code when you could simply design and implement your own solution in as much time as a reverse-engineering effort would probably take?

    Code or pseudocode is available free for many thousands of tough algorithmic problems which have been studied and published in the literature (e.g. Knuth et al) which is to be found in most good university libraries and/or the Internet.

    1. Re:Why stop there? Copy someone's whole life! by be-fan · · Score: 1

      Let me give you an example. A while ago I was trying to figure out how the Intel C++ compiler called global constructors during program initialization. This was before the standard x86 C++ ABI, so this wasn't documented anywhere. The only way to figure out how things were working was to wade through ASM listings and hex dumps. Also, there are lots of places where reverse engineering is absolutely crucial. For example you can sometimes figure out the protocol to a device by analyzing the binary code of the driver.

      --
      A deep unwavering belief is a sure sign you're missing something...
  67. pssssshhhh Intel wants to buy it by Anonymous Coward · · Score: 0
    The rich Intel wants to buy a C++ decompiler for its Itanium ... (when it makes x86 code disappear) from someone very poor.

    JCPM (copyright) (c)

  68. Re:Intresting by Anonymous Coward · · Score: 0

    No. No, it doesn't.

    Sorry, but it's obvious by this point that you're a complete retard, so I won't bother even looking at any replies you make. Have a nice day, fuckwit.

  69. Re:Text of the article by Anonymous Coward · · Score: 0

    hi! could you show me how to stop breathing? thanks.

  70. Single sentence answer. by NoMoreNicksLeft · · Score: 1, Interesting

    Updating Total Annihilation to use opengl, increasing the number of weapons (currently 256), and increasing the weapon limit (3 per unit).

    1. Re:Single sentence answer. by Decimal · · Score: 1

      Re: Your sig...

      Build an internet incorruptible by corps and goverments. MetaNET

      How exactly is this different from Freenet?

      --

      Remember "Bring 'em on"? *sigh
  71. thanks for nothing. by twitter · · Score: 3, Interesting
    If you're trying to reverse-engineer someone else's code, shame on you; go find honest work.

    Shame on you Davak, you should go find honest code. There's nothing wrong with trying to understand how things work. Some people are stuck with legacy equipment or code they can't replace easily and this is their only option for improvement or even fixing it. Those people would be better off if free code were available. Sometimes the only way to make that free code is to understand the original code. There's nothing wrong with reverse engineering software, ever. Republishing someone else's binary is not legal, but it's not immoral. If the code were honest to begin with, the reverse engineer part would not be required. These days, it's cheaper to throw out the dis-honest code and hardware and buy some hardware that's well understood. If you make hardware or software, I hope you understand the implications for your product - I'm not buying it.

    --

    Friends don't help friends install M$ junk.

    1. Re:thanks for nothing. by TC+(WC) · · Score: 1

      If the code were honest to begin with, the reverse engineer part would not be required. These days, it's cheaper to throw out the dis-honest code and hardware and buy some hardware that's well understood.

      Dash those dishonest, lying, binary files!

    2. Re:thanks for nothing. by gwappo · · Score: 1
      Agreed. Also worth mentioning: decompilers make life easier when analysing foreign worms.

      (Things like Nimda are a lot more hairy due to HLL use than when compared to Code Red).

  72. Re:Intresting by Anonymous Coward · · Score: 0

    Intrestingly enough, I try and avoid making my pubic hair public hair - it's so embarrassing.

  73. Useful for compatibility reasons by wilddur · · Score: 2, Interesting

    In Europe it is legal to use reverse engineering for compatibility reasons, enabling your software to work with other people's software (mainly Microsoft's).

    If you do the reverse engineering in Europe, you could develop compatible software and then export it to the US. So it may be great news for us. In fact, it is becoming really complicated to develop software for or in the US: patents, legislation, compatibility. It seems that more lawyers than programmers are needed to write something more complicated than HelloWorld.exe.

    There is a need for tools that enable compatibility between programs, or we will end up with a monopoly on all kinds of programs (and it is illegal to use your O.S. monopoly to obtain a monopoly on, let's say... web browsers).

  74. Not a Mirror!!!! BAD JOKE ALERT by Univac_1004 · · Score: 1

    sick. DoT (Denial of Thought)

  75. Untrue, Java has obfuscators and is much more OO by Anonymous Coward · · Score: 0

    Since Java programs tend to be more OO and thus complex
    the decompiler will give just as much information.
    Java is much simpler than C++ so things might look trivial
    compared to a decompiled C++ application but because of
    complex relationships between objects with no names
    you will have a hard time following it.

    In C++ you will have larger methods due to inlining, but a
    "really" smart decompiler can find common pieces of code
    and separate them to a method.

    So yes Java will be simpler to read decompiled, but only
    because the language is simpler (which means you will
    be more productive and write more code).

  76. Slight misunderstanding here . . . by Selanit · · Score: 2, Informative
    If you're trying to reverse-engineer someone else's code, shame on you; go find honest work.

    Shame on you Davak, you should go find honest code ...
    If you read carefully, you'll note that the "honest work" sentence is NOT Davak's. It is still indented as part of the blockquote, and therefore is the final section of the passage he was quoting from that C++ FAQ. The last sentence that is actually Davak's is his comment about wishing to post as an anonymous coward, presumably to avoid situations like this one. Since AC posting wasn't working for him, it might have been a good idea to italicize the quoted passage to set it off clearly from the rest of the post. Oh well, too late now.
  77. Re:Intresting by User61 · · Score: 1

    Indeed! You can teach yourself more engrish at www.engrish.com

  78. Re:Intresting by User61 · · Score: 1

    Last time I checked 'Intresting' was in the English dictionary.

    That would be from the Engrish dictionary actually.

  79. Re:Why? Java-style reflection by Mad+Bad+Rabbit · · Score: 1
    Why would you want to do this unless you were stealing source?

    If this technique works (haven't read it, page is slashdotted), maybe it could be used to implement Java-style runtime reflection for C++, which would be extremely cool and useful. Get a pointer to a method, decompile it to find out the expected arguments and return type, and dynamically invoke it.

    --
    >;k
  80. Decompilation = halting problem by Wizard+of+OS · · Score: 3, Insightful

    Why do people keep thinking that decompilation is possible? In short: decompiling a computer program is solving the halting problem. Period.

    The long version: In a compiled computer program there is no distinction for either code or data. Every byte in memory can be data, but it can also be executed as valid computer code.

    Now, the catch is that during compilation, data and code are mixed in the resulting binary. For instance take the compilation of a 'case' statement. There are several ways of compiling a case:
    - you can write it as a list of IF's, which is perfectly decompilable
    - you can write it as a jump, based on the case expression.
    The fun part about the second possibility is that it's far more efficient, but it poses a problem: when decompiling this you have to know where the bounds of the case lie. What's the furthest jump that can be made? It's a jump based on a calculated value, so you should know which values are possible. But for that, you need to run the program, and more specifically, you must run all possible execution paths.
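
    The two strategies can be sketched at source level as follows. The handler functions and dispatch names are hypothetical stand-ins for the bodies of three case labels, and the table of function pointers is only an analogy for the address table a compiler might emit.

```cpp
#include <array>
#include <cstddef>

// Hypothetical handlers standing in for the bodies of three case labels.
int handle_zero() { return 10; }
int handle_one() { return 20; }
int handle_two() { return 30; }
int handle_default() { return -1; }

// Strategy 1: a chain of comparisons. Every branch target is explicit.
int dispatch_if_chain(unsigned value) {
    if (value == 0) return handle_zero();
    if (value == 1) return handle_one();
    if (value == 2) return handle_two();
    return handle_default();
}

// Strategy 2: index into a table of targets using the computed value.
int dispatch_jump_table(unsigned value) {
    static const std::array<int (*)(), 3> table = {handle_zero, handle_one,
                                                   handle_two};
    if (value >= table.size()) return handle_default();  // range check
    return table[value]();
}
```

    Note the range check in the second version: when generated code guards the indexed jump this way, the comparison is one clue to how large the table is.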

    This can be rewritten as the instance of the halting problem: can a computer find out for any program whether or not it will halt? It is proven that a computer program cannot be written to do this task. Neither can a computer program decompile any other computer program.

    --

    --
    If code was hard to write, it should be hard to read
    1. Re:Decompilation = halting problem by Anonymous Coward · · Score: 0

      Except that data that is not code will most likely contain illegal instructions.

    2. Re:Decompilation = halting problem by johannesg · · Score: 2, Insightful
      Your understanding of the consequences of the halting problem is incomplete. It is not a proof that it is impossible to determine of any given program whether or not it stops in finite time. It is merely a proof that there exists a class of programs for which this determination cannot be made. However, there are also many programs for which it can easily be determined whether or not they stop in finite time, and the same thing is true for decompilation.

      Furthermore, there is nothing saying that it has to do a 100% perfect job. Decompilation is already accepted to be imprecise; using some common sense (intuition if you want) to fill in some gaps is not an invalid method.

      The problems that face decompilation that stem from real-world issues are far, far greater than this (rather theoretical) problem. For example, decompiling any STL-based source to a useful state will be far more difficult than a simple jumptable.

    3. Re:Decompilation = halting problem by Anonymous Coward · · Score: 1, Insightful

      You are assigning too much "magic" to the process of compilation.

      Compilation is the process of converting one defined stream of data into another defined stream of data based on a ruleset. The ruleset usually discards useful meta information, like comments, but the core information is retained and the process is, for the most part, predictable--feeding the same source to the same compiler with the same settings will yield identical results as far as this discussion is concerned.

      > It is proven that a computer program cannot be
      > written to do this task

      The simplicity of the ruleset for early versions of VB allowed for the proliferation of VB decompilers in the early 90's. That must tell you something.

    4. Re:Decompilation = halting problem by Anonymous Coward · · Score: 1, Insightful

      Ah, the joys of being out of the academic environment. You can ignore the people that tell you the things you commonly do are impossible.

    5. Re:Decompilation = halting problem by 42forty-two42 · · Score: 1

      It's not a halting problem - you still can't tell whether the C++ program will halt. Therefore, there's no logical inconsistency. QED.

  81. Of course you can decompile C++ by TapeLeg · · Score: 3, Informative

    You can decompile any program. A compiled program is just your high-level program translated into machine language. There is no sort of magical encryption or similar transformation that it undergoes once you compile it.

    All you need to do is read in the bytes of any binary program, interpret the bytes as their machine language equivalents for whatever platform you are using, and then convert your MOV statements to assignment operators, JMP statements to higher level loop structures, etc.

    Of course, you won't retain the names of identifiers, which are referred to only by memory locations in a compiled program; and some control structures might be rearranged due to compiler optimization and the lack of machine language equivalents, but the meat and potatoes of it is all right there.
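
    A toy sketch of that pipeline, assuming a made-up three-instruction machine with three bytes per instruction. It is NOT a real instruction set, and the names Opcode, Insn, decode, lift and decompile are invented for illustration. Registers come back as r0, r1 and so on, since the original identifiers are gone.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical toy machine, not a real ISA.
enum Opcode { OP_MOV_IMM = 1, OP_ADD = 2, OP_RET = 3 };

struct Insn {
    Opcode op;
    int dst;   // destination register number
    int src;   // immediate (MOV_IMM) or source register (ADD); unused for RET
};

// Step 1: interpret the raw bytes as instructions (3 bytes each in this toy format).
std::vector<Insn> decode(const std::vector<unsigned char>& bytes) {
    assert(bytes.size() % 3 == 0);
    std::vector<Insn> out;
    for (std::size_t i = 0; i < bytes.size(); i += 3) {
        Insn in = { static_cast<Opcode>(bytes[i]), bytes[i + 1], bytes[i + 2] };
        out.push_back(in);
    }
    return out;
}

// Step 2: turn one instruction into a C-like statement with invented names.
std::string lift(const Insn& in) {
    const std::string d = "r" + std::to_string(in.dst);
    switch (in.op) {
        case OP_MOV_IMM: return d + " = " + std::to_string(in.src) + ";";
        case OP_ADD:     return d + " += r" + std::to_string(in.src) + ";";
        case OP_RET:     return "return " + d + ";";
    }
    return "/* unknown opcode */";
}

// Both steps together: bytes in, one statement per line out.
std::string decompile(const std::vector<unsigned char>& bytes) {
    std::string text;
    const std::vector<Insn> insns = decode(bytes);
    for (std::size_t i = 0; i < insns.size(); ++i) text += lift(insns[i]) + "\n";
    return text;
}
```

    A real decompiler also has to recover control flow and data types; this only shows the straight-line "MOV becomes assignment" part.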

    It's by no means easy to accomplish, especially with higher and higher level programming languages, but impossible? humbug! =)

    1. Re:Of course you can decompile C++ by jacobm · · Score: 1

      Actually, compilation is often lossy. For instance, it's not uncommon for a compiler to translate for loops to while loops or boolean relational operators into if statements (not in C or C++, maybe, but in other languages). Once the compiler does that, there's no way of determining which fragment was in the original code.

      --
      -jacob
    2. Re:Of course you can decompile C++ by dvdeug · · Score: 1

      It's by no means easy to accomplish, especially with higher and higher level programming languages

      It seems like a higher level language would sometimes help, especially if they aren't running a highly optimizing compiler, because it has more patterning. An assembly language subroutine could be a chunk of code that's simply goto'ed to, or even if it does use the standard subroutine opcodes, it could use whatever calling ABI it wanted. A C subroutine, by most compilers, is going to be surrounded by code to push and pop the stack, and is going to have one standard ABI. Even a highly optimizing compiler is usually only going to optimize tail-recursion and leaf calls. A compiled APL program might be trivial to decompile, with each character being represented by either a function call or a fairly standard chunk of code.

  82. No time? by Anonymous Coward · · Score: 0
    haven't tried it, but will when I have time

    Why does the poster have time to read the article and post it to Slashdot, but not actually try out the very thing discussed in the article?

    I'll never understand the expression "when I have time." It always comes up in a conversation. If you have time to talk about it, make the time to do it. Jeebus.

  83. Harry Vagina? by Anonymous Coward · · Score: 0

    Is that your real name, Larry? Must've sucked in junior high, eh?

  84. Only Works for PE (.exe) and MS Visual C++ by Anonymous Coward · · Score: 1, Interesting

    Having finally gotten through to the server momentarily, it appears that the article in question only applies to MS Visual C++.

  85. Re:misleading... no you were by jackb_guppy · · Score: 1

    It's all those "standard" things that make it work very well. Yeah, variable names can be helpful. But those standard calls give me a lot of variable names with exact meaning. Very helpful.

  86. C/C++/C# alternative by Pantheraleo2k3 · · Score: 1

    Hmm... Sounds a lot like programming with batch files to me. Why don't you guys get with the program and dump C/C++ for smaller, faster, easy to understand programs that run on Windows and DOS! Batch files forever. I will personally growl at those who do not respect them. Click here if you don't know what Panthera leo means

    1. Re:C/C++/C# alternative by ischorr · · Score: 1

      Click where?

  87. algorithm? by dollargonzo · · Score: 1

    i really disagree about the second one. mostly, because *gasp* not all algorithms people use come from comp sci books. people develop algorithms all the time, so looking into code for that specific reason might be extremely helpful.

    --
    BSD is for people who love UNIX. Linux is for those who hate Microsoft.
  88. Re:Why? Java-style reflection by KingRamsis · · Score: 1

    what a sick hack :-)
    ever heard the term RTTI?

  89. I couldn't help it by Fnkmaster · · Score: 4, Funny
    Neo: Do you always look at it in binary?


    Cypher: Well you have to. The compilers work for the construct program. But there's way too much information to decode the Matrix. You get used to it. I...I don't even see the code. All I see is an array, function pointer, integer. Hey, you uh... want a drink?


    Neo: Sure.


    Cypher: You know, I know what you're thinking, because right now I'm thinking the same thing. Actually, I've been thinking it ever since I got here. Why, oh why didn't I sell my VA Linux stock?... Good shit, huh? Cowboy Neal makes it. It's good for two things, degreasing Perl code and killing brain cells.

    1. Re:I couldn't help it by Anonymous Coward · · Score: 0

      A true classic. This will go into my Slashdot Classic archive. :-)

  90. You just need new people by zackbar · · Score: 1

    Of course, it would probably help if the new people actually have skills.

  91. Re:Decompilation = halting problem == boloney by jackb_guppy · · Score: 1

    As I noted elsewhere, it is easy to decompile the program. Think about it one way: THAT IS WHAT THE COMPUTER DOES TO RUN IT.

    If you write a program that simulates the functions of the processor (YOU KNOW THEY ARE WELL KNOWN, ELSE NO COMPILERS), load the program the way the system loads it (AGAIN, WELL KNOWN), and then follow all the branches and data pointers, you have a nice map of the binary.

    Once you add pattern matching and known function calls (say, printf) to that, you have the map worked out quite well.

    Add back known inputs and symbolic function names and the code appears.

    One note: it is not the original code, but it is the 100% functional equivalent.

  92. Goatse in parent by Anonymous Coward · · Score: 0

    you have been warned

    1. Re:Goatse in parent by Anonymous Coward · · Score: 0

      No shit. I was making a point.
      Who doesn't use "show link domain", anyway? You're just asking for trouble otherwise.

  93. Decompiling is possible, but hard by Animats · · Score: 4, Interesting
    Decompilers are rare, but possible. The first good one, decades ago, decompiled IBM 1401 assembler programs into COBOL. There's a commercial business, The Source Recovery Company, still doing that for legacy mainframe programs.

    C decompilers exist; here's one. There are others. Most aren't very good. It's a hard problem.

    Without debugging information, decompilation tends to result in code with arbitrary variable and function names, of course. But you get names when a DLL or .so is entered, so at least you get the program's major interfaces. Minimal C++ decompilation could be done by adding vtable recognition to a C decompiler.

    A more difficult problem is recognition of idioms. Things like "for" statements tend to decompile as lower level constructs. That's OK as a first step. You need some internal representation. Initial decompilation might represent all transfers of control with "goto"; higher level recognition then deals with that.

    The key to doing a good job is "optimization", finding more concise source code that will generate the object code. The key to this problem is defining an internal representation that can represent any valid machine-language program, and which can be modified as higher level information about the program is discovered. The first step is usually to start at the starting address and build a code tree by following calls, like a good debugger does. Then you start to improve on the code tree, doing things like this:

    • Recognition of function calls. Each function call should be decompiled, and all calls to the same function checked to ensure they have the same calling sequence. Then a prototype can be generated and placed in a header file.
    • Recognition of fixed-format structures. Figuring out how big a structure is can be tough, but at least fixed-format ones should be fully recognized. All references to the structure should be checked for type consistency, and a structure definition generated.
    • Recognition of "for", "while", and "switch".
    • Once constructors and destructors have been found, the structure of derived objects can be figured out. Now class definitions can be generated.
    • Once class member functions have been identified, the most restrictive protection ("private", "public", "protected") that will work should be attached. Similarly, "const" can be inserted for all arguments not seen to be modified.
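
    To make the "goto first, structure later" idea concrete, here is a hand-written before and after for one small loop. Both functions are illustrative only (count_bits_goto and count_bits_structured are invented names); the first mimics what an initial pass might emit, the second what loop recognition could turn it into.

```cpp
#include <cassert>

// First pass: every transfer of control is a goto, mirroring the
// compare-and-branch shape of the machine code.
int count_bits_goto(unsigned v) {
    int n = 0;
top:
    if (v == 0) goto done;
    n += static_cast<int>(v & 1u);
    v >>= 1;
    goto top;
done:
    return n;
}

// After loop recognition: the backward goto plus the exit test
// are folded into a single while loop with the same behaviour.
int count_bits_structured(unsigned v) {
    int n = 0;
    while (v != 0) {
        n += static_cast<int>(v & 1u);
        v >>= 1;
    }
    assert(v == 0);  // the loop only exits once every bit is shifted out
    return n;
}
```

    Whether the original author wrote "while" or "for" here is not recoverable; either would produce this shape.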

    Decompilation won't always succeed. But you should find all the places where the code is doing something the compiler doesn't understand, and get code back for everything else.

    It's a big job, and somebody ought to do it. Among other things, it would be a valuable tool for finding compiler bugs.

  94. Re:Decompilation = halting problem == boloney by Wizard+of+OS · · Score: 1

    You have just proven my point. You need to write a program that simulates a process, and then run the program on that. Then you must run it using ALL POSSIBLE INPUTS.

    Now, go study the halting problem, then write your reply again.

    --
    If code was hard to write, it should be hard to read
  95. Source files? by krumms · · Score: 1

    I haven't RTA, but how would you determine which code goes in what source files? On a per-class basis? And then end up with files like "a.hpp" "a.cpp" "aaa.hpp" etc.?

    Or put them all into one big source dump, which would take an eternity to load up for any non-trivial program? Mind you, disassemblers tend to work this way ...

    I think this would also be slow and horribly inefficient - ever tried to disassemble executables that are a hundred MB or so in size? Still waiting? Multiply that time ten fold as JMPs, JNZ and JZs are analyzed to determine whether something is a while, do-while or a for loop. Or to determine if a statement is a 'break' or a 'goto'.

    All this before even mentioning the matter of different compilers - g++, msvc, borland c++, etc., etc. etc.

    Ah I dunno, who cares - to ye who implements this shit: kudos, my friend, kudos.

  96. I doubt it. by joto · · Score: 1
    It would look nowhere like that.

    First: There would be no private members. That is information that does not exist in the compiled code. Everything would be public, or just declared as a struct.

    Second: The sequence starting from int x; going to j: would be optimized away to just:

    f.d(1);
    f.d(2);
    f.d(3);

    Third: Since class a is a global name (which you can find by looking at the name-mangled a.b() and a.d()), the decompiler should be able to come up with the correct name.

    Fourth: It might be too hard for the decompiler to correctly guess the layout of class a, given that it has no virtual member functions (thus doesn't need a vtable), the default constructor is simple enough to be inlined, and that no heap allocation of a-objects occurs. If the two member functions use both g and h it should be able to find them both, but there might be friend functions elsewhere that use more members, and that is again information that would be hard to infer from the binary code. It should guess this correctly, but it should also insert some warning that it wasn't really sure...

  97. Is This Really C++ by BrianEnigma · · Score: 1

    It looks like the author is decompiling simple C programs that are compiled using Visual C++. His sample programs consist of nothing more than a main() function, a global character array, and sometimes another global function or two. It does not address ANY features of C++, even fundamental ones like classes.

    I do not see how this is decompiling C++. It is simply decompiling C.

    1. Re:Is This Really C++ by Specialist2k · · Score: 2, Funny
      And how do you compile the following statement using a C compiler?

      cout << "member 1: " << local_struct.member1 << endl;

    2. Re:Is This Really C++ by Anonymous Coward · · Score: 0

      /*
      cout << "member 1: " << local_struct.member1 << endl;
      */

      You could also use a special preprocessor, like they used for the original prototype C++ compiler.

    3. Re:Is This Really C++ by ryanr · · Score: 1

      That's the only bit of C++ I saw. The original point stands, and the author gave absolutely no info about how to disassemble C++ above and beyond C.

    4. Re:Is This Really C++ by CrocOS · · Score: 1

      Lemme get this straight: You agree that the author has shown how to decompile some C++ code, but then complain that he hasn't shown how to decompile C++ code?

      The other issue here is the nature of the beast: There is nothing in C++ that can not be expressed in C.... it might be a little more confusing to read, but hey... at least he's not just converting the .exe to assembler =)

      --

      I should really get around to creating a sig.... Nah - too lazy =)
    5. Re:Is This Really C++ by ryanr · · Score: 1

      Lemme get this straight: You agree that the author has shown how to decompile some C++ code, but then complain that he hasn't shown how to decompile C++ code?

      No. I agree that the bit of code shown is C++. He didn't disassemble that bit. Even if he had, that's not the interesting bit about disassembling C++. All the interesting (PITA) stuff for disassembling C++ comes in when the programmer is using objects, member functions, constructors, destructors, etc...

      The other issue here is the nature of the beast: There is nothing in C++ that can not be expressed in C.... it might be a little more confusing to read, but hey... at least he's not just converting the .exe to assembler =)

      Not true. You can express any algorithm in either, they're both Turing complete. You can absolutely use some very different syntax in C++, that's the point of it existing. :)

      And yes, I don't know if you're being sarcastic, but he did just basically convert the .exe to assembler.

    6. Re:Is This Really C++ by CrocOS · · Score: 1

      I've got to admit to a certain degree of sarcasm about the conversion (I did RTFA after all) =)

      My understanding of C++ compilation is that all of the template and class information, including the object structure etc, is effectively lost when it is compiled. The only way that functions could be determined is through laborious tracking... but then it won't be pretty either, with no function-names etc. About the only things where function-names would be recoverable would be the ones that have external entry-points.

      I find this to be an intriguing concept: How C++ like could we make the decompiled .exe? Even given the fact of the unrecoverable parts (eg: templates) the majority - enough to form a "hard-coded" version of the program - should exist. Still, bugger trying to be the one that writes the heuristics to reverse-match the... basically assembler, into valid C++ code structures!

      --

      I should really get around to creating a sig.... Nah - too lazy =)
    7. Re:Is This Really C++ by ryanr · · Score: 1

      My understanding of C++ compilation is that all of the template and class information, including the object structure etc, is effectively lost when it is compiled.

      Sure. That stuff doesn't exist at the machine-code level. An object turns into a bunch of functions that all pass around a pointer to the same structure. Virtual functions turn into however many versions of the actual function that are required.
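
      A small sketch of that flattening, assuming a simple non-virtual class. Counter is an invented example, and the Counter_lowered struct with its free functions is a hand-written approximation of what survives, not actual compiler output.

```cpp
#include <cassert>

// What the programmer wrote.
class Counter {
public:
    explicit Counter(int start) : value_(start) {}
    void add(int n) { value_ += n; }
    int get() const { return value_; }
private:
    int value_;   // 'private' is a compile-time notion only
};

// Roughly what is left afterwards: a plain structure plus free
// functions that all take a pointer to it as their first argument.
struct Counter_lowered { int value; };

void Counter_ctor(Counter_lowered* self, int start) {
    assert(self != 0);
    self->value = start;
}
void Counter_add(Counter_lowered* self, int n) { self->value += n; }
int  Counter_get(const Counter_lowered* self)  { return self->value; }
```

      A decompiler that sees several functions all taking the same structure pointer as their first argument has a decent hint that they were members of one class.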

      The only way that functions could be determined is through laborious tracking... but then it won't be pretty either, with no function-names etc. About the only things where function-names would be recoverable would be the ones that have external entry-points.

      Same as a C program. There are often clues that help you name the functions again. For example, authors have a habit of using the filename or function name as part of their assert messages. Those show up as simple strings in the binary. For functions with no clues, you name them based on what they are doing.
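
      Hunting for those strings can be sketched as a scan for runs of printable bytes, similar in spirit to the Unix strings utility. The function name printable_runs and the min_len parameter are invented here, and a real tool would handle more encodings than plain ASCII.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

// Collect every run of at least min_len printable bytes from a binary image.
std::vector<std::string> printable_runs(const std::vector<unsigned char>& image,
                                        std::size_t min_len) {
    assert(min_len > 0);
    std::vector<std::string> found;
    std::string run;
    for (std::size_t i = 0; i < image.size(); ++i) {
        if (std::isprint(image[i])) {
            run.push_back(static_cast<char>(image[i]));
        } else {
            // A non-printable byte ends the current run; keep it if long enough.
            if (run.size() >= min_len) found.push_back(run);
            run.clear();
        }
    }
    if (run.size() >= min_len) found.push_back(run);  // run reaching end of image
    return found;
}
```

      Short runs are mostly opcode bytes that happen to be printable, which is why a minimum length is used as a crude filter.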

  98. Re:Intresting by Anonymous Coward · · Score: 0

    No, I think they were looking for the grammar nazi. Any relation?

  99. Halting is a red herring by yerricde · · Score: 2, Insightful

    Now, the catch is that during compilation, data and code are mixed in the resulting binary.

    Not last time I checked. My compiler emits at least four segments in a compiled program: .text (program code), .rodata (initialized data marked as 'const'), .data (initialized data), and .bss (zero-filled data, which is run-length encoded). Segments .text and .rodata are also write-protected.

    Yes, there is a halting problem, but this isn't it. Segments make distinguishing code from data straightforward. I understand that a few programs that make platform-specific API calls to write-enable .text would be harder to disassemble (and subsequently decompile), but do most user programs make such calls?

    Besides, even if the halting problem were relevant, the halting problem can be solved in a real computer, which has limited memory and is thus a linear bounded automaton rather than a Turing machine.

    --
    Will I retire or break 10K?
    1. Re:Halting is a red herring by js7a · · Score: 1
      the halting problem can be solved in a real computer

      No, it can not. Write a program which will loop until any key is pressed. When will it halt? No program can determine the answer in advance.

      The grandparent post has a good point about the technical impossibility of always correctly decompiling a computed branch address. However, in practice, those don't appear. I know of only one compiler that ever did anything like that. These days, if you have a switch statement with cases of enough chars to make it worth the construct, dispatch tables are used instead.

    2. Re:Halting is a red herring by napir · · Score: 1

      Besides, even if the halting problem were relevant, the halting problem can be solved in a real computer, which has limited memory and is thus a linear bounded automaton rather than a Turing machine.

      Yeah, and I can exactly determine the position and velocity of an electron using only a tube of toothpaste, a can of Pabst Blue Ribbon, and the skull of a housecat. But I don't see any reason to justify my claims either.

    3. Re:Halting is a red herring by Minna+Kirai · · Score: 1

      No, it can not. Write a program which will loop until any key is pressed. When will it halt? No program can determine the answer in advance.

      That example is irrelevant to the actual Halting Problem.

      The Halting Problem assumes that both the program itself and all its inputs are available to you beforehand. In that case, it is still impossible to detect if the program will terminate without actually running the whole program.

      (And if the program takes decades to execute, then that possibility is eliminated)

    4. Re:Halting is a red herring by Dachannien · · Score: 1

      But that example *was* relevant to the suggestion that real computers can solve the halting problem on programs written to be able to run on real computers.

    5. Re:Halting is a red herring by jacobm · · Score: 1

      Don't be silly, the statement you quote is obviously true. Turing machines rely on the fact that they have infinite tape. If you only have 17 zillion slots on your tape, you're just a DFA in a fancy dress because there are finitely many states you can possibly be in.
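
      A minimal sketch of why finitely many states make the question decidable, assuming a deterministic toy machine whose entire state fits in 16 bits and which reads no outside input (so the keypress example above is out of scope). The names State, StepFn, halts, count_down and bounce are all invented for illustration.

```cpp
#include <cassert>
#include <set>

// Hypothetical toy machine: its whole state is one 16-bit value.
typedef unsigned short State;
// One step of the machine; returns false when the machine halts.
typedef bool (*StepFn)(State& s);

// Returns true if the machine halts from 'start', false if it loops forever.
// Terminates because there are at most 65536 distinct states to visit.
bool halts(StepFn step, State start) {
    assert(step != 0);
    std::set<State> seen;
    State s = start;
    for (;;) {
        // Deterministic machine: revisiting a state means an endless cycle.
        if (!seen.insert(s).second) return false;
        if (!step(s)) return true;
    }
}

// Example program that halts: counts down to zero.
bool count_down(State& s) {
    if (s == 0) return false;
    --s;
    return true;
}

// Example program that never halts: bounces between 0 and 1.
bool bounce(State& s) {
    s = (s % 2 == 0) ? 1 : 0;
    return true;
}
```

      The catch is scale: a real machine with N bits of memory has up to 2^N states, so this is decidable in principle long before it is feasible in practice.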

      --
      -jacob
    6. Re:Halting is a red herring by napir · · Score: 1

      Yeah, I read this wrong. I read it as "decide halting problem of TM using an LBA" instead of "LBA problem using a TM".

      That and I had a compelling desire to use the phrase "skull of a housecat".

  100. Re:Intresting by Anonymous Coward · · Score: 0

    You rode the short bus to school, didn't you?

  101. Lossy is OK by yerricde · · Score: 1

    And it's perfectly fine for decompilation to be lossy. The point of decompilation is not to recover the original source code byte-for-byte but to recover something from which a programmer of ordinary skill can recover the gist of the algorithm.

    --
    Will I retire or break 10K?
  102. what's the point? by KMonk · · Score: 1

    I have worked on a few projects where I was asked to pick up in the middle of a project already started in c++ or c, and even with comments and full normal source... figuring out someone else's code for a project of any size is a big pain, I can't imagine how hard a decompiled source would be.

  103. I don't know about you but by jsse · · Score: 1

    whenever someone told me my java source code is unreadable, I usually feed it into a decompiler, then send it back to the whiner. It usually fixes them up. (yes, the decompiled .java has no comments, but neither does the original source)

  104. Silly idea and inconsequential material. by scherrey · · Score: 1, Informative

    First off, there is no "decompiling" going on here. That would imply that you will end up with code having a semi-resemblance to the original code - which is certainly not happening. What is going on here is simply just another compilation phase. This time, instead of an object file target compliant with the system ABI, you are getting a C/C++ file target which should theoretically be compilable into a program that will generate the same output for the same runtime input. The scope of effort and implications barely overlap as they are so vastly different.

    Of course, with C++, being a strongly typed language that resolves so many things at compile time, decompilation is not possible for any non-trivial example (which all the examples in the link were; indeed they didn't use any C++ features at all). This is even ignoring the effects of compiler optimizations. The C++ language is far more expressive than the output dialects of the compiler, making the whole idea of decompiling silly. C, on the other hand, is basically a platform-independent assembly language, which is why the one-to-one examples of C and asm output seem to imply one can move back and forth between the two at will. Still this is a mistaken impression.

    Now - is compilation from object code to (non-equivalent but functionally similar) C code useful and interesting? Certainly. And all compiler developers and most hard core debuggers can do this pretty much at will. It's the only way to check the correctness of your compiler and its generated code and, in desperate circumstances, can give you some clue as to what an existing application, for which you have no source, is doing. This is called reverse engineering, btw, NOT decompilation. Unfortunately the material pointed to here provides absolutely no new insights and is quite rudimentary at best. Anyone intimately familiar with their compiler and environment already has more knowledge than this paper provides. Really doesn't justify a slashdot posting but I guess whoever posted it simply isn't a C/C++ developer.

    1. Re:Silly idea and inconsequential material. by muonzoo · · Score: 1
      Of course, with C++, being a strongly typed language ...

      I completely agree with all of your points, except this one. :-) This I must disagree with. C++ is FAR from a strongly typed language. 99% of modern C++ compilers will permit the following to compile, most without a warning even.
      #include <iostream>
      int main(int, char *[])
      {
          int i = 1 << sizeof(char)*8; // too big for a char
          char c = i; // Truncation error

          std::cout << "i:" << i
                    << "\tc:" << static_cast<unsigned int>(c)
                    << std::endl;
          return 0;
      }

      The output:
      bash-2.05a$ g++ -Wall foo.c [NOTICE NO WARNINGS]
      bash-2.05a$ ./a.out
      i:256 c:0
      bash-2.05a$

      Note: I make use of a static_cast ONLY to display the character as a decimal value representation of its bitset and not as the character at ordinal 0.
      Note: This is frightening. There ought to be atomic blast-sized warnings about the assignment of a 4 byte wide value to a char, BUT, oh - no. That would be strong typing.
      Ada83 / Ada95 : Now there's some strong(er) typing. (yuck).
    2. Re:Silly idea and inconsequential material. by Minna+Kirai · · Score: 1

      That's not about types, though. It's about values. A truncation error isn't based on the type of i, but the value it contained.

      So, what behavior would you really prefer? The options are (A) runtime exceptions for truncation (Java-ish), (B) No int->char assignment without an explicit typecast, (C) print warnings for questionable behavior in the rare case static analysis can catch it.

      Because you called this a "type system" problem, I assume you'd prefer option (B). But that won't fix your problem- the central error here is that char was used to contain a number too large to fit in 8 bits. Having made that mistake, and having already demonstrated willingness to use typecasts (that static_cast char->int) to protect his error, the wayward programmer would've just instinctively typed "char c=(char)i;". (Especially likely as a knee-jerk response to a type-mismatch error from the compiler). And now the program exhibits exactly the same wrong behavior.

      (Plus, requiring int->char typecasts would break old code, and outlaw normal things like "typedef char int8;int8 s=13;")

      How about (C)? Your situation is a special case- i is given a constant value, then immediately assigned to c. But in general, looking at the statement "char c=i;" by itself, the compiler can't know what previous value i had. Sure, in foo.c the truncation effect (can't truly call it an "error"- after all, it's exactly the effect the programmer wished to produce) could be firmly detected ahead of time. But should the compiler give warnings for runtime problems in the odd event that compile-time analysis can catch it? If so, then you'd want "int x=0,y=10/x;" to be a compiler warning too.

      So, what you're really saying is not that the type-checking is too weak, but either that the tradition of silent wraparound on overflow is wrong (which is true- for the majority of uses today, programmers don't want this), or that C++ compilers should try to pre-evaluate all possible arithmetic to check it for operations that are legal, but surprising to a total neophyte.

    3. Re:Silly idea and inconsequential material. by muonzoo · · Score: 1
      I believe (keeping in mind that I was a C++ guru in an Ada shop), that Ada83 and Ada95 will prevent the assignment in my (contrived) example simply because the type is different. My static_cast in the example was (as stated) solely for the output illustration; I do not condone arbitrary casts. At several places I have worked, a cast required a waiver, in writing, to ensure that the architectural oversight was adequate and the potential consequences of that cast considered. Sound draconian? Depends on the application. This was an air-traffic control project; many people were coming from weapons or command-control environments, and they thought our programme was lax.
      It's all relative, and, in the end, more awareness of the language limitations is a good thing.
      I don't agree with all of your examples, for instance;
      int i = 1<<8;
      char c = i;
      This is a trivial example, and I did not intend to imply that I was interested in this case because of the static nature of the assignments. I am interested in this case solely because you are slamming a value from one type into another type that cannot hold all values of the former.

      I think this should be an error.

      If you want to patch it up with a cast; then sure, go for it. BUT, implementors have to understand that the cast is really saying a whole lot more than "make the warning go away". A cast is an implicit contract of sorts; "I know what I'm doing -- leave me alone". Far too often, I think the responsible party really doesn't understand what they are doing. That's where problems start.

    4. Re:Silly idea and inconsequential material. by Minna+Kirai · · Score: 1

      I suppose you don't enjoy the existence of C++'s explicit keyword, then.

    5. Re:Silly idea and inconsequential material. by muonzoo · · Score: 1

      Actually, I think explicit is a huge step in the right direction, but it doesn't help me in my case, since it is for class member functions -- as far as I can tell, it cannot sort out my original problem. I'd love to be shown wrong though.

    6. Re:Silly idea and inconsequential material. by scherrey · · Score: 1

      Your example does not take away from the fact that C++ is strongly typed, it only shows the warts from its C heritage which Ada, being designed from scratch as such, does not share. This is similar to the purist (bitter, nonsensical) argument that C++ isn't an OO language because it doesn't force everything to be an object like SmallTalk. The fact is that the language and compiler support and enforce both constructs quite well (as specified) but will also allow the programmer to go outside of those bounds (therefore be more productive!) at their own risk. :-) But we stray from the topic now...

  105. Decompilation Info by Foresto · · Score: 1

    For those who don't know about it already: The Decompilation Page

  106. Research Company in Kingston, ON by msobkow · · Score: 2, Informative

    A friend of mine work(ed) with a company in Kingston, ON that was spun off from Queens University. Their sole purpose and business model is to take whatever binaries and source a company has available, run it through their cluster of analysis systems, and produce a "clean" update of the system. As per usual, there is about 10-15% of the produced code that needs some hand inspection and tweaking to complete the task.

    Their "big" business was the Y2K work, as their software isn't limited to just reverse-engineering, but can also refactor the re-engineered code (e.g. change all "year" values in the system from 2 digit to 4 digit, updating all related I/O formatting functions, overlay structures, etc.)

    On the flip side, their stuff involves complex pattern matching and heuristics that put any other system I've heard of to shame. It requires clusters of systems running for days to do the initial code analysis. (OTOH, it probably took years to create the original code.)

    I can't provide more specifics on the company because they're having some legal issues with co-investors.

    --
    I do not fail; I succeed at finding out what does not work.
  107. Reverse Disassembly by LauraW · · Score: 1
    I have a tool that will do "reverse disassembly" of C++ programs. It's called a "compiler"....

    (Yes, I know the author's native language probably isn't English. But I couldn't resist.)

    Laura

    1. Re:Reverse Disassembly by farnerup · · Score: 1

      Yeah, it's a tautology and we can't have none of that crap.

  108. Re:This is nonsense by Anonymous Coward · · Score: 0

    Who is the dumbass replying to the wrong comment?

  109. Optimizers by msobkow · · Score: 1

    Loop unrolling is only one heuristic for optimizing, and it's something many programmers do by hand to tweak performance. For example, you could do one of the following:

    char * foo;
    const char foo_init[] = "bar";
    foo = new char[ strlen( foo_init ) + 1 ];
    strcpy( foo, foo_init );

    Alternatively, many programmers would realize they're dealing with constant values (most compilers won't recognize that strlen() as producing a constant), and write:

    char * foo;
    foo = new char[ 4 ];
    foo[0] = 'b';
    foo[1] = 'a';
    foo[2] = 'r';
    foo[3] = '\000';

    An ideal optimizing compiler would produce the latter code from the first. How in the world would a reverse-engineering tool recognize those two forms as equivalent?

    --
    I do not fail; I succeed at finding out what does not work.
    1. Re:Optimizers by sholden · · Score: 1

      And hopefully those many programmers are out of a job, for producing hideously difficult code to maintain for absolutely no benefit.

      And of course the strlen() isn't the problem, since in that case it can be replaced with sizeof() since the array declaration is visible, the strcpy is the problem. But of course there is another function called memcpy for use when you *know* the length.

      If your compiler (with appropriate optimiser settings) doesn't produce equivalent code (for the loop unrolling, using literal loads instead of movs in the assembler is another story and in practice isn't what you want because you'd just rename foo_init to foo and drop the const in that case :) to your unmaintainable hand unrolled piece of shit from:

      char *foo;
      const char foo_init[] = "bar";
      foo = new char[sizeof(foo_init)];
      memcpy(foo, foo_init, sizeof(foo_init));

      Then you need to get a new compiler (or write a for loop, which the compiler will unroll, if your libc is the problem...)

      Now change foo_init to "a different string" and then to "yet another different string" and then to "short". I think I prefer my version, which lets the compiler do the unrolling and requires only one simple modification to do that, as opposed to yours, which requires every line except the first to be modified to achieve such a simple change.
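
      A minimal sketch of the sizeof point above, assuming the array declaration is visible where the copy is made. The helper name make_copy is hypothetical and only wraps the four lines from the post.

```cpp
#include <cstddef>
#include <cstring>

// The array declaration is visible, so sizeof knows the full length,
// terminator included.
const char foo_init[] = "bar";

// sizeof is evaluated at compile time, so this check costs nothing at run time.
static_assert(sizeof(foo_init) == 4, "three characters plus the terminator");

// Hypothetical helper: allocates and fills a copy of foo_init.
char* make_copy() {
    char* foo = new char[sizeof(foo_init)];
    std::memcpy(foo, foo_init, sizeof(foo_init));
    return foo;  // caller releases with delete[]
}
```

      Changing the literal to a longer or shorter string needs no other edit apart from the static_assert, which is only there to make the constant visible.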

    2. Re:Optimizers by Anonymous Coward · · Score: 0

      On a 32 bit system, copying 4 chars can be a single instruction. It would have been much better to call the C standard library instead of manually copying each char.
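
      As a hedged illustration of the four-bytes-at-once idea: the sketch below moves "bar" plus its terminator as one 32-bit word using memcpy with a constant size. The function name copy_four is made up for this example, and whether the result really becomes a single load and store depends on the compiler and target.

```cpp
#include <cstdint>
#include <cstring>

// Copies exactly four bytes from src to dst by way of one 32-bit word.
// memcpy with a constant size is the portable way to express this;
// the compiler decides how many instructions it actually takes.
void copy_four(char* dst, const char* src) {
    std::uint32_t word;
    std::memcpy(&word, src, sizeof word);
    std::memcpy(dst, &word, sizeof word);
}
```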

    3. Re:Optimizers by msobkow · · Score: 1

      True, and the compiler should recognize that is what is happening, and tweak the output accordingly. That would make it real interesting for the decompiler to try to create code that even vaguely describes what is really going on!

      --
      I do not fail; I succeed at finding out what does not work.
  110. I've said this before. by Anonymous Coward · · Score: 0

    Microsoft owns your code. They have an inhouse decompiler that will decompile any visual c++ code. They use this in case your product gets too popular, they buy a copy of your code, decompile, do a bit of reworking and ...wam... they own your market and you are out of business.

    DON'T BUY MICROSOFT!!!

    1. Re:I've said this before. by Anonymous Coward · · Score: 0

      they buy a copy of your code

      Then why would they have to decompile it?

      DON'T BUY MICROSOFT!!!

      I'm afraid that neither I nor anyone else reading this has enough money to buy Microsoft.

  111. Mod parent up! by Anonymous Coward · · Score: 0

    Nice save. Someone might have clicked on something less than wholesome.

  112. Why just C++? by Anonymous Coward · · Score: 0

    I thought decompiling a program in any language would be similar. Why is C++ singled out?

  113. Article is mistitled. by Minna+Kirai · · Score: 2, Informative

    The article (link provided for those who don't read URLs) is wrong, even in the first section.

    The title of the first "chapter" is "Why is c++ Decompiling possible?". But immediately he lists "what is totally loss when you compile a program and what stays there".

    In the Lost column he puts templates and classes. The remains list has things like function calls and local variables.

    Well, guess what? Those things that are "lost" are everything that distinguishes C++ from C. If you don't have classes (meaning no inheritance or virtual functions either) and don't have templates either, then you're really just programming in "a better C", not C++.

    So all his approach can hope to "decompile" is C code. Which is something we've seen done in various forms for decades.

  114. Re:Decompilation = halting problem == boloney by Anonymous Coward · · Score: 0

    No, he hasn't. The halting problem is a general statement, i.e. can a computer (universal Turing Machine, whatever) determine, in the _general_ case, for any program, whether or not it will halt properly. For non-theoreticians out there, that's basically asking, "is it possible to write a program that tells whether any given other program will return zero?" The answer to that, obviously, is no, the only way to find out what the return value will be is to run it. (Yes, there's a mathematical proof of the same thing, I'm sure you can find it if you want to...)

    But decompilation isn't the same thing at all. You're taking assembler code and trying to reverse engineer the data and control structures that generated it. Yes, there are complexities inherent in the process (your example of the switch-case statement is a good one) but these are bound by the variables they depend on. The switch statement has a finite number of cases to jump to and the decompiler can easily check each branch for just the one switch statement and characterize its behavior.
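
    To make the switch-case point concrete, here is a rough sketch with hypothetical handler names. It pairs a switch with a hand-written table dispatch of the kind a compiler might emit for it; the table form is an assumption about typical code generation, not a claim about any particular compiler.

```cpp
// Hypothetical handlers, named only for this illustration.
int on_zero() { return 10; }
int on_one() { return 20; }
int on_two() { return 30; }
int on_other() { return -1; }

// What the programmer wrote.
int dispatch_switch(unsigned op) {
    switch (op) {
        case 0: return on_zero();
        case 1: return on_one();
        case 2: return on_two();
        default: return on_other();
    }
}

// Roughly what a decompiler may be looking at instead: a bounds check
// followed by an indirect call through a fixed-size table.
int dispatch_table(unsigned op) {
    static int (*const table[])() = { on_zero, on_one, on_two };
    const unsigned count = sizeof(table) / sizeof(table[0]);
    return op < count ? table[op]() : on_other();
}
```

    The table has a fixed, finite size, so each target can be examined on its own, which is the bound the paragraph above relies on.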

    Sure, exploring each branch will take exponential time, but no one said decompilation had to be done in polynomial time. There's no need to simulate all possible inputs, just the ones at each branch point. They can be done totally independently of one another, since all we want to do is build something that's functionally equivalent at each module. The programmer reading the decompiled output bears the burden of trying to figure out what the program does, i.e. how the various modules work together.

    So, no, decompilation != halting problem. (I'm not saying that decompilation is easy or accurate here, just that it can be done and it's not a proven theoretical impossibility.) Now, go study decompilation, then write your reply again :)

  115. SCO by Tablizer · · Score: 1

    Finally, somebody can reverse eng. SCO and rewrite the parts that are the same or similar as Linux before the damned lawsuit is finished. Mootify it!

  116. It's a shame that nobody mentioned IDA by Sam+Lowry · · Score: 1

    It's a shame that nobody mentioned IDA yet -- an interactive decompiler that does not restore the source code but instead tries to work with the human to figure out what parts of machine code do and mean by splitting data and code and giving readable names to functions and variables to start with.

  117. The linked "book" itself by p3d0 · · Score: 2, Insightful

    I haven't seen any comments on the linked "book" itself yet. In short: it sucks hard. Go take a look and try not to laugh.

    --
    Patrick Doyle
    I mod down every jackass who puts his moderation policy in his sig. Oh, wait a sec....
  118. MacNosy by oaklybonn · · Score: 1
    Steve Jasik has/had a product called MacNosy. It was a disassembler that allowed you to iterate over a disassembly. It would isolate access to globals, for example: you could name a global, and from then on, all references would have the name of the global. You could define structures and tell it that a given register held a pointer to it for some nice structured disassembly, not as nice as theoretical decompilation, but certainly quite useful.

    Unfortunately, the user interface left a lot to be desired, owing to the author's peculiar, um, tastes in UI design. And it never produced assembly that could be fed back into an assembler after modification, which would've been most useful. It did, however, come with a fairly detailed analysis of the Mac roms (back in the day ;-) and so was invaluable in learning the secrets of the Mac toolbox. Anybody know of a similar product for linux/windows/os x?

    As an aside, the parent article is complete crap written at a high school level by someone that recently learned that what you put in the compiler vaguely resembles what you can see from the debug prompt. I kept waiting for the "here's my magic algorithm that allows me to trace object usage across apparent compilation units", but this pointless article didn't even come close.

    1. Re:MacNosy by oaklybonn · · Score: 1
      Yuck, replying to myself.

      Forgot to mention why I would want to do these things. In my job, I develop frameworks used by other applications; often, changes we make in our frameworks breaks some third party application. By disassembling the third party app, I can figure out what their erroneous assumption was that we just broke - and then we can implement a workaround until they fix it.

      Another reason, and one that I've come up against time and time again, is buggy third party libraries. Sometimes the vendor has gone out of business; sometimes they want more money to fix their bugs. But if our license allows for it, we can just fix their bugs and move on.

    2. Re:MacNosy by Anonymous Coward · · Score: 0

      Don't feel bad.

  119. Hilarious by renehollan · · Score: 1
    This falls into the "No you can't", based on a sound theoretical basis that compilation throws away data vs. "Yes you can", based on an observation that it may be possible to produce a C++ source program that compiles into something that appears to operate the same way as the original executable (it may, in fact, be identical), argument.

    While complete decompilation is impossible, if any data has been thrown away, partial translation into a form that is more useful may be possible: when many ask for a "decompiler", they really want that useful translation instead, though may not know it -- knowing only that a complete decompilation would be such a useful translation.

    IOW, "You can't always get what you want, but you might just find, you can get what you need."

    The real danger here is people (think PHBs) who can't tell the difference between a translation and a complete decompilation, see that the former is possible, think it is the same as the latter, and INSIST, on pain of being fired, that a tool be developed to produce the latter. A little knowledge is a dangerous thing in that context.

    --
    You could've hired me.
    1. Re:Hilarious by Ardias · · Score: 1

      Rene Hollan, that is funny. Reminds me of this joke:

      An engineer and a scientist have this long-standing debate over whether something can be done. Eventually, the engineer shows how to implement it and proves beyond a doubt that it works. The scientist then says, "Okay, so that works in practice, but does it work in theory?"

    2. Re:Hilarious by renehollan · · Score: 1
      The problem is that "what works in practice" is not what the theory covers, but rather some poorly qualified subset of the problem space.

      This is compounded by people whose needs might well be satisfied with an imperfect implementation that has an acceptable rate of false positive or false negative failure modes, but who insist on "perfection", and point to the "good enough" implementation as disproof by example of the theoretician's claim that perfection in that context is impossible.

      It comes down to two people not talking the same language.

      --
      You could've hired me.
  120. Re:Why? Java-style reflection by Anonymous Coward · · Score: 0

    RTTI doesn't let you ask for a method/variable by its name and then change it.

    You can write a program in Java without actually hard coding any of the method/variable names that you are calling (bad idea... but not possible in C++ without having to roll a whole lot of your own code or using a special pre-processor type of deal).
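
    A minimal sketch of what "rolling your own" could look like in C++: a hand-written table from field names to member pointers. The Account class and the helper names are hypothetical, and this covers only int fields of one class, which is far short of what Java reflection offers.

```cpp
#include <map>
#include <string>

struct Account {  // hypothetical example class
    int balance = 0;
    int limit = 0;
};

// Hand-rolled "reflection": field names mapped to pointers-to-member.
// Every field has to be registered by hand; RTTI does not supply this.
const std::map<std::string, int Account::*>& account_fields() {
    static const std::map<std::string, int Account::*> fields = {
        {"balance", &Account::balance},
        {"limit", &Account::limit},
    };
    return fields;
}

// Sets a field chosen at run time by name; returns false if the name is unknown.
bool set_field(Account& a, const std::string& name, int value) {
    auto it = account_fields().find(name);
    if (it == account_fields().end()) return false;
    a.*(it->second) = value;
    return true;
}
```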

  121. Wouldn't it be simpler to just use a disassembler? by Kaali · · Score: 1

    I have done quite a lot of reverse-engineering stuff. Assembler really isn't that hard, nor is reading disassembled programs. Just use a debugger or string-search in the disassembled code to find the place you want to modify. Study the piece a bit and change it. What makes it _really_ easy is library calls, which will show in the code as "call printf". With this technique I have added, removed and patched a lot of features in different programs (removing splash-screens (yes, some legal apps use this annoying feature), removing ugly skinning of program windows, etc.).

    Yes, when I last checked this was all legal in Finland.

    For bigger patching this could be really useful.. but for that purpose there are open source programs ;)

  122. Re:Text of the article by kingkade · · Score: 1

    I can tell you. Blow your head off with a suitably powerful firearm. Tie a cinderblock around your legs and throw it off a bridge. Use your tiny imagination. Do whatever it is you did to those animals you most likely tortured when you where younger. See you on the other side, worthless.

  123. Put the dots together... by Anonymous Coward · · Score: 0

    Obviously we are being given an obliquely put challenge to hack the Irish e-voting system and decompile its code!

  124. Re:Why -- Pursuit of Knowledge by Anonymous Coward · · Score: 0

    A better question might be why you _wouldn't_ want to be able to decompile your code.
    I would imagine that anyone using or seeking knowledge in the digital domain would want every tool possible added to their arsenal. Not everyone making code is a nice or law abiding person.

  125. Umn.. by TyrranzzX · · Score: 1

    Since when was taking apart my car illegal? "Trade secrets" and license agreements that demand you give up your right to take apart a piece of software are hogwash and total BS. I have the right to know what my computer is doing, because it's mine, and besides, I can still take apart the cotton gin and figure out how it works; the same applies to software. Just because it's a computer and has thousands of magical circuits doesn't mean I can't learn how it ticks.

    Removing one of the barriers that keeps people from learning what a program does is a great idea.

  126. Re:Copyright Violations?? by mcheu · · Score: 1
    5) Discovering copyright violations

    I'm curious how this works. I hear about it all the time, that some company or other has ripped off a piece of someone else's code and they're suing. The thing is, if I understand this correctly, a number of modern compilers optimize the executable to such an extent that it's theoretically possible that two completely different pieces of code employing the same algorithm will end up with an extremely similar executable. Unless your algorithm was unique, how can you possibly prove definitively that someone ripped off your code?

  127. From the author by opcodevoid · · Score: 5, Interesting
    I didn't realize my article was getting any feedback, because people are posting it here instead of on pscode.

    Anyway, I have seen a lot of people saying decompiling is impossible, or at least not practical; well, that is not true. Decompiling C++ is very practical because of high-level keywords (if, while, for), local variables, and parameters. All of these generate certain instructions that are similar on every platform and just about every processor.

    I am also extending the article to 92 pages in total, which will cover OOP, the CRT, and a whole bunch of other stuff.

    1. Re:From the author by Anonymous Coward · · Score: 0
      You probably didn't get any feedback there because it requires registration in order to post comments.

      And it looks like the comments that were posted were also from 13 year olds like yourself. I mean, C- for effort, but I haven't seen anything in what you wrote to warrant even a Sunday slashdotting.

    2. Re:From the author by Anonymous Coward · · Score: 0
      I'm sorry, but I just can't let this thing go. This really is about 12 year olds, isn't it? Someone, please tell me I'm wrong. http://www.wolfyserver.org/nutdws/news.html
      Attention: A complaint had been filed on June 13th, 2002 by a high-ranking member of the staff concerning copyright infringement and the plagiarism of intellectual property of wolfyserver.org- The culprit is identifiably one of our own. Justin, under the pseudonym of TitrationX, also Magi-Elect, will face a public trial in the chat room on an undisclosed date. His hearing has been approved by webmaster thatbiggbadwolfy under confirmed allegations. wolfyserver.org and its staff do not look kindly upon infringements of one's own work. The layout of wolfyserver.org and the site under fire, www.evilattitude.com, look identical- with the latter being a rather poor duplicate. wolfyserver.org trademarked tables, colours, menu bars, and even the restricted section were even clichéd. If nothing is done on Justin's part to justify his own ends, then the inevitable will occur. The Magi will vote on this matter with 2 out of 3 being majority. this Friday or coming weekend. No appeals will be given. In other news, a vote will be given regarding opcodevoid, the site by Justin (unwiseEvil)- to see if he lands a spot on the direct affiliations list. Just as a note, the magi do not favor Justin going over to TitrationX to beg for hosting. Such an act of desperation is a sign of weakness. Thus, it borderlines the issue of trust and loyalty. wolfyserver.org's staff will have no such people and as a proven point- all staff members are individually evaluated by thatbiggbadwolfy. A decision will be made next weak [sic].
      I've gotta know... Did they DECIDE!? And WHAT?
  128. I've tried it by mnemonic_ · · Score: 1

    A coding sequence cannot be revised once it's been established.

    Why not?

    Because by the second day of compilation, any objects that have undergone reversion compilations give rise to revertant binary like rats leaving a sinking ship. Then the ship sinks.

    What about C decompilation?

    I've already tried it. Reverse disassembly, templates as an decoding agent and potent mutagen. It created a virus so lethal the system was dead before it left the startup sequence.

    Then a repressive extension that blocks the operating code?

    Wouldn't obstruct replication, but it does give rise to an era in replication so that the newly formed code carries the mutation and you've got a virus again. But this-- all of this is academic. The code was made as well as we could make it.


    Guys, Tyrell knows....

  129. Variables don't hold values? by 42forty-two42 · · Score: 1

    Another interesting fact is variables don't hold data, they pointer to where the data is stored.

    [...]

    char * globalvar = "Whats Up";

    [...]

    00405030: global_var dd 405034h
    00405034: global_var_value db 'Whats Up',0


    Well, duh. You asked for a variable that was a pointer, you got it. If you used an int, it would hold a value. Understand the language before reverse-engineering it!
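
    A small sketch of the distinction, with hypothetical variable names: a pointer variable stores an address and the text lives elsewhere, while an int or a char array stores its contents directly.

```cpp
#include <cstring>

const char* global_ptr = "Whats Up";  // holds an address; the bytes live elsewhere
char global_arr[] = "Whats Up";       // holds the nine bytes themselves
int global_int = 42;                  // holds the value 42 itself

// True when the pointer variable's own storage is somewhere other than
// the text it points at, and the text is what we expect.
bool pointer_is_indirect() {
    const void* where_variable_lives = &global_ptr;
    const void* where_text_lives = global_ptr;
    return where_variable_lives != where_text_lives
        && std::strcmp(global_ptr, "Whats Up") == 0;
}
```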
  130. I guess you'd have to read the article by Anonymous Coward · · Score: 0

    How can you be joking at a time like this? Marco is loss!

  131. Why not indeed. by fishexe · · Score: 2, Informative

    Well, you can decompile every binary programm at least to assembler code, so why shouldnt it possible with C++?

    There's a huge difference between disassembling and decompiling. With assembly, you generally have a 1 to 1 correspondence between machine language instructions and assembly instructions. That is, one specific instruction you feed to the assembler becomes one specific assembled instruction. Sometimes it's more complicated than this, but only slightly.

    Now look at C, where one line of code could be arbitrarily many opcodes, depending on the complexity of the logic within that line (and the length of the line). Now suddenly, instead of looking at one instruction and translating it back to its equivalent, your decompiler has to look at possibly hundreds of instructions, parse them logically and figure out where each line starts and ends, and what the logical purpose of each set of instructions is. Then there is dealing with structures (or in C++, objects), where you have to come up with a definition for how data is laid out based solely on the instructions for dealing with that data.
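
    The structure-layout problem can be sketched as follows. The Point struct and both function names are hypothetical; the second function mimics the "load from base plus offset" view that machine code gives, from which a decompiler would have to guess the struct definition.

```cpp
#include <cstddef>
#include <cstring>

struct Point {  // hypothetical structure
    int x;
    int y;
};

// Source-level view: named member access.
int get_y_named(const Point& p) { return p.y; }

// Closer to what the machine code expresses: read an int located at
// some byte offset from a base address, with no field names anywhere.
int get_y_raw(const void* base) {
    int value;
    std::memcpy(&value,
                static_cast<const char*>(base) + offsetof(Point, y),
                sizeof value);
    return value;
}
```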

    That's quite a bit more complicated. I sure as hell couldn't do it. I know I could write an assembler or disassembler, I might be able to write a simple compiler, but there's no way in hell I could write a functional decompiler.

    --
    "I don't care about the Constitution!" --Bill O'Reilly, November 17, 2009
  132. Do you really need one? by Blue+Master · · Score: 1

    Something that will literally give me code I can re-compile immediately?

    If you're going to re-compile it immediately, it sounds like you really don't have a use for a decompiler =)

    1. Re:Do you really need one? by Anonymous Coward · · Score: 0

      Weeell... Perhaps he's decompiling, inserting debug statements around offending code, and then recompiling. Ur teh funnay!

  133. sweet. our plan is working by Anonymous Coward · · Score: 0

    a few more of these and you will be up to +5.
    then we can freely post goatse.cx links all over the place

  134. Turing machines are not interactive by yerricde · · Score: 1

    Write a program which will loop until any key is pressed. When will it halt?

    Interactive input is outside the parameters of a Turing machine and thus not related to the halting problem. Besides, in modern operating systems, input is an API call, and API calls are ridiculously easy to handle correctly in a disassembler.

    --
    Will I retire or break 10K?
  135. Correction by Anonymous Coward · · Score: 0

    >> You need reasons?

    >>1) Finding backdoors
    >>2) Testing security
    >>3) Fixing bugs
    >>4) Adding features
    >>5) Discovering copyright violations
    >>6) Interfacing to non-supported clients

    Let me Microsoft-ize your post

    1) Finding backdoors^Hundocumented features
    2) Testing security^Hundocumented features
    3) Fixing bugs^H^HExtending undocumented features
    4) Adding features^Hdoggy bloat
    5) Discovering copyright violations^H^Hblackmail and copyright duplication
    6) Interfacing to non-supported clients^H^H^H^HMaking code non-standard to prevent competition and otherwise non-documented to suffer competition and cause difficult coworking.

    That sum it up? Oh yes, IRC...

    7) Profit!

  136. Re:You can't [MOD PARENT UP] by Anonymous Coward · · Score: 0

    One of the funniest posts I've read in a while...

  137. Re:Intresting by Morologous · · Score: 1

    The previous reply wasn't me, Ping.

  138. LBA has no input registers by yerricde · · Score: 1

    OK, my statement may have been a bit misleading. By definition of an LBA (linear bounded automaton), the input for an LBA is the contents of its memory when it is powered on; an LBA has no "input registers", that is, addresses whose contents will change other than by a write to memory by the CPU. Because every LBA has an equivalent (but humongous) finite state machine, it is possible to determine whether any LBA will halt or loop by running it for n cycles where n is the number of states of the machine's memory (n = 2^m where m is the number of bits in the machine's memory). In practice, two identical machines are run in a tortoise-hare configuration (one cycle of the tortoise to two cycles of the hare), and the machine has halted or looped when both machines are in the same state.
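
    The tortoise-hare configuration can be sketched on a toy machine. Everything here is an assumption made for illustration: the whole machine state is a single 8-bit value, state 0 stands for "halted", and the two sample step functions are invented.

```cpp
#include <cstdint>

enum class Outcome { Halted, Looped };

// Toy machine: the entire state is one 8-bit value; 0 means "halted".
using State = std::uint8_t;
using Step = State (*)(State);

// Tortoise and hare: the tortoise advances one cycle per round, the hare
// two. On a finite deterministic machine the hare either reaches the halt
// state or the two meet inside a cycle.
Outcome run(Step step, State start) {
    State tortoise = start;
    State hare = start;
    for (;;) {
        if (hare == 0) return Outcome::Halted;
        hare = step(hare);
        if (hare == 0) return Outcome::Halted;
        hare = step(hare);
        tortoise = step(tortoise);
        if (hare == tortoise) {
            return hare == 0 ? Outcome::Halted : Outcome::Looped;
        }
    }
}

// Two sample step functions for the toy machine.
State count_down(State s) { return static_cast<State>(s - 1); }  // reaches 0
State bounce(State s) { return s == 1 ? State{2} : State{1}; }   // 1 <-> 2 forever
```

    Only two copies of the state are kept, so the memory cost is small, but the number of steps can still approach the number of distinct states.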

    Take interactivity into account, and programs that run on real computers will always eventually halt because if nothing else, the power will die, or the CPU will melt, or something.

    --
    Will I retire or break 10K?
    1. Re:LBA has no input registers by Minna+Kirai · · Score: 1

      it is possible to determine whether any LBA will halt or loop by running it for n cycles where n is the number of states of the machine's memory (n = 2^m where m is the number of bits in the machine's memory).

      But, that is impossible on a real computer.

      For real computers, the number of bits in memory is usually in the millions (even a tiny Palm Zire has 16 million bits)

      2^16000000 is impossible to cycle through in less than centuries- it's not a task a "real computer" can ever do.

      Remember, 2^300 is the number of electrons in the entire universe! Once any algorithm has a predicted running time of 2^(150 or higher), it's on the verge of being impossible to solve on any real computer.

    2. Re:LBA has no input registers by yerricde · · Score: 1

      that is impossible on a real computer.

      Which is why I mentioned the tortoise-hare setup, which speeds up the computation for most non-pathological cases.

      --
      Will I retire or break 10K?
    3. Re:LBA has no input registers by Minna+Kirai · · Score: 1

      The Halting Problem is all about pathological cases. If you can't handle them, then your "solution" is not an algorithm but a heuristic.

      Using a tortoise chasing a hare makes you exponentially faster than a more naive exhaustive approach (keeping each state in a list). But the number of cases to check (especially if the program under test is a busy beaver) is still proportionate to 2^m. Which is to say, unworkable.

      The true Halting Problem, of course, assumes the machine has infinite memory so exhaustive approaches are flat-out disallowed. But if you try to bring it to reality and apply it to a finite computer, then you've got to also disallow running times that are merely nigh-infinite.

  139. LBA halting problem by yerricde · · Score: 1

    But I don't see any reason to justify my claims either.

    All PSPACE-complete problems are decidable by a machine running an algorithm that uses polynomial space. This PDF states that the acceptance problem (equivalent to the halting problem) for linear bounded automata (which it calls "linear bounded deterministic Turing machines") is PSPACE-complete. Here's an algorithm that decides it.

    --
    Will I retire or break 10K?
  140. Unless... by Ryan+Amos · · Score: 1

    They forgot to remove debugging symbols. Then it's really easy. :)

  141. FUD? by Gothmolly · · Score: 1

    First he crows about MS Visual C++, then says It's harder to reverse engineer something created than to create it in the first place..

    Who is this, the King of Trolls?

    --
    I want to delete my account but Slashdot doesn't allow it.
  142. The author can't take honest criticism. by mark-t · · Score: 3, Interesting
    I posted the following remark about 20 minutes ago on pscode, and when I just checked back there I found that the remark had been surreptitiously removed (I still had a backup of what I had written in my cache):

    Nice try, but no. All this article ultimately describes is how to write high level language code that does the same thing as particular groups of assembly instructions, which is meaningless to a high level language programmer because knowing all the individual steps of a process is nowhere near as important as understanding what the process actually *IS*. This is something that no automated decompilation process can uncover because the responsibility for that understanding falls on the programmer, not the computer. Since code that only replicates functionality, but does not convey meaning to the programmer is not maintainable, the entire process of decompilation would be wasted. One would probably be better off spending their time figuring out how to do it themselves (with, perhaps, some help from standard reverse engineering, if needed).

    Not only does the author completely fail to realize that the technique he is describing doesn't remotely qualify as decompilation, and is nothing but normal reverse engineering, but he figures that the appropriate response to negative criticism is to remove evidence of it rather than attempt to intelligently respond. I noticed that my vote of 1 of 5 was still intact on his voting page, though.

    I was originally surprised when I first read the article that someone would think it had merit enough to write about, but having some insight into the mindset of the author that I did not have before (offered by his rapid censorship of my remarks), my surprise has waned completely.

  143. Re:Intresting by Anonymous Coward · · Score: 0

    Dear handybundler,

    I have been reading Slashdot for a few years now, always at -1 just to get the full, dirty details of the discussion. I'd just like to say that I thoroughly enjoy your comments, and hope to see more (although IIRC -1ers are limited to 2 a day).

    Anyway, keep up the good work.

    Regards,

    M

  144. Re:Text of the article by Anonymous Coward · · Score: 0

    Wow with that kind of attitude, you'll probably be there first from all the stress the trolls are causing you. Read at +1 or stop being such a pussy. KTHXBI.

  145. Re:Text of the article by Anonymous Coward · · Score: 0

    FUNNIEST TROLL EVER.

    Mad props to whoever wrote it -- the fake links at the end got me pissing myself. Shame the old days of -1 accounts being able to post zillions a day are gone :(

  146. No need...if you know assembler and op codes by Anonymous Coward · · Score: 0

    Seriously, I learned a bit of x86 assembler, and with the intel architecture reference manuals available online for free download (and they'll even send you free hardcopies if you prefer -- which I did),
    you should be able to follow along -- with not too little effort on your part.
    You may not realize that in a hex dump, all them silly hex codes are actually machine instructions which can be easily translated into "human readable" assembly language statements by any decent hex editor.
    So if you can understand a bit of assembler, some op codes, and a bit about your computer architecture and your operating system, you can follow along pretty well and get some idea of what's going on.
    Of course if you can read a hex dump and don't need to see it in assembly language, and can then
    immediately see the patterns in the code and translate that hex dump into java, c++, perl, etc....then you must be NEO...and the Matrix is afraid, VERY AFRAID.

    So, for entertainment,
    1. Read a memory dump in your favorite hex editor.
    2. Open up any binary file in your favorite hex editor and follow along...
    3. For practice, whenever you write a C or C++ program, use the -S switch (gcc) and compare the assembly code produced to the C or C++ code you wrote. (For Java you can look at the bytecode in a similar manner.)
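
    For step 3, a function this small is enough to start with. This is only a sketch: the file name sum.cpp and the function name sum_array are made up for illustration. Running `g++ -S sum.cpp` should leave the assembly in sum.s, and adding `-O2` shows how far the optimizer moves things away from the source.

```cpp
// sum.cpp (hypothetical file name): a small function to compare with its assembly.
// Compile with:  g++ -S sum.cpp              (assembly is written to sum.s)
//           or:  g++ -O2 -DNDEBUG -S sum.cpp (optimized, assert compiled out)
#include <cassert>

// Adds up the first n elements of values.
int sum_array(const int* values, int n) {
    assert(n >= 0);  // precondition; disappears from the assembly with -DNDEBUG
    int total = 0;
    for (int i = 0; i < n; ++i) {
        total += values[i];
    }
    return total;
}
```

    Reading the unoptimized output first usually makes the optimized version easier to follow, since the loop counter and the running total are still easy to spot.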

    Why? Because you is an uber hacker, that's why.

  147. *sigh* by Free+Bird · · Score: 1

    "to be Slashdotted" is not an infinitive...

    1. Re:*sigh* by Jonner · · Score: 1

      I think you're right. "To be" is an infinitive, so "to be Slashdotted" is probably some kind of infinitive phrase. Of course, you shouldn't verb nouns anyway.

  148. There is a Real reason Program sources get lost. by Anonymous Coward · · Score: 0

    Some programs end up losing their source code: hard drive crashes, viruses and so on. Being able to recover the source in these cases can save a programmer months (and sometimes years).

    Other cases are companies that have gone bust, where the program is no longer being made. (Here you have to be careful not to step on a license that has been sold on.)

    Even so, back door scanning on Linux does not care about the black box. To find a back door you only need asm, not C++. Working from asm alone takes longer, but some automated search programs can cut the time down a great deal. Basically a black box is pointless, as hackers will find their way around it.

    Asm does not care what language the source was compiled from, as most programs get converted to asm of some form anyhow.

  149. Re:Intresting by Anonymous Coward · · Score: 0

    And was it brillig in your slithey toves?

  150. Off topic, mod appropriately. by NoMoreNicksLeft · · Score: 1

    Sorry I missed you on IRC (if that was you). Hanging a closet door...

    It's a full-fledged IP network, not a p2p app. Hundreds, if not thousands, of IP apps will work on it with no modification whatsoever. You're not relying on the assumption that someone won't declare your freenet node illegal, either. For that matter, there is no certainty that you are even connecting to Meta... for all you know, you're just VPNing to some friend you met online. It may not be the original "MetaNET", or even a metanet at all...

    I like DNS, the web, ftp, email, irc, im... I don't see why it's necessary to reinvent the wheel, if that is indeed what freenet is trying to do. For myself, at least, freenet simply is the wrong approach. I want a true internet, just a non-shitty one. And the idea of IP-over-freenet must have had something to do with smoking crack...

    1. Re:Off topic, mod appropriately. by Decimal · · Score: 1

      Thanks for answering. That sounds like a good answer to add to your FAQ!

      --

      Remember "Bring 'em on"? *sigh
  151. Oh there's more by Anonymous Coward · · Score: 1, Informative

    I know this guy. A sad thing is, he lives in the US, and as far as I know, he's a native English speaker; I just can't understand a thing he says. I read this "book" a week or two ago when he finished it. I thought this was a very rough draft, but I guess not. I couldn't help but laugh at some things, like its irrelevance to C++ in general. He should have just used C, since he never even mentions a class.... Well, to be fair, he did mention classes when he describes what is lost in the compilation process, which is untrue, especially if it is a polymorphic class. In fact, I didn't see one thing in this article that would set it apart from one written on the same subject, except using C.

    For a laugh, look at his other tutorials. Surprisingly, his "book" here is among some of the better material. Most have to do with C++, and some assembly, and some even cover the same material in this lengthy and pointless article. I especially like his tutorial on using Macros in C++, a concept so backwards and wrong it shouldn't even have to be mentioned. Sure, macros have uses, but with C++, you have real inline functions and constant variables, so why use them for anything besides #include? Anyway, his other works can be found on pscode.com.

    What all this boils down to here, is that nothing new is said here. Not only that, but what is said is presented and worded so poorly that anyone reading it is either going to die of laughter or confusion. If you want to read something on reverse engineering, pick up the dragon book, an assembly book, a good disassembler, and some of the very nice documents on cracking software. Many of these are written by people who will be years ahead of you no matter how hard you work, people who actually know what they're talking about.

    - Mik Mifflin

  152. Great ! Who here has UnixWare? by Billly+Gates · · Score: 1
    Let's then decompile the Unixware kernel and do a diff to Linux and see what we find.

    If SCO commits perjury and puts linux into unix we will know instantly and can throw these guys in the slammer.

  153. Fravia by i_am_nitrogen · · Score: 1

    You know, I used to read Fravia when I was in junior high. I strongly dislike the way the GNU philosophy is associated with "warez" and software cracking on that site. Yes, reverse engineering and disassembling are important tools, but they are not to be used to steal software, but to understand how software works, and thereby implement a free version using similar but superior algorithms and ideas.

    1. Re:Fravia by that+_evil+_gleek · · Score: 1

      Hmm close... Really, you could only reverse-engineer to get the spec, then hand that spec to another programmer who implements the new software... Otherwise they'd assume it's a derivative work, and you'd be open to a lawsuit... To me, reverse-engineering has 3 main legitimate reasons:
      1. Is the program or app really a non-obvious trojan or spyware? Really important for anyone with secret information, like the U.S. gov or corporations where industrial espionage is a factor....
      2. Where is the source of the error or bug? If your company is getting software from more than one contractor, how do you know which one is responsible? A mixture of cockiness and laziness on their part might have everyone pointing the finger at everyone else... With no solution... It helps to know where to focus attention... And if the original provider can't respond in time to meet your deadline, perhaps a tool like a decompiler can help your team devise a workaround...
      3. Porting your system to new hardware, and having problems with a few components; maybe the original provider is gone, doesn't have the skill, or perhaps was swallowed up by someone bigger whose business model doesn't jibe with supporting you....

      Of course, if you have a right to the source to the product, all this becomes academic...
      unless its a compiler bug...

  154. It "works" for trivial-sized programs by Anonymous Coward · · Score: 0

    It "works" for trivial-sized programs. I can do better in my sleep.

  155. decompiled copyright? by iplayfast · · Score: 1

    If you take a binary image and decompile it (back) into C code, that C code will be very different from the original code. As the article says, there will be things lost, classes become dynamic function calls, etc.

    The new C code would in effect be a different expression of the same idea, wouldn't it? Doesn't that make it copyrightable in its own right?

    If you make a note-for-note copy of a public domain piece and publish it, that is a copyrightable work, since it is one rendering of the work. Someone else could make the same copy of the original and copyright it, and as it has a different layout etc., it is copyrightable as well.

    I think this is the same type of thing.

    Am I wrong?

    Should I be asking Slashdot? (of course, slashdot is, as we all know full of lawyers!)

  156. Re:Decompilation = halting problem == boloney by jackb_guppy · · Score: 4, Informative

    Been doing it for twenty years. It is easy to do.

    Stop trying to use logic... actually do it.

  157. C++? by aexandria · · Score: 0

    Why didn't they just call it B?

  158. BS - generating assembly functions the easy way by Anonymous Coward · · Score: 0

    I've done this many times to get a decent code quality assembly language function from a C or C++ compiler:

    1. Write a function in C or C++ with the parameters you want to use
    2. Write another function which does three things
    a. Assigns a long value of 0xabcdef01 to a local variable of type long (we'll search for this number later)
    b. Calls the function in step 1 with dummy parameters. I usually use unique values to make it easier to track which value goes to which parameter
    c. Assigns the return values to local variables
    3. Compile the C or C++ file with full optimization turned on
    4. Look at the map file from the linker, or even use the debugger, to find the four bytes "abcdef01" and so locate the function we want the assembly code for
    5. Disassemble the binary code at the function call and then disassemble the code for the function itself.
    6. Save to a file.
    7. Inspect the function and, if necessary, rewrite how the parameters are passed into it.
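
    Steps 1 and 2 might look something like the sketch below. The names scale_and_add and find_me are invented for illustration, and the volatile qualifiers are an assumption added here, not part of the recipe above: with full optimization a compiler is otherwise free to delete an unused marker variable and fold the call away entirely.

```cpp
#include <cassert>

// Step 1: the function whose generated assembly we want to study.
// (scale_and_add is a hypothetical name chosen for this sketch.)
long scale_and_add(long a, long b, long c) {
    return a * b + c;
}

// Step 2: a wrapper that plants a recognizable constant next to the call.
long find_me() {
    // 2a: the marker to search for. volatile forces the store to be emitted.
    volatile long marker = 0xabcdef01L;

    // 2b: call with distinctive dummy parameters. volatile keeps the optimizer
    // from folding the whole call into a constant.
    volatile long a = 0x1111, b = 0x2222, c = 0x3333;
    long result = scale_and_add(a, b, c);

    // 2c: keep the return value alive in a local.
    volatile long kept = result;
    assert(marker == 0xabcdef01L);
    return kept;
}
```

    On a little-endian machine such as x86 the marker shows up in a hex dump as the bytes 01 ef cd ab. The optimizer may still inline scale_and_add into find_me; GCC's `__attribute__((noinline))` on the function is one way to keep the call visible.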

  159. Re:BS - hand optimize code is the last step by Anonymous Coward · · Score: 0

    I forgot to add the last step which is to hand optimize the code generated.

  160. Really, then what about this. by lukme · · Score: 1

    Reverse engineering, as I understand it, is that you have 2 sets of programmers. The first set disassembles the code and writes a specification that contains no source.

    The second set of programmers codes from that specification.

    For this to be legal, there must not be any contact or mixing of the 2 sets of programmers other than the specification.

  161. Dude - Nice algorithms are patentable. by lukme · · Score: 1

    Before you use a nice algorithm that you lifted from decompiling the ASM, make sure you do your research.

    1. Re:Dude - Nice algorithms are patentable. by Anonymous Coward · · Score: 0

      No, don't. You've got no obligation to, and if you try and fail, you're in deeper trouble than if you had never tried at all! Let them come after you -- chances are, they won't.

  162. Sure you can. by purduephotog · · Score: 1

    It just takes intelligence and insight. Frankly many things aren't that hard- I remember reading some 'essays' about this topic when I was studying. Do a google search for 'fravia' '+orc' and tutorial- You'll find some mirrors. There's even some essays on how to add new functions to notepad given the binary only... which goes to show decompiling a c++ (or any, really) program isn't new.

  163. In the future... by lukme · · Score: 1

    Please make some observation that can't be deduced from half of a Computer Architecture course.

    In your examples, don't reuse the variable names from the original source. For example, one of your original sources used s1 as a variable name. When you were decompiling this, you stated "So let's create an alias for the address's 0000:0000 - 0000:0003, which will be s1.", which amazingly is the same variable name.

    Please try to pick non-trivial examples to highlight your decompilation acumen.

    1. Re:In the future... by Anonymous Coward · · Score: 0

      The debate was interesting, but the original article was a joke.

  164. Define decompile... by BagMan2 · · Score: 2, Insightful

    It's relatively easy to come up with the set of C statements that would mimic a particular set of asm statements that you wish to decompile, but the end result would be a C program that was not much easier to read than the original asm was. Changing various assembly operations into C operations does not get you back the information you really need.

    The symbolic names make up the bulk of the lost information, but often times programmers will organize a sequence of code in a certain way to make it easier to understand. The compiler will often rearrange that code in a manner that makes it easier for the computer to understand. Compilers will do screwy things like increment a variable on the stack, while holding the original in a register for later usage. Where the original C code might have had the variable increment at the end like this:

    while (x < 10)
    {
        // do a bunch of stuff using x
        x++;
    }

    The way the compiler optimizes register usage may cause the assembly to actually increment x just after doing the conditional, then hold the non-incremented value in a register for use down below. The decompiled asm might look like this:

    while (x < 10)
    {
        int temp = x;
        x++;
        // do stuff with temp
    }

    While this may seem like a trivial difference in the C code, it can often distort the intent of the algorithm. When a C programmer sees a construct like the latter, they naturally assume that the temp variable was used because a more natural construct would not work. The C programmer then wastes time mulling it over, only to discover that it was just dumb.
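
    To make the point concrete, here are the two shapes side by side as a sketch. The loop body (summing squares) and the function names are placeholders standing in for "a bunch of stuff"; the only thing being shown is that both forms compute the same result while reading quite differently.

```cpp
#include <cassert>

// The loop as a programmer would write it: use x, then increment at the end.
int sum_original(int x) {
    assert(x > -1000);  // keep the illustration within a small, overflow-free range
    int total = 0;
    while (x < 10) {
        total += x * x;  // "do a bunch of stuff using x"
        x++;
    }
    return total;
}

// The same loop in the shape a decompiler might hand back: the increment has
// moved up, and the old value survives in a temporary.
int sum_decompiled(int x) {
    assert(x > -1000);
    int total = 0;
    while (x < 10) {
        int temp = x;
        x++;
        total += temp * temp;  // "do stuff with temp"
    }
    return total;
}
```

    Nothing in the second version tells the reader that temp exists only because of register allocation, which is exactly the information that was lost.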

    I am currently on a project where I am maintaining some pretty poorly written code. I can't tell you how much time I waste looking at a particularly ugly algorithm trying to figure out why they are doing all these screwy things, only to discover they were just idiots.

    My point is, that the compiler and optimizer are going to mangle the logical order of the code in such a manner that it will be far more difficult to read.

    Like I said at the beginning, simple translation of assembly to C is easy, getting back the meaning that gives the endeavor any value at all is much more difficult.

  165. Famous last words by serutan · · Score: 1

    As last words go, "You can't decompile a C++ program" can't hold a candle to, "Hey Honey, watch this!"

    Go ahead and mod me off-topic, but it had to be said.

  166. Function overhead by msobkow · · Score: 1

    A huge portion of the benefit of the "ugly" solution is that the overhead of invoking memcpy() as a function far exceeds the execution of the byte assignments. If your compiler is smart enough to unroll memcpy(), it would produce the output I described.

    --
    I do not fail; I succeed at finding out what does not work.
    1. Re:Function overhead by sholden · · Score: 1

      *Every* compiler that can compile C++ code will be able to inline and unroll bloody memcpy.

      And if you're stuck with some compiler written by a first year CS student that doesn't, then there are these things called macros...

      Those macros will already exist if your libc wasn't also written by a first year CS student (as extensions to the standard functions).

      Competent programmers would use them rather than writing a bunch of assignments by hand and making the code completely unmaintainable.

      Honestly a programmer who produced that code would be sacked in an instant if I happened to be making hiring/firing decisions...

      That code has to be the worst C++ I've ever seen (and I've taught (and hence marked the code of) around 1500 intro C++ students)...

  167. Read this article... by tundog · · Score: 1


    because it's written in pig latin and I haven't read it myself so it might suck, but read it anyway...

    This is just as bad as those people at work that are constantly forwarding news stories with the priority flag set.

    --
    All your base are belong to us!
  168. Ignore previous comment. by joto · · Score: 1
    After some actual research, it seems I was at least wrong with respect to gcc. Yes, it seems to generate assembly even when not asked to, and, yes, it is I who have learning disabilities, not you.

    Sorry about that. And I apologize for being rude.

    1. Re:Ignore previous comment. by Anonymous Coward · · Score: 0

      good on you. even if you do use 'most' when you should use 'almost'.

    2. Re:Ignore previous comment. by leviramsey · · Score: 1

      Of course, gcc compiles first to its own meta-asm (with an ungodly large number of registers), then from there to assembly.

      Stroustrup's old Cfront C++ compiler, of course, compiled C++ down to C, so if you're using that compiler with gcc as your C-compiler, it's going through three different languages before you get machine code.

  169. Re:BS - generating assembly functions the easy way by Anonymous Coward · · Score: 0

    8. Profit ???

  170. Ever write any production code? by msobkow · · Score: 1

    You know, it's funny how you can take a common sample of performance tweaked code and turn it into a personal attack. Think I'll skip the many snarling replies that come to mind, and just remind myself you have no idea who I am or what my skills are.

    Get out of the research labs and start creating and maintaining code that has to run on old hardware, old compilers, old third-party products, and has no upgrade budget. You go and explain to the management that performance tweaks would make the code "unmaintainable", and I'll stand by and have a chuckle while interviewing your replacement.

    Real production code has tweaked segments. It's not "pretty", it's not as "readable" as some would like, but it is functionally equivalent. Odds are that if you're trying to reverse-engineer code, it's going to be old code that has been hacked and tweaked to keep it going after being rushed into production.

    But hey, what do I know? I've only been programming for about 20 years, so obviously some prof saddled with the first year classes must have more insight. I humbly bow before the wisdom of one who preaches source code style while discussing decompilers...

    --
    I do not fail; I succeed at finding out what does not work.
    1. Re:Ever write any production code? by sholden · · Score: 1

      I'm not a prof, I'm not saddled with first year classes - I taught some CS2 classes a few years ago which doesn't make me a prof and doesn't mean I still do it.

      I've been programming for about 20 years too, though a large chunk of that was C64 assembler and BASIC when I was yet to be a teenager.

      I've paid the rent hacking code originally written by biologists; trust me, I've worked with crappy code which uses undocumented binary-only buggy libraries, and unoptimised crap that needed fixing in order to have the run not take all week...

      My main programming at the moment is on a micro with 512 bytes of RAM (though I'm mainly writing prose at the moment). A few hours ago the delivery guy dropped off a package of some *drool* 32KB SRAMs... My code is now twice as fast, not due to the SRAM (which is for something else) but due to the 8 MHz crystals he also dropped off. I changed a '4' to an '8' in my code, swapped the crystal, and it's now twice as fast... And my code is just as readable as before :)

      But all that is irrelevant.

      I understand that hand optimising C code sometimes leads to better programs.

      However, your example was silly - which is what I was pointing out, but you decided it was worth defending and hence I can only assume you don't consider it completely silly.

      The original code was inefficient because the programmer was using functions which are designed to work with strings of unknown length (strlen and strcpy) when in fact they had a known-length string and should have been using functions designed to work with those (sizeof and memcpy).

      'Fixing' the code by writing a bunch of char assignments may improve the speed (or may kill it completely due to the increase in code size, though with just 4 chars it will likely reduce the code size too) but at the cost of a massive decrease in maintainability. Since using sizeof and memcpy will give the same speed improvement and not make the code any less readable, it seems the better solution. As a bonus memcpy might even copy word-sized chunks instead of bytes, if you get lucky.
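
      A rough sketch of the difference, since the original snippet isn't quoted here. The Record struct and both function names are invented for illustration; the assumption is a fixed 4-character field plus terminator.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical record with a fixed-size, 4-character tag plus terminator.
struct Record {
    char tag[5];
};

// General-purpose version: strcpy has to scan for the terminator at run time.
void set_tag_general(Record& r, const char* src) {
    assert(std::strlen(src) < sizeof r.tag);  // caller must guarantee the fit
    std::strcpy(r.tag, src);
}

// Known-length version: the size is a compile-time constant, so the compiler
// is free to turn the copy into a couple of plain moves.
void set_tag_fixed(Record& r, const char (&src)[5]) {
    std::memcpy(r.tag, src, sizeof r.tag);
}
```

      Both calls read about equally well at the call site, e.g. set_tag_fixed(rec, "ABCD"), which is the point: the speed comes from telling the compiler the length, not from hand-written byte assignments.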

      There are many better examples, in which the solution is actually hand optimising as opposed to being the use of an appropriate special purpose function/macro instead of a slower more general one.

      The example sucked and simply wasn't an example where hand optimising produces a useful benefit.

      Trading maintainablility for speed is not uncommon, just giving it away for no return is a bit silly though.

      Unrolling a loop is a common enough technique to speed things up, but your choice of example presented memcpy as the solution. Writing a set of memcpy macros which automate the loop unrolling is easy enough - especially if your compiler supports modern C++ templates (though in that case I would suspect it unrolls loops just fine itself thanks).

      An example like:

      char res[5];
      char src[5];
      // ...
      for (int i=0; i<5; ++i)
          res[i] = src[i] + src[i];

      Wouldn't provoke the same response from me if it was unrolled, since if it is too slow it isn't due to the str* functions but due to loop overhead (or too slow hardware, full stop.)

      So how would you optimise that?

      What if you know that the values are always between 20 and 100 (feel free to add some other known constraints)?

      In my (I admit not exactly exhaustive) experience the best value optimisations are those that are based upon things the programmer knows that are not expressed in the code and hence the compiler doesn't. Of course if you are using a shitty old compiler you also get the joys of doing by hand all the optimisations that have been automated for the last decade or two :)

    2. Re:Ever write any production code? by msobkow · · Score: 1
      In my (I admit not exactly exhaustive) experience the best value optimisations are those that are based upon things the programmer knows that are not expressed in the code and hence the compiler doesn't. Of course if you are using a shitty old compiler you also get the joys of doing by hand all the optimisations that have been automated for the last decade or two :)

      Alas, it is the use of old broken compilers that usually has led to the twisted code. Maybe you wouldn't be shocked at the number of sites out there that don't upgrade anything because they're horrified at the regression test expenses, but realize you can't update without testing.

      Cool history on the coding. Cut my teeth on a TRS-80 Model I Level I myself, later banging away at C64, Amiga, and God knows how many variants on DOS, Windows, and *nix.

      Any decent compiler would convert your array indexing to pointer code instead of recalculating the offset on each iteration.

      Modern CPUs no longer have an effective performance difference between integer shifts and adds (reg-reg should be 1 internal CPU cycle, which is often a multiple of the external clock.)

      So here is an "optimization" that might result from outdated processor/compiler assumptions. It would only work for positive integers on a "normal" processor (there used to be oddball CPUs that use 1's complement instead of 2's complement for signed integers):

      char src[5];
      char dst[5];
      char * psrc = src;
      char * pdst = dst;
      char * pend = src + sizeof( src );
      while( psrc < pend ) {
          *(pdst++) = *(psrc++) << 1;
      }

      Not only does the "tuned" version fail to improve performance, it loses the ability to deal with negatives. It's completely obfuscated the intent to fill dst with a doubling of src, has done nothing to improve performance, and thereby would deserve to be spanked.

      Realistically, many CPUs could actually do the whole loop as a single vector processing instruction -- a feature that has no explicit syntax, but can only be enabled by a good optimizing compiler or assembly inlines.

      --
      I do not fail; I succeed at finding out what does not work.
    3. Re:Ever write any production code? by sholden · · Score: 1

      I've seen some reasonably modern banking code (which my wife was working on), wrappers on wrappers on wrappers each fixing some small flaw in the layer it wrapped around (and adding a few of its own) because changing the lower wrapper was either impossible or too scary to contemplate...

      My first computer was a little "dick smith" (think radio shack or tandy) machine for which I had a basic cartridge. It got stolen and replaced with the vastly superior C64 by the insurance company which was nice :)

      As for optimising the code, I was thinking of something along the lines of:

      *(unsigned int*)dst = *(unsigned int*)src*2;
      dst[4] = src[4]*2;

      Which has the additional assumption of ints being 4x a char (in size), and possibly that char is unsigned but I haven't actually thought about that, and that src and dst are aligned appropriately for ints, and numerous other things I haven't considered....

      But on an appropriate 32 bit machine, doing 4 operations in parallel (which we can do because the 20-100 limit means there won't be any 'overflow' between the bytes) will be a win, especially if 8 bit char access requires bit masking/shifting on the processor.

      Doing 100000 loops of the orig, your pointer version, a loop unrolled version, and my ugly unmaintainable int pointer version (clock count):

      orig: 340000
      unrolled: 110000
      ptrs: 330000
      int: 20000

      Of course on a different machine, the int version just plain won't work :)
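
      A somewhat more portable sketch of the same packed trick, for anyone who wants to try it. This is not the code that produced the timings above: it swaps the pointer casts for memcpy into a std::uint32_t, which sidesteps the alignment and int-size assumptions, and the function names are made up. It still relies on the 20 to 100 limit so that no byte's doubling carries into its neighbour.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Byte-at-a-time reference version of the 5-element doubling loop.
void double_bytes(const unsigned char* src, unsigned char* dst) {
    for (int i = 0; i < 5; ++i) {
        dst[i] = static_cast<unsigned char>(src[i] + src[i]);
    }
}

// Packed version: doubles the first four bytes with one 32-bit multiply.
// Only valid while every input byte is small enough that doubling it cannot
// carry into the neighbouring byte; the 20..100 constraint guarantees that.
void double_bytes_packed(const unsigned char* src, unsigned char* dst) {
    for (int i = 0; i < 5; ++i) {
        assert(src[i] >= 20 && src[i] <= 100);
    }
    std::uint32_t word;
    std::memcpy(&word, src, sizeof word);   // load four bytes, no alignment needed
    word *= 2;                              // four doublings in one operation
    std::memcpy(dst, &word, sizeof word);   // store the four results
    dst[4] = static_cast<unsigned char>(src[4] * 2);  // the odd fifth byte
}
```

      Because each byte is doubled within its own lane, the result does not depend on byte order. Whether it is actually faster on a given compiler and CPU is something to measure rather than assume.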

    4. Re:Ever write any production code? by msobkow · · Score: 1

      Actually your int version doesn't produce identical results.

      In your prior example, each byte is calculated separately, discarding any overflow bit. Changing to 4-byte integers means that the byte overflow bit is carried into the neighbouring byte by the integer math, producing different results.

      Now if you want to get machine-specific, you could mess around with something like this:

      char src[5];
      char dst[5];
      int * iSrc = (int *)&src[0]; // compiler alias, no code
      int * iDst = (int *)&dst[0]; // compiler alias, no code
      *dst = (*src << 1) & 0xFEFEFEFE;
      dst[4] = src[4] * 2;

      Remember the int pointers are just compiler aliases, so the optimizer just ends up discarding the "variables" in favour of using the existing addresses provided by src and dst.
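
      For anyone following along, here is a sketch of the shift-and-mask idea that gives the same answers as the byte-wise loop even when a byte overflows. It is an illustration, not the exact code under discussion: the function names are invented, memcpy into a std::uint32_t replaces the pointer aliases, and the mask clears the low bit of every byte, which is where the bit shifted out of the neighbouring byte lands.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Reference: double each of 5 bytes separately, discarding any overflow bit.
void double_bytes_wrapping(const unsigned char* src, unsigned char* dst) {
    assert(src != nullptr && dst != nullptr);
    for (int i = 0; i < 5; ++i) {
        dst[i] = static_cast<unsigned char>(src[i] << 1);
    }
}

// Packed: shift four bytes at once, then mask off the bit that leaked in
// from the neighbouring byte so each lane matches the byte-wise result.
void double_bytes_masked(const unsigned char* src, unsigned char* dst) {
    assert(src != nullptr && dst != nullptr);
    std::uint32_t word;
    std::memcpy(&word, src, sizeof word);
    word = (word << 1) & 0xFEFEFEFEu;  // bit 0 of each byte is the neighbour's carry
    std::memcpy(dst, &word, sizeof word);
    dst[4] = static_cast<unsigned char>(src[4] << 1);  // fifth byte done on its own
}
```

      With the mask in place the 20 to 100 restriction is no longer needed, at the cost of one extra AND per word.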

      --
      I do not fail; I succeed at finding out what does not work.
    5. Re:Ever write any production code? by msobkow · · Score: 1

      Typo. Should be the iSrc/iDst in the shift/mask calc.

      --
      I do not fail; I succeed at finding out what does not work.
  171. I don't think so. by twitter · · Score: 1
    If you read carefully, you'll note that the "honest work" sentence is NOT Davak's

    I read carefully and noticed that the only sentences that were Davak's bitched about AC posting. Because he had nothing else to offer, I imagined he agreed with what he posted from the FAQ. Once again, I can thank him for nothing. Next time he might add some commentary to the FAQ or just say it ain't so.

    --

    Friends don't help friends install M$ junk.

  172. Ever heard of self modifying code? by bluGill · · Score: 1

    No, you can't disassemble every binary back into its source code. I've worked with self-modifying code, which actually rewrites itself on the fly, sometimes in complex ways. (Fortunately modern operating systems technically don't allow this, but once in a while you still run across it.) I've also worked with hand-written machine language. You can't get the assembly back because there never was any, and the author took advantage of variable instruction length so that a JMP to $200 and a JMP to $201 are both valid, but the results are vastly different. Makes for some really small code, but it is difficult to write and debug, bordering on impossible.

  173. Salary by Anonymous Coward · · Score: 0
    Nearly six figures? Shit, man, during the dot-com boom, they were giving college dropouts six figures for doing codemonkey perl work in silicon valley.


    You must suck.

  174. Who? by Anonymous Coward · · Score: 0

    "The Great Jack Schitt"

    Never heard of him. I don't know him.