Slashdot Mirror


Reverse Engineering Large Software Projects?

stalebread queries: "A team of other students and I have been tasked with reverse engineering a massive C/C++ (mostly C) computer game of about half a million lines. We have most of the source, but no clue how to approach a task of this magnitude. Does anyone have suggestions for programs or techniques we could use to understand the structure of the game?"

4 of 104 comments

  1. Reverse Engineer or Refactor/Port? by linuxtelephony · Score: 4, Interesting

    It sounds like you want to refactor the code or port it to another platform. If you are missing some of the code, then you'll have to reverse engineer that portion of it.

    As for how to approach it, I think it depends on the size of your team and the goals you set for the effort. Do you just want to learn? Do you want to improve performance? Or make it work on another platform? What are the goals for this project?

    Once you know those details, they might give you an idea where to begin.

    --
    62,400 repetitions make one truth -- Brave New World, Aldous Huxley
    1. Re:Reverse Engineer or Refactor/Port? by QuantumG · Score: 5, Informative

      Yeah, the "most of the source code" part is a bit scary. If they really are talking about reverse engineering from executables, they are in for a hell of a time. The state of the art is a project I work on now and then, Boomerang, and it isn't for the faint of heart. I've been hearing for years about people working on decompilation tools integrated into IDA Pro, but I've yet to see one. The time when you can feed in a binary, press a button and get back compilable, maintainable source code is still a long, long way off. But that's good: friends of mine do commercial decompilation work.

      --
      How we know is more important than what we know.
  2. Cross-reference first: Doxygen is your friend by treerex · Score: 4, Informative

    It sounds like you are unable to build the complete system and run it, since you're missing functionality. This removes the possibility of using runtime tracing tools.

    The first thing I would do is run something like Doxygen over it to generate a cross-referenced description of the structures. It won't give you a global view of things, but it will give you a decent browsable view of the code itself. Another response mentioned GNU GLOBAL, which may work better for you. Yet another possibility is LXR, though it may not work as well with C++. Regardless, a nice thing about Doxygen is that, when used with GraphViz, it can generate useful diagrams showing class containment and file inclusion graphs.

    After you have that, get out your paper and pencil, and start drawing and manually tracing things. That's how I go about coming up to speed on new code I can't execute and step through. Eventually transfer that knowledge into a text file (or, nowadays, a wiki) so that others can benefit from it.
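
    As a rough illustration of the file inclusion graphs mentioned above, here is a minimal Python sketch that scans source text for quoted #include lines and emits a Graphviz DOT description. This is not how Doxygen does it; the function names (include_edges, to_dot) and the sample file names in the usage note are hypothetical, and the sketch ignores angle-bracket system headers, include paths and conditional compilation.

```python
import re
from collections import defaultdict

# Matches quoted project includes such as:  #include "game.h"
# Angle-bracket system headers (<stdio.h>) are deliberately skipped.
INCLUDE_RE = re.compile(r'^\s*#\s*include\s*"([^"]+)"', re.MULTILINE)


def include_edges(sources):
    """Map each file name to the sorted list of quoted headers it includes.

    `sources` is a dict of {file name: file contents}. Files with no quoted
    includes do not appear in the result.
    """
    graph = defaultdict(set)
    for name, text in sources.items():
        for header in INCLUDE_RE.findall(text):
            graph[name].add(header)
    return {name: sorted(headers) for name, headers in graph.items()}


def to_dot(graph):
    """Render the include graph as Graphviz DOT text, one edge per include."""
    lines = ["digraph includes {"]
    for name in sorted(graph):
        for header in graph[name]:
            lines.append(f'    "{name}" -> "{header}";')
    lines.append("}")
    return "\n".join(lines)
```

    For example, to_dot(include_edges({"main.c": '#include "game.h"\n'})) produces a one-edge graph from main.c to game.h. In practice you would fill the dict by walking the source tree and then render the saved text with Graphviz's dot tool.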

  3. Resources For the Code Janitor by sohp · Score: 4, Informative

    I applaud your professor or thesis advisor or whoever for this real-world task. Here are a few resources I wouldn't do without:
    Code Reading: The Open Source Perspective
    Object-Oriented Reengineering Patterns
    Reading Computer Programs: Instructor's Guide and Exercise
    Tips for Reading Code