Reverse Engineering Large Software Projects?
stalebread queries: "Me and a team of other students have been tasked with reverse engineering a massive C/C++ (mostly C) computer game of about half a million lines. We have most of the source, but no clue of how to approach a task of this magnitude. Anyone have suggestions of programs, or techniques we could use to understand the structure of the game?"
It sounds like you are wanting to refactor the code, or port it to another platform. If you are missing some of the code, then you'll have to reverse engineer that portion of it.
As for how to approach it - I think it depends on the size of your team, and what goals you set for the effort. Are you just wanting to learn? Or do you want to improve performance? Or make it work on another platform? What are the goals for this project?
Once you know those details, they might give you an idea where to begin.
. 62,400 repetitions make one truth -- Brave New World, Aldous Huxley
It sounds like you are unable to build the complete system and run it, since you're missing functionality. This removes the possibility of using runtime tracing tools.
The first thing I would do is run something like Doxygen over it to generate a cross-referenced description of the structures. It won't give you a global view of things, but it will give you a decent browsable view of the code itself. Another response mentioned GNU GLOBAL which may work better for you. Yet another possibility is LXR, though it may not work as well in C++. Regardless, a nice thing about Doxygen is that, when used with GraphViz, you can get useful diagrams generated showing class containment and file inclusion graphs.
After you have that, get out your paper and pencil, and start drawing and manually tracing things. That's how I go about coming up to speed on new code I can't execute and step through. Eventually transfer that knowledge into a text file (or, nowadays, a wiki) so that others can benefit from it.
I applaud your professor or thesis advisor or whoever for this real-world task. Here's a few resources which I wouldn't do without:
Code Reading: The Open Source Perspective
Object-Oriented Reengineering Patterns
Reading Computer Programs: Instructor's Guide and Exercise
Tips for Reading Code