Slashdot Mirror


Learning and Maintaining a Large Inherited Codebase?

An anonymous reader writes "A couple of times in my career, I've inherited a fairly large (30-40 thousand lines) collection of code. The original authors knew it because they wrote it; I didn't, and I don't. I spend a huge amount of time finding the right place to make a change, far more than I do changing anything. How would you learn such a big hunk of code? And how discouraged should I be that I can't seem to 'get' this code as well as the original developers?"

8 of 532 comments

  1. don't feel bad at all by iggymanz · · Score: 5, Insightful

    So you have been handed the steamin' pile o' code. It is great that you are very cautious and deliberate when modifying it. Make a set of regression tests: that is, a set of test data, procedures, and expected results that ensure the original functionality you still want keeps working and that no new errors are introduced. It is hard, and much more tedious than just creating new code with few constraints.
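    A rough sketch of what such a regression ("golden master") test could look like, in Python. Everything named here is made up for illustration: `legacy_price` stands in for whatever inherited function you need to pin down, and `record_golden` / `check_against_golden` are hypothetical helpers, not part of any library. The idea is to record the current outputs once, before touching anything, then re-check them after every change.

```python
import json
from pathlib import Path


def legacy_price(quantity, unit_cost):
    """Hypothetical stand-in for an inherited function whose behavior we want to pin down."""
    total = quantity * unit_cost
    if quantity >= 10:  # odd-looking special case, possibly an old bug fix
        total = total * 0.9
    return round(total, 2)


# Hypothetical test data; in practice, pick inputs that cover the paths users rely on.
CASES = [(1, 2.5), (9, 1.0), (10, 1.0), (25, 3.2)]


def record_golden(func, cases, path):
    """Run func on every case once and save the results as the expected baseline."""
    results = [{"args": list(args), "expected": func(*args)} for args in cases]
    Path(path).write_text(json.dumps(results, indent=2))


def check_against_golden(func, path):
    """Re-run func on the saved cases; an empty list means no regressions were found."""
    failures = []
    for entry in json.loads(Path(path).read_text()):
        actual = func(*entry["args"])
        if actual != entry["expected"]:
            failures.append((entry["args"], entry["expected"], actual))
    return failures
```

    Run `record_golden(legacy_price, CASES, "golden.json")` once against the untouched code, then call `check_against_golden` after each edit. Any entry in the returned list is a behavior change you either intended or just broke.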

    1. Re:don't feel bad at all by kaiser423 · · Score: 5, Insightful

      Definitely what parent said. Also:

      I have inherited huge code bases. I actually kind of like it. Lots of people whom I thought were idiots, and whose code I cursed, I later found out were quite smart. Others, I found, just thought about problems vastly differently than I did, and learning how they tackled problems gave me many more tools for my personal arsenal.

      That said, find a big wall or something. Use a debugger or code analysis tool to find the main execution paths (what calls what and when, etc). Diagram that up on the wall really large. Then use the tools to determine when and why certain auxiliary functions get called. Diagram that up, and you'll start getting a spider on your wall. Go from there, using your new understanding to re-arrange the program flow, not in terms that make sense to you, but in terms of how it seems to have been programmed (functional, object-oriented, some pattern). Rinse and repeat until you know pretty much what the code is trying to accomplish in 90+% of the situations, and its general plan of attack.
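      If the code happens to be Python, a few lines can collect the "what calls what" edges for the wall diagram. This is only a sketch under that assumption: `trace_calls` is a hypothetical helper built on the standard `sys.setprofile` hook, and `main`, `load_config` and `send` are toy stand-ins for the real program. For C or C++ you would lean on a debugger or a call-graph tool instead.

```python
import sys
from collections import Counter


def trace_calls(entry_point, *args, **kwargs):
    """Run entry_point and count every caller -> callee edge between Python functions."""
    edges = Counter()

    def profiler(frame, event, arg):
        # "call" fires for Python-level function calls; C builtins raise "c_call" and are skipped.
        if event == "call" and frame.f_back is not None:
            caller = frame.f_back.f_code.co_name
            callee = frame.f_code.co_name
            edges[(caller, callee)] += 1

    sys.setprofile(profiler)
    try:
        entry_point(*args, **kwargs)
    finally:
        sys.setprofile(None)  # always remove the hook, even if the program raises
    return edges


# Toy program standing in for the inherited code.
def load_config():
    return {"retries": 2}


def send(attempt):
    return attempt


def main():
    config = load_config()
    for attempt in range(config["retries"]):
        send(attempt)


if __name__ == "__main__":
    for (caller, callee), count in sorted(trace_calls(main).items()):
        print(f"{caller} -> {callee} (x{count})")
```

      Each printed edge is one arrow on the wall. The counts hint at which paths are the main ones and which are auxiliary, though they only reflect the one run you traced.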

      With that diagram, dive in! There are tons of little details in every function that look useless but are usually bug fixes. Use a scalpel, not a hatchet.

      I was deployed remotely with no way for the main programmer to get at me. We had prepared 9 months to collect 4 minutes of data, and the test wouldn't wait for us. I found an odd bug hidden somewhere in ~22k lines of code. I did this over a weekend, and found about 4-5 nasty bugs that were combining to produce what I was seeing, and fixed them. I did this with zero input or help, in code I had never seen, spread across about 60 files. I spent the first half day just diving in and trying things, and nearly shot myself. That's when I went high-level and dug in from there.

      When that was done, I then took over code maintenance and updates on that project. The other guy had written it 100% himself, but after that exercise I knew the code better than he did. Sometimes being new is good; you don't have all that cruft of implementations that didn't work, etc, but still linger in the original programmer's head.

  2. Use Doxygen by gbrandt · · Score: 5, Insightful

    Doxygen is your friend. Run it over the source code and keep the HTML handy for searches and cross references.

  3. Not lots of code by www.sorehands.com · · Score: 5, Insightful

    First of all, 30-40,000 lines of code is not lots of code. Try 250,000 lines of code.

    To start, use a good programming editor/environment (Xcode, Vslick, Visual Studio, etc.) that gives you the ability to easily go to the definition of, or references to, variables, functions, structs and such. Run some sort of profiler or flowchart type program on it to get a high level view of the code and how it fits together. If you can, get the person(s) who worked on it before you to give you an idea of how it fits together.

  4. Not at all. by hemorex · · Score: 5, Insightful

    I find that if the other programmer wrote it in such a way that it's too complex for me to follow, I'm not the one who's a moron.

    1. Re:Not at all. by tsm_sf · · Score: 5, Insightful

      Man, always when I run out of mod points.

      Nothing like being handed a steaming plate of spaghetti and hearing about how much of a "genius" its creator was.

      --
      Literalism isn't a form of humor, it's you being irritating.
  5. Try to learn the structure by phantomfive · · Score: 5, Insightful

    I had an English professor who always said, "Structure is the key to understanding." He was talking about literature, but I think the same is true for programs as well.

    Try to understand the structure of the program. What is the basic flow? It should have an initialization routine, a main loop, and a shutdown routine. Find out roughly where they are, then focus on the main loop. Usually there will be one piece of code that is central, and it will occasionally pass control into other large pieces of the program. Sometimes there will be more than one main loop, and control switches back and forth between the various main loops. If the program is event driven, this will make a difference in the structure.

    If you are just trying to make a small change, try to find the sequence of events that will lead up to where that change needs to be made. Follow the sequence of execution until you get to the line you need to change. If you are changing a single variable, sometimes it's helpful to do a search and find all the places that variable is used, to make sure your change won't have any side effects. This may seem time consuming, but it can save 10 times more in debugging.
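    For the "search for every place that variable is used" step, any editor or grep will do; a minimal Python sketch of the same idea follows. `find_usages` is a hypothetical helper, and the default file suffixes are an assumption you would adjust for your codebase. It matches whole words only, so looking for `retry_count` will not also hit `retry_count_max`.

```python
import re
from pathlib import Path


def find_usages(root, name, suffixes=(".py", ".c", ".h")):
    """Return (path, line_number, line_text) for every whole-word occurrence of name under root."""
    pattern = re.compile(r"\b" + re.escape(name) + r"\b")
    hits = []
    for path in sorted(Path(root).rglob("*")):
        if path.is_file() and path.suffix in suffixes:
            # errors="replace" keeps the scan going over files with odd encodings
            text = path.read_text(errors="replace")
            for number, line in enumerate(text.splitlines(), start=1):
                if pattern.search(line):
                    hits.append((str(path), number, line.strip()))
    return hits


if __name__ == "__main__":
    for path, number, line in find_usages(".", "retry_count"):
        print(f"{path}:{number}: {line}")
```

    This is a purely textual search: it also reports matches inside comments and strings, and it knows nothing about scope, so two unrelated variables with the same name will both show up. Treat the list as places to read, not as proof of a dependency.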

    Learn to follow code execution with your eyes, without running a debugger. One thing that separates good coders from not so good coders is the ability to follow code that isn't being executed.

    --
    Qxe4
  6. Re:Time by Anonymous Coward · · Score: 5, Insightful

    Everyone, including me, always wants to go for the clean rewrite. But in my experience it almost never turns out for the best. There's a reason for all that messy code. Much of it was bug fixes that real-world users needed. Other complexities were needed in the first place to make the user experience simple (natural, giving it that "hey, it just works like I expected" feeling).

    The reason you don't understand the code is that you weren't part of the original design discussions, in which weeks or months were spent learning, debating, arguing, etc., about many different design decisions at many different levels of abstraction. You don't know why the trade-offs were made. You just see the finished product.

    Rewriting the code won't give you insight into any of this. Learning the code the hard way, fixing bugs, rewriting *small* pieces and seeing what breaks the regression tests, etc. will eventually help you to understand it.

    There is no point in rewriting it before you fully understand it. Attempting that can kill a product. Conversely, by the time you fully understand it, there won't be any need to rewrite it, because you'll own the code.