Tools For Understanding Code?

Wait for cenqua's solution by ccguy · 2008-01-18 04:37 · Score: 4, Funny

I hear that the commentator guys are finishing a new product that, instead of commenting your code, is able to comment others'.

Re:Wait for cenqua's solution by Anonymous Coward · 2008-01-18 07:46 · Score: 1, Interesting

Some 14 years ago, I worked on a product for Cadre Technology called 'Ensemble'. It ran on Unix workstations and was designed to help you understand piles of inherited 'C' code. It read in all of the source code and decomposed it into a Prolog 'database', after which you could visually navigate the structure of the code. We nicknamed the viewer 'the deathstar' because, as one of the optional views, you could model the structure as a rotatable and zoomable sphere. Cadre and its products were sold and resold, and I heard that CA ultimately acquired ownership. I have no idea if they currently support the product.
Re:Wait for cenqua's solution by Anonymous Coward · 2008-01-18 10:36 · Score: 1, Informative

Red Hat Source Navigator - a nice free tool for grokking C/C++

Stepping Through by blaster151 · 2008-01-18 04:38 · Score: 5, Insightful

I've always found that stepping through the debugger at runtime is a decent way to start making sense of a large code base. Easier, anyway, than trying to read static code printouts. Just set a breakpoint at a point of interest, fire up the application, and use it as a starting point. You get a sense for program flow and it's a great way to generate questions--lots of them. (What does class SuchAndSuch do? It looks like the application is handling remoting in such-and-such a fashion; is that right?) You can also choose one aspect of the architecture and selectively ignore or step over other aspects, building up your understanding one aspect at a time. In my case, with Visual Studio as a development environment, I can hover the mouse cursor over variable names to see their current values. In the case of variables of a certain type, like datasets or XML structures, I can use realtime visualizers to browse the contents and get a much better feel for what's going on.

If there's no one at your company that can help answer your questions and bring you up to speed, I feel for you - your employers ought to know enough to give you some extra margin. It can be very hard to take over a large code base without some human-to-human handover time.

Also, is it an object-oriented system? I assume that it's not, based on your post, but you don't say either way. If it is, the important aspects of program flow often live in the interactions between classes and objects and the business logic is decentralized. OO is great, but it can be harder to reverse-engineer business logic because it's distributed among various classes. A debugger that lets you step through running code is almost essential in this case.

Re:Stepping Through by daVinci1980 · 2008-01-18 04:47 · Score: 4, Insightful

This post is dead on.

Place a breakpoint somewhere you think will get hit (e.g. main), and then start stepping over and into functions. I usually attack this problem as follows:

Place breakpoint. Use step-in functionality to drop down a ways into the program, looking at things as I go. What are they doing, how do they work, etc.

Once I feel like I understand how a section of code works, I step over that code on subsequent visits. If I feel like this isn't taking me fast enough, I let the program run for a bit, then randomly break the program and see where I am.

Lather, rinse, repeat.

Also, this should go without saying, but you should ask someone who works with you for a high-level overview of what the code is doing. The two of these combined should get you up to speed as quickly as possible.
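
For anyone without a debugger handy, the break-and-step loop above can be roughly imitated with a trace hook. What follows is only a sketch in Python (not the Visual Studio workflow discussed in this thread): it uses the standard `sys.settrace` hook to log which functions get entered, and how deep, starting from an entry point. The names `trace_calls`, `handle`, `parse`, `total`, and `to_number` are made up for illustration.

```python
import sys

def trace_calls(entry, *args, max_depth=3):
    """Run entry(*args) and return a list of (depth, function_name) call events."""
    events = []
    depth = 0

    def tracer(frame, event, arg):
        nonlocal depth
        if event == "call":
            depth += 1
            if depth <= max_depth:          # "step over" anything deeper than this
                events.append((depth, frame.f_code.co_name))
        elif event == "return":
            depth -= 1
        return tracer                       # keep tracing inside this frame

    sys.settrace(tracer)
    try:
        entry(*args)
    finally:
        sys.settrace(None)                  # always switch tracing off again
    return events

# Hypothetical stand-ins for the inherited code being explored.
def parse(line):
    return line.split(",")

def to_number(text):
    return int(text)

def total(fields):
    result = 0
    for field in fields:
        result += to_number(field)
    return result

def handle(line):
    return total(parse(line))

events = trace_calls(handle, "1,2")
for depth, name in events:
    print("  " * (depth - 1) + name)
```

Lowering `max_depth` plays the role of stepping over code you already understand; raising it is stepping in.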

--
I currently have no clever signature witticism to add here.
Re:Stepping Through by The_reformant · 2008-01-18 05:09 · Score: 2, Informative

Absolutely. Since joining the real world I have found the Visual Studio debugger my most prized tool. Somehow I managed all through my degree to never come into contact with one (probably because all the free ones are rubbish and most schools won't shell out for Visual Studio). I now extol the virtues of debugging to all and sundry!

--
I have discovered a truly remarkable sig which this post is too small to contain.
Re:Stepping Through by dlowder · 2008-01-18 05:39 · Score: 1

Yes, in looking for points of interest to set break points consider the input into the system and the expected output. Identify where in the code key transitions of data take place and set your breakpoint there. If there is no one at your company that can give you an overview of how the code works, start looking for another job or go ask for a very large raise. Your employer doesn't understand software and the value in keeping key developers around.
Re:Stepping Through by dupup · 2008-01-18 05:41 · Score: 2, Insightful

The parent post is correct, IMO. In fact, I have found that, for me, the easiest way to start understanding a new code base is to jump in with a bug or two to fix. It's a little painful at first, but a specific goal combined with judicious use of the debugger will help you understand how the system works more quickly.
Re:Stepping Through by CastrTroy · 2008-01-18 05:41 · Score: 1

The debugger is probably the most important tool I have encountered for programming. I'll often step through my own code on the first run just to make sure my logic is correct. This is especially true for .Net, where you can notice logic errors before they happen, fix them, and keep right on running the code. Saves tons of time over build, run, crash, find bug, fix bug, start over. Now it's just build, run, fix bug, continue. You can even rewind from an exception to before it was thrown. I find it amazing that I got through most of university without a debugger, and that none of my courses ever mentioned the importance of using a debugger.

--

Anthropic principle: We see the universe the way it is because if it were different we would not be here to see it.
Re:Stepping Through by JesterXXV · 2008-01-18 05:50 · Score: 3, Insightful

I don't think there's any replacement for talking to the real-live developers who wrote it. Failing that, any design documentation they left behind. Failing that, just get a task to do, and try to get it to work. Nothing like learning by doing.

--
Yo mama so fake, she failed the Turing Test.
Re:Stepping Through by smittyoneeach · 2008-01-18 05:50 · Score: 2, Insightful

I think unit tests are actually better, for code that is suited to being driven externally.
Pick a tool to wrap something, start writing little bits to exercise the code.
You can comment and version unit tests, giving a sense of history.
Debuggers, on the other hand, mostly exist in the present tense.
Sure, you learn something now, but how about some breadcrumbs for later?
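
A minimal sketch of what such breadcrumbs might look like, using Python's standard `unittest` module. The function `legacy_price` is a made-up stand-in for some inherited routine, and the "discount" it hides is invented purely for the example; the point is that each test records one thing learned while poking at the code.

```python
import unittest

def legacy_price(quantity, unit_cents):
    """Hypothetical stand-in for an inherited, undocumented function."""
    total = quantity * unit_cents
    if quantity >= 10:
        total -= total // 10    # behaviour discovered only by probing
    return total

class LegacyPriceNotes(unittest.TestCase):
    """Each test is a note about observed behaviour, kept under version control."""

    def test_small_order_is_plain_multiplication(self):
        self.assertEqual(legacy_price(3, 250), 750)

    def test_ten_or_more_gets_ten_percent_off(self):
        # Found while exercising the function: not mentioned anywhere in comments.
        self.assertEqual(legacy_price(10, 250), 2250)

def run_notes():
    """Run the notes and return the unittest result object."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(LegacyPriceNotes)
    return unittest.TextTestRunner(verbosity=0).run(suite)

result = run_notes()
```

When a later change breaks one of these, the failing test name tells the next person what assumption just stopped being true.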

--
Get thee glass eyes, and, like a scurvy politician, seem to see things thou dost not.--King Lear
Re:Stepping Through by Assmasher · 2008-01-18 05:51 · Score: 2, Insightful

I certainly think that stepping through is by far the most valuable method; however, it can be difficult when dealing with asynchronicity and/or parallelism. In those cases, commenting is the only solution that seems to help me... LOL.

--
Loading...
Re:Stepping Through by orclevegam · 2008-01-18 06:10 · Score: 5, Insightful

Much as I would love to agree with you, unfortunately the world isn't always so accommodating. Sometimes you have to suck it up and stay with a job till you can find something better, and most employers won't let you toss anything out, let alone a major chunk of their code base. Doesn't matter if it's utter crap, they paid for it, and as far as their concerned turd polishing is better than starting from scratch even if starting from scratch would be a hell of a lot cheaper. Can't expect MBAs to understand the difference between good code and bad code, to them it's all just code, and as far as their concerned, the more the better. It's the old idiotic idea that more lines of code means a better product, therefor anything that reduces lines of code must be a bad thing.

--
Curiosity was framed, Ignorance killed the cat.
Re:Stepping Through by Anonymous Coward · 2008-01-18 06:17 · Score: 0

Your username is exposing your motives...
Re:Stepping Through by trolltalk.com · 2008-01-18 06:37 · Score: 1

Your username is exposing your motives...

Ah, the old "if you don't like the message, blame the messenger" trick.
There's nothing that sucks the lifeblood out of a coder more than having to try to polish a turd. It is almost always quicker AND cheaper to rewrite from scratch - especially when there are no docs, no specs, and the code is crap.
When you get confronted with WTFs like code where people have over time severely modified a variable or function, but haven't bothered to change its name, so that it does something completely different from what you would expect, because they were just too lazy to do a global search-and-replace, or call the same function from 100 different places, always with the same HUGE list of parameters and structures, with different actions and side effects depending on which parameters were actually filled in ... and an idiot source file naming convention ... and the few comments that exist were never updated to actually reflect what the code does ... and an addiction to cut-n-paste coding, with the same files and filenames in different directories doing completely different things ... ALL because of the perceived need to look productive by "cutting code" ...
Throw it out. Just throw it all the fuck out and start over properly. Nothing else works.

--
Kevin Smith on Prince
Re:Stepping Through by j-pimp · 2008-01-18 06:42 · Score: 1

(probably because all the free ones are rubbish and most schools won't shell out for visual studio).
My experience is that the school pays for Visual Studio, but does not teach you how to use it. Also, I don't think Visual Studio is very expensive for a school. The full price is around a grand a seat. Of course if you did not go to school in the USA, ignore everything I just said here.

I do think that after 4 years of college you should have learned how to do things like step through code, use version control, and use a merge tool. The debugger should be taught in the first or second programming class. Version control and merge tools can be taught in a software engineering class. Perhaps part of that class should be inheriting a legacy code base from a previous running of the class. Another part of the class could be developing a new project from scratch.

--
--- Justin Dearing http://www.justaprogrammer.net/ We're just programmers.
Re:Stepping Through by dannannan · 2008-01-18 06:55 · Score: 1

To take it a step further, try running the code in Callgrind. Callgrind is part of the Valgrind suite. It basically runs your app on a soft CPU. It's intended for use as a profiling tool, but it gives you a complete map of what is calling what, and how many times. The output is most valuable when you create snapshots around specific use cases in the code -- you can do that with callgrind's external control tool. KCachegrind is an excellent tool for viewing callgrind output.
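
To give a feel for the "what is calling what, and how many times" map, here is a small Python-only analogue. It is not Callgrind and does no CPU simulation; it just uses the standard `sys.setprofile` hook to count caller/callee pairs between Python functions. All function names (`call_map`, `main_loop`, `receive`, `checksum`) are invented for the sketch.

```python
import sys
from collections import Counter

def call_map(entry, *args):
    """Run entry(*args) and count (caller, callee) pairs between Python functions."""
    edges = Counter()

    def profiler(frame, event, arg):
        # "call" fires for Python functions only; C builtins arrive as "c_call".
        if event == "call" and frame.f_back is not None:
            edges[(frame.f_back.f_code.co_name, frame.f_code.co_name)] += 1

    sys.setprofile(profiler)
    try:
        entry(*args)
    finally:
        sys.setprofile(None)
    return edges

# Hypothetical application code standing in for one use case.
def checksum(packet):
    return sum(packet) % 256

def receive(packet):
    return checksum(packet)

def main_loop(packets):
    results = []
    for packet in packets:
        results.append(receive(packet))
    return results

edges = call_map(main_loop, [[1, 2], [3, 4], [5]])
for (caller, callee), count in sorted(edges.items()):
    print(f"{caller} -> {callee}: {count}")
```

Wrapping a single use case in `call_map` is the rough equivalent of taking a snapshot around that use case.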
Re:Stepping Through by superwiz · 2008-01-18 06:57 · Score: 3, Insightful

The guy asked about a large code base. I am assuming that means on the order of at least half a million lines. Stepping through the code won't even get you into most modules of something that big. Never mind that it will do nothing to help you understand that a certain chunk of the code is a module that gets used only under certain extraordinary conditions. To be sure, what you suggest is what you do on day 1. The post was essentially asking what do you do three weeks into it after you've understood what the loop in main does and yet you still don't know what's tied to what and how.

--
Any guest worker system is indistinguishable from indentured servitude.
Re:Stepping Through by psykocrime · 2008-01-18 07:04 · Score: 1

Parent is dead-on; a good symbolic debugger is great for figuring out how a body of code works. That's probably my favorite approach.

--
// TODO: Insert Cool Sig
Re:Stepping Through by skiflyer · 2008-01-18 07:34 · Score: 1

Just wait til you get to the multi-threaded apps or the apps where problems fade away while debugging because of the wait time you're adding. Granted the debugger is a phenomenal tool and can greatly reduce the effort and time... but it's no panacea.
Re:Stepping Through by doti · 2008-01-18 07:42 · Score: 1, Funny

.

worse

advice

ever

--
factor 966971: 966971
Re:Stepping Through by Zwack · 2008-01-18 07:42 · Score: 1

Or worse, the times where just compiling with the debugging option turned on causes enough of a change in the code that the time dependent bug just vanishes.

Z.

--
-- Under/Overrated is meta-moderation, and therefore is Redundant.
Re:Stepping Through by trolltalk.com · 2008-01-18 08:09 · Score: 1

Please explain why you think it's bad advice to abandon a ship when it's already sinking?
If you've ever been in a situation where you've had a mess of crap code thrown at you, you'd know that that crap code didn't "just happen." It was caused by bad practices, bad management, bad leadership, bad communications, and bad vision.
Changing the coder isn't going to change any of that.
"Lose all hope ye who enter therein."

--
Kevin Smith on Prince
Re:Stepping Through by DataBroker · 2008-01-18 08:12 · Score: 1

Can't expect MBAs to understand the difference between good code and bad code, to them it's all just code, and as far as their concerned, the more the better. It's the old idiotic idea that more lines of code means a better product, therefor anything that reduces lines of code must be a bad thing.
Umm... MBA and coder here. Your apparent problem is that you have trouble conveying ideas to them in concepts that a typical MBA can understand. First hint: don't even try to support your measurements with "lines of code". Use easily understood, business-common terms, such as return on investment or simply money. Try explaining that you can save the company X downtime or support hours by investing Y development hours into a fix. They ("We MBA's" if you prefer) can quickly calculate that Y development hours cost a lot less than X downtime hours.
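
The X-versus-Y comparison is simple enough to put in a few lines. This is only an illustration: the function name `rewrite_case` and every figure below are invented, and a real estimate would need your own numbers.

```python
def rewrite_case(downtime_hours_saved, cost_per_downtime_hour,
                 dev_hours, cost_per_dev_hour):
    """Return (net_benefit, return_ratio) for a proposed fix or rewrite."""
    savings = downtime_hours_saved * cost_per_downtime_hour
    investment = dev_hours * cost_per_dev_hour
    return savings - investment, savings / investment

# Made-up example: 40 downtime hours avoided at 5000 per hour,
# bought with 400 development hours at 100 per hour.
net, ratio = rewrite_case(40, 5000, 400, 100)
print(f"net benefit: {net}, return per unit invested: {ratio}")
```

The same function works for support hours instead of downtime; only the hourly cost changes.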

By using the important measurement (money instead of lines of code) I have convinced my critics to support me in a total rewrite of the architecture and code. As geeky as it is to say, it's actually fun to impose my will and then be allowed and paid to rewrite it.

As an aside, I have to mention that using proper English will help you in converting others to support you. "their" is possessive; meaning "it belongs to them". You meant "they're", which means "they are" (the ' character replaces the missing characters " a").
Re:Stepping Through by HeronBlademaster · 2008-01-18 08:13 · Score: 1

Most schools with a decent-sized CS department are part of the MSDN Academic Alliance (MSDNAA) through which their students get most MS products for free (for non-commercial use, of course) including various (read: a dozen?) versions/copies of Windows, Visual Studio, etc.

Everything MS on my computer right now besides Office (which is not available through MSDNAA) I got for free from MSDNAA. Which, I suppose, is really just XP and VS2k8, but who's counting?

Anyway, if your CS department is *not* a member of MSDNAA, go talk to the dean of the department about joining - as far as I know it's fairly inexpensive.
Re:Stepping Through by The+Mad+Debugger · 2008-01-18 08:15 · Score: 1

True true true true.

Sometimes you're stuck, and the only thing for it is slow, incremental improvement. Having been on one of those sorts of projects, I can actually report that there is a deep feeling of satisfaction that comes from eventually righting at least the most obvious of wrongs... but the question is still "where to start?"

The submitter mentioned cscope, and as much as I've looked, I haven't found anything significantly better. In some quick googling, I see a front-end called kscope which looks like it might be nice, but I've never used it, so I can't recommend it.

The other interesting tool to check out is "doxygen". Doxygen will build HTML documentation for source code. It uses a simple markup language in your comments to figure out what comments go with what code. By default "doxygen" will only build documentation for code that has doxygen comments, but that's easy to change in its config file.

The very interesting thing about it is that Doxygen can also *automatically* build call graphs, object hierarchy diagrams, and collaboration diagrams. This can be very useful for seeing the basic structure of the source code in a graphical way. Even if you don't bother to write doxygen comments for your code, this still might be useful.

Once you've got that set up, my advice would be to find the main entry-point for whatever your code does, and start following it with cscope. If your program processes packets, start with the rx function. If it processes data from the user start there. Start tracing through the "mainline" with cscope, and figure out how the most common input gets processed. From there, you should have at least a basic foundation for understanding what you're looking at.
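
As a toy illustration of following the mainline from an entry point, here is a Python sketch that does statically what you would do by hand with cscope's "functions called by this function" query. It is not cscope and knows nothing about C; it parses Python source with the standard `ast` module. The sample code and the names `static_calls`, `mainline`, `rx`, `parse_header`, `dispatch`, and `log` are all made up.

```python
import ast

SAMPLE = '''
def rx(packet):
    header = parse_header(packet)
    return dispatch(header, packet)

def parse_header(packet):
    return packet[:2]

def dispatch(header, packet):
    log(header)
    return len(packet)

def log(message):
    print(message)
'''

def static_calls(source):
    """Map each top-level function to the set of plain names it calls."""
    calls = {}
    for node in ast.parse(source).body:
        if isinstance(node, ast.FunctionDef):
            calls[node.name] = set()
            for inner in ast.walk(node):
                if isinstance(inner, ast.Call) and isinstance(inner.func, ast.Name):
                    calls[node.name].add(inner.func.id)
    return calls

def mainline(calls, entry):
    """Depth-first walk from entry, following only functions defined in the source."""
    order, seen = [], set()

    def visit(name):
        if name in seen or name not in calls:   # skip builtins and repeats
            return
        seen.add(name)
        order.append(name)
        for callee in sorted(calls[name]):
            visit(callee)

    visit(entry)
    return order

calls = static_calls(SAMPLE)
print(mainline(calls, "rx"))
```

This only sees direct calls by name, so anything reached through function pointers, callbacks, or method lookups still needs the manual tracing described above.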

If you want to be a very good citizen, write some doxygen comments while you're at it and submit them back. If you want to be a star performer, teach others how to use doxygen, and encourage them to do the same. If this is for a professional endeavor, I bet your manager will be impressed. It's really easy, and in the end the documentation it produces is well worth the trouble.
Re:Stepping Through by ricree · 2008-01-18 08:17 · Score: 1

What is it that makes visual studio so much greater than other debuggers such as gdb? I'm still in school, but I've used both and so far the only thing I've seen that Visual Studio has any great advantage in is the learning curve.
Re:Stepping Through by Nethemas+the+Great · 2008-01-18 08:31 · Score: 3, Informative

Clearly you don't write (or at least read source for) applications of any substance as that would be mildly described as tedious if not impossible.

One of the best ways to understand code is to do so visually with the software equivalent of blueprints. UML is generally considered a very capable way of modeling/communicating both static structures and dynamic behavior of software. There exist any number of tools that are capable of reverse-engineering existing source into UML. Two tools that I consider to be more capable than others are IBM's Rational Rose, and No Magic's MagicDraw. If commercial products aren't a possibility there are likely a number of open-source/free tools--though likely of lesser ability--available. A Google search on "reverse engineering UML" should point you at some.

--
Two of my imaginary friends reproduced once ... with negative results.
Re:Stepping Through by doti · 2008-01-18 08:31 · Score: 1

What is wrong is to assume the code is bad because it is not documented.

I've seen a lot of excellent undocumented code, and a lot of crappy documented code.

Besides, really good code needs no documentation.

--
factor 966971: 966971
Re:Stepping Through by Panaflex · 2008-01-18 08:49 · Score: 1

You know... it's kind of funny you should mention that.

As a coder with 15 years experience - let me say as a matter of fact that no matter HOW good a job you've done, how WELL documented, or how CHEAP it is to run - someone like you will come along with a crock full of even "better, faster, cheaper" tech and drive the company/division into the ground.

Nothing personal, and hey - it works both ways! I still get paid, either way - and usually get a nice raise along the way too.

I've been on both sides of the development curve and I sometimes think - no wonder mainframes are still around! When sane people look at real project cost, risk, and reward they wisely increment tech on the edge instead of in the core business.

I'm not really singling you out - but this constant need to reincarnate code seems more and more ridiculous as I go along. Is it really cheaper, better, faster? Wouldn't we get the same or better bang with hardware upgrades? Are these questions ever asked?

--
I said no... but I missed and it came out yes.
Re:Stepping Through by jgarra23 · 2008-01-18 08:57 · Score: 2, Insightful

Clearly you don't write (or at least read source for) applications of any substance as that would be mildly described as tedious if not impossible.

I have no idea how you formulated this from the parent based on 2 or 3 sentences.

One of the best ways to understand code is to do so visually with the software equivalent of blueprints. UML is generally considered a very capable way of modeling/communicating both static structures and dynamic behavior of software.

A lot of times a programmer is stuck without those tools for any number of reasons. A lot of times people are stuck with spaghetti code which there is no documentation or design pattern to work with. I think your answer is assuming that the planets are aligned and we live in a utopia. Do you have any suggestions for people who have to deal with reality?
Re:Stepping Through by orclevegam · 2008-01-18 09:01 · Score: 2, Informative

Yes, I know the difference between their, they're and there. I noticed the mistake after I posted it. It's one of the few mistakes I seem to be prone to in writing.

As for using "lines of code", I don't, they do. It seems the biggest issue they have with rewriting code (or refactoring if you prefer) has something to do with the way it's budgeted and accounted for. Apparently adding new code/features to a project comes out of a different budget, than replacing or repairing already existing code does. Don't ask me why that is, I just know whenever we've tried to push to replace some horrendous piece of code they would tell us it wasn't in the budget, and as long as the code ran we weren't allowed to change it. We had to work our way around the bean counters eventually by carefully picking features to implement that touched on code we wanted to replace, then as part of implementing the feature we would rip out and rewrite the code we wanted to.

--
Curiosity was framed, Ignorance killed the cat.
Re:Stepping Through by ultramkancool · 2008-01-18 09:08 · Score: 0

This is exactly what to do! Anyone ever reverse engineered anything? You load the app up into your disassembler and, well, there's a nice giant pile of useless assembly. Search around for some things of interest (strings, function calls, etc.) and breakpoint on their use. Using this will allow you to grasp whatever part of the program you want (e.g. the key generation function :))
Re:Stepping Through by Anonymous Coward · 2008-01-18 09:24 · Score: 0

He's a The[y're,re,ir] nazi.
What else would you expect from a pedantic MBA, you ask? Maybe: forecasting the synergies of collaborative ventures moving forward. (AKA spending work hours correcting grammar mistakes on Slashdot.)
Re:Stepping Through by zenpickle · 2008-01-18 09:28 · Score: 1

MBAs understand cost. The problem is that most programmers have no idea how to estimate the cost of code maintenance based on concepts like adequate documentation. The inarticulate engineer who said "just throw it all out" is typical. He has a gut feel but no cost basis to back it up.

Documentation has value in terms of reduced maintenance cost. Measure it! Research it! That value is in the debit column if it doesn't exist. The cost of moving forward without documentation needs to be compared with the cost of starting over. The quality of the code is itself a big factor in the cost equation. High-quality code is often its own best documentation if it exhibits well-chosen variable/function/class etc. names and has a clean architecture that reflects the problem it solves.

Programming is not a religion based on received wisdom. It is a tool to solve a particular class of problems. Good programming practice is not defined by arbitrary rules like "you must do documentation." It is defined by rules that have been shown (with hard data) to solve this class of problems effectively. You would be surprised at how intelligently the MBAs react if you articulate the issue in terms of the cost estimates of various strategies and back the estimates up with good data.
Re:Stepping Through by trolltalk.com · 2008-01-18 09:39 · Score: 1

really good code needs no documentation.

Code doesn't just exist in some parallel universe by itself, unless it's something trivial, like "hello.c".
Good code has never been hurt by the presence of good documentation, proper specifications, decent planning, frequent communications between all the parties involved, etc.
At least with crappy code and decent docs, you have a road map. Without the docs, it's more a case of "you can't get there from here."
The process of speccing and documenting code forces you to think about what you're doing, rather than "hey, I know just how to code that!" It's also a good way to slow down "feature creep." "You want that feature - draw up the specs first, so we're both on the same page and understand what you want done."
I for one refuse to take part in any more conversations that start with "wouldn't it be nice if ..." and end with "so it shouldn't take more than a day to implement ..."
Nine times out of ten (heck, more like 99 times out of 100), the proposal doesn't even make sense from a real-world business point of view. Every coder who takes on the ultimate responsibility for a project should know how to deflect such stupidity by directing their would-be tormentor into a discussion of market gap analysis as a first step in deciding if there's a need for "feature x".

--
Kevin Smith on Prince
Re:Stepping Through by Mr2cents · 2008-01-18 10:06 · Score: 1

I know exactly what you mean. I have the same discussions with management from time to time. Luckily there is a good atmosphere at work, so we can have these heated discussions and still be friends afterwards. The last time was with my evaluation, where "tendency to rewrite code" was one of my weak points. My best defence was to point out that after the rewrites, there were far fewer bugs, and the code size didn't increase much. Another part of the code, where I didn't get permission to rewrite it, had bugs coming out of it almost every release. With each bug the code size increased, doubling it in six months (actually, it went from 3K lines to 6.8K lines). Those are statistics that tell you something is very wrong.

I think the problem is that by writing a piece of code the first time, you gain so much experience that a rewrite will almost always give improvements. The question then is: is it worth it? It all depends on the quality of the first try. If it's ok, then just try to live with it. But if it causes problems, or you need to add functionality that would be much easier to do if you could start over, then a rewrite should be considered. Also, in my opinion, it's best to rewrite bad code ASAP, because if you don't, you'll spend a lot of time and energy trying to debug it, and the more time you spend on it, the harder it becomes to throw it away. If you rewrite it early, you'll have more time left for testing, and you'll save more time (by not spending it on a lost cause).

Sounds like common sense to you? Well, it hardly gets through to managers.

--
"It's too bad that stupidity isn't painful." - Anton LaVey
Re:Stepping Through by asc99c · 2008-01-18 10:23 · Score: 1

Also, compiling or running with the debugger in some cases initialises variables that are otherwise uninitialised, etc. Unfortunately, the bugs whose root cause is most horrible to find are often the ones where a debugger is no help.
Re:Stepping Through by Mr.+Slippery · 2008-01-18 10:36 · Score: 1

The post was essentially asking what do you do three weeks into it after you've understood what the loop in main does and yet you still don't know what's tied to what and how.

Big stacks of printouts, a large conference table on which to spread them out, a pencil, and the license to kill anyone who interrupts you. Start tracing through the code. Think about options and branches. Make notes on the printouts. Incorporate those notes into comments in the code later.
(Same process can be applied for code reviews. Though in that case, if the code is hard to figure out, you can just throw it back to the developer with a demand for more documentation, so killing people who interrupt you isn't necessary - severe beatings should suffice.)

--
Tom Swiss | the infamous tms | my blog
You cannot wash away blood with blood
Re:Stepping Through by superwiz · 2008-01-18 10:53 · Score: 1

C'mon. If a large part of the process can be automated, why not let the computer do the automation? Because you can? Yes, you'll get a better idea of what's going on your way. But do you always want to? Not all code base is a mission in life. Sometimes it's just a problem that needs to be solved. "I am thinking that all these tables (pointing to the logarithms) might be calculated by machinery" -- Babbage

--
Any guest worker system is indistinguishable from indentured servitude.
Re:Stepping Through by j-pimp · 2008-01-18 11:00 · Score: 1

Anyway, if your CS department is *not* a member of MSDNAA, go talk to the dean of the department about joining - as far as I know it's fairly inexpensive.
Once again, you're assuming he's in the US, although it's possible there are similar programs overseas. Also, in order for that to affect classroom usage, he would have to attend a school that requires him to own a laptop, unless that program also applied to lab PCs. Plus, there are debuggers for other platforms that, while not as nice, integrate with IDEs and let you do the basic "set a debug point and start stepping." Even if they get a professor that believes all students should learn vim, which is not a bad idea, GDB can step through code.

--
--- Justin Dearing http://www.justaprogrammer.net/ We're just programmers.
Re:Stepping Through by Anonymous Coward · 2008-01-18 14:05 · Score: 0

Note that there are some other tools which generate valgrind-compatible output, such as XDebug for PHP.
Re:Stepping Through by jgrahn · 2008-01-18 14:38 · Score: 1

I don't think there's any replacement for talking to the real-live developers who wrote it. Failing that, any design documentation they left behind. Failing that, just get a task to do, and try to get it to work. Nothing like learning by doing.

Yup. I see newbies fall into the trap over and over again -- trying to understand it all just by spending weeks wading through code, forgetting one part as they read about the next. I do it myself now and then (and probably pick up a few things along the way), but it's only when I have a specific goal (fixing this bug, adding that feature) that I start learning for real.
Re:Stepping Through by jrfonseca · 2008-01-18 14:54 · Score: 1

I think so too. But I've found that command-line debuggers like gdb invariably show a very limited view of the code, which presents an obstacle for code comprehension.

If you're going to debug with the aim of understanding the code, then using a debugger with a graphical interface is a must. In that regard, I personally find Eclipse CDT very useful, since it not only has a graphical interface to gdb, but also has code navigation abilities which allow you to quickly jump back and forth between function/type definitions and so on.
Re:Stepping Through by Mr.+Slippery · 2008-01-18 17:44 · Score: 1

Yes, you'll get a better idea of what's going on your way. But do you always want to? Not all code base is a mission in life. Sometimes it's just a problem that needs to be solved.

Is not the title of this discussion "Tools for Understanding Code"? If that's the question being asked, then I assume that getting a better idea of what's going on is what's desired.
If you're just trying to solve a single problem - remove a bug - then yes, by golly a debugger just might be helpful. (Oftentimes not, though, if you've got proprietary libraries and/or multi-threaded code or other timing dependencies to deal with.)
A debugger is a fine tool. But it in no way automates the process of understanding code. It leads you by the hand down only one path; understanding requires that at each junction, you look down the road a little at each possibility. A big display is useful for that, and until I have access to a display the size of a conference table, spread-out printouts will continue to rock.
And unless you have a memory orders of magnitude better than mine, you'll need to take notes. Hey, it would sure be handy to keep those notes right proximal to the code they discuss. But I don't want to start editing the code yet...if only there were a way to make a copy, in a format that was easy to make notes on - even to add diagrams to, which is sure not easy in comments...

--
Tom Swiss | the infamous tms | my blog
You cannot wash away blood with blood
Re:Stepping Through by dooguls · 2008-01-18 17:51 · Score: 1

Another tool that helps me out is LXR (http://lxr.linux.no/). I know we all use it to help us browse the kernel, but really it will browse anything. I like the ability with tabbed browsers to start my way from main following through functions and seeing where my nose ends up. I also have been known to try and follow a particular function to the end and then roll back up the tree.
--doug

--
Sig 'em boy!
Re:Stepping Through by plover · 2008-01-18 18:27 · Score: 1

A lot of times a programmer is stuck without those tools for any number of reasons. A lot of times people are stuck with spaghetti code for which there is no documentation or design pattern to work with. I think your answer is assuming that the planets are aligned and we live in a utopia. Do you have any suggestions for people who have to deal with reality?

Doxygen. With the proliferation of open source projects, "no tools" is no longer a valid excuse.
Now, I'm not saying that doxygen is going to be able to unravel all his code for him, but it's a good start, and it's free. It may not help much if there's absolutely no structure (a 5,000 line main(), for example) but most code isn't quite that bad.
If his boss doesn't approve of that one in particular, this Wikipedia article has many others to choose from.
And finally if his boss says, "No, you're supposed to be a whiz kid, you figure it out with the tools I gave you," the correct answer is to raise one finger in salute and find a job without a batshit-crazy manager.

--
John
Re:Stepping Through by shird · 2008-01-18 19:27 · Score: 1

This may work for a very simple application. However, any 3-tier application, or one with multiple processes or threads, will be next to impossible to step through with a debugger unless you have a reasonable understanding of the code in the first place and know what section of code you wish to debug.

If your code supports a 'debug' mode of some sort which logs a trace of the commands executed, this is probably more effective. Run the code with some known data, and take a look at the trace. Match the trace logs with the trace commands in the code. Use ctags/cscope and an effective 'grep' command in vim (or whatever editor you use) to jump around the code effectively.
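
The trace-first approach above can be tried even when the code has no built-in debug mode. What follows is only a minimal sketch in Python: the `traced` decorator, the `TRACE` list, and the `parse_order` / `price_order` functions are all hypothetical stand-ins, not part of any tool mentioned in this thread.

```python
import functools

TRACE = []  # collected trace lines; a real debug mode would write to a log file


def traced(func):
    """Record entry and exit of func so a single run leaves a readable trail."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        TRACE.append(f"enter {func.__name__} args={args!r}")
        result = func(*args, **kwargs)
        TRACE.append(f"leave {func.__name__} -> {result!r}")
        return result
    return wrapper


@traced
def parse_order(text):
    # Hypothetical stand-in for one piece of the inherited code.
    name, qty = text.split(",")
    return name.strip(), int(qty)


@traced
def price_order(text, unit_price):
    # Calls parse_order, so the trace shows the nesting of the two calls.
    name, qty = parse_order(text)
    return qty * unit_price


if __name__ == "__main__":
    price_order("widget, 3", 5)  # run with known input data
    print("\n".join(TRACE))      # then read the trace next to the source
```

Each trace line names the function that produced it, so a plain grep for that name is enough to jump from the log back to the code.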

--
I.O.U One Sig.
Re:Stepping Through by plover · 2008-01-18 19:31 · Score: 5, Informative

(Warning: you asked!)
Well, the learning curve is certainly important in the real world, although I expect a professional to know his or her tools before they arrive on the job. But there are a metric crapload of things I like better about Visual Studio that make it a much more effective debugger than gdb, in my opinion. (Note that I am not a big gdb user, so I may be cutting it a bit short in the feature set here. My apologies in advance if I do so.)
Things I've found I prefer include many tool windows simultaneously showing the states of registers, memory, the call stack, an object or seven (expanded to show a few properties), and automatic resolution of virtually every symbol and name, including the operating system (although you have to download the symbol files for your OS version from Microsoft.) And you still have full navigation through the source.
Simply hovering the mouse over a symbol will bring up a tool-tip to display the contents. If you highlight an entire expression such as pFoo->pBar->Blah.count+7 and hover, the tooltip will display the calculated result.
You can set a temporary breakpoint by setting the cursor on a line of code and clicking "run to cursor." You can run, single step, run to the current cursor, or run till function return. That last one is great for re-entering a function multiple times to test different conditions.
The variables window contains the current call stack as a dropdown list -- changing the stack lets you see the newly-local variables. Watch windows can display data as hex or decimal, just right click and select. Watch entries can even be used as calculators (enter a literal value, such as 0xf0 + 12, and it will display the results.)
In the watch windows, you can also call arbitrary functions (good for testing without driving your code to that point) or other functions in your memory space, such as the C runtime memory checkers. If you're trying to track an errant pointer, create a debug build, start running and break, type _CrtCheckMemory() into a watch window, and every time the watch window is refreshed, it will check all your fenceposts. You might get lucky and spot your corruption as it happens. The /GZ compiler option will perform a similar task at the function level, but this would let you do it at a line level.
There are also dozens of possible formats it can display your watch variables in -- suffix a pointer with ,s and it'll display the contents as an ASCII string. Only see one byte because of Unicode? Suffix the pointer with ,su and you'll see the Unicode string. A ,wm suffix displays window messages by name. A ,hr suffix displays HRESULTs by name.
The memory windows will highlight in another color any data that's changed since the last time it was refreshed, whether it be a single step or a previous breakpoint. You can have memory displayed as bytes, shorts, or longs. And with the newer versions of Visual Studio, you can have multiple memory windows, so you can keep track of two, three or four arrays simultaneously. You simply drag and drop them wherever they're convenient, then step through the code and watch for colored variables indicating change.
Again, all these windows are automatically updated every time the debugger drops from the program to your control. I've got two 17" monitors, and I can fill them both. The problem with debugging is that sometimes you are really starting blind, and the faster you can get more information, the less time you waste debugging.
There's a cute "magic trick" I like to show people with the memory window and the disassembly window. Let's say you've had a crash, and attached the debugger to the running program. You're looking at a corrupt stack in the call stack window -- just one line of garbage data. What to do? Where did it break? Enter @ESP in the memory window. Change the view to 'long' and it displays the memory as 8-digit numbers. If y

--
John
Re:Stepping Through by JavaRob · 2008-01-19 00:43 · Score: 1

The UML summary might *also* be helpful, and you can use that as something of a roadmap while you're following actual execution paths through the code... but by itself, no. Not that helpful. "Oh, look - we have 2500 objects, with a tangle of interdependencies." (You are in a maze of twisty little passages, all different.)

The problem is that the MOST useful detail would be the *common* use cases linked with detailed sequence diagrams. Is there any way to auto-gen that?

Unfortunately, I think these tools are going to have no idea what are important objects, unimportant, essential but loaded dynamically, and perhaps not even used anymore. And alas, if this project is big enough, there are probably a few entire sections in there where (if you could ask) you'd hear something like "well, when we were using the rules engine from X we needed to build this whole architecture around that to get it to do what we needed; but eventually we migrated over to our own simpler system in this module over here - we haven't pulled that old code out yet because we think there's other code that still uses the utility methods".

Hence, yeah -- following actual code execution is an excellent way to get your bearings. You can figure out the "well-trodden paths" of the application vs. the dim haunted forests of code that are only ever explored in exceptional conditions, if ever. If the system is distributed, you just repeat the experience for each representative piece.

My personal suggestion -- do some documentation and/or code commenting of your own as you figure stuff out, and keep notes on the areas where you know you're missing something important for now. These docs are going to be invaluable to you as you go on (partly just because writing it down forces you to understand it a bit better), PLUS to someone else who comes in and is equally lost.
Re:Stepping Through by AnonChef · 2008-01-19 02:53 · Score: 1

the correct answer is to raise one finger in salute and find a job without a batshit-crazy manager.

I can't help but think that people who suggest you just quit your job still live in their parents' basement without any bills to pay.
I have a mortgage to pay and a child to feed.
Just because times are pretty good right now there is no guarantee I can find a new job.
Re:Stepping Through by Nethemas+the+Great · 2008-01-19 03:56 · Score: 1

The only planets not aligning properly are the ones between your ears. You are advocating setting breakpoints all throughout the code, tracing the execution path, etc. Perhaps you have worked with larger code bases (several hundred KLOC or more) in this manner and I was reaching. I more accurately should have said "worked efficiently and effectively." Unless you have superhuman abilities (perhaps you do) I challenge you to figure out 1 MLOC of code with naught but your debugger. A modeling tool capable of reverse engineering existing source, even one that just spits out text documentation, will give you a bird's-eye view that's all but impossible to fully achieve with execution tracing. Execution tracing is an ant's perspective of a mountain. Should you have any kind of parallel processing, or even event-based handling, you'll be more likely to come away from things with little more than a lot of frustration and a headache.

It is essential to have that bird's-eye perspective, especially if existing documentation is weak, out-of-date, or just plain not there. You need the static structure relationships and the execution paths documented. From there you can begin to ask educated questions of existing staff (hopefully you have that much) about specific purposes. These answers then allow you to document use-cases (user/system goals), textually and/or visually. For each use-case you document, you'll typically have the collaborating code figured out well enough to use it effectively (at least as much as possible given the code's quality).

--
Two of my imaginary friends reproduced once ... with negative results.
Re:Stepping Through by Lemmy+Caution · 2008-01-19 06:00 · Score: 1

I don't like it when people are too cavalier about quitting and getting a new job without recognizing how difficult that might be, but if your motivation for staying in your job is the stability it provides, an unstable, unreasonable and unpredictable management culture suggests that you may never be very secure there, anyway. Even from the perspective of paying off the house and feeding the kid, you may be well-served by moving on.
Re:Stepping Through by Sam+Douglas · 2008-01-19 10:02 · Score: 1

Whilst it may not be completely comparable to Visual Studio's debugger, Emacs 22 has a pretty decent GDB mode which provides at least a subset of those features.
Re:Stepping Through by ricree · 2008-01-19 15:00 · Score: 1

Thank you for taking the time to answer my question so thoroughly. I appreciate it.
Re:Stepping Through by Lord+Kano · 2008-01-20 00:17 · Score: 1

Can't expect MBAs to understand the difference between good code and bad code, to them it's all just code, and as far as they're concerned, the more the better.

Sometimes you can get lucky and get a department head who worked his or her way up from your position. They tend to be a bit more understanding than some empty suit who never coded an app. Although this isn't always enough, sometimes even an understanding Department Head needs to get approval from someone higher up than themselves to greenlight a major project and in that case everything falls apart again.

In the grand scheme of things that makes sense. If it were up to the programmers themselves, deadlines would usually be much longer because we are often perfectionists. We'd rather take a bit longer and release a product as near perfect as possible. Sometimes "good enough" is, in fact, good enough.

LK

--
"Hi. This is my friend, Jack Shit, and you don't know him." - Lord Kano
Re:Stepping Through by plover · 2008-01-20 08:32 · Score: 1

According to TFA above, he recently took this job. It's not like it's a 20 year commitment he's walking out on, he's just started this job. He doesn't know yet if his manager is smart, reasonable, easy-going, rock-stupid, scared stiff, or bat-shit insane. If it turns out that his manager is one of the bad ones, it's best for him to keep looking quickly as if he hadn't taken that job. A bad manager will lead to poor reviews (and no raises) which will hurt him looking for a new job. A bad manager will lead to ulcer-inducing stress, and/or heart conditions. The guy will wake up every morning trying to figure out if he can call in sick. He will hate his life -- that's what a bad manager can do to you.
I've had the opposite -- a lucky string of managers ranging from "great" to "pretty good". But I've seen some of the bad ones, and how much their employees suffer beneath them. I wouldn't wish that life on an enemy. And I shudder to think what might happen to me the next time I get a new manager, and he or she turns out to be one of the crazies -- I too have a mortgage, two college tuition bills, and no desire to get in a price war with a bunch of low-cost H1-B visas in this job market.

--
John
Re:Stepping Through by dedalus2000 · 2008-01-20 17:01 · Score: 1

I've had a lot of luck with Leo (a cweb-compatible literate editor). Just import the entire code base and start building an outline based on it. I even managed to make sense of a literal 500-line main subroutine.

--
My keyboads not woking popely.
Re:Stepping Through by The_reformant · 2008-01-21 01:49 · Score: 1

Wow, that's comprehensive. I already thought it rocked and now I've learned a couple of new tricks!

--
I have discovered a truly remarkable sig which this post is too small to contain.
Re:Stepping Through by doti · 2008-01-21 03:11 · Score: 1

Good code has never been hurt by the presence of good documentation, proper specifications, decent planning,

Apart from the harm done by outdated documentation, misleading specifications and bad planning, the code can get less clear with the clutter of unnecessary documentation.

Of course there are places where some lines of text explaining the "why" are needed. But, for good -- and that also means clear to read -- code, the "how" is superfluous.

--
factor 966971: 966971
Re:Stepping Through by plover · 2008-01-21 09:32 · Score: 2, Informative

Thanks for the compliment!
I used to teach a course in debugging with Visual Studio, and I basically trawled through my syllabus looking for the cool tricks. Using the stack-crash demo to drop into the source code of the crashing module is a real attention-grabber.
I found debugging in gdb to be a lot like debugging in WinDBG. You have to learn a lot of esoteric commands that you don't use very often, so it takes a lot of practice to learn them. And if you aren't constantly searching for the side effects of each step, you can miss a valuable clue. Seeing the color change on watched values that have changed is a great way to pick up on otherwise subtle corruptions. Seeing an entire object's value hierarchy go red because you munged its pointer really stands out, at least to my eye.
Here is the bibliography and recommended references from my syllabus. It's pretty out of date these days (I especially miss the C/C++ Journal,) but the references are still good if you can find them.
Bibliography
Debugging Applications by John Robbins (Microsoft Press, 2000, ISBN 0-7356-0886-5)
Visual C++ Development Stunts by Mike Blaszczak (Lecture at Tech Ed 98)
How to Debug Quickly and Effectively with Visual Studio 97 by Martyn Lovell (Lecture at Tech Ed 98)
MSDN Library Visual Studio 6.0 (Microsoft, 1998), see also http://msdn.microsoft.com/library/default.asp?url=/library/en-us/dnvc60/html/memleaks.asp
Customising Autoexp_dat.doc, by EMCC Software, see http://www.emccsoft.com/devzone/tools/devstudio.html
The Bugslayer by John Robbins (Column in Microsoft Systems Journal / MSDN Magazine)
MFC mailing list, 1998-1999 (see http://www.microsoft.com/workshop/essentials/mail.asp)
2600 Magazine, Finding and Exploiting Bugs, Spring 2000, (2600 Enterprises, Inc.)

Resources
MSDN Library Visual Studio 6.0 (Microsoft, 1998), Visual C++ Documentation, Using Visual C++, Visual C++ Programmer's Guide, Debugging. Actually, the whole of the MSDN Visual Studio Library is my number one resource for Windows development. Lean on the F1 key first. It's available online at: http://msdn.microsoft.com/library/default.asp
Microsoft Knowledge Base: http://search.support.microsoft.com/kb/c.asp
MSDN Magazine (the combined former Microsoft Systems Journal and MIND Magazine) is the official Microsoft developer's publication. You'll find all the current and upcoming Microsoft acronyms detailed here. The writing is usually top notch, but the content is usually based on the over-hyped acronym of the month, and is frequently too specialized to be of real value. Since about 2000, it's been the official mouthpiece of .NET. But, it's pretty much required reading to stay on top of what Microsoft is rolling out the door next month.
C/C++ Users Journal is a cutting edge independent magazine that offers the latest developments in C++ techniques, STL work, exception handling research, and C++ language development. It has articles written to many levels, from beginners to experts.
Dr. Dobb's Journal is another independent magazine that more broadly approaches development with a wider variety of tools including C, C++, Java, perl, and Python. It is very strong in any subject it touches, but it is not Windows specific (it has a definite UNIX slant and a frequent anti-Microsoft bias) and some of it will be of less value to a Windows-only programmer.

--
John
Re:Stepping Through by Anonymous Coward · 2008-01-21 14:40 · Score: 0

hey, you don't happen to be the star programmer that everyone hates who forces their ways down other programmers' throats do you?
Re:Stepping Through by Snyper1000 · 2008-01-22 02:40 · Score: 0

Not just a debugger; it's great for learning code as well. Let's see, where to start... Well, first off, there's a find in all files, so if you put everything into your project, you can search the entire project, and use regular expressions if you like... oh yeah, and replace if you need to. Sure, you can do the find with grep, so nothing big here.

Right-click on a function; see that "Show Callers Graph" option? Click it, and it will show you every other function in your project that calls it, so you can drill down backwards all the way to main. If you need to go forward, right-click on the function call and click go to definition; magically you're there at the code for that function. Find a variable with a name like "index" but no local definition of index? Right-click on it and go to definition to see if it's a global, a member, or just a local hidden on a nasty line in the function.

You gotta remember, Visual Studio was written by professional software engineers for their own use. They want to bring people into their huge codebase in minimal time, and find/fix software bugs in minimal time. This is its purpose. Arguably it was released to make development for Windows more attractive, but I'm sure that was not the driving cause for developing it.

Honestly, take a look. It's an IDE and a debugger, both of which get badmouthed by so many, and yet we still have Eclipse, Visual Studio, GDB, etc. as IDEs and debuggers, and development on new IDEs and better debuggers continues, so they have a purpose and are quite useful. I personally choose Visual Studio since I have found nothing that comes close to what it can do.
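
The idea behind a callers graph is easy to illustrate outside any IDE. The sketch below is not how Visual Studio does it; it is a toy version for Python source using the standard `ast` module, and the `SOURCE` snippet, `callers_graph`, and `call_chain` are made-up names for the example. It only sees direct calls by plain name, so method calls and calls through pointers or callbacks are missed.

```python
import ast
from collections import defaultdict

# Hypothetical source to analyse; in practice this would be read from files.
SOURCE = '''
def load(path):
    return open(path).read()

def parse(path):
    return load(path).split()

def main():
    words = parse("input.txt")
    print(len(words))
'''


def callers_graph(source):
    """Map each called name to the set of top-level functions that call it."""
    callers = defaultdict(set)
    tree = ast.parse(source)
    for func in tree.body:
        if not isinstance(func, ast.FunctionDef):
            continue
        for node in ast.walk(func):
            # Only direct calls by plain name; x.read() style calls are skipped.
            if isinstance(node, ast.Call) and isinstance(node.func, ast.Name):
                callers[node.func.id].add(func.name)
    return callers


def call_chain(callers, name):
    """Walk upward from name, following one caller at each step."""
    chain = [name]
    while callers.get(chain[-1]):
        caller = sorted(callers[chain[-1]])[0]  # real graphs branch; pick one
        if caller in chain:                     # guard against recursion
            break
        chain.append(caller)
    return chain
```

Calling `call_chain(callers_graph(SOURCE), "load")` walks from `load` up through `parse` to `main`, which is the "drill down backwards" step described above.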
Re:Stepping Through by Anonymous Coward · 2008-01-25 19:19 · Score: 0

Yeah, VS (2005 is nice, I've used the 2008 Beta 2 Team Studio) is great at getting you familiar with code. Before I started working at my current job (indie PC/360 game dev) I had very little cause to use it as thoroughly as I do now. Edit-and-continue and the debugging tools are your hammer and nails when working on games. Conditional breakpoints, executing functions in the watch (i.e., is this object a server or client? What's the result of this transform on a certain important matrix or vector? What's in this data structure? AI state, visible objects, all are pretty easy to look at with it), and hovering for vars are life savers.

I imagine in non-games programming, some things aren't used like they are when you're building a game. Conditional breakpoints take on a lot more meaning when you can have several complex systems interacting to only occasionally produce a certain result. Macros for setting bookmarks in important areas (one guy and I at work basically make a folder for the class we're working on, then name the bookmark after the methods). Often you need to make a change to a different function than what you're inspecting, possibly in a different class altogether before you continue in the same state.
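
Conditional breakpoints are not specific to Visual Studio; most debuggers accept a condition on a breakpoint, and the same effect can be had by guarding a programmatic breakpoint. A minimal sketch in Python, assuming a hypothetical `run_ticks` game loop and an invented "interesting state" (low health combined with a big hit):

```python
import sys


def run_ticks(damage_per_tick, threshold=50):
    """Hypothetical game loop: apply damage each tick, stop on a rare state."""
    health = 100
    for damage in damage_per_tick:
        health -= damage
        # Equivalent of a conditional breakpoint: stop only when the
        # interesting state occurs, not on every iteration.
        if health < threshold and damage > 20:
            breakpoint()  # calls sys.breakpointhook(); pdb.set_trace() by default
    return health


def disable_breakpoints():
    """Turn every breakpoint() call into a no-op, e.g. for unattended runs."""
    sys.breakpointhook = lambda *args, **kwargs: None
```

Because the condition is ordinary code, it can combine several pieces of state, which is the case described above where several systems interact to produce a result only occasionally.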
Re:Stepping Through by doktor-hladnjak · 2008-01-29 19:44 · Score: 1

If you're spreading stacks of source code printouts on a large conference table, the code base isn't really all that large. Then again different people here seem to have different ideas of what large constitutes.
Re:Stepping Through by Mr.+Slippery · 2008-01-30 10:37 · Score: 1

If you're spreading stacks of source code printouts on a large conference table, the code base isn't really all that large.

The entire codebase may be too large to print out, sure. But in any sane project, it's broken up into bite-size pieces using classes or file scoping or some encapsulation mechanism, and any one person only has to digest a few of these pieces. The code for the portion of the project you need to digest should be small enough to be printed out and spread on a big table.

--
Tom Swiss | the infamous tms | my blog
You cannot wash away blood with blood

Doxygen by Raedwald · 2008-01-18 04:39 · Score: 4, Informative

For C++ code, Doxygen can be useful, as it shows the class inheritance. As requested, it uses a (rudimentary) parser. It works with several other languages too, although I can't vouch for its utility for them.

--
Ne mæg werig mod wyrde wiðstondan, ne se hreo hyge helpe gefremman.

Re:Doxygen by PetriBORG · 2008-01-18 04:49 · Score: 1

Doxygen I thought did java-doc like parsing for C++? I was thinking he should look for something able to build a UML diagram based on the code... I hate UML, but if there isn't any documentation telling you the structures of the code it might be a place to look.

I would google for that, but I'm under deadline myself... (but yet still reading /. - I think it's an addiction).

--
Pete/Petri "damn, my chainsaw is clogged with 1's and 0's again." --clyde
Re:Doxygen by JamesP · 2008-01-18 04:58 · Score: 1

I second this. The important thing is to enable all the 'graph' options, as well as call graphs and other stuff. That will be most useful.

As such, it does reverse engineering of code, showing inheritance, call graphs, include graphs, etc.

The only problem is that it is a pain to configure. Also, the Windows versions don't look very stable.

--
how long until /. fixes commenting on Chrome?
Re:Doxygen by zeekec · 2008-01-18 04:59 · Score: 3, Informative

Doxygen can produce UML diagrams for undocumented code. (UML_LOOK and EXTRACT_ALL)
Re:Doxygen by Bill_the_Engineer · 2008-01-18 06:10 · Score: 4, Informative

Doxygen I thought did java-doc like parsing for C++? I was thinking he should look for something able to build a UML diagram based on the code... I hate UML, but if there isn't any documentation telling you the structures of the code it might be a place to look.

Doxygen is more than a javadoc replacement.
I like Doxygen + Graphviz. Just set Doxygen to document all (instead of just the code with tags) and set it to generate class diagrams, call trees, and dependency graphs and allow it to generate a cross reference document that you can read using your web browser. Set the html generator to frame based, and your browsing of code will be easier. I would also set Doxygen to inline the code within the documentation.
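
For anyone wanting to try the setup just described, here is a rough sketch of the relevant configuration, written as a small Python helper that renders Doxyfile lines. The option names are real Doxygen settings, but the selection and values are only one plausible reading of the advice in this thread, and `OVERRIDES`, `render_doxyfile`, and the `src` input directory are assumptions made for the example.

```python
# One plausible set of overrides; not the poster's actual Doxyfile.
OVERRIDES = {
    "EXTRACT_ALL": "YES",        # document everything, not just tagged code
    "RECURSIVE": "YES",          # descend into subdirectories of INPUT
    "HAVE_DOT": "YES",           # use Graphviz's dot for diagrams
    "UML_LOOK": "YES",           # UML-style class diagrams
    "CLASS_GRAPH": "YES",
    "CALL_GRAPH": "YES",
    "CALLER_GRAPH": "YES",
    "INCLUDE_GRAPH": "YES",
    "SOURCE_BROWSER": "YES",     # cross-referenced source pages
    "INLINE_SOURCES": "YES",     # put function bodies into the documentation
    "GENERATE_HTML": "YES",
    "GENERATE_TREEVIEW": "YES",  # side navigation panel in the HTML output
}


def render_doxyfile(overrides, input_dir="src"):
    """Render settings as 'KEY = VALUE' lines in Doxyfile syntax."""
    lines = [f"INPUT = {input_dir}"]
    lines += [f"{key} = {value}" for key, value in overrides.items()]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(render_doxyfile(OVERRIDES), end="")
```

The output is meant to be appended to a default configuration such as the one `doxygen -g` generates. HAVE_DOT only helps if Graphviz is installed, and the graph options can make runs on a large code base noticeably slower.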
I've used Doxygen to reverse engineer very large programs and had good luck with it. I will say Doxygen is not going to do all your work for you, but it will make your job easier. Especially if you add comments to the code as you figure each section out.
Now if you'd like to see the logical flow of each method then try jGRASP (jgrasp.org). It has a neat feature called CSD (Control Structure Diagram) that allows you to follow the logic of the code a little better. It's a Java-based IDE so that may be a turn-off for you. I do wholeheartedly recommend Doxygen (w/ Graphviz).
Good luck.

--
These comments are my own and do not necessarily reflect the views or opinions of my employer or colleagues...
Re:Doxygen by mhall119 · 2008-01-18 06:25 · Score: 1

I agree with all these guys, Doxygen is a great way to get an overview of the structure of the code, and the call/caller graphs will help you walk through the way it all works. If the code is commented, most of that will be pulled into the generated files, but even without documentation, being able to follow call/caller links from a function of interest, and link directly to the line of code where it happens, is a wonderful feature. If you can host a PHP program locally or remotely, you can get automatic search functionality from it also.

--
http://www.mhall119.com
Re:Doxygen by mhall119 · 2008-01-18 06:28 · Score: 3, Informative

Only problem is, it is a pain to configure. Also, windows versions don't look very stable.

The Windows version has been very stable for me; I've not had any problems with either Doxygen or Graphviz. It also includes a configuration wizard that is both easy to understand and powerful. There is also an Eclipse plugin that lets you configure and run Doxygen.

--
http://www.mhall119.com
Re:Doxygen by QRDeNameland · 2008-01-18 06:34 · Score: 1

For C++ code, Doxygen can be useful, as it shows the class inheritance. As requested, it uses a (rudimentary) parser. It works with several other languages too, although I can't vouch for its utility for them.

Another suggestion I would make: if this is a business app and runs atop a SQL database, start by looking at the database schema. Maybe it's just the way my brain works, but to me it is easier to start wrapping your head around the basic architecture of a system from the database schema than from diving into the application code.
When facing a similar situation in my current job, where neither the code nor the database had any useful documentation, I found a saving grace in SchemaSpy. The documentation isn't that great and it took me half a day to get it to work, but it produces a nicely diagrammed schema in HTML that was my base reference while I was deciphering the system.
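
The schema-first idea can be tried with nothing but a database driver. The sketch below is not SchemaSpy and produces no diagrams; it only lists tables and their foreign keys for a SQLite database using Python's standard `sqlite3` module. The `describe_schema` and `demo_connection` helpers and the three demo tables are invented for illustration, and other databases expose the same information differently (for example through `information_schema`).

```python
import sqlite3


def describe_schema(conn):
    """Return {table: [(column, referenced_table), ...]} for every table."""
    tables = [row[0] for row in conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name")]
    schema = {}
    for table in tables:
        # Row layout: (id, seq, table, from, to, on_update, on_delete, match)
        fks = conn.execute(f'PRAGMA foreign_key_list("{table}")').fetchall()
        schema[table] = sorted((fk[3], fk[2]) for fk in fks)
    return schema


def demo_connection():
    """Build a small in-memory database standing in for the real one."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE orders (
            id INTEGER PRIMARY KEY,
            customer_id INTEGER REFERENCES customer(id)
        );
        CREATE TABLE order_line (
            id INTEGER PRIMARY KEY,
            order_id INTEGER REFERENCES orders(id),
            sku TEXT
        );
    """)
    return conn


if __name__ == "__main__":
    for table, fks in describe_schema(demo_connection()).items():
        print(table, fks)
```

Tables with no outgoing foreign keys tend to be the core entities, which makes them a reasonable place to start reading.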

--
Momentarily, the need for the construction of new light will no longer exist.
Re:Doxygen by Anonymous Coward · 2008-01-18 08:06 · Score: 0

Sure, but then you've got the problem of needing a tool for understanding UML diagrams.
Re:Doxygen by SpinyNorman · 2008-01-18 09:15 · Score: 1

Interesting - thanks.

I wasn't aware that Doxygen could also be used as a call-graph generator, and I've wanted something like Graphviz for quite a while for non-software usage!

When I was your age... by russotto · 2008-01-18 04:40 · Score: 2, Interesting

I use the Mark I eyeball, grep, emacs, and of course, the little gray cells.

(and GET OFF MY LAWN).

Re:When I was your age... by Mr.+Underbridge · 2008-01-18 05:13 · Score: 2, Funny

When I was your age...I use the Mark I eyeball, grep, emacs, and of course, the little gray cells.
(and GET OFF MY LAWN).
They have lawns at the old folks' homes these days?
Re:When I was your age... by jdschulteis · 2008-01-18 05:14 · Score: 1

Last time I faced this problem (about 5 years ago), emacs, ctags, and grep got the job done.

I don't understand why those young whippersnappers modded you "Funny".
Re:When I was your age... by Anonymous Coward · 2008-01-18 05:16 · Score: 0

Having just recently taken a new job, I find myself confronted with an enormous pile of existing, unfamiliar code written for a (somewhat) unfamiliar platform -- and an implicit expectation that I'll 'grok' it all 'Real Soon Now'. Simply firing up an editor and reading through it like I did in uni has proven unequal to the task. I'm familiar with Java programming too, but I was never taught to analyze program structure; I've got a very fancy suit but only a rudimentary understanding of procedural syntax. A new-ish tool called an 'IDE' looks promising, as it appears to be based on an actual language parser, but the UI is clunky (I have to use the keyboard), and there doesn't appear to be any facility for integrating/communicating with a developer. What sorts of tools do you use for effectively analyzing and understanding the basic skills of your job?
Re:When I was your age... by russotto · 2008-01-18 06:14 · Score: 1

I don't understand "flamebait". Apparently the mods have never seen real flamebait. Here's a (mild) example:

I use my brain, my editor, and grep. Fancier tools are merely a crutch for the incompetent. If you can't parse C and C++ in your head, you're obviously wasting your employer's money and should find another profession, preferably one involving a paper hat.

(anyone modding this "Flamebait" will be metamodded "wiseass")
Re:When I was your age... by xtracto · 2008-01-18 08:11 · Score: 1

Holy cow, I think some of the mods need a new sense of humour. Why did they mod the parent as flamebait? It was one of the first things I thought of (PPP: paper, pencil and patience, lots of them).

However, I did not understand the little gray cells thing... care to clarify?

--
Ubuntu is an African word meaning 'I can't configure Debian'

Paper by raddan · 2008-01-18 04:41 · Score: 2, Insightful

You should really be sitting down and attempting to understand the code, ASAP. Asking Slashdot for fancy tools isn't really going to help you. The real barrier here is your own brain.

Re:Paper by Anonymous Coward · 2008-01-18 04:51 · Score: 0

I agree, stop wasting time searching the web for useless tools and get to reading the code.
Re:Paper by plopez · 2008-01-18 05:08 · Score: 1

Damn. You beat me to it. I would also suggest developing domain knowledge. Reading code is useless unless you understand the what and why of the problems being solved.

--
putting the 'B' in LGBTQ+
Re:Paper by bunratty · 2008-01-18 05:15 · Score: 2, Interesting

I don't think I've ever been able to understand a large body of code by simply looking at it. I've always found that attempting to make modifications (fixing bugs, adding features) to the code gets me to understand it fairly quickly. Often, I'll find myself adding comments or cleaning the code up as I go. There have been times when I've just thrown all the code away and reimplemented the same functionality from scratch. That may not be an option here, but perhaps writing an implementation of part of the code from scratch will help to gain an understanding of how that particular feature is implemented.

--
What a fool believes, he sees, no wise man has the power to reason away.
Re:Paper by jklappenbach · 2008-01-18 05:24 · Score: 1

Tools help. Several people have mentioned Doxygen. I've used it in the past on commercial projects. New developers coming in found its output to be of great help in understanding the general structure of the code, the hierarchies (they were C++ projects), and as a reference to quickly identify candidate classes for modifications or the likely source of bugs.

Mod parent down for being arrogant and patronizing.

-jjk
Re:Paper by Anonymous Coward · 2008-01-18 05:29 · Score: 0

Mod parent down for being arrogant and patronizing.

Or up for being right.

Your brain: not just for cooling the blood anymore!
Re:Paper by cjonslashdot · 2008-01-18 05:30 · Score: 3, Insightful

I agree. I have found that it is fairly easy to uncover program structure. But UNDERSTANDING the intention of each line or function is another matter. This is where one wishes that there were documentation of design decisions. This is why whenever I build something I simultaneously maintain a design document in which I record each decision that I make and each pattern that I devise and use. As I revisit decisions, I do it in the design, and only when I have worked out the design do I try to code it. This is not the traditional "big up front design" - it is an agile approach to design, attacking it incrementally and in a just-what-is-needed manner.
Re:Paper by sdpuppy · 2008-01-18 05:48 · Score: 3, Funny

Well when all else fails, look at the variable/function/structure names.
Obviously a program with labels such as "Frodo" "Sam" "Gondor" must be doing something Lordly with rings
and if you have labels such as "string1" "string2", then the program must be solving some particle physics problem involving string theory.
... and when that fails, you go back to your old college, find the smartest CS geek and slip him/her a few dollars to figure it out.
Need I add :-) :-) :-) ?
Re:Paper by raddan · 2008-01-18 06:50 · Score: 1

You are apparently the only one who understood my post.

doxygen by greywar · 2008-01-18 04:41 · Score: 3, Informative

If it's in a language that doxygen can understand, that's the tool I would HIGHLY recommend.

Ctags by pahoran · 2008-01-18 04:42 · Score: 3, Insightful

google exuberant ctags and learn how to use the resulting tags file(s) with vim or your editor of choice
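
For anyone curious what an editor does with that tags file, here is a minimal Python sketch of a symbol lookup. It assumes the usual ctags layout (name, file, and address separated by tabs, with metadata lines starting with "!_TAG_"); the sample entries and the load_tags helper are made up for illustration, and a real tags file can contain patterns this simple parser would mishandle.

```python
from collections import defaultdict


def load_tags(lines):
    """Parse ctags-format lines into {symbol: [(file, address), ...]}."""
    index = defaultdict(list)
    for line in lines:
        # Lines starting with "!_TAG_" are metadata written by ctags itself.
        if line.startswith("!_TAG_") or not line.strip():
            continue
        # Each entry is: name <TAB> file <TAB> address[;" extension fields]
        parts = line.rstrip("\n").split("\t")
        if len(parts) < 3:
            continue
        name, path, address = parts[0], parts[1], parts[2]
        # Drop the ;" marker that separates the address from extension fields.
        if address.endswith(';"'):
            address = address[:-2]
        index[name].append((path, address))
    return index


# Hypothetical sample entries, made up for illustration.
SAMPLE = [
    "!_TAG_FILE_SORTED\t1\t/0=unsorted, 1=sorted, 2=foldcase/\n",
    "main\tsrc/main.c\t/^int main(int argc, char **argv)$/;\"\tf\n",
    "parse_config\tsrc/config.c\t42;\"\tf\n",
]

tags = load_tags(SAMPLE)
for path, address in tags["parse_config"]:
    print(path, address)
```

The address is either a line number or a search pattern, which is why editors can still jump to the right place after the file has shifted a little.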

--
I'd give my right arm to be ambidextrous.

Re:Ctags by Anonymous Coward · 2008-01-18 10:09 · Score: 0

It's good you distinguish between "vim" and "your editor of choice", but then why do you even mention the former?

Old School by geekoid · 2008-01-18 04:42 · Score: 4, Funny

Printouts and colored markers.

--
The Kruger Dunning explains most post on /. http://en.wikipedia.org/wiki/Dunning%E2%80%93Kruger_effect

Re:Old School by alienpeach · 2008-01-18 05:18 · Score: 1

While using colored markers seems to be funny, as our moderators have noted, this is actually extremely helpful. Taking a "book" of code, a few highlighters and a mug of coffee is the best scenario for understanding complex code.
Re:Old School by e4g4 · 2008-01-18 05:46 · Score: 1

Indeed - in fact, if you've got one, I dust off an old dot-matrix printer and print it out on those perforated "spools" of paper they use - I find printed code much easier to read without arbitrary page breaks.

--
The secret to creativity is knowing how to hide your sources. - Albert Einstein
Re:Old School by gstoddart · 2008-01-18 06:11 · Score: 1

Indeed - in fact, if you've got one, I dust off an old dot-matrix printer and print it out on those perforated "spools" of paper they use - I find printed code much easier to read without arbitrary page breaks.

Oh yeah, baby!

Several inches thick of tractor fed, 17" wide green bar paper, three different highlighters, a pen, a notebook, and either a coffee or a lager. Nowadays, I'd throw in an iPod.

That brings back memories of university. =)

Cheers

--
Lost at C:>. Found at C.
Re:Old School by Krishnoid · 2008-01-18 07:58 · Score: 1
1. Print out all your code (a2ps -2r is helpful here)
2. find a conference room with a big table
3. spread out all the code after hours
4. start top-to-bottom and cross out dead code
5. annotate functions with comments as you go along
6. repeat top-to-bottom passes until you don't understand anything more
7. type the comments in when you get back to your desk
8. repeat tomorrow
I'd be curious to see how long it takes you to totally understand it (I've done this before and it really helped). I think your thought processes engage differently when you can see the whole program at once to get your mind around its entire scope, at which point you can then zoom in on the specifics. Compare this with looking at the program in a sliding window (your terminal or IDE) piece by piece and you'll see what I mean.
Re:Old School by SpinyNorman · 2008-01-18 08:39 · Score: 1

Close, but old school means pens or pencils instead of colored markers, and the printout needs to be on fanfold tractor-feed paper - with the listing spread out to full length on the floor. A laser printer and a roll of scotch tape works too! ;-)
Re:Old School by CableModemSniper · 2008-01-18 15:19 · Score: 1

this is modded funny, but don't underestimate the benefits of a printout. If nothing else it gives you a break from staring at the screen and you can write on it.

--
Why not fork?
Re:Old School by doktor-hladnjak · 2008-01-29 19:49 · Score: 1

The only problem is that the poster is dealing with a *large* code base. I'm not sure your method scales well to millions of lines of code.

Re:How / why did you get the job... by wampus · 2008-01-18 04:43 · Score: 2, Insightful

Sometimes it's hard to follow execution, especially in a large codebase. It's made even more difficult when a smug jackass wrote it to be as terse as possible.

Understand C++ by SparkleMotion88 · 2008-01-18 04:43 · Score: 5, Informative

Sorry I don't have an open source tool for you, but I've used Understand for C++ in the past and it was pretty helpful. To me, the most useful piece of information for understanding a large codebase is a browseable call graph. I'm sure there are simpler tools out there that generate a call graph, but this is the only one I've used with C++.

Re:Understand C++ by TacoBellGrande · 2008-01-18 05:14 · Score: 1

I've used both Understand C++ and doxygen (with graphviz) to understand the code of those who are no longer available for me to bug. They both have their strengths and weaknesses, so I tend to use both.
Re:Understand C++ by n0-0p · 2008-01-18 05:57 · Score: 1

I use Understand on a daily basis to review other people's code. It does have its quirks (mainly on really big codebases), but I haven't found anything that works better. And while it isn't cheap, there is a free one-month trial version.
Re:Understand C++ by Windscion · 2008-01-18 10:09 · Score: 1

I have also used this tool and found it helpful.

RR & EA by Anonymous Coward · 2008-01-18 04:44 · Score: 3, Informative

Sometimes tools like Rational Rose or Enterprise Architect are successful at reading in the code and building a UML model that you can then attempt to parse through. I'm not familiar with the use of either, but I know it can be done, with mixed results depending on the size and complexity of the code being analyzed. Both tools are fairly expensive though, I believe.

Re:RR & EA by wfeick · 2008-01-18 05:07 · Score: 1

I've used Borland's Together in the past and found it really helpful for C++/Java code. It can be really helpful for coming up to speed on a code base's class hierarchy. Unfortunately, when I tried it on a large C++ code base where I'm currently working, after loading the code base in it seemed to go into some sort of an analysis phase and then eventually crashed.

I'm not sure what the problem was. A sales droid called to check in on my download, passed the crash info on to a techie, and said I'd get a call back. About a month later another sales droid called, said essentially the same thing, and I never heard back. :-(

Slightly off topic, but at a previous company we ran Together inside a VNC session as a virtual whiteboard during design sessions with a distributed team. That made it really easy for everyone to visualize the designs we were discussing, and produced code for us as well.

It's a great tool, but apparently has its limitations.

Re:How / why did you get the job... by Jeremi · 2008-01-18 04:44 · Score: 4, Insightful

One might as well ask, why are you posting smarmy retorts when you clearly didn't understand the question? The question was about understanding the program, not the underlying language.

--

I don't care if it's 90,000 hectares. That lake was not my doing.

Eclipse works extremely well for Java... by mario_grgic · 2008-01-18 04:45 · Score: 1

if your code is not Java, then I go back to Vim and ctags as a fast start that can be set up in a few minutes (and it works for everything from assembler to Java). It will help you navigate code fast, follow function calls etc, but it won't help you visualize class hierarchies or help you to figure out all the places a function is called from like Eclipse does for Java.

Your best bet is to look for a good IDE specific to the language the code is written in. But as far as I know, nothing for other languages comes close to the power of Eclipse's exploration tools for Java; not even Eclipse works as well for, say, C/C++ as it does for Java.

--
As the island of our knowledge grows, so does the shore of our ignorance.

Re:Eclipse works extremely well for Java... by AmaDaden · 2008-01-18 05:01 · Score: 1

I just started working at a company with a horrifying code base and was using Eclipse. Eclipse did a fantastic job of helping me jump around the code (oh how I love you CTRL + left click) but the code itself was still hard to read. I figured out that in Eclipse you can do a LOT more color coding than is used by default. This seems trivial but now with a glance I can get a good deal more information on the scope and type of a variable or function than before. I highly recommend looking into it. I have to note that I am doing Java so I'm not sure how well it'll work for C/C++.
Re:Eclipse works extremely well for Java... by axb2298 · 2008-01-18 05:49 · Score: 1

I am using Eclipse for C code. It has two neat features that I use for understanding code. Ctrl+Alt+H opens the call hierarchy in a clickable tree. Shift+Ctrl+T will jump right to the definition of a function that you type in. The search function also has a lot of useful options.
Re:Eclipse works extremely well for Java... by Zagrev · 2008-01-18 06:26 · Score: 1

While it's not as good at C as at Java, Eclipse is still the way to go. Eclipse understands the full syntax, and you can Ctl-Click from the call of the method to the definition. Then you can Alt-LeftArrow back to where you came from. This is the only way to understand code. Walk through it, following the path of execution. I just used this to understand a 10 year old, 25,000 line C program, and I'd still be trying to understand it without this tool. It rocks. (For C, C++, Java, PHP, Perl, et al)

Re:How / why did you get the job... by geekoid · 2008-01-18 04:45 · Score: 1

I have seen some pretty awfully used languages.
I started at one company, and they had functions that were 1600 lines long, with gotos.

Not easy to understand, and very complex.

--
The Kruger Dunning explains most post on /. http://en.wikipedia.org/wiki/Dunning%E2%80%93Kruger_effect

Reverse Engineer? by dotpavan · 2008-01-18 04:45 · Score: 1

For Java, would reverse engineering the code to UML diagrams help? Any good open source tools one could recommend to understand a large code base?

Re:Reverse Engineer? by samkass · 2008-01-18 05:06 · Score: 1

For Java, he probably wouldn't be having this problem as acutely in the first place. The reduced syntax compared to C++ makes many of the hacker types hate Java, which makes Java twice as good in my book. It also makes everything a lot clearer. In addition, the dynamic nature of the language combined with the compact syntax means even the free tools like Eclipse have excellent analysis capability, and tools like IntelliJ offer phenomenal ability to introspect the code.

But yes, C is a few percent faster and lets the hackers go to town, so some companies still choose it.

--
E pluribus unum

lxr by Anonymous Coward · 2008-01-18 04:45 · Score: 1, Informative

I often use LXR for understanding the kernel, but have used it for other large code bases. If you pair it with some sort of sticky note firefox add-on it becomes particularly useful.

http://lxr.linux.no/

You must have inherited my old project by theophilosophilus · 2008-01-18 04:47 · Score: 5, Funny

Sorry about that.

--
Why have 1 person driving a backhoe when you could employ 20 with shovels?

speaking of opc... by airdrummer · 2008-01-18 04:48 · Score: 0

other peoples' code...b sure 2 post the good stuff on http://thedailywtf.com/ ;-)

Delete it! by Besna · 2008-01-18 04:48 · Score: 1

Make all interfaces use explicit typing (no plain "int"s around, everything clearly signed or unsigned--better yet, use uint32_t and the like from stdint.h). Use one width if possible--whatever your CPU prefers (usually a uint32 or uint64). Learn it by refactoring it. Delete code whenever possible. Kill "#if 0"'s lying around.
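
To get a list of "#if 0" candidates before deleting anything, something like the following Python sketch could help. The function name and the sample source are hypothetical, and the scan is deliberately naive: it tracks nested #if/#ifdef/#ifndef but ignores #else and #elif, so any block with an #else branch still needs a manual look (that branch is live code).

```python
import re

IF_ZERO = re.compile(r"^\s*#\s*if\s+0\b")
IF_ANY = re.compile(r"^\s*#\s*if(def|ndef)?\b")
ENDIF = re.compile(r"^\s*#\s*endif\b")


def find_if0_blocks(source):
    """Return 1-based (start_line, end_line) pairs for each outermost #if 0 block."""
    blocks = []
    depth = 0      # nesting depth inside the current #if 0 block
    start = None   # line where the current #if 0 block began, if any
    for number, line in enumerate(source.splitlines(), start=1):
        if start is None:
            if IF_ZERO.match(line):
                start, depth = number, 1
        elif IF_ANY.match(line):
            depth += 1
        elif ENDIF.match(line):
            depth -= 1
            if depth == 0:
                blocks.append((start, number))
                start = None
    return blocks


# Hypothetical C source, made up for illustration.
SAMPLE = """\
int live(void) { return 1; }
#if 0
int dead(void) {
#ifdef DEBUG
    log();
#endif
    return 2;
}
#endif
int also_live(void) { return 3; }
"""

print(find_if0_blocks(SAMPLE))
```

It only reports line ranges; the actual deleting is best done by hand with version control as a safety net.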

When I am particularly frustrated by antifoidulus · 2008-01-18 04:48 · Score: 1

I find that a hammer works well. Not so much for understanding the code, but it CAN help relieve computer-created stress!

--
Monstar L

What I do by laughing_badger · 2008-01-18 04:48 · Score: 5, Informative

SourceNavigator : A good visualisation package http://sourcenav.sourceforge.net/

ETrace : Run-time tracing http://freshmeat.net/projects/etrace/

This book is worth a read http://www.spinellis.gr/codereading/

Draw some static graphs of functions of interest using CodeViz http://freshmeat.net/projects/codeviz/

Write lots of notes, preferably on paper with a pen rather than electronically.

--
Help children born unable to swallow - www.tofs.org.uk

Re:What I do by Anonymous Coward · 2008-01-18 05:08 · Score: 1, Informative

Use Source Insight for C/C++/Java/C#/Perl/ksh/etc programming languages. It is a very light yet powerful IDE and can be used to browse through code.

I have used it for code bases of more than 3000 C/C++ files and the IDE still behaved well -- just like Eclipse for the Java platform -- and it consumes very little memory.
Re:What I do by GooberToo · 2008-01-18 05:44 · Score: 1

I have to second source navigator. It's crossplatform and supports multiple languages.
Re:What I do by Anonymous Coward · 2008-01-18 06:12 · Score: 0

Actually, there is a very good successor to Source Navigator -- Source Navigator NG

http://sourcenav.berlios.de/

"... strives to improve usability and performance"
Re:What I do by bilbobugginz · 2008-01-18 09:25 · Score: 1

I must add "Silent Bob" - http://silentbob.sourceforge.net/ Cheers.
Re:What I do by Diomidis+Spinellis · 2008-01-18 22:11 · Score: 1
Let me add two pieces of advice.
First, in most cases there's no need to spend time comprehending a large code base. You have to be selective, and the tools you'll use will depend on the problem you're facing.
- If you're only trying to solve a specific bug, then, as other posters already commented, the debugger is your tool of choice.
- If you're facing performance problems, then use the various performance measurement tools.
- If you must evaluate the code, then invest in metric collection tools, like CCCC.
- If you want to reuse the code in another project, then you need to investigate packaging tools and techniques.
- If the code's identifiers or its structure require refactoring, then try using a refactoring browser, like CScout.
- If the code's formatting and style give you a headache, look at a formatter, like indent.
Second, never underestimate the power of grep. Grep will often find things that other tools can't. It can search in documentation, comments, binary files, unparsable source code, and change logs. It can work on any language, and (with the help of find(1) and xargs(1) tools) traverse deep directory hierarchies. Unlike most other tools, grep doesn't need any setup, tuning, or configuration. Grep will often locate what you're looking for, in less time than what you need to download a more sophisticated tool.
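
As a rough illustration of the grep-plus-find idea, here is a Python sketch that walks a directory tree and reports matching lines. It is only a stand-in for the real tools, not a replacement; the grep_tree name and the sample files are made up for the demo.

```python
import os
import re
import tempfile


def grep_tree(root, pattern):
    """Yield (path, line_number, line) for every line under root matching pattern."""
    regex = re.compile(pattern)
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in sorted(filenames):
            path = os.path.join(dirpath, name)
            try:
                # errors="replace" lets us scan binary or oddly encoded files too.
                with open(path, encoding="utf-8", errors="replace") as handle:
                    for number, line in enumerate(handle, start=1):
                        if regex.search(line):
                            yield path, number, line.rstrip("\n")
            except OSError:
                continue  # unreadable file: skip it quietly


# Demo on a throwaway tree with hypothetical file names and contents.
with tempfile.TemporaryDirectory() as root:
    os.makedirs(os.path.join(root, "src"))
    with open(os.path.join(root, "src", "net.c"), "w") as f:
        f.write("int open_socket(void);\n/* TODO: retry logic */\n")
    with open(os.path.join(root, "ChangeLog"), "w") as f:
        f.write("2008-01-18 fixed open_socket leak\n")
    hits = sorted(
        (os.path.relpath(path, root), number)
        for path, number, _line in grep_tree(root, r"open_socket")
    )
    print(hits)
```

Note that the search hits the change log as well as the source, which is the point being made above: a plain text search does not care whether the file parses.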

Doxygen by Anonymous Coward · 2008-01-18 04:48 · Score: 0

How about Doxygen; see their site? Gives you the whole OO inheritance structure, lists of function caller/callees (if desired), graphical representations, etc, etc. And it lets you browse through the code with a web browser...

Non-sequitur time by 14erCleaner · 2008-01-18 04:49 · Score: 1

I'm not exactly answering your question, but in my experience nothing helps you learn about somebody else's code like having to find and fix bugs in it. Just diving in with a specific goal in mind. The next best thing is having somebody who's familiar with it draw you a diagram of the overall structure. Comments in the program, or external documentation, are usually too much to hope for.

--
Have you read my blog lately?

Re:Non-sequitur time by morn · 2008-01-18 05:09 · Score: 1

I absolutely agree. You don't need to understand all the code, you just need to be able to follow the part you're dealing with to fix whatever bug or interface with whatever part you're working on. Don't get me wrong, getting to know the overall architecture is something you should do (hopefully there are some old employees who can draw some block diagrams on a whiteboard for you or something - if not, that's probably something you should try yourself with a bit of archeology of the code), but knowing the ins-and-outs of the whole codebase is not something you should even attempt - you don't need to know all the code in that level of detail.

In my experience, even after two years in my current job, management are still perfectly willing to accept an answer of "I don't know that part of the code very well, give me some time to look into it and get back to you" when they ask me about a bug or a prospective new feature.

--
...or am I missing something?
Re:Non-sequitur time by TigerNut · 2008-01-18 05:17 · Score: 1

The next best thing is having somebody who's familiar with it draw you a diagram of the overall structure.
I find the best thing is to do the drawing myself. It might take a couple of attempts, but in the process you have to dig into the details and discover the structure. The extra interaction with the code gives a more in-depth understanding. If I can draw it, then I can create it, fix it, and explain it to someone else. If I can't draw a particular thing to a desired level of detail (whether it's a piece of hardware, mechanical construction, or software) then I don't really understand it.

--
Less is more.

Answer by hey! · 2008-01-18 04:49 · Score: 4, Funny

Yes. Understanding code is one of the things you hire tools for.
...

Wait, were you talking about software?

--
Post may contain irony: discontinue use if experiencing mood swings, nausea or elevated blood pressure.

Re:Answer by WK2 · 2008-01-18 05:16 · Score: 1

Yes. Understanding code is one of the things you hire tools for.
Yes, but what happens when the tool asks Slashdot how to understand the code?

--
Write your own Choose Your Own Adventure. http://www.freegameengines.org/gamebook-engine/
Re:Answer by hey! · 2008-01-18 05:21 · Score: 1

He gets gratuitously mocked for a cheap laugh by people with a pathetic need to be perceived as clever.

--
Post may contain irony: discontinue use if experiencing mood swings, nausea or elevated blood pressure.

doxygen - with full source option by mhackarbie · 2008-01-18 04:50 · Score: 2, Interesting

I agree with the previous recommendations for Doxygen. A while back I wanted to become familiar with the source code for a game engine and tried various tools to help with the 'grok' factor. I found the doxygen docs, with full source code generation in html, to be the fastest and most convenient way to walk around the code. After a while, it just clicked.

Creating small demo apps that use the code can also help.

mhack

--
Building a better ribosome since 1997

Re:doxygen - with full source option by sheltond · 2008-01-18 05:19 · Score: 1

Yes - I have used doxygen for both C and C++ code. When using the full-source option it can be quite slow, but in conjunction with the "dot" tool it produces quite nice call graphs.

See http://www.stack.nl/~dimitri/doxygen/diagrams.html for info.

http://www-scf.usc.edu/~peterchd/doxygen/example/main_8c.html (to pick a random example found on google) has an example of a doxygen-produced page giving both an include-file graph (at the top) and a call graph (at the bottom).

As you can see it gives a quite nice at-a-glance overview of the program's structure. It will happily produce individual pages for each function in your program showing a graph of functions that call into it and all of the functions it calls.

Note that the boxes in the diagrams are hyperlinked to the corresponding page for that function/header-file.
Re:doxygen - with full source option by Anonymous Coward · 2008-01-18 06:25 · Score: 0

I could never understand the obsession with doxygen as it's a completely STATIC approach:
Alter your code -> do a complete reparse.
Further, it's just generating HTML, so there is no real interaction with the actual code!

I've found Source Navigator to be extremely useful. It is a kind of "enriched editor" where you interact directly with the code, and if you change something, it's directly visible.
It can navigate through the code. You can:

1) find implementation or declarations of functions and variables or macros
2) display a cross-referencing tree of a symbol so you know where this symbol is used
3) do a grep (colorized!) over the source, with full regexp support
4) have a class browser to see the hierarchy of a certain class

That's why I tried to revive the development to "improve usability and performance" and made a fork called Source Navigator NG.
You will find it at http://sourcenav.berlios.de/

We're just in the middle of getting our 3rd release ready, which features the migration to Berkeley db4. That will improve the lookup performance in big projects (think: Linux kernel) dramatically and should be more stable on win32.

Re:Doxygen, and Extracting Software Architectures by Mr.Bananas · 2008-01-18 04:50 · Score: 5, Informative

I use Doxygen for C code, and it is really helpful. One of its most useful features is that it generates caller and callee graphs for all functions. You can also browse the code itself in the generated HTML pages, and the function calls are turned into links to the implementation. Data structures and file includes are also pictorially graphed for easy browsing.

If the system you need to understand has a really big undocumented architecture, then this presentation might be useful to you (there is a research paper, but it's not free yet). In it, the authors present a systematic method of extracting the underlying architecture of the Linux kernel.

GNU Global by Masa · 2008-01-18 04:50 · Score: 3, Informative

GNU Global is able to generate a set of HTML pages from C/C++ source code. This tool has helped me several times. All member variables, functions, classes and class instances are hyperlinks. It provides an easy way to examine source code. It also provides tags for several text editors (for Vim and Emacs especially). http://www.gnu.org/software/global/

Imagix 4D by Imagix · 2008-01-18 04:51 · Score: 1

Imagix 4d (http://www.imagix.com) was a rather interesting tool the last time I looked at it.

Umm.. documentation? by Anonymous Coward · 2008-01-18 04:51 · Score: 5, Insightful

Seriously folks, having spent large chunks of my working life having to decipher the mess of those who came before me I cannot stress enough the importance of clear comments, variable/function names, and consistent and readable syntax. AND WRITE F@#$%ing HUMAN READABLE DOCUMENTS DESCRIBING FUNCTIONAL REQUIREMENTS, ALGORITHMS USED, LESSONS LEARNED, ETC.
Calling all your variables "pook" or the like may be very cute, but does not help me figure out what the heck the function is supposed to do or why I would ever want to call it. Yes it's a pain. Yes we're all under time deadlines and want to get it working first and go back and document it later. And yes, it WILL bite you in the ass (ever heard of karma? your own memory can go and then you have to decipher your OWN code!).

That said, if you have inherited a code base from someone who ignored the above, go through and generate the documentation yourself. Write flow charts and software diagrams showing what gets called where and why. Derive the equations and algorithms used in each piece and figure out why the constant values are what they are. Finally, start at the main function or reset vector (I do a lot of microcontroller development) and trace the execution path.

visual studio by Anonymous Coward · 2008-01-18 04:51 · Score: 0

I just use Visual Studio even though the code is not MFC or Windows. As long as it is C/C++, it works fine. VS is a great development tool and has all the features (and more) you are asking for built in.

Documentation Documentation Documentation by DrLang21 · 2008-01-18 04:51 · Score: 1

If any documentation describing the code or at least functions in plain language exists (and for the love of God it always should) start there. If it doesn't, advise that your company start making documentation for any new code (not that you should expect them to listen).

--
I see the glass as full with a FoS of 2.

Osmosis by Greyfox · 2008-01-18 04:51 · Score: 2, Insightful

If the original developer made useful comments that will help immensely. If there's a design document showing how the program fits together that helps a lot. If there's a process document explaining the business logic the application implements, that helps a lot. On average you'll start with a marginal code base with no comments, no design documents and no explanation of what the application is attempting to accomplish.

Get the guys who use it to explain what they're trying to do, read the code for a couple of days and then have them show you how they use the application. Then plan on six months to a year to get to the point where you can look at buggy output and know immediately where the failure is occurring. In the meantime just work in it as much as you can and don't try to redesign major parts of it until you know what it's doing.

--

I'm trying to teach myself to set people on fire with my mind... Is it hot in here?

Last time I had to do something similar... by ByOhTek · 2008-01-18 04:51 · Score: 1

I had to do something similar a while ago with a poorly documented piece of software. I pulled out Visio (it's what we have here; I'm sure there are better tools for the job, but it worked well enough) and made a diagram of what-called-what. Even without the why/conditionals, that helped me a lot (the names made more sense). On parts where I had trouble, I'd go to the lower levels, figure out what they did, and document those functions in the Visio diagram.

That is what I would do in your situation, except:
(A) If you can, find something better than Visio. It beats a paint program, true, but it is still irritating for the task (any recommendations here?).
(B) If you use Visio, you probably don't want to make my mistake of doing the drawing on an 8.5x11 sheet. 85x110 might be better... Assuming you won't print it out...

--
Self proclaimed typo king, and inventor of the bear destroying coffee table (patent not pending).

Re:Last time I had to do something similar... by Anonymous Coward · 2008-01-18 05:01 · Score: 0

Use death trees to draw on?

Constant switching between source and 'visual program' is hell.
Re:Last time I had to do something similar... by ByOhTek · 2008-01-18 05:09 · Score: 1

To each his/her own. I had the source editor on one desktop, and Visio on the other. A key command switched desktops, so I could read something, edit the Visio diagram, and go back fairly quickly. Code more resembles that kind of diagram in my head anyway, so I didn't have trouble.

I don't see dead trees as being any easier for me personally. Too much erasing makes them hard to read, and there was a lot of moving/erasing.

--
Self proclaimed typo king, and inventor of the bear destroying coffee table (patent not pending).
Re:Last time I had to do something similar... by Anonymous Coward · 2008-01-18 05:12 · Score: 0

Dia is free, and runs on linux/windows. It also has a wide variety of templates available as well.

perl and graphviz by Speare · 2008-01-18 04:52 · Score: 1

I had to do this sort of "unfamiliar code analysis" with an ancient FORTRAN application written by non-software guys in the 1980s. It was some of the worst spaghetti I'd seen in some time.

To make any sense of it, I asked the compiler for a call tree report, and then I fed this through Perl to make a GraphViz "dot" file of it. After a few shuffles, I could start to determine some architecturally related areas and refactor slightly to decouple them into a more clear arrangement of modules. It was still crap, but it was at least something that I could understand to the point of making unit tests and coverage tests.
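
The original glue was Perl and is not shown here, but the same idea can be sketched in Python under some assumptions: the call tree report has already been reduced to (caller, callee) pairs, and the routine names below are invented stand-ins rather than anything from the real application.

```python
def calls_to_dot(calls, graph_name="calltree"):
    """Render (caller, callee) pairs as GraphViz dot source."""
    lines = [f"digraph {graph_name} {{"]
    # De-duplicate and sort so the output is stable between runs.
    for caller, callee in sorted(set(calls)):
        lines.append(f'    "{caller}" -> "{callee}";')
    lines.append("}")
    return "\n".join(lines)


# Hypothetical caller/callee pairs, standing in for a compiler's call tree report.
CALLS = [
    ("MAIN", "READIN"),
    ("MAIN", "SOLVE"),
    ("SOLVE", "MATMUL"),
    ("SOLVE", "MATMUL"),
    ("MAIN", "REPORT"),
]

print(calls_to_dot(CALLS))
```

Saving the output as calltree.dot and running "dot -Tpng calltree.dot -o calltree.png" should give a picture of the graph. Parsing the compiler's actual report format is the part that varies and is left out.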

--
[ .sig file not found ]

Re:perl and graphviz by Anonymous Coward · 2008-01-18 05:05 · Score: 0

I had to do this sort of "unfamiliar code analysis" with an ancient FORTRAN application written by non-software guys in the 1980s.

Was it called AWSIM?
Re:perl and graphviz by ChoirmasterWind · 2008-01-18 05:26 · Score: 1

Yes. GraphViz and dot are magic. You still have to read the code of course, but this gives you the map to guide your reading. The pictures are usefully impressive for your new bosses to show that you are making progress too. Really useful for showing up that nugget of code that gets called from everywhere. http://www.graphviz.org/Gallery/directed/profile.html

Don't attempt the impossible... by namgge · 2008-01-18 04:52 · Score: 4, Insightful

and an implicit expectation that I'll grok it all Real Soon Now

It is unlikely that your job is really to 'grok it all'. Most likely there are specific issues that need to be solved - stop panicking and pick the simplest one on the list and start working on it.

In a similar position to you, I followed Brooks's advice to study the data structures and found it good. Also just running the application under a debugger, inserting breaks in important looking code and then having a look at the call stack when that code was used also proved enlightening. A good debugger also lets you explore the data structures.
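
The call-stack trick is not specific to any one debugger. As a small analogy only (Python here, while the code in question is probably C or C++), this sketch records who called a piece of "important looking code" at the moment it runs. All function names are hypothetical.

```python
import traceback


def who_called_me():
    """Return the names of the functions on the call stack, outermost first."""
    # extract_stack() includes this helper as the last frame; drop it.
    frames = traceback.extract_stack()[:-1]
    return [frame.name for frame in frames]


# Hypothetical stand-ins for application code.
def save_record():
    return who_called_me()  # the "important looking code"


def handle_request():
    return save_record()


def main_loop():
    return handle_request()


stack = main_loop()
print(" -> ".join(stack[-3:]))
```

In a debugger the equivalent is setting a breakpoint in save_record and asking for a backtrace when it hits; the chain of callers is often the quickest map of how control reaches that code.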

When smart-asses tell you "Bill would have fixed that in ten minutes." I recommend replying "I never met Bill, why do you think he left?"

Namgge

Re:Don't attempt the impossible... by Lumpy · 2008-01-18 04:58 · Score: 1

When smart-asses tell you "Bill would have fixed that in ten minutes." I recommend replying "I never met Bill, why do you think he left?"

That one works great. Problem is, most of the time it's not smart-asses doing that, but incredibly stupid managers.

"manager of Marketing, XXX did it in 15 minutes... why are you taking so long?"

The parent's response is perfect for these situations... it shuts them up instantly.

--
Do not look at laser with remaining good eye.
Re:Don't attempt the impossible... by IkeTo · 2008-01-18 05:50 · Score: 1

> I followed Brooks's advice to study the data structures and found it good.

This is often underestimated. Understanding the data structures (and, if available, database entries) of a system usually gives you everything you need to understand, especially for applications that are not heavy on mathematics and algorithms. Of course, there are programs for which the data structures don't make any sense, and for those it is going to be very difficult to understand using whatever method.
Re:Don't attempt the impossible... by SavedLinuXgeeK · 2008-01-18 06:24 · Score: 1

Be careful to make sure that your predecessor quit, and didn't die. While possibly funny, that may not go over so well...

--
je suis parce que j'aime

Etags by __david__ · 2008-01-18 04:52 · Score: 2, Interesting

Emacs and etags are your friend. Meta-. zips to the function under the cursor. C-s for incremental search. Meta-x grep-find for any other search.

Also, run the program with a debugger and step through it. Or put some print statements in key places and see what it produces.

I find that's all I ever need.

--
There. Now go play some cool javascript games!

Re:Etags by jgrahn · 2008-01-18 20:01 · Score: 1

Emacs and etags are your friend. Meta-. zips to the function under the cursor. C-s for incremental search. Meta-x grep-find for any other search.

More generally, if you are going to be browsing large masses of code, you need to be familiar with your editor, and it needs to be good. It needs to be fast too, and the taller the screen you can afford, the better.
I see many people working in editors they don't understand beyond the simplest editing commands, and it certainly hurts their understanding of the code. It doesn't matter if it's Emacs, vi, or Eclipse when you don't use the bloody thing!

well.. by Anonymous Coward · 2008-01-18 04:53 · Score: 0

How about doxygen?

Some tips for you... by Anonymous Coward · 2008-01-18 04:53 · Score: 0

You stepped into a bees' nest. Getting into a place where you now maintain some other guy's code can be a nightmare, especially if your management is clueless.

1- Communicate. The only way they know is if you tell them. If you run up against a pile of spaghetti code that is nothing more than an ugly half-arsed hack, tell them. Tell them that it is going to take more time because of the last guy's mess. Being honest is better than being a yes man and acting like you can do anything they ask on their time line.

2- Don't be afraid to ask. At the last job I had like that, the previous guy did everything on his personal copy of the tools and took them with him. If you need to purchase anything, tell them you need to buy XYZ at $$$$ cost and why. Justification goes a long way.

3- Don't be afraid to let their deadlines slip. It's not like you can control this. You can't know the stuff like the last guy overnight; some code I have here I have worked with for 2 years and I still don't fully understand it... (we are replacing it with something that is not a nightmare). I let deadlines they impose slip all the time if I am not in control. And I let them know this in the meetings when they set the deadline.... "That one will be missed unless we budget way more for it." If you attach dollars to their deadline, they usually move their deadline.

4- Talk to them about getting things replaced with proper solutions. Maintaining that MS access nightmare that some guy in Marketing created 5 years ago is not a real solution, it needs to be replaced with a real solution, let them know.

Re:Some tips for you... by Schraegstrichpunkt · 2008-01-18 05:06 · Score: 1

4- Talk to them about getting things replaced with proper solutions. Maintaining that MS access nightmare that some guy in Marketing created 5 years ago is not a real solution, it needs to be replaced with a real solution, let them know.
Here is a useful bit of vocabulary for explaining why this is so: technical debt.

--
http://outcampaign.org/

Muhahaha by roman_mir · 2008-01-18 04:53 · Score: 1

I hate people who refuse to write requirements / design documentation, stating that good code must be self-explanatory.

Now you can hate them too!

--
You can't handle the truth.

Start with the application type by Anonymous Coward · 2008-01-18 04:53 · Score: 0

and reverse-engineer the analysis diagrams and approaches common to that type of problem rather than getting down into the weeds of specific classes and calls.

For instance, a business app is data centric, so start by understanding the persistent data structures and relationships. If your code is real-time or event driven, try to back out state transition diagrams. If this is a web server app, try to extract use cases from the client's point of view. And so on.

It's MUCH easier to learn the details of specific parts of the code when you know the broader what/why.

Read The Fine Manual by frith01 · 2008-01-18 04:54 · Score: 1

1. Understand any documentation or diagrams that explain the high-level purpose of the processing.
2. Document the inputs / outputs of the system.
3. Determine if it is object-oriented, procedural, multi-language, etc.
4. Search for IDEs for step #3.
5. Identify the code repository used to manage the code. (If none exists, please submit resume to another corporation quickly.)
6. Identify the first layer of processing, and sort out the important sections of code. (Is input translation the most important, is it the user interface / middle processing, or are the outputs the most important?)
7. Look at the most recent changes / bug fixes based on issue reporting / QA tracking.
8. Dive into the most important sections first.

Reading code is no good to start with. by pt99par · 2008-01-18 04:57 · Score: 1

There are some people here who say that you should read and understand the code, but starting by reading the code is just stupid. The best thing to do is to analyze the code with tools so that you can look at the system at different abstraction levels. People who say that you should start at code level have probably never had a real job or have never seen a system with 50k+ LOC. When I analyze a system I use grep, sed and graphviz to draw various diagrams for me at different levels. That way I can understand the system much quicker and I don't have to understand all the details directly. After that I can zoom in to the details by starting at the right parts of the system. So try to find good tools, and if you can't find any, try to use graphviz in combination with your text processing tools of choice.
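For what it's worth, here is a rough sketch of that grep-plus-graphviz idea done in Python instead of grep/sed. It is only an illustration, not the poster's actual scripts: it assumes a C/C++ tree, looks only at quoted `#include "..."` lines, and the function names `include_edges` and `to_dot` are made up for this example.

```python
import re
from pathlib import Path

# Matches lines such as:  #include "foo.h"   (system <...> includes are ignored)
INCLUDE_RE = re.compile(r'^\s*#\s*include\s*"([^"]+)"', re.MULTILINE)

def include_edges(root, suffixes=(".c", ".h", ".cpp", ".hpp")):
    """Return sorted (file, included_file) pairs found under root."""
    edges = set()
    for path in Path(root).rglob("*"):
        if path.is_file() and path.suffix in suffixes:
            text = path.read_text(errors="replace")
            for target in INCLUDE_RE.findall(text):
                # Keep only base names so the graph stays readable.
                edges.add((path.name, Path(target).name))
    return sorted(edges)

def to_dot(edges, name="includes"):
    """Render the edge list as a Graphviz DOT digraph."""
    lines = [f"digraph {name} {{"]
    for src, dst in edges:
        lines.append(f'    "{src}" -> "{dst}";')
    lines.append("}")
    return "\n".join(lines)
```

Write the result of `to_dot(include_edges("src"))` to a file such as `includes.dot` (both names are placeholders) and render it with `dot -Tpng includes.dot -o includes.png`. Headers with a lot of incoming arrows are usually a good place to start reading.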

Codesurfer by ximor_iksivich · 2008-01-18 04:58 · Score: 1

You could have a look at CodeSurfer http://www.grammatech.com/products/codesurfer/overview.html/ which is a program slicing tool for c/c++. I found it extremely useful for analyzing programs. To make full use of it, I would recommend reading the manual in its entirety :)

Re:Codesurfer by hAckz0r · 2008-01-18 06:26 · Score: 1

Mod parent up! CodeSurfer is the Industrial Strength code analysis suite. Understand-C++ is a great bargain for the price, but CodeSurfer wins hands down in the available features and extensibility.

Re:The best tool by Anonymous Coward · 2008-01-18 04:58 · Score: 0

A college degree in something CS/related will help you.

How about creating a new tag:

"Troll needing ego boost"

I have an idea by Anonymous Coward · 2008-01-18 04:58 · Score: 1, Funny

You could try posting the code here and maybe some kind people at slashdot can help.

Source Insight or SlickEdit by dgoldman · 2008-01-18 04:58 · Score: 1

Source Insight and SlickEdit are not open source but trials are there for either.

In my opinion, having a good editor that allows quickly jumping to definitions or references is the best tool out there. Understand works but in the end wasn't as helpful as I hoped. Take a look at these and pick your poison. I like them both but prefer Source Insight for the windows machines and SlickEdit for Linux.

Try everything you can. Find what works for you. Yea, I know, not much help.

Good open source tool for understanding any code.. by mazanoid · 2008-01-18 04:59 · Score: 0, Redundant

There's this great opensource package called OpenEyes, and to my knowledge it only requires nominal installation effort by the user. Basically you just have to configure the face.cfg to provide the correct balance of tension and flexion to the ocular.modules.

Hope it helps.
-Mz

hmm. by apodyopsis · 2008-01-18 04:59 · Score: 1

That's like a carpenter asking for a nail gun because the hammer is too complicated to use. As with all trades, get to grips with the basics first. If you really cannot make a dent in your code mountain, then are you sure you should be doing the job? No disrespect intended.

I find, when in similar situations, start in main() and stroll down the call tree. I also make a beeline for interrupt handlers and pointers - but then I specialize in embedded software so bear in mind that my advice might be as useful as rice paper underpants. I suspect that the same idea holds true for most SW. For OO work I try to get a mental image of all the classes before I picture how they stand together.

Of course, as my profession is full of considerate, professional engineers all the code is clearly labeled and structured. riight.

Re:hmm. by Anonymous Coward · 2008-01-18 05:21 · Score: 0

[...] my advice might be as useful as rice paper underpants. [...]

I wear rice paper underpants, you insensitive clod!
Re:hmm. by cloud1494 · 2008-01-18 06:18 · Score: 0

If the nail gun is faster and more efficient, why not use it?
Re:hmm. by SageinaRage · 2008-01-18 07:31 · Score: 2, Insightful

It's more like a carpenter asking for a nail gun because it's quicker, less tiring, with less chance of damaging themselves. Any carpenter with any sense would ask for one, just like any coder with any sense would ask for these tools.

Just Do It by BAH+Humbug · 2008-01-18 05:00 · Score: 1

Without someone else to lead you through the code set, the best option is to go make a small change that is desired by someone. That person becomes your customer and has a vested interest in confirming that your change works. Don't try to understand the whole code set -- just study the section(s) you think need to change to fulfill the request. Repeat. You'll build an understanding over time.

Add unit tests as you make changes to demonstrate how a section of code is used and to capture existing behavior. When you feel comfortable, begin refactoring sections which you found obtuse. If someone complains that you have broken something, add a test to make sure it doesn't happen again.
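A minimal sketch of what "capture existing behavior" can look like, in Python with the standard `unittest` module. The function `legacy_price` is a made-up stand-in for whatever inherited routine you are about to touch. The point is that the tests pin down what the code does today, quirks included, before you change anything.

```python
import unittest

def legacy_price(quantity, unit_price):
    """Hypothetical stand-in for an inherited function nobody documented."""
    total = quantity * unit_price
    if quantity > 10:
        total = total * 9 // 10   # undocumented bulk discount found by reading the code
    return total

class LegacyPriceCharacterization(unittest.TestCase):
    """Records current behavior; these are not statements of what is 'correct'."""

    def test_small_order_has_no_discount(self):
        self.assertEqual(legacy_price(2, 50), 100)

    def test_bulk_discount_starts_above_ten(self):
        self.assertEqual(legacy_price(10, 100), 1000)
        self.assertEqual(legacy_price(11, 100), 990)
```

Run it with `python -m unittest` from the directory holding the file. When a refactoring makes one of these fail, you have either broken something or discovered behavior someone relied on, and both are worth knowing.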

Understand the design first, then the code by Anonymous+Brave+Guy · 2008-01-18 05:01 · Score: 4, Informative

I'm afraid you've set yourself an almost impossible task. IME, there are no shortcuts here, and it's going to take anywhere from a few months to a couple of years for a new developer to really get their head around a large, unfamiliar code base.

That said, I recommend against just diving in to some random bit of code. You'll probably never need most of it. Heck, I've never read the majority of the code of the project I work on, and that's after several years, with approx 1M lines to consider.

You need to get the big picture instead. Identify the entry point(s), and look for the major functions they call, and so on down until you start to get a feel for how the work is broken down. Look for the major data structures and code operating on them as well, because if you can establish the important data flows in the program you'll be well on your way. Hopefully the design is fairly modular, and if you're in OO world or you're working in a language with packages, looking at how the modules fit together can help a lot too. Any good IDE will have some basic tools to plot things like call graphs and inheritance/containment diagrams, if not there are tools like Doxygen that can do some of it independently.

If you're working on a large code base without a decent overall design that you can grok within a few days, then I'm afraid you're doomed and no amount of tools or documentation or reading files full of code will help you. Projects in that state invariably die, usually slowly and painfully, IME.

--
If you disagree, post your argument. (-1, Overrated) isn't your personal censorship tool for views you don't like.

Re:Understand the design first, then the code by pavera · 2008-01-18 05:31 · Score: 1

As someone who has inherited quite a few code bases in the state you describe in your last paragraph (and trades in turning around large projects which have gone off the tracks), I can completely agree with you. If there isn't a decent design behind the system, something that can be explained in a few days or a week, that details the major modules of the system, the major code/data paths in the system, and the overall design philosophy, then it gets very difficult.

In general projects in that state have/had 1 or 2 people who are holding the project/company for ransom (or they are just idiots). They've built the system with the thought in mind that no one can figure this out, so we have jobs for life (or with no thoughts in their minds). Either someone comes in and fixes it, provides a design element to the project or the project dies.

Also, I agree with your overall sentiment, you have to learn the design/big picture first. Individual code details give you nothing if you don't know the big picture. In fact, knowing code details without the big picture can be very harmful.
Re:Understand the design first, then the code by Anonymous Coward · 2008-01-18 07:03 · Score: 0

Heck, I've never read the majority of the code of the project I work on, and that's after several years, with approx 1M lines to consider.

Bleh, must be from M$...

useful tools for groking large code bases by Anonymous Coward · 2008-01-18 05:02 · Score: 0

you seem to be looking for commandline tools, but they're never really going to offer a great way to visualize a new complex program, although they can be quite useful in development.

ide's with class browsers, like eclipse (w/ cdt for non java) or openkomodo are pretty good aids.

for source search, cross linking, and highlighting, the best tool i've come across is opengrok - http://opensolaris.org/os/project/opengrok/

if you're more apt to build your own tool, there are a couple of nice libraries out there. scintilla has cross platform language parsers. silvercity builds a python api for examining language constructs. for ruby the recently released ohloh contains parsing capabilities, http://labs.ohloh.net/ohcount

also the venerable exuberant tags is a must for non ide development environments. it's a great tool in conjunction with flexible environments like emacs, or textmate.

of course, nothing beats a good debugger, and stepping through the runtime execution of the code paths.

Re:The best tool by Anonymous Coward · 2008-01-18 05:02 · Score: 2, Insightful

The best programmers I've ever worked with didn't have degrees. But some of the worst ones did.

Re:The best tool by Anonymous Coward · 2008-01-18 05:03 · Score: 0

It was a serious question, and your reply is not only not helpful, it stinks - and probably so do you.

Re:GNU Global vs HyperAddin for Visual Studio by pg--az · 2008-01-18 05:03 · Score: 1

HyperAddin is actually merely on my "list of things to try", I have never actually installed it even. It's at http://www.codeplex.com/hyperAddin, part of "Microsoft's open source project hosting web site". On a new project, theoretically it would be great to link things up as you believe you understand them. On the other hand I have met folks who would actually delete all comments from something they are trying to understand, but that philosophy goes too far, I think a grain-of-salt is what you want.

Tools For Understanding Code? by mattboston · 2008-01-18 05:04 · Score: 1

I thought those were called programmers?

--
Cyberbite Networks - Web Hosting, Dedicated Servers & Colocati

Re: Tools For Understanding Code? by datadigger · 2008-01-18 09:01 · Score: 1

Read the original design documents to get a grasp of the overall application architecture. A quarter of the code will fall into place. Concentrate on the data structures / database schema. That's another quarter of understanding.
The last time my company had to do something like this we hired a refactoring specialist and called back a retired guy who knew the application architecture in great detail. And we formed a team of knowledgeable end users to test the refactored application.
Add a good project manager and a lot of money (many millions $$ in our case) and time (two years) and it worked.

--
Aphorisms don't fix code. (Bart Smaalders)

kcachegrind by Akatosh · 2008-01-18 05:04 · Score: 1

kcachegrind is very nice for a lot of languages. It makes an easy to read function call map, among other things.

Look at doxygen/umbrello by Yiliar · 2008-01-18 05:04 · Score: 3, Informative

See:
http://www.stack.nl/~dimitri/doxygen/
and:
http://uml.sourceforge.net/index.php

These tools allow you to 'visualize' a codebase in several very helpful ways.
One important way is to generate connection graphs of all functions.
These images can look like a mess, or a huge rail yard with hundreds of connections.
The modules, libraries, or source files that are a real jumble of crossconnected lines are a clear indication of where to start clean up activities. :)

Good luck!

Vim and etags by Anonymous Coward · 2008-01-18 05:06 · Score: 0

I use [g]vim with etags. This works really well, even for exploring complex code like the Linux kernel.

Wait 'till you get to reading the specs... by crovira · 2008-01-18 05:08 · Score: 2, Interesting

That should be good for a laugh or three.

They'll be out of date, full of inconsistencies and incomplete.

Then you'll be reading the code only to discover that people's idiosyncrasies and personalities definitely affect their coding styles. (There's even some gender bias where women tend to set a lot of flags [sometimes quite needlessly] and decide what to do later in the execution while men code as if they knew where they were going all the time, just that when they get there, they're missing some piece of information or other.)

If you read code developed by a whole team of people, you'll get to know them, intimately.

Good luck. You'll be at the bar in no time... I kept the stool warm for you.

--
MSBPodcast.com The opinions expressed here are my own. If you don't like 'em... Think up your own stuff.

Re:Wait 'till you get to reading the specs... by Intron · 2008-01-18 05:57 · Score: 1

For most programs there was some sort of design document or requirements document, or both, usually out of date by the time the program was finished. Maybe start with that and all the meeting notes and emails about changes which have taken place since and use that to update the document. Once you have a spec that people agree to you will be on firmer ground.

I like doxygen for inline documentation since it is easy to keep updated as you make changes.

Idiosyncrasies -- too true. One guy hated gotos so he had something like below all through his code.

do {
    some stuff
    if (goto needed) break;
    more stuff
} while (0);

--
Intron: the portion of DNA which expresses nothing useful.
Re:Wait 'till you get to reading the specs... by liquiddark · 2008-01-18 10:48 · Score: 1

Specs? HA! That's a good one.

The Classics by Dunx · 2008-01-18 05:09 · Score: 1

In your situation, the thing you need to use most is your voice: talk to people who already understand the code.

The last time I had to do this (with no documentation, meaningful code comments, or engineering support - no voice option!) it was in a mixed-language code base too.

My tools of choice were:

* etags - like ctags, but supporting pretty much any block-structured language. So navigating from Delphi code into C# code actually worked.

* vim - reads etags files, and of course it is my editor of choice.

* grep - etags doesn't work so well on finding references, nor on qualified names in Delphi (and why should it? I was delighted it understood Delphi at all)

Other tools that were used in the team included Eclipse, Visual Studio and Delphi for the parts that they could each understand but jumping across languages was hard in those IDEs.

Then we wrote lots of wiki pages and I drew UML diagrams to capture program structure. We got there in the end, but it was a hard road.

But it was a nasty mess and I sympathise with your predicament.

--
Dunx
Converting caffeine into code since 1982

massive printfs by cathryn · 2008-01-18 05:13 · Score: 1

Try to change something. Maybe try to fix a bug, something repeatable, but non-cosmetic. Guess names and grep for objects that look like they have about the right name, put in a lot of 'log print' statements and run the thing, adding more log printing as needed. Repeat this every day of your life for about a year.
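If the code base happens to be in a language with decorators, the "lots of log print statements" step can be semi-automated. A small Python sketch under that assumption; the `traced` decorator and the `parse_order` function are invented for illustration and are not from any particular project.

```python
import functools
import logging

log = logging.getLogger("trace")

def traced(func):
    """Log every call to func with its arguments and its return value."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        log.debug("-> %s args=%r kwargs=%r", func.__name__, args, kwargs)
        result = func(*args, **kwargs)
        log.debug("<- %s returned %r", func.__name__, result)
        return result
    return wrapper

@traced
def parse_order(line):
    """Hypothetical function you suspect is involved in the bug."""
    name, qty = line.split(",")
    return name.strip(), int(qty)
```

Call `logging.basicConfig(level=logging.DEBUG)` once at startup to see the output on stderr, slap `@traced` on whatever you grepped up, run the thing, and move the decorator around as your guesses improve.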

--
http://junglevision.com -- Shamus for Gameboy

Read the Comments by Anonymous Coward · 2008-01-18 05:13 · Score: 0

I find that the best way to catch up is to read the helpful comments left behind by the original developers.

They often contain such helpful gems as "Once again the SCU team folks decided to Ass/U/Me that the replication would occur in the ORD node so we have to come in and clean up their mess again. Thanks a lot Dave!"

What's even more fun is when the variable names are encoded: DO WHILE R1 LT D4; R1++ { ...

R1?!? Not so bad when there are only two variables. Mental Sudoku when there are 25.

The Slashdot attitude by gaspyy · 2008-01-18 05:19 · Score: 2, Insightful

I'm appalled by some of the comments that imply that the poster may not be fit for the job.

A few years back I had to maintain a large module written in C#. I had about 200K lines of code, 50 classes, zero documentation, zero comments, zero error logging support, and I was expected to find and fix bugs and add functionality the day after the module was handed over.

So if you were never in this position, just STFU. Yeah, the code is there, but what is this flag for? Is this part really used, or is it obsolete? What are the side-effects of using that method? And so on...

Eventually, I learned it, especially after some intensive debugging sessions, but it was frustrating to say the least. I would have loved to have some aiding tools.

sourcenav-NG by Anonymous Coward · 2008-01-18 05:22 · Score: 0

At least one poster mentioned Source Navigator. I second
this as a good choice for digging into the structure
of several programming languages. I've used it off and
on for several years (even bought a copy back when it was
a cygnus product). I think the original project
(sourceforge page) is unmaintained (last news posting
was in 2003), so it is a challenge to build on
a modern linux distribution (there is a windows
binary as well).

There is a fork working to update the package
SourceNavigator NG. I was able to build their
release with no problems.

http://developer.berlios.de/projects/sourcenav

I've used it for C, C++, Java, and some Python.

I highly suggest giving it a look.

Robert Wood
woodr[at]hiwaay.net

Editplus by Anonymous Coward · 2008-01-18 05:23 · Score: 0

I've used Editplus 2 for years and years - it parses code and color-codes the different elements (functions, variables, strings...).

Re:Umm.. documentation? by Skewray · 2008-01-18 05:24 · Score: 3, Funny

Why? I can write crap and you can clean it up. This is Division of Labor, which is the basis of our civilization.

Where be dragons? by mm4 · 2008-01-18 05:24 · Score: 2, Informative

Apart from Understand for C++, I'd also suggest SourceMonitor - http://www.campwoodsw.com/sm20.html It will at least quickly point you to potentially problematic parts (long functions, deep nesting, etc.).

Sounds like a lack of documentation... abuse it. by Eternal+Annoyance · 2008-01-18 05:26 · Score: 1

use all sarcastic hints from http://mindprod.com/jgloss/unmain.html. Once they start getting desperate, ask them to produce complete documentation so you can actually do your job.

Reverse Engineering Tools by kaladorn · 2008-01-18 05:26 · Score: 1

Rational Rose and Enterprise Architect both allow you to reverse engineer OO projects to produce a model. Of course, the product depends a lot on the complexity of the architecture. I've tried with EA and found that it didn't like STL (at least the version we had). And the COM stuff threw it for a bit of a loop too. But it did show some interesting (and correct) relationships. I've seen MFC reverse engineered in Rational Rose and, with some tweaking, it provided some useful insights.

I also second the recommendation to pick a place you think matters to you in the code and start using breakpoints and observing program flow. The code base itself (and even any model made) can be misleading because it may well include dead code, code which the comments say does X but actually does Y and code which is included, but never called (some other mechanism subsequently put in place or a feature unimplemented). Understanding what the actual code is doing, rather than what some of the files might appear to indicate that it might do, can be quite critical in places.

Of course, external design and functional documentation and API specs should be helpful, right? (Yeah, mod that part "+1, Funny"....)

Trying to grok a big new project is tough. I'm trying to come to terms with a project using a lot of Javascript and XSLT (including some XSLT that generates more javascript), as well as WFS and SVG. The fact that I'm used to working in strongly typed languages like Java and C/C++ where object hierarchies are a little more stringently defined (unlike JS) and where tools make browsing that hierarchy or data content while running easier (unlike XSLT building dynamic JS!) makes it a fair challenge. But perseverance, patience, and experiment are the tools that serve you best.

--
-- Mal: "Well they tell you: never hit a man with a closed fist. But it is, on occasion, hilarious."

If there's an online component by crovira · 2008-01-18 05:26 · Score: 1

http://media.libsyn.com/media/msb/msb-0195_Rovira_Diagrams_PDF_Test.pdf

might help.

It's a technique I used successfully, wherever the client was, whatever the client was up to and with whatever staff was on hand. It's domain independent too.

Enjoy.

--
MSBPodcast.com The opinions expressed here are my own. If you don't like 'em... Think up your own stuff.

Only 2 tools needed by spazoid12 · 2008-01-18 05:26 · Score: 1

#1 - Time
#2 - Experience

If you think about the entire problem as one thing to swallow then you will be overwhelmed. Just know that it takes time. Since this is an existing codebase there is probably an existing queue of bugs (bugzilla, remedy, mantis, whatever)... work through them.

Also, people have argued this with me, but I do find it helpful to reformat the files manually. I'm assuming that there is no clear coding guide that they adhered to (spaces / tabs, indenting, etc). So, what I will often do in order to learn some unfamiliar code that I'm expected to own going forward is reformat it all manually. I prefer to use emacs and I'll include a modeline at the top that is suitable for emacs and vi, and then I'll work through the file reformatting it while gaining some dry understanding and adding comments as I go. Of course, when you reformat a file you should check it back into the source control as a commit that is separate from any actual logic changes.

So- give it time, learn it in chunks, exercise the code and step through it when helpful, make it yours, and most helpful (in my experience) work through the bugs (set a goal, like reduce the overall open bug count from 100 to 10).

Explain the code to someone else. by Organic+Brain+Damage · 2008-01-18 05:28 · Score: 1

You're going to have to read the code. Most programmers love to write code and hate to read code. If you cannot read code, you cannot do maintenance programming.

One technique I've found helpful when confronted with something too big, ugly and important to rewrite....
Find someone, anyone, who will sit in a room with a PC and projector while you explain what the code does to them, in detail.

If you need to diagram, use a whiteboard, Rose is useless. You'll wind up with a huge pile of ineffable UML if you try to diagram it in detail with Rose.

Mod parent up by mccrew · 2008-01-18 05:28 · Score: 4, Insightful

Sorry, no points today to mod you up myself.

I would suggest a slight variation on the theme. Fire up the application, start it on one of its typical tasks, and then interrupt it in the debugger to catch it. While the process is stopped mid-flight, take note of the call stack to see which classes and methods are being used. Maybe step through a few calls, then let the program run some more.

By doing this repeatedly, you will quickly get a sense for which parts of the code see the most action, and would provide the most obvious places to start studying the code base, and provide the best bang-for-buck return on your time.

--
Hey, Windows users, there is no such thing as "forward" slash, there is only slash and backslash.

Re:Mod parent up by Lazerf4rt · 2008-01-18 05:53 · Score: 4, Informative

Fire up the application, start it on one of its typical tasks, and then interrupt it in the debugger to catch it. While the process is stopped mid-flight, take note of the call stack.

Good advice -- breaking randomly. However, it works best in CPU-intensive applications. If the app is mostly idle and event-driven, you're best off searching the code and looking for a place to set breakpoints.

Also, when I use the debugger to help understand some new code, often I'll open a text file and build a "trace" as I go. As I explore things in the debugger and find new call stacks, I add more detail to the trace, in a hierarchical (indented) style. Then I save the traces in case I forget something later.
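For code in a language with a profiling hook you can get a first draft of such an indented trace generated for you and then annotate it by hand. A sketch in Python, assuming a Python code base; `record_trace` is a name invented here, and it relies on the standard `sys.setprofile` hook.

```python
import sys

def record_trace(func, *args, **kwargs):
    """Run func and return (result, lines), where lines is an indented call trace."""
    lines = []
    depth = 0

    def profiler(frame, event, arg):
        nonlocal depth
        if event == "call":            # a Python-level function was entered
            lines.append("  " * depth + frame.f_code.co_name)
            depth += 1
        elif event == "return":        # ...and left again (also on exceptions)
            depth -= 1
        # c_call / c_return events for builtins are ignored to keep the trace short

    sys.setprofile(profiler)
    try:
        result = func(*args, **kwargs)
    finally:
        sys.setprofile(None)           # always switch the hook off again
    return result, lines
```

Dump `"\n".join(lines)` into your text file and you have the skeleton of the trace; the notes about why each call happens still have to come from you.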

As for the original question, I would recommend staying focused. Don't go all over the program trying to understand every system at once. Pick a specific part you really need to understand (say, based on a task you have to do) and focus on understanding that.

Unfortunately, the best tool for understanding code is experience. Not theory and not some fancy visualization program. Once you've seen a lot of different code, you come to recognize what each person was thinking when they wrote it. Once that kind of thing comes easily, you no longer find it necessary to bitch about each different programmer's coding style (as some do). So in a way, the guy who posts this question is lucky to have such a big pile of code in front of him.
Re:Mod parent up by ckaminski · 2008-01-18 08:19 · Score: 3, Insightful

The best way to learn the code is to start fixing some low or medium severity bugs. Something that's not a sev1 is neither so endemic to the system that changing it breaks everything, nor likely to be some random data corruption issue that will be impossible to find. It will be stupid user-input problems, or interaction issues.

Most of my productive code learning was in the first three months of bug-fixing. I think that's why most new hires end up on bug fixing as a rule - it's the fast-path to comprehension.
Re:Mod parent up by Just+Some+Guy · 2008-01-18 10:34 · Score: 3, Insightful

By doing this repeatedly, you will quickly get a sense for which parts of the code see the most action, and would provide the most obvious places to start studying the code base, and provide the best bang-for-buck return on your time.
If only there were some way to automatically generate this information, this "profile" of the running code, if you will.
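In case the hint needs spelling out: most runtimes ship a profiler. As one example, assuming the code is Python, the standard `cProfile` and `pstats` modules give you the "which parts see the most action" list directly. The helper name `profile_call` is made up for this sketch.

```python
import cProfile
import io
import pstats

def profile_call(func, *args, **kwargs):
    """Run func under cProfile; return (result, text report of the hottest functions)."""
    profiler = cProfile.Profile()
    result = profiler.runcall(func, *args, **kwargs)
    out = io.StringIO()
    stats = pstats.Stats(profiler, stream=out)
    # Sort by cumulative time and keep only the top ten entries.
    stats.sort_stats("cumulative").print_stats(10)
    return result, out.getvalue()
```

Point it at the entry point of one typical task and read the report from the top down; for C and C++ the same role is played by tools like gprof or the callgrind/kcachegrind pair mentioned elsewhere in this thread.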

--
Dewey, what part of this looks like authorities should be involved?
Re:Mod parent up by jgrahn · 2008-01-18 14:31 · Score: 1

Unfortunately, the best tool for understanding code is experience. Not theory and not some fancy visualization program. Once you've seen a lot of different code, you come to recognize what each person was thinking when they wrote it.

I'm with you so far. Getting inside the head of the previous maintainer is one important step. Was he sloppy or pedantic? If he knew about a problem, would he expose it or try to cover up? Did he just learn about Development Fad Of the Year and was eager to try it out? Was he a Lisp/C/ancient C++/C++/Python/etc fan?

Once that kind of thing comes easily, you no longer find it necessary to bitch about each different programmer's coding style (as some do).

I am happy that works out for you, but I never learned that trick. Other people's code still looks like crap to me 90% of the time.
So in a way, the guy who posts this question is lucky to have such a big pile of code in front of him.

He is lucky, because maintaining other people's crappy code is something you need to learn before you can become a Real Programmer. Writing from scratch is something students do.
Maintaining crappy code is also fun, in a way. It's a bit like solving crossword puzzles -- a hobby of mine -- at first it seems overwhelming, white squares everywhere. Then you solve one part in a corner, and another, and it becomes easier and easier as you go.

Re:How / why did you get the job... by PetriBORG · 2008-01-18 05:32 · Score: 4, Funny

Only 1600 lines?

I used to work at a company with a lot of Pascal and C code... It was extremely common (as in, all but a few) for programs to be written entirely in one code file. These files would go on for 20,000 lines or more. So many lines, in fact, that after the compiler had imported the header files at the top of the file they would be over 65,000 lines long and the debugger would crap out because it had exceeded the int that it used for line number counting.

Sadly this isn't a joke.

--
Pete/Petri "damn, my chainsaw is clogged with 1's and 0's again." --clyde

HTML based cross reference by NullProg · 2008-01-18 05:32 · Score: 2, Interesting

Run these commands (or put them in a script):

ctags *
gtags
htags -Fan

It will create an HTML folder in the current directory with all the functions/variables cross-referenced. Open the file index.html or mains.html in your browser. If you're not running Linux, I think these utilities are included in cygwin http://www.cygwin.com/
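
For a rough feel of what such a cross-reference contains, here is a minimal Python sketch. It is not how ctags or GNU GLOBAL work internally; the function name `build_xref` and the regex-based tokenizing are assumptions made for illustration, and a real tagger parses the language properly.

```python
import re
from collections import defaultdict

# Crude identifier pattern; a real tagger understands the language's grammar.
IDENT = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")

def build_xref(sources):
    """Map identifier -> sorted list of (filename, line_number) occurrences.

    `sources` is a dict of {filename: source_text}.
    """
    xref = defaultdict(set)
    for name, text in sources.items():
        for lineno, line in enumerate(text.splitlines(), start=1):
            for ident in IDENT.findall(line):
                xref[ident].add((name, lineno))
    return {ident: sorted(places) for ident, places in xref.items()}

# Two tiny hypothetical C files, held in memory for the demo.
demo = {
    "a.c": "int add(int x, int y) {\n    return x + y;\n}\n",
    "b.c": "int main(void) {\n    return add(1, 2);\n}\n",
}
index = build_xref(demo)
print(index["add"])
```

The sketch does not distinguish definitions from uses, which is exactly the part the real tools add.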

Enjoy,

--
It's just the normal noises in here.

Re:HTML based cross reference by nonsequitor · 2008-01-18 06:46 · Score: 1

FYI, The mirror I checked for cygwin does not have gtags or htags. However you can also compile the sources for those projects in cygwin and make your own binaries.

Browse-by-Query by mmacdona86 · 2008-01-18 05:32 · Score: 2, Informative

I'll plug my own open-source project for this:
Browse-by-Query-- it won't help with C/C++(sorry for the original questioner), but it will handle Java or C#.
It dumps the code into a database and lets you query it to find the relationships.
I'm biased, of course, but I've found it's just the thing to understand how a particular piece of functionality in an unfamiliar code base fits into the big picture.
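
To illustrate the general idea of putting code relationships in a database and querying them, here is a small Python sketch using SQLite. It does not reflect Browse-by-Query's actual schema or query language; the table layout, the function names and the call edges are all hypothetical.

```python
import sqlite3

def load_calls(edges):
    """Store (caller, callee) pairs in an in-memory SQLite database."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE calls (caller TEXT, callee TEXT)")
    db.executemany("INSERT INTO calls VALUES (?, ?)", edges)
    return db

def callers_of(db, name):
    """Return the functions that call `name` directly, sorted by name."""
    rows = db.execute(
        "SELECT DISTINCT caller FROM calls WHERE callee = ? ORDER BY caller",
        (name,),
    )
    return [row[0] for row in rows]

# Hypothetical call edges, as a parser might extract them from source.
db = load_calls([
    ("main", "parse_args"),
    ("main", "run"),
    ("run", "save"),
    ("retry", "save"),
])
print(callers_of(db, "save"))
```

Once the edges are in a table, questions like "who calls this?" become one-line queries instead of a grep session.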

Re:Browse-by-Query by mcmonkey · 2008-01-18 05:47 · Score: 1

It dumps the code into a database and lets you query it to find the relationships.

That is awesome!

Headers by 12357bd · 2008-01-18 05:34 · Score: 1

If it's a C/C++ project, start by trying to understand the headers; after the docs/comments they are the most descriptive part.

--
What's in a sig?

Re:Understand C++ scitools.com by John+Sokol · 2008-01-18 05:37 · Score: 1

Understand for C++ (from Scientific Toolworks, Inc.) is the best I have ever seen.

I highly recommend it. Well worth the $500 for it.

--
I am always doing that which I can not do, in order that I may learn how to do it. - Pablo Picasso

This should be the longest thread in /. history. by mcmonkey · 2008-01-18 05:39 · Score: 1

I expect to see copious suggestions from all the l33t hax0rs who tell us it isn't necessary to comment code, and good code is self-commenting, and anyone with any skill can figure out what the code does without comments.

*waiting*

Well, until those guys show up, see the above comments regarding stepping through the code in a debugger. From personal experience I'll say, the larger the application the smaller your initial scope should be.

Don't attempt to grok the whole code base at once. Start with a particular feature or method. Process in small bites and move out from there. As you go, you'll get a better handle on the context to inform your understanding of the parts you've reviewed. And hopefully there's some consistency of methodology to help ease the process as you go.

Also, talk to the last guy. Even if he/she is no longer with the company, if you can get an email address or phone number, 15-30 minutes could save you hours. If the last dev left on good terms/is concerned about burning bridges, they'll have no problem giving you some time. If they left on not so good terms, you'll have their sympathy.

I'm nearing the end of an upgrade to a customization of an off-the-shelf system. The last guy had to make some unconventional design decisions to work around some quirks in the application. A half hour on the phone saved me days of rediscovering the same issues and reinventing the same solutions.

Not for a "large" codebase... by smitth1276 · 2008-01-18 05:39 · Score: 4, Insightful

That doesn't always work for a code base with millions of lines of atrociously written code. I've worked with code where it is absolutely not feasible to step through everything.

It seems like in those cases I end up working from effects... I note some program behavior and then try to find exactly what causes that behavior, which can be surprisingly difficult if you are dealing with the "right" kind of code. After a while, though, the patterns begin to emerge in the system as a whole.

Re:Not for a "large" codebase... by ChrisA90278 · 2008-01-18 07:45 · Score: 4, Insightful

"That doesn't always work for a code base with millions of lines of atrociously written code. I've worked with code where it is absolutely not feasible to step through everything"

You are correct. All these people talking about using a debugger and so on... That does NOT work on larger projects, only on fairly simple ones. "Large" projects might have 250 source code files and thousands of functions or classes and likely a dozen or so interacting executable programs. I've seen printouts of source code that fill five bookcase shelves. No one could ever read that.

I've had to come up to speed on million+ lines of code projects many times. The tool I use is pencil and paper.

The first step is to become an expert user of the software. Just run the thing a lot and learn what it does. Looking at code is pointless until you know it well as a user.
Re:Not for a "large" codebase... by no-body · 2008-01-18 08:15 · Score: 2, Insightful

Run it through a profiler - giving function names, times called, CPU time used, calling hierarchy/tree

if... there is such an animal around still for the environment in question.
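
As an illustration of the kind of data a profiler gives you (function names and call counts), here is a minimal Python sketch using the standard `cProfile` and `pstats` modules. The profiled functions `parse` and `handle_request` are made up for the demo; for C or C++ you would reach for a native profiler instead.

```python
import cProfile
import pstats

def parse(n):
    # Stand-in for real work.
    return [i * 2 for i in range(n)]

def handle_request():
    for _ in range(5):
        parse(100)

def call_counts(func):
    """Run `func` under cProfile and return {function_name: number_of_calls}."""
    profiler = cProfile.Profile()
    profiler.enable()
    func()
    profiler.disable()
    stats = pstats.Stats(profiler)
    # stats.stats maps (file, line, name) ->
    # (primitive_calls, total_calls, total_time, cumulative_time, callers)
    return {key[2]: value[1] for key, value in stats.stats.items()}

counts = call_counts(handle_request)
print(counts["parse"], counts["handle_request"])
```

Sorting the same data by call count or cumulative time shows which parts of the code see the most action, which is a reasonable place to start reading.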
Re:Not for a "large" codebase... by Anonymous Coward · 2008-01-18 11:01 · Score: 0

Sounds like somebody works for Microsoft ...
Re:Not for a "large" codebase... by doktor-hladnjak · 2008-01-29 19:40 · Score: 1

Yup, do you have any idea how sheerly massive the Windows or Office code bases are? They're not the types of things you just sit down to understand over one day or one week or really even one year.

When dealing with such a large code base, you learn quickly that randomly browsing source code is not going to help you figure things out. Sometimes there's architectural documentation, but usually you're figuring it out as you go. The best way is to get your hands dirty in the code by fixing bugs or looking at what somebody else more experienced changed while fixing something else small. Eventually, your only hope is to understand enough about the right subset of the code base to get your work done.

Re:Umm.. documentation? by AaronW · 2008-01-18 05:41 · Score: 1

I would add one more to this. If the code or algorithm is rather complex, it also helps to describe why you're doing what you are, since being able to recreate your thought process years later can be a huge timesaver should you have to debug or modify the code later.

--
This post is encrypted twice with ROT-13. Documenting or attempting to crack this encryption is illegal.

Re:Codesurfer vs "Source Insight" ? by pg--az · 2008-01-18 05:42 · Score: 1

How would you compare CodeSurfer to "Source Insight" ? I had never heard of either before today, but on reading Dgoldman's comment http://ask.slashdot.org/comments.pl?sid=422996&cid=22095314 I am so impressed by the screenshots http://www.sourceinsight.com/features.html that I think it will be worth a try, the visual call-graphs look outstanding, unless Codesurfer has them too and I missed it somehow.

Absolute tosh ! by golodh · 2008-01-18 05:43 · Score: 5, Insightful

An interesting post, even if it's absolute tosh. No-one in his right mind tackles a new code-base of any size or complexity with nothing but a printout. Not if he's expected to understand how it works and/or maintain it in a responsible way.

In fact, it nicely highlights the difference between "software engineers" and "code monkeys". Code monkeys just dive in; they never pause to think. In fact ... they tend to avoid thinking. It's not their strong point. After all ... they're paid to code, right? Not to think. Software engineers on the other hand, look before they leap and spot the places where they need to pay attention first. And they're systematic about it.

In fact, a software engineer will happily spend a day or two putting the right tools in place, *including* a full backup and a proper version management system for when he's going to have to touch anything.

The first thing you want to know about a new code base (after you find out what it's supposed to be doing) is its structure. Tools like Doxygen (see previous posts) show you that structure *far* quicker and *far* more reliably than any amount of dumb code-browsing can. And besides ... once you do it, you've got that documentation stashed away securely instead of milling around incoherently in your head (you'll have completely forgotten most of what you read by next month) or on disorganised pieces of note paper.

The second thing is to figure out if it calls any "large" functionalities like subroutine libraries or even stand-alone programs like databases, let alone if it makes operating system calls. The call-tree will give you an excellent view, and the linker files can complete the picture. You wouldn't be the first maintenance programmer who found out after months that his application critically depends on some other application he wasn't told about.

The third thing is to see where your code does dirty things. Let the compiler help you. Just compile your application with warnings on and have a look at what the compiler comes up with. You might be surprised (and horrified). Then compile with the settings used by your predecessor and check that your executable is bit-for-bit identical to what's running (you wouldn't be the first sucker who's given a slightly-off code base).
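
The bit-for-bit check in the last step can be sketched in Python by comparing file digests. The function names are assumptions for illustration. Note also that some toolchains embed timestamps or build paths in the output, so a mismatch is a reason to investigate rather than proof that the source differs.

```python
import hashlib

def sha256_of(path, chunk_size=65536):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # iter() with a sentinel keeps reading until read() returns b"".
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def same_binary(path_a, path_b):
    """True if the two files have identical contents."""
    return sha256_of(path_a) == sha256_of(path_b)
```

Usage would be something like `same_binary("build/app", "/opt/prod/app")`, where both paths are placeholders for your own rebuilt and deployed executables.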

If performance is at all important, then running the whole thing for a night on a standard case under a good profiler will also tell you lots of important things. Starting with where your code spends its time, where it allocates memory and how much, and where the heavily-used bits of code are. All neatly written down in the profiler logs.

Finally, run your application with a tool to detect memory management errors the first chance you get. Useful tools are Valgrind (in a Linux environment), Purify (expensive, but probably worth it) under Windows, and sundry proprietary utilities under Unix. Just about 90% of the errors made in C programs come from memory management problems, and half of them don't show up except through memory leakage and overwritten variables (or stacks .. or buffers .. or whatever). You'll need all the help you can get here, and as far as these errors are concerned, dumb code browsing is useless. Just keep your head when looking at reports from such tools ... they can throw up false positives. Ask around on a forum with specific questions if you're allowed, or ask your supervisor. After all ... you showed due diligence.

When you know all that (if you have the tools in place, all of this can be done within 1 day + 1 overnight run + 1 hour reading the profiler output), go ahead and trace through the code in a debugger. You'll be in a *far* better position to judge what you should be reading.

Re:Absolute tosh ! by mabhatter654 · 2008-01-18 06:11 · Score: 2, Informative

I'd agree. He's being considered a "code monkey" and not a software engineer. Typical situation is that they'll drop some random user problem on his desk after a week "to familiarize himself" then expect him to figure out what program it is and why it broke and suggest a process improvement. Then tell him he's all wrong because "they already tried that 5 years ago."

The question he's trying to answer is what does the code "do"? why does it exist? what problem does it solve? When you inherit some homegrown ERP system for example, it's easy to find a bug in a routine... not so easy is why input from program A is displayed wrong in program E that is processed by B, C, & D then stored for a week. He's looking for a quick picture of what it all looks like.. in 90% of cases nobody has that info for the CURRENT version of their homegrown system...they might have made the flowcharts, data dictionaries, and code books years ago, but nobody keeps them current.. and DOCUMENTED. How do you get enough info in a short amount of time?
Re:Absolute tosh ! by Anonymous Coward · 2008-01-18 06:16 · Score: 0

tree killer
Re:Absolute tosh ! by przemekklosowski · 2008-01-22 06:43 · Score: 1

An interesting post, even if it's absolute tosh.
OK, and we could stop quoting here---the rest of this post was just more of the condescending attitude. C'mon, it's just a program! The original poster didn't ask for help debugging, optimizing, garbage collecting---he just wants help with understanding the code base. The reply implies that it's impossible to understand the code without the full shampoo treatment (version control, tracing, profiling, instrumenting of memory allocation).
I think this is unwarranted elitism. Call graphs, some runtime analysis (debugger or profiler), and plain old printer/highlighter are excellent tools, and often sufficient. The first program I ever worked on (I actually learned C from reading that code, because the only C book at that time was the Kernighan/Ritchie, and we didn't have it yet) was a Zilog Z80 assembler, and I still have a fanfold printout with my highlights---I am proud to say that I understood everything, except the setjmp/longjmp part.
By the way, anyone who cares about understanding code should know about Literate Programming (http://www.literateprogramming.com), whose guiding principle is that 'source code's main function is to document the algorithm---and it also can be compiled to an executable'. The problem with LP is that there are so many to choose from: original *web (cweb/nuweb/etc), Doxygen, Perl module comments...
Re:Absolute tosh ! by golodh · 2008-01-25 14:33 · Score: 1

OK, and we could stop quoting here---the rest of this post was just more of the condescending attitude.
Well ... first off we'll make allowances, this being Slashdot, but *sometimes* it helps to read the opening post. Don't let me talk you into anything unnatural for you, but I'd just like to put it forward. As a suggestion you know? For you to err ... think about.
In this case the opening post gives some interesting clues as to the problem the poster is facing. Read with me if you will:

"Having just recently taken a new job, I find myself confronted with an enormous pile of existing, unfamiliar code written for a (somewhat) unfamiliar platform -- and an implicit expectation that I'll grok it all Real Soon Now."
This explicitly tells the interested reader four things:
- the poster is confronted with a large unfamiliar code base
- he isn't even familiar with the target platform
- his job requires that he get this codebase under control pdq
- he's brand new to his job
Those who, reading between the lines, infer that he hasn't got any proper documentation for said codebase can get bonus points.
The poster further writes:

Simply firing up an editor and reading through it has proven unequal to the task.
This tells the attentive reader one more thing. Just reading the code wasn't helpful. That's what the man said. So ... printing the stuff out and playing "Little Picasso" with coloured marker pens won't be much of a help either, ok? This isn't about regaling people with unverifiable anecdotes of your personal programming highlights, it's about providing the author with sound and practical suggestions that he can use in his job.

The reply implies that it's impossible to understand the code without the full shampoo treatment (version control, tracing, profiling, instrumenting of memory allocation).
Even close reading of my previous post fails to reveal any claims on my part to the effect that e.g. version control or instrumenting memory allocations are crucial to understanding the code base. They are simply sensible precautions any software engineer would take before
(a) karking around with a huge code-base he can't make head or tail of, and
(b) assuming that whatever code he's been handed doesn't contain a lot of goofy mistakes that will come to bite him as soon as he changes anything.
This guy is going to be *responsible* for that code-base. For his *job*. His *brand new* job. And he's asking on *Slashdot* for tips on how to tackle a new code-base. Well ... pardon me for leaping to the conclusion that he needs some help here and that he isn't too experienced in dealing with real-world software maintenance. So let's just forget to mention that he should establish a baseline of the code-base he has before touching it, shall we? Might be good for a couple of laughs when he finds out and can't revert his changes. Tee-hee-hee.
Oh yes, and when would you like him to find out about any memory allocation problems? Right now, or when he's under a deadline to make modifications to the code-base that may play merry hell with whatever allocations that have been hacked into working? And as said ... reading code isn't going to clue you up much regarding any memory allocation problems. If it were, then we wouldn't *need* code instrumenters to check for memory allocation problems, ok?
As to tracing and profiling. Think you can infer the execution path for the normal case of a non-trivial program you know nothing about by reading it??? Is that what you're saying? How do you know that the top 20% of the code doesn't deal with test-cases and verification of whatever input that that program takes, and was left in because it was considered handy to have that right next to the production code?

You need to understand more than just the code by h2o2 · 2008-01-18 05:45 · Score: 1

If you're not only looking for tools but rather systematic approaches, then the book "Object-Oriented Reengineering Patterns" http://www.iam.unibe.ch/~scg/OORP/ is highly recommended, even for non-OO projects. Understanding code does not get you very far if you try to understand the wrong parts, in the wrong order, for the wrong purposes.

Re:How / why did you get the job... by Anonymous Coward · 2008-01-18 05:47 · Score: 0

I hate to be didactic here but wasn't the word you were looking for snarky rather than smarmy ? ;-)

OpenGrok by Anonymous Coward · 2008-01-18 05:48 · Score: 0

I use it non stop.

Re:The best tool by teknopurge · 2008-01-18 05:49 · Score: 0, Redundant

I had a serious response.

The question did not provide the individual's background which leads me to believe this person is looking for something that will tell him/her how the code works. That coupled with the comments about the UI of some other tools not being good, and there is a clear lack of fundamental knowledge.

Everyone has stories about "..some of the best programmers I've worked with didn't have..." - those type of replies are not cute or insightful.

Someone with a real CS degree that was focused on theory, language design, algorithm development and analysis and math should make haste with dissecting an application. If they have a "CS" degree that started them learning how Windows Forms work in VB, then well, yeah, expect problems.

I'm not a troll - obviously if I was I would not be replying. I made a serious comment that apparently made light of some individuals' background. I'm sorry, but those of you who took it personally need to grow up.

Regards,

--
Website Hosting

cscope and kscope by snutte · 2008-01-18 05:49 · Score: 0

cscope and kscope if you're into X11.
http://cscope.sourceforge.net/
http://kscope.sourceforge.net/

Re:How / why did you get the job... by swillden · 2008-01-18 05:49 · Score: 2, Insightful

...if you don't understand the language?

Yes, it's hard to understand questions when you don't understand the language.

I'm sure you can find some remedial English classes if you look.

--
Note to ACs: I usually delete AC replies without reading them. If you want to talk to me, log in.

i feel for ya by pak9rabid · 2008-01-18 05:50 · Score: 1

Man, I definitely feel for you. I had the pleasure of tackling a large, undocumented, foreign codebase myself a few months ago when the company I recently got hired at started dumping software bugs on me (I'm technically a sys/net admin who can also code). In our case, we're a Java shop that utilizes Eclipse. All I can say is the debugger is your friend (specifically setting break points and stepping through code). Definitely not ideal, but it gets the job done. I couldn't help feeling like Neo trying to learn my way around the Matrix.

They won't allow us to use pavement by crovira · 2008-01-18 05:50 · Score: 1

too many skull fractures.

The walls and the floor are all padded and I'm getting tired of having to eat gazpacho soup through a flex-straw. :-)

-Napoleon XIII

--
MSBPodcast.com The opinions expressed here are my own. If you don't like 'em... Think up your own stuff.

Re:The best tool by teknopurge · 2008-01-18 05:52 · Score: 1

A college degree in something CS/related will help you.

How about creating a new tag:

"Troll needing ego boost" Better yet, how about "-11 - Liberal Arts Major AC sucking my oxygen and tax dollars!"

Regards,

--
Website Hosting

Doxygen by flyingfsck · 2008-01-18 05:53 · Score: 1

Another vote for Doxygen.

--
Excuse me, but please get off my Pennisetum Clandestinum, eh!

Use UML, and focus on the interfaces by davide+marney · 2008-01-18 05:53 · Score: 2, Informative

If your project is object oriented, you may be able to get your UML modeling tool to import the code and visualize the classes. When you do this, you'll probably get a HUGE diagram that seems just as unwieldy as looking at the code. The trick is to apply a filter to the model, so you're not overwhelmed with detail. Your UML tool should be able to do that for you.

I recommend focusing on all interface classes first. This can give you a remarkably sane picture of a system, and will help you divide up the code into more conceptually meaningful chunks.

The tool I use is Enterprise Architect, which does quite a lot of heavy lifting yet is still inexpensive enough for me to own a personal copy.

--
"We receive as friendly that which agrees with, we resist with dislike that which opposes us" - Faraday

Re:Understand C++ vs "Source Insight" ? by pg--az · 2008-01-18 05:54 · Score: 1

How would you compare "Understand" to "Source Insight" ? I had never heard of either before today, but on reading Dgoldman's comment http://ask.slashdot.org/comments.pl?sid=422996&cid=22095314 I am so impressed by the screenshots http://www.sourceinsight.com/features.html that I think it will be worth a try, the visual call-graphs look outstanding, unless "Understand" is enough better to be worth twice-the-price.

Solution by Chapter80 · 2008-01-18 05:54 · Score: 4, Funny

I've always found that the most effective method of learning code is to inject a random line of code somewhere, and see what breaks. Two techniques: 1) print some official-looking error message, and 2) add a large value (a million or greater) to a number somewhere. Keep a nice chart of what you added, where:

Error 'Format Conversion Error, converting from Y2K to Z2L' added to module x1
Error 'Out of Memory Banks' added to module x2
Error 'Object Expected; found adjective instead' added to module x3
Error 'bitbucket 95% full; please empty' added to module x4
Added 1,000,042 to some random value in module x5
Added 5,555,555 to some random value in module x6

Not only will you learn about the code, you'll make a great impression on your boss, when, within minutes, you are able to resolve some mysterious problem that has never happened before.

Slick Edit by Anonymous Coward · 2008-01-18 05:59 · Score: 0

Visual Slick edit has a great source analysis engine, however without the expertise and documentation I don't think any "TOOL" will allow you to grok the code base.

Re:Understand C++ scitools.com by imgumbydamnit · 2008-01-18 05:59 · Score: 1

Amen! I tackled a consulting gig with this one for marking out the refactoring for a 600,000-line legacy C++ code base. Managed it in the 15 day evaluation period too. ;-)

--
To err is human. To arr is pirate.

Is what you get by aled · 2008-01-18 06:02 · Score: 1

but it doesn't really seem to analyze program structure; it's just a very fancy 'grep' package with a rudimentary understanding of C syntax.

So it is; no one really understands C syntax.

--

"I think this line is mostly filler"

Ask somebody who has worked on it by sneakyimp · 2008-01-18 06:02 · Score: 1

Can you get info from other engineers? People who have worked on the code are your single best resource. It's a sure thing they'll get tired of you pestering them but they can really help by leaps and bounds.

I haven't worked on C or C++ code in years but an automatic code-parsing documentation generator sounds like a fairly good idea. If you look at the comments, do they look like this?

/**
 * Short Description here
 *
 * Longer Description here
 * @param type a variable description here
 * etc.
 */

Note that the comment begins with two asterisks. This is the 'Javadoc' style of commenting. Sun offers a tool (http://java.sun.com/j2se/javadoc/) to automatically parse source code and turn those comments into a fairly useful guide to the code. The comments themselves are written with that in mind and the auto-parsed output can be amazingly useful if they are done well. PHPDocumentor can parse them for PHP code and possibly other types as well.
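
As a rough illustration of what such a documentation generator does first, here is a Python sketch that pulls the /** ... */ blocks out of a source string. It is not how Javadoc, Doxygen or PHPDocumentor are implemented; the regex and the function name `extract_doc_comments` are assumptions, and the pattern would be fooled by comment markers inside string literals.

```python
import re

# Matches /** ... */ but not ordinary /* ... */ comments.
DOC_BLOCK = re.compile(r"/\*\*(.*?)\*/", re.DOTALL)

def extract_doc_comments(source):
    """Return each /** ... */ block as a list of cleaned text lines."""
    blocks = []
    for match in DOC_BLOCK.finditer(source):
        lines = []
        for raw in match.group(1).splitlines():
            text = raw.strip()
            # Drop the decorative leading asterisk on each line.
            if text.startswith("*"):
                text = text[1:].strip()
            if text:
                lines.append(text)
        blocks.append(lines)
    return blocks

sample = """
/**
 * Add two numbers.
 * @param a first operand
 */
int add(int a, int b);
/* ordinary comment, ignored */
"""
print(extract_doc_comments(sample))
```

A real generator goes on to pair each block with the declaration that follows it and to interpret tags such as @param.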

A Debugger also helps a lot. If you can step through the code you get a much better idea of exactly what is happening.

Log Messages by mitchell_j_friedman · 2008-01-18 06:05 · Score: 1

For swallowing an unknown application whole, I'm a big fan of logs and log messages. One of these days I'll be in a system that I can just set the log aspect up and it will do what I want, but in the past I have actually put in log messages at beginning and sometimes ending of most methods and functions. And then run the app and watch the logs. Much of the time I have been able to label the messages a diagnostic level and leave the code in but turned off. Certainly in C/C++ you can do this with ifdef and choose to include it or not.
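
The post talks about C/C++ and ifdef; the same entry/exit logging idea can be sketched in Python with a decorator and log levels. The names `traced` and `load_config` are made up for this example, and the log level plays the role of the compile-time switch.

```python
import functools
import logging

logger = logging.getLogger("trace")

def traced(func):
    """Log entry to and exit from `func` at DEBUG level."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        logger.debug("enter %s", func.__name__)
        try:
            return func(*args, **kwargs)
        finally:
            # Runs on normal return and on exceptions alike.
            logger.debug("exit %s", func.__name__)
    return wrapper

@traced
def load_config(path):
    # Hypothetical function standing in for real application code.
    return {"path": path}

# Switch the level to WARNING to leave the calls in place but silent.
logging.basicConfig(level=logging.DEBUG, format="%(message)s")
load_config("app.ini")
```

Leaving the decorator in and raising the level mirrors the "leave the code in but turned off" approach described above.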

No substitute... by pottymouth · 2008-01-18 06:10 · Score: 1

...for starting with main() and breaking the program out into whatever logical blocks it breaks into (if any). Then do the same for each block until you reach a block size (hunk of code) you're comfortable with understanding. Take the block you're most interested in (that you think is most related to your assigned task) and get a good understanding of it. Recurse back up the call tree to understand the bigger context in which that block is used and just keep going until you know enough to do what you've been asked to do. Take notes so that you can refer back to them when you have something different to do. As you do more maintenance you will eventually develop a set of documents describing the entire program.
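
The top-down walk from main() can be sketched as follows, assuming you already have a call graph from some tool or from your own notes. The graph contents and the function name `call_tree` are hypothetical.

```python
def call_tree(graph, root, max_depth=3):
    """Return indented lines describing calls reachable from `root`.

    `graph` maps a function name to the functions it calls. A function that
    is already on the current path is marked instead of expanded, so
    recursion in the analysed program cannot cause an infinite loop here.
    """
    lines = []

    def walk(name, depth, path):
        if name in path:
            lines.append("  " * depth + name + " (recursive)")
            return
        lines.append("  " * depth + name)
        if depth < max_depth:
            for callee in graph.get(name, []):
                walk(callee, depth + 1, path | {name})

    walk(root, 0, frozenset())
    return lines

# Hypothetical call graph for a small program.
graph = {
    "main": ["init", "run"],
    "run": ["read_input", "process"],
    "process": ["process"],
}
print("\n".join(call_tree(graph, "main")))
```

Lowering `max_depth` gives the coarse blocks first; raising it for one subtree matches the "pick the block you care about and go deeper" step.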

I know, the question is about tools. I've used some monitoring tools that help me understand call flow and find hot spots of inefficiency during execution but, honestly, the best analysis tools I have are, in order:

vi
grep (Used with all its flags and a few pipes, the BEST analysis tool EVER!!)
bash
sed
And the suite of text tools that come with every Unix/Linux distro.

Analysis is no fun but I know of no short cuts to understanding code that don't involve all or some of above process.

Fix bugs by Anonymous Coward · 2008-01-18 06:10 · Score: 0

Not a tool, but some advice. Get hold of the bug list and fix as many as you can. Concentrate at first on the little easy bugs that your team just haven't quite got round to fixing (you know, "improve trace in such-and-such a module", "some field isn't case-insensitive"; that sort of thing). Try not to focus too much on any particular area of the code.

Not only will your new team love you for clearing up the bug list, but you'll also be forced to think about a wide range of the code you're going to have to deal with. You'll also get good experience with whatever build and library system you'll be using.

Roll your own tool? by Spinlock_1977 · 2008-01-18 06:13 · Score: 1

The last time I inherited a big bag of crappy, undocumented code (3 years ago), I built my own tool with PHP and Postgres.
This system was an old baseball scores-collection system for a media company. It was written in C in the early 90's on SunOS. The code had NO module headers, and almost no in-line comments. Variable names were poor, and there was no paper documentation of any kind - just the code.

I figured I needed to go through each source file and write down a one-liner as to what I thought it contained. So my database table had a column to hold the actual source code, and a column for my comment (and other columns which I'll get to later). So I wrote a quick 'import' procedure to store each file as a row on my table. Then I built a low-quality web GUI to let me examine the source, and add a comment.

Then I realized I needed to sort all the source files into 'bins'. Library code in this bin, mainlines in that bin, include files in another bin, scripts over there, etc. etc. So I added a table or two and enhanced my GUI to let me do just that. I quickly realized I needed to be able to search through all the source code at once for wild-carded strings, so I built a function to do that as well (Postgres text columns support this nicely).

At the end of the day, I had a decent way of sorting, grouping, and documenting all the source files without actually modifying them - and that gave me a HUGE leg up on understanding the lay of the land. Fortunately my boss didn't mind the extra couple of weeks it took to build this little system ;-)
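
The original was PHP and Postgres with a web GUI; as a much smaller sketch of the same idea, here is a Python version using an in-memory SQLite database. The schema, column names and function names are guesses for illustration, not the poster's actual design.

```python
import sqlite3

def open_catalog():
    """Create an empty catalog: one row per source file."""
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE files ("
        "path TEXT PRIMARY KEY, source TEXT, bin_name TEXT, comment TEXT)"
    )
    return db

def import_file(db, path, source):
    """Store a file's text without modifying the file itself."""
    db.execute("INSERT INTO files (path, source) VALUES (?, ?)", (path, source))

def annotate(db, path, bin_name, comment):
    """Assign a file to a bin and record a one-line description."""
    db.execute(
        "UPDATE files SET bin_name = ?, comment = ? WHERE path = ?",
        (bin_name, comment, path),
    )

def search(db, needle):
    """Paths whose source contains `needle` (plain substring match)."""
    rows = db.execute(
        "SELECT path FROM files WHERE instr(source, ?) > 0 ORDER BY path",
        (needle,),
    )
    return [row[0] for row in rows]

# Hypothetical file contents for the demo.
db = open_catalog()
import_file(db, "lib/score.c", "int parse_score(char *s) { return 0; }")
import_file(db, "main.c", "int main(void) { return parse_score(\"3-2\"); }")
annotate(db, "lib/score.c", "library", "Parses score strings")
print(search(db, "parse_score"))
```

Keeping the notes in a separate table means the legacy sources stay untouched, which was the point of the original tool.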

--
- The Kessel run is for nerf herders. I can circumnavigate the entire Central Finite Curve in a lot less than 12 parse

Keep a log by durin42 · 2008-01-18 06:17 · Score: 1

One of the things I've picked up in the past year or so is that I keep a real, tangible, dead-tree notebook with me when I code. It's been a huge help while learning new codebases, picking up a language, or whatever. For me, the simple act of writing things down has helped hugely with understanding. This, of course, goes along with other techniques mentioned in earlier replies: use debuggers, doxygen/pydoc/etc, and so forth to help with your understanding. Then, once you get it, write out a brief document explaining what you've learned so other people don't duplicate your work.

I've used QA/C and VectorCast... by Simon+Brooke · 2008-01-18 06:20 · Score: 1

I've used QA/C and VectorCast recently when doing a safety audit on ancient C code the authors of which had long since left the company. VectorCast proved not useful, because it requires integration with a C compiler, and the (obsolete and non-standard) compiler for this particular code was not compatible with it. QA/C, however, proved very useful, not simply in quality analysis but also in navigating and understanding the interactions in the codebase, producing very useful interactive calling graphs.

Of course, it's a commercial tool and very expensive. I had a look for open source equivalents but didn't find anything as good.

--
I'm old enough to remember when discussions on Slashdot were well informed.

Re:Understand C++ scitools.com by ctzan · 2008-01-18 06:21 · Score: 1

I wonder what kind of 'refactoring' you did in 15 days to 600,000 lines of legacy code.

Changed all tabs into 8 spaces ? Camelcased class names ?

I could do all that in half-an-hour with sed and awk, no need to register for evaluation.

Use profiling debuggers by bacchu_anjan · 2008-01-18 06:22 · Score: 1

For Java, I'd use one of several CPU profilers. They let you capture the runtime behavior of the system. All you need to do is exercise the relevant use cases and take application snapshots for the time spans that you care about.

Yourkit, JProbe, JProfiler come to mind. Most have 30 day free evaluation periods. So, you can try before purchasing. They also have decent forums, so you can ask several practitioners for the info.

For static analysis, Eclipse has a call graph plugin that can tell you about all the calls a given method is making as well as all of its callers.
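
To make the callee/caller idea concrete, here is a rough sketch in Python rather than Java, using the standard `ast` module. It is not how the Eclipse plugin works; it only follows direct calls to plain names inside top-level functions, and the sample source is invented.

```python
import ast

def static_call_graph(source):
    """Map each top-level function name to the set of plain names it calls."""
    tree = ast.parse(source)
    calls = {}
    for node in tree.body:
        if isinstance(node, ast.FunctionDef):
            callees = set()
            for sub in ast.walk(node):
                # Only direct calls like foo(...); method calls are skipped.
                if isinstance(sub, ast.Call) and isinstance(sub.func, ast.Name):
                    callees.add(sub.func.id)
            calls[node.name] = callees
    return calls

def callers_of(calls, target):
    """Invert the graph: which functions call `target`?"""
    return sorted(name for name, callees in calls.items() if target in callees)

# Invented example source to analyze.
SAMPLE = '''
def read(path):
    return open(path).read()

def parse(text):
    return text.split()

def main(path):
    return parse(read(path))
'''
```

`static_call_graph(SAMPLE)["main"]` gives the callees of `main`, and `callers_of(...)` answers the reverse question.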

You can start with static analysis and then go to runtime analysis.

However, if you want chronological method call trace, JProbe is probably your ticket.

It would be nice if any of the vendors come out with a tool that can do both static and dynamic(runtime) analysis.

Nothing beats... by Babu+'God'+Hoover · 2008-01-18 06:23 · Score: 1

tracking down the original coders and taking them to the strip club.

MOD PARENT UP by mkcmkc · 2008-01-18 06:30 · Score: 1

Although it's nominally a profiler, kcachegrind actually produces a fairly nice call graph, which can be quite helpful in figuring out what's going on. Even works (somewhat) if you don't have the source, as long as the binary isn't stripped...

--
"Not an actor, but he plays one on TV."

More than tools by sohp · 2008-01-18 06:31 · Score: 4, Informative

The best tool is your brain, applied liberally. Here's some thoughts to put in it

Feathers, Michael. Working Effectively with Legacy Code, Chapter 16 especially.

Spinellis, Diomidis. Code Reading: The Open Source Perspective, Chapter 10 lists some tools for you.

My own thoughts now. First, don't trust the comments, they are probably outdated. Second, if it's a big code base, forget the debugger. Write some little unit test cases that exercise the sections of code you need to understand, and assert what you think the code is supposed to do.
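
A small sketch of what such a test can look like, in Python with the standard `unittest` module. `legacy_price` is a made-up stand-in for whatever inherited function you are trying to understand; the assertions record what you believe it does, and a failure tells you your mental model is wrong.

```python
import unittest

def legacy_price(qty, unit_cents):
    """Hypothetical inherited function whose rules are undocumented."""
    total = qty * unit_cents
    if qty >= 10:
        total -= total // 10  # looks like a 10% bulk discount
    return total

class LegacyPriceCharacterization(unittest.TestCase):
    """Each test pins down one guess about the existing behavior."""

    def test_small_order_has_no_discount(self):
        self.assertEqual(legacy_price(2, 500), 1000)

    def test_bulk_order_gets_ten_percent_off(self):
        self.assertEqual(legacy_price(10, 500), 4500)
```

Run it with `python -m unittest` from the directory holding the file. Once the tests pass, they double as a safety net when you start changing things.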

Finally, unless you are cursed with a codebase which is not kept in version control (in which case, ugh, time to start the jobhunt up again maybe), then take a look at the revision history. See what changes have been made to the area you are working on. With luck, someone will have put in a revision message that points you towards greater understanding of why a change was made, which will in turn nudge you towards knowing the purpose of the section of code that was changed.

Tools of the trade by EvilXenu · 2008-01-18 06:31 · Score: 1

Eyes. Brain. Caffeine. In no particular order.

Read the Documents by ktstzo · 2008-01-18 06:31 · Score: 0

If it's such a big code base, there should be documentation on the class interactions and structures of the program. I would start with that.

That was the best reply in the entire thread! by JonTurner · 2008-01-18 06:32 · Score: 2, Insightful

An *excellent* strategy and thorough explanation. Especially the bit about stopping to think and devise a plan rather than just diving in headfirst. All spot on!

The only thing I could possibly add is to say "gather resources to understand the *purpose* of the system", either through documentation or by speaking with project management and/or end users. If you can learn the business rules and processes, that will be an enormous help in understanding the code's design.

Unless your boss insists you do otherwise by davidwr · 2008-01-18 06:35 · Score: 1

If your boss says "here's 100KLOC understand it by next Thursday" and doesn't give you any documentation, you are sort of stuck.

You could start looking for a new job and you could tell him you won't make his deadline. Even if he backs off and says "OK, understand it by Christmas, and by the way nobody left here knows anything about it" you are still left with nothing but the code.

Such is life sometimes :(.

--
Knowledge is how to play a game, intelligence is how to win, wisdom is knowing what game to play.

Re:Unless your boss insists you do otherwise by Anonymous Coward · 2008-01-18 10:16 · Score: 1, Insightful

It just sucks around here, too ... some 40k lines of code plus some more in zend encoded libraries: out of those 40k of lines, 20k are in a big file that's just one big mangrove of if's; what makes it suck even more, register_globals must be ON, so I guess anything might happen anyplace.

When I am asked when I will be ready, I just say: "I have no f****g idea, better start from scratch or fire me".

For the fellows that said 'just read the code, loser', my answer is "it's probably one of you that took wages for three years to write such a big chunk of junk".

Crystal REVS by lordmage · 2008-01-18 06:38 · Score: 1

Takes source code and generates flowcharts and other understanding indicators.

to me, if you have Source.. you should have Design Documents. Read them first.

--
I can program myself out of a Hello World Contest!!

Use a DSM to understand the architecture by scott151 · 2008-01-18 06:40 · Score: 1

I am currently trying out a tool from Lattix (http://www.lattix.com/) to create a Dependency Structure Matrix(DSM) to represent the software. Lattix LDM will generate the DSM from the output of "Doxygen" or from the output from "Understand for C++". It scales well - I had no trouble with applying it to a code base of 5m loc. I find DSM to be very useful in understanding the overall architecture.
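
For anyone who has not seen a DSM, here is a toy sketch in Python. It has nothing to do with Lattix's implementation. It assumes you already have a module-to-dependencies mapping (the one below is invented), and it uses the convention that a 1 in row i, column j means module i depends on module j.

```python
def build_dsm(deps):
    """deps maps module name -> set of module names it depends on."""
    modules = sorted(set(deps) | {d for targets in deps.values() for d in targets})
    index = {name: i for i, name in enumerate(modules)}
    matrix = [[0] * len(modules) for _ in modules]
    for src, targets in deps.items():
        for dst in targets:
            matrix[index[src]][index[dst]] = 1
    return modules, matrix

def cyclic_pairs(modules, matrix):
    """Pairs of modules that depend on each other (marks on both sides of the diagonal)."""
    pairs = []
    for i in range(len(modules)):
        for j in range(i + 1, len(modules)):
            if matrix[i][j] and matrix[j][i]:
                pairs.append((modules[i], modules[j]))
    return pairs

# Invented dependency data for illustration.
EXAMPLE_DEPS = {"ui": {"core", "net"}, "net": {"core"}, "core": {"net"}}
```

Mutual dependencies such as `core` and `net` above are the sort of thing that jumps out of the matrix and is hard to spot by reading files one at a time.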

Been there... by seanadams.com · 2008-01-18 06:40 · Score: 4, Insightful

There are two kinds of hard problems in programming: problems that are hard because they require ingenuity and deep thought, and problems that are hard because they require weeks of unraveling someone else's garbage.

There are some horrible programmers out there and I have on many occasions been tasked with cleaning up their messes. In your situation I would suggest either a) try to figure out if it would take less time for you to implement it in a clean and maintainable way or b) find someone else you can hire who knows the code base or at least is more familiar with the specific problem.

If you can't do a or b then you're screwed. In that situation, personally, I would either quit, ask for a different project, or print out the whole source code and sit back with a pen and start studying and commenting - one of the few tasks for which I still prefer dead trees.

Re:Been there... by skiflyer · 2008-01-18 07:31 · Score: 4, Insightful

a) is so often the wrong choice and can really submarine a company because they keep getting a cycle of a)'s ... every 5th release becomes a complete rewrite as the new team says "we need a refactoring of the code, no one here is familiar with it and/or it's spaghetti code, just give us 5 months we'll maintain the behavior 100% and we'll clean up a lot of bugs and we promise in the future maintenance will be a breeze"
Re:Been there... by RobDude · 2008-01-18 07:36 · Score: 1

I couldn't agree more.
Re:Been there... by varunvnair · 2008-01-18 07:40 · Score: 1

The submitter never mentioned that the entire codebase was bad. Just that it was enormous (probably not well commented) and he needed to understand the program structure. Tools are helpful even if the code is well structured and easy to read, don't you think?
Re:Been there... by hpebley3 · 2008-01-18 11:20 · Score: 1

There are two kinds of hard problems in programming: problems that are hard because they require ingenuity and deep thought, and problems that are hard because they require weeks of unraveling someone else's garbage.

Not everything is a duality.

There is also the problem that's hard because it requires weeks of unraveling someone else's garbage until you understand the problem space enough to realize it's an ingenious and thoughtful solution.
Re:Been there... by seanadams.com · 2008-01-18 12:23 · Score: 2, Funny

Not everything is a duality.

Ah, so really there are two kinds of things: those which are dualities and those which are not?
Re:Been there... by Anonymous Coward · 2008-01-18 21:19 · Score: 0

So every time you join a new company with an existing codebase, you begin to pick holes in what they've done up till now, and request a rewrite? Surely that doesn't win you, the new guy, any favours with the other programmers, or with management who expect you to contribute?
Re:Been there... by SnowZero · 2008-01-18 23:06 · Score: 2

Ah, so really there are two kinds of things: those which are dualities and those which are not?

Finally, someone who understands both sides of the sphere.
Re:Been there... by cbart387 · 2008-01-19 06:22 · Score: 1

I myself am currently a grad student and shall be graduating this spring. I can say that what you're talking about is where my university has not done a great job. Yeah, most of the teachers push for us to do good commenting but we don't spend a lot of time dealing with the same code over and over again. It's typical that after a project is completed (be it a month or semester long project) that we do not touch it again. I think it could be beneficial if there was a project that spanned multiple classes during your undergraduate years. That way students (I'm including myself in this) would experience having to support a software program over time. There's nothing better than experience in my book.

--
Lack of planning on your part does not constitute an emergency on mine.
Re:Been there... by try_anything · 2008-01-19 10:18 · Score: 1

Honestly, though, unless you can get your hands on an old-timer who knows the code, you can't plan on ever grokking the program theory of the old code. There's no alternative but to keep the old code running and do as few bug fixes and enhancements as absolutely necessary. The old code has its own logic, but as soon as you start messing with it, it will turn into exactly the incoherent mess that you perceive it to be. If you don't mess it up, it has value as a working product and a reference implementation. It gives you breathing space so you can figure out how to live without it. Maybe you replace it with a new product that fills the same role. (Obviously if you're going to put in the effort of building a new product, it shouldn't be a rewrite of the same functionality -- it should be a better product that takes advantage of new ideas, new hardware, etc.) Maybe you refocus the company around a different set of products.
This has a science-fiction feel to it, though. Why do people consider this an important question? Surely it's just as silly as people getting all excited about their post-apocalyptic fantasies where they morph from average guys into civilization-rebuilding supermen? Just like the US and the USSR went to great lengths to avoid the Cold War going nuclear, a company is going to pay whatever is necessary to keep one guy around who knows the old code, or at least pay him to consult. The new programmers can call him on the phone and ask questions like, "I need to enhance the product to handle non-ASCII ISO-8859-1 characters. Is it a big job or a trivial one? What are the modules that probably need to be fixed?" and "It gets really slow when more than ten clients connect at the same time. Why would that be?" After he has explained his thinking on enough questions like that, the new programmers will have absorbed enough of his understanding that it won't be legacy code any longer.
If a company is stupid enough to keep handing off big, valuable codebases to new teams without guaranteeing access to previous developers, then a better question for their developers is, "What's the job market like?"
Re:Been there... by try_anything · 2008-01-19 11:19 · Score: 1

A rough rule of thumb - It is twice as hard to unravel existing code as to write it again from scratch.
You've misremembered this saying in a crucial way. The more common (and more correct, IMO) wording is, "It's twice as hard to read existing code as it was to write it in the first place." Reading the existing code only takes twice as long as it took the original programmer to write *those lines* of code in the first place. It doesn't matter how many false starts and bug-fixes were involved. However, unless you assume that you're smarter than the original programmer, you have to assume that you will go through a similar pattern of false starts and bug-fixes.
Plus, the rule of thumb only applies to the first working version of code. Allow me to illustrate with a rather extreme metaphor. Compare the effort required in understanding just the definitions and proofs in an undergraduate real analysis textbook (skipping the rest of the text) against the effort required to recreate the theory yourself. Reading just the definitions and proofs is a feasible way of learning undergraduate analysis (some people actually do it that way) because the theory has been refined over centuries into an elegant system.
As with mathematics, the more effort is invested into refining and refactoring code, the easier it becomes to understand. Every time I revisit a module to fix a bug or add a feature, I spend a few moments reorienting myself and often discover a way to clarify the code. "What was I thinking? Looking at this module after some time away from it, the structure of the code doesn't help me remember the logic at all. Hmmm, but if I ... then the logic becomes more evident." If you compare the first version of my code to the form it converges on, the first version has less functionality and would therefore be easier to rewrite from scratch, but the second version is easier to understand.
Re:Been there... by try_anything · 2008-01-19 11:29 · Score: 1

Replying to myself here because I forgot to explicitly state the main point of my second two paragraphs:
For some code, the time required to read and understand it is inversely proportional to the time invested in writing it.
Re:Been there... by pAnkRat · 2008-01-19 23:54 · Score: 1

> Therefore, we said, code simply (Particularly your own stuff,
> you may have to support it later.).
> If you are trying to do more than one thing on a line - DON'T.

Yes, I know, but my supervisor is convinced that everything should be coded to run as fast as possible.
"What, you create a new MyBlahObject for every time a page is rendered in this web application?
That's a lot of unnecessary CPU cycles, you should use caching or apply the singleton pattern."

I tried to bring up the "premature optimisation is the root of all evil" reasoning,
and that the code would be more readable and maintainable when it is less speed-optimized.
But that's nonsense, I was told; put in a few comments, then everything will be clear.

Nowadays, I just do less coding, and have started to go into project management; pays off better anyway.

--
we need an "-1 Plain wrong" moderation option!
Re:Been there... by Anonymous Coward · 2008-01-21 12:20 · Score: 0

I agree to that!

Re:Understand C++ vs "Source Insight" ? by Anonymous Coward · 2008-01-18 06:41 · Score: 0

Understand is a more mature product. It integrates better with an existing environment (you can use it from emacs), it supports more languages, and more operating systems. My impression is that Understand is geared mostly towards understanding (or comprehending) existing source code, whereas Source Insight is also geared towards new development. Personally I prefer Understand. I like my tools to be more focused and integrate well with my environment. That said, I think your best bet is to get the trial versions (all tool vendors have to offer trials), spend a little time with each one and see which is more to your liking. Codesurfer by Grammatech is another option, although it is the most expensive of the three. You should be able to get by with just open source tools if price is really an issue, by the way. There are many out there: sourcenav, doxygen, silentbob, cscope, cbrowser, cscout, opengrok, codeviz, ncc, the list goes on, but if you just want callgraphs then doxygen should suffice.

Document A Project by rrobbins · 2008-01-18 06:41 · Score: 1

I don't know about C/C++ because I am a web developer but I have an arsenal of tools to document my projects. First I create a data dictionary to document the database design. Then I add XML comments to my code in Visual Studio 2005 to get Intellisense and to support automated code documentation. Then I create a few class diagrams in Visual Studio 2005. I recently found a web site that can generate a chart from a CSS file. I compile all my documentation into a help file or a help collection that integrates into the MSDN Library. If the project uses any web services I create test scripts and add a web page to my help file to consume the web service (requires a XML to JSON proxy to avoid cross domain request restrictions). If there is time I also create videos about the project using After Effects. Seriously, I do most of this because documentation is for your own benefit.

Hopeless case story by Anonymous Coward · 2008-01-18 06:42 · Score: 0

It'll be interesting if someone out there recognizes this story: I took a job that looked great on the outside, lots of coolness potential, great products, some cool smaller projects, SOME good people. The big/main project I inherited was written in C/Asm for a very specific i86 platform and chipset- which, of course, was no longer in production. The project would not compile as it was handed to me. Typical crap- barely commented, comments stated the obvious: "i++; /* increment counter i */", cryptic variables, no variable dictionary, spaghetti, linguini, Capellini, /* FIX THIS! */ everywhere, you get it. The author had moved to another dept. (red flag!) and was barely available- answered 2 questions- barely with grunts, literal handwaving, and no usable information. To his credit, he called it the "Skunk- because it stinks". And it did.

easy by Anonymous Coward · 2008-01-18 06:42 · Score: 0

It's Friday, just invite a coworker for a beer and let him outline the architecture on a napkin...

Read the docs. by KillerCow · 2008-01-18 06:43 · Score: 1

Read the requirements document.
Read the architecture document.
Read the design document.
[They do have all of these, don't they? If not, you should write them.]
Start making some bug-fixes to find your way around.

Re:The best tool by djupedal · 2008-01-18 06:44 · Score: 1

"I'm sorry but you that took it personally need to grow up."

If you find public rants so stressful, perhaps you should forgo them altogether :)

three letters by fdisk3hs · 2008-01-18 06:46 · Score: 0

OCB - Other Careers Beckon.

If you want to be a programmer, but don't want to read and understand how programs work? Fuck off and get a job in copier repair or something.

BTW - Thanks Slashdot! You Michigan fucking Ass-hats have moved the "Reply" link off to the side in some floating javascript piece of shit instead of where I have found it for the last 9 years. Eat shit and die.

UML for Understanding code by clintt · 2008-01-18 06:46 · Score: 1

In short, reading lines of code is the most inefficient method. When dealing with projects with tens of thousands of lines of code, or more (hundred thousand plus), reading and going through the debugger is simply the worst possible way.

Studies have been performed as to what the limits are for the human brain to process information: and it's called the Magical number 7.
http://www.musanim.com/miller1956/

Hence, a method must be devised in order to model the large system (i.e. take out key components) and help chunk it to meet this magical number 7 rule.

UML class diagrams are essentially what you need to understand the code. UML is your tool; the unfortunate part is that these diagrams will most likely need to be drawn from hand.

UML class diagrams are a visual tool for building a model of the system; with these, you will be able to gain an understanding of, say, a class hierarchy for a given C++ project in an instant. Such a method is 10x faster than having to read code.

A well designed large software project will start drawing these diagrams from the very beginning, and spend effort updating them.

Profiler by Myshkin · 2008-01-18 06:50 · Score: 1

I've found that running a program in a profiler and looking at the function call tree in a nice tool, like the profiler in netbeans 6, really helps to visualize where a program is spending its time, and what shape the call tree takes. I've really only started doing this with the netbeans profiler, so I'm not sure how much of this is just universal to all profilers, but when I've done it with netbeans, you can specify packages to exclude from profiling, so as not to create too many data points. So, I'll start by only profiling the program logic itself and excluding data collection on libraries.
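
The same "exclude the libraries, keep the program logic" idea can be sketched in a few lines of Python with `sys.setprofile`. This is not how the NetBeans profiler works; it only counts caller/callee edges between Python functions and skips any callee whose module name starts with an excluded prefix. `parse_settings` and `load_all` are invented examples.

```python
import json
import sys
from collections import Counter

def record_call_edges(func, exclude_prefixes=()):
    """Run func() and count (caller, callee) edges between Python functions."""
    edges = Counter()
    prefixes = tuple(exclude_prefixes)

    def handler(frame, event, arg):
        # Only Python-level calls are recorded; C calls and returns are ignored.
        if event != "call" or frame.f_back is None:
            return
        module = frame.f_globals.get("__name__") or ""
        if module.startswith(prefixes):
            return  # callee lives in an excluded package
        edges[(frame.f_back.f_code.co_name, frame.f_code.co_name)] += 1

    previous = sys.getprofile()
    sys.setprofile(handler)
    try:
        func()
    finally:
        sys.setprofile(previous)
    return edges

def parse_settings(text):
    return json.loads(text)

def load_all():
    results = []
    for _ in range(3):
        results.append(parse_settings('{"debug": true}'))
    return results
```

`record_call_edges(load_all, exclude_prefixes=("json",))` keeps the edge from `load_all` to `parse_settings` but drops everything that happens inside the `json` package.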

Fulltext search for dynamic technologies by bandannarama · 2008-01-18 06:52 · Score: 1

I've found that setting up a fulltext search system on the code can be invaluable. Much of the code I deal with is dynamic, so it's hard or impossible to statically determine a call graph. (Think about objects that register for events at runtime.) In these cases it's nice to be able to search for the "glue" (e.g. interface or event names) that binds functions together.
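
A bare-bones version of the idea, as a Python sketch: index every identifier-like word by file and line, then look up the "glue" name. This is nowhere near a real full-text engine like dtSearch, and the file names and event name below are invented.

```python
import re
from collections import defaultdict

IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")

def build_index(sources):
    """sources maps file name -> text; returns word -> sorted [(file, line_no)]."""
    index = defaultdict(set)
    for name, text in sources.items():
        for line_no, line in enumerate(text.splitlines(), start=1):
            for word in IDENTIFIER.findall(line):
                index[word].add((name, line_no))
    return {word: sorted(hits) for word, hits in index.items()}

# Invented sources: one side publishes an event, the other subscribes at runtime.
SOURCES = {
    "button.js": 'bus.publish("OrderPlaced", order);',
    "billing.js": 'bus.subscribe("OrderPlaced", chargeCard);\nfunction chargeCard(o) {}',
}
```

Looking up `"OrderPlaced"` in the index finds both the publisher and the subscriber, which a static call graph would not connect.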

I've used dtSearch in the past with success, even though the version I used didn't have syntax highlighting since it was aimed at human languages. I don't know if more recent versions have that.

- B

--
Bandannarama

What do I use? by JerryLove · 2008-01-18 06:53 · Score: 1

What sorts of tools do I use for effectively analyzing and understanding a large code base? Well encapsulated code and liberal use of remark statements (plus documentation)... sadly, it looks like the people who wrote the code you are looking at did not.

Leo - Literate Editor and Outliner by Anonymous Coward · 2008-01-18 06:54 · Score: 0

Links:
Leo's Home Page .

A Tutorial Introduction to Leo.

Leo starts with the representation of source code that Knuth uses in The Art of Computer Programming and makes it work in an outliner that can round-trip source code.

The original post concerns analyzing existing source code. A reasonable approach within Leo would be to load it in, and start iterating over the resulting outline, refining it by breaking out chunks of the code into new nodes, each of which did a specific job. What makes this noteworthy is that the nodes one creates need not be an entire function, entire procedure, or even an entire statement within the language in which the code is written. The Leo user is free to create nodes that encompass chunks of the code as the human being sees them. The name of the node need not be legal within the language of the source code; it can be in natural human language (in my case, U.S. English). In the code from which the code was taken, the reference to that code is replaced by the name of the node.

Analyzing code this way produces an outline; any node in the outline contains source code that contains comments in natural language that express part of the source code without the syntactic requirements of the language in which the source code is written, and without the pedantic verboseness required to explain the algorithm to the computer.

But wait, there's more. A node can contain documentation that is not part of the program itself. The documentation can be anywhere in the outline (and in the source code).

An outline can contain nodes that are not part of a file of source code. These can contain design notes, to-do's, anything.

Mathematically speaking, Leo outlines are Directed Acyclic Graphs -- graphs without loops. Yet, Leo supports "clones" -- nodes that are references to other nodes. When displaying the contents of a clone, the body of the original node is displayed.

When I use Leo to organize the code I am writing, I clone nodes as I go, and move the clones to design notes and to-do's. The design notes incorporate the source code. The to-do's also incorporate the source code. When I work on a to-do, I modify the clones of the source code nodes within the to-do list -- and Leo propagates the changes to the original node and to all the other clones of it. When I am finished, I save... and Leo updates the source code that I feed to the C++ compiler, Python, or whatever processor.
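
To illustrate the clone behavior only, here is a toy model in Python. It is not Leo's actual API or data structure; the `Node` class and its fields are invented. The point is simply that a clone shares its content with the original, so an edit made through either one is visible through the other.

```python
class Node:
    """Toy outline node; clones share one content record and one child list."""

    def __init__(self, headline, body=""):
        self._shared = {"headline": headline, "body": body}
        self.children = []

    def clone(self):
        twin = Node.__new__(Node)
        twin._shared = self._shared    # same record, not a copy
        twin.children = self.children  # same subtree as well
        return twin

    @property
    def headline(self):
        return self._shared["headline"]

    @property
    def body(self):
        return self._shared["body"]

    @body.setter
    def body(self, text):
        self._shared["body"] = text

# A source node, and a to-do list that holds a clone of it.
source = Node("parse the header", "int parse_header(buf) { ... }")
todo = Node("to-do: handle short buffers")
todo.children.append(source.clone())
```

Editing `todo.children[0].body` changes `source.body` too, which mirrors the workflow described above of modifying clones inside the to-do list.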

But wait, there's more. Leo is scripted in Python. Leo scripts can iterate over an outline, creating or moving nodes, modifying the contents of a node. Leo scripts can add buttons to the Leo GUI.

But wait, there's still more. Leo is open source, under a Python license. It is cross-platform -- I can run it under Windows or OS X or Linux. I can add source code highlighting for a language that it doesn't support.

Re:Doxygen, and Extracting Software Architectures by Anonymous Coward · 2008-01-18 07:01 · Score: 0

In that slide show, they analyzed the Linux kernel and showed it to be, in some architectural decomposition, a fully connected graph. That result means that their decomposition failed, ie. they chose a decomposition that doesn't isolate any cohesive subsystems, ie. those in which internal units are not exposed outside the subsystem boundaries.

I don't see how this helps the state of Linux documentation, let alone the requirements in TFA.

I had a pile of C++ dropped in my lap 2 years ago. by Richard+Steiner · 2008-01-18 07:02 · Score: 2, Informative

My main tool for figuring it all out was to use exuberant ctags to create a tags file, and Nedit to navigate through the source under Solaris, with a little grep thrown in. I also used gdb with the DDD front-end to do a little real-time snooping.

I've since added both cscope and freescope, as well as the old Red Hat Source Navigator for good measure.

--
Mainframe/UNIX Bit Twiddler and long time Windows/Linux Hobbyist.
The Theorem Theorem: If If, Then Then.

VHDL? by Alexpkeaton1010 · 2008-01-18 07:02 · Score: 1

Any good tools for either VHDL or Verilog? Preferably free or cheap.

Tests by gerddie · 2008-01-18 07:04 · Score: 3, Interesting

Tests are indeed very good for understanding a code base. Nearly all the last year I was working on a code base that nobody understood completely, although I had someone to ask about the general code structure. Writing tests helped me to understand what some parts of the code actually do. And where I needed to change things I could make sure that I didn't break anything.

Another great tool is valgrind+KCachegrind - it gives you really nice call trees. Vtune can do something similar as well, but IMHO the output is not as good as in KCachegrind. The only problem, of course, is that valgrind makes your program very slow and it is, AFAIK, not available on MS Windows. Vtune, OTOH, runs the program at normal speed, but its calltree output is ugly, at least on Linux.

If these two options are not for you then you might add a trace output to each function. IMO this is better than using a debugger - especially in C++ with BOOST and STL, where a lot of stepping goes through inline functions. With proper logging levels you can get a very useful output to see what's going on. It helps to understand the code, and it also helps if you hit a bug.
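
A short sketch of per-function trace output with logging levels, written in Python for brevity rather than C++. The `traced` decorator, the logger name, and `scale` are all made up for the example; the trace lines only appear when the logger is set to DEBUG.

```python
import functools
import logging

trace_log = logging.getLogger("trace")  # hypothetical logger name

def traced(func):
    """Log entry and exit of func at DEBUG level."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        trace_log.debug("enter %s args=%r kwargs=%r", func.__name__, args, kwargs)
        result = func(*args, **kwargs)
        trace_log.debug("leave %s -> %r", func.__name__, result)
        return result
    return wrapper

@traced
def scale(value, factor=2):
    return value * factor
```

Turn the trace on with `logging.basicConfig(level=logging.DEBUG)` while exploring, and raise the level again to silence it without touching the code.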

Re:Understand C++ scitools.com by imgumbydamnit · 2008-01-18 07:09 · Score: 1

Oh sad, sad noob. Don't you know: "I'm Yertle the Turtle! Oh, marvelous me! For I am the ruler of all that I see!" Next time, RTFComment "...marking out the refactoring...". Understand for C++ is not a code editor, it's a reverse engineering, documentation and metrics tool, what the poster was looking for. With this tool any competent developer could identify the bottlenecks in 600,000 lines of legacy code. Oh, and I could change all tabs into 8 spaces and camel-case class names with Perl in that same half-hour.

--
To err is human. To arr is pirate.

SourceNavigator is an excellent tool by kndyer · 2008-01-18 07:12 · Score: 1

I have used SourceNavigator several times to become familiar with a new code-base. The interface is a bit awkward, but a few minutes with the manual will leave you with a very valuable tool.

My favourite feature ... the X-ref tool which gives you a navigable/expandable reference tree. Windows -> Add View -> Editor and you are good to go.

Kel.

Understand? by iso-cop · 2008-01-18 07:14 · Score: 1

Understand for C++ http://www.scitools.com/products/understand/cpp/product.php if you have some money to spend. SCI will give you a 15 day evaluation copy and the cost is $495 (cheaper each if you buy more). Nobody has been maintaining it for a while but for free you can have Source Navigator http://sourcenav.sourceforge.net/. It is basically a Tcl/Tk based editor that has decent cross-referencing capabilities. It also builds a class hierarchy and lets you search on files, variables, functions, etc.

Re:Understand? by iso-cop · 2008-01-18 07:21 · Score: 1

Oh yes, CodeSurfer http://www.grammatech.com/products/codesurfer/overview.html is another option. It costs $945. CodeSurfer has the capability to write fancy macros to do checks on your code...not sure how that compares to Understand's macro capabilities.

Did anyone mention the Linux Cross Reference by malk315 · 2008-01-18 07:16 · Score: 1

I found taking the time to snag the code and index it for the LXR allows you to click through functions quickly without needing any special C-scope type application etc. http://lxr.linux.no/ I like it since it's web based and you can plow through code from anywhere in your work area (any computers that have web access to the server w/ LXR on it). I used to create a cron driven script that would grab the source from source control once a night and index certain key versions of the code we were working on to make it readily available.

Learn What the System Does First by MikeWB · 2008-01-18 07:20 · Score: 1

The way I would handle this issue is by doing the following.
1) Learn what the system is supposed to do. Talk to the domain expert(s) and have them give you a walkthrough of the system. You have to understand what the software is supposed to do and how it works first.
2) Learn the entire UI. 90% of the functional requirements of a system will manifest themselves through the UI.
3) Define 2-3 Exemplary Use Cases. With your knowledge of 1 & 2, define some typical system use cases. Now you are armed with enough information to begin learning the code. You can make assertions about the system. This means you know what to look for, you just are not certain what form it will be in. e.g. A widget processor will have some sort of workflow code to do so.
4) Trace Through The Code. Now execute your use cases using the debugger to trace through the code. This will allow you to hit most all of the major subsystems in the application.
5) Comment the Code as You go Along. As you read and learn the code, comment it! The next poor bastard who comes after you will be eternally grateful.

My two tools by GrEp · 2008-01-18 07:26 · Score: 1

http://opensolaris.org/os/project/opengrok/ and http://www.ece.iastate.edu/~zola/glow/ . The latter requires addr2line which is available for linux, but not OSX :(

--

bash-2.04$
bash-2.04$yes "Don't you hate dialup connections?"| write USERNAME

Lots of Bad Advice Here... by RobDude · 2008-01-18 07:26 · Score: 1

I don't mean any offense to anyone, but there is some really bad advice. I'm not 'calling anyone out' but I wonder how many people who posted are undergraduates/script kiddies/or lifers at a corporate gig.

The truth is, understanding code is a unique skill from being able to write code. When I was in college, I didn't *care* about understanding other people's code. Other kids would ask me why their code didn't work, and I'd glaze over and say, "I dunno man, looks good". Professors would put code up, but mostly, I wanted the theory, and I'd go write my own code. It is a skill that really isn't taught in schools, at least not mine.

Most developers are REALLY BAD at this. I say this as someone who was an IT Consultant, who worked on-site with many developers, on many projects, and all that jazz. First off, as a developer, you are hired to MAKE SOMEONE ELSE'S LIFE EASIER. People who think you should walk in on your first week and demand to see the requirements doc, and if they don't have it write it yourself...well, 9 times out of 10 you are going to piss a lot of people off. These people pay you.

Every company I've worked for is trying to get their product out the door. Requirements are often done AFTER the code is done, or at least almost done. Most code is hardly commented. The documentation is never complete; oftentimes, half the developers don't know it exists. Unit tests, code reviews, these are things that SHOULD be done, EVERYONE agrees - but is it your place as the 'new guy' to walk in and demand that they be done? Is that why they hired you?

If you are hired as a developer, the answer is 'no'. If you are a project lead or manager of some sort, then the answer is maybe.

Using 'output statements' of any sort as a way to learn a large application is a joke. For a class assignment, sure. But, for any large application, it's essentially worthless. If you don't know how to use a debugger (and I mean no offense when I say this, I used output lines to debug for years before I was actually on the job) LEARN. They aren't difficult, but if you haven't used them, you simply don't know you can do that.

Looking at the code, raw code, and 'thinking' about it is, again, going to be too cumbersome unless you are some amazing code-genius. At least it is for me.

Asking about design patterns is certainly something good to ask; but MOST places don't really follow a design pattern. You'll get a buzzword answer, but the actual architecture is always a bastardization of that system. You *could* correct it and make it all perfect; but that would take a lot of time, a lot of testing, and in the end, you'd still have the same end-product. Your boss's boss doesn't want to waste that kind of time and money.

If they haven't given you a specific task or area to work on, then you want to get as good of an understanding of the application in general as you can. The specifics of that will vary depending on the platform/type of application. I work, mostly, with .Net Windows Applications.

First thing I do is get the app up and running on my local machine. This is a pain in the butt, and normally takes a full day, believe it or not. There are typically access problems, or my account isn't set up, yadda, yadda.

Next, since Windows is all GUI and all, I get a feel for the main GUI elements that are in use. That big tool bar that is always at the top of the application....where does that live in code? What events does it have, yadda, yadda, yadda. At a very high level. We're talking an hour or two here.

Then I jump into the code. Where at? The start. I open the start-up application and I set a breakpoint at whatever the entry-point is. Sometimes, I'll use Visual Studio's Code Diagramming tool to get a visual hierarchy of the classes; but mostly I just print that out so my desk looks 'complicated'. Pay attention to global/application-wide variables - what do they do, why are they there. Also, look for inheritance. If a form/class inher

Re:Lots of Bad Advice Here... by aled · 2008-01-18 15:55 · Score: 1

Amen! Funny, most answers in this thread only name some tool (that will probably be useful for a small codebase) and say nothing about the real-life work environment.

--

"I think this line is mostly filler"

Re:Codesurfer vs "Source Insight" ? by ximor_iksivich · 2008-01-18 07:28 · Score: 1

I am not familiar with Source Insight, but CodeSurfer does have call-graph support. http://www.grammatech.com/products/codesurfer/screenshots.html . It also features a query analysis engine that allows you to run interesting queries on the data/control flow of the program. Plus the program slicing engine is useful for understanding how the program operates. The best thing would be to try the trial editions of both products and see which works best :)

QOTD by Anonymous Coward · 2008-01-18 07:35 · Score: 0

Either comment your code good, or people will comment about your code bad.

Re:The best tool by teknopurge · 2008-01-18 07:35 · Score: 1

Where did you get "stress" from? Doesn't bother me a bit. :)

--
Website Hosting

Re:How / why did you get the job... by Anonymous Coward · 2008-01-18 07:55 · Score: 1, Insightful

Don't knock the gotos: they have a legitimate use in environments where exceptions can't be used. I once saw a 1000-2000 line function which should have had an end block and a bunch of gotos to it (for "bail out now but clean up" type situations - you'll see these all over the Linux source). Whoever wrote the function obviously got taught not to use gotos in his programming class, so instead he put all the cleanup code into a ~100-line macro and called it wherever he wanted to bail out.

Think about that: you can't step most debuggers through that cleanup code, you can't set breakpoints on specific parts of it, it probably won't be properly syntax-highlighted, and you have N inline copies of the code in your executable.

Sourcepublisher by Uosdwis · 2008-01-18 07:57 · Score: 1

Source publisher is a great tool: http://www.scitools.com/ . It is a compiler that produces web pages, not machine/binary code. It won't produce macros but can create calling trees, review docs, metrics etc. You can 'execute' the code by following links etc., which is really helpful for debugging too. Plus if you don't know a type you can view it. Great little tool.

no easy way by maestroX · 2008-01-18 08:02 · Score: 1

In no particular order; fix the bugs, extend functionality, talk with your co-workers. In other words, spend time with the code.

I agree by joggle · 2008-01-18 08:10 · Score: 1

This is a great way to try to understand a difficult code base. I once checked out a rather large program from CVS and made a branch for myself. After running Doxygen I was able to get an initial understanding of the organization of the code (which happened to be rather awful). I then went through the headers adding my own Doxygen comments where I understood their function. After a week or two of this I had a pretty good understanding of most of the functions in the program (at least, well enough that I knew where to look if something went wrong or needed modification). It was only about 60,000 lines of code, but it was almost 100% technical math/scientific code with no equations commented anywhere (written by scientists with apparently little understanding of object-oriented approaches to programming).

documentation by Anonymous Coward · 2008-01-18 08:20 · Score: 0

i use this thing called the "documentation".

Here's what I do. by Haszak · 2008-01-18 08:21 · Score: 1

I always start with the database, if it has one. Once I understand what the application needs to leave behind, I have a much better understanding of what it could do.

Another trick is running a tool to list out every code file sorted by line count. Sounds strange, but I get to see where all the action is. :-)
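For anyone who wants to try the line-count trick without hunting for a tool, here is a minimal sketch in Python. The function name `files_by_line_count` and the extension list are made up for illustration; adjust them to whatever the codebase actually contains.

```python
import os

def files_by_line_count(root, extensions=(".py", ".java", ".c", ".cpp", ".h", ".cs")):
    """Return (line_count, path) pairs for source files under root, largest first."""
    results = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            # Skip anything that is not a source file we care about.
            if not name.endswith(extensions):
                continue
            path = os.path.join(dirpath, name)
            # Read as bytes so odd encodings cannot break the count.
            with open(path, "rb") as handle:
                count = sum(1 for _ in handle)
            results.append((count, path))
    results.sort(reverse=True)
    return results
```

Printing the first twenty entries of `files_by_line_count(".")` is usually enough to see where the action is.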

The last time I had to do this was with a Java framework that used Spring heavily. If you're not familiar, basically, the code is woven together using XML files. Imagine sorting through that and figuring out how it works! This and the developers abstracted everything to the Nth degree. Arg! I did two things: (1) Try throwing debug lines in everywhere. Have them store a stack trace into a file somewhere. That's the quick and dirty way of seeing who's calling who. Once you start to see what code lies on top of other code, you can stop with all the debugging and tool use and just read the code for itself. (2) I was able to use jdb (the command-line java debugger Sun releases) to report every method call from every thread. That's very similar to what I think etrace does, another app someone's already mentioned on this thread. Basically, it was a complete record of everything the application did. It happened to be useful in my case, but of course, it produces a document too huge to consume easily. There are ways of setting certain calls to "not-important" and reducing the size significantly.
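The "record every method call" idea is not specific to jdb. As a rough illustration only (this is Python's `sys.setprofile` hook, not the jdb setup described above), the sketch below logs each Python-level call with indentation for depth, and takes a tuple of module-name prefixes to treat as "not-important". The names `trace_calls`, `ignore`, `outer` and `inner` are hypothetical.

```python
import sys

def trace_calls(func, *args, ignore=(), **kwargs):
    """Run func and return (result, log of indented 'module.function' calls)."""
    log = []
    depth = 0

    def profiler(frame, event, arg):
        nonlocal depth
        if event == "call":
            module = frame.f_globals.get("__name__", "?")
            # Modules whose names start with an ignored prefix are left out of the log.
            if not module.startswith(tuple(ignore)):
                log.append("  " * depth + f"{module}.{frame.f_code.co_name}")
            depth += 1
        elif event == "return":
            depth -= 1

    sys.setprofile(profiler)
    try:
        result = func(*args, **kwargs)
    finally:
        # Always remove the hook, even if func raises.
        sys.setprofile(None)
    return result, log

def inner():
    return 1

def outer():
    return inner() + 1

result, log = trace_calls(outer)
print("\n".join(log))
```

On a real application the log gets huge, exactly as described; passing something like `ignore=("logging", "json")` is one way to cut it down.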

Good luck!

--
find me at haszak.org

This is Slashdot.... by Anonymous Coward · 2008-01-18 08:31 · Score: 0

... in order to filter out the meaningful responses to your questions, you might wanna check out a recent and somewhat helpful post, similarly entitled, "Code For Understanding Tools."

I kid....

better: don't grok the codebase by jhoger · 2008-01-18 08:32 · Score: 1

As a contract programmer often faced with maintenance, I find that grokking the codebase is a waste of the customer's time.

I install the application, look at the feature request or bug report, see where the new functionality fits in. Usually in the UI there are identifiable strings.

Use a combination of find and grep to locate the strings, and follow the logic back to locate candidate points for insertion of new functionality. This is where you start to need your brain.
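If find and grep aren't available (on a locked-down Windows box, say), the same search is a few lines of Python. This is only a sketch; `find_string` and its parameters are hypothetical names, and it does plain substring matching rather than regular expressions.

```python
import os

def find_string(root, needle, extensions=None):
    """Yield (path, line_number, line) for every line under root containing needle."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in sorted(filenames):
            # Optionally restrict the search to certain file extensions.
            if extensions and not name.endswith(tuple(extensions)):
                continue
            path = os.path.join(dirpath, name)
            try:
                with open(path, encoding="utf-8", errors="replace") as handle:
                    for number, line in enumerate(handle, start=1):
                        if needle in line:
                            yield path, number, line.rstrip("\n")
            except OSError:
                # Unreadable files are skipped rather than aborting the search.
                continue
```

Feed it the exact label you see in the UI, and the hits are the candidate starting points for following the logic back.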

Design your change or fix as if the rest of the codebase doesn't matter, because, well, it doesn't.

-- John.

Re:better: don't grok the codebase by Khalid · 2008-01-18 09:32 · Score: 1

Amen to that :) Quick and dirty: this is how I have always done it, and I think 99% of other people do it too! This is also why nearly every codebase turns into an incredible mess. It's the first law of programming, a kind of entropy law in fact.

"Design your change or fix as if the rest of the codebase doesn't matter, because, well, it doesn't."

I will just add that if it does "really" matter, you will "quickly" notice it. If it doesn't, someone, maybe the next contractor, will notice it, and the cycle goes on. But in the meantime you have solved the problem for some time; this is what really matters to the client, and this is how it works in the real world!

easier yet... by kellyb9 · 2008-01-18 08:38 · Score: 1

I think you should just go back in time and shoot the person who didn't adequately document their software. This may prove to be an easier task.

2 ideas: Class-based local Variables w/ Labels by rfc1394 · 2008-01-18 08:42 · Score: 1

Sometimes the problem is figuring out where the problem is, or exactly what code paths are being used. Here is a suggestion I got from a magazine which presented a way to provide for tracing of either procedures or code segments. Now, this was done for Visual Basic, but it can be done from any language that provides for dynamic variables that have an initializer and a destroyer.

In VB, you have a variable tied to a class, which is declared in the procedure (a sub or function in VB) you want to mark. You create (or instantiate) that variable at the start of the procedure, assigning its name to the procedure. And that's it. When you call the variable's init code it writes out the name of the procedure beginning and the time, and saves both.

When the procedure ends, the variable's lifespan is up, and the destroy method is automatically invoked to clean up the class for that variable. So it can now give the name of the procedure that is exiting, and how long it ran for. Or how much CPU time. Or anything else you want to report. The practice works automatically as long as a local variable of a class which is declared in a procedure is automatically destroyed when the procedure that instantiated it ends.

If you need to get indicators for less than a full procedure, you do an explicit call on the variable's destroy method at the point you want to indicate a piece of code you're monitoring ends.

So if you end up getting a listing showing procedure 1 start, procedure 2 start, procedure 3 start (time), procedure 3 end (time), procedure 4 start (time), and nothing further, you know that procedure 4 is where it's hanging up. You also know how long it ran up to the point it hangs. This is very similar to the READY TRACE and RESET TRACE that's been available in COBOL for decades.

This is really helpful when you have a long-running program or one that handles a lot of different events and such, and you don't necessarily know what is happening or the execution path. The program can tell you what it's doing, or you can even write the information to a log file or the registry (for Windows-based programs). It's really great for applications because you can selectively enable or disable the practice at run time, and as such, you can track down an error causing a program to hang or be not responsive down to the precise line of code where the problem is, if you need to. Or you can simply use it to monitor which paths are being executed.
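The same idea can be sketched in Python. Note the swap: instead of relying on a local object being destroyed when the procedure ends, as in the VB version, this uses a `with` block, which is the idiomatic way to get guaranteed exit code in Python. All names here (`traced`, `trace_log`, `TRACE_ENABLED`, `procedure_1`, `procedure_2`) are hypothetical.

```python
import time
from contextlib import contextmanager

trace_log = []        # in-memory sink; a real program would write to a log file
TRACE_ENABLED = True  # flip at run time to switch tracing on or off

@contextmanager
def traced(name):
    """Record entry, exit, and elapsed time for the enclosed block."""
    if not TRACE_ENABLED:
        yield
        return
    start = time.perf_counter()
    trace_log.append(f"{name} start")
    try:
        yield
    finally:
        # Runs on normal exit and on exceptions, like the destroy method.
        elapsed = time.perf_counter() - start
        trace_log.append(f"{name} end ({elapsed:.6f}s)")

def procedure_2():
    with traced("procedure_2"):
        return sum(range(1000))

def procedure_1():
    with traced("procedure_1"):
        return procedure_2()

procedure_1()
print("\n".join(trace_log))
```

If a procedure hangs, its "start" line is the last thing in the log with no matching "end", which is the listing described above. Wrapping a smaller `with traced(...)` block around part of a procedure covers the "less than a full procedure" case.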
---
Paul Robinson - My Blog

--
The lessons of history teach us - if they teach us anything - that nobody learns the lessons that history teaches us.

True. Also, however, consider the NIH problem. by Futurepower(R) · 2008-01-18 08:50 · Score: 1

You made an excellent point that definitely applies sometimes, I think.

Also, there is another issue, usually even stronger. Programmers like to write code. They like the feeling of invention.

They DON'T like working with someone else's code. It's frustrating to let someone else be the leader, particularly when the other person's code is not easily readable. So, programmers often have a very serious case of NIHM, Not Invented Here and by Me.

Another problem is that programmers often live in a fantasy world concerning how long it will take to re-write working code. They often underestimate the amount of work by a factor of 10. They like to paint the broad strokes; they don't like to do the tedious work of making a program perfect; they don't like to write clear documentation. In business applications, those two are usually more than 80% of the work.

Re:The best tool by Alpha830RulZ · 2008-01-18 08:51 · Score: 1

The single best developer I have ever worked with had a degree in medieval poetry.

--
I was taught to respect my elders. The trouble is, it's getting harder and harder to find some.

FYI re: Lattix by tpz · 2008-01-18 09:00 · Score: 1

I have run into a particular "scaling" problem with Lattix that I would like to make sure people know about:

Lattix doesn't "scale" past asking for pricing.

The response I got for a request regarding pricing was less than useless and was imbued with a tone that was on the edge of insulting. It had a very "one man shop" feel, and that one man was obviously not at all interested in selling his product to someone who was very, very interested in buying it. I can only imagine how interested he would be in supporting the product given how little interest he has in selling it.

Suffice it to say that Lattix was immediately dropped from our evaluation list.

You can't say that by Shandalar · 2008-01-18 09:17 · Score: 1

(a) is irresponsible to even consider without knowing in detail about the project. If it's a 10 year old application that has had 10,000 bugs fixed over a grueling amount of time, like a web browser ... well, ask Netscape how it worked out for them when they decided to reboot the development of Navigator.

Re:How / why did you get the job... by Retric · 2008-01-18 09:21 · Score: 1

Pascal files tend to be huge, but Functions <> Files. I once had a million-plus-line Pascal project that was so well designed you could follow the logic about as fast as you could read the code.

This Is Silly by severoon · 2008-01-18 09:23 · Score: 1

You need your management to support you. The notion that they can drop a big codebase in someone's lap with the implicit expectation that it will be grok'd in a short period of time is poor management—knowledge and change management are not technical issues, they're business issues, and it sounds like management dropped the ball at your place.

There are two options here: (1) go above and beyond, and demand recognition for doing so, or (2) make it a management problem to get you trained and get the proper knowledge transferred. It is not your job to start cold, and management's expectation to the contrary is just bad management. If they didn't require the previous guy to write good documentation, then that's their chosen process and they have to deal with the consequences of such a management decision, which is that knowledge is now lost and can never reasonably be regained.

The most helpful thing you can do is make sure they fix their broken approach. If you do undertake the Herculean task of papering over prior poor management decisions, you just make it harder for the rest of us.

--
but have you considered the following argument: shut up.

Re:How / why did you get the job... by rasjani · 2008-01-18 09:23 · Score: 1

Had to click and visit your homepage just to check if we have worked in the same company =)

--
yush

Not open source but very useful by malsmith · 2008-01-18 09:28 · Score: 1

Source Insight is an IDE/editor that consumes and parses a couple dozen languages and full-text indexes the results to allow very fast searching, definition navigation and class diagrams on the fly. You don't have to be able to compile the code to index it. It's not free/open source, but there's a trial version and I've found it's worth the money. http://www.sourceinsight.com/

Lazy by Quickfingers · 2008-01-18 09:29 · Score: 1

Stop being lazy and read the code yourself. There are no QUICK SOLUTIONS to being a good programmer!

Source Insight, Zynamics binnavi by jdp · 2008-01-18 10:06 · Score: 1

I haven't used it for a few years, but back when I had to learn about different large (100,000 to multi-million line) code bases, SourceInsight was invaluable. Even for huge projects, its parsing is extremely fast, and I thought its UI was quite decent. And while it operates on the binary rather than source level, Zynamics' binnavi is a great reverse engineering tool.

Tools and Techniques by choch · 2008-01-18 10:06 · Score: 1

I do C++ consulting work, so I have to climb the code-base comprehension mountain fairly often. These are the tools and techniques I find most helpful:

ctags + <a text editor that supports ctags>: The last thing you need when trying to answer the question "what does this function call do" is to be distracted by trying to locate the function.
A debugger: Being able to set breakpoints and examine the state of the program is a must. Make sure you are fluent in your debugger's capabilities and usage. Setting breakpoints in code sections that you've discovered and then examining the stack to see how you got there is very useful in understanding how things are architected.
Doxygen: Even on code bases that don't use doxygen-style comments, Doxygen can still generate useful information, especially call graphs and class hierarchies.
When getting started, ask yourself questions based on the observable behavior of the program. For example, "How does the piece of data I entered here make its way to this output file." Use your tools to find the answer. Repeat until you start to get a feel for how things work.

Absolute tosh! by KZigurs · 2008-01-18 10:16 · Score: 1

Some good points (verifying reproducibility, version control, profiler, establishing baselines) and a pure bollocks one.

If you are an idiot - no tool will help you. No good dev will ever trust anything but his favorite IDE set up on the current codebase (or a preprocessed version of one if source pushes it a bit) operating on his 101 times tested code jumps, breaks and stack traces.

If you are in a rush - risking new tools or approaches is the worst you can do. You are in a paid job and expected to deliver predictable results. "I'll let you know in three months if I can do anything about deciding whether I can do about anything about it" is NOT an answer (although then you resemble one of my current colleagues. Gosh he has to be soooo fired...). If you have no tools or approaches that are working for you and you could apply now - well, you are screwed. Or the company. Or the code.

Use brain. Use what you know works. Do not, ever, risk an experiment where the outcome is not 99% certain and your failure may have an ... impact...

Asking on Slashdot does not inspire confidence either.

New codebase:
1) make sure you can run it. dependencies will enlighten you. Existing build scripts or assumptions will drive you mad ("it works on MY machine...").
2) make sure you can modify it. Comments are not to be trusted. Define pre and after conditions and track them.
3) make sure you can pinpoint a problem in it. break and stack, break and stack...
4) make sure you can at least predict the scope impact of your modifications. Encapsulation is rarely to be trusted. There will easily be code dealing with the framebuffer in the kernel registers, or calculations in DAOs.
5) make sure you can PREDICT it. Once you have that, you are the guru. Alas, this usually comes after a year or two on a larger codebase.

And - of course - each code, each dev, each team creates their own 'taste'/'feel' in a codebase. The sooner you understand the driving assumptions and working practices of the previous team, the better. Team-maintained codebases usually contain some degree of conflict. Look for casual comments containing 'fuck' or 'shit' or the ever-lying 'todo:' in them...

Source Insight by Effugas · 2008-01-18 10:22 · Score: 2, Informative

It's inexpensive, and scales astonishingly. I've spent the last two years in it, and it's just how I audit code nowadays.

Rt. Click - Find All References by boatboy · 2008-01-18 10:25 · Score: 1

And Rt. Click -> Go to Definition...

Oh, wait, forgot what site I was on.

Search will help by Matt+Graney · 2008-01-18 10:25 · Score: 1

Navigating through the source is key to understanding; a good approach to this is to use search, which is ideal for all sorts of ad hoc investigation.

One source code search option is Krugle Enterprise, which can crawl and index the entire code base directly out of the SCM tool. It also finds code-related information, such as check-in comments and references to bugs, and can even be pointed at the requirements and design documents that caused the code to be written in the first place (assuming those exist!). Because Krugle parses the code, it can tell the difference between a function call, a function definition, a comment, etc. It's then easy to see, for example, how and where a given interface is being used, even if it crosses language and functional boundaries. Krugle serves up search results from inside the firewall alongside results from over 2.5 billion lines of Open Source code, too.

Disclaimer: I'm an employee of Krugle. You can check out a demo at http://www.krugle.com/.

Get an overview... by Savage-Rabbit · 2008-01-18 10:33 · Score: 1

This post is dead on.

Place a breakpoint somewhere you think will get hit (e.g. main), and then start stepping over and into functions. I usually attack this problem as follows:

Place breakpoint. Use step-in functionality to drop down a ways into the program, looking at things as I go. What are they doing, how do they work, etc.

Once I feel like I understand how a section of code works, I step over that code on subsequent visits. If I feel like this isn't taking me fast enough, I let the program run for a bit, then randomly break the program and see where I am.

Lather, rinse, repeat.

Also, this should go without saying, but you should ask someone who works with you for a high-level overview of what the code is doing. The two of these combined should get you up to speed as quickly as possible.

I agree, setting out and trying to thoroughly 'understand' a large code tree you have just been handed and trying to become thoroughly familiar with every aspect of it in a short amount of time is pretty pointless. The trick is to gain a good high-level overview of what the tree looks like, a rough idea of what individual 'branches' of it do, and roughly how they do it, and then only familiarize yourself thoroughly with the 'branch' of the code tree you are working with at the moment. Depending on what kind of code we are talking about, my personal choice of tools will vary. For Java code I tend to use either IntelliJ or Eclipse, making heavy use of the built-in Javadoc display functionality along with the entire arsenal of the IntelliJ/Eclipse editor's navigation features, plus the search feature and of course the debugger, and let's not forget a copy of 'Java in a Nutshell'. For C/C++ code I tend to use vi, Cscope and GDB along with a big fat Unix Programming book I keep on my desk and a copy of both the Kernighan/Ritchie and Stroustrup C and C++ bibles.

I'm not saying that this method suits everybody but it works for me. Basically, no matter what language I am dealing with the method is always the same. When debugging for example, I try to get an overview of the code tree, find an entry point, and then limit myself to thoroughly understanding only as much of the code as I absolutely have to as I trace my way through it until I have found the bug or located the critical section of code. If I am expected to join a development team and continue development of a particular code tree the method is a bit different. I usually familiarize myself thoroughly only with the particular 'branch' or 'twig' on the tree I am working with and disregard the rest of the tree as much as possible, expanding the area I am thoroughly familiar with as needed. Jumping into a code tree you have never seen before is IMHO harder than debugging, and it is pretty much impossible to do if you have a PHB on your back who expects you to reach full productivity in a totally unreasonable time frame, which is all too often the case.

--
Only to idiots, are orders laws.
-- Henning von Tresckow

Shameless plug for CodeSurfer by mmcdouga · 2008-01-18 10:50 · Score: 2, Interesting

My company makes a code understanding tool called CodeSurfer. It's not open source, and it's not free (though it is free for academic use).

You can browse your code, following dependences and definitions. You can also construct queries to isolate which statements can affect a particular variable, and do a bunch of other tricks based on static analysis. There's a programming interface too.

Other good ways to get your head around code (speaking as a software engineer, rather than a guy promoting his company):

I agree with whoever suggested breaking in a random spot and stepping through the code.
Talk to the other developers, if they are around. Don't suffer in silence for the sake of doing it on your own.
Pick a minor throwaway feature (eg every button should be blue) and modify the code to add that feature. This forces you to really learn the code, but without the pressure of making a real product-worthy feature.

Profile it by bullgod · 2008-01-18 10:57 · Score: 1

There are a lot of suggestions, but they all, so far, fall into
a) stepping through the code (either with pen or debugger) or
b) giving you something in the absence of comments (doxygen etc).
All very sensible.

I'd add into the mix, profiling the running code.
See where it spends most of its time, what you can ignore for later, and what you need to understand first.
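As a rough illustration of what that looks like in practice, here is a sketch using Python's standard `cProfile` and `pstats` modules. The helper `profile_report` and the toy functions `slow_part`, `fast_part` and `workload` are hypothetical; on a real codebase you would profile an actual entry point instead.

```python
import cProfile
import io
import pstats

def profile_report(func, *args, top=10, **kwargs):
    """Run func under cProfile; return (result, text report sorted by cumulative time)."""
    profiler = cProfile.Profile()
    result = profiler.runcall(func, *args, **kwargs)
    stream = io.StringIO()
    stats = pstats.Stats(profiler, stream=stream)
    # Cumulative time puts the functions that dominate the run at the top.
    stats.sort_stats("cumulative").print_stats(top)
    return result, stream.getvalue()

def slow_part():
    return sum(i * i for i in range(200_000))

def fast_part():
    return 42

def workload():
    return slow_part() + fast_part()

_, report = profile_report(workload)
print(report)
```

The functions at the top of the report are the ones worth reading first; the ones that barely register can wait.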

Source Insight by c0d3h4x0r · 2008-01-18 10:57 · Score: 1

See http://www.sourceinsight.com/. It's not free, but it's great.

Basically it has a smart parser/indexer and it builds and maintains an internal database.

Once that is done, it lets you easily jump around the codebase (jump to any class/function/variable definition/prototype), find all references to something (not just simple textual searches but actual qualified conceptual references), etc. I would never work on any sizable C++/C# project without it.

The UI is a bit unorthodox but once you learn to customize it to your liking it's extremely efficient and drastically speeds both learning and coding.

--
Moderator hint: a comment is neither "Flamebait" nor "Troll" if it is true.

Academics vs Reality by geek2k5 · 2008-01-18 11:23 · Score: 1

If this were in the type of environment that academic types work in, or multinational corporations with thousands of IT staffers, there wouldn't be a problem. Everything would be properly documented at all levels and the documentation would be completely accurate and up-to-date. There would even be a detailed history of changes and proposed enhancements so you could get a feel for where software development had come from and where it was heading.

I would love to see this in action. It is such a nice concept.

Unfortunately, ideal software engineering in smaller organizations gets corrupted by such realities as time limits, budgets, last minute emergencies and differences in implementation. This is rarely addressed in the programming courses I've seen over the years.

I've always thought that colleges that teach programming should have a year long 'maintenance' programming course that echoes reality in smaller organizations. They would start with a functioning software package that needs repairs and enhancements. There would be some catches though.

For example, some but not all documentation would be available. This would simulate documentation being lost, misplaced or removed by ex-employees.

Of the documentation that is available, a good chunk of it will be inaccurate or out-of-date. Some of it may not even be relevant to the software package because it represents proposals that never became projects.

Dropping down to the code level, there should be several programming styles, from the ideal, with comments, to the horrific, uber-geek 'one line does it all'. Then add complications like large sections of code that are never used and comments that are inaccurate.

Now add a dose of deadline oriented reality by having things like production breakdowns requiring quick fixes interrupt project development. Then interrupt project development with mini-projects that the 'CxO' wants now. (These one-time projects might appear multiple times, but with slightly different parameters.)

If the students haven't been scared off by this time, have the 'CxO' make changes to the project or even put it on hold while a different project is done.

These and other catches would give future software engineers an idea of what things are like in less than ideal situations.

Spend more time by polyex · 2008-01-18 11:53 · Score: 1

Spend more time reading and analyzing the code instead of looking for a tool to do the job for you. Experience is the key here, and expecting to find a tool to leapfrog senior developers' hard-won experience is a mistake. There is nothing wrong with coming across tools that may help, but in my experience many programmers waste inordinate amounts of time trying to find some tool to do the job that they were hired to do in the first place.

Re:I had a pile of C++ dropped in my lap 2 years a by zoranlazarevic · 2008-01-18 12:06 · Score: 1

I used TakeFive Software's SNiFF+ (TakeFive has been bought by WindRiver) for navigating C code. The software was phenomenal and very easy to use. Right-clicking any function/variable name gives you the option to see where it is defined and all the places it is used. So it was very easy to jump from file to file. SNiFF+ also creates diagrams showing calls and such. I remember the package being costly, but definitely worth it if you have a lot of strange code to read.

Re:How / why did you get the job... by vbraga · 2008-01-18 12:17 · Score: 1

Funny? Why is this modded funny?

I always thought Microsoft should use longs for line count. It's pretty common to exceed it, on really f*cking large projects.

--
English is not my first language. Corrections and suggestions are welcome.

setting breakpoints and stepping is good but... by scroby · 2008-01-18 12:55 · Score: 1

As a fun (and possibly time-wasting) diversion, automate the whole breakpoint-call-stack-watching process. Instead of just setting breakpoints, figure out how to obtain stack traces programmatically. Every time the program runs past a certain part of the code, append the stack trace to a text file. Pull out your favorite scripting language and convert this text file into a format Graphviz (http://graphviz.org/) can understand, and you've got yourself a neat little run-time call-graph generator. Of course this is predicated on the fact that you can generate call stacks; I've never tried it in C/C++...

Obviously, it helps if you're familiar with Graphviz. If you're into graphs and such, it can transform a boring day (week, month...) of trawling through crappy code into a fun experience in scripting and making big fancy-looking graphs to impress your co-workers...
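
For anyone who wants to try it, here is a minimal sketch of the conversion step in Python. The log format is an assumption made up for the example (one function name per line, outermost caller first, traces separated by blank lines), and the function names are hypothetical; real stack-trace output will need its own parsing.

```python
from collections import Counter


def split_traces(log_text):
    """Yield one list of frame names per blank-line-separated trace."""
    trace = []
    for line in log_text.splitlines():
        name = line.strip()
        if name:
            trace.append(name)
        elif trace:
            yield trace
            trace = []
    if trace:
        yield trace


def traces_to_dot(log_text):
    """Turn logged stack traces into Graphviz DOT text.

    Assumed log format (hypothetical): one function name per line,
    outermost caller first.
    """
    edges = Counter()
    for frames in split_traces(log_text):
        # Each adjacent pair of frames is one caller -> callee edge.
        for caller, callee in zip(frames, frames[1:]):
            edges[(caller, callee)] += 1

    lines = ["digraph calls {"]
    for (caller, callee), count in sorted(edges.items()):
        lines.append(f'    "{caller}" -> "{callee}" [label="{count}"];')
    lines.append("}")
    return "\n".join(lines)
```

Saved to a file (say calls.dot), the result can be rendered with something like `dot -Tpng calls.dot -o calls.png`. The edge labels count how often each caller/callee pair showed up in the log.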

Source Insight by Anonymous Coward · 2008-01-18 13:14 · Score: 0

I didn't see it posted elsewhere: Source Insight is a great tool, especially when you can't even compile the code. It just slurps up code and helps you find connections.

It's a source code editor (free trial is fine) but it indexes the code. Point it at a bunch of source files (C, C++, Java, C#, tons of languages) and it tags all the code.

You then have a full list of every function, global variable, class. You can jump to one easily.

Even better, just having the cursor in the name of a class or function will automatically display the definition of the function or class in a context window. You can choose to have a window show the call graph too. This saves a surprising amount of time compared to hitting a lookup button that changes your current document.

Now this is based solely on string matching and not what was strictly compiled, so two classes with the same name will require you to disambiguate. But when you can't compile the code this is actually an interesting side benefit, because you can see where code duplication has happened.

I find it the quickest way to dive into a new codebase and get a feel for it, especially when it doesn't compile yet on my machine (I'm sure the build machine, or whoever created the code, is set up properly). That is primarily due to the multiple-language support, the lack of any need for compilation, code modification, or symbols, and the omnipresent context window.

to comment or not to comment, that is the question by www.tech4um.com · 2008-01-18 13:43 · Score: 1

I've been told by one professor that comments are one of the most powerful tools, and by another that they are pretty much unnecessary. IMO it would be best to have a general explanation at the top of what the code does, and then very few comments after that. As they say, good code comments itself.

--
Technology Forum

Learn to swim by Anonymous Coward · 2008-01-18 14:11 · Score: 0

Seriously, after working on several million lines of cruft, usually taken on in 500,000 - 750,000 line increments I've learned the following:

1) caller - callee tables are invaluable

2) you'll never understand it

3) even if you wrote it, you won't remember enough to understand it

This makes you religious about comments in two ways:

1) writing useful comments

2) not trusting existing comments

Somewhere along the line I realized I just swam around the stuff looking for landmarks and worked it out as I went. It's hard work, not fun, but it *can* be very profitable.

Good luck

rhb

Re:Doxygen, and Extracting Software Architectures by prashanthellina · 2008-01-18 14:32 · Score: 1

I've written a call graph generator (graphical) for Python code. You can try it out here. I found it useful to clean up some code at work. (Generating call graphs for understanding and refactoring Python code)
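
This is not the poster's tool, but as a rough illustration of the idea, Python's standard `ast` module is enough to build a simple caller-callee table. `call_table` is a name made up for this sketch, and it only picks up direct calls by plain name from top-level functions; method calls and anything dynamic are ignored.

```python
import ast
from collections import defaultdict


def call_table(source):
    """Map each top-level function in source to the plain names it calls."""
    table = defaultdict(set)
    tree = ast.parse(source)
    for func in tree.body:
        if not isinstance(func, ast.FunctionDef):
            continue
        table[func.name]  # make sure functions with no calls still appear
        for node in ast.walk(func):
            # Only direct calls such as foo(); obj.method() is skipped.
            if isinstance(node, ast.Call) and isinstance(node.func, ast.Name):
                table[func.name].add(node.func.id)
    return {name: sorted(callees) for name, callees in table.items()}
```

Feeding the resulting table to a graph drawing tool is then mostly a matter of printing one edge per caller/callee pair.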

Code Analysis Tools by Anonymous Coward · 2008-01-18 14:35 · Score: 0

Relativity has a very robust tool for doing legacy code analysis:
http://www.regdeveloper.co.uk/2006/08/23/code_eam/
http://www.relativity.com/pages/home.asp
(also sold by IBM ftp://ftp.software.ibm.com/software/websphere/awdtools/atw/library/Analyzing_Programs.pdf)

Visustin is also worth looking at:
http://www.aivosto.com/visustin.html

http://www.fatesoft.com/s2f/

Call graphs yes, but from run-time profile data by jrfonseca · 2008-01-18 15:10 · Score: 1

Draw some static graphs of functions of interest using CodeViz http://freshmeat.net/projects/codeviz/

Call graphs are nice, but call graphs of large applications done via static code analysis are so huge and dense that they become useless. Call graphs taken from run-time profile data, with all those irrelevant nodes pruned out, are IMO much more useful, as they naturally direct you to the most interesting parts, where the action is.
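
A small sketch of the pruning idea, in Python for brevity. It counts calls with `sys.setprofile` as a stand-in for real profiler output (a real profile would weight edges by time rather than call count), and `record_call_edges` and `prune` are names invented for the example.

```python
import sys
from collections import Counter


def record_call_edges(func, *args, **kwargs):
    """Run func and count caller -> callee edges between Python functions."""
    edges = Counter()

    def profiler(frame, event, arg):
        # "call" fires when a Python function is entered; C calls are ignored.
        if event == "call" and frame.f_back is not None:
            caller = frame.f_back.f_code.co_name
            callee = frame.f_code.co_name
            edges[(caller, callee)] += 1

    sys.setprofile(profiler)
    try:
        func(*args, **kwargs)
    finally:
        sys.setprofile(None)
    return edges


def prune(edges, min_calls):
    """Keep only the edges that were exercised at least min_calls times."""
    return {edge: n for edge, n in edges.items() if n >= min_calls}
```

Raising `min_calls` until the graph fits on one screen is a crude but workable way to find where the action is.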

Re:How / why did you get the job... by wrook · 2008-01-18 15:28 · Score: 1

I once spent 6 months refactoring code at a large telecommunications company.

The problem that was presented to me was this: They wanted to add more functionality to the code. But alas, the *functions* had exceeded the max file size for the compiler (32K lines). All they wanted me to do was break apart the functions and put them in different files so that they could jam more functionality in them...

openGrok by Anonymous Coward · 2008-01-18 15:32 · Score: 0

A while back I started a new job where I'd be maintaining a lot of old code, so I spent a bit of time trying to answer that same question. The first thing I looked at was OpenGrok. Sun uses it for online browsing of the OpenSolaris code. You can see it in action here. The cross referencing is nice, but it takes a while to generate, can use a lot of disk space, and is only marginally more useful than grep with a good editor.

Source Navigator is a bit more useful.

Generating call graphs can be helpful. I know kprof will generate them, but it requires generating profiling information, and more or less requires Linux, so it might not be possible in your environment. I think Doxygen can do it, and I'm sure there are other tools that can.

Other than that, I agree with the people who said to set breakpoints and start playing. Obviously it helps if you have some kind of goal in mind, or a specific area of the code to look at.

Dynamic methods by code_rage · 2008-01-18 15:40 · Score: 1

A few posts have noted the use of the debugger to see the thread of control. Assuming you have a scalar application (no IPC and no multithreading), that should be useful, but a bit tedious. How about this: use your text editor to put some debug (trace) statements in every function so you can see the actual thread of control. If text output can be created on your platform (e.g. stdio or iostream), just create log files. Otherwise, reserve some memory and figure out how to log interesting things and how to extract the log records from the machine. Even if you have a very primitive embedded platform, you should be able to figure out how to create a log of where you went at runtime. If using the C preprocessor, the __LINE__ and __FILE__ tokens may be invaluable with appropriately designed CPP macros. Even if your platform has nothing like stdio, if you think a bit you may be able to create a way to uniquely identify each file with a unique integer and then use __LINE__ to create a unique description of where you are. Think a bit, experiment and you may arrive at a solution.
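
The same idea translates outside of C. As a hedged sketch in Python (where a function's code object plays the role of __FILE__ and __LINE__), a decorator can record where execution went into an in-memory log. `TRACE_LOG` and `traced` are hypothetical names for this example only.

```python
import functools
import os

TRACE_LOG = []  # in-memory log; could be written out to a file afterwards


def traced(func):
    """Record file, line and name each time func is entered."""
    code = func.__code__
    # Rough equivalent of __FILE__ and __LINE__ for the function definition.
    location = f"{os.path.basename(code.co_filename)}:{code.co_firstlineno}"

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        TRACE_LOG.append(f"{location} {func.__name__}")
        return func(*args, **kwargs)

    return wrapper
```

Decorating the functions of interest with `@traced` and dumping `TRACE_LOG` after a run gives the order in which they were actually entered.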

If the code is OO with many instances, the above methods may yield rather high entropy results. You could always log the 'this' pointer to try to discern object lifelines and behaviors. Use grep or perl to organize the output into some useful form.

Then, find your use cases (you do have use cases, right?). If there are no use cases, talk to some actual users (not programmers) to see a few actual sequences of operations. Assuming you are supposed to maintain this product, those use cases will likely form your regression test suite.

For state machines, well you just have to derive them from the code. If the original developers were disciplined, the states and signals can be discerned without too much difficulty. It's important to analyze them carefully to see which states and transitions are possible (as opposed to the ones that were actually used in a given test run). If the developers were undisciplined then your reverse-engineered state charts will look a mess but at least you have a starting point for analysis.
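
To make the "possible versus actually used" comparison concrete, here is a minimal Python sketch. It assumes you have already written the reverse-engineered machine down as a table mapping (state, signal) to the next state; the function names are invented for the example.

```python
from collections import deque


def reachable_transitions(transitions, start):
    """Return every (state, signal, next_state) reachable from start.

    transitions maps (state, signal) -> next_state.
    """
    seen_states = {start}
    queue = deque([start])
    found = set()
    while queue:
        state = queue.popleft()
        for (src, signal), dst in transitions.items():
            if src == state:
                found.add((src, signal, dst))
                if dst not in seen_states:
                    seen_states.add(dst)
                    queue.append(dst)
    return found


def unexercised(transitions, start, observed):
    """Transitions that are possible but never showed up in a test run."""
    return reachable_transitions(transitions, start) - set(observed)
```

Whatever `unexercised` returns is a list of paths the test run never touched, which is usually where the surprises live.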

None of the above is a panacea, but it may be helpful. Have fun.

Re:Doxygen, and Extracting Software Architectures by pestilence669 · 2008-01-18 16:11 · Score: 1

Doxygen is awesome. Its call-graph support is unparalleled in the documentation world. I've used it w/ 550+ classes and it allows me to trace every dependency. That's even without any Doxygen-specific tags. Better Objective C support would be nice, but it's satisfactory. C, C++, and PHP are all flawless. It's also extremely fast for most reasonably sized projects.

Ubuntuforums.org Thread by __aammpv6063 · 2008-01-18 16:30 · Score: 1

There is a thread on ubuntuforums.org with a similar discussion you may want to check out. Keith

Sad but true department... by not_a_bot · 2008-01-18 17:05 · Score: 1

The nightmare of maintaining someone else's code is often the fact that the original authors don't comment, don't explain their reasons, and are frequently hard to communicate with, or in one of my current projects, dead. Having a map of the code is great, but it doesn't really explain how or why it does what it does. In this case, unfortunately, you have to do the heavy lifting. Understand how it starts, how it's controlled, and what the points of input and output are. I'm guessing that you have at least the source code. You've got a lot of work, it's not easy, but it can be done. You're smart enough for them to have hired you. And while you're in there tinkering, do the next guy a favor and comment the gorram thing.

Re:Understand C++ scitools.com by ctzan · 2008-01-18 18:32 · Score: 1

Ah.

NO competent developer could identify the 'bottlenecks' in 600,000 lines of code in 15 days.

Got it now ?

I don't think you have ever participated in such a big project. Let's not even talk about managing it or single-handedly refactoring it. You have simply pulled that number out of your ASS.

Don't waste your time learning analysis software by gloryhallelujah · 2008-01-18 18:42 · Score: 1

If you're under the gun then don't waste precious time learning a new app; improve your built-in analysis tool; let the users teach you the app and the business domain - it's the world your software models; use a pencil and paper and make your own diagrams. It's hard, challenging work and you'll do fine.

--
The Turing test cuts both ways

Software Reverse Engineering Tools by Sephrial · 2008-01-18 19:17 · Score: 1

In a reverse engineering course, we learned to use a series of tools to produce a high-level organization of software using Module Dependency Graphs (MDGs) based on relationships of interest between classes. After applying a clustering algorithm (using a tool called Bunch) on large graphs for simplification, we would view the simplified MDGs using Dotty. Here are some relevant links for those interested: www.mcs.drexel.edu/~bmitchel/research/tooldemo.pdf http://flourish.org/cinclude2dot/ http://tarind.com/depgraph.html
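
As a rough idea of what the input to such a pipeline looks like, the sketch below builds a dependency graph from local `#include "..."` lines and prints it as DOT for Dotty or any other Graphviz viewer. It is a simplified illustration in Python, not how the linked tools work internally; the function names are made up, and the clustering step (the part Bunch does) is left out entirely.

```python
import re

# Local includes only: #include "foo.h", not #include <stdio.h>.
INCLUDE_RE = re.compile(r'^\s*#\s*include\s*"([^"]+)"', re.MULTILINE)


def include_graph(sources):
    """Map each file name to the sorted list of local headers it includes.

    sources is a dict of file name -> file contents.
    """
    return {name: sorted(set(INCLUDE_RE.findall(text)))
            for name, text in sources.items()}


def to_dot(graph):
    """Render the dependency graph as Graphviz DOT text."""
    lines = ["digraph includes {"]
    for name in sorted(graph):
        for header in graph[name]:
            lines.append(f'    "{name}" -> "{header}";')
    lines.append("}")
    return "\n".join(lines)
```

On a real tree you would fill `sources` by walking the source directory and reading each file.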

Take your time, post your resume by grikdog · 2008-01-18 19:29 · Score: 1

Like it says in the Tao Te Ching, how'd you get in this mess in the first place? If you have to ask that question, management has hired you on because you're desperate and work cheap and may be qualified to solve their problem — IOW, they're outsourcing. If they weren't so cheap, the team that wrote the stuff would still be on board. Finally, if you have to grok it, do it inside out — look at the user interface (if there is one), then find the data, then find what calls what. I agree with previous remarks about ignoring comments, especially if the code is very old and has seen more than a few layers of maintenance. Two other observations: The elegant stuff is the oldest. The outrageously rewritten stuff is recent and extremely dangerous — change it at your peril. Well, three. Beware of academic styles. No one designing software tools actually has to use them, so usually any team of coders will have used 20% of the tools to do 80% of the work, all differently, except for the first guy, who used 95% of the tools to do what now amounts to 20% of the current specs.

--
``Tension, apprehension & dissension have begun!'' - Duffy Wyg&, in Alfred Bester's _The Demolished Man_

Re:Codesurfer vs "Source Insight" ? by pg--az · 2008-01-18 20:33 · Score: 1

Thanks !

Re:Understand C++ vs "Source Insight" ? by pg--az · 2008-01-18 20:38 · Score: 1

Thanks for the detailed information, I was so clueless about this area, a little less so now !

Re:Understand C++ vs "Source Insight" ? by bar-agent · 2008-01-18 22:09 · Score: 1

Thanks for the detailed information, I was so clueless about this area, a little less so now !

So, you understand more and have some insight now?

--
i'd hit it so hard, if you pulled me out you'd be the king of britain [bash.org]

Re:RR & EA - nah, try Structure101 by Cauliflower+Kid · 2008-01-19 00:16 · Score: 1

Reverse engineering into UML is way too clunky. Structure101 is much more geared to understanding big ugly code-bases.

About "new tools" by golodh · 2008-01-19 04:09 · Score: 2, Interesting

@KZigurs,

Well ... some good points, and some I'd say are too detailed at this point.

I totally agree with point (1). I forgot to mention it since I assumed (always a bad thing) that the author actually could compile and run the thing. An important point to keep in mind. Thanks for bringing it up.

Points (2)-(5) however all come after you've understood the basic structure of your code base.

Next, I'd say that a fairly junior software engineer trying to tackle a large unknown code-base without proper tools is doomed to failure no matter what. So the step from "If you're in a rush" and "You are in a paid job and expected to deliver predictable results." to "forget about tools you're not familiar with and just dive in" is an exercise in self-delusion and a recipe for disaster. Nothing less. It's like someone rushing out of the house and sprinting for work because they don't know where they put the car keys or their bus ticket and feel they are too much in a rush to search for them.

Besides, producing automated documentation is a good way to communicate. The tool communicates the structure of the code-base to you, and you can use e.g. the call-graphs to (efficiently) communicate the complexity (or otherwise) of the code-base to your supervisor. It also communicates to him how you are approaching the problem, which is likely to be a plus.

Now suppose the codebase is really difficult. A competent software engineer is, like any other kind of engineer, co-responsible for making actual and potential trouble spots *visible* to management. Preferably before they explode. Although it's popular wisdom to despise Management, if you, the hands-on person, don't tell Management of the problems, you ensure that they're driving blind. You rob them of the chance to do anything about it before the problem becomes so acute that even they'll have to notice. They will recognise it if you do and keep it in mind when they have to assess you. Depend on it. Besides, you just happen to be the only one who can tell them, and you fail in your responsibility if you don't. Part of a software engineer's job is to *communicate*. Now you can't give your supervisor any honest estimate of how well you have the new code base under control before you get to know it. And tools really really help you save time and allow you to get a much better overview.

Communication works both ways. If, with all the tools you use, you are unable to understand the code-base, you lack one or perhaps two elements that distinguish a basic software engineer from a good or even a great one: talent and experience. And you should be honest with yourself and your supervisor about that too. If the job really is too hard for you, have the guts to own up before you mess up and thereby save yourself and your company a lot of trouble. And believe me ... there are lots and lots of good jobs in software development / maintenance that can be done without a surfeit of either. Such is the power of engineering.

Now Doxygen (or similar tools) may be unfamiliar to the author, but such tools really work. Besides, I've seen students download, understand, and use Doxygen in less than 1 hour after they were told about it.

Re:Understand C++ scitools.com by imgumbydamnit · 2008-01-19 04:30 · Score: 1

You don't know me, but you attack me. I don't owe you any explanation for a post that was merely pointing out how much Understand for C++ accelerated what would have been a far more painstaking task, but you seem intent on your own preconceived notions of what a software consultant does when called in to speed up large systems and remove unsupported products.

It isn't about making loops faster, it's not about looking at each line of code, it's about quickly finding the use of slow, obsolete libraries (in this case NetClasses from defunct CORBA pioneer Postmodern Computing), tracing the dependencies, then picking and integrating the replacement products that best fit the need. You only need the code graphing software for the first parts of the effort. It's entirely reasonable to find the bottlenecks and the dependencies in two weeks.

The point is, with the right tools, you don't have to look at 98% of the code.

--
To err is human. To arr is pirate.

Start with what it does, then how its controlled by DrChuck · 2008-01-19 11:08 · Score: 1

Between cscope and a tags-following editor and a notebook (or another text editor window open), figuring out code like this is kind of fun. Basically start with what the code is supposed to do, and follow the trail of how it does that. I like to start at the function or module that flips the bit or presents the screen and look at who called it (cscope, follow tag), then look at what the function needed to know before it could call it, and then how it got that information (following back up the tree). Eventually you will end up with a whole bunch of things that can cause the code to do what it's designed to do, and along the way those paths will have modulators that sometimes enable, sometimes disable what they do. That is your "control plane" if you will. Follow the control plane and you will know how to make the software do what you want and then you should have enough to start making it do something different than what it does (assuming what it does is broken ;-). Good luck. --C
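
Following back up the tree is easy to mechanize once you have a "who calls whom" table, whether typed in by hand from cscope lookups or exported from some tool. The Python sketch below is only an illustration under that assumption; `caller_chains` and the table layout are invented for the example.

```python
def caller_chains(callers, target, _path=None):
    """Yield every call chain that ends at target, outermost caller first.

    callers maps a function name to the list of functions that call it.
    Recursive cycles are cut the first time a name would repeat.
    """
    path = [target] + (_path or [])
    parents = [p for p in callers.get(target, []) if p not in path]
    if not parents:
        # Nothing (new) calls this function, so the chain is complete.
        yield path
        return
    for parent in parents:
        yield from caller_chains(callers, parent, path)
```

Each chain it yields is one route by which control can reach the function you started from, which is a decent first draft of the control plane.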

DTrace by Lally+Singh · 2008-01-19 12:57 · Score: 1

Put Solaris 10 on a PC (or VMware or what not). Then, start tracing function (& method) call invocations. Use ustack() to save the stack at the point of invocation.
For C++: http://developers.sun.com/solaris/articles/dtrace_cc.html
Java: http://www.devx.com/Java/Article/33943
C: The dtrace docs :-)

Debuggers are useful here & there, but 90% of the figuring out in a program is complexity spread over way too much code to try and figure out single-stepping in a debugger.

DTrace has taken over for both the debugger & printf() for me. Also, as I can change my script and rerun it on the same running process, my round-trip time has reduced quite a bit.

--
Care about electronic freedom? Consider donating to the EFF!

Been there, will never return by owndao · 2008-01-19 18:03 · Score: 3, Interesting

I had a very wise undergrad EE prof who said on the first day of design class that we needn't worry about the many "complicated" things that we would have to design during the course because we had already completed all of our circuit analysis courses. He said it's much harder to figure out the details of someone's design than to design it yourself. Same applies here in software. I've been there working with others' undocumented code and quite frankly it was infrequently that I left the project with more respect for the programmer. Here I'll just say what I learned from the experiences, as useless as it might be.

If the coding style used is appropriate you stand some chance. Lines of code don't matter much when behavior is sufficiently complex that you cannot list the states and events that trigger execution and state change let alone keep track of them in your head long enough to understand their context.

I once had a similar problem with some legacy OS9 C code that performed a simple communication task and updated a monitor. With no documentation from the writers I was to "simply add some new data to be collected and display it." The problem with this 3000 LOC was that it was written as a state machine with no modularization - next to impossible to follow in a debugger. What I wanted to do was run a performance analyzer along with the code but I was told that was "out of budget". This would have told me at least the parts of code being executed frequently and I could start to associate the external events with the code processing.

On very large applications like AT&T's RNS (residential account management for BellSouth) that exceed a million lines of C code, the only thing that made the application workable for new features was the fact that it was created in a CMM III product environment, so it was well documented in design, development, testing, feature changes, bug fixes, etc. Even with all of this the number of processes and related data stores still showed a lot of bleed over and function duplication (there was no simple way to determine whether a function already existed that did what you needed, and it was even harder to determine whether it was state-data dependent and thus unusable in certain other states). Attempts by us (contract coders mainly) to get the company to allow us to build a function-finding tool/database to eliminate this problem fell on mostly deaf ears.

Because of this we had to depend on the longer-lived of the system architects to get an idea of where functionality existed. There were many times though when no one knew and weeks had to be spent reverse engineering communication structures, working out what the heck undocumented stretches of code did, re-writing the documentation correctly and then starting to implement the feature or correct the problem that had "been there for years." Management did not like the time taken to repair poor coding as this was not included as one of our trackable metrics and therefore not in our feature/bug's budget (since it was not considered to be either).

RNS sounds bad but it was a breeze compared to that tightly optimized state machine code without documentation. So, my recommendations are:

1) If it is stream-of-thought-code (kind of like Faulkner's The Sound and the Fury), not modularized, not documented, tell your manager that it most likely will have to be re-designed to understand it fully. That means do an essential model of its processing and data stores, use-cases, objects and events or whatever rigorous methodology you prefer. Then use that to re-write it. If management doesn't want to do that then you do not work for a company interested in maintainable code but one that wants a cheap fix. I would leave as soon as you get from them what they took from you in suckering you into the place.

2) If it is structured and/or developed in a "self-documenting-language" like Ada, Modula, Eiffel, etc. that forces structure (or at least makes it easier to write structured rather than unstructured), finish documenting it properly a

--
Be as you would have the world become.

Tell your boss the truth. by Lord+Kano · 2008-01-19 23:58 · Score: 1

Tell your boss that it's going to take some time for you to get an understanding of someone else's code.

I can't guarantee success in your case, but the last time I had to do that, my boss completely understood. If your boss is a programmer too (and didn't write the spaghetti code that you're tasked with understanding) then he or she will completely understand that it's not easy to pick up someone else's code and start at the point where they left off.

If programming was easy, everyone would be able to do it. I'm not a plumber. I can't do what they do. Plumbers charge a premium for their services because not everyone can do it. It's no different for most other specialized professions.

LK

--
"Hi. This is my friend, Jack Shit, and you don't know him." - Lord Kano

Use Source Navigator NG by alderX · 2008-01-20 10:51 · Score: 1

I completely agree with the tip about Source Navigator.

But beware that the original project (linked in parent) is stalled. So instead you should go with the well-maintained fork, called Source Navigator NG. It's hosted at Berlios: http://developer.berlios.de/projects/sourcenav

From someone in the trenches by Rophuine · 2008-01-20 12:07 · Score: 1

I've read some suggestions, but I don't think they're quite hitting the mark. I don't care enough to read right through to see if someone's already said the following, but conversely, it's unlikely that anyone will read far enough through to see this anyway.

Debuggers: Bah. If it's a large enough system to be a problem, randomly interrupting execution, or debugging a component of the whole, is too hit-and-miss, and definitely using a microscope to analyze a mountain.

UML, etc.: Lack of tools, lack of original design documentation, lack of actually valuable output.

Source code analyzers: Never had a good experience here.

I maintain a large chunk of C++ code someone else ( == various teams over a long period) wrote. Here's how I started, and how I continue:

First, look at your code layout. Separate folders for separate processes / object collections / whatever is a huge help, and may give you a good visual overview.

Then, look at your output. Is it a bunch of different binaries? Are you getting intermediate object files which are included in a number of different binaries? Here are more tips on structure.

Read through some 'main' files. Files which actually commence execution: code entry points. Look at what's being loaded, what's being instantiated. Look for outside 'main' or 'event' loops to get an idea for the broad execution paths.

By now you should have a good overall view for which bits of source do what major tasks, what includes what, how the thing 'holds together' in a very general sense.

You're done.

What? You haven't looked at all sorts of detail. You don't know what 2/3 of the classes do. You haven't even worked out the IPC yet. How are you done?

Because these are established details, which currently work. There's no need to know every inch of the source. You now know where to look for changes, where to start tracking down bugs, and you're going to dig into those details of the source as necessary. Some parts you'll know well, some files you'll never need to open.

The only caveat here is you're relying on changes you make not having obscure unintended side-effects in some other part of the code. BUT, even if you were the original coder, if the design allows this, you'd probably miss it. You need to pick up things like this in testing, not by trying to know every inch of the code.

If there's no testing framework right now, start building one. Right now. I know, you're under pressure to DO STUFF, you don't have time to design a whole testing system. So sit down, hack together a quick testing harness, justify it by saying you're just writing something to help test the changes you're making, and every time you write a test for something you're modifying, also take the time to write a test for one other aspect or behavior. This way, instead of two years from now having new bugs you've introduced and a handful of out-of-date tests which don't apply any more, you caught a few of your new bugs, found a few old ones, and still have a useful test suite.
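
A quick harness really can be this small. The sketch below is Python and purely illustrative (for the C++ code in question you would do the equivalent in whatever you can get running fastest); `run_checks` is a made-up name. Each check is a name, a callable, and the value you expect back, and a crash is reported as a finding rather than stopping the run.

```python
def run_checks(checks):
    """Run (name, callable, expected) triples and return the failures."""
    failures = []
    for name, call, expected in checks:
        try:
            actual = call()
        except Exception as exc:  # a crash is a finding, not a harness error
            failures.append((name, f"raised {exc!r}"))
            continue
        if actual != expected:
            failures.append((name, f"expected {expected!r}, got {actual!r}"))
    return failures
```

Every time you touch something, add one check for the change and one for a neighbouring behaviour, and the list grows into a regression suite without ever being a separate project.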

Oh well, hope someone reads this and benefits.

Re:I had a pile of C++ dropped in my lap 2 years a by Vector+Meson · 2008-01-20 16:05 · Score: 1

If you like cscope, I think you'll really like kscope (http://kscope.sourceforge.net/) - best for C code but it can be useful for C++ too. The LLVM folks seem to think that a new collection of tools are needed to help with a variety of stages of program analysis....

Look at the system call pattern by jdimpson · 2008-01-21 05:16 · Score: 1

For some applications, I find it useful to look at how the app makes use of system calls, via `strace' for most Unixen or `truss' on Solaris or `ktrace' on Mac OS X. (I don't know of a similar tool for MS Windows.) strace and its ilk show what system calls the app makes into the operating system, including decoded arguments and results. The basis for these tools is the `ptrace()' system call.

Analysis at the application/OS boundary isn't as generally useful a technique as having a good debugger or other source code analysis tools, but can be useful in certain circumstances. It may be a good starting point to understanding unfamiliar code since it shows you exactly how the application interacts with the outside world. It bypasses the indirection/obfuscation that comes from OO languages or interpreters/VMs, and can let you know what to look for when you start diving into the code. Among other things, it can help you figure out useful breakpoints to use in subsequent source-level debugging.

Also, if you're in a situation where you don't even know how to run the application (due to absence of documentation), looking at the app/OS boundary can show you what the app is trying to do (e.g. connect to a server, find a config file, etc).
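
For example, a few lines of Python can boil a saved strace log down to "which calls were made, and which files did it look for and not find". This is a sketch that assumes plain single-process strace output (no pid prefixes from -f, no unfinished/resumed lines); the regular expression and the `summarize` name are my own, not part of strace.

```python
import re
from collections import Counter

# Matches lines such as:
#   open("/etc/app.conf", O_RDONLY) = -1 ENOENT (No such file or directory)
CALL_RE = re.compile(r'^(\w+)\((.*)\)\s*=\s*(-?\w+)(?:\s+(E[A-Z0-9]+))?')


def summarize(strace_text):
    """Return (syscall counts, paths that open/openat failed to find)."""
    counts = Counter()
    missing = []
    for line in strace_text.splitlines():
        match = CALL_RE.match(line.strip())
        if not match:
            continue  # signals, exit lines and anything else unrecognized
        name, args, _result, errno = match.groups()
        counts[name] += 1
        if name in ("open", "openat") and errno == "ENOENT":
            path = re.search(r'"([^"]*)"', args)
            if path:
                missing.append(path.group(1))
    return counts, missing
```

The list of missing paths is often the fastest way to discover which config files an undocumented application expects to find.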

It can help you understand the overall behaviour of an application, esp. if that application is a daemon, service, or other semi-autonomous or agent-like piece of software. Using it on GUI-intensive apps can be painful because you end up seeing a lot of low-level drawing routines (for example, the ultimate result of using GTK/QT/Xlib/X11 is a bunch of calls to the X server via shared memory, which are unintelligible to mere mortals).

All that said, strace is probably a better troubleshooting tool than discovery tool, and it certainly requires that you understand Unix/Posix APIs (or at least know how to read man pages). Better said, strace is good for understanding how a strange or malfunctioning application works, and thus is a useful starting point when trying to understand strange or malfunctioning source code.

I've used a similar tool called `ltrace', which intercepts calls into libraries, but I've found a good source level debugger to subsume and exceed its value most of the time.

Re:Umm.. documentation? by gcooke · 2008-01-21 05:21 · Score: 1

I used to go to great lengths to document my code. Then one day I started keeping track of the time I spend documenting code, correcting old/mistaken documentation, and digging myself out of misunderstandings caused by poor documentation. I quickly realized I spend a substantial amount of time doing the latter (can't remember the percentage...don't have a brain for numbers) -- so I started just deleting ALL the documentation in the code I edited before I did anything else with it.

And I watched my productivity increase. Substantially.

Some documentation is a good thing...but the lion's share just gets in the way. Documentation belongs in user's manuals and programming guides, not in source code.

That said, aside from the use of code viewers like doxygen and trips through the debugger, I have found the most useful way to understand code is to get one's hands on a Subject Matter Expert: either the original programmer or designer, or (better yet) a power-user who can tell you what the program -should- be doing and what it -does- do. Such folks are far more valuable than any pile of documentation.

[Disclaimer: I am a "software researcher" -- I have spent the past 20 years rummaging through -other- people's code, trying to make enough sense of it to determine why it runs slowly, consumes too many resources, or is hard to maintain. So my view of the issue may be a little different than most.]

Re:How / why did you get the job... by gcooke · 2008-01-21 05:30 · Score: 1

"Knowing the language" is one of the last things on my list of items to check off when interviewing developers. I could care less that they can rip out Java or C# in 10 minutes...and if that's their primary skill, then I don't want them around.

I want people that can solve problems, not language hacks. If the project is so straightforward it only requires skill with a language, then I'll outsource it to India (or China or Russia...or South Dakota, etc.). Sounds like the original poster is someone I'd hire, because he's NOT familiar with the language yet he clearly has the wherewithal to solve the problem at hand.

Yes... it's called a professional, non-hacker tool by Money+for+Nothin' · 2008-01-21 17:50 · Score: 1

See also Visual Studio, or WSAD/Eclipse, or NetBeans.

Professionals do not waste time with half-assed, flimsy, easily-broken/high-maintenance hacks like the tools normally used on *nix systems... Professionals get their boss to pay hundreds of dollars for a competent toolset, or (if they are unlucky) buy it themselves.

I understand that professional mechanics often have to purchase $10k in tools; guys in construction also spend several thousand on their tools. You think they'd rather use a rock, rope, and a stick to pound nails -- or would they rather use a solid, well-made, for-reals hammer that isn't just cobbled-together by some pimply-faced car-nut teenager mechanic-wannabe, and which costs money and is mass-produced?

Stop screwing around with vim and grep (except when you have big text files to parse - they're still great for those purposes). Forget 1978; join us in 2008!

--

Is Capitalism Good for the Poor?

leo by Anonymous Coward · 2008-01-25 11:52 · Score: 0

You might want to take a look at leo (http://webpages.charter.net/edreamleo/front.html).
It is amazing for hacking unknown code.

Koders Pro Edition by beebe4 · 2008-01-26 14:32 · Score: 1

(Warning: shameless self-promotion, BUT I honestly do feel like we have built a product that can help.) I am a product manager at Koders (see http://www.koders.com/ for our Open Source search engine), and we have a product that has helped people in this situation before. If you install the Pro Edition from our code search suite (http://www.koders.com/gopro/), you will be able to easily search for any code in your index. This can also be helpful when dealing with multiple repositories. It has also been extremely useful to QA / Support teams that don't do full-time development but would like to look into the code and pass as much info as possible back to development when reporting bugs or escalating customer issues.

383 comments