Source Code Browsers?
patonw asks: "I just started working for a company as a programmer on a project with a huge existing codebase. The person hiring me half-jokingly said that it usually takes new employees two years before they understand the system. What I am looking for is not just an editor/browser but a program that displays functions and classes as connected graphs -- preferably free. I would like to view how programs are structured by function calls and class relations. I have access to several different kinds of platforms/operating systems."
The truth is that I have done a lot of research on the Internet to find suitable tools for this problem. There are none. Period.
This comment is printed on 100% recycled electrons.
If it's a C++ project, KDevelop will show you a list of functions and classes and what files they're in, and things of that nature. Try it out, it's free.
Sponsored by RedHat:
http://sourcenav.sourceforge.net/
From the FAQ:
Source-Navigator supports C, C++, Java, Tcl, [incr Tcl], FORTRAN and COBOL, and provides an SDK so that you can write your own parsers.
Use Source-Navigator to:
* Analyze how a change will effect external source modules.
* Find every place in your code where a given function is called.
* Find each file that includes a given header file.
* Use the grep tool to search for a given string in all your source files.
I haven't heard of any free source browsing programs... But I've been using Doxygen to generate HTML documentation of the source when I need to familiarize myself with new code. Definitions in the documentation become hypertext links, and class inheritance graphs are generated. What's really missing is some kind of call tree, but you can't have everything. Doxygen also extracts JavaDoc comments from C++ code and inserts them into the HTML documentation.
Doxygen can also generate LaTeX and RTF files instead of HTML.
Doxygen is a good choice for C++, C, Java, Objective-C, IDL... I used it to get into a ~50K line project a few years ago and have used it regularly whenever I'm forced to use C++... Get Graphviz as well so Doxygen can draw pretty pictures for you.
Unfortunately graph generation is pretty slow, but otherwise it's a fantastic tool.
If your codebase is anything like what I've been working with, there's no tools that are going to make your life easier.
If the code had decent structure, you'd not be asking this question. But it's a mess. And if you display the mess as a tree structure, it's still a mess. The value is limited.
The best thing I've done is set up etags across the entire codebase. This way I can at least navigate the code more easily. But I doubt you will understand anything more from tree graphs.
Ecce Europa - Web Design for Business
This is perhaps a tangential answer, but I do much better by going through the code with a debugger and watching things happen. Especially with some of the more complicated OO stuff, and when the comments are unhelpful or wrong, it can be much more useful than reading the code.
See you, space cowboy...
"If your codebase is anything like what I've been working with, there's no tools that are going to make your life easier."
Where's Artificial Intelligence when you need it?
"that everyone is asking for code analysis software these days? There was a story just a couple of days ago already."
I suspect someone's trying to reengineer Slashdot's code.
Currently SHriMP runs both as a standalone application and, using the Creole plugin, inside Eclipse to augment its existing, extensive code browsing capabilities. There's also a plugin for Protégé, a Stanford project to build "an ontology editor and a knowledge-base editor" supporting new technologies such as OWL.
While Creole is currently Java-specific, SHriMP is a generic framework for code visualization.
I was actually going to mention JQuery, but changed my mind. But some people might find it useful.
Doxygen does exactly what you described. See item #2 in the link.
If that doesn't work, look up programs that will convert code to UML. Since you didn't mention it in your question, I'll expand: Unified Modeling Language diagrams are a standardized means of describing the relationships between objects and classes. To any Slashdotters out there in college looking to take a software engineering course: you'll be seeing a lot of UML.
Direct away from face when opening.
Some time ago I used Cast for this purpose. I think it's currently called Cast Enlighten (http://www.castsoftware.com/Products/platform/AMS/Enlighten/Features.html). It's a big and complex (and probably costly) product, but it really does the job. It covers lots of languages and it is even able to drill down from, e.g., Java into Oracle stored procedures.
I tend to roll my own code analyzers because each system tends to have a different personality and there are different things to look for. Generally I try to parse the code and log each file, module, and subroutine into a relational database. Then I can query it later as needed. I may also log every variable and token, filtering out common keywords. This risks over-indexing but is better than under-indexing because extra tokens are less of a problem than non-indexed tokens. And one can adjust the indexer to remove the extra ones over time. (Ideally one should parse it using perfect grammar parsers, but that is not always an option, especially with web stuff that mixes multiple languages.)
Then I log all tables and columns of the databases (data dictionary). Once all the tables, columns, variables, subroutines, and files are logged in database tables, one can cross-reference it all either batch-wise and/or using ad-hoc queries. It is just regular database stuff at that point, so one is no longer dealing with code text but SQL (but I still like FoxPro for such because of its informal nature). However, it may launch a code viewer to inspect matches.
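As a rough illustration of the indexing idea described above, here is a minimal Python sketch using SQLite. It is not the poster's actual tool (which used other software): the table layout, the `COMMON_KEYWORDS` stop list, and the function names `build_index` and `where_used` are all assumptions made up for this example, and the regex tokenizer is deliberately crude, in the spirit of over-indexing rather than perfect parsing.

```python
import re
import sqlite3

# Hypothetical stop list of common keywords; tune it per codebase.
COMMON_KEYWORDS = {"if", "else", "for", "while", "return", "int", "void", "char"}

# Crude identifier matcher, not a real grammar parser.
TOKEN_RE = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def build_index(sources):
    """Index identifier tokens from a {filename: text} mapping into SQLite."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE token (name TEXT, file TEXT, line INTEGER)")
    for filename, text in sources.items():
        for lineno, line in enumerate(text.splitlines(), start=1):
            for name in TOKEN_RE.findall(line):
                if name not in COMMON_KEYWORDS:
                    db.execute(
                        "INSERT INTO token VALUES (?, ?, ?)",
                        (name, filename, lineno),
                    )
    db.commit()
    return db


def where_used(db, name):
    """Return sorted (file, line) pairs where the token appears."""
    rows = db.execute(
        "SELECT DISTINCT file, line FROM token WHERE name = ? "
        "ORDER BY file, line",
        (name,),
    )
    return rows.fetchall()


if __name__ == "__main__":
    # Tiny made-up sources standing in for a real codebase.
    demo = {
        "billing.c": "int total(int n) {\n  return tax(n) + n;\n}\n",
        "tax.c": "int tax(int n) {\n  return n / 10;\n}\n",
    }
    print(where_used(build_index(demo), "tax"))
```

Once the tokens are in a table, cross-referencing against a data dictionary is an ordinary join, which is the point of the approach: the hard questions become ad-hoc SQL rather than text searches.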
It takes about three days to a week to do all this if you have done it before and know your tools, plus future tweaking to make improvements.
I have seen commercial tools that do similar things for a lot of money. But to be frank, if someone besides you will be using such a tool and they don't like using SQL, then a commercial product may have a more mature user interface. They also don't bring complaints to you when there are glitches or omissions in the parser.
Table-ized A.I.
We use "Source Insight" and we are very happy. ;) It's not free, though.
In theory you can use Doxygen with any OO language, provided you can get a parser for that language. But I haven't heard a lot about Doxygen outside the C++ community. I imagine Java people mostly stick with JavaDoc, since that comes with the JDK. But I consider Doxygen to be far superior.
Okay, so this doesn't answer what he was looking for, but the title reminded me of Koders... the search engine sucks, but it's a great idea.
I second the recommendation for Source Navigator. It's been a great help to me in comprehending legacy codebases.
--
CPAN rules. - Guido van Rossum
This sounds a lot like a relatively old, but intriguing idea. "Literate Programming" is exactly what you describe.
It is exciting to write heavily documented code, but I doubt it can be done after the fact :-/
My other computer runs FreeBSD too.
PHPXRef: PHP Cross Referencing Documentation Generator
http://phpxref.sourceforge.net/
How about LXR?
I've been using it to browse the Linux source code lately.
from the site:
A general purpose source code indexer and cross-referencer that provides web-based browsing of source code with links to the definition and usage of any identifier. Supports multiple languages.
Except for the lack of syntax highlighting, it works well.
-metric
I worked on the flight operations system for a large airline for over eight years (actually ten if you count my contractor time), and I had only learned the intimate details of perhaps 20% of it by the time I left.
Complex applications require a huge amount of specialized knowledge in order to understand, and most of that knowledge relates to the application or work process itself, not the technical environment...
Mainframe/UNIX Bit Twiddler and long time Windows/Linux Hobbyist.
The Theorem Theorem: If If, Then Then.
Here's one that probably no one else thought of. If the software runs on Linux, load it up in valgrind with the calltree tool, use it a little, and look at the KCachegrind visualizations. It'll give you an idea of what code is actually used, and how it's interrelated.
I like Emacs Code Browser. It's fast, featureful, and deals with a bunch of different languages; see the screenshots.
Of course, if you don't use emacs, it won't be nearly as handy.
Shae Erisson - ScannedInAvian.com
If that's too much typing for you,(without any spaces put there by Slashdot) yields: http://sourcenav.sourceforge.net/
Oh, and for you "Well just right-click on the text and click 'Follow Link'." people, tell me how to open a selected-text link containing extraneous Slashdot spaces in a new tab using Mozilla, or shut up.
Also:
how a change will effect external source modules
"affect".
I've written a few in Perl for quick breakdowns. No graphing, but the equivalent textual information. I recall seeing a few projects of a similar nature on sweetcode, with graphing (3D?). Unfortunately, I don't recall what they might have been called.
[ approaching AI ]
find, grep and vi is all you need! :-)
I can't believe nobody has mentioned cscope yet. We used that in the multi-million line project I worked on until a couple of years ago. My division was only responsible for a few hundred thousand lines of code with a relatively well defined interface, so we generally kept our own cscope subset (Hint: cscope has an option to cache its results, and I highly recommend doing that if your project is more than a few thousand lines). I never actually had to use cscope for the entire source tree, but it worked VERY well for my area of responsibility (several tens of thousands of lines).
In order to stick to the original question, I should also mention that most nontrivial programs end up using dynamic programming styles, and there's no way to graphically display those. I also want to point out that no source code analyzer is going to do even a half-assed job at figuring out dynamic relationships, so if your project contains any drivers/vtables/virtual functions, then you're basically S.O.L., and you may as well just use cscope. However, if you really insist on getting graphical output, check out the free code graphing project. It has a nice picture of the Linux kernel.
CScope, originally developed at AT&T Bell Labs, is now open source under a BSD license and available at SourceForge. It works amazingly well on the Linux kernel. I've even tinkered around in the XFree86 code using CScope.
(Also comes with vim integration, if you're into that.)
Use it, love it. The UI is rather ugly, but it is the best damned source navigation tool available for Windows today. (Link)
However, neither supports Perl. Has anyone seen a tool like this that works on Perl code?
The one at is not free (my company pays for it...), but it works very, very well.
I'll try those programs and keep you posted.
There are several tools out there for static analysis. I guess Source-Navigator has real parsers for the languages it supports. I wonder if anyone has ever attempted to integrate Source-Navigator into Emacs or Vim for live parsing. Ctags and Etags can't do that right now and aren't real parsers to boot. Look at an IDE like Eclipse (and others) that does live parsing. Until recently, editors and developer tools have been sorely lacking real parsers, something that many Windows tools have had for a while (especially for C++).
I found Eclipse very good for this on a recent 600K-line Java + JSP project that I needed to understand in a few weeks. Of course, your machine needs to be spiffy and it needs to have plenty of RAM.
The features I found most useful were:
- hierarchy browser
- call graph (finds out who calls what)
- debugger (when all else fails, debug to find out what is happening)
- resumable debugging (change while debugging)
- incremental building, which in Eclipse immediately tells you where you broke the code when you make a change (sometimes the easiest way is to create a compilation error and see what it impacts)
- refactoring tools
I bet it isn't as good for C++ and other languages though.
Microsoft's Visual C++ creates a "code database" containing information about functions and classes.
Version 6 had operations like "call graph" and "caller graph" that graphically showed the function calling sequence from or to a given function.
Unfortunately, Microsoft, in their infinite wisdom, decided to scrap this functionality in version 7 (a.k.a
Theoretically, since the code DB still contains the full info, an add-on could be written to support this functionality but
I'm surprised no one has talked about Smalltalk's integrated code browser yet. See how much languages and environments have to learn from Smalltalk?
There's a nice front-end called Codeviz that 'writes' the graphs for Graphviz to render, and lets you filter symbols based on regular expressions. I find it pretty useful on C source - can't comment on other languages.
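To give a feel for what such a front-end produces, here is a small Python sketch that turns a list of caller/callee pairs into Graphviz DOT text and filters symbols with a regular expression. It is not how Codeviz itself works; the function name `edges_to_dot`, its `keep_pattern` parameter, and the sample edges are hypothetical, and the call pairs are assumed to come from some other tool.

```python
import re


def edges_to_dot(edges, keep_pattern=".*"):
    """Render (caller, callee) pairs as Graphviz DOT text.

    Only edges whose caller and callee both match keep_pattern are kept.
    """
    keep = re.compile(keep_pattern)
    lines = ["digraph calls {"]
    # sorted(set(...)) removes duplicate edges and makes output stable.
    for caller, callee in sorted(set(edges)):
        if keep.search(caller) and keep.search(callee):
            lines.append(f'    "{caller}" -> "{callee}";')
    lines.append("}")
    return "\n".join(lines)


if __name__ == "__main__":
    # Made-up call pairs standing in for real analyzer output.
    sample = [
        ("main", "parse_args"),
        ("main", "net_open"),
        ("net_open", "net_connect"),
    ]
    # Keep only the networking functions.
    print(edges_to_dot(sample, keep_pattern=r"^net_"))
```

Saving the output to a file such as calls.dot and running `dot -Tpng calls.dot -o calls.png` lets Graphviz draw it. Filtering before rendering matters on large codebases, where an unfiltered call graph is usually too dense to read.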
I use Visual SlickEdit and its code browser works just great for me. It uses incremental search to update the tags, so it doesn't take a hell of a long time to update the tags database. If you want to use a free tool, then use KScope, which uses cscope and ctags.
We are always correct.. even when we realize we were wrong.