Source Code Browsing Tools?
Marco Sanvido asks: "I often look at source code (especially C, but this question is valid for other languages as well) and I have a really hard time in understanding how it works. Documentation is often missing or quite outdated, and the only way to see how the program works is to try to understand the source code. Which tools do you prefer to use for browsing and studying source code? So far I have used LXR for Linux, Eclipse for java, and CScope, but I'm not sure that these tools are the best solution." It's tempting to flood this question with answers for your IDE, but the key thing here is _browsing_, not _development_. What decent, lightweight programs would work well as source code viewers?
If you're looking more for documentation of existing code, doxygen is great. It produces click-to-follow hierarchies, graphical pictures of trees, and can will intelligently display some of the comments it encounters. It can produce output in html, LaTeX, rtf, PS, PDF, and even man pages. And I know from experience that it can handle some pretty massive projects.
John
Real programmers use "type".
the opensolaris code browser is built off a bunch of open source stuff.
OpenGrok
its incredibly easy to use, and makes things very easy to read. and is now packaged for your enjoyment.
and available at http://www.opensolaris.org/os/project/opengrok/
XML - A clever joke would be here if
Nothing Can Beat a Good Editor
As much as we might like to use some special purpose tool for this purpose, most of the time that I'm looking at code I'm not entirely sure if I'm going to be editing it or just peeking. Thus, it's silly to be in one program when I need another. And, the added "system weight" of running a "heavy" editor vs.
Syntax highlighting is THERE in an editor, and I don't have to restart if I change my mind about changing the file.
http://ultraedit/com/ is a GREAT editor for Windows, or Jedit or Eclipse for Win or unix.
Unitarian Church: Freethinkers Congregate!
If I'm looking at third-party code (instead of my own), I like to use Krugle. It's still in beta and I was lucky enough to get a beta invite, but it's an extremely powerful tool for searching through repositories and documentation.
I've always considered stuff like LXR (linux cross reference) to be good for this kind of thing.
LXR's claim to fame is that it started out being a cross-referenced browser for the linux kernel source code, but since it was released, the newer versions has moved towards becoming more of a general source browser. (might need to use cvs, don't know if a proper release was made)
It does neat tricks like processing source code, building function/variable/header/etc line references for usage, definition, declaration etc, and cramming them into a db for retreival. It can also handle interfacing with CVS to pull source code directly from a CVS server, which is interesting. Also handles full text searching too, if you get glimpse or whatever.
Of course, the interface to LXR is a web browser, which makes it less than ideal if you consider that it isn't integrated into an IDE, but for the purpose of tracing/searching large amounts of code, it's still pretty useful.
ash
You can make some useful call graphs with codeviz + graphviz. I sometimes find this useful for tracing the heirarchy of abstraction through a set of C source files.
http://www.csn.ul.ie/~mel/projects/codeviz/
..is Redhat Sourcenavigator . You can look at class hierarchy, static call graphs, jump to function declarations/definitions/callsites. Try it out.
I needed an easy to use C source code browser because I'm porting an old bbs game to Java. JEdit fit the bill perfectly. Out of the box it's not much more then a text editor with syntax highlighting(130+ languages). It has a feature call HyperSearch that can be used for search through single files or multiple files and have a little box of hyperlinked results. It has lots of plugins to extend it's funtionality but nothing extra to get in the way when you first install it. Check it out at jedit.org. The only thing some people might take issue with is that it requires Java.
My Hello World is 512 bytes. But it's also a valid Fat12 boot sector, Fat12 file reader, and Pmode routine.
I'm actually a fan of using a debugger to step through code I'm trying to understand. I can let it keep track of the call stack for me and it saves me from having to manually surf around multiple source files to figure out where the next function I need to look at is.
It's not a good way to figure out how every nook and cranny of the code works, but it's great for an initial scan-through to see the overall structure of a module. And if you are at liberty to throw in an embeddable scripting language (I use F-Script) you can poke and prod at anything you want with ease.
I prefer to use SciTe - it's really lightweight - supports code folding, syntax highlighting and it's open source.
Freedom is not worth having if it does not include the freedom to make mistakes. - Mahatma Gandhi
http://www.sourceinsight.com/
I've found Doxygen to be pretty handy, it'll output to a bunch of HTML files and even create nifty relationship graphs for OO languages like C++
http://www.stack.nl/~dimitri/doxygen/
If you develop in Java you could try FishEye:
http://www.cenqua.com/fisheye/
To plug a personal project of mine--
Browse-by-Query is a database for code with a query language specifically designed for finding things in code.
I was dissatisfied with fixed-function browsers, so I developed this.
Use expressions more powerful than regular expressions to search through and understand your codebase.
Works only with Java now (there's a standalone version and Eclipse plugin) but I hope I (or someone else) will extend it to others.
We use Understand for C++ (link is to the index of all "Understand for..." family members) when reviewing and designing formal unit tests for our clients' code. It's extremely useful for manual static analysis: understanding structure and inter-relatedness, so to speak.
However, to understand dynamic behavior you should look at various tracing options, even the lowly printf(), or try stepping through in a debugger. The larger, more complex and the more object-oriented the code, the more important understanding the dynamics are.
Anyway, Understand for C++ is much more interactive than any of the free comment extraction or cross reference tools and the database has a Perl API, though we've not had a chance to use it. It's worth the price if your doing this as part of your job.
- BarrieI know you said no IDE's, but if you merge well with Emacs, it can do the work too. Emacs is not exactly a heavy weight, depending on how you install it (it's often built into your distro anyway). It uses this thing (I don't really know it that well to be honest) called tags - ctags or etags.
Basically you run etags (check your man pages) from the command line that will parse through your source files and create a lookup table in a file (name TAGS by default I think). While browsing the source file, you just have to position the cursor at the right symbol, and press M-. (that's usually Alt-DOT) and it'll take you to the function definition. (vim has a similar thing)
I haven't used it too much, since I'm a Lisper, and the Slime development environment (Emacs addon mode) for Lisp has a similar thing (and it doesn't need to create any tables beforehand) that also provides a "stack-like" functionality. That means you can jump to the function definition, then pop back to where you were. This can be handy for quick detours just to lookup small functions for example.
The advantage here is that you have the all the files locally, so it's faster than browsing through a web interface (html-ized source files, like the Sourceforge CVS frontend - I still use that a lot, and it is SLOW), and you can also edit the source (just a bonus).
One of the things I like most, is that it also colours all parts of the code that is #ifdef'ed out. Another thing is that the information windows for any token displays all kind of information: calls, callees, references, uses (including sets/get/modifies), used locals, used globals, exposed globals etc - all of which in a tree view, so it is very easy to decide what's important and what's not.
It is possible to use it as an editor as well, which I do, but as such it isn't perfect.
Also, importing a completely new project requires almost no intervension - it will simply prompt for where any missing #includes are located and add them to the searchpath, so just setting it up for a quick test is done in no-time.
Hey, we weren't supposed to talk about IDEs!
My other account has a 3-digit UID.
You could also take a look at SourceNavigator at http://sourcenav.sourceforge.net/index.html.
I have sometimes used GNU Global which makes indexed html pages of the code. Somewhat similar to lxr but there is no setup, just run two commands, gtags and htags. One nice thing about global is that it can be used on any incomplete subset of a software system. Want to just look at the files in the drivers/net/wireless directory in the linux kernel tree? Fine, just run gtags and htags from that directory (and no other setup is necessary).
I have also used NCC which "compiles" each file and makes a index file with information like "function AAA calls functions BBB, CCC and DDD, reads variables EEE, writes variables FFF and GGG". The format is not exactly like that but you get the idea. NCC includes a text mode gopher-like variable usage/function call browser and there is a script to make graphical call graphs (via dot from graphviz). At work I have also used information from ncc files in combination with with information from the map file to find maximum stack usage.
This study (which I just found while writing this) seems to have an interesting analysis of this topic.
When you are sure of something, you probably are wrong (search for "Unskilled and Unaware of It").
Leo - http://leo.sourceforge.net/
A GUI literate programming editor - can import sources in many languages, and break them down into classes/methods/functions.
You then have ability to create all manner of 'views' of the code.
-- In the beginning was the WORD, and the WORD was UNSIGNED, and the main(){} was without form and void...
Just rewrite it. Reading others' code is for wimps.
Unfortunately the story poster was a bit vague. They could mean that they or the person assigned don't read code, yet they need to know how a program works, or what it does. There are many times when you don't want to or cannot run/debug a program to analyze it.
... Current tools give you unuseful amounts like cyclomatic complexity, or other ratings like test coverage or Fowler's class interdepence/brittleness/(whatever) measure. However those things are useful to me only: if you know nothing about code, have any resources to plug testing holes, or are actively managing a codebase, respectively. None of them say what the code *does* or how it does it. So we need something that covers concerns orthogonal to what already exists.
There have been many links posted which I'm going to have to explore - not the editor links (jEdit and Emacs rulz!) but the conceptualization links. Unfortunately most if not all IDEs are still code-file based. The most prominent other tool for project conceptualization is UML which has been gaining IDE integration. Yet there are still a couple problems with that:
1. The UML usually generates accurate structures for classes, but doesn't or cannot generate execution diagrams (what, 4 types in UML?) or state diagrams.
2. These static diagrams represent only the level of the code instead of being an interactive object with drill-downs or abstract-up commands!
To understand a new set of code or codebase, I need something that analyzes the code and reports back to me on its tactics and strategies - what patterns are used, what weaknesses are in the code,
Which makes me more curious about the comment asking if anyone uses the tools researchers are working on for visualizing a project. I haven't kept up on academics - where do I start looking? And is there anything there with demos, products available, or in post-beta releases?
8-PP
Source Navigator is the perhaps the most amazingly useful freebie I've ever downloaded. It's absolutely indispensible for making sense of large C/C++ codebases (and it has some support for other languages too). The cross-referencing ability is particularly useful; it's great to be able to call up a graphical call-tree of any function.
--
CPAN rules. - Guido van Rossum
14-inch greenbar, preferably printing on a color-capable impact printer.
Wide continuous paper, plus a large work surface, means I can stretch a module out and mark it up with highlighters and scribble notes. A straightedge and some detective work means I can verify "indentation" levels (code nesting).
Of course, run a source code beautifier over it first.
Why, yes, I am old; how did you guess?
Welcome to the Panopticon. Used to be a prison, now it's your home.