Source Code Browsing Tools?
Marco Sanvido asks: "I often look at source code (especially C, but this question is valid for other languages as well) and I have a really hard time in understanding how it works. Documentation is often missing or quite outdated, and the only way to see how the program works is to try to understand the source code. Which tools do you prefer to use for browsing and studying source code? So far I have used LXR for Linux, Eclipse for java, and CScope, but I'm not sure that these tools are the best solution." It's tempting to flood this question with answers for your IDE, but the key thing here is _browsing_, not _development_. What decent, lightweight programs would work well as source code viewers?
If you're looking more for documentation of existing code, doxygen is great. It produces click-to-follow hierarchies, graphical pictures of trees, and can intelligently display some of the comments it encounters. It can produce output in HTML, LaTeX, RTF, PS, PDF, and even man pages. And I know from experience that it can handle some pretty massive projects.
John
I personally would like to see tools for visually displaying objects at runtime in an animated graphical display. It would be a quick way to figure out a large code base rather than having to read all the code.
Real programmers use "type".
the opensolaris code browser is built off a bunch of open source stuff.
OpenGrok
It's incredibly easy to use, makes things very easy to read, and is now packaged for your enjoyment.
and available at http://www.opensolaris.org/os/project/opengrok/
XML - A clever joke would be here if
Personally I use vim/vi for both, but then I guess I'm old school :-)
Nothing Can Beat a Good Editor
As much as we might like to use some special purpose tool for this purpose, most of the time that I'm looking at code I'm not entirely sure if I'm going to be editing it or just peeking. Thus, it's silly to be in one program when I need another. And, the added "system weight" of running a "heavy" editor vs.
Syntax highlighting is THERE in an editor, and I don't have to restart if I change my mind about changing the file.
http://ultraedit.com/ is a GREAT editor for Windows, or jEdit or Eclipse for Windows or Unix.
Unitarian Church: Freethinkers Congregate!
If I'm looking at third-party code (instead of my own), I like to use Krugle. It's still in beta and I was lucky enough to get a beta invite, but it's an extremely powerful tool for searching through repositories and documentation.
I've always considered stuff like LXR (linux cross reference) to be good for this kind of thing.
LXR's claim to fame is that it started out as a cross-referenced browser for the Linux kernel source code, but since it was released, the newer versions have moved towards becoming more of a general source browser. (You might need to use CVS; I don't know if a proper release was made.)
It does neat tricks like processing source code, building function/variable/header/etc. line references for usage, definition, declaration and so on, and cramming them into a db for retrieval. It can also interface with CVS to pull source code directly from a CVS server, which is interesting. It handles full-text searching too, if you get glimpse or whatever.
Of course, the interface to LXR is a web browser, which makes it less than ideal if you consider that it isn't integrated into an IDE, but for the purpose of tracing/searching large amounts of code, it's still pretty useful.
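For anyone curious what "building line references and cramming them into a db" amounts to, here is a minimal sketch in Python. It is not how LXR is implemented: the input format (a dict of filename to contents) and the function name `build_xref` are made up for illustration, and it naively treats every C-style identifier, keywords included, as a symbol.

```python
import re
from collections import defaultdict

# Deliberately naive: any C-style identifier counts, keywords included.
IDENT = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")

def build_xref(sources):
    """Map each identifier to the sorted (filename, line number) pairs where it appears.

    `sources` is a dict of filename -> file contents (a hypothetical input format).
    """
    index = defaultdict(set)
    for name, text in sources.items():
        for lineno, line in enumerate(text.splitlines(), start=1):
            for match in IDENT.finditer(line):
                index[match.group()].add((name, lineno))
    return {ident: sorted(places) for ident, places in index.items()}
```

A real cross-referencer would also tell definitions from uses, which needs at least a rough parser rather than a regex.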
ash
You can make some useful call graphs with codeviz + graphviz. I sometimes find this useful for tracing the hierarchy of abstraction through a set of C source files.
http://www.csn.ul.ie/~mel/projects/codeviz/
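If you already have caller/callee data from some other source, the graphviz half is easy to do by hand. A rough sketch in Python, assuming the call data is a plain dict (the function name `to_dot` and the sample data are hypothetical, and this is not what codeviz itself does):

```python
def to_dot(calls, name="callgraph"):
    """Render a caller -> callees mapping as Graphviz DOT text."""
    lines = [f"digraph {name} {{"]
    for caller in sorted(calls):
        for callee in sorted(calls[caller]):
            lines.append(f'    "{caller}" -> "{callee}";')
    lines.append("}")
    return "\n".join(lines)

# Hypothetical call data; a real tool would extract this from the source.
example_calls = {"main": ["parse_args", "run"], "run": ["read_record", "write_record"]}
print(to_dot(example_calls))
```

Save the output as callgraph.dot and render it with graphviz, e.g. `dot -Tpng callgraph.dot -o callgraph.png`.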
Yeah I know we're not supposed to discuss IDE's. But its class browser (which can handle multiple languages) can greatly simplify trouncing through a code base.
..is Redhat Sourcenavigator . You can look at class hierarchy, static call graphs, jump to function declarations/definitions/callsites. Try it out.
I needed an easy-to-use C source code browser because I'm porting an old BBS game to Java. jEdit fit the bill perfectly. Out of the box it's not much more than a text editor with syntax highlighting (130+ languages). It has a feature called HyperSearch that can search through a single file or multiple files and gives you a little box of hyperlinked results. It has lots of plugins to extend its functionality, but nothing extra to get in the way when you first install it. Check it out at jedit.org. The only thing some people might take issue with is that it requires Java.
My Hello World is 512 bytes. But it's also a valid Fat12 boot sector, Fat12 file reader, and Pmode routine.
Notepad
Error 2101: all your sig are belong to us
I'm actually a fan of using a debugger to step through code I'm trying to understand. I can let it keep track of the call stack for me and it saves me from having to manually surf around multiple source files to figure out where the next function I need to look at is.
It's not a good way to figure out how every nook and cranny of the code works, but it's great for an initial scan-through to see the overall structure of a module. And if you are at liberty to throw in an embeddable scripting language (I use F-Script) you can poke and prod at anything you want with ease.
I prefer to use SciTe - it's really lightweight - supports code folding, syntax highlighting and it's open source.
Freedom is not worth having if it does not include the freedom to make mistakes. - Mahatma Gandhi
http://www.sourceinsight.com/
I've found Doxygen to be pretty handy, it'll output to a bunch of HTML files and even create nifty relationship graphs for OO languages like C++
http://www.stack.nl/~dimitri/doxygen/
You mentioned cscope... what exactly doesn't it do that you'd like? Other than the time to build the source database (minutes for large projects, amortized heavily if you're only reading), it seems perfect to me.
I've had this sig for three days.
Here's what I use:
a2ps -o output.ps --prologue=color SOURCECODE.c
lpr output.ps
I love it.
If you develop in Java you could try FishEye:
http://www.cenqua.com/fisheye/
To plug a personal project of mine--
Browse-by-Query is a database for code with a query language specifically designed for finding things in code.
I was dissatisfied with fixed-function browsers, so I developed this.
Use expressions more powerful than regular expressions to search through and understand your codebase.
Works only with Java now (there's a standalone version and Eclipse plugin) but I hope I (or someone else) will extend it to others.
Notepad++ is open source, available on most platforms, allows tag/function/logic expansion, and has preset color coding for almost all popular languages. The find/edit capabilities are comparable to Dreamweaver's, allowing regular expressions if you're into that kind of thing. It color-labels the matches from a find, or you can do the traditional up-and-down find. If anything, it has too many features.
It is what you would expect the microsoft people to produce if they were trying to sell notepad/wordpad on the featuresets like they used to in order to engulf the market.
We use Understand for C++ (link is to the index of all "Understand for..." family members) when reviewing and designing formal unit tests for our clients' code. It's extremely useful for manual static analysis: understanding structure and inter-relatedness, so to speak.
However, to understand dynamic behavior you should look at various tracing options, even the lowly printf(), or try stepping through in a debugger. The larger, more complex and the more object-oriented the code, the more important understanding the dynamics are.
Anyway, Understand for C++ is much more interactive than any of the free comment extraction or cross reference tools and the database has a Perl API, though we've not had a chance to use it. It's worth the price if you're doing this as part of your job.
- Barrie
I know you said no IDEs, but if you merge well with Emacs, it can do the work too. Emacs is not exactly a heavyweight, depending on how you install it (it's often built into your distro anyway). It uses this thing (I don't really know it that well, to be honest) called tags: ctags or etags.
Basically you run etags (check your man pages) from the command line, and it parses your source files and creates a lookup table in a file (named TAGS by default, I think). While browsing the source file, you just have to position the cursor on the right symbol and press M-. (that's usually Alt-DOT) and it'll take you to the function definition. (vim has a similar thing)
I haven't used it too much, since I'm a Lisper, and the Slime development environment (Emacs addon mode) for Lisp has a similar thing (and it doesn't need to create any tables beforehand) that also provides a "stack-like" functionality. That means you can jump to the function definition, then pop back to where you were. This can be handy for quick detours just to lookup small functions for example.
The advantage here is that you have the all the files locally, so it's faster than browsing through a web interface (html-ized source files, like the Sourceforge CVS frontend - I still use that a lot, and it is SLOW), and you can also edit the source (just a bonus).
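The "stack-like" jump-and-pop behaviour is simple enough to model in a few lines. This is only a toy sketch in Python, not how Emacs, Slime or vim implement it; the class name `TagBrowser` and the shape of the tags table are assumptions for illustration.

```python
class TagBrowser:
    """Toy model of tag-based navigation with a jump stack."""

    def __init__(self, tags, start):
        self.tags = tags          # symbol -> (filename, line), as a tags file would provide
        self.position = start     # current (filename, line)
        self.stack = []           # positions to return to

    def jump(self, symbol):
        """Go to the definition of `symbol`, remembering where we came from."""
        target = self.tags[symbol]   # KeyError if the symbol is not in the table
        self.stack.append(self.position)
        self.position = target
        return target

    def pop(self):
        """Return to the position before the most recent jump."""
        if not self.stack:
            raise IndexError("tag stack is empty")
        self.position = self.stack.pop()
        return self.position
```

Because every jump pushes the previous position, a chain of detours unwinds in reverse order, which is what makes quick side trips into small functions cheap.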
...and also totally irrelevant to this discussion. The question was about source browsing tools, not about source editing tools. The "find" and "find in files" functions that SciTe provides don't really count as browsing tools.
This is a question, not a recommendation: Is anybody out there using program slicing tools? Or any of the other "program understanding" tools that people doing software engineering research seem to spend a lot of time developing?
think beyond ed
...if your code is in a CVS or Subversion repository. It uses enscript for syntax highlighting which works pretty well for a variety of languages (for example, Ruby).
I agree with some of the other folks here, though - a good IDE makes an excellent code browser. IntelliJ IDEA is awesome if you're working with Java code...
The Army reading list
I learned Emacs because I needed to master an editor on Linux (I came from the Windows world). I can't live without it. Even though I'm back on Windows most of the time these days, I pop out of my "latest and greatest" Visual Studio 2003, 2005, ... be-all-and-end-all IDE dozens of times a day to use Emacs. It's just faster when all I want to do is find a file and search through it or edit it.
One great thing about Emacs is that a directory browser is built-in. It doesn't use the native operating system's file/directory browse dialog by default (although it's there in GUI builds). The browser is in the same window as the editor. This means you don't have to use the mouse and click on five things to open a file.
You can navigate from one directory to another just by pressing the up/down keys and enter. For example, pressing enter on the '..' line moves you up one directory. You can then 'close' the directory as if it were a text file and you're back where you started from. This makes traversing directories and files a very quick process, with stack behavior--all from the keyboard.
Of course, it comes with the attendant price of the Emacs learning curve. One mitigating factor is that directory browsing is often covered early in the tutorials and if that's all you're using it for, it shouldn't be too painful.
On a separate but related note, I use V 2000 http://www.fileviewer.com/. It's faster than any other directory/text-oriented file browser I've used (even Emacs). It saves a couple of keystrokes for each file, compared to Emacs, and they add up. I've used it for years and love it.
find, grep, less, and cat. Seriously, if a combination of those doesn't find what I'm after, and can't display it, there is something wrong.
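For boxes without the usual tools, the find + grep half of that combination is easy to approximate. A minimal sketch in Python, assuming you only care about .c and .h files (the function name `grep_tree` and the default suffixes are my own choices, nothing standard):

```python
import os
import re

def grep_tree(root, pattern, suffixes=(".c", ".h")):
    """Yield (path, line number, line) for every matching line under `root`."""
    regex = re.compile(pattern)
    for dirpath, dirnames, filenames in os.walk(root):
        dirnames.sort()   # make traversal order deterministic
        for filename in sorted(filenames):
            if not filename.endswith(suffixes):
                continue
            path = os.path.join(dirpath, filename)
            with open(path, encoding="utf-8", errors="replace") as handle:
                for lineno, line in enumerate(handle, start=1):
                    if regex.search(line):
                        yield path, lineno, line.rstrip("\n")
```

Something like `for hit in grep_tree("src", r"\bmain\s*\("): print(hit)` then lists every line that looks like a call to or definition of main, with "src" standing in for wherever your tree lives.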
One of the things I like most, is that it also colours all parts of the code that is #ifdef'ed out. Another thing is that the information windows for any token displays all kind of information: calls, callees, references, uses (including sets/get/modifies), used locals, used globals, exposed globals etc - all of which in a tree view, so it is very easy to decide what's important and what's not.
It is possible to use it as an editor as well, which I do, but as such it isn't perfect.
Also, importing a completely new project requires almost no intervention - it will simply prompt for where any missing #includes are located and add them to the search path, so just setting it up for a quick test is done in no time.
http://www.opensolaris.org/os/project/opengrok/ is the best that I've used till now. Easy to setup and quite fast to use.
Try jGRASP (http://www.eng.auburn.edu/department/cse/research/grasp/). Some good points below.
1) IDE front-end to compilers.
2) Generates CSD (source code visualization).
3) Runs on all platforms that use JVM.
4) Supports Java, C/C++, Ada, VHDL & Objective-C.
5) In-built debugger for Java.
Is there anything better? Run ctags recursively over your code base, and then use your favorite editor (vim, right?! or emacs, if you must[1]) to follow paths through your program logic, jump to variable definitions, and all kinds of other fun stuff. It keeps a stack, so you can pop back up to a higher frame and recurse down another path, ad nauseam.
It supports 33 languages, and is used on all 7 continents (don't know why that matters, but hey, it's on the home page).
http://ctags.sourceforge.net/
[1] I had to.
Try PSpad. It's free. www.PSpad.com
I like Emacs a lot and use it each time I have to code, but for most activities like modifying a file or writing scripts I prefer gvim. Emacs takes ages to start and its UI is not consistent with most desktop environments. Unfortunately, the port to modern widget toolkits like GTK never worked nicely. In marked contrast, the Windows version is really nice but I hardly remember the last time I booted on Windows.
This depends on what you want to do.
If you are looking for this in relation to debugging a known bug, ctags + vim (or etags + emacs) is the way to go. This also applies in case you just want to learn the code.
If you are looking for this to audit the security of a program, then you need to follow code paths. While ctags will help you there, I don't see much stuff which is capable of showing flow paths in a program.
I can throw myself at the ground, and miss.
You could also take a look at SourceNavigator at http://sourcenav.sourceforge.net/index.html.
I'm using Vim 7.0 too, with a 'tags' file generated with Exuberant Ctags. See :help tags-and-searches
I mapped Alt-Right and Alt-Left to quickly follow a variable/function name to its definition and go back:
map <M-Left> <C-T>
map <M-Right> <C-]>
Vim and tags.
But, you have got to want to learn it if you are not already required to learn it.
The upside?
I have sometimes used GNU Global which makes indexed html pages of the code. Somewhat similar to lxr but there is no setup, just run two commands, gtags and htags. One nice thing about global is that it can be used on any incomplete subset of a software system. Want to just look at the files in the drivers/net/wireless directory in the linux kernel tree? Fine, just run gtags and htags from that directory (and no other setup is necessary).
I have also used NCC, which "compiles" each file and makes an index file with information like "function AAA calls functions BBB, CCC and DDD, reads variables EEE, writes variables FFF and GGG". The format is not exactly like that but you get the idea. NCC includes a text mode gopher-like variable usage/function call browser and there is a script to make graphical call graphs (via dot from graphviz). At work I have also used information from ncc files in combination with information from the map file to find maximum stack usage.
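The maximum stack usage trick boils down to a longest-path walk over the call graph. Here is a rough sketch in Python, assuming you have already extracted two tables by whatever means: who calls whom, and how many bytes each function's own frame takes. The names (`max_stack_usage`, `calls`, `frame_size`) are hypothetical and this is not NCC's actual output format.

```python
def max_stack_usage(calls, frame_size, entry):
    """Worst-case stack bytes on any call path starting at `entry`.

    calls:      function -> list of functions it calls (e.g. from a call-graph tool)
    frame_size: function -> bytes of stack used by that function's own frame
    Raises ValueError on recursion, where no static bound exists.
    """
    memo = {}
    in_progress = set()

    def visit(func):
        if func in memo:
            return memo[func]
        if func in in_progress:
            raise ValueError(f"recursion involving {func!r}")
        in_progress.add(func)
        deepest_callee = max((visit(c) for c in calls.get(func, ())), default=0)
        in_progress.remove(func)
        memo[func] = frame_size[func] + deepest_callee
        return memo[func]

    return visit(entry)
```

Calls through function pointers and interrupt handlers are not covered by a static walk like this, so treat the result as a lower bound on what you need to reserve unless you account for those separately.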
This study (which I just found while writing this) seems to have an interesting analysis of this topic.
When you are sure of something, you probably are wrong (search for "Unskilled and Unaware of It").
It doesn't really look maintained, but I really enjoyed Cygnus Source Navigator when I needed to read a lot of source bases for a living. You can find it at http://sourcenav.sourceforge.net/ or probably as 'sourcenav' in the distribution of your choice.
The underlying technology is not the prettiest ever. Yeah it uses TCL. But it has a workmanlike efficacy in terms of interface. Give it a try.
For most smaller projects I just use vi and ctags, or maybe cscope with those, but I'm sure you're familiar with all that already.
-josh
I'm amazed, even the title of this story is wrong. For browsing, any tool that can display code is sufficient. Sure, tools which can syntax highlight or have name reference lookup are better, but the real issue is finding the source code in the first place.
Whenever I search for a solution I first go to http://koders.com/ but their index seems a little limited. Still, just try once looking up "wxSingleInstanceChecker". There are others like Koders.com but I've forgotten their names. Next I try to think of the most fitting statement for the code and feed it to Google (e.g. "class App: public wxApp", but beware of the white space), which at least returns some hints about which project might have source code available. Yet free text search isn't very well suited to searching code since it produces too many wrong results.
When I've found a project which might have fitting code I either look into their LXR if it's available or simply download the source tarball and use a decent editor (e.g. http://freshmeat.net/projects/wyoeditor/).
It's sad that none of the current CVS web tools are searchable, and that Google isn't able to restrict results to CVS pages; otherwise it would be much easier to search for source code.
O. Wyss
See http://wyoguide.sf.net/papers/Cross-platform.html
Browsing and, incidentally, development. Use ctags (or compatible) and in vim press ^] over a symbol, and vim will jump to the location where that symbol is defined. Pop back by pressing ^T. See C++ development using vim and ctags for more options. Awesome.
Leo - http://leo.sourceforge.net/
A GUI literate programming editor - can import sources in many languages, and break them down into classes/methods/functions.
You then have ability to create all manner of 'views' of the code.
-- In the beginning was the WORD, and the WORD was UNSIGNED, and the main(){} was without form and void...
Just rewrite it. Reading others' code is for wimps.
a2ps already sends to the default printer if you don't specify the -o option :) Just so you save 20-30 characters.
Of Code And Men
You're doing this the wrong way. Poring over code is very useful, but doesn't show you the layering. So, whip up trusty old gdb, set a breakpoint and run the damn code. Then use the step and next commands and just see where this leads you.
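The same "run it and watch where it goes" idea can be automated when the language gives you a hook for it. A small sketch in Python rather than gdb on C, purely as an illustration: it uses the standard `sys.setprofile` hook to record an indented call tree, and the helper name `trace_calls` is made up.

```python
import sys

def trace_calls(func, *args, **kwargs):
    """Run func and return (result, indented names of the Python functions it called)."""
    log = []
    depth = 0

    def profiler(frame, event, arg):
        nonlocal depth
        if event == "call":
            log.append("  " * depth + frame.f_code.co_name)
            depth += 1
        elif event == "return":
            depth -= 1

    sys.setprofile(profiler)
    try:
        result = func(*args, **kwargs)
    finally:
        sys.setprofile(None)   # always remove the hook, even if func raises
    return result, log
```

Printing the log line by line gives you the layering at a glance, which is roughly what you reconstruct by hand when stepping through with a debugger.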
8 of 13 people found this answer helpful. Did you?
Any editor with good syntax highlighting would probably give you most of what you want. Nedit has long been my favorite. Simple, light, and with (last time I counted) 28 grammars.
I bleed to National Instruments so that I can use their graphical programming language. If anyone knows an alternative, pipe up.
PHPXref (http://phpxref.sourceforge.net/) is a great tool which builds an HTML-based outline of your source code. It's been an indispensable tool for working on a very large project (Moodle, http://www.moodle.org/), especially when getting my hands dirty with a new section of code I haven't used yet, as it makes following an execution path very easy to do.
From the site description:
* Minimal requirements, minimal setup.
* No web server required to view output.
* Cross-references PHP classes, functions, variables, constants and require/include usage.
* Extracts phpdoc style documentation from source files.
* Javascript enhanced output provides:
o Mouse-over information for classes and functions in the source view.
o Hot-jump to the source of any class/function definition.
o Instant lookup of classes, functions, constants and tables by name.
o Search/lookup history.
* Pretty-prints PHP files from the browser.
* Stays crunchy in milk.
{justin.filip | jfilip AT gmail DOT com} {http://jfilip.ca/}
Unfortunately the story poster was a bit vague. They could mean that they or the person assigned don't read code, yet they need to know how a program works, or what it does. There are many times when you don't want to or cannot run/debug a program to analyze it.
... Current tools give you not-very-useful numbers like cyclomatic complexity, or other ratings like test coverage or Fowler's class interdependence/brittleness/(whatever) measure. However, those things are useful only if you know nothing about the code, have resources to plug testing holes, or are actively managing a codebase, respectively. None of them say what the code *does* or how it does it. So we need something that covers concerns orthogonal to what already exists.
There have been many links posted which I'm going to have to explore - not the editor links (jEdit and Emacs rulz!) but the conceptualization links. Unfortunately most if not all IDEs are still code-file based. The most prominent other tool for project conceptualization is UML which has been gaining IDE integration. Yet there are still a couple problems with that:
1. The UML usually generates accurate structures for classes, but doesn't or cannot generate execution diagrams (what, 4 types in UML?) or state diagrams.
2. These static diagrams represent only the level of the code instead of being an interactive object with drill-downs or abstract-up commands!
To understand a new set of code or codebase, I need something that analyzes the code and reports back to me on its tactics and strategies - what patterns are used, what weaknesses are in the code,
Which makes me more curious about the comment asking if anyone uses the tools researchers are working on for visualizing a project. I haven't kept up on academics - where do I start looking? And is there anything there with demos, products available, or in post-beta releases?
8-PP
On Windows, I use ConTEXT (http://www.context.cx/) extensively. It has built-in syntax highlighting for several languages, and you can download highlighters for hundreds more. Also, it usually takes well under a second for it to startup (uncached).
Plus it's free, which you can't beat (OK, OK, maybe with open source...).
pico? pico? nano man!!!
I write code.
Source Navigator is perhaps the most amazingly useful freebie I've ever downloaded. It's absolutely indispensable for making sense of large C/C++ codebases (and it has some support for other languages too). The cross-referencing ability is particularly useful; it's great to be able to call up a graphical call-tree of any function.
--
CPAN rules. - Guido van Rossum
For pretty-printing Java, I like jGRASP. It draws outlining-type lines to the left of the code to show nesting and control structures. I always use it to print code for code reviews. It can generate class diagrams, too, but I haven't used that feature.
14-inch greenbar, preferably printing on a color-capable impact printer.
Wide continuous paper, plus a large work surface, means I can stretch a module out and mark it up with highlighters and scribble notes. A straightedge and some detective work means I can verify "indentation" levels (code nesting).
Of course, run a source code beautifier over it first.
Why, yes, I am old; how did you guess?
Welcome to the Panopticon. Used to be a prison, now it's your home.
Also worth mentioning (and related) is the Java Development Environment for Emacs, which makes analysing and traversing a large Java project a whole lot easier, with integrated class management, wizards, skeletons for creating classes and javadoc comments. You can get JDEE from its homepage.
Cheers,
Toby Haynes
Anything I post is strictly my own thoughts and doesn't necessarily have anything to do with the opinions of IBM.
Since the original post said 'understand' and 'other languages', I'd just say, I've started a project for untangling legacy Perl CGI and writing the call graphs (via dot etc.) out as svg diagrams. The code is young and messy but has started to work. It's at http://sourceforge.net/projects/codewalker/.
On y va, qui mal y pense!
(1) NEdit combined with exuberant ctags.
(2) Red Hat's SourceNavigator.
(3) GNU Global to generate a nice clickable HTML version of a source tree.
I used to also use CSCOPE, but I can't find Solaris/Sparc binaries which don't require root access to install (pkg format isn't helpful for me -- I'm just a developer on the box, not an admin).
On the mainframe side, I usually use a combination of FINDREF, IACULL, and CULL, which together form a sort of superpowered CSCOPE, but I'm not aware of a similar tool in the UNIX world other than things like CSCOPE (which are useful but rather basic in functionality).
Mainframe/UNIX Bit Twiddler and long time Windows/Linux Hobbyist.
The Theorem Theorem: If If, Then Then.
yeah, nano is so much better than pico!
(I have never noticed a difference, but then again all I use nano for is to make small changes to xorg.conf when I screw something up)
If you really want to read some code, print it with a good old fashioned chain printer on green bar paper formatted with "pr" for page numbers, and get a set of colored highlighters to mark useful things.
For code you want to browse through, hack up a little perl script that gives you page number references for each symbol and subroutine/function/method, and print that out on a separate stack.
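The page-reference script is only a few lines in any scripting language. A sketch in Python instead of perl, assuming a fixed number of source lines per printed page (60 here is an arbitrary default; match it to what your "pr" settings actually produce) and a symbol list you supply yourself. The function name `page_index` is made up.

```python
import re
from collections import defaultdict

def page_index(text, symbols, lines_per_page=60):
    """Map each symbol to the sorted printed-page numbers on which it appears."""
    patterns = {s: re.compile(r"\b" + re.escape(s) + r"\b") for s in symbols}
    pages = defaultdict(set)
    for lineno, line in enumerate(text.splitlines()):
        page = lineno // lines_per_page + 1   # pages are numbered from 1
        for symbol, pattern in patterns.items():
            if pattern.search(line):
                pages[symbol].add(page)
    return {symbol: sorted(found) for symbol, found in pages.items()}
```

Print the resulting table sorted by symbol name and you have the separate index stack.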
Go sit somewhere nice where you can relax, flip through your stack of green bar and sip your favorite caffeine, and life is good.
- "History shows again and again how nature points out the folly of men" -- Blue Oyster Cult, 'Godzilla'
I'm an emacs user. The reason I "corrected" pico was that it's not free. nano is a free replacement for it.
I write code.
You've already got a problem. I would recommend starting to document and refactor the code yourself. Ensure that all objects are self-contained, remove inter-object relationships and break complex objects down into multiple smaller objects. Document as you go and test constantly. Add tests if they don't already exist so that you know your refactoring didn't hurt anything--remember a refactor is a change that can be proven not to affect anything else in the system, so be meticulous!
If you are just looking for tools, I often load smaller packages into BlueJ just to get an idea of the layout--BlueJ does a good job of reliably breaking down a bunch of classes and building a UML diagram from them. I understand they will make this part of BlueJ an eclipse plug-in eventually.
I emailed them about OS X support, and they replied that they have an OS X version in beta, but won't be ready to release until late this year.
Just FYI for anybody else looking into it.
Comment of the year
As copyright owner of this comment, I authorize everyone to defeat any technological measure which limits access to it.
I use cscope (and vi) as they are both curses based. Vim has a cscope mode too!
I am surprised though that there wasn't a mention of Source Navigator http://sourcenav.sourceforge.net/
What is Source-Navigator?:
Source-Navigator is a source code analysis tool. With it, you can edit your source code, display relationships between classes and functions and members, and display call trees.
My favorite tool for the job is Source Insight.
It's a bit ugly (compared to, say, Visual Studio), but has plenty of useful features, including extensive searching capabilities, most importantly including the "search for cross-references" option. Also, a good idea there (that was copied in Visual Studio 2005) is the lower-pane. The window has a lower pane which also displays code, and it displays the last item you selected - e.g. some function has a variable which is a struct, you click on the struct, and the lower pane jumps to the definition of the struct. Same thing for functions, macros, and just about anything.
Another great feature is the ability to highlight any identifier. Have a big function, and you just want to trace through the value of a single variable to a certain point in the function? simple - highlight it, and every access to it is clearly visible.
Another significant feature I can think of is symbol browsing - say you want to jump to the function 'record_read', a single key away from you is the global symbol browsing menu, which includes all the global symbols in the source tree, and lets you do simple, real-time searching (by the time you finish typing 'record', only symbols with the word 'record' in them will appear).
The little things also make a difference - for example, the symbol browser knows naming conventions (if you type 'read_record', the results will contain 'record_read', and the same goes if you type 'readRecord' or 'RecordRead'), and tiny things like the fact that the size of parentheses changes according to the nesting level (the outermost are larger), so it's really easy to see at a glance what fits where.
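That naming-convention trick is easy to approximate: split each identifier into words and compare the word sets. A sketch in Python of one way it could work; I have no idea how Source Insight actually does it, the helper names are made up, and this version insists on exactly the same words rather than the looser matching a real symbol browser probably uses.

```python
import re

def words(identifier):
    """Split snake_case, camelCase or PascalCase into a sorted tuple of lowercase words."""
    parts = re.findall(r"[A-Z]+(?![a-z])|[A-Z]?[a-z]+|[0-9]+", identifier)
    return tuple(sorted(p.lower() for p in parts))

def find_symbols(query, symbols):
    """Return the symbols made of the same words as `query`, in any order or style."""
    key = words(query)
    return [s for s in symbols if words(s) == key]
```

With that, 'read_record', 'readRecord' and 'RecordRead' all reduce to the same key, so any of them finds 'record_read'.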
All-in-all, a great tool. Just give it a try.
There's a useful product called Agent Ransack that I use all the time to search through source code either for variable/member references or for common coding errors (e.g. '=' instead of '=='). http://www.mythicsoft.com/agentransack. Also supports regular expressions.
Since you are looking for code browsing, CodeWright makes your life easier. It has source code highlighting, with comments clearly distinguished, and it's easy to search for where a given class is used. It's your call anyway.
Try Source-Navigator - A former Red Hat product, it's open source and supports a host of programming languages including C++, runs on Windows or Linux (etc. etc.) - and it's fast as well.
While having tools to assist in the actual task of navigating through the codebase is important, a firm handle on the HOW is more critical.
http://c2.com/cgi/wiki?TipsForReadingCode
"Comprehension and Visualisation of Object-Oriented Code for Inspections" http://www.cis.strath.ac.uk/research/efocs/abstracts.html#EFoCS-33-98 section 5.
Demeyer, Serge. Ducasse, Stephane. Nierstrasz, Oscar. Object Oriented Reengineering Patterns ISBN: 1558606394
Feathers, Michael. Working Effectively with Legacy Code ISBN: 0131177052
Glass, Robert L. Facts and Fallacies of Software Engineering, section in Chapter 2 on Maintenance ISBN: 0321117425
Spinellis, Diomidis. Code Reading: The Open Source Perspective ISBN: 0201799405
And you can now use UndoDB -- extends gdb to enable you to step the code backwards as well as forwards. Great if you want to know how a particular piece fits in to the bigger picture. Also means no more "oops - I stepped too far; start again", which is nice.