Static Code Analysis Tools?
rewt66 asks: "We are looking for a good static analysis tool for a fairly large (half a million lines) C/C++ project. What tools do you recommend? What do you recommend avoiding? What experience (good or bad) have you had with such tools?"
1. If you have 500k lines in a single project, consider re-factoring it into separate libraries that you can divide and conquer. Also, if you have 500k lines of code, consider cleaning it up, re-factoring it, etc. Fewer lines of code is more impressive than more.
2. Google for David Wagner and David Molnar; they seem to be up on that sort of work.
Someday, I'll have a real sig.
For Java programming, I use FindBugs. I mostly use it through an Eclipse plugin.
A dyslexic man walks into a bra.
http://www.castsoftware.com/
I found the static analyser in SGI's Prodev Workshop to be quite excellent, though that was a while ago and I am comparing it with nothing, so I'm not sure how it stacks up against more recent offerings:
http://www.sgi.com/products/software/irix/tools/prodev.html#B
Looks like it's IRIX only though, so YMMV, to put it mildly.
Max.
I just started looking at LLVM, maybe it is good for what you want.
http://llvm.org/
If you are on Windows, you can use the native C++ static analysis that comes with the Windows SDK. Just add the /analyze switch when invoking the compiler (cl.exe).
It's the tool that MS uses to test its own code, known internally as PREfast.
It helped me find many bugs in other people's code.
India.
Work smarter, not harder.
I strongly suggest you look at Coverity.
They have excellent checks as well as the best framework for creating custom tests that I have ever come across.
NOTE: I am not affiliated with Coverity, just a very satisfied user.
LL
http://www.gimpel.com/html/lintinfo.htm/
I've never tried it on a code base as large as 500k. My guess is that I used it on up to 15k. I was very pleased with it. I agreed with just about every warning it raised, and was able to easily suppress individual instances or whole classes of errors. I also found it somewhat easier to get started with compared to the big tools from Rational et al.
I think it's a bit pricey for an open-source coder like me, but it should be cheap enough for a company with a tools budget.
wc project.c
Swedish plasma phys. PhD student; MSc EE; knows maths, programming, electronics; finance interest; seeks opportunities
...which, in a 500K LOC program, there may be a bit of, try the copy/paste detector, CPD. There's a chapter on CPD in my PMD book, too...
The Army reading list
http://www.splint.org/
END OF LINE
Whatever you use, make sure you adjust the settings to capture only those problems that you think are critical. With 500k lines of code, unless your codebase is *extremely* solid, running a Lint tool will result in a LOT of action items. I've used SPLINT (a lint for secure programming - http://www.splint.org/) in a project with a codebase much smaller than 500k and it took weeks to finish addressing all the issues. Sometimes these things can be more of a curse than a blessing.
I work on a C/C++ code base that is a lot bigger than 500k lines. I've worked with results produced by Klocwork and also with the output from Reasoning. Both of these services/packages will cost you money but both provide good insight into your code. The commercial packages generally produce more focused results with fewer false positives, so while they cost you money up front, your developers will spend less time weeding out the noise.
If paying money out for a commercial package isn't your thing, don't overlook the old standby lint or splint, an updated successor.
Also well worth investigating, to see how your code is actually running, are Valgrind and its associated tools. The Valgrind toolkit will give you a good idea of where memory is being leaked and where variables and pointers are going off the rails. Valgrind hooks into a running program, so it's important to make sure that you test all the corners of the codebase if you go this route.
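To illustrate the point about testing all the corners, here is a minimal sketch of a leak that hides on a rarely taken path. The function name and the limit parameter are made up for the example. A runtime checker can only report the leaky variant described in the comment if some test actually drives execution down the early return.

```cpp
#include <cstddef>
#include <cstring>
#include <memory>

// Hypothetical function: copies `text` into a scratch buffer and counts its
// spaces. Returns -1 when the input is longer than `limit`, which is the
// rarely exercised error path.
int count_spaces(const char* text, std::size_t limit) {
    const std::size_t length = std::strlen(text);

    // With a raw `new char[length + 1]`, the early return below would leak
    // the buffer unless it were preceded by delete[]. Owning the buffer
    // through unique_ptr releases it on every path.
    std::unique_ptr<char[]> scratch(new char[length + 1]);
    std::memcpy(scratch.get(), text, length + 1);

    if (length > limit) {
        return -1;  // the buffer is released here automatically
    }

    int spaces = 0;
    for (std::size_t i = 0; i < length; ++i) {
        if (scratch[i] == ' ') {
            ++spaces;
        }
    }
    return spaces;
}
```

A test suite that only ever calls this with short inputs would never reach the error path, so a leak there would stay invisible to a runtime tool no matter how good the tool is.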
Cheers,
Toby Haynes
Anything I post is strictly my own thoughts and doesn't necessarily have anything to do with the opinions of IBM.
A coworker of mine who's quite a C/C++ jockey used it recently (this month), and said it's still very good.
Why are you analyzing your code? What are you looking for? Performance optimizations? Security flaws? Bugs in general?
One important thing to consider is the set of compilers, tools, target systems, and build environments you are using. If you are using MS-only products then you will most likely have very good support, because almost all source code analysis suites will simply import the build information and you will be off and running right away. If your environment is Unix or embedded systems then things may be more difficult because you will need to hook into the build process somehow. The scanner tools usually intercept the CC command from a "make" build and call their back end using their custom processing rather than the compiler proper. Different products do this in different ways, so be sure the product you choose knows how to deal with your specific build environment. In my case I walked into another party's environment every time and needed to simulate a build for a build environment that I had never seen before. Not one environment ever looked like the next, so the setup and configuration was always a big challenge, just to get started.
Prexis is primarily a tool for life cycle scanning of source code for security issues. There are two ways to perform the code scanning: with the main engine component, which can schedule nightly scans and track progress over time, or with the additional Prexis Pro utility, which is designed for quick assessments by engineers on their own code without logging everything into the main database. The Pro tool worked best for my code assessments since I had no need for tracking changes over time, and it was a little easier to configure, which counts for a lot in my situation.
PolySpace is a completely different tool with a different purpose from Prexis. PolySpace attempts to mathematically discover runtime flaws in the code while only using static analysis to do so. It does a great job on smaller projects, but because of the complexity and thoroughness of its analysis, it is somewhat slow. PolySpace needs to evaluate an entire application all at once in order to do a good analysis. If your .5 MSLOC of code is many separate programs/executables then you will be fine, but if you are talking about one huge monolithic application then you may have to evaluate it in chunks which just increases the false positives and forces the engineer to do more manual chasing of details to determine if the issue is really a problem or not. From what I have seen this product is in a class by itself.
BTW - keep your eyes on this site: http://samate.nist.gov/index.php/Main_Page
I work on a commercial static analysis tool called CodeSonar. It costs money, but we do offer free trials.
Our major competitors in this space are Coverity and Klocwork.
All three tools can (to some extent) infer how a program will behave at run-time, so they find more subtle bugs than tools that just look for suspicious patterns in your code.
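As a rough illustration of the difference, here is a sketch of the kind of defect that depends on which path execution takes. The function and its parameters are hypothetical, and this is not a claim about what any particular product reports. No single line looks suspicious, but one combination of inputs leaves the pointer null.

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical helper: returns the length of the label that would be shown.
std::size_t label_length(const char* name, bool use_default) {
    const char* label = nullptr;
    if (use_default) {
        label = "default";
    } else if (name != nullptr) {
        label = name;
    }

    // On the path use_default == false and name == nullptr, label is still
    // null here. A pattern-based check sees nothing odd about strlen(label);
    // a tool that follows the paths through the function can notice that one
    // of them reaches the call with a null pointer.
    if (label == nullptr) {
        return 0;  // the guard that the buggy variant would be missing
    }
    return std::strlen(label);
}
```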
I have used it sometimes, and I have noticed that in some cases the version from CVS is better than the released version (but, as always, your mileage may vary).
For C++ it's a lot harder, but the programming rules for C++ and the compilers are a bit stricter too, so you may be helped there.
To make things worse (or better, depending on how you see it :-) ) you can always take a look at PurifyPlus from IBM. It contains three components: Purify, which checks at run time for memory leaks and illegal memory access; Quantify, which checks for performance bottlenecks; and PureCoverage, which checks that all parts of your code have actually been executed during your tests.
C++ is also a lot harder to do static checking on because it has inheritance and still allows a lot of features from C, so an object can be passed around in a perfectly legal manner and still be hiding from the syntax checker. That leaves openings for really strange bugs if someone decides to do "smart programming".
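One way to picture this, as a sketch with made-up class names: an object travels through the code as a pointer to its base class, and somewhere a downcast assumes a derived type. The unchecked form compiles without complaint even when the assumption is wrong.

```cpp
#include <string>

struct Shape {
    virtual ~Shape() = default;
    virtual std::string name() const = 0;
};

struct Circle : Shape {
    double radius = 1.0;
    std::string name() const override { return "circle"; }
};

struct Square : Shape {
    double side = 2.0;
    std::string name() const override { return "square"; }
};

// static_cast<const Circle*>(shape) would compile for any Shape*, including
// one that really points at a Square, and using the result would then be
// undefined behavior. dynamic_cast checks the real type at run time and
// yields nullptr on a mismatch.
const Circle* as_circle(const Shape* shape) {
    return dynamic_cast<const Circle*>(shape);
}

// Returns the radius for circles and 0.0 for every other shape.
double radius_or_zero(const Shape* shape) {
    const Circle* circle = as_circle(shape);
    return circle != nullptr ? circle->radius : 0.0;
}
```

The compiler accepts both casts, so whether the unchecked one is safe depends on what every caller actually passes in. That is the kind of question that is hard to answer from syntax alone.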
If builders built buildings the way programmers wrote programs, then the first woodpecker would destroy civilization.
Regardless of what tool you select, you will have to decide what rules you want to apply and what you are trying to get out of using the tool. If management doesn't understand the purpose of the tools, they may make inappropriate decisions on how to use them. As an example, I worked on a large project, (hundreds of developers), and management decided that we needed to use a static analysis tool and that code had to be "clean" before it could be checked in. It was phased in, so we had a month to eliminate errors, and another month to eliminate warnings from our entire code base, but management wanted to use the entire ruleset and not allow the comments in the code that told the tool to ignore certain rules for certain lines.
Fortunately, we had a committee that was responsible for testing the tool and integrating it into our software engineering process, with a few people that management really listened to. After a few meetings, the committee was allowed to determine the ruleset that would be used and updated the rules for code inspections so the "ignore" comments were allowed, but had to be included as part of an inspection.
If we hadn't had a strong committee that got management to relax the rules, life would have been a living hell trying to alter the code to make the tool happy. If 4 people can agree that the best thing to do is break a rule, you should trust them. If you can't trust them, then you shouldn't have them working for you. Remember, tools are dumb and don't understand the "why" behind the code. Yes, tools will find a lot of things that should be fixed, but they aren't always right.
Reading code is like reading the dictionary - you have to read half of it before you can go back and understand it.
Would you consider SAP an application? Or any other ERP system? I would.
Shameless commercial plug here... I'm the CTO of Klocwork (www.klocwork.com), a vendor of source code analysis tools. We provide security vulnerability and implementation defect checking for C, C++ and Java. In addition, as others on this thread have stated, you're going to want to look at refactoring, architectural analysis, rule tuning, metrics, trends, all the usual stuff and all of which we supply as part of our enterprise suite of products. Check your supplier list carefully as all of the companies in this space offer different subsets of the whole. There's a decent page on Wikipedia on static analysis that mentions the prevalent tools in this space, including our major competitors. Last point: be careful to try before you buy (whether "buy" involves money or not), as all tools are not created equal.
I used their tools for a large project (probably 100k) that had spiraled out of control and needed some major restructuring. You use their compiler to build your code, and it gathers lots and lots of information. We used it to analyze all the connections between the various modules/files, but it will also give you many different metrics. We also used their GUI to restructure the existing code base visually, and to see and manage all the interactions. Very useful, and a nice, friendly small company. http://www.headwaysoftware.com/index.php
If your project has 500K lines of C/C++, it will almost certainly fail.
"Not an actor, but he plays one on TV."
I am an employee of Klocwork.
If you are researching this for your enterprise I suggest you evaluate Klocwork (and its competitors: Coverity, Grammatech, Parasoft; there are others). We handle large-scale C/C++ projects. Our own codebase is much larger than yours, and we run Klocwork in-house to track defects in our own code on a daily basis and on developer desktops for subprojects. In fact we have successfully handled mammoth projects as big as 10M lines of code and beyond (but frankly, it gets rather tricky at that point).
We do have a product for individual developers and small shops, but for now it is Java only.
my sstream of consciousness
You can use the free tool C Advise on HPUX to run static analysis of C and/or C++ code. It's pretty good. I think you have to be a DSPP member to download it, but registration is free.
From their marketing blurb...
Understand, our flagship product, helps thousands of companies maintain impossibly large or complex amounts of source code. It parses source code for reverse engineering, automatic documentation, and calculating code metrics. We have versions for Ada 83, Ada 95, FORTRAN 77, FORTRAN 90, FORTRAN 95, Jovial, K&R C, ANSI C and C++, Delphi, and Java. Multi-million SLOC projects are common with our users.
from Parasoft Corporation. Statically tests functions for 50 cases.
I prefer the Insure++ product myself. It really helps in finding bugs.
IT IS NOT FREE.
I can program myself out of a Hello World Contest!!
You didn't mention what platform you're building on, and you also didn't mention exactly what kind of analysis you want to perform.
If you're on Windows, the latest Visual Studio C/C++ compilers include a pretty good (but basic) code analysis tool built in. Just use the /analyze flag to cl.exe.
Moderator hint: a comment is neither "Flamebait" nor "Troll" if it is true.
Fortify is a security static scanner that covers C/C++ as well as Java, JSP, .NET, C#, XML, CFML, PL/SQL and T-SQL.
We're on an embedded system with several CPUs.
One of the CPUs is running Linux. This code we compile with gcc on a Linux box.
Another CPU is running ThreadX. We cross-compile this on Windows using the Green Hills compiler.
A couple other CPUs run Nucleus OS. These are also cross-compiled on Windows using the Green Hills compiler.
We have gotten evaluations of Klocwork and Coverity (and I've probably said enough here for them to figure out who we are). And they do good stuff, too. But I'm trying to look around to see what else is out there, since Coverity especially is pretty pricey, and Klocwork didn't give us a long enough demo for us to really evaluate how well their tool found the kind of issues we are looking for.
As someone (CTO of Klocwork?) said earlier, try before you buy...
I don't think it's publicly downloadable, but IBM has a tool called BEAM. http://www.research.ibm.com/da/beam.html
The results are okay from BEAM. Maybe you can submit a comment (see bottom of the page) to request use of the tool.
I've tried to use splint with mixed results. The default warnings were rather verbose, and most of them were unimportant.
Another open source project ran Coverity over our source code and sent us a summary of the results. The noise to signal ratio seemed better for BEAM and Coverity than splint. Though it's possible that the person running Coverity over our code turned off some warnings by default.
If you're interested in runtime tools in addition to static analysis tools, IBM Rational Purify (commercial) and valgrind (free) work fairly well. Each has its own issues. Both occasionally give false positives. The valgrind tool comes with many Linux distributions, but it's not really there for Windows.
Of course, those tools don't work effectively unless you have a good test suite. IBM Rational PureCoverage helps you discover where your test suite needs more code coverage. Our open source project aims for 100% API coverage and >85% overall line coverage for each release. While it's difficult to get some error conditions exercised and tested by the test suites, it's well worth it in the long run. It's satisfying to fix a bug, add a test for the bug, and see that your fix didn't break any of the other tests. You might also be able to use the gcc profile option to get functionality similar to PureCoverage, but it won't generate a summary graph to tell you if you're meeting your code coverage goals.
The static analysis tools are good to test for things that you didn't think about testing, or didn't have time to test. None of these tools solve all your software stability problems, but they greatly improve the stability. Turning on compiler warnings for your application will find some minor issues, but the static analysis tools make it easier to find bugs that only appear if you analyze both the called function and the function caller.
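A small sketch of a bug that only shows up when caller and callee are read together; the types and function names are invented for the example. Each function is reasonable in isolation, and the problem lives in the contract between them.

```cpp
#include <string>
#include <vector>

struct Account {
    std::string owner;
    int balance;
};

// Callee: returns nullptr when no account matches. Nothing wrong here.
const Account* find_account(const std::vector<Account>& accounts,
                            const std::string& owner) {
    for (const Account& account : accounts) {
        if (account.owner == owner) {
            return &account;
        }
    }
    return nullptr;
}

// Caller: if it dereferenced the result without the check below, it would
// still compile cleanly with full warnings. Only an analysis that knows
// find_account can return nullptr would see the null dereference.
int balance_of(const std::vector<Account>& accounts, const std::string& owner) {
    const Account* account = find_account(accounts, owner);
    if (account == nullptr) {
        return 0;
    }
    return account->balance;
}
```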
(Full disclosure: I work on open source software at IBM)
Linux kernel developers use this:
http://www.kernel.org/pub/software/devel/sparse/
If you are like all projects I have seen, you haven't turned on the relevant compiler switches for ANSI/ISO compliance and full warnings. Do that first.
Second, get yourselves a few more compilers. If you use gcc, fetch the latest version. It doesn't matter if you can run the compiled code.
Third, write type-safe code. If it's really C++ code you are writing, disable C-style casts and see how many of those monstrosities (from a C++ point of view) you use.
After you are done, look for linters (aka static analysis tools). One weak spot of C and C++ compilers is that they work on the translation unit level; they don't see the complete source code at once.
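On the third step, a minimal sketch of why C-style casts are worth hunting down (the function name is made up). With g++, the -Wold-style-cast warning flags them in C++ code. The point of the example is that a C-style cast can quietly do more than you asked for, while a named cast refuses.

```cpp
#include <cstddef>

// Sums the bytes of a read-only buffer.
unsigned byte_sum(const void* data, std::size_t size) {
    // C-style form:  (unsigned char*)data
    // It compiles, but silently casts away const as a side effect.
    // static_cast is not allowed to drop const, so spelling the conversion
    // this way keeps the buffer read-only and makes the intent visible.
    const unsigned char* bytes = static_cast<const unsigned char*>(data);

    unsigned total = 0;
    for (std::size_t i = 0; i < size; ++i) {
        total += bytes[i];
    }
    return total;
}
```

Once the casts are named, the dangerous ones (const_cast, reinterpret_cast) are also easy to find by searching the source.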
Last time I heard, Polyspace didn't do C++ -- just C and some random toy language (Java or Ada?). Cool but extremely expensive.
C++test from Parasoft has static analysis and automatic unit test generation. But you should always try before you buy.
Some related C++ analysis tools for Visual Studio may also be of interest to you, IncludeManager and StyleManager: https://secure.profactor.co.uk/products.php