Do Static Source Code Analysis Tools Really Work?
jlunavtgrad writes "I recently attended an embedded engineering conference and was surprised at how many vendors were selling tools to analyze source code and scan for bugs, without ever running the code. These static software analysis tools claim they can catch NULL pointer dereferences, buffer overflow vulnerabilities, race conditions and memory leaks. I've heard of Lint and its limitations, but it seems that this newer generation of tools could change the face of software development. Or, could this be just another trend? Has anyone in the Slashdot community used similar tools on their code? What kind of changes did the tools bring about in your testing cycle? And most importantly, did the results justify the expense?"
They're not perfect, and won't catch everything, but they do work. Combined with unit testing, they can get you a very low bug rate. Many of these (for Java, at least) are open source, so the expense is negligible.
It's another tool in the toolbox. However, the results are not necessarily easy to understand or simple to fix. For example, see the recent SSL library issue, which exhibited minimal randomness due to someone "fixing" an (intended) uninitialized memory area.
Here at IBM we have an internal tool from research that does static code analysis.
It has found some real bugs that are hard to generate a testcase for. It has also found a lot of things that aren't bugs, just like -Wall can. Since I work in the virtual memory manager, a lot more of our bugs can be found just by booting, compared to other domains, so we didn't get a lot of new bugs when we started using static analysis. But even one bug prevented can be worth multiple millions of dollars.
My experience is that, just like enabling compiler warnings, any way you have to find a bug before it gets to a customer is worth it.
Terrorist, bomb, al Qaeda, nuclear, yellowcake, kill, assassinate. Carnivore is dead... long live Echelon.
If I remember correctly, one of these companies donated their tool to many open source projects, including Linux and the BSDs. I think it led to a wave of commits as 'bugs' were fixed. It seemed like a pretty good endorsement to me...
"Here Lies Philip J. Fry, named for his uncle, to carry on his spirit"
I use PC-lint religiously for my embedded code. In my opinion it has the most bang for the buck. It is fast, cheap and reliable. I've found probably thousands of issues and potential issues over the years using it.
I've also used Polyspace. In my opinion, it is expensive, slow, can't handle some constructs well and has a *horrible* signal to noise ratio. There is also no mechanism for silencing warnings in future runs of the tool (like the -e flag in lint). On the other hand, it has caught a (very) few issues that PC-Lint missed. Is it worth it? I suppose it depends on whether you are writing systems that can kill people if something goes wrong.
Static analysis does catch a lot of bugs. Mind you, it's no silver bullet, and frankly it's better, given the choice, to target a language+environment that doesn't suffer problems like dangling pointers in the first place (null pointers, however, don't seem to be anything Java or C# are really interested in getting rid of).
Even lint is decent -- the trick is just using it in the first place. As for expense, if you have more than, oh, 3 developers, they pay for themselves by your first release. Besides, many good tools such as valgrind are free (valgrind isn't static, but it's still useful).
Yes, some static analysis tools really work. FindBugs works well for Java. Fortify has had good success finding security vulnerabilities. These tools take static checking just a step beyond what's offered by a compiler, but in practice that's very useful.
The best thing these tools can do is to tell everyone what they probably already know -- that a particular coder or group of coders is responsible for a whole ton of the errors in the code. I think it'd be much better to move that coder to some other part of the company ... it would be way cheaper than trying to fix all their bugs.
I am presently working on an update to static analysis tools. Static analysis tools are not a silver bullet but they are still relevant. Look at them as a starting point in your search for programming problems. A lot of potential anomalies can be detected, like the use of uninitialized variables. Of course, a good compiler can use these tools as part of the compilation process. However, there are many things that a static analyzer can't detect. For this, you need some way to do dynamic analysis (execution-based testing). As such, the tools we are developing also include dynamic testing.
You will probably be amazed at what you will catch with static analysis. No, it's not going to make your program 100% bug-free (or even close), but every time I see code die on an edge case that would've been caught with static analysis, it makes me want to kill a kitten (and I'm totally a "cat person" mind you).
Static analyzers will catch the stupid things - edge cases that fail to initialize a var, but then lead straight to de-referencing it; memory leaks on edge-case code paths, etc. that shouldn't happen but often do, and get in the way of finding real bugs in your program logic.
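To make the "fails to initialize a var on an edge case" point concrete, here is a minimal sketch in Python (chosen only for brevity; the function names and the "host:port" input format are made up for illustration). The first function binds its variable on only one branch, which is the kind of path a static checker is designed to flag; ordinary testing only notices it if someone happens to feed in the edge-case input.

```python
def parse_port(text):
    """Return the port number from a 'host:port' string (buggy version)."""
    if ":" in text:
        port = int(text.split(":", 1)[1])
    # Edge case: when there is no ':' the name 'port' was never bound,
    # so the return below raises UnboundLocalError, but only on that path.
    return port


def parse_port_fixed(text, default=80):
    """Same idea, with the variable bound on every path."""
    port = default  # the default value is an assumption for this sketch
    if ":" in text:
        port = int(text.split(":", 1)[1])
    return port
```

The buggy version works for every input that contains a colon, which is why this sort of thing survives a test suite built from "normal" inputs.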
Such tools work in a very similar way to what is already being done in many modern language compilers (such as javac). Basically, they implement semantic checks that verify whether the program makes sense, or is likely to work as intended in some respect. For example, they will check for likely security flaws, memory management/leaking or synchronisation issues (deadlock, access to shared data outside critical sections, etc.), or other kinds of checks that depend on whatever domain the tool is intended for.
:)
It would probably be more useful if you could state which kind of problem you are trying to solve and which tools you are considering buying. That way, people who have experience with them could suggest which work best.
Every expression is true, for a given value of 'true'
I forgot to answer your other question.
Since we've had the tool for a while and have fixed most of the bugs it has found, we are required to run static analysis on new code for the latest release now (i.e. we should not be dropping any new code that has any error in it found via static analysis).
Just like code reviews, unit testing, etc., it has proved useful and was added to the software development process.
Ya can't beat a good "Lint party" after all the testing is done! You'll find all kinds of cool stuff that slipped through your testing suites.
However, static code analysis is just one part of the bug-finding process. For example, in your list, in my limited experience, I have found that buffer overflows and NULL pointer derefs get spotted really well. Race conditions? Memory leaks? Hmm. Not so good.
YMMV. Don't expect magic. Oh to hellwithit, just let the end-users test it *ow!*
The Astrée static analyser (based on abstract interpretation) proved the absence of run-time errors in the primary control software of the Airbus A380.
Add me to the Yes column
We use them (PMD and FindBugs) for eliminating code that is perfectly valid, yet has bitten us in the past. Two Java examples are unsynchronized access to a static DateFormat object and using the commons IOUtils.copy() instead of IOUtils.copyLarge().
Most tools are easy to add to your build cycle and repay that effort after the first violation.
I took a very cool graduate-level class at MIT from Dr. Michael Ernst about this very subject. Check out some of the projects listed at http://groups.csail.mit.edu/pag/.
While not the be-all-and-end-all of code quality metrics, VS2008/Team Foundation Server has this built-in now so you can stop developers checking in completely junk code if you so wish - http://blogs.msdn.com/fxcop/archive/2007/10/03/new-for-visual-studio-2008-code-metrics.aspx
FxCop has gone server-side too (for those familiar with .Net development). It takes one experienced dev to customise the rules, and you've got a fairly decent protection scheme against insane code commits.
throw new NoSignatureException();
Yes, static code tools do work well for finding certain classes of issues. However, they are not a panacea. They do not understand the semantics that are intended and cannot effectively replace code reviews.
The more things change, the more they stay the same. Static code analysis tools are just like smarter compilers, better language libraries, new-and-improved software methodologies, high-level dynamic languages, modern IDEs, automated unit test runners, code generators, document tools and any number of other software tools that have shown up over the past few decades.
Yes, static code analysis can help improve a team's ability to deliver a high-quality product, if it is embraced by management and its use is enforced. No, it will not change the face of software development, nor will it turn crappy code into good code or lame programmers into geniuses. At best, when engineers and management agree this is a useful tool, it can do almost all the grunt work of code cleanup by showing exactly where problem code is and suggesting extremely localized fixes. At worst, it will wind up being a half-assed code formatter since nobody can agree on whether the effort is necessary.
Just like all good software-engineering questions, the answer is 'it depends'.
I'm disabling ads until because I choose not to reward redesigns that are less usable than "view source".
In my own corner of the world (.NET Compact Framework 2.0 on old, arcane hardware), they certainly don't. Each time I get optimistic and search for new or previously-missed static analysis tools, all roads end up leading to FxCop. Horrible signal-to-noise ratio, and a relatively small number of real detectable problems. That said, I'm always willing to submit myself to the genius of the slashdot masses. If you know of a great one, feel free to let me know. = )
At Symantec, I used to use these tools to help plan tests. I wrote a simple code velocity tool that monitored Perforce checkins and generated code velocity graphs and alerts in different components as time passed. With it, QA could easily see which code was being touched the most and dig down to the specific changelists and see what was going on. It really helped keep good visibility on what needed the most attention and helped everyone avoid being 'surprised' by someone dropping a bunch of changes into an area that wasn't watched carefully. During the final days of development before our products escaped to manufacturing, this provided vital insight into what was happening.
I've since moved on, and I think the tool has since gone offline, but I think there's a real value to doing static analysis as part of the planning for everything else.
Is this what you were talking about?
http://it.slashdot.org/article.pl?sid=08/01/11/1818241
- doug
I've also used Polyspace. In my opinion, it is expensive, slow, can't handle some constructs well and has a *horrible* signal to noise ratio.
The signal-to-noise ratio is pretty horrendous in most static analysis tools for C and C++, IME. This is my biggest problem with them. If I have to go through and document literally thousands of cases where a perfectly legitimate and well-defined code construct should be allowed without a warning because the tool isn't quite sure, I rapidly lose any real benefit and everyone just starts ignoring the tool output. Things like Lint's -e option aren't much good as workarounds either, because then even if you're hiding an issue that might be a phantom problem today, you'll still be hiding it if it becomes a real problem tomorrow. :-(
If you disagree, post your argument. (-1, Overrated) isn't your personal censorship tool for views you don't like.
We have had presentations from both Coverity and Klocwork at my workplace. I'm not entirely fond of them, but they're wayyyyy better than 'lint'. :) I much prefer using "Purify" whenever possible, since run-time analysis tends to produce fewer false-positives.
My comments would be:
(1) Klocwork & Coverity tend to produce a lot of "false positives". And by a lot, I mean, *A LOT*. For every 10000 "critical" bugs reported by the tool, only a handful may be really worth investigating. So you may spend a fair bit of time simply weeding through what is useful and what isn't.
(2) They're expensive. Coverity costs $50k for every 500k lines of code per year... We have a LOT more code than this. For the price, we could hire a couple of guys to run all of our tools through Purify *and* fix the bugs they found. Klocwork is cheaper; $4k per seat, minimum number of seats.
(3) They're slow. It takes several days running non-stop on our codebase to produce the static analysis databases. For big projects, you'll need to set aside a beefy machine to be a dedicated server. With big projects, there will be lots of bug information, so the clients tend to get bogged down, too.
In short: It all depends on how "mission critical" your code is; is it important, to you, to find that *one* line of code that could compromise your system? Or is your software project a bit more tolerant? (e.g., If you're writing nuclear reactor software, it's probably worthwhile to you to run this code. If you're writing a video game, where you can frequently release patches to the customer, it's probably not worth your while.)
I'll probably get modded to hell for asking but seriously -- all these new trends, tools, etc - are they not just crutches, which in the long run are seriously going to diminish the quality of output by programmers?
For instance, we put men on the moon with a pencil and a slide rule. Now no one would dream of taking a high school math class with anything less than a TI-83+.
Languages like Java and C# are being hailed while languages like C are derided and many posts here on slashdot call it outmoded and say it should be done away with, yet Java and C# are built using C.
It seems to me that there is no substitute for actually knowing how things work at the most basic level and doing them by hand. Can a tool like Lint help? Yes. Will it catch everything? Likely not.
As generations of kids grow up with the automation made by generations who came before, and have less incentive to learn how the basic tools work, an incentive which will diminish, approaching 0, I think we're in for something bad.
As much as people bitch about kids who were spoiled by BASIC, you'd think that they'd also complain about all the other spoilers. Someday all this new, fancy stuff could break and someone who only knows Java, and even then checks all their source with automated tools will likely not be able to fix it.
Of course, this is more of just a general criticism and something I've been thinking about for a few weeks now. Anyway, carry on.
Static analysis is another tool in the toolbox. It's a great indicator of overall code quality and of the care taken by a developer, which may predict code quality during dynamic testing.
"You can put lipstick on a pig, but it's still a pig".
"You can lint check bad code and add comments, but it's still bad code".
At least with the static tools run early in the development process, you can identify the code pigs and make a decision to rebuild parts or team up an experienced developer with a new one. Using RegEx tools like AWK, you can even build your own static analysis tools. We did this for Y2K checking some years ago and will probably need to do it again in 2036.
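As a rough idea of what a home-grown, regex-based checker of that kind can look like, here is a sketch in Python rather than AWK. The rule itself is an assumption for illustration (flag any line that uses the two-digit-year format code `%y`); the original Y2K scripts are not shown in this thread and this is not a reconstruction of them.

```python
import re

# Hypothetical rule: the two-digit-year format code "%y" is suspicious.
# The match is case-sensitive, so the four-digit "%Y" is not flagged.
TWO_DIGIT_YEAR = re.compile(r"%y")


def scan_source(text):
    """Return (line_number, stripped_line) pairs for lines matching the rule."""
    hits = []
    for number, line in enumerate(text.splitlines(), start=1):
        if TWO_DIGIT_YEAR.search(line):
            hits.append((number, line.strip()))
    return hits


if __name__ == "__main__":
    sample = 'a = strftime("%d/%m/%y", t)\nb = strftime("%d/%m/%Y", t)\n'
    for number, line in scan_source(sample):
        print(f"line {number}: {line}")
```

Because it only looks at text, a checker like this will also fire on comments and unrelated strings, so every hit still needs a human to look at it.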
You actually need to tolerate a number of false positives in order to get good coverage of the true bugs. That means you have to follow-up on every report in detail and understand it.
However these things do work and are highly recommended. If you use other advanced techniques (like Design by Contract), they will be a lot less useful though. They are best for traditional code that does not have safety nets (i.e. most code).
Stay away from tools that do this without using your compiler. I recently evaluated some static analysis tools and found that the tools that do not use the native compilers can have serious problems. One example was an incorrectly set symbol in the internal compiler of one tool, which could easily change the code functionality drastically. Use tools that work from a build environment and utilize the compiler you are using to build.
Most ACs are not even worth the keystrokes to insult them. Be generically insulted by this and ignored otherwise.
I've used the above two tools ... the IntelliJ IDEA IDE for Java development, and the Visual Studio plug-in Resharper for C# development ... and can't imagine living without them.
Of course, they provide a heck of a lot more than just static code analysis, but the ability to see all syntax errors in real time, and all logic errors (like potential null-references, dead code, unnecessary 'else' statements, etc, etc) saves way too much time, and has, in my experience, resulted in much better, more solid code. When you add on all the intelligent refactoring, vastly improved code navigation, and customizable code-generation features of these utilities, it's a no-brainer.
I wouldn't program without them.
- Spryguy
There are three kinds of people in this world: those that can count and those that can't
FindBugs is becoming increasingly widespread on Java projects, for example. I found that between it and JLint I could identify a substantial chunk of problems caused by inexperienced programmers, poor design, hastily written code, etc. JLint was particularly nice for potential deadlocks, while FindBugs was good for just about everything else.
For example:
At least in the Java world, I wish more people would use them. It would make my job so much easier.
My experience in the Python world is that pylint is less interesting than FindBugs: many of the more interesting bugs are hard problems in a dynamically typed language and so it has more "religious style issues" built in that are easier to test for. It still provides a great deal of useful output once configured correctly, and can help enforce a consistent coding standard.
Integrate Keynote and LaTeX
I have used static analysis as part of our build process on our Continuous Integration machines and it's definitely worth your time to set it up and use it. We use FindBugs with our Java code and have it output HTML reports on a nightly basis. Our team lead comes in early in the morning, peruses them and marks each issue as either "Suppress" or fix. We shoot for zero bugs, either through suppressing them if they aren't bugs or by fixing them. FindBugs doesn't give too many false positives so it works great.
Could this be just another trend?
I don't worry about what's "trendy" or not. Just give the tool a shot in your group and see if it helps/works for you or not. If it does, keep using it; otherwise abandon it.
What kind of changes did the tools bring about in your testing cycle?
We use it _before_ the test cycle. We use it to catch mistakes such as "Whoops! Dereferenced a pointer there, my bad" before going into the test cycle.
And most importantly, did the results justify the expense?
Absolutely. The startup cost of adding static analysis for us was one developer for 1/2 a day to setup FindBugs to work on our CI build on a nightly basis to give us HTML reports. After that, the cost is our team lead to check the reports in the morning (he's an early riser) and create bug reports based on them to send to us. Some days there's no reports, other days (after a large check-in) it might be 5-10 and about an hour of his time.
It's best to view this tool as preventing bugs, synchronization issues, performance issues, you-name-it issues before the code goes into the hands of testers. But you can also extend several of the tools, like FindBugs, to add new static analysis test cases. So if a tester finds a common problem that affects the code you can go back and write a static analysis case for that, add it to the tool and the problem shouldn't reach the tester again.
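FindBugs has its own Java mechanism for custom detectors, which is not shown here. Purely to illustrate the idea of turning a recurring tester complaint into a new static check, here is a small Python analogue built on the standard `ast` module. The rule (flag `except:` clauses that name no exception type) and the function name are assumptions picked for the example.

```python
import ast


def find_bare_excepts(source):
    """Return the line numbers of 'except:' clauses that name no exception type."""
    tree = ast.parse(source)
    return sorted(
        node.lineno
        for node in ast.walk(tree)
        # An ExceptHandler whose 'type' is None is a bare 'except:'.
        if isinstance(node, ast.ExceptHandler) and node.type is None
    )


if __name__ == "__main__":
    sample = (
        "try:\n"
        "    risky()\n"
        "except:\n"
        "    pass\n"
    )
    print(find_bare_excepts(sample))
```

Unlike a text search, this works on the parsed syntax tree, so comments and string literals that merely contain the word "except" are not reported.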
We use them more for optimizing code than anything else... The biggest problem we see is that there are often false positives... A senior person can easily look at recommendations and pick what's needed... A junior person, not so much, which we learned the hard way...
My experience has been that in the hands of people who know what they're doing, they're a nice tool to have. But beware managers using their output as metrics. And beware even more a consultant with such a tool that he doesn't even understand.
The thing is, these tools produce
A) a lot of "false positives", code which is really OK and everyone understands why it's OK, but the tool will still complain, and
B) usually some metrics of dubious quality at best, to be taken only as a signal for a human to look at it and understand why it's OK or not OK.
E.g., one such tool, which I had the misfortune of sitting through a salesman hype session of, seemed to be really little more than a glorified grep. It really just looked at the source text, not at what's happening. So for example if you got a database connection and a statement in a "try" block, it wanted to see the close statements in the "finally" block.
Well, applied to an actual project, there was a method which just closed the connection and the statements supplied as an array. Just because, you know, it's freaking stupid to copy-and-paste cute little "if (connection != null) { try { connection.close(); } catch (SQLException e) { // ignore }}" blocks a thousand times over in each "finally" block, when you can write it once and just call the method in your finally block. This tool had trouble understanding that it _is_ all right. Unless it saw the "connection.close()" right there, in the finally block, it didn't count.
Other examples include more mundane stuff like the tools recommending that you synchronize or un-synchronize a getter, even when everyone understands why it's OK for it to be as it is.
E.g., a _stateless_ class as a singleton is just an (arguably premature and unneeded) speed optimization, because some people think they're saving so much by a singleton instead of the couple of cycles it takes to do a new on a class with no members and no state. It doesn't really freaking matter if there's exactly one of it, or someone gets a copy of it. But invariably the tools will make an "OMG, unsynchronized singleton" fuss, because they don't look deep enough to see if there's actually some state that must be unique.
Etc.
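The "glorified grep" false positive with the close-in-finally rule is easy to reproduce. What follows is a toy sketch in Python, not the vendor's tool and not Java: a purely textual check (the function name, the sample code and the helper `close_quietly` are all invented for illustration) that insists on seeing a literal `.close()` inside each `finally` block and therefore complains about code that delegates the cleanup to a helper.

```python
def naive_finally_check(source):
    """Flag each 'finally' block whose text lacks a literal '.close()' call.

    This mimics a text-only checker: it never follows calls into helper
    functions, so it cannot tell that a helper does the closing.
    """
    warnings = []
    lines = source.splitlines()
    for index, line in enumerate(lines):
        if line.strip() != "finally:":
            continue
        indent = len(line) - len(line.lstrip())
        body = []
        for following in lines[index + 1:]:
            depth = len(following) - len(following.lstrip())
            if following.strip() and depth <= indent:
                break  # dedent: the finally block has ended
            body.append(following)
        if not any(".close()" in text for text in body):
            warnings.append(index + 1)  # 1-based line number
    return warnings


# Hypothetical code under analysis: cleanup is correct, but delegated.
SAMPLE = '''
def close_quietly(*resources):
    for resource in resources:
        if resource is not None:
            try:
                resource.close()
            except Exception:
                pass

def query(pool):
    connection = None
    try:
        connection = pool.get()
        return connection.run("SELECT 1")
    finally:
        close_quietly(connection)
'''

if __name__ == "__main__":
    print(naive_finally_check(SAMPLE))
```

The sample closes its connection on every path, yet the check still reports the `finally` block, which is exactly the sort of warning that then gets counted as a "defect" in somebody's chart.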
Now taken as something that each developer understands, runs on his own when he needs it, and uses his judgment of each point, it's a damn good thing anyway.
Enter the clueless PHB with a metric and chart fetish, stage left. This guy doesn't understand what those things are, but might make it his personal duty to chart some progress by showing how many fewer warnings he's got from the team this week than last week. So useless man-hours are spent on uselessly morphing perfectly good code into something that games the tool. For each 1 real bug found, there'll be 100 harmless warnings that he makes it his personal mission to get out of the code.
Enter the snake-oil vendor's salesman, stage right. This guy only cares about selling some extra copies to justify his salary. He'll hype to the boss exactly the possibility to generate such charts (out of mostly false positives) and manage by such charts. If the boss wasn't already in a mind to do that management anti-pattern, the salesman will try to teach him to. 'Cause that's usually the only advantage that his expensive tool has over those open source tools that you mention.
I'm not kidding. I actually tried to corner one into;
Me: "ok, but you said not everything it flags there is a bug, right?"
Him: "Yes, you need to actually look at them and see if they're bugs or not."
Me: "Then what sense does it make to generate charts based on wholesale counting entities which may, or may not be bugs?"
Him: "Well, you can use the charts to see, say, a trend that you have less of them over time, so the project is getting better."
Me: "But they may or may not be actual bugs. How do you know if this week's mix has more or less actual bugs than last weeks, regardless of wh
A polar bear is a cartesian bear after a coordinate transform.
* I really like Insure, but it is difficult to set up on a system composed of many shared libraries. However, there are some bugs that really need run-time analysis to catch.
Another good point for using lint is that after a while a programmer learns the way, and the outcome is better code in a shorter time. Of course I also found that there are a few ways to avoid lint errors/warnings that lead to some very ugly bugs.
How about a tool that will tell me if my program will
eventually halt or not for a given input? I'd pay big money for that!
Short version:
There are real bugs, with huge consequences, that can be detected with static analysis.
The tools are easy to find and worth the price, depending on the customer base you have.
In the end, they cannot detect "all" bugs that could arise in the code.
Worth it?
Only you can decide, but after a few sessions learning why tools flag suspect code, if you take those suggestions to heart, you will be a better coder.
My company uses Ounce for static analysis. It's great at helping find the tedious bugs and potential security violations. For instance, server side input validation for web sites is much easier and more accurate with static analysis. As others have said, it's not perfect but having an extra set of digital eyes looking at the code helps. The sooner bugs are fixed in the development cycle, the cheaper it is to fix.
At my little corner of Lockheed Martin we use Klocwork and LDRA to analyze C/C++ embedded code for military hardware. Since the various compilers for each contract aren't nearly as full-featured as say, Visual Studio or Eclipse, I've found static code analysis tools invaluable. Can't comment on the cost/results ratio though, since I don't purchase stuff. =)
The linux kernel developers use a tool originally written by Linux Torvalds for static analysis - sparse.
http://www.kernel.org/pub/software/devel/sparse/
Sparse has some features targeted at kernel development - for instance spotting mixing up kernel and user space pointers and a system of code annotations.
I haven't used it but I do see on the kernel mailing list that it regularly finds bugs.
Every man for himself, all in favour say "I"
That is the problem for a beginner. When you first configure PC-Lint you need to tune the configuration to ignore stuff that you don't have a problem with, e.g. assignments within a test. After that you need to configure your project for lint, setting up the lint files to include the correct headers and such. Then the noise is not too bad. Just make sure that when you think something is noise, it really is noise.
Fight Spammers!
YOU sir, have amazing timing! I just wrote a 2-part article on this topic! Interesting... mine was published http://portal.spidynamics.com/blogs/rafal/archive/2008/05/06/Static-Code-Analysis-Failures.aspx The Solution: http://portal.spidynamics.com/blogs/rafal/archive/2008/05/15/Hybrid-Analysis-_2D00_-The-Answer-to-Static-Code-Analysis-Shortcomings.aspx Comments welcome!! Interesting that this topic is getting so much attention all of a sudden.
s/Linux Torvalds/Linus Torvalds/ - I keep making that typo ;-)
Generally, these tools make up for deficiencies in the underlying languages; better languages can guarantee absence of these errors through their type systems and other constructs. Furthermore, these tools can't give you yes/no answers, they only warn you about potential sources of problems, and many of those warnings are spurious.
I've never gotten anything useful out of these tools. Generally, encapsulating unsafe operations, assertions, unit testing, and using valgrind, seem both necessary and sufficient for reliably eliminating bugs in C++. And whenever I can, I simply use better languages.
Ever heard of the Apollo Guidance Computer? Besides having several computers on board the space vehicles, they surely used computers to design the space vehicles.
At my company, we use Compuware DevPartner Studio and have found it to be a very comprehensive package. I have used it for performance optimization, memory leak detection, and resource misuse. I have not used its ability to find "dead" code, but that exists. It plugs into Visual C++ 6 and Visual Studio .NET and it takes minimal time to get used to it.
Others may know of it from its legacy name "Bounds Checker"
Where's the 0xBEEF
Something that we've found incredibly useful here and in past workplaces was to watch the _differences_ between Gimpel PC-Lint runs, rather than just the whole output.
The output for one of our projects, even with custom error suppression and a large number of "fixups" for lint, borders on 120MiB of text. But you can quickly reduce this to a "status report" consisting of statistics about the number of errors -- and with a line-number-aware diff tool, report just any new stuff of interest. It's easy to flag common categories of problems for your engine to raise these to the top of the notification e-mails.
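A line-number-aware comparison of two runs can be sketched in a few lines of Python. The report format below ("file:line: message", one warning per line) is an assumption made for this example; real PC-Lint output depends on how the tool is configured, and the function names are hypothetical.

```python
import re
from collections import Counter

# Assumed report format, one warning per line: "file:line: message".
WARNING = re.compile(r"^(?P<file>[^:]+):(?P<line>\d+): (?P<message>.*)$")


def count_warnings(report):
    """Count warnings by (file, message), deliberately ignoring line numbers."""
    counts = Counter()
    for raw in report.splitlines():
        match = WARNING.match(raw.strip())
        if match:
            counts[(match.group("file"), match.group("message"))] += 1
    return counts


def new_warnings(previous_report, current_report):
    """Return warnings that occur more often in the current run than before."""
    added = count_warnings(current_report) - count_warnings(previous_report)
    return sorted(added.elements())


if __name__ == "__main__":
    old = "a.c:10: unused variable 'x'\nb.c:5: possible null dereference\n"
    new = (
        "a.c:12: unused variable 'x'\n"
        "b.c:5: possible null dereference\n"
        "b.c:40: possible null dereference\n"
    )
    for file_name, message in new_warnings(old, new):
        print(f"NEW in {file_name}: {message}")
```

Dropping the line number from the key means a warning that merely moved because code was inserted above it does not show up as new, while a second occurrence of the same message in the same file still does.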
Keeping all this data around (it's text, it compresses really well) allows you to mine it in the future. We've had several cases where Lint caught wind of something early on, but it was lost in the noise or a rush to get a milestone out -- when we find and fix it, we're able to quickly audit old lint reports both for when it was introduced and also if there are indicators that it's happening in other places.
And you can do some fun things like do analysis of types of warnings generated by author, etc -- play games with yourself to lower your lint "score" over time...
The big thing is keeping a bit of time for maintenance (not more than an hour a week, at this point) so that the signal/noise ratio of the diffs and stats reports that are mailed out stays high. Talking to your developers about what they like / don't like and tailoring the reports over time helps a lot -- and it's an opportunity to get some surreptitious programming language education done, too.
Static analysis tools never work for me, I don't declare anything static.
Linus is not Linux
Yes, I have heard of it. However, it's hardly a Quad Xeon workstation, is it? That's still just part of what I'm talking about. Their computers took up rooms (on the small side), and were less powerful than some calculators today.
It's like when Wheeler died and he was called "one of the last great titans of physics." One fellow slashdotter called this an unfair characterization as it is unfairly biased against the people who are doing work today, which he sees as no less important, comparing it to if say, Linus Torvalds were to die and was called one of the "last great titans of computing."
The best computers they had available when they were designing the bomb were like UNIVAC. They had to do their math by hand, and much of their calculations were largely based on assumptions. These days the computer does a lot of the work and doing things by hand is for "suckers."
Comparing Wheeler to Linus is absurd. It'd have been more poignant to compare Wheeler to Grace Hopper or someone, frankly.
That's just my two cents though. Your exchange rate may vary.
"...Many of these (for Java, at least) are open source, so the expense in negligible."
Can you list a few? This summer I am picking back up a couple of old embedded projects, and am interested in learning more about the open source embedded tools available.
Static analysis is a tool. In good hands, it is a valuable tool. In expert hands, it can be invaluable, catching really subtle bugs that only show up in situations unlike anything you've ever tested -- or imagined to test. You know, situations like what your customers will experience the weekend after a major upgrade (no joking...)
Generally, bash is superior to python in those environments where python is not installed.
It may be the best tool in the world - I admit I do not know it. But the word "proved" makes me suspicious. To me this sounds like the typical - and widespread - management speak to make business decision makers and their insurers sleep well. Thank you! This gives the perfect example that the misleading wording is even used by educational bodies.
Is this a proof, or do some mistakenly think they're safe?
Who "proved" Astree to be error free in the first place?!
You should strive to make your code as clean as possible. Turn on maximum warnings from your compiler, and don't allow code that generates warnings to be checked in to your source repository. Use static analysis tools, and make sure your code passes without issue there as well. These tools will generate many false positives, but if you learn to write in a style that avoids triggering warnings, quality will go up. You may be smarter than Lint, but the next guy that works on the code may not be. Static analysis tools are just another tool in the tool box. Also use dynamic analysis tools like Purify, valgrind, or whatever works in your environment. Writing quality code is hard. You need all the help you can get.
They will indeed find certain classes of bugs, and code that is lint-free (especially with more modern versions of lint) has fewer defects. Other metrics, like McCabe cyclomatic complexity, can also point out areas in which bugs have a high probability.
On the other hand, no tool can find 100 percent of bugs. This is a theorem (via Turing's halting and equivalence theorems), and it is also true because some bugs are places where the code is doing what it was supposed to do, but that isn't what the user actually wanted.
I've used Coverity. It's not cheap, unless your project is open source, but it is very good. Very low false positive rate, and it finds some stuff that would be pretty tricky to find through code reviews etc. due to its ability to track things all across the project.
I use PC-Lint religiously, and have for maybe 18 years now. In my opinion, it has made me a better programmer. It teaches you to be disciplined, because whenever you get sloppy, it barfs all over your disgusting code. The discipline you learn from it remains even after you stop using it. You start to think like Lint.
The First Commandment for C Programmers states "Thou shalt run lint frequently and study its pronouncements with care, for verily its perception and judgment oft exceed thine."
This is as true now as it was when the prophet (PBUH) spoke.
I love PC-Lint, but it does have some drawbacks. They added C++ support to it, but it really is having a hard time with the crazy template meta-programming tricks found in the Boost library. Using Boost will bury you under an avalanche of spurious warnings. Sometimes, it will even crash. In this case, studying its pronouncements with care is liable to give you a stroke.
We've had fantastic luck with Coverity. It's expensive, but it catches a lot of bugs with a very low false positive rate. It's good about explaining, step by step, "okay, what if this variable has this value, and this if/else goes this direction, and this other thing happens..." so you can tell why it gives the error in question.
:-)
So the false positives are pretty much always when an environment variable or a file contains something we don't expect it to. And those are only sort of false positives -- those things *could* actually happen, we just don't guarantee the results if they do.
People do need to keep in mind that the statically typed languages that they most commonly work with are quite primitive when it comes to type systems. It's no surprise, for example, that a static analysis tool can find spots where one can get null pointer exceptions, since it's easy to design a language with a type system that guarantees they can't happen.
To make such a language, you just have to distinguish the types of variables that are allowed to be null, from the type of those that aren't, and force code that deals with the former to deal with the possibility the variable holds null. Nice does this; and of course, the general labeled union types available in ML and Haskell make nulls completely unnecessary.
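A rough illustration of the same idea in plain Java, which does not enforce it the way Nice or ML do: java.util.Optional only approximates a "may be absent" type, but it does push the caller to handle the empty case. The class name, the map contents and the method names below are made up for the sketch.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

public class NullableDemo {
    // Hypothetical data, for illustration only.
    private static final Map<String, String> EMAILS = new HashMap<>();
    static {
        EMAILS.put("alice", "alice@example.org");
    }

    // The return type says "there may be no value here"; the String
    // cannot be reached without going through the Optional.
    static Optional<String> findEmail(String user) {
        return Optional.ofNullable(EMAILS.get(user));
    }

    // The caller has to decide what happens in the empty case.
    static String describe(String user) {
        return findEmail(user)
                .map(email -> user + " <" + email + ">")
                .orElse(user + " (no email on file)");
    }

    public static void main(String[] args) {
        System.out.println(describe("alice"));
        System.out.println(describe("bob"));
    }
}
```

In a language with real non-null types the compiler does this bookkeeping for every variable; here it only happens where the programmer opts in.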
Static analysis tools are by no means new. DEC, for instance, had DEC Source Code Analyzer (SCA) on the market in the mid-1980s. Logically, it was implemented as an add-on option to the DEC Language-Sensitive Editor (LSE). That way, SCA could assume all of LSE's functionality, and concentrate on adding extra features. Moreover, the arrangement ensured absolute consistency between the two products.
Of course static analysis cannot find all the problems in a piece of code. But when used in conjunction with compilers, editors, profilers, coverage analyzers, and automated test systems, it adds another piece to the jigsaw. And it's obviously better to detect errors sooner rather than later, thus shortening the code-test loop.
I am sure that there are many other solipsists out there.
I've never used lint or any of the other tools being talked about, but I have experience with the ML family of languages (OCaml, mainly), which have type inference. Type inference is static analysis and catches, in my experience, a great deal of the bugs caught by those other tools, and better yet it's a core part of the language. Having to be aware of types forces you to catch the bugs before you make them, and your code won't compile until you've fixed the bugs. It, of course, doesn't solve logic bugs, but just about everything else gets caught. Obviously, this doesn't work for an existing codebase (and so this post is OT), but for greenfield projects, it is worth investigating.
I would rather put my efforts into developing my code with a safe-by-construction language (for instance with the Coq proof assistant), or use tools like Why to guarantee that my code won't blow up. Time spent debugging with static analysis tools should have been spent proving the correctness of the program beforehand.
I'm jack's useless sig
I used to work for a static analysis company. You get the most return for the first run.
The analyzer, like any automaton, matches patterns based on a set of criteria. These criteria are often simple in nature, as they are simple to implement. The more complex software defects, such as those that need variable references tracked across functions, are often ignored. Complex matching can be done, but it's simply too expensive to implement and may produce a large number of false positives. Run-time errors, obviously, are not static and cannot be caught. Running against C will return much higher quality results than against C++, as the object oriented model is almost impossible for the analyzer to follow.
It's great for finding all those elusive bits of code that might be accidentally seeding a pseudo-random number generator somewhere.
Show me a program that will know what I mean when I say (or type) "allow my friends to see my stuff" and does what a human would do. Then I'll start believing in automated debugging.
Static analysis tools are common in the open source world. The lint name is well known enough that many projects make theirs a pun on it, à la lintian. A few years ago a local root exploit in X was discovered by running these sorts of checks. But generally, static analysis tools require human review -- with large code bases they generate large numbers of false positives, especially the dumber ones. This leads to trouble for perfectionists, a common trait among software developers interested in bug fix analysis. For example, the recent massive Debian vulnerability was caused by an overzealous developer trying to fix static analysis flags. One of these flags was valid, one was not, and removing both removed nearly all entropy from the RNG.
In the more general sense, static analysis cannot find all bugs. There's a trivial proof: a program stuck in an infinite loop is a bug, but finding all such loops would solve the halting problem. Handling interrupts and the like also causes reasoning problems, as it's very hard, if not computationally intractable, to prove multi-threaded software is safe. So static analysis won't rid the embedded world of watchdog timers and other software failure recovery crap.
I Browse at +4 Flamebait
Open Source Sysadmin
Me: "ok, but you said not everything it flags there is a bug, right?"
Him: "Yes, you need to actually look at them and see if they're bugs or not."
Me: "Then what sense does it make to generate charts based on wholesale counting
entities which may, or may not be bugs?"
Him: "Well, you can use the charts to see, say, a trend that you have less
of them over time, so the project is getting better."
Me: "But they may or may not be actual bugs. How do you know if this week's
mix has more or fewer actual bugs than last week's, regardless of what the
total there is?"
Him: "Well, yes, you need to actually look at them in turn to see which are actual bugs."
Me: "But that's not what the tool counts. It counts a total which includes an
unknown, and likely majority, number of false positives."
Him: "Well, yes."
Me: "So what use is that kind of a chart then?"
Him: "Well, you can get a line or bar graph that shows how much progress
is made in removing them."
Your next line is:
Me: "So you're selling us a tool that generates a lot of false warnings
and a measurement on how much unnecessary extra work we've done to
eliminate the false warnings. Wouldn't it make more sense not to use
the tool in the first place and spend that time actually fixing real bugs?"
To work, this question must be asked with the near-hypnotized manager watching.
Bantam Dominique roosters crow a four-note song. Once you've heard it as "Happy BIRTHday" you can't NOT hear it that way
Yes, they kind-of work; it depends what you're doing and what you're trying to achieve.
Example: recently I was doing a quality audit of a bit of code which ran an industrial control unit in a safety critical application. The software was written using an obsolete, closed source C compiler and the processor was an old Hitachi; there is a GNU C compiler for it but because the code was non-ANSI we couldn't use that (the object was to audit the existing code for safety, not to change it). So dynamic analysis of the running code, other than black box testing of the complete controller, was not possible.
My job was to demonstrate that the code could not fail unsafe. I used QA/C, which I found very useful, and VectorCast, which turned out not to be useful on this particular project because it needs to interact with the compiler. The compiler would only run under 16 bit DOS, VectorCast under 32 bit Linux or Windows, so it proved to be impossible to get them to communicate (this doesn't mean VectorCast wouldn't be useful on other projects).
In summary, you wouldn't want these to be the only tools in your audit toolbox. But to get to understand a piece of not-very-well structured legacy code quickly, they're pretty useful.
I'm old enough to remember when discussions on Slashdot were well informed.
Because they cannot solve the halting problem, there are many instances where they will see a questionable piece of code, and have to decide whether they should flag it and risk a false positive, or ignore it and risk a false negative. This is where the magic happens, at least in the high-end commercial code analysis tools. If it always errs on the side of false positives, the output will be ignored in all but the most thoroughly audited fields. If it always errs on the side of false negatives, it's worthless. A lot of work goes into analyzing which practices commonly cause problems in the real world, and fine tuning the problem detection code to look for those, while perhaps passing up certain classes of bugs that are very rare and very computationally difficult to identify.
There's no failure quite as dissatisfying as a complete and total solution to the wrong problem.
What Javascript tools do you recommend? (similar to Findbugs for Java) What about ActionScript? Thanks
Hey, it's just the French. I deal with them regularly, and *any* software they write is always "perfect" even if it can be demonstrated to be a piece of crap in front of half a dozen decision makers. It's a cultural thing I think.
As to how they proved Astree was perfect, they ran it on itself until they had zero errors, obviously...
Coverity does not check for uninitialized variables of type float or double.
Lint has never found any real bugs for me. Not one. Nor have a couple other C++ static analyzers that I've tried (though I've felt that Coverity might still be useful even though it hasn't found bugs for me). People say that these code analyzers are invaluable, but I think it depends on the quality of the programmer. Some people might not like to hear it, but the fact is that some programmers are much more accurate than others.
However, by saying that these tools never find bugs for me, I'm not saying that my code is bug-free. The existing bugs in my code tend to be higher level logic bugs that these tools don't understand.
ASTRÉE analyzes structured C programs, with complex memory usages [17], but without dynamic memory allocation and recursion. This encompasses many embedded programs as found in earth transportation, nuclear energy, medical instrumentation, aeronautic, and aerospace applications, in particular synchronous control/command such as electric flight control [22], [23]. There is a load of research on proving program correctness for limited subsets of languages (especially with no dynamic memory allocation). In functional languages these proofs can be almost trivial, not so much in imperative languages.
Correctness proofs go as far back as computer science itself, so it's an extensively studied area, which is unfortunately relegated to academia in most cases because of the restrictions it places on the expressive power of such systems. But don't underestimate its ability to not get people killed.
It's not a trend; it is something developers have been doing for a long time. We have a build system here that automatically compiles and runs unit tests, and when something fails the developer gets an email. We try to automate as much as possible, so we also have several static code analysis tools installed, like PMD, FindBugs and Checkstyle. None of them is perfect, but they all detect at least some problems; it's better than nothing. It is also important that these tools can be switched off so that they don't get annoying. PMD does this very nicely: you can disable checks at method-level granularity with a simple annotation in the places where it is appropriate.
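For anyone who hasn't seen the PMD annotation, a small sketch. As far as I recall from the PMD documentation, @SuppressWarnings("PMD") silences every rule for the annotated element and @SuppressWarnings("PMD.RuleName") silences just one; check the docs for your version. The class and method here are invented for the example.

```java
public class LegacyParser {

    // Hypothetical legacy helper that reassigns its parameter on purpose.
    // The annotation asks PMD to skip one named rule for this method only,
    // so the rest of the class is still checked.
    @SuppressWarnings("PMD.AvoidReassigningParameters")
    static int clampToByte(int value) {
        if (value < 0) {
            value = 0;
        }
        if (value > 255) {
            value = 255;
        }
        return value;
    }

    public static void main(String[] args) {
        System.out.println(clampToByte(300));
        System.out.println(clampToByte(-7));
    }
}
```

Naming the specific rule keeps the suppression narrow, so a new and unrelated problem in the same method still gets reported.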
Open Source Alternatives
The English Breakfast Network
It helps to find potential problems and also checks for licenses, spelling errors and coding conventions.
DNA is the ultimate spaghetti code.
Several posters have cited the "halting problem" as an issue. It's not.
First, the halting problem does not apply to deterministic systems with finite memory. In a deterministic system with finite memory, eventually you must repeat a state, or halt. So that disposes of the theoretical objection.
In practice, deciding halting isn't that hard. The general idea is that you have to find some "measure" of each loop which is an integer, gets smaller with each loop iteration, and never goes negative. If you can come up with a measure expression for which all those properties are true, you have proved termination. If you can't, the program is probably broken anyway. Yes, it's possible to write loops for which proof of termination is very hard. Few such programs are useful. I've actually encountered only one in a long career, the termination condition for the GJK algorithm for collision detection of convex polyhedra. That took months of work and consulting with a professor at Oxford.
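To make the "measure" idea concrete, here is a minimal sketch using Euclid's algorithm, where the measure is simply the value of b. The runtime check is only there to show the property; a verifier would prove it statically. The class name and the check are my own illustration, not from any particular verification system.

```java
public class LoopMeasure {

    // Greatest common divisor of two non-negative ints.
    // Measure for the loop: b. It is a non-negative integer and it
    // strictly decreases on every iteration, so the loop must terminate.
    static int gcd(int a, int b) {
        if (a < 0 || b < 0) {
            throw new IllegalArgumentException("non-negative inputs only");
        }
        while (b != 0) {
            int measureBefore = b;
            int r = a % b;          // 0 <= r < b because a >= 0 and b > 0
            a = b;
            b = r;
            // Check the two properties of the measure at run time.
            if (b < 0 || b >= measureBefore) {
                throw new IllegalStateException("measure did not decrease");
            }
        }
        return a;
    }

    public static void main(String[] args) {
        System.out.println(gcd(48, 18));
    }
}
```

For a loop like this the measure is obvious; the hard cases are the ones where nobody can write down what is getting smaller.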
The real problem with program verification is the C programming language. In C, the compiler has no clue what's going on with arrays, because of the "pointer=array" mistake. You can't even talk about the size of a non-fixed array in the language. This is the cause of most of the buffer overflows in the world. Every day, millions of computers crash and millions are penetrated by hostile code from this single bad design decision.
That's why I got out of program verification when C replaced Pascal. I used to do this stuff.
Good program verification systems have been written for Modula 3, Java, C#, and Verilog. For C, though, there just isn't enough information in the source to do it right. Commercial tools exist, but they all have holes in them.
True confidence comes not from realising you are as good as your peers, but that your peers are as bad as you are.
In the design of synchronous digital circuits the tools measure how long it takes signals to propagate through networks of logic from one layer of flops to another, try to optimize the organization and physical layout of the logic to meet timing requirements, and tell you how fast you can run your clocks and still have everything settled at flop inputs in time for them to correctly capture the data.
They do this by making worst-case assumptions: How long after a clock the output of a flop is stable, how long it takes to get through gates of each type, how soon before a clock it must arrive at the input to the next flop, and so on.
But one of these assumptions is that ANY input to the combinatorial logic might change on one clock and that change might propagate through any available path to affect any output, which must be stable by the next clock. This assumption might be wrong for a number of reasons.
For instance: A complicated logic block might be used differently on different cycles. An adder might be adding two values near the output of a combinatorial block on some cycles, two near the beginning (with its output driving something else complicated) on others. The tools see that there's a long path on the way to the adder's input and another coming from its output. But they may not see that both can't happen at the same time (either through not checking all the possible combinations of inputs, or not being able to see the circuitry that creates the guarantee - which may be in another block that's not available to the tool).
Without this knowledge the tools may perform unnecessary optimizations to "fix" the non-problem: Using fast gates (with higher power and more silicon area), rearranging the logic to improve the long path (ditto and lengthening some that DO get used). And/or it may report timing violations for the user to fix (masking any shorter but real ones) or prescribe an unrealistically low clock rate (reducing the part's performance).
To avoid this, such tools provide the ability for the users to declare such paths to be "false paths", which should be ignored in optimization. (Unfortunately, the paths often cross module boundaries, so the ability to declare them is generally provided OUTSIDE the source code, in some separate configuration file for the build.)
IMHO many of the bogus warnings from static analysis tools are a similar problem. As a result, such tools need a couple of similar features to solve it.
At a minimum they need the ability for the code author to say "This is really OK." in a way the tool can process. This gets rid of the bogus warning - clearing the output so that REAL warnings won't be buried in the false cries of wolf and will be acted upon.
A useful addition would be an ability to say "This is really OK because of X". That way the tool could then check that X actually still holds and sound off if it gets broken later. (Unfortunately, X can be pretty general. So you still need the ability to say "This is really OK because I told you so.")
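Absent tool support, one crude way to say "this is really OK because of X" in ordinary code is to state X as an explicit check, so that it fails loudly if somebody breaks it later. A minimal sketch, with an invented class and method; no particular analyzer's annotation syntax is assumed.

```java
public class JustifiedSuppression {

    // A tool might warn that readings.length - 1 can be -1 below.
    // "This is OK because callers always pass at least one reading":
    // that reason (X) is written as a check instead of a comment.
    static int lastReading(int[] readings) {
        if (readings.length == 0) {
            throw new IllegalArgumentException("expected at least one reading");
        }
        return readings[readings.length - 1];
    }

    public static void main(String[] args) {
        System.out.println(lastReading(new int[] {3, 1, 4}));
    }
}
```

The check costs a few cycles, but both the human reader and the analyzer can now see why the index is safe.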
This already arose in C++ and ANSI C strong type checking. But idioms were available to tell the compiler you really meant it. (Cast to void then cast to another type, store in a union as one thing and load as another, argument type of void or pointer-to-void, encapsulating such idioms in typecast defined operators, etc.) Now we have another checking tool that needs its own flavor of communication from the designer to the tool.
Yes, static analysis, especially with some of the open source tools available in the Java space, can be awesome. Check out PMD and FindBugs for two awesome examples of this. Both are free, and do everything you could ever want. If you are willing to pay money for a tool like this though, you are focusing on the wrong things.
No, these tools won't catch entire categories of bugs that can only crop up while your application is running. For those, use unit tests, profilers, etc. For an explanation beyond code for this, check out the book "Gödel, Escher, Bach: an Eternal Golden Braid", specifically the discussion about recording records that, when played, destroy the record player playing them. No, it's not real, it's a thought experiment.
I've been using static analysis tools since 1993 now and I'm totally convinced of their merits. I've found no compiler that yet comes close to being as capable at finding problems as the best static analysis tools are now.
I develop medical software for cancer treatment professionally (which is keenly watched by the FDA) and thus am exceptionally keen to maintain bugfree code. While static analysis can't guarantee it, it can make a huge difference to code quality and reliability - and yes, to answer the original questioner, yes, they can catch stuff like buffer overruns, pointer abuse, memory leaks etc, and in some cases, resource/locking abuses etc. I will admit that gcc is very good these days, but even it isn't as good as these tools.
In particular I'm a very big fan of Gimpel's PC Lint - which is astonishingly good and astonishingly inexpensive.
If you wish to try a demo of it, interactively, try their demo tool at:
http://www.gimpel-online.com/OnlineTesting.html
Also, they do a monthly "bug of the month" in a code sample - see whether your compiler can spot the issue/potential issue:
http://www.gimpel.com/html/bugs.htm
For anyone integrating it into Visual C++ (PC Lint is a command line tool), take a look at Riverblade's Visual Lint also:
http://www.riverblade.co.uk/
Disclaimer: I don't work for any of these companies (Gimpel or Riverblade) but I'm a huge fan of both of their products - I believe they've made more difference to my development than any number of UML tools etc - I tend to dump tools that don't make a real difference to me, and these are the only tools that I've used year in year out...
(Gets off soapbox)
Seriously though - whatever you choose, do investigate static analysis - it really is worth it.
Linux fan and Win32 developer
I've been using PC-Lint since about 1997, and have found some pretty nasty bugs using it - in most cases well before they had surfaced in the field. Given that, using such tools is (for me at least) a no-brainer. You do however need to have a particular mindset and take the time to learn how to use such tools effectively - how to tune your warning policy to your codebase, and what issues to look out for which could indicate real trouble (e.g. I've seen behavioural bugs caused by variable name scoping issues which lint caught but which at first glance would appear to be "lint noise"). I have a red flag list of issues I watch for in a new codebase, which works rather well in practice. I should add a disclaimer that these days I make most of my living from working with PC-Lint, so I'll acknowledge my interest in the subject up-front before anyone else points it out!
"If mushy peas are the food of the devil, the stotty cake is the frisbee of God"
Only solution is to work for a manager who is either technically competent or trusts your judgment.
Of course, the chain can be a lot more complicated than that -- you could have a technically competent boss whose boss is a PHB -- but the basic theme is, the person ultimately making the decisions either must understand the real impact of those decisions, or must be delegating to someone who does.
Anything other than that, and your corporation is dysfunctional. I realize many, maybe even a majority of corporations are dysfunctional, but it's good to know.
Short-term solution: Adopt issue-tracking software and have it generate graphs. Then, when the snake-oil salesman comes around, you whip out your chart and say, "I have a chart here of actual bugs closed, and new bugs opened, and average total bugs open over a given week. You have a chart of potential bugs remaining, and no way to mark one of them as not a bug. Why would we want your chart when we have my charts?"
Don't thank God, thank a doctor!
They're definitely worth it.
We've used McCabe IQ here and it's very helpful.
IQ focuses more on metric (grading code quality) and coverage (lots of little 'i got here's in the code). The combination of the two is nice since you can get an idea of where your testing effort is going wrong as well as highlighting suspect code.
It's expensive but you only need to find a few 'deal-breaker' bugs to increase sales / decrease costs. So in short these tools are definitely worth it.
Mozilla is making some interesting static analysis plugins for GCC. It looks like something that could easily be used by open source projects.
See Dehydra static analysis tool
We also do a lot of work with the formal verification tools like Spin. Again, the effort can be substantial (some have likened Spin modeling to a black art) but the results that you get can be well worth it.
I suppose the last word is that if your definition of static analysis tools consists solely of Lint and its derivatives, you will get very little benefit (if any) from the effort. The same goes for assuming that you can buy a tool, install it, and get results immediately -- it just doesn't work that way. The tools, like everything else in the software realm, are not silver bullets, but rather another arrow in the quiver.
Is interesting, but what I found is that I use Several Methods:
:)
1. -Wall -pedantic, etc. Rule: No Warnings, or errors of course.
2. Insure++ from Parasoft. Absolutely Wonderful. Catches about everything you can think of and we work in our Debug Areas with it. It catches a lot of issues before they even become bugs.
3. Logic Flow issues are solved with DEBUGGERS, and PRINTFs
Insure++ is a major money saver in the long run.
I can program myself out of a Hello World Contest!!
My (potentially flawed) understanding is that Purify was reporting the function calls as "possibly" using a variable that was not initialized.
The relevant point was that the function in question initialized a buffer with random data (e.g. from a random number source) [e.g. _not_ "from uninitialized memory" but "_into_ uninitialized memory"] but the tool could not tell what the function was doing, so it said "look here, possibly uninitialized data".
The programmer, having no clue how to interpret the (false positive) result, simply removed the code.
This is what happens when someone who doesn't know what he is doing takes the advice of a tool that cannot understand the semantic importance of an otherwise opaque action.
If the guy had looked up the function call and seen what it did (instead of just removing it) he would have seen it as a false positive.
Thus Is It Proved: powerful tools are dangerous in the hands of the untrained.
Innocent people shouldn't be forced to pay for inferior software development.
--"Code Complete" Microsoft Press
They can help good programmers quite a bit, and they can help bad programmers become good programmers. In both cases it takes someone with a desire to learn and not just some 9-5 coder here for the chick-magnet status.
The other thing you may want to factor in is the language. Static Typing allows much more reliable static analysis than dynamic typing. A language can communicate to the tool--for instance, if you tell a language that a variable can't be null, either the language or a test-mode tool should be able to assert that easily.
Java has some really nice tools built in. Some have more. Hints like "This variable can't be null", or "This variable must be assigned by the time the constructor ends and can't be changed" really help if applied correctly.
For example, in Java, this code will produce a compile error:
    private String isFive(int i)
    {
        String s;
        if(i==5)
            s="i was five!";
        else if(i > 5)
            s="i remember when i was 5";
        return s;    // compile error: s might not have been initialized
    }
since there is a path through the code that allows s to be returned without having ever been assigned.
In this case, it's pretty obvious, but sometimes you have many paths through your code including exceptions, loops, and if statements--it's really handy to have it pop up a note saying that a variable was used before being assigned.
Because of this really helpful warning, it's a really bad idea in Java to initialize variables to null:
    String s=null;
Although sometimes it seems like it's the only way to "Eliminate" that warning, what you are really doing is defeating valuable static analysis tools. Aside from that, it's totally redundant.
In the above case, an else clause would fix it. Even:
    else
        return null;
Note that the string isn't assigned, but since that path returns it is never used in that path, so the compiler is happy.
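Put together, the fixed method might look like the sketch below. The wrapper class and main method are just scaffolding I added so it compiles on its own; the method is made static only so main can call it.

```java
public class IsFiveDemo {

    // Same method as above, plus the final else. Every path now either
    // assigns s or returns before s is read, so javac accepts it.
    static String isFive(int i) {
        String s;                       // deliberately not initialized
        if (i == 5)
            s = "i was five!";
        else if (i > 5)
            s = "i remember when i was 5";
        else
            return null;                // s is never read on this path
        return s;
    }

    public static void main(String[] args) {
        System.out.println(isFive(5));
        System.out.println(isFive(9));
        System.out.println(isFive(2));
    }
}
```

If someone later adds a new branch and forgets to assign s, the compiler complains again, which is exactly the protection that "String s=null;" would have thrown away.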
The warnings generated by the error analyzers in Eclipse, when cranked up, are also amazingly helpful--but again, more helpful in some languages than others.
Every project should turn them up and possibly add additional error plugins, and then eliminate every single warning. You do this by looking at each error and making a deliberate decision to mask or fix. Sometimes you have to mask, but looking and deciding alone will fix bunches of potential bugs.
Also, never make the decision to mask based on verbosity. Errors should only be masked if there is just no way around it--although in some cases an entire class of errors have to be ignored because of a pattern that is just too pervasive or too hard to eliminate or doesn't match your deliberately chosen coding style (Some settings actually can enforce a coding style).
Sorry about the really long example. I just wanted to demonstrate the amount of thought that could go into a single detection mechanism--and this one is basic.
To answer your question: It depends.
Whether it is "worth it" to use static analysis tools depends on how much bugs cost you. Static analysis tools can find important defects in code, and they can find them in places that are difficult to test well (e.g., error cases).
The reason you saw a lot of this in an embedded systems conference is that it's a lot more expensive to send a patch out for a dishwasher than it is for a desktop app. Plus the embedded systems market includes the medical, military, and casino industries, all of whom have reasons to be very conservative when it comes to software quality issues.
I think you'll hear a lot less about static analysis among developers of desktop apps, where issuing updates is inexpensive.
Is this "another trend" or "going to change the face of software development"? As computers get faster and algorithms get better, static analysis will continue to provide more accurate results faster. So I think the trend will continue. I think probably it will get integrated with a lot more people's dev processes over the next decade or so, and hopefully software quality will improve as a result. How radical a change is that really, compared to other technologies (e.g., widespread adoption of OOP) over the last 15 years? *shrug*
Disclaimer: I am an employee of Coverity.
If it's a _stateful_ singleton (i.e., a real singleton), then yes, you'd damn well better synchronize it, even on a single-CPU system. No arguments there.
The problem is that there are classes which really shouldn't be singletons at all, and are a mis-use of that pattern. They don't actually _have_ any state, which could be stale on another cache. (Which actually is the least of your concerns there.)
E.g., as a trivial example, consider the following kind of singleton: Yeah, there are better loggers available, but it's just an example. Bear with me.
The class has no state, and holds no resources. It's just a collection of idempotent methods.
It's not a _real_ singleton, it's just someone's misguided attempt at saving the (ridiculously low) overhead of a "new LoggerSingleton()" everywhere it's used. It doesn't matter if two threads get two different copies of it. Worst case scenario is, literally, one of them wasted 8 bytes on the heap, which will be garbage collected later. That's all. There is no way to malfunction because of that.
Synchronizing that simply isn't necessary. It also defeats the whole "optimization" there, since synchronization is actually more expensive than just not making it a "singleton" in the first place.
Now I'm not saying it's right to have that kind of singleton, but it's the kind of "optimization" you tend to get from people coming over from C++. They occasionally even write pools for such objects.
But ultimately it doesn't matter much.
_If_ you have plenty of time, and/or you're at architecture design stage, yeah, tell the people to not do it.
But it's kinda annoying to see an overpaid consultant insisting that synchronizing those is top-priority, and he's saving everyone from some great catastrophe with it. In a project which is already overdue and in trouble. There are better ways to use those resources, than "fixing" non-problems.
A polar bear is a cartesian bear after a coordinate transform.
I'm sitting here, reading this post and people's comments about static analysis, while I should be fixing Lint errors.
I have 136 of them to fix, all of them reused-variable-name errors and "this variable might have not been initialized" errors, because the code is state machines that use a switch statement in a for loop. All the damn variables are initialized in state 1, which is where the damn thing always starts, but Lint can't figure that out. I've been at this for days because an 85K source file was ported into a new project and the previous project didn't lint.
Next I have to check a text analyzer. It has 3000 Lint errors, all because the analyzer is based on C strings and Lint is worried there might be array overflows in the specific walks. This is code which has been heavily used since the late 1980s without ever having any problems. Management wants 0 Lint errors; it's not that there are actually any problems. Sigh. I hate Lint.
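For anyone who hasn't run into this, here is a minimal C++ sketch of the switch-in-a-for-loop shape described above. The state machine and all the names (State, run_state_machine, total) are made up for illustration and are not the poster's code. It shows one common way to quiet the "might not be initialized" report, which is to initialize at the declaration; whether that is worth doing 136 times in ported code is a separate question.

```cpp
// Hypothetical state machine that sums 1..steps. All names are illustrative.
enum class State { Init, Accumulate, Done };

int run_state_machine(int steps) {
    // The flagged version would read `int total;` here, assigned only in
    // State::Init. A checker that does not track which case runs first
    // cannot prove the read in State::Accumulate is safe, so it reports a
    // possible uninitialized use. Initializing here removes the report.
    int total = 0;
    State state = State::Init;  // the machine always starts in Init

    for (int i = 0; state != State::Done; ++i) {
        switch (state) {
        case State::Init:
            total = 0;  // the only assignment in the flagged version
            state = State::Accumulate;
            break;
        case State::Accumulate:
            if (i > steps) {
                state = State::Done;
            } else {
                total += i;
            }
            break;
        case State::Done:
            break;
        }
    }
    return total;
}
```

The redundant assignment in the Init case is kept on purpose so the original logic stays visible next to the fix.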
Just to clarify my previous point, the point of contention there is almost invariably the "getInstance()" method. Such tools trip over it, hard, and there's usually no way to tell them, basically, "it doesn't freaking matter if another thread or another CPU essentially bypasses the singleton and gets a different instance of a stateless class."
The actual methods of the singleton, well, you use your own head. In my logger example it works because it essentially wraps around System.out which is already synchronized, so there's no need to synchronize the outer layer again. In other cases it's really stateless and idempotent, and again it doesn't matter there either. In very few cases, they might actually need to be synchronized.
But anyway, my minor peeve there is with tools and people getting tripped by the unsynchronized "getInstance()" in a situation where it just doesn't matter. I'm not even opposed to turning it into a non-singleton, but seeing someone who should know better (or claims to know much better) spend pages in his report and _hours_ worth of meetings just over the lurking evil of that "getInstance()", just tells me they might have exaggerated their claim to greatness, ya know?
They work, but not nearly as well as bunny - which is free. It's a drop-in replacement for gcc, and does 9 types of fuzzing/analysis, reporting changes in the behavior of the program.
http://code.google.com/p/bunny-the-fuzzer/wiki/BunnyDoc
Many of them work well, as can be seen from all the endorsements in other comments.
There is one thing to watch for though -- companies selling you software or consulting around the issues their tool highlights.
A few companies will practically give you a tool that they claim finds lots of critical bugs, either through static or runtime analysis, and by the way, for a more considerable sum, offer you "fixes" for these "bugs". This pitch is aimed at a PHB's desire to use the tool to generate metrics and to measure progress by how many of the problems the tool spits out get fixed.
Of course there are companies that offer genuine value in finding and fixing software problems, but every red flag in the room should fly if the company proposes to offer you a free or almost-free "no obligation" assessment. Once the sales dogs are in the door, watch out!
I recently discovered the use of __attribute__ ((format(printf, ...))) in GCC to tag all our own printf-like functions.
We recently started compiling on 64-bit and used the above to pick up all malformed printf-style arguments in our code. There were over a thousand! Things like %ld where the parameter was only an int, etc.
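A rough sketch of what that tagging looks like, for those who haven't used it. The helper name format_message is hypothetical; the attribute itself is a GCC extension (Clang accepts it too), so this assumes one of those compilers. The two numbers say which argument is the format string and where the variadic arguments begin.

```cpp
#include <cstdarg>
#include <cstdio>
#include <string>

// Hypothetical printf-like helper. The attribute tells the compiler that
// argument 1 is a printf-style format string and that the arguments to
// check against it start at position 2.
std::string format_message(const char* fmt, ...)
    __attribute__((format(printf, 1, 2)));

std::string format_message(const char* fmt, ...) {
    char buffer[256] = {0};
    va_list args;
    va_start(args, fmt);
    std::vsnprintf(buffer, sizeof buffer, fmt, args);  // long output is truncated
    va_end(args);
    return std::string(buffer);
}

// With -Wformat (enabled by -Wall), a mismatched call such as
//     format_message("%ld items", 42);   // %ld expects long, 42 is int
// is diagnosed at compile time, while the matching call
//     format_message("%d items", 42);
// compiles without a warning.
```

Without the attribute the compiler has no way of knowing the first argument is a format string, so the mismatched call would go through silently.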
A lot of the yes responses in this thread are dead on. The current crop of static analysis tools has improved tremendously over the previous generation. The gold standard of this generation for C/C++ is Coverity Prevent. For Java, it's FindBugs and Coverity Prevent. Klocwork is gaining in C/C++.
Now, be careful. This is just another tool, which means the problem may still lie with the programmer. Each reported defect should be seen not just as a bug to fix; the number of defects related to similar code may also point to a design issue.
And to all of you punks out there saying there are a lot of false positives: there are way fewer false positives than you think. These tools can point out bugs that you don't understand until you have really sat down and figured out what is going on. Understand the defects before discounting them.
I work for a research laboratory for a space agency and static analysis tools are soon going to be a requirement for flight software. And this is after running these tools on a lot of flight code to see if these tools are any good.
Unfortunately, even after being very strict with compilers (-Wall -pedantic) and using static analysis tools until there are near zero defects, bugs still appear. It is maddening. But having near zero defects almost makes finding these bugs easier.
Remember that OS called OS/2? 32-bit flat memory, multitasking/threads? 16-bit drivers. Wha. 16-bit drivers? Surely some mistake. Chkdsk that took hours on a 1 GB drive? and always complained about the, yes,
Java's memory model needs to be the same on all CPU architectures. What the exact behavior is on x86 is irrelevant to deciding whether your Java program is correctly synchronized; what matters is the Java Memory Model. The Java Memory Model says that, in the case you're talking about, any non-final fields of a newly constructed object may be stale from the point of view of a thread that has a reference to that object.
But how granular is the synchronization needed to correctly allocate memory from a heap shared by all threads? Presumably, the threads just need to synchronize on a single pointer. What if you're on a machine that allows you to synchronize on just that?
Also, what if some crazy guy implements a VM where each thread gets its own young generation, and therefore, allocation doesn't need to synchronize at all?
You're reasoning about this problem in terms of implementation, not semantics. Java's memory model tries to specify a minimal semantics, so as to allow maximum implementation flexibility.
I'm seeing a handful of posts decrying false positives. With my experience using Findbugs for Java, we haven't found too many false positives at all! In fact, things that we originally thought were false positives, with a little thought, turned out to be improvable code. It was worth changing a line of code or two here or there to make the so-called false positive go away, because the resulting code was both prettier and more useful.
One concrete example: I used to call toString() on String arguments to constructors to make sure that NullPointerException was thrown if nulls were passed in. Findbugs objected to this. At first, I thought of this as a false positive, because I really wanted the toString method to be called on the String object as a defense against nulls. But with some further thought, I realized that I should explicitly test for null in the constructor code itself. This was not only less surprising to anyone reading the code, it also allowed me to throw a NullPointerException with a meaningful message.
Yes, static analysis tools really help. Ask folks to bring their Findbugs output to code reviews. Make it easy and routine to run Findbugs from Ant. (I'm also very impressed with Coverity, but we couldn't afford it.)
How many of you will get on a plane that's never been flown? Or buy a car without taking it for a test drive?
Now, before I buy software, I want to at least see a screen shot.
What?
Even if such a tool only catches a couple errors, it is probably worth the investment. If there is one intermittent error on a subset of your target platforms, and if this tool catches that error it can easily save hours of debugging work. Considering engineering rates, these will pay for themselves quickly.
Unless engineers begin to rely on them! If I stop thinking about referencing null pointers because my tool catches 90% of them, I haven't gained a thing.
The type of coding I do is applied mathematics ( see Digital Signal Processing ) where most of the code is procedural C code ( see NOT object oriented ) and we abuse the rules of C a lot. There are tons of small abuses and warnings. Most of these could be easily found but are not found until it freezes our embedded system and then we have to physically hit a jumper or power cycle the box.
So if we had one of these it would prevent that sort of thing from happening a lot more... make our code much more robust... and increase our productivity.
-----
I don't see how it would be helpful for Java / C# though... because those languages already have *very* robust warnings from the compiler... and the languages themselves are designed to be more dummy proof ( C trusts that you know exactly what you are doing... Java trusts that you don't know what you are doing ).
-----
So for languages like C where there is lots of potential to screw up these would be excellent. But for more dummy proof languages like Java / C# the payoff would be less.
Yes, use FindBugs frequently. Happy with it.
For Perl, Perl::Critic is excellent:
I don't have a lot to add, except that if you program Perl and don't use it, you shall spend more time on debugging.
Some people ran these kinds of tools on the Linux kernel, and found some suspicious code. It is similar to turning on -Wall on your gcc compiler: Some bugs suddenly turn up (with exact line number!) without having to go into the testing phase. This means you have more time to test for real bugs that only show during runtime.
One reason why static analysis works is because this is in large part what you do with your brain when you read code. There's obviously a limit to how much a static analysis tool understands about what your code is supposed to do, but in a relatively restricted scope, they can catch a ton of things.
GCC has a number of options that add static checking. Additionally, splint (when properly parametrized) will catch a number of other common gaffes. Whoever said C does not express enough to add checking clearly does not understand the problem. The syntactic sugar in many popular languages actually adds complexity. Some implicit garbage collection, pooling and threading mechanisms add non-deterministic qualities which make *static* checking an NP problem. If you are smart enough to keep your aliasing confined and tagged, there is little danger in C, and the simple syntax makes static checking easier.
Also of note: Valgrind is an excellent tool suite, but it is not a *static* checker.
I tried Splint - really hard - tried all the annotations - but it was never able to find as many possible problems as an Ada compiler could find. Splint was a failure and the project has been dead since 2003 [1].
Note that I am fully literate in C and Ada (and C++ and Java) - so I do fully understand the problem.
If anybody does not understand the problem then it's the die-hard C programmers who never really tried another programming language. It's not all syntactic sugar, it's about always knowing how many elements are in an array. With emphasis on "always" - not sometimes - but always.
Have a look at:
http://en.wikibooks.org/wiki/Ada_Programming/Type_System
Last but not least, the sad truth is that 80% of all programmers are not smart enough. Note that since you read /. you most likely belong to the other 20% but you should not project your level of C expertise to the other 80%.
Martin
[1] http://sourceforge.net/project/stats/?group_id=34302&ugn=splint
It is useful when you inherit a project with poor coding and documentation. Particularly when there are a lot of files, it lets you see the relationships better and find where a given function or class is defined.
I have found such tools to be invaluable. I had code like this:
// ...
class Lock {/*...*/};
void Foo( Mutex& m )
{
    Lock(m);
}
This is valid syntax, but I intended to use that lock instance for the duration of the function, so the first line should have read "Lock lock(m);". Multithreading is tricky enough, and I looked at the real code for a long while, reading right over this bug. PC-Lint found it for me right away (thankfully, it was already tuned, and I should have been using it before running my code).
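Here is a small self-contained sketch of why the unnamed form protects nothing. Mutex and Lock below are toy stand-ins (a bool flag instead of a real mutex), and the two function names are hypothetical. The unnamed lock is written as Lock{m}; with braces so the statement is unambiguously a temporary object rather than a declaration.

```cpp
// Toy stand-ins for illustration only; not a real threading API.
struct Mutex {
    bool locked = false;
};

// RAII guard: marks the mutex locked for exactly as long as the guard lives.
class Lock {
public:
    explicit Lock(Mutex& m) : mutex_(m) { mutex_.locked = true; }
    ~Lock() { mutex_.locked = false; }
    Lock(const Lock&) = delete;
    Lock& operator=(const Lock&) = delete;

private:
    Mutex& mutex_;
};

// Unnamed temporary: constructed and destroyed within the one statement,
// so the mutex is already released when the next line runs.
bool held_after_temporary(Mutex& m) {
    Lock{m};
    return m.locked;
}

// Named guard: lives until the end of the function, so the mutex is still
// held when the return value is computed.
bool held_after_named(Mutex& m) {
    Lock lock{m};
    return m.locked;
}
```

Both versions compile cleanly and differ by a single identifier, which is why it is so easy to read right past it.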
But even though PC-Lint is pretty good, it ain't perfect. I have found that it has some trouble with advanced C++ templates (e.g., policy-based design). I have submitted bug reports for many of these problems, and they do seem responsive in working them into the patches.
Finally, here's an article from 2006 discussing the available static analysis tools for C, C++, and Java and describing how and why to integrate them into your development process.
It's not "error free", it's _run-time error free_. Which according to the GP's link means that no undefined behaviour according to the C standard or user added asserts may happen.
So for example, the program won't ever divide by zero or overflow an integer while summing.
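To make those two cases concrete, a rough sketch of code in which neither run-time error can occur, because every risky operation is checked first. The function names checked_sum and checked_div are made up, and this only illustrates the property being claimed; it says nothing about how any particular tool proves it.

```cpp
#include <limits>
#include <vector>

// Sums the values; returns false instead of overflowing int.
bool checked_sum(const std::vector<int>& values, int& out) {
    const int max = std::numeric_limits<int>::max();
    const int min = std::numeric_limits<int>::min();
    int total = 0;
    for (int v : values) {
        if (v > 0 && total > max - v) return false;  // would overflow upward
        if (v < 0 && total < min - v) return false;  // would overflow downward
        total += v;
    }
    out = total;
    return true;
}

// Divides a by b; returns false for the two cases that are undefined in C
// and C++: division by zero, and INT_MIN / -1 (the result does not fit).
bool checked_div(int a, int b, int& out) {
    if (b == 0) return false;
    if (a == std::numeric_limits<int>::min() && b == -1) return false;
    out = a / b;
    return true;
}
```

A verifier of the kind described would either prove that such checks are unnecessary at each call site or point at the paths where they are missing.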
So, basically, you're good at picking fonts, and that pie chart was really gorgeous. Congratulations.
NASA has analyzed several static analyzers and continues to apply a couple to all the flight software projects under analysis. Most of the issues we identify are from the output of these analyzers. Coding errors identified by such analyzers generally have little relationship to design and requirement issues. Issues found during the requirement and design phases of development are generally more valuable to the projects, but they do not prevent coding errors. Static analyzers do find the types of errors listed in the original post. Any development project that neglects to use lint-type analyzers routinely throughout the development lifecycle will effectively raise the cost of development and increase risk.
Integration and Test can identify some of the problems, such as race conditions and memory leaks. However, the complexity of today's applications, NASA flight software included, nearly always demands more resources than are available to Integration and Test Analysis. Much of the time, if not all of the time, spent testing is to validate that the code functions under nominal conditions. The amount of stress testing applied often will not uncover many of the straightforward issues that a "simple" static analyzer will.
The output of false-positive results is the weak point of static analyzers. Commercial efforts have attempted to incorporate algorithms to filter out false positives, with some success. The ratio of false positives to real errors can be 10:1 or higher. An analyst is required to review the results and identify real errors. But it is usually only the first build analyzed that has an inordinate number of false positives. Thereafter, results can be differenced and the net output of newly flagged issues is significantly reduced.
Unquestionably, eliminating all the coding errors possible with static analyzers improves the ease of startup of testing, speeds up testing (by eliminating the need to re-execute test runs) and improves the level of assurance that test results provide.
Altogether, V&V of requirements and design does not prevent coding errors, and static analyzers are not a replacement for exhaustive testing. The latter two analysis processes are complementary, and all of them are necessary to assure that the software performs as intended and responds to adverse conditions appropriately.
I think you should try Coverity Prevent. You will be amazed how one pointer set to NULL only under a certain set of circumstances and passed through several functions gets dereferenced - and Prevent will catch it. I have been in the industry for many years and had a clear idea of what is possible and what is not. Well, my ideas had to change after using Prevent. See http://www.coverity.com./ There are some other tools on the market, but their depth is not even close.
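The pattern being described looks roughly like the sketch below: a pointer that is null on only one path, handed through a couple of functions before anything touches it. All function names here are hypothetical, and the guard is left in so the example is safe to run; it is the absence of that guard, several calls away from where the null originates, that an interprocedural checker reports.

```cpp
#include <cstddef>
#include <cstring>

// Returns nullptr only for unknown keys: the "certain set of circumstances".
const char* lookup(const char* key) {
    if (std::strcmp(key, "host") == 0) return "localhost";
    return nullptr;
}

// Middle layer that forwards the pointer without looking at it.
const char* pass_through(const char* value) {
    return value;
}

// The dereference happens here, far from where the null was produced.
std::size_t length_of(const char* value) {
    if (value == nullptr) return 0;  // without this line, strlen(nullptr) is the bug
    return std::strlen(value);
}

std::size_t setting_length(const char* key) {
    return length_of(pass_through(lookup(key)));
}
```

Reading length_of on its own gives no hint that a null can arrive, which is why this kind of defect tends to survive code review.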
A good article on this exact question can be found here, with a case study:
http://www.embedded.com/columns/technicalinsights/207000574?_requestid=815557