Firefox Analyzed for Bugs by Software
eldavojohn writes "In a brief article on CNet, a company named Coverity announced that Firefox is using software to detect flaws in Firefox's source code. Even more interesting is the DHS initiative for Coverity to use this same bug detection software on 40 open source projects." An interesting tidbit from the article: "Most of the 40 programs tested averaged less than one defect per thousand lines of code. The cleanest program was XMMS, a Unix-based multimedia application. It had only six bugs in its 116,899 lines of code, or .51 bugs per thousands lines of code. The buggiest program is the Advanced Maryland Automatic Network Disk Archiver, or AMANDA, a Linux backup application first developed at the University of Maryland. Coverity found 108 bugs in its 88,950 lines of code, or about 1.214 bugs per thousand lines of code." We've covered this before, only now Firefox is actually licensing the Coverity software and using it directly.
That's .051 bugs per thousand lines of code for XMMS, an order of magnitude better.
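For anyone who wants to check the arithmetic, here is a quick sketch using only the defect and line counts quoted in the summary. The function name is made up for illustration.

```python
def defects_per_kloc(defects: int, lines_of_code: int) -> float:
    """Return the number of defects per thousand lines of code."""
    return defects / lines_of_code * 1000


# Figures as quoted in the summary.
xmms = defects_per_kloc(6, 116_899)
amanda = defects_per_kloc(108, 88_950)

print(f"XMMS:   {xmms:.3f} defects per KLOC")
print(f"AMANDA: {amanda:.3f} defects per KLOC")
```

The AMANDA figure matches the article; the XMMS figure comes out ten times smaller than the ".51" the article printed.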
If this is the same as most automated testing software I've seen, it detects many things which aren't truly bugs as bugs. Accuracy on automated testing tools I've been exposed to is around 40%.
I will definitely take another look at Coverity's products, if the Firefox team is finding value in it.
Why can't I mod "-1 Idiot"?
Err?... I always thought Bugzilla was just where you reported bugs in Mozilla suite products?...
How does this bug detection software work anyway?
If you look at the Coverity site ( http://scan.coverity.com/ ) you will see that there are already multiple projects that have brought their bugs down to zero, Samba being one of the earliest.
Bugzilla is issue-tracking software; it's useful only after you've already found a bug. The only other bug-related tool they use is the FullCircle crash reporter thingy, again an after-the-fact thing. This is different - this tool finds flaws from the source code automatically.
I find the AMANDA results interesting because AFAIK it hasn't received a code rewrite since the early '90s. I think an interesting study would be to compare older projects with ones that have been rewritten from the ground up, comparing the rate of new bugs introduced as opposed to those hidden in legacy code.
If an officer ever threatens to taze you, say you have a pacemaker.
"It had only six bugs in its 116,899 lines of code, or .51 bugs per thousands lines of code."
Sounds like someone needs to run this debugger on their calculator.
Or that job is left for the monkeys banging on the keyboards.
> I hope these Coverity guys aren't pompous enough to think that their tool can find ALL bugs in a program with... magic...
I am sure that they know their tool's limitations, but I am pretty sure that others will interpret no outstanding bugs as meaning the application is secure or bug-free. Ethereal (now known as Wireshark) has a very low bug count, but I will not use it due to numerous past remote exploits coupled with little interest in fixing bugs as opposed to adding new features.
> Hmm, they should run their tool on its own source code, that would be fun.
I would be very surprised if they did not.
Finding all POSSIBLE bugs in a software program means traversing all possible paths in the code with all possible inputs. That's a HUGE problem. You can "model" the code using Logic Equations and that helps some but any errors in the conversion from code to logic equations invalidate results. The DoD and NASA have spent many millions on solving this problem over the last 10-12 yrs. When I was at NASA we used several different tools (CodeSurfer, Purify, Lint, Polyspace as I recall) as each tool was better at one thing (e.g. memory leaks vs null pointer dereferences). The complete process took a couple of days to weeks and then human eyes and expertise were still needed to remove false positives. A good site for all the tools out there, old & new is http://spinroot.com/static/. Looks like Coverity might be a good one to look into, as the best I had seen was CodeSurfer. All the good tools I have seen are commercial (NOT open Source) and EXPENSIVE!! I'd love to see a decent open source tool to run as a first pass before applying the other tools. Another point is that these tools are STATIC analysis. Run-Time Analysis is a whole 'nother animal but that area is improving with tools like DTRACE in Solaris.
If they gave it its own source, wouldn't that be like us playing with DNA and stem cells? -Kemo
Amanda works on many unix and unixoid operating systems, it's not a "linux" backup system. It's used primarily for driving remote backups to big tape libraries, most /. reading linux users would never have systems large enough to justify its use. :-)
Amanda IS, however, being very actively developed right now: lots of new features -> lots of new bugs. The other issue is that it's a componenty, plugin architecture, made of a few processes communicating over pipes and sockets. A failure in one component won't necessarily be a security risk or take the whole system down; it's extremely robust in normal operation in my experience, despite this "high bug count". Unlike XMMS, various contributed plugins (e.g. tape changer robot drivers) are redistributed in the source tarball but only used by very small numbers of people with outlandish hardware.
I suspect if you included various XMMS plugins in the XMMS count, things would be different...
None of that *really* excuses a high bug count - but what really pisses me off is coverity's "we've found X bugs, but we're not going to tell you what they are or substantiate our claims" approach (some of amanda is quite old code and has a lot of strcpys; I know that some automated security checkers will treat a strcpy as a "bug" even if it's safe), which just FUDs your project in various public fora...
One has to wonder if these are coding/language bugs or logical bugs. Finding coding bugs is of course a valuable time saver, but the challenging and usually most costly bugs are of the logical sort, and invariably app specific.
In this age of SarbOx and risk management there is a real competitive advantage to F/OSS over proprietary code for large companies: audit-ability. In previous roles I've had to attest under HIPAA::Security that proprietary code was "secure" -- how? All I could do was obtain a vendor statement that was as non-committal and burden-shifting as possible. Yet a true ability to audit the code my pharmaceutical company depended on would tilt the balance between similar-featured Closed vs Open source solutions. Especially today.
Ok, maybe nobody really cares about the 'many eyes' theory anymore. Regardless, the "open the hood" theory still applies, perhaps more than ever.
-- @rjamestaylor on Ello
Coverity segfaulted whilst auditing MS Vista.
Here are some links to show the bugs in the Bugzilla database which were turned up by Coverity.
Open Coverity Bugs
All Coverity Bugs
I should hope not, as that is demonstrably false. For example, at one point the KDE project with its I-don't-know-how-many-millions of lines of code had a coverity rating of 0 open bugs, but I'm sure no one is silly enough to think that such a large and complex project has no bugs at all!
Most static analysers look for very simple, easily machine-detectable, low-level imperfections which could conceivably lead to hard-to-spot bugs - not initialising a variable before it is used is probably the classic example of the kind of "bug" that would be detected by an analyser such as Coverity. I imagine Coverity is quite a lot more sophisticated than that, though :)
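To make the "used before it is initialised" idea concrete, here is a toy checker written against Python source rather than C. It is only a sketch of the general technique and says nothing about how Coverity itself works; the function name and sample input are invented, and it deliberately handles nothing but straight-line top-level code.

```python
import ast
import builtins


def use_before_assignment(source: str) -> list[tuple[int, str]]:
    """Flag names that are read before any assignment.

    Toy check: only looks at straight-line, top-level statements and
    ignores branches, loops, function bodies and imports.
    """
    assigned = set(dir(builtins))  # built-in names count as already defined
    findings = []
    for stmt in ast.parse(source).body:
        names = [n for n in ast.walk(stmt) if isinstance(n, ast.Name)]
        # Check the reads first, so that "x = x + 1" with no earlier
        # assignment to x is still reported.
        for node in names:
            if isinstance(node.ctx, ast.Load) and node.id not in assigned:
                findings.append((node.lineno, node.id))
        # Only then record what this statement assigns.
        for node in names:
            if isinstance(node.ctx, ast.Store):
                assigned.add(node.id)
    return findings


sample = "total = count + 1\ncount = 5\nprint(total)\n"
print(use_before_assignment(sample))
```

Nothing is executed here; the source is only parsed and walked, which is what makes it static analysis.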
Looks like somebody failed troll academy ;)
Unless the program's domain is restricted to context-sensitive languages. In fact, it is impossible for a computer to try to decide anything more general than a context-sensitive language because anything bigger requires the resources of a Turing machine, which has infinite memory. Computers implementable in a finite amount of matter are equivalent to linear bounded automata, not Turing machines.
AMANDA could easily be the buggiest OSS program in existence, and it would still be OK. The reason? It just has to be less buggy than Netbackup, and more usable than Legato. Luckily for the AMANDA developers, these are very, very difficult criteria to miss.
"People who do stupid things with hazardous materials often die." -- Jim Davidson on alt.folklore.urban
Wait! Is that the source code?!
the mods may say you posted flamebait, but to me it's a flame that warms my heart. rock on, brother! --chebucto
And thank god for that score. I don't mind my AMANDA backups being corrupted, but if my mp3 of "Hit Me Baby, One More Time" ever skipped a beat, I wouldn't know what to do...
Coverity's own site shows how many defects each project has fixed. The number of outstanding defects in AMANDA is now zero. ZDNet reported the fixes back in April.
Those that follow amanda-hackers will know that there was less than a week between when coverity released the report on March 6th and it was announced that all bugs were fixed in AMANDA on March 12th.
Coverity sounds like a scam. It is not possible for a program to analyze another program and find all the bugs; see halting problem.
I would find heuristic analysis annoying. I'd get quite annoyed if the program says "fix this buffer overflow" 1000 times because I use "strcpy" somewhere - even though I'm very careful and only use it when I know it can't overflow.
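The kind of purely heuristic check being complained about can be sketched in a few lines. This is a deliberately naive pattern match, not a description of any real product; the function name and the C snippet are invented for illustration.

```python
import re

# Matches a call to strcpy, but not my_strcpy or strcpy_s.
STRCPY_CALL = re.compile(r"\bstrcpy\s*\(")


def flag_strcpy(c_source: str) -> list[int]:
    """Return the 1-based numbers of lines that contain a strcpy call."""
    return [
        number
        for number, line in enumerate(c_source.splitlines(), start=1)
        if STRCPY_CALL.search(line)
    ]


c_code = '''char buf[16];
strcpy(buf, "ok");        /* safe: the literal fits in buf */
strcpy(buf, user_input);  /* unsafe: length never checked */
'''
print(flag_strcpy(c_code))
```

Both calls get flagged, the safe one included, which is exactly the false-positive behaviour described. A tool has to reason about buffer sizes and data flow to tell the two apart.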
I should write a program that searches for odd perfect numbers and terminates if it finds one. I wonder whether Coverity would say it is an infinite loop.
Coverity sounds like scare tactics to make money by claiming to do the impossible. They won't even disclose what their algorithm is. I would never trust them, especially on closed-source programs. Firefox doesn't have that risk, but they are wasting money.
Microsoft's PREfast is simpler but seems like a much more realistic solution: mark up your code to say how things are supposed to be used and the compiler can decidably sense problems. I'd just get tired of typing 2 underscores a million times.
Melissa
"Screw Sun, cross-platform will never work. Let's move on and steal the Java language." - Visual J++ Product Manager
And I'm assuming that they mean "Mozilla is using Coverity..." or "Firefox developers are using Coverity...". After all you don't hear about what Internet Explorer is doing, but rather what MS are doing with it.
Wouldn't it be great if the summary was clearer and neither of us had to make mental amendments?
The real question is, what happens if they run it on itself and it reports that it DOES have bugs? Suddenly we're in "this statement is false" territory...
> Finding all POSSIBLE bugs in a software program means traversing all possible paths in the code with all possible inputs. That's a HUGE problem.
That is provably impossible for applications in general with today's software tools.
So tools concentrate on common problems, or low-hanging-fruits, so to speak.
> You can "model" the code using Logic Equations and that helps some but any errors in the conversion from code to logic equations invalidate results.
There are several logic models where everything expressed in those models is provably true or false. But using these models demands a higher level of mathematics, and more tolerance of "slow" progress, than just about any business or open source project will accept. Of course, you need a programming language where this is practical, and C/C++ does not cut it.
> I'd love to see a decent open source tool to run as a first pass before applying the other tools.
OpenBSD has recently put quite a bit of effort into making the in-tree lint much more useful.
zmanda and other amanda hackers have been actively developing AMANDA. While the comparison of bugs in new code and legacy code might be interesting, one wouldn't really see this by just counting projects.
I took a look at the bugs and, funnily enough, almost all of these would be caught immediately by the C# compiler or would be non-issues (memory leaks). The remaining ones (and many more) would be detected by FxCop. And then there is still Spec#! Like someone said before, the really hard bugs are logical. So really, one question: how is this newsworthy?
The only thing that this teaches me is that a language/platform that allows better typing, memory management and static analysis is far, far more robust and productive in the end.
Intelligence shared is intelligence squared.
Hmm, they should run their tool on its own source code, that would be fun.
And if they figure out how to get the tool to modify and improve its own code we'll have Strong AI.
"I am the king of the Romans, and am superior to rules of grammar!"
-Sigismund, Holy Roman Emperor (1368-1437)
Hate to tell you, but Purify is most definitely a dynamic analysis tool. Basically it works by substituting malloc, new and friends and then showing you your program's behaviour on the fly (good at catching things like memory leaks). Static analysis tools like lint do not involve executing the program, just analyzing its structure to see which code paths are followed; dynamic analysis tools hook the actual execution.
--- Nick, hard at work
Funny selection of programs; I don't see rsync on the list. From the article: DHS wants to reinforce the quality of open-source programs supporting the U.S. infrastructure. So, XMMS (an MP3 player) is more important to the U.S. infrastructure than rsync?
I'm somewhat surprised to see amanda being badmouthed here by this tool. It was mentioned on the amanda-users list a few months back that the amanda tree had been checked by coverity, and the 2 bugs coverity found were promptly fixed.
That's not to say that as new features are added, new bugs haven't been too, but to actually call amanda a truly buggy application does stretch this user's belief a wee bit. I'm currently running a 20060424 dated snapshot of the 2.5.0 tree, with no hiccups at all.
--
Cheers, Gene
You say this as if it invalidates his point. Since (as you would obviously agree) no computer is more powerful than a Turing machine, if something is impossible for a Turing machine it is necessarily impossible for a computer as well. If anything, your quibble makes his argument stronger.
--MarkusQ
can they run these programs on the source of the program itself to look for bugs? Or would that be like the human brain being able to completely understand itself inside and out (aka. not possible)
Life is rarely fair. Cherish the moments when there is a right answer.
Lots more to run time analysis than what Purify does. I can check for the same faults w/o executing the code in many other tools.
Yeah, but my point is that the parent and the site he referenced claimed that Purify was a static analysis tool; it isn't.
--- Nick, hard at work
I had some extensive conversations with the team at CodeSurfer and they think the problem is NOT impossible, maybe more like Polynomial time. The DOD was funding them (this was about 3 yrs ago) to try to develop a solution that worked for C/C++ and Ada. NASA wanted to tag along on the research but we were told it was "classified" and DOD only. It's rare when someone turns down research money so they must be on to something.
As an object oriented programmer, I always follow the general rule of having a function always give the same output for the same inputs. That is, you then don't have to worry about the 'state' of an object and you as a result have fewer paths to test and fix. This is why, IMO, global variables aren't such a good thing unless they are constant/rarely change.
This should be common knowledge to a good object oriented programmer, but I wonder how often it's employed in the 'C' discipline.
A function that always returns the same value given its inputs is part of functional programming, not object-oriented programming. Most OO code is littered with side-effects and state-dependent behaviour. If you like to program in such a way, you may find yourself much more comfortable with a functional programming language. Languages like Haskell even enforce this.
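A small sketch of the distinction, with invented names: the first function's result depends on hidden state, the second depends only on its argument.

```python
counter = 0  # module-level state shared by every caller


def next_id() -> int:
    """Impure: the result depends on, and changes, hidden state."""
    global counter
    counter += 1
    return counter


def double(x: int) -> int:
    """Pure: the same input always gives the same output."""
    return 2 * x


print(double(2), double(2))    # identical results on every call
print(next_id(), next_id())    # a different result on each call
```

Testing `double` needs one case per interesting input. Testing `next_id` also needs every interesting value of `counter`, which is where the extra paths come from.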
That works great until you want to do something really off the wall... like input.
-- The act of censorship is always worse than whatever is being censored. Always.
Tools like this aren't for perfecting software. They will often find a lot of bugs so that QA doesn't have to. The software will still need to be tested by a person, but some of the work will already have been done. The best way to find bugs is to have a person see if they can break the software.
Stupidity is like nuclear power, it can be used for good or evil. And you don't want to get any on you.
"Coverity sounds like a scam. It is not possible for a program to analyze another program and find all the bugs"
What a silly reason! How about gzip etc then?
"gzip sounds like a scam. It is not possible for a program to analyze any data and always compress it successfully"[1].
I could go on: "life sounds like a scam..."
But I suggest you wake up to the harsh imperfect real world some time and leave that sort of thinking to the run-of-the-mill "academics".
How you decide whether Coverity is good or not should be like how you decide whether gzip is good or not. If Coverity doesn't find bugs any better than gcc does, then it is probably useless to most people.
[1] On a related note, in my opinion programming can be viewed as a type of compression.
Good programming practice says ANY function should give the same outputs ALL the time for the same inputs (i.e. if you put in a 2 today you get out a 4 and the same thing tomorrow). What you seem to be talking about are "side effects" where a global variable or input parameter is modified within the context of a function. Some programming languages DO allow you to change the value of a parameter within the function and that result is passed back to the caller. In fact that's easy to do in C with pointers. Harder to do in other languages. Either way IMHO, it's a horrible programming practice. The hardest thing I ever saw was a bunch of C programmers trying to learn how to code in Ada. All the "shortcuts" they used to use were removed by strong typing and strict rules. Testing of OO code where you are changing the internal state of an object via one of its methods or via another method (such as in C++) makes it a LOT harder to develop good tests and, I would suspect, good code analysis tools.
So, while absolutely true that a proprietary vendor can run the code checker on their code as well as an open source project, there is a huge difference when it comes to the customer/user of said software: with Open Source the user has the freedom to run such a tool over the source code themselves.
Actually, I would argue that it isn't just a freedom, it's a necessity. Having the source open means that wrongdoers can use bug-seeking programs to find exploits (presumably they have already been doing so for a while). So just to even things out, the Open Source community should scan them as well. This issue shouldn't be ignored.
Of course, closed-source programs are also being scanned by exploit-seeking software (it's not too hard to e.g. search for all calls to strcpy in a binary). So this isn't a 'new' problem with open-source projects. But, with the advantages of Open Source come a few risks, which we should deal with, as mentioned in the previous paragraph.
What are they teaching kids nowadays?
im in ur
We were using it since it was the Meta Compiler. I believe we had some interns from the project. They used our codebase to research their algorithms and we got free scanning. We may well be using the Coverity commercial code today.
Firefox is, once again, the most unstable program in common use.
The 1.5.0.4 version of Firefox was quite stable, if the Flashblock extension was installed. The 1.5.0.6 version is unstable again. The CPU-hogging bug is back!
This comment posted from a copy of Firefox that is constantly using 2.8% of the CPU, even when all pages have been loaded, and there is no active content. That's 2.8% on the way to 70% or more, making it necessary to close Firefox and reboot Windows XP.
There are some bugs found by Coverity left unfixed, but so far things have gotten worse since 1.5.0.4, not better.
The halting problem is not an issue for program verification. This claim is raised repeatedly by the clueless, and it just isn't an issue.
Yes, you can construct a program that's formally undecidable. It's a hard way to write a bad program. It takes some work, and the resulting program is unlikely to be useful.
Most crash-type and security-hole problems in programs are entirely decidable. This is because almost all subscript calculations are composed from addition, multiplication by constants, and logic operations. Those are totally decidable, and there are good decision algorithms for that problem. Only when multiplication of two variables (both non-constant) is introduced can formal undecidability appear. See Presburger arithmetic.
In fact, halting is decidable for all deterministic machines with finite memory. Either you repeat a previous state, or halt within a finite number of cycles. The decision process may be made arbitrarily hard, but that's not undecidability. True undecidability in the Turing sense requires infinite memory.
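The "repeat a previous state or halt" argument can be sketched directly. This assumes a deterministic machine whose whole state fits in one hashable value; the two example machines are made up for illustration.

```python
from typing import Callable, Hashable, Optional


def halts(step: Callable[[Hashable], Optional[Hashable]], start: Hashable) -> bool:
    """Decide halting for a deterministic machine with finitely many states.

    `step` maps a state to the next state, or to None when the machine halts.
    """
    seen = set()
    state = start
    while state is not None:
        if state in seen:
            # Deterministic and back in an old state: it will loop forever.
            return False
        seen.add(state)
        state = step(state)
    return True


# Two hypothetical machines whose entire state is one 8-bit value.
def countdown(n):
    return None if n == 0 else n - 1


def spin(n):
    return (n + 2) % 256


print(halts(countdown, 10))
print(halts(spin, 1))
```

The loop always terminates because `seen` can hold at most as many entries as there are states. With real memory sizes that set is astronomically large, which is the "arbitrarily hard, but not undecidable" point.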
Most of the practical problems with program verification come from dealing with interactions between various parts of the program. Containing those interactions well enough that you can localize problems is constraining on the programmer. "Design by contract" languages like Eiffel try to do that, but they're not popular. Retrofitting design by contract into C and C++ has been discussed, but the proposed schemes all have holes you could drive a truck through. A big truck.
Although software work seldom uses proof of correctness techniques, there's a whole industry doing it for hardware. There was a machine-generated formal proof of correctness for the FPU in AMD's K7 processor. AMD thus avoided the "Pentium division bug".
After looking at some of the results from the Firefox sources, I see that "bugs" include unreferenced variables and dead code that never gets executed.
It looks like most of the real bugs consist of not checking return values, the worst being routines that act upon an object allocated by another routine without checking for null pointer.
Dan East
Better known as 318230.
The discussion on this bug which was eventually resolved as WONTFIX is quite interesting, IMHO.
I'll probably be modded down for this...
XMMS, a multimedia/mp3 player, was tested as part of what the article calls a "$1.2 million, three-year grant [the Department of Homeland Security] awarded to a team consisting of Coverity, Stanford University and Symantec Corp" that was set up to "reinforce the quality of open-source programs supporting the U.S. infrastructure".
40 programs were tested. 40 open source programs. Not even all the programs installed by, or regularly used on, a default install of a particular distro or two; just 40 programs. I thought maybe these 40 were just the first 40 tested, but the original announcement of the award of the grant states that 40 programs would be tested.
And yet they didn't test BIND? ssh? Also, PostgreSQL is on the results list, but MySQL isn't? Did Homeland Security put this list together?! Using a dartboard and a list of open source applications, or what?!
This seems like a great software package, and I'm glad that Homeland Security acknowledges that "much of the critical infrastructure runs on open source", but I could think of a few other ways they could've spent $1.2 million, or at least a few other applications they should've tested before they got to XMMS.
Believe it!
The road to tyranny has always been paved with claims of necessity.
You're begging to get flamed, aren't you? Anyway it's great that OSS projects are doing code auditing much like closed source ones often do.
It does. The halting problem proof applies only to machines with infinite memory. There exists a trivial algorithm to determine whether a program halts when run on a linear bounded automaton: given a capacity of b bits and s states, run b*s*2^b steps and see if the machine has halted. This reduces an undecidable problem to an NP-hard problem. Further optimizations are possible, such as by detecting obvious cycles in the state of the machine and/or by recognizing parts of the system as regular or context-free.
But in practice, the question is not whether the program halts when run on a Turing machine but whether it halts when run on an LBA. A program for a Turing machine halts when run on an LBA if it halts normally or if it steps the cursor beyond the linear bounded portion of memory (which causes a special type of halt called a segmentation fault).
No.
I just thought the summary could have been worded a bit better, that's all.
VOTE!
Detecting flaws != reporting flaws. The summary was clear as day to me.
However, that makes no sense. For instance, if your function returns that most recent 5 records from a database table, then the result will often be different depending on when you run it. Then again, I guess you could count the database contents as an input, however, there's lots of functions where you wouldn't get the same output for the same inputs. Random number generators come to mind. Mind you, the whole purpose is to produce different results. However, if you consider all data used to be input, well, then apart from hardware faults flipping bits, it's impossible to get different outputs.
Anthropic principle: We see the universe the way it is because if it were different we would not be here to see it.
It seems silly to me that we're still looking for memory leak bugs, buffer overrun/strcpy-type stuff, and pointer dereferencing bugs. These problems have been fundamentally solved (or at least all but abstracted from the programmer) by managed code environments like Java, .NET, and others.
Why are we in the IT world still causing ourselves problems by using C/C++ in any situation except those which call for the strengths of C/C++ -- strengths which are quickly being matched by their managed counterparts?
Realtime? Embedded? Video game? Ok, use C/C++ (though even video games you could make an argument...). For everything else, there's managed code. No more memory pointer leaks (well, the hard-to-find kind caused by poor pointer management in C/C++), no more buffer overruns that aren't immediately fixable in one place, etc, etc.
C'mon, we, as the IT profession, have evolved past that. Why are we still trying to work around these already-solved problems?
I hope these Coverity guys aren't pompous enough to think that their tool can find ALL bugs in a program
We aren't (I'm a Coverity employee). We find real bugs, and we find false positives (but not too many of those).
Hmm, they should run their tool on its own source code, that would be fun.
We do that regularly.
I forgot to add: with many good libraries.
Intelligence shared is intelligence squared.
If I understand you correctly.
Suppose I have a 1Mb (=2^20 bits) memory and some smallish number of states (say 2^8). Then we would need to run the program for 2^20 * 2^8 * (2^(2^20)) steps. This might take a while - if I were to start it running tonight on a nice teraflops machine (2^40 operations per second) it would be done in only about 2^(2^20) seconds. I'll start it running tonight and let you know when it's done.
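To put a size on that step count, here is a quick sketch that works with exponents only, since the number itself is too big to be worth printing. It takes the 2^20 bits of memory and 2^8 states from above; the function name is made up.

```python
import math


def log2_step_bound(bits_exp: int, states_exp: int) -> int:
    """Return log2 of b * s * 2**b, where b = 2**bits_exp and s = 2**states_exp."""
    bits = 2 ** bits_exp
    return bits_exp + states_exp + bits


log2_steps = log2_step_bound(20, 8)

# A power of two 2**n has floor(n * log10(2)) + 1 decimal digits.
decimal_digits = math.floor(log2_steps * math.log10(2)) + 1

print(f"steps = 2^{log2_steps}")
print(f"which is a number with {decimal_digits} decimal digits")
```

Dividing by any realistic number of operations per second only knocks a few dozen off an exponent of more than a million, so the running time is still about 2^(2^20) seconds.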
The CPU hogging bug only occurs when Firefox windows and tabs are kept open over a period of hours or days. (See the link for a description.)
This causes lots of severe problems for heavy browser users, like equipment buyers, for example. Buyers often visit several pages, then have to wait for information, and while they are waiting, they work on buying other items.
They'd also be useful for process improvement. One method of measuring the quality of your QA process that we (briefly) studied in one of my software engineering classes is a technique called defect seeding. Basically, before sending a unit to QA, you inject known defects. By measuring how many of the known defects were caught by QA, you can get an estimate of the effectiveness of your QA process. The problem is injecting realistic defects. You could use this tool before sending the unit to QA to find a set of known defects that most certainly are realistic, because they're real defects. Of course, if your QA organization uses this tool as well, then your defect detection metric is boned... but QA shouldn't be looking at the source, anyway.
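The arithmetic behind defect seeding is short enough to sketch. The function names and the example numbers are invented, and the second function rests on the usual textbook assumption that QA catches real defects at the same rate as seeded ones.

```python
def qa_detection_rate(seeded: int, seeded_found: int) -> float:
    """Fraction of the deliberately injected defects that QA caught."""
    if seeded <= 0:
        raise ValueError("need at least one seeded defect")
    return seeded_found / seeded


def estimated_real_defects(seeded: int, seeded_found: int, real_found: int) -> float:
    """Extrapolate the total number of real defects in the unit.

    Assumes QA finds real defects at the same rate as seeded ones.
    """
    rate = qa_detection_rate(seeded, seeded_found)
    if rate == 0:
        raise ValueError("QA found no seeded defects; cannot extrapolate")
    return real_found / rate


# Hypothetical run: 20 defects seeded, QA caught 15 of them plus 30 real ones.
print(qa_detection_rate(20, 15))
print(estimated_real_defects(20, 15, 30))
```

The extrapolation is only as good as the seeded defects are realistic, which is the weakness mentioned above.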
According to the blog post that announced it, Coverity were scanning 3.9 million lines of KDE code. Although the reports are a bit wonky at the moment, I'm sure Apache has more than 9 lines of code!
I'm scared of numbers that can't be written as a fraction. It's an irrational fear.
Hiring dozens of QA people to discover a broken error path that a developer could have fixed in 5 minutes is inefficient and slow. Instead, developers can find many bugs quickly by exercising the data paths of their own code. After (or even better, before) writing a new snippet of code, a motivated developer writes a unit test for it. The test is simple: given certain specific conditions, your code should return some predictable result.
Example: I need to write code that tells me which of two numbers between 1 and 10 is larger. I would need to write at least 6 simple tests that would guarantee this code will work in virtually every scenario. I would need three tests to make sure it works when the input is valid, and another 3 tests to make sure it works when the input is invalid. Sample inputs:
After a while there will be dozens, hundreds or even thousands of these simple tests. They will not mean much alone, but chained together they can provide a very stable testing bed. By running all the tests every time you release another version, developers can be pretty sure that everything that worked in the past will still work (see regression testing).
Anyway, I didn't mean to start a lecture on basic testing principles... This post was motivated by the claims in the article regarding code quality. I would take the claim that 'XMMS is the most bug free software' with a grain of salt; a big grain. The results of any testing procedure are only as good as the individual tests. Let's say that the above example was one of the tests that Coverity was using on XMMS. Maybe it passed, great! But what would happen if a user input -13 instead of 12? Or maybe -5,000,000,000 and 1? How about the letter A as an argument?
They don't even give us code coverage numbers, or a count of tests... just a claim. I would bet that the AMANDA project had 10 times as many tests per line of code as XMMS. That would make AMANDA the most bug free software. Oh, and on another note, I noticed some comments about IE testing. I don't use IE, but I am 100% certain that Microsoft has an internal testing framework that puts this one to shame.
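The rule of three valid and three invalid cases could look something like the sketch below. The function `larger`, its error behaviour and every input value are assumptions made up for illustration; they are not the sample inputs from the post.

```python
def larger(a, b):
    """Return the larger of two integers, each required to be in 1..10."""
    for value in (a, b):
        if isinstance(value, bool) or not isinstance(value, int):
            raise TypeError(f"expected an integer, got {value!r}")
        if not 1 <= value <= 10:
            raise ValueError(f"{value} is outside 1..10")
    return a if a >= b else b


def raises(exc_type, *args) -> bool:
    """Return True if larger(*args) raises exc_type."""
    try:
        larger(*args)
    except exc_type:
        return True
    return False


def test_valid_inputs():
    assert larger(3, 7) == 7   # second argument larger
    assert larger(9, 2) == 9   # first argument larger
    assert larger(5, 5) == 5   # equal arguments


def test_invalid_inputs():
    assert raises(ValueError, 0, 5)    # below the range
    assert raises(ValueError, 4, 11)   # above the range
    assert raises(TypeError, "A", 1)   # not a number at all


test_valid_inputs()
test_invalid_inputs()
print("all six checks passed")
```

Kept in the source tree and run on every release, a pile of checks like these is what becomes the regression suite.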
There are 10 types of people in the world. Those who understand binary and those who do not.
>Finding all POSSIBLE bugs in a software program means traversing all possible paths in
>the code with all possible inputs.
Even that won't find all the bugs. To do that you also need to know what the code is supposed to do.
I.e. You need to know in advance correct outputs for every combination of inputs. Depending on your input domain, that may or may not be impossible.
For a company selling a software product, they seem stupidly protective of how much the damn thing is going to cost me to obtain. Try and find a price sheet on their website. It isn't there.
The less up-front anybody is about costs, the less worthwhile their product usually is. And the more variable the cost usually is (ie: as they figure out how much they can overcharge you). And no, I will not register with them for the "honor" of finding out more information. I'm guessing that it's something stupidly outrageous since the cost of running their application on a bunch of Open Source programs cost $1.2 million - which anyone with a single copy and a free weekend probably could have done for themselves.
They also don't disclose what their product actually does. So I'll join with the other voices here in calling for the need of an open-source alternative to this project - an alternative that has full disclosure about what the product is capable of and what it's going to cost you to use.
The program needs to be suid or run by root (on "hostile" input) for a local exploit to be relevant.
Of course, amanda probably runs as root on "hostile" input, so local exploits can be relevant.
Mozilla Firefox and many other GUI programs react to input as a stream of events. Each event handler could be considered a separate program under the LBA model. Treating things that fit outside a given model as "black boxes" (where a model will fail to find bugs) is still useful for reasoning about things inside the model.
Which is why these static analysis tools concentrate on factoring parts of the program (which is guaranteed to be context-sensitive) into regular and context-free parts that are more suitable for analysis. For instance, a lot of security vulnerabilities come from parser bugs, but luckily a lot of languages encountered by parsers are context-free.
These guys did a presentation at RTECC this year; it was actually fairly interesting. The C/C++ tests range from finding the usual language-specific gotchas by static code analysis (malloc, pointer usage, my buffer overfloweth) to statistical analysis over the entire codebase that flags suspicious sections. If you've been consistently using a variable in a specific way, then break from this in a couple of places, you'll probably get a yellow flag. I'm told the full analysis takes an ungodly long time on large projects, though.
That said, I've never actually used this software, so I can't say how thorough/complete these checks are or how many false positives are generated.
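I have no idea how Coverity actually implements it, but the "consistent use, then a couple of exceptions" idea can be sketched as a simple ratio test. Everything here (the function name, the counts, the percentage threshold) is made up for illustration and is not their algorithm.

```c
#include <stdbool.h>

/* Hypothetical heuristic: flag a usage pattern when the large majority of
 * call sites follow it (e.g. check a return value) but a few do not. */
bool flag_inconsistent_use(unsigned checked_sites,
                           unsigned unchecked_sites,
                           unsigned threshold_percent)
{
    unsigned long long total =
        (unsigned long long)checked_sites + unchecked_sites;

    if (total == 0 || unchecked_sites == 0)
        return false;                      /* nothing deviates, nothing to flag */

    /* flag only if at least threshold_percent of the sites follow the pattern */
    return (unsigned long long)checked_sites * 100 >=
           (unsigned long long)threshold_percent * total;
}
```

With a 90 percent threshold, 18 checked sites and 2 unchecked ones would get the yellow flag, while a 5/5 split would not, since there is no dominant habit to deviate from.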
Caveat Emptor is not a business model.
As an object oriented programmer, I always follow the general rule of having a function always give the same output for the same inputs.
So your objects are pure function bundles with no state? Or do you count the internal state of the object as part of the input and part of the output?
Note to ACs: I usually delete AC replies without reading them. If you want to talk to me, log in.
I made a mistake in my original post by asserting that too strongly. What I meant was that global variables are sometimes misused and functions (or rather methods) can cause object state to be overly complex, giving the code more 'paths' to follow and so probably more bugs. In OO, I think the more correct way of giving function clarity is through pre-conditions and post-conditions like in the design by contract methodology.
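For anyone who hasn't seen contracts written down, here is a minimal sketch using plain assert in C. The isqrt function is a hypothetical example of my own, and assert is only a stand-in for the contract support a language like Eiffel gives you natively.

```c
#include <assert.h>

/* Hypothetical example: integer square root with its contract spelled out. */
int isqrt(int n)
{
    assert(n >= 0);                               /* precondition */

    int r = 0;
    while ((long long)(r + 1) * (r + 1) <= n)     /* linear search, fine for a sketch */
        r++;

    assert((long long)r * r <= n);                /* postcondition: r*r <= n ... */
    assert((long long)(r + 1) * (r + 1) > n);     /* ... and (r+1)*(r+1) > n */
    return r;
}
```

The contract says what the caller owes (a non-negative argument) and what the function promises back, without depending on any hidden object or global state.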
Yes, because the CLR gobbles up so much memory that leaks in C# apps are barely noticeable in comparison.
"The only open-source tool I know of is called FindBugs developed by the University of Maryland"
|ironic mode on|
It must be bullshit when the worst bugs/codelines ratio comes from Amanda... the Advanced Maryland Automatic Network Disk Archiver, from University of Maryland
|ironic mode off|
I know, I know... just joking!
"Coverity was also run on the Windows source code. Unfortunately, the 32-bit integer iterator in Coverity was 1 count too small to store the count of the number of bugs found, and so Coverity's counter rolled-over, showing that Windows actually has -2,147,483,648 bugs. Microsoft employees were ecstatic at the results, and Steve Ballmer was said to be seen dancing in his office, yelling 'developers, developers, developers, developers!!'."
Is Capitalism Good for the Poor?
It's rare when someone turns down research money so they must be on to something.
Maybe, but I'm skeptical.
Classification is just a way to hide bullshit and prevent exposure of incompetence/mismanagement/squandering/etc/etc.
Here's a bug: The program does not terminate in a finite amount of time!
Can this bug be found in polynomial time by another program?
Well, the name Turing should ring a bell.
Generally speaking, this is an unsolvable problem (in *finite* time) for almost all programming languages. [It is trivial for non-Turing-complete languages, but then those languages are limited to only certain classes of computations and aren't very useful].
So forget polynomial time.
From disputing his original claim:
You have been reduced to saying:
In other words, you have gone from claiming that it is possible to write a program that will find all bugs in any program (which, the original poster and I agree, is impossible) to arguing that static analysis can find some bugs in some kinds of programs, which no one is disputing.
The problem with your "finite number of states" argument is that, if you are going to allow real-world constraints to intrude, you will run out of time long before you run out of storage space. Even a very fast machine, with a very small amount of memory, will never be able to get through even a vanishingly small proportion of its states before the universe is canceled due to low ratings.
For all intents and purposes, Turing's results apply to real computers, in so far as the actual real world limitations they face don't change the conclusions.
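To put rough numbers on that, here is a back-of-the-envelope sketch in C. The function name and the rate of 1e18 states per second are my own assumptions for illustration; the age of the universe is taken as roughly 4.3e17 seconds. It computes how many bits of state you could exhaustively step through in the time available.

```c
/* Hypothetical helper: largest b such that 2^b <= states_per_second * seconds,
 * i.e. the number of bits of machine state that could be fully enumerated. */
unsigned enumerable_bits(double states_per_second, double seconds)
{
    double states = states_per_second * seconds;
    unsigned bits = 0;

    /* repeated halving computes floor(log2(states)) without needing math.h;
     * the iteration cap guards against infinite inputs */
    while (states >= 2.0 && bits < 2000) {
        states /= 2.0;
        bits++;
    }
    return bits;
}
```

Under those assumptions, enumerable_bits(1e18, 4.3e17) comes out at 118, so even a machine with a mere 15 bytes (120 bits) of memory has more states than you could visit.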
--MarkusQ
Multiplication can be defined entirely in terms of + and &, a subset of your "addition, multiplication by constants, and logic operations". Consider this function:
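#include <stdint.h>

uint32_t multiply(uint32_t left, uint32_t right)
{
    uint32_t result = 0;
    uint32_t mask = 1;
    for (unsigned i = 0; i < 32; i++)
    {
        /* rightmask is 2^i if bit i of right is set, else 0 */
        uint32_t rightmask = right & mask;
        uint32_t leftmask = 0;
        /* smear rightmask upward: leftmask ends up with bits i..31 set, or 0 */
        for (unsigned j = 0; j < 32; j++)
        {
            leftmask = leftmask + leftmask;
            leftmask = leftmask + rightmask;
        }
        /* left has already been doubled i times, so its low i bits are clear */
        leftmask = leftmask & left;
        result = result + leftmask;
        mask = mask + mask;
        left = left + left;
    }
    return result;
}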
The "for" loops can be completely unrolled because their bounds are constants and the loop counters are never used by the code inside. The entire function then uses only the constants 0 and 1, and consists solely of the instructions "mov" (immediate and register-register), "add", "and", and "ret". Actually, on x86-32, you'd run out of registers, but it'd work exactly like this for x86-64.
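If you don't trust the bit trick, a quick way to convince yourself is to compare it against the built-in * on some sample values. This is only a sketch: add_and_multiply is a condensed restatement of the function above, and count_mismatches and its sample values are my own additions.

```c
#include <stdint.h>

/* Same technique as above, condensed: only +, & and the constants 0 and 1. */
uint32_t add_and_multiply(uint32_t left, uint32_t right)
{
    uint32_t result = 0, mask = 1;
    for (unsigned i = 0; i < 32; i++) {
        uint32_t bit = right & mask;        /* 0 or 2^i */
        uint32_t spread = 0;
        for (unsigned j = 0; j < 32; j++)   /* spread = bit * (2^32 - 1) mod 2^32 */
            spread = spread + spread + bit;
        result = result + (spread & left);  /* adds left * 2^i when the bit is set */
        mask = mask + mask;
        left = left + left;
    }
    return result;
}

/* Counts disagreements with the built-in operator over a small grid of inputs. */
unsigned count_mismatches(void)
{
    static const uint32_t samples[] = {
        0u, 1u, 2u, 3u, 7u, 255u, 65535u, 65536u,
        123456789u, 0x7fffffffu, 0x80000000u, 0xffffffffu
    };
    const unsigned n = sizeof samples / sizeof samples[0];
    unsigned mismatches = 0;

    for (unsigned a = 0; a < n; a++)
        for (unsigned b = 0; b < n; b++) {
            /* reference result: widen, multiply, truncate to 32 bits */
            uint32_t expected = (uint32_t)((uint64_t)samples[a] * samples[b]);
            if (add_and_multiply(samples[a], samples[b]) != expected)
                mismatches++;
        }
    return mismatches;
}
```

A sample grid like this proves nothing in general, of course; it just checks the wraparound cases where a mistake in the masking would be most likely to show.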
"Screw Sun, cross-platform will never work. Let's move on and steal the Java language." - Visual J++ Product Manager
Unfricking believable! You obviously know nothing about this subject, so please, please keep your trap shut and let the grown-ups talk.
This is "insightful?" Speak of waiting for grownups! If the parent thinks he knows about the subject, perhaps he could enlighten the GPP (which is what the post directly above did, explaining the difference between this and bugzilla) rather than flaming as an AC?
How is any insight at all afforded that post? I'd have modded it flamebait. Something is broke here (besides me).
mcgrew's razor: Never attribute to stupidity that which can be explained by greedy self-interest
I am looking at analyzing code for some high-reliability applications. For many applications, it is very difficult to determine the correct outputs for a given set of inputs. Specifically, most of the mathematical proving falls victim to the fact that in the end some poor engineer is developing the specification. At high reliability levels, the engineer may not even know all the fault modes of the inputs, so cannot accurately specify the correct outputs for all fault cases.
Things get worse, if you assume you have statistical failures on each of the inputs. Then the accuracy of the outputs can become governed by how accurately the statistical model describes the type and likelihood of faults on the inputs. Eventually, you reach the conclusion that no matter what we build, somehow, somewhere, it will be vulnerable to a fault, and we are simply doing a fault minimization exercise. We can't prove correctness. We can just minimize the severity and impact of the faults.
The mathematical definition of a function maps a set of inputs (the domain) to a set of outputs (the range). For each and every input a certain output is generated. Ideally a function should return ONE value, be that True/False, a result, a pointer, etc., but multiple returns are possible. Random number functions don't normally require an input (yes, some require a seed on the first call) to return an output. Returning 5 records from a database is most likely done by loading an array or other data structure within the code, not via the return value of a function call. I'm not a Java programmer (like 99% of /.); my background is in embedded, so I don't know if Java has the concept of subroutines that are NOT functions. Every call in C is a function call. Things like loading the records from the DB would be done by passing in a pointer to the location in the array and then indexing and loading data; the function would still return TRUE if it completed the load and FALSE if an error occurred, thus still meeting the definition of a function. Each and every input (pointer) maps to one output, TRUE or FALSE. The array load is actually the side effect.
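A minimal sketch of that pattern in C, with a fixed table standing in for the database. The names load_records and fake_table, and the record values, are hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>

#define RECORD_COUNT 5

/* Hypothetical stand-in for a database table of five record IDs. */
static const int fake_table[RECORD_COUNT] = {101, 102, 103, 104, 105};

/* The return value is a single true/false; the records arrive in dest
 * as a side effect of the call. */
bool load_records(int *dest, size_t capacity)
{
    if (dest == NULL || capacity < RECORD_COUNT)
        return false;                    /* error: nowhere to put the data */

    for (size_t i = 0; i < RECORD_COUNT; i++)
        dest[i] = fake_table[i];         /* the side effect */
    return true;
}
```

The caller checks the one returned value and then reads the five records out of its own array.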
I'm pretty sure it's theoretically impossible for any machine to examine its own programming perfectly (i.e. have a look at the halting problem). Not just "going to take ages" but actually impossible to come up with an algorithm to do it.
So any software solution is not going to be able to find flaws perfectly, but of course they can find common errors (the compiler even does this in a very basic way).
being vague is almost as cool as doing that other thing...
Even better (or worse), you need to verify that all your test results are valid, which for many applications would require a bug-free version of the program that generates the output you compare against.
I can very cheaply write a program which has no false positives, and almost as cheaply one that has no false negatives; the first will run in constant time, while the second is linear in the size of the program :-)
Xenu loves you!
I'm not a programmer but I once wrote a blackjack game in Basic. I assume Prevent is good at finding basic coding errors, but I doubt it could detect defects at a higher level of logic, such as that pilot who wondered what would happen if he flipped the switch to raise the landing gear while still on the tarmac. Well, write off one very expensive fighter jet. Would Prevent have detected this software error?
davecb5620@gmail.com
I work at GrammaTech on CodeSurfer. I thought it might be helpful to clarify a few things:
It's impossible to detect logic errors out of the box (cases of "it's a feature, not a bug!"). (At least, it's impossible without "strong" AI, which would "know" if it's reasonable to allow raising the landing gear while on the tarmac).
It might be possible with techniques such as model checking, but it involves a lot of help from the programmer.
http://wiki.zmanda.com/index.php/Developer_documentation
Amanda: Open Source Backup Software
http://www.coverity.com/news/nf_news_06_27_05_story_9.html
It crashes often and the user interface is a Human Interface Designer's worst nightmare.
IANAL but write like a drunk one.
An example from one of the submissions for the Coverity bugs in Mozilla where the coder who made the patch actually had the balls to set the arrogant committer prick right.
In a society that believes in nothing, fear becomes the only agenda ~ Bill Durodié