Software Code Quality Of Apache Analyzed
fruey writes "Following Reasoning's February analysis of the Linux TCP/IP stack (putting it ahead of many commercial implementations for its low error density), they recently pitted Apache 2.1 source code against commercial web server offerings, although they don't say which. Apparently, Apache is close, but no cigar..."
The difference is that now that someone has found 31 errors in the open source Apache software, they will be fixed fairly quickly whereas closed source software will have to have the company do a cost-benefit analysis, put together a team to do the fixes, probably charge to put out patches or minor upgrades (assuming the product is Microsoft's IIS ;b)...
Only two things are infinite, the universe, and human stupidity,
and I'm not sure about the former.
Umm, Apache 2.1 hasn't been released yet. Current latest stable is 2.0.46.
I can only assume that they're looking through the current DEVELOPMENT codebase -- finding a higher "defect density" in such a development codebase compared with commercial offerings is not exactly unexpected.
They're also selling some automated code inspection product; the press release doesn't go into details as to the severity of the defects found or the testing methodology.
It'll be necessary to read through the full report before drawing any sound conclusions.
Pretty lame when we are talking about server software, where Apache has the lead (Apache 63% vs. IIS 25%, per netcraft.com).
/Esben
"Nobody really checks their email any more. They just delete their spam"
FYI
5100 != 58,944
58,944 is the number from the article.
Great Linux Site
NULL Pointer Dereference (Expression dereferences a NULL pointer) 29 instances
Uninitialized Variable (Variable is not initialized prior to use) 2 instances
They also list the files and code snippets where the errors were found.
In addition, the comparison is made against an industry average of commercial code they have tested this way, NOT against other webservers.
Money for nothing, pix for free
Some things I found interesting:
One of the explanations (given by Reasoning) for a NULL pointer dereference is "can occur in low memory conditions," which I think means the original allocator did not check for malloc failure.
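To make the "low memory conditions" case concrete, here is a minimal sketch of an allocation that is checked before use. The helper name dup_string is hypothetical and is not taken from the Apache source or from Reasoning's report.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper, not from Apache: duplicate a string.
 * Returns NULL if the allocation fails, instead of writing
 * through a NULL pointer. */
char *dup_string(const char *src)
{
    size_t len = strlen(src) + 1;   /* include the terminating '\0' */
    char *copy = malloc(len);

    if (copy == NULL) {
        return NULL;                /* low-memory path: caller must handle it */
    }
    memcpy(copy, src, len);
    return copy;
}
```

Drop the NULL test and the memcpy() writes through a NULL pointer whenever malloc() fails, which is presumably the kind of path the report has in mind.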
So you can get a sense of what a defect looks like, here is #21. The original uses bold and fonts to improve readability, but I don't know how to reproduce that in slashcode:
DEFECT CLASS: Null Pointer Dereference
DEFECT ID 21
LOCATION: httpd-2.1/srclib/apr/misc/unix/otherchild.c : 137
DESCRIPTION The local pointer variable cur, declared on line 126, and assigned on line 128, may
be NULL where it is dereferenced on line 137.
PRECONDITIONS The conditional expression (cur) on line 129 evaluates to false.
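For anyone wondering what that description usually means in practice, here is a sketch of the general shape: a search loop that can fall off the end of a list, followed by a dereference. This is an illustration only, not the code from otherchild.c; the struct and function names are made up.

```c
#include <stddef.h>

/* Hypothetical list node; field names are illustrative only. */
struct child {
    int id;
    int status;
    struct child *next;
};

/* Returns 0 and stores the status on success, -1 if id is not in the list. */
int child_status(const struct child *head, int id, int *status_out)
{
    const struct child *cur = head;

    while (cur) {
        if (cur->id == id) {
            break;
        }
        cur = cur->next;
    }

    /* If the loop ended because (cur) was false, cur is NULL here.
     * Dereferencing it without this guard is what the tool flags. */
    if (cur == NULL) {
        return -1;
    }
    *status_out = cur->status;
    return 0;
}
```

Whether the unguarded version is a real bug depends on whether a caller can ever pass an id that is not in the list, which a static tool generally cannot know.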
One of the explanations (given by Reasoning) for a NULL pointer dereference is "can occur in low memory conditions," which I think means the original allocator did not check for malloc failure.
Apache has its own malloc() that kills the child (and closes the connection) if it fails to allocate enough bytes.
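A rough sketch of that style of allocator, assuming a simple "abort on failure" policy. The name alloc_or_die is hypothetical; this is not Apache's actual allocator or its API, just the general idea.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical wrapper: never returns NULL. On allocation failure the
 * process is terminated, so callers never see a NULL result. */
void *alloc_or_die(size_t size)
{
    /* malloc(0) may legitimately return NULL, so ask for at least 1 byte. */
    void *p = malloc(size ? size : 1);

    if (p == NULL) {
        fprintf(stderr, "out of memory allocating %zu bytes\n", size);
        abort();
    }
    return p;
}
```

With a wrapper like this, a "pointer may be NULL after allocation" path cannot be reached in the caller, but a tool that does not know about the wrapper's behaviour will still report it.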
Metric Report
They make you fill out a form that asks for your email and then do an opt out checkbox at the bottom of the form (you have to check it to NOT get spam from them). The site's a bit slashdotted right now though.
For instance, the first bug is
Each bug report is followed by the snippet of source code containing the defect.
The metric report simply reports the statistics. For instance, the most bug ridden file is otherchild.c. The most common bug class is "dereferencing a NULL pointer".
If the Apache developers simply want to fix the bugs, they can use the Defect Report. If they want to conduct a brutal purge of their contributors, they can use the Metric report.
*Yes, Reasoning wants an email address. They will mail you a URL (a rather simple one at that) to access the reports.
Well, Yes and No. The problem is that there may be no logical way that the pointer may be NULL today. But tomorrow, a new coder will add something that modifies the preconditions and suddenly that pointer can indeed be NULL. Even where you are sure that a condition is impossible, it is usually a good idea to check for NULL in order to avoid future errors.
And for those who haven't seen this trick before, a nice habit to get into is to write your checks like so:
This lets the compiler catch errors where you meant '==' rather than just '='. As in
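A minimal sketch of that habit in C, assuming the check in question is a comparison against NULL with the constant written first. The function and variable names are hypothetical.

```c
#include <stddef.h>

/* Returns 1 if the pointer is NULL, 0 otherwise. */
int is_null(const void *my_pointer)
{
    /* Constant on the left: mistyping this as (NULL = my_pointer)
     * fails to compile, because NULL is not assignable.
     * The usual order, mistyped as (my_pointer = NULL), compiles,
     * assigns NULL, and the condition is then always false. */
    if (NULL == my_pointer) {
        return 1;
    }
    return 0;
}
```

The trade-off is readability: some people find the reversed order awkward, and it only helps when one side of the comparison is a constant.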
FreeSpeech.org
In almost every case they listed the pathway was via a failed malloc.
Apache has its own malloc that kills the connection (and the child) if it fails.
That code can never be reached. Their test is invalid.
Actually, I've found that fixing bugs in large projects is about the same whether or not you are familiar with the project, provided that the author was not smoking crack at the time he wrote it.
For example, I managed to code, test, and patch a "fix" for PostgreSQL this weekend in under 2 hours, having never seen the code before.
The "fix" wasn't a bug, per se; it's just that the output of pg_dump wasn't optimal in my usage for dumping the schema for CVS revision control. I added two flags, -m and -M, which molded the output to my liking.
If you haven't seen your code in two months, you and an outsider have about the same chance at finding and detecting bugs/misfeatures.
Engineering and the Ultimate
Turning on all warnings in gcc (-Wall) catches this, and many other common errors.
(In effect it does a lint-like check on the source.)
MY compiler (Microsoft C++) does catch this
if (myPointer = NULL) {
and issues a warning. Doesn't gcc?
Yes, it does. So does every other C compiler I've ever used (quite a few). I suspect the original poster may be the sort who ignores warnings....
Why?
Looking at their first "bug", a little manual inspection shows that it's in the "can't happen" category, even without knowing about hidden information. The code looks like this:
current_provider = conf->providers;
do {
{some safe code}
if (!conf->providers) {
break;
}
current_provider = current_provider->next;
} while (current_provider);
and they identify the second-to-last line as the "possible NULL pointer reference". Note that the "break" before that line will be taken if the pointer is NULL, so it can't happen. In fact, the static analysis could have determined this if it were a little better at propagating values.
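Here is the same loop as a compilable sketch, so the argument can be checked by running it with an empty list. The struct definitions and the counter are assumptions added for illustration, and it assumes the elided "safe code" touches neither conf->providers nor current_provider.

```c
#include <stddef.h>

/* Hypothetical minimal types; the real structures carry more fields. */
struct provider {
    struct provider *next;
};

struct conf {
    struct provider *providers;
};

/* Counts how many times the flagged line (the ->next dereference) runs. */
int count_advances(const struct conf *conf)
{
    const struct provider *current_provider = conf->providers;
    int advances = 0;

    do {
        /* {some safe code}: assumed not to dereference current_provider */
        if (!conf->providers) {
            break;      /* first pass with an empty list leaves here */
        }
        current_provider = current_provider->next;  /* the flagged line */
        advances++;
    } while (current_provider);

    return advances;
}
```

On the first pass current_provider equals conf->providers, so the break covers the NULL case; on every later pass the while condition has already established that current_provider is non-NULL. A tool has to carry that equality through the loop to see it.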
First conclusion: subtract at least one "bug" from the 31 defects in Apache. This lowers the rate to 0.51, the same as the "average commercial code" number they quote. Yahoo!
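The arithmetic can be checked with a few lines, assuming the 58,944 figure quoted earlier in the thread is the number of source lines inspected and that the rate is defects per thousand lines. The function name is made up.

```c
/* Defects per thousand lines of source code (KLOC). */
double defects_per_kloc(int defects, long lines)
{
    return defects * 1000.0 / (double)lines;
}

/* Under the assumption above:
 *   defects_per_kloc(31, 58944) is about 0.526
 *   defects_per_kloc(30, 58944) is about 0.509, which rounds to 0.51
 */
```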
Second conclusion: their static analysis must identify a lot of false positives, if the very first one in the list is one (I would look at more, but I should really get back to work...)