Slashdot Mirror


Software Code Quality Of Apache Analyzed

fruey writes "Following Reasoning's February analysis of the Linux TCP/IP stack (putting it ahead of many commercial implementations for its low error density), they recently pitted Apache 2.1 source code against commercial web server offerings, although they don't say which. Apparently, Apache is close, but no cigar..."

442 comments

  1. So if they found them... by Marx_Mrvelous · · Score: 5, Funny

    Why don't they fix them? It seems almost paradoxical: if you find .53 errors per thousand lines of code and fix them, then you'll have 0 errors. But since we can only fix errors we can detect, we only detect errors we can fix. Ok, it's too early on a Monday morning...

    --

    Moderation: Put your hand inside the puppet head!
    1. Re:So if they found them... by dkh2 · · Score: 3, Insightful

      Sure, they found them, but did they catalog them in any way? .53/KLOC errors translates to approx. 1 error every 1886 LOC on average. On top of that, on further investigation, which of these are actual errors and which only look like errors?
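
      As a rough check on that conversion (a sketch only; the helper names are made up for illustration), dividing 1000 by the density gives the average number of lines between reported defects:

```c
#include <stdio.h>

/* Hypothetical helper: average lines of code per reported defect,
 * given a density expressed in defects per 1000 lines (KLOC). */
double loc_per_defect(double defects_per_kloc)
{
    return 1000.0 / defects_per_kloc;
}

/* Print the figure for a given density, e.g. 0.53 for the Apache sample. */
void print_loc_per_defect(double defects_per_kloc)
{
    printf("%.2f defects/KLOC ~ one defect per %.0f lines\n",
           defects_per_kloc, loc_per_defect(defects_per_kloc));
}
```

      For 0.53 this comes out a little under 1887 lines, in line with the figure above.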

      I'm just glad I'm not the poor go-coder who has to go through the code to find and fix these few "errors."

      --
      My office has been taken over by iPod people.
    2. Re:So if they found them... by Jeremy+Erwin · · Score: 5, Informative
      If you download the defect report (available from here*), it will explain exactly where the bugs are.
      For instance, the first bug is

      DEFECT CLASS: Null Pointer Dereference DEFECT ID 1
      LOCATION: httpd-2.1/modules/aaa/mod_auth_basic.c :291
      DESCRIPTION The local pointer variable current_provider, declared on line 235, and assigned on line 257, may be NULL where it is dereferenced on line 291.
      PRECONDITIONS The conditional expression (res) on line 253 evaluates to false AND
      The conditional expression (!current_provider) on line 264 evaluates to true AND
      The conditional expression (!provider || !provider->check_password) on line 268
      evaluates to false AND
      The conditional expression (auth_result != AUTH_USER_NOT_FOUND) on line
      282 evaluates to false AND
      The conditional expression (!conf->providers) on line 287 evaluates to false.


      Each bug report is followed by the snippet of source code containing the defect.

      The metric report simply reports the statistics. For instance, the most bug ridden file is otherchild.c. The most common bug class is "dereferencing a NULL pointer".

      If the Apache developers simply want to fix the bugs, they can use the Defect Report. If they want to conduct a brutal purge of their contributors, they can use the Metric report.

      *Yes, Reasoning wants an email address. They will mail you a URL (a rather simple one at that) to access the reports.
    3. Re:So if they found them... by MisterFancypants · · Score: 4, Insightful
      None of that bug report is at all useful if there is no logical way for all of those preconditions they listed to actually be met.

      I mean, yeah, it would be nice if code would explicitly check for a NULL before dereferencing, but if there's no earthly way for the pointer to actually BE a NULL pointer at that time (barring memory corruption -- in which case all bets are off and your code is doomed anyway) then I wouldn't call those errors.

      This whole exercise seems very suspect to me.

    4. Re:So if they found them... by tomstdenis · · Score: 5, Interesting

      Agreed. Things like splint often report "warnings" on code that shouldn't trigger any. For instance

      int some_func(char *somebuf)
      {
      if (somebuf == NULL) return ERROR;
      somebuf[0] = 'a';
      return OK;
      }

      Will generate a warning with splint saying "pointer may be null" despite the fact it cannot be.

      Those tools are generally too sensitive and give too many false positives to be useful in the long run.

      Tom

      --
      Someday, I'll have a real sig.
    5. Re:So if they found them... by Anonymous Coward · · Score: 0

      Weird... I just tried (!somebuf) instead and splint didn't complain. The detection algorithms must be a little too tuned to the author's own style.

    6. Re:So if they found them... by tomstdenis · · Score: 2, Informative

      Neat, well it's been nearly a year since I last used splint. Maybe they have just updated the code.

      Either way I prefer

      "--std=c99 -pedantic -Wall -W -Wshadow" as my warnings for GCC. It catches a shit-load of common coding foobars and also ensures the code follows ISO C [definite bonus].

      Tom

      --
      Someday, I'll have a real sig.
    7. Re:So if they found them... by coliva · · Score: 2, Informative

      I found it interesting that they used a 1/31/03 version of Apache 2.1-dev. This wasn't mentioned anywhere in the article- either that it was a development version or that their analysis was of a development-level piece of software 5 months ago.

      It would be interesting to see how far 2.1 has progressed since then.

    8. Re:So if they found them... by Skjellifetti · · Score: 4, Informative
      None of that bug report is at all useful if there is no logical way for all of those preconditions they listed to actually be met.

      Well, Yes and No. The problem is that there may be no logical way that the pointer may be NULL today. But tomorrow, a new coder will add something that modifies the preconditions and suddenly that pointer can indeed be NULL. Even where you are sure that a condition is impossible, it is usually a good idea to check for NULL in order to avoid future errors.

      And for those who haven't seen this trick before, a nice habit to get into is to write your checks like so:
      if (NULL == myPointer) { ... }
      This lets the compiler catch errors where you meant '==' rather than just '='. As in
      /* Do we really mean this? */
      if (myPointer = NULL) { ... }
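
      A minimal sketch of the constant-on-the-left habit described above (the function name is made up for illustration):

```c
#include <stddef.h>

/* Returns 1 when the pointer is null, 0 otherwise.
 * Writing the constant on the left means that a slip such as
 * "NULL = p" is rejected by the compiler, because NULL is not
 * something that can be assigned to. */
int is_null(const void *p)
{
    if (NULL == p) {
        return 1;
    }
    return 0;
}
```

      The reversed typo fails to compile outright, whereas "p = NULL" inside the condition is legal C and at best draws a warning.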
    9. Re:So if they found them... by Anonymous Coward · · Score: 5, Insightful

      The funny thing is that this "bug" doesn't appear to actually be one...

      Note that current_provider is set to conf->providers on line 257. The loop starts and neither current_provider nor conf->providers changes. Then on line 287 there's a conditional break if conf->providers is NULL.

      If current_provider is going to be NULL at line 291, then conf->providers must be as well, so the conditional break will happen and the NULL dereference will be skipped.

      Or am I missing something else?
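
      The argument can be checked against a stripped-down model of that loop. This is not the Apache source: the struct layout, the function name and the omission of the intermediate checks are all assumptions, made only to isolate the two pointers involved.

```c
#include <stddef.h>

/* Hypothetical stand-ins for the real types. */
struct provider_list {
    struct provider_list *next;
};

struct config {
    struct provider_list *providers;
};

/* Walks the list the way the report describes and returns how many
 * times the pointer is advanced (dereferenced). */
int count_advances(const struct config *conf)
{
    const struct provider_list *current_provider = conf->providers;
    int advances = 0;

    do {
        /* ... provider lookup and password check elided ... */

        /* Nothing above modifies either pointer, so on the first pass
         * current_provider == conf->providers. If one is NULL, so is
         * the other, and this break fires first. */
        if (!conf->providers) {
            break;
        }
        current_provider = current_provider->next; /* the flagged dereference */
        advances++;
    } while (current_provider);

    return advances;
}
```

      In this model an empty provider list leaves the loop through the break before the dereference is reached, and on later passes the loop condition has already ruled out a NULL current_provider.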

    10. Re:So if they found them... by fnorky · · Score: 2, Insightful
      I found it interesting that they used a 1/31/03 version of Apache 2.1-dev. This wasn't mentioned anywhere in the article- either that it was a development version or that their analysis was of a development-level piece of software 5 months ago. It would be interesting to see how far 2.1 has progressed since then.

      After reading the review I came away with the impression that the reviewers were trying to hide this very fact. No mention that this is a development version of Apache. No mention of what the "several commercial equivalents" are. Not much to back up their claim "Apache http server V2.1 code has defect density rate similar to the average found within commercial applications - Findings differ from previous Open Source Study".

      I dare say that at first glance this seems to be a case of FUD.

    11. Re:So if they found them... by johnnyb · · Score: 1

      " Well, Yes and No. The problem is that there may be no logical way that the pointer may be NULL today. But tomorrow, a new coder will add something that modifies the preconditions and suddenly that pointer can indeed be NULL. "

      Unless there's documentation saying that it shouldn't be null. If there isn't, it's a bug in the present code. If there is, it's a bug in the future code.

      Documentation really is a part of the code, even though many people treat it as a side-issue.

    12. Re:So if they found them... by arrogance · · Score: 1

      Umm, the title of the report is "Reasoning™ Inspection Service Defect Data for Apache 2.1-dev". I think most of the readers, at least here at /., know what "dev" means.

      I don't think it's FUD, just straight marketing. I'm kind of interested to know if anyone asked for the study to be done or if they just said "Hey, I'm bored. Let's do a defect analysis of some open source stuff and get it posted to slashdot."

    13. Re:So if they found them... by Anonymous Coward · · Score: 1, Insightful

      My quibble with explicitly checking for NULL pointers is that you're only going to catch the case when the pointer is NULL. Just about any other bad value is going to give you a segmentation fault (which is exactly what a NULL pointer is also going to give you). I would consider such a check of more value if you also bothered to check all the other pointer values it shouldn't be, but that's something which is mainly only practical at the kernel level. Otherwise, I find all the extra NULL checking pedantic.

      The only place where I like to put NULL checks is where passing a NULL pointer has some sort of meaning in the API (in which case, it's obviously necessary). Doing so helps signal to anyone reading the code (mainly myself) that a NULL pointer value has significance beyond a possible segmentation fault. That would be drowned out if I put a NULL pointer check everywhere just to return a marginally useful error code, which I would also have to check for, rather than the program crashing in a clean and spectacular manner (the fail fast mentality).
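
      A sketch of that convention, where the one NULL check in the function is there because NULL means something to the API (the function and parameter names are hypothetical):

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical API: returns the length of `text` and, if the caller
 * asks for it, also reports whether the text is empty.
 * `text` must not be NULL (no check: a bad pointer fails fast).
 * `is_empty_out` MAY be NULL, meaning "I don't need that answer". */
size_t measure(const char *text, int *is_empty_out)
{
    size_t len = strlen(text);

    if (is_empty_out != NULL) {   /* the only check, and it carries meaning */
        *is_empty_out = (len == 0);
    }
    return len;
}
```

      A caller that only wants the length writes measure(s, NULL); a reader seeing the single check knows that NULL is a legal argument there and nowhere else.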

    14. Re:So if they found them... by apankrat · · Score: 2, Insightful

      .. But tomorrow, a new coder will add something that modifies the preconditions and suddenly that pointer can indeed be NULL.

      That's what assert() exists for. And the 'preconditions' you are referring to are actually 'invariants', so if "suddenly that pointer can indeed be NULL" it means that someone broke a fundamental design assumption and should not be tweaking the code anyway.
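
      For instance, a minimal sketch (the function is invented for illustration) where the invariant is stated once with assert() rather than handled as an error:

```c
#include <assert.h>
#include <stddef.h>

/* Counts how many leading characters of buf equal c.
 * Invariant: callers never pass NULL. The assert documents that
 * and aborts loudly in debug builds if someone breaks it. */
size_t count_leading(const char *buf, char c)
{
    size_t n = 0;

    assert(buf != NULL);
    while (buf[n] != '\0' && buf[n] == c) {
        n++;
    }
    return n;
}
```

      Compiling with NDEBUG defined removes the assert entirely, so it costs nothing in a release build.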

      And for those who haven't seen this trick before, a nice habit to get into is to write your checks like so:..

      I find this trick pretty annoying. First of all, any decent compiler can catch this with a warning. Second, if you are in fact typing = for == so often that you need a special habit for fighting it, then perhaps you should look at what you type :) There are plenty of C language constructions that can ruin your code with a single misplaced character:

      "xFF" vs "\xFF"
      comma operator; for instance, f(param) vs f,(param)
      misplaced structure initializers
      etc, etc

      It does not mean the programmer needs to guard against all these too, it just means that the code must be proofread as it's being written, which is a reasonable thing to expect from a professional developer.
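
      To make the first item in that list concrete, here is a small sketch (function names invented) showing how one missing backslash changes the string:

```c
#include <string.h>

/* Three ordinary characters: 'x', 'F', 'F'. */
size_t plain_length(void)
{
    return strlen("xFF");
}

/* One character: a single byte with value 0xFF. */
size_t escaped_length(void)
{
    return strlen("\xFF");
}
```

      Both compile without complaint, which is the point: no warning separates the two.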

      --
      3.243F6A8885A308D313
    15. Re:So if they found them... by Error27 · · Score: 2, Interesting

      FUD??? Gimme a break.

      It says pretty clearly that they purposely chose a less mature sample of open source software than they did last time. The point is, does open source software start out bug free or do the bugs get worked out with age?

    16. Re:So if they found them... by coliva · · Score: 2, Informative

      Correct. The title of the report is clear. However, that info didn't make it into the news release that they put out.

    17. Re:So if they found them... by Jeremy+Erwin · · Score: 4, Insightful

      The earlier study was of polished code, many iterations after release. This latest study is of an unpolished developers snapshot. I suppose that you might be able to divine some kind of wisdom about the development of open-source software-- Development branches shall be as stable as commercial code. Release branches shall be more so.

      The metrics report does mention the version number (dev-1/31/03), though the fact that this is development code is not explicitly noted. No mention is made of who commissioned this study. Perhaps the company is simply fishing for clients.

    18. Re:So if they found them... by fnorky · · Score: 1

      FUD??? Gimme a break.

      It says pretty clearly that they purposely chose a less mature sample of open source software than they did last time. The point is, does open source software start out bug free or do the bugs get worked out with age?


      I said at "first glance" it seems to be a case of FUD. When I think of a "less mature sample of open source software" I normally don't think of Apache (even version 2.x). I would have chosen something in a pre-1.0 state myself. Apache v2.1 is built on top of previous versions. I think that alone would make it harder to know if it started out bug free or had the bugs worked out with age.

    19. Re:So if they found them... by pclminion · · Score: 2, Insightful
      Considering that Brian Kernighan, co-author of The C Programming Language, advocates this coding style in his book The Practice of Programming, I think it might be you who's the moron (and the 12 year old). This is a classic error that thousands of programmers have made and continue to make. It's the difference of a single repeated keystroke.

      So shut up, you little twerp.

    20. Re:So if they found them... by Martin+Spamer · · Score: 1

      None of that bug report is at all useful if there is no logical way for all of those preconditions they listed to actually be met

      That is a very big IF and an assumption that is just not sustainable if you want to produce quality software.

      it would be nice if code would explicitly check for a NULL before dereferencing, but if there's no earthly way for the pointer to actually BE a NULL pointer at that time (barring memory corruption -- in which case all bets are off and your code is doomed anyway) then I wouldn't call those errors.

      If you want to ensure your code is robust today and stays so in future, you need to write it defensively. One requirement of this is guarding against NULL pointers. Invalid assumptions, such as assuming this may never happen, are a root of many bugs in software. It will take a programmer a few seconds to do this right in the first place; discovering this type of bug the hard way can cost man-days or man-weeks of testing, debugging and lost productivity.

      Recognition of this fact is a primary reason why mature development shops use code walk-through techniques, and since this is an automated version of this proven technique all professional programmers should applaud it.

      This whole exercise seems very suspect to me.

      How ? Their agenda is clearly demonstrating the capabilities of their tool-set. Making excuses for sloppy coding practice is much more 'suspect' IMHO.

    21. Re:So if they found them... by conway · · Score: 3, Informative

      Turning on all warnings in gcc (-Wall) catches this, and many other common errors.
      (In effect it does a lint-like check on the source.)

    22. Re:So if they found them... by dpuu · · Score: 1
      write your checks like so...

      Even nicer is not to do the equality check at all, and use boolean context:

      if (!myPointer) { ... }
      It's shorter, less error prone, and doesn't require stdlib.h to import the definition of NULL.
      --
      Opinions my own, statements of fact may contain errors
    23. Re:So if they found them... by fnorky · · Score: 1

      The earlier study was of polished code, many iterations after release. This latest study is of an unpolished developers snapshot. I suppose that you might be able to divine some kind of wisdom about the development of open-source software-- Development branches shall be as stable as commercial code. Release branches shall be more so.

      Although one can say that Apache v2.1-dev is nothing more than a few iterations after its initial release. That is why I said in a different post that using a pre-v1.0 release might be a better choice for this. I don't think using a development branch is really a good choice at all. Dev branches are just that, development, not intended for normal, every day use (except by the very brave).


      The metrics report does mention the version number (dev-1/31/03), though the fact that this is development code is not explicitly noted. No mention is made of who commissioned this study. Perhaps the company is simply fishing for clients.

      I agree. It was a press release. I guess it just rubs me the wrong way.

    24. Re:So if they found them... by scrytch · · Score: 1

      Bleh, I can't get into the habit of rewriting my terms when my language instincts (and I am writing in a computer language, meant for humans) tell me to put the thing being checked on the left.

      Any decent compiler will warn you about unintended assignment, and will give you an option to turn warnings into errors. Frankly I thought using = instead of := was just silly, but there it is, so just let the compiler get at it.

      Also, one should be using const wherever they can. It helps the compiler out too, which means potentially faster code.

      Know what'd be nice, is if whatever program they used could automatically generate asserts and insert them into the source.

      --
      I've finally had it: until slashdot gets article moderation, I am not coming back.
    25. Re:So if they found them... by Anonymous Coward · · Score: 0

      You dont scare me with your fancy links. I still think you are dumb.

    26. Re:So if they found them... by mobileskimo · · Score: 0, Offtopic

      Moderation: Stick your hand up the rear end of the puppet

      --
      "Last one in is a rotten goblin!" - Kepp
    27. Re:So if they found them... by aborchers · · Score: 1


      int some_func(char *somebuf)
      {
      if (somebuf == NULL) return ERROR;
      somebuf[0] = 'a';
      return OK;
      }

      Will generate a warning with splint saying "pointer may be null" despite the fact it cannot be.


      By way of disclaimer, let me say I've never used splint, but could it be that the bad-form early return is confusing it? Does it report the same error on

      int some_func(char *somebuf)
      {
      int result= ERROR;
      if (somebuf != NULL) {
      somebuf[0] = 'a';
      result= OK;
      }
      return result;
      }

      --
      Trouble making decisions? Just flip for it.
    28. Re:So if they found them... by aborchers · · Score: 1

      One other thing occurred to me. Is *somebuf protected by a synchronization mechanism? If not, it is possible that another process/thread may have altered it, ergo it is entirely possible that *somebuf could become NULL during a context switch between the first and second line of your code sample.

      It's been a long time since I did C programming, so be gentle if I'm missing something critical about pointer access...

      --
      Trouble making decisions? Just flip for it.
    29. Re:So if they found them... by Anonymous Coward · · Score: 0

      Yeah, assert() is useful, but only in debug code, not production code. In production code you want a little bit better error recovery than your program dying with an assert error.

    30. Re:So if they found them... by tomstdenis · · Score: 1

      It's an argument to a function. It cannot be modified by another thread/process [other than access the stack directly].
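
      A small sketch of why that holds (both functions are invented for illustration): the parameter is the callee's own copy of the pointer value, so overwriting it does not touch the caller's variable. What the pointer points at is a separate matter and can still be shared.

```c
#include <stddef.h>

/* The parameter is a private copy of the caller's pointer value.
 * Overwriting it here changes only that copy. */
void clobber_copy(char *somebuf)
{
    somebuf = NULL;
    (void)somebuf; /* silence "set but not used" warnings */
}

/* Returns 1 if the caller's pointer is still intact after the call. */
int caller_pointer_survives(void)
{
    char storage[4] = "abc";
    char *p = storage;

    clobber_copy(p);
    return p == storage;
}
```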

      Besides if you are really out to screw a program up no form of mutex/semaphore/etc can save you. You might have written a function which doesn't lock correctly, etc...

      Tom

      --
      Someday, I'll have a real sig.
    31. Re:So if they found them... by steveg · · Score: 1

      NULL is usually 0.

      But it doesn't have to be. It's safer to explicitly check against NULL than to use a boolean. Using a boolean might never cause a problem, but years from now when the code is ported to some oddball system, *someone* may have to spend a long time scratching their head about some mysterious bug.

      In cases other than where NULL is one of the possibilities, where you know that 0 or false or whatever is what you are checking against, a boolean is the best way to go.

      --
      Ignorance killed the cat. Curiosity was framed.
    32. Re:So if they found them... by aborchers · · Score: 2, Funny
      It's an argument to a function. It cannot be modified by another thread/process.


      Thanks for the reality slap. Years of LISP and Java have made me weak and flabby. :-)

      --
      Trouble making decisions? Just flip for it.
    33. Re:So if they found them... by Josh+Booth · · Score: 1

      That also seems to be the mentality of GLib and GTK+. There are many times when a NULL pointer is passed to some API function and my program segfaults immediately. I can bring up GDB and find what function first passed the NULL pointer.

    34. Re:So if they found them... by Ed+Avis · · Score: 1
      None of that bug report is at all useful if there is no logical way for all of those preconditions they listed to actually be met.

      True, of course. However it gets insanely complicated to start worrying about 'is it ever possible that p == 0 and q != 0 and a < b + 3 and x != y...'. The only way is to fix the function so that it has a nice simple set of preconditions, which you can easily check are being followed. And if the preconditions aren't met, technically you are in the right if you segfault or do anything else, but in practice it may be better to log a message and keep the server running.
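
      A sketch of the "log a message and keep the server running" option, assuming a function with a single simple precondition (the names and the message text are invented):

```c
#include <stdio.h>
#include <stddef.h>

enum { OK = 0, ERROR = -1 };

/* Hypothetical handler with one simple precondition: buf is non-NULL
 * and has room for at least one character. */
int set_first(char *buf)
{
    if (buf == NULL) {
        /* Precondition violated: report it and let the caller carry on. */
        fprintf(stderr, "set_first: NULL buffer, request ignored\n");
        return ERROR;
    }
    buf[0] = 'a';
    return OK;
}
```

      With only one condition to satisfy, a reviewer can check every call site by eye, which is the point of keeping the preconditions simple.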

      --
      -- Ed Avis ed@membled.com
    35. Re:So if they found them... by Ed+Avis · · Score: 1

      What splint would really like you to do is to annotate the parameter somebuf to say whether it can be null. IIRC something like:

      int some_func(char * /*@notnull@*/ somebuf)
      {
      somebuf[0] = 'a';
      return OK;
      }

      Then it will statically check that you never pass a null pointer to this function, so the runtime check is no longer needed.

      --
      -- Ed Avis ed@membled.com
    36. Re:So if they found them... by Ed+Avis · · Score: 2, Informative

      The null pointer in C is written as 0, and tests as false when used as a Boolean. It might be stored internally as the bit value 1010101, but still in C source code it is 0, and false. So

      if (pointer) ...

      is perfectly legal, and portable, C.
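
      As a quick sketch, the three usual spellings of the test side by side (function names invented):

```c
#include <stddef.h>

/* Each function returns 1 for a null pointer and 0 otherwise.
 * The C standard requires all three spellings to agree, whatever
 * bit pattern the machine uses for a null pointer internally. */
int null_by_not(const void *p)   { return !p; }
int null_by_zero(const void *p)  { return p == 0; }
int null_by_macro(const void *p) { return p == NULL; }
```

      The compiler translates the 0 in the source into whatever the platform's null pointer really is, so none of these depends on the stored representation.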

      --
      -- Ed Avis ed@membled.com
    37. Re:So if they found them... by LilMikey · · Score: 1

      So what I'm hearing is all that fancy error-finding software does the same thing as assert?

      --
      LilMikey.com... I'll stop doing it when you sto
    38. Re:So if they found them... by tomstdenis · · Score: 1

      Hey no prob. Not like I could say I know how to program in LISP [at all] or Java [proficiently] :-)

      Tom

      --
      Someday, I'll have a real sig.
    39. Re:So if they found them... by Anonymous Coward · · Score: 0

      Campchaos James Hetfield impression:
      LISP, Java, you and your mom: Bad!

    40. Re:So if they found them... by Anonymous Coward · · Score: 0

      It doesn't have to specifically be zero, but NULL values must equate to false in C. So, (!pointer) is always valid.

    41. Re:So if they found them... by the_duke_of_hazzard · · Score: 2, Insightful
      "The defect density of the Apache code inspected was 0.53 per thousand lines of source code, while the commercial average defect density came to 0.51 per thousand lines of source code."

      A simple reductio ad absurdum from this: if you produce thousands and thousands of lines of harmless, simple code to do something that could be done in a line, then your more verbose code is "better" than the concise one by this metric.

      This is assuming that it is possible to reliably statically test for errors in the first place, and that one "error" is equivalent to another... All seems a little suspect to me.
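
      The padding argument can be put in numbers with a one-line sketch of the metric (the function name is invented; the line counts in the note below are made-up illustrations, not figures from the report):

```c
/* Defects per thousand lines of code. */
double defect_density(int defects, long lines_of_code)
{
    return (double)defects * 1000.0 / (double)lines_of_code;
}
```

      One defect in a 100-line program scores 10 per KLOC. Pad the same program with 9,900 harmless lines and the identical defect scores 0.1 per KLOC, which looks a hundred times "better".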

      This signature is intentionally pointless.

    42. Re:So if they found them... by Ed+Avis · · Score: 1

      No, because the check happens at compile time. Firstly this means you find the error sooner. Secondly there is no run-time overhead. Most importantly if it can statically check that the pointer is not null (or any other property), then you know that this is _always_ true, for _all_ inputs to your program. Whereas assert() will find a bug only if your test suite is comprehensive enough. Finally, if the checker does find a case where a null pointer might be passed, it can tell you which line of source that is ('foo(x) is called on line 100, but x might be null and foo() expects a non-null pointer').

      I'm not arguing against assert() or run-time checking in any way, just pointing out that a compile-time check is better in cases where it is possible. But there's a limit to how much compilers or static checkers can figure out, so you have to use run-time checks for more complex things. Normally you'd use a mixture of both.
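
      For those without splint, GCC and Clang offer a weaker form of the same idea through the nonnull function attribute. The sketch below assumes one of those compilers; the macro name is invented, and the compiler only catches cases it can see at the call site, such as a literal NULL.

```c
#include <stddef.h>

/* Assumption: GCC or Clang. Other compilers get an empty macro and
 * therefore no compile-time check. */
#if defined(__GNUC__)
#define ARG1_NONNULL __attribute__((nonnull(1)))
#else
#define ARG1_NONNULL
#endif

enum { OK = 0 };

/* Declared as never accepting NULL, so there is no run-time test. */
int some_func(char *somebuf) ARG1_NONNULL;

int some_func(char *somebuf)
{
    somebuf[0] = 'a';
    return OK;
}
```

      With -Wall, a call written as some_func(NULL) is reported when the file is compiled rather than when the program crashes.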

      --
      -- Ed Avis ed@membled.com
    43. Re:So if they found them... by jonadab · · Score: 1

      > bad-form early return

      I don't know why early return is considered bad form. It's MUCH
      better form than using traditional but intensely error-prone char*
      buffers.

      --
      Cut that out, or I will ship you to Norilsk in a box.
    44. Re:So if they found them... by tomstdenis · · Score: 1

      Which is totally useless if you happen to be authoring a library which other developers depend on.

      Nice try though :-)

      Tom

      --
      Someday, I'll have a real sig.
    45. Re:So if they found them... by jonadab · · Score: 1

      Lisp is well worth learning. It's a fairly different paradigm, and
      a fairly influential one. (A great many languages have borrowed
      ideas from lisp, almost as many as have borrowed syntax from C.)
      Also, lisp is not hard to learn; I would say it is WAY easier than
      C. The syntax is about as trivial as it gets, so you mostly just
      have to learn the semantics, and the only really hairy things are
      lambda expressions and maybe closures.

      Is lisp practical? Well, not for everything. But it's worth
      studying for the ideas. It'll change the way you think about
      programming, even in other languages.

      --
      Cut that out, or I will ship you to Norilsk in a box.
    46. Re:So if they found them... by pod · · Score: 1
      Those tools are generally too sensitive and give too many false positives to be useful in the long run.

      Run gcc with the -Wall option, and fix the warnings, many of which are of the type 'control reached end of a non-void function', 'use of uninitialized variable', etc, and ARE actually real or potential problems. gcc does flow control analysis, so if it tells you you're using an uninitialized variable, or not returning a value in a non-void function, it is PROBABLY right, and you should look at what you are doing wrong.

      --
      "Hot lesbian witches! It's fucking genius!"
    47. Re:So if they found them... by pod · · Score: 1

      WTF?! This is horrible! It's very hard to guarantee at compile time that a null pointer will never be passed as a parameter! And if it is, that just means you're checking the value at multiple other points, as opposed to a central location, i.e. the function that uses it. Your functions should always check that their parameters are valid, and a non-null check takes all of 1 clock cycle, plus a fetch (if it's not in a register), but you'll have to do that anyways to use it.

      Unless I'm missing something major here, I'm now not at all surprised at the terrible quality of some of the code out there.

      --
      "Hot lesbian witches! It's fucking genius!"
    48. Re:So if they found them... by Anonymous Coward · · Score: 0

      Yeah, assert() is useful, but only in debug code, not production code. In production code you want a little bit better error recovery than your program dying with an assert error.

      Production code should go ahead and dereference the NULL pointer, crash, and be recovered by the OS. (or somewhere else in the program) If you end up with debug code that's faster than production, you have a problem.

      You should eliminate NULL pointer exceptions through proper design and copious asserts, and do plenty of unit testing. It must be clearly documented (preferably in the source) that this function may not be passed a NULL pointer.

    49. Re:So if they found them... by aborchers · · Score: 2, Insightful

      Sorry to get pedantic, but char* buffers are not error prone. Programmers are prone to make errors when using them. Lack of maturity (so to speak) in the language and bad programmer form are not the same. Bad form is bad form in C or Java. That one lacks array bounds checking that the other provides is irrelevant. Languages that protect the programmer from errors may make bad form less likely to result in a failure, but failing to employ best practices in code design can still lead to hard-to-detect logic bombs.

      In this case, the bad form in using early returns is that using them leads one to not look at the whole routine as a cohesive whole where all the antecedents and consequents are correctly considered and accounted for. It's similar to why:

      if (a) { ...
      }
      else if (b) { ...
      }

      is bad form compared to

      if (a) { ...
      }
      else {
      if (b) { ...
      }
      }

      From a tracing point of view, they are indistinguishable. They may even compile to the same set of instructions. The second, however, shows a level of diligence on the part of the engineer that all the possible routes are considered and there is no dangling consequent.

      Disclaimer: The real reasons why these things are bad form are practically impossible to convey in an example that doesn't make use of real code. i.e. it's the "..." bit that provides the opportunity for the bad-form constructs to leak bugs.

      --
      Trouble making decisions? Just flip for it.
    50. Re:So if they found them... by Anonymous Coward · · Score: 0

      Yes, yes, and skipping words saves writers many keystrokes as well, but if you want to produce quality writing that people will enjoy reading you have remember to cross your t's and dot your i's and include all the necessary words in your sentences. Hmm kay?

    51. Re:So if they found them... by jonadab · · Score: 1

      > It might be stored internally as the bit value 1010101, but still
      > in C source code it is 0

      C is evil.

      In lisp, the equivalent is nil and it has no numeric representation.

      In Perl, the equivalent is undef, it stringifies to "" or numifies
      to 0 regardless of platform, and _you can safely dereference it_.
      You can also scope the resulting storage location lexically...

      undef $x; $$x=42; {my $$x="foo"; print "$$x\n"} print "$$x\n";
      foo
      42

      This is of course due to the fact that it's a reference, not a
      raw pointer. I have been told that C++ also has references, but
      for some reason people are still using pointers. (Maybe the
      references in C++ are less useful than in other languages? Maybe
      C++ programmers have brain damage? I don't know, maybe a lot of
      things.) (Yes, I know C is not C++. Actually, C++ is even more
      evil than C in my book, but what do I know? I'm Just Another PH.)

      --
      Cut that out, or I will ship you to Norilsk in a box.
    52. Re:So if they found them... by HiThere · · Score: 1

      To me it seems wierd to prefer the second form to the first. A chained else-if structure is equivalent to a switch. A nested block structure is much more complicated conceptually. (I.e., it requires nesting the stack to an arbitrary number of blocks...and tracking the parenthesis for all of them.)

      That said, the second form is more general, in that it can handle all the cases that the first form can, and more besides. But this generality doesn't come for free, and if you can use the simpler structure without duplicating code, then you should.

      --

      I think we've pushed this "anyone can grow up to be president" thing too far.
    53. Re:So if they found them... by aborchers · · Score: 2, Informative

      A chained else-if structure is equivalent to a switch.


      Funny you should point that out: a chained else-if structure without a terminal else is equivalent to a switch without a default which is notoriously vulnerable to the same sort of logic errors.
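
      A minimal sketch of that equivalence in C. The enum and function names are hypothetical, invented only for illustration; the point is that the terminal else and the default label play the same role of catching the case nobody planned for.

```c
/* Hypothetical request kinds, invented for this illustration. */
enum request_kind { REQ_GET, REQ_POST, REQ_HEAD };

/* Chained else-if with a terminal else: every path assigns a result. */
const char *describe_chain(enum request_kind kind)
{
    const char *text;

    if (kind == REQ_GET) {
        text = "get";
    } else if (kind == REQ_POST) {
        text = "post";
    } else {
        /* Terminal else: anything unexpected lands here instead of
           leaving text unassigned. */
        text = "other";
    }
    return text;
}

/* The same logic as a switch; default plays the role of the terminal else. */
const char *describe_switch(enum request_kind kind)
{
    switch (kind) {
    case REQ_GET:
        return "get";
    case REQ_POST:
        return "post";
    default:
        return "other";
    }
}
```

      Dropping the else branch from the first function, or the default label from the second, would leave REQ_HEAD unhandled in both, which is the shared weakness being described.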

      if you can use the simpler structure without duplicating code, then you should


      While I agree with that principle, the whole issue of good form (which I won't argue can be inefficient and cumbersome) is that following it slavishly can prevent the coding patterns that lead to hard-to-find bugs. It protects us from our own worst tendencies, one of which is assuming when we write the code that we know exactly what we mean it to do. :-) Optimization is a valuable step to be sure, but optimizing too soon is a route to buggy code.

      --
      Trouble making decisions? Just flip for it.
    54. Re:So if they found them... by Bakaneko · · Score: 1

      I can't really imagine that this would be a hard or interesting question for most people...

      It's roughly equivalent to asking whether a bottle of wine tastes better before or after fermentation.

    55. Re:So if they found them... by mvonballmo · · Score: 1
      But tomorrow, a new coder will add something that modifies the preconditions and suddenly that pointer can indeed be NULL.


      That's why it would be lovely if more people would start using tools that allow preconditions, postconditions, etc. to be specified at the language level. Then, there are no more magically changing preconditions because they are explicitly specified in the code instead of implied through comments, or, worse, coding style. Changing a precondition then requires a concrete act of actually removing it from code.
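
      C itself has no language-level contracts, so the following is only an approximation of the idea using assert(). REQUIRE, ENSURE and copy_name are hypothetical names made up for this sketch, not part of any library.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical contract macros; C has no built-in preconditions,
   so these simply wrap assert() to make the intent explicit. */
#define REQUIRE(cond) assert(cond)
#define ENSURE(cond)  assert(cond)

/* Copies at most cap - 1 characters of src into dst and terminates it.
   Returns the number of characters copied. */
size_t copy_name(char *dst, size_t cap, const char *src)
{
    size_t n;

    REQUIRE(dst != NULL);   /* precondition: caller must pass a buffer */
    REQUIRE(src != NULL);   /* precondition: caller must pass a source */
    REQUIRE(cap > 0);       /* precondition: room for the terminator   */

    n = strlen(src);
    if (n >= cap) {
        n = cap - 1;
    }
    memcpy(dst, src, n);
    dst[n] = '\0';

    ENSURE(strlen(dst) == n);   /* postcondition: result is terminated */
    return n;
}
```

      Relaxing a precondition here means deleting a REQUIRE line, which shows up in a diff; that is the "concrete act" the comment asks for. Unlike a real contract system, these checks vanish when NDEBUG is defined.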
    56. Re:So if they found them... by Anonymous Coward · · Score: 0

      The problem is that there may be no logical way that the pointer may be NULL today. But tomorrow, a new coder will add something that modifies the preconditions and suddenly that pointer can indeed be NULL. Even where you are sure that a condition is impossible, it is usually a good idea to check for NULL in order to avoid future errors.

      And in a sudden flash of insight, I know why java is so horrendously slow!

      java2c:

      if (pointer != NULL) do one thing;
      if (pointer != NULL) do one thing;
      if (pointer != NULL) do one thing;
      if (pointer != NULL) do one thing;
      if (pointer != NULL) do one thing;
      if (pointer != NULL) do one thing;
      if (pointer != NULL) do one thing;

    57. Re:So if they found them... by Anonymous Coward · · Score: 0

      Seems to me that the second example still has a dangling consequent.

      If b is false in the if statement within the else block, we leave the if/else block without doing anything just as we would in the first example should b be false.

      That route was not considered in the second example any more than it was the first.

    58. Re:So if they found them... by Anonymous Coward · · Score: 0

      SNAFUs. The vernacular term is SNAFU.
      FUBAR (or the alternate representation "foobar") is not used in the capacity of a plural noun.

    59. Re:So if they found them... by kasperd · · Score: 1

      Frankly I thought using = instead of := was just silly

      To some extent I must agree with you. Now I wonder if it would be possible to change this without breaking too much. The obvious first step would be to allow the use of := where we would usually use =. The second step is to give warnings or errors when = is being used, but only in new code modules, we still want old code to compile without a flood of warnings.

      --

      Do you care about the security of your wireless mouse?
    60. Re:So if they found them... by Alsee · · Score: 2, Funny

      I don't think using a development branch is really a good choice at all. Dev branches are just that, development, not intended for normal, every day use (except by the very brave).

      Some people love the thrill of skydiving and opening their parachute 5 seconds before they hit the ground. Some people defy death by wrestling crocodiles bare-handed. Others get a rush pushing 200 MPH going into a turn in a Formula One race car.

      Me, I get my adrenaline pumping by running code on the development branch.

      -

      --
      - - You can't take something off the Internet! That's like trying to take pee out of a swimming pool.
    61. Re:So if they found them... by Ed+Avis · · Score: 1
      It's very hard to guarantee that a non-null pointer will never be passed as a parameter at compile time!

      Indeed, which is why checkers like splint use annotations. You annotate your code to say which variables or parameters can be null and which can't be; then the checker only has to look at each function individually and at the boundaries between functions. Just the same as the compiler checks that the parameter type is correct when you call a function expecting an int.

      It's no big deal; arguably you should be documenting whether your function allows null for its parameters anyway, and having this documentation in the code itself so it can be automatically checked is better. Have a look at Nice for a similar idea applied to Java.

      But you do have to adopt this bondage-and-discipline across the whole program; if some of your code is annotated and some isn't, the checker won't be able to analyse the whole program.

      that just means you're checking the value at multiple other points, as opposed to a central location, ie. the function that uses it.

      The compiler or static checker is doing the work, so no extra runtime checking is necessary. You are writing the function specification in one central location - 'this function expects its parameter to be not null'.

      Your functions should always check their parameters are valid, and non-null check takes all of 1 clock cycle, plus a fetch (if it's not in a register),

      I agree that one extra clock cycle, or even several, is not a big deal for most applications (even if it happens on every function entry and exit). But it is a bunch of extra code to write, and only works to dodge a problem (calling with a null pointer) once it happens at run time. Whereas with static checking you can be certain there is *no possibility* of calling with a null pointer, and if you accidentally write a call that might be null, you get pointed to the exact source line when building.

      It's worth noting that many important C libraries do not follow your rule of always checking arguments - the C standard library in particular does not.

      Unless I'm missing something major here, I'm now not at all surprised at the terrible quality of some of the code out there.

      I don't really understand this comment - having a code base where you can be _certain_ there will be no null pointer errors, because the checker has proved it statically, is surely better than code where you hope there are no such errors, and added runtime checks to catch them if they happen, but you don't really know.

      (Not that a checker like splint would usually let the programmer eliminate all such errors - but it can certainly cut out a huge chunk of them, and alert you to the things it couldn't check statically, so you can test them at run time.)
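
      A rough sketch of what such annotations look like, using splint's null and notnull markers. The function names are hypothetical, and the code has not been run through splint here; to an ordinary C compiler the annotations are just comments.

```c
#include <stddef.h>
#include <string.h>

/* May return NULL: the annotation tells the checker that callers
   must test the result before dereferencing it. */
/*@null@*/ const char *find_value(const char *text, char sep)
{
    const char *hit = strchr(text, sep);
    return hit != NULL ? hit + 1 : NULL;
}

/* Parameter must not be NULL, so no runtime check is written here;
   the checker is expected to reject calls that might pass NULL. */
size_t value_length(/*@notnull@*/ const char *value)
{
    return strlen(value);
}

/* The caller tests the possibly-null result before passing it on,
   which is what lets the call to value_length be accepted. */
size_t length_after(const char *text, char sep)
{
    const char *value = find_value(text, sep);
    if (value == NULL) {
        return 0;
    }
    return value_length(value);
}
```

      Removing the NULL test in length_after is the kind of mistake the checker is meant to report at build time, pointing at the call to value_length.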

      --
      -- Ed Avis ed@membled.com
    62. Re:So if they found them... by Ed+Avis · · Score: 1
      Which is totally useless if you happen to be authoring a library which other developers depend on.

      Not true - you can put the annotations in your header files and then other developers can use splint just the same. If they prefer not to use it, and would rather take on the burden (of checking that a parameter is non-null) for themselves, then fair enough. Either way you've documented that the pointer must not be null.

      If you want to be 'nice' and fail cleanly when given a null pointer then yes you do need extra code - but the standard C library is not written to do that and I don't feel that third party libraries need be either. If passing a null pointer gives a nice, clean segfault that lets you bring up the offending line in the debugger, that's good enough.

      (If your function doesn't have an explicit precondition saying the pointer must not be null, then that's different, of course.)

      --
      -- Ed Avis ed@membled.com
    63. Re:So if they found them... by 2short · · Score: 1

      "It's roughly equivalent to asking whether a bottle of wine tastes better before or after fermentation."

      Before, definitely. After aging, the wine might be good, or not, but right after fermentation? I'll take the grape juice.

    64. Re:So if they found them... by Ed+Avis · · Score: 1
      In Perl, the equivalent is undef, it stringifies to "" or numifies to 0 regardless of platform, and _you can safely dereference it_.

      Well firstly stringifying or numifying undef produces a warning in most circumstances - and the Perl developers themselves encourage you to turn on warnings (in fact it is considered a bug that -w is not mandatory). But more importantly, it's not true that undef can safely be dereferenced:

      my $s;
      print $$s;

      gives Can't use an undefined value as a SCALAR reference.

      What you showed in your code is an example of autovivification, where Perl creates data structures for you when values are set. It's the same as something like push @{$hash{hello}}, 55 where if the hash element was undef before, a new anonymous list will be created. But autovivification is for _setting_ values, there is no magic for ordinary dereferencing of undef, and that will crash your program just as dereferencing null will in C. (With a more helpful message, however.) FWIW, the code snippet you posted does not run because 'my $$x' is not legal.

      I have been told that C++ also has references, but for some reason people are still using pointers.

      A lot of this is because they've picked up bad habits from older (or just plain bad) C++ textbooks. Modern C++ coding does not usually need pointers or explicit memory management that much, except for containers of pointers.

      --
      -- Ed Avis ed@membled.com
    65. Re:So if they found them... by Anonymous Coward · · Score: 0

      Agreed.

      They should have run this test on Apache 2.0.46 (or even 1.3.27, e.g. as a point of comparison) but against the latest 2.1? That seems suspect.

    66. Re:So if they found them... by tomstdenis · · Score: 1

      My approach is to use ASSERT like macros which the end developer can tweak to various styles [or NULL]. Good code should always check for the majority of errors out of the box.

      It comes in handy especially when you are writing new code and happen to forget to copy a pointer [which happened to me just the other day].
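
      One way such a tweakable macro might look. This is only a sketch: ARG_CHECK, ARG_STYLE and the error codes are hypothetical names, and the three styles are assumptions about what "various styles [or NULL]" could mean.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical styles; the names are invented for this sketch. */
#define ARG_STYLE_ABORT  1   /* print a message and abort()        */
#define ARG_STYLE_RETURN 2   /* return an error code to the caller */
#define ARG_STYLE_NONE   3   /* compile the check away entirely    */

#ifndef ARG_STYLE
#define ARG_STYLE ARG_STYLE_RETURN   /* default if the builder says nothing */
#endif

#define ERR_OK       0
#define ERR_BAD_ARG (-1)

#if ARG_STYLE == ARG_STYLE_ABORT
#define ARG_CHECK(cond) \
    do { if (!(cond)) { \
        fprintf(stderr, "%s:%d: bad argument: %s\n", \
                __FILE__, __LINE__, #cond); \
        abort(); } } while (0)
#elif ARG_STYLE == ARG_STYLE_RETURN
#define ARG_CHECK(cond) \
    do { if (!(cond)) return ERR_BAD_ARG; } while (0)
#else
#define ARG_CHECK(cond) ((void)0)
#endif

/* Sums len bytes from data into *out, checking both pointers first. */
int sum_bytes(const unsigned char *data, size_t len, unsigned long *out)
{
    size_t i;

    ARG_CHECK(data != NULL);
    ARG_CHECK(out != NULL);

    *out = 0;
    for (i = 0; i < len; i++) {
        *out += data[i];
    }
    return ERR_OK;
}
```

      Building with -DARG_STYLE=1 would turn the same checks into hard aborts during development, and -DARG_STYLE=3 would remove them.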

      Tom

      --
      Someday, I'll have a real sig.
    67. Re:So if they found them... by murr · · Score: 2, Informative

      Interestingly enough, that very first bug report demonstrates a limitation in the logical reasoning of the analysis tool, not a defect in the Apache code:

      current_provider was assigned from conf->providers (line 257), so it cannot possibly be NULL unless conf->providers is NULL, and that condition is tested for on line 287.

      NEXT!

    68. Re:So if they found them... by lsdino · · Score: 1

      My quibble with explicitly checking for NULL pointers is that you're only going to catch the case when the pointer is NULL. Just about any other bad value is going to give you a segmentation fault (which is exactly what a NULL pointer is also going to give you). I would consider such a check of more value if you also bothered to check all the other pointer values it shouldn't be, but that's something which is mainly only practical at the kernel level. Otherwise, I find all the extra NULL checking pedantic.

      In C++ it would be pretty easy to verify a pointer. Step one, implement your own heap with an address checker. Step two, ensure all classes have vtables. Step three, check both the memory address & vtable of the object. It's all quite extreme compared to comparing a value to NULL though, and there are *still* classes of errors it won't catch (for example, passing a freed address that got reused by the same type). That aside, there's a significant difference between a NULL pointer and any other bad pointer, and you give a good example below. So given the simplicity of the check, and the fact that it does often have meaning, I think it's a good idea to do the check. Not necessarily for all internal APIs, but certainly public APIs. A good error return code is quicker to solve than a seg fault.

      The only place where I like to put NULL checks is where passing a NULL pointer has some sort of meaning in the API (in which case, it's obviously necessary). Doing so helps signal to anyone reading the code (mainly myself) that a NULL pointer value has significance beyond a possible segmentation fault. That would be drowned out if I put a NULL pointer check everywhere just to return a marginally useful error code, which I would also have to check for, rather than the program crashing in a clean and spectacular manner (the fail fast mentality).

      Now, this depends on how you implement it. If you do the old if(arg1 == NULL || arg2==NULL) return(NULL_POINTER_ERROR) at the top of every function (or throw an exception) it'd be immediately obvious you don't accept NULL pointers on some functions - no matter what you do elsewhere. You bring up the "fail fast mentality", and I think that generally applies to when you CAN'T handle the error. In general you can handle a NULL pointer before you dereference it. So there's no need to bring down the entire application. What if someone did:

      char *buf = new char[181];
      result = fill_my_buffer(buf);

      If you run out of memory you can pass a NULL pointer. Should fill_my_buffer really crash the program, or should it return a meaningful result (that could maybe be propagated up)? It is an easy error to overlook, what would you prefer an application you were using do?
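
      A sketch of the error-code style in plain C, where a failed malloc() really does hand back NULL. The cap parameter, the error codes other than NULL_POINTER_ERROR, and the text written into the buffer are all assumptions added for illustration.

```c
#include <stddef.h>
#include <string.h>

#define FILL_OK             0
#define NULL_POINTER_ERROR (-1)
#define BUFFER_TOO_SMALL   (-2)

/* Writes a short string into buf. cap is the buffer size in bytes;
   it is an extra parameter assumed for this sketch. */
int fill_my_buffer(char *buf, size_t cap)
{
    static const char text[] = "hello";

    if (buf == NULL) {
        return NULL_POINTER_ERROR;   /* report it, don't crash */
    }
    if (cap < sizeof text) {
        return BUFFER_TOO_SMALL;
    }
    memcpy(buf, text, sizeof text);  /* includes the terminator */
    return FILL_OK;
}
```

      A caller that forgot to test the allocation gets NULL_POINTER_ERROR back and can propagate it upward, rather than taking the whole program down at the first write.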

    69. Re:So if they found them... by Anonymous Coward · · Score: 1

      You should try reading the C standard before making yourself look like a moron on Slashdot.

    70. Re:So if they found them... by jhunsake · · Score: 1

      Would you take advice from Henry Ford about your new 2003 Ford car? I didn't think so.

    71. Re:So if they found them... by achurch · · Score: 1

      And for those who haven't seen this trick before, a nice habit to get into is to write your checks like so:

      if (NULL == myPointer) { ... }

      And how do you read that? "If NULL is equal to myPointer, then..." Yes, we all know equality is symmetric, but that doesn't make any intuitive sense; since NULL is a constant, how can it ever have a value other than NULL? IMO, it's much clearer--and therefore easier to read and maintain--to put the variable first. Besides, as others have pointed out, modern compilers are intelligent enough to catch improper usage of "=" in a boolean test (and if you don't compile with -Wall or the equivalent for your compiler, then you deserve all the bugs you get).
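
      For comparison, both orderings side by side. The function names are hypothetical; the note about gcc refers to the -Wparentheses warning that -Wall enables.

```c
#include <stddef.h>

/* Variable first: reads as "if myPointer is NULL". A slip such as
   "if (myPointer = NULL)" is still legal C, but gcc -Wall reports it
   as an assignment used as a truth value. */
int is_missing(const char *myPointer)
{
    if (myPointer == NULL) {
        return 1;
    }
    return 0;
}

/* Constant first: the same slip, "if (NULL = myPointer)", fails to
   compile because NULL is not something that can be assigned to. */
int is_missing_constant_first(const char *myPointer)
{
    if (NULL == myPointer) {
        return 1;
    }
    return 0;
}
```

      The two functions behave identically; the only difference is whether the typo is caught by a warning or by a hard error.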

    72. Re:So if they found them... by fforw · · Score: 1

      Another interesting question is :

      how many errors do you insert into the code while removing the found ones?

      --
      while (!asleep()) sheep++
    73. Re:So if they found them... by Vej · · Score: 1

      True, but it's the same basis for all code run through it, is it not?

    74. Re:So if they found them... by jonadab · · Score: 1

      > Programmers are prone to make errors when using [char* and its ilk].

      Yes, I can go along with that.

      --
      Cut that out, or I will ship you to Norilsk in a box.
    75. Re:So if they found them... by jonadab · · Score: 1

      > gives Can't use an undefined value as a SCALAR reference

      Okay, but that's forty-two zillion times better than dumping
      core seven minutes later for no apparent reason. Hence, "safely".

      I didn't mean to imply that using undef as a reference was going
      to normally be what you really want to do, or anything. Just that
      it won't crash things randomly, allow an attacker to execute some
      arbitrary code, or be Practically Impossible To Debug (TM).

      > A lot of this is because they've picked up bad habits from older
      > (or just plain bad) C++ textbooks

      Could be.

      FWIW, I do use warnings at development time (though I often turn
      them off when I'm done working on the program), and I've recently
      taken to using taint checking for many things, and I'm working on
      training myself to use strict in longer programs (anything more
      than a screenful is my rule of thumb for that now).

      I still say C and C++ are evil. If I had back the time I'd
      wasted on them, I could learn three other languages with time
      left over for reading slashdot.

      --
      Cut that out, or I will ship you to Norilsk in a box.
    76. Re:So if they found them... by Ed+Avis · · Score: 1
      Okay, but that's fourty-two zillion times better than dumping core seven minutes later for no apparent reason.

      Yes, but in almost all C implementations dereferencing a null pointer will segfault the program there and then, letting you see the exact source line in the debugger. So apart from not having the line number printed as part of the error message, it is pretty much the same as Perl's behaviour.

      The seven-minute-later-coredump thing happens when you start using pointers that are not null, but dodgy in some more subtle way, like pointing to freed space or off the end of an array.

      --
      -- Ed Avis ed@membled.com
    77. Re:So if they found them... by EelBait · · Score: 1

      If used well, early returns make for much more readable code than the Pascal-ish code you supplied.

      With your sample, the important part of the function (setting the first character of the buffer) has been partially obscured by the conditional.

      Early returns are best when used for safety net and error handling as in the original sample.
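
      The samples being argued over are not reproduced in this part of the thread, so here is a hypothetical pair in the same spirit: one function with guard-style early returns and one with a single exit. Names and return codes are invented for illustration.

```c
#include <stddef.h>

/* Early-return style: the guards sit at the top, and the real work
   (setting the first character) stands alone at the end. */
int clear_early(char *buf, size_t cap)
{
    if (buf == NULL) {
        return -1;
    }
    if (cap == 0) {
        return -1;
    }
    buf[0] = '\0';
    return 0;
}

/* Single-exit style: the same checks, with the work nested inside
   the conditions and one return at the bottom. */
int clear_nested(char *buf, size_t cap)
{
    int status = -1;

    if (buf != NULL) {
        if (cap > 0) {
            buf[0] = '\0';
            status = 0;
        }
    }
    return status;
}
```

      Both return the same result for every input; the disagreement is purely about which layout a maintainer reads more reliably.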

    78. Re:So if they found them... by Anonymous Coward · · Score: 0

      Man, if Henry Ford were to somehow give me advice about my new 2003 Ford car, then by god I'd take it.

    79. Re:So if they found them... by gfim · · Score: 1

      I'm sorry, but I think that is absolute crap. The early returns mean that you don't have to consider "special" cases throughout the rest of the function - that is a big PLUS in my book. If I see "if(p == 0) return;" at the top of a function, I know that I don't have to worry about that case from then on. If I see "if(p != 0) {" then I spend time looking for the "else" and wondering what is going to happen in that case. Have a look at the SESE vs. SEME thread that's been going in comp.lang.c++.moderated for the last few weeks.

      Graham

      --
      Graham
    80. Re:So if they found them... by aborchers · · Score: 1
      I'm sorry, but I think that is absolute crap.


      You are entitled to.

      If I see "if(p == 0) return;" at the top of a function, I know that I don't have to worry about that case from then on.


      You also don't have to worry about it if everything dependent on a non-zero value for p is contained in a block under the opposite test, as in "if (p != 0) { ... }".

      Write all the early returns you like. As long as I don't inherit your code, may they live forever...

      --
      Trouble making decisions? Just flip for it.
  2. A bit late, aren't we? by Anonymous Coward · · Score: 2, Interesting
  3. apache 2.1? by fishynet · · Score: 5, Interesting

    2.1 isn't even out yet! the latest is 2.0.46!

    --

    Cats: All your base are belong to us.
    Captain: Take off every sig !!
    1. Re:apache 2.1? by bigpat · · Score: 1

      "to examine defect density rates in a less mature Open Source application and compare it with the commercial equivalent."

      I read that to mean that they looked at a development version of a commercial project also. Which means that the commercial company had to decide that its project was at a similar place to Apache's development project. Given the subjective nature of that determination, this is essentially an endorsement of the Apache Development effort.

      Essentially, a Software company wanted to check its development against an open source project and found out that a certain error rate was about the same, which seems likely given that what they were looking at were essentially syntax errors, not design flaws or anything else more complicated to look at.

      I read this as finding out that essentially a programmer's fallibility is about the same regardless of whether they are being paid or not.

    2. Re:apache 2.1? by fnorky · · Score: 1
      Essentially, a Software company wanted to check its development against an open source project and found out that a certain error rate was about the same, which seems likely given that what they were looking at were essentially syntax errors, not design flaws or anything else more complicated to look at.


      I would say it is more likely that Reasoning is just using this "study" as a marketing tool.

    3. Re:apache 2.1? by bigpat · · Score: 1

      "I would say it is more likely that Reasoning is just using this "study" as a marketing tool."

      No, the results were released, not the company's name. So, unless they just didn't like the results, which is possible since the conclusion was that the error rates are nearly the same, then this is a legit analysis of code and they are relatively open about their methods.

      A reasonable conclusion about what this group has found is that error rates in open source projects are relatively the same as in commercial projects, but as they mature a greater number of errors are fixed in Open Source vs Commercial Software... this is not negative towards Open Source.

  4. It's not fair! by jpmahala · · Score: 5, Funny

    Just because Open-Source coders can't spell when they insert comments doesn't mean that they can't write good code!

    1. Re:It's not fair! by MrPerfekt · · Score: 4, Funny

      Unless they can't spell other things like...

      inklude
      dephine
      retern
      brake... etc.

      --
      I just wasted your mod points! HA!
    2. Re:It's not fair! by Anonymous Coward · · Score: 0

      If you believe Darl, it's the SCO coders that can't spell.

    3. Re:It's not fair! by Drathos · · Score: 2, Insightful

      That's what compiler errors are for.. How else are you supposed to find typos when vim doesn't have a spellchecker? :)

      --
      End of line..
    4. Re:It's not fair! by Anonymous Coward · · Score: 0

      That is what #defines are used for! Why do you think we have the precompiler? It was to fix spelling errors in the code without having to goto each spot and do it. You can just fix them all in one place. This is also the best way to change variable names.

    5. Re:It's not fair! by Jucius+Maximus · · Score: 3, Funny
      "(putting it ahead of many commercial implementations for it's low error density)"

      This line gave me a good chuckle. I expect that most people did not even notice the grammatical error in a sentence talking about low error densities.

      Note: The rules for its/it's are not covered in Bob's Quick Guide To The Apostrophe, You Idiots since the Guide covers nouns and 'it' is a pronoun.

    6. Re:It's not fair! by blibbleblobble · · Score: 1

      "Unless they can't spell other things like...
      inklude
      dephine
      retern
      brake... etc.
      "

      Add a spellchecker to the compiler if you like: "35 warnings: 1 undefined symbol and 34 spelling errors in the comments".

      Problem is, many coders know a lot more about linguistics than the spellchecker, you'd hardly be a unix programmer (i.e. hyper-literate) if you didn't regularly invent words because there weren't any good enough to describe what you want to describe.

      Codepoet, etc.

    7. Re:It's not fair! by rabs · · Score: 1

      eazy solutoin:

      #define dephine define
      #dephine inklude include
      #dephine retern return
      #dephine brake break

    8. Re:It's not fair! by Anonymous Coward · · Score: 0

      That's what compiler errors are for.. How else are you supposed to find typos when vim doesn't have a spellchecker? :)

      Well as an emacs user, I rely on the colours of the words... if it aint blue, then it aint a keyword, etc.

      Of course emacs has a spell checker =)

    9. Re:It's not fair! by TeraCo · · Score: 1
      Yeah, but emacs has a shuttle control console built in as well.

      --
      Not Meta-modding due to apathy.
    10. Re:It's not fair! by Anonymous Coward · · Score: 0

      ... and Colonel.

    11. Re:It's not fair! by Troll_Kamikaze · · Score: 1

      /* I rypped this koad of from SCOA: */
      ...

    12. Re:It's not fair! by Anonymous Coward · · Score: 0
      No.

      It's more like:

      dependant

      seperate

      delimeter

      BTW, look up the word "dependant." It's not what you think it is. And there is no such word as "independant."

    13. Re:It's not fair! by fruey · · Score: 1
      You are right; I'm not sure whether I made that error, or if it was added by the editor. However, the link you provided didn't cover that, here's a better one:

      When Not to Use An Apostrophe

      --
      Conversion Rate Optimisation French / English consultant
  5. Code defects appear to be a small part of the equa by mao+che+minh · · Score: 4, Insightful
    I suppose now we have to question the severity of the defects (and also factor in the implementation and use of the code). If Apache and, say, IIS are roughly equivalent in terms of code defects, you have to ask yourself "well, why does IIS have so many more general problems and security flaws than Apache, when they both carry the same general amount of coding defects?". Is IIS just inherently insecure because it is used on a Windows platform? Is it because hackers generally target IIS and not Apache (most people will rush to this conclusion)?

    But here's the kicker: the vast majority runs Apache on either BSD or Linux. All of this code, from the kernel to the library that tells Apache how to use PHP, is open source. Every hacker on the planet has full access to the code - which means that they can review it and find vulnerabilities in it. Not many people have access to Windows or IIS code. So why do IIS and Windows come out as far less secure, and why are they exploited so much more?

    I think the answer lies in the severity of the code defects, and the architecture and design of the operating system that powers the web server. And yes, I know that Apache can run on Windows.

  6. Wait a second by Knife_Edge · · Score: 3, Insightful

    Has Apache 2.1 been released as a stable, non-developmental release? If not I would say testing it for defects is a bit premature.

    1. Re:Wait a second by AftanGustur · · Score: 2, Interesting


      Has Apache 2.1 been released as a stable, non-developmental release?

      According to the official site.
      The latest 2.* release is "2.0.46 " and version 2.1 is nowhere to be seen ....

      So the question is : Which version did they audit ??

      --
      echo '[q]sa[ln0=aln80~Psnlbx]16isb572CCB9AE9DB03273snlbxq' |dc
    2. Re:Wait a second by willamowius · · Score: 2, Funny

      They probably compared it to IIS 7.4 to make it a fair comparison. ;-)

    3. Re:Wait a second by znaps · · Score: 1

      But very helpful for the Apache coders, and another addition to the 'Open Source Software bugs are fixed 10 times quicker' argument :)

    4. Re:Wait a second by 13Echo · · Score: 1

      Not to mention that most servers still run on 1.3 releases. Apache 2.x still isn't as stable, nor does it have the features and add-ons that 1.3x has. Some distributions shipped with 2.x only to switch back.

  7. Defect? by Jason_says · · Score: 5, Interesting

    Reasoning found 31 software defects in 58,944 lines of source code of the Apache http server V2.1 code.

    so what are the calling a defect?
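
    For scale, the figures quoted above work out to roughly 0.53 defects per thousand lines. A small sketch of the arithmetic (the function name is made up for illustration):

```c
/* Defects per thousand lines of code (KLOC).
   Returns 0.0 for an empty code base to avoid dividing by zero. */
double defects_per_kloc(unsigned long defects, unsigned long lines)
{
    if (lines == 0) {
        return 0.0;
    }
    return (double)defects * 1000.0 / (double)lines;
}
```

    Calling defects_per_kloc(31, 58944) gives about 0.526, or one reported defect for every 1,900 lines or so.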

    1. Re:Defect? by Anonymous Coward · · Score: 2, Funny

      so what are the calling a defect?

      I guess would be quite a good example.

    2. Re:Defect? by Anonymous Coward · · Score: 0

      lol

    3. Re:Defect? by richie2000 · · Score: 5, Informative
      From the report:
      NULL Pointer Dereference (Expression dereferences a NULL pointer) 29 instances
      Uninitialized Variable (Variable is not initialized prior to use) 2 instances

      They also list the files and code snippets where the errors were found.

      In addition, the comparison is made against an industry average of commercial code they have tested this way, NOT against other webservers.

      --
      Money for nothing, pix for free
    4. Re:Defect? by Malc · · Score: 1

      "Uninitialized Variable (Variable is not initialized prior to use) 2 instances"

      How does this happen? I know the MS Visual C++ compiler has been issuing warnings for this for years.

    5. Re:Defect? by capnjack41 · · Score: 1

      Why don't they just tell us what the defects are, so they can be fixed?

    6. Re:Defect? by Anonymous Coward · · Score: 0

      Ah, there we go -- I couldn't get through to the site before.

    7. Re:Defect? by podperson · · Score: 1

      NULL Pointer Dereference (Expression dereferences a NULL pointer) 29 instances
      Uninitialized Variable (Variable is not initialized prior to use) 2 instances


      Just curious, but...

      Why aren't the developers of Apache (or IIS or whatever) using software tools that automatically detect this kind of defect themselves? This doesn't seem like rocket science -- tools like this are available on most platforms.

    8. Re:Defect? by Q2Serpent · · Score: 1

      Static analysis catches so much more than simple compiler syntactic analysis.

      For example, this can be caught easily by the compiler:

      int foo(int bar)
      {
          int a;              /* declared but never assigned */
          return bar + a;     /* uninitialized read on every path */
      }

      But this requires a deeper analysis:

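      int foo(int bar)
      {
          int a = 0;
          int b;              /* only assigned when bar > 0 */

          if(bar > 0)
          {
              a = bar;
              b = bar - a;
          }

          if(a == 0)          /* true exactly when bar <= 0 ... */
          {
              a += b;         /* ... which is when b was never set */
          }

          return a;
      }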

    9. Re:Defect? by Anonymous Coward · · Score: 0

      Many are built into the compilers, too. I mean, adding -Wall to gcc ain't exactly tough either :/

    10. Re:Defect? by richie2000 · · Score: 1
      I don't really know. However, my (meager) experience is that the tools aren't important - getting the developers to use them is. I have first-hand experience with a "developer" that routinely turned off all compiler warnings and when the code compiled without errors - simply shipped it. This was on the Windows platform, BTW. And yes, I'm pretty sure he re-defined at least a few errors to warnings. Which he already had suppressed.

      Some of us jokingly referred to this practice as "[company name withheld] Tested and Approved" as a spoof off the Novell Tested and Approved stickers.

      Microsoft has spent a good deal of time and money on educating their developers about seemingly simple things like buffer overruns, but still they creep into shipping code. Why? Because they're human (contrary to popular belief, not all Microsoft employees are gnarly evil trolls, just the legal and marketing departments - the remaining 3% of the company are actually human beings, some of them are even rather nice persons) and excrement occurs.

      --
      Money for nothing, pix for free
    11. Re:Defect? by Anonymous Coward · · Score: 0

      If you couldn't get through to the site, then how did you know they didn't show the defects? Did you just assume?

    12. Re:Defect? by cpeterso · · Score: 1


      so how long until the Apache developers fix those 31 bugs? Then Apache will have "zarro boogs" in 59 KLOC!! A milestone in software engineering history! ;-)

  8. How do they get to look at closed source? by 3.5+stripes · · Score: 3, Interesting

    And don't most NDAs for when they do let you look forbid any competitive analysis?

    Or am I just too far out of that line of work to know how these things work?

    --


    He tried to kill me with a forklift!
    1. Re:How do they get to look at closed source? by dracocat · · Score: 1

      Easy.

      "Hey I would like an NDA where I can release benchmarks with your software. I will do it against an open source project, and I will only release reports that show yours is better, even if I have to use a development release of the open source project in question."

      "Uhhh, ok."

  9. 2.1 ? by Aliencow · · Score: 4, Insightful

    Wouldn't that be unstable? I thought the latest was 2.0.46 or something.. If I'm not mistaken, it would be a bit like saying "Freebsd 4.8 has less bugs than Linux 2.5!"

    1. Re:2.1 ? by penguinblotter · · Score: 1


      These guys can't even round up !

      --
      Mind the gap
    2. Re:2.1 ? by pthomsen · · Score: 1

      But the point is that they were testing "less mature" OSS against something commercial.

      To me, the upshot is that even stuff that's still in development is about as "bug-free" as commercially available wares. A win for OSS in my book!

    3. Re:2.1 ? by Aliencow · · Score: 1

      By less mature I thought they only meant less, as in less than the TCP/IP stack itself, not "beta software" or anything..

    4. Re:2.1 ? by Leffe · · Score: 1

      If there are no new features in 2.1(yeah, right ;)) it should be more stable. 2.0.46 doesn't sound very stable to me either.

      Does anyone know when 2.1's coming out?

  10. What do reasoning do? by SystematicPsycho · · Score: 4, Insightful

    So basically they offer a service like lclint only many times more advanced ? What is to say they haven't missed anything?

    This is probably a publicity stunt for them although a good one. I think it would be a good idea for them to sell software suites of their product if they don't already.

    --
    Analytic & algebraic topology of locally Euclidean meterization of infinitely differentiable Riemmanian manifold
    1. Re:What do reasoning do? by ichimunki · · Score: 1

      This was exactly my thought. How much of this is just an attempt to sell their proprietary software?

      --
      I do not have a signature
    2. Re:What do reasoning do? by Anonymous Coward · · Score: 0

      No this *is* a publicity stunt. They make their money by hand-reviewing code and charge by the line. Now a lot more people know about them. We had them do a code review of our stuff and most of what they came back with wasn't a bug or lint would have caught it. This 'stunt' tells me that they're having a hard time in the current economic climate. When is IIS being 'Reasoninged' anyhoo ?

    3. Re:What do reasoning do? by Florian+Weimer · · Score: 1

      So basically they offer a service like lclint [virginia.edu] only many times more advanced ? What is to say they haven't missed anything?

      Exactly, this is a PR campaign for their tool. If Apache had used it, it would have had zero defects in this test. What does this tell about the actual bug count in Apache? Nothing. The tool doesn't even check functionality, only some occurrences of undefined behavior (which is an achievement nevertheless, such things are quite hard to detect statically).

      One thing I wonder: What happens if you run the tool on Apache 1.3.10 (say)? Does it find all the security bugs? And if it does, should its use be banned as a cracker tool? 8-)

    4. Re:What do reasoning do? by cpeterso · · Score: 1


      I have played with LCLint (now called Splint). Does anyone know any non-trivial programs (open or closed source) that have used LCLint/Splint? Their C language annotations are cumbersome at first, but the benefit is huge (especially for secure software like web servers).

      Imagine if the Linux kernel adopted LCLint/Splint's annotations. They would help find some kernel bugs, but I think the biggest benefit would be for people writing device drivers. The kernel core is well understood by a privileged few. These kernel gurus could use the LCLint/Splint annotations to add extra SEMANTIC information to the device driver APIs. The annotations would be like extra device driver documentation.. that is checked by the compiler!
      And the annotations add no runtime performance penalty.
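
      A minimal sketch of what such an annotation looks like, assuming Splint's /*@null@*/ stylized comment. The names find_device and struct device are hypothetical, made up for illustration, and are not kernel APIs. To an ordinary compiler the annotation is just a comment, which is why it costs nothing at runtime.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical driver record, for illustration only. */
struct device {
    const char *name;
    int irq;
};

static struct device devices[] = {
    { "eth0", 11 },
    { "ttyS0", 4 },
};

/* The annotation documents that the result may be NULL, so a checker
 * can flag callers that dereference it without testing first. */
/*@null@*/ struct device *find_device(const char *name)
{
    size_t i;
    for (i = 0; i < sizeof devices / sizeof devices[0]; i++) {
        if (strcmp(devices[i].name, name) == 0)
            return &devices[i];
    }
    return NULL;
}

/* The caller honours the annotation by checking before dereferencing. */
int device_irq(const char *name)
{
    struct device *dev = find_device(name);
    return dev != NULL ? dev->irq : -1;
}
```

      Dropping the NULL test in device_irq is the kind of mistake the annotation is meant to let a checker catch.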

    5. Re:What do reasoning do? by RajivSLK · · Score: 1

      This is probably a publicity stunt for them although a good one

      This entire evaluation is one big piece of well crafted flamebait specifically designed to get the publishing company a lot of attention.

      Come on, walking through source code with an automated program counting every "possible" NULL pointer dereference is a joke. Laughable at best.

      Essentially we have a software program analyzing a very small part (i.e. chunks of code) of a large application that it doesn't understand. (In fact understanding is beyond the scope of the program; that is why we still have programmers)

      That would be like me walking into the center of a huge factory through a maze of machinery with my screw driver and testing every screw. Writing a report as I go. Aaa Ha! That screw is loose.

      Obviously I can't begin to test a factory for defects until I understand what the factory does and how it is supposed to do that. (That screw was supposed to be loose! Tightening it will restrict the flow of coolant and the entire plant will explode!)

      In conclusion:
      Whoever came up with this methodology has a few screws loose.

  11. And the point is? by Anonymous Coward · · Score: 0

    Would someone please tell me what the point of releasing an article comparing one known product against an un-named one?

  12. FACT: 3 is a larger number than 2 by TheRaven64 · · Score: 4, Insightful
    Hmm, so they looked at 58,944 lines of code, and found 31 defects? Did they find every defect? Can they prove this? What about those found in commercial code? If it were possible to find all of the defects in a piece of code this big in a small amount of time, then there would be no defects, since they would all be identified and fixed before release.

    As far as I can see, this article says 'We have two arbitrary numbers, and one is bigger than the other. From this we deduce that Apache is not as good as commercial software.'

    --
    I am TheRaven on Soylent News
  13. In term of looks Apache is quite good by r6144 · · Score: 0

    I once traced sendmail's source code. Absolutely messy.

    1. Re:In term of looks Apache is quite good by Tony+Hoyle · · Score: 1

      I've been through sendmail myself and it jumps around a bit but isn't too bad (once you've got your head around it it's relatively understandable). Some of the commercial code I've worked with looks like an explosion in a code factory...

  14. Apache 2.1...? by bc90021 · · Score: 4, Insightful

    According to Apache.org, Apache's latest stable version is 2.0.46. Is that a typo on their part, or are they testing a development version? Also, since 1.3.27 is widely used, it would have been interesting to see how that stacked up as well, having been developed longer.

    Either way, to have only 31 errors in close to 60,000 lines of code is impressive!

    1. Re:Apache 2.1...? by Bu+Na+Dan · · Score: 2, Funny

      the error density in the announcement of reasoning.com is pretty high ... testing a non released software against an unknown commercial software ... sounds like an ancient tale. where are the people who accept this kind of crap?

    2. Re:Apache 2.1...? by jbp4444 · · Score: 3, Insightful

      I was quite impressed by the fact that Apache can cram all the functionality into ~59k lines. So besides defect rate, I would like to know how many lines of code the commercial package had ... 0.51 defects per 1000 lines sounds good, unless there are 1,000,000 lines more code in the commercial package.

    3. Re:Apache 2.1...? by hpavc · · Score: 1

      and cross platform. which in my mind would tend to make the source somewhat more generic and larger to handle the tools for configuring it for porting.

      --
      members are seeing something, your seeing an ad
    4. Re:Apache 2.1...? by bofkentucky · · Score: 1

      That's the great thing about apache (and some other OSS projects). The core project is often about building a framework for other apps to hook in. Apache-core is all about calling various filters and sub-programs to do all the complicated stuff, from JSP (Jakarta and more sub-groups in the java game), PHP, mod_perl, rewrites, SSI, SSL, mod_python, DAV, and more

      --
      09f911029d74e35bd84156c5635688c0
    5. Re:Apache 2.1...? by pmz · · Score: 2, Insightful

      I was quite impressed by the fact that Apache can cram all the functionality into ~59k lines.

      Agreed. It would be interesting to know whether this low LOC is accomplished through good architecture that emphasizes simplicity and maintainability or "clever" hacks that compress a 10-line loop down into a three-line abomination of pointer arithmetic. I genuinely hope it is not the latter.

      Regardless, 59K lines is small enough a program that--given a good architecture--can be studied and debugged relatively easily by one or two people. I'd estimate that this is why Apache is known for its low number of exploits in spite of its enormous web server market share.

    6. Re:Apache 2.1...? by pmz · · Score: 1

      Actually, this is the correct way to write software. Reduce the number of lines. Increase the speed of execution.

      No, it is not. Let's take Sed or Perl, for example. It is possible to do a tremendous amount of work in one line of code using a complex Regular Expression, but debugging or maintaining that one line of code can be a nightmare, especially in shell scripts, where three or four parsers are operating on top of one another.

      The same can occur in C regarding mixtures of structures, arrays, pointers, and function pointers. I can guarantee that the programmer will lose more hair debugging function pointers in structures or prefix increments within dereferences to arrays of structure pointers or whatever than they would lose had they originally written something readable.
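
      A small sketch of the trade-off, using string copy as the example (chosen here for illustration, not taken from the thread). Both functions do the same thing; the first packs the assignment, two post-increments, and the loop test into one expression.

```c
#include <stddef.h>

/* Terse: copy, advance both pointers, and test for the terminator at once. */
void copy_terse(char *dst, const char *src)
{
    while ((*dst++ = *src++) != '\0')
        ;
}

/* Readable: one step per line, same behaviour. dst must be large enough. */
void copy_readable(char *dst, const char *src)
{
    size_t i = 0;
    while (src[i] != '\0') {
        dst[i] = src[i];
        i++;
    }
    dst[i] = '\0';
}
```

      A decent optimizer will usually produce much the same machine code for both, so the readable form rarely costs anything at runtime.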

      Hard-to-read code is also more likely to get trashed, re-written, trashed, re-written, ad nauseam. Only people who write undocumented trash solely in the interest of job security would consider the-whole-world-in-two-lines a good programming model.

      The truly correct way to write software is to write it first, then profile it to optimize the bottlenecks. Of course, the overall architecture of the first draft must be reasonable, too, lest the whole program need optimizing.

      I mean, after it's compiled, why do you care?

      Because, I'm the one doing the compiling!

    7. Re:Apache 2.1...? by Fulcrum+of+Evil · · Score: 1

      Actually, this is the correct way to write software. Reduce the number of lines. Increase the speed of execution. I mean, after it's compiled, why do you care?

      I care because I have to maintain that rotting pile of garbage. Anyway, complicated pointer tricks do little to speed execution, but if you want that, why don't you update gcc's optimizer? Then we can all get the speed of unspeakably ugly code with the benefit of maintainability.

      --
      "We returned the General to El Salvador, or maybe Guatemala, it's difficult to tell from 10,000 feet"
  15. "Defect Density"? by sparkhead · · Score: 4, Insightful
    A key reliability measurement indicator is defect density, defined as the number of defects found per thousand lines of source code.

    Since LOC is a poor metric, a "defect density" measurement based on that will be just as poor.

    Yes, I know there's not much else to go on, but something along the lines of putting the program through its paces, stress testing, load testing, etc. would be a much better measurement than a metric based on LOC.

    1. Re:"Defect Density"? by p3d0 · · Score: 1
      LOC is a good measure of the effort involved in a project, and the complexity of the code base, but not anything else. Particularly, it certainly doesn't indicate how much functionality the code has.

      So I consider LOC a metric to be minimized. For a given project, the less code the better, within reason.

      --
      Patrick Doyle
      I mod down every jackass who puts his moderation policy in his sig. Oh, wait a sec....
    2. Re:"Defect Density"? by Anonymous Coward · · Score: 0

      Summary: Their defect density is a fine metric for what they're comparing, which is an estimation of the probability of finding an error in a given line of code.

      Overly Verbose Reply to sparkhead's:
      'Since LOC is a poor metric, a "defect density" measurement based on that will be just as poor.'

      Ugh. I'm amazed that I still had enough faith in the Slashdot community to actually blink when I saw this marked as Insightful as opposed to Troll. I think perhaps I'm even more startled that I find myself actually biting. *sigh*.

      Of what, pray tell, is the number of Lines of Code a poor metric of? The price of tea in china? The amount of oregano that belongs in pizza sauce? The number of statements that comprise a program?

      Well, let's take a second and think about how these guys might define a Line of Code for C and its progeny.
      .
      .
      .
      .
      .
      How bout something like the presence of a semi-colon that a lexer would parse as a statement closure when compiling.

      Wow...with a definition like that, LOC ends up as a pretty decent measure of the number of statements that comprise a program...

      Now let's assume that what we're _really_ interested in is an estimation of the probability of finding an error given a line of code. Hmmm. Let's stab around in the dark for a few seconds and see if anything screams... We _could_ try taking our total number of errors and divide by the number of Lines of Code that we have...

      or nah, let's put the program through its paces, stress testing, load testing, etc. I hear its a much better measurement.
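
      A crude sketch of the semicolon-counting definition above. It is not a real lexer: a semicolon inside a string literal or comment gets counted too, and a for header contributes two. The helper name count_statements is hypothetical.

```c
#include <stddef.h>

/* Count ';' characters in a source string as a rough statement count. */
size_t count_statements(const char *src)
{
    size_t n = 0;
    for (; *src != '\0'; src++) {
        if (*src == ';')
            n++;
    }
    return n;
}
```

      Whether Reasoning counts statements this way, or counts physical lines, is not stated in the press release.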

    3. Re:"Defect Density"? by 2short · · Score: 1

      Of course lines of code is a good number to have if you want to estimate the "probability of finding an error given a line of code".
      The point is that that is a pretty useless thing to want to estimate. The article is suggesting that number is an indication of "reliability", which is bogus. Actual testing IS a much better measurement, not just because it is a good measurement, but because it is a measurement at all.

      If my program needs to do the same thing in ten different places, I could write one function and call it, or I could rewrite the same functionality ten times in ten ways. The latter will produce ten times the LOC, one tenth the "defect density", and almost certainly be less reliable. The chances of producing a bug (maybe even one their checker misses) are ten times as great, and finding it is ten times less likely.

      Lines of Code is by definition an excellent metric of the number of lines in your source. It might be a mediocre metric for maintenance difficulty. It's a worthless metric of anything else.

    4. Re:"Defect Density"? by Anonymous Coward · · Score: 0

      Given that various web servers may have much different LOC figures for their active core (as opposed to in modules that are not even loaded in most configurations) LOC is a terrible metric.

      Now defects per web server "core" would be more interesting -- though they should still use 2.0.46 as the basis of the comparison, not 2.1 dev!

  16. Open Source versus Closed by ElectronOfAtom · · Score: 3, Informative

    The difference is that now that someone has found 31 errors in the open source Apache software, they will be fixed fairly quickly whereas closed source software will have to have the company do a cost-benefit analysis, put together a team to do the fixes, probably charge to put out patches or minor upgrades (assuming the product is Microsoft's IIS ;b)...

    --
    Only two things are infinite, the universe, and human stupidity,
    and I'm not sure about the former.
  17. their own code? by Jearil · · Score: 5, Funny

    Why does it seem a bit odd to be testing software quality with other software? I wonder if they ran their own software through its own program, but then that gets kinda weird when a program starts noticing errors about itself... maybe it'd get depressed and start ranting at the creator on how they should have taken better care of it... ok, I need more sleep

    1. Re:their own code? by ElectronOfAtom · · Score: 1

      hmm many of the "defects" their software found are *possible* NULL pointer dereferences. Many of the possible NULL values seem to depend on function return values. I wonder if their software was smart enough to process the functions that are called to check for possible return values or if it just assumed "All functions involving pointers may return NULL" Perhaps Apache put the NULL value checking inside the function that is called so that NULL will never be returned by the function. Just out of curiosity, I wonder what the result is if they run that program on its own code. LOL... anyone find that result published on the site?
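
      A sketch of the situation described, with hypothetical names (get_provider, default_provider) that are not taken from the Apache source: the callee guarantees a non-NULL result, so the caller's unchecked dereference is safe, but a checker that looks at the caller alone cannot know that.

```c
#include <stddef.h>

struct provider {
    const char *name;
};

static struct provider default_provider = { "default" };

/* Never returns NULL: falls back to a default when nothing is configured. */
struct provider *get_provider(struct provider *configured)
{
    return configured != NULL ? configured : &default_provider;
}

/* Dereferences without a check; safe only because of the guarantee above. */
const char *provider_name(struct provider *configured)
{
    struct provider *p = get_provider(configured);
    return p->name;
}
```

      An analysis that follows the call into get_provider can prove the dereference safe; one that assumes every pointer-returning function may return NULL will report it.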

      --
      Only two things are infinite, the universe, and human stupidity,
      and I'm not sure about the former.
  18. What kind of BS test is this? by dtolton · · Score: 2, Interesting

    They are comparing a development version to an un-named commercial web server?

    Why don't they compare it to apache 2.0.46 if they want a newer, but release product? I expect they did, but they didn't get the results they wanted.

    This is a development version, it's an odd numbered release for crying out loud.

    I wouldn't be surprised to see this is bankrolled by M$. Let's compare IIS in development to Apache 2.1, and then see what IIS bug density rate is.

    Bah!!

    --

    Doug Tolton

    "The destruction of a value which is, will not bring value to that which isn't." -John Galt
  19. Absolute crap by degradas · · Score: 1, Interesting

    I can't think of any reason why should anybody trust this analysis until they publish the methods used. Anybody can say "Hey, I tested something using my proprietary method, and $foo has more bugs than $bar!". Unfortunately, such tests really don't say anything substantial about the quality of software. IMHO.

  20. Re:Code defects appear to be a small part of the e by demaria · · Score: 1, Funny

    Hypothesis: Taking down IIS, Windows or Microsoft is more fun/cool.

  21. Apache 2.1 does not yet exist by David+McBride · · Score: 4, Informative

    Umm, Apache 2.1 hasn't been released yet. Current latest stable is 2.0.46.

    I can only assume that they're looking through the current DEVELOPMENT codebase -- finding a higher ``defect density'' in such a development codebase compared with commercial offerings is not exactly unexpected.

    They're also some automated code inspection product; the press release doesn't go into details as to the severity of the defects found or the testing methodology.

    It'll be necessary to read through the full report before drawing any sound conclusions.

    1. Re:Apache 2.1 does not yet exist by David+McBride · · Score: 4, Informative

      The above link wants your email address. Bah.

      The direct URLs for the reports are:
      Defect Report
      Metric Report

    2. Re:Apache 2.1 does not yet exist by orb_fan · · Score: 1

      I guess what the report is actually saying is that coders generally generate the same number of defects when they write code.

      Let's face it - who writes code without at least one error in it, either syntactical or logical?

      The question isn't about how well we write code, but how well we debug - how many times through the code-cycle we put the app.

      I would say that open source is better in this respect as we don't have the business mentality to release code early to make money (after all M$, etc. is in the business of making money, not writing code). I think the prior report about the TCP/IP stack proves this point.

    3. Re:Apache 2.1 does not yet exist by iabervon · · Score: 1

      Looking at a development codebase is, in fact, the entire point of the exercise, according to the article. They wanted to determine whether Open Source is effective at fixing bugs, or effective at not generating them in the first place, in response to questions about the previous study they did in which they found that Open Source production code was remarkably free of bugs. As it turns out, everybody's code is initially of approximately equal quality, but Open Source code gets substantially better over time, while proprietary software doesn't improve as much.

      It's amazing how many people have managed to figure out what the article says without reading it.

    4. Re:Apache 2.1 does not yet exist by blibbleblobble · · Score: 1

      "finding a higher ``defect density'' in such a development codebase compared with commercial offerings is not exactly unexpected."

      Except they didn't. Apache is equivalent.

      Not that it matters, testing a named product against an unnamed product. This webserver washes whites whiter than a famous supermarket-stocked webserver.

    5. Re:Apache 2.1 does not yet exist by Anonymous Coward · · Score: 0

      Some of these are kind of ridiculous. I looked at the second one for five minutes and couldn't find a problem. Then I realized they were saying they skipped an out of memory check and rolled my eyes. If malloc is returning NULL, the software isn't going to behave as expected anyhow. If a malloc fails, what is a poor HTTP server to do? Does it return a 500 Internal Server Error, or exit? In this case, it'll dereference a null pointer, thus exit, the incorrect way, dumping core. Big deal. Ideally, this shouldn't happen at all. Get some more memory, add some more swap, move on.
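
      For reference, the kind of check the report is asking for looks something like this sketch. copy_header is a hypothetical function, not Apache code (Apache 2.x allocates mostly through APR pools rather than calling malloc directly), and what the caller does on failure, such as answering with a 500, is left open.

```c
#include <stdlib.h>
#include <string.h>

/* Returns a heap copy of value, or NULL if value is NULL or memory is
 * exhausted. The caller decides how to fail and must free the result. */
char *copy_header(const char *value)
{
    size_t len;
    char *copy;

    if (value == NULL)
        return NULL;
    len = strlen(value) + 1;
    copy = malloc(len);
    if (copy == NULL)        /* the out-of-memory check */
        return NULL;
    memcpy(copy, value, len);
    return copy;
}
```

      The check is two lines; the hard part is deciding what a server should do once it fires.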

  22. Links to the Reports (no free reg required) by Anonymous Coward · · Score: 2, Informative
    AC, thank you for contacting Reasoning!

    Here are the links to the Apache Open Source Inspection Report you requested:

    Apache Defect Report: http://www.reasoning.com/pdf/Apache_Defect_Report.pdf
    Apache Metric Report: http://www.reasoning.com/pdf/Apache_Metric_Report.pdf

    Reasoning provides the world's leading automated software inspection service. We boost the productivity of development teams by finding software defects faster and at a far lower cost than traditional approaches. Please let me know if you would like additional information. Thank you again for contacting Reasoning!

    Sincerely,
    Reasoning

  23. more to it than # flaws-per-unit-"whatever" by Asprin · · Score: 5, Insightful


    What bothers me about these articles is that there is more to software quality than the # of flaws-per-unit-"whatever".

    Like design.

    It seems to me most of the problems with Apache's main competitor in terms of software quality are the result of design and engineering choices made by MS's IIS development team.

    In other words, it does exactly what they designed it to do, but what they designed it to do was a very bad idea.

    --
    "Lawyers are for sucks."
    - Doug McKenzie
    1. Re:more to it than # flaws-per-unit-"whatever" by Anonymous Coward · · Score: 0

      I have no interest in defending Apache, really, it's good enough for me, if it wasn't as good as Zeus or IIS or whatever it's no skin off my back and I'll still use it!

      However, the metrics here are all but pointless.

      One, this measures bugs you shouldn't have at all, not defects in the product, which are really defects in reaching its specified goals and features.

      Most of the "defects" are NULL pointer dereferences which are behind conditions which might well mean the pointer is confirmed to not be NULL but via tests this automated thing does not check. E.g.

      Thing *pThng = pInputThing;

      if (pThng)
      {
          pInputThing->member;
      }

      The real defects this found (no detected memory leaks... etc.) are use of uninitialized variables, two cases, and those are probably bad.

      Still, the real point, it cannot tell you anything about the design problems, or bugs that don't involve dereferencing a null pointer.

      What we have here are people trying to sell software! That's my guess.

  24. Interesting, with or without modules? by hughk · · Score: 3, Interesting
    If anyone has an Apache 2.1 dist around, they say they checked 58,000 lines - does this seem reasonable? Is this with any of the modules such as PHP or Perl or is this raw????

    I know that Apache has vulnerabilities but it should come better than IIS. You can't realistically give a verdict on IIS without looking at the libraries called.

    As for the rest, I can imagine some commercial products coming in better, but not many.

    --
    See my journal, I write things there
    1. Re:Interesting, with or without modules? by alder · · Score: 2, Informative
      they checked 58,000 lines - does this seem reasonable?
      It looks reasonable if they checked only the server "core".
      • All *.c files under httpd-2.0.46 - 375K lines
      • APR (i.e. srclib) - 230K lines
      • All modules - 93K lines
      • modules/http - 5K lines
      • modules/loggers - 1.6K lines
      • modules/cache - 0.4K lines
      • some files from modules/mappers - 4K lines
      375 - 230 - 93 + 5 + 1.6 + 0.4 + 4 = 63K in ~ 100 files

      Subtract 53 lines per file on Apache Software Licence and you'll end up with ~58K.

    2. Re:Interesting, with or without modules? by Anonymous Coward · · Score: 0
      I know that Apache has vulnerabilities but it should come better than IIS.

      Apache does probably come in as having fewer defects even if it has a higher defect rate than IIS. This is due to the fact that IIS has so many more lines of code.

      Defect rate is meaningless by itself. I could have one application with a defect rate of 1/1,000 lines of code and a second application with 10/1,000 lines of code. Still, the second application may have fewer defects.
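
      The arithmetic behind that claim, as a sketch. Any code sizes plugged in are invented for illustration; nothing in the report gives line counts for any commercial server.

```c
/* Expected defect count for a code base, given a density in
 * defects per thousand lines and a size in lines. */
double total_defects(double per_kloc, unsigned long lines)
{
    return per_kloc * (double)lines / 1000.0;
}

/* Returns 1 if the first code base has more defects in total. */
int has_more_defects(double per_kloc_a, unsigned long lines_a,
                     double per_kloc_b, unsigned long lines_b)
{
    return total_defects(per_kloc_a, lines_a) >
           total_defects(per_kloc_b, lines_b);
}
```

      With a hypothetical 2,000,000-line program at 1 defect per KLOC and a 60,000-line program at 10 per KLOC, total_defects gives 2000 versus 600, so the program with the worse rate still has fewer defects.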

  25. No cigar, my ass. by KFury · · Score: 5, Insightful
    The article claims Apache's error density, based on a meager 5100 lines of code, is 0.53, while that of 'comparable commercial applications' is 0.51.

    The problems with this are:
    • 5100 lines of code does not give you a confidence range of less than 0.02, especially when the error rate can be expected to be heterogeneous across the code base, as would be the case in an open-source product where different code pieces are created by entirely different groups.
    • 'Comparable' my ass. If they can't provide details of what software they're comparing to (I somehow doubt they got a look at IIS source code) then the stats are worthless, because anyone who's ever programmed knows that quality control isn't a constant factor across commercial products any more than it is among open-source products.
    • What's the error rate of their 'defect analysis'? If they're so good at finding defects, why aren't they out there writing perfect software? If their defect detection rate is less than 98% accurate, then the difference between a rate of 0.51 and 0.53 is meaningless anyhow.
    • There's a big difference between caught coding exceptions and fundamental security problems. The first can cause code to run a little slower, the second can destroy your company. This testing methodology doesn't even look at the second.
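
    A back-of-the-envelope check on the first point, assuming (an assumption made here, not by the report) that defect counts behave like Poisson counts, so the standard error of a count N is about sqrt(N). It is meant to be fed the 31 defects and 58,944 lines quoted elsewhere in this thread rather than the 5100 figure above. approx_sqrt is a hand-rolled Newton iteration so the sketch needs no math library.

```c
/* Newton's method square root, good enough for small positive inputs. */
double approx_sqrt(double x)
{
    double guess = x > 1.0 ? x : 1.0;
    int i;
    if (x <= 0.0)
        return 0.0;
    for (i = 0; i < 50; i++)
        guess = 0.5 * (guess + x / guess);
    return guess;
}

/* Defects per thousand lines. */
double density(double defects, double lines)
{
    return 1000.0 * defects / lines;
}

/* One standard error of the density, treating the defect count as Poisson. */
double density_std_error(double defects, double lines)
{
    return 1000.0 * approx_sqrt(defects) / lines;
}
```

    For 31 defects in 58,944 lines this gives a density of about 0.53 with a standard error of roughly 0.09 per KLOC, several times the 0.02 gap being reported.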
    1. Re:No cigar, my ass. by HowlinMad · · Score: 3, Informative

      FYI

      5100 != 58,944

      58,944 is the number from the article.

    2. Re:No cigar, my ass. by Anonymous Coward · · Score: 0

      'Comparable' my ass

      No cigars and comparable. Must be a nice ass.

    3. Re:No cigar, my ass. by Anonymous Coward · · Score: 0
      If they're so good at finding defects, why aren't they out there writing perfect software?

      Ah, Zen! I *love* Zen!

      If doctors are so good at healing, why do they die?
      If policemen are so good at catching criminals, why don't they commit the perfect crime?
      If you find your socks in the wrong drawer, why did you put them there in the first place?

      I must stop now before I reach satori in front of all you geeks.

    4. Re:No cigar, my ass. by KFury · · Score: 1

      "Ah, Zen! I *love* Zen!"

      This argument doesn't apply well at all.

      Doctors don't purport to be perfect at healing.
      A police officer catching a criminal doesn't purport to have foreseen all the possible ways the criminal could have been caught, much less be morally corrupt enough to employ them.

      The sock example is the best one: If you find your socks in the wrong drawer, you can put them into the right drawer at that time.

      I'm not saying that these folks could write perfect software the first time out, but if they have a magic bullet for finding code errors, then presumably they could fix them as well.

      You do realize that, unlike laundry and criminology, coding is an iterative process, right?

    5. Re:No cigar, my ass. by Anonymous Coward · · Score: 0
      The doctor analogy: clearest one I thought. Being able to diagnose a flaw doesn't imply you're also capable of fixing it.

      The policeman analogy: software in action compared to criminal action; if the process is not acted out perfectly, one can spot bugs/criminals. Spotting the criminal might be a whole other line of business than doing the crime.

      You got the mislaid socks analogy.

  26. BSD codestyle... by BigBadDude · · Score: 3, Funny


    The defect density of the Apache code inspected was 0.53 per thousand lines of source code...


    We can bring this number down to 0.2 by avoiding the BSD style guidelines. No kidding, have you seen the density of MFC code?

    BSD code:

    char*
    foo(int bar, double baz)
    {

    /* do something */
    return bar + random();

    }



    MS code:

    char* Foo(int nBar, double dBaz) { return bar + random() + m_ExtraWindowsBugModifier(); }

    1. Re:BSD codestyle... by Anonymous Coward · · Score: 1, Interesting

      have you seen the kernel code?

      about half of it is "comments", that are really arguments/fights between Linus, Cox and Russell.
      (no kidding, if you ever read the kernel mailinglist you should already know this).

      Did they include comments in the test?

    2. Re:BSD codestyle... by poot_rootbeer · · Score: 1


      Changing the source formatting to reduce the number of lines of code would INCREASE the defect density. The numerator stays the same, the denominator decreases... the solved expression increases.

      No WONDER Microsoft code has such a high bug/line ratio!
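
      A sketch of why formatting moves the number: the same function laid out two ways has a different physical line count, so a fixed defect count yields a different density. count_lines is a hypothetical helper that counts non-blank lines only; whether the report counted blank or comment lines is not stated.

```c
#include <stddef.h>

/* Count lines that contain at least one non-whitespace character. */
size_t count_lines(const char *src)
{
    size_t lines = 0;
    int seen_text = 0;
    for (; *src != '\0'; src++) {
        if (*src == '\n') {
            lines += seen_text;
            seen_text = 0;
        } else if (*src != ' ' && *src != '\t' && *src != '\r') {
            seen_text = 1;
        }
    }
    return lines + seen_text;  /* last line may lack a trailing newline */
}
```

      Fed the spread-out layout from the parent post it counts six lines, and fed the one-line layout it counts one, so a single defect goes from a sixth of a defect per line to one per line.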

    3. Re:BSD codestyle... by Anonymous Coward · · Score: 0

      Yeah, I knew there were bugs in the MS code. I didn't know that they came from changing BSD code's variables to hungarian notation.

  27. This is an ad for their software by Sikmaz · · Score: 2, Insightful

    This looks like it was just an ad/demo of their code testing software.

    I am trying to get the main analysis downloaded now, but they must have been prepared for a slashdot posting ;)

  28. code defects? by Anonymous Coward · · Score: 1, Insightful

    I see the point in automatically checking the
    source code for common programming errors,
    but how can such a system ever find semantic
    errors, such as complicated protocol handling
    issues?
    It seems to me that those just happen to be
    strong points of open source software.

  29. Does it matter? by pubjames · · Score: 5, Interesting


    So?

    There are errors and there are errors. There are errors that don't matter a jot, and there are errors that are show-stoppers.

    I've worked on banking software containing code that was written in assembly for PDP-11s and developed over decades. The most horrible spaghetti code you could ever imagine. Why did the banks keep using it? Because for any particular input it always gave the correct output.

    Years of bug fixing had made the code horrible and probably full of errors if you were looking at it from a purely theoretical/software engineering viewpoint. But from an input/output point of view, it was faultless.

  30. Re:Code defects appear to be a small part of the e by Anonymous Coward · · Score: 1, Insightful

    For the same reason that windows boxes get hacked more often. The more a platform is used the more attacks on it.

  31. THIS IS FUD! by Anonymous Coward · · Score: 0

    This comparison is totally fishy!

  32. Apostrophe abuse by worst_name_ever · · Score: 1, Funny
    putting it ahead of many commercial implementations for it's low error density

    ...but behind the Slashdot editors in terms of number of abuses of the word "its" per story.

    --

    In Soviet Rush, today's Tom Sawyer gets high on you.
  33. Re:FACT: 3 is a larger number than 2 by frankthechicken · · Score: 2, Insightful

    Completely and utterly agree, I mean hell, I could write fifty thousand lines of code, each line completely and utterly with no meaning, run it through the checker and produce 0 defects, except for one overall defective piece of software. Does this article have any point whatsoever to it at all, I mean, even if the results had any meaning, what on earth is the point of comparing a known to an unknown ?

  34. Correction: 58,000 lines of code by KFury · · Score: 1

    Still, mitigate that with the pre-release status of Apache 2.1 and it cancels out.

  35. what is a "software error"? by siskbc · · Score: 5, Insightful
    If Apache and, say, IIS are roughly equivalent in terms of code defects, you have to ask yourself "well, why does IIS have so many more general problems and security flaws than Apache, when they both carry the same general amount of coding defects?". Is IIS just inherently insecure because it is used on a Windows platform? Is it because hackers generally target IIS and not Apache (most people will rush to this conclusion)?

    First, are all of IIS's issues "software errors" per se? I'm wondering if all security problems would have been caught, or if that was really the goal of the analysis. Perhaps it was, but I'm not sure. One could contest that IIS has a lot of things unprotected, but that this doesn't constitute a software error.

    And as you say, severity would be another issue. It's always been typical open-source style to get the mission-critical parts hardened against nuclear attack, but leaving the other bits a tad soft. I wouldn't be surprised to learn that was the case with apache.

    One thing I want to know - did MS (or whoever) give these guys source or were they analyzing the binaries?

    --

    -Looking for a job as a materials chemist or multivariat

    1. Re:what is a "software error"? by Q2Serpent · · Score: 2, Informative

      Obviously they had source code access. That's the way Reasoning works - their program reads in and parses the source code, generates a parse tree, and then analyzes that. That's why it's called "static analysis" - no binaries, runtimes, or test cases are needed, and errors can even be found in code that is never exercised.

    2. Re:what is a "software error"? by Tony-A · · Score: 4, Insightful

      It's always been typical open-source style to get the mission-critical parts hardened against nuclear attack, but leaving the other bits a tad soft.

      IMNSHO, that ought to be standard for any mission-critical software. Bugs and the places that bugs live in are not created equal. The beauty of Apache (at least 1.13) is that the overall system can be very robust and reliable with rather buggy modules. I suspect the problem with IIS is that everything assumes everything else is perfect, which overall doesn't quite work so well.

    3. Re:what is a "software error"? by gregmac · · Score: 1
      I'm wondering if all security problems would have been caught, or if that was really the goal of the analysis.

      It would seem to me that security problems caused by unchecked buffers, and other such coding problems would be caught. But it's possible to write 100% solid code that has flawed logic that leads to a security hole.

      Remember, computers do exactly what you tell them. If you tell them to do something wrong (even if your code is perfect), they'll do it perfectly wrong every time. :)

      --
      Speak before you think
    4. Re:what is a "software error"? by beta21 · · Score: 2, Funny

      These acronyms sometimes get me IMNSHO?

      I Am Not A Single Horny Octopus?

    5. Re:what is a "software error"? by Anonymous Coward · · Score: 1, Informative

      IMNSHO: In My Not-So Humble Opinion... ie. a 'nice' way of saying 'I'm smarter than you...'

    6. Re:what is a "software error"? by Anonymous Coward · · Score: 0
      But it's possible to write 100% solid code that has flawed logic that leads to a security hole.

      Hmm, then it's not very solid code, is it?

  36. That's so weird ... by SuperDuG · · Score: 3, Interesting
    I found just the opposite.

    Important Tech City, CA, July 7th 2003
    For Immediate Release
    Sbj: Apache beats other webservers

    Recently we had our staff (some guy's kid) look over the source code of 3 major webserver packages available, in that code nearly 8 million lines of error were found, but surprisingly the damned things still worked?!

    We placed a performance test (click a link and see if porn comes faster) with Apache and 3 other commercial offerings. Apache seemed to knock them all out of the water, boy will those other three companies be mad now.

    While we cannot tell you what the other three offerings were (that might make this whole thing more believable) we can tell you that we think they're popular.

    Here's the results

    Apache ------------------- 104
    Com 1 --------32
    Com 2 -----------45
    Com 3 ---------------53

    As you can see by the clear test results, apache wins in all tests.

    Since when are unfounded results from a company that doesn't explain what the "32 defects" were newsworthy? Don't act like these guys are worth my time, this is bullshit.

    --
    Ignore the "p2p is theft" trolls, they're just uninformed
    1. Re:That's so weird ... by leuk_he · · Score: 1

      company that doesn't explain what the "32 defects"

      The defects are "lint"-like errors. You can view a report about this. I don't think you can trigger most of the errors. In the first pages I looked at, they were null pointer dereferences under some condition:
      if value a is true and value b is false, under low memory conditions. In other words, real bugs, but not the kind of bugs you would see on an error report.
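
      A defect of that kind can be sketched as follows. This is a hypothetical illustration, not code from Apache or from the Reasoning report; `make_greeting` and its behaviour are invented for the example.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: returns a heap copy of "hello, <name>",
 * or NULL if name is NULL or the allocation fails.
 * The caller frees the result. */
char *make_greeting(const char *name)
{
    static const char prefix[] = "hello, ";
    char *buf;

    if (name == NULL)
        return NULL;

    /* sizeof prefix already counts the terminating '\0'. */
    buf = malloc(sizeof prefix + strlen(name));

    /* The lint-style defect would be writing through buf right here.
     * malloc() returns NULL when memory is low, so this check is what
     * keeps that rare path from becoming a NULL dereference. */
    if (buf == NULL)
        return NULL;

    strcpy(buf, prefix);
    strcat(buf, name);
    return buf;
}
```

      Drop the second check and the function still passes every ordinary test, which is why such bugs rarely show up on an error report.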

      Conclusion:
      1. They are selling a tool.
      2. The number of errors per 1000 lines is an indication of the quality of the code. Not the quality of the design, not the quality of the concept. Nothing else.

      The result you found is the quality of the design.

    2. Re:That's so weird ... by Anonymous Coward · · Score: 0

      Hehe... It sounds like reading their article isn't worth your time. So why would bitching about an article that you never read be worth your time, hummmm????

      Go read the bloody document and stop spreading FUD.

  37. Re:Code defects appear to be a small part of the e by phre4k · · Score: 4, Informative

    Pretty lame when we are talking server software where Apache has the lead. (Apache 63% vs IIS 25%, per netcraft.com)

    /Esben

    --
    "Nobody really checks their email any more. They just delete their spam"
  38. Dubious by cca93014 · · Score: 4, Insightful

    Is it just me that finds this entire concept of "code defects per 000 lines" sounding like a little bullshit?

    If the company has developed proprietary tools to enable them to identify defects in medium-sized software projects, which of the following business models do you think is more effective:

    1. Design proprietary tools to identify defects in medium-sized software projects.
    2. Fix defects
    3. Profit

    or

    1. Design proprietary tools to identify defects in medium-sized software projects.
    2. Sit around mumbling about defects, Open Source software, closed source software and why farting in the bath smells worse
    3. ???
    4. Profit

    Secondly, where on earth did they get hold of a closed source enterprise level (which Apache undoubtedly is) web server software codebase?

    "Hi, is that BEA? Do you mind if we take a copy of your entire code base so that we can peer review it against Apache's? What's that? Yes, Apache might come out on top, and we will make the results public..."

    How do they define a defect anyway? A memory leak? A missing overflow check? A tab instead of 4 spaces?

    It just sounds like bullshit to me...

    1. Re:Dubious by TrekkieGod · · Score: 1
      Well...they said what "defects" they found, and they put up the snippets of code. That means that the apache people can now fix it for themselves.

      Of course, NULL pointer references and uninitialized variables are easy enough to detect and fix, but I think the point is to measure the carelessness of the programmer. If a programmer makes these kinds of mistakes, he's bound to make worse ones.

      --

      Warning: Opinions known to be heavily biased.

    2. Re:Dubious by nadam · · Score: 1

      Of course it is bullshit. It's a press release. It's marketing. They wanted people to know who they are and they succeeded in doing so.

      If you would have read the press release you would have noticed their business model by the end of it.

      Reasoning Inc. is the leading provider of automated software inspection services that help development organizations reduce the time and cost involved in finding software defects.

      To "Sit around mumbling about defects, Open Source software, closed source software..." is just part of their marketing. I guess it worked well with the TCP/IP stack study in February, so they decided to do it again.

  39. Different standards? by NotClever · · Score: 5, Insightful
    When the same group said that the IP stack in Linux was cleaner than a comparable one, everyone was screaming from the rooftops that it validated the open source model. When they say that an open source project and a closed source project are roughly comparable, all of a sudden everyone criticizes the methodology of the report!

    --
    Hell, there are no rules here. We're trying to accomplish something. - Thomas Edison
    1. Re:Different standards? by mofochickamo · · Score: 1
      ...all of a sudden everyone criticizes the methodology of the report!

      I have seen 3 main criticisms:

      • Bad Statistics
      • They used an Apache development version against a released commercial version
      • The severity of the defects in Apache are not as bad as the severity of the defects in commercial web servers.

      I agree with you that those who say these are "Bad Statistics" have different standards if they thought the TCP/IP report was great. But the other two reasons are valid. Testing a development version of Apache against a commercial release version (if it was actually a release version of whatever commercial web server they used) is not a fair test. Also, the severity of the bug is very important. If you have one bug on Apache that will cause the server to crash and one bug on IIS that will allow you to execute commands on the web server then obviously the two bugs are not equal. It would be more informative if there were different weights given according to bug severity (though it would be difficult to come up with a weight objectively).
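
      A severity-weighted density along those lines might look like the sketch below. The severity classes and the weights are made up purely for illustration (choosing them objectively is exactly the hard part), and `weighted_density` is a hypothetical name.

```c
/* Hypothetical severity classes and weights; the numbers are
 * invented for illustration and come from no report. */
enum severity { SEV_COSMETIC, SEV_CRASH, SEV_REMOTE_EXEC, SEV_COUNT };

static const double sev_weight[SEV_COUNT] = { 0.1, 1.0, 10.0 };

/* Weighted defects per thousand lines: counts[s] is the number
 * of defects found with severity s. */
double weighted_density(const int counts[SEV_COUNT], int lines)
{
    double total = 0.0;
    int s;

    for (s = 0; s < SEV_COUNT; s++)
        total += sev_weight[s] * counts[s];
    return 1000.0 * total / lines;
}
```

      Under these assumed weights, one remote-execution bug counts ten times as much as one crash bug in the same amount of code.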

      --
      Honk if you're horny.
    2. Re:Different standards? by fgb · · Score: 1

      Actually, when the IP story came out a bunch of people said "open source is great" and another bunch of people said "it doesn't mean anything".

      Now that this story is out a bunch of people are saying "open source is no better than closed source" and another bunch of people are saying "it doesn't mean anything".

      The only difference is the posts you are referencing to make your "slashdotters have different standards" post.

    3. Re:Different standards? by defile · · Score: 1

      The statistical validity of these tests has been suspect since even the first one which said the Linux TCP/IP stack was of higher quality. See the previous Slashdot article.

      They would not identify which software lost in the last test, so it was hard for anyone to take a defensive position in favor of the software that open source allegedly beat on quality. No one could have stepped up to call the test BS if they didn't know what it was being compared to. What makes this case different is that they criticised Apache, something we're all familiar with.

  40. automatically detected defects exclude security by brlewis · · Score: 5, Insightful

    Another post seems to indicate this was done via software to automatically detect defects. Many (most?) security defects cannot be detected automatically, as they involve using the software in an unintended way.

    1. Re:automatically detected defects exclude security by Alsee · · Score: 1

      One of the best aspects of automated checking is that it doesn't know or care how you intended the code to work.

      It certainly can't catch everything, but it can be real good for picking up unintended paths in the code.

      -

      --
      - - You can't take something off the Internet! That's like trying to take pee out of a swimming pool.
  41. Re:FACT: 3 is a larger number than 2 by rootofevil · · Score: 1, Funny

    Turing says no.

    --
    turn up the jukebox and tell me a lie
  42. The article doesn't say that at all. by Chuck+Chunder · · Score: 1

    It says the results are "objective and comparable across software applications, development methodologies, and coding styles".

    --
    Boffoonery - downloadable Comedy Benefit for Bletchley Park
  43. Defect Density Indeed! by Anonymous Coward · · Score: 0

    You compare 50k lines of semi-optimized beta code against 1000k lines of a "commercial" product. This is why statistics are the best liars. What does it take to get the "commercial" product to lower the contamination in parts per thousand?

    Bloat.

    Add a few more comments as unessential error checking, hell, add DRM to check to see if you are hosting the latest Eminem MP3s. That should do it.

    If anything is defective or dense, it's the people who came up with the statistics for the sake of PR.

  44. If Apache is so poor in quality... by tsetem · · Score: 4, Funny

    ...then why is it their webserver? :)

    Of course it is Apache 1.3.23...

    1. Re:If Apache is so poor in quality... by cant_get_a_good_nick · · Score: 1

      Older version too. 1.3 series is up to 1.3.27. 1.3.24 (the next version) came out in Mar 2002, so they're at least a year out of date.

  45. So the error level in pre-release Apache ... by burgburgburg · · Score: 4, Insightful

    is equivalent to the error level in post-release commercial web serving software. Sounds like an endorsement to me.

    1. Re:So the error level in pre-release Apache ... by Kynde · · Score: 4, Insightful

      is equivalent to the error level in post-release commercial web serving software. Sounds like an endorsement to me.

      That, too, but I'm damn certain that they must have tried it on recent stable 2.0.46ish release as well. The question is, why weren't those results made public?

      I'm guessing it's because the results were something that would've placed their "defect detection sw" into bad light. I.e. nothing as fancy as the aforementioned "use of uninitialized variable" and "dereference of a NULL pointer" (which strikes really odd to me in the first place).

      Naturally the other explanation is endorsement. It would be so much not-the-first-time that I don't even bother... but I wouldn't bet that this is the case here, because the defect counts were only compared to production release code averages (which strikes me as the other extremely dubious part of this whole "experiment").

      --
      1 Earth is warming, 2 It's us, 3 it's royally bad, 4 we need to take action NOW
    2. Re:So the error level in pre-release Apache ... by yaphadam097 · · Score: 3, Insightful
      I've worked on open source projects and I've also worked in commercial development shops. I think that their findings are accurate but misleading:
      1. In my experience there are generally fewer bugs in pre-release code on a commercial project because there is a stronger culture of code ownership, and most if not all code is independently reviewed before being committed.
      2. There are generally a high number of defects in pre-release open source code, because developers commit early and commit often. Independent review happens more often in open source projects, but it usually happens after the code has already been committed to the dev branch (Before that, the geographically dispersed dev team has no access to it.)
      3. The quality of code released to production in a commercial environment is usually very similar to the quality of code in the development branch. Once it is reviewed and committed it enters a QA cycle where an independent team tries to find any bugs. At this point there is invariably strong pressure to release. So, bug fixes happen quickly and quality suffers (I've always found it ironic that we called this "Quality Assurance.")
      4. Once an open source project has been completed (Meaning all of the features have been developed) it enters a much longer period of code review, bug hunting, and alpha release. For a project like Apache it was over a year before anyone started to use 2.0 in production. Most commercial companies can't afford nearly that much "QA" time, because they are spending money to make money.
  46. Bad Statistics... by FunkZombie · · Score: 5, Insightful

    Also keep in mind that defect density is just an average. If you have 31 defects in 60k lines of code, that is potentially 31 security risks, or out-of-operation risks. If the other software tested had double the lines of code (120k), the density would imply that they had slightly less than double the defects, so say 58 or 60. That implies _58_ potential security or uptime risks. In this case, imho, defect density is not a good indicator of the reliability of the software.

    My general rule is that if someone is quoting statistics to you, they are lying. At least on average. :)

    1. Re:Bad Statistics... by Lxy · · Score: 4, Funny

      My general rule is that if someone is quoting statistics to you, they are lying. At least on average. :)

      39% of Slashdot readers already know that.

      --

      There is no reasonable defense against an idiot with an agenda
      :wq
    2. Re:Bad Statistics... by NeoNormal · · Score: 1

      >"39% of Slashdot readers already know that."

      And now, thanks to your post, that number has soared to 41%!... good work!

    3. Re:Bad Statistics... by Anonymous Coward · · Score: 0

      Technically what you mean is that it is a sample mean, not population mean.

      Given a sample size of 58,944 and a sum of 31:

      sample mean = 31/58944 = 0.000526
      sample standard deviation (fuzzy) = sqrt((58944*31-31*31)/(58944*58943)) = 0.022927

      With 90% confidence, the population mean is between .000526+/-1.65*.022927/sqrt(58944)
      = between 0.000370 and 0.000682 defects/line.

      So in 120kloc, you'd expect 44 to 82 defects.

      which is all pretty meaningless, but I'm taking a statistics class so it's good practice :)
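
      The same arithmetic as a C sketch, for anyone who wants to plug in other numbers. It assumes every line is an independent 0/1 "has a defect" trial, which is a strong simplification. The names `defect_rate_ci90` and `newton_sqrt` are made up here, and the square root is a small Newton iteration only so that the sketch needs no math library.

```c
/* Lower and upper bound of a confidence interval, in defects per line. */
struct interval { double lo, hi; };

/* Square root by Newton's iteration, to keep the sketch free of libm. */
static double newton_sqrt(double x)
{
    double r = x > 1.0 ? x : 1.0;
    int i;

    for (i = 0; i < 60; i++)
        r = 0.5 * (r + x / r);
    return r;
}

/* 90% confidence interval for the per-line defect rate. */
struct interval defect_rate_ci90(int defects, int lines)
{
    double n = lines;
    double p = defects / n;                          /* sample mean */
    /* Sample standard deviation of 0/1 data: sqrt(p(1-p) * n/(n-1)). */
    double sd = newton_sqrt(p * (1.0 - p) * n / (n - 1.0));
    double half = 1.65 * sd / newton_sqrt(n);        /* 1.65 as in the post */
    struct interval ci;

    ci.lo = p - half;
    ci.hi = p + half;
    return ci;
}
```

      For 31 defects in 58,944 lines this gives roughly 0.00037 to 0.00068 defects per line, or about 44 to 82 defects in 120,000 lines.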

    4. Re:Bad Statistics... by Alsee · · Score: 1

      39% of Slashdot readers already know that.

      And 71% don't.

      (Yes, I did that intentionally)

      -

      --
      - - You can't take something off the Internet! That's like trying to take pee out of a swimming pool.
  47. to be expected from Open Source by Illserve · · Score: 3, Interesting

    By its very nature, Open source will tend to fix important bugs and leave unimportant ones unfixed, while standard QA processes associated with commercial software will tend to fix little UI issues during the release schedule before dealing with vulnerabilities.

    So seems pretty clear to me that in Open source, the ratio of showstopper bugs to miscolored widget bugs will be much lower than for commercial software.

    1. Re:to be expected from Open Source by Ymerej · · Score: 1

      By its very nature, Open source will tend to fix important bugs and leave unimportant ones unfixed, while standard QA processes associated with commercial software will tend to fix little UI issues during the release schedule before dealing with vulnerabilities.

      As one who works in QA, I take exception to this assertion. It can happen that some undisciplined individual developers may find it more attractive to pick the low hanging fruit of miscolored widget-type bugs, but in my experience, standard QA and configuration management processes associated with commercial software definitely do take into account the severity and priority of bugs when deciding which ones to tackle.

    2. Re:to be expected from Open Source by Daniel_Staal · · Score: 2, Insightful

      I don't think the poster meant to dis commercial QA work: he was instead of the opinion that commercial software will value the widgets and so on more than open source does.

      That is: he is sure that *both* processes take into account severity and priority of bugs. The poster just felt that their priorities were different. (Polish being more important for commercial code, absolute correctness for open source. The question of the 'correct' balance is left up to the reader.)

      --
      'Sensible' is a curse word.
  48. FACT: Reading is Good by Cancel · · Score: 5, Informative
    That's not what they're saying at all. In fact, Reasoning concluded that there was no statistically significant difference in 'defect density' between Apache and the unnamed commercial product.
    "In our February study that compared the defect density of the Linux TCP/IP stack to the average defect density of commercially developed TCP/IP stacks, we concluded that Open Source had a significantly lower defect density compared to commercial equivalents," said Bill Payne, President & CEO of Reasoning. "We received numerous inquiries about that study and took seriously requests for us to examine defect density rates in a less mature Open Source application and compare it with the commercial equivalent. Taking advantage of our database of automated software code inspection projects, we were able to do exactly that, and found the difference in defect density between the two was not significant." (emphasis mine)
  49. Actually the article suggests apache is better by sterno · · Score: 4, Insightful

    This doesn't indicate that the commercial equivalents are better. You've got the DEVELOPMENT branch of Apache, which is derived from the 2.0.x code which is a complete rework from the original 1.X branch of code. So it's a rather new code base and it's showing similar defect rates to a code base that has been around for a while. I'd say this proves that open source is better.

    --
    This sig has been temporarily disconnected or is no longer in service
    1. Re:Actually the article suggests apache is better by saskwach · · Score: 0

      It still proves nothing, as the "defects" in question were things like null pointer issues and other compiler level problems. This software they're selling doesn't do squat to check for logic errors or any hard-to-find problems. Seems to me like they just made a knockoff of GCC with -Wall.

  50. Confuse with Linux? by yerricde · · Score: 1

    This is a development version, it's an odd numbered release for crying out loud.

    You refer to the version numbering rules used by the developers of the Linux kernel. Does Apache follow the numbering scheme of Linux?

    --
    Will I retire or break 10K?
    1. Re:Confuse with Linux? by bofkentucky · · Score: 2, Informative

      Now they do: 2.0.x are stable, production releases; 2.1.x are testing branches.

      --
      09f911029d74e35bd84156c5635688c0
  51. Re:FACT: 3 is a larger number than 2 by StrawberryFrog · · Score: 1

    Hmm, so they looked at 58,944 lines of code, and found 31 defects? Did they find every defect? Can they prove this?

    Proving program correctness and bugfreeness is real hard. If they did find every defect and they can prove it, then I suspect that it would be a significant breakthrough in Computer Science, not to mention a commercial goldmine.

    As you can imagine, I am a bit sceptical.

    --

    My Karma: ran over your Dogma
    StrawberryFrog

  52. Recursion by sterno · · Score: 2, Funny

    They didn't do that because if they did that, then they'd find bugs in their bug finder, so they'd have to run the bug finder on the bug finder to find bugs there, but then they'd have to run the bug finder on the bug finder on the...

    --
    This sig has been temporarily disconnected or is no longer in service
    1. Re:Recursion by fgb · · Score: 4, Funny

      That reminds me of an old (early 1980's) product named BILF (Basic Infinite Loop Finder). It was supposed to be run against BASIC source code and it would find all infinite loops in the code, or so the vendor claimed.
      A magazine reviewed the product. In their review they included a formal mathematical proof that such a program could never work. The vendor responded to the proof by saying that they would fix that problem in the next release!

    2. Re:Recursion by nick255 · · Score: 2, Interesting

      Yes the proof is quite a simple application of the famous halting problem proof.

      Imagine you made the program go into an infinite loop whenever the program it was analysing did not have an infinite loop.

      Then run the program on itself......

    3. Re:Recursion by Anonymous Coward · · Score: 0

      Why couldn't it ever work? All you have to do is build a table of all the variables the loop condition depends on (and the variables they depend on, and so on). Then as you execute the loop, you put the values of each of the dependent variables into the table. You proceed until either the loop ends (in which case it's obviously noninfinite) or until you detect that all the dependent variables in one loop instance are exactly the same as the dependent variables from a previous loop instance. Obviously, this is depending on the fact you are using fixed size numbers which guarantees that an infinite loop will traverse over at least one set of dependent variables more than once. But seeing how computers are finite machines anyway.....

    4. Re:Recursion by Anonymous Coward · · Score: 0

      Err, I meant "finite size" rather than "fixed size."

    5. Re:Recursion by Anonymous Coward · · Score: 1, Interesting
      Why couldn't it ever work? All you have to do is build a table of all the variables the loop condition depends on (and the variables they depend on, and so on). Then as you execute the loop, you put the values of each of the dependent variables into the table. You proceed until either the loop ends (in which case it's obviously noninfinite) or until you detect that all the dependent variables in one loop instance are exactly the same as the dependent variables from a previous loop instance. Obviously, this is depending on the fact you are using fixed size numbers which guarantees that an infinite loop will traverse over at least one set of dependent variables more than once. But seeing how computers are finite machines anyway.....
      What happens if you get something like this:

      int k=0;
      int i=0;
      while( i < 20 || k < 365 )
      {
      k++;
      }
      As in you forgot to put
      i++;
      in the loop.

      The snapshot of every loop's dependent variables is different, but it's still an infinite loop because 'i' never increases. Keep in mind this is just a counter example, and of course you could modify your idea to make it work in this case. However, somebody has formally proven that you can't make an infinite loop detector. IIRC, the book "Gödel, Escher, Bach" has some interesting stuff on this and other issues with AI.
    6. Re:Recursion by Piquan · · Score: 1

      Do you have any references for this event? I'd like to find out more. (References to the halting problem by itself are not necessary; just this event.)

    7. Re:Recursion by Anonymous Coward · · Score: 0

      Actually, it is solvable. Computers are LBAs (Linear Bounded Automata), a restricted type of Turing machine. Because memory is linearly bounded, you have a finite (and actually factorial) bound on possible valid executions without there being an infinite loop. Realistically, though, you'd never want to run such a test on even (640K/4)! code, let alone what's really possible given the HD space to swap out to (as a side note, one could argue that two removable media slots would allow for infinite storage through two removable media devices constantly being interchanged with new media, but then you run into the physical limits of reality..and that's probably finite).

    8. Re:Recursion by Anonymous Coward · · Score: 0

      Actually it is trivial to prove that one can construct a program that runs on a finite real machine (finite storage) in finite time that will decide if an arbitrary program running on a non-networked non-random deterministic machine (finite storage) will halt or not.

      It is so blazingly obvious, that I won't bother to state the solution here. Hint: while (1) if (++state_table[current_state] > 1) { printf("Infinite loop\n"); break; } else current_state = next(current_state);
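
      Spelled out a little more, the finite-state argument looks like the sketch below. It is a toy under stated assumptions: the whole machine state fits in one small integer, `NUM_STATES`, `HALT`, `halts`, `count_up` and `spin` are all made-up names, and a real machine would need a table with one entry per possible memory image, which is where the idea stops being practical. It does not contradict the halting problem, which concerns machines with unbounded memory.

```c
#include <string.h>

#define NUM_STATES 1024   /* hypothetical size of the toy state space */
#define HALT (-1)

/* Returns 1 if the machine halts when started in `start`, 0 if it
 * loops forever. `next` must be deterministic and must return HALT
 * or a value in [0, NUM_STATES). */
int halts(int start, int (*next)(int))
{
    unsigned char seen[NUM_STATES];
    int state = start;

    memset(seen, 0, sizeof seen);
    while (state != HALT) {
        if (seen[state])
            return 0;          /* same complete state twice: infinite loop */
        seen[state] = 1;
        state = next(state);
    }
    return 1;
}

/* Two toy machines: one counts up and stops, one cycles through 0..7. */
int count_up(int s) { return s + 1 < NUM_STATES ? s + 1 : HALT; }
int spin(int s)     { return (s + 1) % 8; }
```

      Here `halts(0, count_up)` returns 1 and `halts(0, spin)` returns 0. The key is that the table records the complete state, not just the variables named in the loop condition.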

  53. Wrong Math by bstadil · · Score: 4, Insightful
    You got the math reversed

    The longer and more content you have per line the higher the likelihood of error per line.

    As example with one error in 100 lines you get 1% error. Imagine you could do the whole thing in one line. Now you have 100% error.

    --
    Help fight continental drift.
    1. Re:Wrong Math by BigBadDude · · Score: 2, Informative

      yeah, that was actually my point. nice someone got it :)

      The source of most free software [KDE is an exception] tends to be smaller, more readable and more effective. Ever wondered why winword.exe is 10.598.984 bytes?

    2. Re:Wrong Math by be-fan · · Score: 1

      You'd be surprised how small KDE is. As of KDE 3.1, the KDE CVS only contains about 2.6 million lines of code. Given that the KDE CVS includes everything from a toolkit to a web-browser to an Office suite, 2.6 million lines is quite a bargain.

      --
      A deep unwavering belief is a sure sign you're missing something...
    3. Re:Wrong Math by Alsee · · Score: 1

      Imagine you could do the whole thing in one line. Now you have 100% error.

      Oh my god! You mean the entire Windows operating system is a "one-liner"?

      -

      --
      - - You can't take something off the Internet! That's like trying to take pee out of a swimming pool.
  54. Magic software by pubjames · · Score: 1


    So if they can write software to automatically spot coding errors, then it must be possible for them to automatically fix them, no?

    1. Re:Magic software by Eustace+Tilley · · Score: 2, Insightful
      Ok, pretend you are the magic software and you see this code:
      int ar[50];
      for (int i = 0; i<=50; i++) { ar[i] = 1;}
      How are you going to "automatically" fix that? Change the comparison operator? Change the array size? Replace the loop with a library function?
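To make the ambiguity concrete, here is a sketch of two of the candidate repairs mentioned above. Both compile and both remove the out-of-bounds write, but they encode different intents (50 elements versus 51). The function names are invented for this example.

```c
#include <stddef.h>

#define N 50

/* Repair A: the array size was right, the comparison was wrong.
 * Writes elements 0..49 of a 50-element array. */
void fill_exclusive(int ar[N])
{
    for (size_t i = 0; i < N; i++)
        ar[i] = 1;
}

/* Repair B: the loop bound was right, the array was one element short.
 * Writes elements 0..50 of a 51-element array. */
void fill_inclusive(int ar[N + 1])
{
    for (size_t i = 0; i <= N; i++)
        ar[i] = 1;
}
```

A tool can flag the original loop without knowing which of these the author meant; choosing between them needs the surrounding code or the author.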

      "Fixing" requires understanding the code's intent.
    2. Re:Magic software by pubjames · · Score: 1

      How are you going to "automatically" fix that?

      I was being sarcastic in the original post.

      "Fixing" requires understanding the code's intent.

      You've hit the nail on the head. Personally I think spotting errors also requires understanding the code's intent. That's why I was being sarcastic about them having "magic" error spotting software.

    3. Re:Magic software by Eustace+Tilley · · Score: 1
      Do you think spotting the error in ...
      int *p;
      *p = 5;
      ... requires understanding the code's intent?
    4. Re:Magic software by Anonymous Coward · · Score: 0

      No, but, again, to repeat, FIXING it does. Anyhow, how can you be sure that's an error in the first place? Even finding errors requires context.

  55. In other news... I have begun testing by teamhasnoi · · Score: 4, Funny
    Apache 4.2 Alpha, a release that is yet to be even a twinkle in its daddies' eyes. I have found a whole bunch of errors, bad comments, a few scribbles on napkins, some old Populous save games, and a letter to 'Mom' asking for money.

    I compared this to my 'other' server, for now unnamed.

    My 'other' server brought me coffee, 2 pieces toast, 2 eggs OVER EASY, 4 strips of bacon, *and* Smucker's Grape Jelly with nary a misstep, or hesitation. This other server smiled, asked how my wife was, and brought me a new fork when I dropped my first one.

    Congratulations, Gloria! You win the 'great server' award!

    This article isn't worth the 2 dollar tip.

  56. Dupe by Sardonis · · Score: 1
  57. scope by thoolihan · · Score: 0, Redundant

    I wonder what scope of errors they are looking at? For instance, are they counting assignment errors (overflow), IIS->Com higher level type errors, or both.
    -t

    --
    http://unmoldable.com W:"No one of consequence" I:"I must know" W:"Get used to disappointment"
  58. Here's an idea by Daath · · Score: 4, Funny

    Why doesn't Reasoning fill the niche, and code a completely error free web server? They know other people's mistakes, so they should know how to code an error free one.
    Well, seriously, I wouldn't put much in their obvious estimation.

    --
    Any technology distinguishable from magic, is insufficiently advanced.
  59. Don't assume IIS by m00nun1t · · Score: 5, Insightful

    Ok, IIS is the obvious choice as being the second most popular web server after Apache. But I hardly think Microsoft will be letting these guys all over the IIS source code.

    It could also be Zeus, SunOne or one of the other lesser known web servers out there.

    1. Re:Don't assume IIS by BigBadDude · · Score: 1


      I am pretty sure that Zeus is based on Apache code.

  60. Apache 2 is not Apache 1 by defile · · Score: 2, Insightful

    The test may be more interesting if applied to Apache 1. As someone who has had to migrate a mod_perl site from Apache 1 to Apache 2, I can tell you that Apache 2 is a very new beast, and it doesn't shock me at all that there are dozens of bugs that still need to be shaken out. Fewer users are running Apache 2 in a production environment as well, since it's considered a development branch. See less eyeballs rule.

  61. Defect Details by Eustace+Tilley · · Score: 5, Informative
    Interested persons can download the full defect report free of charge.

    Some things I found interesting:
    1. Apache 2.1 (dev) is a mere 76,208 LOC.
    2. No memory leaks detected
    3. 29 NULL pointer dereferences
    4. 2 Uninitialized variables
    5. No bounds errors, no bad deallocs
    6. otherchild.c had a rate of 7 NULL pointer dereferences per 1000 KSLC


    7. One of the explanations (given by Reasoning) for a NULL pointer dereference is "can occur in low memory conditions," which I think means the original allocator did not check for malloc failure.

      So you can get a sense of what a defect looks like, here is #21. The original uses bold and fonts to improve readability, but I don't know how to reproduce that in slashcode:
      DEFECT CLASS: Null Pointer Dereference

      DEFECT ID 21

      LOCATION: httpd-2.1/srclib/apr/misc/unix/otherchild.c : 137

      DESCRIPTION The local pointer variable cur, declared on line 126, and assigned on line 128, may
      be NULL where it is dereferenced on line 137.
      PRECONDITIONS The conditional expression (cur) on line 129 evaluates to false.
      CODE FRAGMENT
      124 APR_DECLARE(void) apr_proc_other_child_unregister(void *data)
      125 {
      126 apr_other_child_rec_t *cur;
      127
      128 cur = other_children;
      129 while (cur) {
      130 if (cur->data == data) {
      131 break;
      132 }
      133 cur = cur->next;
      134 }
      135
      136 /* segfault if this function called with invalid parm */
      137 apr_pool_cleanup_kill(cur->p, cur->data, other_child_cleanup);
      138 other_child_cleanup(data);
      139 }
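For comparison, a self-contained sketch of the same search loop with the flagged path handled. This is not the APR code or its actual fix: `struct child_rec`, `find_child` and `unregister_child` are hypothetical stand-ins, and the cleanup is reduced to clearing a flag.

```c
#include <stddef.h>

/* Hypothetical stand-in for apr_other_child_rec_t. */
struct child_rec {
    void *data;
    int registered;
    struct child_rec *next;
};

/* Returns the record whose data pointer matches, or NULL if there is none. */
struct child_rec *find_child(struct child_rec *head, const void *data)
{
    struct child_rec *cur = head;
    while (cur != NULL && cur->data != data)
        cur = cur->next;
    return cur;
}

/* Returns 0 on success, -1 if `data` was never registered.
 * The NULL check covers the case where the loop runs off the end
 * of the list, which is the path reported in defect 21. */
int unregister_child(struct child_rec *head, void *data)
{
    struct child_rec *cur = find_child(head, data);
    if (cur == NULL)
        return -1;
    cur->registered = 0;   /* placeholder for the real cleanup work */
    return 0;
}
```

Whether returning an error is better than the documented segfault is a design decision for the library; the sketch only shows where the check would sit.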
  62. Defects and maturity of code base by the+eric+conspiracy · · Score: 4, Insightful

    This study makes a lot of sense to me - that the defect rate is tied to the maturity of the code base. I have long felt that Microsoft's business model where they redo the operating system in order to churn their user base and induce cash flow will always result in more defects and security problems than a model where software change is driven on a solely technical basis.

    I think the next step for these folks would be to take a project that has a long history, say perhaps Apache 1.x and show defect rates over the life of the project.

    1. Re:Defects and maturity of code base by Kynde · · Score: 1

      I think the next step for these folks would be to take a project that has a long history, say perhaps Apache 1.x and show defect rates over the life of the project.

      Wonderful idea! I mean, as full of it as this study was, the defect rates would be an interesting read for any long term oss project.

      --
      1 Earth is warming, 2 It's us, 3 it's royally bad, 4 we need to take action NOW
  63. Null dereferences and uninitialized variables by ByTor-2112 · · Score: 2, Informative

    29 possible "null dereferences" and 2 possible "uninitialized variables". Some of them are simple "fail to check return value of malloc() for null", and others are not bugs in the code but bugs in the logic of the scanner. This is, of course, a precursory review of their document. All in all, these are absolutely minor bugs if they are real at all.

    1. Re:Null dereferences and uninitialized variables by ByTor-2112 · · Score: 1

      Replying to my own post...

      Having read the entire report and reviewed the included code, many of the "bugs" are likely not bugs in apache but in their software.

      Without a doubt, this kind of report is one that should NEVER be "published" widely. It is nothing more than what everyone is saying -- a list of suspicious code. Before calling these "defects", it would be absolutely necessary to hand review every single one. This whole report stinks of pure, unsubstantiated FUD.

      On a side note, they are including APACHE code but I do not see the Apache license included anywhere. By my readings of the ASL, they are in violation of the first clause by publishing this report!

  64. Having read the reports.. by David+McBride · · Score: 4, Insightful

    Well, the reports simply state that, in the 360 files they checked (most of them header files) they found 29 cases of a potential NULL pointer dereference and 2 potentially uninitialized variables. This is from the Apache 2.1 codebase as of 31st Jan this year, about 58k lines of code.

    Their automated checker also searched for out-of-bounds array accesses, memory leaks, and bad deallocations. It found none.

    They also state that they ran the same checks against other codebases, and found that they did marginally better, on average.

    In short, this report says that OLD development code for an unreleased opensource project is nearly as good as current commercial offerings. That's at best, when you consider the huge gamut of possible defects that this checker won't pick up. That margin probably disappears in the +/- of the sampling if you were to do a proper statistical analysis.

    The report is fairly useless. It certainly should not be taken as a reason to not trust Apache; to do so would be foolhardy particularly given Apache's track record.

    Oh, and Reasoning's webserver is being pounded into the ground. You can get my local copy of the reports from here.

  65. That was not the conclusion: RTFA by arrogance · · Score: 2, Interesting

    As others have stated, the article states that "the difference in defect density between the two was not significant." Meaning that defect density, especially with such a small differential, has little bearing on the overall quality of the software. We know nothing of the severity, impact, etc of the defects: they could all be cosmetic for all we know. This is probably nothing more than a marketing strategy by Reasoning: publish a study without any details on a hotly debated topic and see how many people check out their site. It'd be nice if they had a downloadable version of their software to test drive.

    FxCop is an example of a "defect" or code analysis tool. While I have NO idea of Reasoning's methodology, I know that with FxCop (which is specifically for .NET code analysis), you have to set it up to filter out the majority of its rules or you'll get 3000 instances of "You didn't name this variable the way MS says you're supposed to." FxCop is extensible though. The point is, not a single poster on this page (unless they work for the companies involved) knows what Reasoning's methodology or rule set was when they did this so we can glean virtually zero value from this analysis. I look forward to 600 anti-Microsoft posts because of it though....

  66. WHAA? by Anonymous Coward · · Score: 0

    "for it's low error density"

    for it is low error density?

    TARD.

  67. Re:FACT: 3 is a larger number than 2 by imAck · · Score: 1

    If you read the actual report, it does cite what type of defects they looked for, and what they actually found.

    29 NULL pointer dereferences

    2 Uninitialized variables


    The uninitialized variable is just a -Wall issue, the NULL pointer thing may or may not be serious depending on the context...

    --

    It's hard to tell the cool to chill, my favorite hotel room has a view to an ill.

  68. Re:Code defects appear to be a small part of the e by jdh-22 · · Score: 5, Insightful
    Every hacker on the planet has full access to the code - which means that they can review it and find vulnerabilities in it. Not many people have access to Windows or IIS code.
    To quote Bruce Schneier: "If I had a letter, sealed it in a locked vault and hid the vault somewhere in New York, then told you to read the letter, that's not security, that's obscurity. If I made a letter, sealed it in a vault, gave you the blueprints of the vault, the combinations of 1000 other vaults, access to the best locksmiths in the world, then told you to read the letter, and you still can't, that's security." Open source does have an upper hand on holes and bugs, but the code isn't where we should be looking.

    The majority of the security holes are from the people setting up the web servers. The holes are usually abused by "wanna-be" hackers, or script-kiddies. The problem is that people are not educated enough to run some of these programs. Being able to understand Apache, and how to make it operate correctly is not everyone's top priority. As long as it works, people don't care how it works (as goes for many other things in this world).
    --
    Every Super Villain uses Linux.
  69. sorry, but thats pure BS... by BigBadDude · · Score: 3, Informative


    One of the explanations (given by Reasoning) for a NULL pointer dereference is "can occur in low memory conditions," which I think means the original allocator did not check for malloc failure.


    Apache has its own malloc() that kills the child (and closes the connection) if it fails to allocate enough bytes.

    1. Re:sorry, but thats pure BS... by Eustace+Tilley · · Score: 1

      That's the way to do it, if you oughtn't use an exception-throwing new :-).

      Defect 21 (reproduced) looks like a fair catch, though, since it appears that one could exit the while loop with a NULL cur.

    2. Re:sorry, but thats pure BS... by Eustace+Tilley · · Score: 2, Informative
      Hmm, Defect 10 is a little trickier:
      DEFECT CLASS: Null Pointer Dereference DEFECT ID 10
      LOCATION: httpd-2.1/modules/mappers/mod_negotiation.c : 2495
      DESCRIPTION The local pointer variable arr, declared on line 2349, and assigned on line 2365, may be NULL where it is dereferenced on line 2495. This NULL pointer dereference only happens in an Out Of Memory context.

      PRECONDITIONS The conditional expression (neg->send_alternates && neg->avail_vars->nelts) on
      line 2364 evaluates to true AND
      The function apr_array_make, called on line 2365, returns NULL AND
      The conditional expression (neg->send_alternates && neg->avail_vars->nelts) on
      line 2494 evaluates to true.

      CODE FRAGMENT
      2336 static void set_neg_headers(request_rec *r, negotiation_state *neg,
      2337 int alg_result)
      2338 {
      ...
      2349 apr_array_header_t *arr;
      ...
      2364 if (neg->send_alternates && neg->avail_vars->nelts)
      2365 arr = apr_array_make(r->pool, max_vlist_array, sizeof(char *));
      2366 else
      2367 arr = NULL;
      ...
      2494 if (neg->send_alternates && neg->avail_vars->nelts) {
      2495 arr->nelts--; /* remove last comma */
      2496 apr_table_mergen(hdrs, "Alternates",
      2497 apr_array_pstrcat(r->pool, arr, '\0'));
      2498 }
      2499
      2500 if (neg->is_transparent || vary_by_type || vary_by_language ||
      2501 vary_by_language || vary_by_charset || vary_by_encoding) {
      2502
      2503 apr_table_mergen(hdrs, "Vary", 2 + apr_pstrcat(r->pool,
      2504 neg->is_transparent ? ", negotiate" : "",
      2505 vary_by_type ? ", accept" : "",
      I traced through the code on lxr.webperf.org and it appears that pool_alloc can return NULL.

      Is the idea that this code will never be executed in an out-of-memory condition, because it is only executed by a child, and the child dies automatically on malloc failure?
  70. It's all in how you calculate a defect by sterno · · Score: 3, Insightful

    The thing that always kills IIS is the integration it has with Windows. This isn't a defect in IIS, or Windows, per se, but rather a defect that arises because of how they integrate with each other. A script executes on IIS in a way that's not innately a bug, but then when it interacts with Windows, Exchange, etc, suddenly it becomes one.

    Apache is just a webserver, and that's all. PHP, JSP, etc, are all separate applications treated separately. The integration does make things more efficient, yes, but also more prone to problems.

    --
    This sig has been temporarily disconnected or is no longer in service
  71. Re:Code defects appear to be a small part of the e by AftanGustur · · Score: 2, Interesting


    Is IIS just inherently insecure because it is used on a Windows platform? Is it because hackers generally target IIS and not Apache (most people will rush to this conclusion)?

    Microsoft will try to make people believe whatever is in their interests .. Even if it means contradicting themselves ..

    Last Friday Microsoft called all their Premier customers in France with "information" related to the upcoming "hackerfest" last Sunday.

    According to Microsoft mostly Unix and Linux servers would be the target of the hackers but it did not exclude IIS Web servers to come under attack.

    The FUD coming from MS is absolutely unbelievable..

    --
    echo '[q]sa[ln0=aln80~Psnlbx]16isb572CCB9AE9DB03273snlbxq' |dc
  72. Re:Code defects appear to be a small part of the e by Anonymous Coward · · Score: 0

    Exactly. Mostly you are fixing security-critical bugs, the other ones remain unrecognized and untouched (what is the bug anyway?). It does not matter (mostly) if your original raw code has 1.0 or 0.5 bugs per line. If the security-critical bugs are, say, only 10 percent, and you fix them only in the first case, you still have 0.9 > 0.5, but much more secure code.

  73. Re:FACT: 3 is a larger number than 2 by Anonymous Coward · · Score: 0

    You obviously didn't read the article.

    They stated that the difference in defect density between Apache and typical commercially available web servers was minute. This means there's not a heck of a lot of difference between the two...in source code errors that is.

  74. It's a demonstration of their services by Chuck+Chunder · · Score: 1

    The fact the results might be somewhat useful to the Apache community is a happy bonus.

    --
    Boffoonery - downloadable Comedy Benefit for Bletchley Park
  75. WTF by Anonymous Coward · · Score: 0

    Questioning Apache's superiority by comparing it with inferiority?
    ...by looking for bugs in a development version?
    ...by kissing /.-editor's for spreading this incompetent FUD?

    Man, someone needs to re-define the meaning of "serious journalism"... especially when it comes to reports about open source projects and there is no corp. which can kick the authors ass for bad journalism.

  76. Something is wrong here... by XaXXon · · Score: 2, Insightful

    I have to play the BS card here.

    There is no magic "defect detector" for software. If there was such a thing, they would be making a helluva lot more money than they get for doing little defect tests.

    It is very difficult to prove a program to be correct, and there's a lot of REALLY smart people who have tried.

    Maybe these people have stuff that can look for buffer overflows and stuff, but actually being able to tell if Apache is returning the correct results requires far more than generic tests.

    And I'll all but guarantee they didn't get together an entire development team to understand the code base and how it works as apache is a very large and complex code base.

    Maybe they take what they find from their generic tests and extrapolate that if they find more generic problems there are probably more specialized errors as well, but they make it very clear in the report that the difference between .51 and .53 defects / KLoC (thousand lines of code) is statistical noise.

    Anyways, I'm not saying the entire thing is worthless, just not to read too much into it -- either this one that puts Apache slightly behind some unnamed commercial implementation or the one that put the Linux TCP/IP stack ahead of some other commercial implementation (though I'd say it would probably be easier to test a TCP/IP for correct behaviour than a web server).

  77. Number of Bugs is no measure of Performance by drgroove · · Score: 0, Offtopic

    Analysis of the quantity of bugs in a software application is by no means a qualitative analysis of the performance of that application.

    The predominant httpd servers available on the market today are Apache; iPlanet/SunOne; and IIS. Additionally, there are lesser-known httpd servers (zeus, cern), as well as 'niche' httpd servers (caucho) which typically perform additional functions to parsing HTML code (such as acting as a Java server, etc).

    According to Netcraft, Apache is the #1 httpd server in use today, and has been for nearly 7 years.

    Regardless of the purported 'quality' provided by commercial, closed-source alternatives, the Apache httpd server is the only solution in the marketplace that supports - in a stable, qualitative fashion - a startling variety of additional software to provide functionality to a website.

    A primary example of this bundled flexibility would be the vast number of scripting languages supported by Apache. Java, Perl, PHP, and TCL are all free, stable, and work wonderfully with Apache. This kind of flexibility in application environments is simply unparalleled by the other httpd servers.

    You might say that 'you can run java, perl, php, and tcl on iPlanet or IIS, though'. Sure you can. Have you tried that?

    First, your commercial vendor won't support it - Microsoft will only support you if you're running ASP.NET et al on IIS; Sun will only support you if you're running Java on iPlanet.

    Second, non-supported scripting languages often don't work on non-apache httpd servers. Why? Because the source code for the httpd server isn't available to the scripting language developers - making intelligent integration more difficult - additionally, the major vendors don't test competitive scripting language functionality on their products, meaning that while the writers of PHP, Perl, TCL, etc may offer a version of their product for other httpd servers - Microsoft and Sun aren't testing them on their httpd servers - plus, they aren't guaranteed to work, and often don't. (At my company, we've never been able to get PHP to work correctly under iPlanet - and guess what? Sun doesn't give a shit. Big surprise, huh?).

    Commercial httpd servers may indeed have fewer bugs - but they certainly are not as stable in performance, nor do they support as wide a variety of available software extensions - as Apache.

    I'll gladly take that extra .02 in software bugs over a commercial, proprietary httpd server any day.

  78. Here are the links to the defect reports by arrogance · · Score: 5, Informative
    Defect Report

    Metric Report

    They make you fill out a form that asks for your email and then do an opt out checkbox at the bottom of the form (you have to check it to NOT get spam from them). The site's a bit slashdotted right now though.

    1. Re:Here are the links to the defect reports by Eric+Damron · · Score: 1

      "They make you fill out a form that asks for your email and then do an opt out checkbox at the bottom of the form (you have to check it to NOT get spam from them)."

      Nah... Just put in your boss' email address and tell them to bring it on...

      --
      The race isn't always to the swift... but that's the way to bet!
  79. every program. by leuk_he · · Score: 1, Funny

    The # flaws per leads to:

    -Every program can be at least one line shorter.
    -Every program has at least x bugs per xxx lines.

    Conclusion:

    The ideal program has no lines and no bugs.

    and to prevent any insightful moddings of this post:

    Yes, the design is more important than the quality of the software, ask MS about this.

    1. Re:every program. by Asprin · · Score: 1, Funny


      Except I heard it as:

      Since every program contains at least one bug,
      and further, every program can be reduced by one line.
      Therefore, by induction, every program can be reduced to one line which doesn't work.

      The proof is left as an exercise for the reader.

      --
      "Lawyers are for sucks."
      - Doug McKenzie
    2. Re:every program. by lucas_gonze · · Score: 2, Insightful

      that's not just reductio ad absurdum, it's actually useful. you should always write the least code possible, and since features mean code, you should have as few features as you can get away with.

  80. Netcraft Confirms It... by Jerk+City+Troll · · Score: 1

    Apache isn't dying.

    So whatever these people claim about the quality of Apache is really not useful. For being the most used web server software (a factor of 3 over certain commercial offerings) with continued growth, it suffers from the fewest bugs and is generally the most stable.

    Are we to read this as anything other than FUD?

  81. This is a dupe by presroi · · Score: 2, Informative

    This Slashdot-Posting was featuring the same PR from Reasoning.

  82. please tell me guys... by BigBadDude · · Score: 1

    Tonight (if its not already done) all those 31 "bugs" are removed from the apache CVS tree. Now, who said opensource development is not effective?

    When will the "only 0.51/KLOC" IIS bugs be removed? Next service pack?

  83. lousy metrics by Shalda · · Score: 1
    Their testing metrics are pretty thin. They test for the following possible scenarios:
    Memory leak - 0
    NULL pointer dereference - 29
    Bad Deallocation -0
    Out of bounds Array access - 0
    Uninitialized Variable - 2
    And this isn't to say that any of these scenarios could or would actually occour during the execution of the code, they're just theoretical possibilities. Furthermore, these 31 are real easy things to fix. It's just a bad test reporting meaningless data in a feeble attempt to sell a product of little practicle value.
    1. Re:lousy metrics by Anonymous Coward · · Score: 0

      Too bad they don't sell a spell-checker. You might find it to be of great practicle (sic) value.

  84. Automated bug finding? by Bas_Wijnen · · Score: 1

    Reasoning's code inspection service is based on a combination of proprietary technology and repeatable process. The results are objective and comparable across software applications, development methodologies, and coding styles

    I have been thinking of writing a program that can detect security holes (buffer overflows in particular) automatically. It would be very hard. But they claim they have such a program, that just finds bugs automatically, and all they use it for is counting them? Somehow I can't believe that. So I guess they don't have such software and are doing something which isn't really objective and comparable at all...

    Another reason for this suggestion is that they count bugs per line of code. That for one is not comparable across coding styles.

  85. Re:Code defects appear to be a small part of the e by Anonymous Coward · · Score: 0

    Maybe that's because the majority of web servers are running on Unix/Linux? Or maybe that involves too much common sense to be believed? I guess so.

  86. Lies, damned lies, and statistics by UnknowingFool · · Score: 4, Insightful
    Numbers can mean anything. It's the interpretation that matters. 31 errors in 58,944 lines. Hmmm. Even if we take Reasoning's word that these are errors and not "features", that's a rate of 0.53 errors per thousand lines. The unnamed commercial software had a rate of 0.51. So what does that prove?
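The arithmetic behind those figures, as a small sketch. The function name `defects_per_kloc` is made up here; the inputs (31 defects, 58,944 lines) are the ones quoted above.

```c
/* Defects per thousand lines of code (KLOC). */
double defects_per_kloc(unsigned defects, unsigned long lines)
{
    return 1000.0 * (double)defects / (double)lines;
}
```

`defects_per_kloc(31, 58944)` comes out at roughly 0.526, which rounds to the reported 0.53. At this code size a single defect shifts the figure by about 0.017, so the 0.02 gap to the commercial average amounts to little more than one defect.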

    1) Apache 2.1 has more bugs than some unknown commercial competitor. If the version is correct, a development (not-ready-for-release) build was pitted against a released commercial build. Not fair playing ground.

    2) Reasoning does not detail the severity or kind of the bugs. Certainly, a web server not being able to handle a type of format (pdf, csv, ogg vorbis) is less severe than a security hole. Pitted against IIS, I would trust Apache even if it had more bugs, because historically it has had fewer security patches. Check out Apache's 2.0 known patches vs IIS 5.0

    --
    Well, there's spam egg sausage and spam, that's not got much spam in it.
    1. Re:Lies, damned lies, and statistics by tcopeland · · Score: 1

      If you want to see a bunch of open source projects' code evaluated (just quality checks, not bugs), take a look at:

      PMD-WEB

      Lots of Java projects being checked for unused code. Fun stuff!

  87. Lines of code? by slackr · · Score: 1

    What is a line of code anyway? Is that the number of hard returns or the number of semicolons? Even so, can we talk about the number of times a line of code is executed? For instance, an efficiently looped statement can often be broken out into tedious and unnecessary repetition. In this way, bad style can reduce your "defect density" by padding your overall volume of text. At the very least I can change
    int x,y;
    into
    int x;
    int y;

    and reduce my defect density by 50% (if the above code weren't brilliantly flawless).
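A rough sketch of the two counting conventions the question contrasts: physical lines (newlines) versus a crude "logical" count (semicolons). The function names are invented, and the semicolon count is deliberately naive: it ignores comments, string literals and `for (;;)` headers.

```c
#include <stddef.h>

/* Counts occurrences of character c in the NUL-terminated string src. */
size_t count_char(const char *src, char c)
{
    size_t n = 0;
    for (; *src != '\0'; src++)
        if (*src == c)
            n++;
    return n;
}

/* Physical lines: one per newline character. */
size_t physical_lines(const char *src) { return count_char(src, '\n'); }

/* Crude logical lines: one per semicolon. */
size_t logical_lines(const char *src) { return count_char(src, ';'); }
```

For `"int x; int y;\n"` the physical count is 1 and the semicolon count is 2, so the same defect count gives two different densities depending on which convention the tool uses.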

    --

    * Please do not read my signature.
  88. Re:Code defects appear to be a small part of the e by neosake · · Score: 1

    That's why they rewrote IIS 6.0 (included with .net server 2003) from scratch.

    --
    "When a ball dreams, it dreams it's a frisbee"
  89. NOT VULNERABILITIES, FOOL. by Anonymous Coward · · Score: 0

    Visible errors in the code. dereferencing NULL pointers, not saying 'int i=0;', etc. not "well, if you issue this crafted URL to the app, and its a full moon and a monday and the weather is 78F, you can get root!"

    did you even bother to read the article?

    oh wait, this is slashdot. my bad.

  90. Re:Code defects appear to be a small part of the e by MisterFancypants · · Score: 2, Insightful
    Every hacker on the planet has full access to the code - which means that they can review it and find vulnerabilities in it.

    Do you know how long it takes to read someone else's code on something like an Apache-level webserver and understand it to the point where you can make useful changes and fixes? The big lie of the "all bugs are shallow" argument is that such a thing is simple, when in fact it is not.

    Fixing a non-obvious bug in a 100k or so line C or C++ project is hard enough when you wrote the code yourself. If someone else wrote the code, it is harder still.

  91. RTFAdvertising by tanguyr · · Score: 4, Insightful

    As has been pointed out a couple of times in other comments, 2.1 is the development branch of the Apache web server - ie "beta", "buggy", "work in progress", etc. etc. Instead of reading this as "Apache has roughly as many defects as closed source web servers" let's read this as "the development version of Apache has as many defects as... well, some unidentified (beta? shipping?) version of some unknown (iPlanet? IIS?) web server". But you can be *much* more confident that these defects will be fixed in Apache than in the *other* product.

    Heck, forget confidence - YOU CAN JUST CHECK.

    The fact that Reasoning didn't have to go and get permission from Apache to run this test - coupled with the fact that we don't even know what Apache is being compared to - is the *real* point behind this "article". /t

    ps: IANAL but don't they have to include a copy of the Apache License given that they publish fragments of the source code in their defect report?

    --
    #!/usr/bin/english
  92. Some "defects" aren't really... by peerogue · · Score: 2, Interesting

    Look at defect ID #26 in the report.

    You'll see that this can only happen when nItems is 0. This means that if a pre-condition was added to the routine tsort() that the nItems argument MUST be strictly positive, defect #26 vanishes.

    If I'd put:

    assert(nItems > 0);

    at the routine entry, it would prevent the further null-pointer dereference and spot the bug immediately when it occurs. I'm not sure how well a web-server crashing would be perceived, but that would not be worse than a kernel panicking, and there is indeed a potential bug there.

    My point is that to call #26 a defect (or not), we'd have to check all the callers, and if all the callers were to guarantee that nItems is strictly positive, then there would be no bug at all.
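A sketch of the precondition idea on a toy routine. This is not the tsort() from the report: `max_item` and its parameters are hypothetical, chosen only because reading `items[0]` is invalid when the count is zero, much like the situation described for defect #26.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical routine: returns the largest of n_items values.
 * Precondition: items is non-NULL and n_items is strictly positive. */
int max_item(const int *items, int n_items)
{
    assert(items != NULL);
    assert(n_items > 0);   /* fails loudly here instead of misbehaving later */

    int best = items[0];
    for (int i = 1; i < n_items; i++)
        if (items[i] > best)
            best = items[i];
    return best;
}
```

Note that `assert` compiles to nothing when `NDEBUG` is defined, so in a release build the guarantee still has to come from the callers, which is why checking every call site matters.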

    Apart from this remark, I think that kind of work is really great. I'd love to see it applied to my favorite open-source Linux Gnutella client (all Gnutella clients are by definition an HTTP client/server). We'd see how a small open-source project compares to a big one.

    1. Re:Some "defects" aren't really... by Anonymous Coward · · Score: 0

      Regardless... it's sloppy coding. Removing the checking code from the callers and putting it in central place in the function that uses it is the fix. You should never rely on callers behaving.

  93. Reasoning's Webserver by Phishpin · · Score: 1

    Netcraft on Reasoning.com's webserver. Apache 1.3.23 on Redhat.

    --
    -phish
  94. shouldn't this be in the apache section? by Anonymous Coward · · Score: 0

    I mean, slashdot has an apache section (which is retarded as the radio section, and only slightly more popular)... shouldn't a story about apache go there?

  95. Apache 2.1 1/31/2003 code by semanticgap · · Score: 1

    The report says they're using 1/31/2003 code.

    IIRC, on 1/31/2003 the Apache 2.1 branch was only a couple of months old; it wasn't even alpha quality...

    I'm curious who is footing the bill for this "research"?

    1. Re:Apache 2.1 1/31/2003 code by Nathan+Ramella · · Score: 1
      They're footing the bill for this "research". Their company provides source code auditing services. So, to gain visibility on their services, they release a few white papers showing how they can strut their stuff.

      It doesn't matter to them who wins or loses, just that someone says 'Ah, OUR product could use a source code audit...' and goes to their webpage to check out their services.

      Don't always be so quick to whip out the Microsoft tin-foil hat! :)

      --
      http://www.remix.net/
    2. Re:Apache 2.1 1/31/2003 code by semanticgap · · Score: 1

      But then why pick a pre-alpha version of Apache? Clearly they wanted to twist the facts. They could have gotten more visibility and a better reputation by analyzing Apache 2.0.46.

    3. Re:Apache 2.1 1/31/2003 code by Nathan+Ramella · · Score: 1

      More bugs to find == a larger report. If they showcased their product's capabilities and turned in 2 pages of minor bugs, who would care?

      --
      http://www.remix.net/
  96. The Origin of the code check software is here! by ratfynk · · Score: 1
    Here is the origin of the company that wrote the code check interface. They have evolved and grown at an exponential rate since learning to use Visual Studio!

    http://slashdot.org/article.pl?sid=03/05/11/0015 24 0



    Sorry I worked for the firm and do not know how to link in htm with ./ yet.

    Just remember to move the last url 0 back one space, after ctrl-c and ctrl-v the above url.

    --
    OH THE SHAME I fell off the wagon and use sigs again!
  97. Re:Code defects appear to be a small part of the e by SquadBoy · · Score: 1

    Thank you so much. I *hate* it when people pull that crap out because, guess what, in the server room Windows is still losing. Thanks.

    --

    Cypherpunks: Civil Liberty Through Complex Mathematics. Those who live by the sword die by the arrow.
  98. This sounds like ... by Anonymous Coward · · Score: 0

    Misinformation at its finest.
    Who paid them to do this report? Microsoft?

    Maybe slashdot's new byline should be :

    Slashdot,

    Misinformation that pretends to be news for geeks.
    Stuff that looks like it should matter, but doesn't.

    Not quite as catchy though is it huh?

  99. sco! by Anonymous Coward · · Score: 2, Funny

    The lower defect rate in Linux TCP/IP can only be explained by a large chunk of more mature, commercial, stable SCO UNIX code.

  100. bugs are necc. where crashes are... by pioneer · · Score: 1

    If the Apache developers simply want to fix the bugs, they can use the Defect Report. If they want to conduct a brutal purge of their contributors, they can use the Metric report.


    Any developer knows that code that crashes is rarely the code that contains the defect. If this were the case then the bug would have been found long ago because its faulty behaviour would have presented itself immediately. Difficult bugs (those probably found by this test) are those that start somewhere in the code but do not surface until much later, masking their true identity.

    That being said, certainly a list of crash sites may provide hints as to where to look for the real bug.

  101. Re:Thank God Hemos is back by Anonymous Coward · · Score: 0

    that's the first time that phrase has ever been used before.

  102. Defect is too strong a word... by Bazman · · Score: 4, Insightful
    Take the null pointer dereferencing thing. All this program seems to do is see if there's a possible path for null-pointer dereferencing. It has no clue as to whether this is logically going to happen. For example:
    2815 while (1) {
    2816     ap_ssi_get_tag_and_value(ctx, &tag, &tag_val, 1);
    2817     if ((tag == NULL) && (tag_val == NULL)) {
    2818         return 0;
    2819     }
    2820     else if (tag_val == NULL) {
    2821         return 1;
    2822     }
    2823     else if (!strcmp(tag, "var")) {
    2824         var = ap_ssi_parse_string(r, ctx, tag_val, NULL,
    2825                                   MAX_STRING_LEN, 0);
    The software claims that tag could be null on line 2823. But that's only if on return from ap_ssi_get_tag_and_value tag is a NULL pointer and tag_val is non-NULL. If ap_ssi_get_tag_and_value can't return these conditions then this is not a defect. If anything it's a red flag, in case the return values of ap_ssi_get_tag_and_value could satisfy that condition.

    I suspect the following code will be flagged as a defect:

    char *tag=NULL;
    doOrDie(&tag);
    strcmp(tag,"do");
    As long as doOrDie() does its job and never returns a NULL then where's the defect? The guys who wrote this tester seem to want you to check any pointer dereferencing against NULL before use - I might be doing this in my doOrDie() function, I don't want to have to do it twice.
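    For anyone who wants to try the doOrDie() case, here is a minimal self-contained sketch in C. The names do_or_die() and tag_is_do() are made up for illustration, and the only "die" path in this toy version is a NULL output argument. The point is that the guarantee lives inside the helper, so a tool has to look into the callee to see that tag is non-NULL at the strcmp().

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: either stores a non-NULL string in *out
 * or does not return at all. */
static void do_or_die(char **out)
{
    static char value[] = "do";

    if (out == NULL) {
        abort();            /* the "die" branch: never returns */
    }
    *out = value;           /* the "do" branch: *out is non-NULL */
}

/* Returns 1 if the tag filled in by do_or_die() equals "do". */
static int tag_is_do(void)
{
    char *tag = NULL;

    do_or_die(&tag);
    /* tag cannot be NULL here, but proving that requires reading
     * do_or_die(), not just this function. */
    return strcmp(tag, "do") == 0;
}
```

    Looking only at tag_is_do(), the pointer starts out NULL and is passed by address to another function, so a purely local analysis has no way to rule out NULL at the strcmp() call.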
    1. Re:Defect is too strong a word... by hey · · Score: 0

      I understand your point but I don't buy it.
      Just because a bug or defect is unlikely to occur
      doesn't make it less of a defect to me.

    2. Re:Defect is too strong a word... by Anonymous Coward · · Score: 0

      Looking at the source (which I assume their scanner traced through), looks like it can't.
      guess it doesn't trace deep enough

    3. Re:Defect is too strong a word... by DrInequality · · Score: 3, Interesting
      Defect is way too strong. Take Defect 1. It can only possibly dereference a NULL pointer if a number of preconditions are true. The last one is that (!conf->providers) [the pointer in question] must be false.

      !!conf->providers => conf->providers => conf->providers != NULL

      Their program has detected "defects" where there are none. Perhaps the greater coding style variation on open source projects exposes more defects in their automated program!

    4. Re:Defect is too strong a word... by BrittPark · · Score: 1

      In these examples I'd say the only defects are missing assert()s. An assert would document the invariant that the particular pointers are never NULL.

    5. Re:Defect is too strong a word... by johnchx · · Score: 1
      But thats only if on return from ap_ssi_get_tag_and_value that tag is a NULL pointer and tag_val is non-NULL.

      Read the condition in line 2817 again.

      Notice that it tests both tag and tag_val against NULL, implying that the author of the function believed that tag could be NULL at the same time that tag_val was non-NULL.

      If ap_ssi_get_tag_and_value cant return these conditions then this is not a defect.

      The code itself implies that this guarantee is NOT provided. And Reasoning's software notices this fact. (There are similar patterns in all of the "dereferencing a NULL pointer" errors that I looked at -- that is, it's always code that's dereferencing a pointer that was tested for NULL earlier along the code path, but not modified as a result of the test.)

      So...whatever Reasoning's software is doing, it's considerably more clever than you imply.
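      A small sketch in C of the pattern described above, with the open path closed off. classify_tag() and the enum names are hypothetical and only mirror the shape of the quoted if/else chain; they are not taken from the Apache source. The third test handles the combination the original chain leaves uncovered (tag is NULL while tag_val is not), so the strcmp() is only reached with a non-NULL tag.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical result codes for the sketch. */
enum tag_kind {
    TAG_MALFORMED  = -1,  /* tag is NULL but tag_val is not */
    TAG_BOTH_NULL  = 0,
    TAG_VALUE_NULL = 1,
    TAG_VAR        = 2,
    TAG_OTHER      = 3
};

static enum tag_kind classify_tag(const char *tag, const char *tag_val)
{
    if (tag == NULL && tag_val == NULL) {
        return TAG_BOTH_NULL;
    }
    if (tag_val == NULL) {
        return TAG_VALUE_NULL;
    }
    if (tag == NULL) {
        /* The earlier tests do not exclude this combination. */
        return TAG_MALFORMED;
    }
    if (strcmp(tag, "var") == 0) {
        return TAG_VAR;
    }
    return TAG_OTHER;
}
```

      Whether the extra branch is dead code depends entirely on what the function filling in tag and tag_val can return; the sketch just shows what an explicit answer looks like.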

  103. Here is the real question by Anonymous Coward · · Score: 0

    Who judged the code they used to judge Apache? I bet their code has even more defects...

  104. Re:Code defects appear to be a small part of the e by AftanGustur · · Score: 2, Interesting


    Maybe that's because the majority of web servers are running on Unix/Linux?

    True, but according to statistics 56% of defaced webservers run Microsoft IIS, and (only) 34% Apache..

    This is not brand new data, but it is the latest I can find ... And If Microsoft had some stats showing different results, you can be sure they would publish them..

    The competition was about defacing 6000 webservers in 6 hours, so one would tend to conclude from the above that Microsoft IIS would be the primary targets..

    --
    echo '[q]sa[ln0=aln80~Psnlbx]16isb572CCB9AE9DB03273snlbxq' |dc
  105. Useless information presented confusingly by albin · · Score: 4, Interesting

    Slashdot's summary of this article is way off base, and the article itself couldn't be less useful. Counting the number of "errors" in lines of code... and the ratio is supposed to mean something to us? As compared to unnamed other software? C'mon, I have better things to do with my time.

    *plonk*

    --
    A hen is only an egg's way of making another egg. -- Samuel Butler
    1. Re:Useless information presented confusingly by fruey · · Score: 1
      I wrote and submitted the summary. It's not a very interesting article, perhaps, but it's only confusing because the survey was pretty confusing.

      As for useless, the point is rather the *debate* about such tactics, and the link back to the TCP/IP survey which was on Slashdot too, rather than the article itself.

      I am just a SlashDot reader like yourself, by the way. You can submit articles too (just click the link on the left column).

      --
      Conversion Rate Optimisation French / English consultant
  106. Re:Code defects appear to be a small part of the e by Anonymous Coward · · Score: 0

    Poor Example. A better one is testing your spam blocker by "open sourcing" your email address (plastering it all over the web, usenet, opting in, etc) vs keeping your email address private.

  107. OSS Standards by pmiller396 · · Score: 2, Insightful

    Okay, we've beat to death the fact it was a pre-release version. But look at it this way:

    When Open Source software is about the same quality as closed source, the developers consider it unstable and warn people that they may run into problems.

    It shows a big difference, to me, in the quality standards that OSS developers (and users) expect.

  108. Null pointers and uninitialized variables by mystran · · Score: 2, Insightful
    I don't know, probably some of these defects might be actual problems, but unless the software is real good, it's always possible that certain cases never happen, although automatic software can find "defects".

    As a rather "stupid" example, I had to initialize a Map to an empty HashMap just last week to get Sun's Java compiler to accept my code, although the only two references to the Map were within two if-blocks, within the same function, both of which depended on the same boolean value, which wasn't changed in the whole function.

    There's a difference between a defect and a bug. Tools that help in finding problems are great, but after all, they can only point out possibly unsafe spots. Of course it's good to write code that doesn't trigger any such possibilities in the first place.

    --
    Software should be free as in speech, but if we also get some free beer, all the better.
    1. Re:Null pointers and uninitialized variables by tcopeland · · Score: 1

      > both of which depended on the same boolean value

      Hm. That's interesting. Was the boolean value a constant?

    2. Re:Null pointers and uninitialized variables by mystran · · Score: 1
      Yeah, about 15 lines above indeed, it was set unconditionally to value "true", though it was inside an object (with only public members) which is probably what confused the compiler?

      Only it wasn't, because I already thought about this, and cached the value in a "final boolean" variable, to see if that would be the problem. It wasn't; it still complains and refuses to compile. So it doesn't matter a bit if it's constant or not. Well, it's not too bad a performance hit to create one extra object, but it's kinda bad for code readability, but what can I do.

      --
      Software should be free as in speech, but if we also get some free beer, all the better.
    3. Re:Null pointers and uninitialized variables by tcopeland · · Score: 1
      > So it doesn't matter a bit if it's
      > constant or not.

      Hm. Is this kind of like what the code is doing?
      [tom@hal tmp]$ cat Bar.java
      public class Bar {
          void foo() {
              String x;
              final boolean buz = true;
              if (buz) {
                  x = "hey";
              }
              System.out.println("Fiddle: " +x);
          }
      }
      [tom@hal tmp]$ javac Bar.java
      [tom@hal tmp]$
      Because that seems to compile OK. Or is there more to it than that?
    4. Re:Null pointers and uninitialized variables by mystran · · Score: 1

      It was about like this.. void foo() { java.util.Map map; final boolean buz = true; if(buz) { map = new java.util.HashMap(); map.put("foo", "bar"); } /* something else */ if(!buz) { System.out.println( "doing something else"); } else { System.out.println(map.get("foo")); } }

      --
      Software should be free as in speech, but if we also get some free beer, all the better.
    5. Re:Null pointers and uninitialized variables by mystran · · Score: 1

      ok, slashcode ruined the formatting.. (yeah yeah, problem exists between chair and keyboard).. sorry

      --
      Software should be free as in speech, but if we also get some free beer, all the better.
    6. Re:Null pointers and uninitialized variables by tcopeland · · Score: 1
      Hm, that's odd, that compiled also:
      [tom@hal tmp]$ cat Bar.java
      import java.util.*;

      public class Bar {
          void foo() {
              Map map;
              final boolean buz = true;
              if(buz) {
                  map = new HashMap();
                  map.put("foo", "bar");
              }
              /* something else */
              if(!buz) {
                  System.out.println( "doing something else");
              } else {
                  System.out.println(map.get("foo"));
              }
          }
      }
      [tom@hal tmp]$ rm ~/tmp/Bar.class
      [tom@hal tmp]$ javac ~/tmp/Bar.java
      [tom@hal tmp]$ java -version
      java version "1.4.1_02"
      Java(TM) 2 Runtime Environment, Standard Edition (build 1.4.1_02-b06)
      Java HotSpot(TM) Client VM (build 1.4.1_02-b06, mixed mode)
      [tom@hal tmp]$
      Maybe we're using different JDKs or something....
    7. Re:Null pointers and uninitialized variables by mystran · · Score: 1
      Ok, this seems to compile for me too. I admit I was lazy and didn't test what I posted on the compiler, and since I couldn't paste the original offending code (closed source) I had to simplify the case.

      Too bad I already rewrote the code with another algorithm so I don't have anything to refer to anymore. Probably one of these bugs like the gcc 2.95.3 C++ compiler sometimes having trouble with two nested for-loops where there's no block for the outer, but just the inner for statement. Putting the for-loop into a block fixes the case.. but this is starting to get off topic so I'll shut up.

      Ps. the java version was 1.4.0_01 (windows) so it's old anyway, and the bug might as well be fixed.

      --
      Software should be free as in speech, but if we also get some free beer, all the better.
    8. Re:Null pointers and uninitialized variables by tcopeland · · Score: 1

      Cool. This is kind of neat stuff - I mean, the whole control flow analysis thing. Seems like a good optimizing compiler could do quite a bit of work. I wonder how much it actually does, and how much gets pushed off into the JVM?

      Anyhow, fun stuff!

  109. Re:Code defects appear to be a small part of the e by bwt · · Score: 3, Interesting


    One of the best ways to get to know a large code base like Apache or something else is to find a repeatable bug and track it down. To fix a bug you do not need to understand the whole program, just the relevant parts. I've submitted bug fixes to several projects, so I must strenuously disagree, especially because, ahem, I have never submitted a bug fix to a proprietary project because it's impossible.

  110. Thank you, Captain Obvious by Sxooter · · Score: 2, Funny

    Well this certainly falls under the "duh" category. Freshly written code tends to have more bugs than older, well reviewed, well tested code.

    Wow, next we'll learn how you shouldn't buy any Ford, GM, or Chrysler product in the first year of production.

    --

    --- It is not the things we do which we regret the most, but the things which we don't do.
    1. Re:Thank you, Captain Obvious by cant_get_a_good_nick · · Score: 1

      Wow, next we'll learn how you shouldn't buy any Ford, GM, or Chrysler product in the first year of production.
      Go PINTO!! And Fiero, which was a very applicable name that first year.

  111. How about duplicate code in the Apache source? by tcopeland · · Score: 1

    Here's a duplicate code report (generated by CPD) for a checkout from the APACHE_2_0_BRANCH as of about a month ago. Time for some refactoring....

  112. Coding errors & program logic errors by MROD · · Score: 3, Insightful

    Of course, this test of the code is purely a test of coding errors rather than errors in the code logic.

    The most worrying errors in programs are generally not coding errors as they are either terminal (ie. crash) or they are benign (the error may cause memory corruption in a place where it does no harm). Of course, there are exceptions such as buffer overflows, but I'd class those, in general, into the logic error category.

    Logic or algorithmic errors are far more dangerous as they can be well hidden and are more likely to make the code do things unintended. The code itself may be perfect but if the algorithm is faulty then there's a major problem.

    --

    Agrajag: "Oh no, not again!"
  113. Re:FACT: 3 is a larger number than 2 by bwt · · Score: 2, Insightful

    I agree completely. Any metric based on Lines of Code anything is a harmful metric. Any metric based on defect counts is also harmful. Both of these are left-overs from attempts to (mis)-apply statistical process control. Control of crappy metrics gives crappy quality.

    Suppose I had 100K lines of code with 100 defects. After reviewing my code I discovered that I could refactor it to 80K lines and suppose further that doing so had no effect on the defect count. Defects per line of code would look worse after an improvement.
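    The arithmetic, as a quick sketch in C (defects_per_kloc() is just a helper name made up for this post): 100 defects in 100,000 lines is 1.00 per KLOC, and the same 100 defects in 80,000 lines is 1.25 per KLOC, so the metric gets 25% worse although nothing about the defects changed.

```c
#include <assert.h>
#include <stdio.h>

/* Defect density in defects per thousand lines of code (KLOC). */
static double defects_per_kloc(int defects, int lines_of_code)
{
    return (double)defects * 1000.0 / (double)lines_of_code;
}

/* Prints the before/after densities for the refactoring example. */
static void print_refactoring_example(void)
{
    double before = defects_per_kloc(100, 100000);
    double after  = defects_per_kloc(100, 80000);

    printf("before refactoring: %.2f defects/KLOC\n", before);
    printf("after refactoring:  %.2f defects/KLOC\n", after);
}
```

    Calling print_refactoring_example() from main() shows the two figures side by side.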

    Also, given that this is an automated program, I have to ask how they calibrate and validate its results. How many of the 32 errors found actually aren't errors? How many existing known bugs were not found by this program? I really can't accept these results as anything more than fluff with numbers.

  114. Re:Code defects appear to be a small part of the e by Anonymous Coward · · Score: 0

    To quote me:

    "If I had a letter, sealed it in a vault, gave you the blueprints of the vault, the combinations of 1000 other vaults, access to the best lock smiths in the world, then told you to read the letter, and you still can't, and I decided that I was safe from the rest of the world, then I am a stupid sysadmin."

    If I have a vault that claims to be secure, then I am an idiot for publishing the details. True, obscurity will not substitute for security, but it does slow the attacker down, and possibly raise the cost of attack so that it is not economic to attack my vault.

    Every time I hear the "obscurity is not security" mantra I chuckle. Of course it isn't, but that doesn't make publishing the information a good idea. Is Fort Knox secure? Probably. If so, then why don't they publish the blueprints, guard rotation schedule and security policies? Because that would be stupid, that's why.

    Go tell every LAN admin that they need to publish their LAN architecture, firewall config and security policies, because "obscurity is not security." Watch them laugh their ass off at you.

    Maybe I drank too much coffee this morning...

  115. Re:Code defects appear to be a small part of the e by cornjones · · Score: 1

    If you are going to compare Apache with IIS, you need to compare Apache+PHP (or mod_perl or similar) with IIS+ASP. The addition of a server-side programming language adds a lot of complexity. How many of the IIS bugs are in the IIS server itself vs. its handler DLLs? All the .ida and Indexing Service attacks were this type of vuln.

  116. Development release by Door-opening+Fascist · · Score: 3, Insightful

    Why did they use the development branch of Apache, when only a handful of sites are running it? I would have found an analysis of the stable 1.3 branch, which 60% of the web-serving world uses, to be more informative.

    1. Re:Development release by sabat · · Score: 3, Insightful

      Why did they use the development branch of Apache

      Let me restate this: why are they comparing pre-alpha software with production releases?

      Simplest answer: because they wanted to find flaws. The second most popular web software is IIS. This looks like a Microsoft tactic: anonymously hire this company to "evaluate" code so that the results look unbiased. Everyone will likely realize that the competitor is Microsoft's IIS, so it doesn't need to be stated bluntly. MS wins; another (small) battle for mindshare is won.

      --
      I, for one, welcome our new Antichrist overlord.
  117. BINGO by Anonymous Coward · · Score: 3, Informative

    In almost every case they listed the pathway was via a failed malloc.

    Apache has its own malloc that kills the connection (and the child) if it fails.

    That code can never be reached. Their test is invalid.
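    To illustrate the idea (this is NOT Apache's allocator, just a sketch with made-up names alloc_or_abort() and copy_string()): if every allocation goes through a wrapper that never returns on failure, then callers cannot see a NULL result, and a "NULL after failed malloc" path in the caller is unreachable.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical wrapper: returns a valid pointer or terminates the process. */
static void *alloc_or_abort(size_t size)
{
    /* Ask for at least one byte so a zero-size request cannot
     * legitimately come back as NULL. */
    void *p = malloc(size == 0 ? 1 : size);

    if (p == NULL) {
        fputs("out of memory\n", stderr);
        abort();                    /* never returns */
    }
    return p;
}

/* A caller that uses the result without a NULL check. A checker that
 * does not know alloc_or_abort() cannot fail may still flag the memcpy(). */
static char *copy_string(const char *src)
{
    size_t len = strlen(src) + 1;
    char *dst = alloc_or_abort(len);

    memcpy(dst, src, len);
    return dst;
}
```

    Whether the real allocator behaves this way in every configuration is something to check in the source, not something this sketch proves.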

  118. Apache 1.3? by Spazmania · · Score: 4, Interesting

    First, as many posters have noted, Reasoning DID NOT TEST APACHE 2.1. They tested Apache 2.1-dev. That's dev, as in development branch. As in: I have new untested code, so don't use me on a production server until I'm released in the STABLE series.

    For a valid comparison versus commercial software, the testers should have used Apache 2.0.46, the most current STABLE series release.

    Second, I'd be interested to see a comparison of 2.0.46 versus 1.3.27. I have a pet theory that multithreaded C code has more bugs than single-threaded C code, and I'd like to see whether there is evidence to support it.

    --
    Moderating "-1, Disagree" is simple censorship. Have the guts to post your opinion.
    1. Re:Apache 1.3? by Florian+Weimer · · Score: 1

      Second, I'd be interested to see a comparison of 2.0.46 versus 1.3.27. I have a pet theory that multithreaded C code has more bugs than single-threaded C code, and I'd like to see whether there is evidence to support it.

      I don't think the tool can statically detect synchronization errors and the like. Probably it just viewed Apache 2 as a single-threaded program.

    2. Re:Apache 1.3? by Piquan · · Score: 2, Insightful

      I keep hearing this, and I'm not convinced.

      I didn't see anything in the article about what versions of closed-source codebases they used for comparison. But I'd hypothesize that it's code that they've been contracted to analyze. That means it's probably development code in that event, too.

      We can't gritch about them using Apache 2.1-dev unless we have reason to believe they didn't compare against dev versions. We can gritch about not having this information.

    3. Re:Apache 1.3? by Spazmania · · Score: 1

      We can't gritch about them using Apache 2.1-dev unless we have reason to believe they didn't compare against dev versions.

      Sure we can. The number of bugs in a dev version isn't an interesting metric here. The bug count in a release version is. Also, their press release not only implied that they were testing release versions, but generalized the claim to say that open source web servers had as many bugs as closed source web servers.

      --
      Moderating "-1, Disagree" is simple censorship. Have the guts to post your opinion.
    4. Re:Apache 1.3? by Spazmania · · Score: 1

      I suspect you're right, but I'm interested in whether multi-threading makes debugging enough more difficult to result in a higher defect count in the basic bugs as well.

      --
      Moderating "-1, Disagree" is simple censorship. Have the guts to post your opinion.
  119. Re:Code defects appear to be a small part of the e by jdh-22 · · Score: 3, Interesting

    You have the wrong idea here. There is a point at which you must realize what information you can release without compromising the security of your system. While I can give you the plans to my vault, I will not give you the combination, nor the first or second numbers in it.

    For the star wars geeks out there, if you were a Jedi, you don't go around telling everyone you're a Jedi, nor do you flash your light saber in public places. They do realize when to show their light saber, and when they can tell people they are a Jedi. Nor do they not tell anyone who they are, or never show their lightsaber.

    You might want to check out Secrets and Lies which will give you a better understanding of security philosophy.

    --
    Every Super Villan uses Linux.
  120. Re:Code defects appear to be a small part of the e by johnnyb · · Score: 3, Informative

    Actually, I've found that fixing bugs in large projects is about the same whether or not you are familiar with the project, provided that the author was not smoking crack at the time he wrote it.

    For example, I managed to code, test, and patch a "fix" for PostgreSQL this weekend in under 2 hours, having never seen the code before.

    The "fix" wasn't a bug, per se, it's just that the output of pg_dump wasn't optimal in my usage for dumping the schema for CVS revision control. I added two flags, -m -M, which molded the output to my liking.

    If you haven't seen your code in two months, you and an outsider have about the same chance at finding and detecting bugs/misfeatures.

  121. Errors mean nothing... by Foofoobar · · Score: 2, Insightful
    Errors in coding mean next to nothing when it is a machine that is checking the syntax of your code. Variations in coding techniques that are perfectly acceptable often show up as errors merely because the program doing the code checking does not understand your syntax. I've seen it happen time and again with error checkers and one could even say that 2% of all errors found by error checkers are mere differences in syntax.

    My wife who is a lead QA tester could vouch for that...

    --
    This is my sig. There are many like it but this one is mine.
  122. Hmm, the first claim seems to be wrong... by marcink1234 · · Score: 2, Insightful

    I have just read the first 'null dereference' claim and it seems to me that in fact it is not possible. Maybe we've got a number of Reasoning bugs?

  123. Peer Review is a Good Thing by APDent · · Score: 1

    Maybe I'm naive, but I read the press release as an endorsement of open source, and a prediction that as the code (the development branch of Apache in this case) matures and is subject to peer review, the code quality will improve (by whatever measure Reasoning is using).

    Here's why I think that:

    Reasoning found that the Apache Open Source server had a similar defect density compared to the average defect density of several commercial equivalents. This finding, when considered alongside an earlier Reasoning study released in February (http://www.reasoning.com/news/pr/02_11_03.html), suggests that as software applications mature, there is a correlation between code inspection/peer review and the resulting defect density. [emphasis mine]
  124. Re:Code defects appear to be a small part of the e by schon · · Score: 3, Insightful

    Every time I hear the "obscurity is not security" mantra I chuckle. Of course it isn't, but that doesn't make publishing the information a good idea.

    Nobody's saying that the information should be published - what they're saying is that you can't rely on that information being a secret.

    Is Fort Knox secure? Probably. If so, then why don't they publish the blueprints, guard rotation schedule and security policies?

    That's pretty much the point you're missing - even if that information was published, it wouldn't diminish the security of Fort Knox..

    If the people in charge relied on the fact that they don't publish those details, that would be obscurity, because it would lead them to make errors elsewhere. (Oh, it's OK if we leave the main vault open tonight - nobody knows that there will be no guards around it for 10 minutes at 3:30 AM tonight.)

  125. Re:Code defects appear to be a small part of the e by aziraphale · · Score: 5, Interesting

    One word: architecture.

    And not just the architecture of the web server, but the architecture of the entire platform. But specifically looking at the architecture of Apache versus the architecture of IIS, you'll immediately see that the goals of the two pieces of software are not the same. Look at things like IIS's metabase - the structural details of the server's configuration are kept in an in-memory data structure, which is easily modified while the server is running. Apache, in contrast, reads its configuration at startup, and uses it to determine which modules of code are loaded, and how they are used to process requests - fixing the behavior of the web server at startup.

    IIS follows typical MS enterprise software design - it has to interface with COM, and the NT security model, and active directory, and the registry, and a million other systems, all in the name of integration, and enterprise management. Apache doesn't have PHBs telling it that it needs another way for the metabase to be edited, or a new instrumentation API, or whatever else a particular large customer asked for - and can get on with just providing its facilities cleanly.

    That's why IIS has so many more security holes, even if it does (as may or may not be the case) have the same raw coding error rate as Apache.

  126. Ok, I'll bite by Anonymous Coward · · Score: 0

    What's wrong with the bottom code?

    1. Re:Ok, I'll bite by twitchkat · · Score: 1

      If foo is called with bar less than or equal to zero, b is not initialized at the point it is used in:


      a += b;


    2. Re:Ok, I'll bite by twitchkat · · Score: 1

      err, forget the less than -- bar has to equal zero. (its early :) )

    3. Re:Ok, I'll bite by twitchkat · · Score: 1

      err, forget forgetting. Too bad preview can't wake me up.

  127. Guy, they're on your side by DASHSL0T · · Score: 1, Insightful

    Everybody take a deep breath.

    Their conclusion is that while the INITIAL defect rate of Apache is roughly equivalent to a closed source product (since they are testing a development release), the Open Source methodology reduces the defects to a greater extent and results in code with fewer defects over time.

    They are saying that Open Source coding methods are producing _better_ code in the long run.

    --
    Freedom Is Universal
    Linux-Universe
  128. Re:Code defects appear to be a small part of the e by fnorky · · Score: 1
    But here's the kicker: the vast majority runs Apache on either BSD or Linux. All of this code, from the kernel to the library that tells Apache how to use PHP, is open source. Every hacker on the planet has full access to the code - which means that they can review it and find vulnerabilities in it. Not many people have access to Windows or IIS code. So why does IIS and Windows come out as far less secure, and is exploited so much more?

    Why does IIS and Windows come out as far less secure? That is easy. It is less secure by design. Or should I say security was not a factor (or at least not a critical factor) in the design of Windows. Over the years MS has tried to add security to an inherently insecure system.

    I once heard Bruce Schneier say "you can't build a secure system on an insecure foundation" (at a DefCon some years ago). He was talking about ALL OS's, not just Windows. Linux, BSD and Windows are all inherently insecure.

    What little I know of IIS suggests that it was designed on the assumption that the security in Windows would be enough. Obviously it isn't enough.

    Why is IIS and Windows exploited so much? Well, it seems that the vast majority of exploits are done by script kiddies. Script kiddies and the ones who make the scripts seem to go after the easiest targets. Linux and BSD are also inherently insecure, but are tougher nuts to crack than Windows.

  129. Re:Code defects appear to be a small part of the e by Anonymous Coward · · Score: 0

    That's pretty much the point you're missing - even if that information was published, it wouldn't diminish the security of Fort Knox.

    That makes no sense at all. No installation has bulletproof security. I agree (as I said above) with the assertion that obscurity is no substitute for security.

    If you had a security system that was totally unbreachable (which does not exist), then, yes, you could publish away.

    Unfortunately, no security installation is 100% secure. If it is less than 100% secure, then nondisclosure raises the security. Inversely, disclosure lowers the security.

    Again, we agree that (at one extreme) obscurity alone will not make something secure. The other extreme is that security with full disclosure somehow exists, and that we should totally dismiss any solution that relies on obscurity to raise the cost of breaching it.

    This is nonsense. Back to Fort Knox and firewall configs, keeping that stuff secret does, indeed, raise the security.

  130. obligatory snowcrash reference by pinkfalcon · · Score: 1


    "I told you they would listen to reason"

    --
    Real SUV's don't have cupholders
    It's 5:42 A.M., do you know where your stack pointer is?
  131. Re:Code defects appear to be a small part of the e by Anonymous Coward · · Score: 0

    What's your point? IIS was written from scratch. What makes scratch so much more secure?

    Do you mean that IIS was designed fundamentally insecure? Where did you get that information?

  132. Microsoft C++ catches this. Doesn't gcc? by Phronesis · · Score: 2, Informative
    This lets the compiler catch errors where you meant '==' rather than just '='.

    MY compiler (Microsoft C++) does catch this

    if (myPointer = NULL) { ... }
    and issues a warning. Doesn't gcc?
    1. Re:Microsoft C++ catches this. Doesn't gcc? by Anonymous Coward · · Score: 1, Informative
      Yes, GCC does issue a warning for that. However, it should be noted that this is perfectly valid C.

      GCC does not issue a warning for this, though:
      if( (ptr=NULL) );
      Note the extra parentheses.
    2. Re:Microsoft C++ catches this. Doesn't gcc? by Rasta+Prefect · · Score: 4, Informative
      This lets the compiler catch errors where you meant '==' rather than just '='.
      MY compiler (Microsoft C++) does catch this

      if (myPointer = NULL) { ... }
      and issues a warning. Doesn't gcc?


      Yes, it does. So does every other C compiler I've ever used (quite a few). I suspect the original poster may be the sort who ignores warnings....

      --
      Why?
    3. Re:Microsoft C++ catches this. Doesn't gcc? by Phronesis · · Score: 1
      Interesting about the parentheses.

      About it being valid C, of course you're right (that's why it's a warning, not an error), but as Allen Holub said about C in the title of one of his books, it gives you Enough Rope to Shoot Yourself in the Foot. If such expressions were illegal, we wouldn't need lint and relatives as much as we do.

    4. Re:Microsoft C++ catches this. Doesn't gcc? by Skjellifetti · · Score: 1

      Yes, it does. So does every other C compiler I've ever used (quite a few). I suspect the original poster may be the sort who ignores warnings....

      Nope, just old enough to remember when the compilers didn't catch this. Since I still use the trick, I haven't noticed that the compilers have caught up.

    5. Re:Microsoft C++ catches this. Doesn't gcc? by Anonymous Coward · · Score: 0

      Stickler compilers of old would trap this error by saying you have an unparenthesized expression, i.e. they're telling you to type

      if ((x = 1)) { ... }

      It's also a royal pain in the ass to read code that has the invariant on the left. It is backwards because

      1. looks odd
      2. doesn't follow math rules
      3. logically the part you compare your variable TO follows it, ie. goes on the right hand side.

    6. Re:Microsoft C++ catches this. Doesn't gcc? by Breakfast+Pants · · Score: 1

      the real solution to this is to always do boolean expressions like: if(NULL = myPointer) {..} that way you always get an error if you forget == instead of =.

      --

      --

      WHO ATE MY BREAKFAST PANTS?
    7. Re:Microsoft C++ catches this. Doesn't gcc? by Anonymous Coward · · Score: 0

      if (myPointer = NULL) { ... }
      and issues a warning. Doesn't gcc?


      the reason for doing:
      if (NULL == myPointer) {...}
      as opposed to:
      if (myPointer == NULL) {...}
      is because NULL cannot be assigned to, therefore
      if (NULL = myPointer) {...} will result in a compile time error, but
      if (myPointer = NULL) {...} will only result in a warning.
      Also, by using the first form, you make it very clear that you definitely do not want the assignment operator,
      but in the second form, it may very well be that you do want the assignment operator.

    8. Re:Microsoft C++ catches this. Doesn't gcc? by typobox43 · · Score: 1

      the real solution to this is to always do boolean expressions like: if(NULL = myPointer) {..} that way you always get an error if you forget == instead of =.

      You mean like how you forgot to use ==, right?

    9. Re:Microsoft C++ catches this. Doesn't gcc? by Phronesis · · Score: 1
      I see two problems with this. It's not that there is some absolute right vs. wrong way to program, but that this does not fit my style.

      The problems I see are:

      1. It only addresses a subset of the possible errors of this sort. It catches assignment instead of equality check only when one argument is a constant, but lulls the coder into a false sense of security regarding the same error when both arguments are variables:
        if (my_pointer_1 = my_pointer_2) { ... }
      2. It makes it harder to read the code. The natural flow of mathematical or logical thought in English is left-to-right: "is x a pentagon?" not "is a pentagon x?". As a believer in literate programming, I prefer to write code that's closer to native language and use tools together with close human reading of the code to catch errors. For the same reason, I tend to write more verbose code that avoids side effects:
        for (; *px != 0; ++px, ++py)
        {
            *py = *px;
        }
        *py = 0; /* copy the terminator too, as the one-liner does */
        instead of
        while (*py++ = *px++);
        Following this idiom, I just tell my compiler to flag any conditional expression whose argument has side effects as an error and I get a compile-time error for every conditional expression with a side effect. MSVC lets me selectively tell the compiler, via #pragmas, to promote certain warnings to errors. I presume that gcc has a similar capability.
    10. Re:Microsoft C++ catches this. Doesn't gcc? by Skjellifetti · · Score: 1

      1. looks odd

      That's part of the reason. It's OBVIOUS what you are trying to do.

      2. doesn't follow math rules

      What rules are you talking about? The equation y = x + 1 is identical to the equation x + 1 = y. Yes, we most often solve equations by reducing the left side to a single var with the right side being the numeric answer or expression, but that is merely a convention.

      3. logically the part you compare your variable TO follows it, ie. goes on the right hand side.

      Again, we do this only by convention, like the side of the street you drive on. I wonder how people who write from right to left (Arabic and Hebrew) or from top to bottom do their derivations. Does (NULL == foo) seem unnatural to them?

      But I can sorta sympathise as well. Grep, for example, drives me crazy. The first time I use it in the day, I want to grep FILE EXP (i.e. "Look in the drawer for the socks"). But it's really grep EXP FILE ("Look for the socks in the drawer"). It's backwards. Links are similarly confusing to me -- ln NEW OLD ("Link NEW_NAME to OLD_FILE") should be correct. But it's really ln OLD [NEW] ("Create a link from OLD_FILE to NEW_NAME"). It seems unnatural to me.

    11. Re:Microsoft C++ catches this. Doesn't gcc? by Anonymous Coward · · Score: 0

      I really can't remember the last time I gave grep something besides stdin. 70% of the time it's echo | grep or cat | grep.

  133. Perhaps we will find out... by PetoskeyGuy · · Score: 1

    Maybe we will find out which company they used. Having valid third party tests confirm that your webserver is coded better than Apache would be hard for any company to pass up. Especially Marketing types who advertise stuff they imagine they heard.

    Of course it may not be DESIGNED any better, but they dot every i and cross every t. That's pretty incredible in my experience.

  134. Interesting guesswork, but what about real info??? by HydeMan · · Score: 1

    Your comments are not really interesting nor informative or enlightening. The same old sales job about IIS problems being the result of severe bugs and bad design.

    Here is the real story. Windows' worst enemies are the dumb sysadmins who are put in charge of running the boxes. In fact, for a lot of companies (including my employer), there are no dedicated sysadmins. Since the overhead seems minimal, programmers are put in charge of running the boxes, which is the beginning of the bad rap that IIS gets nowadays. The last thing a programmer wants to worry about is locking down an IIS box. After 6 months of developing an application, these guys want to work on something else, and focusing on configuring the application and the OS is extremely boring and uninteresting.

  135. Message from Hiro.... by rmassa · · Score: 1

    Don't worry about the errors in the code, I'm sure the apache developers will listen to Reason.

  136. Accuracy and False Positives? by Milo77 · · Score: 1

    I didn't see anything about this in the article or on their website (I didn't look too hard). Did anyone else find anything? They imply better than three "9"s of accuracy (31 bugs in 60k), but how much better? If I run their product on a project with millions of lines am I going to be chasing false positives all month? Are they finding bugs, possible bugs, or what? Sounds fishy to me...

  137. Re:Code defects appear to be a small part of the e by MOMOCROME · · Score: 1

    Quite typical of the slashdot crowd, the slurs and insults are being spewed from both sides of the 'community mouth' when it comes to MSFT vs. F/OS/S. In this case, IIS sucks because it is closed source, so the bugs are worse. Other times, it sucks because MCSEs and department supervisors around the world don't know how to configure it (look at the dumb users! hahah). Finally we have the matter of familiarity and popularity, often pointed to by sympathetic and/or apologetic commentators, usually to the tune of much derision and contempt... but that is not the case here, as Apache has the market.

    The reality, of course, lies squarely in the middle of these extreme opinions. IIS (pre-Server 2003 versions, anyway) had some flaws and NT has some flaws, namely shipping with a soft pre-configuration (which, believe it or not, makes sense from a certain standpoint: it's called ease of use). This is often the main reason for IIS being a major target, and most exploits are performed against flaws that have already been patched. Of course, this crowd will spit on automatic updates as an invasion of privacy or some such malarkey, while trumpeting Apache's superiority in between their own patch binges. As you can see, there is very little space for MSFT to do right here. They do a great job given what pressures they face, and the new IIS/S'03 is fantastic, and though we'll need to wait on indications of security, things are looking good.

    But as it turns out, the most likely cause of the rampant IIS exploits is that your hax0rs and script-kiddies are often the same F/OS/S enthusiasts flaming MSFT in a forum like slashdot, bearing a senseless grudge against an important and influential developer like MSFT and gleefully proving their point with cheap trick after cheap trick.

    Finally, as an aside, I had a friend visiting this weekend, here in Seattle. He's an F/OS/S enthusiast, so I took him on a tour of 1 Microsoft Way in Redmond. Aside from the pleasant, tranquil atmosphere and a small group of Indian developers playing Futball, the only thing that stood out for us was the many banners urging everyone at the company to "make it trustworthy", hung over every door, on every light-post and wall, it seemed. It struck me that the last couple times MSFT set their sights on a goal like that, it was dominance of the web (IE) and total hardware compatibility (win95). Regardless of your personal feelings on the matter, it is hard to argue that they didn't succeed in those pursuits!

  138. Worst kind of science by AYEq · · Score: 4, Interesting

    Reasoning's code inspection service is based on a combination of proprietary technology and repeatable process.

    Am I the only one who looks at Reasoning's results with suspicion (even when I agree with them)? Any analysis using methods that are not open and repeatable is not science. This just feels like marketing to me. (It is sad because the study of code quality is such a worthwhile pursuit.)

    1. Re:Worst kind of science by mhifoe · · Score: 1
      I think you can paraphrase their description like this:

      We run PC-Lint or Splint on your source code.
      Then we tart up the output, stick it in a report and charge you 10 grand.

    2. Re:Worst kind of science by pooh666 · · Score: 1

      It looks to me like this report is just an attempt to gain favour with the dimly aware in the media. There is NO science involved with this report. That means there is no control, and there is not a complete look at factors that affect errors outside of a very small, specific and not very troublesome set.

  139. Re:Code defects appear to be a small part of the e by Anonymous Coward · · Score: 0

    You mean you hate people like yourself?

    The Netcraft figure cited compares domain names. When you compare the actual number of public webserving hosts, Apache and IIS are pretty much equal. When you add IIS's considerable lead in private intranet hosting, they are the larger target. Then add the number of personal machines (unintentionally) running a webserver, and IIS comes out as the best worm food.

  140. Re:FACT: 3 is a larger number than 2 by frankthechicken · · Score: 1

    Yeah, you're right, probably a more truthful statement would have been:-

    I usually write fifty thousand lines of code, each line completely and utterly with no meaning, run it through the checker and produce 0 defects, except for one overall defective piece of software

  141. Congrats to all posters by Anonymous Coward · · Score: 0

    I'd like to say that this may have been the most intelligent and interesting discussion I've seen on slashdot in quite some time.

  142. Re:Code defects appear to be a small part of the e by Anonymous Coward · · Score: 0

    Why do you think IIS is exploited "so much more"? Not in my experience, and not for many others as well. I host on multiple platforms, the only one that has ever been hacked is the Redhat box running Apache. Never have I lost a FreeBSD or Windows box. IIS can be just as good or better than Apache in many ways. Most people are using statistics derived from attrition.org that are three years old to show that IIS is hacked more often. Times have changed my friend. Linux systems running Apache are nailed far more often these days, and a great percentage of the mass hacks involve Apache. Anyway, it really kind of comes down to the setup doesn't it?

  143. what doesn't kill you makes you stronger by f00zbll · · Score: 2, Insightful

    The report hardly takes down OSS or Apache. The report is reasonable and doesn't over extrapolate about quality. For me, the report is encouraging because MS has something like 80 programmers working on IIS and Apache is made up of volunteers with far fewer resources; that is pretty darn impressive for alpha code. I haven't looked at the list of active committers lately, but I know it's nowhere near 80. Draw your own conclusions.

  144. Let me guess... by Anonymous Coward · · Score: 0

    You either work on the Windows Kernel for MS or you are age 13.
    Most bugs are due to subtle errors introduced by the coder that look correct. A good example is an extra semicolon directly after the if and prior to the block/statement. This tends to happen more if you use ANSI style rather than K&R. It is a subtle bug that is easily overlooked by the eye (all statements end in a semicolon).
    The grandparent offered a good suggestion on defensive coding. I would suggest that you take it.

    1. Re:Let me guess... by Anonymous Coward · · Score: 0

      You are missing the point. You only need to do crap like that if you are a complete moron. If you can't figure out where to put your semi-colons or whether to use = or ==, then maybe you need to look for a different profession. It's not that hard- really.

    2. Re:Let me guess... by thogard · · Score: 1

      So you're saying the 99% of coders out there that are bitching about trailing ; are simply illiterate when it comes to C? I agree with you. I know I don't make these errors when it comes to C but I have had to explain it to plenty of programmers who think they know the language.

  145. Re:Code defects appear to be a small part of the e by Anonymous Coward · · Score: 0

    But they _do_ publish the information... the guards need to know when they are working for instance. The guards' spouses need to know when they are working too. See how this data is going beyond the needs of just the securing of Ft. Knox.

  146. everyone is reading this wrong by Major+Tom · · Score: 2, Insightful

    There is no need to freak out about this being some sort of attack on open source software or agonize over what the unnamed commercial product used for comparison was.

    The article seems to indicate that the .51 error density for "commercial software" is talking about commercial software in the abstract. Presumably, this isn't the error density of some secret web server, but the average density of all the commercial products they've analyzed so far.

    This report is simply an attempt to prove a simple hypothesis about OSS: it gets increasingly refined as it matures.

    Reasoning believes they've proved the hypothesis because Apache, a middle-aged project, I suppose, has an error density comparable to commercial software, while the TCP/IP stack, a mature project, has a significantly lower density.

    This isn't intended to be a comparison of web servers (come on, people, *of course* they didn't have access to IIS); it is intended to be a mildly interesting observation about the life-cycle of open source software.

    It would be a lot more interesting if we could see an analysis of whether or not commercial software goes through a similar maturing process. Maybe commercial products also grow refined with age. Maybe not. If so, which matures faster?

    --
    What's good for the syndicate is good for the country. --Milo Minderbinder
  147. Re:FACT: 3 is a larger number than 2 by GlassHeart · · Score: 1
    Suppose I had 100K lines of code with 100 defects. After reviewing my code I discovered that I could refactor it to 80K lines and suppose further that doing so had no effect on the defect count. Defects per line of code would look worse after an improvement.

    This is a pitfall of any statistical measure, but is not an indictment of statistics itself as a science or as a tool. You need to understand the metric (in this case, looking at the raw numbers would reveal that the total defects has not increased) to use it properly.

  148. Re:Code defects appear to be a small part of the e by TKinias · · Score: 1

    scripsit demaria:

    Hypothesis: Taking down IIS, Windows or Microsoft is more fun/cool.

    I don't think so... I'm not sure what I would do with a `r00ted' Windows box if it were given to me; why expend effort on it?

    --
    In principio creauit Linus Linucem.
  149. All bug reports are always welcome by seniorcoder · · Score: 1

    I welcome someone else spending time debugging my code. Sometimes they find benign bugs, sometimes they find real bugs. It's all good. It helps make my product stronger. I assume Apache will shortly have 0 bugs as reported by this automaton.
    Don't waste time getting bogged down in what appear to me to be irrelevant statistics about what percentage of bugs per whatever.... Just fix the bugs and move on.

  150. Considering that their previous report was Bogus.. by oldCoder · · Score: 1

    I would expect the current report from Reasoning to be bogus also. The previous report (on TCP/IP stacks) was about defect density (defects per line of code). This means a code base that was 3 times as bloated (code size) but only had twice the number of bugs would come out as being better than its competitors. And that report did not give information on code size or total number of bugs or on the performance of the TCP/IP stack.

    --

    I18N == Intergalacticization
  151. Defect density by El · · Score: 1

    So, if I insert 9 empty lines between each line of code, I've just lowered the defect density by 90%??? Are they counting comments and whitespace in the LOC count?

    --

    "Freedom means freedom for everybody" -- Dick Cheney

  152. Their bug-detection software is defective! by El · · Score: 1

    I've examined defect #1, and it obviously isn't a bug (the code checks the variable and breaks out of the loop if it is NULL). This casts serious doubts as to the accuracy of their results, doesn't it? Anybody want to examine the other 30 "defects"?

    --

    "Freedom means freedom for everybody" -- Dick Cheney

  153. ASSUME by Anonymous Coward · · Score: 0

    It appears that you ASS-U-ME that you never make mistakes in your code. Hence my earlier suggestion that you either work at MS or are a very junior coder.

    1. Re:ASSUME by Anonymous Coward · · Score: 0

      What the hell does working at MS have to do with this? Or what the hell does age have to do with this? Isn't that a bit juvenile? I've seen plenty of shitty OSS. As a matter of fact, Win2k has worked way better for me than either Linux or FreeBSD has. But that's just my experience, so I don't go around poking fun at these projects. And I've seen many talented young programmers. Your issue seems to be that of maturity. Grow up.

  154. This sooo does not matter by LilMikey · · Score: 2, Insightful

    This is a pointless study. While yes, the slight possibility that one may dereference a NULL pointer is a bad thing, it's minuscule compared to bad design. A perfectly programmed web server designed poorly will have bazillions more bugs and security flaws than a slightly bugged well-designed one. An objective code scanning bug-finder can't fix stupid.

    --
    LilMikey.com... I'll stop doing it when you sto
  155. prove it. by Mark19960 · · Score: 4, Interesting

    they don't say what they used for a comparison.
    when they tell us what they used, then I will believe it.
    this smells microsoft.

    bring it on! we want to know what it was compared against, sure as hell was NOT IIS...

  156. Defects vs. performance? by CapnWacky · · Score: 1

    What does this have to do with the web server performance? 0.53 vs. 0.51 defects per thousand lines is all well and good, but a) how often do these occur, and b) what about actual running time? This test seems worthless...

    --
    god's lonely man
  157. here we go again by Anonymous Coward · · Score: 0

    Let's complain about open vs closed source quality. Yet, the guy codes at work then comes home at night and does some free code...

    Then we look at the linux kernel and see all the sco unix code it contains. :)

  158. within the statistical margin of error by dh003i · · Score: 2, Insightful

    0.53 errors per 1000 for Apache, vs. 0.51 per 1000 for "commercial equivalents" (note that they fail to say how many equivalents were used to generate the average, or which ones)? That's definitely within the margin of error. Not only that, but Apache is a less mature FS/OSS project, so the comparison seems to favor the FS/OSS model.

    Furthermore, while presumably many commercial equivalents were used to generate the commercial average, only one Apache was used to generate the FS/OSS average error density. Again, very crappy statistics.

    Even if 100 different FS/OSS projects like Apache were used to generate that 0.53 average, and 100 different commercial equivalents used to generate the commercial average, it's probably still within the margin of error (or standard deviation).

    In short, this study = completely insignificant. Likewise, so was their previous study showing that FS/OSS has a lower bug-density, as it only used one FS/OSS project. To get useful statistics, you need hundreds of data-points -- not one.

  159. Science??? by nadam · · Score: 1

    This just feels like marketing to me.

    Duh! It's a press release, not a scientific paper.

    They are simply saying - look, we found 31 bugs in Apache, imagine how many bugs we can find in your software.

  160. LOC Metric by multipartmixed · · Score: 1

    I'd be much more interested in a metric based on a number-of-statements metric.



    This wouldn't be difficult to calculate: obviously, they have lexers and parsers which grok C code. This insulates the metric against coding styles, so that, say, some guy who litters his source with comments and braces, but writes the exact same effective code as another fellow will have statistically similar defect densities.


    The comments and braces (and whatnot) aren't the only examples where this is useful, either.. Consider the two snippets which follow:



    char *strcpy(char *P, const char *Q) {
        char *p; /* not const: we write through it */

        for (p = P; *Q; *p++ = *Q++);
        *p = (char)0;
        return P;
    }

    and

    char *strcpy(char *P, const char *Q)
    {
        char *p = P;

        while ((*p++ = *Q++)); /* copies the terminator as well */
        return P;
    }

    as well as

    char *strcpy(char *P, const char *Q)
    {
        char *p = P;
        char c;

        do
        {
            c = *Q; /* remember what was copied so the terminator ends the loop */
            *p = c;
            p++;
            Q++;
        } while (c);

        return P;
    }

    Even without comments, it's pretty plain to see that these segments, which pretty much all implement the same function with effectively the same code, with probably the same defects, have drastically different line counts -- but VERY similar statement counts. Remember, ++ side-effects, assignment operations, the three parts of the for(;;), would all be considered individual statements, even if they are not blatantly decomposed as such in the source code.



    And, for the -fpedantic PITAs out there, no I haven't even bothered to compile the code (or really think about it). It's just a friggin' example!

    --

    Do daemons dream of electric sleep()?
    1. Re:LOC Metric by p3d0 · · Score: 1
      Yeah yeah, it's the same old story. I think everyone who has really thought about LOC has had the same idea. The truth is that plain old lines-of-code is the most effective measure of software development.

      The problems you are trying to solve really are not problems in practice. Most files don't have many blank lines. Comments take just as long to write as any other code, and arguably deserve to be counted. Statements broken among multiple lines usually indicate that the author thought they were complex enough to warrant multiple lines, and therefore they probably deserve to be counted that way. Et cetera.

      And above all else, I don't think people consider a difference of 50% in LOC to be significant anyway, so there's no point worrying about whether people put the curly brace on a separate line. If you ever see someone say that system X is simpler because it has 25kLOC instead of 30kLOC, then you ought to mention that this could be attributed to differences in coding style.

      In short: if it ain't broke, don't fix it.

      I once had a code counter that went one step farther than you suggest: it simulated the effect of a very high-quality language-specific data compression algorithm, and computed the entropy of a piece of code. I'm not sure the results were any more valid than any other metric.

      --
      Patrick Doyle
      I mod down every jackass who puts his moderation policy in his sig. Oh, wait a sec....
  161. ah, interesting by pyrrho · · Score: 1

    so, evidently numbers are involved with this somehow... fascinating!

    --

    -pyrrho

  162. hmmm, interesting by pyrrho · · Score: 1

    so it occurs to me to run a few tests of my own... now that we are going to count not only actual errors, but errors that might potentially be added to the language.

    [scribble scribble][calculate][ponder] yes... the number of defects, including potential defects, is exactly infinite!

    Now, if only they used a language where it is impossible to code a defect, like Java, or Godland, there would be no problem!

    oh, the sarcasm! I'm so full of it!

    --

    -pyrrho

  163. Got THAT right by siskbc · · Score: 1
    IMNSHO, that ought to be standard for any mission-critical software. Bugs and the places that bugs live in are not created equal. The beauty of Apache (at least 1.13) is that the overall system can be very robust and reliable with rather buggy modules. I suspect the problem with IIS is that everything assumes everything else is perfect, which overall doesn't quite work so well.

    I looked at the "defect" report for Apache, and 29 of the 31 errors were null pointer dereferences (the other two were references to uninitialized variables). NO array overruns. I'd much rather have a null dereference (from run-time allocated memory, ostensibly) than an array overrun, which could be used to do a buffer-overflow attack. Apache had none of those.

    Furthermore, almost all the errors were in a handful of files, which one could probably assume weren't particularly critical. I'd love to see a re-analysis of Apache's "guts," as I believe it would be rock-solid.

    --

    -Looking for a job as a materials chemist or multivariat

  164. The first "defect" is provably not a defect at all by Anonymous Coward · · Score: 3, Informative

    Looking at their first "bug", a little manual inspection shows that it's in the "can't happen" category, even without knowing about hidden information. The code looks like this:

    current_provider = conf->providers;
    do {
        {some safe code}
        if (!conf->providers) {
            break;
        }
        current_provider = current_provider->next;
    } while (current_provider);

    and they identify the second-to-last line as the "possible NULL pointer reference". Note that the "break" before that line will be taken if the pointer is NULL, so it can't happen. In fact, the static analysis could have determined this if it were a little better at propagating values.

    First conclusion: subtract at least one "bug" from the 31 defects in Apache. This lowers the rate to 0.51, the same as the "average commercial code" number they quote. Yahoo!
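
    The arithmetic can be checked with a small sketch. It assumes the 58,944-line count from Reasoning's summary as quoted elsewhere in this thread, and the function name is made up for illustration:

```c
/* Defects per thousand lines of code (KLOC). */
double defects_per_kloc(int defects, long lines_of_code)
{
    return defects / (lines_of_code / 1000.0);
}
```

    With 31 defects in 58,944 lines this comes to roughly 0.53, and with 30 defects roughly 0.51.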

    Second conclusion: their static analysis must identify a lot of false positives, if the very first one in the list is one (I would look at more, but I should really get back to work...)

  165. Oops, found a bug (in Reasoning's C parser!!) by Anonymous Coward · · Score: 0

    Defect ID 14 in the report complains that:
    "the expression pattern++ = '\0' is not a valid pointer."

    in this line of code:

    1480 *pattern++ = '\0';

    Apparently their C parser has incorrect precedence rules; they think the assignment operator "=" binds more tightly than the pointer dereference "*".
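
    For reference, a small sketch of how C actually groups that statement. Postfix ++ binds more tightly than unary *, and assignment binds more loosely than both, so the line means *(pattern++) = '\0'. The function name is hypothetical:

```c
/* Writes a terminator at the position pattern points to, then returns
 * the advanced pointer. The statement groups as *(pattern++) = '\0'. */
char *terminate_and_advance(char *pattern)
{
    *pattern++ = '\0';
    return pattern;
}
```

    Called on the middle of a buffer, it cuts the string at that character and hands back a pointer to whatever follows.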

    So far, every one of their "defects" that I've examined looks like a non-bug. This one is not only not a bug; their tool doesn't even parse the C code properly.

  166. Re:Code defects appear to be a small part of the e by Anonymous Coward · · Score: 0

    We can't assume Apache and IIS are roughly equivalent in terms of code defects, and we certainly can't make any assumptions on the OS based on the fragmentary information given by Reasoning.

    For one, a large number of the "defects" listed by Reasoning are false positives, such as a warning about dereferencing a NULL pointer where the pointer cannot possibly be NULL due to an action on the previous line.
    And second, we have no idea what they compared Apache to or how they got hold of the source code to these mystery commercial offerings. They could be making everything up, and I'm inclined to believe that they are, given the reluctance of commercial providers to disclose source code.

    The fact is, IIS has a much smaller market share than Apache according to Netcraft and is closed-source so attackers can't just read the code... Yet it's broken more often according to Zone-H and more advisories come out for IIS than Apache according to CERT.
    Statistically speaking, IIS must have a much higher incidence of severe defects.

    Your comment was not insightful. It was misleading.

  167. Re:Code defects appear to be a small part of the e by IncohereD · · Score: 1

    Rewriting something from scratch is unfortunately the surest way to introduce new bugs. If the architecture was completely flawed it'll pay off eventually, but I would NOT trust the first iteration, just like I wouldn't buy a new car in its first model year. No one's had a chance to really hammer on it yet.

  168. Re:Code defects appear to be a small part of the e by phre4k · · Score: 1

    When you add IIS's considerable lead in private intranet hosting

    Tell me, if it is so private, how can people hack it? INTRANET means that the people who have access to it are very limited. How can it be a target for hackers if it is so limited?

    /Esben

    --
    "Nobody really checks their email any more. They just delete their spam"

  169. Modded Up! by Anonymous Coward · · Score: 0

    I modded you up cause you had good knowledge in your post, but why are you AC?

  170. Whoops by Anonymous Coward · · Score: 0

    Now all my moderation in this thread is gone. Gimme a break Slashdot, why can't moderators post AC in threads they moderate? Are they going to put all five points on their AC post? It's going to fucking show up in meta-moderation and they're going to get screwed anyway.

    Fuck you Slashdot. THINK a little bit about the implementation with checks and balances before you rush to check out your new affiliate check from ThinkGeek.

  171. Both security and obscurity are useful. by MickLinux · · Score: 1

    Let me point out that the problem phrase is "security through obscurity". Both security and obscurity are useful.

    Obscurity isn't hiding it in a vault somewhere in New York, and telling you to try to read the letter. Obscurity is when you don't want anyone to read the letter, and so you bury it in a vault and don't tell anyone.

    But you can't depend on obscurity. You especially can't depend on obscurity if you're trying to sell a product. So if that's so, you'd better have security in addition to any obscurity you have.

    --
    Correct Horse Battery Staple: 72 bits of entropy. Enter "Correct H" into google. When it generates the phrase, that's

  172. Re:Code defects appear to be a small part of the e by Anonymous Coward · · Score: 0

    Apache, in contrast, reads its configuration at startup, and uses it to determine which modules of code are loaded, and how they are used to process requests - fixing the behavior of the web server at startup.

    Not true. The command "killall -SIGUSR1 httpd" will tell Apache to reread its config file.
    Changes can quite happily be made to the server configuration without having to restart it.
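
    A rough sketch of how a long-running daemon can react to such a signal, assuming a POSIX system. This is not Apache's code, and every name in it is hypothetical. The handler only sets a flag, and the main loop checks that flag at a safe point and rereads its configuration there:

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <signal.h>
#include <string.h>

/* Set from the signal handler, read from the main loop. */
static volatile sig_atomic_t reload_requested = 0;

/* Handlers should do as little as possible: just record the request. */
static void on_sigusr1(int signo)
{
    (void)signo;
    reload_requested = 1;
}

/* Installs the SIGUSR1 handler; returns 0 on success, -1 on failure. */
int install_reload_handler(void)
{
    struct sigaction sa;
    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_sigusr1;
    sigemptyset(&sa.sa_mask);
    return sigaction(SIGUSR1, &sa, NULL);
}

/* Called from the main loop: returns 1 once per pending request, and the
 * caller would reread its configuration file when it does. */
int consume_reload_request(void)
{
    if (!reload_requested) {
        return 0;
    }
    reload_requested = 0;
    return 1;
}
```

    A server loop would call consume_reload_request() between requests and reload its settings whenever it returns 1, so the process never has to exit to pick up changes.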

  173. Apache license by 200_success · · Score: 1
    IANAL but don't they have to include a copy of the Apache License given that they publish fragments of the source code in their defect report?

    Free software licenses work because of copyright law. Copyright law says that you cannot copy the code, but the authors grant you exceptions under contract law if you obey the terms of the license.

    However, quoting excerpts is considered fair use under copyright law, so they can ignore the license.

    I'm not a lawyer either.

  174. YOU FAIL IT! by Anonymous Coward · · Score: 0

    Hahahaha!

  175. Apache = A Patch by kyoko21 · · Score: 1

    If I recall the original name of Apache, it was a play on words of "a patch." Considering that it is "a patch", the result is really not that surprising when compared to its commercial counterparts. The good thing is that it's free. Yay!!!

  176. Apaches track record?!?!?!? by Anonymous Coward · · Score: 0

    hahahhahaha

    have you seen how many security exploits have been released for apache over the years?

    No one should trust apache.

    1. Re:Apaches track record?!?!?!? by Fallen_Knight · · Score: 1

      It's a hell of a lot better than IIS.......

  177. JUST FIX IT! will this work? by Anonymous Coward · · Score: 0

    Why can't somebody write a little preprocessor that fixes all the warning-type code by prepending stuff like

    if (NULL == myPointer) { ... }

    where necessary so as to minimize the count of these sorts of "errors"?

    seems trivial to me, but i'm not that good a coder

  178. HOW ABOUT THIS? by Anonymous Coward · · Score: 0

    How hard is it to just fool the thing by making a preprocessor that "corrects" "incorrect" code (adds the necessary code, etc.)? Wouldn't that be easy?

  179. Poof by Anonymous Coward · · Score: 0
    "Reasoning found 31 software defects in 58,944 lines of source code of the Apache http server V2.1 code"

    Of course, they were all fixed within 7 hours.

  180. Design by Contract by peerogue · · Score: 1
    Hello AC,

    You have not heard of the Design by Contract paradigm? Because this is what we're talking about here.

    In design by contract, the callee makes up the contract (precondition) and the caller promises to never violate the precondition. If you do, it's an error in the caller.

    You can choose to enforce the precondition by checking it in the routine. When your software is validated, you can remove this runtime checking. But with design by contract, you never replicate the check of the precondition.

    After you have started practicing this paradigm, you will see that your code is clearer, is less cluttered with tests, and more robust than it can be with defensive programming.
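
    A small sketch of the idea in C, with hypothetical function names. The callee states its precondition and checks it with assert() while the software is being validated. Building with -DNDEBUG removes the check, and the caller establishes the precondition once instead of every layer re-testing it:

```c
#include <assert.h>
#include <stddef.h>

/* Contract: s must not be NULL. Violating this is the caller's bug. */
size_t count_char(const char *s, char c)
{
    assert(s != NULL); /* precondition check; compiled out with -DNDEBUG */

    size_t n = 0;
    for (; *s != '\0'; ++s) {
        if (*s == c) {
            ++n;
        }
    }
    return n;
}

/* Caller at the system boundary: it deals with possibly-missing input
 * once, so count_char never has to repeat the test. */
size_t count_slashes_or_zero(const char *maybe_path)
{
    if (maybe_path == NULL) {
        return 0;
    }
    return count_char(maybe_path, '/');
}
```

    The NULL test appears exactly once, at the point where untrusted input enters, and the routine below it stays free of defensive checks.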

  181. Update to Reasoning's erm.. reasoning by galtsavenger · · Score: 1

    Just to update those interested, I contacted Reasoning and asked why they used a pre-release development version of Apache to analyze, and their response was that they are hoping to show the impact of peer inspections over time, and they will be posting more inspections of Apache as the code matures. I think it's great that the unstable development branch has only as many defects as the commercially available web servers... Should be interesting!