Slashdot Mirror


MySQL & Open Source Code Quality

dozek writes "Perhaps another rung for the Open Source model of software development, eWeek reports that an independent study of the MySQL source code found it to be "in fact six times better than that of comparable commercial, proprietary code." You can read the eWeek write-up or the actual research paper (reg. required)."

90 of 446 comments

  1. Six times better? by pyite · · Score: 5, Insightful

    Six times better? I didn't know it was possible to quantify code quality in that manner. Interesting.

    --

    "Nature doesn't care how smart you are. You can still be wrong." - Richard Feynman

    1. Re:Six times better? by doowy · · Score: 3, Insightful

      It was based purely on "defect density" - the number of errors per thousand lines of code.

      MySQL had a defect density of 0.09 and the industry standard was found to be 0.57 defects per thousand lines of code.

      The MySQL development team has since fixed all of the 'defects' that were found in the study (which ranged from a few variables used before initialization to memory leaks).
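
The headline figure is just a ratio of two defect densities. A minimal sketch of that arithmetic, using only the numbers quoted in this comment (the helper name `defect_density` is made up for illustration and is not taken from the study):

```python
def defect_density(defects: int, lines_of_code: int) -> float:
    """Return defects per thousand lines of code (KLOC)."""
    return defects / (lines_of_code / 1000)

# Figures as quoted in the comment above, both in defects per KLOC.
MYSQL_DENSITY = 0.09
COMMERCIAL_DENSITY = 0.57

# The "six times better" claim is the ratio of the two densities.
ratio = COMMERCIAL_DENSITY / MYSQL_DENSITY
print(f"commercial / MySQL = {ratio:.2f}")  # prints "commercial / MySQL = 6.33"
```

Note that the ratio says nothing about how severe the individual defects are; it only compares counts per line.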

      --
      ..mork
    2. Re:Six times better? by man_of_mr_e · · Score: 3, Informative

      Sadly, this isn't what most people assume it means. Reasoning's software only finds "obvious" defects, such as null pointer assignments. It doesn't (and can't) determine if a bit of code does what it's supposed to do, only that it does whatever it does without any danger of crashing.

      Basically, it's no different from running your code through BoundsChecker or CodeWizard, or any number of other such tools that check for obvious errors (Null pointers, obvious buffer overflows, dangling references, etc..)

      While I have no doubt that MySQL's code is perhaps "cleaner" than your typical unpublished code, I have plenty of doubt that MySQL's code is "better" than unpublished code in terms of efficiency, logic errors, etc..

  2. Just wait... by cableshaft · · Score: 5, Funny

    ...until I release my MySQL source code to the open source community. Then that 6x multiplier will drop down to 2x.

    Yeah, it's really that bad. Gets the job done, though. Hell to maintain. Probably would've helped if I documented any of it.

    Maybe I should read that Code Complete book I keep meaning to read sometime.

    --
    Creator of the popular web game Proximity
    1. Re:Just wait... by I8TheWorm · · Score: 2, Informative

      Good practices nonetheless, and not really win32 specific. Another fairly good one is The Pragmatic Programmer: From Journeyman to Master by Andrew Hunt, David Thomas.

      --
      Saying Android is a family of phones is akin to saying Linux is a family of PCs.
  3. Don't rest on your laurels. by grub · · Score: 5, Insightful


    Perhaps another rung for the Open Source model of software development

    Uhh... no.

    It's a glowing report for this particular open source project, but that brush shouldn't be used to paint all open source. That will just lull open source developers into a false sense of euphoric contentment. Code quality didn't get this far by having a fixed target; that target should be a carrot on a stick that will never quite be reached.

    --
    Trolling is a art,
    1. Re:Don't rest on your laurels. by MartinG · · Score: 3, Insightful

      The point is that for some folks it's still unfortunately the case that open source software is automatically worse than proprietary software, despite some of us knowing how outdated and wrong their ideas are.

      The "rung" in question here is the one where open source progresses in those people's minds from "must be worse" to "can be as good or better".

      There's no suggestion of "all open source is better" anywhere.

      --
      -- MartinG To mail me: echo kewyjlcxyzvjfxbqwh | tr bcefhjklqvwxyz .@adgimnoprstu
    2. Re:Don't rest on your laurels. by grub · · Score: 2, Insightful


      it's still unfortunately the case that open source software is automatically worse than proprietary software

      All software sucks, the degree of suckiness is what matters. :)

      --
      Trolling is a art,
  4. Measurements by Stiletto · · Score: 5, Insightful


    Undoubtedly()
    {
    when();
    you = measure(quality);
    in.defects();
    per->lines_of(code, anyone);
    can = write(good, solid, code);
    }

    1. Re:Measurements by Walterk · · Score: 5, Funny

      Post:2: warning: return-type defaults to `int'
      Post:2: In function `Undoubtedly':
      Post:3: warning: implicit declaration of function `when'
      Post:4: `you' undeclared (first use in this function)
      Post:4: (Each undeclared identifier is reported only once
      Post:4: for each function it appears in.)
      Post:4: warning: implicit declaration of function `measure'
      Post:4: `quality' undeclared (first use in this function)
      Post:5: `in' undeclared (first use in this function)
      Post:6: `per' undeclared (first use in this function)
      Post:6: `code' undeclared (first use in this function)
      Post:6: `anyone' undeclared (first use in this function)
      Post:7: `can' undeclared (first use in this function)
      Post:7: warning: implicit declaration of function `write'
      Post:7: `good' undeclared (first use in this function)
      Post:7: `solid' undeclared (first use in this function)
      Post:8: warning: control reaches end of non-void function

    2. Re:Measurements by Anonymous Coward · · Score: 3, Funny

      Think about this: While you were writing that, those guys from school that stole your lunch money and kicked your ass were getting laid.

  5. Re:Duh! by pyite · · Score: 3, Informative

    MySQL is not touted as Enterprise because it's not Enterprise. Sure, it's fine for running Slashdot, but I wouldn't want it storing mission critical data. Oracle may be slower, but I'd much rather trust it to make sure my data is properly stored than MySQL.

    --

    "Nature doesn't care how smart you are. You can still be wrong." - Richard Feynman

  6. If you would RTFA... by Theatetus · · Score: 5, Informative

    ...they quantified it by dividing verified defects by lines of code. MySQL had 0.09 bugs/KLOC while the "commercial" defect density was 0.53 bugs/KLOC. (Their use of the term "commercial" confused me since MySQL is, after all, a "commercial" project, just an open-source one.)

    --
    All's true that is mistrusted
    1. Re:If you would RTFA... by pyite · · Score: 5, Insightful

      "Defect" is also a difficult term to define. Some errors are much worse than others. It's not all about numbers, folks. Don't get me wrong, I'm not saying that MySQL isn't a great product. I just get skeptical when I hear things talked about in terms of "better" and "best."

      --

      "Nature doesn't care how smart you are. You can still be wrong." - Richard Feynman

    2. Re:If you would RTFA... by SonicBurst · · Score: 5, Insightful

      Not only is it hard to define defect (and it is very obvious that some defects are worse than others), but this code review sounds like it only spots "grammatical" or style errors in the code. It doesn't sound like it could find a defect in an algorithm implementation or logic. To me, these are where the true defects are, in the logic/reasoning breakdowns.

      --

      Geek used to be a four letter word. Now it's a six-figure one.
    3. Re:If you would RTFA... by rembem · · Score: 4, Insightful

      0.09 vs 0.53 bugs/KLOC can also mean MySQL has six times the amount of code per line, compared to an average "commercial" program. Those numbers should be divided by a code-density-factor.


    4. Re:If you would RTFA... by B'Trey · · Score: 5, Insightful

      I'm not sure what you mean by "grammatical" or style errors. If you're talking about syntax errors, those should prevent the code from compiling. I'm not aware of how coding style can be an error (unless you're programming in Python).

      The specific errors in MySQL were dereferencing null pointers, failure to deallocate memory (memory leaks), and use of an uninitialized variable. These aren't the only bugs that such an analysis can find; they're the ones that were found in MySQL. And they're definitely errors in logic.

      Certainly, there are bugs that such an analysis can't find. If you define PI as 3.15, your calculations are going to be off. If you create a function to determine the circumference of a circle as 2 * PI * Diameter, you've got a bug. I suspect that those are the types of errors in logic that you were referring to, and you're right that they will not be caught by a code analysis. However, that doesn't mean that comparing the frequency of the errors that CAN be caught between two programs is an invalid act. From my experience, programmers who make fewer of the former errors also make fewer of the latter. Analyzing catchable errors is a good metric for the frequency of errors in a given source tree, even if all errors aren't caught.
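
As a rough illustration of the PI example, here is a sketch of code that a null-pointer or leak checker would have nothing to complain about, yet which returns the wrong answer. The function names are invented for this example:

```python
import math

PI = 3.15  # deliberately wrong constant, as in the comment above

def circumference_buggy(diameter: float) -> float:
    # Two logic errors: a bad constant, and diameter used where the
    # formula 2 * pi * r expects a radius. Nothing here can crash.
    return 2 * PI * diameter

def circumference(diameter: float) -> float:
    # Correct version: circumference = pi * d.
    return math.pi * diameter

d = 10.0
print(f"buggy:   {circumference_buggy(d):.2f}")
print(f"correct: {circumference(d):.2f}")
```

Both functions run cleanly on every input, so a tool that only looks for crashes and leaks sees no difference between them. Only a test that knows the expected value would.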

      --

      "The legitimate powers of government extend only to such acts as are injurious to others." Thomas Jefferson.

    5. Re:If you would RTFA... by Tassach · · Score: 5, Insightful
      No defects != good software.

      A flawless implementation of a crap algorithm is still crap. I don't care if your bubble-sort routine has no memory leaks or buffer overruns; it still scales O(N^2). Likewise, a so-called "database" which does not implement key features like transactions and stored procedures is fundamentally flawed even if there are zero coding errors.

      MySQL may be well-written, but it's still a piece of crap by the standards of any professional DBA.

      --
      Why is it that the proponents of "one nation under God" are so eager to get rid of "liberty and justice for all"?
    6. Re:If you would RTFA... by leviramsey · · Score: 2, Informative

      MySQL does have transactions, and has had them for quite some time. Stored procedures are due in a future version.

    7. Re:If you would RTFA... by Dun+Malg · · Score: 5, Interesting
      they quantified it by dividing verified defects by lines of code.

      Problem with that is that it assumes the same "code density". Granted, it's probably not going to differ by a factor of six, but remember the old question about programmer productivity:
      who's more productive: the coder who solves a given problem with 100 lines of code written in one hour, or the coder who solves it with 10 lines in two hours?

      I mean, simple stuff like doing this:

      bool function(int i);
      main(void)
      {
      int i;
      if(function(++i))
      //blah blah blah
      }
      ...instead of:
      bool function(int i);
      main(void)
      {
      int i;
      bool foo;
      foo = false;
      i++;
      foo = function(i);
      if(foo)
      //blah blah blah

      }

      ...will give you a threefold difference in line count (specifically counting lines in the main() function). Throw in an identical line using malloc in each, both forgetting to free it later, and you've got a "bug density" of .33 for the former, and .14 for the latter. Heck, you could have two un-freed mallocs in the latter and it'd still only be at .25! I'm not saying the study is wrong-- I'd rather have the code out where I can see it, no matter WHAT the "bug density"-- I'm just saying that I wouldn't take any statistic that is derived using "lines of code" as a variable as a serious, hard number.
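
The numbers above can be reproduced with a few lines of arithmetic. This sketch assumes the count covers only statement lines inside main() (no braces, no comment line), which is one reading that happens to match the quoted figures; the constant names are made up:

```python
def bugs_per_line(bugs: int, lines: int) -> float:
    """Bug density expressed per line (not per KLOC), as in the comment."""
    return bugs / lines

# Assumed counting rule: statement lines inside main() only.
TERSE_LINES = 2    # `int i;` and the `if(function(++i))` line
VERBOSE_LINES = 6  # the six statements of the longer version

# Each variant gains one line for the malloc that is never freed.
terse = bugs_per_line(1, TERSE_LINES + 1)
verbose = bugs_per_line(1, VERBOSE_LINES + 1)
# Two leaked mallocs in the verbose variant add two lines and two bugs.
verbose_two_leaks = bugs_per_line(2, VERBOSE_LINES + 2)

print(f"{terse:.2f} {verbose:.2f} {verbose_two_leaks:.2f}")  # prints "0.33 0.14 0.25"
```

The same leak scores very differently depending only on how the surrounding code is laid out.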
      --
      If a job's not worth doing, it's not worth doing right.
    8. Re:If you would RTFA... by Sxooter · · Score: 4, Insightful

      Sorry, but until MySQL has a mode where ALL tables are transaction safe, or at least throws an error when you try to create a fk reference to a non-transaction safe table, its transactions are too prone to data loss due to human error.

      It's a good data store, but the guys programming it have to "get it" that transactions can't be optional in certain types of databases, and neither can constraints, or fk enforcement.

      MySQL has a tendency of failing to do what you thought it did, and failing to report an error so you know. This is a legacy left over from being a SQL interpreter over ISAM files. It makes MySQL a great choice for content management, but a dangerous choice for transactional systems.

      --

      --- It is not the things we do which we regret the most, but the things which we don't do.
    9. Re:If you would RTFA... by Tassach · · Score: 2, Informative

      Everyone does not know this, and everyone does not understand it, or I wouldn't have spent a substantial percentage of my career cleaning up other people's messes.

      --
      Why is it that the proponents of "one nation under God" are so eager to get rid of "liberty and justice for all"?
    10. Re:If you would RTFA... by scrytch · · Score: 4, Informative

      If only it were MySQL just lacking features that would, after much mudslinging at the ideas themselves, be grudgingly retrofitted into a new table type. MySQL's brokenness goes deeper than that.

      MySQL's attitude toward data integrity can be summed up as "if the constraint can't be satisfied, do it half-assed anyway". I find myself having to write application code to manage data integrity with MySQL, something I can take for granted with a real database.

      --
      I've finally had it: until slashdot gets article moderation, I am not coming back.
    11. Re:If you would RTFA... by jc42 · · Score: 3, Insightful

      ...they quantified it by dividing verified defects by lines of code.

      If I write a script to go through my C and perl code, and make sure that there's a newline before and after every brace, that will approximately double the lines of code, and will thus cut my error rate in half.

      This isn't a joke; I've done this on a couple of projects where they measured output by lines of code, just to illustrate the real impact of such measures.

      OTOH, if I deleted the comments from my code, that would approximately double my error rate, so I guess I won't do that.
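
A crude sketch of the brace trick, for anyone who wants to try it. The C snippet and the function names are invented for the example, and the regex is deliberately naive (it would also mangle braces inside string literals and it throws away indentation):

```python
import re

# A small made-up C function used as input.
C_SOURCE = """\
int clamp(int x) {
    if (x < 0) { return 0; }
    return x;
}
"""

def count_lines(source: str) -> int:
    """Count non-blank lines, the usual 'lines of code' proxy."""
    return sum(1 for line in source.splitlines() if line.strip())

def explode_braces(source: str) -> str:
    """Put a newline before and after every brace."""
    return re.sub(r"\s*([{}])\s*", r"\n\1\n", source)

before = count_lines(C_SOURCE)
after = count_lines(explode_braces(C_SOURCE))
print(before, after)  # prints "4 8"
```

With one hypothetical defect in this function, the measured density drops from 1/4 to 1/8 per line without a single token of the program changing.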

      I'm also reminded of a project that I worked on a while back in which nearly every routine had some sort of error, sometimes several, and I didn't fix any of them. This would look really bad, I know. But you can probably guess what my task was. I was writing a test suite for a compiler. Most of the tests were to verify that the compiler would catch a particular kind of error. So of course my code contained that error, and the test script verified that the result was the proper error message.

      This is one of the fundamental problems with nearly every definition I've ever seen of "quality code". They usually don't measure the suitability of the code for the task. If your task is to measure a system's response to failures, your code will of course intentionally produce those errors in order to determine the system's responses. So what is an error in other situations is exactly correct code. Counting errors detected without asking what the task was gives you exactly the wrong results in such a case.

      I'm not sure I'd want my name associated with a project that didn't include this sort of test code in the basic distribution. If there are problems with an installation, I want to know about them before the users start using the stuff. And I want to know in a manner that will pinpoint the problems, not from the usual bug report that typically describes some symptom that is only remotely related to the actual problem. So nearly everything that I work on has a component with a high error rate, run under the control of a script that verifies the correctness of the error messages. If the installation doesn't handle the errors correctly, the users are given output that will tell me what the problem is.
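
A minimal sketch of that kind of test suite. It uses Python's built-in `compile()` as a stand-in for "the compiler under test", which is an assumption made purely for illustration; the case list and `run_case` are invented names:

```python
# Each case: (description, source text, exception type expected or None).
# The first two sources are intentionally broken.
CASES = [
    ("missing closing paren", "print('hello'", SyntaxError),
    ("bad indentation", "def f():\nreturn 1", IndentationError),
    ("valid control case", "x = 1 + 1", None),
]

def run_case(source, expected):
    """Return True if the compiler reacted the way the test expects."""
    try:
        compile(source, "<test>", "exec")
    except SyntaxError as exc:  # IndentationError is a subclass of SyntaxError
        return expected is not None and isinstance(exc, expected)
    # Compiled cleanly: that is a pass only if no error was expected.
    return expected is None

results = {name: run_case(src, exp) for name, src, exp in CASES}
for name, ok in results.items():
    print(f"{'PASS' if ok else 'FAIL'}: {name}")
```

A defect counter pointed at the first two sources would report two errors, even though the suite is doing exactly what it was written to do.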

      I'd only be impressed by a study that handles such a test suite correctly. One that counts such "errors" is worse than useless; it actively discourages useful test suites.

      (Actually, just before reading this /. article, the task I was working on was adding some more tests to a test suite for a package that I'm porting to a number of different systems.)

      --
      Those who do study history are doomed to stand helplessly by while everyone else repeats it.
    12. Re:If you would RTFA... by jc42 · · Score: 2, Informative

      Yeah, and we might also add that in some circumstances, a bubble sort is a very good way to sort data. There's a lot of data around that is normally sorted into just one (usually time) order, and which is also produced in an order very close to that. For such data, "efficient" sorts are usually very inefficient, and a bubble sort can beat them easily.

      Most theoretical work on sorting has assumed randomly-sorted input data. That's an important case, sure. But there are many situations where it's not a valid assumption. And a sort that's good on random data is not necessarily very good on non-random data.

      --
      Those who do study history are doomed to stand helplessly by while everyone else repeats it.
    13. Re:If you would RTFA... by neelm · · Score: 3, Interesting

      So what you are saying is you would rather have your DB crash over not supporting some feature in a way which is only applicable in select situations?

      As a real world programmer (versus someone living in an academic world of theory) I prefer the what-I-have-works-and-I'm-Working-on-the-rest approach. In the real world, stability and performance are paramount to feature set. Also, when you consider the domain of creating web driven applications, some features of a DB become less important because of the stateless nature of an HTTP connection. Server-side cursors don't do well in a cookie.

      > MySQL may be well-written, but it's still a piece of crap by the standards of any professional DBA.

      Which is why I give little attention to certifications.

    14. Re:If you would RTFA... by Greyfox · · Score: 3, Insightful
      According to the article (You DID read the article right?) they found (in the mysql code) 15 null pointer dereferences, 3 memory leaks and 3 usages of uninitialized variables. Apparently they look for comparable defects in commercial code and I think everyone who programs will agree that those are fairly major defects.

      The code scanners I've looked at will flag potential errors even if it's impossible to reach the error condition in code, so it's possible that some or all of that stuff may never have actually happened, but it's generally better to program defensively anyway. All it takes is for some bozo to change your if condition and all of a sudden you're segving all over your customer's important data. 15 null pointer dereferences in nearly a quarter million lines of code is a pretty low number though. I've seen more than that in a single thousand line file written by "professionals."
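
As a quick sanity check, the defect counts above can be combined with the 0.09 defects/KLOC figure quoted elsewhere in this thread to see what code size they imply. This is back-of-the-envelope arithmetic, not a number taken from the report:

```python
# Defect counts as listed in the comment above.
defects = {
    "null pointer dereferences": 15,
    "memory leaks": 3,
    "uninitialized variables": 3,
}
total = sum(defects.values())

# Density figure quoted elsewhere in the thread (defects per KLOC).
DENSITY = 0.09

# Implied code size in thousands of lines.
implied_kloc = total / DENSITY
print(total, round(implied_kloc))  # prints "21 233"
```

Roughly 233 thousand lines is consistent with "nearly a quarter million".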

      --

      I'm trying to teach myself to set people on fire with my mind... Is it hot in here?

    15. Re:If you would RTFA... by Greyfox · · Score: 3, Insightful
      Yeah, and the 3 users on the planet who actually need a full fledged SQL database can install Oracle or DB2. Although I've had my indexes corrupted and other horrible things with both those database packages.

      I've worked on several projects interacting with SQL databases and I've only seen one really take advantage of the power of the database. Most of them are using Oracle as a glorified DBASE III, and as a glorified DBASE III, MySQL is much less expensive. And I've seen entire companies built around DBASE III applications.

      --

      I'm trying to teach myself to set people on fire with my mind... Is it hot in here?

    16. Re:If you would RTFA... by frostman · · Score: 2, Interesting

      A funny thing to add to this...

      I'm doing my first MySQL work (done a lot of Oracle and a little PostgreSQL) and I was *flabbergasted* when I realized that, when you update a table but the data has not actually changed, you get success and zero rows updated.

      Which is exactly what you get (and should get) when you try to update and no rows are found to update.

      I suppose with no triggers anyway, it might be a tiny bit faster to skip the actual update when the data hasn't changed, but to real DB folks this is not only counter-intuitive, it's *scary*.
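
For comparison, here is a sketch of the "rows matched" behaviour described above as the expected one. It does not talk to MySQL at all; it uses Python's built-in `sqlite3` module as a stand-in, and the table and values are invented:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO t VALUES (1, 'alice')")

# Update an existing row to the value it already holds.
# SQLite counts the row as updated because the WHERE clause matched it.
unchanged = conn.execute("UPDATE t SET name = 'alice' WHERE id = 1").rowcount

# Update a row that does not exist: nothing matches.
missing = conn.execute("UPDATE t SET name = 'bob' WHERE id = 99").rowcount

conn.close()
print(unchanged, missing)  # prints "1 0"
```

With this counting rule the caller can tell "row exists, value already correct" apart from "no such row", which is the distinction lost when both cases report zero.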

      'Course this is 3.23, maybe they changed that in 4. I read that they added booleans in 4... though just as an alias for ItsyBitsyInt.

      MySQL is fast and free and there is a lot of community support for beginners. And if you have oodles of RAM, the HEAP tables are a sweet thing indeed. As such it's good. But I sure hope nobody ever makes me use it for anything mission-critical... and I fear for people using this as an "enterprise" DB.

      (donning flame-proof suit...)

      --

      This Like That - fun with words!

    17. Re:If you would RTFA... by a_ghostwheel · · Score: 3, Informative

      Wrong. Insertion sort (linear or binary) will be an efficient way to sort "almost-sorted" data. Plus, as for the phrase "efficient sorts are usually very inefficient" - you have to realize (if you don't know this) - sort algorithms are classified into stable ones (e.g. merge sort) and non-stable (e.g. quick sort).

      Stable algorithms have identical efficiency no matter what kind of order input data had. Non-stable algorithms have predefined best and worst cases.

      But, overall - you will not be able to come up with the data where bubble sort will be the best way to sort - usually you will end up using merge or quick sort for large data sets and insertion sort for small data sets (some quick-sort implementations use insertion sort during last stages of sorting - when data has been "almost" sorted).
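
On terminology: in most textbook usage, "stable" describes whether records with equal keys keep their original relative order, which is a separate question from how running time varies with input order. A small sketch of that property using Python's built-in `sorted`, which is documented as stable (the sample records are made up):

```python
# (name, score) pairs; two pairs of records share the same score.
records = [("bob", 2), ("alice", 1), ("carol", 2), ("dave", 1)]

# Sort by score only. A stable sort keeps "alice" before "dave" and
# "bob" before "carol", because that is their order in the input.
by_score = sorted(records, key=lambda r: r[1])
print(by_score)
# prints [('alice', 1), ('dave', 1), ('bob', 2), ('carol', 2)]
```

An unstable algorithm is free to emit equal-keyed records in either order, which matters when sorting on one column after another.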

    18. Re:If you would RTFA... by Tassach · · Score: 3, Informative
      Most theoretical work on sorting has assumed randomly-sorted input data
      Bullshit. Every textbook comparison of sort algorithms I've ever seen assumes three cases: nearly-sorted data, random data, and inverse-sorted data. Even if bubblesort were the fastest for nearly-sorted data (working from memory, I'm pretty sure it would run in O(n) as its best case), it's still O(n^2) for the other two cases. Quicksort, heapsort, and insertion sort all scale differently; but even assuming their best-case performance is worse than bubblesort's best-case, their worst-case performance is FAR better - typically O(n log(n)) or thereabouts. IIRC, The AWK Programming Language has some excellent sample code which graphs the performance of the major sorting algorithms for different kinds of input.

      I seem to recall that insertion sort is also O(n) on nearly-sorted input, so it would be a much safer choice than bubblesort for the situation you describe. You have to consider best- and worst-case scenarios as well as the nominal path. IMHO, bubble sort has no place outside of an instructional setting.
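
A sketch for checking these recollections by counting comparisons. Both routines are plain textbook versions written for this example (bubble sort with the usual early exit), and the input sizes are arbitrary:

```python
def bubble_sort(items):
    """Bubble sort with early exit; returns (sorted list, comparisons)."""
    data, comparisons, n = list(items), 0, len(items)
    while True:
        swapped = False
        for i in range(n - 1):
            comparisons += 1
            if data[i] > data[i + 1]:
                data[i], data[i + 1] = data[i + 1], data[i]
                swapped = True
        n -= 1  # the largest remaining element is now in place
        if not swapped:
            return data, comparisons

def insertion_sort(items):
    """Insertion sort; returns (sorted list, comparisons)."""
    data, comparisons = list(items), 0
    for i in range(1, len(data)):
        key, j = data[i], i - 1
        while j >= 0:
            comparisons += 1
            if data[j] <= key:
                break
            data[j + 1] = data[j]  # shift the larger element right
            j -= 1
        data[j + 1] = key
    return data, comparisons

inputs = {
    "sorted": list(range(100)),
    "reversed": list(range(99, -1, -1)),
    # Sorted except for one small element stuck at the far end.
    "one straggler": list(range(1, 100)) + [0],
}
counts = {
    name: (bubble_sort(seq)[1], insertion_sort(seq)[1])
    for name, seq in inputs.items()
}
for name, (bubble, insertion) in counts.items():
    print(f"{name:14} bubble={bubble:5} insertion={insertion:5}")
```

On already-sorted input both make a single linear pass, and on reversed input both are quadratic. The "one straggler" case is where they part ways: the lone small element moves only one slot per bubble pass, while insertion sort places it in one go.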

      --
      Why is it that the proponents of "one nation under God" are so eager to get rid of "liberty and justice for all"?
    19. Re:If you would RTFA... by Tassach · · Score: 2, Insightful
      The fact that you used big O notation and referenced a bubble sort tells me you're still in school.
      Considering that I've been out of school and working as a software engineer for over 15 years, I'd say you have much to learn. As long as we're in ad hominem mode, your comment leads me to believe that you are suffering from some combination of arrogance, ignorance, inexperience, and unprofessionalism; but that's beside the point.

      I find mathematical notation to be clearer and more succinct than the longhand equivalent. "O(n)" is, IMHO, a superior way of saying "scales linearly". All of the really good engineers I've worked with over the years have held the same opinion.

      As to not having used MySQL in a long time, that's true. I don't use MySQL because I see no purpose for it. If I need a fast non-relational, non-transactional data store I'll use an ISAM solution. If I need a real relational database I'll use Sybase or Oracle (or MS-SQL if I have no choice, or even PostgreSQL if I have to make an open-source zealot happy). The only time I'd use MySQL was if I needed a semi-relational database with half-assed transactions, no stored procedures or triggers, broken referential integrity, a plethora of non-standard behaviors, and rampant data integrity issues.

      If MySQL had stuck to its original vision of being a SQL frontend to an ISAM database, it might actually be worthwhile. Instead it's become a bastard hybrid that's too bloated to be a good ISAM db and too broken to be a good relational db. I'll admit that there are jobs that MySQL can do well -- however, my professional opinion is that there are better tools for that class of tasks.

      --
      Why is it that the proponents of "one nation under God" are so eager to get rid of "liberty and justice for all"?
    20. Re:If you would RTFA... by B'Trey · · Score: 2, Informative

      Well, actually, I'm not speaking only from my experience. I'm also speaking from a number of other published articles and reports. Rational has a few out concerning Purify. Reasoning (who did the MySQL study) has a couple out as well. There have also been independent studies which confirm the (possibly biased) findings of the companies who write the software analysis suites. If you're really interested in the subject, Google is your friend.

      There are no guarantees, in this business or any other. But in general, people who are meticulous in their coding tend to be meticulous in all areas. If they bother to run software analysis tools and correct the bugs there, they usually bother to spend time evaluating the design and looking for bugs there as well. They also tend to test their code once written, which helps to identify those errors in coding logic that a software analysis suite can't find.

      And regardless of whether or not an uninitialized pointer is an easy bug to find or not, if it exists in the code it's still likely to cause an application crash. Would you rather run code which has one of those type errors every seventeen hundred lines, or code which has one every eleven thousand one hundred lines?
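
Those two line counts are simply the reciprocals of the densities quoted in this thread (0.57 and 0.09 defects per KLOC). A short sketch of the conversion, with an invented helper name:

```python
def lines_per_defect(defects_per_kloc: float) -> float:
    """Average number of lines between defects for a given density."""
    return 1000 / defects_per_kloc

commercial = lines_per_defect(0.57)
mysql = lines_per_defect(0.09)
print(round(commercial), round(mysql))  # prints "1754 11111"
```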

      --

      "The legitimate powers of government extend only to such acts as are injurious to others." Thomas Jefferson.

    21. Re:If you would RTFA... by eloki · · Score: 3, Insightful

      A flawless implementation of a crap algorithm is still crap.

      No.. a flawless implementation of a crap algorithm just doesn't scale well. Of course bug rate is not the only criterion used when evaluating software, but people spend hundreds of man-hours fixing bugs.

      It demonstrates that the quality of open source code is not automatically worse than professional proprietary code (which some people believe is the case). The important thing is that it's at least an attempt at formal study (and not simply personal collating of anecdotal reports).

    22. Re:If you would RTFA... by Sxooter · · Score: 2, Informative

      Firebird is closer to Postgresql in capabilities, and closer to MySQL in terms of size (Postgresql is friggin huge, and sucks up disk space quickly, so it's a bad choice for embedded db applications with limited space unless you're willing to do a lot of hacking to make it "lose weight").

      It felt a lot like postgresql to me. I didn't do anything fancy like writing a stored proc or a trigger or something like I've done in Postgresql.

      --

      --- It is not the things we do which we regret the most, but the things which we don't do.
    23. Re:If you would RTFA... by rifter · · Score: 2, Insightful

      Understand the limitations of the tool you choose to work with and live with it or use a different tool. Nobody's forcing you to use any specific tool.

      No, but somebody is trumpeting the glory of that tool as end-all-be-all, while simply ignoring the points of people who dare to break rank, spurn the kool-aid, and point out flaws. I swear every damn slashdot article about open source tools has some thread like this. And every time we get some version of "love it or leave it." What happened to actually trying to improve on the basis of valid criticism?

      The article and the posts following it seem to promote MySQL as a production database to compete with Oracle. It is clear that while it is a nice database with good features and is useful for many projects, it lacks many things which DBAs like about RDBMS systems like Oracle. It is also clear that if any of the posts here and linked articles accurately describe MySQL behaviour it violates some very basic rules of software design.

  7. Clearly biased! by unborn · · Score: 2, Funny

    This article must have been written by supporters of closed software. The ratio of 0.57/0.09 is 6.333~ and the article states it is 6. Clearly FUD. Let the flaming begin!

  8. On paper it looks better by the+real+darkskye · · Score: 3, Insightful

    And line of code for line of code there are fewer known errors in MySQL than there are assumed/predicted/mean errors in their commercial counterparts, but that doesn't answer the question of how MySQL compares performance-wise to Oracle or <flameretardent coating>MS SQL 2003</flameretardent coating>

    Just my 0.03 (adjusted for inflation)

    --
    Music is everybody's possession.
    It's only publishers who think that people own it.
    Fuck Beta
    ~John Lenno
    1. Re:On paper it looks better by Tassach · · Score: 2, Interesting
      Access has foreign keys, but unless they added it in the latest version, it does not support real transactions. Add to that the fact that its locking model is fundamentally broken, and you have something which is just powerful enough to let you do things with it that you shouldn't. MySQL suffers from the exact same problem.

      I shouldn't complain -- I've made a lot of money over the years cleaning up the messes left by inexperienced people who thought Access or MySQL were real databases.

      --
      Why is it that the proponents of "one nation under God" are so eager to get rid of "liberty and justice for all"?
  9. Re:"6 times better" by dcordeiro · · Score: 4, Insightful

    I agree with you that you can't simply measure quality but...
    If you just RTFA, you'll see that it is not "6 times better" but "6 times fewer bugs found than the average on commercial products"

    The only thing wrong in the article:
    They should replace the term "commercial" with "closed source", because MySQL is also a commercial product and what makes it different is the open source model.

  10. Lines of Code? by ksa122 · · Score: 3, Interesting
    Reasoning performed its independent analysis using defect density as a prime quality indicator. Defined as the number of defects found per thousand lines of code, MySQL's defect density registered as 0.09 defects per thousand lines of source code.
    Can any measurement that uses lines of code to compare code that could be written in different languages or for different types of applications be very accurate?
    1. Re:Lines of Code? by nojomofo · · Score: 4, Insightful

      I'm under the impression that most "bugs" in software (certainly most bugs in my code) aren't bugs like these in the article (null dereferences, uninitialized variables, etc), but they're algorithm bugs. As in, there's a subtle interplay between different parts of complicated algorithms that can be easy for programmers to miss. Those types of bugs are going to be much harder to find, and certainly not going to be found in analysis such as this one.

  11. All that's missing ... by JSkills · · Score: 4, Interesting
    All that's missing - to go along with the defects per lines of code comparison - is a comparison of features and performance benchmarking to other commercially built database products. Now that would be the complete comparison.

    As strong proponent of MySQL, I'd be very curious to see how it stacks up in those regards.

  12. Stanford Checker by eddy · · Score: 4, Interesting

    Anyone know how this one is faring? Will it ever be released? It's based on GCC, right? How many students can it pass between until it's "distribution"?

    The reason I'm asking is because I saw that one member of the team has jumped over to a company called Coverity where one can read:

    Originally developed by a team of researchers in the Computer Systems Lab at Stanford University, Coverity's patent-pending source code analysis technology successfully detected over 2000 bugs in Linux including hundreds of security holes.

    I just think it'd be horrible if they used the GPL'ed GCC to develop their methods (having access to a full portable compiler onto which to do research and development is hardly a "small thing"), and then lock these same methods away from the community.

    I'm grateful for their work on checking Linux, but really... this smells bad, IMHO.

    (If you don't know what I'm talking about, don't assume it's off-topic, okay? The Stanford Checker is a related topic to the Reasoning analysis of MySQL, and I'm not sure we'll ever have a _better_ fitting topic to discuss this)

    --
    Belief is the currency of delusion.
    1. Re:Stanford Checker by Error27 · · Score: 4, Interesting

      I wrote a similar tool to the Stanford Checker called smatch.

      I post the bugs and stuff that it finds on kbugs.org. The most recent kernel that I've posted is 2.6.0-test11.

      One thing that I was working on a couple weeks ago was invalid uses of spinlocks. Here are my results from that. I found quite a few places that don't unlock their spinlocks on error paths etc.
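To make the "unlock on error paths" idea concrete, here is a toy sketch in Python. It is not how smatch or the Stanford Checker works; it assumes each code path has already been flattened into a list of event names (a made-up representation) and simply checks whether a path ends with the lock still held.

```python
def paths_leaving_lock_held(paths: dict[str, list[str]]) -> list[str]:
    """Return the names of paths that finish with more locks than unlocks."""
    leaky = []
    for name, events in paths.items():
        held = 0
        for event in events:
            if event == "lock":
                held += 1
            elif event == "unlock":
                held -= 1
        if held > 0:
            leaky.append(name)
    return leaky


# Hypothetical traces: the error path returns early and skips the unlock
example_paths = {
    "success": ["lock", "work", "unlock", "return"],
    "error": ["lock", "work", "return"],
}

print(paths_leaving_lock_held(example_paths))
```

A real checker has to derive those paths from the source and track which lock is which; this only shows the property being checked.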

    2. Re:Stanford Checker by owenomalley · · Score: 2, Interesting

      > I just think it'd be horrible if they used the
      > GPL'ed GCC to develop their methods (having access
      > to a full portable compiler onto which to do
      > research and development is hardly a "small
      > thing"), and then lock these same methods away
      > from the community.

      Yeah, that is the way it is going to go. Dawson and his students and employees use gcc for a parser and have no intention of releasing their tool under any open source license. They claim that they modified gcc to write out Abstract Syntax Trees (ASTs) that are then read in to their tool (the Coverity/Stanford checker), which Coverity is selling commercially. Richard Stallman has long fought to keep gcc from publishing useful ASTs to prevent things like this from happening, but it is obviously impossible to stop in the long run and he should just concede the point.

      We should pressure Dawson and Coverity to at least release the modified gcc parser that will dump the AST. ASTs enable all kinds of program analysis tools, such as doxygen and static analysis tools. Furthermore, we should pressure FSF to roll the changes back into the GCC mainline.

  13. Debatable scale by Basje · · Score: 5, Insightful

    I do believe that Open Source is better than proprietary. Faults per 1000 lines of code may seem like a valid scale, but I think it is indicatory at best, not proof.

    * It does not take into account the design of the software. This is often as important as the actual quality of the code.
    * It does not take into account the kind of errors. This is related to the first, but a buffer overflow that allows root access is worse than a failed instruction.
    * It does not even take the length of lines into account. Shortening the lines could lower the number, without actually changing anything.

    So, small victory, but the race goes on.

    --
    the pun is mightier than the sword
    1. Re:Debatable scale by ebuck · · Score: 4, Insightful

      Good points, and I agree.

      Also if "lines of code" are going to be part of any code comparisons, then a standard should be proposed that does (at a minimum) the following:

      1. Formats the code consistently. We don't want one project to have more lines of code (and therefore lower bug density) because they put a brace or parenthesis on a separate line while others do not.

      2. Strip the comments. Someone could decrease bug density by heavy, heavy commenting. Comments are a vital part of coding (and more usually is better), but they have no impact on the bugginess of the code.

      3. Format conditionals, blocks, and function calls consistently, or better yet, ditch the line counting and count bugs per (function call, assignment operation, operation).

      Lines are easy to count, but they hold so little meaning in determining code quality.
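A minimal sketch of the kind of normalization proposed above, in Python. It assumes C-style source and only handles blank lines, `//` comments, and lone braces; block comments and real reformatting are left out. The function name and both sample snippets are invented for illustration.

```python
def count_code_lines(source: str) -> int:
    """Count lines that are not blank, not pure // comments, and not lone braces."""
    count = 0
    for line in source.splitlines():
        stripped = line.strip()
        if not stripped or stripped.startswith("//"):
            continue
        if stripped in {"{", "}"}:
            continue
        count += 1
    return count


# The same function in two layouts
allman_style = """int f(int x)
{
    // double it
    return x * 2;
}
"""

compact_style = """int f(int x) {
    return x * 2; }
"""

print(len(allman_style.splitlines()), count_code_lines(allman_style))
print(len(compact_style.splitlines()), count_code_lines(compact_style))
```

The raw line counts differ by more than a factor of two, while the normalized counts match, so a fixed number of bugs would yield very different raw densities for identical code.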

    2. Re:Debatable scale by Zathrus · · Score: 4, Interesting

      Faults per 1000 lines of code may seem like a valid scale, but I think it is indicatory at best, not proof.

      It's actually a really miserable scale because of your 3rd point. If they ran the code bases through something like cindent and standardized the code formatting and removed all comments and whitespace then it's a somewhat more valid comparison. I didn't look at the actual research paper -- maybe they did. Odds are, your other two points are valid though.

      Additionally, they only say that the commercial code is "comparable". What does that mean (again, maybe answered in the paper)? Do they have roughly the same features? Are the query optimizers of roughly the same quality? Do they support the same platforms? I can't think of a major commercial database that doesn't exceed MySQL in all of these areas (ok, excepting SQL Server which fails on the 3rd only). Maybe it was a minor player in commercial databases. Dunno.

      These are the kinds of points that are raised when someone bashes OSS. There's no reason that they shouldn't be raised when the inverse is true as well. MySQL has progressed nicely and is worthy of consideration for light to moderate database loads now, I don't question that. All I'm saying is don't take things at face value.

      So, small victory, but the race goes on.

      The nice thing is that this is small and succinct -- it's suitable for showing to upper level management. That's a big win IMHO -- because normally the text bites they read are biased against free/open software.

    3. Re:Debatable scale by G4from128k · · Score: 2, Interesting

      It is very true that we can measure the "quality" of software with many different dimensions. The parent posts' suggestions of assessing design, error type, and parsimony (lack of dilution of errors with verbose code) are good.

      But the existence of alternative scales does not detract from the original assessment of defects/line unless we have separate knowledge that OSS is unfavorably biased. Do we have reason to believe that OSS is more poorly designed than commercial software, or that OSS has more serious bugs, or that OSS is especially verbose? Without that additional information, it is just as likely that commercial software has a worse design, more serious bugs, and bloated code in addition to a higher defect density (I know I can think of at least one dominant vendor that is guilty of all three sins). In fact, a higher defect density is probably a good indicator for both worse design and the presence of more serious bugs.

      Yes, the race still goes on. It would be nice to benchmark MySQL on these other dimensions of quality and benchmark other OSS projects. But without an a priori reason to suspect that OSS is worse on these other dimensions, I think we can conclude that the report is a victorious validation for MySQL and its team.

      --
      Two wrongs don't make a right, but three lefts do.
    4. Re:Debatable scale by TheMidget · · Score: 2, Insightful
      It does reflect the quality of design, but not necessarily in the way you think it does. What if, due to poor design, the code is unnecessarily bloated (needs 1000 lines for what a competitor does in 100)?

      If the bloated program has only 5 times as many bugs as the small one, it would still be considered "twice as good", because it has ten times more code for the same task!
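The arithmetic above, worked through in Python. The bug counts are hypothetical; only the 100-versus-1000-line and five-times-the-bugs proportions come from the comment.

```python
def defects_per_kloc(defects: int, lines_of_code: int) -> float:
    """Defect density: defects per thousand lines of code."""
    return defects * 1000 / lines_of_code


lean_lines, lean_bugs = 100, 2            # hypothetical lean program
bloated_lines, bloated_bugs = 1000, 10    # ten times the code, five times the bugs

lean_density = defects_per_kloc(lean_bugs, lean_lines)
bloated_density = defects_per_kloc(bloated_bugs, bloated_lines)

print(lean_density, bloated_density)
print(lean_density / bloated_density)
```

By density the bloated program scores twice as well, even though it ships five times as many bugs for the same task.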

  14. 6 times better? by kjba · · Score: 5, Insightful
    I don't see how you can make the statement that MySQL is 6 times better than the proprietary code from the facts that the defect densities are 0.09 and 0.57 per 1000 lines respectively.

    This just looks like some quasi-scientific statement, trying to express things as a number that really don't fit such a representation. For example, as the number of defects decreases, it becomes increasingly more difficult to find the ones that are left. And is code that contains no bugs at all infinitely much better than code that contains a single bug which hardly ever occurs?

    1. Re:6 times better? by Urkki · · Score: 2, Interesting
      • And is code that contains no bugs at all infinitely much better than code that contains a single bug which hardly ever occurs?

      Fortunately for the "model", there is no substantial piece of code that contains just one rarely occurring bug, let alone code that contains no bugs at all. Therefore such infinities never need to be considered in real life cases.

      But if you think of it theoretically, if that one rarely occurring bug potentially causes your company to go bankrupt (like being sued for huge damages), then I'd say the bugless version is infinitely better.
  15. As John Carmack put it... by rafael_es_son · · Score: 5, Interesting

    The main difference between open and *MOST* closed code is the fact that the early release of closed code means mucho mas money to corporate pigs and dogs, thus, proper requirements analysis, design, coding and testing are usually pummeled in the name of happy-go-lucky capitalism. "It will be ready when it is ready." -Carmack "I love America!" -Murphy

    --
    HAD
  16. Re:"6 times better" by revividus · · Score: 2, Insightful

    I don't think MySQL is intended to be `comparable' to OracleSQL, but someone else may be able to clarify.

  17. Re:New unit ? by pragma_x · · Score: 5, Funny

    Since we're measuring Defects per 1000 lines, perhaps calling them "Gates" or "Ballmers" might be more appropriate.

  18. Re:Duh! by I8TheWorm · · Score: 5, Interesting

    I've used MySQL, Oracle, MS SQL, DB2, and MSDE. I'm not sure I get your comment about MS SQL server. Like any other RDBMS, a little performance tuning goes a long way. As a matter of fact, until Oracle's release of 10g, MS SQL beat all commercial offerings in the TPC benchmarks.

    MS has a buggy OS and an awful model for business practice, but I think MS SQL server is a fairly nice offering. It's too bad it only runs on Windows servers though.

    --
    Saying Android is a family of phones is akin to saying Linux is a family of PCs.
  19. OSS To Vendors by the_mad_poster · · Score: 2, Funny

    Neener neener!

    Now, I'm sure we can all be very mature about this...

    --
    Alito: A vote for Alito is a punch in the eye to put that bitch back in her place!
  20. Don't generalize! by Junks+Jerzey · · Score: 3, Informative

    This "proves" that MySQL is better than commercial offerings. Good. A lot of people knew that. Hats off to the developers. But...

    1. This cannot be generalized into a property of all open source projects.
    2. It's more a tribute to the architecture and original core developers of MySQL than anything else.
    3. Realize that even though MySQL is an open source product, MySQL AB is the *company* that organizes and pays for MySQL development. So, again, you can't generalize this into something that covers late night hackers working on personal projects in their basements (the open source geek fantasy).

    MySQL is awesome! But let's be careful about this story, okay? It's the over-generalization that gives OSS/Linux advocates a bad name ("The Gimp is equivalent to Photoshop!").

    1. Re:Don't generalize! by SuperBanana · · Score: 4, Insightful
      This "proves" that MySQL is better than commercial offerings. Good.

      No it doesn't. It "proves" that on average, by line, MySQL has fewer errors in code. It says nothing of the severity of the errors in either package.

      Furthermore- MySQL is not even close to being equal in feature set to almost any commercial DB; replication/backup sucks, it's not ACID compliant, it had no transaction support until recently, no stored procedures, no triggers.

      How on earth could you possibly compare it to almost any commercial SQL DB which has all these...and say MySQL is better?

      A lot of people knew that.

      No, every two bit web designer thinks its the greatest thing since sliced bread, since they think a select w/group+sort is an advanced query. Every professional DBA I've met refuses to work with MySQL and/or hates it, and they can go on for an hour about why. When are you people going to realize that PostgreSQL is so much better than MySQL, save some incredibly risky performance options?

      MySQL is awesome! But let's be careful about this story, okay? It's the over-generalization that gives OSS/Linux advocates a bad name ("The Gimp is equivalent to Photoshop!").

      But you just said "This proves that MySQL is better than commercial offerings!"

    2. Re:Don't generalize! by ajs · · Score: 2, Insightful

      Most of your points on MySQL are out of date. Its featureset has progressed a great deal since, apparently, you last looked into it. Even way back when this was a hot topic (when PostgreSQL, another excellent open source DB, was an up-and-comer), MySQL developers were already saying that most of people's concerns were being addressed in upcoming releases... Those releases have since come and gone (mostly in the form of 4.0, though IMHO, 4.1 is MySQL's finest moment, and its current release status as alpha is kind of funny given that it's been rock stable for a year).

      Just off the top of my head, you mention ACID. MySQL now offers a choice of back-end table managers that range from the original fast, but strictly non-ACID version to Berkeley DB (which is fairly fast and supports transactions, but I think falls short of ACID in terms of rollback) and the fully ACID InnoDB, which is the (now open source) back end from the Progress database.

      So take your pick, depending on your app. Do you want speed? Transactions? Full ACID? Better yet, you can make that choice on a table-by-table basis!

      MySQL also has the best full-text-searching features I've seen in any DB, open or closed.

      There are limitations, and I might choose another DB for certain specific tasks (e.g. Oracle for statistics in the DB) but MySQL is a great first choice for most projects.

  21. Must have been baaad commercial code then.. by jordan · · Score: 4, Interesting

    Because there are portions of the MySQL code that are just painful to look at.

    Take for instance the part that takes as input the key index size and calculates internal buffer sizes. The option's size is an unsigned long long, but they cast it to an unsigned long all over the place, do in-place bitshifting on the cast (and cause it to wrap -- try specifying 4G for your key index sometime and you'll get 0), and the quality of code in that case is just painfully horrible to look at or even figure out what it's doing.
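A small Python simulation of the wrap described above. This is not the MySQL source; it assumes a platform where `unsigned long` is 32 bits, and mimics the cast by masking to the low 32 bits. The names `to_u32` and `key_buffer_size` are invented for the example.

```python
MASK_32 = 0xFFFFFFFF  # low 32 bits, standing in for a 32-bit unsigned long


def to_u32(value: int) -> int:
    """Mimic a cast to a 32-bit unsigned integer by discarding the high bits."""
    return value & MASK_32


# 4G is exactly 2**32, so nothing is left after the cast
key_buffer_size = 4 * 1024**3
print(to_u32(key_buffer_size))

# Shifting after the cast can wrap too: 3G << 1 is 6G, which keeps only 2G
three_gig = to_u32(3 * 1024**3)
print(to_u32(three_gig << 1))
```

Keeping the value in the wider type until after the arithmetic is done would avoid both results.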

    I could only shudder to think what the quality of the commercial product looked like, in comparison. Hell, I'll have nightmares if I consider the quality of MySQL++ as a comparison..

    --jordan

  22. Re:Duh! by James+Thompson · · Score: 5, Informative

    Need a particular reason? Take your pick. http://sql-info.de/mysql/gotchas.html

  23. Total Crock by nberardi · · Score: 3, Insightful

    So how many of the eWeek people do you think saw the code to MS SQL Server or Oracle SQL? I highly doubt that they even were able to get to the front door to knock on either of the doors to ask if they could see the code. I mean this just looks like pure propaganda to anybody that has half a brain and keeps up with the industry.

    Don't get me wrong I love MySQL, but these types of articles are just as bad as the people saying that MacOS X isn't that secure because it has fewer users. Or the guy claiming that MS is way superior in the Internet Server world. These types of articles are just there to cause controversy and separate us as a community Mac/Windows/Linux combined.

    I am not putting any merit in this article and neither should you.

  24. Re:Duh! by pyite · · Score: 5, Informative

    Up until recently, MySQL had no transaction or atomic operation support. As such, you need to write application code to trap problems. Whereas with Oracle, when you run an atomic operation, you know with certainty whether the query failed in its entirety. I also believe stored procedure support is somewhat lacking in MySQL (however, there is that new Java function support). The MySQL 3 tree does not enforce constraints which is something most essential for data integrity. MySQL does not have subrow locking, whereas enterprise databases do. Once again, MySQL is great. I use it. However, it is not enterprise.

    --

    "Nature doesn't care how smart you are. You can still be wrong." - Richard Feynman

  25. Hardly fair by stinkyfingers · · Score: 3, Insightful

    There's a hell of a difference between 235,667 lines of code and 35 million lines of code. Just like there's a difference between 1000 lines of code and 235,667 lines of code. That is, the more lines of code, the more likely a defect will survive.

  26. Re:Slashdot is a bad commercial for MySQL. by pegr · · Score: 5, Funny

    "but the Slashdot database regularly becomes confused, such as posting a comment to the wrong story"

    That's not the db... around here, we call them "trolls"...

    ;)

  27. MySQL is a "TOY" as far as RDBM'S goes by ad0le · · Score: 3, Insightful

    First off, I think MySQL is a fantastic product. It's the perfect mix of speed and ease of use well suited for small to medium sized datastores where speed and reliability are a must. That being said, I think it's unfair to describe this product alongside others such as Oracle, MSSQL (blow me guys, its a great product) and even PostgreSQL and SAP DB (which is the best OpenSource option in my opinion). The codebase for MySQL will never achieve the magnitude of the aforementioned products so it should be used that way. Just my 2 cents.

    --
    My mother never saw the irony in calling me a son-of-a-bitch.
  28. MS SQL Server by rlp · · Score: 2, Funny

    I'm in the midst of upgrading a SQL Server 2000 installation. MS issued their latest patch in August - a mere 56 MB patch. Hopefully that will fix some of the flakiness I've been seeing.

    --
    [Insert pithy quote here]
  29. Re:Now apply to IE patches.... by thebatlab · · Score: 3, Interesting

    That open source patch was quite shoddily and hastily written. It wasn't even a patch really. Using it as representative of open source is not fair in any way whatsoever to other successful open source products.

    "Now apply the 'Rule of 6 times' to Microsoft's closed source IE patches..."

    There is no 'Rule of 6 times'. An analysis concluded that MySQL had a very limited number of defects in their code base. Kudos to them. This doesn't define a rule to be used in the open source vs. closed source holy war.

  30. Re:Duh! by Frymaster · · Score: 3, Informative
    Take your pick. http://sql-info.de/mysql/gotchas.html

    those are just bugs! what about lack of features?

    • no subqueries
    • no stored procedures
    • no triggers
    • no foreign key constraints
    • no updates on joins

    at least there's row-level locking now... finally.

  31. MS SQL Sybase ASE by Anonymous Coward · · Score: 3, Informative

    MS SQL is basically a revamped Sybase. So, on UNIX & Linux you could use Sybase ASE.

  32. Re:Duh! by An0maly · · Score: 2, Insightful

    open sourcers don't necessarily get paid to release code, so they don't have the luxury of releasing shit just so they can keep their jobs by releasing updates for the next 5 years. when a commercial product finally DOES become useable they make a whole new buggy/bloated product that they can release fixes and patches for.

    --
    "...if you don't like your job, you don't strike. You just go in every day and do it really half-assed..." -Homer
  33. FUD by Kenneth+Stephen · · Score: 5, Insightful

    This is proof positive that the marketing engine has started churning in the Linux / Open Source arena. The quoted statistics are meaningless. Here is a short list of things (in no particular order) that are wrong with this "study" (who paid for it anyway?):

    Lines of code is meaningless as a reliable measure of anything. The most this number can be used for is for assessing the high level complexity (i.e. simple, non-trivial, or hard) of an application / code construct. It is absolutely pointless to compare two different applications against each other by lines of code. This means that you can say that one is non-trivial and the other is complex or you can say that both are complex, but there is no valid way of determining (by using this particular metric) that one application is more complex than the other. I believe this is the fundamental flaw in this "study".

    The study ignores capabilities. If application A has feature a, b, and c, and application B has features a, b, c, d, e, f, g, h , is it even meaningful to compare the number of defects detected between applications A and B? And no - normalizing it by lines of code is not valid (see previous point).

    Testing methodology : from the defects quoted in the article, it appears as if the "study" did white box testing on MySQL. This is hardly complete. While null pointer dereferences are certainly terrible, I would be also very very concerned about bugs pertaining to SQL capabilities, data integrity, performance, etc. If I go out and do a comparison of RDBMS's for a client, my report wouldn't be complete at all without covering these areas. How come the "study" doesn't mention any of these things?

    Let's face it : this is a paid propaganda article by the marketing machinery. Much like Microsoft has done in the past.

    --

    There is no such thing as luck. Luck is nothing but an absence of bad luck.

  34. It is embarrassing to show bad code. by jhines · · Score: 4, Insightful

    It is really embarrassing to have bad code with your name on it, released to the public.

    Not only that, but there is a small percentage of coders when presented with an ugly solution to a problem, will pretty it up, just "because". And it is a good way to get known in the OSS world.

    Unlike the corporate world, working but ugly code is hidden deeper and deeper, and people go out of their way to avoid it.

  35. Toy DBMS by leandrod · · Score: 2, Interesting

    Seen lots of intelligent comments about length of lines and potential bloat skewing the results, but there is one more issue to consider: design.

    No matter how good the coding itself, if the design is broken, the tool is broken, period.

    And MySQL has a broken design. So broken that the upgrade path isn't MySQL X or something the like, but MaxSQL -- in fact, rebranded SAPdb. That SAPdb is at most at Oracle v7.2 levels tells lots about MySQL.

    I could be more specific, but do your own research in Google -- lack of SQL compliance, lack of features to enable declarative coding at the server instead of procedural client code, and so on.

    Now, the interesting part. Suppose MySQL AB would have a sudden insight and repent of their un-SQL, anti-relational ways. Unlikely, you say; yet possible. Now suddenly they have to recode, or change drastically the current code. The resulting tool will be probably much bigger than the current, because SQL is baroque; or even worse than much bigger, because of MySQL backwards compatibility.

    The sheer bloat will make even this faulty measure of bugs/KLoC skyrocket. Now, run the comparison again...

    Not to say SQL compliance shouldn't be attained. In fact, bloat in the SQL DBMS is a more than good enough tradeoff against bloat in the application. The ideal would be a RDBMS, but while there isn't a MyDataphor a SQL DBMS should do.

    Even today, I don't care about comparing to, say, Oracle or MS SQL Server. IBM DB2 would be a better baseline, but best of all the real competitors: PostgreSQL and Alphora Dataphor.

    --
    Leandro Guimarães Faria Corcete DUTRA
    DA, DBA, SysAdmin, Data Modeller
    GNU Project, Debian GNU/Lin
    1. Re:Toy DBMS by kpharmer · · Score: 2, Informative

      > Even today, I don't care about comparing to, say, Oracle or MS SQL Server. IBM DB2 would be a better
      > baseline, but best of all the real competitors: PostgreSQL and Alphora Dataphor.

      I think you've got your dbms' mixed-up:
      Oracle, Informix, and DB2 are all of comparable complexity and power: Oracle's partitioning is the simplest and its clustering the most complex. DB2 & Informix have more complex partitioning - but can scale beowulf-style to hundreds (if not thousands) of separate servers.

      SQL Server is less functional than the above servers, though obviously similar to Sybase (due to its heritage).

      Postgresql is less functional than SQL Server - though it's a fine product anyway.

      MySQL is less functional than Postgresql.

      Not aware of any other database that occupies the limited transaction support / limited ANSI support niche that mysql does. MSQL perhaps?

  36. MySQL and Commercial Licenses by Anonymous Coward · · Score: 3, Interesting

    I'm a little confused. I thought I understood how to make profit with the GPL, but now I'm not sure.

    MySQL GPL'ed all their products. (presumably so they could get developers and bug-fixes to their product for no charge.) However, they offer "commercial" licenses for people who want to integrate MySQL into their software, but don't want to GPL it. How can they do that? Presumably, any improvements/bugfixes/modifications that came from the community would be GPL, and therefore cannot be re-integrated under a more restricted license. I'm a little confused here. How can they take code that has been released under the GPL and turn around and release it under a more restrictive license?

    1. Re:MySQL and Commercial Licenses by mcc · · Score: 2, Informative

      Well, let's see.

      The FSF demands you to sign your rights over because they want to be able to effectively and easily defend the copyrights of all GNU software in court. For example, if GNU software is having its copyright infringed, they want to be able to go right ahead and act with immediate legal authority on that software, rather than having to track down every single contributor to that project-- some of whom may no longer be contactable-- and get permission to proceed with a legal action. They are open about this. They tell you this up front.

      MySQL AB demands that you sign your rights over because they want to be able to take the code you contribute, repackage it as a commercial product, and sell it for their own profit. They are open about this. They tell you this up front.

      While I don't think there's anything necessarily bad about what Mysql is doing, it seems pretty easy to me to state that there's a fundamental difference in "openness" between these two situations.

  37. Re:Quality vs. Features by scambaiter · · Score: 2, Insightful

    yes, sure. Stuff like stored procedures or views are just toy features nobody really needs for database development... Better do all those things in your application code, makes it so much easier;) Come on, if you really need some ultra-fast small reduced-to-the-max sql database you might look at sqlite, if going for some bigger real life application you might discover that those bloated features actually do make sense... and one day you might find yourself posting things like "foreign key constraints would be so cool to have in mysql" as some of us did ages ago...

    --
    sick of sigs... *sigh*
  38. DMCA Offense! by mustangsal66 · · Score: 3, Funny

    By including the use of 'stdio.h' to which we (SCO) own the rights to, you have violated the DMCA.

    MrHanky, you now must either pay us for the use of said file ($699) or cease and desist.

    We hold rights to your future earnings from your use of our file, and we option the rights to your children's earnings.

    Thank you
    Daryl

    soo so sorry... It just popped into my head...

    --
    Why worry? Each of us is wearing an unlicensed "nucular" accelerator on his back.
    Sig changed for readability by G.W.
  39. Re:MySQL vs. Oracle by IANAAC · · Score: 4, Insightful
    If you're using any of Oracle's standard feature set, you'll have a tough time converting everything over. Oracle is much, much more SQL standards compliant (what's with MySQL's backticks anyway?). If your applications use stored procedures, triggers, primary and foreign keys, transaction-based recovery/redo, you're looking at a complete rewrite of your apps. Regardless of what database you choose to use, you're looking at at least a partial rewrite, but why complicate matters more than you need to?

    Sorry, but my opinion is pretty strong on this. Going from anything Oracle to MySQL is NOT trivial.

  40. Re:Duh! by proj_2501 · · Score: 2, Informative

    there are foreign key constraints, but only on certain table types, and only in certain versions, and only on certain column types.

    on mysql 3.x, the table types that support foreign key constraints don't support transactions, and vice versa.

  41. SCO Has Just Announced by Bruha · · Score: 3, Funny

    That they're filing suit against MYSQL for violating their IP on code quality.

  42. Re:MySQL vs. Oracle by mydigitalself · · Score: 2, Insightful

    it really depends on how heavily your developers have embraced 8i. as another poster mentioned - if they are really exploiting it then you will have a big migration task. if your applications only perform basic SQL statements - then you could probably get away with it. actually, if all you do is perform basic SQL, then you aren't utilising oracle to its full potential and you'd probably get a better ROI (return on investment) by moving to MySQL.

  43. Re:MySQL vs. Oracle by kpharmer · · Score: 4, Insightful

    Porting between dbms products depends primarily on two issues:
    1. usage of vendor extensions
    2. usage of standard relational functionality

    Generally speaking, if you've minimized #1 in your application you can easily port between Oracle, DB2, SQL Server, Sybase, PostgreSQL, etc: sure, you could hit some issues with jdbc drivers, and may need to port a few idioms (partitioning for example), but it shouldn't be a killer. But going from any of the above list to mysql isn't suggested: you'll get hung up on #2 (it doesn't support standard SQL or DDL)

    Realistically, if I wanted to go to a less expensive product than oracle I'd look down this list:
    - db2 (1/3 to 1/2 oracle cost)
    - sybase (cheaper than oracle, but dwindling market share)
    - firebird (very low cost)
    - postgresql (free)
    All of the above are mature relational databases that you could port oracle applications from.

    But you mentioned 'mission critical'. At this point I'd be very cautious about either postgresql or mysql in a mission-critical role. How important is it to you that you can recover 100% of your data in the event of a database crash? I'd put my money (and career) on db2 or oracle delivering that kind of quality over mysql...

  44. Kinds of errors -- it's Reasoning, Inc. again by Anonymous+Brave+Guy · · Score: 5, Insightful
    Not only is it hard to define defect (and it is very obvious that some defects are worse than others), but this code review sounds like it only spots "grammatical" or style errors in the code.

    It does indeed sound a bit like that, and with good reason. If you notice, the "independent review" was carried out by Reasoning, Inc., and we've heard of them before in these parts.

    For the benefit of those who haven't seen this trollfest^H^H^H^H^H^H^H^H^Hstory in its previous incarnations, Reasoning's services spot what some people call "systematic" errors, things like NULL pointer dereferencing or the use of uninitialised variables. As many people note every time this subject comes up, any smart development team will use a tool like Lint to check their code anyway, as a required step before check-in and/or as a regular, automated check of the entire codebase, and so any smart development team should find all such errors immediately. IOWs, it's grossly unfair to compare open and closed source "code quality" on this basis. Any project that has errors like this in it at all isn't serious about quality, and it shouldn't take an external study to point this out.

    Serious code quality is not dictated by how many mechanical errors there are that slip through because of weaknesses in the implementation language. Rather, it is indicated by how many "genuine" logic errors -- cases where the output differs unintentionally from the specifications -- there are. Of course, no automated process can identify those, but to get a meaningful comparison of code quality, you'd need to investigate that aspect, rather than kindergarten mistakes.
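    To make the distinction concrete, here is a minimal sketch of a "genuine" logic error. The leap-year function and its spec are hypothetical, chosen only for illustration; nothing here comes from MySQL or from Reasoning's report. The first function has no undefined names and no null dereference, so a mechanical checker would have nothing to flag, yet its output differs from the specification.

```python
def is_leap_year(year: int) -> bool:
    """Intended spec (assumed for this example): the Gregorian leap-year rule."""
    # Mechanically clean, but it forgets the century rule,
    # so it disagrees with the spec for years such as 1900.
    return year % 4 == 0


def is_leap_year_fixed(year: int) -> bool:
    """Same spec, with the century rule applied."""
    return year % 4 == 0 and (year % 100 != 0 or year % 400 == 0)


if __name__ == "__main__":
    for y in (1900, 2000, 2004):
        print(y, is_leap_year(y), is_leap_year_fixed(y))
```

    Only a comparison against the specification (a test, a review, a user report) separates the two versions, which is the kind of defect a density figure built on mechanical checks does not count.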

    There are other objections to their principal metric as well. For starters, source code layout is not normally significant in C, C++ or Java, so any metric based on line count is going to be flawed at best. But the big objection is that they're talking about childish mistakes, and comparing supposedly world class software based on childish mistakes isn't helpful (except to dispel the myth that some big name products have sensible development processes).
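    The line-count objection can be shown with a small sketch. The two snippets below are made-up C-style fragments with identical tokens and different layout, and the single "defect" is an assumed number; the helper names are hypothetical. The point is only that defects per thousand lines moves when the brace style does.

```python
def count_lines(source: str) -> int:
    """Count non-blank physical lines, the usual basis for a per-KLOC metric."""
    return sum(1 for line in source.splitlines() if line.strip())


def defects_per_kloc(defects: int, source: str) -> float:
    """Defects per thousand non-blank lines of the given source text."""
    return defects * 1000 / count_lines(source)


# Same tokens, two layouts (hypothetical code, not taken from any real project).
COMPACT = "if (p) { use(p); } else { fail(); }\n"
EXPANDED = """if (p)
{
    use(p);
}
else
{
    fail();
}
"""

if __name__ == "__main__":
    # Assume exactly one defect in this logic, whichever way it is laid out.
    print(count_lines(COMPACT), defects_per_kloc(1, COMPACT))
    print(count_lines(EXPANDED), defects_per_kloc(1, EXPANDED))
```

    With the same single defect, the expanded layout reports one eighth of the density of the compact one, purely because of formatting.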

    --
    If you disagree, post your argument. (-1, Overrated) isn't your personal censorship tool for views you don't like.

  45. Code Quality by octogen · · Score: 2, Informative

    In my opinion, there is no substantial difference in code quality between open source software and proprietary software.

    I have seen a lot of very buggy commercial software (including nVidia drivers, IBM's LANManager Services for OS/2, lots of Microsoft's services and utilities in Windows 2000 (for example, "TCP/IP Helper Service") and Netscape 4.7).

    On the other hand, I have also seen very bad code quality in open source products - for example, GTK+ (actually, the really bad thing about GTK+ is primarily its install scripts, makefiles and such). Compiling and installing GTK+ on anything other than a GNU/Linux machine is something of an adventure, while its commercial counterpart, Qt from Trolltech, can be compiled quite easily.

    - I set the PKGCONFIG env variable before running 'configure'. It worked quite well until line 27,000 (or so) in 'configure', where the variable's content was suddenly gone (BTW, I really dislike debugging 28,000+ line shell scripts). I tried to 'configure' with Bourne shell and with Korn shell 93.

    - It assumes you have Perl installed (if it's not in your PATH, 'configure' creates funny things like "#! -w" instead of "#!/path/to/perl -w"). The error message produced due to this bug was something like '/usr/bin/env: no such file or directory' - because the perl script was directly started using /usr/bin/env. Kind of confusing %-)

    - 'configure' forgot to add '-fPIC' to CFLAGS; for this reason all shared libraries were broken. I had to add this option manually.

    - Nothing works with 'make'. I had to install 'gmake' (GNU make) instead.

    - The actual source code of the core libraries finally compiled, after I had upgraded to gcc 3.3.2. The source code of the 'demo' programs was totally broken, and gcc refused to compile it - once more I had to change the makefiles manually.
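    The Perl item in the list above is a classic template-substitution failure. The sketch below is not GTK+'s actual configure logic; the function names and the shebang template are hypothetical. It only shows how an empty interpreter path turns into "#! -w", and how a check at generation time would give a clearer error than the one seen at run time.

```python
import shutil
from typing import Optional


def render_shebang(perl_path: Optional[str]) -> str:
    """Naive substitution: a missing path silently yields a broken shebang."""
    return f"#!{perl_path or ''} -w"


def render_shebang_checked(perl_path: Optional[str]) -> str:
    """Same substitution, but fail early when no interpreter was found."""
    if not perl_path:
        raise RuntimeError("perl not found in PATH; cannot generate scripts")
    return f"#!{perl_path} -w"


if __name__ == "__main__":
    found = shutil.which("perl")  # None when perl is not in PATH
    print(render_shebang(found))
```

    Calling render_shebang(None) reproduces the "#! -w" line from the comment, while the checked variant stops with a message that names the real problem.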

    -----

    One or two weeks later I compiled trolltech's Qt library on the same computer. It was as simple as './configure --platform=platformname && make && make install'.

    Why do I need to debug 28,000+ lines of shell script code and a lot of makefiles, why do I need to install gmake, pkgconfig (by the way, pkgconfig and most other things in GTK+ don't work well if you don't install everything to /usr/local, which is the default location) and Perl 5, just to compile some C/C++ code?

    Qt does much the same as GTK+, but it simply compiles, using only shell scripts, 'make' and a C/C++ compiler.

    Another example regarding code maturity (maturity rather than quality, notice the difference :-) is Sun JVM vs. GCJ's libjava. I compiled a very complex multithreaded application using GCJ; it worked fine on uniprocessor machines, but it randomly deadlocked on my multiprocessor server. Finally I found out that libjava is broken on SMP machines. That doesn't mean that libjava's code quality is bad; but it still means that some other Java libraries (those of some virtual machines) are more mature, and possibly better tested.

    -----

    Some fundamental things about Software:
    - The more people read the code, the more people can potentially find and fix bugs (good about open source).
    - If a lot of people are allowed to write the code, somebody has to coordinate their work. Lots of different versions of the same module, written and/or modified by lots of different people, need to be combined or otherwise coordinated (bad about most open source projects, because hardly anybody knows how trustworthy any one of the developers is; good about some closed source projects (e.g. Trusted SunOS kernel, IBM SLIC kernel and other trusted code), because only a small group of really good programmers is allowed to write or modify code).

    Conclusion: It's good to have only a small group of 'trusted' developers, who write or modify the code, and then to let everyone else read and verify the code.

    regards,
    octogen

  46. Re:Duh! by proj_2501 · · Score: 2, Informative

    "OMG! And Windows 98 didn't support fast user switching!"

    Your analogy limps. Did most other operating systems support fast user switching in 1998? No, and especially not Windows' biggest competition on the desktop.

    On the contrary, PostgreSQL has had decent foreign key and transaction and subquery support since 1999.

    MySQL STILL doesn't support subqueries in a production version. Foreign keys are only supported by one table type. It doesn't support views. I could go on, but if you really want to see the differences, look at mysql's crash-me comparison chart. The differences that aren't cosmetic, even taking the latest MySQL alpha into account, are pretty annoying.

  47. Re:Duh! by arivanov · · Score: 2, Informative

    Do you even know what a foreign key is, or how it's used?

    Yes I do. And I have revived and made perform to spec god knows how many cretinous foreign key designs by a combination of

    • removing the foreign key
    • using join for selects to guarantee that only records with valid referential criteria are retrieved. This is equivalent to having a foreign key constraint in the sense that apps do not see any records that do not obey the foreign key constraint.
    • garbage collector running in a different thread or often different machine that goes around and kills zombies whose referential integrity has been violated.

    The difference between this and a classic foreign key constraint is that this approach always uses multiple CPUs efficiently, while a foreign key check is usually a task bound to a single CPU. It also holds far fewer wide-scope (global or per-table) locks and is generally faster for retrieves by a factor of 10 to 100. Because of the TPC benchmarks, vendors have over-optimized joins at the expense of many other things in order to have nice numbers.
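    For readers who have not seen the pattern, here is a minimal sketch of the three steps described above, using Python's built-in sqlite3 module. The table and column names are hypothetical, and the cleanup runs inline rather than in a separate thread or machine as the comment describes. It shows the mechanics only and makes no claim about the performance figures quoted.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
    -- No FOREIGN KEY clause on purpose: integrity is left to the application.
    CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, item TEXT);
    INSERT INTO customers VALUES (1, 'alice');
    INSERT INTO orders VALUES (10, 1, 'disk'), (11, 99, 'orphan');
""")


def visible_orders(db):
    """Inner join: rows whose customer does not exist are never returned."""
    return db.execute(
        "SELECT o.id, c.name, o.item FROM orders o "
        "JOIN customers c ON c.id = o.customer_id ORDER BY o.id"
    ).fetchall()


def collect_garbage(db):
    """Delete 'zombie' rows whose reference is broken; return how many were removed."""
    cur = db.execute(
        "DELETE FROM orders WHERE customer_id NOT IN (SELECT id FROM customers)"
    )
    db.commit()
    return cur.rowcount


if __name__ == "__main__":
    print(visible_orders(conn))   # the orphan row is already invisible to apps
    print(collect_garbage(conn))  # the sweep removes it from storage
```

    The trade-off is that the orphan row physically exists until the sweep runs, so any query that skips the join can still see it; a declared constraint rejects the row at insert time instead.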

    And btw, learn the difference between a "real DBA" and a database designer. I mean the one that is the justification for the 20+% salary difference.

    Cheers (lessons start at 500 per hour),

    --
    Baker's Law: Misery no longer loves company. Nowadays it insists on it
    http://www.sigsegv.cx/