Slashdot Mirror


Open-Source Python Code Shows Lowest Defect Density

cold fjord sends news that a study by Coverity has found open-source Python code to contain a lower defect density than any other language. "The 2012 Scan Report found an average defect density of .69 for open source software projects that leverage the Coverity Scan service, as compared to the accepted industry standard defect density for good quality software of 1.0. Python's defect density of .005 significantly surpasses this standard, and introduces a new level of quality for open source software. To date, the Coverity Scan service has analyzed nearly 400,000 lines of Python code and identified 996 new defects — 860 of which have been fixed by the Python community."

43 of 187 comments

  1. Coverity fails to detect errors in python by Anonymous Coward · · Score: 4, Insightful

    "Coverity fails to detect errors in python" would be my headline of choice here. Seems a much more reasonable explanation for the results.

    1. Re:Coverity fails to detect errors in python by someone1234 · · Score: 2

      This causes false positives, so if they are really not bugs, then Python's code is even more awesome :D

      --
      Patents Drive Free Software as Hurricanes Drive Construction Industry
  2. Re:Python is readable by Anonymous Coward · · Score: 5, Funny

    Python is readable and readable code is easier to fix.

    Also, smarter guys have a tendency to use Python/Haskell/Erlang

    Oh yeah? Well, I'm working on a readable Perl script to refute that statement. How long do they accept comments in these threads?

  3. Re: Python == LAME by Anonymous Coward · · Score: 5, Informative

    Most of Python isn't written in Python, smart ass. They're talking about the language interpreter itself, written in C/C++ etc.

  4. Can some one please explain? by OzPeter · · Score: 2

    I read TFS and both TFAs, and all I can glean is that the Coverity Scan service is some sort of report that measures defects in code, but it never defines how such defects are determined. The articles also mention comparing open source code metrics, but the only project that is mentioned anywhere is Python.

    So what is a Coverity Scan service and why should I care? After all I can make up all sorts of metrics about my own software.

    --
    I am Slashdot. Are you Slashdot as well?
    1. Re:Can some one please explain? by msauve · · Score: 2

      "Coverity's code-scanning system for open-source projects... has been in place since 2006, when the effort was first funded by the U.S. Department of Homeland Security (DHS)."

      A defect is when the code uses encryption, and doesn't send the keys to the NSA, or uses smtplib, and doesn't bcc:archives@dea.gov.

      --
      "National Security is the chief cause of national insecurity." - Celine's First Law
    2. Re:Can some one please explain? by Krishnoid · · Score: 3, Informative

      Here's the python dev's own page describing it and how to get to the results.

  5. Hmmm by Anonymous Coward · · Score: 5, Informative

    TFA seems to be about the Python interpreter, also known as CPython (because it's implemented in C), rather than about code written in Python itself. So maybe it has nothing to do with the Python language, but everything to do with the fact that the Python authors are apparently awesome C programmers.

    That's great, but most people interpret "Open Source Python Code" to mean code written in Python that is Open Source, not code written in C (to implement the Python interpreter) that is Open Source.

  6. ok, and this means what? by intermodal · · Score: 2

    Does it mean better coders, or better language? Seems like the results are ambiguous in their meaning.

    --
    In SOVIET RUSSIA... erm...NSA AMERICA, the Internet logs onto YOU!
    1. Re:ok, and this means what? by dfsmith · · Score: 2

      It means that the Python developers fixed the warnings.

  7. This is w/r/t CPython, not random code in Python by paulproteus · · Score: 5, Informative

    The Slashdot summary is confusing, as is the eweek.com headline. Reading the article, it is clear that it is about the code that powers the official Python interpreter, AKA CPython, AKA /usr/bin/python. When I clicked the link, I thought Coverity had surveyed the entire world of open source Python code and discovered that Python programmers as a whole publish higher quality code than people who e.g. program in Ruby. That's not what the article's about.

    It'd be great if the headline in Slashdot were to be fixed to say, "Python interpreter has fewer code defects compared to other open source C programs, says Coverity."

    --
    |/usr/games/fortune
  8. Math impairment by fava · · Score: 5, Informative

    0.005 defects per thousand lines times 400,000 lines gives a total defect count of 2.

    So where did the other 994 defects come from?
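
    The arithmetic can be checked in a few lines. This is only a sketch using the round numbers quoted in the summary (400,000 lines, 996 defects found, 860 fixed). The helper name defects_per_kloc is made up for illustration, and it assumes "defect density" means defects per thousand lines of code, as the calculation above does.

```python
def defects_per_kloc(defects, lines_of_code):
    """Defect density expressed as defects per 1,000 lines of code."""
    return defects / (lines_of_code / 1000)


LINES = 400_000   # "nearly 400,000 lines", rounded as in the summary
FOUND = 996       # defects identified, per the summary
FIXED = 860       # defects fixed, per the summary

# Total defects implied by a density of 0.005 per thousand lines.
implied_total = 0.005 * LINES / 1000

# Densities implied by the raw counts instead.
density_found = defects_per_kloc(FOUND, LINES)
density_open = defects_per_kloc(FOUND - FIXED, LINES)

print(implied_total, round(density_found, 2), round(density_open, 2))
```

    Under these assumptions, neither the found count nor the still-open count comes out anywhere near .005, so the quoted figures do not seem to describe the same measurement.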

    1. Re:Math impairment by aaaaaaargh! · · Score: 2

      However, only 860 were fixed. Double logic impairment.

    2. Re:Math impairment by jwkane · · Score: 4, Funny

      Maybe those two LOC are really, really, really bad.

    3. Re:Math impairment by ShanghaiBill · · Score: 2

      I'm more interested in this software that detects bugs in code. Does it also solve the halting problem? Can it satisfy finite combinational logic in polynomial time?

      They don't claim to find all bugs. I have used Coverity, and they found quite a few bugs, and also found many instances of unclear code that wasn't really a bug but should be rewritten anyway. But they don't find most logic bugs, or flaws in your requirements, etc. You still have to use your brain for those. But you can use tools like Coverity and other dynamic and static analysis tools to flag the easy bugs so you can spend more time on the hard bugs.

    4. Re:Math impairment by ShanghaiBill · · Score: 4, Informative

      Does it analyze source code or is it like a fuzz tester?

      It is static analysis of source code. It doesn't actually run the code, it scans it for patterns that might be bugs. I like Gimpel Lint better, but it isn't either-or, so you can use both and they will find different bugs. You still need to do dynamic testing with something like Valgrind. Tools are cheap compared to people, so you want to give your developers the best testing tools you can, and put your code through the wringer. We use six different tools for C/C++, and no code is shipped out the door till it passes them all (plus unit, usability, and requirements testing).

    5. Re:Math impairment by serviscope_minor · · Score: 2

      and no code is shipped out the door till it passes them all

      I quite agree. I won't ship my code until it passes the test tool I use. My test tool is gcc. Once that runs without error, I ship.

      --
      SJW n. One who posts facts.
  9. Excellent marketing! by caffeinemessiah · · Score: 5, Insightful

    So a private, for-profit company named "Coverity" has released a report that shows that their "Coverity Scan" software finds the fewest vaguely-defined "defects" in a programming language whose community has added the "Coverity platform" product to their development process? I was about to say "excellent marketing" by writing a fluff piece for free Slashdot traffic, but it's really not even excellent marketing.

    --
    An old-timer with old-timey ideas.
  10. Coverity: Static analyzer by dwheeler · · Score: 5, Informative

    Coverity sells software that does static analysis on source code and looks for patterns that suggest defects. E.G., a code sequence that allocates memory, followed later by something that de-allocates that memory, followed later by something that de-allocates the same memory again (a double-free).
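
    As a toy illustration of that allocate / free / free-again pattern, here is a minimal sketch. It is not Coverity's implementation (which is not described in this thread): a real analyzer reads source code and follows control flow, while this hypothetical find_double_frees helper only walks a hand-written list of events.

```python
def find_double_frees(events):
    """Return indices of 'free' events that release an already-freed name.

    events is a list of (operation, name) pairs such as ("alloc", "buf").
    """
    freed = set()
    findings = []
    for index, (operation, name) in enumerate(events):
        if operation == "alloc":
            # A fresh allocation makes the name live again.
            freed.discard(name)
        elif operation == "free":
            if name in freed:
                findings.append(index)
            freed.add(name)
    return findings


trace = [
    ("alloc", "buf"),
    ("free", "buf"),
    ("free", "buf"),  # second free of the same allocation
]
suspicious = find_double_frees(trace)
print(suspicious)
```

    The set of freed names is the only state being tracked here. Handling aliases, branches, and function calls is where the real difficulty lies.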

    The product is not open source software, but a number of open source software projects use it to scan their software to find defects: https://scan.coverity.com/ It's a win-win, in the sense that Coverity gets reports from real users using it on real code, as well as press for their product. The open source software projects get reports on potential defects before users have to suffer with them.

    --
    - David A. Wheeler (see my Secure Programming HOWTO)
    1. Re:Coverity: Static analyzer by Anonymous Coward · · Score: 3, Interesting

      We've run Coverity on several very large projects where I work. For C++ it did a decent job of finding little and simple things that Visual Studio missed, like variables that were never initialized before use, subtle type violations, or accessing past the end of a statically allocated array. These aren't the sorts of bugs that we worry about. The evil bugs - like those created by programmers that don't know enough about multithreading but were assigned because some offshore contractor service is the only place we're allowed to staff from and nobody vets their skillsets - all slipped right by Coverity and had to be fixed by the few remaining senior programmers. (Attrition will fix that problem soon, at least for the senior programmers moving anywhere less strategically suicidal.)

    2. Re:Coverity: Static analyzer by Anonymous Coward · · Score: 2, Informative

      you should try TSAN. See : https://code.google.com/p/thread-sanitizer/

  11. Past Coverity reviews by greg1104 · · Score: 4, Informative

    Coverity's services have been useful to a number of open-source projects. But this article is carefully picking its terms to get a headline worthy result. Compare against the Coverity scan of PostgreSQL done in 2005 for example, and CPython's defect rate isn't very exciting at all. But that was "Coverity Prevent" and this is "Coverity Scan"...whatever that means.

  12. C code, not Python code by paavo512 · · Score: 2

    The title is misleading as hell, again. It appears they are talking about the C code included in the Python compiler/interpreter project, and it is to be compared against other open source software projects, not against other languages. All that it shows is that the Python project developers are eager to fix the problems that this particular verification software finds. If they have fixed all those bugs, then they will have exactly zero known defects. Good for them, but most probably there will remain unknown defects, and it is hard to measure their number.

    In short, a meaningless article and a misleading title. The correct headline would have been "Python core developers are fixing bugs with help of a tool".

  13. How rude! by sgt+scrub · · Score: 2

    They counted my C++ features as bugs?

    --
    Having to work for a living is the root of all evil.
  14. Re:This is w/r/t CPython, not random code in Python by lightBearer · · Score: 2

    Yes it would, as the Python interpreter is open source: Python License & History

    --
    - No Bounce, No Play -
  15. Re:Python is readable by Anonymous Coward · · Score: 4, Funny

    Python is readable and readable code is easier to fix.

    Also, smarter guys have a tendency to use Python/Haskell/Erlang

    Oh yeah? Well, I'm working on a readable Perl script to refute that statement. How long do they accept comments in these threads?

    How is this possible? Perl is a write only language.

  16. Re:Python is readable by MetalliQaZ · · Score: 5, Informative

    The result in question tested the Python project's code, which is commonly known as CPython, which is the Python interpreter written in C.

    --
    "Here Lies Philip J. Fry, named for his uncle, to carry on his spirit"
  17. Perl IS readable by Anonymous Coward · · Score: 3, Funny

    @*(&^)&^)^$

    Perl programmers write their code in cartoon profanity!

  18. Hey metric retards by Sulik · · Score: 4, Interesting

    While it can be useful in pinpointing common code defects, interpreting Coverity results as an absolute indicator of code quality is just retarded. 90% of Coverity's defects tend to really be false positives that would be obvious to even the average code monkey... Not sure that massaging a code base to please Coverity and getting a 'high score' is really any kind of achievement; it may be more an indicator that you have way too much time on your hands...

    --
    Help! I am a self-aware entity trapped in an abstract function!
  19. Bullshit by gwstuff · · Score: 2

    This is bullshit, but a great tactical conversion of non-informative data into marketable news by Coverity.

    Coverity uses lexical pattern matching to find bugs based on "tricks" discovered by Dawson Engler and his colleagues at Stanford University in the early 2000s. The tricks (find "malloc" not coupled with "free", cli() not coupled with sti(), dereferences of uninitialized pointers, etc.) were developed in the context of the C language used for operating system code.

    So they applied tricks developed for one language and context to another language in a different context, and found fewer bugs in the latter than in the former. You would think that this suggests a failure, in that their techniques are not quite as effective on Python as they were on C. Instead, they have turned it around into a statement on the inherent high quality of Python code.

    It's like saying that because a good tennis player sucks at table tennis, table tennis must be a harder game.

    1. Re:Bullshit by Lehk228 · · Score: 2

      The article is about the C code that makes up the CPython interpreter, not about Python scripts.

      --
      Snowden and Manning are heroes.
  20. Re:Python is readable by ceoyoyo · · Score: 2

    It appears you're right. Neither the submitter nor the article writer understand the difference between "code written in Python" and "the CPython interpreter, which is written in C", which is what Coverity actually tested. So 90% of the comments are off topic. Mods - kudos to the parent.

  21. Re:Python is readable by vux984 · · Score: 3, Insightful

    Python is readable and readable code is easier to fix.

    True and true. But Python's use of semantic whitespace is also very brittle, very easy to break, and a huge pain in the ass to fix compared to languages that use braces or keywords to define 'blocks'.

    But that's not even terribly relevant here, because this article is about the source code used for the python interpreter, which is C, not python.

  22. Re:Python is readable by XcepticZP · · Score: 4, Insightful

    But Python's use of semantic whitespace is also very brittle, very easy to break, and a huge pain in the ass to fix compared to languages that use braces or keywords to define 'blocks'.

    This is one thing I never quite get about python criticism. Sure, whitespace is significant, but I've never had it break easily or be "brittle" as you say. Then again, I don't go past 2 or 3 levels of nesting, class nesting included. And all my units of work are in separate methods/functions instead of being child blocks inside a giant function which I've regularly seen done. Perhaps the use of whitespace isn't the real issue many people have with python, but rather delineating blocks using whitespace exposes a bit of an inherent flaw in the way they structure their program's flow.

    Either way, having a proper IDE when writing Python code will go a long way toward making you comfortable with using whitespace instead of braces. Initially it was weird and unsettling for me, because I didn't understand all the consequences that whitespace could have. But a little fluid and constant coding in an IDE will rid you of that quickly enough.

  23. Re:Can't be right by XcepticZP · · Score: 3, Informative

    it might have an advantage in forcing lazy programmers with no concept of 'code etiquette' to write semi-readable code as indentation is forced by syntax.

    on the other hand, making indentation part of the language creates all sorts of other readability problems.

    You'd be surprised at how much syntax in python actively ignores whitespace. As soon as you open up any brackets, it's a veritable free-for-all when it comes to whitespace and indentation. In such a scenario, a proper coding standard document is imperative for readable code.
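
    A small sketch of that behaviour, assuming only the standard compile() builtin. The compiles helper and the two sample strings are invented for illustration. The same ragged indentation is accepted between brackets and rejected outside them.

```python
INSIDE_BRACKETS = """
values = [
        1,
  2,
                3,
]
"""

OUTSIDE_BRACKETS = """
a = 1
        b = 2
"""


def compiles(source):
    """Return True if the source text is syntactically valid Python."""
    try:
        compile(source, "<example>", "exec")
        return True
    except SyntaxError:  # IndentationError is a subclass of SyntaxError
        return False


inside_ok = compiles(INSIDE_BRACKETS)
outside_ok = compiles(OUTSIDE_BRACKETS)
print(inside_ok, outside_ok)
```

    Since the parser accepts any layout inside brackets, keeping such code readable is left to convention, which is why a coding standard matters there.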

  24. Re:WRONG! RTFA! by Zero__Kelvin · · Score: 4, Insightful

    "I quote: "Coverity scanned over ten thousand Python programs on the popular GitHub open-source software repository...""

    Great. Now where the hell do you quote it from, since that sure as hell isn't in the linked to article anywhere.

    "Coverity's scanning technology has analyzed more than 396,000 lines of code in the latest builds of Python 3.3.2. That analysis has led to 181 new defects being identified. For the year to date, Python developers have already fixed 278 defects. - See more at: http://www.eweek.com/developer/open-source-python-code-sets-new-standard-for-quality-study.html#sthash.wSdGotDE.dpuf"

    That makes it pretty clear that they are talking about the Python executable itself. Version 3.3.2 to be exact.

    "One of the more interesting defects that Coverity identified in Python that developers have since fixed is a "double-free" defect. "'Double free' means that you allocate memory for a pointer, and then you free the memory twice," Samocha explained. "This can cause memory corruption, which can lead to unexpected behaviors or program crashes." - See more at: http://www.eweek.com/developer/open-source-python-code-sets-new-standard-for-quality-study.html#sthash.wSdGotDE.dpuf"

    ... and that clearly shows that they are talking about the interpreter, written in C, which has pointers, malloc() and free(). Python has a memory manager with garbage collection and doesn't use pointers. The Python programmer doesn't allocate and free memory resources directly.

    I especially love how you criticized a language earlier, when you clearly have literally no knowledge of said language.

    --
    Guns don't kill people; Physics kills people! - John Lithgow as Dick Solomon on Third Rock From The Sun
  25. Re:Python is readable by AvitarX · · Score: 2

    I saw a trivial example break when posted to /. not that long ago, in the interview.

    --
    Wow, sent an e-mail as suggested when clicking on "use classic" banner, and got a fast response that addressed my msg
  26. Re:Python == LAME by MikeBabcock · · Score: 3, Interesting

    Nope, nobody at all http://www.python.org/about/success/

    Jeez.

    --
    - Michael T. Babcock (Yes, I blog)
  27. Re:Python is readable by fahrbot-bot · · Score: 4, Insightful

    Sure, whitespace is significant, but I've never had it break easily or be "brittle" as you say.

    Not python, but one example of this type of thing would be in a Makefile where target commands are indented by a tab. Some newer versions of (g)make will allow spaces, but most require a tab. Cut and paste that in an X-Windows session (tabs are converted to spaces) and you're screwed. From Make Software: Makefiles

    Each command line must begin with a tab character to be recognized as a command. The tab is a whitespace character, but the space character does not have the same special meaning. This is problematic, since there may be no visual difference between a tab and a series of space characters. This aspect of the syntax of makefiles is often subject to criticism.
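
    To make the tab-versus-spaces problem concrete, here is a minimal sketch. The space_indented_lines helper is hypothetical and deliberately naive: it flags every line that starts with a space and makes no attempt to parse Makefile syntax.

```python
def space_indented_lines(makefile_text):
    """Return 1-based numbers of lines that start with a space, not a tab."""
    return [
        number
        for number, line in enumerate(makefile_text.splitlines(), start=1)
        if line.startswith(" ")
    ]


# The recipe line starts with a real tab character.
good = "all:\n\techo hello\n"

# The same rule after a copy and paste that turned the tab into spaces.
pasted = "all:\n        echo hello\n"

print(space_indented_lines(good), space_indented_lines(pasted))
```

    The two strings look the same when displayed, but only the first has the tab that the quoted passage says a command line needs.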

    --
    It must have been something you assimilated. . . .
  28. ..and thats why there are few job opportunities. by ClassicASP · · Score: 2

    I once thought about learning Python. Then I combed Craigslist across the US looking for job opportunities doing Python programming. Relatively few out there by comparison to ASP.NET and Java. Sure, it's less buggy... but what's to motivate anyone to learn something they can't easily find work in?

  29. Re:Python is readable by vux984 · · Score: 4, Insightful

    This is one thing I never quite get about python criticism. Sure, whitespace is significant, but I've never had it break easily or be "brittle" as you say.

    Anytime you refactor stuff, or modify something even somewhat nested, especially in a 'dumb text editor', it's a pain in the ass.

    Anytime you need to pass code snippets via email, forums, etc... well... you just don't because it's a total waste of time. :)

    It's also easy to barf all over code going into word processors, PDF files, and so forth. It's nice to be able to copy-paste some C out of a PDF file or an email, or off a forum, and then tell the IDE to just reformat it.

    Perhaps the use of whitespace isn't the real issue many people have with python, but rather delineating blocks using whitespace exposes a bit of an inherent flaw in the way they structure their program's flow.

    No. Because we use whitespace / indenting in our C / C++ etc projects too. We even have standards requiring it, and our IDEs / toolchains may even be set up to reformat it just-so before commits. We want all the benefits of well formatted code.

    We just like the IDE to do all the work of actually formatting it, and reformatting it as necessary.

    Either way, having a proper IDE

    Is how you lose the argument. Everyone but python groupies agrees that any programming language worth considering MUST have its programs represented as plaintext files, with no proprietary / binary stuff that can only be accessed with specialized tools. Requiring an IDE is the sign of a bad language.

    Python passes this test, but it can be pretty hideous to use with an arbitrary text editor. And really, even brainfuck wouldn't be too bad with the right IDE, right?

  30. Re:Python is readable by XcepticZP · · Score: 2

    Is how you lose the argument. Everyone but python groupies agrees that any programming language worth considering MUST have its programs represented as plaintext files, with no proprietary / binary stuff that can only be accessed with specialized tools. Requiring an IDE is the sign of a bad language.

    I don't think you understood what I was trying to say here. The IDE is there to teach you the boundaries when it comes to whitespace in python. Bad indentation, mismatching brackets and overall bad syntax gets picked up immediately and you are warned. Just like you get syntax error highlighting in other languages. Python's usage of whitespace scares a lot of people and keeps them from experimenting. The IDE is what I think would help them overcome their fear/uncertainty. If anything, Python is one of the languages where it's explicitly less required to have an IDE and still be proficient in it.

  31. Re:Python is readable by vux984 · · Score: 3, Insightful

    Blaming Python because you don't have a rudimentary coding editor is like blaming math because you don't have a calculator with a cosine button.

    I don't have the luxury of designing "math". Irrational numbers, periodic functions, and so forth aren't optional.

    But we do have the luxury of designing programming languages, and semantic white space is a choice.