Slashdot Mirror


Open-Source Python Code Shows Lowest Defect Density

cold fjord sends news that a study by Coverity has found open-source Python code to contain a lower defect density than any other language. "The 2012 Scan Report found an average defect density of .69 for open source software projects that leverage the Coverity Scan service, as compared to the accepted industry standard defect density for good quality software of 1.0. Python's defect density of .005 significantly surpasses this standard, and introduces a new level of quality for open source software. To date, the Coverity Scan service has analyzed nearly 400,000 lines of Python code and identified 996 new defects — 860 of which have been fixed by the Python community."

118 of 187 comments

  1. Coverity fails to detect errors in python by Anonymous Coward · · Score: 4, Insightful

    "Coverity fails to detect errors in python" would be my headline of choice here. Seem a much more reasonable explanation for the results.

    1. Re:Coverity fails to detect errors in python by Anonymous Coward · · Score: 1

      http://docs.python.org/devguide/coverity.html
      Known limitations: Python’s C code are not yet understood by Coverity

    2. Re:Coverity fails to detect errors in python by TopherC · · Score: 1

      The actual doc says "Some aspects of Python’s C code are not yet understood by Coverity." That's much more vague admittedly, but not as shameful.

    3. Re:Coverity fails to detect errors in python by someone1234 · · Score: 2

      This causes false positives, so if they are really not bugs, then Python's code is even more awesome :D

      --
      Patents Drive Free Software as Hurricanes Drive Construction Industry
    4. Re:Coverity fails to detect errors in python by julesh · · Score: 1

      "Coverity fails to detect errors in python" would be my headline of choice here. Seem a much more reasonable explanation for the results.

      Or, to put it another way, "static analysis tool fails to detect many potential errors in code whose authors use the same static analysis tool to find and fix potential errors." Which is hardly surprising.

    5. Re:Coverity fails to detect errors in python by julesh · · Score: 1

      Some would argue that having a codebase that's so hard to understand that static analysis tools get confused about what it does is a bug in itself.

  2. Re:Python is readable by Anonymous Coward · · Score: 5, Funny

    Python is readable and readable code is easier to fix.

    Also, smarter guys have a tendency to use Python/Haskell/Erlang

    Oh yeah? Well, I'm working on a readable Perl script to refute that statement. How long do they accept comments in these threads?

  3. Re: Python == LAME by Anonymous Coward · · Score: 5, Informative

    Most of Python isn't written in Python, smart ass. They're talking about the language interpreter itself, written in C/C++ etc.

  4. Can some one please explain? by OzPeter · · Score: 2

    I read TFS and both TFAs, and all I can glean is that the Coverity Scan service is some sort of report that measures defects in code, but it never defines how such defects are determined. The articles also mention comparing open source code metrics, but the only project that is mentioned anywhere is Python.

    So what is a Coverity Scan service and why should I care? After all I can make up all sorts of metrics about my own software.

    --
    I am Slashdot. Are you Slashdot as well?
    1. Re:Can some one please explain? by Sponge+Bath · · Score: 1

      What is Coverity Scan service? It is a product they hope to sell you. Does advertising work? It just did!

    2. Re:Can some one please explain? by msauve · · Score: 2

      "Coverity's code-scanning system for open-source projects... has been in place since 2006, when the effort was first funded by the U.S. Department of Homeland Security (DHS)."

      A defect is when the code uses encryption, and doesn't send the keys to the NSA, or uses smtplib, and doesn't bcc:archives@dea.gov.

      --
      "National Security is the chief cause of national insecurity." - Celine's First Law
    3. Re:Can some one please explain? by Krishnoid · · Score: 3, Informative

      Here's the python dev's own page describing it and how to get to the results.

    4. Re:Can some one please explain? by cold+fjord · · Score: 1

      Here is the data sheet (.pdf) that should help you understand.

      Here is some additional detail on the common problems (.pdf) it looks for.

      Here is a background article: A Few Billion Lines of Code Later: Using Static Analysis to Find Bugs in the Real World

      --
      much of left-wing thought is a kind of playing with fire by people who don't even know that fire is hot - George Orwell
    5. Re:Can some one please explain? by TapeCutter · · Score: 1

      So what is a Coverity Scan service

      It's the same idea as the 'lint' command, it picks up potential bugs.

      These sorts of tools can help improve the quality of your code. Having said that, in my 20+ years of experience it's not common practice to use these things. I've worked on several large "mission critical" systems, and the Y2K ordeal was the only time someone even asked if I used such a tool, let alone demanded it. At the end of the day (actually more like a month) the "Y2K lint" tool's only practical achievement was to tick a due-diligence box for insurance purposes.

      --
      And did you exchange a walk on part in the war for a lead role in a cage? - Pink Floyd.
    6. Re:Can some one please explain? by cold+fjord · · Score: 1

      If only there was a way to get more information, somehow.

      Find and fix defects in your C/C++ or Java open source project for free.

      A pity it's free. They must be implying that open source software isn't worth any money, right?

      --
      much of left-wing thought is a kind of playing with fire by people who don't even know that fire is hot - George Orwell
    7. Re:Can some one please explain? by gl4ss · · Score: 1

      if it's the same as lint then eh..

      of course it has less "defects" to complain about.. with all that whitespace shit defined in language and all.

      "style errors" aren't defects. they're just matter of deciding who decides the right style..

      --
      world was created 5 seconds before this post as it is.
    8. Re:Can some one please explain? by kermidge · · Score: 1

      Warning: pdf
      http://wpcme.coverity.com/wp-content/uploads/2012-Coverity-Scan-Report.pdf
      explains much if not all that you ask

      For a good article and a fun read that goes into the background of Coverity and what it does, see
      http://cacm.acm.org/magazines/2010/2/69354-a-few-billion-lines-of-code-later/fulltext
      it's written by some of the developers and founders

  5. Where is the study? by achacha · · Score: 1, Informative

    I could not find a link to the actual study, instead the company links lead back to the article and the article leads back to the company home page. Is this more "faith-based computing"? I am interested in the comparisons to other languages and in what type of code was analyzed.

  6. Hmmm by Anonymous Coward · · Score: 5, Informative

    TFA seems to be about the Python interpreter, also known as CPython (because it's implemented in C), rather than about code written in Python itself. So maybe it has nothing to do with the Python language, but everything to do with the fact that the Python authors are apparently awesome C programmers.

    That's great, but most people interpret "Open Source Python Code" to mean code written in Python that is Open Source, not code written in C (to implement the Python interpreter) that is Open Source.

    1. Re:Hmmm by shutdown+-p+now · · Score: 1

      The nice thing about Python interpreter is that it is deliberately written in such a way as to be easy to read and understand. I suspect they could have squeezed quite a bit more performance out of it by using more exotic techniques (e.g. tagged ints), but, arguably, it is not worth it - if you want real perf, you'd do JIT anyway (and that's what PyPy is for), while on the other hand it is beneficial to have a well-understood, stable and foolproof reference implementation for the language.

    2. Re:Hmmm by Laxori666 · · Score: 1

      Oh that is extremely misleading. To be honest though I did some mental math and I thought, really, out of the entirety of all programs written in Python, there's only one defect per 200,000 lines of code? Unlikely. Now it all makes sense. My world view has been repaired by your astute observations. If only you had not posted AC so I could direct merit and praise to the appropriate username and so your karma would increase thus assuring you a better rebirth in a heavenly realm wherein you could hopefully meet the Buddha of programming AKA Guido van Rossum and follow his blessings to the nirvana wherein no programmer ever has to code again!

  7. ok, and this means what? by intermodal · · Score: 2

    Does it mean better coders, or better language? Seems like the results are ambiguous in their meaning.

    --
    In SOVIET RUSSIA... erm...NSA AMERICA, the Internet logs onto YOU!
    1. Re:ok, and this means what? by dfsmith · · Score: 2

      It means that the Python developers fixed the warnings.

  8. This is w/r/t CPython, not random code in Python by paulproteus · · Score: 5, Informative

    The Slashdot summary is confusing, as is the eweek.com headline. Reading the article, it is clear that it is about the code that powers the official Python interpreter, AKA CPython, AKA /usr/bin/python. When I clicked the link, I thought Coverity had surveyed the entire world of open source Python code and discovered that Python programmers as a whole publish higher quality code than people who e.g. program in Ruby. That's not what the article's about.

    It'd be great if the headline in Slashdot were to be fixed to say, "Python interpreter has fewer code defects compared to other open source C programs, says Coverity."

    --
    |/usr/games/fortune
  9. Math impairment by fava · · Score: 5, Informative

    0.005 defects per thousand lines times 400,000 lines gives a total defect count of 2.

    So where did the other 994 defects come from?
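To make the arithmetic explicit, here is a quick sketch. It assumes "defect density" means defects per 1,000 lines of code, which the summary never actually states, and it takes the "nearly 400,000 lines" figure as exactly 400,000. The names are made up for illustration.

```python
# Assumption: density = defects per 1,000 lines of code (KLOC).
LINES_SCANNED = 400_000    # "nearly 400,000 lines" from the summary, rounded
DEFECTS_FOUND = 996
DEFECTS_FIXED = 860
QUOTED_DENSITY = 0.005     # the figure quoted for Python


def density_per_kloc(defects, lines):
    """Defects per thousand lines of code."""
    return defects / (lines / 1000)


# Number of defects the quoted density would imply for this code size.
implied_defects = QUOTED_DENSITY * LINES_SCANNED / 1000

# Density implied by everything the scan found.
found_density = density_per_kloc(DEFECTS_FOUND, LINES_SCANNED)

# Density implied by the defects that were found but not yet fixed.
outstanding = DEFECTS_FOUND - DEFECTS_FIXED
outstanding_density = density_per_kloc(outstanding, LINES_SCANNED)

print(implied_defects, found_density, outstanding, outstanding_density)
```

Under that assumption neither the found defects nor the still-unfixed ones line up with 0.005, so the quoted figure must be measuring something else or using a different unit.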

    1. Re:Math impairment by Anonymous Coward · · Score: 1

      Looks like snake oil to me.

    2. Re:Math impairment by aaaaaaargh! · · Score: 2

      However, only 860 were fixed. Double logic impairment.

    3. Re:Math impairment by Tumbleweed · · Score: 1

      0.005 defects per thousand lines times 400,000 lines gives a total defect count of 2.

      So where did the other 994 defects come from?

      They were in comments.

    4. Re:Math impairment by EuclideanSilence · · Score: 1

      I'm more interested in this software that detects bugs in code. Does it also solve the halting problem? Can it satisfy finite combinational logic in polynomial time?

    5. Re:Math impairment by jwkane · · Score: 4, Funny

      Maybe those two LOC are really, really, really bad.

    6. Re:Math impairment by ShanghaiBill · · Score: 2

      I'm more interested in this software that detects bugs in code. Does it also solve the halting problem? Can it satisfy finite combinational logic in polynomial time?

      They don't claim to find all bugs. I have used Coverity, and it found quite a few bugs, and also found many instances of unclear code that wasn't really a bug but should be rewritten anyway. But it doesn't find most logic bugs, or flaws in your requirements, etc. You still have to use your brain for those. But you can use tools like Coverity and other dynamic and static analysis tools to flag the easy bugs so you can spend more time on the hard bugs.

    7. Re:Math impairment by EuclideanSilence · · Score: 1

      Does it analyze source code or is it like a fuzz tester?

    8. Re:Math impairment by ShanghaiBill · · Score: 4, Informative

      Does it analyze source code or is it like a fuzz tester?

      It is static analysis of source code. It doesn't actually run the code, it scans it for patterns that might be bugs. I like Gimpel Lint better, but it isn't either-or, so you can use both and they will find different bugs. You still need to do dynamic testing with something like Valgrind. Tools are cheap compared to people, so you want to give your developers the best testing tools you can, and put your code through the wringer. We use six different tools for C/C++, and no code is shipped out the door till it passes them all (plus unit, usability, and requirements testing).

    9. Re:Math impairment by jrumney · · Score: 1

      So their index is not defects per 1000 LOC, as the GP assumed, but defects per 2.03 lines of code. I guess they had to change the factor after a few high level managers at large corporates ran their department's code through it after committing to certain KPI targets.

    10. Re:Math impairment by EuclideanSilence · · Score: 1

      I wish I could mod you informative for your response.

    11. Re:Math impairment by serviscope_minor · · Score: 2

      and no code is shipped out the door till it passes them all

      I quite agree. I won't ship my code until it passes the test tool I use. My test tool is gcc. Once that runs without error, I ship.

      --
      SJW n. One who posts facts.
  10. Excellent marketing! by caffeinemessiah · · Score: 5, Insightful

    So a private, for-profit company named "Coverity" has released a report that shows that their "Coverity Scan" software finds the fewest vaguely-defined "defects" in a programming language whose community has added the "Coverity platform" product to their development process? I was about to say "excellent marketing" by writing a fluff piece for free Slashdot traffic, but it's really not even excellent marketing.

    --
    An old-timer with old-timey ideas.
    1. Re:Excellent marketing! by cold+fjord · · Score: 1

      So, a private company has been helping 400 open source projects with code quality (usually considered important) for quite some time now using their tools which find many different code defects. It had been started with government money, but now they take it out of hide. And do you shed any light on it? Provide more information? No, you just make uninformed comments about things that have easy to find answers and whine. What a waste.

      Some of the better-known projects scanned include Apache, Firefox, GIMP and a number of forms of Linux and BSD.

      Open Source Is Better Than the Closed Stuff (Until You Hit 1 Million Lines)

      A Few Billion Lines of Code Later: Using Static Analysis to Find Bugs in the Real World

      --
      much of left-wing thought is a kind of playing with fire by people who don't even know that fire is hot - George Orwell
    2. Re:Excellent marketing! by dkf · · Score: 1

      So a private, for-profit company named "Coverity" has released a report that shows that their "Coverity Scan" software finds the fewest vaguely-defined "defects" in a programming language whose community has added the "Coverity platform" product to their development process?

      Their stuff does work at detecting certain kinds of problem, but it doesn't detect all possible bugs (nor does anything else I've encountered). It's better to say that it's an independent tool that can be used as well as other tools, and they provide free access to quite a few of the larger OSS projects. They surely don't have to; nobody's forcing them. They've also been doing it for years.

      For an example of the sort of thing they find, in a software package I know about their tool recently picked up that the maximum length of string to return passed in one location to readlink() was the same as the size of the buffer passed in, i.e., not leaving enough room for a terminating NUL. It's a trivial problem, easily fixed if actually tricky to spot in a large codebase, but it's the sort of thing that can cause all sorts of problems when encountered for real (get it wrong and you've got a potential problem with a stack smash).
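The readlink() off-by-one described above can be modelled in a few lines. This is only a toy sketch in Python, not the code from that package: fake_readlink is a hypothetical stand-in that mimics the one relevant property of POSIX readlink(), namely that it copies at most the requested number of bytes and does not append a terminating NUL. In C the bad write would silently land past the end of the buffer; here the bytearray raises IndexError instead.

```python
def fake_readlink(target: bytes, buf: bytearray, maxlen: int) -> int:
    """Hypothetical stand-in for readlink(): copy at most maxlen bytes, add no NUL."""
    n = min(len(target), maxlen)
    buf[:n] = target[:n]
    return n


def read_link_buggy(target: bytes, size: int) -> bytes:
    buf = bytearray(size)
    n = fake_readlink(target, buf, size)       # bug: maxlen equals the buffer size
    buf[n] = 0                                 # out of bounds when n == size
    return bytes(buf[:n])


def read_link_fixed(target: bytes, size: int) -> bytes:
    buf = bytearray(size)
    n = fake_readlink(target, buf, size - 1)   # leave one byte for the NUL
    buf[n] = 0
    return bytes(buf[:n])


def overflows(func, target: bytes, size: int) -> bool:
    """True if writing the terminator falls outside the buffer."""
    try:
        func(target, size)
    except IndexError:
        return True
    return False
```

The buggy version works for short link targets and only fails when the target is at least as long as the buffer, which is why this kind of thing is hard to spot by testing alone.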

      --
      "Little does he know, but there is no 'I' in 'Idiot'!"
    3. Re:Excellent marketing! by Half-pint+HAL · · Score: 1

      The problem is that the marketing is smoke and mirrors: the defect count isn't absolute and objective, it's the number of errors that the software detects. I don't doubt that projects using Coverity during the lifecycle end up with fewer Coverity-detected defects than projects that don't, but as metrics go, it's pretty hugely biased.

      --
      Got them moderator blues I blieve I walk out the do', With these mod-points I been gettin', I 'most never post no mo'
  11. Coverity: Static analyzer by dwheeler · · Score: 5, Informative

    Coverity sells software that does static analysis on source code and looks for patterns that suggest defects. For example: a code sequence that allocates memory, followed later by something that de-allocates that memory, followed later by something that de-allocates the same memory again (a double-free).
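The allocate / free / free-again pattern can be sketched as a toy checker. This is emphatically not how Coverity works internally (real tools follow control-flow paths through actual source code); it is a minimal illustration that assumes the program has already been reduced to a flat list of hypothetical ("alloc" | "free", name) events along one path.

```python
def find_double_frees(events):
    """Return (index, name) for every free of a name that is already freed.

    events: iterable of (op, name) pairs, where op is "alloc" or "free".
    """
    state = {}      # name -> "allocated" or "freed"
    issues = []
    for i, (op, name) in enumerate(events):
        if op == "alloc":
            state[name] = "allocated"
        elif op == "free":
            if state.get(name) == "freed":
                issues.append((i, name))   # second free with no alloc in between
            state[name] = "freed"
    return issues


suspicious = [("alloc", "p"), ("free", "p"), ("free", "p")]
clean = [("alloc", "p"), ("free", "p"), ("alloc", "p"), ("free", "p")]
print(find_double_frees(suspicious))
print(find_double_frees(clean))
```

Note that nothing here runs the program being checked; the checker only looks at the sequence of operations, which is the sense in which the analysis is "static".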

    The product is not open source software, but a number of open source software projects use it to scan their software to find defects: https://scan.coverity.com/ It's a win-win, in the sense that Coverity gets reports from real users using it on real code, as well as press for their product. The open source software projects get reports on potential defects before users have to suffer with them.

    --
    - David A. Wheeler (see my Secure Programming HOWTO)
    1. Re:Coverity: Static analyzer by Anonymous Coward · · Score: 3, Interesting

      We've run Coverity on several very large projects where I work. For C++ it did a decent job of finding little and simple things that Visual Studio missed, like variables that were never initialized before use, subtle type violations, or accesses past the end of a statically allocated array. These aren't the sorts of bugs that we worry about. The evil bugs - like those created by programmers that don't know enough about multithreading but were assigned because some offshore contractor service is the only place we're allowed to staff from and nobody vets their skillsets - all slipped right by Coverity and had to be fixed by the few remaining senior programmers. (Attrition will fix that problem soon, at least for the senior programmers moving anywhere less strategically suicidal.)

    2. Re:Coverity: Static analyzer by Anonymous Coward · · Score: 2, Informative

      you should try TSAN. See : https://code.google.com/p/thread-sanitizer/

  12. Past Coverity reviews by greg1104 · · Score: 4, Informative

    Coverity's services have been useful to a number of open-source projects. But this article is carefully picking its terms to get a headline-worthy result. Compare against the Coverity scan of PostgreSQL done in 2005 for example, and CPython's defect rate isn't very exciting at all. But that was "Coverity Prevent" and this is "Coverity Scan"...whatever that means.

    1. Re:Past Coverity reviews by cold+fjord · · Score: 1

      So comparing two unknowns you decided one of them was arbitrarily better? Any chance that the tool might be checking for more things after 8 years?

      Or are you carefully selecting data to get a nice report and link to PostgreSQL?

      --
      much of left-wing thought is a kind of playing with fire by people who don't even know that fire is hot - George Orwell
    2. Re:Past Coverity reviews by greg1104 · · Score: 1

      I tried to be clear that the two results can't be directly compared. My main point is that Coverity likes to put open-source projects in a good light, because there's better PR value for them to do so. Any sort of "best project evuh!" claims from them should recognize that this is ad copy designed to draw attention with its superlatives.

  13. Defect detector limitation by Anonymous Coward · · Score: 1

    The defect detector depends on brackets. The 0.005 defects found is because no code is perfect.

  14. C code, not Python code by paavo512 · · Score: 2

    The title is misleading as hell, again. It appears they are talking about the C code included in the Python compiler/interpreter project, and it is to be compared against other open source software projects, not against other languages. All that it shows is that the Python project developers are eager to fix the problems that this particular verification software finds. If they have fixed all those bugs, then they will have exactly zero known defects. Good for them, but most probably there will remain unknown defects, and it is hard to measure their number.

    In short, a meaningless article and a misleading title. The correct headline would have been "Python core developers are fixing bugs with the help of a tool".

  15. How rude! by sgt+scrub · · Score: 2

    They counted my C++ features as bugs?

    --
    Having to work for a living is the root of all evil.
  16. Re:Python is readable by X0563511 · · Score: 1

    I've seen multiple-kilobyte posts before. Slashdot truncates it on initial display with a 'read more' link appended to the end, that shows the full post.

    --
    For large sets, this will be our guide even unto death, for the LORD will work for each type of data it is applied to...
  17. What does the measering mean? by angel'o'sphere · · Score: 1

    Numbers like .69 or 1.0 or 0.005 mean nothing if you don't know what they relate to.

    Usually defect counts are based on 1k LOC (one thousand lines of code, and no: a line of code is likely not what you consider a line of code).

    I doubt that 1.0 is an accepted industry standard defect density [...] for good quality software of ...

    1 defect per 1 kLOC is absurdly high; luckily, in the last 20 years I was never on a project with such a high defect rate.

    --
    Cost free eBook I read (by iBook/Kobo/Amazon/ObookO/Gutenberg etc.): "The Green Odyssey" by Philip Jose Farmer.
    1. Re:What does the measering mean? by K.+S.+Kyosuke · · Score: 1

      1 defect per kLOC is pretty good. The question is, however, *what* is exactly a defect? It is one thing to define a defect as an error that manifests itself when a piece of code is passed what ought to be a valid input, but we all know that no program will ever be handed any significant subset of all valid input during anyone's lifetime. Even that 1 defect per kLOC may never be triggered because even though the function is defective in terms of not handling all possible inputs from what one would consider the maximum reasonable input domain, the real usage could easily differ.

      --
      Ezekiel 23:20
    2. Re:What does the measering mean? by angel'o'sphere · · Score: 1

      Yeah, what exactly is considered a defect varies.
      In the Personal Software Process by Watts Humphrey, even a line that does not compile is considered a defect and is added to the defect log.
      Bottom line: everything that comes up in an issue tracker with the aim of fixing it later is a defect.

      In that regard, sleeping defects that are never discovered, because no invalid data ever triggers them, are not defects.

      Regarding 1 error per kLOC. Serious tools count something like this:

      /**
      * @param in, the amount to get into the function
      * @param out, the amount to get out of the function
      * @returns the min of in and out
      */
      int func(int in, int out) {
          return 9;
      }

      as 8 or 9 lines of code.
      Every comment regarding a parameter can be wrong (or use a wrong parameter name, etc.) and is considered code.
      The return type of the function can be wrong and is considered one line of code.
      The two parameters can be wrong (wrong type, wrong order), so every parameter is considered one line of code.
      Then the single remaining line inside the function is one line of code.
      If it was an 'if' or 'while', some people count the } as a line of code, as it could be misplaced.
      So depending on your coding style, you easily have 1000 lines in a single very simple class.
      The same goes for calling the above function: the return value can be assigned to the wrong variable, the wrong function name can be called, and both parameters can be wrong (int vs. long mismatch, wrong order, or in fact wrong values), so the call is considered to be 4 lines of code.

      Regarding your input example: I disagree. Most enterprise systems are very good at rejecting invalid input.

      --
      Cost free eBook I read (by iBook/Kobo/Amazon/ObookO/Gutenberg etc.): "The Green Odyssey" by Philip Jose Farmer.
    3. Re:What does the measering mean? by K.+S.+Kyosuke · · Score: 1

      Regarding your input example: I disagree. Most enterprise systems are very good at rejecting invalid input.

      I think you probably misunderstood what I meant by "invalid inputs". Take the infamous example of the 32-bit version of Java's binary search in a sorted array: the problem was an overflow in the midpoint computation. While (a+b)/2 looks like a reasonable way to do it, even if a, b, and (a+b)/2 are all within the range of the integer type used, a+b doesn't necessarily have to be. But since few people used multi-GB arrays that could even potentially push the <a,b> tuple into an invalid range, nobody noticed for a very long time. It's wrong from the mathematical point of view (the set of inputs for which it fails is not empty) but it's actually perfectly serviceable for many real-world programs (otherwise someone would have noticed sooner!).
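The midpoint overflow is easy to reproduce. Python integers don't overflow, so this sketch simulates 32-bit signed wraparound by hand; to_int32, mid_naive and mid_safe are names invented for the illustration, and a + (b - a) / 2 is the commonly cited fix rather than anything taken from this thread.

```python
def to_int32(x: int) -> int:
    """Wrap an integer to a signed 32-bit value, as Java's int arithmetic would."""
    x &= 0xFFFFFFFF
    return x - 0x100000000 if x >= 0x80000000 else x


def div_trunc(n: int, d: int) -> int:
    """Integer division truncating toward zero (Java-style); Python's // floors."""
    q = abs(n) // abs(d)
    return q if (n < 0) == (d < 0) else -q


def mid_naive(a: int, b: int) -> int:
    """(a + b) / 2 with 32-bit wraparound: the sum can overflow."""
    return div_trunc(to_int32(a + b), 2)


def mid_safe(a: int, b: int) -> int:
    """a + (b - a) / 2: stays in range whenever 0 <= a <= b fit in 32 bits."""
    return to_int32(a + div_trunc(to_int32(b - a), 2))


low, high = 1_500_000_000, 2_000_000_000   # both fit in a signed 32-bit int
print(mid_naive(10, 20), mid_safe(10, 20))
print(mid_naive(low, high), mid_safe(low, high))
```

For small indices the two versions agree, which is exactly why the defect stayed dormant: it only shows up once a + b exceeds 2^31 - 1.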

      There was no "enterprise version" with "good rejection of invalid inputs" here. And since I don't think that you were claiming that there was one, I assume that by "enterprise systems very good in rejecting invalid input", you meant such things as bulletproof protocol parsers and external data validators, but what I was actually talking about was the fact that a programming language function taken as an abstract, mathematical unit with a binary view on correctness (correct, defective, and no other option), and the same function in the context of a real-world program that only supplies it with a limited subset of inputs are two different things.

      --
      Ezekiel 23:20
    4. Re:What does the measering mean? by angel'o'sphere · · Score: 1

      Well, not sure what you wanted to point out.

      If someone talks about "input" we usually talk about what comes from "outside of the system" "into the system". That means "random data" from DBs, Excel files, FTP connections or whatever you can think of. I interpreted your previous post with this terminology in mind.

      Ofc in "layman's talk" arguments to a function are also "input".

      And your example is ofc a perfectly good example of "short-sighted programming".

      you meant such things as bulletproof protocol parsers and external data validators, but what I was actually talking about was the fact that a programming language function taken as an abstract, mathematical unit with a binary view on correctness (correct, defective, and no other option)

      That is likely why SmallTalk and Lisp programmers would shake heads about a discussion like this :D

      After all, in SmallTalk 1/3 * 3 is 1. It is represented as Fraction(1/3) times Number(3), and the "times" (multiply) operator of the class Fraction yields the numerator when the right-hand side equals the denominator.
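For anyone without a Smalltalk image at hand, the same idea can be tried with Python's standard fractions module. This is an analogy only and says nothing about how Smalltalk's Fraction class is implemented; the point is just that exact rational arithmetic gives exact answers where binary floating point has to round.

```python
from fractions import Fraction

# Exact rational arithmetic: one third times three is exactly one.
third_times_three = Fraction(1, 3) * 3

# Contrast with binary floating point, where one tenth cannot be stored exactly.
exact_tenths = Fraction(1, 10) * 3 == Fraction(3, 10)
float_tenths = 0.1 * 3 == 0.3

print(third_times_three, exact_tenths, float_tenths)
```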

      --
      Cost free eBook I read (by iBook/Kobo/Amazon/ObookO/Gutenberg etc.): "The Green Odyssey" by Philip Jose Farmer.
    5. Re:What does the measering mean? by K.+S.+Kyosuke · · Score: 1

      If someone talks about "input" we usually talk about what comes from "outside of the system" "into the system". That means "random data" from DBs, Excel files, FTP connections or whatever you can think of. I interpreted your previous post with this terminology in mind.

      Alan Kay would tell you that there is no difference and that it's little computers all the way down. :-)

      That is likely why SmallTalk and Lisp programmers would shake heads about a discussion like this :D

      They would. But even us Lispers do (declare (optimize speed (safety 0)) (type fixnum var1 var2 ...)) from time to time when we feel like it (and when we can prove to ourselves that the thing won't explode in our face - generating a value that will blow up, e.g., Allegro CL's GC, seems to be dangerously easy).

      --
      Ezekiel 23:20
  18. Re:This is w/r/t CPython, not random code in Pytho by lightBearer · · Score: 2

    Yes it would, as the Python interpreter is open source: Python License & History

    --
    - No Bounce, No Play -
  19. Re:Python is readable by Anonymous Coward · · Score: 4, Funny

    Python is readable and readable code is easier to fix.

    Also, smarter guys have a tendency to use Python/Haskell/Erlang

    Oh yeah? Well, I'm working on a readable Perl script to refute that statement. How long do they accept comments in these threads?

    How is this possible? Perl is a write only language.

  20. Re:Python is readable by MetalliQaZ · · Score: 5, Informative

    The result in question tested the Python project's code, which is commonly known as CPython, which is the Python interpreter written in C.

    --
    "Here Lies Philip J. Fry, named for his uncle, to carry on his spirit"
  21. Perl IS readable by Anonymous Coward · · Score: 3, Funny

    @*(&^)&^)^$

    Perl programmers write their code in cartoon profanity!

  22. Hey metric retards by Sulik · · Score: 4, Interesting

    While it can be useful in pinpointing common code defects, interpreting Coverity results as an absolute indicator of code quality is just retarded. 90% of Coverity's defects tend to be really false positives that would be obvious to even the average code monkey... Not sure that massaging a code base to please Coverity and getting a 'high score' is really any kind of achievement; it may be more an indicator that you have way too much time on your hands...

    --
    Help! I am a self-aware entity trapped in an abstract function!
    1. Re:Hey metric retards by cold+fjord · · Score: 1

      90% of Coverity's defects tend to be really false positives that would be obvious to even the average code monkey

      If the average code monkey was spotting the defects they shouldn't be in there at all for Coverity to find. Tools catch things that people overlook, including subtle things.

      --
      much of left-wing thought is a kind of playing with fire by people who don't even know that fire is hot - George Orwell
    2. Re:Hey metric retards by philip.paradis · · Score: 1

      Would you care to share your justification for submitting a story with a grossly misleading headline and story? The code analysis in question wasn't performed on software written in the Python programming language; it was performed on the Python interpreter written in C. Again, why would you submit a story under such horrendously misleading premises? You've probably caused a pile of headaches for developers who will have to explain the difference between C and Python to their development "managers."

      --
      Write failed: Broken pipe
    3. Re:Hey metric retards by philip.paradis · · Score: 1

      You also managed to miss the GP's point that in his experience, most of the "defects" aren't actually defects at all, but false positives that result from Coverity depending on a certain coding style.

      --
      Write failed: Broken pipe
    4. Re:Hey metric retards by kermidge · · Score: 1

      According to their report (take it as you will) false positives as of 2012 were 9.7% of reported defects.

  23. Bullshit by gwstuff · · Score: 2

    This is bullshit, but a great tactical conversion of non-informative data into marketable news by Coverity.

    Coverity uses lexical pattern matching to find bugs based on "tricks" discovered by Dawson Engler and his colleagues at Stanford University in the early 2000s. The tricks (find "malloc" not coupled with "free", cli() not coupled with sti(), dereferences of uninitialized pointers, etc.) were developed in the context of the C language used for operating system code.

    So they applied tricks developed for one language and context to another language in a different context, and found fewer bugs in the latter than in the former. You would think that this suggests a failure, in that their techniques are not quite as effective on Python as they were on C. Instead, they have turned it around into a statement on the inherent high quality of Python code.

    It's like saying that because a good tennis player sucks at table tennis, table tennis must be the harder game.

    1. Re:Bullshit by Lehk228 · · Score: 2

      The article is about the C code that makes up the CPython interpreter, not about Python scripts.

      --
      Snowden and Manning are heroes.
    2. Re:Bullshit by gwstuff · · Score: 1

      I apologize. I misunderstood the article, but looking at the other comments, I wasn't the only one who misinterpreted "open source python code" to mean a wide sampling of open source code written in Python. /me yanks foot out of mouth.

    3. Re:Bullshit by Lehk228 · · Score: 1

      most of the thread did, the headline and summary are of DICEy quality.

      --
      Snowden and Manning are heroes.
  24. Re:Python is readable by Mitchell314 · · Score: 1

    I think GP meant time, as in how long the comment sections stay open for posting. The answer is plenty long enough to finish a readable perl project, as long as TFAC doesn't have a life. Or waste time on petty little things like sleep. :P

    --
    I read TFA and all I got was this lousy cookie
  25. Re:Python is readable by ceoyoyo · · Score: 2

    It appears you're right. Neither the submitter nor the article writer understand the difference between "code written in Python" and "the CPython interpreter, which is written in C", which is what Coverity actually tested. So 90% of the comments are off topic. Mods - kudos to the parent.

  26. Re:Python is readable by vux984 · · Score: 3, Insightful

    Python is readable and readable code is easier to fix.

    True and true. But Python's use of semantic whitespace is also very brittle, very easy to break, and a huge pain in the ass to fix compared to languages that use braces or keywords to define 'blocks'.

    But that's not even terribly relevant here, because this article is about the source code used for the python interpreter, which is C, not python.

  27. Re:Can't be right by skids · · Score: 1

    it might have an advantage in forcing lazy programmers with no concept of 'code etiquette' to write semi-readable code as indentation is forced by syntax.

    Since the "density" is measured in defects per lines of code, I suggest that Python mandate an extra line return between all lines. Then they could halve their defect density. Done.

  28. Re:Python is readable by XcepticZP · · Score: 4, Insightful

    But Python's use of semantic whitespace is also very brittle, very easy to break, and a huge pain in the ass to fix compared to languages that use braces or keywords to define 'blocks'.

    This is one thing I never quite get about python criticism. Sure, whitespace is significant, but I've never had it break easily or be "brittle" as you say. Then again, I don't go past 2 or 3 levels of nesting, class nesting included. And all my units of work are in separate methods/functions instead of being child blocks inside a giant function which I've regularly seen done. Perhaps the use of whitespace isn't the real issue many people have with python, but rather delineating blocks using whitespace exposes a bit of an inherent flaw in the way they structure their program's flow.

    Either way, having a proper IDE when writing python code will go a long way to making you comfortable with using whitespace instead of braces. Initially it was weird and unsettling for me, because I didn't understand all the consequences that whitespace could have. But a little fluid and constant coding in an IDE will rid you of that quick enough.

  29. Re:Can't be right by XcepticZP · · Score: 3, Informative

    it might have an advantage in forcing lazy programmers with no concept of 'code etiquette' to write semi-readable code as indentation is forced by syntax.

    on the other hand, making indentation part of the language creates all sorts of other readability problems.

    You'd be surprised at how much syntax in python actively ignores whitespace. As soon as you open up any brackets, it's a veritable free-for-all when it comes to whitespace and indentation. In such a scenario, a proper coding standard document is imperative for readable code.
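    A minimal sketch of that free-for-all, using made-up function names: inside (), [] or {} Python joins the physical lines implicitly and pays no attention to how they are indented.

```python
# Inside brackets, continuation lines may be indented any way at all.
def ragged_total():
    values = [
  1,
            2,
 3,
    ]
    return sum(
values
        )


# The conventional layout of the same function.
def tidy_total():
    values = [1, 2, 3]
    return sum(values)


# Both layouts parse to the same program.
print(ragged_total() == tidy_total())
```

    Both functions compile and return the same total, which is why a style guide has to do the work the parser won't.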

  30. Re:WRONG! RTFA! by Zero__Kelvin · · Score: 4, Insightful

    "I quote: "Coverity scanned over ten thousand Python programs on the popular GitHub open-source software repository...""

    Great. Now where the hell do you quote it from, since that sure as hell isn't in the linked to article anywhere.

    "Coverity's scanning technology has analyzed more than 396,000 lines of code in the latest builds of Python 3.3.2. That analysis has led to 181 new defects being identified. For the year to date, Python developers have already fixed 278 defects. - See more at: http://www.eweek.com/developer/open-source-python-code-sets-new-standard-for-quality-study.html#sthash.wSdGotDE.dpuf"

    That makes it pretty clear that they are talking about the Python executable itself. Version 3.3.2 to be exact.

    "One of the more interesting defects that Coverity identified in Python that developers have since fixed is a "double-free" defect. "'Double free' means that you allocate memory for a pointer, and then you free the memory twice," Samocha explained. "This can cause memory corruption, which can lead to unexpected behaviors or program crashes." - See more at: http://www.eweek.com/developer/open-source-python-code-sets-new-standard-for-quality-study.html#sthash.wSdGotDE.dpuf"

    ... and that clearly shows that they are talking about the interpreter, written in C, which has pointers, malloc() and free(). Python has a memory manager with garbage collection and doesn't use pointers. The Python programmer doesn't allocate and free memory resources directly.

    I especially love how you criticized a language earlier, when you clearly have literally no knowledge of said language.

    --
    Guns don't kill people; Physics kills people! - John Lithgow as Dick Solomon on Third Rock From The Sun
  31. Re:Python is readable by Zero__Kelvin · · Score: 1

    ... which would matter if the Python interpreter was written in Python. It's not. It is written primarily in C.

    --
    Guns don't kill people; Physics kills people! - John Lithgow as Dick Solomon on Third Rock From The Sun
  32. Re:Can't be right by Zero__Kelvin · · Score: 1

    That would not change the number of lines of code. An LOC is a logical unit not measured by the number of carriage returns or printable lines. For example, here is a single line of C code:
    int


    my_int

    ;
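
    The same distinction can be sketched for Python source with the standard `tokenize` module. This is only an illustration of logical versus physical lines, not how Coverity counts anything; `count_lines` and the two sample strings are invented for the example.

```python
import io
import tokenize


def count_lines(source):
    """Return (physical_lines, logical_lines) for Python source text."""
    physical = len(source.splitlines())
    tokens = tokenize.generate_tokens(io.StringIO(source).readline)
    # A NEWLINE token ends one logical line. Blank lines and line breaks
    # inside brackets produce NL tokens instead, so they are not counted.
    logical = sum(1 for tok in tokens if tok.type == tokenize.NEWLINE)
    return physical, logical


compact = "x = 1\ny = [1, 2]\n"
padded = "x = 1\n\n\ny = [\n    1,\n\n    2,\n]\n\n"

print(count_lines(compact))
print(count_lines(padded))
```

    The padded version has far more physical lines but the same number of logical lines, so padding with blank lines would not move a metric that counts logical lines.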

    --
    Guns don't kill people; Physics kills people! - John Lithgow as Dick Solomon on Third Rock From The Sun
  33. Re:Python is readable by AvitarX · · Score: 2

    I saw a trivial example break when posted to /. not that long ago, in the interview.

    --
    Wow, sent an e-mail as suggested when clicking on "use classic" banner, and got a fast response that addressed my msg
  34. Re:Python == LAME by MikeBabcock · · Score: 3, Interesting

    Nope, nobody at all http://www.python.org/about/success/

    Jeez.

    --
    - Michael T. Babcock (Yes, I blog)
  35. Re:Python is readable by ebno-10db · · Score: 1

    If you write a readable Perl script, then you've completely missed the point of the language. Ever hear of job security?

  36. Register for the study? by Anonymous Coward · · Score: 1

    So, it's definitely spam then?

  37. Re:Python is readable by fahrbot-bot · · Score: 4, Insightful

    Sure, whitespace is significant, but I've never had it break easily or be "brittle" as you say.

    Not python, but one example of this type of thing would be in a Makefile where target commands are indented by a tab. Some newer versions of (g)make will allow spaces, but most require a tab. Cut and paste that in an X-Windows session (tabs are converted to spaces) and you're screwed. From Make Software: Makefiles

    Each command line must begin with a tab character to be recognized as a command. The tab is a whitespace character, but the space character does not have the same special meaning. This is problematic, since there may be no visual difference between a tab and a series of space characters. This aspect of the syntax of makefiles is often subject to criticism.

    --
    It must have been something you assimilated. . . .
  38. Re:WRONG! RTFA! by Zero__Kelvin · · Score: 1

    They probably have. The Python interpreter is pretty complicated and valgrind isn't foolproof. Furthermore, if you don't have test cases that expose the problem, valgrind won't find them since it doesn't do static analysis of code, it hooks the calls to malloc() and free() and reference counts. Valgrind is an awesome tool, but if you run your program and valgrind doesn't complain that doesn't mean it is bug free, unless it is a very procedural / linear program and you can guarantee that every execution path has been taken and all the corner cases have been captured in your use cases / unit tests.
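
    A rough Python analogy of that limitation (this is not Valgrind and not CPython source; `parse_port` and `default_port` are made-up names): a bug in a branch that is never executed stays invisible to any check that only observes what actually runs.

```python
def parse_port(text):
    """Convert text to a port number."""
    try:
        return int(text)
    except ValueError:
        # Bug: 'default_port' is never defined anywhere, but nothing
        # complains until this branch is actually executed.
        return default_port


def fails_on_bad_input():
    """Return True if the error path blows up when it is finally taken."""
    try:
        parse_port("not-a-number")
    except NameError:
        return True
    return False


print(parse_port("8080"))      # the well-tested path works
print(fails_on_bad_input())    # the untested path hides the bug
```

    A test suite that only ever feeds in valid numbers would pass cleanly, which is the gap static analysis is meant to cover.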

    --
    Guns don't kill people; Physics kills people! - John Lithgow as Dick Solomon on Third Rock From The Sun
  39. Re:Python is readable by XcepticZP · · Score: 1

    As I recall, the comments in that thread pointed out that no sane coder would be transferring code using such a medium as html that mangles white space.

    Although, I have been bitten many times when copy-pasting python code between a text file and the command line. Though I've mostly gotten around that problem by working with files rather than trying to use the CLI to input arbitrary python code as every single console does it slightly differently.

  40. Re:Doesn't surprise me by Desler · · Score: 1

    Doesn't surprise me. Obviously, Python is not suitable for everything. But, it is easy to read, easy to write code in, avoids those little issues of C and even Java where some OK-looking code is in fact a security risk.

    FYI: The article is about the CPython code which as you can probably guess is written in C. It is not about projects written in Python.

  41. ..and thats why there are few job opportunities. by ClassicASP · · Score: 2

    I once thought about learning python. Then I combed craigslist across the US looking for job opportunities doing python programming. Relatively few out there by comparison to ASP.NET and Java. Sure it's less buggy.....but what's to motivate anyone to learn something they can't easily find work in?

  42. Python one-liner by tepples · · Score: 1
    Simpler:

    print(''.join(reversed('yuG stsoH .rM olleH')))

  43. C initializers by tepples · · Score: 1
    Whitespace normalization stops some but not all metric gaming. How many lines of code does each of these C examples have?

    // Example 1
    int egg = 0, sausage = 0, spam = 0;

    // Example 2
    int egg = 0;
    int sausage = 0;
    int spam = 0;

    1. Re:C initializers by Zero__Kelvin · · Score: 1

      I concur, and never said LOC was a good metric, nor that it can't be gamed. I was merely pointing out that the GP's idea didn't hold water. OTOH, there is nothing that says a tool that counts atoms and calls both of those three lines cannot be devised / used. Of course, that won't make LOC a great metric. Nothing can do that, as I think we both can agree. [is that a first? ;-) ]

      --
      Guns don't kill people; Physics kills people! - John Lithgow as Dick Solomon on Third Rock From The Sun
  44. Re:Python is readable by vux984 · · Score: 4, Insightful

    This is one thing I never quite get about python criticism. Sure, whitespace is significant, but I've never had it break easily or be "brittle" as you say.

    Anytime you refactor stuff, or modify something even somewhat nested, especially in a 'dumb text editor', it's a pain in the ass.

    Anytime you need to pass code snippets via email, forums, etc... well... you just don't because it's a total waste of time. :)

    It's also easy to barf all over code going into word processors, pdf files, and so forth. It's nice to be able to copy-paste some C out of a PDF file or an email, or off a forum, and then tell the IDE to just reformat it.

    Perhaps the use of whitespace isn't the real issue many people have with python, but rather delineating blocks using whitespace exposes a bit of an inherent flaw in the way they structure their program's flow.

    No. Because we use whitespace / indenting in our C / C++ etc projects too. We even have standards requiring it, and our IDEs / toolchains may even be set up to reformat it just-so before commits. We want all the benefits of well formatted code.

    We just like the IDE to do all the work of actually formatting it, and reformatting it as necessary.

    Either way, having a proper IDE

    Is how you lose the argument. Everyone but python groupies agrees that any programming language worth considering MUST have its programs represented as plaintext files, with no proprietary / binary stuff that can only be accessed with specialized tools. Requiring an IDE is the sign of a bad language.

    Python passes this test, but it can be pretty hideous to use with an arbitrary text editor. And really, even brainfuck wouldn't be too bad with the right IDE, right?

  45. Re:This is w/r/t CPython, not random code in Pytho by jrumney · · Score: 1

    That makes more sense. From the summary, I thought the most likely scenario was that Coverity does not handle Python code very well based on my experience of random buggy Python code. It is to be expected that a widely used VM/interpreter is going to be of better quality than your average code.

  46. They've lots of time.. by segfault_0 · · Score: 1

    The code is so slow, they have lots of extra time to look for defects.

    --

    I was crazy back when being crazy really meant something. (Charles Manson)
    1. Re:They've lots of time.. by Z00L00K · · Score: 1

      When you look at analyzing defects - you can find coding defects pretty easily but you can't find design defects where the designer has misunderstood the goal of the product.

      One example of a pretty annoying design mistake is in Microsoft software, where you can choose to send a document as an attachment from Powerpoint, Excel or Word. However, it will at the same time block all access to other windows in Outlook, preventing you from getting the list of names that you know were present in another message. Not a crashing bug, but pretty annoying and stupid from a user perspective.

      --
      If builders built buildings the way programmers wrote programs, then the first woodpecker would destroy civilization.
  47. Re:Python is readable by XcepticZP · · Score: 2

    Is how you lose the argument. Everyone but python groupies agrees that any programming language worth considering MUST have its programs represented as plaintext files, with no proprietary / binary stuff that can only be accessed with specialized tools. Requiring an IDE is the sign of a bad language.

    I don't think you understood what I was trying to say here. The IDE is there to teach you the boundaries when it comes to whitespace in python. Bad indentation, mismatching brackets and overall bad syntax gets picked up immediately and you are warned. Just like you get syntax error highlighting in other languages. Python's usage of whitespace scares a lot of people and keeps them from experimenting. The IDE is what I think would help them overcome their fear/uncertainty. If anything, Python is one of the languages where it's explicitly less required to have an IDE and still be proficient in it.

  48. Re:Python is readable by gregor-e · · Score: 1

    Often the goal of having a program written in Perl is to get something slammed out and running as quickly as possible. Give a sloppy language like Perl to a talented cowboy, and you can get a huge amount of functionality in a short time.

  49. Re:..and thats why there are few job opportunities by shutdown+-p+now · · Score: 1

    On the other hand, there are also proportionally many Java and .NET programmers, so you'll be competing with fewer people in Python land.

    The right answer, anyway, is to learn all three - and a couple more (C++, in particular).

  50. Re: Python is readable by cold+fjord · · Score: 1

    I can see that you're an idiot. I don't know why you bothered to post.

    --
    much of left-wing thought is a kind of playing with fire by people who don't even know that fire is hot - George Orwell
  51. Re:WRONG! RTFA! by viperidaenz · · Score: 1

    What about the bit about Coverity Scan only supporting Java and C/C++?

  52. Re:Python is readable by paramour · · Score: 1

    The reason for newline-tab being syntactically significant in makefiles is because by the time make's author, Stuart Feldman, realized the problems with this choice there were already about a dozen users of make and he didn't want to break any of their makefiles with an incompatible change. See _The Art of Unix Programming_ by Eric S. Raymond.

    The lesson is the time to fix a bad design decision is as soon as possible, because it's not going to get any easier later; unless or until your program becomes irrelevant, at which point there's little reason to fix it at all.

  53. Re:Doubtful by dottrap · · Score: 1

    The parent post didn't deserve to be modded down. It is highly credible that Lua would have a very low defect rate. Lua has a small, clean, source code base, and it was audited by large organizations such as Verisign for high reliability databases and by The Wikimedia Foundation for security.

  54. Re:Python is readable by Anonymous Coward · · Score: 1

    You're not really addressing the concerns at issue here: Nobody in the outside world cares that your IDE is teaching you proper whitespace syntax.

    What the outside world cares about is that when your code, years from now, goes out into the world outside your IDE hothouse for delicate code-flowers, it doesn't cause apoplexy and foaming at the mouth when your email handler, or some other group's language lawyer, sets up a tab filter that produces, say, a subtle bug you can't see with the debugger because of syntactically significant but debugger-invisible whitespace.

    The fundamental weakness of syntactically significant whitespace is that you can't see it. When your compiler or debugger (or you) barfs on a visible token, it points at it. When it barfs on an invisible token, chances are it will inadvertently implicate the previous visible token or the structure containing it, and you will go off chasing your tail down a rathole.

    I think there's a lot to like about python, particularly how it's been a force to clean up libraries that have been miasmically floating around since forever. But I have lost too many hours to syntactically significant whitespace to be happy about it. It's an idea we have to put up with, but let's not convince ourselves that it's a good idea; it ain't. It was horribly costly in human hours for makefiles, and it's still just as horrible for python.

  55. Re:Python is readable by VortexCortex · · Score: 1

    Python is readable and readable code is easier to fix.

    True and true. But Python's use of semantic whitespace is also very brittle, very easy to break, and a huge pain in the ass to fix compared to languages that use braces or keywords to define 'blocks'.

    Furthermore Python's needless attribution of syntactical meaning to whitespace means it's useless for embedding certain languages...
    ...Like Whitespace.

    Today many languages support Unicode source code which can have tons of new spaces of varying width including zero-width and non-breaking-zero-width space. The multitude of new spaces would make indentation distinction all the more brittle, but this also means new extensions to Whitespace can provide more rich and full featured embedded language support to most modern programming languages -- Except Python.

  56. Re:Python is readable by vux984 · · Score: 1

    *cough*

    The code reformatting can be done manually, via a command line tool, via the IDE or not at all. My mention of using an IDE to reformat text doesn't create the same IDE dependency you refer to.

  57. Re:Python is readable by vux984 · · Score: 3, Insightful

    Blaming Python because you don't have a rudimentary coding editor is like blaming math because you don't have a calculator with a cosine button.

    I don't have the luxury of designing "math". Irrational numbers, periodic functions, and so forth aren't optional.

    But we do have the luxury of designing programming languages, and semantic white space is a choice.

  58. uh, i RTFA and... by smash · · Score: 1

    ... couldn't find the languages compared? Curious to know how Ada fared and if Python was compared against it.

    --
    I run: Windows, OS X, Linux, FreeBSD. Just because you have a hammer, doesn't mean everything is a nail.
  59. Re:WRONG! RTFA! by TheRaven64 · · Score: 1

    I've been paid to work with the Python interpreter before. If Coverity only found one use-after free, then either the quality of the code has dramatically improved in the last three years, or Coverity is slipping...

    --
    I am TheRaven on Soylent News
  60. Re:Python is readable by TheRaven64 · · Score: 1

    Sure, whitespace is significant, but I've never had it break easily or be "brittle" as you say

    The Jabber Python MSN transport shipped with an indent bug in an error path for several releases. The error path was never hit on the developer's test machine, but always hit for me because I didn't install one of the optional libraries. The error was caused by mixing tabs and spaces, and so looked correct in the editor, but Python happened to interpret a tab as a different number of spaces to the editor[1] and so it ended up doing something different.

    This is what people mean when they call it fragile. You can introduce bugs as a result, but never see them unless you hit the code path in question (this, by the way, is a common source of exploitable bugs in all languages: code paths that are rarely hit that contain bugs, and Python makes them so easy to introduce). Meanwhile, in any language that either enforced the no-mixing-tabs-and-spaces rule with static checking[2], or which had a block delimiter character, these would be caught statically at parse time.

    I can think of no other language where such a high proportion of code that I've run that has shipped as working releases has needed me to fix it before it will even start. As far as I can tell, all of the refugees from VB6 ended up writing shoddy Python code. Is it the language's fault? Well, it certainly doesn't help. I've been asking Python programmers for the last year what an else clause on a for loop meant. Last Friday, one gave the correct answer for the first time. Why do I know what it means? Because a person who wrote (and shipped) some code using it apparently didn't...

    [1] Ignoring Python's general hostility to using the character that means 'indent by one level' for indents, any language with significant whitespace that doesn't error when you have a line that has both tabs and spaces at the start of a line is broken.
    [2] I believe that Python now has an option to check this. It should have been on by default since the first release.
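
    For what it's worth, a small sketch of how this behaves on Python 3, which rejects inconsistent tab/space indentation at compile time without any option being set (the sample source and the `check` helper are invented for the illustration; the transport's actual code is not shown here).

```python
# Line 2 is indented with one tab, line 3 with eight spaces. Many editors
# display these at the same column.
source = "if True:\n\tx = 1\n        y = 2\n"


def check(src):
    """Return the name of the error raised at compile time, or 'ok'."""
    try:
        compile(src, "<example>", "exec")
    except SyntaxError as exc:  # TabError is a subclass of SyntaxError
        return type(exc).__name__
    return "ok"


print(check(source))                 # mixed tabs and spaces
print(check(source.expandtabs(8)))   # same text with tabs expanded
```

    On Python 3 the mixed version is refused with a TabError before any code runs, while the expanded version compiles. The standard library also ships `python -m tabnanny` for checking files ahead of time.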

    --
    I am TheRaven on Soylent News
  61. What this means by mwvdlee · · Score: 1

    So what they are basically saying is "Don't use our product to scan Python code; it doesn't recognize all the defects".

    I know the truth is possibly somewhere in the middle, but this report just assumes the scanning product works equally well for all languages, which is at least somewhat unlikely.

    Also, what exactly is a defect in this context? Is it a security flaw, a functional error, or just something that will crash your software? If the latter is the case, then any language that accepts shitty code and just keeps running will win, regardless of whether the code actually works.

    --
    Slashdot social media options: AIM, ICQ, Yahoo, Jabber and Mobile Text. Why no MySpace?
  62. Re:Python is readable by greggman · · Score: 1

    If you write all code yourself you might not have this problem. If you ever copy and paste from somewhere else you might. I have run into that problem. It took many hours to find.

  63. Re:Python is readable by real-modo · · Score: 1

    ... which worked fine when I ran it.

  64. Re:Python is readable by AvitarX · · Score: 1

    Trivial code is shared via html all the time by many coders, sane or not. Its fragility causes you problems too.

    I personally think it's a fair price to pay for consistent style between coders (which makes multi coder projects easier to deal with too), but let's not pretend that it's without drawback.

    --
    Wow, sent an e-mail as suggested when clicking on "use classic" banner, and got a fast response that addressed my msg
  65. Re:Python is readable by Half-pint+HAL · · Score: 1

    If you're using a "dumb" text editor, then don't complain about it.

    Your argument is back-to-front. Python has whitespace because of dumb editors. Guido's rationale was simple: when writing C in a dumb editor, there is redundancy of braces (for the computer) and spaces (for the human). There is the danger that the two might not match, and that a human debugging the code would misread the structure by following the indentation levels instead of the braces.

    And here lies the problem: Guido's decision was for the sake of "plain text" and dumb editors, but the end result was to force the use of smart editors. Hell, even the "official" Python IDE, IDLE, isn't good enough. IDEs that have block hiding have, as a consequence, block highlighting, a side-effect of which is the explicit marking of block start and end... that same redundancy that Guido wanted rid of to begin with.

    I bet you're using a code editor with block highlighting...

    --
    Got them moderator blues I blieve I walk out the do', With these mod-points I been gettin', I 'most never post no mo'
  66. Misleading by cgimusic · · Score: 1

    This all seems very misleading. It took me quite a while to figure out that it is only talking about the code for the Python interpreter, not all open-source programs written in Python.

  67. Re:Python is readable by Half-pint+HAL · · Score: 1

    I've been asking Python programmers for the last year what an else clause on a for loop meant. Last Friday, one gave the correct answer for the first time. Why do I know what it means? Because a person who wrote (and shipped) some code using it apparently didn't...

    I didn't know that structure... it should be banned... it's totally "un-pythonic" in that it annihilates the principle of readability. Kill it with fire.
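
    For anyone else who hasn't met it, a minimal sketch of what the construct does (the function name is made up): the else block of a for loop runs only when the loop finishes without executing break.

```python
def find_first_negative(numbers):
    """Return the first negative number, or None if there is none."""
    for n in numbers:
        if n < 0:
            result = n
            break
    else:
        # Reached only when the loop ran to completion without a break,
        # which includes the case of an empty sequence.
        result = None
    return result


print(find_first_negative([3, -1, 4]))
print(find_first_negative([3, 1, 4]))
```

    Reading the else as "no break happened" rather than "the loop condition was false" is the part that trips people up.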

    --
    Got them moderator blues I blieve I walk out the do', With these mod-points I been gettin', I 'most never post no mo'
  68. Re:Python is readable by lysdexia · · Score: 1

    "talented cowboy" which quickly becomes "miserable sonovabitch" when you have to clean up after his horse.

  69. Re:Python is readable by lysdexia · · Score: 1

    I do all my coding in vi. Generally "g/ /s///g" takes care of any white space problems, when importing folk code. I hated using curlies when I had to start using Perl/javascript/PHP, but I got used to them. It's just a mental flexibility thing. It sucks at first, but after a couple of days I have no trouble. One thing python has done for me vis. other languages is I don't nest. Generally, I have found if I have a big old stalactite of conditionals, that they can be replaced with a function call that simplifies the flow for other humans and my future self.

  70. Re:Python is readable by lysdexia · · Score: 1

    Yeah, the search and replace got munged. Pretend there is a tab in the replace portion.
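
    Since the original vi command did not survive posting, here is a rough Python stand-in. It assumes the goal is to turn leading tabs into spaces (the direction of the substitution can't be recovered from the mangled text), and `leading_tabs_to_spaces` and its `width` parameter are names invented for this sketch.

```python
import re


def leading_tabs_to_spaces(text, width=4):
    """Replace each tab in a line's leading whitespace with `width` spaces."""
    def fix(match):
        return match.group(0).replace("\t", " " * width)

    # ^ with re.MULTILINE anchors at the start of every line, so only the
    # indentation is rewritten; tabs later in a line are left alone.
    return re.sub(r"^[ \t]+", fix, text, flags=re.MULTILINE)


snippet = "if x:\n\ty = 1\t# comment\n"
print(repr(leading_tabs_to_spaces(snippet)))
```

    This treats every tab as a fixed number of spaces rather than advancing to a tab stop, which is only safe when the indentation is tabs alone.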