Slashdot Mirror


CPAN: $677 Million of Perl

Adam K writes "It had to happen eventually. CPAN has finally gotten the sloccount treatment, and the results are interesting. At 15.4 million lines of code, CPAN is starting to approach the size of the entire Red Hat 6.2 distribution mentioned in David Wheeler's original paper. Could this help explain Perl's relatively low position in the SourceForge.net language numbers?"

14 of 277 comments

  1. yeah, but try removing the punctuation by WebMasterJoe · · Score: 4, Funny

    If you take out the punctuation, though, it's down to twelve lines of code.

    --
    I really hate signatures, but go to my website.
  2. Huh? by Billobob · · Score: 4, Insightful

    Low position? For a language that's not supposed to be a full-blown low-level language like C/C++, perl is pretty damn well represented - over 1/3 the number of projects compared to C isn't that bad. If you have just one file, something like sourceforge usually isn't needed.

    --
    If you have to ask, you'll never know.
  3. Bahhh! by justanyone · · Score: 4, Funny

    Bahhh, I know people richer than that!

    Now compute the economic gain of using Perl vs. any other language:
    Perl vs. Nothing : $677M
    Perl vs. C : $1.25B
    Perl vs. C# : $2.77B
    Perl vs. Hand Optimized Assembly on Honeywell DPS-3E running GCOS operating system: Priceless

  4. Re:Huh? by _14k4 · · Score: 5, Funny

    Here, I'll repost the link from the article you never read:

    sloccount

  5. Useless Measurement? by webword · · Score: 5, Insightful

    What is more important, lines of code or lines of quality code? People are always so impressed with sheer numbers. Quality is important.

    A similar issue is format and structure. You might do something almost right, but it could be better. For example, you might include dates on your web pages but is the format good for users? It can probably be better!

    Numbers are only impressive when they are placed in context of their overall utility. Of course, regarding code, measuring "overall utility" is no joke. Can you really tell that the code from Programmer A is better than the code from Programmer B?

    In any event, keep your eyes open. Don't let "15.4 million lines of code" amaze you just because the number is big. Let it amaze you because of what it means, and what those lines of code do for users.

    1. Re:Useless Measurement? by Geoff-with-a-G · · Score: 5, Funny

      What is more important, lines of code or lines of quality code? People are always so impressed with sheer numbers. Quality is important.

      Seriously.
      And it's Perl.
      I thought the whole point was that you could write a massive Perl program in a single line.
      15.4 million just tells me that CPAN is getting sloppy. Let's knock that down to say, 17 HUGE lines, okay?

  6. Relatively low? by stinkyfingers · · Score: 5, Funny

    It's relatively low because that list is in alphabetical order!

  7. Gilb's Law by YetAnotherName · · Score: 4, Interesting
    For anyone who says that lines of code isn't a useful measure, just remember "Gilb's Law":
    Two years ago at a conference in London, I spent an afternoon with Tom Gilb, the author of Software Metrics ... I found that an easy way to get him heated up was to suggest that something you need to know is "unmeasurable." The man was offended by the very idea. He favored me that day with a description of what he considered a fundamental truth about measurability. The idea seemed at once so wise and so encouraging that I copied it verbatim into my journal under the heading of Gilb's Law:

    Anything you need to quantify can be measured in some way that is superior to not measuring it at all.

    Gilb's Law doesn't promise you that measurement will be free or even cheap, and it may not be perfect---just better than nothing.
    --Tom DeMarco and Timothy Lister, Peopleware 2/E, Dorset House Publishing, New York, 1999.
  8. Re:Huh? by servognome · · Score: 4, Funny

    /. response efficiency warning!
    To conserve server resources in the future please update your response "Did you even attempt to click the underlined word 'sloccount'? If not, do it now and read the first line of the first paragraph." with the more efficient "RTFA" or "RTFA you stupid noob" if you are not into the whole brevity thing.

    --
    D6 63 0D 70 89 81 BB 8E 7B 7C 5F 5D 54 EA AB 73
  9. Low position? by fanatic · · Score: 4, Interesting
    Copying and pasting the linked Sourceforge page into a file, then sorting, yields the following highest project numbers:

    Perl 5254 projects
    PHP 9010 projects
    Java 12210 projects
    C 13069 projects
    C++ 13255 projects

    So perl is behind only 4 others. Given that much Perl project work probably ends up in CPAN instead of sourceforge, this is actually pretty high. Did the poster mean he'd expect higher without CPAN?
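
    The copy, paste, and sort step described above can be sketched in Python. This is only an illustration: the counts are the five quoted in this comment, and the "Name 123 projects" line format and the parse_counts helper are assumptions, not the actual layout of the Sourceforge page.

```python
# Counts as quoted in the comment; the line format is an assumed simplification.
raw = """\
Perl 5254 projects
PHP 9010 projects
Java 12210 projects
C 13069 projects
C++ 13255 projects
"""

def parse_counts(text):
    """Return (language, count) pairs from lines shaped like 'Name 123 projects'."""
    pairs = []
    for line in text.splitlines():
        # Split from the right so a language name containing spaces stays intact.
        parts = line.rsplit(None, 2)
        if len(parts) == 3 and parts[1].isdigit() and parts[2] == "projects":
            pairs.append((parts[0], int(parts[1])))
    return pairs

# Sort by project count, largest first.
ranked = sorted(parse_counts(raw), key=lambda pair: pair[1], reverse=True)

for rank, (language, count) in enumerate(ranked, start=1):
    print(f"{rank}. {language}: {count}")
```

    With only these five entries, Perl lands in fifth place, which matches "behind only 4 others".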

    --
    "that's not encryption - it's a new perl script that I'm working on..." - from some Matrix parody
    1. Re:Low position? by lawpoop · · Score: 4, Funny

      Yes, but close to 75% of all those PHP Projects are a DVD/CD cataloging system.

      --
      Computers are useless. They can only give you answers.
      -- Pablo Picasso
  10. All lines are not equal by fanatic · · Score: 4, Interesting
    One line of Perl typically does a lot more than one line of C code (even without absurd "golf" tricks). The same is true of other high level languages. So even leaving out issues of programmer quality, what does this really mean?

    Also, from the linked article:

    Reasons why these results are meaningless:
    • Most importantly, I've told SLOCCount all of CPAN is one project, which is probably inflating the numbers significantly. When I get more time, I may run SLOCCount per-distribution, then sum the totals. However, SLOCCount appears to have bugs handling this many sub-projects, so I will need to run them separately and manually sum the results.
    • mini-cpan.pl doesn't actually find only the latest versions of everything, some dists are duplicated and some may be ignored.
    • There's probably plenty of generated code not being identified correctly.
    • There's probably plenty of code downloadable from CPAN that wasn't written for CPAN, and so probably shouldn't be counted.
    • All the usual reasons why code metrics based on numbers of lines of source code are meaningless.
    And here's another: CPAN includes perl itself - which is probably a *lot* of lines of C code.
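
    A rough sketch of why counting all of CPAN as one project could inflate the estimate, as the first bullet suggests. It assumes the effort figure comes from the Basic COCOMO organic-mode formula (person-months = 2.4 * KSLOC^1.05); the choice of 10,000 equal-sized distributions is purely hypothetical and is not taken from the article.

```python
def cocomo_effort_pm(ksloc, a=2.4, b=1.05):
    """Basic COCOMO (organic mode) effort in person-months for `ksloc` thousand lines."""
    return a * ksloc ** b

TOTAL_KSLOC = 15_400   # 15.4 million lines, from the story summary
N_DISTS = 10_000       # hypothetical number of equal-sized distributions

# Effort when everything is treated as one giant project.
as_one_project = cocomo_effort_pm(TOTAL_KSLOC)

# Effort when each distribution is estimated separately and the results are summed.
per_dist_sum = N_DISTS * cocomo_effort_pm(TOTAL_KSLOC / N_DISTS)

# Because the exponent is above 1, the single-project figure is larger.
ratio = as_one_project / per_dist_sum
print(f"one project is {ratio:.2f}x the per-distribution sum")
```

    Under these assumptions the gap depends only on how many pieces the code is split into, so the real inflation factor would depend on the actual number and sizes of the distributions.
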
    --
    "that's not encryption - it's a new perl script that I'm working on..." - from some Matrix parody
  11. Re:Perl coders make $135k/year? by Minwee · · Score: 5, Informative

    On average, salary is only half of what a company pays for an employee. If you count benefits, office space, training, administration and all of the other costs involved that $135k works out to more like a $67,000 salary.

    A junior programmer working in Manhattan makes about $60,000 a year according to a recent salary survey, going up to $90,000 for a senior guru. Based on those numbers I don't see anything wrong with the $135k/year figure.

    Coders may not _make_ $135,000, but they do _cost_ that much to employ.
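
    The arithmetic above, as a minimal sketch. The 50% salary share is the rule of thumb stated in this comment, not a measured figure, and the function names are made up for illustration.

```python
def implied_salary(total_cost, salary_share=0.5):
    """Salary implied by a total employment cost, given the share that is salary."""
    return total_cost * salary_share

def implied_cost(salary, salary_share=0.5):
    """Total employment cost implied by a salary, given the share that is salary."""
    return salary / salary_share

# The $135k/year cost figure under discussion.
print(implied_salary(135_000))

# The junior and senior Manhattan salaries quoted above.
print(implied_cost(60_000), implied_cost(90_000))
```

    By that rule, the $135k cost sits between what the quoted junior and senior salaries would cost an employer.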

  12. Re:Nonsense. by Merk · · Score: 4, Insightful

    Read the quote carefully: "Anything you need to quantify can be measured in some way that is superior to not measuring it at all."

    He's not saying that *any* measurement is better than no measurement. He's saying that there exists a measurement that is better than no measurement.

    Which tastes better, ice cream or fresh pineapple? I don't know, but rather than say "It's impossible to say! Any measurement will be flawed," you could do a survey and see what most people think tastes better. That may not be the measurement that is better than no measurement, but for certain purposes it may be.

    In the end, it depends on what your reason for doing the measurement is. If you're going to be marketing a new bubble gum flavour, then this survey is better than no information at all.