Slashdot Mirror


CPAN: $677 Million of Perl

Adam K writes "It had to happen eventually. CPAN has finally gotten the sloccount treatment, and the results are interesting. At 15.4 million lines of code, CPAN is starting to approach the size of the entire Red Hat 6.2 distribution mentioned in David Wheeler's original paper. Could this help explain Perl's relatively low position in the SourceForge.net language numbers?"

10 of 277 comments

  1. Re:Perl coders make $135k/year? by daperdan · · Score: 2, Informative

    I don't know any perl coders who make $135 a year, let alone $135,000!

    Just so you know:
    Amazon.com hires Perl programmers at a pretty good rate. Perl continues to be one of the top languages on the net. Your statement is pretty ridiculous. Check jobs.perl.org if you think people can't make money writing Perl. Your statement is quite ignorant.

  2. Re:Mining CPan by Waffle+Iron · · Score: 3, Informative
    Well, what I'd like to see first would be a Python equivalent to CPAN existing in the first place.

    While it's not nearly as big as CPAN, I often find Python code I need in the Vaults of Parnassus

  3. Re:Perl coders make $135k/year? by Waffle+Iron · · Score: 2, Informative

    It may cost the employer $135K, but that's not the programmer's take-home pay. One rule of thumb I've seen is to multiply salary by ~2 to get the employer's total costs including equipment, office costs, taxes, etc. That would imply a salary of around $67K.

  4. Re:Perl coders make $135k/year? by Minwee · · Score: 5, Informative

    On average, salary is only half of what a company pays for an employee. If you count benefits, office space, training, administration and all of the other costs involved, that $135k works out to more like a $67,000 salary.

    A junior programmer working in Manhattan makes about $60,000 a year according to a recent salary survey, going up to $90,000 for a senior guru. Based on those numbers I don't see anything wrong with the $135k/year figure.

    Coders may not _make_ $135,000, but they do _cost_ that much to employ.
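
    The rule of thumb in the comments above is plain arithmetic. A minimal sketch, assuming the rough "salary is about half of total cost" multiplier quoted in the thread (the function name and the 2.0 default are illustrative, not measured values):

```python
def salary_from_employer_cost(total_cost, overhead_multiplier=2.0):
    """Estimate salary from an employer's fully loaded cost.

    overhead_multiplier is the assumed ratio of total cost to salary
    (2.0 is the thread's rule of thumb, not a measured figure).
    """
    if overhead_multiplier <= 0:
        raise ValueError("overhead_multiplier must be positive")
    return total_cost / overhead_multiplier


if __name__ == "__main__":
    # $135k total cost at the assumed 2x multiplier
    print(salary_from_employer_cost(135_000))  # prints 67500.0
```

    A different multiplier gives a different salary, so the $67K figure is only as good as the "about 2x" assumption.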

  5. Remember by ajs318 · · Score: 2, Informative

    I think one of the reasons why many of the things people do in Perl don't end up becoming SourceForge projects is because they're specific to a particular environment -- my company does pretty much everything {that others might do on Windows desktops} using in-house-written Perl scripts accessed through a web browser; but they really aren't general-purpose enough to warrant releasing to the world at large. For instance, we need to store the Ordnance Survey grid references of our customers -- but not everyone will need that functionality. Perl itself provides a kind of "generality-of-purpose abstraction layer"; there's not much sense in writing a program that can handle fifty squillion different data formats if you're only ever going to use one, especially given that processor power and disk space are so cheap nowadays. I also use Perl for jobs that could be done using bash or awk or sed, but Perl is just so handy; and if I need to add one more feature, I know I can. I'll also use perl -e 'print "something\n"' in an Xterm as a calculator {one day I'll even define a key map that puts the sequence on a function key}.

    Alternatively, Perl -- thanks to all those wonderful library bindings -- might well be used for an initial "feasibility study", say to develop and test the most important function(s) that will end up forming the core of a project; and, once the proof-of-concept is there, the whole thing is then rewritten "from the ground up" in something like C or C++ {which has bindings for the dead same libraries anyway, but feels more "proper" because it's compiled rather than interpreted}.

    --
    Je fume. Tu fumes. Nous fûmes!
  6. Re:Perl coders make $135k/year? by CPlusPlusOwnsYou · · Score: 2, Informative

    I don't know any perl coders who make $135 a year, let alone $135,000!

    I used to work at scotia bank for my high school co-op program and I was making $1k/month writing perl scripts to make graphs (using gnuplot) from webserver statistics. Was a sweet job for a 17 year old =)

    --
    "Software is like sex: it's better when it's free."
  7. Duplicate purpose modules and waste everywhere by Offwhite98 · · Score: 2, Informative

    In my experience with CPAN I have found it follows the Larry Wall concept that there are many ways to do the same thing. For starters, there are several modules which can communicate with a POP3 server. There are many XML parsers and many means of talking to a MySQL database. Unfortunately I would not say each solution is feature complete or even good quality. It is great that it has built-in Pod Doc, but the fact remains that it can be quite difficult to get some things done.

    I was able to whip together a webmail client which fetches mail from a POP3 server and parses the MIME types to display content, using several Perl modules, which was a pretty amazing feat given the small amount of code I wrote. But as I wrote it I had to come up with many workarounds for incomplete features in the CPAN modules. I also found that some modules were object oriented and some were not.

    So in the end I am finding things like the Java Foundation Classes or the .NET 1.1 profile implemented by Mono to be much more appealing. While there may be fewer means of connecting to a POP3 server, there is a good chance the one that is there will work well enough.

    But I am still curious how the Ruby folks are doing. They have been committed to object-oriented programming and may be able to produce higher quality solutions. Anyone doing Ruby here?

    --
    Brennan Stehling - http://brennan.offwhite.net/blog/
  8. Re:Perl still sucks? Really!? by fprog · · Score: 1, Informative

    Clean syntax...
    You can write some pretty clean syntax in perl just:

    #!/usr/local/bin/perl -Tw

    use warnings;
    use strict;
    use diagnostics;
    use vars qw{ ... };

    main();
    exit;

    # Your perl code.

    1;

    portable libraries?!
    What the heck are you smoking dude? I want some!
    It works on more platforms than any other language,
    including C, because it wraps libc platform weirdness into a "you don't have to know or care" equivalent.
    Think about EBCDIC, incomplete , endianness, file systems
    that don't have all the Unix attributes.

    A decent GUI library?
    There's a Perl/Tk. okay it sux.
    There's a wrapper for GTK and Windows API.
    okay it sux too.

    There's an HTML API, where you can write
    your entire program in HTML/JavaScript/Perl,
    and just install some Apache, with mod_xmlrpc
    and mod_perl thing that runs whatever you want locally on the machine. --> "good portable compromise".

    You could also use something like C++ Builder
    and embed your perl program within your C/C++ application.

  9. Other studies: Red Hat Linux 7.1, Debian 2.2 by dwheeler · · Score: 2, Informative
    If you find this interesting, you might also want to take a look at my updated paper More than a Gigabuck: Estimating GNU/Linux's Size, which examines Red Hat Linux 7.1. The "Gigabuck" paper shows that:
    1. It would cost over $1 billion (a Gigabuck) to develop this Linux distribution by conventional proprietary means in the U.S. (in year 2000 U.S. dollars).
    2. It includes over 30 million physical source lines of code (SLOC).
    3. It would have required about 8,000 person-years of development time, as determined using the widely-used basic COCOMO model.
    4. Red Hat Linux 7.1 represents over a 60% increase in size, effort, and traditional development costs over Red Hat Linux 6.2 (which was released about one year earlier).

    Another related paper (that I didn't write) is Counting Potatoes: The size of Debian 2.2. They found that Debian 2.2 includes more than 55 million physical SLOC, and would have cost nearly $1.9 billion USD using over 14,000 person-years to develop using traditional proprietary techniques.

    So what's the purpose of all these studies? Insight. There are all sorts of limitations in any measure, including any source lines of code (SLOC) measure. But, in spite of those limitations, there are things you can learn. Using tools (like SLOC counting tools) to measure software can help you understand things about the software, as long as you understand the limitations of the measure.

    In particular, many studies have shown that SLOC is very strongly related to effort (so much so that you can even use equations to predict it). If you want to determine effort in CPAN, you can't just go ask people; few open source software / Free Software (OSS/FS) developers record exactly how much effort they invested. So, these kinds of measures are really helpful for estimating how much effort went into developing the software. Obviously, not all effort is equal (a genius can turn a hard problem into an easy one). And not all code is good, or even useful. But if you want to understand and measure effort, then these measures do have a value. In particular, these results have shown that OSS/FS can scale up to large projects requiring large amounts of effort.
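
    The "equations" mentioned above can be sketched in a few lines. This is a minimal illustration of basic COCOMO in organic mode (effort in person-months = 2.4 * KSLOC^1.05), not the sloccount tool itself. The salary and overhead values are assumptions chosen for illustration; their product is roughly $135k per person-year, in line with the figure debated elsewhere in this thread. All function names are hypothetical.

```python
# Basic COCOMO, organic mode: effort (person-months) = A * KSLOC ** B
COCOMO_A = 2.4
COCOMO_B = 1.05

# Assumed cost parameters, for illustration only
ASSUMED_ANNUAL_SALARY = 56_286  # USD per year (assumption)
ASSUMED_OVERHEAD = 2.4          # total cost / salary (assumption)


def effort_person_months(sloc):
    """Estimate development effort for one package from its physical SLOC."""
    ksloc = sloc / 1000.0
    return COCOMO_A * ksloc ** COCOMO_B


def estimated_cost(sloc, salary=ASSUMED_ANNUAL_SALARY, overhead=ASSUMED_OVERHEAD):
    """Convert the effort estimate into dollars using salary and overhead."""
    person_years = effort_person_months(sloc) / 12.0
    return person_years * salary * overhead


def total_cost(package_slocs):
    """Sum per-package estimates, treating each package as its own project."""
    return sum(estimated_cost(sloc) for sloc in package_slocs)


if __name__ == "__main__":
    # Two hypothetical packages of 10,000 SLOC each
    print(round(total_cost([10_000, 10_000])))
    # The same code treated as one 20,000-SLOC project costs more,
    # because the exponent is greater than 1.
    print(round(estimated_cost(20_000)))
```

    Because the exponent is above 1, estimating a collection package by package yields a lower total than treating it as one giant project, so the choice of granularity matters when comparing figures.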

    --
    - David A. Wheeler (see my Secure Programming HOWTO)
  10. Re:Mining CPan by ajs · · Score: 3, Informative
    I checked out that site.

    I only looked at a handful of the links. It's sort of a Yahoo! (the original indexer, not today's search engine-cum-kitchen sink) for Python code, which is ok, but check out how one uses CPAN in the real world:
    # perl -MCPAN -e shell
    cpan> i /SpamAssassin/
    Distribution F/FE/FELICITY/Mail-SpamAssassin-2.63.tar.gz
    Module Mail::SpamAssassin (F/FE/FELICITY/Mail-SpamAssassin-2.63.tar.gz)
    cpan> install Mail::SpamAssassin
    ---- Unsatisfied dependencies detected during [F/FE/FELICITY/Mail-SpamAssassin-2.63.tar.gz] -----
    Filter::Simple
    Shall I follow them and prepend them to the queue
    of modules we are processing right now? [yes]
    I'm sure you can see how this makes CPAN far more useful for building a large repository of useful Perl modules. How, in Python, can you build several layers of libraries that depend on each other without this kind of repository of dependency information? How does a user "come into the know" about these factors?

    Of course, that ignores the fact that CPAN modules all come with regression testing and online documentation (installed in the system "man" tree) as well.
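
    The "prepend them to the queue" prompt in the session above boils down to ordering modules so that each one is installed after the modules it depends on. A minimal sketch of that idea follows. It is not how CPAN.pm is actually implemented; the function name and the shape of the dependency map are assumptions, and the only dependency listed is the one shown in the session.

```python
def install_order(target, dependencies):
    """Return modules in an order where each appears after its dependencies.

    dependencies maps a module name to the list of modules it requires
    (a hypothetical stand-in for the repository's dependency metadata).
    """
    order = []
    state = {}  # module name -> "visiting" or "done"

    def visit(name):
        if state.get(name) == "done":
            return
        if state.get(name) == "visiting":
            raise ValueError(f"circular dependency involving {name}")
        state[name] = "visiting"
        # Install everything this module needs before the module itself
        for dep in dependencies.get(name, []):
            visit(dep)
        state[name] = "done"
        order.append(name)

    visit(target)
    return order


if __name__ == "__main__":
    deps = {"Mail::SpamAssassin": ["Filter::Simple"]}
    print(install_order("Mail::SpamAssassin", deps))
    # prints ['Filter::Simple', 'Mail::SpamAssassin']
```

    The point of the original comment is that this only works when the repository records the dependency map somewhere the installer can read it.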