CPAN: $677 Million of Perl
Adam K writes "It had to happen eventually. CPAN has finally gotten the sloccount treatment, and the results are interesting. At 15.4 million lines of code, CPAN is starting to approach the size of the entire Redhat 6.2 distribution mentioned in David Wheeler's original paper. Could this help explain perl's relatively low position in the SourceForge.net language numbers?"
Wow, using sloccount on the full POPFile source shows that developing it would have cost around $500K in a regular software company. That seems about right given the length of time we've been working on it and the number of people involved. Cool tool.
:-)
Now if only I could push the donations up above $5,000
John.
If you take out the punctuation, though, it's down to twelve lines of code.
I really hate signatures, but go to my website.
Redhat 6.2? Haven't heard that in ages.
Low position? For a language that's not supposed to be a full-blown low-level language like C/C++, perl is pretty damn well represented - over 1/3 the number of projects compared to C isn't that bad. If you have just one file, something like sourceforge usually isn't needed.
If you have to ask, you'll never know.
Did you even attempt to click the underlined word 'sloccount'? If not, do it now and read the first line of the first paragraph.
[Ob. reference: I love Perl and use it all of the time, but a programmer I met years ago said it was the only language where the source code reminded him of line noise]
SourceHosting.net, LLC
Ready. Set. Code.
http://www.sourcehosting.net/
Bahhh, I know people richer than that!
Now compute the economic gain of using Perl vs. any other language:
Perl vs. Nothing : $677M
Perl vs. C : $1.25B
Perl vs. C# : $2.77B
Perl vs. Hand Optimized Assembly on Honeywell DPS-3E running GCOS operating system: Priceless
Unitarian Church: Freethinkers Congregate!
Whatever one's favourite language might be, a project to mine CPAN and port useful modules to Python, Java or C# would be interesting.... Perl syntax reads as a little terse to many non-Perl devs.
Here, I'll repost the link from the article you never read:
sloccount
Pfft 15.4 Million lines?
/usr/bin/perl ; ;
;
I could write CPAN in a one liner!
#!
use warnings
use strict
print "CPAN:
This
Perl is a cross-platform tool that existed long before Linux did. Why do such things get posted under Linux ? May as well post it under BSD it would be doing the same thing. This happened with the recent Bash 3.0 topic as well. Why do people associate things with Linux just because it is open source ? (Unless it is BSD open source).
What is more important, lines of code or lines of quality code? People are always so impressed with sheer numbers. Quality is important.
A similar issue is format and structure. You might do something almost right, but it could be better. For example, you might include dates on your web pages but is the format good for users? It can probably be better!
Numbers are only impressive when they are placed in the context of their overall utility. Of course, regarding code, measuring "overall utility" is no joke. Can you really tell that the code from Programmer A is better than the code from Programmer B?
In any event, keep your eyes open. Don't let "15.4 million lines of code" amaze you just because the number is big. Let it amaze you because of what it means, and what those lines of code do for users.
How to Download YouTube Videos
It's relatively low because that list is in alphabetical order!
Embarrassing, I know. Maybe I can blame it on the silly new topics and color schemes that are so close to each other :)
/. response efficiency warning!
To conserve server resources in the future please update your response "Did you even attempt to click the underlined word 'sloccount'? If not, do it now and read the first line of the first paragraph." with the more efficient "RTFA" or "RTFA you stupid noob" if you are not into the whole brevity thing.
D6 63 0D 70 89 81 BB 8E 7B 7C 5F 5D 54 EA AB 73
Although their "Book TV" show is usually as dense as Perl, and often profiles books that are write-only.
taken! (by Davidleeroth) Thanks Bingo Foo!
$677 million, 5,000 person-years = ~$135,000/year/person.
I don't know any perl coders who make $135 a year, let alone $135,000!
(sorry, but it's true)
He who fights with monsters should take care that he does not become a monster in the process. --Nietzsche
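The per-person figure above can be checked with a few lines of arithmetic. This is a minimal sketch: the $677 million and 5,000 person-years come from the thread, while the salary and overhead constants are assumed to be SLOCCount's documented defaults and are not stated anywhere in the discussion.

```python
# Figures quoted in the thread.
TOTAL_COST_USD = 677_000_000
EFFORT_PERSON_YEARS = 5_000

# Assumptions: believed to be SLOCCount's default salary and overhead
# multiplier; neither number appears in the thread itself.
AVERAGE_SALARY_USD = 56_286
OVERHEAD_FACTOR = 2.4


def cost_per_person_year(total_cost, person_years):
    """Cost that the headline figure implies for one person-year."""
    return total_cost / person_years


def loaded_salary(salary, overhead):
    """Salary plus overhead (office space, equipment, management, ...)."""
    return salary * overhead


implied = cost_per_person_year(TOTAL_COST_USD, EFFORT_PERSON_YEARS)
loaded = loaded_salary(AVERAGE_SALARY_USD, OVERHEAD_FACTOR)

print(f"implied by headline: ${implied:,.0f} per person-year")
print(f"salary x overhead:   ${loaded:,.0f} per person-year")
```

If those defaults are right, the $135,000 is a fully loaded cost to the employer rather than take-home pay, which would explain why nobody recognizes it as a Perl coder's salary.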
What the hell is this post talking about? CPAN? SLOCOUNT? Red Hate 6.2? I honestly have no clue. Is lines of code measured in dollars or lines? If lines, why is there a dollar amount in the headline? If dollars, why is there a lines count in the article?
Comment of the year
So perl is behind only 4 others. Given that much Perl project work probably ends up in CPAN instead of sourceforge, this is actually pretty high. Did the poster mean he'd expect higher without CPAN?
"that's not encryption - it's a new perl script that I'm working on..." - from some Matrix parody
To conserve server resources in the future... STFU?
Might be underlined on your screen, honey, it's not on mine.
Also, from the linked article:
And here's another: CPAN includes perl itself - which is probably a *lot* of lines of C code.
"that's not encryption - it's a new perl script that I'm working on..." - from some Matrix parody
if you're doing the work for a recognized non-profit organization, then you might want to talk to an accountant, and see if it's worthwhile to do. There are probably implications where you work for non-profit, and it produces code that you can re-use in other for-profit applications.
But I'm not a tax lawyer or an accountant, so if you're really interested in writing off some of your work, it'd be worth looking into.
Build it, and they will come^Hplain.
Lines of debugged code: 0
Patently, bad measurements are worse than no measurements.
"Measurement drives performance." If you are measuring the wrong thing or using misleading measurements, you will do the wrong thing.
Anyone who thinks they can devise a meaningful measurement of the quality of Beethoven's Fifth Symphony versus Brahms's First... or which tastes better, vanilla ice cream or fresh pineapple... or who is a better ballplayer, Willie Mays or Sammy Sosa... needs to have their head measured, preferably with a standardized test.
In order to tell whether measurement in some way is superior to not measuring it at all, you need a way to measure the quality of the measurement. But to do that, you need...
"How to Do Nothing," kids activities, back in print!
Low position? For a language that's not supposed to be a full-blown low-level language like C/C++, perl is pretty damn well represented - over 1/3 the number of projects compared to C isn't that bad.
Exactly. Perl is fifth on the list of languages. That's nothing to be ashamed of.
Perl still sucks, yo. When is it going to get a clean syntax, portable libraries, or a halfway decent GUI library?
And don't get me started on OOPerl. The world does not need bolted-on objects.
PERL is nice in that it has a lot of prepackaged modules that provide a lot of functionality. But when you distribute code that uses these modules, the end user must install them. This is a big pain in the rear for the average user, which is why I believe that PERL is a bad choice for programs intended for the end user.
SourceForge is a great tool with meaningful projects there, but you kind of have to take the info you get from looking at overall numbers there with a grain of salt.
Nearly twice as many projects as Python, almost three times as many as VB, and more than a third as many as C or C++ is low? For a scripting language? I think these numbers prove the versatility and popularity of Perl.
Hmm, one sloc is not worth as many as the next sloc.
The idea that some perl code hacked together by retrained webmonkeys or freshmen is worth as much as kernel hacker code is quite sickening.
This type of measurement is simplistic and poor: it fails to take into account the functional density of the languages. For example, in one line of perl, you can do what takes ten lines of C/C++ -- especially considering that this is one of perl's raisons d'être.
The measurements really should be multi-dimensional:
- lines of code
- number of projects
- number of files
- number of modules (classes, namespaces, etc)
- statement density (e.g. per-language primitives: while, for, @, ->,
I read the whole thing including the comments as CSPAN and was like wtf?
b) Maybe the sloc counter didn't recognize Perl comments, so it overcounted lines. Wait, Perl programs never have comments.
c) Does this make it "a Perl of great price"?
Have you read my blog lately?
I think one of the reasons why many of the things people do in Perl don't end up becoming SourceForge projects is because they're specific to a particular environment -- my company does pretty much everything {that others might do on Windows desktops} using in-house-written Perl scripts accessed through a web browser; but they really aren't general-purpose enough to warrant releasing to the world at large. For instance, we need to store the Ordnance Survey grid references of our customers -- but not everyone will need that functionality. Perl itself provides a kind of "generality-of-purpose abstraction layer"; there's not much sense in writing a program that can handle fifty squillion different data formats if you're only ever going to use one, especially given that processor power and disk space are so cheap nowadays. I also use Perl for jobs that could be done using bash or awk or sed, but Perl is just so handy; and if I need to add one more feature, I know I can. I'll also use perl -e 'print "something\n"' in an Xterm as a calculator {one day I'll even define a key map that puts the sequence on a function key}.
Alternatively, Perl -- thanks to all those wonderful library bindings -- might well be used for an initial "feasibility study", say to develop and test the most important function(s) that will end up forming the core of a project; and, once the proof-of-concept is there, the whole thing is then rewritten "from the ground up" in something like C or C++ {which has bindings for the dead same libraries anyway, but feels more "proper" because it's compiled rather than interpreted}.
Je fume. Tu fumes. Nous fûmes!
don't worry, you still get modded up
I've told SLOCCount all of CPAN is one project,
isn't the sum equal to all parts? I know it is more difficult to do big projects. (all those middle-managers)
generated code
Is code, generated more efficiently.
code downloadable from CPAN that wasn't written for CPAN,
there is probably code in Red Hat that was never written for Red Hat. That is the trouble with open source.
numbers of lines of source code are meaningless.
No no, they give you serious bragging rights!
So things are not that bad. Just the duplicates....
So things aren't quite bad. Just the duplicates....
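On the "isn't the sum equal to all parts?" question above: under a COCOMO-style model the answer is no, because effort grows faster than linearly with size. The sketch below assumes the basic COCOMO organic-mode coefficients (2.4 and 1.05), which are believed to be SLOCCount's defaults; the module count is a made-up number used only to show the effect.

```python
def cocomo_effort_person_months(sloc, factor=2.4, exponent=1.05):
    """Basic COCOMO, organic mode: effort = factor * KSLOC ** exponent."""
    return factor * (sloc / 1000.0) ** exponent


TOTAL_SLOC = 15_400_000  # figure quoted in the story
N_MODULES = 5_000        # hypothetical module count, for illustration only

# Estimate 1: treat everything as a single huge project.
as_one_project = cocomo_effort_person_months(TOTAL_SLOC)

# Estimate 2: treat it as N equally sized independent projects.
as_many_projects = N_MODULES * cocomo_effort_person_months(TOTAL_SLOC / N_MODULES)

ratio = as_one_project / as_many_projects
print(f"one project:   {as_one_project:,.0f} person-months")
print(f"many projects: {as_many_projects:,.0f} person-months")
print(f"ratio:         {ratio:.2f}")
```

With an exponent above 1, the single-project estimate is always the larger of the two, so telling the tool that all of CPAN is one project pushes the dollar figure up.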
a programmer I met years ago said it was the only language where the source code reminded him of line noise [emphasis added]
He'd obviously never seen APL -- one of the few languages terser and more cryptic than Perl, and (AFAIK) the only one to require its own font.
-- Alastair
Sure, Slashdotters hate Flash, but why aren't there any ActionScript projects on SourceForge, while there are 1822 JavaScript projects?
--
make install -not war
In my experience with CPAN I have found it follows the Larry Wall concept that there are many ways to do the same thing. For starters, there are several modules which can communicate with a POP3 server. There are many XML parsers and many means of talking to a MySQL database. Unfortunately I would not say each solution is feature complete or even good quality. It is great that it has built-in POD docs, but the fact remains that it can be quite difficult to get some things done.
I was able to whip together a webmail client which fetches mail from a POP3 server and parses the MIME types to display content using several Perl modules, which was a pretty amazing feat given the small amount of code I wrote. But as I wrote it I had to come up with many workarounds for incomplete features in the CPAN modules. I also found that some modules were object oriented and some were not.
So in the end I am finding things like the Java Foundation Classes or the .NET 1.1 profile implemented by Mono to be much more appealing. While there may be fewer means of connecting to a POP3 server, there is a good chance the one that is there will work well enough.
But I am still curious how the Ruby folks are doing. They have been committed to object-oriented programming and may be able to produce higher quality solutions. Anyone doing Ruby here?
Brennan Stehling - http://brennan.offwhite.net/blog/
The fact that Java and C# can be more functionally dense than perl while still being about 100000x more maintainable shows why they are the industry standard languages of today (or at least Java is; C# will get in the game in 5 years).
But, how do you know that the way you're measuring it is better than not measuring at all? There are lots of ways to measure things that are worse than no measurement at all, because they reward the wrong activity.
The canonical examples here are paying programmers per bug fixed, or paying testers per bug detected. Either one of these alone is bad - together they allow programmers and testers to print money for themselves.
In theory, nothing is unmeasurable. In practice, some things are so hard to measure that you might as well not even try.
To a Lisp hacker, XML is S-expressions in drag.
Could it mean that folks who write Perl are more likely to submit their work to CPAN?
How does the "instant gratification" of using an interpreted language factor into all this? I know one of the attractions of Perl for me is that I don't have to compile it to see if it works. I just run it.
"Obviously, I'm not an IBM computer any more than I'm an ashtray" (Bob Dylan)
I don't understand either. I do all my Perl programming for Windows.
If anyone's interested, I've developed a hardware/software audit script entirely in Perl for Win32 (binaries included) that stores data in a centralized MySQL server. Visit here.
-- Political fascism requires a Fuhrer.
After the first detailed analysis of the large perl leak onto the net experts reckon CPAN could cost $677M to clean up 8)
If it's the cost of writing the code, then it should be a good approximation of the cost of writing it when perl 6 comes out
Clean syntax...
You can write some pretty clean syntax in perl just:
#!/usr/local/bin/perl -Tw
use warnings;
use strict;
use diagnostics;
use vars qw{ ... };
main();
exit;
# Your perl code.
1;
portable libraries?!
What the heck are you smoking dude? I want some!
It works on more platforms than any other language,
including C because it wraps libc platform weirdness into "you don't have to know or care" equivalent.
Think about EBCDIC, incomplete , endianness, file systems
that don't have all the Unix attributes.
A decent GUI library?
There's a Perl/Tk. okay it sux.
There's a wrapper for GTK and Windows API.
okay it sux too.
There's an HTML API, where you can write
your entire program in HTML/JavaScript/Perl,
and just install some Apache, with mod_xmlrpc
and mod_perl thing that runs whatever you want locally on the machine. --> "good portable compromise".
You could also use something like C++ Builder
and embed your perl program within your C/C++ application.
http://perllinux.sourceforge.net
Too many arguments for open at break_filelist line 685, near "$filename) "
685 open(FH, "-|", "openssl", "dgst", "-md5", $filename) or return undef;
"It works on more platforms than any other language,
including C"
No.
I don't know of any perl SourceForge projects that have only one file.
Every one that I've started or worked on has 10s of library (package) files, templates, etc.
There are strange things done, under the midnight sun by the men who moil for gold - Robert Service
APL is the language that went out of the ASCII set to be able to be compressed farther. Here is a wiki article, Why APL and an extended code sample (slow .gif).
Architectural note on Perl 6.
Perl internally uses a virtual machine (OK, very high level virtual machine) already. But one that is very tightly tied to the interpreter. And one that (unlike JVM or .NET) is intended for highly dynamic languages.
For Perl 6 they are splitting that virtual machine out into a project called Parrot. They are then writing multiple front ends for different languages that will run (and hopefully cooperate fairly well) on the same virtual machine.
One of those front ends (Ponie) is a re-implementation of Perl 5. Most of CPAN should run just fine on Ponie unchanged, and so most of CPAN will be immediately available to Perl 6. Which will be smart enough to autodetect Perl 5 and fallback to Ponie.
There will be bugs in the compatibility, of course. The only truly incompatible difference is that Perl 5 has reliable destruction timing (through reference counting) while Parrot has true garbage collection. (So you cannot rely on DESTROY for scoping effects.) In fact they even have a plan to enable a backwards compatible XS interface for legacy C code.
Given that Ponie is looking like it may run twice as fast as Perl 5, even people who have no interest in Perl 6 will have reason to try to track down the unavoidable problems and make stuff on CPAN work on Ponie.
Ever looked at CPAN?
Look at this: XHTML parser using K programming language :)
Perl is really clean language
Create RSS feed from any web page http://Page2RSS.com/
...then how'd they become the Ruling Class? You know, not every rich person is a slutty blonde bimbo heiress like Paris Hilton (someone who I'm sure would struggle to make up the bed in just one room of one of daddy's hotels). A good deal of the wealthy class is self made (particularly in North America)--perhaps your view is coloured by the more class-oriented system of the UK, where there is a fair bit more wealth through inheritance.
Jobs and Wozniak founded Apple and Jobs still runs it (hell of a lot bigger than a mere electronics store chain). I'm sure both of them would be more than capable of wiring up a 13A plug seeing as they were capable of designing, building and programming a computer (and devices allowing them to call Europe for free). And while Bill Gates came from a fairly affluent family, he was hardly a billionaire and managed to survive the early Micro-soft days in dumpy New Mexico digs and do low-level assembly programming.
And yes, I'm sure many of the owners of GM and Ford know how to change a tyre--seeing as they are publicly held companies with a large number of shareholders. I'm willing to bet that the executives/management could do it (Lee Iacocca comes across as a guy who is down-to-earth enough that he could).
My sister is the Canadian president of a multi-national corporation and not only can she peel a potato, she peeled many of them making dinner for her two kids every night as a stay-at-home mother when she was in her early twenties.
Fact is, it is no longer the 19th century, democracy is widespread and the "ruling class" is no longer so dominated by inheritance as it once was. This Marxist theory of the proletariat rising up en masse against a ruling class dependent on workers' output just doesn't wash. Today, those of the working class with the capacity and drive to step up are able to rise one-by-one. And once you are part of the "ruling class" it is human nature to defend it regardless of others' actions--particularly when your wealth is earned.
What about the fact you can do everything that you can do in Perl with PHP only quicker and easier?
I'd daresay that Gilb's law's meaning is nearer to something more like "anything that can be quantified"... I mean, there are no units in which you can quantify the symphonies' quality or other subjective affairs. But if the thing you wanted to measure in the beginning, is measurable in certain units, then surely any approximation is better than no idea at all...
... from the forgotten corner in europe
Congratulations, you have just reaffirmed the grandparent's point.
...how much of that is devoted to MP3 taggers and MVC frameworks... :-)
[Ob. reference: I love Perl and use it all of the time, but a programmer I met years ago said it was the only language where the source code reminded him of line noise]
That's an amazingly insightful and clever Perl joke. In 1992.
I wonder what percentage of the 15.4 million lines of code are tests. CPAN emphasizes tests in every module submitted.
In some cases the size of the test scripts is larger than the core code of the module.
Another related paper (that I didn't write) is Counting Potatoes: The size of Debian 2.2. They found that Debian 2.2 includes more than 55 million physical SLOC, and would have cost nearly $1.9 billion USD using over 14,000 person-years to develop using traditional proprietary techniques.
So what's the purpose of all these studies? Insight. There are all sorts of limitations in any measure, including any source lines of code (SLOC) measure. But, in spite of those limitations, there are things you can learn. Using tools (like SLOC counting tools) to measure software can help you understand things about the software, as long as you understand the limitations of the measure.
In particular, many studies have shown that SLOC is very strongly related to effort (so much so that you can even use equations to predict it). If you want to determine effort in CPAN, you can't just go ask people; few open source software / Free Software (OSS/FS) developers record exactly how much effort they invested. So, these kinds of measures are really helpful for estimating how much effort went into developing the software. Obviously, not all effort is equal (a genius can turn a hard problem into an easy one). And not all code is good, or even useful. But if you want to understand and measure effort, then these measures do have a value. In particular, these results have shown that OSS/FS can scale up to large projects requiring large amounts of effort.
- David A. Wheeler (see my Secure Programming HOWTO)
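The "equations to predict it" mentioned above can be illustrated with the basic COCOMO model. This is a rough sketch, assuming the organic-mode coefficients (2.4 and 1.05) that SLOCCount is believed to use by default; it is not SLOCCount's own code, and the function name is made up for the example.

```python
def basic_cocomo_effort(sloc, factor=2.4, exponent=1.05):
    """Return (person_months, person_years) for a given physical SLOC count.

    Basic COCOMO, organic mode: person-months = factor * KSLOC ** exponent.
    The default coefficients are an assumption, not taken from the thread.
    """
    ksloc = sloc / 1000.0
    person_months = factor * ksloc ** exponent
    return person_months, person_months / 12.0


# 15.4 million lines is the CPAN figure quoted in the story.
months, years = basic_cocomo_effort(15_400_000)
print(f"{months:,.0f} person-months, about {years:,.0f} person-years")
```

Under these assumptions the result lands close to the 5,000 person-years quoted elsewhere in the thread. Cost is then just effort multiplied by a loaded annual salary, so the dollar figure inherits every uncertainty in the effort estimate.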
If your argument is, "this measure doesn't measure how many lines of code it would have taken in C", or "how much effort would it have taken if it was written in C", well, that's true. So what? That wasn't what was being measured. If that's what you wanted, there are well-known conversion factors where you can estimate the SLOC in C, and convert it to effort. But those conversion factors are estimates with a LOT of slop, and the published conversion factors have almost no published data to justify them, nor do they identify the ranges and standard deviations and other caveats. But if that's what you wanted, I'm not sure if there's a better way to do it.
- David A. Wheeler (see my Secure Programming HOWTO)
SLOCCount measures "physical SLOC", and thus ignores blank lines and comment-only lines (including Perl PODs). It's not the same as "wc -l". Go read its documentation if you want to understand exactly what it does; it has a lengthy description of exactly what it measures, and why, along with references to the (substantial) research literature behind such tools.
- David A. Wheeler (see my Secure Programming HOWTO)
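To make "physical SLOC" concrete, here is a deliberately simplified counter for Perl source. It is a sketch of the idea only and makes no claim to match SLOCCount's real rules: it skips blank lines, comment-only lines (including the shebang, which is a simplification) and POD blocks, and it ignores harder cases such as here-documents and `__END__`. The function name and sample program are invented for illustration.

```python
def count_perl_sloc(source: str) -> int:
    """Count lines that are neither blank, comment-only, nor inside POD."""
    count = 0
    in_pod = False
    for line in source.splitlines():
        if in_pod:
            # A POD block runs until a line that starts with "=cut".
            if line.startswith("=cut"):
                in_pod = False
            continue
        # POD starts with "=" plus a letter at the start of a line.
        if len(line) > 1 and line[0] == "=" and line[1].isalpha():
            in_pod = True
            continue
        stripped = line.strip()
        if not stripped or stripped.startswith("#"):
            continue  # blank or comment-only line
        count += 1
    return count


SAMPLE = """#!/usr/bin/perl
use strict;
use warnings;

# say hello
print "hello";  # a trailing comment still counts as code

=pod

This text is documentation, not code.

=cut

exit 0;
"""

print(count_perl_sloc(SAMPLE))
```

Running `wc -l` on the same sample would report every line, which is the difference Wheeler is pointing at.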
"Whatever one's favourite language might be, a project to mine CPAN and port useful modules to Python, Java or C# would be interesting.... Perl syntax reads as a little terse to many non-Perl devs."
How about Smalltalk? A language that lends itself well to browsing and integration of pooled code. In fact I see the line blurred from outside to inside. The browser 'browses' both kinds of code, alike. Pulling in only what's needed for execution. Collaboration made easier.
I think that just shows how inefficient businesses are. You basically have two points of inefficiency. All the costs you mentioned, AND not giving a programmer the structure to be as efficient as possible.
"At 15.4 million lines of code,"
But how many libraries of congress is CPAN?
My point was that nearly every perl project that I have been associated with, including the ones on sourceforge, are large projects with "more than one file."
Single-file perl projects are hardly a commonality and that is not the role the language seems to fill.
There are strange things done, under the midnight sun by the men who moil for gold - Robert Service
"I use Perl every day and it has been one of the easiest languages to learn and use that I have found.", "The syntax actually is very clean and rather simple and easy to use. In my extensive use of Perl I have found the syntax to be very clean, clear, and easy to understand."
Another quality commercial brought to you by PEP (People for the Ethical treatment of Perl)
-- Bowery's Razor; a corollary (applicable to programmers and other state crafters) of Ockham's Razor.
Seastead this.
Well, kind of. Technically, TECO wasn't a programming language but a text editor (and corrector). Emacs wasn't so much a program written in it but a collection of editing macros for it.
I'll agree, though, that it looks like line noise.
-- Alastair
About 5th out of 50? Is that a low position?
So either the article author is being sarcastic or they think Perl lines should outnumber Java or C++. Maybe they should outnumber PHP, but that would only boost the ranking by 1 (though double # of lines).
What's a full-blown low-level language?
sloccount is bullshit, I ran it on my development tree for work completed in the past 1.5 years.
It seems to think that I alone have done 41 years of development effort with an estimated cost of $5.6 million.
Woooooow! if only I got paid that much for the work I have done!
I explicitly stated that as time goes on inheritance of wealth becomes less important to determining success.
My sister was born to a father who worked shift work running the boilers in a meat processing facility at the time. He was not, isn't and probably won't ever be (as he is now retired) president or chairman of a multinational company. My sister worked her way up from selling door-to-door to supplement a fairly modest household income to where she is now twenty years later. She is not related in any way to the founder of the company, and daddy had nothing to do with her current position except to raise his kids well.
While it's sure a hell of a lot easier to be born into a position of wealth, there is NOTHING in the free world today that prevents a "commoner" from improving his lot in life except his or her own sense of limitation.
Yes. Anyway, the number of development projects using a language is not necessarily the measure of its usefulness or the extent to which it is used. JavaScript is everywhere in HTML but has few whole projects compared with other "languages."
The number of utility scripts and small applications in Perl must be astronomical (many of these of course are available on CPAN and don't need to be developed).
While well written, the author of the linked page completely failed to mention the /real/ date standard, ISO 8601. It is the most logical (descending order) and least confusing.
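One practical consequence of ISO 8601's descending order (year, then month, then day) is that plain string sorting matches chronological order. A small sketch, with arbitrary example dates:

```python
from datetime import date

# Arbitrary example dates, not taken from the thread.
dates = [date(2004, 9, 7), date(2003, 12, 25), date(2004, 1, 31)]

# date.isoformat() produces the ISO 8601 calendar form YYYY-MM-DD.
iso_strings = [d.isoformat() for d in dates]

# Sorting the strings gives the same order as sorting the dates themselves.
sorted_as_text = sorted(iso_strings)
sorted_as_dates = [d.isoformat() for d in sorted(dates)]

print(sorted_as_text)
print(sorted_as_text == sorted_as_dates)
```

Formats that put the day or month first lose this property, which is part of why they confuse both users and sort routines.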
You have a very strange screen. Is anything underlined on it?
Moderators:
please mod parent up as +5, Funny.
Thank you.
--
A.C.
P.S. Parent: please use <em> tag for emphasis next time. Thank you.
I love it! My favorite part is the fact that there are comments in the code. As if that would help.
Confucius say, "Find worm in apple - bad. Find half a worm - worse."
There's a second 's' missing, somewhere...
The post anonymously option you are [not] attempting to use is one that isn't available to your user.
Confucius say, "Find worm in apple - bad. Find half a worm - worse."