Statistics On Free Software projects
GenericBoy writes: "The first edition of The Orbiten Free Software Survey is out online. Some of the stats are number of authors and projects, the top 10 contributing authors, how many MB are in all of the free software projects put together (!) and a bunch more. " Now, as they themselves point out in their Scope and Method, the methodology is crude, and I don't think Orbiten could quite submit it to Nature yet or anything, but it's an interesting bunch of stats.
Is it fair to mention 'Gordon Matzigkeit' at all? He only appears in the list because his name appears in hundreds of acinclude.m4's. This nicely proves that the statistics are complete nonsense.
Well, if you recognize Gordon's name, you'll remember what project he is perhaps best known for: libtool. Now, packages that use libtool happen to include some rather long (autogenerated) files in them that have Gordon's name attached. So for every package that uses libtool, Gordon gets credited with about 8 thousand lines of code. What a sweet deal!
Rock & Troll: a form of music best played from a Beowulf cluster.
Troller Derby: a skating game in which everyone skates around screaming, "First Score" even if they are the 10th.
Cinnamon Trolls: tasty flavored grits poured down one's pants.
Troll Call: all participants stand in a line and appeal for Natalie Portman's nubile body.
On a Troll: when some loudmouth who cannot read an actual article does nothing but disparage slashdot submissions incessantly.
Con-Troll: a miscreant poster who just escaped prison.
Dave Troll: leader of the band called the Foobar Fighters.
Bridge Troll: offtopic poster interested in card games.
Pet-Troll: (1) impudent poster used as fuel in the UK; (2) a troll belonging to another, as a pet.
Trolley: conveyance used to transport numerous trolls in San Francisco.
Trollkin: the family of a troll.
Trollop: a female poster of ill repute.
Trollanthropy: the rare act of a wiseass poster giving someone or something its due.
Is it just me, or have these stats left out some fairly large projects such as Jakarta and Mozilla?
:)
Admittedly I didn't look through everything, but I don't see Jakarta mentioned under the apache author page, nor do I see mozilla under jwz or Netscape's author pages. Am I blind, or are they?
And if they did miss these two, (Mozilla alone is a somewhat massive sum of source code) what else are they leaving out?
"Lies, damn lies, and autogenerated reports." -- Peter Baylies, 5/9/00
Or, if you don't believe me, just remember that
"united states government as represented by the" is responsible for 305,338 lines of code, 200k in the Linux Kernel, 100k in OSKit, and 10% of the Linux Surfboard Driver. Go, US!
...and bow down and worship Gordon Matzigkeit. One day, every child in America will be able to spell his last name, and recognize him as the unsung hero of the free software revolution...
---
pb Reply or e-mail; don't vaguely moderate.
Man, go away. Posting the results here is just not right. I could see it being helpful if, say, the site was ./'ed, but it isn't.
Wooohoo! My server survived its first slashdotting. Without any particular preparation either (I didn't notice it had made slashdot till a friend told me), and while running all my nice eye-candy too. Kudos to apache...
Adrian.
I think "copyresponsibility" would be better than "copywrong". Who really cares, though?
Yep. I came to the same conclusion. The authors of the survey do a brute force analysis and count whatever name shows up.
So if your name shows up in some file that gets included in a lot of projects, like the C/C++ libraries, you will score very high. That is what put Ulrich Drepper at number 8.
On the other hand, I was not able to spot many of the hard-working folks from the BSD crowd, so the authors of the survey evidently did not scan through a FreeBSD, OpenBSD or NetBSD tree. Even giants like Donald E. Knuth (DEK) did not show up, so TeX was not included either.
What to think of it?
The basic idea is nice, the equivalent of an Open Source top ten. It could appeal to the same people who try to score high on distributed.net or SETI. (But those projects in particular had people show up who increased their scores by illegal methods.)
I do, however, like the idea of being able to look up, a few years from now, what stuff I worked on. But I guess this will need a much improved system.
My conclusion is that these guys had the right idea: the existing body of free code screams to be analyzed. So let's forget that they did it poorly, and let's try to improve things.
First, they should extend their input. An easy way is to scan the contents of the former Walnut Creek ftp server, as it covers a lot of free software. However, one would also need to add a lot of different servers: the major free systems, commercial stuff like Mozilla, and projects from science (there is a lot of free Fortran out there too!).
If anyone is interested in setting up a better attempt, please contact me.
Yep. The author credited is usually the person who wrote the first version of a particular file. This neglects the maintainer and the many people who might advance the state with their patches. All of them, plus web masters, documenters, release and source code repository engineers (maybe I forget a couple of important folks too) deserve credit!
If done properly, patch submitters should be noted in the CVS logs. Some projects (like FreeBSD) route those comments into commit logs too.
Ergo: scan the cvs trees and not the release packages.
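To make the "scan the cvs trees" suggestion concrete, here is a minimal sketch that tallies revisions and added lines per committer from the text that `cvs log` prints. It assumes the usual `date: ...;  author: ...;  state: ...;  lines: +N -M` metadata line; the function name `tally_cvs_log` and the sample log (with its authors alice, bob and carol) are made up for illustration and are not taken from the survey's own tooling.

```python
import re
from collections import Counter

# Matches the per-revision metadata line that `cvs log` prints, e.g.
#   date: 2000/05/09 10:15:00;  author: alice;  state: Exp;  lines: +12 -3
# The "lines:" field is optional because the initial revision has none.
REVISION_LINE = re.compile(
    r"^date: [^;]+;\s+author: (?P<author>[^;]+);\s+state: [^;]+;"
    r"(?:\s+lines: \+(?P<added>\d+) -(?P<removed>\d+))?"
)

def tally_cvs_log(log_text):
    """Return (revisions per committer, lines added per committer)."""
    revisions = Counter()
    lines_added = Counter()
    for line in log_text.splitlines():
        match = REVISION_LINE.match(line)
        if not match:
            continue
        author = match.group("author").strip()
        revisions[author] += 1
        lines_added[author] += int(match.group("added") or 0)
    return revisions, lines_added

# Hypothetical log excerpt, invented for this example.
SAMPLE_LOG = """\
revision 1.3
date: 2000/05/09 10:15:00;  author: alice;  state: Exp;  lines: +12 -3
fix buffer overflow
----------------------------
revision 1.2
date: 2000/04/01 08:00:00;  author: bob;  state: Exp;  lines: +40 -0
patch submitted by carol
----------------------------
revision 1.1
date: 2000/03/01 09:30:00;  author: alice;  state: Exp;
initial import
"""

revisions, lines_added = tally_cvs_log(SAMPLE_LOG)
print(revisions.most_common())
print(lines_added.most_common())
```

Note that this only credits the committer. A patch submitter such as "carol" in the sample shows up only in the free-text log message, so crediting submitters would need per-project conventions on top of this.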
- Oh, yeah? You have the source. Write it yourself, you moron!
- QT/GTK is for idiots.
- Apple is so stupid. If they open-sourced everything we'd fix it for them.
- M$ code is terrible.
- Why isn't Company X open-sourcing their product? Proprietary software is evil!
- Free software project X sucks.
or such things, be expected to link to this site showing exactly how much they've contributed. Although, given that the study has managed to overlook my insignificant but non-zero contributions, maybe I shouldn't propose that.
What I'm listening to now on Pandora...
Yeah, I'm on that list! Right at position 771 AND 772!
:)
:-/
:(
What!? They counted me TWICE? Once as tord.jansson@swipnet and then later as tord.jansson... hm... 248447 bytes for each of them... Hm, seems like they somehow counted me twice but with the SAME value or maybe they somehow split it in half.
Let's click on my name and see what projects they have mentioned me participating in, should be just BladeEnc... What!? makeMP3.codd!!! What the heck is THAT program!? Hm, I see... got to be some kind of frontend that has included the BladeEnc code...
Feels a bit odd getting credited for a program I don't know anything about, but still kind of okay...
On the other hand, I wonder how they came up with 248447 bytes, the BladeEnc code is about 1.5 meg
But then again, it wouldn't be fair to credit me for more anyway since BladeEnc is so heavily based on the original ISO code and the other BladeEnc contributors haven't gotten any credits since they're just mentioned on the homepage.
Guess this shows how far from precise this study is. A good attempt to measure something quite immeasurable though. Kudos to all the people who must have put an awful lot of work into this, and I hope you can get something useful out of the big picture, although the small details are terribly wrong.
Tord Jansson
BladeEnc Creator
Money.
Microsoft has hundreds of full-time programmers on Windows, more than enough to swamp the efforts of 13000 part-time hackers and students. IIRC, Windows, measured in man-hours, is the single greatest engineering project in the history of humanity.
I think this would look quite different if they did the survey on something like Debian.
--
A lot of unix coders put the starting brace of while and for loops on the same line. I think windows code generally puts it on the next line.
:)
I know it is only one line, but a lot of unix code I've seen does "} else {". That's three lines of windows code. It adds up!
-- Thrakkerzog
Secondly, most of this community, by its very nature, is distributed, decentralized, and hard to account for. That's not a coincidence - many of us like remaining anonymous.. the man behind the scenes. As anecdotal evidence look at the .sig blocks on slashdot - how many famous people note their OSS accomplishments in their sig? Very few. And as Linus himself said.. it's not like girls are throwing their underwear at him. Many people don't *want* to be counted.. an anonymous patch here and there is sufficient.. "I just want it to work".
So before people start using this report as a metric of people's contributions, remember two things: Even small contributions count, and this is an inclusive rather than exclusive community - you are welcome here whether you contribute source or not. People who write documentation, help the newbies, and convince management to put their company printers on linux (3Com anyone?) ought to be commended too. There's a lot more here than code!
In general the handling of large packages such as KDE seems fairly poor. For example, KDE apparently has no authors according to the by-project listing. I think this is a great idea, but it needs a cleaner source of data. For example, Coolo has been able to give some very interesting and detailed figures by running scripts on the KDE CVS repository. Perhaps this is the sort of thing they need to be using as the initial data set from which they make their analysis.
Rich.
on the other hand, the collection of the data -- if it can be arranged in some meaningful manner and then processed in a reasonable way that will yield thoughtful conclusions -- is no small task and rishab and his associates should be applauded for the hard work they did on that portion of the project. i, for one, would be glad to work with them to try to pull out some meaningful reports from their well-meaning but, i think, misfiring project.
Paul Jones
Certified Black Helicopter Pilot *** Unwitting Dupe of One World Gov'ment
But are they including SCSL code in Sun's count?
Actually, the Great Wall of China has more in common with Windows than you might think - it didn't work.
Well, of the graphs they provided, the 3D piegraph was definitely superfluous, and everything except the last pie graph could have been done using free software (take a look at gnuplot; it may have been able to do the pie graph as well, I'm not sure), so yes, I'd say he has a point.
Losing key staff is no longer the exclusive realm of corporations. It sort of surprises me to see this argument brought up in the context of open software! :-)
Absolutely! What is more, losing "key staff" in an open-source project is generally much less devastating than it is in a closed-source context, as open-source by its very nature tends to distribute expertise on a given project much more widely.
For example, early in the Linux Years (pre 1.0) the guy (I forget his name) who did a lot of the early networking work abandoned Linux to its own devices, largely due to being flamed for not having written the perfect, most elegant implementation in his first iteration. Another took over that aspect, the kernel lived on, development moved forward, and Linux is now a raging success. The loss of a very key developer caused hardly a hiccup in development (though an awful lot of discussion, flamage, and doomsday saying).
kNFS was abandoned for almost a year, which caused myself and others a number of headaches in dealing with Linux NFS (and is probably the reason why Linux NFS lags behind the BSDs and commercial UNIXen in performance). That having been said, it was picked up, is being actively developed, with NFS V 3 support in the 2.4-pre kernels. This is probably the best "worst case" or at least "very bad case" example of an open source project being abandoned one can find, at least in the Linux area of endeavor.
Abandonment of a project can lead to some delay (as with NFS), but as often as not the delay is minimal (gimp, Linux networking) as another active developer takes over. I would submit that delays in closed-source commercial applications are much more common and typically much more lengthy.
Finally, with open source the project will always be picked up and continued by someone, as long as there is any interest. Contrast this to many closed-source products which are orphaned, leaving developers and users in a serious bind which they can do nothing about, other than remapping their entire engineering or corporate strategy to a completely new, competing product, at great cost in time and money. In the worst case open-source scenario, such a customer would have to finance and perform ongoing development and maintenance themselves, which would often be a less expensive solution than the alternatives. Having said that, I do not know of a single open-source project where anyone was compelled to do this. I do know of a number of orphaned, closed-source products which left consumers in a terrible bind, from bitter, personal experience.
Our solution, which has to date saved us tens of thousands of dollars and hundreds of developer hours in cost, was to move to an open source platform (Linux and FreeBSD) and require open source libraries to be used wherever possible, limiting our exposure to orphanage of closed-source products.
The Future of Human Evolution: Autonomy
check out the enlightenment stats
;P)
granted, good ole raster is a huge part of the project, but i was surprised to see him mentioned at least three times ("the rasterman", "carsten haitzler", and "raster@zip.com") as was mandrake...duno if this should be attributed to their data collection methods or to messy credits files (understandable in the case of raster's typing
-dk
Dream with the feathers of angels stuffed beneath your head.
The discussion points out some interesting facts about why some individuals are listed as big contributors (such as the author of libtool. Duh.) and why some aren't listed at all. They even have some comments from the developers of the survey.
And I just love the comment of Havoc Pennington:
That or the Apollo space program ... or pick your favorite big project. Get a sense of proportion, please.
Preventive War is like committing suicide for fear of death. - Otto Von Bismarck
Hate to say it, but they made their mistake in thinking freshmeat.net was comprehensive. freshmeat.net is a very small part of the open source out there.
RocketAware already lists much more than freshmeat (and is way easier to use, if you are a programmer looking to reuse code, eh?)
Then why are you posting as AC?
Bowie J. Poag
Good to see something like this. However, I have to admit, it's a little bit of a letdown. I've got 10MB worth of gear in Red Hat 6.1, but my name didn't show up anywhere. Yes, yes, I know, it's not code, Bowie..Heh
Bowie J. Poag
Looks like a resounding victory for the FSF. But respect to Sun, who, despite being a big ole commercial company, still have managed a huge input.
--Remove SPAM from my address to mail me
Random made up statistic to prove point:
Let's say they looked at 10 million lines of code. Well, 0.139% is 13900 lines of code. Not insignificant.
Duh.
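The arithmetic behind that made-up statistic is easy to check. A tiny sketch, where the helper name `share_to_lines` and the 10-million-line total are illustrative assumptions carried over from the hypothetical above, not survey figures:

```python
def share_to_lines(total_lines, percent):
    """Convert a percentage share of a code base into a line count."""
    return round(total_lines * percent / 100)

# The hypothetical from the comment: 0.139% of 10 million lines.
lines = share_to_lines(10_000_000, 0.139)
print(lines)
```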
--Remove SPAM from my address to mail me
Wonderful point - and I hope folks that are in the less than 1% crowd don't quit either! Even finding and fixing one line of code is a blessing.
Heck, as I sit here now I have found three lines of code I need to put in this program I am writing where I did not clean up my linked list. Argh! No wonder the original app has had a tendency to crash over the past 3 years.
The small stuff is as big as the big stuff.
> So for every package that uses libtool, Gordon gets credited with about 8 thousand lines of code. What a sweet deal!
If this ever gets as popular as karma whoring, OSS is in for some serious bloat! "Oh yea? Well my tool inserts eight million lines of code, nyeh, nyeh, nyeh!"
--
Sheesh, evil *and* a jerk. -- Jade
Well, maybe not quite that much attention. We don't need kiddies who wouldn't know C++ from Excel macros checking in millions of lines of garbage into any open CVS.
As for number of projects, potato has 4376 packages, not all of those are separate projects (some are from multi-binary source, some are task packages), but I'm rather sure more than 3149 of them are :)
He succeeded in writing the exact same size of code in numerous projects:
Interesting stuff I thought at first. Very interesting indeed.
Then I started to check into details. Being the author and participator of at least five projects listed on freshmeat (all of them included in this "report") I checked them up to see what they had to say about me and the projects I've contributed to.
They had no clue at all. Lots of people were credited with code submissions they've for sure never made, while it was very obvious that some of the major authors were not credited with anywhere near as much as they have actually contributed to the projects. Many names were very confusing and mixed up.
Seeing how badly wrong they are on the few projects I have in-depth knowledge about, how can I trust any conclusions they make in general on the whole context?
I say scrap the whole thing, do it all from the start. This is not the truth.
Interesting results, and certainly the numbers involving lines of code per project are probably accurate.
However, glancing through a project that I'm the primary author on shows me as the 24th on the list of developers for it, having written 585 bytes. I suspect I've written a few more than that.
The top of the list was dominated by a mailing list address that isn't even correct. The second name on the list was the UC Regents, who own the copyright (but certainly their lawyers didn't write the code).
And judging by the other comments, I suspect that the majority of their data is similarly way off. I wonder if they even tested the tool they developed on a few randomly selected projects to see how accurate the results were. They didn't even perform the most obvious data collection method I can think of: "cvs annotate".
I like the study, but I'd sure like to see it done better.
The next site to slashdot will be ready soon, but subscribers can beat the rush and start slashdotting it early!
I looked at the algorithm used to determine how they collected the names of contributors. They grepped e-mail addresses, rcs ids, and copyright info from various files. I don't think that's the best way to draw any useful conclusions in regards to Open Source software. The only real conclusion found here is that Open Source projects include a lot of code written by other people. That's trivial. This study fails to make a distinction between an active contributor and someone whose code was simply borrowed. This is an important distinction to make! For instance, what if I were to take 1000 physics homework assignments and search for "F=ma" in them. I can't assume that the appearance of "F=ma" on your paper means that Newton helped you with your homework. I can only assume that you used Newton's second law of motion to help you solve the problem.
Similarly, if you wanted to determine who the most prolific scientific researcher is in a field, would you gather data by simply grepping for names in the texts of papers? No, you'll skew the data by counting the names who appear in the paper's "References" when you should just be counting the actual investigators who are listed as the authors of the paper!
I would like to see this study repeated but making the distinction between an active contributor to a project and someone whose code was simply included. Only then would a top-heavy distribution suggest anything meaningful in regards to OSS authorship.
If anyone has looked at the CODD algorithms/code and can show me if they used a more sophisticated method to filter out authors with no active involvement in a project, please post. It's a difficult problem to infer who actively and who passively contributed to a project with just a perl script.
Well no, actually my wife's name is Heather and my Son's name is Max and it just sort of happened :-)
Andrew.
I noticed on the PostgreSQL Hackers list that Thomas Lane said this was very bogus because it appears to re-include his libjpeg as many times as it is used by something else.
Also, is FSF an Author? Is BSD an Author?
Andrew.
Man, go away. Posting the results here is just not right. I could see it being helpful if, say, the site was ./'ed, but it isn't.
OTOH, it's nice to see some sort of a start at studying the free software community...
"You can never have too many elephants on your team."
"Windows, measured in man-hours, is the single greatest engineering project in the history of humanity."
;)
hmmm... I wonder how many man-hours went into the pyramids and the great wall... Any of you engineers wanna venture an estimate on the G.W.? I think the ancient Chinese beat MS hands down.
Geeky modern art T-shirts
Let's say this were Windows NT .. the person who wrote the 13900 lines of code would have written the code to blast you with "You must reboot for changes to take effect" dialog box that pops up whenever you dare move the mouse. A worthy contribution, indeed! :]
I'm completely aghast that they did not include a single OS beyond a Linux distribution. I'm happy to see they'll include OpenBSD in their next study, though I wonder why they chose OpenBSD instead of NetBSD, which is larger. And I wonder why not include FreeBSD too, whose developer base is quite different from that of Open and NetBSD.
(8-DCS)
It probably depends on your definition of "single". But I reckon the pyramids would beat windows, given that they were done by hand millennia ago.
perl -e 'fork||print for split//,"hahahaha"'
Did the original poster even *mention* Linux? Linux is not the same thing as Open Source.
Free software was not a "rational choice" in 1984, if by rational you mean The Best Tool For The Job. If everyone only cared about using the best toolset, gcc would not have been written and none of this open-source explosion would have happened. Your use of the word "rational" suggests the original poster's view is crazy. Well, remember that this whole shebang has been made possible by a man who is "crazy", in the sense of not always wanting to use the short-term best tool for the job.
I agree with your point, that the use of Excel does not detract from this study at all. You're also right about misuse of the word "ironic". Please don't misuse the word "rational".
perl -e 'fork||print for split//,"hahahaha"'
They list their sources as follows:
Debian would have been a more sensible distro to use, because it is overflowing with (packages|crap). Red Hat (presumably) just ship the ones which it makes commercial sense to ship, whereas Debian has everything that anyone's bothered to include whether it's useful or not. For example, Cooledit (my favourite text editor) is missing from the survey. The only problem with Debian would be stuff missing because it is not DFSG-free. Such stuff is available in the non-free/ directory but it's probably not as comprehensive as the main/ directory is.
Having said that, it's very interesting to see what they have got. I didn't know Andrew Tridgell did all that stuff, for example. This could be a good tool for the community to get to know people better.
perl -e 'fork||print for split//,"hahahaha"'
ESR had a colloquium at Cornell a while ago and I brought up Nikolai Bezroukov's critique of his CatB, which he loudly discredited. I wish this survey had come out earlier...I would like to ask him to comment on these statements:
"The top 1271 authors, 10% of the total, accounted for 72.3% of the total code base. The top 10 authors alone (0.08% of the total) are credited for 19.8% of the code base. Free software development may be distributed, but it is most certainly very top heavy."
"Our conclusion: Free software development is less a bazaar of several developers involved in several projects, more a collation of projects developed single mindedly by a large number of authors."
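For anyone who wants to check a "top heavy" claim like the one quoted above against their own data, here is a minimal sketch of the calculation. The function name `top_share` and the sample byte counts are invented for illustration; only the 1271, 10 and 12706 author counts come from the survey figures quoted in this thread.

```python
import math

def top_share(byte_counts, fraction):
    """Share of all bytes credited to the top `fraction` of authors."""
    ranked = sorted(byte_counts, reverse=True)
    top_n = max(1, math.ceil(len(ranked) * fraction))
    return sum(ranked[:top_n]) / sum(ranked)

# Hypothetical per-author byte counts, made up for this example.
sample = [500_000, 200_000, 100_000, 50_000, 50_000,
          40_000, 30_000, 15_000, 10_000, 5_000]
share = top_share(sample, 0.10)  # top 1 of 10 authors
print(f"top 10% of authors hold {share:.1%} of the bytes")

# Sanity check of the percentages quoted from the survey:
# 1271 of 12706 authors is about 10%, and 10 of 12706 is about 0.08%.
top_decile_pct = round(1271 / 12706 * 100, 1)
top_ten_pct = round(10 / 12706 * 100, 2)
print(top_decile_pct, top_ten_pct)
```

This says nothing about whether the underlying credits are right; it only shows how a concentration figure follows from whatever per-author totals the scanner produced.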
The question from Bezroukov's paper I didn't bring up was that open source projects look much more cathedralesque and hierarchical as one moves up. E.g., not just anybody gets patches put right in to the Linux or *BSD kernel.
It's 10 PM. Do you know if you're un-American?
What I find most interesting by far is the composition of the contributions when viewed by project. In nearly every project I viewed, there are two or three elite "key contributors" who provide something on the order of 1/3 to 7/10 or more of the code, with the remainder provided by a slew of sub-1% coders.
This relates an interesting story. It appears that, while the real strength of OSS is incremental improvement over time, few projects can exist without a guiding intellect or a handful of ambitious coders on the core team.
Presenting this data to employers who are concerned about losing control of their code may help assuage their fears of open source. Clearly projects that are "owned" by no one are rarities. A corporation *can* have its cake and eat it too.
-konstant
Yes! We are all individuals! I'm not!
Well, there is one way that the Open Source community can take over and lead the way on networking protocols.
Come up with our own protocol.
I have had this Idea in my head for a while, but I am only a network support tech, not a programmer, so I couldn't do it myself. I have some great ideas, but no way of implementing them.
"When I'm singing a ballad and a pair of underwear lands on my head, I hate that. It really kills the mood."
-Tom Jones
And what's wrong with using the most conveniently available tool for the job? A rational, non-bigoted person wouldn't see anything wrong with using a tool that you already had available and were experienced in using. I also don't see anybody mentioning any Open Source applications that would have been better suited to the task... does StarOffice have this capability?
"Freedom means freedom for everybody" -- Dick Cheney
By that same criterion, I wouldn't call Windows an engineering project either. "Whilst elaborate, the actual engineering would have been fairly minimal." Yep, sounds like Windows to me!
"Freedom means freedom for everybody" -- Dick Cheney
12706 developers working several years on 3149 projects, and they've still produced fewer lines of code than a single release of Win2K... is this because Open Source is more efficient, less feature-rich, or because it doesn't carry the burden of backwards compatibility with DOS 1.0?
"Freedom means freedom for everybody" -- Dick Cheney
While I think that it's good someone is performing some stats on open software development (if only to show others that stuff is actually being done) I think this could contribute to some BIG problems if people start to compete for the highest ranking.
There is a good story about IBM in the late 70s about how they measured a research lab's performance based on KLOCs (thousands of lines of code). Surprisingly, the lab at Boca was winning most of the time. Then someone figured out that they were unrolling all of their loops in order to increase the line count...
Proves it can, and does, happen...
so what??? I wouldn't blame them for using it, Excel is a good product. I use good products. I use Linux at home, and I use Excel at work. If it wasn't a good product, I wouldn't use it. If Linux wasn't a good product, how many of us would use it? Personally, I'm a little tired of people bashing products because they're made by MS...bash them for their bugs -- fine -- but not just because they're made by MS.
Seeing as this survey has highlighted code re-use in the Open Source community (Gordon Who?), do you reckon that OSS proves a good model for that Holy Grail, efficient code re-use (libtool et al)?
Is anybody doing studies on code re-use in open source or closed source software?
McC
anyway, the point is that stats can be used to lie, but equally they can be used to extract the truth. For example much of modern materials science is based on statistics. Likewise economic forecasting techniques. Stats aren't always bad, it's just that they can be misused.
"The new wave is not value-added; it's garbage-subtracted" - Esther Dyson, Dec 1994
When watching any sporting event this rings true. I love hearing an ex-sports player turned commentator talk about how "3 out of the last 4 times these teams have met on a full moon during the month of July...."
Statistics are the tool of the devil.
Other than that, they are sampling a very small (and non-representative, I would guess) number of projects. There are a hell of a lot more than 3000 projects listed on Freshmeat alone. And god knows how many developers are missed. It's a start, but no more than that.
-- The Sheep --
I think it's just that there is probably no other good spreadsheet package around. I tried using StarOffice for writing one of my papers that had a lot of charts in it, but I was just disgusted by how you have to coerce the thing into making even a simple chart. Also embedding of a chart in a text document was just plain buggy and crude. It took me hours to do something that would take minutes in Excel and in the end I had to settle for less than perfect charts. I know that there are other spreadsheets around (like the Corel or Lotus one) but they are also closed source commercial products. Also Excel is probably the best one around, even if it's by Microsoft, etc. I don't blame Star (yet), because I was using a beta, but they are very far from the object embedding that Office does. I liked their equation editor though...
And finally, a word from Harry Truman: "There are three types of lies: lies, damn lies, and statistics!"
"It's better to keep your mouth shut and be thought a fool than to open it and remove all doubt."
Statistically speaking, someone was bound to say that.
Geez, Microsoft doesn't have the time or the will to keep their products backwards compatible with CP/M 86, errr DOS 1.0. They're keeping with the times, they only carry the burden of backwards compatibility to DOS 3.3. I mean MS has got better things to do with its time than make sure DOS 1.0 stuff works, they're busy adding the much needed Auto-remove-all-vowels-from-words-directly-following-a-spelling-error feature in Word 201.
The ironic thing is that I believe that those graphs were made in Excel. I doubt you could get less open source or free than Excel.
...and highschool teachers :(
--
Soma: because a gramme is better than a damn.
These projects would average about 300 K each...what are they? Drivers? Application programs? Pac-Man clones?
Hopefully I didn't put any [] around my words.
Is it just me or did anyone else notice the distinct MS Excel look about those graphs?
Devilish
--------Irc.destructor.net--------
--------The Geek Network--------
Devilish
www.sci-fact.com - From Fiction to fact -
Your one stop science news and discussion site.
I wish I had time to sit around and contribute to a ton of open source projects... alas, I have to make a living.
My question is, after viewing some of the profiles of the top contributors, is 0.139% really much of a contribution to a project?
My Free Software contributions over the last 10 years come to 7 projects with 10,586,400 bytes of code, which would put me at number 6 on the overall list. Since I didn't even get a mention anywhere, I can only conclude that these statistics are nowhere near accurate.
hit the submit button instead of preview... but an impressive list anyhow
"It is a greater offense to steal men's labor, than their clothes"
This would be a nice feature
"It is a greater offense to steal men's labor, than their clothes"
the text clearly lists the limitations of the survey including the small code base used; the algorithm to identify and credit authors is clearly documented - and the source code is available on the site FWIW. of course, the survey is full of errors, some of which i've commented on here, on advogato and elsewhere (e.g. gordon matzigkeit).
the main problem is naturally that this is impossible to do by hand and has to be automated; we did want to look at authorship at a file level (the lowest level of granularity available); and author credits are in no fixed format. they're not even there much of the time, which is why copyright holders such as the FSF get a lot of credit too. the only alternative to listing them as they are is to have a huge "uncredited" portion - at least until authors start consistently claiming credit, using the same name or e-mail address in each file they write.
incidentally it is not possible for us to guess which of many contributors to a single file are more important; as documented, the credit is currently split equally among them.
finally, this is just a start. while we intend to continue working on this, the algorithm source code is available as are all the code bases, so nothing stops you from doing it too.
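For anyone who wants to see what "split equally" means in practice, here is a minimal sketch of the crediting rule as described in this comment. It is not the survey's actual code (that is on the Orbiten site). The function name `split_credit`, the input format, the sample figures, and the "(uncredited)" bucket are all assumptions made up for illustration.

```python
from collections import defaultdict


def split_credit(files):
    """Credit each file's size equally among its listed authors.

    files: iterable of (size_in_bytes, [author names]) pairs.
    Returns a dict mapping author name to credited bytes.
    Assumption: files with no recognizable author go into a
    hypothetical "(uncredited)" bucket.
    """
    credit = defaultdict(float)
    for size, authors in files:
        unique = sorted(set(authors))  # count each author once per file
        if not unique:
            credit["(uncredited)"] += size
            continue
        share = size / len(unique)  # equal split, no weighting by importance
        for name in unique:
            credit[name] += share
    return dict(credit)


# Made-up sample data: three files with different author lists.
sample = [
    (9000, ["alice", "bob", "carol"]),
    (4000, ["alice"]),
    (1000, []),
]
totals = split_credit(sample)
```

With a rule like this, a name that appears in a generated file shipped with hundreds of packages collects a share from every copy, which would explain the Gordon Matzigkeit effect mentioned elsewhere in this thread.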
Of course, for those enthusiasts out there willing to develop on four platforms (Win 9x, Win NT, Win 2000, and Win CE, where Win 9x keeps DOS 1.0 compatibility), take it from the former richest man in the world -- code "Fatware" and acquire a mono^H^H^H^H er, successful company.
Win 2000 will work on any computer! Any computer that is as fast as AMD's Athlon can be overclocked! That is the success of "Fatware" -- make deals with OEMs, and when the hardware companies get more revenue because of your "Fatware", you get a cut! How can you go wrong? Unless you have a DOJ lawsuit hanging over your head, you can't!
Of course coding excessive lines can be rather time-consuming. The fun part is having more backdoors in your software than the White House. After making a bunch of companies squirm over the backdoors, write a simple patch and charge a fortune. ;) But it's not a "bug fix" as those open-source people would say... nah, it's a "Service Pack".
Ah, maybe M$ should have compatibility with DOS 1.0 -- the later versions didn't improve the OS much. :)
Karma whorin' since 1999
how those trollers used to make me smile
And I knew that if I had my chance
I could make Slashdot:ers dance
And maybe they'd be happy for a while.
Did you write the stuff that matters /. tells you so
And do you have faith in CmdrTaco
If the
Now do you believe in rock 'n troll
And can moderation save their mortal soul
And can you teach me how to troll real low
Well I know that you're in love with her
'Cause I saw you posting in the forum
Pouring hot grits down her pants
Man, I dig those flamebait rants
I was a lonely teenage broncin' h4x3r
With a pink iMac and a beowulf cluster
But I knew that I was out of my mind
The day The Slashdot died
I started singin'
Bye-bye, Miss Petrified
Surfed my IE to the forum but the forum was dry.
And good old trollers (were) drinkin' whiskey and rye
Singin' this'll be the day that I die
This'll be the day that I die
I met a geekgirl who sang the blues
And I asked her for some nerdy news
She stole my coke and turned away
Well I surfed down to the sacred forum
Where I'd seen the posts years before
But the 404 said the posts wouldn't come
Well now on the street the trollers screamed
The l33t3rs cried, and the h4x0rs dreamed
But not a word was posted
The Taco Bells all were broken (OT?)
And the free man I admire the most
OOG with the Holy Post
He sent the last post to the host
The day The Slashdot died
We started singin'
||: Bye-bye, Miss Petrified :||
Surfed my browser to the forum but the forum was dry.
And good old trollers (were) drinkin' whiskey and rye
Singin' this'll be the day that I die
This'll be the day that I die
(repeat plz)
(TTL 4) We started Pingin'
____________________________________________
By: TACO TROLL of the Troll Liberation Lobby