Statistics On Free Software projects
GenericBoy writes: "The first edition of The Orbiten Free Software Survey is out online. Some of the stats are number of authors and projects, the top 10 contributing authors, how many MB are in all of the free software projects put together (!) and a bunch more. " Now, as they themselves point out in the their Scope and Method, the methodology is crude, and I don't think Orbiten could quite submit it to Nature yet or anything, but it's an interesting bunch of stats.
In general the handling of large packages such as KDE seem fairly poor. For example KDE apparantly has no authors according to the by-project listing. I think this is a great idea, but it needs a cleaner source of data, for example Coolo has been able to give some very interesting and detailed figures by running scripts on the KDE CVS repository. Perhaps this is the sort of thing they need to be using as the initial data set from which they make their analysis.
Rich.
The discussion points out some interesting facts about why some individuals are listed as big contributers (such as the author of libtool. Duh.) and why some aren't listed at all. They even have some comments from the developers of the survey.
And I just love the comment of Havoc Pennington:
They list their sources as follows:
Debian would have been a more sensible distro to use, because it is overflowing with (packages|crap). Red Hat (presumably) just ship the ones which it makes commercial sense to ship, wheras Debian has everything that anyone's bothered to include whether it's useful or not. For example, Cooledit (my favourite text editor) is missing from the survey. The only problem with Debian would be stuff missing because it is not DFSG-free. Such stuff is available in the non-free/ directory but it's probably not as comprehensive as the main/ directory is.
Having said that, it's very interesting to see what they have got. I didn't know Andrew Tridgell did all that stuff, for example. This could be a good tool for the community to get to know people better.
perl -e 'fork||print for split//,"hahahaha"'
What I find most interesting by far is the composition of the contributions when viewed by project. In nearly every project I viewed, there are two or three elite "key contributors" who provide somthing on the order of 1/3 to 7/10 or more of the code, with the remainder provided in a slew of sub-1% coders.
This relates an interesting story. It appears that, while the real strength of OSS is incremental improvement over time, few projects can exist without a guiding intellect or a handful of ambitious coders on the core team.
Presenting this data to employers who are concerned about losing control of their code may help assuage their fears of open source. Clearly projects that are "owned" by no one are rarities. A corporation *can* have its cake and eat it too.
-konstant
Yes! We are all individuals! I'm not!
-konstant
Yes! We are all individuals! I'm not!