Estimating the Size/Cost of Linux
2bits writes "Wow... A Billion Dollars Worth Of Software On My System For Free! Check This Guy Out, He Came Up With A Counting / Pricing Method For Quite A Few Types of Source Code. Here is the Program. The results on the site are sorta dated, based on RH 7.1, but the app is pretty cool!... Hey, I can finally find out how much all my side projects are worth / costing me..."
Where did he get the billion dollar estimate from? I see no direct correspondence between lines of code and monetary value.
[cmdrtaco@localhost]$ est slashcode
Analyzing slashcode.....
Result: $6.66
[cmdrtaco@localhost]$
Okay, so now Slashdot is posting this story that is over a year old?
From the header of the paper:
More Than a Gigabuck: Estimating GNU/Linux's Size
David A. Wheeler (dwheeler@dwheeler.com)
June 30, 2001 (updated November 8, 2001)
Version 1.06
www.timcoleman.com is a total waste of your time. Never go there.
I know I'll get modded down for saying this, but Taco, as an "editor", couldn't you at least have fixed This Guy's Moronic Capitalization Scheme?
It's hard to be religious when certain people are never incinerated by bolts of lightning.
A Billion Dollars Worth Of Software On My System For Free!
Yeah, that's what happens when you use P2P _WAY_ too much
Someone finally acknowledging that OpenSource/Free(beer) software actually has an associated cost - what next? Wait - is that a flying pig I see?
Although I remember this article from a few months ago, I am too lazy to look it up. But it is interesting how the Open Source movement, just by a lot of people doing a lot of little things (and some not so little), has created a product that would take a lot of resources for a large company to complete. Open Source Software, in my opinion, is the only way for the Little Guy to play with the Big Guns.
If something is so important that you feel the need to post it on the internet... It probably isn't that important.
Maybe MS could spend the money for a working OS in about a week or two.
how much can I charge for my 5 line C++ "Hello world\n" program? ;-)
This must be about the first time I've read an article about Linux, (GNU/Linux, if that's what you like to call it), that hasn't called the X-Window-System(tm), X-Windows.
As far as I am aware, Windows(tm) is a trademark of Microsoft Corporation(tm). The X Consortium actually give recommended names for X, in the X man page.
This looks like a serious problem for Linux distributors like Red Hat, Mandrake, and Debian. They sell their products (which consist of software and support and manuals) for $40-$100, usually. Now we see that what they put into their product (i.e., the cost) is orders of magnitude beyond that. Even if Red Hat sold every single copy it packaged (it doesn't even come close), and even if nobody downloaded it for free or copied the CDs for a friend (again, an incredibly optimistic assumption), it would still be looking at huge losses.
This might have worked a few years ago, but with accounting practices coming under scrutiny across the board, I fear that these companies are headed for trouble.
Karma: Good (despite my invention of the Karma: sig)
Why? He obviously does not use Linux. Just look at his picture! What Linux user out there is gonna be caught dead wearing a white shirt and a tie? Okay, maybe to a wedding/funeral, but that's it.
He also went off and shaved and combed his hair for his picture.
The man just ain't right, I tell you!
woody:~# apt-cache search sloccount
sloccount - Programs for counting physical source lines of code (SLOC)
...so it appears there's a *.deb of it already (or is this an old story...) Hmmm... you be the judge.
Why did I post this? Ask me now!
For slashcode? Dude, you got ripped off!
Instead of wasting time figuring out fictitious pricing based on the way that corporate America prices software, why not figure out a way to remove the aforementioned hidden costs from Linux so that the masses can begin to see what many of us on /. have known for a while: That GNU Linux and Open Source Software represent a great choice.
It may well contain "over 30 million physical source lines of code (SLOC)", but what about the lines of source code? Eh?
Didn't think about that, did you?
I don't think the length of the code, or the time that was or might have been taken to produce it, is in any way related to the value of using the resulting software.
The same people who argue in these terms also try to legitimize open source software by its better "quality" in terms of fewer errors. The upshot of this argument is that MS software would be great to use if it contained fewer errors. But that's not the main point. As can be seen when MS does horrible things like allowing themselves to destroy your software (DRM EULA change), the problem is not the result but the way they produce their software. I'd argue that because their development model is bad the resulting software is bad, too, but that's only a minor problem in comparison to the harm they do to the software culture in general.
This paper analyzes the amount of source code in GNU/Linux, using Red Hat Linux 7.1 as a representative GNU/Linux distribution, and presents what I believe are interesting results.
In particular, it would cost over $1 billion ($1,000 million - a Gigabuck) to develop this GNU/Linux distribution by conventional proprietary means in the U.S. (in year 2000 U.S. dollars). Compare this to the $600 million estimate for Red Hat Linux version 6.2 (which had been released about one year earlier). Also, Red Hat Linux 7.1 includes over 30 million physical source lines of code (SLOC), compared to well over 17 million SLOC in version 6.2. Using the COCOMO cost model, this system is estimated to have required about 8,000 person-years of development time (as compared to 4,500 person-years to develop version 6.2). Thus, Red Hat Linux 7.1 represents over a 60% increase in size, effort, and traditional development costs over Red Hat Linux 6.2. This is due to an increased number of mature and maturing open source / free software programs available worldwide.
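For anyone wondering how a line count turns into a dollar figure, here is a minimal sketch of a Basic COCOMO estimate in Python. The coefficients 2.4 and 1.05 are Boehm's organic-mode values; the $56,286 annual salary and the 2.4 overhead multiplier are assumptions (they are what I recall as sloccount's defaults, and the excerpt above does not state them). The function names and the component sizes are made up for illustration, so don't expect this to reproduce the paper's totals.

```python
def cocomo_effort_pm(ksloc, a=2.4, b=1.05):
    """Basic COCOMO effort in person-months for `ksloc` thousand SLOC.

    a and b default to the organic-mode coefficients.
    """
    return a * ksloc ** b


def cocomo_cost(ksloc, salary=56286.0, overhead=2.4):
    """Estimated development cost in dollars.

    salary and overhead are ASSUMED values, not taken from the excerpt.
    """
    person_years = cocomo_effort_pm(ksloc) / 12.0
    return person_years * salary * overhead


def distribution_cost(component_ksloc):
    """Cost each component separately, then add the results up."""
    return sum(cocomo_cost(k) for k in component_ksloc)


if __name__ == "__main__":
    # Hypothetical component sizes in KSLOC, for illustration only.
    components = [2400.0, 2000.0, 1500.0]
    print(f"summed per component: ${distribution_cost(components):,.0f}")
    print(f"as one big system:    ${cocomo_cost(sum(components)):,.0f}")
```

Because the exponent is greater than 1, costing the pieces separately and summing gives a smaller number than costing one giant system, so how the code is carved up matters a lot.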
Many other interesting statistics emerge. The largest components (in order) were the Linux kernel (including device drivers), Mozilla (Netscape's open source web system including a web browser, email client, and HTML editor), the X Window system (the infrastructure for the graphical user interface), gcc (a compilation system), gdb (for debugging), basic binary tools, emacs (a text editor and far more), LAPACK (a large Fortran library for numerical linear algebra), the Gimp (a bitmapped graphics editor), and MySQL (a relational database system). The languages used, sorted by the most lines of code, were C (71% - was 81%), C++ (15% - was 8%), shell (including ksh), Lisp, assembly, Perl, Fortran, Python, tcl, Java, yacc/bison, expect, lex/flex, awk, Objective-C, Ada, C shell, Pascal, and sed.
The predominant software license is the GNU GPL. Slightly over half of the software is simply licensed using the GPL, and the software packages using the copylefting licenses (the GPL and LGPL), at least in part or as an alternative, accounted for 63% of the code. In all ways, the copylefting licenses (GPL and LGPL) are the dominant licenses in this GNU/Linux distribution. In contrast, only 0.2% of the software is public domain.
This paper is an update of my previous paper on estimating GNU/Linux's size, which measured Red Hat Linux 6.2 [Wheeler 2001]. Since Red Hat Linux 6.2 was released in March 2000, and Red Hat Linux 7.1 was released in April 2001, this paper shows what's changed over approximately one year. More information is available at http://www.dwheeler.com/sloc.
1. Introduction
The GNU/Linux operating system has gone from an unknown to a powerful market force. Netcraft found that, of the systems running web servers in June 2001, GNU/Linux was now the second most popular operating system (with 29.6%, versus Windows' 49.6%) [Netcraft 2001]. Another survey, of primarily European and educational sites, found that GNU/Linux was used more than any other operating system (of the sites it surveyed) [Zoebelein 1999]. IDC found that 25% of all server operating systems purchased in 1999 were GNU/Linux, making it second only to Windows NT's 38% [Shankland 2000a].
There appear to be many reasons for this, and not simply because GNU/Linux can be obtained at no or low cost. For example, experiments suggest that GNU/Linux is highly reliable. A 1995 study of a set of individual components found that the GNU and GNU/Linux components had a significantly higher reliability than their proprietary Unix competitors (6% to 9% failure rate with GNU and Linux, versus an average 23% failure rate with the proprietary software using their measurement technique) [Miller 1995]. A ten-month experiment in 1999 by ZDnet found that, while Microsoft's Windows NT crashed every six weeks under a ``typical'' intranet load, using the same load and request set the GNU/Linux systems (from two different distributors) never crashed [Vaughan-Nichols 1999].
However, possibly the most important reason for GNU/Linux's popularity among many developers and users is that its source code is generally ``open source software'' and/or ``free software''. A program that is ``open source software'' or ``free software'' is essentially a program whose source code can be obtained, viewed, changed, and redistributed without royalties or other limitations of these actions. A more formal definition of ``open source software'' is available from the Open Source Initiative [OSI 1999], a more formal definition of ``free software'' (as the term is used in this paper) is available from the Free Software Foundation [FSF 2000], and other general information about these topics is available at Wheeler [2000a]. Quantitative rationales for using open source / free software are given in Wheeler [2000b].
The GNU/Linux operating system is actually a suite of components, including the Linux kernel on which it is based, and it is packaged, sold, and supported by a variety of distributors. The Linux kernel is ``open source software''/``free software'', and this is also true for all (or nearly all) other components of a typical GNU/Linux distribution. Open source software/free software frees users from being captives of a particular vendor, since it permits users to fix any problems immediately, tailor their system, and analyze their software in arbitrary ways.
Surprisingly, although anyone can analyze GNU/Linux for arbitrary properties, I have found little published analysis of the amount of source lines of code (SLOC) contained in a GNU/Linux distribution. Microsoft unintentionally published some analysis data in the documents usually called ``Halloween I'' and ``Halloween II'' [Halloween I] [Halloween II]. Another study focused on the Linux kernel and its growth over time is by Godfrey [2000]; this is an interesting study but it focuses solely on the Linux kernel (not the entire operating system). Paul G. Allen posted some results from running Scientific Toolworks, Inc.'s tools on the Linux kernel, but this analysis only considered C code (including headers) - ignoring the many other languages used in constructing the Linux kernel (e.g., assembly language), and only concentrating on the kernel. The Free Code Graphing Project at http://fcgp.sourceforge.net generates a graphical representation of a program (currently, the Linux kernel), but only of the C code. In a previous paper, I examined Red Hat Linux 6.2 and the numbers from the Halloween papers [Wheeler 2001].
This paper updates my previous paper, showing estimates of the size of one of today's GNU/Linux distributions, and it estimates how much it would cost to rebuild this typical GNU/Linux distribution using traditional software development techniques. Various definitions and assumptions are included, so that others can understand exactly what these numbers mean. I have intentionally written this paper so that you do not need to read the previous version of this paper first.
For my purposes, I have selected as my ``representative'' GNU/Linux distribution Red Hat Linux version 7.1. I believe this distribution is reasonably representative for several reasons:
Different distributions and versions would produce different size figures, but I hope that this paper will be enlightening even though it doesn't try to evaluate ``all'' distributions. Note that some distributions (such as SuSE) may decide to add many more applications, but also note this would only create larger (not smaller) sizes and estimated levels of effort. At the time that I began this project, version 7.1 was the latest version of Red Hat Linux available, so I selected that version for analysis.
Note that Red Hat Linux 6.2 was released in March 2000, Red Hat Linux 7 was released in September 2000 (I have not counted its code), and Red Hat Linux 7.1 was released in April 2001. Thus, the differences between Red Hat Linux 7.1 and 6.2 show differences accrued over 13 months (approximately one year).
Clearly there is far more open source / free software available worldwide than is counted in this paper. However, the job of a distributor is to examine these various options and select software that they believe is both sufficiently mature and useful to their target market. Thus, examining a particular distribution results in a selective analysis of such software.
Section 2 briefly describes the approach used to estimate the ``size'' of this distribution (more details are in Appendix A). Section 3 discusses some of the results. Section 4 presents conclusions, followed by an appendix. GNU/Linux is often called simply ``Linux'', but technically Linux is only the name of the operating system kernel; to eliminate ambiguity this paper uses the term ``GNU/Linux'' as the general name for the whole system and ``Linux kernel'' for just this inner kernel.
2. Approach
My basic approach was to:
More detail on this approach is described in Appendix A. A few summary points are worth mentioning here, however.
2.1 Selecting Source Code
I included all software provided in the Red Hat distribution, but note that Red Hat no longer includes software packages that only apply to other CPU architectures (and thus packages not applying to the x86 family were excluded). I did not include ``old'' versions of software, or ``beta'' software where non-beta was available. I did include ``beta'' software where there was no alternative, because some developers don't remove the ``beta'' label even when it's widely used and perceived to be reliable.
I used md5 checksums to identify and ignore duplicate files, so if the same file contents appeared in more than one file, it was only counted once (as a tie-breaker, each such file is assigned to the first build package it applies to in alphabetic order).
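The duplicate-elimination step described above can be sketched roughly as follows. This is only an illustration of the idea, not sloccount's actual code: the function names are hypothetical, and it assumes each build package is simply a directory whose path sorts in the same order as the package name.

```python
import hashlib
import os


def md5_of_file(path, chunk_size=65536):
    """Return the hex MD5 digest of a file, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def unique_files(package_dirs):
    """Map each distinct file content to the first package that has it.

    Packages are visited in alphabetic order, so a duplicate is credited
    to the alphabetically first package and ignored everywhere else.
    """
    owner = {}
    for package in sorted(package_dirs):
        for root, dirs, files in os.walk(package):
            dirs.sort()  # make the walk order deterministic
            for name in sorted(files):
                path = os.path.join(root, name)
                # setdefault keeps the first owner seen for this checksum.
                owner.setdefault(md5_of_file(path), (package, path))
    return owner
```

Counting lines only over the values of the returned mapping means identical file contents are counted once, however many packages ship them.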
The code in makefiles and Red Hat Package Manager (RPM) specifications was not included. Various heuristics were used to detect automatically generated code, and any such code was also excluded from the count. A number of other heuristics were used to determine whether a file was a source program file, and if so, what its language was.
Since different languages have different syntaxes, I could only measure the SLOC for the languages that my tool (sloccount) could detect and handle. The languages sloccount could detect and handle are Ada, Assembly, awk, Bourne shell and variants, C, C++, C shell, Expect, Fortran, Java, lex/flex, LISP/Scheme, Makefile, Objective-C, Pascal, Perl, Python, sed, SQL, TCL, and Yacc/bison. Other languages are not counted; these include XUL (used in Mozilla), Javascript (also in Mozilla), PHP, and Objective Caml (an OO dialect of ML). Also code embedded in data is not counted (e.g., code embedded in HTML files). Some systems use their own built-in languages; in general code in these languages is not counted.
I'll never use macros, functions, classes, or the stl again!
"Look, I wrote a program which does the exact same thing as another program, but mine is worth much, much more!"
Slashdot gets worse every day... Pipedot: News for nerds, without the corporate slant
Just try explaining it to your insurance company after your house gets robbed, or some idiot airport security inspector accidentally trashes your laptop.
Heck, given that theory, one fire should net me more than enough to retire on.
I love these kinds of stats.
Slashdot has, say, 100,000 US readers per day.
Each spends an hour reading slashdot when they should be working.
Let's say an average Slashdot reader is worth $40 an hour, and they read Slashdot on 300 days during the year.
That means Slashdot costs the USA $1,200,000,000 a year! Crikey! Don't tell Bush!
A shorter program that did the same thing as a longer program, but was more efficient, might have taken much more time/effort to code. I don't think this method could possibly take that into consideration.
Personally, I'd feel bad if I wrote a program which was just a bunch of spaghetti.
Microsoft puts so much code bloat into their programs...
Part of that $1bil could have helped feed a programmer's family and gone toward making a more stable OS. And with all the layoffs in the industry, don't you just feel awful for patronizing such software?!
So 10 billion lines of bad bloated code will be worth more than 10,000 lines of pure, clean and fast code?
But how many rupees?
Just think how much they could have saved if they had outsourced it to an Indian contractor!
To put it mildly...
In his paper, he uses the basic COCOMO model for estimating the cost. This model, quite frankly, sucks. Boehm's book even states, more or less, that the COCOMO model is only accurate to a factor of 10.
Since I no longer have the Boehm book, this quote from a google-found web page will have to do. This is a quote of a quote from Boehm's book, Software Engineering Economics:
"Basic COCOMO is good for rough order of magnitude estimates of software costs, but its accuracy is necessarily limited because of its lack of factors to account for differences in hardware constraints, personnel quality and experience, use of modern tools and techniques, and other project attributes known to have a significant influence on costs."
Basically, this means that the estimate could be anywhere from $100M to $10B in true cost.
At the very least, this kid should have stated which of the model variants he was using.
Better yet, he should have subdivided the source code into multiple categories: kernel+drivers, tools, productivity software, etc. etc., and then applied the various models to them.
Just my 2 bits.
BTW, here is the google-found page which has the quote I stole. Plus, it gives a nice, albeit brief, overview of COCOMO.
-d
Well, when I saw the tidbit on /., I thought, wow, a billion dollars worth of software in a Linux distro? That is not what this article says. It simply says that RedHat (would have) had to pay the developers a billion dollars to complete that much work. To find out how much it should probably cost, add some money for profit, and divide that by how many probable users there are. This would only make sense for Linux as a whole, and not just RedHat.
I just heard some sad news on talk radio - troller/crapflooder pwpbot was found dead in its basement this morning. There weren't any more details. I'm sure everyone in the Slashdot community will miss it - even if you didn't enjoy its work, there's no denying its contributions to popular culture. Truly a Slashdot icon.
if analyzing SLOC says nothing about developer contributions, efficiency, or effectiveness - then isn't estimating value based off SLOC fundamentally flawed?
i mean, you can't have it both ways. Either SLOC shows how productive programmers are, or it doesn't.
if it does - then get over the SLOC analysis in your job reviews.
if it doesn't - then you cannot even remotely accurately gauge monetary worth through SLOC.
good luck to the people trying to estimate worth of OSS. good luck to the people trying to estimate the worth of programmers.
i just don't know why people don't count 'Customer Problems Solved Over Time' as the end-all, be-all.
(and time and energy fixing software bugs doesn't count. that's not the customer's problem. it's the developers')
who cares how many SLOC are in a product. how many needs of the end user does it fulfill, and how long did it take to get done from the word 'go'?
yeah, you'd need to define customer needs much more carefully than most shops do... but isn't that part of the eXtreme Programming retinue
// "Can't clowns and pirates just -try- to get along?"
The cost analysis was done based on linux, however most of the code analysed in fact is for things that run on other platforms, and much of which was in development for years before linux 0.9 hit the 'Net.
So the measure of value based on who uses Linux includes everyone who uses linux-hosted apache servers. The more general case includes everyone who accesses servers that depend on (Perl, BIND, sendmail, mysql .... etc) or were/are developed using (X11, CVS, bitkeeper, emacs, gcc .... etc)
The economic value isn't small. That much I'm pretty certain of, just how big, well it works for me, I'll leave the analysis to the economists.
Linux is Linux, if One need clarify their dist: <Dist>/GNU Linux
bsds are of course just BSD
I kind of hope that nobody uses this to price software that they're selling to a company, lest they lose their credibility. There is no assurance that this guy did not lean toward making this software seem more valuable than it really is, thus making open source software more attractive (because you're getting something for nothing). I'd be careful using this in any other capacity than your home computer for the purpose of having fun.
On a similar note, do the prices seem accurate, for those of you who have used this thing?
Lack of eloquence does not denote lack of intelligence, though they often coincide.
This guy is correct. The story already appeared on /. a year ago.
If you use moderation to abreact your sexual impotence, then get viagra or stop moderating.
Good lord, taco, you should have known that ~7000 stories ago somebody posted this already!!!
</sarcasm>
Come on people, cut the guy some slack. I am sure you can't remember every story posted!!
Run the same SLOC figures against the statistics from the Function Points methodology and you get a different picture. You are looking at 2500 person years of effort, with a cost optimum development time of 6.5 years. However, to deal with the complexity involved you will need approximately 3000 average and 1500 above average developers (at average development rate you could expect a 13 year delivery!). Total price tag: around $2 billion (that's 2e9, in case your definition of billion is different).
Of course, this is still a very skewed figure. There is no accounting for the quality of code (at the end of such a complex development cycle, you could expect as many as 7 million defects!), and both FP and COCOMO estimate development effort inclusive of design work and documentation, which in OpenSource typically don't match those in mature commercial development environments (from which the FP and COCOMO statistics are derived).
There is also a huge, and invalid, assumption made by the author, regarding the application of COCOMO (and my FP calculations suffer the same problem). The complexity of a system is MORE than the sum of its parts. This is because developer productivity declines as system complexity increases.
At 10,000 FP, a developer is often only 60% as productive compared to 1,000 FP. The situation is obviously far worse at 300,000 FP (the entire distribution), yet the kernel itself only weighs in at around 20,000 FP. And even then, clear modularisation reduces complexity for individual developers. So it is grossly unfair to base calculations on the system as a whole.
The kernel (around 2.5 MLOC) as a single system would be a task for 300 skilled developers over around 3 years, while the Gimp (around 500 KLOC, still near the top of the list in size) would be looking at 50 developers over 18 months. More complex projects need relatively more time and more developers. Doing all these projects in parallel (assuming it were possible - which it isn't because of dependencies, and that's another factor) would take no longer than the most complex task (kernel = 3 years) and relatively fewer developers than estimated based on the complexity of that task (30 MLOC / 2.5 MLOC * 300 developers = max 3600 for entire distribution). Max cost: 3600 * 3 * $55k = $594 million.
And you're STILL not accounting for the fact that employing someone costs a lot more than just paying a salary. Which puts all estimates (mine and the author's) up.
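The upper-bound arithmetic above is easy to check with a few lines of Python. This is just the comment's own numbers plugged into a hypothetical helper; the `overhead` parameter is an addition to show how a loaded employment cost would scale the result, and its value is whatever you care to assume.

```python
def parallel_upper_bound(total_mloc, largest_mloc, largest_team,
                         years, salary, overhead=1.0):
    """Scale the largest component's team linearly to the whole system.

    overhead is a hypothetical multiplier for employment costs beyond
    salary; 1.0 means salary only, as in the figure quoted above.
    """
    developers = total_mloc / largest_mloc * largest_team
    cost = developers * years * salary * overhead
    return developers, cost


# 30 MLOC distribution, 2.5 MLOC kernel, 300 developers, 3 years, $55k each.
developers, cost = parallel_upper_bound(30.0, 2.5, 300, 3, 55_000)
print(f"{developers:.0f} developers, ${cost:,.0f}")
```

With salary alone this gives 3600 developers and $594,000,000; any overhead multiplier above 1.0 pushes the ceiling up proportionally.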
i-name =twylite [http://public.xdi.org/=twylite], see idcommons.net
Linux is free (as in beer) if your time is worthless.
</flamebait>
Because we don't know if it's off to the low or to the high. If his estimate was 10 times too low, it was really 10B; if it was 10 times too high, it was really 100M.
This sig under construction. Please check back later.
A proof point from AbiWord. I just ran the program over our abi-unstable directory. About 300,000 LOC; estimated cost to produce, about $10,000,000.
I also ran the program over the abiword plugins directory. Estimated cost to produce, $1,200,000.
Now I know from direct experience that building the main code base of the AbiWord Word Processor took about 100 times more effort than the plugins.
Cheers
Martin Sevior
AbiWord Developer
Oculus Habent is a newbie - he is really new to computers .
Priceless
if analyzing SLOC says nothing about developer contributions, efficiency, or effectiveness - then isn't estimating value based off SLOC fundamentally flawed?
1) SLOC says nearly *EVERYTHING* about developer contributions. After all, the SLOC is what the developer contributes.
2) Efficiency is a measurable metric, and can be as simple as (SLOC/MM)-(NumBugs/MM), where MM=Man-Month.
While there is a variance in the efficiency of programmers, for any given company a median efficiency can be determined. From this, a decent cost-estimate for SLOC may be determined.
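As a rough sketch of that metric and of how a median could feed a cost-per-SLOC figure (the function names, the record layout, and the idea of dividing a monthly cost by the median are illustrative assumptions, not an established method):

```python
from statistics import median


def efficiency(sloc, bugs, man_months):
    """(SLOC/MM) - (NumBugs/MM), as defined in the comment above."""
    return sloc / man_months - bugs / man_months


def cost_per_sloc(records, monthly_cost):
    """Derive a cost per line from the median efficiency.

    records is a list of (sloc, bugs, man_months) tuples, one per
    programmer or project; monthly_cost is the cost of one man-month.
    """
    typical = median(efficiency(*record) for record in records)
    return monthly_cost / typical


# Hypothetical history for three programmers.
history = [(1200, 20, 4), (900, 10, 2), (500, 100, 4)]
print(f"${cost_per_sloc(history, 10_000):.2f} per SLOC")
```

Using the median rather than the mean keeps one unusually fast or unusually buggy programmer from dominating the estimate.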
i just don't know why people don't count 'Customer Problems Solved Over Time' as the end-all, be-all.
That collected metric would have almost no utility, unless you could atomize the concept of a 'customer problem'.
"Well, it took us 6MM to create that web-based
accounting system, so it should take us about
the same to develop these kernel drivers"
Something like the above doesn't help anyone. It doesn't help the programmers who take part in recording the data; it doesn't help the managers plan and predict the product lifecycle; it doesn't help the customer in letting him know when to expect to see the next product release.
What you failed to do was drill down further in your analysis of the problem.
Let's say you just finished putting out product "X", which solved some customer problem. Now the customer wants product "Y" to solve some other problem. How do you estimate "Y" based upon "X"?
Answer: Break it down. "X" required the following capabilities: A,B,C, and D. You recorded and tracked the amount of time it took to accomplish each capability.
Now, you break down the customer problem, "Y", and determine what it would take to solve it.
If you did a good job at atomizing the customer problem on project "X", then you should have been able to come up with an average amount of time/AtomicProblem. Apply this metric and voilà, you should have a good idea about the scope of "Y".
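In code, the break-it-down approach might look like the sketch below. Everything here is hypothetical: the capability names, the tracked times, and the number of atomic problems in "Y" are invented to show the arithmetic, nothing more.

```python
def average_time_per_atom(tracked):
    """Average time per atomic problem from a finished project.

    tracked maps each capability (atomic problem) to the time it took,
    in whatever unit was recorded (person-days here).
    """
    return sum(tracked.values()) / len(tracked)


def estimate_project(atom_count, tracked):
    """Estimate a new project from its count of atomic problems."""
    return atom_count * average_time_per_atom(tracked)


# Project "X": four capabilities with recorded person-days (made up).
project_x = {"A": 12.0, "B": 30.0, "C": 8.0, "D": 18.0}

# Project "Y" is judged to break down into six atomic problems.
print(estimate_project(6, project_x), "person-days")
```

The estimate is only as good as the atomization: if "Y" contains atoms much harder than anything in "X", a plain average will understate it.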
Many people like to take the AtomicProblem and equate it to a SLOC estimate.
What SLOC counting does is try to establish a commonality among various projects so that future projects of various natures may be estimated using previous metrics. This is not perfect, but it should be used as an aid in determining overall project scope and costs.
i mean, you can't have it both ways. Either SLOC shows how productive programmers are, or it doesn't.
SLOC shouldn't be used to estimate programmer productivity. It should be used to estimate project productivity.
-D
Obligatory Simpsons quote:
"Oh, people can come up with statistics to prove anything, Kent. 14% of people know that."
Does anybody else find it worrying that the kernel is by far the largest component of RHL? I kinda expected it to be one of the smaller of the large projects; way smaller than the likes of KDE / GNOME / Gimp / etc..
I just wonder how much Debian GNU/Linux would have cost based on the same calculation, knowing that it now includes more than 10K packages.
If a corporation buys a Linux seat (or heck, downloads an ISO) then it has acquired an asset. Admittedly a digital one, but an asset nonetheless.
Now, if GE can revalue its pension assets upwards, when their value has gone down, then surely the corporation can revalue it to a 'market' rate of (say) $10,000 a seat.
Rolling it out to all the people in your organisation then, gosh!, your company is suddenly as profitable as Enron or WorldCom were.
Best of all, so long as you never run out of blank CDs, your company can continue to make massive profits.
--- My dad's political betting
I thought the value of a program (or any other noun) is related only to the amount of money that someone will pay for it ... If you can convince someone to pay $1,000,000 for linux, then it's worth $1,000,000. that's it.
a nifty little formula which analyzes the actual FUNCTION of a program to figure out how much it's worth is all well and good, but it doesn't mean anything. I bet the functional worth of Internet Explorer is quite a lot, but no one's willing to pay for it, so it's, in reality, worth nothing.
These stats, of course, are fun but entirely meaningless.
If you are going to take the entire design cost into one copy, ok, so let's also add the cost of the CD (probably five billion or so in development cost) and the cost of the Microprocessor used to beta-test: around 50 billion I am guessing. Quite an expensive copy of RedHat.
The serious point is: to be at all meaningful, "cost" needs to be divided by number of users over the lifetime of the product. I would love to see those stats (and compare them to MS).
I venture Linux would still outvalue MS on that basis (if only because there are fewer users).
Michael
---
BDOS ERR ON A:>
Note to Mr. Wheeler: when your shirt is the same color as the background of your web site, you might want to put a thin border around the picture with your favorite free image editing software.. though I'm wondering why exactly your picture is there at all..
Sloccount run on Slashcode 2.25 gives us this:
Total Estimated Cost to Develop = $ 996,916
I would have posted the entire output of the program, but unfortunately, their million-dollar lameness filter wouldn't let me!
No, Thursday's out. How about never - is never good for you?
Of course it cost a billion dollars to write the software everyone has on their machine. But Microsoft has $40 billion in the bank and collects $7-15 billion a year in revenues.
You do the math.
--Blair
His paper is valuable, priceless even, in that it is throwing a spotlight on a part of the Open Source phenomenon that has not yet come into public discussion.
While I don't know COCOMO, I accept that his numbers are highly suspect. But you have provided a range of accuracy that corrects for this. I am very confident that any reasonable assessment of the Linux development effort is going to be greater than $100 million and less than $10 billion.
So it is indisputable that Linux is a resource whose development effort exceeds $100 million.
And no reasonable person can question that this resource is now available at very low cost to anyone or any institution, on a global level.
It is difficult to see how anyone could not recognize that the use of this resource increases global wealth. Linux does make the world pie bigger.
I think that is the real story here. Linux is a tool, a lever, that has required at least $100 million of effort to develop, but which anyone can put to work for extremely low cost. I think this kind of phrasing needs to be brought to the attention of those who are being FUDded by groups that feel threatened by Open Source.
Just as I thought! Every copy of linux is costing the software industry over a billion dollars!
a gigabuck of non-business capable software = a gigabuck down the drain
does this get you thinking ?
One thing I got was that the number of lines of code in Mozilla was about the same as everything else (minus the kernel) put together...
(+1 Funny) only if I laugh out loud.
That's "NMUBER OF TEH BESAT"
So does this mean that all the people who had their place raided and their Linux box taken have now incurred $1 billion in damages???
Software value should not be calculated by the amount vendor spends, but by the amount "user gains".
Linux saves software cost. Also linux saves you from NIMDA. But linux means more expenses in tech team.
So value of linux is =
Value of Windows
+ Value that would be lost due to NIMDA, etc
- Cost of tech department difference
Which I guess is "much" more than $1G in total.
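The formula above can be written out as a small function. This is only a sketch of the commenter's arithmetic; the function name and the dollar figures are hypothetical placeholders, not measured values.

```python
def linux_value(windows_value, malware_losses_avoided, extra_support_cost):
    """Value of Linux to a user, following the three-term formula above.

    All three inputs are in the same currency unit. The names are
    illustrative; the original comment gives no numbers.
    """
    return windows_value + malware_losses_avoided - extra_support_cost


# Hypothetical per-site figures in dollars, chosen only to show the arithmetic.
example = linux_value(windows_value=300,
                      malware_losses_avoided=500,
                      extra_support_cost=200)
print(example)
```

The result is only larger than the Windows price when the avoided losses exceed the extra support cost, which is the point the comment hinges on.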
I mean, come on, sure, some of this stuff was written by the finest minds in the industry, who could easily have fetched premium rates for their work, but chose not to for "the good of humanity" (or some other variation of the rationalization). Then there's the contributions from people who might not be able to hold down a job bussing tables at Denny's. Those are two extremes. You could easily compute an average cost from hours spent there.
;)
But what could you sell the software for?
Nothing. Its market value is zero - because its market is a Linux box, and we all know that nobody will pay for software on Linux, right?
These are my friends, See how they glisten. See this one shine, how he smiles in the light.
This method severely underestimates Perl programmers' efforts! :-P
I suffer from attention surplus disorder.
The idea is that the inaccuracies go both ways, and over a whole lot of projects they even out. If you get enough data then the low precision(* won't matter if the accuracy(* is good. (Or was it the other way around?)
*) Yes, I'm using the math definitions of these words, not the dictionary ones. Because the dict. ones suck.
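A quick simulation shows the effect being described. It is a sketch under invented assumptions: project costs are drawn uniformly from a made-up range, and each estimate is wrong by a random factor. The `spread` parameter stands for low precision (scatter) and `bias` for poor accuracy (systematic error).

```python
import random


def total_relative_error(n_projects, spread, bias=1.0, seed=0):
    """Relative error of the summed estimate over n_projects projects.

    Each estimate equals the true cost times a random factor drawn
    uniformly from bias * [1 - spread, 1 + spread]. All distributions
    here are assumptions made for illustration.
    """
    rng = random.Random(seed)
    true_total = 0.0
    estimated_total = 0.0
    for _ in range(n_projects):
        true_cost = rng.uniform(10_000, 1_000_000)
        factor = bias * rng.uniform(1 - spread, 1 + spread)
        true_total += true_cost
        estimated_total += true_cost * factor
    return abs(estimated_total - true_total) / true_total


# Unbiased but noisy estimates: the error of the total shrinks with more projects.
noisy_many = total_relative_error(10_000, spread=0.5)
# Biased estimates: the error stays near the bias no matter how many projects.
biased_many = total_relative_error(10_000, spread=0.5, bias=1.5)
print(f"unbiased: {noisy_many:.4f}  biased: {biased_many:.4f}")
```

So scatter averages out over many packages, but a systematic bias in the model does not, which is why the accuracy side of the question matters more for a whole-distribution total.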
FRA: STFU GTFO
* The largest components (in order) were the Linux kernel (including device drivers), Mozilla (Netscape's open source web system including a web browser, email client, and HTML editor), the X window system (the infrastructure for the graphical user interface), gcc (a compilation system), gdb (for debugging), basic binary tools, emacs (a text editor and far more), LAPACK (a large Fortran library for numerical linear algebra), the Gimp (a bitmapped graphics editor), and MySQL (a relational database system).
:-)
Since the second largest part of the system is now Mozilla and not gcc, maybe we should stop calling it GNU/Linux and start calling it Mozilla/Linux.
Vanguard
That which does not kill me only makes me whinier
This method of software cost estimation is patently ridiculous. I can't even imagine how anyone could take him even remotely seriously.
Counting MySQL, PHP, etc. lines of code as part of the OS is misleading -- did he count MS SQL, Access, etc. and other pieces of software which could be bundled with a particular flavor of Windows? Consumer Windows OS distribution contains a lot more application code (e.g. Office bundled, vendor-supplied drivers/goodies/etc.) than the 'stock' Windows code numbers listed in his comparisons. Further, Windows does not contain individual drivers for every single piece of hardware out there; it has some generic drivers and then relies upon your vendor to supply the drivers for them, which is typically free. How many vendor-supplied drivers vs. homebrew are in Linux?
Further, he bases his cost on the assumption that Red Hat 7.x was a complete rebuild -- as if every single line of code had been rewritten since the previous version -- so his claim that so-many-million man-minutes went into making it is wrong. Someone invented the wheel many (tens of?) thousands of years ago. I bet a lot of man hours have been spent refining the wheel. Do auto manufacturers include that in the cost of cars? Do they make you pay for 10,000 years of refinement from the rock-with-a-hole-in-it to wagon wheels to the run-flat tires of today? No, they include the cost of the materials that went into making it and certainly *some* R&D time, but that cost calculation is determined from various sources, not 'how many molecules of rubber are in my tire'.
His LOC calculation is misleading as well.
if( something )
{
    stuff
}
else
{
    stuff
}
Contain 4 superfluous lines of code. According to his calculations I did 2x more work than if I wrote it like this:
if( something )
    stuff
else
    stuff
If you're frisky you can write it in a single line:
if( something ) { stuff } else { stuff }
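The three layouts above can be compared mechanically. This sketch assumes a simplified physical-line count (every non-blank line counts) and contrasts it with a layout-independent word count; both functions are illustrative and are not taken from the tool under discussion.

```python
import re

BRACED = """if( something )
{
    stuff
}
else
{
    stuff
}"""

COMPACT = """if( something )
    stuff
else
    stuff"""

ONE_LINE = "if( something ) { stuff } else { stuff }"


def physical_sloc(src):
    """Count non-blank lines: a simplified physical-SLOC measure."""
    return sum(1 for line in src.splitlines() if line.strip())


def word_tokens(src):
    """Count identifiers and keywords, ignoring braces, parentheses and layout."""
    return len(re.findall(r"[A-Za-z_]\w*", src))


for name, src in [("braced", BRACED), ("compact", COMPACT), ("one-line", ONE_LINE)]:
    print(name, physical_sloc(src), word_tokens(src))
```

The line count gives 8, 4 and 1 for the same logic, while the word count gives 5 in every case. A physical-line measure is therefore sensitive to brace style, although across a large code base with mixed styles the effect is a roughly constant factor rather than a per-project surprise.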
Why this article was even mentioned here is beyond me. If I could moderate it I'd put it at (-1: Stupid).
Thanks,
--
Matt
So either I'm doing enough work to be worth several hundred thousand dollars a year, or this thing is complete nonsense.
How can we continue to believe in a just universe and freedom to eat crackers if we have no ale?
Who the hell let these people moderate? This guy obviously meant his post to be funny, yet someone modded it as flamebait? Sheesh, people, get a sense of humor!
This app produces nonsense. I have a project here which I have worked on for half a year, and it is declared to be the product of a year and a half of coding by three people, and to be worth 200000. They prolly ran the prog on itself, saw how much crap it is, and raised the values to save their self-esteem...
Wow! Apparently I can do the work of four normal programmers... time to talk to my boss about a raise!
The cake is a pie
- David A. Wheeler (see my Secure Programming HOWTO)
"Estimating the Size/Cost of Linux"
Let's see now: the size is five letters (thank god I don't have to use my other hand!) and the cost is of course "Free" (look ma... no hands!)
From excellent karma to terrible karma with a single +5 funny post...
We worked out that it took 8 MAN YEARS to write some code.
That's all well and good, but it's been mostly me writing it on 37.5-hour weeks for the past 10 months.
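For anyone wanting to reproduce this kind of comparison, here is a sketch of the Basic COCOMO effort formula. The coefficients 2.4 and 1.05 are the standard organic-mode values and are assumed here to be what the tool uses by default; the line count is hypothetical.

```python
def basic_cocomo_person_months(sloc, a=2.4, b=1.05):
    """Basic COCOMO effort in person-months: a * KSLOC ** b.

    The defaults are the organic-mode coefficients, assumed (not confirmed
    by this thread) to match the tool's default model.
    """
    return a * (sloc / 1000.0) ** b


# Figures quoted in the comment above.
estimated_person_years = 8.0
actual_person_years = 10.0 / 12.0   # ten months, mostly one developer
overestimate = estimated_person_years / actual_person_years
print(f"estimate is {overestimate:.1f}x the reported effort")

# Hypothetical line count, only to show how the model scales.
hypothetical_sloc = 30_000
months = basic_cocomo_person_months(hypothetical_sloc)
print(f"{hypothetical_sloc} SLOC -> {months / 12:.1f} person-years")
```

Because the exponent is above 1, the model charges more per line as a code base grows, so a single productive developer on a mid-sized project will always look cheaper than the estimate.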
This is a big "duh" in my book.
Smegma.
What would Microsoft pay to buy up an exclusive right to use all of the Linux distributions? Maybe $1B is on the low end?
I have the COCOMO II book, and I have used the COCOMO model for certain projects. I agree that it is not appropriate here. COCOMO was designed with a narrow focus in mind, and applied best to repeatable projects in a structured work environment. It requires you to estimate parameters for factors such as "Programmer Unfamiliarity", "Precedentedness", "Development Flexibility", "Team Cohesion", "Process Maturity", "Multisite Development", etc. Each of these fudge-factors makes it extremely difficult to correctly apply the model to someone else's work.
Also, each of these factors is likely to be different for each major component.
"I was unable to find a publicly-backed average value for overhead, also called the 'wrap rate.' This value is necessary to estimate the costs of office space, equipment, overhead staff, and so on. I talked to two cost analysts, who suggested that 2.4 would be a reasonable overhead (wrap) rate."(from here)
He is using an average overhead rate for a large corporation. He forgot to take into account the fact that Open-Source developers (generally) don't get office space or health insurance or secretaries. They use their own equipment in their own homes. So a more reasonable overhead rate for this project would be close to 0.1.
So taking all of this into account, he's probably off by a factor of more than 100. (If you want to know how accurate he was, compare his estimate to the actual cost of developing a Linux distro... ;)
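The effect of the overhead rate alone is easy to check. This sketch assumes cost is computed as effort times salary times overhead, which is how the quoted passage describes the wrap rate being used; the effort and salary figures are hypothetical and cancel out of the ratio.

```python
def development_cost(person_years, annual_salary, overhead_rate):
    """Cost = effort * salary * overhead (wrap) rate. Assumed cost model."""
    return person_years * annual_salary * overhead_rate


# Hypothetical effort and salary; only the two overhead rates come from the thread.
person_years = 1.0
annual_salary = 50_000

corporate = development_cost(person_years, annual_salary, overhead_rate=2.4)
volunteer = development_cost(person_years, annual_salary, overhead_rate=0.1)
overhead_ratio = corporate / volunteer
print(f"overhead change alone: factor of {overhead_ratio:.1f}")
```

On these assumptions, swapping 2.4 for 0.1 accounts for a factor of 24. The rest of a "more than 100" discrepancy would have to come from the other parameters mentioned above, not from overhead.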
While it might have made interesting headlines, I see little value in the actual number.
Just ran it on MS windows XP. It came to $0.39
Damn!
Does that mean that I've been * coughing* paying too much?
"I used to have that really cool,funny sig
thanks, i'll file that along with 'your computer wouldn't work without electricity' and other such jewels of insight...
quite a bit of mozilla code is written in javascript, and it isn't counted at all (or counted as c++? dunno)
Why start with what it cost to develop Windows? As far as I can see there are very few original ideas in Windows in terms of the GUI. If there is one OS that Windows was inspired by then it was surely MacOS! So, if you want to get closer to the billion dollars - add Apple's development costs to the date that Windows 3.0 was developed. Or would that take you over a billion? Feedback on my figures would be appreciated. I would hate to sound like an advocate of MacOS - but I believe it should be respected and left on the shelf for Microsoft to poach, whereas Linux should be revered for what it stands for.
------ IWX222 ------
the solution to the problem caused by the solution to the problem caused by the solution...