Software Defects - Do Late Bugs Really Cost More?
"If you're a software engineer, one of the concepts you've probably had driven into your head by the corporate trainers is that software defects cost logarithmically more to fix the later they are found in the software development life cycle (SDLC).
For example, if a defect is found in the requirements phase, it may cost $1 to fix. It is proffered that the same defect will cost $10 if found in design, $100 during coding, $1000 during testing.
All of this, to my knowledge, was started by Barry Boehm in papers[1]. In these papers, Mr. Boehm indicates that defects found 'in the field' cost 50-200 times as much to correct as those corrected earlier.
That was 15 years ago, and as recently as 2001 Barry Boehm indicates that, at least for small non-critical systems, the ratio is more like 5:1 than 100:1[2].
[1] - Boehm, Barry W. and Philip N. Papaccio. 'Understanding and Controlling Software Costs,' IEEE Transactions on Software Engineering, v. 14, no. 10, October 1988, pp. 1462-1477
[2] - Boehm, Barry and Victor R. Basili. 'Software Defect Reduction Top 10 List,' Computer, v. 34, no. 1, January 2001, pp. 135-137."
At any stage, you can only find bugs that are introduced at or before that stage. So while fixing a requirements bug in the coding phase might be more expensive than fixing it during the requirements phase, fixing a coding bug during the requirements phase is a tricky operation that I'll leave as an exercise for the reader :-)
Of course, if you omit some of these phases completely, you won't introduce any bugs during them. That's why the JFDI(*) methodology is so popular.
(*)Just F*cking Do It
Every bloody emperor has his hand up history's skirt [Peter Hammill/VdGG]
There's plenty of proof out there. Even "ancient" but worthy texts like "The Mythical Man-Month" discuss this one.
The size of the project and the nature of the bug really combine to drastically affect the outcome.
Personally, we have just spent about a year tracking down a particular set of bugs (probably not all nailed yet) which showed up post-live. When we were pre-live these would undoubtedly have been easier to fix, but something else that we could have done at that point would have been to improve our design, which would have nuked most of the bugs completely. Once we are in production, however, we have this forward/backward compatibility heuristic tying one hand behind our backs, and redesigning the thing gets much, much bigger.
But that's just anecdotal, of course.
Defects are easier to find in a concrete product than in a conceptual design. Also, many bugs will be introduced in later stages. Therefore, even a foolproof design may evolve into a buggy implementation. So surely there is a trade-off between looking for "bugs" too early and fixing bugs too late.
Nevertheless a trainer is correct in stressing the golden think-before-you-code rule - especially when instructing inexperienced coders.
--
Every program has two purposes -- one for which it was written and another for which it wasn't.
Looks more exponential to me.
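To make the "exponential" point concrete, here is a minimal Python sketch that uses only the $1/$10/$100/$1000 figures from the post. The function names are made up for illustration. A constant ratio between consecutive phases is what makes the progression exponential (geometric); it is the logarithm of the cost that grows linearly with the phase.

```python
import math

# Phase names and dollar figures are the ones quoted in the post.
PHASES = ["requirements", "design", "coding", "testing"]
COSTS = [1, 10, 100, 1000]


def step_ratios(costs):
    """Ratio of each phase's cost to the cost in the previous phase."""
    return [later / earlier for earlier, later in zip(costs, costs[1:])]


def log10_costs(costs):
    """Base-10 logarithm of each cost, rounded to hide float noise."""
    return [round(math.log10(c), 6) for c in costs]


if __name__ == "__main__":
    print(step_ratios(COSTS))   # constant ratio, so the growth is geometric
    print(log10_costs(COSTS))   # evenly spaced, so log(cost) is linear in phase
```

In other words, the cost is exponential in the phase number, and the phase number is logarithmic in the cost.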
As I recall there was a conference paper in Extreme Programming Perspectives which describes an "infection" model for bug creation, fixing, etc. They were trying to model exactly the effect you describe to see if they could (in a model) find any justification for XP's argument against the increasing cost of bugs through phases. Again, just from memory, they do try to validate the model against figures from real studies.
There's also material in Watts Humphrey's book on the Personal Software Process (about as far from XP as you can get). That book is illustrated throughout with statistics about students who tried to complete the exercises in the book, including in Chapter 13, where there's a section on "The Costs of Finding and Fixing Defects."
Compare the cost of testing, then over-the-air updates to a set of mobile phones & associated risk management
to
the cost of just building and shipping new code
that has yet to undergo testing or launch.
To give you an idea, managing the testing and over-the-air upgrading of software in mobile phones can become a new project in its own right, with all the associated monitoring and overheads.
Fixing the bug of a pre-launch project can be a 1 minute job.
Never forget that complexity accumulates. Fixing the bug itself probably costs about the same at every stage, but other costs are introduced as the project moves along, and peak after the software has been deployed.
A bug found after deployment has costs associated with it that a bug found during coding does not:
The cost of finding and fixing the bug may be negligible compared to other costs.
Another aspect of the issue is the nature of the bugs you find late. In my experience, bugs that survive testing and deployment tend to be either bugs in requirements or pretty subtle bugs that slipped through testing, and both are more expensive than the type of bugs commonly detected early on during development.
The guy in Fight Club worked out whether the cost of a recall and fixing the fault was going to be greater than the cost of litigation.
I would expect the same kind of factors come into play when the product is software instead of hardware. So why not try google
Sometimes it costs less to pay a person to manually correct data that is incorrect due to a fault in the core of a product; sometimes it costs less to do a rewrite.
thank God the internet isn't a human right.
Most recently I've been tracking down an error in our system. After nearly a month of trying various things, I found the cause of the error. In this case, two years ago the hardware engineer building the FPGA and DSP programs didn't bother to fix the [relatively simple] design problem. Rather than give all communications the same format, a few commands differ substantially from all others (different responses in certain circumstances, for example).
The problem made it into the PC software that interfaces with the board. The problem is documented in several [maybe 20?] bugs of the software that works between the PC and the external device. The problem is documented in at least 50 bugs in a port of that PC software. It has been in production for several years, and implemented by external companies (which I feel sorry for, due to the complexity of the communications bug).
Now we're working on a completely new FPGA/DSP board to replace the earlier board. Design changes prevent us from directly implementing the bug in the new design, although otherwise the communication protocols are the same. Implementing the same malformed communications will mean breaking the simple straightforward design and carefully implementing a set of 'design exceptions' (read: 'bugs').
It would have taken one engineer an hour or so to fix this thing when they first saw it. It would have taken both teams a few days to fix it when writing the PC to DSP interface (~1 FTE month). It would have taken a few weeks to fix it when writing the port, requiring changes to the PC software and the DSP (~1 FTE year). If we choose to fix the error now, it will probably result in 2+ FTE years of work to just fix everything, and more time for regression testing every old piece of software for this one bug. If we choose to leave it in, we will devote at least that much time to evaluating, implementing, and testing the old errors. Not to mention the continued maintenance work when the eventual bugs are found in the new board.
Now we're faced with a tough financial decision: do we spend a month or more carefully re-creating and testing the 'design exceptions' (probably 3-5 FTE years in total), or do we do it 'the right way' and break both our own and our customers' software? (Again, several FTE years, but potentially losing faith with the customers.)
This particular bug could have been prevented by about $50 of work. It has now cost the company tens of thousands of dollars, and will probably cost a few hundred thousand before all is said and done.
Now, let's throw some financial ethics into the $50 --> $5,000 --> $50,000 --> $500,000+ problem: The engineer was in a hurry to fix the problem before a company-imposed deadline. Is that engineer responsible for the enormous financial cost? If so, how much? If not, why not? It can be argued that his negligence caused a half-million dollars in damages. It can be argued that the engineer was responsible for $50 but the team was responsible for allowing it to grow. It can be argued that this is a regular business cost due to the fallibility of engineers' designs.
This begs the question:
How responsible are any of us for the errors we introduce?
frob
//TODO: Think of witty sig statement
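For a rough sense of scale, the effort estimates in frob's post (an hour, ~1 FTE month, ~1 FTE year, 2+ FTE years) can be put on a common footing. This is only a sketch: the conversion of 160 working hours per FTE month is my assumption, not a figure from the post, and the stage labels are paraphrased.

```python
# Assumption (not from the post): one FTE month is about 160 working hours.
HOURS_PER_FTE_MONTH = 160
HOURS_PER_FTE_YEAR = 12 * HOURS_PER_FTE_MONTH

# Effort estimates as described in the post, converted to hours.
ESTIMATES = [
    ("when first seen", 1),
    ("writing the PC to DSP interface", HOURS_PER_FTE_MONTH),
    ("writing the port", HOURS_PER_FTE_YEAR),
    ("now, with the new board (2+ FTE years)", 2 * HOURS_PER_FTE_YEAR),
]


def escalation(estimates):
    """Return (stage, hours, multiple of the first-stage effort) tuples."""
    baseline = estimates[0][1]
    return [(stage, hours, hours / baseline) for stage, hours in estimates]


if __name__ == "__main__":
    for stage, hours, factor in escalation(ESTIMATES):
        print(f"{stage:<40}{hours:>6} h {factor:>8.0f}x")
```

Under that assumption the last stage comes out in the thousands of times the original hour of work, which is well beyond the 10x-per-phase rule of thumb for this one bug.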
"The exponential cost curve is mostly in detecting and communicating the Mistake and naming the change that is to be made. XP cannot change that curve, and indeed, XP takes that increasing cost curve neatly into account. So the first lesson I get is that one should not base a defense of XP on the ABSENCE of the curve, but rather on the PRESENCE of the exponential cost curve."
Even disregarding the direct developer cost of finding and fixing the bugs, take a look at a recent example. Enter the Matrix seems to have been plagued with bugs (although I must admit that I haven't played it yet myself). Its "reputation" is now far from good, and so the bugs have caused a large loss in potential sales.
Now a hypothetical example. You work for a large company, and you're looking for some enterprise-level database software (HypotheticalDB). You get some that has been hyped, and spend a fair amount of time learning it, and developing your solution. When you start trying to use it, though, there are a few major bugs that make it unusable for practical purposes. You eventually switch programs. Now, the company that makes HypotheticalDB will have lost money in some sales (as the company expands), and likely support, too. But the major hit is this: HypotheticalDB 2.0 comes out, which is a lot less buggy than 1.0. However, would you really trust that program again? Also, you've already set up your solution and are running it now; there would be a large cost for you to switch over, too. So HypotheticalDB has lost money in software licenses, support, and future sales of that product and probably others. And that's excluding the actual development cost of finding and fixing the bugs.
From what I have read, Oracle's founders had the best solution to the problem of customers holding off buying until version 2.0: "This first Oracle was named version 2 rather than version 1 because the fledgling company thought potential customers were more likely to purchase a second version rather than an initial release."
Depends on the circumstances.
;-)
If the engineer was rushed through the design and nobody had the time to check his designs, the company gets what it deserves.
However, if the engineer skimped on his work (for whatever reason) and the team failed to check his work, I think the team would share some kind of responsibility.
If the engineer made the mistake and willfully left it in, he and the team should both be partially liable (he because he left an obvious flaw in, and his team for not checking it).
This is also why I normally have somebody (as in my boss) check my work and/or discuss things that may be flawed.
This way I can't be fully responsible (usually I'm just responsible for fixing the problem).
I do notice how much emphasis is placed on automated testing and better designs around here, which makes a software engineer like me happy.
If you have a good logical design that compartmentalizes each functional unit of your code (what I'll call "well-factored"), how long should it take to fix any one bug? For a typical app, even of pretty hefty size, you should, in theory, be able to run to the exact object, swap out what's broken, and *poof*, every place that functionality is needed is good to go. XP et al really do lose a lot of time in the overhead it takes to keep two people on any programming task, unit test, and the rest. You might be nearly guaranteed nice code, but what's your opportunity cost? In short, it's having two coders hacking about twice as much on what, if they're mature enough, should be well-documented, modular code!
Now we all know *poof* is not the case, and we all know that a well-factored system is about as hard to come by as nirvana (which means each fix requires ripping out a chunk of code), but the argument is still a valid one. Unless you have a huge system, where perhaps someone's "fixed" a bug by hack on top of hack ("Hrm, Bob's addFunction always returns a number one too low. Instead of bugging Bob, I'll just add one to the result in my function."), bugs today aren't like bugs in pre-object oriented days. If coders in the 80's had the debug tools and languages we have today... Let's face it, it's much easier to create an Atari 2600 game today than it was when you had to burn to an EPROM to test on hardware each time and print out your code to review it.
The bottom line is whether it's more cost-effective to prevent 99.44% of bugs up front than it is to fix the extra 10% that slip through. I believe the original post is simply suggesting that the cost of fixing on the backside is dropping considerably, especially compared to what the same results would've required decades ago, and that is, honestly, a good point.
(Remember, this isn't upgrading code -- might be awfully tough to make code that's slapped together change backends from, say, flat files to an RDBMS; this is just bug fixing to make what you've got work *now*. But XP tells us not to program thinking that far down the road anyhow, so future scalability is another topic altogether.)
It's all 0s and 1s. Or it's not.
If POP3 could have looked forward and seen the SPAM and Forged header abuses, security could have been part of the standard. Now that POP3 and IMAP mail is everywhere and forged headers are also everywhere, changing the de-facto standards is a big thing. Making the switch to something more robust will be a long and painful transition. Everything will be incompatible for a while.
It will be as easy as getting the US to switch to the metric system or transition with the rest of the world to driving on the left side of the road. Both would be much cheaper if they were implemented in the beginning instead of attempting a transition later.
The truth shall set you free!
Uh.. that depends on the bug. A bug where the grammar and spell checker are switched has a small initial cost to the user, but once they figure it out, it's fine. Fixing it should be near minimal cost. Something in an errata to the manual.. if there was a printed one, and poof. A software bug that costs near nothing.
If it's a bug where something is off by a dollar per 100 or so transactions, that is hella costly. Both to find if it's not consistent, customer support and all of the other efforts to fix it.
-
ping -f 255.255.255.255 # if only
The rule is, 90% of the bugs take up 90% of your budget, and the remaining 10% take up the other 90% of your budget. This goes the same for time before deadline.
Yeah, it is so cheap having those MS Windows worms running around.
I can't see how it would only cost 5 times as much to get millions of users to patch their system, account for their lost time, and write the patch.
I have some problems with the way you have reported this story (and maybe I'm taking it personally 'cause I do hardware).
You say the protocol has exceptions instead of always being the same. Do you KNOW that the exceptions were put there to get around a bug? How do you KNOW that a fix for the bug existed - maybe that fix was the addition of the protocol exceptions because for technical reasons there was no other solution available to the engineer. Do you KNOW that the hardware engineer saw the bug - or even defined it as a bug?
Let me give you an example of how the hardware guy might have been constrained. He might not have had enough time to fix the problem otherwise. He might have simply been out of room in the FPGA to implement the fix in another manner. A decision might have been made by management to fix the problem a certain way when presented with choices.
All of these are realities in the world of hardware.
My whole point is that there is almost always more than one side to such a story!
As often as not, schedules & resources limit the types of fixes that can occur in a program as it nears production. They DO get more expensive to fix at this point because it often means that the steps that occur between design and production have to be repeated (like layout of the chip in hardware). As soon as you near product release the decision to fix a bug becomes more a matter of "is it a show stopper or not?" Can we "program" around the bug instead of fixing the hardware?
I think the original problem in this post should have clarified the space the question was being asked about. Maybe software production costs are lower than hardware (heck I know they are.) To make a mask set for a 0.13 chipset costs perhaps a million dollars. You ARE going to think twice before you decide to make a hardware fix at that kind of cost.
Have you compiled your kernel today??
Coding bugs are generally not too tough to fix (though sometimes hard to find). Design bugs are the killer. If you discover a design bug after implementation, you might need to change or even rewrite big slabs of code. The logarithmic estimate is probably a worst case analysis, not an average case. But without a doubt, design bugs that make it into production are bad stuff. That's why software engineers are either grey-headed or bald. :-)
At the time those numbers were calculated, the software development process was very different from today. It was harder to distribute software, harder to deploy updates, harder for developers to get information about errors in the field. Testing the next release was a lot more critical because if a bug did exist it might not be possible to fix for several months until the next release could be sent out via floppy or mag tape to each customer.
Today most people download their software through the Internet, and can get patches just as fast, even automatically as they are posted. Tools like Windows Error Reporting, Quality Feedback Agent, and BugToaster make it easier to detect and prioritize bugs based on their frequency of occurrence in the field.
So with all those changes, it's still 15 times more expensive to fix a bug after release? Does that take into account the time value of money, the value of early user feedback, or lost opportunity costs?
Typos: Simple misspellings of words. Infrequent, easy to detect, easy to fix.
Writos: Incoherent sentences. More frequent, hard to detect, harder to fix.
Thinkos: Conceptually bonkers. Very frequent, subtle and hard to detect; almost impossible to fix.
Most 'late' bugs that I've seen in software projects belong in the last category - a lack of design or the failure to make a working mock-up leads to 'thinkos' which are only obvious when the application is nearly completed. These are expensive to fix.
If you define "later phase of the project" as a point further down the line on a scale of code complexity, then it's obviously true. Rooting out a bug is much easier when the codebase is smaller and simpler than when it has grown into a huge complex behemoth.
11*43+456^2
I think that the accumulated cost factors for late bugs are exaggerated by lumping in major design flaws with simple bugs. Surely a bug found during the coding process itself is cheapest to fix (thus XP's pair programming and test-first methods) because the hunt isn't going to be as large. However, having worked on the same project for three and a half years, we stumble over hidden bugs from time to time, and they really are not much more difficult than recent bugs to fix.
Where we find ourselves paying a premium is changing design decisions. However, as we have been following XP incremental design, even that isn't that bad, and frankly the most expensive design corrections we have had revolve around over designing. Customer driven iterations have been much less likely to be changed than our brilliant analysis.
Sig under construction since 1998.
According to the engineer, the actual thinking for the two communications problems was: "Let's just make the header bigger by one byte for these 'extensions', we'll go in and add new commands for them later", but he never considered that that next byte may be legitimate data in the shorter command since it was to be temporary. The other was "Adding a new response code will require a full rebuild, but this is just a small test for debugging. We'll just prepend a number to one of the other failure codes, just for testing." Both 'temporary' solutions were left in place, and duplicated by a few interns or when they added a few more 'extensions'; his practice of using specific return codes has evolved to a selection of 4 possible return styles. All this because he wanted to avoid a few minutes of compile time!
In this case, it is entirely contained within the flash memory shipped to customers, so we could easily fix it and declare all old versions deprecated -- but not until correcting every piece of software, which will take a lot of time.
For now it has evolved into something of a tribute to ad-hoc design: we have either [command][data][crc] or [command][ext][data][crc] where the value of [ext][data] is occasionally valid [data] for the basic command, leading to ambiguity: Is it the basic command with 1 or 2 as the first byte, or the extended command? For responses, we have a choice of [command][status] or [command][#][status] or [command][ext][status] or [command][ext][#][status], where [#] is the function-specific error code being returned. The latter is easier to check for than the former, but both are a continual source of flaws.
frob
//TODO: Think of witty sig statement
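The framing ambiguity frob describes can be shown with a toy parser. This is only a sketch under stated assumptions, not the real protocol: I assume a one-byte command, an extension byte of 1 or 2 (the values mentioned in the post), variable-length data, and a one-byte sum-modulo-256 checksum standing in for the actual CRC. All names here are hypothetical.

```python
from typing import List, NamedTuple, Optional

EXT_VALUES = {1, 2}  # assumption: extension bytes are 1 or 2, as in the post


class Parsed(NamedTuple):
    kind: str            # "basic" or "extended"
    command: int
    ext: Optional[int]   # None for the basic form
    data: bytes


def checksum(body: bytes) -> int:
    """Hypothetical stand-in for the real CRC: sum of bytes modulo 256."""
    return sum(body) % 256


def build_frame(body: bytes) -> bytes:
    """Append the checksum byte to a [command]...[data] body."""
    return body + bytes([checksum(body)])


def possible_parses(frame: bytes) -> List[Parsed]:
    """Return every interpretation of the frame that the format allows."""
    if len(frame) < 2 or checksum(frame[:-1]) != frame[-1]:
        return []  # too short or corrupted
    command, rest = frame[0], frame[1:-1]
    parses = [Parsed("basic", command, None, rest)]
    # The same bytes also read as [command][ext][data] when the first
    # data byte happens to look like an extension byte.
    if rest and rest[0] in EXT_VALUES:
        parses.append(Parsed("extended", command, rest[0], rest[1:]))
    return parses


if __name__ == "__main__":
    frame = build_frame(bytes([0x10, 0x01, 0xAA]))
    for parse in possible_parses(frame):
        print(parse)
```

Because the checksum covers the same bytes either way, it cannot tell the two readings apart; a receiver needs out-of-band knowledge (or a dedicated command code for extensions) to decide.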
You mean it raises the question.
Well, past a certain point, you can get your customers to do the testing, and then have them pay for fixes through maintenance and consulting.
The nice point about it : it's also a good way to retain customers and charge them after they paid good $$ for the initial release.
I've been working for quite a few software companies by now, and even though this has never been an explicit policy, there seems to be a tacit agreement that meeting FCS deadlines is far more important than delivering bug-free software.
As long as the customers seem to accept this, and don't find competing products with better practices, I see no particular evil in this behavior: after all, we're still trying to deliver the best software we can...
I used to do embedded programming where it was really costly to fix in the field.
Here's a similar project's repair estimates (pdf). Mind you, this product cost 1000x our product, but since they were at similar customer sites, the repair cost wouldn't be significantly different.
Service trip #1 = $413 million
Service trip #2 = $497 million
Service trip #3 = $547 million
Service trip #4 = $400 million
(note: these prices don't include airfare)
In fact, it would be far cheaper to just toss out our old hardware and start from scratch ($13 million total costs) than it would be to try to fix it in the field.
HIV Crosses Species Barrier... into Muppets
Back when you had to physically send an employee to fix the problem after it shipped, or to send replacement ROMs, it would have been 100:1. Now that everybody assumes bugs found after ship are par for the course and builds in software/firmware upgradability over the 'net, it's probably more cost effective to ship with bugs and fix them later, when you factor in the opportunity cost of delaying shipment to be absolutely sure there are no bugs. Many companies appear to operate that way these days (cough, microsoft, cough). The only downside seems to be that sending customers an email telling them they need to upgrade because what you sold them is a crock of manure could be damaging to your company's reputation. However, software companies are working on a fix for that too... they'll simply update your software for you without bothering to tell you about it! Isn't it wonderful now that almost every computing device is connected to the 'net?
"Freedom means freedom for everybody" -- Dick Cheney
Would you buy a car from the same company again if your current car had a lot of recalls? Is it cheaper for the car company to fix a defect before the car is made, or perform a recall? While a patch may not appear to cost as much as changing physical parts, it still requires additional $upport and hurts the company's reputation.
So... are you still at Microsoft?
How do you fix a bug cheaply when the contract has ended and all the people working on it are gone? Enter training costs for new staff.
How about needing a whole new contract just for the bugs? Enter the immobile bureaucracy.
How about a year later, when, even if someone from the project is still around, it takes them a few days just to remember what they did 14 months ago? Enter seemingly wasted time.
Anecdotal evidence is viable evidence for the undeniable fact that late bug fixes are very expensive.
Healthcare article at Kuro5hin
The same question was asked in an
XP usenet post. In fact, in Kent Beck's book on eXtreme Programming, he discusses
the cost of changing software in the various stages of the development process,
and a recognition that this "10-100-1000" progression is not necessarily true is
a fundamental part of the XP philosophy.
For what it's worth, I believe there are many domains where that cost escalator
does not apply. If you have a well-designed application rolled out to a manageable
userbase with access to a helpdesk or the development team, it is fairly cheap to
release software, and it is cheap to find and fix bugs. It's reasonably cheap to
find and fix bugs on web projects, too.
There are also domains where fixing bugs after the release is extremely expensive -
embedded devices, shrinkwrapped software, software subject to regulatory checks.
At the time the metrics were collected, compile times were a significant issue for
pretty much all developers - the "code-compile-test" cycle could be hours or days.
Nowadays, most of us can compile our application in seconds or minutes (no, not
the Linux kernel. Yet.) IDEs and CASE tools have made it easier to understand code,
and we have debuggers which allow us to look deep inside the application in real-time.
I don't think the "one-size-fits-all" metric was ever valid, just as there
is no "one-size-fits-all" development process. I think the ratio for any given project
has gone down from its level in the early 1980s, though.
It's all very well in practice, but it will never work in theory.
Minor bugs might be okay, but show stopper bugs are definitely more expensive.
Here's why: when it's early in the development cycle, the team working on the product is small and intimately familiar with it. Finding and fixing the bug takes less time and costs less money in idled staff-hours affected by the bug.
As the product grows, more developers are added, additional staff (let's pretend this is a commercial effort) comes in, sales, marketing, customer support, et cetera... Now, the same show-stopper bug might take the same amount of time to identify and fix, but there are potentially more people down the dependency chain that will also be affected.
And, in reality, as the product gets bigger and has more developers, there's a fair chance the developers will be less intimately familiar with their code and will end up spending more time fixing the bug!
"Begging the question" aka "circular reasoning", in argument, means that you assume that a statement which depends on the conclusion is true, and you use it as proof of your argument.
My rhetorical argument was based entirely on the premise that software engineers must bear responsibility for the errors they introduce, and therefore they are at fault for the errors. The belief that software engineers are responsible implies that they are at fault, therefore the conclusion implies the premise. The logic is fallacious, and therefore requires additional verification that the premise is indeed true. If the premise can be shown to be true through other means, only then can it be validly used in circular reasoning, and even then, it is only generally permitted for contradictory proofs.
frob
//TODO: Think of witty sig statement
It's expensive when you have to trash your CD stock because they're unshippable, or when you have to ship CDs to all of your customers three times in two weeks after you release. Try it and I bet you will have all the empirical data you and your wallet need.
Fix bugs early; it's less expensive that way. =)
It certainly depends on the bug, but when you think about it, in the worst case, a bug is so fundamental that all the rest of the code depends on it and would need to be redesigned. If you catch such a catastrophic bug early in the process it's not such a big deal, but towards the end of the project, it could mean the death of the project.
Exponential up, logarithmic down. They are opposites, like plus and minus. If a progression is exponential, it is logarithmic, and vice versa.
No, they are inverses. If the mapping from the range to the domain is exponential, the mapping from the domain to the range is logarithmic. This does not mean that there is some value A' for every value A such that A^B is the same as log-base-A'(B) for all B. The functions are not the same.
-- MarkusQ
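Spelled out, the inverse relationship looks like this (a standard identity, written here with the 10x-per-phase rule from the original post as the example; the symbols c and n for cost and phase number are my own labels):

```latex
y = a^{x} \iff x = \log_{a} y \qquad (a > 0,\ a \neq 1,\ y > 0)

c(n) = 10^{n} \iff n = \log_{10} c
```

So the cost c is exponential in the phase number n, while the phase number is logarithmic in the cost. Both statements describe the same curve read along different axes, but the two functions themselves are different.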
Your response demonstrates that you know exactly what begging the question is. Your original post, however, reads as though you don't. I still can't quite see how you meant it to be interpreted correctly.
You are comparing apples and oranges.
A bug found in the requirements phase is not the same as someone's misplaced semi-colon.
Say you were told to develop an inventory management system. You deliver a curses-based terminal app and the customer says "wait, I was expecting a web interface!"
That is a requirements-gathering bug that will require substantial work to correct after release!
Conformity is the jailer of freedom and enemy of growth. -JFK
"Corporate Trainers" are only mouthing conventional wisdom.
FWIW, many of the original studies are for software development projects for which (1) the system is unique (e.g., flight control); (2) has never been built before; and (3) "the right people" for the job exist but are too expensive (so you have to do with folks that are less skilled.)
With these constraints, the figures of 10x, 100x, 1000x help predict life-cycle cost given a larger economic constraint (i.e., scarcity of talent). Estimation is a major issue when bidding with the gov't. Inside a private company such statistics can be used to bludgeon the employees into all sorts of stupid actions.
The open source movement might provide an interesting counterpoint to this accepted wisdom. There might be some interesting results in the case that the developers are so highly qualified on a specific piece of code. The picture might be different for exploits vs. functionality, too.
Ok, the scientific thing to do would be to 1) fix a bug in development, and compare the cost against 2) leaving the bug as is, then trying to fix it while you are in live production....
;)
Any takers?
It's 10 PM. Do you know if you're un-American?
If I'm anything to go by, if it doesn't work, I don't use it for another 2 years.
I'm a software engineer, and I don't have the time to download patches to the software others couldn't be bothered to code correctly: I see buggy software as an attitude problem that won't go away with the next release.
The cost of a bug isn't in cash per se. Whether a programmer is in-house or a contractor, they're going to be at your shop for the standard work-week at least, right? So they're either fixing your bug or they're browsing slashdot. You pay the same either way.
;)
The REAL cost of a bug while the project is being coded is in delays to your project, which could push you past deadline. The cost of a bug after the project rolls out is the embarrassment of getting caught with your pants down, and the inconvenience of pulling people off of other work to fix it.
So in my opinion, bugs are "cheapest" to fix during the initial design and prototype phase, where you're probably not that close to your deadline and you have some wiggle room.
They're more "expensive" to fix when you're closer to a deadline and the delay screws you up (for example, find a bug during user acceptance testing and you've got to go back and code, then start the testing all over again).
They're most "expensive" to fix when you've rolled out the project, the users come to depend on it, and something goes wrong. This embarrasses you and makes your code look untrustworthy, and forces you to scramble to deal with the problem, rolling out a patch, etc, all while dealing with hot-under-the-collar users.
I think this three-level way of looking at it is a lot more useful than any kind of imaginary mathemagical flim-flam. Forget the numbers, worry about the egg on your face.
Farewell! It's been a fine buncha years!
"Begging the question" aka "circular reasoning", in argument, means [...]
"Begging the question" is not a synonym for "circular reasoning". "Begging the question" simply means that your argument is based on questionable premises. "Circular reasoning" means specifically that you're using your conclusion as your premise. Circular reasoning may be begging the question, but begging the question is not necessarily circular reasoning.
In this case, I think you may be right, it's both circular reasoning and begging the question, but they're not synonyms, although they are, obviously, related.
(If more people on slashdot were to familiarize themselves with common logical fallacies, I think this might be a better place.)
I think the factor depends a lot on the specific environment the software runs in. When this idea was first proposed, replacing software in the field meant shipping hard copies to users, and for embedded software, replacing PROMs, hence the 50-200 factor. These days the distribution medium is much more likely to be the internet, and at worst even upgrading an embedded system is a matter of plugging in a laptop and reflashing, so the factor might be much lower (5-50 maybe), but the principle still stands.
I worked at a company that produced Telco products (switches, base stations, etc) and we had an estimate of $10,000 per bug found in the field. This was assuming it was a relatively straightforward problem that could be fixed within a few days. If it took longer for the developers to investigate/fix the bug, the cost went up. The cost was so high because we had multiple levels of support who would investigate before passing the info on to a developer who would confirm the bug's presence and fix it. A new build would be made and tested (to ensure the bug was fixed and nothing else was broken) before being sent out to the customers.
We had a dedicated Verification (testing) team that tested the unreleased code (ie. the next release). They spent their entire time trying to figure out how customers would use the system and using it in that way. We even had a small mobile network setup to do the testing properly.
Bugs caught before release were much cheaper to fix since it didn't require a dedicated build/test run, just a developer to fix it and a verification person to test the fix. The Verification team got a new build each week containing all the bug fixes made in that time.
Link
Other than the infidels.org site you presented, every description of the two that I have seen treats them as synonymous.
And I agree with you and have done so with others in the past: it would be good if there were more real argument and debate on /. as opposed to just contradiction. [Homage to Monty Python goes here.]
frob
//TODO: Think of witty sig statement
This question is idiotic, and in fact given the kind of code I get to work with every day, I should like to punch you in the nuts for asking. But since I cannot do that, I am going to give you a real answer ;-)
BTW, if you only read one part of my post, read the last paragraph.
For small, non critical projects the difference is indeed smaller because the complexity is much more manageable. Let's say you're building a house for your dog and your design forgot to specify which way the door should be facing. It doesn't really matter at which point you figure this out, because at any time you can pick the house up and turn it so the door faces the right way. Cost for error correction is exactly the same because the unit is stand-alone, the error is obvious, and easily correctable.
On the other hand, let's say you're building a pedestrian bridge between the Student Union and the Library, which are also being built at the same time. If during design you realize that "wait a minute, the library's entrance is facing away from the union, how's this bridge going to work", you can correct the issue fairly quickly. By the time the bridge and the library are built, your options for fixing the issue are very expensive. Which is why the bridge we had at Stony Brook wasn't all that convenient for about 20 years. It finally got torn down last year.
Analogies aren't even necessary here because there's plenty of real-world experience (mine!). Here's a quick example. Client does something and the server crashes. It is easy to detect this at the time of bug introduction, because "hey, chances are that the code I just made is the buggy one" so you know where to look. Five years later, when someone else is working on your code and something crashes because the clients started entering new kinds of trades or whatever, or because this guy is Indian and his name is longer than you allocated for, it's going to be a BITCH to find which part of the code does the crashing. Sure, the fix may take the same amount of time (just allocate 20 more chars and you'll be fine until aliens with REALLY long names land and start using our system) but bug identification took you a whole lot longer, and it cost you more.
The biggest incentive to detecting errors at the stage they are introduced is that the stages are developed one from another. In the above paragraph, I show that even an implementation error caught during the maintenance stage is more expensive than one caught immediately - but both stem from the fact that the spec and the design erroneously omitted (for example) how long a name should be. It is a spec error all along. If the spec had stated the required name length, the programmer would likely have implemented it correctly. If not, the QA testers would certainly have detected it during the testing stage.
You can argue with your instructor all you want, but in the real world not only is it more time consuming to find the error later on, it has more of a chance of affecting a customer - which can become an expense of its own easily enough.
Ecce Europa - Web Design for Business
This is more complex than just a simple metric. I would agree that the cost of that coding bug could have been 20 minutes to test and 10 minutes to fix, whereas after implementation it took two weeks to placate the client, prove my software, etc. So the management of people costs a lot of time, hence money. XP does NOT fix this "people perception" problem.
Now look at the other side. How long would it have taken me to construct the testing regime that would eventually have found that bug, on an existing product without real tests? I set up a minimal test system over a period of 12 months and I would estimate it took me about 10 days (probably more). Was this testing regime stringent? No! Would it have picked up that bug? No! Why not? Very simple: if I had thought about it I would not have coded it that way. Because I failed to think about that boundary, I would not have tested it anyway.
So the real cost metric that management are focused on here is: what is the cost of testing vs the cost of bugs in production? I never make this call, I always advise more testing! The cost of testing can far exceed the cost of the code. Is this realistic in your particular environment? Was the code change that you made worth that level of stringent testing?
I know that XP brings in "test for a bug, fix bug, never repeat bug". I now live by this dogma and it has saved my arse many times. The downside is that you have to "see" the bug, and this is the big problem. There are no silver bullets.
Quality is relative; determine your level. Set a reasonable level of certainty that you want to release to. Medical embedded systems - test, test, and test! Medical accounting records - test, test. Automatic door closer with manual workaround and fast load for software - quick functional test. One size does not fit all.
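For anyone who hasn't seen the "test for a bug, fix bug, never repeat bug" habit in practice, here is a minimal sketch. The function `last_n` and its boundary bug are invented for illustration; they are not from the project described above.

```python
def last_n(items, n):
    """Return the last n items of a list.

    Hypothetical example: an earlier version was just `items[-n:]`,
    which returns the WHOLE list when n == 0 (because -0 == 0).
    """
    if n <= 0:
        return []
    return items[-n:]


def test_zero_returns_empty():
    # Written first to reproduce the reported bug, then kept in the
    # suite so the same bug cannot quietly come back.
    assert last_n([1, 2, 3], 0) == []


def test_normal_case_still_works():
    # Guards against the fix breaking the behaviour that already worked.
    assert last_n([1, 2, 3], 2) == [2, 3]


test_zero_returns_empty()
test_normal_case_still_works()
```

Note that this only helps once somebody has "seen" the bug; it does nothing for the boundary nobody thought of.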
The reason the trainers get ticked off is because you only ask that question if you are retarded or are trying to tick them off. I guess you should be glad they didn't assume you are retarded.
it would be good if there were more real argument and debate on /. as opposed to just contradiction.
;-)
No it wouldn't!
There are plenty of dinky little bugs that can get unearthed in the few days before release. These aren't necessarily more expensive to find or fix.
Really bad late bugs don't happen if you get your requirements straight, and test early and often.
This is my sig.
Again as mentioned in "The Mythical Man Month", bugs found during the later stages of development (and more so post deployment) are more costly, because:
- The cost of regression testing
- Other (perhaps subtle) bugs may well be introduced by fixing another
The book goes on to say that, post deployment, fixing bugs (and the introduction of new ones in the process) causes the system itself to atrophy. In other words, it becomes more unstable, not more stable.
"This begs the question" does not mean what you think it means. See http://www.wsu.edu:8080/~brians/errors/begs.html It actually refers to a logically flawed argument.
I stole this
That's an interesting idea that goes right to the heart of how software development is done today. Realistically, at present, a company will have to be accountable for the errors, because if it were all pinned on an individual developer, no-one would risk taking on the job.
In a better world, software engineering (currently a rather offensive term to real engineers, and one of dubious legality in many places) would be done more like real engineering disciplines: ultimately, a qualified engineer would have to sign off on a product and take responsibility for it. However, that engineer would also have the authority to say "no" if management put unrealistic budget or time constraints on a project, and there'd be suitable support, insurance, etc, making his position realistic. Any code monkeys on a project are responsible to that engineer. They aren't liable if it all goes wrong, but if their work isn't up to the engineer's standards, they're deemed incompetent and shown the door.
The "real engineering" scenario is hardly out of this world, but the question is how you make the jump. You need a mechanism for recognising the skill, experience and professionalism of people who are good enough to be engineers. Most of the software developers in the world wouldn't even be close, but who's to decide who is and who isn't good enough?
Remember that you're talking about an industry where "best practices" are in constant competition with fads, and concrete examples often date within a decade. Compare that with, say, civil engineering, where best practices are based on thousands of years of experience, and concrete examples (sorry :-)) last for centuries. At that point you can see the problem with getting software engineering started, but once the ball is rolling, I think the software development world will be a much better place.
In today's management-driven culture, that's easy: your managers have to decide what your customers will accept, and you do what they tell you.
In a more engineering-oriented culture, it's also easy: you do things properly. If your engineers could discuss the problem with their engineers, all sides would probably agree on this, and make the decision that is, in the long term, in the interests of both your company and your client, which is almost certainly to rework the broken system properly, from scratch if necessary.
Personally, I would always prefer to do that, since all my experience tells me that cleaning it up now will take less time than reliably fixing all the known bugs anyway, and will be much more effective at preventing similar "special case" bugs in future.
If you disagree, post your argument. (-1, Overrated) isn't your personal censorship tool for views you don't like.
Unlike this guy, you aren't likely to survive going over a waterfall these days. A more recent discussion of the cost of change and a further examination by Alistair Cockburn might be better than reviewing Boehm again.
Get a phone call at 17:00 from a section that you have never heard of and get a project dumped in your lap to be ready by 10:00am, now that is a JFDI.
This is hilarious. I mean, I hate recompiling myself, but that's just ridiculous.
But perhaps something as important as the command protocol should have been designed beforehand?
...the cost of the wasted effort down the wrong path.
For example, if you get a requirement wrong and spend X developer-months designing and coding a subsystem around that requirement, the cost to fix it includes that already sunk cost plus the cost of reworking the design and code to make it conform to what the spec should have said.
Or consider the case where section II.3.iv of the spec conflicts utterly with the requirements detailed in section IV.2.iii. If you don't catch that early (and assuming it's a large project, given the size of the specs), you'll have two different subproject teams off designing, coding and testing at cross purposes and you'll only discover the problem at integration time.
Sure, some requirements or design bugs are trivial to fix even after coding is almost complete (you got the color of some GUI feature wrong, say). Others aren't (you missed some key requirement that radically affects the way the data should be represented and you have to change all your data structures and database tables).
-- Alastair
Every method guru agrees that bugs will creep into any development effort. The thing you need to keep in mind when designing and writing your code is that it may all be wrong from the design up. I have seen few methods that emphasize planning for your bugs ahead of time. XP will even declare that YAGNI!
There will be some points in your design where a bug would have grave consequences. It is up to the designers/programmers to identify those points and plan the repairs ahead.
How? I don't know really, I'm surprised you even read this far.
This space is intentionally staring blankly at you
Bugs are when the software doesn't fulfill the specification; defects are when the specification doesn't fulfill the requirements. These problems are introduced at different stages of development. As one professor put it there are two questions "Did I implement the thing right?" and "Did I implement the right thing?" Early on in the software design it's important to make sure that the specifications that are written for the software actually meet what the customer wants. These are the problems that can potentially be very costly to fix later on. You can implement an entire software system that perfectly meets the specs (i.e. no bugs), but if the specs were flawed it could take a lot of time to revise the specs and fix the system to implement the right thing.
Bug testing is another thing completely. You can't find bugs until they've actually been written in the code. This is the reason for the "test early, test often" philosophy and code reviews. It's important to find bugs early too, but you're right that it isn't feasible to find bugs before the implementation phase.
But that's the cost of updating the software. What we're all missing is the cost of the bug. Not the cost of finding and fixing the bug, but the cost of the bug itself.
Let's say that you're working for a bank and their wire transfer software develops a bug during the end of month or (even worse) the end of quarter period. There may only be 5000 to 10000 transactions but those transactions can account for several billions of dollars (Yes I work as a programmer for a bank and yes these numbers are reasonable).
A bug like that could cost you customers. The kinds of corporate customers who use that wire transfer system have hundreds of bank accounts with hundreds of millions of dollars in assets. If they're unhappy with their service then the cost of losing even ONE of those customers is in the hundreds of millions of dollars.
What if the bug causes an airplane to crash? Or a car to suddenly accelerate? Those bugs cause damages (both physical and financial) in the millions of dollars. Yes, the cost of a bug in the latest video game is trivial. But the cost of a bug in systems where people's lives and finances are at stake is tremendous.
- Software engineer introduced a flaw
- Societal values imply that individuals are responsible for the damages and costs due to flaws they introduce
- Flaws demonstrably cost the company money
- Therefore, software engineers need to be held responsible (financially or otherwise) for the flaws they introduce.
From the definition at the place you posted, I "improperly took for granted" that "Individuals [specifically engineers] are responsible for the flaws they introduce", since you cannot use the conclusion as part of the argument except in arguing through contradiction. If you prefer, substitute it into the same form specified in the 'question-begging' argument you linked to: "The engineers are responsible for the flaws because they are obviously responsible for the flaws." Therefore, it begs the question (or more precisely, the assertion) that we are responsible for the flaws that we introduce.
As you must know (since you are asserting a logical fallacy), the way to remedy a flaw of this type is to either replace the invalid statement or to support it through other means showing that we are not taking the statement for granted, but that it is a valid piece of the argument. Which is why I asked the begged assertion as a question itself, "How responsible are any of us [ as software engineers, electrical engineers, etc. ] for the flaws they introduce?" If we are indeed responsible, then the argument holds because the element has been supported through other means. If not, then the argument fails (although the conclusion may still be proven valid through other means).
Finally, whether you accept my argument or not, and regardless of whether you believe the word I should have used was 'begging' or 'demands' or 'brings up' or any other word selection: /. is an informal discussion board. Enforcing strict formal language or other strict language rules in this informal arena makes you what is commonly called either a "grammar nazi" or a "troll".
Which would you prefer to be called?
frob
//TODO: Think of witty sig statement
Or you can read Alistair Cockburn's proof
What it boils down to is that if I do something wrong, then at the minimum, the cost of correcting the mistake is:
cost of doing the wrong thing first + cost of changing it to the right thing - cost of doing the right thing first
As the cost of doing the wrong thing + the cost of changing it is always going to be larger than the cost of doing it right, you'll always end up with a positive number.
The rest is about momentum; the earlier the mistake was made in the cycle, the more subsequent decisions were made that are also wrong.
Note, however, this has nothing to do with the cost of adding new features later. Here, you've got nothing done wrong to start with, and the cost of changing it is equal to the cost of doing it right. What you lose is the opportunity cost, which can be iffy.
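To put numbers on the formula above, here is a small sketch. The function name and the developer-day figures are made up for illustration; only the formula itself comes from this post.

```python
def correction_cost(wrong_first, change_to_right, right_first):
    """Minimum extra cost of a mistake:
    (cost of doing the wrong thing first)
    + (cost of changing it to the right thing)
    - (cost of doing the right thing first)
    """
    return wrong_first + change_to_right - right_first


# Hypothetical mistake: 10 days building the wrong thing, 6 days reworking it,
# where building the right thing up front would also have taken 10 days.
print(correction_cost(wrong_first=10, change_to_right=6, right_first=10))  # 6

# New feature added later: nothing was done wrong, and the change costs
# the same as doing it right in the first place, so the extra cost is zero.
print(correction_cost(wrong_first=0, change_to_right=8, right_first=8))  # 0
```

The momentum effect is not modelled here; it would show up as `change_to_right` growing the longer the mistake lives.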
"Software is too expensive to build cheaply"
On one end you have Ariane5 exploding because of a software error; on the other end, you have 10 clients within your enterprise who lose time because of software bugs.
With such a huge range of differing costs for finding the bug before or after the shipping of your product, the "average cost" of bugs is meaningless.
I think that the only thing to remember is:
- bugs found late cost more to fix than bugs found earlier (any specific number is invalid)
- finding bugs early is difficult and can be expensive.
From which you can deduce that:
- if late bugs can cost you very much (Ariane5 for example), you want to spend a lot of money on software testing|review at each level.
- otherwise if tests can cost more than the fix (a small number of internal users with a non-critical software), then maybe you can use the clients as testers, but it must be managed well (tell the users, be in close contact with the users, don't make them wait too long for fixes).
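The two deductions above amount to comparing what testing costs against what a late bug is expected to cost. Here is a rough sketch of that comparison; the probability parameter is my own assumption (the post only compares the two costs), and every figure is invented.

```python
def prefer_upfront_testing(testing_cost, late_fix_cost, late_bug_probability=1.0):
    """True when the expected cost of a late bug exceeds the cost of testing.

    late_bug_probability is a hypothetical refinement: the chance that the
    bug actually bites in the field if you skip the testing.
    """
    return late_bug_probability * late_fix_cost > testing_cost


# Launcher-style project: a failure is ruinous, so even a 1% chance
# justifies a large testing budget.
print(prefer_upfront_testing(1_000_000, 500_000_000, late_bug_probability=0.01))  # True

# Small internal tool: fixing in the field is cheap, heavy testing is not.
print(prefer_upfront_testing(5_000, 500, late_bug_probability=0.5))  # False
```

This deliberately ignores the management overhead of using clients as testers, which the post rightly says has to be handled well.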
What I mean is if you find an implementation bug during implementation then it doesn't cost much, but if it's a design bug caught in implementation then you have to undo a lot of code and redo it.
So saying that a bug costs more to fix at each level of development where it's missed is true, but you have to remember that the count only starts from the level where it was introduced.
For example, if a defect is found in the requirements phase, it may cost $1 to fix. It is proffered that the same defect will cost $10 if found in design, $100 during coding, $1000 during testing.
In the above example, you're talking about a bug that is an error in getting the requirements down correctly and letting it live all the way out into the field. Such a bug would indeed be quite costly to repair! Most likely it would require a new version of the product. This type of bug might be that the product didn't have a feature that was needed (i.e. to sell it? to use it?). The lost income from lost sales alone would be enormous.
However, that doesn't mean that all bugs are requirements bugs. In general, the shorter time the bug "lives", the lower the cost. The fewer stages the bug moves through, the lower the cost. If the bug is a design error and you find it in testing, it costs a lot more than an implementation error found in testing.
Making a mistake during coding and finding it while you are still coding is very common, and the cost of that is minimal. That's why it pays to test your code yourself and to do unit testing before sending it to QA.
Avoid Missing Ball for High Score
Notwithstanding that, I don't generally think of the cost of bugs in dollar terms. As crazyphilman proposes, bugs that cause schedule slippage are of very high concern.
Another cost that is important is "increased risk". Every time you change something, you risk breaking it or something that depends on it. The bigger the change, the bigger the risk.
When a product is nearing release, my team reviews every bug found. We weigh the cost (in customer dissatisfaction) of "release noting" the bug and shipping with it, against the cost (in increased risk and slippage) of fixing the bug.
If the bug affects a small number of users, and/or has a reasonable work-around, it is often preferable to ship with the bug, and promise to fix it in the next version. If it has severe effects, then shipping it may not be feasible.
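As a rough illustration of that rule of thumb (not our actual process; the field names and the 5% cut-off are invented), the decision looks something like this:

```python
from dataclasses import dataclass


@dataclass
class Bug:
    affected_fraction: float  # share of users who hit the bug, 0.0 to 1.0
    has_workaround: bool      # is there a reasonable work-around?
    severe: bool              # e.g. data loss or a crash


# Hypothetical cut-off for "a small number of users".
SMALL_USER_FRACTION = 0.05


def triage(bug):
    """Return "fix" or "release-note" for a bug found close to release."""
    if bug.severe:
        # Severe effects: shipping with the bug may not be feasible.
        return "fix"
    if bug.affected_fraction <= SMALL_USER_FRACTION or bug.has_workaround:
        # Few users and/or a work-around: ship, document, fix next version.
        return "release-note"
    return "fix"


print(triage(Bug(0.01, has_workaround=False, severe=False)))  # release-note
print(triage(Bug(0.60, has_workaround=False, severe=True)))   # fix
```

In a real review the risk and slippage of the fix itself would be weighed too, which a three-field record cannot capture.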
It would be nice to say that we ship bug-free software, but for large projects that is virtually impossible.
I say "virtually" impossible. It is possible, but prohibitively expensive for most projects. It becomes reasonable when bugs are unacceptable (e.g. aircraft navigation system: bug=risk of death) but the cost of the system rises astronomically. Most customers would rather have a few small bugs than pay 1000 times as much for the software.
Seriously. Bugs are cheap until you deploy.
>XP et al really do lose a lot of time in the overhead it takes to keep two people on any programming task, unit test, and the rest. You might be nearly guaranteed nice code, but what's your opportunity cost? In short, it's having two coders hacking about twice as much
The research doesn't show that. It's about 15% slower, but the quality goes up by 15%. It doesn't take a brainiac to realize that with 15% higher quality, soon you're going faster because you aren't drooling over a debugger screen.
Many of the replies to the question "Do software defects found in later phases of the software development cycle REALLY cost THAT much more than defects found in earlier phases?" use common sense and experience to suggest that this is the case.
But can we rely on common sense and experience? Would a scientific study reveal unintuitive data to suggest otherwise? We all know that people make mistakes, but would an in-depth study suggest more about:
the factors that cause mistakes?
are the mistakes preventable?
what are the general reasons for mistakes in the various software development phases?
what happens when a mistake is encountered?
how is the cost of the mistake determined?
what are customer expectations regarding defects? are certain defects acceptable?
do certain development processes lead to fewer mistakes?
how many mistakes evolve from poor management?
how many mistakes do test engineers find prior to release?
how many mistakes are found after release?
do software tools aid the development cycle for mistake prevention?
can stringent controls prevent defects/bugs?
if so which costs more? defect resolution or the stringent controls?
what's more important, and what costs more: first to market with bugs or last to market with no bugs?
Finally, would anyone care about what data was produced by such a study? How often do we ignore history and continue to make mistakes?
Oh, grammar nazi please!
I stole this
That's what I thought, and he called me a grammer nazi!
I stole this
I am sure you have heard the old joke that "God was able to create the world in 6 days because he didn't have to deal with an installed user base". I have worked on a lot of legacy code and have worked for companies where some users have relied on buggy behavior as a feature. Fix the bug and hear the user scream. The effects bug fixes have on monolithic legacy code with spaghetti architecture (and I use the word architecture loosely) are also well known. I have no empirical evidence beyond what has already been provided by other posters, but in over 7 years in S/W QA and over 8 years writing software, I have never come across a situation where a bug was not considerably more expensive to fix in the later portions of a development cycle than earlier. I have a rule while writing/maintaining code: if I see a bug in the code, I fix it then, while I am in the code, rather than later. It is much easier, faster and better to fix it while my mind is on it than to come back to it. This is what pair programming buys you (I don't pair - I can't think with someone watching me): someone to find the bug while you are typing it, rather than 2 weeks later when you integrate the code and then have to spend hours finding out why something doesn't work. Same goes for unit testing; you find the bugs earlier. Just a note: I disagree that just because you find a bug while coding or debugging you don't have to triage it, report it to QA, etc.; you may indeed have to do some or all of those depending on the nature of the bug. If nothing else, QA may need to know about the bug so they can include tests for it in their suite. It may need to be triaged if the ramifications of fixing it are large enough. Usually I don't log bugs I find in new code unless that code has already been tested or released to QA - but I do list it as a bug in the checkin of the code, and QA may find that info helpful in developing tests - or not.