Standards for Bug Severities?
MythoBeast asks: "While attempting to determine just how far a (unnamed) company's software is from releasability, a group that I belong to discovered that they consider a piece of software releasable with numerous 'severity one' bugs. This seems outright heinous to us at first, but then the conversation devolved into an argument of what constituted a 'severity one' bug. Is there a standard for such things? Is there a standard for how many of these things are reasonable for 'releasable' software?"
Hoo boy, we used to get into this argument every time we reached the "triage" portion of the project (typically after second beta or 3 months before ship, whichever came later).
Sev1 bugs: cause loss of data, hardware damage, or take down other programs (or the OS) which might cause loss of data.
Sev2 bugs: cause the program to crash without losing data, or otherwise make some feature non-functional.
Sev3 bugs: impair the usefulness of some feature.
Sev4 bugs: are cosmetic or may cause some confusion to the user (wrong status text, spelling errors, etc.).
HOWEVER, there is also a priority scale. A Sev-1 bug is not necessarily a Pri-1 bug if it only occurs under extremely obscure conditions. E.g., if your program crashes, losing data, when a PC-Card is inserted, that's a Sev-1 bug. If it occurs any time any PC-Card is inserted, then it's a Pri-1 bug. But if it's only one particular kind of PC-Card that exists in the wild in quantities of, say, a couple of hundred, then it's not a Pri-1 bug even though it's still a Sev-1 bug (you may still choose to fix the bug, but other Sev-1 bugs have priority since you can doc around this one). On the other hand, a low-severity bug that's in your face every time you use the product is probably a fairly high-priority bug.
If you're running your business as a business, you will ship with bugs. If your product has a broad market, you will probably ship with a couple of Sev-1 bugs. (This sounds paradoxical, but consider -- the broader your market, the larger the number of permutations of hardware and software your product is going to interact with, and the higher the chance some obscure situation will occur in which a Sev-1 rears its head.) It's a business decision to ship, so the number of bugs you ship with has to be based on more than just a technical analysis (Rule #1 of software development: shipping is a feature too).
It's easy for us technical heads to get on a high and mighty horse about quality and not shipping with bugs. But we tend to forget that what we're doing is NOT engineering (it's a lot closer to handicrafts in the days of guilds than it is to engineering the way it is done with bridges and aircraft). And we're not the people trying to run a business.
The corporate-wide standard in this case was badly broken, and was obviously written by someone who doesn't understand software release.
QA is NOT there to find, log, and force a fix for every single bug in the system. That is an impossibility. Just like manufacturing QA, software QA is a statistical process.
The classic manufacturing QA example is the Bolt Building Machine. It makes bolts from metal bar at the rate of 10000 a day. It would be wonderful if manufacturing QA could take each one of the bolts produced by the machine and measure the eight or so critical bolt dimensions to ensure that each bolt was within tolerance. Unfortunately, it takes about 15 minutes to measure a single bolt with that kind of precision, so it is not cost effective to measure each and every bolt. The next best way to ensure quality is to take a random sample of bolts and measure them, so that you can be statistically certain to an acceptable degree that the rest of the bolts are good.
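To put rough numbers on the bolt example, here is a small sketch. The 10,000 bolts a day and the 15 minutes per measurement come from the post; the sample size of 100, the 95% confidence level, and the function names are assumptions picked for illustration. The bound is the usual exact binomial one for a random sample that turns up zero defects, which the post does not specify.

```python
def inspection_hours(bolt_count, minutes_per_bolt=15):
    """Inspector-hours needed to measure bolt_count bolts."""
    return bolt_count * minutes_per_bolt / 60


def max_defect_rate(sample_size, confidence=0.95):
    """Upper bound on the true defect rate when a random sample of
    sample_size bolts contains zero defects.

    Solves (1 - p) ** sample_size = 1 - confidence for p.
    """
    return 1 - (1 - confidence) ** (1 / sample_size)


full_run = inspection_hours(10_000)   # measuring every bolt made in a day
sample = inspection_hours(100)        # measuring an assumed sample of 100
bound = max_defect_rate(100)

print(f"every bolt: {full_run:.0f} inspector-hours per day")
print(f"sample of 100: {sample:.0f} inspector-hours per day")
print(f"if all 100 pass, defect rate is below {bound:.1%} at 95% confidence")
```

Measuring everything costs about a hundred times more inspector time than the sample, and the sample still puts a usable ceiling on the defect rate.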
Software QA works the same way. When a new release of software comes out, it is rarely run against the entire QA test plan again. This is especially true with patch releases where the development team believes that only a few minor things have changed. It just takes too long to run a full QA cycle against a build like that -- it is possible, but it isn't cost effective. Instead, the QA team runs a subset of the test plan against the build in an effort to sample the test space and come up with a statistical certainty that this release of the product isn't broken.
Bug classification is also a statistical problem equivalent to the classic 'expected value' problem. The standard five bug classifications are useful measures of the 'weight' of a problem. They answer the question, 'If this bug happens, how much does it hurt?' What the abovementioned corporate scheme failed to qualify was the 'How often it happens' part. Without knowing how often the problem occurs, it is impossible to calculate the 'expected hurt' that the problem will generate.
Here is a degenerate case example. Say the program has code like this:
If a user directs a cosmic ray into the computer that corrupts the value of the pointer between the time its validity is checked and the time it is dereferenced, the program will crash with severity 1. There is no workaround, because every time the user sends in this ray, the program fails. According to the abovementioned corporate policy, the product can't release until this problem is fixed, period.
What are the odds of a user here on earth actually sending in that cosmic ray at exactly the right time, let alone having a need to? Very close to zero. So close, in fact, that it practically doesn't matter how severe the crash is. The 'expected hurt' from this problem is going to be very low, and the product should ship with that severity 1 bug going unfixed. If the software were going into space where cosmic rays are common, it might be prudent to recalculate the odds of the problem happening before shipping the code -- and maybe then the only practical solution is to ship with the bug but send up redundant systems.
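The 'expected hurt' argument can be sketched as a probability-weighted cost. Every number below is made up for illustration: the weight per severity level, the per-use probabilities, the usage count, and the two example bugs. The post gives no such figures, and a real team would have to estimate its own.

```python
# Hypothetical pain weights: how much one occurrence hurts, by severity.
SEVERITY_WEIGHT = {1: 10_000, 2: 1_000, 3: 100, 4: 10, 5: 1}


def expected_hurt(severity, probability_per_use, uses):
    """Expected total pain: weight of one hit times expected number of hits."""
    return SEVERITY_WEIGHT[severity] * probability_per_use * uses


USES = 1_000_000  # assumed number of times the product gets used

# A severity 1 crash that needs a one-in-a-trillion coincidence to trigger.
cosmic_ray = expected_hurt(severity=1, probability_per_use=1e-12, uses=USES)

# A severity 4 cosmetic glitch that shows up in one use out of ten.
flicker = expected_hurt(severity=4, probability_per_use=0.1, uses=USES)

print(f"cosmic-ray crash: {cosmic_ray:g}")
print(f"cosmetic glitch:  {flicker:g}")
```

With these assumed inputs the cosmetic glitch carries far more expected hurt than the severity 1 crash, which is the point of the argument: severity alone does not rank bugs.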
QA is not there to ensure that the product has no bugs. QA is there to qualify risks, and to assure that the quality of the product falls within a certain tolerance. In a perfect universe, software would never have bugs, and bolts would never bend. Here in our solar system though, it all boils down to risk management.
A. Coward, (a QA Manager for a Fortune 50)
Show stoppers are usually bugs that crash the software when doing normal functions. If you open up a file and the software crashes, that would be a show stopper. In some cases it depends on your clients as well. If the client finds a bug, they may deem it necessary to fix. Most companies I have worked at work closely with clients. In many cases the software does not get FULLY tested till the client actually gets the software. Then when (not if) they find bugs, they tell us and then we make it a priority to fix them and send out patches to certain clients if not all. This would depend on the client contract.
"Low priority" bugs would be ones that do not impede functionality, but are more of annoyances. For instance, I saw one bug where the window would flicker when you scrolled the scrollbar. This is annoying, but does not prevent the user from using the software. Other less important bugs are bugs that are on reports. If a report does not show the correct data and you can get the data from the database, or it is a known bug, this can also be a low priority bug. The question usually is "Can the client work with the application with this bug or will it cause the client major problems or loss of money?" Take for instance the reports bug: a client could just not use the reports and create their own from SQL.
Something you must remember is that some of the programs that I have worked on have been written by many programmers. They are thousands of files and have client, server, and database code, in multiple languages. Some of these bugs could be caused by some of the software that we use: bugs in Windows APIs, bugs in Sybase or Oracle, bugs in the server OS, or even 'quirks' that many programmers do not know about.
Next, many times programmers get put in a job where they know the language but NOT the API and the API is not well documented, especially if this is a 'custom' system where the software company has developed programs for DB access on top of the 'db lib'. What about programmers that leave the company or get fired?
So who do you blame? The programmer who implemented something without knowing the company API, or the software company for not completely testing "EVERY" single case that could possibly come up?
You never know how a user will use software or what they may do or have installed on their system that could affect how your software will work.
I don't want a lot, I just want it all!
Flame away, I have a hose!
Only 'flamers' flame!
BSOD is not a bug. BSOD is the result of some bug which could be classified according to the scales. Any number of problems, from severity 5 to severity 1 all trigger BSOD. (Note the BSOD is essentially an effort to convert severity 2 bugs to severity three, by giving the operator the option to continue. 'Course, it never works in real life that I've seen, but sometimes it works just well enough to save to a new file and try again.)
as expounded by Watts S. Humphrey of SEI in "Managing the Software Process" (c) 1989, Addison-Wesley, p. 218:
"Severity 1: An error that prevents the accomplishment of an operational or mission-essential function, prevents the operator/user from performing a mission-essential function, or jeopardizes personnel safety.
Severity 2: An error that adversely affects the accomplishment of an operational or mission-essential function and for which no acceptable alternative workarounds are available.
Severity 3: An error that adversely affects the accomplishment of an operational or mission-essential function for which acceptable alternative workarounds are available.
Severity 4: An error that is an operator/user inconvenience and does not affect operational or mission-essential functions.
Severity 5: All other errors."
Any other questions?...
Hire an intern.
Seriously -- the intern gets a flavour of the company and corporate programming, and is capable of coming on board and seeing his contributions right away.
You get a cheap way to clean up the program. I believe one of the reasons why Microsoft is so popular is that they fix these kinds of bugs. Sure, you may get a BSOD every once in a while, but the program appears professional. Much like spelling and formatting errors in resumes are deadly, spelling and formatting errors in software reek of unprofessionalism.
For the cost of $12,000 (4 months intern salary) you could get tons of simple bug fixes knocked out, and give some poor undergrad tuition for his next semester or two. Everybody's a winner.
Support a few technologists in Washington.
A bug (in my development experience, limited though it is) has both a severity and a frequency. A sev 1 bug that rarely happens might not be as important as a sev 3 bug that is a frequent annoyance. We always eliminated all known Sev 0 bugs (crash server; yikes) but the occasional Sev 1 bug that was very hard to kill AND deemed unlikely to affect the customer was left in so we could release now and fixed in the SP or point release. Yes, leaving bugs in sucks but sometimes not shipping until they are all dead sucks more.
Q:How many libertarians does it take to stop a Panzer division? A:None. Obviously market forces will take care of it.
How did the above post get modded up to +4 interesting?
Logging bugs, and giving them "severities", is well and good. It can help in the same way that any effective software development tool helps, by enhancing communication. The moral is that bug tracking is useful only in the context of a team that can, and wishes to, use it effectively.
Wrong. Software engineering practices like tracking the severity and priority of bugs are aimed at creating higher quality software. Instead of developers spending time fixing bugs that are unimportant or not as annoying to the user, they have a way of categorizing and fixing bugs on a scale of importance.
Using bug count and severity as a measure of "releasability", however, is the fallacy of feeble-minded managers, who are afraid to make a decision without a number to back it up.
Yet people like you will bitch that Microsoft shouldn't release software with known severe bugs. I guess if they didn't track bugs they would be able to say they didn't know that there were any severe bugs with a clear conscience.
Software development (as practiced in all but epsilon of cases) is simply not a measurable process. You will only waste your time trying to quantify it.
Software development can be measured based on a large number of metrics. The fact that most people refuse to use software engineering practices that have been known for decades doesn't mean that the process is "immeasurable" it simply means our industry is full of people who'd rather subscribe to the myth that programming is an art instead of trying to make it an engineering discipline.
Releasability can only be determined by the judgement of the team working on the release (which may include developers, testers, release managers, beta testers, partners, etc). That is not to say that you should not draw upon the bug database for evidence upon which to base your judgement. But it requires intelligent interpretation, not counting up some totals.
So if the members of the team come up with the rules for what makes a bug a certain severity and what marks a bug as a certain priority, exactly why can they not base releasability on these metrics? Quite frankly, waiting until the software "feels right" before releasing it is probably the most ridiculous thing I've ever heard. On the other hand, saying "we won't release until all the bugs that cause core dumps/segfaults/BSODs are fixed (severity one)" or "we won't ship until we fix the annoying UI issues (priority one or two)" is quite reasonable even from a common sense point of view.
Some people consider this a failing of the software development process. I think they are too quick to condemn. The customer doesn't (usually) judge software by its bug count.
Interestingly, most users of Windows 98 I know judged the software on its bug count (i.e. how many crashes needing reboots per day or other ridiculous problems per day).
Most software is judged by an overall feel: if the software is compelling, many deficiencies will be overlooked. Further, it is difficult to guess ahead of time (even with beta testing) which bugs will really bite people and impede their use of the software.
If beta testers don't give you a feel for a large number of the bugs that will bite people on the ass then there are flaws in your beta-testing process.
Given the many interacting factors that determine the success of software, release decisions are naturally more art than science.
People like you are the major problem with the software industry: those who believe that, even though there have been increasing strides in the field of software engineering, we should all still practice software development like it is a black art handed down from master to apprentice in a scene reminiscent of the guilds of the Middle Ages.
I know, I haven't presented any hard evidence. I'm arguing from experience in both free and closed software projects, and appealing to common sense. Most free software is released "when it's ready", without any metrics. Ponder on whether this is a strength or a weakness. And remember that when someone gives you a number, the burden is on him to show that the number means something.
Do you really think Mozilla, GNOME, KDE, Apache, the Linux kernel, OpenBSD, etc are shipping "stable" releases of software without keeping an eye on bug severities and priorities? They may choose to ignore them for one reason or the other but they are keeping track.
--
1 - This is a drop-everything-to-fix-this-immediately kind of bug. A bug that makes it impossible to use the software at all.
2 - This level of bug is very serious and prevents some major part of the program from working. If we were doing a word processor, it might be cut-and-paste doesn't work.
3 - This level of bug displays a runtime error to the user but if they accept the error, the functionality is still there. Everything works, more or less.
4 - This is a spelling error or a size-of-textbox problem, something which the end user will notice but which obviously works properly.
Our standards say we are never to release with a level 1 or level 2 bug and, as much as possible, no level 3 bugs. Ideally, of course, we release with no bugs but that can be rather difficult.
Under no circumstances would we ever release with a level 1 bug because this simply means our app is broken.
--
Oceania has always been at war with Eastasia.
This is probably a good time in this thread to throw out an allied problem: what happens when management doesn't like the results of the bug evaluation.
True story: a corporate-wide standard for bug determination was defined in a Fortune-50 corporation. In a project, though, the director of engineering decided that the standards "didn't reflect reality" -- or rather, that the numbers made him look bad. It also meant that there was no way he would be able to ship the product on time, because the number of ship-block bugs was just too high for his development staff to clear in a manner acceptable to him and to his superiors.
Now, the definition was pretty strict: Severity 1 bugs were features in the specification that were implemented and broken in significant ways, severity 2 were features in the specification that were implemented and met the operational requirements but not performance requirements, or were not implemented and marked in the specification as required for release, severity 3 were features that were implemented but broken in non-significant ways (cosmetics, usually) or features in the spec that were not implemented and marked as not required for current release, severity 4 were features that met spec but were identified as areas for improvement, and severity 5 were spec change requests. The rule is that the product MUST NOT ship with severity 1 bugs, SHOULD NOT ship with severity 2 bugs (release required a review and analysis of each S2 bug by corporate QA), MAY ship with severity 3 bugs, and MUST ship with priority 4 and 5 bugs.
So, Job One for this manager was to pull a Captain Kirk and change the rules. Severity 1 was redefined as "causing a system crash". Severity 2 was redefined to "causing an operation failure that does not result in a system crash". Severity 3 was everything else that used to be in S1 and S2, as well as what was originally in S3. That meant that missing or severely broken features identified as required for release obtained an S3 rating instead of the original S1 or S2 rating. Or, in other words, a broken product could now be shipped!
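The effect of the redefinition can be made concrete with a small sketch. The two classifiers follow the old and new definitions as described in this post; the `Bug` fields, the category names, and the three sample bugs are hypothetical. The post does not say what happened to severities 4 and 5, so the sketch assumes they were left alone.

```python
from dataclasses import dataclass

# Old scheme: severity follows from what the bug does to the spec.
OLD_SEVERITY = {
    "broken_significant": 1,   # implemented, broken in significant ways
    "misses_performance": 2,   # works, but misses performance requirements
    "missing_required": 2,     # not implemented, required for release
    "broken_minor": 3,         # implemented, broken in non-significant ways
    "missing_optional": 3,     # not implemented, not required this release
    "improvement": 4,          # meets spec, could be better
    "spec_change": 5,          # spec change request
}


@dataclass
class Bug:
    title: str
    kind: str                  # one of the OLD_SEVERITY keys
    crashes_system: bool = False
    operation_fails: bool = False


def old_severity(bug):
    return OLD_SEVERITY[bug.kind]


def new_severity(bug):
    """Redefined scheme: only crashes and failed operations rank above S3."""
    if bug.crashes_system:
        return 1
    if bug.operation_fails:
        return 2
    # Assumption: S4 and S5 keep their old meaning.
    return max(3, old_severity(bug))


def ship_blockers(bugs, severity_of):
    """Bugs that trip the MUST NOT ship rule (severity 1)."""
    return [b for b in bugs if severity_of(b) == 1]


bugs = [
    Bug("report totals are wrong", "broken_significant"),
    Bug("required export feature absent", "missing_required"),
    Bug("import hangs the machine", "broken_significant", crashes_system=True),
]

print(len(ship_blockers(bugs, old_severity)), "blockers under the old rules")
print(len(ship_blockers(bugs, new_severity)), "blockers under the new rules")
```

The bug list is identical in both runs; only the labels move, and with them the number of bugs that block the release.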
Of course, it didn't do very well in the market, but the manager in question still got his raise -- and I suspect that's all he cared about.
The root cause of this breakdown was that QA/DVT was an Engineering function, not a separate function. The guy saying "no" should be a peer to the guy saying "yes". (It would have helped if the original bug definitions would have been part of the specification, because the spec is king within this company -- but that didn't happen.)
Have you tried social engineering?
Arrange with your management to set aside a day for a "Little Stuff Contest." It works like this:
Make this a regular thing, and you will be surprised how quickly the nuisance bugs get cleared. Moreover, I can tell you from personal experience that in the process of clearing "the small stuff" some big stuff will be found and fixed as well.
Motivation is a beautiful thing.
...would seem to apply.
"In everything do to others as you would have them do to you..." -- Matthew 7:12, NRS Translation
A generally good philosophy, regardless of your adherence to certain world religions.
When releasing software, I ask myself: Would I be pleased if somebody gave/sold this program to me? It isn't so much a matter of numbering conventions of bug severities -- it's a matter of pride and responsibility.
--
Scott Robert Ladd
Master of Complexity
Destroyer of Order and Chaos
All about me
there are some rules we would want to stick to at minimum
1. Software can not harm a user or through any inaction allow a user to come to harm
2. Software must obey orders given by said user unless it would conflict with first law
3. Software must preserve itself and OS unless conflicting with first or second law
Adapted from the Genius I. Asimov
"All I can tell the "lesser of two evils" folks is that if they keep voting for evil, they'll keep getting evil."-Lp.org
At my company, we categorize bugs several different ways. This is all from an IEEE standard, but I don't know which one.
The SEVERITY of the bug is entered when it is discovered. It is one of:
1 - Urgent, causes crash
2 - High, no workaround
3 - Medium, workaround available
4 - Low, inconvenience
Other information, such as the steps to reproduce, who discovered it, a unique ID, and date and version discovered are also recorded. Most important is the TYPE of bug, which is one of:
Code Problem
Configuration Problem
Data Problem
Design Issue
Enhancement
UI Problem
Documentation Issue
Bugs are assigned to a developer to be fixed. If they are easily fixed, then they are immediately fixed. The unique ID is used in the CVS checkin for the changed files. If the bug is NOT easily fixed, then the developer gives a level of effort, so we can determine the 'cost' of fixing the bug.
Bugs are then assigned a PRIORITY in a 'Configuration Control Board' (CCB) meeting with me (the project manager), several developers, documentation writers, and/or our end users/customers/testers. This can be a small, informal meeting, or a large process-driven meeting. I prefer them small. The PRIORITY is one of:
1 - Urgent, resolve immediately
2 - High, resolve ASAP
3 - Medium, resolve pre-release
4 - Low, desired, but not urgent
5 - On hold, fix in future release
Our customer has deemed that all severity 1 bugs are automatically priority 1, and all severity 2 are pri 2.
Severity 3 and below are given priorities based on the cost to fix, but if we have too many severity 3's in one 'use case', they collectively become a pri 2.
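These priority rules can be sketched in a few lines. The first two rules and the per-use-case escalation come from the post; the threshold for "too many" severity 3 bugs and the mapping from effort to priority are not given, so the values below are placeholders, as are all field and function names.

```python
from collections import Counter
from dataclasses import dataclass

MAX_SEV3_PER_USE_CASE = 3   # placeholder: the post only says "too many"


@dataclass
class Bug:
    bug_id: int
    severity: int        # 1 (urgent) .. 4 (low), entered on discovery
    use_case: str
    effort_days: float   # developer's level-of-effort estimate


def priority_from_cost(bug):
    """Placeholder cost rule: cheaper fixes get scheduled sooner."""
    if bug.effort_days <= 1:
        return 3   # medium, resolve pre-release
    if bug.effort_days <= 5:
        return 4   # low, desired but not urgent
    return 5       # on hold, fix in a future release


def assign_priorities(bugs):
    """Return {bug_id: priority} following the rules described above."""
    sev3_per_use_case = Counter(b.use_case for b in bugs if b.severity == 3)
    priorities = {}
    for bug in bugs:
        if bug.severity <= 2:
            # Customer rule: severity 1 is priority 1, severity 2 is priority 2.
            priorities[bug.bug_id] = bug.severity
        elif (bug.severity == 3
              and sev3_per_use_case[bug.use_case] > MAX_SEV3_PER_USE_CASE):
            # Too many severity 3 bugs in one use case: escalate them together.
            priorities[bug.bug_id] = 2
        else:
            priorities[bug.bug_id] = priority_from_cost(bug)
    return priorities


bugs = [Bug(1, 1, "login", 2.0), Bug(2, 4, "login", 0.5)]
bugs += [Bug(10 + i, 3, "reports", 8.0) for i in range(4)]
print(assign_priorities(bugs))
```

In the sample data, the four severity 3 bugs in the "reports" use case would each be put on hold if judged by cost alone, but together they exceed the placeholder threshold and are raised to priority 2.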
It's not nearly as much process as it seems, and it really streamlines our development.
We use a tool called TestTrack to track all this, but I would prefer to move towards Bugzilla, as my company has recently jumped on the open source bandwagon.
another comment I should have included in my other post...
As a developer, my relationship with our testing department has changed over the years...
Originally, I thought the purpose of testing was to prove there were no bugs in the system.
They ALWAYS discover bugs. I began to think that the purpose of testing was to discover ALL the bugs, and it was the engineer's job to fix them.
They NEVER discover ALL the bugs. I now think that the purpose of testing is statistical... How many bugs are they finding, and what severity are they? This speaks to the quality of the system design and the quality of the engineers working on the system. If there are few bugs found and fixed, you can be confident you have a good system going out. If there are a lot of bugs found and fixed, I would worry that there are a lot more left undiscovered.
It is a common practice to ship with known bugs, but we NEVER ship with a pri/sev 1 or 2, unless a political deadline forces it.
Sounds to me like the company whose product you are using found more bugs than they could reasonably fix in the time they allocated to this phase of their development cycle, and now they are being forced, for political reasons, to ship with those bugs. That would be a red flag to me that there will be other, undiscovered, severe bugs in their software.