How Fast is Your Turnaround Time?
petrus.burdigala writes "I work for a mid-sized commercial software company (~20 Mloc) and we are frequently challenged by our supervisors to get fixes around the clock. Overall, we manage to get a 'bullet-proof' patch in about 4-5 weeks (from coding->QA->Build/Packaging->shipment), which I consider not so bad. But the other day, we got an urgent request from our support team to come up with a decent fix in 48 hours. I think they're a tiny bit unrealistic. So I wanted to get feedback from my peers: are we doing that bad? It takes months for other software vendors to issue zero-day exploit fixes, are our customers being unreasonable?"
It may just be me but I think that's why they are called "customers"
Excuse me while I gather the virgin sacrifice and assemble the pentagram required to solve your problem
How much of that 48 hour deadline did you waste reading /.?
Get back to work!
You have to serve the client who is paying the bills - and we had a very vocal one (Nik*). We had a running joke about the release du jour. But it wasn't a joke. We literally would push a new build to them every day which contained minor bug fixes. It was maddening! But no one had the balls to stand up to the 800lb gorilla, so the madness continued. As a side-note, they were acting as a beta tester and anyone in the software business knows what that can mean.
What was that exploit again?
You can't talk about Wikipedia's flaws on Wikipedia
For high priority bug fixes, it usually takes 1 to 2 weeks to get a patch out once we determine that a patch is needed.
It depends upon the nature of the problem and the competency of the developers.
If you know enough of the code tree you can tell when first reproducing and examining the failure whether it is a one off mistake or a larger procedural fault.
Single instance stupid errors (doh! moments) can be rectified and put through testing fairly quickly, however if your initial examination uncovered a larger problem then obviously the process will take longer (if at all - consider workarounds).
If the original dev/test team has been replaced over time this becomes a more difficult issue and every bug must go through complete verification simply because the extent or ramifications of the code modification will not be known.
In some instances we have had fixes out of the door the same day an issue was noticed, in others months go by before a final fix is put in place.
liqbase
I can understand a week, but honestly...if you're leaving your customers vulnerable for over a month, they might start looking elsewhere
Exploits should be a high concern for any company
I work for a bank so we don't do box software, but our patches have to meet FTC standards and Federal bank standards.
It is uncommon, but not unheard of, to have an 8 hour fix. In cases of customer data vulnerability, legislation has been made such that if we are aware of a problem, we have an automatic injunction against us continuing to do business unless the problem is resolved. So when we have a security flaw, our bank stops working until it is fixed. So yeah, 48 hours would have people fired for sure.
Compliance/security are the only two things that can spark a release with less than 72 hours notice though.
But the other day, we got an urgent request from our support team to come up with a decent fix in 48 hours. I think they're a tiny bit unrealistic.
Well, we really can't answer that question without knowing how big the problem is. If it's an embarrassing typo on a dialog box, then 48 hours is reasonable. If it's a Windows Vista security patch, then 48 days would be unrealistic.
-Grey
Silver Clipboard: Time Management Tips
It depends on what you're maintaining and how complicated it is. I've gotten fixes out in 2 or 3 minutes. That doesn't mean I'm fast and you're slow, though. "How fast is your turnaround?" is like "how long does it take to write a computer program?" It's hopelessly vague.
As copyright owner of this comment, I authorize everyone to defeat any technological measure which limits access to it.
Yeah, your turn around time seems good and yes, the customer's request is beyond industry norm.
That might mean one of three things:
One: Customer is being foolishly optimistic.
Two: The entire industry is bad about turn around time, and can, if pushed, improve it to 48 hours.
Three: Customer needs it really quick and is hoping to get it quicker by asking. They know 48 hours is well beyond the norm, but are hoping you can do it anyway, because the more time it is unpatched the more they are screwed. They know that if you don't ask, you can't get, so they are at least 'asking'.
Me, I think it is a combination of all three. Customer is being a bit optimistic, the industry is bad about turn around time, and also the customer knows it is a bit optimistic but is making the request anyway in hope you will provide amazingly good service.
excitingthingstodo.blogspot.com
Sometimes, customers are unreasonable and if they are, they should be treated with respect and the problem explained to them. Yes, they may be incredulous, but if you hold your ground (if they're being unreasonable), treat them with respect, they will come around.
The fact that the parent was moderated down just shows me that the arrogance, contempt, and stupidity in corporate America is alive and well - especially in IT.
I prefer Flambe as opposed to flamebait.
With a little simplification, you have four parameters: Difficulty, quality, speed and available resources. Whenever you fix three, the fourth follows (with some uncertainty). It is well known that there is a limit on how much you can improve the speed with more resources. So there is an upper limit on speed already. The second problem is that difficulty is unknown when starting such a task. There is no fix for that.
So if these people fix speed and available resources, and difficulty is fixed by the task, quality is determined by these factors. Period. There is no arguing with hard, real limits. If they do also want to specify the result quality, then they have to leave speed open. Again, there is no way around that limitation. In fact they should be happy if the team manages the required quality at all in reasonable time. Not all teams do.
Maybe this will be an argument that is understandable for people with a business background. Engineers should already know this.
Software engineering is engineering. Engineering tasks in general have minimal time requirements. Look at structural engineering: Nobody would try to design and build a full-custom bridge in a week. Instead it takes up to a decade, depending on difficulty. And you can generally not speed things up by increasing the team size.
Most ACs are not even worth the keystrokes to insult them. Be generically insulted by this and ignored otherwise.
Overall, we manage to get a 'bullet-proof' patch in about 4-5 weeks (from coding->QA->Build/Packaging->shipment)
Not unreasonable, depending on the size of your release. (How many modules and how many LOC you're changing, the number of change requests or bug reports in the build).
But the other day, we got an urgent request from our support team to come up with a decent fix in 48 hours. I think they're a tiny bit unrealistic.
I think they're smoking crack.
So I wanted to get feedback from my peers: are we doing that bad?
With your regular release schedule, I don't think so.
are our customers being unreasonable?
Yes. That's what they do. If they want a crash development program to get this "patch" out the door that fast, they seriously risk software which does nothing but crash. Really, if they want it that bad, they run the risk of getting it that bad.
You have to ask yourself and your "support team" (sounds more like marketing to me): "Do we wish to ruin a perfectly good reputation for quality and reliability in one hurry-up bashfest followed by weeks of agonizing on-line debugging?" Really, advocate any kind of work-around and risk mitigation response before being pushed into an overly-hasty release that will linger on your reputation like a dead skunk.
Welcome to the Panopticon. Used to be a prison, now it's your home.
A patch (IMHO) is a bug fix to existing code. Given the resources, we should be able to get a PATCH out in a week. However, if you need a new version of the software to address the issue, then we're talking longer development/testing/QA times, in which case 4-5 weeks would not be unreasonable. Bugs should be fixed as soon as they are spotted. If there is need for a whole rewrite then you may want to talk to your staff
Ask not what you can do for your country. Ask what your country did to you
I know I'm going to end up baiting some developers, but I work for a specialized ASP and see a ton of third party software from a perspective few get...
Normally, the smaller the company the more agile. No surprise. They also get patches out faster too. Also no surprise.
When we look at vendors of equal size, the ones who are really quick at sending out patches are in that situation because their software is more buggy, and they have a *lot* of practice. It never fails.
In response to your question, I would suggest that you should look more at the frequency of patches and less at the duration. Sure, it might not be as fast as your support group wants, but if you start reflexively sending out patches every time someone yells, then your overall product will suffer since you can't possibly do the proper QA to ensure THAT patch you just whipped up doesn't break something else.
That brings me to the age old choice:
Pick 2 of the following:
Speed
Quality
Cost
How much time do you spend on TPS reports?
The last time I did one I forgot the cover page and my 7 bosses all bugged me about it.
At BSDi, the initial patch (which did have flaws, but it fixed the problem) for the f00f bug was same-day, I believe; might have been next-day, depending on where you're counting from. (Contrary to popular belief, this didn't violate any NDAs.) Now, that was an emergency patch -- it took a while to come up with a patch that fixed the bug without noticeable ill side-effects.
We had a better patch later, but the initial emergency patch was VERY fast.
On the other hand, if the initial bug report is "Sometimes the program hangs, no, I don't know when. Maybe every week or two." -- well, that's gonna be hard. Exploits generally have the advantage that an exploit is by nature at least somewhat reproducible, and the hardest part is often getting a reproducer. I've had it take six hours to develop a usable reproducer, and three minutes to develop a patch.
Release time depends hugely on process and procedure. IMHO, an ideal procedure would have some kind of way to get a Temporary Patch out into the field ASAP when there's an exploit.
My blog: http://www.seebs.net/log/ --- My iPhone/iPad app: http://www.seebs.net/seebsfrac/
48 hours is a tad bit tight. However, I've turned things around in a similar amount of time.
But, the old adage is true: you get what you pay for:
When faced with unreasonable deadlines in the past, I usually voice my opinion once, and just do the best I can. Your higher-ups are probably already quite stressed at this point, and adding stress to the situation doesn't do anything for your career or theirs. Rather, if you make the point that you're doing the impossible, you might just have a little bit more bargaining power when it comes time for raises.
But on the flip side of the coin, if management doesn't learn, and you find yourself constantly asked to do the impossible, you might want to consider employment elsewhere...
The society for a thought-free internet welcomes you.
*15 minutes.
It's bad enough that they directly state they're not really testing patches with a 15 minute turnaround, but the fact that they're making mistakes that can be fixed in 15 minutes speaks loudly as well.
--
Our running joke used to be:
Marketing: We need it real bad!
Engineering: How bad do you need it?
Marketing: <puzzled look>
Engineering: Careful what you wish for... OK, Ops. Ship it!
I've had situations with customers who require a fix as soon as possible, because if the system is down they are losing money. When this situation occurs, we have two goals in mind:
(1) Get the customer up and running again as fast as possible. This is as often as not some sort of workaround that is not pretty, nor is it permanent, but it works. The workaround does not get thorough testing (impossible within the time frame), but the customer is aware of this and willing to accept the risks.
(2) Get the customer a proper, version controlled, patch that they can install to fix the problem permanently. This can take weeks, most of that time being testing. If the customer is insistent we will ship them the proper patch before it is fully tested (again, making them aware of the risks) and continue testing so that we can send the customer some warm and fuzzy news later on (or, if we find a problem, another patch).
Life is like a web application. Sometime you need cookies just to get by.
I made them believe it was a hardware problem!
Engineering is the art of compromise.
Maybe the customer is being unreasonable.
Maybe the developer is being unreasonable.
It isn't possible to determine which from either person's viewpoint. You will ALWAYS think that you're right and that the other person is unreasonable.
Which is why you need criteria for bug escalation. Generating an incorrect response on 1 type of transaction for 1 specific scenario that may pop up once a year is far less important than a bug that corrupts the entire database.
And if your product is considered "mission critical", I would expect a data corruption bug to be fixed within 24 hours. Even if it is nothing more than rolling back the recent patches and re-issuing the previous version.
I'm an embedded developer, and when my stuff goes wrong, it can *really* do bad stuff. I've literally pushed fixed firmware to a controller running in a production scan/sort environment within five minutes of finding the bug, because it threatened to completely bring down a huge sort operation (and by huge, I mean 1 million+ pieces that day alone). I've also stayed up all night tracking down a bug crashing a device used by one of our larger (hundreds of millions of dollars per year) customers. Those, though, are the exception, and are driven by the massive financial and PR consequences of not getting it done right now. Throw caution to the wind, code and load if you are reasonably sure what's wrong and the stakes of not fixing it are high enough.
The usual bug fix cycle depends on complexity, impact, and risk. High risk of breaking things and low impact? Generally gets scheduled for the next release (4ish times per year). Low complexity and risk but medium impact? Code today, regression test the rest of the week, push this weekend. On average, mission critical bugs can get fixed in 8 hours or less around here, small to medium stuff is put on a weekly(ish) cycle with *lots* and *lots* of testing, and large stuff gets rolled to the next major release, unless it just can't wait that long.
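The triage described above can be sketched as a small lookup. This is only an illustration under assumptions: the `Level` scale, the `schedule_fix` function, and the exact rule order are hypothetical, chosen to mirror the cases mentioned (fix now, weekly cycle, next release), and are not the poster's actual process.

```python
from enum import IntEnum


class Level(IntEnum):
    """Hypothetical three-step scale for complexity, impact, and risk."""
    LOW = 1
    MEDIUM = 2
    HIGH = 3


def schedule_fix(complexity: Level, impact: Level, risk: Level,
                 mission_critical: bool = False) -> str:
    """Map a bug's complexity, impact, and risk to a release slot.

    The rules are assumptions that loosely follow the comment above.
    """
    if mission_critical:
        # The stakes of not fixing it are high enough to act right away.
        return "fix now (hours)"
    if risk == Level.HIGH and impact == Level.LOW:
        # Not worth the chance of breaking things between releases.
        return "next scheduled release"
    if complexity == Level.HIGH:
        # Large work is rolled into the next major release.
        return "next major release"
    # Small to medium work: code, regression test, push on the weekly cycle.
    return "weekly cycle"


if __name__ == "__main__":
    print(schedule_fix(Level.LOW, Level.MEDIUM, Level.LOW))
    print(schedule_fix(Level.MEDIUM, Level.LOW, Level.HIGH))
    print(schedule_fix(Level.LOW, Level.HIGH, Level.LOW, mission_critical=True))
```

Writing the rules down like this, even informally, gives support and development something concrete to argue about instead of a raw deadline.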
We generally get fixes for real bugs out within 24 hours, unless the problem is traceable to the OS, the only factor really out of our immediate control. Even then, we do a quick evaluation to see if we can replace the OS function. Over the years, we've replaced quite a few of them, but rarely within 24 hours.
But we know our code backwards and forwards; I wrote the majority of the current codebase myself, and I can generally get to within a few lines of the problem just by a bug's description... the rest is a matter of minutes and testing. This app is very large - comparable to Photoshop in terms of feature count - but it is also very stable after 15 years of whack-a-bug and a continuous drive to make the internal structure as orderly and regular as possible.
It is my observation that the more programmers you have involved, the slower your turnaround time (for everything from bugs to features) will be. Likewise the larger the entity, the slower it will generally move. Almost every layer of management and corporate compartmenting disease will contribute to slowing down the process.
For the apps that I use that I have had the experience of reporting bugs, it is my general experience that bugs often are never fixed at all. One browser, "Omniweb", truly my favorite in terms of features, has bugs that make it essentially unusable for me. Crashing, slowing, lockups and so on - really serious problems. I've reported them, they never were fixed, in fact the software was never updated. Eventually, I just went back to firefox. Then as Leopard came out, after years of doing nothing, they released a "Leopard version" in which, perhaps, I might find those bugfixes if I looked... but as I say, I have moved on and no longer have any enthusiasm for the product. Slow bug repair (or ignoring them) is synonymous with telling your customers you really don't care what kind of experience they have with your software.
Apple, with all their emphasis on customer experience, does this too. They've had bugs in hand for very long periods where they simply don't address them. If your bug isn't something they think will affect a lot of people, it isn't likely to be fixed. I've not yet purchased Leopard, preferring not to catch early-adopter syndrome bugs myself, but when I do, I would not be the least bit surprised to find you still can't refresh a remote share that's been changed by the remote OS; that the wifi differs hugely in compatibility between PPC and Intel hardware; that mail still hoses the sent mail box based on the return address; that shell fonts are poorly rendered; that shell ANSI compatibility is still broken; that the OS still provides locked-up beachballs at the most inconvenient moments; that the OS still puts the wrong things away on the HD when RAM gets tight, and consequently becomes massively unresponsive... Basically, Apple doesn't have good control of their OS, are unable to respond to bugs in a timely fashion, so much so that they triage out bugs based on report counts, and the common patter is that Apple provides a great customer experience. So while my own experience is that bug fixes are important and can be quick in turnaround, here's Apple showing us that you can make a complete thrash out of the entire bugfix issue and still come out smelling like roses. So is a few weeks too long? Probably not, if you have a good marketing department. :-)
I've fallen off your lawn, and I can't get up.
From nearly forty years of programming (yes, since the IBM 026 keypunch days), I can tell you with absolute certainty that the more that you do for management, the more that they will want from you. It is not your responsibility to bear all the punishment for the lack of foresight and resource allocation on their part.
Consider this: What would be the managerial response if you asked for a cost of living salary increase and that you needed it within 48 hours? Do you think that they would be willing to work day and night to make that happen?
Working in panic mode is not professional behavior, and it certainly is not conducive to good engineering practices. Furthermore, it is detrimental to long term company survival. Engineers who support continued unreasonable demands have only themselves to blame for enabling poor strategic planning by management.
Even if the bug is obvious, it doesn't mean that your fix
1)Works
2)Works correctly for all corner cases
3)Does not have unintended side effects
4)Didn't accidentally include some other changes you were working on before, which are not ready for production.
You still need to QA. Attitudes like yours are why the quality of software is so poor.
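Point 4 in the list above is the one check that lends itself to automation. A minimal sketch, assuming you can list the files a patch touches and the files it was meant to touch; the `stray_changes` helper and the file names are hypothetical, and points 1 to 3 still need real QA.

```python
def stray_changes(changed_files, intended_files):
    """Return files touched by the patch that were not meant to be part of it."""
    return sorted(set(changed_files) - set(intended_files))


if __name__ == "__main__":
    # Hypothetical file names, for illustration only.
    changed = ["billing/rounding.py", "billing/export.py"]
    intended = ["billing/rounding.py"]
    print(stray_changes(changed, intended))
```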
I still have more fans than freaks. WTF is wrong with you people?
The customer described a program they wanted (to run on an embedded system). I estimated 3-4 months. They asked for 30 days or less. I explained what they'd get if I banged it out that fast - something that would work most of the time and not lose too much data. They then explained that the program would save them over $1,000,000 a month. If it quit working, they quit saving money, but nothing else bad would happen.
So, I saluted and said I'd try really hard for 3 weeks for the first version, then about three months longer for a version that would work all the time. Which is what happened.
Do you know the impact on this customer of not having the fix that soon? Maybe it's worth it to them...
I work for a large healthcare organization and typically have very fast turn-around times (bugs often get squished within an hour). For clinical applications and other core applications, though, we're much more methodical and careful.
I often explain to the user that I can push changes out immediately, but it introduces certain risks. I then detail the risks they may face, and that if they say to go ahead anyway, at least they'll be aware of what might happen.
Really, it depends on your environment, and what needs to be done.
I'll use one of my web sites as an example. It's all PHP and Perl, so ya, it's programming (I'm sure people will argue this).
Since I wrote all the code, I know it all inside and out. If you say "there's a problem [here]", I know exactly what file to look in, and what code to look for. I've banged out changes, tested them, and put them into production in a matter of minutes.
On a high traffic web site, we had a java applet which was being used by about 25,000 people per day. For little things, I'd change the code, test on all applicable platforms, and roll out the change in a few hours. Even then, the bosses were sometimes displeased with the time it took. Since I was careful to test, I never rolled out bad code, so I was never pushed into the long QA cycles.
Working with one company, things were a lot different. It went something like this.
1) Propose the change to your manager, with supporting documentation.
2) Manager would go to the project coordinator (i.e., customer liaison)
3) project coordinator would go to the customer
4) customer would approve the change.
Up to here was anywhere from an hour to a week. Sometimes the customer would put stipulations on the change, such as "there's a big event happening, or going to happen, don't make the change until X time."
5) document the proposed changes
6) hold a meeting with development, QA, the project coordinator, and management. Discuss the potential changes.
1-3 days later
7) hold another meeting with the same people to rehash the changes.
1-3 days later
8) hold another meeting with the same people to rehash the changes.
9) Write the changes. Make them available to the QA team.
3-7 days later
10) Explain to the QA team that the errors they are experiencing with the fix have nothing to do with the fix, they were preexisting problems with another piece of code.
1-7 days later
11) hold another meeting with development, QA, project coordinator, and management, to explain that the error has been fixed with the supplied changes. The other problems are elsewhere.
1-3 days later
12) hold a strategy meeting to plan on how to fix the other problems.
13) fix the other problems, and break more things.
1-3 days later
14) have QA test the other changes.
14) roll back changes in step 13
15) beta test the previous changes, and notify customer
16) Customer balks at other pre-existing problems.
17) Repeat steps 5 to 15 again, until the customer gets tired of balking.
18) Implement changes.
Then start the process all over with step 1 to fix the other pre-existing problems.
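Adding up the waits quoted in that list gives a rough turnaround for a single pass. This is a sketch under assumptions: the waits run strictly one after another, only the delays stated in the list are counted (not the meetings or the coding itself), and the repeats in step 17 are ignored. The names `WAITS_DAYS` and `total_wait` are made up for the example.

```python
# Waits quoted in the list above, as (minimum, maximum) in days.
WAITS_DAYS = [
    (1 / 24, 7),  # steps 1-4: anywhere from an hour to a week
    (1, 3),       # after step 6
    (1, 3),       # after step 7
    (3, 7),       # after step 9
    (1, 7),       # after step 10
    (1, 3),       # after step 11
    (1, 3),       # after step 13
]


def total_wait(waits):
    """Return (best_case, worst_case) total in days for one pass."""
    best = sum(low for low, _ in waits)
    worst = sum(high for _, high in waits)
    return best, worst


if __name__ == "__main__":
    best, worst = total_wait(WAITS_DAYS)
    print(f"one pass: {best:.1f} to {worst:.0f} days")
```

That works out to roughly 8 to 33 days of waiting for one pass, before any repeats.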
The solution really is...
1) Identify the problem.
2) Gather together the appropriate staff who won't talk outside of your group.
3) Fix, internally test, and implement the resolution.
4) If anyone asks, there was no problem to start with, and you were all really working on steps 5 to 15 of the previous plan on another problem.
Funny how that works.
But, it's a matter of, is it a trivial fix, or something that requires serious rewriting? Did someone miss trapping invalid input in one line, or is it a poor coding practice through all of the code? Is it an included library that simply needs to be upgraded and recompiled?
Serious? Seriousness is well above my pay grade.
At the risk of getting modded "offtopic" I will say what everyone is thinking and take a hit for the team
IS THERE ANY WAY TO BAN THIS ASSHOLE!!!! (pardon the little pun I threw in)
Goatse was funny 10 years ago but it's really stale.
Make SELinux enforcing again!