Uptime Realities in the Internet World
schnurble writes: "My former boss has written an interesting article on the realities of uptime in the Internet World. It poses the idea that four and five nines of reliability are too expensive to be realistic, especially in the post dot-bomb economy. It's an interesting read, especially if you answer to an 800lb gorilla for outages and uptime issues."
Wouldn't you know it, an article about uptime...and slashdotted. Looks like he needs a mirror.
How many engineers out there have heard the marketing / sales 'it has to be always available' and priced out an infrastructure accordingly.
Even recently I'm working with a customer who wants a compromise between price and availability - but it still needs five nine's
Availability is infrastructure plus process. You need to have the supporting process to go along with the hardware - maintenance schedules, change management (well FCAPS in general), etc. It's not just a big box.
said if i can get this mentioned on slashdot, i'll get the raise after all...
::.. check out some Cell Phone Reviews
Like the Telco... voice grade telco. Better than the power company.
Our web server does about 4 9's, which is a downtime of about 8 hours a year, I think. I really suck at math though. I mean it.. I'm so bad at math I have no idea if thats right. I said "well theres 8544 hours in a year, so 8 divided by that is 0.0009, so thats about 4 9s. I think. 8 hours of downtime isnt that bad. I think the next step up from 8 hours of downtime is essentially those megacorps that have redundant systems, and sirens go off and people die when their server goes down for under a second. In fact, I think if their server actually went down for more than a second, some sort of structual damage to the building hosting it is the only likely scenario. Course, that's closer to 7 9s. I cant figure out how long any of the other 9s are cause I only knew what our average downtime is, and could do the math that way only. Wow, its really hot in here.
Could someone with an 8th grade math education please post the amounts of downtime 1 through 9 9s are, please?!
slashdot: where everyone yells sarcastic metaphors to themselves to understand the issue
I think we just knocked his server down to two nines by slashdotting it.
What else would motivate someone to post an ex-boss' e-mail address on the front page of slashdot?
Five nines uptime is cheap and easy. It all boils down to where you put the decimal point.
Obliteracy: Words with explosions
Let me give you a hypothetical case. One of our clients does about $50k/month on their web site. When the site was built, they were only expecting $10000-$15000/month. At the time, NN4 compatibility wasn't important, because the extra cost ($10k) wasn't going to be worth it. With NN4 sitting between 5% and 10% each month, they have decided that NN4 compatibility is important in the next version.
When we launched, 3 days of downtime a month was considered okay. It was considered a better choice than spending an extra $5k on hardware for redundancy. Well, when the site broke $40k/month, we immediately decided that that was no good and invested in the redundancy.
The site has had a few 15 minute outages over the past 6 months, and a 1 day outage over a holiday weekend (not a big deal). However, if the site doubles in revenue again, downtime is becoming less acceptable, and we'll drop $10k to avoid it.
If your site sucks and no one visits, downtime doesn't matter. If you are making lots of money, downtime does matter. $10k on hardware is worth it if the downtime would cost you $25k?
Alex
I'm sorry, I just had a type mismatch error. I saw "oracle" and "isn't really that expensive" in the same post.
DO NOT DISTURB THE SE
"Are ve up?"
"Nien."
"Are ve up yet?"
"Nien."
"How about NOW?"
"Nien."
"Vill ve be comink up soon?"
"Nien."
"Vill ve be up next veek?"
"Nien."
God is real unless declared integer
The Scenario
Pagers going off. Phones ringing. People shouting fragments of conversations over the tops of cubicles. Groups of people huddled around monitors. Others dashing up and down the hallways, sticking their heads into office doors for just a moment, then scampering along to the next doorway. You are frantically talking on your cell phone, silencing your pager, and yelling into the speakerphone on your desk while typing on two different keyboards attached to three different monitors.
Sound familiar? It's a classic case of the dreaded 'downtime' disease, a terrible ailment where none of your systems work and for reasons you can't always understand. Of course, it typically strikes at the most inopportune moments - the launch of a major product upgrade, or right after announcing your partnerships with 5 of the Fortune 100.
Nobody wants downtime. It's a terrible thing that always involves blood, sweat, tears, and inevitably, a loss of money. This is why when you talk to the upper management of any company with a strategic online initiative you'll be told that the IT group has the highest goals, and that downtime is considered to be an anathema to be stamped out vigorously.
Unfortunately, when you talk to the company's IT manager you commonly hear a different story; the resources to back-up the company's lofty online goals are hard to come by. In fact, with the down swing of the last couple years, combined with the fact that IT isn't, at least directly, a revenue generating entity, IT budgets are being reduced while uptime performance levels are expected to be the same. This can just lead to a death march of extremely over-worked IT personnel, and longer, more numerous, occurrences of system downtime. These goals need to be re-evaluated.
Genesis of the 'Five Nines'
We've all heard the mantra of 'five nines', or 99.999% reliability. Somewhere in the depths of the Internet's 'big bang', when systems were slow and cranky, reliability became a major selling point of why one company's system was 'better' than the competition.
First, people talked about being 'two nines' or 99% reliable. Then someone else would top that, and make their product seem better, claiming 'three nines' (99.9%). Not long after that came 'four nines' (99.99%) and then, near the peak of the dot com era, came 'five nines'.
The herd mentality left no room in which to pitch for investment without the 'five nines' claim. "After all," it was thought, ôif everyone else is saying they can provide 'five nines', I'd have to pretend I didn't know what I was doing if I didn't say I could match everyone else's claim."
'Five nines' isn't impossible. It's merely impractical and unnecessary in the world of the Internet. A shocking statement, perhaps, but a truism none-the-less.
We're not talking about launching people into space (which, by the way, is unfortunately done under 'three nines'), or working with nuclear power plants. We're working within the reference of online systems providing services to users both on and off the Internet - nobody dies from a system failure.
The Greasy Steel Bar
Think of uptime as a chin-up bar coated in grease. The higher the reliability desired, the greater the coating of grease. It's clearly tougher to hang to a higher standard of reliability.
What's not so obvious, but very important, is that the higher the uptime target, the worse one does if not prepared. An IT department capable of three nines faced with a bar that's five nines slippery won't even manage the three nines they are capable of doing.
The same thing happened to me once, a little puddlejumper from Dallas or Houstan to Austin, I think it was.
Anywho, the pilot revs up the engine, then throttles it back done. Fine, brakes and throttle work. Throttles back up, trips the brakes, and off we go screaming down the runway.
Then the plane slows down, and stops.
Pilot comes on the intercom and says 'Um, folks, you may have noticed, we didn't take off. A warning light has come on in the cockpit, and we don't know why. Until we do, we're going to stay right here.
Now, that's not the bad part. The heat and humidity, and a plane full of sweaty smelly passengers isn't the bad part, either.
No, the bad part was the pair of off duty pilots in the seats next to me who started, in loving detail, discussing every thing that could possibly be wrong.
Vintage computer games and RPG books available. Email me if you're interested.