HA Metrics on Non-clustered Systems?

Posted by Cliff on Friday September 21, 2001 @11:28AM from the you-do-the-math dept.

javester asks: "Has anybody given any thought to compiling high-availability metrics for the different OSs on a non-clustered system? Is 'High-availability Windows' an oxymoron? Can you even get close to the 5 9s (99.999%, which is about 5 minutes of downtime a year) on a typical stand-alone Windows 2000 Server running IIS 5 with the typical patch-it-up, three-finger-salute routine? If you are just serving up a web site on a plain-vanilla Windows box, how highly available can it get? By my calculations, with the typical reboot cycle being 3 minutes, and with a security patch requiring a reboot being released on a weekly basis - a stand-alone high-availability Windows box loses about 156 minutes a year just applying patches! So it can never get past the third 9! In *nix environments, reboots are not required as often (except for kernel changes - how APT!), since you can recycle the appropriate daemon without restarting. But really, has anybody made a formal study?" Now wouldn't this make an interesting college project?
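For anyone who wants to check the arithmetic in the question, here is a minimal sketch in Python. The function names are made up for illustration, and the inputs are simply the question's own assumptions (a 3-minute reboot, one patch reboot per week, a 365-day year), not measured figures.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes, ignoring leap years


def availability(downtime_minutes: float) -> float:
    """Fraction of the year the box is up, given yearly downtime in minutes."""
    return 1.0 - downtime_minutes / MINUTES_PER_YEAR


def downtime_budget(nines: int) -> float:
    """Yearly downtime (minutes) allowed at a given number of nines."""
    return MINUTES_PER_YEAR / 10 ** nines


def nines_reached(downtime_minutes: float, cap: int = 9) -> int:
    """Count how many nines a yearly downtime figure still satisfies."""
    n = 0
    while n < cap and downtime_minutes <= downtime_budget(n + 1):
        n += 1
    return n


# The question's assumptions: one 3-minute reboot per weekly patch.
REBOOT_MINUTES = 3
PATCHES_PER_YEAR = 52
patch_downtime = REBOOT_MINUTES * PATCHES_PER_YEAR  # 156 minutes

print(f"patch downtime per year: {patch_downtime} min")
print(f"availability:            {availability(patch_downtime):.4%}")
print(f"nines reached:           {nines_reached(patch_downtime)}")
print(f"five-nines budget:       {downtime_budget(5):.3f} min/year")
```

Under those assumptions the box sits at roughly 99.97%, which clears three nines (525.6 minutes a year allowed) but not four (52.56 minutes), so the "never gets past the third 9" claim follows from the stated inputs. Change `PATCHES_PER_YEAR` or `REBOOT_MINUTES` to test other patch schedules.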

Your Question is -1 Troll, and here's why by Anonymous Coward · 2001-09-21 14:15 · Score: 1, Insightful

First, the troll part is that patches for Microsoft's OS are not released weekly. Moreover, patches that actually apply to your situation and your installed software will come up monthly, at most.

Second, this nonsense of 5, 6, or 7 nines of availability is bullshit. No one is using the box that often. Cluster, bring boxes down during the normal 9-5 workday, fix them, and bring them back online. Give the support staff a life, and reduce cost by having only one shift with a rotating pager.

Third, this whole Six Sigma nonsense on availability is bullshit. A box that has to be available every minute of the day is the weak link in the production chain. Go back, cluster, and re-architect the whole thing.

Fourth, most of the boxes I see with high uptime (as in years of uptime) aren't doing squat. You're effectively saying that a box like that was set up perfectly, with perfect software, perfect administration, and perfect hardware; that no part of the software ever needed updating, patching, or restoring from a backup; and that none of the hardware aged, got flaky, failed, needed expansion, or did anything else that would normally happen. You've been to Vegas; calculate the odds of it all working out wonderfully.

Fifth, having two or more boxes clustered is cheaper, and makes more sense, than trying to keep one box up for years at a time. The space shuttle has, what, 5 or 7 computers, checking each other? DNS and mail servers have secondaries. In fact, anyone who's smart, and plans to make money, doesn't put all their computing eggs into one basket.
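To put a rough number on the clustering argument, here is a back-of-the-envelope sketch in Python. It is not from the comment: it uses the standard parallel-redundancy formula and assumes, loudly, that box failures are independent, that failover is instantaneous, and that patch reboots are staggered so the boxes are never down together by design. Real clusters meet none of these perfectly. The function name and the 99.97% input (a box rebooted 3 minutes a week) are illustrative.

```python
def cluster_availability(single_box: float, boxes: int) -> float:
    """Availability of `boxes` redundant machines, any one of which can serve.

    ASSUMPTION: outages are independent and failover costs no time, so the
    service is down only when every box is down at the same moment.
    """
    if boxes < 1:
        raise ValueError("need at least one box")
    return 1.0 - (1.0 - single_box) ** boxes


MINUTES_PER_YEAR = 365 * 24 * 60

# Illustrative input: a single box at about 99.97% (156 minutes down a year).
single = 0.9997

for n in (1, 2, 3):
    a = cluster_availability(single, n)
    downtime = (1.0 - a) * MINUTES_PER_YEAR
    print(f"{n} box(es): {a:.8%} available, about {downtime:.4f} min/year down")
```

Under these idealized assumptions a second mediocre box does more for availability than any amount of polishing on the first. Correlated failures (shared power, shared network, the same bad patch on every box) are exactly what the formula leaves out, so treat the output as an upper bound.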