Why Is Less Than 99.9% Uptime Acceptable?
Ian Lamont writes "Telcos, ISPs, mobile phone companies and other communication service providers are known for their complex pricing plans and creative attempts to give less for more. But Larry Borsato asks why we as customers are willing to put up with anything less than 99.999% uptime. That's the gold standard, and one that we are used to thanks to regulated telephone service. When it comes to mobile phone service, cable TV, and Internet access, service interruptions are the norm, and everyone seems willing to grin and bear it: 'We're so used to cable and satellite television reception problems that we don't even notice them anymore. We know that many of our emails never reach their destination. Mobile phone companies compare who has the fewest dropped calls (after decades of mobile phones, why do we even still have dropped calls?) And the ubiquitous BlackBerry, which is a mission-critical device for millions, has experienced mass outages several times this month. All of these services are unregulated, which means there are no demands on reliability, other than what the marketplace demands.' So here's the question for you: Why does the marketplace demand so little when it comes to these services?"
Probably because of the cost. I do network design for a fairly large telco, and let me tell you the cost goes up exponentially with the number of "9"s that the business asks for.
Are these kinds of outages really so common? Mobile phones I absolutely agree with. On the other hand, I literally cannot remember the last time I lost cable or my internet. I've literally lost power more frequently than either of them (maybe 4 times in the past year) and lost water once. Emails not making it to their destination--again, does this really happen? In the decade plus I've been using internet email, I can't off the top of my head ever think of any "lost" email unless it was sent to a wrong address or something.
While you deserve the mod points, it should also be noted that consumer expectation is strangled into submission within 20 minutes on the first support call they make to ask about better service quality. I know a guy who is locally famous because he will spend 4, 5, 6 or more hours on the phone with customer service, supervisors, managers and anyone on the board of directors that he can find a phone number for. What is he fighting for? Discounted service or reparations for lost service(s). That's right, it takes hours on the phone to get one of those companies to own up to, and pay for, losses accrued by their customers through loss of service.
In truth, most consumers won't complain when they should, so there is no marketplace pressure on those businesses to aim for five nines uptime.
I think the term you're looking for is "managing expectations." Here's a little article about it from the IT side. It's something that Microsoft and the telcos have become so good at. If you keep expectations low and give them a little better, they'll be more than happy. If you give the same, but you promised the world, you get a bunch of unsatisfied customers.
As a guy who does communications in the U.S. Navy, I can attest to this. If the United States military can't guarantee 99.999% uptime on communications in all conditions, what makes anyone think it's possible in the private sector?
I was born in 1964. I have no recollection of POTS telephone service ever being unavailable.
Electricity was expected to drop out a few times every summer, and until someone figures out how to tell lightning where to go, I expect it will continue to happen. In my part of Canada, however, power is continuously available from October to April no matter what. Even if you don't pay your bill. The only winter power outage of note I can think of offhand was the great Ice Storm of 1998, one of the most spectacular cases of force majeure I've witnessed in my life.
In my part of the world, at least, power and telephone were life-and-death services and legislation mandated their reliability.
Crumb's Corollary: Never bring a knife to a bun fight.
My electricity isn't 99.999% uptime (that's about five minutes of downtime in a year), which would require me to get a UPS
My consumer grade equipment isn't 99.999% uptime (with luck, maybe I guess but there's no ECC, redundant power etc).
My software isn't 99.999% uptime (ok, so the kernel is stable. When X crashes, so does everything of importance on a desktop)
If there's something urgent, you CALL me anyway.
I'd rather take a line with 99.5% uptime (that's nearly two days without internet per year) that's 10x faster and costs 10x less. And that doesn't even count that I have Internet at work, or via my cellphone, or via a webcafe or any number of other easily available sources. The only real killer I can think of is if you only telecommute and can't go to work, but even then I figure the nearest Starbucks will let you occupy a corner with some purchases.
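For anyone who wants to check the arithmetic behind the "nines", here is a rough sketch in Python. The function name and the choice of a 365-day year are my own assumptions for illustration; it only converts an availability percentage into a yearly downtime budget.

```python
# Assumption: a plain 365-day year (no leap day).
SECONDS_PER_YEAR = 365 * 24 * 60 * 60


def downtime_seconds_per_year(availability_percent: float) -> float:
    """Seconds of allowed downtime per year at the given availability."""
    return SECONDS_PER_YEAR * (1 - availability_percent / 100)


if __name__ == "__main__":
    for pct in (99.0, 99.5, 99.9, 99.99, 99.999):
        secs = downtime_seconds_per_year(pct)
        print(f"{pct}% -> {secs / 3600:.2f} hours ({secs / 60:.1f} minutes) per year")
```

By this math, five nines leaves roughly five minutes of downtime a year, while 99.5% leaves a bit under two days.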
Live today, because you never know what tomorrow brings
The origins of an OS really show through a lot of the time. Windows started out as a single user OS, so rebooting was OK because the only person you messed up was the guy sitting in front of the screen. It eventually evolved into a multi-user OS, but the "just reboot!" mentality persists to this day.
Windows NT (i.e. contemporary Windows) has been a multiuser OS since its first release.
The reason the "just reboot" mentality persists is simply because 99% of the time it *is* used as a single-user OS, and no-one else is impacted. This has _zero_ to do with the architecture and everything to do with the user. Linux would be (and is) treated in the same way in similar situations.
Linux/Unix on the other hand started out life as a multi-user OS. Rebooting was a big no-no, because you'd affect countless people logged in, and you'd get yelled at for ruining someone's work.
UNIX actually started out as a single-user OS and the multiuser aspect was bolted on later. Linux didn't, of course, because by the time Linus banged together his UNIX rip-off, UNIX had been multiuser for quite a while.
However, again, the attitudes towards how their relevant users treat servers and workstations have about 10% to do with their architectures and 90% to do with their knowledge. DOS and OS/2 were single user, yet frequently had BBSes and similar running off them. You can be assured the people running those BBSes were far less likely to have the "just reboot" mentality.
Further, the other reason most people have that attitude is because to them a computer is just another appliance. When other appliances act up, pretty much the first thing _everybody_ does is turn it off and back on again. Why on Earth would you expect them to treat a computer any differently?
Windows administrators categorically will try rebooting the damn thing first to fix any problem (and it usually works). Linux administrators will only try this as a last resort (and it almost never works).
No. Inexperienced admins will try rebooting first, regardless of platform. Experienced admins will not. Incidentally, there are numerous classes of problems on Linux (and UNIX in general) which are more quickly and easily "fixed" with a reboot.
Anyway, at Microsoft the idea that you can somehow tweak Windows just right so rebooting isn't necessary is crazy.
I can't even remember the last time I had to reboot any of my Windows machines without a good reason (e.g. patching).
Finally, there's nothing wrong with rebooting _anyway_. If your service uptime requirements are affected by a single machine rebooting, your architecture is broken. All the reboot does is demonstrate that it's broken without a real problem actually occurring.
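To put a rough number on why a single machine rebooting shouldn't matter, here is a back-of-the-envelope sketch in Python. It leans on two big assumptions of mine: machines fail independently, and failover between them is instant and perfect. The function name is made up for illustration.

```python
def combined_availability(single: float, replicas: int) -> float:
    """Availability of a service that stays up as long as at least one of
    `replicas` machines is up.

    Assumes independent failures and perfect failover, which real
    deployments only approximate.
    """
    if replicas < 1:
        raise ValueError("need at least one replica")
    # The service is down only when every replica is down at once.
    return 1 - (1 - single) ** replicas


if __name__ == "__main__":
    for n in (1, 2, 3):
        print(f"{n} machine(s) at 99% each -> {combined_availability(0.99, n):.6f}")
```

Under those assumptions, two boxes that are each only 99% available already give about 99.99% for the service, so one of them rebooting for a patch doesn't touch the service's uptime.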
Sysadmins comparing machine uptimes is like ricers comparing spoilers.