Why Is Less Than 99.9% Uptime Acceptable?
Ian Lamont writes "Telcos, ISPs, mobile phone companies and other communication service providers are known for their complex pricing plans and creative attempts to give less for more. But Larry Borsato asks why we as customers are willing to put up with anything less than 99.999% uptime? That's the gold standard, and one that we are used to thanks to regulated telephone service. When it comes to mobile phone service, cable TV, Internet access, service interruptions are the norm — and everyone seems willing to grin and bear it: 'We're so used cable and satellite television reception problems that we don't even notice them anymore. We know that many of our emails never reach their destination. Mobile phone companies compare who has the fewest dropped calls (after decades of mobile phones, why do we even still have dropped calls?) And the ubiquitous BlackBerry, which is a mission-critical device for millions, has experienced mass outages several times this month. All of these services are unregulated, which means there are no demands on reliability, other than what the marketplace demands.' So here's the question for you: Why does the marketplace demand so little when it comes to these services?"
Probably because of the cost. I do network design for a fairly large telco, and let me tell you the cost goes up exponentially with the number of "9"s that the business asks for.
Are these kind of outages really so common? Mobiles phones I absolutely agree with. ON the other hand, I literally cannot remember the last time I lost cable or my internet. I've literally lost power more frequently than either of them (maybe 4 times in the past year) and lost water once. Emails not making it to their destination--again, does this really happen? In the decade plus I've been using internet email, I can't off the top of my head ever think of any "lost" email unless it was sent to a wrong address or something.
While you deserve the mod points, it should also be noted that consumer expectation is strangled into submission within 20 minutes on the first support call they make to ask about better service quality.I know a guy who is locally famous because he will spend 4,5,6 or more hours on the phone with customer service, supervisors, managers and anyone on the board of directors that he can find a phone number for. What is he fighting for? discounted service or reparations for lost service(s). That's right, it takes hours on the phone to get one of those companies to either own up to, and pay for losses accrued by their customers through loss of service.
In truth, most consumers won't complain when they should, so there is no marketplace pressure on those businesses to aim for five nines uptime.
Support NYCountryLawyer RIAA vs People
I think the term you're looking for is "managing expectations." Here's a little article about it from the IT side. It's something that Microsoft and teleco's have become so good at. If you keep expectations low and give them a little better, they'll be more than happy. If you give the same, but you promised the world, you get a bunch of unsatisfied customers.
As a guy who does communications in the U.S. Navy, I can attest to this. If the United States military can't guarantee 99.999% uptime on communications in all conditions, what makes anyone think it's possible in the private sector?
512 MB RAM, 20 GB disk, 200 GB transfer, five datacenters. $19.95/month.
I'm still waiting for people to scream about the rising gas prices and the record oil company profits. Seems like this would have a greater impact on the general populous than reliable cell phone service.
It has nothing to do with conditioning. They could easily bury power lines to prevent storm outages, but people don't want to pay the costs. That is what 9s in uptime is all about. Paying increasingly more for increasingly smaller additional uptime. I would rather pay my current rates than pay twice as much, but have less downtime. I can live for a day or two with out power after a major storm. If you can't then pay the extra your self and buy a generator. Don't try to force others to subsidize your service requirements.
I was born in 1964. I have no recollection of POTS telephone service ever being unavailable.
Electricity was expected to drop out a few times every summer, and until someone figures out how to tell lightning where to go, I expect it will continue to happen. In my part of Canada, however, power is continuously available from October to April no matter what. Even if you don't pay your bill. The only winter power outage of note I can think of offhand was the great Ice Storm of 1998, one of the most spectacular cases of force majeure I've witnessed in my life.
In my part of the world, at least, power and telephone were life-and-death services and legislation mandated their reliability.
Crumb's Corollary: Never bring a knife to a bun fight.
My electricity isn't 99.999% uptime (that's 30 seconds in a year) which would require me to get an UPS
My consumer grade equipment isn't 99.999% uptime (with luck, maybe I guess but there's no ECC, redundant power etc).
My software isn't 99.999% uptime (ok, so the kernel is stable. When X crashes, so does everything of importance on a desktop)
If there's something urgent, you CALL me anyway.
I'd rather take a line with 99.5% uptime (that's two days without internet per year) that's 10x faster and costs 10x less. Which doesn't include that I have Internet at work, or via my cellphone, or via a webcafe or any number of other easily available sources. The only real killer I can think of is if you only telecommute and can't go to work, but even then I figure the nearest Starbucks will let you occupy a corner with some purchases.
Live today, because you never know what tomorrow brings
Partly correct. What they did was to mass introduce the GUI. 1.0 was a joke as far as usability went. At the same time the 386 was out and the talks of multiprocessing was promising new and exciting computing in the near future.
I don't think they measured squat. Just did their best. Only thing was that there were nobody who could properly design an O/S and complexity, instead of simplicity, ruled the day.
What we are seeing is the very best they as group are able to produce.
They have never been great at marketing either. But they were really the first to push the GUI with success. Don't forget Apple became a very closed platform. They did not attract masses the way the open IBM PC did.
Right there history shows how important open standards are to success. Apple was considered this fantastic success story but in reality they cut it short and did not buy the masses the way the Johnny came lately IBM PC did. But we are slow when it comes to learning from history.
What they have been good at is market lock-in, vender lock-in and many other types of lock-in. (The problem really is that they had never heard about duty and were only interested in money.) We all thought they would get it right sooner or later and deliver a good platform that would allow happy computing. The fact that they specialized in adopting good standards and then corrupt them so that you got locked in was a very calculated development.
At one point Gates himself said that Unix was the way to go. Then he decided to do it better but clearly never understood what made Unix so good (simplicity). Torvald on the other hand was ONLY looking for simplicity. Which is why it fit so well into the general Unix design.
Look at windows, it is filled with arbitrary complexities and is horribly inefficient. Never mind when upper management throws fits and yell at staff, I've never found that conducive to good programming, or business.
Gates cheated his way into O/S design, used people from VAX who's memory management problem were dragged over to windows. Built a kernel in BASIC! Haha! And got away with it for years!
Someone who knew more about systems picked the Unix design and rewrote history based on technology, and was not motivated by money. Interesting to see how much we like to be able to just do what we need. Imagine if IBM had released Linux. With all the corporate support for let's say $100. Then opened it up with a GPL license.
Microsoft would not be sitting pretty at all. The O/S2 collaboration would not have happened and Gates would not have learned his lessons from that. For all their success I've never considered them much of a success where it really matters. Integrity in product and care for customers. I have people send me Brandy, fine wines and other tokens of their appreciation after sales. Because I believe in treating other people the way I like to be treated, and I really care about my clients.
The origins of an OS really show through a lot of the time. Windows started out as a single user OS, so rebooting was OK because the only person you messed up was the guy sitting in front of the screen. It eventually evolved into a multi-user OS, but the "just reboot!" mentality persists to this day.
Windows NT (ie: contemporary Windows) has been a multiuser OS since it's first release.
The reason the "just reboot" mentality persists is simply becaus e99% of the time it *is* used as a single-user OS, and no-one else is impacted. This has _zero_ to do with the architecture and everything to do with the user. Linux would be (and is) treated in the same way in similar situations.
Linux/Unix on the other hand started out life as a multi-user OS. Rebooting was a big no-no, because you'd affect countless people logged in, and you'd get yelled at for ruining someones work.
UNIX actually started out as a single-user OS and the multiuser aspect was bolted on later. Linux didn't, of course, because by the time Linus banged together his UNIX rip-off, UNIX had been multiuser for quite a while.
However, again, the attitudes towards how their relevant users treat servers and workstations have about 10% to do with their architectures and 90% to do with their knowledge. DOS and OS/2 were single user, yet frequently had BBSes and similar running off them. You can be assured the people running those BBSes were far less like to have the "just reboot" mentality.
Further, the other reason most people have that attitude is because to them a computer is just another appliance. When other appliances act up, pretty much the first thing _everybody_ does is turn it off and back on again. Why on Earth would you expect them to treat a computer any differently ?
Windows administrators categorically will try rebooting the damn thing first to fix any problem (and it usually works). Linux administrators will only try this as a last resort (and it almost never works).
No. Inexperienced admins will try rebooting first, regardless of platform. Experienced admins will not. Incidentally, there are numerous classes of problems on Linux (and UNIX in general) which are more quickly and easily "fixed" with a reboot.
Anyway, at Microsoft the idea that you can somehow tweak windows just right so rebooting isn't necessary is crazy.
I can't even remember the last time I had to reboot any of my Windows machines without a good reason (eg: patching).
Finally, there's nothing wrong with rebooting _anyway_. If your service uptime requirements are affected by a single machine rebooting, your architecture is broken. All the reboot does is demonstrate that it's broken without a real problem actually occurring.
Sysadmins comparing machine uptimes is like ricers comparing spoilers.
Somewhat off-topic, but an anecdote related to massive consumer calls to tech support.
Back in the day, I was an IRC Operator for a large Undernet server, and there came a time where the new thing for troublemakers was to use open proxies on cable connections to flood channels/servers. One cable provider had a particularly large number of clients whose setup was used to attack the network and generally cause trouble.
At first, being in the area of that provider, I called tech support and escalated the issue as much as I could. My point was that they were ultimately responsible for the abuse coming from their network. Long story short, for months I got nothing but "we'll look into it".
After a particularly nasty week, and after consulting with the server admins, we decided to ban the whole ip range of that provider from using our server (they could still use the rest of Undernet, but our server was popular for them). The ban kicked > 1000 clients from the server with a message like : Your provider does not respond to abuse complaints. Contact your provider's technical support to have this issue resolved.
10 minutes later, there was a 30 minute wait at the provider's tech line. On a sunday afternoon. One hour later, I got an email saying they were blocking inbound port 1080 at their router to protect their clients machines from being abused.
I guess the point is, when something generates enough backlash, preferably with a nice surprise effect, things can change. The hard thing is to organize people enough to harass the company about it.
"I remember Y1K, every abacus had to get another bead"