Slashdot Mirror


Why Is Less Than 99.9% Uptime Acceptable?

Ian Lamont writes "Telcos, ISPs, mobile phone companies and other communication service providers are known for their complex pricing plans and creative attempts to give less for more. But Larry Borsato asks why we as customers are willing to put up with anything less than 99.999% uptime? That's the gold standard, and one that we are used to thanks to regulated telephone service. When it comes to mobile phone service, cable TV, Internet access, service interruptions are the norm — and everyone seems willing to grin and bear it: 'We're so used cable and satellite television reception problems that we don't even notice them anymore. We know that many of our emails never reach their destination. Mobile phone companies compare who has the fewest dropped calls (after decades of mobile phones, why do we even still have dropped calls?) And the ubiquitous BlackBerry, which is a mission-critical device for millions, has experienced mass outages several times this month. All of these services are unregulated, which means there are no demands on reliability, other than what the marketplace demands.' So here's the question for you: Why does the marketplace demand so little when it comes to these services?"

113 of 528 comments

  1. because they've been conditioned by yagu · · Score: 5, Insightful

    Why does the marketplace demand so little when it comes to these services?

    The marketplace has been duped into believing that this is the best technology can provide. People don't have time to know, understand, or research history and find that technology really can be reliable.

    I'll get modded troll, but I lay much of this at Microsoft's feet. I laughed them off when I first heard of them and their goal of taking over the industry. After all, I'd been working on systems that ran 24x7 with five-9 reliability for years, and DOS/Windows couldn't touch that.

    One time I had an opportunity to visit Microsoft and have lunch with a friend there. I figured while there I'd take the opportunity. I asked them in hushed tones, "Just how do you configure Windows so that you don't have to reboot it all of the time?" They looked at me like I was crazy.

    Technology can provide reliability. The general public is no longer even aware that it's possible.

    1. Re:because they've been conditioned by The+Ancients · · Score: 5, Insightful

      The reasons why Microsoft were so successful (in a business sense) are manifold, but one is not that their products were great, but that they were good enough. They accurately measured what people would put up with at different price points, and serviced the market accordingly. I think ISPs, telcos, etc have done likewise.

    2. Re:because they've been conditioned by Otter · · Score: 4, Insightful
      I'll get modded troll, but I lay much of this at Microsoft's feet.

      Truly, your courage is an inspiration to us all!

      In fact, though, I can tell you that in the pre-Windows days, electricity had outages, television had outages, telephone service had outages, gas service had outages... For the same reason we have them today -- people aren't willing to accept the economic and aesthetic costs of providing those services at the level of reliability you and the author are demanding.

      Incidentally, is it most people's experience that "We're so used [sic] cable and satellite television reception problems that we don't even notice them anymore"? There were some glitches in a broadcast of Zoolander on TBS last weekend, which I'll admit is cause for complaint. (Especially since one wiped out "I feel like I'm taking crazy pills!") But on the whole, I can't say I've seen substantial problems when there wasn't a blizzard or hurricane, and if I'm forced to stop watching TV for an hour or two, it's not the end of the world.

    3. Re:because they've been conditioned by Naughty+Bob · · Score: 2, Insightful

      The marketplace has been duped into believing that this is the best technology can provide.
      I do not believe that this is the cause.

      As is correctly noted above, there are only market pressures involved. When that's the case, customers rarely factor 7 or 8 different metrics (eg. price, quality, reliability etc.) into their decision making. Rather, they identify what they want, then find the cheapest supplier, and provided that there is no compelling reason to avoid the supplier, do the deal.

      This means that suppliers concentrate on maintaining enough of a service that they can advertise without being sued, and getting the price down. They have no reason to do any more.

      My mobile phone operator gives me a good phone, and cheap calls. But their data charges, and roaming charges are extremely uncompetitive. As data/roaming charges make up a small proportion of my bill, I can't justify prioritizing them when I am shopping around for a contract. I am rewarded with a good old gouging.
      --
      "Be light, stinging, insolent and melancholy"
    4. Re:because they've been conditioned by SailorSpork · · Score: 5, Interesting

      I think the term you're looking for is "managing expectations." Here's a little article about it from the IT side. It's something that Microsoft and telcos have become so good at. If you keep expectations low and give them a little better, they'll be more than happy. If you give the same, but you promised the world, you get a bunch of unsatisfied customers.

    5. Re:because they've been conditioned by Vellmont · · Score: 5, Insightful


      One time I had an opportunity to visit Microsoft and have lunch with a friend there. I figured while there I'd take the opportunity. I asked them in hushed tones, "Just how do you configure Windows so that you don't have to reboot it all of the time?" They looked at me like I was crazy.

      In a certain sense.. you were crazy, at least at Microsoft.

      The origins of an OS really show through a lot of the time. Windows started out as a single user OS, so rebooting was OK because the only person you messed up was the guy sitting in front of the screen. It eventually evolved into a multi-user OS, but the "just reboot!" mentality persists to this day.

      Linux/Unix on the other hand started out life as a multi-user OS. Rebooting was a big no-no, because you'd affect countless people logged in, and you'd get yelled at for ruining someone's work.

      It's funny the attitude that comes from the users of each OS. Windows administrators categorically will try rebooting the damn thing first to fix any problem (and it usually works). Linux administrators will only try this as a last resort (and it almost never works).

      Anyway, at Microsoft the idea that you can somehow tweak windows just right so rebooting isn't necessary is crazy. They designed the damn thing so "just reboot!" will fix any problem. This of course is an unacceptable solution to a lot of people out there, but for a lot of people it's obviously reality.

      --
      AccountKiller
    6. Re:because they've been conditioned by Rhaui · · Score: 3, Interesting

      It has nothing to do with conditioning. They could easily bury power lines to prevent storm outages, but people don't want to pay the costs. That is what 9s in uptime is all about. Paying increasingly more for increasingly smaller additional uptime. I would rather pay my current rates than pay twice as much, but have less downtime. I can live for a day or two without power after a major storm. If you can't then pay the extra yourself and buy a generator. Don't try to force others to subsidize your service requirements.
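
      To put numbers on "increasingly smaller additional uptime", here is a minimal Python sketch that converts a count of nines into the downtime it allows per year. The function name and the 365.25-day year are assumptions made for illustration; nothing here comes from a provider's actual SLA.

```python
MINUTES_PER_YEAR = 365.25 * 24 * 60  # assumption: average year length


def downtime_minutes_per_year(nines: int) -> float:
    """Allowed downtime per year for an availability with `nines` nines (3 -> 99.9%)."""
    unavailability = 10.0 ** -nines
    return MINUTES_PER_YEAR * unavailability


if __name__ == "__main__":
    for n in range(1, 6):
        percent = 100 * (1 - 10.0 ** -n)
        minutes = downtime_minutes_per_year(n)
        print(f"{percent:.3f}% uptime allows about {minutes:.1f} minutes of downtime per year")
```

      Each extra nine cuts the allowed downtime by a factor of ten, so the absolute gain shrinks every step: going from 99.9% to 99.99% buys back hours per year, while going from 99.99% to 99.999% buys back less than an hour.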

    7. Re:because they've been conditioned by tverbeek · · Score: 4, Insightful

      Conditioning certainly has to be a big part of it. People put up with crappy wireless phone service because they don't remember (or are too young to know) what an old-fashioned fully-wired telephone conversation sounds like. After a couple decades of cordless and wireless phones, the level of service has gone from "you can hear a pin drop" to "can you hear me now?"

      --
      http://alternatives.rzero.com/
    8. Re:because they've been conditioned by cgenman · · Score: 4, Insightful

      The server is up 99.99% of the time. The server's T1 is up 99.99% of the time. T1's ISP is up 99.99% of the time. The backbone provider is up 99.99% of the time. The cellular ISP is up 99.99% of the time. The cell-to-tower linkage is up 99.99% of the time...

      Eventually, with all of these little points of failure, you're going to get a good sized chunk of fail. Add in things like the inherent instability of wireless technologies and our nation-wide problem with an aging electrical infrastructure, and you have the sorts of occasionally mildly inconveniencing issues that you see today.

      Right now it seems like the things users want to optimize most for are A: speed and B: cost. One day every other month where our home internet is down doesn't seem like the end of the world, especially with the cost of the alternative.
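
      The chain-of-failure argument above can be sketched in a few lines of Python. This assumes the six links listed fail independently and that all of them must be up at once, which is a simplification; the function name is made up for the example.

```python
from math import prod


def series_availability(availabilities):
    """Availability of a chain in which every component must be up.

    Assumes failures are independent, so the probabilities multiply.
    """
    return prod(availabilities)


# Six links at 99.99% each, as listed in the comment.
chain = [0.9999] * 6
total = series_availability(chain)

print(f"end-to-end availability: {total:.4%}")
print(f"expected downtime: {(1 - total) * 365.25 * 24:.1f} hours per year")
```

      Under those assumptions six "four nines" links give roughly 99.94% end to end, so the chain as a whole is noticeably worse than any single link in it.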

    9. Re:because they've been conditioned by kasperd · · Score: 2, Informative

      Linux administrators will only try this as a last resort (and it almost never works).
      I think that about 80% of the time I will know beforehand if rebooting a Linux system is going to solve a particular problem. But even if I'm convinced that a reboot would solve the problem, I usually spend some time looking for a solution that does not involve rebooting. There are multiple reasons why I look for other solutions. Sometimes a reboot is inconvenient because of all the programs that have to be shut down and started again. If I find a solution that does not involve rebooting, I will usually have saved some time on the long term, because next time it happens, I will be back to my work right away. And finally it helps me understand the problem, so maybe I will even be able to prevent it in the future.

      If something used to work and suddenly stopped working, chances are, a reboot will solve it. Obvious exceptions are that the problem could be caused by faulty hardware or a full disk. Those are usually easy to spot.
      --

      Do you care about the security of your wireless mouse?
    10. Re:because they've been conditioned by Vellmont · · Score: 2, Funny


      I'm not saying Windows is better, but the above means you don't have to work a lot with NFS clients on Linux...

      Very true.

      I consider NFS to be the devil. If given the choice, I'll choose a different protocol every time.

      --
      AccountKiller
    11. Re:because they've been conditioned by Vellmont · · Score: 2, Interesting


      Frequent reboots haven't been required since win2k.

      (snicker)

      I've been running windows for years, and this statement is just very funny to me. You must be running some entirely different magical version of windows than I've ever seen, but reboots are EXTREMELY common on 2000, XP, and Vista. The "just reboot" instinct I've seen from multiple different Windows guys is common, and DOES work. I was looking forward to Vista, which claimed it didn't require rebooting as often. That didn't really turn out to be the case. If you really think win2k and beyond doesn't require reboots, I think you either don't run it, or just have a very poor memory.

      --
      AccountKiller
    12. Re:because they've been conditioned by ext42fs · · Score: 2, Interesting

      I'm not going to mod you as troll. But maybe others will mod me. Computer science and all that goes with it is an exact science. Until MS came around. MS destroyed it. Welcome to the dark ages of MS.

    13. Re:because they've been conditioned by drsmithy · · Score: 4, Interesting

      The origins of an OS really show through a lot of the time. Windows started out as a single user OS, so rebooting was OK because the only person you messed up was the guy sitting in front of the screen. It eventually evolved into a multi-user OS, but the "just reboot!" mentality persists to this day.

      Windows NT (ie: contemporary Windows) has been a multiuser OS since its first release.

      The reason the "just reboot" mentality persists is simply because 99% of the time it *is* used as a single-user OS, and no-one else is impacted. This has _zero_ to do with the architecture and everything to do with the user. Linux would be (and is) treated in the same way in similar situations.

      Linux/Unix on the other hand started out life as a multi-user OS. Rebooting was a big no-no, because you'd affect countless people logged in, and you'd get yelled at for ruining someone's work.

      UNIX actually started out as a single-user OS and the multiuser aspect was bolted on later. Linux didn't, of course, because by the time Linus banged together his UNIX rip-off, UNIX had been multiuser for quite a while.

      However, again, the attitudes towards how their relevant users treat servers and workstations have about 10% to do with their architectures and 90% to do with their knowledge. DOS and OS/2 were single user, yet frequently had BBSes and similar running off them. You can be assured the people running those BBSes were far less likely to have the "just reboot" mentality.

      Further, the other reason most people have that attitude is because to them a computer is just another appliance. When other appliances act up, pretty much the first thing _everybody_ does is turn it off and back on again. Why on Earth would you expect them to treat a computer any differently ?

      Windows administrators categorically will try rebooting the damn thing first to fix any problem (and it usually works). Linux administrators will only try this as a last resort (and it almost never works).

      No. Inexperienced admins will try rebooting first, regardless of platform. Experienced admins will not. Incidentally, there are numerous classes of problems on Linux (and UNIX in general) which are more quickly and easily "fixed" with a reboot.

      Anyway, at Microsoft the idea that you can somehow tweak windows just right so rebooting isn't necessary is crazy.

      I can't even remember the last time I had to reboot any of my Windows machines without a good reason (eg: patching).

      Finally, there's nothing wrong with rebooting _anyway_. If your service uptime requirements are affected by a single machine rebooting, your architecture is broken. All the reboot does is demonstrate that it's broken without a real problem actually occurring.

      Sysadmins comparing machine uptimes is like ricers comparing spoilers.

    14. Re:because they've been conditioned by dae_vid43 · · Score: 2, Informative

      my uncle once said: "in construction, clients are interested in 3 things: 1) build it fast, 2) build it cheap, and 3) build it right. realistically, you can have only two of these three". he was right. so, the same goes for 99% wireless or whatever. if you *need* some technology thing to work 100% of the time, you'd better be willing to pay out your ass for it. if you want something to be cheap, then it'll be cheap--and less reliable. look at Japan: they have a 99.999% reliable rail network, but it cost an arm and a leg to build--and it took like 50 years. (note that your cell phone *will* work in subways in Tokyo, because they paid out the ass to make it possible). So, yeah, we could have a 100%-reliable cell network, (or whatever) but most people aren't willing to pay $200 per month to make it happen; i'm certainly not.

    15. Re:because they've been conditioned by Kalriath · · Score: 3, Funny

      Actually it is the case. I rarely reboot my Vista machine (mostly because for some reason the BIOS on my PC tries to boot from the printer - don't ask, I don't know), and on average only need to do so once every month or two (I don't accept Windows Updates for components I don't use)

      --
      For a site about things like basic rights, Slashdot users sure do like to censor "dissent".
    16. Re:because they've been conditioned by Vellmont · · Score: 2, Informative


      This has _zero_ to do with the architecture and everything to do with the user. Linux would be (and is) treated in the same way in similar situations.


      This is simply not true. Anyone that's ever installed software, or run "windows update" knows that rebooting is a very likely part of this process. The dependencies and non-modular approach of Windows are quite apparent. Software vendors say "just reboot" because of all the complexities and dependencies within windows.

      The same simply isn't true for Linux. Replace a critical shared library? No problem, running programs still have a hook to the old version. Any new process that starts will get the new version of the library. Why reload the whole damn OS when restarting a process will do the same thing?
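
      The "running programs still have a hook to the old version" behaviour can be illustrated with an ordinary file on a POSIX system. This is only an analogy, sketched in Python: a plain text file stands in for the shared library, an open file handle stands in for the running process, and the file name `libdemo.so` is hypothetical. It assumes the update is done by writing a new file and renaming it over the old one, not by overwriting in place.

```python
import os
import tempfile


def replace_while_open():
    """Return (old_view, new_view) after replacing a file that is still open."""
    with tempfile.TemporaryDirectory() as workdir:
        # Hypothetical name; this is just a text file standing in for a library.
        lib_path = os.path.join(workdir, "libdemo.so")
        with open(lib_path, "w") as f:
            f.write("version 1")

        # Stands in for a running process that already has the old file open.
        running = open(lib_path)

        # Write the new version beside the old one, then rename it into place.
        new_path = lib_path + ".new"
        with open(new_path, "w") as f:
            f.write("version 2")
        os.replace(new_path, lib_path)

        # The existing handle still refers to the old, now unnamed, file.
        old_view = running.read()
        running.close()

        # Anything opened after the rename sees the new contents.
        with open(lib_path) as fresh:
            new_view = fresh.read()

        return old_view, new_view


if __name__ == "__main__":
    print(replace_while_open())
```

      On Linux and other POSIX systems the old contents stay readable through the existing handle until it is closed. On Windows the rename step would normally fail while the file is open, which is one reason updates there tend to be deferred to a reboot.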

      You can be assured the people running those BBSes were far less likely to have the "just reboot" mentality.

      You're trying to tell me with a straight face that the BBS market influenced Microsoft? (Which flies in the face of what we've all experienced with Windows).

      Further, the other reason most people have that attitude is because to them a computer is just another appliance.

      No, the reason people have this attitude is because it freaking works.

      Incidentally, there are numerous classes of problems on Linux (and UNIX in general) which are more quickly and easily "fixed" with a reboot.

      I've been administrating Linux machines for 13+ years. I can count on one hand the number of times a reboot solved any problem. The only class of problem this solved is a kernel bug, or the kernel crashing (usually from a hardware problem).

      I can't even remember the last time I had to reboot any of my Windows machines without a good reason (eg: patching).

      Why would anyone reboot without a "good reason"? The point is that Linux simply has less "good reasons", and requires less reboots. Linux requires FAR less reboots for "patching".

      Finally, there's nothing wrong with rebooting _anyway_. If your service uptime requirements are affected by a single machine rebooting, your architecture is broken.

      Wow. Now I know you've really drank the Microsoft kool-aid. Not everyone can afford multiple machine redundancy just to fix the endemic problems of Microsoft who advocate "Just reboot!" to fix so many problems. There's really no reason why I need to reboot just to update what's essentially some new versions of DLLs. The Microsoft architecture is essentially broken if you have to buy another damn machine for the SOLE purpose of maintaining high availability.

      --
      AccountKiller
    17. Re:because they've been conditioned by NevermindPhreak · · Score: 5, Informative

      I believe you are correct. The market isn't "conditioned" into thinking that anything less than five 9s is acceptable. They just don't want to pay the cost associated with it. The price/reliability ratio right now is the one that will satisfy the most customers. 99.999% reliability is harder to sell than 99.9% reliability at half the cost.

      I work for a cable company, by the way. I design a lot of the building-out of our system, so i know the actual costs associated with creating that kind of reliability. Whenever someone needs that kind of reliability, I actually recommend getting a second ISP as a low-speed backup solution. It is the only smart way to go to get complete reliability, as pretty much any company advertising 99.999% reliability in this area is outright lying to the customer. (I know this from experience. I have switched customers over to our ISP from a week-long (or longer) outage of every ISP here, and there are quite a few.) Besides, a good router will split bandwidth between the ISPs so you're not paying for something you're not using. (called "bonding")

      I still get amazed when people yell at me for being offline for a few hours after maybe 3, 4, 5 years of uptime. They say that they are losing thousands of dollars per day they are offline. Yet, they don't want to pay for a $40 roll-over backup. THESE are the vast majority of customers who complain so much about 99.999% uptime.

      On another note, I think any claim of 99.999% on POTS is anecdotal. Growing up, I had my power cut out at least twice a year, and the phone system was hardly 99.999%. Trees fall on lines, and people cut buried lines for all sorts of accidental reasons. Just like you insure anything worth enough value, just like you back up data in multiple locations, you need a fallback plan if your ISP goes out if it means that much to you.
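
      The case for a second ISP can be made with a little arithmetic, sketched here in Python. The availability figures are invented for illustration, and the calculation assumes the two links fail independently, which will not hold if they share a trench, a pole, or a power feed.

```python
def parallel_availability(*availabilities):
    """Availability when any one of several independent links is enough."""
    all_down = 1.0
    for a in availabilities:
        all_down *= 1.0 - a  # probability that this link is down too
    return 1.0 - all_down


primary = 0.999  # assumed figure for the main connection
backup = 0.99    # assumed figure for a cheap low-speed backup

combined = parallel_availability(primary, backup)
print(f"primary alone: {primary:.3%}")
print(f"with backup:   {combined:.3%}")
```

      With these made-up numbers, a three-nines primary plus a two-nines backup reaches about five nines combined, without either provider having to deliver five nines alone.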

    18. Re:because they've been conditioned by Jurily · · Score: 3, Informative

      Keeping internet services online suffers from the problem of black swans. Nassim Taleb, who invented the term, defines it thus: "A black swan is an outlier, an event that lies beyond the realm of normal expectations." Almost all internet outages are unexpected unexpecteds: extremely low-probability outlying surprises. They're the kind of things that happen so rarely it doesn't even make sense to use normal statistical methods like "mean time between failure." What's the "mean time between catastrophic floods in New Orleans?"

      http://www.joelonsoftware.com/items/2008/01/22.html
    19. Re:because they've been conditioned by griffjon · · Score: 2, Interesting

      I'd say it's an even deeper problem -- it's not really a marketplace. The competition is few and far between, and they're oligopolistic, and probably price-fixing. I mean, what's your alternative to a blackberry? So what if the service sucks -- is your employer going to ... buy you an iPhone? [1] If Verizon pisses me off, I can switch to... AT&T, or some of the others if I don't mind roaming? People would vote with their wallets if there were candidates worth switching to.

      [1] If so, let me forward you my resume for your consideration

      --
      Returned Peace Corps IT Volunteer
    20. Re:because they've been conditioned by pclminion · · Score: 2, Interesting

      It's funny the attitude that comes from the users of each OS. Windows administrators categorically will try rebooting the damn thing first to fix any problem (and it usually works). Linux administrators will only try this as a last resort (and it almost never works).

      It's even less than a last resort. I have, once or twice, had true problems that required a reboot of a Linux machine to fix. The one in most recent memory, it took three weeks before realizing that a reboot was (or at least, could be) the solution. That's three weeks of hard core debugging, tweaking, and hair pulling. The idea of a reboot to fix a user-level software issue is not something that even remotely crossed my mind, nor anyone else's. In fact, it was a Windows user from another location who ultimately made the suggestion "Have you tried rebooting it?"

      Rebooting a computer to fix a problem should be viewed with the same suspicion as burning down your house to eradicate an infestation of insects.

    21. Re:because they've been conditioned by LynnwoodRooster · · Score: 2, Interesting
      Rebooting a computer to fix a problem should be viewed with the same suspicion as burning down your house to eradicate an infestation of insects.

      No, it should be viewed as fumigating your house. You all move out, wait a few days, then move back in. When you reboot you don't lose the computer, you don't lose the archived data, and all the users can return in a short amount of time.

      Burning down your house loses all the contents and ensures you'll never return...

      --
      Browsing at +1 - no ACs, I ignore their posts. So refreshing!
    22. Re:because they've been conditioned by drsmithy · · Score: 4, Insightful

      This is simply not true.

      Yes, it is. People who use Windows, when using Linux, are going to respond exactly the same way to problems - by rebooting.

      Anyone that's ever installed software, or run "windows update" knows that rebooting is a very likely part of this process. The dependencies and non-modular approach of Windows are quite apparent. Software vendors say "just reboot" because of all the complexities and dependencies within windows.

      No, they do it because it's a simple step for the ignorant end user to understand.

      The same simply isn't true for Linux. Replace a critical shared library? No problem, running programs still have a hook to the old version. Any new process that starts will get the new version of the library. Why reload the whole damn OS when restarting a process will do the same thing?

      Because for people who don't know that, it's easier to say reboot.

      You are conflating knowledgeable end users with typical end users. This is at best naive and at worst deliberately deceptive.

      You're trying to tell me with a straight face that the BBS market influenced Microsoft? (Which flies in the face of what we've all experienced with Windows).

      No, I'm telling you that a random individual's attitude towards rebooting is going to be vastly more influenced by their skill level and what they're using their computer for than the OS it runs.

      No, the reason people have this attitude is because it freaking works.

      Exactly. Now, again, why do you think they're going to treat computers any differently ?

      I've been administrating Linux machines for 13+ years. I can count on one hand the number of times a reboot solved any problem. The only class of problem this solved is a kernel bug, or the kernel crashing (usually from a hardware problem).

      Not done much work with NFS then, I take it ? Or services that have long timeout periods and don't die nicely ?

      I struggle to believe anyone has been using Linux for "13+ years" and can only "count on one hand the number of times a reboot solved any problem". Either you've not used Linux anything close to "13+ years" or you've not used it in a very wide range of situations.

      Why would anyone reboot without a "good reason"?

      The fact that you even need to ask disqualifies you from any useful input to this discussion. Fucking hell. People hit the reset button on their PCs because the monitor power-saving kicked in and for dozens of other reasons that aren't even that good.

      The point is that Linux simply has less "good reasons", and requires less reboots. Linux requires FAR less reboots for "patching".

      Linux also makes a lot more assumptions about its users (and "users" in this sense reaches from Grandma to software developers).

      Wow. Now I know you've really drank the Microsoft kool-aid. Not everyone can afford multiple machine redundancy just to fix the endemic problems of Microsoft who advocate "Just reboot!" to fix so many problems. There's really no reason why I need to reboot just to update what's essentially some new versions of DLLs. The Microsoft architecture is essentially broken if you have to buy another damn machine for the SOLE purpose of maintaining high availability.

      Yeah, like I thought. "13+ years" and 12 of those were probably using it on your home PC.

      The only meaningful difference between a "reboot" and a hardware failure is the amount of warning. I'll say it again. If your business continuity is vulnerable to individual machine outages (be they from reboots or motherboards going up in smoke), then it's broken. Period. If you can't afford "multiple machine redundancy" then you don't need 24/7 uptime. If you don't need 24/7 uptime, then either scheduled machine reboots (eg: for patching) are irrelevant, or brief outages are acceptable.

      Any sysadmin who thinks he can run a high-availability operation without multiple machine redundancy is incompetent. Any sysadmin who is purporting to do so, is grossly negligent. The fact that there's a hell of a lot of people whose Linux (and UNIX in general) bias puts them into these categories, does not make them any less incompetent or negligent.

    23. Re:because they've been conditioned by drsmithy · · Score: 4, Insightful

      That's three weeks of hard core debugging, tweaking, and hair pulling.

      The fact that you were able to wait *three weeks* demonstrates that the problem was, at most, insignificant.

      When thousands of dollars (or more) are being lost every minute that a service is unavailable[0], you don't fuck around with idiotic philosophising about how "it's UNIX, I shouldn't need to reboot for anything"[1], you just DO IT.

      [0] We shall ignore here for a minute the false economy of not just investing in a properly redundant architecture where individual machine outages do not impact availability.

      [1] I've been there myself and had arguments with my (at the time) boss about it. It is the difference between how geeks think and how businesspeople think. The geek is interested in figuring out wtf is wrong. The businessman is interested in whether or not his business is still operating.

    24. Re:because they've been conditioned by Mortimer82 · · Score: 2

      On my Home machine, the only time I reboot Windows is in the event of a software update (once a month). Otherwise, I have my power options set up that I just use suspend to RAM when I am out the house or sleeping. It takes 4 seconds to go to sleep and 3 seconds to wake up.

      I work in a 24 hour customer support environment, and as shift times change often and there are 4 different shifts per day, the computers stay on all the time and almost never restart, except for perhaps windows updates. However, they are always logged off by the user at the end of their shift.

      Anyone who frequently needs to restart their Windows machine today is either running rubbish software/drivers on it, or has sub-standard hardware.

      On Windows, OS updates are the only thing that requires restarts, or should (it really bugs me when a Quicktime update says it needs to restart Windows for only God knows what reason). Although Linux may not need full system restarts for software updates (I don't know), I find Windows more than reliable enough for my needs.

    25. Re:because they've been conditioned by Firethorn · · Score: 2, Interesting

      people aren't willing to accept the economic and aesthetic costs of providing those services at the level of reliability you and the author are demanding.

      I have to agree.

      I've stated before 'Every 9 of reliability increases the cost 10 fold'. Now, this is only the vaguest estimate, with vast numbers of variables, unseen incidents, competency, etc...

      Take a car that's 90% reliable. It'd be used, of course, and probably cost you only $100-500. You can get a car that's 99% reliable for $1-5k. 99.9% reliability would be getting into needing a new car(or newer used), costing $10-40k. This, of course, discounts getting a lemon.

      Now, when it comes to phone service its reliability comes from that stuff has been done for so long that the extra reliability doesn't actually cost 10X, plus the base '90%' is so cheap that upping it to 99.9% isn't very expensive.
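
      Taking the "every 9 costs 10 fold" rule of thumb at face value, a short Python sketch shows how quickly the price of each extra hour of uptime climbs. The 10x factor is the comment's own rough estimate; the function names and the unit-cost baseline at 90% are assumptions for illustration only.

```python
HOURS_PER_YEAR = 365.25 * 24  # assumption: average year length


def relative_cost(nines: int, factor: float = 10.0) -> float:
    """Cost relative to a one-nine (90%) baseline under the 'x10 per nine' rule."""
    return factor ** (nines - 1)


def hours_gained(nines: int) -> float:
    """Extra uptime per year from moving from (nines - 1) nines to `nines` nines."""
    return HOURS_PER_YEAR * (10.0 ** -(nines - 1) - 10.0 ** -nines)


if __name__ == "__main__":
    for n in range(2, 6):
        extra_cost = relative_cost(n) - relative_cost(n - 1)
        per_hour = extra_cost / hours_gained(n)
        print(f"{n} nines: cost x{relative_cost(n):.0f}, "
              f"+{hours_gained(n):.2f} h/year uptime, "
              f"{per_hour:.2f} cost units per extra hour")
```

      Under this toy model the cost rises tenfold per nine while the uptime gained falls tenfold, so the price per additional hour of uptime grows by roughly a factor of a hundred at each step.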

      --
      I don't read AC A human right
    26. Re:because they've been conditioned by wizzahd · · Score: 2, Funny

      Everything else being equal, that should still result in 99.94% uptime, or .06% fail. The point is still valid, of course. Do you work for Verizon?
    27. Re:because they've been conditioned by rmerry72 · · Score: 4, Insightful

      The reasons why Microsoft were so successful (in a business sense) are manifold, but one is not that their products were great, but that they were good enough.

      This I agree with whole-heartedly. It's a fundamental basis of a market driven economy. Spending effort on things that are too good for the market wastes resources that could be spent elsewhere on items that the market (ie. people) do want. Capitalism does not - and must not - build the best, merely the just barely good enough.

      Most people don't give a crap about quality, and if they do then somebody else should pay for it. Its all about the latest and greatest bling and appearing to be better than your neighbours.

      So everything we have in our lives - every product, service, and system - is just good enough to work for most of the people most of the time and no more. Our transport largely gets people from A to B (eventually), our health system keeps most people alive a few years longer with not much discomfort, our communications work most of the time for most people in most places, and our politicians mostly look after us OK.

      Oh, and most of us do most of our work most of the time when we have to. And no more!

      --
      We do not inherit the Earth from our parents. We borrow it from our children.
    28. Re:because they've been conditioned by Hooya · · Score: 2, Interesting

      > I can tell you that in the pre-Windows days...

      and I can tell you that in the post-windows days... well, people have this concept of rebooting when things don't work. "it will auto-magically fix itself" (tm). cell-phones, managed switches, home routers... you name it, the first thing tech-support will do is ask you to "turn it off and on again". so much so that that is a standard gag in "the IT crowd".

      i had this incident in our data center where this nincompoop kept futzing around with a managed switch. he hosed the config, caused some ripple effect on the servers and then panicked and wanted to reboot everything - including the servers. didn't know what the problem was but as he is indoctrinated to the ethos of rebooting to automagically fix problems, just wanted to reboot everything.

      i had to step in, restart a few services and things were back to normal. no reboot required. a reboot would have taken us out for a good 15-20 minutes. restarting services, 10 seconds.

      it's almost like people don't take pride in uptimes. who cares if it's down for 30 minutes... thanks largely to the microsoft OS culture. unix was bad enough - compared to mainframes and VMS - or so i'm told.

      so, yeah, it's gone downhill. MS didn't help. telephones might have had outages but i sure don't recall having to reboot the big black rotary dial phones..

    29. Re:because they've been conditioned by Da+Web+Guru · · Score: 2, Insightful

      I still get amazed when people yell at me for being offline for a few hours after maybe 3, 4, 5 years of uptime. They say that they are losing thousands of dollars per day they are offline. Yet, they don't want to pay for a $40 roll-over backup. THESE are the vast majority of customers who complain so much about 99.999% uptime.

      Thousands of dollars per day? That's all? I work for a web hosting company. When one of our customers' servers goes down for more than 10 minutes, they immediately claim to be losing tens of thousands of dollars per hour. :) Of course, they *might* be paying only $100 per month for the server. And these are the same customers that can't be bothered to pay $50/month for any kind of backups for their only copy of their data.

      --

      --guru

    30. Re:because they've been conditioned by tronbradia · · Score: 5, Informative

      Actually our health system has completely ballooning costs relative to other countries and is really more of an example of the opposite phenomenon, where insurance must pay for all possible treatment or be sued. Our system without a doubt provides the most care of any system in the world, even though it's pretty obvious that returns diminish dramatically after about 10% of GDP (we are at 15% of GDP, 2nd runner up is Switzerland at 11 or 12%). Returns diminish because, essentially, more care doesn't actually make people healthier past a certain point. 99% of people just need a GP (cheap), immunizations (dirt cheap), antibiotics when they get a bacterial infection (dirt cheap), and surgeons to sew them up when they get in a car crash (expensive-ish but hopefully uncommon and only rarely protracted). The problem is whenever anybody gets anything terminal, there's the potential for basically infinite spending, and the more successful treatment is, the more money goes in because treatment is prolonged. In this case our system is not "barely good enough", it's more way too good, or at least, way too generous.

    31. Re:because they've been conditioned by Eivind · · Score: 4, Insightful

      Everything you say is true, but it's actually even -worse- than that.

      It's not just that the returns are diminishing, they're -NEGATIVE-. It's not just that countries that spend 30-40% less on healthcare compared to USA have similar health and life-expectancy, several of them actually have significantly BETTER results for LESS money.

      The reason is basically what you state: Giving EXTREME healthcare to those who already have GOOD healthcare provides little if any benefit, but providing the BASICS to those who are lacking them is cheap and efficient.

      So, the USA has very, very high spending for those who are "in", but falls quite far down the rankings because you fail to provide GOOD healthcare to everyone living in the USA. That's why you're not in the top 40 for any of the most used healthcare indicators despite being undisputed as number one in spending.

      Norway, for example, has similar healthcare to USA, not quite as extreme on the top mainly due to less panic about courts, but still come out way ahead, because healthcare is truly universal.

      Costs less, gives more health. What is not to like ?

    32. Re:because they've been conditioned by chathamhouse · · Score: 2, Insightful
      Not done much work with NFS then, I take it ? Or services that have long timeout periods and don't die nicely ?

      Amen. Hoping for a long, stable uptime on a Linux machine that does very intensive and sustained NFS I/O proved to be a pipe dream for me. Things did get much better after applying the plethora of nfs.org patches, but you still get some awesome kernel failures.

      But I don't care, because I have many machines accessing the NFS mount (mailboxes, btw). I lose one, and keep on ticking. If I lose one machine every 3-4 months for an hour, my service availability stays good - though maybe a bit slow depending on the time of day. I could have mounted the shares on a set of Solaris boxes, but the cost for knowledgeable staff would have been far greater than sticking with Linux. That's right, hardware & software are generally much cheaper than the people to manage it.

      So I agree that the original poster's "13+ years" experience with Linux is either a troll, or someone that doesn't have anything but simple use cases for his champion OS.

      Individual components of a computing infrastructure will fail. I don't care if it's a $500 compute node, or a $20M disk array. You have to assume that this failure will occur in your design. Availability comes from design with this intent, and that sort of design is expensive.

      It always comes down to dollars! As a business, you generally permit the expenditure on highly available design when it has a strong business case. Will you lose $1mil in revenue because of a string of outages? If the probability is very high that the answer is yes, then it would be reasonable to secure $300-400k to remedy the situation. If the highly available design costs you $300-400k, but the expected losses from your flaky infrastructure are $50-100k, the money will not be spent, regardless of how many times you whinge on Slashdot.
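
      The business case boils down to comparing an expected loss with the price of the fix. A minimal sketch in Python, using the dollar figures from the two scenarios above; the 90% outage probability and the function names are assumptions added for illustration, since "very high" is the only probability given.

```python
def expected_loss(outage_probability, loss_if_outage):
    """Probability-weighted revenue loss from leaving the infrastructure as it is."""
    return outage_probability * loss_if_outage

def worth_funding(expected_loss_dollars, remedy_cost):
    """Fund the highly available design only if it costs less than the loss it avoids."""
    return expected_loss_dollars > remedy_cost

# Scenario 1: a $1M loss is very likely (0.9 is an assumed stand-in for "very high").
likely_loss = expected_loss(0.9, 1_000_000)
print(worth_funding(likely_loss, remedy_cost=350_000))

# Scenario 2: expected losses of $50-100k (midpoint taken) against the same $300-400k fix.
print(worth_funding(75_000, remedy_cost=350_000))
```

      The first scenario comes out in favour of spending the money and the second does not, which is the whole argument in two comparisons.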

      And so, we accept these average-performing, generally-there services because we consider them good value for the dollars we pay. We complain that it should be better - sure - but we don't change providers because of the dollars involved.

    33. Re:because they've been conditioned by rtb61 · · Score: 2, Insightful
      Your reasoning fails. You do not just replace the widget. What happens is the widget fails, you lose time, and you then waste time and money not only paying for the new widget but also on the exercise of researching it and getting it back to the location where it will be used. Now multiply that by several failures and you're now spending way more, not just a bit more. So it is never $4 versus $3 a couple of times; the reality is it is $5 (real quality) versus $3 plus the hidden $10 several times, so $5 versus the reality of $50.

      Plus of course the additional impact upon the environment of the extra energy required to produce and obtain the goods and the waste of failed products. Big profits for corporations and marketers, for which every citizen and future generations pay an extreme price.

      --
      Chaos - everything, everywhere, everywhen
    34. Re:because they've been conditioned by Eivind · · Score: 4, Insightful

      You're asking two questions, so you get two replies.

      Why, in general terms, do we redistribute wealth forcibly ?

      The short answer is: Because we live in a democracy and the majority of politicians vote in favor of doing that.

      The longer answer is; Because living in a stable, healthy population with a safety-net has benefits, even if you're not among the direct recipients of the welfare.

      In South-Africa earning $100.000/year means living in a castle surrounded by 10-feet concrete topped with broken glass and barbed wire, surveilled by video-cameras, in a "gated community", driving your kids wherever they need to go for fear of kidnapping and *still* accepting that your odds of being killed by someone desiring your wealth are non-negligible.

      In Norway, earning $100.000/year means living wherever the hell you want, surrounded by a garden with strawberries in it, never even having the thought "kidnapping" cross your mind in relation to your children, possessing no security-camera and indeed unless you live in a major city you'll probably not bother locking the door. Still, even without the precautions, your odds of being killed by someone desiring your wealth are essentially zero. (more than 2 orders of magnitude lower)

      I don't know what that's worth. But it's worth -something-.

      I'm much more skeptical of all the corporate welfare, truth be told. If I could directly change what my tax-dollars are used for, my vote would be to cut drastically on subsidies to dinosaur-industries that are uncompetitive (it's insane that *tobacco*-farmers and coalminers are the two groups receiving the most subsidies in the EU) and to *UP* support of those people who need it the most. Primarily EDUCATION -- I'm of the opinion that that is the most sensible support you can give a weak group. It's the only help that can help them become independent with time.

  2. Oh Zonk by opec · · Score: 4, Funny

    Oh Zonk, I'm marking your story as "flamebait". :(

  3. The way it has always been by Corpuscavernosa · · Score: 3, Insightful

    Complacent consumerism. "Hey, it's always been this way so they [service providers] must not be able to have 99.9% uptime. If they had the capability, they sure would provide it to us, their customers."

    --
    We figured out a long time ago that it's easier to elect seven judges than to elect 132 legislators.
    1. Re:The way it has always been by zappepcs · · Score: 4, Interesting

      While you deserve the mod points, it should also be noted that consumer expectation is strangled into submission within 20 minutes on the first support call they make to ask about better service quality. I know a guy who is locally famous because he will spend 4, 5, 6 or more hours on the phone with customer service, supervisors, managers and anyone on the board of directors that he can find a phone number for. What is he fighting for? Discounted service or reparations for lost service(s). That's right, it takes hours on the phone to get one of those companies to either own up to, and pay for losses accrued by their customers through loss of service.

      In truth, most consumers won't complain when they should, so there is no marketplace pressure on those businesses to aim for five nines uptime.

    2. Re:The way it has always been by x_MeRLiN_x · · Score: 2, Insightful

      That's not necessarily true. If a sufficiently high volume of people complained, it would certainly start to eat into their customer service budget. I don't know how much it costs to run upwards of a dozen dedicated customer call centres, but I would assume it isn't a pittance. If their call volume were to treble for longer than a week or two, improvements would be forthcoming. Alas, large consumer groups that are able to organise this level of pressure don't (as far as I'm aware) exist.

    3. Re:The way it has always been by cgenman · · Score: 2, Informative

      That's right, it takes hours on the phone to get one of those companies to either own up to, and pay for losses accrued by their customers through loss of service.

      Having been on the other end of these types of calls, this sort of thing can be *very* annoying. People do call all of the time with the expectation that because they do five or six thousand dollars worth of business in a day, the ISP is somehow responsible for those thousands of dollars when some idiot Verizon contractor accidentally cuts our cables. Other reasons for outages include: power loss, fire, flood, exploding transformers, telephone pole collapse, and many other issues outside of our capability.

      If you want guaranteed uptime, get it in your contract and be prepared to pay for it! Otherwise, we'll do the best we can to provide service at the funding level we receive, and will gladly refund the 59c worth of service that you would have paid for a 6 hour outage.

      Would you expect Ford to pay you for lost wages when someone hits your car? Would you expect your grocery store to pay for your chihuahua when he starves to death because the store is out of dog food?

    4. Re:The way it has always been by zappepcs · · Score: 2, Insightful

      While I understand what you are saying, it would go a very long way if, when I called customer service and was on hold waiting for an operator, the interactive processing system could take my zip code and tell me if there are any known problems or outages in my area. That would alleviate many of the complaints about technical problems that are out of your hands. I've had trouble getting anyone to tell me they are having problems of any kind, never mind that the problem happened 2 blocks from my house. I have patience for being called a dumb user that lasts about two seconds, and that goes for being treated like one also. If you can tell me in 60 seconds that there IS a problem in my area, then I won't wait on the phone for 10-15 minutes getting pissed off before I talk to a call center rep. I will probably hang up with the knowledge that you know about the problem and are working on it. If I'm **REALLY** lucky, you'll have given me a number to call for status updates or a website or both so I won't have to bother your customer service reps any more.... sigh... like that is going to happen

    5. Re:The way it has always been by CaptJay · · Score: 3, Interesting

      Somewhat off-topic, but an anecdote related to massive consumer calls to tech support.

      Back in the day, I was an IRC Operator for a large Undernet server, and there came a time where the new thing for troublemakers was to use open proxies on cable connections to flood channels/servers. One cable provider had a particularly large number of clients whose setup was used to attack the network and generally cause trouble.

      At first, being in the area of that provider, I called tech support and escalated the issue as much as I could. My point was that they were ultimately responsible for the abuse coming from their network. Long story short, for months I got nothing but "we'll look into it".

      After a particularly nasty week, and after consulting with the server admins, we decided to ban the whole ip range of that provider from using our server (they could still use the rest of Undernet, but our server was popular for them). The ban kicked > 1000 clients from the server with a message like : Your provider does not respond to abuse complaints. Contact your provider's technical support to have this issue resolved.

      10 minutes later, there was a 30 minute wait at the provider's tech line. On a Sunday afternoon. One hour later, I got an email saying they were blocking inbound port 1080 at their router to protect their clients' machines from being abused.

      I guess the point is, when something generates enough backlash, preferably with a nice surprise effect, things can change. The hard thing is to organize people enough to harass the company about it.

      --
      "I remember Y1K, every abacus had to get another bead"
  4. The cost by Introspective · · Score: 5, Interesting

    Probably because of the cost. I do network design for a fairly large telco, and let me tell you the cost goes up exponentially with the number of "9"s that the business asks for.

    1. Re:The cost by HairyCanary · · Score: 4, Informative

      Exactly what I was thinking. I work for a CLEC, and I have a rough idea how much things cost -- compare what a Lucent 5E costs with what a top of the line Cisco router costs, and you have the answer why voice service achieves five-nines while data service typically does not.

    2. Re:The cost by freebase · · Score: 2, Insightful

      Don't forget that the support costs on a 5E dwarf even the cost of most, if not all, Smartnet contracts.

      Simply said, because the equipment isn't/hasn't been able to support it, the only way to build 5 9's or better has been to add more equipment, which increases operations costs, capital costs, etc across the board in an almost linear fashion.

      The market has for the most part established the level of service available by establishing the price point the customer is willing to pay for said service.

      People love to point towards the big bad telcos and other companies as monopolies and only being concerned about profit margins. They forget that those same profit margins are what drive the company's stock price, in turn causing growth in people's portfolios. It's a vicious cycle and won't end until enough people decide they have enough.

      --
      Sig??? I don't need no stinkin Sig!
    3. Re:The cost by who's+got+my+nicknam · · Score: 5, Interesting

      Also, you need to bear in mind that POTS is incredibly simple technology compared to Internet/Cellular/Data services. I haven't had cable TV since the early '90s, but I don't ever remember it going out, either- that was long before we had digital cable/cable Internet in my market area. POTS never goes down because the equipment is extremely robust, even (especially) the older stuff. My local telco could continue to provide POTS for more than 4 days during power outages simply because of lower power requirements (after 4 days, they had to fire up their generators, and started dropping remote COs due to extreme cold).

      We always want to compare service levels for newer tech with POTS and complain when they don't approach the same levels, but I'd expect that if we were to be still using the same equipment for ISP/Cellular service in a hundred years, it would be as stable and robust as the current (ok, previous generation) iteration of POTS. Problem is, we are constantly demanding better, faster, and cheaper: this has to be traded off for reliability, and for the most part people are happy with that tradeoff. Just like we're happy to buy crappy consumer goods from China at Wal*Mart because they're cheaper than domestic products. /rant

      --
      "Apparatus dignosco occultus, satis non supernus."
    4. Re:The cost by general_re · · Score: 2, Interesting

      I would have settled for only 99% from comcast. The fact that the cable modem was only ~70% reliable is just embarrassing, to this day I cringe when I hear that people are relying upon comcast for emergency calls. It would be out for hours every day, and we did ditch them for DSL.

      YMMV, but I use my residential Comcast connection as a backup monitor for a server I administer. Every 10 minutes my home machine (which is running 24/7) pings the server and waits for a response - it's a cheap way of tracking the server's availability, although of course it's really checking the availability of both the server and my home internet connection. I just checked the records for all of February, and it only recorded one failed attempt for the entire month, which translates to a success rate of 99.98%. And that's pretty good for $42.95 per month, I think.
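
      The 99.98% figure can be checked with a few lines of Python. This is a sketch that assumes one check exactly every 10 minutes with none skipped; the function name and the choice of trying both 28 and 29 days for February are illustrative, not taken from the monitoring setup described.

```python
def success_rate(days, interval_minutes, failed_checks):
    """Percentage of successful checks over a period of evenly spaced pings."""
    total_checks = days * 24 * 60 // interval_minutes
    return 100.0 * (total_checks - failed_checks) / total_checks

# One failed ping in a month of 10-minute checks, for either length of February.
for days in (28, 29):
    print(days, "days:", round(success_rate(days, 10, 1), 2), "%")
```

      Either length of February rounds to 99.98%. Note that sampling every 10 minutes can miss any outage that starts and ends between two checks, so the number is an estimate rather than a guarantee.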
      --
      ABSURDITY, n.: A statement or belief manifestly inconsistent with one's own opinion.
  5. the simple answer - we have more options... by studpuppy · · Score: 3, Insightful
    So the simple answer is that I have more options. When my cell phone doesn't work, I have my desktop phone (or vice versa). Or IM. Or email. Or fax.

    Basically, we don't rely so heavily on any single system, so a brief outage can be tolerated because there are alternatives to choose from.

    This is also the basis of Clayton Christensen's theories on disruptive innovation - that a consumer of something (technology, etc.) is willing to trade off some of these aspects, like reliability, for cost or performance benefits (however you wish to define those benefits...).

    --
    The last time I wrote code, it was Morse
  6. Here's an easy one. by palegray.net · · Score: 5, Insightful
    Quoting the summary:

    ... after decades of mobile phones, why do we even still have dropped calls?

    It's a little thing called physics. When you're traveling while using your phone, you may transit into dead zones. We could solve this by cutting down all the trees and flattening the landscape, but that might make some people angry...
    1. Re:Here's an easy one. by palegray.net · · Score: 5, Interesting

      As a guy who does communications in the U.S. Navy, I can attest to this. If the United States military can't guarantee 99.999% uptime on communications in all conditions, what makes anyone think it's possible in the private sector?

    2. Re:Here's an easy one. by CorSci81 · · Score: 2, Insightful

      People don't understand basic physics is the simple answer. Or they don't think about it beyond "it's not working right now". Until we have magical transmitters that can transmit at any wavelength in the spectrum all wireless communications are subject to weather interference. The only way to beat the weather right now is to have a physical connection (and even that's not 100% immune).

    3. Re:Here's an easy one. by palegray.net · · Score: 3, Funny

      The only way to beat the weather right now is to have a physical connection (and even that's not 100% immune).

      That's a true statement. Hurricanes, fires, and tornadoes do have a way of reducing uptime in many cases. I suppose the network provider could always enter into an SLA with God to improve things, though. Similar deals with the devil have proved too costly in the long run.
    4. Re:Here's an easy one. by nikanj · · Score: 2, Interesting

      Well, we do have physics and trees and hills in Finland but I can't even remember the last time I had a call drop. Just last Thursday I took a 140km train trip to a nearby city and spent the whole time chatting on irc. Used the same ssh connection for the whole trip. Nice 3g handovers @ 120 km/h (Nokia N73). Greetings from the 21st century..

  7. Low price or high-quality? by schnikies79 · · Score: 5, Insightful

    You can have one or the other.

    We're not talking about software, we're talking about hardware and man-hours. Those will never be free.

    --
    Gone!
  8. because its ridiculous by myowntrueself · · Score: 2

    'five nines' of uptime is a ridiculous and exaggerated expectation for pretty much anything technological that is not life-threatening.

    Whenever people talk about 99.999% uptime for a service delivered over the internet I laugh in their faces.

    --
    In the free world the media isn't government run; the government is media run.
    1. Re:because its ridiculous by X0563511 · · Score: 3, Informative

      I just did the math. 99.999% uptime means "a little over 5 minutes of downtime per year" (or "about half a minute per year" if I stuck an extra 0 in there)...

      Clearly, a ridiculous number.

      --
      For large sets, this will be our guide even unto death, for the LORD will work for each type of data it is applied to...
    2. Re:because its ridiculous by elronxenu · · Score: 2, Insightful
      If Google was unavailable for 10 straight hours, that would be really really bad. If google was unavailable for 1 straight hour, that would be really bad.

      On the other hand, if Google was unavailable for 9.863 seconds per day, every day (which is the equivalent of 1 hour per year), who would care? Just resubmit your query.

      What's important about reliability is often not the total downtime but the duration of downtime.
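
      The arithmetic behind that comparison, as a small Python sketch. It assumes a 365-day year, and the function names are made up for illustration. The point it shows is that an availability percentage alone cannot tell one long outage from many short ones.

```python
HOURS_PER_YEAR = 365 * 24  # assumes a non-leap year

def seconds_per_day(downtime_hours_per_year, days_per_year=365):
    """Daily downtime if a yearly downtime budget is spread evenly over every day."""
    return downtime_hours_per_year * 3600 / days_per_year

def availability_pct(downtime_hours_per_year):
    """Yearly availability; identical whether the downtime is one block or many slices."""
    return 100.0 * (1.0 - downtime_hours_per_year / HOURS_PER_YEAR)

print(round(seconds_per_day(1), 3), "seconds per day")
print(round(availability_pct(1), 4), "% available, one hour of downtime per year")
print(round(availability_pct(10), 4), "% available, ten hours of downtime per year")
```

      One straight hour and 9.863 seconds every day produce the same yearly percentage, which is why the duration of each outage needs to be tracked separately from total downtime.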

  9. More physics in action. by palegray.net · · Score: 4, Funny

    mass outages several times this month

    Was it converted to energy?
  10. It's a market-wide problem. by HazyRigby · · Score: 2, Informative

    As consumers, we're made to feel helpless. The worst we can do (without litigation) to a company is complain or refuse to use their services, but what harm can that do to a giant conglomerate? And in situations in which one company has a monopoly in a certain area of the country, for example, consumers may not have the ability to switch or do without.

    As a personal example, Comcast owes me a refund check for Internet services I canceled six months ago. If I, as a consumer, had allowed my debt to go unpaid for that long, my account would have been sent to collections long ago. But the problem is that most of the power--with the economics of the situation, with politicians, and so on--lies on one side of the table, and that power ain't with the consumer.

    1. Re:It's a market-wide problem. by cgenman · · Score: 2, Insightful

      As a consumer, you're more than entitled to take Comcast to small claims court, which is most likely the mechanism that Comcast would use to extract unpaid bills from you. That Comcast is more likely to enact this mechanism than you are is not a fault of politicians.

      It varies by state, but usually it costs 15 dollars to take a company to court, and no lawyers are required. It is generally quick and painless, and people at your local courthouse can fill you in on the details and help you through the process.

  11. Really so common? by Moridineas · · Score: 5, Interesting

    Are these kinds of outages really so common? Mobile phones I absolutely agree with. On the other hand, I literally cannot remember the last time I lost cable or my internet. I've literally lost power more frequently than either of them (maybe 4 times in the past year) and lost water once. Emails not making it to their destination--again, does this really happen? In the decade plus I've been using internet email, I can't off the top of my head ever think of any "lost" email unless it was sent to a wrong address or something.

    1. Re:Really so common? by TubeSteak · · Score: 2, Insightful

      On the other hand, I literally cannot remember the last time I lost cable or my internet.

      Hey! I've got an anecdote too! I spent a few years in a town where heavy rain would kill most of the town's cable tv & internet.

      Hint: Just because you live somewhere without such problems does not mean they don't exist. Ditto for lost e-mail.
      --
      [Fuck Beta]
      o0t!
    2. Re:Really so common? by CorSci81 · · Score: 2, Interesting

      I'm in exactly the same situation. I'm on Time Warner's fiber network for internet, most of the time my wireless router is the source of any internet troubles, or it's exterior to my connection to TW. I've lost power more times in the last 2 months than I have my cable in the past 2 years. Even then my cable outages are generally under an hour and it usually involves calling the local office and having them reset my box remotely. Takes maybe 10 min to fix. And as far as cell phones, I generally know where the dead zones are and avoid them. I rarely have dropped calls outside of these zones, and I don't really expect Verizon to install a new cell just to fix the dead zone I drive through every day that has a radius of about 20 yards from a particular intersection. Rather, I just make sure I'm not on the phone when I go through it.

  12. It's the cost by hehman · · Score: 2, Insightful

    If offered cell plans that cost $50/month with rare outages or $150 a month with extremely rare outages, which would most people take?

    99.999% (5 nines) of reliability is achievable, but it's very expensive and hard to do. Everything has to be redundant, with no single point of failure, everything has to support fail-over seamlessly, the software has to be tested with extreme rigor, and upgrade procedures need to function nearly instantly and support rollback without loss of service.
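
    Why redundancy is the lever here can be shown with the textbook availability formulas. A minimal Python sketch, assuming component failures are independent and failover is instant and perfect (real systems do worse on both counts); the function names and the 99%-available component are illustrative.

```python
def serial_availability(parts):
    """Chain where every part must be up; each one is a single point of failure."""
    result = 1.0
    for availability in parts:
        result *= availability
    return result

def parallel_availability(availability, copies):
    """Redundant copies where any one suffices (assumes independent failures)."""
    return 1.0 - (1.0 - availability) ** copies

def replicas_needed(availability, target):
    """Smallest number of redundant copies that reaches the target availability."""
    if not (0.0 < availability <= 1.0 and 0.0 < target < 1.0):
        raise ValueError("availability must be in (0, 1] and target in (0, 1)")
    copies = 1
    while parallel_availability(availability, copies) < target:
        copies += 1
    return copies

print(serial_availability([0.999, 0.999]))   # two parts in a chain are worse than one
print(parallel_availability(0.99, 2))        # two redundant copies of a 99% part
print(replicas_needed(0.99, 0.99999))        # copies of a 99% part needed for five nines
```

    Under these idealised assumptions three copies of a 99% component are enough for five nines on paper, and that has to hold for every link in the chain, which is where the expense comes from.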

  13. Re:Costs increase geometrically by (H)elix1 · · Score: 4, Informative

    Because every nine will cause a geometric increase in costs.

    This

    Uptime (%)   Downtime per year
    90%          876 hours (36.5 days)
    95%          438 hours (18.25 days)
    99%          87.6 hours (3.65 days)
    99.9%        8.76 hours
    99.99%       52.56 minutes
    99.999%      5.256 minutes
    99.9999%     31.536 seconds
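
    The figures in that table can be reproduced with a short Python sketch, assuming a 365-day year (8,760 hours); the function names are made up for illustration.

```python
HOURS_PER_YEAR = 365 * 24  # 8,760 hours, ignoring leap years

def downtime_hours(uptime_pct):
    """Allowed downtime per year, in hours, for a given uptime percentage."""
    return (100.0 - uptime_pct) / 100.0 * HOURS_PER_YEAR

def describe(hours):
    """Express a duration in the largest unit that keeps the number readable."""
    if hours >= 1:
        return f"{hours:.2f} hours"
    if hours * 60 >= 1:
        return f"{hours * 60:.3f} minutes"
    return f"{hours * 3600:.3f} seconds"

for pct in (90, 95, 99, 99.9, 99.99, 99.999, 99.9999):
    print(f"{pct}%: {describe(downtime_hours(pct))}")
```

    Each extra nine divides the allowed downtime by ten, which is why the table drops from days to hours to minutes to seconds.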

    I work for a software shop where we can do high availability, but more often than not, folks choose to lower the uptime expectation rather than pony up for the stupid money it takes to have the hardware / software / infrastructure to get there. Most companies know the customer will not pay the extra cash for the uptime, thus... you get what you pay for.

  14. Not So Simple by jcnnghm · · Score: 4, Insightful

    To put it simply, it's the money, stupid. It requires a lot more equipment and manpower to offer a high availability service. This extra cost results in higher prices. It can cost ten times as much a month for less than 1% more reliability. Think of a $400 a month T1 with an SLA versus a $40/month cable line. Being sheep has nothing to do with it.

    --
    You don't make the poor richer by making the rich poorer. - Winston Churchill
  15. because 'mission critical' is a myth by spasm · · Score: 5, Insightful

    Because 90% of stuff labeled 'mission critical' actually isn't. Think about it - for most of us, being able to receive or send cellphone calls or emails at any time seems super important, but the number of hours in any given month where it really *was* super important (the grant application was due in two hours; your mother was sick; your partner was about to go into labor; whatever) is generally pretty low - our real tolerance for occasional downtime is therefore quite high.

  16. Because it's not necessary? by Srass · · Score: 3, Informative

    Well, my guess would be that many (but not all) people understand that being able to call an ambulance because Aunt Betty has fainted is a necessity, but being able to chat with Aunt Betty for an hour from your car isn't. Missing a rerun of Laverne and Shirley isn't critical, and neither is having to wait to post those vacation pictures to Flickr. Your coworkers will, in all probability, somehow muddle through if you can't send them email from your BlackBerry.

    The telephone as we know it was the first genuinely instantaneous, worldwide communications medium that anyone could use; it was seen as a necessary component for national security during the cold war, and was built out as such. We've had over a century to perfect it, and vast amounts of money were spent doing so. Despite its origins at DARPA, the Internet as we know it today, although more useful, is by and large less of a basic need, is far more complex, and large portions of it are still built on top of the telephone infrastructure, besides.

    I can't help but think that most people understand this sort of thing, and understand that bringing such modern conveniences up to five nines of reliability is difficult and expensive, and people have evidently decided that a certain tradeoff to make such things affordable isn't out of line.

    The shorter, more pessimistic version of this is probably, "It's cheaper to suck."

  17. At what price? by NEOtaku17 · · Score: 5, Insightful

    "The marketplace has been duped into believing that this is the best technology can provide. People don't have time to know, understand, or research history and find that technology really can be reliable."

    No. They believe it is the best the technology can provide at a given price. Why do people "put up" with cars that only give them X amount of protection in a car crash even though there is technology out there that would make them safer? Because they aren't willing to pay the marginal cost for the extra protection. Arguing about what is possible with technology is pointless. What matters is what a piece of technology can do at a given price.

    Everything is a trade-off. The sooner Slashdot learns this the less we will have these stupid "Why don't consumers use the latest, greatest, most expensive technology? We need to force them somehow!" articles.

    1. Re:At what price? by DaveRobb · · Score: 2, Informative

      RFC 1925 Rule 7a.

      Good, Fast, Cheap: Pick any two (you can't have all three)

      People want high reliability, but they're not prepared to pay for it. If they _are_ prepared to pay more money, they miss the point that unless they spend a LOT more money, they'll only increase one of Good (aka reliable) or Fast, not both.

    2. Re:At what price? by NevermindPhreak · · Score: 2, Insightful

      I don't know about non-tech areas, but the US has a much lower population density than many other developed countries. This is why it is easier to get, say, a fiber optic connection and good cell coverage in New York City, than, say, Idaho. People are sprawling away from urban centers more and more now, so that just makes the problem harder.

      My company offers up to gigabit fiber optic in the city. As you get more into the country areas, you're outside our service coverage, and no ISP will offer that without a HUGE premium. Same goes for cell phone coverage around here, the further you get from the cities, the worse it becomes. You even get fewer radio stations as you drive further and further out in the country. It's a population density problem, always has been.

  18. You don't have to take it anymore by BanjoBob · · Score: 4, Insightful

    When Comtrash Internet dropped my speed from 6 Mbps to 1 Mbps but kept the rate at 6 times DSL, I dropped Comtrash and went with the 1.5 Mbps DSL from my local telco. I got 50% more than Comtrash was delivering at 1/6th the cost. No problem.

    When Microsoft decided that I didn't own the rights to my own media and stopped me from being able to copy my own DVDs, I decided to drop them for my media development system and I switched to Linux and Apple. Microsoft doesn't want my business so I went with the people who do. No problem.

    When my Long Distance company decided to charge over $1.00 per minute for International calls, I switched to AT&T and their 17 cents a minute program. No problem.

    When Frigidaire washers charged extra for the warm water cycle but only gave you 5 seconds of hot water (and thus, effectively, never any), it was no problem to return the unit and buy a different brand. Sure, the salesman wasn't happy, but that is now his problem and not mine. I bought a different brand that did give me what they advertised and promised. No problem.

    The list is endless and across all businesses and domains.

    The point is that there are alternatives, but many (or most) people are either too lazy to do anything about it or, like this article, too apathetic to do anything about it.

    The choice is up to the consumer and, if the consumer would take action, the industry would have to adapt because the market demands it. So far, the market is willing to accept this and thus, the industry sees no reason to change. The less the consumer will accept for their dollar, the less they will receive. That is the problem.

    --
    Banjo - The more I know about Windoze, the more I love *nix
  19. Bingo by dreamchaser · · Score: 4, Insightful

    It's all about cost vs. the cost of downtime. You'll find in business lines such as the financial sector, customers are willing to pay for extremely high availability because time is indeed money. Business lines that have lower costs for downtime have to weigh availability vs. ROI.

  20. It's simple confusion by Chairboy · · Score: 3, Funny

    Be careful to pick a provider that advertises "seven nines of reliability" instead of the more common "nine sevens of reliability".

  21. O RLY? by nacturation · · Score: 4, Insightful

    When it comes to mobile phone service, cable TV, Internet access, service interruptions are the norm -- and everyone seems willing to grin and bear it: 'We're so used cable and satellite television reception problems that we don't even notice them anymore.

    And television is mission critical? Besides, I bet most people don't experience significant cable TV interruptions. Satellite depends on the strength of the signal. Tap into Arecibo and you'll likely get 100% reception.

    We know that many of our emails never reach their destination.

    [citation needed] I call bullshit on that one.

    Mobile phone companies compare who has the fewest dropped calls (after decades of mobile phones, why do we even still have dropped calls?)

    Because it's a benefit to have a phone that doesn't draw so much power that your brain heats up just from using the device. Also, dropping a call indicates that you're in an area where there are no cell towers, or that you've hopped from one tower to the next and the next tower has its connections maxed out.

    And the ubiquitous BlackBerry, which is a mission-critical device for millions, has experienced mass outages several times this month.

    Blackberry is not a mission critical service. The people who use it as such are naive. If there truly is a market for five nines uptime for Blackberry, RIM would develop such a service and charge an order of magnitude more for it.

    All of these services are unregulated, which means there are no demands on reliability, other than what the marketplace demands.' So here's the question for you: Why does the marketplace demand so little when it comes to these services?

    Because ultimately it's really not a big deal. So your satellite TV goes down for a bit... get a life. You drop a cell phone call... redial. Your Blackberry isn't receiving emails... get a life.
    --
    Want to improve your Karma? Instead of "Post Anonymously", try the "Post Humously" option.
  22. Why? Simple... by robizzle · · Score: 2, Insightful

    Engineering has always been about compromise. Any idiot can design a structure that is X feet tall but it would prove more useful if it wasn't a giant block of concrete -- if it had room for offices and the materials used to build it had minimal cost without sacrificing structural integrity.

    The same applies to computer engineering. We could easily build a cell phone network that had so many redundancies that it would virtually never go down and would support thousands of times the expected average load, but we would pay for it in terms of cost. Customers demand reliability. Customers demand affordable cost. What the customer is "willing to accept" is a balance between the two.

  23. Gas Prices? by careysb · · Score: 3, Interesting

    I'm still waiting for people to scream about the rising gas prices and the record oil company profits. Seems like this would have a greater impact on the general populace than reliable cell phone service.

    1. Re:Gas Prices? by maxume · · Score: 2, Insightful

      Maybe they realize that the oil companies(and countries) can't do a whole lot about the price of oil.

      I wonder how much Exxon and Shell make when we import a barrel of oil from Canada?

      http://www.eia.doe.gov/pub/oil_gas/petroleum/data_publications/company_level_imports/current/import.html

      --
      Nerd rage is the funniest rage.
  24. Reality Check by grcumb · · Score: 4, Interesting

    In fact, though, I can tell you that in the pre-Windows days, electricity had outages, television had outages, telephone service had outages, gas service had outages...

    I was born in 1964. I have no recollection of POTS telephone service ever being unavailable.

    Electricity was expected to drop out a few times every summer, and until someone figures out how to tell lightning where to go, I expect it will continue to happen. In my part of Canada, however, power is continuously available from October to April no matter what. Even if you don't pay your bill. The only winter power outage of note I can think of offhand was the great Ice Storm of 1998, one of the most spectacular cases of force majeure I've witnessed in my life.

    In my part of the world, at least, power and telephone were life-and-death services and legislation mandated their reliability.

    --
    Crumb's Corollary: Never bring a knife to a bun fight.
    1. Re:Reality Check by PlusFiveTroll · · Score: 3, Funny
      I have no recollection of POTS telephone service ever being unavailable.

      Your neighbors evidently didn't own a backhoe. ;)

    2. Re:Reality Check by Gr8Apes · · Score: 2, Interesting

      You don't live in areas that have hurricanes that result in outages of all services for a week or more at a time, tornadoes, ice storms, straight line winds, or thunderstorms so severe that regular antenna reception even suffers.

      Or, what about the nifty Verizon cell outage that affected most of the south of the US for 8 hours 6 months or so ago? Or the network issues in the middle east? Rolling brown/blackouts in Ca and the NE of the US?

      There's not a lot you can do when the entire area is covered in 6 or more inches of ice with heavy winds, or if everything goes under a few feet of water.

      --
      The cesspool just got a check and balance.
    3. Re:Reality Check by Beardo+the+Bearded · · Score: 4, Interesting

      The first responder radios they have in my city are being upgraded... ... to 97% uptime.

      First responders are police, paramedics, firefighters, etc. There was an incident about a year ago where two cops were being assaulted (and losing the fight) in a basement. Their radios were not working, so they couldn't call for backup.

      Luckily for them, a bystander called 911 on their cell phone.

      Lucky for me, too, since I got called to the carpet for calling the reliability of the system into question. I probably would have been fired, but the above-mentioned incident was in the paper the morning of my "meeting".

      The new radios are controlled by internet-connected computers. As the Farkism goes, "this should end well."

      --

      ---
      ECHELON is a government program to find words like bomb, jihad, plutonium, assassinate, and anarchy.
    4. Re:Reality Check by Rick17JJ · · Score: 2, Interesting

      I have always had several telephone service failures per year, every year, for the last several decades, where I live here in Northern Arizona. First of all, when it rains, the telephone lines sometimes become wet and I lose my dial-tone for a day or so. Then, when I call the telephone company, they usually say, if your telephone lines have not dried out and started working within 48 hours, we will send someone out then. Can't they figure out how to waterproof the phone lines and boxes and other stuff?

      Nearby lightning strikes during thunderstorms also cause several brief power and telephone service failures every summer. The power and telephone service failures usually last anywhere from several minutes to an hour or so. In two instances, my telephone was destroyed and in one instance the twisted pair telephone line itself in the building was damaged. Fortunately, I had already unplugged my computer, in those instances.

      Then of course, about once every other year or so, a backhoe causes a several hour loss of telephone service. Then about a year or two ago, several nearby telephone poles snapped during a wind storm. Then about once a year, telephone and/or power briefly fails for reasons that are not obvious.

      I always keep several LED flashlights and a battery powered radio handy just in case, especially during the summer. My backup methods of communication are my cell phone and the 2-meter ham radio in my truck. By the way, we do not have tornadoes, hurricanes or ice storms here.

    5. Re:Reality Check by darkpixel2k · · Score: 2, Interesting

      Lucky for me, too, since I got called to the carpet for calling the reliability of the system into question. I probably would have been fired, but the above-mentioned incident was in the paper the morning of my "meeting".

      Unfortunately nobody seems to realize just how much money goes into one radio tower.
      In my county we had decent coverage even though we were the second largest county in the state, but had the fewest number of people.

      We had three towers serving the entire area. Each one cost around $100,000 to put up. That covered the price of the equipment, the man-hours to install it, the equipment hours to fly it by chopper to the mountain top, the price of refueling the propane tank so the sunlight-poor winter months wouldn't shutdown the repeater.

      Sure, it's easy to say we need 99.99999999999% reliability, but who wants to pay double or more for the redundancy?
      Hell, just reprogramming all the radios in the county to support another repeater would cost $10,000. Not to mention if you wanted 99.99% uptime, you'd probably have to purchase a second set of radios for the responders because the place that handles reprogramming takes about a week.

      The more reliability, the greater the cost.
      At least until someone replaces the proprietary windows-only dispatch computers, applications, and processes with linux. Then you just pay for the hardware...

      --
      There's no place like ::1 (I've completed my transition to IPv6)
    6. Re:Reality Check by Cramer · · Score: 2, Insightful
      I was born in '72. I can recall phone service being dead only twice. The last, a few years ago, was the result of the entire CO "crashing". (I don't know how a "crash" can cause a loss of loop current, but that was their story. Our DSLAM didn't lose power, so I don't know what they screwed up.) The other was a decade ago... trunk failure prevented calls out of the CO.

      Bottom line, things that aren't supposed to happen, do sometimes happen.

      power and telephone were life-and-death services
      EXACTLY. That's the part people tend to gloss over. Seeing the latest Southpark episode is not a life or death situation. Likewise, your heart isn't going to explode because you cannot get to Yahoo! immediately.

      Trusting one's life to a cell phone is a gamble. While they are a fairly stable technology, there are numerous troubling issues... Batteries don't last forever. Service isn't available everywhere. 911 calls aren't always routed to the most appropriate call center -- although it's much better than in years past. In an accident, your cell phone is just as likely to be damaged as you -- or worse, lost. etc.
    7. Re:Reality Check by greed · · Score: 2, Interesting

      That's the only time POTS in my neighborhood was unavailable. Backhoes just LOVE "Call Before You Dig", but of course, the backhoe didn't have a dime to make a phone call.

      Cut the cable in 3 places; the Bell crews were camped in the trench with tents over it for 4 days, splicing the mess back together.

      Strangely, the contracting company that did it was never seen again....

  25. Re:Five 9's is impossible! by icebike · · Score: 2, Informative

    You totally misunderstand the 5 9s concept.

    It doesn't mean that each and every individual phone will be up 99.999 percent of the time, it means that the system as a whole will be up 99.999% of the time.

    It's quite possible for an entire town to be down for an entire year and still meet this criterion.

    Yet modern cell operators STILL can not come close.

    --
    Sig Battery depleted. Reverting to safe mode.
  26. Re:Because we're cheap? by Titoxd · · Score: 2, Insightful

    That being said, there is a market for 99.999. Upper-middle class and higher would pay for it.

    Um, no. The thing that got the upper class to where they got is either a) dumb luck or b) an ability to distinguish which costs are unnecessary and avoid them. A savvy spender doesn't give a damn whether the cell will not get a signal for 50 minutes during the year, instead of five minutes, if the costs he will incur are double. A savvy spender determines what he needs and then finds the most cost-effective solution that will fit his needs.
  27. Just one point ... by tomhudson · · Score: 3, Informative
    In a properly designed cell phone system, if the tower you were going to be handed off to can't take the connection, either the tower you're with will keep the connection, or another (though still sub-optimal) will take the connection.

    Of course, when you don't have transmitters with overlapping coverage, this doesn't work.

  28. Re:Costs increase geometrically by setagllib · · Score: 4, Funny

    My concept of 5 9s is much easier: 9.9999%. Or for Vista servers, .99999%.

    --
    Sam ty sig.
  29. No way... by duffbeer703 · · Score: 5, Insightful

    This has everything to do with cost and nothing to do with Microsoft. Consider VoIP... people are deliberately choosing telephony services that are less reliable and lower quality than POTS, because VoIP is cheaper. If you want 99.999% uptime, that's fine -- but you're going to pay for it. High availability services require better equipment, redundant equipment that doesn't come cheap and more, higher quality staff to operate it. So it costs more.

    I've been in the technology services business for a long time, and with few exceptions, 80%+ of customers want their services delivered as cheaply as possible. Most hospital systems don't even have a 99.999% availability requirement. The 20% that want varying levels of higher than normal availability usually have a government regulation, SLA or other mandate requiring that they do so.

    --
    Conformity is the jailer of freedom and enemy of growth. -JFK
  30. WTF would I do with 99.999% uptime? by Kjella · · Score: 4, Interesting

    My electricity isn't 99.999% uptime (that's about 5 minutes of downtime in a year) which would require me to get a UPS
    My consumer grade equipment isn't 99.999% uptime (with luck, maybe I guess but there's no ECC, redundant power etc).
    My software isn't 99.999% uptime (ok, so the kernel is stable. When X crashes, so does everything of importance on a desktop)
    If there's something urgent, you CALL me anyway.

    I'd rather take a line with 99.5% uptime (that's two days without internet per year) that's 10x faster and costs 10x less. Which doesn't include that I have Internet at work, or via my cellphone, or via a webcafe or any number of other easily available sources. The only real killer I can think of is if you only telecommute and can't go to work, but even then I figure the nearest Starbucks will let you occupy a corner with some purchases.

    --
    Live today, because you never know what tomorrow brings
  31. Partly correct by unixfan · · Score: 3, Interesting

    Partly correct. What they did was to mass introduce the GUI. 1.0 was a joke as far as usability went. At the same time the 386 was out and the talk of multiprocessing was promising new and exciting computing in the near future.

    I don't think they measured squat. Just did their best. Only thing was that there was nobody who could properly design an O/S and complexity, instead of simplicity, ruled the day.

    What we are seeing is the very best they as group are able to produce.

    They have never been great at marketing either. But they were really the first to push the GUI with success. Don't forget Apple became a very closed platform. They did not attract masses the way the open IBM PC did.
    Right there history shows how important open standards are to success. Apple was considered this fantastic success story but in reality they cut it short and did not buy the masses the way the Johnny-come-lately IBM PC did. But we are slow when it comes to learning from history.

    What they have been good at is market lock-in, vendor lock-in and many other types of lock-in. (The problem really is that they had never heard about duty and were only interested in money.) We all thought they would get it right sooner or later and deliver a good platform that would allow happy computing. The fact that they specialized in adopting good standards and then corrupting them so that you got locked in was a very calculated development.

    At one point Gates himself said that Unix was the way to go. Then he decided to do it better but clearly never understood what made Unix so good (simplicity). Torvalds on the other hand was ONLY looking for simplicity. Which is why it fit so well into the general Unix design.

    Look at windows, it is filled with arbitrary complexities and is horribly inefficient. Never mind when upper management throws fits and yells at staff, I've never found that conducive to good programming, or business.

    Gates cheated his way into O/S design, used people from VAX whose memory management problems were dragged over to windows. Built a kernel in BASIC! Haha! And got away with it for years!

    Someone who knew more about systems picked the Unix design and rewrote history based on technology, and was not motivated by money. Interesting to see how much we like to be able to just do what we need. Imagine if IBM had released Linux. With all the corporate support for let's say $100. Then opened it up with a GPL license.

    Microsoft would not be sitting pretty at all. The OS/2 collaboration would not have happened and Gates would not have learned his lessons from that. For all their success I've never considered them much of a success where it really matters. Integrity in product and care for customers. I have people send me Brandy, fine wines and other tokens of their appreciation after sales. Because I believe in treating other people the way I like to be treated, and I really care about my clients.

    1. Re:Partly correct by Herby+Sagues · · Score: 2, Interesting

      > they didn't measure squat, they just did their best.

      I don't agree with that, and there's an obvious process that demonstrates my point. Beta. Microsoft could have released each product a year earlier, or a year later. Do it a year later and you have a more polished product, but users clamoring for an update in the meantime. Do it earlier and you have crap. At many times Microsoft had product in long betas and people were asking for a release, but Microsoft knew that releasing at that point would damage sales. The only time they caved in to pressure (Vista) they got a huge headache.

      They build extremely complex products and have teams of thousands of developers and testers working on them. Do you think they manage that at random? No, they have quality bars. Do you think they set them arbitrarily? That they say "this product will shine, that other one will suck"? No, they set those quality bars according to their estimates of what the market wants (quality as a tradeoff of when and for how much).

      Apple has a different strategy: they do not need to attract 100% of the users, with catching 5% they are already growing, so they need to set the bar much higher, and do not care much whether the majority needs that product now or in ten years. A small minority will accept the tradeoffs (price, compatibility, flexibility) and that's business sense for them. But they have set their quality bar based on a rational process as well. Just that their situation is different, and so is their strategy.

      Regarding dropped calls: we do have dropped calls because we have something called radio bandwidth, and people moving from one area to another. No matter how much spare room you have, at some point some users will saturate a cell, and some calls will get dropped.

      And I don't get 99.999% uptime on my ground line either. It is just a perception, since when you pick up the phone and it works you just assume that it has been working all the time, even when you weren't using it. 99.999% uptime would imply that you would have to attempt to make a hundred thousand calls and on one single opportunity find the line dead. And that's not even nearly the case. I would say that out of a thousand calls (that is probably a year of usage) I find the line dead a few times. That means lower than 99.9%. 99.999% is extremely expensive, whether you are talking of phones, Internet, cellular, banking or airplanes. Very, very few businesses offer even four nines to their customers. And telcos, in any of their forms, are not among them.
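
      For anyone who wants to check the call-counting argument, here is a minimal sketch in Python. It assumes availability can be estimated as the fraction of call attempts that found a working line; the function names and the "three dead lines in a thousand" figure are illustrative assumptions, not anything measured.

```python
import math

def observed_availability(attempts, failures):
    """Fraction of attempts that found the service working."""
    if attempts <= 0:
        raise ValueError("attempts must be positive")
    return 1.0 - failures / attempts

def nines(availability):
    """Count of nines: 0.999 gives about 3, 0.99999 gives about 5."""
    if availability >= 1.0:
        return math.inf  # no observed failures, so no finite bound
    return -math.log10(1.0 - availability)

# One dead line in a hundred thousand attempts is the five-nines case.
print(round(nines(observed_availability(100_000, 1)), 2))
# Three dead lines in a thousand attempts (an assumed reading of "a few").
print(round(nines(observed_availability(1_000, 3)), 2))
```

      Under those assumptions, a few dead lines per thousand calls lands between two and three nines, which is the "lower than 99.9%" point.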

    2. Re:Partly correct by Chaos+Incarnate · · Score: 2, Insightful

      If you have to choose your hardware around the OS, that hardly counts as simplicity.

      --
      Benford's Corollary to Clarke's Law: "Any technology distinguishable from magic is insufficiently advanced."
    3. Re:Partly correct by amRadioHed · · Score: 3, Insightful

      Apple users would disagree.

      --
      We hope your rules and wisdom choke you / Now we are one in everlasting peace
  32. Re:What is good enough? by finity · · Score: 2, Interesting

    So the question becomes, how do you get someone to spend more when what they currently get is good enough?

    Maybe that's the question the cable company would like to ask, but the one concerned consumers should be asking is, "how do you get someone to expect _more_ for the same price (or less) when they think that what they currently get is good enough?" Reading your piece of the discussion, I think this question could also follow, and it happens to be the original question...

    Would I be willing to pay more for cell service that had fewer dead zones, dropped calls and "busy networks" than my current one has? No way. It's not as good as landline, but it's good enough for me. If, ten years from now, it worked the same as it does now, I would expect their competition to have passed them by and I'd switch. In the US we're in a free market system.

    If I was tired of my cable internet dying on me occasionally, which competitor would I turn to? DSL, satellite and local wireless all have problems too. I settle for less than 5 nines because I have no choice, if I want service that is anywhere near the cost it is right now.

  33. Reality check by mstone · · Score: 2, Informative

    The N-nines model is a fast and easy way to compare order-of-magnitude differences between existing networks, but it says almost nothing meaningful about actual usage or the perception of uptime from a user's perspective.

    Let's look at the numbers: 99.9% uptime translates to about 9 hours of unscheduled downtime (USD) a year. That can be one 9-hour block once a year, about 1.4 minutes per day, 3.6 seconds per hour, 60 milliseconds per minute, or one dropped packet per thousand. Sure, it's easy to spot a 9-hour blackout, but as the slices of downtime get thinner, they get harder to notice at all, or to identify as USD specifically.

    99.999% uptime translates to about 5 minutes of USD per year, and is of questionable value. You can't identify a network outage, call in a complaint, and get the issue resolved in the given timeframe. 99.9999%? It is to laugh. You can't even look up the tech support phone number without blowing your downtime budget for the year. Get hit by a rolling blackout for an hour? Kiss your downtime budget goodbye for the next 120 years.
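
    The downtime budgets above are easy to recompute. The following is a rough Python sketch, assuming a 365-day year and treating availability as a plain percentage of elapsed time; the function names are made up for illustration.

```python
SECONDS_PER_YEAR = 365 * 24 * 3600  # assumption: leap years are ignored

def downtime_seconds(availability_pct, window_seconds=SECONDS_PER_YEAR):
    """Allowed downtime, in seconds, within a window at a given availability."""
    return window_seconds * (100.0 - availability_pct) / 100.0

def years_of_budget(outage_seconds, availability_pct):
    """How many years of downtime budget a single outage consumes."""
    return outage_seconds / downtime_seconds(availability_pct)

for pct in (99.9, 99.999, 99.9999):
    per_year = downtime_seconds(pct)
    per_day = downtime_seconds(pct, 24 * 3600)
    print(f"{pct}%: {per_year / 60:.1f} min/year, {per_day:.3f} s/day")

# A one-hour outage measured against a six-nines budget.
print(f"{years_of_budget(3600, 99.9999):.0f} years of budget")
```

    Run against a one-hour outage at 99.9999%, this gives a bit over a century of budget, the same ballpark as the 120 years quoted.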

    Getting back to 99.9% uptime, let's move on to standard utilization patterns. USD really only becomes an issue if people notice it .. nobody cares if an incoming piece of email got delayed by 30 seconds at the MTA, but they do get testy if they can't load their webpages. But web surfing only uses 1-2 seconds of bandwidth per minute anyway.

    If we have 2 seconds of usage and 2 seconds of downtime per minute, the odds of a collision are around 15:1 with an average overlap of 1 second when a collision does happen. Simply interleaving usage and downtime that way increases the perceived uptime by an order of magnitude since 90% of the outages happen when no one is actually using the network. And larger blocks of downtime get lost in larger blocks of non-utilization exactly the same way .. who cares about a half hour of downtime from 0300 to 0330 when no one in your company is actually in the building and using the network?
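
    A back-of-the-envelope model of that interleaving, as a hedged Python sketch: it assumes one usage burst and one outage per minute, each dropped uniformly at random on a repeating 60-second window, and that the two together are shorter than the window. Real traffic and real outages are burstier than this, and the function names are illustrative.

```python
def collision_probability(usage_s, downtime_s, window_s=60.0):
    """Chance that the usage burst and the outage overlap at all."""
    return min(1.0, (usage_s + downtime_s) / window_s)

def mean_overlap_given_collision(usage_s, downtime_s):
    """Average overlap length, in seconds, once the two do collide."""
    # Summed over every possible offset the overlap totals usage * downtime,
    # and the offsets that collide span usage + downtime seconds.
    return usage_s * downtime_s / (usage_s + downtime_s)

p = collision_probability(2.0, 2.0)
print(f"collision chance: 1 in {1 / p:.0f}")
print(f"mean overlap: {mean_overlap_given_collision(2.0, 2.0):.1f} s")
print(f"outages nobody notices: {100 * (1 - p):.0f}%")
```

    With 2 seconds of each, the model gives a collision about one time in fifteen and an average overlap of one second, so somewhat more than 90% of outages pass unseen.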

    Granted, if you have higher utilization you'll have a better chance of hitting a chunk of downtime, but you'll also have higher chances of queuing latency within your own use patterns. If you're already using 99% of your bandwidth, you can't just plunk in one more job and expect it to run immediately. It has to wait for that 1% of space no one else is currently using. And when you get to that point, it's really time to consider buying a bigger pipe anyway.

    And that brings us to the main point: People don't buy network connectivity in absolute terms. They buy capacity, and the capacity they buy is scaled to what they think of as acceptable peak usage. "Acceptable peak usage" is a subjective thing, and nobody makes subjective judgements with 99.999% precision.

  34. Here's your citation about email by btarval · · Score: 2, Interesting
    "We know that many of our emails never reach their destination.

    [citation needed] I call bullshit on that one.

    And I call BS on your BS. Clearly you're not familiar with the state-of-the-art as far as email goes. You've certainly not had to set up and run a private email server.

    Here's one good reference. It mostly mirrors my experience, except that it's been going on longer than the writer has observed.

    The basic problem is that Yahoo, Hotmail, ATT and other large email providers, or ISPs, simply refuse to honor the standards which have been published (DKIM, et al.). Google is great. But it's gotten so bad with the others that I simply don't bother communicating with anyone who has a Hotmail, Yahoo, or ATT account. If they are someone important, I'll tell them once (via a different band) of the situation. And let them know that unless they change their email provider, I won't be responding to any future email from them.

    Usually I just refer them to gmail, because google seems to be the only large email provider with a technical clue.

    The other interesting thing is that all of these large companies will treat unsigned email from an Exchange server as more verified than a DKIM email, but I digress.

    Supposedly the excuse is that it's due to spam. I'm certain that is part of the problem. But the other part is that there's definite incentive for the big boys to eliminate the small independent websites and drive all of the business into their arms.

    So, yes, the OP's statement about many email messages not reaching their destination is quite true. Most? No. But anything that doesn't use the technology offered by the big commercial joints (including Microsoft server technology) is shut off from communicating with a large part of the internet.

    Blackberry is not a mission critical service. The people who use it as such are naive.

    Heh. Well, many PHBs would disagree, but your point is valid.

    For your amusement, the Blackberry email servers are provided by a company called Mirapoint (mirapoint.com), and they are Linux based. From what I've heard, they cut over about 2 years ago from BSD to Linux, for various reasons. I'm also told that the CEO is a complete airhead who has difficulty managing a secretary, let alone a company. But that the mid-level managers and engineers in the U.S. are first rate. I imagine that they could indeed improve the uptime of the email servers, but those servers are quite good already.

    --
    The best way to predict the future is to create it. - Peter Drucker.
  35. Introducing the EULA by Mr+Pippin · · Score: 4, Informative

    Also, because the EULA came into existence, product warranties effectively vanished, as did the actions the consumer could take in court via product liability claims.

    After all, liability plays a large part in defining QA policies. If software companies were held to the same liability standards most product manufacturers face, I'd bet software development would be more of the engineering practice it should be.

    To quote part of Microsoft's EULA for Windows XP:

    http://www.microsoft.com/windowsxp/home/eula.mspx
    ALSO, THERE IS NO WARRANTY OR CONDITION OF TITLE, QUIET ENJOYMENT, QUIET POSSESSION, CORRESPONDENCE TO DESCRIPTION OR NON-INFRINGEMENT WITH REGARD TO THE SOFTWARE.

  36. Statutory liability for software defects by Schraegstrichpunkt · · Score: 2, Insightful

    They believe it is the best the technology can provide at a given price. Why do people "put up" with cars that only give them X amount of protection in a car crash even though there is technology out there that would make them safer? Because they aren't willing to pay the marginal cost for the extra protection.

    This reminds me of why Bruce Schneier's dream of legislating liability for software defects is misguided. Sure, statutory liability would make software more reliable, but it would mean that the many who don't need the additional reliability (and currently aren't willing to pay for it) would be forced to subsidize the handful who do. It would also likely claim volunteer-developed software as a casualty.

  37. Because they're cheap and unobtrusive by supabeast! · · Score: 2, Insightful

    If we wanted better uptime we could have it. We would just have to pay more for, and look at, a whole lot of redundant systems. Personally, I'm happier to keep paying less and only have one power line coming into my house, with the nearest plant many miles away. The same goes for cable and telephone service. And my cellular service does work about 99.9% of the time.

  38. true for blackberry too by CdBee · · Score: 2, Insightful

    When my employer's BlackBerries failed earlier this month, they fell back to laptops with a Bluetooth-tethered phone and Outlook/Exchange. Redundancy is built into the mindset. No messages were lost.

    --
    I have been a user for about 10 years. This ends Feb 2014. The site's been ruined. I'm off. Dice, FU
  39. Re:What is good enough? by perlchild · · Score: 2, Insightful

    I'm sure a lot of the attraction of Internet service is that you pay a single flat fee, no matter how "important" the packet is. Who wants to have a 2.99 extra surcharge per call if the caller is a job recruiter (presumably, because he is offering you a job)? How about a 5 dollar surcharge if the call comes from your doctor? vet? The Internet caught on so far with the "a packet is a packet" mantra. Now all the internet suppliers compete on price (because people want cheaper internet) and want to charge extra for things... people haven't considered when they signed up... so they can charge more. This is what this is about, period. I imagine similar efforts are underway, paid for by different cable companies, etc... Anything to not have 5mbps to the internet, unfettered, 24hrs per day, 7 days a week, always-on, for a flat fee.

    Unfortunately for them, I'd be willing to downgrade to 1mbps, but not on the always on, nor the unfettered, and if they do downgrade, I will be readjusting my idea of how much it should cost.

  40. Here's Why America Puts Up with It by Hercules+Peanut · · Score: 2, Interesting

    I'm at home (and awake) 20% of the time.
    My landline is up 99.999%, meaning my phone is available to me when I need it 19.9998% of the time.

    I'm out and about (and coherent) 40% of the time.
    My cell phone works 90% of the time, meaning it is available to me when I need it 36% of the time.

    Clear winner, cell phone.

    Sometimes we lose sight of reality while studying statistics.

  41. Even Simpler... by nick_davison · · Score: 3, Insightful

    As Joel Spolsky pointed out on his blog JoelOnSoftware, 99.999% is pretty much fictional.

    99.999% over a year is 31.526 seconds.

    No matter how good your staff, no matter how many people you have on site, no matter how robust your systems, no matter how many failsafes you have standing by, ready to be plugged in...

    IF something does go down, even the fastest tech on earth is unlikely to identify, pull out, replace and have fired back up whatever the faulty item is in under 30 seconds.

    99.999% uptime is essentially fictional. It's simply an impressive sounding number that says, "We'll do everything realistically possible to keep you up 100% of the time. In a typical year, you won't see anything bring you down. You can now tell your investors/clients this and make them feel warm and fuzzy."

    It ignores the second part, "But, honestly, if it does go down, we won't have it back within 30 seconds, 100% of the time. Sorry, but welcome to reality. But, for what it's worth, our board's happy to pay you outage fees because it's a small enough risk and the amounts are capped enough, that we're happy to take the risk and costs in exchange for advertising a service we know no one can deliver."

    Let's look at regulated phone service, the example in the original post. Can anyone point to a major carrier that hasn't had a major outage at some point? Be it an idiot in a switch room, a power outage affecting a whole side of the country, an anchor ripping up an undersea cable? And how many of them have actually been back within the mandated 30 seconds?

    It doesn't happen. That two hour outage is going to take a quarter of a millennium of absolutely no more faults to earn back at 30 seconds/year. With luck, it only hit one in 250 customers so you can pretend you're well within your 99.999% uptime but that 1 in 250 isn't really going to agree they got 99.999% after they were down for 1:59:30 more than their contract said they would be.

    So, no, 99.999% doesn't exist. It's just a really cool story we tell ourselves whilst being willing to pay whatever the penalties are for missing it, on rare occasions, in exchange for great advertising.
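
    The downtime figures in this comment can be checked with a short calculation. This is only a sketch, assuming a 365-day year; the function name is made up for illustration. Worked through, 99.999% leaves about 315 seconds (roughly five and a quarter minutes) of downtime per year, while a budget of about 31.5 seconds per year corresponds to 99.9999%, so the 30-second figure used above is closer to a six-nines target than a five-nines one.

```python
SECONDS_PER_YEAR = 365 * 24 * 60 * 60  # 31,536,000; leap years ignored


def downtime_budget_seconds(availability_percent: float) -> float:
    """Seconds of downtime allowed per year at the given availability."""
    return SECONDS_PER_YEAR * (100.0 - availability_percent) / 100.0


def years_to_earn_back(outage_seconds: float, availability_percent: float) -> float:
    """Fault-free years needed before one outage fits inside the yearly budget."""
    return outage_seconds / downtime_budget_seconds(availability_percent)


if __name__ == "__main__":
    for target in (99.9, 99.99, 99.999, 99.9999):
        print(f"{target}% -> {downtime_budget_seconds(target):,.1f} s/year")
    # A two hour outage measured against a five-nines budget.
    print(f"{years_to_earn_back(2 * 60 * 60, 99.999):.1f} years")
```

    The general point of the comment survives either way: a single multi-hour outage takes many fault-free years to average out.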

  42. once in a while by Scrameustache · · Score: 2, Funny

    One day every other month where our home internet is down doesn't seem like the end of the world. Hell, it's a relief! We wander outside, blinking and squinting at the surprising brightness, experiencing strange yet nostalgic smells and sounds.
    --

    You can't take the sky from me...

  43. Re:Costs increase geometrically by LordKronos · · Score: 2, Insightful

    Thanks for actually listing out the figures. It really puts things in perspective, and it made me realize something. My internet service probably gets somewhere between 99.9% and 99.99% uptime. My cell phone is probably in a similar range. My cable is better than 99.999% (maybe even 99.9999%).

  44. No, it does exist by Sycraft-fu · · Score: 2, Informative

    But only with redundant systems. What happens is when something goes down, techs aren't getting it back up in 30 seconds, rather it is instantaneously failing over to another system. You have enough redundancy, you can keep operating even in the face of multiple simultaneous failures.

    The problem is, of course, going for that can be really expensive. Not only does the system itself have to have a bunch of redundancy, but so does everything supporting it. For example in the case of a web server you'd not only have to have multiple boxes running that, but multiple power connections, generators, network connections, ISPs, etc.
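
    To put rough numbers on that, here is a back-of-the-envelope sketch. It assumes every component fails independently, which real systems rarely manage, and the function names are invented for illustration. Redundant copies multiply their failure probabilities together, while a chain of things that all have to work multiplies their availabilities together.

```python
def parallel(availability: float, copies: int) -> float:
    """Availability of N redundant copies where any one copy is enough.

    Assumes failures are independent (no common-mode failures).
    """
    return 1.0 - (1.0 - availability) ** copies


def series(*availabilities: float) -> float:
    """Availability of a chain in which every component must be up."""
    result = 1.0
    for a in availabilities:
        result *= a
    return result


if __name__ == "__main__":
    # Two 99% boxes behind instant failover.
    print(parallel(0.99, 2))
    # The same pair, fed by a single non-redundant 99.9% power feed.
    print(series(parallel(0.99, 2), 0.999))
```

    The second figure is dragged down by the single power feed, which is why the supporting pieces need redundancy as well.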

    Doing something like that, you can offer essentially 100% uptime, barring a catastrophic event (and face it, any amount of uptime can be ruined by a sufficiently large event). However it is extremely costly, and of course everything has to be well designed because, as you noted, you fuck up anywhere, you got 30 seconds to fix it.

    Or you can just do what the voice guys like to do: Change the rules. For them, the system is "up" so long as there is at least one phone line that can place a call to at least one other phone line. By that standard, the voice switch on campus has never been down. Of course that isn't a particularly useful standard, if you ask me.

  45. On redundancy by Animats · · Score: 3, Informative

    In the entire history of electromechanical switching in the Bell System no central office was ever out of action for more than thirty minutes for any reason other than a natural disaster. On the other hand, step-by-step (Strowger) switches failed to connect about 1% of calls correctly, and crossbar reduced that to about 0.1%. With electronic switching, the failure rate is higher but the error rate is much lower.

    This reflects the fact that, in the electromechanical era, the hardware reliability was low enough that the system had to be designed to have a higher reliability than any of its individual units. In the computer era, the component reliability is so high that good error rates can be achieved without redundancy. This is why computer-based networks tend to have common mode failures.

    If you're involved in designing highly reliable systems, it's worth understanding how Number 5 Crossbar worked. Here's an oversimplified version.

    The biggest components of Number 5 crossbar were the crossbar switches themselves. Think of them as 10x10 matrices of contacts which could be X/Y addressed and set or cleared. Failure of one crossbar switch could take down only a few lines, and they usually failed one row or column at a time, taking down at most one line.

    The crossbars had no smarts of their own; they were told what to do by "markers", the smart part of the central office. Each marker could set up or tear down a call in about 100ms. Markers were duplicated, with half of the marker checking the other half. If the halves disagreed, the transaction aborted. Each central office had multiple markers (not that many, maybe ten in an office with 10,000 lines), and markers were assigned randomly to process calls.

    When a phone went off hook, a marker was notified, and set up a "call" to some free "originating register", the unit that understood dial pulses and provided dial tone. The marker was then released, while the user dialed. The originating register received the input dial info, and when its logic detected a complete number, it requested a random marker, and sent the number. The marker set up the call, set and locked in the correct contacts in the crossbars, and was released to do other work.

    If the marker failed to set up the call successfully (there was a timeout around 500ms), the originating register got back a fail, and retried, once. One retry is a huge win; if there's a 1% fail rate on the first try, there's a 0.01% fail rate with two tries. This little trick alone made crossbar systems appear very reliable. There's much to be said for doing one retry on anything which might fail transiently. If the retry also fails, unit level retry as a strategy probably isn't working and the problem needs to be kicked up a level.
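
    The retry-once idea is small enough to sketch in a few lines. This is an illustration only, not how a marker actually worked: the function names and the simulated 1% failure rate are assumptions, and the arithmetic assumes the two attempts fail independently.

```python
import random


def attempt_with_one_retry(operation, rng) -> bool:
    """Try the operation, retry exactly once, then give up.

    Returns False after two failures so the caller can escalate
    instead of retrying forever at the same level.
    """
    for _ in range(2):
        if operation(rng):
            return True
    return False


def flaky_setup(rng, fail_rate: float = 0.01) -> bool:
    """Stand-in for a call setup that fails transiently (hypothetical rate)."""
    return rng.random() >= fail_rate


def combined_failure_rate(fail_rate: float, attempts: int = 2) -> float:
    """Chance that every attempt fails, assuming independent failures."""
    return fail_rate ** attempts


if __name__ == "__main__":
    rng = random.Random(5)
    trials = 200_000
    failures = sum(not attempt_with_one_retry(flaky_setup, rng) for _ in range(trials))
    print(f"predicted: {combined_failure_rate(0.01):.4%}")
    print(f"simulated: {failures / trials:.4%}")
```

    If the failures are not independent (say, the same broken trunk gets picked both times), the second try buys much less, which is one reason to pair the retry with choosing a different unit.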

    The pattern of requesting resources from a pool at random was continued throughout the system. Trunks (to other central offices), senders (for sending call data to the next switch), translators (for converting phone numbers into routes), billing punches (for logging call data), and trouble punches (for logging faults) were all assigned on a random, or in some cases a cyclic rotation basis. Units that were busy, faulted, or physically removed for maintenance were just skipped.
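
    A minimal sketch of that pool pattern, with made-up names (Unit, pick_unit) rather than anything from the Bell System documentation: every request picks at random among the units currently usable, and anything busy, faulted, or pulled for maintenance is simply skipped.

```python
import random
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Unit:
    """One interchangeable resource in a pool (marker, trunk, sender, ...)."""
    name: str
    busy: bool = False
    faulted: bool = False
    removed: bool = False  # physically pulled for maintenance

    def available(self) -> bool:
        return not (self.busy or self.faulted or self.removed)


def pick_unit(pool: List[Unit], rng: random.Random) -> Optional[Unit]:
    """Pick a random usable unit, or None if the whole pool is unavailable."""
    candidates = [u for u in pool if u.available()]
    if not candidates:
        return None  # out of capacity: report it upward
    return rng.choice(candidates)


if __name__ == "__main__":
    rng = random.Random(42)
    markers = [Unit("marker-0"), Unit("marker-1", faulted=True), Unit("marker-2")]
    for _ in range(5):
        unit = pick_unit(markers, rng)
        print(unit.name if unit else "no unit available")
```

    There is no separate failover path here: a faulted unit just shrinks the candidate list, so the same code runs on a good day and a bad one.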

    That's how the Bell System achieved such good reliability with devices that had moving parts.

    Note that this isn't a "switch to backup" strategy. The distribution of work amongst units is part of normal operation, constantly being exercised. So handling a failure doesn't involve special cases. Failures cost you some system capacity, but don't take the whole system down.

    We need more of that in the Internet. Some (not all) load balancers for web sites work like this. Some (but not all) packet switches work like this. Think about how you can use that pattern in your own work. It worked for more than half a century for the Bell System.

  46. Brandy? by JakartaDean · · Score: 2, Funny

    I have people send me Brandy, fine wines and other tokens of their appreciation after sales.
    Great! Could you send her my way, once you're done with her?
    --
    The subject who is truly loyal to the Chief Magistrate will neither advise nor submit to arbitrary measures (Junius)
  47. Because it's not worth the enormous cost! by reidconti · · Score: 2, Insightful

    Thank you for bringing some sanity into this argument. Before you showed up it was dominated by idiotic hippies ranting about our mindless consumer-driven existence, the destruction of the environment, Microsoft, and just about everything else that has nothing to do with the issue at hand.

    99.999% uptime is orders of magnitude more expensive than 99.99%, which in turn is orders of magnitude more expensive than 99.9% uptime, and so on.

    The added cost is simply not worth it, in any sense of the word, to the general public.

    I, for one, would prefer to deal with a day's worth of power loss in a major storm, than paying 10x as much for my electricity in order to make it bulletproof.

    The savings would be better spent elsewhere.

    Note that this is not an argument against proper planning and preventative maintenance to REDUCE downtime as much as possible, just an argument against designing everything in the world to survive a nuclear bomb when that level of reliability is simply not worth the cost.

  48. Zero dropped calls? by mc900ftjesus · · Score: 2, Informative

    "after decades of mobile phones, why do we even still have dropped calls?" This is just stupid. A dropped call is not the network, it's your phone losing the network. There is absolutely no way to avoid them, none. RF only travels so far and through so many things. I completely understand the article and its merit, but this is just the author being ignorant of their subject or scoring sensationalist points with uninformed readers. Someone explain to me how a company could possibly cover the entire US, and I mean Wyoming and Montana too (if you want zero dropped calls). Then there's the fact that Americans will take a $0 junk heap of a phone with a contract and hope that it will perform well.