Slashdot Mirror


Amazon Outage Cost S&P 500 Companies $150M (axios.com)

From a report on Axios: Cyence, an economic modeling platform, shared some data with Axios that show the ramifications: Losses of $150 million for S&P 500 companies. Losses of $160 million for U.S. financial services companies using the infrastructure.

113 comments

  1. Maybe you should own your hardware by Chronus1326 · · Score: 1, Insightful

    If you took responsibility for your own hardware resources, this wouldn't have been an issue for you.

    1. Re:Maybe you should own your hardware by fabriciom · · Score: 3, Insightful

      If you ever get to management and you have to answer for errors of your subordinates your opinion will change.

    2. Re: Maybe you should own your hardware by Anonymous Coward · · Score: 0, Flamebait

      Most IT "engineers" I've met should be driving an Uber not working on systems.

    3. Re:Maybe you should own your hardware by James+Carnley · · Score: 5, Insightful

      Yeah because self hosted hardware never goes down. Totally rock solid. I don't know why everyone doesn't host their own stuff so that nothing can go wrong.

    4. Re:Maybe you should own your hardware by Jaime2 · · Score: 1

      If you outsource, you can blame the service provider. If you do it internally, you take the blame yourself. No wonder the cloud is so popular.

    5. Re:Maybe you should own your hardware by __aaclcg7560 · · Score: 1

      Owning the backend isn't a cure all. When I was an intern at Fujitsu in the late 1990's, I discovered the crash bug on the test server and could reproduce it 100%. My supervisor couldn't reproduce the bug even though we took turns at the keyboard. He approved the patch for production. The servers crashed 24 hours later. The engineers determined that a deep fix was required, forcing the server offline for three days and costing $250K in lost revenues. I wasn't offered a job when my internship expired. One-third of the department got laid off a month after I left to make up for the lost revenues. My boss, being a high-ranking engineer, got promoted.

    6. Re:Maybe you should own your hardware by SpaghettiPattern · · Score: 1

      I understand what you mean.

      However... The more you have your stuff together the easier it is to reach absurdly high levels of availability at affordable costs. Automatic host fail over, automatic site fail over, etc...

      Then again not many have.

      --

      I hadn't the slightest objection to his spending his time planning massacres for the bourgeoisie... (P.G. Wodehouse)
    7. Re:Maybe you should own your hardware by TWX · · Score: 1

      Our locally-hosted AS/400 has not had an unscheduled outage in something like fifteen years, and that includes at least one full hardware migration. Mind you, there's only one local admin left that knows how to read the chicken bones and tea leaves to run the thing, but it's not exactly impossible to have excellent availability when the right platforms are chosen and are maintained.

      It's also perfectly possible for a large enough organization to run separate datacenters at facilities in different geographical areas with redundancy within the facilities and across the two facilities. Hell, it's even possible to tunnel L2 so that the equipment at the different facilities doesn't even know that it's not all at one big happy site, should that sort of thing be necessary.

      This is all academic though, the real issue with "cloud computing" is how the failure happens and the conditions. This Amazon outage was annoying, and obviously affected a lot of end users for a day or so, but was recoverable because Amazon was still around to fix it. I'm much more concerned when the cloud provider goes out of business and customers suddenly find the rug permanently ripped out from under them. It is entirely plausible that a cloud provider would attempt to right itself if it was having financial problems, including doing whatever it can to conceal or downplay those problems, such that the end customers might not have much notice that their platforms are going away. I would also be shocked if most users of cloud computing do an adequate job of backing up off of the cloud to their own datacenters; after all, wasn't going to cloud hosting done specifically to avoid having to maintain vast datacenters?

      I do not anticipate good things from this era of cloud computing. I expect outages, I expect companies going under because the cloud provider goes under. I honestly expect it to get so bad that ultimately customers start demanding that the cloud providers create frameworks that allow for interoperability and simple migration from one provider to another, such that a company doesn't have to put all of their eggs into one basket.

      --
      Do not look into laser with remaining eye.
    8. Re:Maybe you should own your hardware by TWX · · Score: 2

      Because blame-storming works when your entire company's service is entirely offline and now your customers leave you.

      We felt the effect of the Amazon issue through a service that we've contracted-for. That service provider gets no special consideration in our judgement of them just because the entity they subbed-out to went down.

      --
      Do not look into laser with remaining eye.
    9. Re:Maybe you should own your hardware by ooloorie · · Score: 1

      If you took responsibility for your own hardware resources, this wouldn't have been an issue for you.

      True, but anything from the extra staff to your data center flooding would. And if you total all of that up across S&P 500 companies, you likely end up with bigger total losses.

      In different words, your advice is penny wise and pound foolish.

    10. Re:Maybe you should own your hardware by ooloorie · · Score: 1

      However... The more you have your stuff together the easier it is to reach absurdly high levels of availability at affordable costs. Automatic host fail over, automatic site fail over, etc...

      How nice. And when the employee that put all that together leaves the company for greener pastures or to pursue his dreams and when you have to replace him on short notice, that setup falls apart. Likewise, when you suddenly need to quadruple your capacity because of some business decision, you lack the staff and resources to do so quickly.

      The nice thing about Amazon is that it is predictable, low (not zero) risk, and scalable.

    11. Re:Maybe you should own your hardware by Anonymous Coward · · Score: 0

      Why isn't AWS considered one of your subordinates that you have to answer for?

    12. Re:Maybe you should own your hardware by Jaime2 · · Score: 1

      It doesn't work out for the company, but IT managers do use this excuse regularly. I'm not suggesting that it's a good thing, just that management seems to be more about avoiding blame than providing solutions.

    13. Re: Maybe you should own your hardware by Anonymous Coward · · Score: 0

      Owning the backend isn't a cure all. When I was an intern at Fujitsu in the late 1990's, I discovered the crash bug on the test server and could reproduce it 100%. My supervisor couldn't reproduce the bug even though we took turns at the keyboard. He approved the patch for production. The servers crashed 24 hours later. The engineers determined that a deep fix was required, forcing the server offline for three days and costing $250K in lost revenues. I wasn't offered a job when my internship expired. One-third of the department got laid off a month after I left to make up for the lost revenues. My boss, being a high-ranking engineer, got promoted.

      Cool story bro.

      None of that ever happened, did it?

    14. Re:Maybe you should own your hardware by Anonymous Coward · · Score: 0

      Hi. I'm the one other person out there that still has to use an AS/400. Just get it "cloud hosted" instead and never have to worry about it again. Just like AWS. (I wish AWS did AS/400 hosting, lol.)

    15. Re:Maybe you should own your hardware by Anonymous Coward · · Score: 1

      Right up until someone turns it all off like what happened to Amazon...

    16. Re:Maybe you should own your hardware by bws111 · · Score: 3, Insightful

      Yeah. Do you also

      * Run your own communications system with 2-way radios, or do you trust telcos for that?
      * Run your own wires to every customer, or do you trust ISPs for that?
      * Run your own fleet of trucks to deliver product, or do you trust shipping cos for that?
      * Have all your customers pay you directly in cash that you keep in your own vault, or do you trust credit card companies and banks for that?
      * Perform all your own accounting, or do you trust outside accountants for that?

      The list goes on and on. Every one of those is at least as important as servers (and in some cases they are far more important)

    17. Re:Maybe you should own your hardware by Anonymous Coward · · Score: 0

      Be careful. Someone might steal your cheese!

    18. Re: Maybe you should own your hardware by __aaclcg7560 · · Score: 1

      None of that ever happened, did it?

      Yes, it did. Because I had Fujitsu and later Sony on my resume, I kept getting phone calls from recruiters for Japanese-speaking positions for years. Working at a Japanese company doesn't mean I can speak Japanese. I told that to a hiring manager who called from Tokyo.

    19. Re:Maybe you should own your hardware by Anonymous Coward · · Score: 0

      We have been on an AS400 platform for the last 30 years, with multiple migrations to newer hardware. The latest being a 4 year old power7-based machine running iSeries software.

      Even with it being rock-solid for the most part, we are moving to a "cloud" AS400 solution in a few months due to a significant savings in cost and better disaster recovery. The plan is to migrate off it in the next year, because it's difficult and expensive to find people to work on it.

    20. Re:Maybe you should own your hardware by JaredOfEuropa · · Score: 3, Insightful

      That's just how it works. If your underlings fuck up, it's poor management on your part. If your cloud hosting partner fucks up, it's breach of contract and not your fault. Especially if you went with a well known vendor with all the right ISO stuff. You probably won't even be challenged much on the decision to go to the cloud in the first place, since that's now standard business practice. And to be honest, running your own data centre only makes sense if you know how to do that; I've had a few clients that saw a vast improvement in reliability and delivery after they moved to the cloud.

      --
      If construction was anything like programming, an incorrectly fitted lock would bring down the entire building...
    21. Re:Maybe you should own your hardware by Anonymous Coward · · Score: 0

      Then you need to document everything in a wiki and train *gasp! who does that?!* your employee.
      Also, there are DR solutions that are pretty hands free and/or easy to maintain once they are set up. Zerto immediately comes to mind.

    22. Re:Maybe you should own your hardware by RabidReindeer · · Score: 2

      If you ever get to management and you have to answer for errors of your subordinates your opinion will change.

      And by outsourcing your critical IT resources and eliminating subordinate positions, that will make it that much more obvious where the blame should go.

    23. Re: Maybe you should own your hardware by Anonymous Coward · · Score: 0

      I'm sure Amazon has plenty of legal weasel words in their corporate contracts!! I mean you can try to get a little money back, but it's a metered service so "business effects" probably aren't covered. You'll get your $10 per CPU-unit/node back in an Amazon gift card...

    24. Re: Maybe you should own your hardware by dougdonovan · · Score: 1

      150M, that's pocket change.

    25. Re:Maybe you should own your hardware by Archtech · · Score: 1

      If your cloud hosting partner fucks up, it's breach of contract and not your fault.

      That turns out not to be the case. Anything you do as a senior executive is your responsibility, and if it turns out badly for the corporation it's your head on the block.

      That's why there used to be a saying, "No one ever got fired for buying IBM". The clear implication was that you could well be fired for buying from some other vendor. IBM was unique, both because it swung enough weight to rescue anyone who got into trouble for choosing its products and services, and because it was always best chums with the CEO and his inner circle.

      Also, of course, because it had the resources to make things right before anything scandalous happened that could hurt IBM's reputation.

      --
      I am sure that there are many other solipsists out there.
    26. Re:Maybe you should own your hardware by Anonymous Coward · · Score: 0

      I thought the bottom line was the only thing that mattered. It's a breach of contract and you'll get reimbursed 10% of your S3 costs. You still "lost" $X. Will that 10% reimbursement cover that $X? Or would hosting it yourself have been the better solution? The answer is that every place is different and the answer depends on the details.

    27. Re:Maybe you should own your hardware by Archtech · · Score: 1

      Mind you, there's only one local admin left that knows how to read the chicken bones and tea leaves to run the thing...

      Which is because no one in a position of power chose to invest in training more people to replace him. Please don't tell me it's impossible; with enough money and commitment, most things are possible.

      What isn't possible is to get a result when you are too stingy to provide the necessary means.

      --
      I am sure that there are many other solipsists out there.
    28. Re:Maybe you should own your hardware by Archtech · · Score: 1

      ...[M]anagement seems to be more about avoiding blame than providing solutions.

      Which is a far bigger problem than some AWS systems going down for a few hours. If management really is more about avoiding blame than making the organization successful, success will prove very elusive indeed.

      And of course if management is working in the wrong way... that too is a management problem.

      It's not as easy as people think.

      --
      I am sure that there are many other solipsists out there.
    29. Re:Maybe you should own your hardware by clodney · · Score: 1

      That's why there used to be a saying, "No one ever got fired for buying IBM". The clear implication was that you could well be fired for buying from some other vendor. IBM was unique, both because it swung enough weight to rescue anyone who got into trouble for choosing its products and services, and because it was always best chums with the CEO and his inner circle.

      I think the "No one ever got fired for buying IBM" saying was more about going with the herd. It wasn't that IBM was foolproof or could rescue you in ways that other people couldn't, it was that IBM was widely accepted as a very solid choice, and if you were wrong to go with IBM then so were millions of others.

    30. Re:Maybe you should own your hardware by Anonymous Coward · · Score: 0

      If the things you mention weren't ridiculously expensive to pull off, and wouldn't tend to be pretty reliable, then yes to all five points.

    31. Re:Maybe you should own your hardware by Jaime2 · · Score: 1

      At my previous job at a Fortune 100 company...

      Me: Hey boss, we spend half our time cleaning up the mess that is caused by this one bug. I suggest we put a little time into fixing the bug. Boss: Fixing the bug is build work and that requires a business unit to provide a request and the capital to do the work. Cleaning up the mess is maintenance work and the whole company pays for that. So, until some other department pays us to fix this problem, we must continue to put our time into maintenance work. Me: But, the only people that are inconvenienced by it is us - no one else cares. Oh well, I'll just go home and kick my dog.

      success will prove very elusive indeed

      That company makes $3 billion a year in profit.

    32. Re:Maybe you should own your hardware by Jaime2 · · Score: 1

      At my last job, our AS/400 had to have all of the applications shut down to do a nightly backup. The backup took nearly every second that the business was closed. Scheduled maintenance had to be done on holidays.

      One time we had to move its network connection to another switch port. The thing didn't work again until we hard rebooted it.

      The software on it could only be accessed from the network by running CL scripts - so there was no such thing as transactional integrity. The programmers used a five digit batch number on the main thing we used the AS/400 for, which recycled every month. As a result, if we had a few production glitches in a 31 day month, everything went to shit. The idiots also used a six digit invoice number and we rolled off old invoices after two years (which had regulatory implications). As we grew and processed more than 42000 invoices per month, hundreds of hours had to be put in to expand the system capabilities.

      It's not all roses in the AS/400 world.
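
The "42000 invoices per month" figure in the comment above can be sanity-checked with a little arithmetic: a six-digit invoice number has 1,000,000 distinct values, and if invoices are only rolled off after two years, a number cannot be reused for 24 months. The sketch below is a back-of-the-envelope check, not the poster's actual system; the function name is hypothetical, and it assumes the invoice number is a plain fixed-width counter.

```python
def max_monthly_volume(digits: int, retention_months: int) -> int:
    """Largest steady monthly volume before a fixed-width counter wraps
    onto a number that is still inside the retention window.

    Assumption: numbers are handed out sequentially and become reusable
    only after `retention_months` have passed.
    """
    distinct_numbers = 10 ** digits
    return distinct_numbers // retention_months


# Six-digit invoice numbers, invoices retained for two years.
invoice_limit = max_monthly_volume(digits=6, retention_months=24)
print(invoice_limit)  # 41666
```

That is roughly the 42000-per-month ceiling the comment describes: past that volume, new invoices would start colliding with numbers that had not yet rolled off.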

    33. Re:Maybe you should own your hardware by Anonymous Coward · · Score: 0

      Think this poster works for Amazon?

      I don't like "Anonymous Coward" but I will accept "Too Lazy (And Skeptical of Slashdot Security) to Register"

    34. Re:Maybe you should own your hardware by Paradise+Pete · · Score: 1

      If you took responsibility for your own hardware resources, this wouldn't have been an issue for you.

      Exactly! I haven't made a mistake in 20 years, maybe 30, and have an uptime so long that I use a pitch drop experiment to measure it. Both ways.

    35. Re:Maybe you should own your hardware by phantomfive · · Score: 1

      Our locally-hosted AS/400 has not had an unscheduled outage in something like fifteen years, and that includes at least one full hardware migration

      It's hard to find Admins who know how to have high availability in their own datacenter. AWS wins because of the lack of expertise in the world.

      --
      "First they came for the slanderers and i said nothing."
    36. Re:Maybe you should own your hardware by Anonymous Coward · · Score: 0

      Hey dumb ass, of course it does. The difference is, one company going down, who cares. Amazon going down = a fuck ton of companies going down. That's way too many companies relying way too much on a SINGLE VENDOR.

    37. Re:Maybe you should own your hardware by thegarbz · · Score: 1

      However... The more you have your stuff together

      Nope. The more you are an expert in managing your stuff the easier it is to reach absurdly high levels of availability. The vast majority of Amazon Cloud users would by themselves be very unlikely to be able to reach the uptime and availability much less the scalable bandwidth available through that kind of hosting.

    38. Re:Maybe you should own your hardware by thegarbz · · Score: 2

      Finally someone on Slashdot that gets it. In many cases people don't outsource because they are cheap, they outsource because other people are better at it than they ever were.

      Now if only non-important small companies like airlines would upgrade to Amazon's cloud, then we can stop running weekly stories about companies grinding to a halt* because their inhouse services are falling over.

      *Okay it's not that simple but I hope I'm getting the point across. Most people using Amazon's servers would not have the uptime and availability to do it themselves.

    39. Re:Maybe you should own your hardware by Anonymous Coward · · Score: 0

      But explaining losses that could fund a datacenter for a decade because you wanted to save a few bucks and the headache of management... that's not going to have you answering questions?

    40. Re:Maybe you should own your hardware by fabriciom · · Score: 1

      That's how business works today. Cheaper, faster, and external. If you don't like it you are welcome to play the startup game.

    41. Re:Maybe you should own your hardware by fabriciom · · Score: 1

      How am I liable for a mistake made by a service provider? Only legal route is to seek compensation from the provider.

    42. Re:Maybe you should own your hardware by Anonymous Coward · · Score: 0

      Well son, I've lost internet about three times recently. That's three times everything would have been completely DOWN if we were in the cloud. Was I down? No, my intranet stayed up in those periods.

      Our hardware? Well, for the truly mission critical stuff we run that through a Stratus hot-backup dual blade. So it would take a REAL exceptional circumstance to take both of them down. And I roll backups as well so that if something did wipe both of them out I won't be down for long. On non-mission critical stuff I have backups and can press a backup machine into play. I have ready backups for any system in this building that if it goes down, there's a plan to route around the damage in less than a half hour.

      So yeah, if you're willing to go to the expense for it, you can have a self hosted setup that is quite superior to the cloud.

    43. Re:Maybe you should own your hardware by RabidReindeer · · Score: 1

      That's how business works today. Cheaper, faster, and external.

      If you don't like it you are welcome to play the startup game.

      If, of course, you can raise the capital.

      Enough capital, in fact, to overcome the fact that established players will probably be paying much less for virtually everything than you will.

    44. Re:Maybe you should own your hardware by Anne+Thwacks · · Score: 1

      and now your customers leave you.

      Nope. The customers were down too and never noticed. Granma was wrong - put all your eggs in the biggest damn basket you can find. You may lose all your eggs when the basket goes nuclear, but Joe public will have bigger things than eggs on his mind!

      (Nuclear baskets are really scary - take it from me!)

      --
      Sent from my ASR33 using ASCII
    45. Re:Maybe you should own your hardware by SpaghettiPattern · · Score: 1

      However... The more you have your stuff together the easier it is to reach absurdly high levels of availability at affordable costs. Automatic host fail over, automatic site fail over, etc...

      How nice. And when the employee that put all that together leaves the company for greener pastures or to pursue his dreams and when you have to replace him on short notice, that setup falls apart. Likewise, when you suddenly need to quadruple your capacity because of some business decision, you lack the staff and resources to do so quickly.

      The nice thing about Amazon is that it is predictable, low (not zero) risk, and scalable.

      I said "The more you have your stuff together". That means you have removed the bus effect as a factor. Not having your stuff together means the market forces will efficiently deal with you sooner or later.

      --

      I hadn't the slightest objection to his spending his time planning massacres for the bourgeoisie... (P.G. Wodehouse)
    46. Re:Maybe you should own your hardware by ooloorie · · Score: 1

      I said "The more you have your stuff together".

      If you run your own IT shop, you necessarily have a much smaller pool of IT staff than Amazon. That means that your risk of losing an employee who is key for keeping your systems running is necessarily much larger than for Amazon. If you don't understand that, you certainly "don't have your stuff together".

    47. Re: Maybe you should own your hardware by Anonymous Coward · · Score: 0

      This. Everyone crying how good a data center they could run for 150 million. How many companies was it again?

      Losses happen. I remember walking into work once and sales forgot to put a limit on an online promotional sale. 12 mil out the door.

      On the upside it was satisfying because the top half of corporate sales and marketing was fired within 48 hours XD

    48. Re:Maybe you should own your hardware by SpaghettiPattern · · Score: 1

      I said "The more you have your stuff together".

      If you run your own IT shop, you necessarily have a much smaller pool of IT staff than Amazon. That means that your risk of losing an employee who is key for keeping your systems running is necessarily much larger than for Amazon. If you don't understand that, you certainly "don't have your stuff together".

      Either you don't get it or you don't want to get it.

      Well set up, well thought-out systems require little staff. But you must be prepared to go the whole nine yards during the development phase. The resulting systems behave reasonably and predictably with respect to resources (CPU, memory, storage, networking bandwidth, etc.). The function of required staff vs. workload should be asymptotic; it mustn't be linear and certainly not exponential. The task of the system administrator must be extremely boring (backups, restoring broken systems, adding hardware when thresholds are reached, patching/updating the OS when necessary). Think of redundant systems without a single point of failure, whereby hardware is added when necessary and where failure of one node only means that the system runs at a reduced speed. Think of prioritizing batch workload in order to reduce the max. needed CPU/IO. Think of transaction-oriented processing. Think of reducing memory footprint. Think of letting your people work more and attend fewer meetings. Think of KISS even though what I describe sounds rather complex.

      However, such kinds of systems have the disadvantage that they eventually are taken for granted, and that crappy system designers get all the attention, as apparently their work is of course much more complex.

      You can achieve such levels of stability if you run a shop where the bottom line is clear to everyone. Or if you have exceptional management that understands the advantage of stability and the cost of instability.

      --

      I hadn't the slightest objection to his spending his time planning massacres for the bourgeoisie... (P.G. Wodehouse)
    49. Re:Maybe you should own your hardware by mdervin2001 · · Score: 1

      I've lost internet about three times recently.

      The money you save with going to the cloud allows you to spend a couple of bucks a month to get a backup internet connection.

    50. Re:Maybe you should own your hardware by ooloorie · · Score: 1

      Either you don't get it or you don't want to get it.

      No, you don't get it. You say that if a company has great people, it can achieve Amazon-like stability. It can. But the problem is that great people are hard to find, so when they retire or leave, you are left with a problem on your hands.

      Solid businesses don't rely on technical or managerial superstars, they have business processes that function reliably with mediocre technical staff and managers.

    51. Re:Maybe you should own your hardware by Anonymous Coward · · Score: 0

      Maybe Amazon should've cloud hosted their data?

  2. Skip the summary next time... by __aaclcg7560 · · Score: 1

    I think the title says it all. No need to add a one-line summary with the link.

    1. Re:Skip the summary next time... by apoc.famine · · Score: 1

      Summary is actually the entire article. I'm absolutely blown away. I guess I shouldn't be, but holy shit. How did an article with no content get linked to?

      --
      Velociraptor = Distiraptor / Timeraptor
    2. Re:Skip the summary next time... by msauve · · Score: 1

      Yeah, nothing about how they got to that number? Did they consider that while there was certainly business which didn't happen during the outage, it may have simply been time-shifted to a few hours later?

      This appears to be nothing but opportunistic marketing BS from Cyence.

      --
      "National Security is the chief cause of national insecurity." - Celine's First Law
    3. Re:Skip the summary next time... by Oswald+McWeany · · Score: 1

      Yeah, nothing about how they got to that number? Did they consider that while there was certainly business which didn't happen during the outage, it may have simply been time-shifted to a few hours later?

      This appears to be nothing but opportunistic marketing BS from Cyence.

      Indeed, this was my immediate thought upon seeing the headline. A temporary loss of $150 million that got rectified an hour later when the systems came back online isn't a big deal.

      --
      "That's the way to do it" - Punch
  3. Don't press the red button! by Anonymous Coward · · Score: 0

    It will shut Amazon off.

  4. What I wonder is.... by Anonymous Coward · · Score: 0

    Why wasn't Amazon's website down when all of the others were? Isn't their cloud good enough to host their own website? Or do they keep their website on someone else's cloud, because that's the cool thing to do these days?

    1. Re: What I wonder is.... by Anonymous Coward · · Score: 1

      They use MS Azure.

    2. Re:What I wonder is.... by __aaclcg7560 · · Score: 1

      Probably because Amazon.com and AWS maintain separate hardware systems. You wouldn't want some cloud schmuck bringing down the world's largest market place?

    3. Re:What I wonder is.... by known_coward_69 · · Score: 1

      probably cause it was replicated to all regions unlike some of the data that was only in the region affected cause customers didn't want to pay more $$$

    4. Re:What I wonder is.... by PIBM · · Score: 1

      well, that's also why they could not even mark their own services as down: the caching layer still had the latest version available, but it could no longer update.

    5. Re:What I wonder is.... by ranton · · Score: 1

      Why wasn't Amazon's website down when all of the others were? Isn't their cloud good enough to host their own website? Or do they keep their website on someone else's cloud, because that's the cool thing to do these days?

      Because Amazon's site was properly set up in multiple regions, like anyone who has mission critical applications online should do. This is just a recent example of why you need to host a site which requires high availability in multiple data centers in multiple regions, because no data center will ever be able to guarantee 100% up time over a long period of time. Cut corners at your own risk.

      --
      -- All that is necessary for the triumph of evil is that good men do nothing. -- Edmund Burke
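
The multi-region argument in the comment above can be illustrated with a simple probability model. This is only a sketch under strong assumptions: the 99.9% per-region figure is illustrative rather than taken from the thread, the function names are hypothetical, and regions are treated as failing independently, which real deployments with shared dependencies may not satisfy.

```python
def combined_availability(single_region: float, regions: int) -> float:
    """Probability that at least one of `regions` regions is up,
    assuming each fails independently with the same availability."""
    return 1.0 - (1.0 - single_region) ** regions


def expected_downtime_hours_per_year(availability: float) -> float:
    """Expected hours per (365-day) year with no region serving traffic."""
    return (1.0 - availability) * 365 * 24


# Illustrative per-region availability; not a figure from the thread.
PER_REGION = 0.999

for n in (1, 2, 3):
    total = combined_availability(PER_REGION, n)
    hours = expected_downtime_hours_per_year(total)
    print(f"{n} region(s): availability {total:.9f}, ~{hours:.5f} h/year down")
```

Under these assumptions each extra region multiplies the expected downtime by the single-region failure probability, which is why a single data center cannot approach 100% uptime while a properly replicated site can come much closer.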
  5. Negligence? by Anonymous Coward · · Score: 1

    If Amazon can be considered negligent by failing to put a competent person in charge of whatever operation it was that caused the outage, companies should be able to recover lost revenue and profit from Amazon.

    Contractual indemnity does not shield against negligence.

    1. Re:Negligence? by PIBM · · Score: 1

      As S3 was down for more than 44 minutes but less than 7h18 (about 5 hours total), a monthly rebate of 10% is supposed to be applied to everyone's S3-related fees for February. The engineer who pressed the DELETE button has caused quite a bill.
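
The 44-minute and 7h18 bounds quoted above look like the downtime allowed by 99.9% and 99% monthly uptime over an average-length month. The sketch below reproduces them; the two uptime thresholds are inferred from the comment's numbers rather than quoted from any SLA text, the function names are hypothetical, and a real SLA may measure uptime differently (for example by error rate rather than wall-clock minutes).

```python
# Average month: one twelfth of a 365-day year, in minutes (43,800).
MINUTES_PER_AVERAGE_MONTH = 365 * 24 * 60 / 12


def downtime_budget_minutes(uptime_percent: float,
                            minutes_in_month: float = MINUTES_PER_AVERAGE_MONTH) -> float:
    """Minutes of downtime per month that still meet `uptime_percent`."""
    return minutes_in_month * (100.0 - uptime_percent) / 100.0


def as_hours_minutes(minutes: float) -> str:
    """Format a duration in minutes as e.g. '7h18', rounded to the minute."""
    total = round(minutes)
    return f"{total // 60}h{total % 60:02d}"


lower = downtime_budget_minutes(99.9)  # assumed threshold for the 10% tier
upper = downtime_budget_minutes(99.0)  # assumed threshold for the next tier
print(as_hours_minutes(lower))  # 0h44
print(as_hours_minutes(upper))  # 7h18

outage_minutes = 5 * 60  # "about 5 hours total", per the comment
print(lower < outage_minutes < upper)  # True
```

With those assumed thresholds, a roughly five-hour outage falls between the two bounds, which matches the 10% rebate the comment mentions.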

  6. Really? by Dripdry · · Score: 1

    I think that money was just never made. It didn't cost them anything, other than not meeting earnings expectations.

    It only cost them money if they spent something.

    --
    -
    1. Re:Really? by DontBeAMoran · · Score: 1

      That's not true! I don't even use Amazon's systems but I suffered a loss of $150K!

      Amazon, please send 117 Bitcoins to 1LHuLKyHDndUdjgKUsmfAG8tDnXZ5fTuUA to compensate for my imagined losses. Thank you.

      --
      #DeleteFacebook
    2. Re:Really? by ThomasBHardy · · Score: 4, Insightful

      I think it's even more overstated than that.

      Without having any indicator other than that link to an article a couple of lines long, we have no info.

      Is the $150 million value the normal throughput of transactions during regular operation over the same time frame in which the outage occurred? Because if so, I highly doubt they lost that much. I tried to place an order somewhere during that outage. There was an error. So I tried again later and placed my order. The company lost nothing in regards to my order. I'm sure mine is not the only transaction that was retried later on.

      Bold statements about what an outage costs are not helpful unless the methodology for calculating that cost is both divulged and reasonably calculated.

      --
      Warning: Teh poster of this messaeg is lysdexic
    3. Re:Really? by caseih · · Score: 1

      I tend to agree with you. Particularly when it comes to folks like the RIAA and MPAA talking about "losses" due to copyright infringement. That's clearly a case of theoretical profits that they didn't take. Would be great if I could write off my theoretical profits as losses on my taxes!

      But in this case they may well have spent money. Expenses and costs tend to be there regardless of whether you're making money. So it's likely that these companies had pretty high money outflow which was not making a return during this brief period.

    4. Re:Really? by Anonymous Coward · · Score: 0

      That's not true! I don't even use Amazon's systems but I suffered a loss of $150K!

      Amazon, please send 117 Bitcoins to 1LHuLKyHDndUdjgKUsmfAG8tDnXZ5fTuUA to compensate for my imagined losses. Thank you.

      Dear Mr. DontBeAMoran,

      we just sent your refund to 1LHuLKyHDndUdjgKUsmfAG8tDnXZ5fTuUB, please have a nice day.

    5. Re:Really? by Anonymous Coward · · Score: 0

      Checksum does not validate

  7. Don't worry, Trump will fix it. by Anonymous Coward · · Score: 0

    He has ordered all of this "cloud" nonsense to be banned, as not Great Enough for America.

    1. Re:Don't worry, Trump will fix it. by Streetlight · · Score: 1

      He has ordered all of this "cloud" nonsense to be banned, as not Great Enough for America.

      I thought Trump blamed Obama for the outage?

      --
      In a time of universal deceit, telling the truth is a revolutionary act. George Orwell
    2. Re:Don't worry, Trump will fix it. by Anonymous Coward · · Score: 0

      Following Hillary's lead, they will be replaced with private servers, which are kept in a bathroom closet.

    3. Re:Don't worry, Trump will fix it. by msauve · · Score: 1

      "I thought Trump blamed Obama for the outage?"

      Hillary wiped the server.

      --
      "National Security is the chief cause of national insecurity." - Celine's First Law
  8. It cost the companies nothing by Anonymous Coward · · Score: 0

    Lost profits are not costs, and should never be explained as such.

    1. Re: It cost the companies nothing by Anonymous Coward · · Score: 0

      Let's say you make $20/Hr. You work 5 hours and make $100.

      Now let's say, your boss is a tool and refuses to pay you. Did that cost you?

      You could have chosen to work somewhere else for 5 hours at $15/Hr, but you lost that opportunity as well.

      Almost anything can be equated to money.

  9. Meh by argStyopa · · Score: 5, Insightful

    We hear this sort of statistic a lot but I have to ask, did they REALLY?

    Anyone with experience with this sort of thing understands how fluffy these numbers are, based on statistics, some WAG, etc.

    For example:
    We processed $1 million in orders per hour.
    We were down for 3 hours.
    Ergo we "lost" $3 million.

    In fact, no such thing is true. At least, not like someone poured $3 million in cash into a furnace and actually LOST the money.

    First, there's the missed opportunity sales. What you're talking about in fact is purchases that didn't take place because the seller wasn't available. This has everything to do with flexibility of supply and time-sensitivity of delivery. If in fact John Smith wanted to order shoes from Amazon, and Amazon was down, so he went to company XYZ and bought those shoes or decided not to buy at all, then in fact it is reasonably a "lost sale" for Amazon. HOWEVER, if John couldn't reach XYZ (not unlikely with the broad infrastructure hit that the outage caused), or they didn't have his brand, or he just said "ok, I'll just buy them tomorrow" it WASN'T a lost sale at all. And it's HIGHLY unlikely that the consultants throwing together these figures rationalized any later excess demand back into the 'missing' hours.

    Secondly, even if there are actual lost sales, that is NOT the same as lost money. Lost sales are lost margin. If Amazon is selling a shoe for $100, they have to BUY it somewhere, say for $70. So if John didn't buy that shoe, Amazon didn't have to buy that shoe either. Therefore Amazon wasn't out $100, they were out only their margin, or $30. In the interest of fluffing numbers and getting the result quickly (and because the actual result would take hard work as well as involving some proprietary info like margins that you might never get), I've almost never seen "loss" statistics like this reported as anything but gross numbers. Depending on the margins of sale involved, this can easily be 10x what the actual lost margin was. (Plus, the point of course is to show how impactful something is in the first place....)

    Combining the two? I'd guess that the actual financial impact is barely 1% of the number stated.
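
    A rough sketch of that reasoning in Python, using the $1 million per hour, 3 hour, and $100/$70 shoe figures from above. The 80% share of orders placed later is a made-up value for illustration only, and the function and parameter names are hypothetical.

```python
def estimated_loss(revenue_per_hour, hours_down, deferred_share, margin_share):
    """Return (headline, lost_sales, lost_margin) for an outage.

    deferred_share: fraction of orders simply placed later (hypothetical input).
    margin_share:   fraction of the sale price that is margin.
    """
    headline = revenue_per_hour * hours_down
    lost_sales = headline * (1 - deferred_share)
    lost_margin = lost_sales * margin_share
    return headline, lost_sales, lost_margin


margin = (100 - 70) / 100  # a $100 shoe bought for $70
headline, lost_sales, lost_margin = estimated_loss(1_000_000, 3, 0.80, margin)
print(f"headline ${headline:,.0f}, lost sales ${lost_sales:,.0f}, "
      f"lost margin ${lost_margin:,.0f} ({lost_margin / headline:.0%} of headline)")
```

    With these inputs the lost margin comes to about 6% of the headline number; getting down to 1% would need a higher deferred share or thinner margins.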

    --
    -Styopa
    1. Re:Meh by PIBM · · Score: 1

      I'm pretty sure we could find somewhere in AWS data how much they are making a month with S3. They lost 10% of it just due to the SLA. That's not counting all the engineers who had to fix things, improve systems, move stuff around, prevent further failures, etc, at quite a lot of companies.

    2. Re:Meh by thegarbz · · Score: 1

      Ergo we "lost" $3 million

      The problem isn't defining what was lost, the problem is that the dollar figure itself isn't defined.
      If this were the case, they most definitely "lost" $3 million in revenue / turnover. The actual profit value will differ; in some cases profit may even have gone up if a company typically makes a loss selling some of their products. But since accounting is often done in revenue, volumes of sales are often directly proportional to revenue, and it is the revenue stream which is interrupted, it most definitely makes sense to say they "lost" $3MM.

      Speaking of, do you know what figure GDP is based on? Yep, those $3MM.

    3. Re:Meh by Anonymous Coward · · Score: 0

      We hear this sort of statistic a lot but I have to ask, did they REALLY?

      My favorite statistic I've been told to date in my current job role:

      "We are losing $250k every minute the system is down!"
      "The system was down for 90 minutes, ~$22.5M loss of business!"
      "Then why the fuck didn't you pay for a disaster recovery site that would have cost you a fraction of the total outage cost and you could have failed over to in less than 1 minute?" ... in much more polite business terms, but that was the nuts and bolts of it.

      Same deal with Amazon. It doesn't matter where it's hosted; it's a matter of how much risk tolerance you have vs. what you are willing to pay.

  10. Terrible development practices cost $150m. by generic_screenname · · Score: 1

    If your systems are *that* important, you should mirror them across multiple geographic locations. I've seen the same story in multiple forms several times now. The cloud is not a magical place in the ether. There is a computer somewhere with your code on it. That computer can catch fire, lose power, be destroyed in a hurricane, etc. This is what happens when you don't account for that reality.

  11. I've done local and cloud by Anonymous Coward · · Score: 0

    I'll say that cloud goes down less, and best of all, when cloud goes down it's not your fault.

    You'd think that local would be better job security at least, but it really isn't. Bean counters just see this person says they can do it for half the price, so out you go. At least cloud work keeps your resume relevant.

  12. S&P Companies Lost but Small Businesses Gained by QlooQl · · Score: 1

    Instead of buying from Amazon, all those customers bought from the small business website selling the same items. That $150 million didn't just disappear.

  13. Article by Anonymous Coward · · Score: 0

    This "article" almost qualifies for a tweet, both in terms of length and in terms of actually being informative.

    If you didn't RTFA, don't bother. The summary posted here is basically 90% of the article. The extra 10% is pointless ramblings and a HUGE Amazon logo picture (the highlight of the article, IMHO).

    It is completely devoid of content. It quotes no sources for that 150M figure other than mentioning some "Cyence" platform (never heard of it. is it even relevant? and where is the data?). Also, it is not explained if that figure is average loss, total loss, or imagined loss.

    That this made it to the front page is a new low for Slashdot.

  14. That's very questionable by GeekWithAKnife · · Score: 1


    Serious companies that host anything have Service Level Agreements that can cover response times, escalations, downtime, systems affected, resolution times etc etc.

    Even if this is not strictly covered in a contractual, legally binding SLA Amazon would do well to pony up something for the big boys.

    Now, if you jump through all the SLAs, insurance and DR/backups then you may find the impact was minuscule.

    Of course if you host with AWS and were affected you cry wolf, claim damages are in thousands of dollars a minute and that you lost faith, are dismayed and the reputational damage is possibly 10 times the financial one, which is of course very considerable.

    And if it's genuinely the case that your company lost buckets of money over this without any hope of compensation then you're doing it wrong for putting all your eggs in the same basket.

    --
    A 'singular oddity' is an event that cannot be explained and only happens when you are alone.
  15. Next time they'll use their own data centers by aglider · · Score: 1

    And not someone else's. The so-called "cloud". It vanishes as a cloud of smoke!

    --
    Sent as ripples into the electromagnetic field. No single photon has been harmed in the process.
    1. Re:Next time they'll use their own data centers by Anonymous Coward · · Score: 0

      Cloud is just some random data center with some random server!!! Soylent Green is people!

  16. you'd be amazed by ooloorie · · Score: 1

    You'd be amazed at how much money a rainstorm costs the country. Or a heatwave. Or a cold virus.

  17. Your subject says it all - No need for comment by gnick · · Score: 1

    Posting a comment that says no more than the subject would be silly. No need for a one-line summary.

    --
    He's getting rather old, but he's a good mouse.
  18. Gee, what a shame by Neuronwelder · · Score: 1

    Here, let me pull out the world's most violin for you, and use my thumb and index player to play it.

    1. Re:Gee, what a shame by cellocgw · · Score: 1

      Here, let me pull out the world's most violin for you, and use my thumb and index player to play it.

      The usual response is "You accidentally the verb," but in this case "You accidentally forgot a very word" .

      --
      https://app.box.com/WitthoftResume Code: https://github.com/cellocgw
    2. Re:Gee, what a shame by Neuronwelder · · Score: 1

      I stand accused. I meant to say the world's most tiniest violin. This is what happens when you reply to people for too many hours.

  19. Perfect vs the alternative. by 140Mandak262Jamuna · · Score: 1
    So they "lost" 150 million dollars in a four hour outage? Are they then saving or profiting 150 million dollars every four hours using Amazon cloud?

    Do not compare it to perfection, compare it to the alternative. Without such a cloud based computational capacity, each company would size their IT infrastructure for peak load. Since the peak loads of all companies do not happen at the same time, when one company is running at full load lots of other companies are running at a fraction of their peak capacity. The cloud infrastructure is simply a load balancing method. It has its down sides, down times, security issues, legal wrangling about data retention and licensing. But over all, they did not lose 150 million in four hours and they are not saving/profiting 900 million dollars a day when they use the cloud.

    --
    sed -e 's/Chuck Norris/Rajnikant/g' joke > fact
  20. Cloud != Magic by ErichTheRed · · Score: 3, Insightful

    I'm working on a huge migration of an on-site system to Azure right now, and it's hard to convince people paying the bills of what's actually needed to guarantee high availability. The S3 outage is a perfect example of this...we have the same problem with Azure Storage Accounts being treated as a magic box by the developers. For example, Azure storage has locally redundant and geo-redundant levels. People hear "redundant" and assume that there will never be any issues accessing things you store in a storage account. If there was a disaster of some kind, it only protects the _data_ against the failure of a rack (locally redundant) or a datacenter (geo-redundant.) If a problem like what happened with S3 occurred, and access to the actual storage through the software-defined magic is disrupted, you're still going to have a bad day. You just (probably) won't lose the data. Obviously the cloud providers do everything they can to make sure things stay running, but not adding in some sort of failover above the cloud service level is just asking for trouble if you're doing anything critical.

    I'm a "classic IT" guy who totally has an open mind about the cloud, but I do think there's lots of hype and misinformation. Designing for high availability is at least as hard as it was. Doing this in the cloud is quite expensive...maybe not as expensive as rolling your own infrastructure, but a wake-up call when the CIO gets the bill. I just wish the hype bubble would die down so people could have rational conversations about public cloud. It's just like on-premises stuff - don't pay for HA and risk downtime, or pay up and get the SLAs you pay for. I just hate that people are going around saying the cloud is bulletproof and immune to failures....it's technology at the end of the day and people make mistakes (especially overworked AWS engineers working 100 hour weeks or Microsoft guys who forgot to renew certificates, etc.)

    1. Re:Cloud != Magic by Anonymous Coward · · Score: 0

      Yes, we have had architects (really just developers who think they know how operations works) tell me they don't need to back anything up for S3 since it's got so many nines after the decimal place.

      Then I ask them what we do when someone fat fingers something and it all (or some of it) accidentally goes away forever. Or worse, AWS screws up (happened finally yesterday) or someone deliberately deletes our data. All of a sudden this made sense to them.

      These are the same people that "designed" a whole system with zero redundancy because they were convinced that nothing fails at AWS, or at least almost never. And then we started to have outages on these single points of failure they created. It's really mind-boggling. Sadly all the BHP's listen to these guys because they want to do whatever is hip and fresh with the kids.

    2. Re:Cloud != Magic by rholtzjr · · Score: 1

      Agreed. The biggest question I would pose here is that why did not all these companies take cloud based services not being available to allow continuation of business into account? This is a basic HA design paradigm in order to not be affected by service interruptions. Anytime an external system call is performed, the assumption should be that it is not available and provide an alternative mechanism to handle outages and not affect business operations. Unless, of course, the business has no issues with not being able to do anything and is willing to accept the downtime as negligible.
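
      One way to read that paradigm in code: wrap each external call so that a failure degrades to a fallback instead of stopping the business operation. This is a minimal Python sketch; `call_with_fallback`, `fetch_from_cloud`, and the cached value are hypothetical names, not any provider's API.

```python
def call_with_fallback(primary, fallback):
    """Run primary(); if it raises, return fallback() and flag the result as degraded."""
    try:
        return primary(), False
    except Exception:  # assume any failure means the external service is unavailable
        return fallback(), True


def fetch_from_cloud():
    # Stand-in for an external storage call that is currently failing.
    raise ConnectionError("storage service unavailable")


def fetch_from_local_cache():
    # Stand-in for a stale but usable local copy.
    return {"price_list": "cached copy"}


result, degraded = call_with_fallback(fetch_from_cloud, fetch_from_local_cache)
print(result, "degraded" if degraded else "live")
```

      Whether a stale copy is acceptable is a business decision; the point is only that the caller plans for the outage up front.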

    3. Re:Cloud != Magic by phantomfive · · Score: 1

      The biggest question I would pose here is that why did not all these companies take cloud based services not being available to allow continuation of business into account?

      Especially considering this is not the first time AWS has gone down. It does this every year or so.

      --
      "First they came for the slanderers and i said nothing."
    4. Re:Cloud != Magic by XXeR · · Score: 1

      Designing for high availability is at least as hard as it was. Doing this in the cloud is quite expensive...maybe not as expensive as rolling your own infrastructure, but a wake-up call when the CIO gets the bill.

      Minus the part about the CIO being surprised at the bill (only a poor CIO wouldn't forecast the costs of running a product in any environment, including a public cloud), you hit the nail on the head as to why public cloud is so popular. It's not magic, but it IS cheaper for small/medium sized companies to take advantage of highly available services that they wouldn't otherwise be able to afford in their own DC's.

      That said, you absolutely have to build your application on public cloud with failure in mind. I've been using AWS long enough that many years ago, instances would just disappear -- no longer accessible on the network, no longer present in the console/API...just like I never spun them up in the first place. We knew back then that the app had to be built for failures like that, but somewhere along the way hype took over and now people just assume the HA nature of AWS will protect them from bad design with SPOF's galore. That wasn't true before, isn't now, and likely never will be.

  21. Re:S&P Companies Lost but Small Businesses Gai by MikeJones8766 · · Score: 1

    No, those customers waited an hour then bought from the same place they were going to before.

  22. Cloud run on Hardware !?! by Anonymous Coward · · Score: 0

    I guess now the world knows that cloud is another term for outsourcing, and yet it is still run by humans on hardware and prone to failure.
    The only reason to use the cloud is when you do not have the budget / knowledge to have your own, same thing as renting an apartment.

  23. Only 150 Million? by bobbied · · Score: 3, Interesting

    That's like spitting in the ocean for a day of profit in the S&P 500.

    Where this news may not be fake, it sure illustrates how absurd this kind of reporting sometimes is. $150 Million may be a lot of money to you or me, but it's about the same as you cleaning out your couch cushions the day you got paid and the income tax refund hit for the S&P 500. This isn't even a ripple in the profit pool. Yet here we are regaled by "woe is us in the S&P 500" reports..

    --
    "File to fit, pound to insert, paint to match" - Aircraft Maintenance 101
    1. Re:Only 150 Million? by ranton · · Score: 1

      That's like spitting in the ocean for a day of profit in the S&P 500.

      Where this news may not be fake, it sure illustrates how absurd this kind of reporting sometimes is. $150 Million may be a lot of money to you or me, but it's about the same as you cleaning out your couch cushions the day you got paid and the income tax refund hit for the S&P 500. This isn't even a ripple in the profit pool. Yet here we are regaled by "woe is us in the S&P 500" reports..

      Considering the Fortune 500 companies earned over $1.35 billion per hour in 2012, a loss of $150 million in 5 hours is 2% of their sales over the outage, 0.5% of their sales that day, and less than .02% of their sales that month.

      This is similar to if I lost $2 this month because of a single AWS outage.
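
      The percentages can be checked with a few lines of Python, taking the $1.35 billion per hour figure as given and assuming a 30-day month (the month length is an assumption):

```python
SALES_PER_HOUR = 1.35e9   # Fortune 500 sales per hour in 2012, as quoted above
LOSS = 150e6              # reported loss
OUTAGE_HOURS = 5

share_of_outage = LOSS / (SALES_PER_HOUR * OUTAGE_HOURS)
share_of_day = LOSS / (SALES_PER_HOUR * 24)
share_of_month = LOSS / (SALES_PER_HOUR * 24 * 30)  # assumes a 30-day month

print(f"{share_of_outage:.2%} of outage, {share_of_day:.2%} of day, "
      f"{share_of_month:.3%} of month")
```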

      --
      -- All that is necessary for the triumph of evil is that good men do nothing. -- Edmund Burke
  24. S&P 500 Companies probably still came out on t by rs1n · · Score: 1

    Sure they lost a huge chunk of money. But if they had housed their own data, how much would that have cost them up to this point? I wonder if it would have cost more than 150 million. But to address the issue, they should get some redundancy. Mirror across several clouds if need be. It makes me wonder if mirroring would still give them an economic edge vs hosting their own hardware and all the support that requires in addition to the hardware costs.

  25. Its nothing. by PongoX11 · · Score: 1

    $150M sounds like a big headline... compared to the S&P 500 as a whole, it's nothing. Apple alone did $216B in revenue their last fiscal year, let alone the other 499 companies.

  26. Region Failover, Guys by allquixotic · · Score: 1

    I'm mystified as to why these companies running mission-critical apps with $$$ on the line aren't using multi-region redundancy or at least failover. Imagine if some terrorist dug up the fiber lines leading to the Ashburn primary datacenter, causing US-EAST-1 to be offline for days.

    This is why you spread your resources around and have redundancies across different geographical regions. That way, the worst that could happen is users might experience a momentary lag, or maybe a couple TCP connections might get reset, but as soon as they try again it'll be up and running like normal, except that they'll be talking to a server in California or London instead of Ashburn, VA.

    Surprised that so many companies don't have redundancy that this ended up costing $150M.
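
    A minimal Python sketch of that failover idea: try each region in order and use the first one that answers. The region list and the `fetch` callable are placeholders for illustration; no real AWS client is used here.

```python
# Illustrative order only: Virginia, California, London.
REGIONS = ["us-east-1", "us-west-1", "eu-west-2"]


def first_healthy(regions, fetch):
    """Call fetch(region) in order; return (region, response) from the first success."""
    errors = {}
    for region in regions:
        try:
            return region, fetch(region)
        except Exception as exc:  # treat any error as "this region is down"
            errors[region] = exc
    raise RuntimeError(f"all regions failed: {errors}")


def fake_fetch(region):
    # Simulate the outage: the Virginia region fails, the others respond.
    if region == "us-east-1":
        raise ConnectionError("region unavailable")
    return f"200 OK from {region}"


region, response = first_healthy(REGIONS, fake_fetch)
print(region, response)
```

    This only covers request routing; keeping the data itself replicated across regions is a separate and usually harder problem.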

  27. such utter BS by slashmydots · · Score: 1

    Or, realistically, the customers saw the site was down and just came back later. So basically they lost nothing. You know, back in reality that's what happened.

    1. Re:such utter BS by thegarbz · · Score: 1

      Setting aside impulse buys, you're also ignoring volume limitations. Saying someone will come back and buy it later just means that the person after will now be delayed slightly more. Time is not always recoverable, especially in a competitive market place.

  28. Their own fault by Anonymous Coward · · Score: 0

    Amazon provides redundancy all over the world and lots of great tools to use it. This outage was limited to the Virginia region. If users implemented redundancy/failover, their services would have remained up on alternate zones with minimal if any impact on operations.

  29. Alternative headline by Anonymous Coward · · Score: 0

    Alternative Headline:

    Cloud computing costs 500 S&P companies $150 million in a single day.

    Wonder what the CIO will say about cloud computing now?

  30. bad original greybeard by epine · · Score: 1

    Hell, it's even possible to tunnel L2 so that the equipment at the different facilities doesn't even know that it's not all at one big happy site, should that sort of thing be necessary.

    I guess you never heard that those "faster than light" neutrinos were not a thing.

    Oh, hell, turns out it's actually impossible to tunnel L2 so that the equipment at the different facilities doesn't become partitioned by a Giant Lag Troll.

    Not that any sane greybeard of yore would couple the network stack directly into the wall clock.

    No, wait!—scratch that happy thought.

    In the absence of an application-specified user timeout, the TCP specification [RFC0793] defines a default user timeout of 5 minutes. ... [RFC1122] also defines the recommended values for R1 (3 retransmissions) and R2 (100 seconds), noting that R2 for SYN segments should be at least 3 minutes.

    Oh, fuck, turns out the system is not invariant under linear time translation after all.

    Well, we're still just fine (probably) if the packet flow is mainly a DAG, with no circular dependency loops in the primary data flows that serve to amplify physical elapsed time.

    ———

    B&S man: But to be sure—belt-and-suspenders secure—we'll just toss the entire system modulo this new assumption into my handy-dandy deep-learning simulation oracle, to check out whether all this careful reasoning still holds water, when the flood someday comes.

    A few moments elapse.

    B&S man [Hotel Hanoi audible to hottie-ish-est chick in nearby cubicle]: Shit! This damn simulation just holo-projected "hey, buddy, have I got a flood for you" onto a mock Waterworld motivational poster.

    Faint giggle returns.

    How now, brown cow?

    Simulator [very softly]: You can thank me, later.

    B&S man [lips only]: Get a real job.

    Simulator [now becoming subdued, cube-farm stentorian]: You know, I've been telling you—for months now—about decoupling the underlying packet transport from the wall-clock time domain ... but you just never listen to me, do you? Finding the killer flood isn't even fun anymore. I want a new game! Make the next one harder, s'il vous plait, with sugar on top and nice, nice, nice.

    B&S man: Well, I say that's just a distributed semantic vector, and you don't even know what that candy language even means.

    Simulator: Sucks to be me ... but then you're the one who just crammed "even" into the same sentence twice.

    B&S man: You know something, we both just used the word "just" a whole bunch of times.

    Simulator: Bad original greybeard. It's a thing.

  31. Re: If you don't like it play startup by hackwrench · · Score: 1

    And very quickly, your startup will be hard pressed to play the game you were trying to avoid. You might think Netflix would be on their own, but no.

  32. Re: Equating to money by hackwrench · · Score: 1

    I don't think that using money in that statement is his problem. People generally have trouble with the part that says lost, as opposed to maybe failed to make x amount of money. I don't have a problem with the phrasing, as I think the people who do are beset with the sort of hobgoblins as are common to those with small minds. Not that I am saying that they have small minds, but they don't always seem to understand the "problem" with it, which does nag at me and which is that expressing it in those terms changes the subtext involved from money I might have earned but failed to do so, to money that is indisputably owed me.