Slashdot Mirror


More Uptime Problems For Amazon Cloud

1sockchuck writes "An Amazon Web Services data center in northern Virginia lost power Friday night during an electrical storm, causing downtime for numerous customers — including Netflix, which uses an architecture designed to route around problems at a single availability zone. The same data center suffered a power outage two weeks ago and had connectivity problems earlier on Friday."

183 comments

  1. Cloud takes down cloud by AlienIntelligence · · Score: 5, Funny

    Nuf said

    --
    For me, it is far better to grasp the Universe as it really is than to persist in delusion
    1. Re:Cloud takes down cloud by Anonymous Coward · · Score: 2, Informative

      Here's what's going on - Amazon's us-east-1 datacenter has been having some issues with its Relational Database Services (RDS), which is the database system holding all of the chumby data.

      What appears to be happening is frequent premature disconnects between the EC2 instances running the web servers and the main database. MySQL has a trigger in it that when too many premature disconnects occur without a successful connection, it assumes it's being hacked and blocks incoming connections from that server until a command is explicitly given to it to clear the error and resume accepting connections.

      During all of the time the system appeared to be down, it really wasn't - the database was actually running and completely operational from a parallel web server hosted under "insignia.chumby.com", which we use to provide a branded experience for Infocast and Insignia TV users. It had just blocked the systems that are used most frequently. All of the web servers, the forum, wiki, content servers were all up and running.

      To compound the problem there was a storm on Friday night that greatly impaired RDS at that datacenter, and as it came back up, it ended up producing the same kind of disconnect errors, and the same trigger happened.

      As of this writing, that issue is still ongoing and the RDS service in us-east-1 is still impaired. Note that several other companies - Pinterest, Heroku, Instagram and others - are being similarly impaired.
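
      The blocking behaviour described above sounds like MySQL's `max_connect_errors` host blocking, which is cleared with `FLUSH HOSTS`; the comment does not name either, so that mapping is an assumption. A minimal sketch of the idea as a toy model (the `HostBlocker` class and the threshold of 3 are hypothetical, and this is not MySQL's actual implementation):

```python
from collections import defaultdict


class HostBlocker:
    """Toy model of per-host blocking after repeated interrupted connects."""

    def __init__(self, max_connect_errors=3):
        self.max_connect_errors = max_connect_errors
        # Consecutive interrupted connection attempts, counted per client host.
        self.errors = defaultdict(int)

    def is_blocked(self, host):
        return self.errors[host] >= self.max_connect_errors

    def connect(self, host, handshake_ok):
        """Return True if the connection is accepted."""
        if self.is_blocked(host):
            return False              # refused until an explicit flush
        if handshake_ok:
            self.errors[host] = 0     # a successful connect resets the count
            return True
        self.errors[host] += 1        # premature disconnect
        return False

    def flush_hosts(self):
        """Explicit operator command that clears every block."""
        self.errors.clear()
```

      In this model a web server that drops three handshakes in a row stays locked out even after the network recovers, while a host that never failed (like the parallel web server in the comment) keeps working, which would match the symptoms described.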

    2. Re:Cloud takes down cloud by joocemann · · Score: 0

      The clouds are just like Android.... not quite ready for a quality showtime....

      Written from an epic4g running android 2.3.... an OS that no dev from the 90s would put over a 0.92 beta.

    3. Re:Cloud takes down cloud by CptNerd · · Score: 1

      Well, they did say there was a lot of "cloud to cloud" lightning in that storm...

      --
      By the taping of my glasses, something geeky this way passes
    4. Re:Cloud takes down cloud by cheater512 · · Score: 2

      And Linux shouldn't ever be used for mission critical applications.

      Posted using the Linux kernel version 2.2.13

  2. Largest non-hurricane related power outage ever by Anonymous Coward · · Score: 5, Informative

    I live in the affected area and that's what they're saying. May take 7 days for the last person to have their power restored.

    1. Re:Largest non-hurricane related power outage ever by jrmcferren · · Score: 5, Interesting

      That really shouldn't matter though as long as the data center's generators are running and they can get fuel. It seems that they are not performing the proper testing and maintenance on their switchgear and generators if they are having this much trouble. The last time the data center in the building where I work went down for a power outage was when we had an arc flash in one of the UPS battery cabinets and they had to shut the data center (and the rest of the building's power for that matter) down.

      --
      sudo mod me up
    2. Re:Largest non-hurricane related power outage ever by John+Bresnahan · · Score: 4, Insightful

      Of course, the network only works if every router in between the data center and the customer has power. In a power outage of this size, it's entirely possible that more than one link is down.

    3. Re:Largest non-hurricane related power outage ever by Anonymous Coward · · Score: 1

      If an individual customer has a power outage AND a failure of their backup power, that's not Amazon's fault. As far as who to blame, I highly doubt that EVERY network provider had both failures in grid power AND failures in their backup systems. So this is likely still Amazon's fault.

      The problem is that a lot of people cheap out on their backup power. Generators and UPSes are expensive. In a generating system, often the most common single point of failure is the automatic transfer switch. It's quite possible that they had a single generator feeding both A and B sides of power in all or part of the data center - the failure of the transfer switch (or the failure of the generator itself coming on) would cause total loss of power after the UPS(es) drained their batteries.

    4. Re:Largest non-hurricane related power outage ever by jrmcferren · · Score: 5, Informative

      The automatic transfer switch(es) would be the first component I would check even without knowing anything. In order to maintain the UL listing on the transfer switch, it must be tested monthly. The idea is, if it is tested monthly, everything is operated and is less likely to seize and fail than if the device is not tested. Modern systems can be designed so that the generators start BEFORE the transfer switch operates when in test mode, to reduce the impact of the test (milliseconds without power versus 30 seconds or so).

      --
      sudo mod me up
    5. Re:Largest non-hurricane related power outage ever by ILongForDarkness · · Score: 2

      I don't know. If the state or even just the city is without power, it is quite possible the ISPs are borked in the area. After all, why bother with too much redundancy? If your customers don't have power for their computers, then they aren't using the internet anyway. Then Amazon plops down a 200M datacentre in town and ... shit happens.

    6. Re:Largest non-hurricane related power outage ever by fuzzyfuzzyfungus · · Score: 3, Interesting

      The problem is that a lot of people cheap out on their backup power. Generators and UPSes are expensive.

      I wonder, in comparing the price/performance numbers on the invoices from Dell and the invoices from APC (hint: one of these has Moore's law at its back, the other... doesn't), what it would take in terms of hardware pricing and software system reliability design to make these backup power systems economically obsolete for most of the 'bulk' data-shoveling and HTTP cruft that keep the tubes humming...

      Obviously, if your software doesn't allow any sort of elegant failover, or you paid a small fortune per core, redundant PSUs, UPSes, generators, and all the rest make perfect sense. If, however, your software can tolerate a hardware failure and the price of silicon and storage is plummeting and the price of electrical gear that is going to spend most of its life generating heat and maintenance bills isn't, it becomes interesting to consider the point at which the 'Eh, fuck it. Move the load to somewhere where the lights are still on until the utility guys figure it out.' theory of backup power becomes viable.

    7. Re:Largest non-hurricane related power outage ever by GPLHost-Thomas · · Score: 2

      Data center redundancy isn't "cheap" to write into complex software. So you have to be risking a lot of money per hour of downtime to invest in that. I don't think that's something a lot of companies can afford, unless they start their software design with this in mind to begin with. So the problem, to me, is that data center redundancy is often an afterthought, and IaaS hardly has easy answers to this problem yet.

    8. Re:Largest non-hurricane related power outage ever by turbidostato · · Score: 1

      "So the problem to me, is that data center redundancy is often an afterthought, and IaaS hardly has easy answers to this problem yet."

      It won't. For a very basic physical reason: it's always cheaper to move data near than far away. If you have a given piece of data in one place, you will either lose it if that place goes nuts or you will need to pay heavily to make sure that piece of data is replicated out of that place fast enough.

      IaaS can help commoditize compute and storage resources, but it has nothing to offer with regard to moving data cheaply from place A to place B, and not all businesses (not even a minority) have the luck of managing mostly low-value (e.g. Google) or read-only (e.g. Netflix) data.

    9. Re:Largest non-hurricane related power outage ever by thePowerOfGrayskull · · Score: 2

      But then the question must be asked...

      [cue Psycho screeching violins]
      How are you posting this now!

    10. Re:Largest non-hurricane related power outage ever by Anonymous Coward · · Score: 0

      Largest ever must be qualified somehow. At the least https://en.wikipedia.org/wiki/North_American_ice_storm_of_1998 was bigger.

    11. Re:Largest non-hurricane related power outage ever by Salgak1 · · Score: 3, Informative

      Well, as of current reports. . . . 2.5 million are without power in Virginia, 800 Thousand in Maryland, 400+ thousand in DC. I've seen numbers in the 3.5 million region between Ohio and New Jersey. We got power back early this morning ~0400, but we STILL don't have phone, net, or cable at home. The real question, since some areas in DC Metro are not supposed to get power back for nearly a week is. . . . do the emergency fuel generators have sufficient fuel bunkers ???

    12. Re:Largest non-hurricane related power outage ever by TubeSteak · · Score: 1

      it becomes interesting to consider the point at which the 'Eh, fuck it. Move the load to somewhere where the lights are still on until the utility guys figure it out.' theory of backup power becomes viable.

      The answer mostly depends on the cost of downtime for you.
      The real problem is getting your (customer) data to the same place as your failover solution.
      Some websites generate enormous amounts of data and it's not trivial or cheap for them to constantly keep it backed up at another data center.
      A station wagon full of hard drives is still faster than any link 99% of us could afford
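
      A rough way to check the station wagon claim is to compare effective throughput. This is a back-of-the-envelope sketch; the drive count, drive size, trip time and link speed are all made-up assumptions, not figures from the thread:

```python
def sneakernet_mbps(drives, tb_per_drive, trip_hours):
    """Effective throughput, in megabits per second, of driving disks somewhere."""
    total_bits = drives * tb_per_drive * 1e12 * 8   # decimal terabytes to bits
    return total_bits / (trip_hours * 3600) / 1e6


def link_transfer_hours(total_tb, link_mbps):
    """Hours needed to push the same amount of data over a network link."""
    total_bits = total_tb * 1e12 * 8
    return total_bits / (link_mbps * 1e6) / 3600


# Assumed scenario: 500 drives of 2 TB each, a 4 hour drive, a 1 Gbit/s link.
wagon = sneakernet_mbps(500, 2, 4)
link = link_transfer_hours(500 * 2, 1000)
print(f"station wagon: {wagon:,.0f} Mbit/s effective")
print(f"1 Gbit/s link: {link:,.0f} hours for the same data")
```

      Under those assumptions the car works out to roughly 556 Gbit/s, while the 1 Gbit/s link needs about 2,222 hours (around three months). The catch is latency: nothing arrives until the whole trip is done, so this suits bulk seeding rather than keeping a failover site continuously in sync.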

      --
      [Fuck Beta]
      o0t!
    13. Re:Largest non-hurricane related power outage ever by halltk1983 · · Score: 1

      Fiber only needs power at each end. There should be at least one long haul fiber to another city in any respectable data center.

      --
      Watch for Penguins, they eat Apples and throw rocks at Windows.
    14. Re:Largest non-hurricane related power outage ever by bhcompy · · Score: 1

      Isn't this the point of routing the way we do? It's self-healing, of a sort, as long as another path exists.

    15. Re:Largest non-hurricane related power outage ever by NJRoadfan · · Score: 2

      I drove through the affected areas today, and there were swaths of I-95 that didn't have any cell phone service. I'd say that's pretty bad considering I still had service during the 2003 blackout. The cloud outage is the least of these folks' worries; 100+ degree (F) weather forecast for the next few days, with no A/C and water conservation measures in some areas, is a concern right now.

    16. Re:Largest non-hurricane related power outage ever by NJRoadfan · · Score: 1

      Natural gas generators will likely be OK. Gasoline may be a problem, however, since stations can't pump fuel. There was power in Fredericksburg, VA, and it seems that the surrounding areas didn't have any power, going by the mobs at the gas stations.

    17. Re:Largest non-hurricane related power outage ever by Salgak1 · · Score: 1

      Not really, Fiber needs boosters every 20 miles or so to deal with signal broadening due to chromatic aberration. Or in the words of Commander Scott: "Ye kenna chainge the laws o'physics !" (grin)

    18. Re:Largest non-hurricane related power outage ever by zippthorne · · Score: 1

      Chromatic aberration? Are you sure you don't mean numerical aperture?

      --
      Can you be Even More Awesome?!
    19. Re:Largest non-hurricane related power outage ever by timeOday · · Score: 1

      I hope an insider will weigh in on this, but I don't think the Internet is all that self-healing at the upper levels, as when dealing with Netflix, Amazon, Google etc. At those levels links are not an abstraction; they are statically routing across specific fiber segments, and there probably isn't enough overcapacity in the infrastructure to simply route around without interruption. Think about when an Interstate is closed through a metropolitan area - yes, you can still get there along side streets eventually, but it's hardly a transparent substitute. Not enough to keep the Netflix Streaming service humming along, for example. I doubt email users noticed anything wrong though :)

    20. Re:Largest non-hurricane related power outage ever by Anonymous Coward · · Score: 0

      The providers for Amazon most certainly would be expected to have backup power at their POPs. Amazon would not be purchasing residential broadband connections for their datacenter; they'd be purchasing optical carriers from the likes of Time Warner Telecom, Global Crossing, Level 3, etc. These links come with contractual SLAs too.

      Why should anyone bother with backup power for their datacenter if the network providers aren't going to be doing the same?

    21. Re:Largest non-hurricane related power outage ever by Reschekle · · Score: 1

      Any mission critical datacenter is going to have refueling contracts with multiple fuel providers to keep their generators fed during an outage.

      The logistics of how to maintain fuel service during an extended crisis is left to the fuel provider, but diesel can be hauled across long distances from a non-affected area if needed and brought to the fuel depot where it is in turn loaded onto a refueling truck that goes around to these datacenters every few days to refuel them.

    22. Re:Largest non-hurricane related power outage ever by Mashiki · · Score: 1

      I'm guessing you're talking about population? I was up in Northern Ontario back during the last major ice storm we had. That hit the area, along with southern and mid-northern Quebec. There were places without power 4 months later. In the dead of winter, let me know how well you're going to survive when it's -38C outside will ya? 7 days is bad, no doubt and I know what you're going through, but try 3 months with no power.

      Damn was it fucking cold. We ended up living with 4 other families in the asshole of nowhere, getting food dropped off by the army along with a ration of fuel oil to run the generator.

      --
      Om, nomnomnom...
    23. Re:Largest non-hurricane related power outage ever by baegucb · · Score: 1

      I asked a maintenance person at work how long we could go in the event of a power outage. I got a blank look like they couldn't fathom the question, and then told we'd go forever. My workplace has 6 generators with 400? gallons of diesel for each one. One generator will handle the current load. It's all tested monthly. (and the odd times city power gets cut)
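
      "Forever" can be sanity-checked with simple arithmetic: stored fuel divided by burn rate. A minimal sketch using the figures from the comment (6 tanks of roughly 400 gallons, one generator carrying the load); the 20 gallons per hour burn rate is purely an assumed number for illustration and will vary a lot with generator size and load:

```python
def runtime_hours(tanks, gallons_per_tank, burn_gal_per_hour, running=1):
    """Hours of runtime if all stored fuel can feed the running generators."""
    total_fuel = tanks * gallons_per_tank
    return total_fuel / (burn_gal_per_hour * running)


# Figures from the comment, plus an ASSUMED burn rate of 20 gal/h.
hours = runtime_hours(tanks=6, gallons_per_tank=400, burn_gal_per_hour=20)
print(f"{hours:.0f} hours, about {hours / 24:.1f} days without refueling")
```

      With that assumed burn rate the site lasts 120 hours, or 5 days, which is a long time but short of a week-long outage unless fuel deliveries keep coming. It also assumes fuel can be moved between tanks, which the comment does not say.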

    24. Re:Largest non-hurricane related power outage ever by JWSmythe · · Score: 1

      I'm surprised no one got the bright idea to get a generator for the gas station.

      I was driving out in the middle of the desert a while back. We stopped for gas (never go below 1/4 tank, unless you like walking for hours). The gas station couldn't pump, because their generator was down. They were so far out in the middle of nowhere, they didn't have power lines run to the station. They sent someone off to get a new generator, but it was something like 4 hours round trip to the nearest store that had one.

      I guess people take things like power for granted, because it seems like it's always there. The magic wires in the walls just work. Well, until they don't, I guess.

      --
      Serious? Seriousness is well above my pay grade.
    25. Re:Largest non-hurricane related power outage ever by Anonymous Coward · · Score: 0

      Aren't you so clever? Why didn't they think about that! They'd be lucky to have a smart guy like you working for them. Not like the EPA would fine them millions of dollars to leave the generators running. No, not at all..

    26. Re:Largest non-hurricane related power outage ever by Anonymous Coward · · Score: 0

      Because a network cable has no problem with not carrying data, while servers take being shut down improperly very badly.

    27. Re:Largest non-hurricane related power outage ever by afidel · · Score: 1

      Many gas stations do have generators. I know the last few I've seen built in my area have them. Of course, if I had to guess, that's more to keep the refrigeration units running, since most of a gas station's profits come from the mini-mart they run, but if you're going to bother with a generator you'll probably have it run the pumps as well =)

      Of course that ignores the point that almost zero percent of datacenter generators run on gasoline. Offroad diesel is by far the most common fuel used for generators and it's generally supplied via tanker truck which gets supplied at the local fuel depot, which will certainly have backup generators. Natural gas can be used, but when you're talking a $200M datacenter it would require a pipeline connection the size of a small power plant and so it's generally cheaper to do diesel and have onsite storage of a day or two worth of fuel.

      --
      There are 4 boxes to use in the defense of liberty: soap, ballot, jury, ammo. Use in that order. Starting now.
    28. Re:Largest non-hurricane related power outage ever by JWSmythe · · Score: 1

      That's odd. I don't see many with generators at retail fueling stations. I guess it varies by area. I'm in Florida, and can't say I've ever noticed one. We're most likely to get knocked down by hurricanes a few times a year, so it'd be a good thing to have. That's one of those things I look for, since I'm very interested in how things work. Well, I have the "how things work" part down pat. Now I just look at how someone else does theirs. What brand equipment do they use, etc, etc.

      The gas station reference was for the previous poster. Definitely, off-road diesel is preferred. It's easily transportable, fairly easy to store, and won't generally leak through seals in pressure systems. In a pinch, diesel from a retail fueling station would do, but that means an awful lot of gas cans. :) Retail fueling stations don't keep a huge supply. They keep enough for their demand, plus a little reserve for peak traffic days. If a retail station knows that they sell 1k gallons of diesel a week, they won't have 20k gallons to pump into a truck for them. And of course, the road taxes will kill you.

      The only other option I've seen at a large datacenter was one Northeast of Atlanta. They had a flywheel system. Well, there were stages. The DC room had capacity to run for a day. The flywheel was supposed to keep it up for another day or two. Then they'd fall back to diesel generators for longer durations. They were very proud of their flywheel. I just saw it as a huge expense that wasn't necessary. They also went for whiz-bang devices, like a retinal scanner to enter the main doors. That was cute, except I went around to the back door and entered the facility unchallenged. We didn't go with that datacenter, because their cooling sucked. They had a huge amount of cooling, but didn't have their flow designed properly. Walking the facility, there were hot spots in excess of 90 degrees, on the cold side of existing customer equipment. It was a very pretty facility though.

      The only gasoline generators I've worked with are at my house, and they're regular consumer portable units (5KW and 5.5KW). I got the 5.5KW because my ex-wife flipped out when a tropical depression came our way, and we had just moved to Florida. She didn't get that "tropical depression" is the same as "normal summer thunderstorm". The 5KW I got real cheap as a non-functional unit. $20 in parts, and now it works. They're really very simple devices, if you know what all the parts do.

      --
      Serious? Seriousness is well above my pay grade.
    29. Re:Largest non-hurricane related power outage ever by Vrtigo1 · · Score: 1

      Yes...this. It's exactly what EC2 is about. Everything about what they give you in terms of docs makes you think instances are designed to be basically disposable. If it gets hosed, don't spend time trying to fix it, just spin up another one. And if that instance happens to be in a completely separate AZ, the system doesn't care. There are a multitude of ways to gracefully recover from a datacenter outage in a straightforward manner if you plan for it and know what to do before you need to do it. It may not be a seamless recovery, and you may have 10 minutes of downtime, but unless you're Facebook you can live with that. The beauty of the "cloud" is that anyone can achieve this level of uptime without having to worry about buying generators and batteries, etc. Don't spend crazy money on redundancy, just make sure you've got your ducks in a row so when something like this happens you can easily spin back up or ramp up in another DC.
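
      The "disposable instance" idea can be sketched as a small planning function. This is a pure simulation that makes no AWS API calls; the function names, the zone names and the health map are hypothetical, and a real setup would get health from monitoring and launch instances through the provider's own tooling:

```python
def pick_zone(zones, health):
    """Return the first zone, in preference order, that reports healthy."""
    for zone in zones:
        if health.get(zone, False):
            return zone
    return None


def recovery_plan(instances, zones, health):
    """Map each instance to the zone it should run in after an outage.

    Instances in a healthy zone stay put. Instances in a failed zone are
    treated as disposable and relaunched in the first healthy zone.
    """
    plan = {}
    for name, zone in instances.items():
        if health.get(zone, False):
            plan[name] = zone
        else:
            plan[name] = pick_zone(zones, health)   # None if nothing is up
    return plan


zones = ["us-east-1a", "us-east-1b", "us-east-1c"]
health = {"us-east-1a": False, "us-east-1b": True, "us-east-1c": True}
instances = {"web1": "us-east-1a", "web2": "us-east-1c"}
print(recovery_plan(instances, zones, health))
```

      Here `web1` moves to `us-east-1b` and `web2` stays where it is. The sketch only covers stateless instances; as other comments in the thread point out, getting the data to the surviving zone is the harder part.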

  3. Infrastructure by TubeSteak · · Score: 5, Insightful

    We need to invest trillions in roads, water, and electrical infrastructure to keep this country going.
    If you let the basic building blocks of civilization rot, don't be surprised when everything else follows suit.

    --
    [Fuck Beta]
    o0t!
    1. Re:Infrastructure by rubycodez · · Score: 4, Insightful

      war is the basic building block of our particular civilization. if we waste money on your frivolities, how will we afford war & keep war machine shareholder value?

    2. Re:Infrastructure by Anonymous Coward · · Score: 1

      It seems like many of these jumbo datacenters built to support the top web sites are located in rural areas chosen for the ability to minimize costs (real estate acquisition, taxes, energy, cooling). Surprise... they may be more vulnerable to tornadoes and more isolated from repair crews, compared to an in-campus data center.

    3. Re:Infrastructure by Anonymous Coward · · Score: 0

      What sort of public investment do you think we need to make in the power grid to prevent this type of situation from happening in the future?

      Should we be investing in hardened power lines that can stand tornadoes and large trees falling on them? I'm not exactly sure what you're looking for here.

      The blame for this outage falls squarely on Amazon. Did they have multiple feeds from the grid? Do they have multiple generators? Multiple UPSes? Do they test their backup systems regularly and thoroughly?

      Chances are Amazon made some kind of call to save money and cheaped out somewhere in their electrical facilities.

    4. Re:Infrastructure by Anonymous Coward · · Score: 1

      Hell yea! The US should privatize the infrastructure, including maintaining roads, electricity and internet. Down with the pot holes, down with the evil socialists in Europe who manage to do these things cheaper and more affordably using their communist-era ideology.

    5. Re:Infrastructure by Anonymous Coward · · Score: 0

      we also need to invest trillions in gas powered dildos but you dont hear the legislators complaining, do you?

    6. Re:Infrastructure by Anonymous Coward · · Score: 1, Insightful

      Governments don't engage in war to make sure bullets sell. They engage in war to gain control of the natural resources the other country has.

      The distinction is subtle, but significant.

    7. Re:Infrastructure by Anonymous Coward · · Score: 0

      Being in a rural area does not make you statistically more likely to be hit by a tornado. Tornadoes don't have any sort of inborn preference. Tornado danger is a function of geography, not population density.

      The only drawback of being in the sticks is it is harder to access multiple power feeds. A good data center will have at least two feeds coming in from different directions. You can still do it, but it costs more since those power lines are being run just for YOU.

      Quick action from power repair crews is not really an issue. If your generators are maintained and functioning properly, they should be able to run for weeks with a steady supply of fuel. Barring a major disaster that inhibits access to refuel your generators or a much bigger regional catastrophe, it's a non-issue.

    8. Re:Infrastructure by Anonymous Coward · · Score: 2, Insightful

      I would say Laos would argue otherwise... The most bombed country in the world because America felt like it and had a lot of extra stock! Oh and they were officially a neutral country.

      GO USA!

    9. Re:Infrastructure by Anonymous Coward · · Score: 2, Insightful

      Dude, if you think a datacenter in Northern Virginia was plopped down here because of the insanely attractive price of real estate or energy, or because of the business-friendly tax rates you're out of your freaking mind. Datacenters are built here because of pre-existing backbone access. Period.

    10. Re:Infrastructure by Anonymous Coward · · Score: 0

      ...Should we be investing in hardened power lines that can stand tornadoes and large trees falling on them?...

      Yes. They're called underground utilities. Tree falling = who the hell cares. Tornado = so what. (OK, maybe an F5 would dig up a little real estate, but aside from that...) I live in a neighborhood in NoVA with underground utilities and my only power interruptions for the last 20 years have been 100% based on an above-ground failure somewhere upstream from me.

    11. Re:Infrastructure by tukang · · Score: 1

      (Defense and energy) Companies get governments to engage in war to make sure bullets sell and to gain control of the natural resources the other country has.

    12. Re:Infrastructure by turbidostato · · Score: 1

      "Being in a rural area does not make you statistically more likely to be hit by a tornado. Tornadoes don't have any sort of inborn preference. Tornado danger is a function of geography, not population density."

      You can't be so dense, can you? Do you think that being a tornado area might have something to do with people avoiding such a place - especially given that, due to the geography needed, tornado areas tend to be in the middle of nowhere?

      "The only drawback of being in the sticks is it is harder to access multiple power feeds [...] You can still do it, but it costs more since those power lines are being run just for YOU."

      So you are going to spend a big chunk of the savings from placing your datacenter in the middle of nowhere on the recurring costs of a utility that you will rarely if ever need, for a service that *on purpose* relies on multiple placements to be able to serve from run-of-the-mill hardware and capabilities.

      "If your generators are maintained and functioning properly, they should be able to run for weeks with a steady supply of fuel."

      Which is easier to say than do when a) your site is in the middle of nowhere and b) the "steady supply of fuel" crew is diverted to hospitals, banks, and other important places *not* in the middle of nowhere.

      It's the cloud, you fool! Why do you think companies like Amazon spend a lot to be able to offer you a *distributable* service? If your service is minor and you can't re-deploy it on another datacenter in a few hours, you are doing it *WROOOONG*. If your service is producing a lot of money 24x7 and you can't reroute on the fly out of a failing datacenter, you are doing it *WROOOONG*. In the end, if you believe the weasels that sold you that having a virtual private server (or a few) in a (unnamed) datacenter will magically protect you from a failure in that (unnamed) datacenter just because "it's a cloud provider", you are doing it *WROOOONG*.

    13. Re:Infrastructure by DarkTempes · · Score: 1

      I live in a hurricane prone area. In my experience with massive power outages like this it's typically high voltage transmission towers going down.
      It's not really economical to bury those.
      Fixing something like this is apparently not easy and takes time.

    14. Re:Infrastructure by AliasMarlowe · · Score: 2

      They engage in war to gain control of the natural resources the other country has.

      The distinction is subtle, but significant.

      Tell us again what natural resources the US wished to control when it engaged in war against Grenada in 1983, or when it engaged in war against Panama in 1989, or when it engaged in war against Afghanistan starting in 2001.

      There are many reasons for one state to go to war against another. Gaining control of natural resources is only one (e.g. Iraq's invasion of Kuwait), and is not the commonest.

      --
      Those who can make you believe absurdities can make you commit atrocities. - Voltaire
    15. Re:Infrastructure by roman_mir · · Score: 0

      You can't invest into shit worth of infrastructure if you don't produce anything that can pay for that expense, and all of your credit is used to buy wars and also foreign made goods.

      Until you restructure the debt and get gov't out of business of regulating business and doing all of this stuff (including wars, infrastructure, business regulations, all the nonsense that has been going on for over 100 years now), you won't have any new infrastructure that will make any sense.

      Oh, sure, you can have gov't come up with work projects, but none of them will be sustainable and useful. They will put you more into debt and won't give you any competitive advantage, since that infrastructure won't be built to satisfy real demand, only to have gov't create more makeshift work and spend more.

    16. Re:Infrastructure by Sir_Sri · · Score: 3, Informative

      In the case of Panama it's control of the Panama Canal Zone, which by itself isn't a natural economic resource, but it saves a crap load of them in reduced shipping costs.

      Though true, wars are generally fought for gold, glory and god, as one of my past history teachers used to say. I think what she meant is that wars are *started* for gold, glory or god. Afghanistan was very much god and glory (for Al Qaeda and the Taliban at least), and for them it was in part about natural resources: control, benefit and possession of the Islamic caliphate's resources (yes, the caliphate doesn't actually exist, but that's the kind of level they were thinking at).

      The invasion of Grenada is more tricky. By itself Grenada isn't anything, but a major military airfield in Grenada could cover all of the oil export ports from Venezuela, and there was the matter of US prestige on the issue.

    17. Re:Infrastructure by Anonymous Coward · · Score: 0

      I live in a semi-rural, heavily wooded area in northern VA and I get my power from a co-op, Novec. They are VERY diligent about yearly line maintenance and tree trimming. About 600 homes around me (average about 5 acres per house) all get our power from a single source that follows the road into our area. We rarely ever lose power, and when we do, it's never been more than 30 minutes before I see multiple power trucks driving around checking the red flashing lights on the top of the poles to find the break.

      Yes, our source of power is limited, but from what I read and understand about power outages in the DC metro area over the last decade or so, the "source" of power and the substations are usually not the failure; the individual lines are the major cause. Some preventative maintenance and adequate crew numbers could reduce the number and duration of outages.

    18. Re:Infrastructure by tyler_larson · · Score: 4, Interesting

      In my past two jobs and over the past 20 years, we've worked with dozens of independent and unrelated vendors with locations around the country, including Virginia. Of all the locations where these companies have operations, the ones in Virginia have been dramatically, almost comically, more disaster-prone than the rest of the country and even the rest of the world. The running joke in the office is that whenever any vendor or service provider drops offline, we first check the weather in Virginia before checking to see if any of our own systems are offline. Every time, we see a post-mortem a few days later disclosing some failed system or backup or contingency, and every time, they say this problem will never happen again.

      You'd think that all the failing locations would share an operations center or service provider or even a single city, but it turns out that the only thing these disaster-prone operations have in common is that they're in Virginia. I have no idea why this is the case. But our company has a policy singling out Virginia saying that no mission-critical components are allowed to be based there.

      --
      "With sufficient thrust, pigs fly just fine. However, this is not necessarily a good idea...."
      RFC 1925
    19. Re:Infrastructure by Anonymous Coward · · Score: 0

      Did you go to North Junior High in Colorado Springs? My history teacher there said the same thing, in those words even.

    20. Re:Infrastructure by Anonymous Coward · · Score: 0

      You're an idiot. Where do I even begin? Go look at a map of tornado alley and pinpoint several major cities. Furthermore, go look outside of tornado alley and find large areas of very low population density. Finally, get a list of major datacenters and find what percentage of them are located in tornado alley.

      Fucking retard. You embarrass yourself every time you open your mouth don't you?

    21. Re:Infrastructure by Sir_Sri · · Score: 1

      No, I'm not an American. That was at the University of Guelph. Though it wouldn't surprise me if the instructor herself was American. Unfortunately I can't remember her name or what she looked like enough to know 10 years on if she's one of the people listed on the history department faculty page.

    22. Re:Infrastructure by datavirtue · · Score: 1

      I've had the idea in mind for years that we need to bury every single electrical line in this country to create jobs for stimulus and to ensure service during severe climate disruptions. In my area (South-West Ohio) we had winds, little rain for ehh....10 minutes. Clobbered the hell out of us, and then it was gone. I was down for over twelve hours and many will not get power until Monday at midnight. While burying these lines we can run additional fiber owned by the people and covering the last mile. It would be sweet payback to the assholes who pocketed billions of dollars in subsidies and failed to improve their networks in the nineties. Plus there would be a shit-ton of those stinky wooden poles left over and a lot of steel cable for other projects. impractical? We can do anything we really want to do.

      --
      I object to power without constructive purpose. --Spock
    23. Re:Infrastructure by datavirtue · · Score: 1

      Afghanistan is an easy one. There have been plans for a pipeline through the region where we are fighting (Pakistan / Borderlands) for several decades now. A lot of powerful people want that pipeline for several reasons. And they are going to get it.

      --
      I object to power without constructive purpose. --Spock
    24. Re:Infrastructure by datavirtue · · Score: 1

      As usual Roman your comment carries a tinge of error. You have to invest in infrastructure first, regardless of cost, to propel economic activity. It is a well known fact that America's past prosperity was made possible by a superior infrastructure. The faster and easier and cheaper it is to transport goods, the more (exponentially more) activity will arise under those conditions. The same can be said of data as well. The more people can access high speed internet for little money, the more it will spur online economic activity and OPPORTUNITY.

      --
      I object to power without constructive purpose. --Spock
    25. Re:Infrastructure by roman_mir · · Score: 1

      Ha ha ha, and if nobody can loan you money and your money isn't good because nobody produces anything, so your money isn't worth anything, where are you going to get the loans from? And if all of the credit that you had was used to build up the war machine and to buy foreign made consumer goods and finally nobody wanted your money anymore, where are you going to get the loans from?

      I don't disagree that IF you have credit and can borrow, then you can in principle use the credit to build up infrastructure, but I do disagree that you know what you are talking about.

      Infrastructure by itself will not make you more productive. USA didn't have any government infrastructure, yet from the early 1800s up until 1913 the country did become the largest manufacturer, exporter and creditor nation, which means it had to produce enough to export, and to do that it had to build up infrastructure, all privately, so that the products could be produced and then exported.

      Investing in infrastructure first?
      1. You don't have the money.
      2. Every cent you borrow you spend on wars and consumer goods (foreign ones, that's your 54 billion USD/month trade deficit)
      3. What are you going to connect with your infrastructure?

      What are you going to connect with your infrastructure? You lost the factories, sure sure, you can connect one downtown to another with yet another wider road, you can put a road from your downtown right into a corn field in Iowa.

      The little infrastructure that does make sense to build would be built privately but you are blocking those efforts.

      That's right, at this point in your history, the infrastructure that you COULD use to reduce your trade deficit has nothing to do with moving cars around, it's not roads, it's a pipe that would be eventually used to pump your natural resource for productive use somewhere else, like China, because they CAN pay, they produce things. And for you to build up your economy again, you'd have to work down your debt and deficit, and to do that you'd have to EXPORT something useful, not just fresh looking pieces of green paper.

      USA is going to be a net energy exporter, USA will be exporting what it can still produce. Energy, food, other resources that it mines, that's going to be the backbone of US economy for a while, before it can rebuild its real manufacturing capacity.

      Investing in infrastructure first? What gives you the idea that your government wants to do that anyway? Your government wants to put the borrowed money into the pockets of military contractors and it has to put some money into the pockets of you consumers, so that they won't pay attention to what's happening and would continue 'voting' the way they are supposed to, Obama or Romney, whatever.

    26. Re:Infrastructure by turbidostato · · Score: 1

      "About 600 homes around me (average about 5 acres per house) all get our power from a single source"

      That means, for the case of a datacenter, that you automatically get discounted for a tier one qualification. That you don't have more outages is basically a matter of luck.

      "we rarely ever lose power and when we do, It's never been more than 30 minutes before I see multiple power trucks driving around"

      Yes, if the only problem in the area is your power line. Wait for a wide area incident and, of course, limited resources will have to be prioritized; then you probably won't see those power trucks arriving so soon.

      "Some preventative maintenance and an adequate crew numbers could reduce outages and times of outages."

      Truly. Now define "adequate crew". You will find "adequate crew" means "people enough for the usual case". No utility company will pay a lot of people to do nothing but once every ten years.

      But that's not even the point. The problem here seems to me that people think they are talking about a "datacenter" when we are talking here about an "Amazon datacenter". Even usual "datacenters" are tiered by the efforts they go to protect themselves against a blowout: it is not professional to expect Tier One capabilities on a Tier Three datacenter, just as it is not professional to expect capabilities from an infrastructure without doing the due diligence of attesting them. And then, we're talking here about datacenters from a company whose value proposition is not that a single datacenter won't fail (there are other providers for that), but that their services allow you to protect yourself from a single datacenter failing. Given that, why would someone expect Amazon to incur the expense of having top-notch datacenters when they don't need them and, as a rule of thumb, each extra 9 of reliability adds x10 in costs?

      And even if someone foolishly could expect this, is it so much to ask someone reading Amazon's SLA to learn they are offering "just" three nines over a monthly window?
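      A rough sketch of the arithmetic behind "three nines over a monthly window" and the x10-per-nine rule of thumb. The 30-day month is an assumption (a real SLA defines its own measurement window), and the function names are made up for illustration.

```python
# Allowed downtime per month for a given number of nines.
# Assumption: a 30-day month; an actual SLA defines its own window.
MINUTES_PER_MONTH = 30 * 24 * 60  # 43,200 minutes


def allowed_downtime_minutes(nines: int) -> float:
    """Minutes of downtime permitted per month at the given number of nines."""
    unavailability = 10.0 ** (-nines)  # 3 nines -> 0.001
    return MINUTES_PER_MONTH * unavailability


def relative_cost(nines: int, base_nines: int = 3) -> float:
    """Rule-of-thumb cost multiplier: x10 for each nine beyond base_nines."""
    return 10.0 ** (nines - base_nines)


if __name__ == "__main__":
    for n in range(3, 6):
        print(f"{n} nines: {allowed_downtime_minutes(n):6.2f} min/month, "
              f"cost x{relative_cost(n):.0f}")
```

      Under those assumptions three nines leaves roughly 43 minutes of downtime a month, and each further nine cuts that by ten while the rule of thumb multiplies the bill by ten.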

    27. Re:Infrastructure by mister_dave · · Score: 1

      the evil socialists in Europe who manages to do these things cheaper and more affordable using their communist-era ideology.

      Nothing involving the British Government is done 'cheaper'. They strive to find the most expensive option. However perverse.

      The Government says it needs £110bn of investment in new energy production plant to keep the lights on. That's slightly down from the £120bn figure the Department had cited earlier, but it needs to be qualified. That sum is "needed" to meet EU climate change target of a 20 per cent reduction in CO2 emission by 2020 using renewable energy production. Given the worldwide retreat from carbon dioxide mitigation policies, and the current financial situation within the EU, it's unlikely that a single European government will adopt a similar commitment, with a similar kind of energy mix.

      So why the scare quotes around "need" - just how necessary is that £110bn? It's a good question. The cost of building new open-cycle gas-fired plant to meet the requirement is £13bn. What inflates the figure by a factor of eight is the commitment to do it using renewable energy and nuclear.

    28. Re:Infrastructure by CheeseTroll · · Score: 1

      But don't forget about the dangers of the backhoe, which has severed all too many buried fiber connections. I'd like to think that a high voltage line would be marked better than some fiber thrown down in the 90's, but won't hold my breath.

      --
      A post a day keeps productivity at bay.
    29. Re:Infrastructure by Anonymous Coward · · Score: 0

      As usual Roman your comment carries a tinge of error

      only a tinge of error? i'm not sure i've seen a comment from him yet with a tinge of reality or logic.

    30. Re:Infrastructure by Anonymous Coward · · Score: 0

      Afghanistan is VERY rich in minerals of all kinds.

      And that, my friend, was known already by the Soviets. They made detailed maps about it. Why it was publicized as some kind of news a few years back in the NYT is beyond me. "USGS survey finds Afghanistan abundant with minerals!" Oh really, well it was well known since at least the 80s.

      Panama was kind of also about natural resources. The narrow isthmus is a natural resource. It has a canal dug into it. This is extremely important and strategic for commercial shipping and US Navy movements. The war was about control of the Panama Canal.

      Grenada was not about natural resources, it was about pushing back the Soviets and the RED MENACE(tm).

      Now, 2 out of 3 were about natural resources.

    31. Re:Infrastructure by sydbarrett74 · · Score: 1

      Governments don't engage in war to make sure bullets sell.

      They do if they've been bought by defence industry lobbyists.

      --
      'He who has to break a thing to find out what it is, has left the path of wisdom.' -- Gandalf to Saruman
    32. Re:Infrastructure by Anonymous Coward · · Score: 0

      Afghanistan was about multiple issues:

      1. Opiates (Start here: https://en.wikipedia.org/wiki/Opium_production_in_Afghanistan ).
      2. Oil and the transfer of Oil
      3. Political grand standings in the middle east.
      4. Taliban (just in general).

      I've put them in the order I believe was the most (1) to least (4) important; it always amazes me how the US is so anti-drug, and yet seems to go out of its way to protect its supplies... Perhaps a drug-induced docile population works better?

    33. Re:Infrastructure by Scoth · · Score: 1

      I had two history teachers across several years use the same phrase in a couple different states. It's probably pretty common.

  4. Seems like anything takes down the cloud... by Anonymous+Brave+Guy · · Score: 5, Interesting

    It seems that recently, anything can take down the cloud, or at least cause a serious disruption for any of the major cloud providers. I wonder how many more of these it takes before the cloud-skeptics start winning the debates with management a lot more often.

    You can only argue that the extra costs and admin involved with cloud hosting are outweighed by the extra costs of self-hosting and paying competent IT staff for so long. If you read the various forums after an event like this, the mantra from cloud evangelists already seems to have changed from a general "cloud=reliable, and Google's/Amazon's/whoever's people are smarter than your in house people" to a much more weasel-worded "cloud is reliable as long as you've figured out exactly how to set it all up with proper redundancy etc." If you're going to pay people smart enough to figure that out, and you're not one of the few businesses whose model really does benefit disproportionately from the scalability at a certain stage in its development, why not save a fortune and host everything in-house?

    --
    If you disagree, post your argument. (-1, Overrated) isn't your personal censorship tool for views you don't like.
    1. Re:Seems like anything takes down the cloud... by Anonymous Coward · · Score: 1

      Storms bring down clouds all right. It rains and everyone is miserable, except the greenies. :P

      Cloud computing brings availability to the "small guys". It also allows for quick scalability. You can't really accomplish similar things in-house unless you use 100s of servers, but then you have logistical issues as you have to ship these servers all over the place. If you just have one data drop to your 100 servers at one location, guess what? Your infrastructure is no better than hosting everything with a 3rd party in one data center.

    2. Re:Seems like anything takes down the cloud... by Anonymous Coward · · Score: 1, Interesting

      You realise that this took out one data center? That is, all of those other AWS data centers are working still just fine? If anything, this is proving the reliability of cloud providers!

      Why not save a fortune and host everything in-house?

      You really think hosting your own hardware in your own data centers spread across the world will save you a fortune? Have you even bothered to run those figures?

      Even if you have more money than sense, once you've got your hardware spread across the globe, you've still got to build the systems on top to survive an outage in one of them, i.e. exactly what you have to do if you use a cloud provider anyway. So what have you saved, precisely?

    3. Re:Seems like anything takes down the cloud... by girlintraining · · Score: 1

      It seems that recently, anything can take down the cloud,

      It wasn't just anything that took down the cloud: it was another cloud.

      --
      #fuckbeta #iamslashdot #dicemustdie
    4. Re:Seems like anything takes down the cloud... by tnk1 · · Score: 3, Interesting

      And this is ridiculous. How are they not in a datacenter with backup diesel generators and redundant internet egress points? Even the smallest service business I have worked for had this. All they need to do is buy space in a place like Qwest or even better, Equinix and it's all covered. A company like Amazon shouldn't be taken out by power issues of all things. They are either cheaping out or their systems/datacenter leads need to be replaced.

    5. Re:Seems like anything takes down the cloud... by girlintraining · · Score: 2

      How are they not in a datacenter with backup diesel generators and redundant internet egress points?

      Something about maximizing profits... by cutting corners... perhaps.

      --
      #fuckbeta #iamslashdot #dicemustdie
    6. Re:Seems like anything takes down the cloud... by hawguy · · Score: 5, Insightful

      It seems that recently, anything can take down the cloud, or at least cause a serious disruption for any of the major cloud providers. I wonder how many more of these it takes before the cloud-skeptics start winning the debates with management a lot more often.

      I think it's more because a cloud outage affects thousands of customers, so it has more visibility. When Amazon has problems, the news is reported on Slashdot. When a smaller colocation center has an accidental fire suppression discharge taking hundreds of customers offline, it doesn't get any press coverage at all.

      But the biggest takeaway from this is - never put all of your assets in one region. No matter how much redundancy Amazon builds into a region, a local disaster can still take out the datacenter. That's why they have Availability zones *and* regions. I have some servers in us-east-1a and they weren't affected at all. If they were down, I could bring up my servers in us-west within about an hour. (I could even automate it, but a few hours or even a day of downtime for these servers is no big deal)
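      A minimal sketch of what automating that "bring it up in the other region" step could look like. Everything here is hypothetical: the region names are shortened placeholders, the status table stands in for a real health probe, and no actual AWS API is used.

```python
from typing import Callable, Optional, Sequence


def pick_region(preferred: Sequence[str],
                is_healthy: Callable[[str], bool]) -> Optional[str]:
    """Return the first healthy region in preference order, or None if all are down."""
    for region in preferred:
        if is_healthy(region):
            return region
    return None


# Hypothetical status table standing in for a real health check.
status = {"us-east": False, "us-west": True}

# Prefer us-east, fall back to us-west when the probe reports it down.
active = pick_region(["us-east", "us-west"], lambda r: status.get(r, False))
print(active)
```

      The selection logic is the trivial part; the hour of manual work mentioned above is in starting servers and moving data, which this sketch does not attempt.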

    7. Re:Seems like anything takes down the cloud... by Anonymous+Brave+Guy · · Score: 2

      Cloud computing brings availability to the "small guys". It also allows for quick scalability. You can't really accomplish similar things in-house unless you use 100s of servers

      Sure, but probably 99% of small businesses don't actually need to scale that fast, or anywhere close. The cloud hosting proposition for most (not all, but most) small businesses is an appeal to wishful thinking, like the bank guy who tells you how they can give you a starter current account today, but they do have several tiers of service and once you're making over 10,000,000 in a year you'll have a dedicated account manager available to make you a coffee any time you want one.

      --
      If you disagree, post your argument. (-1, Overrated) isn't your personal censorship tool for views you don't like.
    8. Re:Seems like anything takes down the cloud... by fuzzyfuzzyfungus · · Score: 1

      While the nimbostratus salesweasels are (obviously, these are salesweasels) lying, an incident where a datacenter gets taken down good and hard by weather won't do the in-house guys much good either... 'Cloud' or not, a datacenter (and probably a fair few smaller ones, and a veritable legion of various converted-broom-closet small business setups) was taken down by weather.

      It certainly has become increasingly hard to hide that most of the 'cloud' providers do, er, rather less magic-distributed-reliability than their glossy brochure might insinuate. The decent ones generally make it possible; but they generally leave making it happen up to the customer. Anybody who expects 'cloud' to magically save them is naive or lying. However, that doesn't change the fact that it does make buying capacity in other regions, on short notice, convenient, so long as you can bully the vendor into admitting what you are actually buying.

      However, the in-house approach is in largely the same boat, only more visibly. Anybody's in-house operation in that part of the electrical grid would also have been good and hosed without redundancy in some other region. Whether it is cheaper/easier to provide that redundancy via traditional means or by purchasing the requisite 'cloud' stuff is a different issue...

    9. Re:Seems like anything takes down the cloud... by Anonymous+Brave+Guy · · Score: 1

      You realise that this took out one data center? That is, all of those other AWS data centers are working still just fine?

      Well, OK then, next time I'll just tell all of those people who can't use their home-grown Heroku-based apps for a few hours to go watch a movie on Netflix instead. It's probably just the little guys who got in trouble on this one, and it's their own dumb fault for not setting up more than one AZ or using different regions or something. Oh, no, wait, loads of people couldn't watch the movie either, and Netflix are HUGE AWS customers with an army of people to maintain a redundant infrastructure.

      You really think hosting your own hardware in your own data centers spread across the world will save you a fortune?

      False dichotomy. Most on-line businesses don't need redundant access in data centres all over the world to avoid a problem like this. Having a primary and a stand-by in different geographic locations would have done just fine, and we've been doing that since long before the marketing people invented terms like "cloud computing".

      Have you even bothered to run those figures?

      Several times and for multiple businesses. Have you?

      --
      If you disagree, post your argument. (-1, Overrated) isn't your personal censorship tool for views you don't like.
    10. Re:Seems like anything takes down the cloud... by Anonymous Coward · · Score: 0

      So your argument is: Netflix fucked up, so cloud is shit? Brilliant.

      Several times and for multiple businesses. Have you?

      Yes. Cloud is usually cheaper and easier at small to medium scale, i.e. the vast majority of use cases.

    11. Re:Seems like anything takes down the cloud... by ILongForDarkness · · Score: 2

      They expect the customers to pay for the redundancy by using multiple servers in different geographical locations. People buying one server or a bunch only in one datacentre are taking a risk already. I'm assuming someone in Amazon said let's build a few datacentres and skimp on the redundancy at each one. The redundancy is at the multi-datacentre level, not at the multi-UPS, multi-connection etc. level at each datacentre.

    12. Re:Seems like anything takes down the cloud... by MrBandersnatch · · Score: 1

      "Several times and for multiple businesses. Have you?"

      I'd actually be interested in hearing your analysis and experience. I'm looking at this myself and finding that cost advantages differ depending on scenario - there just doesn't seem to be a clear cut point at which one solution costs less than the other for all but the most trivial scenarios.

    13. Re:Seems like anything takes down the cloud... by Anonymous+Brave+Guy · · Score: 1

      So your argument is: Netflix fucked up, so cloud is shit?

      No, my argument is that saying this only affected one AWS data center and people elsewhere are fine is clearly not the whole story.

      Cloud is usually cheaper and easier at small to medium scale

      Cheaper and easier than what? Cloud technologies are basically useful for two things: outsourcing hardware and staff resources so you can adapt to very fast changes in the level of requirements, and being a glorified CDN. What proportion of small/medium businesses ever need to scale so fast that doing it in-house is impractical, or need the generalised capabilities of services like Amazon's rather than a straight-up CDN provider like Akamai?

      --
      If you disagree, post your argument. (-1, Overrated) isn't your personal censorship tool for views you don't like.
    14. Re:Seems like anything takes down the cloud... by MrBandersnatch · · Score: 1

      Almost spot on - in fact don't even put all of your assets into the same cloud, because the day IS going to come when an infrastructure issue takes out even the largest of providers.

    15. Re:Seems like anything takes down the cloud... by andy1307 · · Score: 1

      I wonder how many more of these it takes before the cloud-skeptics start winning the debates with management a lot more often.

      This sort of thing never ever happens when you host everything in-house?

    16. Re:Seems like anything takes down the cloud... by MobileTatsu-NJG · · Score: 0

      They'd only 'win' the debate until a power failure at their location. Or a hardware failure... Or a malware outbreak... etc.

      --

      "I like to lick butts!" by MobileTatsu-NJG (#32700246) (Score:5, Informative)

    17. Re:Seems like anything takes down the cloud... by Anonymous+Brave+Guy · · Score: 1

      OK. Obviously I'm posting pseudonymously so I can't give a lot of specifics, but FWIW...

      I agree that this isn't a straightforward question, and I think one big problem is that people sometimes start by assuming a false dichotomy: either we're hosting in the cloud or we're kitting out a whole new server room. In reality, there is a broad scale to consider, with all kinds of managed hosting and colo options where a lot of the sysadmin overhead can be outsourced but you basically get to use real hardware with proper root access at a much more sensible cost-per-resource-unit than any cloud hosting provider is going to offer.

      For a lot of small/medium sized businesses (anyone who is going to run Netflix 2 successfully doesn't need advice from me ;-)) the sweet spot seems to be somewhere in the middle. If you can find a service provider with geographically diverse hosting facilities and sensible connectivity, you can either lease machines from them or buy your own and use their colo services, and basically make the hosting service into your on-site IT people. If you're just starting out and don't have dedicated IT people yet, a lot of these services will also offer basic sysadmin support for a nominal fee, to help with installing/patching your OS or standard cloned images, and setting up things like firewalls, load balancing, database replication, distributed filesystems and all that stuff that you probably don't care about if you're trying to build a new service that actually does something useful.

      The key thing seems to be finding a host who will let you outsource the mundane stuff that you would do via a console in a cloud-based system -- chances are that's basically what they've set up on their own systems anyway -- but keeping the increased flexibility and lower cost-per-resource-unit of leasing/buying your own dedicated hardware with real root access. This approach seems to work pretty well up to a scale of dozens/hundreds of machines, as long as your resource needs grow reasonably predictably or you can afford a day or two to catch up in the event of an unexpected spike.

      Of course if you need a global CDN then you're probably not going to beat a real CDN provider this way, but you can combine that with some sort of managed hosting/colo arrangement in various sensible ways. And if you really do need to scale up and down within a matter of minutes/hours, perhaps because your service has wildly different usage patterns at different times of day, then probably Amazon-style cloud hosting is your only viable option without spending a fortune on hardware you won't be using efficiently.

      Not sure if that just repeats things you already figured out, but I hope it helps.

      --
      If you disagree, post your argument. (-1, Overrated) isn't your personal censorship tool for views you don't like.
    18. Re:Seems like anything takes down the cloud... by lucifuge31337 · · Score: 1

      "Several times and for multiple businesses. Have you?"

      I'd actually be interested in hearing your analysis and experience. I'm looking at this myself and finding that cost advantages differ depending on scenario - there just doesn't seem to be a clear cut point at which one solution costs less than the other for all but the most trivial scenarios.

      Because it really depends on the business and the application. It also depends on how much bandwidth you use and if you have geographical limitations which would make accessing that bandwidth more costly in one or more locations.

      If you are in it for the long haul, why not have control over your own cheap commodity machines and "scale into the cloud" for overages until you acquire more hardware? Then you can actually have control of those little things that let you switch between datacenters easily like.....you know, your BGP and other trivial things like that.

      There's definitely no one-size-fits-all for this, but the bulk of the startups I see that are cloud based appear to be 1.) a bunch of developers first and foremost, so not data center or network engineers at all and 2.) not capitalized well enough in the beginning to be able to afford leasing and equipping space in multiple data centers. And there's nothing at all wrong with that. It's a valid choice if you recognize the reality of what you are buying rather than believing the marketing hype hook, line and sinker.

      --
      Do not fold, spindle or mutilate.
    19. Re:Seems like anything takes down the cloud... by lucifuge31337 · · Score: 1

      I wonder how many more of these it takes before the cloud-skeptics start winning the debates with management a lot more often.

      This sort of thing never ever happens when you host everything in-house?

      Obviously they do. But at least you have some control over the recovery, rather than sitting around watching for carefully-worded email and Twitter updates from Amazon about when you just might get access to the shit you are paying for again. That makes communicating real information to your customers a bit easier.

      Of course, you can always use the excuse that it's not your fault and blame Amazon ("see...look at all the other people who are down"). But that's largely a marketing decision I suppose.

      --
      Do not fold, spindle or mutilate.
    20. Re:Seems like anything takes down the cloud... by Anonymous Coward · · Score: 0

      It seems that recently, anything can take down the cloud

      I remember having downtime on "not the cloud" back in the day because a truck crashed into ... I can't recall if it was just a utility pole, or if the truck actually crashed into a Rackspace datacenter.

      Replace 'the cloud' with 'any operation that's too cheap to pay for five-nines uptime by having balanced systems located in multiple disparate geographic locations' - random downtime has little to do with something being 'the cloud' or not.

      Of course, the proper solution is expensive as hell, which is why companies usually don't go for it.
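      A back-of-the-envelope sketch of why multiple locations are how you get to five nines. It assumes site failures are fully independent, which a regional storm or a shared software bug will violate, so treat the numbers as an upper bound. Function names are illustrative only.

```python
def combined_availability(site_availability: float, sites: int) -> float:
    """Availability of the whole service if any one of `sites` independent sites suffices."""
    return 1.0 - (1.0 - site_availability) ** sites


def sites_needed(site_availability: float, target: float, max_sites: int = 20) -> int:
    """Smallest number of independent sites that reaches `target` availability."""
    for n in range(1, max_sites + 1):
        if combined_availability(site_availability, n) >= target:
            return n
    raise ValueError("target not reachable within max_sites")


if __name__ == "__main__":
    # Two three-nines sites versus a five-nines target.
    print(combined_availability(0.999, 2))
    print(sites_needed(0.999, 0.99999))
    print(sites_needed(0.99, 0.99999))
```

      On paper two three-nines sites already clear five nines; the expense is in everything the formula leaves out, such as keeping data in sync and actually failing over.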

    21. Re:Seems like anything takes down the cloud... by codepunk · · Score: 1

      Hmm no you don't usually have much control over the recovery either. I was involved in an outage once because some guys trenching cable cut clean through our fiber bundle. There is no controlling anything that happens after that; you are just down until the fiber is repaired.

      In a cloud environment, given that you have a DR plan you press a button and you are back online.

      --
      Got Code?
    22. Re:Seems like anything takes down the cloud... by lucifuge31337 · · Score: 1

      Hmm no you don't usually have much control over the recovery either. I was involved in an outage once because some guys trenching cable cut clean through our fiber bundle. There is no controlling anything that happens after that; you are just down until the fiber is repaired.

      Diverse utility paths are pretty much required for any datacenter. And even that may not be enough, which I will respond to in the next point.

      In a cloud environment, given that you have a DR plan, you press a button and you are back online.

      Two things: that whole concept is not a "cloud environment" thing, it's the way things have been done for a long time. Also, if you have to "press a button" (or perform any action) you are doing it pretty much wrong and have nothing to be smug about. None of this is magic, nor is it unique to "cloud computing". Stop letting your brain fall out of your ear when you hear the latest buzz words. "Cloud computing" is code for "we figured out how to lease you a fractional part of several servers and call it something else". The only part about it that is new is the marketing. It carries all the same risks as it did before, but now has some more tools, platform support and vendors due to its trendiness and the increased needs for the types of service in the general market area it covers.

      No, I'm not working my first or second job in the industry. That's why I know "THE CLOUD! THE CLOUD!" is not the answer to all problems. It's just another tool that can be used appropriately or inappropriately. Much of what I've seen lately has been inappropriate, but wholly in keeping with what the marcom firms the providers have hired are messaging. Sounds like you've bought into that too. I'd suggest you get some perspective on the industry you are presumably a part of.

      --
      Do not fold, spindle or mutilate.
    23. Re:Seems like anything takes down the cloud... by twisted_pare · · Score: 1

      Good point. Also don't forget that even if you colo'd in a major VA datacenter, and the center lost power, you'd still be just as screwed, cloud or no cloud.

      --
      HTFU
    24. Re:Seems like anything takes down the cloud... by acoustix · · Score: 1

      Cloud computing brings availability to the "small guys". It also allows for quick scalability. You can't really accomplish similar things in-house unless you use 100s of servers, but then you have logistical issues as you have to ship these servers all over the place. If you just have one data drop to your 100 servers at one location, guess what? Your infrastructure is no better than hosting everything with a 3rd party in one data center.

      Ever heard of server virtualization? It makes availability and scalability ridiculously cheap and easy.

      --
      "A plan fiendishly clever in its intricacies"- Homer Simpson
  5. What, you thought "cloud" meant "no outage"? by ebunga · · Score: 4, Insightful

    Cloud computing is nothing more than 1960s timesharing services with modern operating systems. Unless you design for resilience, you're not resilient to problems.

    1. Re:What, you thought "cloud" meant "no outage"? by rubycodez · · Score: 2

      The laugh is that those 1960s systems had, for additional money, configurations for 24x7 uptime. Here we supposedly design for that with the cloud architecture, and fail. I would not be surprised at all if the modern mainframe were a cost effective alternative to this bloated expensive cloud.

    2. Re:What, you thought "cloud" meant "no outage"? by ColdWetDog · · Score: 1

      Cloud computing is nothing more than 1960s timesharing services with modern operating systems. Unless you design for resilience, you're not resilient to problems.

      Cool. Can we get those old Teletype terminals back? The clattering ones that left little round bits of paper all over the place?

      And 8-track tapes while we're at it.

      --
      Faster! Faster! Faster would be better!
    3. Re:What, you thought "cloud" meant "no outage"? by Anonymous Coward · · Score: 1

      The laugh is that those 1960s systems had, for additional money, configurations for 24x7 uptime.

      If you cut all the power to those "redundant" systems, they went down.

      Funnily enough, that's what's happened here. Except the other AWS data centers are all working, unlike your 1960's system.

      Are people on Slashdot really too stupid to understand cloud, or are they just deliberately disingenuous?

    4. Re:What, you thought "cloud" meant "no outage"? by rubycodez · · Score: 1

      Are you too stupid to research before spouting off? Cutting "all the power" was rather difficult, as it came from two utilities and onsite generation.

    5. Re:What, you thought "cloud" meant "no outage"? by ILongForDarkness · · Score: 1

      I think most are just cheap bastards that are upset that their one server $30/month setup didn't buy a redundant datacentre and that, oops, maybe they should have listened when people said that geo-redundancy: "It's a good thing" TM.

    6. Re:What, you thought "cloud" meant "no outage"? by sweatyboatman · · Score: 1

      Cloud computing is nothing more than 1960s timesharing services with modern operating systems. Unless you design for resilience, you're not resilient to problems.

      Cloud computing is a little more than 1960s timesharing services. Some minuscule differences such as being accessible from anywhere in the world, providing enormously more power and exponentially more capacity, and priced by the penny, but those are tiny differences that matter. Not to mention that, as other commenters have mentioned, the Amazon Cloud does provide more redundancy; the people using it just didn't want to pay for it.

      The parent is the single stupidest comment possible for this thread and it's modded +5 insightful.

      --
      It breaks my pluginses, my precious!
    7. Re:What, you thought "cloud" meant "no outage"? by MrBandersnatch · · Score: 1

      I suspect there is a lot of resistance to the concept due to the general early experiences of SaaS and hosting solutions being cloud-washed....

    8. Re:What, you thought "cloud" meant "no outage"? by Anonymous Coward · · Score: 0

      No, it really isn't. Modern day cloud computing isn't much more advanced than it was in the 1960's.

    9. Re:What, you thought "cloud" meant "no outage"? by Anonymous Coward · · Score: 0

      I think Amazon deserves some share of the blame for not being able to keep the DC online. Once they realized they were going to be in a long-term outage they should have moved some of the EC2 instances to other sites (Oregon and California are the other two iirc) while the generators were still running.

      Most of the blame does fall on the customers though. If you pay to be hosted in only one site, you should know that there's risk you'll go down. This is true whether you host your own servers or have them hosted "in the cloud". I find it surprising these companies don't divide their capacity between sites just over bandwidth concerns. Being in just one site opens you up not only to downtime but also to complete data loss. Suppose a number of these servers were fried? I'm not sure how elaborate their data backups are, but hopefully those aren't single-site too.

    10. Re:What, you thought "cloud" meant "no outage"? by Anonymous Coward · · Score: 0

      It certainly doesn't mean "reliability", does it?

      Of course a lot of people will take the opportunity to say cloud sucks, but it seems the argument is (or should be) whether the reduction in costs/increase in scalability is worth the hassle of a less dependable system, and whether it really *is* less dependable... and whether that's just because it's still gaining acceptance.

      If struggles by early adopters meant the whole concept is bad we'd still be riding carriages.

    11. Re:What, you thought "cloud" meant "no outage"? by lucifuge31337 · · Score: 1

      I think most are just cheap bastards that are upset that their one server $30/month setup didn't buy a redundant datacentre and that, oops, maybe they should have listened when people said that geo-redundancy: "It's a good thing" TM.

      Yeah....Netflix is totally one of those places. Oh...wait....no they aren't and they were down anyway.

      --
      Do not fold, spindle or mutilate.
    12. Re:What, you thought "cloud" meant "no outage"? by dkf · · Score: 1

      Are you too stupid to research before spouting off? cutting "all the power" was rather difficult, as it came from two utilities and onsite generation.

      Never underestimate the power of the universe to shit on you. It's still quite possible to get a perfect storm of problems that takes things offline, such as the main onsite generator being down for scheduled maintenance that overruns, the backup generator only having limited capacity, and a major storm wiping out the power grid completely for 20 miles. At that point, stuff will go down, and at some point it becomes cheaper to have insurance to deal with the losses arising (including reputational losses) instead of building the vastly complex infrastructure that can't fail in ever less likely scenarios.

      The other big change is that the vast majority of work now requires that datacenter be online (in a network sense), and at that point you're vulnerable to someone else being the weakest link. Doing it all yourself is fantastically expensive, and incredibly hard too given the number of different skills involved.

      Mind you, most of Amazon's service provision didn't even bat an eyelid. They can lose a whole datacenter and only some customers are affected, and those customers can (if they adequately prepared) get back up and going in minutes. They could even have arranged things so that their customers would have hardly seen a thing at all, but that is admittedly more easily done with some types of service than others. Still, it's not something that Amazon fixes for you (and they explicitly tell you they don't if you read their docs; you've got no excuse there). One of the genius things about the Cloud is that these things are not totally hidden from you (as a direct customer of the main service providers, of course); it allows prices to be lower and it allows you to deal with issues at the application level (usually the easiest place). It lets you get the benefits of having multiple globally-distributed datacenters without the hassle of physically building out all over the world, but it isn't magic. Just engineering and business.

      --
      "Little does he know, but there is no 'I' in 'Idiot'!"
    13. Re:What, you thought "cloud" meant "no outage"? by dkf · · Score: 4, Funny

      And 8-track tapes while we're at it.

      We need those tape machines. Stick them in front of the real machines and get something hacked from a Raspberry Pi to spin them back and forth in an interesting pattern, with some extra blinkenlights for good measure, and we'll be able to once again prove to all the management types that we're doing serious computing so they can leave us alone and go back to their golf handicap.

      --
      "Little does he know, but there is no 'I' in 'Idiot'!"
    14. Re:What, you thought "cloud" meant "no outage"? by dkf · · Score: 1

      No, it really isn't. Modern day cloud computing isn't much more advanced than it was in the 1960's.

      All except for the data volumes, timescales, connectivity and pricing. In the '60s, timesharing services didn't ever have to deal with anything like the volume of data that would be found on a modern PC. They'd have a turnaround time of a few days, and connectivity was by courier if you were in a hurry, or driving over there yourself with your stack of punched cards (or paper tape) otherwise. I suppose it would be possible to think that pricing was comparable, especially if you were to ignore inflation, but really there's no comparison at all.

      The net effect of these things is that people use the concepts of a timesharing service differently to back then. Human activity is not time- or space-scale invariant.

      --
      "Little does he know, but there is no 'I' in 'Idiot'!"
    15. Re:What, you thought "cloud" meant "no outage"? by gmhowell · · Score: 1

      I suggest that you aren't old enough to remember 8-track tape if you imply that they can be wound back and forth. Hell, just going in one direction gave pretty good odds of screwing up. Methinks you're getting your formats mixed up.

      --
      Jesus was all right but his disciples were thick and ordinary. -John Lennon
    16. Re:What, you thought "cloud" meant "no outage"? by ILongForDarkness · · Score: 1

      And you're sure that they had everything that should have been redundant actually set up as required to work in AWS? Or did they just have redundant db and webservers but some stupid master index that everything has to pass through running in a single zone? 9/10 that is the problem. People can justify multiple dbs because of performance and data integrity needs. People justify multiple webservers so that they can get low latency to different areas of the globe and under load. Then someone throws on top a cassandra or memcache layer or whatever and plays with it in one zone ... then goes live ... in one zone. Oops.
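
      To make the point concrete, a minimal sketch of the kind of check that catches this before go-live. The inventory below is entirely hypothetical (tier names and zone names are placeholders, not anyone's real setup):

```python
from typing import Dict, List, Set

# Hypothetical inventory: tier name -> availability zones it runs in.
deployment: Dict[str, Set[str]] = {
    "web": {"us-east-1a", "us-east-1b"},
    "db": {"us-east-1a", "us-east-1b"},
    "cache": {"us-east-1b"},  # the layer someone "played with" in one zone
}

def single_zone_tiers(tiers: Dict[str, Set[str]]) -> List[str]:
    """Return the tiers that would go down along with a single zone."""
    return sorted(name for name, zones in tiers.items() if len(zones) < 2)

print(single_zone_tiers(deployment))
```

      Anything that list returns is a single point of failure, no matter how redundant the tiers around it are.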

    17. Re:What, you thought "cloud" meant "no outage"? by lucifuge31337 · · Score: 1

      And you're sure that they had everything that should have been redundant actually set up as required to work in AWS? Or did they just have redundant db and webservers but some stupid master index that everything has to pass through running in a single zone? 9/10 that is the problem. People can justify multiple dbs because of performance and data integrity needs. People justify multiple webservers so that they can get low latency to different areas of the globe and under load. Then someone throws on top a cassandra or memcache layer or whatever and plays with it in one zone ... then goes live ... in one zone. Oops.

      I'm not sure what that has to do with a cheap bastard with a $30 a month setup, as I was replying to in your post. But I'll play along: of course I can't be sure of any of those things. But I'm having a hard time believing a marquee customer of that size is doing things that wrong. The reports of other availability zones being affected/degraded seem to bear this out.

      --
      Do not fold, spindle or mutilate.
  6. Millions of dollars spent for nothing. by Anonymous Coward · · Score: 5, Interesting

    So this is the second time this month Amazon's cloud has gone down. There should be serious questions being asked of the sustainability of this service given the extremely poor uptime record and extremely large customer base.

    They would have spent millions of dollars installing diesel or gas generators and/or battery banks and who knows how much money maintaining and testing it, but when it comes time to actually use it in an emergency, the entire system fails.

    You would think having redundant power would be a fundamental crucial thing to get right in owning and operating a data centre, yet Amazon seems unable to handle this relatively easy task.

    Now before people say "well this was a major storm system that killed 10 people, what do you expect", my response is that cloud computing is expected to do work for customers hundreds and thousands of kilometres/miles from the actual data centre so this is a somewhat crucial thing that we're talking about - millions of people literally depend on these services; that's my first point.

    My second point is it's not like anything happened to the data centre, it simply lost mains energy. It's not like there was a fire, or flood, or the roof blew off the building, or anything like that; they simply lost power and failed to bring all their millions of dollars in equipment up to the task of picking up the load.

    If I were a corporate customer, or even a regular consumer, I would be seriously questioning the sustainability of at least Amazon's cloud computing. Google and Facebook seem to be able to handle it but not Amazon. Granted, they don't offer identical products, but their data centres overall seem to stay up 100 or 99.9999999% of the time, unlike Amazon's.

    1. Re:Millions of dollars spent for nothing. by turbidostato · · Score: 2

      A datacenter is a datacenter is a datacenter. You are not in "the cloud" if you can't escape from a datacenter-level incident.

      Given that there is no "cloud" provider (not yet, at least) that will automagically protect your services from a datacenter-level incident, it is up to you, the customer, to do it.

      It's certainly possible with current technology but it's neither cheap nor straightforward, no matter what the "cloud" providers insist on selling and the PHBs on believing.

    2. Re:Millions of dollars spent for nothing. by hawguy · · Score: 5, Informative

      So this is the second time this month Amazon's cloud has gone down. There should be serious questions being asked of the sustainability of this service given the extremely poor uptime record and extremely large customer base.

      They would have spent millions of dollars installing diesel or gas generators and/or battery banks and who knows how much money maintaining and testing it, but when it comes time to actually use it in an emergency, the entire system fails.

      You would think having redundant power would be a fundamental crucial thing to get right in owning and operating a data centre, yet Amazon seems unable to handle this relatively easy task.

      Well, the entire system didn't fail, my servers in us-east-1a weren't affected at all.

      Hardware fails, even well tested hardware... especially in extreme conditions - don't forget that this storm has left millions of people without power, killed at least 10, and caused 3 states to declare an emergency. Amazon may have priority maintenance contracts with their generator and UPS system vendors and fuel delivery contracts, but when a storm like this hits, those vendors are busy keeping government and medical customers online. Rather than spend millions more dollars building redundancy for their redundancy (which adds complexity that can cause a failure itself), Amazon isolates datacenters into availability zones, and has geographically dispersed datacenters.

      Customers are free to take advantage of availability zones and regions if they want to (which costs more money), but if they choose not to, they shouldn't blame Amazon.

    3. Re:Millions of dollars spent for nothing. by Anonymous Coward · · Score: 0

      Someone mod parent up. "The Cloud" simply forces you to engineer reliability correctly. You can no longer throw money at a high-end single point of failure (a storage system or server or switch) and just hope that if it costs enough it will never fail. That option is gone in the cloud; all components are commodity and considered cheap and prone to failure.

      Ultimately, whether you used high-end/low-failure-rate stuff or the cheap/crap stuff within one datacenter, you'll still need to engineer *real* reliability between multiple datacenters if you want to survive natural (and man-made) disasters. If you're doing that properly, there's no sense wasting money on high-end componentry within a single datacenter anymore.

      Building architectures that scale and handle failure correctly and globally (deployed in multiple locations on commodity cheap stuff, virtually never goes down anyways) is the way of the future whether you run your own hardware or not. It's The Right Way To Do Things. At a certain scaling level it probably makes sense to build your own "cloud". For many companies, it makes more sense to use something like EC2, because they really aren't at a scaling level where they can do it as reliably and cheaply. I'd guestimate the cutoff is somewhere in the vicinity of having a permanent baseline need of ~5K+ instances. Either way, it's the same architectural goals for your applications to get the same reliability and scalability.

      If your app fails due to a single Amazon datacenter failing, you're not architecting things correctly. Amazon has nothing to do with that; the same rules apply anywhere else.
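
      A back-of-the-envelope sketch of why multiple sites on cheap hardware beat one expensive site. It assumes site failures are independent, which a regional storm or a shared control plane can violate; the 99% per-site figure and the function name are illustrative, not Amazon's numbers:

```python
def combined_availability(per_site: float, sites: int) -> float:
    """Availability of a service that stays up if at least one site is up.

    Assumes independent site failures (an optimistic simplification).
    """
    return 1.0 - (1.0 - per_site) ** sites

# Compare one, two, and three sites at an assumed 99% availability each.
for n in (1, 2, 3):
    print(n, f"{combined_availability(0.99, n):.6f}")
```

      Under that assumption, two 99% sites already get you to roughly four nines, which is the whole argument for engineering across datacenters instead of gold-plating one.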

    4. Re:Millions of dollars spent for nothing. by ahodgson · · Score: 1

      ELB issues last night did cause problems to services with zone redundancy. We had services with zone redundancy that were experiencing issues because the ELB addresses being served were not functional even though they had working instances connected to them.

      Amazon has also had at least one other outage in the last 18 months that affected more than one availability zone.

      Region redundancy would be good. But it's quite a bit more complex and costly, what with security groups and ELBs not crossing regions and having to pay external data charges for every byte moved between regions. We do it for important services, but it is a pain.

    5. Re:Millions of dollars spent for nothing. by Anonymous Coward · · Score: 0

      It's certainly possible with current technology but it's neither cheap nor straightforward, no matter what the "cloud" providers insist on selling and the PHBs on believing.

      Any decent IaaS cloud provider will offer CDN and GSLB products at a reasonable price. It's totally possible to build a system on top of a cloud that can survive a data center ("Region") outage, using the services the cloud provider offers.

    6. Re:Millions of dollars spent for nothing. by dbrueck · · Score: 5, Informative

      Sorry, but "Amazon's cloud has gone down" is wildly incorrect. From the sounds of it, *one* of their many data centers went down. We run tons of stuff on AWS and some of our servers were affected but most were not. Most important of all is that we had *zero* service interruption because we deployed our service according to their published best practices, so our traffic was automatically handled in different zones/regions.

      Having managed our own infrastructure in the past, it's these sort of outages at AWS that make us grateful we switched and that continue to convince us it was a good move. It might not be for everybody, but for us it's been a huge win. When we started getting alarms that some of our servers weren't responding, it was so cool to see that the overall service continued on its merry way. I didn't even bother staying up late to babysit things - checked it before bed and checked it again this morning.

      Firing up a VM on EC2 (or any other provider) != architecting for the cloud.

    7. Re:Millions of dollars spent for nothing. by turbidostato · · Score: 1

      "Any decent IaaS cloud provider will offer CDN and GSLB products at a reasonable price."

      Which helps you with your authoritative dynamic data exactly how?

      And even with your mostly read-only data you will get only the lowest advantage if going the "automagical" route: to take most benefit from CDN or GSLB you need to engineer and develop your apps with those services in mind, which is exactly what I already said.

    8. Re:Millions of dollars spent for nothing. by joelsanda · · Score: 1

      You would think having redundant power would be a fundamental crucial thing to get right in owning and operating a data centre, yet Amazon seems unable to handle this relatively easy task.

      How much power would they need available to counteract the total failure of the electric grid in their area? In the case of this type of storm you can't rely upon basic services being up: water, electricity, and, without power, mobile or land line communication. So that means the only way to have a redundant power source would be to handle all electricity needs on-site for x number of hours. That seems a little extreme, at least in this case?

      --
      The Luddites were ahead of their time.
    9. Re:Millions of dollars spent for nothing. by twisted_pare · · Score: 1

      Looks like their VA datacenter is down to two 9's for this year. Whatever happened to A+B power? You have two rails in your rack. Rail A goes to powergrid A, rail B goes to powergrid B. Then again, if, as you point out, your millions of dollars of switching gear does not work, does it matter how many redundant systems you have?

      --
      HTFU
  7. I live nowhere near Va by bugs2squash · · Score: 4, Interesting

    However "Netflix, which uses an architecture designed to route around problems at a single availability zone." seems to have efficiently spread the pain of a North Eastern outage to the rest of the country. Sometimes I think redundancy in solutions is better left turned off.

    --
    Nullius in verba
  8. Pepco still has 400,000 people without power by Oxford_Comma_Lover · · Score: 2
    --
    -- IANAL, this isn't legal advice, and definitely isn't legal advice for you. Also, Squee!
  9. not just netflix, and not just "electrical storm" by acroyear · · Score: 2

    Instagram's servers in that cloud server were also affected, and more people griped about that on my facebook feed than netflix.

    As for "an electrical storm", that's a bit of an understatement. The issue was actually more the 80 mph wind gusts as well as the lightning continuing on for 2 hours after the wind and rain had passed (meaning crews couldn't get out there overnight).

    The result is some 2 million people without power, 1 million around DC alone. Dominion Power (which services the area where the data center resides, about 5 miles from my house) lost power for more than half of its Northern Virginia customers, and even now has only restored power to about 60,000* out of 461,000 that lost it. On the Maryland/DC side of the Potomac, half a million people may be without power for days through a heat wave of 100 degrees each day (and more storms like last night's coming...).

    * fortunately that would include me...though i'm writing this via my sprint phone as a wifi hotspot 'cause our cable modem is still down ;-)

    --
    "But remember, most lynch mobs aren't this nice." (H.Simpson)
    -- Joe
  10. it seems like the switching system failed by Joe_Dragon · · Score: 3, Informative

    It seems like the switching system failed and/or the backup power generators did not kick on.

    Maybe natural gas ones are better. The firehouses have them. I also see them at a big power sub station as well.

    1. Re:it seems like the switching system failed by tnk1 · · Score: 1

      While failure of the backup systems is a possibility (just look at Fukushima), the backup systems are usually fairly redundant and tested as well. I know most datacenters I have been in test their generators periodically, something like every month or two. Unless there's a fairly large natural disaster, or someone sets off a very large bomb, backup power should be available for at least 24-48 hours. At that point, things could start breaking down because you have to start getting fuel shipped in, but after last night's storm, power should have been up in a matter of less than an hour to those sites. It's not like Virginia is serviced by Pepco.

    2. Re:it seems like the switching system failed by Relayman · · Score: 1

      Natural gas fails when there is an earthquake. Depending where your data center is located, diesel may be a better choice.

      --
      If I used a sig over again, would anyone notice?
    3. Re:it seems like the switching system failed by Deekin_Scalesinger · · Score: 1

      Hehe - normally I'd agree, but Pepco did all right last night as far as I am concerned. I flickered for about 2 seconds last night, but I'm in the downtown Capitol Wastelands - I don't know what grid I'm on but it seems to be a good one. Oh - and yay for personal UPSes! They did what they should have done.

      --
      "As the intrepid kobold companion continues his journey, he begins to wonder... if priests raises dead, why anybody die?
    4. Re:it seems like the switching system failed by drinkypoo · · Score: 1

      Natural gas fails when there is an earthquake.

      Natural gas generators (or even fuel cells) are commonly used within city limits for a broad number of reasons. First and foremost, you're not permitted to store quantities of flammables in most cities. Another is that the emissions are relatively benign.

      OUTSIDE of a city, you can use a propane generator, which can be a converted gasoline generator if you prefer. You can even convert one to be dual-mode so it will run on either gasoline or propane, but that's quite a bit more work. Common dual-mode generators run on natural gas or propane, which practically is a bit like saying high or low octane. It takes only minor changes to convert an appliance from one to the other. (Your car can run on one or the other with a timing change...)

      --
      "You're right," Fisheye says. "I should have set it on 'whip' or 'chop.'"
    5. Re:it seems like the switching system failed by twisted_pare · · Score: 1

      And don't forget, this is the second time in two weeks this happened at this data center. Bezos is going to have some heads on Monday. Funny though, Google learned this lesson long ago. Forget the $10M of UPSes. Strap an emergency exit light lead-acid cell to each server/switch (DC->DC, nice). If you lose power, the server is good for 10-20min while the building cuts over. Otherwise 1s or 10min outage does not matter. You're still performing a cold startup on a massive system. Good luck with that. Oh yeah, and they're still not out of the woods yet. http://status.aws.amazon.com/rss/ec2-us-east-1.rss

      --
      HTFU
  11. Wasn't even a big storm by gman003 · · Score: 4, Informative

    I was in it - it was not a particularly bad storm. Heavy winds, lots of cloud-to-cloud lightning, but very little rain or cloud-to-ground lightning. I lost power repeatedly, but it was always back up within seconds. And I'm located way out in a rural area, where the power supply is much more vulnerable (every time a major hurricane hits, I'm usually without power for about a week - bad enough that I bought a small generator).

    According to TFA, they were only without power for half an hour, and that the ongoing problems were related to recovery, not actual power-lossage. So their problems are more "bad disaster planning" than "bad disaster".

    Still, you'd think a major data center would have the usual UPS and generator setup most major data centers have - half an hour without power is something they should have been able to handle. Or at least have enough UPS capacity to cleanly shut down all the machines or migrate the virtual instances to a different datacenter.

    1. Re:Wasn't even a big storm by Anonymous Coward · · Score: 0

      Just because your area wasn't affected does not mean anything. At least 12 people died.
      http://www.usatoday.com/weather/storms/story/2012-06-30/storms-power-eastern-us/55936366/1?csp=hf

    2. Re:Wasn't even a big storm by Anonymous Coward · · Score: 1

      As reported on Wunderground, the storm was a derecho, a storm with a track at least 240 miles long with winds above 58 mph. The derecho started in Northwest Indiana, and tracked all the way to offshore Delaware and New Jersey. A bunch of places had 80+ mph gusts. So exactly how bad the storm was depended on how much wind one got.
      Now the issue of data center protection and backup, the question is where else along the route were data centers, and did they survive?

    3. Re:Wasn't even a big storm by Anonymous Coward · · Score: 0

      Yeah it only killed 10 people, pretty minor stuff. I used to walk 10 miles in worse storms up hill both ways, every day. You younguns have it easy these days.

    4. Re:Wasn't even a big storm by Anonymous Coward · · Score: 0

      Why not 13?

    5. Re:Wasn't even a big storm by CptNerd · · Score: 1

      I was in it, and barely missed getting hit by multiple tree branches of the 6+inch diameter variety as I drove the final half-mile home. I lost power long enough to make my UPS whine, but that was before I got there. Every street around me had branches down, some completely blocking main streets. I live in Alexandria near Potomac Yard, got hit by the weather driving through Shirlington.

      I had managed to not be out in a storm of this size before; usually I stay in or get back before one hits (I'm a weather junkie, wunderground.com rules!), so I hadn't ever seen big chunks of tree fall around me, nor had my car been hit by 70 MPH winds.

      I have to give the SFX guys in "Twister" credit, they were pretty darn accurate, at least about the storms leading up to tornadoes.

      --
      By the taping of my glasses, something geeky this way passes
    6. Re:Wasn't even a big storm by gmhowell · · Score: 1

      I was in it - it was not a particularly bad storm. Heavy winds, lots of cloud-to-cloud lightning, but very little rain or cloud-to-ground lightning. I lost power repeatedly, but it was always back up within seconds. And I'm located way out in a rural area, where the power supply is much more vulnerable (every time a major hurricane hits, I'm usually without power for about a week - bad enough that I bought a small generator).

      According to TFA, they were only without power for half an hour, and that the ongoing problems were related to recovery, not actual power-lossage. So their problems are more "bad disaster planning" than "bad disaster".

      Still, you'd think a major data center would have the usual UPS and generator setup most major data centers have - half an hour without power is something they should have been able to handle. Or at least have enough UPS capacity to cleanly shut down all the machines or migrate the virtual instances to a different datacenter.

      Me, me, me, me, me!

      Your post brings this to mind.

      BTW, having lived in rural areas in MD, they were far less likely to be victims of the weather than built up areas (where I currently live in MD). The rural electric co-ops are MUCH better about preventative maintenance than the for profit companies. Further, one minor outage in Montgomery Co. (where I currently am) can put hundreds or thousands of times more people without power than a similar outage in St. Mary's Co.

      In short, get your head out of your ass, and understand that the world is bigger than the view from your mom's basement would lead you to believe.

      --
      Jesus was all right but his disciples were thick and ordinary. -John Lennon
  12. Poorly run datacenter by nurb432 · · Score: 0

    If they don't have proper backup generators, they have no business running a data center.

    --
    ---- Booth was a patriot ----
    1. Re:Poorly run datacenter by turbidostato · · Score: 1

      "If they don't have proper backup generators, they have no business running a data center."

      *Or* they are in a business that recognizes that shit happens, even at the datacenter level, and provides services so you can spread your load across more than one datacenter, making the x10 expenditure needed to go from a "decent" datacenter to a "top notch" one moot and avoidable.

      Hey, doesn't that look like this funny "cloud" concept they are waving around so often?

    2. Re:Poorly run datacenter by nurb432 · · Score: 0

      No, if you are a professional, stuff doesn't 'happen', and downtime like this is unacceptable. Using your suggestion, if they were relying on an off-site type of DR plan, it apparently was not up to the task, so they still fail. Effort and good intentions don't count.

      Where I'm at, even if our primary site was wiped out by a nuclear bomb we would be down less than a minute. If our power goes out, we have zero downtime. (Someone will have to go find some diesel if we are without mains for more than 48 hours, tho.) Extended downtime is NOT an option here.

      They are still amateurs, and this shows they can't be trusted to play with the big boys yet.

      --
      ---- Booth was a patriot ----
    3. Re:Poorly run datacenter by turbidostato · · Score: 2

      "No, if you are a professional stuff doesn't 'happen'"

      No, if you are a professional you evaluate risks and adjust your behaviour to an acceptable level, and you don't spend a bazillion to protect half a bazillion.

      For example, Google designed their applications to tolerate a failing server: what's the benefit, in their case, of going with RAID10, doubled PSUs and hot-swappable RAM and CPUs? What does that bring to the table but lost money?

      Amazon offers people outside the Fortune 100 the ability to do the same, only at the datacenter level. But then, if you can withstand a whole datacenter failure by properly using the services they offer, what's the advantage of spending to make their datacenters five nines instead of four?

      "They are still amateurs"

      They are there for the money and they are making a lot of money: that's what makes them professionals.

      I'll tell you who's being unprofessional: all those who think that their critical services are properly protected within a single datacenter just because they read it was "the cloud" in a colourful brochure.

    4. Re:Poorly run datacenter by nurb432 · · Score: 1

      Well, this is America, you are welcome to your belief, even if it's horribly wrong.

      --
      ---- Booth was a patriot ----
    5. Re:Poorly run datacenter by Anonymous Coward · · Score: 0

      You have exceptionally low standards. Not that you'll ever do anything substantial, but remind me to never use any of your services on the off chance that you do.

    6. Re:Poorly run datacenter by rahvin112 · · Score: 1

      I personally think it's funny that people would even say that (if you're a professional stuff doesn't happen BS). As someone who works in the infrastructure business I can tell you with 100% certainty that no design, location or setup will be perfect. Regardless of how well you plan, you are one natural disaster away from a service interruption, and any single point in the system can be taken down by some guy in a backhoe digging where he shouldn't.

      Even if you designed a data center with 100 layers of redundancy on power and connectivity, there is a damn good chance that all those communication, power and other lines go through a single point somewhere miles away from the data center, probably where they all cross the interstate or a river. Infrastructure just doesn't have that much redundancy, and in the real world there are lots of places where there is a single crossing, be that a river, interstate or any other property or natural feature that restricts access. So one guy in a backhoe digging where he shouldn't can do things like take out a whole city's power and communication lines. It's not common but it does happen.

    7. Re:Poorly run datacenter by turbidostato · · Score: 1

      "You have exceptionally low standards."

      No, I don't.

      I have standards tied to reality and I know Amazon offers[1] "a Monthly Uptime Percentage [...] of at least 99.9% during any monthly billing cycle". On top of that, I know what the value of an SLA exactly is.

      [1] http://aws.amazon.com/s3-sla/
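
      For a sense of what that percentage means in practice, here is a rough sketch of the arithmetic, assuming a 30-day billing cycle (the function name and the 30-day default are illustrative, not taken from the SLA text). It converts an uptime percentage into the minutes of downtime it still permits.

```python
def allowed_downtime_minutes(uptime_percent: float, days: int = 30) -> float:
    """Minutes of downtime permitted per billing cycle at a given uptime percentage.

    Assumes a cycle of `days` full days; real billing cycles vary in length.
    """
    total_minutes = days * 24 * 60
    return total_minutes * (1 - uptime_percent / 100)


if __name__ == "__main__":
    for label, pct in [("three nines", 99.9), ("four nines", 99.99), ("five nines", 99.999)]:
        minutes = allowed_downtime_minutes(pct)
        print(f"{label} ({pct}%): {minutes:.2f} minutes per 30-day cycle")
```

      Under that assumption, 99.9% leaves a budget of roughly 43 minutes per month. The figure quoted above is from the S3 SLA, so it does not necessarily apply to the compute instances discussed elsewhere in the thread.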

    8. Re:Poorly run datacenter by turbidostato · · Score: 1

      "I personally think it's funny that people would even say that (if yoru a professional stuff doesn't happen BS). As someone who works in the infrastructure business I can tell you with 100% certainty that no design, location or setup will be perfect."

      Truly. In fact, the professional is the one who knows that shit happens, how often a certain kind of shit recurs, how it will impact the business and what the best countermeasure is to get the best bang for the buck: sometimes it's keeping the shit from happening, sometimes it's taking measures so the shit can happen without affecting the business, and sometimes it's letting the shit happen and just crossing your fingers that it doesn't happen on your watch. Are you really covering a global nuclear war scenario, really?

      It's easy to say "this wouldn't happen to me" ...when you are not in a position that this could happen to you.

    9. Re:Poorly run datacenter by zero0ne · · Score: 1

      It isn't Amazon's job to set up a DR plan for their entire datacenter. That is the customer's job. You can pay for Amazon support and they will gladly help you set up a DR plan for your setup, but there is no way whatsoever that it is their responsibility.

      They provide you with an easy way to rent their hardware from 1 or more locations, and you can build in redundancy into your own system.

  13. with cable the nodes need power and there batterie by Joe_Dragon · · Score: 1

    With cable, the nodes need power, and their batteries will run down; then the cable co needs to have on-site portable generators at the nodes with no power.

    The phone systems have RTs (fewer of them than cable systems) that are the same way.

  14. How many private clouds went down? by Anonymous Coward · · Score: 1

    Amazon is a huge target - but how many other data centers went down in the Virginia area also? Did they come back up as fast as Amazon?

    And Netflix is an Amazon Cloud customer... What's the matter with them? Are they just too dumb to host in house?

  15. Re:with cable the nodes need power and there batte by ILongForDarkness · · Score: 1

    Why exactly would a cable operator bother with backup power? I mean, if the neighborhood has no power then people aren't running TVs or computers (except laptops, but their modem would still be down). It is probably a different beast with something the size of an Amazon datacentre though; they can probably go to the ISP and say "hey look, we'll buy 5M a month of internet from you but we need redundancy. Piss on all your home users for all we care, but we get internet no matter what."

  16. double hit by slashmydots · · Score: 0

    Not only did I get annoyed for like 3 whole minutes last night at the tail end of the Netflix downtime, but I also couldn't download an important software patch from a vendor on Friday because it was hosted on the Amazon cloud download service thing. By the way, Netflix apparently doesn't have a damn thing for single point of failure adaptation, seeing as how their entire website itself was down and wouldn't even respond to a ping. They can't even load a freaking "sorry, we're having problems" page on a backup host? Yeah, real adaptive. Oh and good call hosting your site on the same service that your videos stream from. That's really smart.

    How did this happen anyway? The cloud is magic...MAGIC!!!!!! You cannot destroy magic! It must have been a dark wizard. That or all the cloud product salesmen are full of shit.

  17. Re:with cable the nodes need power and there batte by Joe_Dragon · · Score: 1

    Well, there are long runs from the headend to each neighborhood, so some areas may have power but hours later the cable goes out, as the lines pass through areas that don't have power.

  18. My instance was down for 9hrs... by geekymachoman · · Score: 2

    Which is the problem. Not the power outage itself.
    If the power outage happened, and the servers were back let's say ... in 30 minutes, 1 hour... alright, but 9 freakin' hrs ?

    In my specific case I didn't suffer as much because I have another instance in a different zone with db replication and all that, serving as a backup server, and my project there, although very critical (20 people are getting wages out of it), is very low on resource usage... I can imagine there were quite a lot of people who lost quite a lot of money because of this. It's really unacceptable for a DC to have 9 hrs of downtime, whatever the reason is... because.. that's just the standard people are used to.
    I never experienced anything like this at any other company in the 10 years I've been working as a Linux admin.. although at all those companies, I used real servers.

    1. Re:My instance was down for 9hrs... by PTBarnum · · Score: 4, Interesting

      There is a gap between technical and marketing requirements here.

      The Amazon infrastructure was initially built to support Amazon retail, and Amazon put a lot of pressure on its engineers to make sure their apps were properly redundant across three or more data centers. At one point, the Amazon infrastructure team used to do "game days" where they would randomly take a data center offline and see what broke. The EC2 infrastructure is mostly independent of retail infrastructure, but it was designed in a similar fashion.

      However, Amazon can't tell their customers how to build apps. The customers build what is familiar to them, and make assumptions about up time of individual servers or data centers. As the OP says, it's "the standard people are used to". Since the customer is always right, Amazon has a marketing need to respond by bringing availability up to those standards, even though it isn't technically necessary.
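
      The "game day" exercise described above can be approximated on paper. What follows is a minimal sketch, assuming a hypothetical placement map from service name to the set of data centers hosting a replica; `placement`, `game_day` and the `dc-*` labels are made up for illustration and are not Amazon's tooling. It takes each data center offline in turn (rather than at random) and lists the services left with no replica.

```python
from typing import Dict, List, Set


def services_broken_by_outage(placement: Dict[str, Set[str]], offline: str) -> List[str]:
    """Return the services that have no replica left once `offline` is removed."""
    return sorted(
        service
        for service, data_centers in placement.items()
        if not (data_centers - {offline})
    )


def game_day(placement: Dict[str, Set[str]]) -> Dict[str, List[str]]:
    """Simulate losing each data center in turn and report what breaks."""
    all_data_centers = sorted(set().union(*placement.values()))
    return {dc: services_broken_by_outage(placement, dc) for dc in all_data_centers}


# Hypothetical placement: "batch" lives in a single data center.
placement = {
    "web": {"dc-a", "dc-b", "dc-c"},
    "db": {"dc-a", "dc-b"},
    "batch": {"dc-a"},
}

report = game_day(placement)
for dc, broken in report.items():
    print(dc, "->", broken if broken else "nothing breaks")
```

      A real exercise would also need to account for capacity and data consistency after the failover, which a set-membership check like this ignores.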

    2. Re:My instance was down for 9hrs... by Anonymous Coward · · Score: 0

      If the power outage happened, and the servers were back let's say ... in 30 minutes, 1 hour... alright, but 9 freakin' hrs ?

      Have you ever tried to restart an entire data center before?

  19. Re:with cable the nodes need power and there batte by Alex+Zepeda · · Score: 2

    Why exactly would a cable operator bother with backup power?

    Because that cable operator also provides phone service.

    --
    The revolution will be mocked
  20. uh forgot something important by drinkypoo · · Score: 1

    whoops, I forgot to say OUTSIDE of a city you can use a propane generator FROM A PROPANE TANK. Which, of course, means it can still function after a 'quake. And if you live in someplace where it's legal to have a tank AND where you can get city gas, you can get the best of both worlds.

    --
    "You're right," Fisheye says. "I should have set it on 'whip' or 'chop.'"
  21. Re:not just netflix, and not just "electrical stor by Onuma · · Score: 1

    You lucked out, then. I've driven around Fairfax, Arlington and PG counties as well as DC today. I haven't seen a major road without some kind of debris blocking it, nor an area which has 100% power restored at this point.

    This was a bad storm, but could certainly have been far worse. Even still, the grocers and stores are out of ice and people are swarming out of their homes like rats abandoning ship in some areas. These same people would be fucked if the S really HTF.

    --
    What else can happen when an unstoppable force collides with an immovable object?
  22. What Datacenter? by Anonymous Coward · · Score: 0

    What datacenter was this? Was it a private Amazon datacenter or was it someone else's?

    What real datacenter can't operate for a week without power? That's ridiculous!

  23. Inevitable by Anonymous Coward · · Score: 0

    All my data and media are still accessible. I never swallowed the cloud Kool-aid, though.

  24. Defined My Saturday Morning by pgn674 · · Score: 1

    My company uses Amazon Web Services to host some of our product, and I got a call at 7 am to help bring our stuff back up. A bunch of our instances were stopped, and a bunch of Elastic Block Store volumes were marked Impaired. We're working on making our environment more "cloudy" to make better use of multiple availability zones, regions, and automation to better survive an outage like this, but we're not there yet.

  25. Shitty is the new Acceptable by gelfling · · Score: 2

    Didn't you get the memo? Netflix barely runs now and this is working as planned. Time Warner had four internet outages in Raleigh THIS WEEK.

    Everything everywhere is slowly grinding to a halt. So let's send more work to China and India. Who cares anymore.

    1. Re:Shitty is the new Acceptable by twisted_pare · · Score: 1

      I knew something was up in Raleigh. I think I was online for every one of those outages. Too bad we don't get an SLA with TWC?

      --
      HTFU
  26. Positive spin by wonkey_monkey · · Score: 1

    We don't have downtime. We have "uptime problems."

    --
    systemd is Roko's Basilisk.
    1. Re:Positive spin by Anonymous Coward · · Score: 0

      It's not a problem, it's an opportunity!

  27. Caught-up in Semantics by Anonymous Coward · · Score: 0

    It seems to me the real problem here is an understanding of exactly what the term 'cloud' implies (and doesn't). As is evident from comments across the web, cloud implies automated availability, redundancy, scalability, and management. Unfortunately, this is where the primary misunderstanding with AWS seems to occur: yes, it does provide all 4 of those features, but it provides them at the geographic region level (aka 'zones'). Whether this misunderstanding is a result of Amazon marketing efforts or not, I cannot say, and I'm honestly too lazy right now on a Saturday to find citable sources.

    In any case, as I've observed, many users are quick to note that AWS best practices recommend using boxes across multiple regions (zones) to prevent service interruption in an event such as this. I have no idea what percentage of AWS users actually follow that best practice, but judging from the number of websites and services encountering problems, it seems to be somewhat low (full disclosure: completely my opinion here). Anyway, Amazon obviously has a huge PR and marketing problem here, which it seems they can address in one of two ways. Either better educate users and mandate that they follow the "optional/recommended/gonna-be-mandatory-soon" best practices, or take the choice out of users' hands and make the system do it on its own. The latter would better match the term 'cloud' and all that it implies...

  28. What? by Anonymous Coward · · Score: 0

    Oh, sure, you can have gov't come up with work projects, but none of them will be sustainable and useful

    How do you figure highways and freeways to not be useful? Even if you don't drive on them, you use products that are transported on them. I don't see Coca-Cola or WalMart funding road building projects.

  29. Critical load loss in a data center like Amazon? by Anonymous Coward · · Score: 0

    This boggles the mind. I work in the power backup industry, and data centers of the scale of Amazon's have redundancies on top of redundancies. Somebody isn't doing their job if they lost the critical load in a thunderstorm.

  30. Click by codepunk · · Score: 1

    To migrate Click Here!

    At least for those that have a DR migration plan.

    --
    Got Code?
  31. Stupid: Military is Insurance by Anonymous Coward · · Score: 3, Insightful

    What are you, 14? Democracies don't like War, because they don't like their sons, fathers, brothers, and husbands getting killed. It generally takes quite a lot to motivate Democracies into war, because of the hatred of casualties. Even when it is the best option. Example: going to war against Hitler in 1934, or 1936, or in 1938.

    Out here in the real world, the sum total of human experience suggests a strong military is like insurance or a seat belt. You hope you never have to use it, but it's a godsend if you need it. Indeed, having a strong military deters attacks. Nobody goes down to Venice Beach to pick fights with body builders, or down to the Gracie's gym to start fights.

    Like insurance, working out, eating right, avoiding bad areas, a strong military is a pain in the ass. It costs a lot. It is a pain and non-productive to maintain. And sure, you could save a lot by going without auto or health insurance. You could eat more cheaply at McDonalds than cooking healthy meals at home. It's cheaper to live in the ghetto than a nice area.

    As far as market value of defense stocks, the market capitalization of Lockheed Martin is 28.27 Billion, of Apple Computer 546.08 Billion. The market value of L'Oreal at 54.83 billion is about twice that of Lockheed Martin, suggesting lipstick pays a lot more than military avionics. Defense firms since their inception have been very cyclical, made relatively little money, and are merging like crazy as war spending winds down. But unless you're going to change human nature with Harry Potter's magic wand, carrying otherwise unprofitable defense firms is worth it because making drones, airplanes, missiles, tanks, ships, and helicopters to kill well-armed enemies is a very narrow engineering niche with knowledge quickly lost.

    As soon as your computer runs on unicorn farts and rainbows, we can all forget about dominance in the Persian Gulf and other oil areas. Until then, I'd prefer to drive to work and run the AC not live like a dirty smelly hippie. That AC making life bearable in 118F Kansas? Runs on oil not tree-hugging and drum circles.

    1. Re:Stupid: Military is Insurance by Anonymous Coward · · Score: 0

      Why hasn't this been modded up? I wish I had mod points...

    2. Re:Stupid: Military is Insurance by MrL0G1C · · Score: 1

      What are you, 9?

      https://secure.wikimedia.org/wikipedia/en/wiki/USA_wars

      Don't tell me democracies don't like wars, USA is a massive warmonger, if that psycho Romney gets in you'll probably go to war with Iran.

      --
      Waterfox - a Firefox fork with legacy extension support, security updates and better privacy by default.
    3. Re:Stupid: Military is Insurance by Anonymous Coward · · Score: 0

      If that's the way you really think, don't be whining when people disagree about handing their resources to the driver of the military machine.

    4. Re:Stupid: Military is Insurance by rubycodez · · Score: 1

      How does the $600 billion for defense budget, $120 billion for ongoing "wars", fit into your comparisons? We have over ten times the military needed for any "insurance"; their purpose is armed robbery, coercion, genocide.

  32. Re:with cable the nodes need power and there batte by timeOday · · Score: 1

    I have a UPS for my cable modem, router, Ooma box, and wireless phone so VOIP will still work in an outage, if the cable signal is up (i.e. even with my computer turned off). Whether I can actually expect the cable to be up in an outage, I have no idea.

  33. Still think supporting things using water vapor... by Anonymous Coward · · Score: 0

    is a good idea? If I live in Vancouver, Canada, and I'm visiting you online, and you're a company based in Manchester, England, and a storm in Virginia, United States results in me not being able to access your site, something is very wrong with someone's business model. I'm the customer, so it's not me... who does that leave?!?

    Clouds by their nature are insubstantial when it comes to being a solid footing upon which to build a business, maybe this will be the wakeup call people need to pull their heads out of the cloud, and their bums, and host their own content, rather than outsourcing it somewhere else.

    Also worthy of note, a thunderstorm is more powerful than Anonymous, which you'll recall tried taking down Amazon over the Wikileaks donation stoppage thing, and had to admit they couldn't. Just saying.

  34. cableTV fail just means you're not entertained by Anonymous Coward · · Score: 0

    Bummer.. no entertainment. If the outage lasts more than 24 hours, they'll credit you 1/30th of the month's fees.

    That's why you shouldn't be buying your mission-critical network services from an entertainment company. The mindset is wrong.

  35. Re:not just netflix, and not just "electrical stor by gmhowell · · Score: 1

    Meh, it's PEPCO for the most part. They wouldn't have been working Friday night anyway. Ought to be an interesting bit of discussion with the utility commission regarding their current desired rate hike.

    --
    Jesus was all right but his disciples were thick and ordinary. -John Lennon
  36. One data center is not a cloud by CargoCultCoder · · Score: 1

    I'm sorry, but if your service is taken down by a single data center failure, you are not using the cloud to its full potential. Data centers do go down, drop out of sight, or otherwise become unusable now and again. Plan on it, design for it, and use the tools available to manage it.
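
    One way to picture "design for it" is client-side failover across zones. This is only a sketch under stated assumptions: the endpoint hostnames are placeholders, and the health check is passed in as a function because the real probe (HTTP request, TCP connect, and so on) depends on the service. It returns the first endpoint in preference order that passes the check.

```python
from typing import Callable, Iterable, Optional


def pick_endpoint(endpoints: Iterable[str], is_healthy: Callable[[str], bool]) -> Optional[str]:
    """Return the first endpoint whose health check passes, or None if all fail."""
    for endpoint in endpoints:
        try:
            if is_healthy(endpoint):
                return endpoint
        except Exception:
            # A health check that raises is treated the same as an unhealthy endpoint.
            continue
    return None


# Hypothetical endpoints, one per zone, listed in order of preference.
ENDPOINTS = ["app.zone-a.example.com", "app.zone-b.example.com"]

# Simulate the first zone being down.
down = {"app.zone-a.example.com"}
chosen = pick_endpoint(ENDPOINTS, lambda endpoint: endpoint not in down)
print(chosen)
```

    This only covers routing requests; keeping data replicated between zones so the second endpoint has something to serve is a separate problem.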

  37. Re:with cable the nodes need power and there batte by ILongForDarkness · · Score: 1

    And your ip phone is going to work when your house has no power?

  38. Re:with cable the nodes need power and there batte by afidel · · Score: 1

    Yes, they are supposed to. That's why a VoIP cable modem has a battery in the unit, to ensure you can still communicate during normal power outages. If you're going to be without power for a week nothing short of a generator or POTS is going to help (ok some voice only cellphones can go a week in standby).

    --
    There are 4 boxes to use in the defense of liberty: soap, ballot, jury, ammo. Use in that order. Starting now.