Slashdot Mirror


EC2 Outage Shows How Much the Net Relies On Amazon

An anonymous reader writes "Much has been written about the recent EC2/EBS outage, but Keir Thomas at PC World has a different take: it's shown how much cutting-edge Internet infrastructure relies on Amazon, and we should be grateful. Quoting: 'Amazon is a personification of the spirit of the Internet, which is one of true democracy, access to the means of distribution, and rapid evolution.'" An article at O'Reilly comes to a similarly positive conclusion from a different angle.

31 of 147 comments

  1. Multiple Locations by WrongSizeGlass · · Score: 2, Informative

    Amazon has an option to have another Amazon location serve as the failover for your services. Yes, it costs more, but it does exactly what it's supposed to when this type of thing happens. If your backup/disaster recovery plan requires as close to 100% uptime as possible, you'll want to pay the extra for this type of protection.

  2. Clouds: Up in the air and foggy: by Hartree · · Score: 5, Insightful

    This article seems to be an apology for Amazon.

    Basically it says "We went down, and took down lots of important stuff. That shows just how important we are and that lots of people use us. Thus, our cloud is a good thing."

    The logic of that doesn't quite work.

    I agree that it's a useful tool, but there are a lot of things that don't make sense to put in the cloud.

    1. Re:Clouds: Up in the air and foggy: by WrongSizeGlass · · Score: 3, Informative

      I agree that it's a useful tool, but there are a lot of things that don't make sense to put in the cloud.

      I always feel better when anything that is mission critical is in-house. Cloud based (and regular internet based) services can become inaccessible for your business if you simply lose your internet connection - it doesn't require all of Amazon to bite the dust.

    2. Re:Clouds: Up in the air and foggy: by hawguy · · Score: 2

      I always feel better when anything that is mission critical is in-house. Cloud based (and regular internet based) services can become inaccessible for your business if you simply lose your internet connection - it doesn't require all of Amazon to bite the dust.

      But if having your application available to the outside world is mission-critical, you're almost always better off colocating it with providers in multiple physical locations.

      Even for internal apps that are necessary for your business, you may be better off outsourcing, since if your building catches on fire, you can send employees home to let them continue working. Few companies have the resources to build a truly redundant hosting infrastructure across multiple regions.

  3. Except they didn't work. by pavon · · Score: 4, Informative

    A large number of the people experiencing this outage did pay for multiple availability zones, and it didn't help them.

    1. Re:Except they didn't work. by el_tedward · · Score: 5, Informative

      I guess what we should learn from this is to put your failover in separate regions, not separate availability zones?

    2. Re:Except they didn't work. by WrongSizeGlass · · Score: 5, Informative

      From the NYT article:

      Big companies that have decided to put crucial operations on Amazon computers are apt to pay up for the equivalent of computing insurance, analysts say. Netflix, the movie rental site, has become a large customer of the Amazon cloud. Most of its Web technology — customer movie queues, search tools and the like — runs in Amazon data centers.

      Netflix said it had sailed through the last couple of days unscathed. “That’s because Netflix has taken full advantage of Amazon Web Services’ redundant cloud architecture,” which insures against technical malfunctions in any one location, said Steve Swasey, a Netflix spokesman.

      Sounds like it worked for some.

    3. Re:Except they didn't work. by Guspaz · · Score: 4, Insightful

      Paying for multiple availability zones is not the same as paying for multiple locations. There are multiple availability zones in a single datacenter. Netflix got it right: they spread their infrastructure over multiple physical locations, and didn't suffer any downtime despite losing a significant chunk of their infrastructure; it was business as usual.

      Like anything else, cloud computing still requires you to decide how much redundancy you're willing to pay for. If uptime is that important to you, spreading your infrastructure out over multiple datacenters is a no-brainer.
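
      To make the zone-versus-region distinction concrete, here is a minimal sketch. The region and zone names and the `survives_region_outage` helper are made up for illustration and are not Amazon's API; it assumes a failure that affects every zone in one region, as described in this thread.

```python
from typing import List, Tuple

# Each placement is (region, availability_zone). All names are hypothetical.
Placement = Tuple[str, str]


def survives_region_outage(placements: List[Placement], failed_region: str) -> bool:
    """Return True if at least one instance lives outside the failed region."""
    return any(region != failed_region for region, _zone in placements)


# Two zones, but both in the same region: a region-wide failure hits both.
multi_zone = [("east", "east-a"), ("east", "east-b")]

# One instance in each of two regions: one side is outside the failure.
multi_region = [("east", "east-a"), ("west", "west-a")]

print(survives_region_outage(multi_zone, "east"))    # False
print(survives_region_outage(multi_region, "east"))  # True
```

      The check deliberately ignores the zone field: under the assumption above, only the region decides whether anything is left running.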

    4. Re:Except they didn't work. by hawguy · · Score: 2

      Do they even have the capability to spread someone out across different regions?

      Yes, you have full control over what region your instance runs in - some regions cost more than others; the East region is cheaper than the West region.

  4. Forget cloud computing! by stopacop · · Score: 2

    I'll stick to my setup of a dedicated server and virtual private servers across the globe rather than putting all my eggs in one basket with Amazon and "cloud computing"! It may be a little bit more in terms of operating costs, but it has true failover in the event of an outage!

    --
    http://www.stopacop.so -- You have rights. How about standing up for them before they go away?

  5. "Bailout" of the Cloud now! by ninejaguar · · Score: 2

    Otherwise, Amazon will become too big to fail.

    = 9J =

  6. Slightly more... but yeah by Anonymous Coward · · Score: 2, Insightful

    I guess that the major difference to traditional outsourced hosting is what you mentioned but didn't emphasize... The "scalable" part. If you normally spend X amount of resources (CPU time, memory, whatever) and might get a peak of 50X resources at some point, traditionally you would either constantly pay for a lot of resources that you didn't need for most of the time, or your service would crash during the peak. Cloud offers a lot more flexibility as you can pay based on what you use, not based on what you estimate you might need. Pretty useful for some things, though certainly overhyped (and because of the hype, some have reacted with the "It's useless!" attitude, which is just as wrong).
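
    A rough sketch of that X versus 50X arithmetic, with entirely made-up numbers: a 720-hour month, a 6-hour peak, and the same hypothetical price of 10 cents per unit-hour under both models. Real pricing differs between providers and between the two models, so treat this only as an illustration of the shape of the tradeoff.

```python
from typing import List


def fixed_cost_cents(peak_units: int, hours: int, cents_per_unit_hour: int) -> int:
    """Cost of keeping peak capacity provisioned for the whole period."""
    return peak_units * hours * cents_per_unit_hour


def usage_cost_cents(hourly_units: List[int], cents_per_unit_hour: int) -> int:
    """Cost of paying only for what each hour actually used."""
    return sum(hourly_units) * cents_per_unit_hour


# Hypothetical month: baseline X = 1 unit, except a 6-hour peak at 50X.
baseline, peak, hours, peak_hours = 1, 50, 720, 6
load = [baseline] * (hours - peak_hours) + [peak] * peak_hours
price = 10  # made-up price in cents per unit-hour

print(fixed_cost_cents(peak, hours, price))  # 360000
print(usage_cost_cents(load, price))         # 10140
```

    With these assumed figures, provisioning for the peak costs 360000 cents against 10140 cents for pay-per-use. The gap shrinks as the peak gets longer or the per-unit price of on-demand capacity gets higher.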

    Disadvantages are pretty obvious: Your data is at the hands of a third party.

  7. Re:SPF by jd · · Score: 2

    Apparently, because having just one party and no elections makes a democracy. And in later news, why Rupert Murdoch tapping everyone's phones is good for privacy.

    --
    It's a small world and it smells funny; I'd buy another if it wasn't for the money; Take back what I paid (SoM)

  8. Re:Why The Cloud? by mini+me · · Score: 4, Insightful

    The cloud represents a black box that abstracts the underlying network topology.

    You might send your data to a server in Germany and retrieve it from a server in the USA. When you put something in the cloud you do not have to worry about problems like this because the cloud provider already has a hot backup ready to take the slack in another part of the world. You don't need to know or care how it happens, it just works. S3 is an Amazon example of a cloud service. You send your file to S3 and Amazon takes the responsibility of ensuring that it is available even if a datacenter is blown to smithereens.

    EC2 and EBS are not the cloud. There is no abstraction of the datacenter. Amazon leaves it up to you to choose which datacenter you wish to work in. This can allow you to easily build a cloud application on top of their physical infrastructure, but it is up to you to make it "the cloud". We witnessed so many failures because the applications were not cloud applications, just standard hosted services.

  9. One negative... by RyanFenton · · Score: 2

    When there's a 'service' you'd like to block (such as adverts), amazon hosting can make it rather difficult to consistently block them using an IP blacklist, without also blocking potentially useful things too.

    Essentially though, they're just packaging the benefits of an economy of scale - things get cheaper the more you focus on larger supply, and thus they can make the most profits and cut off the most competition by scaling up so much with cheap prices. It's part of how companies from WalMart to Google compete so well.

    Economies of scale are also one part of why markets inherently fail over time - competition almost always favors those who scale up best, who can then leverage that power over competitors, preventing them from growing to the same extent, and breaking any meaning to the freedom of the market. At that point, competition becomes defined by who can serve WalMart's interest best.

    Ryan Fenton

  10. Where have I heard this before... by girlintraining · · Score: 4, Insightful

    Microsoft: We're sorry our product broke and a lot of people weren't able to get online.
    Slashdot: BURN THE HERETIC!
    Amazon: We're sorry our product broke and a lot of people weren't able to get online.
    Slashdot: It's okay. Here, have a cookie.

    --
    #fuckbeta #iamslashdot #dicemustdie

  11. Re:Short Memories by jd · · Score: 2

    One group taking down WikiLeaks doesn't really matter when it comes to democracy. Indeed, since choice is a part of democracy, one group is perfectly entitled to censor what they like, since one group is utterly insignificant. Indeed, that is how you identify democracies.

    The Internet is not democratic and hasn't been since deregulation. The Internet is a federation of dictatorships. You have no choices. If you live in an area where X runs the backbone, ALL ISPs without exception are mere window-dressing over X. They can't provide anything X doesn't pipe, they can't charge less than X charges them, they can't give you freedoms or rights X doesn't grant you. To claim you can choose another provider is like saying you can choose to buy Fords in different shades of absolute black. If you believe in such illusions and phantoms, I've a Golden Gate bridge you can buy.

    All we have here is an extension of that. Amazon has pwned the data centers, so choice has been eliminated. That's not democracy, that's dictatorship that's telling you it's democracy. It's as close to real democracy as Saddam Hussein's elections or the Tea Party.

    --
    It's a small world and it smells funny; I'd buy another if it wasn't for the money; Take back what I paid (SoM)

  12. Re:Outages by Overzeetop · · Score: 2

    So, in other words, this is exactly what people who use cloud services for mission critical data needed. It's exceptionally hard to learn good lessons from success, but failures are almost guaranteed to teach something. In this case, the community will understand the potential cost of a four-to-six-nines system without a backup. There is always a finite chance of failure.

    Still, it was only down for, what - a day? Remember Loma Prieta? WTC collapses? Things happen, and when they do everybody is down for a while. Compared to real disasters, that's pretty good.
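
    For a sense of scale, here is a small sketch of the "nines" arithmetic. It assumes a 365-day year and treats the outage as the only downtime in that year; the function names are just for illustration.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525600, ignoring leap years


def allowed_downtime_minutes(nines: int) -> float:
    """Yearly downtime budget for an availability of N nines (4 -> 99.99%)."""
    return MINUTES_PER_YEAR * 10 ** (-nines)


def availability_after_outage(outage_minutes: float) -> float:
    """Availability over one year if the only downtime was this outage."""
    return 1 - outage_minutes / MINUTES_PER_YEAR


for n in (4, 5, 6):
    print(n, "nines:", round(allowed_downtime_minutes(n), 2), "minutes/year")

print("one day down:", round(availability_after_outage(24 * 60) * 100, 3), "%")
```

    Under these assumptions four nines allows roughly 52.6 minutes of downtime a year, while a full day of downtime works out to about 99.73% availability, which is between two and three nines.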

    --
    Is it just my observation, or are there way too many stupid people in the world?

  13. Re:Why The Cloud? by tragedy · · Score: 4, Informative

    Hmm, considering how long "the cloud" has been a buzzword, doesn't it seem like an awful lot of unscheduled downtime if there have been enough events already for people to be claiming that they aren't given a fair shake by the media when they go down? After all, if the media have reported on it several times, it's happened several times. That's more unscheduled downtime than your typical web server gets in a few years.

    Perhaps if they hadn't gone with a word that means fuzzy, insubstantial and ephemeral to describe their services people wouldn't have the same reservations about it. Maybe it's also because IT people don't like their managers to say "I just heard about this neat new thing, let's abandon the system we have now to pursue this" against their advice, then have to deal with being screamed at by their managers later when everything is down and there's absolutely nothing they can do about it because they've effectively ceded all control to a third party service provider who has not managed, thus far, to establish themselves as particularly safe or reliable.

    The apologists whose articles are linked in this Slashdot story seem to think it's great that we're putting all of our eggs into the baskets of known basket droppers. Thus far I'm not impressed enough by these providers. Obviously, in order to do anything on the Internet, you have to rely on some sort of service provider, and even they have to rely on their peers. So obviously there's no way you can have total control. Nevertheless, you should still try to retain all the control you can over your own stuff.

  14. Made it Through Pretty Much Unscathed by ShipIt · · Score: 5, Informative

    Totally concur with others pointing out Amazon offers redundancy if you choose to use it.

    We had webservers, database (master/slave), and other services split across usa-east and usa-west.

    When usa-east started showing problems, we:
    *) Took the usa-east webservers out of round robin DNS (ttl 1hr)
    *) Verified the slave (in usa-west) was up to date, shut down the master (usa-east), and converted the slave to master.
    *) Updated all webservers to point to the new master.
    *) Cranked up new usa-west webservers / updated round robin DNS
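
    The four steps above can be sketched as a toy state change. This is only an in-memory model under stated assumptions: the `Deployment` fields, the `fail_over` function and the host names are hypothetical, and nothing here talks to DNS, a database, or Amazon.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Deployment:
    """Toy model of the setup described above; every name is hypothetical."""
    dns_pool: List[str] = field(default_factory=list)  # web servers in round robin DNS
    db_master: str = ""            # region holding the master
    db_slave: str = ""             # region holding the slave
    slave_caught_up: bool = False  # has replication been fully applied?
    web_db_target: str = ""        # region the web servers send writes to


def fail_over(d: Deployment, bad: str, good: str, new_web: List[str]) -> None:
    # 1. Take the failing region's web servers out of round robin DNS.
    d.dns_pool = [h for h in d.dns_pool if not h.startswith(bad)]
    # 2. Promote the slave only after verifying it is up to date.
    if d.db_slave != good or not d.slave_caught_up:
        raise RuntimeError("slave not ready; refusing to promote")
    d.db_master, d.db_slave = good, ""
    # 3. Point all web servers at the new master.
    d.web_db_target = good
    # 4. Add freshly started web servers in the good region to DNS.
    d.dns_pool.extend(new_web)


d = Deployment(
    dns_pool=["usa-east-web1", "usa-west-web1"],
    db_master="usa-east",
    db_slave="usa-west",
    slave_caught_up=True,
    web_db_target="usa-east",
)
fail_over(d, bad="usa-east", good="usa-west", new_web=["usa-west-web2"])
print(d.dns_pool)   # ['usa-west-web1', 'usa-west-web2']
print(d.db_master)  # usa-west
```

    The explicit check in step 2 mirrors the manual verification in the list: promoting a slave that has not caught up would lose writes, so the sketch refuses rather than guessing.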

    I believe Amazon offers mechanisms to do this automatically, or we could just always write our own failover scripts, but this is the tradeoff we made. We were willing to trade some service degradation by switching over manually in exchange for avoiding the pitfalls of false-positive detection. Very much an application specific tradeoff, not for everyone, but it worked for what we are doing.

    The key was to avoid putting all eggs in the usa-east basket and splitting up across usa-west, even though we incur additional bandwidth fees, i.e. master/slave replication transfer is full fee between regions.

    We were never concerned about cascading failures affecting multiple availability zones in a given region, nor did it matter for us - our redundancy requirement was geographical diversity, not partitions within a datacenter. We were thinking natural disaster, but the architecture covered us in this case as well.

    The coolest thing to me is just how quickly we were able to shuffle around these resources to avoid a problem area - a couple of hours. There's no way we could have done it so quickly with what we had before - a combination of our own colocated servers and VPS.

  15. Re:It also shows... by camperslo · · Score: 3, Insightful

    Amazon is a personification of the spirit of the Internet, which is one of true democracy, access to the means of distribution, and rapid evolution

    Spirit of the internet? Some, on seeing Amazon's passing judgement on Wikileaks, might think it more aligned with a certain corporate spirit than a spirit of the internet. If they really support democracy, which can't function properly with a poorly informed public, maybe they shouldn't be the ones to decide whether or not someone is a journalist.

    Hardware doesn't make spirit. What people are doing, and the thoughts that drive the choices made probably do.

    They are still contented to profit from the sale of books about WikiLeaks.

    http://www.amazon.com/Inside-WikiLeaks-Assange-Dangerous-Website/dp/030795191X

    http://www.guardian.co.uk/technology/2010/dec/11/wikileaks-amazon-denial-democracy-lieberman

  16. Re:Outages by pla · · Score: 3, Insightful

    Many .com websites were unnecessarily down for hours since nobody had thought to plan for an outage. I am sure quite a few architecture meetings were held the following day addressing disaster recovery.

    Y'know, call me crazy, but I didn't even notice the outage.

    I mean, yeah, I read about it on a number of sites (all still up and running just fine), but honestly can't say I tried to visit even a single site actually unavailable because of the downtime.

    I dunno, perhaps this mostly affected ad hosts and I didn't notice because I already block them?

  17. Re:Why The Cloud? by x*yy*x · · Score: 2

    People are unfairly giving cloud hosting a bad name here anyway - EC2 doesn't handle distributing your services in that way, and it's directly noted. You have to make sure you have backup locations set up in EC2. It costs more, but it's for situations like this. That is why Netflix didn't have any problems even while they were using EC2.

    If you're being stupid and taking shortcuts thinking you won't need that, well, it's your choice. You would do it with any kind of service anyway.

  18. lesson learnt by Anonymous Coward · · Score: 2, Insightful

    I was directly affected by this outage. Once i discovered that the issue was at amazon and not in my application, i restored from a previous snapshot, synced my application code, and associated my IP to a new instance in a functioning zone.

    Total downtime for me was probably just under an hour. And that's including my debugging time.

    Overall it wasn't the end of the world for me and i did discover I should make my redundancy setup run more frequently.

    Sure i lost a few sales, but in a way i look at this as an example of why I should be better prepared for such an occurrence.

    This still isn't as bad as when IBM pulled the wrong drives out of my server and wiped them.

  19. Re:My cloud is fine by thetoadwarrior · · Score: 3, Funny

    Of course. The botnet authors have a vested interest in keeping your system up.

  20. Re:Why The Cloud? by emt377 · · Score: 4, Informative

    Why is so much in the cloud? I've heard it touted in lots of marketing speak, but I've never worked with it.

    As someone who has never worked with the cloud (shocking, I know), what are the advantages and disadvantages?

    Is it basically just distributed scalable redundant web hosting run by a big company? So you're basically renting to avoid the start-up capital costs of those services and to put them in the hands of specialists, while you focus on your web apps?

    Or is it more?

    There's a big mix-up of lots of different concepts and ideas here, to the point that the questions you ask are impossible to answer.

    - EC2 is a vps-like virtual server provisioning service. You rent a virtual server instance by the hour. APIs exist for you to dynamically add and remove instances as needed. You create an image, then can fire up additional instances as you see fit. Someone like Netflix, for instance, can fire up streaming servers during peak hours then shut them down at off hours.
    - You can of course set up your own co-lo systems, but it will be provisioned 24/7 and will cost you more since it will be sized for peak capacity, and even during peak most of the servers will be idle much of the time due to random load variance. You can improve peak utilization by setting up your own virtual provisioning. But then you have ops costs, so unless you have a massive operational scale you'll find it cheaper to buy from AWS (or linode, rackspace, etc).
    - EBS is a logical volume service. You create a volume and mount it on an EC2 instance. Like with server instances, there are API calls to dynamically create EBS volumes. You can unmount it and move it to a different server in the same datacenter, so you could use them for instance to take backup snapshots or log analysis, or similar, in addition to simply being server storage. Of course you get to build or buy the software to do all these things yourself.
    - Server instances belong to groups, and have access controls set up among them. This allows you to create private 'backplane' interconnects, where some things like sql servers are only accessible to instances that are part of a group.
    - EIPs are elastic IPs, which are IPs you lease and can then assign to any of your server instances (usually ingress and point-of-contact servers). You can move them between virtual servers as you like, and obviously would typically map DNS to them. Servers will otherwise get anonymous IP addresses, meaning they get something arbitrarily assigned. They're reachable (if you wish) from the net at large, but aren't well-known points for your service.
    - AWS also provides a load distribution service. I've never used this actually; it never seemed to fit right.
    - S3 is a cloud service, meaning it has no deterministic ingress and egress. It's used for content distribution: writing is expensive, reading is dirt cheap. Content stored is automatically replicated and de-replicated as needed. You have no idea where it lives, in how many copies, and how it's backed up. SLAs make promises about availability.
    - Content distribution is a poster child cloud service example. Not all services will easily fit a cloud model. Many other services that have fit the model (mainly using mapreduce or the like) are batch processing based and more about massaging massive amounts of data than interactive end-user services.
    - Somewhat simplified, if your service can fit around a key-value store (even a sophisticated one like MongoDB), then it's a candidate for a cloud architecture.
    - There are plenty of providers of bits and pieces to do things like server monitoring, cost analysis, and automated/manual server provisioning. In fact, I'm getting into this business myself...
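
    As one illustration of the list above, the elastic IP idea can be pictured as a lookup table from a stable public address to whichever instance currently holds it. This is a toy model with hypothetical names (`ElasticIPTable`, the instance IDs), not the AWS API; the address comes from a range reserved for documentation.

```python
from typing import Dict


class ElasticIPTable:
    """Toy model: a stable address mapped to whichever instance currently holds it."""

    def __init__(self) -> None:
        self._owner: Dict[str, str] = {}

    def associate(self, ip: str, instance_id: str) -> None:
        # Re-associating simply overwrites the previous owner.
        self._owner[ip] = instance_id

    def resolve(self, ip: str) -> str:
        return self._owner[ip]


table = ElasticIPTable()
table.associate("203.0.113.10", "web-old")   # DNS points at this address
table.associate("203.0.113.10", "web-new")   # move it to a replacement instance
print(table.resolve("203.0.113.10"))         # web-new
```

    Because DNS keeps pointing at the same address, moving the address to a replacement instance needs no DNS change in this model.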

    A 'cloud' service is not a hosting service - it's a way to build things, a black-box mindset. There may be a well-defined point of contact (perhaps found via DNS), but beyond that everything is dynamic. The initial contact can redirect, either explicitly or implicitly. It's not like a 'hosting' service where you click a button and get a Joomla host. But it might be a viable way to implement such a hosting service.

  21. Re:My cloud is fine by RobertM1968 · · Score: 3, Insightful

    All my websites are fine, which is what my high profile clients expect.

    That's because we use Microsoft Windows Servers and Sql Databases.

    Really? I've found both such products to be unsuitable for the demand we put on such infrastructures - unless I throw a lot more hardware at them. With 1/20th the traffic, and 6% the userbase, our forums crawled on Windows Server and MSSQL Server. We switched to Apache and MySQL, and even running the greatly more database intensive (than the Windows solution we were provided) Simple Machines Forum, we need a lot less hardware than we previously did when we had so much less traffic.

  22. Re:It also shows... by nothings · · Score: 5, Insightful

    Don't forget the one-click patent. True democracy/spirit of the Internet my ass.

  23. Re:It also shows... by Adrian+Lopez · · Score: 2

    Clouds aren't the problem. It's contracting your cloud out to a third party that's the problem.

    Exactly. What I'd like to see is an open cloud platform that makes it easy to distribute nodes between multiple unrelated ISPs instead of all the servers being handled by a single monolithic entity such as Amazon or Google.

    --
    "In prison you just have to shut your eyes and take it. Here you have to shut your eyes and give it."

  24. And how much of the net, really ? by unity100 · · Score: 2

    rackspace.com, softlayer.com, hetzner.de -> most of the web is housed on big providers like these. personal, organization, and small business sites alike. these providers' main business is renting racks and servers, which are then used by hosts to rent to end customers.

    i don't know where this 'how much of the net relies on amazon became clear' bullshit comes from. are there any statistics to show for it? or are people unaware of what's going on outside their little window of expertise, so that they think that the amazon cloud, for some reason, has become the 'backbone' of the internet?

    really. where are the statistics? all i see is some random guy giving away some pdf by hosting it through amazon's cloud, and then proceeding to claim that the 'net' became too reliant on amazon ...

    really ....

  25. Re:It also shows... by ScrewMaster · · Score: 2

    Seeing how the internet is the cloud where else do you expect internet sites to go?

    No, if you throw a packet into the "cloud" known as the Internet, it usually comes out where you wanted it to go, and you don't need to know the path it took to get there. "Cloud computing" is an entirely different concept (or, rather, a set of somewhat related concepts that mean different things to different people.) The Internet just schleps data from here to there.

    --
    The higher the technology, the sharper that two-edged sword.