Slashdot Mirror


EC2 Outage Shows How Much the Net Relies On Amazon

An anonymous reader writes "Much has been written about the recent EC2/EBS outage, but Keir Thomas at PC World has a different take: it's shown how much cutting-edge Internet infrastructure relies on Amazon, and we should be grateful. Quoting: 'Amazon is a personification of the spirit of the Internet, which is one of true democracy, access to the means of distribution, and rapid evolution.'" An article at O'Reilly comes to a similarly positive conclusion from a different angle.

8 of 147 comments

  1. Multiple Locations by WrongSizeGlass · · Score: 2, Informative

    Amazon has an option to have another Amazon location serve as the failover for your services. Yes, it costs more, but it does exactly what it's supposed to when this type of thing happens. If your backup/disaster recovery plan requires as close to 100% uptime as possible, you'll want to pay the extra for this type of protection.

  2. Re:Clouds: Up in the air and foggy: by WrongSizeGlass · · Score: 3, Informative

    I agree that it's a useful tool, but there are a lot of things that don't make sense to put in the cloud.

    I always feel better when anything that is mission critical is in-house. Cloud-based (and regular Internet-based) services can become inaccessible for your business if you simply lose your Internet connection - it doesn't require all of Amazon to bite the dust.

  3. Except they didn't work. by pavon · · Score: 4, Informative

    A large number of the people experiencing this outage did pay for multiple availability zones, and it didn't help them.

    1. Re:Except they didn't work. by el_tedward · · Score: 5, Informative

      I guess what we should learn from this is to put your failover in separate regions, not separate availability zones?

    2. Re:Except they didn't work. by WrongSizeGlass · · Score: 5, Informative

      From the NYT article:

      Big companies that have decided to put crucial operations on Amazon computers are apt to pay up for the equivalent of computing insurance, analysts say. Netflix, the movie rental site, has become a large customer of the Amazon cloud. Most of its Web technology — customer movie queues, search tools and the like — runs in Amazon data centers.

      Netflix said it had sailed through the last couple of days unscathed. “That’s because Netflix has taken full advantage of Amazon Web Services’ redundant cloud architecture,” which insures against technical malfunctions in any one location, said Steve Swasey, a Netflix spokesman.

      Sounds like it worked for some.

  4. Re:Why The Cloud? by tragedy · · Score: 4, Informative

    Hmm, considering how long "the cloud" has been a buzzword, doesn't it seem like an awful lot of unscheduled downtime if there have already been enough events for people to claim that cloud providers aren't given a fair shake by the media when they go down? After all, if the media have reported on it several times, it's happened several times. That's more unscheduled downtime than your typical web server gets in a few years.

    Perhaps if they hadn't gone with a word that means fuzzy, insubstantial and ephemeral to describe their services people wouldn't have the same reservations about it. Maybe it's also because IT people don't like their managers to say "I just heard about this neat new thing, let's abandon the system we have now to pursue this" against their advice, then have to deal with being screamed at by their managers later when everything is down and there's absolutely nothing they can do about it because they've effectively ceded all control to a third party service provider who has not managed, thus far, to establish themselves as particularly safe or reliable.

    The apologists whose articles are linked in this Slashdot story seem to think it's great that we're putting all of our eggs into the baskets of known basket droppers. Thus far I'm not impressed enough by these providers. Obviously, in order to do anything on the Internet, you have to rely on some sort of service provider, and even they have to rely on their peers. So obviously there's no way you can have total control. Nevertheless, you should still try to retain all the control you can over your own stuff.

  5. Made it Through Pretty Much Unscathed by ShipIt · · Score: 5, Informative

    Totally concur with others pointing out Amazon offers redundancy if you choose to use it.

    We had webservers, a database (master/slave), and other services split across usa-east and usa-west.

    When usa-east started showing problems, we:
    *) Took the usa-east webservers out of round robin DNS (ttl 1hr)
    *) Verified the slave (in usa-west) was up to date, shut down the master (in usa-east), and converted the slave to master.
    *) Updated all webservers to point to the new master.
    *) Cranked up new usa-west webservers / updated round robin DNS
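
    The four steps above can be sketched as a small, ordered runbook. This is only a toy in-memory model, not anything from the original setup: the `Deployment` class, the `manual_failover` function, and the server names are all hypothetical, and no real AWS or DNS API is called.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Deployment:
    """Toy in-memory model of the setup; not a real AWS or DNS client."""
    web_servers: dict                 # server name -> region
    dns_pool: set                     # server names in round-robin DNS
    db_master_region: str
    db_replica_region: Optional[str]
    replica_lag_seconds: int          # how far the replica trails the master
    db_target: dict = field(default_factory=dict)  # server name -> DB region


def manual_failover(dep, bad_region, new_servers):
    """Apply the four steps in order, moving traffic off bad_region."""
    if dep.db_master_region != bad_region:
        raise ValueError("master is not in the failing region")

    # Step 1: take the failing region's web servers out of round-robin DNS.
    dep.dns_pool = {s for s in dep.dns_pool
                    if dep.web_servers[s] != bad_region}

    # Step 2: verify the replica is up to date, then promote it.
    if dep.replica_lag_seconds != 0:
        raise RuntimeError("replica is behind; promoting would lose writes")
    dep.db_master_region = dep.db_replica_region
    dep.db_replica_region = None      # no replica until one is rebuilt

    # Step 3: point all web servers at the new master.
    for name in dep.web_servers:
        dep.db_target[name] = dep.db_master_region

    # Step 4: start new web servers in the healthy region, add them to DNS.
    for name in new_servers:
        dep.web_servers[name] = dep.db_master_region
        dep.db_target[name] = dep.db_master_region
        dep.dns_pool.add(name)
    return dep


dep = Deployment(
    web_servers={"web-e1": "usa-east", "web-w1": "usa-west"},
    dns_pool={"web-e1", "web-w1"},
    db_master_region="usa-east",
    db_replica_region="usa-west",
    replica_lag_seconds=0,
)
manual_failover(dep, "usa-east", ["web-w2"])
print(sorted(dep.dns_pool), dep.db_master_region)
```

    Note that with a 1hr TTL, clients that cached the old DNS answer may keep hitting the removed servers for up to an hour after step 1.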

    I believe Amazon offers mechanisms to do this automatically or we could just always write our own failover scripts, but this is the tradeoff we made. We were willing to trade some service degradation by switching over manually in exchange for avoiding the pitfalls of false-positive detection. Very much an application-specific tradeoff, not for everyone, but it worked for what we are doing.
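
    For anyone who does want to automate the switch, one common way to reduce false positives is to require several consecutive failed health checks before failing over. The sketch below is an illustration under that assumption only; the function name `should_fail_over` and the threshold of 3 are hypothetical, and it is not a description of any Amazon mechanism.

```python
def should_fail_over(health_checks, threshold=3):
    """Return True once `threshold` consecutive health checks have failed.

    health_checks is an iterable of booleans, oldest first (True = healthy).
    A single healthy result resets the failure streak.
    """
    consecutive_failures = 0
    for healthy in health_checks:
        consecutive_failures = 0 if healthy else consecutive_failures + 1
        if consecutive_failures >= threshold:
            return True
    return False


# A flapping service never reaches three failures in a row.
flapping = [True, False, True, False, False, True]
# A sustained outage does.
outage = [True, False, False, False]

print(should_fail_over(flapping))  # False
print(should_fail_over(outage))    # True
```

    A higher threshold means fewer spurious failovers but a longer wait before a real outage is acted on, which is the same tradeoff described above in a different form.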

    The key was to avoid putting all eggs in the usa-east basket and splitting up across usa-west, even though we incur additional bandwidth fees, i.e. master/slave replication transfer is full fee between regions.

    We were never concerned about cascading failures affecting multiple availability zones in a given region, nor did it matter for us - our redundancy requirement was geographical diversity, not partitions within a datacenter. We were thinking natural disaster, but the architecture covered us in this case as well.

    The coolest thing to me is just how quickly we were able to shuffle around these resources to avoid a problem area - a couple of hours. There's no way we could have done it so quickly with what we had before - a combination of our own colocated servers and VPS.

  6. Re:Why The Cloud? by emt377 · · Score: 4, Informative

    Why is so much in the cloud? I've heard it touted in lots of marketing speak, but I've never worked with it.

    As someone who has never worked with the cloud (shocking, I know), what are the advantages and disadvantages?

    Is it basically just distributed scalable redundant web hosting run by a big company? So you're basically renting to avoid the start-up capital costs of those services and to put them in the hands of specialists, while you focus on your web apps?

    Or is it more?

    There's a big mix-up of lots of different concepts and ideas here, to the point that the questions you ask are impossible to answer.

    - EC2 is a VPS-like virtual server provisioning service. You rent a virtual server instance by the hour. APIs exist for you to dynamically add and remove instances as needed. You create an image, then can fire up additional instances as you see fit. Someone like Netflix, for instance, can fire up streaming servers during peak hours then shut them down at off hours.
    - You can of course set up your own co-lo systems, but they will be provisioned 24/7 and will cost you more since they will be sized for peak capacity, and even during peak most of the servers will be idle much of the time due to random load variance. You can improve peak utilization by setting up your own virtual provisioning. But then you have ops costs, so unless you have a massive operational scale you'll find it cheaper to buy from AWS (or Linode, Rackspace, etc).
    - EBS is a logical volume service. You create a volume and mount it on an EC2 instance. Like with server instances, there are API calls to dynamically create EBS volumes. You can unmount a volume and move it to a different server in the same datacenter, so you could use them for instance to take backup snapshots or run log analysis, or similar, in addition to simply being server storage. Of course you get to build or buy the software to do all these things yourself.
    - Server instances belong to groups, and have access controls set up among them. This allows you to create private 'backplane' interconnects, where some things like SQL servers are only accessible to instances that are part of a group.
    - EIPs are elastic IPs, which are IPs you lease and can then assign to any of your server instances (usually ingress and point-of-contact servers). You can move them between virtual servers as you like, and obviously would typically map DNS to them. Servers will otherwise get anonymous IP addresses, meaning they get something arbitrarily assigned. They're reachable (if you wish) from the net at large, but aren't well-known points for your service.
    - AWS also provides a load distribution service. I've never used this actually; it never seemed to fit right.
    - S3 is a cloud service, meaning it has no deterministic ingress and egress. It's used for content distribution: writing is expensive, reading is dirt cheap. Content stored is automatically replicated and de-replicated as needed. You have no idea where it lives, in how many copies, and how it's backed up. SLAs make promises about availability.
    - Content distribution is a poster child cloud service example. Not all services will easily fit a cloud model. Many other services that have fit the model (mainly using mapreduce or the like) are batch processing based and more about massaging massive amounts of data than interactive end-user services.
    - Somewhat simplified, if your service can fit around a key-value store (even a sophisticated one like MongoDB), then it's a candidate for a cloud architecture.
    - There are plenty of providers of bits and pieces to do things like server monitoring, cost analysis, and automated/manual server provisioning. In fact, I'm getting into this business myself...
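
    The point in the EC2 and co-lo bullets above, renting by the hour versus sizing for peak, can be made concrete with a little arithmetic. This is a rough sketch with made-up numbers: the load profile, the per-instance capacity, and the function name `instance_hours` are all hypothetical, and it ignores price differences between providers as well as ops costs.

```python
import math


def instance_hours(hourly_load, capacity_per_instance):
    """Compare instance-hours for elastic vs. fixed peak provisioning.

    hourly_load: requests/sec (or any load unit) for each hour of the day.
    capacity_per_instance: load one instance can serve, in the same unit.
    Returns (elastic_hours, fixed_hours).
    """
    needed = [math.ceil(load / capacity_per_instance) for load in hourly_load]
    elastic_hours = sum(needed)                   # pay only for what runs
    fixed_hours = max(needed) * len(hourly_load)  # peak-sized fleet, 24/7
    return elastic_hours, fixed_hours


# Hypothetical day: 18 quiet hours at load 100, 6 peak hours at load 900,
# with each instance able to serve a load of 100.
load = [100] * 18 + [900] * 6
elastic, fixed = instance_hours(load, capacity_per_instance=100)
print(elastic, fixed)  # 72 216
```

    With this invented profile, the peak-sized fleet burns three times the instance-hours of the elastic one; a flatter load curve shrinks that gap, which is why the comparison depends so heavily on the service.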

    A 'cloud' service is not a hosting service - it's a way to build things, a black-box mindset. There may be a well-defined point of contact (perhaps found via DNS), but beyond that everything is dynamic. The initial contact can redirect, either explicitly or implicitly. It's not like a 'hosting' service where you click a button and get a Joomla host. But it might be a viable way to implement such a hosting service.