Amazon EC2 Crash Caused Data Loss
Relayman writes "Henry Blodget is reporting that the recent EC2 crash caused permanent data loss. Apparently, the backups that were being made were not sufficient to recover the lost data. Although a small percentage of the total data was lost, any data loss can be bad to a Website operator."
... the confusion of ideas that would lead someone to treat their live web server as their primary/master data repository.
I guess I'm still stuck in Commodore 64 World, or something..
Was the lost data... all the stuff the PSN network lost? I think I see a connection!
There's a spot in User Info for World of Warcraft account names? Really?
EC2 is not meant to be used for data storage, that is what S3 is designed for. You store data and backups on S3, and use EC2 to serve high bandwidth websites to the masses.
Cloud applications hosted on Amazon survived this incident without issue, as expected. Only the regular old hosted applications had problems with the outage. They were never "the cloud" to begin with, so I'm not sure why the term even comes up in this discussion.
The cloud represents a black box that hides the underlying network topology so that there are no single points of failure. Cloud applications are fault tolerant because they are spread through different datacenters at multiple points in the world. A catastrophe at one or more datacenters will have no noticeable effect on the availability of a cloud application because it continues to run in many more.
Amazon offers a few cloud applications: S3 comes to mind. But Amazon's EC2/EBS hosting service is a plain old hosting service like any other. The EC2 topology is not hidden away from you. You have to make active decisions about where you want your EC2 instance to live. That goes against the idea of the cloud. What Amazon does offer in EC2 are the tools necessary for you to build a cloud application, but not everything hosted on EC2 is a cloud application by default.
Guess Wikileaks feels good about not being hosted there anymore.... their critical information could have been "lost" as well....
I think people miss the point of the cloud - saying the cloud is worthless because it "brings people that would otherwise have nothing against you trying to take down your server" is like saying that the internet is worthless because it opens up security risks.
I for one am glad to be connected, and obviously so are many others. Don't use services that aren't good for you - there are some cloud based services that are great, and some that aren't. It's pretty clear that in the future, things will be more connected, not less - adapt and take advantage of the good parts, the rest will fade anyway.
From: http://aws.amazon.com/es/ec2/
Availability Zones are distinct locations that are engineered to be insulated from failures in other Availability Zones and provide inexpensive, low latency network connectivity to other Availability Zones in the same Region. By launching instances in separate Availability Zones, you can protect your applications from failure of a single location.
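To make the quoted advice concrete, here is a minimal sketch of round-robin placement across zones, plus a check of what survives when one zone fails. It is plain Python and does not call any AWS API; the zone names, instance names, and both helper functions are hypothetical and only illustrate the idea.

```python
from itertools import cycle


def spread_across_zones(instance_names, zones):
    """Assign each instance to a zone in round-robin order."""
    if not zones:
        raise ValueError("at least one zone is required")
    zone_cycle = cycle(zones)
    return {name: next(zone_cycle) for name in instance_names}


def survivors(placement, failed_zone):
    """Return the instances that are not in the failed zone, sorted by name."""
    return sorted(name for name, zone in placement.items() if zone != failed_zone)


# Hypothetical zone and instance names, for illustration only.
zones = ["zone-a", "zone-b", "zone-c"]
instances = [f"web-{i}" for i in range(6)]

placement = spread_across_zones(instances, zones)
remaining = survivors(placement, "zone-a")
print(remaining)
```

With six instances over three zones, losing any single zone still leaves four instances running, which is the property the quoted paragraph is describing.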
Rather than using a different region, I think it is better to have multiple cloud providers...
Damia
Amazon's post-mortem explanation:
http://aws.amazon.com/message/65648/
Damia
This is not the first time I've heard about a big hosting centre losing data even though it never happens, and they are keeping backups, etc.
If it's at all manageable, keep one copy safe at your own place in addition to the replication at the hosting centre. You can set up a cheap box at the office with a couple of terabytes of disk space and suck down the data periodically with something like rsync and rdiff-backup. It's not a whole lot of work and can make the difference between having a big problem and total disaster.
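A rough sketch of that periodic pull, assuming rsync is installed on the office box and the server is reachable over SSH. The remote path, the backup directory, and both function names are made up for illustration. It writes one dated directory per run and uses rsync's --link-dest option so unchanged files are hard-linked to the previous snapshot instead of copied again. This is a stand-in for what rdiff-backup does with its increments, not a replacement for it.

```python
import subprocess
from datetime import date
from pathlib import PurePosixPath


def build_rsync_command(remote, backup_root, today, previous=None):
    """Build an rsync command that pulls `remote` into a dated snapshot directory."""
    root = PurePosixPath(backup_root)
    dest = root / today.isoformat()
    # -a keeps permissions and timestamps, -z compresses data in transit.
    cmd = ["rsync", "-az"]
    if previous is not None:
        # Hard-link files that are unchanged since the previous snapshot.
        cmd.append(f"--link-dest={root / previous.isoformat()}")
    cmd += [remote, f"{dest}/"]
    return cmd


def pull_backup(remote, backup_root, today, previous=None):
    """Run the pull; raises CalledProcessError if rsync exits non-zero."""
    subprocess.run(build_rsync_command(remote, backup_root, today, previous), check=True)


# Hypothetical host and paths; this only builds the command, it does not run it.
cmd = build_rsync_command(
    "backup@host.example:/var/www/",
    "/srv/backups",
    date(2011, 5, 1),
    previous=date(2011, 4, 30),
)
print(" ".join(cmd))
```

Calling pull_backup from a nightly cron job would be the obvious way to schedule it. Keeping dated snapshots rather than one mirror matters here: if the hosting centre silently loses files, a plain mirror would faithfully copy the loss.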
It would help if hosting centres actually told you exactly how they store and back up your data and what they do in case of emergency, instead of throwing meaningless phrases like "99.999% uptime!" and "fully redundant storage backbone!" at you. A fully redundant storage backbone is worth nothing if it's built on some big arse proprietary SAN where the whole array goes down if the main controller goes down. Which it of course does, because it's a flaky embedded thing with 2k memory that has to be programmed in assembler and C with dangling memory pointers all over the place.
The durability you quote for S3 (99.99%) is for the reduced redundancy option. The standard storage lists 99.999999999% durability.
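To put those two figures side by side, here is a back-of-the-envelope calculation. It assumes the durability figure means the yearly probability that any one object survives and that losses are independent, which is a simplification. The function name is made up for this example, and exact fractions are used so that eleven nines do not get lost in floating point rounding.

```python
from fractions import Fraction


def expected_annual_losses(num_objects, durability_percent):
    """Expected number of objects lost per year at the given durability."""
    # Assumption: durability is a per-object, per-year survival probability.
    loss_rate = 1 - Fraction(durability_percent) / 100
    return num_objects * loss_rate


reduced = expected_annual_losses(10_000, "99.99")
standard = expected_annual_losses(10_000, "99.999999999")

print(f"reduced redundancy: {reduced} objects/year")
print(f"standard storage: one object every {1 / standard} years")
```

For 10,000 stored objects this works out to about one lost object per year at 99.99%, versus one lost object every 10,000,000 years at 99.999999999%.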