Amazon EBS Failure Brings Down Reddit, Imgur, Others
Several readers have sent word of a significant Amazon EBS outage. Quoting:
"Amazon Web Services has confirmed that its Elastic Block Storage (EBS) service is experiencing degraded service, leading sites across the Internet to experience downtime, including Reddit, Imgur and many others. AWS confirmed on its status page at 2:11 p.m. ET that it is experiencing 'degraded performance for a small number of EBS volumes.' It says the issue is restricted to a single Availability Zone within the US-East-1 Region, which is in Northern Virginia. AWS later reported that its Relational Database Service (Amazon RDS) and its Elastic Beanstalk application plaform also experienced failures on Monday afternoon."
Or else my afternoon is going to totally suck.
Productivity reached a record high this afternoon.
After 3 days without programming, life becomes meaningless
- The Tao of Programming
It's the cloud! It's like never like down, and webscale!
Since no one can go on reddit, they will come back to /. only to find out why reddit is down!
Coursera is also down as a result.
/. is working just fine.
Are those karma points in the mail?
It's as if millions of geek voices cried out in terror & were suddenly silenced.
We are seeing EBS problems across multiple AZs with our services, as are many others. Amazon is downplaying the issue.
See HN for ongoing discussion as well: http://news.ycombinator.com/
Bad luck if you're hosted in the US-East-1 Region, I guess.
Heh, I should really start advertising the LVS clusters I tend to as 'private clouds with better uptime than Amazon'.
My God, it's Full of Source!
OUTSIDE_IP=$(dig +short my.ip @outsideip.net)
I have to admit, due to this outage I just logged in to Slashdot for the first time in a year. We're experiencing our own outages at work, unrelated to AWS, but I'd hate to be an AWS admin during one of these major outages. This makes me wonder why Reddit, Imgur, etc., don't have presences in multiple availability zones to prevent this kind of outage.
http://www.rackspace.com/blog/rackspace-cloud-block-storage-making-progress-towards-a-fall-release/
"Nearing fall release"?!? Help us out!
Do you still think that putting your digital life in the "cloud", without any ability to fall back on a physical hard drive or device, is a good idea?
[End Of Line]
An honest question, why don't these large, big-name sites utilize the Multi Availability Zone failover that Amazon offers? It seems these AWS outages make for good headlines, but shouldn't any large site be co-located in multiple physical locations to ensure uptime? If they WERE using Multi AZ, or there is some other technical reason why it wouldn't help, I'm really curious to know why...
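For readers unfamiliar with the idea behind the question, here is a minimal Python sketch of failing over between replicas in different zones. It is not Amazon's Multi-AZ mechanism; the hostnames, the `pick_endpoint` helper, and the fake health check are all hypothetical, made up only to illustrate the general pattern.

```python
from typing import Callable, Iterable, Optional


def pick_endpoint(
    endpoints: Iterable[str],
    is_healthy: Callable[[str], bool],
) -> Optional[str]:
    """Return the first endpoint whose health check passes, or None."""
    for endpoint in endpoints:
        try:
            if is_healthy(endpoint):
                return endpoint
        except Exception:
            # A health check that raises is treated as an unhealthy replica.
            continue
    return None


# Hypothetical replicas, one per availability zone (names are assumptions).
REPLICAS = [
    "db.us-east-1a.example.internal",
    "db.us-east-1b.example.internal",
]


def fake_check(endpoint: str) -> bool:
    """Stand-in health check that pretends the 1a zone is down."""
    return "1a" not in endpoint


print(pick_endpoint(REPLICAS, fake_check))
```

The sketch only covers choosing a live replica. Keeping the data on those replicas in sync is the harder part, and it is not shown here.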
But the cloud is so much better to use!
Paul: Father... father, the sleeper has awakened! - Dune
There's an oblig xkcd: http://xkcd.com/908/ Guess someone tripped over the wire.
s/[stupid comments]/[intelligent discourse]/gi
turntable.fm is also down -- I guess the NYC tech startup community is going crazy right now. Time to diversify!
This too shall pass
Looks like there won't be any fancy reports about the "cloud" having spectacular uptimes; with over an hour passed, they can no longer claim more than 3 nines of uptime.
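The arithmetic behind the "3 nines" remark can be checked with a short Python sketch. It assumes availability is measured over one 365-day year, which the comment does not state, and the function names are illustrative.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes, ignoring leap years


def allowed_downtime_minutes(nines: int) -> float:
    """Yearly downtime budget in minutes for N nines (3 nines = 99.9%)."""
    return MINUTES_PER_YEAR / 10 ** nines


def max_nines(downtime_minutes: float, limit: int = 6) -> int:
    """Largest number of nines still claimable after this much yearly downtime."""
    best = 0
    for n in range(1, limit + 1):
        if downtime_minutes <= allowed_downtime_minutes(n):
            best = n
        else:
            break
    return best


for n in (3, 4):
    print(n, "nines allow", round(allowed_downtime_minutes(n), 2), "minutes per year")

# An outage of just over an hour, assumed to be the only downtime that year.
print(max_nines(61))
```

Under that assumption, four nines leaves a budget of about 52.6 minutes a year, so a single outage of more than an hour caps the claim at three nines.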
My life and business don't rely on ANY Internet-based social service things, and I make sure my customers are not dependent on social media to know what's going on with my business. Hell, even if the Internet went down I'd still have a phone book and a land line.
by TheSpoom (715771) Uncaring Linux user here. I have nothing to add to this but please continue. *munches popcorn*
The tech is available to Amazon to migrate running VMs from one cluster to another. Why do we still have these outages?
Amazon: all of the problems of hosting stuff yourself and all of the problems of cloud hosting without any of the advantages. I have single boxes at my house running on cable modems that have better uptime than EC2 right now (barring a few power outages, because I don't have a UPS on my crap :-( ).
This keeps happening.
Amazon claims "degraded performance" in EBS. But then RAID rebuilding and instance migration/failover increase the load so much that everything else around it crashes as well.
This is yet another major outage for hundreds of sites and apps, some of them significant. I'd call it a cloud burst.
Netcraft confirms it.
sudo eat my shorts
If only there were some lessons learned over decades and decades of mainframe use that could be applied to the cloud.
These sites load so much faster.
Is storage that expensive or is it the bandwidth costs associated with them?
I love the smell of cloud failure in the afternoon..
Should I take a sick day for a loved one?
This zapping people's data is getting to be habit forming for Amazon I think.
I guess we're just waiting to hear if it was a mistake or on purpose.
blog.sam.liddicott.com
Why is it that these same companies keep failing at cloud instead of learning from their mistakes and using technology like failover and high availability that has been around for years to ensure that if there is an outage their service remains online and solid? A single point of failure should never cripple a site or service.
http://benjaminkerensa.com/2012/06/30/reflecting-on-netflix-instagram-pinterest-downtime
http://www.brandonholtsclaw.com/blog/2012/how-not-to-fail-at-the-cloud/
The Internet was meant to be resilient to nuclear attacks... Now major websites simply go down when you take out major cloud service providers... This whole development is just silly.
You asked for it!
Funny, and I am ALWAYS the one criticized for my "don't use a cloud" attitude.
My productivity level for today just hit an all-time low.
Why would you ever put a DEV environment in the cloud? Why? Really? Time to go surf some more... I bet Monster is still up...
I'm noticing that the Amazon web interface is also slow in responding. It looks like specific parts of CloudFront are also having issues.
Amazon must have some wires crossed: Minecraft.net is now rendering the Room Key website.
When you own the infrastructure, know what you are doing and control your own cloud. Private clouds are the only real future for businesses who need 24x7 uptime (if you have hired the right folks to build and manage it).
is to use Zadara Storage instead of EBS.
http://blog.zadarastorage.com/2012/10/comparing-provisioned-iops-ebs-vs.html
Minecraft login is down too!
-- QED
When you put something out of your hands, you have no control over where it's replicated or across how many servers. So yes, this incident should be no surprise to you. I'm just waiting for a big data breach or a loss of data in a spectacular way.
In a virtual world, you put on your roller blades, and administer a failing data center. Level 1 is your home LAN. Level 2 is a law office and all the attorneys want the morning's court briefs immediately because court starts in 45 minutes and the file server screen says "RAID array offline". Level 3 is a small ISP. Level 4 is AWS. Level 5 is Google. Good luck!
now we need to go OSS in diesel cars