How Amazon Scrambled To Fix Prime Day Glitches (cnbc.com)
Amazon's Prime Day shopping event last week was riddled with glitches. Roughly 15 minutes into the sale, the landing page stopped working. Some users saw an error page featuring the "dogs of Amazon" and were never able to enter the site; others got caught in a loop of pages urging them to "Shop all deals." According to internal documents obtained by CNBC, it appears that Amazon failed to secure enough servers to handle the traffic surge, causing it to launch a scaled-down backup front page and temporarily kill off all international traffic. From the report: The e-commerce giant also had to add servers manually to meet the traffic demand, indicating its auto-scaling feature may have failed to work properly leading up to the crash, according to external experts who reviewed the documents. "Currently out of capacity for scaling," one of the updates said about the status of Amazon's servers, roughly an hour after Prime Day's launch. "Looking at scavenging hardware." A breakdown in an internal system called Sable, which Amazon uses to provide computation and storage services to its retail and digital businesses, caused a series of glitches across other services that depend on it, including Prime, authentication and video playback, the documents show.
Amazon chose not to shut off its site. Instead, it manually added servers so it could improve the site performance gradually, according to the documents. One person wrote in a status update that he was adding 50 to 150 "hosts," or virtual servers, because of the extra traffic. Caesar says the root cause of the problem may have to do with a failure in Amazon's auto-scaling feature, which automatically detects traffic fluctuations and adjusts server capacity accordingly. The fact that Amazon cut off international traffic first, rather than increase the number of servers immediately, and added server power manually instead of automatically, is an indication of a breakdown in auto-scaling, a critical component when dealing with unexpected traffic spikes, he said.
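For readers unfamiliar with the mechanism, here is a rough sketch of what an auto-scaler of the kind described above does: it measures load, works out how many hosts that load needs, and caps the answer at whatever capacity is actually available. The function name, parameters, and numbers are illustrative assumptions, not anything taken from Amazon's systems or the leaked documents.

```python
import math


def desired_hosts(observed_load, target_load_per_host, max_hosts):
    """Hypothetical scaling rule: size the fleet so each host carries
    roughly target_load_per_host, but never exceed max_hosts.

    Returns (hosts_to_run, shortfall). A non-zero shortfall means the
    scaler wants more hosts than the capacity ceiling allows.
    """
    if target_load_per_host <= 0:
        raise ValueError("target_load_per_host must be positive")
    # Hosts needed to serve the observed load; always keep at least one.
    needed = max(1, math.ceil(observed_load / target_load_per_host))
    # The scaler can only hand out what physically (or by quota) exists.
    granted = min(needed, max_hosts)
    return granted, needed - granted


if __name__ == "__main__":
    # Made-up numbers: 12,000 requests/s, 100 requests/s per host, 100 hosts free.
    hosts, shortfall = desired_hosts(12_000, 100, 100)
    print(hosts, shortfall)
```

When the shortfall is non-zero the scaler has nothing left to give, which is one plausible way to read the "out of capacity for scaling" update quoted in the summary: someone has to find more hardware or shed traffic.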
The entire point of "Prime Day", which actually started many years ago with a massive sale selling Xbox consoles for $100 each, is to test out their infrastructure. They can test using simulated connections, but that only goes so far. They need to be able to test AWS with massive demand on unpredictable pages, and have the system scale appropriately. What better way to do this than to shove a few "sales" at a bunch of products, and then contact literally every media outlet in the country to promote it? Seriously, name a local news channel NOT hyping the Prime Day event. This is simply Amazon creating quite possibly the world's largest single-day beta test of new infrastructure code, and doing it annually. The big difference this year is that something didn't work right, so engineers were right on the spot to scale things up by hand.
I don't know about anyone else but I couldn't buy anything for a solid 5 hours or more, checking intermittently, because it wouldn't let me check out. I did see a lot of Amazon dogs though.. like a lot..
The whole site worked for me but I couldn't check out. It didn't seem like a server capacity issue to me.
Like Amazon Web Services for example...
wtf. if that was the case, not a very good example of aws' scaling capabilities, that's for sure..
and how the fuck is cnbc getting 'internal' (and no doubt confidential, nda-protected shit) documents? bezos is gonna hang someone for that leak.
That's pretty sad.
Joke's on you, because most of the Amazon retail doesn't actually run on AWS. It uses its own deployment system, server management, data storage, etc.
Is that I wasn't able to buy anything on prime day because I don't have any money. Epic fail. LULZ.
Thanks, Obama.
I didn't even know this stupid event was going on and couldn't care less some consumer whores missed out on some deals.
Here's a company based solely on cloud solutions and web retail marketing that plans a day which always creates a burden from the start, especially on server demand. Maybe this should not reflect on the reliability of AWS to provide businesses with reliable solutions. But I would certainly be asking questions about why a company like Amazon created such a bad automated system for demand that obviously didn't work.
It wasn't riddled with glitches. It was a total disaster and a complete embarrassment to the company.
We all saw it. He's Beszos' little PEDO bitch.
One of the first things you need to do when setting up an environment in AWS is to get them to increase your (artificially low) server limits for each instance type you're planning on using. Otherwise, you're going to run into those limits at the worst possible time when you need to rapidly scale your servers.
While I understand why they do this (probably to protect themselves from having someone spin up 1,000 cryptocoin mining instances with a hacked account), it's refreshing to see Amazon get bit by their own annoying provisioning decisions.
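A minimal sketch of the pre-launch check this implies: compare the per-instance-type limits on the account against the peak you expect to need, and flag any gap while there is still time to request an increase. The function name and all numbers are made up for illustration; the limits are passed in as a plain dictionary rather than fetched from any real AWS API.

```python
def quota_shortfalls(limits, planned_peak):
    """Hypothetical helper: for each instance type, report how many more
    instances the planned peak needs than the account limit allows.

    limits and planned_peak both map instance type -> instance count.
    Types with enough headroom are left out of the result.
    """
    shortfalls = {}
    for instance_type, peak in planned_peak.items():
        # An instance type with no recorded limit is treated as a limit of 0.
        limit = limits.get(instance_type, 0)
        if peak > limit:
            shortfalls[instance_type] = peak - limit
    return shortfalls


if __name__ == "__main__":
    # Illustrative values only.
    limits = {"m5.large": 20, "c5.xlarge": 50}
    planned_peak = {"m5.large": 150, "c5.xlarge": 40}
    print(quota_shortfalls(limits, planned_peak))
```

Anything this reports is a limit increase to file before the big day, not during it.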
I tried for hours to order the Amazon Fire 7" (8GB) for the low price of CAD$40, but the page kept changing. Sometimes it would be available, sometimes it would be disabled and only the 16GB was available, sometimes the 8GB option completely disappeared as if it didn't even exist, other times it was available from a third-party non-Amazon seller for nearly twice the price.
It kept doing that every single time the page loaded and I was reloading it roughly once per second.
What's also weird is that once every few minutes, when it was finally available again, the estimated delivery kept going up by about two weeks. In the end I was able to order it (8GB), but I'm guessing it's not even manufactured yet since my delivery date is mid-September.
#DeleteFacebook
I was thinking of something like Microsoft Azure or Google Cloud Platform. Maybe Amazon could hire some of their cloud consultants to figure out how to do it.
Prime Day is a reaction to Alibaba and Aliexpress.com. They both generate nearly a trillion dollars of revenue across the world with 11/11 day sales. 11/11 day itself is a celebration called "Singles' Day" in China, where students started celebrating being single around 1993 on university campuses.
Amazon day is a pointless branded knockoff that Bezos hopes will generate just as much money, assuming no one finds out about AliExpress and they somehow magically stop competing.
Good people go to bed earlier.
Amazon got slashdotted.
The Prime Day thing is pretty skeezy: tons of no-name brand items whose prices were inflated for the sale day so they could "slash prices" and offer you the low low discounted price of what it normally sells at, but with the bigger price it never sold at crossed out. It was entirely an exercise in preying on people's gullibility: they saw these huge "discounts" and made impulse buys thinking this super short special shopping day was saving them money. And of course, you had to buy the Prime membership in the first place. You're by no means saving money on Prime; you're usually paying more for the same goods, and all you're getting is the "free" two-day shipping.