Amazon's Move Off Oracle Caused Prime Day Outage in One of its Biggest Warehouses, Internal Report Says (cnbc.com)
Amazon is learning how hard it can be to move off of Oracle's database software. From a report: On Prime Day, while the e-retailer was dealing with a major website glitch that slowed sales, the company was also dealing with a technical problem in Ohio at one of its biggest warehouses, leading to thousands of delayed package deliveries, according to an internal report obtained by CNBC. The problem was in large part due to Amazon's migration from Oracle's database to its own technology, the documents show. The outage underscores the challenge Amazon faces as it looks to move completely off Oracle's database by 2020, and how difficult it is to re-create that level of reliability. It also shows that Oracle's database is more efficient in some aspects than Amazon's rival software, a point that Oracle will likely emphasize during this week's annual OpenWorld conference in San Francisco.
Was it just a regular outage that could have happened to anyone, or something very specific to their own infrastructure?
Just because a change was made at some point in the past, you don't get to just assume that everything would have been fine if Change X or Y hadn't been made. Oracle isn't a silver bullet.
Oracle: Don't you dare change to a competing product. Bad things will happen to you.
Even those who arrange and design shrubberies are under considerable economic stress at this period in history.
So the only glitch was a short delay in a single warehouse?
Sounds like a massive success story to me.
I use business management products from Oracle with an underlying Oracle database. I feel like sometimes the IT department must not be shoveling enough coal into the boiler or something because this antiquated inflexible interface just stalls all the time and very frequently has to go down for some sort of synchronization. It's slick like Amazon's web site. I don't understand why Oracle even exists given my experience with it.
Some drink at the fountain of knowledge. Others just gargle.
That phrase confused me.
I can absolutely understand wanting to move off Oracle. But why would they re-invent the wheel and write their own database? At least, that's what it sounds like they're doing based on the way the article was phrased.
Wouldn't it have been better to just switch to Postgres and use the Oracle compatibility layer if they needed things like PL/SQL support?
Ilsa
Between Java and their Enterprise platforms, if Oracle spent as much time listening and responding to their customers as they spent threatening them, they might be in a far better position today. Any major platform transition is going to have problems unless you're exceptionally lucky. There's just too many moving parts in Enterprise systems for humans to get everything right on the first try. Oracle won't tout all of the problems people have moving ONTO their software from a competitor, but that transition pain happens too.
Every year that goes by, it seems like Oracle is in a more tenuous position, despite their increased revenue. They've already lost the SME space -- I don't know of a single company anywhere in our client base, or within my sphere of influence, that still uses Oracle software. Organizations are bumping up against the limits of NetSuite -- the costs to integrate 3rd-party or industry-specific components, compared with other ERPs, are turning out to be more significant than expected. So we have clients and vendors migrating ERPs over time.
Oracle is becoming the Comcast of the software world. They treat everyone like crap, but were so deeply embedded that they were hard to dislodge. With every passing year, that is less true, and I think Oracle knows it. Unfortunately, they seem to be choosing to double-down on the "treat everyone like crap" strategy, rather than actually fixing the systemic problems that might eventually sink them...
Notice: Your mouse has been moved. Windows will now restart so this change can take effect.
Many a company has used Oracle ERP for everything in the company because a sales guy sold them on it. Then later on they deeply, deeply regretted doing it because they customized the system, then later realized they didn't own the underlying technology, so they got tied to an ancient version of Oracle ERP. Essentially Oracle got them over a barrel, and is able to extort them for YEARS of high licensing costs. Since it's all custom, moving to a newer Oracle ERP is a nightmare.
This is a standard story. Oracle ERP is like crack. It's too late to try to get off the stuff once you get hooked.
The outage underscores the challenge Amazon faces as it looks to move completely off Oracle's database by 2020, and how difficult it is to re-create that level of reliability. It also shows that Oracle's database is more efficient in some aspects than Amazon's rival software, a point that Oracle will likely emphasize during this week's annual OpenWorld conference in San Francisco.
Nothing in the article really supports those conclusions.
Was it due to some actual inferiority in "their own technology" (postgresql?), or was it just a migration issue?
Oracle is a complete nightmare. I've ported several large databases off Oracle, and have spent too many years developing with Oracle. There were constant issues with Oracle. Reliable? Please. Every month we were running into open bugs and submitting issues, all while paying obscene money for the privilege of using their products.
Adding features while doing bug fixes for god knows how long.
Their DB is in need of a total rewrite.
Amazon, if it does this slowly and designs the software well, can easily crush Oracle in the long run.
Just saying, I've dealt with a lot of Oracle features that are bug-ridden crap. Reliability my ass.
It's certainly unsurpassed in the efficient manner in which it eats all available IT funding. What licensing scheme are they using to rip off their customers this year? By CPU cores? By clock speed? Both?
Amazon could, obviously, have done a better job of testing before flipping the switch on a migration this big. It's not like the company is hurting for the money that could have been used to put together an appropriate environment to prevent a snafu like this.
CUR ALLOC 20195.....5804M
Likely as well, the $90K that this incident cost them is a rounding error in the total budget of the project, the long-term savings that the project will provide over the years, and the additional money coming in from being able to now sell this as a service on their AWS platform.
I am sure Amazon probably loses more money per year, maybe even per month, due to damage to products in shipment than this little mishap cost them.
Anyone who expected otherwise has not done a major migration. But once the move off of Oracle is complete, Amazon may be in a much better place.
HA HA, you thought your homebrew infrastructure was up to snuff.
Sounds like a story that's really not a story. Someone leaks a memo or something and then some story is built out of that to make it a headline grabber, when in fact issues probably occur in any large data center operation that connects to many places in their system.
Amazon could spin off a small consulting company that helps people migrate from Oracle to something else, and use all the lessons hard earned from their own experience with the process to make it go smoother for their customers.
Amazon having trouble rolling out a platform migration does not mean Oracle is a reliable platform. On the contrary, my experience is that due to the high licensing costs, many businesses forgo implementing the replication and redundancy measures needed to make Oracle's db reliable. Amazon having trouble rolling out a platform migration only goes to show that scale makes such migrations difficult and underscores how important planning is in IT.
I feel like sometimes the IT department must not be shoveling enough coal into the boiler or something because this antiquated inflexible interface just stalls all the time
Ok, so imagine that, but worse. That was Prime Day. Hours on hours of not stalling, but simply not working at all.
What you are describing sounds like maybe the devs aren't as good as they could be at optimizing, or maybe the company is stingy on hardware. What happened to Amazon was a world-class system brought to a halt simply because of too many users and the system fell over. That is something that Oracle is just better at handling (when it's administered right and has some powerful hardware at work, which Amazon has in spades for anything they stand up).
"There is more worth loving than we have strength to love." - Brian Jay Stanley
This is how it works: [old state] -> [chaos] -> [new state]
When people see chaos, they often think that is the new state, when it is the chaos that occurs during the switch. If you have to maintain two different systems at the same time and those systems have to work together, of course it is hard and of course problems will occur.
But anything is better than Oracle products. Anything.
Big databases usually require careful tuning to handle big loads. Could it be the new incarnation has yet to undergo such tuning? The new incarnation may also have a different trade-off profile, such that the porting process moved operations mostly as-is instead of rebalancing the trade-offs to fit the new host. Much of the Oracle DB tuning may come from direct production experience, something the new incarnation won't have by definition.
For a car analogy, suppose you are used to hauling big loads up the mountain in a Ford pickup truck. You switch to a Chevy truck and find your productivity drops. At first you blame the Chevy.
After weeks of experience you find the Chevy less powerful at directly going over boulders; however, it's more maneuverable than the Ford, such that you just learn to swerve around boulders instead of trying to go over them. Once you get used to the Chevy, the haul time is roughly the same.
Table-ized A.I.
Anyone who has ever had the misfortune of having to use the Oracle software that is used in many universities can tell you that they can make some serious garbage. Every day I have to use it for student management, and it is really, really, really terrible. It's the worst software that I have ever used in my life.
I would rather get up every morning and punch myself in the groin area than use Oracle/Peoplesoft for my work.
plus you don't have to have a fleet of Oracle consultants in your office for life.
One
Rich
Asshole
Called
Larry
Ellison
seems like it.
When Amazon is willing to take losses like this one in the process of dropping Oracle, you *KNOW* it means Oracle Costs Too Much.
This is not a bunch of incompetents we have here, they have estimated these risks and their costs when something goes wrong, and it *still* made sense to drop Oracle.
So there.
Likely as well, the $90K that this incident cost them is a rounding error in the total budget of the project, the long-term savings that the project will provide over the years, and the additional money coming in from being able to now sell this as a service on their AWS platform.
I am sure Amazon probably loses more money per year, maybe even per month, due to damage to products in shipment than this little mishap cost them.
AND they now have a viable (and battle-tested) migration path from Oracle to their own product, which can also be sold to those disgruntled Oracle customers.
There's nothing like eating your own dog food to show that you're not selling vaporware.
This is so Amazon
$90K is likely similar to what the Oracle license costs them per day. If you think I'm joking, that's $30M/year - which wouldn't surprise me for a company the size of Amazon.
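For anyone who wants to check that back-of-the-envelope number, here is a minimal sketch of the arithmetic. The $90K figure is the incident cost quoted in the thread; treating it as a daily license fee is the commenter's guess, not a known Oracle price, and the function name `annualize` is just an illustrative helper.

```python
# Hypothetical figures: $90K/day is the commenter's guess, not a real license price.
DAILY_COST_USD = 90_000
DAYS_PER_YEAR = 365


def annualize(daily_cost: int, days: int = DAYS_PER_YEAR) -> int:
    """Scale a flat daily cost to a yearly total."""
    return daily_cost * days


annual = annualize(DAILY_COST_USD)
print(f"${annual:,} per year")  # $32,850,000 per year
```

So $90K a day works out to a bit under $33M a year, which is in the same ballpark as the roughly $30M/year quoted.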
Do you have ESP?
Larry Ellison taunts Amazon that they still use Oracle and can't do without them, thus ensuring that Amazon will stop at nothing to be rid of Oracle and him.
When all you have is a hammer, every problem starts to look like a thumb.
Moving from Oracle to SQLite just won't cut it. Shoulda had a V-8.
Not for long.
No doubt, forever and ever.
Do you mean -1 shilling?
$30m/year could go just on Oracle Financials at their scale, let alone the database.
So unless and until the patents run out and it can be safely forked into a fully community controlled and supported language, you're still under the thumb of Oracle, even if indirectly.
But postgres has finally gotten some forms of replication and the roadmap for the next few versions has it match basically every feature Oracle still has left as an advantage within the next 5 or so years. The money and motivation are there now for postgres to overshadow Oracle in features, compatibility, and licensing costs. Plus if you need any of the Oracle specific compatibility during migration there is a proprietary solution available that overlays postgres-unimplemented Oracle-specific features atop postgres if you need.
Oh, look, a 'news' article paid for by Oracle.
On a long enough timeline, the survival rate for everyone drops to zero.
https://twitter.com/Werner/sta...
Never let facts interrupt a "good story." Tried to help reporter get it right, but clickbait won (https://www.cnbc.com/2018/10/23/amazon-move-off-oracle-caused-prime-day-outage-in-warehouse.html ). Our Fulfillment Centers have migrated 92% of DBs from Oracle to Aurora with better avail, less bugs and patches, less troubleshooting, less hw cost. More: (use the link at the top to read the More)
Aurora doesn't use EBS (like RDS does).
Aurora has a storage cluster, that uses instance storage on EC2 instances, with a client to this storage cluster in the storage engine in the database instances.
See figure 5 in the Aurora paper: https://www.allthingsdistribut...