eBay Deploys 100TB of SSDs, Cuts Rackspace By Half
Lucas123 writes "eBay's QA division was facing mounting performance issues related to its exponential growth of virtual servers, so instead of purchasing more 15k rpm Fibre Channel drives, the company began migrating over to a pure SSD environment. eBay said half of its 4,000 VMs are now attached to SSDs. The changeout has improved the time it takes the online site to deploy a VM from 45 minutes to 5 minutes and had a tremendous impact on its rack space requirements. 'One rack [of SSD storage] is equal to eight or nine racks of something else,' said Michael Craft, eBay's manager of QA Systems Administration."
For sites like eBay I have no doubt this makes sense. For the average small business I suspect they are far less IO bound and need storage...
I run: Windows, OS X, Linux, FreeBSD. Just because you have a hammer, doesn't mean everything is a nail.
Essentially, because SSD has far better IOPS, you need fewer units to get the same speed. So you can cut the size of the storage array in half. eBay is clearly IO bound rather than needing huge storage space. So for them, it's a win.
For others, maybe not so much.
They're expensive, but nowhere near as expensive as SSD. I guess if performance is that important, it makes sense but how many Ebay/Google/Amazon situations are there out there?
Of course everyone would love to replace all of their storage with SSD if price was no object.
The closest they come to mentioning cost is:
Though SSD is typically a magnitude of order more expensive than hard disk drive storage, Craft said the Nimbus arrays were "on par" with the cost of his previous storage, which a Nimbus spokesman said was from NetApp and HP 3PAR. (Craft declined to identify the vendors).
So, cost of new SSDs was similar to whatever HDDs they bought years ago? Yeah, that's kind of how it goes...
Nimbus prices its product on a per-terabyte basis - it charges $10,000 per usable terabyte
$10,000 per terabyte. Ok, then. Sure, it's faster, if you are willing and able to pay 10x the cost of *current* HDD-based systems...
So the entire eBay VM operation could fit into 6 racks?
200 physical servers @ 1RU each = 5 racks
10x 10TB 2U SSDs = half a rack
5x 2U switches = quarter rack
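The rack count above can be checked with a few lines of Python. This is only a sketch of the poster's arithmetic: the 40 usable U per rack is an assumption (the post does not state it, but 200 x 1RU = 5 racks implies roughly that), and the device counts are the poster's guesses, not figures from eBay.

```python
import math

RACK_UNITS = 40  # assumed usable U per rack; implied by "200 x 1RU = 5 racks"

def racks_needed(count, units_each, rack_units=RACK_UNITS):
    """Fraction of racks occupied by `count` devices of `units_each` U."""
    return count * units_each / rack_units

servers = racks_needed(200, 1)   # physical servers at 1RU each
ssd_arrays = racks_needed(10, 2) # 10TB SSD arrays at 2U each
switches = racks_needed(5, 2)    # switches at 2U each

total = servers + ssd_arrays + switches
print(f"{total} racks of equipment, so {math.ceil(total)} physical racks")
```

Under those assumptions the total comes to 5.75 racks, which rounds up to the 6 racks in the post.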
TFA reads like a thinly-veiled promo for Nimbus Data Systems, which I can only guess are pushing a Linux-based SAN appliance full of SSDs. Big whoop.
What I would love to know is: Why does eBay need 4000 VMs ?
-Billco, Fnarg.com
The article does not mention which kind of SSDs they've used, or have I missed something? That might be very interesting, especially when it comes to reliability. It's often claimed that SSDs are more reliable than traditional drives, but according to this http://www.tomshardware.com/reviews/ssd-reliability-failure-rate,2923.html that's not really true.
I got the impression ebay just terminated a hosting arrangement with Rackspace (the company) -- bringing it inhouse, and cutting Rackspace's revenues in half. :)
has improved the time it takes to deploy a VM from 45 minutes to 5 minutes
Uh, any logical explanation for this? SSDs are snappier and the peak I/O can be faster compared to spindle drives, but not by a factor of 9, surely?
I read a blog post a while back stating that SSDs fail a lot more than you would expect. Somewhere around a year of heavy use seems to take most of the life out of a consumer-grade SSD. Now I wonder how SSDs in RAID 5 (or 6, or whatever) will behave. If a certain model of SSD croaks around X write ops, then I think the nature of RAID, which spreads writes evenly across the drives, will mean that your entire array of SSDs goes bad at about the same time. It must suck to have two more drives go belly up while rebuilding your array after the first drive failure.
Perhaps it would make sense to stagger SSDs in different phases of their lifetime to keep simultaneous failures at bay: use some burned-in drives and some fresh ones.
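A toy model of that staggering idea, purely as a sketch: it assumes every drive dies exactly at a rated write endurance and that the array writes the same amount to each drive per day. Real SSDs do not fail that deterministically, and the endurance and workload numbers below are hypothetical.

```python
ENDURANCE_TB = 1000  # hypothetical rated write endurance per drive
DAILY_TB = 2         # hypothetical writes landing on each drive per day

def days_until_worn(already_written_tb, endurance_tb=ENDURANCE_TB,
                    daily_tb=DAILY_TB):
    """Days of writes left before a drive reaches its rated endurance."""
    remaining = max(endurance_tb - already_written_tb, 0)
    return remaining // daily_tb

# Terabytes already written to each drive when the array is built.
fresh_array = [0, 0, 0, 0]            # all drives brand new
staggered_array = [0, 250, 500, 750]  # mix of new and burned-in drives

fresh_days = [days_until_worn(w) for w in fresh_array]
staggered_days = [days_until_worn(w) for w in staggered_array]

print("fresh:    ", fresh_days)
print("staggered:", staggered_days)
```

In this model the fresh array wears out all at once, while the staggered one spreads the wear-out dates apart. The trade-off is that the most worn drive in the staggered array has to be replaced much sooner.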
People, what a bunch of bastards
Is he talking about performance or price? I can imagine that a single rack of enterprise SSDs could easily cost the same as 9 racks of anything else.
Calling someone a "hater" only means you can not rationally rebut their argument.
While most people instantly gravitate towards the upfront cost and performance of going solid state, I would make one important point. Reducing your data center space by 9 racks is significant in terms of power and cooling, and that is all on top of the purchase price and support contracts. Regardless of whether eBay owns its own data center or colocates, the cost per square foot in a data center and the continued operation of such a large storage system mean the switch is more than likely to provide a higher return on investment. eBay isn't in the business of looking cool and hip, they're in the business of selling stuff as cheaply as possible and I'm certain their CIO cares only about the bottom line.
Might even pay for itself by the year's end
Barring any major catastrophes - expect to see many companies with server farms go this route soon
..........FULL STOP.
I don't know, but I would be checking ebay for a butt-load of cheap 15k fiber channel drives for sale there.
More like, 1 rack of SSD delivers 9x the I/Os per second of 1 rack of magnetic media. Remember, we're talking about massive arrays built for speed, not necessarily for disk capacity. If you only need the capacity provided by 1 rack of SSD, then being able to cut your rack space in half or more by getting the required IO from fewer drives can save heaps. Potentially, it can save you from needing to build a new datacenter... which ain't cheap.
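To make the IOPS-versus-capacity point concrete, here is a rough sizing sketch. The array has to satisfy both requirements, so the drive count is whichever of the two is larger. Every number in it (workload, per-drive IOPS, drive sizes) is hypothetical and chosen only to illustrate the shape of the argument.

```python
def ceil_div(a, b):
    """Integer ceiling division."""
    return (a + b - 1) // b

def drives_needed(iops_required, gb_required, iops_per_drive, gb_per_drive):
    """Drives needed to meet both the IOPS and the capacity requirement."""
    by_iops = ceil_div(iops_required, iops_per_drive)
    by_capacity = ceil_div(gb_required, gb_per_drive)
    return max(by_iops, by_capacity), by_iops, by_capacity

# Hypothetical IO-bound workload: lots of IOPS, modest capacity.
WORKLOAD_IOPS = 90_000
WORKLOAD_GB = 20_000

hdd = drives_needed(WORKLOAD_IOPS, WORKLOAD_GB, iops_per_drive=180, gb_per_drive=450)
ssd = drives_needed(WORKLOAD_IOPS, WORKLOAD_GB, iops_per_drive=9_000, gb_per_drive=400)

print("15K HDD: total, by IOPS, by capacity =", hdd)
print("SSD:     total, by IOPS, by capacity =", ssd)
```

With these made-up figures the spinning-disk array is sized by IOPS and ends up with far more capacity than the workload needs, while the SSD array is sized by capacity. That is the situation where swapping to SSD shrinks the rack count.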
Finally we are getting a chance of seeing real reliability stats of the SSD!
That is, if eBay is kind enough to publish the data a couple of years from now.
All hope abandon ye who enter here.
It's about speed, not capacity. They used to have 15K disks, which have a latency of around 2 ms. They switched to SSD's, which have a latency of around 0.1 ms.
Why does latency matter for them? They have large amounts of relatively small files (small images, item descriptions and so forth). These files are spread across the disks. Each file requested means the latency is counted. If a webpage needs 100 files the latency of the disk is multiplied by 100. Each user (and there are many at any given moment) needs 100 files. As soon as possible.
They used to solve this by having far too many disks. The amount of storage space may have been twice what they expected to need, because more disks means you can access more files at once, effectively cutting your access time. Now file 2 doesn't have to wait until file 1 is done, because it's on another disk (or actually RAID). This increases the speed, but at a cost.
Add to that the fact that SSDs have insane transfer speeds. A 15K rpm disk may transfer 300 MB/s, a simple SSD can transfer 550. Theoretically they can go much faster, but SATA can't keep up. These systems don't use SATA, so their transfer speeds may be much higher.
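A back-of-the-envelope version of that latency argument, as a sketch only. It uses the 2 ms and 0.1 ms figures from this comment and the 100-files-per-page example, and assumes files are fetched one after another per disk with files spread evenly across disks. The function name and the ten-disk case are made up for illustration.

```python
HDD_LATENCY_US = 2000  # ~2 ms per request on a 15K disk (figure from the comment)
SSD_LATENCY_US = 100   # ~0.1 ms per request on an SSD (figure from the comment)
FILES_PER_PAGE = 100   # the comment's example page

def page_latency_ms(files, latency_us, disks=1):
    """Latency to fetch `files` small files spread evenly over `disks`.

    Each disk serves its share one request at a time; disks work in parallel.
    """
    rounds = (files + disks - 1) // disks  # requests queued on the busiest disk
    return rounds * latency_us / 1000

print("one 15K disk:", page_latency_ms(FILES_PER_PAGE, HDD_LATENCY_US), "ms")
print("ten 15K disks:", page_latency_ms(FILES_PER_PAGE, HDD_LATENCY_US, disks=10), "ms")
print("one SSD:", page_latency_ms(FILES_PER_PAGE, SSD_LATENCY_US), "ms")
```

Under these assumptions one SSD beats ten spinning disks on latency alone, which is why the extra spindles were being bought for speed rather than for space.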
IANAEOTS, so correct me if I am wrong.
Well, I might have a way, but it only works on a semi spherical planet in a vacuum.
i was under the (misguided?) impression that ssd's weren't, as yet, enterprise ready in terms of reliability?
You discover modern hardware does virtualization real well. You get a good host software, like vSphere or something on new hardware and you have extremely near native speeds. The CPUs handle almost everything just like it was running as the host OS, and sharing the CPU resources works great. Memory is likewise real good, in fact VMs only use what they need at the time so they can have a higher memory limit collectively than the system RAM and share, so long as they don't all try to use it all at once.
You really do have a situation where you can divide down a system pretty evenly and lose nothing. Let's say you had an app that used 2 cores to the max all the time and 3GB of RAM. You'd find that it would run more or less just as well on a VM server with 4 cores and 8GB of RAM, half assigned to each of two VMs, as it would on two 2 core 4GB RAM boxes. ...Right up until you talk storage, then everything falls down. When two VMs heavily access one regular magnetic drive at the same time, performance doesn't drop in half, it goes way below that. The drive is not good at random access and that is what it gets with two VMs hitting it at the same time, even if their individual operations are rather sequential.
It is a bottleneck that can really keep things from scaling like the other hardware can handle.
At work I use VMs to maintain images for instructional labs (since they all have different, custom requirements). When I'm doing install work on multiple labs, I always do it sequentially. I have plenty of CPU, a hyper-threaded 4 core i7, plenty of RAM, 8GB, there's no reason I can't load up multiples. However since they all live on the same magnetic disk, it is slower to do the installs in parallel than sequential.
If I had an SSD, I'd fire up probably 3-4 at once and have them all do installs at the same time, as it would be faster.
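Why parallel installs lose on a magnetic disk can be shown with a deliberately crude model. It assumes each install streams a fixed number of blocks, that the head pays one seek whenever it switches between streams, and, as a worst case, that two parallel installs alternate on every block. All timings are hypothetical round numbers, not measurements.

```python
def sequential_us(jobs, blocks, seek_us, transfer_us):
    """Run the jobs one after another: one seek per job, then pure streaming."""
    return jobs * (seek_us + blocks * transfer_us)

def interleaved_us(jobs, blocks, seek_us, transfer_us):
    """Run the jobs at once, alternating every block: one seek per block."""
    return jobs * blocks * (seek_us + transfer_us)

BLOCKS = 1000       # hypothetical blocks written per install
TRANSFER_US = 1000  # hypothetical time to transfer one block
HDD_SEEK_US = 5000  # hypothetical magnetic-disk seek
SSD_SEEK_US = 100   # hypothetical SSD access latency

for name, seek in (("HDD", HDD_SEEK_US), ("SSD", SSD_SEEK_US)):
    seq = sequential_us(2, BLOCKS, seek, TRANSFER_US) / 1_000_000
    par = interleaved_us(2, BLOCKS, seek, TRANSFER_US) / 1_000_000
    print(f"{name}: sequential {seq:.2f} s, interleaved {par:.2f} s")
```

In this model the magnetic disk takes several times longer when the two installs interleave, while the SSD barely notices. The model only covers the disk side; on the SSD the parallel installs would also overlap their CPU and network time, which is where the speedup comes from.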
Took me a while to figure out how much they actually cut from the end price; it's not something they're pleased to tell you.
Especially with the recent "adjustment of prices" at eBay, no wonder they have extra cash to waste on whatever idiotic idea comes to their IT management.
They get 8%+ of most below 500$/Euro items sold. Outrageous.
I hope my hosting company will change to such SSDs, and I will not have to wake up those rotating disks early in the morning.
All in all this is barely a dent in anything Ebay does. It sounds more like an experiment and hype of the drives they used.
When the foot seeks the place of the head, the line is crossed. Know your place. Keep your place. Be a shoe.
even the most touted and expensive 'enterprise ssd' can die out on you unexpectedly.
Read radical news here
SSD life is limited by the number of write operations. You can't use them like normal disks in the business eBay is in.
But read operations are unlimited. So, if you are just going to read files from a disk, SSD makes the perfect candidate. In random reads, they are approximately 40 times faster than the best HDD at the minimum.
So, you just put in 250 SSDs, put your VM images on them, and, as the article says, it cuts your VM deployment time to other systems from 45 minutes to 5 minutes. There is nothing wrong with the article.
Not only random reads.
You have to use SSDs for ALL reads, but HDDs for all writes, since SSD life is still limited by the number of write operations conducted.
I want the old hard drives! Please?
It's not significant at all, they're moving from FibreChannel drives which are notoriously small and expensive to SSDs. My last employer had a FibreChannel disk array. If I remember the prices, we were faced with a choice like 240 GB FibreChannel for $1000 each or 2 TB IDE drives for $100 each. It was obvious that moving to anything that wasn't FibreChannel was a good idea, because for the same price we could get either 480 GB of FibreChannel or 40 TB of IDE.
Fanatically anti-fanatical
The pricing from most vendors is more like $1,000 for a 450GB 15k FC drive or $500-600 for a 2TB NL SAS drive. The FC drive will provide ~7x more worst-case IOPS and have 1/3rd to 1/2 the annual failure rate. Which one you use really depends on whether you need more IOPS or more storage. We've found that with a large striping array our workload tends to strike a balance between IOPS required and storage space about where the 450GB FC drives are. Our next array will probably contain a mix of SSDs, 15k FC drives, and NL SAS drives, with intelligent array software to move things around between the tiers and the SSDs used for cache.
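Putting those two drives side by side, as a sketch: the prices, capacities and the ~7x IOPS ratio come from this comment, while the $550 figure is simply the midpoint of the quoted $500-600 range and "relative IOPS" is an arbitrary unit with the NL SAS drive set to 1.

```python
# Figures from the comment; NL SAS price is the midpoint of "$500-600".
fc_15k = {"price": 1000, "gb": 450, "relative_iops": 7}
nl_sas = {"price": 550, "gb": 2000, "relative_iops": 1}

def dollars_per_gb(drive):
    return drive["price"] / drive["gb"]

def dollars_per_iops_unit(drive):
    """Price per unit of worst-case IOPS, relative to one NL SAS drive."""
    return drive["price"] / drive["relative_iops"]

for name, drive in (("15k FC", fc_15k), ("NL SAS", nl_sas)):
    print(f"{name}: ${dollars_per_gb(drive):.2f}/GB, "
          f"${dollars_per_iops_unit(drive):.0f} per relative IOPS unit")
```

The NL SAS drive is much cheaper per gigabyte and the FC drive is much cheaper per unit of IOPS, so the right choice depends on which of the two the workload runs out of first.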
There are 4 boxes to use in the defense of liberty: soap, ballot, jury, ammo. Use in that order. Starting now.
This only works out if you have lots of wasted space in your array.
This still comes down to why you would have used SSD 10 years ago before most slashbots ever even heard of the tech: You simply need the IOPS regardless of cost.
I could see eBay having this requirement. The rest of us not so much.
A Pirate and a Puritan look the same on a balance sheet.
Yeah. Had forgotten about that part.
"Enterprise" drives tend to be small and expensive to begin with. Moving from one variant of this to another is probably not such a big change after all.
Fibre Channel Storage equipment floods EBay!
There's a huge difference between the Nimbus and Violin systems. Violin has a method by which to keep the I/O flat and not have to suffer a write cliff. Nimbus doesn't appear to have the same features, though if you care only about density, it seems good.
'nuff said.
I'm not a lawyer, but I play one on the Internet.
The sweet spot in enterprise storage is doing deduplication on SSD, with both direct-attached and networked storage, plus 15k 2.5" and 7.2K 3.5" disks for the rest. Deduplication saves a lot of space, and with SSD it works like cache, especially in scaled-out environments.
From the paper "Disk failures in the real world: What does an MTTF of 1,000,000 hours mean to you?" by Bianca Schroeder and Garth A. Gibson:
In our data sets, the replacement rates of SATA disks are not worse than the replacement rates of SCSI or FC disks.
So the annual failure rates are apparently similar, regardless of vendor claims
15K drives have a seek time of about 5ms. That is just to move the head before the data has started to flow. SSDs can get the data in microseconds.
Yeah, and their bank balance. And I thought our £2 million SSD database cluster was expensive..
I wrote my first program at the age of six, and I still can't work out how this website works.
One thing I haven't seen mentioned for the power savings is that in an extremely high random read environment where SSDs shine the most, you're generally looking at one SSD replacing multiple hard drives.
If you can get rid of five 15k drives and replace them with 1 SSD, the capital cost difference is vastly reduced, if not eliminated, even before you look at power savings.
I don't read AC A human right