Slashdot Mirror


eBay Deploys 100TB of SSDs, Cuts Rackspace By Half

Lucas123 writes "eBay's QA division was facing mounting performance issues related to its exponential growth of virtual servers, so instead of purchasing more 15k rpm Fibre Channel drives, the company began migrating over to a pure SSD environment. eBay said half of its 4,000 VMs are now attached to SSDs. The changeout has improved the time it takes the online site to deploy a VM from 45 minutes to 5 minutes and had a tremendous impact on its rack space requirements. 'One rack [of SSD storage] is equal to eight or nine racks of something else,' said Michael Craft, eBay's manager of QA Systems Administration."

19 of 197 comments

  1. depends if you are IO bound or need storage by smash · · Score: 2

    For sites like eBay I have no doubt this makes sense. For the average small business, I suspect they are far less IO bound and need storage...

    --
    I run: Windows, OS X, Linux, FreeBSD. Just because you have a hammer, doesn't mean everything is a nail.
    1. Re:depends if you are IO bound or need storage by myurr · · Score: 3, Insightful

      Actually the vast majority of websites, even business websites, are really low traffic and they benefit far more from storage space (especially when shared with other sites) than speed. Operating system RAM caching will usually make up any performance deficit those kinds of low traffic sites may experience, where the majority of the traffic that does go to their sites tends to be read only and directed at only a few pages on any given day. Premature optimisation adds either (or both) complexity and expense, and is unnecessary for 90+% of the web.

      Scalability is a nice problem to have, and the majority of websites would do an awful lot better if they worried about driving traffic more than they worried about scalability.

    2. Re:depends if you are IO bound or need storage by hairyfeet · · Score: 2, Insightful

      I agree but there should be an up side to this, in that we'll get some solid data to see if Atwood at coding horror is correct that SSDs need to be judged on a hot/crazy scale due to the insanely high SSD failure rates.

      The reason I personally won't recommend SSDs to customers or carry them myself is that after my two "must have teh benchmarkz!" gamer customers bought top o' the line SSDs, both of them had the SSD fail without warning, which for me is unacceptable. Sure they got them replaced under warranty, but so what? That didn't cover their downtime, the cost of the 1TB drives they had to pick up to hold them over while they waited on the RMA, or the data they lost between their last backup and the SSD failure. In the end they went dual Raptors in RAID 0 with the 1TB as backup and game storage.

      The nice thing about HDDs is that in my experience one gets plenty of warning before they go tits up. The drive gets noisy, you get Windows delayed write failures, the drive starts running hot, you get SMART warnings, something. With both of the SSDs it was just...poof. Dead drive. On one I was able to retrieve a little bit of the data, on the other it wouldn't even show up in BIOS.

      That is why, unless I have a customer where IOPS is everything, like in TFA, or someone who lives on a heavily mobile laptop AND has the time for daily backups or stores all their important data in the cloud, my advice is to just stay away until they get the kinks worked out. It is still too new and IMHO they haven't really got the tech down yet, hence all the failures. I tell my customers to have a fat HDD with plenty of cache along with giving Windows 7 plenty of RAM (4GB minimum, 8GB is better) to really use Superfetch and preload their apps into memory and they'll be happy. Yeah it doesn't boot with SSD speed, but who boots anymore?

      --
      ACs don't waste your time replying, your posts are never seen by me.
    3. Re:depends if you are IO bound or need storage by arglebargle_xiv · · Score: 2

      Any business dealing on the web (aka hosting) would benefit from ssd

      Because SSD is web scale.

  2. Re:Is this a Slashvertisement? by smash · · Score: 3, Informative

    Essentially, because SSD has far better IOPS, you need fewer units to get the same speed. So you can cut the size of the storage array in half. eBay is clearly IO bound rather than needing huge storage space. So for them, it's a win.

    For others, maybe not so much.

    --
    I run: Windows, OS X, Linux, FreeBSD. Just because you have a hammer, doesn't mean everything is a nail.
  3. Uhh.. cost? by Dahamma · · Score: 2

    Of course everyone would love to replace all of their storage with SSD if price was no object.

    The closest they come to mentioning cost is:

    Though SSD is typically a magnitude of order more expensive than hard disk drive storage, Craft said the Nimbus arrays were "on par" with the cost of his previous storage, which a Nimbus spokesman said was from NetApp and HP 3PAR. (Craft declined to identify the vendors).

    So, cost of new SSDs was similar to whatever HDDs they bought years ago? Yeah, that's kind of how it goes...

    Nimbus prices its product on a per-terabyte basis - it charges $10,000 per usable terabyte

    $10,000 per terabyte. Ok, then. Sure, it's faster, if you are willing and able to pay 10x the cost of *current* HDD-based systems...

    1. Re:Uhh.. cost? by Anonymous Coward · · Score: 5, Informative

      Depends on your workload. (Disclosure: I work in storage for a living.)

      Sometimes, what you need is raw, bulk storage. There are two serious contenders in this space: tape, and disk. You use tape if you have a lot of data you need to store, but not much that you need to access regularly: less power, and it scales to near infinite levels of storage (at the cost of very slow access for a given piece of data.) Or you use disk if you need to access most of it reasonably regularly. SSDs are not, and never will be, a contender in this space - you're paying through the nose on a per GB basis.

      On the other hand, sometimes what you need is IOs per second. Database administrators are very familiar with this - you need a bit of data from over here, and a bit of data from over there, and maybe a little bit more from somewhere in the middle, and you need it five minutes ago. Traditionally, you got this performance by building a large array across many spindles, and giving up half, three quarters, or even more of your disk space, in return for that nice, fast section of disk on the outside of the platter. Lots of running hard drives, drawing lots of power, generating lots of heat, and costing a lot of money for storage space that isn't even half used - because if you throw something else on that empty space, you completely ruin the throughput of your major production database.

      In that latter space, SSD is king. Sure, it's more expensive on a dollars per GB basis, but hey, guess what - GB isn't the important metric here. You figure out which bit of data is being hammered, and you move it across to the SSD. Rather like profiling an application: pick the function that takes 90% of the time in the software, optimise the wazoo out of it, and you get a significant improvement (rather than picking something at random and optimising it to billy-oh, and getting not much return for your investment.)

      So yeah - SSDs aren't going to compete in raw capacity any time soon. But in random I/O performance, they make a hell of a lot of sense. In some respects, yes, they most definitely are cheaper than traditional platters of spinning rust - see the aforementioned massive RAID set across dozens of spindles.
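
A rough sketch of the sizing argument above, in Python. Every number in it is a made-up assumption for illustration (the workload, the per-drive capacity and the per-drive IOPS figures are not from TFA); the point is only that the array size is set by whichever requirement needs more drives.

```python
def ceil_div(a, b):
    """Integer ceiling division, avoids floating-point rounding."""
    return -(-a // b)

def drives_needed(capacity_gb, iops, drive_gb, drive_iops):
    """Return (total, for_capacity, for_iops) drive counts for one workload."""
    for_capacity = ceil_div(capacity_gb, drive_gb)
    for_iops = ceil_div(iops, drive_iops)
    return max(for_capacity, for_iops), for_capacity, for_iops

# Hypothetical workload: a 2 TB database that needs 20,000 random IOPS.
CAPACITY_GB, IOPS = 2000, 20_000

# Hypothetical drives: a 300 GB 15k rpm disk at ~200 IOPS,
# and a 200 GB SSD at ~20,000 IOPS.
hdd_total, hdd_cap, hdd_iops = drives_needed(CAPACITY_GB, IOPS, 300, 200)
ssd_total, ssd_cap, ssd_iops = drives_needed(CAPACITY_GB, IOPS, 200, 20_000)

# Fraction of the purchased disk space the data actually occupies.
hdd_used = CAPACITY_GB / (hdd_total * 300)

print(f"HDD: {hdd_total} drives ({hdd_cap} for space, {hdd_iops} for IOPS), "
      f"{hdd_used:.0%} of space used")
print(f"SSD: {ssd_total} drives ({ssd_cap} for space, {ssd_iops} for IOPS)")
```

With these assumed figures the spinning-disk array is sized entirely by IOPS and leaves most of its space empty, while the SSD array is sized by capacity. Different assumptions will move the numbers, but not the shape of the trade-off.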

    2. Re:Uhh.. cost? by LordLimecat · · Score: 2

      You figure out which bit of data is being hammered, and you move it across to the SSD. Rather like profiling an application: pick the function that takes 90% of the time in the software, optimise the wazoo out of it, and you get a significant improvement (rather than picking something at random and optimising it to billy-oh, and getting not much return for your investment.)

      Or you do what eBay is apparently doing and say, screw it, we're doing 5 blades, and throw all of your storage on SSD.

    3. Re:Uhh.. cost? by neokushan · · Score: 2

      x10?

      --
      +1 IDisagreeSoHeMustBeATrollOrAnAstroturferOrAShill
  4. Compact by drmofe · · Score: 2

    So the entire eBay VM operation could fit into 6 racks?

    200 physical servers @ 1RU each = 5 racks
    10x 10TB 2U SSDs = half a rack
    5x 2U switches = quarter rack
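
The same back-of-the-envelope count as a small Python sketch. The 40 usable rack units per rack is an assumption inferred from the "200 servers @ 1RU = 5 racks" figure; the equipment counts are the commenter's guesses, not numbers from eBay.

```python
import math

RACK_UNITS = 40  # assumed usable units per rack (200 x 1RU = 5 racks implies 40)

# name: (count, rack units each), as estimated in the comment
equipment = {
    "servers": (200, 1),
    "ssd_arrays": (10, 2),
    "switches": (5, 2),
}

units = {name: count * size for name, (count, size) in equipment.items()}
total_units = sum(units.values())
racks = math.ceil(total_units / RACK_UNITS)

for name, u in units.items():
    print(f"{name}: {u}U = {u / RACK_UNITS:.2f} racks")
print(f"total: {total_units}U, fits in {racks} racks")
```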

  5. Zero details by billcopc · · Score: 3, Informative

    TFA reads like a thinly-veiled promo for Nimbus Data Systems, which I can only guess are pushing a Linux-based SAN appliance full of SSDs. Big whoop.

    What I would love to know is: Why does eBay need 4000 VMs?

    --
    -Billco, Fnarg.com
    1. Re:Zero details by Konster · · Score: 2

      They need 4000 VMs so they can buy 100TB of SSDs. :)

    2. Re:Zero details by Monoman · · Score: 3, Funny

      "We're not a storage team. We're Windows administrators who got into virtualization ... "

      --
      Keep the Classic Slashdot.
  6. Re:Wonder why not 2.5" SAS drives.. by fuzzytv · · Score: 2

    Because you won't get the IOPS or speeds you get with SSDs? SAS drives are still traditional drives, so random access is a pain.

  7. Reliability? by Vectormatic · · Score: 2

    I read a blog post a while back stating that SSDs fail a lot more than you would expect. Somewhere around a year of heavy use seems to take most of the life out of a consumer grade SSD. Now I wonder how putting SSDs into RAID 5 (or 6, or whatever) will behave. If a certain model of SSD croaks around X write ops, then I think the nature of RAID will mean that your entire array of SSDs will go bad pretty closely together. It must suck to have two more drives go belly up while rebuilding your array after the first drive failure.

    Perhaps it would make sense to stagger SSDs in different phases of their lifetime to keep simultaneous failures at bay: use some burned-in drives and some fresh ones.
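
A toy Python sketch of the staggering idea. It assumes a purely deterministic wear-out model (a drive dies exactly when it hits a fixed write budget) and that RAID spreads writes evenly; the endurance and write-rate numbers are hypothetical. Real SSD failures are not this predictable, so treat it as an illustration of the argument only.

```python
def days_until_worn(endurance_tb, already_written_tb, writes_tb_per_day):
    """Days until a drive exhausts its (assumed fixed) write budget."""
    remaining = max(endurance_tb - already_written_tb, 0)
    return remaining / writes_tb_per_day

ENDURANCE_TB = 100   # hypothetical write budget per drive
DAILY_WRITES = 0.5   # hypothetical TB written per drive per day

# Four fresh drives: identical wear, so identical predicted end of life.
fresh = [days_until_worn(ENDURANCE_TB, 0, DAILY_WRITES) for _ in range(4)]

# Staggered: some drives already "burned in" by 10, 20 and 30 TB.
staggered = [days_until_worn(ENDURANCE_TB, w, DAILY_WRITES)
             for w in (0, 10, 20, 30)]

print("fresh:    ", fresh)
print("staggered:", staggered)
```

In this model the fresh array wears out all at once, while the staggered array spreads predicted wear-out across separate dates, at the cost of the pre-worn drives dying sooner.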

    --
    People, what a bunch of bastards
  8. Return On Investment by ritcereal · · Score: 4, Insightful

    While most people instantly gravitate towards the upfront cost and performance of going solid state, I would make one important point. Reducing your data center space by 9 racks is significant in terms of power and cooling, and that is all on top of the purchase price and support contracts. Regardless of whether eBay owns its own data center or colocates, given the cost per square foot in a data center and the continued operation of such a large storage system, the switch is more than likely to provide a higher return on investment. eBay isn't in the business of looking cool and hip, they're in the business of selling stuff as cheaply as possible and I'm certain their CIO cares only about the bottom line.

  9. Re:Is this a Slashvertisement? by erroneus · · Score: 2, Funny

    I don't know, but I would be checking eBay for a butt-load of cheap 15k Fibre Channel drives for sale there.

  10. Re:huh by smash · · Score: 2

    More like, 1 rack of SSD has 9x the THROUGHPUT (I/Os per second) of 1 rack of magnetic media. Remember, we're talking about massive arrays here to get speed, not necessarily for disk capacity. If you only need the capacity provided by 1 rack of SSD, then being able to cut your rack space in half or less by getting the required IO from fewer drives can save heaps. Potentially, it can save you needing to build a new datacenter... which ain't cheap.

    --
    I run: Windows, OS X, Linux, FreeBSD. Just because you have a hammer, doesn't mean everything is a nail.
  11. VMs are in the IO category for sure by Sycraft-fu · · Score: 4, Informative

    You discover modern hardware does virtualization real well. You get a good host software, like vSphere or something on new hardware and you have extremely near native speeds. The CPUs handle almost everything just like it was running as the host OS, and sharing the CPU resources works great. Memory is likewise real good, in fact VMs only use what they need at the time so they can have a higher memory limit collectively than the system RAM and share, so long as they don't all try to use it all at once.

    You really do have a situation where you can divide down a system pretty evenly and lose nothing. Let's say you had an app that used 2 cores to the max all the time and 3GB of RAM. You'd find that it would run more or less just as well on a VM server with 4 cores and 8GB of RAM, half assigned to each of two VMs, as it would on two 2 core 4GB RAM boxes. ...Right up until you talk storage, then everything falls down. Have two VMs heavily access one regular magnetic drive at the same time and performance doesn't drop in half, it goes way below that. The drive is not good at random access and that is what it gets with two VMs hitting it at the same time, even if their individual operations are rather sequential.

    It is a bottleneck that can really keep things from scaling like the other hardware can handle.

    At work I use VMs to maintain images for instructional labs (since they all have different, custom requirements). When I'm doing install work on multiple labs, I always do it sequentially. I have plenty of CPU, a hyper-threaded 4 core i7, plenty of RAM, 8GB, so there's no reason I can't load up multiples. However, since they all live on the same magnetic disk, it is slower to do the installs in parallel than sequentially.

    If I had an SSD, I'd fire up probably 3-4 at once and have them all do installs at the same time, as it would be faster.
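
A deliberately crude Python model of why two sequential streams on one magnetic disk end up far below half speed each. It assumes the head has to seek every time the disk switches between streams; the transfer rate, seek time and chunk size are illustrative guesses, not measurements of any real drive or hypervisor.

```python
def aggregate_mb_s(streams, seq_mb_s, seek_ms, chunk_mb):
    """Total MB/s delivered when `streams` sequential readers share one disk.

    One stream runs at full sequential speed. With more than one, the model
    charges one seek per chunk because the head keeps jumping between streams.
    """
    if streams == 1:
        return float(seq_mb_s)
    transfer_s = chunk_mb / seq_mb_s
    return chunk_mb / (seek_ms / 1000 + transfer_s)

# Hypothetical disk: 100 MB/s sequential, 10 ms per seek, 64 KB per request.
SEQ_MB_S, SEEK_MS, CHUNK_MB = 100, 10, 0.064

alone = aggregate_mb_s(1, SEQ_MB_S, SEEK_MS, CHUNK_MB)
shared = aggregate_mb_s(2, SEQ_MB_S, SEEK_MS, CHUNK_MB)

print(f"one VM:  {alone:.1f} MB/s")
print(f"two VMs: {shared:.1f} MB/s total, {shared / 2:.1f} MB/s each")
```

Under these assumptions the disk spends most of its time seeking, so the combined throughput of two VMs is a small fraction of what one VM gets alone, which is why running the installs one after another finishes sooner. An SSD has no seek penalty to speak of, so the same model with `seek_ms` near zero barely degrades.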