Slashdot Mirror


AOL Spends $1M On Solid State Memory SAN

Lucas123 writes "AOL recently completed the rollout of a 50TB SAN made entirely of NAND flash in order to address performance issues with its relational database. While the flash memory fixed the problem, it didn't come cheap, at about four times the cost of a typical Fibre Channel disk array with the same capacity, and it performs at about 250,000 IOPS. One reason the flash SAN is so fast is that it doesn't use a SAS or PCIe backbone, but instead has a proprietary interface that offers 5 to 6Gb/s of throughput. AOL's senior operations architect said the SAN cost about $20 per gigabyte of capacity, or about $1 million. But, as he puts it, 'It's very easy to fall in love with this stuff once you're on it.'"
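As a rough sanity check on the summary's figures, here is a minimal Python sketch of the cost arithmetic. The function name and the `gb_per_tb` parameter are illustrative assumptions; the summary does not say whether "TB" means decimal or binary units, so both are shown.

```python
# Back-of-the-envelope check of the figures quoted in the summary.
PRICE_PER_GB_USD = 20   # "about $20 per gigabyte of capacity"
CAPACITY_TB = 50        # "a 50TB SAN"


def total_cost_usd(capacity_tb, price_per_gb, gb_per_tb=1000):
    """Total purchase cost in dollars.

    gb_per_tb is an assumption: 1000 for decimal terabytes,
    1024 for binary ones.
    """
    return capacity_tb * gb_per_tb * price_per_gb


if __name__ == "__main__":
    print(total_cost_usd(CAPACITY_TB, PRICE_PER_GB_USD))                  # decimal TB
    print(total_cost_usd(CAPACITY_TB, PRICE_PER_GB_USD, gb_per_tb=1024))  # binary TB
```

Either way the result lands at roughly the "about $1 million" quoted.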

158 comments

  1. AOL? by roman_mir · · Score: 4, Funny

    What is surprising to me is not the amount of money spent on what was bought, but the fact that AOL has any performance issues at all. They still have users? They have an entire database of users?

    1. Re:AOL? by MrDiablerie · · Score: 5, Informative

      It's a common misconception that AOL's primary business is still dial-up access. They make more money nowadays with their content sites like TMZ, Moviefone, Engadget, etc.

    2. Re:AOL? by savvysteve · · Score: 1

      It's a common misconception that AOL's primary business is still dial-up access. They make more money nowadays with their content sites like TMZ, Moviefone, Engadget, etc.

      Well I was thinking the same thing that roman was... I can't believe that AOL still is in business. They are owned by Time Warner right? Or the other way around? I know Time Warner in my area is now Comcast. I'm curious.

    3. Re:AOL? by bananaquackmoo · · Score: 4, Informative

      Neither. AOL separated into its own company again.

    4. Re:AOL? by T+Murphy · · Score: 4, Funny

      No, they think they still have lots of users. The cancellation department is separate from HQ- at 56k it's still going to be a few decades before the suits finish receiving all the cancellation notices.

    5. Re:AOL? by tnk1 · · Score: 4, Informative

      AOL is Advertising.com and some flagship sites. And yes, they still have dialup users. The access business is steadily decreasing, but it's pretty profitable since they basically stopped upgrading it and now just sort of run it.

      If they maintain their current path, yes, they will eventually disappear and fail, but the process is much longer than you might think. Not all of their acquisitions were as retarded as Bebo.

      What they probably need the SAN for is the Advertising business. That is profitable and requires a shitload of storage. They don't need that for their websites.

    6. Re:AOL? by Mikkeles · · Score: 4, Funny

      Me Too!!!

      --
      Great minds think alike; fools seldom differ.
    7. Re:AOL? by mark72005 · · Score: 0, Redundant

      The acquisition was such a disaster (due to AOL suckage) TW jettisoned them.

    8. Re:AOL? by mark72005 · · Score: 3, Funny

      I think it just means they are still billing users who cancelled years ago, per standard practice.

      Also, there are users who wanted to cancel years ago, but are still lost in the phone tree. Those are still active accounts too.

    9. Re:AOL? by dintech · · Score: 2, Funny

      They have an entire database of users?

      No, the 50TB is for a museum of all the different CDs they sent out.

    10. Re:AOL? by Jah-Wren+Ryel · · Score: 3, Informative

      Neither. AOL separated into its own company again.

      As a very casual observer it seems like the entire TW/AOL debacle could not have been mismanaged worse - well I guess both companies could have gone titsup, but that's about it. TW vastly overpaid for AOL when AOL was at its peak (160 billion dollars). Then, just as AOL had started to climb out of the bottom they spun it off for a song ($2.5 billion). Since then AOL has been doing a decent enough job of reinventing itself as a "new media" company - the kind of thing TW seems to be struggling with.

      That's why corporate CEOs get the big bucks though!

      --
      When information is power, privacy is freedom.
    11. Re:AOL? by Trepidity · · Score: 2, Interesting

      It's true that they make more money now with their content sites, but only slightly more: ISP subscriptions still make up around 40% of its revenues.

    12. Re:AOL? by Linker3000 · · Score: 1

      No, silly. The database is an historic record that they keep of every demo CD and floppy they ever sent out (date, name, address etc.). It was designed to ensure that they never sent more than 999 to the same person.

      --
      AT&ROFLMAO
    13. Re:AOL? by Dishevel · · Score: 1

      Then they failed.

      --
      Why is it so hard to only have politicians for a few years, then have them go away?
    14. Re:AOL? by abarrow · · Score: 1

      They probably did it because their database vendor (Microsoft?) claimed that their database problems had to be due to their hardware. It couldn't possibly be software performance issues...

    15. Re:AOL? by Mad+Merlin · · Score: 1

      What's surprising to me is that they managed to extract such awful performance out of so many SSDs. I mean, seriously, a pitiful 250k IOPS with $1M of SSD? You could do better with a dozen SSDs from the corner store!

    16. Re:AOL? by Anonymous Coward · · Score: 2, Informative

      AOL bought TW. It was a very shrewd move for AOL, because TW had a much higher intrinsic value to set a floor on the stock price when the internet bubble burst.

    17. Re:AOL? by Vancorps · · Score: 1

      I dunno, for 300k from NetApp you can get 100k IOPS over 60TB and that was three years ago. My new unit will do that with 100TB and costs even less, takes up half the space, and uses half the power. Oh, and it'll do 200k IOPS, at least projected. Of course if your database is jacked by that back-end storage then your database storage is poorly defined, especially since Oracle OCFS can utilize multiple storage back-ends simultaneously. MS SQL Server can achieve this as well through other means, but most situations I've encountered involve analytics slowing down production, which is easily solved with an OLAP database server.

    18. Re:AOL? by Anonymous Coward · · Score: 0

      You forgot this is slashdot, and therefore must rehash every old tired trope of scoffing, dismissing, and otherwise denigrating the Targets That All The Cool Kids Hate, a list including but not limited to: Microsoft (the root of all evil), Oracle (SQL is so wordy so DBs are teh sux), India (poor tech support and they all talk like Kwik-E-Mart clerks), C++ (bloated useless language), and AOL (lusers who type in all caps). The list changes only with the maturity level of those who continue to make such posts year after year after year.

      That said AOL is still a heavyweight in what's left of the dialup business.

    19. Re:AOL? by Anonymous Coward · · Score: 0

      Bebo is as popular in Brazil as facebook is here, surely all they need to do is put some spyware games on there and it'd be worth billions.

    20. Re:AOL? by St.Creed · · Score: 2, Informative

      This isn't true - what the AC says is true. TW was bought by AOL when AOL could leverage its 160 billion dollar fairy-dust value into tangible assets. If they hadn't done so they would have been gone years ago. It was a brilliant move by AOL and at the time, TW thought it was a great deal as well. TW got suckered, as we know now. But in that day and age it looked like a good move: AOL had the internet savvy, TW the IP - combine them and rule the Internet. Of course, that didn't quite go as planned.

      --
      Therefore, by the (faulty) logic you're using, you're just a cow with a keyboard - osu-neko (2604)
    21. Re:AOL? by Anonymous Coward · · Score: 0

      Bebo is gay.

    22. Re:AOL? by Anonymous Coward · · Score: 0

      Every article, that same comment gets posted. It hasn't been funny the last 100 times. Fuck the mods.

  2. rolling out solid state storage by Anonymous Coward · · Score: 1, Funny

    > AOL recently completed the roll out of a 50TB SAN made entirely of NAND flash

    ME TOO!!!

  3. What? by EndlessNameless · · Score: 5, Insightful

    As a DBA, I would love to have solid-state storage instead of needing to segment my databases properly and work with the software dev guys to make sure we have reasonable load distribution.

    Where can I get someone to pay a million dollars so I can do substandard work?

    --

    ---
    According to the latest ruleset, this post should be modded as Vorpal Flamebait +5.
    1. Re:What? by Jimmy+King · · Score: 5, Funny

      As long as you come really cheap, I can probably get you on where I work. You won't get cool hardware like that, but you can have the other half. Management seems to be ok with substandard work as long as apologizing to the customers continues to be cheaper than doing a good job or buying the hardware to cover up the poor job.

    2. Re:What? by Threni · · Score: 3, Insightful

      You're the DBA - do what you do best, and start Googling! :)

    3. Re:What? by eln · · Score: 2, Insightful

      I was thinking something very much along these lines. I can't believe that AOL is doing something more I/O intensive than everyone else in the world. If you're looking at buying something this expensive, you really need to go through your database design and application code with a fine-toothed comb and look for inefficiencies first.

      Of course, in the real world, this sort of thing (maybe not to this scale) happens all the time. We just had a customer that was having major performance problems. They demanded we put them on a massive $750,000 whiz-bang SAN device right away to alleviate their problems. So we did, and then their DBAs finally got off their asses, looked at the code, and made some changes that cut their I/O demand in half. Basically, they ended up burning $750,000 on something they didn't even need. I have a feeling AOL just spent $1,000,000 on something they didn't really need as well.

    4. Re:What? by h4rr4r · · Score: 1

      Mod parent way up. This is where big companies waste bundles of money. Rather than do the work right they throw ever more hardware at it.

    5. Re:What? by Nexzus · · Score: 2

      The cynic in me believes that in addition to a whiz-bang storage network, AOL also got some free publicity in tech circles with the implication that they're leading edge.

      --
      Karma: Can only be portioned out by the Cosmos.
    6. Re:What? by Lifyre · · Score: 1

      Not talking about any specific real world case:

      When does it become cheaper to throw more power at it than to improve code efficiency? It seems to me that this is taking the same step that a large amount of software has: deciding that it is cheaper to use a more powerful processor than to optimize the code...

      Granted they likely jumped the gun a little bit but the world needs early adopters...

      --
      I'll meet you at the intersection of "Should be" and "Reality"
    7. Re:What? by hxnwix · · Score: 3, Funny

      You could probably get by with a cloud of 486s, but why the fuck would you bother?

    8. Re:What? by Kjella · · Score: 2

      Where can I get someone to pay a million dollars so I can do substandard work?

      You try claiming the next big work they want will take more than a million dollars in DEV/DBA work compared to buying a million dollar SAN. At this point three things could happen:

      1. They say "um, never mind"
      2. They pony up the cash
      3. They call you on it

      While I've seen some rather dysfunctional companies, I still haven't seen any where the PHBs try reestimating the IT cost themselves. Mind you, I haven't seen overwhelmingly many companies that have a spare million dollars lying around either, so I figure #1 would happen in 95% of the cases. But I actually think #2 would happen in 4% of the remaining cases and #3 only in 1%. Unless your costs are questioned before it leaves IT...

      --
      Live today, because you never know what tomorrow brings
    9. Re:What? by kanad · · Score: 3, Interesting

      Careful what you wish for. Virgin Blue airlines in Australia suffered a 2 day blackout costing $20 million due to a single solid state drive failure. They have since gone back to normal drives. Read it at http://www.theaustralian.com.au/australian-it/still-no-clue-to-virgin-blues-20m-question/story-e6frgakx-1225937335722

    10. Re:What? by Gearoid_Murphy · · Score: 1

      That's exactly what I was thinking: unless AOL is doing something 'amazing', it's very likely that the requirements of their DB infrastructure are similar to those of everyone else. The way everyone else solves these problems is through a marriage of well-designed infrastructure and reactive software systems, AFAIK. That said, it still sounds uber cool and the ultimate DB toy/tool.

      --
      prepare the survey weasels.
    11. Re:What? by Firehed · · Score: 1

      If databases were implemented correctly, they'd take care of the load distribution themselves. Of course we'd all still be perfectly capable of writing stupid queries, but a lot of the bullshit we have to deal with when it comes to databases stems from rotational hard drives being so ill-suited to the random seeks that databases are so useful for.

      As far as I'm concerned, running your database on solid-state drives just amounts to a bug-fix in the database software. Stuff like data denormalization, avoiding joins, and sharding are effectively hacks around bugs, even if those bugs exist at a hardware level.

      --
      How are sites slashdotted when nobody reads TFAs?
    12. Re:What? by konohitowa · · Score: 3, Funny

      As long as you come really cheap, I can probably get you on where I work. You won't get cool hardware like that, but you can have the other half. Management seems to be ok with substandard work as long as apologizing to the customers continues to be cheaper than doing a good job or buying the hardware to cover up the poor job.

      You could always take a long lunch, cross the bridge from Redmond to Seattle, and apply at Amazon. I'm sure Microsoft would give you a couple of hours off to do that, right?

    13. Re:What? by Vancorps · · Score: 1

      They are hardly early adopters. NetApp has had SSD trays for a few years now and they have units that easily bust through 250k IOPS. I have a middle-tier NetApp storage implementation and I've got almost that much bandwidth available to me at a third of the cost. NetApp also has solid state modules for 10k that are used as cache to facilitate the early morning rush. This is nothing new and not really that impressive. For me it just reinforces all the AOL stereotypes about inefficiency. I will never come close to the bandwidth limitations of my storage. I will however run out of storage at some point, but block level dedupe has dramatically extended the life of existing storage.

    14. Re:What? by fluffy99 · · Score: 4, Informative

      I have a feeling AOL just spent $1,000,000 on something they didn't really need as well.

      They admitted as much in the article. They decided that it was cheaper to improve the hardware throughput than to spend the money on developers to try to trim the demand. They were also probably losing money by not meeting SLAs and a quick fix was cheaper in the long run. They also reduced power and cooling requirements, so there may be some long term payback there as well. The free publicity certainly didn't hurt either.

    15. Re:What? by DigiShaman · · Score: 1

      Not that I've crunched the numbers or anything. But I'm willing to bet that a team of DBAs outsourced to India and cheap hardware made in China provides a better ROI over an expensive team of American DBAs and a standard server configuration for the task.

      Sucks doesn't it?

      --
      Life is not for the lazy.
    16. Re:What? by Anonymous Coward · · Score: 0

      I have seen 3000 machine websites that if coded properly could easily run on 100.
      This stuff isn't that difficult to do, it takes time up front or money in the end.

    17. Re:What? by A+beautiful+mind · · Score: 1

      Depends. For certain workloads like mixed read/write (let's say 70%/30% - 40%/60%), solid state approaches are pretty good. If you've got lots of writes that you need to read back randomly, then buying lots of memory or doing multi-master or master-slave replication is not ideal.

      I definitely see a use-case for flash based approaches, where you both need the read and the write IOPS and don't have warehousing amounts of data, but the usecase is narrower than people think.

      Reasonable load distribution can be very expensive, powering a lot of database servers / disks is expensive. The power / cooling bill for servers is a significant expense.

      --
      It takes a man to suffer ignorance and smile
      Be yourself no matter what they say
    18. Re:What? by Anonymous Coward · · Score: 0

      But other companies DO use SSDs. Ever heard about Texas Memory Systems and their RamSans? Have a look at their customer list.

      Yeah, there's no lack of retarded architecture, code and queries around, but hardware does matter and there are situations you simply can't optimize yourself out of (if not, why doesn't everyone run their databases on 286s with floppies?)

    19. Re:What? by Anonymous Coward · · Score: 2, Insightful

      certainly the failure of an entire infrastructure after the failure of a single drive is the fault of the drive manufacturer. spinning disks never fail?

    20. Re:What? by Relayman · · Score: 1

      I'm so glad I work with IBM iSeries servers where the hardware takes care of this and we don't need a DBA for this (every object is automatically spread among all disks in the storage pool).

      IBM also has a utility that tells you how to match SSD with disk for the best overall performance. Then you only buy the SSD you need.

      --
      If I used a sig over again, would anyone notice?
  4. Sas bandwidth constrained??? by TheSunborn · · Score: 1

    It does mention that sas can 'only' deliver 5Gbit/sec - but is that not the bandwidth for each disk and thus not a problem at all?

    The reason the SSD is so much faster is most likely the nice seek time for SSD. And I really like the concept of them using flash chips directly. Now we just need something cheeper then 20$/GB :}

    1. Re:Sas bandwidth constrained??? by EndlessNameless · · Score: 1

      At the rate SSD storage is growing (and the capacity is being used), it is conceivable that a company could choose cheap MLC drives and simply plan on upgrading them before their expected time of death.

      With modern wear-leveling algorithms, reduced write amplification, and better physical longevity, I can see cheap SSDs lasting the 2-3 years their capacity would be good for.

      SATA SSD over iSCSI is starting to look very appealing now compared to Fibre Channel or SAS. Since silicon performance and capacity scale much faster than mechanical performance/capacity---and SATA devices are compatible with SAS host controllers---it should only be a few more years before this becomes commonplace.

      --

      ---
      According to the latest ruleset, this post should be modded as Vorpal Flamebait +5.
    2. Re:Sas bandwidth constrained??? by Anonymous Coward · · Score: 0

      It doesn't say what arrays they used and/or compared with, so we will never know what they mean about SAS.

    3. Re:Sas bandwidth constrained??? by TheRaven64 · · Score: 4, Insightful

      Now we just need something cheeper then 20$/GB

      Actually, the price was the most interesting part of this:

      at about four times the cost of a typical Fibre Channel disk array with the same capacity

      Four times the price and, what, ten? A hundred? times the IOPS? That makes NAND pretty much a no brainer for any heavy-use database.
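The price-versus-IOPS trade-off can be made concrete with a small Python sketch. The $1 million and 250,000 IOPS figures come from the summary; the 10x and 100x IOPS multiples are only the guesses floated in the comment, not measured numbers, and the function name is illustrative.

```python
# Cost per IOPS for the flash SAN, using the figures from the summary.
SSD_COST_USD = 1_000_000
SSD_IOPS = 250_000


def relative_cost_per_iops(price_multiple, iops_multiple):
    """Flash cost per IOPS relative to disk (values below 1.0 favor flash).

    price_multiple: how many times more the flash array costs.
    iops_multiple:  how many times more IOPS it delivers (a guess here).
    """
    return price_multiple / iops_multiple


if __name__ == "__main__":
    print(SSD_COST_USD / SSD_IOPS)          # dollars per IOPS for the flash SAN
    for guess in (10, 100):                 # hypothetical IOPS multiples
        print(guess, relative_cost_per_iops(4, guess))
```

At four times the price, flash wins on cost per IOPS as soon as it delivers more than four times the IOPS of the disk array it replaces.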

      --
      I am TheRaven on Soylent News
    4. Re:Sas bandwidth constrained??? by NevarMore · · Score: 1

      Cheaper than $20 a GB.

      I know that's expensive now, but I'm just old enough to remember when a GB of spinning magnetic disk was a big effing deal.

    5. Re:Sas bandwidth constrained??? by hairyfeet · · Score: 1

      I don't know about you, but I haven't seen SSDs battle tested enough for me to truly trust the things yet. With mechanical drives I've yet to have one "just die" as I ALWAYS got warnings something was going via drive noise, heat, random small errors, etc. And now SMART just makes that even easier to spot. On the other hand I've been told time and time again flash just "don't die" they wear out, yet I've had plenty of flash drives just go dead, and a couple of my early adopters managed to have dead SSDs. Not losing some space, not read only, stone cold dead with 100% data loss. I can't even remember the last time I had data loss with a HDD, even my customers who wait until the last second to bring their dying PC in I've managed to get their data off with Spinrite.

      So I'd love to hear from the /. guys in the trenches that have REAL experience pounding these drives. How well do they hold up? How many failures have you had? What kinds of failures? Because they can spout that MTBF bullshit, but anybody who's gotten a bad drive OOTB knows that the way they crank them things out, bad ones WILL get through. The question is, are they easy to spot like a bum HDD, and do they give plenty of warning before going tits up? What about it, admins?

      --
      ACs don't waste your time replying, your posts are never seen by me.
    6. Re:Sas bandwidth constrained??? by oldspewey · · Score: 1

      It does mention that sas can 'only' deliver 5Gbit/sec - but is that not the bandwidth for each disk and thus not a problem at all?

      More importantly, how many '540 Free Hours! CDs does that translate into?

      --
      If libertarians are so opposed to effective government, why don't they all move to Somalia?
    7. Re:Sas bandwidth constrained??? by rsborg · · Score: 2, Informative

      I don't know about you, but I haven't seen SSDs battle tested enough for me to truly trust the things yet. With mechanical drives I've yet to have one "just die" as I ALWAYS got warnings something was going via drive noise, heat, random small errors, etc. And now SMART just makes that even easier to spot.

      Google found differently in their massive hard drive survey... sometimes drives would just up and die with no SMART warnings. Also the most common SSD failure-case is lack of writes, at least you can retrieve data off the drive as opposed to a completely opaque device if the platter is frozen.

      --
      Make sure everyone's vote counts: Verified Voting
    8. Re:Sas bandwidth constrained??? by fusiongyro · · Score: 1

      The first problem on my mind right now though is that nearly all widely used relational databases are built with a lot of algorithmic assumptions about the disk. They spend a great deal of time ensuring that they only fetch the minimum number of blocks, and many higher end databases go to lengths to ensure that related blocks wind up near each other on disk, implement block caches and things like that. A lot of this is done to mitigate seek time.

      With SSDs, seek time is basically constant and there's no need to minimize it, though you still want to minimize number of fetches. However, all SSDs on the market (AFAIK) exhibit a profound performance degradation once the disks start having to erase blocks. Most disks postpone this as long as possible with an internal copy-on-write mechanism, but it's not uncommon for write speed to dramatically decrease once every block has been written to once. So there is a serious need to eliminate unnecessary writes and minimize necessary ones, which is not something most relational databases have put much effort into.

      I fully expect that in a few years most databases will have a tunable parameter for, am I dealing with SSDs or traditional HDDs, and will make appropriate optimizations for the type of disk, but I wouldn't be a bit surprised to hear that their performance improvements degrade sharply in a year or two when all of their array's blocks have been touched. At the same time, I think this area is ripe for exploitation by database vendors. I also wouldn't be surprised if the gap between SSD-backed and HDD-backed DBs were made much larger by software improvements rather than hardware improvements over the next few years. It'll be interesting to see.

    9. Re:Sas bandwidth constrained??? by hairyfeet · · Score: 1, Informative

      Yes I've read the Google report but IIRC we have NO access to their source, hell we don't even have access to detailed measurements of what types of load and I/O they were running. same as I wouldn't be surprised if you are running massive databases the trade off for SSD would make the $$$ worth it as in TFA, I want to see more "real world" average server and workstation loads, not the "insane pound the shit out of the drives" that someone like Google does. I also noticed in TFL they are basing everything on SMART, and while I said SMART was a nice add-on I almost never was warned simply by SMART, but more often drive noise and small read/write errors that I doubt Google sized places would even notice.

      So how about any? Any large deployments of SSDs in non-insane conditions? How are they holding up? Any failures? What types?

      --
      ACs don't waste your time replying, your posts are never seen by me.
    10. Re:Sas bandwidth constrained??? by EXrider · · Score: 2, Informative

      With mechanical drives I've yet to have one "just die" as I ALWAYS got warnings something was going via drive noise, heat, random small errors, etc. And now SMART just makes that even easier to spot.

      Google found differently in their massive hard drive survey... sometimes drives would just up and die with no SMART warnings. Also the most common SSD failure-case is lack of writes, at least you can retrieve data off the drive as opposed to a completely opaque device if the platter is frozen.

      Yeah, I've seen quite the opposite. Let me preface this with saying that I'm strictly talking about consumer and midrange drives, I've seen very few SCSI and SAS drives die without warning.

      In the past 10 years, in a company with about 200 nodes, I can literally count on one hand the number of hard drives that have given any SMART warnings leading up to their imminent failure. They pretty much always die while the OS accumulates log entries of bad blocks and I/O errors. Most of the time it was either death by shock, or death by manufacturer defect (Maxtor!). The former, SSD drives are pretty much immune to BTW. I would prefer an SSD in a road warrior or college student's laptop any day over a conventional HDD.

      --
      grep -iw skynet /etc/services
    11. Re:Sas bandwidth constrained??? by borgboy · · Score: 1

      Yeah. The relevant metric for databases really is $/IOPS, not $/GB.

      So, off the cuff, I figure you need a 700-disk array of 146GB drives to do this much storage at RAID 10 ( or 0+1 for you pedants ). That's a lot of random IO capacity. I don't know how poorly IOPS scale for systems at this magnitude, but I'd be surprised if the SSD solution was 10x IOPS over 700 15k spindles. Maybe 2-5x?
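The 700-disk estimate can be checked with a short Python sketch. It assumes decimal units (1 TB = 1000 GB) and that RAID 10 mirroring halves usable capacity; the function names are illustrative, and spares or formatting overhead are ignored.

```python
import math

DISK_GB = 146          # per-drive capacity from the comment
TARGET_USABLE_TB = 50  # size of AOL's SAN from the summary


def raid10_usable_gb(disks, disk_gb):
    """Usable capacity of a RAID 10 set: half the raw capacity is mirrors."""
    return disks * disk_gb / 2


def raid10_disks_needed(usable_gb, disk_gb):
    """Smallest disk count whose mirrored capacity covers usable_gb."""
    return math.ceil(usable_gb * 2 / disk_gb)


if __name__ == "__main__":
    print(raid10_disks_needed(TARGET_USABLE_TB * 1000, DISK_GB))  # minimum disks
    print(raid10_usable_gb(700, DISK_GB))                         # usable GB at 700 disks
```

Under those assumptions the minimum comes out a little under 700 drives, so the off-the-cuff figure holds up.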

      --
      meh.
    12. Re:Sas bandwidth constrained??? by SQL+Error · · Score: 1

      We run all our databases on SSD. Just like disk drives, and unlike your claim, sometimes they simply drop dead without warning, even the high-end ones.

      The performance gains are entirely worth it, though.

    13. Re:Sas bandwidth constrained??? by Rockoon · · Score: 1

      Dell reported in 2008 that "Our global reliability data shows that SSD drives are equal to or better than traditional hard disk drives we've shipped."

      --
      "His name was James Damore."
    14. Re:Sas bandwidth constrained??? by jon3k · · Score: 1

      I believe the article claimed a 4x performance increase.

    15. Re:Sas bandwidth constrained??? by jon3k · · Score: 1

      How exactly do you get 250k IOPS out of 700 disks by the way? If we assume 15k rpm 2.5" disks you _might_ PEAK at about 250 IOPS per disk, and that's being very generous. That's only 175k IOPS, and that's assuming straight reads. If you have a high mix of writes to a RAID volume (which could double your writes in RAID 10) you'd dramatically cut that down. By my math you'd need at LEAST 1,000 FC disks to get 250k IOPS.
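The spindle math above is easy to reproduce in Python. The 250 IOPS per disk figure is the generous peak from the comment; the write penalty of 2 is the usual rule of thumb for RAID 10 (each host write lands on two mirrored disks) and is an assumption here, as are the function names and the 50% write mix in the example.

```python
import math

PER_DISK_IOPS = 250  # generous peak for a 15k rpm disk, per the comment


def raw_iops(disks, per_disk=PER_DISK_IOPS):
    """Aggregate back-end IOPS if every spindle runs at its peak."""
    return disks * per_disk


def disks_for(target_iops, per_disk=PER_DISK_IOPS):
    """Spindles needed to reach target_iops with a pure read workload."""
    return math.ceil(target_iops / per_disk)


def host_iops(backend_iops, write_fraction, write_penalty=2):
    """Host-visible IOPS once each write costs write_penalty back-end I/Os."""
    cost_per_host_io = (1 - write_fraction) + write_fraction * write_penalty
    return backend_iops / cost_per_host_io


if __name__ == "__main__":
    print(raw_iops(700))                   # pure reads on 700 disks
    print(disks_for(250_000))              # disks needed for 250k read IOPS
    print(host_iops(raw_iops(700), 0.5))   # hypothetical 50% write mix
```

With any meaningful share of writes the host-visible number drops well below the 175k read-only peak, which is the point being made.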

      I'm just curious, how did you come up with 250k IOPS with 700 disks? Short-stroking? I apologize if I'm missing something obvious, I'm not a storage expert by any stretch of the imagination!

      Oh, another thing worth considering - heat and power. I have to assume this many disks comes in at a fraction of the power consumption. Is that worth considering in the $/IOPS as well, if we look beyond the CapEx costs of the drives?

    16. Re:Sas bandwidth constrained??? by borgboy · · Score: 1

      I didn't get 250k IOPS. I _said_ 250k IOPS was 2-5x better. I used the same math you did, and specifically hedged about not knowing how poor the scaling was with these kinds of systems. I am _not_ a storage engineer, just a developer with a (professional) interest in high performance random IO systems.

      --
      meh.
    17. Re:Sas bandwidth constrained??? by jon3k · · Score: 1

      Then your guess was pretty close! I thought I read it quoted at 4x faster. There's also the fact that it uses 90% less power, and I would assume 90% less cooling as well? Not to mention the dramatic reduction in floor space (not sure on their cost per sq ft obviously). I wonder what the difference in operational costs of the SSD array would be vs magnetic disks. And of course, no matter how many spinning disks you throw at it, you'll never get 1ms access times, unless it's coming out of RAM cache. I can tell you that you can expect to pay probably close to $1k/disk for 146GB 15K drives from the major storage vendors. Even if we assume only $500k for 700 disks there's still a lot of storage infrastructure to pay for (enclosures, cabinets, directors, cabling, etc, etc). More throughput (maybe not peak theoretical?), more IOPS, less floor space, less cooling, less power. Seems like a pretty attractive solution all around, depending on your workload.

    18. Re:Sas bandwidth constrained??? by Anpheus · · Score: 1

      Your transactions per second won't scale as well as your IOPS because with spinning disks, there's still a significant latency before your data actually gets written to disk.

      RAID just increases the number of in-flight IOs, widening the throughput but not decreasing its latency per disk.
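The throughput-versus-latency distinction is Little's law: throughput equals the number of in-flight requests divided by the latency of each one. A minimal Python sketch follows; the 5 ms disk latency and 1 ms flash latency are illustrative assumptions, not measurements from the article.

```python
DISK_LATENCY_MS = 5    # assumed service time for one random disk I/O
FLASH_LATENCY_MS = 1   # assumed service time for one flash I/O


def iops(in_flight, latency_ms):
    """Little's law: throughput = concurrency / latency."""
    return in_flight * 1000 / latency_ms


if __name__ == "__main__":
    # One I/O at a time, e.g. a single stream of dependent writes.
    print(iops(1, DISK_LATENCY_MS))
    # 700 spindles each servicing one request: more throughput, same latency.
    print(iops(700, DISK_LATENCY_MS))
    # The same single serialized stream on lower-latency storage.
    print(iops(1, FLASH_LATENCY_MS))
```

Adding spindles raises the second number but never the first; only lower latency helps a workload that has to wait for each I/O before issuing the next.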

    19. Re:Sas bandwidth constrained??? by borgboy · · Score: 1

      RAID by itself does only increase the number of in-flight IOs, but it almost always comes with that most magical of pixie dust, the battery-backed cache.

      The other point that I'll make is that often the only writes your RDBMS is waiting on are log writes, which are sequential anyway.

      In any case - I'll cede your point that spinning rust will likely NEVER scale as well as NAND.

      --
      meh.
    20. Re:Sas bandwidth constrained??? by borgboy · · Score: 1

      Power and cooling are a big win here, no doubt.

      What would really be news here would be database engines and/or filesystems that grokked SSD performance patterns well and could combine pools of spinning disks and SSDs in optimal ways for a given workload.

      --
      meh.
    21. Re:Sas bandwidth constrained??? by Skal+Tura · · Score: 1

      SAS can deliver 6Gbps, as can SATA nowadays too (Tho rare). Fastest SSDs hit this limit, but there's no simple way to go beyond (PCIe controller).

      SANs have their own bottleneck: SAN switches, which due to their "centralized nature" (all traffic from all nodes goes through a certain set of switches, or a single switch) lower the overall throughput.

      There are ways to have waaaay more IOPS, and waaaay higher throughput total, for way less money. These ways are what we intend to use in our VM cluster to be brought up next spring. We are initially targeting only about 20k IOPS, with throughput scalable up to 20Gbps per node (starting throughput is 5Gbps per node to storage, and upgraded as needed). As these are VMs, throughput is not as important as IOPS. Only in special cases will we be using SSDs. The infra has been designed so that if we maximize a node, the FSB BW will eventually be a potential bottleneck. When we get this cluster online, it's going to be really fun to run some storage tests and see how many IOPS and how much total throughput we can achieve :D

    22. Re:Sas bandwidth constrained??? by Skal+Tura · · Score: 1

      Something is wrong when the DB is handling that ... The OS underneath should draw these conclusions and optimize based on the type of storage, without intervention from the DB software.

      Of course, applications do have to manage the load they generate to a degree, but down to the hardware level? That's simply too much, better to trust the kernel to make the right decisions! Then again, the world isn't perfect ...

      We have to battle with profoundly bad HDD IO management in our software (still, that software is the best for our business), but it has more to do with the complete lack of caching mechanisms of any real tangible value, just the bare minimum. Fortunately, this too is about to be fixed :) We tried an alternative that was the other extreme, spending as much RAM & CPU as possible to avoid HDD thrashing, and the total throughput was multiple times lower :O

    23. Re:Sas bandwidth constrained??? by fusiongyro · · Score: 1

      What's wrong is that the OS is a general-purpose piece of software, and it is designed to guarantee relatively good performance across a wide range of use cases. However, databases are a very special case and as such they are able to make better guesses about when and where they are going to need their data. The OSes we live with are a hodgepodge of hardware abstraction and conceptual abstraction. This is one reason exokernels are so interesting to me: they should give you an OS built out of layers, so if you just need to treat all HDDs as the same kind of thing, you can do that, whereas if you need a filesystem for conceptual clarity, you can do that too, and these ideas don't step on each other's toes.

  5. It is called HDSL... by Yaa+101 · · Score: 3, Informative

    You can read more about that here:

    http://www.google.com/search?q=High-Speed+Data+Link

    1. Re:It is called HDSL... by afidel · · Score: 1

      Whee, 1GBps (10Gbps) per direction. How is this significantly better than COTS 8Gbps FC or 10Gb iSCSI/FCoE that don't need a proprietary card? Heck if you want LOTS of bandwidth use 40Gb IB. Plus according to the manufacturer's site they DO support the hosts through COTS connections.

      --
      There are 4 boxes to use in the defense of liberty: soap, ballot, jury, ammo. Use in that order. Starting now.
  6. Who are AOL? by Anonymous Coward · · Score: 0

    AOL have a website?

  7. What is this SAN connected to? by Anonymous Coward · · Score: 1, Interesting

    I wonder what machines the LUNs are presented to. I'm guessing either extreme high-end x86 hardware, SPARC, or POWER. Most machines out there would not even notice the performance increase.

    1. Re:What is this SAN connected to? by Skal+Tura · · Score: 1

      Individual nodes do not matter that much after a certain point. I say shoot for the best combination of power consumption & space, which offers you the lowest cost per N amount of performance, and distribute the hell out of it :)

  8. Really? by Archangel+Michael · · Score: 3, Informative

    My impression has been that this has been what has been going on for some time now with all the larger database operations, and one of the reasons why SSD have not yet come down in price is that all the best units and tech are going to the big companies as fast as they can get it from the manufacturers. I wouldn't be surprised to see someone like Google saying something like "yawn, 50TB" and saying that they have PETABYTE versions already out there.

    If you run a Database of any size, especially ones with large read to write ratios, SSD would only make things faster. And speed counts.

    --
    Agent K: A *person* is smart. People are dumb, stupid, panicky animals, and you know it.
    1. Re:Really? by Achromatic1978 · · Score: 1

      I wouldn't be surprised to see someone like Google saying something like "yawn, 50TB" and saying that they have PETABYTE versions already out there.

      Yeah, because at $1MM for 50TB, a $20MM investment by a publicly owned company in such a thing would entirely fly under the radar...

    2. Re:Really? by Anonymous Coward · · Score: 0

      Google could probably do it for $5M.

    3. Re:Really? by Galestar · · Score: 0

      Or they would just use BigTable or other distributed options. Scaling up can only do so much...

      --
      AccountKiller
    4. Re:Really? by Growlor · · Score: 1

      If they were going to do something custom, I wonder if setting-up a RAM based drive would have been faster and/or cheaper. It's kind of fun to fantasy engineer stuff like this: For $1M I wonder if you might be able to buy a decent size UPS and generator (just need it to last long enough to cover a write to a slow drive if the mains power went out) and would need say 2X the storage, 1 in the fast custom-made RAM drives and another in slower/cheaper regular spinner platter drives (or tapes.)

    5. Re:Really? by jon3k · · Score: 1

      Well considering it could easily be hidden in any of their billions of dollars (?) in infrastructure line items in their SEC filings I assume they could "hide" it reasonably well, no?

  9. Well, their purchasing agents got Microsofted by guruevi · · Score: 1

    "but instead has a proprietary interface that offers up 5 to 6Gb/s throughput."

    You know that SAS offers 6Gb/s throughput and Infiniband up to 300Gb/s (with 8 and 16 being more common).

    Either way, $1M for a bunch of SAS SSD (even SAS NVRAM) is way overpriced imho. They could've done it cheaper.

    --
    Custom electronics and digital signage for your business: www.evcircuits.com
    1. Re:Well, their purchasing agents got Microsofted by Galestar · · Score: 2, Funny

      They could've done it cheaper.

      It's AOL, would you actually expect them to make intelligent, informed decisions?

      --
      AccountKiller
    2. Re:Well, their purchasing agents got Microsofted by kc0aua · · Score: 1

      Note the difference between GB and Gb. "So you're getting the 4GB/sec. of PCIe bandwidth, not the 5Gbit/sec. or 6Gbit/sec. SAS bandwidth. You're getting almost an order of magnitude of bandwidth to the storage internally just because you're using an interface that's capable of it,"

    3. Re:Well, their purchasing agents got Microsofted by Anonymous Coward · · Score: 0

      "but instead has a proprietary interface that offers up 5 to 6Gb/s throughput."

      You know that SAS offers 6Gb/s throughput and Infiniband up to 300Gb/s (with 8 and 16 being more common).

      Either way, $1M for a bunch of SAS SSD (even SAS NVRAM) is way overpriced imho. They could've done it cheaper.

      That's 5 to 6 GB/s for PCIe, with a capital B. SAS is 6Gbps, or about 750MB/s.

    4. Re:Well, their purchasing agents got Microsofted by jon3k · · Score: 1

      They claim $20/GB which I have to assume includes more than the bare disks themselves. People routinely pay north of $30/GB for 15K fiber channel storage systems. Lots of things to consider - shelves, hba's, rack enclosures, I/O directors, maybe even power and cooling? Right now Intel SLC drives are over $11/GB and that's just for bare drives. I'd be curious to see if you could build a 50TB RAID5 flash based storage system for under $1M.

    5. Re:Well, their purchasing agents got Microsofted by Anonymous Coward · · Score: 0

      Just like I'd expect intelligent comments on /.

  10. DB Performance Issues by scubamage · · Score: 1

    Just curious, have they exhausted all of their software avenues for this? While yes, I understand they have a huge relational DB, I know other companies that are just as big/bigger and they have next to no issues. Maybe it's just poorly designed? That's a hell of a lot of (albeit super sexy) hardware to throw at what could be a software problem. Thoughts?

    1. Re:DB Performance Issues by jandrese · · Score: 1

      They mentioned in the article (albeit obliquely) that the sysadmin thought he could probably reduce the load by working with the software guys, but in the end it would cost more than the $1M he spent on this solution. Plus, it might not even work if the software guys were in fact competent and the problem is just that you have too many users for the old hardware.

      --

      I read the internet for the articles.
    2. Re:DB Performance Issues by Anonymous Coward · · Score: 1, Informative

      The parent post has it right.

      Also, even if the software/dba guys can tune all the apps* there is the opportunity cost of having them spend time on a problem that is not specific to AOL's expanding business (delivering content in attractive / magnetic ways) and spend it on a technical problem that may actually just be a growth problem.

      If you're the manager in charge of this decision, you embrace the crisis, spend the cash to get yourself some new capacity and keep rolling. If you want to tune the apps/dbs as well, you do that in parallel, install the new hardware and get the performance bump that validates your strategic choice. Then you roll out the performance improvements and make everyone even happier.

      * Remember: they're serving up data for an unknown number of applications here; tuning might encompass a huge amount of db profiling, multiple application teams, and god knows how many load interdependencies, both technical and interpersonal.

  11. I wonder how the total cost compares by bareman · · Score: 1

    once you figure the total energy savings (reduced power needs, reduced cooling needs, etc) over the lifetime of the drive I wonder how much more expensive it is. I can't wait for SSD to become more affordable. I'd like to have that in our SANs too.

    1. Re:I wonder how the total cost compares by afidel · · Score: 1

      It's not even close, power is ~5% of the TCO of anything enterprise grade, maybe 10-15% if you include capital costs for UPS, generator and AC into the equation.

      --
      There are 4 boxes to use in the defense of liberty: soap, ballot, jury, ammo. Use in that order. Starting now.
    2. Re:I wonder how the total cost compares by outZider · · Score: 1

      Hm. I've always seen power as the most expensive part of an enterprise deployment -- see also why these companies are building data centers in cheap-power areas.

      --
      - oZ
      // i am here.
    3. Re:I wonder how the total cost compares by afidel · · Score: 1

      For scale out power's probably a bigger percentage of the total, there you're talking about cheap hardware, no software licensing fees, and no support contracts.

      --
      There are 4 boxes to use in the defense of liberty: soap, ballot, jury, ammo. Use in that order. Starting now.
  12. This is the comeback of AOL by Anonymous Coward · · Score: 1, Funny

    Remember when AOL used to send you so many floppies in the mail, you didn't need to go out and buy them yourself?

    I'm looking forward to getting 50 TB SANs in the mail.

    1. Re:This is the comeback of AOL by Anonymous Coward · · Score: 0

      how about a dick in your ass?

    2. Re:This is the comeback of AOL by PRMan · · Score: 1

      Ah, those were the days. But you needed to do a full format on them first or they would lose the data.

      --
      Peter predicted that you would "deliberately forget" creation 2000 years ago...
  13. Wait! by Lifyre · · Score: 1

    Does this mean AOL is doing something novel and progressive? Something doesn't feel right about that...

    I'm so confused!

    --
    I'll meet you at the intersection of "Should be" and "Reality"
    1. Re:Wait! by Anonymous Coward · · Score: 0

      It's time for AOL to grow again now that they wiggled out of the claws of the conglomerate business that is Time Warner. How do you grow in the technology sector? Innovate! They have accomplished this by implementing the world's first totally NAND SAN. Now what if they were to take this SAN and improve on it either through software or hardware, making it cheaper, faster and more efficient? It could change internet data serving forever, granted they don't give it a craptastic name like "cloud computing" or "fairy unicorn happy servers" (and avoid running on pure gimmick like those other cute named things).

  14. Finite number of program-erase cycles? by PatPending · · Score: 2, Informative

    I wonder what the read/write rating is vs. a hard disk?

    Wikipedia puts flash at 1,000,000 program-erase cycles

    --
    What one fool can do, another can. (Ancient Simian Proverb)
    1. Re:Finite number of program-erase cycles? by pz · · Score: 1

      Troll. Not even a very good one.

      --

      Put my fist through my alarm clock with its ding-dong death inside my ear. - The Blackjacks.
    2. Re:Finite number of program-erase cycles? by SQL+Error · · Score: 2, Interesting

      It's a non-problem. With Intel's 64GB X25-E drive, for example, you can do non-stop random writes for 6 years before you run into problems. We run all our databases on SSDs, mostly Intel and FusionIO ioDrives.

      That said, we've had drives simply drop dead with a controller failure. You still have to run a RAID array, even with SSDs.

    3. Re:Finite number of program-erase cycles? by jon3k · · Score: 1

      Wow! I've always wanted to talk to someone who was running production databases on FusionIO ioDrives. Can you explain what the storage setup is like? Do you use FusionIO ioDrives in a Tier 1 sort of configuration, backing it with SSDs, or are they totally independent? What's the server hardware configuration like? x86? What manufacturer? How many ioDrives per host? What's the total capacity and performance (throughput, IOPS)? What about the total SSD capacity? Are you using magnetic media for anything in your DB systems? Say, horizontally partitioning out old data and moving to SATA near-line tiers for OLAP, etc? So many questions! You should write up a blog post somewhere about your setup, I guarantee there are a lot of people that would be very interested to hear the details!

  15. Why not go out and buy a ready made SAN? by ClaytonianG · · Score: 1

    Although I'm certain the person designing the SAN had a blast doing so and did an excellent job, it still seems it would have been faster/easier to go with a pre-existing SAN/DB system such as Oracle's exadata2

    I've personally witnessed the exadata2 process close to the advertised 1,000,000 IOPS (well, it was in a controlled demo environment done by Oracle, but still, it was impressive).

    I'd also be curious how much the second SAN would cost. If the first one costs $1M, will the second one be cheaper, thus justifying developing the system in house?

  16. Failure rate? by Anonymous Coward · · Score: 0

    Not the brightest people in the world there at AOL (what do you expect?). I can't wait to see what their failure rate is after a year or so of usage.

    Unless they have improved recently my experience with SSD/Flash drives is that they fail quite often. I have never had one last more than a year with relatively heavy use (developer workstations and database usage).

    I would put good money on them losing at least 50% of the whole array over a one year period (I actually think the odds are pretty good that they lose 100% but I'll leave some room since SSD's could have improved since I tried them about a year ago).

    1. Re:Failure rate? by afidel · · Score: 1

      SLC SSD's should last 5-10 years at 100% interface saturated writes.

      --
      There are 4 boxes to use in the defense of liberty: soap, ballot, jury, ammo. Use in that order. Starting now.
    2. Re:Failure rate? by Anonymous Coward · · Score: 0

      I think "should" is the key word. They're supposed to last a lot longer than spinners but in my real world experience they fail way more often. That was my point.

      Specs be damned, what I have actually experienced in practical usage is a high failure rate (everyone I know that has tried SSD's for databases or development work has had them fail within a year).

    3. Re:Failure rate? by afidel · · Score: 1

      Are they using SLC drives or consumer MLC drives? Because the x25e's have been running well for me since they launched.

      --
      There are 4 boxes to use in the defense of liberty: soap, ballot, jury, ammo. Use in that order. Starting now.
    4. Re:Failure rate? by Anonymous Coward · · Score: 0

      So, the data should still be intact when the company goes bankrupt?

  17. Ok guys.... by mrsteveman1 · · Score: 2, Funny

    It's very easy to fall in love with this stuff once you're on it.

    I said the same thing about coke in the 70's....

    I guess what i'm saying is, no one loan money to AOL until they admit they have a problem.

    1. Re:Ok guys.... by sharkey · · Score: 1

      I said the same thing about coke in the 70's....

      Thank goodness they came out with New Coke and drove us away!

      --
      "Outlook not so good." That magic 8-ball knows everything! I'll ask about Exchange Server next.
  18. interface? by loxosceles · · Score: 1

    From summary:

    One reason the flash SAN is so fast is that it doesn't use a SAS or PCIe backbone, but instead has a proprietary interface that offers up 5 to 6Gb/s throughput.

    What are they talking about? The violin memory website says the appliances themselves support FC, 10 GbE, and Infiniband connections. Their performance page says that the appliance can be directly connected to a pcie bus, presumably using some sort of pass-through interface card, but what physical connector and media are used?

    1. Re:interface? by LordMyren · · Score: 1

      I just enjoyed the fact that 5-6Gb/s is a breath-stealing 150% the speed of a single lane of PCIe v2.0, and equal to SATA3's rate. Your implicit question of "what actually runs this SAN," what's behind these interfaces billed as blazing fast, is oh so much more dirt on the grave of this fluff piece. Still, from the outset, the "facts" presented are already pretty funny.

    2. Re:interface? by Anonymous Coward · · Score: 1, Informative

      If you look at the Violin Memory website (vmem.com) you can see a Memory Array presents a PCIe interface. What AOL is doing is using a Violin SAN head which connects to multiple Memory Arrays and then presents Fibre Channel to their EMC VPLEX (Storage Virtualization Layer) and then they can provision to their individual internal customers as needed.

      I think there is some confusion in what is available from the VPLEX point of view which can aggregate multiple Memory Arrays to present whatever performance profile they want - not what is available from an individual box.

    3. Re:interface? by Anonymous Coward · · Score: 0

      I believe they call it "trade secrets" but I could be wrong.

    4. Re:interface? by Guido+del+Confuso · · Score: 1

      The Slashdot post is incorrect, according to the article. The actual throughput is about 4GB/s.

    5. Re:interface? by Skal+Tura · · Score: 1

      16 lane PCIe V3 is 16GB/s, or 128Gbps to compare oranges to oranges. The more common V2, 16 lane is 8GB/s, or single lane 500MB/s=4000Mbps. So a basic x4 PCIe connector would be more than able to handle that speed.
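
      For anyone who wants to redo the lane arithmetic, here is a minimal sketch. It assumes the usual per-lane signalling rates (5 GT/s for v2 with 8b/10b encoding, 8 GT/s for v3 with 128b/130b) and ignores packet and protocol overhead; the function and table names are made up for illustration.

```python
# Per-lane signalling rate (GT/s) and line-code efficiency per PCIe generation.
PCIE_GENERATIONS = {
    2: (5.0, 8 / 10),     # 8b/10b encoding
    3: (8.0, 128 / 130),  # 128b/130b encoding
}

def pcie_gbytes_per_s(generation: int, lanes: int) -> float:
    """Approximate one-direction bandwidth in GB/s, protocol overhead ignored."""
    gt_per_s, efficiency = PCIE_GENERATIONS[generation]
    return gt_per_s * efficiency * lanes / 8  # bits -> bytes

for gen, lanes in [(2, 1), (2, 4), (2, 8), (2, 16), (3, 16)]:
    gbytes = pcie_gbytes_per_s(gen, lanes)
    print(f"PCIe v{gen} x{lanes}: {gbytes:.2f} GB/s = {gbytes * 8:.0f} Gbps")
```

      Under these assumptions a v2 x4 link carries about 2 GB/s (16 Gbps), which covers the 5 to 6Gb/s in the summary, while the 4GB/s quoted from the article would take roughly a v2 x8 link.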

  19. Easy to fall in love? by mysidia · · Score: 0, Offtopic

    It's very easy to fall in love with this girl once you're on her.

    There, fixed it for you.

    Although I suppose it's possible you were talking about drugs, alcohol, cigarettes, caffeine, candy, or pizza. Some people call those things 'stuff'

    But surely one doesn't really fall in love with a million dollar box that will be worth $100 in 5 years.

    And your computer apps will adjust to the storage capabilities of your solid-state storage and require yet even more performance at even higher capacities.

    Ooops... back to mechanical disks.

    1. Re:Easy to fall in love? by Overzeetop · · Score: 1

      But surely one doesn't really fall in love with a million dollar box that will be worth $100 in 5 years.

      Why do you have to bring Demi Moore into the discussion?

      --
      Is it just my observation, or are there way too many stupid people in the world?
  20. And? by Rooked_One · · Score: 1

    6Gb/s huh? Ok, so I'm assuming you have some special cable connecting to the SAN... I know offhand that Dell sells the MD3200 - a DAS unit that transfers 6Gb/s... Although I estimated it was about 10GB in 30 seconds.

    I've got to be missing something here. The seek times are probably out of this world with this "specialized" SAN, but then we have equallogic SANs that can have 48 SSDs and have 10Gb/s...

    Hey AOL - you are in the arctic right? Can I interest you in some of this amazing ice?

    1. Re: And? by SunSpot505 · · Score: 1

      I've got to be missing something here.

      Indeed you are missing something, as is the person who wrote the summary of the article for /. : It should read 4GB/s rather than 6Gb/s, two very very different numbers. FTA: "So you're getting the 4GB/sec. of PCIe bandwidth, not the 5Gbit/sec. or 6Gbit/sec. SAS bandwidth. You're getting almost an order of magnitude of bandwidth to the storage internally just because you're using an interface that's capable of it," Pollack said.

      Granted, it was the last line, so you really had to dig for that one, read the article next time.

    2. Re: And? by Rooked_One · · Score: 1

      I did - I glazed right over the large B. Woops! So now I have to think of some quick witted scat to cover my tracks....

      Nah :P

  21. Cheap by curious.corn · · Score: 1

    Hei folks,

    20$/GB is not that much IMHO... is that net capacity, does it include geographical replication? Depending on the answer, the real news could be that SSD storage is so much more competitive than one may have thought... :D

    --
    Mi domando chi è il mandante di tutte le cazzate che faccio - Altan
    1. Re:Cheap by Osgeld · · Score: 1

      20$/GB is not that much IMHO

      yea, for a decade ago

    2. Re:Cheap by AcquaCow · · Score: 1

      Once you factor in the total cost of ownership for a disk-based SAN eg: heat/cooling/maintenance/etc... Flash is actually pretty cheap.

      --

      up 12 days, 22:30, 2 users, load averages: 993.20, 994.21, 994.56
      *makes note to limit user processes...
    3. Re:Cheap by jon3k · · Score: 1

      I'd ignore the cost ($20/GB) even though it's actually pretty good. The real interesting thing here is $/IOPS. To build out a similar system based on $/IOPS using 15K FC disks would cost significantly more, and wouldn't ever provide the same access time (less than 1 millisecond). I'd love to see a storage vendor quote out a DELIVERED COMPLETE SYSTEM that provides 250k IOPS using fiber channel disks and do it for under $1M. I don't think any of the big guys (EMC, 3PAR, Hitachi, etc) could touch it.

  22. $1M? A bargain? by Anonymous Coward · · Score: 0

    Um. Am I the only one who thought the speculated price was a bit low?

    I would be surprised if that $20/GB isn't the raw per-GB cost and the 50TB is the usable figure for how much storage they ended up with.

    That means there's RAID in there, probably spares, any other overhead and hmmm did I see that it's mirrored across two six-node clusters?

    $1M 'tain't that much for some screaming storage and my first thought was "wow...that is really reasonable for that much solid state"

  23. Look at Google and Facebook, not AOL's bandaid by cryfreedomlove · · Score: 1

    I look to Google, Facebook, and other massively scaled companies that build highly distributed systems running on low availability commodity systems. These guys are not throwing Solid State Memory at biggus relational databases. Sorry, but this is a bandaid for a dinosaur.

    1. Re:Look at Google and Facebook, not AOL's bandaid by Anonymous Coward · · Score: 0

      Pollack's presentation.

      One thing about SSDs makes this problem easier to solve.

    2. Re:Look at Google and Facebook, not AOL's bandaid by jon3k · · Score: 1

      I think it's just a dramatic difference in workloads. I think Facebook and Google have massive storage capacity requirements, whereas AOL just wanted more IOPS and/or throughput. But, the bottom line is, it was cheaper to throw hardware at this particular problem than engineering expertise. Right tool for the job, I suppose.

  24. The cost/GB is an irrelevant measure by Anonymous Coward · · Score: 0

    This is clearly an application where $/IOP is the problem, not $/GB. If they need 250K random IOPs, they'd need something in the order of 800-850 FC disk drives, and a honking big array to house them, and they certainly wouldn't see any change from $1M for that configuration from EMC, then you add the running costs in terms of power for that configuration and the FLASH stuff looks really attractive.

  25. Summary is wrong by I'm+not+god+any+more · · Score: 0

    FTFA: They're using the NAND memory on a custom board sitting on a PCIe bus, they are getting 4GB/sec.

  26. so what by Anonymous Coward · · Score: 0

    sas is capable of 6 Gb/s, that's why fibre channel is being phased out. aol isn't doing anything any other enterprise isn't doing, only difference is somebody decided to write about it.

  27. Why do I think... by Anonymous Coward · · Score: 1, Funny

    The developers / DBA's on this project are not familiar with the 'CREATE INDEX' statement.

  28. RAID 5? by daver_au · · Score: 3, Insightful

    They wanted performance and went *RAID 5*? That pretty much sums the entire approach up. Let's not optimise the application first, the database second, but instead hide the problem by throwing hardware at it. Then what we'll do is use a RAID configuration that hobbles the write performance of the arrays and let's not mention what happens to performance when we lose a disk (don't say it won't happen).

    Sure, RAID 5 is the answer to some things, but not when the question is database *PERFORMANCE*.

    Also - latency is more important than IOP/s. I don't care how many IOP/s you can do, if your latency is high, the performance won't be. Most garden variety storage engineers don't seem to grasp this concept.
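
    To put a rough number on the RAID 5 write penalty, here is a small sketch using the textbook rule of thumb that a small random write costs 4 back-end IOs on RAID 5 (read data, read parity, write data, write parity) and 2 on RAID 10. The 250,000 raw IOPS budget and the write fractions are assumptions picked for illustration, not AOL's actual workload, and real controllers with caching will behave differently.

```python
# Back-end IOs per small random host write (textbook rule-of-thumb values).
WRITE_PENALTY = {"raid0": 1, "raid10": 2, "raid5": 4, "raid6": 6}

def host_iops(raw_iops: float, write_fraction: float, level: str) -> float:
    """Host-visible IOPS for a raw back-end IOPS budget and a read/write mix."""
    if not 0.0 <= write_fraction <= 1.0:
        raise ValueError("write_fraction must be between 0 and 1")
    penalty = WRITE_PENALTY[level]
    # Each read costs 1 back-end IO; each write costs `penalty` back-end IOs.
    return raw_iops / ((1 - write_fraction) + write_fraction * penalty)

RAW_IOPS = 250_000  # hypothetical raw budget, for illustration only
for level in ("raid10", "raid5"):
    for write_fraction in (0.1, 0.5):
        print(level, write_fraction, round(host_iops(RAW_IOPS, write_fraction, level)))
```

    With these assumed numbers the gap is modest for a read-heavy mix and large for a 50/50 mix, which is why the read/write ratio matters so much to the choice.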

    1. Re:RAID 5? by Lieutenant_Dan · · Score: 1

      What would you have recommended in regards to a higher performance RAID config? Just curious ...

      --
      Wearing pants should always be optional.
    2. Re:RAID 5? by pankkake · · Score: 1
      --
      Kill all hipsters.
    3. Re:RAID 5? by Anonymous Coward · · Score: 0

      They are using SSD drives.

      Each drive in a modern SSD can sustain 30,000 IOPs (read or write)

      They will have more of a problem with XOR processing than with keeping parity updated on a RAID 5 array.

      As for latency ... are you kidding me .... latency for SSD is measured in microseconds, not milliseconds. Worst case latency is similar to best case latency. Linear performance scaling from increased queue depths.

      As long as the things aren't failing left, right and centre, these things are a database's wet dream.

    4. Re:RAID 5? by jon3k · · Score: 1

      I thought that was very odd as well, but maybe their workload is dramatically more reads than writes? RAID 5 obviously gives them a LOT more capacity, so maybe it made sense for them?

  29. A million dollars = half an executive per year by Anonymous Coward · · Score: 0

    I'm enjoying the comments which are sarcastically asking whether AOL is doing anything amazing to justify this investment. A million dollars is not a big deal in terms of capital investment, even to a firm which has taken its share of losses recently. If the choice was between an extra three to four months of performance problems while the developers work out the best way to tune the db and spending a million dollars on storage you probably would have bought in some form anyway, then that's no choice at all if you're an operations director or whoever approves this sort of thing.

  30. $/IOPS by RegTooLate · · Score: 1

    If you compare their IOPS price to a Fibre drive you will find that AOL got quite the bargain. 250,000 IOPS / 180 IOPS = 1388 10kRPM Fibre drives * $2,000 a pop - $1M = $1.7M savings.
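
    The same back-of-envelope sum, spelled out as a sketch. The 180 IOPS per drive and $2,000 per drive are the figures from this comment, not vendor quotes, and the function name is made up for illustration.

```python
import math

def drives_needed(target_iops: int, iops_per_drive: int) -> int:
    """Spinning drives needed to reach target_iops, rounded up to whole drives."""
    return math.ceil(target_iops / iops_per_drive)

TARGET_IOPS = 250_000       # headline figure for the flash SAN
IOPS_PER_DRIVE = 180        # assumed per-drive figure from the comment
PRICE_PER_DRIVE = 2_000     # assumed dollars per drive from the comment
FLASH_SAN_COST = 1_000_000  # dollars, from the summary

drives = drives_needed(TARGET_IOPS, IOPS_PER_DRIVE)
disk_cost = drives * PRICE_PER_DRIVE
savings = disk_cost - FLASH_SAN_COST
print(f"{drives} drives, ${disk_cost:,} in bare disks, ${savings:,} saved")
```

    Rounding up to whole drives gives 1,389 drives and a little under $1.8M saved, in line with the roughly $1.7M above, and that is before enclosures, power and cooling. Raising IOPS_PER_DRIVE to about 300 lands near the 800-850 drive estimate given elsewhere in the thread.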

  31. AOL is still around? by roc97007 · · Score: 1, Redundant

    What the hell does AOL need a database for? Users still on hold trying to cancel their accounts?

    --
    Oliver's law of assumed responsibility: If you're seen fixing it, you will be blamed for breaking it.
    1. Re:AOL is still around? by jon3k · · Score: 1

      TMZ, etc, they own quite a few popular sites these days. Not to mention the untold millions of e-mail boxes I'm sure they still service to this day.

  32. Backbone by Anonymous Coward · · Score: 0

    at about four times the cost of a typical Fibre Channel disk array with the same capacity, and it performs at about 250,000 IOPS. One reason the flash SAN is so fast is that it doesn't use a SAS or PCIe backbone, but instead has a proprietary interface that offers up 5 to 6Gb/s throughput. AOL's senior operations architect said the SAN cost about $20 per gigabyte of capacity, or about $1 million.

    A $250,000 Fibre Channel Disk array doesn't have a SAS or PCIe backbone either. There are plenty of valid reasons flash can be "faster" in one measure or another than disk, but... I feel dumber just for having to say Fibre Channel Disk Array and SAS in the same sentence. Urgh... /.

  33. They may have wasted the cash by John+Jamieson · · Score: 2, Interesting

    It is hard to know anything for sure with this limited amount of info. But it appears to me that they have not accomplished such a great feat.

    I put together a server this year that pushes over 9 GB/s. I did this with a mere 150 2.5 inch drives. (144 raid 10 + 6 live spares). This was SAS 2.0 of course, because in the real world SAS kicks FC's A**.

    We found that the real bottleneck to throughput is not the drives and not the SAS cards. We have 8 SAS 2.0 lanes coming into each card, multiply that by 6 cards, and you have a heck of a lot of potential.

    No, the real problem is you saturate your PCIe slots, and chipsets sometimes choke when you feed this much data. So, the chipset and PCI-e bus tend to be the restraining factor, not the archaic rotating platters.

    1. Re:They may have wasted the cash by shri · · Score: 1

      Care to share what this looks like? Several servers connected to a SAN with 150 drives?

    2. Re:They may have wasted the cash by jon3k · · Score: 1

      Throughput is far less relevant in this scenario than IOPS. Your 150 drives would put out a PEAK theoretical throughput of sequential reads (no writes!) of about 27k IOPS. Or about 10% of the total IOPS of AOL's SSD-based storage system. You're also comparing individual serial connection bandwidth (a single FC or SAS connection) with the entire throughput of your director. They're doing 4GB/s (32Gb/s) per connection vs your 6Gb/s per connection.

    3. Re:They may have wasted the cash by jon3k · · Score: 1

      The article summary is wrong, from TFA:
      "So you're getting 4GB/sec. of PCIe bandwidth, not the 5Gbit/sec. or 6Gbit/sec. SAS bandwidth. You're getting almost an order of magnitude of bandwidth to the storage internally just because you're using an interface that's capable of it," Pollack said.

      That's 4GB(ytes) per second. Not 4 gigaBIT per second. That's 32Gbit/s vs your 6Gbit/s via SAS.
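
      The bytes-versus-bits conversion can be checked in a few lines. This is only a sketch of the raw line-rate comparison; it ignores encoding overhead, and the helper name is hypothetical.

```python
# Convert the quoted 4 GB/s into Gb/s and compare it with one 6 Gb/s SAS link.
# Raw line rates only; encoding overhead is deliberately ignored here.
BITS_PER_BYTE = 8


def gbytes_to_gbits(gbytes_per_s: float) -> float:
    """Gigabytes per second to gigabits per second."""
    return gbytes_per_s * BITS_PER_BYTE


pcie_gbits = gbytes_to_gbits(4.0)    # quoted PCIe bandwidth, in Gb/s
sas_link_gbits = 6.0                 # one SAS 2.0 link
ratio = pcie_gbits / sas_link_gbits

print(f"{pcie_gbits:.0f} Gb/s vs {sas_link_gbits:.0f} Gb/s, ratio {ratio:.2f}x")
```

      On raw numbers that is a bit over 5x, which is what the "almost an order of magnitude" in the quote amounts to.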

    4. Re:They may have wasted the cash by Skal+Tura · · Score: 1

      That is quite nice :)

      I'm now in the process of building a VM cluster with 140 drives, and total initial throughput of 50+Gbps. 5Gbps+local cache per node accessible, plenty in our usage! The nodes can be upgraded to have 20+Gbps access to storage, if needed, albeit with "highish" cost. Most of this with commodity hardware to have low costs vs. performance :) 20+k IOPS is the expected real world storage performance, which will likely be distributed across only 50 guest VMs. Which again, is plenty for our usage, when a typical VM will be using ~100-250 IOPS on average, and those demanding VMs with 1k+ requirement will keep running fine :)

      The beauty of the designed infrastructure is that everything is trivial to upgrade once the infra is initially built: just add hardware, make a couple of small configuration changes (15mins + testing) and done :) Designed it to scale up to 57 nodes without a significant jump in hardware pricing, and even after that, it's only a small restructuring of the infra, without über expensive hardware. We can easily add storage performance in 5-7k IOPS increments, for up to 360k IOPS before infra or drive type needs to be upgraded. Cost of that 360k IOPS would be roughly 240k eur, and we would have multilayer redundant storage capacity of 1.2PB, and still have easy ways to upgrade either IOPS or capacity, depending upon needs. I guess that would be plenty for the 250-600 VMs we are going to run in the cluster :) (Yes really, likely not more than that). But we are starting small, with just those 140 drives.
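
      For anyone checking the budget, the figures above work out as follows. A rough sketch only: it assumes the load is spread evenly across VMs, and the variable names are invented for illustration.

```python
# Per-VM IOPS budget and cost per IOPS from the figures quoted above.
# Assumes an even spread of load across guests, which real workloads won't have.
EXPECTED_IOPS = 20_000         # initial real-world estimate
GUEST_VMS = 50                 # initial guest count
TYPICAL_VM_IOPS = (100, 250)   # typical average range per VM
MAX_IOPS = 360_000             # ceiling before infra or drive type changes
MAX_COST_EUR = 240_000         # rough cost at that ceiling

per_vm = EXPECTED_IOPS / GUEST_VMS
headroom = per_vm / TYPICAL_VM_IOPS[1]    # versus the top of the typical range
eur_per_iops = MAX_COST_EUR / MAX_IOPS

print(f"{per_vm:.0f} IOPS per VM, {headroom:.1f}x headroom, "
      f"{eur_per_iops:.2f} eur per IOPS")
```

      So the initial build gives each VM about 400 IOPS on average, comfortably above the 100-250 typical range, and the scaled-out cost comes to roughly two thirds of a euro per IOPS.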

      But how did you build your server? 150 drives in a single server with 9GB/s sounds kind of extreme and hard to design :) Tons of SAS RAID controllers and Multiplier backplanes, on a huge chassis?

    5. Re:They may have wasted the cash by John+Jamieson · · Score: 1

      We used an HP server with all sockets filled. We stuck in 6 SAS cards, and a couple of FC cards for talking to the SANs for backup. All 150 drives were DAS (connected to the 6 SAS cards).

      I think you could replicate this on either HP server (DL585-g7 or DL980). I like them in that they give enough PCI-e slots. (11 in total)

      As I was mentioning earlier, test your hardware before you buy it. We found a hardware bug one vendor was not aware of when we put this thing together. Benchmark, Benchmark, Benchmark!

      Oh ya, the drives sat in 6 HP D2000 enclosures.

    6. Re:They may have wasted the cash by John+Jamieson · · Score: 1

      OK, I read the article now. I call BULL!

      This guy only has an 8 lane PCIe which means he can only get 3 GB/s real throughput.

      And he is dissing SAS, and with most SAS 2.0 cards you get 8 to 16 lanes of 6 Gb/s.

      That means that the SAS card has a 6-12 GB/s theoretical throughput. (Which is meaningless because it is limited by the same cruddy PCIe v2 bus which limits the FC cards. The 8 lanes maxing at 4GB/s theoretical, 3GB/s in real life.)
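
      A sketch of where those theoretical numbers come from. PCIe 2.0 (5 GT/s per lane) and SAS 2.0 (6 Gb/s per lane) both use 8b/10b encoding, so usable payload is 80% of the line rate. The function name is hypothetical, and the 3GB/s "real life" figure is an observation, not something this calculation produces.

```python
# Theoretical link bandwidth in GB/s, with and without 8b/10b overhead.
ENCODING_8B10B = 8 / 10     # 8 payload bits per 10 line bits
BITS_PER_BYTE = 8


def link_gbytes(line_rate_gbit: float, lanes: int,
                efficiency: float = 1.0) -> float:
    """Aggregate bandwidth in GB/s for `lanes` lanes at `line_rate_gbit` each."""
    return line_rate_gbit * efficiency * lanes / BITS_PER_BYTE


pcie2_x8 = link_gbytes(5.0, 8, ENCODING_8B10B)      # PCIe 2.0 x8 payload
sas2_x8_raw = link_gbytes(6.0, 8)                   # raw line rate, 8 lanes
sas2_x16_raw = link_gbytes(6.0, 16)                 # raw line rate, 16 lanes
sas2_x8_payload = link_gbytes(6.0, 8, ENCODING_8B10B)

print(f"PCIe 2.0 x8: {pcie2_x8:.1f} GB/s")
print(f"SAS 2.0 raw: {sas2_x8_raw:.1f} to {sas2_x16_raw:.1f} GB/s")
print(f"SAS 2.0 x8 payload: {sas2_x8_payload:.1f} GB/s")
```

      The 6-12 GB/s range matches the raw SAS line rate; after encoding, 8 SAS lanes carry about 4.8 GB/s, still above what a PCIe 2.0 x8 slot can deliver.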

      BTW, SAS 2.0 has lower latency than FC, I've checked it out, so if going for raw speed, I will take the SAS over FC any day.

    7. Re:They may have wasted the cash by John+Jamieson · · Score: 1

      Sorry if I wasn't clear.

      1 - HP Server
      6 - SAS 2.0 cards with 8 SAS lanes each
      6 - HP D2000 enclosures with 25 drives each.
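
      Tallying that parts list against the 9 GB/s and 144 + 6 figures mentioned earlier in the thread. A rough sketch: it assumes throughput is spread evenly over cards and active drives, and uses decimal units (1 GB = 1000 MB).

```python
# Sanity-check the configuration: drive count, SAS lanes, and per-unit throughput.
# Even distribution across cards and drives is an assumption.
SAS_CARDS = 6
LANES_PER_CARD = 8
ENCLOSURES = 6
DRIVES_PER_ENCLOSURE = 25
ACTIVE_DRIVES = 144            # RAID 10 members; the other 6 are live spares
TOTAL_GBYTES_PER_S = 9.0       # reported aggregate throughput

total_drives = ENCLOSURES * DRIVES_PER_ENCLOSURE
spares = total_drives - ACTIVE_DRIVES
total_lanes = SAS_CARDS * LANES_PER_CARD
per_card_gbytes = TOTAL_GBYTES_PER_S / SAS_CARDS
per_drive_mbytes = TOTAL_GBYTES_PER_S * 1000 / ACTIVE_DRIVES

print(f"{total_drives} drives ({spares} spares), {total_lanes} SAS lanes")
print(f"{per_card_gbytes:.1f} GB/s per card, {per_drive_mbytes:.1f} MB/s per drive")
```

      That is about 1.5 GB/s per card and 62.5 MB/s per active drive, which is modest per spindle and consistent with the slots and chipset being the limit rather than the disks.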

    8. Re:They may have wasted the cash by John+Jamieson · · Score: 1

      Please see my reply to a previous poster.

      The guy is throwing out meaningless figures. All the SAS controllers I looked at had 8-16 SAS lanes.

      If he wanted to be accurate, he would point out that FC is only 8Gb/s, which is why they aggregate links on one card just like the SAS controllers do.

      He either does not know his material, or he is snowing us.

    9. Re:They may have wasted the cash by jon3k · · Score: 1

      I think he's talking about 4GB/s to a single disk, which is what HDSL provides, a full 8 PCI-e 2.0 lanes of bandwidth to a single drive. Whereas with SAS, sure a single CONTROLLER has up to 16 x1 lanes available, but it still cannot communicate to an individual drive at more than 6Gb/s. Now if you're using spinning disks this isn't an issue, but once you move to high-end SSD you need more than 6Gb/s to each drive, which SAS cannot provide.

      As far as FC vs SAS latency, I'd really like to see some proof to back up your claim. But really it doesn't matter. SAS is great for certain things. When you need to build a SAN you typically don't use SAS beyond the disk enclosure. You can't build an entire storage network out of SAS.

  34. Bad summary. Not 5-6 Gb/s but 4 GB/s. by xororand · · Score: 1

    Serial ATA 3.0 and SAS achieve 5-6 Gb/s. This system delivers 4 GB/s. It's really sad how these sloppy summaries make it to the front page.

    Quote from TFA: "So you're getting the 4GB/sec. of PCIe bandwidth, not the 5Gbit/sec. or 6Gbit/sec. SAS bandwidth. You're getting almost an order of magnitude of bandwidth to the storage internally just because you're using an interface that's capable of it," Pollack said.

  35. Higher costs? Maybe not. by fr0dicus · · Score: 1

    They will probably save money compared to powering and cooling the equivalent disk array.

  36. Wait a while until Write Amplification kicks in by Newtonian_p · · Score: 1

    Wait a while until Write Amplification kicks in. Then they'll be screwed.

    --

    There are 2 kinds of people in this world: Those who write in decimal and those who don't

  37. USB FTW? by macshome · · Score: 1

    Wow, 50TB of flash is a lot of thumbdrives!