Slashdot Mirror


AOL Spends $1M On Solid State Memory SAN

Lucas123 writes "AOL recently completed the roll out of a 50TB SAN made entirely of NAND flash in order to address performance issues with its relational database. While the flash memory fixed the problem, it didn't come cheap, at about four times the cost of a typical Fibre Channel disk array with the same capacity, and it performs at about 250,000 IOPS. One reason the flash SAN is so fast is that it doesn't use a SAS or PCIe backbone, but instead has a proprietary interface that offers up 5 to 6Gb/s throughput. AOL's senior operations architect said the SAN cost about $20 per gigabyte of capacity, or about $1 million. But, as he puts it, 'It's very easy to fall in love with this stuff once you're on it.'"
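The figures in the summary are roughly self-consistent, as a quick back-of-the-envelope check suggests. This is only a sketch: it assumes decimal units (1 TB = 1,000 GB), takes the "four times" price ratio at face value, and the variable names are made up for illustration.

```python
# Back-of-the-envelope check of the figures quoted in the summary.
# Assumption: decimal units (1 TB = 1,000 GB), which the summary does not state.
CAPACITY_TB = 50
COST_PER_GB_USD = 20
FLASH_TO_DISK_PRICE_RATIO = 4  # "about four times the cost" of a Fibre Channel array
QUOTED_IOPS = 250_000

capacity_gb = CAPACITY_TB * 1_000
flash_cost_usd = capacity_gb * COST_PER_GB_USD

# What a same-capacity disk array would cost if the 4x ratio is taken literally.
implied_disk_cost_usd = flash_cost_usd / FLASH_TO_DISK_PRICE_RATIO

# Dollars paid per quoted I/O operation per second.
cost_per_iops_usd = flash_cost_usd / QUOTED_IOPS

print(f"Flash SAN cost:          ${flash_cost_usd:,}")
print(f"Implied disk array cost: ${implied_disk_cost_usd:,.0f}")
print(f"Cost per quoted IOPS:    ${cost_per_iops_usd:.2f}")
```

So $20 per gigabyte over 50TB does land on the quoted $1 million, under the decimal-units assumption.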

34 of 158 comments

  1. AOL? by roman_mir · · Score: 4, Funny

    What is surprising to me is not the amount of money spent on what was bought, but the fact that AOL has any performance issues at all. They still have users? They have an entire database of users?

    1. Re:AOL? by MrDiablerie · · Score: 5, Informative

      It's a common misconception that AOL's primary business is still dial-up access. They make more money nowadays with their content sites like TMZ, Moviefone, Engadget, etc.

    2. Re:AOL? by bananaquackmoo · · Score: 4, Informative

      Neither. AOL separated into its own company again.

    3. Re:AOL? by T+Murphy · · Score: 4, Funny

      No, they think they still have lots of users. The cancellation department is separate from HQ- at 56k it's still going to be a few decades before the suits finish receiving all the cancellation notices.

    4. Re:AOL? by tnk1 · · Score: 4, Informative

      AOL is Advertising.com and some flagship sites. And yes, they still have dialup users. The access business is steadily decreasing, but it's pretty profitable since they basically stopped upgrading it and now just sort of run it.

      If they maintain their current path, yes, they will eventually disappear and fail, but the process is much longer than you might think. Not all of their acquisitions were as retarded as Bebo.

      What they probably need the SAN for is the Advertising business. That is profitable and requires a shitload of storage. They don't need that for their websites.

    5. Re:AOL? by Mikkeles · · Score: 4, Funny

      Me Too!!!

      --
      Great minds think alike; fools seldom differ.
    6. Re:AOL? by mark72005 · · Score: 3, Funny

      I think it just means they are still billing users who cancelled years ago, per standard practice.

      Also, there are users who wanted to cancel years ago, but are still lost in the phone tree. Those are still active accounts too.

    7. Re:AOL? by dintech · · Score: 2, Funny

      They have an entire database of users?

      No, the 50TB is for a museum of all the different CDs they sent out.

    8. Re:AOL? by Jah-Wren+Ryel · · Score: 3, Informative

      Neither. AOL separated into its own company again.

      As a very casual observer it seems like the entire TW/AOL debacle could not have been mismanaged worse - well I guess both companies could have gone tits up, but that's about it. TW vastly overpaid for AOL when AOL was at its peak (160 billion dollars). Then, just as AOL had started to climb out of the bottom they spun it off for a song ($2.5 billion). Since then AOL has been doing a decent enough job of reinventing itself as a "new media" company - the kind of thing TW seems to be struggling with.

      That's why corporate CEOs get the big bucks though!

      --
      When information is power, privacy is freedom.
    9. Re:AOL? by Trepidity · · Score: 2, Interesting

      It's true that they make more money now with their content sites, but only slightly more: ISP subscriptions still make up around 40% of its revenues.

    10. Re:AOL? by Anonymous Coward · · Score: 2, Informative

      AOL bought TW. It was a very shrewd move for AOL, because TW had a much higher intrinsic value to set a floor on the stock price when the internet bubble burst.

    11. Re:AOL? by St.Creed · · Score: 2, Informative

      This isn't true - what the AC says is true. TW was bought by AOL when AOL could leverage its 160 billion dollar fairy-dust value into tangible assets. If they hadn't done so they would have been gone years ago. It was a brilliant move by AOL and at the time, TW thought it was a great deal as well. TW got suckered, as we know now. But in that day and age it looked like a good move: AOL had the internet savvy, TW the IP - combine them and rule the Internet. Of course, that didn't quite go as planned.

      --
      Therefore, by the (faulty) logic you're using, you're just a cow with a keyboard - osu-neko (2604)
  2. What? by EndlessNameless · · Score: 5, Insightful

    As a DBA, I would love to have solid-state storage instead of needing to segment my databases properly and work with the software dev guys to make sure we have reasonable load distribution.

    Where can I get someone to pay a million dollars so I can do substandard work?

    --

    ---
    According to the latest ruleset, this post should be modded as Vorpal Flamebait +5.
    1. Re:What? by Jimmy+King · · Score: 5, Funny

      As long as you come really cheap, I can probably get you on where I work. You won't get cool hardware like that, but you can have the other half. Management seems to be ok with substandard work as long as apologizing to the customers continues to be cheaper than doing a good job or buying the hardware to cover up the poor job.

    2. Re:What? by Threni · · Score: 3, Insightful

      You're the DBA - do what you do best, and start Googling! :)

    3. Re:What? by eln · · Score: 2, Insightful

      I was thinking something very much along these lines. I can't believe that AOL is doing something more I/O intensive than everyone else in the world. If you're looking at buying something this expensive, you really need to go through your database design and application code with a fine-toothed comb and look for inefficiencies first.

      Of course, in the real world, this sort of thing (maybe not to this scale) happens all the time. We just had a customer that was having major performance problems. They demanded we put them on a massive $750,000 whiz-bang SAN device right away to alleviate their problems. So we did, and then their DBAs finally got off their asses, looked at the code, and made some changes that cut their I/O demand in half. Basically, they ended up burning $750,000 on something they didn't even need. I have a feeling AOL just spent $1,000,000 on something they didn't really need as well.

    4. Re:What? by Nexzus · · Score: 2

      The cynic in me believes that in addition to a whiz-bang storage network, AOL also got some free publicity in tech circles, with the implication that they're leading edge.

      --
      Karma: Can only be portioned out by the Cosmos.
    5. Re:What? by hxnwix · · Score: 3, Funny

      You could probably get by with a cloud of 486s, but why the fuck would you bother?

    6. Re:What? by Kjella · · Score: 2

      Where can I get someone to pay a million dollars so I can do substandard work?

      You try claiming the next big work they want will take more than a million dollars in DEV/DBA work compared to buying a million dollar SAN. At this point three things could happen:

      1. They say "um, never mind"
      2. They pony up the cash
      3. They call you on it

      While I've seen some rather dysfunctional companies, I still haven't seen any where the PHBs try reestimating the IT cost themselves. Mind you, I haven't seen overwhelmingly many companies that have a spare million dollars lying around either, so I figure #1 would happen in 95% of the cases. But I actually think #2 would happen in 4% of the remaining cases and #3 only in 1%. Unless your costs are questioned before it leaves IT...

      --
      Live today, because you never know what tomorrow brings
    7. Re:What? by kanad · · Score: 3, Interesting

      Careful what you wish for. Virgin Blue airlines in Australia suffered a 2 day blackout costing $20 million due to a single solid state drive failure. They have since gone back to normal drives. Read it at http://www.theaustralian.com.au/australian-it/still-no-clue-to-virgin-blues-20m-question/story-e6frgakx-1225937335722

    8. Re:What? by konohitowa · · Score: 3, Funny

      As long as you come really cheap, I can probably get you on where I work. You won't get cool hardware like that, but you can have the other half. Management seems to be ok with substandard work as long as apologizing to the customers continues to be cheaper than doing a good job or buying the hardware to cover up the poor job.

      You could always take a long lunch, cross the bridge from Redmond to Seattle, and apply at Amazon. I'm sure Microsoft would give you a couple of hours off to do that, right?

    9. Re:What? by fluffy99 · · Score: 4, Informative

      I have a feeling AOL just spent $1,000,000 on something they didn't really need as well.

      They admitted as much in the article. They decided that it was cheaper to improve the hardware throughput than to spend the money on developers to try to trim the demand. They were also probably losing money by not meeting SLAs, and a quick fix was cheaper in the long run. They also reduced power and cooling requirements, so there may be some long-term payback there as well. The free publicity certainly didn't hurt either.

    10. Re:What? by Anonymous Coward · · Score: 2, Insightful

      certainly the failure of an entire infrastructure after the failure of a single drive is the fault of the drive manufacturer. spinning disks never fail?

  3. It is called HDSL... by Yaa+101 · · Score: 3, Informative

    You can read more about that here:

    http://www.google.com/search?q=High-Speed+Data+Link

  4. Really? by Archangel+Michael · · Score: 3, Informative

    My impression is that this has been going on for some time now with all the larger database operations, and one of the reasons why SSDs have not yet come down in price is that all the best units and tech are going to the big companies as fast as they can get it from the manufacturers. I wouldn't be surprised to see someone like Google saying something like "yawn, 50TB" and saying that they have PETABYTE versions already out there.

    If you run a database of any size, especially one with a high read-to-write ratio, SSDs would only make things faster. And speed counts.

    --
    Agent K: A *person* is smart. People are dumb, stupid, panicky animals, and you know it.
  5. Re:Sas bandwidth constrained??? by TheRaven64 · · Score: 4, Insightful

    Now we just need something cheaper than $20/GB

    Actually, the price was the most interesting part of this:

    at about four times the cost of a typical Fibre Channel disk array with the same capacity

    Four times the price and, what, ten? A hundred? times the IOPS? That makes NAND pretty much a no brainer for any heavy-use database.

    --
    I am TheRaven on Soylent News
  6. Re:Well, their purchasing agents got Microsofted by Galestar · · Score: 2, Funny

    They could've done it cheaper.

    It's AOL, would you actually expect them to make intelligent, informed decisions?

    --
    AccountKiller
  7. Finite number of program-erase cycles? by PatPending · · Score: 2, Informative

    I wonder what the read/write rating is vs. a hard disk?

    Wikipedia puts flash at 1,000,000 program-erase cycles

    --
    What one fool can do, another can. (Ancient Simian Proverb)
    1. Re:Finite number of program-erase cycles? by SQL+Error · · Score: 2, Interesting

      It's a non-problem. With Intel's 64GB X25-E drive, for example, you can do non-stop random writes for 6 years before you run into problems. We run all our databases on SSDs, mostly Intel and FusionIO ioDrives.

      That said, we've had drives simply drop dead with a controller failure. You still have to run a RAID array, even with SSDs.
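For anyone who wants to sanity-check wear-out claims like these, the idealised arithmetic is simple. This is a rough sketch only: it assumes perfect wear leveling and a constant write rate, the function name is made up, and the example inputs are hypothetical rather than any vendor's specs.

```python
# Idealised flash-endurance estimate. Every input is an illustrative
# assumption; real drives also depend on controller behaviour and workload.
SECONDS_PER_YEAR = 365 * 24 * 3600


def years_until_worn_out(capacity_gb, pe_cycles, write_mb_per_s,
                         write_amplification=1.0):
    """Years of non-stop writing before every cell reaches its P/E limit.

    Assumes perfect wear leveling, so the total writable volume is simply
    capacity * program-erase cycles (decimal units: 1 GB = 1,000 MB).
    """
    total_writable_mb = capacity_gb * 1_000 * pe_cycles
    # Write amplification: physical flash writes per host write.
    physical_mb_per_s = write_mb_per_s * write_amplification
    return total_writable_mb / physical_mb_per_s / SECONDS_PER_YEAR


# Hypothetical drive: 64 GB, 100,000 P/E cycles, 30 MB/s of sustained writes.
ideal = years_until_worn_out(64, 100_000, 30)
amplified = years_until_worn_out(64, 100_000, 30, write_amplification=2.0)
print(f"ideal: {ideal:.1f} years, with 2x write amplification: {amplified:.1f} years")
```

Note that this only models cell wear. It says nothing about the controller failures mentioned above, which is why the RAID advice still stands.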

  8. Ok guys.... by mrsteveman1 · · Score: 2, Funny

    It's very easy to fall in love with this stuff once you're on it.

    I said the same thing about coke in the 70's....

    I guess what I'm saying is, no one loan money to AOL until they admit they have a problem.

  9. Re:Sas bandwidth constrained??? by rsborg · · Score: 2, Informative

    I don't know about you, but I haven't seen SSDs battle tested enough for me to truly trust the things yet. With mechanical drives I've yet to have one "just die" as I ALWAYS got warnings something was going wrong via drive noise, heat, random small errors, etc. And now SMART just makes that even easier to spot.

    Google found differently in their massive hard drive survey... sometimes drives would just up and die with no SMART warnings. Also the most common SSD failure-case is lack of writes, at least you can retrieve data off the drive as opposed to a completely opaque device if the platter is frozen.

    --
    Make sure everyone's vote counts: Verified Voting
  10. Re:Sas bandwidth constrained??? by EXrider · · Score: 2, Informative

    With mechanical drives I've yet to have one "just die" as I ALWAYS got warnings something was going wrong via drive noise, heat, random small errors, etc. And now SMART just makes that even easier to spot.

    Google found differently in their massive hard drive survey... sometimes drives would just up and die with no SMART warnings. Also the most common SSD failure-case is lack of writes, at least you can retrieve data off the drive as opposed to a completely opaque device if the platter is frozen.

    Yeah, I've seen quite the opposite. Let me preface this with saying that I'm strictly talking about consumer and midrange drives, I've seen very few SCSI and SAS drives die without warning.

    In the past 10 years, in a company with about 200 nodes, I can literally count on one hand the number of hard drives that have given any SMART warnings leading up to their imminent failure. They pretty much always die while the OS accumulates log entries of bad blocks and I/O errors. Most of the time it was either death by shock, or death by manufacturer defect (Maxtor!). SSDs are pretty much immune to the former, BTW. I would prefer an SSD in a road warrior or college student's laptop any day over a conventional HDD.

    --
    grep -iw skynet /etc/services
  11. RAID 5? by daver_au · · Score: 3, Insightful

    They wanted performance and went *RAID 5*? That pretty much sums the entire approach up. Let's not optimise the application first, the database second, but instead hide the problem by throwing hardware at it. Then what we'll do is use a RAID configuration that hobbles the write performance of the arrays, and let's not mention what happens to performance when we lose a disk (don't say it won't happen).

    Sure, RAID 5 is the answer to some things, but not when the question is database *PERFORMANCE*.

    Also - latency is more important than IOP/s. I don't care how many IOP/s you can do, if your latency is high, the performance won't be. Most garden variety storage engineers don't seem to grasp this concept.
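Both points can be put into rough numbers. The sketch below uses the commonly cited small-write penalties (RAID 5 turns one random write into four back-end I/Os: read data, read parity, write data, write parity) and Little's law for the latency side. The raw IOPS figure, the 50% write mix and the latencies are illustrative assumptions, not measurements from AOL's array, and the function names are made up.

```python
# Rough model of RAID write penalty and of latency-bound throughput.
# All input numbers below are illustrative assumptions.

# Back-end I/Os generated per small random front-end write.
WRITE_PENALTY = {"raid0": 1, "raid10": 2, "raid5": 4}


def effective_iops(raw_iops, write_fraction, level):
    """Front-end IOPS sustainable for a given read/write mix and RAID level."""
    penalty = WRITE_PENALTY[level]
    read_fraction = 1.0 - write_fraction
    # Each read costs one back-end I/O, each write costs `penalty` of them.
    return raw_iops / (read_fraction + write_fraction * penalty)


def iops_from_latency(outstanding_ios, latency_s):
    """Little's law: throughput = concurrency / latency."""
    return outstanding_ios / latency_s


raw = 100_000  # hypothetical raw back-end IOPS of the whole array
for level in ("raid10", "raid5"):
    print(level, round(effective_iops(raw, 0.5, level)), "front-end IOPS")

# A workload with one outstanding I/O is bounded by latency alone,
# no matter how many IOPS the array is rated for.
for latency_s in (0.005, 0.0002):
    print(f"{latency_s * 1000:g} ms per I/O ->",
          round(iops_from_latency(1, latency_s)), "IOPS")
```

The second half is the latency argument in miniature: a serialized transaction stream only gets faster when per-I/O latency drops, whatever the headline IOPS figure says.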

  12. They may have wasted the cash by John+Jamieson · · Score: 2, Interesting

    It is hard to know anything for sure with this limited amount of info. But it appears to me that they have not accomplished such a great feat.

    I put together a server this year that pushes over 9 GB/s. I did this with a mere 150 2.5-inch drives. (144 RAID 10 + 6 live spares). This was SAS 2.0 of course, because in the real world SAS kicks FC's A**.

    We found that the real bottleneck to throughput is not the drives and not the SAS cards. We have 8 SAS 2.0 lanes coming into each card, multiply that by 6 cards, and you have a heck of a lot of potential.

    No, the real problem is you saturate your PCIe slots, and chipsets sometimes choke when you feed this much data. So, the chipset and PCI-e bus tend to be the restraining factor, not the archaic rotating platters.
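The lane arithmetic behind that can be sketched as follows. SAS 2.0 runs at 6 Gb/s per lane with 8b/10b encoding, which works out to about 600 MB/s usable per lane. The PCIe side is an assumption, since the post does not say which slots were used: the sketch supposes each card sits in a PCIe 2.0 x8 slot (5 GT/s per lane, also 8b/10b). These are theoretical ceilings only, and the names are made up for illustration.

```python
# Theoretical bandwidth ceilings for the setup described above.
# Known: 6 cards with 8 SAS 2.0 lanes each (from the post).
# Assumed: each card is in a PCIe 2.0 x8 slot (not stated in the post).


def usable_mb_per_s(line_rate_gbps):
    """Usable MB/s per lane for an 8b/10b-encoded link.

    8b/10b puts 10 bits on the wire per data byte, so MB/s = Mb/s / 10.
    """
    return line_rate_gbps * 1_000 // 10


CARDS = 6
SAS_LANES_PER_CARD = 8
PCIE_LANES_PER_SLOT = 8  # assumption

sas_lane_mbps = usable_mb_per_s(6)   # SAS 2.0: 6 Gb/s per lane
pcie_lane_mbps = usable_mb_per_s(5)  # PCIe 2.0: 5 GT/s per lane, per direction

sas_total_mbps = CARDS * SAS_LANES_PER_CARD * sas_lane_mbps
pcie_total_mbps = CARDS * PCIE_LANES_PER_SLOT * pcie_lane_mbps

print(f"SAS ceiling:  {sas_total_mbps / 1_000:.1f} GB/s")
print(f"PCIe ceiling: {pcie_total_mbps / 1_000:.1f} GB/s")
print(f"PCIe is the lower ceiling: {pcie_total_mbps < sas_total_mbps}")
```

Under that assumption the slots top out below what the SAS lanes could deliver in theory, which is at least consistent with the bus and chipset, not the drives, being the limit.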