Slashdot Mirror


Sun Adding Flash Storage to Most of Its Servers

BobB-nw writes "Sun will release a 32GB flash storage drive this year and make flash storage an option for nearly every server the vendor produces, Sun officials are announcing Wednesday. Like EMC, Sun is predicting big things for flash. While flash storage is far more expensive than disk on a per-gigabyte basis, Sun argues that flash is cheaper for high-performance applications that rely on fast I/O Operations Per Second speeds."

38 of 113 comments

  1. We are going to have two layers of storage by javilon · · Score: 4, Interesting

    I would put the operating systems, binaries and configuration files on the SSD.

    But most of what makes up the volume on current computers (log files, backups, video/audio) can be committed to a regular hard drive.

    --


    When his defense asked, "Which computer has Jon Johansen trespassed upon?" the answer was: "His own."
    1. Re:We are going to have two layers of storage by 3HackBug77 · · Score: 2, Interesting

      That definitely makes sense. Making EVERYTHING SSD is a very expensive idea, and except at the local level it wouldn't make a huge difference anyway, due to network bandwidth limits.

    2. Re:We are going to have two layers of storage by CastrTroy · · Score: 2, Insightful

      Why would you bother putting the programs and operating system on SSD for a server? Once the files are loaded into memory, you'll never need to access them again. SSD only helps with OS and Programs when you are booting up, or opening new programs. This almost never happens on most servers.

      --

      Anthropic principle: We see the universe the way it is because if it were different we would not be here to see it.
    3. Re:We are going to have two layers of storage by Anonymous Coward · · Score: 5, Funny

      "when you have servers that stays up for months"

      I'm a Windows admin, you insensitive clod!
    4. Re:We are going to have two layers of storage by compro01 · · Score: 4, Informative

      I was sure that figure was upwards of a million cycles per sector in modern flash chips.

      Also, throw in wear-leveling and spare sectors. a million writes to a file system sector doesn't mean a million writes to a particular physical sector (could be 1000 writes each to a 1000 different sectors) and when a sector does wear out, it simply gets put out of service and is replaced with a spare one. this same principle is used in mechanical hard drives. if a sector is problematic to read from/write to, it gets marked as bad and the file system sector is remapped to somewhere else.

      SSDs could quite likely last longer than mechanical hard drives in this regard.
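The wear-leveling argument above can be sketched as a toy model. This is an illustration only, not how any particular flash controller works: the class name, the sector counts and the "always pick the least-worn sector" policy are all assumptions made for the example.

```python
class WearLeveledFlash:
    """Toy model of wear leveling with a pool of spare sectors (hypothetical)."""

    def __init__(self, sectors, spares, max_cycles):
        self.max_cycles = max_cycles
        self.wear = [0] * sectors  # writes absorbed by each in-service physical sector
        self.spares = spares       # fresh sectors held in reserve
        self.retired = 0           # worn-out sectors taken out of service

    def write(self, logical_sector):
        # The logical address is deliberately ignored: the write lands on
        # whichever physical sector has seen the fewest writes so far.
        target = min(range(len(self.wear)), key=self.wear.__getitem__)
        self.wear[target] += 1
        if self.wear[target] >= self.max_cycles:
            if self.spares == 0:
                raise RuntimeError("no spare sectors left")
            # Retire the worn sector and put a fresh spare in its slot.
            self.spares -= 1
            self.retired += 1
            self.wear[target] = 0
        return target


flash = WearLeveledFlash(sectors=100, spares=2, max_cycles=1000)
for _ in range(10_000):
    flash.write(0)  # hammer one file system sector
print(max(flash.wear), flash.retired)  # 100 writes per physical sector, none retired
```

Ten thousand writes to a single logical sector end up as 100 writes on each of the 100 physical sectors, which is the scaled-down version of the "1000 writes each to 1000 different sectors" point.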

      --
      upon the advice of my lawyer, i have no sig at this time
    5. Re:We are going to have two layers of storage by boner · · Score: 3, Interesting

      RAM drive uses DRAM, Enterprise class DRAM ~ $100/GB and uses ca 8W/GB. Enterprise Flash, ~ $30-80/GB and uses 0.01W/GB

      In addition, assume that 90% of ram-drive accesses go to 10% of the storage, you can see that effectively you are burning a lot of energy with zero gain. Multiply by up-time.

      Flash has the potential of greatly improving performance/watt for most servers.
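As a rough worked example, the per-gigabyte figures quoted in this comment can be plugged into a small calculation. The 32 GB capacity is taken from the drive size in the story summary; the function name and the choice to report flash cost as a low/high range are assumptions for illustration.

```python
# Figures as quoted in the comment above (approximate).
DRAM_WATTS_PER_GB = 8.0
FLASH_WATTS_PER_GB = 0.01
DRAM_USD_PER_GB = 100.0
FLASH_USD_PER_GB = (30.0, 80.0)  # low and high end of the quoted range


def compare(capacity_gb):
    """Return power draw and purchase cost for the same capacity in DRAM and flash."""
    return {
        "dram_watts": capacity_gb * DRAM_WATTS_PER_GB,
        "flash_watts": capacity_gb * FLASH_WATTS_PER_GB,
        "dram_usd": capacity_gb * DRAM_USD_PER_GB,
        "flash_usd": tuple(capacity_gb * price for price in FLASH_USD_PER_GB),
    }


print(compare(32))
```

At 32 GB that works out to roughly 256 W and $3,200 for DRAM against about 0.32 W and $960 to $2,560 for flash, if the quoted figures hold.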

    6. Re:We are going to have two layers of storage by boner · · Score: 2, Interesting

      Ummm, most programs are not completely loaded into memory and inactive pages do get swapped out in favor of active pages. While the most active regions of a program are in memory most of the time, having the whole program in memory is not the general case.

      Also, DRAM burns ~8W/GB (more if FB-DIMMS), Flash burns only 0.01W/GB. Thus swapping inactive pages to Flash allows you to use your DRAM more effectively, improving your performance/W.

      From a different perspective: you have a datacenter and you are energy constrained. Most applications use 10% of the DRAM 90% of the time. It may be an attractive proposition to give the applications less DRAM (at a slight performance loss) and let them swap to Flash (with a significant reduction in power). Multiply by 10000 servers, even a 20W reduction per server becomes significant.
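The fleet-level arithmetic in that last paragraph can be written out as a short sketch. The function name is hypothetical, and the model assumes the saving is constant around the clock for a full year.

```python
def fleet_savings(servers, watts_saved_per_server, hours=24 * 365):
    """Return (kilowatts saved continuously, kilowatt-hours saved over the period)."""
    kilowatts = servers * watts_saved_per_server / 1000
    return kilowatts, kilowatts * hours


kw, kwh = fleet_savings(servers=10_000, watts_saved_per_server=20)
print(kw, kwh)  # 200.0 kW continuous, 1752000.0 kWh per year
```

This counts only the power drawn by the memory itself; any knock-on reduction in cooling load is not modelled here.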

    7. Re:We are going to have two layers of storage by BlendieOfIndie · · Score: 5, Informative

      It sounds like the SSDs are internal drives for the server. A database would never be stored on an internal hard drive. Almost any commercial database is connected to a disk farm through SAN fabric.

      SSDs really shine for OLTP databases. Lots of random IO occurs on these databases (as opposed to data warehouses that use lots of sequential IO).

      Normal hard drives are horrible for random IO because of mechanical limitations. Think about trying to switch tracks on a record player thousands of times per second; this is what's happening inside a hard drive (under a random IO load). It's amazing mechanical HDDs work as well as they do.
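The mechanical limit can be estimated with a back-of-the-envelope model: each random access costs an average seek plus, on average, half a platter rotation. The seek times below are assumed round numbers for illustration, not measurements from this thread.

```python
def random_iops(rpm, avg_seek_ms):
    """Estimate random I/O operations per second for one mechanical drive."""
    # On average the platter must turn half a revolution before the
    # requested sector passes under the head.
    rotational_latency_ms = 0.5 * 60_000 / rpm
    return 1000 / (avg_seek_ms + rotational_latency_ms)


print(round(random_iops(rpm=15_000, avg_seek_ms=3.5)))  # about 182
print(round(random_iops(rpm=7_200, avg_seek_ms=8.5)))   # about 79
```

Even a fast 15,000 rpm drive lands in the low hundreds of random operations per second under these assumptions, which is why random-IO workloads are spread across many spindles.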

    8. Re:We are going to have two layers of storage by dgatwood · · Score: 5, Interesting

      I was thinking about this at Fry's the other day when trying to decide whether I could trust the replacement Seagate laptop drive similar to the one that crashed on me Sunday, and I concluded that the place I most want to see flash deployed is in laptops. Eventually, HDDs should be replaced with SSDs for obvious reliability reasons, particularly in laptops. However, in the short term, even just a few gigs of flash could dramatically improve hard drive reliability and battery life for a fairly negligible increase in the per-unit cost of the machines.

      Basically, my idea is a lot like the Robson cache idea, but with a less absurd caching policy. Instead of uselessly making tasks like booting faster (I basically only boot after an OS update, and a stale boot cache won't help that any), the cache policy should be to try to make the hard drive spin less frequently and to provide protection of the most important data from drive failures. This means three things:

      1. A handful of frequently used applications should be cached. The user should be able to choose apps to be cached, and any changes to the app should automatically write through the cache to the disk so that the apps are always identical in cache and on disk.
      2. The most important user data should be stored there. The user should have control over which files get automatically backed up whenever they are modified. Basically a Time Machine Lite so you can have access to several previous versions of selected critical files even while on the go. The OS could also provide an emergency boot tool on the install CD to copy files out of the cache to another disk in case of a hard drive crash.
      3. The remainder of the disk space should be used for a sparse disk image as a write cache for the hard drive, with automatic hot files caching and (to the maximum extent practical) caching of any catalog tree data that gets kicked out of the kernel's in-memory cache.

      That last part is the best part. As data gets written to the hard drive, if the disk is not already spinning, the data would be written to the flash. The drive would spin up and get flushed to disk on shutdown to ensure that if you yank the drive out and put it into another machine, you don't get stale data. It would also be flushed whenever the disk has to spin up for some other activity (e.g. reading a block that isn't in the cache). The cache should also probably be flushed periodically (say once an hour) to minimize data loss in the event of a motherboard failure. If the computer crashes, the data would be flushed on the next boot. (Of course this means that unless the computer had boot-firmware-level support for reading data through such a cache, the OS would presumably need to flush the cache and disable write caching while updating or reinstalling the OS to avoid the risk of an unbootable system and/or data loss.)

      As a result of such a design, the hard drive would rarely spin up except for reads, and any data frequently read would presumably come out of the in-kernel disk cache, so basically the hard drive should stay spun down until the user explicitly opened a file or launched a new application. This would eliminate the nearly constant spin-ups of the system drive resulting from relatively unimportant activity like registry/preference file writes, log data writes, etc. By being non-volatile, it would do so in a safe way.
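A minimal sketch of the write-cache policy described above, with dictionaries standing in for the flash cache and the hard drive. The class and method names are hypothetical, and the periodic hourly flush and crash recovery are left out to keep the example short.

```python
class FlashWriteCache:
    """Toy model: writes go to flash while the disk is spun down."""

    def __init__(self):
        self.disk = {}         # block number -> data on the hard drive
        self.flash = {}        # pending writes held in non-volatile flash
        self.spinning = False
        self.spin_ups = 0

    def _spin_up_and_flush(self):
        if not self.spinning:
            self.spinning = True
            self.spin_ups += 1
        # Any time the disk is up anyway, push pending writes out to it.
        self.disk.update(self.flash)
        self.flash.clear()

    def write(self, block, data):
        if self.spinning:
            self.disk[block] = data
        else:
            self.flash[block] = data  # no spin-up needed for a write

    def read(self, block):
        if block in self.flash:
            return self.flash[block]  # served from flash, disk stays asleep
        self._spin_up_and_flush()     # cache miss forces a spin-up
        return self.disk.get(block)

    def spin_down(self):
        self.spinning = False

    def shutdown(self):
        # Flush on shutdown so the drive is never stale if moved elsewhere.
        self._spin_up_and_flush()
        self.spinning = False


cache = FlashWriteCache()
cache.write(1, "log entry")
cache.write(2, "preferences")
print(cache.spin_ups)  # 0: both writes were absorbed by flash
cache.read(99)         # a miss spins the disk up and flushes the cache
print(cache.spin_ups, cache.disk)
```

In this model the only thing that wakes the disk is a read that misses the cache, or shutdown, which matches the behaviour the comment is aiming for.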

      This is similar to what some vendors already do, I know, but integrating it with the OS's buffer cache to make the caching more intelligent and giving the user the ability to request backups of certain data seem like useful enhancements.

      Thoughts? Besides wondering what kind of person thinks through this while staring at a wall of hard drives at Fry's? :-)

      --

      Check out my sci-fi/humor trilogy at PatriotsBooks.

    9. Re:We are going to have two layers of storage by Anonymous Coward · · Score: 2, Insightful

      As capacity goes up, the feature size on flash gets smaller. This means less energy per bit and a thinner dielectric.

      So, as density of flash goes up, write cycle lifetime potentially goes down.

      HDDs have the same issue of bits being less "durable" as capacity goes up. However, the media never wears out for HDD. Furthermore, it is already accepted that there will be many bit errors and these are simply corrected with error correction codes and mapping out bad sectors.

      As far as reliability goes, everybody talks about it but nobody actually buys on the basis of reliability. At the end of the day it all comes down to dollars per gigabyte for most applications.

      Power usage on the other hand is becoming more and more important. That may actually be a strong selling point.

      As I've said elsewhere, the first step is going to be OS support for treating Flash differently than HDD. This will allow for hybrid storage solutions. At that point we will see the medium between HDDs and Flash. Right now, it is all or nothing going Flash. In that environment it is going to be hard for Flash to get going.

    10. Re:We are going to have two layers of storage by dgatwood · · Score: 4, Informative

      Five years ago, I would have agreed. These days, some of the better flash parts are rated as high as a million write cycles. If we're talking about 4 GB of flash, a million write cycles on every block would take a decade of continuous writes at 10 megabytes per second. Real-world workflows obviously won't hit the cache nearly that hard unless your OS has a completely worthless RAM-based write caching algorithm.... Odds are, the computer will wear out and be replaced long before the flash fails. That said, in the event of a flash write failure, you can always spin up the drive and do things the old-fashioned way. And, of course, assuming you put this on a card inside the machine, if it does fail, you wouldn't have to replace the whole motherboard to fix the problem.
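The "decade of continuous writes" estimate can be checked with a few lines of arithmetic. This sketch assumes perfect wear leveling (every block absorbs an equal share of writes) and decimal units (1 GB = 10^9 bytes); the function name is made up for the example.

```python
SECONDS_PER_YEAR = 365.25 * 24 * 3600


def years_to_wear_out(capacity_bytes, cycles_per_block, write_bytes_per_sec):
    """Years of nonstop writing before every block has used up its rated cycles."""
    total_writable_bytes = capacity_bytes * cycles_per_block
    return total_writable_bytes / write_bytes_per_sec / SECONDS_PER_YEAR


years = years_to_wear_out(
    capacity_bytes=4e9,         # 4 GB of flash
    cycles_per_block=1e6,       # a million write cycles
    write_bytes_per_sec=10e6,   # 10 megabytes per second, continuously
)
print(round(years, 1))  # roughly 12.7
```

So the figure in the comment holds up: a bit under thirteen years at a sustained 10 MB/s, and far longer under any realistic cache workload.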

      That said, to reduce thrashing of the write cache, it might be a good idea to add a cap of a meg or two and spin up the hard drive asynchronously once the write cache size exceeds that limit. Continue writing to the flash to avoid causing user delays while the HD spins up (huge perceived user performance win there, too) and flush once the drive is up to speed.

      You could also do smart caching of ephemeral data (e.g. anything in /tmp, /var/tmp, etc.). Instead of flushing changes to those files to disk on close, wait to flush them until there's no room for them in the RAM buffer cache, and then flush them to the flash. After all, those directories get wiped on reboot anyway, so if the computer crashes, there's no advantage to having flushed anything in those directories to disk....

      BTW, in the last week, I've lost two hard drives, both less than a year old. I'm not too impressed with the write lifetimes of Winchester disk mechanisms. :-)

      --

      Check out my sci-fi/humor trilogy at PatriotsBooks.

    11. Re:We are going to have two layers of storage by dgatwood · · Score: 4, Insightful

      Because write caches in RAM go away when your computer crashes, the power fails, etc. Battery-backed RAM is an option, but is a lot harder to get right than a USB flash part connected to an internal USB connector on a motherboard.... In-memory write caching (without battery backup) for more than a handful of seconds (to avoid writing files that are created and immediately deleted) is a very, very bad idea. There's a reason that no OS keeps data in a write cache for more than about 30 seconds (and even that is about five times too long, IMHO).

      Write caching is the only way you can avoid constantly spinning up the disk. We already have lots of read caching, so no amount of improvement to read caching is likely to improve things that dramatically over what we have already.

      Even for read caching, however, there are advantages to having hot block caches that are persistent across reboots, power failures, crashes, etc. (provided that your filesystem format provides a last modified date at the volume level so you can dispose of any read caches if someone pulls the drive, modifies it with a different computer, and puts the drive back). Think of it as basically prewarming the in-memory cache, but without the performance impact....

      --

      Check out my sci-fi/humor trilogy at PatriotsBooks.

  2. Write cycles. again. by Amiga+Lover · · Score: 4, Insightful

    Cue up 20 comments going "But what about the limited write cycles, these things will fail in a month" and 500 comments replying "this is no longer an issue n00b"

    1. Re:Write cycles. again. by morgan_greywolf · · Score: 4, Insightful

      You forgot the 1000 comments prognosticating about SSDs replacing HDDs permanently "any day now" with the added bravado of saying "I knew this would happen! See, I told you!" with 3000 comments replying "Yeah, but price/performance!", all of which will be replied to with "but price/performance doesn't matter, n00b. Price makes no difference to anyone."

      Then, in a fit of wisdom, a few posters, all of whom will be modded down as flamebait, will say "There's room for both and price/performance does matter, at least for now."

    2. Re:Write cycles. again. by maxume · · Score: 2, Insightful

      I'm just glad there is enough interest in paying for the performance to keep the development moving at a decent clip. Flash really does look like it will have a big advantage for laptop users who are not obsessed with storing weeks' worth of video.

      --
      Nerd rage is the funniest rage.
  3. Good by brock+bitumen · · Score: 3, Insightful

    They are trying to push new technology on their high-paying customers because they can get a premium while it's a scarce resource. This will drive up production and drive down costs, and soon we'll all be toting massive flash disks all day.

    I, for one, welcome our new flash disk overlords

  4. Re:Lifespan? by Chabil+Ha' · · Score: 4, Informative

    Oh no, the flash wear myth! Try this on.

    --
    We're all hypocrites. We all have hidden parts, it's the contrast between them that make us more a hypocrite than others
  5. big deal by larry+bagina · · Score: 4, Funny

    Most computers come with flash preloaded. I don't know why you'd be browsing the web or watching videos/web comics/ads/etc on a server computer. Maybe they're trying to dumb down to compete with Windows Server 2008.

    --
    Do you even lift?

    These aren't the 'roids you're looking for.

    1. Re:big deal by GuyverDH · · Score: 4, Funny

      Please.... Please.... Please.... Tell me you were joking....

      I can usually read into the comment if someone is joking or not... but this one... I dunno... Could go either way....

      --
      Who is general failure, and why is he reading my hard drive?
  6. Re:Samsung 256GB Flash Drive by QuantumRiff · · Score: 4, Interesting
    "This is just a story about SUN doing something that others have already done for some time now"

    Really? What other top 5 computer manufacturer has been putting flash drives in SERVERS? I've seen a few laptops, but I haven't seen any used in servers or storage systems. (EMC and a few others have announced plans to do it, but haven't released anything AFAIK)

    Also, their "thumper" server has 48 drives in it. Would you want to pay around $1000 per drive to fill that up?

    --

    What are we going to do tonight Brain?
  7. Two layers, but not those ones by clawsoon · · Score: 5, Insightful
    We are going to have two layers, but they'll be deeper in the filesystem than that.

    High frequency, low volume operations - metadata journalling, certain database transactions - will go to flash, and low frequency, high volume operations - file transfers, bulk data moves - will go to regular hard drives. SSDs aren't yet all that much faster for bulk data moving, so it makes the most economic sense to put them where they're most needed: Where the IOPs are.

    Back in the day, a single high-performance SCSI drive would sometimes play the same role for a big, cheap, slow array. Then, as now, you'd pay the premium price for the smallest amount of high-IOPs storage that you could get away with.
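One way to picture this split is a tiny routing rule that sends small, latency-sensitive operations to flash and everything else to disk. The 64 KiB cutoff, the `latency_sensitive` flag and the function name are all assumptions made up for this sketch; a real filesystem would decide at a much lower level.

```python
SMALL_IO_BYTES = 64 * 1024  # hypothetical cutoff between "small" and "bulk" I/O


def choose_tier(size_bytes, latency_sensitive):
    """Pick a storage tier for one operation under the assumed policy."""
    if latency_sensitive and size_bytes <= SMALL_IO_BYTES:
        return "flash"  # e.g. a journal commit or a small database transaction
    return "hdd"        # e.g. a file transfer or a bulk data move


print(choose_tier(4096, latency_sensitive=True))       # flash
print(choose_tier(1 << 30, latency_sensitive=False))   # hdd
```

The point of keeping the rule this strict is the economic one made above: buy only as much high-IOPS storage as the small, frequent operations actually need.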

  8. Re:Samsung 256GB Flash Drive by bubulubugoth · · Score: 2, Informative

    IBM has them as option for blades and racks servers...

  9. Re:Samsung 256GB Flash Drive by Archangel+Michael · · Score: 3, Insightful

    "Also, their "thumper" server has 48 drives in it. Would you want to pay around $1000 per drive to fill that up?"

    Yes. If performance dictated it was necessary.

    Just because you don't want to, doesn't mean everyone else doesn't want to also.
    --
    Agent K: A *person* is smart. People are dumb, stupid, panicky animals, and you know it.
  10. This will even further ZFS by E-Lad · · Score: 4, Interesting

    Current versions of ZFS have the feature where the ZIL (ZFS Intent Log) can be separated out of a pool's data devices and onto its own disk. Generally, you'd want that disk to be as fast as possible, and these SSDs will be the winner in that respect. Can't wait!

    1. Re:This will even further ZFS by allanw · · Score: 2, Interesting

      "Current versions of ZFS have the feature where the ZIL (ZFS Intent Log) can be separated out of a pool's data devices and onto its own disk. Generally, you'd want that disk to be as fast as possible, and these SSDs will be the winner in that respect. Can't wait!"

      As far as I know, contiguous writing of large chunks of data is slower for flash drives than plain HDDs. I'm guessing the ZIL is some kind of transactional journal log, where all disk writes go before they hit the main storage section of the filesystem? I don't think you'd get much of a speed bonus. SSDs are only really good for random access reads like OLTP databases.
    2. Re:This will even further ZFS by Anonymous Coward · · Score: 4, Interesting

      The benchmarks say something like a 200x performance gain from putting the ZIL onto an alternate high performance logging device.

      I have been actively researching a vendor who will supply this type of device. Currently we're testing with Gigabyte i-Ram cards, connected in through a separate SATA interface. (Note: Gigabyte are battery backed SDRAM .. but I won't have lost power for 12 hours so it's a non-issue for me)

      Fusion-IO is a vendor who is making a board for Linux - but as near as I can tell the cards aren't available yet, and when they are - they won't work with Solaris anyway!

      The product which Neil Perrin did his testing with (umem/micromemory) with their 5425CN card doesn't work with current builds of Solaris. Umem is also a pain to work with .. they don't even want to sell the cards (I managed to get some off eBay)

      I hope Sun lets me buy these cards separately for my HP proliant servers. Of course if they didn't, this is one thing that might make me consider switching to Sun Hardware! (Hey HP/Dell - are you reading this??)

  11. I'm surprised that it is big enough to talk about. by fuzzyfuzzyfungus · · Score: 4, Interesting

    Given that you can get flash disks that hang off pretty much any common bus used for mass storage(IDE, SATA, SAS, USB, SPI, etc.) "Adding a flash storage option" is pretty much an engineering nonevent, and a very minor logistical task.

    If Sun expects to sell a decent number of flash disks, or is looking at making changes to their systems based on the expectation that flash disks will be used, then it is interesting news; but otherwise it just isn't all that dramatic. While flash and HDDs are very different in technical terms, the present incarnations of both technologies are virtually identical from a system integration perspective. This sort of announcement just doesn't mean much at all without some idea of expected volume.

  12. Re:I'm surprised that it is big enough to talk abo by gbjbaanb · · Score: 4, Insightful

    "Adding a flash storage option" is pretty much an engineering nonevent but as a marketing event its a magnificent and almost unbelievable paradigm-shift approach to a massive problem that's been crying out for a reliable storage-based performance solution for years.
  13. IOPS by Craig+Ringer · · Score: 5, Informative

    People (read: vendors) now frequently refer to flash storage as superior when IOPs are the main issue.

    From what I've been able to discern this is actually true only in read-mostly applications and applications where writes are already in neat multiples of the flash erase block size.

    If you're doing random small writes your performance is likely to be miserable, because you'll need to erase blocks of flash much larger than the data actually being changed, then rewrite the block with the changed data.
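The cost of small writes can be put into numbers with a simple model of the erase-and-rewrite cycle described above. It assumes a naive device that rewrites whole erase blocks in place and writes that start on an erase-block boundary; the 128 KiB erase block size is an assumed value, not a figure from this thread.

```python
import math


def write_amplification(write_bytes, erase_block_bytes):
    """Bytes physically rewritten per byte the application asked to change."""
    blocks_touched = math.ceil(write_bytes / erase_block_bytes)
    physical_bytes = blocks_touched * erase_block_bytes
    return physical_bytes / write_bytes


ERASE_BLOCK = 128 * 1024  # assumed erase block size

print(write_amplification(4 * 1024, ERASE_BLOCK))    # 32.0 for a 4 KiB write
print(write_amplification(128 * 1024, ERASE_BLOCK))  # 1.0 when sizes match
```

Under these assumptions a 4 KiB random write costs 32 times its own size in flash traffic, while a write that is an exact multiple of the erase block costs nothing extra.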

    Some apps, like databases, might not care about this if you're able to get their page size to match or exceed that of the underlying storage medium. Whether or not this is possible depends on the database.

    For some other uses a log-oriented file system might help, but those have their own issues.

    In general, though, flash storage currently only seems to be exciting for random read-mostly applications, which get a revolting performance boost so long as the blocks being written are small enough and scattered enough. For larger contiguous reads hard disks still leave flash in the dust because of their vastly superior raw throughput.

    Vendors, however, make a much larger margin on flash disk sales.

    This article (PDF) may be of interest:
    Understanding Flash SSD performance
    (google text version).

  14. Re:I'm surprised that it is big enough to talk abo by fuzzyfuzzyfungus · · Score: 2, Funny

    Good point. And I strongly suspect that it enables today's Dynamic CIOs to realize unprecedented First-Mover Synergies in the modern Data-Centric Enterprise Solution Space.

  15. Re:I'm surprised that it is big enough to talk abo by boner · · Score: 5, Informative

    Re: "Adding a flash storage option" is pretty much an engineering nonevent, and a very minor logistical task.

    You have no idea what you are talking about. Sun customers demand that the product Sun sells them have known reliability properties and that Sun guarantees their products properly interact with each other. It takes a significant amount of resources to do this validation. At the same time SSDs and HDDs react very differently to load and can have all sorts of side effects if the OS/application is not prepared to deal with them.

  16. Power consumption and heat dissipation by UpooPoo · · Score: 4, Interesting

    I work in a company that has a few thousand servers running in a few regional data centers. We are looking into SSDs not because of their superior IOPs (this is a mitigating factor vs HDD performance) but because of their low power consumption and low heat dissipation. When your operations reach a scale where you are using an entire data center, heating and power become more and more of a cost issue. Right now we are trying to build some hard data on actual savings, but there's lots of spin out there that gives you an idea of what potential savings could be. Here are a few interesting links, google around for more information, there's plenty to be had:

    http://www.stec-inc.com/green/storage_casestudy.php
    http://www.stec-inc.com/green/green_ssdsavings.php (You have to request the whitepaper to see this one.)

  17. Re:Samsung 256GB Flash Drive by Calinous · · Score: 2, Insightful

    Samsung will have Multi Level Cells, which are slower (and cheaper). The Single Level cells are faster (up to twice as fast I think), but more expensive.
    You can go either way with it, but I think faster (and smaller) drives are more attractive than bigger and slower.
    You need to compete against the sequential speed of a 15,000 rpm SCSI drive too (SSD will beat them dead on access speed, but not all workloads are small random reads)

  18. Re:I'm surprised that it is big enough to talk abo by Jor-Al · · Score: 3, Informative

    "However, neither of Sun's competitors, IBM or HP, offer SSDs at the moment."

    You're about a year too late making that comment. IBM has SSDs in their blades: http://www.itbusinessedge.com/blogs/dcc/?p=175
  19. RAID 4, anyone? by mentaldrano · · Score: 3, Insightful

    In the time between now and when SSD becomes cheaper than magnetic storage, might we see a resurgence of RAID 4? RAID 4 stripes data across several disks, but stores parity information all on one disk, rather than distributing the parity bits like RAID 5.

    This has benefits for workloads that issue many small randomly located reads and writes: if the requested data size is smaller than the block size, a single disk can service the request. The other disks can independently service other requests, leading to much higher random access bandwidth (though it doesn't help latency).

    One of the side effects of this is that the parity disk must be much faster than the data disks, since it must service all requests, to provide the parity info. Here SSD shines, with its quick random access times, but poor sequential performance. Interesting, no?
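The parity bottleneck is easiest to see in the small-write path. The sketch below uses XOR parity, which is how RAID 4 and RAID 5 compute it; the function names and the two-byte blocks are made up for the example.

```python
from functools import reduce


def parity(blocks):
    """XOR equal-length blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def small_write_parity(old_data, new_data, old_parity):
    """New parity after rewriting one block, without reading the other data disks."""
    return bytes(o ^ n ^ p for o, n, p in zip(old_data, new_data, old_parity))


stripe = [b"\x01\x02", b"\x10\x20", b"\xff\x00"]  # one block per data disk
p = parity(stripe)

new_block = b"\xaa\xbb"                           # rewrite the block on disk 1
new_p = small_write_parity(stripe[1], new_block, p)

# Same result as recomputing parity over the whole updated stripe.
print(new_p == parity([stripe[0], new_block, stripe[2]]))  # True
```

A small write touches exactly one data disk, but every small write also reads and rewrites the parity block, so with a dedicated parity disk all of that traffic lands on one device.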

  20. Cheaper than RAM? by AmiMoJo · · Score: 3, Insightful

    At the moment high performance SSDs are still more expensive than RAM. Since a 64 bit processor can address vast amounts of RAM, wouldn't it be even better and cheaper just to have 200GB of RAM rather than 200GB of SSD?

    Okay, you would still need a HDD for backing store, but in many server applications involving databases (high performance dynamic web servers for example) a normal RAID can cope with the writes - it's the random reads accessing the DB that cause the bottleneck. Having 200GB of database in RAM with HDDs for backing store would surely be higher performance than SSD.

    For things where writes matter like financial transactions, would you want to rely on SSD anyway? Presumably banks have lots of redundancy and multiple storage/backup devices anyway, meaning each transaction is limited by the speed of the slowest storage device.

    --
    const int one = 65536; (Silvermoon, Texture.cs)
    SJW, n: "Someone I don't like, and by the way I'm a fuckwit" - AC
  21. Sun is Afraid of THIS! by StCredZero · · Score: 2, Interesting

    A single ioFusion card has the concurrent data serving ability of a 1U server cabinet full of media servers. They do this by having 160 channels on a drive controller that also incorporates flash memory. Since each channel is a few orders of magnitude faster than a mechanical hard drive, one card can handle a flurry of concurrent random access requests as fast as 1000 conventional hard drives.

    The perfect thing for serving media, where you don't need a few GB per customer, you need the same few GB served out to 1000's or millions of users concurrently. So while $/GB stored stinks, $/GB streamed is fantastic.

  22. what drives are for. by flaming-opus · · Score: 2, Insightful

    You're confusing two very different sorts of storage. There is bulk data storage. This is a fileserver for home directories, video archives, piles of email, that sort of stuff. This is the market where the 1TB sas drive thrives. Then there's the database backing store. Almost every customer I've sold to wants a huge number of very fast, very small drives for database backing store. The extra capacity is meaningless, as they have to use so many spindles to get a decent IOPS performance. In this area, selling drives hasn't been about capacity for 10 years. IOPS, in particular read IOPS, is your throttle point for these. Now that flash drives are beginning to get traction for high-end laptops, and we have affordable SSD drives with industry standard interfaces, there's no reason NOT to use them.
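The "so many spindles" point can be illustrated with a sizing calculation. Every number here is hypothetical (a 20,000 IOPS target, 180 IOPS per drive, 73 GB drives, a 500 GB database), as are the function names; the shape of the result is what matters.

```python
import math


def drives_for_iops(target_iops, iops_per_drive):
    """Number of drives needed to reach a random-IOPS target."""
    return math.ceil(target_iops / iops_per_drive)


def stranded_capacity_gb(drives, drive_gb, needed_gb):
    """Capacity bought only to get spindles, beyond what the data needs."""
    return drives * drive_gb - needed_gb


drives = drives_for_iops(target_iops=20_000, iops_per_drive=180)
print(drives)                                  # 112 drives
print(stranded_capacity_gb(drives, 73, 500))   # 7676 GB bought but not needed
```

When the drive count is set by IOPS rather than by capacity, most of the purchased space sits idle, which is the case where a smaller, faster device becomes attractive despite its price per gigabyte.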

    Also, fibre channel drives already cost $1000, so paying this much is nothing new for enterprise customers. An enterprise server with LESS than $50,000 of storage would be the oddball case.