Wear Leveling, RAID Can Wipe Out SSD Advantage
storagedude writes "This article discusses using solid state disks in enterprise storage networks. A couple of problems noted by the author: wear leveling can eat up most of a drive's bandwidth and make write performance no faster than a hard drive, and using SSDs with RAID controllers brings up its own set of problems. 'Even the highest-performance RAID controllers today cannot support the IOPS of just three of the fastest SSDs. I am not talking about a disk tray; I am talking about the whole RAID controller. If you want full performance of expensive SSDs, you need to take your $50,000 or $100,000 RAID controller and not overpopulate it with too many drives. In fact, most vendors today have between 16 and 60 drives in a disk tray and you cannot even populate a whole tray. Add to this that some RAID vendor's disk trays are only designed for the performance of disk drives and you might find that you need a disk tray per SSD drive at a huge cost.'"
This assumes that RAID controller manufacturers won't be making any changes though.
RAID for years has relied on millisecond access times. So why spend a lot of money on an ASIC and subsystem that can go faster? If you take a RAID card designed for (relatively) slow spinning disks and attach it to SSDs, of course the RAID card is going to be a bottleneck.
However, subsystems are going to be designed to work with SSDs, which have much lower access times. When that happens, this so-called 'bottleneck' is gone. You know every major disk subsystem vendor is working on these. Sounds like a disk vendor is sponsoring 'studies' to convince people not to invest in SSD technologies now, knowing that a lot of companies are looking at big purchases this year because of the age of equipment after the downturn.
As a rock-and-roll Physicist once said, No matter where you go, there you are.
RAID means "Redundant Array of Inexpensive Disks".
Wear Leveling, RAID Can Wipe Out SSD Advantage for enterprise.
While it may not be efficient to slap together a platter of 16 SSDs, it is worthwhile to upgrade personal computers to use an SSD.
Scaling works both ways. Often technology that benefits larger installations or enterprise environments gets scaled down to the desktop after being fine-tuned. It's not uncommon for technology that benefits desktop or smaller implementations to scale up to eventually benefit the 'big boys'. This is simply a case of the laptop getting the technology first as it was the most logical place for it to get traction. Give SSDs a little time and they'll work their way into RAID as well as other server solutions.
The real advantage of solid state storage is seek time, not read/write times. They don't beat conventional drives by much at sustained IO. Maybe this will change in the future. RAID just isn't meant for SSD devices. RAID is a fix for the unreliable nature of magnetic disks.
This study seems to have a very bad case of "unconsciously idealizing the status quo and working from there". For instance:
"Even the highest-performance RAID controllers today cannot support the IOPS of just three of the fastest SSDs. I am not talking about a disk tray; I am talking about the whole RAID controller. If you want full performance of expensive SSDs, you need to take your $50,000 or $100,000 RAID controller and not overpopulate it with too many drives. In fact, most vendors today have between 16 and 60 drives in a disk tray and you cannot even populate a whole tray. Add to this that some RAID vendor's disk trays are only designed for the performance of disk drives and you might find that you need a disk tray per SSD drive at a huge cost."
That sounds pretty dire. And it does in fact mean that SSDs won't be neat drop-in replacements for some legacy infrastructures. However, step back for a minute: Why did traditional systems have 50k or 100k RAID controllers connected to large numbers of HDDs? Mostly because the IOPS on an HDD, even a 15K RPM monster, sucked horribly. If 3 SSDs can swamp a RAID controller that could handle 60 drives, that is an overwhelmingly good thing. In fact, you might be able to ditch the pricey RAID controller entirely, or move to a much smaller one, if 3 SSDs can do the work of 60 HDDs.
Now, for systems where bulk storage capacity is the point of the exercise, the ability to hang tray after tray full of disks off the RAID controller is necessary. However, that isn't the place where you would be buying expensive SSDs. The SSD vendors aren't even pretending that SSDs can cut it as capacity kings. For systems that are judged by their IOPS, though, the fact that the tradition involved hanging huge numbers of HDDs (often mostly empty, reading and writing only to the parts of the platter with the best access times) off extremely expensive RAID controllers shows that the past sucked, not that SSDs are bad.
For the obligatory car analogy: shortly after the début of the automobile, manufacturers of horse-drawn carriages noted the fatal flaw of the new technology: "With a horse-drawn carriage, a single buggy whip will serve to keep you moving for months, even years with the right horses. If you try to power your car with buggy whips, though, you could end up burning several buggy whips per mile, at huge expense, just to keep the engine running..."
... researchers have found that putting a Formula One engine into a Mack truck wipes out the advantages of the 19,000 rpm.
Why is it so hot? Where am I going? What am I doing in this handbasket?
Super Sonic Device. They're hard drives that spin so fast the edge of the platter goes faster than sound.
Not a sentence!
And we don't have to use Highlander Rules when considering drive technologies. There's no reason that one has to build a storage array right now out of purely SSD or purely HDD. Sun showed in some of their storage products that by combining a few SSDs with several slower, large capacity HDDs and ZFS, they could satisfy many workloads for a lot less money. (Pretty much the only thing a hybrid storage pool like that can't do is sustain very high IOPS of random reads across a huge pool of data with no read locality at all.)
I hope we see more filesystems support transparent hybrid storage like this...
If you use ZFS with SSDs, it scales very nicely. There isn't a bottleneck at a RAID controller. If you have bandwidth problems because you've bought 100 SSDs, you can slam a pile of controllers into a chassis: by having the RAID management outside the controller, ZFS can unify the whole lot in one giant high-performance array.
The advantage of hardware RAID, at least with RAID 5, is the battery backup. When you write a RAID stripe, you need to write the whole thing atomically. If the writes work on some drives and fail on others, you can't recover the stripe. The checksum will fail, and you'll know that the stripe is damaged, but you won't know what it should be. With a decent RAID controller, the entire write cache will be battery backed, so if the power goes out you just replay the stuff that's still in RAM when the array comes back online. With software RAID, you'd just lose the last few writes, (potentially) leaving your filesystem in an inconsistent state.
This is not a problem with ZFS, because it handles transactions at a lower layer: you either complete a transaction or lose the transaction, so the disk is never in an inconsistent state.
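To make the RAID 5 torn-write problem above concrete, here is a minimal sketch of XOR parity in Python. The block contents and the helper name `xor_blocks` are made up for illustration, and a real controller works on much larger blocks with rotating parity. It only shows why a stripe whose data was updated without its parity can be detected as bad but not repaired.

```python
from functools import reduce

def xor_blocks(*blocks: bytes) -> bytes:
    """Byte-wise XOR of equal-length blocks (RAID 5 style parity)."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

# A consistent stripe: three data blocks plus their parity block.
d0, d1, d2 = b"AAAA", b"BBBB", b"CCCC"
parity = xor_blocks(d0, d1, d2)

# With a consistent stripe, any single lost block can be rebuilt.
assert xor_blocks(d0, d2, parity) == d1

# Torn write: d1 reaches the disk, but power fails before parity is rewritten.
d1_new = b"XXXX"
stripe_is_consistent = xor_blocks(d0, d1_new, d2) == parity

# If the disk holding d0 now dies, rebuilding from the stale parity
# quietly produces garbage instead of the original d0.
rebuilt_d0 = xor_blocks(d1_new, d2, parity)

print("stripe consistent after torn write:", stripe_is_consistent)
print("rebuilt d0 matches original:", rebuilt_d0 == d0)
```

A battery-backed cache avoids this by replaying the unfinished stripe write on power-up. A transactional filesystem avoids it by never overwriting a live stripe in place.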
I am TheRaven on Soylent News
My understanding is that pretty much all the serious storage appliance vendors are moving in that direction, at least in the internals of their devices. I suspect that pretty much anybody who isn't already a Sun customer doesn't want to have to deal with ZFS directly; but that even the "You just connect to the iSCSI LUN, our magic box takes it from there" magic boxes are increasingly likely to have a mix of drive types inside.
I'll be interested to see, actually, how well the traditional 15K RPM SCSI/SAS enterprise screamer style HDDs hold up in the future. For applications where IOPS are supreme, SSDs (and, in extreme cases, DRAM-based devices) are rapidly making them obsolete in performance terms, and price/performance terms are getting increasingly ugly for them. The costs of fabricating flash chips are continuing to fall; the costs of building mechanical devices that can handle what those drives can aren't falling nearly as fast. For applications where sheer size or cost/GB are supreme, the fact that you can put SATA drives on SAS controllers is super convenient. It allows you to build monstrous storage capacity for impressively small amounts of money, and that capacity is still pretty zippy for loads that are low on random read/write and high on sustained read or write (like backups and nearline storage).
Is there a viable niche for the very high end HDDs, or will they be murdered from above by their solid state competitors, and from below by vast arrays of their cheap, cool running, and fairly low power, consumer derived SATA counterparts?
Also, since no punning opportunity should be left unexploited, I'll note that most enterprise devices are designed to run headless without any issues at all, so Highlander rules cannot possibly apply.
Disks are cheap. There's no reason to use the full GB (or TB) capacity, especially if you want fast response. If you just use the outside 20% of a disk, the random I/O performance increases hugely. ISTM the best mix is some sort of journalling system, where the SSDs are used for read operations and updates get written to the spinning storage (or NV RAM/cache). Then, at predetermined times, perform bulk updates back to the SSD. If some storage array manufacturer came up with something like that, I'd expect most performance problems to simply go away.
politicians are like babies' nappies: they should both be changed regularly and for the same reasons
We haven't purchased 15k disks for years. In most cases, it is actually cheaper to buy 3x or even 4x SATA spindles to get the same IOPS. Plus you get all that capacity for free, even when you factor in extra chassis and power costs. We use all that capacity for snapshots, extra safety copies, etc. If your enterprise storage vendor is charging you the same price for a 1TB SATA spindle as a 300GB 15K spindle, you need to find a new vendor. Look at scale-out clustered solutions instead of the dinosaur "dual fiber controllers and a bunch of disk" offerings.
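For anyone who wants to sanity-check the spindle-count claim above, here is a rough Python sketch of the usual rule of thumb: random IOPS per spindle is about one over (average seek time plus average rotational latency). The rotational latency follows from the RPM. The two seek times are my own assumed ballpark figures, not numbers from this thread or any datasheet, so treat the ratio as illustrative only.

```python
def rotational_latency_ms(rpm: int) -> float:
    """Average rotational latency: the time for half a revolution."""
    return 0.5 * 60_000 / rpm

def random_iops(rpm: int, avg_seek_ms: float) -> float:
    """Rule-of-thumb random IOPS for a single spindle."""
    return 1000.0 / (avg_seek_ms + rotational_latency_ms(rpm))

# ASSUMED average seek times (hypothetical, pick your own from a datasheet).
SEEK_15K_MS = 3.5
SEEK_SATA_MS = 8.5

iops_15k = random_iops(15_000, SEEK_15K_MS)
iops_sata = random_iops(7_200, SEEK_SATA_MS)
spindles_needed = iops_15k / iops_sata

print(f"15K spindle:  ~{iops_15k:.0f} IOPS")
print(f"SATA spindle: ~{iops_sata:.0f} IOPS")
print(f"SATA spindles per 15K spindle: ~{spindles_needed:.1f}")
```

Plug in the seek times for the drives you are actually comparing. The ratio moves a fair bit with those two inputs, and it ignores queueing, caching and short-stroking.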
How about this for an argument.
A 500GB SSD can be entirely over-written ("changing all the data on the medium") over 10,000 times. No wear leveling needed here. 10K writes is the low end for modern flash.
Let's suppose you can write 200MB/sec to this drive. That's about average for the top enders right now.
It will take 2,500 seconds to overwrite this entire drive once. That's about 42 minutes.
So how long to overwrite it 10,000 times?
That's 25,000,000 seconds.
That's 416,667 minutes.
That's 6,944 hours.
That's 289 days.
289 *days* of constant 24/7 writing to use up the flash.
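The arithmetic above can be reproduced with a few lines of Python. This is just a sketch using the figures from the post (500GB, 200MB/sec, 10,000 overwrites). It assumes decimal units, perfectly even wear and no write amplification, none of which a real drive guarantees.

```python
# Figures taken from the post; units are decimal (1 GB = 1000 MB).
CAPACITY_MB = 500 * 1000      # 500 GB drive
WRITE_MB_PER_S = 200          # sustained write speed
ERASE_CYCLES = 10_000         # full-drive overwrites the flash is assumed to survive

def seconds_per_full_overwrite(capacity_mb: float, mb_per_s: float) -> float:
    """Time to write every byte of the drive once."""
    return capacity_mb / mb_per_s

def days_to_wear_out(capacity_mb: float, mb_per_s: float, cycles: int) -> float:
    """Days of nonstop writing until every block has used all its cycles."""
    total_seconds = seconds_per_full_overwrite(capacity_mb, mb_per_s) * cycles
    return total_seconds / 86_400

one_pass = seconds_per_full_overwrite(CAPACITY_MB, WRITE_MB_PER_S)
days = days_to_wear_out(CAPACITY_MB, WRITE_MB_PER_S, ERASE_CYCLES)
print(f"one pass: {one_pass:.0f} s, wear-out: {days:.0f} days")
# prints "one pass: 2500 s, wear-out: 289 days"
```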
Now.. and this is the key point.. will a platter drive survive 289 days of constant max-throughput writing? The answer is no. You will burn out the platter drive's physical components way before that.
"His name was James Damore."
289 *days* of constant 24/7 writing to use up the flash.
This assumes the case of repeated sequential write to blocks 1 to n, where no wear levelling occurs. Consider that I first write once to 100% of the disk, then repeatedly: write sequentially to the first 25% of the disk n times, then write to the remaining 75% of the disk once. Dynamic wear levelling is out. How is a typical static wear levelling algorithm likely to kick in in a way which prevents an unacceptable slowdown during one pass, while at the same time squeezing out max writes to all physical blocks?
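To put rough numbers on the skewed workload described above, here is a small Python sketch. It models only the worst case, where the drive is full and the hot blocks are never relocated (no static wear levelling at all), and compares the total data you can write against the ideal of perfectly even wear. The function name and the sample values of n are mine, not from any real controller.

```python
def lifetime_fraction(hot_fraction: float, hot_passes: int) -> float:
    """Write volume achieved with NO block relocation, relative to ideal levelling.

    Each round writes the hot region hot_passes times and the cold region once.
    Without relocation the drive is finished when the hot blocks run out of
    cycles; with ideal levelling the same writes are spread over every block.
    """
    cold_fraction = 1.0 - hot_fraction
    # Data written per round, in units of one full drive capacity.
    written_per_round = hot_fraction * hot_passes + cold_fraction
    # Hot blocks burn hot_passes cycles per round, while ideal levelling
    # would burn only written_per_round cycles per block per round.
    return written_per_round / hot_passes

IDEAL_DAYS = 289  # figure from the parent post, assuming perfectly even wear

for n in (1, 4, 10, 100):
    frac = lifetime_fraction(0.25, n)
    print(f"n={n:>3}: {frac:.3f} of ideal, roughly {IDEAL_DAYS * frac:.0f} days")
```

As n grows the fraction tends towards 0.25, so in this model the hot quarter of the drive dies after about a quarter of the ideal write volume. A static wear levelling algorithm has to close that gap by moving cold data around, and that extra copying is where the bandwidth cost comes from.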
Now.. and this is the key point.. will a platter drive survive 289 days of constant max-throughput writing? The answer is no.
According to whom? Where are the independent test results for various specified duty cycles, performed in real time?
(Although perhaps all that matters is whether, at any time before m years is up, I will get a warranty replacement for my drive.)