Wear Leveling, RAID Can Wipe Out SSD Advantage

Posted by Soulskill on Saturday March 6, 2010 @05:32AM from the not-so-solid dept.

storagedude writes "This article discusses using solid state disks in enterprise storage networks. A couple of problems noted by the author: wear leveling can eat up most of a drive's bandwidth and make write performance no faster than a hard drive, and using SSDs with RAID controllers brings up its own set of problems. 'Even the highest-performance RAID controllers today cannot support the IOPS of just three of the fastest SSDs. I am not talking about a disk tray; I am talking about the whole RAID controller. If you want full performance of expensive SSDs, you need to take your $50,000 or $100,000 RAID controller and not overpopulate it with too many drives. In fact, most vendors today have between 16 and 60 drives in a disk tray and you cannot even populate a whole tray. Add to this that some RAID vendors' disk trays are only designed for the performance of disk drives and you might find that you need a disk tray per SSD drive at a huge cost.'"

Correction: by raving+griff · 2010-03-06 05:40 · Score: 5, Informative

Wear Leveling, RAID Can Wipe Out SSD Advantage for enterprise.
While it may not be efficient to slap together a platter of 16 SSDs, it is worthwhile to upgrade personal computers to use an SSD.
1. Re:Correction: by JustASlashDotGuy · 2010-03-06 15:12 · Score: 2, Informative
  
  You obviously don't manage a SAN and I'm starting to think you've never seen one. SSDs are nice, but typical FC/SAS/SATA drives will be around for a long time to come. IOPS aren't all that matters in a SAN; space matters as well.
  For example, typical SANs are set up in tiers. In my case, we use Compellent SANs. New writes and active data are written to 15K FC drives. They are fast and expensive, and we have less total capacity on them than on SATA. After about 2 weeks, the inactive blocks that were on the FC drives are moved down to SATA. If your company is like most companies, you have a lot of stale data that finds its way to the SAN and may never be touched for years. This data is good to have on slower SATA disk. There's no need to waste money and rack space to store data that no one will access for years.
  We are flirting with the idea of adding SSD disk to our tiers. In our case, the SSD tier would receive all the new writes for that tier (RAID10) and then tier everything down to RAID5 overnight. This allows the RAID5 write penalty to be taken in the off hours. 2 weeks later, the really old blocks are sent to SATA. In this case FC and SATA disk will just be used for reads.
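The tiering scheme described above can be sketched roughly as follows. This is only an illustration of the policy as stated in the comment, not how Compellent implements it: the tier names, the `Block` type, and the `nightly_migrate` function are all made up for the example, and the 14-day threshold stands in for "about 2 weeks".

```python
from __future__ import annotations

from dataclasses import dataclass

# Hypothetical tier names, fastest first; these are not Compellent identifiers.
SSD_RAID10 = "ssd_raid10"
FC_RAID5 = "fc_raid5"
SATA = "sata"

STALE_AFTER_DAYS = 14  # stands in for "about 2 weeks" in the comment


@dataclass
class Block:
    block_id: int
    tier: str
    days_since_access: int


def write_new(block_id: int) -> Block:
    """New writes always land on the fastest tier (SSD, RAID10)."""
    return Block(block_id, SSD_RAID10, 0)


def nightly_migrate(blocks: list[Block]) -> None:
    """Off-hours pass that moves each block at most one tier down."""
    for b in blocks:
        if b.tier == SSD_RAID10:
            # The RAID5 write penalty is paid overnight, not during the day.
            b.tier = FC_RAID5
        elif b.tier == FC_RAID5 and b.days_since_access >= STALE_AFTER_DAYS:
            # Stale data goes to the cheap, high-capacity tier.
            b.tier = SATA
```

With this policy the SSD tier only ever holds one day's worth of writes, which is why a small amount of expensive flash could plausibly front a much larger pool of spinning disk.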
Re:Duh by Anarke_Incarnate · 2010-03-06 05:48 · Score: 3, Informative

or Independent, according to another fully acceptable version of the acronym.
Seek time by 1s44c · 2010-03-06 05:48 · Score: 4, Informative

The real advantage of solid state storage is seek time, not read/write times. They don't beat conventional drives by much at sustained IO. Maybe this will change in the future. RAID just isn't meant for SSD devices. RAID is a fix for the unreliable nature of magnetic disks.
1. Re:Seek time by LBArrettAnderson · 2010-03-06 06:34 · Score: 3, Informative
  
  That hasn't been the case for at least a year now. A lot of SSDs will do much better with sustained read AND write speeds than traditional HDs (the best of which top out at around 100MB/sec). SSDs are reading at well over 250MB/sec and some are writing at 150-200MB/sec. And this is all based on the last time I checked, which was 5 or 6 months ago.
Re:Raid controllers obsolete? by TheRaven64 · 2010-03-06 06:41 · Score: 4, Informative

The advantage of hardware RAID, at least with RAID 5, is the battery backup. When you write a RAID stripe, you need to write the whole thing atomically. If the writes work on some drives and fail on others, you can't recover the stripe. The checksum will fail, and you'll know that the stripe is damaged, but you won't know what it should be. With a decent RAID controller, the entire write cache will be battery backed, so if the power goes out you just replay the stuff that's still in RAM when the array comes back online. With software RAID, you'd just lose the last few writes, (potentially) leaving your filesystem in an inconsistent state.
This is not a problem with ZFS, because it handles transactions at a lower layer, so you either complete a transaction or lose the transaction; the disk is never in an inconsistent state.
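To make the partial-write problem concrete, here is a toy sketch of RAID 5 style XOR parity. It assumes three data chunks plus one parity chunk per stripe and uses made-up four-byte chunks; real controllers work on much larger units, and the function names are just for illustration.

```python
from functools import reduce


def xor_parity(chunks):
    """Byte-wise XOR of equal-length chunks, as used for RAID 5 parity."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*chunks))


def stripe_consistent(data_chunks, parity):
    """True if the stored parity matches the data chunks."""
    return xor_parity(data_chunks) == parity


# A healthy stripe: three data chunks and their parity.
old_data = [b"AAAA", b"BBBB", b"CCCC"]
parity = xor_parity(old_data)

# Power fails after the second chunk is rewritten but before parity is updated.
torn_data = [b"AAAA", b"XXXX", b"CCCC"]

# In the healthy stripe, a lost chunk can be rebuilt from the others plus parity.
rebuilt = xor_parity([old_data[0], old_data[2], parity])

# In the torn stripe, rebuilding chunk 0 from the survivors gives garbage.
bad_rebuild = xor_parity([torn_data[1], torn_data[2], parity])
```

Here `stripe_consistent(torn_data, parity)` is false, so the damage is detectable, but nothing in the stripe says whether the data or the parity is the stale part. That is the gap a battery-backed write cache closes by replaying the unfinished write.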

--
I am TheRaven on Soylent News
Re:Little Flawed study. by rodgerd · 2010-03-06 07:21 · Score: 2, Informative

ceph, XIV, and other distributed storage controller models are available today, and avoid controller bottlenecks.
Re:Little Flawed study. by sirsnork · 2010-03-06 07:33 · Score: 3, Informative

He may have half missed the point, but so did you.

I clicked on this thinking this guy has done some testing... somewhere. Nope, nothing, no mention of benchmarks or what hardware he used. I'm sure some of what he said is true. But I'd really like to see the data that he gets this from:

>I have seen almost 4 to 1. That means that the write performance might drop to 60 MB/sec and the wear leveling could take 240 MB/sec.

I'd also really like to know what controllers he's tested with, whether or not they have TRIM support (perhaps none do yet), what drives he used, if he had a BBU and write-back enabled, etc.

Until he gives us the sources and the facts, this is nothing but a FUD piece. Yes, wear levelling will eat up some bandwidth; that's hardly news. Show us the data about how much and which drives are best.
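For what it's worth, the arithmetic behind the quoted figures is easy to reproduce. The sketch below assumes the drive has a fixed raw write bandwidth that is split between host writes and wear-leveling traffic; the 300 MB/sec total is not stated in the article, it is simply 60 + 240 from the quote, and the function name is made up.

```python
def host_write_rate(raw_mb_s, overhead_ratio):
    """Host-visible write rate when internal traffic is overhead_ratio times host traffic."""
    return raw_mb_s / (1 + overhead_ratio)


# Assumed raw bandwidth: 60 MB/sec host + 240 MB/sec wear leveling from the quote.
raw = 60 + 240

host = host_write_rate(raw, 4)   # 60.0 MB/sec at the quoted 4:1 ratio
internal = raw - host            # 240.0 MB/sec left for wear leveling
```

So the numbers are at least internally consistent. What is missing is any measurement showing that a real drive actually hits a 4:1 ratio.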

--

Normal people worry me!
Re:Little Flawed study. by amorsen · 2010-03-06 09:26 · Score: 2, Informative

>Why is it so hard for developers of ports and interface standards to get it super fast, first time round? It's not like there's a power issue and there's no worry about having to make things small enough (as with say the CPU).
There IS a power issue, and most importantly there's a price issue. The interface electronics limit speed. Even today, 10Gbps ethernet (10Gbase-T) is quite expensive and power hungry. 40Gbps ethernet isn't even possible with copper right now. They couldn't have made USB 3 40 Gbps instead of 4; the technology just isn't there. In 5 years maybe, in 10 years almost certainly.
USB 1 could have been made 100Mbps, but the others were close to what was affordable at the time.

--
Finally! A year of moderation! Ready for 2019?
Re:ZFS sidesteps the whole RAID controller problem by turing_m · 2010-03-06 12:59 · Score: 2, Informative

>Get a real high-performance file system. One that's also mature and can actually be recovered if it ever does fail catastrophically. (Yes, ZFS can fail catastrophically. Just Google "ZFS data loss"...)
I just did. On the first page, I got just one result, relating to an event from January 2008 - Joyent. And they managed to recover their data. I did another search - "ZFS lost my data". One example was running on FreeBSD 7.2, in which ZFS was not yet production ready. Other examples existed in which people were eventually able to get their data.
The following is an interesting message - http://www.sun.com/msg/ZFS-8000-8A - that seems pretty scary but someone was able to get back their data anyway. All in all, the lack of datapoints for ZFS losing data is actually encouraging. If this were really a problem, I would expect to see a lot more forum posts about this, and people piling on as well. The others are singing ZFS's praises.

--
If I have seen further it is by stealing the Intellectual Property of giants.
Re:Little Flawed study. by gfody · 2010-03-06 14:04 · Score: 2, Informative

..not to mention the gobs and gobs of cheap SDRAM you could use as cache. There's a huge opportunity for an up and coming SAN company to be competitive with commodity hardware. Doesn't look good for the likes of 3PAR, EMC, EqualLogic, etc.

--

bite my glorious golden ass.
Re:Little Flawed study. by TheLink · 2010-03-06 20:09 · Score: 2, Informative

>Enterprise loads such as databases do many many seeks and tend to have long queues as many clients request the data. Size and throughput are less important for these loads than seek time (though still critical).

Did you even read the link?

"I saw 4KB random write speed drop from 50MB/s down to 45MB/s. Sequential write speed remained similarly untouched. But now I've gone and ruined the surprise."

That's for random writes. 4KB random writes at 45MB/sec is 11520 writes per second.

A 15000rpm drive doing 4KB random writes (noncached/buffered) will only manage about 250 IOPS (assuming 4 millisecond seek times). Or about 1MBps.

That's 45 times slower. You'll need a lot of spindles to match that.
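The numbers above can be checked with a few lines. This sketch follows the same simplifications as the comment: 1MB is taken as 1024KB, and the hard drive is assumed to complete one I/O per 4 ms seek with rotational latency and transfer time ignored. The function names are only for illustration.

```python
KB_PER_MB = 1024


def iops_from_throughput(mb_per_s, io_kb=4):
    """I/O operations per second implied by a throughput at a fixed I/O size."""
    return mb_per_s * KB_PER_MB / io_kb


def iops_from_seek(seek_ms):
    """Upper bound on random IOPS if every I/O costs one full seek."""
    return 1000 / seek_ms


def throughput_mb_s(iops, io_kb=4):
    """Throughput implied by an IOPS figure at a fixed I/O size."""
    return iops * io_kb / KB_PER_MB


ssd_iops = iops_from_throughput(45)   # 11520.0 writes per second
hdd_iops = iops_from_seek(4)          # 250.0 IOPS
hdd_mb_s = throughput_mb_s(hdd_iops)  # 0.9765625, i.e. about 1MBps
ratio = ssd_iops / hdd_iops           # 46.08, in line with "45 times"
```

Adding the roughly 2 ms average rotational latency of a 15000rpm drive to the 4 ms seek would lower the hard drive figure further, so the 250 IOPS estimate is generous to the spinning disk.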

The only issues I see with SSD are whether reliability is really up to scratch, whether you can hotswap them, and perhaps capacity (if you somehow can't use a tiered storage scheme).
Re:Little Flawed study. by amorsen · 2010-03-06 20:30 · Score: 2, Informative

You would still have to run a sophisticated DSP, unless you kept entirely separate chips for USB1, USB2, and USB3. The DSP would eat lots of power even when working at USB1-speed.
Also, we're talking hundreds or thousands of dollars for a USB3-DSP in the USB1 era.

--
Finally! A year of moderation! Ready for 2019?