Triple M.2 NVMe RAID-0 Testing Proves Latency Reductions
Vigile writes: The gang over at PC Perspective just posted a story that looks at a set of three M.2 form factor Samsung 950 Pro NVMe PCIe SSDs in a RAID-0 array, courtesy of a new motherboard from Gigabyte that includes three M.2 slots. The pure bandwidth available in this configuration is amazing, breaching 3.3 GB/s on reads and 3.0 GB/s on writes. But what is more interesting is a new testing methodology that allows the latency of individual storage IOs to be captured, giving us a look at the performance of SSDs in all configurations. What PC Perspective proved here is that users who often claim that RAID arrays "feel faster," despite a lack of bandwidth results to prove it, are likely correct. Measurements now show that the latency of IO operations improves dramatically as you add drives to an array, giving a feeling of "snappiness" to a system beyond even what a single SSD can offer. PC Perspective's new testing demonstrates the triple RAID-0 array having just 1/6th of the latency of a single drive.
On VMs in a home lab I see no difference between RAID 0 and a single drive with SATA Samsung Pros, outside of benchmarks.
However, my VMs do boot quicker with the Samsung than with the SanDisk that replaced them. The IOPS were better, and that made them boot and shut down quicker. I imagine it is the same at work with database or production VMs too.
http://saveie6.com/
Remember kids, losing just one drive dumps the entire array. It's really not appropriate for anything besides completely transient data (scratch disks, stuff like this benchmark, etc.). Not smart to run your OS on RAID 0. RAID 10, OTOH . . . now we're talking.
They did post a quick speed graph for RAID-5 on the drives; obviously writes took a hit, but reads were almost exactly the same as RAID-0 across three drives.
Karnal
I've been building PCs long enough to remember a time when things were improving so quickly that it made no sense to keep a computer for more than 4 years. But since then, the progress in CPU performance has reached a plateau. People like me, who bought a good Sandy Bridge system in 2011, still have a system that doesn't come close to feeling crippled and lazy. We don't have much reason to envy the people who bought the latest generation of i5/i7 systems. Five years used to mean an order of magnitude improvement in performance. Now it's not even a doubling. I've sometimes wondered when I will finally start feeling the urge to upgrade my system.
These SSD latency numbers are the first thing I've seen that gave me the feeling that there is some truly worthwhile trick that my present computer can't come close to matching. I'm not saying that I now want to upgrade, but on reading this, I have become upgrade-curious for the first time in many years.
These results seem very questionable. Their graphs claim that in some configurations almost all 4K read requests are handled within 100 ns. But getting even a single DRAM burst from a random DRAM location already takes almost 100 ns, even though the memory controller is connected through a much tighter interface optimized for low latency, and PCIe is much slower than the DRAM interface. Even without overhead, 4 PCIe 3.0 lanes (8 GB/s) can only transfer 8 KB per µs. Transferring a 4 KB block should thus take at least 0.5 µs, or 500 ns, and that does not include any overhead or the time needed to actually send the request to the SSD, open the page from the NAND flash, and run ECC and decompression.
Jan
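For anyone who wants to redo the PCIe arithmetic from the comment above, here is a minimal sketch. It only computes the raw wire-time lower bound for one block and ignores all protocol, controller, and NAND overhead. The 8 GB/s figure is the commenter's; the second figure assumes the usual PCIe 3.0 numbers (8 GT/s per lane, 128b/130b encoding), which work out to roughly 3.94 GB/s per direction for an x4 link. The function names are made up for illustration.

```python
def pcie3_bandwidth_gb_per_s(lanes: int) -> float:
    """Approximate one-direction PCIe 3.0 payload bandwidth in decimal GB/s.

    Assumes 8 GT/s per lane with 128b/130b line encoding and no
    packet/protocol overhead (so this is still optimistic).
    """
    gbit_per_lane = 8.0 * 128.0 / 130.0   # usable Gbit/s per lane
    return lanes * gbit_per_lane / 8.0    # bits -> bytes


def min_transfer_time_ns(payload_bytes: int, bandwidth_gb_per_s: float) -> float:
    """Lower bound on the time to move payload_bytes across the link, in ns.

    1 decimal GB/s is exactly 1 byte per nanosecond, so the division
    below already yields nanoseconds.
    """
    return payload_bytes / bandwidth_gb_per_s


if __name__ == "__main__":
    block = 4096  # one 4 KiB read
    # Commenter's assumed link speed of 8 GB/s:
    print(f"{min_transfer_time_ns(block, 8.0):.0f} ns at 8 GB/s")
    # One-direction estimate for a PCIe 3.0 x4 link:
    x4 = pcie3_bandwidth_gb_per_s(4)
    print(f"{min_transfer_time_ns(block, x4):.0f} ns at {x4:.2f} GB/s")
```

Either way the bound lands at several hundred nanoseconds or more for a single 4 KB block, which is the point of the objection: a 100 ns completion time cannot be a real trip to the drive.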
PC Perspective's new testing demonstrates the triple RAID-0 array having just 1/6th of the latency of a single drive.
That was with a queue depth of 16. Not exactly representative of a normal desktop user.
The SSD controller already does a form of this, as it is talking to multiple flash memory dies over multiple channels. RAID is just another layer to get even more performance out of more parallelism (and as we figured out in testing, to considerably drop the latency under load).
Allyn Malventano
Storage Editor, PC Perspective
this sig was brought to you by the letter
I have found these calculators work well for projecting the performance and capacity of various RAID levels:
http://wintelguy.com/raidperf.pl/?formid=1&raidtype=R0&ndg=2&ng=1
http://wintelguy.com/raidperf.pl/?formid=2&raidtype=R0&ndg=2&ng=1
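If you just want the capacity side without a web calculator, the textbook formulas are short enough to sketch. This is not how the linked calculators are implemented (that is not known here); it is only the standard usable-capacity and guaranteed-fault-tolerance rules for equal-size drives, with RAID 10 assumed to be built from two-way mirrors. The function names and the 0.5 TB drive size are placeholders.

```python
# level -> (minimum drives, usable data drives for n drives,
#           drive failures the array is guaranteed to survive)
_RAID_RULES = {
    "RAID0":  (2, lambda n: n,      0),
    "RAID1":  (2, lambda n: 1,      None),  # n-way mirror, filled in below
    "RAID5":  (3, lambda n: n - 1,  1),
    "RAID6":  (4, lambda n: n - 2,  2),
    "RAID10": (4, lambda n: n // 2, 1),     # assumes two-way mirrors
}


def _check(level: str, n_drives: int) -> None:
    if level not in _RAID_RULES:
        raise ValueError(f"unknown RAID level: {level}")
    if n_drives < _RAID_RULES[level][0]:
        raise ValueError(f"{level} needs at least {_RAID_RULES[level][0]} drives")
    if level == "RAID10" and n_drives % 2:
        raise ValueError("RAID10 needs an even number of drives")


def usable_capacity_tb(level: str, n_drives: int, drive_tb: float) -> float:
    """Usable capacity for n equal-size drives at the given RAID level."""
    _check(level, n_drives)
    return _RAID_RULES[level][1](n_drives) * drive_tb


def guaranteed_failures(level: str, n_drives: int) -> int:
    """Number of drive failures the array survives in the worst case."""
    _check(level, n_drives)
    if level == "RAID1":
        return n_drives - 1
    return _RAID_RULES[level][2]


if __name__ == "__main__":
    for level, n in [("RAID0", 3), ("RAID5", 3), ("RAID10", 4)]:
        print(level, n, "drives:",
              usable_capacity_tb(level, n, 0.5), "TB usable,",
              guaranteed_failures(level, n), "failure(s) tolerated")
```

The zero in the RAID 0 row is the whole argument in this thread: all of the capacity and none of the tolerance.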
Some other guy mentioned RAID 10 isn't a backup strategy; he's correct (no RAID level is), however one thing to keep in mind is that when his RAID 0 array dies, he'd better hope his back-ups are all up-to-date and restorable. With just about any other RAID level, you get an opportunity to replace the dead / dying drive first, start rebuilding, and KEEP ROLLING, with no need to screw around with backups at all and no human interaction even required if your array has a hot spare configured. Yes, technically that is availability, but I'd sure as hell take it over "fuck, there went one drive of my RAID 0 stripe, better hope I can tolerate this downtime and that my last backup set had everything I needed on it." RAID may not be a backup strategy but it's absolutely another layer in place before you need to restore from backups (as long as it's not RAID 0).
The gang over at PC Perspective...
gangs, presumably roving, are taking over websites now! YOUR SITE COULD BE NEXT!
Anons need not reply. Questions end with a question mark.
Not completely true. With 6 or 8 TB drives, you are looking at a few days to a week or so of the RAID rebuilding. During this time, you have the protection of RAID 0 without the speed.
In Soviet Russia the insensitive clod is YOU!
Ummm...ok. So when SMART detects a failing drive in your RAID0 array and you decide you want to replace it, how do you do that exactly? Oh, that's right, wipe the entire array and restore from backup, which, depending on the size of your array, can take anywhere from several hours to days, more if you decided to use your array to run the OS as well. RAID0 is just a plain terrible idea, period. It doesn't matter if you don't think you need uptime: an N-disk RAID0 is N times as likely to fail catastrophically as a standalone hard disk (assuming the failure rates on all of the hard disks are equal), and without redundancy, getting back up and running is a long process.
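The "N times" figure above can be checked with a few lines. This is a sketch under two assumptions that go beyond the comment: drive failures are independent, and every drive has the same failure probability p over the period of interest. The exact chance that at least one of N drives fails is then 1 - (1 - p)^N, which is very close to N * p when p is small. The 2% figure is a placeholder, not a measured failure rate for any real drive.

```python
def raid0_failure_probability(p_drive: float, n_drives: int) -> float:
    """Probability that an N-drive RAID 0 array is lost.

    The array dies if ANY member fails, so this is one minus the chance
    that every drive survives. Assumes independent, identical drives.
    """
    if not 0.0 <= p_drive <= 1.0:
        raise ValueError("p_drive must be a probability")
    if n_drives < 1:
        raise ValueError("need at least one drive")
    return 1.0 - (1.0 - p_drive) ** n_drives


if __name__ == "__main__":
    p = 0.02  # hypothetical per-drive failure probability
    for n in (1, 2, 3, 4):
        exact = raid0_failure_probability(p, n)
        print(f"{n} drive(s): exact {exact:.4f}, N*p approximation {n * p:.4f}")
```

For small p the exact value sits just under N * p, so "N times as likely" is a good rule of thumb rather than an exact multiple.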