Triple M.2 NVMe RAID-0 Testing Proves Latency Reductions
Vigile writes: The gang over at PC Perspective just posted a story that looks at a set of three M.2 form factor Samsung 950 Pro NVMe PCIe SSDs in a RAID-0 array, courtesy of a new motherboard from Gigabyte that included three M.2 slots. The pure bandwidth available in this configuration is amazing, breaching 3.3 GB/s on reads and 3.0 GB/s on writes. But what is more interesting is a new testing methodology that allows for individual storage IO latency capturing, giving us a look at performance of SSDs in all configurations. What PC Perspective showed here is that users who often claim that RAIDs "feel faster," despite lacking bandwidth results to prove it, are likely correct. Measurements now show that the latency of IO operations improves dramatically as you add drives to an array, giving a feeling of "snappiness" to a system beyond even what a single SSD can offer. PC Perspective's new testing demonstrates the triple RAID-0 array having just 1/6th of the latency of a single drive.
On VMs in a home lab I see no difference between RAID 0 and a single drive with SATA Samsung Pros, outside of benchmarks.
However, my VMs do boot quicker with the Samsung than the SanDisk that replaced them. The IOPS were better, and that made them boot and shut down quicker. I imagine at work it is the same with database or production VMs too.
http://saveie6.com/
Why would you do this the hard way? Certainly a similar configuration could be done internal to the drive and deliver equivalent performance without the need for messy RAID controllers and configurations. With RAID 0 as soon as one drive goes tits up your whole array is toast anyway.
Gaming.
"If you have nothing to hide, you have nothing to fear." - Every fascist, ever
Remember kids, losing just one drive dumps the entire array. It's really not appropriate for anything besides completely transient data (scratch disks, stuff like this benchmark, etc.). Not smart to run your OS on RAID 0. RAID 10, OTOH . . . now we're talking.
But does raid 0 support sharing? It is the secret ingredient in the async sauce?
http://saveie6.com/
They did post a quick speed graph regarding raid-5 on the drives; obviously writes took an impact but reads were almost exactly the same as raid-0 x3 drives.
Karnal
That would be YUGE!!!!
So, basically it partly takes seek time out of the equation, or something similar?
Because then in theory I guess you can be serving multiple requests instead of just one at a time.
Doesn't seem entirely unreasonable. If the latency is spread out a little, it may not seem as big for any individual transaction.
Lost at C:>. Found at C.
I've been building PCs long enough to remember a time when things were improving so quickly that it made no sense to keep a computer for more than 4 years. But since then, the progress in CPU performance has reached a plateau. People like me, who bought a good Sandy Bridge system in 2011, still have a system that doesn't come close to feeling crippled and lazy. We don't have much reason to envy the people who bought the latest generation of i5/i7 systems. Five years used to mean an order of magnitude improvement in performance. Now it's not even a doubling. I've sometimes wondered when I will finally start feeling the urge to upgrade my system.
These SSD latency numbers are the first thing I've seen that gave me the feeling that there is some truly worthwhile trick that my present computer can't come close to matching. I'm not saying that I now want to upgrade, but on reading this, I have become upgrade-curious for the first time in many years.
These results seem to be very questionable. Their graphs claim that in some configurations almost all 4k read requests are handled within 100 ns. But getting even a single DRAM burst from a random DRAM location already takes almost 100ns, even though the memory controller is connected with a much tighter interface, optimized for low latency, and PCIe is much slower than the DRAM interface. Even without overhead 4 PCIe 3.0 lanes (8 GB/s) can only transfer 8 KB per s. Transferring a 4 KB block should thus take at least 0.5 s or 500ns, and that does not include any overhead nor the time needed to actually send the request to the SSD, open the page from the NAND flash, run ECC and decompression.
Jan
That Gigabyte board is looking more and more attractive.
And I'm actually due for a system rebuild this year...
Chas - The one, the only.
THANK GOD!!!
PC Perspective's new testing demonstrates the triple RAID-0 array having just 1/6th of the latency of a single drive.
That was with a queue depth of 16. Not exactly representative of a normal desktop user.
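To see why queue depth matters for the latency figure, here is a rough sketch using Little's law (average latency = outstanding requests / throughput). The per-drive IOPS number is made up for illustration and is not taken from the PC Perspective article. The model also assumes the stripe spreads requests evenly and that each drive's IOPS does not change with depth.

```python
def mean_latency_us(queue_depth, drives, iops_per_drive):
    """Little's law: average latency = outstanding requests / throughput."""
    total_iops = drives * iops_per_drive
    return queue_depth / total_iops * 1_000_000  # seconds -> microseconds

# Hypothetical drive: 200,000 random-read IOPS, host keeps 16 requests in flight.
single = mean_latency_us(16, 1, 200_000)
triple = mean_latency_us(16, 3, 200_000)
print(f"1 drive: {single:.1f} us, 3 drives: {triple:.1f} us")
```

Under these idealized assumptions three drives only cut the average latency to a third. The reported 1/6th would need effects this sketch does not model. At a queue depth of 1 the same model predicts no benefit from striping at all.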
You're right. I/O is where the best improvements have been coming lately.
I'd still be using a Core2Quad if it wasn't for the platform's outdated I/O features.
PCIe 3.0, SATA 3, USB3, DDR4, and surrounding technologies like M.2 and NVMe are the real reason to upgrade. The newer platforms are faster in most practical applications simply because they can feed the data to the CPU faster.
Intel's also been focusing on power saving. The new parts sip power in comparison to the decade old core2.
While you're right that it's for transient stuff, these days the OS might be transient. e.g. If you're insta-create-launching a container to run an application that you don't want to persistently remember anything (e.g. porn browser), or to protect against a possibly-hostile app (though I don't know if Linux containers are quite suitable for that, yet).
It's the cache on the RAID controller that makes the array more responsive.
1 second = 1000 milliseconds (ms) = 1000000 microseconds (us or \mu s) = 1000000000 nanoseconds (ns)
8GB/s = 8MB/ms = 8kB/us = 8 bytes/ns
GP post makes sense if the mu didn't show up correctly.
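The conversion above can be checked with a few lines of arithmetic. This sketch takes the 8 GB/s figure from the earlier post as given (it is that poster's number, not a measured one) and assumes a 4 KiB block with no protocol overhead.

```python
BYTES_PER_NS = 8          # 8 GB/s is the same as 8 bytes per nanosecond
BLOCK_BYTES = 4 * 1024    # one 4 KiB read (assumed block size)

# Time on the wire for one block, ignoring all protocol overhead.
transfer_ns = BLOCK_BYTES / BYTES_PER_NS
print(f"{transfer_ns:.0f} ns = {transfer_ns / 1000:.3f} us")
```

So the transfer alone takes roughly half a microsecond, which is already longer than the 100 ns the graphs appear to claim.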
bah - I run a RAID 0 system disk (ie, OS). I also have multiple offline backups, so if it fails, I'm back up within 10-15 minutes and may at most have lost 60 minutes of work, since I have a workspace on separate drives plus regular snapshots and an off-system backup.
Just remember, RAID 10 is not a backup, but a HA setup. Whether you "sudo rm -rf /" on a single disk, RAID-0 or RAID-10, you've likely hosed your system. So go ahead and run in RAID-0, it's no more dangerous than running any other configuration as long as you have real backups.
The cesspool just got a check and balance.
Hopefully we start seeing boards transitioning over to offering connectivity for ever-greater numbers of M.2 drives.
For RAID-0, the big issue is "lose a drive and you're fucked".
For RAID-5, the big issue is "lose a drive on a large-enough array and you could be looking at an unrecoverable read error during the array recovery".
Granted, most of the people who are using these setups are frothing gamers and hardware junkies who aren't keeping anything truly valuable within those filesystems.
But for those who are looking for truly dependable storage solutions, they should be looking at RAID-10 or better, or looking to offload their storage needs to a device that can handle something like RAID-6 or high-level ZFS.
Chas - The one, the only.
THANK GOD!!!
Too bad Skylake only has 16+4 PCI-E 3.0 lanes from the CPU. That's why Intel needs to put QPI in all CPUs and drive the chipset off of QPI and not DMI.
I have found these calculators work well for projecting the performance and capacity of various RAID levels:
http://wintelguy.com/raidperf.pl/?formid=1&raidtype=R0&ndg=2&ng=1
http://wintelguy.com/raidperf.pl/?formid=2&raidtype=R0&ndg=2&ng=1
Some other guy mentioned RAID 10 isn't a backup strategy; he's correct (no RAID level is), however one thing to keep in mind is that when his RAID 0 array dies, he'd better hope his back-ups are all up-to-date and restorable. With just about any other RAID level, you get an opportunity to replace the dead / dying drive first, start rebuilding, and KEEP ROLLING, with no need to screw around with backups at all and no human interaction even required if your array has a hot spare configured. Yes, technically that is availability, but I'd sure as hell take it over "fuck, there went one drive of my RAID 0 stripe, better hope I can tolerate this downtime and that my last backup set had everything I needed on it." RAID may not be a backup strategy but it's absolutely another layer in place before you need to restore from backups (as long as it's not RAID 0).
Sip power compared to Core2, but even in performance per watt, Sandy Bridge is not really that far behind. My 3.5GHz Sandy Bridge Xeon has a TDP of 80 Watts. Look at these numbers and you won't get the impression that a half a decade has passed since it came out.
A RAID0 of 3 SSDs will be faster than a single SSD drive for multiple reasons, the primary one being that the kernel reads from all three devices at the same time, and (secondarily) both the SSDs and the kernel are doing read-ahead, so that once hundreds (if not thousands) of sectors are in memory, you're only looking at the time to copy them into the destination buffer. For more speed, set your read-ahead buffers up - /sys/devices/(device)/hostX/targetX:0:0/X:0:0:0/block/sde/queue/read_ahead_kb, in Linux...
Usually defaults to 128KB....
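A minimal sketch of reading and changing that setting from Python. It assumes the shorter `/sys/block/<device>/queue/read_ahead_kb` route to the same attribute. The helper names and the `sysfs_root` parameter are made up for illustration, and writing the value on a real system needs root.

```python
from pathlib import Path

def read_ahead_kb(device, sysfs_root="/sys/block"):
    """Return the current read-ahead size in KB for a block device."""
    path = Path(sysfs_root) / device / "queue" / "read_ahead_kb"
    return int(path.read_text().strip())

def set_read_ahead_kb(device, kb, sysfs_root="/sys/block"):
    """Write a new read-ahead size in KB (requires root on a real system)."""
    path = Path(sysfs_root) / device / "queue" / "read_ahead_kb"
    path.write_text(f"{kb}\n")

# Example (device name is hypothetical; substitute your own):
#   print(read_ahead_kb("sde"))
#   set_read_ahead_kb("sde", 1024)
```

The change does not survive a reboot, so it would have to be reapplied at startup. Whether a larger value helps depends on the workload, since read-ahead mainly benefits sequential reads.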
Not smart to run your OS on RAID 0.
Why? You're assuming the OS is something that I can't just re-install? Remember that you're only slightly more than doubling the failure rate. Given the incredibly low failure rates it's not like you're guaranteed to lose things constantly.
See subject: Performance IS "where it's at" & especially for the SLOWEST parts of a PC - disks (which aren't so slow anymore via SSD)... I used LITERAL RamDisks from CENATEK & Gigabyte based on PC-133 SDRAM in the former + DDR-2 in the latter - they FLEW (when SSD still had problems earlier on - I finally "bought-in" via Intel's offerings this year though, after all the 'kinks' in controllers + longevity were worked out though) but were PUNY by comparison for storage space (4gb).
I used those in the following ways though (utilizing a Promise PCIx4 128mb ECC Ram caching controller on WD Velociraptor 10k rpm HDD's to speed them up a bit more too):
1.) Pagefile placement (offloading HDD's where my programs resided on #1, & data for processing that was LARGE on HDD #2)
2.) WebPage caching for browsers
3.) Print Spooler location
4.) %Comspec% location
5.) %tmp% & %temp% BOTH SYSTEM & USER location
6.) Application TEMP folders (when data wasn't too large in them)
7.) Hosts file location (non-std. via registry hack, for faster loads)
* It all worked for MASSIVELY faster system, & NOT impeding slower HDD's by offloading them...
( WHAT I REALLY WANT IS A PAIR OF THESE IN RAID 0 (which are basically 2 Intel 750's in RAID 0 on a SINGLE card of PCI Ex4 nature):
http://hothardware.com/reviews...
NOW THAT WOULD FLY!!!
APK
P.S.=> I still use the Gigabyte IRAM that way (don't have a 64-bit driver for the CENATEK though, too bad) but combined with modern SSD by Intel... apk
Let me get this straight. When you have more devices available to service read or write requests, the time that it takes to service the request goes down.
What next? Are we going to be told that RAID5 gives better read performance than a single drive too?
The gang over at PC Perspective...
gangs, presumably roving, are taking over websites now! YOUR SITE COULD BE NEXT!
Anons need not reply. Questions end with a question mark.
Most SSD do RAID internally across several dies already.
If programs would be read like poetry, most programmers would be Vogons.
There are several things that affect the latency. You will get about 200us latency on a program operation on a die, but that can be reduced by using some caching and acknowledging the writes before the program finishes. That cache can be saturated by sustained writes, especially random writes, which cause a high level of map updates. As the cache fills, the write latency on sustained random writes will eventually climb to the latency of 2 program operations, which is 400us. With multiple controllers, only every third write goes to a specific controller, so the latency goes down because you don't sustain writes to each controller at the level needed to flood the cache.
If programs would be read like poetry, most programmers would be Vogons.
just raid1 it up and call it a day.
write speed doesn't increase, but big whoop. read speed increases and all your data is backed up.
so basically raid0 is for people who need that write boost and want to live life dangerously.
For RAID-5, the big issue is "lose a drive on a large-enough array and you could be looking at an unrecoverable read error during the array recovery".
This gets repeated a lot, but isn't a problem for any halfway decent RAID setup because they slowly read data from the drives in the background (called patrol read on LSI/Dell controllers). The chances of a problem with a drive not turning up in one of the numerous patrol reads yet happening during a recovery are astronomically small.
Not completely true. With 6 or 8 TB drives, you are looking at a few days to a week or so of the RAID rebuilding. During this time, you have the protection of RAID 0 without the speed.
In Soviet Russia the insensitive clod is YOU!
Any classic RAID level is useless if you want data safety. So one of your drives in RAID10/5/6 returns garbled data (without an error), which copy/parity do you trust?
Also many 5/6 implementations won't actually calculate the parity chunk on reads, only for rebuilds. There are some pricey controllers that do full checksumming ala ZFS on chip but as with most hardware systems the SPOF becomes your controller.
With the drives becoming ever larger and faster, more data is being read but the errors per terabyte read are not really decreasing so the probability of you reading an error is nearing 1 faster than ever.
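A back-of-the-envelope sketch of that probability. It assumes an unrecoverable read error rate of 1 per 10^14 bits, a figure often quoted on consumer drive spec sheets, and assumes errors are independent. Both are assumptions, and real drives may behave quite differently.

```python
import math

def p_ure_during_read(bytes_read, ure_per_bit=1e-14):
    """Probability of at least one unrecoverable read error while reading."""
    bits = bytes_read * 8
    # 1 - (1 - p)^bits, written with log1p/expm1 to stay accurate for tiny p.
    return -math.expm1(bits * math.log1p(-ure_per_bit))

# Decimal terabytes read in total, e.g. the surviving disks during a rebuild.
for tb in (1, 4, 12, 24):
    print(f"{tb:>2} TB read: {p_ure_during_read(tb * 1e12):.1%}")
```

Under this simple model the chance of hitting at least one error grows quickly with the amount of data a rebuild has to read. Many people argue that spec-sheet rates are pessimistic in practice.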
Custom electronics and digital signage for your business: www.evcircuits.com
RAID0 is unsuitable for situations that require very high uptime but there is nothing inherently dangerous about storing real data on a RAID0 array. I know this gets said frequently but, RAID, at any level, isn't a backup. It's a reliability/performance feature. Even if you had configured these three disks as a triple mirrored RAID1, you would be insane to not run SMART monitoring tools on the disks and even more insane to not have good backups. I don't know if I've ever had a disk fail without plenty of warning from SMART monitoring so, for RAID0, you are mostly gaining some performance at the expense of more difficult disk replacements. That seems like a very acceptable tradeoff for something like a gaming machine.
When was the last time you actually had to rebuild an array with large (4TB+) constituent disks?
And they don't read THAT slowly. Indeed, the increased (and sustained) load during the rebuild can cause additional drives in the array to fail.
Chas - The one, the only.
THANK GOD!!!
Is the rebuild issue for SSD RAID-5 arrays the same stratum of risk it is for spinning rust?
I would presume not, both because of speed and because there's not nearly as much added stress from the intensive reads necessary to rebuild the array.
Double parity and/or hot spare is better, but I kind of wonder as SSDs gain write durability (or it becomes more accepted they just have it, as some endurance tests have noted) and they start popping up in more budget minded arrays if maybe RAID-5 might make a comeback due to its lower overhead and arguably less risk due to faster and less mechanically strained rebuilds.
Ummm...ok. So when your SMART detects a failing drive in your RAID0 array and you decide you want to replace it, how do you do that exactly? Oh, that's right, wipe the entire array and restore from backup, which, depending on the size of your array can take anywhere from several hours to days, more if you decided to use your array to run the OS as well. RAID0 is just a plain terrible idea, period. It doesn't matter if you don't think you need uptime, an N disk RAID0 is N times more likely to fail catastrophically than a standalone hard disk (assuming the failure rates on all of the hard disks are equal), and without redundancy getting back up and running is a long process.
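The "N times more likely" figure can be checked with a short sketch. It assumes disks fail independently with the same probability. The 2% annual failure probability is a made-up number for illustration only.

```python
def raid0_failure_probability(n_disks, p_disk):
    """Chance that at least one of n independent disks fails (array is lost)."""
    return 1 - (1 - p_disk) ** n_disks

p = 0.02  # hypothetical annual failure probability of a single disk
for n in (1, 2, 3, 4):
    print(n, round(raid0_failure_probability(n, p), 4))
```

For small per-disk failure rates the result is very close to N times the single-disk rate (slightly under it), so the approximation holds well for a handful of drives.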
It depends on how you've set up your RAID0 array. If you are using mdraid, you can simply take the array offline, dd the contents of a failing disk onto a new disk, remove the old disk and bring the array back online (you can do this with a USB boot stick if you need to take the root filesystem offline). That's certainly more work than popping a failing disk out of a hotswap bay, screwing a new disk into the drive tray and pushing it back in. But, it's not that prohibitive.
Now, having said that, I certainly wouldn't build a RAID0 array out of a bunch of "green" desktop disks or out of bulk storage disks. I have a RAID0 array with 4 Intel 80GB SSDs and SMART says they have an online time of 4.3 years. They have incredible performance and never a hiccup. If I lost one of the disk controllers, the data could be restored in a few hours with a single rsync.
My point is that RAID0 has a place. It may not have a place in your setup but, it's not inherently flawed technology. It's a technology that is aimed at maximum performance with a bit more risk in downtime when compared to a single disk.
Correct. SSDs fail for different reasons than spinning rust. Most mechanical HDs fail for physical reasons and physical reasons tend to be highly correlated for all drives in an array, even if they're different models or even brands. There is a very high risk that if one drive fails, another is right behind it. RAID5, I'm looking at you.
With RAID0, any drive failing is a loss, so multi-drive failures don't matter so much, but they're also much less likely unless it's a firmware bug or other pathological issue. But SSDs are pretty much RAID0 already and have a fraction of the failure rate of mechanical drives.
Astronomically small? It's happened to me TWICE in a couple years and I only have a single large raid array. It happens quite often -- and I'm using one of your LSI controllers (9280-16i4e).
Rebuild just finished... Yesterday. Took 5 days, and that is with 3TB disks.
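For a sense of scale, the effective rebuild rate implied by those numbers can be worked out as below. The sketch assumes decimal terabytes and that one full 3TB disk was rewritten over the 5 days. The function name is made up for illustration.

```python
def rebuild_rate_mb_s(disk_tb, days):
    """Average write rate needed to fill one disk in the given time."""
    disk_bytes = disk_tb * 1e12        # decimal terabytes
    seconds = days * 24 * 3600
    return disk_bytes / seconds / 1e6  # bytes/s -> MB/s

print(f"{rebuild_rate_mb_s(3, 5):.1f} MB/s")
```

That comes out to only a few MB/s, far below what such a disk can sustain sequentially. The likely reason is that the array kept serving normal load during the rebuild, though the post does not say.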
What do I do? Well, I run a RAID-0 of SSD for my OS. I drop both the failing drive and a new drive into my drive duplicator, hit a button, and approximately 5 minutes later I put the new drive into the box and it's running again. That is of course if SMART detects it before failure which is ~50% of the time. Otherwise I wipe and restore from backups.... I'm guessing about 2-3 hours as I haven't had to do it yet.
The only reason people upgrade these days is because you can't upgrade Windows anymore.
For RAID-5, the big issue is "lose a drive on a large-enough array and you could be looking at an unrecoverable read error during the array recovery".
This gets repeated a lot, but isn't a problem for any halfway decent RAID setup because they slowly read data from the drives in the background (called patrol read on LSI/Dell controllers). The chances of a problem with a drive not turning up in one of the numerous patrol reads yet happening during a recovery are astronomically small.
I'm not sure how you define "astronomically", but I've seen this more than a few times in my career. And it has become increasingly common with larger disks and larger arrays.
RAID 5 is decent for availability... but you'd better be able to restore from your backups. RAID 6 should be the default these days (though I prefer ZFS RAIDZ2 or RAIDZ3). And don't be one of those idiots who makes a 32-disk, 192 TB RAID5 (or 6 for that matter).
SWM seeks new sig for a brief fling
5 days is a long time to hold one's breath...
I can totally agree with these sentiments. My first computer was a Commodore 64. Since then, I've been chasing the performance dragon, upgrading to a new computer every couple years. C128, Amiga, XT, 286, 386, 486, Athlon, P60, P2, P3, P4, P5... up until my last two computers.
My previous PC was a Core2Duo e6600 (circa 2006) and its performance was great and then still good. Had it for 5 years. I assembled my current desktop, a Core i7 2600k, in mid 2011. I've upgraded the video card (I'm a gamer) a couple times, and I was starting to feel a little performance pinch about 12-18 months ago, so I upgraded my OS drive to a 512GB SSD. I'm back to having zero performance problems except when fast traveling in Fallout 4 (my games drive is a mechanical hard drive.)
I've been considering another upgrade... but only if I can get M.2/NVMe and DDR4. I'm hoping that all of the announcements we heard months ago about new faster, affordable storage will pan out. Based on the hype, I'm hoping to get a few blazing TB of disk space for what mechanicals cost right now... but I won't hold my breath.
Buy a Mac and be stuck with an overpriced system that can't be upgraded and has only the GPU that Apple says you can have in it? No thanks.
And that is why it's a RAID-6 and not a RAID-5. I've had a second disk fail during rebuilds twice so far.