New PCIe SSDs Load Games, Apps As Fast As Old SATA Drives
crookedvulture writes: Slashdot has covered a bunch of new PCI Express SSDs over the past month, and for good reason. The latest crop offers much higher sequential and random I/O rates than predecessors based on old-school Serial ATA interfaces. They're also compatible with new protocols, like NVM Express, which reduce overhead and improve scaling under demanding loads. As one might expect, these new PCIe drives destroy the competition in targeted benchmarks, hitting top speeds several times faster than even the best SATA SSDs can muster. The thing is, PCIe SSDs don't load games or common application data any faster than current incumbents, or even consumer-grade SSDs from five years ago. That's very different from the initial transition from mechanical to solid-state storage, where load times improved noticeably for just about everything. Servers and workstations can no doubt take advantage of the extra oomph that PCIe SSDs provide, but desktop users may struggle to find scenarios where PCIe SSDs offer palpable performance improvements over even budget-oriented SATA drives.
A PCIe SSD opens up the sole SATA slot for the backup disk in the small form factor PCs that are currently in vogue.
I should use this sig to advertise my book ISBN-13 : 978-1501515132.
Most folks who need the throughput of a PCI-E SSD won't use it for just gaming. These same users are likely power users. Everything from running test VMs locally to Video / Audio editing would see a huge improvement from this tech.
Loading apps? games? That's nice and all, but those are far from the only use cases of fast storage media.
Personally, the new PCI-E SSDs have gotten a good amount of use from me as ZFS cache drives, where they've been wonderful for saturating 10gbps Ethernet.
Particularly, an X-25M 3gbps SATA drive. Which was pretty fast a few years ago. These are, in practical terms, no faster. I doubt enough data is being moved to see the difference.
HBI's Law: Frequency of calling others Nazis is directly correlated with the likelihood of the accuser being Communist.
Since so many games make you sit through crappy videos, copyright screens and other garbage for thirty seconds while they start up, or at least make you hit a key or press a mouse button to skip them and that damn 'Press Any Key To Start' screen that they couldn't even take five minutes to remove when porting from a console, faster load time is pointless once you've eliminated the worst HDD delays.
A guy named Amdahl had something to say on the subject. SSDs excel at IOPS, but that buys you little if you're not IOPS-constrained.
Examples of things that eat operations as fast as you can throw them at 'em: databases, compilation, most server daemons.
Examples of things that couldn't care less: streaming large assets that are decompressed in realtime, like audio or video files. Loading a word processing document. Downloading a game patch. Encoding a DVD. Playing RAM-resident video games.
It should be a shock to roughly no one that buffing an underused part won't make the whole system faster. I couldn't mow my lawn any faster if the push mower had a big block V8, nor would overclocking my laptop make it show movies any faster.
TL;DR non-IO-bound things don't benefit from more IO.
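A minimal sketch of the Amdahl's law point above, in Python. The I/O fractions and the 4x storage speedup are made-up illustrative numbers, not measurements from any benchmark.

```python
def overall_speedup(io_fraction: float, io_speedup: float) -> float:
    """Amdahl's law: speedup of the whole task when only the I/O part gets faster."""
    if not 0.0 <= io_fraction <= 1.0:
        raise ValueError("io_fraction must be between 0 and 1")
    if io_speedup <= 0:
        raise ValueError("io_speedup must be positive")
    # The non-I/O part takes the same time; only the I/O part shrinks.
    return 1.0 / ((1.0 - io_fraction) + io_fraction / io_speedup)


# Hypothetical workloads: a game load that spends 10% of its time waiting
# on storage, and a database job that spends 90% of its time waiting on it.
game_load = overall_speedup(io_fraction=0.10, io_speedup=4.0)
database = overall_speedup(io_fraction=0.90, io_speedup=4.0)

print(f"game load: {game_load:.2f}x faster overall")
print(f"database:  {database:.2f}x faster overall")
```

With these assumed numbers, the same 4x faster drive barely moves the game load but roughly triples the database job.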
Dewey, what part of this looks like authorities should be involved?
I mean, why would anyone think images would load faster? The cpu is doing enough transformative work processing the image for display that the storage system only has to be able to keep ahead of it... which it can do trivially at 600 MBytes/sec if the data is not otherwise cached.
Did the author think that the OS wouldn't request the data from storage until the program actually asked for it? Of course the OS is doing read-ahead.
And programs aren't going to load much faster either, dynamic linking overhead puts a cap on it and the program is going to be cached in ram indefinitely after the first load anyway.
These PCIe SSDs are useful only in a few special mostly server-oriented cases. That said, it doesn't actually cost any more to have a direct PCIe interface versus a SATA interface, so I think these things are here to stay. Personally though I prefer the far more portable SATA SSDs.
-Matt
Gosh, stupid html tags ate most of my posting. Anyway here it is.
I don't understand why people still don't understand the difference between latency and bandwidth, and the fact that a huge amount of the desktop IO load is still less than 4k with a queue depth of basically 1.
If you look at many of the benchmarks you will notice that the .5-4k IO performance is pretty similar for all of these devices and that is with deep queues. Why is that? Because the queue depth and latency to complete a single command dictate the bandwidth. So you either need deeper queues or lower latency to go faster at those block sizes.
So the latency on PCIe is not that much better, but the queue depth can be much deeper than what is possible with a normal AHCI controller. This helps a lot with benchmarks, but not so much for a single user.
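The relationship between queue depth, latency and bandwidth described above is essentially Little's law. A rough sketch in Python, assuming a hypothetical 100 microsecond per-command latency (an illustrative figure, not a spec for any real drive) and ignoring controller and interface limits:

```python
def throughput_mb_per_s(queue_depth: int, block_bytes: int, latency_s: float) -> float:
    """Little's law: commands in flight * bytes per command / time per command."""
    return queue_depth * block_bytes / latency_s / 1e6


BLOCK = 4096        # 4 KiB reads, typical of small desktop IO
LATENCY = 100e-6    # assumed 100 microseconds per command (hypothetical)

qd1 = throughput_mb_per_s(1, BLOCK, LATENCY)              # single-user desktop case
qd32 = throughput_mb_per_s(32, BLOCK, LATENCY)            # deep queue, benchmark case
qd1_half_latency = throughput_mb_per_s(1, BLOCK, LATENCY / 2)

print(f"QD1:                {qd1:8.2f} MB/s")
print(f"QD32:               {qd32:8.2f} MB/s")
print(f"QD1, half latency:  {qd1_half_latency:8.2f} MB/s")
```

At queue depth 1, the only way to raise the number in this model is to lower the latency, which is the point being made about desktop workloads.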
Anyway, boot times and general single-user performance are bottlenecked mostly by latency, especially when the throughput of larger transfers is greater than a few hundred MB/sec. So the pieces large enough to take advantage of the higher bandwidth are a smaller (and shrinking) portion of the pie.
Next time you start your favorite game, look at the CPU/disk IO. It's likely the game never gets anywhere close to the max IO performance of your disk, and if it does, it's only for a short period.
Anyway, it's like multicore: beyond a fairly low core count, most desktop-type operations are better off with faster CPUs rather than more of them.
And just like desktop benchmarks, the guys running benchmarks seem loath to heavily weigh single-thread operations, or queue depth 1 1k IO loads, in the overall performance picture, even though it's a large portion of actual system performance running everyday tasks.
sata has overhead of the, well, sata and scsi layers.
the pci-e ssd stuff is going to be based on NVMe and that cuts thru all the old layers and goes more direct. its also network-able (some vendors).
this is really NOT for consumers, though. consumers are just fine with ssd on sata ports.
--
"It is now safe to switch off your computer."
The PCIe devices are faster; but (since they also tend to be either substantially similar to SATA devices; but packaged for the convenience of OEMs who want to go all M.2 on certain designs and clean up the mini-PCIe/SATA-using-mini-PCIe's-pinout-for-some-horrible-reason/mini-SATA/SATA mess that crops up in laptops and very small form factor systems; or tend to be markedly more expensive enterprise oriented devices that focus on IOPS) it isn't clear why you'd expect much improvement on application loading workloads.
SSDs are at their best, and the difference between good and merely adequate SSDs most noticeable, under brutal random I/O loads, the heavier the better. Those are what make mechanical disks entirely obsolete, cheap SSD controllers start to drop the ball, and more expensive ones really shine. Since application makers generally still have to assume that many of their customers are running HDDs (plus the console ports that may only be able to assume an optical disk and a tiny amount of RAM, and the mobile apps that need to work with cheap and mediocre eMMC flash), they would do well to avoid that sort of load.
HDD vs. SSD was a pretty dramatic jump because even the best HDDs absolutely crater if forced to seek (whether by fragmentation or by two or more programs both trying to access the same disk); but there aren't a whole lot of desktop workloads where 'excellent at obnoxiously seeky workloads' vs. 'damned heroic at obnoxiously seeky workloads' makes a terribly noticeable difference. Plus, a lot of desktop workloads still involve fairly small amounts of data, so a decent chunk of RAM is both helpful and economically viable. Part of the appeal of crazy-fast SSDs is that they cost rather less per GB than RAM does, while not being too much worse, which allows you to attack problems large enough that the RAM you really want is either heroically expensive or just not for sale. On the desktop, a fair few programs in common use are still 32 bit, and much less demanding.
That isn't correct. The queue depth for a normal AHCI controller is 31 (assuming 1 tag is reserved for error handling). It only takes a queue depth of 2 or 3 for maximum linear throughput.
Also, most operating systems are doing read-ahead for the program. Even if a program is requesting data from a file in small 4K read() chunks, the OS itself is doing read-ahead with multiple tags and likely much larger 16K-64K chunks. That's assuming the data hasn't been cached in ram yet.
For writing, the OS is buffering the data and issuing the writes asynchronously so writing is not usually a bottleneck unless a vast amount of data is being shoved out.
-Matt
Booting from PCIe is not well supported at this point and that may be interfering with the boot times. As for the game loading benchmark results, these drives are usually used for high speed working file space in servers/workstations (e.g. latency critical databases, video editing, scientific computing). If you aren't trying to solve an I/O bottleneck problem for a specific application, PCIe SSDs probably aren't what you're looking for. And even if you are, you have to know exactly what type of I/O is critical for your application because the different models target different needs (various combinations of IOPS, sequential speeds, read/write balance, write endurance).
Knowledge Brings Fear
That's just it. Their speeds are not "much higher." They're only slightly faster. The speed increase is mostly an illusion created by measuring these things in MB/s. Our perception of disk speed is not MB/s, which is what you'd want to use if you only had x seconds of computing time and wanted to know how many MB of data you could read.
Our perception of disk speed is wait time, or sec/MB. If I have y MB of data I need read, how many seconds will it take? This is the inverse of MB/s. Consequently, the bigger MB/s figures actually represent progressively smaller reductions in wait times. I posted the explanation a few months ago, the same one I post to multiple tech sites. And oddly enough Slashdot was the only site where it was ridiculed.
If you measure these disks in terms of wait time to read 1 GB, and define the change in wait time from a 100 MB/s HDD to a 2 GB/s NVMe SSD as 100%, then:
A 100 MB/s HDD has a 10 sec wait time.
A 250 MB/s SATA2 SSD gives you 63% of the reduction in wait time (6 sec).
A 500 MB/s SATA3 SSD gives you 84% of the reduction in wait time (8 sec).
A 1 GB/s PCIe SSD gives you 95% of the reduction in wait time (9 sec).
The 2 GB/s NVMe SSD gives you 100% of the reduction in wait time (9.5 sec).
Or put another way:
The first 150 MB/s speedup results in a 6 sec reduction in wait time.
The next 250 MB/s speedup results in an extra 2 sec reduction in wait time.
The next 500 MB/s speedup results in an extra 1 sec reduction in wait time.
The next 1000 MB/s speedup results in an extra 0.5 sec reduction in wait time.
From 250 MB/s onward, each doubling of MB/s results in half the reduction in wait time of the previous step. Manufacturers love waving around huge MB/s figures, but the bigger those numbers get the less difference it makes in terms of wait times.
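The wait-time figures above can be reproduced with a few lines of Python. This sketch uses the same round numbers as the post (1 GB taken as 1000 MB, purely sequential reads at the quoted rate) and is only arithmetic, not a benchmark.

```python
SIZE_MB = 1000  # 1 GB in decimal units, as in the figures above

# (label, sequential read rate in MB/s), taken from the post
drives = [
    ("100 MB/s HDD", 100),
    ("250 MB/s SATA2 SSD", 250),
    ("500 MB/s SATA3 SSD", 500),
    ("1 GB/s PCIe SSD", 1000),
    ("2 GB/s NVMe SSD", 2000),
]


def wait_seconds(size_mb: float, rate_mb_per_s: float) -> float:
    """Time to read size_mb at a constant rate."""
    return size_mb / rate_mb_per_s


baseline = wait_seconds(SIZE_MB, drives[0][1])        # HDD wait time
total_reduction = baseline - wait_seconds(SIZE_MB, drives[-1][1])

rows = []
for name, rate in drives:
    wait = wait_seconds(SIZE_MB, rate)
    saved = baseline - wait                           # seconds saved vs. the HDD
    share = 100.0 * saved / total_reduction           # % of the total reduction
    rows.append((name, wait, saved, share))
    print(f"{name:20s} wait {wait:5.2f} s  saved {saved:4.2f} s  ({share:5.1f}% of total)")
```

Most of the saving comes from the first step away from the HDD; the later, larger MB/s jumps add progressively less.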
(The same problem crops up with car gas mileage. MPG is the inverse of fuel consumption. So those high MPG vehicles like the Prius actually make very little difference despite the impressively large MPG figures. Most of the rest of the world measures fuel economy in liters/100 km for this reason. If we weren't so misguidedly obsessed with achieving high MPG, we'd be correctly attempting to reduce fuel consumption by making changes where it matters the most - by first improving the efficiency of low-MPG vehicles like trucks and SUVs even though this results in tiny improvements in MPG.)
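The fuel economy analogy works the same way. A small Python sketch, using hypothetical MPG figures chosen only for illustration:

```python
def gallons_per_100_miles(mpg: float) -> float:
    """Fuel consumption is the inverse of fuel economy."""
    return 100.0 / mpg


# Hypothetical vehicles: a truck improved from 15 to 20 MPG,
# and a hybrid improved from 35 to 50 MPG.
truck_saving = gallons_per_100_miles(15) - gallons_per_100_miles(20)
hybrid_saving = gallons_per_100_miles(35) - gallons_per_100_miles(50)

print(f"truck, +5 MPG:   saves {truck_saving:.2f} gal per 100 miles")
print(f"hybrid, +15 MPG: saves {hybrid_saving:.2f} gal per 100 miles")
```

Under these assumed numbers, the 5 MPG gain on the truck saves about twice as much fuel as the 15 MPG gain on the hybrid.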
Examples of things that couldn't care less: streaming large assets that are decompressed in realtime, like audio or video files. Loading a word processing document. Downloading a game patch. Encoding a DVD. Playing RAM-resident video games.
Yes, any one of those things. However, if you're downloading a game patch while playing a game and maybe playing some music in the background, at the same time as perhaps downloading a few torrents or copying files, whatever... SSDs kick ass.
Why? Because singularly those things aren't IO-bound, but once you start doing 2+ things that require semi-hefty disk access then on an HDD you're going to have a lot of thrashing and speed goes out the window.
Huh? This sounds like nonsense. Operating systems already cache frequently used data in ram.
-Matt
Individual chips have an upper cap on speed, but that's why every SSD on the market accesses numerous devices in parallel. All you need to do to make an SSD go faster is add more NAND devices in parallel and a slightly faster controller to support them.
Maybe if you have no idea what you're talking about.