Ask Slashdot: Is the Gap Between Data Access Speeds Widening Or Narrowing?
New submitter DidgetMaster writes: Everyone knows that CPU registers are much faster than level1, level2, and level3 caches. Likewise, those caches are much faster than RAM; and RAM in turn is much faster than disk (even SSD). But the past 30 years have seen tremendous improvements in data access speeds at all these levels. RAM today is much, much faster than RAM 10, 20, or 30 years ago. Disk accesses are also tremendously faster than previously as steady improvements in hard drive technology and the even more impressive gains in flash memory have occurred. Is the 'gap' between the fastest RAM and the fastest disks bigger or smaller now than the gap was 10 or 20 years ago? Are the gaps between all the various levels getting bigger or smaller? Anyone know of a definitive source that tracks these gaps over time?
The distance between the "fastest" and "slowest" gets larger and larger, but the gaps are getting smaller because things like SSDs fill them.
Does it matter? Fast CPU, fast RAM, fast disks is like having no speed limits on every race track in the world - but in order to get from track to track you have to go on the interstates or perhaps back country roads (PCI bus, etc). Sure, each component is fast and getting faster, but the way those components connect to each other hasn't changed all that much...
Don't blame me, I voted for Kodos
I'm not sure what a historic timeline of these ratios (not "differences", please) would gain you.
These ratios can have a big impact on what algorithms and implementations you choose to maximize performance. I suppose if, say, the ratio of RAM to disk speed increased by a factor of 10 over the decade before last, then decreased back to its original ratio in the last decade, it might be worth trawling through some old papers (or old source trees) to revisit lessons learned in the earlier period -- but that seems like a bit of a stretch.
If you're just curious, it shouldn't be too hard to generate timelines of CPU cycle speeds, cache and RAM latencies and bandwidths, disk performance, and so on. But really, each of those has enough factors that a simple "ratio" would probably conceal more than it illuminates.
for a CS or IT class?
Yes, yes it is.
PlanetVulkan.com
Ancient scrolls of dubious provenance hint darkly that DDR4 was not the first inhabitant of the RAM slots we consider so permanent. Debased cultists still sometimes mutter chants mentioning "PC100", or even uncouth syllables such as "korr"...
This /. article, plus one called "Casino lock flooring [...] play casino online" which 404s in Norwegian when you click on it.
As it turns out, a whole bunch of really technical reviews on DDR4 memory, plus 'scope/test gear for testing DDR4 bus access. At least partially, potentially, useful, if you're prepared to wade through a bunch of dense stuff.
A whole bunch of MacBook reviews/unpaid ads, followed at the end by a Toshiba and a Kingston paid ad (sucks to be them, should have paid more).
So, ummm, like most LMGTFY trolls who think they're way smarter than they are, the actual results are far less useful than just asking someone who knows what they're talking about for the answer. (Whether /. counts as that is an entirely different matter, though there's usually at least one person who appears to know what they're talking about and will answer the question usefully and honestly without being a smug stuck up prick.)
"Everyone knows that CPU registers are much faster than level1, level2, and level3 caches."
I'd argue that most people don't even know what a CPU register is, never mind what it's faster than.
BeauHD. Worst editor since kdawson.
Originally, there were CPU registers and memory. Then there were registers, memory, and disk. Then there were registers, SRAM cache, memory, and disk. Then there were registers, L1 cache (on CPU), L2 cache (on mobo), DRAM, and disk. Then the L2 moved onto the CPU. Then there was L3. Then SSDs were added between RAM and disk. Now some chips have an L4 cache on the CPU package (but not the CPU die).
Oh, and there's a difference between latency and bandwidth. DRAM latency has not significantly improved over time, particularly compared to DRAM bandwidth.
And with multiple cores, some levels are core-specific while others are not. You can even have a bizarre situation where L1 cache is per-core, L2 cache is shared between two cores, and L3 cache is per-CPU (in SMP setups, that means main RAM is the first level shared among all cores).
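To put rough numbers on the latency-versus-bandwidth point, here is a minimal sketch of how theoretical peak DRAM bandwidth scales. It assumes a single 64-bit channel and uses the nominal transfer rates of PC100 SDRAM (100 MT/s) and DDR4-3200 (3200 MT/s); the function name is made up for illustration, and real sustained bandwidth will be lower than these peaks.

```python
def peak_bandwidth_mb_s(mega_transfers_per_s: float, bus_width_bits: int = 64) -> float:
    """Theoretical peak bandwidth in MB/s for one memory channel.

    Assumption: every transfer moves a full bus width of data,
    which is an upper bound, not a measured figure.
    """
    bytes_per_transfer = bus_width_bits / 8
    return mega_transfers_per_s * bytes_per_transfer


# Nominal transfer rates; a single 64-bit channel is assumed for both.
pc100 = peak_bandwidth_mb_s(100)        # PC100 SDRAM
ddr4_3200 = peak_bandwidth_mb_s(3200)   # DDR4-3200

print(f"PC100:     {pc100:.0f} MB/s")
print(f"DDR4-3200: {ddr4_3200:.0f} MB/s")
print(f"Bandwidth ratio: {ddr4_3200 / pc100:.0f}x")
```

Peak bandwidth per channel grew by a factor of about 32 between those two parts, which is the kind of growth latency has not matched.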
The latency of RAM is improving very slowly, only something like a 2x-4x improvement in the last 20 years.
Only the bandwidth of the memory is growing faster, and that's just because they have been putting more DRAM cells in parallel, always doing bigger data transfers, and running a faster memory bus.
The same is true for hard disk drive speed: the rotation speed dictates the random access latency, and the rotation speed of the average hard disk has only gone up from 4200 or 5400 to 7200 rpm in the last 20 years, meaning only a 1.7x or 1.33x improvement in random access latency.
Though replacing hard disks with flash-based SSD storage has improved latency by a huge margin.
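The rotation-speed arithmetic above can be checked with a short sketch. It assumes average rotational latency is the time for half a revolution and ignores seek time, which also contributes to random access latency; the function name is only illustrative.

```python
def avg_rotational_latency_ms(rpm: float) -> float:
    """Average rotational latency in milliseconds.

    Assumption: on average the platter must turn half a revolution
    before the requested sector passes under the head. Seek time is ignored.
    """
    ms_per_revolution = 60_000.0 / rpm
    return ms_per_revolution / 2.0


for rpm in (4200, 5400, 7200):
    print(f"{rpm} rpm: {avg_rotational_latency_ms(rpm):.2f} ms")

# Improvement factors when moving to 7200 rpm
for old_rpm in (4200, 5400):
    factor = avg_rotational_latency_ms(old_rpm) / avg_rotational_latency_ms(7200)
    print(f"{old_rpm} -> 7200 rpm: {factor:.2f}x")
```

Under that assumption the improvement factor is simply the ratio of the two rotation speeds, which is where the 1.7x and 1.33x figures come from.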
20 years ago main memory was 10-14 ns, instruction cycle time was 2-4ns (Cray)
Guess what? It still is.
Memory has grown, it has gotten cheaper.
What HASN'T changed? Access to memory. That is how Cray got its speed - instead of a single port to memory, it used a crossbar switch - 4 ports for each processor. 1 instruction bus, 2 input data busses, and one output bus; even I/O got its own port to memory; all with overlapping address/data cycles.
The effect was that all of main memory worked at the speed of cache, thus the CPU had no need to waste silicon on cache memory - and the entire system ran full speed.
What slows down the current systems? Memory access. Most systems only have a single port to main memory. Some servers and "high performance" desktops have dual ported memory. Yet even dual ported memory access is slow when you have to share it among 4/8 cores... plus I/O (which isn't dual ported). Interrupt latency on PCs is really horrible. Still only 15 IRQs? and have to share them? No direct vectoring? Forced interrupt chain actions? Even the old PDP 11 with ONE interrupt request line allowed direct interrupt vectoring (64 basic vectors) to reduce overhead.
There hasn't been much innovation in architecture in over 20 years.
This article can shed some light on it: http://www.dba-oracle.com/t_hi... Looks like RAM is the laggard.
You can look up the specs easily.
Back in the 80286 days, there was not even an L1 cache; however, the memory and ISA bus ran at CPU speed, 8-20MHz. Hard drive latency was ~65ms.
In the 80486 days, L1 cache was introduced and L2 was sometimes available in (very) expensive modules. I remember buying 256KB for the same price as 16MB RAM. The L1 caches ran (if I remember correctly) at CPU speed, 1 cycle. However, the bus speed started to slow down compared to the CPU. The VLB ran at CPU bus speed ("local" bus) and was often used for graphics, but PCI (an inferior bus) ran at 33MHz, so for anything over 33MHz we started needing dividers. The RAM ran at 80-120ns, so it started being slower than the CPU bus. Hard drive latencies, however, had improved to ~30ms.
In the Pentium age, memory slowed even further compared to the CPU bus. Now it took several cycles to access memory, and buses ran even slower (still PCI mostly, then eventually PCI-X (133MHz?), until PCI-e (running serial buses) came along). Hard drive latencies improved to ~15ms.
In the modern age, L1 caches have slowed even further, requiring 4 cycles for L1 cache and up to 30 for L3 caches. RAM access is even slower, with bus speeds about a quarter of a single CPU's speed, and sometimes 16 CPUs need to share those lanes. Peripheral bus speeds, however, have gone up, and PCIe 3.0 is now directly integrated into the CPU, 80486 VLB-style. Hard drives still have latencies of 10ms (we have a mechanical issue there), but even cheap SSDs can go down to ~1-2ms.
Custom electronics and digital signage for your business: www.evcircuits.com
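Cycle counts and nanoseconds from different eras are easier to compare once they are in the same unit. The sketch below converts between the two. The 3000 MHz clock for a modern CPU is an assumed round figure, not something taken from the post; the 8-20MHz, 33MHz, 80ns, 4-cycle and 30-cycle values are the ones quoted above. Function names are hypothetical.

```python
def cycles_to_ns(cycles: float, clock_mhz: float) -> float:
    """Convert a cycle count to nanoseconds at the given clock frequency."""
    return cycles * 1000.0 / clock_mhz


def ns_to_cycles(ns: float, clock_mhz: float) -> float:
    """Convert a latency in nanoseconds to cycles at the given clock frequency."""
    return ns * clock_mhz / 1000.0


MODERN_CLOCK_MHZ = 3000.0  # assumption: a round 3 GHz figure for illustration

print(f"80286 cycle at 8 MHz:   {cycles_to_ns(1, 8):.0f} ns")
print(f"80286 cycle at 20 MHz:  {cycles_to_ns(1, 20):.0f} ns")
print(f"80 ns RAM at 33 MHz:    {ns_to_cycles(80, 33):.2f} cycles")
print(f"L1, 4 cycles at 3 GHz:  {cycles_to_ns(4, MODERN_CLOCK_MHZ):.2f} ns")
print(f"L3, 30 cycles at 3 GHz: {cycles_to_ns(30, MODERN_CLOCK_MHZ):.2f} ns")
```

With that assumed clock, a 4-cycle L1 hit takes far less absolute time than the single-cycle memory access of an 8MHz 80286, even though it costs more cycles.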
Yes, he was ruling himself out as unable to answer. So am I. And it would take a *LOT* more than a Google search to answer. I lean towards agreeing with the people who cite bus speed as the limiting factor, but I'm not sure, and there could certainly be special circumstances where something else was the limit.
I *do* know that it's not an easy question to answer, and that any answer is going to depend for its correctness on a presumed workload. (Some things are CPU bound, and don't even use much RAM. Other things are IO bound, and make you think your disks are thrashing. Most things are somewhere in between.)
But the original question was "has the gap between fast-small memory and slow-large memory gotten larger, smaller, or stayed the same?" Even that's an oversimplified question, as it doesn't deal with persistent RAM. (I'm dubious about the value of that, but some people used it to advantage in the days of core memory.) Also ignored were the questions of relative cost. If you pay ENOUGH there are lots of exotic technologies... and I have no idea of the tradeoffs.
So much better to get the answer from somebody who actually knows the area. It's not simple.
I think we've pushed this "anyone can grow up to be president" thing too far.