Intel's 128MB L4 Cache May Be Coming To Broadwell and Other Future CPUs
MojoKid writes "When Intel debuted Haswell this year, it launched its first mobile processor with a massive 128MB L4 cache. Dubbed "Crystal Well," this on-package (not on-die) pool of memory isn't just a graphics frame buffer, but a giant pool of RAM for the entire core to use. The performance impact is significant, though the Haswell processors that use the L4 cache don't appear to account for much of Intel's total CPU volume. Right now, the L4 cache pool is only available on mobile parts, but that could change next year with Broadwell-K. The 14nm desktop chips aren't due until the tail end of next year, but we should see a desktop refresh in the spring with a second-generation Haswell part. Still, it's a sign that Intel intends to integrate the large L4 as standard on a wider range of parts. Using eDRAM instead of SRAM allows Intel to dedicate just one transistor per cell instead of the 6T configurations commonly used for L1 or L2 cache. That means the memory isn't quite as fast, but it saves an enormous amount of die space. At 1.6GHz, L4 latencies are 50-60ns, which is significantly higher than the L3 but roughly half the latency of main memory."
On laptops? Perhaps it could. I suspect that an eDRAM cache plus slower main memory could have lower total power consumption than faster main memory at the same performance level, especially if you have more of it. I believe the major component of power usage in main memory DRAM is actually using the memory (as in, transferring the data).
Ezekiel 23:20
With this 128MB cache, shouldn't this CPU be able to run an OS like Win95 or an older Linux without additional memory?
Slashdot social media options: AIM, ICQ, Yahoo, Jabber and Mobile Text. Why no MySpace?
Cache performance impact is very heavily dependent upon application characteristics. Specifically, the size of the active working set.
Best case, when you're working with an active set that's larger than L3 but under L4 - around 100MB or so - and you're accessing it in a repeating pattern, and the compiler hasn't found any tweaks to help, and you're not multitasking, and the OS isn't swapping you out every slice, and the stars are aligned in your favor... the theoretical maximum performance gain can be up to 2x. It's very rare you'll find a program that benefits that much, though. The closest I can think of is image processing.
So in the real world, anywhere from 'no benefit' to 'double the speed' depending on application.
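To put rough numbers on that range, here is a back-of-the-envelope sketch in Python. It assumes a purely memory-bound program, takes the L4 latency as 55ns (the middle of the 50-60ns quoted in the summary) and main memory as twice that, and uses a made-up 10ns figure for L3. The function names and the hit-rate parameters are hypothetical, for illustration only, not measurements.

```python
def avg_access_ns(l3_hit_rate, l4_hit_rate, l3_ns, l4_ns, dram_ns):
    """Average latency of an access that has already missed L1/L2.

    Simplified model: each access is charged only the latency of the
    level that finally serves it (lookup overhead on a miss is ignored).
    """
    l3_miss = 1.0 - l3_hit_rate
    return (l3_hit_rate * l3_ns
            + l3_miss * l4_hit_rate * l4_ns
            + l3_miss * (1.0 - l4_hit_rate) * dram_ns)


def l4_speedup(l3_hit_rate, l4_hit_rate,
               l3_ns=10.0,     # assumed L3 latency, not from the article
               l4_ns=55.0,     # middle of the 50-60ns quoted in the summary
               dram_ns=110.0): # assumed: about twice the L4 latency
    """Speedup of a memory-bound program from adding the L4 cache."""
    without_l4 = avg_access_ns(l3_hit_rate, 0.0, l3_ns, l4_ns, dram_ns)
    with_l4 = avg_access_ns(l3_hit_rate, l4_hit_rate, l3_ns, l4_ns, dram_ns)
    return without_l4 / with_l4


# Best case: everything misses L3 and everything fits in L4.
best_case = l4_speedup(l3_hit_rate=0.0, l4_hit_rate=1.0)
# Working set already fits in L3: the L4 never gets touched.
no_benefit = l4_speedup(l3_hit_rate=1.0, l4_hit_rate=1.0)
# Something in between: 90% of accesses hit L3, the rest hit L4.
typical = l4_speedup(l3_hit_rate=0.9, l4_hit_rate=1.0)
```

Under these assumptions the best case comes out at the 2x ceiling, the fits-in-L3 case shows no gain at all, and anything that spends real time computing rather than waiting on memory lands somewhere in between.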
You can revisit those nostalgic 8MB and 4MB days again with the latest AMD chips, as L2 cache. :)
Just use a modified version of coreboot to bypass those silly POST tests and load Windows 3.11 directly into the CPU cache :)
Anons need not reply. Questions end with a question mark.