Overclocker Pushes Intel Core i7-7700K Past 7GHz Using Liquid Nitrogen (hothardware.com)
MojoKid writes from a report via HotHardware: If you've had any doubts of Intel's upcoming Kaby Lake processor's capabilities with respect to overclocking, don't fret. It's looking like even the most dedicated overclockers are going to have a blast with this series. Someone recently got a hold of an Intel Core i7-7700K chip and decided to take it for an overclocking spin. Interestingly, the motherboard used is not one of the upcoming series designed for Kaby Lake, but the chip was instead overclocked on a Z170 motherboard from ASRock (Z170M OC Formula). That bodes well for those planning to snag a Kaby Lake CPU and would rather not have to upgrade their motherboard as well. With liquid nitrogen cooling the processor, this particular chip peaked at just over 7GHz, which helped deliver a SuperPi 32M time of 4m 20s, and a wPrime 1024M time of 1m 33s. It's encouraging to see the chip breaking this clock speed, even with extreme methods, since it's a potential relative indicator of how much headroom will be available for overclocking with more standard cooling solutions.
If I recall correctly, the first time someone got over 8 GHz was back in ~2004, over a decade ago. I know clock speed isn't everything, but parallelism will only get you so far. I really hope that before we get to 5nm chips, we can get some 20 GHz clock speeds. The amount of work you'll be able to do on a single thread will be amazing.
The only thing inherently inefficient about parallel computing is the overhead required to keep the software consistent and coherent. The real problem with multi-core computing is that very little software is written in such a way that it can run on multiple CPUs. Hell, my professors were saying that in college 15 years ago, and it's still true today.
... lumbers along at 100MHz still.
Time we started working on that side of the hardware some more.
Frequency is not an actual measure of CPU speed, as the average number of instructions per cycle was way smaller in 2004 (on a P4 CPU) than it is today.
Today, processors have much shorter pipelines (14 stages vs Prescott's 31). A pipeline holds instructions that are at different stages of execution (eg: decoded, having their necessary input ready, having their result computed in temporary storage but not yet committed). To run a CPU efficiently, you need to keep the pipeline full. When a branch instruction (eg: "is the value of register CX 0?") is encountered, the CPU predicts its outcome ("no, it's not 0") and starts loading and decoding the instructions for that path of the branch to keep the pipeline full. If CX turns out to be 0, it has to discard the entire pipeline and start filling it up again with the path for the "yes, it's 0" case. Shorter pipelines reduce this penalty, resulting in a higher average number of instructions executed per cycle.
Branch prediction also improved greatly, so you don't need to pay the already smaller penalty that many times.
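To put rough numbers on the pipeline argument above, here is a minimal sketch of the usual back-of-the-envelope model: average cycles per instruction equals an ideal baseline plus the stall cycles lost to mispredicted branches. The 14 and 31 stage counts come from the comment above; the branch fraction, the misprediction rates, and the simplification that a flush costs about one full pipeline depth are all assumptions chosen for illustration, not measurements of any real chip.

```python
def effective_cpi(base_cpi, branch_fraction, mispredict_rate, flush_penalty_cycles):
    """Average cycles per instruction, including pipeline-flush stalls.

    Every mispredicted branch is assumed to cost flush_penalty_cycles
    extra cycles (a simplification; real penalties vary).
    """
    return base_cpi + branch_fraction * mispredict_rate * flush_penalty_cycles


# Hypothetical workload: 20% of instructions are branches, ideal CPI of 1.0.
BRANCH_FRACTION = 0.20
BASE_CPI = 1.0

# Flush penalty approximated by pipeline depth (assumption).
deep_pipeline = effective_cpi(BASE_CPI, BRANCH_FRACTION, 0.05, 31)
short_pipeline = effective_cpi(BASE_CPI, BRANCH_FRACTION, 0.05, 14)

# A better branch predictor (hypothetical 1% miss rate) on the short pipeline.
short_better_predictor = effective_cpi(BASE_CPI, BRANCH_FRACTION, 0.01, 14)

for label, cpi in [
    ("31 stages, 5% mispredicts", deep_pipeline),
    ("14 stages, 5% mispredicts", short_pipeline),
    ("14 stages, 1% mispredicts", short_better_predictor),
]:
    print(f"{label}: CPI ~ {cpi:.3f}, IPC ~ {1 / cpi:.3f}")
```

With these made-up inputs the shorter pipeline and the better predictor each lower the CPI on their own, which is the point of the two paragraphs above; the absolute values mean nothing beyond that.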
The number of cycles a single instruction takes to execute has also shrunk. An integer division took 34 cycles on a Prescott but takes 23-26 cycles on a Skylake. For simpler instructions the relative difference is even bigger, eg a MOV took at least 2 cycles on a Prescott, and now it usually takes 1 cycle.
The average amount of data processed by an instruction also improved with new instructions (eg AVX, AES-NI).
The cumulative result of these improvements (more data per instruction, more instructions per cycle, fewer cycles wasted in pipelines) is CPUs that are many times faster at the same clock rate, even without considering multiple cores.
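The same point as a quick calculation: single-thread throughput is roughly clock rate times instructions per cycle times data processed per instruction. The sketch below uses entirely hypothetical figures for an "old" and a "new" chip (they are not Prescott or Kaby Lake measurements) just to show how a lower-clocked CPU can still come out ahead.

```python
def work_per_second(clock_hz, instructions_per_cycle, data_per_instruction=1.0):
    """Rough single-thread throughput: data elements processed per second."""
    return clock_hz * instructions_per_cycle * data_per_instruction


# Hypothetical figures for illustration only.
old_chip = work_per_second(clock_hz=8.0e9, instructions_per_cycle=0.5)
new_chip = work_per_second(clock_hz=4.0e9, instructions_per_cycle=2.0)

# Same new chip running a vectorised loop that handles 4 elements per instruction.
new_chip_vectorised = work_per_second(4.0e9, 2.0, data_per_instruction=4.0)

print(f"old chip at 8 GHz:        {old_chip:.2e} elements/s")
print(f"new chip at 4 GHz:        {new_chip:.2e} elements/s")
print(f"new chip at 4 GHz, SIMD:  {new_chip_vectorised:.2e} elements/s")
```

Under these assumptions the 4 GHz chip does twice the work of the 8 GHz one, and eight times as much once wider instructions are counted, so the clock figure alone says little.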