Intel Announces Xeon E5 and Knights Corner HPC Chip
MojoKid writes "At the supercomputing conference SC2011 yesterday, Intel announced its new Xeon E5 processors and demoed its new Knights Corner many integrated core (MIC) solution. The new Xeons won't be broadly available until the first half of 2012, but Intel has been shipping the new chips to a small number of cloud and HPC customers since September. The new E5 family is based on the same core as the Core i7-3960X Intel launched Monday. The E5, while important to Intel's overall server lineup, isn't as interesting as the public debut of Knights Corner. Recall that Intel's canceled GPU (codenamed Larrabee) found new life as the prototype device for future HPC accelerators and complementary products. According to Intel, Knights Corner packs 50 x86 processor cores into a single die built on 22nm technology. The chip is capable of delivering up to 1TFlop of sustained performance in double-precision floating point code and operates at 1 to 1.2GHz. NVIDIA's current high-end M2090 Tesla GPU, in contrast, is capable of just 665 DP GFlops."
4chan is still down? Maybe we should lend them a hand.
Faster! Faster! Faster would be better!
I mostly understand the figures this post states, but it sounds like engineering dialog from 'Star Trek: Voyager'. But all this means to me is that the chips from last year are cheaper now that they've been outclassed.
Odds are... they have it lined up such that... they are in a 5x10 grid. Or a 5x5 Grid front/back.
Just because it's a computer doesn't mean it's bound by the power of two. Boards are rectangular. Chips laid out aren't necessarily in binary distribution.
Intel's period of dismissive attitude toward advanced features (multiple cores, 64-bit support on x86, something that sucked less than FSB) was never really serious. Back when they still thought that they had a chance of making IA64 the 'serious' platform and gradually letting x86 (and AMD) sink into the bargain bin, they did some tactical rubbishing of what "normal users" needed in order to justify restricting those features to the high-end SKUs; but they worked on them.
Once it became clear that that particular plan wasn't a happening thing, and that AMD was delivering serious server parts at knockdown prices, and Nvidia was doing interesting things with GPUs, and ARM licensees were pumping out increasingly zippy low-end chips, they stopped fucking around. These days they'll still charge as hard as they can for the features provided; but their hopes of sandbagging x86s in order to sell IA64s are dead.
Your average consumer doesn't need 50 cores.
Sure they do. What do you think a GPU is? History has shown over and over that we can never have enough computing power. Now that we're at the physical limits of clock speeds, parallelism is going mainstream.
I wonder if Intel is taking a page from IBM's playbook.
Upper-end POWER7 CPUs have the ability to have half their cores turned off. The cores that are on can then use the disabled neighbors' caches and run at a higher clock speed. This switch actually speeds up some tasks that can't be evenly broken up into balanced threads.
I can see Intel doing this where some cores are disabled due to manufacturing defects (which happen to all dies), and having the operable cores use nearby caching which would otherwise go to waste.
A 50-core chip at 1GHz is going to need to perform 20 double-precision floating point ops per cycle per core to achieve 1TFlop performance. OK, so 1.2GHz cuts that down to about 16.7 flops/clock. Since when can anything Intel Architecture achieve that many flops per cycle? Two 4-element dot products is only 14 flops. I suppose if they did two vector-scalar multiply-adds that would get 16 flops per cycle, which is close. So I just answered my own question. But can they really keep the FP unit running continuously at that rate? On all 50 cores?
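The per-core arithmetic in this comment can be checked with a quick sketch. The 1 TFlop target, 50 cores, and 1 to 1.2GHz clock come from the summary; the function names and the 16 flops/cycle figure plugged in at the end are just illustrative, not anything Intel has confirmed about the hardware.

```python
# Back-of-the-envelope check of the flops-per-cycle figures.
# Inputs come from the story summary; helper names are made up for this sketch.

def flops_per_cycle_per_core(target_flops, cores, clock_hz):
    """Flops each core must retire every cycle to hit target_flops."""
    return target_flops / (cores * clock_hz)

def peak_flops(cores, clock_hz, flops_per_cycle):
    """Peak throughput if every core sustains flops_per_cycle each cycle."""
    return cores * clock_hz * flops_per_cycle

TARGET = 1e12   # 1 TFlop, double precision
CORES = 50

for clock_ghz in (1.0, 1.2):
    need = flops_per_cycle_per_core(TARGET, CORES, clock_ghz * 1e9)
    print(f"{clock_ghz:.1f} GHz: {need:.2f} DP flops/cycle/core needed")

# Assumption: 16 flops/cycle/core (e.g. two 4-wide multiply-adds per cycle).
print(f"50 cores x 1.2 GHz x 16 = {peak_flops(CORES, 1.2e9, 16) / 1e9:.0f} GFlops")
```

At 16 flops per cycle the 1.2GHz part lands at 960 GFlops, so hitting a full teraflop would take slightly more than that per core, a higher clock, or more cores.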
At 6GHz, you are very close to the speed of light in copper, so unless you can break the speed of light... it's a "physics limit".
Below this point you have the problem of energy efficiency, i.e. what's the point of spending more energy on cooling than on actually powering the thing?
Intel's 3D transistors are HUGE because of this; they can push higher clock speeds more easily.
Because computers count in binary, which is powers of two. And, I'll assume you meant cores.
Historically such things have been powers of two to make the addressing simpler without having extra magic or control lines left over. So, 1, 2, 4, 8, 16, 32 and 64 all make sense in terms of being expressible in a fixed number of bits ... 50 to some of us seems like a fairly arbitrary choice. Since you use an unusual combination of wiring, it might as well be 37 or 51 since it's not a number that 'naturally' lends itself to computers. The device is likely wired in such a way that it could count to 64 ... or they're doing things in a slightly odd way.
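To put numbers on the addressing point, here is a small sketch of how many ID bits a given core count needs and how many encodings go unused. It says nothing about how Knights Corner is actually wired; the function names are invented for illustration.

```python
# How many bits does it take to give each of n cores a unique ID,
# and how many bit patterns are left over? Purely illustrative.

def address_bits(n):
    """Minimum number of bits needed to number n items as 0..n-1."""
    return (n - 1).bit_length()

def unused_encodings(n):
    """Bit patterns left over once n items have been numbered."""
    return 2 ** address_bits(n) - n

for cores in (32, 50, 64):
    print(cores, "cores:", address_bits(cores), "bits,",
          unused_encodings(cores), "unused encodings")
```

With 50 cores you need the same 6 bits as 64 cores would, and 14 of the 64 possible IDs simply never get used.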
Anyway, that's why some of us find it to be a little odd. And it's also why the hard-drive makers deciding "1 GIG" is "1,000,000,000 bytes" is irksome ... with all of those extra powers of two, it should be "1,073,741,824 bytes". Which means you lose about 74MB/GIG ... so my 2TB drive isn't.
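The decimal-versus-binary gap is easy to work out. A minimal sketch, assuming a "2TB" drive holds exactly 2 x 10^12 bytes (real drives vary a little); the constant and function names are just for this example.

```python
# Decimal (drive-maker) units versus binary (power-of-two) units.

GB = 10 ** 9      # "1 GIG" as printed on the box
GIB = 2 ** 30     # 1,073,741,824 bytes
TB = 10 ** 12
TIB = 2 ** 40

def shortfall_per_gig():
    """Bytes missing from a decimal gigabyte relative to a binary one."""
    return GIB - GB

def decimal_tb_in_tib(n_tb):
    """Size of an n_tb decimal-terabyte drive, expressed in binary TiB."""
    return n_tb * TB / TIB

print(f"shortfall per gig: {shortfall_per_gig():,} bytes")
print(f"2TB drive: {decimal_tb_in_tib(2):.2f} TiB")
```

That works out to 73,741,824 bytes per gig, and a 2TB drive shows up as roughly 1.82 TiB.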
Lost at C:>. Found at C.
Intel claims it will be released as a commercial product in the near future.
So, are you always an asshole, or just on Slashdot?
Well, since I own 3 iPods and an iPad ... you'd think I'd be the one being accused of being an asshole by that logic.
I'm going to go with self-righteous prick who feels entitled to be an ass on the internet because he's got a 5-digit Slashdot ID and therefore considers himself to be l337.
Well, in fairness, on the memory side, you do that with some combination of memory modules which are addressable by powers of two. (eg. 2GB + 1GB, or 4GB + 4GB + 1GB), each of which is discrete from the others. I don't believe you can buy a 3GB or 9GB memory module.
Nope, absolutely not. Not saying that ... just saying that traditionally such things have been architected to use powers of two because it was most efficient.
Obviously, for other reasons, Intel decided to go with 50 cores.
I'm at SC11 right now and just attended NIC's MIC presentation. The scaling looks fantastic according to various codes that they compiled to run on it, but what was notably absent was performance relative to traditional x86 chips. The final presenter even said that now that the technology has been demonstrated to work (with minimal porting effort required) the next step will be to optimize and improve performance. The takeaway is that relative to Intel's other chips, MIC performance wasn't impressive enough to include in the presentation. That's fine in my book because it's an ambitious project, but it sounds like there is still some work to do.
Yea, I know it is too late. The good news is that the x64 transition went much better.
No, light travels 5cm in one 6GHz clock cycle, in a vacuum. Speed of light limitations have been a consideration for years. The Cray-1 was designed in the early 70s and its physical design allowed for the propagation speed of electricity in copper. It only ran at 80MHz. It's not just about cycle time: what's the duration of your edges? What other latencies are there in the electronics? In 2004, IBM's POWER5 MCM was 9.5cm wide and the CPUs ran at ~2GHz. Not sure what speed the interconnect ran at.
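For anyone who wants to check the distance-per-cycle figures, a rough sketch follows. It uses the vacuum speed of light; the `velocity_factor` parameter is a hypothetical knob for slower propagation in real wiring, and no particular value for copper traces is claimed here.

```python
# Distance a signal can cover in one clock cycle.
# velocity_factor=1.0 means vacuum; real interconnect is slower, and the
# right factor depends on the medium (left as an assumption for the reader).

C_M_PER_S = 299_792_458  # speed of light in a vacuum

def distance_per_cycle_cm(clock_hz, velocity_factor=1.0):
    """Centimetres travelled during one cycle at clock_hz."""
    return C_M_PER_S * velocity_factor / clock_hz * 100

for label, hz in (("6 GHz", 6e9), ("2 GHz (POWER5)", 2e9), ("80 MHz (Cray-1)", 80e6)):
    print(f"{label}: {distance_per_cycle_cm(hz):.1f} cm per cycle")
```

In a vacuum that is about 5cm at 6GHz, about 15cm at 2GHz (so a 9.5cm module fits inside one cycle, before counting any other latency), and a few metres at 80MHz.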
Music at http://www.ignorantbliss.co.uk/