Intel Announces New Enterprise Xeons, More Powerful Xeon Phi Cards
MojoKid writes "Intel announced a set of new enterprise products today aimed at furthering its strengths in the TOP500 supercomputing market. As of today, the Chinese Tianhe-2 supercomputer (aka Milky Way 2) is the fastest supercomputer on the planet at roughly 54 PFLOPS. Intel is putting its own major push behind heterogeneous computing with the Tianhe-2. Each node contains two Ivy Bridge sockets and three Xeon Phi cards. Each node, therefore, contains 422.4 GFLOPS of Ivy Bridge performance, but 3.43 TFLOPS worth of Xeon Phi. In addition, we'll see new Xeons based on this technology later this year, in the 22nm E5-2600 V2 family, with up to 12 cores. The new chips will be built on Ivy Bridge technology and will offer up to 12 cores / 24 threads. The new Xeons, however, aren't really the interesting part of the story. Today, Intel is adding cards to the current Xeon Phi lineup: the 7120P, 3120P, 3120A, and 5120D. The 3120P and 3120A are the same card; the 'P' is passively cooled, while the 'A' integrates a fan. Both of these solutions have 57 cores and 6GB of RAM. Intel states that they offer ~1 TFLOPS of performance, which puts them on par with the 5110P that launched last year, but with slightly less memory and presumably a lower price point. At the top of the line, Intel is introducing the 7120P and 7120X; the 7120P comes with an integrated heat spreader, the 7120X doesn't. Clock speeds are higher on this card, which has 61 cores instead of 60, 16GB of GDDR5, and 352GB/s of memory bandwidth. Customers who need lots of cores and not much RAM can opt for one of the cheaper 3100 cards, while the 7100 family allows for much greater data sets."
The x64 Phi cards are a lot easier to program than GPUs. No need to jump through hoops with memory mapping, keep things in sync for SIMD processing, or worry about running out of stack space when doing recursion.
Will this be interesting for me? Price/value-wise?
You won't get full performance from a Xeon Phi without using the SIMD instructions, so it is not as easy to program as you might hope.
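A back-of-the-envelope sketch of how much is left on the table without SIMD. It assumes the 512-bit vector unit with fused multiply-add described elsewhere in this thread, and it assumes scalar code retires one non-fused floating-point operation per instruction; the function name and defaults are made up for illustration.

```python
def scalar_fraction_of_peak(vector_bits=512, element_bits=64, fma=True):
    """Fraction of theoretical peak reachable by purely scalar code.

    Assumption: scalar code retires 1 flop per instruction, while a vector
    instruction retires one flop per lane (two per lane with FMA).
    """
    lanes = vector_bits // element_bits              # elements per vector instruction
    flops_per_vector_op = lanes * (2 if fma else 1)  # FMA counts as multiply + add
    return 1 / flops_per_vector_op


print(scalar_fraction_of_peak())                 # double precision
print(scalar_fraction_of_peak(element_bits=32))  # single precision
```

Under those assumptions, scalar double-precision code tops out at one sixteenth of the advertised peak, which is why the vector instructions are hard to ignore.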
In addition, we'll see new Xeons based on this technology later this year, in the 22nm E5-2600 V2 family, with up to 12 cores.
...And yet, because of corporate policies on running the shittiest AV on the planet (Symantec) cranked to the max, my desktop PC will still have the responsiveness of a sloth on 'luudes.
Seriously, I already have 8 cores worth of Xeon (2x4) and the load meter never even twitches, enough RAM to load my entire system drive into, and an SSD system drive. More cores won't help at this point.
It's too bad Thinking Machines Incorporated never had a sticker policy, because the "Fat Tree" routing topology is straight out of TMI (the prior TMI topology, hypercube, didn't allow the customer as much choice to balance cores vs interconnect).
--- Often in error; never in doubt!
Xeon, Itanium. I think I've figured out the real genius at Intel.
1. Pick a cool element.
2. Remove a letter.
3. ?????
4. Profit!!!
2015 Arbon
2018 Heliu
2023 Litium
2024 Silion
2026 Eon
(-1: Post disagrees with my already-settled worldview) is not a valid mod option.
I was expecting 32 cores minimum in desktop CPUs by the start of this decade. All this new supercomputer stuff is well and good, but what about lots of cores for us mere mortals too?
You wouldn't like the speed of typical software on a 32-core CPU using the same transistor count (i.e. at the same cost) as the machine you're running now.
Cache sharing, NUMA access, etc. turn out to be tricky to get fast, right, and cheap. In the meantime, much of the existing software library can't even properly take advantage of a 6-core desktop chip, so all mere mortals would get today from a 32-core chip would be a slower machine.
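One way to put rough numbers on this is Amdahl's law, which the comment doesn't name but which captures the same trade-off. The sketch below is illustrative only: the parallel fraction (0.5) and the per-core speed of the hypothetical 32-core part (0.4 of a current core) are made-up assumptions, not measurements.

```python
def relative_speed(parallel_fraction, cores, per_core_speed=1.0):
    """Speed relative to one full-speed core, per Amdahl's law.

    parallel_fraction: share of the work that scales across cores (assumed).
    per_core_speed:    speed of each core relative to the baseline core (assumed).
    """
    serial = 1.0 - parallel_fraction
    amdahl_speedup = 1.0 / (serial + parallel_fraction / cores)
    return per_core_speed * amdahl_speedup


# Hypothetical comparison: 4 full-speed cores vs 32 smaller, slower cores,
# running software that is only half parallel.
print(relative_speed(0.5, cores=4, per_core_speed=1.0))
print(relative_speed(0.5, cores=32, per_core_speed=0.4))
```

With those numbers the 32-core part comes out slower than a single current core, let alone the quad-core, because the serial half of the work runs on a weaker core.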
My God, it's Full of Source!
OUTSIDE_IP=$(dig +short my.ip @outsideip.net)
2015 Ron
2018 Aluminum
Refactor the law; it's bloated, confusing and unmaintainable.
An 8-core Xeon (not i7) is not a mid-range desktop. Nor is "enough RAM to load my entire system drive into", or an SSD system drive.
Even now, those are all higher-end in the general scheme of things. More common on enthusiast machines, sure, but far from "mid-range" in a business system.
Well, if you've got an NVidia card + Xeon (which happens to be what I have available at work), then any newly written code is going to be in OpenCL or LLVM IR (via C++ or a custom language). If you're going that route, any code you write will more or less work on Phi with little modification (although I have not got a Phi on which I can actually test my hypothesis here, so I may be talking BS!). So in theory at least, it won't be any harder to write code for Phi than for NVidia/AMD. The thing that appeals to me most about Phi is simply the slightly less restrictive way you can address memory, code, and the CPU cores. GPUs were originally designed to be more or less a one-way process. You throw geometry data from the CPU to the GPU, and the GPU throws it on the screen. Whilst GPUs are much more general purpose these days, they do still display that heritage in the occasional moment where you realise "damn, I'm unable to access that memory here", or "damn, I have to split this process into two separate ones because the hardware says so".
AltiVec was Motorola's 1999 SIMD instructions & hardware, a response to the SIMD instructions & hardware released by AMD in 1998 (AMD called theirs 3DNow!). Intel also released SIMD instructions & hardware in 1999, called SSE. AltiVec & SSE both had 128-bit-wide registers that could handle 4 single precision floating point operations in parallel; 3DNow! was 64 bits wide and handled 2. Some of them may have also been able to do two double precision floats (not AltiVec though), and they all did various integer ops in parallel too.
Xeon Phi is a chip that contains around 60 independent specialized Intel x86 cores, plus caches & ring buses for the cores to communicate with each other. The core count is inexact, probably because Intel is figuring out the expected number of dead cores on a chip they can ship and still call it a complete chip. Each of the 60 or so specialized cores has a 512-bit-wide pipe that will do 16 parallel single precision floating point operations or 8 parallel double precision floats. To call it a "pipe" means a new instruction & data can be issued every clock cycle, and there are a number of instructions "in flight" streaming down the pipeline, with results issuing out of the bottom of the pipe every clock. The pipe is a "fused multiply add" architecture (useful for vector dot products), so theoretically, every clock cycle, each core could issue 16 single precision mults and 16 single precision adds, a total of 32 flops per clock per core. Most high performance computing uses double precision, so cut that 32 in half, and multiply 16 flops per clock times 60 cores times about 1.2GHz to get about 1.2 DP teraflops (theoretical) per Xeon Phi chip. Actual flops will be considerably lower if the problem doesn't fit well in cache.
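The arithmetic above can be sketched in a few lines. This is only the theoretical-peak formula from the comment (lanes × 2 for fused multiply-add × cores × clock); the function name is made up, and the 60-core / 1.2GHz inputs are the comment's round numbers rather than the specs of any particular card.

```python
def peak_gflops(cores, clock_ghz, vector_bits=512, element_bits=64, fma=True):
    """Theoretical peak in GFLOP/s: flops per clock per core x cores x GHz."""
    lanes = vector_bits // element_bits              # 8 for doubles, 16 for singles
    flops_per_clock = lanes * (2 if fma else 1)      # FMA = one multiply + one add per lane
    return flops_per_clock * cores * clock_ghz


dp = peak_gflops(cores=60, clock_ghz=1.2)                    # double precision
sp = peak_gflops(cores=60, clock_ghz=1.2, element_bits=32)   # single precision
print(f"DP peak: {dp:.0f} GFLOP/s, SP peak: {sp:.0f} GFLOP/s")
```

With those inputs the double-precision figure lands a little under 1.2 teraflops, matching the "about 1.2" estimate; real workloads will see less, as noted.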
The bottom half of this article has a nice overview of Xeon Phi specs.
Where did I hear that before
Of course news about a fake are Fake News.
Uh... that's what these Xeon Phi cards are. Lots of cores. FYI, that 80-core research chip wasn't x86.
Actually Larrabee was exactly 80 486DX cores on one die. They just couldn't figure out how to get them to do useful work (they were thinking graphics processing, of all things). So they rethought their approach and canceled that project.
I have one question. If the Japanese Ministry of Agriculture is not in charge of Gundam, then who is?