NVIDIA Announces Tesla K40 GPU Accelerator and IBM Partnership In Supercomputing
MojoKid writes "The supercomputing conference SC13 kicks off this week and Nvidia is kicking off their own event with the launch of a new GPU and a strategic partnership with IBM. Just as the GTX 780 Ti was the full consumer implementation of the GK110 GPU, the new K40 Tesla card is the supercomputing / HPC variant of the same core architecture. The K40 picks up additional clock headroom and implements the same variable clock speed threshold that has characterized Nvidia's consumer cards for the past year, for a significant overall boost in performance. The other major shift between Nvidia's previous gen K20X and the new K40 is the amount of on-board RAM. K40 packs a full 12GB and clocks it modestly higher to boot. That's important because datasets are typically limited to on-board GPU memory (at least, if you want to work with any kind of speed). Finally, IBM and Nvidia announced a partnership to combine Tesla GPUs and Power CPUs for OpenPOWER solutions. The goal is to push the new Tesla cards as workload accelerators for specific datacenter tasks. According to Nvidia's release, Tesla GPUs will ship alongside Power8 CPUs, which are currently scheduled for a mid-2014 release date. IBM's venerable architecture is expected to target a 4GHz clock speed and offer up to 12 cores with 96MB of shared L3 cache. A 12-core implementation would be capable of handling up to 96 simultaneous threads. The two should make for a potent combination."
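A quick check of the Power8 figures quoted in the summary, as a minimal sketch: the only inputs are the 12 cores, 96 threads, and 96MB of L3 stated above. The variable names are made up for illustration, and the even per-core split of L3 is an assumption, not something the summary states.

```python
# Figures taken from the summary above.
cores = 12
total_threads = 96
shared_l3_mb = 96

# Hardware threads each core would have to run to reach 96 in total.
threads_per_core = total_threads // cores

# Assumption: L3 divided evenly across cores, purely for a rough sense of scale.
l3_per_core_mb = shared_l3_mb / cores

print(f"{threads_per_core} threads per core, {l3_per_core_mb:.0f} MB of L3 per core")
```

That works out to eight hardware threads per core, which is what the 96-thread figure implies for a 12-core part.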
Nvidia has sidetracked OpenCL for CUDA?
IBM is announcing that their hardware is "Open", in the sense that it has PCIe slots, and Nvidia is announcing that they'd be happy to sell hardware to the sort of price-insensitive customers who will be buying Power8 gear?
I'm shocked.
Ah, the good old days.... when CPUs were measured in megahertz, and instructions took multiple clocks. :D
Really, what was the Cray when it first came out? One vector processing unit. How many does this new NVidia board have? How much faster are they than the original Cray?
I do not fail; I succeed at finding out what does not work.
NVIDIA seems behind AMD in moving to 512-bit wide GDDR5: this K40 still has 384-bit. Also worrying is whether significant performance improvements will really be possible beyond that point. GPU code is notorious for easily becoming DRAM bandwidth limited. Cache on the GPU is very small compared to the computing resources.
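To put rough numbers on the bus-width point, here is a minimal sketch of the usual peak-bandwidth arithmetic: bus width in bytes times the effective per-pin data rate. The data rates used below (6.0 Gbps for the 384-bit card, 5.0 Gbps for a 512-bit card) and the 1000 GFLOP/s compute figure are assumptions for illustration and do not come from the post; the function names are hypothetical.

```python
def peak_bandwidth_gb_s(bus_width_bits: int, data_rate_gbps: float) -> float:
    """Theoretical peak DRAM bandwidth in GB/s: bytes per transfer times transfers per second."""
    return bus_width_bits / 8 * data_rate_gbps


def break_even_flops_per_byte(peak_gflops: float, bandwidth_gb_s: float) -> float:
    """Arithmetic intensity (FLOPs per byte) below which a kernel is bandwidth bound."""
    return peak_gflops / bandwidth_gb_s


# Assumed effective data rates; check the vendor spec sheets before relying on them.
narrow = peak_bandwidth_gb_s(384, 6.0)  # 384-bit bus
wide = peak_bandwidth_gb_s(512, 5.0)    # 512-bit bus at a lower assumed data rate

print(f"384-bit: {narrow:.0f} GB/s, 512-bit: {wide:.0f} GB/s")

# Hypothetical compute peak of 1000 GFLOP/s, only to show the scale of the ratio.
print(f"break-even intensity: {break_even_flops_per_byte(1000.0, narrow):.2f} FLOPs/byte")
```

Under those assumptions the wider bus only buys about 11% more peak bandwidth, because the narrower bus runs its memory faster. Either way a kernel needs several FLOPs per byte fetched before it stops being DRAM limited.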
IBM's people walk around campuses pitching funding and hiring graduates to get rid of GPUs and use FPGAs instead. According to what they said in the interview, they even have a large software group doing this. They claim that GPUs have no future, and now they're partnering with NVIDIA? It seems to be their quick-fix strategy until their high-level-language-to-FPGA compilers and Lime have matured.
IBM has announced its willingness to license the Power8 design in much the same way that ARM licenses its designs to a plethora of companies. IBM has seen what ARM has accomplished at the lower end, staying relevant in a market that might otherwise have gone to Intel given sufficient time, and sees motivation to do the same in the datacenter, where Intel has significantly diminished POWER's footprint over the years. Intel operates at obscene margins due to the strength of its ecosystem and technology, and IBM is recognizing that it needs to build a more diverse ecosystem itself if it wants to compete with Intel. On top of that, the runway for such an opportunity may be very short. ARM as-is is not a very useful server platform, but that gap may close quickly before IBM can move, particularly as 64-bit ARM designs become more prevalent.
For nVidia, things are a bit more than 'sure we'll take more money'. nVidia spends a lot of resources on driver development, and without their cooperation, using their GPU accelerator solution will get nowhere. nVidia has agreed to invest the resources to actually support Power. Here, nVidia is also feeling the pressure from Intel. Phi has promised easier development for accelerated workloads as a competitor to nVidia solutions. As yet, Phi hasn't been everything people had hoped for, but the promise of easier development today and of improvements later has nVidia rightly concerned about future opportunities in that space. Partnering with a company without such ambitions gives them a way to try to apply pressure against a platform that clearly has its sights on closing the opportunity for GPU acceleration in HPC workloads. Besides, IBM has the resources to help give a boost in terms of software development tooling that nVidia may lack.
XML is like violence. If it doesn't solve the problem, use more.
If Power is venerable, what is the correct term for x86?
X86 works reasonably well with today's transistor budgets, but its evolution is layers upon layers of kludges on top of workarounds on top of band-aids (rinse and repeat) for over 3 decades. Some details (like how some instructions affect flags) are straight from the 8008 (earlier than the 8080); others, like the x87 stack, are thankfully being forgotten but are still here for backward compatibility. Even z Servers, with about 50 years of existence since the IBM 360, look clean compared with x86 from an architectural point of view.
Not on my nVidia 320m, I'm not!
I should have bought USB ASIC miners when they were still available for cheap after the 75 USD price crash.
Does it catch fire?
By any chance, is nVidia planning on doing an end-around on Microsoft with the graphics card hosting a full-blown operating system? 12GB of RAM gets you plenty of working space.
I come here for the love
What's the price, and is there any software to help me mine bitcoin?
According to the Reg (page 2) Power8 is going to have some sort of memory coherence function for accelerators. Allowing the GPU to be just another first-class processor with regards to memory could be a big win, performance-wise, not to mention making it easier to program.
The latest version of CUDA (version 6) has also just added features in the same area (unified memory management). Anandtech has some more info about that.
This thing will be beast!
FUNK!