Inside Tsubame, Japan's GPU-Based Supercomputer

Posted by timothy on Thursday December 11, 2008 @12:27PM from the please-don't-christen-the-supercomputer dept.

Startled Hippo writes "Japan's Tsubame supercomputer was ranked 29th-fastest in the world in the latest Top 500 ranking with a speed of 77.48 TFlops (trillion floating point operations per second) on the industry-standard Linpack benchmark. Why is it so special? It uses NVIDIA GPUs. Tsubame includes hundreds of graphics processors of the same type used in consumer PCs, working alongside CPUs in a mixed environment that some say is a model for future supercomputers serving disciplines like material chemistry." Unlike the GPU-based Tesla, Tsubame definitely won't be mistaken for a personal computer.

Re:Ofcourse by dgatwood · 2008-12-11 13:27 · Score: 4, Informative

Indeed, that's the whole idea behind the recently ratified OpenCL specification. Design a C-like language that provides a standard abstraction layer for the ability to perform complex computations on a CPU, GPU, or conceivably on any number of other devices lying around (e.g. idle I/O Processors, the DSP core in your WinModem, your printer's raster engine...).

--
Check out my sci-fi/humor trilogy at PatriotsBooks.
Re:Hold the hyperbole - Read again by raftpeople · 2008-12-11 13:49 · Score: 5, Informative

"On reading the article, the box has 30 thousand cores, of which the vast majority are AMD Opterons in Sun boxes. No mention of how/in what you'd program this to actually put the GPUs to good use."
You may want to read the article again; if not, here's a recap:
655 Sun boxes, each with 16 AMD cores = 10,480 CPU cores
680 Tesla cards, each with 240 processors = 163,200 GPU processors
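The recap's arithmetic is easy to check. The short Python sketch below simply multiplies the counts quoted in the comment; the variable names are made up for illustration, and the input figures are the commenter's, not independently verified.

```python
# Counts as quoted in the comment above (not independently verified).
sun_boxes = 655
cpu_cores_per_box = 16
tesla_cards = 680
gpu_processors_per_card = 240

cpu_cores = sun_boxes * cpu_cores_per_box
gpu_processors = tesla_cards * gpu_processors_per_card

print(f"CPU cores:      {cpu_cores:,}")       # 10,480
print(f"GPU processors: {gpu_processors:,}")  # 163,200
```

By these figures, the GPU processors outnumber the CPU cores by more than fifteen to one, although a single GPU processor is far simpler than an Opteron core, so the counts are not directly comparable.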

As for how to use the GPUs, I use my GTX280 (almost the same thing as a Tesla) to crunch through lots of numeric calculations in parallel. I'm sure these guys are doing the same thing, as that is the strength of the GPU. NVIDIA has made it easier to access the processing power of the GPU with CUDA. You write a program in C that gets loaded onto the GPU, and when you launch it you tell it how many copies to run at one time; each copy typically operates on a different portion of the data. Because you can launch more threads than there are processors, the GPU can be reading data in from global video memory for some threads while other threads are performing calculations.
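The launch model described here can be sketched without a GPU. What follows is a plain-Python, sequential emulation of the index arithmetic a CUDA-style launch uses, not real CUDA code: `scale_kernel`, `launch`, and `factor` are hypothetical names chosen for illustration, and the loop runs the "threads" one after another, so it says nothing about the memory-latency hiding mentioned above. The index expression mirrors CUDA's built-in `blockIdx.x * blockDim.x + threadIdx.x`.

```python
def scale_kernel(block_idx, thread_idx, block_dim, data, out, factor):
    """Work done by ONE logical thread: it handles a single element."""
    # Global index of this thread, analogous to
    # blockIdx.x * blockDim.x + threadIdx.x in CUDA.
    i = block_idx * block_dim + thread_idx
    # Guard: the launch may create more threads than there are elements.
    if i < len(data):
        out[i] = data[i] * factor


def launch(kernel, grid_dim, block_dim, *args):
    """Sequential stand-in for a kernel launch over grid_dim * block_dim threads."""
    for block_idx in range(grid_dim):
        for thread_idx in range(block_dim):
            kernel(block_idx, thread_idx, block_dim, *args)


data = [float(x) for x in range(1000)]
out = [0.0] * len(data)

block_dim = 256                                      # threads per block
grid_dim = (len(data) + block_dim - 1) // block_dim  # round up so every element is covered

launch(scale_kernel, grid_dim, block_dim, data, out, 2.0)
print(grid_dim, out[:4], out[-1])
```

With 1000 elements and 256 threads per block, the launch uses 4 blocks (1024 logical threads), and the bounds check makes the 24 surplus threads do nothing. On a real GPU the same kernel body would run concurrently across the processors instead of in a loop.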
The missing numbers by Anonymous Coward · 2008-12-11 14:16 · Score: 3, Informative

Just to get a perspective, the GPUs provide about 10 of the 77 TFLOPS benchmarked in LINPACK (HPC article).
Re:Could do it for cheaper by Jeff+DeMaagd · 2008-12-11 14:47 · Score: 3, Informative

"ATI's latest cards give more punch for the cost apiece, and they are designed specifically for being clustered/linked/xfired and whatnot."
I thought the nV Teslas were designed for HPC.
Performance goes up and cost comes down so quickly that something like that can easily happen between the time a system is ordered and the time it's installed.