Slashdot Mirror


GRAPE6, Now With GNU/Linux Frontend, At 32 TFlops

teuben writes "I am attending the "Astrophysical Supercomputing using Particle Simulations" conference here in Tokyo, and during the first session yesterday Jun Makino announced that the GRAPE6 is now operational and running with a four-headed Linux front end, each head a 1.7 GHz PC (not a quad, just four individual PCs). This prototype is now running at 32 Tflops! The best news is that this prototype is scalable, and this configuration is only 1/4 of the final one. Funding currently limits building faster GRAPEs. Check out http://grape.c.u-tokyo.ac.jp/iau208/ for the conference website, and http://astrogrape.org/ for the GRAPE website." But that's not all -- Peter also has word on how you (or more likely your local astrophysics department, since that's what it's best for) can get a GRAPE of your own, and on electronics in Japan.

You can also get a baby GRAPE (see the pictures at http://www.astro.umd.edu/~teuben/pics/japan/09/p7090014.html), which runs at a good fraction of a Tflop and will cost somewhere around 10k$.

I have some more pictures at http://www.astro.umd.edu/~teuben/pics/japan/08/ which show the 1/4-size GRAPE6 running at 32 Tflops. The final full version would cost about 1M$. Compare that to ASCI White at 12 Tflops for 100M$. The drawback, of course, is that the GRAPE only computes things similar to the gravitational N-body problem (which is also useful to the pharmaceutical industry).
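To put the price comparison in numbers, here is a rough sketch using only the figures quoted in this post. It assumes the full machine is simply four times the 32 Tflops prototype (since the prototype is described as 1/4 of the final configuration) and that both Tflops figures are peak numbers; neither assumption is confirmed here.

```python
# Price/performance from the figures quoted in the post.
# Assumption: the full GRAPE6 scales linearly to 4x the 32 Tflops prototype.

def dollars_per_tflop(cost_dollars, tflops):
    """Return the cost in dollars per Tflop of quoted performance."""
    return cost_dollars / tflops

PROTOTYPE_TFLOPS = 32.0
FULL_GRAPE6_TFLOPS = 4 * PROTOTYPE_TFLOPS   # assumed, not a measured figure
FULL_GRAPE6_COST = 1e6                      # "about 1M$"
ASCI_WHITE_TFLOPS = 12.0
ASCI_WHITE_COST = 100e6                     # "100M$"

grape6 = dollars_per_tflop(FULL_GRAPE6_COST, FULL_GRAPE6_TFLOPS)
asci_white = dollars_per_tflop(ASCI_WHITE_COST, ASCI_WHITE_TFLOPS)
ratio = asci_white / grape6

print(f"GRAPE6:     ${grape6:,.0f} per Tflop")
print(f"ASCI White: ${asci_white:,.0f} per Tflop")
print(f"ratio:      about {ratio:,.0f}x")
```

The ratio only says something about N-body-style workloads, since that is all the GRAPE hardware computes.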

Btw, I also spent some time in Akihabara on Sunday. I guess we're deprived on the US east coast: the number of DVD writers you can get here is amazing. Also very popular here seem to be all kinds of embedded units, e.g. the GPS in your car so you don't get lost in Tokyo!

There was an ABC news story earlier in the year on the GRAPE, but at the time it was running Alphas with their Unix. They have now fully switched to Linux, and this system has been running since July 5."

19 of 45 comments

  1. Re:Imagine... by Have+Blue · · Score: 2

    I can imagine it easily: Zero. The GRAPE units can't do RC5. Read the article.

  2. Re:sweet by garcia · · Score: 2

    do they have seeds?

    Having a vineyard would be quite the cluster of GRAPEs.

  3. Re:Configurable GRAPE? by DarkMan · · Score: 2
    It looks like the GRAPE boards should be re-configurable. Why'd they do it with custom chips?
    A whack of FPGAs should be pretty decent, and you could configure them for more than just the N-body gravitational problem.


    Because in silicon it's pretty much twice as fast as an FPGA gets? (EE's rule of thumb, admittedly referring to microwave apps.)

    Not only that, but why would they want to do something other than N-body gravitational problems? _You_ might, but there are a lot of such problems to do, and that's what this is designed for.
    --
  4. Re:32 Teraflops? Seems a tad high... by drudd · · Score: 2

    Ahh yes, that's right... I've never been given a chance to use ours, so I was speaking from what I remembered from the little reading I've done ;)

    Doug

    --
    Venn ist das nurnstuck git und Slotermeyer? Ya! Beigerhund das oder die Flipperwaldt gersput!
  5. Heard it... by cr0sh · · Score: 2

    ...through the GRAPEvine?

    Worldcom - Generation Duh!

    --
    Reason is the Path to God - Anon
  6. Re:32 Teraflops? Seems a tad high... by TMB · · Score: 2
    >I've never been given a chance to use ours

    Probably because we're not convinced that the lockout works properly on the GRAPE5s. I know it works well on the GRAPE3, but VE and MS have done some tests on the GRAPE5 where they've tried hammering it with 2 different jobs, and the results haven't been kosher.

    Anyone have experience with GRAPE5 and notice this? Anything that could be changed in the API? I suppose we could write a wrapper around the g5_open and g5_close calls that does additional locking, but that seems inelegant.
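    For what it's worth, the wrapper idea could look something like the sketch below. It is only an illustration: `open_board` and `close_board` are hypothetical stand-ins for whatever bindings you have to g5_open and g5_close, and the lock file path is a made-up convention that every job would have to agree on. It uses an advisory `flock`, so it only helps between cooperating jobs on the same host.

```python
import fcntl
import os
from contextlib import contextmanager


@contextmanager
def exclusive_grape(open_board, close_board, lock_path="/tmp/grape5.lock"):
    """Hold an exclusive advisory lock around the whole open/close window.

    open_board and close_board are placeholders for g5_open / g5_close;
    lock_path is an assumed convention, not part of any GRAPE API.
    """
    fd = os.open(lock_path, os.O_CREAT | os.O_RDWR, 0o666)
    try:
        # Blocks here until no other cooperating job holds the lock.
        fcntl.flock(fd, fcntl.LOCK_EX)
        open_board()
        try:
            yield
        finally:
            close_board()
    finally:
        # Closing the descriptor releases the lock if it was acquired.
        os.close(fd)
```

    A job would then wrap its whole GRAPE session in `with exclusive_grape(my_open, my_close): ...`, so a second job waits at the lock instead of hammering the board at the same time.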

    [TMB]

  7. Re:32 Teraflops? Seems a tad high... by TMB · · Score: 2

    Hi Doug! :-)=

    Just a clarification:

    (actually some of them do SPH (smoothed particle hydrodynamics) as well)

    The boards themselves don't do the SPH calculations. What they do is return neighbour lists for each particle, which reduces the load necessary to compute the hydro forces.
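    To make "neighbour list" concrete, here is a brute-force software sketch of the kind of result the host gets back. It says nothing about how the board computes it, and the single fixed search radius `h` is a simplifying assumption (SPH codes usually carry a smoothing length per particle).

```python
def neighbour_lists(positions, h):
    """For each particle, return the indices of all other particles closer than h.

    positions is a list of (x, y, z) tuples; h is an assumed global search radius.
    """
    h2 = h * h
    n = len(positions)
    lists = [[] for _ in range(n)]
    for i in range(n):
        xi, yi, zi = positions[i]
        for j in range(i + 1, n):
            xj, yj, zj = positions[j]
            d2 = (xj - xi) ** 2 + (yj - yi) ** 2 + (zj - zi) ** 2
            if d2 < h2:
                # Neighbourhood is symmetric, so record the pair both ways.
                lists[i].append(j)
                lists[j].append(i)
    return lists


particles = [(0.0, 0.0, 0.0), (0.5, 0.0, 0.0), (2.0, 0.0, 0.0)]
print(neighbour_lists(particles, 1.0))
```

    With the lists in hand, the host only has to evaluate hydro terms over each particle's neighbours rather than over all pairs.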

    [TMB]

  8. sweet by x-empt · · Score: 2

    Fast?

    Nope... the GRAPES are each one sweet system!

    --
    Ever need an online dictionary?
  9. Re:How is this possible? by teuben · · Score: 2

    Enough people did answer that. But in that vein I should comment that a colleague of mine added some assembler code (incidentally for the same N-body code). He's been using the 3DNow SIMD instruction set on the AMD directly, and was able to get about 2 billion PP interactions in 45 seconds, which translates to about 2 Gflop on a 1.2 GHz AMD (his math). With 8-10 such Athlons they could compete with the Grape5 in speed. Of course that's still far from the Grape6 speed. But depending on your problem and budget, you can still get pretty far with COTS.
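    A quick check of that arithmetic, as a sketch. The post does not say how many floating-point operations were counted per particle-particle interaction, so the 38 used below is an assumption (it is a count often used for this kind of benchmark); the script also prints what count an exact 2 Gflop would imply.

```python
# Figures from the post.
interactions = 2e9      # particle-particle interactions
seconds = 45.0

# Assumption: flops counted per interaction; not stated in the post.
FLOPS_PER_INTERACTION = 38

rate = interactions / seconds                    # interactions per second
gflops = rate * FLOPS_PER_INTERACTION / 1e9      # under the assumed count
implied = 2e9 / rate                             # count implied by exactly 2 Gflop

print(f"{rate / 1e6:.1f} million interactions per second")
print(f"{gflops:.2f} Gflop at {FLOPS_PER_INTERACTION} flops per interaction")
print(f"exactly 2 Gflop would mean {implied:.0f} flops per interaction")
```

    Either way the result lands in the neighbourhood of the "about 2 Gflop" quoted above.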

  10. Re:Attn: Beowulf cluster troll by sharkey · · Score: 3

    Actually, I think it would be called a "bunch" of GRAPEs.

    --
    "Outlook not so good." That magic 8-ball knows everything! I'll ask about Exchange Server next.
  11. Re:32 Teraflops? Seems a tad high... by RobertFisher · · Score: 3

    The key point in this analysis is that they get 32 TFlops in doing the gravity summation for a discrete particle simulation.

    If you have additional physics (hydrodynamics, etc.), that processing must happen on the workstation which is running the simulation. So the performance is ultimately bottlenecked by the workstation. In practice, Grape practitioners typically do not see anything close to the theoretical peak of their boards.

    Bob

    --
    Science, like Nature, must also be tamed, with a view turned towards its preservation.
  12. Re:32 Teraflops? Seems a tad high... by drudd · · Score: 3

    Read the article...

    Grape boards are highly specialized chips which do nothing but N^2 direct summation gravity force calculations (actually some of them do SPH (smoothed particle hydrodynamics) as well).

    You take a pc/sun/beowulf cluster and link it to a set of grape boards. You then send particles to the boards and get accelerations back.

    Doug

    --
    Venn ist das nurnstuck git und Slotermeyer? Ya! Beigerhund das oder die Flipperwaldt gersput!
  13. Re:Q: What do GRAPEs have in common with chess? by bentini · · Score: 3

    FPGA's are, unfortunately, slow.

    They're made with a worse process than custom chips. For inner loops, you want as fast as you can get. You pay for programmability, and if it's always the same task, special-purpose is best.

    It's like the difference between hand-assembled code and a compiler. You get it easier with the compiler, but hand-assembling can be better when you know the specifics.

    The n-body gravitational problem is going to be around for a while, so it makes sense to customize to it.

  14. what?!?! by 2Bits · · Score: 3
    The final full version would cost about 1M$. Compare that to the AsciWhite at 12 Tflop for 100M$.

    What??? A machine like that would cost one Microsoft? Either I have been sleeping thru all this time while inflation is running rampant, or M$ is not worth that much anymore.

  15. linux grape joke... by grammar+nazi · · Score: 3
    Q: What did the Grape9 say when it was crushed with my Windows number-factoring algorithm?

    Give up?

    A: Nothing, it just made a little Wine.

    --

    Keeping /. free of grammatical errors for ~5 years.
  16. :( by Drakula · · Score: 3

    Does this mean that other OSes (cough, Windows, cough) should have sour grapes?

    --
    "It's comin' back around again..." -RATM
  17. Re:32 Teraflops? Seems a tad high... by DarkMan · · Score: 4

    Read the article.

    With reference to the calculations they are doing, they are simply doing

    F_i = G * m_i * SumOverAll(j .NE. i) m_j * (x_j - x_i) / |x_j - x_i|^3

    They are doing this by custom hardware.

    This is not a general purpose computer.

    Despite what the blurb said, there are 96 independent units doing the calculation in each machine, to get the 32 TFlops across the system.

    There is a picture of an earlier model, which is about the size of one of my filing cabinets.

    Remember these are scientists, not marketing, making those claims. They expect to be asked to justify them - and they have.
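    For anyone who wants to see the sum spelled out, here is a plain software version of the pairwise calculation, written with the mass m_j of each attracting particle included and returning accelerations (the force on particle i is then m_i times its acceleration). This is only a sketch of the O(N^2) sum itself, not of how the custom hardware does it; the softening length `eps` is an optional extra I added, zero by default.

```python
import math


def accelerations(masses, positions, G=1.0, eps=0.0):
    """Direct-summation gravity.

    a_i = G * sum over j != i of m_j * (x_j - x_i) / |x_j - x_i|^3

    positions is a list of (x, y, z) tuples. eps is an assumed softening
    length; with eps=0 this is the plain Newtonian sum.
    """
    n = len(masses)
    acc = [[0.0, 0.0, 0.0] for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            dx = [positions[j][k] - positions[i][k] for k in range(3)]
            r2 = sum(d * d for d in dx) + eps * eps
            inv_r3 = 1.0 / (r2 * math.sqrt(r2))
            for k in range(3):
                acc[i][k] += G * masses[j] * dx[k] * inv_r3
    return acc


# Two unit masses one unit apart attract each other with equal and opposite accelerations.
print(accelerations([1.0, 1.0], [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]))
```

    The double loop is the whole cost: every particle needs every other particle, which is exactly the part that gets pushed into hardware.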
    --

  18. Here we go again... by Matt2000 · · Score: 5


    How slashdot slows scientific progress in the world:

    1. Oh look, an interesting story on academic research on slashdot.
    2. Oh look, a lovely link to those poor academics' website. Surely they have the $40k necessary to make a server that can handle the load from slashdot?
    3. Oh look, the reeking Sun Ultra 5 that they were using for web duties has burst into flame, destroying the lab and scaring a small puppy that lives in the lab next door.

    To hell with you slashdot for burning puppies.

    --

  19. Q: What do GRAPEs have in common with chess? by devphil · · Score: 5


    A: GRAPEs and chess-playing computers, such as the one that tackled Kasparov (Deep Blue?), both accomplish their opening-up-of-cans-of-mathematical-whoopass via the same approach: functions in the innermost loops are done via calls to special-purpose hardware cards. The rest is done with software.

    So, say I take a GRAPE, and replace its special N-body gravitational daughtercard with one containing a few FPGAs programmed for, say, RC5; now I have a cracking machine. And then reprogram the FPGA to do image manipulation instead; now I have a renderer to make my own Toy Story. And then reprogram the FPGA to do, etc, etc.

    Of course, I'm still lacking the software. So actually this post is mostly babbling. :-)

    --
    You cannot apply a technological solution to a sociological problem. (Edwards' Law)