Slashdot Mirror


PS3 Folding@Home Begins with Impressive Numbers

hansamurai writes "As we've previously discussed, the Folding@Home client is now available on the PS3, and some early results are already in. The total number of teraflops generated by PS3s has already exceeded all other OS contributions combined, and the entire project is heading towards one petaflop of distributed computing power. Stanford notes that its teraflops figure is calculated conservatively, so the total power could be understated. With the PS3 European release complete and the Folding client already available there, the number of users will continue to grow for the time being; let's hope that the project does not run out of work units to hand out. Kotaku has some numbers that are a few hours old, since the Stanford server is getting hit pretty hard with the renewed interest in the project."

4 of 114 comments

  1. Re:Very old numbers by OddThinking · · Score: 5, Informative

    I believe the numbers are being taken from this web site.

  2. Re:What about global warming? by drinkypoo · · Score: 5, Informative

    Last I heard, F@H was a feel-good novelty that is unlikely ever to produce any meaningful results.

    Where did you hear that? I don't know any details, but it's easy to find a voice of dissent from your view:

    ""For the most part, it's not that we're looking for a needle in a haystack, but we're looking for broad properties that require good statistics," said Vijay Pande, associate professor of chemistry at Stanford University. As one of the scientists behind the project, Pande is proud to say that Folding@home has actually provided useful information to the scientific community. SETI@home, however, has yet to discover a single alien transmission."

    ""These successes are documented in peer review journals. Over 50 papers have resulted from Folding@home," said Pande. He and his students collaborated with developers from Sony Computer Entertainment of America to build a Folding@home client for the PlayStation 3, but that wasn't really Pande's idea."

    (In-Depth: Sony, Stanford Experts Talk PS3 Folding@home)

    "Now, for the first time, a distributed computing experiment has produced significant results that have been published in a scientific journal. Writing in the advanced online edition of Nature magazine, Stanford University scientists Christopher D. Snow and Vijay S. Pande describe how they with the help of 30,000 personal computers successfully simulated part of the complex folding process that a typical protein molecule undergoes to achieve its unique, three-dimensional shape. Their findings were confirmed in the laboratory of Houbi Nguyen and Martin Gruebele scientists from the University of Illinois at Urbana-Champaign who co-authored the Nature study."

    (Folding@home Scientists Report First Distributed Computing Success)

    --
    "You're right," Fisheye says. "I should have set it on 'whip' or 'chop.'"
  3. What's so impressive here... by MarcoAtWork · · Score: 5, Interesting

    ... besides the number of PS3 owners that are running this? The PS3 seems to be significantly slower than the GPU client, for example:

    GPU: 41 TFLOPS, 697 CPUs
    PLAYSTATION®3: 346 TFLOPS, 14,138 CPUs

    So basically, per client, the GPUs are about 2.4x as powerful as the PS3s.
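
    The 2.4x figure can be checked with a few lines of arithmetic. This is only a sketch that divides the quoted totals by the quoted client counts; the dictionary layout and function name are illustrative, not anything the Folding@home stats page provides.

```python
# Figures quoted in the comment above (total TFLOPS and active client counts).
stats = {
    "GPU": {"tflops": 41, "clients": 697},
    "PS3": {"tflops": 346, "clients": 14138},
}

def gflops_per_client(entry):
    """Average throughput of one client in GFLOPS (1 TFLOPS = 1000 GFLOPS)."""
    return entry["tflops"] * 1000 / entry["clients"]

gpu = gflops_per_client(stats["GPU"])
ps3 = gflops_per_client(stats["PS3"])
ratio = gpu / ps3

print(f"GPU: {gpu:.1f} GFLOPS per client")
print(f"PS3: {ps3:.1f} GFLOPS per client")
print(f"GPU/PS3 ratio: {ratio:.1f}x")
```

    These are averages over whatever hardware happened to be reporting, so they say nothing about the peak capability of either platform.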

    --
    -- the cake is a lie
  4. Re:For 64bit floats, the PS3 is a powerhouse by adisakp · · Score: 5, Informative

    You obviously have no idea what you're talking about since you're wrong on every point you make. I'm a PS3 developer but all of the points I make below are well known, publicly released information.

    Cell is very optimized toward one data type for calculation: 64bit floats

    Wrong!!! Cell is optimized towards 4-float vectors of 32-bit floats. All the vector math operations on the SPU operate this way and it's capable of doing 8 32-bit operations per cycle (FMAC = 32-bit multiply + 32-bit add, 4 wide). On the other hand, 64-bit operations are scalar and non-pipelined. They take a minimum of 7 cycles for throughput and can have longer latency (13 clock cycles). The maximum 32-bit FP Single Precision (SP) rate is at least 28 times faster than the DP rate.
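
    The "28 times" figure follows from the numbers in the paragraph above. The sketch below takes those numbers at face value (a 4-wide single-precision FMA every cycle, one scalar double-precision FMA every 7 cycles); the constant names are made up for illustration and are not taken from any IBM or Sony documentation.

```python
# Assumptions taken from the comment, not from a datasheet:
SP_LANES = 4            # 32-bit floats per vector
FLOPS_PER_FMA = 2       # a fused multiply-add counts as one multiply plus one add
DP_CYCLES_PER_OP = 7    # minimum cycles between double-precision operations

# Single precision: one 4-wide FMA can issue every cycle.
sp_flops_per_cycle = SP_LANES * FLOPS_PER_FMA

# Double precision (as described in the comment): one scalar FMA every 7 cycles.
dp_flops_per_cycle = FLOPS_PER_FMA / DP_CYCLES_PER_OP

sp_to_dp_ratio = sp_flops_per_cycle / dp_flops_per_cycle
print(f"SP: {sp_flops_per_cycle} flops/cycle, SP/DP ratio: {sp_to_dp_ratio:.0f}x")
```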

    b) either:
    - fit code and data segments within 256K for each SPU
    - crunch long enough between streamed data blocks such that DMA latency doesn't kill performance


    Wrong!!! Actually, a single code processing step and its data should fit in considerably less than 256K, preferably around 128K. Then you can double-buffer your DMAs for input and output. When you do this, the DMA latency doesn't even matter unless your processing occurs faster than the DMA transfers, since your DMAs are completely asynchronous to the SPU processing. This simple method of programming can hide nearly all DMA latency -- especially for code that repetitively iterates on multiple data blocks.
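
    The double-buffering pattern described above can be sketched in portable Python. This is not SPU code: a background thread stands in for the asynchronous DMA engine, and `fetch_block`, `compute`, and `double_buffered` are hypothetical names chosen for the illustration. Only the input side is double-buffered here; a real kernel would treat output the same way.

```python
from concurrent.futures import ThreadPoolExecutor

def fetch_block(source, index, block_size):
    """Stand-in for an asynchronous DMA transfer from main memory."""
    start = index * block_size
    return list(source[start:start + block_size])

def compute(block):
    """Stand-in for the processing kernel: square every element."""
    return [x * x for x in block]

def double_buffered(source, block_size):
    """Process `source` block by block, fetching block i+1 while computing block i."""
    n_blocks = (len(source) + block_size - 1) // block_size
    results = []
    if n_blocks == 0:
        return results
    # A single worker thread plays the role of the DMA engine.
    with ThreadPoolExecutor(max_workers=1) as dma:
        pending = dma.submit(fetch_block, source, 0, block_size)
        for i in range(n_blocks):
            current = pending.result()  # wait until buffer i has arrived
            if i + 1 < n_blocks:
                # Kick off the next transfer before starting to compute.
                pending = dma.submit(fetch_block, source, i + 1, block_size)
            results.extend(compute(current))  # overlaps with the transfer in flight
    return results

print(double_buffered(list(range(10)), 4))
```

    As long as `compute` takes at least as long as `fetch_block`, the wait in `pending.result()` returns almost immediately after the first block, which is the latency-hiding effect the comment describes.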

    c) have the entire calculation broken down into no more than six parts for streaming (one per SPU)

    Wrong!!! You can break the calculation into many more parts than six. As a matter of fact you could have 100 calculation parts on one chunk of data and simply swap in new code and work on old data. You can arbitrarily schedule more than a single task per SPU. Sony (and even IBM on non-PS3 Cells) have libraries that allow you to share SPUs between many different tasks with only a very small overhead incurred in switching between tasks on the SPU.
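
    To make the "many tasks on six SPUs" point concrete, here is a toy scheduler. It is a sketch under invented assumptions (abstract cost units, a fixed per-task switch overhead, greedy least-loaded placement) and does not model the Sony or IBM task libraries mentioned above; `schedule` and its parameters are hypothetical.

```python
import heapq

def schedule(task_costs, n_spus=6, switch_overhead=1):
    """Greedily place each task on the currently least-loaded SPU.

    Returns (assignment, makespan), where assignment maps an SPU index to the
    list of task ids it runs and makespan is the finish time of the busiest SPU.
    """
    heap = [(0, spu) for spu in range(n_spus)]  # entries are (accumulated load, SPU index)
    heapq.heapify(heap)
    assignment = {spu: [] for spu in range(n_spus)}
    for task_id, cost in enumerate(task_costs):
        load, spu = heapq.heappop(heap)
        assignment[spu].append(task_id)
        # Every task pays a small cost for swapping its code onto the SPU.
        heapq.heappush(heap, (load + switch_overhead + cost, spu))
    makespan = max(load for load, _ in heap)
    return assignment, makespan

assignment, makespan = schedule([10] * 100)
print({spu: len(tasks) for spu, tasks in assignment.items()}, makespan)
```

    With 100 equal tasks, every SPU ends up with 16 or 17 of them, so the number of calculation parts is not tied to the number of SPUs.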

    Also, SPUs don't support a supervisor bit for memory protection

    Wrong!!! The SPUs can only directly access their own local memory. All other accesses go through a protected external memory interface (the SPU DMA to main memory) and are controlled by memory protection. It is possible to virtualize and lock out SPUs from the rest of the system and run them in "safe mode". If the SPUs could run rampant and access the entire memory, there wouldn't be much point in Sony running a hypervisor to keep you out of their system space on the PS3 Linux project while still giving you access to the SPU. Also, it wouldn't make much sense, from IBM's point of view, to use Cells as CPUs for clustered supercomputers without memory protection.

    bad things happen when threaded code running on SPU goes tits up

    Wrong!!! There is no reason in the Cell hardware why it shouldn't be possible to kill SPU threads / processes and have the SPU rescheduled by the OS if necessary. This way a single task can no more take down an SPU than it can the PPU. It is even possible to swap out entire SPU programs pre-emptively on the fly and restore their state. This incurs a much higher swap cost (full SPU threading) than a simpler task manager because you have to save and restore the entire context of the SPU (including 128 16-byte registers and the 256K memory region), but your implied limitation of the SPUs is definitely incorrect here.
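
    The size of that context is easy to estimate from the two items the paragraph names. The sketch below counts only the register file and the local store, taking 256K as 256 KiB; any other per-SPU state is left out, so treat the result as a lower bound.

```python
REGISTERS = 128                 # registers per SPU, as stated above
REGISTER_BYTES = 16             # each register is 16 bytes wide
LOCAL_STORE_BYTES = 256 * 1024  # the 256K memory region, assumed to mean 256 KiB

register_file_bytes = REGISTERS * REGISTER_BYTES
context_bytes = register_file_bytes + LOCAL_STORE_BYTES

print(f"register file: {register_file_bytes} bytes")
print(f"full context:  {context_bytes} bytes (~{context_bytes / 1024:.0f} KiB)")
```

    The local store dominates: the registers add only 2 KiB on top of the 256 KiB that has to be copied out and back in on a full pre-emptive swap.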

    If you want to calculate 128bit floats, ints, or have lots of branch logic... buy a quad core2duo

    So quad-core2duo can do 128-bit floats? If you're thinking SSE (4 x 32-bit floats), then the SPUs do the same thing. SPUs can run integer code, albeit more slowly since most of the integer operations are scalar, but running integer code in parallel on an SPU can still sometimes be faster than on a Core2Duo - If you align scalar-processed integers to 16-bits for preferred slot l