Slashdot Mirror


PS3 Folding@Home Begins with Impressive Numbers

hansamurai writes "As we've previously discussed, the Folding@Home client is now available on the PS3, and already some early results are in. The total number of teraflops generated by PS3s has already exceeded all other OS contributions combined, and the entire project is heading towards one petaflop of distributed computing power. Stanford notes that its teraflops figure is calculated conservatively, so the total power could be underestimated. With the PS3 European release complete and the Folding client already available there, the number of users will continue to grow for the time being; let's hope that the project does not run out of work units to pass out. Kotaku has some numbers that are a few hours old, since the Stanford server is getting hit pretty hard with the renewed interest in the project."

14 of 114 comments

  1. Very old numbers by cxreg · · Score: 4, Informative

    Gizmodo has more current numbers (which are also a little behind). Currently they're showing 346 TFLOPS for PS3s.

    1. Re:Very old numbers by OddThinking · · Score: 5, Informative

      I believe the numbers are being taken from this web site.

  2. Re:What about global warming? by drinkypoo · · Score: 5, Informative

    Last I heard, F@H was a feel-good novelty that is doubtful to ever produce any meaningful results.

    Where did you hear that? I don't know any details, but it's easy to find a voice of dissent from your view:

    "'For the most part, it's not that we're looking for a needle in a haystack, but we're looking for broad properties that require good statistics,' said Vijay Pande, associate professor of chemistry at Stanford University. As one of the scientists behind the project, Pande is proud to say that Folding@home has actually provided useful information to the scientific community. SETI@home, however, has yet to discover a single alien transmission."

    "'These successes are documented in peer review journals. Over 50 papers have resulted from Folding@home,' said Pande. He and his students collaborated with developers from Sony Computer Entertainment of America to build a Folding@home client for the PlayStation 3, but that wasn't really Pande's idea."

    (In-Depth: Sony, Stanford Experts Talk PS3 Folding@home)

    "Now, for the first time, a distributed computing experiment has produced significant results that have been published in a scientific journal. Writing in the advanced online edition of Nature magazine, Stanford University scientists Christopher D. Snow and Vijay S. Pande describe how they, with the help of 30,000 personal computers, successfully simulated part of the complex folding process that a typical protein molecule undergoes to achieve its unique, three-dimensional shape. Their findings were confirmed in the laboratory of Houbi Nguyen and Martin Gruebele, scientists from the University of Illinois at Urbana-Champaign who co-authored the Nature study."

    (Folding@home Scientists Report First Distributed Computing Success)

    --
    "You're right," Fisheye says. "I should have set it on 'whip' or 'chop.'"
  3. Re:What about global warming? by cxreg · · Score: 4, Informative

    I don't pretend to understand all of the science, but I assume the people at Stanford know what they're doing. The results page has some things that sound interesting.

  4. Re:What about global warming? by usrusr · · Score: 2, Informative

    Folding on an idle PC that is used for CPU-easy things like typing /. comments is one thing: the PC would be running anyway. But I don't (yet; this is likely to change with the increasing media-center role game consoles are supposed to take) see why a PS3 should be running at times when it is not under full CPU demand (e.g. gaming). It should be either full-throttle gaming or power off. Or can it still fold in the background while emulating PS2 games?

    An exception is the scenario where somebody would be heating electrically anyway: then it's always a good idea to pipe the energy through some circuits before enjoying the inevitable temperature increase. If you shake up that enthalpy, shake it up in the most funky way you can. But then it's still much better to avoid electrical heating altogether...

    --
    [i have an opinion and i am not afraid to use it]
  5. Re:What about global warming? by OddThinking · · Score: 2, Informative

    I hate to repeat a post, but for additional information:

    In case anyone is wondering about what the project has achieved so far, here is the link.

    Concerning global warming, the processing statistics imply the PS3 is by far the most efficient. At 380 watts, using the statistics given (which are said to be conservative in the case of the PS3), that puts the PS3 at 63 teraFLOPS/megawatt, or 16.5 kilowatts/teraFLOPS. I'm not really familiar with this, but isn't that fairly good? It's definitely better than using PCs. Blue Gene/L, which is supposed to be very efficient, will deliver 240,000 FLOPS/watt, or about 0.24 teraFLOPS/megawatt. My calculations may be off, but that suggests the PS3 is highly efficient and a better use of power than a supercomputer.

    I'm sure I'm missing some important considerations, so can someone throw a little knowledge at this?
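
    The unit conversions in this comment can be checked with a short sketch. It takes the quoted inputs (63 teraFLOPS/megawatt and 240,000 FLOPS/watt) at face value and only verifies the arithmetic between units, not the figures themselves; the function names are made up for illustration.

```python
def tflops_per_mw_to_kw_per_tflops(tflops_per_mw):
    """Invert an efficiency figure: TFLOPS per megawatt -> kilowatts per TFLOPS."""
    # 1 MW = 1000 kW, so kW/TFLOPS = 1000 / (TFLOPS/MW).
    return 1000.0 / tflops_per_mw


def flops_per_watt_to_tflops_per_mw(flops_per_watt):
    """Convert FLOPS/W to TFLOPS/MW (1e6 W per MW, 1e12 FLOPS per TFLOPS)."""
    return flops_per_watt * 1e6 / 1e12


# Inputs are the figures quoted in the comment, not independently verified.
ps3_kw_per_tflops = tflops_per_mw_to_kw_per_tflops(63)
blue_gene_tflops_per_mw = flops_per_watt_to_tflops_per_mw(240_000)
```

    On these inputs the reciprocal of 63 teraFLOPS/megawatt comes out near 15.9 kilowatts/teraFLOPS, a little under the 16.5 quoted, so the two PS3 figures agree only approximately. The Blue Gene/L conversion to 0.24 teraFLOPS/megawatt is consistent with the FLOPS/watt figure as stated.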

  6. Different work units? by pavon · · Score: 2, Informative

    When the GPU client first came out, it was pointed out that it was actually using different work units than the normal PC version and so the numbers weren't directly comparable. I don't know what the situation is for the PS3, but it may not be using the same work units as either the GPU version or the PC version, and thus may not be directly comparable to either.

  7. /. team number by 4g1vn · · Score: 3, Informative

    Slashdot team #11326. Now go get your PS3 and start crunching numbers.

  8. Re:For 64bit floats, the PS3 is a powerhouse by adisakp · · Score: 5, Informative

    You obviously have no idea what you're talking about since you're wrong on every point you make. I'm a PS3 developer but all of the points I make below are well known, publicly released information.

    Cell is very optimized toward one data type for calculation: 64bit floats

    Wrong!!! Cell is optimized towards 4-float vectors of 32-bit floats. All the vector math operations on the SPU operate this way and it's capable of doing eight 32-bit operations per cycle (FMAC = 32-bit multiply + 32-bit add X 4 wide). On the other hand, 64-bit operations are scalar and non-pipelined. They take a minimum of 7 cycles for throughput and can have longer latency (13 clock cycles). The maximum 32-bit FP Single Precision (SP) rate is at least 28 times faster than the DP rate.
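
    The "28 times" figure follows from the numbers in this paragraph, if they are taken at face value: eight single-precision operations per cycle against one scalar double-precision multiply-add (two operations) every 7 cycles. A minimal sketch of that arithmetic, with constant names invented for illustration:

```python
# All figures below are the ones stated in the comment, not measured values.
SP_LANES = 4                 # 4-wide vector of 32-bit floats
SP_OPS_PER_LANE = 2          # fused multiply + add
DP_OPS_PER_ISSUE = 2         # one scalar multiply-add, as the comment describes it
DP_CYCLES_PER_ISSUE = 7      # minimum throughput quoted for 64-bit operations


def sp_to_dp_ratio():
    """Ratio of peak SP operations per cycle to peak DP operations per cycle."""
    sp_ops_per_cycle = SP_LANES * SP_OPS_PER_LANE
    # DP rate is DP_OPS_PER_ISSUE / DP_CYCLES_PER_ISSUE ops per cycle;
    # dividing by it is the same as multiplying by cycles and dividing by ops.
    return sp_ops_per_cycle * DP_CYCLES_PER_ISSUE / DP_OPS_PER_ISSUE


ratio = sp_to_dp_ratio()
```

    The result depends entirely on treating 64-bit operations as scalar, as the comment does; a different count of double-precision operations per issue would change the ratio proportionally.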

    b) either:
    - fit code and data segments within 256K for each SPU
    - crunch long enough between streamed data blocks such that DMA latency doesn't kill performance


    Wrong!!! Actually, a single code processing step and data should fit in considerably less than 256K. Preferably around 128K. Then you can double-buffer your DMA's for input and output. When you do this the DMA latency doesn't even matter unless your processing occurs faster than the DMA transfers, since your DMA's are completely asynchronous to the SPU processing. This simple method of programming can hide nearly all DMA latency -- especially for code that repetitively iterates on multiple data blocks.
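
    The double-buffering pattern described here can be sketched in ordinary Python. This is not SPU code and uses no Cell API: a one-thread pool stands in for the asynchronous DMA engine, and `fetch_block` and `process_block` are hypothetical placeholders for the transfer and the compute step.

```python
from concurrent.futures import ThreadPoolExecutor


def fetch_block(blocks, index):
    """Stand-in for an asynchronous DMA transfer into local store (hypothetical)."""
    return list(blocks[index])


def process_block(block):
    """Stand-in for the compute step on one block (hypothetical): sum of squares."""
    return sum(x * x for x in block)


def double_buffered_sum(blocks):
    """Process each block while the next one is already being fetched."""
    if not blocks:
        return 0
    total = 0
    with ThreadPoolExecutor(max_workers=1) as dma:
        pending = dma.submit(fetch_block, blocks, 0)      # prime the first buffer
        for i in range(len(blocks)):
            current = pending.result()                    # wait for the in-flight transfer
            if i + 1 < len(blocks):
                # Start filling the other buffer before computing on this one.
                pending = dma.submit(fetch_block, blocks, i + 1)
            total += process_block(current)               # compute overlaps the transfer
    return total
```

    Calling `double_buffered_sum([[1, 2], [3], [4, 5]])` sums the squares of all five values. The point of the structure is that only two blocks are ever resident at once, which is why each one should take roughly half the available memory.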

    c) have the entire calculation broken down into no more than six parts for streaming (one per SPU)

    Wrong!!! You can break the calculation into many more parts than six. As a matter of fact you could have 100 calculation parts on one chunk of data and simply swap in new code and work on old data. You can arbitrarily schedule more than a single task per SPU. Sony (and even IBM on non-PS3 Cells) have libraries that allow you to share SPUs between many different tasks with only minimal overhead incurred in switching between tasks on the SPU.
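
    As a rough illustration of running many more tasks than there are SPUs, the sketch below queues 100 small tasks over six workers. A Python thread pool is only an analogy here; it is not the Sony or IBM task library mentioned above, and `run_tasks` and the task functions are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor

NUM_SPUS = 6  # number of SPUs assumed in the parent comment


def run_tasks(tasks, data):
    """Run every task on the same chunk of data, sharing NUM_SPUS workers."""
    with ThreadPoolExecutor(max_workers=NUM_SPUS) as pool:
        # map() hands the next queued task to whichever worker frees up first
        # and returns results in task order.
        return list(pool.map(lambda task: task(data), tasks))


# 100 different "code parts" applied to one chunk of data.
tasks = [lambda chunk, k=k: sum(chunk) * k for k in range(100)]
results = run_tasks(tasks, [1, 2, 3])
```

    The number of workers stays fixed at six while the number of tasks is arbitrary, which is the distinction the reply is drawing.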

    Also, SPUs don't support a supervisor bit for memory protection

    Wrong!!! The SPU's can only directly access their own local memory. All other accesses go through a protected external memory interface (the SPU DMA to main memory) and are controlled by memory protection. It is possible to virtualize and lock out SPU's from the rest of the system and run them in "safe mode". If the SPU's could run rampantly and access the entire memory there wouldn't be much point to Sony running a hypervisor to keep you out of their system space on the PS3 linux project and still give you access to the SPU. Also, from IBM's point of view it wouldn't make much sense to use Cells as CPUs for clustered supercomputers without memory protection.

    bad things happen when threaded code running on SPU goes tits up

    Wrong!!! There is no reason in the Cell hardware why it shouldn't be possible to kill SPU threads / processes and have the SPU rescheduled by the OS if necessary. This way a single task can no more take down an SPU than the PPU. It is even possible to swap out entire SPU programs pre-emptively on the fly and restore their state. This incurs a much higher swap cost (full SPU threading) than a simpler task manager because you have to save and restore the entire context of the SPU (including 128 16-byte registers and the 256K memory region), but your implied limitation of the SPU's is definitely incorrect here.

    If you want to calculate 128bit floats, ints, or have lots of branch logic... buy a quad core2duo

    So quad-core2duo can do 128-bit floats? If you're thinking SSE (4 X 32-bit floats) then the SPU's do the same thing. SPU's can run integer code, albeit more slowly since most of the integer operations are scalar, but running integer code in parallel on an SPU can still sometimes be faster than a Core2Duo - If you align scalar-processed integers to 16-bits for preferred slot l

  9. GPU performance by Panzergheist · · Score: 2, Informative

    Also, since I haven't seen anyone mention this yet, the GPU clients on the F@H site are all ATI X1900s. The work units performed by GPU clients and Cell clients are of a different type than those performed by general purpose CPUs. Check the F@H FAQs for more information.

  10. Re:For 64bit floats, the PS3 is a powerhouse by adisakp · · Score: 2, Informative

    SPU's do have local memory access. However, local memory access is not the same as unprotected memory access. All accesses to external memory (DMA) can be protected. DMA commands use the same type of translation and protection governed by the page and segment tables of the Power Architecture as the PPU (indeed there is MMU management for each Memory Flow Controller for each SPU). I suggest you read this paper to see the memory management on the SPU: http://www-128.ibm.com/developerworks/power/library/pa-celldmas/

    The SPU's can run in a completely safe and sand-boxed environment with actual SPU threads and processes. In fact, you can set a secure "vault" mode in which the SPU is completely disconnected from external access and it's possible to ensure that SPU's are not capable of touching anything other than what is in their local memory (and nothing external can touch the SPU's local store in this mode).

    If something goes wrong during normal SPU operation (i.e. access to a protected area), it's possible to kill the current SPU process and restart the SPU with a new process -- neither that specific SPU nor the Cell processor needs to be hung up by some bad SPU code. In this respect, it's possible to build a secure OS that allows SPU access. If something bad happens on an SPU, you can kill the offending process -- just like if something goes bad on a single core of a Core2Duo processor, the linux kernel can kill the process. The machine won't go "tits up" -- just your process will get killed. But this is true of any bad program on any "secure" OS.

    I don't know what you're claiming as a "supervisor" bit for memory protection. Perhaps you're referring to read/write/execute (RWX) page settings in the page table (and cached in the TLB) with virtualized memory access. The SPU's do not have a TLB for local store (although it is mapped into the PPU address space and that TLB). However, they can set up "privileged" areas in local memory (and local register address space - i.e. DMA regs) that can be used to accomplish memory protection for local store, and access to non-allowed areas can cause an exception which the OS can then deal with appropriately. Additionally, there is a Local Store Limit Register (LSLR) to mask the available region of local store. Now within the normally accessible area of local store, the SPU can store instructions and execute them, allowing for self-modifying code. Before DEP on windows (and/or if your CPU is more than a year or two old and doesn't support execution page protection), you could make self-modifying code run in any region of writeable memory -- one reason why windows buffer overflows are so easy to exploit.

    As far as the "this is how we program SPU's" goes, sure, you can program stuff to run only 6 fixed SPU tasks with no swapping and use 256K data blocks (minus code overhead) so you have DMA latencies. However, by stating those as limitations you implied that was the *ONLY* way to program them, which is wrong. All the papers out there from IBM and Sony suggest PREFERRED METHODS which are different than your implied limitation. They advocate cooperatively running multiple small tasks (dozens or even hundreds) rather than dedicating cores to fixed tasks, and double-buffering data (or even code) to mask DMA latencies. Singular task SPU usage can lead to idle SPU's very easily. Additionally, the double-buffer method can hide DMA latencies, which can double the speed of your code if you're roughly equal on memory (DMA) and SPU compute time. We prefer to program SPU's in the manner in which they will be utilized much more fully and run up to 2X faster. I think anyone rewriting code for the SPU should do the same (since you have to retarget for the SPU anyway, there's no reason to write your code in a manner that will deliberately underperform).
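
    The "up to 2X faster" claim can be illustrated with a simple timing model. This is a back-of-the-envelope sketch, not a measurement: it assumes fixed per-block DMA and compute times, ignores output transfers and switching overhead, and the function names are invented.

```python
def serial_time(n_blocks, dma, compute):
    """Each block is fetched, then processed, with no overlap."""
    return n_blocks * (dma + compute)


def double_buffered_time(n_blocks, dma, compute):
    """The fetch of block i+1 overlaps the compute of block i."""
    if n_blocks == 0:
        return 0
    # First fetch cannot be hidden; after that each step costs whichever of
    # the two is slower; the final compute has no fetch left to overlap.
    return dma + (n_blocks - 1) * max(dma, compute) + compute


blocks = 1000
speedup = serial_time(blocks, 1, 1) / double_buffered_time(blocks, 1, 1)
```

    With equal DMA and compute times the speedup approaches 2 as the number of blocks grows. When one of the two dominates, the gain shrinks toward 1, which matches the "roughly equal" condition in the comment.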

  11. Re:Comparision of Ps3 vs. PC flawed. by Anonymous Coward · · Score: 1, Informative

    I still run folding at home, I support it, but my Bittorrents, my video tools, my firefox will all take away the precious cycles that Folding is after

    Even when you're not using your computer? I don't know about you, but when I'm not using my computer, the CPU idle time is 99% or so. F@H on PS3 only runs when nothing else is.

  12. Re:power bill by EjectButton · · Score: 4, Informative

    yeah most probably wouldn't notice
    according to this
    http://www.hardcoreware.net/reviews/review-356-2.htm
    the ps3 uses about 200watts maximum
    and if you look at the cost per kwh around the US http://www.eia.doe.gov/neic/brochure/electricity/electricity.html
    and round up for the sake of argument, so say you run it for 24 hours a day, you never play any games on it, and you are paying $0.10 /kwh, that's $14.60/month

    more realistically say you pay $0.10/kwh and only run f@h when you are asleep, so 8 hours a day, less than $5 a month more than you would have paid otherwise.
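
    the arithmetic above can be reproduced with a small sketch. it assumes an average month of 365/12 days, which is what makes the 24-hour case come out at $14.60; the function name is made up for illustration.

```python
def monthly_cost(watts, hours_per_day, dollars_per_kwh, days_per_month=365 / 12):
    """Electricity cost per month for a device drawing a constant load."""
    kwh_per_month = watts / 1000 * hours_per_day * days_per_month
    return kwh_per_month * dollars_per_kwh


# Figures from the comment: 200 W maximum draw, $0.10 per kWh.
always_on = monthly_cost(200, 24, 0.10)
overnight = monthly_cost(200, 8, 0.10)
```

    the overnight case is a third of the always-on cost, which is where the "less than $5 a month" comes from.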

  13. Re:What's so impressive here... by Phil+Wilkins · · Score: 2, Informative

    This is correct, although to achieve that, the GPU client requires twice as many FLOPs to process the same amount of work.