PS3 Folding@Home Begins with Impressive Numbers
hansamurai writes "As we've previously discussed, the Folding@Home client is now available on the PS3, and some early results are already in. The total number of teraflops generated by PS3s has already exceeded all other OS contributions combined, and the entire project is heading toward one petaflop of distributed computing power. Stanford notes that its teraflops figure is calculated conservatively, so the total power could be underestimated. With the PS3 European release complete and the Folding client already available there, the number of users will continue to grow for the time being; let's hope that the project does not run out of work units to pass out. Kotaku has some numbers that are a few hours old, since the Stanford server is getting hit pretty hard with the renewed interest in the project."
Gizmodo has more current numbers (which are also a little behind). Currently they're showing 346 TFLOPS for PS3s.
God Fucking Damnit
Remember all that energy we aren't supposed to be wasting?
Last I heard, F@H was a feel-good novelty that is unlikely ever to produce any meaningful results.
I don't need no instructions to know how to rock!!!!
As we've previously discussed, the Folding@Home client is now available on the PS3, and already some early results are in.
When will the SNES version finally be available?
"A door is what a dog is perpetually on the wrong side of" - Ogden Nash
Impressive, but I wonder if this interest among PS3 owners will drop off, especially when GTA IV comes out or they get next month's power bill.
Libertarian Leaning Political Discussion Forum.
... besides the number of PS3 owners that are running this? The PS3 seems to be significantly slower per unit than the GPU client, for example:
GPU: 41 TFLOPS, 697 CPUs
PLAYSTATION®3: 346 TFLOPS, 14138 CPUs
So basically each GPU is about 2.4x as powerful as each PS3.
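To check the 2.4x figure, here is a minimal Python sketch of the per-unit arithmetic. It uses only the TFLOPS totals and processor counts quoted above; the names `stats` and `gflops_per_unit` are made up for illustration.

```python
# Figures quoted in the comment above: (total TFLOPS, active processors).
stats = {
    "GPU": (41.0, 697),
    "PS3": (346.0, 14138),
}

def gflops_per_unit(tflops, units):
    """Average throughput of a single client, in GFLOPS."""
    return tflops * 1000.0 / units

per_unit = {name: gflops_per_unit(*values) for name, values in stats.items()}
ratio = per_unit["GPU"] / per_unit["PS3"]

for name, value in per_unit.items():
    print(f"{name}: {value:.1f} GFLOPS per unit")
print(f"GPU/PS3 per-unit ratio: {ratio:.2f}")
```

That works out to roughly 59 GFLOPS per GPU against roughly 24 GFLOPS per PS3, which is where the 2.4x comes from. The totals still favor the PS3 simply because there are about 20 times as many of them.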
-- the cake is a lie
Somehow, I doubt that people buying a $600 game system will care if their power bill goes up $1 (or $10 or $20) a month. Power is one of those things that most people ignore and simply pay unless it's completely out of whack. My commercial power bill sometimes fluctuates by as much as a hundred bucks a month, but even that's not enough to make it worth my time to figure out what might be causing it.
I don't respond to AC's.
Slashdot team #11326. Now go get your PS3 and start crunching numbers.
You obviously have no idea what you're talking about since you're wrong on every point you make. I'm a PS3 developer but all of the points I make below are well known, publicly released information.
Cell is very optimized toward one data type for calculation: 64bit floats
Wrong!!! Cell is optimized towards 4-float vectors of 32-bit floats. All the vector math operations on the SPU operate this way and it's capable of doing 8 32-bit operations per cycle (FMAC = 32-bit multiply + 32-bit add X 4 wide). On the other hand, 64-bit operations are scalar and non-pipelined. They take a minimum of 7 cycles for throughput and can have longer latency (13 clock cycles). The maximum 32-bit FP Single Precision (SP) rate is at least 28 times faster than the DP rate.
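The 28x figure follows from the numbers in the comment, as this rough Python sketch shows. The function name `sp_to_dp_ratio` and its parameters are illustrative only, and the `dp_lanes=2` call is an assumption not made in the comment, included just to show how sensitive the ratio is to the DP vector width.

```python
def sp_to_dp_ratio(sp_lanes=4, dp_lanes=1, dp_issue_interval=7):
    """Peak single- to double-precision throughput ratio for one SPU.

    A fused multiply-add counts as 2 floating-point operations per lane.
    """
    sp_flops_per_cycle = 2 * sp_lanes   # one SP FMAC instruction issued every cycle
    dp_flops_per_issue = 2 * dp_lanes   # one DP FMAC every dp_issue_interval cycles
    return sp_flops_per_cycle * dp_issue_interval / dp_flops_per_issue

print(sp_to_dp_ratio())            # scalar DP, as the comment describes it
print(sp_to_dp_ratio(dp_lanes=2))  # assumption: DP instructions that are 2 wide
```

With scalar DP issued once every 7 cycles the ratio is 28; if DP instructions were 2 wide it would halve to 14. Either way the point stands that the SPU is built for 32-bit floats, not 64-bit ones.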
b) either:
- fit code and data segments within 256K for each SPU
- crunch long enough between streamed data blocks such that DMA latency doesn't kill performance
Wrong!!! Actually, a single code processing step and data should fit in considerably less than 256K. Preferably around 128K. Then you can double-buffer your DMAs for input and output. When you do this the DMA latency doesn't even matter unless your processing occurs faster than the DMA transfers, since your DMAs are completely asynchronous to the SPU processing. This simple method of programming can hide nearly all DMA latency -- especially for code that repetitively iterates on multiple data blocks.
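The double-buffering pattern described here can be sketched in plain Python. This is not SPU code and uses no Cell SDK calls: `dma_fetch` and `process` are hypothetical stand-ins, and a background thread plays the part of the asynchronous DMA engine. Only input buffering is shown.

```python
from concurrent.futures import ThreadPoolExecutor

def dma_fetch(blocks, index):
    """Hypothetical stand-in for an asynchronous DMA transfer into local store."""
    return list(blocks[index])

def process(buffer):
    """Hypothetical compute kernel: here it just sums the block."""
    return sum(buffer)

def run_double_buffered(blocks):
    results = []
    if not blocks:
        return results
    with ThreadPoolExecutor(max_workers=1) as dma:
        pending = dma.submit(dma_fetch, blocks, 0)       # prime the first buffer
        for i in range(len(blocks)):
            current = pending.result()                   # wait for buffer i to arrive
            if i + 1 < len(blocks):
                # Start fetching block i + 1 into the other buffer.
                pending = dma.submit(dma_fetch, blocks, i + 1)
            results.append(process(current))             # compute overlaps the transfer
    return results

print(run_double_buffered([[1, 2, 3], [4, 5], [6]]))
```

Only two buffers are ever live (`current` and `pending`), which is why each one should take about half of the local store left over after code. As long as `process` takes at least as long as a fetch, the wait in `pending.result()` is close to zero.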
c) have the entire calculation broken down into no more than six parts for streaming (one per SPU)
Wrong!!! You can break the calculation into many more parts than six. As a matter of fact you could have 100 calculation parts on one chunk of data and simply swap in new code and work on old data. You can arbitrarily schedule more than a single task per SPU. Sony (and even IBM on non-PS3 Cells) have libraries that allow you to share SPUs between many different tasks with only a very small overhead incurred in switching between tasks on the SPU.
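As a loose illustration of running many more tasks than there are SPUs, here is a Python sketch in which 100 small tasks share a pool of six workers and all operate on the same chunk of data. It does not use the Sony or IBM libraries mentioned above; `NUM_SPUS`, `run_tasks`, and the toy tasks are all made up for the example.

```python
from concurrent.futures import ThreadPoolExecutor

NUM_SPUS = 6  # worker count taken from the "one per SPU" claim being answered

def run_tasks(tasks, data, workers=NUM_SPUS):
    """Apply every task to the same data chunk using a fixed pool of workers."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() hands the next task to whichever worker frees up first,
        # and returns results in the original task order.
        return list(pool.map(lambda task: task(data), tasks))

# 100 distinct "code parts", each a tiny function of the shared chunk.
tasks = [lambda chunk, k=k: sum(chunk) + k for k in range(100)]
results = run_tasks(tasks, [1, 2, 3])
print(len(results), results[0], results[-1])
```

The number of tasks is decoupled from the number of workers; the only cost of having more tasks is the per-task switch, which is the overhead the comment says is very small.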
Also, SPUs don't support a supervisor bit for memory protection
Wrong!!! The SPUs can only directly access their own local memory. All other accesses go through a protected external memory interface (the SPU DMA to main memory) and are controlled by memory protection. It is possible to virtualize and lock out SPUs from the rest of the system and run them in "safe mode". If the SPUs could run rampant and access the entire memory, there wouldn't be much point in Sony running a hypervisor to keep you out of their system space on the PS3 Linux project while still giving you access to the SPUs. Also, leaving out memory protection wouldn't make much sense from IBM's point of view, given that it uses Cells as CPUs for clustered supercomputers.
bad things happen when threaded code running on SPU goes tits up
Wrong!!! There is no reason on the Cell hardware why it shouldn't be possible to kill SPU threads / processes and have the SPU rescheduled by the OS if necessary. This way a single task can no more take down an SPU than it can the PPU. It is even possible to swap out entire SPU programs pre-emptively on the fly and restore their state. This incurs a much higher swap cost (full SPU threading) than a simpler task manager because you have to save and restore the entire context of the SPU (including 128 16-byte registers and the 256K memory region), but your implied limitation of the SPUs is definitely incorrect here.
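For a sense of scale, this small Python sketch adds up the context named in the comment: the register file plus the local store. It is a lower bound, since the comment says "including" and a full context would hold more state than these two items; the variable names are illustrative.

```python
REGISTERS = 128            # SPU registers, per the comment
REGISTER_BYTES = 16        # each register is 16 bytes (128 bits)
LOCAL_STORE_BYTES = 256 * 1024

register_file_bytes = REGISTERS * REGISTER_BYTES
context_bytes = register_file_bytes + LOCAL_STORE_BYTES

print(f"register file: {register_file_bytes} bytes")
print(f"minimum context: {context_bytes} bytes ({context_bytes / 1024:.0f} KiB)")
```

That is about 258 KiB to write out and the same again to read back on every full pre-emptive switch, which is why a lightweight task manager that leaves the local store in place is so much cheaper.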
If you want to calculate 128bit floats, ints, or have lots of branch logic... buy a quad core2duo
So quad-core2duo can do 128-bit floats? If you're thinking SSE (4 X 32-bit floats) then the SPUs do the same thing. SPUs can run integer code, albeit more slowly since most of the integer operations are scalar, but running integer code in parallel on an SPU can still sometimes be faster than a Core2Duo - If you align scalar-processed integers to 16-bits for preferred slot l
Well with the Wii, you can actually fold the proteins yourself using the innovative new motion-sensing controller!