Harvesting & Reusing Idle Computer Cycles
Hustler writes "More on the University of Texas grid project's mission to integrate numerous, diverse resources into a comprehensive campus cyber-infrastructure for research and education. This article examines the idea of harvesting unused cycles from compute resources to provide this aggregate power for compute-intensive work."
Does anyone realize that running a CPU at 100% takes more electricity than running a CPU at 10%?
"wasted compute cycles" aren't free. I would assert they're not even "wasted".
And neither are the compute cycles "reused", as the Slashdot article would have you believe.
How can you reuse something that was never used in the first place?
http://www.tomshardware.com/cpu/20050509/cual_core_athlon-19.html
60-100W difference between idle and full power consumption. That is not an insignificant amount of power.
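To put a rough number on that gap, here is a back-of-the-envelope sketch. It assumes the box would otherwise sit idle around the clock all year. The function names and the price per kWh are placeholders of mine, not figures from the article, so plug in your own rate.

```python
HOURS_PER_YEAR = 24 * 365  # assumption: the machine is powered on all year


def extra_kwh_per_year(extra_watts, hours=HOURS_PER_YEAR):
    """Energy used by the idle-to-full-load gap, in kWh per year."""
    return extra_watts * hours / 1000.0


def extra_cost_per_year(extra_watts, price_per_kwh):
    """Yearly cost of that gap; price_per_kwh is whatever your utility charges."""
    return extra_kwh_per_year(extra_watts) * price_per_kwh


if __name__ == "__main__":
    # The 60 W and 100 W ends of the range quoted above.
    for watts in (60, 100):
        print(watts, "W extra:", extra_kwh_per_year(watts), "kWh per year")
```

Under that always-on assumption, 60 W extra is about 526 kWh a year and 100 W is 876 kWh a year, per machine. Multiply by a campus full of desktops and it adds up.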
Q: Will Sun make Java Technology Open Source? A: Sun's goal is to make Java as open as possible and available to the largest developer community possible. We continue to move in that direction through the Java Community Process (JCP). Sun has published the Java source code, and developers can examine and modify the code. For six years we have successfully been striking a balance between sharing the technology, ensuring compatibility, and considering the needs of a growing installed base of more than 2.5 million Java developers who depend on us. We are certainly evolving Java through the JCP to a model that works for all involved but that also ensures compatibility. Cross-platform compatibility has always been the key to Java's success and integrity; a notion we feel was protected by Microsoft's agreement in January 2001 to settle the lawsuit regarding Java technology.
I take it that's a 'no.'
XML UI Browser/Platform
If you have extra Macs, you can with DVD studio and Shake. Look up qmaster.
Mod point free since 2001
How about we do something that's a little more practical and useful, such as finding new drugs that will cure cancer?
In theory? Yes, I'm pretty sure it's possible.
However, in practice, it's almost certainly more work than it's worth. You've got to have a LOT of code tracking what program wants what results, when it wants the results, etc. etc. Although it might work if they had large amounts of overlap, in other cases, chances are you'd spend a good deal more CPU power just doing the coordination than the sharing would save.
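A toy break-even model of that trade-off, just to make the argument concrete. Everything here is hypothetical: the function names, the idea of a flat coordination cost per job, and the numbers in the example are my own assumptions, not measurements from any real system.

```python
def net_cpu_saved(jobs, overlap_fraction, cost_per_job, coordination_cost_per_job):
    """CPU time saved by sharing results, minus the CPU time spent coordinating.

    overlap_fraction: share of jobs whose result some other program already has.
    coordination_cost_per_job: bookkeeping paid on every job, shared or not.
    """
    saved = jobs * overlap_fraction * cost_per_job
    overhead = jobs * coordination_cost_per_job
    return saved - overhead


def sharing_pays_off(jobs, overlap_fraction, cost_per_job, coordination_cost_per_job):
    """True only when the savings outweigh the coordination overhead."""
    return net_cpu_saved(jobs, overlap_fraction, cost_per_job,
                         coordination_cost_per_job) > 0


if __name__ == "__main__":
    # Large overlap: sharing wins. Small overlap: coordination eats the gain.
    print(sharing_pays_off(100, 0.50, 10.0, 1.0))
    print(sharing_pays_off(100, 0.05, 10.0, 1.0))
```

In this model sharing only wins when overlap_fraction * cost_per_job is bigger than the per-job coordination cost, which is the "large amounts of overlap" case.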
Err, not precisely. A system built around Intel's Pentium M draws 132 watts at maximum CPU load, and runs nearly as fast.
I've been buying AMD for about five years, but I think my next system will be a Pentium M. Just as soon as they're a bit cheaper...
--grendel drago
Laws do not persuade just because they threaten. --Seneca
Our new buses are the exact same - designed in CAD - no prototype phase - first production models were sold.
And they are shit.
Flimsy, awkward, handle like a drunken whale, weak brakes, and parts you *physically cannot get to*.
There is a very good reason for prototypes - you get to see what breaks *before* you invest in production tooling and large material and parts purchases.
They're gonna lose their ass on that...
Why can't I mod "-1 Idiot"?
I'd love to have a Java vm or net runtime that runs on the GPU in my video card.
That would let me run stuff both on the main cpu + the gpu since the GPU does almost nothing 99.999% of the time (I don't play FPS anymore).
First, I completely agree with you that it does depend a lot on what you're doing. For instance, last I heard _cycle for cycle_ the Pentium is still the king of integer work, such as chess. But the crown is different for flops... Raw clock is meaningless and you are highly misguided. Furthermore, MANY operations are multicycle, and I guarantee you they are used on anything mathematically intense enough to be worth sending out over the network.
Interestingly, you don't have to leave Intel to see this: the Celeron, "vanilla" P4, Xeon and Pentium M have a lot of good differences just within the current Intel x86 line. The Pentium M is awesome, for instance.
But I'll provide just a few examples of how a cycle is NOT a cycle.
Many of these only help you if you compile for that architecture or it does something fancy in the background to compensate - but you could certainly distribute a mixed exe that ran the appropriate binary for the platform.
- First, bitwidth:
64-bit addition takes 1 cycle on a 64-bit CPU, but roughly 3 or more on a 32-bit one. 64-bit multiplication is MUCH worse on a 32-bit machine. Similarly, 128-bit vector math is much cheaper on a G4 ("altivec") than on a CPU limited to 64 or 32 bits in that arena.
- registers: A CPU can only actually DO operations on values in registers. If you have more registers you can do much more complicated (longer-chained) operations without having to go to RAM or cache. This is intensely true on highly serial but complicated math and amazingly significant if the operation data actually fits in registers in one CPU and not in another.
- branch prediction and shorter pipeline depth. All other things being equal you want the shortest pipeline possible because it means you have the lowest branch prediction penalty. Coupled with the quality of your branch predictor, this makes a big difference. (Of course, things _aren't_ equal, and longer pipelines make it easier to physically build faster CPUs) Even if branch prediction is meaningless, the pipeline depth is still important.
- parallelization: _All_ modern computers let you run at least some commands in parallel using multiple CPUs, cores, hyperthreading and/or multiple processing units. Many computers come with two CPUs. Some newer CPUs come with two cores. Hyperthreading decreases the process switching penalty. Modern CPUs have separate integer and floating-point units, often more than 1 of each. Clearly the quantity and efficiency of these multiple units would make a big difference.
At an absolute minimum, all of these things help you run the OS without interfering too much with your actual work. But since we're talking about stuff that's already being distributed over a wide network to multiple computers, on some level this work is clearly parallelizable. Even if your second core can't help on your first 'chunk', you could likely be executing two chunks at nearly the same speed (barring other constraints listed here).
- cache (L1/L2/L3), cache prediction, RAM, bandwidth, chipsets. I'm not going to go into all the details, but suffice it to say that the cores need data and code to function, and unless your entire process fits in registers, they have to get it from somewhere. The arrangement of memory has a big impact on 1) how much work the CPU has to do to get information and 2) how much the CPU has to wait for that information.
- I/O - I know this is outside our case, and the CPU efficiency of IDE has increased dramatically, but there is still some variance from system to system and driver to driver. Furthermore, different network cards/drivers use significantly different amounts of CPU time to send large amounts of data. This is true even if the speed of execution is not I/O bound - it still takes some main processor clocks and the quantity varies.
Furthermore, this arbitrary driver code and any OS code - for instance - is definitely susceptible to traditional branch prediction, cache hits, etc - even if your main crunching loop did fit in registers.
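To make the bitwidth point concrete, here is a minimal sketch of what a 64-bit add looks like when all you have are 32-bit operations. It emulates the 32-bit registers with masked Python integers, so it says nothing about real cycle counts, and the function name is just mine for illustration. It only shows why one operation turns into several.

```python
MASK32 = 0xFFFFFFFF
MASK64 = 0xFFFFFFFFFFFFFFFF


def add64_with_32bit_ops(a, b):
    """Add two 64-bit values using only 32-bit-wide steps (wraps at 2**64)."""
    # Split each operand into low and high 32-bit halves.
    a_lo, a_hi = a & MASK32, (a >> 32) & MASK32
    b_lo, b_hi = b & MASK32, (b >> 32) & MASK32

    # Step 1: add the low halves, keeping only 32 bits.
    lo = (a_lo + b_lo) & MASK32
    # Step 2: if the result wrapped around, it is smaller than an input.
    carry = 1 if lo < a_lo else 0
    # Step 3: add the high halves plus the carry, again keeping 32 bits.
    hi = (a_hi + b_hi + carry) & MASK32

    return (hi << 32) | lo


if __name__ == "__main__":
    x, y = 0x00000001FFFFFFFF, 0x0000000000000001
    print(hex(add64_with_32bit_ops(x, y)))
    print(add64_with_32bit_ops(x, y) == (x + y) & MASK64)
```

That is three dependent steps where a 64-bit machine does one, and multiplication needs several partial products on top of that.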
I'm sure there's more, but I'm done for now.
Looking for freelance Actionscript (Flash/Flex) or ColdFusion work and/or freelance developers. Email me, put Slashdot