TeraGrid v. Distributed Computing
Nevyan writes "After three years of development and nearly a hundred million dollars, the TeraGrid has been running at or above most people's expectations for such a daunting project. On January 23, 2004 the system came online and provided 4.5 teraflops of computing power to scientists across the country. However, the waiting list for TeraGrid is long, including a bidding process through the National Science Foundation's (NSF's) Partnerships for Advanced Computational Infrastructure (PACI), and many scientists with little funding but bright ideas are being left behind. While the list of supercomputer sites and peak power is growing, how is the world of Distributed Computing faring?"
The problem with using distributed computing for everything is that the number of people willing to let others use processing power on their computer is not infinite. It is a very large number, but eventually everyone who wants to (and knows how to) help out their favorite cause will have something already installed. In addition, the more useful endeavors that use distributed computing, the fewer users you will get for each, and only the 'interesting' projects will get many users. Who wants to use their computing power to analyze some boring old physics experiment when you could be finding aliens or curing cancer?
Distributed computing has its uses, but remember: the public will only be willing to help you as long as they feel like they're contributing to something worthwhile.
It's important to remember that the Grid is a _kind_ of distributed computing. But the main thing about The Grid (like The Internet, The Grid is basically TeraGrid in the US + European Data Grid) is that it is suitable for handing off parallel jobs with high intercommunication needs (i.e. MPI jobs). Not necessarily because these jobs can run across different nodes of the grid (though they can with MPI/Nexus or whatever it's called), but because each "node" in the Grid network is a HUGE MOFO LINUX CLUSTER or similar. The grid gives lots of physicists access, for their parallel processing jobs, to computing resources that would otherwise be sitting idle.
What /.ers generally mean by distributed computing is a bit different - most apps there are "embarrassingly parallel" ones you can just farm out. They don't need to chatter to each other, just process some data and send it back to Central.
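To make "embarrassingly parallel" concrete, here is a minimal sketch in Python. Everything in it is hypothetical: the function names, the work-unit size, and the sum-of-squares "science" are stand-ins, and local threads play the part of volunteer machines. The point is only that each unit is processed without talking to any other unit, and "Central" just combines what comes back.

```python
from concurrent.futures import ThreadPoolExecutor


def split_into_work_units(data, unit_size):
    """Cut the full data set into independent chunks (work units)."""
    return [data[i:i + unit_size] for i in range(0, len(data), unit_size)]


def process_unit(unit):
    """Stand-in for the real computation (assumption: sum of squares).

    Note that it needs nothing but its own chunk of data.
    """
    return sum(x * x for x in unit)


def run_project(data, unit_size=4, volunteers=3):
    """Farm the units out, then combine the partial results centrally."""
    units = split_into_work_units(data, unit_size)
    # Threads stand in for volunteer machines; no unit waits on another.
    with ThreadPoolExecutor(max_workers=volunteers) as pool:
        partial_results = list(pool.map(process_unit, units))
    return sum(partial_results)
```

Calling run_project(list(range(10))) gives the same answer no matter how many "volunteers" there are or in what order they finish. An MPI-style job is the opposite case: its pieces have to exchange data while they run, which is why it wants a tightly connected cluster.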
Google's distributed OS has been discussed a lot on Slashdot, but it is more than just a search algorithm on their own servers:
Google Compute is a feature of the Google Toolbar that enables your computer to help solve challenging scientific problems when it would otherwise be idle. When you enable Google Compute, your computer will download a small piece of a large research project and perform calculations on it that will then be included with the calculations performed by thousands of other computers doing the same thing. This process is known as distributed computing.
The first beneficiary of this effort is Folding@home, a non-profit academic research project at Stanford University that is trying to understand the structure of proteins so they can develop better treatments for a number of illnesses. In the future Google Compute may allow you to also donate your computing time to other carefully selected worthwhile endeavors, including projects to improve Google and its services.
- The Google Compute Project
The BOINC platform (that seti@home is switching over to) has the ability to divide work between projects as you suggest. Though I'm not really sure that there are very many other projects running on it.
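As a rough illustration of what dividing work between projects could look like, here is a small Python sketch that splits a machine's idle hours in proportion to per-project weights. This is an assumption for illustration only, not BOINC's actual scheduler or API; the function name, the weights, and the project names are all made up.

```python
def divide_cpu_time(shares, total_hours):
    """Split total_hours among projects in proportion to their weights.

    shares: dict mapping a (hypothetical) project name to a positive weight.
    Returns a dict mapping each project name to its hours.
    """
    total_weight = sum(shares.values())
    if total_weight <= 0:
        raise ValueError("at least one project needs a positive weight")
    return {
        name: total_hours * weight / total_weight
        for name, weight in shares.items()
    }
```

For example, divide_cpu_time({"aliens": 100, "proteins": 300}, 8) gives the first project 2 hours and the second 6. A real client would also have to deal with deadlines and work-unit sizes, which this ignores.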
The idea of payment for work units is interesting. While it would certainly provide incentive for participating in distributed computing projects, I can see two problems with it already:
1) Getting the money to pay people. One advantage of distributed computing is that you don't have to pay for time on an expensive cluster. That advantage disappears when you pay distributed computing users. Of course, it may still turn out to be cheaper, and there may be users willing to participate for free.
2) Botnets and profit. We all know of spammers using zombies to peddle goods, and of script kiddies using them to DDoS. What if some enterprising but immoral person decided to use the computing power of his zombies to profit off of the distributed computing payments? With enough zombies, he could easily make a good amount of money off of other people's computers.
I don't understand why we are asking how a hammer is doing compared to a screwdriver. Both are computational models, and the titles in "TeraGrid v. Distributed Computing" are at best architectural descriptions. They have specific application domains and are used to solve different types of problems. One deals with non-discrete data and experimental calculations (TeraGrid); the other focuses on discrete chunks of data being filtered or rendered, with no dependence on timing or on messages between chunks (Distributed Computing, as defined by Nevyan's reference). You have two tools in your tool chest. What makes one better than the other? They tackle completely different jobs. They both will be successful. They need not be in competition.
It is based on Apple's XGrid, and uses volunteers from the Mac community here at NCSU, as well as some of the lab Macs. Soon we will hopefully have official Linux and Windows clients, maybe even Solaris, to run on more of the computers around campus.
There is even a really nice web interface that shows the active nodes and their status, as well as the aggregate power of the two clusters.
It's really nice: anyone who is part of the grid can just fire up the controller and submit a job. I am a part of the lower-power grid since my TiBook is only a 667, but I was able to connect up and do the Mandelbrot Set thing that comes with XGrid at a level equal to around 7 or 8 GHz.
There are some screenshots here
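For anyone wondering why the Mandelbrot set makes such a handy grid demo: every pixel (and so every row) can be computed without knowing anything about the others. The Python sketch below is not the XGrid sample or its API; the function names, image bounds, and iteration limit are assumptions, and local threads stand in for grid nodes.

```python
from concurrent.futures import ThreadPoolExecutor


def escape_time(c, max_iter=50):
    """Count iterations of z = z*z + c before |z| exceeds 2."""
    z = 0j
    for n in range(max_iter):
        if abs(z) > 2.0:
            return n
        z = z * z + c
    return max_iter


def render_row(y, width, height, max_iter=50):
    """Compute one image row; it depends on no other row."""
    # Assumed view: real axis -2..1, imaginary axis -1..1.
    im = -1.0 + 2.0 * y / (height - 1)
    return [
        escape_time(complex(-2.0 + 3.0 * x / (width - 1), im), max_iter)
        for x in range(width)
    ]


def render(width, height, workers=4):
    """Hand each row to a worker; pool.map keeps the rows in order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(lambda y: render_row(y, width, height),
                             range(height)))
```

Because the rows are independent, adding nodes scales the job up almost for free, which is presumably how a pile of modest Macs can add up to an aggregate figure well beyond any single machine.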
e to the pi i plus one equals zero