TeraGrid v. Distributed Computing
Nevyan writes "After three years of development and nearly a hundred million dollars, the TeraGrid has been running at or above most people's expectations for such a daunting project. On January 23, 2004, the system came online and provided 4.5 teraflops of computing power to scientists across the country. However, the waiting list for TeraGrid is long, including a bidding process through the National Science Foundation's (NSF) Partnerships for Advanced Computational Infrastructure (PACI), and many scientists with little funding but bright ideas are being left behind. While the list of supercomputer sites and peak power is growing, how is the world of Distributed Computing faring?"
The problem with using distributed computing for everything is that the number of people willing to let others use processing power on their computer is not infinite. It is a very large number, but eventually everyone who wants to help out their favorite cause, and knows how, will already have something installed. In addition, the more useful endeavors there are that use distributed computing, the fewer users you will get for each, and only the 'interesting' projects will get many users. Who wants to use their computing power to analyze some boring old physics experiment when they could be finding aliens or curing cancer?
Distributed computing has its uses, but remember: the public will only be willing to help you as long as they feel like they're contributing to something worthwhile.
If you can divide your problem into very many independent subproblems, clustering or distributed computing will work well. If not, your best bet is a true supercomputer.
So: SETI@Home splits up its scans into sections, none of which depends on any other; therefore, a distributed solution is efficient. However, the Earth Simulator deals with chaotic systems (or so I would assume), which cannot be split into independent pieces; this is where having hundreds of processors and terabytes of RAM and using something like NUMA is far more efficient.
In short: use the right tool for the job.
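To make the distinction above concrete, here is a minimal Python sketch. The function names, the toy "analysis" (taking a maximum), and the neighbor-averaging simulation are all illustrative assumptions, not how SETI@Home or the Earth Simulator actually work. The first workload consists of chunks that never look at each other, so they can be farmed out in any order; in the second, every step needs the neighbors' values from the previous step.

```python
from concurrent.futures import ThreadPoolExecutor


def analyze_chunk(chunk):
    """Hypothetical work unit: the result depends only on this chunk's data."""
    return max(chunk)


def run_independent(chunks, workers=4):
    # Each chunk could go to a different volunteer machine; no chunk waits
    # on another. A thread pool stands in for those machines here.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(analyze_chunk, chunks))


def run_coupled(state, steps):
    """Toy time-stepped simulation on a ring of cells.

    Each new cell value needs both neighbors' values from the previous
    step, so every step forces all workers to exchange data.
    """
    for _ in range(steps):
        n = len(state)
        state = [
            (state[(i - 1) % n] + state[i] + state[(i + 1) % n]) / 3
            for i in range(n)
        ]
    return state


if __name__ == "__main__":
    print(run_independent([[1, 5, 2], [7, 3], [4]]))
    print(run_coupled([3.0, 0.0, 0.0], steps=1))
```

In the first case the only communication is handing out chunks and collecting results. In the second, splitting the ring across machines would mean a network round trip at every step, which is why fast shared memory or a tight interconnect matters there.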
The problem with large projects like TeraGrid, the Earth Simulator and other supercomputer sites is that the underfunded _brilliant_ ideas are left behind by those who can afford to pay for or build these centers and sites.
While TeraGrid is a powerful tool, it is one that thousands of scientists and laboratories are standing in line to use. Meanwhile, Distributed Computing is available, cheap and relatively quick.
While it may look good on your project to say you used an IBM Blue Gene or DeepComp 6800, is it really worth the extra cost and the wait in line for your turn?
True Distributed Computing is the way to go and shows positive results. Now we just need to tinker with it some more!
I don't understand why we are asking how a hammer is doing compared to a screwdriver. Both are computational models, and the names in the title (TeraGrid v. Distributed Computing) are at best architectural descriptions. They have specific application domains and are used to solve different types of problems. One deals with non-discrete data and experimental calculations (TeraGrid); the other works on discrete chunks of data being filtered or rendered, with no time or message dependence between them (Distributed Computing, as defined by Nevyan's reference). You have two tools in your tool chest. What makes one better than the other? They tackle completely different jobs. They will both be successful. They need not be in competition.
4.5 teraflops for $100 million? Surely not. That much compute power can be had for 1/20th the price. What am I missing?
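The arithmetic behind that question, as a rough Python sketch. It assumes the whole "nearly a hundred million dollars" went to the 4.5 teraflops of hardware, which the submission does not actually say (the figure covers three years of development), and it takes the 1/20th estimate at face value.

```python
# Figures from the submission and the comment above; treating the full
# budget as hardware cost is an assumption made only for this comparison.
budget_usd = 100_000_000
peak_tflops = 4.5

cost_per_tflop = budget_usd / peak_tflops

# The commenter's claim: the same compute for 1/20th the price.
claimed_budget = budget_usd / 20
claimed_cost_per_tflop = claimed_budget / peak_tflops

print(f"TeraGrid: ${cost_per_tflop:,.0f} per teraflop")
print(f"Claimed:  ${claimed_cost_per_tflop:,.0f} per teraflop")
```

That works out to roughly $22 million per teraflop against roughly $1.1 million per teraflop, so the gap being asked about is about $95 million.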
1) Somebody does pay for time on an expensive cluster. They are built and maintained with your (and my) tax money.
2) Yes, security is a big issue in Grid computing. And it ain't there yet.