Introduction to Distributed Computing
dosten writes "ExtremeTech has a nice intro article on distributed and grid computing." Someday someone will successfully implement something like Progeny's NOW and all of these assorted hacks at building a distributed computing system will be superseded.
The whole thing
Rather than a popup ad per page.
if you are interessted in distributed computing over internet check out this url: http://www.aspenleaf.com/distributed/.
there is short description of all distributed computing projects plus lots of other stuff.
-- http://electronicintifada.net --
... like the dogma project at Brigham Young University is a distributed application system currently on used on a few thousand machines. It is written in pure Java, requires no persistant storage on the local machine, can be interrupted at any time, and is OS independant, to name a few things.
//TODO: Think of witty sig statement
First of all, be sure to check out the links at the end of the article to some of the projects that are going on right now. Some of the ones that I find more interesting are the Particle Physics Data Grid and the Access Grid (no link in article).
One of the great benefits of Grid computing over distributed computing is the access to resources, such as storage. This is what PPDG seeks to do, provide access to physicists, in near real time, to the results of experiments. The problem is that the experiments may be performed at CERN and the researcher may be at CalTech. While normally for a telnet or what not, this isn't a problem, it is a problem when an experiment can produce Petabytes of data. For more information on that see http://www.ppdg.org. There is another project called NEESGrid that will provide access to earthquake simulation equipment remotely. Truly cool.
I also encourage you to check out Globus. Using a system like the Globus Toolkit along with MDS, I can locate a machine and execute my program on it transparently. This transparency is taken care through a network of resource managers, proxys and gatekeepers. It's pretty cool and is pretty easy to install on your favorite Linux box.
Programming Grid enabled applications is pretty easy. There are software libraries called CoG Kits that provide simple APIs for Java, Python and a few other languages. In just a few lines of code you can have a program that looks up a server to run your executable on, connects, executes and returns the data to you.
The current push right now is towards OGSA which is Open Grid Services Architecture. This will form the basis for Globus 3.0. OGSA will take ideas from web services, like WSDL, service advertisement, etc, and implement them to create Grid services. This will be the next thing with services easily able to advertise themselves and clients easily able to find services.
My Slashdot account is old enough to drink...
You're missing the point. Distributed computing is not about only running on machines that aren't yours, but also efficiently utilizing the machines that are yours (or at least have easy access to).
Consider that a University of Wisconsin study showed that, on average, computers on desktops are idle at least 60% of the time. And that doesn't count the cycles burned lost between keystrokes --- I'm talking about extended periods of time. For example, almost all desktop machines are idle during nights. That's 50% already. Now add lunch time. Meetings, etc.
That's when systems like Condor come in. Researchers at Wisconsin got hundreds of years of CPU time on machines they already had without impacting others.
Coming back to your argument, the counter argument is that you may not even need to buy additional boxes --- just use the ones you already have more efficiently by utilizing distributed computing systems.
As far as "freeness" of processor cycles, let me tell you that the optimization researchers can soak up as much cpu as you can possibly throw at them. Also, if you look up Particle Physics Data Grid (PPDG) and GriPhyn, you'll find out that many distributed computing problems are I/O driven.
++Rajesh
These projects when described in the lay press nearly always skip over any analysis of the kinds of algorithms that can work well on a distributed system. The first metric to look at is the ratio of communication to computation. That is, how many bytes of data does a compute exchange with neighbor(s) before continuing with the next step of computation.
Render farms are embarrasingly parallel requiring no communication with neighbors while rendering a frame. They do require a large amount of data before starting on the next frame, but you can either pipeline that (which they don't do usually) or double up on the number of compute nodes (which is more common).
Suppose instead you want to solve a big mesh problem like a 3D cube with 10^10 points on a side. And its a fairly simple computation. You might need 10^5 or 10^6 nodes and the data traffic between nodes would look like a DOS attack if it took place on the internet.
And then there is the rich space of possibilities between these two extremes and the crossproduct with storage. It is a fascinating area to work in because there is much yet to learn and the possibilities for new networks and processors and storage evolves all the time. Things that were impossible to do last year are within reach this year or next year.
But.... just as 100 Volkswagon beetles may have the same horsepower as a huge earthmoving machine, the beetles cannot readily move mountains... and 100 or 1000 or 10000 PCs with a low-cost interconnect are not equal to a supercomputer or a supercluster that may support 10^6 greater communcations to computation ratio - and thus a much greater range of useful distribution algorithms.