Sun Releases Grid 5.2 for Linux
Linux_ho writes "Grid 5.2 is a distributed processing engine that runs on Solaris, and now Linux. Apparently it has been released under "an industry-accepted open source license" but I couldn't find out which one. The product was designed to make use of the spare cycles from any idle Solaris or Linux machines on your network. Sun mentioned in the press release that it can be used for frame rendering, but I bet you can come up with some other interesting applications. Here's the FAQ."
You might want to check out MPI-6.3.2 (aka LAM) which has been around for a lot longer than Grid. MPI is a library for writing parallel programs that execute on groups of workstations.
Scroogle
It is very easy for each machine owner to restrict or prioritize which jobs will run on his machine, for each job to express a preference for certain machine attributes, and also for the queueing system to fairly distribute net CPU time across all active users of the system. All of this works using a very simple C-like language in which you express your desires.
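As a rough illustration of that kind of two-sided matching, here is a minimal Python sketch. It is not the system's own C-like language, and every name in it (`Machine`, `Job`, `accepts`, `requires`, `rank`, `match`, `next_job`) is hypothetical, made up for this example. The machine owner's policy and the job's requirements both have to agree, the job's preference picks among the survivors, and the queue hands the next slot to whichever user has consumed the least CPU time so far.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


@dataclass
class Machine:
    name: str
    attrs: Dict[str, object]
    # Owner policy (hypothetical): True if the owner allows this job here.
    accepts: Callable[["Job"], bool]


@dataclass
class Job:
    owner: str
    # Hard requirement (hypothetical): True if the machine is usable at all.
    requires: Callable[[Machine], bool]
    # Preference (hypothetical): higher score means a more desirable machine.
    rank: Callable[[Machine], float]


def match(job: Job, machines: List[Machine]) -> Optional[Machine]:
    """Return the job's favourite machine among those where both sides agree."""
    candidates = [m for m in machines if job.requires(m) and m.accepts(job)]
    if not candidates:
        return None
    return max(candidates, key=job.rank)


def next_job(waiting: List[Job], cpu_used: Dict[str, float]) -> Job:
    """Crude fair share: serve the user who has used the least CPU time."""
    return min(waiting, key=lambda j: cpu_used.get(j.owner, 0.0))


machines = [
    # This owner refuses anything submitted by "bob".
    Machine("desk1", {"os": "linux", "mem_mb": 128}, lambda j: j.owner != "bob"),
    Machine("desk2", {"os": "solaris", "mem_mb": 512}, lambda j: True),
    Machine("desk3", {"os": "linux", "mem_mb": 256}, lambda j: True),
]

# A job that needs Linux and prefers as much memory as it can get.
render = Job("alice", lambda m: m.attrs["os"] == "linux",
             lambda m: m.attrs["mem_mb"])
chosen = match(render, machines)
print(chosen.name if chosen else "no machine available")
```

A real scheduler would evaluate these expressions from text written by the users and would re-check them as machine load changes; the lambdas here just stand in for that.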
At least, it doesn't look that way to me, browsing Sun's documentation.
The nice thing about Mosix is that it automatically handles migrating any existing process to wherever it thinks the process will finish the quickest. Last time I played with Mosix, I put it on an old P-100 laptop with no L2 cache and a K6-2/350. When not connected to the network, using LAME to generate MP3s on the P-100 went at about 15% of "real-time" speed. Connected to the network, it went at about 85% since the process could be automatically migrated to the K6-2. (It goes at about 105% or so when run "natively" directly on the K6-2/350). I suspect it would have been an even more dramatic difference had I been running on something faster than a 10BaseT hub for the networking. Other than the kernel patch for Mosix, all of the software on both systems was standard "off the shelf" Linux programs. (Compiling, I noticed, also went substantially faster, using plain old GCC, for example).
It looks like with "gridware" you actually have to "submit a job" to the handler using a separate program. I can't tell for sure just browsing the documentation whether you can submit ANY process, but it does look like it has to be done 'manually' in any case.
Gridware looks pretty neat, but I get the impression it'll be of more use to technical people who need to distribute particular types of jobs, have the resources to set up a "compute farm", and have technical enough users to make good use of it. Other than the installation, Mosix doesn't seem to have this limitation (but on the other hand, Mosix is Linux only [I think even ix86 Linux only, but don't quote me on that] and requires patching the kernel).
Now, if Mosix would get a 2.4 patch out, I could get it set up again at home...
---
"They have strategic air commands, nuclear submarines, and John Wayne. We have this"
Hacker Public Radio is our Friend
The basic idea is to have a virtual machine (of sorts) that provides an API friendly to algorithm implementation (lots of math and data manipulation functions). The virtual machine can limit both CPU utilization and memory/disk usage by the actual distributed program. The program is written in a scripting language (grab your favorite one) that can be compiled on the fly. The API functions would be implemented in the fastest possible way for each platform.
You could design the virtual machine so that users could easily add programs to it for background execution. The client's security would be ensured by the resource limits enforced by the virtual machine and the lack of "dangerous" features in the scripting language.
I never was able to solve the data integrity issue in a satisfactory way, though. Rogue clients in this scheme could always submit bogus results to the server. That's not catastrophic, but it means that the distributed platform could not be used in an uncontrolled environment like the Internet. If anyone has some ideas on how to solve this problem, feel free to post or email me. (Or you could go patent them and maybe make yourself some money.)
Oh yeah, I also thought that "Hive" would be a cool name for such a program. :)
Which means buying a roomful of kit to build one out of. Grid is designed to run jobs on your existing hardware while it's idle - the rest of the time, they're all still general purpose, interactive workstations running regular applications. The Beowulfs of which I am aware use dedicated hardware.
Yes, I know it's from Sun so it's probably stable. I'd hate to see it crash.
Grid-Lock sucks.
C'mon guys, it's a Friday
Yep, I never spell check.
More incorrect spellings can be found he
Grid is a push-based system: it monitors the activity on a set of servers and pushes work to the more idle ones.
This is great, but I believe more in the Seti@home approach: let the idle servers pull work down.
Everyone who has worked with distributed computing knows that the application really has to be designed paying careful attention to the distribution model. How about a more generic solution, say an XML-based data and programming unit (in a language with multi-platform capabilities like Java or Perl) queued on a controlling server, with a farm of slave servers each pulling down a unit during idle time? It would be something similar to:
nice -19 jobpoller --controlhost=control.server.com
Picture this as a backend to a website processing CGI, etc.
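To make the pull model a bit more concrete, here is a small Python sketch. It is only an in-process stand-in: a `queue.Queue` plays the controlling server and threads play the slave machines, so there is no networking and no XML here. The names `worker`, `run_farm` and `is_idle` are made up for the example, and `jobpoller` above remains a hypothetical command.

```python
import queue
import threading


def worker(name, units, results, is_idle):
    """Pull one unit at a time for as long as this machine stays idle."""
    while is_idle():
        try:
            unit_id, func, arg = units.get_nowait()
        except queue.Empty:
            return  # nothing left to pull
        results.put((unit_id, name, func(arg)))


def run_farm(work, worker_names, is_idle=lambda: True):
    """Queue the work units, let the workers pull them, collect the results."""
    units = queue.Queue()
    for unit in work:
        units.put(unit)
    results = queue.Queue()

    threads = [
        threading.Thread(target=worker, args=(n, units, results, is_idle))
        for n in worker_names
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()

    # Map each unit id to its computed value; which worker ran it is ignored.
    done = {}
    while not results.empty():
        unit_id, _worker_name, value = results.get()
        done[unit_id] = value
    return done


def square(x):
    return x * x


if __name__ == "__main__":
    work = [(i, square, i) for i in range(5)]
    print(run_farm(work, ["slave1", "slave2", "slave3"]))
```

The point of the design is that the server never has to track who is idle: a busy machine simply stops asking, and the units it would have taken stay in the queue for someone else.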
Anyone interested in forming a subscription based distributed computing project with me drop me a mail...
.. if only.
Well, there's no mention of an interface for Juno distributed processing...
The GridEngine system from Sun is an LSF-type batch-queueing/load-balancing system. Sun bought GridWare, which was previously an independent German company, and is looking to bundle the GridEngine with its workstations, promoting the 'spare cycle' idea.
In answer to your question, yes, GridEngine can run anything. It isn't an MPI-type implementation which requires you to modify your code. GridEngine allows you to set up multiple execute resources, based on processor type, OS, memory, disk, I/O, runqueue usage, or really any heuristic you want to implement. You submit your job, with whatever resource requirements you need, and then GridEngine runs it on the available resource which meets your requirements. There's also a Q3-ish available product called GRD, which allows you to further allocate resources on a more policy-style basis. This piece will be a licensed add-on, but it provides the enterprise with the ability to divvy-up compute farm resources on the basis of users and groups, etc.
GridEngine also comes with a "grid-enabled" interactive tcsh, so you can have an interactive shell running which is actually spawning work all over the compute farm, as resources are available. There's also an "enabled" make, which does the same thing for builds.
It's pretty neat, but I think it's more effective in a dedicated compute-farm type of installation than a "let's use spare desktop cycles" kind of installation.
I guess this is news... maybe not, though, since the Condor Project has been available for a whole lot of platforms for quite some time now. (Yes, Linux is supported.)
"I know - let's make Quake...AGAIN! They just might be stupid enough to buy it..." (overheard at id)
Batch systems have been around a long time in the HPC world. Gridware was originally developed by GENIAS Software GmbH. GENIAS produced a batch scheduler called Codine, which was a commercial version of DQS. In fact, Sun's Grid Engine FAQ even states that Sun Grid Engine is a new name for CODINE.
Of course, DQS/Codine/Grid isn't the only batch-scheduling/cycle-scavenging game around. Other players are:
Many of these predate newcomers like SETI@home and Mosix by several years. Most also provide hooks into parallel computing APIs like MPI, PVM, OpenMP, or something similar.
Batch scheduling and cycle-scavenging are old concepts. Having wasted away my years as a graduate student submitting large quantum chem jobs to Crays, it's nice to see lots of groups continuing to squeeze every useful cycle out of existing hardware. Sun's recent announcements are just the latest update to an old product, not a new idea and not a Mosix/SETI rip-off.
My opinion is that this is the beginning of an enterprise computing paradigm that Sun hopes will give Java an edge in the desktop market, after Microsoft's 15-year reign.
Imagine an entire office of computers efficiently sharing resources. I get up for coffee, my cycles are used for my co-worker's application compile. He goes to lunch, his cycles are mine for Unreal Tournament.
I think it's got potential.
Do programs have to be written specifically to take advantage of this? I do lots of groundwater modeling and the models can take forever to run. Consequently, we spend lots of money on the newest and fastest machines. It would be nice to use something like this with our models "as-is".