Visions Of The Future Of Grid Computing
CaptianGrid writes "Computing grids, or software engines that pool together and manage resources from isolated systems to form a new type of low-cost supercomputer, have finally come of age. BetaNews sat down with some of the world's leading grid gurus to discuss the significance of such distributed technologies and separate grid hype from grid reality."
if i had a grid computer, maybe i would've been able to get first post.
Then more recently we have seen Univa being created, in which I am involved as founder and advisor
Univa
Univac - a successor of Multivac, the largest computer in Asimov's world.
Nerds - they get everywhere.
I guess it's time: the power of a single CPU (GHz and instructions per clock) is leveling off, and hooking lots of them together seems like the only way left to increase computing power. Hopefully we will be able to find some answers from SETI, or cure some nice cancer with the Folding projects. It would be nice if the commercial grids also helped out those projects by giving them their spare cycles. GRIDS CRUSH SINGLE CPU.
Imagine!
Check out Apple's Xgrid technology!
It runs on any Mac OS X system, 10.2.8 and up. Put your spare cycles to work.
Xgrid: High Performance Computing for the Rest of Us
Never ask for directions from a two-headed tourist! -Big Bird
The article mentions the commoditization of grid computing by adhering to a set of standards, but past a certain point, it makes little sense for IBM or Sun to make their tools interoperable... that makes their consulting value-add on top of grid resources they offer diminish.
I think that for full standards compliance, you'll need to look to companies which don't offer their own computing resources -- platform-agnostic companies. But then who do you buy the compute resources from? Unless you're buying your own systems for use (which makes "utility computing" less viable), it's a bit of a catch-22.
If you want examples of operating systems that help with gridding, check out Plan 9 from Bell Labs and its sister project Inferno. The nice thing about Inferno is that it runs on Linux, Windows, Mac OS X, Plan 9, and on native hardware.
Free of Flash! Free of Flash!
There will never be a substitute for a single box with a lot of CPUs in it. For tightly coupled datasets, the latency of a grid will be a limitation.
Transcend Humanity. Please.
Computing grids, or software engines that pool together and manage resources?
Pure Bolshevism, that's what!
What we provide is primarily an implementation of Web services standards to allow people to build services, and the primary goal is also for us to provide a set of pre-defined services that allow you to use Web services protocols to interact to request the allocation of compute resources, the creation of computational services and moving the data from one place to another and so forth.
Does this sound like Carly Fiorina attempting to explain HP's strategy to anyone else?
The new generation of marketeers use the word Grid, but they are rarely referring to what computer science engineers mean by grid clustering. I think the marketeers talk about Grid when they really mean virtual operating systems running on abstracted hardware platforms: either a mainframe or otherwise kick-ass multi-way system that has been virtually partitioned, or something like VMware piecing together several x86-style servers.
Frankly, I don't like the word Grid being applied in this way. However, the latter technology (virtual OS) is fascinating and will come to dominate computing in the next few years.
The basic idea is total abstraction of the application/service from hardware/location. The app gets the resources it needs, can be cloned/replicated to another location for disaster tolerance, and can scale and grow on demand simply by throwing more hardware modules at it. It's not limited to computing; it also applies to storage and network.
Someone you trust is one of us.
Look, the bottom line is there is nothing new here, just new sets of buzzwords. You have been able to submit massive computer jobs to IBM or Sun (with their insane $1/cpu-hour), or even most college campuses (the U of Minnesota had such systems) for the last 35 years. MPI/PVM standardized and commoditized the clustering side of things long ago.
;)
Globus is now "web services" and not "GRID". GRID is so last century. It's far more cool now that it's in Java too. Anyone still working on GRIDs should search/replace immediately!!!
And did they drop the name of every single business partner they have in that article, or did only I notice that?
- Adam L. Beberg - The Cosm Project - http://www.mithral.com/
Hasn't the blackout taught us to move away from GRID type setups? If people just created their own power the blackout would have affected us less. Could this principle not be used for home computing? Rely on yourself and not on others?
Live forever, or die trying.
While RTFA, I couldn't help wondering what the overhead of a Web service-based grid solution might be, and how that overhead would be compounded by the frequent communication among the grid nodes.
Tyranny isn't the worst enemy of a democracy. Cynicism is.
Just do all your computation in whatever hemisphere is in winter. They can use the heat.
The only problem with this kind of setup is its limited ability to accomplish anything useful for a consumer or a medium-sized company. While it is of course an interesting field, and one that needs to be researched, technology like proximity computing (Sun) is what will dictate the technology in the future. It's hard enough as it is to get decent multiprocessor scheduling without too much overhead on a single PC; the overhead incurred with grids would be enormous (I guess that's why the primary applications would be file storage etc.). Proximity computing, on the other hand, is an innovative approach that doesn't try to solve a problem in place, but avoids it altogether.
What I want to know is, is there any way to sell my unused cycles on the open market? I love SETI and all, but making a $$$ would be super cool.
Is a combination of grid and virtualisation.
Grid in the sense that if my datacenter needs more resources, I just plug in a blank PC with extra CPU/MEM/Disk and don't worry about it. Or if one goes bad, I just rip it out without worrying about what it will destroy.
Virtualisation in the sense that if I need an email server - I just create a virtual one on this grid and let it go, if I need a DNS server - I just create one on this grid and let it go, a web server ... same thing, LDAP server same thing. If a server gets under load, the grid will automatically devote more memory/space/CPU/bandwidth to it as reasonable.
That is my idea of a true grid.
Grids are great for non-time-critical computational tasks. But what happens when everyone needs cycles now? My guess is that systems will evolve to give cycles to the highest bidder/highest priority. In such an environment, low-priority tasks will become effectively impossible on a grid - there will always be some higher-priority/higher-paying task that usurps the cycles.
I wonder how long SETI@home will last if home PC users realize they can "sell" cycles to meet for-pay demand for computational power.
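A toy illustration of the starvation worry above, assuming a grid that runs exactly one job per tick and always picks the highest bid. The function name, job names and bid values are all made up for the sketch; it does not model any real grid scheduler.

```python
import heapq


def run_auction_scheduler(arrivals, ticks):
    """Run one job per tick, always choosing the highest bid.

    arrivals maps a tick number to a list of (bid, job_name) pairs
    that show up at that tick. Returns (jobs that ran, jobs still waiting).
    """
    queue = []  # min-heap of (-bid, arrival_order, job_name)
    ran = []
    arrival_order = 0
    for tick in range(ticks):
        for bid, name in arrivals.get(tick, []):
            heapq.heappush(queue, (-bid, arrival_order, name))
            arrival_order += 1
        if queue:
            _, _, name = heapq.heappop(queue)
            ran.append(name)
    still_waiting = [name for _, _, name in sorted(queue)]
    return ran, still_waiting


# Hypothetical workload: one low-bid volunteer job, plus a new paid job every tick.
arrivals = {0: [(1, "seti-unit"), (10, "paid-0")]}
for t in range(1, 5):
    arrivals[t] = [(10, f"paid-{t}")]

ran, still_waiting = run_auction_scheduler(arrivals, ticks=5)
print("ran:", ran)
print("still waiting:", still_waiting)
```

As long as a higher bid arrives every tick, the low-bid job never leaves the queue. Real schedulers usually add aging or fair-share quotas to avoid exactly this.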
Two wrongs don't make a right, but three lefts do.
If you've got a problem that's trivially parallelizable, then sure grid computing is great! RC5, seti@home, and similar projects can benefit from grid computing (really, that's what grid computing is -- someone else's code able to run on your machine when it's idle and do work).
However, don't even begin to think you'll be solving anything that requires any sort of processor-to-processor communication. Rocket simulation (our local favorite example here at UIUC), for instance, is heavily communication-based.
The Linpack benchmark that top500 uses also needs a low-latency interconnect to perform really well, so don't expect to see "the grid" sitting in the #1 supercomputer slot on top500.org anytime soon (or really, ever, unless someone develops FTL networking). Latency on the internet in general (and specifically around the world through all those switches and latest_slashdot_hot_chick_movie.torrent packets) is nowhere near what a supercomputer needs.
Now, there are research groups looking at ways of making communication delays less of a problem, including the one I was in while I was in grad school. There are a number of ways to do it, but none of the ones I've seen are going to take on worldwide network latency and survive with their performance intact.
Even something as "simple" as chess wants to have a fast interconnect - every node that's gotten stranded working on low-priority (bad move) work is a wasted node you may as well not have.
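For a rough sense of scale on the latency point above, here is a back-of-the-envelope sketch. It computes the best-case round trip between two nodes on opposite sides of the planet, assuming the signal travels at the vacuum speed of light over about 20,000 km (roughly half the Earth's circumference). The 5-microsecond cluster interconnect figure is an assumed ballpark for comparison, not a measurement.

```python
SPEED_OF_LIGHT_M_PER_S = 299_792_458      # exact, by definition of the metre
HALF_EARTH_CIRCUMFERENCE_M = 20_000_000   # assumption: about 20,000 km
ASSUMED_CLUSTER_LATENCY_S = 5e-6          # assumption: ballpark cluster interconnect


def best_case_round_trip_s(distance_m, signal_speed_m_per_s=SPEED_OF_LIGHT_M_PER_S):
    """Lower bound on round-trip time: no switches, no queues, no protocol overhead."""
    return 2 * distance_m / signal_speed_m_per_s


rtt = best_case_round_trip_s(HALF_EARTH_CIRCUMFERENCE_M)
slowdown = rtt / ASSUMED_CLUSTER_LATENCY_S

print(f"best-case round trip: {rtt * 1000:.1f} ms")
print(f"ratio to assumed cluster latency: {slowdown:,.0f}x")
```

Even this physical floor is four orders of magnitude above the assumed cluster figure, and real internet paths add routing and queueing on top of it.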
Slashdot Patriotism: We Support our Dupes!
The guy in TFA talks about P2P being another type of grid and says that a family could create a distributed environment for shared data. He also talked about trust.
My idea is that by adding strong encryption you basically get a small private network that is almost impossible to crack. DVDs + CDs + Encrypted P2P among a small group of people == Old Skool Sneakernet (aka borrowing your friend's stuff). You and your friends can share all the entertainment among yourselves as you like. All you need is a P2P-type client, and to share your keys with your friends physically (as in 3 1/2 floppy exchanges).
You want to borrow that new Spider-man 2 DVD but are too lazy to go over to your friend's place to get it? Send him an email and ask him to rip it to DivX and throw it up on your private encrypted P2P network.
Take a pinch of Standard Linux
Wrap it up in Xen
Add a touch of SELinux
And a little bitty bit of Globus
Oh like a Sandboxed Platform
Oh Lordy, Lordy, mixed with Free and Open Source Code
You know you lump it all together
And you got a recipe for a Multi Vendor Development scene
It is coming though, you know, you know.
What we have is a great big melting pot
Big enough enough enough to take every vendor and all IT's got
And keep it stirring for a hundred years or more
And turn out Application Service and Content Providers by the score.
With apologies to Blue Mink.
"Grid" is all about "You let me use your spare cycles, and I'll pretend I'm going to let you use my spare cycles in return."
"But all your emitter and collector are belong to me!"
"Grid" technology to do this stuff has been around for decades, e.g. NQS. Hell, NASA gave away PBS in the 80s & 90s.
The problem is that most of the CPUs out there run Windows, which is currently damned near useless for this kind of thing. It'll require a rewrite of the OS to take proper advantage of the potential of a network of Windows boxes for general-purpose computing. OTOH, a couple of shell scripts and SGE (http://gridengine.sunsource.net/) do the job on Linux and other Unix systems.
Government of the people, by corporate executives, for corporate profits.
Me neither, but for slightly different reasons.
The main definition of a grid is a pattern of intersecting lines. While Sun or IBM may arrange their computers neatly in rows of vertical racks and physically build them in a grid pattern, nothing of this remains in the actual use or architecture of so-called grid computing. This leaves large swaths of parallel algorithms by the wayside. The only things you can efficiently compute on a grid are the "embarrassingly parallel" codes that neither interact much with neighboring CPUs nor require large data sets. Sure, you can do SETI work units and compute large primes, but for chess, weather, and crash sims you'd be better off with a traditional supercomputer or local cluster.
Wiki reference here
What no one is mentioning is that these big clusters/grids that Sun/IBM are building to later sell over the network are dependent on the ratio between network speed and batch file sizes.
EXAMPLE: IBM is currently offering CPU/Hour service in Houston to oil and gas companies. Sounds great till you realize the multi-terabyte files that consume such a massive compute service are too big to be readily sent over the network. Instead they use vans to haul tape and disk over to IBM and then run the process on it.
What is the bandwidth of a station wagon? Right now it's faster than the internet on a 20 mile drive across Houston.
But even take it a step further and the ratio remains. What if I wanted to pay Sun by the hour for CPUs while I worked on a big Maya render of 200 gigs? By the time I've sent that over a cable modem, have I gained a ton in performance time?
The problem I see is that we are making CPUs massively parallel but not networks. So will it EVER make sense to send a massive file to a commercial grid over a single network connection?
Someone should do the math.
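Taking up that challenge, here is a rough version of the math. The 200 GB render and the 20 mile drive come from the post above; the 1 Mbit/s cable modem upstream, the 2 TB van payload (standing in for "multi-terabyte") and the 30 mph average speed are assumptions picked only to make the arithmetic concrete.

```python
GB = 10**9
TB = 10**12


def transfer_seconds(size_bytes, link_bits_per_second):
    """Time to push size_bytes through a link, ignoring protocol overhead."""
    return size_bytes * 8 / link_bits_per_second


def courier_bits_per_second(payload_bytes, distance_miles, speed_mph):
    """Effective bandwidth of physically driving the media across town."""
    drive_seconds = distance_miles / speed_mph * 3600
    return payload_bytes * 8 / drive_seconds


render_size = 200 * GB                # from the post: a 200 gig Maya render
assumed_uplink_bps = 1_000_000        # assumption: 1 Mbit/s cable modem upstream
upload_days = transfer_seconds(render_size, assumed_uplink_bps) / 86_400

van_bps = courier_bits_per_second(
    payload_bytes=2 * TB,             # assumption: 2 TB of tape and disk
    distance_miles=20,                # from the post: 20 mile drive across Houston
    speed_mph=30,                     # assumption: average city driving speed
)

print(f"upload over the assumed uplink: {upload_days:.1f} days")
print(f"van's effective bandwidth: {van_bps / 1e9:.2f} Gbit/s")
```

Under these assumptions the upload takes over two weeks, so the grid only pays off if it saves more than that in compute time. The van's effective bandwidth comes out in the gigabits per second, though it ignores the time spent writing and reading the media at each end.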
The use of the word "grid" here is in the sense of an electric power grid. The idea is that you should get computing power on demand, just like you get electric power on demand.
The reason for such an arrangement is that high-speed interconnects are expensive. Building a single cluster that is uniformly very high performance would be horrible for anyone other than a very rich organization to consider.
On the other hand, grids alone are way too slow to handle the needs of time-critical communication, which is what you have a lot of the time in parallel computing.
A hybrid, able to place components of a problem according to that component's needs, would seem to be the logical solution. It is also the scalable solution. Clusters often have an upper limit in size. By having grids of clusters, you have a virtually infinite capacity. True, there simply aren't any clusters that have reached the upper limit. Yet. But it's getting tough at the size they are at right now.
It's a small world and it smells funny; I'd buy another if it wasn't for the money; Take back what I paid (SoM)
In a Grid context, people usually mean Condor-G.
Parity: What to do when the weekend comes.
Because I'm feeling contrarian, too, I'll call you on your claims. Virtualization can be very cheap, and very easy to administer. VM/370 was based on CP/CMS, which was developed using government money, so it was open source. In an early example of why open source is such a good idea, several big timesharing companies took CP/CMS and hacked CMS to get rid of the real I/O instructions (CCW's, or Channel Command Words) inside it. You see, CMS was a real single-user OS. So CMS could run on bare hardware, just like it could run under CP. Thus, CMS issued CCW's to talk to what CMS thought were "real" I/O processors on "real" hardware. Which meant that when CMS ran under CP in user (non-privileged) mode, every time the machine tripped over one of these CCW's, an illegal instruction trap was generated. The trap was caught by CP, which then parsed and painstakingly emulated the CCW in an extremely complex routine called "CCWTRANS." Many have lost their sanity reading the code to CCWTRANS. Anyway, although really cool, this strategy also turned out to be really expensive.
Meanwhile, because they all had the source code to CP/CMS, the timesharing companies all came up with the same basic great idea. They hacked CMS to get rid of the CCW's, and replaced the CCW's with the equivalent of fast BIOS traps into CP. So CP didn't have to translate or emulate anything any more, things began to run at native speed, and suddenly everything was lickety-split fast again. In fact, this hack sped up CMS to the point where the premier speed vendor, National CSS, could run 250 users with decent performance on a 370/168 mainframe. VM/370, meanwhile, topped out at a measly 60-70 users. IBM either never figured out the hack, or as is more probable, wasn't very interested in VM/370 anyway (their cash cow was and still is OS/MVT and its successors).
So you are correct; VM/370 was a dog. But CP/CMS, hacked with traps, was totally amazing. I was there; I was a CP system programmer; I know.
The modern equivalent of this strategy is called Xen. Xen has been a topic here before. I predict you will see a lot more about it in the future.