Grid Computing at a Glance
An anonymous reader writes "Grid computing is the "next big thing," and this article's goal is to provide a "10,000-foot view" of key concepts. This article relates many Grid computing concepts to known quantities for developers, such as object-oriented programming, XML, and Web services. The author offers a reading list of white papers, articles, and books where you can find out more about Grid computing."
Grid computing is not about making a giant computing farm out of a bunch of distributed machines.
see, that's the major fallacy of the hype behind "The Grid". yes, one of the benefits can be seen in the supercomputing realm, where you can link up many different machines (we haven't gotten to doing this between architectures yet, mind you) to make a gianto-machine.
however, the key in *all* of this is the technologies that allow for that to happen, along with the data transfer, authentication, and authorization, et al, that have to happen.
as far as cycles go, no, we probably won't see a dynamically created, scheduled, and allocated meta-supercomputer anytime soon. most companies will use these technologies to make static or mostly-static links between a few select sites and partners for now.
however, these protocols (GridFTP, ack), standards (OGSA, ...), and ideas are the important part here. having these "Grid" concepts built into every new technology (filesystems: NFSv4, security: Globus GSI, etc.) will allow these linkups, data transfer, and whatever we may want to do, to happen much more efficiently in the future.
to wit: the killer app in "The Grid" is not to make a giant supercomputer. it's to develop a lot of different ideas and technologies which allow for resource sharing (at the general level, among other things) to occur in a standardized, efficient, and logical fashion in the future. no one will use all of them, but the key is to use what you need from what "The Grid" encompasses. that's why it's referred to as "The Third Wave of Computing"!
This is just an inverted version of the "network computing" universe where we all use thin clients that use a central server to do work. It can never become mainstream due to the physical limitations, not the technology ones. Suppose I am a corporation and I need a new big-iron system to process daily orders from our web site. Let's try grid computing: all 1000 employees in the company install a piece of software on their PC so we can use each PC to process an order, based on availability. The number of problems with this, as compared to using a central server, is incredible.
1) Still need a central server for storage/backup
2) One server needs one UPS, 1000 workstations...
3) Workstations are flaky: They reboot, crash, play video games, etc. The distributed software can handle this, but the inefficiency involved is painful. I hope everybody doesn't run Windows Update all at once, or all the PCs could go down.
4) The corporate network is now a bottleneck.
I rattled off this list in about 30 seconds, so I'm sure there are lots more. Since these are physical limitations, not technology limitations, they aren't going away.
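To put a rough number on point 3, here is a back-of-the-envelope sketch in Python. It assumes every attempt to process an order on a workstation is interrupted independently with the same probability, and that an interrupted attempt has to be redone from scratch. Both assumptions and all the figures are invented for illustration; the function name is hypothetical too.

```python
def expected_attempts(p_interrupt: float) -> float:
    """Mean number of tries until one attempt finishes uninterrupted.

    Assumes independent attempts with a fixed interruption probability,
    so the number of tries follows a geometric distribution.
    """
    if not 0.0 <= p_interrupt < 1.0:
        raise ValueError("p_interrupt must be in [0, 1)")
    return 1.0 / (1.0 - p_interrupt)


if __name__ == "__main__":
    orders_per_day = 10_000  # made-up workload
    for p in (0.01, 0.10, 0.30):  # made-up interruption rates
        total = orders_per_day * expected_attempts(p)
        print(f"p={p:.2f}: about {total:.0f} attempts for {orders_per_day} orders")
```

The overhead grows slowly while interruptions are rare and quickly once they are not, which is the inefficiency being complained about. It says nothing about the storage, power, or network points.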
Oh bullshit. Every layer of abstraction costs you.
The fact that desktop PCs are 5-20% utilized is why you can just claim another layer of abstraction won't hurt you.
--- now please go and find me a list of things that "needs distributed".
-- next from your list remove any jobs that do not parallelize into chunks of data that can fit in common machines --- yes the grid will have some big boxen, but do you think you are going to reliably get farmed onto one of those?
-- next from the remainder that you have managed to parallelize into small chunks, please remove those in which the chunks have to have any significant interdependence because you don't have any control of the net-ography of the grid and latency will be a killer.
-- now remove any notion you have about "generic db queries" unless you are going to have many redundant db systems on the grid. If you don't have redundancy the network latency will kill you. If you do have redundancy and the db query is sufficiently complex as to need service by something other than your desktop PC then you'll probably want some beefy hardware out there... which you want to use, not necessarily share.
-- what's left? Things that occur to me: analysis of nuclear and particle physics data (that's where the grid idea started!), genomics research, cryptography, SETI@home and whatever else @home. The key point is that none of these are applicable to corporate IT unless you are doing, say, genomics. Do you think that genomics research companies are going to ever allow their data to be handled outside a structure they can micro-manage? There are giga-dollars at play.
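The elimination steps above can be sketched as a simple filter. This is only an illustration in Python: the `Job` fields, the thresholds, and the example workloads are all assumptions made up here, not measurements or anyone's real scheduler.

```python
from dataclasses import dataclass


@dataclass
class Job:
    name: str
    parallelizable: bool     # can it be split into work units at all?
    chunk_mb: float          # size of one work unit (hypothetical figure)
    coupling: float          # 0.0 = independent chunks, 1.0 = tightly coupled
    needs_central_db: bool   # depends on a single, non-replicated database


def fits_grid(job: Job, max_chunk_mb: float = 512.0, max_coupling: float = 0.1) -> bool:
    """Apply the checklist in order; thresholds are arbitrary assumptions."""
    if not job.parallelizable:
        return False
    if job.chunk_mb > max_chunk_mb:    # chunk won't fit on a common machine
        return False
    if job.coupling > max_coupling:    # latency between chunks would dominate
        return False
    if job.needs_central_db:           # every query pays the network round trip
        return False
    return True


# Invented example workloads, loosely following the comment's categories.
jobs = [
    Job("particle physics event analysis", True, 100.0, 0.0, False),
    Job("daily web order processing", True, 1.0, 0.0, True),
    Job("tightly coupled simulation", True, 200.0, 0.8, False),
    Job("single-threaded batch report", False, 50.0, 0.0, False),
]

survivors = [j.name for j in jobs if fits_grid(j)]
print(survivors)
```

With these invented inputs only the physics-style job survives. Change the thresholds and the list changes, but the order of the tests is the argument.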
The grid has its place, but the myth of:
1) plug my computer into grid
2) have access to limitless resources
3) do amazing things
is as goofy as the dot-bomb business plans that forgot to figure out a $profit$ step.
If you aren't doing amazing things outside the grid, what makes you think adding 10000x the horsepower will change anything? The grid is at best a tool. If it meets the needs of your niche market you win big. If your problem(s) don't fit the grid then you gain nothing.