Grid Computing: Conceptual Flyover For Developers
An anonymous reader writes "This article relates many Grid computing concepts to known quantities for developers, such as object-oriented programming, XML, and Web services. The author offers a reading list of white papers, articles, and books where you can find out more about Grid computing."
Nope - that still does not tell me what "grid computing" is. A definition this vague and loose could describe just about every "next big thing" since the mainframe.
Recently I saw a similar design for a network and some "old timers" said it was no good to do it this way. It wouldn't satisfy the needs.
One thing I have noticed is that many "old timers" have the attitude of "we have always done it the old way, why change?" Any thoughts on how we drag that old donkey into the new methods when they don't want to go?
Evolution or ID?
The linked article was written in May 2003, yet it's news now?
Speaking as part of a university group that adopted Grid computing about a year ago: the Grid is mostly over-hyped material that isn't ready for prime time. The basic idea (see e.g. Legion) worked more than a decade ago, but what I've seen of today's Grid software is fragile, overcoupled, underdocumented, and doesn't yet deliver on all the promises.
:)
We were taught that the test of research software is whether a full professor (or corporate executive or other obscenely busy person worth >> $100/hour) finds it useful enough that they take time to learn it - the uses I've seen for the Grid don't pass that threshold yet.
There are some exceptions: tightly-integrated applications put together in a couple of the hard sciences that really just do supercomputing with a friendlier face. There's enough payoff there for a physicist to be happy with the software.
For a geek, however, even there, most "grid UI research" is simplistic, derivative, and uninspired.
Apologies to my first-ever-advisor who is now a Grid bigwig.
"No Neo, try again"
"What is grid computing?"
"Bingo."
There's certainly a lot of info to devour there, but I guess if companies like Google and DreamWorks are using it, then it has to be a Good Thing.
I have had some experience with grids and the overwhelming difficulties I've come across have been in the areas of security.
First and foremost, grids are designed to run in a distributed environment which makes security design and administration that much more complex.
Second, grids are currently in their infancy and there is little prior art on the types of attacks and problems that will affect them. Despite this, they are very juicy targets with the kind of storage and bandwidth that would make even a hard-bitten cracker weep for joy. (I apologise for the imagery.)
Third, in my book security has to be a top-down approach - i.e. the guys on top lead the way and then everyone else follows. Grids have no tops or bottoms which makes this a bit tough to apply. In short there is no security hierarchy in a default grid environment. Responsibility HAS to be established explicitly. A simple example is who is responsible for the data held on one of the nodes? Is it the person who wrote the application, the person who owns the application, the person who owns the hardware?
Grids are fascinating in their security requirements (and those who think these are solved by web services have another thing coming! People are a huge aspect of the security of a system, and distributed systems like grids have the very complex task of ensuring that people behave the way they should).
Is this news? Article is dated May 2003.
Like, if you fit in "grid computing" in your grant proposal, the probability that you'll get funding increases. Now, if in addition to "grid" you manage to fit in "nanotechnology", "bio-informatics" and "paradigm" you'll be funded with a probability very close to 100 %!
The grid discussed here seems only to be built on the OGSA and Globus Toolkit, and Globus has not really covered itself in glory with their poor UIs etc.
Grid seems to address occasional demand for "much more power" from your computing resource, but does not really provide a consistent flexible computing resource.
The academic world uses External Grids to pool resources but private Enterprise has little to gain from these External Grids in exchange for a HUGE security problem.
And Internal Grids? These are so immature as to beggar belief. Why risk investing in these configurations when the bang per buck is so uninviting?
You know we're talking about a dead meme when the first comment is modded redundant.
The beowulf cluster joke is dead, long live the beowulf cluster joke!!
The snow doesn't give a soft white damn whom it touches. -- ee cummings
Actually I always thought the name came from the concept of power grids, i.e. plug your application into the computing grid, get it to run your computations, and get the results without having to worry about how the computing power got to your home...
Kind of similar to a power grid, no? Plug in toaster, insert bread, get toast - no need to worry about coal/oil/nuclear fuel burning, transformers or megawatts...
If it works that transparently, great. Imagine if they called the internet "the grid"... things that are truly new usually have new names. I've found that the opposite usually indicates a very high ratio of fluff to content and is a telltale sign that there's a marketing department behind it, and that's almost always not a plus.
The beowulf cluster joke is dead, long live the beowulf cluster joke!! ... in Japan?
Stuff like this, whereby you get a load of co-operating computers and a multi-level architecture to utilise them, was done years ago, all built on top of RPC. This to me is just another refashioning of age-old ideas so the people involved can justify their research positions and so IBM (and others) can make a whole heap of cash out of gullible IT managers.
I don't know about Google, but I believe the DreamWorks rendering used for Shrek 2 (that was theirs, right?) was deployed onto a supercluster of 500+ nodes, not some fancy grid fabric.
As a "boots on the ground" IT professional, it would be nice to have a consumer-grade "grid computing" solution to offer some small business customers as an alternative to buying a server farm for the two days a month they actually put strain on it.
If there were an easy way to cluster their workstations, they wouldn't need to invest in an underutilized server farm. They could just schedule their processor/disk-intensive reports and processes for off hours, or rely on grid load balancing to take the extra cycles from the computer of the CSO (Chief Solitaire Officer) so that the impact would be imperceptible to the average user.
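To make the scheduling idea concrete, here is a rough sketch of how a scheduler might decide which workstations can take grid work. Everything in it is made up for illustration: the `Workstation` record, the `pick_idle_nodes` helper, the 20% load threshold and the 19:00 to 07:00 off-hours window do not come from any real grid product.

```python
from dataclasses import dataclass
from datetime import time


@dataclass
class Workstation:
    name: str
    cpu_load: float  # fraction of CPU in use, 0.0 to 1.0 (hypothetical metric)


def is_off_hours(now, start=time(19, 0), end=time(7, 0)):
    """True if `now` falls in the overnight window, which wraps past midnight."""
    return now >= start or now < end


def pick_idle_nodes(nodes, now, max_load=0.20):
    """Return the names of workstations that may take grid work right now.

    Off hours, every node is fair game. During the day, only nodes whose
    load is below `max_load` are used, so their users should not notice.
    """
    if is_off_hours(now):
        return [n.name for n in nodes]
    return [n.name for n in nodes if n.cpu_load < max_load]
```

With a mostly idle "cso" box and a busy "accounting" box, a 2 pm call would pick only the first, while an 11 pm call would pick both. A real deployment would also have to handle a user coming back to a machine mid-job, which this sketch ignores.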
The current problem with the concept of grid computing is the lack of an easy way to deploy it in a standard business environment. What the article and its links are driving at is coming up with a cheap and easily implemented mechanism to turn every office, and chain of offices, into a grid.
In theory, you could sell your unused processor cycles the same way people who generate their own power sell power back to their power companies. Your ISP could actually become a processor cycle reseller someday, and you could operate on a minimal set of hardware in the typical office environment because you can always pick up extra cycles from your ISP when you need them.
Ah, the pipe dream.
So why the hell is anyone posting a blurb that links to articles that are 18-month-old news?
Damn slow news day.
Stuff like this, whereby you get a load of co-operating computers and a multi-level architecture to utilise them, was done years ago, all built on top of RPC
Yes, and that's why it didn't work: thinking about distributed computing as a bunch of procedure calls that happen to be remote is wrong. The sad thing is that a sizeable number of people still think it's the way to go (e.g., all the SOAP adherents).
This to me is just another refashioning of age-old ideas so the people involved can justify their research positions and so IBM (and others) can make a whole heap of cash out of gullible IT managers.
The problem with grid computing is more that there isn't much of a need for it right now. But if there were, research into it would be justified because existing technologies can't handle it.
I guess I would disagree that the Grid is where HTTP was circa 1993. Whereas there was just one WWW and it was based on one protocol, the present definition of Grid computing is hazier.
There are several competing definitions of "Grid" going around - from the happy-big-cluster idea that Apple calls Xgrid (bad name, good product, IMHO) to the TeraGrid and NCSA grids in the US to the LCG/Grid3/NorduGrid to SETI@home. They all speak different languages and are built on different models.
Most of the definition problems come from the degree of heterogeneity involved in the Grid in question. Obviously, Xgrid runs only Macs. :) TeraGrid runs on a very specific set of gov't-owned CPUs of a limited family of processors - basically, Itaniums. LCG/Grid3 are a bit less homogeneous - a selection of versions of RedHat. NorduGrid is very diverse, Linux-wise. SETI@home is positively promiscuous.
Now, have a look at the range of software each can run. SETI has ONE program available. Its wild heterogeneity makes it tightly limited in a resource-limited development environment. Xgrid can run almost anything a Mac can do. This is the tradeoff - as you support more platforms, it becomes harder to support a generality of packages on each platform.
If you make your code extremely light and portable, you can afford to push the code and compile at runtime, or do relatively frequent recompiles and updates on the known sites. This is hard to scale, though.
Another problem is managing the sites in question. Again, scaling. With 50 sites, you know the sysadmins and can interact with them on run issues and security. With 5000, there is no chance. You must have automated checks and maintenance. This is also nontrivial.
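As a rough idea of what "automated checks" could look like at the smallest scale, here is a sketch that probes every site concurrently and reports the ones that fail. The names `check_sites` and `failing` are invented for this example, and `probe` is a placeholder for whatever a given grid would really test (job submission, disk space, certificate validity); none of this is taken from Globus or any other middleware.

```python
from concurrent.futures import ThreadPoolExecutor


def check_sites(sites, probe, max_workers=32):
    """Run `probe(site)` against every site concurrently.

    `probe` is a caller-supplied (hypothetical) function that returns True
    when a site is healthy. Any exception it raises counts as a failure.
    Returns a dict mapping each site name to True or False.
    """
    sites = list(sites)

    def safe_probe(site):
        try:
            return bool(probe(site))
        except Exception:
            return False

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        results = list(pool.map(safe_probe, sites))
    return dict(zip(sites, results))


def failing(report):
    """Sites that need a human to look at them, in sorted order."""
    return sorted(site for site, ok in report.items() if not ok)
```

The point of running the probes in parallel is that one dead site that takes a minute to time out should not hold up the other 4999. Deciding what to do about the failures automatically is the nontrivial part.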
A number of solutions are being tried for these problems, as well as for security, load-balancing and storage optimization. They must be solved in the near future, and things look good for that. The most common solution, AFAIK, is the GLOBUS grid middleware - it standardizes a lot of this stuff. It is imperfect and needs work - but it's coming along. Previous comments that Globus' UI is imperfect are silly - it's like carping about the UI of a machine-code instruction. Other middleware also exists in some form or other, and a few will eventually emerge as solutions to the various problems. Again, the solution you use will vary according to the problem you face.
Eventually, the goal is to have a fairly portable, secure, flexible framework that will run a reasonable number of applications on a reasonable number of platforms. The software has to be admin-friendly - no need for root on the boxes, easy to set up, easy to remove. These are all adjustable requirements.
Right now I run massive high-energy physics software (preinstalled on the site) on tens of sites in the US and Korea. It's not user-friendly - it's in development. It's extremely powerful, even now - my jobs are gargantuan and the disk to store them even more so. I need only one application set, and I have it. In the future, things will be easier - this is all in development mode. However - it DOES work, it WILL continue to improve, it continues to become more secure, and someone (when things are ready) will code up an interface to it that will make it friendly.
That's pretty good. Naysayers take note. This isn't a vaporware idea - it's just a difficult problem with a lot of blanks left to fill in.
One of the points in this article is that many companies have idle computing power and their servers are underutilized. Obviously, if their existing infrastructure is more than handling its load, they are not going to be too keen to cross over to this new technology (well, it's not really a new idea).
On the other hand, it does mean that new networks can be created using fewer resources, but at the moment the biggest interest in this would probably be from the scientific community, to do really intensive processing.
Okay... I'm not completely up on the inner workings of GRID computing, but is the premise the same as those used in the past for other distributed environments such as DCE (Distributed Computing Environment) or CORBA (Common Object Request Broker Architecture)?
My experience with DCE at least was that it was a distributed environment that took a lot of coordination between systems, which unfortunately was not done very well in the environment I'm familiar with. As a result, it did not prove robust enough for the systems it was used for. It had some possibilities, but if not done properly, it can be very confusing to deal with.
With CORBA, as I understand it (I've never directly worked with CORBA), it is supposed to represent similar services in a more object-oriented way and be easier to program with. I'm not an expert, but I believe this is ingrained into the Java world along with other RMI-type interfaces or peer-to-peer interfaces (like JXTA).
With these types of services, both DCE and CORBA offer distributed network services such as directory services, distributed file systems, and security services on heterogeneous environments. The interfaces are defined (see IDL) and compiled into stubs for client/server services to develop and use on any compatible platform.
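For anyone who hasn't seen the stub idea, here is a toy version of what an IDL compiler generates, hand-written for illustration. The `Calculator` interface, the `CalculatorStub` and `skeleton_dispatch` names, and the use of JSON are all assumptions for this sketch: real DCE and CORBA tools generate this code from the IDL file and use their own binary wire formats, and the transport here is just an in-process function call.

```python
import json


class Calculator:
    """Server-side implementation of a hypothetical 'Calculator' interface."""

    def add(self, a, b):
        return a + b


def skeleton_dispatch(servant, request_bytes):
    """Server skeleton: unmarshal the request, call the servant, marshal the reply."""
    request = json.loads(request_bytes)
    method = getattr(servant, request["method"])
    result = method(*request["args"])
    return json.dumps({"result": result}).encode()


class CalculatorStub:
    """Client stub: looks like a local object, but marshals every call."""

    def __init__(self, transport):
        # `transport` takes request bytes and returns reply bytes. Over a
        # real network this would be a socket or HTTP round trip.
        self._transport = transport

    def add(self, a, b):
        request = json.dumps({"method": "add", "args": [a, b]}).encode()
        reply = self._transport(request)
        return json.loads(reply)["result"]
```

Wiring the two together with `CalculatorStub(lambda req: skeleton_dispatch(Calculator(), req))` lets the client call `stub.add(2, 3)` as if the object were local. That convenience is exactly what the directory, security and file services get layered on top of.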
How is GRID different from these methods?
Eric B
ebresie@gmail.com
It's easy, when classically trained/experienced, to see tasks as procedurally captive to one processor. Numerous processors, distributed computing, and as a consequence, grids of computers, treat a daemon, or a number of distributed but controlled daemons, as a way to broaden the number of concurrent tasks that can be performed. This is the benefit of the grid: resources that can be allocated and run concurrently, with their results then added back to the originating idea/program.
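The pattern described here is usually called scatter/gather, and a bare-bones sketch of it follows. It assumes the work can be cut into independent chunks; the names `split` and `run_on_grid` are invented for the example, and local threads stand in for the remote daemons a real grid would use.

```python
from concurrent.futures import ThreadPoolExecutor


def split(items, parts):
    """Cut `items` into `parts` roughly equal contiguous chunks."""
    size, extra = divmod(len(items), parts)
    chunks, start = [], 0
    for i in range(parts):
        end = start + size + (1 if i < extra else 0)
        chunks.append(items[start:end])
        start = end
    return chunks


def run_on_grid(items, work, combine, workers=4):
    """Scatter chunks to workers, run `work` on each concurrently, then gather.

    Threads are only a stand-in for grid nodes in this sketch.
    """
    chunks = [c for c in split(items, workers) if c]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(work, chunks))
    return combine(partials)
```

For example, `run_on_grid(list(range(1, 101)), sum, sum)` adds up each quarter of the list separately and then adds the four partial sums. The originating program only ever sees the combined result, which is the whole appeal.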
It's a distributed hierarchy of functionality. Its benefits can be simply described: more rapid results over a wide breadth of available computing hardware/platforms. It's not for accounting tasks, but can work for churning databases or collections of information. This wide breadth of platforms to deal with datasets brings enormous computational/execution power for little money because it's not captive to a single (set of) platform(s).
Therein lies the value.
---- Teach Peace. It's Cheaper Than War.
Hosting companies have large numbers of identical machines with high bandwidth interconnects. That's just what you want for "grid computing". They're already set up to allow customers to run applications on their machines, and are able to deal with the security problems. Load is very low during off-peak hours. The machines stay up; they don't suddenly get disconnected from the net because somebody turned their desktop off. They're all loaded with the same base software. It's the ideal situation for commercial "grid computing".
So why is nobody selling this? Because there's no market for it. There's no real commercial market for supercomputer time, distributed or otherwise. Once upon a time, from about 1960 to 1980, there were engineering computer service centers, where you bought time-sharing service on big mainframes. Control Data and UNIVAC were the preferred machines for this. But that business is dead. CPU time became too cheap.
A well-known commercial grid was Gateway Processing on Demand, announced in late 2002 with great fanfare. Gateway offered "grid computing" on thousands of Gateway-owned machines. They quietly dropped that service some time last spring. Their former CEO admitted that it generated "not a lot" of revenue. Basically, it was an attempt to generate some revenue from Gateway's unsold inventory of machines.
Grid computing is one of those schemes where all the interest is on the sell side. Nobody wants to buy it. "Micropayments" and "portals" are like that. They didn't sell either.
Do you BreatheEatAndSleep GridComputing? Does your Nanotechnology experience RockYourX Paradigm? Passionate about BioInformatics? Read on!
~ ~~
SpreadThin is a BayAreaStartup with a Mission. The Mission: to combine Nanotech, BioInformatics, and GridComputing to create the NewParadigm for MoleculeBasedServices! Your Task: to synthesize these concepts into a Marketable NewParadigm. RoomAndBoard + Equity.
Knowledge of WikiEmergencyMaintenance is DesirableButNotRequired. StartingImmediately.
Resumes in HR-XML Resume 2.0 format to ceo_webmaster_receptionist@spreadthin.com.
~~~~~
Corollary to Moore's Law: The IQ of new computer owners is declining.
http://shit.slashdot.org/article.pl?sid=04/11/01/1211259
It's now used not only in computational science, but also in business. Butterfly http://www.butterfly.net/ uses Grid as its online game solution. And as far as I know, Sun has come out with Sun Grid Engine, and Oracle also has its Grid solution for business customers.
Just as we have worked with both intranets and the Internet over the past few decades, we now have the grid for enterprise use and also the so-called Global Grid http://www.ggf.org/.
Yes, "Grid" came from marketing thoughts to capture more of the long-term goals. The field was previously called, more accurately, metacomputing. But nobody outside a narrow subfield of distributed systems knew what that meant either.