Supercomputing: Raw Power vs. Massive Storage
securitas writes "The NY Times reports that a pair of Microsoft researchers are challenging the federal policy on funding supercomputers. Gordon Bell and Jim Gray argue that the money would be better spent on massive storage instead of ultra-fast computers because they believe today's supercomputing centers will be tomorrow's superdata centers. They advocate building cheap Linux-based Beowulf clusters (PCs in parallel) instead of supercomputers." NYTimes free reg blah blah.
No Registration Required
Just use the google link!
Brings a tear to my eye... life is good.
Gordon Bell and Jim Gray are not just "a pair of Microsoft researchers". They are two of the biggest names in high-performance computing. Gordon Bell awards, anyone?
Just wait till Bill and Steve hear that their engineers are recommending Linux instead of Windows 2003 Server.
In an earlier story Microsoft researchers recommended a Linux cluster. That story has been corrected. The Microsoft researchers recommend hundreds of un-clustered Windows-XP servers. They claim they were eating Lea-Nuts brand PEANUT clusters at the time of the interview and were misquoted.
- For the complete works of Shakespeare: cat
New York Times?
MSFT'ers recommending Linux?
I thought they fired that reporter who was making things up
There are lots of reasons to have really good bulk storage technology. But what's the killer app that's going to get the $10^9/year in government spending? Can you say "Domestic Surveillance", boys and girls? I knew you could!
You could at least use partner=SLASHDOT
Look at the average Joe Schmoe, or even us uber-users: who really needs a 3+ GHz machine? Even some of the cornerstones of fast computing, such as computational problem solving, are being addressed by grid/cluster-based solutions, which typically don't use high-end machines.
I'm perfectly happy with my P3 800MHz, but I run out of hard drive space every day.
Cheap, YET RELIABLE high-density storage solutions are still not readily available. I know we are now down to $1 per gig, but the average size of a user's files has increased too. Media (legal or otherwise), games, and other programs are chewing up hard drive space.
There needs to be more research into trustworthy, low-cost, high-volume storage media.
-"Those who fought today will die tomorrow."-
I think they're advocating spending the big bucks on data storage rather than on big iron.
When they mention beowulfs, it's in the context that when researchers need the equivalent of a supercomputer, they can just build/use a beowulf cluster. What they can't do on their own is come up with petabyte storage facilities and the data in them.
So what they're really advocating is spending money on storage; it doesn't say in the article what form that storage should take.
The government may very well like this. They're going to need big data farms to support the TIA program. It takes a lot of space to remember what kind of toppings every person in the US likes on their pizza.
Clusters suck for some problems. Weather prediction is one classic example, and fluid dynamics is a whole class of problems that suck on loosely coupled clusters. Basically, you need your message-passing latency to be much shorter than one calculation cycle, or you just spin your tires waiting for results from adjacent cells. If all problems mapped well to clusters of commodity PCs, then I can guarantee that Linux would be on almost all of the TOP500 supercomputers, because the cost per MIPS is a fraction of that of the big systems. Then I look at the real TOP500 and see that the top cluster of commodity PCs is only at #7, and it is beaten by a factor of 7 by the NEC vector supercomputer in the number one slot, even though the NEC has only twice as many CPUs. Even then they aren't using fast ethernet or even gig ethernet; they are using the high-bandwidth, low-latency Quadrics interconnects. The two other clusters in the top 20 use Myrinet, which is also high bandwidth and low latency, but once you add those kinds of interconnects they kind of stop being cheap off-the-shelf PCs, since the interconnect boards probably cost nearly as much as the boxes =)
There are 4 boxes to use in the defense of liberty: soap, ballot, jury, ammo. Use in that order. Starting now.
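To put rough numbers on the latency point above, here is a minimal sketch of a toy model: each timestep a node computes for a while, then waits one message latency for its neighbors' results. The function name and all timings are hypothetical, picked only for illustration; the model ignores bandwidth and any overlap of communication with computation, so treat it as a back-of-the-envelope estimate rather than a benchmark.

```python
def parallel_efficiency(compute_s, latency_s, messages_per_step=1):
    """Fraction of each timestep spent computing rather than waiting.

    Toy model (an assumption, not a measurement): the node computes for
    compute_s seconds, then blocks for latency_s per neighbor message.
    """
    wait_s = latency_s * messages_per_step
    return compute_s / (compute_s + wait_s)


# Hypothetical timings: 100 microseconds of useful work per step.
compute_s = 100e-6
interconnects = [
    ("low-latency interconnect (assumed 5 us)", 5e-6),
    ("commodity ethernet (assumed 100 us)", 100e-6),
]

for name, latency_s in interconnects:
    eff = parallel_efficiency(compute_s, latency_s)
    print(f"{name}: {eff:.0%} of the time spent computing")
```

With these made-up numbers, the node on the slow network spends half its time idle, which is the "spinning your tires" effect. Making each step's computation bigger (coarser decomposition) helps, but tightly coupled codes often can't do that.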
You have to wonder why, all things seriously being equal, they don't recommend a *BSD-based solution instead of a Linux-based one. Especially given the near-equivalent functionality of the *BSDs, and the fact that MS has publicly endorsed the BSD license in the past, citing it as a superior alternative to the GNU License.
Research on building mega Beowulf clusters is a legit govt activity, and so is building some. But the beauty of the Beowulf cluster is that it is affordable to businesses, academics and govt, plus it's very adaptable to budgets and interconnection schemes (fast, slow, grid, scavenger).
But Beowulf clusters won't replace the need for super fast, super scalable computers with well-architected interconnects. There are lots of problems in this class, mostly physics simulation, that just can't be done well on Beowulf clusters.
I should probably note that my own work involves large computer clusters. However, my problems (in biology) are in fact well suited to Beowulf clusters. Thus I'm happy to hear of more money for Beowulf computing, but frankly I think that this should be in addition to the fast computers.
The flip side here is that the money for fast computer resources may not be spent as well as it could be at present. There seems to be more emphasis on "landing the contract" for the computer center than on building a good design. Congress, via DOE, tends to dole these things out in a political fashion, making sure each big client gets funding for a center rather than letting the best center get the most contracts. As a result, some of the so-called supercomputers may already be just glorified, too-expensive-per-CPU, unscalable systems that could be eclipsed by a comparable low-cost Beowulf system.
That being said, it's still an area that the gov needs to fund, since it won't drive itself commercially but it's needed for lots of science and simulation.
Some drink at the fountain of knowledge. Others just gargle.
He also talked about CERN generating 10 PetaBytes a year when their new collider comes on line
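For a sense of scale, a quick sketch of what 10 PB a year means as a sustained data rate, plus the raw disk bill at the "$1 per gig" price quoted elsewhere in this thread. Assumptions: decimal units (1 PB = 10^15 bytes), a 365-day year, and that the thread's $1/GB figure applies; the function names are made up for this example, and the cost ignores redundancy, servers, power and backups.

```python
SECONDS_PER_YEAR = 365 * 24 * 3600  # 365-day year (assumption)
BYTES_PER_PB = 10**15               # decimal petabyte (assumption)
GB_PER_PB = 10**6


def sustained_rate_mb_per_s(petabytes_per_year):
    """Average write rate in MB/s needed to absorb the given yearly volume."""
    return petabytes_per_year * BYTES_PER_PB / SECONDS_PER_YEAR / 10**6


def raw_disk_cost_usd(petabytes, usd_per_gb=1.0):
    """Raw disk cost only, at the $1/GB figure mentioned in the thread."""
    return petabytes * GB_PER_PB * usd_per_gb


print(f"{sustained_rate_mb_per_s(10):.0f} MB/s sustained, around the clock")
print(f"${raw_disk_cost_usd(10):,.0f} per year in raw disk")
```

Under those assumptions that comes to a bit over 300 MB/s, every second of the year, and on the order of ten million dollars a year in bare drives.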
Supercomputers are sexy, but are losing the technology war. If you start designing a new one today it will be years before it is ready. During those years Intel and AMD will crank up their clock speeds and negate much if not all of the CPU speed advantage you get from your fancy design. Why not go for parallelism from cheap machines?
No electrons were harmed creating this post, though some may have been subjected to electrical and/or magnetic fields.
Mod this guy up. He's really telling the truth!
Loosely coupled clusters like PDSF are great for work like what the high energy physics people do, like SNO.
However, some things, such as climate models and fusion work, run better on vector architectures: there is a reason why the Spanish Met troops bought a Cray. Additionally, some chemistry, many fusion and several other codes work best on vector architectures.
These guys presented their global warming work at my job. They've developed their climate code as a parallel one, though. See here. One of the places that they have been running it is seaborg, an IBM RS/6000 with between 6k and 7k processors.
Interestingly, the PCM guys presented what they wanted for an uber'puter. While it had massive amounts of storage, it was also a 500 *PETAFLOP* SUSTAINED PERFORMANCE machine.
*clickety clack* That'd be something like 166,666,666 Athlons. IDK of any interconnects that handle that. Can you imagine being an admin? Better hope you're good on rollerblades, zipping to and fro replacing those oh-so-reliable commodity disks and CPUs...even if you have a .05% failure rate, that's still too damn much. As an admin, that'd be a huge waste of time. It'd also wreak havoc on the guys running stuff.
Or is that what grad students are for? To attempt such a silly thing and then admin it? ;)
Seriously tho. To get from here to there, we're going to need some exotic techs...not just more 'attack of the killer micros'.
Do you know why the road less traveled by is littered with the bones of the unwary?
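The *clickety clack* above can be redone as a short sketch. The only way to land on 166,666,666 from 500 petaflops is to assume about 3 GFLOPS sustained per Athlon, so that per-CPU figure is inferred from the arithmetic, not stated in the post. The post also gives no time period for the .05% failure rate, so the result below is just "failed boxes per whatever period that rate covers". Function names are made up for this example.

```python
TARGET_FLOPS = 500 * 10**15   # 500 petaflops sustained, from the post
FLOPS_PER_ATHLON = 3 * 10**9  # assumption: 3 GFLOPS sustained per CPU
FAILURE_RATE = 0.0005         # the .05% figure; time period unspecified


def cpus_needed(target_flops, flops_per_cpu):
    """Number of CPUs, rounded down the way the post's figure is."""
    return target_flops // flops_per_cpu


def expected_failures(n_cpus, failure_rate):
    """Expected number of failed CPUs per period covered by failure_rate."""
    return n_cpus * failure_rate


n = cpus_needed(TARGET_FLOPS, FLOPS_PER_ATHLON)
print(f"{n:,} Athlons")
print(f"about {expected_failures(n, FAILURE_RATE):,.0f} failed boxes per period")
```

So even at that optimistic rate, someone is swapping out tens of thousands of machines per period, and that is before counting disks, power supplies or the interconnect.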