Supercomputing: Raw Power vs. Massive Storage
securitas writes "The NY Times reports that a pair of Microsoft researchers are challenging the federal policy on funding supercomputers. Gordon Bell and Jim Gray argue that the money would be better spent on massive storage instead of ultra-fast computers because they believe today's supercomputing centers will be tomorrow's superdata centers. They advocate building cheap Linux-based Beowulf clusters (PCs in parallel) instead of supercomputers." NYTimes free reg blah blah.
No Registration Required
Just use the google link!
Brings a tear to my eye... life is good.
My calendar says June 2nd. What does yours say?
GTRacer
- ? slooF lirpA
Defending IP by destroying access to it? That makes sense, RIAA/MPAA. Go to the corner until you can play nice!
Gordon Bell and Jim Gray are not just "a pair of Microsoft researchers". They are two of the biggest names in high-performance computing. Gordon Bell awards, anyone?
Just wait till Bill and Steve hear that their engineers are recommending Linux instead of Windows 2003 Server.
It's nice to see some MS researchers going against the perceived stereotype and being open in their suggestions like this.
And I think they have a good point about massive memory being a very important part of computing advancement right now.
Atheism is a religion to the same extent that not collecting stamps is a hobby.
In an earlier story Microsoft researchers recommended a Linux cluster. That story has been corrected. The Microsoft researchers recommend hundreds of un-clustered Windows-XP servers. They claim they were eating Lea-Nuts brand PEANUT clusters at the time of the interview and were misquoted.
- For the complete works of Shakespeare: cat
By rewriting existing scientific programs, they say, researchers will be able to get powerful computing from inexpensive clusters of personal computers that are running the free Linux software operating system. Many scientists are now adapting their work to these parallel computing systems, known as Beowulfs
Man, I'd like to see a... um... damn.
New York Times?
MSFT'ers recommending Linux?
I thought they fired that reporter who was making things up
Cluster computing really is the future. Supercomputers are expensive, run weird OSes (sometimes), and have infrastructure requirements. A cluster (I prefer OpenMosix, but Beowulf if you like) just requires fast ethernet or fibre.
Plus, think of all the computers that go unused at night in places like school computer labs. All those free machines could, at night, join a cluster and do number crunching for researchers.
-- Bill "Houdini" Weiss
There are lots of reasons to have really good bulk storage technology. But what's the killer app that's going to get the $10^9/year in government spending? Can you say "Domestic Surveillance" boys and girls? I knew you could!
What company would like to supply database software worth a potential $1b per year?
Just waiting for the other shoe to drop...
Esteem isn't a zero sum game
You could at least use partner=SLASHDOT
Look at the average Joe Schmoe, or even us uber-users: who really needs a 3+ GHz machine? Even some of the cornerstones of fast computing, such as computational problem solving, are being addressed by grid/cluster-based solutions, which typically don't use high-end machines.
I'm perfectly happy with my P3 800MHz, but I run out of hard drive space every day.
Cheap, YET RELIABLE high-density storage solutions are still not readily available. I know we are now down to $1 per gig, but the average size of a user's files has increased too. Media (legal or otherwise), games, and other programs are chewing up hard drive space.
There needs to be more research into trustworthy, low-cost, high-volume storage media.
-"Those who fought today will die tomorrow."-
As much as I hate conspiracy theories and Microsoft bashing, this may be an extremely clever move. As of now, the mainframe and supercomputing worlds are still relatively safe from commoditization. Unlike Linux, which is still virtually irrelevant on the desktop, mainframes and supercomputers are a much bigger piece for Microsoft to swallow. By recommending Linux clusters, Microsoft may actually be trying to establish commodity hardware in the world of supercomputing. The keyword here is hardware. Once clusters become ubiquitous, Microsoft will start aggressively pushing Windows 200X Server Cluster Edition, fighting an enemy it already has much experience with.
It's not Microsoft recommending anything. These are two independent researchers - leaders in the field - who happen to usually work out of Microsoft's Bay Area research center.
They don't work for Microsoft; Microsoft simply provides the grants that fund their research.
If anything their report would tell those who are on the MS payroll to get to work on a cluster offering.
I don't need no instructions to know how to rock!!!!
The BBC has an article on a group of scientists who have built a beowulf cluster of Playstation 2s.
I think they're advocating spending the big bucks on data storage rather than on big iron.
When they mention beowulfs, it's in the context that when researchers need the equivalent of a supercomputer, they can just build/use a beowulf cluster. What they can't do on their own is come up with petabyte storage facilities and the data in them.
So what they're really advocating is spending money on storage; it doesn't say in the article what form that storage should take.
The government may very well like this. They're going to need big data farms to support the TIA program. It takes a lot of space to remember what kind of toppings every person in the US likes on their pizza.
On page two of the article, there is a mention of Linux, Beowulf etc. Moreover x86 is not mentioned explicitly.
From the article
By rewriting existing scientific programs, they say, researchers will be able to get powerful computing from inexpensive clusters of personal computers that are running the free Linux software operating system. Many scientists are now adapting their work to these parallel computing systems, known as Beowulfs, which make it possible to cobble together tremendous computing power at low cost.
And if you are going to rewrite Unix code, it is easier to rewrite it for Linux than for Windows. And how much can a MS cluster scale anyway?
.ACMD setaloiv siht gnidaeR
You have to wonder why, all things seriously being equal, they don't recommend a *BSD-based solution instead of a Linux-based one. Esp. given the near-equivalent functionality of the *BSDs, and the fact that MS has publicly endorsed the BSD license in the past, citing it as a superior alternative to the GNU License.
From the MS site, the Bay Area Research Center is "... a small Microsoft Research group located in the San Francisco Bay Area. We've been working on two large projects with other universities, companies, other Microsoft Research groups, and with Microsoft product groups in Redmond and Cupertino. These projects are Scalable Servers and Media Presence. "
I can't see scalability involving commodity hardware with MS OSes. In spite of Microsoft's desktop domination strategies and small business server dominance (arguably, at least for the moment), they know they won't be taken seriously about clustering Windows 2003 Server, purely because there is no design AFAIK in the kernel for operating in clusters in the first place. This is supercomputing using commodity hardware, not supercrashing using commodity OSes. Linux is perfectly situated to be recommended by anyone because it is not a competitor's product, per se.
The homepages of the two men can be seen here, if anyone is interested in some of the more interesting history of the two. Little of it has to do with Microsoft propaganda and the marketing machine:-
Gordon Bell
Jim Gray
Conversion Rate Optimisation French / English consultant
Raw speed will always be useful for problems that are hard to parallelize. Right now those problems (parts of crypto, some quantum physics calculations, etc.) are important scientifically, but away from the money.
Industry will spend R&D money on clustering for storage and reliability, without major government subsidy, because there's a crying need for it. How much government money went into Google/eBay/Amazon?
Government research is supposed to complement industry R&D - to be aimed at fields where the results are still important, but maybe not as profitable. This is why government should not abandon raw speed as a research goal.
To a Lisp hacker, XML is S-expressions in drag.
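The point above about hard-to-parallelize problems can be made concrete with Amdahl's law. The thread never names it, so treat this as an outside illustration: a minimal sketch in which the function name and the sample fractions are made up for the example.

```python
def amdahl_speedup(parallel_fraction: float, nodes: int) -> float:
    """Ideal speedup under Amdahl's law.

    parallel_fraction is the share of the run time that can be spread
    across nodes; the rest is assumed to stay strictly serial.
    """
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if nodes < 1:
        raise ValueError("nodes must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / nodes)


if __name__ == "__main__":
    # Hypothetical workloads: half serial, 5% serial, 0.1% serial.
    for fraction in (0.50, 0.95, 0.999):
        for nodes in (16, 1024):
            speedup = amdahl_speedup(fraction, nodes)
            print(f"parallel={fraction:.3f} nodes={nodes:5d} speedup={speedup:8.2f}")
```

Under this model a job that is half serial never reaches a 2x speedup no matter how many cheap nodes are added, which is one reading of why a faster single processor still matters for that class of problem.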
By rewriting existing scientific programs, they say, researchers will be able to get powerful computing from inexpensive clusters of personal computers that are running the free Linux software operating system.
"The supercomputer vendors are adamant that I am wrong," Dr. Bell said. "But the Beowulf is a Volkswagen and these people are selling trucks."
All the people who are responding saying they don't mention Linux didn't read the second page.
There are 4 boxes to use in the defense of liberty: soap, ballot, jury, ammo. Use in that order. Starting now.
I saw that it could be google too, but anyhow, I made a username/password for y'all:
slashdot124
slashdot
Be wary however, I registered as a North Korean military R&D official under high salary.
---
"The chances of a demonic possession spreading are remote -- relax."
All this for the price of a few supercomputers every year. And the market for supercomputers pushes several technologies; for example, high speed interconnect and gallium arsenide, and sets the bar for high performance silicon. Pretty good deal, doncha think?
But now the Moron-in-Chief wants to bring back nuclear testing. (pardon me, 'nookyuler.' Bush can't be wrong about something as simple as pronunciation, can he?). Farewell to deterrence. Farewell to common sense...
--- Often in error; never in doubt!
Research on building mega beowulf clusters is a legit govt activity, and so is building some. But the beauty of the beowulf cluster is that it is affordable to businesses, academics and govt, plus it's very adaptable to budgets and interconnection schema (fast, slow, grid, scavenger).
but beowulf clusters won't replace the need for super fast, super scalable computers with well-architected interconnects. there are lots of problems in this class, mostly physics simulation, that just can't be done well on beowulf clusters.
I should probably note that my own work involves large computer clusters. However, my problems (in biology) are in fact well suited for beowulf clusters. thus I'm happy to hear of more money for beowulf computing. but frankly I think that this should be in addition to the fast computers.
the flip side here is that it might be the case that money for fast computer resources is not being spent as well as it could be at present. there seems to be more emphasis on "landing the contract" for the computer center than on building a good design. congress via DOE tends to dole these things out in a political fashion, making sure each big client gets funding for a center rather than letting the best center get the most contracts. as a result some of the so-called supercomputers may already be just glorified, too-expensive-per-cpu, unscalable systems that could be eclipsed by a comparable low-cost beowulf system.
but that being said, it's still an area that the gov needs to fund, since it won't drive itself commercially but it's needed for lots of science and simulation.
Some drink at the fountain of knowledge. Others just gargle.
http://www.research.microsoft.com/~Gray/talks/CSTB_SuperComputing_Study_Group.ppt
He also talked about CERN generating 10 petabytes a year when their new collider comes on line.
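For a sense of scale, the 10 petabytes a year figure can be converted into a sustained data rate. This is a back-of-the-envelope sketch only: it assumes decimal petabytes, a 365-day year, and data arriving at a perfectly even rate, none of which comes from the talk itself.

```python
SECONDS_PER_YEAR = 365 * 24 * 60 * 60  # assumes a non-leap year


def sustained_rate_mb_per_s(petabytes_per_year: float) -> float:
    """Average rate in decimal megabytes per second for a yearly volume."""
    bytes_per_year = petabytes_per_year * 10**15  # decimal petabytes (assumption)
    return bytes_per_year / SECONDS_PER_YEAR / 10**6


if __name__ == "__main__":
    rate = sustained_rate_mb_per_s(10)
    print(f"10 PB/year is about {rate:.0f} MB/s, around the clock")
```

That comes to roughly 317 MB/s averaged over the whole year; real output would presumably be burstier than this flat average suggests.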
Supercomputers are sexy, but are losing the technology war. If you start designing a new one today it will be years before it is ready. During those years Intel and AMD will crank up their clock speeds and negate much if not all of the CPU speed advantage you get from your fancy design. Why not go for parallelism from cheap machines?
No electrons were harmed creating this post, though some may have been subjected to electrical and/or magnetic fields.
this is not what high performance computing is about. this is the class of problems that are embarrassingly parallel and don't need good disk access. in short, pointless benchmarks like computing pi rather than solving real tightly coupled physics problems like, say, asteroid impacts or molecular dynamics, or problems where processors have to access the disk a lot or share data.
Some drink at the fountain of knowledge. Others just gargle.
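For anyone unsure what "embarrassingly parallel" looks like in practice, here is a minimal sketch of the pi benchmark mentioned above. It is an illustration, not anything from the article: the midpoint-rule method, the function names, and the chunk counts are all choices made for the example. The key property is that each chunk is computed with no communication at all, and the only shared step is one final sum.

```python
import math
from concurrent.futures import ProcessPoolExecutor


def partial_pi(chunk):
    """Midpoint-rule integral of 4/(1+x^2) over one slice of [0, 1]."""
    start, stop, steps = chunk
    width = 1.0 / steps
    total = 0.0
    for i in range(start, stop):
        x = (i + 0.5) * width
        total += 4.0 / (1.0 + x * x)
    return total * width


def make_chunks(steps, workers):
    """Split the step indices 0..steps into one contiguous range per worker."""
    size = steps // workers
    return [
        (w * size, (w + 1) * size if w < workers - 1 else steps, steps)
        for w in range(workers)
    ]


def estimate_pi(steps=1_000_000, workers=4, map_fn=map):
    """Sum the independent partial results; map_fn decides where they run."""
    return sum(map_fn(partial_pi, make_chunks(steps, workers)))


if __name__ == "__main__":
    # Each chunk runs in its own process and never talks to the others.
    with ProcessPoolExecutor(max_workers=4) as pool:
        value = estimate_pi(map_fn=pool.map)
    print(f"pi ~= {value:.10f} (error {abs(value - math.pi):.2e})")
```

A tightly coupled simulation is the opposite case: every time step needs neighbouring data from other nodes, so the interconnect, not the CPU count, sets the pace.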
Don't confuse Microsoft Research with the rest of Microsoft. The research branch has the same atmosphere as a university. In fact, Microsoft has bought a number of university research groups wholesale. Quite a few famous people are now working for them (e.g. Tony Hoare, Erik Meijer, and the guys in the original article).
I've heard presentations from them, and talked to them in private, and I can assure you they are far from following the party line. I'm sure that any pressure from above to do so would cause massive protest.
Microsoft is very wise to run the research branch this way. Research is not the province of yes-men.
Massive data storage doesn't mean a thing to people like me who do computational physics work. We need better supercomputers to simulate larger systems... or simulate them faster. Sure, we can simulate a system of 300,000 particles within a few hours, but there could be great value in simulating systems of millions of particles. Maybe there is some effect that we miss... or something.
Anyway, data storage is not a problem in MY field -- and I would think that government interests in supercomputing lie in places OTHER than fast database servers or whatever.
Mmmm......sacrelicious.
According to this a "beowulf" is a cluster of cheap computers, NOT a cluster of cheap LINUX computers. I don't think Microsoft is advocating Linux, as much as I/you/we wish they were... http://www.phy.duke.edu/brahma/beowulf_online_book/node61.html
Storage outstrips transistors; it always has and probably always will. It's easier to store a piece of data than it is to manipulate it. Look at the storage capacity vs. time graph and compare it to Moore's law: the doubling happens every 12-15 months, not every 18-24. Access times haven't gotten lower, but that's because we still use rotating disks, and it's very, very hard to make cheap components to the tolerances that would allow >15K RPM. If the ever-predicted holographic storage comes to be, then we will have fast, low-latency mass storage, but that's a field where throwing more money at it won't necessarily make it happen faster, because it's a basic-sciences kind of thing and it's really just waiting for the right mind to come along to break it out of the rut it's been in for the last 10+ years.
There are 4 boxes to use in the defense of liberty: soap, ballot, jury, ammo. Use in that order. Starting now.
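The doubling-period comparison in the storage comment above is easier to appreciate once it is compounded. A rough sketch, taking the quoted 12-15 and 18-24 month figures at face value; the ten-year horizon and the function name are assumptions picked for illustration.

```python
def growth_factor(years: float, doubling_months: float) -> float:
    """How many times over a quantity multiplies if it doubles every doubling_months."""
    if doubling_months <= 0:
        raise ValueError("doubling_months must be positive")
    return 2.0 ** (years * 12.0 / doubling_months)


if __name__ == "__main__":
    horizon_years = 10  # hypothetical horizon, not from the thread
    for label, months in (
        ("storage, fast end", 12),
        ("storage, slow end", 15),
        ("Moore's law, fast end", 18),
        ("Moore's law, slow end", 24),
    ):
        factor = growth_factor(horizon_years, months)
        print(f"{label:22s} doubling every {months} months -> x{factor:,.0f}")
```

Over ten years a 12-month doubling compounds to about 1000x while a 24-month doubling gives 32x, so a seemingly small gap in the doubling period turns into more than an order of magnitude.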
I agree to the point that money should be spent on data storage, but I'm not sure that money should be taken out of the "super computing" budget or wherever the money comes from. I think it should be another priority, but really, we need both. Clusters aren't the solution to every problem, and supercomputers have their place. All in all I think it amounts to this: we need more government spending in the IT sector, and better spending in general. The ISP where I work is also a geological data and oil reservoir company. We recently did a project for the DOE and they budgeted us $2 million just for a web page about the project. Ridiculous. That $2 million would buy a pretty nice data storage center, I would think. But I guess that's what happens when your govt pays $500 for a hammer.
Everyone is entitled to their own opinion. It's just that yours is stupid.
> And how much can a MS cluster scale anyway?
Windows 2000/2003 WLBS can theoretically scale to 32 nodes, but I have seen performance decreases after 16 or so.
Windows 2000 MCS can scale up to two nodes with Advanced Server, and four nodes with Datacenter.
Windows 2003 MCS can scale up to four nodes with the Server, and eight nodes with Enterprise.
jwg
The Titan Cluster
The Platinum Cluster
TeraGrid Clusters Successfully Installed at NCSA
These clusters run either RedHat or SuSE Linux and are available for researchers nationwide.
These clusters are not beowulf; they allow access through a general scheduler and have MPI to run programs that use a group of nodes at once. This gives the greatest flexibility to the users to create a computational system that can be optimized for the size and needs of their problem. The size of a cluster that can be supported at a national center allows enough computational power to solve problems that can't be solved elsewhere. Given that a cluster of 128 nodes is now considered an institutional asset and within the purchasing power of any university, it makes sense to use federal funds to create systems to handle problems beyond the scale of a cluster that any university might own.
Another aspect of this issue arises in the assumption that cluster computing is so easily accomplished that it might be compared to the setup of a single system. I respectfully submit that the simplest of clusters is none too easy to deploy and use as of today, not to mention the lack of support one gets for the application of their scientific research to a stock parallel computing platform. The national centers can afford to have consultants and researchers on staff that specialize in these matters, as well as full-time admins.
Note: The opinions expressed here are my own and not necessarily representative of my employer or the federal government. In addition, given that I am employed by NCSA, a slight element of bias may be present in my statements. :)
The Internet has no garbage collection
Actually, Beowulf clusters of 800-1,000 machines running Linux can be competitive with supercomputers.
I remember reading in Wired magazine a few years ago about a biotech company here in the San Francisco Bay Area that clustered several hundred machines running Pentium III 600 MHz CPUs to do DNA mapping and analysis--and the results were just as fast as most supercomputers costing several times what that cluster cost.
Imagine what a cluster of 700 to 1,000 blade servers running the latest Intel Xeon CPUs can do now! =)
"...a pair of Microsoft researchers..."
"They advocate building cheap Linux-based Beowulf clusters..."
come on guys...June 2nd, not April 1st.
When all else fails, use the backup...
In Soviet Russia, clusters Beowulf you!
it's quite astonishing that these researchers, who are otherwise well-reputed, have missed the whole point of government sponsorship of super-* facilities: to do what can't be done otherwise. mostly, that means running traditional supercomputer jobs, those that are tightly coupled. people who have loosely-coupled jobs have long ago bailed from the supercomputing arena, and have been building their own clusters. similarly, there's no unique advantage to centralizing data storage, and a huge disadvantage (bottlenecks in and out).
I have to wonder whether Markoff badly munged the intent of the Gray/Bell paper, since the way he presents it is internally inconsistent. that is: the gov should spend huge bucks on massive centralized storage, but computing should be decentralized ala grids. oops, how is all that compute power supposed to move data to/from the three national data repositories? perhaps the central problem here is the fallacy shared by grid-o-philes: that networking is getting dramatically faster. take a look at your own network: if you are lucky enough to have gigabit to the desktop, when did that upgrade happen (probably 100 upgrade happen? what kind of speed did you get on your last big download? I've experienced a speedup of something between 10 and 50x in the past, say, 10 years. that's pathetic, when compared to the speedup we all have experienced in CPU power, memory size/speed, and disk size/speed.
there's no Moore's Law of networking: no n^2 process to keep accelerating (unlike die or disk densities). yes, there are technological improvements, and yes, you can gang cables together to scale bandwidth almost linearly. no such help for latency, though. and technological improvements are neither infinite nor increasing. that means that the network is becoming more of a bottleneck, not less.
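The bandwidth-versus-latency point above can be shown with a toy model: transfer time is a fixed latency plus size divided by aggregate bandwidth. Everything here is an assumption for illustration (the function name, the 100 microsecond latency, the gigabit link speed, and the idea that ganged links scale perfectly), not a measurement of any real network.

```python
def transfer_time_s(message_bytes: float, link_bytes_per_s: float,
                    latency_s: float, links: int = 1) -> float:
    """Toy model: one-way latency plus serialization time over ganged links.

    Ganging links multiplies bandwidth (assumed perfectly linear here)
    but leaves the latency term untouched.
    """
    if links < 1:
        raise ValueError("links must be at least 1")
    return latency_s + message_bytes / (link_bytes_per_s * links)


if __name__ == "__main__":
    gigabit = 125e6   # bytes per second, hypothetical link speed
    latency = 100e-6  # seconds, hypothetical one-way latency
    for label, size in (("1 KB message", 1e3), ("1 GB bulk copy", 1e9)):
        one = transfer_time_s(size, gigabit, latency, links=1)
        eight = transfer_time_s(size, gigabit, latency, links=8)
        print(f"{label:15s} 1 link: {one:.6f} s   8 links: {eight:.6f} s   "
              f"gain: {one / eight:.2f}x")
```

In this model eight links make the bulk copy nearly eight times faster but barely change the small message, which is the situation tightly coupled jobs are in when they exchange many small updates.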
Supercomputers are used for high performance technical computing. Mainframes, on the other hand, are used when you need high reliability/availability. When someone talks about 5-nines reliability, they are saying a system is up 99.999% of the time - equivalent to about five minutes of downtime per year. The systems that achieve this do what is called fault-tolerant computing. It is done by having integrated redundant hardware along with the appropriate specialized software to deal with it.
You won't find any supercomputer or PC that does this. This is why there will *always* be a market for mainframes. It may not be a huge market, but it's still a market.
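The availability figure is simple arithmetic and easy to check. A small sketch, assuming a 365-day year; the function name is made up for the example.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # assumes a non-leap year


def downtime_minutes_per_year(availability_percent: float) -> float:
    """Allowed downtime per year for a given availability percentage."""
    if not 0.0 <= availability_percent <= 100.0:
        raise ValueError("availability_percent must be between 0 and 100")
    return (1.0 - availability_percent / 100.0) * MINUTES_PER_YEAR


if __name__ == "__main__":
    for nines in (99.9, 99.99, 99.999):
        minutes = downtime_minutes_per_year(nines)
        print(f"{nines}% uptime allows {minutes:.2f} minutes of downtime per year")
```

At 99.999% this comes to a little over five minutes a year; each nine removed multiplies the allowance by ten.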
They employ the likes of Tony Hoare (invented quicksort and the 'Hoare triple'). They also hired most of the core developers of the functional language Haskell. And many other brilliant minds.
Most universities could only dream of the funding that MS research has. And they're completely free to research whatever they want. And of course they use Linux, BSD and whatever other tools are right for the job. They're researchers, not software politicians.