Supercomputing: Raw Power vs. Massive Storage
securitas writes "The NY Times reports that a pair of Microsoft researchers are challenging the federal policy on funding supercomputers. Gordon Bell and Jim Gray argue that the money would be better spent on massive storage instead of ultra-fast computers because they believe today's supercomputing centers will be tomorrow's superdata centers. They advocate building cheap Linux-based Beowulf clusters (PCs in parallel) instead of supercomputers." NYTimes free reg blah blah.
No Registration Required
Just use the google link!
Brings a tear to my eye... life is good.
Microsoft scientists wanting Linux Beowulf clusters? .NET into Evolution!
Next thing you know they will be trying to integrate
Don't mod me, bro'!!!!
My calendar says June 2nd. What does yours say?
GTRacer
- ? slooF lirpA
Defending IP by destroying access to it? That makes sense, RIAA/MPAA. Go to the corner until you can play nice!
Gordon Bell and Jim Gray are not just "a pair of Microsoft researchers". They are two of the biggest names in high-performance computing. Gordon Bell awards, anyone?
Just wait till Bill and Steve hear that their engineers are recommending Linux instead of Windows 2003 Server.
"But the Beowulf is a Volkswagen and these people are selling trucks."
What a great quote!
KARMA TAG! You're it.
It's nice to see some MS researchers going against the perceived stereotype and being open in their suggestions like this.
And I think they have a good point about massive memory being a very important part of computing advancement right now.
Atheism is a religion to the same extent that not collecting stamps is a hobby.
Exactly where in that article did they endorse beowulf clusters?
Maybe there's hope somewhere in Redmond after all? Nah..... Tomorrow they'll retract their statement (after a good talking to from Bill) and advocate Windows XP - Cluster Edition.
In an earlier story Microsoft researchers recommended a Linux cluster. That story has been corrected. The Microsoft researchers recommend hundreds of un-clustered Windows XP servers. They claim they were eating Lea-Nuts brand PEANUT clusters at the time of the interview and were misquoted.
- For the complete works of Shakespeare: cat
Microsoft researchers recommending linux!
Can you say w00t!?!?!?
By rewriting existing scientific programs, they say, researchers will be able to get powerful computing from inexpensive clusters of personal computers that are running the free Linux software operating system. Many scientists are now adapting their work to these parallel computing systems, known as Beowulfs
Man, I'd like to see a... um... damn.
World's fastest supercomputer: SETI@home.
LilMikey.com... I'll stop doing it when you sto
It sounded too surreal to be real but there it was on Page 2, Paragraph 1. That said, I think this may be a sign of the apocalypse, so run and take cover!
Good people do not need laws to tell them to act responsibly, while bad people will find a way around the laws-Plato
New York Times?
MSFT'ers recommending Linux?
I thought they fired that reporter who was making things up
Must be why they bought that license from SCO.
now we need to go OSS in diesel cars
Cluster computing really is the future. Supercomputers are expensive, run weird OSes (sometimes), and have infrastructure requirements. A cluster (I prefer OpenMosix, but Beowulf if you like) just requires fast ethernet or fibre.
Plus, think of all the computers that go unused at night in places like school computer labs. All those free machines could, at night, join a cluster and do number crunching for researchers.
-- Bill "Houdini" Weiss
So, aside from the obvious statement about Linux-based Beowulf servers, I find it interesting that these computer scientists turned "Microsofties" are advocating a position held by Oracle's Ellison. Jeez, this is the way things were back in the 70's too. What's old is what's new, eh?
Visit Jonesblog and say hello.
There are lots of reasons to have really good bulk storage technology. But what's the killer app that's going to get the $10^9/year in government spending? Can you say "Domestic Surveillance" boys and girls? I knew you could!
Well, as EVERY post before this pointed out, Microsoft is recommending Linux clusters!? Why is that? Well, the only thing I can think of after only 4 cups of coffee is that they're hoping that this attempt will fail. They can then claim that they tried to help Linux, but it was so sucky that it screwed everything up. That it couldn't compete with the "real" supercomputers after all.
Just a thought.... anyone have a better explanation? There has to be a different motive here.
This space for rent, inquire within.
What company would like to supply database software worth a potential $1b per year?
Just waiting for the other shoe to drop...
Esteem isn't a zero sum game
You could at least use partner=SLASHDOT
Look at the average Joe Schmoe, or even us uber-users, who really needs a 3+ GHz machine? Even some of the cornerstones of fast computing such as computational problem solving are being addressed by grid/cluster based solutions which typically don't use high end machines.
I'm perfectly happy with my P3 800MHz, but I run out of hard drive space everyday.
Cheap, YET RELIABLE high-density storage solutions are still not readily available. I know we are now down to $1 per gig, but the average size of a user's files has increased too. Media (legal or otherwise), games, and other programs are chewing up hard drive space.
There needs to be more research into trustworthy, low-cost, high-volume storage media.
-"Those who fought today will die tomorrow."-
As much as I hate conspiracy theories and Microsoft bashing, this may be an extremely clever move. As of now, the mainframe and supercomputing worlds are still relatively safe from commoditization. Unlike Linux, which is still virtually irrelevant on the desktop, mainframes and supercomputers are a much bigger piece for Microsoft to swallow. By recommending Linux clusters, Microsoft may actually be trying to establish commodity hardware in the world of supercomputing. The keyword here is hardware. Once clusters become ubiquitous, Microsoft will start aggressively pushing Windows 200X Server Cluster Edition, fighting an enemy it already has much experience with.
OK, go ahead and mod parent down and troll me into oblivion. I walked right into that one.
The second page of the article explains linux, beowulf, and sounds like a pretty neat step for clustering projects.
There is no reasonable defense against an idiot with an agenda
:wq
Lots of data, in a networked array of systems.
Sounds familiar, and like the RIAA and MPAA's worst fears.
Try reading the second page, homey.
Mikey-San
Karma: +Eleventy billion (mostly affected by watching Celebrity Jeopardy)
nothing about this is mentioned on their research team's homepage:
http://research.microsoft.com/barc/Scaleable/
(just in case somebody's interested)
The BBC has an article on a group of scientists who have built a beowulf cluster of Playstation 2s.
By rewriting existing scientific programs, they say, researchers will be able to get powerful computing from inexpensive clusters of personal computers that are running the free Linux software operating system. Many scientists are now adapting their work to these parallel computing systems, known as Beowulfs, which make it possible to cobble together tremendous computing power at low cost.
POMPOUS JACK-OFF!!!
I think they're advocating spending the big bucks on data storage rather than on big iron.
When they mention beowulfs, it's in the context that when researchers need the equivalent of a supercomputer, they can just build/use a beowulf cluster. What they can't do on their own is come up with petabyte storage facilities and the data in them.
So what they're really advocating is spending money on storage; it doesn't say in the article what form that storage should take.
The government may very well like this. They're going to need big data farms to support the TIA program. It takes a lot of space to remember what kind of toppings every person in the US likes on their pizza.
The article isn't about the fact that the researchers were Microsoft researchers, the article is about the fact that they suggest spending in data storage instead of processor power.
In my opinion, those two are inseparable. What use is running a giant experiment about modeling the globe's climate when you don't have a huge source of data to base your experiments on? And when you generate the model, what good is it if you can't store the results for future reference?
I do find that processors evolve very rapidly, still respecting Moore's law, but the data-storage field could really do with a scientific breakthrough to increase storage capacity a lot.
Perhaps putting more money in the envelope for research in this field would satisfy the current need for cheap and reliable data-storage. Perhaps the need alone is enough to make people come up with better storage facilities.
On page two of the article, there is a mention of Linux, Beowulf etc. Moreover x86 is not mentioned explicitly.
From the article
By rewriting existing scientific programs, they say, researchers will be able to get powerful computing from inexpensive clusters of personal computers that are running the free Linux software operating system. Many scientists are now adapting their work to these parallel computing systems, known as Beowulfs, which make it possible to cobble together tremendous computing power at low cost.
And if you are going to rewrite Unix code, it is easier to rewrite it for Linux than for Windows. And how much can a MS cluster scale anyway?
.ACMD setaloiv siht gnidaeR
Could it not be that the linux beowulf cluster is the best solution? Maybe these are two honest researchers stating their opinion... and not really part of Microsoft's plan to overtake the world.
Davak
You have to wonder why, all things seriously being equal, they don't recommend a *BSD-based solution instead of a Linux-based one. Esp. given the near-equivalent functionality of the *BSDs, and the fact that MS has publicly endorsed the BSD license in the past, citing it as a superior alternative to the GNU License.
From the MS site, the Bay Area Research Center is "... a small Microsoft Research group located in the San Francisco Bay Area. We've been working on two large projects with other universities, companies, other Microsoft Research groups, and with Microsoft product groups in Redmond and Cupertino. These projects are Scalable Servers and Media Presence. "
I can't see scalability involving commodity hardware with MS OSes. In spite of Microsoft's desktop domination strategies and small business server dominance (arguably, at least for the moment), they know they won't be taken seriously about clustering Windows 2003 Server, purely because there is no design AFAIK in the kernel for operating in clusters in the first place. This is supercomputing using commodity hardware, not supercrashing using commodity OSes. Linux is perfectly situated to be recommended by anyone because it is not a competitor's product, per se.
The homepages of the two men can be seen here, if anyone is interested in some of the more interesting history of the two. Little of it has to do with Microsoft propaganda and the marketing machine:-
Gordon Bell
Jim Gray
Conversion Rate Optimisation French / English consultant
They're all wrong. The government doesn't need to support supercomputer research or Beowulf clusters. It needs to spend more money on quantum computing research, which in recent times seems to have fallen by the wayside. While everyone is rushing to build that next computer or supercomputer that is twice as fast as the last one, we could be rushing to build one that's several million times faster than anything current technological approaches will ever be able to reach. According to certain famous formulas (sorry, forgot the guy's name) computer chips are only going to get faster and faster as they get smaller and smaller, but eventually they'll be too small to improve on, and that's where we need quantum computing. If you look at the formula (sorry about the name again) and do the math you'll see that this time is quickly approaching, and if we aren't ready to continue to progress past that point the whole bottom could fall out of the tech industry
this story is obviously meant to distract the /. community, right now your systems are being hacked by the very clever ashcroftians in conjunction with the department of homeland security, you all have been owned
Try reading page 2, Asshats! I thought we were supposed to be smart...
Raw speed will always be useful for problems that are hard to parallelize. Right now those problems (parts of crypto, some quantum physics calculations, etc.) are important scientifically, but away from the money.
Industry will spend R&D money on clustering for storage and reliability, without major government subsidy, because there's a crying need for it. How much government money went into Google/eBay/Amazon?
Government research is supposed to complement industry R&D - to be aimed at fields where the results are still important, but maybe not as profitable. This is why government should not abandon raw speed as a research goal.
To a Lisp hacker, XML is S-expressions in drag.
By rewriting existing scientific programs, they say, researchers will be able to get powerful computing from inexpensive clusters of personal computers that are running the free Linux software operating system.
"The supercomputer vendors are adamant that I am wrong," Dr. Bell said. "But the Beowulf is a Volkswagen and these people are selling trucks."
All the people who are responding saying they don't mention Linux didn't read the second page.
There are 4 boxes to use in the defense of liberty: soap, ballot, jury, ammo. Use in that order. Starting now.
I saw that it could be google too, but anyhow, I made a username/password for y'all:
slashdot124
slashdot
Be wary however, I registered as a North Korean military R&D official under high salary.
---
"The chances of a demonic possession spreading are remote -- relax."
All this for the price of a few supercomputers every year. And the market for supercomputers pushes several technologies; for example, high speed interconnect and gallium arsenide, and sets the bar for high performance silicon. Pretty good deal, doncha think?
But now the Moron-in-Chief wants to bring back nuclear testing. (pardon me, 'nookyuler.' Bush can't be wrong about something as simple as pronunciation, can he?). Farewell to deterrence. Farewell to common sense...
--- Often in error; never in doubt!
They didn't advocate.
They simply spotted a trend, and suggested that BECAUSE of it (because of the use of Beowulf clusters of Linux machines), the focus of research should be on large data storage.
.sigs are for post^Hers.
money would be better spent on massive storage instead of ultra-fast computers
Of course, I agree fully with this argument.
How else are we going to store all that pr0n?
Oh, yeah and the other data too.
.
Have you read the moderator guidelines? Well, have you, PUNK? (and I want a Karma: Gnarly option)
My calendar says it's my birthday,
;)
so happy birthday to me.
(maybe not april fools, but commemorating the birth of a fool?
In the future, I would want to not be isolated from my friends in the Space Station.
Research on building Mega beowulf clusters is a legit govt activity and so is building some. But the beauty of the beowulf cluster is that it is affordable to businesses, academics and govt, plus it's very adaptable to budgets and interconnection schema (fast, slow, grid, scavenger).
but beowulf clusters won't replace the need for super fast, super scalable computers with well architected interconnects. there are lots of problems in this class, mostly physics simulation, that just can't be done well on beowulf clusters.
I should probably note that my own work involves large computer clusters. However my problems (in biology) are in fact well suited for beowulf clusters. thus I'm happy to hear of more money for beowulf computing. but frankly I think that this should be in addition to the fast computers.
the flip side here is that it might be the case that money for fast computer resources is not being spent as well as it could be at present. there seems to be more emphasis on "landing the contract" for the computer center than on building a good design. congress via DOE tends to dole these things out in a political fashion, making sure each big client gets funding for a center rather than letting the best center get the most contracts. as a result some of the so-called super computers may already be just glorified, too-expensive-per-cpu, unscalable systems that could be eclipsed by a comparable low cost beowulf system.
but that being said it's still an area that the gov needs to fund since it won't drive itself commercially but it's needed for lots of science and simulation.
Some drink at the fountain of knowledge. Others just gargle.
Do you see the irony in M$ recommending that Linux be used for storage? Obviously this is the best solution, but the mere fact that M$ is recommending something other than their own could show a shift in their motives.
The parent is offtopic how? May you be Meta-modded into oblivion, DipShit.
http://www.research.microsoft.com/~Gray/talks/CSTB_SuperComputing_Study_Group.ppt
I've seen reports the US decided not to sell supercomputers to countries like Pakistan and India. So my question is: can these countries do a good enough job with Beowulf clusters? What is it they absolutely cannot do without a supercomputer?
I'm surprised the government is still funding old-fashioned "Supercomputers" though. Well no, I guess I'm not. They're still subsidising helium production, so why not supercomputers?
Seems like everyone who needs tons of power has been doing Beowulf clusters for years. Wish the government would catch up.
ad logicam Claiming a proposition is false because it was presented as the conclusion of a fallacious argument.
He also talked about CERN generating 10 PetaBytes a year when their new collider comes on line
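For a rough sense of scale, here is a back-of-the-envelope sketch of what 10 PB a year means as a sustained data rate. It assumes decimal petabytes (10^15 bytes) and a 365-day year, and the function name is made up for illustration; none of that comes from the talk itself.

```python
SECONDS_PER_YEAR = 365 * 24 * 3600  # assumption: a 365-day year
BYTES_PER_PB = 10**15               # assumption: decimal petabytes


def sustained_rate_mb_per_s(petabytes_per_year: float) -> float:
    """Average rate in MB/s (10**6 bytes) needed to absorb a yearly data volume."""
    bytes_per_second = petabytes_per_year * BYTES_PER_PB / SECONDS_PER_YEAR
    return bytes_per_second / 10**6


cern_rate = sustained_rate_mb_per_s(10)
print(f"10 PB/year is roughly {cern_rate:.0f} MB/s, around the clock")
```

Under those assumptions it works out to a little over 300 MB/s every second of the year, before counting any copies or reprocessing.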
Supercomputers are sexy, but are losing the technology war. If you start designing a new one today it will be years before it is ready. During those years Intel and AMD will crank up their clock speeds and negate much if not all of the CPU speed advantage you get from your fancy design. Why not go for parallelism from cheap machines?
No electrons were harmed creating this post, though some may have been subjected to electrical and/or magnetic fields.
this is not what high performance computing is about. this is the class of problems that are embarrassingly parallel and don't need good disk access. in short, pointless benchmarks like computing pi rather than solving real tightly coupled physics problems like, say, asteroid impacts or molecular dynamics. or problems where processors have to access the disk a lot, or share data.
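To make "embarrassingly parallel" concrete, here is a minimal sketch of the compute-pi kind of benchmark: each worker integrates its own slice of 4/(1+x^2) over [0, 1] and nobody talks to anybody until the final sum. The function names and the use of a thread pool are my own stand-ins for cluster nodes (threads in CPython will not actually speed this up); a tightly coupled simulation would instead need to exchange data between workers at every step.

```python
import math
from concurrent.futures import ThreadPoolExecutor


def partial_pi(chunk):
    """Midpoint-rule integral of 4/(1+x^2) over one slice of [0, 1]."""
    start, stop, n_total = chunk
    h = 1.0 / n_total
    return h * sum(4.0 / (1.0 + ((i + 0.5) * h) ** 2) for i in range(start, stop))


def parallel_pi(n_total=100_000, workers=4):
    """Split the interval into independent chunks; the only shared step is the final sum."""
    step = n_total // workers
    chunks = [
        (w * step, n_total if w == workers - 1 else (w + 1) * step, n_total)
        for w in range(workers)
    ]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return sum(pool.map(partial_pi, chunks))


pi_estimate = parallel_pi()
print(f"estimate = {pi_estimate:.10f}, error = {abs(pi_estimate - math.pi):.2e}")
```

Because the chunks share nothing, adding workers needs no fast interconnect, which is exactly why this kind of job says little about how a machine handles coupled problems.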
Some drink at the fountain of knowledge. Others just gargle.
It's not happened in 20 years so far.
I've seen code running on a supercomputer which was first written for a VAX. The authors were long gone but nobody could afford the rewrite.
Don't confuse Microsoft Research with the rest of Microsoft. The research branch has the same atmosphere as a university. In fact, Microsoft has bought a number of university research groups wholesale. Quite a few famous people are now working for them (e.g. Tony Hoare, Erik Meijer, and the guys in the original article).
I've heard presentations from them, and talked to them in private, and I can assure you they are far from following the party line. I'm sure that any pressure from above to do so would cause massive protest.
Microsoft is very wise to run the research branch this way. Research is not the province of yes-men.
First of all, isn't it a good thing that Microsoft researchers are thinking of using Linux for their research needs? It only makes sense to use a stable free operating system that's highly configurable to keep track of all their data. In the research situation, I agree that it's better to have mass storage than fast computing. In a few years they would just have to upgrade again.
Open Source. It's the difference between trust and antitrust.
Massive data storage doesn't mean a thing to people like me who do computational physics work. We need better supercomputers to simulate larger systems... or simulate them faster. Sure, we can simulate a system of 300,000 particles within a few hours, but there could be great value in simulating systems of millions of particles. Maybe there is some effect that we miss... or something.
Anyway, data storage is not a problem in MY field -- and I would think that government interests in supercomputing lie in places OTHER than fast database servers or whatever.
Mmmm......sacrelicious.
According to this a "beowulf" is a cluster of cheap computers, NOT a cluster of cheap LINUX computers. I don't think Microsoft is advocating Linux, as much as I/you/we wish they were... http://www.phy.duke.edu/brahma/beowulf_online_book/node61.html
BTW, I at least didn't think it should have been modded down. Humor is lost on some people, I suppose.
Redmond, Washington -- In a move related to the announcement of Linux for clustering, or large supercomputing projects, Microsoft has announced a new version of Windows XP, which supports up to 200 CPUs. The Highly-SMP version of XP is, however, limited by the fact that all 200 CPUs must be on the same motherboard.
"200 CPUs should be enough for anybody." Microsoft Chairman Bill Gates was quoted as saying.
Steven Ballmer, CEO of Microsoft, contributed the following: "Developers, Developers, Developers, Developers!"
"Microsoft is committed to providing highly-scalable, enterprise-wide, trustworthy computing." claimed I. Amaliar, head of Microsoft's marketing division.
MS has announced a release date for this new version of XP for August, 2004.
If telephones are outlawed, then only outlaws will have telephones.
They advocate building cheap Linux-based Beowulf clusters (PCs in parallel) instead of supercomputers.
When talking about building a cluster, the first question I'd ask is whether they want HA (High Availability) or HPC (High Performance Computing). Some people think clustering up 100 Pentiums would immediately achieve HPC. Without careful design and estimation, the cluster will fail on both aspects.
Most cluster infrastructures focus on only one of the two aspects, mostly HA. When it comes to HPC, recent research shows that a computational grid would be more effective.
Nevertheless, a cluster with many nodes may not behave better than a single supercomputer in HPC, but the former would definitely be more cost-effective and excel in HA. It's very clever of them to recommend Beowulf on Linux; their ground would be weakened if they recommended a Microsoft cluster, which is quite expensive in my opinion.
Clusters are ideal for problems that are easily parallelized (sp?) - such as the modeling of protein structures/protein folding. As a matter of fact, most of the researchers here (UIUC) that do research in that area are using LINUX clusters. As are the people doing this with the CAVE/CUBE. So it looks like a lot of them have already made the move away from supercomputer based research and to cluster based research.
I agree to the point that money should be spent on data storage, but I'm not sure that money should be taken out of the "super computing" budget or wherever the money comes from. I think it should be another priority, but really, we need both. Clusters aren't the solution to every problem, and super computers have their place. All in all I think it amounts to we need more government spending in the IT sector, and better spending in general. The ISP where I work is also a geological data and oil reservoir company. We recently did a project for the DOE and they budgeted us $2 Mil. just for a web page about the project. Ridiculous. That $2 Million would buy a pretty nice data storage center I would think. But I guess that's what happens when your govt pays $500 for a hammer.
Everyone is entitled to their own opinion. It's just that yours is stupid.
> And how much can a MS cluster scale anyway?
Windows 2000/2003 WLBS can theoretically scale to 32 nodes, but I have seen performance decreases after 16 or so.
Windows 2000 MCS can scale up to two nodes with Advanced Server, and four nodes with Datacenter.
Windows 2003 MCS can scale up to four nodes with the Server, and eight nodes with Enterprise.
jwg
The Titan Cluster
The Platinum Cluster
TeraGrid Clusters Successfully Installed at NCSA
These clusters run either RedHat or SuSE Linux and are available for researchers nationwide.
These clusters are not beowulf; they allow access through a general scheduler and have MPI to run programs that use a group of nodes at once. This gives the greatest flexibility to the users to create a computational system that can be optimized for the size and needs of their problem. The size of a cluster that can be supported at a national center allows enough computational power to solve problems that can't be solved elsewhere. Given that a cluster of 128 nodes is now considered an institutional asset and within the purchasing power of any university, it makes sense to use federal funds to create systems to handle problems beyond the scale of a cluster that any university might own.
Another aspect of this issue arises in the assumption that cluster computing is so easily accomplished that it might be compared to the setup of a single system. I respectfully submit that the simplest of clusters is none too easy to deploy and use as of today, not to mention the lack of support one gets for the application of their scientific research to a stock parallel computing platform. The national centers can afford to have consultants and researchers on staff that specialize in these matters, as well as full-time admins.
Note: The opinions expressed here are my own and not necessarily representative of my employer or the federal government. In addition, given that I am employed by NCSA, a slight element of bias may be present in my statements. :)
The Internet has no garbage collection
I watched a presentation that Dr. Bell did at NSF fairly recently.
BTW, of the lectures that NSF have had (Computer science related) he has been one of the only lecturers to NOT allow the video presentation to be posted.
He IS advocating the Microsoft view of the world. Which is that "in the future" everybody will have massive storage and computing will be concentrated on the individual desktops.
This is what Bill Gates believes and wants! Dr. Bell does not believe that clustered supercomputing (aka grid computing) is worth any funding.
The idea of GRID computing is contrary to the MS view of where the power should be in desktop computing.
Thanks a lot for the information, though the question was mainly rhetorical. I have never worked on Windows clusters. If your info is correct, then Linux clusters are really superior
.ACMD setaloiv siht gnidaeR
Skimming the article gave me the sense that either the reporter's ideas or those of Bell and Gray are just all over the map.
One fact not mentioned is that planning for storage is already an integral part of planning a supercomputing center. Also not mentioned is another predictable outcome: generating lots of data eventually requires someone, or some thing (e.g., a beowulf cluster), to analyse it. Thus, under the present trends, data mining itself as well as the development of methods for *how* to data-mine are becoming increasingly important.
Where I would agree with Bell and Gray in principle is that we focus on *good* scientific problems. With more computers and faster computers, you can produce more and more crap with less and less thought. So, the issue becomes one of defining what good science is. Today it's defined in terms of politics, money, and peer review, not all mutually exclusive. But one thing to keep in mind is that the size of a research group doesn't necessarily correlate with the quality of the science produced therein.
To-do List: Receive telemarketing call during a tornado warning. Check.
"Letting the researchers decide" is a clear means of pushing M$ crap. While we might imagine people springing up to do the work, M$ is still up to its tricks and not everyone knows how to set up a cluster. Between a shortage of trained people ready to move and Palladium, M$ stands to suck up sales. NASA and others have shown the way, but M$ has blocked better schemes before. Just look at the last article on running a research lab with free software. The winners were drowned in a sea of astroturf.
Beowulf clusters are the best solution to many problems and individual researchers are building them.
The bottom line for Microsoft is that such a policy shift would provide potential platforms for their sales while hurting companies like SUN. They know things about Palladium that we don't.
DMCA, Hollings, Palladium. What might have sounded like paranoia is now common sense.
I always thought Princess Lea was just a little NUTS, too.
Actually, Beowulf clusters of 800-1,000 machines running Linux can be competitive with supercomputers.
I remember reading in Wired magazine a few years ago about a biotech company here in the San Francisco Bay Area that clustered several hundred machines running Pentium III 600 MHz CPU's to do DNA mapping and analysis--and the results were just as fast as most supercomputers costing several times what that cluster cost.
Imagine what a cluster of 700 to 1,000 blade servers running the latest Intel Xeon CPU's can do now! =)
It seems as though all the news stories chronicle how fast the latest incarnations of computers are but are less focused on the massive amount of storage each new cluster requires. Here's a look at the top 5 supercomputers:
Earth Simulator: 35.86 teraflops, storage not listed
ASCI Q #1: 7.7 teraflops, 600TB storage
ASCI Q #2: 7.7 teraflops, 600TB storage
ASCI White: 7.2 teraflops, 160TB
MCR Linux: 5.7 teraflops, 138TB
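As a quick illustration of how much storage sits behind each unit of compute, here is a small sketch that divides the listed storage by the listed teraflops. The numbers are only the ones quoted above; the Earth Simulator is left out because its storage is not listed, and the helper name is hypothetical.

```python
# (name, peak teraflops, storage in TB), copied from the list above.
systems = [
    ("ASCI Q #1", 7.7, 600),
    ("ASCI Q #2", 7.7, 600),
    ("ASCI White", 7.2, 160),
    ("MCR Linux", 5.7, 138),
]


def tb_per_teraflop(teraflops: float, storage_tb: float) -> float:
    """Terabytes of storage per teraflop of peak compute."""
    return storage_tb / teraflops


ratios = {name: tb_per_teraflop(tf, tb) for name, tf, tb in systems}
for name, ratio in ratios.items():
    print(f"{name:12s} {ratio:6.1f} TB per teraflop")
```

On these figures the ASCI Q machines carry roughly three times as much storage per teraflop as ASCI White or MCR Linux.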
I think they realize that clusters can be built cheaply like never before. It used to be that tens of millions of dollars had to be spent to get a Cray, but for some simulations like modeling weather, clusters are better. With the advent of Linux on x86 machines, computing power is getting cheaper. Oak Ridge National Laboratory has a Beowulf cluster built from discarded and donated 486's and Pentiums. It doesn't get much cheaper than practically free.
Well, there's spam egg sausage and spam, that's not got much spam in it.
that is some funny shit
You're thinking of Moore's Law - the transistor density on a chip grows by a factor of two every 18 months. People have been saying for years that we're going to hit the physical limit in a couple months/years, and that this growth can't be sustained forever. However, each time this limit approaches, technical innovations appear, and we get a reprieve of another 5-10 years. Most of the computer architecture professors I know don't take these "the sky is falling" warnings seriously anymore.
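To put numbers on that doubling rule, here is a minimal sketch that simply compounds a factor of two every 18 months, as stated above. The function name and the default doubling period are just illustrative parameters, not a prediction of what chips will actually do.

```python
def moore_growth_factor(years: float, doubling_months: float = 18.0) -> float:
    """Growth in transistor density after `years`, assuming one doubling every `doubling_months`."""
    return 2.0 ** (years * 12.0 / doubling_months)


for years in (1.5, 3, 5, 10):
    print(f"{years:>4} years -> about {moore_growth_factor(years):.1f}x")
```

Under that assumption a decade of uninterrupted doubling is about a hundredfold increase, which is why every predicted "physical limit" gets so much attention.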
And quantum computing hasn't fallen by the wayside. Building a useful qubit is difficult, and is mostly worked on in universities and research facilities - it's not ready for prime time yet. It hasn't been forgotten though. At the same time, computer scientists are developing quantum computer algorithms - it's been years since I've looked into this, but we can do nondeterministic database lookups via quantum algorithms. And surely more have been developed. You can't expect the fundamental nature of computing, which has been developed over the past 60 years, to change overnight to something completely different.
Raw Power vs. Massive Storage
Raw Power is a classic but I thought that Massive Storage was far too commercial and derivative.
Many claim that it's what caused the breakup of the Stooges, something Iggy Pop denies to this day.
> If your info is correct, then Linux clusters are really superior
Here is a Microsoft Technet article for reference. Yes, it looks like IBM's Linux cluster kicks Microsoft's tail.
I think I read an article recently that interviewed Bell, and he said Beowulf cluster is not the answer? And the US gov should lead the development of faster computer ... I am pretty sure the article is quite current, maybe last year or the beginning of this year.
Anyone can confirm that?
that this story defies having a decent "imagine a beowulf cluster of *" joke being made about it.
please, can someone help ?
... Microsoft has announced that two prominent researchers from Bay Area Research Center have been surgically turned into lobster like creatures.
When pressed for reasons this was done, Bill Gates responded "Lobster tastes really good."
Arbitrary sig
I am sympathetic to the core argument about super data centers and that Beowulf clusters are allowing great strides in clustering computing power. I do think that pursuits along both super-data and super-computer paths are worthwhile. Both paths can feed off each other and both have problem types which they excel at.
So long and thanks for all the fish . . . !!!
Digital signal processing can suck up huge amounts of processing power. How would Joe Schmoe feel about a software defined radio or a software defined television in his PC? Today we have DVD players implemented in software. Someday we will have high-definition television receivers implemented in software.
Mea navis aericumbens anguillis abundat
See slide 12 of his presentation a couple of weeks ago:
http://www.research.microsoft.com/~Gray/talks/CSTB_SuperComputing_Study_Group.ppt
Esteem isn't a zero sum game
Okay, first of all, the really tough CFD problem du jour is incompressible flow, which is--you got it--elliptic. Resolving or modeling the turbulent scales in a time-accurate way, especially near boundaries, is the most difficult part. Fluid dynamics equations only go hyperbolic where compressibility is important, such as in supersonic flow. For incompressible, you'll notice that solving the pressure-Poisson equation generally requires an FFT, a non-local operation (or you can use e.g. a vortex method).
*HOWEVER*, it is *much* easier to solve heat and Poisson equations than Navier-Stokes, for the very important reason that they are linear. I mean, really. Any old CAD/FEM package can do heat conduction, and Poisson is just an FFT away. What makes hydrodynamics hard is its nonlinearity. It's just as elliptic as the other problems you mention under the incompressible conditions most often studied.
As I see it they are talking about using computing power in a different way than it is being used now.
Instead of focussing on a small part of the data and doing a lot of number crunching on it, they want to take the whole chunk of data and process it at once. That way scientists might get a more global perspective on things, and might be able to form and/or adjust different theories during the preliminary results.
The way that can be done is through a lot of cheap hardware in a cluster, but they don't care which OS it runs, as long as scientists get a better way to do their jobs.
It's not about MS, Linux, BSD, VMS or whatever, it's about getting better results and efficiency from computing power.
home
Microsoft researchers are suggesting that clusters of Linux PCs be used? Are they now Ex-Microsoft researchers?
Jaysyn
There is a war going on for your mind.
Writing code for clusters with much larger latencies than those supercomputers is more difficult. Parallel coding by itself is already an art form. Scientists want to think about the science behind their problem, not the technical details behind the parallelization of their code. And the more complex your code, the more likely it is that you make mistakes, and that is a big problem for simulations that take a long time to complete.
However, the problem at small universities with these expensive "super computers" they own, like the Enterprise 10000 we have, is that they are intended to be replaced a lot less frequently than a desktop workstation. At first your code will run a lot faster on a few nodes of such a system, but a standard workstation gains on it quite fast. So when you're halfway through the time you are stuck with the super computer, your new ultra cheap workstation will outperform the expensive supercomp on problems that require small latencies, or scale badly. A cluster is often much cheaper to update.
So for smaller facilities, where most of the jobs that are submitted are allowed to use up to 8 nodes for example, I would use clusters and update the network infrastructure and CPU's as often as is possible. For large jobs I don't think we can do without nationally owned, big supercomputers. There simply is a group of problems that require these supercomputers, and where clusters can't be used. And these national science centers can of course maintain different kinds of supercomputers. If your problem requires low latency, use the supercomp, if it doesn't, use the cluster.
The authors propose to give scientists money to buy their own clusters, but I already saw calls for proposals (where you can apply for a research grant) where you could either reserve computing power on the national super computers, or get money to buy a cluster, or otherwise spend it on computer hardware. Of course the real question stated in the article is whether a country like America needs to have the fastest supercomputer. I guess that question is just a political one, as is the question whether that country would need the biggest storage facility in the world.
I also do not really understand the storage facility thing. Storage is not something I would expect you need temporarily. Only intermediate results are temporary, but the data in the big databases they mention seem there to stay. Once you've bought storage for one project, you cannot allocate the storage to another project like in the case of supercomputing, where a project takes a month and then the power is handed to another project. If you have got such a project that needs lots of permanent storage space, why then not give THEM money to build such a storage system. Every university nowadays is on a fast line, and I don't see why that has to be central. Even storage divided among groups of researchers does not have to reside on a super data center. Just build systems for every requirement, with room to spare. Or am I missing something?
.. that the time it takes to finish a computing project is better spent waiting and doing nothing, because the technology improves so fast that at the end of your project you will have new systems available that didn't exist when you started!
(does anyone know who said it? i ferget at the moment. it's not moore's law, that's something different...)
I suggest you read Slashdot
Many scientists agree with the "moron-in-chief", as you termed him. Computer simulations provide the necessary assurance that the nuclear stockpile is safe and reliable today. There are serious questions about whether this is an adequate long-term solution. To have credible deterrence value, potential adversaries must believe that the weapons will work as advertised. Without data from physical experiments, computer simulations can be misleading or useless. The physical experiments are a necessary reality check. As the simulations become more complex and sophisticated, so does the uncertainty as to the accuracy of their results. At some point we may be forced to resume limited nuclear testing in order to check and validate the models used in the simulations.
Mea navis aericumbens anguillis abundat
As for weather prediction, a thousand processor cluster will need a thousand programmers to do all the optimization necessary to get even mediocre performance!
Weather prediction code is just full of special cases, requiring conditional branches inside of conditional branches, etc.
An example is the calculation of precipitation and the associated latent heat release. Precipitation only occurs where the vertical motion field is upwards and the moisture field is supersaturated. This means not only lots of testing and conditional branches but one nonlinearity multiplied by another, with the resulting latent heat release being the dominant term in the heat equation where precipitation is heavy, but small or zero everywhere else.
This cannot be made efficient even on a vector/parallel machine. I won't even talk about boundary layer processes!
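The kind of branch being described can be sketched in a few lines. This is a toy illustration only, not a real precipitation scheme: the heating formula (`rate * w * (q - q_sat)`) and all names are assumptions made up to show the two conditions and the product of two fields.

```python
def latent_heating(w, q, q_sat, rate=1.0):
    """Toy per-grid-point latent heating.

    w     : vertical velocity at each point (positive = upward)
    q     : moisture at each point
    q_sat : saturation moisture at each point
    Heating is zero unless the air is rising AND supersaturated.
    """
    heating = []
    for wi, qi, qsi in zip(w, q, q_sat):
        if wi > 0.0 and qi > qsi:
            # Product of two fields: this is the nonlinear term.
            heating.append(rate * wi * (qi - qsi))
        else:
            heating.append(0.0)
    return heating


# Three grid points: rising and wet, sinking and wet, rising and dry.
print(latent_heating([1.0, -1.0, 1.0], [2.0, 2.0, 0.5], [1.0, 1.0, 1.0]))
```

Only the first point gets any heating, so neighbouring grid points can take completely different code paths, which is the source of the branching complained about above.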
Why not the Borg icon? The story is about Microsoft, thus inviting yet another round of mindless MS-bashing. Most of the comments are about how this recommendation does/does not fit into Evil Scheme .NET. Slashdotters don't see this as a technology/IT story. Stick the right icon on it and let the flamethrowers loose!
Because *BSD is dead?
Jim Gray is a SQL advocate. Obviously he'd promote data storage over processing. The sad thing is that he is among those that want to go with the database status quo instead of progressing towards relational nirvana; and without truly relational systems we will never be able to get full use of all that data.
Leandro Guimarães Faria Corcete DUTRA
DA, DBA, SysAdmin, Data Modeller
GNU Project, Debian GNU/Lin
Jim Gray, left, and Gordon Bell, scientists at Microsoft's Bay Area Research Center in San Francisco, say that research will increasingly be data-driven and make use of inexpensive clusters of PC's.
BARC. It just doesn't quite have the same ring to it.
My name fits again.
"...a pair of Microsoft researchers..."
"They advocate building cheap Linux-based Beowulf clusters..."
come on guys...June 2nd, not April 1st.
Unless you assemble yourself, good luck buying a computer without windows. M$ makes money even if you use linux, providing you buy a preassembled computer from practically anyone.
What changed under Obama? Nothing Good
It seems odd that M$ is advocating the use of a Linux cluster. Perhaps the deal with $C0 has implications here. Perhaps M$ thinks it will own Linux soon.
When all else fails, use the backup...
In Soviet Russia, clusters Beowulf you!
The recent decision to test again was based specifically on allowing the US to begin creating low-yield ground-penetrating nuclear weapons. There is no mention of validating simulations in the congressional debate, only of the creation of new weapons.
moron-in-chief indeed.
http://www.nytimes.com/2003/06/02/technology/02SUPE.html?ex=1055131200&en=0cf96af3c5256967&ei=5062&partner=GOOGLE
I guess putting in "archive" instead of www doesn't work anymore. I tried and it just put me through to the front page. Oh, well. Google News usually works.
I prefer a void in conversation to a vacuous one.
Most scientific applications store their data in some proprietary format, or with the help of something like netcdf or hdf. It's not like the article is suggesting that they should turn all supercomputers into relational databases.
I'm sure Gray and Bell have earned the freedom to speak freely, and continue to earn this every time they do. (I've just looked at some of their online presentations - thought provoking)
And in this instance it doesn't hurt Microsoft, because right now the government money is going to a poker table that Microsoft isn't sitting at - the "big iron" table.
Microsoft doesn't lose by saying Linux wins at a game it isn't playing.
Better to attract the high roller (the gov't payor) away to a game you're good at (the database table).
Okay I've mixed enough metaphors.
Esteem isn't a zero sum game
Microsoft Research had a research project in the late 1990's called "Millennium". It was a prototype of Microsoft's future operating system for the new millennium. It was a distributed network that in theory would embrace the entire world. Believe me, Microsoft would not lack for CPU cycles if they implemented this.
The problem Microsoft would have would be scaling SQL Server into a world-wide file system. The solution: use Microsoft's considerable lobbying power (they spent three times as much as Enron on the 2000 elections) in order to get government research redirected for their purposes.
It does look like, from the descriptions of Longhorn, that it will be at least a partial implementation of Millennium. The Borg JVM (.Net) that Millennium will run on is already here. Full, world-wide implementation of Millennium might take a while. If the world is smart, it will never be allowed to happen. All relevant metaphors ("one system [to rule them all]", "computers ... assimilated", and Godzilla's wrath) apply.
Shinoda: "The age of Millennium."
Io: "What does that mean?"
Shinoda: "A thousand year kingdom. It wants to create a home for itself. There is one flaw in its plan: Godzilla."
"Godzilla 2000 [X] Millennium" (Japanese version)
it's quite astonishing that these researchers, who are otherwise well-reputed, have missed the whole point of government sponsorship of super-* facilities: to do what can't be done otherwise. mostly, that means running traditional supercomputer jobs, those that are tightly coupled. people who have loosely-coupled jobs have long ago bailed from the supercomputing arena, and have been building their own clusters. similarly, there's no unique advantage to centralizing data storage, and a huge disadvantage (bottlenecks in and out).
I have to wonder whether Markoff badly munged the intent of the Gray/Bell paper, since the way he presents it is internally inconsistent. that is: the gov should spend huge bucks on massive centralized storage, but computing should be decentralized ala grids. oops, how is all that compute power supposed to move data to/from the three national data repositories? perhaps the central problem here is the fallacy shared by grid-o-philes: that networking is getting dramatically faster. take a look at your own network: if you are lucky enough to have gigabit to the desktop, when did that upgrade happen (probably 100 upgrade happen? what kind of speed did you get on your last big download? I've experienced a speedup of something between 10 and 50x in the past, say, 10 years. that's pathetic, when compared to the speedup we all have experienced in CPU power, memory size/speed, and disk size/speed.
there's no Moore's Law of networking: no n^2 process to keep accelerating (unlike die or disk densities). yes, there are technological improvements, and yes, you can gang cables together to scale bandwidth almost linearly. no such help for latency, though. and technological improvements are neither infinite nor increasing. that means that the network is becoming more of a bottleneck, not less.
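The "10 to 50x in 10 years" figure above can be converted into an equivalent doubling time and set next to the 18-month doubling usually quoted for Moore's Law. A small sketch, where the function name `doubling_months` is an illustrative choice and the 18-month figure is the commonly cited one rather than anything measured here:

```python
from math import log2


def doubling_months(speedup, years):
    """Months per doubling implied by an overall `speedup` over `years`."""
    return years * 12 / log2(speedup)


print(round(doubling_months(10, 10), 1))  # pessimistic end of the estimate
print(round(doubling_months(50, 10), 1))  # optimistic end of the estimate
```

That puts the network at roughly 21 to 36 months per doubling, so even the optimistic end trails an 18-month doubling, and the gap compounds every year.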
Bah. Linux users are either using those abandoned machines stashed in company supply closets under the boxes of copier paper or else they're buying parts and building their own. I can't imagine why anyone would buy new Dells only to strip the hard drives and install Linux. Unless you're talking laptops.
I am the inventor of the hilarious refrigerator alarm.
I agree. Rather I think their point was that the money should be diverted from advancement in supercomputer computing speed to solving large-scale data problems.
Mr. Gray's presentation from May 20th has many solutions, and none lead directly to a Microsoft database.
He seemed inclined towards a distributed solution.
Esteem isn't a zero sum game
I'm a mathematician working on parallelizing the PDE solvers with domain decomposition methods for the weather problem. It can, in fact, be parallelized.
I don't know much about fluid dynamics, but it's still probably PDE's, and there are domain decomposition methods for those as well.
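For anyone wondering what "domain decomposition" looks like in the simplest possible case, here is a minimal sketch. It is not a weather solver and not the method used in the work mentioned above: it is a plain Jacobi sweep for the 1D equation u'' = 0, split into two subdomains that each carry one halo (ghost) point from their neighbour. All names are made up for illustration.

```python
def jacobi_sweep(u):
    """One Jacobi sweep for u'' = 0; the two end points are held fixed."""
    new = u[:]
    for i in range(1, len(u) - 1):
        new[i] = 0.5 * (u[i - 1] + u[i + 1])
    return new


def decomposed_sweep(u, split):
    """The same sweep done on two subdomains with one halo point each.

    The left piece owns points 0..split-1 and borrows u[split] as its halo;
    the right piece owns points split..n-1 and borrows u[split-1].
    Each piece could run on a different node; only the halos are exchanged.
    """
    left = u[: split + 1]
    right = u[split - 1:]
    new_left = jacobi_sweep(left)
    new_right = jacobi_sweep(right)
    # Drop the halo cells when stitching the pieces back together.
    return new_left[:-1] + new_right[1:]


# Boundary values u(0) = 0 and u(1) = 1 on a 9-point grid.
u_global = [0.0] * 8 + [1.0]
u_split = u_global[:]
for _ in range(500):
    u_global = jacobi_sweep(u_global)
    u_split = decomposed_sweep(u_split, 4)

print(u_global == u_split)
print([round(x, 3) for x in u_split])
```

Both versions do the same arithmetic, so they agree exactly, and both converge to the straight line between the boundary values. The only data the two halves have to trade per sweep is one number each, which is what makes this kind of problem parallelizable.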
Performance computing has shrunk so much that there's no longer a need to keep mainframes around. That is pretty sad for the mighty Big Science history of computers. Small ubiquitous computers kill the aura of, and the curiosity about, these very complex machines, something that the stereotype of huge powerful computers and a supporting team of geeks created.
Sad to see we might not have a completely different architecture of big computers from small, the way the S/360, S/390, Cray etc. had. Now we will see farms of Athlon64s and feel good that they're using the same Linux kernel that the PDA in the pocket is using, but perhaps there is no mysterious underlying OS like OS/390 with its own languages, programming and look, like the AS/400. Big Science computer technicians will no longer be an elite group.
I wonder if someone is selling an S/360, I could start a mortgage for it! Porting DOOM to it should be very cool.
"Give orange me give eat orange me eat orange give me eat orange give me you." -Nim Chimpsky
BSD would be as good as some, and better than most.
These guys are dead-on right. You hear about all of these supercomputer centers, but then you dig around and find out they run them in an over-subscribed time sharing mode, so nobody sees the speed anyway. That is where your coarse-grain parallelism comes from: many different users.
Designer of the PDP-8 - perhaps one of the most economical and aesthetically pleasing designs of all time. Designer of the PDP-11 which led to the VAX.
:) :)
DEC was 2nd largest computer manufacturer for years.
Add to this his more recent accomplishments and I have no doubt he has more knowledge about computer design than everyone posting here put together.
Plus, I've met him, so nah-nah-nah!
Take everything we have now, and then get rid of tractor-trailer trucks and force everything to be moved around by car.
The economy would grind to a halt. You could increase the technology of everything else 100 fold and it wouldn't make a difference.
100 slow processors which can communicate really effectively will beat 100 really fast processors which can barely communicate by a very significant margin for some problems.
Can two grand masters play chess by mail? Of course. But two 10-year olds playing through Yahoo! IM will finish their game much quicker.
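The trucks-versus-cars point can be made with a very crude cost model: run time is compute time plus the number of synchronization steps times the latency of each one. Everything here is an assumption for illustration, including the model itself, the function name `run_time`, and every number plugged into it.

```python
def run_time(work_ops, n_procs, ops_per_sec, sync_steps, latency_s):
    """Crude model: perfectly divided compute plus one latency per sync step."""
    compute = work_ops / (n_procs * ops_per_sec)
    communicate = sync_steps * latency_s
    return compute + communicate


# 100 slow processors on a low-latency interconnect (hypothetical numbers).
slow_tight = run_time(1e9, 100, 1e6, sync_steps=1e4, latency_s=1e-5)
# 100 processors that are 10x faster but on a high-latency network.
fast_loose = run_time(1e9, 100, 1e7, sync_steps=1e4, latency_s=1e-2)

print(f"slow CPUs, fast network: {slow_tight:.1f} s")
print(f"fast CPUs, slow network: {fast_loose:.1f} s")
```

With these made-up numbers the slower processors finish about ten times sooner, because the job synchronizes often. Drop `sync_steps` to a handful and the ranking flips, which is the "for some problems" caveat.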
paintball
Or if you're too lazy to do that just read it here.
June 2, 2003
In Computing, Weighing Sheer Power Against Vast Pools of Data
By JOHN MARKOFF
SAN FRANCISCO, June 1 -- For almost two decades the federal government has heavily underwritten elaborate centers to house the world's fastest supercomputers. The policy has been based on the assumption that only government money could ensure that the nation's research scientists had the computing power they needed to pursue projects like simulating the flow of air around a jet airplane wing, mimicking the way proteins are folded inside cells or modeling the global climate.
But now two leading American computer researchers are challenging that policy. They argue that federal money would be better spent directly on the scientific research teams that are the largest users of supercomputers, by shifting the financing to vast data-storage systems instead of building ultrafast computers.
Innovation in data-storage technology is now significantly outpacing progress in computer processing power, they say, heralding a new era where vast pools of digital data are becoming the most crucial element in scientific research.
The researchers, Gordon Bell and Jim Gray, scientists at Microsoft's Bay Area Research Center, presented the argument last month in a meeting of the National Research Council's Computer Science and Telecommunications Board at Stanford University.
"Gordon and I have been arguing that today's supercomputer centers will become superdata centers in the future," said Dr. Gray, an expert in large databases who has been working with some of the nation's leading astronomers to build a powerful computer-based telescope.
The policy challenge spelled out by the Microsoft researchers comes as a quiet national policy debate over the future of supercomputing is taking place among experts in scientific, industrial and military computing.
In February the National Science Foundation Advisory Panel on Cyberinfrastructure issued a report calling on the nation to spend more than $1 billion annually to modernize its high-performance computing capabilities.
Separately, a study completed last year by a group of military agencies was released in April. Titled "Report on High Performance Computing for National Security," it calls for spending $180 million to $390 million annually for five years to modernize supercomputing for a variety of military applications.
Computer scientists added that the construction of the Japanese Earth Simulator, which is now ranked as the world's fastest supercomputer, has touched off alarm in some parts of the United States government, with some officials advocating even more resources for the nation's three national supercomputer centers, located in Pittsburgh, at the University of Illinois at Urbana-Champaign and at the University of California at San Diego.
Whatever decisions the government makes could have vast implications for computing.
The decision in 1985 to build a group of what were then five supercomputer centers linked together by a 56-kilobit-per-second computer network was a big impetus for development of the modern high-speed Internet, said Larry Smarr, an astrophysicist who is director of the California Institute for Telecommunications and Information Technology.
He said that Dr. Bell and Dr. Gray were correct about the data-centric technology trend and that increasingly the role of the nation's supercomputer centers would shift in the direction of being vast archives. Rapidly increasing network speeds would make it possible to increasingly distribute computing tasks.
Central to the Bell-Gray argument is the vast amount of data now being created by a new class of scientific instruments that integrate sensors and high-speed computers.
While the first generation of supercomputing involved simulating physical processes with relatively small data sets, the tremendous increase in data storage technology has led to a renaissance in
Did i just hear Microsoft researchers advocating Linux beowulf storage clusters?
If i did then it should now read "Ex Microsoft researchers..."
Slashdot - The one stop shop for procrastination
1 Terabyte = roughly 1,600 CDR's. I once did a backup of my (relatively small) system onto CDR's. Took over 40 discs. Imagining how long it would take to backup a terabyte, I can only say,
The Web is like Usenet, but
the elephants are untrained.
Whether you can use a supercomputer or a cluster depends not so much on whether the task can be broken into independent pieces (code can be and is independent by the nature of OO!!) but whether the solution can be recoded in parallel cost effectively. It's usually expressed in terms of one woman produces one baby every nine months, but nine women cannot each produce one baby a month! If latency (or the time required to wait for results) is not a factor, as I believe it's not in the case of most large projects, then clusters can certainly replace many supercomputers. This is the real problem - how quickly you need answers to the problem you're setting, and therefore how well you can "word" the question.
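The nine-women point is usually formalized as Amdahl's law, which the comment above does not name but which captures the same idea: if only a fraction of the work can run in parallel, the serial remainder caps the speedup. A minimal sketch, with `parallel_fraction` as the assumed input:

```python
def amdahl_speedup(parallel_fraction, n_workers):
    """Speedup predicted by Amdahl's law for a given parallel fraction."""
    serial = 1.0 - parallel_fraction
    return 1.0 / (serial + parallel_fraction / n_workers)


print(amdahl_speedup(0.0, 9))               # nothing parallel: no gain at all
print(round(amdahl_speedup(0.9, 100), 2))   # 90% parallel on 100 workers
print(round(amdahl_speedup(0.9, 10**6), 2)) # same job on a million workers
```

A job that is 90% parallel never gets past a 10x speedup no matter how many nodes are thrown at it, so whether a cluster is worth it depends on how much of the problem can actually be recoded in parallel.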
"It's not your information. It's information about you" - John Ford, Vice President, Equifax
I sorta expected that, I mean, a sports reference on slashdot? Of course nobody gets it.
Couldn't companies start to use distributed storage? Think how many people have most of an unused 40gb or bigger disk in their workstation, especially at companies with a strict no-3rd-party-apps or even -user-data on the local drives policy.
Deploying something like the freenet system locally would help them out lots in this area - content is decentralised onto many workstations, but as long as a significant percentage (90%+) is up then nothing should just vanish from the network. Even hard disk failures won't lose data, especially if everything is FEC encoded on insertion into the network. (Also means you are more likely to always be able to retrieve files.)
Content is also self cleaning, anything not being used is just deleted automatically. The whole thing would be fast because there would be many upload points for one file. You can't retrieve data without the right key, so it's reasonably private. (You could even mandate or automate encryption.)
Some mechanism would need to be added so that unpopular files would get backed up to cold storage before being 'forgotten', but that problem probably isn't insurmountable - hell, you could even send everything being inserted into the network to tape and then use a local freenet just for live data/retrieval, plus shared server side storage arrays for secure/confidential data.
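To see why FEC helps so much when about 90% of workstations are up, here is a back-of-the-envelope sketch. It assumes a generic k-of-n erasure code (any k of the n fragments rebuild the file) with each fragment on an independent machine. The parameters k=8, n=16 are made up for illustration and are not taken from freenet.

```python
from math import comb


def availability(k, n, p_up):
    """Probability that at least k of n independently stored fragments are reachable."""
    return sum(
        comb(n, i) * p_up**i * (1 - p_up) ** (n - i)
        for i in range(k, n + 1)
    )


p = 0.9  # fraction of workstations assumed to be up at any moment
print(f"single copy:        {availability(1, 1, p):.6f}")
print(f"3 full replicas:    {availability(1, 3, p):.6f}")
print(f"8-of-16 FEC coding: {availability(8, 16, p):.6f}")
```

Under these assumptions the 8-of-16 coding uses less disk than three full replicas (2x the file size instead of 3x) and is still noticeably more likely to be retrievable. The independence assumption is optimistic if everyone powers off their workstation at 6pm.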
Beep beep.
PowerNotebooks sells rather nice laptops sans operating system. You can choose to have Win2k or WinXP installed for extra if you'd like, but they come standard without anything pre-installed. Reasonable prices, too. Not only that, but they are also ranked rather high at ResellerRatings.com. (Not trying to plug them, just pointing out there are non-mainstream alternatives to get Windows-free equipment.)
Imagine a beowulf cluster of beowulf clusters!
Never discount recursion!
err, well, not really...
at least where I am at (University of Wisconsin, CS department) we build ALL of our cluster computers. We have clusters of upwards of 100 nodes, and to buy all those prebuilt would be insanely expensive. Unless you are running very small clusters, it is just not economical to buy machines from Dell or anything with windows installed.
I obviously don't speak for the business world, of course, but most businesses don't have a use for a cluster or a supercomputer for that matter...it is only good for very specialized applications.
//FIXME: Bad
They employ people with the likes of Tony Hoare (invented quicksort and the 'hoare triple'). They also hired most of the core developers of the functional language Haskell. And many other brilliant minds.
Most universities could only dream of the funding that MS research has. And they're completely free to research whatever they want. And of course they use Linux, BSD and whatever other tools are right for the job. They're researchers, not software politicians.
-
let me know when you try 160 cpus. then you can watch your beowulf interconnects go up in smoke.
You're right. I had a brain freeze when posting; I meant to add "or DVDRs". I need to click preview more often.
On DVDs it would be closer to 160 discs than 1600. That's (barely) manageable, but I still wouldn't want to hang around a computer long enough to swap out 160 discs. I've come to the conclusion that the best solution for automated backups is cheap removable hard drives.
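For anyone who wants to check the disc counts, here is the arithmetic as a small sketch. The capacities are assumptions, not figures from the posts above: a 650 MiB CD-R, a 4.7 GB single-layer DVD, and a terabyte counted as 2^40 bytes.

```python
TERABYTE = 2**40              # bytes, binary terabyte (assumption)
CDR_BYTES = 650 * 2**20       # 650 MiB CD-R (assumption)
DVD_BYTES = 4_700_000_000     # 4.7 GB single-layer DVD (assumption)


def discs_needed(total_bytes, disc_bytes):
    """Whole discs required to hold total_bytes (integer ceiling division)."""
    return -(-total_bytes // disc_bytes)


print("CD-Rs per terabyte:", discs_needed(TERABYTE, CDR_BYTES))
print("DVDs per terabyte: ", discs_needed(TERABYTE, DVD_BYTES))
```

That gives a bit over 1,600 CD-Rs and a bit over 230 single-layer DVDs, so the DVD count is indeed far closer to 160 than to 1,600, and still a lot of disc swapping.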
The Web is like Usenet, but
the elephants are untrained.
By defining supercomputing in terms of storage capacity rather than aggregate CPU power, Microsoft seems intent on diverting discussion away from clustering, which Windows does very poorly (if at all).
How about those DVD's with the blue laser thingy (BlueRay or something..)? Aren't those supposed to hold a lot more?
(\(\
(=_=) Bani!
(")")
Sure, blue-laser DVD's will hold more. And by the time you can get a 50 GB blue-laser writer, you will be able to buy a 500 GB hard drive for less money. Probably a lot less.
The actual numbers may vary; I'm too lazy to look up how much blue-laser DVD's are supposed to be able to hold when they're available, but my point still holds. The cheapest backup in terms of Storage/Cost is removable hard drives.
Especially if Cost is figured as:
Cost of drive system
+ Cost of media
+ Cost of my time
= TOO MUCH!
--
The Web is like Usenet, but
the elephants are untrained.
http://www.theregister.co.uk/content/61/30990.html
"With the new storage OS, users can create shadow copies of data for single or multiple volumes of information. Microsoft has also included the Distributed File System (DFS) and support for server clusters with the operating system."
Good timing or what?
ISO certified == THX certified
We have a good number of Beowulf clusters here, mostly for data taking and data handling purposes. The data handling cluster will have about 1000 nodes in the final setup. We plan to upgrade them every 3 years. Now they are mostly dual Athlon 2000 machines running Linux 2.4, divided into small blocks of 8-16 nodes with a dedicated 2TB data server per block. All of them are connected together with T100 and Gigabit Ethernet. We have several head/gateway nodes to submit the jobs and some lab-written software distributing the load between the nodes. The lab is thinking about throwing away all the old SGI supercomputers. They cost too much to obtain and upgrade. I am afraid of being repetitive, but the main positive features of Beowulf clusters are: 1. It is cheap. 2. It does not require special knowledge to build and maintain. 3. It does not require special knowledge to write new software. 4. It is scalable. 5. There is competition between the vendors. The only minus I can find is that racks of PCs do not look as sexy as SGI's purple refrigerators. :)
are probably more common, but I expect it's the physicsy ones that probably use the bigger computers (alongside atmospheric modelling).
SJW n. One who posts facts.