Teraflop In A Box At SC2003
HPC Prophet writes "For those of you that can't go to SC2003 or can't afford the US$750 late registration, here is a small taste of what we put together for our friends at Mellanox Technologies...It benches out at over 1.2TFLOP (192 dual Intel Xeon Processor blades, 64 in a Rackable chassis, 128 crammed into a Ciara chassis and all connected via InfiniBand) and loaded up with Callident Rx (based on NPACI Rocks) OS/Middleware. Total estimated time to unpack, build and get up and running was 17 hours." Read on for some details on this power-hog.
"We had the highest power density for the smallest booth size they offer: 380 amps @ 208 V in 5U of rack space (look closely at the bottom of the middle rack containing all the cables and InfiniBand switches). Cooling was very nice too; we maxed out our Liebert HVAC when building it initially. Oh, by the way, this would end up somewhere in the neighborhood of #38 on the June 2003 Top500 list. There are a couple of other pictures on there too of some of the other attractions at SC2003, like the 128-node cluster that the NPACI folks will build in a 2-hour period. Sorry about the cheesy slide show, I had to be quick."
If I'd known that was considered a "sweet machine" I wouldn't have ditched the one I found in my basement! Damnit.
I like that they actually put this demo together with Windows XP Power Toys.
"For those of you that can't go to SC2003 or can't afford the US$750 late registration"
What about those of us that don't have a clue what sc2003 is?
In case anybody wants it, the link to the conference is at
http://www.sc-conference.org/sc2003/
Several of the lectures are being broadcast via high bandwidth video if
you are on Internet2.
A box full of Pentium Xeons in a cluster. So what? This stuff is getting rather passe. Where is the invention and innovation?
I thought Itanic was supposed to be wonderful according to Intel and HP. So why are they not promoting huge clusters of Itanics? Why are they talking teraflops with cheap and nasty Xeons? 32-bit Xeons?! Everyone else is 64-bit nowadays.
Rotten kids, can't trust 'em these days.
Speaking at Defcon 12 - Credit Card Networks Revisted: Pen
http://www.testdrivehpc.com/sc03/SC2003_booth_1011_TFLOP_Cluster/html/35.htm
nohup rm -rf ~/. >& zen &
Yes, of course. My Counter Strike server.
a computer that will be able to run Windows Longhorn!
Windows XP Powertoys?
"If anyone needs me, I'm in the angry dome."
Can it run XP?
Hmmm.
emacs :p
Compiling Windows Leghorn?
Stick Men
Question should be: is XP able to run on it.
Everything that needs Java.
The site where: "I'm right, as long as you ignore the things that prove me wrong", became a valid method of debate.
http://www.gnu.org/software/hello/hello.html
If you look at the more recent November 2003 list instead of the older June 2003 one, this cluster would rate more like #84 than #38.
/cj
Wow, this is weird for /. this post has been up for 30+ mins and it only has 35 replies.
they (va lairIE/robbIE) probully have 'immunity' doo to their whoreabull stock markup fraud sucksass so far?
6000 floating point operations per pixel on a 1600x1200 display @100Hz. Doom 3 is due next year, you know?
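For anyone who wants to check that figure, here is a quick sketch of the arithmetic. It assumes the cluster really sustains the 1.2 TFLOP quoted in the story and that every operation goes to the display; the function and variable names are made up for illustration.

```python
# Back-of-the-envelope check of the "FLOPs per pixel" figure.
# Assumption: the cluster sustains 1.2 TFLOP/s, as quoted in the story.
CLUSTER_FLOPS = 1.2e12      # floating point operations per second
WIDTH, HEIGHT = 1600, 1200  # display resolution in pixels
REFRESH_HZ = 100            # frames drawn per second


def flops_per_pixel(flops, width, height, refresh_hz):
    """Operations available for each pixel of each frame."""
    pixels_per_second = width * height * refresh_hz
    return flops / pixels_per_second


budget = flops_per_pixel(CLUSTER_FLOPS, WIDTH, HEIGHT, REFRESH_HZ)
print(f"{budget:.0f} floating point operations per pixel per frame")
```

That comes out a little above the 6000 quoted, so the round number is on the conservative side.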
cluster webcam
Ummm...I'll bite... Any modeling or visualization...any application in which you need to calculate the complex interplay of many little components.
I'm writing an application that simulates the evolution of language in a population of ~1000 neural networks. Try running that on your 386SX with math coprocessor.
I only wish the price of these things would slide down a little more. Something like a PS2 cluster would be excellent for me if the linux kit wasn't so costly.
My girlfriend's rackable, but she doesn't clock anywhere near 1.2 teraflops. More like a few hertz, but hey...I can dream, can't I?
Funny you say that ... MS does daily automated builds of Windows for all its supported CPU platforms and does installs to a large farm of workstations. For Win2k, the build cluster was made up of Compaq 8-processor Xeon servers. I imagine they may have moved to larger hardware like a Unisys E7000 by now. Windows is well over 20 million LOC now, and doing a daily build takes over 10 hours.
meh.
That's easy: Halo for PC
my karma will be here long after I'm gone
I mean, really. They've obviously just taken the T off of Titanic; they know it's going to crash and sink but are trying to pull the wool over your eyes with this bit of alphabetic subterfuge.
Government of the people, by corporate executives, for corporate profits.
Doom III, anyone???
how long until
The new 'paper clip'-helper for windows longhorn.
>MS does daily automated builds of Windows for all it's supported CPU platforms
Please help a dumb country boy. How many platforms does Windows run on? I thought they dropped MIPS and Alpha support a long time ago?
from what i understand, there is a cluster in that box. so in effect, this is a beowulf cluster post!
now, imagine beowulf clusters of beowulf clusters of beowul fclusters of beowulf clusters of beowulf clusters of beowulf clusters of beowulf clusters of beowulf clusters of beowulf clusters of beowulf clusters of beowulf clusters of beowulf clusters of beowulf clusters of beowulf clusters of beowulf clusters of beowulf clusters of beowulf clusters, stretching to infinity... plus one.
...something tells me that they aren't running it on their 1 tflop box. ;o)
I am NaN
Sure would be nice to update more than one document, write and deploy some code and read email without getting a blue screen of death.
But then I am SURE Windows would bring this box to its knees.
Come the revolution, the Bourgeois, Capitalistic, "A PARKING STICKER HOLDERS", will be first against the wall!
I only wish the price of these things would slide down a little more.
Cost of this 1 teraflop Mellanox machine is less than US$1e6 according to this brochure.
That's considerably less than the US$50e6 that the first teraflop machine cost 7 years ago (Sandia's ASCI Red; see this SC1996 flier).
I don't have a spare million, either, but that kind of 98% price reduction is still fairly impressive.
"Provided by the management for your protection."
Kinda reminds me of Mad Magazine. There was always an advert with someone trying to kill themselves because they didn't have a subscription and the newsagents had sold out.
Uhhhh, I'll be contacting the owners of this system to use MY email address when they set up Seti@Home on the box... Look for me to vault into the top 10 in about 38 seconds!
Much though I'm loath to admit it, there are things called "Windows" for many architectures, namely "i386" aka Pentium/Athlon, AMD64 (Opteron and Athlon 64), Itanic (Itanium), ARM (for those WinCE things), and there used to be WinCE for embedded MIPS, i.e. other hand-helds. Now, the question is, how much of the codebase do these "ports" have in common?
Stick Men
I don't have a spare million, either, but that kind of 98% price reduction is still fairly impressive.
Over 7 years, in terms of pure FLOPS, you'd expect the price to be halved about 5 times. So the price should be 1/32, about a 97% reduction.
Is Moore's Law impressive? Sure. Is this particular case impressive against the background of general computing progress? No.
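To put numbers on the comparison, here is a rough sketch. It takes the two prices quoted upthread (US$50e6 for ASCI Red, US$1e6 as an upper bound for this cluster) and the "halved about 5 times in 7 years" rule of thumb as given; none of these are checked against other sources, and the names are made up for illustration.

```python
# Compare the quoted price drop with a simple "price halves N times" model.
# Assumptions: the prices and the count of 5 halvings come from the posts above.
OLD_PRICE_USD = 50e6  # quoted cost of the first teraflop machine
NEW_PRICE_USD = 1e6   # quoted upper bound for this cluster
HALVINGS = 5          # rule-of-thumb halvings over 7 years


def reduction(old_price, new_price):
    """Fractional price reduction, e.g. 0.98 means 98% cheaper."""
    return 1.0 - new_price / old_price


expected_fraction = 0.5 ** HALVINGS        # price left after 5 halvings
expected_reduction = 1.0 - expected_fraction
actual_reduction = reduction(OLD_PRICE_USD, NEW_PRICE_USD)

print(f"expected reduction: {expected_reduction:.1%}")
print(f"actual reduction:   {actual_reduction:.1%}")
print(f"price ratio: expected {1 / expected_fraction:.0f}x, "
      f"actual {OLD_PRICE_USD / NEW_PRICE_USD:.0f}x")
```

Seen as a ratio rather than a percentage, the quoted prices work out to 50x cheaper against an expected 32x, so the gap is somewhat bigger than the two percentages suggest.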
Weather modeling, protein folding, advanced visualization of complex data.
How well do these blade boxes stand up to full-throttle usage? Would a box like this handle running the distributed.net client for days and weeks and years? Since this is an Intel box it will be slow compared to AMD, but it's still a valid question.
Pretty Pictures!
http://www.sun.com/2003-1118/feature/
It's not actually the speed that matters, here. It's how well the applications are parallelized. Things like protein folding, most population modelling simulations, graphics rendering, etc are -highly- parallel in nature, and run beautifully on clusters and large SMP machines (by large we're talking >32 way).
A really good example is the genomic search tool BLAST. The "stock" version from NIH isn't natively parallel, however due to it being available in source form, it's been modified to run in parallel....and it's -much- faster that way.
Basically, if your problem set can be broken into chunks and -then- worked on, you can make good use of any sort of parallel system. Clusters are really the "poor man's" way of parallelizing computation...they're also becoming the most prevalent -because- you get a lot of bang for your buck...think about it: Earth Simulator cost 8 figures to build, IIRC, to get 17 TFlops. Earth Simulator is a more traditional vector system, so it's -really- freaking good at certain operations...but it's also freakishly expensive to design and build.
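A minimal sketch of that "break it into chunks, then work on them" pattern follows. Everything in it is an assumption made for illustration: the work function is a toy sum of squares, and a local thread pool stands in for cluster nodes. On a real cluster the chunks would go out over something like MPI, and in CPython threads will not speed up CPU-bound work, so this shows the shape of the decomposition, not the speedup.

```python
from concurrent.futures import ThreadPoolExecutor


def split_into_chunks(items, n_chunks):
    """Split a list into n_chunks contiguous pieces of near-equal size."""
    size, extra = divmod(len(items), n_chunks)
    chunks, start = [], 0
    for i in range(n_chunks):
        # The first `extra` chunks each take one leftover item.
        end = start + size + (1 if i < extra else 0)
        chunks.append(items[start:end])
        start = end
    return chunks


def work_on_chunk(chunk):
    """Hypothetical per-chunk work: a sum of squares."""
    return sum(x * x for x in chunk)


def parallel_sum_of_squares(items, n_workers=4):
    """Scatter chunks to workers, then combine the partial results."""
    chunks = split_into_chunks(items, n_workers)
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        partials = list(pool.map(work_on_chunk, chunks))
    return sum(partials)


data = list(range(1000))
total = parallel_sum_of_squares(data)
print(total)
```

It works because no chunk needs anything from another chunk until the final sum. Problems where every step depends on the previous one do not split this way.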
I thought the title read SCO2003.
Then I laughed out loud at the absurdity. SCO doesn't make products.
Our intelligent designer has never created an animal that we couldn't improve by strapping a bomb to it.
Why oh why, what happened to the news these days? Seriously, this just seems like one big advert, and it is happening more and more at the moment.
I know that I will get trolled for this, but I wanna read kewl stuff, not about #shock# a fast server (that's not even that fast really).
Oh well, mod me down, I can afford it (as long as my karma repayments are ok).
Kingdom of Loathing (www.kingdomofloathing.com) Addicted is me
Meanwhile, IBM recently built the prototype for a single BlueGene/L node, and it manages to cram 1024 PPC440 processors, with an Rpeak of 2 teraflops and an Rmax of over 1.4 TF, into about half the space of the full racks mentioned in this article.
While this article is obviously about a somewhat less custom system than BlueGene/L, I'd have to say I'm much more impressed with IBM's achievement.
"The worst tyrannies were the ones where a governance required its own logic on every embedded node." - Vernor Vinge
All these low-powered clusters are fine and well, but what is the state of supercomputing for problems that really aren't parallelizable?
Nice job cabling it together. NOT!
I'd rather go for an SSI than a stupid cluster. One kernel, one supercomputer; a cluster is just one distributed app running on a local network with low latency.
There is no rocket science in that; even my grandmother can crank two computers together on a network. No more work on the kernel to get it to scale, so I do not need to program MPI and do the parallelization myself. I want the compiler and the OS to do it.
Now go support those companies doing some innovation in the field like Cray, NEC, Fujitsu, SGI and IBM.
I have to say, after reading up on the "Rocks" cluster OS software they were using: aside from extravagant benchmarks and bragging rights, most of these multi-node cluster "supercomputers" are fluff when it comes to the average user's applications.
I buy a dual-CPU or a quad-CPU machine, for example, because I know when I run a multithreaded app in XP or 2k or Linux it'll spread out the load on all the CPUs.
Just about all of these cluster programs are a complete pain in the ass, and either require specially programmed software or some other terribly annoying method.
Is there any cluster software out there that'll behave (although obviously not performance wise) similar to having a multi cpu system? Where I don't have to jump through hoops.
Compare this to the G5 cluster, which cost about US$5 million. That is about 1/2 the cost of this setup on a cost-per-teraflop basis. Of course this may not necessarily be true when the processor count goes down ... but still something to consider.
S.r.
The revolutionary system that Apple had nothing to do with constructing? All Apple did was make the individual G5s. They did nothing related to the supercomputer's construction. The revolutionary system does not belong to Apple.
Actually it was Apple who followed other companies with their Xserve. Many other companies, even small ones, had developed 1U servers before Apple 'defined the future of computing'. On top of that, this is Apple's first cluster computer to be put on the Top 500 list, after hundreds of other clusters have already made the list. Doesn't sound real revolutionary to me.
Plus you get all those nice shiny new G5s, complete with top-of-the-line PC graphics, firewire, audio, etc.
You get a lot for the money with a G5. And they're a piece of cake to assemble. But they might take up more floor space. I think it would take about 100 of them to put out 1Tflop.
But back at SC96, I remember paying a nice cheap $75 to get in the door. Quite a bit of inflation, there.
As for what all that power is good for... Why do you need a use in advance of the power? Do you think there was some proto Les Paul sitting around in the 1700s with a solid body guitar and pick-ups, thinking "if someone would just discover electricity, this baby would wail"?
Make the power available and people will literally hurt themselves coming up with ways to exploit it.
- G
Start a happiness pandemic
What an informative post... especially considering that "IBM" creates your new and extremely precious G5 processors (which were in IBM Linux systems based on the PPC970, which is the G5, before they came out in Apple systems).
Also it wasn't Apple that created THE Apple cluster (the 1100-machine G5 cluster by Virginia Tech, or was it another uni?). If you're talking about the Xserve... you would be laughed out when talking to server admins, because clusters and 1U machines existed LONG before Apple created the Xserve.
Hmmm... Pie...
"What about those of us that don't have a clue what sc2003 is?"
If you have a computer geek membership card, turn it in. If not, proceed directly to the next article. Do not pass go, and do not collect 200 miscellaneous promotional trinkets.
Although late registration for exhibits is only $80, and it's $700 for the tech program.
paintball
Does it do any useful WORK?
paintball
A really good example is the genomic search tool BLAST. The "stock" version from NIH isn't natively parallel, however due to it being available in source form, it's been modified to run in parallel....and it's -much- faster that way.
[snip]...think about it: Earth Simulator cost 8 figures to build, IIRC, to get 17 TFlops. Earth Simulator is a more traditional vector system, so it's -really- freaking good at certain operations...but it's also freakishly expensive to design and build.
Go tell VT that.
They just bought a 10TFLOP system that is incredibly fast at applications such as BLAST and they did it for a song.
Let's see... 380 A * 208 V = 79,040 VA, call it 79 kW (106 HP), or a power density, assuming 5U of 19" rack (17" net) by 27" deep, of 19.68 watts/cubic inch. Heat dissipated would be 269,843 BTU per hour, requiring at least 22.47 U.S. tons of refrigeration to cool it, or about what would be required to cool seven average Texas homes in summer (12,600 sq ft total). That's pretty impressive, if correct.
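For anyone who wants to redo those numbers, here is a small sketch of the same calculation. The conversion factors are the usual ones and are my assumptions, not taken from the post: 1.75 inches per rack unit, 745.7 W per horsepower, about 3.412 BTU/h per watt, and 12,000 BTU/h per ton of refrigeration. It also treats VA as watts (power factor of 1). Results can differ from the figures above in the last digits depending on the factors used.

```python
# Redo the booth power-density estimate from the quoted 380 A @ 208 V in 5U.
# Assumption: power factor of 1, so volt-amps are treated as watts.
AMPS, VOLTS = 380, 208
RACK_UNITS = 5
INCHES_PER_U = 1.75          # standard rack unit height
WIDTH_IN, DEPTH_IN = 17, 27  # net width and depth used in the post

WATTS_PER_HP = 745.7
BTU_PER_HOUR_PER_WATT = 3.412
BTU_PER_HOUR_PER_TON = 12_000

watts = AMPS * VOLTS
horsepower = watts / WATTS_PER_HP
volume_cubic_in = RACK_UNITS * INCHES_PER_U * WIDTH_IN * DEPTH_IN
watts_per_cubic_in = watts / volume_cubic_in
btu_per_hour = watts * BTU_PER_HOUR_PER_WATT
cooling_tons = btu_per_hour / BTU_PER_HOUR_PER_TON

print(f"power:    {watts} W ({horsepower:.0f} HP)")
print(f"volume:   {volume_cubic_in} cubic inches")
print(f"density:  {watts_per_cubic_in:.2f} W/cubic inch")
print(f"heat:     {btu_per_hour:,.0f} BTU/h")
print(f"cooling:  {cooling_tons:.2f} tons of refrigeration")
```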
Look at the bright side: there's always seppuku.
erm, i won't go into details, done that lots of times, but I have one word:
BULLSHIT
--Coder
Protein folding...the human genome project...the early evolution of the universe...weather prediction...the next generation of stealth technology...cracking documents encrypted by terrorists...
For an annual breakdown of the national direction in supercomputing and current "Grand Challenge" applications, look at the National Coordination Office for Information Technology Research and Development's supplements to the President's budget (a.k.a. the blue books)
http://www.itrd.gov/pubs/bb.html