Top 500 Supercomputers
Anonymous Coward writes "sendmail.net has a piece on the new Top500 list of supercomputers. 'So who came out on top? Well, three US Department of Energy machines have taken spots one, two, and three to lead the list: ASCI Red (manufactured by Intel) at Sandia National Labs in Albuquerque, ASCI Blue-Pacific (IBM) at Lawrence Livermore Labs in Berkeley, and ASCI Blue Mountain (SGI) at Los Alamos. These are the only three systems to exceed 1 TF/s on the Linpack benchmark, and represent 7.4 percent of the total Flop/s on the list.' The story notes that the average growth rates for the list exceed the number set by Moore's Law. "
No, you moron! Of course not!!!
Yes, but the number varies based on the components and communication infrastructure. A set of components that consumes only 10 watts scales further than one that consumes 100 watts. Basically, infrastructure becomes your limiting factor, power and communication specifically.
Overall, though, this just keeps an upper bound on realistic growth.
A better metric for computation power is given by this formula concerning the memory hierarchy:
bandwidth*size*speed where memory
bandwidth is the average speed at which data streams to or from memory
size is the amount of memory
speed is the responsiveness to random access
What the massively parallel processor advocates frequently forget is that locality of reference is an expensive assumption. A similar mistake is made by memory hierarchy advocates. For example, in many systems where CD-ROM jukeboxes were included to expand the size of the memory, the architects overestimated "locality of reference" and therefore underestimated the profound impact that moving the robot arm around would have on latency. Such designs are convenient for the hardware designer who wants "good numbers" and a nightmare for an advanced software application that needs unpredictable access to lots of information at a high rate in order to get the solution out of the machine before the solution is obsolete. The operands have to come together through that maze of wiring. If you have partitioned the memory, it profoundly affects both latency and bandwidth. The critical thing is to allow _shared memory_, and that means advanced memory control units.
Seymour Cray kept ahead of the supercomputer pack for more than two decades by focusing his best talent on fast, high bandwidth memory control units and building the biggest semiconductor memories to match.
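As a rough illustration of the bandwidth*size*speed metric above, here is a minimal Python sketch. The function name, the units, and both sets of figures are hypothetical, picked only to show how a big partitioned memory with poor random access can score below a smaller shared one. Modeling "speed" as random accesses per second is an assumption, not something the formula spells out.

```python
def memory_metric(bandwidth_gb_s: float, size_gb: float, random_accesses_per_s: float) -> float:
    """Product of the three memory-hierarchy terms; larger is better."""
    return bandwidth_gb_s * size_gb * random_accesses_per_s


# Hypothetical machines, not measurements of any real system.
# Same streaming bandwidth; the partitioned machine has 8x the memory
# but is assumed to be 100x slower at random access.
shared = memory_metric(bandwidth_gb_s=100.0, size_gb=8.0, random_accesses_per_s=1e7)
partitioned = memory_metric(bandwidth_gb_s=100.0, size_gb=64.0, random_accesses_per_s=1e5)

print("shared memory score:     ", shared)
print("partitioned memory score:", partitioned)
```

Under these made-up numbers the extra capacity does not make up for the random-access penalty, which is the point about locality of reference being an expensive assumption.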
Seastead this.
Well, it is, AND it isn't. It's using similar technologies, it's just not using the actual Beowulf 'package'. Nitpicking at that point, I suppose.
Depends on how you define a Beowulf cluster, really.
-- I'm the root of all that's evil, but you can call me cookie..
It's not really parallel processing though, I think it would be more 'symmetric', i.e. each chip doing one particular task, the one it's best suited for. This is the way q3a works, each chip is given a specific job. But a 9000 chip Intel box wouldn't run Quake any faster than a 2 chip Intel box.
Anyway, all I meant in the comment about 3d gaming was that PCs are better than Macs for 3d gaming; no one, except maybe Apple's marketing department, would deny this. And a comparable PC would be cheaper than a comparable Mac.
Yes, there are other uses for floating point, but the primary use in consumer situations is games. If a scientist really needed high-power floating point, they could get an Alpha or something.
--
"Subtle mind control? Why do all these HTML buttons say 'Submit' ?"
ReadThe ReflectionEngine, a cyberpunk style n
That doesn't exactly hold. Replace my use of "speed" with "transistors" or "density", and it is still the same. Every 18-24 months density/transistors/speed may double in chips /on average/, but at the same time there are very expensive ones with higher densities/whatever that are faster, and very cheap ones with lower densities/whatever, so to remove the money factor is not fair. At any point in time, regardless of moore's law, anyone can build a faster machine simply by throwing more money, more resources at it. So it is not fair to compare machines like this. They should be normalized to density or performance per dollar (or any given monetary unit).
It's 10 PM. Do you know if you're un-American?
But that's not true though. In theory, yes, you dump a trillion dollars on the development of quantum computers, and you've blown away Moore's Law. My point however is that dumping a trillion dollars into a parallelized system that uses regular processors will not break Moore's Law, and never will, because Moore's Law is based on the measurement of the density within the individual processors themselves, and not the system as a whole, i.e. the density of a system that has 1 processor is the same as the density of a system with 500 processors.
Moore's Law is based on the fastest processor existing today, not the most economical. Now of course there are processors that will do specific functionality faster, but their semiconductor density has not changed. Speed is not the issue at hand here, it's the symptom of increasing density.
You mean it's not Open Source(tm)????
Boys, get out the picket signs....
The way I calculated it was by figuring out how many megaflops my computer had, then dividing the overall keyrate by my accumulated keyrate to see how many of my-equivalent-computers were doing d.net. Multiplying that by my number of mflops gives an extremely rough estimate of d.net's `score,' but a better one than guessing how many computers each e-mail address has. As I said, I think I erred on the low side (probably on the very low side, since d.net is at least a several-hundred-thousand-processor machine and ASCI Red is only 9,632-processor).
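For anyone who wants to redo that back-of-the-envelope estimate, here is a minimal Python sketch of the same arithmetic. The function name and the sample figures are hypothetical placeholders (no real keyrates or Mflops numbers are given in this thread), and it inherits the same rough assumption that every participant is an "equivalent" of one reference machine.

```python
def estimate_aggregate_mflops(my_mflops: float, overall_keyrate: float, my_keyrate: float) -> float:
    """Scale one machine's Mflops by how many such machines the total keyrate implies."""
    equivalent_machines = overall_keyrate / my_keyrate
    return equivalent_machines * my_mflops


# Hypothetical inputs: a 100 Mflops box doing 10 keys/s out of 1,000,000 keys/s overall.
estimate = estimate_aggregate_mflops(my_mflops=100.0, overall_keyrate=1_000_000.0, my_keyrate=10.0)
print(f"rough aggregate: {estimate / 1000.0:.1f} Gflops")
```

Keyrate is an integer workload and Linpack is floating point, so this is at best an order-of-magnitude guess.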
(Yeah, yeah, I know you've got 2k TRS-80s in a Beowulf cluster in your back yard.)
Naw, that's a crappy way to do it. The 30,000 Z-80's are all hooked together in a crossbar array of dual ported RAM. Boy was it a mess to wire-wrap that one up.
But it's the fastest cluster with all 10 MHz clocks.
So... Where's yer Linpack #'s AC????
Have a great day...
I could not justify my existence if I were a turkey farmer. Would I terminate myself? Undoubtably, yes.
fine :(
Sorry.
You can't see the source code.
*snicker*
Indeed.
Not everybody drops their shorts at the first sign of a dicksize contest in progress, after all...
Right...Moore's law won't break when you add more hardware. The density will still be the same. But the cost won't. These systems should be compared on equal footing. If my $5000 system runs at 70% of the speed of your $10000 system, I win, because I have a better performance to cost ratio. Theoretically, add another of my $5000 systems, and my composite $10000 system beats your $10000 system. That's how things should be compared. It makes no sense to say my $X K7 beats your $.5X Celeron. It's bang for the buck that matters (more bang per buck indicating better design).
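The $5000 vs. $10000 comparison above can be written out as a small Python sketch. The function name is made up for illustration, speeds are relative (1.0 is the faster box), and the "composite" figure assumes two boxes scale perfectly linearly, which real clusters rarely manage.

```python
def perf_per_dollar(relative_speed: float, price_usd: float) -> float:
    """Performance-to-cost ratio; higher means more bang for the buck."""
    return relative_speed / price_usd


cheap = perf_per_dollar(relative_speed=0.7, price_usd=5000)
pricey = perf_per_dollar(relative_speed=1.0, price_usd=10000)

# Two cheap boxes for the same $10000, ASSUMING perfect linear scaling.
composite_speed = 2 * 0.7

print("cheap box wins on ratio:", cheap > pricey)
print("two cheap boxes vs one pricey box:", composite_speed, "vs", 1.0)
```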
It's not only fast-n-fun, it's usually set up by someone who needs it for some practical purpose.
Not that that fact would become obvious from the carrying on here, of course...
Well bang for the buck isn't really the issue. You could probably get 5000 486 processors for reeally really cheap and build a system that has more power than X system. That doesn't mean shit when it comes to it. Money never was an issue with "fastest" computer and Moore's Law.
It would probably count to twelve before somebody's mom needed to use the phone line.
I run Linux all the time on my Digi-Comp I.
But only the 2.0 kernel. Because the plastic keeps melting with anything newer.
If you look at the list of machines, you can see which of the IBM SP systems use the Power3 or Power2 vs which use the 604e. More seem to use the 604e...
But it's not running faster than a caribou... being chased by a pack of wolves.
Where is Charles Barkley?
Massively parallel machines tend not to "run" an OS as much as they are "run by" an OS. The control nodes or stations (often separate machines) run whatever (i.e. Unix) and the actual compute nodes run their computation and little else. After all, running 10000 copies of Linux isn't really the best use of resources.
Drinking will help us plan!
The Sandia/Intel ASCI-red TFLOPS machine has proven to be one of the more technically successful efforts in massively parallel, high-performance computing. However, large MPP systems have drawbacks. Among these are:
Applications that require high levels of compute performance will continue to grow in size, variety, and complexity. While cluster-based projects have firmly established a foundation upon which small- and medium-scale clusters can be based, the current state of cluster technology does not support scaling to the level of compute performance, usability, and reliability of large MPP systems. In contrast, large-scale MPP systems have addressed the problems related to scalability, but are limited by their use of custom components. In order to scale clusters to thousands of nodes, the following must be addressed:
Use of non-scalable technology must be bounded or eliminated. Technologies like TCP/IP, NFS, and rsh have inherent scalability limitations.
Scalable management and maintenance is critical. The complexity of maintaining the cluster should not increase as it grows.
Usability of the machine is critical. Users should not be required to know detailed information about the cluster, such as the name of each node or which nodes are operational, to effectively use the machine.
-----------------------------------------------
The IBM SP series machines I've run into all ran unix.
The Suns of course run unix.
I would guess the HP's run unix, although it might not be HP-UX.
The SGIs probably run IRIX unless they are "Cray/SGI" T3Es in which case they run Unicos
I don't know about the NEC or Fujitsu machines.
Although to be fair, the definition of "CPU" might differ from one manufacturer to another. For one it might be just a single chip itself, for another, "CPU" might be an individual cabinet full of a couple dozen chips. Can anyone shed light on this?
Well, the one (#353) I helpdesk for (among other things) is used for more than one kind of computation, including weather forecasts, electrical flow and fluid dynamics in the human heart, molecular simulations, and cosmic ray physics.
As I write this, there are people from Meteorology, Physics, Chemistry and Computer Science all waiting in the queues...
Our Intel-Beowulf cluster is up to about 110 nodes now, I think, and we've also got a 72-node IBM SP.
Dave
Possibly they're already on the list. It's difficult to build a big supercomputer without someone at least getting a rumor of its existence, and that's about all you need to get it on the list.
So how do they compare these scalar architectures to the vector ones like the NEC VPP series and so on?
Crays: Unicos
Their own supercomputer OS for vector machines.
SGI:
Irix for SMP
Sun:
Solaris for SMP
Hitachi:
MPP version of HI-UX, this is a variant of HP-UX optimised for a non-shared memory system.
Intel:
Some flavour of unix I believe. However with these machines each node executes its own copy of the OS and does SMP on that node. The Intel machines should not really classify as one computer more like a few thousand clustered together.
Linux does not really scale beyond 4 processors on SMP systems. The most powerful Linux systems are the Beowulf clusters like the ones that NASA has. I don't know why these don't appear on the list as they are surely more powerful than some of the lower end Suns. However I doubt that a Beowulf cluster counts as one computer.
NEC simply isn't using *micro*processors. Some CPUs can be built of tens of single chips. For problems which can't be efficiently solved using distributed memory (MPP), you have to use SMP. Unfortunately SMP isn't *generally* good beyond 32-64 CPUs (not enough memory bandwidth), so if you want more power, you have to build faster CPUs. That's what NEC (and probably others) is doing.
Opus: the Swiss army knife of audio codec
Typical geek typo. Personally, I often transpose digits to make powers of 2. It's the way my brain is wired.
PVM is available for Win32 as well, meaning that a distributed supercomputer is hardly limited to Linux.
If I had a 2k TRS-80 beowulf cluster... I could conquer the world!!!
Anyone have any idea on how many MFlops the average PIII 500Mhz machine runs at?
Just a nit...
Livermore NL is in Livermore. Berkeley NL is the one in Berkeley.
Don't get too wrapped up in how Beowulf is 'changing the face of supercomputing' based on what you read on Slashdot. Beowulf is mentioned or discussed a thousand times more here than it is anywhere else. It's a cool technology but not exactly in widespread use yet.
These things have always impressed me...
Massive computing power using sometimes generic technology, others using THE LATEST in busses and network technologies.
Quake at 100000 FPS... running OpenGL in software... I wouldn't be surprised, but then, these things run nuclear bomb simulations.
Quick question, if you linked these up, how long would it take them to crack RC5? DES? Probably why the USGov doesn't want them exported...
Hence the lower number of vector supercomputers in the top 500 list vs. last year. Old tech cannot compete.
the list would be much more meaningful and interesting if supercomps at Ft. Meade, and other classified TLA facilities, were included.
--
-- ken williams
yeah, funny you mention that. last visit there, that was exactly what was going on. woohoo!
Quake! Give me Quake! Can you imagine using the top system for playing a 32-way Quake deathmatch at 2048x1532 resolution simultaneously on one machine?
I'm surprised that nobody noted anything about this. Look at the number of processors involved.
=====
Rank  Manufacturer  Computer   Rmax (GFlops)  Processors  Rpeak (GFlops)  Nmax
1     Intel         ASCI Red   2379.6         9632        3207            362880
2     IBM           ASCI Blue  2144           5808        3868            431344
3     SGI           ASCI Blue  1608           6144        3072            374400
=========
Note that it took far more processors (9632 vs. 5808) for the Intel supercomputer to keep a slight edge over the PowerPC-based IBM machine. What does that tell you? You actually can get more bang for your buck if you go PowerPC rather than Intel Pentium/Ithalon/Whatever.
I wish there was a similar CPU comparison in which you could do a fair assessement! BUT wait, there is indeed one!
Simple mathematics here
Let's say we take the total gigaflops and divide em by total cpu, we could at least rationalize what we're getting per cpu.
1) Intel 2379.6/9632 = .24705
2) IBM 2144/5808 = .36914
3) SGI 1608/6144 = .2617
Sooooo. . in fact, the SGI and IBM computers are actually better designed, cpu for cpu.
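If you want to check the division yourself, here is a short Python sketch using only the Linpack GFlops and processor counts quoted in the listing above. The dictionary layout and function name are just for illustration.

```python
# (Linpack GFlops, processor count) as quoted in the listing above.
systems = {
    "Intel ASCI Red": (2379.6, 9632),
    "IBM ASCI Blue": (2144.0, 5808),
    "SGI ASCI Blue": (1608.0, 6144),
}


def gflops_per_cpu(gflops: float, cpus: int) -> float:
    """Average Linpack GFlops delivered per processor."""
    return gflops / cpus


per_cpu = {name: gflops_per_cpu(g, n) for name, (g, n) in systems.items()}

# Print highest per-CPU figure first.
for name, value in sorted(per_cpu.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: {value:.5f} GFlops/CPU")
```

This only says how much each processor delivered inside that particular machine; it says nothing about price, clock speed, or how old the chips are.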
Looking more carefully into IBM's choice of PowerPC CPU in this particular case: they're using the ancient 604e PowerPC chip. The current crop goes up two steps, the G3 and G4.
[604e is technically G2 (G standing for generation) ]
Now, moving onto the G4, which has a claim of 1.2 gigaflops per CPU (future tweaks of the G4 theoretically can go up to 4.8) . The fastest 604e only burps along at .36914 gigaflops
Easier to See here:
604e = .36914
G4 = 1.2
How many G4 would it take to equal what the top supercomputer has, in terms of flops. Well, let's see!
I'm gonna put all the F's I got in Calculus to a good use!
2379.6/1.2 = 1983
Gee, it'll only take 1983 PowerPC G4 just to match . . what? NINE FREAKING THOUSAND SIX HUNDRED THIRTY TWO cpus needed by Intel.
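The same division as a Python sketch, for anyone who wants to plug in other numbers. It takes the claimed 1.2 GFlops per G4 at face value and assumes every chip delivers that full figure inside a big machine with zero communication overhead; both are assumptions, and the function name is made up.

```python
def cpus_needed(target_gflops: float, gflops_per_cpu: float) -> int:
    """Chips required to reach a target, ASSUMING perfect linear scaling."""
    return round(target_gflops / gflops_per_cpu)


asci_red_gflops = 2379.6   # Linpack figure quoted above
g4_claimed_gflops = 1.2    # per-chip claim quoted above, not a measured Linpack result

print("G4s needed:", cpus_needed(asci_red_gflops, g4_claimed_gflops))
```

Note that the 1.2 figure is a per-chip claim while the .36914 figure is a measured per-CPU average inside a 5808-processor system, so the two are not really like for like.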
Soiled my pants
Mezzikah
I just couldn't wait for my friggin' password to come in, damnanation! E-mail me at Mezzikah@hotmail.com if you find a fault in my logic. (or you're just a hot chick looking for Mr.Right). Don't mind my spelling. I never won any spelling bees.
I teach Unix courses (on Linux) and in the first class I try to give an idea of where Unix is used.
I always say "the fastest computers in the world run Unix", but I'd rather be able to say "480 of the top 500 computers run Unix" - it sounds more impressive. The problem is that, although I can identify most of the operating systems on the list quite easily, I'm not sure about some of the more esoteric ones. Does anybody know exactly what all these systems are running?
It is tempting, if the only tool you have is a hammer, to treat everything as if it were a nail. - Abraham Maslow
Moore's law doesn't apply to machines like these. It applies to their components, but when you just keep adding components the aggregate will obviously grow faster. If you take the price for that aggregate, though, I think you'll see Moore's law probably still holds.
This has already been discussed here on /.
-----Transmission Complete----- If you want to email me...Don't
They should build a super computer and have it run all the time calculating pi, just to see if eventually it terminates or starts repeating... :)
What gives?
Yeah, Cray still uses alphas in their T3Ds and T3Es.
The T90s and SV1s use Cray's special vector processors.
Preventive War is like committing suicide for fear of death. - Otto Von Bismarck
The US has ignored vector supercomputing (is Cray the only one left?) for several years now. This leaves Japanese machines from Hitachi and Fujitsu as the primary source of non-parallel muscle.
I think Cray still sells a vector machine (sv1?) but it uses a CPU made by IBM which apparently isn't all that great, judging from the amount of press it HASN'T received...
The machines that have fast speed/chip will be vector machines, a large number of floating point vector pipes chained together producing a pretty huge number of operations per cycle.
There's a user manual available here for ASCI Blue. LLNL is already working on a 10 teraOPS machine called ASCI White. 8000 processors... ASCI Red is currently 1.8 TeraOPS.
I think it must have been a typo, for 'percent' read 'factor of' throughout.
:)
Man, if the fastest computers on the planet had REALLY only been getting faster at 1.8% per year, your laptop _would_ be pretty competitive
From what I understand, Moore's Law is about the number of transistors you can put per square inch, not the computing power. If you put more processors in one machine, it doesn't have anything to do with Moore's Law. Moore's Law is about going from .5 um, to .35 um, to .18, ...
Augh. Should have used preview. Should be:
170 NEC NLR 8 - fastest computer with a number of processors less than 10
101 SGI "Government" 1024 - PRESUMED slowest computer with a number of processors greater than 1000
teach me to consider a less-than symbol "Plain Old Text."
-=Best Viewed Using [INLINE]=-
It's dictated by the number of RS6000 nodes we had online on our biggest system on the day the benchmark was run. At the time, we were running with 3 different RS6000 SP systems with 128 or more processing nodes. Nodes are constantly moved between the different systems as needs change.
Luke, help me take this mask off
Pixar uses a network of Suns to render their movies. However they are not interconnected in the sense that the nodes of a supercomputer are connected. Each frame of the movie is rendered by a different system, with very minimal interaction. In a sense this is the same way supercomputers work but with far less messaging from controllers to processing nodes.
At our center, we're installing a batch of new SMP nodes, so I'll be interested to see just where we place in the standings when we rerun the benchmark.
-BAPper
These machines are too expensive to build commercially, costing $20-$100 million. However, the US gov has a need to push high end machines: (1) to keep the US supercomputer industry alive and (2) to run thermonuclear modeling codes. So these are either (a) no-one-will-buy highest configuration machines from current parallel computer vendors or (b) almost-commercial next generation parallel computers. Both the computer companies and the government come out ahead then.
but only two CPUs :(
Every time I see how Linux Beowulf clusters are changing the face of supercomputing, I thank God that I'm a Linux user too. It is great to be on board the super train zipping like a bullet into a brighter future. That train is named Linux, and it's coming to a computer near you.
This is the first Beowulf machine from the top of the list me think:
:)
Manufacturer: Self-made. Nice
"In short: just say NO TO DRUGS, and maybe you won't end up like the Hurd people." --Linus Torvalds
...Beowulf cluster you could make out of those!
(Score: -1, Unoriginal)
Seriously, all we really want to know is which of the machines on the list are Linux clusters of some sort. This is still Slashdot, after all...
--
Xenu loves you!
Yes, the law should more accurately be stated "Speed /per dollar/ doubles every 18-24 months"
Throwing money at the problem isn't fair...perhaps they should have normalized these systems based on their price...
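To put numbers on "doubles every 18-24 months", here is a tiny Python sketch of the compounding. The function name is made up, and it treats the doubling period as exact, which the rule of thumb never promised.

```python
def growth_factor(years: float, doubling_months: float) -> float:
    """How many times over something grows if it doubles every `doubling_months`."""
    return 2.0 ** (years * 12.0 / doubling_months)


for months in (18, 24):
    print(f"doubling every {months} months -> x{growth_factor(6, months):.0f} in 6 years")
```

Anything on the list growing faster than this per year is outpacing the per-chip trend, which is what you would expect when sites also keep adding processors and spending more.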
It'd be a waste to use these for Echelon. DSPs are much cheaper and better for that kind of stuff...
Hajo
Hajo Monogamy: Belief so strong that millions of people end perfectly good relationships in order to start a new one.
Imagine a Beowulf cluster of these ASIC computers!!!
Nope: #44 is the first beowulf
Beowulf on Amiga Linux could be a winner. Amiga Linux is rock stable and worth a look-see. Check it out for yourself. Maybe your next supercomputer will be running Amiga Linux. Who knows?
The list only counts those systems where Linpack has been run. I imagine that there are several Beowulf (and other self-built) systems that are actually achieving more computational throughput, but aren't going to be noticed. Also, I'm sure the government doesn't release everything it has working on satellite images and whatnot. Overall, I'd imagine that a large subset of the people who are actually using huge machines for real work (rather than academic research) wouldn't take the time away from their work to even run Linpack on their system.
As one example of such a computer, Professor John Koza has a 1000 node (Pentium II 350 MHz) Beowulf machine for his Genetic Programming Inc. ( GPI's web site ) research group. He's running genetic programming applied to difficult problems on the machine (such as automatic analog circuit design), and is getting a nearly linear speedup because of the embarrassingly parallel nature of GP.
Cheers,
David Andre
my web site
disclaimer: I worked with Professor Koza for several years and helped him build some of his previous machines.
i'm just happy that the computer topping the list is in a magical, far away place, where the sun is always shining and the air smells like warm root beer, and the towels are oh so fluffy! Where the shriners and the lepers play their ukuleles all day long, and anyone on the street will gladly shave your back for a nickel!
go weird al!
I would love to be able to say I've seen one of the 500 fastest super computers in the world.
A funny story about that box: when I went down there and saw it, the first thing I thought was that all those lights represented CPUs. Then I figured that would have been impossible, since it would have certainly made it one of the fastest computers on earth....
You can't compare the per-chip FLOPS of these supercomputers to the FLOPS of a single-chip system. A single-chip system will get much, much higher flops/chip than a supercomputer. So you would need a lot more than 2000 G4s to equal the performance of these ASCIs.
Also, just like the PowerPCs, the Intel chips used are very old: Pentium Pros, probably running at about 200 MHz. I'd be willing to bet that an Athlon running at 800 MHz, the fastest you could buy, would easily beat a G4 at 450 MHz, the fastest you can buy...
You Mac freaks never realize that it's not performance per box, it's price/performance, and the PC kicks the crap out of a Mac (esp. for 3d gaming, which is really the only need consumers have for all that FP).
I don't think any NSA computers are on that list. I have yet to hear any real specifics regarding what the NSA has at their disposal, except the word CRAY a few times. Suffice it to say that they have the potential to have way cooler machines than any on this list, due to their undisclosed budget.
Ignore Alien Orders
some do but most absolutely do NOT use intel processors. Most use stuff you can't get in your average pc. It's close, but not the same. They use a version of the cpu with hugely enhanced pipelining and floating point performance.
that's cuz the japanese one is a bigass vector machine. the processors are much more expensive, but they also haul ass over a regular risc chip. they're also significantly easier to program. However, for some reason, people have decided that vector machines are out of fashion. whatever, makes my job more secure 'cause parallel programming is significantly more difficult than vector programming. :-) Now if only it paid as well as say, Java programming I'd be stoked!
So where's the Linpack #'s???
We're workin' on it. There should be something official announced at SC99 next week.
"My life's work has been to prompt others... and be forgotten." --Cyrano de Bergerac
The stats on #23 are out of date. Our friends at NERSC joined their two T3Es, so the new mcurie has a peak of 575 Gflops, not the 444.2 listed.
Not only that, but there are numerous Hitachi, Fujitsu and NEC computers where the entries on either side of them have at least one order of magnitude more processors. What are those sneaky Japanese up to?
And the pressing question is whether next year we can expect to see a Transmeta-based supercomputer in the top 500.
Derick Siddoway "Should array indices start at 0 or 1? My compromise of 0.5 was rejected without, I thought, proper c
My favorite Beowulf related site is eXtreme LinuX. This is a good place to start when you first start thinking about getting into supercomputing on a budget. I like it very much. And the logo is really spiffy too. Beowulf is not only fast, it's FUN.
I think you are right, but a while ago, people started generalizing moore's law to talk about computer speed as well as memory density.
It's not so much that they are 'still programming them' as it's the fact that they keep adding more processor farms to them. The ASCI machines are actually clusters of clusters, and they can simply throw more processing power at them, really..
I know that their rendering farm is comprised of Sun boxen, but that's about all I know.
Don Negro
Perl 6 will give you the big knob. -- Larry Wall
It's a typo, of course. Thanks for pointing it out.
Paul Boutin | writer for Slate, Wired, etc
SGI is installing the company's first 128-processor Linux® cluster at the Ohio Supercomputer Center (OSC), bringing new technologies to the Ohio research and education community. As the adoption of Linux systems expands across all marketplaces, Ohio scientists, educators and engineers can begin to use the state's largest Beowulf cluster as a starting point into scalable high-performance computing.
Everyone knows, "...there is a world market for maybe five computers." -- Thomas Watson, chairman of IBM, 1943
in 1993? i think it was when i change my 386SX33 to a 486DX4/100... and it has cost me only 4000$ :)
--
http://www.beroute.tzo.com
"Science will win because it works." - Stephen Hawking
The Playstation 2 (which is rumored to use Linux as the dev. environment) is officially classified by the US Dept. of Commerce as a "supercomputer" because of its 128-bit processor etc. So, if everyone in America gets one and hooks it up with some kind of modem, and someone writes a program to run in parallel across 2+ million of them, would that count?
I can tell you these machines are waaay underutilized. In fact there was a GAO report not too long ago that said the same. Yet Sandia and Los Alamos feel the need to keep upgrading their machines, and then they dump the 'older' equipment where they can. You'd be surprised where some of the 'surplus' equipment ends up. I can't speak for Livermore Labs, but Sandia and Los Alamos burn through money like you wouldn't believe.
Yeah, a cpu is a chip. A node is a cabinet which may share cpus and memory and some internal interconnect. While a cpu is a chip, this chip may be a vector chip or a "normal" chip. vector chips can do sooooo much more per clock cycle for vectorizable algorithms, that's why the skew in the #'s of cpus. they're also seriously more expensive than "normal" chips.
oh please. vector still kicks ass at certain algorithms over parallel machines, things like irregular access to memory, and well, long vector problems. the transition to parallel machines has been almost entirely political. parallel is better at some things, but not all. the site I work for still has a Cray/SGI T90/16, and it is the heaviest used machine among a T3E, IBM SP2 and SP(3), as well as tons of little crappy clusters.
Beowulf clusters, eh? [nudge nudge, wink wink]
-Docvert converts MSWord to OpenDocument, clean HTML
heh...i work at a call center for them. they just barely made it in there at #451. surprise to me... =)
...
hdj jewboy
...
what? no IMac?
-dave
ASCI Blue... so that's where Erwin resides. Just hope Dust Puppy has a suit to enter the clean room!
*ducks flying tomatoes*
Deosyne
Why do the Germans have so many computers?
#44 is a Linux cluster, as is #256
Yes, they're on there. They're called 'Self Made'. Here are a few:
#44 CPlant Cluster
#265 Avalon Cluster
#454 Parnass2 Cluster
See SGI at OSC.
The key component to forward compatibility is the Linux software used on Beowulf. With the maturity and robustness of Linux, GNU software and the "standardization" of message passing via PVM and MPI, programmers now have a guarantee that the programs they write will run on future Beowulf clusters---regardless of who makes the processors or the networks. A natural consequence of coupling the Linux system software with vendor hardware is that the system software must be developed and refined only slightly ahead of the application software. The Linux model adopted for Beowulf system software makes all these wonders possible.
MOUNTAIN VIEW, Calif. (Aug. 12, 1999) SGI (NYSE:SGI) today announced that it will install the company's first 128-processor Linux® cluster at the Ohio Supercomputer Center (OSC), bringing new technologies to the Ohio research and education community. As the adoption of Linux systems expands across all marketplaces, Ohio scientists, educators and engineers can begin to use the state's largest Beowulf cluster as a starting point into scalable high-performance computing.
"Presenting new technologies, like the 128-processor Linux cluster, to Ohio is central to OSC's mission," said Charlie Bender, executive director, OSC. "This type of project provides the center with an opportunity to expand its role as a statewide resource by bringing even more scalable computing power to Ohio's scientists and engineers. This collaborative project with SGI will help us assist researchers using Linux at their desktop to use the high-performance computing systems found both at OSC and the National Science Foundation supercomputer centers."
Beowulf clusters like OSC's are specialized supercomputers that are gaining popularity in the technical and enterprise computing market because of their high performance at a relatively low cost. Beowulf clusters are used for solving very specific types of problems through what is known as parallel decomposing.
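As a rough illustration of the parallel decomposition mentioned above, the sketch below splits one large summation into independent chunks, one per worker. The function names and the sum-of-squares workload are illustrative assumptions, not anything taken from OSC or SGI. On a real Beowulf cluster the chunks would be handed to separate nodes over PVM or MPI rather than computed in a loop on one machine.

```python
def chunk_bounds(n_items, n_workers):
    """Split range(n_items) into n_workers contiguous (start, stop) pairs."""
    base, extra = divmod(n_items, n_workers)
    bounds = []
    start = 0
    for worker in range(n_workers):
        # The first `extra` workers take one additional item each.
        stop = start + base + (1 if worker < extra else 0)
        bounds.append((start, stop))
        start = stop
    return bounds


def partial_sum_of_squares(start, stop):
    """Work done by one worker; it needs nothing from any other chunk."""
    return sum(i * i for i in range(start, stop))


def decomposed_sum_of_squares(n_items, n_workers):
    """Combine the per-worker partial results into the final answer."""
    return sum(
        partial_sum_of_squares(start, stop)
        for start, stop in chunk_bounds(n_items, n_workers)
    )
```

The approach only pays off when the chunks are independent, as they are here. Problems whose chunks must exchange data at every step are limited by the interconnect instead.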
Ohio's research community will be able to access the Beowulf cluster through OARnet, a division of OSC. OARnet is the state's high-performance network providing Internet connectivity to more than a million Ohioans. As a leader in computing and networking, OSC is a state-supported resource for Ohio's scientists and engineers with an impressive array of machines and visualization equipment including Cray T3E™ and Cray T94™ supercomputers, two Origin 2000™ servers and several Silicon Graphics® 320 and Silicon Graphics® 540 visual workstations.
"SGI continues to pave the way for highly scalable, parallel Linux solutions in both technical and commercial markets. Our cluster solutions offer the unique capability to scale both high and wide for the right price-performance to meet our customers' application and budget requirements," said Jan Silverman, vice president of marketing, Computer Systems Business Unit, SGI. "Together with OSC, we hope to focus our attention on building better management tools, workload balancing and high availability for these cost-effective clusters."
The OSC Linux cluster will consist of 32 SGI™ 1400L servers, each with four 500 MHz Intel® Pentium® III Xeon™ processors. Preloaded with the SGI™ Linux® Environment with Red Hat® Linux 6.0, the SGI 1400L server is an enterprise-class server designed to fulfill customer needs for comprehensive and cost-effective solutions that merge SGI's expertise and innovation in scalability, bandwidth and performance with industry-standard components and operating systems.
I'm rather surprised that Pixar didn't show up in this list. I believe they're running SGI ORIGIN 2000 machines, and I'm sure they need a lot of horsepower to crank out those movies. Any insights?
-- This space intentionally left blank.
The Hitachi machine can achieve these figures for two reasons:
1) Their Interconnect
2) Their Processors
The interconnect is a hyper-bar crossbar network, with a bandwidth of 1GByte. Also they are able to get sustained message passing performance of about 90% like they did on their previous machine the SR2201. Other vendors would provide 60-65% of peak.
The number listed in the Top500 for processors is a bit misleading; this is in fact the number of nodes. The Hitachi nodes are made up of a number of processors, each with pseudo-vector optimisation (allowing them to miss the cache when loading large memory blocks). This optimisation means the chip can have a high sustained performance on large scale numeric problems. The nodes can be configured as either SMP or vector. This allows the machine to address a much wider range of domain problems.
Hitachi have a very brief page describing their machines: SR8000 Product Page
I would love to see what a fully configured machine could do (6 TFlops!).
BTW, Linpack is not a great gauge of a supercomputer's performance. When there are a lot of nodes it becomes message bound and does not reflect the true performance of the machine. When looking at machines like this it is important to look at benchmarks related to domain problems, e.g. it does not really matter what interconnect you have if you are doing ray-tracing, but it matters a great deal when doing astro-physics.
I suspect that there is a typo in the previous post. Number 265 is the Avalon cluster, number 256 some proprietary crap^H^H^H^Hengineering wonder.
Stephan
Hmmm... still no Microsoft on the list. Never has been...
I deny that I have not avoided attaining the opposite of that which I do not want.
SGI/Cray are definitely using MIPS... at least for the O2k line, which I think is currently their only line. You can probably find out for sure on sgi.com
He said, "You'll be able to tell your grandchildren that you helped assemble the first NT supercomputer," and I cringed.
These machines have processors built for extremely different purposes. The powerful vector computers built by Fujitsu, NEC, Hitachi, etc are pretty much optimized for the benchmarks (usually intense sparse matrix ops) being used. The Pentiums are of course, not. American vector computer companies basically don't exist anymore. Cray is dead- they are being sold, and phased out of DOE labs like Los Alamos National Lab. These labs have to build clusters of supercomputers (like ASCI Blue Mountain) because the US Govt does not allow them to purchase supercomputers from Japan. The US even tries to threaten trade partners like the UK with sanctions when their weather research centers want to buy Japanese vector computers. Cray's SV1, shown-off at Supercomputing '98 in Orlando, has not had many (any?) customers. Also various manufacturers have source code to these benchmarks, and tweak them so that they run as well as they can on their hardware. Regardless of method used, the ranking only provides a list of the fastest overall machines for a common set of benchmarks, without making a bunch of useless categories for fastest computer under 2000 lbs, or fastest under 3000 processors, etc.
I did a little quick (and inaccurate) math using distributed.net's rc5-64 project statistics and extrapolating TFlops (the main unit of comparison used in this study) from the total keys/second as compared to my computer's keys/second. My results, which I strongly believe err on the low side, show that the computers -currently- working on RC5-64 total a computing power of 11.5 teraflops, or almost 6 times the power of ASCI Red. Woohoo! --neil
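The back-of-the-envelope method described above comes down to a simple proportion, sketched below. The function name is hypothetical and the numbers in the usage note are made-up placeholders, not distributed.net statistics. The sketch also inherits the inaccuracy the poster admits to: it assumes key rate scales linearly with floating-point rate, even though RC5 key search is integer work.

```python
def extrapolate_flops(total_keys_per_sec, my_keys_per_sec, my_flops):
    """Scale one machine's FLOPS rating by the network's total key rate.

    total_keys_per_sec: aggregate key rate of the whole project
    my_keys_per_sec:    key rate of the reference machine
    my_flops:           assumed floating-point rate of the reference machine
    """
    if my_keys_per_sec <= 0:
        raise ValueError("my_keys_per_sec must be positive")
    # How many copies of the reference machine would match the network.
    machines_equivalent = total_keys_per_sec / my_keys_per_sec
    return machines_equivalent * my_flops
```

For example, `extrapolate_flops(1e9, 1e6, 1e8)` treats the network as a thousand copies of a machine rated at 1e8 flop/s.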
I claim the right of the first mention of Beowulf!
Pointing out the obvious: Sandia's good ole CPlant cluster is sitting in 44th place -- beige boxen rule!
These are massively parallel machines (Beowulf?) made of components that are following Moore's law. Thus I would expect their power to increase by approximately Moore's law * the increase in number of processors used.
Does Moore's Law apply to massively parallel, hand-built systems?
I wouldn't think so... At the time, Moore was running Intel, a one-CPU-per-machine outfit, and I think his "law" was an observation on the rate of progress in the PC industry, and what advancement was possible within the technology of single Von Neumann-bottleneck-style systems.
-schmaltz
Big Daddy, Johnny, Burp, Aunt Zelda, Scott, Slurp, Big Momma
...it's a BUSINESS PLAN! No really! It is. It's Intel's business plan -- double the speed of their chips every 18 months. And they've followed it very closely. But there is no natural law which says it has to be this way. (Good plan though... It captured everyone's imagination.)
This
I hate articles like this. In the first place there's the, "The new Top500 numbers are in, and your laptop has never looked so tragically slow." These supercomputers are all massively parallelized machines using regular microprocessors. The actual speed of the machine, like ASCI Red, is determined by the processors used, which in this case are just normal Intel processors. So you can go out and buy a machine that computes instructions just as fast as ASCI Red. The difference is that it can do more things at once because of all the processors involved. Does it make your laptop look slow? Hell no, because if you had ASCI Red, you wouldn't have any apps that take advantage of its parallelism to run on it anyway.
Secondly, Moore's Law is the following (from http://www.intel.com/intel/museum/25anniv/hof/moore.htm):
In 1965, Gordon Moore was preparing a speech and made a
memorable observation. When he started to graph data about the
growth in memory chip performance, he realized there was a striking
trend. Each new chip contained roughly twice as much capacity as its
predecessor, and each chip was released within 18-24 months of the
previous chip. If this trend continued, he reasoned, computing power
would rise exponentially over relatively brief periods of time.
Moore's observation, now known as Moore's Law, described a trend
that has continued and is still remarkably accurate. It is the basis for
many planners' performance forecasts. In 26 years the number of
transistors on a chip has increased more than 3,200 times, from 2,300
on the 4004 in 1971 to 7.5 million on the Pentium® II processor.
Since the CPUs in supercomputers use standard processors, and Moore's Law applies to these processors, his law is still intact. His law is about CPUs, not systems.
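As a quick sanity check on the figures in the quoted passage, the sketch below works out the doubling period implied by going from 2,300 transistors to 7.5 million in 26 years. The function name is hypothetical; only the three input numbers come from the quote.

```python
import math


def doubling_time_years(start_count, end_count, years):
    """Years per doubling, assuming steady exponential growth."""
    doublings = math.log2(end_count / start_count)
    return years / doublings


# Figures from the quoted Intel text: 4004 (1971) to Pentium II, 26 years.
growth_factor = 7.5e6 / 2300  # matches "more than 3,200 times"
months = 12 * doubling_time_years(2300, 7.5e6, 26)
```

With those inputs the doubling period comes out a little under 27 months, which is somewhat slower than the 18-24 month cadence the same passage cites for individual chip releases.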
He said, "You'll be able to tell your grandchildren that you helped assemble the first NT supercomputer," and I cringed.
The CPlant isn't a Beowulf machine. It is running linux tho.
Ishtar (1 flop)
Kevin Costner (1 megaflop/year)
--
"L'IT c'est moi!"
The real action is lower down. Avalon was in the top 100 and is now down to 265. CPlant takes the award for top cluster now.
Grey (Chris Lusena)
I wish they'd list what the Top 500 supercomputer machines were doing, what kind of mathematical computations they had them working on.
I could not justify my existence if I were a turkey farmer. Would I terminate myself? Undoubtably, yes.
Well, the vendors really like the parallel architectures because they can build them out of commodity components. An SP of whatever size is bolted together out of the same basic pieces as IBM's midrange or even desktop RS/6000 systems, so IBM gets to take advantage of economies of scale. The SP frames and the SP switch are unique parts, but heck, they sell more of those to businesses for web servers and LAN-in-a-can installations than they do to supercomputer centers. That's where IBM's volume SP sales are. I've talked with IBM folk about it, and believe me, they do not make much money on those huge SPs lurking at the top of this list.
The big-time vector vendors (and Cray/SGI seems to be the only US vendor left in that game) are all hurting or dead because the only product they could really offer was big, expensive number-crunchers. Vector processors cost like crazy and they aren't any good for much but massive quantities of math. They're too specialized to be lucrative in today's commodity-driven computer market.
So the prevalence of parallel systems these days really isn't due so much to technical issues as it is to business realities. I wouldn't call it politics. It came down to dollars and cents, and who could sell enough machines to stay afloat.
Hey you idiot! #37 at the NCSA at the University of Illinois at Urbana-Champaign is the fastest academic supercomputer.
I don't know where I read this article, but it was about parallel processing vs vector processing and about US vs Japan.
The main idea behind the article was that the US still dominated the Top500 list, but that the Japanese achieved more than the US since they had the same power with less processors.
But it also depends on what the purpose of the computer is... If you need a lot of parallel processes you would definitely go for the US version and if you need very fast (but little) parallel processing you would go for the Japanese.
- Artificial Intelligence usually beats real stupidity -
Yepp, the rest of the US machines at the top are used by the US military to find the best way to kill us in the rest of the world.
I knew eBay was a big outfit, but they have 2 computers in the top 500 list! The other 2 servers are at AOL, and a Japanese recruitment site.
ASCI Blue replaced its old SGI-based system (see list #13 at top500.org) with Moto chips and a Myrinet, according to the Top500 folks and ASCI Blue spokespeople. They were #1 until about two weeks ago, when ASCI Red pulled some late nights optimizing their two-year-old hardware to get back to #1.
Paul Boutin | writer for Slate, Wired, etc
This does not compute!
The ASCI machines, which are far beyond previous supercomputers in power and speed, are specifically funded and built to simulate the explosion performance of aging US nuclear warheads. Since the Clinton administration was determined to sign no-nuke-testing treaties to keep other countries from advancing in the technology, ASCI was created as a ten-year, billion-dollar program to replace real explosion tests with simulated ones.
:-)
That's also why there are export controls over "supercomputers" above 1 gigaflop - to make it harder for upcoming nuke powers to do simulations. That's being revised to 6.5 gigaflops in early 2000 to prevent Playstation II from being classified as a munition.
Paul Boutin | writer for Slate, Wired, etc
Simply put, this doesn't break Moore's Law, because Moore's law is based on Semi-Conductor density, not speed. Of course speed is an attribute related to density, but they are not the same thing.
Dumping CPU's on a system doesn't break Moore's Law, and never will.
Fujitsu, at #364 with VPP500/28 installed at Institute of Physical and Chemical Res. (RIKEN) Wako in Japan in 1993.
What were you running in 1993???
The list is, of course, only as complete as its submissions. For example, we have a 512-cpu Paragon in the basement, that would probably qualify for the list if anyone bothered...
Shut up, be happy. The conveniences you demanded are now mandatory. -- Jello Biafra
I've looked at this list every month for a long while now... and I always wonder what the US Government is doing with those 'classified' computers. Anyone think that the NSA might have one at Bragg? or one of their many other spots? Anyone know the amount of processing power it takes to scan radio waves and other modes of transmission? Though there is the idea of "National Security", the cost of these things must be pretty darn big. I sure want to know where exactly my tax dollars are going.
NIVRAM
Believe me, they do get used. Our multiuser machine (yes, on the list, and I'm on it now!) can barely cope with the use we put it through. When interactive use drops, there are always a ton (Oooo - 318 at the moment) of batch jobs queueing up to drain those CPU cycles. I don't know where you got your information from, but these machines *are* used - if there happen to be any "spare" machines going, I'd love one!
On a related note, I spent a Saturday setting up my PII233 laptop to run the same software as I run here at work on our Sun machine. Yes, it's slower (also Linux vs. SunOS 5.6), but the comparison made in the article is totally unwarranted; my one year old laptop *is* amazingly fast, and would have been a godsend to my predecessor particle physicist grad-students just ten years ago. Contrary to what the article states the gap between (*multiuser*) supercomputer and (*personal*) laptop is very very narrow, so much so that it is sometimes more productive for me to unplug and work on my laptop!
Dan
According to Cmdr Taco, it's art if it can make a good background image. Well, I can tell you that C-plant and Avalon Cluster make great art. The Avalon image I have is particularly nice, photographed in a reddish light. Both of those run Linux.
Yup. 33 of the top 500 are classified. (That's 6.6%.) That's a lot when probably only 25% of the classified machines that could make the top 500 are reported.
Linux has proven to be an essential component in building clusters of PCs (pioneered by Beowulf), and its popularity is increasing in the world of scientific and high-performance computing. With a modular design and free source code that has been ported to several CPUs, the Linux kernel is also ideally suited for computer science research. Several companies have introduced Linux products to support powerful desktops and high-performance computing clusters. Dozens of universities and laboratories are using Linux for scientific computation and research. Companies are beginning to market preconfigured Linux clusters using the latest Intel or DEC Alpha CPU. The marketplace is evolving. The Extreme Linux community wants to help.
Of those 11, 7 are working on "classified" matters. I doubt they'd be working on echelon though, but I'd very much like to know exactly what it is they are doing :)
94 IBM MHPCC 243 - fastest computer with a number of processors in no way related to common powers of 2
hmm, 243 is 3^5. I wonder what strange architecture dictated that number.
Whereas the winner employed nearly 10 thousand processors, the 3rd place Japanese entry (the computer with the name SR8000/128 ) used only 128. Interesting...
10 IBM UCSD 960 - fastest computer with a number of processors evenly divisible by 10
12 IBM Charles Schwab 2000 - fastest computer with a number of processors evenly divisible by 100
15 Fujitsu Kyoto 63 - fastest computer with a number of processors not evenly divisible by 2
46 Fujitsu NAL 167 - fastest computer with a number of processors neither evenly divisible by 2 nor equal to (2^x)-1
94 IBM MHPCC 243 - fastest computer with a number of processors in no way related to common powers of 2
170 NEC NLR 8 - fastest computer with a number of processors 1000
(Yeah, yeah, I know you've got 2k TRS-80s in a Beowulf cluster in your back yard.)
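The categories in the list above are easy to check mechanically. Here is a small sketch; the helper names are made up for illustration, and the processor counts are the ones quoted in the list.

```python
def is_power_of_two(n):
    """True for 1, 2, 4, 8, ..."""
    return n > 0 and (n & (n - 1)) == 0


def is_one_less_than_power_of_two(n):
    """True for counts of the form (2^x)-1, such as 63."""
    return is_power_of_two(n + 1)


def exact_power(n, base):
    """Return k if n == base**k for some k >= 1, otherwise None."""
    k = 0
    while n > 1 and n % base == 0:
        n //= base
        k += 1
    return k if n == 1 and k >= 1 else None


# Processor counts quoted in the list above.
counts = {"UCSD": 960, "Charles Schwab": 2000, "Kyoto": 63, "NAL": 167, "MHPCC": 243}
```

Running the helpers over `counts` confirms the observations in the thread: 63 is (2^6)-1, 167 is odd and not of that form, and 243 is exactly 3^5.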
-=Best Viewed Using [INLINE]=-
I mean, it's nice to see the US machines taking the cream of the honors in raw power... but what the heck - ASCI Red gets its 1st place berth with 9 THOUSAND some odd cpus (.0246Rmax pts / cpu), whereas the Hitachi machine gets a very respectable 5th with only 128: 6.8Rmax pts per CPU! Isn't there some credit due for the more efficient machine? It doesn't seem that impressive to simply dump silicon at a problem until you are #1...
Any Beowolf clusters in there? I didn't see any on a quick glance through.
XML causes global warming.
...as of a few days ago I choked one of these machines for a while... Other guys in the lab were not very happy, but who cares...
Efficient code? Who said efficient code?
<^>_<(ô ô)>_<^>
Proposition for the next top500 list:
rc5des -benchmark
But I never claimed to be sane either...
here are two definitions of what an operating system actually is:
whatis.com's definition of an operating system
foldoc's definition of an operating system
Ner lbh sebz gur HFN? Gura lbh'ir whfg ivbyngrq gur QZPN!
silly me... of course this was posted to the wrong article.... i gotta get some sleep!
Ner lbh sebz gur HFN? Gura lbh'ir whfg ivbyngrq gur QZPN!
I live in Livermore, and was a bit upset that we didn't get credit for the 2 fastest machines.
BTW, my Father in Law and my Wife both work at Sandia (home of #1!!), maybe I could convince them to let me go in and install Quake 2?
Sure, I have a thankless job. That's okay. I have a lot of (non
Actually, the NEC/Hitachi/whatever machines are NOT using standard processors. They, like the Cray J90/C90/T90/SV1, are using vector processors. These processors are highly specialized and much more expensive than regular processors (G4 notwithstanding ;-). Our humble Cray J90's CPUs, for example, can process 64 array elements (each element is an 8-byte word) with each vector command, making it very fast for matrix-style calculations. As far as I know, the LINPACK benchmarks vectorize quite nicely, thus giving vector machines an advantage.
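To make the 64-elements-per-command point concrete, here is a toy sketch of strip-mining, the usual way a long loop is cut into vector-length pieces. It is plain Python standing in for what a vectorizing compiler would generate, not real Cray code, and the function and constant names are illustrative assumptions.

```python
VECTOR_LENGTH = 64  # elements handled per vector command, per the comment above
WORD_BYTES = 8      # bytes per element, per the comment above


def strip_mined_add(a, b):
    """Add two equal-length lists in strips of VECTOR_LENGTH elements.

    Returns the result and the number of strips, which stands in for
    the number of vector add commands a vector CPU would issue.
    """
    if len(a) != len(b):
        raise ValueError("inputs must have the same length")
    result = []
    vector_ops = 0
    for start in range(0, len(a), VECTOR_LENGTH):
        stop = min(start + VECTOR_LENGTH, len(a))
        # On vector hardware this whole strip would be one command;
        # here it is emulated element by element.
        result.extend(x + y for x, y in zip(a[start:stop], b[start:stop]))
        vector_ops += 1
    return result, vector_ops
```

Adding two 200-element arrays this way takes four strips (three full ones and a remainder of eight) where a scalar loop would issue 200 separate adds. Each full strip covers 64 x 8 = 512 bytes per operand.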
Unfortunately, since SGI sucked all of the life out of Cray Research, all of the recent developments in vector hardware have been from Japanese manufacturers, thus leaving many US gov't agencies without big vector iron. According to our local Cray guy, this will change with the Cray SV2, but that machine is still a year or so out and may be too little too late.
Incidentally, our T3E (#56 on the list) does have off-the-shelf Alpha processors (272 of 'em).
Actually, most of the IBM SP systems run either POWER 2 or (if they are newer) POWER 3 processors, which are not used in personal computers.
No need, it definitely doesn't.
The original Doctor Dark.
Uhm, I already did check sgi.com.
The SGI lines and the Cray lines are still pretty separate.
O2K is a very SGI line. It runs IRIX and is MIPS and is related to SGI workstations.
T3E is a very Cray line. It runs UNICOS/mk and it is built from alpha processors.
I know less about the SV and T90 lines, but they are also very Cray. They're vector machines and they run UNICOS.
Anyway, the Cray derived lines were more important to this story, since if you look there are a lot of T3E's in the top 500.
The Origins are nice machines, but they don't have the super high performance. They seem to give pretty good bang for the buck, though.
I have heard that the T3E might be the last of its line, but who knows how that fits in with plans to spin Cray back off of SGI.
Preventive War is like committing suicide for fear of death. - Otto Von Bismarck
According to SGI's Products ( http://www.sgi.com/products/ ) page, their T90 series is based on the Cray vector processing system. I haven't looked at the Top500 list in a couple of hours, but I do remember there being a couple T90-based machines on there. Add to this their SV1 (Scalable Vector) line, and you do have at least two major system options if it's American vector computing you're after.
But even their 60Gflops T90 series, maxxed out, can't come close to the 6 TERAflops of the highest-end Hitachi SR8000 systems. Whoa mama!
Get them running d.net. I imagine that there are a few idle clock cycles in that list.
Yes, I do see a lot of entries from Intel, which means to me probably Pentium Pro's.
Then there's IBM, which seems to be using PowerPC 604e's.
Next, SGI uses MIPS
SGI/Cray - have they moved to MIPS, or are they still using Alpha's? (I'm not 100% sure that's what they used before, but i'm 95% sure it is).
My main puzzler here is NEC. WHAT ARE THEY USING??? If you go down to #73 on the list, there's a machine that was deployed in 1999 with just 16 processors? Okay, its performance is 1/19th that of Intel's #1 offering, but it uses just 1/602 the number of CPUs??? That's not a standard processor that I've ever heard of.
NEC has a bunch of listings below that, too. Some use just 5 processors (though those are all in the high 400's). What chips is it using? Can anyone explain what this machine is?
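Taking the two ratios quoted above at face value, the per-processor gap can be worked out directly. This is only a sketch of the arithmetic: the function name is hypothetical, and the inputs are the poster's rounded figures rather than numbers read off the list.

```python
from fractions import Fraction


def per_cpu_advantage(performance_ratio, cpu_ratio):
    """How many times more Linpack throughput each processor delivers."""
    return performance_ratio / cpu_ratio


# Ratios quoted above: 1/19 of the performance on 1/602 of the CPUs.
advantage = per_cpu_advantage(Fraction(1, 19), Fraction(1, 602))

# Processor count of the #1 machine implied by "16 processors" and "1/602".
implied_top_cpus = 16 * 602
```

On those figures each NEC processor delivers roughly 32 times the Linpack throughput of one processor in the #1 machine, which is far more than any commodity microprocessor of the day could manage.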
I think it is very interesting to note that of the top 40 supercomputers, there are seven with less than 400 processors and every single one of them is located outside the United States (six in Japan, one in France).
The U.S. Government owns 11 of the top 100, 10 of which are SGI's (or Cray, if you prefer).
Could any of that processing power have anything to do with Echelon???
--
E2 IN2 IE?
It seems to me that it's much more interesting to see that the Hitachi can get by with only 128 CPUs and the Fujitsu can get by with only 63. Does anyone know what those machines are doing so well that they can get extreme performance from relatively few CPUs (should tell Sun to take a hike and call Fujitsu)?