LinuxBIOS, BProc-Based Supercomputer For LANL
An anonymous reader writes "LANL will be receiving a 1024 node (2048 processor) LinuxBIOS/BProc based supercomputer late this year. The story is at this location. This system is unique in Linux cluster terms due to no disks on compute nodes, using LinuxBIOS and Beoboot to accomplish booting, and BProc for job startup and management. It is officially known as the Science Appliance, but is affectionately known as Pink to the team that is building much of it."
But if you'd replace the expensive high-performance interconnect with a cheap ethernet, then it would be a Beowulf cluster.
Think so? Wouldn't a system with disks be more suitable for that?
-----
For great justice!
Let's just hope they do something good with this. I'm tired of reading about how supercomputers are used for military war simulations.
LANL tends to do projects that are focused much more on science and engineering than military applications. It's very likely that Pink will end up analysing spectral emissions of bombarded protons or something like this.
The military simulations you mention probably don't happen at LANL.
as it is an OS by Apple and IBM (well gone but still)
I wonder why LinuxBIOS hasn't taken off. I've debated ordering one of their "kits." It seems to me the 3 second boot time of LinuxBIOS should be a selling point for some obscure Linux vendor, but no one really offers it yet.
I really imagine a machine with an 8MB EEPROM/ROM that can be updated as needed, but provides a boot environment and login screen - while spinning the disks in the background. This would make an excellent product.
Why hasn't anyone done this yet?
Curious
A former client who worked at a Cancer Center used a cluster to simulate radiation treatments.
This sounds like some kind of dual-processor rackmount type solution. Why not go all the way and use something like compactPCI? You can fit 21 cPCI blades into 8U of rackspace.
A standard blade could have up to a couple of gigs of RAM, a PowerPC or P3/P4 CPU, 100BT or 1000BT Ethernet, etc.
You boot the things using bootp/tftpboot and then run Linux off a ramdisk.
We're using cPCI at work to run VoIP softswitches. Currently we're at over a million calls an hour on a wimpy 450MHz processor.
I don't envy the developers... After every revision of LinuxBIOS, they get to reflash 1024 motherboards, which could take a while...
How many standard Libraries of Congress is a shitload?
Think so? Wouldn't a system with disks be more suitable for that
:)
Nah, just one honkin RAMDisk. Could serve up mucho porn/warez, when the feds come knockin, just pull the plug, presto, no evidence
On the contrary, I wouldn't mind seeing more military war simulations being done on supercomputers; so long as they are carried out as an alternative to actual military war.
Think about it: Instead of wasting all the money, resources, and lives of actually invading another country, we just get a few supercomputers into a network, and duke it out online.
First thing, of course, would be to allow the export of supercomputers to blacklisted countries. (Is Afghanistan still on the list, I wonder?) Then get a UN resolution that all member countries will abide by the outcome of any virtual war.
And hey, the US has already got a head-start in training soldiers for it: "America's Army"!
Any sufficiently advanced civilization is indistinguishable from Gods.
Does anybody know what other applications supercomputers are being used for? I know some do weather predictions.
Ok, non-military uses, off the top of my head:
I'm sure there are plenty more applications for supercomputer power - any kind of complicated or chaotic system is a good candidate for modelling, especially when there's more than one unknown variable (multivariate analysis is complicated, to say the least).
For those of you not familiar with the "Weekly World News" publication, it is a tabloid you'll find at most American supermarkets, which features highly elevating stories such as "mom gives birth to four-headed quintuplets". The above story is just another one of their fictions. This is what tabloids do. They sell fiction. They appeal to the mentally challenged, gullible minus habens.
Yahoo features those articles in their TV/Gossip/Entertainment section. So you don't have to spend money at the supermarket. Go yahoo.
Nothing to see here, move along.
This one, BProc, is different: it is completely stable. You never get NFS wedges. Jobs launch in a flash. Plus, if you do reboot, the whole thing is back up in seconds (literally).
BProc is an incredibly lightweight job submission system. It is so lightweight and fast that it changes how you think about submitting jobs. Rather than designing long-duration jobs and tossing them on a queue, you can just run tiny short jobs if you want, with no loss to overhead. It makes you rethink the whole idea of batch processing.
When the jobs run, they appear in the process list of the master node. That is, if you run "top" or "ps", the jobs are listed right there. In fact, from the user's point of view the whole system looks like just one big computer.
Some drink at the fountain of knowledge. Others just gargle.
The largest supercomputer in the world (largest by a long shot: it outpowers the rest of the top 10 combined) is the NEC Earth Simulator in Japan. It is being used to do the most detailed climate modeling ever attempted. Not only that, but they are attempting a complete system model, which AFAIK has never before been possible. In addition, the last couple of clusters that I have read about have been for biomedical research. Maybe it's just what I read, but I believe bioinformatics is going to be one of the biggest pushers of HPC going forward. Genomics is nothing compared to proteomics: mapping the genome probably takes about as much computing power as simulating the folding of one large protein series!
There are 4 boxes to use in the defense of liberty: soap, ballot, jury, ammo. Use in that order. Starting now.
I will personally track down and slaughter the first person to mention a popular clustering architecture, and how one might imagine it...
Code, Hardware, stuff like that.
Well, Hollywood has used supercomputers and large clusters to do effects for movies like Star Wars: Episode II, Resident Evil, and the upcoming Terminator 4.
So, no, there haven't been any good uses.
Cluster Overview:
* 2050 Intel 2.4GHz Xeon processors
Now when people complain about the United States government being responsible for global warming they will have some good hard facts to use.
Is that an imperial shitload or a metric shitload?
"Prefiero morir de pie que vivir siempre arrodillado!"
I agree with you on most respects (even if much of what you're talking about is very, very far beyond most realistically imaginable systems in the near future), but simple economics shows why the above is silly.
Simple question: someone uses a tool to make a killing on a pre-existing market. How does everyone respond (not counting RIAA, et al, who depend on regulation)? They either curl up and die, or figure out what the winners are doing, and quickly. Learning what people are doing is even easier in markets like finance, where there's a lot of transparency in actions, a very close knit group of participants, people who like to brag, and a lot of people staring at the winners.
Fact is, any new innovation in trading quickly becomes used by everyone who has a serious enough stake. It is just market economics. Once everyone gets an innovation, it is no longer an advantage, because everyone is doing it (bonus points for those who see past and potential systemic failures lurking in this behaviour).
Of course, keeping your traders free of risks like sharing information and regulatory oversight can extend an advantage, and that works in a very few situations. But hell, even Warren Buffett took a fairly serious beating recently due to things he couldn't predict (and this is an insurance guy!), not to mention Soros when he attacked Asian currencies a few years ago.
Not only is there no silver bullet for the folks who run finance, there's just no way in hell peons in the game (anyone with less than a few hundred million invested) will profit from raw computational power. Sorry.
-j
I forget what 8 was for.
With 2048 processors and an assumed 1GB/node that's 1TB of low latency, super high-jiz RAM.
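A quick check of that arithmetic, as a sketch only: the 1024-node count comes from the story summary, while the 1 GB per node figure is the commenter's assumption rather than a published spec, and the variable names are just for illustration.

```python
# Back-of-the-envelope aggregate RAM for the cluster.
# Assumptions: 1024 nodes with 2 CPUs each (from the story summary)
# and 1 GB of RAM per node (the commenter's guess, not a confirmed spec).
NODES = 1024
CPUS_PER_NODE = 2
GB_PER_NODE = 1

total_cpus = NODES * CPUS_PER_NODE
total_ram_gb = NODES * GB_PER_NODE
total_ram_tb = total_ram_gb / 1024  # binary terabytes

print(f"{total_cpus} processors, {total_ram_gb} GB = {total_ram_tb:.0f} TB of RAM")
```

Note that this is aggregate memory spread across nodes, not one flat address space.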
How do you define "low latency?" From a first glance at the evidence, it appears that this cluster just uses plain old TCP/IP over Ethernet as its node interconnect. That's not exactly low latency access to remote memory, you know.
Just nitpickin'.
Wrong. Render farms are neither clusters nor supercomputers. At best, a render farm might be considered an array.
A supercomputer is a single system image. Some people call large clusters "supercomputers," but technically they're wrong.
A cluster is an interconnected group of computers that can communicate with each other. Usually a cluster depends on some kind of software layer to allow programs to run across multiple systems, something like MPI. Clusters are tightly interconnected many-to-many systems.
An array has a single job control system and a number of job execution systems. Batch jobs are submitted by users to the job control system, which doles them out to the various execution systems and then collects the results. The execution nodes don't talk to each other, and one job runs on one execution node at a time. Render farms are basically arrays; each execution node works on rendering a single frame of a multiframe animation. Because each frame can be rendered independently, without any dependencies on the previous and subsequent frames, rendering is particularly well suited to array computing.
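To make the "array" idea concrete, here is a minimal sketch, assuming a local thread pool stands in for the execution nodes. `render_frame` and `run_array` are hypothetical names invented for illustration, not part of any real render farm software.

```python
from concurrent.futures import ThreadPoolExecutor


def render_frame(frame_number: int) -> str:
    """Stand-in for rendering one frame (hypothetical placeholder work).

    Each call depends only on its own frame number, never on other frames.
    """
    return f"frame_{frame_number:04d}.png"


def run_array(frames, workers=4):
    """Job control: hand each frame to a free worker and collect the results.

    The workers never talk to each other, which is what makes this an
    array rather than a cluster in the sense described above.
    """
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map returns results in the order the frames were submitted
        return list(pool.map(render_frame, frames))


results = run_array(range(8))
print(len(results), "frames rendered")
```

In a real farm the workers would be separate machines, but the shape is the same: one dispatcher, many independent jobs, no communication between jobs.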
"The Science Appliance" as it is dubbed will use dual processor AMD based nodes.
Scary part is that this will be one of the top 5 supercomputers in the world.
Scary because you could buy all the hardware off the shelf for about half a million dollars.
On a lighter note:
"The Linux NetworX cluster will be used solely for unclassified computing, including testing on ASCI-relevant unclassified applications."
I think they mean text mode quake.
I guess they got tired of "Global Thermo-Nuclear War"
If voting were effective, it would be illegal by now.
Supercomputers like this one are great for big calculation; not so much for big data.
If all you're doing is storing data and then retrieving subsets of it on demand, then sure, I agree. Most databases are like that - but not all. Some databases do more complicated processing than search, sort, split and join, some have to do some heavy manipulation of every record, and in these cases, CPU speed is just as important as memory or disk speed, if not more so.
Case in point: SETI. Terabytes of raw data collected from radio telescopes, no doubt stored in a large database in Berkeley, divided up into manageable records indexed by time and position in the sky. The data is absolutely worthless though, without colossal amounts of processing. So they build their own network of donated CPU time - a global data processing system - in order to turn that worthless data into something worthwhile. It isn't the disks or memory that limits speed of data processing, it's the CPU.
Of course, if they'd had the money they could have simply bought a single machine to do all the data storage and processing, but it would still have been CPU-bound rather than disk- or memory-bound. There are plenty of more conventional systems where everything is done on one physical machine, but which still have the same problem of being bound by CPU.
I guess it depends on how you define a database - is a system that does complex processing of data as well as storage and retrieval still a database, or is it something else?
To add to that, seismic interpretation by Oil companies. Shell have a 1024 node AMD cluster in the Netherlands for this purpose.
simple economics shows why the above is silly.
Silly it might be - that doesn't stop people from attempting it though, and buying and using supercomputers to do it with.
Fact is, any new innovation in trading quickly becomes used by everyone who has a serious enough stake.
Sure. But every time you improve the model, in theory at least you get a short period of having an advantage over everyone else - until they improve their statistical model to match or beat yours. Even if your advantage only lasts a day, and even if it's only a minor advantage, that's easily long enough to make up the cost of the supercomputer and the programming time, and then some, at least if you're a major investor.
No-one said this was something you or I would benefit directly from - at least, not unless you have a stake in the investors doing the market analyses - but to the major market investors, if it gives them an advantage, even a temporary one, good luck to them.
Oil companies like to have serious computer power too, for prospecting and reservoir modelling.
In organic chemistry, you can do some serious molecular simulations ranging from pharmaceuticals through to the actions of enzymes and catalysts.
The fluidics side can even extend through to air-flow modeling (aircraft to cars) and combustion.
A supercomputer is a single system image. Some people call large clusters "supercomputers," but technically they're wrong.
Says who?
Once upon a time 'supercomputer' meant 'any computer made by Seymour Cray', and this was reasonable, because he (probably) invented the concept. Then there was the mid-80's loose but widely-accepted definition 'any computing system that can do more than 200 MIPS'. Then MIPS went out of fashion and processors got faster and it was 'anything that does more than a GigaFlop'. Or there's the US Department of Commerce definition which was 'any computing system that does more than 195 Mtops (Million theoretical operations per second)' during the 80's, which then got changed to 1500 Mtops and is probably something different now.
Note that most Linux cluster systems would meet the requirements of most of these - indeed, most single-CPU computers today would meet most of these requirements, which is how Apple manages to get away with calling the G4 a 'supercomputer'.
Really, these days 'supercomputer' means absolutely anything you want it to be, although if I had to define it, I think probably the fairest definition would be 'anything that can run the LINPACK benchmark suite and get on the Top500 list'.
Nice try at creative redefinition though.
The only place where a military conflict is actually a contest is in between the second and third world, now that many third world nations have nuclear weapons. The second world is not of a mindset that any type of computer simulation would be an acceptable substitute, and even simulating the morale of billions of starving Communists is a staggering proposition.
So we're right back to the third world countries. Ignoring the export ban, do you really think India and Pakistan (who are hardly third world countries anymore) are the types of people who will say, "Look, the computer says we'll win, so why don't you surrender now?"
And I am using the accurate definitions of first, second, and third world. No need for anyone to get offended by whatever false implications they attach to terms like "third world".
Ugg.. I do WISH that people would stop reading "Tom's Hardware", or at least that they would get a clue first and realize that Tom doesn't know dick-all about what he's talking about most of the time.
His comments about heat rising more than 1C/second make NO SENSE AT ALL! It's flat-out wrong! I don't know what orifice he pulled that comment from, but it certainly had no technical backing to it. The chip uses a thermal diode. It will tell you the temperature whenever you poll it. It doesn't matter how fast or slow you poll it, it will give you the temp. You would really have to go out of your way to break this sort of design so that it could only handle a 1C/s temp increase.
As for the heat "problem". AMD's AthlonXP chips have a maximum power consumption of roughly 50-70W. Intel's P4's have a maximum power consumption of roughly 50-70W (yes, they consume almost the exact same amount of power, check the data sheets).
For comparison, Intel's Itanium has a maximum power consumption of around 100-130W, and IBM's Power4 is also on the high-side of 100W.
I know that people hate facts, but here they are:
Power consumption:
AMD AthlonXP 2600+ : 68.3W Max, 62.0W typical
Intel Xeon 2.4GHz: 65W TDP (*)
*TDP = Thermal Design Power, a somewhat ambiguous measure of power that is slightly less than the maximum power the chip can use.
HMMM, I'm sorry that you have failed to see how this is unique :-) You probably should visit http://www.clustermatic.org and read what's there.
Of course it has been going on a long time, I first did it 12 years ago with Suns. But your little 116-node cluster probably did not run into the problems you hit at a larger scale.
Anyway, what linuxbios gets us:
- more sane platform configuration
- we load Linux from flash, so we can use all the capabilities of Linux as our bootstrap
- we boot over Myrinet
- we're not even cabling the Ethernet up
- we don't need to set up the serial network, which you HAVE to set up with kludges like SRM
You're just not going to get that with PXE or SRM.
I realize this detail was not available on the short article.
yep, you can do it with floppies. But do you really want to do it with 1024 floppies given a 10% average failure rate? Think about it.
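To put numbers on that, a rough sketch, assuming the quoted 10% failure rate applies to each floppy independently (the independence is my assumption, not something stated above):

```python
# How bad is booting 1024 nodes from floppies at a 10% failure rate?
# Assumption: each floppy fails independently with the same probability.
NODES = 1024
FAILURE_RATE = 0.10

# Expected number of nodes that fail to boot on any given attempt.
expected_failures = NODES * FAILURE_RATE

# Probability that every single node boots cleanly.
p_all_boot = (1 - FAILURE_RATE) ** NODES

print(f"expected failed boots: {expected_failures:.1f}")
print(f"chance all {NODES} nodes boot: {p_all_boot:.3g}")
```

Roughly a hundred dead floppies per full boot, and effectively zero chance of a clean boot across the whole machine.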
I realize this is Slashdot
Can you actually provide a reference for that 32-node Windows box? Most of the "32 CPU Windows" boxes I have seen run Windows in cells of 4 CPUs, with 8 copies of the OS (e.g. Unisys). Do you really call this scaling? I don't.
ron
I don't know if he made the whole thing up, but there were some fairly major issues that should be blatantly obvious to anyone watching that video and reading the explanation.
First off, that whole 1 C/s temp change max makes absolutely no sense at all! If someone actually told him that then they were either completely clueless, or whoever designed the heat-protection circuitry was going out of their way to try and make it a bad design. AMD has reference specs for this on their webpage, and it's REALLY not complicated!
And then there was the P4 side of things. That P4 apparently stayed at 29C without its heatsink on, because of its thermal throttling. Think about that for a second. If your processor was only at 29C, it WOULD NOT BE THROTTLING! The thermal throttling of the P4 doesn't come into effect until somewhere around 60C; if the temp is less than 60C, the chip would run at full speed, and it would get a LOT hotter than 29C if that were the case.
Then there's simply the fact that the whole test is rather contrived. People just don't rip heatsinks off their processors while they are running. Heatsinks don't fall off unless they were installed by a moron.
So, what we end up with is a contrived "test" that has at least one major and very obvious flaw, not to mention a rather dubious explanation for the results.
Ohh, but it DID get Tom probably a few hundred million page hits, and therefore probably several million dollars worth of advertising revenue.