Slashdot Mirror


LinuxBIOS, BProc-Based Supercomputer For LANL

An anonymous reader writes "LANL will be receiving a 1024 node (2048 processor) LinuxBIOS/BProc based supercomputer late this year. The story is at this location. This system is unique in Linux cluster terms due to no disks on compute nodes, using LinuxBIOS and Beoboot to accomplish booting, and BProc for job startup and management. It is officially known as the Science Appliance, but is affectionately known as Pink to the team that is building much of it."

9 of 189 comments

  1. LinuxBIOS by Anonymous Coward · · Score: 5, Insightful

    I wonder why LinuxBIOS hasn't taken off. I've debated ordering one of their "kits." It seems to me the 3 second boot time of LinuxBIOS should be a selling point for some obscure Linux vendor, but no one really offers it yet.

    I really imagine a machine with an 8MB EEPROM/ROM that can be updated as needed, but provides a boot environment and login screen - while spinning the disks in the background. This would make an excellent product.

    Why hasn't anyone done this yet?

    Curious

  2. Why not use embedded tech? by Chirs · · Score: 4, Insightful


    This sounds like some kind of dual-processor rackmount type solution. Why not go all the way and use something like compactPCI? You can fit 21 cPCI blades into 8U of rackspace.

    A standard blade could have up to a couple gigs of ram, a powerpc or p3/p4 cpu, 100BT or 1000BT ethernet, etc, etc.

    You boot the things using bootp/tftpboot and then run linux off a ramdisk.

    We're using cPCI at work to run VoIP softswitches. Currently we're at over a million calls an hour on a wimpy 450MHz processor.

    1. Re:Why not use embedded tech? by ronaldgminnich · · Score: 2, Insightful
      Good question.

      Why not cPCI? In a word, performance/price on our apps. We looked at all sorts of cPCI blades (e.g. http://www.cluster-labs.de) but the performance just is not there. Also, no existing ethernet will do the job for our apps, so we have to use Myrinet, and again, the fastest Myrinet is going to be in the PCI 64/66 slots on plain old motherboards.

      Other folks have asked about the 1.0 GHz G4. I like PPC. But on our applications the PPC, with the best compilers we can find, is actually slower in absolute terms than a PIII/800. So as much as I would have wished to use a PPC, it's not cost-effective.

      Note that our software runs fine on G3 and G4 macs however -- our standard CD distribution from http://www.clustermatic.org will boot either PC, PPC, or Alpha just fine. In fact, the standard Linux distribution from http://www.terrasoftsolutions.com/products/blacklab/components.shtml features some of our software, including bproc.

      Also, if you look at the PPC offerings from synergy and CSPI you'll find they run their own kind of "Linux in flash" -- not LinuxBIOS, but pretty much the same function. They've been doing this for years.

  3. Re:Uses by abulafia · · Score: 2, Insightful
    statistical modelling - wouldn't you like to know if the stock market is going to go up or down tomorrow, before it happens?



    I agree with you on most respects (even if much of what you're talking about is very, very far beyond most realistically imaginable systems in the near future), but simple economics shows why the above is silly.

    Simple question: someone uses a tool to make a killing on a pre-existing market. How does everyone respond (not counting RIAA, et al, who depend on regulation)? They either curl up and die, or figure out what the winners are doing, and quickly. Learning what people are doing is even easier in markets like finance, where there's a lot of transparency in actions, a very close knit group of participants, people who like to brag, and a lot of people staring at the winners.



    Fact is, any new innovation in trading quickly becomes used by everyone who has a serious enough stake. It is just market economics. Once everyone gets an innovation, it is no longer an advantage, because everyone is doing it (bonus points for those who see past and potential systemic failures lurking in this behaviour).



    Of course, keeping your traders free of risks like sharing information and regulatory oversight can extend an advantage, and that works in a very few situations. But hell, even Warren Buffett took a fairly serious beating recently due to things he couldn't predict (and this is an insurance guy!), not to mention Soros when he attacked Asian currencies a few years ago.



    Not only is there no silver bullet for the folks who run finance, there's just no way in hell peons in the game (anyone with less than a few hundred million invested) will profit from raw computational power. Sorry.



    -j

    --
    I forget what 8 was for.
  4. Don't be so sure by marm · · Score: 3, Insightful

    A supercomputer is a single system image. Some people call large clusters "supercomputers," but technically they're wrong.

    Says who?

    Once upon a time 'supercomputer' meant 'any computer made by Seymour Cray', and this was reasonable, because he (probably) invented the concept. Then there was the mid-80's loose but widely-accepted definition 'any computing system that can do more than 200 MIPS'. Then MIPS went out of fashion and processors got faster and it was 'anything that does more than a GigaFlop'. Or there's the US Department of Commerce definition which was 'any computing system that does more than 195 Mtops (Million theoretical operations per second)' during the 80's, which then got changed to 1500 Mtops and is probably something different now.

    Note that most Linux cluster systems would meet the requirements of most of these - indeed, most single-CPU computers today would meet most of these requirements, which is how Apple manages to get away with calling the G4 a 'supercomputer'.

    Really, these days 'supercomputer' means absolutely anything you want it to be, although if I had to define it, I think probably the fairest definition would be 'anything that can run the LINPACK benchmark suite and get on the Top500 list'.

    Nice try at creative redefinition though.

  5. Re:Good Stuff by gregorio · · Score: 2, Insightful

    Scary because you could buy all the hardware off the shelf for about half a million dollars.

    Scary? Why? Oh, and the interconnect hardware and installation is going to cost you more than 4x this value if you want good latencies and reliability.

  6. Re:I fail to see how this is unique.... by ronaldgminnich · · Score: 2, Insightful

    HMMM, I'm sorry that you have failed to see how this is unique :-) You probably should visit http://www.clustermatic.org and read what's there. Of course it has been going on a long time, I first did it 12 years ago with Suns. But your little 116-node cluster probably did not run into the problems you hit at a larger scale. Anyway, what linuxbios gets us:

    - more sane platform configuration
    - we load linux from flash so can use all the capabilities of linux as our bootstrap
    - we boot over myrinet - we're not even cabling the ethernet up
    - We don't need to set up the serial network which you HAVE to set up with kludges like SRM

    You're just not going to get that with PXE or SRM. I realize this detail was not available on the short article.

  7. Re:Unique? by ronaldgminnich · · Score: 2, Insightful

    yep, you can do it with floppies. But do you really want to do it with 1024 floppies given a 10% average failure rate? Think about it.
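
    To put rough numbers on the floppy point, here is a minimal sketch. It assumes each floppy fails independently with probability 0.10 (the "10% average failure rate" above, read as a per-floppy chance on a given boot); the function names are made up for illustration.

```python
# Rough arithmetic for booting 1024 nodes from floppies.
# Assumption: each floppy fails independently with the same probability.

def expected_failures(nodes, fail_rate):
    """Average number of nodes whose floppy fails on one boot attempt."""
    return nodes * fail_rate


def prob_all_boot(nodes, fail_rate):
    """Chance that every single node boots cleanly on one attempt."""
    return (1.0 - fail_rate) ** nodes


if __name__ == "__main__":
    nodes, fail_rate = 1024, 0.10
    print("expected failed nodes:", expected_failures(nodes, fail_rate))
    print("chance all nodes boot:", prob_all_boot(nodes, fail_rate))
```

    Under that assumption you would expect roughly a hundred dead nodes per boot, and the chance of a clean full-machine boot is effectively zero.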

  8. Re:How can MS not be scared? by ronaldgminnich · · Score: 2, Insightful
    Sad but true, but you're wrong.


    I realize this is Slashdot :-), but I think at this point you need to back up those statements with one or two facts or citations. I can give you one, however: at several conferences in 2000, e.g. ALS 2000, Compaq demonstrated 16 and 32-node Alpha SMP systems with Linux. Scaling did stop at 16 (for kernel builds) with the version of Linux they ran back then. I had a 16-node GS160 and it did scale just fine to 16 nodes.


    Can you actually provide a reference for that 32-node Windows box? Most of the "32 CPU Windows" boxes I have seen run Windows in cells of 4 CPUs, with 8 copies of the OS (e.g. Unisys). Do you really call this scaling? I don't.


    ron