Slashdot Mirror


LinuxBIOS, BProc-Based Supercomputer For LANL

An anonymous reader writes "LANL will be receiving a 1024 node (2048 processor) LinuxBIOS/BProc based supercomputer late this year. The story is at this location. This system is unique in Linux cluster terms due to no disks on compute nodes, using LinuxBIOS and Beoboot to accomplish booting, and BProc for job startup and management. It is officially known as the Science Appliance, but is affectionately known as Pink to the team that is building much of it."

189 comments

  1. Wow... by NineNine · · Score: 0, Troll

    I could serve up a shitload of porn with this. Seriously. No, really. A whole lot of porn.

    1. Re:Wow... by Anonymous Coward · · Score: 0

      Indeed.

    2. Re:Wow... by billd · · Score: 2, Funny
      I could serve up a shitload of porn with this

      Think so? Wouldn't a system with disks be more suitable for that?

      --

      -----

      For great justice!

    3. Re:Wow... by Anonymous Coward · · Score: 0

      Diskless Digitally Rendered Porn. (DDRP)!!!
      NP N&P finally realized!!!!!

    4. Re:Wow... by jpaz · · Score: 2, Funny
      I could serve up a shitload of porn with this

      How many standard Libraries of Congress is a shitload?

    5. Re:Wow... by binaryDigit · · Score: 5, Funny

      Think so? Wouldn't a system with disks be more suitable for that

      Nah, just one honkin RAMDisk. Could serve up mucho porn/warez, when the feds come knockin, just pull the plug, presto, no evidence :)

    6. Re:Wow... by Anonymous+Cowrad · · Score: 1

      a whole buttload of them...

      e2 has a handy guide to *load conversions.

      --
      pants ahoy
    7. Re:Wow... by Anonymous Coward · · Score: 0

      With 2048 processors and an assumed 1GB/node that's 1TB of low latency, super high-jiz RAM.

      With an architecture like that you can avoid making physical copies on dozens or hundreds of nodes..

    8. Re:Wow... by JesseL · · Score: 3, Funny

      Is that an imperial shitload or a metric shitload?

      --
      "Prefiero morir de pie que vivir siempre arrodillado!"
    9. Re:Wow... by foobar104 · · Score: 2

      With 2048 processors and an assumed 1GB/node that's 1TB of low latency, super high-jiz RAM.

      How do you define "low latency?" From a first glance at the evidence, it appears that this cluster just uses plain old TCP/IP over Ethernet as its node interconnect. That's not exactly low latency access to remote memory, you know.

      Just nitpickin'.

    10. Re:Wow... by Anonymous Coward · · Score: 0

      African or European?

    11. Re:Wow... by Anonymous Coward · · Score: 0

      thinner than a human hair

  2. Obligatory Beowulf Post by Anonymous Coward · · Score: 1, Funny
    Imagine a Beowulf cluster of these!
    1. Re:Obligatory Beowulf Post by oz_ko · · Score: 2, Funny
      Imagine a Beowulf cluster of these
      In the interests of preserving disk space, I propose making IABCOT a standard Slashdot acronym.
    2. Re:Obligatory Beowulf Post by Anonymous Coward · · Score: 0

      And I suggested the same on the very same parent just seconds before reading your post! The time is ripe, obviously... but somehow I feel less smart now ;-)

  3. Floyd by Nobley · · Score: 1

    Anybody know if this is a reference to Pink Floyd, if so then I appreciate this team all the more :)

    1. Re:Floyd by Anonymous Coward · · Score: 0

      Pink is simply a color reference in this case. Accelerated Strategic Computing Initiative (ASCI) White and Blue are two other supercomputers used by LLNL, LANL's cousin DOE laboratory. I believe ASCI Purple is the most recent undertaking at LLNL. You can read more at http://www.llnl.gov/asci/purple

    2. Re:Floyd by adpowers · · Score: 1

      If you go to this page you will see an announcement for the cluster.

      And I quote: "We will call it Pink."

    3. Re:Floyd by Anonymous Coward · · Score: 1, Interesting

      I think a more fitting musical allusion would be 'Music from Big Pink', by The Band.

    4. Re:Floyd by Anonymous Coward · · Score: 0


      No, in fact it is a reference to good porn vs. bad porn; i.e., the presence of pink.

  4. Wait... by hkhanna · · Score: 1, Redundant

    Pardon my ignorance, but would this be considered a Beowulf cluster? I mean everyone on /. talks about them so is this it, finally? A real live, Beowulf cluster? If so, imagine a beowulf cluster of beowulf clusters.

    Or, a beowulf cluster of beowulf clusters of beowulf clusters. The possibilities are infinite (literally.)

    --

    Think nothing is impossible? Try slamming a revolving door.
  5. Good Thing (TM) by Nobley · · Score: 0, Flamebait

    It is a good thing it doesn't run Windows, because every year they would have Science Appliance Compliance issues, with updates and $$$

    1. Re:Good Thing (TM) by Anonymous Coward · · Score: 0

      Watch it, buster. You are dangerously close to a one-way ticket onto my foes list.

    2. Re:Good Thing (TM) by Anonymous Coward · · Score: 0

      Whooaa. Great big AC telling me to watch it. I'm sooooooo scared.

    3. Re:Good Thing (TM) by Anonymous Coward · · Score: 0

      You might want to check and see if you meet any of the criteria on this list. I think your chances of finding a match are pretty good.

    4. Re:Good Thing (TM) by Anonymous Coward · · Score: 0

      Yeah, you better watch it. You just made it on my friends list.

    5. Re:Good Thing (TM) by Anonymous Coward · · Score: 0

      you have just been put on my foes of my enemies second cousin list

  6. Not a Beowulf cluster by Anonymous Coward · · Score: 4, Funny

    But if you'd replace the expensive high-performance interconnect with a cheap ethernet, then it would be a Beowulf cluster.

    1. Re:Not a Beowulf cluster by DeathPenguin · · Score: 1

      Oh shut the f*** up, Erik Hendriks ownz j00!

    2. Re:Not a Beowulf cluster by Nightwraith · · Score: 1

      Sure, but then it wouldn't be High-Performance now would it?

      Processor speed is only part of the equation in HP Computing, remember that those nodes have to WAIT for the data to go between each other.

      Ethernet's latency is simply too high for a system such as this...

    3. Re:Not a Beowulf cluster by Anonymous Coward · · Score: 0

      Not even with cheap Ethernet would it be a Beowulf cluster. It's quite different, from my understanding, from the BIOS to the network fs.

  7. Most commonly asked question about the computer by Anonymous Coward · · Score: 1, Funny

    Which one is pink?

  8. Who needs a Beowulf cluster of these? by SexyKellyOsbourne · · Score: 1, Funny

    Just like the Iraqis and the Chinese, I do all my nuclear weapons testing on my Playstation 2 Supercomputer!

    1. Re:Who needs a Beowulf cluster of these? by Anonymous Coward · · Score: 0

      This is funny as hell. What is it modded down for?

  9. fps? by iocc · · Score: 0

    And how many FPS will they get in Quake?

  10. Uses by esac17 · · Score: 1, Interesting

    Let's just hope they do something good with this. I'm tired of reading about how supercomputers are used for military war simulations.

    Does anybody know other applications that supercomputers are being used for. I know some do weather predictions.

    1. Re:Uses by saveth · · Score: 2, Informative

      Let's just hope they do something good with this. I'm tired of reading about how supercomputers are used for military war simulations.

      LANL tends to do projects that are focused much more on science and engineering than military applications. It's very likely that Pink will end up analysing spectral emissions of bombarded protons or something like this.

      The military simulations you mention probably don't happen at LANL.

    2. Re:Uses by Anonymous Coward · · Score: 0

      YEAH.... quake, unreal,....

    3. Re:Uses by Flat5 · · Score: 1


      That opinion is, shall we say, not very informed.

      You do realize that Los Alamos is the child of the Manhattan Project, don't you? The former home of Wen Ho Lee? Ringing any bells yet?

      Flat5

    4. Re:Uses by ptbrown · · Score: 2

      On the contrary, I wouldn't mind seeing more military war simulations being done on supercomputers; so long as they are carried out as an alternative to actual military war.

      Think about it: Instead of wasting all the money, resources, and lives of actually invading another country, we just get a few supercomputers into a network, and duke it out online.

      First thing, of course, would be to allow the export of supercomputers to blacklisted countries. (Is Afghanistan still on the list, I wonder?) Then get a UN resolution that all member countries will abide by the outcome of any virtual war.

      And hey, the US has already got a head-start in training soldiers for it: "America's Army"!

      --
      Any sufficiently advanced civilization is indistinguishable from Gods.
    5. Re:Uses by marm · · Score: 5, Interesting

      Does anybody know other applications that supercomputers are being used for. I know some do weather predictions.

      Ok, non-military uses, off the top of my head:

      • mathematical research - simply complicated maths on big numbers
      • fluid dynamics modelling - traffic flows, or aerodynamics, or hydrodynamics - this is also tied in quite closely with weather/climate prediction
      • statistical modelling - wouldn't you like to know if the stock market is going to go up or down tomorrow, before it happens?
      • computational chemistry/biochemistry - protein folding is just the tip of the iceberg - imagine being able to design a molecule and then simulate the effect it will have on the human body, without that substance ever having been actually synthesized or going near a human... this is the future of drug development
      • quantum mechanical simulation - related to computational chemistry, imagine taking all those complicated quantum mechanics equations to their logical conclusions, predicting as-yet undiscovered subatomic particles and their behaviour, or to design better magnetic containment fields so that practical fusion energy generation is possible
      • good old-fashioned databases and signal processing - when you have hundreds of terabytes of data that you wish to mine for interesting patterns, speed matters

      I'm sure there are plenty more applications for supercomputer power - any kind of complicated or chaotic system is a good candidate for modelling, especially when there's more than one unknown variable (multivariate analysis is complicated, to say the least).

    6. Re:Uses by Anonymous Coward · · Score: 0

      I'm tired of reading about how supercomputers are used for military war simulations.

      Yeah, enough with that "global thermonuclear war"... how bout some tic-tac-toe?

      Or maybe a good game of chess, Dr. Falken?

    7. Re:Uses by afidel · · Score: 3, Interesting

      The largest supercomputer in the world (largest by a long shot; it outpowers the rest of the top 10 combined) is the NEC Earth Simulator in Japan. It is being used to do the most detailed climate modeling ever attempted. Not only that, but they are attempting a complete system model, which AFAIK has never before been possible. In addition, the last couple of clusters that I have read about have been for biomedical research; maybe it's just what I read, but I believe bioinformatics is going to be one of the biggest pushers of HPC going forward. Genomics is nothing compared to proteomics: mapping the genome probably takes about as much computing power as simulating the folding of one large protein series!

      --
      There are 4 boxes to use in the defense of liberty: soap, ballot, jury, ammo. Use in that order. Starting now.
    8. Re:Uses by kakos · · Score: 1

      Hell no. If that happened, Korea would take over the world. Starcraft has taught us that we should never mess with the Koreans when it comes to virtual war!

    9. Re:Uses by DasBub · · Score: 2, Funny

      Well, Hollywood has used supercomputers and large clusters to do effects for movies like Star Wars: Episode II, Resident Evil, and the upcoming Terminator 4.

      So, no, there haven't been any good uses.

    10. Re:Uses by DasBub · · Score: 1

      Er, 3. Terminator 3...

      But we all know there'll be a T4.

      SkyNet probably sends Vin Diesel back in time to beat a 27-year-old John Connor at a game of pick-up-sticks, thereby destroying his ego.

    11. Re:Uses by abulafia · · Score: 2, Insightful
      statistical modelling - wouldn't you like to know if the stock market is going to go up or down tomorrow, before it happens?



      I agree with you on most respects (even if much of what you're talking about is very, very far beyond most realistically imaginable systems in the near future), but simple economics shows why the above is silly.

      Simple question: someone uses a tool to make a killing on a pre-existing market. How does everyone respond (not counting RIAA, et al, who depend on regulation)? They either curl up and die, or figure out what the winners are doing, and quickly. Learning what people are doing is even easier in markets like finance, where there's a lot of transparency in actions, a very close knit group of participants, people who like to brag, and a lot of people staring at the winners.



      Fact is, any new innovation in trading quickly becomes used by everyone who has a serious enough stake. It is just market economics. Once everyone gets an innovation, it is no longer an advantage, because everyone is doing it (bonus points for those who see past and potential systemic failures lurking in this behaviour).



      Of course, keeping your traders free of risks like sharing information and regulatory oversight can extend an advantage, and that works in a very few situations. But hell, even Warren Buffett took a fairly serious beating recently due to things he couldn't predict (and this is an insurance guy!), not to mention Soros when he attacked Asian currencies a few years ago.



      Not only is there no silver bullet for the folks who run finance, there's just no way in hell peons in the game (anyone with less than a few hundred million invested) will profit from raw computational power. Sorry.



      -j

      --
      I forget what 8 was for.
    12. Re:Uses by Lazyhound · · Score: 1
      On the contrary, I wouldn't mind seeing more military war simulations being done on supercomputers; so long as they are carried out as an alternative to actual military war.

      Wasn't that a Star Trek episode? I don't think it turned out very well...

    13. Re:Uses by foobar104 · · Score: 3, Informative

      Wrong. Render farms are neither clusters nor supercomputers. At best, a render farm might be considered an array.

      A supercomputer is a single system image. Some people call large clusters "supercomputers," but technically they're wrong.

      A cluster is an interconnected group of computers that can communicate with each other. Usually a cluster depends on some kind of software layer to allow programs to run across multiple systems, something like MPI. Clusters are tightly interconnected many-to-many systems.

      An array has a single job control system and a number of job execution systems. Batch jobs are submitted by users to the job control system, which doles them out to the various execution systems and then collects the results. The execution nodes don't talk to each other, and one job runs on one execution node at a time. Render farms are basically arrays; each execution node works on rendering a single frame of a multiframe animation. Because each frame can be rendered independently, without any dependencies on the previous and subsequent frames, rendering is particularly well suited to array computing.
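      The array model described above (one job control system, independent execution nodes, one job per node at a time) can be sketched roughly as follows. This is only an illustration: `render_frame` and `run_array` are hypothetical names, the "renderer" is a stand-in that returns a filename, and local threads play the role of execution nodes.

```python
from concurrent.futures import ThreadPoolExecutor


def render_frame(frame_number):
    """Stand-in for a real renderer: each frame depends only on its own number."""
    return f"frame-{frame_number:04d}.img"


def run_array(frame_numbers, execution_nodes=4):
    """Job control: hand one frame to one worker at a time and collect results.

    The workers never communicate with each other, which is what
    distinguishes this array model from a tightly coupled cluster.
    """
    with ThreadPoolExecutor(max_workers=execution_nodes) as pool:
        # map() preserves input order, so results line up with frame_numbers.
        return list(pool.map(render_frame, frame_numbers))


if __name__ == "__main__":
    print(run_array(range(1, 6)))
```

      Because no frame needs data from any other frame, adding execution nodes only changes how many frames are in flight at once, not the result.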

    14. Re:Uses by Isle · · Score: 1

      Actually, this was widely used before the tech bubble. The idea is that a computer can generate predictions a fraction of a percent better than most humans. Since the percentage is so low, you let the computer invest huge amounts of money.

      Of course, when the entire market crashes, these machines lose money a lot faster than humans, since no one has taught them to pull out, and because they have so much money invested.

      So while much of the research in this area died suddenly sometime in early 2000, it still proves your theory wrong.

    15. Re:Uses by cameldrv · · Score: 1

      You obviously don't know much about LANL.

    16. Re:Uses by 2short · · Score: 1

      "good old-fashioned databases and signal processing - when you have hundreds of terabytes of data that you wish to mine for interesting patterns, speed matters"

      I agree with most of your points, but as someone with only a few dozen terabytes to worry about, I can say, speed does matter. DISK speed. MEMORY access speed. With that much data, CPU speed beyond "decent" is really only useful as something you can trade for less disk access via compression, and even that hits a wall pretty fast. Supercomputers like this one are great for big calculation; not so much for big data.

    17. Re:Uses by 2short · · Score: 1

      "any kind of complicated or chaotic system is a good candidate for modelling"

      Actually, perhaps the chief feature of chaotic systems is that they are NOT good candidates for modeling.

    18. Re:Uses by DasBub · · Score: 1

      There's at least one movie that I can use to thumb my nose at you. The Last Starfighter. It used a Cray X-MP to do 36000 frames.

      In any case, I was trying to be a funny little shit with regards to the crap that they've produced. I wasn't trying to be technically correct. But thanks for the education.

    19. Re:Uses by marm · · Score: 2

      Supercomputers like this one are great for big calculation; not so much for big data.

      If all you're doing is storing data and then retrieving subsets of it on demand, then sure, I agree. Most databases are like that - but not all. Some databases do more complicated processing than search, sort, split and join, some have to do some heavy manipulation of every record, and in these cases, CPU speed is just as important as memory or disk speed, if not more so.

      Case in point: SETI. Terabytes of raw data collected from radio telescopes, no doubt stored in a large database in Berkeley, divided up into manageable records indexed by time and position in the sky. The data is absolutely worthless though, without colossal amounts of processing. So they build their own network of donated CPU time - a global data processing system - in order to turn that worthless data into something worthwhile. It isn't the disks or memory that limits speed of data processing, it's the CPU.

      Of course, if they'd had the money they could have simply bought a single machine to do all the data storage and processing, but it would still have been CPU-bound rather than disk or memory-bound. There are plenty of more conventional systems where everything is done on one physical machine, but still have the same problem of being bound by CPU.

      I guess it depends on how you define a database - is a system that does complex processing of data as well as storage and retrieval still a database, or is it something else?

    20. Re:Uses by larien · · Score: 2

      To add to that, seismic interpretation by Oil companies. Shell have a 1024 node AMD cluster in the Netherlands for this purpose.

    21. Re:Uses by marm · · Score: 2

      simple economics shows why the above is silly.

      Silly it might be - that doesn't stop people from attempting it though, and buying and using supercomputers to do it with.

      Fact is, any new innovation in trading quickly becomes used by everyone who has a serious enough stake.

      Sure. But every time you improve the model, in theory at least you get a short period of having an advantage over everyone else - until they improve their statistical model to match or beat yours. Even if your advantage only lasts a day, and even if it's only a minor advantage, that's easily long enough to make up the cost of the supercomputer and the programming time, and then some, at least if you're a major investor.

      No-one said this was something you or I would benefit directly from - at least, not unless you have a stake in the investors doing the market analyses - but to the major market investors, if it gives them an advantage, even a temporary one, good luck to them.

    22. Re:Uses by CreamsicleSeventeen · · Score: 1

      So you're saying.. like, GW 'n Sadam should play Return to Castle Wolfenstein?

    23. Re:Uses by back_pages · · Score: 2
      Come on, like there was anyone on Earth who thought that Afghanistan was going to destroy the US... or that Iraq has an ace in the hole that's going to bring us down... or that anyone, anywhere, has enough strictly military might to mess with the West. You've got to be kidding.

      The only place where a military conflict is actually a contest is in between the second and third world, now that many third world nations have nuclear weapons. The second world is not of a mindset that any type of computer simulation would be an acceptable substitute, and even simulating the morale of billions of starving Communists is a staggering proposition.

      So we're right back to the third world countries. Ignoring the export ban, do you really think India and Pakistan (who are hardly third world countries anymore) are the types of people who will say, "Look, the computer says we'll win, so why don't you surrender now?"

      And I am using the accurate definitions of first, second, and third world. No need for anyone to get offended by whatever false implications they attach to terms like "third world".

    24. Re:Uses by garglblaster · · Score: 1
      .. didn't the Iraqis propose something similar recently..

      - not sure if they were joking though ;)

      --

      perl -e 'printf("%x!\n",49153)'

    25. Re:Uses by BlowChunx · · Score: 1

      New trading innovations are not necessarily adopted widely by everyone quickly. Technological innovations, sure, but not all innovations are technology based.

      Example: Benoit Mandelbrot has been working on using the concept of fractals to predict market changes. His intellect is unique and not likely to be duplicated easily by others. Unless of course he decides to publish his results.

    26. Re:Uses by Anonymous Coward · · Score: 0

      There's tons of scientific applications in materials, e.g. million atom simulations, ion implantation, fracture, glassy systems etc.
      Theoretical chemists use supercomputers heavily.

    27. Re:Uses by 2short · · Score: 1

      Fair enough. I guess I just wouldn't lump "good old-fashioned databases" in with "signal processing". SETI is definitely the latter, but I'd say not the former. While the data is stored until processed, I'd imagine it's dropped once found to be nothing but noise. The case I'm thinking of (the one I work with) is when you have a whole bunch o' data, and you want to find all the records (or more to the point, combinations of records) that meet some complex criteria; then you want to try some slightly different criteria; then some totally different criteria; etc., until the imagination of your clients is exhausted (never). If you build the engine right, the complex criteria aren't, and it's all about how fast you can load up records, but the "combinations of" thing (at least in my case) hoses the approach of just parallelizing the bottleneck away.

      Where I work we do some frankly amazing things along these lines, and people's jaws drop. Then, invariably, they ask what kind of amazing processing power is behind it. I smile, and explain that the data box is a dual processor, MHz only slightly faster than the one on their desk. They conclude our software is godlike, and thus that I am a programmer of mad skilz beyond their wildest dreams. Not that it's not true :), but at this point, I see little reason to mention the 20 terabytes of fast RAID and several gigs of RAM that are the real story.

      Basically, it slightly peeves me that huge-processing power systems get all the press, when for a lot of problems, you can design algorithms that let you throw disk & memory at the problem instead of CPU. Throwing CPU at a problem means throwing bucks at it by orders of magnitude versus storage.

      Granted, the people building the system in the article are probably no dummies, so they probably really do have problems that must be attacked with CPU, but I've seen plenty of dummies who didn't and went that way anyway because they think that's the only measure of computer power.

  11. Unique? by Anonymous Coward · · Score: 0

    I'm sure a diskless cluster has been done before. Maybe not with the linuxbios, sure, but who cares how you accomplish the remote booting?

    1. Re:Unique? by goombah99 · · Score: 1

      Oh really. Try it sometime! You'll find out why people don't do it! It's very hard to build a scalable diskless system.

      --
      Some drink at the fountain of knowledge. Others just gargle.
    2. Re:Unique? by kl76 · · Score: 1

      Yep, it's been done before. In fact, I did it two years ago when I built a 17-node (16 diskless compute nodes plus a front-end server) Beowulf cluster using Etherboot loaded from floppy (icky, I know but it worked). Pretty straightforward if you've ever set up diskless UNIX workstations in a previous life...

    5. Re:Unique? by ebiederm · · Score: 1

      Which is one of the major points of Pink: to resolve the scalability issues.

    6. Re:Unique? by ronaldgminnich · · Score: 2, Insightful

      yep, you can do it with floppies. But do you really want to do it with 1024 floppies given a 10% average failure rate? Think about it.
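      To put rough numbers on the floppy argument above, here is a back-of-the-envelope sketch. It assumes (this is not stated in the post) that the 10% figure is an independent per-floppy chance of failing on a given boot; the function names are made up for illustration.

```python
def expected_failures(nodes, failure_rate):
    """Average number of nodes whose boot floppy fails."""
    return nodes * failure_rate


def probability_all_boot(nodes, failure_rate):
    """Chance that every node boots, assuming independent failures."""
    return (1.0 - failure_rate) ** nodes


if __name__ == "__main__":
    # 16 compute nodes (the small Etherboot cluster) vs. 1024 (Pink).
    for nodes in (16, 1024):
        print(
            nodes,
            "nodes: expect",
            round(expected_failures(nodes, 0.10), 1),
            "failures; P(all boot) =",
            probability_all_boot(nodes, 0.10),
        )
```

      Under that assumption a 16-node cluster still boots cleanly a meaningful fraction of the time, while at 1024 nodes you should expect about a hundred dead floppies and essentially never see a clean boot.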

    7. Re:Unique? by kl76 · · Score: 1

      No, but then I could have burned Etherboot into a Flash chip on the (3C905) Ethernet cards in each compute node, which would have been neater (if I could have got a hold of the chips at the time...)

  12. cannot use pink by norwoodites · · Score: 2

    as it is an OS by Apple and IBM (well gone but still)

    1. Re:cannot use pink by Loligo · · Score: 2

      >as it is an OS by Apple and IBM (well gone but
      >still)

      AFAIK, "Pink" was just the internal code name for the eventual OS planned for the PowerPC.

      And we all know how THAT turned out.

      -l

    2. Re:cannot use pink by dome+troll · · Score: 1

      Los Alamos is run by the gub'ment and they don't have to abide by patents or copyrights. The military, for example, reverse engineers software all the time to make it more secure. As to your second question, it's really coarse, it tears easily, and it sucks if you make bed sheets out of it...oops. Sorry, thought you said muslin.

    3. Re:cannot use pink by Anonymous Coward · · Score: 0

      Now, if only the real pink (you know, the singer) would show up for the delivery to sign one of the racks. That would make it the ultimately cool cluster. Anyone know how to contact her?

    4. Re:cannot use pink by PerryMason · · Score: 1

      Yes, because people are _bound_ to confuse a 2048 processor cluster with an operating system when LANL start marketing them.......duh.

      --
      "I'm tired of all this 'Aren't humanity great' bullshit. We're a virus with shoes" - Bill Hicks
    5. Re:cannot use pink by Neon+Spiral+Injector · · Score: 3, Funny

      And what about the pop star who needs a belt for her pants (but I hope she doesn't get one)?

    6. Re:cannot use pink by sql*kitten · · Score: 2

      as it is an OS by Apple and IBM (well gone but still)

      Maybe they mean this Pink? I know which one I prefer ;-)

  13. Two great things that go great together by ebuck · · Score: 1

    Diskless X clients have been attractive due to the lack of remote configurations and disk/data failures.

    Clusters suck up a lot of electricity because of the hardware they support.

    I'm sure the next step involves skipping the extra motherboard components (IDE, USB, AGP, etc.) and making the CPU/Memory mount to a TCP/IP Switch backplane. Better yet, drive the thing with low power CPUs so it won't sound like a helicopter prior to take off/require a new nuclear plant to power it up/create a new market for Frigidaire.

    1. Re:Two great things that go great together by Anonymous Coward · · Score: 0

      Yeah, but won't low-power CPU's decrease the overall effectiveness of a cluster?? Most clusters are used for fairly math-intensive stuff - it's my understanding that most low-power CPU's don't have as much "oomph" in processing terms (for lack of a better word). Although I did see a set of Transmeta specs that mentioned something about being equivalent or nearly equivalent with the Crusoe processors...

    2. Re:Two great things that go great together by Anonymous Coward · · Score: 0
      Although I did see a set of Transmeta specs that mentioned something about being equivalent or nearly equivalent with the Crusoe processors...

      I'd hope they were equivalent, since Transmeta makes Crusoe processors.

    3. Re:Two great things that go great together by Uebergeek · · Score: 1

      Actually, the key attraction for diskless where I work is the security crap that goes along with controls on harddrives.

      For instance, when a secret-level project uses a cluster, then all the hdd's for that project have to be controlled, and removed when the cluster is used for something else. This is a *major* bitch when it's a big cluster.

  14. LinuxBIOS by Anonymous Coward · · Score: 5, Insightful

    I wonder why LinuxBIOS hasn't taken off. I've debated ordering one of their "kits." It seems to me the 3 second boot time of LinuxBIOS should be a selling point for some obscure Linux vendor, but no one really offers it yet.

    I really imagine a machine with an 8MB EEPROM/ROM that can be updated as needed, but provides a boot environment and login screen - while spinning the disks in the background. This would make an excellent product.

    Why hasn't anyone done this yet?

    Curious

    1. Re:LinuxBIOS by brsmith4 · · Score: 2, Interesting

      Probably the same reason we aren't on IPv6 yet: not enough need to incite change. I agree with you though, I would love to have 2-3 second boot times.

    2. Re:LinuxBIOS by Anonymous Coward · · Score: 0
      Why hasn't anyone done this yet?

      What would the per-unit hardware cost increase be to use an 8 meg EEPROM? It might be significant, especially in a high volume, low markup marketplace

    3. Re:LinuxBIOS by Anonymous Coward · · Score: 0

      As an embedded developer, I have seen 3-second boots
      for many years. Fast boots are, and have been,
      very common.

    4. Re:LinuxBIOS by ronaldgminnich · · Score: 1

      Why hasn't it taken off? Because it doesn't boot Windows, and I don't really care about that.

      But if you are counting, LinuxBIOS is a ca. $16M business this year, so that's good enough for me. Plus, it runs in places you might be surprised to see ...

      See www.cluster-labs.de or linuxbios.com or lnxi.com or cwlinux.com for motherboards you can buy that run LinuxBIOS.

      Ron, the linuxbios guy

    5. Re:LinuxBIOS by LuxuryYacht · · Score: 2

      The problem is that motherboard vendors still tend to only use 2M-bit flash devices and only route the address lines from the chipset to the flash for 2M bit flash BIOS devices.

      An 8M-bit flash BIOS kit would need to have its memory footprint paged into the 2M-bit memory space that is routed on the motherboards or include schematics, solder, wire and soldering iron for the installer to tie the needed memory address lines back to the chipset.

      --
      Quidquid latine dictum sit altum viditur
    6. Re:LinuxBIOS by ebiederm · · Score: 1

      Actually I'm willing to guess it hasn't done more simply because it has a new design, and it takes people a while to get used to things like this. From what I can tell uptake has been snowballing for a long time. The snowball just started very very small... :) As for booting Windows, it could if anyone was sufficiently motivated, but that doesn't appear to be the niche of LinuxBIOS.

    7. Re:LinuxBIOS by DeathPenguin · · Score: 1

      I don't know if things have changed much in the past year, but last I heard it was because companies are paranoid and don't want to share adequate documentation necessary to make LinuxBIOS a success for both cluster applications and home users. Video chipset makers like nVidia and ATi wouldn't tell them how to initialize VGA BIOS, AMD wouldn't tell them how to initialize L2 cache on the Athlons. Setbacks in documentation. About the only company who's not a bunch of jerks to these guys is SiS... Click here to see a list of their working motherboards. I'm tempted to get a K7SEM just to see how well it works in its current state.

  15. Medical (Was:Uses) by srw · · Score: 3, Interesting

    A former client who worked at a Cancer Center used a cluster to simulate radiation treatments.

  16. How can MS not be scared? by Anonymous Coward · · Score: 0

    When Linux runs on the smallest devices all the way up to boxen like this 10 tera-FLOPs beast and this SGI supercomputer that just set a memory bandwidth speed record with 120GB per second (faster than a Sun SunFire 15K, Cray C90, IBM p690, etc) on a single system image? Scale it any which way you like ;-)

    Rock on with your bad self.

    1. Re:How can MS not be scared? by Anonymous Coward · · Score: 0

      It's "boxes," twat. There's no such word as "boxen."

    2. Re:How can MS not be scared? by swissmonkey · · Score: 1

      Well...

      It's perhaps 10 teraFLOPS, but it's 1024 computers loosely coupled. The SGI box has 64 CPUs, and Linux's scheduler has nothing that matches the capabilities of Solaris or IRIX when it comes to scalability in terms of number of CPUs; it's even worse than Windows in this respect, and .NET Server has got a lot of improvements in that regard...

      Windows runs correctly (i.e. with acceptable performance) on computers with 1 to 32 CPUs TODAY, and .NET Server will extend this to 64 and perhaps even more.

      As of today, nobody uses Linux IN PRODUCTION with more than 8 CPU, so actually, Linux is playing catch up with Windows in this field also.

      Sad but true.

    3. Re:How can MS not be scared? by PerryMason · · Score: 1

      As of today, nobody uses Linux IN PRODUCTION with more than 8 CPU, so actually, Linux is playing catch up with Windows in this field also.

      Yeah, but only cause you need a 32 processor machine just to get XP to run at a reasonable clip!!

      In all seriousness though, all the benchmarks indicate that you get better performance and scalability from multi-node versus multi-cpu machines anyway.

      --
      "I'm tired of all this 'Aren't humanity great' bullshit. We're a virus with shoes" - Bill Hicks
    4. Re:How can MS not be scared? by chez69 · · Score: 0

      For certain problems, but for some problems where you can't break up the problem into subproblems easily, big ass supercomputers will beat clusters.

      --
      PHP is the solution of choice for relaying mysql errors to web users.
    5. Re:How can MS not be scared? by ronaldgminnich · · Score: 2, Insightful
      Sad but true, but you're wrong.


      I realize this is Slashdot :-), but I think at this point you need to back up those statements with one or two facts or citations. I can give you one, however: at several conferences in 2000, e.g. ALS 2000, Compaq demonstrated 16 and 32-node Alpha SMP systems with Linux. Scaling did stop at 16 (for kernel builds) with the version of Linux they ran back then. I had a 16-node GS160 and it did scale just fine to 16 nodes.


      Can you actually provide a reference for that 32-node Windows box? Most of the "32 CPU Windows" boxes I have seen run Windows in cells of 4 CPUs, with 8 copies of the OS (e.g. Unisys). Do you really call this scaling? I don't.


      ron

  17. Why not use embedded tech? by Chirs · · Score: 4, Insightful


    This sounds like some kind of dual-processor rackmount type solution. Why not go all the way and use something like compactPCI? You can fit 21 cPCI blades into 8U of rackspace.

    A standard blade could have up to a couple gigs of ram, a powerpc or p3/p4 cpu, 100BT or 1000BT ethernet, etc, etc.

    You boot the things using bootp/tftpboot and then run linux off a ramdisk.

    We're using cPCI at work to run VoIP softswitches. Currently we're at over a million calls an hour on a wimpy 450MHz processor.

    1. Re:Why not use embedded tech? by ziegast · · Score: 1

      At 8U you get to use standard (cheap!) ATX Xeon motherboards and put them in your cases. Xeon motherboards give you very fast CPUs with lots of memory (8GB-12GB) and usually one or two GigE built-in. This is what the high-end-computing customers want - concentrated computing power. VOIP is much less demanding. If you want to use Xeon with CompactPCI, you currently need to make your own motherboard ($$$$ in initial engineering costs) and figure out how to cool it (small fans don't work well).

      Does anyone know any good CompactPCI Xeon manufacturers? Doubt it.

    2. Re:Why not use embedded tech? by Elbereth · · Score: 2

      Because that's more specialized and not as mass produced, it's going to end up costing a bit more. I, personally, have never played with cPCI, and I've played with some esoteric, technical stuff. I'm not sure that they'd have the experience necessary with that. They might need to hire someone or train someone. Once you start getting into the embedded world, you need more training than the average guy on Slashdot has.

      cPCI with PowerPC processors would be just too damn cool. I've looked at them at Motorola's web site. I just wish I could find an application for them!

    3. Re:Why not use embedded tech? by larien · · Score: 2
      Unless I missed the announcement, linux is still stuck with 2GB RAM max, at least on x86.

      Also, if you can fit enough P4s in blades, you can beat the Xeon's processing power for less cost and floorspace; this is, after all, why x86 clusters are beating the big iron servers from Sun & IBM on price/performance. Of course, you have to have a load which can be distributed in this way; if you have fewer, heavy duty compute requirements, Xeons may be the best fit for the job.

    4. Re:Why not use embedded tech? by CreamsicleSeventeen · · Score: 1

      To quote /usr/src/linux/Documentation/Configure.help, "Linux can use up to 64 Gigabytes of physical memory on x86 systems."

    5. Re:Why not use embedded tech? by Anonymous Coward · · Score: 1, Interesting

      Any one process can only use 3GB out of that 64GB. The 64GB thing is a typical Intel kludge, analogous to the godawful memory segments of the DOS days.
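
      A back-of-the-envelope sketch of where the 64GB and 3GB figures come from, assuming PAE's 36-bit physical addresses and the common 3GB user / 1GB kernel split of a 32-bit virtual address space. The constant names are illustrative only, not kernel identifiers:

```python
# Assumptions: 36-bit physical addressing (PAE) and a 32-bit virtual
# address space with 1 GiB reserved for the kernel (the usual 3/1 split).
GIB = 2**30
PHYS_ADDR_BITS = 36
VIRT_ADDR_BITS = 32
KERNEL_RESERVED_GIB = 1

# Physical memory the machine as a whole can address.
phys_limit_gib = 2**PHYS_ADDR_BITS // GIB

# Virtual address space visible to any single process.
virt_limit_gib = 2**VIRT_ADDR_BITS // GIB
user_limit_gib = virt_limit_gib - KERNEL_RESERVED_GIB

print(f"physical limit: {phys_limit_gib} GiB")
print(f"per-process user space: {user_limit_gib} GiB")
```

      So both posters can be right at once: the box can hold far more RAM than any one process can map.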

    6. Re:Why not use embedded tech? by Anonymous Coward · · Score: 0

      Try http://www.mc.com/defense_electronics/news_detail.cfm?press_id=2002_02_12_074243_757667pr.cfm

      4 G4 nodes on cPCI, interconnected, with an 8240 host.

    7. Re:Why not use embedded tech? by nenolod · · Score: 1

      According to "make menuconfig", General Setup, There is now a high memory access setting, it goes all the way up to 2048gb or something.

    8. Re:Why not use embedded tech? by Wesley+Felter · · Score: 2

      Northwood Pentium 4s use as much space and power as Prestonia Xeons, because they're practically the same thing. To get high density you need to use much slower CPUs.

    9. Re:Why not use embedded tech? by SurfsUp · · Score: 2

      Unless I missed the announcement, linux is still stuck with 2GB RAM max, at least on x86.

      You missed the announcement. Linux supports up to 64 GB per ia32 processor, has done for years, with at least one 32 GB system out there in the wild and 8 GB systems fairly common.

      --
      Life's a bitch but somebody's gotta do it.
    10. Re:Why not use embedded tech? by ronaldgminnich · · Score: 2, Insightful
      Good question.

      Why not cPCI? In a word, performance/price on our apps. We looked at all sorts of cPCI blades (e.g. http://www.cluster-labs.de) but the performance just is not there. Also, no existing ethernet will do the job for our apps, so we have to use Myrinet, and again, the fastest Myrinet is going to be in the PCI 64/66 slots on plain old motherboards.

      Other folks have asked about 1.0 Ghz G4. I like PPC. But on our applications the PPC, with the best compilers we can find, is actually slower in absolute terms than a PIII/800. So as much as I would have wished to use a PPC, it's not cost-effective.

      Note that our software runs fine on G3 and G4 macs however -- our standard CD distribution from http://www.clustermatic.org will boot either PC, PPC, or Alpha just fine. In fact, the standard Linux distribution from http://www.terrasoftsolutions.com/products/blacklab/components.shtml features some of our software, including bproc.

      Also, if you look at the PPC offerings from synergy and CSPI you'll find they run their own kind of "Linux in flash" -- not LinuxBIOS, but pretty much the same function. They've been doing this for years.

    11. Re:Why not use embedded tech? by DeathPenguin · · Score: 1

      http://www.acl.lanl.gov/linuxbios/clusters/shrek/index.html

      http://www.acl.lanl.gov/linuxbios/clusters/bento/index.html

      Heat dissipation is always a problem with fast CPUs too. Doesn't do you much good to double up your nodes if you have them crashing all the time, or downthrottling.

    12. Re:Why not use embedded tech? by ronaldgminnich · · Score: 1
      Another thing I forgot to mention, sorry.

      The LNXI modules in Pink are actually .8U effective space. I have a 1U RLX rack here with 800is. As much as I like the RLX, fact is that in terms of computing per cubic inch, the LNXI node is about 5 times the performance. Yup, 5 times:

      The LNXI module takes up about the space of 4 of the RLX blades. In the space of four RLX PIII/800s, LNXI puts 2 2.4 Ghz. Xeons. On some of our apps, measured performance of 1 2.4 Ghz. Xeon is 2.8x the PIII/800. Do the math: same cubic inches, ca. 5x the throughput, 1/2 the number of CPUs.

      Not being a blade or embedded node doesn't automatically mean you are less dense in computing per cubic inch!

      ron, one of the Pink guys

  18. Lots of chip programming by Anonymous Coward · · Score: 3, Funny

    I don't envy the developers... After every revision of LinuxBIOS, they get to reflash 1024 motherboards, which could take a while...

    1. Re:Lots of chip programming by Anonymous Coward · · Score: 3, Informative

      Not really, a new revision can be flashed with a single utility that can be run on all the nodes in parallel.
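
      A minimal sketch of the "all nodes in parallel" idea, not the actual tool used on these clusters. Here `flash_node` is a hypothetical stand-in that only simulates success; a real version would run the site's flash utility on the node and report whether it worked:

```python
from concurrent.futures import ThreadPoolExecutor


def flash_node(node_id):
    """Hypothetical placeholder: pretend to flash one node.

    A real implementation would invoke the flash utility on that node
    and return whether it succeeded. This stub always reports success.
    """
    return node_id, True


def flash_all(node_ids, max_workers=64):
    """Flash many nodes concurrently and collect the ones that failed."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        results = dict(pool.map(flash_node, node_ids))
    failed = sorted(node for node, ok in results.items() if not ok)
    return results, failed


results, failed = flash_all(range(1024))
print(len(results), "nodes attempted,", len(failed), "failed")
```

      Wall-clock time then scales with the slowest node rather than with the node count, and the list of failures is what someone would go chase by hand.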

    2. Re:Lots of chip programming by Anonymous Coward · · Score: 0

      If done over the network, and there were no problems, I imagine it would only take marginally longer than time required for one machine.

      But..the "no problems thing" with 1024 nodes, is most likely enough to make one run around naked, screaming "motherfucker!"

    3. Re:Lots of chip programming by ebiederm · · Score: 1

      Nope, I've done it several times on 960 nodes in MCR, it hasn't been a problem. In fact I can completely reinstall MCR (which, being the conservative cousin of PINK, has disk) in an hour or so. This includes all of the time for fighting problems.

  19. Riddle by Anonymous Coward · · Score: 0

    I'm enormously massive and powerful, pink, and made of penguins. What am I?

  20. Re:Wow... (and it's a joke, people) by jx100 · · Score: 1

    enough to fill about 17 football fields.

  21. Damn... by brsmith4 · · Score: 1

    And I thought the new 48 node cluster at work would soon be able to put out some flops... outclassed, outgunned and outperformed. I used to get excited about hearing of others' beowulfs. Now I am only jealous. :)

    BTW, if you see a post that says 'Damn...' and nothing else, thats cuz this damn keyboard has this enter key that gets in the way.

  22. Re:Pink by brsmith4 · · Score: 1

    You'll have to do better than that...

    You guys get more and more creative by the day, don't you?

  23. Why Bother! www.top500.org shows Power4 faster by Anonymous Coward · · Score: 0

    Why Bother! www.top500.org shows Power4 faster than any linux offering, intel or not.

    in fact for many years straight macintosh related processors DOMINATE the top 500 supercomputer cluster sites of the world.

    Check out the top500 list yourself and search for Linux, or whatever you please. You will see other Unix OSes predominate, and that PowerPC-related computers also achieve the fastest speeds.

    True, two low cost p3s with three gige cards each, all bridged might seem ok, but if you want your work done in 1 hour instead of one month you cannot yet use linux or these non-RISC systems.... that is unless the us gov is spending the money and has no cost constraints such as LLNL

    1. Re:Why Bother! www.top500.org shows Power4 faster by ebiederm · · Score: 1

      Cost + Capability. MCR, the conservative cousin to Pink, will likely show up at #5 on the top500 list at the next posting. And at roughly 2000 CPUs it has fewer CPUs than the RISC competition it is beating, and/or is beaten by.

  24. # Processors by Anonymous Coward · · Score: 0

    Why does the /. post say 2048 processors yet the article says it has 1024 processors?

    1. Re:# Processors by goombah99 · · Score: 1

      Duals

      --
      Some drink at the fountain of knowledge. Others just gargle.
  25. In the meantime...... by Billly+Gates · · Score: 1, Redundant
    ....RMS rants about pink and demands everyone call it GNUpink.

  26. weekly world news sucks by valmont · · Score: 2

    For those of you not familiar with the "Weekly World News" publication, it is a tabloid you'll find at most American supermarkets which will feature highly elevating stories such as "mom gives birth to four-headed quintuplets". The above story is just another one of their fictions. This is what tabloids do. They sell fiction. They appeal to the mentally ill-challenged, gullible minus habens.

    Yahoo features those articles in their TV/Gossip/Entertainment section. So you don't have to spend money at the supermarket. Go yahoo.

    Nothing to see here, move along.

  27. betatest: I've uses Bproc and Linux Bios by goombah99 · · Score: 5, Informative
    I've been a beta tester on the prototype for this system. It works great. I've seen diskless systems before; they were all NFS nightmares, could not scale and had horrible tendencies to cause rippling crashes as one computer after the next timed out on some critical disk-based kernel operation it could not complete across a wedged network.

    This one, bproc, is different: it is completely stable. You never get NFS wedges. Jobs launch in a flash. Plus if you do reboot, the whole thing is back up in seconds (literally).

    Bproc is an incredibly lightweight job submission system. It is so lightweight and fast that it changes how you think about submitting jobs. Rather than designing long duration jobs and tossing them on a queue, you can just run tiny short jobs if you want with no loss to overhead. It makes you re-think the whole idea of batch processing.

    When the jobs run they appear in the process list of the master node. That is, if you run "top" or "ps" the jobs are listed right there. In fact, from the user's point of view the whole system looks like just one big computer.

    --
    Some drink at the fountain of knowledge. Others just gargle.
    1. Re:betatest: I've uses Bproc and Linux Bios by albat0r · · Score: 1

      Have you ever tried LRP (Linux Router Project)? I suppose no, because you don't get the "NFS mess" you talk about. I've seen it running from floppy disk and compact flash card, and it worked great.

      Sure, you don't boot in 3 seconds when you do it from the floppy disk, but the flash card was better anyway...! and you don't need to flash your BIOS every time you want to upgrade...

      There are a lot more of minimalist Linux distro that don't have this "NFS mess", like tomsrtbt and Small Linux... go to Linux.org to see a huge list of minimalist Linux distro! (of course, you need to select "minimalist" in the category box...)

    2. Re:betatest: I've uses Bproc and Linux Bios by ebiederm · · Score: 1

      With LinuxBIOS the final kernel does not reside in flash, so you don't need to flash your BIOS to upgrade.

  28. Don't do it! by FyRE666 · · Score: 4, Funny

    I will personally track down and slaughter the first person to mention a popular clustering architecture, and how one might imagine it...

    1. Re:Don't do it! by Elbereth · · Score: 3, Funny

      Imagine a Mosix cluster of them!

    2. Re:Don't do it! by p3d0 · · Score: 1

      Does anyone have any idea how this "imagine a Beowulf cluster" thing started?

      --
      Patrick Doyle
      I mod down every jackass who puts his moderation policy in his sig. Oh, wait a sec....
    3. Re:Don't do it! by Anonymous Coward · · Score: 0

      Easy... fast computer = cool. Cluster of fast computers = even cooler.

      At the time it got started, Beowulf clustering was a common topic on slashdot - and every time a speedy machine was mentioned, someone would always ask about Beowulf clustering them for more speed. Eventually it got parodied by people asking "Cor! Imagine a beowulf cluster of these bad boys!" or dozens of other variants in response to just about anything. And yet another fad was born and tossed in the slashdot meme pool.

    4. Re:Don't do it! by Mr.+Frilly · · Score: 1
      I specifically remember the first time I saw (or at least remember) this phrase used in mockery.

      There was an article on /. several years ago, talking about how the corel computer guys had made a netwinder cluster by taking 10 of their netwinder rack motherboards, screwing them together, and hooking them together via ip-over-scsi. This was a one-off thing for a Linux trade show.

      So of course, someone mentioned using this setup as a beowulf, and the obvious reply was it'd be worthless since the ARM processors had no hardware floating point. But then, someone had to chime in, "but imagine a * of these things", and the rest is history.

      Oh, found the link: ALS Wrapup, but comments appear to be gone.

    5. Re:Don't do it! by p3d0 · · Score: 1

      Hey, thanks. That's interesting. Too bad the comments are gone. I wonder if it's in the Wayback Machine.

      --
      Patrick Doyle
      I mod down every jackass who puts his moderation policy in his sig. Oh, wait a sec....
  29. AMD Opteron by dfranks · · Score: 1

    It will be interesting to see if anyone builds a massive HyperTransport switch (probably a local switch for a blade frame with 1000bt between blade frames). The Opteron looks like it could run without much in the way of chipset support (built-in memory controller), and skipping all the unnecessary I/O would be pretty simple.
    Of course, dumping all the heat would be an issue...

    1. Re:AMD Opteron by raiyu · · Score: 1

      Yup, I don't think that anyone will take AMD seriously for any fault tolerant systems until they deal with the heat issue.

      Personally I don't even think it's so much a matter of running hot, which although a nuisance isn't deadly, but more so that there are no safeguards against overheating. Intel on the other hand, besides running cooler, downgrades the CPU if it's overheating. AMD's XP does the same unless the temperature rises more than 1C per second, in which case it craps out anyway.

      Deal with the heat and they may be taken seriously, they certainly have the performance, not to mention it should total up to a nice savings for a couple thousand processor system.

    2. Re:AMD Opteron by Hoser+McMoose · · Score: 3, Informative

      Ugg.. I do WISH that people would stop reading "Tom's Hardware", or at least that they would get a clue first and realize that Tom doesn't know dick-all about what he's talking about most of the time.

      His comments about heat rising more than 1C/second make NO SENSE AT ALL! It's flat-out wrong! I don't know what orifice he pulled that comment from, but it certainly had no technical backing to it. The chip uses a thermal diode. It will tell you the temperature whenever you poll it. It doesn't matter how fast or slow you poll it, it will give you the temp. You would really have to go out of your way to try to break this sort of data to get it to only be able to handle a 1C/s temp increase.

      As for the heat "problem". AMD's AthlonXP chips have a maximum power consumption of roughly 50-70W. Intel's P4's have a maximum power consumption of roughly 50-70W (yes, they consume almost the exact same amount of power, check the data sheets).

      For comparison, Intel's Itanium has a maximum power consumption of around 100-130W, and IBM's Power4 is also on the high-side of 100W.

    3. Re:AMD Opteron by MechaStreisand · · Score: 1

      Dude! He did his own tests on the chips. He watched them fail. He videotaped it! I watched them fail too! Read about it here. It doesn't matter if it makes sense or not, it happened. And unless he's flat-out lying, he called the motherboard manufacturer and they told him that the diode could only handle a 1 C/s increase. This could well be false (probably is, even, since I believe the motherboard has to implement the heat-protection circuitry itself, which in my opinion is a misjudgment on AMD's part), but it's not Tom's mistake.

      Maybe it was a flaw in the motherboard's heat-protection circuitry, rather than a flaw in the thermal diode. Maybe it's a combination of things. But how the hell was Tom supposed to know the ultimate cause? He did a test and reported what happened. Or do you think he made everything up?

      --
      Disclaimer: IANAL. This post is, however, legal advice, and creates an attorney-client relationship.
    4. Re:AMD Opteron by Hoser+McMoose · · Score: 2

      I don't know if he made the whole thing up, but there were some fairly major issues that should be blatantly obvious to anyone watching that video and reading the explanation.

      First off, that whole 1 C/s temp change max makes absolutely no sense at all! If someone actually told him that then they were either completely clueless, or whoever designed the heat-protection circuitry was going out of their way to try and make it a bad design. AMD has reference specs for this on their webpage, and it's REALLY not complicated!

      And then there was the P4 side of things. That P4 which apparently stayed at 29C without its heatsink on, because of its thermal throttling. Think about that for a second. If your processor was only at 29C, it WOULD NOT BE THROTTLING! The thermal throttling of the P4 doesn't come into effect until somewhere around 60C; if the temp is less than 60C, the chip would run at full speed, and it would get a LOT hotter than 29C if that were the case.

      Then there's simply the fact that the whole test is rather contrived. People just don't rip heatsinks off their processors while they are running. Heatsinks don't fall off unless they were installed by a moron.

      So, what we end up with is a contrived "test" that has at least one major and very obvious flaw, not to mention a rather dubious explanation for the results.

      Ohh, but it DID get Tom probably a few hundred million page hits, and therefore probably several million dollars worth of advertising revenue.

  30. Cluster details by MalleusEBHC · · Score: 4, Funny

    Cluster Overview:
    * 2050 Intel 2.4GHz Xeon processors


    Now when people complain about the United States government being responsible for global warming they will have some good hard facts to use.

  31. I want one. by Anonymous Coward · · Score: 0

    I've been waiting for this. A Whole-Linux clusterbox ready to add cpus as needed to make some good science. I've done aero, astro, chem, the "unnamed", and bio, but now it's all about the perfect ocean boat. Where can I get the distro?

  32. A bit more about Pink... by Anonymous Coward · · Score: 0
    From the same anonymous coward who posted the original story (I'm too lazy to register for an account here).



    All of the software that Pink will be based on is GPL'd and open source. This includes BProc, LinuxBIOS, Supermon, beoboot, Linux (of course), the BProc Job Scheduler, V9FS, etc... All of these projects are available from the cluster research lab web page and most if not all (not sure about V9FS currently) are available via Sourceforge for users interested in the most recent, bleeding edge CVS images.

  33. yeah right by NOiSEA · · Score: 0, Offtopic

    dude that's the dumbest thing I've ever heard even if the LANL was getting a 2083 processor the BA has way too much strain that was put on it inadvertently through the software nodes this will create a backlash the likes of which you have never seen on network TV. I should know. I was a child star, when i was younger and the cash was great. I wouldn't change it for the world anyhoo in closing i would like to say that i agree

  34. Good Stuff by Perdo · · Score: 3, Interesting

    "The Science Appliance" as it is dubbed will use dual processor AMD based nodes.

    Scary part is that this will be one of the top 5 supercomputers in the world.

    Scary because you could buy all the hardware off the shelf for about half a million dollars.

    On a lighter note:

    "The Linux NetworX cluster will be used solely for unclassified computing, including testing on ASCI-relevant unclassified applications."

    I think they mean text mode quake.

    I guess they got tired of "Global Thermo-Nuclear War"

    --

    If voting were effective, it would be illegal by now.

    1. Re:Good Stuff by gregorio · · Score: 2, Insightful

      Scary because you could buy all the hardware off the shelf for about half a million dollars.

      Scary? Why? Oh, and the interconnect hardware and installation is going to cost you more than 4x this value if you want good latencies and reliability.

    2. Re:Good Stuff by Perdo · · Score: 3, Interesting

      Good dual amd boards come with gigabit ethernet. With prices as they are, the nodes can be put together for about $350,000. That would leave $150,000 for 512 ports of gigabit switches. Cisco gigabit 48 port switches run $5,000. Double that and add an additional nic to each box and use a flat neighborhood network (.pdf)

      That should give each node about 200 MB/s aggregate bandwidth (the best gigabit ethernet runs at 800 Mb/s or 100 MB/s), easily exceeding what can be achieved with much more expensive solutions.
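
      The arithmetic behind those figures, as a rough sketch that uses only the numbers quoted in this post (512 ports, 48-port switches at $5,000, two NICs per node at 100 MB/s each). Cabling, NIC prices, and uplinks between switches are left out:

```python
import math

# Figures taken from the post above; everything else is ignored.
PORTS_NEEDED = 512
PORTS_PER_SWITCH = 48
SWITCH_PRICE_USD = 5_000
SWITCH_BUDGET_USD = 150_000
NICS_PER_NODE = 2
MB_PER_S_PER_NIC = 100  # 800 Mb/s divided by 8 bits per byte

# Switches for one network, then doubled for the second NIC per node.
switches_single = math.ceil(PORTS_NEEDED / PORTS_PER_SWITCH)
switches_doubled = 2 * switches_single
switch_cost_usd = switches_doubled * SWITCH_PRICE_USD

# Best-case aggregate bandwidth per node across both NICs.
aggregate_mb_per_s = NICS_PER_NODE * MB_PER_S_PER_NIC

print(f"{switches_doubled} switches, ${switch_cost_usd:,}")
print(f"within budget: {switch_cost_usd <= SWITCH_BUDGET_USD}")
print(f"aggregate per node: {aggregate_mb_per_s} MB/s")
```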

      About the cost of a nice house.

      Put into perspective, a cluster that could outperform Japan's earth simulator would cost 2 million in hardware costs. Outperforming Seti@home's 3,000,000 users would require $10,000,000.

      I know where my lotto money is going :P

      --

      If voting were effective, it would be illegal by now.

    3. Re:Good Stuff by Obasan · · Score: 1

      You've left out a lot here - including the cost of environmentals of setting up a room that can handle this (probably easily a million; think about how much heat 1024 machines produce, and how much power they need), the services for cabling, wiring, racking, testing 1024 nodes, which probably easily runs many hundreds of thousands. Just the cost of cables alone runs in the thousands for this kind of project. (You do -not- want to cheap out on your cabling on a cluster...) The article doesn't say they are using gigabit, either. If they are using Myrinet, think $1000 per node for a Myrinet card (2 Gb/s). Then think $50,000 for each 128 port Myrinet switch. Then think $60-70/meter for cable. Even if they are using gigE they probably are not using Cisco; Cisco is not used as commonly in performance networks like this one. More popular are Foundry & Extreme Networks.

      Add in also that you have left out storage. Given that they have 2.1TB of memory on the compute nodes, it's likely they have 10TB+ of usable storage, so assuming they are using RAID (seems a safe assumption) that means 15-20TB+ of physical storage capacity. High end RAID controllers & enclosures are expensive, as are 15000 rpm Ultra 320 hard disks. (Fibre Channel isn't any cheaper, either...) If they want reliability, at the minimum storage nodes & management nodes will need UPS's... ouch. I doubt this cluster is costing them under three million, and could be as much as five. And there's probably expenses that have slipped my mind... :)
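
      A rough tally of just the Myrinet line items quoted above, for 1024 nodes. It assumes one card per node and the bare minimum number of 128-port switches, with no ports set aside for links between switches, so a real fabric would cost more. Cable is excluded because the post gives a per-meter price but no lengths:

```python
import math

# Prices and sizes quoted in the post; the node count is from the story.
NODES = 1024
CARD_PRICE_USD = 1_000
SWITCH_PORTS = 128
SWITCH_PRICE_USD = 50_000

# One Myrinet card per node.
card_cost_usd = NODES * CARD_PRICE_USD

# Lower bound: enough switch ports for every node, nothing for uplinks.
min_switches = math.ceil(NODES / SWITCH_PORTS)
switch_cost_usd = min_switches * SWITCH_PRICE_USD

interconnect_floor_usd = card_cost_usd + switch_cost_usd
print(f"cards: ${card_cost_usd:,}")
print(f"switches ({min_switches}): ${switch_cost_usd:,}")
print(f"interconnect floor: ${interconnect_floor_usd:,}")
```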

    4. Re:Good Stuff by Anonymous Coward · · Score: 0

      Good dual amd boards come with gigabit ethernet. With prices as they are, the nodes can be put together for about $350,000. That would leave $150,000 for 512 ports of gigabit switches. Cisco gigabit 48 port switches run $5,000. Double that and add an additional nic to each box and use a flat neighborhood network [aggregate.org] (.pdf)

      While these are suitable prices for building a small-ish white-box cluster, no one in their right mind would do this for a large cluster. For one thing, you want rack mount nodes for this many nodes; you will also want power controllers, and probably serial consoles for all machines. And don't forget onsite hardware maintenance.

      Also, while FNN is really cool, it won't work well for a large cluster, for a few reasons. First, the cabling would be a nightmare. (actually it would compare with dolphin/scali setups) Also, you tend to want as many interfaces on each node as you have switches in the network. This means that you have 20 nics per node. By your math, that means that each node could hit ~2 GB/s, but there are a few minor issues with this. First, you can't actually put 20 of anything into generic hardware.
      Second, the cpu overhead will kill you. This is why anyone building a large cluster for general use includes myrinet on it. The links are 250 MB/s full duplex, and you can get 480 MB/s bidirectional BW out of it.

      btw, a 1024 node cluster like this probably cost in the neighborhood of 6-8 million.

      Put into perspective, a cluster that could outperform Japan's earth simulator would cost 2 million in hardware costs. Outperforming Seti@home's 3,000,000 users would require $10,000,000.

      Outperforming the earth simulator with a cluster would be possible with a fairly large cluster, like say an 8000 processor cluster. No one has built a cluster that big, and no one with the ability to fund one would believe anyone that said they could build one outright. Also, while you could build a cluster that was as fast in raw flops, ES actually gets 87% of peak performance. No cluster gets that. Clusters typically get 70% if you are really lucky. (and have myrinet) That number goes down as you add procs. A 10000 node cluster might be sufficient to compete with ES, but there would be a substantial number of software problems needing to be overcome before it would work.
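
      To put the efficiency gap in numbers, here is a small sketch using only the two percentages quoted above (87% for ES, 70% for a lucky cluster). The function name is made up for illustration; it computes how much more peak performance a cluster must have to match ES on sustained performance:

```python
def required_peak_ratio(reference_efficiency, cluster_efficiency):
    """Peak-flops multiple a cluster needs to equal the reference
    machine's sustained performance, given each one's fraction of peak."""
    return reference_efficiency / cluster_efficiency


ES_EFFICIENCY = 0.87       # fraction of peak quoted for the Earth Simulator
CLUSTER_EFFICIENCY = 0.70  # "if you are really lucky" figure for clusters

peak_ratio = required_peak_ratio(ES_EFFICIENCY, CLUSTER_EFFICIENCY)
print(f"cluster needs {peak_ratio:.2f}x the ES peak to tie on sustained flops")
```

      The ratio grows as cluster efficiency drops with processor count, which is the poster's point about very large clusters.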

      This suggestion is not unlike saying you could build a machine competitive to the ES by buying up all of the PS2s in all of the stores around and setting them up in your basement. While the hardware might be able to perform at that level, no one would be able to use it...

      btw, it has been estimated to cost in the low hundreds of millions, not the tens, to build a machine comparable to the earth simulator.

    5. Re:Good Stuff by Perdo · · Score: 2

      Myrinet is 245 MBytes/s for large packets only. Latency kills it on small transfers. Visit their website. Go ahead and pay 20x the cost for 20% more performance. The idea of a cluster is huge quantities of commodity hardware.

      Put it in Alaska where land and power are cheap and cooling needs are minimal. Maybe $200,000 for ten acres with a warehouse.

      Storage is almost under $1000 per terabyte.

      Put the drives in the nodes.

      I'll bet this cluster is costing them 100 million.

      This is a government contract after all.

      --

      If voting were effective, it would be illegal by now.

    6. Re:Good Stuff by Perdo · · Score: 2

      You are saying a cluster is not as good as a cluster because a cluster has inherent design limitations relative to a cluster.

      Earth Simulator is a cluster using SX-6 as nodes.

      Each node has 4 CPUs and uses DDR-SDRAM.

      Hardware is hardware. A limitation in one system will be found in another.

      ASCI White has almost 8000 Power3 375 MHz processors, it's just old.

      ASCI White, if built today, would use Power4 1300 MHz processors.

      Earth Simulator is not so special. You have forgotten when ASCI White was new and was shockingly 5 times as fast as its closest rival.

      Why put anything in a box? Just complicates cooling. The building is the box.

      You do not need one nic for each switch. You can use Cisco's 8 Gb Backplane to stack the switches.

      I'll accept 50% efficiency if it can be built for 1/100 the cost. I'll build two.

      As for cabling, I've wired an 1800 node data center in 2 months with 3 people with fiber, including terminations and 7 errors that took 2 hours to trace and fix. Hardly a nightmare, in fact pretty monotonous and easy, as well as lucrative.

      Assemble 15 nodes a day and you could build it yourself in a year.

      heh.

      --

      If voting were effective, it would be illegal by now.

  35. I second that! by pdp11e · · Score: 1

    It is called Monte Carlo simulation of radiation transport. Basically one tracks the propagation of high-energy particles as they progress through matter (human tissue). E.g. if you simulate a brachytherapy source (a radioactive "seed" implanted in tissue), the code "creates" a photon with an energy characteristic of a given isotope. The direction vector is chosen by a random number generator (RNG). The RNG also decides at what point along the photon's trajectory an interaction with the surrounding matter should occur, according to the physical probabilities.
    After the interaction, there is a bunch of scattered particles (photons, electrons and positrons) and the code continues to track them until the energy of the n-th generation particle drops below a certain energy (10 keV for electrons), at which point the particle's remaining energy is deposited as dose.
    The object of the simulation is to obtain a precise dose distribution. In order to achieve good statistics one needs to run millions and millions of histories.
    Beowulf clusters are ideal for this job because histories are independent and there is no need for fast shared memory or fancy interprocess communication.
    I had the pleasure of assembling a 24-node 1.6 GHz AMD cluster and we achieved sub-minute simulation times, a result that makes this technique suitable for everyday clinical practice.
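
    For anyone curious what a "history" looks like in code, here is a toy sketch in Python. It is not a real transport code: the interaction coefficient, the starting energy, and the "deposit a random fraction at each interaction" rule are all made-up stand-ins for real cross-section data, and it tracks only one particle per history instead of a shower of secondaries. What it does show is the free-path sampling, the energy cutoff, and why independent batches (one per node, each with its own seed) can simply be summed at the end.

```python
import math
import random

MU_PER_CM = 0.1      # assumed interaction coefficient (illustrative, not tissue data)
E0_KEV = 30.0        # assumed starting photon energy (illustrative)
CUTOFF_KEV = 10.0    # below this, the remaining energy is deposited on the spot
BIN_CM = 1.0         # radial bin width for the dose tally
N_BINS = 50          # the last bin collects everything farther out


def random_direction(rng):
    """Pick an isotropic unit vector."""
    cos_t = 2.0 * rng.random() - 1.0
    sin_t = math.sqrt(1.0 - cos_t * cos_t)
    phi = 2.0 * math.pi * rng.random()
    return (sin_t * math.cos(phi), sin_t * math.sin(phi), cos_t)


def tally(dose, pos, energy):
    """Add deposited energy to the radial bin containing pos."""
    r = math.sqrt(sum(c * c for c in pos))
    dose[min(int(r / BIN_CM), N_BINS - 1)] += energy


def run_history(rng, dose):
    """Track one particle from the source until it falls below the cutoff."""
    pos = [0.0, 0.0, 0.0]
    energy = E0_KEV
    while energy >= CUTOFF_KEV:
        direction = random_direction(rng)
        # Distance to the next interaction is exponentially distributed.
        step = -math.log(1.0 - rng.random()) / MU_PER_CM
        pos = [p + step * d for p, d in zip(pos, direction)]
        # Toy physics: lose a uniformly random fraction of the energy here.
        deposited = energy * rng.random()
        tally(dose, pos, deposited)
        energy -= deposited
    tally(dose, pos, energy)  # deposit whatever is left below the cutoff


def simulate(n_histories, seed):
    """Run one independent batch of histories with its own RNG stream."""
    rng = random.Random(seed)
    dose = [0.0] * N_BINS
    for _ in range(n_histories):
        run_history(rng, dose)
    return dose


# On a cluster each batch would run on a different node; here they run in turn.
batches = [simulate(2500, seed) for seed in range(4)]
total_dose = [sum(column) for column in zip(*batches)]
```

    The only "communication" is the final element-wise sum of the per-batch tallies, which is why this kind of job scales so well on cheap hardware.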

  36. Thanks for making my point. by abulafia · · Score: 1
    Most of your statements prove my point exactly correct, if we are to believe you.

    Yes, various practices that fall under the moniker of 'technical trading' have been around a long time. By some counts, since right after the 1930s. By others, before then. Software assisted trading is in some ways new, but in the past the same result happened, albeit slower, through agents.

    To give you a point...

    Sure, ill tuned risk management systems fuck up. Plus, they're extremely important to the world economy. That's why Greenspan bailed out a certain well known hedge fund very recently.



    I was not asserting that "much of the research in this area died suddenly". On the contrary, research in risk management is hitting a rather furious pace. Please re-read what I wrote, and this time pay attention.



    The volume of trades taking place without human intervention causes huge swings at the moment. We're seeing this now, and have been for a couple of years.

    Bonus points if you come up with a theory why seeking short-term gains is going to cause exactly the "double dip" so many cheerleaders at BusinessWeek and Fortune are trying to recant.



    If you get bored, you could actually respond to what I was asserting, which was that a commercially viable trading system would rapidly stabilize any advantage it had, because it would spread to everyone who had a serious interest in tracking new developments in this area. That's what's still oddly on topic for the parent post.
    Thank you, drive through.



    -j

    --
    I forget what 8 was for.
  37. Make it short already by Anonymous Coward · · Score: 0

    "IABCOT!"

    TIA,
    AC, YABA, IG.

    1. Re:Make it short already by Anonymous Coward · · Score: 0

      CYIaBCo%s SUYA?!!

  38. Why not mauve? by cocotoni · · Score: 1

    I think that mauve has the most RAM.

    1. Re:Why not mauve? by Anonymous Coward · · Score: 0

      Well, then you gotta remember to program in "B".

  39. Weather, Oil, Chemistry by anonymous+cupboard · · Score: 2
    I think someone else has mentioned it but one of the largest govt uses outside weapons simulation is weather. You just can't do medium to long range weather without a lot of computer power.

    Oil companies like to have serious computer power too for prospecting and reservoir modelling.

    In organic chemistry, you can do some serious molecular simulations ranging from pharmaceuticals through to the actions of enzymes and catalysts.

    The fluidics side can even extend through to air-flow modeling (aircraft to cars) and combustion.

  40. Don't be so sure by marm · · Score: 3, Insightful

    A supercomputer is a single system image. Some people call large clusters "supercomputers," but technically they're wrong.

    Says who?

    Once upon a time 'supercomputer' meant 'any computer made by Seymour Cray', and this was reasonable, because he (probably) invented the concept. Then there was the mid-80's loose but widely-accepted definition 'any computing system that can do more than 200 MIPS'. Then MIPS went out of fashion and processors got faster and it was 'anything that does more than a GigaFlop'. Or there's the US Department of Commerce definition which was 'any computing system that does more than 195 Mtops (Million theoretical operations per second)' during the 80's, which then got changed to 1500 Mtops and is probably something different now.

    Note that most Linux cluster systems would meet the requirements of most of these - indeed, most single-CPU computers today would meet most of these requirements, which is how Apple manages to get away with calling the G4 a 'supercomputer'.

    Really, these days 'supercomputer' means absolutely anything you want it to be, although if I had to define it, I think probably the fairest definition would be 'anything that can run the LINPACK benchmark suite and get on the Top500 list'.

    Nice try at creative redefinition though.

    1. Re:Don't be so sure by cperciva · · Score: 2

      Really, these days 'supercomputer' means absolutely anything you want it to be, although if I had to define it, I think probably the fairest definition would be 'anything that can run the LINPACK benchmark suite and get on the Top500 list'.

      Personally, I'd refine that to "get onto the first page of the Top500 list"; but with either definition, the point remains that while some clusters can run the linpack benchmark, most render farms can't, and SETI@Home or distributed.net *certainly* can't.

    2. Re:Don't be so sure by foobar104 · · Score: 3, Informative

      The important thing to notice about the word "supercomputer" is that it's singular. A supercomputer is a single system image; this is implicit in the definition. This is not to say that supercomputing clusters aren't worthy; it's just that they're different in important ways from single-system-image supercomputers.

      Some classes of problems aren't suited for cluster computation. I won't pretend to be educated enough to tell you exactly which problems can and can't be adapted for cluster computation, but consider the nature of clusters to see my point. Clusters are highly scalable, but the inter-node latency is huge. An interconnect like Myrinet can get your remote messaging latencies down to the microsecond range, but the far more common MPI/PVM-over-Ethernet solution is a thousand times slower than that. This makes it somewhat inefficient for node N to try to access a bank of memory on node M. In order for a cluster to be efficient, each node should have sufficient physical memory to hold its entire data set, and each node should be able to operate more-or-less autonomously, without having to contact other nodes.
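
      To put rough numbers on the latency point, here is a small Python sketch of the usual first-order model: the time to move a message is a fixed latency plus size divided by bandwidth. The specific figures are assumptions for illustration only: 10 microseconds and 245 MB/s for a Myrinet-class link (the bandwidth figure is the one quoted elsewhere in this thread), and a latency a thousand times larger with 12.5 MB/s for MPI over 100 Mbit Ethernet.

```python
def transfer_time(nbytes, latency_s, bandwidth_bytes_per_s):
    """First-order cost model: fixed startup latency plus streaming time."""
    return latency_s + nbytes / bandwidth_bytes_per_s


def effective_bandwidth(nbytes, latency_s, bandwidth_bytes_per_s):
    """Bytes actually delivered per second once latency is included."""
    return nbytes / transfer_time(nbytes, latency_s, bandwidth_bytes_per_s)


# (latency in seconds, peak bandwidth in bytes/s); all values are assumed.
LINKS = {
    "Myrinet-class link": (10e-6, 245e6),
    "MPI over Ethernet": (10e-3, 12.5e6),
}

if __name__ == "__main__":
    for name, (latency, bandwidth) in LINKS.items():
        for size in (64, 4096, 1 << 20):
            eff = effective_bandwidth(size, latency, bandwidth)
            print(f"{name:20s} {size:8d} bytes: {eff / 1e6:8.3f} MB/s effective")
```

      With these assumed figures a 64-byte message sees only a few percent of the link's peak bandwidth, while a megabyte-sized transfer gets nearly all of it. That is why fine-grained remote memory access hurts on a cluster and bulk transfers do not.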

      Supercomputers are fundamentally different from clusters. In some cases, you can do the same job with either a supercomputer or a cluster. Some jobs are better suited to clusters, while some are better suited to supercomputers. Some jobs, as I mentioned above, are better suited to arrays than to either clusters or supercomputers. It just depends on the job.

    3. Re:Don't be so sure by marm · · Score: 2

      The important thing to notice about the word "supercomputer" is that it's singular.

      Ok. The word 'cluster' is also singular. Big deal.

      A supercomputer is a single system image; this is implicit in the definition.

      Like I said, whose definition?

      I already know that some types of supercomputer work better for certain types of problems than others, all I'm doing is nitpicking at your pulled-from-the-air definition of a supercomputer that somehow magically defines that a cluster of machines cannot also be a supercomputer. Sure, they may only be individual machines connected via a network, but you use such a cluster as a single large, powerful supercomputer, even if it's not a single system image. What about MOSIX clusters? Whilst strictly speaking they are not single system image, they behave like they are - indeed, from a programming point of view they're all but indistinguishable from a single image NUMA machine, and using switched Gigabit Ethernet (cheap these days) the internode latency and bandwidth isn't too horrid either.

      Your overstrict definition of 'supercomputer' makes you sound like you work for one of the old-guard manufacturers like SGI or Cray threatened by the rise of cluster supercomputers.

      My argument is that clusters, arrays, NUMA machines, SMP machines, with both conventional and vector processors - these can all be 'supercomputers', because although the hardware design and programming techniques are quite different for each type, the end result for all of them is a single system (not necessarily a single system image) that solves numerical problems very quickly. They are just different types of supercomputer. If I could convince all the world's population to do maths with an abacus all at once, and I could somehow divide up the work sensibly and then collate the answers, then yes, that would be a supercomputer too - a human supercomputer, although not a particularly fast one.

    4. Re:Don't be so sure by foobar104 · · Score: 2

      Ok. The word 'cluster' is also singular. Big deal.

      The word "cluster" is collective. Like "dozen." It refers to a group of things as a unit. It's not possible to have a cluster of one.

      Clusters of computers are not supercomputers. They're different. You can try all you like to say that the definition of "supercomputer" is arbitrary; I suppose it is, in the sense that all definitions are fundamentally arbitrary. A supercomputer is a single computer. A cluster of computers is not a single computer. Ergo, a cluster is not the same thing as a supercomputer.

      If I could convince all the world's population to do maths with an abacus... that would be a supercomputer too.

      No, it wouldn't. A computer-- and, by extension, a supercomputer-- is a mechanical device. A group of people could perform the same job as a supercomputer, but the group wouldn't be a supercomputer.

      I think you're possibly confused by metaphors. When you say, "A group of people is a supercomputer," you're employing a metaphor to say, "A group of people shares many important characteristics with a supercomputer." Don't be fooled by this. It's not literally true, and shouldn't be accepted as such.

  41. Actually.... by Anonymous Coward · · Score: 0

    Imagine the thermal output of 2050 Athlons....

    1. Re:Actually.... by Hoser+McMoose · · Score: 2

      I know that people hate facts, but here they are:

      Power consumption:

      AMD AthlonXP 2600+ : 68.3W Max, 62.0W typical
      Intel Xeon 2.4GHz: 65W TDP (*)

      *TDP = Thermal Design Power, a kind of ambiguous measure of power that is slightly less than the maximum power the chip can use.
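
      Scaling those per-chip figures up to the machine in the story (1024 nodes, 2 processors each) is simple multiplication; a quick Python sketch follows. It counts CPUs only. Memory, interconnect cards, power supply losses and cooling are left out, so treat the result as a floor rather than an estimate of the whole system.

```python
NODES = 1024
CPUS_PER_NODE = 2

# Per-chip figures as listed above, in watts.
CPU_WATTS = {
    "AthlonXP 2600+ (max)": 68.3,
    "AthlonXP 2600+ (typical)": 62.0,
    "Xeon 2.4GHz (TDP)": 65.0,
}


def total_cpu_kilowatts(watts_per_cpu, nodes=NODES, cpus_per_node=CPUS_PER_NODE):
    """Heat from the processors alone, in kilowatts."""
    return nodes * cpus_per_node * watts_per_cpu / 1000.0


if __name__ == "__main__":
    for chip, watts in CPU_WATTS.items():
        print(f"{chip:26s} {total_cpu_kilowatts(watts):7.2f} kW")
```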

  42. Correct me if Im wrong by MC68040 · · Score: 1

    But doesn't it say " build, integrate and deliver a 1,024-processor Linux cluster " rather than 2048 cpu cluster.

    1. Re:Correct me if Im wrong by Anonymous Coward · · Score: 0

      1024 Nodes 2048 Processors

  43. Loosely coupled? Why else? by Anonymous Coward · · Score: 0
    Putting aside whether this is true, how many computationally intensive problems need tightly coupled computing? Finite element models (nuclear simulations, weather prediction) work fine with loosely coupled nodes. So do prime number factoring, SETI, or any other work on large data sets.


    What else would you need a super computer of this size for? Is there really an application that requires it, outside of real time virtual reality?

  44. Re:Good Stuff [OT for the joke] by fruey · · Score: 1
    Outperforming Seti@home's 3,000,000 users would require $10,000,000.

    Bet you couldn't make up the dumb team names and stats with that budget though. Or get so many screens to display dumb meaningless graphs (more or less)... and let's not even get started on the Seti Forums and fansites ;)

    --
    Conversion Rate Optimisation French / English consultant
  45. Etherboot or Jailbait by Anonymous Coward · · Score: 0
  46. We did it, and never released by ppetrakis · · Score: 1

    Company went under, API Networks. We had Linux booting out of 2MB of flash in less than 10 secs. The firmware that controlled the deal was less than 400K and allowed for complete disaster recovery, ramdisks, booting successive Linux kernels, and other firmware. Alas it wasn't finished in time... Alternatives still exist to this day besides LinuxBIOS. Any openfirmware vendor like codegen offers the capability of booting Linux from flash. So does Redhat's redboot bootstrap loader which is part of their eCos microkernel.

    Last time I checked, the LB project was hacking off flash chips to bootstrap these things when they've failed. Basically no recovery procedures. I would ask anyone considering this option to consider the question "Am I better off using BOOTP?". 8MB of flash is unreasonable and it speaks volumes about the product. Scyld could do it in less than 2MB, what does that tell you??? Before anyone asks "why don't you finish it": unfortunately Linux is no longer my job, just a hobby.

    Peter

    --
    www.alphalinux.org
    1. Re:We did it, and never released by ronaldgminnich · · Score: 1
      What it all tells me is you haven't been paying attention :-)

      we run out of 256KB if needed. The 8 MB is for platforms (http://www.cwlinux.com) where people want a local file system in flash.

      We had 104 DS10s booting out of 2 MB flash no problem. So there are variations.

      Too bad about DBLX (your company's product) but that also speaks volumes about that product, I guess.

    2. Re:We did it, and never released by ppetrakis · · Score: 1

      I must admit I have dropped out of the clustering business. It just doesn't concern me anymore. I have been working on alternate firmware for Alpha but that has been slow going since time is scarce. I am well aware of your DS10 cluster :-). BTW, I still have those flash tools if you need them. Yeah, DBLX would have been nice. Would have ;-). I apologize for the generalization regarding the flash size. The site that was referred to in this post was unavailable for some unknown reason (slashdot maybe?) so I really didn't have a context.

      Most applications wouldn't need the huge flash. Cluster applications would need swap eventually and that means disk. Hey, if you guys are making money and staying in business, you're better off than we were ;-)

      Regards,
      Peter

      --
      www.alphalinux.org
  47. I fail to see how this is unique.... by Sir_Ace · · Score: 1


    I built a 116 node cluster at my last job, and it was a diskless node setup that booted from the network and used NFS-root. Although linuxBIOS would have been cool it's not needed if a machine has SRM {yes as in alpha} or PXE network cards.

    If the only reason they went to LinuxBIOS was to boot off of the network maybe they should examine the date on the first copy of the HOWTO-NFS-Root and realise it has been going on for a LONG TIME!

    1. Re:I fail to see how this is unique.... by ronaldgminnich · · Score: 2, Insightful

      HMMM, I'm sorry that you have failed to see how this is unique :-) You probably should visit http://www.clustermatic.org and read what's there. Of course it has been going on a long time, I first did it 12 years ago with Suns. But your little 116-node cluster probably did not run into the problems you hit at a larger scale. Anyway, what linuxbios gets us:

      - more sane platform configuration
      - we load linux from flash so can use all the capabilities of linux as our bootstrap
      - we boot over myrinet; we're not even cabling the ethernet up
      - We don't need to set up the serial network which you HAVE to set up with kludges like SRM

      You're just not going to get that with PXE or SRM. I realize this detail was not available on the short article.

  48. LinuxBIOS? by Anonymous Coward · · Score: 0

    any working links that can clue one such as I on what exactly this is. I can guess by context but then I am stuck with this feeling of silliness at the result... you said "Linux" and "BIOS" right?

  49. jonKatz is dead to us! by Anonymous Coward · · Score: 1, Funny

    I mean, you haven't seen him post any poorly researched diatribes lately (hell in the last eight months...) now have you?!!

  50. Google by Anonymous Coward · · Score: 0

    Look at the first result on a google search for LinuxBIOS.