Slashdot Mirror


10-TFlop Computer Built from Standard PC Parts

OrangeTide writes "Using PCI host adapters and Xeon processors, engineers at Lawrence Livermore National Labs have achieved 10-TFlops relatively cheaply. More information can be obtained from this article at EETimes." Lately, Linux seems to be the operating system of choice for new supercomputers, and this one's no different. It's cool to see big iron made cheaply.

18 of 247 comments

  1. Imagine... by Anonymous Coward · · Score: 5, Funny

    A commodity supercomputing cluster of these! (There has to be a better name for it, but I'm new here on Slashdot).

  2. Supercomputer developed for... by jabex · · Score: 5, Funny

    ... which was specifically developed for running Doom III.

    --
    Like Teddy with an elephant gun.
    1. Re:Supercomputer developed for... by Charles Dodgeson · · Score: 5, Funny
      ... Doom III

      Well, considering who's building it and running it, it's reasonable to guess that some form of doom will be simulated on it.

      --
      Prime numbers are exactly what Alan Greenspan says they are -S. Minsky
  3. imagine the future by ryochiji · · Score: 5, Insightful
    From the article:
    >The 1- to 10-teraflops processing range is opening up a revolutionary capability for scientific applications

    In the not too distant future, that kind of processing power could very well be available in home PCs. Imagine what that would do to...well, I mean, dang it, what the heck will we do? Game frame rates can only go so high. Even realism of 3D graphics may have its limits. Oh sure, we'll find something, but it's difficult for us to imagine now...

    1. Re:imagine the future by Charles Dodgeson · · Score: 5, Funny
      In the not too distant future, that kind of processing power could very well be available in home PCs. [...] Oh sure, we'll find something [to do with it], but it's difficult for us to imagine now

      the Search for Extremely Trivial Iterations at home.

      --
      Prime numbers are exactly what Alan Greenspan says they are -S. Minsky
  4. Wait until the weapons inspectors get to Iraq! by joeflies · · Score: 5, Funny

    Then the world will finally see the 4000 Playstation 2's that Saddam used to build a supercomputer

  5. Parallel computing by vlad_petric · · Score: 5, Interesting
    The difficulty is not to conglomerate processing power ... you can do that relatively easily with Benjamins ... the real difficulty is in either parallelizing your computations, or making a single processor work faster.

    So the Teraflops they're mentioning are just a theoretical upper bound, don't get too aroused when you see it.
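One common way to put numbers on the gap between peak and achieved speed is Amdahl's law. The article does not mention it; this is only a rough sketch, and the serial fractions and the 1000-processor count below are illustrative assumptions, not figures from the LLNL machine.

```python
def amdahl_speedup(serial_fraction: float, processors: int) -> float:
    """Upper bound on speedup when a fixed fraction of the work stays serial."""
    return 1.0 / (serial_fraction + (1.0 - serial_fraction) / processors)


if __name__ == "__main__":
    # Hypothetical serial fractions: perfectly parallel, 1% serial, 10% serial.
    for s in (0.0, 0.01, 0.10):
        bound = amdahl_speedup(s, 1000)
        print(f"serial fraction {s:.2f}: speedup on 1000 processors <= {bound:.1f}")
```

Even a small serial part caps the speedup well below the processor count, which is why the headline teraflops figure is best read as a ceiling.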

    The Raven.

    --

    The Raven

  6. Interesting Approach on Network by jki · · Score: 5, Interesting
    Selected clips:

    The system has a few unique features that the lab says will facilitate applications performance, including a fast, custom-made network that taps into an enterprisewide file system.

    "This network approach is nice because we can use a standard PCI slot on each processor node, which gives a 4.5-microsecond latency," he said, as opposed to 90-microsecond latency for Gigabit Ethernet.

    The boards are linked by a network assembled by Linux Networx into a clustered system that will have 960 server nodes.

    The file system, called Lustre, uses a client/server model. Large, fast RAM-based memory systems support a metadata center, and data is represented across the enterprise in the form of object-storage targets. "Being able to share data across the enterprise is an exciting new capability

    I think this is especially interesting, because it seems to glue together pieces from traditional clustering and distributed or metacomputing. Is there some site for this project with more details?
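To get a feel for what the latency numbers in the clips mean, here is a back-of-the-envelope sketch. It takes the Gigabit Ethernet figure as 90 microseconds and assumes a worst case where each message must wait for the previous one to arrive; the function name is made up for illustration.

```python
def dependent_messages_per_second(latency_seconds: float) -> float:
    """Upper bound on messages per second when each one waits for the last."""
    return 1.0 / latency_seconds


CUSTOM_NETWORK_LATENCY = 4.5e-6   # seconds, from the quoted clip
GIGABIT_ETHERNET_LATENCY = 90e-6  # seconds, assuming the clip means microseconds

if __name__ == "__main__":
    fast = dependent_messages_per_second(CUSTOM_NETWORK_LATENCY)
    slow = dependent_messages_per_second(GIGABIT_ETHERNET_LATENCY)
    print(f"custom network:   {fast:,.0f} dependent messages/s")
    print(f"gigabit ethernet: {slow:,.0f} dependent messages/s")
    print(f"ratio: {fast / slow:.0f}x")
```

This ignores bandwidth and message size entirely, so it only matters for workloads dominated by many small, dependent exchanges.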

  7. who ever said ... by valmont · · Score: 5, Funny
    ... that penguins couldn't do steroids?

  8. Connections through PCI bus? by Dr. Spork · · Score: 5, Interesting
    Do I understand correctly that they just wired PCI slots from different motherboards together, instead of running the data around over ethernet (which probably would have been plugged into a PCI slot anyway)? If so, I mean, if there's nothing more to it than that, it seems like this will be a kickass way of clustering. But there must be something more to it than I realize, because if there wasn't, there wouldn't be so many ethernet-based beowulf systems.

    So please explain this. I mean, I have two linux boxes in my room and each has a free PCI slot. What do I need to do to network them directly over PCI?

    1. Re:Connections through PCI bus? by tap · · Score: 5, Insightful
      There are chips designed to connect two PCI busses together, called PCI-PCI Bridges. For instance, I have an Intel dual port ethernet card with one:

      Bus 0, device 12, function 0: PCI bridge: Digital Equipment Corporation DECchip 21152 (rev 3). Master Capable. Latency=64. Min Gnt=4.

      But you can't use this to connect a rack of computers. For one thing, the max cable length for connecting two busses would be just a few inches. For putting PCI cards in 1.75" high 1U rackmount cases, there are PCI risers with a short ribbon cable that connects to the PCI slot. Even these short cables often cause timing problems. For instance, with the riser, cards that otherwise work in every slot may only work in the first one or two.

      But even if you could cable all the computers together on one giant PCI bus, it would still be a bad idea. A good 24 port gigabit ethernet switch (~$2000) has a 480MB/sec switching fabric, to support full speed full duplex on each port. 32 bit 33 MHz PCI is only about 132 MB/sec, not nearly as fast. You'd need a 64 bit 66 MHz PCI bus to keep up. And there are more expensive gbit switches with more ports that have 100 Gbit/sec fabric. And this is just gbit ethernet, the slowest and cheapest of the high speed interconnects used in modern Beowulf clusters.
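The PCI figures above follow from bus width times clock rate. A minimal sketch of that arithmetic, using nominal clock values (real PCI clocks are 33.33 and 66.66 MHz, so the results are slightly low) and taking the 480 MB/sec fabric number from the comment as given; the function name is only illustrative:

```python
def pci_peak_mb_per_s(bus_width_bits: int, clock_mhz: float) -> float:
    """Peak PCI throughput: one transfer of bus_width_bits per clock cycle."""
    return bus_width_bits / 8 * clock_mhz


FABRIC_MB_PER_S = 480  # switch fabric figure quoted in the comment

if __name__ == "__main__":
    for width, clock in ((32, 33), (64, 33), (64, 66)):
        peak = pci_peak_mb_per_s(width, clock)
        verdict = "keeps up with" if peak >= FABRIC_MB_PER_S else "falls short of"
        print(f"{width}-bit {clock} MHz PCI: {peak:.0f} MB/s, {verdict} the fabric")
```

These are theoretical peaks for a single shared bus; sustained throughput with several devices arbitrating for it would be lower.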

      There are faster ways to connect computers than gigabit ethernet. The EE Times article is very untechnical, but this one has some more information. LLNL has used a very fast and very expensive interface called Quadrics. This is probably the fastest way to connect computers in a Beowulf. People like Cray/SGI and IBM have faster things still, but they cost real big bucks. Other ways to connect a Beowulf are the above-mentioned gigabit ethernet (~$100-$250 a node for up to 24 nodes), Myrinet (~$1400-$2000/node for up to 128 nodes), and SCI hardware and software (~$1400-$2100/node). Myrinet uses a switch like gigabit ethernet, and the largest switch they have is 128 ports. SCI is switchless: each card has multiple cables (1-3) and is connected into a ring or a 2D or 3D torus.

  9. I need one! by Newer Guy · · Score: 5, Funny

    I have a lot of movies to convert to DIVX...

  10. Processing power by rovingeyes · · Score: 5, Interesting
    Actually, your statement made me wonder for a while. I remember that until not long ago, the US wouldn't let other countries buy the latest supercomputers because they feared they'd be used to do those nuclear explosion simulations. Now I'm not sure if that is still the case.

    Anyways, what I'm trying to point out is that it is actually becoming very convenient to build a supercomputer with lots of PCs that just lie idle. I am not sure if Saddam has heard about cheap Linux systems. But what if he could build a supercomputer cluster?

    Boy this gets interesting and scarier at the same time.

    1. Re:Processing power by FuzzyDaddy · · Score: 5, Insightful
      So the question is (and I don't know, I didn't study nuclear physics beyond A-level), are the significant computational problems associated with the development of nuclear weapons easy to parallelize, or do they require a real supercomputer [sgi.com]?

      I believe the calculations needed are massive finite element calculations. And I would imagine that things happen quickly enough in a nuclear explosion that there's a lot of significant stuff going on over a time period much shorter than it takes for any change to move from one side of the simulated device to the other.

      As an analogy, suppose you wanted to simulate a large number of gravitating bodies. You would break the problem up into sections. Even though each body acts on every other, bodies outside a certain distance can be treated by their average force. So you can simulate things near each other on the same node, and have the nodes talk to pass the information about the "average" field. It requires some communication between nodes, but a large amount of work can be done on an individual node.
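Here is a toy version of the gravitating-bodies idea, just to show the shape of it. Bodies in the same cell (think: same node) are summed exactly, and every other cell is replaced by a single point at its centre of mass. The 2D setup, unit masses, cluster layout and all function names are assumptions made for the sketch.

```python
import math
import random

G = 1.0  # gravitational constant in arbitrary units (assumption for the sketch)


def accel_from(p, q, mass_q):
    """Acceleration at point p due to a point mass mass_q located at q (2D)."""
    dx, dy = q[0] - p[0], q[1] - p[1]
    r2 = dx * dx + dy * dy
    if r2 == 0.0:
        return 0.0, 0.0
    inv_r3 = 1.0 / (r2 * math.sqrt(r2))
    return G * mass_q * dx * inv_r3, G * mass_q * dy * inv_r3


def direct_accel(i, bodies):
    """Exact answer: sum the pull of every other body on body i."""
    ax = ay = 0.0
    for j, (pos, mass) in enumerate(bodies):
        if j != i:
            fx, fy = accel_from(bodies[i][0], pos, mass)
            ax, ay = ax + fx, ay + fy
    return ax, ay


def clustered_accel(i, bodies, cell_of):
    """Own cell summed exactly; every other cell replaced by its centre of mass."""
    summaries = {}  # cell -> [total mass, mass-weighted x, mass-weighted y]
    for (pos, mass), cell in zip(bodies, cell_of):
        s = summaries.setdefault(cell, [0.0, 0.0, 0.0])
        s[0] += mass
        s[1] += mass * pos[0]
        s[2] += mass * pos[1]
    ax = ay = 0.0
    home = cell_of[i]
    for j, (pos, mass) in enumerate(bodies):
        if j != i and cell_of[j] == home:
            fx, fy = accel_from(bodies[i][0], pos, mass)
            ax, ay = ax + fx, ay + fy
    for cell, (m, mx, my) in summaries.items():
        if cell != home:
            # Only three numbers per remote cell need to cross the network.
            fx, fy = accel_from(bodies[i][0], (mx / m, my / m), m)
            ax, ay = ax + fx, ay + fy
    return ax, ay


def two_clusters(n_per_cluster=50, separation=100.0, seed=0):
    """Two unit-square clusters of unit-mass bodies, far apart along x."""
    rng = random.Random(seed)
    bodies, cell_of = [], []
    for cell, x0 in enumerate((0.0, separation)):
        for _ in range(n_per_cluster):
            bodies.append(((x0 + rng.random(), rng.random()), 1.0))
            cell_of.append(cell)
    return bodies, cell_of


if __name__ == "__main__":
    bodies, cell_of = two_clusters()
    exact = direct_accel(0, bodies)
    approx = clustered_accel(0, bodies, cell_of)
    print("absolute error:", math.hypot(exact[0] - approx[0], exact[1] - approx[1]))
```

The approximation is only good when the other cell is far away compared with its own size, which is the "outside a certain distance" condition above.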

      Or for your gas example, if you broke the problem up into boxes, you would have to "hand off" a particle as it passed from one box to another, and perhaps pass off information about forces close to the box boundaries. But if a lot of stuff is happening in a single box (like, say, chemical reactions), you can still get a big benefit out of parallelization.
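The hand-off between boxes can be sketched in a few lines. This is a one-dimensional toy with free-streaming particles and a periodic domain, all of which are simplifying assumptions; in a real cluster each box would live on its own node and the hand-off lists would be network messages.

```python
def owner(x, box_width, n_boxes):
    """Index of the box that owns position x."""
    return int(x // box_width) % n_boxes


def step(boxes, box_width, dt):
    """Advance every particle by dt, then hand off the ones that left their box.

    boxes is a list of lists of (position, velocity) tuples.
    Returns the number of particles handed to a different box.
    """
    n_boxes = len(boxes)
    length = n_boxes * box_width
    outgoing = [[] for _ in range(n_boxes)]  # stands in for messages between nodes
    for b, particles in enumerate(boxes):
        kept = []
        for x, v in particles:
            x = (x + v * dt) % length  # periodic domain (assumption)
            dest = owner(x, box_width, n_boxes)
            if dest == b:
                kept.append((x, v))
            else:
                outgoing[dest].append((x, v))
        boxes[b] = kept
    handed_off = 0
    for b in range(n_boxes):
        boxes[b].extend(outgoing[b])
        handed_off += len(outgoing[b])
    return handed_off


if __name__ == "__main__":
    boxes = [[(0.5, 1.0), (0.2, 0.1)], [(1.5, 0.0)]]
    moved = step(boxes, box_width=1.0, dt=1.0)
    print("particles handed off:", moved)
    print("box populations:", [len(b) for b in boxes])
```

The fewer particles cross a boundary per step relative to the work done inside each box, the better this scheme parallelizes.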

      Also, if designing nuclear bombs is anything like designing microwave components, you would have several simulations going at the same time, to try different variations on one design. Or you would design several subparts and have them running at the same time.

      In short, I think that the problem very much lends itself to parallel computing.

      --
      It's not wasting time, I'm educating myself.
    2. Re:Processing power by LWATCDR · · Score: 5, Insightful

      Since the first atomic bomb was made in 1944-45 and worked the first time, all you need is a computer equal to what they had in 1944.
      To make a small portable nuke is harder.

      --
      See my blog http://ilovecookes.blogspot.com/ for light hearted technical information.
  11. PCI Null-Modem by Bios_Hakr · · Score: 5, Interesting

    Uuh, I mean null-card connection. I have never really looked at the PCI spec from an electrical engineer's standpoint, but there are probably power leads, data leads, timing leads, and ground leads on there.

    The data leads should be easy...TX to RX. Although they may use a full-duplex lead where the data shares the bus based on clock pulses.

    The power could be dropped, as both machines already have the proper power requirements. The ground leads could be tied together if you wanted, but dropping them shouldn't have too much impact on the final outcome.

    The tricky part would be the clock pulses. In order to keep the data integrity, you need to have both machines on the same clock. The easy way would be to take the crystal from one motherboard and wire it to the other. Same crystal, same clock pulse.

    Then drivers would be needed to make the other computer look like an attached device. Shouldn't be too difficult. Just take a NIC driver and modify it...heavily.

    I think an easier option would be to share data across the IDE bus. Make an IDE driver look like a NIC driver and send IP across IDE. In fact, I remember Linux Journal publishing an article about someone doing IP over SCSI about 2 years ago. Get some SCSI cards and make your own version of a CDDI network ring.

    --
    I'd rather you do it wrong, than for me to have to do it at all.
  12. I can see the ads now.. by PerryMason · · Score: 5, Funny

    "I was doing my nuclear simulations on the ASCI White and it was like BEEP BEEP BEEP...and like half my work was gone..."

    --
    "I'm tired of all this 'Aren't humanity great' bullshit. We're a virus with shoes" - Bill Hicks
  13. This is not "Big Iron" by sdeath · · Score: 5, Insightful

    The title says it all. Big Iron is _engineered_. No matter how big or how spiffy a Beowulf cluster is, it's still just a bunch of PC motherboards kludged together with a bunch of network cards. There is a reason Crays are expensive - they are _worth it_ from a performance standpoint, because not every problem lends itself easily to the solution of a Beowulf cluster. Some problems require the exchange of a lot of data between a lot of nodes, and a little math will show that it won't take much data interchange to saturate even a GigE switch. Adding more machines is not going to help; craftily designing and overengineering the network _might_, but by the time you get this whole damned thing glued together well enough to approximate a Cray's performance, you'll have spent enough to have just flat-out bought a Cray in the first place.
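The "little math" can be sketched like this. Every number in the example is an illustrative assumption (no protocol overhead, no latency, one gigabit link per node, made-up message sizes), and the function names are hypothetical.

```python
GIGE_BYTES_PER_S = 1e9 / 8  # 1 Gbit/s line rate, ignoring protocol overhead


def comm_seconds_per_step(bytes_per_peer: float, peers: int) -> float:
    """Time one node spends pushing its data through a single GigE link."""
    return bytes_per_peer * peers / GIGE_BYTES_PER_S


def parallel_efficiency(compute_seconds: float, bytes_per_peer: float, peers: int) -> float:
    """Fraction of each step spent computing rather than waiting on the wire."""
    comm = comm_seconds_per_step(bytes_per_peer, peers)
    return compute_seconds / (compute_seconds + comm)


if __name__ == "__main__":
    # Hypothetical workload: 1 second of math per step, 125 kB sent to each peer.
    for peers in (10, 100, 1000):
        eff = parallel_efficiency(1.0, 125_000, peers)
        print(f"{peers:5d} peers: efficiency {eff:.2f}")
```

With all-to-all exchange the communication term grows with the node count, so adding machines eventually makes each step slower rather than faster.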

    As others have noted, while this thing may have a theoretical peak performance of 10 TFLOPS, I'm willing to bet that number goes down like Monica Lewinsky on Quaaludes when you feed this magical supercomputer a problem that's _not_ suitable for distributed.net (i.e. one where computations on one node are dependent on computations on another node, like fluid-dynamics problems, turbulence, etc.)

    Yeah, it's interesting as a curiosity, but this is by no means spectacular. Beowulf is good for what it's good for, which is a "poor-man's supercomputer" that works well for coarsely-parallel problems that don't require a lot of internode communication. It's not the Philosopher's Stone, folks.

    -SD

    --
    I am Chaos. I am alive, and I tell you that you are Free. -Eris