Slashdot Mirror


Student and Professor Build Budget Supercomputer

Luke writes "This past winter Calvin College professor Joel Adams and then Calvin senior Tim Brom built Microwulf, a portable supercomputer with 26.25 gigaflops peak performance, that cost less than $2,500 to construct, becoming the most cost-efficient supercomputer anywhere that Adams knows of. "It's small enough to check on an airplane or fit next to a desk," said Brom. Instead of a bunch of researchers having to share a single Beowulf cluster supercomputer, now each researcher can have their own."

77 of 387 comments

  1. Imagine... by Glowing+Fish · · Score: 4, Funny

    A beowulf cluster full of these!

    (Okay, now back to responsible mature posting)

    --
    Hopefully I didn't put any [] around my words.
    1. Re:Imagine... by Anonymous Coward · · Score: 5, Funny

      (Okay, now back to responsible mature posting)

      You forgot to provide a link to that...
    2. Re:Imagine... by Jonner · · Score: 3, Insightful

      In this case, I think it's a somewhat serious idea. This design has only four nodes, so connecting several in a modular fashion might make sense, and retain some of the advantages in portability and cost. You could move the individual Microwulfs around, but bring them together for really big problems. Think of it as a LAN party for scientists.

    3. Re:Imagine... by Anonymous Coward · · Score: 2, Informative

      Hmmm....

      NCSU's Computer Science Dept. has a PS3 cluster topping out at 218 Gflops using 8 PS3s. PS3's are not $500 each, so that's quite a bit better in terms of bang for the buck. It's even better than the reduced price PC from Newegg.

      http://moss.csc.ncsu.edu/~mueller/cluster/ps3/

      http://moss.csc.ncsu.edu/~mueller/cluster/ps3/coe.html

    4. Re:Imagine... by mikael · · Score: 2, Informative

      That's probably what would happen if a dozen of these systems were made. Instead of a system in each office, they would probably be placed in a lab, if not in a server room somewhere with remote access through a thin client.

      --
      Vintage computer adverts: http://www.vintageadbrowser.com/computers-and-software-ads
    5. Re:Imagine... by Bluesman · · Score: 4, Funny

      (Okay, now back to responsible mature posting)

      No, stay with us on Slashdot!

      --
      If moderation could change anything, it would be illegal.
  2. not so impressive... by toQDuj · · Score: 4, Insightful

    It's just four motherboards sitting in a single frame, connected by an Ethernet switch.

    True supercomputing machines (sun, ibm) have a little bit better interconnectivity between the components than a mere 1Gb/s line. This can serve its purpose though, VASP will run wonderfully on it. GAMESS probably as well.

    B.

    --
    Every experiment which ends in a big bang is a good experiment.
    1. Re:not so impressive... by QuantumG · · Score: 3, Informative

      http://www.calvin.edu/~adams/research/microwulf/budget/

                AMD Athlon 64 X2 3800+ AM2 CPU x 4

      It's two clicks from the summary.

      Slack++

      --
      How we know is more important than what we know.
    2. Re:not so impressive... by dbIII · · Score: 2, Informative

      Mobs like Verari were selling something similar a while ago - not cheap though and I can't see it on their web page now. What is nice now from other places is things like 2 x 8 core machines in 1U (maxtron and probably a few others). The relatively small supermicro boards in that thing would mean you could put a few in a server case - not cheap though.

    3. Re:not so impressive... by pablochacin · · Score: 3, Insightful

      >Are Intel processors "super computers" now or something?
      No, processors ARE NOT supercomputers (actually, they are not computers at all). But if you put enough of them together in the appropriate way, they BECOME a supercomputer.
      Supercomputers are no longer made from special purpose hardware. Now it makes much more economic sense to build them from general purpose hardware like those Intel or PowerPC processors. Look at the MareNostrum, a supercomputer here in Spain.

    4. Re:not so impressive... by vrmlguy · · Score: 2, Interesting

      Others have pointed out that this is useful for tasks where the interconnect speed doesn't matter. I'll point out that the first "node" only costs $765, and the next seven are $564 each (then you need a bigger switch). Of course, the 8-way version won't fit in an airplane's overhead luggage compartment anymore. You might want to add a UPS.
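      To put numbers on that scaling, here is a rough sketch in Python that uses only the two prices quoted above ($765 for the first node, $564 for each additional one). The function name is made up for illustration, and the cost of the bigger switch is left as a parameter defaulting to zero because the post doesn't price it.

```python
def cluster_cost(nodes, first_node=765, extra_node=564, bigger_switch=0):
    """Estimated parts cost in dollars for a cluster of `nodes` boards.

    `bigger_switch` is a hypothetical extra charge; the post gives no figure.
    """
    if nodes < 1:
        raise ValueError("need at least one node")
    return first_node + (nodes - 1) * extra_node + bigger_switch


for n in (1, 4, 8):
    print(n, "nodes:", cluster_cost(n), "dollars")
```

      With those figures the four-node build lands just under the $2,500 in the summary, and the eight-node version is a bit under twice that before the switch and UPS are added.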

      I seem to recall a post earlier this year about some other university building something similar using two quad-core CPUs on each motherboard. Their version, too, wouldn't fit over your seat, as it stood about six feet tall. Hmmm, neither Slashdot nor Google can find anything, but I thought it used a frame built of pine 2x2s.

      BTW, is there a benchmark you have to pass to get called a supercomputer? Why couldn't someone grab a bunch of three-year-old desktops that are due to be junked and tie them together for a shot at the title of cheapest supercomputer? Do those ad hoc arrays that the animation studios re-build for every movie count?

      --
      Nothing for 6-digit uids?
  3. On an airplane? by biocute · · Score: 3, Insightful

    It's small enough to check on an airplane

    With security concerns nowadays, it's the amount of cables coming out of it that worries an airline, not the size or weight of this machine.

    1. Re:On an airplane? by arivanov · · Score: 2, Interesting

      You are overestimating the amount of EM noise emitted by a motherboard outside the case. Very few computer components are noisy. The ones that are, like some modems, wireless cards, etc., feature additional individual shielding.

      --
      Baker's Law: Misery no longer loves company. Nowadays it insists on it
      http://www.sigsegv.cx/
  4. Check in on an airplane ? by boaworm · · Score: 2, Funny

    It looks rather fragile, quite like the iRack (http://www.youtube.com/watch?v=xcjLEwZqcQI), and I don't think it would survive being checked in on an airplane, given what some suitcases look like at baggage claim.

    Cool achievement nevertheless.

    --
    Probable impossibilities are to be preferred to improbable possibilities.
    Aristotele
    1. Re:Check in on an airplane ? by ThirdPrize · · Score: 5, Funny

      Checking something called an iRack onto a plane is just asking for a full cavity body search and possibly a nice orange jumpsuit.

      --
      I have excellent Karma and I am not afraid to Troll it.
  5. How is this interesting? by Anonymous Coward · · Score: 2, Informative

    They just linked 4 motherboards together. My cat could do that.

    1. Re:How is this interesting? by MrNaz · · Score: 2, Funny

      Lisa, I'd like to buy your cat!

      --
      I hate printers.
    2. Re:How is this interesting? by CaptDeuce · · Score: 5, Funny

      They just linked 4 motherboards together. My cat could do that.

      Sure. But then your cat would have to moonlight as a mouser, run errands for the neighborhood dogs, and -- worst of all -- give up catnip; all in order to pay for the project.

      I would not want to live in the same house as a sleep-deprived cat going through catnip withdrawal.

      --
      "Where's my other sock?" - A. Einstein
    3. Re:How is this interesting? by dbIII · · Score: 5, Funny

      Doubt it. You think you can hook up gigabit ethernet without at least five cats eh?

    4. Re:How is this interesting? by maroberts · · Score: 5, Funny

      They just linked 4 motherboards together. My cat could do that.

      Would your cat be alive at the end of the process? We wouldn't be sure till we opened the case.

      --

      Donte Alistair Anderson Roberts - hi son!
      Karma: Chameleon

  6. heat buildup issues? by toQDuj · · Score: 2, Interesting

    And it looks like they'll be running into heat buildup issues. An enclosure ventilated by one or two desktop fans would have provided sufficient cooling. Mere convection (outside of the tiny on-board fans) is often not enough. The Sun E450's were well ventilated machines, with a clear air path going from the front to the back. The temperature monitors (ambient, cpu (x4), PSU (x3)) were useful as well. One was used for a long time at Stack (www.stack.nl) as a room temperature monitor.

    B.

    --
    Every experiment which ends in a big bang is a good experiment.
    1. Re:heat buildup issues? by Bob+MacSlack · · Score: 2, Informative

      I guess reading the article is asking too much? There are 4 120mm case fans on it.

  7. Great! by Colin+Smith · · Score: 4, Funny

    Now Microsoft have their next development target for Office.

    --
    Deleted
  8. But by phalse+phace · · Score: 4, Funny

    is it powerful enough to run Windows Vista?

    1. Re:But by Corwn+of+Amber · · Score: 2, Funny

      If you mean "without any lag", then it is /required/.

      --
      Making laws based on opinions that stem up from false informations leads to witch hunts.
    2. Re:But by Thrakamazog · · Score: 5, Funny

      Only if you don't plan on playing MP3s.

  9. Lame. by Anonymous Coward · · Score: 4, Insightful

    I am impressed with how amazingly lame this story is. It should have been entitled, "College Senior and Professor discover Ethernet, MicroATX, and PXE boot. Funding dried up before paying for cases. News at 3 am because we can't find anything else to report."

    Honestly, our whole research lab is filled with PXE booting MicroATX computers connected via ethernet. And I guarantee that four "nodes", aka Linux PCs, are cheaper than $2500. Whoop-de-freaking-do.

    1. Re:Lame. by GreatBunzinni · · Score: 3, Informative

      And I guarantee that four "nodes", aka Linux PCs, are cheaper than $2500.

      Indeed. After I saw the component prices I was left dumbfounded. I mean, AMD Athlon 64 X2 3800+ processors at 165 dollars a pop? A Kingston 1GB DDR-667 stick of RAM at 124 dollars? Are they on drugs? I mean, I've just bought an Athlon 64 X2 4000+ EE for 68 euros (the 3800+ was selling for 59 euros) and each Kingston 1GB DDR-800 stick for 46 euros. Where did all the rest of the money go?

      --
      Slashdot, fix your code or at least hire someone who is competent at it to do it for you.
    2. Re:Lame. by dreamchaser · · Score: 3, Funny

      They probably padded the budget and spent the remaining money on hookers and blow. That would explain how they got delusions of grandeur and thought they built something new and innovative when all they did was link 4 motherboards via cheap gig-ethernet.

      This story is literally a 'nothing to see here, move along' one.

  10. the google way by arabagast · · Score: 5, Interesting

    This seems pretty similar to the way Google builds their racks, with just motherboards and no cabinets. What would have been really cool is if someone had made some kind of network driver for a PCI Express slot, so the machines could be linked with external cables. Is it possible to use a dedicated PCI Express slot as an interface to another computer, skipping the network bottleneck?

    --
    Doolittle : ...What is your one purpose in life?
    Bomb no.20 : To explode of course.
    1. Re:the google way by Petaris · · Score: 3, Interesting

      Some other students and I (back when I was in college) played with doing this via PCI SCSI cards. It worked to a point, but it wasn't quite the same, as all you were really doing was providing SCSI access to each system's HDDs. Still, it would have allowed quite fast data sharing if configured correctly. As we had no real goal, it was just one of those "I wonder if we can do it" times, and we didn't play further than just the HDD connections and copying files across, which was very fast. :)

      --
      ~Petaris "The world is open. Are you?"
    2. Re:the google way by Stultsinator · · Score: 2, Interesting

      (Commenting rather than modding)

      I've often wondered the same myself. Sure, you can get some speed optimizations by running a slimmed-down wire protocol over the Ethernet, but it's intuitive that any additional hardware between nodes adds latency. Unless NIC hardware is essential for something like buffering, I'd think some sort of PCI bridging driver would be much better suited for this sort of setup.

      If anyone's heard of anything like this please share. I'm off to do some more Googling for it myself.

      -S

    3. Re:the google way by dave420 · · Score: 2, Interesting

      You'd have to implement some sort of switching, as the motherboards in question only had 1 PCIE slot. You'd have to find a motherboard with as many PCIE slots as computers wanting to speak to each other to act as a switch, or have them all talking over one connection, which would diminish performance greatly.

  11. Not to rain on their parade, but... by fgodfrey · · Score: 5, Insightful

    ...this is *hardly* a supercomputer. This is 152.57 times slower than entry number 500 on the Top 500 List. There isn't a nice neat definition of what a supercomputer is anymore, but "capable of running Beowulf" isn't it. Leaving aside the more custom machines that the company I work for (and a few others) build, there are plenty of Linux clusters that *do* qualify. The fastest one seems to be number 8 on the current Top 500 list (a Dell Infiniband cluster at NCSA).
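    For anyone wondering what that ratio implies, here is a quick back-of-the-envelope check in Python. It only multiplies the two figures already in this thread (26.25 gigaflops from the summary and the 152.57x ratio above); it does not look up the actual Top 500 entry, so treat the result as implied rather than confirmed.

```python
MICROWULF_GFLOPS = 26.25     # peak figure quoted in the story summary
SLOWDOWN_VS_500TH = 152.57   # ratio quoted in the comment above

# Performance the 500th entry would need for the ratio to hold.
implied_500th_gflops = MICROWULF_GFLOPS * SLOWDOWN_VS_500TH

print(f"Implied entry #500: about {implied_500th_gflops:.0f} GFLOPS")
```

    That works out to roughly 4 teraflops for the bottom of the list.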

    --
    Go Badgers! -- #include "std/disclaimer.h"
    1. Re:Not to rain on their parade, but... by maevius · · Score: 2, Insightful

      My calculator has about double the power of ENIAC...

  12. Actually... Microwulf might well be revolutionary by Colin+Smith · · Score: 3, Insightful

    One of the problems with supercomputers is that there aren't really very many of them, because of the size and cost. It means that the tools you use to run your supercomputing applications are similarly unusual. The skills to use and develop on parallel systems are then equally scarce. Access to a supercomputer isn't exactly common.

    Microwulf could make all of the above common. For the price of a high spec PC. The commodity nature of it could bring super computing and super computing applications to the masses.

    Then you can scale your application from microwulf to miniwulf to superwulf with little more effort than installing it on the bigger machine.

    Course, they'd have to produce a commodity pre-built system.

    --
    Deleted
  13. It's about the possibilities, not the technology by Kantana · · Score: 5, Interesting

    I see a few people making the expected "It's just four motherboards wired together with Gig E"-comments. While I won't object to that, I'd say this is not about a groundbreaking evolution in hardware, more a case of demonstrating what's possible today with COTS parts. Adding to that the compact packaging, and the ability to run off of a single power cord, it's a nice setup IMHO.

    While it does not have the interconnect of "true HPC" hardware (a bit of a fleeting distinction, but bear with me) it'll surely be suitable for a lot of the simpler, yet still compute-intensive tasks out there ("simple" here meaning not needing a lot of inter-node communication).

    On the flip side, it might fuel the "hell, I'll just build my own cluster" mentality going around these days. I work in the HPC group at a university, running Linux clusters, IBM "big iron" and a couple of small, old SGI installations, and we certainly see a bit of that going around. Problem is, sure, the hardware is cheap and affordable, but getting it to run in a stable and sensible manner without spending large amounts of time just keeping the thing together is a challenge, mainly due to the immature state of clustering software. As many researchers are not exactly keen on spending time solving problems outside their specific field, they're usually better off letting somebody else administer things, so they can just log on and run their stuff.

    But for individuals and small groups of people who are computer savvy enough to handle it, things like these are definitely a "good thing" (TM).

  14. Re:Actually... Microwulf might well be revolutiona by Solra+Bizna · · Score: 2, Interesting

    The more computing power is available in the world, the less it will be used to its potential. If everyone had an Earth Simulator in their basement, how much of that power would be wasted?

    Not saying that proliferation of computers is bad, just food for thought.

    -:sigma.SB

    P.S. SETI@home, Folding@home, etc. are cheating. :P

    --
    WARN
    THERE IS ANOTHER SYSTEM
  15. 4 psus, isn't that a waste? by bundaegi · · Score: 5, Interesting
    Sure, nothing beats off-the-shelf components... but powering 4 motherboards using 4 separate PSUs sounds like waste!

    Look at this design: http://www.mini-itx.com/projects/cluster/. It uses DC-DC converters on each motherboards (mini-itx, so low power), a single 12V PSU and a UPS for regulation:

    The DC-DC converters require a clean, well-regulated 12VDC source. I chose to use a heavy duty 60 ampere 12VDC switching power supply capable of delivering 60 amperes peak current which I ordered from an online electronics test equipment supplier. Since badly conditioned AC power is potentially damaging to expensive computing equipment, I use a 1 KVA UPS purchased at an office supply store to make sure the cluster can't be "bumped off" by power line glitches and dropouts.
    --
    bundaegi is good for you
  16. GigaFlops by jma05 · · Score: 4, Interesting

    Is 26 GigaFlops significant anymore? I hear from the Folding@Home people that the PS3 can do 20-25. And it is only about a 5th the price. But I hear so many different numbers that I can no longer make sense of them. Why do they bother comparing with Deep Blue, an over 10 yr old supercomputer? Can anyone with a PS3 report what their PS3 with Yellow Dog Linux is doing? And what are the numbers for the latest desktop processors? Any recommendations on software to benchmark in flops for my own computers?

    1. Re:GigaFlops by skulgnome · · Score: 2, Informative

      Are your numbers on single-precision computation, or double-precision? Because the PS3's Cell only does amazingly quick floating-point on single-precision values. Double precision is six, seven times as slow.

  17. Re:What would you do with one? by davmoo · · Score: 3, Insightful

    This is kind of like the old joke about a dog chasing a car...what's it gonna do with the thing if it catches it.

    I've thought several times about building a small cluster, just for the experience and the nerd factor. But I never do because I also get into the issue of just what I am going to do with it once it's finished, other than heat my workshop.

    --
    I want a new quote. One that won't spill. One that don't cost too much. Or come in a pill.
  18. Re:But.. by stupid_is · · Score: 5, Funny
    Minesweeper under XP - Yes

    Minesweeper under Vista - No

    --
    -- Intelligence is soluble in alcohol
  19. **Lets chop that price down...the newegg,com way** by Bananatree3 · · Score: 5, Informative
    Motherboard: MSI K9N6PGM-F MicroATX $62.99 * 4 = $251.96

    CPU: AMD Athlon 64 X2 3800+ AM2 CPU $67.50 * 4 = $270

    Main Memory: Kingston DDR2-667 1GByte RAM $48.49 * 8 + $4.99sh = $392.91

    Power Supply: (can't beat price): $76.00

    Network adapter (node to switch): (cant beat their price) $164.00

    Network adapter (switch to node): (cant beat their price) $15

    Switch: Trendware TEG-S80TXE 8-port Gigabit Ethernet Switch $46.99+$7.04sh = $54.03

    Hard drive: Seagate 7200 250GB SATA hard drive $69.99

    DVD/CD drive: (can't beat their price): $19

    Cooling: (can't beat their price): $32

    Fan protective grills: (can't beat their price): $10

    KVM: (can't beat their price): $50

    Grand total (incl. 15 in hardware): 1416.89

    $1000 saved by using Newegg!
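
    As a sanity check on the list, here is a small Python sketch that adds up only the twelve priced lines above, using exact decimal arithmetic. The dictionary keys are just labels chosen for illustration. Note that the grand total in the post also mentions hardware that has no line of its own, so the itemized sum is not expected to match it exactly.

```python
from decimal import Decimal

# Subtotals copied from the priced lines above, in dollars.
parts = {
    "motherboards (4)": "251.96",
    "CPUs (4)": "270.00",
    "memory (8 sticks + shipping)": "392.91",
    "power supply": "76.00",
    "network adapter (node to switch)": "164.00",
    "network adapter (switch to node)": "15.00",
    "switch (incl. shipping)": "54.03",
    "hard drive": "69.99",
    "DVD/CD drive": "19.00",
    "cooling": "32.00",
    "fan grills": "10.00",
    "KVM": "50.00",
}

itemized_total = sum(Decimal(price) for price in parts.values())
print("Itemized total:", itemized_total)
```

    The itemized lines alone come to $1,404.89.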

  20. Wussywulf? by MikeFM · · Score: 2, Interesting

    I'm too lazy to run the numbers tonight to compare actual speeds, but our dual-CPU four-core Xeon (8 cores total) servers cost around $2500 each to build. Looking at their specs I doubt they could be doing much better, and they require special clusterish programming.

    --
    At what price learning? At what cost wisdom? The price is a man's peace of mind, and the cost is his life.
    1. Re:Wussywulf? by Draconian · · Score: 2, Informative

      they require special clusterish programming

      So? On an SMP machine you need special SMP-ish programming. Great fun if your memory bandwidth runs out...

      Some problems run naturally on distributed systems, some on shared-memory systems. It's a matter of choosing the right machine for the task at hand. Programming in MPI isn't that hard, and unless you are network bound (either bandwidth or latency) it scales well. That is the equivalent of an SMP machine not being memory bound (bandwidth, latency, coherency,...)
  21. gigaflops? by apodyopsis · · Score: 2, Insightful

    gigaflops, schmigaglops.

    this is /.

    i thought performance was measured in fps?

    1. Re:gigaflops? by zero_offset · · Score: 3, Funny

      Only true if your user ID contains 7 digits...

      --

      Slashdot quality declines as the number of hot grits posts decreases. - Provolt's Law, Apr-09-2005

  22. Re:Actually... Microwulf might well be revolutiona by forkazoo · · Score: 5, Informative

    One of the problems with supercomputers is that there aren't really very many of them, because of the size and cost. It means that the tools you use to run your supercomputing applications are similarly unusual. The skills to use and develop on parallel systems are then equally scarce. Access to a supercomputer isn't exactly common.


    Revolutionary? Everything old is new again...

    http://www.mini-itx.com/projects/cluster/
    http://news.taborcommunications.com/msgget.jsp?mid=494184&xsl=story.xsl -- 8 way parallel cluster that fits on an airplane for under 3 grand
    http://www-03.ibm.com/systems/bladecenter/ -- a 7U chassis that holds 14 blades, and is a bit spendy, but not completely unreasonable for some situations
    http://www.linuxjournal.com/article/8177 -- My personal favorite, this page talks about several small portable miniclusters that have been made over the last six or seven years...

    Yes, 8 cores of Athlon64 is faster than 8 cores of low power VIA CPUs from several years ago, but the concept isn't revolutionary, and there isn't a lot of headline-worthy engineering that goes into a project like this... I'm sure it's a very handy tool, and I'm not suggesting it shouldn't have been built, or that it was entirely trivial to build, but in the end, it's just four ordinary motherboards and Ethernet.
  23. Re:How does it compare to a PS3? by MacroRex · · Score: 4, Informative

    Sorry for replying to myself, but I found an interesting paper about the subject. Seems that a PS3 should have an Rpeak of 14 Gflop/s with double precision floating point operations. Sounds to me that with a proper clustering solution a four-node PS3 cluster would be significantly faster than Microwulf. And it would probably be smaller, too :)

  24. Re:**Lets chop that price down...the newegg,com wa by somersault · · Score: 5, Funny

    Discovering that you can build an even more cost effective supercomputer than these guys: priceless

    --
    which is totally what she said
  25. Orac from Blake's Seven by ltrm · · Score: 2, Funny

    A striking resemblance for a box of bits. I wonder if it's got the same surly attitude.

  26. Intel Core 2 is Faster by locster · · Score: 2, Informative

    Am I missing something here? The SiSoft Sandra MFLOPS measurement for a top end Intel Core 2 is 47 GFlops (http://www.tomshardware.co.uk/overclocking-intel,review-2395-28.html/). OK, admittedly this is a synthetic measurement, but it's a ballpark figure, right?

  27. Re:Newbie translation please? by noahisaac · · Score: 4, Informative

    So 1 Hz equals 1 FlOp? And a 3.2 GHz CPU can do 3.2 gigaflops, right?
    No, one hertz is one cycle of the processor.

    Can they execute multiple FlOps per tick then?
    Yes. A single processor will perform several steps in one cycle. Typically, the steps are something like:

    1. fetch (an instruction from memory)
    2. decode the instruction
    3. execute the instruction
    4. access (some memory location)
    5. writeback (some values calculated during this cycle)

    In reality, this cycle is usually more complex and processors are designed to predict certain events in order to pack more into a single processor cycle. On top of this, note that the processors used in this machine are all dual-core processors. This means that instead of the 4 processors listed on the hardware manifest, it's really more like 8 processors (well, not quite).

    And do we care that these will bottleneck at the rather limited bus (even forgetting about the switch).
    No.

    Hey, those computer engineering classes I was forced to take as a part of my CS major have actually proven useful! Oh wait, this is Slashdot.
  28. Price/Performance not new... by wilw410 · · Score: 2, Insightful

    The University of Kentucky (where he is coincidentally going to grad school) beat his price point years ago on a "real" supercomputer. This supercomputer was built for about $84 per GFLOP in 2003 and it made the Top500 list when it was built. The Aggregate team at UK is one of the tops in the field when it comes to supercomputers on the cheap.
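
    For comparison, here is the same dollars-per-GFLOP arithmetic applied to Microwulf, as a short Python sketch. It uses the summary's figures ("less than $2,500", 26.25 gigaflops), so the result is an upper bound, and the two machines' numbers may not have been measured the same way. The helper name is made up for illustration.

```python
def dollars_per_gflop(cost_usd, gflops):
    """Simple price/performance ratio."""
    return cost_usd / gflops


# $2,500 is the ceiling given in the summary, so this is an upper bound.
microwulf = dollars_per_gflop(2500, 26.25)
kentucky_2003 = 84  # figure quoted in the comment above

print(f"Microwulf: under ${microwulf:.2f} per GFLOP")
print(f"Kentucky machine (2003): about ${kentucky_2003} per GFLOP")
```

    On those numbers Microwulf comes out a little above $95 per GFLOP at most, which is in the same neighborhood rather than a clear win.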

  29. Nice, but a little low on RAM? by Sandb · · Score: 2, Insightful

    Seems like 2GB per (dual core) node is a little on the low side for practical usage. Not surprisingly though, RAM is the biggest cost of the system (992$ total) and switching to 2GB or 4GB modules will raise the system price considerably. Would still be cheap though.

  30. Re:How does it compare to a PS3? by DrXym · · Score: 2, Interesting
    IBM have already done just that. For example, they have a demo of a cluster of PS3s rendering a real-time ray-traced scene.

    I expect the design is very well suited to clustering. The PPUs handle all the data dispatching & balancing with the SPUs left to do the leg work.

  31. Re:How does it compare to a PS3? by Savantissimo · · Score: 3, Interesting

    Good paper - it also says that by using mixed precision (iterated 32-bit math for rough matrix factorization then fine-tuning the precision in 64-bit) the double-precision matrix performance is up to 155 Gflops.

    --
    "Is life so dear, or peace so sweet, as to be purchased at the price of chains and slavery?" - Patrick Henry
  32. Re:Why the extra NIC's? Mobo had 1gbt ports by mattgoldey · · Score: 2, Insightful

    RTFA. They bound the onboard NIC to one core of each CPU and bound the add-on NIC to the other core. That way, each core had its own dedicated communications channel.

  33. Lousy Latency Performance, Though by porkchop_d_clown · · Score: 2, Funny

    The cluster depends on gigE for the interconnect, which means data transfers are going to be slow, and have a high latency. He'd be better off spending a little more and using Infiniband equipment.
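
    A common way to reason about this is the simple model "time = latency + size / bandwidth". The Python sketch below applies it to gigabit Ethernet. The 125 MB/s figure is just the 1 Gb/s line rate divided by eight; the 50 microsecond per-message latency is a made-up placeholder for illustration, not a measurement of this cluster.

```python
def transfer_time(message_bytes, latency_s, bandwidth_bytes_per_s):
    """Latency-plus-bandwidth estimate of the time to send one message."""
    return latency_s + message_bytes / bandwidth_bytes_per_s


GIGE_BANDWIDTH = 1e9 / 8     # 1 Gb/s line rate = 125 MB/s, ignoring overhead
ASSUMED_LATENCY = 50e-6      # hypothetical 50 microseconds per message

for size in (64, 64 * 1024, 64 * 1024 * 1024):
    t = transfer_time(size, ASSUMED_LATENCY, GIGE_BANDWIDTH)
    share = ASSUMED_LATENCY / t
    print(f"{size:>9} bytes: {t * 1e3:9.3f} ms, latency share {share:.0%}")
```

    Under these assumptions small messages are dominated almost entirely by latency, which is why finely grained parallel jobs suffer on Ethernet while bulk transfers mostly see the bandwidth limit.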

  34. Re:Do you reckon it could run Crysis? by Pop69 · · Score: 2, Funny

    It might just about be powerful enough to run Vista though....

  35. Re:Newbie translation please? by jaweekes · · Score: 2, Informative
    I was always told that it took at least 2Hz for a processor to do one instruction, but that was back in 1991 when I took my electronics degree.

    A processor normally takes 2-3 clock pulses to perform any instruction, as it cannot perform the operation in the same clock cycle that it receives the operation in. If the operation requires a call to a memory location it will take 3 cycles (one to get the info from the memory location) which is why pre-fetch is so important in modern processors.

    A cycle is triggered by the rising edge of the clock pulse. Whatever the computer does must be completed before the start of the next cycle

    The instruction execution cycle is triggered by the clock cycle, but has several stages
    - Each stage is triggered by successive clock pulses
    - The exact timing depends on the details of a particular machine
    - A complete instruction cycle usually takes several clock cycles to execute

    The instruction cycle is divided into several stages
    - In some machines, some of these stages are performed simultaneously, which speeds things up
    - The stages are common to most architectures

    Sometimes called the Fetch-Decode-Execute Cycle.

    Taken from this pdf.
  36. Re:Newbie translation please? by Waffle+Iron · · Score: 2, Informative

    A processor normally takes 2-3 clock pulses to perform any instruction

    A modern processor may in fact take a dozen or more clock cycles to finish a single instruction. However, by utilizing pipelining, reordering and multiple execution units, a single core may be working on upwards of 50 instructions at once. The resulting throughput can be several instructions per clock on each core.
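
    To tie this back to the original question about Hz versus FLOPS: theoretical peak is usually estimated as cores x clock x floating-point operations per cycle. Here is a minimal Python sketch of that formula. The core count follows from the thread (four dual-core CPUs); the 2.0 GHz clock and two operations per cycle are assumptions for illustration, since neither is stated here.

```python
def peak_gflops(cores, clock_ghz, flops_per_cycle):
    """Theoretical peak: every core retires flops_per_cycle on every cycle."""
    return cores * clock_ghz * flops_per_cycle


CORES = 4 * 2          # four boards, one dual-core CPU each (from the thread)
CLOCK_GHZ = 2.0        # assumption: clock speed is not given in the thread
FLOPS_PER_CYCLE = 2    # assumption: one add plus one multiply per cycle

theoretical = peak_gflops(CORES, CLOCK_GHZ, FLOPS_PER_CYCLE)
print(theoretical, "GFLOPS theoretical peak under these assumptions")
```

    So a 1 GHz clock does not mean 1 gigaflop: the number depends on how many floating-point results each core can complete per cycle, and real benchmarks land somewhere below the theoretical figure.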

  37. Maybe it could be more compact by foxb · · Score: 2, Insightful

    At http://clustercompute.com/ you can find a better design in terms of compactness. Another thing is that the cluster does not need a KVM (it is needed only in test mode), and as noted in several research papers, dual 100M can beat gigabit (source: http://en.wikipedia.org/wiki/Kentucky_Linux_Athlon_Testbed).

  38. Beowulf = pain in the a** by seven+of+five · · Score: 2, Funny

    Beowulf is a good idea for a very limited number of number crunching applications, or as a student learning tool for comp sci or related studies. Yeah, we built one of those a couple years ago; the professors ended up picking up Intel quad core machines that were faster (no effing network latency). Beowulf is gathering dust.

    Oh, and try writing your own lam-mpi code sometime...

  39. Re:What would you do with one? by symbolic · · Score: 3, Funny

    Aren't these things like chick magnets?

  40. GPU cluster by ZonkerWilliam · · Score: 2, Informative

    Although not as cheap as the Microwulf, Nvidia has a desktop supercomputer for sale (http://www.nvidia.com/object/tesla_deskside.html) at 500 GigaFlops, to start.

  41. Re:Definition? by mikael · · Score: 4, Informative

    The basic definition of a supercomputer is a system which has top performance compared to other computer systems (within the top 500 or 100).

    In the past, this could only be achieved by having custom CPU's to perform pipelining or parallel processing. Processors in the Cray supercomputers had extremely deep vector pipelines, which was good for three-dimensional simulations like CFD or computer animation. But other systems followed the parallel processing method. The Connection machine had 2^16 one bit processors which was good for encryption/decryption. Other systems used standard CPU's (Intel 80x86's, DEC Alpha's and M680x0's) connected together through a high-speed bus network.

    The different types of systems could be defined according to how these processed instructions/data.

    SISD - Single Instruction, Single Data - Early home computer
    SIMD - Single Instruction, Multiple Data - Vector processors
    MISD - Multiple Instruction, Single Data - Fault tolerant systems
    MIMD - Multiple Instruction, Multiple Data - Parallel processing CPU's

    Some systems had hardwired interconnect configurations - either a 2D square grid, a 3D square grid or torus network, or even star networks, while others had dynamic routing capability. Transputers only knew about the adjacent processors in the four compass directions (NESW).

    But all of these techniques have been incorporated into mainstream CPU's now - you now have dual-core and quad-core CPU's that can be used by laptops.

    Modern day methods are to make the systems super-scalar. Multi-core CPU's can be arranged side by side onto multi-CPU boards which in turn can be rack mounted into chassis which communicate through high-speed interconnect systems. There is no limit on the number of racks that can be used except space and money.

    --
    Vintage computer adverts: http://www.vintageadbrowser.com/computers-and-software-ads
  42. 1999 called by Sangui5 · · Score: 2, Informative
    They want their slowest Top 500 machine back...

    List of #500 on the TOP500 by year
    Year . .- RPeak . . . | Machine's owner and country | Make & Model
    06/1998 - 15.0 GFLOPS | Southwestern Bell, USA. . . | HPC 6000, Sun
    11/1998 - 20.5 GFLOPS | Koeln Universitaet, Germany | HPC 10000 Sun
    06/1999 - 34.2 GFLOPS | CIEMAT, Spain . . . . . . . | T3E900 Cray
    11/1999 - 38.4 GFLOPS | Bank, United States . . . . | HPC 10000 400 MHz, Sun
    06/2000 - 51.2 GFLOPS | EDS, United States. . . . . | HPC 10000 400 MHz, Sun
    11/2000 - 78.0 GFLOPS | Zurich American, USA. . . . | SP Power3 375MHz, IBM
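
    A minimal sketch of the comparison the table invites, assuming the 26.25 GFLOPS figure from the story summary is comparable to the RPeak column. The helper name `last_list_made` is hypothetical.

```python
# (list date, RPeak in GFLOPS of the #500 entry), copied from the table above
BOTTOM_OF_TOP500 = [
    ("06/1998", 15.0),
    ("11/1998", 20.5),
    ("06/1999", 34.2),
    ("11/1999", 38.4),
    ("06/2000", 51.2),
    ("11/2000", 78.0),
]

MICROWULF_GFLOPS = 26.25  # peak figure quoted in the story summary


def last_list_made(gflops, cutoffs=BOTTOM_OF_TOP500):
    """Return the date of the latest list whose #500 entry the machine matches or beats."""
    made = [date for date, rpeak in cutoffs if gflops >= rpeak]
    return made[-1] if made else None


print(last_list_made(MICROWULF_GFLOPS))
```

    On these numbers the machine would have squeaked onto the 11/1998 list and dropped off by 06/1999.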

    Really, calling this a supercomputer is lame. It has only one 250GB disk; it will have utter crap IO performance. Most compute heavy jobs are also disk heavy because you want to checkpoint your intermediate results in case of a crash. Since there is only one disk, one machine must be serving it up to the others (NFS, iSCSI, whatever). It is clustered through gigabit ethernet, which will act as a limit on performance. They even skimped on the connection to the outside world and got a 100MBit card. "Real" clusters use Infiniband or Myrinet, both of which are optimized for high throughput with low latency and low contention. Gigabit is not. Linpack is rather kind to clusters; more finely grained parallel tasks will pay more for the poor linkup.

    Also, with only 4 processors one could also build a 4-way SMP machine which would then not have to deal with any sort of message passing at all. You instead get one shared memory interface. It may be slightly NUMA, but the extra latency cost of hypertransport is amazingly low. Instead, by putting only one dual core die per motherboard, you have to jump through hoops to move work from one die to another, and pay really bad latency costs. You could also do better with 2 quad-core processors on the same mobo (although you'd have to go Intel for now...). It's easier to program, supports finer grained parallelism, and allows potential savings on other parts.

    I can get 2 quad-core Xeons at 2.4 GHz each and a 2 socket motherboard for $820 at newegg. They spent $980 on 4 dual-core 2GHz processors and 4 single socket motherboards. They also spent $240 on gigabit cards and the switch. So, for $400 less, I can have an SMP machine; one which probably has higher floating point performance as well. Rather than 4 cheap power supplies I can get one nice one (which is probably more efficient too). Further, I don't have to run 3 of my nodes diskless. Really, at this small scale a cluster is not the way to go.
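
    The arithmetic above, spelled out as a short sketch. The prices and clock speeds are the ones quoted in this comment; the variable names are invented here, and cores times clock is only a crude stand-in for floating point performance, not a benchmark.

```python
# Cluster parts as quoted above
cluster_cpus_and_boards = 980  # 4 dual-core 2 GHz CPUs + 4 single-socket motherboards
cluster_network = 240          # gigabit cards + switch

# Single SMP box as quoted above
smp_cpus_and_board = 820       # 2 quad-core 2.4 GHz Xeons + 2-socket motherboard

savings = cluster_cpus_and_boards + cluster_network - smp_cpus_and_board


def core_ghz(sockets, cores_per_socket, clock_ghz):
    """Crude proxy for compute capacity: total cores times clock speed."""
    return sockets * cores_per_socket * clock_ghz


cluster_capacity = core_ghz(4, 2, 2.0)
smp_capacity = core_ghz(2, 4, 2.4)

print(f"savings: ${savings}")
print(f"cluster: {cluster_capacity:.1f} core-GHz, SMP: {smp_capacity:.1f} core-GHz")
```

    Both layouts have eight cores; the SMP box comes out $400 cheaper on these parts and has the higher aggregate clock.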

  43. Re:How does it compare to a PS3? by havenskate · · Score: 2, Funny

    Isn't 256MB all it has?

    256 MB XDR @3.2 GHz for system memory and 256 MB GDDR3 @700 MHz for video memory. Not to bring on a console war extravaganza, but the M$ xbox 360 has the same amount of total memory, but instead theirs is 512 MB UMA (Shared with CPU)...

    Anyway, memory could very well be a limitation with a cluster of ps3s vs. the microwulf cluster, but it all would depend on the operations you're performing. I'd venture to guess the 256mb of system memory would be enough for anyone. :)

  44. Re:Newbie translation please? by AJWM · · Score: 2, Informative

    That wasn't true even in 1991, except maybe for Intel processors which are notoriously wasteful of clock cycles (which is why they have always advertised clock speed rather than instructions per second). A 1 MHz MOS 6502 was just as fast as a 4.7 MHz Intel 8080 (and needed fewer support chips).

    If you throw more transistors at the problem, and/or different architectures, you can complete instructions in a single cycle. (Especially e.g. register-to-register instructions where the answer comes out of the inputs at the speed of propagation delay through the gates.) If you do it right, you can even design clockless CPUs where the completion of the previous instruction triggers the start of the next one without waiting on an external clock.

    This assumes your CPU is not microprogrammed; the instruction words contain the relevant bitmasks for source and destination registers as well as the control code for the ALU. See for example the PDP-11 instruction set (IIRC).

    Of course as others have pointed out, modern processors pipeline multiple instructions at different stages of execution at the same time, for a net throughput of multiple instructions per cycle.

    --
    -- Alastair
  45. GPU based supercomputing by Traa · · Score: 2, Interesting

    I thought the hip thing was GPU based supercomputing. NVidia even has a dedicated GPU based, desktop sized, scalable supercomputer line called Tesla.

    The basic Tesla unit c870 = 518 Giga flops for ~$1300.
    Tesla s870 = 2 Tera flops for ~$12000 (still desktop size)

    NVidia Tesla

  46. not trying to flame but .. by ILongForDarkness · · Score: 2, Interesting
    Since when is a four CPU node a supercomputer? I remember when a new Apple system came out, I believe it was the dual CPU G4 system, and they touted it as a supercomputer on the desktop because it could do 1 GFLOP single precision.

    I code for systems with 800 4-Opteron nodes, with 10GB/s interconnect, and a couple hundred terabytes of SAN attached storage. That is a supercomputer :) Well sort of, a lot of people in the HPC community consider it just a cluster, as some programs need 64+ CPU's in SMP mode, so any loosely coupled memory model would be considered a serial farm :)

    Also, note that high end platforms would have redundant power, redundant high end interconnects, redundant hot swap drives etc. There also would be enough of them to need high end switches, blowers, power conditioners, air circulators, and various other room coolers. Of course a custom built workstation without a graphics card, monitor, or even case is going to beat the pants off of HPC architecture on price per flop. Good work to the group, but hardly newsworthy in the HPC community.

  47. Which is fine by Sycraft-fu · · Score: 3, Informative

    But you aren't really a supercomputer at that point, you're a cluster. These days the line is more blurred than in the past but more or less the difference is interconnect speed. In a real supercomputer, there are very high speed interconnects, so you can run things that heavily rely on one part communicating with another, like particle simulations. That's why the US Department of Energy buys so many, rather than clusters. They do things like weather simulation and simulation of nuclear weapons, where every node has to be able to talk to every other node with essentially no penalty.

    Now if you have a job that doesn't use a lot of inter-node communication, like say 3D rendering, then a cluster is a better answer. Normal hardware with Ethernet interconnects. Works great and is cheap since you can use commodity parts. But don't confuse that cluster with a real super computer: throw one of those intense inter-node problems at it and it'll fall over because the interconnects are too slow.
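
    A toy model of the point above, with loud caveats: every number in it is a made-up placeholder rather than a measurement, and real codes overlap communication with compute. It treats one step of a job as perfectly divided compute plus a fixed latency charge per message.

```python
def step_time(work_s, nodes, messages_per_step, latency_s):
    """Time for one step: evenly divided compute plus serialized message latency."""
    return work_s / nodes + messages_per_step * latency_s


def speedup(work_s, nodes, messages_per_step, latency_s):
    """Speedup over running the same step on a single node with no messaging."""
    return work_s / step_time(work_s, nodes, messages_per_step, latency_s)


# Hypothetical per-message latencies, chosen only to show the shape of the effect
SLOW_LINK_S = 50e-6  # stand-in for a commodity Ethernet-style link
FAST_LINK_S = 2e-6   # stand-in for a low-latency interconnect

NODES = 4
WORK_S = 1.0         # one second of compute per step on a single node

render_like = speedup(WORK_S, NODES, 10, SLOW_LINK_S)          # few messages
simulation_slow = speedup(WORK_S, NODES, 10_000, SLOW_LINK_S)  # many messages
simulation_fast = speedup(WORK_S, NODES, 10_000, FAST_LINK_S)

print(f"render-like job, slow link:     {render_like:.2f}x")
print(f"simulation-like job, slow link: {simulation_slow:.2f}x")
print(f"simulation-like job, fast link: {simulation_fast:.2f}x")
```

    With these placeholder values the job that rarely communicates scales almost perfectly on the slow link, while the chatty one loses most of its speedup unless the per-message latency drops.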

    Unfortunately these days people really blur the distinction. You'll see systems on the top 500 list that are really questionable. It'll be commodity hardware connected with something like Infiniband. Ok, great, that is faster (both more bandwidth and less latency) than Ethernet, but it still isn't necessarily up to what you'd get from a real supercomputer.

    However in the case of this deal, no, not a super computer. It's a small cluster and they are just calling it a super computer as marketing, effectively.

  48. PS3Wulf by Doc+Ruby · · Score: 2, Informative

    Also in 2003, the University of Illinois at Urbana-Champaign's National Center for Supercomputing Applications built the PS 2 Cluster for about $50,000.

    The PS3 comes out of the box with a Cell uP that gets something like 20 GFLOPS on each $500 PS3. It's already networked into clustered supercomputing like this MicroWulf.

    A $500 PS3 has 20 of the 26.5 GFLOPS the $2800 MicroWulf has. MicroWulf runs Ubuntu, which can also run on PS3. If people can port Linux libraries like Mesa/OpenGL/X to the PS3 SPEs, where most of the power lies, then we'd be looking at $25:GFLOPS, not the $94:GFLOPS on the MicroWulf.
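
    A quick check of the price-per-GFLOPS figures, using only numbers quoted in this comment and the story summary. The function name is invented for the sketch, and the 20 GFLOPS for the PS3 is the comment's own "something like" estimate.

```python
def dollars_per_gflops(price_usd, gflops):
    """Price divided by throughput."""
    return price_usd / gflops


ps3 = dollars_per_gflops(500, 20)
microwulf_at_2800 = dollars_per_gflops(2800, 26.5)  # price as quoted in this comment
microwulf_at_2500 = dollars_per_gflops(2500, 26.5)  # "less than $2,500" from the summary

print(f"PS3:                {ps3:.2f} $/GFLOPS")
print(f"Microwulf at $2800: {microwulf_at_2800:.2f} $/GFLOPS")
print(f"Microwulf at $2500: {microwulf_at_2500:.2f} $/GFLOPS")
```

    The $25 figure falls straight out of $500 / 20. The $94 figure seems to line up with the summary's roughly $2,500 price; at $2800 the ratio would be closer to $106.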

    And while taking a break, you can play Gran Turismo 5, and 40 more games you can afford with the money you save on HW.

    --
    make install -not war

  49. Re:Newbie translation please? by jgunchy · · Score: 2, Funny

    Sounds like somebody is bitter about getting an English degree instead of an Engineering degree...

  50. stable? by wsanders · · Score: 2, Funny

    Don't worry, they're Supermicro, they wouldn't be stable if you cooled them in a swimming pool of liquid nitrogen.

    Still, meh indeed, scrape together the piles of computers your average /.er has in their closet - just imagine a Beowulf cluster of those!

    --
    Give a man a fish and you have fed him for today. Teach a man to fish, and he'll say "WHERE'S MY FISH, YOU IDIOT?"