Slashdot Mirror


Factual 'Big Mac' Results

danigiri writes "Finally Varadarajan has put some hard facts on the speed of the VT 'Big Mac' G5 cluster. Undoubtedly after some weeks of tuning and optimization, the home-brewn supercluster is happily rolling around at 9.555 TFlops in LINPACK. The revelations were made by the parallel computing voodoo master himself at the O'Reilly Mac OS X conference. It seems they are expecting an additional 10% speed boost after some more tweaking. Srinidhi received standing ovations from the audience. Wired news is also running a cool news piece on it. Lots of juicy technical and cost details not revealed before. Myth dispelling redux: yes, VT paid full price, yes, it's running Mac OS X Jaguar (soon Panther), yes, errors in RAM are accounted for, Varadarajan was not an Apple fanboy in the least... read the articles for more booze."

19 of 566 comments

  1. Brewn? by FatAlb3rt · · Score: 2, Interesting

    Is that a word? How about brewed? Hate to nit, but .... aw... nevermind.

  2. interesting points by kaan · · Score: 5, Interesting

    I think it's interesting that he wasn't a Mac fan at all before this project. He says he chose it because it had better performance than everything else out there ("Ironically, they lost the gigahertz game," he said of Intel. "(The G5) is extremely faster than the Itanium II, hands down."), and was cheaper too (Dell and other manufacturers quoted prices between $10 and $12 million, vs. the $5.2 million for G5s).

    What more do you need? Faster systems, cheaper total cost, and slick looking cases.

    1. Re:interesting points by stevesliva · · Score: 2, Interesting
      Intel wants to market Itanium as a server chip. That means that they are putting 3MB or 6MB on the high end Itaniums. Soon they will have a 9MB cache version. Lots of cache means lots of transistors means lots of heat.
      I don't see your point here. More cache does not make it a better processor architecture.
      Intel is not fabbing Itanium with a state of the art process. Intel leads the world in process technology, yet their Itanium is still on a 130nm process.
      The PPC970 and Power4+ are both fabricated in 130nm technologies. Better silicon does not make it a better processor architecture.

      Speaking of cache, somewhat under-reported in the technical press was IBM's revelation of its upcoming Power5 server architecture. Yup, that's four dual-core processors each with 2MB of L2 cache, and four 36MB L3 cache chips all in the same package. IBM is leveraging its packaging advantages against Intel's process advantages. Well, that, and making each processor die dual-core multithreaded.

      --
      Who do you get to be an expert to tell you something's not obvious? The least insightful person you can find? -J Roberts
    2. Re:interesting points by RzUpAnmsCwrds · · Score: 3, Interesting

      "Itanium2 is only available up to about 1.3 Ghz."

      If by "about 1.3Ghz", you mean 1.5Ghz, then, yes, Itanium only goes up to 1.5Ghz. But at 1.5Ghz it is faster than the fastest 3.2Ghz Pentium 4. With a decent process and less cache, it could easily scale to 2+ Ghz.

      " but the Itanium is neither cheap nor cool (130W!)"

      This has to do with the fact that the CPU has 3MB of cache on it. That makes the die huge which makes the CPU expensive. It also makes it heat up like a toaster. As a comparison, the latest Pentium 4s are ~90W, and they only have 512K of cache.

      "In the performance arena, Moore's law is useless unless chip designers figure out how to use MORE transistors to compute more quickly."

      My statement was that, for a given performance level, Itanium uses fewer transistors than RISC. Itanium was *designed* to use more transistors. That's why the instruction set is designed to produce code that runs well in parallel. RISC CPUs have to figure out what can be run in parallel in hardware - Itanium does it in the compiler.

  3. Re:Full price by Zelet · · Score: 2, Interesting

    They costed the G5 against Dell and IBM offerings and the Apple solution was cheaper. Where did you get your numbers? Why don't you go out and price out a Supercomputer for me will ya? Of course you know that it isn't feasible to BUILD 1100 units.

    --
    "...And when they came for me, there was no one left to speak out for me." - Martin Niemoeller (1892-1984)
  4. Dumb Question... by devphaeton · · Score: 4, Interesting

    ....maybe i'm obtuse, but i keep hearing about this thing as "..and we're only seeing X% of its real potential right now!"....

    1) Why can't they just shout "Let 'er rip!!" and crank the thing wide open?

    2) Why all the media buzz concerning this as a `surprise' when they've already got its performance figured out, apparently?

    Sorry.

    --


    do() || do_not(); // try();
    1. Re:Dumb Question... by Blimey85 · · Score: 2, Interesting
      They did have specs beforehand. They said ok, we take this many and the max theoretical performance is X. We scale that back to Y percent and that's what we will likely achieve. We need to get to Z performance level and Y percent of X is above the Z threshold so we're good to go. Now let's talk price. It's the cheapest available and they can get it to us to meet our deadline? Great. Let's order.

      They knew in advance what they could likely achieve with this cluster and they have surpassed what they were expecting. Now with some more tweaking they may take it a bit further. It's like a race car engine, you know the specs but once you get it and tune it you can often surpass the specs by a wide margin.

      --
      How is it that one careless match can start a forest fire, but it takes a whole box to start a campfire?
  5. Too bad some software patents will be filed by Colonel+Panic · · Score: 3, Interesting

    Varadarajan told the audience he would publish full documentation and release most of the code written for the machine. However, some of the software is subject to patent applications, he said, and he wasn't yet sure if it would be released under an open-source license.

    What's up with that?
    Used to be that work like this done at a University was considered 'open' as in available to anyone to help advance the state-of-the-art. Not anymore...

  6. Re:Super computer? by Carnildo · · Score: 2, Interesting

    A "supercomputer" is usually one that is optimized for vector operations: operations that take a data set, and perform the same operation on each element of that data set -- sort of a "Super SIMD/SSE/AltiVec/whatever". Your desktop computer is designed around performing a series of different operations on a single data element at a time. The graphics card of your computer could be considered a very specialized supercomputer.

    In terms of raw processing power, the computer on your desk is more powerful than an early Cray. But if you tried to do weather modelling or finite element analysis with both, the Cray would win.

    --
    "They redundantly repeated themselves over and over again incessantly without end ad infinitum" -- ibid.
  7. Re:Full Price? WHY?!? by david614 · · Score: 2, Interesting

    Well, I *do* live in Virginia - - and this is one of the greatest things to happen at a publicly funded University in years! Great science, ingenuity, huge potential. Now *that* is why public funding is an essential part of R&D. D

    --
    ELITISM: It's always lonely at the top. Uninvited company is rarely welcome.
  8. Power PC 970 and G5 by mojowantshappy · · Score: 3, Interesting
    From the O'Reilly article:

    "The IBM with a PowerPC 970 was a first choice but the earliest delivery date would have been January 2004."

    "On June 23 Apple announced the G5."

    I was under the impression that the G5 was a Power PC 970. Is it just some derivative of the Power PC 970... or what?

    --

    This page was generated by a Barrel of Circus Midgets, and that is the way I like it!!!

  9. Memory errors? by Hoser+McMoose · · Score: 2, Interesting

    I keep seeing reference to some sort of software that will defeat hardware memory errors.

    How, pray tell, are they planning on detecting these errors? I can understand how you could reduce the frequency of errors with only a slight loss in performance, ie take some sort of checksum of your data after every x number of cycles, but that doesn't eliminate the errors, only reduces their frequency. Maybe it reduces the frequency by enough that you don't need to worry about it, especially if 'x' is a sufficiently small number, but it still seems like a pretty risky prospect to me.

    Anyone seen any actual TECHNICAL details on this point, ie not just some Mac fan yelling "Deja Vu, DEJA VU!!!"?

    1. Re:Memory errors? by stacko · · Score: 2, Interesting

      I'm just guessing, but you'd probably implement the same ECC mechanism in software that ECC memory does in hardware.

      A quick google shows that ECC memory typically uses Hamming codes (or similar variations), which is pretty much what you'd expect. Skimming a few of the links, it would appear that most ECC memory is designed to correct a 1-bit error on a word. It is entirely possible that you can have the right combination of bit-errors that will slip past the ECC, regardless of whether it was implemented in hardware or software.

      It does seem a bit tedious to implement it in software, though. Each read and write to memory would have to be wrapped in the code that reads/detects or generates/writes the ECC bits to another location in memory.

      For the curious, you can learn more about Hamming codes here.
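
      To make the idea above concrete, here is a minimal sketch of single-error correction done in software with a Hamming(7,4) code. This is only an illustration of the general technique, not a description of what the VT team actually wrote. Real ECC memory commonly protects much wider words, and the function names `encode` and `decode` are made up for this example.

```python
# Toy Hamming(7,4) code: 4 data bits protected by 3 parity bits.
# Positions are numbered 1..7; parity bits sit at positions 1, 2 and 4,
# data bits at positions 3, 5, 6 and 7. (Illustrative names, not a real API.)

def encode(data):
    """Return a 7-bit codeword (list of 0/1) for a list of 4 data bits."""
    code = [0] * 8                      # index 0 unused, so indices match positions
    code[3], code[5], code[6], code[7] = data
    code[1] = code[3] ^ code[5] ^ code[7]
    code[2] = code[3] ^ code[6] ^ code[7]
    code[4] = code[5] ^ code[6] ^ code[7]
    return code[1:]

def decode(codeword):
    """Return (data_bits, syndrome); a nonzero syndrome names the bit that was flipped back."""
    code = [0] + list(codeword)
    s1 = code[1] ^ code[3] ^ code[5] ^ code[7]
    s2 = code[2] ^ code[3] ^ code[6] ^ code[7]
    s4 = code[4] ^ code[5] ^ code[6] ^ code[7]
    syndrome = s1 + 2 * s2 + 4 * s4
    if syndrome:
        code[syndrome] ^= 1             # correct the single bit the syndrome points at
    return [code[3], code[5], code[6], code[7]], syndrome

word = [1, 0, 1, 1]
stored = encode(word)

# Simulate one flipped bit in "memory": the decoder recovers the data.
one_error = list(stored)
one_error[2] ^= 1
print(decode(one_error)[0] == word)

# Simulate two flipped bits: the decoder "corrects" the wrong bit
# and silently returns bad data, which is the failure mode described above.
two_errors = list(stored)
two_errors[0] ^= 1
two_errors[5] ^= 1
print(decode(two_errors)[0] == word)
```

      Every read and write would have to go through something like `decode` and `encode`, which is where the tedium (and the performance cost) comes from.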

  10. Re:Full price? by OECD · · Score: 4, Interesting

    You'd think apple would at least sell G5's to VT without SuperDrives

    OTOH, five years from now, when they have the world's 65,000th fastest supercomputer, they could just pull the thing apart and give/sell complete computers to their students. Then it's back to the Apple Store to order up a whole lot of G7's.

    --
    One man's -1 Flamebait is another man's +5 Funny.
  11. Re:Anyone find the efficiency of this thing? by Hoser+McMoose · · Score: 5, Interesting

    The efficiency is quite poor for this machine, at least as far as efficiency is termed for supercomputers. The cluster has a theoretical peak of 17.6TFlops/s if I did my math right (8GFlops/s per processor), but they are only turning in an actual score of 9.56TFlops/s, for an efficiency of only 54%. Even if they boost performance by 10%, they'll still only be ~60% efficient.

    For comparison, ASCI Q (#2 on Top500) reaches 68% efficiency, MCR Linux Cluster (currently #3, but to be pushed down by this new Mac cluster) reaches 69% efficiency, and the #1 spot, Earth Simulator, reaches a quite impressive 88% efficiency.

    Of course, there are other ways to measure efficiency. When it comes to performance/price, this Mac cluster does very well, even if you do take into account the real costs (ie MUCH more than just the $5.2 million up front cost). For performance/power consumption it seems reasonable, but not outstanding. 10TFlops/1.5MW of power is ok, and not too far off the Earth Simulator's 35TFlops/3.5MW of power, but it's certainly nothing to write home about. Cray's next big cluster, Red Storm, is likely to get over 30TFlops when it's released, but will consume only 2.0MW of power.
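
    For anyone who wants to check the arithmetic in this comment, here is a small sketch. It assumes 1100 nodes with two processors each (the dual-G5 tower count mentioned elsewhere in the thread) and uses the 9.555 TFlops LINPACK figure from the story; the function names are just for illustration.

```python
# Figures quoted in the thread. Node count and CPUs per node are assumptions
# taken from the "1100 dual G5 towers" description in other comments.
NODES = 1100
CPUS_PER_NODE = 2
PEAK_GFLOPS_PER_CPU = 8.0
MEASURED_TFLOPS = 9.555

def peak_tflops(nodes, cpus_per_node, gflops_per_cpu):
    """Theoretical peak of the whole cluster, in TFlops."""
    return nodes * cpus_per_node * gflops_per_cpu / 1000.0

def efficiency(measured_tflops, peak):
    """Fraction of theoretical peak actually achieved on the benchmark."""
    return measured_tflops / peak

def tflops_per_megawatt(tflops, megawatts):
    """Performance per unit of power, using the rough figures quoted above."""
    return tflops / megawatts

peak = peak_tflops(NODES, CPUS_PER_NODE, PEAK_GFLOPS_PER_CPU)
now = efficiency(MEASURED_TFLOPS, peak)
tuned = efficiency(MEASURED_TFLOPS * 1.10, peak)   # hoped-for 10% boost

print(f"peak: {peak:.1f} TFlops")
print(f"efficiency now: {now:.1%}, after 10% boost: {tuned:.1%}")
print(f"Big Mac: {tflops_per_megawatt(10, 1.5):.1f} TFlops/MW")
print(f"Earth Simulator: {tflops_per_megawatt(35, 3.5):.1f} TFlops/MW")
print(f"Red Storm (projected): {tflops_per_megawatt(30, 2.0):.1f} TFlops/MW")
```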

  12. Re:Anyone find the efficiency of this thing? by Anonymous Coward · · Score: 1, Interesting

    When it comes to performance/price, this Mac cluster does very well, even if you do take into account the real costs (ie MUCH more than just the $5.2 million up front cost).

    No.

    If you're going to measure the gigaflops per dollar of a computing system and use that to compare one computing system to another, you have to normalize all variables. If you're going to count the cost of the building, then you have to count the cost of the building the Earth Simulator is in, too.

    Either way, the Virginia cluster is the most cost-effective supercomputer ever constructed.

    Run the numbers for yourself.

  13. Re:building supercomputer with desktops sucks by Ffakr · · Score: 2, Interesting

    I'm sure VT would have gone rack if possible, and I've heard a side benefit of the current setup is that, as new nodes become available they will be able to 'retire' the nodes to desktop duty for the staff around campus. A dual G5 should be able to run office pretty well, even in a few years. ;-)

    Also, I've heard that the system controller supports 16GB of ram but that Apple has only certified 1GB DIMMs so far. This would seem likely as a lot of Macs can accept more memory than initially advertised... only because larger memory modules became common (I put 1GB of ram in an old wallstreet G3 powerbook for someone and got it running even though it's officially rated at 512MB... I've got a sony from the same period here that absolutely won't take more than 256MB in two slots)

    --

    I'm not feeling witty so bite me

  14. Re:Favorite Quote - Correction About Apple by mduell · · Score: 2, Interesting

    Ah, finally someone who is actually involved with the project. Can you tell me what the total cost of the supercomputer was?
    The $5.2M figure seems to just be the Towers (Dual 2Ghz + 4GB RAM is $4814 with the standard educational discount, multiply by 1100 and you get $5295400). What was the additional cost of the Infiniband cards and switches, the Cisco switches, the racks, and the cooling equipment? Were any modifications necessary for the building (more power, etc)?

  15. Answers by daveschroeder · · Score: 2, Interesting

    From http://macslash.org/article.pl?sid=03/10/28/2357235&mode=thread:

    "The total cost of the asset, including systems, memory, storage, primary and secondary communications fabrics and cables is $5.2mil. Facilities upgrade was $2mil. 1mil for the upgrades, 1mil for the UPS and generators."

    Total: $7.2M + essentially "volunteer" assembly.

    So it's still a LOT cheaper than anything even close to comparable.
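
    A quick sketch of the cost arithmetic, for those who want to run the numbers. It uses only figures quoted in this thread ($4814 per tower at the educational price, 1100 towers, $5.2M asset cost, $2M facilities, 9.555 TFlops measured); the variable and function names are illustrative, and the dollars-per-GFlops result is only as good as those inputs.

```python
# All inputs are figures quoted in the thread, not independently verified.
TOWER_PRICE_USD = 4814          # dual 2GHz G5 + 4GB RAM, educational price
TOWER_COUNT = 1100
ASSET_COST_USD = 5_200_000      # systems, memory, storage, fabrics, cables
FACILITIES_COST_USD = 2_000_000 # building upgrades plus UPS and generators
MEASURED_GFLOPS = 9555          # 9.555 TFlops on LINPACK

def cost_per_gflops(total_cost_usd, gflops):
    """Dollars spent per sustained GFlops on the benchmark."""
    return total_cost_usd / gflops

towers_at_list = TOWER_PRICE_USD * TOWER_COUNT
total_cost = ASSET_COST_USD + FACILITIES_COST_USD

print(f"towers at educational price: ${towers_at_list:,}")
print(f"total quoted cost: ${total_cost:,}")
print(f"asset only: ${cost_per_gflops(ASSET_COST_USD, MEASURED_GFLOPS):,.0f} per GFlops")
print(f"with facilities: ${cost_per_gflops(total_cost, MEASURED_GFLOPS):,.0f} per GFlops")
```

    Comparing against other machines would need their costs counted the same way (hardware only, or hardware plus facilities), which this thread does not provide.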