Slashdot Mirror


IEEE Says Multicore is Bad News For Supercomputers

Richard Kelleher writes "It seems the current design of multi-core processors is not good for the design of supercomputers. According to IEEE: 'Engineers at Sandia National Laboratories, in New Mexico, have simulated future high-performance computers containing the 8-core, 16-core, and 32-core microprocessors that chip makers say are the future of the industry. The results are distressing. Because of limited memory bandwidth and memory-management schemes that are poorly suited to supercomputers, the performance of these machines would level off or even decline with more cores.'"

5 of 251 comments (clear)

  1. Time for vector processing again by suso · · Score: 5, Insightful

    Sounds like its time for supercomputers to go their own way again. I'd love to see some new technologies.

    1. Re:Time for vector processing again by virtual_mps · · Score: 5, Insightful

      It's very simple. Intel & AMD spend about $6bn/year on R&D. The total supercomputing market is on the order of $35bn (out of a global IT market on the order of $1000bn) and a big chunk of that is spent on storage, people, software, etc., rather than processors. That market simply isn't large enough to support an R&D effort which will consistently outperform commodity hardware at a price people are willing to pay. Even if a company spent a huge amount of money developing a breakthrough architecture which dramatically outperformed existing hardware, the odds are that the commodity processors would catch up before that innovator recouped its development costs. Certainly they'd catch up before everyone rewrote their software to take advantage of the new architecture. The days when Seymour Cray could design a product which was cutting edge & saleable for a decade are long gone.

    2. Re:Time for vector processing again by AlpineR · · Score: 5, Insightful

      My supercomputing tasks are computation-limited. Multicores are great because each core shares memory and they save me the overhead of porting my simulations to distributed memory multiprocessor setups. I think a better summary of the study is:

      Faster computation doesn't help communication-limited tasks. Faster communication doesn't help computation-limited tasks.

    3. Re:Time for vector processing again by knails · · Score: 5, Insightful

      No, proper spelling and grammar are important for everyone, not just english majors. With computers so important, if the computer professionals cannot use the language correctly, then who will? We cannot let ignorant people degrade the quality of language and therefore remove beauty and subtle distinctions between similar words just because they're too lazy to conform to standards. If a linguist misused/ignored computing standards, would you not correct them, even though it's not their chosen field of study?

      --
      "I disapprove of what you say, but I'll defend to the death your right to say it" -Voltaire
  2. Unganged channels = already non shared lanes today by DrYak · · Score: 5, Insightful

    The issue is with a single processor that has multiple cores.
    There's no real way to split the banks for each core, so the net effect is that you have 4-32 cores sharing the same lanes for memory.

    No, sorry. That's how Phenom processor are *Already* working.

    Each physical CPU package has two 64-bit memory controllers, each controlling a separate bank of 64bits DDR-2 memory chips. (Each of the two bank in a dual channel mother board).

    Phenom have two mode of function :
    - Ganged : both memory controllers work in parallel, working as if they were a huge 128bits memory connection. That's how dual channel has worked since it was invented.
    That's good for system running few very bandwidth-hungry applications (for example : benchmarks)

    - Unganged : each memory controller work on its own. Thus you have two completely separate 64bits memory channel accessible at the same time. By correctly lying the applications in memory thanks to a NUMA-aware OS (anything better than Windows Vista), that means that two separate applications can simultaneously access each one's memory at the exact same moment, although at only half the bandwith *per process* (but still the same total of bandwidth for all processes running at the same time on a multi core chip).
    This is perfect for systems running lots of tasks in parallel, and is the default mode on most BIOSes I've seen.

    This gives a tremendous boost to heavily multi-tasked applications (a busy database server, for example), and it's what TFA's author are looking for.

    Probably that at some point in the future, Intel will follow the same trend with its QPI processors.

    Also, the future trend is to multiply the memory channels on the CPU: Intel has already planned Triple Channel DDR-3 for their high-end server Xeons (the first crop of QPI chips). AMD has announced 4 memory channels for their future 6- and 12- core chips targeting the G34 socket.

    So the net effect of Unganged Dual Channel is that today you already have 4 cores having a choice of 2 sets of memory lanes to choose among, and within 1 year, you'll have 6-to-12 cores sharing 4 sets of memory lanes.

    By the time you reach 32 cores on CPU, probably that almost each slot will have its own dedicated memory channel (probably with the help of some technology which communicates serially with fewer lines, like FB-DIMM). Or even weirder memory interfaces (who knows ? maybe DDR-6 will be able to give several simultaneous access to the same memory module).

    So, well, once again, it proves that running stupid simulations without taking into account that other technologies will improves beside the number of cores* yields stupid non realistic results.

    Shame on TFA's Author, because the trends to increase bandwith have already started. I little bit more background research would have avoided this kind of stupidity.
    But on the other hand, they would have missed the opportunity to publish an alarmist article with an eye catching title.

    --

    *: Although, yes, the number of cores you can slap inside the same package seems to be the "new megahertz" in the manufacturers' race, with some like Intel trying to increase this number faster without putting so much efforts on the rest.

    --
    "Sufficiently advanced satire is indistinguishable from reality." - [Tips: 1DrYakQDKCQ6y52z6QbnkxHXAocMZJE61o ]