Slashdot Mirror


Intel Says to Prepare For "Thousands of Cores"

Impy the Impiuos Imp writes to tell us that in a recent statement Intel has revealed their plans for the future and it goes well beyond the traditional processor model. Suggesting developers start thinking about tens, hundreds, or even thousand or cores, it seems Intel is pushing for a massive evolution in the way processing is handled. "Now, however, Intel is increasingly 'discussing how to scale performance to core counts that we aren't yet shipping...Dozens, hundreds, and even thousands of cores are not unusual design points around which the conversations meander,' [Anwar Ghuloum, a principal engineer with Intel's Microprocessor Technology Lab] said. He says that the more radical programming path to tap into many processing cores 'presents the "opportunity" for a major refactoring of their code base, including changes in languages, libraries, and engineering methodologies and conventions they've adhered to for (often) most of the their software's existence.'"

6 of 638 comments (clear)

  1. Re:Not Sure I'm Getting It by Delwin · · Score: 5, Informative

    Because each core is no longer task switching. Once you have more cores than tasks you can remove all the context switching logic and optimize the cores to run single processes as fast as possible.

    Then you take the tasks that can be broken up over multiple cores (Ray Tracing anyone?) and fill the rest of your cores with that.

  2. Re:Not Sure I'm Getting It by k8to · · Score: 5, Informative

    True but misleading. The major cost of task switching is a hardware-derived one. It's the cost of blowing caches. The swapping of CPU state and such is fairly small by comparison, and the cost of blowing caches is only going up.

    --
    -josh
  3. Re:Not Sure I'm Getting It by skulgnome · · Score: 5, Informative

    No. I/O is the slowdown in multitasking OSes.

  4. Re:Not Sure I'm Getting It by Salamander · · Score: 5, Informative

    Because each core is no longer task switching. Once you have more cores than tasks you can remove all the context switching logic and optimize the cores to run single processes as fast as possible.

    OK, so now the piece that's running on each core runs really really fast . . . until it needs to wait for or communicate with the piece running on some other core. If you can do your piece in ten instructions but you have to wait 1000 for the next input to come in, whether it's because your neighbor is slow or because the pipe between you is, then you'll be sitting and spinning 99% of the time. Unfortunately, the set of programs that decompose nicely into arbitrarily many pieces that each take the same time (for any input) doesn't extend all that far beyond graphics and a few kinds of simulation. Many, many more programs hardly decompose at all, or still have severe imbalances and bottlenecks, so the "slow neighbor" problem is very real.

    Many people's answer to the "slow pipe" problem, on the other hand, is to do away with the pipes altogether and have the cores communicate via shared memory. Well, guess what? The industry has already been there and done that. Multiple processing units sharing a single memory space used to be called SMP, and it was implemented with multiple physical processors on separate boards. Now it's all on one die, but the fundamental problem remains the same. Cache-line thrashing and memory-bandwidth contention are already rearing their ugly heads again even at N=4. They'll become totally unmanageable somewhere around N=64, just like the old days and for the same reasons. People who lived through the last round learned from the experience, which is why all of the biggest systems nowadays are massively parallel non-shared-memory cluster architectures.

    If you want to harness the power of 1000 processors, you have to keep them from killing each other, and they'll kill each other without even meaning to if they're all tossed in one big pool. Giving each processor (or at least each small group of processors) its own memory with its own path to it, and fast but explicit communication with its neighbors, has so far worked a lot better except in a very few specialized and constrained cases. Then you need multi-processing on the nodes, to deal with the processing imbalances. Whether the nodes are connected via InfiniBand or an integrated interconnect or a common die, the architectural principles are likely to remain the same.

    Disclosure: I work for a company that makes the sort of systems I've just described (at the "integrated interconnect" design point). I don't say what I do because I work there; I work there because of what I believe.

    --
    Slashdot - News for Herds. Stuff that Splatters.
  5. Re:Not Sure I'm Getting It by Erich · · Score: 5, Informative
    Single Address Space is horrible.

    It's a huge kludge for idiotic processors (like arm9) that don't have physically-tagged caches. On all non-incredibly-sucky processors, we have physically tagged caches, and so having every app have its own address space, or having multiple apps share physical pages at different virtual addresses, all of these are fine.

    Problems with SAS:

    • Everything has to be compiled Position-independent, or pre-linked for a specific location
    • Virtual memory fragmentation as applications are loaded and unloaded
    • Where is the heap? Is there one? Or one per process?
    • COW and paging get harder
    • People start using it and think it's a good idea.

    Most people... even people using ARM... are using processors with physically-tagged caches. Please, Please, Please, don't further the madness of single-address-space environments. There are still people encouraging this crime against humanity.

    Maybe I'm a bit bitter, because some folks in my company have drunk the SAS kool-aid. But believe me, unless you have ARM9, it's not worth it!

    --

    -- Erich

    Slashdot reader since 1997

  6. Re:The thing's hollow - it goes on forever by dryeo · · Score: 5, Informative

    And before they made it into a movie it was an interesting short story. http://en.wikipedia.org/wiki/The_Sentinel_(short_story)
    If you'd like to read it, seems it is this PDF, http://econtent.typepad.com/TheSentinel.pdf

    --
    https://en.wikipedia.org/wiki/Inverted_totalitarianism