Slashdot Mirror


Scaling To a Million Cores and Beyond

mattaw writes "In my blog post I describe a system designed to test a route to the potential future of computing. What do we do when we have computers with 1 million cores? What about a billion? How about 100 billion? None of our current programming models or computer architecture models apply to machines of this complexity (and with their corresponding component failure rate and other scaling issues). The current model of coherent memory/identical time/everything can route to everywhere; it just can't scale to machines of this size. So the scientists at the University of Manchester (including Steve Furber, one of the ARM founders) and the University of Southampton turned to the brain for a new model. Our brains just don't work like any computers we currently make. Our brains have a lot more than 1 million processing elements (more like the 100 billion), all of which don't have any precise idea of time (vague ordering of events maybe) nor a shared memory; and not everything routes to everything else. But anyone who argues the brain isn't a pretty spiffy processing system ends up looking pretty silly. In effect, modern computing bears as much relation to biological computing as the ordered world of sudoku does to the statistical chaos of quantum mechanics.

1 of 206 comments (clear)

  1. Problems with this blog. by Anonymous Coward · · Score: 4, Informative

    The problem posed by the author is somewhat of a straw man argument: "The trouble is once you go to more than a few thousand cores the shared memory - shared time concept falls to bits."

    Multiple processors in a single multicore aren't required even today to be in lockstep in time (it is actually very difficult to do this). Yes, locally within each core and privates caches they do maintain a synchronous clock, but cores can run in their own clock domains. So I don't buy the argument about scaling with "shared time".

    Secondly, the author states that the "future" of computing should automatically be massively parallel. Clearly they are forgetting about Amdahl's Law (http://en.wikipedia.org/wiki/Amdahl's_law). If your application is 99.9% parallelizable, the MOST speedup I can expect to achieve is 1000X, forget about millions. High sequential performance (ala out-of-order execution, etc.) will not be going away anytime in the near future simply because they are best equipped to deal with serial regions of an application.

    Finally, I was under the impression that they were talking about fitting "millions" of cores onto a single die, until I read the to the end of the post that they are connecting multiple boards via multi-gigabit links. Each chip on a board has about 20 or so cores with privates caches or local store. They talk to other cores on other boards through off-chip links...... SO isn't this just a plain old message passing computer?! What's the novelty here? Am I missing something?