Scaling To a Million Cores and Beyond

Posted by kdawson on Tuesday June 29, 2010 @06:13PM from the can't-get-there-from-here dept.

mattaw writes "In my blog post I describe a system designed to test a route to the potential future of computing. What do we do when we have computers with 1 million cores? What about a billion? How about 100 billion? None of our current programming models or computer architecture models apply to machines of this complexity (and with their corresponding component failure rate and other scaling issues). The current model of coherent memory, identical time, and everything being able to route to everywhere just can't scale to machines of this size. So the scientists at the University of Manchester (including Steve Furber, one of the ARM founders) and the University of Southampton turned to the brain for a new model. Our brains just don't work like any computers we currently make. Our brains have a lot more than 1 million processing elements (more like 100 billion), none of which has any precise idea of time (vague ordering of events, maybe) or a shared memory, and not everything routes to everything else. But anyone who argues the brain isn't a pretty spiffy processing system ends up looking pretty silly. In effect, modern computing bears as much relation to biological computing as the ordered world of sudoku does to the statistical chaos of quantum mechanics."

19 of 206 comments

multi core design by girlintraining · 2010-06-29 18:22 · Score: 4, Insightful

Simply put, there are some computational problems that work well with parallelization. And there are some that, no matter how you approach them, come back to a serial model. You could have a billion-core machine running at 1 GHz get stomped by a single-core machine running at 1.7 GHz for certain computational processes. We have yet to find a way, computationally or mathematically, to turn intrinsically serial problems into parallel ones. If we did, it would probably open up a whole new field of mathematics.

--
#fuckbeta #iamslashdot #dicemustdie
1. Re:multi core design by jd · 2010-06-29 18:30 · Score: 4, Interesting
  
  You cannot parallelize a serial task, any more than you can have 60 people dig one posthole in one second. On the other hand, there are MANY tasks that are inherently parallel but which are serialized because either the programmers aren't up to the task, the OS isn't up to the task or the CPU isn't up to the task.
  (I don't know if kernel threads under Linux will be divided between CPUs in an SMP system, but they certainly can't migrate across motherboards in any MOSIX-type project. That limits how parallel the bottlenecks in the program can ever become. And it's one of the best OSes out there.)
  
  --
  It's a small world and it smells funny; I'd buy another if it wasn't for the money; Take back what I paid (SoM)
2. Re:multi core design by Anonymous Coward · 2010-06-29 19:25 · Score: 3, Interesting
  
  If one is willing to use the transistors of a billion-core machine to speed up a single problem anyway, some transistors could be better used to accelerate the specific problem in the form of custom circuits. There could be a layered structure in the system similar to the one the brain uses. The lowest, fastest and task-customized levels could be built using hard-wired logic, the layer above it using slowly reconfigurable circuits with very fast switching speeds, the layers above that using easily reconfigurable circuits with slower switching speeds, and the highest level using normal general-purpose logic. A higher level could train and use the services of a lower level, just like the brain might do in a case of a phobia, trauma or some psychosomatic condition. New subfields of computer science and computer engineering, the computer psychologist and the computer psychiatrist, might be created out of necessity.
3. Re:multi core design by jd · 2010-06-29 20:10 · Score: 3, Informative
  
  You've got to be careful when talking about threads. There are four basic models: SISD, SIMD, MISD and MIMD. Of those, only SISD is serial, but if you've two independent SISD tasks, you can run them in parallel. Most modern supercomputers are built on the premise that SIMD is good enough. Not sure where MISD is used; MIMD fell out of favour when vector processors became too expensive but may be revived on more modest CPUs with modern interconnects like InfiniBand.
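
For anyone who doesn't have the four names memorized: they are Flynn's taxonomy, and each one just says whether there is one or many instruction streams and one or many data streams. A minimal sketch of that mapping follows; the names `flynn_class`, `flynn_classify` and `flynn_name` are made up for illustration and come from no library.

```c
#include <assert.h>
#include <stdbool.h>

/* The four classes named in the comment above (Flynn's taxonomy). */
enum flynn_class { SISD, SIMD, MISD, MIMD };

/* Classify a machine by whether it runs multiple instruction streams
 * and whether it works on multiple data streams at the same time.
 * flynn_classify is a hypothetical helper, for illustration only. */
enum flynn_class flynn_classify(bool multiple_instruction_streams,
                                bool multiple_data_streams)
{
    if (multiple_instruction_streams)
        return multiple_data_streams ? MIMD : MISD;
    return multiple_data_streams ? SIMD : SISD;
}

/* Printable name for a class. */
const char *flynn_name(enum flynn_class c)
{
    static const char *const names[] = { "SISD", "SIMD", "MISD", "MIMD" };
    assert((int)c >= 0 && (int)c <= 3);
    return names[c];
}
```

Under this mapping, several different algorithms chewing on the same data (many instruction streams, one data stream) would come out as MISD.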
  
  --
  It's a small world and it smells funny; I'd buy another if it wasn't for the money; Take back what I paid (SoM)
4. Re:multi core design by William+Robinson · 2010-06-29 20:45 · Score: 5, Interesting
  
  Not sure where MISD is used
  Back in 1987, I was part of a team that was designing a parallel processing machine, with 4 neighboring CPUs sharing common memory (apart from their own local memory, kind of a systolic array). The machine was meant to simulate aerodynamics or do weather forecasting using diffusion equations. We believed that it was working on the MISD model, where different algorithms running on different CPUs used the same data for analysis, using bus arbitration logic.
  
  --
  hilarious
5. Re:multi core design by WillDraven · 2010-06-30 02:38 · Score: 4, Funny
  
  You cannot parallelize a serial task, any more than you can have 60 people dig one posthole in one second.
  We do it all the time around here:
  1 to operate the pile driver
  2 holding up stop/slow signs
  3 riding in the "follow me" vehicle
  4 standing around supervising
  5 cops writing tickets in the surrounding 8 mile work zone
  10 administrators to approve the project
  15 residents jumping out of bed at 6am thinking it's a bomb going off
  20 people sitting in their cars honking their horns for motivational support
  Of course the whole procedure and traffic carnage can last for months or years, but the actual post being rammed in only takes a second. ;-)
  
  --
  This is my sig. There are many like it but this one is mine.
Bluudy Blogs by jd · 2010-06-29 18:25 · Score: 3, Informative
I've left out links to some projects, by request, but everything can be found on their homepage anyway. Anyways, it is this combination that is important, NOT one component alone.
--
It's a small world and it smells funny; I'd buy another if it wasn't for the money; Take back what I paid (SoM)
Re:Better be running OSS by jd · 2010-06-29 18:38 · Score: 3, Interesting

I don't know about this specific project, but Manchester is strongly Open Source. The Manchester Computer Centre developed one of the first Linux distributions (and - at the time - one of the best). The Advanced Processor Technologies group has open-sourced software for developing asynchronous microelectronics and FPGA design software.
Manchester University is highly regarded for pioneering work (they were working on parallel systems in 1971, and developed the first stored-program computer in 1948) and they have never been ashamed to share what they know and do. (Disclaimer: I studied at and worked at UMIST, which was bought by Manchester, and my late father was a senior lecturer/reader of Chemistry at Manchester. I also maintain Freshmeat pages for the BALSA projects at APT.)

--
It's a small world and it smells funny; I'd buy another if it wasn't for the money; Take back what I paid (SoM)
Problems with this blog. by Anonymous Coward · 2010-06-29 18:43 · Score: 4, Informative

The problem posed by the author is somewhat of a straw man argument: "The trouble is once you go to more than a few thousand cores the shared memory - shared time concept falls to bits."
Multiple processors in a single multicore chip aren't required even today to run in lockstep (that is actually very difficult to do). Yes, locally within each core and its private caches they do maintain a synchronous clock, but cores can run in their own clock domains. So I don't buy the argument about scaling with "shared time".
Secondly, the author states that the "future" of computing should automatically be massively parallel. Clearly they are forgetting about Amdahl's Law (http://en.wikipedia.org/wiki/Amdahl's_law). If your application is 99.9% parallelizable, the MOST speedup I can expect to achieve is 1000X, forget about millions. High sequential performance (à la out-of-order execution, etc.) will not be going away anytime in the near future, simply because it is best equipped to deal with the serial regions of an application.
Finally, I was under the impression that they were talking about fitting "millions" of cores onto a single die, until I read to the end of the post and saw that they are connecting multiple boards via multi-gigabit links. Each chip on a board has about 20 or so cores with private caches or local store. They talk to cores on other boards through off-chip links... So isn't this just a plain old message-passing computer?! What's the novelty here? Am I missing something?
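
The 1000X ceiling quoted for a 99.9% parallelizable application can be checked with a few lines of C. This is only a sketch, assuming the usual textbook form of Amdahl's Law, speedup = 1 / ((1 - p) + p / n), for a fixed problem size; the function names `amdahl_speedup` and `amdahl_limit` are invented here for illustration.

```c
#include <assert.h>

/* Amdahl's Law for a fixed-size problem: speedup on n cores when a
 * fraction p of the work (0 <= p <= 1) can be parallelized.
 * amdahl_speedup is a hypothetical helper name, not from the post. */
double amdahl_speedup(double p, double n)
{
    assert(p >= 0.0 && p <= 1.0);
    assert(n >= 1.0);
    return 1.0 / ((1.0 - p) + p / n);
}

/* Ceiling on the speedup as the core count grows without bound:
 * 1 / (1 - p). Only defined for p < 1. */
double amdahl_limit(double p)
{
    assert(p >= 0.0 && p < 1.0);
    return 1.0 / (1.0 - p);
}
```

With p = 0.999, `amdahl_speedup(0.999, 1e6)` comes out just under 1000, and `amdahl_limit(0.999)` is the 1000X bound itself, so a million cores buy almost nothing over a few thousand for that workload.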
Re:Better be running OSS by capo_dei_capi · 2010-06-29 18:48 · Score: 5, Funny

This 1-million core machine better be running open source software and not proprietary software.
Yeah, especially if their software is licensed on a per-core basis.
Damaged Brains by b4upoo · 2010-06-29 18:48 · Score: 3, Insightful

Some folks with severely damaged brains seem to make better human computers than people with healthy brains. Rain Man leaps to mind, as well as other savants. It seems that when some parts of the brain are impaired, the energy of thought is diverted to narrower functions. Perhaps we need to think of delivering more energy to fewer cores to make machines that do tasks that normal humans are not so good at doing.
Re:1 billion cores by TheSpoom · 2010-06-29 19:17 · Score: 3, Informative

I'm pretty sure the poster meant to do something like this:
fork();
fork();
fork(); // etc.
which would make the number of processes increase exponentially every time the forked processes forked again. Not 1, 2, 3, but 1, 2, 4, 8, 16... and 2^30 gets you above 1 billion.
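
The doubling arithmetic is easy to check without melting a machine. The sketch below only counts; it never calls fork(). It assumes every fork() succeeds and that every resulting process goes on to execute each later fork(). The helper names `processes_after_forks` and `forks_needed` are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Number of processes alive after k back-to-back fork() calls, assuming
 * every call succeeds: each call doubles the count, so the result is 2^k.
 * This only does the arithmetic; it does not call fork(). */
uint64_t processes_after_forks(unsigned k)
{
    assert(k < 64); /* keep the shift within 64 bits */
    return (uint64_t)1 << k;
}

/* Smallest number of consecutive fork() calls that yields at least
 * `target` processes (capped at 63 to stay within 64 bits). */
unsigned forks_needed(uint64_t target)
{
    unsigned k = 0;
    while (k < 63 && processes_after_forks(k) < target)
        k++;
    return k;
}
```

`forks_needed(1000000000ULL)` returns 30, matching the 2^30 figure: 2^29 is about 537 million and 2^30 is 1,073,741,824.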

--
It's better to vote for what you want and not get it than to vote for what you don't want and get it.
- E. Debs
Re:Last time I run a parallel program... by palegray.net · 2010-06-29 19:26 · Score: 4, Insightful

Given your statement, why would you link to a document entitled Reevaluating Amdahl's Law? Did you even read what you linked to? Here's an excerpt:

Our work to date shows that it is not an insurmountable task to extract very high efficiency from a massively-parallel ensemble, for the reasons presented here. We feel that it is important for the computing research community to overcome the "mental block" against massive parallelism imposed by a misuse of Amdahl's speedup formula; speedup should be measured by scaling the problem to the number of processors, not fixing problem size. We expect to extend our success to a broader range of applications and even larger values for N.
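
The "scale the problem to the number of processors" idea in that excerpt is usually written as Gustafson's scaled speedup, N + (1 - N) * s, where s is the serial fraction of run time measured on the parallel machine. A minimal sketch, assuming that standard form; the function name `scaled_speedup` is made up for illustration.

```c
#include <assert.h>

/* Gustafson's scaled speedup: the problem grows with the machine, and
 * s is the fraction of run time spent in serial code on the n-processor
 * system (0 <= s <= 1). scaled_speedup is a hypothetical helper name. */
double scaled_speedup(double s, double n)
{
    assert(s >= 0.0 && s <= 1.0);
    assert(n >= 1.0);
    return n + (1.0 - n) * s;
}
```

With a 0.1% serial fraction, `scaled_speedup(0.001, 1e6)` is roughly 999,000, versus a ceiling of about 1000 when the problem size is held fixed. The serial fraction is the same; the difference is entirely in what gets held constant.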

--
512 MB RAM, 20 GB disk, 200 GB transfer, five datacenters. $19.95/month.
Re:Dangerous idea by ProfessionalCookie · 2010-06-29 19:33 · Score: 4, Insightful

From a science perspective I'm pretty sure that either computers are already "sentient" or (IMHO, more likely) that we don't really understand what sentience is. At all.
The Internet by pmontra · 2010-06-29 19:37 · Score: 5, Interesting

The Internet is at least in the 1 billion cores range. The way to use many of them for a parallel computation has been demonstrated by SETI@home, Folding@home and even by botnets. They might not be the most efficient implementations when you have full control of the cores, but they show the way to go when the availability of the cores and the communication between them is unreliable, when they have different times and different clocks, and when they might be preempted to do different tasks.
The brain isn't a spiffy processing system. by master_p · 2010-06-29 20:04 · Score: 4, Insightful

The brain does not do arithmetic, it only does pattern matching. That's what most people don't get and that's the obstacle to understanding and realizing AI.
If you ask how humans can then do math in their brain, the answer is simple: they can't, but a pattern matching system can be trained to do math by learning all the relevant patterns.
If you further ask how humans can do logical inference in their brain, the answer is again simple: they can't, and that's the reason people believe in illogical things. Their answers are the result of pattern matching, just like Google returning the wrong results.
Re:Reminds me of Hillis by pieterh · 2010-06-29 21:19 · Score: 4, Interesting

You don't even need Erlang; you can use a lightweight message-passing library like ZeroMQ that lets you build fast concurrent applications in 20 or so languages. It looks like sockets but implements Actors that connect in various patterns (pubsub, request-reply, butterfly), and works with Ruby, Python, C, C++, Java, Ada, CLisp, Go, Haskell, Perl, and even Erlang. You can even mix components in any language.
You get concurrent apps with no shared state, no shared clock, and components that can come and go at any time, and communicate only by sending each other messages.
In hardware terms it lets you run one thread per core, at full efficiency, with no wait states. In software terms it lets you build at any scale, even to the scale of the human brain, which is basically a message-passing concurrent architecture.

--
My blog
Re:Human brain != computer by Wescotte · 2010-06-29 22:15 · Score: 4, Insightful

but hopeless for calculating with a reasonable degree of accuracy the actual distance to that object- the margin of error for most people is on average going to be quite large.
I disagree. How can we learn to throw a basketball into a tiny hoop from far away without having very accurate estimates? Think of any sport and just how many good estimates are made VERY quickly and pretty damn accurately. How can a painter look at any scene and recreate (to scale) what they see on canvas? I'd say our brains are pretty damn good at calculating with very high accuracy.

Just because I can't say the hoop is exactly 32.74578453 feet from me doesn't mean I don't know how far it is away. If I can throw the ball into the hoop then I have accurately calculated/predicted the distance.

Look at how sometimes people are mid-conversation, talking about something they know in depth and suddenly they forget what they were going to say- this is because processing in the brain has gone completely off track.
I'm having a hard time coming up with a good analogy, but I suspect these situations are similar to interrupts in computers. Something more important requires the brain's resources at that time. It's not like the information is forgotten; it's simply not accessible at that moment in time. The information is never "lost", it's just unavailable for a time. If it was lost you wouldn't have the "oh yeah" moments when you remember it or look it up again. You recognize it because you already knew it.

While I agree the brain isn't as effective at large scale number crunching I do believe it's something the brain can be trained to do. There are plenty of people out there who can do insanely complex arithmetic in their heads. I suspect the reason we all don't have such skills is because we don't need them.

but there's a lot that current computers can do that the brain can't- serious large scale number crunching for example
There is no real reason, in survival-of-the-fittest terms, for us to be able to accomplish such tasks. So those resources in the brain were put to use on other tasks, like accurately processing visual and audio data. I can hear or spot a predator very quickly and accurately in all types of environments and lighting conditions. If we use a computer to perform these tasks we realize just how much computation is required. There is no reason these resources couldn't be allocated to general number crunching. It's just that evolution says they are better used for other tasks.
Re: "steal" by TaoPhoenix · 2010-06-30 01:58 · Score: 3, Funny

No no - you had the golden chance and missed it!
You *license* the baby!

--
My first Journal Entry ever, in 8 years! http://slashdot.org/journal/365947/aphelion-scifi-fantasy-horror-poetry-webzine