Slashdot Mirror


Intel Talks 1000-Core Processors

angry tapir writes "An experimental Intel chip shows the feasibility of building processors with 1,000 cores, an Intel researcher has asserted. The architecture for the Intel 48-core Single Chip Cloud Computer processor is 'arbitrarily scalable,' according to Timothy Mattson. 'This is an architecture that could, in principle, scale to 1,000 cores,' he said. 'I can just keep adding, adding, adding cores.'"

14 of 326 comments

  1. Could be good for games using raytracing by mentil · · Score: 4, Insightful

    This is for server/enterprise usage, not consumer usage. That said, it could scale to the number of cores necessary to make realtime raytracing work at 60fps for computer games. Raytracing could be the killer app for cloud gaming services like OnLive, where the power to do it is unavailable for consumer computers, or prohibitively expensive. The only way Microsoft etc. would be able to have comparable graphics in a console in the next few years is if it were rental-only like the Neo-Geo originally was.

    --
    Corruption is convincing someone that the selfless ideal is the same as their selfish ideal.
  2. Re:One question? by JWSmythe · · Score: 2, Insightful

    The only thing I'd be compensating for is the fact I can't do calculations at Exaflop rates in my head.

        Just like my car only compensates for the fact I can't run at 165mph. :)

    --
    Serious? Seriousness is well above my pay grade.
  3. Re:Biggest Hurdle Not Cores by Anonymous Coward · · Score: 3, Insightful

    Basically, we are going to need compilers that automatically take advantage of all that parallelism without making you think about it too much, and programming languages that are designed to make your programs parallel-friendly. Even Microsoft is finally starting to edge in this direction with F# and some new features of .NET 4.0. Look at Haskell and Erlang for examples of languages that take such things more seriously, even if the world takes them less seriously.

    I don't know about AI, but almost certainly we will end up with both compilers and virtual machines that are aware of parallelism and try to take advantage of it whenever possible.

    But still, certain algorithms just aren't very friendly to parallelism no matter what technology you apply to them.
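One standard way to put a number on this limit is Amdahl's law, which the poster does not mention by name. The sketch below assumes a program splits cleanly into a serial part and a perfectly parallel part, which real programs rarely do, so read the results as upper bounds. The function name and the example fractions are illustrative only.

```python
def amdahl_speedup(parallel_fraction: float, cores: int) -> float:
    """Upper bound on speedup when only part of a program runs in parallel.

    parallel_fraction: share of the single-core run time that can be
    spread across cores (0.0 to 1.0). The rest stays serial.
    """
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if cores < 1:
        raise ValueError("cores must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / cores)


if __name__ == "__main__":
    # Illustrative fractions, not measurements of any real program.
    for fraction in (0.50, 0.95, 0.99):
        speedup = amdahl_speedup(fraction, 1000)
        print(f"{fraction:.0%} parallel on 1000 cores: at most {speedup:.1f}x")
```

Under these assumptions, a program that is half serial cannot even double its speed on 1,000 cores, which is the point about unfriendly algorithms in numeric form.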

  4. Instruction set... by KonoWatakushi · · Score: 3, Insightful

    "Performance on this chip is not interesting," Mattson said. It uses a standard x86 instruction set.

    How about developing a small efficient core, where the performance is interesting? Actually, don't even bother; just reuse the DEC Alpha instruction set that is collecting dust at Intel.

    There is no point in tying these massively parallel architectures to some ancient ISA.

    1. Re:Instruction set... by kohaku · · Score: 4, Insightful

      There's also no reason to throw away an ISA that has proven to be extremely scalable and very successful, just because it's ancient or it looks ugly.

      Uh, scalable? Not really... The only reason x86 is still around (i.e. successful) is because it's pretty much backwards compatible since the 8086- which is over THIRTY YEARS OLD.

      The advantage of the x86 instruction set is that it's very compact. It comes at a price of increased decoding complexity, but that problem has already been solved.

      Whoa nelly. Compact? I'm not sure where you got that idea, but it's called CISC and not RISC for a reason! If you think x86 is compact, you might be interested to find out that you can have a fifteen byte instruction. In fact, on the i7 line, the instructions are so complex it's not even worth writing a "real" decoder: they're translated in real time into a RISC instruction set! If Intel would just abandon x86, they could reduce their cores by something like 50%!
      The low number of registers _IS_ a problem. The only reason there are only four is because of backwards compatibility. It definitely is a problem for scalability: one cannot simply rely on a shared memory architecture to scale vertically indefinitely. You just use too much power as die size increases, and memory just doesn't scale up as fast as the number of transistors on a CPU.
      A far better approach is to have a decent model of parallelism (CSP, Pi-calculus, Ambient calculus) underlying the architecture and to provide a simple architecture with primitives supporting features of these calculi, such as channel communication. There are plenty of startups doing things like this, not just Intel, and they already have products on the market, though not desktop processors. Picochip and Icera, to name just a couple, not to mention things like GPGPU (Fermi, etc.).
      Really, the way to go is small, simple, low-power cores with on-chip networks, which can scale up MUCH better than the old Intel method of "More transistors, increase clock speed, bigger cache".

    2. Re:Instruction set... by Arlet · · Score: 3, Insightful

      The only reason x86 is still around (i.e. successful) is because it's pretty much backwards compatible since the 8086- which is over THIRTY YEARS OLD.

      That's a clear testament to scalability when you consider the speed improvement in the last 30 years using basically the same ISA.

      you might be interested to find out that you can have a fifteen byte instruction

      So? It's not the maximum instruction length that counts, but the average. In typical programs that's closer to three bytes. Frequently used opcodes like push/pop only take a single byte. Compare to a DEC Alpha architecture, where nearly every single instruction uses 15 bits just to tell which registers are used, no matter whether a function needs that many registers.
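The arithmetic behind this comparison can be sketched as follows. It assumes Alpha's fixed 32-bit instruction width and its three-register operate format with 5-bit register fields, and it takes the poster's "closer to three" bytes as the x86 average without verifying it. The instruction count is a made-up figure, and the comparison ignores the fact that the two ISAs need different numbers of instructions for the same work.

```python
ALPHA_INSTRUCTION_BYTES = 4    # Alpha instructions are a fixed 32 bits wide
ALPHA_REGISTER_FIELD_BITS = 5  # 32 registers, so 5 bits per register field
ALPHA_REGISTER_FIELDS = 3      # two sources and one destination


def register_bits_per_instruction(fields: int, bits_per_field: int) -> int:
    """Bits spent naming registers in one fixed-format instruction."""
    return fields * bits_per_field


def code_size_bytes(instruction_count: int, average_bytes: float) -> float:
    """Rough code footprint for a given average instruction length."""
    return instruction_count * average_bytes


if __name__ == "__main__":
    bits = register_bits_per_instruction(
        ALPHA_REGISTER_FIELDS, ALPHA_REGISTER_FIELD_BITS
    )
    print(f"Register fields per Alpha instruction: {bits} bits")

    instructions = 1_000_000   # hypothetical program size
    x86_average_bytes = 3      # the poster's claimed average, not measured
    print("x86 estimate:  ", code_size_bytes(instructions, x86_average_bytes))
    print("Alpha estimate:", code_size_bytes(instructions, ALPHA_INSTRUCTION_BYTES))
```

If those assumptions hold, the same instruction count takes roughly a third more cache space in the fixed-width encoding, which is the cache argument made in the reply that follows.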

      If Intel would just abandon x86, they could reduce their cores by something like 50%!

      Even if that's true (I doubt it), who cares? The problem is not that Intel has too many transistors for a given area. The problem is just the opposite. They have the capability to put more transistors in a core than they know what to do with. Also, typically half the chip is for the cache memories, and the compact instruction set helps to use that cache memory more effectively.

      one cannot simply rely on a shared memory architecture to scale vertically indefinitely

      Sure you can. Shared memory architectures can do everything explicit channel communication architectures can do, plus you have the benefit that the communication details are hidden from the programmer, allowing improvements to the implementation without having to rewrite your software. Sure, the hardware is more complex, but transistors are dirt cheap, so I'd rather put the complexity in the hardware.
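The two programming models argued over in this subthread can be put side by side in a small sketch. It uses Python's standard `threading` and `queue` modules purely to show the shape of the code: a lock-protected shared variable versus results sent over a channel-like queue. The function names are hypothetical, and CPython threads do not run CPU-bound work in parallel, so this says nothing about how either model performs on real many-core hardware.

```python
import queue
import threading


def sum_shared_memory(chunks):
    """Workers add partial sums into one shared variable guarded by a lock."""
    total = 0
    lock = threading.Lock()

    def worker(chunk):
        nonlocal total
        partial = sum(chunk)
        with lock:              # shared state needs explicit synchronization
            total += partial

    threads = [threading.Thread(target=worker, args=(c,)) for c in chunks]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return total


def sum_message_passing(chunks):
    """Workers send partial sums over a queue and share no mutable state."""
    channel = queue.Queue()

    def worker(chunk):
        channel.put(sum(chunk))  # the only interaction is sending a message

    threads = [threading.Thread(target=worker, args=(c,)) for c in chunks]
    for t in threads:
        t.start()
    total = sum(channel.get() for _ in chunks)  # one message per worker
    for t in threads:
        t.join()
    return total


if __name__ == "__main__":
    data = [list(range(i, i + 100)) for i in range(0, 1000, 100)]
    print(sum_shared_memory(data), sum_message_passing(data))
```

Both versions produce the same answer. The disagreement in the thread is about which one the hardware should support natively and where the complexity should live.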

    3. Re:Instruction set... by Arlet · · Score: 2, Insightful

      Examples? It's just a different model, it doesn't prevent you from solving any problem.

      A typical consumer desktop machine, running typical programs for instance. In order to use these cores effectively, all these programs need to be rewritten. Imagine your word processor reformatting a 500-page document on 1000 cores. It's just not going to work very well.

      How about the operating system? 1000 different cores all trying to access a file system on a single physical drive. How are you going to run that efficiently?

  5. Re:Future of Programming by Anonymous Coward · · Score: 5, Insightful

    Learn a functional language. Learn it not for some practical reason. Learn it because having another view will give you interesting choices even when writing in imperative languages. Every serious programmer should try to look at the important paradigms so that he can freely choose to use them where appropriate.

  6. I/O and memory bandwidth by francium de neobie · · Score: 3, Insightful

    Ok, you can cram 1000 cores into one CPU chip - but feeding all 1000 CPU cores with enough data for them to process and transferring all the data they spit out is gonna be a big problem. Things like OpenCL work now because the high end GPUs these days have 100GB/s+ bandwidth to the local video memory chips, and you're only pulling out the result back into system memory after the GPU did all the hard work. But doing the same thing on a system level - you're gonna have problems with your usual DDR3 modules, your SSD hard disk (even PCI-E based) and your 10GE network interface.
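A back-of-the-envelope version of this concern, assuming bandwidth is split evenly across cores (real memory systems are not that tidy). The 100 GB/s figure is the GPU number quoted in the comment. The 1 GB working set per core is an arbitrary illustration, and no DDR3, SSD, or network figures are filled in because the comment gives none.

```python
def bandwidth_per_core_gbs(total_bandwidth_gbs: float, cores: int) -> float:
    """Even share of total memory bandwidth available to each core, in GB/s."""
    if cores < 1:
        raise ValueError("cores must be at least 1")
    return total_bandwidth_gbs / cores


def transfer_time_s(data_gb: float, bandwidth_gbs: float) -> float:
    """Seconds needed to move data_gb at the given bandwidth."""
    if bandwidth_gbs <= 0:
        raise ValueError("bandwidth must be positive")
    return data_gb / bandwidth_gbs


if __name__ == "__main__":
    share = bandwidth_per_core_gbs(100.0, 1000)  # 100 GB/s from the comment
    print(f"Per-core share: {share:.2f} GB/s")
    # Hypothetical 1 GB working set per core:
    print(f"Time to stream 1 GB per core: {transfer_time_s(1.0, share):.0f} s")
```

Even with GPU-class bandwidth, each of 1,000 cores would see only about a tenth of a gigabyte per second under this even-split assumption.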

  7. Re:1000 cores is nothing by Electricity Likes Me · · Score: 2, Insightful

    1000 cores at 1 GHz on a single chip, networked to 1000 other chips, would probably just about make a non-real-time simulation of a full human brain possible (going off something I read about this somewhere). Although if it is possible to arbitrarily scale the number of cores, then we might be able to seriously consider building a system of very simple processors acting as electronic neurons.

  8. This is NOT a cache-coherent/SMP machine! by Terje Mathisen · · Score: 2, Insightful

    The key difference between this research chip and the other Multicore chips Intel have worked on, like Larrabee, is that it is explicitly NOT cache coherent, i.e. it is a cluster on chip instead of a single-image multi-processor.

    This means, among many other things, that you cannot load a single Linux OS across all the cores, you need a separate executive on every core.

    Compare this with the 7-8 Cell cores in a PS3.

    Terje

    --
    "almost all programming can be viewed as an exercise in caching"
  9. Paraphrasing Torvalds... by menkhaura · · Score: 2, Insightful

    Talk is cheap, show me the cores.

    --
    Stupidity is an equal opportunity striker.
    Fellow slashdotter Bill Dog
  10. What's the point if Photoshop is only 2 processors by cdpage · · Score: 2, Insightful

    Photoshop has been stuck at 2 processors for way too long. Software companies have been lagging behind hardware far too long. Until I see more software taking advantage of more than 1 or 2 cores... I'm not wasting money on them.

  11. Benchmarks by Chemisor · · Score: 2, Insightful

    According to benchmarks, a functional language like Erlang is slower than C++ by an order of magnitude. Sure, it can distribute processing over more cores, which is the only thing that enabled it to win one of the benchmarks. I suspect that was only because it used a core library function that was written in C. So no, if you want to write code with acceptable performance, DON'T use a functional language. All CPU intensive programs, like games, are written in C or C++; think about that.