Slashdot Mirror


Heterogeneous Multiprocessor Chip Runs Tao/Elate

Madmac wrote with this cool item: "A trio of Japanese companies have teamed up on a multiprocessor chip design that can embed multiple processors and DSPs on a single chip, all running Tao's VP code." An interesting snippet: "Up to eight processor engines can be added to one MISC device, for performance of 200 to 900 million operations per second. They can include any general-purpose RISC processor, DSPs, SIMD engines, vector processors, graphics processors or customized logic."

7 of 43 comments

  1. interesting, but by cheese_wallet · · Score: 3

    Why is this so exciting? Is it really that difficult to put multiple processors on a die? And this sounds like they are just putting multiple die in a package. Oddly enough, I believe it is referred to as die and not dice when speaking of pluralities in ICs.

    When a die gets large, you have thermal expansion problems, so one can't just stick four pIII's (or whatever) on a single die; it gets too large. But you could stick four pIII die in a single package. Flip chip might not be the best way to go about doing this though.

    fyi, a company I've worked for in the past makes a multi-chip module that incorporates 5 die. Not all are processors though, a couple are cache.

    --Scott

  2. Re:Not that fast. *not* by jdwilso2 · · Score: 3

    Well actually, bogomips and mips and all that don't really mean jack and a half. If you look at ratings like that, they come from frequency. Your CPU is a 450MHz, so it can theoretically execute 450 million instructions per second. Yeah, 450 million no-ops. But what good does that do you? To tell you the truth, most of the performance of any x86 processor is lost to branch mispredictions and memory latency. As far as I can tell, bogomips = bogus mips anyway... I don't know why 2.3.99 would be reporting 2xMHz for your bogomips unless they have something screwed up in the kernel they need to fix. Ain't no way a standard superscalar 450MHz processor is gonna execute one instruction every 0.5 cycles. Not an x86 anyway...
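
    The arithmetic behind that complaint, as a minimal sketch. The function names are made up for illustration, and the 900 figure is simply 2 x 450 as described above, not a measured value.

```python
def instructions_per_cycle(mips: float, clock_mhz: float) -> float:
    """Average instructions completed per clock cycle implied by a MIPS rating."""
    return mips / clock_mhz


def cycles_per_instruction(mips: float, clock_mhz: float) -> float:
    """Inverse view: average clock cycles spent per instruction."""
    return clock_mhz / mips


# A 450 MHz part rated at 450 MIPS implies one instruction per cycle on average.
print(instructions_per_cycle(450, 450))   # 1.0
# A rating of twice the clock (900 for 450 MHz) implies 0.5 cycles per instruction.
print(cycles_per_instruction(900, 450))   # 0.5
```

    Neither number says anything about a real program; it only shows what a rating of 2xMHz would have to mean if it were read as real MIPS.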

    And even if you benchmark an actual program to try and see how many instructions per second it's actually getting, it only means anything for that program. It's totally dependent on how many branches are in the code, and how much of the time memory is being accessed.

    These people's idea is excellent because it focuses on the direction the computer industry needs to go: parallelism. Forget doing everything sequentially, do it all at the same time! I'm not talking about in one program though, I'm talking about throughout the system. You've got your webserver and your fileserver and all your device drivers and your OS and all that good crap to run while you play Quake, and the only way to really help out is to tack on another processor to run other programs at the same time. This is kind of like what Sun is doing for the MAJC architecture with thread-level parallelism.

    And finally, I don't see how you people can honestly think that one processor running at something like 450MHz is going to be as good as 8 all on the same die all running different processes in parallel.
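
    A rough way to see the gap is an idealized model, sketched below. It assumes fully independent jobs, identical processors, and no contention for memory, none of which is guaranteed on a real chip (and the chip in the story mixes different kinds of engines). The `makespan` helper and the job list are hypothetical.

```python
import heapq


def makespan(job_seconds, processors):
    """Wall-clock time to finish all jobs when each job goes to the
    processor that becomes free first (jobs are independent, no sharing)."""
    finish_times = [0.0] * processors  # when each processor is next idle
    heapq.heapify(finish_times)
    for job in job_seconds:
        earliest = heapq.heappop(finish_times)
        heapq.heappush(finish_times, earliest + job)
    return max(finish_times)


jobs = [1.0] * 8  # hypothetical: eight independent one-second processes
print(makespan(jobs, 1))  # 8.0
print(makespan(jobs, 8))  # 1.0
```

    In this best case eight processors finish in one eighth of the time; shared memory and uneven job lengths would eat into that.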

    Speed isn't about megahertz or gigahertz or instructions per second. It's about the time it takes to run something. Who needs benchmarks when I have my analog wall clock to tell me what's best?
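
    The software version of the wall clock, as a minimal sketch. `wall_clock_seconds` and `sum_of_squares` are illustrative names, and the workload is only a stand-in for whatever program you actually care about.

```python
import time


def wall_clock_seconds(task, *args):
    """Run task(*args) once and return (result, elapsed wall-clock seconds)."""
    start = time.perf_counter()
    result = task(*args)
    return result, time.perf_counter() - start


def sum_of_squares(n):
    # Stand-in workload; substitute the program you actually care about.
    return sum(i * i for i in range(n))


result, elapsed = wall_clock_seconds(sum_of_squares, 100_000)
print(f"result={result}, took {elapsed:.4f} s")
```

    The elapsed time varies from run to run and machine to machine, which is exactly why it only means something for the program being timed.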

    JDW

  3. Damn, it's called BOGOmips for a reason by Signal 12 · · Score: 4

    And not only are you using Bogomips numbers to compare to some other processor speed, someone even moderated that up! How's this world going to end?

    --

    Inflation is everywhere.

  4. Haiku by 575 · · Score: 3

    Tao sounds familiar...
    Echoes of an article
    Amiga perhaps?

  5. I wonder why Transmeta hasn't tried this... by tilly · · Score: 3

    Today CPUs spend a lot of energy trying to extract parallelism out of code designed to be run linearly. The ability to take advantage of parallelism is strongly limited by your ability to find it, rather than the ability of the chip to carry out instructions in parallel.

    Well if the chip is emulating a dual processor machine, then you have pushed a lot of that work down to having the OS identify 2 processes that can run in parallel. I would think that this would be a huge win.

    Is there something obvious that I am missing?

    Cheers,
    Ben

    --
    My usual seat in the cluetrain is at http://pub4.ezboard.com/biwethey.ht

    1. Re:I wonder why Transmeta hasn't tried this... by Christopher Thomas · · Score: 5

      Today CPUs spend a lot of energy trying to extract parallelism out of code designed to be run linearly. The ability to take advantage of parallelism is strongly limited by your ability to find it, rather than the ability of the chip to carry out instructions in parallel.

      Well if the chip is emulating a dual processor machine, then you have pushed a lot of that work down to having the OS identify 2 processes that can run in parallel. I would think that this would be a huge win.


      This is nice, if you can overcome four concerns:

      • Duplicate register sets.
        Each thread will need to work with its own copy of the virtual chip's registers. For processes, instead of threads, each will also need its own page table (though you could just remap them to different sections of the same table). You might be able to emulate dual x86 processors, but something with a larger register set would pose problems.

      • Different instruction pointers.
        Each thread is an independent instruction stream that can wander and jump any which way. You'd need explicit hardware support to be able to emulate this without prohibitive overhead.

      • Greater memory load.
        As the processes would be (hopefully) independent, you'd be hitting two completely different sets of pages when accessing memory. This means you'd need a cache twice as large to avoid thrashing. You could get around this by supporting multiple threads, not processes, but this limits the advantage of your proposal.

      • More hardware complexity.
        Transmeta's fundamental approach was to reduce hardware complexity and cost. As some additional hardware support is needed (in fact, a fair bit of additional support), emulation of SMP systems is unlikely to be embraced by Transmeta.
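
      The first two concerns can be made concrete with a toy emulator, sketched below. It is not how Transmeta's code-morphing software works; the three-instruction machine, `ThreadContext`, and the round-robin loop are all invented for illustration. The point is only that every emulated thread needs its own registers and its own instruction pointer, and that one engine has to switch between them.

```python
from dataclasses import dataclass, field


@dataclass
class ThreadContext:
    """State the emulator must keep separately for every emulated thread."""
    program: list                      # this thread's own instruction stream
    ip: int = 0                        # private instruction pointer
    regs: list = field(default_factory=lambda: [0] * 4)  # private registers
    halted: bool = False


def step(ctx):
    """Execute one instruction of ctx on the single shared engine."""
    op, a, b = ctx.program[ctx.ip]
    ctx.ip += 1
    if op == "addi":        # regs[a] += b
        ctx.regs[a] += b
    elif op == "jnz":       # jump to instruction b if regs[a] != 0
        if ctx.regs[a] != 0:
            ctx.ip = b
    elif op == "halt":
        ctx.halted = True


def run_interleaved(contexts):
    """Round-robin: one instruction from each live thread per pass."""
    while not all(c.halted for c in contexts):
        for c in contexts:
            if not c.halted:
                step(c)


# Thread A counts r0 down from 3, adding 10 to r1 on each trip round the loop.
a = ThreadContext([("addi", 0, 3), ("addi", 1, 10), ("addi", 0, -1),
                   ("jnz", 0, 1), ("halt", 0, 0)])
# Thread B just sets r0 to 7; it uses the same register number as A.
b = ThreadContext([("addi", 0, 7), ("halt", 0, 0)])
run_interleaved([a, b])
print(a.regs, b.regs)  # [0, 30, 0, 0] [7, 0, 0, 0]
```

      Both threads write register 0 without disturbing each other only because the state is duplicated per thread, and that duplicated state grows with the size of the emulated register set.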


      This is a fascinating idea, but there are substantial hurdles that would have to be overcome to implement it in practice.
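
      The memory-load concern can be illustrated with a small cache simulation. This is a sketch under simplifying assumptions: a fully associative LRU cache counted in whole pages, and two made-up processes that each loop over four pages of their own. Real caches and access patterns are messier.

```python
from collections import OrderedDict


def count_misses(accesses, capacity):
    """Misses for a fully associative LRU cache holding `capacity` pages."""
    cache = OrderedDict()
    misses = 0
    for page in accesses:
        if page in cache:
            cache.move_to_end(page)        # mark as most recently used
        else:
            misses += 1
            cache[page] = True
            if len(cache) > capacity:
                cache.popitem(last=False)  # evict least recently used
    return misses


# Hypothetical working sets: each process loops over 4 pages of its own.
proc_a = [("A", i) for i in range(4)] * 10
proc_b = [("B", i) for i in range(4)] * 10
interleaved = [p for pair in zip(proc_a, proc_b) for p in pair]

print(count_misses(proc_a, 4))        # 4: one process alone fits
print(count_misses(interleaved, 4))   # 80: both together thrash
print(count_misses(interleaved, 8))   # 8: doubling the cache fixes it
```

      With the two access streams interleaved, the cache that was big enough for one process misses on every access until its size is doubled.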

  6. The Real Question... by smack_attack · · Score: 3

    Yeah, that's all fine and dandy, but can you program missiles with it like my Playstation II?