Slashdot Mirror


Heterogeneous Multiprocessor Chip Runs Tao/Elate

Madmac wrote with this cool item: "A trio of Japanese companies have teamed up on a multiprocessor chip design that can embed multiple processors and DSPs on a single chip, all running Tao's VP code." An interesting snippet: "Up to eight processor engines can be added to one MISC device, for performance of 200 to 900 million operations per second. They can include any general-purpose RISC processor, DSPs, SIMD engines, vector processors, graphics processors or customized logic."

2 of 43 comments

  1. Damn, it's called BOGOmips for a reason by Signal+12 · · Score: 4

    And not only are you using Bogomips numbers to compare to some other processor speed, someone even moderated that up! How's this world going to end?

    --

    Inflation is everywhere.

  2. Re:I wonder why Transmeta hasn't tried this... by Christopher+Thomas · · Score: 5
    Today CPUs spend a lot of energy trying to extract parallelism out of code designed to be run linearly. The ability to take advantage of parallelism is strongly limited by your ability to find it, rather than the ability of the chip to carry out instructions in parallel.

    Well, if the chip is emulating a dual-processor machine, then you have pushed a lot of that work down to having the OS identify 2 processes that can run in parallel. I would think that this would be a huge win.


    This is nice, if you can overcome four concerns:

    • Duplicate register sets.
      Each thread will need to work with its own copy of the virtual chip's registers. For processes, instead of threads, each will also need its own page table (though you could just remap them to different sections of the same table). You might be able to emulate dual x86 processors, but something with a larger register set would pose problems.

    • Different instruction pointers.
      Each thread is an independent instruction stream that can wander and jump any which way. You'd need explicit hardware support to be able to emulate this without prohibitive overhead.

    • Greater memory load.
      As the processes would be (hopefully) independent, you'd be hitting two completely different sets of pages when accessing memory. This means you'd need a cache twice as large to avoid thrashing. You could get around this by supporting multiple threads, not processes, but this limits the advantage of your proposal.

    • More hardware complexity.
      Transmeta's fundamental approach was to reduce hardware complexity and cost. As some additional hardware support is needed (in fact, a fair bit of additional support), emulation of SMP systems is unlikely to be embraced by Transmeta.
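
    To make the memory-load point concrete, here is a rough sketch of two independent processes sharing one cache. Everything in it is an assumption made for illustration: the fully associative LRU cache model, the 64-line working set, the strict round-robin interleaving, and the hypothetical helper name hit_rate. It is not a model of any real Transmeta part.

```python
from collections import OrderedDict


def hit_rate(cache_lines, streams, passes=50):
    """Hit rate of a fully associative LRU cache shared by interleaved streams.

    Each stream is a list of addresses that its owner loops over repeatedly.
    Streams are interleaved round-robin, one access at a time (an assumption).
    """
    cache = OrderedDict()  # keys are cached addresses, oldest first
    hits = total = 0
    for _ in range(passes):
        for accesses in zip(*streams):
            for addr in accesses:
                total += 1
                if addr in cache:
                    hits += 1
                    cache.move_to_end(addr)  # mark as most recently used
                else:
                    cache[addr] = True
                    if len(cache) > cache_lines:
                        cache.popitem(last=False)  # evict least recently used
    return hits / total


WORKING_SET = 64  # lines touched by each process (illustrative value)

# Tag addresses with the owning process so the two sets never overlap.
proc_a = [("A", i) for i in range(WORKING_SET)]
proc_b = [("B", i) for i in range(WORKING_SET)]

single = hit_rate(WORKING_SET, [proc_a])               # one process, cache fits it
shared = hit_rate(WORKING_SET, [proc_a, proc_b])       # two processes, same cache
doubled = hit_rate(2 * WORKING_SET, [proc_a, proc_b])  # two processes, 2x cache

print(f"single={single:.2f} shared={shared:.2f} doubled={doubled:.2f}")
```

    In this toy model, a cache sized for one working set never hits once a second independent process is interleaved with the first, and doubling the cache restores the single-process hit rate. Real caches are set-associative and real access patterns are less regular, so treat this as a picture of the worst case rather than a measurement.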


    This is a fascinating idea, but there are substantial hurdles that would have to be overcome to implement it in practice.