Slashdot Mirror


Heterogeneous Multiprocessor Chip Runs Tao/Elate

Madmac wrote with this cool item: "A trio of Japanese companies have teamed up on a multiprocessor chip design that can embed multiple processors and DSPs on a single chip, all running Tao's VP code." An interesting snippet: "Up to eight processor engines can be added to one MISC device, for performance of 200 to 900 million operations per second. They can include any general-purpose RISC processor, DSPs, SIMD engines, vector processors, graphics processors or customized logic."

16 of 43 comments

  1. Re:Hmm by Kaufmann · · Score: 2

    Yes it is. The bastards are making a bundle out of it too. Ah, business in the Internet age: take a completely un-original idea that's been thoroughly researched and investigated for the past few decades, make it just a little bit less clever, slap a "Java(tm)" (or "XML", or "Web", or "Linux", or "e-", et cetera) on it, and get ready to make millions! :)

    Ah well. I guess I'm just jealous.

    --
    To the editors: your English is as bad as your Perl. Please go back to grade school.
  2. Here is your link... by NatePWIII · · Score: 2

    http://www.theregister.co.uk/000505-000008.html


    Nathaniel P. Wilkerson
    NPS Internet Solutions, LLC
    www.npsis.com

    --

    Nathaniel P. Wilkerson
    www.haidacarver.com
  3. That depends on what you are doing by tilly · · Score: 2

    If a programmer wants to fully exploit the available CPU power of multiple processors, then indeed they have to put in extra work. This is not necessarily a big deal. If you are already forking processes, then you have already done that work. Likewise if you are running a server that runs multiple processes in parallel (eg Apache), then the work has been done for you.

    And, of course, for many things the end user doesn't care that the program is an efficient hog of multiple CPUs. If I background one process and turn to doing other things, I have already won if the backgrounded process has a minimal impact on my observed performance.

    For all of these reasons people find wins running dual-processor systems even though they are mostly running programs whose programmers were not explicitly writing programs that take advantage of multiple processors.
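A minimal sketch of the "already forking processes" case, assuming a Unix-like system where `os.fork` is available. The names `count_primes` and `run_forked` are made up for illustration. Each child is an ordinary process, so the kernel is free to schedule it on whichever CPU is idle, and the program contains no multiprocessor-specific code.

```python
import os


def count_primes(lo, hi):
    """CPU-bound toy job: count the primes in [lo, hi)."""
    count = 0
    for n in range(max(lo, 2), hi):
        if all(n % d for d in range(2, int(n ** 0.5) + 1)):
            count += 1
    return count


def run_forked(ranges):
    """Fork one child per range and sum the counts the children report."""
    children = []
    for lo, hi in ranges:
        read_fd, write_fd = os.pipe()
        pid = os.fork()
        if pid == 0:
            # Child: do the work, write the result to the pipe, and exit.
            os.close(read_fd)
            os.write(write_fd, str(count_primes(lo, hi)).encode())
            os.close(write_fd)
            os._exit(0)
        # Parent: keep only the read end of this child's pipe.
        os.close(write_fd)
        children.append((pid, read_fd))

    total = 0
    for pid, read_fd in children:
        with os.fdopen(read_fd) as pipe:
            total += int(pipe.read())
        os.waitpid(pid, 0)
    return total


if __name__ == "__main__":
    # Two independent jobs; on a dual-processor box they can run side by side.
    print(run_forked([(0, 10_000), (10_000, 20_000)]))
```

On a single-CPU machine the same code still runs correctly; it just gets no speedup, which is the point being made above.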

    Cheers,
    Ben

    --
    My usual seat in the cluetrain is at http://pub4.ezboard.com/biwethey.ht
  4. MISC by Tardigrade · · Score: 2

    MISC has been used to refer to Minimal Instruction Set Chips for a while now. A little research would have shown them this; now the acronym recognition is severely diluted.
    MIS Chips

  5. Hmm by dougman · · Score: 2

    Is this the same Tao mentioned in yesterday's (gasp) Amiga story?

    (mind starting to race)

  6. The Amiga Developer Support site is now online. by Dredox · · Score: 2

    All developers who bought Amiga's SDK (software developer's kit) or developer's machine will be able to get free support for the Elate-based Amie RTOS here.

  7. interesting, but by cheese_wallet · · Score: 3

    Why is this so exciting? Is it really that difficult to put multiple processors on a die? And this sounds like they are just putting multiple die in a package. (Oddly enough, I believe the plural used for ICs is "die", not "dice".)

    When a die gets large, you have thermal expansion problems, so one can't just stick four PIIIs (or whatever) on a single die; it gets too large. But you could stick four PIII die in a single package. Flip chip might not be the best way to go about doing this, though.

    fyi, a company I've worked for in the past makes a multi-chip module that incorporates 5 die. Not all are processors though, a couple are cache.

    --Scott

    1. Re:interesting, but by DGolden · · Score: 2

      Because the cool bit isn't really the multiple CPU cores on a die; the cool bit is the way they tie the processors together. Basically, they can throw together chips specialised for different tasks, with different instruction sets and different strengths and weaknesses, and the Tao/Amiga system will take advantage of all of them according to their strengths, dynamically recompiling your application for them. Each thread will run on an appropriate processor, e.g. a sound-fx-intensive thread on a DSP core, 3D graphics on a vector core, etc. And you don't need to know the nitty-gritty details: just write code in the language of your choice (Java, C, C++ and VP asm are the current possibilities) for a virtual machine.
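To make the "each thread runs on an appropriate processor" idea concrete, here is a toy placement routine. It is only a sketch: `Core`, `Thread`, and `assign` are invented names, not Tao's or Amiga's API, and it leaves out the dynamic recompilation step entirely. It shows nothing more than matching a thread's workload kind against what each core is good at, with a general-purpose core as the fallback.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Core:
    name: str
    strengths: frozenset  # workload kinds this core handles well


@dataclass(frozen=True)
class Thread:
    name: str
    kind: str  # e.g. "audio", "3d", "general"


def assign(threads, cores, fallback):
    """Place each thread on the first core listing its kind, else on the fallback."""
    placement = {}
    for t in threads:
        match = next((c for c in cores if t.kind in c.strengths), fallback)
        placement[t.name] = match.name
    return placement


# Hypothetical chip: one DSP, one vector engine, one general-purpose RISC core.
cores = [
    Core("dsp0", frozenset({"audio"})),
    Core("vec0", frozenset({"3d"})),
]
risc = Core("risc0", frozenset({"general"}))

threads = [
    Thread("sound_fx", "audio"),
    Thread("renderer", "3d"),
    Thread("ui", "general"),
]

placement = assign(threads, cores, fallback=risc)
print(placement)
```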

      If you remember the original "classic" Amiga from 1985, its design philosophy was not to run everything on the CPU, but to slap in task-specific (but programmable) co-processors to do various tasks extremely quickly, with their own DMA to the unified memory pool. This was coupled to an OS that used message-passing by reference, which meant that there was no memory copy overhead in interprocess communication (and unfortunately also meant that proper interprocess memory protection was impossible, i.e. there's no distinction between threads and processes on a "classic" Amiga). This is why the Amiga, at the time, could wipe the floor with any other system in (and often well above) its price range in terms of simultaneous graphics and sound data throughput (later to be termed "multimedia"). It also made programming the system... interesting...
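The trade-off in message-passing by reference can be shown in a few lines. This is a sketch with a made-up `MessagePort` class, not the Amiga's actual message API. Passing a reference costs no copy, but sender and receiver then share the same memory, so nothing stops one from scribbling on the other's data.

```python
from collections import deque


class MessagePort:
    """Toy message queue that can pass a buffer by reference or by copy."""

    def __init__(self):
        self._queue = deque()

    def put_by_reference(self, buf):
        # No copy: the receiver gets the very same object the sender holds.
        self._queue.append(buf)

    def put_by_copy(self, buf):
        # Copying costs time and memory, but isolates sender from receiver.
        self._queue.append(bytes(buf))

    def get(self):
        return self._queue.popleft()


port = MessagePort()
frame = bytearray(b"audio")

port.put_by_reference(frame)
received = port.get()
frame[0:1] = b"A"            # sender modifies the buffer after sending...
shared = received is frame   # ...and the receiver sees the change

port.put_by_copy(frame)
copied = port.get()
frame[0:1] = b"X"            # this later change does not reach the copy
```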

      Really high-performance stuff tended to mean stepping outside the OS - while the OS was cool in other ways, it didn't expose the full power of the hardware architecture.

      The Tao VP technology makes programming a heterogeneous multiprocessing environment (relatively) easy - the software design concepts have caught up with the hardware design concepts. It's probably no coincidence that the developers behind Tao were once Amiga programmers, who had to bend their minds around using the 68000, copper, blitter, sound and disk IO processors all at once.

      It now becomes increasingly clear that the new Amiga will, indeed, be "in the spirit of the 'classic' Amiga", just like Amiga Inc. have been saying all along, and will also be, like the original Amiga, technologically advanced compared to its peers. Of course, both the hardware and software will be astronomically more advanced than the Amiga's set of custom chips (the "PAD", for Paula, Agnus, Denise; Amiga custom chips were traditionally given women's names, and the motherboards B-52 album names).

      It remains to be seen whether management can screw it up as thoroughly as the original Amiga. :-(

      My major worry is some parts of the system are a tad more proprietary than people are used to these days in the Open Source world, including the multitudes of ex-Amiga users who have changed over to Linux.

      --
      Choice of masters is not freedom.
  8. Not that fast. by molo · · Score: 2

    My Celeron @ 450 does 897.84 bogomips (according to 2.3.99-pre6). 900 million ops per second doesn't sound all that fast any more.

    As a side note, with 2.2.14, Linux reported my CPU was about 450 bogomips. Anyone know why there was the change?

    Regardless, with today's gigahertz processors, 900 million ops per second is certainly no better than what we already have.

    Don't get me wrong, it sounds like interesting technology. Could produce very flexible chips, but I don't see a real speed gain here.

    --
    Using your sig line to advertise for friends is lame.
  9. Re:Not that fast. *not* by jdwilso2 · · Score: 3

    Well, actually, bogomips and MIPS and all that don't really mean jack and a half. If you look at ratings like that, they come from frequency. Your CPU is a 450MHz; it can theoretically execute 450 million instructions per second. Yeah, 450 million no-ops. But what good does that do you? To tell you the truth, most of the performance of any x86 processor is lost in branch prediction and memory latency. As far as I can tell, bogomips = bogus MIPS anyway... I don't know why 2.3.99 would be reporting 2xMHz for your bogomips unless they have something screwed up in the kernel they need to fix. Ain't no way a standard superscalar 450MHz processor is gonna execute one instruction every 0.5 cycles. Not an x86 anyway...
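For reference, the "2xMHz" figure is just the two numbers quoted in the parent comment divided by the clock speed. The helper name `bogomips_per_mhz` is made up, and the 2.2.14 value is the poster's "about 450", so treat it as approximate. The ratio shows only that the reported figure doubled relative to the clock between the two kernels; by itself it says nothing about how many real instructions the chip retires per cycle.

```python
def bogomips_per_mhz(bogomips, clock_mhz):
    """Ratio of the kernel's reported bogomips figure to the CPU clock in MHz."""
    return bogomips / clock_mhz


# Figures quoted in the parent comment for a 450 MHz Celeron.
ratio_2_3_99 = bogomips_per_mhz(897.84, 450)  # kernel 2.3.99-pre6
ratio_2_2_14 = bogomips_per_mhz(450.0, 450)   # kernel 2.2.14, "about 450"

print(f"2.3.99-pre6: {ratio_2_3_99:.2f} bogomips per MHz")
print(f"2.2.14:      {ratio_2_2_14:.2f} bogomips per MHz")
```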

    And even if you benchmark an actual program to try and see how many instructions per second it's actually getting, it only means anything for that program. It's totally dependent on how many branches are in the code, and how much of the time memory is being accessed.

    These people's idea is excellent because it focuses on the direction the computer industry needs to go: parallelism. Forget doing everything sequentially, do it all at the same time! I'm not talking about in one program though, I'm talking about throughout the system. You've got your webserver and your fileserver and all your device drivers and your OS and all that good crap to run while you play Quake, and the only way to really help out is to tack on another processor to run other programs at the same time. This is kind of like what Sun is doing for the MAJC architecture with thread-level parallelism.

    And finally, I don't see how you people can honestly think that one processor running at something like 450 is going to be as good as 8 all on the same die all running different processes in parallel.

    Speed isn't about megahertz or gigahertz or instructions per second. It's about the time it takes to run something. Who needs benchmarks when I have my analog wall clock to tell me what's best?
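The wall-clock approach is easy to apply in practice. A minimal sketch follows; `wall_clock` and `workload` are made-up names, and `workload` is only a stand-in for whatever program you actually care about. It times the real job from start to finish and ignores instruction counts altogether.

```python
import time


def wall_clock(fn, *args, repeats=3):
    """Return the best elapsed wall-clock time in seconds over several runs."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        fn(*args)
        best = min(best, time.perf_counter() - start)
    return best


def workload(n):
    """Stand-in job: sum of squares below n."""
    return sum(i * i for i in range(n))


elapsed = wall_clock(workload, 200_000)
print(f"best of 3: {elapsed:.4f} s")
```

Taking the best of a few runs reduces noise from whatever else the machine happens to be doing; the result is still only meaningful for this workload on this machine.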

    JDW

  10. Damn, it's called BOGOmips for a reason by Signal+12 · · Score: 4

    And not only are you using Bogomips numbers to compare to some other processor speed, someone even moderated that up! How's this world going to end?

    --

    Inflation is everywhere.

  11. Haiku by 575 · · Score: 3

    Tao sounds familiar...
    Echoes of an article
    Amiga perhaps?

  12. Just for the record.....on Amiga & Tao relations by Red+Moose · · Score: 2
    Just to let interested people know that this is the company that is partnered with Amiga for the NG OS.

    Also, given the actual nature of the article, I am going to quote from the Amiga site from last Friday before the Elate/Amiga SDK was announced -

    It was with this heavy attitude that I attended an impromptu meeting with a group of visiting Japanese consumer electronics companies.

    I had never met them before, but they had heard what we were doing, and they remembered the Amiga fondly.

    After our presentation, one of the gentlemen sat back, and informed me that what he had just heard, and saw was the most exciting opportunity that he had seen in years, and that we were absolutely the correct company for them to work with.

    What they had not told us until after that, was that these three gentlemen were actually representing a group of over 50 consumer electronics companies, and they were looking for a long term partner!

    Let's just say that they liked what they saw, and heard. There are many things that appear to be going on behind closed doors.....

    --

    Acting stupid isn't much fun when there's someone around who knows better

  13. I wonder why Transmeta hasn't tried this... by tilly · · Score: 3

    Today CPUs spend a lot of energy trying to extract parallelism out of code designed to be run linearly. The ability to take advantage of parallelism is strongly limited by your ability to find it, rather than the ability of the chip to carry out instructions in parallel.

    Well if the chip is emulating a dual processor machine, then you have pushed a lot of that work down to having the OS identify 2 processes that can run in parallel. I would think that this would be a huge win.

    Is there something obvious that I am missing?

    Cheers,
    Ben

    --
    My usual seat in the cluetrain is at http://pub4.ezboard.com/biwethey.ht
    1. Re:I wonder why Transmeta hasn't tried this... by Christopher+Thomas · · Score: 5
      Today CPUs spend a lot of energy trying to extract parallelism out of code designed to be run linearly. The ability to take advantage of parallelism is strongly limited by your ability to find it, rather than the ability of the chip to carry out instructions in parallel.

      Well if the chip is emulating a dual processor machine, then you have pushed a lot of that work down to having the OS identify 2 processes that can run in parallel. I would think that this would be a huge win.


      This is nice, if you can overcome a few concerns:

      • Duplicate register sets.
        Each thread will need to work with its own copy of the virtual chip's registers. For processes, instead of threads, each will also need its own page table (though you could just remap them to different sections of the same table). You might be able to emulate dual x86 processors, but something with a larger register set would pose problems.

      • Different instruction pointers.
        Each thread is an independent instruction stream, that can wander and jump any which way. You'd need explicit hardware support to be able to emulate this without prohibitive overhead.

      • Greater memory load.
        As the processes would be (hopefully) independent, you'd be hitting two completely different sets of pages when accessing memory. This means you'd need a cache twice as large to avoid thrashing. You could get around this by supporting multiple threads, not processes, but this limits the advantage of your proposal.

      • More hardware complexity.
        Transmeta's fundamental approach was to reduce hardware complexity and cost. As some additional hardware support is needed (in fact, a fair bit of additional support), emulation of SMP systems is unlikely to be embraced by Transmeta.
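The first two concerns above (separate register sets and separate instruction pointers) can be seen in a toy emulator. This is only a sketch: the three-instruction machine and the names `VirtualCPU` and `run_interleaved` are invented for illustration and have nothing to do with Transmeta's actual design. Each virtual CPU carries its own registers and its own instruction pointer, and the host interleaves them one instruction at a time.

```python
from dataclasses import dataclass, field


@dataclass
class VirtualCPU:
    program: list                      # list of (op, *operands) tuples
    regs: dict = field(default_factory=lambda: {"r0": 0, "r1": 0})
    ip: int = 0                        # private instruction pointer
    halted: bool = False

    def step(self):
        """Execute one instruction from this CPU's own instruction stream."""
        if self.halted:
            return
        op, *args = self.program[self.ip]
        self.ip += 1
        if op == "set":                # set reg, value
            self.regs[args[0]] = args[1]
        elif op == "add":              # add dst, src
            self.regs[args[0]] += self.regs[args[1]]
        elif op == "dec_jnz":          # decrement reg; jump to target if nonzero
            self.regs[args[0]] -= 1
            if self.regs[args[0]] != 0:
                self.ip = args[1]
        elif op == "halt":
            self.halted = True


def run_interleaved(cpus):
    """Round-robin one instruction per virtual CPU until all have halted."""
    rounds = 0
    while not all(c.halted for c in cpus):
        for c in cpus:
            c.step()
        rounds += 1
    return rounds


# CPU a loops (r0 = 3 + 2 + 1); CPU b runs straight through.
a = VirtualCPU([("set", "r1", 3), ("add", "r0", "r1"), ("dec_jnz", "r1", 1), ("halt",)])
b = VirtualCPU([("set", "r0", 42), ("halt",)])
rounds = run_interleaved([a, b])
print(a.regs, b.regs, rounds)
```

Every piece of per-CPU state here (`regs`, `ip`, `halted`) has to be duplicated for each emulated processor, which is the overhead the list above is pointing at; doing the same in hardware at full speed is the hard part.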


      This is a fascinating idea, but there are substantial hurdles that would have to be overcome implementing it in practice.
  14. The Real Question... by smack_attack · · Score: 3

    Yeah, that's all fine and dandy, but can you program missiles with it like my Playstation II ?