Slashdot Mirror


OEMs Jump Onto Transmeta Bandwagon

Scooter writes "News.com is reporting that Diamond Multimedia has announced a Web-Pad product based on Transmeta's 3120 processor. The report also mentions that NEC and possibly a dozen other companies are investigating similar possibilities. It's nice to see things taking shape for Crusoe so quickly." For more details on the chip itself, check out our recent story. S3 has also announced development work that will be done with Transmeta. They are working on a "Linux-based Internet appliance".

1 of 187 comments

  1. There's more to Crusoe! by Uksi · · Score: 5

    The most exciting feature of Crusoe, for me, is code morphing. Reading the white paper on the technology behind the chip (something that *a lot* of posters here should do before posting) got me even more excited.

    Basically, after a piece of code is translated to native code and optimized, it is cached. The next time it is executed, if it's still cached, the already translated and optimized version executes.
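
    To make the caching idea concrete, here's a toy sketch in Python. It is only an illustration of "translate once, reuse afterwards"; the names (TranslationCache, toy_translate) are made up for this example and have nothing to do with Transmeta's actual software.

```python
# Toy model of a translation cache. All names here are hypothetical.

class TranslationCache:
    def __init__(self, translate):
        self.translate = translate   # expensive step: foreign block -> native callable
        self.cache = {}              # block address -> translated callable
        self.translations = 0        # how many times the translation cost was paid

    def run(self, address, block):
        native = self.cache.get(address)
        if native is None:
            # First visit: translate, then remember the result.
            native = self.translate(block)
            self.cache[address] = native
            self.translations += 1
        # Every later visit skips straight to the cached native version.
        return native()


def toy_translate(block):
    # Stand-in for "translate and optimize": compile a Python expression once.
    code = compile(block, "<block>", "eval")
    return lambda: eval(code)


tc = TranslationCache(toy_translate)
results = [tc.run(0x1000, "sum(range(10))") for _ in range(1000)]
print(tc.translations)  # the block ran 1000 times but was translated once
```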

    The benefit of this is speed. A lot of people doubt this speed, saying things like "an emulator can't possibly run at 75% of the speed of the native system", etc. There are two reasons why Crusoe can outperform the native system, one of which is not obvious and is ignored by almost everyone who criticizes Crusoe.

    The main thing to remember here is that Crusoe has some radical, very different technology decisions.

    First, as any experienced software engineer would point out (backed by experimental data), roughly 90% of a program's execution time is spent in 10%(!) of its code. This means that optimizing ONLY that 10% of the code speeds up the part that accounts for 90% of the run time. Crusoe's code caching mechanism helps immensely here: as a program runs, that 10% ends up cached as native code, and translation from non-native machine code is done only ONCE.
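
    If you want to put numbers on the 90/10 argument, the usual tool is Amdahl's law. A quick sketch follows; the speedup factors are made up for illustration and are not Crusoe measurements.

```python
def overall_speedup(hot_fraction, hot_speedup):
    """Amdahl's law: whole-program speedup when only the hot part gets faster.

    hot_fraction: share of run time spent in the hot code (0.9 in the text).
    hot_speedup:  how many times faster that hot code becomes (assumed).
    """
    return 1.0 / ((1.0 - hot_fraction) + hot_fraction / hot_speedup)


# Illustrative numbers only.
print(overall_speedup(0.9, 2.0))   # hot code 2x faster  -> about 1.82x overall
print(overall_speedup(0.9, 10.0))  # hot code 10x faster -> about 5.26x overall
```

    Note the ceiling: with 10% of the run time untouched, the whole program can never get more than 10x faster, no matter how well the hot code is optimized.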

    You may be saying, "So what, in the best case, the program will run almost as fast as the native system, but it simply can't beat the native system." That's where you're wrong.

    The second reason is that the software layer performs not only translation but optimization as well. You may now object that if the original program was already optimized by the best optimizers, Crusoe's optimizer can't do better. Well, it can, because of Crusoe's architecture. Note that, for example, x86 processors have a small number of registers (which are areas for data stored internally *in* the processor; such data is accessed the *fastest*). Crusoe's VLIW architecture, however, has many more registers, and the reordering and branch-handling tricks that x86 chips implement in hardware are largely left to the software layer. Also, being a very-long-instruction-word processor, it packs several small instructions (atoms) into one big full instruction (molecule), and the atoms in a molecule execute in parallel. Crusoe's optimizer takes advantage of these features, making the translated code keep more values in native registers instead of going to L1/L2 cache or main memory (which are slower), and grouping code so it can be processed in parallel.

    Crusoe's optimizer performs really aggressive optimization. Perhaps the neatest feature is how Crusoe handles aliasing. Here's some pseudo-assembler code that loads from the same memory location twice:

    load from %X to %register
    ...(do some stuff with %register)...
    store %anotherregister to %Y
    load from %X to %register
    add %register and something else
    etc.

    This is the tightest optimization a compiler can perform. The compiler can't eliminate the second load operation because %Y may be an alias for %X (that is, %Y may point to the same memory location as %X). Such aliases come up rarely, but they can come up, so the compiler can't risk eliminating the second load instruction: it can't predict whether %Y is an alias for %X. Nobody can, not even the processor.

    Crusoe takes a radically different approach in this situation. Its optimizer ELIMINATES the second load operation, assuming that %Y is not an alias for %X. To cover the case where it is, the first load marks %X as protected with an internal bit, and the store to %Y is checked against that mark. So the code that one ends up with doesn't have the second load instruction, and when %Y does turn out to be an alias for %X, the check trips and the extra load is simply done on the fly.
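
    Here's a small Python simulation of that idea, as I understand it from the description above. It is a sketch under my own assumptions: the Memory class, the protect/check flags, and AliasFault are invented for this example and are not Crusoe's real instructions.

```python
class AliasFault(Exception):
    """Raised when a checked store hits an address marked as protected."""


class Memory:
    def __init__(self, cells):
        self.cells = dict(cells)
        self.protected = set()   # addresses whose loaded value is being reused
        self.loads = 0           # count loads so the saving is visible

    def load(self, addr, protect=False):
        self.loads += 1
        if protect:
            self.protected.add(addr)
        return self.cells[addr]

    def store(self, addr, value, check=False):
        self.cells[addr] = value
        if check and addr in self.protected:
            raise AliasFault(addr)


def conservative(mem, x, y, value):
    r = mem.load(x)
    mem.store(y, value)
    r = mem.load(x)              # second load kept, in case y aliases x
    return r + 1


def speculative(mem, x, y, value):
    r = mem.load(x, protect=True)
    try:
        mem.store(y, value, check=True)   # optimistic: assume y is not x
    except AliasFault:
        r = mem.load(x)                   # rare case: redo the load after all
    finally:
        mem.protected.discard(x)
    return r + 1


m = Memory({0: 5, 1: 7})
print(speculative(m, 0, 1, 99), m.loads)  # no alias: one load instead of two
```

    In the common no-alias case the speculative version does one load; when %Y really is %X, it pays for the second load and still gets the same answer as the conservative code.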

    This may seem like an insignificant optimization, but in reality it can be quite significant, since situations like this come up in programs very often (and usually %Y ends up not being an alias for %X). Eliminating extra loads permits better pipelining (more code executed in parallel), and an extra load may take quite a bit of time if it has to be done from memory.

    There are a whole bunch of other cool things about Crusoe's technology that make it a great all-around processor.

    So, what this means is that thanks to the revolutionary architecture, Crusoe's optimizer can optimize that 10% BEYOND the original and actually run faster.

    Users of computationally-intensive programs will especially benefit from this. For example, a 3d ray tracing program spends a lot of time in the small, tight rendering code. Having that optimized so well by the processor can have a significant effect.

    Crusoe also uses filtering techniques to avoid caching code that is executed once-an-hour (thereby preserving translated native often-executed code in the cache as long as possible).
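
    One simple way such filtering could work is to count how often each block runs and only translate it once it looks hot. The sketch below assumes exactly that; the threshold value and the class name are my own guesses for illustration, not details from Transmeta.

```python
from collections import Counter

HOT_THRESHOLD = 3  # hypothetical: runs needed before a block counts as "hot"


class FilteringTranslator:
    def __init__(self, threshold=HOT_THRESHOLD):
        self.threshold = threshold
        self.counts = Counter()   # address -> times seen while still "cold"
        self.cache = {}           # address -> translated (compiled) callable

    def execute(self, address, block):
        if address in self.cache:
            return self.cache[address]()          # fast path: already translated
        self.counts[address] += 1
        if self.counts[address] >= self.threshold:
            # Block has proven itself hot: translate once and cache it.
            code = compile(block, "<block>", "eval")
            self.cache[address] = lambda: eval(code)
            return self.cache[address]()
        return eval(block)                        # slow path: interpret, don't cache


ft = FilteringTranslator()
ft.execute(0x10, "1 + 1")                 # rarely run block
for _ in range(5):
    ft.execute(0x20, "2 * 3")             # frequently run block
print(sorted(ft.cache))                   # only the hot block's address is cached
```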

    As the website mentions, most benchmarks only measure a bunch of tasks done in 10 or 20 minutes. The website asks: do you really repeat 10 different tasks in your word processor for half an hour, or do you actually sit in front of the word processor and type most of the time? This is indeed a valid rhetorical question.

    Most benchmarks are too short to let Crusoe speed things up as much as possible.
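
    A back-of-the-envelope way to see why: the one-time translation cost gets spread over every later run of the same code. The cost numbers below are arbitrary units I picked for illustration, not measurements.

```python
def average_cost_per_run(translate_cost, native_cost, runs):
    """Average cost of one run when translation is paid once and then reused."""
    return (translate_cost + native_cost * runs) / runs


# Made-up costs: translation = 1000 units, one native run = 1 unit.
short_benchmark = average_cost_per_run(1000, 1, 10)       # few repetitions
long_session = average_cost_per_run(1000, 1, 100_000)     # many repetitions
print(short_benchmark, long_session)
```

    With only a handful of repetitions the translation cost dominates; over a long session it nearly disappears.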

    Although I don't like the "mobility features" that Transmeta keeps pushing every other sentence (damn marketing) and I don't like the fact that their benchmarks mix performance with "mobility features" (even though there is some validity in doing that), I think that Crusoe is a very exciting technology and wish I had one.

    Stop thinking in terms of megahertz. As processor technology gets more advanced, all these things stop mattering. In one app, your 700MHz AMD may perform much slower; in another, it can perform much faster. It's never the same speed all the time.