Slashdot Mirror


Ars Technica Gets Into Crusoe

redmist writes "Ars Technica has a great, in-depth article about the new Crusoe chips. Enjoy." This one will answer most of the questions I've heard about Crusoe's guts, and how it differs from other microprocessors. "Must" reading for all hardware junkies!

210 comments

  1. oh by Anonymous Coward · · Score: 0

    but you have time to sit on slashdot blabbering about how much time you don't have ha ha ha ha ha ha ha

  2. Re:Don't write to the VLIW, but... by Anonymous Coward · · Score: 0

    The cache has a fixed size (though it may be changed by the OS) and the code morphing software must do some cache management. If a piece of translated software is not used for some time (corresponding to some function in the app you're running), it may be removed from the cache to make room for a new piece of translation. At a given point in time, only a part of your app is present in the translation cache. So it's not possible to just "save the whole translation". Moreover, some parts of the app, such as initialization stuff that is run only once, might never get translated at all, because it's cheaper to just interpret it.

    Anyway, the startup cost of translation is likely lower than the time it would take to retrieve the saved version from disk.

    And anyway, low-end chips like the TM3120 are meant to run on machines without disks :-)
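The cache management described in this comment can be sketched in a few lines. This is only an illustration, assuming a least-recently-used eviction policy; the thread does not say which policy the Code Morphing software actually uses, and `TranslationCache` and `fake_translate` are hypothetical names invented for the example.

```python
from collections import OrderedDict


class TranslationCache:
    """Toy fixed-size translation cache with least-recently-used eviction.

    LRU is an assumption made for illustration only.
    """

    def __init__(self, capacity):
        self.capacity = capacity
        self.blocks = OrderedDict()  # guest address -> translated block

    def lookup(self, addr, translate):
        """Return (block, was_hit). On a miss, translate and maybe evict."""
        if addr in self.blocks:
            self.blocks.move_to_end(addr)  # mark as most recently used
            return self.blocks[addr], True
        block = translate(addr)
        self.blocks[addr] = block
        if len(self.blocks) > self.capacity:
            self.blocks.popitem(last=False)  # drop the least recently used
        return block, False


def fake_translate(addr):
    """Hypothetical stand-in for the real x86-to-VLIW translator."""
    return f"native<{addr:#x}>"


cache = TranslationCache(capacity=2)
hits = [cache.lookup(a, fake_translate)[1]
        for a in (0x100, 0x200, 0x100, 0x300, 0x200)]
print(hits)
```

With room for only two blocks, the block at `0x200` is evicted when `0x300` arrives and has to be translated again on its next use, which is why only part of an application is ever resident at once.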

  3. Re:Transmeta not impressive by Anonymous Coward · · Score: 0
    I think Crusoe is a nice chip, but the *HYPE* (and I mean hype) caused by deliberate secrecy and press leaks thoroughly destroyed any chance of it being seen as revolutionary in my eyes.

    I'm sure the guys at Transmeta are just crushed.

  4. Re:SMP Transmeta benefits? by Anonymous Coward · · Score: 0

    In case you didn't get it the first time. _These_ chips have the northbridge built in. _These_ chips and their associated code-morphers are specifically designed for a single-CPU, ultra-low-power, fully-Internet-ready, DVD-playing _laptop_. Part of _this_ design called for (in the engineers' minds) a built-in northbridge.

    _Future_ Transmeta chips don't have to follow these rules. Making an SMP-capable version is trivial now that they actually have real proof that their designs work. Drop out the northbridge, crank up the MHz, slap these on 2- or 4-CPU riser cards that share a rockin' bus (I'll take a quad machine to start with ;), and you've now gone from a "Mobile Intel" killer to a Xeon killer.

    Even better, now it's six months down the road, and they come out with the Sparc version of the code-morpher. So you just fire up your web browser, download the new code, call your local Sun dealer, buy a copy of Solaris, flash the new ISA into the CPUs, and install!

    But, Sun bites you on the ass, and two months later, after trying MacOS X, and really liking the underlying NeXT architecture, you see on slashdot that the PowerPC ISA is available. Cool!

    Now it is three months later. Your 8 CPU machine is starting to seem a little behind the times so you buy the new 32 CPU Transmeta TM431434153 (comes in a nice minitower case with a 300 watt power supply, since by now they've gotten these things down to .1 watts). You install MacOS XIV.


    But wait! There's another update! Alpha this time, and since by now Compaq has sold Alpha yet again, and it hasn't been made for six months, you can flash yet again, and you turn the old 8-way into a rocking Linux (kernel 4.2, BTW) desktop.



    Stupid predictions aside, this is the future of computing, whether some here know it or not.

  5. Re:Yet another /. rant... by Anonymous Coward · · Score: 0

    why do I need to see this here? I read it on the Ars site hours ago.

    What a moronic statement. The same could be said for any of the posts that appear on /. I read C|Net news quite a bit, but I certainly don't piss and moan when I see a story on /. about an article I've already read.

    Here's a really easy solution: Don't follow the link! Or better yet, create your own variant of /. that specifically excludes links to Ars. I even have a good name for it: "The Other Slashdot. News for nerds, except for those originating from Ars Technica, because we've already seen it."

  6. Open source version of Code Morphing software by Anonymous Coward · · Score: 0

    It would be cool if there were an open source clone of the code morphing software so we could morph a PowerPC into an Intel x86 or vice versa... and easily portable to simpler CPUs like the StrongARM so we can get fast performance... that would be a cool project.

    Will Linus Torvalds come after open source hackers who clone his software-patented technology? Would Linus Torvalds dare take on the Open Source community? Will Linus threaten the community?

    1. Re:Open source version of Code Morphing software by Anonymous Coward · · Score: 0

      Why do you care? Most whiners who yap like you couldn't write a "Hello World" program in any language. Put that in your pipe and smoke it.

  7. Re:Java? by Anonymous Coward · · Score: 0
    Re: a Java Code Morpher

    Consider what Hannibal wrote in his Crusoe article:

    The sequential, x86 application and OS code is fed into the Code Morphing layer, which takes an entire group of x86 instructions at a time and renders a "translation." A translation is a hunk of x86 code that's been translated into native Crusoe VLIW code. This translation only needs to be done once ...

    The Code Morphing software watches the translation code to see which pieces of it get used most often. The more a block of code is used the more time the Code Morphing compiler spends aggressively optimizing it, so that that block continues to run faster and faster with each use.

    Now compare that to this description of the Java HotSpot Dynamic Compiler from Sun:

    The Java HotSpot Performance Engine starts a program by interpreting the bytecodes. As the program runs, a profiler monitors the program to determine the most heavily used portions of the code.

    Nearly all applications spend most of their time in only a small portion of their code. The Java HotSpot profiler identifies those parts of an application's code that are most critical for performance. Java HotSpot then compiles and optimizes the performance-critical ``hot spots'' without wasting time compiling seldom-used code. Furthermore, the runtime analyses also enable the compiler to perform native-code optimizations not possible with static compilers.

    This seems like a marriage made in heaven!

    Peter Robinson

    (I've registered as Rodes but I don't have my pw yet.)
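Both quoted descriptions share one idea: run code the slow way first, count how often each block executes, and spend compilation effort only on the blocks that turn out to be hot. A minimal sketch of that idea follows. It is not HotSpot's or Transmeta's actual algorithm; `HotSpotRuntime` and the threshold of 3 are assumptions made up for the example, and "compiling" here just stores the same Python function.

```python
from collections import Counter

HOT_THRESHOLD = 3  # assumed value; real systems tune this


class HotSpotRuntime:
    """Toy profile-then-compile runtime (illustrative only)."""

    def __init__(self, threshold=HOT_THRESHOLD):
        self.threshold = threshold
        self.counts = Counter()  # block name -> times interpreted
        self.compiled = {}       # block name -> "optimized" code

    def run(self, name, func, *args):
        """Execute a block and report which path was taken."""
        if name in self.compiled:
            return self.compiled[name](*args), "compiled"
        self.counts[name] += 1
        if self.counts[name] >= self.threshold:
            # Stand-in for real optimization: a real system would emit
            # native code here instead of reusing the same function.
            self.compiled[name] = func
        return func(*args), "interpreted"


runtime = HotSpotRuntime()
modes = [runtime.run("square", lambda x: x * x, 7)[1] for _ in range(5)]
print(modes)
```

A block that runs only once, such as initialization code, never reaches the threshold and so never pays the compilation cost.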

  8. Hey, you got any stats on power bills? by Anonymous Coward · · Score: 0
    I was just thinking: at a nominal 30 watts, a P3 or K7 takes 720 watt-hours/day, or 22 kWh/month, which costs about $3 where I live.

    $3/month is about $100 over 3 years, which is how long I like to keep my machines.

    My point: the price of the electricity is approaching the price of the processor!
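The arithmetic in this comment is easy to check. The sketch below uses only the figures given in the comment (30 W, about $3 per month, 3 years) plus an assumed 30-day month; the price per kWh is derived from those figures, not quoted from anywhere.

```python
watts = 30                 # nominal CPU draw, from the comment
hours_per_day = 24
days_per_month = 30        # assumption: a 30-day month

wh_per_day = watts * hours_per_day                 # watt-hours per day
kwh_per_month = wh_per_day * days_per_month / 1000

cost_per_month = 3.00      # dollars, as stated in the comment
implied_rate = cost_per_month / kwh_per_month      # dollars per kWh
cost_three_years = cost_per_month * 36

print(wh_per_day, kwh_per_month, round(implied_rate, 3), cost_three_years)
```

So the "about $100" figure is $108 at exactly $3 per month, and the implied electricity price is roughly 14 cents per kWh.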

    Not to mention my desk space ... man a desk full of laptops is looking better than a giant 21" monitor all the time.

  9. Sweet by Anonymous Coward · · Score: 0

    I'm excited, aren't you?

  10. Re:Beowulf by Anonymous Coward · · Score: 0

    Cost is not the real limiting factor in an SMP configuration - it's bandwidth. To fit the requirements of SMP, all processors must have equal access to memory and I/O resources, which makes those systems the real bottleneck. Taking a closer look at memory architecture, you'll notice that memory is currently running between 8-10 ns access times. Processors, which are now pushing 1GHz, will typically have access times below 2 ns. This is obviously a problem in even a uniprocessor system, as it would imply we will complete a fetch instruction only once every 4-5 cycles. A bit of a major drawback when you've got multiple execution units sitting there waiting for instructions and data.

    The solution, which works exceptionally well, is to add cache, both on the processor and between the processor and memory. This does, however, present a problem in an SMP system. Imagine, if you will, that processor 1 (P1) fetches a memory location (A) and writes it. Now P1 has write-back cache, which means the modified value of A is written into the cache to be flushed later. Now imagine processor 2 (P2) goes to fetch the same memory location before P1 has flushed its cache. If it were to take it from main memory, it would get the old unmodified value.

    The way most commodity systems deal with this is to snoop the processor bus; i.e. the processor asserts itself on the bus, broadcasts the memory request to the rest of the processors as well as the memory controller, and, if the memory location is dirty, waits for the other processor to flush its cache line. There are variations on this architecture, such as adding a switch architecture to allow for the actual memory transfers to occur point-to-point instead of over a shared bus. But the actual broadcast must be done simultaneously to all processors; i.e. it must be atomic.

    This is not to say that integrating processor cores on the same die doesn't have merit. A lot of the tricks used to ensure cache coherency could be modified in light of the integration - the cache snooping, for example, could be done on chip. Or for that matter you could switch to a more exotic method for cache coherence, such as integrating a dedicated cache directory on die that would record what cache lines each processor core currently has loaded. The disadvantage to this is that you are now designing specialized hardware and adding to the complexity of the solution. Also, you have to take into account that any of these approaches increases die size, lowering the viable yield during manufacture. Likewise, the market for such a chip is considerably smaller than for current commodity chips, thereby raising prices even more.

    There is also an advantage to clustering that cannot be met by SMP and related schemes (like CC-NUMA) - reliability. Any single computer has the disadvantage of being vulnerable to a single component failure, despite the advent of things such as hot-plug PCI and RAID configurations. You are still vulnerable to bad processors and memory, which can be even more dangerous, as they may start corrupting data instead of simply not working. By using discrete independent units, a cluster architecture hopefully minimizes this issue to the point where you're virtually immune to a complete loss of availability. Instead, component failure results only in decreased processing power.

    For more information on the subject, I would suggest picking up "In Search of Clusters" by Gregory Pfister.
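The P1/P2 scenario in this comment can be reproduced with a toy model. This is a deliberately simplified sketch, not a real coherence protocol: it only shows a dirty line being flushed when another processor's read is snooped, and it ignores invalidation of stale copies on writes. `Bus`, `CPU`, and `demo` are hypothetical names invented for the example.

```python
class Bus:
    """Shared bus in front of main memory, with optional snooping."""

    def __init__(self, snooping):
        self.snooping = snooping
        self.memory = {}
        self.cpus = []

    def read(self, requester, addr):
        if self.snooping:
            # Broadcast the request: any other CPU holding the line
            # dirty must flush it before memory answers.
            for cpu in self.cpus:
                if cpu is not requester and addr in cpu.dirty:
                    cpu.flush(addr)
        return self.memory.get(addr, 0)


class CPU:
    """Processor with a write-back cache."""

    def __init__(self, bus):
        self.bus = bus
        self.cache = {}
        self.dirty = set()
        bus.cpus.append(self)

    def read(self, addr):
        if addr not in self.cache:
            self.cache[addr] = self.bus.read(self, addr)
        return self.cache[addr]

    def write(self, addr, value):
        self.cache[addr] = value  # write-back: memory is not updated yet
        self.dirty.add(addr)

    def flush(self, addr):
        self.bus.memory[addr] = self.cache[addr]
        self.dirty.discard(addr)


def demo(snooping):
    bus = Bus(snooping)
    bus.memory[0xA] = 1
    p1, p2 = CPU(bus), CPU(bus)
    p1.read(0xA)       # P1 fetches location A
    p1.write(0xA, 2)   # P1 modifies A in its cache only
    return p2.read(0xA)


stale = demo(snooping=False)
fresh = demo(snooping=True)
print(stale, fresh)
```

Without snooping, P2 reads the old value from main memory; with snooping, P1 is forced to flush first and P2 sees the update.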

  11. Re:You aren't SOPOSED to code in it's native set by Anonymous Coward · · Score: 0

    So, Mr. "Expert", how do you estimate that 50% of CPU cycles go to morphing? Please, enlighten us, oh mighty dopehead.

  12. Re:Crusoe core instruction set? by Anonymous Coward · · Score: 0
    This would have to be rewritten for another chip, but rewriting the instruction emulator is a lot less effort than recompiling the os and all apps.

    It probably also requires a lot less effort than using hardware alone to build a fast, modern CPU that is backwards compatible with a ~20-year-old instruction set.

  13. x86 only (mostly??) by Anonymous Coward · · Score: 0

    I keep seeing people say they expect/want TM (Transmeta) to make a code morphing layer for PPC... um, how are they going to do that? Wouldn't they have to license the instruction set from Apple? And we all know how much Apple loves to give away the stuff for their hardware. They said no to the clone makers, so I would imagine they would say no to TM. And even if they did give TM the specs so TM could write a code morphing layer to run PPC apps, I bet the total cost of a Crusoe laptop running PPC would be more than a similarly configured PowerBook/iBook. Apple wouldn't shoot themselves in the foot and allow a competitor to come out with a cheaper product; they're a business, they like to make money after all.

    Also, Ars said the northbridge and the SDRAM (that right?) modules were all integrated on die, so wouldn't that mean you need to change all of that stuff if you wanted to run a different architecture? After all, PPC doesn't run on x86 core logic sets. And of course the licensing would apply to any chip they wanted to write a code morphing layer for, unless of course the chip had open design specs, I suppose. Anyway, if I'm wrong about any of the above assumptions please lemme know. I'd love to have a cheap PPC laptop but I really don't think it's likely to happen :(

    1. Re:x86 only (mostly??) by Anonymous Coward · · Score: 0

      Ahh - no. They might have to license something from IBM or Motorola though. IBM "invented" the Power RISC architecture before Apple had anything to do with it. There have been groups in the past that had dreams of building a faster PowerPC than what IBM or Motorola can do - remember Exponential? They obtained a license to build PowerPC chips easily enough, it's just that by the time they could develop their own PowerPC core, Motorola and IBM were cranking out the 604e, which provided about the same compute power as the Exponential chip, but at like 1/10 the power consumption. (The 604e was something like 7 watts where the Exponential chip was estimated at around 60 to 70 watts.) And so, Exponential went under - all the while pointing the finger at Apple when it was their own flawed product and development timeline that was to blame. I mean, OBVIOUSLY if Apple can buy a 604e from IBM or Motorola that produces the same compute power and 1/10 the heat output, they aren't going to be buying many chips from Exponential. Duh! Of course, this is all ancient history now - PowerPC has since moved on to the G3 and G4 chips. And you are free to plug them into any piece of hardware you like.

      The only stick Apple has over you is the Mac ROMs - you need these to boot MacOS, and Apple isn't handing out any licenses for these. So, if all you care about is running PPC code, no problem! On the other hand, if it's running MacOS you're after - you'll have to deal with Apple.

      I guess that was the long way of saying - Transmeta is free to make Crusoe run PowerPC instructions if they like. Apple has no say in the matter. In fact, Apple might even buy chips from Transmeta if they were to run PPC code and provided a better price/performance than "real" PPC chips.

    2. Re:x86 only (mostly??) by nester · · Score: 1

      apple doesn't own the PPC ISA any more than intel owns x86. afaik, the only thing apple has to do w/ ppc is that they use them. i don't think apple had very much to do with the ppc isa development.

    3. Re:x86 only (mostly??) by Guy+Harris · · Score: 2
      Wouldn't they have to license the instruction set from Apple?

      Only if Apple (who didn't invent the PowerPC instruction set; it's a derivative of the IBM POWER instruction set) have some form of intellectual-property rights for the instruction set.

      If there are any such rights owned by Somerset, Apple might also have some say in licensing it.

      and even if they did give TM the specs so TM could write a code morphing layer to run PPC apps

      "The specs", in the sense of the instruction set specifications for PowerPC, are publicly available, although if the chip+software is intended to look like a particular PowerPC chip (I think the MMUs may differ, e.g. may have software TLB reload on some processors and hardware TLB reload on others), they'd need that spec as well (I think the specs for various PowerPC chips are also publicly available).

      Also, Ars said the northbridge and the SDRAM (that right?) modules were all integrated on die, so wouldn't that mean you need to change all of that stuff if you wanted to run a different architecture? After all, PPC doesn't run on x86 core logic sets

      If somebody wanted to clone not only some PowerPC CPU but a support chip set for it, so that they could run OSes such as MacOS unmodified, that stuff might have to be changed...

      ...but that's just cloning a Mac, which Apple isn't allowing even if you use existing PowerPC chips.

      Of course, there is the possibility that Apple would want to use a Transmeta chip in a Powerbook, say, in which case the Apple licensing issues go away.

      Apple are unlikely to be the ones to block such a Code Morphing(TM)(R)(LSMFT) layer; they don't, as far as I know, have a problem with people building non-Mac-compatible PowerPC machines, and they already have, as far as I know, the tools to block people from building Mac clones.

  14. Terminator2 by Anonymous Coward · · Score: 0

    Anyone remember the Terminator 2 movie?

    The chip that made it easier to develop self learning AI?

  15. Re:Slightly Off Topic by Anonymous Coward · · Score: 0
    Or not - the Merced will be big clunky and slow because, just like every previous "recent generation" Intel CPU it will have hardware to translate x86 to the Merced VLIW.

    Well, uh: isn't this basically what Crusoe is doing? I realize they've got a few other features that Merced may not be able to duplicate, but Crusoe should be sufficient to prove that a chip need not be 'big and clunky' to execute x86.

  16. Re:Slightly Off Topic by Anonymous Coward · · Score: 0

    Right, but the question was not "which will run x86 code better - Merced or Crusoe?" It was, what if I just want to bang on the VLIW directly? And my answer was that if you want to bang on the VLIW you should buy a chip that only does the VLIW in hardware (like Crusoe) instead of one that does x86 translation as well (like Merced.) And since Transmeta doesn't want people banging the VLIW directly, I guess that means you have to buy a Trimedia or something for this purpose.

  17. software=buggy by Anonymous Coward · · Score: 0

    Is it me, or is software, especially this complex "code morphing" stuff, prone to be buggy? Then what, you have to download new code morphing software (I hate flashing ROMs)? It seems like another layer of complexity that can go wrong.

  18. Re:Doesn't make sense by Anonymous Coward · · Score: 0

    You're right, of course - just a little bit confused. (Un?)fortunately, nobody is claiming that the Transmeta Crusoe CPU is faster than an Intel Pentium III - because it's NOT faster! Transmeta's own benchmarks show that the Crusoe is not as fast as a Pentium III. What IS being claimed is that the Crusoe comes "close enough" to the performance of a PIII so as to be useful, with only 1/7th the electrical budget. That's a big deal for battery powered equipment! I.e., your laptop can run several times longer, etc.

  19. Re:What I'd like to see... by Anonymous Coward · · Score: 0

    I want to see Crusoe vs StrongARM.

    StrongARM has no floating-point. It's also not x86 compatible.

    But it might be even less power hungry than Crusoe. Next-gen StrongARM chips are expected this year, and will range from 150 to 600MHz, while consuming between 0.040W and 0.450W (well, that's what Intel has announced).

  20. Not Really A hardware junkie, but... by Anonymous Coward · · Score: 0

    Ars is correct in saying that all facets of the chip race are about to change. They'd be fools not to push a server/workstation line directly into competition with Intel as soon as they can. They're not fools. The significant thing in my mind is that the folks at Transmeta really went and rethought a lot of concepts that everyone knew were good, and they added a new twist to it and created something unique. I often wondered at watching the Alpha market slowly slump when the RISC based chips consistently turned out blazingly fast benchmark numbers. They were maybe 2 or 3 generations ahead of Intel for a while, but lost it. Hopefully Transmeta is going to be the company that is finally able to not only introduce a cool technology like this, but also evolve it.

  21. Re:Slightly Off Topic by Anonymous Coward · · Score: 0
    And my answer was that if you want to bang on the VLIW you should buy a chip that only does the VLIW in hardware (like Crusoe) instead of one that does x86 translation as well (like Merced.)

    Doh! I should work on my reading comprehension. Yeah - I see your point.

  22. Re:The customary question... by Anonymous Coward · · Score: 0

    In the process of answering someone else's question about the possibility of Crusoe SMP, I had a kinda neat idea for a Crusoe cluster - e.g. Beowulf. You can't really run the Crusoe in the traditional shared memory SMP sense because the "north bridge" is built into the chip. That is to say, instead of a processor bus coming out of the chip you get a PCI bus and an SDRAM bus. However, therein lies a cool possibility for a Beowulf cluster. Beowulf clusters communicate over a network fabric, so why not just use the built-in PCI bus of the Crusoe as that network fabric? You could put 4 Crusoe chips and a PCI bridge onto a PCI card and there you have 4 nodes of your Beowulf. Now plug a few of those cards into a passive PCI backplane and you have n multiples - 4 cards would be a 16 node Beowulf with just 4 PCI sized cards, a passive backplane, and a power supply. Of course, you would need some stuff like a hard drive to boot your OS - so I guess each card would need an IDE controller or something, still no big deal from the hardware standpoint. And a helluva fast network for interconnecting the nodes. Perhaps you could stick a gigabit Ethernet card into a fifth slot on that backplane and that could be your link out to additional 16 node boxes.

  23. Re:What I'd really like to hear about... by Anonymous Coward · · Score: 0
    One possible problem is that the chip only has a finite number of registers (64, IIRC). So if you are emulating a chip with 40 registers, simultaneously emulating a second chip that needs more than 24 registers could cause problems.

    Unlikely. A multi-ISA chip would probably 'switch' between ISAs in the same way that a multitasking OS switches between tasks: All the register values for the outgoing task get stored, and new register values for the incoming task get loaded. If something like this is done for a multi-ISA chip, there's no need for register values from both ISAs to coexist in the CPU at the same time.
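The save-and-restore scheme suggested in this reply can be sketched as follows. It is a toy model under stated assumptions: the 64-register figure is the parent comment's "IIRC" number, and `Core`, `switch_to`, and the ISA names are hypothetical. Nothing here describes how Crusoe actually handles multiple ISAs.

```python
class Core:
    """Toy core that time-shares one register file between emulated ISAs."""

    def __init__(self, num_regs=64):  # 64 is the parent comment's figure
        self.regs = [0] * num_regs
        self.saved = {}      # ISA name -> saved register values
        self.active = None   # (ISA name, registers in use)

    def switch_to(self, isa, regs_needed):
        if regs_needed > len(self.regs):
            raise ValueError("ISA needs more registers than the core has")
        if self.active is not None:
            name, count = self.active
            self.saved[name] = self.regs[:count]   # store outgoing state
        restored = self.saved.get(isa, [0] * regs_needed)
        self.regs[:regs_needed] = restored         # load incoming state
        self.active = (isa, regs_needed)


core = Core()
core.switch_to("isa_a", 40)
core.regs[0] = 111
core.switch_to("isa_b", 30)   # 40 + 30 > 64, yet this is fine
b_initial = core.regs[0]
core.regs[0] = 222
core.switch_to("isa_a", 40)
a_restored = core.regs[0]
print(b_initial, a_restored)
```

The two ISAs together would need 70 registers, more than the 64 assumed here, but because only one set is live at a time, each sees its own values intact after a switch.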

  24. Re:Slightly Off Topic by Anonymous Coward · · Score: 0

    Or not - the Merced will be big clunky and slow because, just like every previous "recent generation" Intel CPU it will have hardware to translate x86 to the Merced VLIW. A better statement might have been, if you want to code native VLIW, go buy a Trimedia processor - at least if you want an efficient chip that doesn't contain a bunch of x86 baggage.

  25. Re:software=buggy NOT by Anonymous Coward · · Score: 0

    I'm sure all the EEs are getting a chuckle out of your post. What makes you think hardware is immune to bugs? Remember the Intel F00F bug? Or how about Intel's IA32 floating point fiasco? If anything, the code morphing software can be stored in flash ROM and updated if found buggy. It also means that the hardware can be less complex, ergo less buggy hardware. Sounds like a win-win situation to me.

  26. Re:SLASHDOT SUCKS AND IS GAY by Anonymous Coward · · Score: 0

    then why are you reading?

  27. NOT funny by Anonymous Coward · · Score: 0

    but rather communistic and zealous (in the linux persuasion)!

  28. Re:Crusoe-VLIW native code by Anonymous Coward · · Score: 0

    Code morphing software is stored in ROM (or FlashROM if you want it to be upgradable) and loaded to main memory at boot time. It's NOT stored in some mysterious part of the chip.

    If the code morphing software is stored in FlashROM, you may lock a part of it so that it never gets overwritten. That part could then hold enough code to read a new FlashROM image, e.g. from a floppy disk. So even if you turn the power off during flashing, you can still use a recovery disk.
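The locked-region idea in this comment can be modelled in a few lines. This is a sketch under assumptions, not a description of any real flash part: `FlashRom`, `flash_image`, the 4-byte boot block, and the simulated power loss are all invented for the example.

```python
class FlashRom:
    """Toy flash ROM whose first bytes (the boot block) are write-protected."""

    def __init__(self, boot_block, size):
        self.locked = len(boot_block)
        self.cells = bytearray(boot_block) + bytearray(size - self.locked)

    def write(self, offset, data):
        if offset < self.locked:
            raise PermissionError("boot block is locked")
        if offset + len(data) > len(self.cells):
            raise ValueError("write past end of ROM")
        self.cells[offset:offset + len(data)] = data

    def boot_block(self):
        return bytes(self.cells[:self.locked])


def flash_image(rom, image, fail_after=None):
    """Write image after the boot block, one byte at a time.

    fail_after simulates power loss after that many bytes were written.
    """
    for i, byte in enumerate(image):
        if fail_after is not None and i == fail_after:
            return False
        rom.write(rom.locked + i, bytes([byte]))
    return True


rom = FlashRom(b"BOOT", size=12)
interrupted = flash_image(rom, b"NEWIMAGE", fail_after=3)
survived = rom.boot_block()           # recovery code is still intact
recovered = flash_image(rom, b"NEWIMAGE")
print(interrupted, survived, recovered, bytes(rom.cells))
```

Because the loader in the boot block can never be overwritten, an interrupted update leaves a half-written image but still leaves enough code to retry the flash.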

  29. Re:Crusoe-VLIW native code by Anonymous Coward · · Score: 0

    Seems like sometime in the past I was under some foolish impression that software was a lot more expensive to develop than hardware. I'm just wondering how this fits into this idea of pushing function that used to be in hardware up into software?

    Today's high end processors are very complex. The design teams are huge, the development cycle is long. This kind of hardware IS very expensive to develop.

    Moreover hardware bugs can be very costly... Just ask Intel about the Pentium division bug...

  30. Re:SMP Transmeta benefits? by Anonymous Coward · · Score: 0

    The answer is - NO! This CPU is clearly not designed for traditional SMP designs. The PCI "north bridge" and the memory controller are built right onto the Crusoe chip. Without access to the CPU main bus, you are kinda locked out of the traditional shared memory design.

    However, you could do a cluster where each CPU has its own memory and runs its own process and the CPUs just communicate over some kind of network. That network might as well be the PCI bus, as it comes "for free" with the Crusoe CPU. You would be limited to a 4 CPU cluster though, unless you invest in a bunch of PCI to PCI bridges.

  31. Re:Crusoe-VLIW native code by Anonymous Coward · · Score: 0
    Heh, the thing I think is cool is that you could start off buying a chip this year, and if a new technology (Like SIMD or 3DNow!) comes out, you can just go to Transmeta's web site or whatever, download the new instructions, and go run a program that uses the new instructions!

    There's a basic risk here, though: from what I understand, the 'Code Morphing' software doesn't reside in main system memory - instead, it's in a special on-chip memory area, which is loaded from a ROM at boot time. So you replace the ROM with an EEPROM, and make it possible for users to cram a new instruction set in there. What happens if there's a bug in that new instruction set, or the flash process fouls up? Your computer won't boot. It won't even come close to booting - this isn't something you can fix with a bootable floppy, because the code to load the system on the boot floppy won't run any more. Now how do you fix it?

    Actually, I suppose there's an obvious solution available: Make it so that the chip can load its 'Code Morphing' layer from either the EEPROM, or a hard-wired ROM, and make it possible to choose between the two with a jumper. If something fouls up, open the case, swap the jumper, reboot, and re-write the EEPROM. Still, this could be a big pain in the ass for people who aren't comfortable rooting around inside their computers.

  32. Re:The customary question... by Anonymous Coward · · Score: 0

    Moderate this up, I haven't laughed this hard in a long time. And besides, if anyone can use the karma, it looks like fr0g can.

  33. Re:Some Question about Crusoe by Anonymous Coward · · Score: 0

    I really hate to nitpick here, but learn your x86 assembly!

    code from above:
    ADD AX, BX
    SUB CX, AX
    JNZ Cx
    You are a tad bit off in your last line.
    a) you either need a 'test' instruction to check if the zero flag is set for jnz to work. I don't see a test or anything modifying the flags
    b) jnz jumps to a label. cx is a register, doh! maybe next time :)

  34. Hey YUTZ!! by Anonymous Coward · · Score: 0

    Ever hear of the FDIV bug??? How about F00F?

  35. Re:SLASHDOT SUCKS AND IS GAY by Anonymous Coward · · Score: 0

    He's merely transferring his masochistic ways to the electronic realm.

  36. Re:Yet another /. rant... by Anonymous Coward · · Score: 0

    if you read Ars, you may notice they link to slashdot for some of their stories, why not the other way around?

  37. Re:Crusoe-VLIW native code by Anonymous Coward · · Score: 0

    (...) if Crusoe can't run different "morphers" simultaneously (which I suspect it can't).

    I suspect it can. If I were working at Transmeta, running two ISAs at the same time would be one of my top goals: if you can run x86 code and Java code at the same speed, you're a winner.

    I heard they have been demonstrating Java programs running simultaneously with x86 code at the conference, but could not get many more details.

  38. Re:Slightly Off Topic by Anonymous Coward · · Score: 0

    Excellent thought. Going along with your suggestion but pushing it a bit further: what Transmeta has done is decouple the ISA from the chip, which of course is a point that you already get, but I'll repeat it as motivation for what follows. With decoupling, a new class of engineers will be created: ISA designers. There are very few ISAs right now because of the enormous cost of porting everything that runs on top of them. With the appearance of code morphing, the cost is dramatically reduced to practically 0, since the system can be running multiple ISAs at the same time (with a few flash ROM cards). So this frees up ISA engineers to tailor instruction sets for application needs. In fact I see a day in the future where most applications will be written in multiple ISAs.

    You might be asking why one needs multiple ISAs anyway. Well, just as different computer languages address different problems, different ISAs can be tailored too. If a particular component of an application performs no floating point calc but does a lot of memory manipulation, the ISA for that piece can be optimized for that, and the corresponding code-morphing software will take that into account. Another part of the application might do a lot of arithmetic, so the ISA for that part might be optimized for that.

    So in the future, maybe 30 to 50 years (it takes people a while to adapt to such a breakthrough tech), we will see a suite of virtual ISAs that never have been and never will be implemented in hardware. These virtual ISAs will be like C/C++, Java, Eiffel, etc. are today. (Of course by then I hope programming will be fully visual, but that is another story.) Email me if you think my ideas are interesting, at bineronbrain@netzero.com

  39. computer architecture by hennesy and whatsisface by Anonymous Coward · · Score: 0

    in their section on VLIW processors they explicitly mention that some kind of emulation could be used to help the modularity / consistency of VLIW cpu.. that book was written a couple years back.. guess they will include more info on Transmeta now!

  40. Quake3 WAS running with a hardware accelerator by Anonymous Coward · · Score: 0

    So I'm assuming that demo of quake3 they showed WAS running in software mode with some pretty fancy dynamic optimisations going on.

    There's no chance the Transmeta CPU would have done the 3D calculations in software. No matter how much dynamic optimisation is applied. You simply need more FPU power to do this than the TM5400 seems to provide.

    I mean, we don't have much data on the FPU, but we know it only has 1 such unit. So it can only launch one new FP operation per cycle, which is not enough for this kind of application.

    Compare this with the Sony/Toshiba Emotion Engine (EE) which has 2 additional 128-bit SIMD units, each providing 4 single-precision FP operations. (BTW, the EE is not a VLIW, it's a classical 2-way superscalar).

  41. Yet another /. rant... by penguinboy · · Score: 0

    As one of the undoubtedly many /. readers who also reads Ars, why do I need to see this here? I read it on the Ars site hours ago.

    1. Re:Yet another /. rant... by Bill+Currie · · Score: 1

      Because not all /. readers read Ars. Also, because posting the article on /. allows /. readers to discuss it.

      --

      Bill - aka taniwha
      --
      Leave others their otherness. -- Aratak

    2. Re:Yet another /. rant... by dizzydogg · · Score: 1

      Some of us have actual work to do and can't spend all hours of our day surfing all the computer news sites to see if one has a new tidbit.

  42. Re:The customary question... by fr0g · · Score: 0

    NOT CHEAP!!!!! what kind of chips are you buying?? or are you eating paint chips?

  43. Why I'm Disappointed in Crusoe by Anomalous+Canard · · Score: 0

    I was hoping for a chip that would ease the transition from x86 based software into generic Unix software. Imagine a chip that would allow an x86 compatibility-mode but would run native software more efficiently. That would ease the transition in the same way that Windows 3.0 eased the transition from DOS to Windows applications by providing backwards compatibility for old apps. It appears that Transmeta is not interested in documenting the native instruction set for Crusoe, which means that there won't be native compilers, which means there won't be native apps that take advantage of the speed that the new chip offers. Now *perhaps* the knowledge about the efficiencies available in the translation layer will give Linus a leg up over the coders of other operating systems, but that knowledge doesn't translate to all the other coders working on Open Source software.
    Anomalous: inconsistent with or deviating from what is usual, normal, or expected

    --
    Anomalous: deviating from what is usual, normal, or expected
    Canard: a false or unfounded repor
    1. Re:Why I'm Disappointed in Crusoe by Mike+Hicks · · Score: 1

      Unfortunately, I think the x86 compatibility would end up being used as a crutch, like the Windows 3.x compatibility in OS/2. Software developers thought, "Well, if OS/2 runs Windows applications, there's no reason for me to port my app to OS/2."

      Also, much the same thing has happened in the Windows world. Many apps have 16-bit code under the hood, making Microsoft's transition from Windows 9x (16/32-bit OS) to the coming NT derivatives (fully 32-bit) harder. This is also one reason why the WINE project can't run certain programs.
      --

      Ski-U-Mah!

    2. Re:Why I'm Disappointed in Crusoe by Anomalous+Canard · · Score: 1

      It only remains a crutch if there's no compelling reason to move on past it. OS/2 offered little that Windows itself didn't offer, but if native apps were faster, that would be plenty of incentive to move past the transition layer.

      Note to Moderators: Flamebait?!? You couldn't tell flamebait if it scorched your shorts. I gave a personal impression (Why *I* was disappointed, not Why you should be) and I backed it up with facts. What I expected, and what I got.
      Anomalous: inconsistent with or deviating from what is usual, normal, or expected

      --
      Anomalous: deviating from what is usual, normal, or expected
      Canard: a false or unfounded repor
  44. Re:hi by MrTGuy · · Score: 0

    I pity the fool who don't like mr T! Mr T vs Slashdot Go Read it suckas!

    --
    Quit yo jibba jabberin' and come see Mr. T
  45. Re:hi by Anonymous Coward · · Score: 1

    Mr T vs. CmdrTaco would be funnier if you had Mr T saying "hella", like all the other, funnier Mr T vs. "whatever", instead of "helluva".






    damn i'm picky.

  46. Why turn it off by Anonymous Coward · · Score: 1

    Given their target of small mobile devices, like webpads and the like, its low power consumption and sleep mode, I don't think it's intended to be "turned off".

    1. Re:Why turn it off by Handtuch · · Score: 1

      Those gold-caps are afaik just to keep the memory from losing its bits. Take a flash memory and you don't even need buffering. Alex.

    2. Re:Why turn it off by tzanger · · Score: 2

      Regardless, it will have to be turned off even if it is for battery changes.

      Not to be an ass, but my Palmpilot doesn't lose its data when I switch batteries. Nor do (new) VCRs lose their programming on power loss. Devices called super capacitors (5V 1F style things) keep enough energy around to keep very low power components up and running in a sleep mode to ride out such interruptions.

    3. Re:Why turn it off by scheme · · Score: 2
      Given their target of small mobile devices, like webpads and the like, its low power consumption and sleep mode, I don't think it's intended to be "turned off"

      Regardless, it will have to be turned off even if it is for battery changes. Also people may drain the battery after using it for a while, requiring a power down. These points aside, even low power devices today are turned off (e.g. laptops, palm pilots, etc.) since even standby mode drains too much power. The Crusoe systems will probably drain more power than a palm pilot so you'll probably need to turn it off.

      --
      "When you sit with a nice girl for two hours, it seems like two minutes. When you sit on a hot stove for two minutes, it
  47. Re:What I'd really like to hear about... by Anonymous Coward · · Score: 1
    Watch the videos (in RealMedia format, 1 2 3). They indicate that it would be possible to multitask multiple instruction sets simultaneously, and demoed a CPU running java bytecode. The instruction sets aren't stored onchip, they are implemented in software.

    One possible problem is that the chip only has a finite number of registers (64, IIRC). So if you are emulating a chip with 40 registers, simultaneously emulating a second chip that needs more than 24 registers could cause problems. You'd probably store the extra registers in memory, which would slow the performance, but it wouldn't be any worse than a software-based emulator like vmware.

  48. Give me a damn gcc or I don't want it by Anonymous Coward · · Score: 1
    Mobile Linux is a derivative of Linux, and Linus Torvalds is not the only copyright holder of the Linux kernel. As Linus himself has said, he cleverly tied his own hands: so many people have copyrights in the kernel that nobody, not even Linus, can release a non-GPL version now.

    So even if Transmeta has no native C compiler, they still have a complete bootable operating system we can read.

    And what C compiler does Transmeta use for Mobile Linux? Did they somehow remove the zillion lines of gcc'isms from the kernel code? Or is their compiler a derivative of gcc?

    Meanwhile, I say: fuck the compatibility argument. I'm a big boy. If you tell me that my native VLIW binaries will crash and burn on the next model over, I can handle that. I'll recompile the program when I switch machines, but I want a native gcc, or the chip is not worth programming for.

  49. Re:explain "cooL' by Anonymous Coward · · Score: 1

    Explain why Crusoe is ``cool''? My friend, for those who know, no explanation is necessary. For those who don't know, no explanation is possible.

  50. SMP Transmeta benefits? by Anonymous Coward · · Score: 1
    I was just thinking about some interesting side-effects of an SMP Transmeta computer. Would it be possible to have a central optimized cache for all the processors? Any problems with contention? Could one processor be dedicated entirely to computing optimizations? Any advantages here? I guess I'm comparing this to current SMP server systems. Would it be silly to consider mobile SMP applications?

    just random thoughts...

  51. Re:Sweet...or sour??? by Anonymous Coward · · Score: 1

    Well, if you really read the Ars article, he makes it pretty clear that the Crusoe is NOT a high performance chip. It is designed to run "typical" applications (like Office) at a similar pace as a full blown Intel CPU but with lower power consumption. (It seems to me we have heard this line before - AMD tried to sell us all the K5 - or was it the K6 that was "the fastest engine ever designed for Windows apps." The problem was, if you wanted to run Quake instead of Office, the CPU generally sucked ass performance wise.)

    I think the snake oil is pretty obvious if you look at benchmarks Transmeta has published. They are showing some "relative" time to complete typical Windows tasks vs. an Intel CPU and the Crusoe is losing - though not by much. We don't get any "standard" benchmarks like SPEC or Dhrystone or MIPS or MFLOPS because if they ran those, Crusoe's lack of processing power would just be all the more apparent. (Though it might come close if they ran those benchmarks as compiled native code as opposed to emulated x86.)

    The reason they can get away with this, of course, is that you don't need a Pentium III 600 to run typical "Office" like apps - most of the CPU power on a chip like the PIII just gets burned up in system idle cycles anyway. Now, certainly the fact that Crusoe is low power is promising - a lot of people need a laptop that can run for 10 hours and they don't necessarily need to run Q3A full bore. It's also pretty cool that they put the "north bridge" and the memory controller on the same chip as the CPU - that's a really good idea, especially for the mobile market they are targeting. But this all doesn't excite me that much - does anybody remember the DEC StrongARM RISC? Another example of a chip that provides reasonably good performance from less than one watt of power - though it did not provide any kind of x86 compatibility.

    Now obviously the Ars article points out that these aren't the ONLY CPUs Transmeta will produce. In the future they may build high performance workstation or server class chips. For now, I guess all the performance junkies can go back to drooling over the Alpha.

    Just my 0.02

  52. Re:You aren't SOPOSED to code in it's native set by Anonymous Coward · · Score: 1

    The reason to do the optimization on the fly is that by doing so you gain extra profiling information that is impossible to get at compile time. Dynamic optimization/recompilation allows the processor to improve the execution speed of blocks it executes frequently, and also do things like adjust caching schemes and do better speculative (or predicated, I guess) execution.

  53. Re:What I'd like to see... by Anonymous Coward · · Score: 1

    Interesting that you should ask this question. In fact, Motorola makes an embedded version of the 603e called the MPC8240. It has a built in 66MHz PCI "north bridge" and 100MHz SDRAM controller. Sounds a little like the Transmeta chip except that it consumes more power and only runs "native" PPC code. The 8240 is generally used in applications like routers or other network devices.

    Performance of the MPC8240 is in the range of 375 dhrystone MIPS at 266MHz. Would be nice if we had a similar benchmark for the Crusoe, yes? As another benchmark, the StrongARM SA1100 comes in at about 250 dhrystone MIPS at 220MHz - so similar performance. The StrongARM, of course, consumes less power (under 1 watt) than the MPC8240, but the 1100 does not have the built in PCI bridge.

    Of course, then you can get into the "higher power" CPUs like the PowerPC G4 - it sits at 825 dhrystone MIPS at 450 MHz. Or, if you get into the SIMD vector processor, a billion floating point ops/second. That's pretty fast, though the chip consumes about 5 watts. Things like the Intel PIII and AMD Athlon provide about the same compute power as a G4, but consume MUCH more power - something in the range of 30 watts for these beasts. If you're going to consume that much power, you might as well get yourself an Alpha which will give you double the performance of the Athlon on the same electrical budget. (You can't run x86 code native on an Alpha, but who gives a F* if you can get twice the performance for the same electrical budget?) Clearly a 30 watt CPU is well outside the notebook computer range. Obviously that's what "slow" low power chips like the Crusoe are for ;)
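
    For a quick per-clock comparison of the figures quoted above, here is a small Python sketch. The numbers are the ones given in this post, not independent measurements, and the names `chips` and `mips_per_mhz` are made up for illustration.

```python
# Dhrystone MIPS and clock (MHz) exactly as quoted in the post above.
chips = {
    "MPC8240": (375, 266),
    "StrongARM SA1100": (250, 220),
    "PowerPC G4": (825, 450),
}

# Normalize by clock speed so chips at different MHz can be compared.
mips_per_mhz = {name: mips / mhz for name, (mips, mhz) in chips.items()}

for name, value in sorted(mips_per_mhz.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: {value:.2f} Dhrystone MIPS per MHz")
```

    A similar number for Crusoe would need a published Dhrystone score, which the post notes is not available.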

  54. explain "cooL' by Anonymous Coward · · Score: 1

    The low power aspects of Crusoe *are* cool. But I am curious about what makes code-morphing "cool". VLIW is old hat, and code-morphing sounds suspiciously like a JIT: it recompiles x86 into a native VLIW format. As an example, take Sun's Java hotspot compiler, which adaptively recompiles one machine language format (Java bytecode) into another (x86 or Sparc or whatever.) Hotspot *also* does on-the-fly optimization and it also analyzes a running program for "hotspots" that need to be aggressively optimized.

    I also note that Hotspot was heavily hyped and hasn't quite lived up to being the world-changing technology that it was supposed to be. I guess adaptive recompiling is harder than we thought...

    Finally, VLIW *can* be damn fast. But what happens if you encounter a bunch of move instructions in a row, or a bunch of integer instructions, or whatever? Then only one of the four possible slots will be filled per clock cycle, while the other three instruction units sit around twiddling their thumbs, no?
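
    To make that worry concrete, here is a minimal Python sketch. It assumes a hypothetical four-wide machine with exactly one slot per unit kind and a greedy in-order packer that ignores data dependencies. The names `SLOTS`, `pack` and `utilization` are invented for illustration; this is not Crusoe's actual slot layout or scheduler.

```python
from collections import Counter

# Hypothetical 4-wide VLIW: one issue slot per functional-unit kind (an assumption).
SLOTS = {"alu": 1, "mem": 1, "fpu": 1, "branch": 1}

def pack(ops):
    """Greedily pack ops, in order, into bundles; start a new bundle when a slot is taken."""
    bundles = []
    current = Counter()
    for op in ops:
        if current[op] >= SLOTS[op]:
            bundles.append(current)
            current = Counter()
        current[op] += 1
    if current:
        bundles.append(current)
    return bundles

def utilization(ops):
    """Fraction of available slots that actually hold an instruction."""
    bundles = pack(ops)
    if not bundles:
        return 0.0
    return len(ops) / (len(bundles) * sum(SLOTS.values()))

same_kind = ["alu"] * 8                          # a run of integer ops
mixed = ["alu", "mem", "fpu", "branch"] * 2      # a well-mixed stream
print(utilization(same_kind), utilization(mixed))
```

    Under these assumptions a run of same-kind instructions fills a quarter of the slots, while a well-mixed stream fills all of them; a real scheduler would also reorder instructions, which this sketch does not attempt.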

    IMO, we already have a portable low level language. It's called C! I also suspect that any reasonable C compiler will out-optimize a JIT/Code Morpher/whatever just about any day of the week.

    Hey - if I'm wrong, somebody please educate me! It sucks being ignorant!

    1. Re:explain "cooL' by Trepalium · · Score: 2
      JIT is actually very different. A JIT compiler doesn't have to deal with things like a memory map that cannot change, self-modifying code, stack stuffing, hardware interrupts, etc. But this idea is far from new. The Macintosh emulator for PC, Executor, used similar techniques. It was based on a dynamic recompiling CPU emulation core that would translate and simplify a series of instructions, cache the resulting instructions, and then execute them. This technique is only efficient when large amounts of code can be executed without interruption and without the requirement of cycle-level precision of timing. For 98% of the time, x86 software doesn't care or need to care about very precise CPU timings (there are too many different types of x86 CPUs out there to make it useful).
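
      The translate, simplify, cache, execute cycle described above can be sketched in a few lines of Python. This is a toy under stated assumptions: the guest "ISA" is a made-up accumulator machine with only `add` and `mul`, and `interpret`, `translate` and `Emulator` are hypothetical names, not anything taken from Executor or Crusoe.

```python
def interpret(block, acc):
    """Run a block of (op, n) pairs against a single accumulator."""
    for op, n in block:
        acc = acc + n if op == "add" else acc * n
    return acc

def translate(block):
    """Simplify a block once by folding runs of the same op into one instruction."""
    simplified = []
    for op, n in block:
        if simplified and simplified[-1][0] == op:
            _, prev_n = simplified.pop()
            n = prev_n + n if op == "add" else prev_n * n
        simplified.append((op, n))
    return simplified

class Emulator:
    def __init__(self):
        self.cache = {}  # guest block address -> simplified block

    def run_block(self, addr, block, acc):
        # Translate on first sight, then reuse the cached result on every later run.
        if addr not in self.cache:
            self.cache[addr] = translate(block)
        return interpret(self.cache[addr], acc)

emu = Emulator()
block = [("add", 1), ("add", 2), ("mul", 3), ("mul", 2)]
print(emu.run_block(0x1000, block, 1), emu.cache[0x1000])
```

      The sketch deliberately ignores the hard parts the post lists (self-modifying code, interrupts, precise timing); handling those is where the real work would be.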

      As an instruction set, the x86 is pretty bad; however, it's easy to code for and easy to optimize for, which are its biggest strengths. As a mid-layer API for this device, it was probably a good choice -- x86 recompiles well on RISCy machines with lots of registers. PowerPC and others probably wouldn't. They have a lot of registers themselves and are more complex (plus the wide variety of x86 clones means that most people will likely shy away from dangerous instructions, whereas since the PPC is VERY standardized, many software packages could rely on subtle bugs in the silicon of the PPC. Believe me, bugs are the hardest part of any hardware architecture to emulate.)

      ARDI has some fairly interesting whitepapers on their implementation of the 68k instruction set on x86. Keep in mind this is MUCH more difficult to do than the reverse. The 68k CPU has 16 registers, whereas x86 only has 8, etc.

      The big problem with C is the same C source file compiled with the same compiler can often produce many wildly different results, and C doesn't solve the problem of hardware accesses, which almost always need to be done in a low-level language. This CPU/software will be beneficial to many companies due to the fact they will be able to reuse existing hardware, drivers and software with this. As long as the prices for the CPU get really low eventually, it could really lower the prices for PDA and hand-held computers. (Imagine playing a 3D accelerated game of Quake III on a hand-held machine)

      --
      I used up all my sick days, so I'm calling in dead.
  55. Re:Slightly Off Topic by Mike+Hicks · · Score: 1

    Rumor has it that they actually ported Linux to run on bare hardware, and it didn't really help enough to make it worth the trouble. Besides, a new version of Linux would likely have to be made for each different Transmeta chip (as the TM3120 and TM5400 have different instruction sets)

    One thing that we may find, however, is that a certain architecture is emulated better than x86 (i.e. the PowerPC, ARM, or Alpha architecture may be easier to translate into native VLIW). Therefore it may be a better idea to run Linux over PPC/ARM/Alpha code-morphing software on a Transmeta chip (or maybe just a specific type of Transmeta chip works better, etc., etc.)

    Boy, this gets confusing after a while.


    On a somewhat different topic:
    I kind of wonder if IBM is actually getting some technology from Transmeta. They moved the AS/400 from 32-bit to 64-bit (CPUs) a few years back and had to make sure the new systems were able to execute old code (actually, I understand that AS/400 machine code is abstracted from the object code of programs, though probably not in quite the same way as how Transmeta did things - if that makes any sense at all..)
    --

    Ski-U-Mah!

  56. Re:This is so cool... by Mike+Hicks · · Score: 1

    I understand there was a proof-of-concept demo at the Crusoe unveiling that would switch from x86 to Java bytecodes. I'm not sure if swapping between x86 and Java required a reboot or anything like that -- I wish I could have been at the unveiling so I could have seen that in person (but then I'm probably a terrible reporter ;-)
    --

    Ski-U-Mah!

  57. Re:Crusoe-VLIW native code by Mike+Hicks · · Score: 1

    The primary reason is that they don't want to have to make these chips backwards compatible. Intel has a lot of problems with this - even the newest Pentium III's must support programs written for 386s

    Heck, a Pentium III can run 8080/8086 code (maybe even 8008 code or 4004 code!)

    Since the morphing code is running in Flash ROM, it can be upgraded, but if someone tried to load a morpher that doesn't work they're gonna have trouble reverting back to x86.

    Heh, the thing I think is cool is that you could start off buying a chip this year, and if a new technology (Like SIMD or 3DNow!) comes out, you can just go to Transmeta's web site or whatever, download the new instructions, and go run a program that uses the new instructions! (Well, presuming that Transmeta will support older chips and whatnot -- that could be a problem with having different instruction sets for each chip. How long do you support an instruction set?)
    --

    Ski-U-Mah!

  58. Re:You aren't SOPOSED to code in it's native set by Gumby · · Score: 1

    I believe the morphing layer is compiled to native code. If it was the highest performance way to do the morpher, then that contradicts your claim. (OK, so there is some minimal component that would have to be native to bootstrap the morpher.) Also, I estimate that 50% of the CPU cycles are spent running the morpher, so native code would get an automatic 2x advantage over x86 code.

  59. Real Speed by Gumby · · Score: 1

    We've heard about a ~650MHz TM chip being comparable to a 500MHz PIII. But the real question is, what fraction of the CPU cycles are running the morpher? That is a very interesting question, especially when comparing different morphers etc. A first guess could be: PIII @500 is ~700 MIPS, TM gets about 2 x86 ins/cycle, so 350 MHz are spent on application code and ~300 MHz on the code morpher. I'm damn impressed with a near or better than 1-1 ratio!
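
    Spelled out, the back-of-the-envelope estimate above looks like this in Python. Every input is the post's own guess (650 MHz clock, ~700 MIPS for the PIII, 2 x86 instructions per Crusoe cycle), and the variable names are only illustrative.

```python
tm_clock_mhz = 650          # quoted Crusoe clock
piii_mips = 700             # guessed PIII-500 throughput, in millions of x86 instructions/s
tm_x86_ins_per_cycle = 2    # guessed x86 instructions retired per Crusoe cycle

# Cycles (in MHz) needed to match the PIII's x86 throughput.
app_mhz = piii_mips / tm_x86_ins_per_cycle

# Whatever is left of the clock is attributed to the code morpher.
morpher_mhz = tm_clock_mhz - app_mhz

print(app_mhz, morpher_mhz, round(app_mhz / morpher_mhz, 2))
```

    The split is only as good as the two guessed inputs; change either one and the ratio moves with it.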

  60. Re:Slightly Off Topic by David+Greene · · Score: 1
    This still doesn't solve the problem of third-party vendors. Shipping multiple binaries for different classes of CPU core is not economical. There are just too many variables. The support costs are nightmarish. Code morphing can alleviate this.

    And to beat a dead horse, the code morpher also optimizes. This is extremely important to the performance of Crusoe. It can actually run programs faster than if they were compiled natively, due to the run-time information available to the optimizer.

  61. Re:Slightly Off Topic by David+Greene · · Score: 1
    You're ignoring the fact that the translator also optimizes. If you look at this post, you'll see that the code morpher does some neat tricks to get around the aliasing problem. This is something a static compiler can't easily do. Sure, with profiling and re-compilation it might make some intelligent guesses, but isn't it simpler to let the translation software do it for you?

    What Transmeta has essentially done is take the Merced core and execute the compiler at run-time. The alias handling structure acts like the ALAT on Merced.

    Code executed through the translation layer should perform better than code executing on the bare metal because the translation software is learning and optimizing.

    Think of it this way: would you rather manage your stock portfolio as is done today, by guessing what might happen, or would you rather know what the market is going to do and trade your stocks accordingly. I guarantee that I can beat your statically predictive management every time if I have that additional context.

  62. Re:Slightly Off Topic by David+Greene · · Score: 1
    No, it's not a closed source problem. It might be a binaries problem. I don't know about you, but I wouldn't want to statically compile an ActiveX object every time I view a web page.

    The translation software provides backward compatibility, yes, but it also provides flexibility for Transmeta.

    What if Transmeta designs the TM-ISA? It's a virtual machine designed to translate efficiently to the bare hardware. Now compilers can take advantage of the additional registers provided by TM-ISA. If a new core provides more physical registers, TM-ISA v.2 can be released, allowing the use of more registers by the compiler.

    That's all well and good, but we get the additional benefit that old programs run on the new hardware just fine, and there's no additional hardware cruft to maintain compatibility.

    Ok, that's pretty cool. Backward compatibility is important. But what's really neat is that Crusoe provides forward compatibility. Code written to TM-ISA v.2 will run just fine on processors released with TM-ISA v.1 as long as new firmware is loaded that can understand TM-ISA v.2. So now software houses can release code optimized for the latest and greatest without worrying about users behind the curve not being able to run their stuff.

    How often do people moan about RedHat not providing Pentium-optimized packages? With Crusoe, RedHat can silence the critics without impacting us 486 users.

  63. Re:You aren't SOPOSED to code in it's native set by David+Greene · · Score: 1
    Moderators, knock this one up! Great explanation of JIT vs. Dynamic Compilation/Specialization!

    Note that there is no reason Crusoe couldn't support a staging compiler. Transmeta could always release a virtual ISA that had support for doing this efficiently. And of course you could always write a dynamic compiler in x86 (ugh). The point is that Transmeta could directly provide support for something akin to DyC in a later processor. And still maintain both backward and forward compatibility.

    Pretty neat trick, I'd say.

  64. Re:You aren't SOPOSED to code in it's native set by David+Greene · · Score: 1
    No, a static compiler will never produce optimal code. It can't for several reasons:
    • Separate compilation. Without being able to look at the whole program at once (including BTW, system libraries and kernel code), the compiler can't fully know the aliasing conditions present in the program. It can make some guesses, but a function call to an external module will pretty much kill the optimizer (though you can do things with locals and such).
    • Lack of context. A static compiler has no idea what will happen at run-time. Can the compiler eliminate a load after a store? Not if the store potentially writes to the same address from which the load reads. But there may be times when the load is not dependent on the store. Theoretically, the code morpher can take advantage of this. With profiling, a static compiler can gain some run-time context. But you're relying on the assumption that the profile runs are representative of the way the program will be used for all time. And then you get into that hazy, ugly area known as dynamic memory allocation...
    • ISA limitations. The compiler is restricted to the idioms provided by the machine's ISA. On the x86, performance is absolutely killed by the lack of general-purpose registers and the non-orthogonality of the instruction set. With the code morpher, Transmeta could theoretically release a "clean" ISA that is a nice compiler target. And improve it as the experience builds up.
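
    The load-after-store point in the list above can be illustrated with a small Python analogy. It is only a software sketch: `original` and `speculated` are hypothetical functions, memory is a plain list, and the explicit address comparison stands in for whatever run-time alias check a code morpher might use.

```python
def original(mem, p, q):
    """Conservative version: always reload mem[p] after the store to mem[q]."""
    x = mem[p]          # load
    mem[q] = x + 1      # store that may or may not alias p
    return mem[p] + x   # reload, because q might equal p

def speculated(mem, p, q):
    """Run-time version: keep x in a 'register' and skip the reload unless p and q alias."""
    x = mem[p]
    mem[q] = x + 1
    if q == p:              # guard: the store hit the loaded address, take the safe path
        return mem[p] + x
    return x + x            # reload eliminated

# Same results whether or not the addresses alias.
print(original([5, 0], 0, 1), speculated([5, 0], 0, 1))
print(original([5, 0], 0, 0), speculated([5, 0], 0, 0))
```

    A static compiler that cannot prove `p != q` has to emit the first form everywhere; the second form is only safe because the check happens with the actual addresses in hand.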

  65. Re: native instruction set by pohl · · Score: 1
    I agree completely, but wanted to offer a reason why folks might be willing to live with the lack of forward-compatibility: perhaps the source for the software that they run is freely available and they don't mind recompiling. Just a thought.

    I rather liked the idea that one poster suggested: rather than writing to the native instruction set, invent a new intermediate instruction set that is optimized towards making a better-performing code-morphing layer. It's a very interesting suggestion.

    I also wanted to say that I'm surprised that more folks aren't really excited to read the insightful analysis at the end of the article where they gave a convincing argument for future transmeta chips that are not limited to the low-power mobile market. It had me salivating.

    --

    The "cue the foo posts in 3, 2, 1..." posts will commence with no subsequent foo posts in 3, 2, 1...

  66. Re:You aren't SOPOSED to code in it's native set by pohl · · Score: 1

    The article contains some good reasons for not doing it ahead of time in the compiler: with the code-morphing layer, you can keep real statistics on which blocks of code are actually used frequently, and whether or not a branch is likely to be taken -- under the actual conditions that the software is running. I know of no compiler that optimizes by running code with real data. Can it really be done? It just sounds like something best done dynamically to me.
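
    As a rough idea of the run-time statistics being described, here is a minimal Python sketch. The `Profile` class, its method names, and the addresses used are all hypothetical; the article does not say how the code-morphing layer actually stores its counters.

```python
from collections import defaultdict

class Profile:
    """Counts how often each block runs and how often each branch is taken."""

    def __init__(self):
        self.block_runs = defaultdict(int)
        self.branch = defaultdict(lambda: [0, 0])  # address -> [taken, not taken]

    def record_block(self, addr):
        self.block_runs[addr] += 1

    def record_branch(self, addr, taken):
        self.branch[addr][0 if taken else 1] += 1

    def hot_blocks(self, threshold):
        """Blocks executed at least `threshold` times, worth optimizing harder."""
        return sorted(a for a, n in self.block_runs.items() if n >= threshold)

    def likely_taken(self, addr):
        taken, not_taken = self.branch[addr]
        return taken > not_taken

# Simulate a loop body at 0x40 that runs 100 times, then falls through to 0x80 once.
profile = Profile()
for i in range(100):
    profile.record_block(0x40)
    profile.record_branch(0x48, taken=(i < 99))
profile.record_block(0x80)

print(profile.hot_blocks(50), profile.likely_taken(0x48))
```

    The counts come from the program's real inputs, which is the information a compile-time optimizer has to guess at or collect from separate profiling runs.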

    --

    The "cue the foo posts in 3, 2, 1..." posts will commence with no subsequent foo posts in 3, 2, 1...

  67. Re:Slightly Off Topic by David+Price · · Score: 1
    Do you want to kill the Crusoe? Because that's what your thirty-year-old delusions of assembly-code grandeur will do to it.

    The point behind the Crusoe is not, not not NOT, to just be a better faster chip that optimizes better and consumes less power than those on the market now (though it is.)

    The Crusoe's selling point is compatibility. Transmeta can churn out all sorts of chips, some optimized to sip current from batteries at a tenth of the rate of today's monsters, some designed to guzzle power even more and be speed demons. They can make radical changes to the basic design of the chip while doing this, and it won't matter, because though the way things are done internally may go topsy-turvy, the instruction set won't change, and the same programs can be run on each.

    This neatly solves the drag placed on development by the need for backwards-compatibility (Want to run DOS 3.3 on your Athlon? You can if you feel like it.) Just like Windows, x86 chips have accumulated baggage - the sediment of silicon long since passed into figurative dust.

    Transmeta has designed a beautiful thing - a chip that transcends backwards-compatibility. Writing to the bare metal on the Crusoe bolts it down, turns it into just another fixed-in-place bit-smashing engine. Kills it, in other words, removes what makes it an elegant hack.

    Don't do it. Please.

  68. The one true reason... by Enahs · · Score: 1

    Hrm, perhaps because there are folks such as I, who not only do not regularly read Ars Technica, but also aren't whiny bastards such as you?

    --
    Stating on Slashdot that I like cheese since 1997.
  69. Re:Transmeta not impressive by BluBrick · · Score: 1
    There's a thread on Usenet that claims Transmeta's *ORIGINAL* goal was not low power, but the best performance, but when they couldn't attain it, they "fell back" to a low power selling point.


    OK, Transmeta have proven that they are pretty damn good at keeping secrets, so I would take the info obtained from that Usenet thread with a decent sized grain of salt (as opposed to most other Usenet "wisdom" :). However, they may have been aiming for peak performance, and discovered the low power aspect by accident, then concentrated on that. I don't know, you don't know, and Transmeta aren't about to tell either of us, are they?



    --
    Ahh - My eye!
    The doctor said I'm not supposed to get Slashdot in it!
  70. Finally, some clueful reporting and analysis. by Jeff_Uphoff · · Score: 1

    After hearing media reports that varied from referring to Linus as a "key executive" within Transmeta (he's not a corporate executive, which should be obvious to anyone who's viewed the web site or bothered to read the press package distributed at the launch), to describing Crusoe as "Internet-powered" and then asserting it draws its electrical power from the Internet itself, it's nice to see that someone with a clue actually sat down, read, analyzed, and reported on the technology that we introduced yesterday.

  71. Direct won't do much. But ... by Flammon · · Score: 1

    First thing is that you won't have many useful instructions to do what you need with the simple native instruction set that Crusoe provides. So you would need to be creative and optimize your code very well to get the speed you are looking for. Remember, your code optimizing abilities are competing with a very advanced code morphing technology. Next, the few clock cycles that you are going to gain by going native are not worth the programming effort.

    Here is what needs to be done instead. Design an instruction set specific to the application that you are writing. Our current CPUs can handle very broad tasks and try to be good at everything, and when they can't, things like MMX, 3DNow and whatnot start to show up in the CPU.

    So, if you know the box you are setting up is going to be a web server, design an instruction set that a web server would fly on. If you play games, design an instruction set that looks like 3DNow on steroids.

  72. Re:Some Question about Crusoe by Maarten · · Score: 1
    Would the translation for each instruction be cached, or is the sequence cached? The article implies that the sequence is cached since the CodeMorph software can optimize the speed on subsequent passes. However, this seems to limit the benefit gained from caching to relatively tight loops or common sequences of code depending on the cache size.

    I suspect that the translation units are based on so called 'basic blocks' which can most easily be described as anything in between a target label and a branch (i.e. entry and exit points in your code). This would allow optimisation of loop bodies.

    This can be extended by going to 'super blocks' (multiple basic blocks) allowing sophisticated things like loop unrolling, software pipelining etc.

    What I'm actually interested in is how the translation cache is being accessed. In a later post somebody states that the translation cache is maintained in main memory (therefore benefitting from the regular data cache). I'm not sure I understand how it is possible to do efficient cache lookups in this way. I assume they use hashing methods to map x86 memory pages to 'translation cache lines', but this has a much higher overhead than hardware based cache lookups.

    I have also been a bit surprised by people being worried about losing the cached translations when powering off a system. People, we're talking here about loops that are being executed 100s if not 1000s of times. Having to do the translation again for the first few iterations is not going to be the big performance loss they seem to think it is!
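
    A minimal sketch of the two points above, as a toy model rather than anything Transmeta has documented: a translation cache looked up by hashing the x86 entry address of a basic block, and the one-off translation cost being amortized over loop iterations. The class name, the address, and all three cycle costs are made-up assumptions for illustration only.

```python
# Toy model only: every name and cycle count below is a hypothetical
# assumption, not a measured or published Crusoe figure.

TRANSLATE_COST = 5000   # assumed one-off cost of translating a basic block
NATIVE_COST = 10        # assumed cost of running the translated block once
INTERPRET_COST = 100    # assumed cost of interpreting the block once


class TranslationCache:
    """Maps the x86 entry address of a basic block to its translation."""

    def __init__(self):
        self.blocks = {}   # a dict is a hash table keyed by entry address
        self.cycles = 0    # running total of simulated cycles

    def run_block(self, entry_addr):
        # A miss pays the translation cost once; later hits skip it.
        if entry_addr not in self.blocks:
            self.blocks[entry_addr] = f"native code for {entry_addr:#x}"
            self.cycles += TRANSLATE_COST
        self.cycles += NATIVE_COST


def cost_with_cache(iterations):
    """Simulated cycles for a loop body that is translated, then reused."""
    cache = TranslationCache()
    for _ in range(iterations):
        cache.run_block(0x1000)
    return cache.cycles


def cost_interpreted(iterations):
    """Simulated cycles if the same loop body is interpreted every time."""
    return iterations * INTERPRET_COST
```

    With these assumed costs the translation pays for itself after a few dozen iterations, which is the reason losing the cache at power-off matters so little for loops that run hundreds or thousands of times.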

    --
    Maarten Boekhold
  73. Re:computer architecture by hennesy and whatsisfac by Maarten · · Score: 1

    'whatshisface' would be David Patterson, who together with David Ditzel authored the 'The Case for the Reduced Instruction Set Computer' article which started the whole RISC thingie.

    --
    Maarten Boekhold
  74. Re:What I'd really like to hear about... by Doctor+Memory · · Score: 1

    I also wonder whether it can multitask between different instruction sets. I guess the task switching overhead would be pretty brutal if there isn't room onchip for multiple instruction sets.


    I would guess not, since there is only a single TLB, configured at boot time. Unless you wanted to flush it every time you changed instruction sets (!)

    --
    Just junk food for thought...
  75. Re:Crusoe-VLIW native code by Doctor+Memory · · Score: 1

    Still, this could be a big pain in the ass for people who aren't comfortable rooting around inside their computers.

    IMHO, people who aren't comfortable "rooting around inside their computers" probably won't be writing their own code morphers. This isn't script kiddie stuff...

    --
    Just junk food for thought...
  76. Roadmap: high-end TM chips by korpiq · · Score: 1

    It would seem to me to take some work on top of what was released to be able to attack the server CPU market.

    I'd love to see these happen in the next five years:

    - Code Morpher for Alpha, PPC, ...
    - Code Morpher to recognize the instruction set of a binary
    - "optimization practically finalized for this piece of code" bit
    - a TM CPU bus for several chips to share the same translation cache
    (how necessary is this actually?)
    - communication interface for operating systems
    - ability to save final VLIW version of code beside the original binaries

    Those would in essence offer the ability to turn a system eventually to VLIW binaries without actually putting any effort to it.

    Once TM has covered its development investment:

    - Open Source the Code Morpher
    -> worldwide development of support for
    - any chips
    - integration with high-level compilers.

    "No stop signs! No speed limits!" - AC/DC: Highway to Hell

    --

    I think, therefore thoughts exist. Ego is just an impression.
  77. Re:You aren't SOPOSED to code in it's native set by Virgil · · Score: 1

    Coding natively would be SLOWER than using the morphing layer. You also don't get the benefit of the optimization.

    Yes, but a good compiler will generate fully optimal code to begin with. A compiler that targets the Transmeta core Instruction Set should give you better code than the two level translation scheme.

    But that's neither here nor there. Transmeta will not want people to code to the native Instruction Set because it will undermine their flexibility with the underlying hardware. Right now, the major benefit of the two level translation scheme is that the hardware architecture can be updated and improved while presenting the same programming model to application developers. This will allow Transmeta to aggressively experiment with the hardware architecture while maintaining software compatibility. This is very very cool!

  78. Re:Is "mobile linux" GPLed? by psaltes · · Score: 1

    If you read the FAQ, it explicitly states that the source will be released.

  79. Re:Crusoe-VLIW native code by FWMiller · · Score: 1

    Only time will tell whether Transmeta's making us pay a penalty up front in the form of morphing, so that they don't have to deal with backwards compatibility in the future, will pan out for them from a business point of view. If all this thing does is run x86 code at lower power, they aren't going to have a market lead for long. Two things are happening right now, guaranteed:

    - Somebody is reverse engineering it
    - The big boys are doing the same damn thing as fast as they can

    One of these two items will cut Transmeta's legs out from under them. Unless they get a killer app for the CPU and penetrate the market as quickly as possible, I'm not sure there's enough here to justify the effort they've gone to (read: the VC dollars pumped in). I mean, if you'd just sunk $100 million into a company over 5 years and they came out with a slower x86 clone, what would you think?

    Oh, and I guess I have a question too. Seems like sometime in the past I was under some foolish impression that software was a lot more expensive to develop than hardware. I'm just wondering how this fits into this idea of pushing function that used to be in hardware up into software?

    --
    Frank W. Miller
  80. Re:Slightly Off Topic by FigWig · · Score: 1

    Evidently you haven't been reading much of the Crusoe propaganda. They don't want anyone to access the native instruction set so that they can change the chip core without having to worry about legacy apps. Imagine a chip that could go from pure CISC to RISC without having to change the apps. In this way the hardware implementation is decoupled from the instruction set interface.

    Pretty neat, but I haven't seen any real mention of emulating any architectures other than x86.

    --
    Scuttlemonkey is a troll
  81. Re:What I'd really like to hear about... by Brent+Nordquist · · Score: 1

    The Ars article also points out that some of the registers are used by the code-morphing software, too... you couldn't count on having 40 + 24.

    --

    --
    Brent J. Nordquist N0BJN
  82. Future quaking power? by Mr.+Flibble · · Score: 1

    Certainly not the type of chip I want to be playing quake on... For now.

    The telling quote is at the end of the article though:

    I'd say that it's only a matter of time before we hear an announcement of another product line from Transmeta. It won't be named Crusoe, because it won't be aimed at the mobile and embedded markets. It'll be a workstation and server class x86 CPU that runs Linux like a fiend, and it'll compete directly with Intel's IA-64. I can't wait.


    It does make me wonder though, if such a chip (slightly altered) would actually end up being superior for Quake. Given that the translating software is able to identify which parts of cache are used more often, and so becomes better at branch prediction, this could translate into faster gaming... I think... Contrary to this thought, though, is the fact that the Celeron is a good gaming processor with 128K cache... We shall see.

    Along similar lines, if the x86 instructions are software, how much of the x86 instruction set does Quake use? Would the flexible software end up speeding up Quake by getting the x86 instructions out of the way?

    --
    Try to hack my 31337 firewall!
  83. Re:Is "mobile linux" GPLed? by Mr.+Flibble · · Score: 1

    It will probably be out, but Linus does not have to release it. Still, you know he will.

    --
    Try to hack my 31337 firewall!
  84. Re:Slightly Off Topic by Jeremi · · Score: 1

    You lot just aren't getting it. If you remove the code morphing layer, then you have to put backwards compatibility into the hardware down the road.

    Not really. Transmeta could then just write a code-morphing layer to "morph" the ISA you coded to into the new one. No?

    --


    I don't care if it's 90,000 hectares. That lake was not my doing.
  85. Re:Crusoe core instruction set? by Helge+Hafting · · Score: 1

    No, no, please! That would be a disaster!! The whole point of this architecture is to get rid of this compatibility mess.

    Well, someone will still have to suffer the incompatibility mess: Those who write morphing sw for various cpu's. This will surely be more than just Transmeta, if the concept takes off.

    The x86 instruction set isn't necessarily the best for this chip. Someone could make up a different one (perhaps something that uses 32 registers or so), make a compiler for it, and have better performance than x86 code on the same chip.

    This would have to be rewritten for another chip, but rewriting the instruction emulator is a lot less effort than recompiling the os and all apps. Still, someone must do it.

  86. Re:Slightly Off Topic by Shoeboy · · Score: 1

    Transmeta could then just write a code-morphing layer to "morph" the ISA you coded to into the new one. No?
    Brilliant! The Meta Morphing Power Processors! Why stop there? Why not have Transmeta write code morphing software that emulates their native instruction set and on top of that run code morphing software that emulates their native instruction set and on top of that run code morphing software that emulates their native instruction set and on top of that run...
    IT'S TURTLES ALL THE WAY DOWN!!!
    --Shoeboy

  87. RE: Beowulfs --> notes from the past by CodeShark · · Score: 1
    Late to the commentary, but I hope this adds something to it.

    way Way WAY back in micro-processor terms (1984-1985), I developed a white paper that attempted to extrapolate where PC's would develop by Y2K. (I'll put it up on my website if I can find which 5-1/4" floppy I saved it on, and re-hook a 5-1/4" drive to my PC).

    Hopefully it doesn't seem self-congratulatory (because a number of my other conclusions stunk) or redundant to this thread to mention that three or four of the paper's conclusions fit the idea of developing a Crusoe type "beowulf in a box" exactly:

    • High speed, low power CPU cores would be required (200 MHz speed). Why? Because even if I had the ability to write programs that could keep all the Crusoe processors running at full tilt 100% of the time, I could conceivably power 50 Crusoe processors or so on the same power supply that used to supply two Athlons (68 watts),
    • The CPU units would perform on-chip instruction decoding so that chip and system architectures could be developed more flexibly,
    • Each CPU would have an abundant amount of cache memory in which to put commonly executed code units, and finally that no matter what,
    • for performance, massively parallel execution was more important than raw speed in terms of overall CPU speeds, etc.
    Now then, programming for massively parallel systems is a b----, and I couldn't do a Beowulf cluster if I tried, but these chips and the StrongArm series are the first ones which met all of the specs in a fifteen year old paper.
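
    The power-budget arithmetic behind the first bullet can be sketched as follows. The 68 watt budget is the figure from the bullet; the roughly 1 watt per Crusoe is the number quoted elsewhere in this thread, not a measured value, and the function name is hypothetical. Milliwatts are used so the division stays in integers.

```python
def chips_within_budget(budget_mw, per_chip_mw):
    """How many chips fit in a power budget (both given in milliwatts)."""
    return budget_mw // per_chip_mw


# Assumed inputs: 68 W budget (from the bullet), about 1 W per Crusoe
# (as quoted elsewhere in the thread).
crusoes = chips_within_budget(68_000, 1_000)
```

    Under those assumptions the budget covers 68 chips, so "50 or so" leaves some margin for memory and support logic.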

    Just in time for Y2K. Interesting, eh?

    --
    ...Open Source isn't the only answer -- but it's almost always a better value than the alternatives...
  88. Possibilities by mbrod · · Score: 1

    Bravo /. that is the kind of stuff I wanted to read about the chip.

    While a lot of people are concentrating on how well this will work in small devices the author of this article is excited about the large-scale applications of the chip. I would have to agree. Think about a busy web server that is continuously generating web pages and doing database transactions. The code morphing software can spot that trend and be ready for it.

    Should be interesting.

    MBrod

    1. Re:Possibilities by fReNeTiK · · Score: 1

      Bravo /. that is the kind of stuff I wanted to read about the chip.

      Shouldn't that be bravo ars technica?

      Their article on the K7 was great, btw...

      --
      I strongly believe that trying to be clever is detrimental to your health. -- Linus Torvalds
  89. Is "mobile linux" GPLed? by rueba · · Score: 1

    If yes, where is the source?

    --
    The only reason all cover-ups appear to fail is that you never hear about the ones that succeed.
    1. Re:Is "mobile linux" GPLed? by DanMilburn · · Score: 1

      Is Linus the sole copyright holder of ALL the code in the kernel?

      Of course not.

      Therefore, once they release "Mobile Linux", they *have* to release the source under the GPL.

    2. Re:Is "mobile linux" GPLed? by SEE · · Score: 2

      Yes, it's GPLed.

      Where is the source? Read the GPL -- they don't have to release the source until they distribute the code. Mobile Linux hasn't been released yet, so they can sit on the source for now. Linus has promised it will be available RSN.

      Steven E. Ehrbar

  90. Re: code morph cache.. by slashkitty · · Score: 1
    The example of code optimization they gave was a DVD player. After the first frame, pretty much all the needed code optimizations were completed and stored for the movie.

    The time it takes to relearn the optimization is very very short when compared to power on/off cycles.

    --
    -- these are only opinions and they might not be mine.
  91. Other Applications of Code Morphing? by Seanasy · · Score: 1

    I'm not a hardware guru so pardon the speculation...

    Obviously, the code morphing is focused on x86 right now and, as the article suggests, may be adapted for PPC, Alpha, etc. in the future. Is it feasible that it could also be adapted for specialized processors such as graphics or sound?

    I'm imagining an SMP-type of Transmeta box that, when you load Quake, automagically loads code morphing software onto one of the processors to act as the graphics accelerator or, if you're watching a DVD, can act as an MPEG decoder card.

    Is what I'm suggesting conceivable or am I way off base?

  92. Overclocking by noom · · Score: 1

    So, we know by now that Crusoe only requires around 1 watt of power to operate, and that this results in a maximum temperature of 48C (thus, a fan isn't required to cool the thing). But, if you don't really care about power consumption and you installed a fan over the heatsink, one has to wonder how much faster these things can be clocked before they start showing glitches. The only problem I can see is the LongRun software which will automatically reduce power consumption if it's not necessarily needed -- this might mean that the only way to overclock the chip would be to modify the LongRun code (stored in FlashROM).

    Any guesses as to how long until someone figures out how to patch the FlashROM so as to allow overclocking? I give it about 6 months after Crusoe based systems hit the shelves.
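
    For a rough feel of the trade-off, here is a sketch using the standard CMOS approximation that dynamic power scales with voltage squared times clock frequency. This is a textbook rule of thumb, not something taken from the Ars article or from Transmeta, and the function name and scale factors are hypothetical.

```python
def relative_power(v_scale, f_scale):
    """Dynamic power relative to baseline, assuming P ~ C * V^2 * f.

    v_scale and f_scale are multipliers on the baseline supply voltage
    and clock frequency; capacitance is assumed unchanged.
    """
    return v_scale ** 2 * f_scale


# Assumed scenarios, for illustration only.
throttled = relative_power(0.5, 0.5)     # half voltage, half clock
overclocked = relative_power(1.0, 1.5)   # same voltage, 50% faster clock
```

    In this model lowering voltage and clock together saves far more power than lowering the clock alone, and a 50% overclock at unchanged voltage costs about 50% more power.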

    -NooM

  93. Re:Slightly Off Topic by Tony-A · · Score: 1

    Even if you think Assembler is a high level language, you probably do not want to code directly to the bare metal. It is not a nice native VLIW machine code. It is the target for the code morphing layer. It's been a long time since I've even looked at microcode (early low-end IBM 370s were microcoded) but it tends to be obscure, twisted, very unfriendly, and I cannot imagine that it's gotten any better with time. Minor mistakes do very bad things. Only one program is written, the program to read and execute the "higher-level" machine code.

  94. Re:Some Question about Crusoe by Tony-A · · Score: 1

    To nit-pick,
    SUB CX,AX
    sets flags based on result in CX
    If things are case sensitive, Cx would be a valid label.
    Actually, both ADD and SUB set flags on x86.
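
    A minimal sketch of what "sets flags based on result" means for a 16-bit SUB, written as a small emulation. The function name is hypothetical, and only four of the flags are modelled (AF and PF are left out to keep it short).

```python
def sub16(dst, src):
    """Emulate x86 16-bit 'SUB dst, src'; return (result, flags)."""
    dst &= 0xFFFF
    src &= 0xFFFF
    result = (dst - src) & 0xFFFF
    flags = {
        "ZF": result == 0,               # zero flag: result is zero
        "SF": bool(result & 0x8000),     # sign flag: top bit of result
        "CF": dst < src,                 # carry flag: unsigned borrow
        # overflow flag: operands differ in sign and the result's sign
        # differs from the destination's original sign
        "OF": bool((dst ^ src) & (dst ^ result) & 0x8000),
    }
    return result, flags


# SUB CX,AX with CX = 5 and AX = 5 leaves 0 in CX and sets ZF.
cx, flags = sub16(5, 5)
```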

  95. Re:You aren't SOPOSED to code in it's native set by fReNeTiK · · Score: 1

    Yes, but a good compiler will generate fully optimal code to begin with.

    How do you prove a whole program (as opposed to a short algorithm like quick-sort) is optimal? Isn't this a very hard problem to solve?

    --
    I strongly believe that trying to be clever is detrimental to your health. -- Linus Torvalds
  96. Re: Okay. But how about 'hinting'? by evbergen · · Score: 1

    What if the code morphing software authors (Hi Linus!)
    decided to 'extend' the X86 base instruction set just a bit, like 3DNow, MMX and so forth, only not to
    provide graphics acceleration, but to provide 'hints' to the code morphing software?
    In my imagination, there need be just a few of those extra instructions, and a clever compiler can stick them in to provide
    the code morpher with some difficult-to-find but static dependency/ordering information. (I.e. potentially anything not needing run-time statistic gathering).
    If these hints are absent, the morpher just does its regular job. Now, good idea or not?

    --
    All generalizations are false, including this one. (Mark Twain)
  97. Nature Friendly? by HaKn5La5H · · Score: 1

    I'm surprised this chip wasn't sold as "the first nature friendly chip."

    I've heard all the statistics about 20,000,000,000 tons of coal being burned an hour to support the internet's routers, and a hundred times that to run our desktops.

    They should be getting any partner they can, and if some tree-hugger organization sells your chips for you; it ain't bad.

  98. Cool! by HeatherMax · · Score: 1

    One point the chap from Ars Technica misses is the heat output from laptops.

    I know that I for one am looking forward to a cooler laptop.

    --
    Andrew.
  99. Re:Crusoe-VLIW native code by Gorth · · Score: 1

    There's a basic risk here, though: from what I understand, the 'Code Morphing' software doesn't reside in main system memory - instead, it's in a special on-chip memory area, which is loaded from a ROM at boot time. So you replace the ROM with an EEPROM, and make it possible for users to cram a new instruction set in there. What happens if there's a bug in that new instruction set, or the flash process fouls up? Your computer won't boot. It won't even come close to booting - this isn't something you can fix with a bootable floppy, because the code to load the system on the boot floppy won't run any more. Now how do you fix it?

    How do you fix it? Well, if this code-morphing software is on a flash ROM so that it can be rewritten, it would be in essence like having a second flash BIOS. There would be no greater threat here than simply flashing your BIOS. It's a pretty safe process if you take a little care.

  100. Re:Beowulf by nhowie · · Score: 1
    The only reason to have beowulf at all is that it's more economical than SMP systems, it's not a better solution than massive SMP IF massive SMP can be made cheaply.

    Wrong. SMP relies on shared memory, so doesn't scale as well as a truly parallel (ack, I can never spell that corecktly) architecture.

    For SMP, the performance increase as you add extra chips decreases, and tails off dramatically at a relatively small number of chips (12 IIRC) due to the communication bottle-neck (this is for an OS that handles SMP well, Linux doesn't scale at all well ... yet). Parallel machines, on the other hand, give a theoretically linear increase in performance as you add more nodes. This is why 'proper' super-computers use parallel transputers, rather than just building big SMP machines.
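
    A toy model of that argument, under loudly stated assumptions: each CPU in an SMP box needs the shared memory bus for a fixed fraction of its time, so the bus saturates once enough CPUs are contending for it, while a message-passing cluster is taken at its theoretical best of linear scaling. The function names and the bus fraction are hypothetical; real machines tail off more gradually and clusters pay communication costs too.

```python
def smp_speedup(n_cpus, bus_fraction):
    """Speedup when every CPU needs the shared bus bus_fraction of the time.

    The bus can keep at most 1 / bus_fraction CPUs busy, so speedup is
    capped there (a hard-saturation simplification).
    """
    return min(float(n_cpus), 1.0 / bus_fraction)


def cluster_speedup(n_nodes):
    """Theoretical best case for a distributed-memory machine: linear."""
    return float(n_nodes)


# Assumed bus fraction of 1/8, chosen only to make the cap easy to see.
small_smp = smp_speedup(4, 0.125)
big_smp = smp_speedup(32, 0.125)
big_cluster = cluster_speedup(32)
```

    With the assumed bus fraction, four CPUs still scale linearly, but 32 CPUs do no better than eight, whereas the idealized cluster keeps scaling.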


    --
  101. Re:Slightly Off Topic by nhowie · · Score: 1

    One thing that we may find, however, is that a certain architecture is emulated better than x86

    Or even create a new 'abstract' instruction set that is architecture-independent, hence clean and probably faster to emulate than trying to emulate code optimised for different hardware (hell, we might even see the kernel re-written in Java, so it can run it as byte-code {j/k}). On a side note, do you think code that's been aggressively optimised for a certain architecture will run faster or slower on Transmeta than code that's just been 'normally' optimised?
    --

  102. Re:The customary question... by jopasm · · Score: 1

    Ever seen an Avalon box? Lots of Alphas on
    small boards w/ lots of memory plugged into
    a fast backplane. You don't need a built-in
    hard drive, you just need a workstation to
    "feed" the cluster - you have a nice fast
    workstation hanging off a nice fast network
    connection (Myrinet seems to be popular for
    this - it's what Compaq was using in the
    Beowulf cluster they had at the Atlanta
    Linux Showcase). You just need some way to
    load a "bare" OS so the processors can start
    talking to the network - and (at least in
    theory) the linux distro. that's "built into"
    some of these chips could do that.

    'Course, as others have pointed out, the
    chips unveiled yesterday aren't high-end,
    but that doesn't mean they won't have
    a high-end design in the future. :>

    --

    ObTagLine: The more you run over the 'possum, the flatter it gets.

  103. The biggest problem with Beowulfs is Comm Speed by cameldrv · · Score: 1

    Inter-node communication is far more important than heat or space. In fact, I would say that heat and space are two of the smallest problems with Beowulfs.

    1. Re:The biggest problem with Beowulfs is Comm Speed by jmp100 · · Score: 1

      Still, a processor that runs at 700MHz and dissipates only 1 watt of power? That's gotta save some on the electrical bills.

  104. Crusoe Explination by JediLuke · · Score: 1

    It's basically a good compilation of the technical specs... what I really want to see is the power consumption comparisons between the TM5400 and the PIII... that was the coolest part and they don't even have a diagram anywhere... but it's a great in-depth look. Can't wait to get my hands on one of the devel specs kits.

    We are the music makers, we are the dreamers of dreams
    JediLuke

    --

    JediLuke
    -Do or Do Not, There is no Try
    1. Re:Crusoe Explination by JediLuke · · Score: 1

      What about ARM processors? They only use half a watt, I thought. And there is ARMLinux.


      JediLuke

      --

      JediLuke
      -Do or Do Not, There is no Try
    2. Re:Crusoe Explination by chris_martin · · Score: 1

      The mobile Pentium eats around 7 watts of power last I checked. The full blown PIII is, what? Around 43 watts? The lowest most of us get right now with a mainstream processor is the G3s that use around 3 watts (in the desktop; the portables can cycle and lower the clock speed to reduce it even more).
      Personally I can't wait until the webtop devices start shipping.
      If they're smart they will all include 802.11DS wireless.
      I've already got a wireless connection to the net with Apple's Airport, I'd love to plug in (or not as the case may be) a webtop into the situation.

      --
      -- Chris Martin, System Administrator
  105. Laptops, blah. by Elbereth · · Score: 1

    I couldn't care less about laptops or handhelds. Low power, high performance chips are always good, though, especially if they're cheap and have a gimmick (like, say, emulating other architectures).

    So, when can I buy an SMP motherboard with six or eight of these Transmeta processors on it? Intel is the only game in town for low-cost SMP, and it's not very low-cost at all, IMHO. Have you priced the Alpha lately? There are Alpha CPUs being discontinued that cost more than my whole SMP P3 U2W SCSI workstation! Funk dat!


    I bet I could build a quad Xeon system with Ultra160 RAID and 21" monitor for the price of a barebones 21264. Factor in a second CPU and SMP Alpha board, and I could have a beowulf cluster of quad Xeons plus 21" monitors, RAID, and gigabit ethernet. Bah.

  106. Re:This is so cool... by Snoochie+Bootchie · · Score: 1

    Why limit yourself to choosing one at boot? Why not have a multi-chip module with four Crusoes in it and emulate several architectures simultaneously?

  107. Re:Crusoe Explanation by Traser · · Score: 1

    There's a very interesting paper on the Crusoe and the Very Long Instruction Word processor it uses here. It's in .pdf. The coolest bit is its IR pictures of both a PIII and a Crusoe playing a DVD with software. The PIII is operating at a max of 105.5 Celsius, the Crusoe at 48.2.

    --
    Insanity is contagious. - Yossarian
  108. Very informative! by oldman1080 · · Score: 1

    This article was very informative overall. I found particularly interesting the bit about Transmeta asking for a new set of benchmarks that takes into account efficiency as well as performance.

    If Transmeta could afford to hire Linus Torvalds to create a Mobile Linux for their CPU, why couldn't they hire someone to create benchmarks for their CPU? That leads me to wonder, are there any good open source benchmarking programs? Perhaps if an open source one became popular, it would be easily modified to do benchmarking for new processors like this.

    --
    Find and share links to celebrity profiles on MySpace! http://www.myspacecelebrities.com
  109. The short of it all .. by unAnonymous+unCoward · · Score: 1

    It seems that what Transmeta has done is to take the ideas developed for JIT compilers and apply them all to hardware. Pretty neat stuff.

    1. Re:The short of it all .. by unAnonymous+unCoward · · Score: 1

      Ok ok, they only moved JIT techniques a bit closer to the hardware. You're right, it's still software, even if in the end it is shown to be hardware assisted software.

      My first attempt to post got lost in the Ether..does that make me a nonfirst post or a first nonpost? :)

    2. Re:The short of it all .. by Guy+Harris · · Score: 2
      It seems that what Transmeta has done is to take the ideas developed for JIT compilers and apply them all to hardware.

      "Apply them all to hardware" in what sense? The binary-to-binary translators for Crusoe chips are software; they just happen to be running on hardware that offers some assistance, but the translation itself isn't done by hardware (and happens at a layer below even the lowest-level OS code; as far as the OS is concerned, all the way down to the lowest level, the chip looks like an x86).

  110. Re:Doesn't make sense by Sean+Johnson · · Score: 1

    yeah well, I thought the whole point of this was about more efficiency and power consumption; not about raw "pedal to the metal" speed.

    --
    >>>>>> Chewie, take the professor in the back and plug him into the hyperdrive.
  111. Nice Tech If You Can Get It.... by nellardo · · Score: 1

    I'd heard some info about what Transmeta's chips were doing ahead of time (under non-disclosure, of course), but never had enough information to figure out what they were doing that was all that different from microcode, which of course has been around for years and years.

    Props to Ars for such a clear write-up.

    And props to Transmeta for rethinking the problem.

    --
    -----
    Klactovedestene!
  112. Re:The customary question... by punkass · · Score: 1

    Are you crazy? Have you seen CPU prices lately?

    --
    "Nobody owns the fucking words man." - James Dean
  113. Re:Transmeta not impressive by Strongtium90 · · Score: 1

    They have essentially built a Japanese Compact Car that is fuel efficient, and not an Italian sports car.


    It's like the Mazda RX-7 of microprocessors!
    It does it a little different and a little better.

  114. Re:Crusoe core instruction set? by aeonek · · Score: 1

    There is a lot of glitzy information now available about Crusoe VLIW, a core instruction set that is nothing like x86 and the code morphing software. But the actual technical nitty gritty seems to be lacking. Can a program get access to the core instruction set thus bypassing the code morphing?

    No, no, please! That would be a disaster!! The whole point of this architecture is to get rid of this compatibility mess. They've already done two different instruction sets. Every new processor from Transmeta will likely use a new instruction set, optimized for whatever the processor is designed for. Don't you see the advantage of this?? Well Transmeta does, so they won't be releasing their compiler, or any specs.

    Elbrus E2K uses a similar translation technique, and is said to have seven times the performance of an Alpha. So it's not just about power, Transmeta will probably make faster chips in the future. It's probably not a coincidence that Dave Ditzel used to work with Elbrus in Russia back when he was working for Sun. Personally, I think Dave Ditzel is a bit embarrassed that the Crusoe isn't faster. This guy used to do chips for UNIX workstations that nobody could afford, back when we were using c64 and spectrum. Look out for fast chips from Transmeta in the future. And as the article points out, there are some hints in that direction.

    --
    "Bernoulli was wrong. X proves that you can fill a vacuum, yet still it sucks." - Dennis Ritchie
  115. Re:Slightly Off Topic by aeonek · · Score: 1

    One thing that we may find, however, is that a certain architecture is emulated better than x86 (i.e. the PowerPC, ARM, or Alpha architecture may be easier to translate into native VLIW). Therefore it may be a better idea to run Linux over PPC/ARM/Alpha code-morphing software on a Transmeta chip (or maybe just a specific type of Transmeta chip works better, etc., etc.)

    I have thought about this also. I don't want this old legacy stuff, although the core is all cutting-edge and all that.

    You could design a "pseudo-architecture", that is optimized to be translated in realtime by code morphing software, rather than executed in hardware, like current architectures. That is what Elbrus did with their E2K.


    --
    "Bernoulli was wrong. X proves that you can fill a vacuum, yet still it sucks." - Dennis Ritchie
  116. Re:Some Question about Crusoe by eggnet · · Score: 1
    In addition, the cache's size or location isn't given. Is it a small cache on die or is it located in system memory?

    The answer is both. It is stored in main memory, which is cached on chip. The CMS sets the memory usage on boot, but the OS can change the allocation on the fly. It's in the Ars article.

  117. A lot of new ideas! by varaani · · Score: 1

    I missed the webcast, so reading this article was something of a revelation to me. I'm amazed by all the things they do differently than any processor it is compared to, and although I know nothing about StrongARM or other mobile processors I have no trouble believing that it's completely different from them too. Just how long did the developing of the Crusoe take?

    The Slashdot crowd (and me too) usually feels reluctant towards doing things in software that could be done in hardware, e.g. WinModems. However the Crusoe does this for a good reason: to save power, not production cost.
    I'm already waiting for the Transmeta ads showing not MHz, but MHz/W numbers..

  118. Java? by oren · · Score: 1
    There was a time people thought that creating a dedicated "Java Chip" would be a good idea. It sort of fizzled, but the notion does have some merit.


    Technically, the immediate candidates which came to mind to pull such a trick have been X86 companies. They have a lot of experience in converting one instruction set (X86) to another (whatever their chip "really" runs). For some strange reason, Intel never made anything of this :-)


    Transmeta, on the other hand, seem to have both the technology and the lack of, shall we say, unfortunate entanglement with a certain software company :-) so such a project would make sense for it. A JVM which was tightly integrated with the code morphing layer could probably run circles around existing implementations.


    Given that the same chip could also run another instruction set at the same time (it's all in the code morphing software, after all), you'd get a machine which runs native Windows/Be/Linux/etc. applications, but doesn't discriminate against Java ones. In fact, it might even encourage them. Java bytecode should be much easier to optimize than x86...


    If Sun has any sense, they would be starting to work on such a thing ASAP. It is a natural fit to their Jini initiative - not to mention the HAVI one. If Transmeta would release a Java-friendly chip with tailored HAVI support, they could be "the" choice for consumer electronic devices.


    Life is definitely going to be interesting in the next few years...

  119. Thank you Ars Technica by Alton · · Score: 1
    Thank you Ars Technica for stating exactly what has been biting at me since I read the original story on slashdot ( and the following 600+ posts).

    Crusoe IS NOT FOR GAMING MACHINES.

    Crusoe IS NOT FOR SERVERS.

    Crusoe WILL NOT REPLACE THE PIII and Xeon and Athlon et al.

    Crusoe is for machines where high end performance is secondary to EFFICIENT PERFORMANCE. (laptops etc..)

    Will you see an SMP capable Crusoe? Probably not. Why? Because you don't need SMP to run Word and Netscape well.

    Will you see desktop computers running Crusoe processors? Probably very few, and most of those will be built by people like the slashdot crowd who believe it's the best processor for all jobs, and those who just like to tinker.

    Crusoe isn't designed to be the best in all areas. It is designed for the mobile market. If you want to put together a server or a blazing fast Quake machine... DON'T GET A CRUSOE. That's not what it is designed for.

    Thanks Ars Technica for not falling for the hype and telling it like it is. And thank you for a wonderful technical briefing. As always, the technical writing prowess of the Ars Technica staff impresses me. Most people have a VERY difficult time explaining technical issues half as well.

    --
    "Anyone who can't laugh at himself is not taking life seriously enough." - Larry Wall
  120. Re:Crusoe core instruction set - Mobile Linux? by TimRiker · · Score: 1

    Doesn't it stand to reason that Linux compiled for Crusoe would run faster than Linux compiled for x86? I would like to see tools available to compile the base OS in native instructions and run the apps in x86. This should give a significant performance boost. Maybe a tool to convert binaries and save the result to disk? The users could DL apps like WordPerfect and convert them to Crusoe's binary format. If one is running only open source software, then one could recompile the world. Now that's what I'd like, a Crusoe all-day laptop running Crusoe Linux. ;-)

    --
    Tim Riker - http://rikers.org/
  121. Hannibal rules. by dwalsh · · Score: 1

    That is an excellent article. If you are into CPUs you should read the other articles he has done on Ars Technica (linked to at the bottom of the Transmeta article).

    --
    ${YEAR+1} is going to be the year of Linux on the desktop!
  122. Does anyone know... by Esperandi · · Score: 1

    If you can sign up to receive Transmeta's developers kit thingee they have on their website if you're not in a company and will probably never really produce anything? I'd just love to read about it and toy with ideas, but I doubt I'd ever produce anything of real value... I don't want to sign up and have them smack me on the ass...

    Esperandi

    1. Re:Does anyone know... by jfunk · · Score: 3

      Most, if not all, semiconductor manufacturers are really cool about this. The companies that were the coolest to me were: Analog Devices, Microchip Technologies, Maxim, National Semiconductor, TI, and Motorola among others.

      All of those companies gave me precious device documentation and many of them gave samples as well. I used all of this in school and later in professional life ("we need a good low-power instrumentation amp." "I got a really cool one from AD which has great documentation, let's try it out and we can use them in volume (millions) later" "ok"). Semiconductor companies know the benefits of such behaviour and tend to act accordingly.

      Embedded technologies are a very lucrative market that a lot of young people are jumping directly into (myself included). To deny the flow of information on your products would be like tying your own knot. I'm pretty sure Transmeta realises this.

      Ask and ye shall receive.

  123. No Hype by Esperandi · · Score: 1

    The marketing guy explained that Transmeta has no hype, it's all buzz ;) Buzz is when other people speculate about your company and products, hype is when you speculate about your own company and products... I'm not sure which is better, or which leads to fewer let-downs...

    Esperandi

  124. Re:What I'd like to see... by T-Punkt · · Score: 1

    > I want to see Crusoe vs StrongARM.
    And I want to see Crusoe vs ARM's ARM10 (400 Dhrystone 2.1 MIPS at 300 MHz... optional Vector Floating-Point unit capable of delivering 600 MFLOPS ... )

    The StrongARM design is over four years old now, and development of the StrongARM family has nearly stopped since Intel bought it from Digital. Fastest StrongARM when launched (5th Feb. 1996): 200 MHz SA110; fastest StrongARM Digital made: 233 MHz SA110; fastest StrongARM Intel makes now: 233 MHz SA110...

  125. Re:Beowulf by jmp100 · · Score: 1
    Another consideration in SMP is this: If the CPU costs $75 but the motherboard costs $940, it's not a bargain until lots of people make the motherboard, which drives the price down.

    I think SMP is a good idea for these chips. If the architecture can be modified to do it, I don't see why you couldn't have four or eight of these on one board. If you've ever looked inside the chassis of modern dial-up gear (the "modems" on the ISP's end, not the POTS device with the red blinky lights you have on your serial port), you know it's not unreasonable to have upwards of 8 processors - such as the i960, in Nortel's CVX gear - on one card alone, with numerous cards in one chassis.

    At that point, you could build a massively parallel single computer, or a cluster of them if you needed even better/more redundant/more fault tolerant/whatever performance.

  126. Re:Was Quake3 running with a hardware accelerator? by tsphere · · Score: 1

    Everything in the docs for quake3 implies that you *need* some form of hardware acceleration to even run the game at all. Mesa lets you get around that, but at what a cost!

    Anyway, if you check Transmeta's circuit diagrams, a large part of the chip is blocked off as a "Floating Point / Graphics Unit." True, the chip has a rockin' 128-bit core, but I doubt that it could handle the general floating-point calculations needed for transform and lighting as well as the rest of the 3d pipeline normally handled in the accelerator card (read fillrate).

    Although if they're powering digital LCDs they don't need massive, hot, expensive 350 MHz RAMDACs...

    --
    Tetris rules.
  127. IBM in the '70s all over again by Mija+Cat · · Score: 1

    IBM in the '70s had a problem... they could build more powerful processors, but had to keep compatibility with previous ones.

    Part of their solution was to rig the OS so it could "host" other more primitive OSes and make the appropriate calls to the new CPU as needed.

    Looks like Transmeta just re-applied this, only at a layer much closer to the silicon.

    I guess those who don't study history (are you listening, Intel?) are doomed to be defeated by those who do...

    Meow

    --
    Yes, that's really my e-mail. Don't change a thing.
  128. Crusoe == Winmodem by mprovost · · Score: 1
    It's basically the same concept - why waste silicon when I can do the same thing in software? And I won't publish my interfaces and only support the most common systems. This is basically what Transmeta is doing by only making code morphing for x86. It's just like the Winmodem manufacturers only providing drivers for Windows. They leave Linux users out - and people complain.

    Why? The argument usually goes like "Hey, I have some component here that in theory I could be using but they won't write drivers for me or release the specs so I can do it myself." The counter argument is to go buy a "real" modem with everything implemented in silicon. Pretty soon people will be complaining, "Hey, I bought this laptop and it won't let me run LinuxPPC, even though it is clearly capable of doing so, if they wrote a code morpher or released the VLIW specs so I could do it myself." A similar counterargument is: go buy a real processor that supports all this in silicon.

  129. Wouldn't it be better... by Pufferfish · · Score: 1

    ...to, instead of writing software for the code-morphing and then run it on the normal chip, instead make two smaller chips, one being the Crusoe and the other being another chip that would be optimized for the functions code-morphing needs? He mentions that most x86 chips have all these functions on the chip, and the chip is optimized for those functions, which translates into a bigger/hotter chip but also a faster chip. In Crusoe, the stuff is done in software, which actually means that Crusoe still has to do it, except in a different way and without optimized hardware. Now, the reason for this is to keep heat/size/energy use down.

    So, why not make two chips (you could put a heatsink on both if you wanted them to be well cooled), and first send the instructions through one (the 'Friday,' as it were) which would translate the x86 instructions, do the branch predict, register rename, and instruction reorder. Then you send it along to your main chip (the 'Crusoe') which would then do the processes. In other words, instead of just making the chip smaller and then running what you took off the hardware in the software, move it on to another chip. This would:

    A) Increase the amount of airflow you could get over your chips (because there are two chips instead of one). And...

    B) Increase the amount of work the Crusoe could do, because it isn't doing all that translation anymore. This would have the added bonus of making the Crusoe even cooler.

    Now, I suppose the tradeoff would be that, since you're running on two chips instead of one, the information has to go farther (it has to travel between the chips instead of within the chip). Except that if you put your 'Friday' in the way between your Crusoe and its input, you'd be replacing that much wire. I don't know how much of a speed loss it would be, but I don't think it would be much. And since you'd never have to send information from the Crusoe to the Friday, there wouldn't be any problem there; when the Crusoe does its thing, it can just send the output directly to whatever needs it, bypassing the Friday.

    Also, it might not fit in a standard motherboard, except that since the Crusoe doesn't seem like it will be sold alone (it looks like it will be built into portable devices), it shouldn't be a problem.

    --
    Then again, I could be wrong.
  130. Re:What I'd really like to hear about... by MrHat · · Score: 1

    I watched the entire Transmeta presentation yesterday (~2 hrs long). From what I saw, I got the impression that the "Code Morphing Software" also serves as a layer of abstraction, allowing Transmeta to change the underlying CPU implementation or instruction set without breaking applications. I even saw (I think in another /. post) that even the VLIW instructions are at least partially translated by the "Code-Morphing" software into a lower-level format.

    Playing around with the low-level stuff - including branching, etc - would be a blast, but I got the impression that Transmeta would remain reluctant to release specs, for fear of being forced into the backward-compatibility game, much like Intel.

  131. Re:Beowulf by MrHat · · Score: 1

    This is one of the questions I really would have liked to hear asked at the press conference - "Are there any plans/hooks in place for SMP operation?".

    Massive SMP looked very probable IMHO - especially the heat/power consumption angle of it.

  132. java by sc · · Score: 1

    Is there opportunity here to somehow make Java faster? It seems redundant that the JVM converts to native code which gets converted to Crusoe's ISA. What if a JVM could directly convert to Crusoe's ISA. Please excuse my ignorance on technical matters; and please no flames about Java.

    1. Re:java by chrislike · · Score: 1

      The only way this could work is if Transmeta created the JVM for the Crusoe. Then I think it would work at the same level as the x86 instruction set, just converting Java straight to VLIW instead of whatever was running on the x86 system. It's a decent idea, but Sun wouldn't be able to write the JVM, as they don't have access to the instruction set, like the rest of us.

  133. Re:Slightly Off Topic by Karellen · · Score: 1

    ActiveX? Ewwwwww! :)<humour>

    OK, RedHat may not provide pentium-optimised packages, but if the _chip writers_ write the compiler optimisers (which they could put the team who are currently working on the Code Morphing Software on, as that wouldn't be necessary anymore) then they could ship the relevant compiler for the chipset, along with an OS compiled with that compiler on the box they ship.

    This would also mean that even though the compiler is optimised to some degree by the chip vendor, the Open Source community would be able to play with it as much as possible to eke out maybe a little more speed, which they can't do at the moment with the Code Morphing stuff (as far as I have been led to believe).

    (Note: this is still devil's advocate. My personal opinion is still that the Crusoe range sounds like seriously cool stuff - so no flames please :) )

    --
    Why doesn't the gene pool have a life guard?
  134. Re:Slightly Off Topic by Karellen · · Score: 1

    Playing devil's advocate here (sort of)

    The argument you're making surely only holds water for closed source software.

    The only reason that Intel *has* to maintain backwards compatibility with the 386 (and even 286?) is because there's a load of really old *binaries* out there that won't run if you remove some of the old instructions. Surely with OSS, all you need to do is write a new back-end for your favourite cross-platform compiler suite (gcc, anyone?), rebuild your app and copy it to your new computer with its brand spanking new chip that doesn't have anything in common with anything that's come before it, and it'll all still run fine.

    You want to junk those extraneous FPU instructions that now have equivalents in your new SIMD unit? Go ahead. The new compiler back-end you've just written to accompany your new chip won't generate any of those old FPU instructions; it'll hand that work to your SIMD unit instead.

    Backwards *binary* compatibility is a *closed source* problem.

    Why not have your compiler generate native Crusoe 3400/5400 instructions (if such things exist)?

    K.

    --
    Why doesn't the gene pool have a life guard?
  135. Re:What I'd really like to hear about... by billybob+jr · · Score: 1

    The morphing software is going to be stored in flash ram and loaded into system memory (or maybe the L1 cache?!?). Sounds like a pretty crazy scheme they have come up with.

  136. Re:Slightly Off Topic by timmyd · · Score: 1

    I think that a processor like this aims not to be flaming fast but to be flexible. I doubt they made this thing so they could get two hundred fps out of quake in software mode. I personally think that flexibility or extendability is more important than speed because if you want to change something, you don't mess everything else up. Wouldn't it be neat if everything was backwards and forwards compatible?

  137. Re:You aren't SOPOSED to code in it's native set by rsborg · · Score: 1
    I'm sorry, I don't get it. Maybe I'm just dense. Why do all this "morphing" and optimizing at runtime, instead of at compile time?

    Here's a non-hardware example: Oracle. Originally, ORCL used basic heuristics and rule-based optimization. However, for large DBs and high-throughput installations, the big win comes with the Explain Plan and the cost-based optimizer, which works from statistics gathered on the actual data. In newer versions, they will stop supporting the rule-based optimizer entirely. (read Oracle Performance Tuning)

    Simply stated, there are things you can do at runtime that rely on information you can't know at compile time, and that makes them more efficient.
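
    To make the Oracle analogy concrete, here is a toy sketch of a fixed rule versus a statistics-driven (cost-based) choice. It is not Oracle's optimizer; the function names, the cost model and the `index_cost_per_row` factor are all made up for illustration.

```python
def rule_based_plan(has_index):
    """Fixed rule, decided without looking at the data."""
    return "index" if has_index else "full_scan"


def cost_based_plan(total_rows, matching_rows, has_index, index_cost_per_row=4):
    """Pick the cheaper plan from statistics gathered on the live table.

    Assumed toy cost model: a full scan reads every row once, while an
    index lookup pays `index_cost_per_row` for each matching row.
    """
    if not has_index:
        return "full_scan"
    full_scan_cost = total_rows
    index_cost = matching_rows * index_cost_per_row
    return "index" if index_cost < full_scan_cost else "full_scan"
```

    With 1000 rows and 10 matches the statistics favour the index; with 900 matches they favour the full scan, which the fixed rule can never discover. Code morphing is in the same position: it sees how the program actually behaves, not just how it was written.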

    --
    Make sure everyone's vote counts: Verified Voting
  138. Re:You aren't SOPOSED to code in it's native set by khym · · Score: 1

    That's the whole point of Crusoe, you DON'T code for it directly. It takes other instructions, starting with x86, and runs them faster, better, and optimizes on the fly.

    The "code morphing" layer is what makes Crusoe stand apart from the rest. It optimizes the instruction set it's running on the fly. This means that your apps will run faster and faster as they run. This layer is what gives the Crusoe its speed. Coding natively would be SLOWER than using the morphing layer. You also don't get the benefit of the optimization.
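
    A rough sketch of the "faster as it runs" idea, going by what has been said in this thread (cold code is interpreted, hot code is translated and kept in a fixed-size cache). This is a toy model, not Transmeta's algorithm: the class name, the `hot_threshold` value and the least-recently-used eviction policy are assumptions.

```python
from collections import OrderedDict


class TranslationCache:
    """Toy model: interpret cold blocks, translate hot ones, evict LRU."""

    def __init__(self, capacity=2, hot_threshold=3):
        self.capacity = capacity            # max translated blocks kept
        self.hot_threshold = hot_threshold  # runs before a block is translated
        self.run_counts = {}                # block address -> times executed
        self.translations = OrderedDict()   # block address -> translated code

    def execute(self, addr):
        """Run the block at `addr` and report how it was run this time."""
        self.run_counts[addr] = self.run_counts.get(addr, 0) + 1
        if addr in self.translations:
            self.translations.move_to_end(addr)    # mark as recently used
            return "cached"
        if self.run_counts[addr] < self.hot_threshold:
            return "interpreted"                   # not worth translating yet
        if len(self.translations) >= self.capacity:
            self.translations.popitem(last=False)  # evict least recently used
        self.translations[addr] = f"native code for block {addr:#x}"
        return "translated"
```

    Calling `execute(0x10)` four times on a fresh cache goes interpreted, interpreted, translated, cached. Code that runs once (init stuff) never pays the translation cost at all.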

    What about designing a virtual architecture that is very easy for the code morphing engine to translate, gets rid of performance degrading quirks (like the x86 exception handling mechanism), and also allows you to give hints to the optimizer and code morpher. Programs compiled to this virtual architecture would execute faster than x86 code, but would still take advantage of all of Crusoe's features and also be runnable on all of the different Crusoe chips.
    --
    Give a man a fire, and he'll be warm for a day, but set him on fire, and he'll be warm for the rest of his life.
  139. Re:Some Question about Crusoe by rjamestaylor · · Score: 1
    scheme wrote:
    ...If you power off the computer...

    That's the old way of thinking. Crusoe sleeps drawing less than 20 milliwatts. Who turns this thing off?
    It runs Linux. Who needs to turn this thing off?

    It runs Windows... well, that's a different matter.

    Transmeta: the processor rethought.

    :-only kona in my cup-:
    :-robert taylor-:
    --
    -- @rjamestaylor on Ello
  140. Re:The customary question... by skvat · · Score: 1

    Why not? It's a CPU. It can run Linux. There's no reason I can think of why it shouldn't be able to be the CPU in a cluster node. The only problem I see is the price (min. $65 for the cheapest and $120+ for the bigger one), which is not exactly low.

    --
    Help! my .signature is stalking me!
  141. Crusoe Possibilities? by Scriven · · Score: 1

    This article brings up some interesting possibilities, and I'm wondering how viable some of the logical extensions to these possibilities are.

    He mentions, for instance, that PPC/Alpha emulation is theoretically possible. Would it be possible to do both at the same time, some sort of hybrid Mac/PC?

    Also, if the translation is done in software, could it not be possible for the bios/OS to recognize one x86 chip, but have the code morphing software actually be translating for X processors in parallel? There would be no recompiles necessary, and it should run as a super fast x86 box, shouldn't it?

    Thanks for all the great info!

    --
    This is my .sig. It isn't very big.
    --An Oldie, but a Goodie!
  142. Doesn't make sense by Hoo00 · · Score: 1

    You have a genuine Intel x86 chip running x86 software with a hard instruction set. Then, you have "cool shoes" running a very long instructed road with soft ware, morphing to be x86 compatible. How does that make it any faster than an Intel chip? Even Einstein cannot break the speed of light and Linus cannot break the software gates. Before I go any further off topic, all I want to say is, don't trust Transmeta! They make claims but show no real benchmarks or real solid evidence. I felt pity for Linus after I watched the webcast... what was he thinking, being manipulated by some corporate "bad guys" to play quake like a kid. Remember what the guy said? It was his show!! I bet they fixed the match, so that Linus would lose and make Linux look bad. Now, take a look at Transmeta's website and see who the bosses are. Linus is nobody, an employee and a tool used by a startup company to attract Microsoft's attention. Oh yeah, Linux will run on the 400 MHz chip and that's all it can do, forever doomed in a ROM chip. If you want a real mobile solution, try the 700 MHz solution that runs on Microsoft Windows!!! (Too bad this is just a joke and does not represent my view of the actual event.)

  143. Re:You aren't SOPOSED to code in it's native set by dtaye · · Score: 1

    We've seen a 100% increase in speed and a significant decrease in footprint with the HotSpot Java VM - this is all due to the profiling info it collects while running code. The same goes for the Crusoe morphing layer. Now what would be interesting is to see HotSpot on Crusoe vs. MAJC!

  144. Re:The customary question... by dizzydogg · · Score: 1

    I'm Canadian, so my money isn't worth much, and I'm just out of college and just started working, and I could still afford to buy about 4 of the high end ones per month (minus motherboard & extra hardware requirements). My computer would go up by approx. 2.4 GHz per month!

  145. Re:Beowulf by dizzydogg · · Score: 1

    Considering the low size/heat/power usage of these chips, they could probably squeeze several of these into one chip. Imagine a 3 GHz processor that can run x86/macintosh/alpha/etc. with just a change of software.

  146. Re:Sweet...or sour??? by aroobie · · Score: 1

    I'm afraid that I have to agree with this posting. However, I can see a lot of uses for the Crusoe. I have a hundred users who never do anything but run word processing and who absolutely don't need a PIII or anything much above a Pentium 233. In fact, I have trouble getting a 'small' enough machine now. If the cost is right for a lightweight desktop, I'll stop buying Intel... Not to mention the 'pat on the back' I'll get for reducing the monthly power bill.

    Still drooling for another Alpha at my site..

    --


    My other car is a motorcycle!
  147. Re:Slightly Off Topic by Ravensign · · Score: 1
    I think that some of the normally on-chip logic is "emulated" in software. According to the article at Ars this includes the branch predictor, as well as the x86 decoder.

    I think that, given that the chip hosts some of its functionality in software, writing to the VLIW native set wouldn't improve things because it still needs to be massaged by the code morph software, i.e. the core can't run "everyday" software without going through code morph, ipso facto.

    For instance, you go through all the trouble to compile for the VLIW ISA and then, given the nature of the chip, it can't even run your binary directly on the core anyhow; it has to go through the morpher to enjoy 100% of the advantages of the architecture.

    At that point you might as well use the most (commercially) successful ISA ever, x86 which they have worked hard to optimize the code morpher for, anyhow.

    --
    "Sig free in '03!"
  148. Apple does NOT own PowerPC by Wesley+Felter · · Score: 1

    The PowerPC architecture was designed primarily by IBM and Motorola and the specs have been publicly available for years.

  149. Games and Crusoe by striker17 · · Score: 1
    Crusoe opens up a lot of interesting opportunities for future game optimization. The only reason why Quake et al. are written for x86, PPC, etc. is that these chip architectures are the only ones available currently. No one (that I know of) has specifically designed an instruction set that is tailored toward being game-optimized. Maybe the x86 instruction set is not really optimized for games and is really horrible from the hardware angle. With Crusoe, you could define your own game-specific instruction set (with the appropriate translations defined on the Crusoe-side.)

    I can just imagine it now... an Open Source instruction set for a virtual processor running only on Crusoe (under Linux of course.)

    My only question here is whether or not the instruction set that the game/application is compiled against makes a large difference in the ultimate performance. My guess is that having a specific instruction set that is designed specifically for a given purpose (i.e. fast games) would boost performance tremendously.

    ************************

    This .sig space for rent

    ************************

    1. Re:Games and Crusoe by Graymalkin · · Score: 2

      The best way for games to run would be like what nVidia is doing with the GeForce's GPU: the GPU handles the graphics and transforms and all the heavy duty FPU calculations while the system's CPU handles the actual code of the game. The instruction set I would guess would be best for gaming is true RISC; it gets the job done as simply and quickly as possible. Games, like any graphical application, pose a challenge to processors and programmers because you have two things going on: the data manipulation and control of the program, and then the manipulation of the graphics. Look at any CLI program: it usually has a single job to do at a time and can work in order, whereas Quake needs to do 30 things at once.

      --
      I'm a loner Dottie, a Rebel.
  150. Crusoe core instruction set? by Anonymous Coward · · Score: 2

    There is a lot of glitzy information now available about Crusoe: VLIW, a core instruction set that is nothing like x86, and the code morphing software. But the actual technical nitty gritty seems to be lacking. Can a program get access to the core instruction set, thus bypassing the code morphing? Is it possible to detect the Crusoe processor with x86-compatible instructions, so that in performance-critical sections of an application Crusoe-specific/pre-morphed code can be run if the Crusoe is detected, while the application can still execute standard x86 code if it isn't? Can a programmer provide their own code morphing software, thus turning Crusoe into a fast Z80, for example? Does Transmeta have plans to code morph other instruction sets like PPC? And does "Linux Mobile" contain any Crusoe-specific instructions, or does it depend completely on the software code morph of x86?
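
    For what it's worth, the "detect and fall back" part of the question is an ordinary runtime dispatch pattern, sketched below. Nothing in the thread says Crusoe exposes its native instruction set, so the tuned routine here is just a stand-in and the dispatch table is hypothetical. "GenuineTMx86" is the CPUID vendor string Transmeta parts report; obtaining it (e.g. from `vendor_id` in /proc/cpuinfo on Linux) is left to the caller.

```python
def checksum_portable(data):
    """Baseline path: works on any CPU."""
    return sum(data) & 0xFFFF


def checksum_tuned(data):
    """Stand-in for a CPU-specific fast path; must give the same result."""
    total = 0
    for start in range(0, len(data), 4):   # process four bytes per step
        total += sum(data[start:start + 4])
    return total & 0xFFFF


# Hypothetical dispatch table: CPUID vendor string -> tuned routine.
TUNED_PATHS = {"GenuineTMx86": checksum_tuned}


def select_checksum(vendor_id):
    """Pick the tuned routine if one is registered, else the portable one."""
    return TUNED_PATHS.get(vendor_id, checksum_portable)
```

    The program calls `select_checksum(vendor_id)` once at startup and uses whatever comes back, so it still runs unchanged on a CPU it has never heard of.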

  151. Re:What I'd like to see... by Yarn · · Score: 2

    PPC chips aren't really aimed at the mobile market. I want to see Crusoe vs StrongARM.

    --
    -Yarn - Rio Karma: Excellent
  152. Re:What I'd really like to hear about... by David+Greene · · Score: 2
    Except that the code morphing software has one very important property: it optimizes the code dynamically. You can't do that by statically compiling to the VLIW layer.

    Now I suppose Transmeta could design a full O-O-O core, but I don't see the point. If the software does a good job, the additional flexibility they gain to change the underlying machine is worth it.

    As far as branches go, yes, you usually can guess a backward branch is going to be taken. But branches are still a huge problem. It's tough to keep a processor core fed. And don't even get me started on multiple branch prediction. The hit rate goes way down. A study was done here that showed processors today (or in the near future) spend about half the time recovering from branch mispredictions. That's a lot of wasted work. While the code morphing software can't do a perfect job, it is somewhat easier to tune the chip. And then think about per-application tuning. Load a different set of rules depending on the program you're running.
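
    A small sketch of the "guess backward branches are taken" rule, to show where it works and where it falls apart. The trace format (pairs of `is_backward`, `was_taken`) and both function names are made up for the example; no real predictor is being modelled.

```python
def backward_taken_accuracy(trace):
    """Score the static rule "backward taken, forward not taken".

    trace: list of (is_backward, was_taken) pairs, one per executed branch.
    Returns the fraction of branches the rule predicts correctly.
    """
    if not trace:
        return 0.0
    correct = sum(1 for is_backward, was_taken in trace
                  if is_backward == was_taken)
    return correct / len(trace)


def loop_trace(iterations):
    """A loop's backward branch: taken every time except the final exit."""
    return [(True, i < iterations - 1) for i in range(iterations)]
```

    A 10-iteration loop scores 9 out of 10, since only the exit is mispredicted. A forward branch that alternates on the data scores about half, and that is the kind of branch where a recorded history helps.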

    Interesting, no? :)

    --

  153. I guess you didn't read the article by kip3f · · Score: 2
    • This architecture allows for some interesting optimizations not feasible in conventional CPUs.

      "Crusoe's Code Morphing software not only keeps track of which blocks of code execute most often and optimizes them accordingly, but it also keeps track of which branches are most often taken and annotates the code accordingly. That way, Crusoe's branch prediction algorithm knows how likely a branch is to be taken, and which branch it should speculatively execute down. If a branch isn't particularly likely to go one way or the other, then Crusoe can speculatively execute down both branches.

      Contrast this with speculative execution done on a normal CPU, where hardware limitations like buffer and table sizes limit the amount of information you can store about a particular branch and its execution history. Since Code Morphing keeps track of the branch histories in software, it can record a more finely grained description of the execution patterns of a wider window of code, and therefore assess more accurately whether or not a specific branch is likely to be taken."

    • High performance on the desktop is also interesting: "So you see, they made the Code Morphing software extremely modular. They can implement whatever parts of it they like in hardware to get whatever degree of performance gain they want. Crusoe should be viewed more as a proof of concept than as the ultimate outcome of 5 years of work. Crusoe represents one extreme of a spectrum that stretches from "implement the bare minimum in hardware" to "implement everything in hardware." Now that Transmeta has a technology that's proven to work in the most difficult case (where 2/3 of the transistor logic has been moved into software), they can go back in the other (easier) direction and start putting stuff in silicon.

      Furthermore, since there's a software layer between the ISA of the binary and the machine's native ISA, Transmeta is free to beef up the execution engine (or any other part of the core) however they like, because the only thing that will require a recompile is the Code Morphing software. A case in point is the two chips in its product line. Each has a slightly different core (the Windows chip has special instructions in it that help speed up Windows), but they both are fully x86 compatible. There's nothing to keep them from stuffing new functions and features (SIMD anyone?) into the silicon, to help scale the product as high up as they want to go with it.

      I'd say that it's only a matter of time before we hear an announcement of another product line from Transmeta. It won't be named Crusoe, because it won't be aimed at the mobile and embedded markets. It'll be a workstation and server class x86 CPU that runs Linux like a fiend, and it'll compete directly with Intel's IA-64. I can't wait."
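
    The branch bookkeeping described in the first quoted passage can be pictured roughly like this: counters kept in ordinary memory, turned into an annotation once enough history exists. This is only a sketch of the idea; the class, the `bias` cutoff and the `min_samples` value are assumptions, not anything taken from the article.

```python
from collections import defaultdict


class BranchProfile:
    """Per-branch taken/not-taken counters kept in ordinary memory."""

    def __init__(self, bias=0.8, min_samples=4):
        self.bias = bias                # share needed to commit to one side
        self.min_samples = min_samples  # observations needed before annotating
        self.counts = defaultdict(lambda: [0, 0])  # addr -> [not_taken, taken]

    def record(self, addr, taken):
        """Note one execution of the branch at `addr`."""
        self.counts[addr][1 if taken else 0] += 1

    def annotate(self, addr):
        """Say which way (or ways) to speculate for the branch at `addr`."""
        not_taken, taken = self.counts[addr]
        total = not_taken + taken
        if total < self.min_samples:
            return "unknown"
        if taken / total >= self.bias:
            return "speculate-taken"
        if not_taken / total >= self.bias:
            return "speculate-not-taken"
        return "speculate-both"         # no strong bias either way
```

    Because the table lives in software, its size is bounded by memory rather than by a fixed hardware buffer, which is the contrast the article draws.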

    I, for one, am really excited about the possibilities.
    --
    My opinions may have changed, but not the fact that I am right.
    --
    ****Gfx Scrollbar Special case hit!!*****
  154. Dynamic clock speed adjustment and BogoMIPS? by Tet · · Score: 2
    The article talks about the chip dynamically adjusting its clock speed to minimize power usage. While this is done in existing laptops, it only tends to be triggered by specific events (being connected/disconnected from mains power, battery level reaching certain thresholds etc). The article seems to imply Crusoe will adjust its speed dynamically at any time.

    I'm curious to know how OSes will handle this. For example, we've already had a thread on the linux-kernel list about timing loops being thrown off by this for existing laptops (because the bogomips on which they're based are calculated at boot time). What was the outcome of that thread? Was a solution reached? Will it apply for Crusoe too?
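
    To put numbers on the timing-loop worry: a busy-wait delay sized from a boot-time calibration runs long or short whenever the clock differs from the boot clock. The arithmetic below is a simplified model, not the kernel's actual code; the two cycles per loop iteration and the clock speeds used are assumed values.

```python
def calibrate_loops_per_usec(clock_mhz, cycles_per_loop=2):
    """Delay-loop iterations that fit in one microsecond at `clock_mhz`."""
    return clock_mhz / cycles_per_loop


def actual_delay_usec(requested_usec, boot_mhz, current_mhz, cycles_per_loop=2):
    """Real duration of a delay sized with the boot-time calibration."""
    loops = requested_usec * calibrate_loops_per_usec(boot_mhz, cycles_per_loop)
    cycles = loops * cycles_per_loop
    return cycles / current_mhz     # MHz is cycles per microsecond
```

    Calibrated at 700 MHz and then run at 350 MHz, a 10 usec delay takes 20 usec, which is merely slow. Calibrated at 350 MHz and then run at 700 MHz, it takes 5 usec, and that is the dangerous case for drivers that rely on a minimum wait.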

    --
    "The invisible and the non-existent look very much alike." -- Delos B. McKown
  155. Re:Slightly Off Topic by Guy+Harris · · Score: 2
    (actually, I understand that AS/400 machine code is abstracted from the object code of programs, though probably not in quite the same way as how Transmeta did things

    Correct. Compilers for the AS/400 (and its System/38 predecessor) for the languages in which applications are written generate code for a virtual machine with a very CISCy instruction set; low-level OS code translates that to the native instruction set. (That long antedates Transmeta; as indicated, it dates back to the System/38, which I think came out in the late '70's; IBM needed no technology from Transmeta to do that - binary-to-binary translation is hardly a Transmeta invention.)

    It isn't done in exactly the same fashion, in that, on S/38's and AS/400's, the low-level OS code is written in languages that compile (or, for some code, assemble) into the native machine's instruction set, unlike Crusoe, where the only native code that's run is the translation software and the output of the translation software. Also, I don't think the translation on AS/400 is done as dynamically; I think programs are translated in their entirety the first time they're run, and the executable code for the entire program is kept around.

  156. What I'd like to see... by SEE · · Score: 2

    Comparisons to the PowerPC chips.

    After all, the Crusoe architecture is not a performance demon aimed at desktops/servers, and it is not aimed at the ultra-low power consumption StrongArm market. But it might be suitable for the sorts of applications that embedded PPCs are currently used in...

    Steven E. Ehrbar

  157. Crusoe is like database SQL. Why is there SQL?! by deusx · · Score: 2

    No, no, no, **YOU** STILL DON'T GET **IT**.

    As far as I can tell, the Crusoe processor engine itself is not special. If you are a "talented programmer programming to the bare metal", you might as well program in assembly on another pre-existing chip.

    And then as a chip manufacturer, you'll face 20 years of trying to keep supporting the vintage instruction set that those bare metal hackers employed.

    You're missing the point.

    Take database servers. Oracle, MySQL, Informix, Sybase, Uncle Joe's Ultimate Data Thingy... Just about all of them allow access to their data through a standard SQL language.

    But... But... but... Wouldn't it just be so insanely cool and fast if I could just directly access the ISAM structures and indexes and modify disk sectors directly?!?! I fully expect every dedicated DBA and application designer to go to the bare iron to squeeze performance from their data warehouses!

    Has that happened? No. Why? Because MOST, EVERY DAY APPLICATION DESIGNERS DON'T "PROGRAM TO THE BARE METAL". It's too complex, intensive, and fruitless a task. Why is Slashdot written in Perl and not assembly? Why isn't Linux 100% x86 assembly?

    There is a BIG difference between just a cool hack and maintainable elegance.

    Why do we have high level languages? Why do we have abstraction layers? Why?

    The Code Morphing is an abstraction layer. Initially, that layer is the x86 instruction set, an arbitrary set of instructions that just happens to currently be widely used. Using Code Morphing, the Crusoe can leapfrog on that wide base of support, while throwing away the hardware architectural garbage traditionally needed to support it.

    Back to SQL: Oracle supports SQL for access to data, but beneath, I'll bet you that a lot of the specific operations upon data that those SQL statements fire off has changed ENORMOUSLY over the years. What would have happened had they allowed programmers straight past the abstraction layer? They still would be trying to support that API today, and I bet they wouldn't be as free to rework their server software.

    Furthermore, why do we have the DBI module and DBD modules in Perl? To provide a semi-universal abstraction layer across all databases. When one database's API changes for performance reasons, efficiency, whatever, you just change the morphing-- er DBD-- layer to accommodate it.

    What is the point of Crusoe then?

    Not to provide assembly hackers with a new opcode set to learn and tweak, which 90% of the application design world will never learn or exploit, and therefore will remain voodoo essentially.

    The point is to provide an architecture which supports ABSTRACTION LAYERS of assembly opcodes. So Transmeta is free to vary the underlying hardware in any exotic or esoteric form they see fit, throwing backwards compatibility of their VLIW opcodes to the wind because the Code Morphing allows the SAME ABSTRACTION LAYER API to be exposed to the application designer.

    Now, finally, note I keep saying 'application designer'. This is as opposed to 'dedicated hacker'.

    Read the definition of a hack. The first two definitions are not my idea of elegance. Something that's quick and does the job but not well. Or, something that is incredibly good, but took a long time.

    Now, read the definition of elegant. Something that combines simplicity, power, and grace. Something that is understandable, almost obvious in its expression. Something maintainable.

    Tell me what's more maintainable: Assembly code for the Mx-650938 processor, or Java code. It's a close call, but I'll have to go with the Java code. It's harder to write a hack in Java, than it is to create an elegant design in assembly.

    It's not about performance. We haven't even BEGUN to wring the performance from the chips we have-- and why? because it's not humanly possible for every applications designer to be a brilliant assembly hacker, which is why we have compilers!

    So, finally, why spend your time learning the latest opcode set when you can just focus on a higher level language and leave the hand tweaking and performance tweaks to the man behind the curtain of the Code Morphing abstraction layer of OZ?!?!?!


  158. Re:Crusoe core instruction set - Mobile Linux? by binarybits · · Score: 2

    One of the things that Crusoe supposedly does is it caches frequently used code in its "compiled" form. This means that you only take a performance hit the first time you run it, and then it should run pretty much at full speed.

    If they give you access to the underlying architecture, then they are committed to keeping that architecture in future versions. This way they can make up a new ISA for every chip, and just tweak the code morphing layer to make it work.

    This gives them a performance hit now, but as Intel is forced to continue to support the x86 architecture in hardware for every new chip, they will have to make their chips ever bigger and ever hotter. Transmeta's approach will likely prove superior in the long run.
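
    The "new ISA for every chip" argument above can be sketched as one guest program fed through two different morphing back-ends. Everything here is hypothetical: the two "chips", their native encodings, and the function names are invented for illustration and say nothing about Transmeta's real VLIW formats.

```python
# A tiny "guest" program in an x86-flavoured notation: (operation, register, immediate).
GUEST_PROGRAM = [("mov", "eax", 5), ("add", "eax", 7), ("mov", "ebx", 1)]


def morph_for_chip_a(guest_ops):
    """Hypothetical chip A: numeric opcodes, 0 = load immediate, 1 = add immediate."""
    opcodes = {"mov": 0, "add": 1}
    return [{"opc": opcodes[op], "reg": reg, "imm": imm} for op, reg, imm in guest_ops]


def run_chip_a(native):
    regs = {}
    for ins in native:
        if ins["opc"] == 0:
            regs[ins["reg"]] = ins["imm"]
        else:
            regs[ins["reg"]] = regs.get(ins["reg"], 0) + ins["imm"]
    return regs


def morph_for_chip_b(guest_ops):
    """Hypothetical chip B: no load-immediate at all, only a three-operand add
    that can read a hard-wired zero register."""
    native = []
    for op, reg, imm in guest_ops:
        source = "zero" if op == "mov" else reg
        native.append(("ADD3", reg, source, imm))
    return native


def run_chip_b(native):
    regs = {"zero": 0}
    for _, dest, source, imm in native:
        regs[dest] = regs.get(source, 0) + imm
    regs.pop("zero")  # the zero register is not guest-visible state
    return regs


state_a = run_chip_a(morph_for_chip_a(GUEST_PROGRAM))
state_b = run_chip_b(morph_for_chip_b(GUEST_PROGRAM))
```

    The guest program never changes and both chips end up with the same guest-visible register state; only the morphing function is rewritten per chip, which is the freedom being described.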

  159. Re:You aren't SOPOSED to code in it's native set by binarybits · · Score: 2

    Because writing a new code morpher for this architecture would take R&D dollars that would be better spent emulating real architectures like PPC or IA64 with existing user bases. The small increase in performance you'd get from a "native" ISA would not justify the additional costs of writing and supporting the software for it.

    Also, it sounds like they are optimising each chip to specifically support code morphing from a specific architecture. That means that x86 *is* a reasonably efficient instruction set for this particular hardware. Yes, you could probably make a faster one, but the gains would be marginal unless you actually got direct access to the underlying ISA, which defeats the whole purpose of this strategy.

  160. Re:What I'd really like to hear about... by Admiral+Burrito · · Score: 2

    ...is how much faster this thing will run if it's not emulating an x86. It looks pretty hot under the hood, and if, instead of using standard guess-aheads, you can tell it which branch to use as default or even tell it about branches ahead of time (which you often know well before the actual conditional looping operation) so it's not guessing at all.

    It seems a lot of posters are thinking the same thing. But...

    You could say the same about a Celeron/P-III/Athlon/Whatever.

    "I wonder how much faster my Athlon would go if I could rip out the silicon that does the instruction decoding / reordering / branch prediction / etc and code directly for the execution units."

    It probably wouldn't go much faster (I'd guess that silicon does its job pretty well) but by ripping out all those transistors you could significantly reduce power consumption.

    In fact, if you think it through for five years or so you'll probably wake up one day and find you've re-invented Crusoe. Of course it'll be old news by then.

  161. I really... by Graymalkin · · Score: 2

    like the Ars article, it was well written. I think the Crusoe is impressive because it does what RISC was originally conceived to do. Look at MIPS: it's a RISC architecture, yet it has some of the most complex processing units you'll find. Things like Crusoe and MAJC really rattle the cages of other chip makers because they take an entirely different approach to chip design. Even PPC is getting really complex, especially by adding the AltiVec unit onto the die; while it improves performance in some calculations, it adds significantly to the price and complexity of the chip. The human brain can calculate some pretty complex things, yet its processing is done in a massive number of simple processes rather than a small number of complex ones. I think the next generation of supercomputers will be built a little more like Crusoe chips, maybe even using Crusoes. The more times it works a calculation the faster it does it; this would add phenomenal performance to a lot of things we use supercomputers for right now. Maybe in the next ten years we'll see desktop teraflop systems.

    --
    I'm a loner Dottie, a Rebel.
  162. Re:Slightly Off Topic by Shoeboy · · Score: 2

    Ahem...
    IF YOU WANT TO CODE DIRECTLY TO A VLIW CORE BUY A &*$#ING MERCED!!!!!!
    Sorry about that. You lot just aren't getting it. If you remove the code morphing layer, then you have to put backwards compatibility into the hardware down the road. That means lots o' transistors and high power consumption 2 or 3 years down the road. That also means that compiler complexity goes up dramatically. So you'll wind up having a crippled architecture and low quality compilers 10 years down the road. That's stupid. Additionally, if the compiler is entirely responsible for the optimization, you lose the nifty on-the-fly code tuning based on actual runtime data -- this is the coolest thing about the Crusoe.
    --Shoeboy

  163. Re:What I'd really like to hear about... by ChrisDolan · · Score: 2

    I also wonder whether it can multitask between different instruction sets. I guess the task switching overhead would be pretty brutal if there isn't room onchip for multiple instruction sets.

    My understanding from the articles I have read is that maybe, eventually, but right now it only emulates x86.

  164. branch prediction by TheDullBlade · · Score: 2

    My whole point was that branch prediction can be replaced by explicit pre-branch notification.

    Branch prediction now is very stupid. Circuits try to guess, in real time, which branch will be taken. Imagine instead that the C compiler explained to the branch "predictor" that "this will loop 27 times, then stop looping".

    Furthermore, explicit cache requests could be compiled. "I'll stay in this function for a while, but I'm also going to call these functions."

    With profile-based optimizations and careful design you might never have a cache miss or a branch misprediction.
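
    As a rough illustration of the difference, the sketch below counts mispredictions on the loop-closing branch of a 27-iteration loop, first with a textbook two-bit saturating counter and then with a predictor that has simply been told the trip count. Both are toy models; the function names and the starting counter state are assumptions made for this example, not a description of any real chip.

```python
def loop_branch_outcomes(trip_count):
    """Outcomes of the backward branch: taken on every iteration except the last."""
    return [True] * (trip_count - 1) + [False]


def two_bit_mispredictions(outcomes, state=1):
    """Textbook 2-bit saturating counter: states 0-1 predict not taken, 2-3 predict taken."""
    misses = 0
    for taken in outcomes:
        predicted_taken = state >= 2
        if predicted_taken != taken:
            misses += 1
        state = min(state + 1, 3) if taken else max(state - 1, 0)
    return misses


def hinted_mispredictions(outcomes, trip_count):
    """Predictor that was told the trip count up front (the 'pre-branch notification')."""
    misses = 0
    for i, taken in enumerate(outcomes):
        predicted_taken = (i % trip_count) < trip_count - 1
        if predicted_taken != taken:
            misses += 1
    return misses


one_run = loop_branch_outcomes(27)
many_runs = one_run * 100  # the same loop entered 100 times in a row

guessing_once = two_bit_mispredictions(one_run)
guessing_many = two_bit_mispredictions(many_runs)
told_many = hinted_mispredictions(many_runs, 27)
```

    In this toy model the guessing predictor keeps paying for the loop exit every time the loop is entered, while the hinted one never misses. The hint only helps when the trip count really is known ahead of time, which is exactly the case being argued about.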

    I've gotta get me one of these, and play around with alternative opcode sets. This is just the coolest toy for exploring computer architecture.

    --
    /.
  165. Re:What I'd really like to hear about... by delld · · Score: 2

    One would not write in the native VLIW - one would create a new instruction set that hid the VLIW, but used its best features, and interacted with the hardware better - i.e., saved the optimizations and the branch predictions for the next time the program is run. One need not write in VLIW to get rid of the x86 instruction set. (I wonder if one could design an instruction set to run one's favorite operating system (Linux, *BSD...).)

  166. Don't write to the VLIW, but... by Shotgun · · Score: 2

    why not save the cache to permanent storage? The processor optimizes the code and then saves the optimized code to disk as a "shadow" executable. The next time the program is loaded the OS would indicate that it has already been optimized and pass the shadow to the processor, which could bypass the translator. The translator could attach a signature to the shadow, and if it didn't agree it would reload the program and translate from scratch. In this way, you would get permanently optimized code for all your programs while retaining the flexibility of the current design.

    Of course, one problem with this would be getting support for shadow programs built into the OS. I wonder if Transmeta has anyone that could handle this?
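
    A minimal sketch of the shadow-executable idea follows, assuming the "signature" is just a hash of the original x86 bytes. The `ShadowStore` class, the in-memory dictionary standing in for disk, and the placeholder `translate` function are all hypothetical; nothing here reflects how the Code Morphing software actually stores translations.

```python
import hashlib


def signature(x86_bytes):
    """Fingerprint of the original program; a changed binary gets a new signature."""
    return hashlib.sha256(x86_bytes).hexdigest()


def translate(x86_bytes):
    """Placeholder for the expensive translate-and-optimize step."""
    return b"VLIW:" + x86_bytes[::-1]


class ShadowStore:
    def __init__(self):
        self.disk = {}          # program name -> (signature, translated code)
        self.translations = 0   # how many times the slow path was taken

    def load(self, name, x86_bytes):
        sig = signature(x86_bytes)
        entry = self.disk.get(name)
        if entry is not None and entry[0] == sig:
            return entry[1]     # shadow is valid: bypass the translator
        # Missing or stale shadow: translate from scratch and save the result.
        self.translations += 1
        translated = translate(x86_bytes)
        self.disk[name] = (sig, translated)
        return translated


store = ShadowStore()
first = store.load("app", b"\x90\x90\xc3")
second = store.load("app", b"\x90\x90\xc3")      # same binary: reuses the shadow
patched = store.load("app", b"\x90\xcc\xc3")     # binary changed: retranslated
```

    Note that this only covers whole-program translation keyed on the binary. It does not capture translations that depend on runtime behaviour, which is one reason a saved shadow might be less useful than it first appears.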

    --
    Aah, change is good. -- Rafiki
    Yeah, but it ain't easy. -- Simba
  167. Re:You aren't SOPOSED to code in it's native set by ~k.lee · · Score: 2

    The "code morphing" layer is what makes Crusoe stand apart from the rest. It optimizes, on the fly, the instruction set it's running. This means that your apps will run faster and faster as they run. This layer is what gives the Crusoe its speed.

    The only way "code morphing" could run faster than native code is by exploiting runtime information to perform optimizations that are not possible at compile time. In other words, self-modifying code that runs faster than static code.

    This is plausible, but that doesn't mean there would be no performance benefit in compiling native code. Research on self-modifying code is not unique to Crusoe---it's a very active area of research, and there are two major kinds: JIT and dynamic compilation. JIT, which you're probably all familiar with from Java, involves translating code (typically from a foreign instruction set) and performing optimizations at runtime; dynamic compilation involves "staging" code at compile time to modify itself in a disciplined manner at runtime. JITs and dynamic compilation are very different in the nature of optimizations they perform; one of the major differences is that because dynamic compilation performs its analysis at compile-time, it can theoretically perform much deeper and more sophisticated optimizations.

    Crusoe does no staging (it can't: it executes fully precompiled code), so its optimizations operate under severe time constraints. Therefore, Crusoe's code morphing is likely to produce code optimality akin to that emitted by a JIT compilation system: shallower analysis, shallower optimizations. Which almost certainly makes Crusoe's "code morphing" worse than native staged dynamic compilation would be.

    In summary: my point is that self-modifying native code that improves its performance at runtime is entirely possible without "code morphing". On the other hand, binary x86 compatibility is arguably Crusoe's major selling point, so there's not much impetus for them to bother encouraging any kind of native code compilation. Anyway, I get the impression that Crusoe's entire architecture would have to be revamped if they wanted to run native code so it's a moot point.

    If you're thoroughly confused by now, try visiting the dynamic compilation project at the University of Washington for more information on dynamic compilation.

    ~k.lee

    (BTW: this does not mean that Crusoe does not embody any technical innovations. In particular, the hardware support the chip provides for its runtime code translation is very interesting.)

    --
    (remove nospam for email)
  168. You really, really, still don't get it... by Simon+Brooke · · Score: 2
    Efficiency isn't exactly exciting. Unless I am using a Palm Pilot, I really don't care if my PentiumIII or Alpha is sucking 34W and my Nvidia GeForce is sucking another 30. What I care about is how fast my performance is. How many transactions can I run? How many frames per second am I getting? How many polygons can I push?

    You really, really still don't get it, do you? Firstly, Crusoe is the first chip Transmeta has got out the door. It's the simplest possible silicon, with the hard bits done in software. But there's no hard line between what functions can be done in hardware and what can be done in software. It's just that software is cheaper to tune.

    When Transmeta have got code-morphing tuned the way they like it there is nothing to stop them releasing a new chip with the code-morphing engine in hardware.

    But even if they don't, the limitation on performance computing design is cooling, as Cray amply showed. Crusoe consumes 1/32 the power of your PIII; so, for a given cooling system, you can stick 32 Crusoes in the same box. If each Crusoe gives you 66% of the compute power of the PIII, you've got a box which is going to deliver you more than 21 times the number of polygons your PIII can push.

    One thing I haven't yet seen quoted is the part-price for a Crusoe, but if the silicon is as simple as people are suggesting the part-price could be very low - small dies have relatively lower reject rates because if you have one flaw per square inch, every inch square chip has a flaw whereas only one in ten 0.3 inch chips does.
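
    The two back-of-the-envelope calculations above can be written out. The yield part uses the standard Poisson yield model (fraction of flawless dies = exp(-area x defect density)), which is a model swapped in here, not the one the post uses; the 1/32 power ratio and the 66% relative performance are the figures quoted above, and perfect scaling across chips is an optimistic assumption.

```python
import math


def poisson_yield(area_sq_in, defects_per_sq_in):
    """Fraction of dies with zero flaws under a Poisson defect model."""
    return math.exp(-area_sq_in * defects_per_sq_in)


def aggregate_throughput(chips, relative_speed):
    """Total compute relative to one reference CPU, assuming perfect scaling."""
    return chips * relative_speed


defect_density = 1.0                    # one flaw per square inch, as in the post
big_die_area = 1.0 * 1.0                # an inch-square chip
small_die_area = 0.3 * 0.3              # a 0.3-inch-square chip

big_die_yield = poisson_yield(big_die_area, defect_density)
small_die_yield = poisson_yield(small_die_area, defect_density)

# 32 Crusoes in the cooling budget of one PIII, each at 66% of its speed.
box_speedup = aggregate_throughput(32, 0.66)
```

    Under the Poisson model the large die is not always flawed, but the gap is still wide: a minority of inch-square dies come out clean versus roughly nine in ten of the small ones. The 21x figure stands only if the workload parallelizes perfectly across 32 chips, which real rendering or transaction loads rarely do.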

    By contrast your PIII is inherently an expensive part - it isn't expensive because Intel are profiteering, it's actually expensive to make. If Transmeta start shipping Crusoes at (say) around $10 per part in quantity, there isn't any way Intel can compete anywhere along the line.

    I currently run two PII/300s in my desktop box. I bought them because two 300MHz parts and a motherboard to accommodate them were, at the time I bought them, a lot cheaper than one 500MHz part. If I can get, say, 8 400MHz Crusoes for the price of one 700 MHz Intel part, I will be quite happy to run them, and so I expect will a lot of other people.

    Assuming, of course, that Linux 2.4 will run 8-way parallel on Crusoes, but I'm kind of prepared to bet it will :-)

    --
    I'm old enough to remember when discussions on Slashdot were well informed.
  169. Re:You aren't SOPOSED to code in it's native set by kaphka · · Score: 2

    I'm sorry, I don't get it. Maybe I'm just dense. Why do all this "morphing" and optimizing at runtime, instead of at compile time? Binary compatibility with existing processors is a nice feature, and I'm sure it will help Crusoe get a foothold in the market, but why can't we at least have the option of bypassing the emulation when native software becomes available? (Or does the Crusoe already allow this? The reports haven't been clear on that.)

    --

    MSK

  170. Was Quake3 running with a hardware accelerator? by rogerbo · · Score: 2

    I originally posted this in a previous crusoe article but no one commented on whether it's actually feasible or not. Any big brain VLIW gurus want to tell me if what I suspect might actually be true?

    The quake3 performance we saw on the ZDTV webcast was pretty damn impressive. Everyone seems to be assuming that they had 3d accelerators in those TM5400 laptops.

    You can run quake 3 in software mode under mesa at about 3 frames per second.

    But this is transmeta we're talking about and that was Dave Taylor, the SAME dave taylor that once leaked a document onto usenet ranting about the inferiority of hardware graphics accelerators and that what he really wanted was a generic parallel processing chip that could do arbitrary transforms.

    GEE, a lot like the crusoe chip can do?

    (anyone got the link to that usenet posting on deja that dave taylor tried to cancel?)

    Isn't it feasible that they have put hooks into their code morphing software that optimises specially for 3d transforms and mesa/opengl?

    Especially in the linux version? Where they have all the source code to linux and mesa?

    Hmm, what fancy optimisations could those clever brains come up with?

    Maybe those transmeta laptops WON'T need 3d accelerator chips?

    And it would completely defeat the purpose of a low power laptop to put a big, hot, power-sucking 3d chip in it. So I'm assuming that demo of quake3 they showed WAS running in software mode with some pretty fancy dynamic optimisations going on.

    Maybe the reason they didn't make a big deal about this is that it's still a "work in progress" as Linus said about mobile linux so they don't want to hype it yet.

    Someone prove me wrong?

  171. Hint on Crusoe Webpad from 1-3-00 by Shook · · Score: 2
    The Transmeta webcast reminded me of something I read in U.S. News & World Report a few weeks ago. It was in an article about IBM's Mark Dean.

    Quote:
    Early in the next century, Dean hopes his new concoction, which he says is "in the idea and invention stage," will be ready for the public: a sleek tablet that is magazine-size, inexpensive, programmable, and voice-activated. He expects his unnamed dream pad, which will run on a 24-hour battery, to provide everything a PC does, including streaming audio and video, word processing, and spreadsheets. It will even have a port for old fogies who can't give up their keyboards. And it will wirelessly put the Internet and other information at your fingertips.
    End Quote.

    Of course the article never mentions Transmeta, but I bet this web pad would be powered by Crusoe. Here's the link for the article.

  172. The next step? by G27+Radio · · Score: 2

    The current instruction sets of most processors are probably designed based on certain price:performance ratios taking the cost of producing them as hardware as a major consideration. Transmeta could come up with their own virtual instruction set that would be optimized for their chips. It would be an easy move for the software developers since their old code could still run on the processor anyway until they recompile to the virtual instruction set. I didn't read the whole Ars article because it's past my bedtime (I'll read it tomorrow at work.) But the author made a comment about framerates "(yet)" -- I didn't see what he was alluding to by the "(yet)" but I got the impression he expects Transmeta to compete beyond the mobile arena.

    Another thought I've had is that things just got harder for a company like Intel. It was no easy task for AMD to get big enough where they could afford to be competitive with Intel. But Crusoe-type processors sound like they would be much easier to design and produce...new companies will have a much lower barrier for entry into the competition. Lucky for Transmeta that they have their patents ;)

    numb

  173. This is so cool... by CaptainCarrot · · Score: 2
    Imagine one of these things loaded with two or three different code morphing modules. Your boot loader begins by asking which architecture you'd like to emulate. Want to run your games? Boot up as a x86 with Windows. Doing graphics design (or running one of Ambrosia's cool games, which they refuse to port to Wintel)? Boot as a PPC with MacOS. Doing some TT&C on your satellite constellation? Zap, you're an Alpha!

    OK, I'm just an applications geek, and know next to nothing about hardware, so this probably sounds pretty stupid. Live with it.

    --
    And the brethren went away edified.
  174. Re:Slightly Off Topic by mOdQuArK! · · Score: 2

    I _know_ what you're saying, I _read_ the Transmeta whitepaper & have a pretty good idea of the concepts behind the Code Morpher, I _know_ how the Transmeta people _want_ the chip to be used, and how a lot of people think it _should_ be used - just as I _know_ that there are going to be some people who will ignore all that & will hack on the VLIW instruction set directly. 99.9% of the people programming for the Transmeta chips won't - but there will be a few that will.

    They won't give a damn about backward compatibility, or what the "next" chip is going to implement - they're not programming for money, they're programming for fun, and they'll program using the VLIW instruction set because they'll think they can do it better than the Code Morpher can (for a particular chip, and a particular set of instructions). When they start playing with a new chip, they'll learn the VLIW instruction set for THAT chip and do it all over again.

    BTW, regarding some of the replies:

    1. "Transmeta's chips transcend backwards compatibility."

    Bull.

    Transmeta have to create versions of the Code Morpher to be "backwards compatible" with all of the various instruction sets that they choose to support from the other chip companies, plus any "improvements" to the instruction set that those chip companies make. They will have to create a Code Morpher version to run on each new chip that they develop. (Can you say, front-end/back-end?)

    If they did a good job architecturally, and make it easy to upgrade the Code Morpher (assumedly in FlashROM or something similar), then given the current processor-types, it shouldn't be too difficult for them to create new front-ends and back-ends.

    As time goes on, like any project, the Code Morpher code base will get more complicated & difficult to maintain. They'll make mistakes encoding the instruction sets, and then have to issue updates to correct it, etc.

    2. "Code executed through the translation layer should perform better than code executing on the bare metal because the translation software is learning and optimizing."

    By definition, a "perfect programmer" will always be able to do AT LEAST AS WELL as an optimizing compiler (even at run-time!), because he or she can USE THE SAME TRICKS as the optimizing compiler (write code which collects metrics & recreates itself based on those metrics). And because the programmer has application knowledge which the compiler doesn't, he or she will most likely be able to DO BETTER.

    Like I said before: for the most part, programmers will use what Transmeta gives them - and for a very small fraction of programmers, in the tiny bits of their code where they want to squeeze out everything they can from the hardware, they're going to try to bang on the metal.

    Based on the strong reaction to my reply, I'd say that at least a few people have been programming for a living so long, they've forgotten how much fun it is to "push the envelope" of any given piece of hardware.

  175. Re:Slightly Off Topic by mOdQuArK! · · Score: 2

    I'm sorry, YOU aren't getting it.

    No matter how good the Code Morpher is, a talented programmer programming "to the bare metal" will be able to do better. A geek screaming for performance on their "baby" doesn't give a damn about whether the next processor will change its instruction set - he (or she) is interested in getting the max. performance out of the CURRENT processor - which DOESN'T mean you let somebody else's software get in the way.

    As far as on-the-fly code tuning is concerned, no matter how good the "tuner" is, it can only react to changes & build code AFTER it has accumulated some metrics, whereas a programmer who is intimately familiar with his or her problem-space, can prebuild tuned code for handling most of their expected cases.

    I fully expect dedicated hackers to do what every programming freak does - use the provided tools most of the time, and where they want total control & performance, to write the VLIW directly (no matter WHAT the people who made the chip say).

    Frankly, ignoring all the hype, this is just a RISCier RISC chip - what the original RISC folks were aiming for in the first place, but which has fallen by the wayside as they tried to compete with Intel.

  176. Re:Slightly Off Topic by _blueboy · · Score: 2

    There are several reasons why Transmeta doesn't want people coding for the native instruction sets. First of all, coding for a native instruction set will just give us the same problem as we have with x86 now -- too many applications to change the architecture, so crappy architecture ends up hanging around way longer than it should. Second, they stated that the instruction sets for the two chips are incompatible, so obviously there is no single "Transmeta Instruction Set". Third, they like the code morphing because it allows them to make fixes that can be downloaded. If people are coding apps to run natively, this can't be done.

    But......
    I have been thinking about this too and I'm wondering if it would be possible/logical to define some VLIW Instruction Set that could be used on all Transmeta chips, but would be faster and more efficient than translating x86. The CMS would still be translating from the "Transmeta Instruction Set" to the chip's native instruction set, so they could keep all the benefits as before.

    Whadyall think?

    --
    pdubroy AT yahoo DOT com
  177. Re: code morphing *has* been seen before by addp4 · · Score: 2

    In the article there's this paragraph:

    Now, let me just stop and say that a number of folks, in their effort to show that they've "seen it all before" and can't be taken in by the hype, have tried to compare Code Morphing to Alpha's FX!32 or to an emulation program like SoftWindows. Such comparisons are like comparing a MinuteMan missile to a bottle rocket. In this case, you should feel free to believe the hype; Code Morphing is cool.

    I'd have to say that code morphing has been around. One only needs to look at Executor from ARDI. It dynamically recompiles 68k code into x86 code using an instruction generator. I think ARDI has a whitepaper on this on their site.

    Besides that, there's not that much difference between FX!32 and code morphing from the software perspective, except for the fact that Crusoe has more hardware support for fixups (via the shadow register file and the gated store buffer), FX!32 runs offline instead of dynamically, and the threshold for code generation is much higher (FX!32 translates based on profile info; Crusoe probably only translates when there are enough blocks to make the translation overhead worthwhile).

    In addition, there *has* been work doing dynamic recompilation. That's essentially what a JIT is. Or you can look at a paper in the 1998 ASPLOS proceedings describing such a system (called Shogun, I think); unfortunately, the target arch didn't have all of Crusoe's aforementioned hardware hooks, so the performance isn't quite as high. Even VMware has done this stuff before; well, VMware started off as simOS, which did have a dynamic translation mode as well as an interpreted mode. It's just that no one has integrated the translator and added the hardware hooks to make it as efficient.
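
    The "threshold for code generation" mentioned above can be sketched as an interpret-until-hot loop with a translation cache. This is a toy model under stated assumptions: the class name, the threshold of three executions, and the two-operation "instruction set" are invented for illustration and are not taken from Transmeta, FX!32, or Executor.

```python
class HotBlockTranslator:
    """Interpret a block until it has run `hot_threshold` times, then translate
    it once and serve later executions from the translation cache."""

    def __init__(self, hot_threshold=3):
        self.hot_threshold = hot_threshold
        self.exec_counts = {}
        self.translation_cache = {}
        self.log = []

    def run_block(self, address, block, acc):
        if address in self.translation_cache:
            self.log.append("cached")
            return self.translation_cache[address](acc)
        count = self.exec_counts.get(address, 0) + 1
        self.exec_counts[address] = count
        if count >= self.hot_threshold:
            self.log.append("translated")
            self.translation_cache[address] = self._translate(block)
            return self.translation_cache[address](acc)
        self.log.append("interpreted")
        return self._interpret(block, acc)

    @staticmethod
    def _interpret(block, acc):
        # Slow path: decode and execute one guest operation at a time.
        for op, n in block:
            if op == "add":
                acc += n
            elif op == "mul":
                acc *= n
            else:
                raise ValueError(op)
        return acc

    @staticmethod
    def _translate(block):
        # "Optimizing" path: fold the whole block into a single a*acc + b step.
        a, b = 1, 0
        for op, n in block:
            if op == "add":
                b += n
            elif op == "mul":
                a *= n
                b *= n
            else:
                raise ValueError(op)
        return lambda acc: a * acc + b


morpher = HotBlockTranslator(hot_threshold=3)
block = [("add", 2), ("mul", 3), ("add", 1)]
results = [morpher.run_block(0x1000, block, 5) for _ in range(4)]
```

    Raising the threshold trades startup speed for fewer wasted translations of code that only runs once or twice, which is roughly the knob that separates an offline, profile-driven translator from a dynamic one.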

    --
    .oOo.
  178. The customary question... by Anonymous Coward · · Score: 3

    can we run a Beowulf cluster with it? =-)

    Seriously though. The biggest problems with Beowulfs are space and heat, and imagine low-heat low-space processors wedged in there. Makes me horny.

    From the mind of the most famous poster in all of slashdot

  179. Re:What I'd really like to hear about... by tzanger · · Score: 3

    ...is how much faster this thing will run if it's not emulating an x86.

    That is missing the point, IMHO. One of the reasons the chip kicks ass is because they can change the hardware and you can't tell. Write native VLIW on this pig and you're fucked if they change, just like all the other processors.

    ... this is coming from a guy who prefers assembly to high-level languages in 98% of cases. I think they really struck on something here, don't fuck it up by asking to write in the "native tongue" of this beast. Well, unless you're writing your own processor. :-)

  180. Slightly Off Topic by Accipiter · · Score: 3
    I just came up with a thought...

    Okay. The Crusoe is fully x86 compatible. Great. But how about developing applications for this processor that skip the translation step, and are already written in the processor's native language? Think about a Distributed.net client written SPECIFICALLY for this processor, with no x86 instructions.....

    I'm betting that would speed up apps tremendously. Even Linux....ported directly to Crusoe's native instruction set. The problem I see is, the processor is designed to run x86 out of the box. Code would have to be written to change the Flash ROMs on the processor to bypass translation and hit the core directly, or at least do a straight-through delivery. (Why translate VLIW to VLIW?)

    (IF YOU DO THIS AND FRY YOUR CRUSOE, I'M NOT LIABLE.)

    --
    -- Give him Head? Be a Beacon?
    (If you can't figure out how to E-Mail me, Don't. :P)

  181. Beowulf by Shoeboy · · Score: 3

    Who cares about Transmeta Beowulfs? With the low transistor count and low temp, this chip could do the same SMP-on-a-chip thing that IBM is planning for the PPC. The only reason to have Beowulf at all is that it's more economical than SMP systems; it's not a better solution than massive SMP IF massive SMP can be made cheaply. Of course, some organizations will have a need for Beowulf clusters of massively SMP systems...
    ...damn it, now I'm horny.
    --Shoeboy

  182. Some Question about Crusoe by scheme · · Score: 3

    I have some concerns about the performance that the Crusoe processors will actually have. The article mentions that translated instructions will be cached and then reused if the CodeMorph software sees them again. However, it seems like the CodeMorph software's state information will not be maintained between runs. If you power off the computer, the software loses the cached information and has to start from scratch again. In addition, the cache's size and location aren't given. Is it a small cache on die, or is it located in system memory? The cache is probably on die for speed reasons, but this would limit the size of the cache. This could be a performance hit, since the cache is also used as a data cache and instruction cache.

    Another question concerns the way the instructions are being cached. For example suppose the following instructions were given

    ADD AX, BX
    SUB CX, AX
    JNZ Cx

    Would the translation for each instruction be cached, or is the sequence cached? The article implies that the sequence is cached since the CodeMorph software can optimize the speed on subsequent passes. However, this seems to limit the benefit gained from caching to relatively tight loops or common sequences of code depending on the cache size.
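
    The trade-off in the question above can be sketched in Python as a translation cache that stores whole sequences, keyed by the address of their first instruction, with a fixed capacity. This is only a sketch under assumptions: the least-recently-used eviction policy, the TranslationCache name, and the "native:" tagging are invented here for illustration and are not taken from the article.

```python
from collections import OrderedDict


class TranslationCache:
    """Toy fixed-size cache of translated blocks, evicting the least recently used."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.blocks = OrderedDict()  # entry address -> translated sequence
        self.misses = 0              # how many times a block had to be translated

    def lookup(self, entry_addr, x86_block):
        if entry_addr in self.blocks:
            # Hit: mark the block as recently used and reuse the translation.
            self.blocks.move_to_end(entry_addr)
            return self.blocks[entry_addr]
        # Miss: "translate" the whole sequence as one unit (placeholder step).
        self.misses += 1
        translated = tuple("native:" + insn for insn in x86_block)
        self.blocks[entry_addr] = translated
        if len(self.blocks) > self.capacity:
            # Cache is full: drop the block that has gone unused the longest.
            self.blocks.popitem(last=False)
        return translated


if __name__ == "__main__":
    cache = TranslationCache(capacity=2)
    loop = ["ADD AX, BX", "SUB CX, AX"]
    cache.lookup(0x100, loop)
    cache.lookup(0x100, loop)  # second pass through the loop is a hit
    print(cache.misses)
```

    With a small capacity, only code that comes around again before it is evicted benefits, which is the tight-loop concern raised above.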

    On a side note, the article implies that the CodeMorph software is light-years beyond anything else. However, some of its highly touted features have appeared in other software before. For example, DEC's FX!32 would initially just translate code, but it would also observe the application's behaviour and then optimize the code based on that after the application finished executing. It could do this optimization several times, optimizing more aggressively on each pass. Apple's 680x0 emulator was also based in ROM and would start up first so that the MacOS could boot. The CodeMorph software has some new features if it really does out-of-order (OO) scheduling and optimization on the fly, but that seems like a pretty big hit on performance.

    If future server/desktop oriented processors implement large parts of the CodeMorph software in hardware, how will they be any different from AMD's or Intel's processors, since they'll all be implementing a hardware instruction translation unit, apart from the Transmeta core being VLIW? Plus, the transistor count and power consumption will skyrocket along with that.

    --
    "When you sit with a nice girl for two hours, it seems like two minutes. When you sit on a hot stove for two minutes, it
  183. Crusoe-VLIW native code by SMN · · Score: 3

    Transmeta does NOT want us programming directly in Crusoe VLIW-native code. In fact, the opcodes will NOT be the same on the 3400/5400 chips, and will probably change for all future chips (each model/variation would need its own code morphing software).

    The primary reason is that they don't want to have to make these chips backwards compatible. Intel has a lot of problems with this: even the newest Pentium IIIs must support programs written for 386s. Intel has a hard time because it can't change these opcodes, but instead has to add new ones, hence MMX, SIMD instructions, the Katmai extensions (the P3 stuff), etc. (and similarly, AMD has added 3DNow! et al).

    Transmeta wants the freedom to be able to drastically change newer models of the CPU to keep it running at optimal speed/efficiency. If they wanted to allow us to write Crusoe-native code, then they'd need morphing software that allows newer models to morph old code to its own (modified) native code. In other words, a real pain in the rear and definitely a problem if Crusoe can't run different "morphers" simultaneously (which I suspect it can't).
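
    The compatibility argument above can be shown with a toy Python model: the same x86 program is morphed into different native encodings on two chips, while native code written for one chip fails on the other. The chip names and opcode numbers are entirely made up for illustration; the real Crusoe encodings are not public.

```python
# Hypothetical opcode tables: the names and numbers are invented for illustration.
NATIVE_OPCODES = {
    "chip_a": {"ADD": 0x01, "SUB": 0x02},
    "chip_b": {"ADD": 0x10, "SUB": 0x11},
}


def morph(x86_program, chip):
    """Translate x86 mnemonics into the given chip's (made-up) native opcodes."""
    table = NATIVE_OPCODES[chip]
    return [table[mnemonic] for mnemonic in x86_program]


def run_native(native_code, chip):
    """Decode native opcodes on a chip; an unknown opcode raises ValueError."""
    decode = {code: name for name, code in NATIVE_OPCODES[chip].items()}
    try:
        return [decode[code] for code in native_code]
    except KeyError as exc:
        raise ValueError(f"opcode {exc.args[0]:#x} is not valid on {chip}") from None


if __name__ == "__main__":
    program = ["ADD", "SUB"]
    # The x86 program is unchanged; each chip's morpher produces its own encoding.
    print(morph(program, "chip_a"), morph(program, "chip_b"))
    # Native code written for chip_a breaks on chip_b.
    try:
        run_native(morph(program, "chip_a"), "chip_b")
    except ValueError as err:
        print(err)
```

    In this model only morph() has to change per chip, which mirrors the point that each model would ship with its own code morphing software.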

    As for other morphing software to emulate other processors: I wouldn't be surprised if they allowed it to emulate some other chips, like the PPC, so it can run MacOS stuff, but it won't run nearly as well as x86 emulation will. The chip is meant to be able to morph code from many different platforms, but there are a lot of shortcuts to emphasize x86. I think that topic is addressed in the Ars Technica stuff, but basically Crusoe uses an FPU very similar to the x86 one. I think there are some other things for that in hardware, as well as the fact that we know they're dedicating most of their time to creating the x86 morphing software so it will be the most optimized.

    I highly doubt that we'll be able to write our own morphers. I think that it's an extremely difficult thing to do, it would require knowledge of the Crusoe instruction set (which, as I said above, they don't want to release), and the morphing software is probably authenticated somehow. Since the morphing code is running in Flash ROM, it can be upgraded, but if someone tried to load a morpher that doesn't work they're gonna have trouble reverting back to x86.

    Linus said that "Mobile Linux" is NOT a code fork - it's just the x86 version with a few modifications to make it run better on embedded platforms. Why reinvent the wheel?

    Keep in mind that this is all SPECULATION - if anyone here has other information to the contrary, I'd like to hear it =)

    --
    -- Imagine how much more advanced our technology would be if we had eight fingers per hand.
  184. What I'd really like to hear about... by TheDullBlade · · Score: 4

    ...is how much faster this thing will run if it's not emulating an x86. It looks pretty hot under the hood, especially if, instead of relying on standard guess-aheads, you can tell it which branch to use as the default, or even tell it about branches ahead of time (which you often know well before the actual conditional looping operation), so it's not guessing at all.

    There's all kinds of fun I could have with this chip...

    I also wonder whether it can multitask between different instruction sets. I guess the task switching overhead would be pretty brutal if there isn't room onchip for multiple instruction sets.

    --
    /.
  185. Transmeta not impressive by rcromwell2 · · Score: 4

    They have essentially built a Japanese Compact Car that is fuel efficient, and not an Italian sports car.

    Efficiency isn't exactly exciting. Unless I am using a Palm Pilot, I really don't care if my PentiumIII or Alpha is sucking 34W and my Nvidia GeForce is sucking another 30. What I care about is how fast my performance is. How many transactions can I run? How many frames per second am I getting? How many polygons can I push?

    Crusoe may be important for the coming ubiquitous computing revolution (if it ever happens), but they are not the first to go after low power (remember Rise? Remember IDT's WinChip? Don't forget StrongARM).

    I think Crusoe is a nice chip, but the *HYPE* (and I mean hype) caused by deliberate secrecy and press leaks thoroughly destroyed any chance of it being seen as revolutionary in my eyes.


    The Code Morphing technology is not revolutionary. Emulators have been doing dynamic instruction set recompilation for years now. DEC did it with FX!32, Sun does it with Java JITs (including HotSpot, which does recompilation based on runtime profiles), Smalltalk VMs have been doing it, and hell, even one of the Commodore 64 emulators does it, if I recall. John Carmack's Quake3 engine even does it. I'm sure there are hundreds of projects in academia that have been doing it. The only relevant difference is the hardware assist that the Crusoe has.

    Chances are, when you hype something too much, it's going to be disappointing. There's a thread on Usenet that claims Transmeta's *ORIGINAL* goal was not low power but the best performance, and that when they couldn't attain it, they "fell back" to a low power selling point. I think it's in comp.arch.

  186. You aren't SUPPOSED to code in its native set by HomerJ · · Score: 5

    That's the whole point of Crusoe: you DON'T code for it directly. It takes other instructions, starting with x86, runs them faster and better, and optimizes on the fly.

    The "code morphing" layer is what makes Crusoe stand apart from the rest. It optimizes the instruction stream it's running on the fly. This means that your apps will run faster and faster as they run. This layer is what gives the Crusoe its speed. Coding natively would be SLOWER than using the morphing layer. You also wouldn't get the benefit of the optimization.

    Also, the instruction sets are different for each chip. Each set is further optimized for what its use is going to be. So if you code for one Crusoe chip natively, it doesn't run on the other. This lets Transmeta change the instruction set as needed. If it's faster to do something one way, they can change it and not break compatibility with anything. And they can give you the update with a software patch.

    So, it doesn't matter if people don't have the instruction set for the native Crusoe processors. It will change a lot, and every time it changes you would have to recode every program again. Why bother? Also, you wouldn't get to use what the Crusoe processor is all about: its code morphing layer.

    So, PLEASE, stop complaining that you can't code natively for this chip. The code won't go any faster, and as soon as Transmeta changes the set, your programs wouldn't run anyway. So it's a moot point to code natively for it.