Slashdot Mirror


Dynamic Cross-Processor Binary Translation

GFD writes: "EETimes has a story about software that dynamically translates the binary of a program targeted for one processor (say x86) to another (say MIPS). Like Transmeta they have incorporated optimization routines and claim that they have improved execution times between one RISC architecture and another by 25%. This may break the hammer lock that established architectures have on the market and open the door for a renaissance in computer architecture."

27 of 179 comments (clear)

  1. what's new with this? by um...+Lucas · · Score: 3

    We've had processor and machine emulators and processor independance for so long now...

    SoftPC, Soft Windows, Virtual PC, XF86, Virtual Playstation, Java, WINE, Wabi, MAME, and so many others...

    Why should this one be the news?

    While Java was basically the only one that's tried to dislodge x86, they've all shown that while it's feasible to run another architecture's binaries ontop of a CPU, it's not the preferred way of doing things.

    YAE (yet another emulator)

    And big deal if it only translates a program from one binary arch. to another... Without an equivalent OS, the calls have nothing to be translated into...

    And i could lead into the slashdot mantra of if all programs were opensource, we wouldn't need somethng as sloppy as an emulator anyhow...

    Or am i missing something about the significance?

  2. Re:Yup by jilles · · Score: 3

    Well cell phones are no longer the limited machines they used to be. They have quite a lot of processing power, can be equiped with several MB of memory. Once you have that, coding C is a waste of time and time is the difference between profit or loss in the mobile market. A two month delay can literally make the difference. A company like Nokia produces dozens of different mobile phone types each year. That's why they love cross platform and couldn't care less that they would have to spend a few dollars more on the hardware. Besides, Moore's law also applies to the mobile market. Mobile hardware is doubling in speed just as fast as desktop and server hardware. Current mobile architectures such as applied in pda's, mobile phones etc. have plenty of horsepower and most of these architectures already have JVMs running on top of them.

    Java programs are crossplatform. In the mobile market this means that once you have a JVM ported to your phone, you can run a rapidly growing number of programs without any change. That cuts back development time dramatically. C doesn't give you the same advantages because you have to recompile, test and debug before you can expect even the most portable C code to run without a hitch.


    --

    Jilles
  3. Re:HP Dynamo project by WNight · · Score: 3

    Exactly.

    This is what that company that was mentioned a few months back was doing...

    We all know that if you zip a file again, it gets smaller again, but it takes exponentially more time.

    Couple that with the sort of exponential speed increase you get with repeated recompilation, and you get almost zero-sized files in a fixed ammount of time.

    The exponential increase I speak of, is that if you get a 22% each time, you've got 48% increase after the second pass, and 81% after the third. It just keeps getting better.

    The drawback with this is that because each recompilation of the program is a different binary (or it wouldn't be faster) it takes a new memory block. This means that the ram requirements approach infinity as well. Kinda nasty.

    But, the patented part of this was that the company was going to use a Ram Doubler(tm?) type technology to compress the program in RAM, as well as the file. This then gets nearly infinite compression in a little over twice the time taken for single compression (there's some overhead) and about three to five times the RAM (there's more overhead in storage) required for just a standard 1-pass 30% compression algorithm.

    The neat thing is it doesn't require quantum computing or anything, it's all off-the-shelf stuff, just linked in a neat way.

    This'll revolutionize the market when they release it... we think MP3s are small! An 80GB HD will offer nearly endless storage.

  4. Re:Sounds like an emulator by TheTomcat · · Score: 3

    if everyone wrote assembler, and didn't ever depend on anyone else's libraries (including OS, and BIOS), this would work.

    But this won't work for the same reasons that DOS software won't run natively on Linux. There's too much dependance on general-use code (like OS based interupts (21h, f'rinstance)). (not that that's a bad thing, just in this circumstance, it makes straight-up translation impossible).

  5. Mac 68K (CISC) to PPC (RISC) dynamic recompiler by kriegsman · · Score: 3

    Every PCI PowerMac has a 68K (CISC) to PPC (RISC) dynamic recompilation emulator in it that it uses for executing 68K code. And MHz for MHz, the execution speed of the 68K code when dynamically recompiled as PPC code, is roughly comparable (plus or minus 50%?) to the speed of the original 68K code on a 68K processor.

    The very first PowerMacs (NuBus based) used instruction-by-instruction emulation to run all the old 68K Mac code, including some parts of the OS that were still 68K.

    The second generation PowerMacs (PCI based) included a new 68K emulator that did "dynamic recompilation" of chunks of code from 68K to PowerPC, and then executed the PPC code; this resulted in significantly faster overall system performance.

    Connectix later sold a dynamic recompilation emulator ("Speed Doubler") for Nubus PowerMacs, that did, in fact, double the speed of those machines for many operations, mainly because so much of the OS and ROM on the first-gen PowerMacs was still 68K code.

    I think that dynamic recompilation has a bright future; x86 may eventually be just another "virtual machine" language that gets dynamically recompiled to something faster/more compatible/etc at the last moment.

    -Mark

  6. Re:All Roads Lead to Open Source by El · · Score: 3
    As opposed to BASIC, which is an extremely stupid solution to the same problem?

    I don't see how the problem that I'd like to send you dynamic content via email without requiring you to be running the same CPU as I am is caused by closed standards. On the contrary, it seems to be an inevitable side effect of competition in the processor market. Yes, it is an obvious solution: given that I can do on-demand translation to the Java Virtual Machine, how much harder is it to do on-demand translation to the instruction set of a real CPU?

    --

    "Freedom means freedom for everybody" -- Dick Cheney

  7. All Roads Lead to Open Source by zpengo · · Score: 3

    It's funny how things are heading these days. Java, .NET, and dynamically-translating processors are all "brilliant solutions" to a problem that was caused by closed standards in the first place.

    --


    Got Rhinos?
    1. Re:All Roads Lead to Open Source by egomaniac · · Score: 4

      Yes, and look how well that's worked for the Unix camp.

      I'm not dissing open source -- I'm just pointing out the realistic view that it doesn't instantly solve all your problems. I realize that just about everybody on Slashdot will freak out about this, but I actually don't like Linux. I don't use it. I think it's just another Unix, and much as I dislike Windows I don't have to spend nearly as much time struggling with it just to (for example) upgrade my video card. (Cue collective gasps from audience). Unix has its place, and I think that place is firmly in the server room at this point in time.

      (Disclaimer: This is not intended to be a troll. Please don't interpret "I don't like Linux" as "I think Windows is better than Linux", because I don't like Windows either. I think they're both half-assed solutions to a really difficult problem, and I think we can do better. What I mean is more along the lines of "If you think the open source community has already created the Holy Grail of operating systems, you've got to get your heads out of your asses and join the real world")

      So, if your thinking is that Linux should be the only platform because it represents the One True Way -- I answer that by saying that you sound an awful lot like a particular group in Redmond, WA that also thinks their platform is the One True Way.

      This industry cannot exist without competition, open *or* closed. Saying that these problems exist because your platform is not the only one in existence is incredibly childish.

      --
      ZFS: because love is never having to say fsck
    2. Re:All Roads Lead to Open Source by egomaniac · · Score: 5

      I realize that Slashdotters love to trumpet the Open Source horn, but this comment is absurd. "Open Source" != "runs on all platforms".

      The amount of work necessary to get a complicated X app running on many different flavors or Unix is certainly non-trivial, and that's just *one* family of operating systems. And it either requires distributing umpteen different binaries or requiring endusers to actually compile the whole damned program. All well and good for people whose lives are Unix, but do you *seriously* expect Joe Computer User to have to compile all his applications just to use a computer? ("how hard is it to type 'gmake all'?" I hear from the audience... as if you'd expect your grandma to do it, and you've *never once* had that result in forty-six different errors that you had to fix by modifying the makefile. make is not the answer)

      The problem of having a program run on multiple platforms is not "caused by closed standards in the first place" as you state. It is caused simply by having multiple standards -- closed or open makes no difference. SomeRandomOpenSourceOS (TM) running on SomeRandomOpenSourceProcessor (TM) would have just as much trouble running Unix programs as Windows does. This is a great solution to a real problem; don't knock it just because you have a hardon for Linux.

      --
      ZFS: because love is never having to say fsck
  8. Re:Solutions in search of a problem by Rimbo · · Score: 3

    You're right in the case of the desktop and applications world. However, in the embedded world, such as cellphones and 802.11, this is VERY useful. The problem of multiple proprietary platforms is the current bane of the telecom industry, which this company is clearly targeting.

  9. A crazy thought... by CraigoFL · · Score: 3
    From the article:
    "Translating CISC to RISC is bit like pushing uphill, but we can get close to parity in performance assuming the same clock speed," he said. "That's because we work the 90:10 rule on the fly. The software spends 90 percent of its time in 10 percent of the lines of code. That means for RISC-to-RISC and CISC-to-CISC translations, we are able to make improvements. We have seen accelerations of code of 25 percent."
    I wonder if they could run the optimizer without the translation layer (or make a ChipX-to-ChipX dummy translation), and squeak some extra performance out of code on any platform?
  10. Re:Show ME the demo! by Spy+Hunter · · Score: 3
    GCC might know a lot about processor architecture but it can't know about what tasks you will be asking the compiled application to do, so it can't optimize for that.

    Hmmmm, I just had a crazy idea. What if you could compile your GCC application in a special way, then run it under simulated normal working conditions and have it log performance data on itself, just the kind of data that these run-time optimizers gather. Then, you could feed GCC this collected data along with your application's source and recompile it and GCC would be able to turbo-optimize your app for actual usage conditions! If it can be done on-the-fly at run-time, it can be done even better at compile time with practically unlimited processor time to think about it.

    Even if the end-user used the application in a nonstandard way it might still provide a performance benefit because there are lots of things that a program does the same way even when it is used in a different way.

    Would this be feasible? Would it provide a tangible perfomance benefit? (like HP's Dynamo?) Comments please!

    --
    main(c,r){for(r=32;r;) printf(++c>31?c=!r--,"\n":c<r?" ":~c&r?" `":" #");}
  11. Re:No, Open Source is the solution by Anonymous Coward · · Score: 4

    Yeah, that's why star office runs so well on SGI's. It's been open-sourced for almost a year now, and still hasn't been compiled successfully on mips.

  12. Re:Dynamic Recompilation by landley · · Score: 4

    MetroWerks here in Austin did the emulation layer for Apple's M68K->power switchover. They did a really clever thing of identifying long "runs" of code that nobody ever jumped into the middle of, then they treated them as one big instruction with a lot of side effects, and optimized them as a block. (Not one instruction at a time, but the whole mess into the most optimal set of new platform instructions they could.)

    It was quite clever. It's also quite patented, and has been since before the Power PC came out. (And in a sane world those patents would have expired by now, but with patent lengths going the way of copyright...)

    Eventually, when the patents expire, this sort of dynamic translation will be one big science with Java JITs, code morphing, and emulation all subsets of the same technology. And somebody will do a GPL implementation with pluggable front and back ends, and there will be much rejoicing.

    And transmeta will STILL be better than iTanium because sucking VLIW instructions in from main memory across the memory bus (your real bottleneck) is just stupid. CISC has smaller instructions (since you can increment a register in 8 bits rather than 64), and you expand them INSIDE the chip where you clock multiply the sucker by a factor of twelve already, and you give it a big cache, and you live happily ever after. Intel's de-optimizing for what the real bottleneck is, and THAT is why iTanic isn't going to ship in our lifetimes.

    Rob

  13. Transitive Software by philj · · Score: 4


    Here's the homepage for the company - Transitive Software
    (Apologies for the Karma whoring)

  14. these guys are full of it by dutky · · Score: 4
    I'm not saying that their product doesn't work (though I seriously doubt that they can get an improvement in speed from anything other than their hand-picked benchmarks) but that they are probably just trying to spin an established (but under-reported) technology in order to attract venture capital.

    There is no rennaissance in computing that will be ushered in by this product. We have already seen it's like with DEC's FX32 (intel to Alpha) and Apple's synthetic68k (M68k to PowerPC) as well as a number of predecessors (wasn't there something like this on one or another set of IBM mainframes) and current open source and commercial products (Plex86, VMware, Bochs, SoftPC, VirtualPC, VirtualPlaystation, etc.), all of which use some amount of dynamic binary translation, and none have set the world on fire. They are mildly usefull for some purposes, but the cost of actual hardware is low enough to kill their usefullness in most applications.

    I wish these guys luck, but I doubt anyone will be too enthusiastic about this product. They might have stood a chance if they'd pitched this thing a year or two earlier (when there was lots of dumb money looking to be spent) but they are probably toast today.

  15. This was being worked on a few years ago. by saurik · · Score: 4

    This was being worked on a few years ago by some people at The University of Queensland. Unfortunately, they got tired of the project (and, if I remember correctly, that they weren't getting much popular support).

    Their website is at :
    http://www.csee.uq.edu.au/~csmweb/uqbt.html

    "UQBT - A Resourceable and Retargetable Binary Translator"

    To note, they mention that they got some funding from Sun for a few years. (Likely either causing or due to their work on writing a gcc compiler back-end that emits Java byte-codes.)

  16. Re:Why bother? by El · · Score: 4
    Why bother? Suppose I come up with a neat program on my SparcStation, and I want to email to all my friends to show it off. Now, maybe recompiling from source isn't a problem for you and your small circle of friends, but truth be told, some of my friends and relatives (gasp!) DON'T EVEN HAVE A COMPILER INSTALLED ON THEIR COMPUTER!!! My only choice is to send them an executable. Again, maybe you have such a small circle of friends that you can keep track of what kind of computer each of them is running. But quite frankly, some of my relatives, when asked "do you have an x86, PowerPC, 68000, or Sparc chip in that there puppy" can only respond with "huh?!?"

    Your arguments regarding optimization also apply to distributing files as Java byte code, but the simple fact is, for most applications, nobody gives a damn about optimization anymore anyway! Let's see, even if your favorite text editor were 100, or even 1000 times slower, would you be able to type faster than it can buffer input? I don't think so! For the few cases in which cycles are that critical, shouldn't the code be written in hand-optimized assembly and made available in system libraries anyway?

    Your argument that straight binary translation is useless, and that you also need to re-create the entire run time environment is a good point. This, however, is an argument in favor of using Java (or som equivalent), and is an argument AGAINST distributing everything as source. Have you tried lately to write a non-trivial application where the same source compiles on both Linux and Windows lately? (It can be done, but it is EVEN LESS FUN THAN HERDING CATS!) Fact is, this whole discussion is fairly pointless because run-time environment compatibility is both much more important and much harder to acheive than mechanical tranlation of one machine's opcodes to another machine's opcodes.

    --

    "Freedom means freedom for everybody" -- Dick Cheney

  17. Processor emulation, big deal by poot_rootbeer · · Score: 4

    Let's say my source architecture uses interrupt-based I/O. My target uses memory-mapped. Will this translator be able to handle that?

    To be honest, translating one CPU's version of 'CMP R1, R2' to another's doesn't sound like it will user in a renaissance of anything.

    -Poot

  18. Reliable? Optimal? Supported? by BlowCat · · Score: 4
    If we are talking about open source:
    How many people would want to run a "translated" web server? Database? Scientific appliction? How reliable can it be? Why not recompile it natively?

    If we are talking about closed source:
    The same questions except the last one plus lack of technical support for non-native architectures at least by some vendors (e.g. Apple).

  19. Crazy Like a Fox... by The+Monster · · Score: 4
    I wonder if they could run the optimizer without the translation layer (or make a ChipX-to-ChipX dummy translation), and squeak some extra performance out of code on any platform?
    I had exactly the same thought. The article says [as always emphasis mine]
    The Dynamite architecture is based around a translation kernel, with a front end that takes code aimed at a source processor and a back end that aims the translation at a new target. The front end acts as an instruction decoder, building an abstract, intermediate representation of the subject program in the form of what Transitive calls "directed acyclic graphs." The kernel can then perform abstract, machine-independent optimizations on this representation.
    So, x86 => DAG => x86 should work just fine. In fact, x86 => DAG => x86 => DAG => x86 should produce exactly the same code on the second iteration; I wouldn't be surprised if Transitive is doing exactly this to test whether the optimizer were working correctly. At this point, Dynamite sounds conspicuously like Dynanmo.

    I wonder if the specs for DAG will be open so that code can be compiled directly to it, optimized, and then distributed, saving the first two steps in the process. I can see commercial software vendors being all over this idea.

    --

    [100% ISO 646 Compliant]
    SVM, ERGO MONSTRO.

  20. Dynamic Recompilation by Anonymous Coward · · Score: 5

    This sounds an awful lot like the dynamic recompilation of MIPS to x86 done in many emulators (such as UltraHLE, Nemu, Daedalus and PJ64).

    I've been working on the dynarec for Daedalus for about 2 years now, and currently a 500MHz PIII is just about fast enough to emulate a 90MHz R4300 (part of this speed is attributable to scanning the ROM for parts of the OS and emulating these functions at a higher-level). Of course, optimisations are always being made.

    After reading the article, I'd be very interested to see if they can consistently achieve the 25% or so speedups that they claim (even between RISC architectures).

    For those interested, the source for Daedalus is released under the GPL.

  21. Re:HP Dynamo project by woggo · · Score: 5
    whoops! that's supposed to say
    Dynamo dynamically optimizes binaries; an equivalent in the Java world is IBM's Jalapeno VM. Unfortunately, the Dynamo approach is only feasible on the HP architecture, and it is only feasible on HP-PA because the PA-RISC chip has an absurdly large i-cache (greater than 1 mb). Look at HP's Dynamo site for more information, but IIRC the problem is dynamo's extremely aggressive branch prediction.

    damn slashcode...

  22. Re:-yawn- by lostguy · · Score: 5
    ahem. Ignorance does not equal proof.

    To quote:

    The performance results of Dynamo were startling. For example, Dynamo 1.0 could take a native PA-8000 SpecInt95 benchmark binary, created by the production HP PA-8000 C compiler using various optimization levels, and sometimes speed it up more than 20% over the original binary running standalone on the PA-8000 machine.

    That's binary translation from/to the same machine.

    This is basically run-time instruction block reorganization and optimization, which can definitely improve a given binary on a given machine, over compile-time optimizations. Admittedly, a native binary, run through this kind of profile-based optimizer, will probably be faster than a translated-then-optimized binary, but neither you or I can state that with any authority.
  23. Humorous context... by Durinia · · Score: 5
    ...the subject program in the form of what Transitive calls "directed acyclic graphs."

    Wow! What innovative technology! I wonder when they will patent this so-called "directed acyclic graphs". And they picked such a cool name! It sounds so mathematical!

    Okay, enough laughing at the expense of clueless reporters...

  24. HP Dynamo project by dmoen · · Score: 5
    This sort of technology has been around a long time. HP's Dynamo project has been running since 1995. When Dynamo is run on an HP PA-RISC and is used to emulate HP PA-RISC instructions, speedups of up to 20% are seen. That's pretty astonishing: you would think that emulating a processor on that processor would be slower, not faster.

    Doug Moen.

    --
    I have written a truly remarkable program which this sig is too small to contain.
  25. Solutions in search of a problem by Chairboy · · Score: 5

    While this is fascinating sounding technology, it sounds more like a solution in search of a problem. There are already software solutions for emulation (SoftPC, VMWare, etc). There are already cross platform language solutions (Java, etc) and so on. Despite this, the market for massively cross platform applications has not really developed. It isn't as if a 25% performance increase is whats holding back the 'rennaissance' the author speaks of.