Dynamic Cross-Processor Binary Translation
GFD writes: "EETimes has a story about software that dynamically translates the binary of a program targeted for one processor (say x86) to another (say MIPS). Like Transmeta they have incorporated optimization routines and claim that they have improved execution times between one RISC architecture and another by 25%. This may break the hammer lock that established architectures have on the market and open the door for a renaissance in computer architecture."
Frankly, I must disagree with you -- Debian *does* have mechanisms for automated source (as well as binary) distribution, complete with dependency handling and the like.
Indeed, apt-get has wowed a great many of my (formerly) Windows-using friends. No installers needed! No library conflicts! etc...
If by "Unix" having a "great, working system" you mean all major unices having such a system, you're quite right that they don't. However, there are some excellent solutions in place.
(I'll grant you that this is *not* true for everything else the poster mentioned -- Java,
There is more information content in the original code for an optimizer to make use of than there is in a binary (or assembly). If this were not the case, would not optimizers run *after* the assembly translation is done? In fact, all reasonable compilers run the vast majority of their optimizations *before* the translation occurs, and only a few small peephole optimizations are done on translated or nearly-translated code. The unfortunate (for them) facts are that:
The verdict: don't fall for this. Even if it works, and even if it has no effect on performance in the common case, there's no benefit. The only useful things that can come of this are the magic peephole optimizations they might be using, which should go into general-purpose compilers.
"directed acyclic graph", abbreviated "DAG", is a generic term describing a data structure. You might also call it a "tree". That their internal data is stored in such a structure is almost implied by the nature of the software. That is how compilers generally store their internal data.
That's what this is: a compiler. Its input is a machine language instead of a high level language. This is interesting, but not necessarily all that useful. It solves a piece of the problem, but not the hardest piece.
The ability to take an existing piece of code and run a static optimizer on it might be interesting, but I suspect that such a device would exercise enough previously undiscovered bugs in the targeted software to make its use as anything but a testing & debugging tool rather impractical.
The idea you suggest resurfaces every few years. A while back it was called "thin binaries"; in the late nineties it was called "Java". In any case, it never takes off quite as well as everyone seems to think it should, simply because the processor is only one part of the machine, and not the hardest one to emulate.
-Mars
Well... yes. But I didn't want to spend half an hour discussing the issue. How would *you* have explained it?
-Mars
It used dynamic recompilation of the sort mentioned here, and from what I've heard, was at a pretty acceptable speed. It also did run-time optimization, or as Transmeta would put it, code morphing.
I believe there was also an FX!32 compatibility layer for Digital Unix and later Linux, although support was slightly more sketchy. If I remember correctly, this was around the time that Digital made it possible to use libraries compiled for Digital Unix under a Linux environment.
Anyone else have more to say about FX!32? I'd be interested in more info.
Dynamo dynamically optimizes binaries; an equivalent in the Java world is IBM's Jalapeno VM. Unfortunately, the Dynamo approach is only feasible on the HP architecture, because the PA-RISC chip has an absurdly large i-cache and is extremely aggressive in branch prediction.
Nobody made NT on Alpha software
Well, Microsoft had ported their entire BackOffice suite (Exchange, SQL, etc) over to Alpha. There were also versions of IIS, VisualBasic (for DCOM), Netscape Enterprise, Oracle, Lotus Domino, and so on.
So, people made the (server) software, just that nobody bought it.
(My theory was that NT networks generally scale out across multiple boxes rather than up to larger boxes, meaning that few NT shops needed more than a 4-way Intel box, which was the only point where the Alphas started to get price competitive.)
We seriously considered Alpha/NT servers at one job I worked at back in 1995-6. It met our software checklist, but the DEC sales engineers couldn't even get their damn server to boot on two successive visits. Then the Pentium Pro shipped. Game Over.
--
Business. Numbers. Money. People. Computer World.
I think bochs is great as it allows Intel binaries to run on all sorts of other platforms. You just need a super fast PC to get some performance out of it.
I don't want a lot, I just want it all!
Flame away, I have a hose!
Only 'flamers' flame!
The comment that this is not suitable for
hand-optimized loops in DSPs plainly
means that this is an emulator.
What would be cool instead is if someone
made a binary cross-compiler so it would go
through your hard disk's binaries and convert
them from, say, x86 to PPC so that you could
take your hard drive, take it from an x86
system, put it on a Mac and have Windows
boot natively (modulo ROM issues on Macs).
All without access to Windows source code.
For certain classes of problems, compilers do a hell of a lot better job than people do. I don't think most people are that great at optimizing assembly code by doing things like properly sharing registers. In general, when it comes to optimizing memory access, compilers will beat 99% of all programmers out there.
But for picking a better algorithm, no, compilers suck. But I don't know of any active research in having a compiler change the algorithm implemented by the programmer, so you're using a straw man here. Then again, it's been 4 years since I've done any compiler research...
-jon
Remember Amalek.
Users primarily want functionality. Besides, I doubt the use of Java will have a large impact on it. Most energy in e.g. mobile phones is used when actually using the connection.
As for your questions, I've seen an mpg player written in Java, I believe the Java compiler is a Java program, and I have seen a few nice games in Java, although it isn't Quake of course. I have just spent my afternoon hacking away in NetBeans, a great Java IDE and of course written in Java.
I just think you should revise your opinion regarding bloatedness. The very least you could consider is wondering why the heck all these mobile hardware guys are deploying Java despite your argument. Presumably they know what they are doing and maybe your arguments are not valid?
Jilles
How about footprint? From the article it sounds like this emulation will require a couple of MB.
I'm in no way an embedded expert, but I was of the impression that RAM is expensive in the embedded and small form-factor world.
I can see that this technology might be a time-to-market saver if you have a load of assembler written for one embedded CPU and want to move it quickly to a new platform.
Hmm, how about the interface with support and I/O chips? This thing is, from what I understand, only a cpu emulator/translator. If you change the platform you will probably have to write new drivers for the other chips.
If J.K.R wrote Windows: Puteulanus fenestra mortalis!
I wonder if the specs for DAG will be open so that code can be compiled directly to it, optimized, and then distributed, saving the first two steps in the process. I can see commercial software vendors being all over this idea.
Target CPU neutral binary formats have been around for a while. OSF has ANDF (Architecture-Neutral Distribution Format). Also check SDE (Semantic Dictionary Encoding).
A hardware neutral distribution format is not the complete solution, though. The target platform that you want to run the executable on has to provide the software environment and APIs that the executable needs.
So, it is only suitable for distributing CPU-neutral but OS/environment-dependent user-space applications. What it really does is to save you the job of recompiling an application for PPC/x86/SPARC/whatever-Linux. This would certainly make life easier for non-x86 Linux users, but it is not a general solution for making applications platform-independent.
If you want a run-anywhere solution you also need to define a runtime environment, which is exactly what Java does.
If J.K.R wrote Windows: Puteulanus fenestra mortalis!
Isn't that _exactly_ what FX32 did?
AFAIK it dynamically translated binary
code for Pentium to Alpha processors
with runtime optimization.
Yep, apart from it did it for x386 and up --> Alpha.
I think it was released in or before 1997.
Before - 1996 or 1995, IIRC.
Simon
Coming soon - pyrogyra
Why not recompile it natively?
Ah, that is the question.
Some believe that dynamic optimization can do things that static optimization can't do. For instance, you can straighten code that used to have a lot of branches in it. You can't do that statically because you don't know which branches will be taken most often. You could do every possible straightening and include them all with the binary, but that would probably be prohibitively large. You could profile the code and use that to direct a recompilation, but then that's nothing more than really-slow dynamic compilation.
So, once dynamic optimization technology has advanced, it may outperform statically-optimized code even in the same architecture.
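The branch straightening described above can be sketched in a few lines. This is only a toy illustration, not how any particular dynamic optimizer works: the control-flow graph `CFG`, the block names, and the helpers `record_profile` and `hot_trace` are all made up for the example. The idea is simply to count which way each branch goes at run time and then lay out the most frequently taken path as one straight run.

```python
from collections import Counter

# Hypothetical toy program: each block lists its possible successors.
# A block with two successors ends in a conditional branch.
CFG = {
    "entry": ["check"],
    "check": ["fast", "slow"],
    "fast": ["done"],
    "slow": ["done"],
    "done": [],
}

def record_profile(runs):
    """Count how often each (block, successor) edge is taken at run time."""
    edges = Counter()
    for path in runs:
        for src, dst in zip(path, path[1:]):
            edges[(src, dst)] += 1
    return edges

def hot_trace(cfg, edges, start):
    """Follow the most frequently taken successor out of each block."""
    trace, block = [start], start
    while cfg[block]:
        block = max(cfg[block], key=lambda nxt: edges[(block, nxt)])
        if block in trace:  # stop if the hot path loops back on itself
            break
        trace.append(block)
    return trace

# Nine executions take the "fast" side of the branch, one takes "slow".
runs = [["entry", "check", "fast", "done"]] * 9 + [["entry", "check", "slow", "done"]]
profile = record_profile(runs)
print(hot_trace(CFG, profile, "entry"))
```

A static compiler sees both sides of `check` as equally likely unless it is given a profile; the run-time counts are what make the straight-line layout possible.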
Now, what happens if you dynamically optimize the dynamic optimizer...
--
Patrick Doyle
I mod down every jackass who puts his moderation policy in his sig. Oh, wait a sec....
The solution is source distribution.
Compilers know more about the program than translators do, and they also allow linking to native libraries. Can a translator do that?
Yet another proprietary solution to fix another problem caused by proprietary solutions.
There is nothing so silly as other peoples traditions, and nothing so sacred as our own.
Mike Van Emmerik is still working on the project, as are 4 students. The project leader Cristina Cifuentes is currently doing research at Sun Labs on commercial extensions of this work. There will be an open source project at the end of the year, apparently; the code has already been released under a BSD-style license but it is not publicly available as yet. The funding from Sun was a gift to Dr Cifuentes simply because they liked what she was doing. I was just a happy employee when I wrote that broken backend.
How we know is more important than what we know.
Uhh... Are you a troll, or are you just really stupid? Have you actually TRIED gzipping a file more than once?
Original Tar file: 30720 bytes
Gzipped tar file: 6895 bytes
Gzipped gzipped tar file: 6923 bytes
Basically, you get ONE chance to properly discover all of the redundant data. After that, it's pretty much an uphill battle.
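Anyone who wants to repeat the experiment can do it with Python's standard `gzip` module. This is a minimal sketch: the input is a made-up stand-in for the tar file (30720 bytes of pseudo-random words from a small vocabulary), so the exact sizes will differ from the numbers above, but the second pass should again fail to gain anything worth having.

```python
import gzip
import random

# Hypothetical stand-in for the tar file: 30720 bytes of repetitive text.
rng = random.Random(0)
words = [b"translate", b"binary", b"optimize", b"cache",
         b"branch", b"alpha", b"mips", b"x86"]
data = b" ".join(rng.choice(words) for _ in range(6000))[:30720]

once = gzip.compress(data)    # first pass finds the redundancy
twice = gzip.compress(once)   # second pass sees near-random bytes

print("original:      ", len(data))
print("gzipped:       ", len(once))
print("gzipped twice: ", len(twice))
```

The output of the first pass is already close to incompressible, so the second pass mostly just adds another header.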
I'm not sure if EETimes is oversimplifying, or if Transitive Technologies is filling heads with BS.
"...Translation, sometimes called software emulation..."
Translation != emulation
"...Crusoe specifically takes X86 code...In contrast, Transitive's...[fluffy adjectives]...can, in theory, be tailored for many processor pairs.."
Crusoe isn't X86 specific, and it can be tailored for many processor pairs in reality, not just in theory.
"...We have seen accelerations of code of 25 percent..." doesn't mean that everything runs 25% faster. I don't even hear Transitive Technologies saying that it does.
I wonder how many more companies will come up with new and innovative techniques like this now that Transmeta has become very noticeable? I wonder how long before the cash-strapped Transmeta starts filing patent infringement suits? (Please Linus, make them play fair!)
I completely agree about the dynamic content bit -- and I don't think the right answer is for us to have to mail around tarballs of C source and makefiles.
And while you're correct in spirit, in that converting from Java byte codes to native code is fundamentally the same problem as converting from native Platform A code to native Platform B code, the latter is orders of magnitude more difficult. Java was designed for just such a conversion, and so is about as simple as you can get.
Real Programs, on the other hand, are unbelievably complex to the point that simply deciding "Is this byte code or data (or both)?" is literally unsolvable in the general case. For comparison, the distinction between code and data is obvious in Java. Plus, for "real" emulation you have to emulate all the weird I/O ports and other basic hardware, none of which is trivial.
Comparing a Java VM to serious hardware emulation because they are both emulators is like comparing "Hello World" to Macbeth because they are both English text -- technically correct, but not a really meaningful comparison.
ZFS: because love is never having to say fsck
Now if they could figure out a way to deal with endianness, and the other 99% of the platform specific stuff in most code, it might be worth something...
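For anyone who hasn't been bitten by it, the endianness problem looks like this. A minimal sketch using Python's standard `struct` module; the value `0x12345678` is just an arbitrary example. Translating the instructions does nothing for data that was laid out in memory or on disk in the other byte order.

```python
import struct

value = 0x12345678
little = struct.pack("<I", value)  # least significant byte first, as on x86
big = struct.pack(">I", value)     # most significant byte first

print(little.hex(), big.hex())

# Reading little-endian bytes as if they were big-endian scrambles the value.
misread = struct.unpack(">I", little)[0]
print(hex(misread))
```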
http://www.masturbateforpeace.com/
VMWare isn't an emulator.
And it isn't already?
If you're thinking in terms of desktop systems and software written in high-level languages, you're right. But the target market of this company is the embedded systems world, where the code is typically hand-optimized assembly and even custom-made instruction sets for systems that are built from heterogeneous proprietary systems. Some proprietary chips are better than others, and often you don't know which is the best solution until you've already implemented the whole thing.
For the telecom industry, this solution, if it works, is a very good one.
If you say otherwise, you're ignoring history. RISC processors rock for most applications. Look at Transmeta: a 700 MHz Crusoe can act as, worst case, a 300 MHz Pentium III, using a lot fewer transistors and a lot less power. If MP actually worked, you could get such an advantage based on silicon space of performance/power.
Opinions?
At the beginning was at.
The more deeply the optimiser is run the BIGGER the percentage speedup Dynamo gives. The reason is that the speedups are different speedups than the ones found by the compiler, and the less time spent in the rest of the code, the more significant the speedups are.
e.g. the compiler can't optimise into a DLL, but Dynamo can
e.g. Dynamo can profile the code and optimise virtual function calls in ways that the compiler can't
-WolfWithoutAClause
"Gravity is only a theory, not a fact!"
"Our claim is that we can run 1:1 or [even] better than native speeds"
Bullshit.
Wake me when these guys go out of business. Been here, seen this. The x86 emulator guys made the same claims for their Mac-based emulators, almost word for word. (I won't even get into Transmeta's claims that have turned out to be similar bullshit).
This is just a special case of an optimizing compiler, which Java run-time optimizers also fall into.
These claims, as well as the claims for the "magic compiler" that can produce code better than humans, will never happen until we have real human-level AI that can "understand" the purpose of code. You can only get so far with narrow-vision algorithmic optimization, as proven by the failure of 40 years of research. (Failure, only as defined as producing code as good as a human can).
--
Sometimes it's best to just let stupid people be stupid.
this is neither an evolution nor a revolution. it is a quick fix for the legacy problems of "modern" computers.
but still...
a good idea. for example, the article states that a CISC to RISC translation would still be inefficient, but how much so? would a 1.4ghz athlon be equivalent to a 500mhz PPC or would it be better? could this allow a much more useful form of emulation, as in i cant afford a g5 MAC for macOSX so ill just use my athlon?
also, with the claim of possible speed improvements across RISC to RISC translation this may light a bit of a fire under the arses of some of the big players (intel, IBM) to build a new architecture with these optimizations in hardware.
this could be used as a tool for competition with transmeta with some good hardware backing it up. a CPU could be made as a base and the translation hardware could be pre-programmed to emulate multiple platforms. people would no longer have to worry about which architecture their WinCE apps are compiled for because their chip would run MIPS or ARM at nativelike speeds.
Look, you're all missing the point here. What I was saying is that using the term DAG is an incredibly poor way of explaining the technology to a layperson.
Speculating on the basis of what I know of optimization in compiler theory, the data structures in question probably consist of a root node for the start of the program, with mutually independent paths of execution as branches on the graph. They join up eventually, but if you showed the general topology of the graph to a non-specialist, they'll go "Oh, that's a tree."
Now, at this point, you have two choices. You can say, "Well, actually, it's not a tree per se but a type of data structure known as a 'directed acyclic graph', or DAG for short, because...", with the result that you'll lose your audience completely, or you can say, "Yeah, it does, doesn't it", and get on with your explanation of code translation.
Unfortunately, when you're explaining a difficult concept to a non-specialist audience, you make a few sacrifices in accuracy in order to try and convey some sense of understanding. The trick is determining how much distortion of the truth through economy is acceptable. Science educators make this kind of compromise all the time, particularly in the physical sciences.
I think worrying about the subtle (to a layperson) differences between trees and DAGs is unnecessary and an impediment to explaining the concept (code translation) in question.
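For readers who do want the distinction, it fits in a few lines. This is a sketch with made-up names (`Node`, `parent_counts`, `is_tree`), not anything taken from Transitive's software: in a tree every node has at most one parent, while in a DAG a node may be shared, which is what lets a compiler represent a common subexpression once.

```python
class Node:
    """An operation in an expression graph; operands are child nodes."""
    def __init__(self, op, *operands):
        self.op = op
        self.operands = operands

def parent_counts(root):
    """Count, for every reachable node, how many edges point at it."""
    counts = {}
    seen = set()
    stack = [root]
    while stack:
        node = stack.pop()
        if id(node) in seen:
            continue
        seen.add(id(node))
        for child in node.operands:
            counts[id(child)] = counts.get(id(child), 0) + 1
            stack.append(child)
    return counts

def is_tree(root):
    """A rooted DAG is a tree only if no node has more than one parent."""
    return all(n <= 1 for n in parent_counts(root).values())

# (a + b) * (a + b): the tree duplicates the sum, the DAG shares it.
as_tree = Node("*", Node("+", Node("a"), Node("b")),
                    Node("+", Node("a"), Node("b")))
shared = Node("+", Node("a"), Node("b"))
as_dag = Node("*", shared, shared)

print(is_tree(as_tree), is_tree(as_dag))
```

Drawn on paper the two look nearly the same, which is the point being made above about explaining it to a layperson.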
And in response to the AC who started this thread, I got a GPA of 6 in Data Structures.
No, the issue here is: why bother using all the syllables in 'directed acyclic graph' when just 'tree' (with maybe the qualifier 'dependency' in front) will do nicely, thank you. Nice, ordinary language that the common man has a slightly better chance of understanding, particularly when it comes with a quick explanation of the dependency issues involved (e.g. preventing pipeline stalls by putting an instruction that reads a register as far downstream from the last instruction that wrote the register as possible).
I fully support /. posters making fun of people mindlessly using jargon to impress clueless journalists. Death to jargon as a means of maintaining the technocratic order. Or something.
In most Australian universities, 7 is the highest possible GPA.
Just table based translation with few ifs engrained
in the code is not good justification for hype,
however some companies survive just that way.
Anyhow, they managed to emulate other chips in
hardware. That's like a carb in a car that can work
on the same fuels, alas the fittings are not same,
so they cannot be integrated into an engine
environment, with some heavy modifications.
Being able to run code from a Pentium on
your chips that just modifies registers and address
ranges is an interesting challenge, but it's just
that.
Drivers written for 'common' environments
surrounding chips would not work on new platforms,
and if they will that will mean that new platform
is just an old one with new processor, that
to externals is just like plain pentium chip.
Feat like VMWare is more admirable, thanks to
those CISC commands that allow for multiputer
based technologies.
New statements like Apple made way ago, and so
as Sun did with their hardware are more forward
thinking than that mere table lookup embedded in
hardware.
Remember some companies survive on hype, hyping
old or new technology. Transmeta has firmly placed
itself in that market share, so it will be tough
for this company in near term.
is this.....is this for REAL?
great comedy company.
We aren't tied to the ancient x86 architecture because of legacy, we're tied to it because of manufacture contracts with x86's owner -- Intel. It's the industry, not software which is holding things up for cross-platform capabilities. Bring in a company with Intel's power and reach and we might get somewhere, but another emulator or translator isn't going to make a difference.
"I'll just chip in a bit for RedHat: I actually have that installed on my university machine." - Linus, '95
I think that I shall never see
A program lovely as a directed acyclic graph
IBM was doing this exact same thing with DAISY, although the scope seems a bit narrower: http://www.research.ibm.com/daisy/ It's very interesting that we're just now talking about this stuff. It may get to a point where PC architectures will be able to do something similar to what an AS/400 does....the application is insulated from the hardware completely, and when transported to a new architecture, it automatically translates to run on the new architecture, fully able to exploit the abilities of that architecture.
I believe that some of you guys are missing the point. The quote was:
The author of the article makes it sound as though Transitive has invented DAGs. That's what is funny. Durinia is not a weenie making fun of their technology, and it's not necessarily an attempt on Transitive's part to dazzle.
"The only rights you have are the rights you are willing to fight for."
Ars Technica did an article on this topic a year ago. Check this link for the article.
We've had processor and machine emulators and processor independence for so long now...
SoftPC, Soft Windows, Virtual PC, XF86, Virtual Playstation, Java, WINE, Wabi, MAME, and so many others...
Why should this one be the news?
While Java was basically the only one that's tried to dislodge x86, they've all shown that while it's feasible to run another architecture's binaries on top of a CPU, it's not the preferred way of doing things.
YAE (yet another emulator)
And big deal if it only translates a program from one binary arch. to another... Without an equivalent OS, the calls have nothing to be translated into...
And I could lead into the slashdot mantra of if all programs were open source, we wouldn't need something as sloppy as an emulator anyhow...
Or am i missing something about the significance?
Well, cell phones are no longer the limited machines they used to be. They have quite a lot of processing power and can be equipped with several MB of memory. Once you have that, coding C is a waste of time, and time is the difference between profit and loss in the mobile market. A two month delay can literally make the difference. A company like Nokia produces dozens of different mobile phone types each year. That's why they love cross platform and couldn't care less that they would have to spend a few dollars more on the hardware. Besides, Moore's law also applies to the mobile market. Mobile hardware is doubling in speed just as fast as desktop and server hardware. Current mobile architectures such as those used in PDAs, mobile phones etc. have plenty of horsepower and most of these architectures already have JVMs running on top of them.
Java programs are crossplatform. In the mobile market this means that once you have a JVM ported to your phone, you can run a rapidly growing number of programs without any change. That cuts back development time dramatically. C doesn't give you the same advantages because you have to recompile, test and debug before you can expect even the most portable C code to run without a hitch.
Jilles
Exactly.
This is what that company that was mentioned a few months back was doing...
We all know that if you zip a file again, it gets smaller again, but it takes exponentially more time.
Couple that with the sort of exponential speed increase you get with repeated recompilation, and you get almost zero-sized files in a fixed amount of time.
The exponential increase I speak of, is that if you get a 22% each time, you've got 48% increase after the second pass, and 81% after the third. It just keeps getting better.
The drawback with this is that because each recompilation of the program is a different binary (or it wouldn't be faster) it takes a new memory block. This means that the ram requirements approach infinity as well. Kinda nasty.
But, the patented part of this was that the company was going to use a Ram Doubler(tm?) type technology to compress the program in RAM, as well as the file. This then gets nearly infinite compression in a little over twice the time taken for single compression (there's some overhead) and about three to five times the RAM (there's more overhead in storage) required for just a standard 1-pass 30% compression algorithm.
The neat thing is it doesn't require quantum computing or anything, it's all off-the-shelf stuff, just linked in a neat way.
This'll revolutionize the market when they release it... we think MP3s are small! An 80GB HD will offer nearly endless storage.
Every PCI PowerMac has a 68K (CISC) to PPC (RISC) dynamic recompilation emulator in it that it uses for executing 68K code. And MHz for MHz, the execution speed of the 68K code when dynamically recompiled as PPC code, is roughly comparable (plus or minus 50%?) to the speed of the original 68K code on a 68K processor.
The very first PowerMacs (NuBus based) used instruction-by-instruction emulation to run all the old 68K Mac code, including some parts of the OS that were still 68K.
The second generation PowerMacs (PCI based) included a new 68K emulator that did "dynamic recompilation" of chunks of code from 68K to PowerPC, and then executed the PPC code; this resulted in significantly faster overall system performance.
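The difference between the two generations can be shown with a toy. This is only a sketch under loud assumptions: the three-instruction "guest" instruction set, `PROGRAM`, `interpret`, and `translate` are all invented for illustration and have nothing to do with the real 68K emulators. The first function decodes every instruction each time it runs; the second turns a chunk of guest code into host code once, caches it, and reuses the result.

```python
# Hypothetical three-instruction "guest" ISA, for illustration only.
PROGRAM = [("load", "r0", 5), ("add", "r0", 3), ("mul", "r0", 2)]

def interpret(program, regs):
    """Instruction-by-instruction emulation: decode on every execution."""
    for op, reg, val in program:
        if op == "load":
            regs[reg] = val
        elif op == "add":
            regs[reg] += val
        elif op == "mul":
            regs[reg] *= val
    return regs

_cache = {}

def translate(program):
    """Dynamic recompilation: build host code for a chunk once, then reuse it."""
    key = tuple(program)
    if key not in _cache:
        ops = {"load": "=", "add": "+=", "mul": "*="}
        lines = ["def block(regs):"]
        lines += [f"    regs[{reg!r}] {ops[op]} {val!r}" for op, reg, val in program]
        lines.append("    return regs")
        namespace = {}
        exec("\n".join(lines), namespace)  # the "host code" here is a Python function
        _cache[key] = namespace["block"]
    return _cache[key]

print(interpret(PROGRAM, {}))
print(translate(PROGRAM)({}))
```

The translation cost is paid once per chunk, so code that runs repeatedly (loops, OS routines) is where the recompiling approach pulls ahead.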
Connectix later sold a dynamic recompilation emulator ("Speed Doubler") for NuBus PowerMacs that did, in fact, double the speed of those machines for many operations, mainly because so much of the OS and ROM on the first-gen PowerMacs was still 68K code.
I think that dynamic recompilation has a bright future; x86 may eventually be just another "virtual machine" language that gets dynamically recompiled to something faster/more compatible/etc at the last moment.
-Mark
I don't see how the problem that I'd like to send you dynamic content via email without requiring you to be running the same CPU as I am is caused by closed standards. On the contrary, it seems to be an inevitable side effect of competition in the processor market. Yes, it is an obvious solution: given that I can do on-demand translation to the Java Virtual Machine, how much harder is it to do on-demand translation to the instruction set of a real CPU?
"Freedom means freedom for everybody" -- Dick Cheney
It's funny how things are heading these days. Java, .NET, and dynamically-translating processors are all "brilliant solutions" to a problem that was caused by closed standards in the first place.
Got Rhinos?
You're right in the case of the desktop and applications world. However, in the embedded world, such as cellphones and 802.11, this is VERY useful. The problem of multiple proprietary platforms is the current bane of the telecom industry, which this company is clearly targeting.
Hmmmm, I just had a crazy idea. What if you could compile your GCC application in a special way, then run it under simulated normal working conditions and have it log performance data on itself, just the kind of data that these run-time optimizers gather. Then, you could feed GCC this collected data along with your application's source and recompile it and GCC would be able to turbo-optimize your app for actual usage conditions! If it can be done on-the-fly at run-time, it can be done even better at compile time with practically unlimited processor time to think about it.
Even if the end-user used the application in a nonstandard way it might still provide a performance benefit because there are lots of things that a program does the same way even when it is used in a different way.
Would this be feasible? Would it provide a tangible performance benefit? (like HP's Dynamo?) Comments please!
main(c,r){for(r=32;r;) printf(++c>31?c=!r--,"\n":c<r?" ":~c&r?" `":" #");}
MetroWerks here in Austin did the emulation layer for Apple's M68K->power switchover. They did a really clever thing of identifying long "runs" of code that nobody ever jumped into the middle of, then they treated them as one big instruction with a lot of side effects, and optimized them as a block. (Not one instruction at a time, but the whole mess into the most optimal set of new platform instructions they could.)
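Finding runs that nothing jumps into the middle of is essentially the textbook "leader" analysis for basic blocks. The sketch below shows that general technique, not the patented emulator's actual method; the instruction format (an opcode plus an optional jump target index) and the function name `find_runs` are assumptions made for the example.

```python
def find_runs(instructions):
    """Split a linear instruction list into runs that are only entered at the top.

    Each instruction is (opcode, target); target is the index jumped to,
    or None for instructions that do not jump.
    """
    leaders = {0}
    for index, (opcode, target) in enumerate(instructions):
        if target is not None:
            leaders.add(target)              # something jumps here
            if index + 1 < len(instructions):
                leaders.add(index + 1)       # code after a jump starts a new run
    starts = sorted(leaders)
    ends = starts[1:] + [len(instructions)]
    return [list(range(s, e)) for s, e in zip(starts, ends)]

code = [
    ("move", None),
    ("add", None),
    ("cmp", None),
    ("bne", 1),       # loop back to the "add"
    ("store", None),
    ("rts", None),
]
print(find_runs(code))
```

Each run can then be translated and optimized as a unit, since its only entry point is its first instruction.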
It was quite clever. It's also quite patented, and has been since before the Power PC came out. (And in a sane world those patents would have expired by now, but with patent lengths going the way of copyright...)
Eventually, when the patents expire, this sort of dynamic translation will be one big science with Java JITs, code morphing, and emulation all subsets of the same technology. And somebody will do a GPL implementation with pluggable front and back ends, and there will be much rejoicing.
And transmeta will STILL be better than iTanium because sucking VLIW instructions in from main memory across the memory bus (your real bottleneck) is just stupid. CISC has smaller instructions (since you can increment a register in 8 bits rather than 64), and you expand them INSIDE the chip where you clock multiply the sucker by a factor of twelve already, and you give it a big cache, and you live happily ever after. Intel's de-optimizing for what the real bottleneck is, and THAT is why iTanic isn't going to ship in our lifetimes.
Rob
Here's the homepage for the company - Transitive Software
(Apologies for the Karma whoring)
There is no renaissance in computing that will be ushered in by this product. We have already seen its like with DEC's FX32 (Intel to Alpha) and Apple's synthetic68k (M68k to PowerPC) as well as a number of predecessors (wasn't there something like this on one or another set of IBM mainframes?) and current open source and commercial products (Plex86, VMware, Bochs, SoftPC, VirtualPC, VirtualPlaystation, etc.), all of which use some amount of dynamic binary translation, and none have set the world on fire. They are mildly useful for some purposes, but the cost of actual hardware is low enough to kill their usefulness in most applications.
I wish these guys luck, but I doubt anyone will be too enthusiastic about this product. They might have stood a chance if they'd pitched this thing a year or two earlier (when there was lots of dumb money looking to be spent) but they are probably toast today.
This was being worked on a few years ago by some people at The University of Queensland. Unfortunately, they got tired of the project (and, if I remember correctly, that they weren't getting much popular support).
Their website is at :
http://www.csee.uq.edu.au/~csmweb/uqbt.html
"UQBT - A Resourceable and Retargetable Binary Translator"
To note, they mention that they got some funding from Sun for a few years. (Likely either causing or due to their work on writing a gcc compiler back-end that emits Java byte-codes.)
Let's say my source architecture uses interrupt-based I/O. My target uses memory-mapped. Will this translator be able to handle that?
To be honest, translating one CPU's version of 'CMP R1, R2' to another's doesn't sound like it will usher in a renaissance of anything.
-Poot
How many people would want to run a "translated" web server? Database? Scientific application? How reliable can it be? Why not recompile it natively?
If we are talking about closed source:
The same questions except the last one plus lack of technical support for non-native architectures at least by some vendors (e.g. Apple).
I wonder if the specs for DAG will be open so that code can be compiled directly to it, optimized, and then distributed, saving the first two steps in the process. I can see commercial software vendors being all over this idea.
[100% ISO 646 Compliant]
SVM, ERGO MONSTRO.
This sounds an awful lot like the dynamic recompilation of MIPS to x86 done in many emulators (such as UltraHLE, Nemu, Daedalus and PJ64).
I've been working on the dynarec for Daedalus for about 2 years now, and currently a 500MHz PIII is just about fast enough to emulate a 90MHz R4300 (part of this speed is attributable to scanning the ROM for parts of the OS and emulating these functions at a higher-level). Of course, optimisations are always being made.
After reading the article, I'd be very interested to see if they can consistently achieve the 25% or so speedups that they claim (even between RISC architectures).
For those interested, the source for Daedalus is released under the GPL.
damn slashcode...
Wow! What innovative technology! I wonder when they will patent this so-called "directed acyclic graphs". And they picked such a cool name! It sounds so mathematical!
Okay, enough laughing at the expense of clueless reporters...
Doug Moen.
I have written a truly remarkable program which this sig is too small to contain.
While this is fascinating-sounding technology, it sounds more like a solution in search of a problem. There are already software solutions for emulation (SoftPC, VMWare, etc). There are already cross platform language solutions (Java, etc) and so on. Despite this, the market for massively cross platform applications has not really developed. It isn't as if a 25% performance increase is what's holding back the 'renaissance' the author speaks of.