Slashdot Mirror


Transmeta Awarded Another Patent

Eric_Scheirer writes "You can read it here. Can someone explain to me what it means?"

345 comments

  1. Re:Summary by Anonymous Coward · · Score: 0
    On a side note, why spend all of this effort to be x86 compatible when you have the source code?

    Consider a concurrent remote shared memory object store. All the clients of the store will need to fault in objects from the global store, which means they need a common object format. Sure, you could conceivably store your objects in a neutral format like XML, but you'd much rather not pay the costs of translating between a native low-level object format and such a high-level format. If you don't want to lock yourself into a single hardware architecture/instruction set for the lifecycle of the objects, then using something like Java bytecode helps considerably. A code morphing processor like what Transmeta may be creating could be even better yet: hardware/instruction set independence without the limitations of bytecode interpretation or Just In Time compilation.

    Frankly, your attitude of "Just compile one that works on my machine" is rather retrograde, limited, and naive in the context of distributed object/component systems.

  2. Re:Behind the technology - the business by GnrcMan · · Score: 2

    No, you're all reading this wrong. Look at the bottom of the patent, with the code samples. Notice the use of the word commit. That's basically an exception barrier. At that point, the stores actually occur. Trap barriers are commonly used on the Alpha: a number of instructions are executed, then when a trapb happens, any exception which occurred in that block of code fires. This is a hardware method of rolling back stores to get the exact state at the time of exception without tracking state on an instruction-by-instruction basis. It allows for pipelining, scheduling, etc. Read this comment for more from me!

  3. Re:Hypocrites! by fatboy · · Score: 1

    What part of _HARDWARE_ do you not understand?? This isn't a software patent.

    --
    --fatboy
  4. Actually, it doesn't know there won't be an error by Mr+Z · · Score: 2

    This patent exists because they can't determine ahead of time if there will be an error. This patent, in conjunction with one of their other patents, provides a method for them to muddle on in the usual case of no error and still have a means of rolling back in the less usual case of an error.

    This particular patent covers stores that are speculatively executed, but may need to be killed because an instruction that occurred logically before them in the original code stream faulted.

    --Joe
    --
  5. Re:What it Really does by GnrcMan · · Score: 1

    Pick up a copy of the Alpha Architecture Reference Manual and look up trap shadows. Rollback can be very useful when pipelining instructions because there is no guarantee of precisely where a fault issued from. Precise exception trapping is slower and more complicated. Read this for more!

  6. Re:What it Really does by swb · · Score: 1

    I'm thinking of a rewind button for my PC. I can execute some application and if I screw up, I can "rewind" back to where I was before. This sounds kind of stupid, but I can see consumer devices eating this up. It would also make it easier to replay that last death in Quake without having to go back to your last saved game.

  7. In my closet by Anonymous Coward · · Score: 0

    Cleaning out junk the other day, I found a Transputer. Did I buy it from Inmos? Is there a Linux port for it? Will Transmeta run on it? Would this qualify as Beowulf-in-a-Box? And when will a picture of my box make it onto a box of Wheaties? Inquiring programmers want to know.

  8. Re:It means .. by mvw · · Score: 1
    you start with instruction set specific code and meta-compile it into custom hardware that is dynamically reconfigurable, resulting in very fast execution on hardware that is essentially optimized for each particular application.

    Sure, and I use that state-of-the-art parallelizing compiler on my 8-fold XEON SMP system to get a blazing fast application.. oh, sorry, there is no such compiler that turns a random program efficiently into a parallelized one? Oops.

    So please tell me why the engineers at Intel or AMD don't fire up their logic optimizers to implement the processor instructions in such an optimal manner?

  9. Re:YES, that's what I got by be-fan · · Score: 1

    It's actually pretty simple. Essentially a PPro is emulating an x86, right? A PPro 200 is faster than a Pentium 200 and thus you get faster emulation than native!

    --
    A deep unwavering belief is a sure sign you're missing something...
  10. Just in case you still dont get it... by Cylix · · Score: 1

    To completely clarify for the dumbfounded...

    They were awarded a patent for a universal hardware based processor emulator.

    Generally speaking, of course... this is similar to how AMD's K6 family of processors work, by converting x86 instructions to faster RISC instructions... but theirs operates on a broader scale.

    --
    "You should always go to other people's funerals; otherwise, they won't come to yours." -- Yogi Berra
  11. Patent requirements by LoppEar · · Score: 2

    When writing a patent I believe you are required to phrase the abstract as one sentence. I suppose it was originally intended to show the purpose of a patent as shortly and concisely as possible.

    That clearly is not happening. But like so many other things to do with the patent office, this outdated requirement has been preserved.

    LoppEar

  12. Run it through babelfish...And play backwards! by FieldsBoy · · Score: 1

    Read this out loud, record it and reverse it, and you get a really great recipe for banana nut muffins.

    -BF

    --

    -BF
  13. Re:Processors by drw · · Score: 1

    If Transmeta is creating the next big thing in processors, I am almost positive they will have someone else manufacturing them. It is not uncommon for semiconductor companies to be fabless, and I am sure we would have heard it if Transmeta was purchasing/building a fab.

  14. Re:What about the "permanent bit" by Capt+Dan · · Score: 2

    Yes, but what about the permanent bit? Some other comment (sorry, no link) thinks that it means the cache of another processor. Use one to translate, then ship it somewhere else for execution. But that would imply a multiprocessor, while I think it is a single CPU (possibly with multiple processing units).

    But why have any reference to permanent storage if the data just gets shipped back and forth between caches?

    --
    Sig:
    Barbeque is a noun. Not a verb.
  15. Re:Hmm ... by Zurk · · Score: 1

    Also see http://vliw.ibm.com for more info.

  16. Re:Huh? - Time to call on your Grammer classes =] by gothic · · Score: 1

    Here is how it looks:

    "circuitry for permanently storing memory stores temporarily stored when a determination is made that a sequence of translated instructions will execute without exception or error on the host processor"


    Here is how it should look:

    "circuitry for 'permanently storing memory stores' temporarily stored when...."
    It might be easier if you abbreviate it..

    "circuitry for PSMS's temporarily stored when.."

    Make sense now? =] The whole dern document is like that. It's sick..I had to read certain sentences multiple times.. =] It's like reading a book with no periods, no capitals, no nothing. Insane. But yeah...Still sounds cool.. =]

  17. Re:Wait... by PagoPago · · Score: 1

    Doctor Torvalds?

    I didn't know he had a doctorate.

    I didn't even know he owned Transmeta.

  18. But is instruction level compatibility enough? by taer · · Score: 1

    Granted, it's a huge, very important idea. But we already have binary compatibility between x86 and x86 running 2 different OSes (which is what WINE does, if I'm not mistaken), and it's getting better and better every day. But isn't the binary compatibility just half of the battle? This will be a kick ass chip, but it's still not going to let a Fortune 500 company run Word on Linux. Of course, this is my wild speculation. Could be totally wrong.

  19. anybody work in chip fab? by lbergstr · · Score: 1

    Ok, if TM is really inventing the Holy Grail, they're probably going to expect, um...high demand. So wouldn't they need to have either (1) built their own chip fab or (2) contracted with another chip fab? Seems like someone should be able to confirm (1) or (2), and if neither has occurred, it's likely we won't see this chip for some time.

  20. Re:Doesn't this violate theory? by Anonymous Coward · · Score: 0

    This is not exactly the same as the halting problem. If you know what a processor does given some instructions and its current state, you can determine when the processor will have an exception. You can do this for only a few future instructions at a time, though.

  21. Re:Hmm ...[wrong url sorry] by Zurk · · Score: 1

    damn sorry..its actually
    VLIW

  22. Stephen King??? by hitzroth · · Score: 2

    Did anyone else notice that the patent
    attorney's name is Stephen King?

    Is this whole thing some kind of cruel
    quasi-technological/horrific hoax?

    --
    In mathematics, one does not understand things, one merely gets used to them.
    --VonNeumann
  23. Re:Wait... by AndyL · · Score: 1

    You missed yesterday's news. He's got an 'honorary' one.

  24. Translation for Geeks by SorsKode · · Score: 1
    The patent wording wasn't very clear so I converted it to HEX. Now it makes perfect sense. =)



    0000000 7041 6170 6172 7574 2073 6f66 2072 7375
    0000010 2065 6e69 6120 7020 6f72 6563 7373 6e69
    0000020 2067 7973 7473 6d65 6820 7661 6e69 2067
    0000030 2061 6f68 7473 7020 6f72 6563 7373 726f
    0000040 6320 7061 6261 656c 6f20 2066 7865 6365
    ....

    Geek talk vs. lawyer talk.
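
    For anyone who wants to go back from geek talk to lawyer talk: the dump above looks like a hexdump-style listing, with an offset column followed by 16-bit words. A minimal Python sketch, assuming the words are little-endian (low byte first in the file), which is a guess based on the fact that it decodes to readable text. The name `decode_words` is made up for illustration.

```python
def decode_words(dump: str) -> str:
    """Decode hexdump-style lines: an offset column followed by 16-bit hex words."""
    out = bytearray()
    for line in dump.splitlines():
        fields = line.split()
        if len(fields) < 2:
            continue  # skip blank lines and the "...." elision marker
        for word in fields[1:]:       # fields[0] is the offset column
            value = int(word, 16)
            out.append(value & 0xFF)  # assumed little-endian: low byte comes first
            out.append(value >> 8)    # then the high byte
    return out.decode("ascii")


DUMP = """\
0000000 7041 6170 6172 7574 2073 6f66 2072 7375
0000010 2065 6e69 6120 7020 6f72 6563 7373 6e69
0000020 2067 7973 7473 6d65 6820 7661 6e69 2067
0000030 2061 6f68 7473 7020 6f72 6563 7373 726f
0000040 6320 7061 6261 656c 6f20 2066 7865 6365
....
"""

print(decode_words(DUMP))
```

    Run on the five lines quoted here, it gives back the opening words of the patent title, cut off mid-word where the dump stops.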

  25. Re:Have we really thought this through yet?? by Anonymous Coward · · Score: 0
    Well, you got one thing right: WOAH.

    Just because you no longer care about the instruction set that a bit of code has been compiled into doesn't mean that you can instantly start mixing and matching OSes or APIs. Consider the current Wine project: on x86 platforms they are dealing with code that has been compiled to exactly the same instruction set, yet there is still a great deal of work to be done to support the Windows API in a Unix environment.

    Yes, it's theoretically possible, but it doesn't happen automatically or for free.

    If you are talking about code that has been written to the same API but compiled to different instruction sets (e.g., POSIX C code that has been compiled for both x86-Linux and SPARC-Linux), then a morphing processor might very well be able to effortlessly run either application within a single OS environment. No longer would you have one hardware-flavor of Linux user complaining that there are only binaries for another hardware-flavor of Linux available.

  26. Re:What if... by Guy+Harris · · Score: 2
    From the comments I've been reading it seems like the patent is for a processor that would translate instructions for other processors into its own instruction set, make sure the translated instructions would work, and if so run them.

    Read the patent, not the comments. Many people seem to think the processor would translate instructions itself, perhaps because the patent goes on about

    a processing system having a host processor capable of executing a first instruction set to assist in running instructions of a different instruction set which is translated to the first instruction set by the host processor

    but the patent later indicates that "the processor" does that translation by running translation software:

    Typically, the target application is being designed for some target computer other than the host machine on which the emulator is being run. The emulator software analyzes the target instructions, translates those instructions into instructions which may be run on the host machine, and caches those host instructions so that they may be reused.
    what would it do if the translated instruction would cause an error?

    Not bother storing the results of that instruction.

    Would the processor just not carry the translated instructions out?

    ...or, at least, make it look as if it didn't.

    If so, that would seem to be quite a flaw.

    Why? The error could trap, and the trap handler (or code it invokes) could do whatever is necessary to simulate what the processor being emulated would do in that error situation (although the "exceptions" they talk about aren't necessarily errors - I scrolled past one example of "native-Transmeta" code, generated from x86 code, that assumed that the code doesn't make unaligned memory references that cross a page boundary; if that happened, "either hardware or software alignment fix up" would detect this, and perhaps generate more pessimistic and slower code and restart the emulation running the new code).

  27. Re:Huh? by Anonymous Coward · · Score: 0

    "circuitry for permanently storing memory stores temporarily stored when a determination is made that a sequence of translated instructions will execute without exception or error on the host processor"


    Let's say you are emulating instruction sequence E1,E2,E3 and the resulting host processor code is H1,H2,H3,H4,H5,H6,H7,H8,H9. During the processing of H1-9, data is written to a cache (a "temporary store") until the processor circuitry has determined that H1-9 ran "without exception or error". If it did, then there is circuitry to move the data from the cache to RAM ("permanent memory store").

    In other words, data is held on chip until a sequence of instructions completes, at which point it gets sent to memory.

    This would be useful if you follow both paths of a branch statement. Once it is determined which branch was supposed to be taken, that data gets posted, while the data generated by taking the wrong branch gets tossed.
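
    The hold-then-commit idea described above can be modeled in software. This is a rough Python sketch under my own assumptions, not the patented circuitry: `GatedStoreBuffer` and `run_block` are hypothetical names, a dict stands in for RAM, and a Python exception stands in for a hardware exception.

```python
class GatedStoreBuffer:
    """Holds stores in a temporary buffer until the whole block is known to be good."""

    def __init__(self, memory: dict) -> None:
        self.memory = memory   # the "permanent" store (stands in for RAM)
        self.pending = []      # temporary stores, kept in program order

    def store(self, addr: int, value: int) -> None:
        self.pending.append((addr, value))

    def commit(self) -> None:
        # The block ran without exception: make the stores permanent, in order.
        for addr, value in self.pending:
            self.memory[addr] = value
        self.pending.clear()

    def rollback(self) -> None:
        # The block faulted: throw the temporary stores away.
        self.pending.clear()


def run_block(memory: dict, host_ops) -> bool:
    """Run one translated block (H1..Hn); commit its stores only if nothing faulted."""
    buf = GatedStoreBuffer(memory)
    try:
        for op in host_ops:
            op(buf)
    except ArithmeticError:   # stand-in for "exception or error on the host processor"
        buf.rollback()
        return False
    buf.commit()
    return True


def faulting_op(buf: GatedStoreBuffer) -> None:
    raise ZeroDivisionError("simulated host exception")


memory = {}
ok = run_block(memory, [lambda b: b.store(0x10, 1), lambda b: b.store(0x14, 2)])
bad = run_block(memory, [lambda b: b.store(0x18, 3), faulting_op])
print(ok, bad, memory)
```

    After the second block faults, the store to 0x18 never reaches `memory`, which is the "as if it never happened" behavior the comment describes.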

  28. Hmm ... by Drey · · Score: 2

    I think it said "we've got a cpu that pretends to be another cpu and it needs a place to store the instructions it's actually running to run the instructions it thinks it's running, and we're patenting how it does that." But I'm probably wrong.

    1. Re:Hmm ... by Anonymous Coward · · Score: 1

      The error-prediction system looks like a hardware band-aid solution for "that other OS's" bad code. ;) The possibility of instantaneous suspend/resume looks nice, though, and there might be interesting hardware tricks to play with if this were ever applied to anything resembling a "standard desktop".

    2. Re:Hmm ... by InTheWoods · · Score: 2

      A lot of nice explanations of how it does what it does, but not of why it does what it does. All computers run software to get to an end point. If this thing runs (almost) all software, it's to get to an end point. The effect of the machine: Windows is just another application. So is Linux, or BeOS, etc. One resource drive with a second configuration drive lets you drop in any OS disk and have a variable configuration of the type of machine it was written for.

    3. Re:Hmm ... by egbassline · · Score: 1

      A thinking chip. It reads, it interprets, it functions. Cross platform, cross lingual. Different material. Should be out soon.

    4. Re:Hmm ... by The+Brave+Coward · · Score: 1

      Seems like all these theories are true ... let's wait for the announcements.

    5. Re:Hmm ... by bogtrotter · · Score: 1

      You read it the same way I did... A CPU that can run multiple instruction sets.

    6. Re:Hmm ... by V. · · Score: 1

      somebody moderate this guy up to 5. Bye, Jove. I think
      he has it!

    7. Re:Hmm ... by Anonymous Coward · · Score: 0

      I am registering........ In the meantime, I think this is fairly simple. They are building a processor capable of emulating other (micro)processors. They may be translating the source to their target order code by hardware - wow, that's really hard - or by software, just hard. Either way the results of the translation will be stored and re-used.

      One of the problems with translation is that the source order code has checks which are costly to implement in the target. Equally, the target sequence may execute things out of order before an error occurs. We need to get back to the correct process state, i.e. the one when the error would have occurred. This in itself is not hard - registers can simply be saved before a sequence - but store is a different matter.

      This patent implies that they can buffer the writes until the translated sequence knows that there are no errors. When this occurs it commits the writes to the store. I assume that this "write buffer" is read in parallel to the store just in case a sequence has reads of written data (or maybe it optimises those in registers until the last write in the sequence).

      How big the buffer is depends on how big a "block" of source instructions they can handle - and if they have jumps in the blocks - it would be really clever to buffer up the writes for a block that loops. I suspect the buffer is relatively small and tied to the translation process - the generated code would not have more unique writes than the buffer could hold. One possible improvement would be to hold two buffers and empty one while the other was being filled - obviously allowing the reads to come from either buffer or the store (cached of course).

      And of course making it go fast makes it all the more difficult - but serious fun. Good luck to them. Anyway, those are my ideas.
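
      A small sketch of the write buffer guessed at above, with reads checked against the buffer before the store and a fixed number of unique write slots. It is a toy model in Python built on the assumptions in this comment, not on anything the patent spells out: `WriteBuffer`, its `capacity`, and the choice to raise an error on overflow are all made up for illustration.

```python
class WriteBuffer:
    """Buffers writes for one translated block; reads see buffered data first."""

    def __init__(self, memory: dict, capacity: int) -> None:
        self.memory = memory      # the backing store
        self.capacity = capacity  # max number of unique addresses held at once
        self.pending = {}         # addr -> latest buffered value

    def write(self, addr: int, value: int) -> None:
        # A second write to the same address reuses its slot.
        if addr not in self.pending and len(self.pending) >= self.capacity:
            raise OverflowError("block has more unique writes than the buffer holds")
        self.pending[addr] = value

    def read(self, addr: int) -> int:
        # "Read in parallel to the store": buffered data wins over memory.
        if addr in self.pending:
            return self.pending[addr]
        return self.memory.get(addr, 0)

    def commit(self) -> None:
        self.memory.update(self.pending)
        self.pending.clear()

    def discard(self) -> None:
        self.pending.clear()


memory = {8: 1}
wb = WriteBuffer(memory, capacity=2)
wb.write(8, 5)
print(wb.read(8), memory[8])  # the block sees its own write, memory does not yet
wb.commit()
print(memory[8])
```

      In this model the translator would have to end a block before its unique writes exceed `capacity`, which matches the guess that the buffer size is tied to the translation process.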

    8. Re:Hmm ... by WyrdOne · · Score: 1

      It seems to me that they are trying to implement something similar to ECC for a processor so that an operation will not crash the system. They add a second processor to take over if needed. The way a current dual processor system works, the jobs are doled out by instruction, not by process... that way, if one process hangs, it can take over both processors and hang both... With the new method being described, it won't freeze both processors if a process hangs.

    9. Re:Hmm ... by Snorbert+Xangox · · Score: 1

      In summary: it's for extremely aggressive speculative execution. For instance, it lets them speculatively execute branching code paths, including memory operations, without knowing which branch is actually taken until sometime afterwards.

      A place where this might make a lot of sense for their rumoured dynamic translation architecture is if they unroll loops when translating foreign code into its own native instruction set; in that case, it could well be a performance win to execute the unrolled loop and then have any memory operations retrospectively committed or annulled based on whether the unrolled loop overran the actual end of the loop. (Note that this isn't just about deciding not to stick data into memory, but also preventing possible virtual memory exceptions for speculative memory accesses until they are known to have really happened. Think of what could happen otherwise in an OS kernel to the loop clearing newly allocated pages, for instance. :-)

      This patent is along the same lines as an earlier patent of theirs which covered speculative r/w of IO locations.

      I can't wait for the official announcement of whatever it is they are supposed to be doing... =:^)

      --
      -Snorbert, somewhere in the antipodes
    10. Re:Hmm ... by zagmar · · Score: 1

      Yes, but it also verifies that the instructions will execute correctly before it processes them. Sweet...really sweet...Super Sweet!

    11. Re:Hmm ... by Anonymous Coward · · Score: 1

      Convert X86 instructions to TM instruction (verify translated instructions will run properly - like java byte code verification) and run it directly - NOT UNDER EMULATION when execution takes place, SO ALL THIS VERIFICATION

    12. Re:Hmm ... by el_diablo · · Score: 3

      looks like a cpu which reads foreign instruction sets and then translates them into its own set and execs them in a highly parallel manner to produce a faster execution than the original processor. The trick here looks to be finding out which things can be done in parallel without causing an exception. End result is a Transmeta chip that runs the instructions of other chips faster.

    13. Re:Hmm ... by Anonymous Coward · · Score: 0

      That's what I thought it looked like. A chipset that can emulate any cpu.

    14. Re:Hmm ... by mvw · · Score: 5
      looks like a cpu which reads foreign instruction sets and then translates them into its own set and execs them in a highly parallel manner to produce a faster execution than the original processor.

      TRANSlatingMETAprocessor?

  29. It fits with horizontal microprogramming... by AJWM · · Score: 1

    So far this fits with the dynamically re-microprogrammable processor I speculated on last week. For "a first instruction set" in the patent, read "microcode", and the patent becomes more legible. In short, I think it's a combo of dynamically changeable microcode with JIT compilation of the interpreted instruction set, although there are a couple of different approaches to that.

    Now, the specifics in this case seem to be a provision for caching the results of parallel instruction execution and then either voiding or writing that cache depending on whether the instructions cause an exception or not. That in itself is nothing new, but in combination with "just-in-time" compilation of, say, x86 code to Transmeta microcode it might be. Particularly if the Transmeta processor uses horizontal microprogramming (read, "very very very long instruction word") to speed up the processing. (Loosely speaking, with horizontal microcode, each bit in the very long instruction word (could be hundreds of bits) maps to a discrete piece of logic (gate, flip/flop, etc) in the processor. Given an appropriate processor design it might be possible to map several instructions in a more vertical set (x86, PPC, etc) to a single wide microinstruction, effectively executing all of those in parallel, but then you really need some way of flushing everything if it screws up. Which this patent provides.)

    (It'll be interesting to see if the active microcode store is loadable from RAM (making it end-user microprogrammable) or just from a fixed set of microprograms on ROM (which may live on the CPU die)).

    --
    -- Alastair
    1. Re:It fits with horizontal microprogramming... by Anonymous Coward · · Score: 0
      Loosely speaking, with horizontal microcode, each bit in the very long instruction word (could be hundreds of bits) maps to a discrete piece of logic (gate, flip/flop, etc) in the processor.

      That's pretty much the definition I would have given for VLIW. What does it mean now, if not that?

  30. Processors by Anonymous Coward · · Score: 0

    It means with 99% certainty that Transmeta is producing processors.

    1. Re:Processors by Imperator · · Score: 2

      Or rather, that Transmeta is developing processors. They might be produced by a different company. I don't know how large Transmeta is or whether it's capable of producing its own chips.

      --

      Gates' Law: Every 18 months, the speed of software halves.
  31. A compiling processor! by Slur · · Score: 1

    Some of the cooler benefits of this new processor have gone unnoted by the otherwise snappy Slashdotters. A compiler that translates any and all instruction sets into its own native code provides an amazing means for JIT, open source, and all manner of programs written in their own customized instruction sets.

    If the TransMeta ideal propagates widely, it will be a new world of software design. Instead of compiling for a processor, you only need to pre-compile into a pseudocode, provide a description of that pseudocode to TransMeta, and the processor takes care of final translation, and apparently a good deal of the debugging work too. As a development platform this sounds like a serious ideal. In a world of open source and platform independence, the TransMeta sounds like a real solution.

    Yeah, I'm a Mac programmer. You got a problem with that?

    --
    -- thinkyhead software and media
  32. Rolling back is useful in translation. by Mr+Z · · Score: 2

    If you're doing ISA->ISA translation, rollback is very, very useful. Suppose we have an x86-style instruction like this:

    MOV [EAX*4 + 4], EBX

    This particular instruction does several things: It reads EAX, multiplies it by 4, adds 4 to it, and then stores EBX at the resulting address in memory. This might break up into several RISC-like ops: (these are written in the more traditional RISC form OP src, src, dst for clarity)

    SHL EAX, 2, tmp1

    ADD tmp1, 4, tmp2

    STORE EBX, *tmp2

    It doesn't take a brain surgeon to see that these steps could overlap their execution with other instructions. For instance, the instruction that calculates EBX could overlap execution with the left-shift and the add. If the original instruction was in a tight loop, then it could even overlap with itself! Why is this important?

    Say you have some code which is stepping through the array, and say that the array spans a page boundary. And, say that the second page isn't "paged in." When the loop hits the page boundary, a fault will occur. Because the stores are being spooled extremely rapidly, the loop may not be informed that a particular STORE faulted until several other stores were executed. All stores after the faulting one need to be killed since we need to process exceptions in a precise order. Here's where the rollback becomes handy: We merely discard the extra, incorrect stores, and roll back the processor state to be consistent with the emulated state of the machine at the time of the fault.

    This is what most current x86 clones do when they translate x86 instructions into "RISCops" or whatever they decide to call them. I'm guessing Transmeta is aiming to do a similar sort of translation, only with a more configurable flavor.
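
    To make the page-boundary example concrete, here is a toy model in Python. It is only a sketch under stated assumptions: a 4096-byte page size, a dict standing in for memory, a set of "paged in" page numbers, and the made-up names `translate_store` and `run_loop`. Real hardware overlaps these steps in a pipeline; the sketch only shows which stores survive.

```python
PAGE_SIZE = 4096  # assumed page size for the example


def translate_store(eax: int, ebx: int):
    """Split MOV [EAX*4 + 4], EBX into the three RISC-like steps."""
    tmp1 = eax << 2      # SHL   EAX, 2, tmp1
    tmp2 = tmp1 + 4      # ADD   tmp1, 4, tmp2
    return (tmp2, ebx)   # STORE EBX, *tmp2  (issued, but not yet performed)


def run_loop(memory: dict, present_pages: set, start: int, count: int, value: int):
    """Spool `count` stores ahead of time, then retire them in program order.

    Returns (stores_committed, faulting_address or None). Stores at or after
    the first faulting one are discarded, so memory matches the emulated
    machine's state at the moment of the fault.
    """
    spooled = [translate_store(eax, value) for eax in range(start, start + count)]
    committed = 0
    for addr, val in spooled:
        if addr // PAGE_SIZE not in present_pages:
            return committed, addr   # kill this store and every younger one
        memory[addr] = val
        committed += 1
    return committed, None


memory = {}
# Only page 0 is "paged in"; the array runs across the boundary at 4096.
committed, fault = run_loop(memory, present_pages={0}, start=1021, count=4, value=7)
print(committed, fault, sorted(memory))
```

    With EAX running from 1021 to 1024, the first two stores land on page 0 and are kept; the third hits the missing page, and the fourth is dropped along with it even though it was already spooled.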

    --Joe
    --
  33. Looks like.... by TBone · · Score: 2

    ...Hardware Instruction Set translation (i.e. x86 to Alpha). Maybe for an embedded cross-platform system?

    --

    This space for rent. Call 1-800-STEAK4U

    1. Re:Looks like.... by Anonymous Coward · · Score: 2

      One possibility it could be, would be for it to be more of a multi-CPU control centre. It is first fed an instruction from the software, then it picks which processor should get it and sends it off to that proc with any appropriate changes to the code to get it to work (and checking that it works before letting it out).

      Combine it with a slot T scheme (T for Transmeta :-) that lets them repackage any CPU in their own high speed cartridge, and their little processor can send tasks off to the appropriate Intel/AMD/PowerPC/whatever chips (and there could be a mixture).

      Perhaps their little CPU thing will also let it experiment if there are multiple CPUs and idle time. Send an instruction off to CPU A and CPU B at the same time and see who gets a valid answer back the soonest. That might be why it needs its own memory - so it can queue the same task up for multiple CPUs but then delete it from the other CPUs queue once it has a reply back from whichever CPU got to it first.

      Combine that with a CPU bus and things could get interesting. Perhaps you stick the Transmeta CPU in a regular CPU slot/socket, and then stick a daughter board into an AGP-like slot.

      Or alternatively, they could be going a bit like the Amiga, and have semi-specialised CPUs, but which can also be used for other things. So your sound card can also do general CPU tasks if it isn't playing enough audio to take up its full load, and your video card could do other stuff. Though I tend to think it'd be more sensible to just form a CPU bus for all the cards, and just have a set of adapter plugs on the m/b. So you could get a m/b with 5 peripheral ports, so if you want to use 1 visual, 1 audio, 1 joystick, 1 mouse and 1 keyboard then that's fine, but they could also do 2 visuals, 2 audio and 1 keyboard or whatever.

      So that way you get a semi-specialized video CPU card if you want hot graphics, but its CPU can also be used for general stuff (and the transmeta controller helps with translations), and you might get a fast general CPU for regular stuff - but it can also do your graphics work as well if it gets to be too much for the video CPU card alone (or you just don't have one at all).

      So you could mix'n'match differently specialized CPUs on the CPU bus, as well as just add a new one every year for the latest speed and keep the old one there too (presumably the controller CPU would be able to shut down the slowest CPUs to save on electricity if they weren't needed).

      Though I don't think the CPU bus is too realistic, as is technology really up to that kind of thing?

      (a fibre optic backbone between CPUs that multicasts requests and the CPUs just pick up requests based on how full their internal queues are? :-).

      I don't know, the possibilities are endless. But I'm quite content to just wait and see what (if anything) transmeta actually comes out with. It's just fun to play around with guessing :-).

      Hmm, maybe I should get an account here someday.
      --Vastor.

  34. Missing link? by Redundant() · · Score: 1

    Could be one hellacious gamer machine! This article made heavy use of the terms "prior art processor" and "modern computer". Last time I checked modern computers ran under a thin layer of hydrogen? (one example from ABC news) Since the speed difference between a modern cryogenic computer and "prior art" computers is substantial it becomes easier to visualize needing to cache and distribute "prior art" computer instructions in this manner.

    Send the flying saucers in.

  35. They will not produce processors by gwolf · · Score: 1

    According to an old article I have misplaced, Transmeta once defined itself as "fabless" - They research, do all the logic design, but when it comes to manufacturing (if it ever comes - I certainly wish it does!), they will hire another company's facilities to do so.

  36. They're NOT making a new CPU... by Anonymous Coward · · Score: 0
    The way I read it, they're making a CPU-like chip that does the instruction conversion part of emulation in hardware and then feeding the conversion output into... whatever they want. They could *just* be building the coprocessor, and using an Alpha, or PowerPC as the host CPU.

    Doing the conversion stage of the emulation in parallel on a second "CPU" is damn clever. If stuff like Virtual PC on the Mac can get 50% efficiency doing everything in software, then moving 1/2 or more of the task to a dedicated coprocessor could easily give them better than 100% efficiency.

    It wouldn't make sense for them to build the primary CPU! They can use any other CPU, as long as it's faster than Intel's.

    And lots of CPUs are faster than Intel's. :-)

    I'm placing my completely uninformed bets on PowerPC. It would be a huge coup if Transmeta introduced a "Wintel" compatible motherboard that had their coprocessor and a socketed PPC that smoked Intel's offerings.

    1. Re:They're NOT making a new CPU... by robj · · Score: 1
      No, I'm sorry, they are making a new CPU. You're completely missing the point of the patent.

      Current CPUs do all the instruction decomposing into primitive operations, rescheduling, and speculative execution in hardware for every instruction that goes through the execution pipes. The whole point of this patent is that that's all wasted, redundant, excess work. You can save lots and lots of transistors, and gain lots and lots of speed, by simply not doing any of that work more than once. That's what their software/hardware combo lets them do.

      It would be pointless to run it on any current chips, since all current chips are literally hardwired to do all the extra work that they're figuring out how to avoid.

  37. Re:It means .. by jafac · · Score: 1

    you need a compiler for folks who want to run code natively on this beast.

    "The number of suckers born each minute doubles every 18 months."

    --

    These are my friends, See how they glisten. See this one shine, how he smiles in the light.
  38. Sounds like fast emulation by grokblah · · Score: 1

    Sounds like plans for a coprocessor that translates x86 instructions into some native instruction set to be executed by a processor. In other words, a translator for the x86 instruction set so that Transmeta's chip can pretend to be a Pentium. They claim it will be faster, though.

    1. Re:Sounds like fast emulation by mvw · · Score: 1
      I don't get that emulation bit. Why not code an efficient x86 CPU directly?

      Is it maybe some kind of CISC to RISC compilation to improve speed? (Analogous to Java JITs?)

      Or do they want to create a non-x86 platform with exceptional x86 emulation? (Like an Alpha CPU based system that could fall back to running x86 software very fast.)

  39. Genetic? Evolution? by HakimGeorge · · Score: 1

    Hmmm... Very interesting

    What happens if a genetic algorithm is used to evolve the translator? It does not take long to train, especially when considering high speed execution. It can be trained via the exceptions, and the genes can be stored and mass produced.

    If this is done at the proper level, it could prove to be a formidable processor: in the worst case it runs at native processor speed, at best much faster than anything we have seen.

  40. Re:What they said... by PagoPago · · Score: 1

    A multi-platform cluster. Hmmm. That sounds like Amoeba.

  41. Summary of the Invention by Lazuli · · Score: 2

    It is, therefore, an object of the present invention to provide a host processor with apparatus for enhancing the operation of a microprocessor which is less expensive than conventional state of the art microprocessors yet is compatible with and capable of running application programs and operating systems designed for other microprocessors at a faster rate than those other microprocessors. Whooh baby. Sounds cool.

    1. Re:Summary of the Invention by Anonymous Coward · · Score: 0

      Yeah ... kinda makes java obsolete. Heck just run any binary as a native binary. As long as you can determine what CPU it is designed to run on ... translate that sucker.

  42. Re:What about the "permanent bit" by GnrcMan · · Score: 1

    Remember, processor instructions eventually operate on data...data that will be permanently stored in memory! If you have an instruction that faults preceding one or several store operations (I refer to instructions which store to RAM, as opposed to a cache or something) and then a trap barrier (sorry, I'm using Alpha-centric terms here), you've changed the contents of the RAM erroneously. The instructions should never have issued and RAM should not have been modified. That's where permanent memory stores come in.

  43. Abstract (very) for the lazy folks by Wah · · Score: 3

    Apparatus for use in a processing system having a host processor capable of executing a first instruction set to assist in running instructions of a different instruction set which is translated to the first instruction set by the host processor including circuitry for temporarily storing memory stores generated until a determination that a sequence of translated instructions will execute without exception or error on the host processor, circuitry for permanently storing memory stores temporarily stored when a determination is made that a sequence of translated instructions will execute without exception or error on the host processor, and circuitry for eliminating memory stores temporarily stored when a determination is made that a sequence of translated instructions will generate an exception or error on the host processor.
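
    A rough way to picture the circuitry the abstract describes is the Python sketch below. It is only a toy model, not anything taken from the patent: the class `GatedStoreBuffer`, the helper `run_block`, and the use of a Python exception to stand in for a host fault are all assumptions made for illustration.

```python
class GatedStoreBuffer:
    """Toy model: stores are held back until the translated block commits."""

    def __init__(self, memory):
        self.memory = memory   # dict standing in for system RAM
        self.pending = []      # (address, value) pairs not yet visible in RAM

    def store(self, address, value):
        # "Temporarily storing memory stores": nothing reaches RAM yet.
        self.pending.append((address, value))

    def load(self, address):
        # The newest pending store to this address wins, else fall back to RAM.
        for addr, value in reversed(self.pending):
            if addr == address:
                return value
        return self.memory.get(address, 0)

    def commit(self):
        # "Permanently storing": drain the pending stores to RAM in order.
        for addr, value in self.pending:
            self.memory[addr] = value
        self.pending.clear()

    def rollback(self):
        # "Eliminating": discard everything the failed block tried to store.
        self.pending.clear()


def run_block(buffer, operations):
    """Run a list of callables; commit if none raises, otherwise roll back."""
    try:
        for op in operations:
            op(buffer)
    except ArithmeticError:   # hypothetical stand-in for a host exception
        buffer.rollback()
        return False
    buffer.commit()
    return True
```

    With `ram = {0x10: 1}`, a block that stores 2 to `0x10` and 3 to `0x14` commits and leaves `ram == {0x10: 2, 0x14: 3}`, while a block that stores and then faults leaves `ram` untouched.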


    hmmm???

    --
    +&x
    1. Re:Abstract (very) for the lazy folks by Anonymous Coward · · Score: 0

      Two words: speculative execution. Run code in parallel, fix up what wouldn't be allowed later. Just a (very) quick reading and guess.

    2. Re:Abstract (very) for the lazy folks by mvw · · Score: 1
      Two words: speculative execution. Run code in parallel, fix up what wouldn't be allowed later

      But for that I don't need to mess with different instruction sets etc.

  44. native vs emulation? by Xtacy · · Score: 1

    um if most of the above comments are correct, saying it would run say native x86 instructions faster than a true x86 could do itself, that leads me to think maybe the reason they have Linus there would be to code a linux kernel that runs natively on the new transmeta chip? Now that would be nice :)

  45. Safe cache by Monty+Worm · · Score: 2
    Just my reading:

    This is a device for assisting in processor emulation: I believe it will hold commands in memory until it knows that they will execute without error. Quite a good idea.

    Simple, elegant, and not obvious. All the requirements for a good patent.

    This is really the sort of thing that Windoze needs: a 'this instruction would cause the program to do "bad stuff(TM)", so I won't allow it' check. It should stop a single process from taking whole systems down.

    --
    ... and today's pet project has ... been discarded for lack of time.
    1. Re:Safe cache by Anonymous Coward · · Score: 0

      "Simple, elegant, and not obvious. All the requirements for a good patent."

      Actually, I think you are confusing the requirements of a good mystery novel with those of a good patent.

      If you really want to know the requirements of a good patent, one place to start is the most recent Manual of Patent Examining Procedure, which is a document used by U.S. patent examiners.

      The USPTO website (http://www.uspto.gov) has a copy of this document somewhere, but an unofficial hypertext version is available at:
      http://patents.ame.nd.edu/mpep/
      for your viewing pleasure. The manual is quite large, but a good place to start is the MPEP table of contents, from which you can select Chapter 700 (the chapters are numbered in 100's), entitled "Examination of Applications", and then section 706.02, "Rejection on Prior Art." There are many other sections to look at (including chapter 2100, entitled "Patentability") that you can profitably browse.

      Perhaps more to the point, however, if you want a preview of some of the work that a company is doing, there is often a better place to look than issued U.S. patents. Most other applications around the world are published a period of time after the application is filed, even before the application is examined. On the other hand, U.S. patents are published only after examination, and only sometime after they are actually allowed. This process sometimes takes three years or more. Thus, a search of international patent applications often turns up things about a company much sooner than a search of the U.S. patent database. Fortunately, international searches at least reasonably suitable for this purpose can also be done free of charge.

      Try this. Go to
      http://ep.dips.org/dips/ep/en/dips.htm
      and select, for example, "Search in PCT (WO) patents" and type the company name in the "Applicant" field. (You can also do this for European, "Worldwide" and Japanese patents. Note that the "applicant" in countries outside the U.S. is usually a corporation, not the individual inventors) If the company is interested in obtaining patent protection in countries other than the U.S., you will often find more about what it is working on this way than you will by searching for the company in the "Assignee Name" of the U.S. patent database.

    2. Re:Safe cache by abaum · · Score: 1

      In particular, it lets you look at (say) a bunch of x86 code,
      translate it into your host machine, then rearrange and optimize those
      instructions for best performance. If you did this naively, then you
      might move a store much earlier, and if an instruction further down
      traps, you have now corrupted memory. (Yes, this does matter, even if
      some previous instruction trapped. It could be a page fault, for
      instance.) This technique was mentioned in the first patent (and is
      more or less what a high end Alpha does internally, under hardware
      control). Here, it's implemented as a software/hardware combination.

      There are a pile of optimizations you can use if you don't have to
      worry about loads and stores being aliased; the fact that the C
      language has pointers (which make detection of this by the compiler
      extremely difficult) is probably the single biggest reason that
      Fortran is still preferred in the high performance computing arena
      (because it doesn't permit aliasing). There's also a special page table
      bit mentioned that marks a page as having translated insts. Any
      attempt to write to that page (which is considered self-modifying
      code) will trap and flush the translated/optimized instructions.
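
      The page-table bit for translated code can be pictured roughly as follows. This is a sketch under assumptions, not the patent's mechanism: the class `TranslationTracker`, the 4096-byte `PAGE_SIZE`, and the idea of returning a flush count are all invented for the example.

```python
PAGE_SIZE = 4096  # assumed page size for the sketch


class TranslationTracker:
    def __init__(self):
        self.translations = {}          # target address -> translated host code
        self.translated_pages = set()   # models the per-page "has translations" bit

    def add_translation(self, address, host_code):
        self.translations[address] = host_code
        self.translated_pages.add(address // PAGE_SIZE)

    def write(self, memory, address, value):
        """Store to target memory; returns how many translations were flushed."""
        page = address // PAGE_SIZE
        flushed = 0
        if page in self.translated_pages:
            # The write hits a page holding translated code: treat it as
            # self-modifying code and throw the stale translations away.
            stale = [a for a in self.translations if a // PAGE_SIZE == page]
            for a in stale:
                del self.translations[a]
            self.translated_pages.discard(page)
            flushed = len(stale)
        memory[address] = value
        return flushed
```

      A write to an ordinary data page costs nothing extra here; only a write to a page marked as holding translated code pays for the flush.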

    3. Re:Safe cache by parkrrrr · · Score: 2

      It also appears to be transaction-oriented: a sequence of instructions that would fault will have no effect, regardless of whether it would have executed a write to memory before the fault. This could be handy, because it means that bad code won't corrupt memory on the way down.

    4. Re:Safe cache by Anonymous Coward · · Score: 0

      For one processor to incorporate another processor's instruction set is routine. (Intel backward compatibility, VAX, ...). Speculative execution is routine in any modern processor. Software emulators for running a different instruction set on a processor have been widely used for many years. On chip instruction set translation has been done for x86 family processors down to a more RISC core. So now the combination of hardware emulation and speculative execution is simple, elegant and "non-obvious"? Isn't this a bleedingly obvious thing to do, driven by the need to use all the real estate on a modern processor for something that might be useful, knowing that the processor is mostly I/O bound?

    5. Re:Safe cache by Anonymous Coward · · Score: 0

      Can we say Java!!!

      This is a "Java" Intel.

      Wow, what an idea...

    6. Re:Safe cache by mvw · · Score: 1
      This is a device for assisting in processor emulation: I believe it will hold commands in memory until it knows that they will execute without error. Quite a good idea.

      Good idea, yes. But how many folks will need such a device? Doesn't sound like such a product could keep more than a couple of engineers in bread and butter.

  46. More from the Patent by GnrcMan · · Score: 1

    Here's a very small quote from the patent that should clear this up:

    2. A gated store buffer for controlling the execution of memory store operations to system memory generated during execution of a sequence of instructions by a processor comprising: ...

    1. Re:More from the Patent by Capt+Dan · · Score: 2

      Gotcha. The patent is definitely being printed for later perusal tonight.

      You have to admit that translating code to native in parallel with actual execution would be pretty cool though.

      --
      Sig:
      Barbeque is a noun. Not a verb.
  47. Re:It means by Evil+Pete · · Score: 1

    Notice that it is 125 words in one sentence!

    I wonder if anyone tried to actually read this on one lungful of air ... supports the thesis that it is the output of some scripting.

    --
    Bitter and proud of it.
  48. The Carmack Connection? by Anonymous Coward · · Score: 0
    And for kicks, this was posted to the Matrox G400 GLX list by John Carmack, back in July. Apparently he was at Transmeta.

    >>> On another note, when I was out at transmeta a few days ago to talk to a bunch of the engineers, Linus expressed personal interest in making sure that 3D works out well on linux. I also talked about a lot of AGP issues, so I am hoping that some usefull seeds were planted. John Carmack

    1. Re:The Carmack Connection? by hedgehog_uk · · Score: 1

      If John Carmack was at Transmeta to talk to the engineers, then he must have signed an NDA. In that case, he wouldn't be able to say anything about what they're up to. I would guess that he could mention the discussion with Linus only because 3D on Linux has absolutely nothing to do with what Transmeta is working on. However, I'm very curious to know why Carmack needed to talk to Transmeta engineers. Maybe Transmeta need a kick-ass game to demonstrate the awesome power of their processor and have hired ID to port Quake 3?

      HH

      --
      Yellow tigers crouched in jungles in her dark eyes.
      She's just dressing, goodbye windows, tired starlings.
  49. Check out the images by ashuntwo · · Score: 1

    If you click on the images box at the top of the application, you can see the images referenced in the application. Among them is a diagram of what this invention does. The same applies to Transmeta's other 3 patents. I have no idea what I'm looking at, but hopefully someone else will be able to say something intelligent about the pictures.

    --
    Andrew Huntwork a-huntwork@uchicago.edu
  50. Partial Store Ordering & emulation by Geekholder · · Score: 1

    [[ Doh! hit return in Subject. Pop stack, try again. ]]

    Well, after wading through the patent my brain hurts, and only one thing is clear.

    Most modern processors implement a PSO mode, for Partial Store Ordering. That is, if you do a series of stores:

    a = 1;
    b = 2;
    c = 3;

    You would expect the processor to store things to the memory system in the order they were run. So the memory location corresponding to "a" should be set to 1 before the memory location corresponding to "b" is set to 2.

    But that is slow, because you're serializing. Modern memory architectures are oriented around cache lines. A cache line is usually 16 to 64 bytes, depending on what kind of CPU it is. To modify a single byte the CPU must first read the entire cache line from RAM, then modify it, then write the whole cache line back. (It's actually a lot more complicated than this, I'm simplifying.)

    To make the stores happen exactly in order the CPU must issue the read for the cache line containing "a", wait for the result, merge the change, and then write the cache line back. Then it issues the read for the cache line containing "b"... and so on. Lots of dead time in the middle.

    A faster scheme is to issue the read for the cache line containing "a", and continue. When the store to "b" is executed, issue the read for the cache line containing "b", even though "a" hasn't finished. This is generically called a "store buffer": it buffers up stores.

    This is where things get interesting, especially if you are in a multiprocessor system. Let's say "b" is currently held in the cache of some other CPU in the system, while "a" is not. When your CPU issues a read for the cache line containing "a" it will have to go all the way to RAM, and might take a while to return. When your CPU issues a read for the cache line containing "b", it will get the answer right away because it is in a cache much closer to you.

    Now you have a problem: you're ready to finish up the store to "b", but "a" isn't ready yet. What to do? That is what Partial Store Ordering means. You are allowed to complete the store to "b" before the store to "a" is finished.
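
    The difference between the two schemes can be put into numbers with a small Python sketch. The latencies (100 cycles for a trip to RAM, 5 for a nearby cache), the one-fetch-issued-per-cycle rule, and the function name `completion_order` are all assumptions picked for illustration, not figures from any real CPU.

```python
def completion_order(stores, partial_ordering):
    """stores: list of (name, line_fetch_latency) in program order.

    Returns a list of (name, finish_time) in the order the stores complete.
    """
    finished = []
    if partial_ordering:
        # Line fetches are issued back to back, one per cycle; each store
        # completes as soon as its own cache line arrives.
        for issue_time, (name, latency) in enumerate(stores):
            finished.append((name, issue_time + latency))
        finished.sort(key=lambda item: item[1])
    else:
        # Strict ordering: each store waits for the previous one to finish.
        clock = 0
        for name, latency in stores:
            clock += latency
            finished.append((name, clock))
    return finished


# "a" and "c" must come from RAM; "b" sits in a nearby cache.
stores = [("a", 100), ("b", 5), ("c", 100)]
```

    With these made-up latencies, strict ordering finishes "a", "b", "c" at cycles 100, 105 and 205, while partial ordering finishes "b" at cycle 6, then "a" at 100 and "c" at 102: faster overall, but "b" lands in memory before "a".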

    In most cases, this is fine. Most of the time you don't care if stores complete out of order, and if it's faster it's better. There are two cases where you do care:

    1. You're implementing a mutex lock
    2. You're a device driver

    A mutex is a data structure to protect some other data structure from simultaneous accesses by different CPUs. It is vital that the stores to the mutex be complete before you start modifying the data structure you wanted to protect. Most modern processors provide a special instruction specifically for mutexes, generally some sort of test-and-set. And generally processors which provide PSO provide a synchronization or barrier instruction so that you can force things to complete in a specific order.

    Device drivers are more problematic. Let's say you have an exceptionally stupid disk controller which expects you to write 512 bytes to a buffer, and then command it to write the buffer to disk. Pseudo-code is:

    for (i = 0; i < 512; i++)
        buffer[i] = mydata[i];

    command = WRITE_TO_DISK;

    If the processor did the stores out of order, such that command is set to WRITE_TO_DISK before all of the buffer[i] locations are set, the result is disastrous: the disk controller will write corrupted data to the disk.

    Again, this is handled in modern processors. Generally there is a bit in the MMU per page which enables PSO for that page. Pages which are mapping hardware devices will have PSO disabled.
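
    The disk controller hazard above can be replayed in a few lines of Python. Everything here is a stand-in: the 4-entry buffer, the `replay` function, and the particular reordering are assumptions chosen to keep the example small, not a model of any real controller.

```python
WRITE_TO_DISK = 1


def replay(stores):
    """Apply (target, index, value) stores in the order the device sees them.

    Returns the data the device would write when the command arrives,
    or None if no command was issued.
    """
    buffer = [0] * 4
    for target, index, value in stores:
        if target == "buffer":
            buffer[index] = value
        elif target == "command" and value == WRITE_TO_DISK:
            return list(buffer)   # the device snapshots the buffer right now
    return None


data = [7, 8, 9, 10]
program_order = [("buffer", i, v) for i, v in enumerate(data)]
program_order.append(("command", 0, WRITE_TO_DISK))

# One possible reordering: the command overtakes the last two buffer stores.
reordered = program_order[:2] + [program_order[4]] + program_order[2:4]
```

    Replaying `program_order` writes `[7, 8, 9, 10]` to disk; replaying `reordered` writes `[7, 8, 0, 0]`, which is exactly the corruption that disabling PSO on device pages is meant to prevent.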



    This finally brings us to the topic of the patent. They provide a store buffer (presumably a LARGE store buffer) where they can speculatively execute instructions in advance. They can perform the instructions in whatever order they want, whatever is fastest. If when they get through with all that, they find that actually the memory they are looking at is marked as PSO-disabled... they throw everything in the store buffer away and do it again on a slower path that maintains ordering.

    Ok. My brain hurts now.

  51. Re:My layman's explanation by taniwha · · Score: 2
    Actually, I think it's slightly different from what you describe (see my full explanation below). They do this so they can translate the code without having to worry about x86 exception semantics (like where the PC is or what values the various registers or flags have). They assume that they won't take any exceptions. If they don't, all is cool: they know the state at the end of the basic block (or whatever unit they are using for translation; hopefully they're doing better than simple basic blocks).

    If they DO get an exception, then they use the described hardware to throw away the side effects of executing the code fragment, and interpret the x86 instructions from the start, doing all the proper instruction semantics. When they get the exception again, this time they know what the PC is and what all the flags and registers are.
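
    The fast-path/slow-path idea can be sketched in Python. This is a guess at the shape of the scheme, not Transmeta's code: the tuple instruction format, the function names, and the use of `ZeroDivisionError` as the "exception" are all assumptions made for the example.

```python
import operator


def run_translated(regs, block):
    """Fast path: run the whole block on a scratch copy of the registers.

    No per-instruction bookkeeping is kept, so a fault tells us nothing
    about where it happened.
    """
    scratch = dict(regs)
    for dest, op, src in block:
        scratch[dest] = op(scratch[dest], scratch[src])   # may raise
    return scratch


def interpret(regs, block):
    """Slow path: one instruction at a time, so the faulting index is known."""
    state = dict(regs)
    for pc, (dest, op, src) in enumerate(block):
        try:
            state[dest] = op(state[dest], state[src])
        except ZeroDivisionError:
            return state, pc      # precise state at the faulting instruction
    return state, None


def execute(regs, block):
    """Try the optimistic translation first; fall back to interpretation."""
    try:
        return run_translated(regs, block), None
    except ZeroDivisionError:
        # The scratch state is simply dropped (the rollback), then the
        # block is redone with full semantics from the original state.
        return interpret(regs, block)
```

    For registers `{"eax": 10, "ebx": 0, "ecx": 5}` and a block that adds `ecx` to `eax` and then divides `eax` by `ebx`, `execute` falls back to the interpreter and reports the fault at index 1 with `eax == 15`.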

  52. Re:Have we really thought this through yet?? by babbage · · Score: 1
    Could this explain why / how Transmeta brings together people like Linus Torvalds and Paul Allen of Microsoft? (That's the guy's name, right? I just know that autistic billionaire honcho guy. Oh this one is a billionaire too? How about that. Learn something every day...).

    What would this mean, exactly? Would you really get to pick your o/s independently of your applications? I could run BeOS, with a bunch of Linux compiled tools, while running MSOffice and maybe some Mac or Amiga software too for good measure? And maybe, just maybe, Java wouldn't be glacially slow? Or would Java be flat out irrelevant? (...iwishiwishiwishiwish...)

    If this is the case -- if this really is possible to the extent that is implied here ...why didn't anyone think of this decades ago?



  53. A few other things by Stradivarius · · Score: 5

    OK, these are just a few other bits of interest I picked out of the patent:

    In a preferred embodiment of the invention, the morph host is a very long instruction word (VLIW) processor which is designed with a plurality of processing channels.

    I'm not going to go into huge detail about VLIW machines (particularly since I don't know all that much about them :-). Suffice it to say that traditional VLIW CPUs fetch multiple instructions at once, and rely on the compiler to ensure that there are no dependencies between instructions in a fetch group (if the compiler can't find x number of independents, it will pad the holes with non-operations, or NOPs). Looking at Transmeta's patent, it appears that rather than a compiler doing this checking, their code-translation software will be doing it on the fly. RISC/CISC machines, on the other hand, typically do this checking in hardware. But Transmeta's reasoning seems to be that doing it in hardware adds complexity, hence lower clock rates, and also doesn't make multiple instruction sets very feasible.
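
    To make the NOP padding concrete, here is a small Python sketch of a bundle packer. It is a deliberately naive, in-order packer and not the scheduler from the patent: the `(name, dest, sources)` instruction format, the function `pack_bundles`, and the rule that only read-after-write and write-after-write conflicts split a bundle are all assumptions.

```python
NOP = ("nop", None, ())


def pack_bundles(instructions, width):
    """Pack (name, dest, sources) instructions into fixed-width bundles.

    Instructions stay in program order. A new bundle is started whenever
    the next instruction reads or rewrites a register written earlier in
    the current bundle, or when the bundle is full. Holes become NOPs.
    """
    bundles, current, written = [], [], set()
    for instr in instructions:
        name, dest, sources = instr
        depends = dest in written or any(s in written for s in sources)
        if depends or len(current) == width:
            bundles.append(current + [NOP] * (width - len(current)))
            current, written = [], set()
        current.append(instr)
        written.add(dest)
    if current:
        bundles.append(current + [NOP] * (width - len(current)))
    return bundles


program = [
    ("add", "r1", ("r2", "r3")),
    ("mul", "r4", ("r5", "r6")),
    ("sub", "r7", ("r1", "r4")),   # needs r1 and r4, so it starts a new bundle
    ("or", "r8", ("r2", "r5")),
]
```

    With a width of 3, `pack_bundles(program, 3)` yields two bundles, `add mul nop` and `sub or nop`. A real scheduler would also reorder instructions to fill those holes, which this sketch does not attempt.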

    Regarding the instruction translation and subsequent caching I mentioned in my previous post, a quote from the patent illuminates the matter a little more:

    The code morphing software of the microprocessor...includes a translator portion which decodes the instructions of the target application, converts those target instructions to the primitive host instructions capable of execution by the morph host, optimizes the operations required by the target instructions, reorders and schedules the primitive instructions into VLIW instructions (a translation) for the morph host, and executes the host VLIW instructions.

    When the particular target instruction sequence is next encountered in running the application, the host translation will then be found in the translation buffer and immediately executed without the necessity of translating, optimizing, reordering, or rescheduling. Using the advanced techniques described below, it has been estimated that the translation for a target instruction (once completely translated) will be found in the translation buffer all but once for each one million or so executions of the translation. Consequently, after a first translation, all of the steps required for translation such as decoding, fetching primitive instructions, optimizing the primitive instructions, rescheduling into a host translation, and storing in the translation buffer may be eliminated from the processing required. Since the processor for which the target instructions were written must decode, fetch, reorder, and reschedule each instruction each time the instruction is executed, this drastically reduces the work required for executing the target instructions and increases the speed of the microprocessor of the present invention.


    Transmeta seems to have an excellent idea here. They're caching optimized translations of the incoming instructions, so rather than have to translate and optimize over and over each time you see that bit of code, you do it once and then just grab it from the cache. Due to the spatial and temporal locality of programs (i.e. the fact that your accesses to instructions are not random, but are localized in loops, etc.), this cache ("translation buffer") will, by the patent's estimate, only fail to have a translation present about once every million executions. So you're doing *one* translation per million or so executions, rather than a million translations like current processors would have to do. Interestingly enough, a scheme like this was brought up as a discussion item in my Superscalar Processor Design class a couple of weeks ago, though my professor used the example of a specialized Alpha decoding/translating x86 and caching the results. One might even write the translations back out to disk as an attachment to the original executable, so that the next time you run the program that's fewer translations you have to do, and eventually you'll have a fully translated version on your hard disk for optimal speed. I guess we'll just have to wait to see if Transmeta does something similar.
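
    The translate-once, run-many idea is essentially memoization keyed on the target address. The Python sketch below shows the shape of it; the class `TranslationBuffer`, the toy `fold_adds` "optimizer", and the `("add", n)` instruction format are all assumptions made for illustration and have nothing to do with the real code morphing software.

```python
class TranslationBuffer:
    def __init__(self, translate):
        self.translate = translate   # the expensive step: decode, optimize, schedule
        self.cache = {}              # target address -> translated callable
        self.misses = 0

    def run(self, address, target_code, x):
        if address not in self.cache:
            # First encounter: pay for the translation once and keep it.
            self.misses += 1
            self.cache[address] = self.translate(target_code)
        return self.cache[address](x)


def fold_adds(target_code):
    """Stand-in optimizer: collapse a run of add-immediates into one add."""
    total = sum(imm for op, imm in target_code if op == "add")
    return lambda x: x + total


tb = TranslationBuffer(fold_adds)
loop_body = [("add", 1), ("add", 2), ("add", 3)]

x = 0
for _ in range(1000):
    x = tb.run(0x400, loop_body, x)
```

    After the loop, `x` is 6000 and `tb.misses` is 1: the body was translated on the first pass and simply reused for the other 999.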

    One embodiment of the enhanced hardware includes sixty-four working registers in the integer unit and thirty-two working registers in the floating point unit. The embodiment also includes an enhanced set of target registers which include all of the frequently changed registers of the target processor necessary to provide the state of that processor; these include condition control registers and other registers necessary for control of the simulated system.

    It seems this new chip is going to have a lot of registers. As Cartman would say, sweeeeeet!

    The patent also provides some sample C code, the corresponding x86 assembly, and some sample optimizations the Transmeta system may perform. It's a little more than half way down the page, if you want to look, just scroll until you see code :-)

    1. Re:A few other things by Salamander · · Score: 1

      >Transmeta seems to have an excellent idea here. They're caching optimized translations of the incoming instructions, so rather than have to translate and optimize over and over each time you see that bit of code, you do it once and then just grab it from the cache.

      NexGen already did this, just as you describe, i.e. the i-cache stored instructions in their translated form. I don't know if this approach was retained after AMD bought NexGen, though.

      --
      Slashdot - News for Herds. Stuff that Splatters.
  54. Re:Hardware assists for binary-to-binary translati by Geekholder · · Score: 1

    I gather they're being so aggressive about speculative execution that they're defining a whole bunch of new exceptions.

    I.e. I speculatively executed a bunch of stores and didn't keep them in order, but when I got around to looking at the MMU page I found it was marked to not do PSO so now I throw it all away and start again, in order this time.

  55. FX86 processor with speculative pipes by Anonymous Coward · · Score: 0

    I agree with the fx86ish emulating processor, but the problem is: how do you make it fast.

    I think the answer is in the way they overstate:
    "until a determination that a sequence of translated instructions will execute without exception or error".

    Translation: What if, somehow, they can make lots of parallel pipes cheaply, and each of these pipes speculates on different outcomes of the same code based on different initial states.

    The moment the correct initial state is known, the outcome is now known without executing the instructions, and all other pipes (the erroneous ones) dealing with the same code are flushed.

    Those flushed pipes are now free to speculate further down the code.

    If you've got a lot of cheap pipes, this is how you'd make emulation fast!

    Chris

  56. Re:Patents - Hard vs. Soft by Komodo · · Score: 1

    Usually what people are slamming is software patents. Some people will also slam hardware patents, but most people here have a libertarian attitude towards Intellectual Property - you only deserve it if you work hard at it, and mix enough of your labor with the device. The software patents we slam are total BS in this respect.

    Also, hardware is more tangible - you can't pirate hardware just by looking at it.

  57. Hmmm? by ripcrd · · Score: 1

    Do I sense a causality loop in the patent description? Surely this technology is intended to tear a hole in the space-time continuum. Q will be so pissed.

    Later,
    Rip

    --
    --Somewhere there is a village missing an idiot.
  58. Re:Sure Looks Related To Their Other Patents by brainwasher · · Score: 1

    I agree with your analysis, and this leads into another area: speculatively executing "roughly" translated code would take a long time if there are a few sets of it and only one target host. Either the host would have to be really fast (10 gigahertz range) to make this viable, or have multiple complete execution units, ( akin to multiple integer units in a Pentium, but more complex ).

    Since a 10 gigahertz CPU is unattainable with current technology, the host CPU would have to be like an SMP machine - e.g. 16 copies of the exact same CPU, but all on one piece of silicon.

    If the individual execution units are too complex they'd be expensive (to design and make) and not able to do high clock rates, so we're looking at relatively primitive execution units, probably with the complexity of, say, a Motorola 68000, but cranked up to, say, 1 GHz.

    This would sit well with making a cheap and cheerful processor, able to undercut the price of Intel stuff by a wide margin, due to a much larger yield per silicon wafer. The problem with this is who would actually want your CPU unless it runs x86 stuff - and this is where the translation software and translation-assisting hardware (i.e. this patent) comes in. I doubt that this extra hardware would take much silicon real estate.

    Now I can see how it can actually run stuff faster - even if 10 of the 16 execution units ran translated code which was crap, there's still 6 sets of code that were OK. If one unit runs at 1 GHz, we have in effect a 6 GHz execution.

    All sounds good in theory - I wonder how it actually turns out, or didn't turn out, seeing as this patent was filed in 1996 and I remember Linus saying something about a change of direction last year.

    Let's look at the economics of introducing a new processor into the market. For the demand to be there, it would have to satisfy these conditions:
    1) run x86 stuff
    2) _much_ cheaper to produce than current Intel stuff. Look at AMD and Cyrix - they have nice processors, but not really that much cheaper.
    3) runs stuff at a comparable speed

    If the above conditions are satisfied, then it would sell well... however, the 3rd condition could also be:
    3) run stuff much faster

    which would allow Transmeta to charge about the same as Intel and make a very nice profit.

  59. Sounds Like IBM's Super OS by Real+Timer · · Score: 1

    IBM was rumoured to be working on a microkernel OS that would run Windows, OS/2 and AIX applications, using Intel emulation in a new version of the PowerPC. Maybe Transmeta can realize this on their processor.

    --
    Changes aren't permanent, but change is.
  60. Re:It means .. by JoeyLemur · · Score: 1

    My question is... will the Transmeta processor allow you to compile directly for the "core" instruction set? Emulation is all fine and good, but... :)

  61. Re:FPGA - Field Gate Processors by mvw · · Score: 1
    They can be configured to handle a single task MUCH faster than a single (or a bunch) of pentiums ever could. Thats why the crypto box can crack codes in under a second, it can't do much else mind you but it can crack DES in a fraction of the time 1 or even 100 pentiums could.

    But who designs that configuration? Those "des brute force crackers" or "chess chips" or "graphics processors" are all logic designs done by experienced human designers.

    While I think it is possible to write configuration compilers that analyze an application and produce some logic configuration to implement it, I don't believe that those results can rival the abovementioned special designs in quality.

    Given a DES cracking software as input, it must come up with something that rivals the Deep Crack design.. unlikely.

  62. Re:My layman's explanation by Stradivarius · · Score: 1

    >Wouldn't you incur some sort of performance hit

    Well, yes. What Transmeta is essentially doing is an emulation of multiple foreign instruction sets running on top of their native instruction set. Thing is, by doing it the way they are, they may have been able to get the hardware sufficiently faster to more than make up for it. See my previous post.

  63. Re:Wait... by Anonymous Coward · · Score: 0

    he has a Doctorate.. but thats it..

  64. Non-serial dynamic execution by Anonymous Coward · · Score: 0

    This sounds a lot like the non-serial dynamic execution experiments that Cray was working on a few years back. The idea was, you take a serialized instruction set from a set host architecture, using an xts instruction translation table, and it uses active data piping to off-load the instruction processing to a non-core logic unit. The idea behind it was to delinearize the flow of data throughout the chip(s). This would have had a lot of potential, except for the fact that it was completely incompatible with the uniform memory architecture that SGI was working on when they bought Cray. The meta-structure for this architecture was used during the production of the MAJC chip by Sun, and who knows...maybe Transmeta's chip? :)

  65. Re:What it Really does by Sloppy · · Score: 3

    Can't think of a situation (except for processor bugs like the F00F one), where the processor hangs in mid of some instruction, stumbled over some microcode gone crazy. So I simply see no benefit of a rollback of an instruction, sorry.

    Happens all the time, although there are already ways of dealing with it. Consider virtual memory. Having to redo an instruction, because some exception occurred in the middle of it, isn't very .. um .. exceptional.

    But I can't think of how this relates to the Transmeta speculations. Well, actually I can, but my theory is so wild-ass that everyone would laugh at me.

    Oh, what the hell. This is as good a place as any for me to make a complete fool of myself... I think Transmeta is making a display circuit that instead of fetching each pixel from a frame buffer, executes a little program for each pixel. The program must execute incredibly fast since the result must be available before the horizontal scan goes to the next pixel.

    There, I said it. Now everyone can back away from me quietly, and then point and laugh when they reach a safe distance.


    ---
    Have a Sloppy day!
    --
    As copyright owner of this comment, I authorize everyone to defeat any technological measure which limits access to it.
  66. Re:YES, that's what I got by TPx · · Score: 1

    Uh... no.

    The Pentium Pro IS a x86 processor...

  67. Re:FPGA - Field Gate Processors by Guy+Harris · · Score: 2
    i believe it even says somewhere in the patent that it is more software based than not

    The patent says that "emulation software" would translate x86 or whatever code into "native Transmeta" code (see other postings of mine in this thread, many of which amount to "software translation, dammit, not hardware translation").

    As such, I don't know why this need involve any FPGAs at all - the patent doesn't seem to describe a processor that can be configured at the hardware level to run arbitrary instruction sets, it appears to describe a processor that lets software (presumably running on that processor) translate other instruction sets into the native instruction set making optimistic assumptions about what the code being translated does, get exceptions if those assumptions are invalid (with the exception handler presumably doing more pessimistic translations and retrying with the new code), and not have to worry about irreversible state changes having been made by overly-optimistically-translated code.
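
    A minimal sketch of that optimistic-then-pessimistic retry, in Python rather than anything taken from the patent. The fault class, the two "translations" and the dictionary standing in for machine state are all made up for illustration:

```python
class SpeculationFault(Exception):
    """Raised when an optimistic assumption turns out to be wrong."""


def optimistic_add(state):
    # Hypothetical fast translation: assumes the 8-bit add never overflows.
    state["carry"] = 0                 # state is modified before the check...
    total = state["a"] + state["b"]
    if total > 0xFF:
        raise SpeculationFault("8-bit overflow")  # ...so a fault leaves it dirty
    state["a"] = total


def conservative_add(state):
    # Hypothetical slow translation: handles wrap-around and carry explicitly.
    total = state["a"] + state["b"]
    state["a"] = total & 0xFF
    state["carry"] = 1 if total > 0xFF else 0


def run_block(state, fast, slow):
    checkpoint = dict(state)           # snapshot of the last committed state
    try:
        fast(state)
        return "fast"
    except SpeculationFault:
        state.clear()
        state.update(checkpoint)       # undo whatever the fast path changed
        slow(state)                    # retry with the pessimistic translation
        return "slow"
```

    The point of the checkpoint is that the fast path is allowed to scribble on state before it knows its assumptions hold; the rollback is what makes that safe.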

  68. Re:Huh? by pod · · Score: 1
    I don't see what all the confusion with this particular excerpt is...

    permanently storing memory stores temporarily stored

    You are storing in permanent memory[,] stores that are currently in temporary memory.

    How's that confusing?

    --
    "Hot lesbian witches! It's fucking genius!"
  69. Re:Hypocrites! by Anonymous Coward · · Score: 0

    The part that gets me confused is the worship that is ladled out for Transmeta, apparently because they hired Linus Torvalds to work for them.

    It's like the whole "Slashdot Community" is a bunch of schoolgirl groupies, weak at the knees to hear their darling Linus' name spoken.

  70. Babelfish translation by Anonymous Coward · · Score: 0

    Apparatus wants you use in processing processor system having host CAPABLE OF executing first
    INSTRUCTION Seth you assist in running INSTRUCTIONS OF different INSTRUCTION Seth which is
    translated you the first INSTRUCTION processor Seth by the host including circuitry wants
    temporarily until storing MEMORY net curtain generated determination that sequence OF translated
    INSTRUCTIONS wants executes without exception or processor error on the host, circuitry wants
    permanently storing MEMORY net curtain temporarily stored when determination is larva that sequence OF translated INSTRUCTIONS wants executes without exception or processor error on the host, and circuitry wants eliminating MEMORY net curtain temporarily stored when determination is larva that sequence OF translated INSTRUCTIONS wants gene-guess

  71. Patent?!? by Anonymous Coward · · Score: 0

    Oohh.. wait a minute here... Does this mean the GPL'ers will now all of a sudden start supporting the idea of PATENTS since Lord Linus is now involved in patented "whatever"?

  72. It's a trap! by MostlyHarmless · · Score: 1

    This patent is a hoax. It is part of an elaborate plot to bring to an end civilization as we know it. Let me clarify:

    Transmeta posts a patent on the gov't web site. It is shrouded in gibberish to prevent anybody from actually interpreting it (and thus ruining the plot).

    An observant /. reader finds it, wonders "wtf???", and being the dutiful netizen he is, forwards it to /. for analysis.

    Slashdot posts his comment. Naturally, thousands of viewers immediately check out the government's patent site. This is compounded by the message board and the decoding attempts. A side effect of this is that babelfish is also flooded (see neuroid's post).

    The government's site shuts down. The NSA thinks that hackers did it, and traces thousands of hits back to slashdot.

    Slashdot's site mysteriously disappears a day later.

    This is only the beginning of a great war that expands to include the entire armed forces pitted against the world's top hackers. This ends when a misguided cracker accidentally misses his intended target of fbi.gov, and accidentally triggers a nuke.

    China and Russia get pissed and start launching nukes. America launches more nukes to get back at them. Iraq launches a nuke just to get in on the fun. In the end, everybody dies.

    --
    Friends don't let friends misuse the subjunctive.
  73. Further clarification by Stradivarius · · Score: 5

    Some notes for those who may want a more in-detail explanation:

    The beginning of the patent ("claims") is essentially just a list of things that all modern, superscalar, out-of-order processors do, and saying "hey we do this too".

    Basically, out-of-order machines execute instructions out of their program order (hence the name :). This means that if your code sequence is A,B,C, the CPU may actually execute it such that B is done executing before A. But B's results cannot be written to system memory or the architected registers ("machine state") until you know that instruction A didn't generate an exception. That's so that you can provide precise exception handling, i.e. so that the OS can service A's exception and then resume execution with B. If you don't wait to do your memory store, then you'll end up executing B twice, which you didn't intend. So that's what all the talk in the beginning of the patent about memory stores, etc, is about.
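
    To make the A,B,C example concrete, here is a toy model of in-order commit, assuming a simple reorder-buffer-style rule: instructions may finish in any order, but results only reach machine state from the head of the program order. The function name and arguments are invented for this sketch:

```python
def retire_in_order(program, finish_order, faulting=None):
    """Return the instructions whose results reach architected state."""
    done = set()
    committed = []
    next_to_retire = 0
    for name in finish_order:              # instructions finish out of order
        done.add(name)
        # Only the oldest unretired instruction may commit its results.
        while next_to_retire < len(program) and program[next_to_retire] in done:
            head = program[next_to_retire]
            if head == faulting:
                # Precise exception: nothing younger than the faulting
                # instruction has touched machine state.
                return committed
            committed.append(head)
            next_to_retire += 1
    return committed
```

    With program order A,B,C and B finishing first, B's result just waits; if A then faults, nothing has been committed and the OS sees a clean state.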

    If you get past all the uninteresting stuff like that in the beginning, you'll find the following:

    "The present invention overcomes the problems of the prior art and provides a microprocessor which is faster than microprocessors of the prior art, is capable of running all of the software for all of the operating systems which may be run by a large number of families of prior art microprocessors, yet is less expensive than prior art microprocessors. "

    The idea, it seems, is that rather than making complex hardware to execute the instructions and perform speed enhancements, they're doing speed optimizations in software. Which in turn allows very simple hardware (which in turn should translate to really high clock speeds). It seems that Transmeta's bet with this is that the penalty incurred by doing software rather than hardware optimizations is offset by the increase in clock speed and decrease in hardware cost.

    Using such an approach should also make running multiple instruction sets a much easier task. Currently processors do their instruction decoding in hardware. But if Transmeta has managed to do this decoding (fast) in software, then they can just add a little more software to allow multiple instruction sets. They also seem to be caching the translations of non-native to native instructions in a memory structure of some sort, so that they minimize the redundant emulation computations.
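
    A rough sketch of what caching translations could look like, assuming a plain lookup table keyed on the guest address and the guest bytes found there. The class, the key choice and the toy translator are my own inventions, not details from the patent:

```python
class TranslationCache:
    def __init__(self, translate):
        self._translate = translate   # function: guest bytes -> native ops
        self._cache = {}
        self.misses = 0

    def lookup(self, guest_address, guest_code):
        # Keying on the bytes as well as the address means modified guest
        # code gets retranslated instead of reusing a stale entry.
        key = (guest_address, guest_code)
        if key not in self._cache:
            self.misses += 1
            self._cache[key] = self._translate(guest_code)
        return self._cache[key]


def toy_translate(guest_code):
    # Hypothetical translator: one made-up native mnemonic per guest byte.
    return tuple(f"native_op_{byte:02x}" for byte in guest_code)
```

    The expensive translate step runs once per distinct block; every later execution of the same block is a dictionary hit.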

    Actually, to address gupg's comment, it also seems that they should not need *any* special compiler support, because they can run stuff that was compiled for any of the various instruction sets they choose to support. So they themselves should not need to do compiler work. I would guess that the reason they're hiring all sorts of compiler folks is that they need people to do the aforementioned software instruction translation, and the people best suited for that are compiler people since they work on the instruction level all the time. Most other programmers don't have to deal with anything other than high-level languages, and so would not be particularly well suited to doing what Transmeta is doing.

    Anyway, hopefully this explained things a bit more to everyone. My reading and explanation of the patent was pretty quick since I have to go to class in a few minutes. I'll finish reading the patent afterwards and add anything else I think you might like to know.

    Cheers,
    Stradivarius

    1. Re:Further clarification by Anonymous Coward · · Score: 0

      I still don't get it! If they do part of the emulation in software, what chip does it run on? Which instruction set does the meta compiler use?

  74. This proves it! by MassacrE · · Score: 0

    They found the entrance to the Hollow Earth! :)

  75. isnt it obvious (what transmeta means) by Anonymous Coward · · Score: 0

    I was reading the patent thing, and excuse me if someone else has said it before, but wouldn't their name mean TRANSform/METAmorphose? I just remember reading everyone trying to figure out what it means and coming up with all these direct meanings of parts of the words in Latin and stuff. But I bet this is what it really stands for.

  76. The other half of their "memory" device patent by Anonymous Coward · · Score: 0

    Well.. It looks to me like this is the other half of their memory device, the one that anticipates bad memory addresses. These two parts together must form the product, or at least some part of it. My guess is failsafe data storage and retrieval, which will be worth more than Microsoft. As storage approaches terabytes (SGI source code will support 100,000,000 terabytes under Linux) the percentage of failure becomes shockingly large, even that caused solely by cosmic rays. Good luck to them.

  77. Paul Allen's PARC days? by Anonymous+Freak · · Score: 1

    Uh, he was co-founder of Microsoft. He never worked at PARC. He went from prep school to Harvard to Microsoft, to illness, to investing in four billion little tech companies (and the Portland Trailblazers, and the Seattle Seahawks, and the Jimi Hendrix museum...)

    --
    Another non-functioning site was "uncertainty.microsoft.com."
    The purpose of that site was not known.
    1. Re:Paul Allen's PARC days? by neilv · · Score: 1

      yes, hello, duh. I meant, trying to recapture the heady days of PARC by re-hiring many of the originals at Transmeta.

  78. Re:Awright!! by funcused · · Score: 1

    Now we can run Windows on a Sparcstation!!

    No, now we can run SparcLinux, AlphaFreeBSD, and i386Windows all on the same box :)

    -funcused

  79. Re:Glad I wasn't the only one to get it. by anactofgod · · Score: 1

    LOL...agreed! Nothing funnier than a bunch of "softies" trying to make heads or tails of a *real* hardware spec...albeit one obscured by American legalese...

    Hey! Now that is an idea for unbreakable encryption!

    Exercise for the hardware literate...search the Patent database for other Transmeta patents that have been granted, and look for a pattern in the documents...

    The question that I have is what uP tech is Transmeta looking to use to get these fantastic speed gains? How about a VLIW architecture? I seem to recall reading about a chip (called "Viper"??? created by an English company???) a couple of years ago that was supposed to be RISC-spanking, CISC-go-crying-to-mamma fast...

    Any other idle speculation? Hmmmm???

    ...anactofgod...

    {Personally, I believe the post that stated...
    "Actually, it's a way to run any application for any processor and any OS, straight from Emacs."
    except I suspect that it will emulate vi on a native Emacs processor...}

    ***BROAD GRYNN***

    --

    ---anactofgod---

    "Equal opportunity swindling - *that* is the true test of a sustainable democracy."
  80. Multiprocessor systems of the future by Pingo · · Score: 1

    A patent covers just the very essence of an invention and not the complete story. This makes it very difficult to really deduce Transmeta's real intentions.

    However to me this looks like something that could be very very useful if you have a multiprocessor system. It would probably be much easier to get software running efficiently on multiprocessor systems if you sometimes can perform a ROLLBACK.


    //Pingo

    --
    --- Linux or FreeBSD, it's like blondes or brunettes. I like both. ---
  81. Actually... by Scott+Wood · · Score: 1
    Presumably, if the optimized batch of native instructions failed, it would only retry once, with no optimization so it could accurately determine the non-native instruction that caused the fault. At that point, it could raise an exception.

    As for running the instructions "for real", it seems as if it does run them "for real" the first time, but holds the results in a temporary buffer which is only committed when the sequence completes successfully. Otherwise, the common case (no fault) would be slow, since there would be reexecution of the same code.
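
    The "temporary buffer which is only committed when the sequence completes" can be sketched like this. It is a toy model with a dict standing in for memory; the class and method names are assumptions of mine, not the patent's terminology:

```python
class GatedStoreBuffer:
    def __init__(self, memory):
        self.memory = memory     # committed ("permanent") state
        self.pending = {}        # stores held back until the block succeeds

    def store(self, address, value):
        self.pending[address] = value        # memory itself is not touched yet

    def load(self, address):
        # The running block must see its own uncommitted stores.
        if address in self.pending:
            return self.pending[address]
        return self.memory.get(address, 0)

    def commit(self):
        self.memory.update(self.pending)     # block finished without a fault
        self.pending.clear()

    def rollback(self):
        self.pending.clear()                 # fault: as if the block never ran
```

    In the common case the block runs once and commits; only a fault pays for the rollback and the slower retry.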

  82. Re:Have we really thought this through yet?? by Anonymous Coward · · Score: 0
    Or would Java be flat out irrelevant?

    JVM implementations, Java interpreters, and JITs might be irrelevant, but Java the language, the bytecode, and the libraries/APIs would likely become much more relevant -- Java really is a much nicer development environment than is C or C++, it has just been lacking in the runtime environment.

  83. Re:Compiler Support Issues by LarsG · · Score: 1

    However, any new aggressive architecture requires a lot of compiler work. To make your new applications fly, you would want to compile them using the native processor's compilers.

    Not always. It all depends on where the bottlenecks are in the current state of technology.

    If the native IS is very wordy and the bottleneck is between CPU and RAM, you would get better performance by using a CISCy IS.

    --
    If J.K.R wrote Windows: Puteulanus fenestra mortalis!
  84. Different instruction sets by Anonymous Coward · · Score: 1

    But for that I don't need to mess with different instruction sets etc.

    You wouldn't if you were just making an X86 emulator. However, they were just using the X86 as an example.

    It seems to me that they have developed circuitry to determine when an emulated instruction has done something which the underlying hardware cannot handle.

    For example, the x86 processors have both memory ports and i/o ports. The 680x0 processors do not have i/o ports. If you were emulating a 386 i/o instruction on a 68000, you would have to do all sorts of stuff in order to get the desired results, because the i/o hardware is not built into the 68000.

    If you were emulating a 6502, you wouldn't have that problem, but you would have memory addressing problems. The 6502 can only handle addresses up to $FFFF before it returns to $0000.
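
    On a host with wider addresses, an emulator has to reproduce that wrap-around itself. A minimal sketch, assuming the emulator keeps guest addresses in ordinary integers (the helper name is made up):

```python
ADDRESS_MASK = 0xFFFF  # the 6502 has a 16-bit address space


def next_address(address, step=1):
    # Mask after every address computation so $FFFF + 1 wraps to $0000
    # instead of running off into host memory.
    return (address + step) & ADDRESS_MASK
```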

    It sounds like they have developed circuitry to tell them when the emulated instruction has succeeded, and when it has exceeded the Transmeta CPU's capabilities.

  85. Re:check Paul Allen's resume by Anonymous Coward · · Score: 0
    Paul Allen didn't work at PARC. When do you think he was there?
    • Lakeside Programmers Group (until graduating from high school in 1971)
    • Enrolled at Washington State University
    • Started Traf-O-Data with Bill Gates
    • TRW (circa 1973)
    • MITS (1975)
    • Microsoft (1975)
    • Diagnosed with Hodgkin's disease (1982)
    • Left Microsoft (1983)
    • Asymetrix (1989)
    (Dates are approximate.)
  86. It's a hardware-assisted just-in-time compiler... by chadmulligan · · Score: 2
    Any emulator has to translate one instruction in the emulated instruction set into one or more instructions in the target instruction set. For instance, the Mac's 68K emulator does this... but it also caches previously translated 68K instruction sequences, so it can re-execute them without retranslating each instruction.

    This patent (and, yes it's in English - but patent-lawyer English) apparently implies a hardware-based mechanism to store translated instructions in an on-chip cache and then execute them afterwards, hoping that at that point other tricks like pipelines and multiple instruction units will be able to do their thing.

    In a normal emulator, you get relatively little benefit from the normal on-chip caches and pipelines. This would seem like an interesting way to speed up an x86 (or even PPC) emulator.

    And if you think there's little use for this, think "Java Virtual Machine". Think "hardware-assisted just-in-time compiler"...

  87. Re:Hey monkey boy by Anonymous Coward · · Score: 0

    Or get a few friends in and bash out a copy of Hamlet...

  88. programmable processor by Recbo · · Score: 1

    Ten times faster, a hundred times faster with java and perl, a thousand times faster with awk and pipes, five to ten times faster with graphics. OK?

  89. Re:My layman's explanation by xHost · · Score: 1

    wouldn't decompiling cause a lot of errors anyway ?

  90. Re:YES, that's what I got by TPx · · Score: 1

    I'm sorry, but that sounds exactly like vaporware...

    How do you achieve emulation speed faster than native speed? (of course clock speed being equal)

  91. Re:Filed: July 24, 1996 by nd · · Score: 1

    I'm not a lawyer, but I'm pretty sure that just because a patent is filed in 1996 doesn't necessarily mean it will be _published_ at that time.

  92. Re:Summary by Guy+Harris · · Score: 2
    On a side note, why spend all of this effort to be x86 compatible when you have the source code?

    Umm, because they don't have the source code to all the, say, x86-architecture programs they might want to run?

    IMO open source software is going to make hardware architecture very competitive.

    "Is going to make" isn't the same as "has made". Yes, typing make to get "native-Transmeta" machine code for your application may not require all the work that this patent involves, but it involves, instead, waiting for open source versions of the programs they're interested in showing up, and they may not be willing to wait for that.

  93. Not just for emulation.. by Nelson · · Score: 1

    Am I the only one who sees some non-emulation uses for this technology? I'm thinking multiple processors.

  94. Re:What it Really does by mvw · · Score: 1
    Rollback can be very useful when pipelining instructions because there is no guarantee of precisely where a fault issued from.

    Are we talking about rather rare events here (which is what the name trap suggests), or about fairly frequent speculative execution attempts, where one has to roll back invalid branches (that is, the ones that will not be executed) oneself, without support from the speculative execution hardware?

  95. Re:Patent?!? - Read Here to Understand the issue by Komodo · · Score: 1

    What most supporters of the GPL have a problem with is bogus SOFTWARE patents. Software and hardware patents are not the same thing. Most technological countries (besides the USA) do not even grant software patents.

    The problem with (bogus) software patents is that the people who try to get them haven't worked hard enough to get them, haven't contributed anything new, etc. The patent office just doesn't understand that these are trivial non-inventions.

    Now, some supporters of the GPL will also fight hardware patents, but they are probably doing it on a general sense that intellectual property is evil.

    There. A rational explanation and distinction, without flamebait or polemic. You can almost always find one if you take the time to look ;)

  96. Wait a Sec by pnatural · · Score: 1

    Now someone correct me if I'm wrong, but isn't the type of complexity they're talking about here a Bad Thing? I mean, how many times have you heard someone bemoan the CISC architecture and how complicated it is? And in the next breath, you typically hear about the wonders of RISC (or more accurately, RISC-influenced). This sounds like another layer. What about the axiom, "Do one thing, and do it well"?

    Granted, from an application and OS perspective, it really doesn't matter what the microcode does, but increased complexity typically comes at a price -- it costs more to design, to produce, and to run.

    OTOH, if their objective is to supplant existing CPU architectures, it may well lead to one of those magic Star Trek boxes -- you know, the ones that can read alien computers just by being in the same room.

    1. Re:Wait a Sec by Anonymous Coward · · Score: 0

      CISC isn't evil, it's just a scheme for compressing instructions that happens to be very difficult to pipeline.

  97. Transmeta Patents Dynamic Recompilation? by Effugas · · Score: 2

    Hate to be the pain in the ass demanding people be a bit consistent in their distaste for patents, but Ye Olde UltraHLE on Win32 *appears* to do a good chunk of automagic rewriting of processor instructions intended for another architecture.

    I doubt it has the same kind of exception handling as we see described in this patent, though. Them TransMetans do some funky stuff ;-)

    Yours Truly,

    Dan Kaminsky
    DoxPara Research
    http://www.doxpara.com

    Once you pull the pin, Mr. Grenade is no longer your friend.

  98. Re:My layman's explanation by Anonymous Coward · · Score: 0

    "circuitry for permanently storing memory stores temporarily stored when a determination is made that a sequence of translated instructions will execute without exception or error on the host processor" leaves me to believe that this process will recognize previously translated instruction sets and execute immediately, without translation... --Lothar needs to login --

  99. Hey monkey boy by Anonymous Coward · · Score: 0

    Come sit over here with me, I've got an extra banana. We can make some margaritas and ponder the universe. Maybe even write some code.

  100. Re:I see one problem.... by Guy+Harris · · Score: 2
    what would Transmeta do when Intel introduces a new opcode in their Pentium IV?

    Add more code to their binary-to-binary translator software to handle that new instruction. The processor isn't doing the translation, except to the extent that it runs the translation software (see other postings of mine in this thread for the quote from the patent that speaks of "emulation software").

  101. Just a Pentium with rollback? by Coppit · · Score: 1
    Intel's newest chips do this sort of translation, from CISC x86 instructions to RISC-like microinstructions, which are then run in the superscalar, pipelined core. What's new here?

    • The source instructions don't have to be x86, and maybe can be changed on the fly. (Okay, they didn't say that, but I think it's implied.)
    • This sort of thing is great for emulating stuff like operating systems, where you need to be able to trap instructions that run in protected mode before they do anything wrong. (Think VMWare.)
    • It sounds like speculative execution, except the whole processor's state can be rolled back. (Actually, the whole system, since memory is included.)

    Okay... Now I'll get out of computer architecture, and back to my home in software. ;)
    ---------------------------------------------- ---------
    "For I am a Bear of Very Little Brain, and long words

  102. Re:Just wondering by Anonymous Coward · · Score: 0

    Or they're using an Intel chip (or AMD, Cyrix, etc) and the 50 MHz slowdown is PCI limited.

  103. Re:worthless by Anonymous Coward · · Score: 0

    Good point. Regarding the future of Linux, if Linus were independently insanely rich he'd be able to devote all of his time to Linux. At Transmeta he works 99% on Linux, 1% building Transmeta buzz (Basically telling people, "Yeah I work there. Can't tell you anything"). IPO before the Pyramid scheme collapses and he's set for life.

  104. Re:Filed: July 24, 1996 by fwad · · Score: 1

    Oh I'm sure that some at Transmeta are sitting there reading /. and having a bit of a laugh. I mean, they only have to cough and message boards / irc get filled up with talk about them. Question - if you were given the choice - would you sign an NDA and get told what they're doing but then not be able to tell others, or would you prefer not knowing and carry on thinking about it? The dreaming's a lot more fun IMHO.
    --

    --
    -- Kernel Panic: Error reading /dev/caffeine
  105. I think its pretty much spelled out:Fast Emul. by Anonymous Coward · · Score: 0

    Fast emulation. They mention over and over "we have something already built that runs x86 code faster than anything else that runs x86 code, and at a lower price point". They mention Postscript and Java, and generalize saying that the thing could do any sort of emulation/interpretation that needed a state machine, but the crux of the document is about running x86 code really _fast_. Seems like what they've done is build a hardware/software hybrid emulator. The software piece takes x86 code and packages it into VLIW-like instructions for their "morph" processor. The hardware takes the VLIW instructions and tries them out. If no exceptions are generated, it stores that, in hardware, as a "valid translation", and updates the physical and emulated machine state to reflect it. However, if any sort of exception _is_ raised, all the information to back out the instruction bundle is in the metal (notice how they talk about commit and rollback). It's all spelled out. Register renaming, commit buffers. Their first patent was on efficient storage and retrieval of translated ops, if I recall correctly. A "cache", if you will, of VLIW packages that correspond to known good translations of emulated code. This patent specifically says "we have a non-x86 processor that runs emulated x86 code faster than anything else, including any native x86 processor". Wow.

  106. Other Uses for Rollback by Anonymous Coward · · Score: 0

    1) You can pre-evaluate the most likely branch of a conditional statement (or both branches), and throw the results away if your guess was incorrect.

    2) Good for shared-memory multiprocessing. You can run straight out of local cache ignoring memory synchronization, because if you become out of sync (i.e. someone else locked memory you were using), you can go back to the last synchronization primitive and restart after the cache has been resynced.

    3) Big rollbacks are useful for debugging, as you can undo/replay from a crash to see what went wrong.

    4) You can do garbage-collection by stashing intermediate results and then unwinding your program.

    5) From a future-of-computing standpoint, true reversible computers (with reversible instruction sets) use less power and generate less heat, since nearly all heat dissipated by a chip is the result of erasing bits (which increases the entropy of a system).
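
    Point 2 is essentially optimistic concurrency: work from a private copy, and redo the work if someone else got there first. A single-threaded toy illustration, where the version counter and the interfere hook (which simulates another processor) are assumptions made for the sketch, and where the check-and-write is not actually atomic as it would need to be on real hardware:

```python
class VersionedCell:
    def __init__(self, value):
        self.value = value
        self.version = 0          # bumped on every committed write


def optimistic_update(cell, compute, interfere=None):
    """Apply compute() to the cell, retrying if it changed underneath us."""
    attempts = 0
    while True:
        attempts += 1
        seen_version, seen_value = cell.version, cell.value  # checkpoint
        new_value = compute(seen_value)      # work on the local copy only
        if interfere is not None:
            interfere(cell)                  # simulate another processor, once
            interfere = None
        if cell.version == seen_version:     # still in sync: commit
            cell.value = new_value
            cell.version += 1
            return attempts
        # Out of sync: throw the local result away and restart from the
        # checkpoint with the fresh value.
```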

  107. "super"processor by Hard_Code · · Score: 1

    Looks like this could be the master processor in a multiprocessor system in which processors can run heterogeneous instruction sets...ideal for asynchronous multiprocessing. Whatever it is, it sounds cool.

    --

    It's 10 PM. Do you know if you're un-American?
  108. Fscking moderators screw up again... by Anonymous Coward · · Score: 0

    OK, why the fsck was that post moderated down to TROLL?

    Answer - because it dares to make a less-than-glowing comment about Our Lord And Saviour Torvalds, obviously.

    Sheesh. There was me thinking that slashdot was news for nerds, not Worship for Torvalds.

    The guy made a PERFECTLY valid point, so why moderate it down?

    *sigh*

  109. True -- the will process producers! YAHAHAHAHA by Anonymous Coward · · Score: 0


    urrgghhhh

    blrkghgh urk urk urk


  110. Behind the technology - the business by Bloob · · Score: 2

    I concur, and I would advise reading of the DETAILED DESCRIPTION if you scroll the page down a bit.

    What is interesting is that they appear to have created a CPU capable of running applications designed for one of many target systems (Intel x86/Pentium, PPC, Postscript and Java even) by buffering the instructions, optimising them, and then checking their execution for errors before execution occurs. Quite brilliant and mind-bogglingly complicated.

    Note the business angles hinted at: speed and optimisation come at significant cost; the cost of producing any microprocessor is out of reach of most companies (implying a mass market); a large number of applications are written for many targets (Windows, Java, etc.); and there are problems associated with traditional thinking with regard to optimization and parallel processing.

    To create a microprocessor which overcomes the above at viable cost to both manufacturer and customer would be enormous!

    Just think of it, you're running the Transmeta CPU which is running some OS, and running Office 2000 through it and knowing that the CPU will trap any problems before they occur! This is a hardware VM-Ware!

    Ooh I'm drooling already!

    James Green

  111. BeerWolf? by Anonymous Coward · · Score: 1

    Can I make one of those "BeerWolf" clusters with it?

  112. Portuguese NOT portugese!!! by Anonymous Coward · · Score: 0

    I have seen the word Portugese written in Slashdot so many times that I can't take it anymore. Memorise PORTUGUESE, not Portugese.

  113. Re:Actually, it doesn't know there won't be an err by Alik · · Score: 1
    This patent exists because they can't determine ahead of time if there will be an error. This patent, in conjunction with one of their other patents, provides a method for them to muddle on in the usual case of no error and still have a means of rolling back in the less usual case of an error.

    In conjunction with what someone else said about branches counting as exceptions, this is a start on the answer I'm looking for, but I'm not satisfied. Specifically, there's still a problem of infinite loops. There's no way, AFAIK, to prove that the translation of a given series of target instructions wouldn't send the host into an infinite non-branching loop. Therefore, the host will "muddle on" forever or until the user hits the reboot key. Worse yet, this looping would be totally dependent on the state of registers and memory, and thus is very likely to be irreproducible.

    Someone else said that it was a matter of executing a few instructions to see if they screwed up, then committing the memory ops to permanent store, and continuing with that incremental process. However, how does one know at which point the "cause" of an exception or error occurred? It could be that the instruction you executed five minutes ago was the problem. Given that their checkpointing buffer must be finite, this can't possibly catch all translation errors. It's also likely that they can't tell in advance which ones won't be caught, and thus you're gambling every time you run a program. They're pretty good odds, but it's still gambling.

  114. Re:Translation? by Anonymous Coward · · Score: 0

    386 processors and higher have a memory management unit which sets up memory segments in protected mode so that applications can't stomp on kernel space. If a memory access goes awry, the MMU tosses an access violation and dispatches the Deep Shit Handler, which pops up a useless window under WinSlug, or barfs core under *n*x. Processors which want to emulate X86 CPUs have to know when to write to RAM and when to not scribble where they shouldn't. This patent appears to cover such emulation.

  115. Nitrozac explains TransMeta. by Matter+Eating+Lad · · Score: 1

    Nitrozac has the real answers, check out her TransMeta Secret Lab.

    She also revealed they have discovered the entrance to the Hollow Earth... I believe her.

  116. Re: How do you do this all and make it cheaper? by mindlace23 · · Score: 2

    Because the only instructions sent to the processor (after optimization) are instructions that are known to succeed.

    The process of optimization is based in software as well- the instruction translation (code morphing, they call it) software is written in code native to the VLIW chip.

    I.e., there are no speculative instruction paths on the VLIW chip. There are something like 4? on the PIII.

    In other words, the chip can have roughly a quarter as many transistors as a comparable x86 chip.

    This means:

    • higher yield in fabrication (less cost)
    • More chips per wafer (less cost).

    Simpler chips can also be run stably at higher clock frequencies than more complex chips of the same manufacturing process. (.18 micron, .22, etc)

    Also, the optimized instructions have 70% or fewer operations than the original instructions.

    I'm getting some of this from their earlier patent.

    --
    ~mindlace
  117. Re:What it Really does by GnrcMan · · Score: 1

    By fault I mean exception or error (which is exactly what the patent says...guess that's why I don't write patents!). That is, by nature, "exceptional". The problem, to expand further, is that checking for an exception slows down execution of the "normal" code path. That's why traps are used. The processor then only has to check at certain user (by that I mean compiler, usually) defined places. The problem is that this makes things more complex (and much slower, at least on the Alpha) when the exception actually occurs.

  118. Re:YES, that's what I got by Foogle · · Score: 1

    Yes, it is an x86 processor, but IIRC there is some emulation going on there. I think it's mostly in the 16-bit code though. I could be totally off base here. In any case, they could run at a faster speed than an actual Pentium without any problem. Clock rate isn't really a measure of speed (look at Alpha v. Athlon v. PIII). The Transmeta processor could be FASTER than that PIII and run the PIII instructions in emulation. If it was done well enough, the emulated speed would still be faster. Of course, this is all speculation at this point.

  119. Re:Hypocrites! by Anonymous Coward · · Score: 0

    Oh, yes, definitely, this person is a troll because he spoke against Linus.
    Sheesh. What a bunch of anal retentives here...

  120. Re:Wait... by Mija+Cat · · Score: 1

    Doesn't own Linux.
    Doesn't own TransMeta.
    That clear things up for ya?
    Me either!

    Mija Cat

    --
    Yes, that's really my e-mail. Don't change a thing.
  121. Filed: July 24, 1996 by jmacleod · · Score: 1

    Notice the patent was filed in '96? It means this thing is three years old folks. No new info over the previous patents. No valid interpretation on TransMeta's current projects. Bet they're having a laugh at us now...

    1. Re:Filed: July 24, 1996 by jmacleod · · Score: 1
      To clarify: by comparing the date of the filing of this patent against the dates of filing of Transmeta's other patents, this is the patent which was filed first.

      The implication to me is that this patent describes what Transmeta did 3 years ago.

      But I have to agree with the chap who suggested it's a TRANSlating METAprocessor...

  122. Re:A HERRING! by Jason+W · · Score: 1

    I'm not too familiar with patents, but don't they have to show the patent office that they have a just claim to the patent? If so, then they must have gone to a lot of trouble to plant a red herring!

  123. I think the Instruction Translation Cache is it. by Anonymous Coward · · Score: 2

    I read it all, and, I believe, understood it.

    From my understanding of Intel and AMD CPUs, what they do is convert the x86 instructions into groups of RISC instructions, which are then run by the core processor.

    What the TransMeta CPU does is CACHE the results of the translations in a multi-megabyte on-chip buffer.

    So, while a Pentium III takes up to 20 CPU cycles to decode some of the more complex instructions, the TransMeta CPU takes even longer, but makes a better optimization. But once it's translated it's buffered away so that if it's needed again soon, it takes ZERO cycles to decode.

    The TransMeta CPU then justifies the time cost by taking GROUPS of instructions, optimizing the hell out of them for as long as it needs, then filing the result for future use.

    All the exception handling stuff is needed in case an exception happens in the middle of a group.

    Say, for example your program contains the instructions "A,B,C,D".

    The Transmeta CPU translates this into "1,2,3".

    Further, let's say an error occurs at step "2".

    The CPU then rolls back (read up on transaction processing) to the state before "1" executed.

    Then it translates the instructions one at a time until it recreates (or fails to recreate) the exception.

    It then commits the changes to the emulated registers, and reports the exception at the point where it occurs in the original code.
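
    The rollback-and-replay idea described above can be sketched roughly as follows. This is only a toy model under stated assumptions: the instruction format, the `TargetFault` exception, and the `run_block` helper are all hypothetical names made up for illustration, and nothing here is taken from the patent itself.

```python
# Toy model of "run a block speculatively, roll back on a fault,
# then replay one instruction at a time". All names are hypothetical.

class TargetFault(Exception):
    """Raised when a (simulated) target instruction faults."""


def step(state, instr):
    """Execute one toy target instruction against a register dict."""
    op, reg, val = instr
    if op == "add":
        state[reg] = state.get(reg, 0) + val
    elif op == "div":
        if val == 0:
            raise TargetFault(instr)
        state[reg] = state.get(reg, 0) // val
    else:
        raise ValueError(f"unknown op {op!r}")


def run_block(committed, block):
    """Return (committed_state, index_of_faulting_instruction_or_None)."""
    working = dict(committed)           # speculative working copy
    try:
        for instr in block:             # stands in for the optimized translation
            step(working, instr)
        return working, None            # no fault: commit the whole block
    except TargetFault:
        pass                            # fault: throw the working copy away

    # Rollback: restart from the last committed state and replay the block
    # one instruction at a time, committing after each one.
    working = dict(committed)
    for i, instr in enumerate(block):
        try:
            step(working, instr)
        except TargetFault:
            return committed, i         # precise state at the faulting instruction
        committed = dict(working)
    return committed, None
```

    For example, running `[("add", "a", 5), ("div", "a", 0), ("add", "a", 1)]` from `{"a": 10}` ends with the state as it was just before the second instruction, and reports index 1 as the faulting one.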

    Put simply, this thing will KICK INTEL'S ASS. Possible speed improvements of over 10 times.

    And the same principles could be applied to any other CPU instruction set.

    This patent does not appear to cover emulating multiple instruction sets at once, but nothing stops it from being applied in that manner. It would be just as hard as doing it with a 'Prior Art' CPU design.

    Nor does it seem to be FPGA-related, but I suppose FPGAs could be used somewhere in it.

  124. Re:What's the Application? by dufke · · Score: 1

    >Speculation what apps need mega CPU cycles?

    3D games!!!

    This is true even with a 'GPU' like the GeForce, since future games are gonna want to do some very realistic physics/AI.

    -

    --
    __
    Comment submitted. There will be a delay before you understand what you posted.
  125. Re:And what about Rice's Theorem by alexor · · Score: 1

    I agree. Sounds to me like either: 1. The code checking will offset the speed increase of the fast "checking" proc. 2. There is no "check" that will know if the code will cause an error or exception. I.e., nice patent, but it ultimately won't stand up in court. I don't think Transmeta is into proc design - I think they are trying to make something that will, independent of proc design changes, allow hardware emulation of one system on another. Basically a generic device that allows a faster new processor to emulate an older one with a different instruction set. Could destabilize long-term relationships between hardware manufacturers and OS developers (like Wintel) by allowing OpenSource distributions to emulate proprietary systems using whatever chip they want.

  126. Gain speed with recursion? by korpiq · · Score: 1

    P = A Processor
    TMP = TransMeta Processor

    TMP(P) = Ptmp, which is faster and cheaper than P

    TMP is a P, thus

    TMP(TMP(P)) = P2tmp, which is faster and cheaper than TMP(P)

    Thus, P ntmp approaches indefinitely immediate execution speed for zero price, likely being indistinguishable from perfection for very large values of n.

    --

    I think, therefore thoughts exist. Ego is just an impression.
  127. In laymans terms by xmedar · · Score: 1

    The Register story about the patent can be found here :-

    http://www.theregister.co.uk/990929-000011.html

    --
    Any sufficiently advanced man is indistinguishable from God
  128. Veil, indeed. by theGnome · · Score: 2

    I think what it really means is that after reading that, you'll end up with a big headache. Am I mistaken or was that really one huge sentence?

    *digs around for his aspirin*

    - dom

    --

    - gnome

    What's up, Mr Jones?
    1. Re:Veil, indeed. by Psyicide · · Score: 3

      Yes. It seems that Transmeta has perfected a run-on sentence processor, able to mutate any reasonable statement into a more obscured but equivalent sentence until comprehension is completely lost by the reader.

    2. Re:Veil, indeed. by Imperator · · Score: 1

      Actually, their real innovation is storing the sentence in memory until they're sure it's not understandable.

      --

      Gates' Law: Every 18 months, the speed of software halves.
    3. Re:Veil, indeed. by MemRaven · · Score: 1
      Actually, to the best of my knowledge, this is not their fault: really old patent law stipulates that in order to make patents understandable to laypeople, they must be phrased in a single sentence.

      Of course, it has provided the opposite result.

      But then again, IANAL.

  129. It's like it... by Jimhotep · · Score: 1

    It executes instructions internally till it knows
    that a set of instructions will not fail.
    Then runs them for real.

    Even has internal memory for any memory writes
    the instructions may execute.

    Just wonder how it acts if the instructions do
    fail, just keep trying till the cows come home?

  130. Re:whoah by Adelvillar · · Score: 1

    Plain English: A super fast parallel processor, that translates instructions compiled for any OS and CPU architecture. The catch here is it will remember what went right thus not repeating any previous failed operation. Looks like a "learning chip", the question is will it serve another chip or will it be running the whole show?

    --
    "In God we trust, all others must bring data" - W. Edwards Deming
  131. Intel's gotta be worried... don't they? by jdub! · · Score: 1

    Just think about the amount of time we've spent imagining all the incredible vistas brought by this mystical, fascinating secret!

    I know I'm intrigued - incredibly intrigued. Surely Tom Waits is too - "What's he doing in there? What's he building in there?"

    But for all our time spent, we've only got something to gain - what about Intel? Surely they've been trying to get in on the secret? Maybe they already know. I'd hate to be either them or TransMeta at the moment, one trying to squeeze in and the other trying to shut everyone out. Paranoia mania.

    Let's just hope TransMeta aren't developing yet another browser or auction site, eh?

  132. My layman's explanation by sporkboy · · Score: 5

    It appears that the flow will be like this.

    1. Set of instructions comes into processor in one instruction set (like x86).

    2. This device stores the data for this series of instructions temporarily

    3. The device translates the (x86) instructions into its own internal instruction set and figures out an ordering that will not cause it to have exceptions.

    4. The device retrieves the temporary data and "fills in the blanks" in the "inner" processor to get results, the so called "permanent storage" is probably the inner processor's instruction cache.

    5. The data is cleared from the interim area once it's acted upon.

    1. Re:My layman's explanation by Thauma · · Score: 1

      So in other words it decompiles the instructions and recompiles them into its own native instruction set on the fly...
      Sounds like a dynamic recompilation emulator wired in hardware to me.

    2. Re:My layman's explanation by redactor · · Score: 1

      So, this roughly does the same thing for CPUs that an interpreted language (perl, python, etc) does for code? Wouldn't you incur some sort of performance hit like you do in an interpreted language?

  133. Transmeta Patents Profit by giant.sammich · · Score: 1

    This is obviously a patent for making excessive quantities of money.
    Good-bye Apple/IBM/Alpha/Amiga, hello cross-platform Linux. Read the very end of the patent file.

    --
    If I could get Lightwave for Linux, I'd give my Windows PC to the lowest bidder.
  134. What didn't turn out? by Christopher+B.+Brown · · Score: 2
    I think that the "in practice" that turned out badly has been the price wars between Intel, AMD, and Cyrix.

    Two years ago, there was room for some serious profits on CPUs as they were the most expensive component in a computer system.

    That has changed such that the most expensive component is commonly the hard disk, followed (with MSFT software) by the OS, with the CPU in third or fourth place.

    With that change, this leaves Transmeta without the viable "IA-32 market" they may have expected to have.

    Given the drop in pricing, it is not clear that there is room to get vast decreases in pricing.

    Of course, considering that Transmeta is fabless, and doesn't directly have a sales organization for CPUs, the goal might have been to construct technology to allow building cheaper IA-32 chips, and then license it to AMD or Intel or Cyrix.

    I'm not sure any of them are necessarily interested to the tune of $Billions...

    --
    If you're not part of the solution, you're part of the precipitate.
  135. From what I understood... by Anonymous Coward · · Score: 0

    It seems that one of Transmeta's goals is to produce a very fast, very cheap Pentium replacement that will hopefully break Moore's law (which I suspect Intel could do if it really wanted) and advance the art by several years in one leap. It will consist of a simple but extremely fast processor with lots of registers, a super-fast cache, and a software translation layer that optimizes as well as translates. Imagine a Transmeta chip that is 5 times faster than a 1ghz PIII and a third of the cost. We can but hope! Overall, Transmeta will have a single chip that they could sell to every competing hardware manufacturer in the world; Apple, HP, DEC, IBM, etc, etc which will outperform their own chips and do it more cheaply. I don't think they are focusing on the idea of a chip that can pretend to be a Pentium, an Alpha, a PowerPC, etc all in one box. That is probably possible, but the market is limited compared to the massive sales of single-use chips. Steve.

  136. more details by jkauth · · Score: 1
    reading the "DETAILED DESCRIPTION" reveals lots more (basically, it looks like they have got a hardware implementation of DEC's "FX!32"):

    • optimize/translate code FRAGMENT to new architecture
    • run and test for errors, i/o conflicts
    • cache the resulting optimized code in temp memory
    • store the resulting optimized code (presumably to disk)
    • next time this code fragment is run, run the optimized version immediately!

    the whole point of the thing is that they are able to reduce the instruction set on their chip to a VERY primitive set, and therefore reduce the number of gates on the chip. fewer gates = less heat, so you can crank up the frequency of the chip to probably several gigahertz!

    so what you have is a cool (as in chilly) CPU running FEWER instructions at a HIGHER frequency!

    cool thing to note: they have a working x86->TMCPU box, AND IT IS FAST!

    ... As a comparison, one embodiment of the present invention designed to run all available X86 applications is implemented by a morph host including approximately one-quarter of the number of gates of the Pentium Pro microprocessor yet runs X86 applications substantially faster than does the Pentium Pro microprocessor or any other known microprocessor capable of processing these applications.

    ... The use of a translation buffer to hold translated instructions allows instructions to be recalled without rerunning the lengthy process of determining which primitive instructions are required to implement each target instruction, addressing each primitive instruction, fetching each primitive instruction, optimizing the sequence of primitive instructions, allocating assets to each primitive instruction, reordering the primitive instructions, and executing each step of each sequence of primitive instructions involved each time each target instruction is executed. Once a target instruction has been translated, it may be recalled from the translation buffer and executed without the need for any of these myriad of steps.

    ... Some of the additional registers allow the use of register renaming to lessen the problem of instructions needing the same hardware resources

    ... The target (or shadow) registers are connected to their working register equivalents through a dedicated interface that allows an operation called "commit" to quickly transfer the content of all working registers to official target registers and allows an operation called "rollback" to quickly transfer the content of all official target registers back to their working register equivalents. The gated store buffer stores working memory state changes on an "uncommitted" side of a hardware "gate" and official memory state changes on a "committed" side of the hardware gate where these committed stores "drain" to main memory. A commit operation transfers stores from the uncommitted side of the gate to the committed side of the gate. The additional official registers and the gated store buffer allow the state of memory and the state of the target registers to be updated together once one or a group of target instructions have been translated and run without error.
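
    One way to picture the gated store buffer in the passage quoted above is the toy model below. It is a sketch under assumptions, not the patent's design: the class name, method names, and the dict standing in for main memory are all hypothetical, and the load forwarding in `load` is my own addition so that speculative code can read back its own stores.

```python
# Toy model of a "gated store buffer". All names are hypothetical.

class GatedStoreBuffer:
    def __init__(self, memory):
        self.memory = memory      # dict of addr -> value, stands in for main memory
        self.uncommitted = []     # stores on the uncommitted side of the gate
        self.committed = []       # stores past the gate, waiting to drain

    def store(self, addr, value):
        """Speculative store: held back, not yet visible in main memory."""
        self.uncommitted.append((addr, value))

    def load(self, addr):
        """Read the newest value: uncommitted, then committed, then memory."""
        for pending in (self.uncommitted, self.committed):
            for a, v in reversed(pending):
                if a == addr:
                    return v
        return self.memory.get(addr, 0)

    def commit(self):
        """Move all uncommitted stores across the gate."""
        self.committed.extend(self.uncommitted)
        self.uncommitted.clear()

    def rollback(self):
        """Discard all uncommitted stores; main memory was never touched."""
        self.uncommitted.clear()

    def drain(self):
        """Write committed stores out to main memory, oldest first."""
        for a, v in self.committed:
            self.memory[a] = v
        self.committed.clear()
```

    The point of the split is that `rollback` is cheap: nothing speculative has reached main memory, so discarding the uncommitted side is enough to undo a failed translation.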

    ... a typical operation of the code morphing software of the microprocessor when furnished the address of a target instruction by the application program is to first determine whether the target instruction at the target address has been translated. If the target instruction has not been translated, it and subsequent target instructions are fetched, decoded, translated, and then (possibly) optimized, reordered, and rescheduled into a new host translation, and stored in the translation buffer by the translator.

    ... When the particular target instruction sequence is next encountered in running the application, the host translation will then be found in the translation buffer and immediately executed without the necessity of translating, optimizing, reordering, or rescheduling.
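
    The translate-once, reuse-afterwards behaviour of the translation buffer quoted above amounts to a cache keyed by target address. A minimal sketch, assuming a hypothetical `TranslationBuffer` class and a caller-supplied `translate` function (neither name comes from the patent):

```python
# Minimal sketch of a translation buffer as a cache keyed by target address.
# The class and the translate callback are hypothetical.

class TranslationBuffer:
    def __init__(self, translate):
        self._translate = translate   # slow path: decode, optimize, schedule
        self._cache = {}              # target address -> host translation
        self.misses = 0               # how often the slow path actually ran

    def lookup(self, target_addr):
        """Return the host translation, translating only on the first visit."""
        if target_addr not in self._cache:
            self.misses += 1
            self._cache[target_addr] = self._translate(target_addr)
        return self._cache[target_addr]
```

    Calling `lookup` on the same address a thousand times runs `translate` once, which is why a slow, heavily optimizing translator can still pay off for hot code.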

    ... If a set of translated host instructions is executed without generating an exception, then the new working register values determined at the end of the set of instructions are transferred together to the official target registers

    ... However, one embodiment of the invention designed to run X86 programs utilizes a translation buffer of two megabytes of random access memory.

    ... If the comparison with the A/N bit in the TLB shows that the operation, however, affects an I/O device, then execution causes an exception to be taken; and the translator produces a new translation one target instruction at a time without optimizing, reordering, or rescheduling of any sort.

  137. It means .. by gupg · · Score: 5
    I think it means (from the abstract) that they are going to provide compatibility with other processors by converting their instructions to their host processor. So, the story unfolds. Obviously, they have a super fast processor and will provide for running Intel etc instructions on their processors.

    The patent itself is more concerned with making sure that the conversion process occurs without any exceptions taking place .. or actually holding the processor state and waiting for a sequence of instructions to make sure no exception etc happens and then executing it on the host processor.

    They obviously also need strong compiler support for such a processor which explains all the software and compiler people they have been recruiting.

    Fun, fun, fun .. who says Computer architecture is dead !

    Sumit

    1. Re:It means .. by noom · · Score: 1

      How do you think superscalar processors work? They analyze dependencies between instructions -- if two (or more) instructions are independent (one instruction doesn't depend on the result of another) then they can be executed simultaneously.
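
      As a rough illustration of that independence check, here is a sketch that models each instruction as a destination register plus a set of source registers. The tuple format and the `independent` helper are hypothetical, and real hardware does this with register renaming and scoreboarding rather than anything like this code.

```python
# Sketch of a pairwise independence check between two instructions.
# Each instruction is modelled as (dest_register, set_of_source_registers).

def independent(a, b):
    """True if b can issue alongside a (a comes first in program order)."""
    dest_a, srcs_a = a
    dest_b, srcs_b = b
    raw = dest_a in srcs_b    # b reads what a writes (read after write)
    war = dest_b in srcs_a    # b overwrites something a still reads
    waw = dest_a == dest_b    # both write the same register
    return not (raw or war or waw)
```

      So `r1 = r2 + r3` and `r4 = r5 + 1` can run together, while `r1 = r2 + 1` followed by `r3 = r1 * 2` cannot.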

      BTW, you should read about things like partial evaluation and specialization to see what's meant by code optimized for a given application. For instance, suppose I have a function:

      foo(a,b) = a + b;

      If I specify that the param 'a' always equals 0:

      foo2 = foo(0,_);

      then foo2(b) = b. This is all that specialization does. Clearly, the compiled code for foo2() is far simpler than that of foo(). (IOW, calling foo2(x) runs faster than calling foo(0,x)).
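
      The foo/foo2 example can be written out in Python as a sketch. One caveat: `functools.partial` only fixes the argument, it does not simplify the function body the way a real partial evaluator would, so `foo2_specialized` below is written by hand to show what the specializer's output would look like.

```python
from functools import partial


def foo(a, b):
    return a + b


# Fix a = 0. This binds the argument but still performs the addition.
foo2 = partial(foo, 0)


# What a partial evaluator could reduce foo2 to, written out by hand:
# with a known to be 0, "0 + b" simplifies to just "b".
def foo2_specialized(b):
    return b
```

      Both `foo2(7)` and `foo2_specialized(7)` give the same answer as `foo(0, 7)`; the specialized version just does less work to get there.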

      Modern functional languages do this stuff pretty well, and there's been some research on applications in operating systems (for instance, optimizing code paths for specific devices).

      BTW, when people are discussing compilers, you should realize that the word "optimizing" is meant as a funny joke, nothing to take seriously. An "optimizing compiler" would be more accurately called a "compiler which usually generates better code than a naive approach," but for marketing reasons, the former sounds better.

      -NooM

    2. Re:It means .. by DirkGently · · Score: 1

      why don't they? because they have to provide legacy support for architecture that should have died along with the 16 bit horse it rode in on. Doesn't it kind of suck that your PIII and Athlon are still known as "i386" compatible?

      Dirk

      --

      I keep trying to pick fights, but I can't shake this Excellent karma.

    3. Re:It means .. by ecampbel · · Score: 1

      If people are forced to recompile, why do you need to emulate the instructions of another processor? The goal of emulating another processor is so that recompilation isn't necessary.
      Perhaps the compiler teams are writing their own compilers so that if vendors want to achieve maximum speed, they can recompile, otherwise they will be running in emulation mode. This is how the MacOS deals with 68k code and how the upcoming Merced processor will work.

      --

      Sig goes here
    4. Re:It means .. by Anonymous Coward · · Score: 1
      If people are forced to recompile, why do you need to emulate the instructions of another processor?

      Not conventional compilation, but meta-compilation. Instead of starting with "human text" source code and compiling it into instruction set specific executable code, you start with instruction set specific code and meta-compile it into custom hardware that is dynamically reconfigurable, resulting in very fast execution on hardware that is essentially optimized for each particular application.

      Or at least that's one interpretation....

  138. worthless by Anonymous Coward · · Score: 0

    Sorry, but I'll believe it when I see it. The "buzz" over transmeta is ridiculous. They probably haven't done ANYTHING yet, and are just going to IPO to cash in just before folding the company when it is discovered they actually haven't done anything.

  139. Switching contexts in a rapid state??? by Jerenk · · Score: 2

    From what little I've read of it (actually now read most of it), it appears to be a way to allow for fast context switching between processor modes. Since everyone is speculating that their chip will emulate other chips (instead of providing their own ISA), this just goes in hand with that.

    I also see a lot of stuff about pointer manipulations. Maybe this is at the core of how they will attempt this (i.e. keep all "processes" in memory with their own vm space and then "swap" 'em out when necessary).

    In my rough perusal, I may have missed some very important details. =)

    Justin

    --
    Mu. P.S. The address you see is real. =)
    1. Re:Switching contexts in a rapid state??? by Jerenk · · Score: 1

      It is desirable to provide competitive microprocessors which are faster and less expensive than state of the art microprocessors yet are entirely compatible with target application programs designed for state of the art microprocessors running any operating systems available for those microprocessors.

      Oh, boy, if they can truly do this, I want to be the first one to have one of these babies. Shudder.

      BTW, all the memory and pointer stuff has to do with error detection and correction. That way, it can detect overflows in a consistent manner.

      What I want to know is how they can do this without sacrificing speed??? But, they have some of the brightest minds and money (Paul Allen's) in this company.

      Watch out.

      Justin

      --
      Mu. P.S. The address you see is real. =)
    2. Re:Switching contexts in a rapid state??? by TummyX · · Score: 1

      Who else do they have working on this?

      I would also like to know how the hell can they do this without sacrificing speed? It's unimaginable. Does it not sacrifice speed in the same way java isn't supposed to sacrifice much speed?
      Maybe they have heaps of read aheads and instruction predicting circuits or something :/

    3. Re:Switching contexts in a rapid state??? by BDKR · · Score: 1

      Actually, I thought it had something to do with branch prediction. What you termed instruction prediction. There normally is a penalty involved with a failed branch, but if this approach can lessen or completely kill the possibility of such, then there would be a resultant performance gain.

      Rock hard, ride free

  140. They give performance numbers! by robj · · Score: 1
    Did anyone else miss this???

    As a comparison, one embodiment of the present invention designed to run all available X86 applications is implemented by a morph host including approximately one-quarter of the number of gates of the Pentium Pro microprocessor yet runs X86 applications substantially faster than does the Pentium Pro microprocessor or any other known microprocessor capable of processing these applications.

    Jeez, that seems pretty impressive, if true. Imagine an Athlon-equivalent ("faster than any other known microprocessor") for one-quarter the price....

  141. Microcode++ by musique · · Score: 1

    This sounds a lot like a microcode processor (processor+apparatus) except that this processor is more fault tolerant and the microcode controller (analogous to the apparatus) is external to the processor instead of integrated into it.

    Interesting.

  142. It means by FooBarSmith · · Score: 4

    Either the people at Transmeta really need to take a course in basic English and especially punctuation, or they have a random patent generator that strings together random combinations of processor, executed, processing, circuits, determination and stores.

    My money is on the latter, maybe Linus whipped together a Perl script in his lunch hour?

    --
    stty erase ^H
    1. Re:It means by Bocephus · · Score: 1
      Dude, welcome to patent law. It's all like that, especially in the realm of electronics.

      It's why I'm not touching that field with a ten-parsec pole.


      --
      "Even genius needs a competent technique."--Robert Fripp
    2. Re:It means by hruntrung · · Score: 1

      it kinda reads like they copied it from the Critique of Pure Reason.

  143. What this patent says by praxis · · Score: 1

    What I gather is that they are not patenting a processor to run other processors' instruction sets. The host processor, which does run the other instruction sets, already exists (maybe a previous patent of theirs, though it might have been patented by someone else), and they are creating an apparatus that allows that host processor to store the state of the processors it emulates. Well, that's my short take; I have not yet looked through the referenced patents to get more information, though.

  144. Glad I wasn't the only one to get it. by Anonymous Coward · · Score: 0

    Good to see that not everyone here is an idiot, and that some people who frequent this 'News for Nerds' site can actually read technical language.

    This flood of "Patent confusing documents" is damned annoying.

  145. Re:A HERRING! by Rational · · Score: 1

    Well, the name of the company was originally "Transmeat", until the guy who registered with InterNIC made a typo, so you could well be right...

    --
    "Be nice, veer left, and never stop thinking" Iain Banks - Walking On Glass
  146. And what about Rice's Theorem by for(;;); · · Score: 1

    They seemed to be flinging those "check whether the code will cause error or exception"s around pretty liberally. I wonder how deep that will go (and how much computation time it would take!); and wouldn't sandboxing, not "code checking", really be what's necessary (if one wants security) when swapping in new processor states?

    --

    "Whatever happened to fair use?"
    -- Duff-Man
  147. Re:Doesn't this violate theory? by jmacleod · · Score: 1
    only if P!=NP. (Granted, if they've just proved P=NP, a lot of mathematicians are going to be looking for new projects! And a lot of Traveling Salesmen are going to find the best route between their cities ;-)

    I doubt that's what's implied. However, I'm often wrong so I can learn from as many mistakes as possible. :-)

  148. whoah by miahrogers · · Score: 0

    could someone please post a description of what this thing is? Just reading the description makes my head spin
    char *stupidsig = "this is my dumb sig";

  149. More parallelism.. execution in multiple CPU's? by Chexum · · Score: 1
    The trick here looks to be finding out which things can be done parallel without causing an exception.

    IMHO it's rather about making the common case (i.e. no exceptions via memory fault, etc) go fast, and having hardware glue to find the non-fast case quickly. Basically, the idea of caching followed to the end of it. :)

    And another guess.. Maybe these "exceptions" can mean that other "instruction streams" have modified/could use the exact same in-memory data another instruction sequence is using... On another CPU?

    I can smell it... Can they build a system into which you can simply add a few more Transmeta CPUs, and seamlessly increase the performance of existing code? Maybe even in a network? You buy another few Transmeta boxes, and you suddenly can encode MPEG2 in real time? Maybe this partly depends on the compiler's smartness to let the processor avoid conflicting stores a bit longer, but they control that too :)

    (It would also explain why Linus is so sure we will have more appliances, simply because it will make the meta-box (meta-transputer?) go faster...)

    --
    "Ten years from now, they could do it in a few seconds." -- The Racketeer of the Hellfire Club, 1993, Phrack 42
  150. It helps if you run it through babelfish... by neuroid · · Score: 4
    instrument for US one processing system t a capable processing host executing a first instruction ajust ajud functioning instruction a different instruction ajust that est translates first instruction ajust processing host including provisory circuit for storing memory armazen to ger until a determination that a sequence translates instruction execut without exception or error processing host, permanent circuit for storing provisory memory armazen stored when a determination est faç that a sequence translates instruction execut without exception or error processing host, and circuit for eliminating provisory memory armazen stored when a determination est faç that a sequence translates instruction to ger an exception error in the processor.

    That's from English->Portuguese->English

  151. How much it will cost & how fast it will go by robj · · Score: 1
    OK, screwed up my last comment a bit... but everyone in this thread seems to have missed the clear mention of price and performance!

    As a comparison, one embodiment of the present invention designed to run all available X86 applications is implemented by a morph host including approximately one-quarter of the number of gates of the Pentium Pro microprocessor yet runs X86 applications substantially faster than does the Pentium Pro microprocessor or any other known microprocessor capable of processing these applications.

    Imagine that... they really could be talking about a huge price/performance leap. I would be worried if I were Intel or AMD... or Motorola... or IBM... or anyone making chips with lots of on-die instruction wrangling logic.

  152. Patents by Anonymous Coward · · Score: 0

    I was wondering about something here. Usually when there is a patent story on /. there will be a vocal group of people badmouthing patents, say if Apple patents something or if the company that patents a device won't play nice with the Open Source community. But if Transmeta patents something it's all good and fine? Isn't Transmeta one of Paul Allen's companies?

  153. Re:Just wondering by Creepy · · Score: 1
    Not to add to speculation, but it could very well be... About 2 weeks ago, there was a newsburst on Apple Insider mentioning that Connectix corp, maker of VirtualPC emulation software, had a hardware emulation card in the works, capable of running a PC in emulation roughly 50MHz slower than the primary chip.

    Now it doesn't take a genius to see that
    a) they may be using the transmeta chip (and it could well be limited by PCI bus speed)
    or
    b) they're developing a 'similar technology'

    anyone else care to speculate?

  154. Mixed Mode Interpreted by shemnon · · Score: 1

    From what I read, it sounds like the newest generation of Java JITs with mixed-mode execution from Sun/IBM/Symantec. Part of the execution is interpreted, but some of it goes to native code and back. The difference, and I believe the novelty, is that once they (Transmeta) can prove that execution will be without anomaly they throw out the old code and stick with the JITed version of the code. Of course it's all in silicon and with real microprocessor instructions and not virtual bytecodes.

    --
    --Shemnon
  155. probably not by Anonymous Coward · · Score: 0

    Running multiple architecture binaries may be possible, but an operating system is MORE than just the CPU it runs on. Each application (Unix, Mac, Be, Windows, Linux, etc) would need the FULL SET OF SYSTEM LIBRARIES INSTALLED AND WORKING. While this may very well be possible (as evidenced by VMWare), each new "flavor" of OS would take another XXX megabytes of RAM and disk space. The resulting OS would be HUGE, BLOATED and SLOW (no matter how "fast" the CPU is, it will still be limited by memory and disk subsystem bottlenecks). And ultimately, no one would care. Most people are happy in their ignorance of the Mac and Linux and simply run Windows; for them this would mean nothing. No, I think there is something much larger at work. I'm not sure what their angle is, but it will have to be far reaching to really mean something. I can't wait, but I'm not building my hopes up on any one thing.

  156. Re:FPGA - Field Gate Processors by Salamander · · Score: 1

    As someone else already pointed out (but it bears repeating): FPGA = Field Programmable Gate Array.

    The problem with FPGA-based computing is that the reprogramming time is very large, so it only yields performance gains when that cost can be amortized over an extremely large number of operations using the current "instruction set". In some specialized cases this can yield amazing performance, and it's way cool, but it's not suitable for general-purpose computing. There is work going on to reduce reprogramming time and/or allow partial reprogramming, but AFAIK there haven't been any major breakthroughs recently.
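
    To put rough numbers on the amortization argument above, here is a back-of-the-envelope sketch. The function name and every figure fed to it are made-up assumptions for illustration, not measurements of any real FPGA.

```python
def break_even_ops(reconfig_ns, cpu_ns_per_op, fpga_ns_per_op):
    """Smallest operation count at which reprogramming the FPGA pays off.

    The FPGA wins once  reconfig + n * fpga < n * cpu,
    i.e. once           n > reconfig / (cpu - fpga).
    All arguments are integer nanoseconds (hypothetical values).
    """
    saving_per_op = cpu_ns_per_op - fpga_ns_per_op
    if saving_per_op <= 0:
        raise ValueError("the FPGA must be faster per operation to ever break even")
    # Integer division plus one gives the first n strictly above the ratio.
    return reconfig_ns // saving_per_op + 1
```

    With an assumed 100 ms reprogramming time, a 1000 ns CPU operation and a 100 ns FPGA operation, `break_even_ops(100_000_000, 1000, 100)` comes out at a little over 111,000 operations before the reprogramming cost is recovered; any workload that switches "instruction sets" more often than that loses.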

    BTW, I think there were some good discussions of this in an earlier /. article about StarBridge Systems. You might want to give it a look.

    --
    Slashdot - News for Herds. Stuff that Splatters.
  157. Another invalid patent by Salamander · · Score: 1

    There seem to be two critical parts of what they're trying to claim:

    >having a host processor capable of executing a first instruction set to assist in running instructions of a different instruction set

    ...and...

    >including circuitry for temporarily storing memory stores...

    The first part is simply translation of one instruction set into another, something that has been done many times in the past both in software and in hardware including many x86 clones. Nothing new here.

    The second part is simply speculative execution, a technique that has been used in bunches of processors already.

    To give them credit, they at least attempt to address the non-novelty of the first part in the "prior art" section, but they seem oddly silent regarding prior art relating to the second part. IMNSHO this patent could never hold up in court.

    As happens way too often nowadays, the patent office has screwed up and allowed someone to patent what they did not invent. The total inadequacy of the patent office's review process is the real story here.

    --
    Slashdot - News for Herds. Stuff that Splatters.
  158. my gawd by kootch · · Score: 2

    okay, I'm going to be moderated down, but is that english?!?!?

    I think I can imagine the patent officers that were reading this going "um, billy-bob, do you know what any of this means?" and "um, no earl-ray, I have no clue what they're talking about. Must be that internet/computer mumbo-jumbo. Guess we'll just have to give it an okay..."

    starting at points #7, it starts to make a bit more sense... basically a machine running a program and then wiping that program out of the memory...

    this part had me tho...

    "means for transferring memory stores to the means for permanently storing memory stores, and

    means for storing memory data replaced by the memory stores, "

    here we go...

    "This invention relates to computer systems and, more particularly, to methods and apparatus for providing an improved microprocessor. "

    Another line that confuses the hell out of me...
    "It is difficult and expensive to make a microprocessor run as fast as state of the art microprocessors"

    um, I'm not sure whether to say "duh." or "huh?"

    1. Re:my gawd by cjeris · · Score: 1

      ugh. now i know why every patent lawyer i've ever met is dumb and grouchy. it's not their fault, they have to read this stuff all day.

      --
      Constructive logic destructs my brain.
    2. Re:my gawd by midav · · Score: 1

      Another line that confuses the hell out of me...
      "It is difficult and expensive to make a microprocessor run as fast as state of the art microprocessors"


      Considering that they are trying to create a microprocessor 'capable of executing a first instruction set to assist in running instructions of a different instruction set' (a kind of Universal Turing Machine?), I would interpret the phrase above as
      "It is difficult and expensive to make a generic microprocessor run as fast as state of the art instruction set specific microprocessors."

  159. This patent means more than processor emulation by KevThorpe · · Score: 1

    What people seem to have missed is that this will do more than processor emulation. The examples they give are performing speculative optimisation over sections of code, not single instructions. This means that the processor will start caching microcode solutions to the programs it is running, not the opcodes. The net effect of this is that someone running, for example, perl will find that the processor eventually ends up running common perl operations in microcode. This is a vast leap into the unknown - imagine a computer running Apache in microcode!!!!!!

  160. Speculative emulation of CISC machine instructions by SpecYouLater · · Score: 1
    This device supports speculative emulation of CISC machine instructions. If Transmeta is really implementing this device, it appears to me that they're developing a very high performance RISC machine that will be able to emulate the X86 architecture at a speed significantly faster than the current generation of Pentiums.

    Speculative execution makes it possible to optimize conditional branches so that they take no more time to execute than a non-branching instruction stream. In order to do this, the processor "sees" the branch coming in its lookahead buffer. It then begins fetching instructions located at both possible branch targets and executes them "speculatively" before the branch is reached. This is speculative execution because the processor doesn't know which branch path will be taken.

    On a RISC machine, very few instructions reference memory, so it is easy to execute long instruction streams speculatively without the possibility of incurring a page fault. This is not the case with CISC architectures like the x86 family. Here, you can easily have three-address instructions that load two operands out of memory, add them together, and store the result at a third address. All three of these memory references can generate a page fault, which themselves cause a branch to the OS's interrupt handler.

    The new device simply makes it possible for the emulator to speculatively execute these instructions as long as they don't generate a page fault. When the original branch is ready to be executed, the processor checks the speculatively emulated instructions in the chosen path. If they completed without error, the device is instructed to write out the contents of its local store for any memory locations that were changed. The processor then continues to execute at the instruction after the speculative execution, without ever experiencing a pipeline stall.

    If the chosen branch path that was speculatively emulated would have resulted in a page fault, the device is instructed to write out any stores that did not cause a fault, and then take the page fault. In this case, the pipeline will stall, but this stall could not have been avoided, because the page fault was inevitable.

    Dave Ditzel (the founder of Transmeta) was the principal architect for the SPARC-V9 64-bit RISC machine. The SPARC Architecture Manual-V9 contains extensive information about branch prediction and speculative execution. You can get it on Amazon.

  161. My attempt at interpretation... by NewWazoo · · Score: 1

    It looks to me like this patent covers a device which holds instructions belonging to one instruction set (say, x86) and translates them into another instruction set (say AXP) for execution. It also looks like it holds these instructions in memory until it can be verified that they will execute without errors, and then stores them for later execution. It also covers circuitry that removes these stored instructions...

    IANAL/CS. My best guess...

  162. Register and Shadow Register Rollback by suede · · Score: 1

    The transaction processing aspect of the processor is most important when combined with the "shadow register" concept.

    It looks as if the Morph Processor is a translated cache, where the Host Processor (see previous Transmeta patent) reads from the Morph Processor. The Host Processor thinks it is reading from memory/bus/whatever but is actually reading from the Morph Processor.

    Registers in the Morph Processor can be dynamically renamed or shadowed for particular execution forks. The transaction processing is scary when you start rolling back one fork of execution and committing another without slowing the process!

    More impressive is the ability of the Morph Processor to combine/evaluate/execute concurrent execution paths from multiple programs in different languages using multiple Morph Software!

    _____

    --
    Available in more colors than there are starfish in a swimming pool! ORDER NOW!
  163. Re:What it Really does by hazydave · · Score: 1

    Even going back to Transmeta's first patent of last year, it's clear they're doing some kind of CPU technology, in which a target instruction set (say, x86) is dynamically recompiled and cached in a native, probably VLIW ISA. On the surface, this is not all that different than what DEC's FX!32 and some Java JITs do. But in both of those cases, there's no hardware to deal with. Transmeta's first patent dealt primarily with the hardware emulation tricks. This seems to be a continuation of some of their ideas from the first patent. When you take any target ISA and convert it into something else, there is never a 1:1 instruction correspondence. But if you're optimizing and, even more, if you're targeting a VLIW machine, you're going to want to combine the functionality of a block of x86 code into a block of VLIW code, as much as that's possible. No inherent instruction boundaries remain in this new code. Which is fine, until you hit an exception. At that point, your emulation needs to emulate a machine exception. The problem is, just where, in the actual x86 code, did this take place? Transmeta's first patent dealt with the issue of tracking and resyncing to the emulated instructions. This deals with ensuring that a backtrack isn't visible outside of the "black box" x86 that the whole emulation provides.

    --
    -Dave Haynie
  164. What they said... by dkh2 · · Score: 2
    It looks like they took a clue from Digital (a.k.a. DEC, now part of Compaq) with the FX!32 package they sent me for running x86 compiled Win32 apps on my Alphastation.

    They're taking data/code from one processor/platform and shipping it to another for work, then (presumably) shipping the results back. This will be tremendously useful in loadsharing situations where you don't have all the same hardware.

    Picture a multiplatform Beowulf cluster built of a mixture of G3's, G4's, Pentium II's, Alpha's, SGI's, and a couple of Amiga's just to make it fun.

    I guess you'd have to call this a Beomutt cluster. ;-)

    D. Keith Higgs
    CWRU. Kelvin Smith Library

    --
    My office has been taken over by iPod people.
  165. Speculative execution with instruction journaling by franzzup · · Score: 1

    Apparently, Transmeta believe they can build processors that execute simple
    instructions extremely quickly. One approach mentioned in the patent is a VLIW
    processor, which can potentially execute many (simple) instructions
    simultaneously.

    At the same time, they want to be able to run x86 software, so the focus is on
    emulation. It is easy to see how one might translate ("code morph") a simple,
    sequential stream of x86 instructions into RISC or VLIW instructions, which
    could then be reordered and optimized (as a compiler's peephole optimizer
    might do) to run at maximum, native speed on the target hardware or "morph
    host". However, real-world code never has an unbroken, linear flow of control.
    One problem that prevents existing software emulators from running emulated
    software at native speeds is the need to be able to interrupt the emulation at
    any instruction boundary, relative to the original x86 (target) instruction
    stream.

    Thus, even in a linear sequence of target instructions, an exception
    may be triggered after each and any target instruction. Possible causes include
    illegal operands (divide by zero, illegal memory address) or a software or
    hardware interrupt that transfers control to an interrupt handler. However,
    the instruction being executed by the host (VLIW) architecture at the time the
    exception occurs may be performing some or all of the functions of one or more
    target instructions simultaneously. (E.g. a floating point divide and two
    integer multiplies.)

    Yet in order to reproduce accurately the effect of the
    emulated target processor's instructions, the morph host must complete the
    emulation of the target instruction that caused the exception, and then
    transfer control to the exception handler, without disturbing the emulated
    target's state with additional, half-executed target instructions. In other
    words, the target execution state must be rolled back to an earlier state, and
    then the code must be reexecuted up to and including the exception-causing
    instruction, but none that follow.
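
    The roll-back-and-replay idea in the paragraph above can be sketched in a few lines of Python. This is a toy model, not Transmeta's design: the tuple instruction format and the names `execute`, `run_block` and `TargetFault` are all invented for illustration, and a plain dictionary copy stands in for the hardware checkpoint.

```python
class TargetFault(Exception):
    """Raised with the index of the emulated target instruction that faulted."""

    def __init__(self, index):
        super().__init__(f"fault at target instruction {index}")
        self.index = index


def execute(regs, instr):
    """Apply one toy target instruction (op, dst, src1, src2) to a register dict."""
    op, dst, a, b = instr
    if op == "add":
        regs[dst] = regs[a] + regs[b]
    elif op == "div":
        regs[dst] = regs[a] // regs[b]  # ZeroDivisionError plays the role of a fault
    else:
        raise ValueError(f"unknown op {op!r}")


def run_block(regs, block):
    """Run a block speculatively; on a fault, replay serially from the checkpoint."""
    work = dict(regs)  # speculative state; `regs` itself is the checkpoint
    try:
        # Fast path: stands in for optimized host code, which only knows that
        # *something* in the block faulted, not which target instruction.
        for instr in block:
            execute(work, instr)
    except ZeroDivisionError:
        # Rollback: discard `work`, then re-execute one instruction at a time
        # so the state is exact up to (not including) the faulting instruction.
        for index, instr in enumerate(block):
            try:
                execute(regs, instr)
            except ZeroDivisionError:
                raise TargetFault(index) from None
    else:
        regs.clear()
        regs.update(work)  # commit the speculative state
```

    The fast path only learns that a fault happened somewhere in the block; the serial replay from the checkpoint is what recovers the precise target state before control would pass to the exception handler.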

    Current pipelined microprocessors do things like this in hardware (e.g.,
    speculative execution). Transmeta claims that they can greatly simplify the
    processor's hardware by moving these scheduling concerns into software, while
    providing explicit hardware checkpointing queue management instructions. For
    instance, writes to memory are tagged and placed in a speculative queue. When
    the target instruction responsible for the write finishes execution and an
    exception can no longer require that it be rolled back, the tagged write is
    explicitly committed. (For instance, if a target store instruction begins
    execution together with a floating point instruction that affects a target
    register, then the memory write of the store instruction and the register write
    of the fp instruction can be committed after both instructions have completed
    their emulated execution.) If a VLIW style instruction set is used, these
    commit operations can be performed in parallel with other operations.
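
    A minimal software model of the speculative write queue described above might look like the following. The class and method names are hypothetical, memory is just a dictionary, and unwritten addresses are assumed to read as zero; the real mechanism is hardware and is only paraphrased here.

```python
class GatedStoreBuffer:
    """Holds stores back from memory until they are explicitly committed."""

    def __init__(self, memory):
        self.memory = memory  # dict: address -> value, the architecturally visible memory
        self.pending = []     # uncommitted (address, value) stores, in program order

    def store(self, address, value):
        # Speculative store: queued, not yet visible in self.memory.
        self.pending.append((address, value))

    def load(self, address):
        # The running translation must see its own stores: newest pending wins.
        for pending_address, value in reversed(self.pending):
            if pending_address == address:
                return value
        return self.memory.get(address, 0)

    def commit(self):
        # No exception can require a rollback any more: drain in program order.
        for address, value in self.pending:
            self.memory[address] = value
        self.pending.clear()

    def rollback(self):
        # An exception occurred: drop the queued stores, memory is untouched.
        self.pending.clear()
```

    Nothing outside the buffer ever observes a store that is later rolled back, which is the property the emulator needs in order to reorder target instructions freely between commit points.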

    The speculative execution of memory stores is useful in other ways as well.
    Thus store operations may be initially treated as though they reference
    memory. Accordingly, the emulation may reorder them, combine them with
    other operations, etc. However, during execution it may turn out that a
    store operation actually references memory-mapped I/O. This is determined by
    comparing the assumed type of access ("normal" for memory or
    "abnormal" for I/O) with the "A/N protection bit" associated with the
    target address page in the MMU's translation look-aside buffer. If a "normal"
    (i.e. potentially reordered) write to an "abnormal" (i.e. order-sensitive I/O)
    address is detected, the morph host takes an exception, returns to the last
    checkpoint of known target state, and restarts the execution of the target
    instructions, which are then recompiled to execute serially.
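
    The A/N check can be illustrated with a toy model. Everything here is an assumption made for the sketch: the page size, the set of "abnormal" page numbers standing in for TLB bits, and the use of `sorted` as a stand-in for whatever reordering an optimizing translation might perform.

```python
PAGE_SIZE = 4096  # assumed page size for the sketch


class AbnormalAccess(Exception):
    """A store translated as 'normal' memory hit a page marked 'abnormal' (I/O)."""


def check_store(address, assumed_normal, abnormal_pages):
    # Compare the translation's assumption with the page's A/N bit.
    if assumed_normal and address // PAGE_SIZE in abnormal_pages:
        raise AbnormalAccess(hex(address))


def run_translation(stores, abnormal_pages):
    """Return (mode, stores in the order they would be issued)."""
    issued = []
    try:
        # Optimistic translation: treat every store as memory and reorder freely.
        for address, value in sorted(stores):
            check_store(address, True, abnormal_pages)
            issued.append((address, value))
        return "reordered", issued
    except AbnormalAccess:
        # Roll back to the checkpoint (nothing was committed) and re-run the
        # same target stores strictly in program order.
        issued = []
        for address, value in stores:
            check_store(address, False, abnormal_pages)
            issued.append((address, value))
        return "serial", issued
```

    Stores to ordinary RAM come back reordered, while a block that touches a page in `abnormal_pages` is replayed in its original program order, which is the only order a device register can safely see.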

    A similar technique is used to deal with self-modifying code. Pages of memory
    containing target code are marked with a "T" bit in the TLB if they have been
    translated (compiled to morph host instructions). If a write occurs to a
    memory page whose "T" bit is set, the corresponding host translations
    (i.e., compiled morph host instructions) are flushed from the cache, and the
    target code must be dynamically recompiled again the next time it is executed.
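
    The "T" bit scheme amounts to a translation cache that is invalidated on writes. A rough sketch follows, with a hypothetical `TranslationCache` class, an assumed 4 KB page size, and an opaque `translate` callback standing in for the code morphing software.

```python
PAGE_SIZE = 4096  # assumed page size for the sketch


class TranslationCache:
    """Per-page cache of translated code, flushed when the page is written."""

    def __init__(self):
        self.translations = {}  # page number -> translated host code (opaque here)
        self.compile_count = 0  # how many times the translator had to run

    def t_bit(self, page):
        # In this model a page's "T" bit is set exactly when a translation exists.
        return page in self.translations

    def fetch(self, address, translate):
        page = address // PAGE_SIZE
        if page not in self.translations:
            self.translations[page] = translate(page)
            self.compile_count += 1
        return self.translations[page]

    def on_write(self, address):
        """Call on every store; returns True if a translation was flushed."""
        page = address // PAGE_SIZE
        if self.t_bit(page):
            del self.translations[page]  # self-modifying code: translation is stale
            return True
        return False
```

    Writes to pages that hold only data cost nothing extra; only a write to a page with a live translation forces the next `fetch` to recompile.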

    It would seem that many other optimizations will be possible in software,
    given that the morph host processor allows explicit software control over
    speculative execution, commit and rollback.

    Chris Ferebee

  166. incompleteness theorem by Anonymous Coward · · Score: 0
    probably not, actually. every formal system (which circuitry definitely is) contains truths that are not provable within its own system. this is Gödel's incompleteness theorem in a nutshell..

    so. TMP(TMP) probably isn't possible (due to the fact it would defy entropy and such things =). although TMP might be able to handle most or all other formal systems, TMP may be an unprovable (unexecutable) truth within formal system TMP. i.e. it's not self-provable.

    before you think that this is ludicrous... there are many mathematical systems that work fine until you try and use them to prove themselves (even though they ARE valid). another system is oft required to perform the trick.

  167. New Processor Architecture? by Anonymous Coward · · Score: 0

    New processor that pretends to be an old one, and supposedly does so well? Sorta like an x86 mock up, but with a better processor inside. Hehe, one might even think a RAW processor *wink wink*. -Mountaineer

  168. DoH! Stupid brain.... by Anonymous Coward · · Score: 0
    Well I took a look at the aforementioned link, and have concluded that either:
    1. It's a bunch of b.s.
    2. I'm a monkey, and all that technobabble actually makes sense.
    Anybody want to place bets on what occurs first as a result of this patent award: a useful product that improves the average Joe's life, or a lawsuit is filed by the patent holders against some schmoe who actually did create something useful that might actually sound something like what is "described" in the patent?

    Oooh, my head hurts. I'd take some aspirin, but without opposable thumbs this child-safe cap is kicking my ass...

  169. Hmm... by El · · Score: 1
    Sounds like speculative execution of partially-translated instructions to me, which requires a mechanism for committing the results if the instructions are valid, and backing out if they are not. Very useful if you are doing Java Just-In-Time compiler-like translation of instructions to your native code -- which is what I've been suggesting that Transmeta is doing all along.

    The question remains, of course, as to which processors' instruction sets they are translating... I suspect x86 to begin with, but doing both x86 and PowerPC would be nice. And of course, translating the instructions is only half the battle; you also have to provide a compatible run-time environment (which is where I suspect Linus comes in.)

    --

    "Freedom means freedom for everybody" -- Dick Cheney

    1. Re:Hmm... by Ian+Bicking · · Score: 2
      Is it possible that they could translate something like Java bytecodes at a speed on par with compiled C code? Well, I guess the answer is: no one has any idea.

      But if they could do something like that -- not just for Java, but other environments that do better with dynamic compiling (like polymorphic OO systems, e.g. Smalltalk, Common Lisp) -- that would mean a real revolution in programming. The advantages of C for anything other than systems programming would be greatly diminished.

      Of course, if the translation is all hardcoded, that's unlikely to be very helpful for higher-level languages. And maybe the translation assumes some sort of commonality -- registers and the sort -- that most processors share, but wouldn't be shared by most sorts of bytecodes. This reminds me of what Linus was talking about in his article on the portability of Linux.

  170. Re:YES, that's what I got by GypC · · Score: 1

    From my meager understanding; the PPro/PII/PIII processors actually break down the x86 CISC into RISC-like chunks before execution... thus, they are technically RISC processors that emulate an x86.

    Of course I could be completely wrong...

  171. It's not an "emulator in HW". It's "SQL in HW" by Flu · · Score: 1
    Well, not really, but almost; This patent is similar to what is done in all databases all the time; However, what TM is doing is much cooler than that (more on that below).

    In an ordinary database, the operator (an "emulated" CPU instruction) makes a series of changes to the database (memory). These require that the database (memory) be in a completely known state during the whole transaction, and that each individual change succeed (the instruction must not cause an exception or error) for the complete set of changes to be valid.

    See at the bottom of this article for an example.

    From another operator's viewpoint (an I/O unit), the database (memory) must never be left in a half-updated state. Therefore, all individual database updates take place in a temporary storage area, which is only made permanent once it has been determined that no errors occurred during the sequence.

    This patent is merely a hardware implementation of the temporary storage area and the commit/rollback behaviour of an SQL database.

    So what are TM using this patent for? Well, read on! :-)

    It is actually only a piece of their real invention: a programmable processor. Note: I do NOT think this is a processor with hardware emulation, although it can probably be used as such.

    Their invention, though, is actually not even that new, but it probably hasn't been used at full scale. I know for sure that it was possible to write new CPU instructions for an old mini-computer called the NORD 100, simply by altering the microcode that was to be executed when certain opcodes were fetched.

    As far as I can understand from their previous patents, that is what they seem to be developing: a small, generic, multi-purpose processor that can be programmed to understand an arbitrary set of CPU instructions. Because of that, the actual number of transistors required can be kept to a minimum.

    The good things about this are several:

    • Fewer processors will fail the burn-in test during production, since the failure rate increases exponentially with the number of transistors. Thus, lower costs for the "same" speed.
    • Fewer transistors also means lower complexity, and this can aid in higher speeds at the same clock frequency.
    • Fewer transistors also means less heat, allowing for even higher clock frequencies.
    Probably this means that the actual number of CPU instructions that can be understood will be limited, but as mentioned in the patent, 2MB internal RAM is enough to execute x86 programs.

    If smaller internal RAM is used, more specialized CPU instruction sets can be allowed, for example the instruction set of a JVM, PostScript, an extremely high-speed number cruncher (code-cracker, image-manipulating DSP, modem) or whatever. I personally believe this processor will be a modular processor that will find its use in embedded applications such as printers, routers, web servers, cameras, fingerprint controllers, copiers, palm-tops etc., where many features of the standard x86 CPUs (MMX, 3D NOW!, virtual 8086 emulation, page fault detection etc) can be left out for the benefit of some highly specialized features such as a JVM, PostScript interpreter, DSP or something similar.

    However, they do write the following in the patent, which is pretty cool:

    As a comparison, one embodiment of the present invention designed to run all available X86 applications is implemented by a morph host including approximately one-quarter of the number of gates of the Pentium Pro microprocessor yet runs X86 applications substantially faster than does the Pentium Pro microprocessor or any other known microprocessor capable of processing these applications.

    (emphasis mine)

    /Flutte

    Example from above

    An example would be a database where the ages of persons were stored as years since the person was born, rather than date of birth. Every January 1st, the complete database must be updated. We cannot allow the update to fail after a couple of persons, since if that happens, we do not know who was updated and who wasn't.
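
    The age-update example can be tried directly with Python's built-in `sqlite3` module. This is only a sketch of the database side of the analogy: the table layout, the names in it, and the `fail_on` parameter used to simulate a failure part-way through are all invented for illustration.

```python
import sqlite3


def make_db():
    """Create an in-memory database with a few made-up people."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE people (name TEXT PRIMARY KEY, age INTEGER)")
    conn.executemany(
        "INSERT INTO people VALUES (?, ?)",
        [("alice", 30), ("bob", 41), ("carol", 27)],
    )
    conn.commit()
    return conn


def new_year_update(conn, fail_on=None):
    """Add one year to everyone, all or nothing. Returns True if committed."""
    names = [row[0] for row in conn.execute("SELECT name FROM people ORDER BY name")]
    try:
        with conn:  # commits if the block succeeds, rolls back if it raises
            for name in names:
                if name == fail_on:
                    raise RuntimeError(f"simulated failure while updating {name}")
                conn.execute("UPDATE people SET age = age + 1 WHERE name = ?", (name,))
    except RuntimeError:
        return False
    return True


def ages(conn):
    return dict(conn.execute("SELECT name, age FROM people"))
```

    Calling `new_year_update(conn, fail_on="bob")` updates the first row and then fails, yet afterwards `ages(conn)` still shows everyone at their old age: the partial update lived only in the uncommitted transaction, just as speculative stores live only in the temporary storage area until they are committed.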

  172. What it Really does by Coventry · · Score: 5

    Ok, it's for emulation, but it Doesn't Just speed emulation. This allows for instruction ROLLBACK. Want a journaling filesystem? How about a journaling processor?
    The patent is for a co-processing unit that not Only translates a foreign instruction set into native instructions for a 'target processor', But also acts as a go-between for that target processor and memory. It stores the processor state, and buffers any memory writes, until it is certain that a group of instructions has been run without exception or error... If the translated instructions crash, no damage is done. Not only is this amazing overall, but it allows for Very speculative, and Very fast, instruction translation and branch prediction...

    --
    man is machine
    1. Re:What it Really does by EEEthan · · Score: 1

      If it's a separate emulation processor, couldn't it be used to run code, really quickly, on any processor or processors? In the realm of total theory: run MacOS on a fleet of 386s? Or x86 Linux on a dozen G4s?
      Anyway, I think it'll be equally cool and out of normal, human price ranges.

    2. Re:What it Really does by Anonymous Coward · · Score: 0

      ha ha ha ha. pixels. ha ha ha ha ha ha. you kill me. ha ha ha ha

    3. Re:What it Really does by TheGreek · · Score: 1

      I'm thinking of a rewind button for my PC. I can execute some application and if I screw up, I can "rewind" back to where I was before. This sounds kind of stupid, but I can see consumer devices eating this up. It would also make it easier to replay that last death in Quake without having to go back to your last saved game.

      Nay, not stupid. Just not quite feasible.

      Most applications (Quake, by your example) process millions of instructions per second. Millions. All of these instructions play with memory, registers, disk, peripherals, etc. So to "replay that last death" in Quake, you would need to store several million instructions AND a picture of the state of the registers, memory, peripherals, etc AT EACH INSTRUCTION. The difficulties involved and the resource utilization and engineering necessary just make it not worth it.

    4. Re:What it Really does by the_tsi · · Score: 2

      It doesn't sound to me like this chip is actually doing the emulation; just the translation, and then buffering it so another chip can pick it up from there and run with the instructions... which would make sense with everything else you said.

      -Chris

    5. Re:What it Really does by PhiRatE · · Score: 1

      I expect there's one more trick to this. I believe that the repeated mentions of the second set being translated into the first set indicate that, in fact, the first set defines what an exception or error in this context is. This is extremely important: being able to roll back if the determinant of a long matrix calculation is 0 (the matrix being too big to hold in registers) would be of incredible value for simplifying code, introducing further stability and increasing performance, especially in multiprocessor systems, where mutexes etc could be checked part way THROUGH the calculation, rather than only at the start.

      --
      You can't win a fight.
    6. Re:What it Really does by BugMaster+ChuckyD · · Score: 2

      I think you're reading a little too much into what is said in the patent. From the abstract:

      determination is made that a sequence of translated instructions will generate an exception or error on the host processor [emphasis mine]

      It seems to me that what they are doing here is making sure that the translation is correct, i.e. that the native instructions make sense. It does not do anything about any memory writes that might take place as a result of the execution of those native commands. Remember that the BSOD in Windows comes as a result of the execution of a valid set of x86 instructions that mess up memory in a way that stops the application/system from functioning properly. What TM is talking about here would not affect that sort of thing at all (the chip logic would have no idea that writing to memlocation x would screw up the running of your app).

      This implies to me that whatever mechanism TM is using to quickly translate (say) x86 -> TM instruction set can produce a set of instructions that make no sense (for instance, a value is written to a register and then another value is written to the same register without the first value being used at all -- that might not be a very good example, but it's that sort of thing).

    7. Re:What it Really does by mvw · · Score: 1
      This allows for instruction ROLLBACK. Want a journaling filesystem? How about a journaling processor?

      That idea is not convincing.

      If you view a relational database as a processor for SQL statements, a rollback just ensures the SQL statements being atomic. Thus you end up with the state before that SQL statement was issued, and not in some mess that is left from some error during the processing.

      The same holds for file system commands that translate into complicated block operations, where a rollback would ensure atomic fs commands.

      CPU instructions aren't that complex; they are fairly atomic by themselves. I can't think of a situation (except for processor bugs like the F00F one) where the processor hangs in the middle of some instruction, having stumbled over some microcode gone crazy.

      So I simply see no benefit of a rollback of an instruction, sorry.
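
      To make the database analogy above concrete, here is a minimal sketch using Python's standard sqlite3 module. The table, the account names, and the transfer() helper are all made up for illustration; the point is only that a failed statement leaves the data exactly as it was before the transaction began.

```python
import sqlite3


def transfer(conn, src, dst, amount):
    """Move `amount` from src to dst atomically; undo everything on error."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?",
                (amount, dst),
            )
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        return False


def demo():
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE accounts ("
        "name TEXT PRIMARY KEY, balance INTEGER CHECK (balance >= 0))"
    )
    conn.execute("INSERT INTO accounts VALUES ('a', 100), ('b', 0)")
    conn.commit()
    # The second UPDATE would drive 'a' negative and violate the CHECK,
    # so the first UPDATE (already applied to 'b') is rolled back too.
    ok = transfer(conn, "a", "b", 500)
    rows = dict(conn.execute("SELECT name, balance FROM accounts"))
    return ok, rows
```

      Calling demo() should report a failed transfer with both balances untouched, which is the "state before that SQL statement was issued" described above.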

  173. Re:Hmm ... Firmware FX32? by Tetra · · Score: 1

    Seems like what DEC's FX!32 is doing for native Win32 execs, only it's storing the instructions in a permanent way, like flash memory? Just a guess.

    --
    Regards, tEtra
  174. Transmeta speculation (segfault style) by pb · · Score: 2

    Today, a press release for Transmeta, Inc. was cleverly disguised as a patent. Transmeta, Inc. was truly proud that the US Patent & Trademark Office (USPTO) allowed them to issue a press release endorsing their vaporware, which was soon picked up by a local website (www.dotslash.org).

    "It amazes us that the geeks were able to interpret 'Apparatus for use in a processing system' as 'Wow, they've got something faster than Intel!'. That was our intent, of course, but we hate to see our brethren fail a Turing test..."

    --
    pb Reply or e-mail; don't vaguely moderate.
  175. Re:Run on by Anonymous Coward · · Score: 0

    From what I can tell it's a run-on sentence. Other than that I don't understand what in the world they are saying.

  176. emulation by Anonymous Coward · · Score: 0

    It appears to be an advanced processor that emulates any processor at a very low level, like an on-the-fly processor that changes its configuration depending on the task to give much better performance. It is (I think) able to store information in the processor temporarily, like RAM... I think.

  177. Translation? by PhiRatE · · Score: 1

    Apparatus for use in a processing system having a host processor capable of executing a first instruction set to assist in running instructions of a different instruction set which is translated to the first instruction set by the host processor
    --
    This would seem to be some kind of emulation? hardware assisted translation of instructions for high speed emulation would be my guess
    --
    including circuitry for temporarily storing memory stores generated until a determination that a sequence of translated instructions will execute without exception or error on the host processor
    --
    Temporarily storing memory stores...perhaps, caching all the memory writes until they're sure the code is going to work? not sure why you'd want to do that for emulation, perhaps a high-speed, high-definition equivalent of memory protection?
    --
    circuitry for permanently storing memory stores temporarily stored when a determination is made that a sequence of translated instructions will execute without exception or error on the host processor, and circuitry for eliminating memory stores temporarily stored when a determination is made that a sequence of translated instructions will generate an exception or error on the host processor.
    --
    Stuff to write the cache out, and stuff to clear the cache?

    But what does it all mean??? :)

    Ahh, perhaps, the first set sets up a virtual machine/debugging environment for following code, assisting in system stability? you could sandbox pretty well if you could just run whole blocks of code and then recant on any writes that were done if it threw an exception. Need a pretty odd compiler though.
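
    The "run a block, then recant the writes on an exception" reading above can be sketched in a few lines of Python. This is a toy software model, not the patented circuitry: the GatedStoreBuffer class, run_block(), and the addresses are all hypothetical names chosen for the example.

```python
class GatedStoreBuffer:
    """Toy model: stores are held aside until the whole block commits."""

    def __init__(self, memory):
        self.memory = memory   # committed state (address -> value)
        self.pending = {}      # stores not yet made permanent

    def store(self, addr, value):
        self.pending[addr] = value

    def load(self, addr):
        # A load must see the block's own earlier, still-pending stores.
        if addr in self.pending:
            return self.pending[addr]
        return self.memory.get(addr, 0)

    def commit(self):
        self.memory.update(self.pending)
        self.pending.clear()

    def rollback(self):
        self.pending.clear()


def run_block(buf, block):
    """Run each operation in `block`; commit only if none of them raises."""
    try:
        for op in block:
            op(buf)
    except Exception:
        buf.rollback()
        return False
    buf.commit()
    return True


memory = {0x10: 1}
buf = GatedStoreBuffer(memory)

# This block succeeds, so its store reaches memory.
ok_good = run_block(buf, [lambda b: b.store(0x10, b.load(0x10) + 1)])

# This block faults (division by zero) after one store; that store is discarded.
ok_bad = run_block(buf, [
    lambda b: b.store(0x20, 99),
    lambda b: b.store(0x30, 1 // b.load(0x40)),
])
```

    After both blocks run, memory holds only the result of the first one; the write to 0x20 never becomes visible because the block it belonged to faulted.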

    --
    You can't win a fight.
  178. Not Emulation, IT IS DIRECT EXECUTION by Anonymous Coward · · Score: 0

    Verify For No Errors And Execute

  179. Re:Have we really thought this through yet?? by jani · · Score: 1

    I think what you and many others are forgetting is that just because we're masturbating because we think we're seeing a hardware system that solves all our binary compatibility issues, it doesn't mean that the hardware system in question will replace all other hardware systems in existence.

    If the best system always won, Intel and Microsoft would never have made it. ;)

  180. A HERRING! by Skip666Kent · · Score: 4

    Those bastards have patented my favorite fish! Of all the nerve!

    Really, tho', it could be a Red Herring. Transmeta could be cashing in on the popular assumption that they're going to create a wild new processor that'll be Everything to Everyone in order to disguise the fact that they're really in the process of opening the ULTIMATE multimedia porn site for cyber-trans-sexuals.

    (Not that there's anything wrong with that...)

    --
    **>>BELCH
    1. Re:A HERRING! by SEE · · Score: 1

      A herring? A chip that translates instructions into another instruction set reminds me more of a different type of fish...

      OTOH, does that mean Douglas Adams can claim prior art?

  181. Re:Huh? by scumdamn · · Score: 2
    This would be useful if you follow both paths of a branch statement. Once it is determined which branch was supposed to be taken, that data gets posted, while the data generated by taking the wrong branch gets tossed.

    This is the same method Merced/McKinley uses, isn't it? Does that count as prior art?
  182. Digital Clutch.. by Anonymous Coward · · Score: 1

    This may allow the Transmeta chip to "freeze" the current processor it is emulating (say x86) and jump to another type (like PowerPC) while holding the first image in memory. So in addition to holding the current state of a program (in a multi-tasking environment), the Transmeta chip may need to hold what kind of processor it runs on.

    How this works with their Orbital Mind Control lasers is beyond me.

    -Yet another coward

  183. Re:YES, that's what I got by MindStalker · · Score: 2

    Simple, you first translate it into an architecture that is blazingly fast. Imagine if you will that I had a program to translate your Office 2000 program into a program to run on my high-end DEC Alpha. At the same clock speed it will run much faster on the Alpha. This simply puts the translation phase in the hardware. Also, it optimizes the code and checks for errors, making it even faster as you don't have to deal with errors in the central processor. How do you do all this and make it cheaper? I HAVE NO F#%# IDEA!

  184. Re:Hypocrites! by Anonymous Coward · · Score: 0

    Wow! I love weak kneed schoolgirls! Spunk!

  185. It seems to do several things by danwatt · · Score: 1

    What I gather from it: 1) It can translate instructions (like x86) into its own format 2) It has some sort of intelligent memory buffering system. 3) Judging by the abstract, it SEEMS to almost be able to switch "modes" or instruction sets on the fly

  186. Wow by Palin+Majere · · Score: 1

    If I understand this correctly, what this means is:
    They have an 'apparatus' that can temporarily and/or permanently store the results of a translation from one instruction to another.

    Imagine this: A new instruction set comes out. You fire up your Transmeta processor, download some sort of 'data table' for the new instruction set, and let the processor *learn how to decode it*.

    This patent would allow Transmeta to build such an instruction set translator into the processor itself, and then have this 'apparatus' that they've received the patent for decode new instruction sets and store the results of how to do it.

    Cool. Very Cool!

  187. Quick Summary by meta4 · · Score: 3

    If you had scrolled way down the page, you would have found this:

    SUMMARY OF THE INVENTION

    It is, therefore, an object of the present invention to provide a host processor with apparatus for enhancing the operation of a microprocessor which is less expensive than conventional state of the art microprocessors yet is compatible with and capable of running application programs and operating systems designed for other microprocessors at a faster rate than those other microprocessors.

    This and other objects of the present invention are realized by apparatus for use in a processing system having a host processor capable of executing a first instruction set to assist in running instructions of a different instruction set which is translated to the first instruction set by the host processor comprising means for temporarily storing memory stores generated until a determination that a sequence of translated instructions will execute without exception or error on the host processor, means for permanently storing memory stores temporarily stored when a determination is made that a sequence of translated instructions will execute without exception or error on the host processor, and means for eliminating memory stores temporarily stored when a determination is made that a sequence of translated instructions will generate an exception or error on the host processor.

    These and other objects and features of the invention will be better understood by reference to the detailed description which follows taken together with the drawings in which like elements are referred to by like designations throughout the several views.

  188. WOAH!! by ScUmM_BoY · · Score: 1

    This sounds like the processor can take ANY instruction from another chipset (x86, ALPHA, SPARC, etc) and translate it ON THE FLY to the Transmeta instruction FASTER than the native chipset! It also seems like there is a buffer for waiting instructions. I want a Beowulf cluster of THESE babies...

  189. Transmeta by technos · · Score: 5

    It appears to be a system in which a processor is fed a sequence of instructions in a translated foreign set, and the results are held in cache until it can be ascertained that the entire stream of instructions will run without error, at which time the cache is released. They may be using this purely as a CISC->RISC mechanism, or they may be planning a platform where the actual program code is 'broken' into chunks, and the processors might encounter exceptions if the granularity of the sets is off. They may even be planning a platform that does multi-arch emulation on a transparent hardware/microcode level, ala AS/400. Heck, they might be doing all three! They also allude to making a cheap processor run code designed for a more expensive one, so perhaps they're planning to give Intel a run for their money.

    I'm sorry, but that is the closest I can get to an answer with the available information.

    --
    .sig: Now legally binding!
    1. Re:Transmeta by Guy+Harris · · Score: 2
      They may even be planning a platform that does multi-arch emulation on a transparent hardware/microcode level, ala AS/400.

      PowerPC-based AS/400's don't have microcode in the CPU, as far as I know. The older IMPI ones had two levels of what was called "microcode", but the Inside the AS/400 book by Frank Soltis (one of the architects of S/38 and AS/400) said the "vertical microcode" was just machine code and was called "microcode" for legal reasons (if it was software, IBM would have to unbundle it; it was "microcode", however, which meant they could bundle it with the hardware). The "horizontal microcode" was conventional microcode, used to implement the IMPI instruction set.

      I.e., the emulation is done largely in software, by translation of the high-level "MI" instruction set into the native instruction set (IMPI or extended PowerPC), although that software was, at one point, called "microcode".

      The processor described in the various Transmeta patents also appears to do that translation in software, not hardware; this patent says

      Typically, the target application is being designed for some target computer other than the host machine on which the emulator is being run. The emulator software analyzes the target instructions, translates those instructions into instructions which may be run on the host machine, and caches those host instructions so that they may be reused.

      (emphasis mine).
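
      The "translate once, cache, and reuse" behaviour in the quoted passage is the usual shape of a translation cache. Below is a rough Python sketch of that idea only; the TranslationCache class, the two-instruction toy "target" instruction set, and the addresses are assumptions made for illustration and say nothing about how Transmeta's software is actually organized.

```python
class TranslationCache:
    """Translate a block of target code at most once, then reuse the result."""

    def __init__(self, translate):
        self.translate = translate  # function: target block -> host code
        self.cache = {}             # target address -> translated host code
        self.misses = 0

    def lookup(self, addr, target_block):
        if addr not in self.cache:
            self.misses += 1
            self.cache[addr] = self.translate(target_block)
        return self.cache[addr]


def toy_translate(block):
    """Hypothetical translator: maps toy opcodes onto Python callables."""
    ops = {"inc": lambda x: x + 1, "dbl": lambda x: x * 2}
    steps = [ops[name] for name in block]

    def host_code(x):
        for step in steps:
            x = step(x)
        return x

    return host_code


tc = TranslationCache(toy_translate)
block = ("inc", "dbl")

first = tc.lookup(0x1000, block)(3)    # translated on first use
second = tc.lookup(0x1000, block)(5)   # served from the cache
```

      The second lookup does no translation work at all, which is where the speedup over instruction-by-instruction interpretation would come from for code that runs repeatedly.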

  190. Re:FPGA - Field Gate Processors by Lucidity · · Score: 1

    Well, I don't believe that a Field gate processor would actually take end software and then extrapolate a new design. Rather, each time the processor is configured SOMEONE would have to tweak it at some point.

    --
    ~`'`~-,_,-Jason Wylie-',_,-~`'`~
  191. Re:FPGA - Field Gate Processors by Lucidity · · Score: 1

    I agree with you after reading it a number of times. FPGA was just wishful thinking, I really wish they would make one though.

    --
    ~`'`~-,_,-Jason Wylie-',_,-~`'`~
  192. Halting Problem Solved!! TM proves Turing Wrong! by Anonymous Coward · · Score: 1

    Okay, maybe it's a stretch, but when you think of the patent in that light it is rather amusing... Make sure the code won't error before executing it... Sorry, it's been a long day... :)

  193. This thing..... by Roofus · · Score: 1

    has multi-platform emulation written all over it.

    I think :)

  194. Transmeta Patent by comradebren · · Score: 2

    "...It is, therefore, an object of the present invention to provide a host processor with apparatus for enhancing the operation of a microprocessor which is less expensive than conventional state of the art microprocessors yet is compatible with and capable of running application programs and operating systems designed for other microprocessors at a faster rate than those other microprocessors..."

    think of it like this: the cpu is capable of reading the instruction set for another architecture, figuring out what that architecture needs from the cpu, determining all possible instructions of that architecture, and "emulating" that architecture by a technique that allows the "emulation" to be as fast or faster than the original architecture (by taking advantage of the invented cpu's "extra" free stuff).

    so, what that means is that the cpu would theoretically be able to run any OS designed for any instruction set (ie x86, alpha, mac, etc.)

    or at least that's how i read it, but whoami

  195. Fast emulation by Anonymous Coward · · Score: 0

    We are building reprogrammable chips. And the software that reprograms them. This makes 'em faster (like how a 3d card is faster at graphics, but sucks at spreadsheets) not by being faster but by reconfiguring on the fly and acting like dedicated hardware. Usually they are way too expensive to ass produce, but hey we're getting very close

    1. Re:Fast emulation by Anonymous Coward · · Score: 0

      umm, mass produce, that is.

  196. To be more specific... by SpinyNorman · · Score: 2

    It appears that the superscalar speed necessary for faster-than-target emulation comes from a VLIW design. A single VLIW instruction is generally going to correspond to more than one target (x86/whatever) instruction, hence the patent's subject matter of efficient cached memory store and exception determination - you don't want to commit the VLIW memory stores until you've determined that *all* of the corresponding target instructions would have succeeded.

    As the patent points out, it applies equally to emulation on other (non-VLIW) superscalar architectures, but the emphasis does appear to be on VLIW.

  197. maybe they are giving sun a run for their money by Anonymous Coward · · Score: 0

    Maybe they will do both, allowing small business and home PC users to use high-end software. Then would it not be possible to incorporate Sun and Alpha software into an existing Linux platform?

  198. Didn't Tao do this? by FooBarSmith · · Score: 1

    If I remember rightly, a company founded by an ex ZX Spectrum programmer, Tao, were flaunting a radical new byte code based OS a few years ago. This theoretically allowed systems to be built with Multiple (Different) Processors and scale across them properly. Surely the work done by these people would have included this specific idea?

    Nowadays they seem to be concentrating on Java bytecode stuff and embedded systems... In the states they would have undoubtedly got mucho Venture Capital. Over here in the UK they are struggling.

    --
    stty erase ^H
  199. Quite like that FPGA supercomputer-on-a-desktop by Morgaine · · Score: 2

    It's not really all that different to the way the recently announced supercomputer-on-a-desktop works, the one that translated microprocessor-type instructions into FPGA wiring on the fly, just in time, so that CPU instructions effectively run on dedicated logic instead of in generic microcode, ie. *much* faster.

    This area is called Reconfigurable Computing, and it's been around for quite a few years (there's some quite reasonable supporting hardware available for it from Xilinx).

    Transmeta's patent differs from that in the detail of course, but the general principle is remarkably similar, so much so that they've probably included references to it among the prior art somewhere.

    --
    "The question of whether machines can think is no more interesting than [] whether submarines can swim" - Dijkstra
  200. Reliable and Programmable Emulation by exa · · Score: 1

    Once a friend (yep, a geek one) and I used to talk about an ultimate emulator generator that would generate an optimal software emulator (possibly at the load level) for any emulation of B on A. Now that was a difficult task, but cranking it out at the h/w level is kinda funky. Now that you have the target processor set, all that is required is to translate an instruction stream on the fly, and secondly to make sure no exceptions occur, so that it won't crash the target cpu.. you know crashing the cpu isn't what you'd like.

    I think Linus will be chopping some code to make that host processor internal to the kernel, so it's gonna be all transparent.

    I believe one would love to emulate x86 instruction set and (most part of a) JavaVM at a level that low. And who knows, you even get your RISCy or EPICey target processor do some of the harder stuff like blits or DSP.

    Still, sounds pretty cool, and makes sense when the previous patents are considered.

    Keep kewl,

    --
    --exa--
  201. Re:Must be crypted... by technos · · Score: 0

    Yes, it is encrypted. If you'd like a copy of the key, you can get it at the local bookstore. Just ask for 'The 1999 American Heritage Dictionary of Simple English Words and Phrases'. It'll run you six or seven bucks US, so better save up your allowance!

    --
    .sig: Now legally binding!
  202. This is not a new idea -- look at the PPro by cameldrv · · Score: 2

    The Pentium Pro was out when this was filed, and it operates very similarly. It translates x86 instructions into micro-ops which are executed out of order and retired in order. Memory writes go into the retirement buffer, and if they are the result of a mispredicted branch, they are expunged from that buffer. This is pretty much exactly the same as what Transmeta is claiming in this patent.
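
    The "executed out of order and retired in order" scheme described above can be modelled very roughly as follows. This is a toy sketch under loose assumptions, not a description of the real Pentium Pro pipeline; the ReorderBuffer class and its method names are invented for the example.

```python
from collections import deque


class ReorderBuffer:
    """Toy model: results finish in any order, but retire oldest-first."""

    def __init__(self, memory):
        self.memory = memory
        self.entries = deque()  # program order, oldest on the left

    def issue(self, addr=None, value=None):
        """Allocate an entry; addr/value describe an optional memory write."""
        entry = {"done": False, "addr": addr, "value": value}
        self.entries.append(entry)
        return entry

    def complete(self, entry):
        entry["done"] = True  # may be called in any order

    def retire(self):
        """Retire finished entries from the head; return how many retired."""
        count = 0
        while self.entries and self.entries[0]["done"]:
            entry = self.entries.popleft()
            if entry["addr"] is not None:
                # Memory is only touched at retirement time.
                self.memory[entry["addr"]] = entry["value"]
            count += 1
        return count

    def squash_after(self, branch):
        """Drop every entry younger than a mispredicted branch."""
        while self.entries and self.entries[-1] is not branch:
            self.entries.pop()


memory = {}
rob = ReorderBuffer(memory)
a = rob.issue(addr=1, value=10)
br = rob.issue()                  # a branch, no memory write
c = rob.issue(addr=2, value=20)   # speculative write past the branch

rob.complete(c)                   # finishes first, but cannot retire yet
retired_early = rob.retire()
rob.squash_after(br)              # branch was mispredicted: c is expunged
rob.complete(a)
rob.complete(br)
retired_late = rob.retire()
```

    The speculative write to address 2 never reaches memory, while the older write to address 1 does, which is the behaviour the comment compares to the patent's claims.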

  203. I see one problem.... by Skratch · · Score: 1

    Unless it's firmware that can be upgraded, what would Transmeta do when Intel introduces a new opcode in their Pentium IV? Wouldn't it be useless trying to run any code optimized for the new Intel chip?

    --

    -- My neighbors dog has a four inch clit.
    1. Re:I see one problem.... by drudd · · Score: 1

      Well first of all, when Intel introduces new opcodes, all of their own processors become obsolete, so it forces everyone to upgrade anyway.

      It sounded to me like they were planning on keeping processor images (for the lack of a better description) in CMOS, so that Transmeta's chips would actually be able to be upgraded.

      Although that patent is so wordy it's impossible to really know what their end product will look like.

      Doug

      --
      Venn ist das nurnstuck git und Slotermeyer? Ya! Beigerhund das oder die Flipperwaldt gersput!
    2. Re:I see one problem.... by El · · Score: 1

      Uh, dude! If the microcode is reprogrammable on the fly, it's better than firmware that can be upgraded. Although yes, it will take a software upgrade to emulate a new Intel instruction set.

      --

      "Freedom means freedom for everybody" -- Dick Cheney

  204. Re:FPGA - Field Gate Processors by Geekholder · · Score: 1

    No, I don't think so.

    It is true that Field Programmable Gate Arrays are reconfigurable hardware. The classic Xilinx architecture (the first FPGA I know of to be commercially available) has a large number of small blocks of SRAM which drive logic gates at the periphery. By loading a bit pattern into the SRAM, you can make that block behave like any small block of logic you wish: an 8-1 mux, an adder, part of a barrel shifter, etc.
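
    The idea that a loaded bit pattern decides what a logic block does can be illustrated with a lookup table. The sketch below is a simplified software model, not Xilinx's actual cell design; make_lut() and the example bit patterns are assumptions made for the illustration.

```python
def make_lut(config_bits):
    """Return a logic function defined entirely by its configuration bits.

    config_bits[i] is the output for the input combination whose binary
    value is i (first input is the most significant bit), so a k-input
    block needs 2**k bits.
    """
    n_inputs = (len(config_bits) - 1).bit_length()
    if len(config_bits) != 1 << n_inputs:
        raise ValueError("number of configuration bits must be a power of two")

    def lut(*inputs):
        if len(inputs) != n_inputs:
            raise ValueError(f"expected {n_inputs} inputs")
        index = 0
        for bit in inputs:
            index = (index << 1) | (bit & 1)
        return config_bits[index]

    return lut


# The same "hardware" becomes a different gate depending on what is loaded.
xor2 = make_lut([0, 1, 1, 0])
and2 = make_lut([0, 0, 0, 1])
# 2-to-1 mux with inputs (sel, a, b): output is a when sel == 0, else b.
mux2 = make_lut([0, 0, 1, 1, 0, 1, 0, 1])
```

    Reloading the part, in this picture, amounts to calling make_lut() again with a different list of bits.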

    Xilinx parts can be reloaded "on the fly". You reset the part and tell it to reload, and it tristates its outputs and reconfigures itself according to the new code you supply. So for example, after identifying what sort of monitor is connected to your VGA port (640x480, 1024x768, 1280x1024), the FPGA on your framebuffer might reload itself with a wad optimized for that screen size. I know of a Mac video card from Radius about 5 years ago which worked like this.

    The research projects you list take this a step further. After identifying what sort of task you want it to perform, the FPGA in your CPU would reload itself with a wad optimized for that function. So, for example, if it figured out you were doing lots of very large multiplies (because you're running an RSA key exchange), it could dump the floating point unit and reload that FPGA with a 512 bit exponentiator. Later when you start Quake it could dump the exponentiator and bring the floating point unit back. Still later when you start up your Finite Element Analysis package it could dump the single precision floating point unit which was fine for Quake, and bring in the slower but more accurate double precision FPU.

    The Transmeta patent doesn't describe anything of this nature. The patent describes a very sophisticated store buffer which can delay, commit, or discard a speculative instruction stream as needed. It isn't FPGA based, as far as I can see.

  205. and what about native execution? by FIGJAM · · Score: 1

    If the alleged Transmeta CPU translates instructions of other architectures faster than the original processor, surely there must be native code that will run xx times faster without the need for translation...

    --
    Do your best, hope for the best, suspect the worst.
  206. Now we know what Transmeta does! by KDubuisson · · Score: 2

    Well now we know it! And what an unbelievably brilliant idea that will make them the next Intel. For the lay persons, what they are making is the combination of a new microprocessor and BIOS that interprets X86 commands and runs them on a RISC processor. This combination works as a hardware accelerated DOS/Windows emulator. Their claim that it runs faster than Intel's systems makes sense. It's like having two processors that are cheaper than the Pentium that run faster than the Pentium. I have two questions: When is the IPO and when do I get my hands on one of these? I imagine that the error handling will do what Microsoft and Intel have never accomplished...revert back to "last good state" when an error is generated. Oh my!!!!!!

    --
    A freak and lovin' it!
  207. Re:Here's what TransMeta is up to by Anonymous Coward · · Score: 0

    So far you're the only one that has "got it".

  208. Re:Wait... by webslacker · · Score: 1

    So... this is more support for the theory that they're developing a chip that can run binaries of any platform? Hold on while I write a check to Dr. Torvalds for one of these babies!

  209. Re:Must be crypted... by Don+Sample · · Score: 1

    That's because it is written in Patentese.

    It looks like English, but it's not. I'm not even sure it's a language.

  210. My take on this thing. by bogamo · · Score: 2

    It sounds to me like they have 1) A fast processor that speaks its own language. 2) Some device that translates code from some other instruction set on the fly and independent of the first processor. 3) Once the translation is deemed to be correct, the original is tossed. What this means: They can run any instruction set they like as fast as they want because the main CPU is not doing the translation.

    --
    Check out TrailRegistry.com, my hiking site, Maps, altitude pr
  211. Huh? by scumdamn · · Score: 4
    "circuitry for permanently storing memory stores temporarily stored when a determination is made that a sequence of translated instructions will execute without exception or error on the host processor"

    I vote for it being the random patent generator. My favorite part of the whole soliloquy is
    permanently storing memory stores temporarily stored

    you can't beat that! Maybe they're really working on an optical processor and wanting us to think they're working on a universal processor that'll run any other processor's code. Good one, Linus (and others), but what's it really do?

  212. They deserve to read it by David+Jensen · · Score: 2

    They write this stuff everyday.

  213. Sure Looks Related To Their Other Patents by Christopher+B.+Brown · · Score: 3
    The latest patent surely looks related to the other patents previously awarded...

    The basic idea for all of the patents has been to provide mechanisms to allow one to:

    • Create a new CPU that uses one instruction set;
    • That CPU is emulating the instruction set of some other CPU ( Oh, Say, Perhaps IA-32... );
    • The patent provides for some scheme whereby instructions are run in some sort of "emulation mode," where they try to execute in a sort of abeyance...
    • The system then seeks to detect situations where the emulation starts going astray, and provides mechanisms for "coping with this error."
    The various patents have involved that mechanism for coping with the errors, with an attempt to construct ways of quickly working around them.

    This parallels the notion of Lagrangian Relaxation, where you take a problem, with various restrictions, and relax those restrictions. In exploring the solution space, the system will find solutions that aren't in the feasible solution space of the (unrelaxed) problem.

    In the case of Lagrangian Relaxation, the way of coping with that is to associate values with the objective function that penalize infeasible solutions, thus encouraging the system to head towards feasible solutions.
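
    For readers unfamiliar with the term, the textbook form of the idea in the two paragraphs above looks roughly like this. The symbols f, g, X, and lambda are generic optimization notation, not anything taken from the patent.

```latex
\begin{align*}
\text{Original problem:}\quad
  & z^{*} = \min_{x \in X} f(x) \quad \text{subject to } g(x) \le 0, \\
\text{Relaxed problem:}\quad
  & L(\lambda) = \min_{x \in X} \bigl[\, f(x) + \lambda^{\top} g(x) \,\bigr],
    \qquad \lambda \ge 0, \\
\text{Bound:}\quad
  & L(\lambda) \le z^{*} \quad \text{for every } \lambda \ge 0 .
\end{align*}
```

    The penalty term is positive exactly when a constraint is violated, which is what pushes the search back toward feasible solutions; for any feasible x the term is at most zero, which gives the bound.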

    In the case of Patent 5958061, the "relaxation" is that the system performs the emulated instructions, modifying a temporary memory store, and rolling back when it hits cases where the preliminary emulation results in errors on the host processor.

    Patent 5832205 concentrated, in contrast, on the apparatus to detect a failure of speculation.

    --
    If you're not part of the solution, you're part of the precipitate.
  214. FPGA - Field Gate Processors by Lucidity · · Score: 1

    They are making Field Gate Processors, they aren't new but they are very cool. http://bersj.www.media.mit.edu/~vmb/papers/chidi99_abs.html http://www.atmel.com/atmel/products/prod3.htm Basically it is a fully configurable CPU that can be programmed on the fly to fully dedicate itself to a single goal and complete it very quickly. Your standard Intel is designed for general number crunching and nothing much else in particular. But if you had control over what each register and logic gate did you could make your processor totally dedicated and streamlined to complete a single task. As well as being programmable, the cache is spread out so that each logic gate has its own cache rather than one lump of cache for the whole board; this also speeds things up.

    --
    ~`'`~-,_,-Jason Wylie-',_,-~`'`~
    1. Re:FPGA - Field Gate Processors by Anonymous Coward · · Score: 0

      this is almost exactly what it struck me as being - not necessarily FPGA (which btw is field programmable gate array (right?)) but perhaps something similar: a microprocessor which can be almost entirely reprogrammed - i believe it even says somewhere in the patent that it is more software based than not, but perhaps i'm mistaken.

    2. Re:FPGA - Field Gate Processors by El · · Score: 1

      Correct me if I'm wrong, but wouldn't this only be useful if you had multiple processors? This would be, however, one way of emulating a Pentium faster than the Pentium itself -- breaking up the instructions to operate on multiple processors in parallel. Instead of multiple pipelines on a single processor, why not simply a bunch of cheap processors, with a control processor that reorders instructions and assigns each instruction or group of instructions to a specific processor. Speculative branching? Have a CPU follow each branch! Sounds simple in theory, but extremely difficult to make work in practice.

      --

      "Freedom means freedom for everybody" -- Dick Cheney

    3. Re:FPGA - Field Gate Processors by Lucidity · · Score: 1

      A normal field gate processor is designed to work in conjunction with a normal CPU. But as for having multiple CPUs, I have to say again: A standard CPU is designed for general purpose number crunching and is not designed for specific tasks; if they were, they would be a great deal faster, and that's why FPGAs are going to exist. They can be configured to handle a single task MUCH faster than a single (or a bunch) of Pentiums ever could. That's why the crypto box can crack codes in under a second; it can't do much else, mind you, but it can crack DES in a fraction of the time 1 or even 100 Pentiums could.

      --
      ~`'`~-,_,-Jason Wylie-',_,-~`'`~
    4. Re:FPGA - Field Gate Processors by Lucidity · · Score: 1

      If I am anywhere close to being right it is probably similar to a Field Gate Array, but most probably closer to a configurable microprocessor. Even though IMO they are both similar concepts.

      --
      ~`'`~-,_,-Jason Wylie-',_,-~`'`~
  215. Semi-layperson's view by Supergrass · · Score: 1

    (disclaimer: I haven't read all of the document yet, so there may be errors)

    Based on the abstract and skimming the first few pages, it looks like the patent is about temporarily buffering memory stores generated by instructions from a different instruction set. (and then throwing them out if the instruction sequence is determined to cause an exception or other bad stuff)

    Doesn't sound like this is for translating the instructions themselves, but I could be wrong. Later on they talk about prior art of emulation, and how being able to reorder instruction execution (and keep track of memory usage by instructions executing out-of-order) will allow for much faster emulation of other processors.

    Oh, and there's also mention of translation of other instruction sets to a native set. Sounds like on-the-fly translation of instruction sets to me...

    This would seem to be nearly definitive proof of the validity of the rumors surrounding Transmeta. Bring on world domination. :)

    --
    Wherever there's a will, there's a motorway.
  216. Just wondering by TheRain · · Score: 1

    Isn't it a more common strategy to patent a product idea as early as possible in development to beat out other companies that may be developing similar technologies?

    --
    Please help! I'm stuck inside my virtual reality headset!
    1. Re:Just wondering by Mr+Z · · Score: 1

      Yes. The USPTO isn't exactly fast, and Transmeta hasn't been around all that long. So, I'm wagering the patents that we're seeing now were filed very early in Transmeta's life, and may have nothing to do with what Transmeta is doing today.

      I know someone who works at a startup. While their basic technology hasn't changed, their target and application, and everything they wrap around it has as they've gone looking for money. Transmeta isn't much different.

      Remember... Moore's Law keeps marching on. A technology that's neat and innovative today is scrap a couple years from now, so you have to change your approach with time if you end up taking too long. With a startup, that means the possibility of having to do something entirely different than you originally started with.

      --Joe
      --
  217. Other features it might have by El · · Score: 1
    1) Asynchronous execution of instructions -- instead of requiring an integral number of clock cycles, each instruction takes as long as it takes, then passes control to the next instruction.

    2) FPGA-like microcode that is changeable on the fly. This wouldn't really be useful unless you had a bunch of these puppies running in parallel; then you could microcode one to do DCTs, one to emulate an x86, one to execute Java byte codes, etc.

    --

    "Freedom means freedom for everybody" -- Dick Cheney

  218. I know what it does by Anonymous Coward · · Score: 1

    It's a dessert topping and a floor cleaner, all in one.

  219. Explanation in two words.... by Anonymous Coward · · Score: 0

    Uncrashable computer.

  220. What it means... by KilobyteKnight · · Score: 1

    "Can someone explain to me what it means?"

    I'm not sure, but I think it means I need to go ahead and mortgage my house in case they have an IPO soon.

    --
    When will Windows be ready for the desktop?
  221. Reconfigurable logic processor by mvw · · Score: 1
    We are building reprogrammable chips. And the software that reprograms them. This makes 'em faster (like how a 3d card is faster at graphics, but sucks at spreadsheets) not by being faster but by reconfiguring on the fly and acting like dedicated hardware.

    I am sceptical. Reminds me of the dilemma with parallelizing compilers.

    Because the dedicated hardware design, like that of a graphics card, is the result of careful human analysis of a special problem. So you would need a very intelligent "reconfiguration compiler" that analyses a given assembler program and translates it into an optimized logic gate configuration.

    For the shortest programs, the assembler instructions, the processor designers did this optimal casting into logic, when they implemented the instruction set.

    In contrast, a reconfiguration compiler could have a larger window, covering a few dozen instructions, and might squeeze out some extra cycles.

    I have no clue if that would yield remarkable speed improvements.
    Anyone tested such ideas on emulated reconfigurable hardware?

  222. Hardware assists for binary-to-binary translation? by Guy+Harris · · Score: 2
    That patent, plus one of the other patents (mentioned on Slashdot a while ago), seems to suggest that if what they end up building involves the patents they're filing (i.e., assuming those patents don't come from what they were working on at one time, but decided not to build), then it may be a processor with an instruction set different from that of other processors, plus something (quite possibly software, not necessarily hardware in the processor, as some appear to have inferred) that translates other instruction sets into the Transmeta instruction set, and does so "speculatively", in that it assumes that the translated code won't get a fault.

    If the code does get a fault ("exception or error" - this could be an exception without being an error, e.g. a page fault), then anything that code did "speculatively" and that wouldn't have been done by the untranslated code had it gotten that exception hasn't made any permanent state change, so the fault cancels/backs out any uncommitted state changes and presumably traps to software that would do whatever is necessary to do what the untranslated code would have done.

  223. WAKE UP by johnjones · · Score: 2

    i'm fed up of this

    its for this > its for that

    they are producing a system that runs a code template !
    (if you dont know work out how you can add ppc to a AS400 and not recompile)

    the product will have multithreading in

    OK

    peace enough of this guessing

    john
    a poor student @ bournemouth uni in the UK (a deltic so please dont moan about spelling but the content)

    1. Re:WAKE UP by Guy+Harris · · Score: 4
      (if you dont know work out how you can add ppc to a AS400 and not recompile)

      Much of the audience may not be familiar with AS/400's, so that's not necessarily much of a hint.

      System/38 and AS/400 compilers generate code in a high-level pseudo instruction set; the low-level OS kernel, when told to run one of those programs, translates it into the native instruction set and runs that. (See Frank Soltis' Inside the AS/400; go to the 29th Street Press's home page and select "General Interest" under "*** ALL AS/400 TOPICS ***", and then look for that book, which they claim to have online - the URLs on that site look depressingly dynamically-generated, so I'm loath to make a direct link.)

      This let them change the native instruction set from the apparently 360-flavored "IMPI" to an extended PowerPC instruction set without requiring people to recompile programs (unless they tossed out the pseudo instruction set code to save disk space).

      From the various Transmeta patents, it sounds as if they're building a chip intended to be used in an environment making use of binary-to-binary translation, as the S/38 and AS/400 do, but it's not at all clear that they intend to use B2B translation in exactly the same fashion - they appear to be targeting existing low-level instruction sets, e.g. x86, rather than some high-level instruction set like the S/38 and AS/400 "MI".
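      A rough sketch of the translate-on-first-run idea described above, assuming a toy "portable" instruction list and string-tagged native code. The `Translator` class, its method names, and the ISA labels are all made up for illustration and are not how the S/38 or AS/400 kernel actually does it:

```python
class Translator:
    """Toy model: programs ship in a portable form, native code is derived."""

    def __init__(self, native_isa):
        self.native_isa = native_isa
        self.cache = {}          # program name -> translated native code
        self.translations = 0    # how many times we actually translated

    def load(self, name, portable_code):
        """Return native code, translating only if no cached copy exists."""
        if name not in self.cache:
            # Hypothetical "translation": tag each portable op with the ISA.
            self.cache[name] = [f"{self.native_isa}:{op}" for op in portable_code]
            self.translations += 1
        return self.cache[name]

    def retarget(self, new_isa):
        """Change the native instruction set; cached native code is dropped,
        but programs still run because the portable form was kept."""
        self.native_isa = new_isa
        self.cache.clear()
```

      The point of the sketch is only that the portable form is the thing you keep; the native form can be thrown away and regenerated when the hardware underneath changes.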

  224. FPGA - Field-Programmable Gate Arrays by Lucidity · · Score: 2

    They are making field-programmable gate arrays (FPGAs); they aren't new but they are very cool.

    http://bersj.www.media.mit.edu/~vmb/papers/chidi 99_abs.html
    http://www.atmel.com/atmel/products/prod3.htm

    Basically it is a fully configurable CPU that can be programmed on the fly to fully dedicate to a single goal and complete it very quickly. Your standard Intel is designed for general number crunching and nothing much else in particular. But if you had control over what each register and logic gate did you could make your processor totally dedicated and streamlined to complete a single task. As well as being programmable, the cache is spread out so that each logic gate has its own cache rather than one lump of cache for the whole board; this also speeds things up.

    --
    ~`'`~-,_,-Jason Wylie-',_,-~`'`~
  225. File Date 1996? by ender- · · Score: 1
    Anyone else notice that this patent was first filed in 1996?!?!

    Ender

  226. Code Morphing..... by Darksky · · Score: 1

    ..sounds kewl, if i was Intel, I would be very worried... the question is, how much is this product gonna cost compared to existing products?

    --
    01101100 01101001 01101110 01110101 01111000 01110010 01110101 01101100 01100101 01110011
    1. Re:Code Morphing..... by El · · Score: 2

      It's gonna be a lot cheaper per CPU, but you'll need a bunch of them to do the same work -- no more single-processor systems! Remember, in theory, a large group of 6502's can emulate a PIII faster than a PIII... if you can manage to write the software to coordinate the breaking up and reordering of instructions. Finally, a true RISC machine -- why have multiple instruction pipelines on a single processor, when you can have multiple processors with single pipelines?

      --

      "Freedom means freedom for everybody" -- Dick Cheney

  227. Of Registers and other less obvious things by the_tsi · · Score: 2

    So I got this feeling from reading it that this chip must have a shitload of registers (I mean, everything they're doing is all about avoiding memory i/o it seems), so I started grepping... it's certainly very register-based. But here was the good part:

    > These improvements include a gated store
    > buffer and a large plurality of additional
    > processor registers

    "large plurality" sounds to me like a whole boatload more than the ones we used in those silly MIPS simulators to learn assembly theory and certainly more than any x86 chip I've seen.

    I also saw some references to VLIW conversion, so I did another grep; I think this is one of the best paragraphs, and it's not in Greek...

    ----------
    FIG. 2 is a diagram of morph host hardware designed in accordance with the present invention represented running the same application program which is being run on the CISC
    processor of FIG. 1(a). As may be seen, the microprocessor includes the code morphing software portion and the enhanced hardware morph host portion described above. The target
    application furnishes the target instructions to the code morphing software for translation into host instructions which the morph host is capable of executing. In the meantime, the target
    operating system receives calls from the target application program and transfers these to the code morphing software. In a preferred embodiment of the invention, the morph host is a
    very long instruction word (VLIW) processor which is designed with a plurality of processing channels. The overall operation of such a processor is further illustrated in FIG
    --------

    This pretty much gives away what people have been saying since the beginning. Morphing hardware AND software elements that work in conjunction to provide (drum roll) a fast as HELL computer. And it will run software we already have. And pretty darn near anything you throw at it. Want to be a Playstation for a day? How about an O2? Now switch back to Pentium II so you can type up that report and then become a G4 so you can make the graphics to insert in the presentation that accompanies it.

    This thing will be doing the code morphing in parallel (which is what this invention seems to be... the morpher) and then run it on another fast chip that's related to one of the earlier patents. And it will all be controlled by a little driver that turns into a "layer 1 vmware" (now that our computers will need an OSI layer model... :P )

    -Chris

  228. YES, that's what I got by berteag00 · · Score: 2

    DETAILED DESCRIPTION, Paragraph 1:
    "The present invention overcomes the problems of the prior art and provides a microprocessor which is _faster_ than microprocessors of the prior art, is capable of running _all the software_ for _all the operating systems_ which may be run by a _large number of families of prior art microprocessors_, yet is less expensive than prior art microprocessors." (my emphasis)

    in other words, the Holy Grail of computer architecture: processor emulation that's faster than the native processors. yes, sounds too good to be true, but at least it won't be vaporware... ;-)

  229. Summary by Josh+Turpen · · Score: 1

    It's a safe L1 cache for their processor emulator.

    On a side note, why spend all of this effort to be x86 compatible when you have the source code? IMO open source software is going to make hardware architecture very competitive. We can finally drop legacy binary compatibility and go for pure performance architectures. Using 'make' as your install program has its advantages.

    --
    --- A Jesus Fish eating a Darwin Fish only proves Darwin's point.
  230. No you're all wrong: it's for Emacs by kzinti · · Score: 1

    It's a way to run the Emacs lisp engine on any host CPU! Directly on bare metal! The good folks at Transmeta have finally realized that what the world needs is a better Emacs, with hardware assist. Sort of a turbo-Emacs. Linus has often said that Emacs is evil... well, now we know how he plans to fix it. Emacs is dead! Long live Emacs!

    --JT

    1. Re:No you're all wrong: it's for Emacs by Imperator · · Score: 5

      Actually, it's a way to run any application for any processor and any OS, straight from Emacs. Unrelated planned features for Emacs include improved SMB support, an extremely light-weight httpd, and preliminary support for USB child-rearing devices.

      --

      Gates' Law: Every 18 months, the speed of software halves.
  231. Wow. What an idea. by walters5 · · Score: 1

    Let's see what facts we have to deal with here:

    -Linus Torvalds works for Transmeta. He says what he does has a lot to do with the future of Linux.
    -a processor that can translate other processor languages and code into its own and run them much faster than the others can process their own code alone
    -the very future of linux
    -one very on the edge head of the pack company
    -a great deal of secrecy
    So here's my point:
    Wouldn't the perfect compilation of all of these facts be the invention of a processor that, when combined with a highly specialized operating system (the future of Linux), can run the programs of any operating system made for any processor language faster than we've ever seen before? As in running Mac apps, Win apps, Sun apps, even Palm OS apps and everything else imaginable, all from Linux, faster than ever believed. Wow. I hope that's what's in store for us. And it's even supposed to cost less!

  232. puff of smoke? by Anonymous Coward · · Score: 0

    After so much of secrecy, Transmeta should not come out like "mad scientist running out of his lab with a puff of smoke from his lab's windows"

  233. Re:What about the "permanent bit" by Capt+Dan · · Score: 2

    Yes yes. But what about circuitry for permanently storing memory stores temporarily stored when a determination is made that a sequence of translated instructions will execute without exception or error on the host processor

    Wha'dup with the "permanently storing memory stores temporarily stored?" It's pretty much decided that temporarily stored implies a cache. Take the code, translate it. Store it in the cache until it is verified, then execute it.

    Sounds like their temporary cache of instructions can be sent somewhere else for storage once they have been verified.

    Hmmm. HHHMMMMmmmmmm.....

    So does that mean it can execute, say x86, code in emulation, and at the same time translate into native transmeta opcode in order to be run natively at a later time? It's doable. I can picture a basic flow diagram circuit in my head right now.

    --
    Sig:
    Barbeque is a noun. Not a verb.
  234. it has to do with fast translation and exceptions by taniwha · · Score: 2
    Basically you want to be able to translate, say, x86 code into some other native instruction set and not have to worry (too much) about x86 exception semantics or context. You do the high-performing translation for the common case and have your hardware pick up any exceptions. Meanwhile your hardware does not commit any memory system changes (the store buffer) until you know your code fragment ran without exceptions; then you commit the changes to memory. If you take an exception, you back off and emulate the code fragment, probably x86 instruction by instruction (that way you find out which x86 instruction caused the problem etc. etc., and the x86 state looks clean to the x86 program).

    Chances are the code fragments are basic blocks (between branches etc)
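    To make the commit / back-off idea concrete, here is a minimal sketch in Python. It assumes a made-up three-instruction guest machine (`mov`, `div`, `st`) and invented names (`GuestFault`, `step`, `run_block`); nothing here comes from the patent, and the toy has no loads, so it skips the part where reads would have to look in the store buffer first:

```python
class GuestFault(Exception):
    """Raised when a guest instruction would fault (hypothetical)."""


def step(op, regs, store):
    """Execute one toy guest instruction; store(addr, value) performs writes."""
    kind = op[0]
    if kind == "mov":            # ("mov", reg, constant)
        regs[op[1]] = op[2]
    elif kind == "div":          # ("div", dst, src): dst //= src
        if regs[op[2]] == 0:
            raise GuestFault("divide by zero")
        regs[op[1]] //= regs[op[2]]
    elif kind == "st":           # ("st", addr, reg): memory[addr] = reg
        store(op[1], regs[op[2]])
    else:
        raise ValueError(f"unknown op {kind!r}")


def run_block(block, regs, memory):
    """Run a block speculatively; return index of the faulting op, or None."""
    saved_regs = dict(regs)      # checkpoint of register state
    pending = {}                 # store buffer: nothing reaches memory yet
    try:
        for op in block:
            step(op, regs, pending.__setitem__)
    except GuestFault:
        # Back off: drop the buffered stores and restore the registers.
        regs.clear()
        regs.update(saved_regs)
        # Slow path: redo one instruction at a time, storing directly,
        # so the state is exact at the faulting instruction.
        for index, op in enumerate(block):
            try:
                step(op, regs, memory.__setitem__)
            except GuestFault:
                return index
        raise RuntimeError("replay did not reproduce the fault")
    memory.update(pending)       # commit: the block ran clean
    return None
```

    In the clean case memory is touched exactly once, at the end. In the faulting case the fast attempt leaves no trace, and the slow replay stops with registers and memory exactly as they were just before the bad instruction.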

    I think I've seen this idea by another name before so I'd guess there's prior art - but hey it's a patent you can read anything the lawyers can get away with into it ....

  235. Awright!! by Anonymous Coward · · Score: 0

    Now we can run Windows on a Sparcstation!!

    1. Re:Awright!! by Anonymous Coward · · Score: 1

      I think not.

      I think it means that there will be another chip capable of running other instructions.

  236. Another possibility by squeakphd · · Score: 1

    Granted, this would be an excellent form of BSOD-prevention for Windows, but there may be more to it than that. A few of the more subtle points in the patent could also be used as a sort of virus defense. For instance, it mentions virtual hardware devices and suggests that the emulator would know some things about how hardware will behave when certain I/O instructions are executed. So, in addition to being able to prevent exceptions, it could also make sure a program doesn't do anything naughty to the hard drive. I don't think this is exactly what they had in mind, but it's something that occurred to me as I was reading the patent (and before my brain began to bleed from the redundancy).

    1. Re:Another possibility by mvw · · Score: 1
      Granted, this would be an excellent form of BSOD-prevention for Windows

      I doubt that the operation leading to a BSOD consists just of a couple of processor instructions, whose rollback would resolve the situation.

  237. The Prior Art section is excellent... by Jack+William+Bell · · Score: 2

    It is cogent, well written and covers a lot of ground. Someone really did their homework on that!

    Much of the rest of the patent application is as deliberately dense as they can make it. Including one run-on sentence that would take me three huge breaths to speak aloud :-)

    For information on what this thing actually does, read the 'DETAILED DESCRIPTION' section. One interesting fact gleaned from there in a quick reading is that the emulation co-processor is called a 'morph host' and it apparently executes some kind of special opcodes used for emulation. So to do the emulation you write 'code morphing software' that translates incoming instructions to the 'morph host' instruction set. Very Cool! And, of course, the 'transactioning' and error checking stuff noted in prior posts.

    It is looking more and more like the early rumors of a Transmeta 'emulate anything' design were on the nose...

    Jack

    --
    - -
    Are you an SF Fan? Are you a Tru-Fan?
  238. What they are creating by Anonymous Coward · · Score: 0

    A processor that will create long meaningful sentences, but which are extremely confusing and cryptic. From this demo it looks like things are going very well.

  239. What if... by nsanch · · Score: 1

    From the comments I've been reading it seems like the patent is for a processor that would translate instructions for other processors into its own instruction set, make sure the translated instructions would work, and if so run them.

    What I want to know is, what would it do if the translated instruction would cause an error? Would the processor just not carry the translated instructions out? If so, that would seem to be quite a flaw. Maybe I'm missing something incredibly obvious here, but maybe not. Can anybody answer my question?

    --
    I never did like to do anything simple when I could do it ass-backwards. - Neuromancer
  240. Have we really thought this through yet?? by pgm · · Score: 3

    OK, after reading a ton of messages, I'm thinking about this whole instruction translation issue. If Transmeta is making a "co-processor" that would translate instruction sets, and _THIS_ thing can store the existing state of the processor then....

    Couldn't they theoretically (sic) be working on a system that would allow you to run MULTIPLE instruction sets inside a single OS?? The implications would turn the existing software industry (of which I am a part) on its ear!

    Could we actually have a box running some form of unix, and actually be able to run ANY application natively on it - no matter what OS it was written for?? Think about running a BeOS app next to a Win32 app, next to an application compiled for i386 Redhat! WOAH.

    If this is even close to what actually exists in Transmeta's labs, then we are in for a serious roller coaster over the next couple of years!

    Quivering with anticipation...
    p.


  241. Aha by mcc · · Score: 0

    It appears that transmeta has patented the run-on sentence.

    Perhaps they are working on a microchip that will translate x86 instructions into difficult-to-read english in real time.

    (score: 0 redundant)

    transmeta will let you know what they're doing when they're ready. have patience. getting all worked up and making wild guesses won't really accomplish anything. :)

  242. Bare bones translation by brennanw · · Score: 1

    It's a thing that holds stuff for another thing, until the other thing is ready to act on the stuff, then the first thing sends the stuff to the second thing, and it's done.

    Yes, I am a tech writer. :)

    --
    Eviscerati.Org: All Hail the Eviscerati
  243. exception trapping and the Alpha by GnrcMan · · Score: 2

    If the code does get a fault ("exception or error" - this could be an exception without being an error, e.g. a page fault), then anything that code did "speculatively" and that wouldn't have been done by the untranslated code had it gotten that exception hasn't made any permanent state change, so the fault cancels/backs out any uncommitted state changes and presumably traps to software that would do whatever is necessary to do what the untranslated code would have done.

    That's exactly what I got out of that part. And it sounds pretty cool. This particularly would have applications in multiprocessing systems.
    On the Alpha, we handle exceptions using something called trap barriers, which is a software method of handling this sort of thing. What happens is a fault appears to issue from the trapb and you are left to your own devices(from a compiler perspective) to discover where the exception occurs. It isolates the exception down to what's called a "trap shadow". This translates to a pain in the ass because we don't know precisely where a fault issued from, only the "shadow". Multiproc's complicate this mess further. This makes for interesting, but complicated compiler development.

    Moving this to hardware, OTOH, would greatly simplify things, especially when emulation adds a layer of obfuscation.

    That's why the exception handling part is what I zero'ed in on. It sounds really neat.

  244. FPGA? by zmooc · · Score: 1

    I didn't understand very much of the patent, but it sounds a bit like a description of a Field Programmable Gate Array (FPGA) processor. That would fit the rumors I read in the replies to a post about the FPGA processors which Starbridge Systems has already built. Their description is really cool; the slashdot-post talks about their "$1000 computers 350 times as fast as a pII-350". That may be a bit much, but the concept is extremely nice.

    --
    0x or or snor perron?!
  245. Risc and cisc computing on one system? by Nur-Sothca · · Score: 1

    Doesn't this basically also mean that this one, ehm, machine, can use different CPU instruction sets? Now I'm not a CPU workings expert but there are multiple instruction sets around; most systems use CISC (or something, as far as I know) and some are RISC. As I see it this thing can perform RISC and CISC instructions by translating these instructions to the native instruction set the machine uses itself.

    Am I making sense? It all works out in my mind ;-)

  246. Transmeta Patent by libertas · · Score: 1

    A software emulator translates instructions from the emulated processor to an equivalent sequence of instructions that can be run on the target processor. However, the emulator must also generate instructions that check for conditions which would have resulted in exceptions on the emulated processor but may not cause exceptions on the target processor. These conditions would have been automatically checked by the hardware on the emulated processor. This checking can be a major performance bottleneck, both because of the additional instructions and because they may increase serialization. If you have a machine that allows multiple instructions to execute in parallel - for example a VLIW machine or one with multiple execution units - then it would be nice to run the exception checks in parallel with the instruction that does the actual work. Obviously, if you do this and detect an exception you don't want the results of the instruction doing the work to be stored. This patent covers a hardware feature to suppress these stores.
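    A small illustration of the two shapes described above, assuming a hypothetical guest rule that a store beyond some `limit` must fault. The function names are invented, and since Python runs sequentially the second version only models the idea of doing the work and the check independently, with the store held back until the check result is known:

```python
def emulate_store(memory, limit, offset, value):
    """Serial emulation: explicit check first, then the store."""
    if offset > limit:               # check the guest hardware would do for free
        raise IndexError("guest limit violation")
    memory[offset] = value


def emulate_store_gated(memory, limit, offset, value):
    """Model of running the work and the check side by side.

    The store only goes to a temporary buffer; the check result decides
    whether the buffered store is released or suppressed.
    """
    buffered = (offset, value)       # the "work": store is only buffered
    faulted = offset > limit         # the "check": does not depend on the work
    if faulted:
        raise IndexError("guest limit violation")   # buffered store is dropped
    memory[buffered[0]] = buffered[1]               # release the store
```

    Both versions leave memory untouched when the check fails. The difference is that in the second one nothing forces the check to finish before the work starts, which is the property a machine with several execution units could exploit.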

  247. No longer a secret by Anonymous Coward · · Score: 0

    This simply means that Transmeta has finished designing a CPU that has its own native operating mode, and can translate code for other chips to its own native mode. In short, this chip should run x86, Alpha, G4, PPC, Sparc, etc. software without emulation. You could have a dual boot system running x86 Linux and PPC Linux. Hell, with that "multi OS simultaneously" software that's available, you can have PPC Linux and x86 Linux running at the same time... This chip could send Transmeta to the top over Intel and AMD. Or if they license production to another company, then that company will probably have an antitrust suit to deal with in a few years. This has the potential to really shake up the computing world.

  248. Clustering? by Christopher+B.+Brown · · Score: 2
    My take on this was that there would be part of a CPU devoted to this "multi-tasking." After all, in order for the communications to take place really quickly, it all needs to be on one chip.

    You do, however, show an interesting notion; if the patented matters reveal a "protocol" for allowing this emulation, it may make it plausible to have multiple "little processors" doing work, and getting to change Real Memory when it makes sense to commit the work.

    It would certainly be neat if this were amenable to putting a bunch of "little processors" working together. The communications takes place at a much lower level than Beowulf; it may even be at a lower level than is done with SMP.

    --
    If you're not part of the solution, you're part of the precipitate.
  249. Next Patent by Anonymous Coward · · Score: 0

    Let's watch Transmeta to prevent them from patenting Coventry's idea :-)

  250. a kind of smart cache? by LabRatty · · Score: 1

    I see it a bit like having your L2 cache with a microprocessor acting as a realtime translator attached. Things get sucked into the L2 cache translated to X and passed to the X speaking CPU. When the CPU is done it writes back to L2 as normal and the microprocessor deals with caching or memory writeback. By using a programmable array as the L2 cache translator the language can be changed as required if it could not store more than one at a time. Pushing the 'what if's to the limit, it could even be changed as easily as changing consoles, an x86 binary running in one window, a PPC binary in another, as you move between windows the OS tracks which language needs to be spoken and informs the programmable array.

  251. Re:Hypocrites! by Lucidity · · Score: 1

    Considering the lack of facts surrounding this topic that seems a bit extreme. I mean this has nothing to do with the "community" since this is not software and most of us are not into designing and implementing chip designs.

    --
    ~`'`~-,_,-Jason Wylie-',_,-~`'`~
  252. Speculative Store Queue with Checkpointing by Mr+Z · · Score: 1

    The invention appears to be the following:

    • A hardware queue which holds values that were stored by speculatively-executed instructions,
    • Logic which commits these stores to memory when the speculatively-executed instructions are retired,
    • Logic which discards these stores when the speculatively-executed instructions are killed,
    • Logic which maintains "checkpoints" or "tags" which delimit/identify groups of stores according to the speculative instruction flow, so that they may be committed or killed in these tagged groups.

    The gist of this is that you can batch up groups of speculative stores, and either commit them or discard them in groups. This is important in an instruction translation environment, because a single emulated instruction may generate several stores when translated to a set of "native" instructions.

    The hardware natively detects faults at native-instruction granularity, whereas the exceptions and errors need to be handled at the original emulated instruction granularity. So, the hardware needs a method for tagging writes that were due to a specific emulated instruction sequence that was speculated, so that it can kill the writes together and perform the appropriate exception processing. By doing this, you can allow the native instructions to execute in a more arbitrary order, since you can easily weed out the stores you wanted from the stores you did not.
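    The tagged-group behaviour in the list above can be sketched in a few lines of Python. This is only a toy model under my own assumptions: the `TaggedStoreQueue` class and its method names are invented, memory is a plain dict, and a tag stands for "all the stores belonging to one emulated instruction sequence":

```python
from collections import deque


class TaggedStoreQueue:
    """Toy speculative store queue; stores are grouped by a tag."""

    def __init__(self, memory):
        self.memory = memory
        self.queue = deque()     # entries: (tag, address, value), oldest first

    def store(self, tag, address, value):
        """Buffer a speculative store; memory is not modified yet."""
        self.queue.append((tag, address, value))

    def commit(self, tag):
        """Retire a group: write its stores to memory in program order."""
        for entry_tag, address, value in self.queue:
            if entry_tag == tag:
                self.memory[address] = value
        self._drop(tag)

    def kill(self, tag):
        """Discard a group: its stores never reach memory."""
        self._drop(tag)

    def _drop(self, tag):
        self.queue = deque(e for e in self.queue if e[0] != tag)
```

    With this, several native stores that came from one emulated instruction share a tag, so they can be committed or thrown away together no matter what order the native instructions actually ran in.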

    --Joe
    --
  253. Transmeta = change inbetween by Anonymous Coward · · Score: 0

    That is what Transmeta would be translated to. So I think they are trying to make a hardware emulator from i386 to Alpha and SPARC.

  254. Doesn't this violate theory? by Alik · · Score: 1
    It's been a few months since I took theory of computing, but I seem to recall that one of the "undecidable" problems is determining whether a given Turing machine (which is, in theory, equivalent to any given processor) will be able to successfully operate on a given input string (which is, in theory, equivalent to any set of instructions to that processor). This patent claims, if I read it correctly, that it can somehow examine instructions and determine whether or not the host will be able to execute them without an error. That seems to violate some fairly well-accepted math. You could get around it if the host processor was equivalent to a DFA, I think, but then you'd have a processor that could only do regexps.

    Can someone explain to me why I'm wrong? (I'm not very good at abstract math, so I'm quite sure I'm wrong.)

    1. Re:Doesn't this violate theory? by GnrcMan · · Score: 1

      Not quite...It can execute instructions without checking for exceptions then rollback state to match where the exception occurs.

    2. Re:Doesn't this violate theory? by Anonymous Coward · · Score: 0

      branching instructions are required for this to be a problem, given, say, three instructions that follow sequentially, there is no problem determining the final state of the machine when the last one is finished executing... i'll bet a branch would be considered an "exception" for this processor

  255. It means... by Yodalf · · Score: 1

    First of all, they already have a working prototype:
    ...
    "As a comparison, one embodiment of the present invention designed to run all available X86 applications is implemented by a morph host including approximately one-quarter of the number of gates of the Pentium Pro microprocessor yet runs X86 applications substantially faster than does the Pentium Pro microprocessor or any other known microprocessor capable of processing these applications."

    This might mean the death of AMD...

    Also, on another note, since the morpher is just (very low level) software, it could be open source if they wanted it to...

    Can you imagine: an actually useable open source _processor_ ... That should give Intel (& others) something to think about.

  256. Compiler Support Issues by gupg · · Score: 1
    They do not need compiler support to convert from other processor code to their processor code .. I agree with this.

    However, any new aggressive architecture requires a lot of compiler work. To make your new applications fly, you would want to compile them using the native processor's compilers. BTW, it is probably a VLIWish processor - my guess ;-)

  257. what it REALLY does by neilv · · Score: 2
    First, if you go to the patent office page again, and hit the next button, you'll see that they have a number of patents, the sum total of which is not a processor, but a computer.

    There's a somewhat interesting write up on CNN (from the time of the first patent, Nov. 98). There seemed to be some posts that missed who Transmeta really is - it's backed by Paul Allen, who also funds Interval (another think tank). His whole goal has been to recreate the PARC days, when really smart people could team up and work on just about anything they wanted (the result we all know, since we're using it).

    Transmeta's computer does at the processor level what JIT and Java do for software. Java lets you write one program and run it on many OS's. JIT speeds that process by pre-translating java byte codes into native code.

    The Transmeta box will allow a chip manufacturer to make a single chip that will run any OS, and (by caching instruction conversions, as well as memorizing repeated instructions) actually run them all faster than the zillion chips AMD, Intel and the rest are cranking out.

    Think about it: Universal hardware, universal applications, and a plethora of invisible middleware.

    Welcome to the future. You heard it here first. Too bad you can't buy stock in Transmeta....

    $.02

  258. Re:What about the "permanent bit" by GnrcMan · · Score: 1

    Take the code, translate it. Store it in the cache until it is verified, then execute it.

    I'm not sure that's quite right. What I gleaned was: Take the code, translate it, execute it, cache it at the same time, when a trap barrier of some sort is hit, verify no faults. If there are faults, roll back based on the cache to the exact point of the fault. This is very important! Read this comment for more information.
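
    A minimal sketch of that execute-then-verify scheme, under loud assumptions: the GatedStoreBuffer class, the "negative address means fault" rule, and run_block() are all hypothetical and only model the idea of holding stores back until a barrier. Nothing here is the patent's actual mechanism.

```python
# Hypothetical model: stores are buffered during execution and only reach
# "real" memory at a commit barrier, if no fault was seen in the block.

class GatedStoreBuffer:
    def __init__(self, memory):
        self.memory = memory   # committed state: dict of address -> value
        self.pending = {}      # speculative stores since the last barrier
        self.fault = False     # set when something in the block went wrong

    def store(self, addr, value):
        if addr < 0:           # made-up fault condition for the example
            self.fault = True  # noted now, acted on at the barrier
            return
        self.pending[addr] = value

    def commit(self):
        """Barrier: publish pending stores, or discard them all on a fault."""
        ok = not self.fault
        if ok:
            self.memory.update(self.pending)
        self.pending.clear()
        self.fault = False
        return ok


def run_block(buf, stores):
    """Execute a block of (address, value) stores, then hit the barrier."""
    for addr, value in stores:
        buf.store(addr, value)
    return buf.commit()


memory = {0: 1}
buf = GatedStoreBuffer(memory)
first_ok = run_block(buf, [(0, 10), (4, 20)])   # clean block: stores land
second_ok = run_block(buf, [(0, 99), (-1, 5)])  # faulting block: rolled back
```

    After the second block, memory still holds the values from the first commit, so the state at the fault is exactly the last committed state and nothing had to be tracked instruction by instruction.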

  259. Adaptive Emulation by Spooker · · Score: 1

    From what I can tell of the patent, they are patenting a method of "Adaptive Emulation" which, as others have said, would allow the processor to step back in the run when an error occurs and try another state until the code completes without error.

    It could be an Intel-killer if it can be realized. As we saw with FX!32 from DEC, the technology is there... but putting it in the hardware like this would make FX!32 look like my Palm Pilot running the Game Boy emulator trying to host Sonic the Hedgehog.

    You might be able to say that this is the next step toward "real" computer learning capabilities.

  260. More like "cluttering" ;-) *nt* by Anonymous Coward · · Score: 0

    nt=no text

  261. TRANSmeta by Anonymous Coward · · Score: 1

    Now we know what the boys at TransMeta are really up to, don't we! They're making a product that TRANSlates META-languages!

  262. K6 and up use this method by Barbarian · · Score: 1

    As far as I understand, K6's and above have a CISC translator which converts x86 to some custom RISC processor instructions for execution by the real processor.
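
    To make the CISC-to-RISC translation idea concrete, here is a toy sketch. The micro-op names and the decode() function are invented for illustration and are not AMD's actual internal operations; it only shows how one memory-operand instruction can be split into load/compute/store steps.

```python
# Toy decoder: split an x86-style "op dest, src" instruction into simpler
# RISC-like micro-ops. The micro-op syntax is made up for this example.

def decode(instruction):
    op, operands = instruction.split(None, 1)
    dest, src = [part.strip() for part in operands.split(",")]
    if dest.startswith("["):
        # Memory destination: needs a separate load, compute, and store.
        return [
            f"load tmp, {dest}",
            f"{op} tmp, {src}",
            f"store {dest}, tmp",
        ]
    # Register-to-register instructions map to a single micro-op.
    return [f"{op} {dest}, {src}"]


micro_ops = decode("add [ebx], eax")
```

    A register-only instruction such as "add ecx, eax" passes through as one micro-op, while the memory form above expands to three.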

  263. Potential Hardware to Support Binary Retargetting by LL · · Score: 2

    Hmmmm ..... looks like some hardware assist to help binary retargetting. For people not familiar with the concept, take a look at an overview. The concept is sound in that, as ESR points out, 95% of the programming jobs out there involve maintaining old code on old machines. However, if there was a way of abstracting and specifying the hardware characteristics and mapping from one to another, then old binaries could be shifted onto newer and cheaper hardware with less hassle. I can think of cases like old Cray binaries where porting them to a new MPP would be too painful to do manually; some of those timing cases can be really subtle. Given that computer companies are very reluctant to support hardware which isn't current (i.e. not profitable) and others could potentially go belly-up (correct me if I'm wrong, I think IBM is one of the few giants left from the 60's), there is a need to protect the millions of man-years spent on specific packages. Of course, research has shown that retargetting works better with access to the original compiler source :-).

    Given the rate of corporate take-overs, you could quite easily end up running a zillion different systems and lose valuable time trying to consolidate everything.

    Oh well, add this to the speculation pile along with everyone else.

    LL

  264. Transmeta == transform in the middle by Anonymous Coward · · Score: 0

    It is a Hardware emulator that is changing instructions from one architecture to another one.

  265. Re:Hypocrites! by Anonymous Coward · · Score: 0

    Ahh, I see. Because most of us write (or care about) software here, we are against patents ONLY FOR software. Special casing software in the patent system would not be fair. Digital hardware is really, down at a fundamental level, a bunch of logic that transforms its inputs into outputs. Funny, that's exactly what software is. Just because Linus is working at a hardware company does not make it right. Guess what, he can make mistakes too! (Oh, and there's no Santa Claus)

  266. Most important section: by astyanax · · Score: 1

    As a comparison, one embodiment of the present invention designed to run all available X86 applications is implemented by a morph host including approximately one-quarter of the number of gates of the Pentium Pro microprocessor yet runs X86 applications substantially faster than does the Pentium Pro microprocessor or any other known microprocessor capable of processing these applications.

    The fact that this was at least started in '96 explains why they only compare against the PPro.

    This has already been pointed out, but look ma! I can use bold in HTML!
    The fact that they say "any other known microprocessor" is scary to say the least. I'm glad Linus had enough trust in what they're doing to work for them, it sounds like awesome stuff. I just hope this doesn't stay vaporware for very much longer =P.

    If this does come out as a motherboard/processor solution, I hope they're going to use the standard ISA/PCI/AGP combo we know and love.

  267. What's the Application? by Redundant() · · Score: 1

    This sounds a lot like database management with all the transaction rollback features and whatnot.

    The fact that they are passing processor instructions tells me that this is some kind of distributed processing load management tool.

    I used to hear about wasted CPU cycles and how nice it would be if CPU load could be shared amongst many inexpensive distributed processors back in the 80's.

    Since the 90's, though, the CPU hasn't been the bottleneck for any application I've ever heard of.
    Speculation: what apps need mega CPU cycles?

  268. Now that I read it again by Lucidity · · Score: 1

    Now that I read it again, it does look a little more like a template reader or high level decoder. http://slashdot.org/comments.pl?sid=99/09/28/1531230&threshold=0&commentsort=0&mode=thread&pid=69#99

    --
    ~`'`~-,_,-Jason Wylie-',_,-~`'`~
  269. Here's what TransMeta is up to by pabs · · Score: 1
    ...or at least what they're patenting:

    ...From the patent article...
    The present invention overcomes the problems of the prior art and provides a microprocessor which is faster than microprocessors of the prior art, is capable of running all of the software for all of the operating systems which may be run by a large number of families of prior art microprocessors, yet is less expensive than prior art microprocessors.

    Rather than using a microprocessor with more complicated hardware to accelerate its operation, the present invention combines an enhanced hardware processing portion (referred to as a "morph host" in this specification) which is much simpler than state of the art microprocessors and an emulating software portion (referred to as "code morphing software" in this specification) in a manner that the two portions function together as a microprocessor with more capabilities than any known competitive microprocessor. More particularly, a morph host is a processor which includes hardware enhancements to assist in having state of a target computer immediately at hand when an exception or error occurs, while code morphing software is software which translates the instructions of a target program to morph host instructions for the morph host and responds to exceptions and errors by replacing working state with correct target state when necessary so that correct retranslations occur. Code morphing software may also include various processes for enhancing the speed of processing. Rather than providing hardware to enhance the speed of processing as do all of the very fast prior art microprocessors, the present invention allows a large number of acceleration enhancement techniques to be carried out in selectable stages by the code morphing software. Providing the speed enhancement techniques in the code morphing software allows the morph host to be implemented using much less complicated hardware which is faster and substantially less expensive than the hardware of prior art microprocessors. As a comparison, one embodiment of the present invention designed to run all available X86 applications is implemented by a morph host including approximately one-quarter of the number of gates of the Pentium Pro microprocessor yet runs X86 applications substantially faster than does the Pentium Pro microprocessor or any other known microprocessor capable of processing these applications.

    The code morphing software utilizes certain techniques which have previously been used only by programmers designing new software or emulating new hardware. The morph host includes hardware enhancements especially adapted to allow the acceleration techniques provided by the code morphing software to be utilized efficiently. These hardware enhancements allow the code morphing software to implement acceleration techniques over a broader range of instructions. These hardware enhancements also permit additional acceleration techniques to be practiced by the code morphing software which are unavailable in hardware processors and could not be implemented in those processors except at exorbitant cost. These techniques significantly increase the speed of the microprocessor of the present invention compared to the speeds of prior art microprocessors practicing the execution of native instruction sets.
    ...

    Basically, they're describing a combo hardware/software dynamic OS emulation implementation.
    Almost like a JIT compiler for ASM on crack -- cool.

    ...
    -- rot13 my email address for the real thing
    --

    Odds of being killed by lightning and winning the lottery in the same day: 1 in 2^55