Transmeta Astro -- More Details
chill writes "We've recently seen announcements, product launches and reviews from AMD and Intel on their new low power chipsets. Not to be left out, Transmeta has more details on their forthcoming Astro processor. Slashdot covered the Astro back at Comdex in November."
Seems to me the Transmeta chips work fine.
For reference, I'm using a Toshiba Libretto L1, purchased from Dynamism.com.
The details article mentions that a lot of the work normally done in hardware is being shunted back onto the software. According to Ars Technica, that's where it should be. (1, 2)
My question is, will compilers be able to bypass the code morphing software and work directly with Transmeta's underlying instruction set?
What's this Submit thingy do?
Ever heard about Matrox?
Analogies don't equal equalities, they are merely somewhat analogous.
As far as I know, Matrox primarily deals in business-oriented cards designed to enhance 2D display (text, image editing, etc). I was referring to gaming cards.
http://www.duke.edu/~kaf3/lowpower/slide28.html
I remember that, way back before they released anything, their major claim to fame, so to speak, was their code morphing tech, which would just emulate whatever CPU you needed.
No, their claim to fame was that their code morphing allowed them to run x86 instructions on a VLIW chip, which may turn out to be more scalable/efficient than either RISC or CISC architectures. The R&D on the code morphing was just as expensive as the R&D for the rest of the chip, so I can't imagine they'd go repeating that for some less popular architecture.
They never said they were about to release code morphing packages for other platforms. Idiotic journalists (and Slashdot readers) were the ones that pointed out that the code morphing could work for other platforms.
There are no trails. There are no trees out here.
In which case, why do you state that you would use either an Athlon XP or a Transmeta chip? These are exact opposites on the speed-per-power curve!
It sounds to me much more like an 'anything but Intel' approach. Fine, but at least admit it.
If you want a low-power (and quiet) desktop solution now, look into the VIA C3 series: not fast, but very low power.
If you want a fast but power-hungry solution, look at Intel or AMD; they rule the desktop one way or another.
I personally would like to see these new Transmeta chips available in small embedded boards, where their low heat production and high level of integration would be of great value, much like the C3 boards currently are, but another step up: smaller and with lower power usage.
The code morphing is still there. Read the article. This chip is not a "low power x86" but a RISC-style chip that runs x86 instructions through efficient emulation. This is what makes Transmeta interesting. The processor itself is 256-bit! So can it address 2^256 bits of memory? That really should be enough for anyone. It's kind of funny, with Intel in the background saying that 32-bit is good enough for any desktop application...
I'm a bit puzzled about the pros and cons of the various low power x86 CPU series. So far, I have identified at least five:
- Transmeta Crusoe
- Via C3
- Intel ULV (old, now outdated by the new Centrino)
- Intel Pentium-M (aka Centrino, which appears to be a chipset strategy as well)
- AMD XP-M (aka Low Voltage Thoroughbred)
So, please tell me, why should I choose one over the other? Where are the conceptual differences?
------------------
You may like my a cappella music
Well, my understanding was that Transmeta was forced to do a bit of hardcoding for the x86 instruction conversion. It was readers, and notably the Slashdot editors, who really pushed the idea of cross-platform support. CowboyNeal actually asked in a broadcast, and they gave him spin. It was *possible* perhaps, but not going to happen.
The point of code morphing was to reduce the extra hardware and make a more efficient chip. Intel did that with IA-64, reducing much of the logic by putting it into the software and replacing it with far more advanced (and complex) hardware features. The idea is that the slowdown from doing work in software can be made up for by simpler, lower-powered devices that still provide good performance.
Hopefully this new generation will be more promising than the earlier one.
Fastest to slowest:
AMD XP-M
Intel Pentium-M
Intel ULV Pentium III
Via C3
Transmeta Crusoe
Least power to most power:
Transmeta Crusoe
Intel Pentium-M
Via C3
Intel ULV Pentium III
AMD XP-M
Cheapest to most expensive:
Via C3
Transmeta Crusoe
Intel ULV Pentium III
AMD XP-M
Pentium-M
It depends on your need. If you are going for embedded systems, try a non-x86 processor, which is better in both of the other categories (power and price) and in the middle on performance. For a laptop, the XP-M or Pentium-M offers desktop-replacement performance; if battery life is your thing, the Pentium-M, Via C3 or Transmeta processors ought to do OK. If cheap is the most important thing, then go Via.
The processor is not RISC, it's VLIW. A meta-instruction is made of 8 smaller, 32-bit ones. The key characteristic of VLIW is that these 8 instructions are explicitly parallel: when processing a meta-instruction, the processor knows that it can execute all 8 sub-instructions in parallel (now, a sub-instruction is RISC-like, I grant you that). The difficulty is finding this level of parallelism in existing x86 programs (this is the job of the software code morpher).
Furthermore, only the meta-instruction is 256 bits, not the registers, etc (which are only 32 bits). That'd be way too wasteful. Most apps don't need more than 32 bits, anyway. Only big servers need more than 4 GB; this processor is targeted at mobile applications, therefore I'm pretty sure it can only address 4 GB of RAM.
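To make the arithmetic above concrete, here is a small Python sketch. It assumes byte-addressable memory and a flat 32-bit address with no extension tricks such as PAE; the constant names are purely illustrative and are not taken from any Transmeta documentation.

```python
# Illustrative constants based on the figures quoted in this thread.
ATOMS_PER_BUNDLE = 8   # sub-instructions per VLIW meta-instruction
ATOM_BITS = 32         # width of each sub-instruction
REGISTER_BITS = 32     # width of a general-purpose register / address

# The "256-bit" figure describes the instruction word, not the registers.
bundle_bits = ATOMS_PER_BUNDLE * ATOM_BITS

# With 32-bit addresses and one byte per address (an assumption),
# the reachable memory is 2**32 bytes.
addressable_bytes = 2 ** REGISTER_BITS
addressable_gib = addressable_bytes / 2 ** 30

print(bundle_bits)       # 256
print(addressable_gib)   # 4.0
```

So "256-bit" here says how much work can be packed into one instruction word, and says nothing about how much memory can be addressed.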
The Raven
PPC vs x86 IA has nothing to do with VLIW.
VLIW improves performance when the instruction stream can be split up over multiple processing units.
Exhibit A:
LOAD A
LOAD B
LOAD C
LOAD D
ADD A, B
MOD A, C
ADD A, D
STORE A
LOAD E
LOAD F
LOAD G
LOAD H
ADD E, F
MOD E, G
ADD E, H
STORE E
Exhibit B:
LOAD A
LOAD B
LOAD C
LOAD D
LOAD E
LOAD F
LOAD G
LOAD H
ADD A, B
ADD E, F
MOD A, C
MOD E, G
ADD A, D
ADD E, H
STORE A
STORE E
Exhibit A is more difficult to make parallel than exhibit B, since the potentially parallelizable code is separated, and, from a CPU's short-range perspective, each operation in exhibit A depends on the previous one, which makes executing the instructions in parallel impossible.
In exhibit B, instructions independent of each other are right next to each other. This makes it easy for the CPU to separate the code into parallel units.
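For anyone who wants to play with the two exhibits, here is a rough Python sketch of the idea. It is a toy model only, not how Transmeta's hardware or code morpher actually works: it assumes a strictly in-order machine that packs consecutive instructions into one bundle until the bundle is full or the next instruction depends on something already in the bundle, and it ignores latencies entirely. The function names (parse, conflicts, in_order_bundles) are made up for this example.

```python
EXHIBIT_A = """LOAD A
LOAD B
LOAD C
LOAD D
ADD A, B
MOD A, C
ADD A, D
STORE A
LOAD E
LOAD F
LOAD G
LOAD H
ADD E, F
MOD E, G
ADD E, H
STORE E""".splitlines()

EXHIBIT_B = """LOAD A
LOAD B
LOAD C
LOAD D
LOAD E
LOAD F
LOAD G
LOAD H
ADD A, B
ADD E, F
MOD A, C
MOD E, G
ADD A, D
ADD E, H
STORE A
STORE E""".splitlines()


def parse(line):
    """Return (opcode, destination or None, tuple of source registers)."""
    op, _, rest = line.partition(" ")
    regs = [r.strip() for r in rest.split(",")]
    if op == "LOAD":
        return (op, regs[0], ())         # writes its operand, reads nothing
    if op == "STORE":
        return (op, None, (regs[0],))    # reads its operand, writes nothing
    return (op, regs[0], tuple(regs))    # ADD/MOD: read both, write the first


def conflicts(earlier, later):
    """True if 'later' cannot share a bundle with 'earlier'."""
    _, e_dst, e_src = earlier
    _, l_dst, l_src = later
    raw = e_dst is not None and e_dst in l_src    # read after write
    waw = e_dst is not None and e_dst == l_dst    # write after write
    war = l_dst is not None and l_dst in e_src    # write after read
    return raw or waw or war


def in_order_bundles(program, width):
    """Greedily pack consecutive instructions into bundles of at most 'width'."""
    bundles, current = [], []
    for instr in map(parse, program):
        if len(current) == width or any(conflicts(e, instr) for e in current):
            bundles.append(current)   # close the bundle at the first obstacle
            current = []
        current.append(instr)
    if current:
        bundles.append(current)
    return bundles


for width in (2, 8):
    a = len(in_order_bundles(EXHIBIT_A, width))
    b = len(in_order_bundles(EXHIBIT_B, width))
    print(f"width {width}: exhibit A needs {a} bundles, exhibit B needs {b}")
```

Under these assumptions, a 2-wide machine needs 11 bundles for exhibit A but only 8 for exhibit B, and an 8-wide machine needs 9 versus 5. Both exhibits do the same work; only the ordering differs. Reordering A into something like B is the kind of job a compiler, or a software layer like the code morpher, has to do for a VLIW chip.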
I'm no expert, I just read a lot of Ars Technica.
(As an aside, they have an article that may change your views about x86 (a la P4) vs PPC (a la G4e). It doesn't take one side or another, it just points out the different approaches used by each architecture.)
What's this Submit thingy do?
XScale technically isn't ARM. It is an old ARM implementation that was frozen in time when they sold it to Intel. ARM marches on in MIPS/watt and in design wins. XScale only improves through process wins, since Intel only owns an old implementation.