AMD's 64-Bit Chip
EyesWideOpen writes "AMD is set to release a 64-bit chip early next year which will be completely backwards compatible with the Athlon line. The current 64-bit offering from Intel, Itanium, is an entirely new chip that has no backwards compatibility with its x86 line of chips (from the 8080 chip to the Pentium IV) and is designed only for high end servers. AMD's solution to this problem is the Opteron chip (product info) which will be in servers, desktops and laptops. Here is a Wired article."
Somewhere a Delorean reached 88 miles an hour to get this story to us.
Finally, math books without any of that base 6 crap in them.
Ford Motor Co. is set to release today a new car, the Model "A", based on the award winning and famously popular Model "T". The new Model "A" is backwards compatible with all previous 4 wheel gasoline powered Model "T" cars produced by Ford and its competitors, and can run on the same roads as them.
Infuriate left and right
What's the point of making something that is unsupported by a large chunk of today's software
Because you end up with a CPU that has layers of compatibility upon layers of compatibility.
You'll have real mode, protected mode and now probably something like 64-bit mode.
IMHO it's better to get rid of all the old junk and start over once in a while.
Of course, Itanium is backward compatible with x86 code. It just isn't particularly fast when doing so.
...considered part of the x86 family? The first processor in that lineup is the 8086. I think the 8086 might've been source-code-compatible (to some extent) with the 8080, but you can't take an 8080 binary and run it on any x86 processor (emulation doesn't count).
20 January 2017: the End of an Error.
I found my Dual AMD box to be as good, if not better from a never going down standpoint. Same goes for my gaming boxes. After building a bunch of systems, however, I do have one major beef with AMD...
... Things are a little better these days because the quality heat sinks - with paranoid mode on - are less likely to crush a CPU than when folks were trying to strap a Socket 370 heat sink on an Athlon, but I still feel like it is a crap shoot every time I have to remove the CPU. I end up trying to stay around the $100 mark for CPUs as a result. (Yes, the MPs cost me much more, and I was very nervous when I mounted them.)
For the love of god, put a coat of nickel or something on the CPU!
I chipped a couple when rev 1 of the Chrome Orb came out. Fool me once
+++ UGUCAUCGUAUUUCU
> IMHO it's better to get rid of all the old junk and start over once in a while.
Unless of course you've got an installed base somewhere in the billions, 20 years' worth of compiler optimization, a factor of, what, 100 more people who know the assembly language, etc. And it doesn't help if good compilers won't exist by the time your chip comes out. And if the internal interface teams have difficulty communicating, you're going to be late, hot, slow, and over-complicated.
Starting over is nice from a design perspective, especially because it feeds the urge for creativity that most engineers have. Unfortunately, that do-over is not always executed well, and it turns out to be a little underwhelming, just like Itanium.
Fight the urge to think that all new things are good. Please.
Outside of a dog, a book is a man's best friend. Inside a dog, it's too dark to read.
A properly designed 64-bit CPU does not need to 'run slower' to run 32-bit apps. AMD came up with a simple solution to the 32-bit limitations of x86 code: they added a new 'mode' to the processor to run 64-bit binaries. When this mode bit is set (similar to the old Real and Protected modes of x86 chips), the chip utilizes the full 64-bit-wide pathways for data and calculations; when this bit is not set, only the lower (or is it upper? AMD isn't saying...) 32 bits of the pathways are used. The exact same logic units are used for all 32-bit and 64-bit calculations; only the bit-depth precision changes. Thus if it takes an ADD instruction 16 cycles to add two registers and store the result in a third register, it takes 16 cycles regardless of which mode the processor is in. Of course, AMD also added an extra 8 registers for use in 64-bit mode... very useful.
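To make the "same adder, different width" idea concrete, here is a toy sketch in Python. It is only a model of the description above, not of AMD's actual hardware; the `long_mode` flag and the function name are made up for illustration.

```python
MASK32 = (1 << 32) - 1
MASK64 = (1 << 64) - 1

def add(a: int, b: int, long_mode: bool) -> int:
    """One adder for both modes; the (hypothetical) mode flag only
    decides how many bits of the result are kept."""
    mask = MASK64 if long_mode else MASK32
    return (a + b) & mask

# The same operation wraps at 2**32 in 32-bit mode but not in 64-bit mode.
print(hex(add(0xFFFFFFFF, 1, long_mode=False)))  # 0x0
print(hex(add(0xFFFFFFFF, 1, long_mode=True)))   # 0x100000000
```

The work done is identical either way; only the mask changes, which is the point about cycle counts not depending on the mode.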
The Itanium does not get the majority of its speed from being 64-bit - this is a common mistake people make. It has a _very_ different design and instruction set - EPIC - which places the burden of parallel instruction determination on the compiler. Basically, they used the oldest software refactoring trick in the book, but on the whole processor design: they examined the amount of time spent executing, and looked for the biggest runtime performance hit that could be moved from an O(n) to an O(1) penalty by simply moving the calculation. In this case, modern processors spend a great deal of time trying to handle multiple instructions at once, which may or may not be parallelizable (is that a word?) - thus the processor has to figure out, on the fly (in a P4, for example), if it can execute the next four add instructions in parallel, or if they are interdependent and cannot... By placing the burden of parallelism determination and instruction scheduling on the compiler, Intel made the compiler writer's job much harder, but for the benefit of increased performance.
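For anyone who wants to see what "parallelism determination" means in practice, here is a rough Python sketch of the check being moved from hardware to compiler. It is a simplification: instructions are hypothetical `(dest, src1, src2)` tuples, the grouping is greedy and in-order, and a real EPIC compiler would also reorder code, which this does not attempt.

```python
from typing import List, Tuple

Instr = Tuple[str, str, str]  # (dest, src1, src2), register names only

def independent(instr: Instr, group: List[Instr]) -> bool:
    """True if instr has no data hazard against anything already in group."""
    dest, s1, s2 = instr
    for d, a, b in group:
        if d in (s1, s2):    # read-after-write: needs an earlier result
            return False
        if dest == d:        # write-after-write: same destination
            return False
        if dest in (a, b):   # write-after-read: clobbers an earlier source
            return False
    return True

def bundle(program: List[Instr], width: int = 3) -> List[List[Instr]]:
    """Greedily pack consecutive independent instructions into groups."""
    groups: List[List[Instr]] = []
    current: List[Instr] = []
    for instr in program:
        if len(current) < width and independent(instr, current):
            current.append(instr)
        else:
            groups.append(current)
            current = [instr]
    if current:
        groups.append(current)
    return groups

program = [
    ("r1", "r2", "r3"),
    ("r4", "r5", "r6"),  # independent of the first
    ("r7", "r1", "r4"),  # needs r1 and r4, so it starts a new group
    ("r8", "r7", "r2"),  # needs r7, so it starts another group
]
print([len(g) for g in bundle(program)])  # [2, 1, 1]
```

A superscalar chip runs roughly this kind of check on every pass through the code; the EPIC bet is that doing it once at compile time is cheaper.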
Oh, and most PDA processors are much more traditional, and thus don't require complex compilers like the Itanium, so actually porting a compiler (or an assembly-lang app) to a PDA from x86 (32-bit) is easier than creating one for the EPIC architecture.
And yes, I know the above is an oversimplification, and Intel and AMD both did a lot more, in a lot more detail, on their 64-bit chips.
Oh, and I think the next few iterations of Itaniums _will_ beat the AMD 64-bit chip on benchmarks. But not by a landslide.... And with the differences in price (EPIC chips are Expensive... capital E) the AMD chips will win the hearts of many and be the performance-price ratio king. And who wants to pay 3 times as much for 20% more performance?
man is machine
>> very advanced VLIW-esque architecture
Ah, yes. EPIC. Explicitly Parallel Instruction Computing. AKA VLIW. EPIC is market-speak. Intel didn't want to admit that it was making a VLIW chip for two reasons:
1. There is only one company that has ever sold a VLIW chip that actually worked, and that people bought: TI makes DSPs, which are VLIW. They make tons of money. They are the only ones that ever did it right.
2. There is only one company that has ever made a good VLIW compiler: TI, again.
Let's think briefly about how great EPIC is, using the two main selling points I remember from a presentation I saw on it a few years ago (sorry if my memory is bad, no coffee this morning, I'm not responsible).
1. Instructions are Explicitly Parallel. So, the compiler tells you that these two or four or however many instructions can be executed without worrying about data dependency. Terrific. Assuming that the compiler actually works, which is still an open question.
The only difference between this setup and what's in your Athlon or Pentium 4 is that the looking-for-independence is done in hardware on your Athlon instead of by the compiler on your Itanium. This means that there is the *possibility* that EPIC does better at finding independence because the compiler *should* know more about the code when it's in a higher level language. *Should*. Essentially, until the science of compilers takes a quantum leap or we start using programming languages that make these things easier (correct me if I'm wrong, please), Itanium will be at most as fast as a superscalar processor that finds independent instructions on its own and does register renaming.
2. Predicates and conditional execution. While the whole notion of the predicate in EPIC is more complicated than just conditional execution, it's not much more useful IMHO, or at least that was my impression the last time I heard someone talk about it. Alpha has conditional execution. ARM has conditional execution. I can append checks to the condition codes in ARM assembly. I don't really understand why this is so nifty.
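For readers who haven't met conditional execution, a rough Python analogy of the idea in point 2: compute a predicate once, run both arms unconditionally, and let the predicate pick the result, so there is no branch to mispredict. The function names are made up, and real predicated instructions work on registers rather than multiplying by a flag; this is only a sketch of the concept.

```python
def abs_branchy(x: int) -> int:
    # Control flow decides which instruction runs.
    if x < 0:
        return -x
    return x

def abs_predicated(x: int) -> int:
    # The compare sets a predicate; both candidate results are computed,
    # and the predicate selects one without any change in control flow.
    p = int(x < 0)
    negated = -x
    return p * negated + (1 - p) * x

print(abs_branchy(-7), abs_predicated(-7))  # 7 7
```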
I've said it before, and I'll say it again. Resist the urge to think that whatever marketdroids tell you is new is actually good. Sometimes it's not.
(If more knowledgeable people are lurking, please correct any errors I've made, but I think I've got this right.)
Outside of a dog, a book is a man's best friend. Inside a dog, it's too dark to read.
"64-bit code is twice as big as 32-bit code" bloatware excuse
Unfounded. Though I find Itanium's instruction coding (16 bytes per 3 instructions) bloated, not all high-"bit" machines have to have bloated bytecodes. The ARMv4 architecture, used in processors such as the ARM7TDMI in the Game Boy Advance, has a standard 4-byte-per-instruction encoding, and an optional 2-byte-per-instruction encoding called "Thumb". Thumb code runs at about two-thirds of the speed of ARM code on machines with fast memory because some operations take more instructions in Thumb than in ARM, but Thumb code really shines when running on small or slow memory and can help drain less battery power on mobile machines. Apps will often have most of the app in Thumb but some of the time-critical inner loops in ARM.
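A quick back-of-the-envelope in Python using only the encoding sizes quoted above. The big assumption, labeled in the code too, is that every encoding needs the same number of instructions for the same job, which flatters Thumb (it usually needs more) and says nothing about how well an Itanium compiler fills its bundles.

```python
def itanium_bytes(n_instr: int) -> int:
    # 16-byte bundles with three instruction slots; a partly filled
    # bundle still costs the full 16 bytes.
    return (n_instr + 2) // 3 * 16

def arm_bytes(n_instr: int) -> int:
    return n_instr * 4   # fixed 4-byte ARM encoding

def thumb_bytes(n_instr: int) -> int:
    return n_instr * 2   # fixed 2-byte Thumb encoding

n = 300  # ASSUMPTION: same instruction count for every encoding
print(itanium_bytes(n), arm_bytes(n), thumb_bytes(n))  # 1600 1200 600
```

So the Itanium encoding costs about a third more than plain ARM per instruction, and bit width has nothing to do with it.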
Will I retire or break 10K?
If you want a "fresh" architecture that isn't full of old junk, buy an Alpha. Or for that matter a MIPS, SPARC, or Power4. All of which are 64-bit and have either always been 64-bit, or at least had their original 32-bit designs planned around 64-bit expansions.
Personally, I think it's amazing how much old crap has been piled onto x86. It's really remarkable it runs at all, and it's even fast! I used to turn up my nose to the x86 given how they piled all the 32-bit extensions on the old 16-bit core. It's really a travesty. And the actual instruction set and register set looks like a damn train wreck compared to MIPS or PPC. But they are soooo cheap I eventually got over it, and just try to avoid thinking about any level lower than 'C' now so I don't go insane.
The vast majority of the old cruft in the x86 architecture that nobody uses any more has been demoted into microcode or other non-optimized crevices. Ever since the Pentium came out, good programmers and compilers have been using an almost RISC-like subset of the x86's myriad possible instructions, operands and addressing modes. IOW, all that old stuff really doesn't slow things down in the real world.
Recent CPUs have been transforming x86 instructions on-the-fly into bizarre internal parallelized architectures anyway. This hidden logic is an order of magnitude more complex than what is visible in the x86 instruction spec. The implementers are free to completely redo the hidden stuff with every new generation of x86 chip.
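As a toy illustration of that on-the-fly transformation, here is a Python sketch that splits a register/memory instruction into simpler load / operate / store steps. The tuple format, the `tmp_*` names and the `crack` function are all invented for the example; real decoders and their internal micro-op formats are undocumented and far more involved.

```python
from typing import List, Tuple

def crack(op: str, dst: str, src: str) -> List[Tuple[str, str, str]]:
    """Split one instruction into RISC-like micro-ops.
    Operands written in [brackets] are treated as memory references."""
    uops: List[Tuple[str, str, str]] = []
    if src.startswith("["):
        uops.append(("load", "tmp_src", src))    # fetch memory source first
        src = "tmp_src"
    if dst.startswith("["):
        uops.append(("load", "tmp_dst", dst))    # read-modify-write on memory
        uops.append((op, "tmp_dst", src))
        uops.append(("store", dst, "tmp_dst"))
    else:
        uops.append((op, dst, src))              # pure register form passes through
    return uops

for uop in crack("add", "[ebx]", "eax"):
    print(uop)
```

The register-only form comes out as a single step, which is one way to see why sticking to the "RISC-like subset" costs nothing.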
The Wired article has other errors as well. A 32-bit CPU isn't limited to 4GB; that confuses address space with physical memory. The definition of exabyte is wrong (1000 petabytes, not 1000 terabytes). The 8080 in 1981? It came out in 1974. And many have mentioned the bogus "no compatibility" claim.
One wonders if the whole thing wasn't a troll.
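For the record, the arithmetic behind the address-space and exabyte corrections above, as a few lines of Python. The helper name is made up; the constants are just the standard binary and decimal unit definitions.

```python
def address_space_bytes(bits: int) -> int:
    """Number of distinct byte addresses a flat pointer of this width can name."""
    return 2 ** bits

GIB = 2 ** 30   # binary gigabyte
EIB = 2 ** 60   # binary exabyte

print(address_space_bytes(32) // GIB)  # 4
print(address_space_bytes(64) // EIB)  # 16

# Decimal units: an exabyte is 1000 petabytes, i.e. a million terabytes.
TERABYTE, PETABYTE, EXABYTE = 10 ** 12, 10 ** 15, 10 ** 18
print(EXABYTE // PETABYTE, EXABYTE // TERABYTE)  # 1000 1000000
```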
A dual 1GHz Mac can emulate x86 and performs as well as a 266MHz PII
A 667MHz 64-bit Alpha can emulate x86 but is only as fast as a 200MHz Pentium Pro
An 800MHz Itanic emulates x86 as fast as a 166MHz Pentium.
Linux can emulate a cluster on a single machine.
Any PC with two network cards can emulate a Cisco router.
Intel stopped marketing Itanium's x86 emulation mode because it is abysmally slow. The emulator is of course compiled on Itanium's still very immature compilers, so it will improve in the future.
The Sledgehammer contains a complete x86 core and a complete 64-bit RISC core. At 800MHz it outperforms a 1.6GHz Pentium 4 running stock Windows XP and stock applications.
Running 64-bit SuSE, the Sledgehammer performs as well as an Itanium at the same clock speed.
Sledgehammer is expected to ship at 2.0GHz. It should perform as fast as a Pentium 4 at 3.4GHz. Each processor has its own memory controller so there is no shared memory bottleneck for multiprocessing. 2 processors should be exactly twice as fast using multithreaded applications. Sledgehammer scales to 8 processors.
If voting were effective, it would be illegal by now.