Linus Has Harsh Words For Itanium
Anonymous Coward writes "As a follow up to the earlier story "Intel: No Rush to 64-bit Desktop"... In words that Intel are likely to be far from happy with, the Finnish luminary has stuck the boot into Itanium. His responses to some questions on processor architecture are sure to be music to AMD's ears. Linus, in an Inquirer interview concludes: "Code size matters. Price matters. Real world matters. And ia-64... falls flat on its face on ALL of these."" Of course, Linus works for a chip maker ;)
Not to mention the fact that most home users won't see a 2X performance boost from 64 bits.
Not only does he work for a chip maker, he's like totally obsessed with the i386 architecture. I guess it's what he cut his teeth on, and he's going to stick with it. But to think that no-one else has a use for it is very short-sighted.
...
It'll probably still make it into the kernel, though. I mean, alpha and sun architectures are in there, so
Wow, just about everything under one topic. Linux, AMD, and Intel. So by this we are going to have 64-bit processors soon, is that what I'm hearing? Or will this turn out to be like most computer issues and come out a few years from now?
FOML: Rise to Power
Now, we all know that the Itanium isn't everything it's cracked up to be, and I think none of us are wrong in blaming Intel for coming out with a lousy product....
But, isn't one of those situations he mentions in the interview (namely, running a large database server) what this chip is designed to be doing?
As I recall, the IA64 isn't designed for the desktop user... In fact, desktop users probably won't even need 64-bit processing for a number of years still....
Yet we're attacking Intel for making the chip to fit its niche?
Perhaps we need to be more fair in the context of the usefulness of the chip, instead of considering it in all contexts and criticizing it based on that?
"but speaking strictly from the technological point of view"
I think that was his point. It's great technically but it sucks in the real world. If it's not practical, it's a shitty architecture IMHO.
I also think the x86-64 is a more viable solution as well.
----
Go canucks, habs, and sens!
Worse is better
although the original essay talks about Unix and the LISP machines, it just keeps being true. Linus talks about the "charming oddities", well there you go: worse is better. Try for perfection, and the real world will eat you alive.
I also think he's right about the masses being what matter; I think Intel is still thinking about the data centre, not Joe Sixpack, with Itanium.
ZOMG I WOULD LOVE TO KNOW ABOUT YOUR FEELINGS ON MACINTOSH VERSUS WINDOWS, VI VERSUS EMACS, AND HOW YOU'RE NOT A DORK
What will the 'masses' do with a 64-bit processor? The best reason to move up to 64 bits is to increase maximum memory, and although memory is now cheap, it's not that cheap!
32-bit processors can have up to 4GB of RAM. The most memory I know someone to have is 1GB, and computers most often come with half of that, 512MB. We still have a long way to go before we hit the 4GB ceiling (a long while!).
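The 4GB figure falls straight out of the pointer width: a flat 32-bit address can name 2^32 distinct bytes. A minimal sketch in C, where `addressable_bytes` is a hypothetical helper name made up for illustration, and where segmentation or paging extensions are deliberately ignored:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: number of bytes reachable with a flat address
   that is `bits` bits wide. Widths of 64 and above are rejected
   because 2^64 does not fit in a uint64_t. */
uint64_t addressable_bytes(unsigned bits)
{
    assert(bits < 64);
    return (uint64_t)1 << bits;
}
```

Calling `addressable_bytes(32)` gives 4294967296 bytes, which is 4GB; the 1GB machine mentioned above uses a quarter of that range.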
I am actually a tad worried for AMD, since they plan on coming out with the x86-64 pretty soon. And I don't know who will actually buy it (or need to buy it).
64bit processors belong where they are most needed, specialized machines.
what is nailchipper?
Can't say I disagree with Linus's logic, but I don't know if this was that great of a decision politically-speaking. It might not matter, but if anything linux *needs* support from big players like Intel and vice versa in order to grow. This won't necessarily hurt, but I doubt it can help matters on the Intel front.
>>They delivered a revolutionary product.
It's not 'revolutionary', if there is no revolution.
People toss this word about like it means 'incremental change'. The Industrial Revolution was a revolution because it entirely changed the way people live and work. How is anything Transmeta has done even remotely close to something of this level? It's not.
Cost? Memory is cheap. Hard drive space is cheap.
Execution speed? Make your instruction cache bigger. Goes with the territory.
Download time? You'll find that RISC programs are more compressible than their CISC counterparts, so this shouldn't be much of a problem.
So, really, why is code size important? I'm sure there's something I'm missing here, but code size strikes me as something that was a lot more important "back in the day," when memory was more precious.
"I don't care what Linus says about it."
Makes ya curious why anyone should care about what he says, duddn't it?
I'd rather hear about what people can do than hear people complain about what they can't do. Why? Because you buy hardware to suit your needs, not suit your needs to your hardware.
Would you really want to return to the DOS himem.sys, memmaker, extended and expanded memory, and autoexec.bat hacks again? Sure, they were not needed for the first several years of DOS, when people had only 512 KB of RAM, but the situation changed quickly. This is what first turned me off from Microsoft. If I had 8 megs of RAM and had 6 free, why couldn't I run Dune 2? Do I not have a 32-bit chip? I had to create a custom boot disk with autoexec.bat just to run the game. That is screwed up.
A Hammer is nice just like a 386 was nice to have for running 16-bit software. They were particularly useful in Windows 3.11 since it actually had 32-bit disk access while everything else was 16-bit. The Hammer is fast at running 32-bit software and is easily upgradable if customers want to add RAM. They do not understand techno mumbo jumbo. It's not like you can explain base-2 math when Joe just wants to purchase a 4 gig RAM stick and wonders why Windows won't recognize all the RAM.
Linus isn't saying he won't let it in. He's simply saying that he thinks it's not a good arch based on technical merit. He'll let it in. He never said he wouldn't. He's just saying he doesn't like the way the chip was designed (what choices they made, etc).
Comment forecast: Bits of genius surrounded by a sea of mediocrity.
What are YOU smoking?
Optimizations done at compile time are far better than optimizations done at runtime. At compile time, more is known about the structure of the program, where the flow of the program will be going, and more time intensive optimizations can be done than ones done in realtime in the cpu.
Itanium is slower right now, but as compilers with optimizations tailored to it come out, it has the potential to kick every RISC processor's ass. The reason for this is that RISC processors are bogged down by doing the optimizations at runtime that Itanium doesn't have to care about. This means the Itanium will have the same or fewer stalls and more efficient use of the processor.
Go read up on compiler optimizations to see why cutting out the middleman of instruction sets is a good thing.
No, because Win95 is a piece of shit OS, but your point is valid. AMD should take this opportunity to stick it to Intel, however. Intel's been playing this game with consumers' minds that the bigger the number, the better the processor. Turnabout is fair play and I hope AMD takes this opportunity to bombard the computer buying public with 64 bit ads. I'd love to see Intel's answer to that, "uuhhhhh, bigger is better only if it says Intel Inside, yeah that's the ticket!"
I have to disagree with your assessment about the Pentium CPU at the time it was introduced.
Remember, the Pentium CPU runs x86 architecture code natively, so it did not require an expensive starting-from-scratch mentality to take full advantage of the CPU like you have to do with the Itanium CPU. In short, programs that ran on the 80486 CPU could run on the Pentium CPU with no modifications.
AMD is the wildcard. If x86-64 is the bomb and takes off like AMD is betting it will, Intel has lost the 64-bit war for many years. IBM and maybe even Sun will quietly (well, Sun doesn't do jack shit quietly) push x86-64 for the low end while IBM POWER4 and POWER5 and POWER6 down the road run the big end.
Basically Intel needs something like Sun to jump on IA64 to really give it some credibility, and they don't sound real eager to. IBM sounds like they are down for the fight. Alpha, MIPS, PARISC are all pretty dead; long term and relatively speaking. Meanwhile, if Intel doesn't get on the shit quick then they'll have to support x86-64 too and that's the real death blow to IA64.
SPEC scores tell me almost nothing useful. The code to run SPEC benchmarks is emitted by tricked-out compilers whose whole purpose is to emit hand-crafted assembly code specifically tuned to run those SPEC benchmarks. It doesn't tell me anything about how well common programs and subsystems perform at common tasks. You might as well buy a family car based on the quarter-mile time at the racetrack for a like-model car with a supercharger and dangerously-tweaked ignition timing, burning 120 octane racing fuel.
In five years, if the Itanium isn't a huge success, will you eat your words?
As time goes by, computer languages are trending towards more dynamic behavior. This tends to favor things like JIT compiling and linking into already running programs. Fewer people are going to be able to afford the luxury of spending hours to preprocess their code to fit into an extremely static ("explicitly parallel") hardware model. This will be especially true when chip makers treat their rocket science static compilers as a separate profit center.
Not to mention, the CPU is the one that is actually in the position to know what optimization is needed right now based on the currently running data set. Given that there is usually a several year lag between the latest CPU developments and widespread compiler support, I'd go for a CPU that knows how to do its own tricks. (Hasn't the Itanium architecture been nailed down for almost a decade now? And we're still waiting on better compilers for it?)
For the expensive memory environment for which it was designed the VAX was fabulous. And it was designed to be scalable as well.
You can snicker at the CISC VAX architecture, but it ran multi-user in less RAM than many processors today have CACHE. Remember 2 MB of RAM was a lot when the 11/780 was introduced. 600 MB drives were considered HUGE and were the size of washing machines.
Its scalable architecture let a copy of VMS from the lowliest processor be physically mounted on the most capable and boot just fine.
It had BCD instructions too, not just string.
But Gordon Bell got a lot more right than he got wrong. And the compact and orthogonal instruction set of the VAX looks pretty good today.
Let me repeat this one more time:
NO GAMING CONSOLE IS 128-bit (nor will they be 256-bit)
The PS2 is a 32-bit system. It has a 32-bit wide address space and word space. It happens to have a quad-word SIMD execution unit. By this logic, the SSE-enabled Pentium III is also 128-bit.
Okay... got that out of my system.
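The difference between a wide SIMD register and a wide machine word can be sketched in plain C. The `quad_word` type and `quad_add` function below are illustrative stand-ins invented for this example, not any console's or CPU's actual API: a "128-bit" quad-word register here is just four independent 32-bit lanes.

```c
#include <stdint.h>

/* Illustrative stand-in for one 128-bit SIMD register:
   four separate 32-bit lanes, not one 128-bit integer. */
typedef struct {
    uint32_t lane[4];
} quad_word;

/* Lane-wise add. Each lane wraps around on its own, and no carry
   ever crosses from one lane into the next. */
quad_word quad_add(quad_word a, quad_word b)
{
    quad_word r;
    for (int i = 0; i < 4; i++)
        r.lane[i] = a.lane[i] + b.lane[i];
    return r;
}
```

Adding 1 to a lane holding 0xFFFFFFFF wraps that lane to 0 and leaves its neighbour untouched, which is what separates a quad-word SIMD unit from a genuine 128-bit word size.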
What the 64-bit address space WILL do is make OS design simpler. This is an important win for developers. I understand OS start-up times will be vastly improved because applications, libraries, etc. will all be able to load at static addresses in memory, all precomputed. It'll also make database-as-filesystems easier to implement.
Forget gaming machines, this is BIG stuff, a big step, and Intel is foolish to ignore it.
Ooooh, benchmarks. Any company can pick and choose good benchmarks for their chip; you aren't even giving numbers. I want to see some real-world numbers, preferably ones that relate to the Itanium2's ability to handle non-preprocessed code (as other posters have mentioned, trying to work with anything dynamic throws all of Itanium's fancy explicit parallelization out the window). Put up or shut up.
That's it. I'm no longer part of Team Sanity.
First of all it is not very smart to try to reduce code size by putting complicated instructions in the processor architecture.
A successful architecture may be used for 20 years, and there is no way you can know which complex instructions will be most useful/popular in several years. And when you start making upgraded chips for a design, these complex instructions will be a real pain in the ass.
The x86 architecture is a perfect example - it is a mess and many of its instructions are not used at all. The x86 is successful because of the way history played out - it was put on the first PCs, and the incredible numbers of processors sold allowed Intel to put more development money into that architecture than anybody else was able to put into theirs. And large initial investments and large sales numbers mean that individual chip prices can be lower.
Nevertheless, the Alpha and some of Sun's chips can still compete with Intel in the server environment, with much smaller investments and worse production technology. That basically shows the weakness of the x86 architecture.
When you have multiple pipelines and multiple stages per pipeline, the size of your chip will grow exponentially with the number and complexity of your instructions. Eventually adding more pipelines will be pointless and then you are reduced to adding cache as the only way you can improve your architecture.
For a Risc architecture, multiple pipelines will cost less overhead and more can be used. Processor performance can be increased by adding more pipelines without having to increase speed.
Intel has the money and the clout to make a successful RISC architecture. It is brave of them to do it, but from an engineering point of view it is the only right thing to do.
AMD will support x86 because they do not have the clout to force a new architecture on the world. It is a completely understandable policy, but then again will result in worse performance (unless their engineers are somehow much more brilliant than intel's).
Of course the real world matters and in the real world almost everyone uses x86. But if someone can change that it is intel.
but this is like reading the comments after John Carmack has posted some remarks on graphics chips. There's always a rush of people to claim "I know its not trendy, but he's full of shit". Ah, the rebel without a cause . . . the problem being, of course, that there are some people who actually have accomplished significant achievements. These people, such as Linus or Carmack, will always get a listen from those of us who are less technically inclined because they have proven that they have at least SOME idea of what they are talking about, whereas the critics are nobodies.
After all, if you're so smart, how come you haven't done anything anyone else would notice? Or, put another way, the world is full of people who Know Better. These people will tell you, until they are blue in the face, that they Know Better. We can take their word that they Know Better, because they told us this themselves. But if you can't demonstrate that you Know Better where the rubber meets the road, then, well, you really don't have much to say, do you?
Does Linus know everything there is to know about CPU architecture? Nope. He doesn't even know everything there is to know about Linux. But he does know a lot more than the average bear, and unlike the peanut gallery lurking on internet message boards, he has demonstrated that he knows a lot. If Linus doesn't like the Itanium, that's a kick in the teeth to Intel regardless of whether your imaginary compiler works a lot better on the Itanium than on x86.
Ya, but he works for Transmeta. If he were trying to pimp the company he works for he'd be pushing some Transmeta chip, not AMD's stuff. Then again I could be wrong and there could be some connection between AMD and Transmeta or some "The enemy of my enemy is my ally" type of deal.
Without Windows for x86-64, AMD is dead. No, Linux will not save it. However, the moment Microsoft releases Windows for x86-64, Itanic is history. The market will overwhelmingly favour x86-64 because of the much lower price (I expect at least 3-4 times lower, considering that the Itanic CPU alone sells for over $3000), and perfect backwards compatibility. Itanic's ia32 support is so pathetically slow that it may as well not exist, so a move to Itanic requires you to replace _all_ your software, which ain't cheap, while x86-64 allows you to do incremental upgrades. So, taking simple economics into account, Itanic will go the way of that ship and AMD will emerge the winner... provided there is a version of Windows for x86-64. Without that there is no point in talking about a "64 bit desktop" market because it just won't exist. So what is Microsoft doing?
___
If you think big enough, you'll never have to do it.
(Hasn't the Itanium architecture been nailed down for almost a decade now? And we're still waiting on better compilers for it?)
That's the key isn't it? Itanium demands breakthroughs in compiler technology. Will this happen?
I dunno.
Maw! Fire up the karma burner!
Sometimes, just because someone HAS an economic interest in something, and IS interested in seeing something fail/succeed, that does not automatically invalidate the point he/she makes. Linus didn't just put forth an unsubstantiated rumor or point of view; he backed his points up with facts and reasoning. If he is biased, show facts and reasoning to counter the bias, or else you are no better than the FUD-mongers when you write him off.
It is still a full port, if you want to get the benefits of the 64-bit architecture. If you want to keep running 32-bit x86 code, don't even bother recompiling. But don't make the mistake of thinking that switching 32-bit x86 code over to x86-64 is a simple re-compile.
//(forgive me if I have the parameters backwards, I'm doing this from memory. And notice that I'm a bad programmer, I didn't check the return value.)
It is still a port, with all that is included in that awful word.
Do you understand how little 64-bit safe code there is that runs on 32-bit x86 systems? Most of the Linux kernel is already 64-bit safe, because it has been ported to so many other 64-bit architectures already. And it still wasn't a simple "just recompile it".
Speaking specifically to C programs here, porting from 32-bit to 64-bit is not a fun process. A variable declared as "long" (or even "int", depending on the data model) switches in allocation size. This is good and bad.
fread(&var, sizeof(long), 1, fp);
Congratulations, you just killed all your existing data files. And if you happened to read a 32-bit pointer from that data file (any structures that you write directly that contain a pointer write a pointer... you'll throw the pointer value away when you read the structure back in, but you still have to read the proper data size), and then assign a pointer to it... Oh, you're going to have all sorts of fun playing with that.
Yes, this may only be an issue with "bad" C code that assumes it will ever only run on a 32-bit platform... That probably covers 99% of all x86 C code out there, for any OS you care to name.
Don't pretend it will be easy moving from 32-bit x86 to x86-64. For most programs, I assure you, it will be non-trivial. Anything that does direct memory allocation will have to be checked very carefully. Anything that does binary file i/o will have to be checked very carefully. Oh, and anything that uses "magic" numbers will have to be checked... Have you ever used an if conditional for an int of the form
if (i == 0xFFFFFFFF)
congrats, you just assumed 32-bit for your architecture.
64-bit clean code is the exception, not the rule.
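The two traps described above (binary file I/O sized by a native type, and a hard-coded 0xFFFFFFFF) can both be sidestepped with width-independent code. A minimal sketch, assuming a C99 compiler with `<stdint.h>`; `read_field32` and `all_bits_set` are hypothetical helper names, and byte order is a separate portability issue that this sketch does not address:

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Read one 32-bit field from a binary file.
   int32_t is 4 bytes on both 32-bit and 64-bit builds, so the
   on-disk record size does not change when the code is recompiled.
   Returns 1 on success, 0 on a short read or error. */
int read_field32(FILE *fp, int32_t *out)
{
    assert(fp != NULL && out != NULL);
    return fread(out, sizeof *out, 1, fp) == 1;
}

/* "Are all bits set?" without assuming how wide unsigned long is.
   ~0UL has every bit set whatever the width, whereas comparing
   against 0xFFFFFFFF only works when the type is 32 bits wide. */
int all_bits_set(unsigned long v)
{
    return v == ~0UL;
}
```

Unlike the snippets in the post, `read_field32` reports a failed read through its return value, so the caller can tell a truncated file from valid data.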
This is my sig. There are many like it but this one is... Oops. Frank, I've got your sig again! Where's mine?
It would be very risky for Microsoft not to provide a version of Windows for x86-64. Microsoft are already facing major competition in their server market from free *nix. If they allowed the competition free rein on a very fast and powerful architecture, they would be taking a major risk.
Microsoft need to weigh up development costs against the risk that *nix/x86-64 will become very popular and decide whether it's worth their while to compete with a version of Windows. I would guess it is.
Yes, RISC programs tend to be longer (sometimes considerably) than CISC programs. There are actually two (main) reasons for this. First, as you mentioned, you need to replace single instructions (usually ones that do memory-to-register operations) with multiple operations (such as a load followed by a math operation). Second, the instructions themselves tend to be longer, since most (all?) RISC architectures have exclusively 4-byte instructions - usually something like 1 byte for the opcode, a few bits for flags, and the remainder for up to three arguments (register numbers or immediate values). CISC architectures have varying length instructions, some a single byte and some several bytes.
There are a couple of mitigating factors here. First, compilers are usually not very good at using some of the more complicated combined instructions, so they go unused, inflating CISC code to match RISC code. Second, careful optimization of RISC code can identify repeated or unnecessary memory operations, and eliminate them. When the memory operations are tied to the arithmetic (or other) operation, that is not possible. Finally, since RISC architectures generally have large register files and all registers are equivalent, fewer operations are needed to shuffle bits around to where they can be worked on, whereas i386 has a lot of operations that only work on certain registers (though far fewer than earlier incarnations).
I used to be a big fan of Intel cutting life support on the 386 architecture, because RISC is so obviously cleaner and nicer. However, I have started to believe the AMD hype about x86-64, which is basically along the lines Linus talks about here. RISC vs. CISC doesn't really matter any more, and the i386 architecture is not so bad. If you A) add some more general purpose registers, B) eliminate most of the remaining register usage restrictions, and C) ditch the worst (looking and performing) FPU on the market in favor of almost anything else, you have yourself a very serviceable architecture. Extend the registers and addressing to 64 bits, and you have something that has a lot more room to grow. That is what the x86-64 is, and despite all the rumors that Intel has their own 64 bit extension to x86, if they don't actually release soon people will start to adopt x86-64 and they will be in the unenviable position of having AMD dictate the future of their product line.
I have heard frequently that something like only 5% of the transistors on the PPro core were tied to the "i386ness" of the core. I assume with the P4 that number is even less. It seems then that the instruction set is not as big of a deal as we would like to think.
The thing that puzzles me about ia64 is: if the whole point is to "make the compiler do it", and none of the fancy instruction reordering is done in silicon, why is it so expensive?
Now sure, Linus seems to be a great guy, and I'm not saying he isn't, but he still has his slant to things, conscious or not, deliberate or not, malicious or not. So it does pay to keep all things in perspective. He appears to be more benevolent than most, a competent programmer, and a personable individual, but he is also an employee of Transmeta, and that affects what he says or doesn't say.
"You're right," Fisheye says. "I should have set it on 'whip' or 'chop.'"
That's the real key. Nowadays, recompiling well written software for different CPUs is trivial, provided the OS API is the same. If Windows runs on an Itanium well, I can likely just recompile my software and be done. If it can emulate 80x86 well enough to let me run old Windows programs, that's game-set-match.
/. doesn't like the view of the world through Microsoft colored glasses, but that's the reality that is out there. If it runs their software quickly, users couldn't care less about what the CPU type is, and that includes high-end server applications as well.
I realize
That's a very theoretical argument. And in practice, current CPUs like x86 already require that code follow a strict set of rules for maximum performance (think of e.g. cacheline alignment and branch likelihood). So the difference between Itanium and, say, the P4 is not as big as a strictly theoretical argument would have you believe. The _big_ difference of course is that in almost a decade, nobody has managed to build that "ass-kicking" Itanium compiler.
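The two x86 tuning rules mentioned above, cacheline alignment and branch likelihood, look roughly like this in C. This is a sketch under stated assumptions: the 64-byte line size is typical but not guaranteed, `__builtin_expect` is a GCC/Clang extension rather than standard C, and `struct counter`, `UNLIKELY` and `sum_nonnegative` are names invented for the example.

```c
#include <stdalign.h>
#include <stddef.h>

/* Assumption: 64-byte cache lines. Common on x86 parts, not universal. */
#define CACHE_LINE 64

/* Branch hint: a GCC/Clang extension, with a plain fallback elsewhere. */
#if defined(__GNUC__)
#define UNLIKELY(x) __builtin_expect(!!(x), 0)
#else
#define UNLIKELY(x) (x)
#endif

/* A counter forced onto its own cache-line boundary (C11 alignas). */
struct counter {
    alignas(CACHE_LINE) long value;
};

/* Sum the non-negative entries, telling the compiler that
   negative entries are expected to be rare. */
long sum_nonnegative(const long *v, size_t n)
{
    long total = 0;
    for (size_t i = 0; i < n; i++) {
        if (UNLIKELY(v[i] < 0))
            continue;
        total += v[i];
    }
    return total;
}
```

Neither hint changes what the code computes; both only nudge layout and code generation, which is the sense in which conventional CPUs already reward code that follows the rules.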
This has got to be a troll.
I'll elaborate. Itanium is statically scheduled, so a "good" compiler has to know in advance the order in which instructions will complete. This is impossible when you're doing random memory accesses some of which will hit in cache but you don't know which ones.
Every other modern CPU can do out-of-order execution so a "good" compiler doesn't have to know exactly the order in which instructions will complete.
That is why there are "good" compilers for most architectures, but not for Itanium.
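The "random memory accesses" problem can be illustrated with two loops that compute the same kind of sum. This is only a sketch of the access patterns, with hypothetical names (`struct node`, `sum_array`, `sum_list`); it makes no performance measurement and says nothing about any particular compiler.

```c
#include <stddef.h>

struct node {
    long value;
    struct node *next;
};

/* Array sum: every address is base + i * sizeof(long), known in
   advance, so the loads are easy to schedule ahead of time. */
long sum_array(const long *v, size_t n)
{
    long total = 0;
    for (size_t i = 0; i < n; i++)
        total += v[i];
    return total;
}

/* List sum: the address of each load is the result of the previous
   load. Whether p->next hits or misses in cache depends on where the
   nodes happen to live at run time, which a static schedule cannot
   know. */
long sum_list(const struct node *head)
{
    long total = 0;
    for (const struct node *p = head; p != NULL; p = p->next)
        total += p->value;
    return total;
}
```

An out-of-order core can keep working on other instructions while a `p->next` load is outstanding; a statically scheduled one has to have guessed the latency when the code was compiled.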
The Itanium compilers need to find parallelism that doesn't manifest itself in regular C/C++ code. Sure, the compiler can optimize things, but it does not understand your algorithm, nor can it observe how it might execute all its code paths; the hardware will see real behaviour. As Linus suggests, the architecture is just too clever to be truly practical. Further, each Itanium rev is likely to need to change the rules on how instructions can be grouped and how long it takes for a load to complete before the results are usable, OR enforce some rules in HW to stall the execution with NOPs until it is. And again, Linus hints that if you need to apply rules in HW, then do them completely and not in some half-assed way.
That said, I think Transmeta also has some problems, because they too rely on deriving parallelism from an x86 code stream, which might be very difficult to achieve while still having the code retire things in the correct sequence. In this respect Hyper-Threading could be very effective because it allows the processor to obtain a level of parallelism from distinctly different code streams. Transmeta could of course implement pseudo HT, but who knows. Perhaps Linus?
Itanium 2 in benchmarks is faster than any other processor used for the same thing, it even kicks the Power 4's ass! Itanium given time will have software support and better compilers that are more optimized for the Itanium architecture. Intel had many problems getting a working processor and even more trouble trying to get support. Intel backs the Linux community and then we now see that support being thrown out the window BY the Linux community. Does this make sense? Itanium, yes, on paper has been around for years, yes there were lots of problems, yes it doesn't run 32-bit code well (remember Intel's first shot at RISC with the P6 processor known as the Pentium Pro? It didn't work with 16-bit code well as there wasn't a consumer OS at the time (Windows 95) that could use only 32-bit code. This chip worked well on 98/NT/2000/XP/ME and Linux/BSD). So the moral of the story is, give it a little time, support it, and it will shine :D .
sig, Twat's that all about?
The implications of this are more profound than it first seems. First, this removes the requirement that CPUs be particularly compatible with anything. In fact, some of the Transmeta CPUs aren't compatible with each other - yet they all run the same programs.
This frees up the design incredibly. For comparison, the Transmeta CPUs (which Linus writes code morphing software for) and the IA-64 (which he thinks is crap) are both VLIW architectures, with about the same issue width and number of registers, and so on. They are more similar to each other than either is to the x86.
However, the Transmeta CPUs leave a lot of things like the ordering of instructions and handling of exceptions exposed to software. The code morpher takes care of those jagged edges, making them disappear - as a result, the CPU implementation is very simple but very effective.
In comparison, the IA-64 needs to explicitly specify what instruction combinations are allowable in the input packets, needs to store exception information (there are at least 128 bits that do nothing but remember if an exception happened when using a register), and so on. The result is that the sharp jaggie bits now look like soft jaggie bits, yet the mechanisms needed to keep the whole thing consistent bog it down. I don't think an IA-64 could be implemented at all in a CPU as simple as Transmeta's.
The second important thing is that the CPU is not tied to a single instruction set. I think Transmeta has made a mistake in not including support for other instruction sets, but it has demonstrated that it can. I remember reading that the first demo showing a Transmeta CPU running Doom was written in x86 machine language, except for the inner loop which was written in Java bytecode. Every iteration the CPU+software switched instruction sets without missing a frame.
If Transmeta is aiming for the low power, embedded marketplace, the dominant player there is ARM, not x86. If they could offer the ability to mix popular embedded ARM code with popular desktop x86 code, they might have a real winner there.
But as yet, the revolutionary aspects of the Transmeta designs are unexploited, and mostly even unnoticed. But I'm convinced that even if they don't do it, it will get done another way, whether it's Java, .NET, some Open Source project (Parrot?), or something that's not even noticed yet (Tao Group Elate?). But I think they are revolutionary.
Because then the University can't take the code and sell it at a profit. It's pretty standard practice that all work you do, including new, patentable ideas, are considered "work for hire" and are owned by the school. "Existing project" and "OSS" tends to mean, "we can't exploit this to get money".
I've been called a "Fucking Dick" by better people than you.