Linus Has Harsh Words For Itanium
Anonymous Coward writes "As a follow up to the earlier story "Intel: No Rush to 64-bit Desktop"... In words that Intel are likely to be far from happy with, the Finnish luminary has stuck the boot into Itanium. His responses to some questions on processor architecture are sure to be music to AMD's ears. Linus, in an Inquirer interview concludes: "Code size matters. Price matters. Real world matters. And ia-64... falls flat on its face on ALL of these."" Of course, Linus works for a chip maker ;)
Not to mention the fact that most home users won't see a 2X performance boost from 64 bits.
Not only does he work for a chip maker, he's like totally obsessed with the i386 architecture. I guess it's what he cut his teeth on and he's going to stick with it. But to think that no-one else has a use for it is very short-sighted.
...
It'll probably still make it into the kernel, though. I mean, alpha and sun architectures are in there, so
This is from the Linux-Kernel mailing list, not an Inquirer interview. Here is the post.
Wow, just about everything under one topic. Linux, AMD, and Intel. So by this we are going to have 64-bit processors soon, is that what I'm hearing? Or will this turn out to be like most computer issues and come out a few years from now?
FOML: Rise to Power
Now, we all know that the Itanium isn't everything it's cracked up to be, and I think none of us are wrong in blaming Intel for coming out with a lousy product....
But, isn't one of those situations he mentions in the interview (namely, running a large database server) what this chip is designed to be doing?
As I recall, the IA64 isn't designed for the desktop user... In fact, desktop users probably don't even need 64-bit processing for a number of years still....
Yet we're attacking Intel for making the chip to fit its niche?
Perhaps we need to be more fair in the context of the usefulness of the chip, instead of considering it in all contexts and criticizing it based on that?
Linus being opinionated and brash? Never!
"but speaking strictly from the technological point of view"
I think that was his point. It's great technically but it sucks in the real world. If it's not practical it's a shitty architecture IMHO.
I also think the x86-64 is a more viable solution as well.
----
Go canucks, habs, and sens!
Now I'm no programming guru, but it seems to me that the x86-64 architecture is a great one. In fact, the only thing that I could see being done to improve it would be to add more general purpose registers. I believe that the new registers are all GP (IIRC), but I think that making them ALL GP (even the older ones) would be good, and maybe bring up the number of registers to a good round 32 or something. Am I missing something glaringly wrong? If you're going to toss out all of the x86 stuff (like ia-64), I think you should be able to emulate it in hardware about as fast as current x86 processors can. When Apple switched to PPC, couldn't they emulate 68k code about as fast (or at least faster than 1/2 the speed of) the fastest 68k chips?
Comment forecast: Bits of genius surrounded by a sea of mediocrity.
The best architecture is still VAX. Clearly string operations at the processor level are what any processor needs to be the best and fastest ;}
The fact that Linus works for a chip maker doesn't really matter because he doesn't develop the chips. He gets paid there to develop the Linux kernel.
Worse is better
although the original essay talks about Unix and the LISP machines, it just keeps being true. Linus talks about the "charming oddities", well there you go: worse is better. Try for perfection, and the real world will eat you alive.
I also think he's right about the masses being what matter; I think Intel is still thinking about the data centre, not Joe Sixpack, with Itanium.
ZOMG I WOULD LOVE TO KNOW ABOUT YOUR FEELINGS ON MACINTOSH VERSUS WINDOWS, VI VERSUS EMACS, AND HOW YOU'RE NOT A DORK
As much as we depend on Intel to push CPU manufacturing techniques to new heights, they have fallen down in the desktop market anyway. I've lost count of how many new units they've added for poor low-level optimizers to keep up with. This, with the slap in the face of reduced instructions per tick in the P4 so they could juice up the multiplier and sell "faster" MHz CPUs at double the price, is more than enough for me to stop watching them. I'm far more interested in the new Power5 coming out of IBM as a 64-bit architecture to pay attention to. BTW, whatever happened to the Alpha 21364? Is a 64-bit CPU really newsworthy?
"It's worth noting that Torvalds' employer, Transmeta, has licensed x86-64 so he is likely to have access to Hammer hardware." This sounds really interesting. Any ideas what it means?
Of course, Linus works for a chip maker
And if trends continue, it could be Old Dutch.
So he is more likely to know what he's talking about.
Personally, I'm getting a bit tired of all the inane cynicism that passes for reflective commentary in modern society. While it's true that the world has its villains, it is more true that people often just hold opinions irrespective of their economic interest. I, for one, trust that Linus is among these favored many.
(Not joking this time)
What will the 'masses' do with a 64-bit processor? The best reason to move up to 64 bits is to increase maximum memory, and although memory is now cheap, it's not that cheap!
32-bit processors can address up to 4GB of RAM. The most memory I know someone to have is 1GB, and computers most often come with half of that, 512MB. We still have a long way to go before we hit the 4GB ceiling (a long while!).
I am actually a tad worried for AMD, since they plan on coming out with the x86-64 pretty soon, and I don't know who will actually buy it (or need to buy it).
64-bit processors belong where they are most needed: specialized machines.
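For anyone who wants the arithmetic behind the 4GB ceiling, here is a minimal sketch in Python. It assumes a flat, byte-addressed memory model; the function name is made up for illustration, and the 36-bit row is the physical address width that PAE gives some 32-bit x86 chips.

```python
GIB = 2 ** 30  # bytes in one gibibyte


def addressable_bytes(address_bits: int) -> int:
    """Bytes reachable through a flat, byte-addressed pointer of this width."""
    return 2 ** address_bits


# 32 = plain 32-bit pointers, 36 = PAE physical addresses, 64 = full 64-bit pointers
for bits in (32, 36, 64):
    print(f"{bits}-bit addresses: {addressable_bytes(bits) // GIB} GiB")
```

Real chips usually wire up fewer physical address lines than the pointer width allows, so treat the 64-bit figure as an upper bound rather than something you can actually install.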
what is nailchipper?
Sun has an interesting (biased) piece on Itanium. If I were buying a server I would avoid Itanium like the plague. It is possible that Intel could even cancel the whole project and leave customers high and dry. Not to mention software availability is a problem.
I prefer the RISC architecture. I like the idea of keeping things simple and efficient, which is a lot like structured programming. VLIW does not follow this ethic.
http://saveie6.com/
Can't say I disagree with Linus's logic, but I don't know if this was that great of a decision politically-speaking. It might not matter, but if anything linux *needs* support from big players like Intel and vice versa in order to grow. This won't necessarily hurt, but I doubt it can help matters on the Intel front.
>>They delivered a revolutionary product.
It's not 'revolutionary', if there is no revolution.
People toss this word about like it means 'incremental change'. The Industrial Revolution was a revolution because it entirely changed the way people live and work. How is anything Transmeta has done even remotely close to something of this level? It's not.
The Inquirer.com isn't exactly a bastion of responsible reporting.
It doesn't look like an interview took place at all. It looks like they took some choice quotes out of context from the kernel development mailing list to spur some pageviews.
If I recall correctly the Crusoe processor is 128-bit. It is simply executing 32-bit code through "code morphing".
Netcraft confirms it: Itanium is dying.
One more crippling bombshell hit the already beleaguered Itanium community when Slashdot confirmed that Linus thinks Intel dropped the ball with Itanium. Itanium now powers 0.00% of all servers. Coming on the heels of a Netcraft survey which plainly states that Itanium has gained absolutely NO market share, this reinforces what we've known all along: Itanium is collapsing in complete disarray.
You don't need to be a Kreskin to predict Itanium's future. The writing is on the wall: Itanium faces a bleak future, in fact there won't be any future at all because Itanium is dying. Intel has dumped millions into Itanium, red ink flows like a river of blood.
All major surveys show that Itanium has steadily held its ground at 0.00% use while millions of other processors are produced daily. If Itanium is to survive at all it will be among CPU dilettante dabblers and hangers-on. Nothing short of a miracle could save Itanium at this point in time. For all practical purposes, Itanium is dead.
Trolling is a art,
Mickey-mouse == poor quality, inconsistent
Outfit == organization, company.
Free Java games for your phone: Tontie, Sokoban
What the hell are you smoking? I want some.
Every RISC architecture with the exception of the sparc3 performs better. Especially IBM's Power4 and the upcoming Power5.
Also there is more than speed when comparing architectures. Itanium is a terrible platform to write compilers for. A lot of optimizations which are traditionally done in the chip itself at runtime must be set by compiler options. Not all of it can be done efficiently like this.
Speedwise Alpha is getting old now but is still the fastest chip around until the Power5 comes out this fall. For coding and optimization, MIPS is the best CPU around.
look here:
pricewatch
almost $3000 for the chip. wow, and for so many mhz, too...
--
"It is now safe to switch off your computer."
Actually that's a good question. I think chipmakers should slow down a bit and enjoy life. Perhaps meet halfway with a 48 bit chip...
Code size matters. Price matters. Real world matters
If only on-chip instruction set morphing mattered...
(sorry, but it's true...he's living in a glass house on this one.)
Code size matters because *cache* isn't cheap. Worse, you can't make L1 cache arbitrarily fast without slowing down your chip big time.
Number 2 (make cache bigger) is easier said than done, and works against number 1 (cost).
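To put a rough number on the code-size point, here is a back-of-the-envelope sketch in Python. The one solid fact is that IA-64 packs three instruction slots into each 128-bit (16-byte) bundle; the 16 KB cache size and the 3-byte average x86 instruction length are assumptions picked purely for illustration, and the helper name is made up.

```python
def instructions_that_fit(cache_bytes: int, group_bytes: int, instrs_per_group: int) -> int:
    """How many instructions fit in a cache, counting only whole encoding groups."""
    return (cache_bytes // group_bytes) * instrs_per_group


CACHE_BYTES = 16 * 1024  # assumed L1 instruction cache size

# IA-64: three instruction slots per 16-byte bundle.
ia64_instrs = instructions_that_fit(CACHE_BYTES, 16, 3)

# x86: variable-length encoding; 3 bytes per instruction is an assumed average.
x86_instrs = instructions_that_fit(CACHE_BYTES, 3, 1)

print(f"IA-64: {ia64_instrs} instructions, x86: {x86_instrs} instructions")
```

This says nothing about how much work each instruction does, or how many IA-64 slots end up as no-ops, so it only illustrates the density argument, not overall performance.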
"It's overkill, of course. But you can never have too much overkill." - Anonymous Slashdot Coward
Memory is cheap, but the architectures to try and contain all of the memory certainly are not. Try telling your fortune 100 customer that they need to go buy a bigger sun/hp box because you need 80gb of ram to run your application and they start making you reduce memory consumption.
The "memory is cheap" thing is silly, and just encourages people to be lazy and not layout data structures in the optimal way, and/or use efficient data containers.
Check the latest SPEC CPU benchmarks. The Itanium2 has the fastest floating-point score and is no slouch in the integer tests either. It will improve. Linus will eat his words in a few years.
"I don't care what Linus says about it."
Makes ya curious why anyone should care about what he says, duddn't it?
I'd rather hear about what people can do than hear people complain about what they can't do. Why? Because you buy hardware to suit your needs, not suit your needs to your hardware.
"Did anybody notice a 2X performance boost moving from Windows 3.1 on 16bit MS-DOS to a nominally 32bit Windows 95 OS?"
I did. There was so much less time in between crashes that I learned to move quickly!
Well, that fud filled anti-MS joke should earn me a Karma point or two.
Would you really want to return to the DOS himem.sys, memmaker, extended and expanded memory, and autoexec.bat hacks again? Sure, they were not needed for the first several years of DOS when people had only 512 kb of RAM, but the situation changed quickly. This is what first turned me off from Microsoft. If I had 8 megs of RAM and had 6 free, why couldn't I run Dune 2? Do I not have a 32-bit chip? I had to create a custom boot disk with autoexec.bat just to run the game. That is screwed up.
A Hammer is nice just like a 386 was nice to have to run 16-bit software. They were particularly useful in Windows 3.11 since it actually had 32-bit disk access while everything else was 16-bit. The Hammer is fast at running 32-bit software and is easily upgradable if customers want to add RAM. They do not understand techno mumbo jumbo. It's not like you can explain base-2 math when Joe just wants to purchase a 4 gig RAM stick and wonders why Windows won't recognize all the RAM.
[AMD] Was recently considering leaving the CPU business altogether
Uh.. what? AMD can't leave the CPU business. That would leave them with.. Flash memory. We all know how much revenue that brings in for them.
You have any links to support this claim?
Dacels Jewelers can't be trusted.
"AMD Was recently considering leaving the CPU business altogether."
Um when was that? The only thing I recall was a Slashdot article with a misleading headline...
To quote Linus: "And I further bet that using a native distribution (ie totally ignoring the power and price and bad x86 performance issues), ia-64 will work a lot worse for people simply because the binaries are bigger. That was quite painful on alpha, and ia-64 is even worse - to offset the bigger binaries, you need a faster disk subsystem etc just to not feel slower than a bog-standard PC."
Yeah, RISC workstations always seemed sluggish to me for interactive use. Not sure if it's really due to the increased time to load binaries, or some other optimization issue.
The read from theinquirer.net is all wrong. The slashdot story line is also wrong. It does not state at all what it implies. Here is the link to what Linus actually wrote:
http://www.ussg.iu.edu/hypermail/linux/kernel/0302.2/1909.html
Now, I agree with Linus on the PPC MMU issue. Can anyone tell me what he means by "baroque instruction encoding"? I have been doing x86 and 68k assembler for a long time, I have never heard of this.
Enjoy,
It's just the normal noises in here.
Linus isn't saying he won't let it in. He's simply saying that he thinks it's not a good arch based on technical merit. He'll let it in. He never said he wouldn't. He's just saying he doesn't like the way the chip was designed (what choices they made, etc).
What are YOU smoking?
Optimizations done at compile time are far better than optimizations done at runtime. At compile time, more is known about the structure of the program, where the flow of the program will be going, and more time intensive optimizations can be done than ones done in realtime in the cpu.
Itanium is slower right now, but as compilers with optimizations tailored to it come out, it has the potential to kick every RISC processor's ass. The reason for this is that RISC processors are bogged down by doing the optimizations at runtime that Itanium doesn't have to care about. This means the Itanium will have the same or fewer stalls and more efficient use of the processor.
Go read up on compiler optimizations to see why cutting out the middleman of instruction sets is a good thing.
I think the problems with the Itanium boil down to this:
1. The CPU's are insanely expensive. They make the majority of x86-architecture Intel Xeon CPU's look like a bargain.
2. Where are the server applications that take advantage of the Itanium CPU? They're not exactly widely available, to say the least.
3. Programming for Itanium is still a somewhat iffy proposition.
Meanwhile, AMD's Athlon 64/Opteron offers these advantages:
1. The CPU will definitely NOT be insanely expensive to purchase.
2. Programming for the AMD x86-64 architecture is not going to require kiboshing a bunch of legacy programming tools and starting from scratch--it is a straightforward process to convert today's programming tools to take full advantage of the x86-64 native mode.
3. Because the programming tools are so readily available, both operating systems and applications for the Athlon 64/Opteron will be available widely by the time the new AMD CPU's are finally released for sale. Already, UnitedLinux is porting Linux to run in x86-64 native mode, and Microsoft is very likely readying versions of Windows XP Home/Professional and Windows 2003 Server that will run in x86-64 native mode.
Meanwhile, Intel supposedly has developed a 64-bit x86-architecture CPU codenamed Yamhill. However, given that we don't know how Yamhill implements 64-bit x86 instructions, Intel will have to do some VERY serious convincing to get Linux kernel programmers and Microsoft to write Yamhill-native code--and Intel is far behind the AMD efforts.
Code size matters because *cache* isn't cheap. Worse, you can't make L1 cache arbitrarily fast without slowing down your chip big time.
This is irrelevant with the trace cache in the Pentium4. Instructions are decoded into micro-ops, by "traces" which are sequences of executed ops, and stored in the cache. Compact x86 CISC instructions are not stored in the trace cache.
No, because win95 is a piece of shit OS, but your point is valid. AMD should take this opportunity to stick it to Intel however. Intel's been playing this game with consumers' minds that the bigger the number the better the processor. Turnabout is fair play and I hope AMD takes this opportunity to bombard the computer buying public with 64 bit ads. I'd love to see Intel's answer to that, "uuhhhhh, bigger is better only if it says Intel Inside, yeah that's the ticket!"
MIPS is behind Itanium in performance. HP-PA is behind Itanium in performance. SPARC3 is behind Itanium in performance. SPARC64V is behind Itanium in performance. Alpha has higher specint but lower specfp. Power has higher specint but lower specfp. Both major current IA32 processors have higher specint, but they are slaughtered on specfp.
That's without even mentioning TPC or Java benchmarks which make Itanium look just as good or better.
they were not needed for the first several years of DOS when people had only 512 kb of ram
Wha? I would have given my right arm for 512kb. Mine had 128k. Next step up had 256k. Geeze, 512k? We'd have been in happyland...
wow, but you know why they call it "power4" and "power5" though.
The damn things have THOUSANDS of pins (like 8000+ on some iterations, iirc?), and drain MASSIVE current - as in, the kind that makes your dual O/C'd Athlon look like an LCD clock.
I think IBM's Power4/5 chips are "unsuitable for real world" as well, but for some different reasons. That's not to say they won't be put into some nice servers though - and that's the point: Itanium2 wasn't meant for the desktop (for at least a while) anyway, and I think in their world they play by some different rules.
My life in the land of the rising sun.
The SPEC benchmarks are real-world. That's the point of them, and they've been used over the last 10 years to judge the real performance of a processor.
Don't forget that Itaniums are clocked far lower than P4's. The difference is that Intel doesn't plan on marketing 64bit chips to the consumer for a couple years, while AMD has their sights set earlier due to the expected lifespan on the Athlon-family and that their future is bet on 64bits.
I guess the main thing to note is that the P4 will be around for at least two years longer, where you can't say the same thing about the Athlon family, at least at the high end.
Also coming into the picture is that Apple may have 64 bit workstations in ~ a year.
I suppose I'm not too threatening, presently, but wait till I start Nautilus
Yes, revolutionary.
Just like the Segway.
AMD is the wildcard. If x86-64 is the bomb and takes off like AMD is betting it will, Intel has lost the 64-bit war for many years. IBM and maybe even Sun will quietly (well, Sun doesn't do jack shit quietly) push x86-64 for the low end while IBM's POWER4 and POWER5, and POWER6 down the road, run the big end.
Basically Intel needs something like Sun to jump on IA64 to really give it some credibility, and they don't sound real eager to. IBM sounds like they are down for the fight. Alpha, MIPS, PA-RISC are all pretty dead, long term and relatively speaking. Meanwhile, if Intel doesn't get on the shit quick then they'll have to support x86-64 too, and that's the real death blow to IA64.
SPEC scores tell me almost nothing useful. The code to run SPEC benchmarks is emitted by tricked-out compilers whose whole purpose is to emit hand-crafted assembly code specifically tuned to run those SPEC benchmarks. It doesn't tell me anything about how well common programs and subsystems perform at common tasks. You might as well buy a family car based on the quarter-mile time at the racetrack for a like-model car with a supercharger and dangerously-tweaked ignition timing, burning 120 octane racing fuel.
In five years, if the Itanium isn't a huge success, will you eat your words?
Back when it was released, it was roundly maligned for offering shitty performance for Win95 users. "Buy a Pentium 233MMX" all the magazines screamed.
Well, the PPro turned out to be one of the best chips of its day, and the 200MHz version performed within 5% of the Pentium II 300MHz chips that were released 18 months later. I still have a dual-PPro system running my CVS/MP3/print/etc. server.
Linus may be a god in the linux software universe, but I wouldn't discount Intel on this just yet.
...The shipping is free.
As time goes by, computer languages are trending towards more dynamic behavior. This tends to favor things like JIT compiling and linking into already running programs. Fewer people are going to be able to afford the luxury of spending hours to preprocess their code to fit into an extremely static ("explicitly parallel") hardware model. This will be especially true when chip makers treat their rocket science static compilers as a separate profit center.
Not to mention, the CPU is the one that is actually in the position to know what optimization is needed right now based on the currently running data set. Given that there is usually a several year lag between the latest CPU developments and widespread compiler support, I'd go for a CPU that knows how to do its own tricks. (Hasn't the Itanium architecture been nailed down for almost a decade now? And we're still waiting on better compilers for it?)
That doesn't mean it's the best solution. Merely the one that's going to win. Architecturally speaking, x86 is one of the biggest loads of crap to come along since...well...hmmm...I can't think of anything crappier off the top of my head.
Extreme register pressure. Segmentation models that make you want to retch. Hacks (PAE, anyone?) that leave any sane designer gibbering incoherently.
If you read the thread, Linus' main argument seems to be "to get good performance, all the other architectures have had to do complex things in hardware, so there's no real hardware simplification in going with a 'better' architectural design. Plus variable length opcodes are a natural cache optimization!"
I respect Linus a great deal, but he's talking out of his ass here. I agree that IA-64 may be best relegated to some academic's wet dream, but just about any of the major RISC architectures are big wins over x86. Intel and AMD have worked miracles with x86 to get it to run fast, but at a staggering engineering cost. The teams working on RISC chips tend to be a fraction of the size and still come out with a high-performance chip. If the RISC houses had an engineering team of comparable size (and access to the same bleeding edge lithography processes) it would easily be worth an extra 25% in performance, minimum.
If you look in the embedded world, just about anything that requires serious embedded performance is RISC based (MIPS/ARM, mostly), simply because it decreases the engineering work involved by an order of magnitude. Plus, writing low level software for just about any RISC chip is loads easier than for x86.
Unfortunately, x86 is here to stay for the foreseeable future. Intel killed Alpha, not by buying it, but by doing a great job of pushing cheap x86 performance to the same level as Alpha, often surpassing it in later years. The same thing is happening to the other workstation-class RISC vendors, and, honestly, to Itanium, too. I don't see any reason to believe the march to x86 hegemony outside the embedded world is likely to slow anytime soon.
ia64 is in the mainline kernel. At least Debian and Red Hat have released stable distributions for it. Red Hat even sells support for it.
ia64 is "in there" as much as alpha and sparc, even if it isn't quite as well tested.
If you overclocked it it would eat itself..
NT was built on the i860 first, then ported to the i386 arch. More accurately, MS engineers emulated the i860 until the chip was ready.
MS did this to make their new OS more or less platform independent. They didn't want to get 'stuck' on the x86.
Slashdot story here . Article here
Huh?
The second highest rated TPC box in the world is running Itaniums...
http://www.tpc.org/tpcc/results/tpcc_perf_results.asp?resulttype=noncluster
Linux made him ... oh wait nevermind.
Transmetta makes a lot of ... oops there I go again.
Intel is a company that time and time again proves it knows how to make money. It may not always support the crowds it should (like /. readers and superusers) but they are still making money.
Sure there are lots of difficulties going to a new ISA. Especially at the server level. And yes Itanium has had some performance problems, especially in its first revision, but then again when was the last time you saw a company produce a 1st generation microprocessor and have it do well?
IA64 offers tons of advanced ILP concepts and OS concepts that, when correctly implemented, can increase performance drastically (if you're looking for examples: data speculation, control speculation, predication, registers with kernel access only, rotating register files, a much larger register set, etc).
The problem may be, it puts a lot of complexity into the Compilers, and compiler technology isn't good enough for Itanium yet.
But then again, what do I know, Linus has made more money than I have. I just like arguing the other side while everyone else screams about how the Itanium will die.
Let me repeat this one more time:
NO GAMING CONSOLE IS 128-bit (nor will they be 256-bit)
The PS2 is a 32-bit system. It has a 32-bit wide address space and word space. It happens to have a quad-word SIMD execution unit. By this logic, the MMX-enabled Pentium would be a 64-bit chip, and anything with SSE a 128-bit one.
Okay... got that out of my system.
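If the distinction between SIMD width and "bitness" isn't obvious, here is a toy Python model of a quad-word SIMD add: one 128-bit value treated as four independent 32-bit lanes. Every name here is made up for illustration and it is not meant to model any particular console's hardware; the point is only that a wide register is several narrow numbers side by side, which has nothing to do with how wide a pointer is.

```python
LANE_BITS = 32
LANES = 4  # four 32-bit lanes make up one 128-bit register
LANE_MASK = (1 << LANE_BITS) - 1


def pack(lanes):
    """Pack four 32-bit values into one 128-bit integer, lane 0 in the low bits."""
    value = 0
    for i, lane in enumerate(lanes):
        value |= (lane & LANE_MASK) << (i * LANE_BITS)
    return value


def unpack(value):
    """Split a 128-bit integer back into its four 32-bit lanes."""
    return [(value >> (i * LANE_BITS)) & LANE_MASK for i in range(LANES)]


def simd_add(a, b):
    """Lane-wise add: each lane wraps at 32 bits and never carries into its neighbour."""
    return pack([(x + y) & LANE_MASK for x, y in zip(unpack(a), unpack(b))])


a = pack([1, 2, 3, 0xFFFFFFFF])
b = pack([10, 20, 30, 1])
print(unpack(simd_add(a, b)))
```

The last lane overflows and wraps to zero without disturbing the others, which is exactly what a genuine 128-bit integer add would not do.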
What the 64-bit address space WILL do is make OS design simpler. This is an important win for developers. I understand OS start-up times will be vastly improved because applications, libraries, etc. will all be able to load at static addresses in memory, all precomputed. It'll also make database-as-filesystems easier to implement.
Forget gaming machines, this is BIG stuff, a big step, and Intel is foolish to ignore it.
Fuck Beta. Fuck Dice
I know it's not very nerd-like to say that Linus is wrong and that AMD sucks, but in the case of the Itanium, that is exactly how I feel. Intel/HP's Itanium architecture is perhaps the most advanced processor to hit the market and has tremendous potential (from a Computer Architecture point of view). Because it's so new, its performance will be awful, but shall improve with time. Anyone remember the SuperSparc? It performed horribly and was soon replaced by the UltraSparc. As will the Itanium II replace the Itanium.
As for the emulation/legacy code argument, I say screw it. gcc is already ported to IA-64. And as a Linux user, most of my favorite open source programs can be ported with little difficulty.
First of all it is not very smart to try to reduce code size by putting complicated instructions in the processor architecture.
A successful architecture may be used for 20 years, and there is no way you can know which complex instructions will be most useful/popular in several years. And when you start making upgraded chips for a design, these complex instructions will be a real pain in the ass.
The x86 architecture is a perfect example - it is a mess and many of its instructions are not used at all. The x86 is successful because of the way history played out - it was put in the first PCs, and the incredible numbers of processors sold allowed Intel to put more development money into that architecture than anybody else was able to put into theirs. And large initial investments and large sales numbers mean that individual chip prices can be lower.
Nevertheless, the Alpha and some of Sun's chips can still compete with Intel in the server environment, with much smaller investments and worse production technology. That basically shows the weakness of the x86 architecture.
When you have multiple pipelines and multiple stages per pipeline the size of your chip will grow exponentially with the number and complexity of your instructions. Eventually adding more pipelines will be pointless and then you are reduced to adding cache as the only way you can improve your architecture.
For a Risc architecture, multiple pipelines will cost less overhead and more can be used. Processor performance can be increased by adding more pipelines without having to increase speed.
Intel has the money and the clout to make a successful RISC architecture. It is brave of them to do it, but from an engineering point of view it is the only right thing to do.
AMD will support x86 because they do not have the clout to force a new architecture on the world. It is a completely understandable policy, but then again will result in worse performance (unless their engineers are somehow much more brilliant than intel's).
Of course the real world matters and in the real world almost everyone uses x86. But if someone can change that it is intel.
They have so many pins because it is not a single cpu. It is an MCM (multi-chip-module). Each Processor "Brick" contains up to 8 CPU cores.
Current draw is around 250-300A for an MCM. A lot? Hell yeah, but your average Athlon XP pulls about 35A. 8 x 35 = 280.
So, not so big a difference.
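The same comparison spelled out in Python, using only the figures quoted above (35A per Athlon XP, 8 cores per MCM, 250-300A per MCM). The figures are the poster's, not measurements, and the variable names are just for illustration.

```python
ATHLON_XP_AMPS = 35          # per-chip figure quoted above
CORES_PER_MCM = 8            # cores per multi-chip module, as quoted above
MCM_AMPS_RANGE = (250, 300)  # quoted current draw for one MCM

# Current eight Athlon XPs would draw if you needed them to match the core count.
equivalent_amps = ATHLON_XP_AMPS * CORES_PER_MCM
within_range = MCM_AMPS_RANGE[0] <= equivalent_amps <= MCM_AMPS_RANGE[1]

print(f"{CORES_PER_MCM} Athlon XPs: {equivalent_amps}A, inside the MCM range: {within_range}")
```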
Early on the chief advantage of the approach was that you could use the freed silicon for things like extra registers, and that's exactly the approach taken by Acorn (now ARM) and the PowerPC range. Would you prefer to have eight registers and a single byte copy-block instruction, or 64 registers and have to replace that copy-block instruction with (*gasp*) three simpler instructions?
(Actually, I guess that depends on how good your cache is. There's no such thing as a free lunch)
You are not alone. This is not normal. None of this is normal.
Ya, but he works for Transmeta. If he were trying to pimp the company he works for he'd be pushing some Transmeta chip, not AMD's stuff. Then again I could be wrong and there could be some connection between AMD and Transmeta or some "The enemy of my enemy is my ally" type of deal.
Without Windows for x86-64, AMD is dead. No, Linux will not save it. However, the moment Microsoft releases Windows for x86-64, Itanic is history. The market will overwhelmingly favour x86-64 because of the much lower price (I expect at least 3-4 times lower, considering that the Itanic CPU alone sells for over $3000), and perfect backwards compatibility. Itanic's ia32 support is so pathetically slow that it may as well not exist, so a move to Itanic requires you to replace _all_ your software, which ain't cheap, while x86-64 allows you to do incremental upgrades. So, taking simple economics into account, Itanic will go the way of that ship and AMD will emerge the winner... provided there is a version of Windows for x86-64. Without that there is no point in talking about a "64 bit desktop" market because it just won't exist. So what is Microsoft doing?
___
If you think big enough, you'll never have to do it.
(Hasn't the Itanium architecture been nailed down for almost a decade now? And we're still waiting on better compilers for it?)
That's the key isn't it? Itanium demands breakthroughs in compiler technology. Will this happen?
I dunno.
Maw! Fire up the karma burner!
Sometimes, just because someone HAS an economic interest in something, and IS interested in seeing something fail/succeed, that does not automatically invalidate the point he/she makes. Linus didn't just put forth an unsubstantiated rumor or point of view; he backed his points up with facts and reasoning. If he is biased, show facts and reasoning to counter the bias, or else you are no better than the FUD-mongers when you write him off.
Torvalds wrote that Intel had made the same mistakes "that everybody else did 15 years ago"
when RISC architecture was first appearing.
RISC first showed up on the commercial radar screen almost twenty years ago when MIPS Computer Systems
was formed. But people at Stanford (and Berkeley, IIRC) had been publishing papers about
RISC for four or five years before that, and people at IBM were working on it even before that.
And the CDC 6600 was a RISC machine in the 1960s. If you don't believe me, ask Cray's Chief Scientist Burton Smith.
In seeking the unattainable, simplicity only gets in the way. -- Alan Perlis
No electrons were harmed creating this post, though some may have been subjected to electrical and/or magnetic fields.
Yeah, that's a given. I was just trying to point out that the Crusoe is already a 128 bit processor. So you won't be seeing a 64 bit Crusoe any time soon. Maybe a Crusoe executing instructions intended for an X86-64, but that's just an extension of the code morphing software. Even then I think the new astro chip would be more likely for that application since it looks to be meant for high density low cost blade servers with more punch than the Crusoe.
Some guy from Transmeta badmouths Intel's new processor and Slashdot files this under AMD?
I know that AMD has something to gain here but shouldn't this be under a different topic? Maybe when it gets reposted it'll be correct.
Who is Itanium good for? Who is G4 or Power4 good for? What is X86 good for?
That's like asking what is a saw, hammer and screwdriver good for...they each have an application.
All these architectures have their good points and bad points. I've written sparc and x86 assembler and I can't say that they are better or worse than each other....just different.
At this point the hardware is MOOT. Unless algorithms get significantly better soon, the hardware won't matter. Sure, we'll get mega memory address space with any 64-bit architecture, but what does that get you? More memory address space? Big deal...so you've got big memory space...that won't make NP=P any time soon.
-ted
It is still a full port, if you want to get the benefits of the 64-bit architecture. If you want to keep running 32-bit x86 code, don't even bother recompiling. But don't make the mistake of thinking that switching 32-bit x86 code over to x86-64 is a simple re-compile.
//(forgive me if I have the parameters backwards, I'm doing this from memory. And notice that I'm a bad programmer, I didn't check the return value.)
It is still a port, with all that is included in that awful word.
Do you understand how little 64-bit safe code there is that runs on 32-bit x86 systems? Most of the Linux kernel is already 64-bit safe, because it has been ported to so many other 64-bit architectures already. And it still wasn't a simple "just recompile it".
Speaking specifically to C programs here, porting from 32-bit to 64-bit is not a fun process. A variable declared as "int" switches in allocation size. This is good and bad.
fread(&var, sizeof(int), 1, fp);
Congratulations, you just killed all your existing data files. And if you happened to read a 32-bit pointer from that data file (any structures that you write directly that contain a pointer write a pointer... you'll throw the pointer value away when you read the structure back in, but you still have to read the proper data size), and then assign a pointer to it... Oh, you're going to have all sorts of fun playing with that.
Yes, this may only be an issue with "bad" C code that assumes it will ever only run on a 32-bit platform... That probably covers 99% of all x86 C code out there, for any OS you care to name.
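One way around the data-file problem described above is to stop writing native types to disk at all and pick an explicit width and byte order for the file format. The following is only a minimal sketch under that assumption: the helper names `decode_u32_le` and `read_u32_le` are made up for illustration, and the choice of little-endian 32-bit fields is arbitrary. Whichever types change width on a given platform (on LP64 systems it is long and pointers rather than int), the file layout stays the same.

```c
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper: assemble a 32-bit value from four bytes stored
   least-significant byte first, independent of the host's int width
   and byte order. */
uint32_t decode_u32_le(const unsigned char b[4])
{
    return (uint32_t)b[0]
         | ((uint32_t)b[1] << 8)
         | ((uint32_t)b[2] << 16)
         | ((uint32_t)b[3] << 24);
}

/* Hypothetical helper: read one such field from a file.
   Returns 1 on success, 0 on a short read or I/O error. */
int read_u32_le(FILE *fp, uint32_t *out)
{
    unsigned char b[4];

    if (fread(b, 1, sizeof b, fp) != sizeof b)
        return 0;               /* the return value is checked this time */
    *out = decode_u32_le(b);
    return 1;
}
```

A reader built this way always consumes exactly four bytes per field, so recompiling for a wider platform does not change how existing files are parsed.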
Don't pretend it will be easy moving from 32-bit x86 to x86-64. For most programs, I assure you, it will be non-trivial. Anything that does direct memory allocation will have to be checked very carefully. Anything that does binary file i/o will have to be checked very carefully. Oh, and anything that uses "magic" numbers will have to be checked... Have you ever used an if conditional for an int of the form
if (i == 0xFFFFFFFF)
congrats, you just assumed 32-bit for your architecture.
64-bit clean code is the exception, not the rule.
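To make the magic-number trap concrete, here is a small sketch using fixed-width types so the widths are explicit. The function names are invented for this example. The "all bits set" test matches -1 only while the variable is 32 bits wide; once the variable is 64 bits wide, the same literal is just the positive number 4294967295.

```c
#include <stdint.h>

/* With a 32-bit variable, "all bits set" and 0xFFFFFFFF coincide. */
int all_ones_32(int32_t v)
{
    return (uint32_t)v == UINT32_C(0xFFFFFFFF);
}

/* The same literal against a 64-bit variable: the literal is converted
   to the value 4294967295, so a variable holding -1 no longer matches. */
int matches_magic_64(int64_t v)
{
    return v == 0xFFFFFFFF;
}

/* A width-independent way to ask the question that was really meant. */
int all_ones_any(int64_t v)
{
    return v == -1;
}
```

The first and third functions agree on what -1 means; the second is what the original test silently turns into when the variable gets wider.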
This is my sig. There are many like it but this one is... Oops. Frank, I've got your sig again! Where's mine?
Sorry while I rant, but you just stomped on one of my nerves. (Unless your comment about needing that much RAM was a complaint about Adobe or their direct *cough* competitors -- sucks to be you.)
<Old Geezer Mode> In one case, not long ago, a fellow lab-rat Eric Mortenson had sold his research and tools to Adobe, but part of the poorly-written agreement said that he couldn't upgrade his workstation. So he finished his Ph.D on a 386 with 32 MB of RAM, while the rest of us in the lab were using Pentium 3s, DEC Alphas, and various SGI boxes. Eric's algorithms ran great on the newer PCs even though he couldn't develop them on the new boxes. Others with Adobe (NOT on that web site interestingly enough) needed the DEC Alphas (64-bit machines) with scads of memory and much more running time to do a similar implementation of Eric's algorithms. </Old Geezer Mode>
3D rendering doesn't take that much RAM. As a 3D graphics researcher and developer, I have worked with models where individual objects were multi-gigabytes (meshes+textures and volumes) but even then, having 1GB of RAM was more than enough for us to reach 20-30 FPS realtime on a box with NT4 and first- and second-generation 3D cards. Software rendering with very realistic detail was a little slower (3-5 fps) but was fine for writing movies. Progressive geometry & texture transmission, continuously calculated view-dependent detail levels, and other current and not-so-current research would solve the memory problems in 3D. Don't believe me? Go to Visualization 2003 and see if the leading researchers are finding RAM as their primary bottleneck. It is a bottleneck of course, but processing speed, caches, and the system bus limitations are far more troubling.
As for video editing, you only need enough memory for the tools, a few frames, and whatever operations you are performing. In every case that I've had to do video editing, I've seen two classes of tools -- those that take gobs of memory and try to copy the entire video clip into RAM and end up thrashing for memory -- and those that intelligently figure out what is needed and use only the memory needed for the app.
An example of the first, an Adobe AfterEffects rendering a simple math function over time was only able to render 30-seconds because it wanted to buffer the AVI file in memory and ran out of RAM (2GB) after a several-hour rendering. An example of the second, a simple home-brew compositor that used the Windows multimedia API to write the AVI to disk -- the same machine and the same set of images required about 45 minutes to render the entire clip.
So instead of saying:
I would suggest you say " I need to buy tools that are properly designed and implemented for my class of computer. "
Frob.
//TODO: Think of witty sig statement
It would be very risky for Microsoft not to provide a version of Windows for x86-64. Microsoft are already facing major competition in their server market from free *nix. If they allowed the competition free rein on a very fast and powerful architecture, they would be taking a major risk.
Microsoft need to weigh up development costs against the risk that *nix/x86-64 will become very popular and decide whether it's worth their while to compete with a version of Windows. I would guess it is.
Take my grandfather for example. He worked as a transportation lawyer, back in the bad old Interstate Commerce Commission days. The ICC was created to regulate railroad monopolies, but was eventually co-opted by the railroads to keep out trucking competition. In order to establish a new shipping route, you needed to prove to the ICC that there was a "need" for this new shipping route. Clearly, it was an absurd, anti-competitive system. My grandfather retired shortly before the industry was deregulated. However, to this day, he still believes that the ICC was a good thing, because being dependent on its existence for a job made him a believer.
The point is, when you have an economic interest in something, it can start to affect how you think about things. The wealthy tend to want tax cuts, and the poor tend to want spending increases, but most of them are probably not consciously supporting those positions for their own selfish ends. They truly believe that what's good for them is the right thing for everyone. It's a natural justification process for humans, and I wouldn't think less of Linus for the tricks his subconscious might play on his mind.
"The question of whether a computer can think is no more interesting than that of whether a submarine can swim" -EWD
Would you prefer to have eight registers and a single-byte copy-block instruction, or 64 registers and have to replace that copy-block instruction with (*gasp*) three simpler instructions?
You obviously know much more about cpu architecture than I do, so perhaps you could help me with something I don't totally understand: I don't doubt that RISC architecture is simpler, but isn't the executable code for a RISC cpu much larger than the code for a non-RISC cpu? For some things that can be done in one instruction on a CISC cpu, it takes several on a RISC. This would seem to me to take both more code for a program and be more intensive on the system bus by transferring the extra instructions. Am I wrong?
Why is this moist???
The key point about Itanium is that it is a horrible general purpose processor but it is a serious contender to be a very good processor for supercomputing. It has very good floating point performance and the EPIC architecture is designed to be very good on Fortran, especially vectorizable Fortran, which is very prevalent in HPC applications. What Linus said is correct in the context of Itanium as a general purpose processor, but it doesn't give Itanium the credit it's due as a floating point supercomputer, which is the only place it's going to sell and is what it was designed for.
It will probably never be very good for most C and C++ apps. Pointer aliasing in particular will give the Itanium compiler fits. Unless you manually tell the compiler there are no two pointers accessing the same memory, the compiler can't safely or effectively pack the parallel instructions in the VLIW, and that is essential to good performance in VLIW.
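In C99 the standard way to make that manual promise is the `restrict` qualifier. A minimal sketch follows (the function name `add_arrays` is just an example): with the qualifiers in place, the compiler is allowed to assume that a store through `dst` never changes what `a` or `b` point at, so it is free to load several elements ahead and schedule the additions side by side. Whether a given compiler actually takes advantage of that is up to the compiler.

```c
#include <stddef.h>

/* restrict is the programmer's promise that dst, a and b never overlap.
   Calling this with overlapping arrays is undefined behavior. */
void add_arrays(float *restrict dst,
                const float *restrict a,
                const float *restrict b,
                size_t n)
{
    for (size_t i = 0; i < n; i++)
        dst[i] = a[i] + b[i];
}
```

Without the qualifiers the compiler has to assume that writing `dst[i]` might modify `a[i + 1]`, which forces the loads and stores to stay in program order.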
You do have to really question the sanity of some execs at Intel and HP for spending the staggering sums they've spent on Itanium. Supercomputing just isn't big enough a market for them to have any chance to recoup their investment in our lifetime and they aren't going to sell it in to the mass market as Linus said.
For a general purpose 64 bit processor to run existing C and C++ applications, AMD is going to win hands down. But as many have noted, it's not likely most people are going to really need a 64 bit processor anytime soon, so Intel will probably do just fine selling 32 bit x86 processors for a while.
@de_machina
There are two issues here:
1. There is no difference in the speed it takes to transfer data, because the bus is wider. There is also no difference in the time it takes to process data, because registers are also wider. There is a decrease in cache performance (because addresses take up more space). All other things (CPU design, clock speed, etc.) being equal, this hit would be about 5%. It would only apply to programs running in 64-bit mode, though (the Hammer can still run in 32-bit mode, and can use 8, 16 and 32-bit pointers even in 64-bit mode, in certain instructions).
2. AMD's x86-64 Hammer doesn't just increase the register size to 64 bits. It adds several new registers, that can (with minor adjustments in the compilers) give a pretty good speed improvement (I'd say about 10% for the same clock speed, although this will depend a lot on the specific program). It also improves the prefetch and adds SSE2 support (one of the few areas where the P4 has an edge). This should give the Hammer approximately a 20-25% improvement over an Athlon XP at the same clock speed (more, if SSE2 is used).
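The cache effect in point 1 is easy to see with a pointer-heavy structure. This is only an illustrative sketch: the struct, the function name, and the 64-byte cache line used in the discussion are assumptions for the example, not measurements of any particular CPU.

```c
#include <stddef.h>

/* A typical doubly linked list node: two pointers and a small payload. */
struct list_node {
    struct list_node *next;
    struct list_node *prev;
    int value;
};

/* How many whole nodes fit in one cache line of line_bytes bytes. */
size_t nodes_per_line(size_t line_bytes)
{
    return line_bytes / sizeof(struct list_node);
}
```

On a typical 32-bit build the node is 12 bytes, so five fit in a 64-byte line; on a typical 64-bit build the pointers double in size, the node grows to 24 bytes with padding, and only two fit. The payload did not change, but the same cache now holds fewer nodes.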
RMN
~~~
Yes, RISC programs tend to be longer (sometimes considerably) than CISC programs. There are actually two (main) reasons for this. First, as you mentioned, you need to replace single instructions (usually ones that do memory-to-register operations) with multiple operations (such as a load followed by a math operation). Second, the instructions themselves tend to be longer, since most (all?) RISC architectures have exclusively 4-byte instructions - usually something like 1 byte for the opcode, a few bits for flags, and the remainder for up to three arguments (register numbers or immediate values). CISC architectures have varying-length instructions, some a single byte and some several bytes.
There are a couple of mitigating factors here. First, compilers are usually not very good at using some of the more complicated combined instructions, so they go unused, inflating CISC code to match RISC code. Second, careful optimization of RISC code can identify repeated or unnecessary memory operations, and eliminate them. When the memory operations are tied to the arithmetic (or other) operation, that is not possible. Finally, since RISC architectures generally have large register files and all registers are equivalent, fewer operations are needed to shuffle bits around to where they can be worked on, whereas i386 has a lot of operations that only work on certain registers (though far fewer than earlier incarnations).
I used to be a big fan of Intel cutting life support on the 386 architecture, because RISC is so obviously cleaner and nicer. However, I have started to believe the AMD hype about x86-64, which is basically along the lines Linus talks about here. RISC vs. CISC doesn't really matter any more, and the i386 architecture is not so bad. If you A) add some more general purpose registers, B) eliminate most of the remaining register usage restrictions, and C) ditch the worst (looking and performing) FPU on the market in favor of almost anything else, you have yourself a very serviceable architecture. Extend the registers and addressing to 64 bits, and you have something that has a lot more room to grow. That is what the x86-64 is, and despite all the rumors that Intel has their own 64 bit extension to x86, if they don't actually release soon people will start to adopt x86-64 and they will be in the unenviable position of having AMD dictate the future of their product line.
I have heard frequently that something like only 5% of the transistors on the PPro core were tied to the "i386ness" of the core. I assume with the P4 that number is even less. It seems then that the instruction set is not as big of a deal as we would like to think.
The thing that puzzles me about ia64 is: if the whole point is to "make the compiler do it", and none of the fancy instruction reordering is done in silicon, why is it so expensive?
That's the real key. Nowadays, recompiling well written software for different CPUs is trivial, provided the OS API is the same. If Windows runs on an Itanium well, I can likely just recompile my software and be done. If it can emulate 80x86 well enough to let me run old Windows programs, that's game-set-match.
/. doesn't like the view of the world through Microsoft colored glasses, but that's the reality that is out there. If it runs their software quickly, users couldn't care less about what the CPU type is, and that includes high-end server applications as well.
I realize
I've never run NT on Alpha, but my understanding is that Microsoft took the easy way out and ran NT basically as a 32-bit OS. Linux, OTOH, fully supports 64-bit and has done so ever since the Alpha port matured. Microsoft finally had to bite the bullet and fix their 32-bit-isms when they came out with Win2k for ia64, but that was years later.
The Alpha also gets you on unaligned memory accesses - if you and your compiler are not careful, you can force some very slow OS traps. I wouldn't be surprised if your applications were slow partly because of this.
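For what it's worth, the usual portable way to stay out of that trap in C is to never dereference a pointer that might be misaligned, and to copy the bytes into a properly aligned variable instead. A minimal sketch, with `load_u32` as a made-up helper name:

```c
#include <stdint.h>
#include <string.h>

/* Read a 32-bit value from an address that may not be 4-byte aligned.
   memcpy has no alignment requirement, so this never performs a
   misaligned word access in the C sense; the compiler picks whatever
   instruction sequence is safe for the target. */
uint32_t load_u32(const void *p)
{
    uint32_t v;

    memcpy(&v, p, sizeof v);
    return v;
}
```

Casting a byte pointer to `uint32_t *` and dereferencing it is what tends to produce the slow traps (or outright faults) on strict-alignment machines.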
Hmmm, what's not to like about Digital Unix? I always thought it was quite nice, as proprietary Unixes go. It was slower than Linux, due to its microkernel layer.
"How can you claim that you are anti-crack, while still writing a window manager?" — Metacity README
When I say 3D rendering I don't mean openGL/DirectX rendering....I mean full raytraced, reflection/refraction with global illumination, complex shaders, etc. With scenes as complex as I'm working with, I'm still looking at rendertimes of upwards of an hour per frame.
And as for video editing. Take a 1920x1080 (max HDTV) clip in raw 1 targa per frame format, add gradients, filters, masks, particle effects, 3d camera movements and lighting, and tell me you can buffer more than a few seconds in RAM. Don't believe me? Go download the demo version of Combustion from Discreet and try it.
I'm out of my mind right now, but feel free to leave a message.....
Go here for a really good summary of current CPUs.
The Internet is full. Go Away!!!
Ask anyone who has done assembly language programming on x86 and a decent CISC and x86 will always lose out too.
But the x86 has evolved a lot since the bad old days. You could regard the ugly stuff as vestiges of a primitive form and stick to saner modes.
A larger code size can be a significant disadvantage nowadays. Imagine CISC as compressed RISC opcodes. The current situation is the CPU is VERY much faster than the RAM or even the 2nd level cache. So it's not a big deal to have to decompress (decode/expand to RISC) instructions in the CPU. You gain overall processing throughput that way.
As long as that situation remains, larger code size is a significant issue. It means fewer programs in memory.
True RISC processors you talk about are declining. Most are becoming more pragmatic. Which is what Linus is talking about.
There is no chance of seeing a Power 4 or 5 in an Apple machine. They are IBM's high end server processors.
The PPC970 however is a different matter. Based on the Power 4 core with AltiVec, minus the on chip level 3 cache and multiple cores (though going back to multiple cores is a possibility when they improve their fabrication to 90nm from 130nm)
Don't blame me - this
Intel was founded 1968 and AMD in 1970, so what was your point about AMD being "so young"?
I'll elaborate. Itanium is statically scheduled, so a "good" compiler has to know in advance the order in which instructions will complete. This is impossible when you're doing random memory accesses some of which will hit in cache but you don't know which ones.
Every other modern CPU can do out-of-order execution so a "good" compiler doesn't have to know exactly the order in which instructions will complete.
That is why there are "good" compilers for most architectures, but not for Itanium.
Itanium's problems were visible from the moment the architecture appeared. It is, and was, an architecture that should excel at running Fortran programs, which are much more easily optimized than code written in C, C++, or Java. Compilers written ten years ago should be able to do a decent job compiling Fortran to Itanium with only a modest amount of porting work. Problem is, people aren't just running Fortran on Itanium.
The apparently-dynamic nature of current programs (that is, the intractability of statically analyzing them) has been coming for years. Ten years ago I spent my time studying the inner loops of SPEC benchmarks, and even then the typical inner loop of a C program was the instructions:
compare X with a value
branch out if equal
load indirect through Y to get Y'
load indirect through Y' to get X
branch to top of loop.
If Y (and Y', and Y'', etc) don't address memory in cache, you're hosed. Static prediction algorithms used in some of the first RISC chips (HP-PA, e.g.) work as well as any other on this loop, but you don't know that you're done until you load all the data and compare it. The loop cannot run any faster per iteration than the latency of the memory that happens to hold the data (Cache is King).
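In C source, an inner loop of that shape is what you get from something as ordinary as searching a linked list. The sketch below uses invented names (`struct node`, `find_key`) purely for illustration; each iteration has to wait for the previous node's `next` pointer to arrive from memory before it can even start loading the next key.

```c
#include <stddef.h>

struct node {
    int key;
    struct node *next;
};

/* Walk the list until a node with the wanted key is found.
   Returns the matching node, or NULL if there is none. */
const struct node *find_key(const struct node *n, int wanted)
{
    while (n != NULL && n->key != wanted)   /* compare, branch out if equal */
        n = n->next;                        /* dependent load: the next address
                                               is unknown until this completes */
    return n;
}
```

No amount of static scheduling can overlap those loads, because the address of each one is the result of the one before it.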
Object oriented programming, whether accomplished with an OO-TM programming language, or just a structure full of function pointers, is about the same can of worms (internally, the processor is caching the last location of the indirect branch, so it is not substantially different from prediction of conditional branches).
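A "structure full of function pointers" looks roughly like the following in plain C. All the names here are hypothetical, chosen only to show the pattern. The call inside the loop is an indirect branch whose target depends on data loaded at run time, which is why the hardware ends up predicting it in much the same way as a conditional branch.

```c
struct shape;

/* The table of operations: one function pointer per "method". */
struct shape_ops {
    int (*area)(const struct shape *self);
};

/* Every object starts with a pointer to its table. */
struct shape  { const struct shape_ops *ops; };
struct rect   { struct shape base; int w, h; };
struct square { struct shape base; int side; };

static int rect_area(const struct shape *self)
{
    const struct rect *r = (const struct rect *)self;
    return r->w * r->h;
}

static int square_area(const struct shape *self)
{
    const struct square *s = (const struct square *)self;
    return s->side * s->side;
}

static const struct shape_ops rect_ops   = { rect_area };
static const struct shape_ops square_ops = { square_area };

/* The compiler cannot know statically which function each call reaches:
   load the ops pointer, load the function pointer, then branch. */
int total_area(const struct shape *const *shapes, int n)
{
    int sum = 0;

    for (int i = 0; i < n; i++)
        sum += shapes[i]->ops->area(shapes[i]);
    return sum;
}
```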
Itanium 2 is faster in benchmarks than any other processor used for the same thing; it even kicks the Power 4's ass! Given time, Itanium will have software support and better compilers that are more optimized for its architecture. Intel had many problems getting a working processor and even more trouble trying to get support. Intel backs the Linux community, and now we see that support being thrown out the window BY the Linux community. Does this make sense? Yes, Itanium has been around on paper for years; yes, there were lots of problems; yes, it doesn't run 32-bit code well (remember Intel's first shot at RISC with the P6 processor known as the Pentium Pro? It didn't work well with 16-bit code, as there wasn't a consumer OS at the time (Windows 95) that could use only 32-bit code. This chip worked well on 98/NT/2000/XP/ME and Linux/BSD). So the moral of the story is: give it a little time, support it, and it will shine :D .
sig, Twat's that all about?
Compile time optimizations can take more of the program structure into account, so can be really big wins when they're done well. But they can't take into account program behaviour. For example, the compiler can only tell that there's an if statement in the middle of a loop. The CPU can tell that 99% of the time it is false, so it can skip it and only go back if needed (profiling optimizers can do something similar, as can recompilers and translators - like Java HotSpot JVM and the Transmeta CodeMorpher (Linus works on that), which will both stop and optimize if a block of code is executed a lot).
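Short of a profiling optimizer, about the only way to hand that kind of run-time knowledge to a static compiler is for the programmer to state it. The sketch below uses `__builtin_expect`, which is a GCC/Clang extension rather than standard C; the `UNLIKELY` macro and the function name are made up for the example, and the hint is only a guess baked in at compile time, not something measured while the program runs.

```c
#include <stddef.h>

/* Branch hint: a GCC/Clang extension, with a no-op fallback elsewhere. */
#if defined(__GNUC__)
#define UNLIKELY(x) __builtin_expect(!!(x), 0)
#else
#define UNLIKELY(x) (x)
#endif

/* Sum the non-negative entries and count the negative ones.
   The programmer asserts that negative entries are rare. */
long sum_nonnegative(const int *v, size_t n, size_t *bad)
{
    long sum = 0;

    *bad = 0;
    for (size_t i = 0; i < n; i++) {
        if (UNLIKELY(v[i] < 0)) {   /* the "if in the middle of a loop" */
            (*bad)++;
            continue;
        }
        sum += v[i];
    }
    return sum;
}
```

If the guess is wrong for a particular input, the code still produces the right answer; it just runs with a layout tuned for the wrong case, which is exactly the situation a CPU or a run-time translator can notice and a static compiler cannot.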
The compiler is also nearly clueless about the structure of the CPU. Although the compiler can assume a CPU has a floating point unit, it has no idea if it has two that can execute in parallel, or how long the pipelines are. Modern CPUs can re-order instructions to match the actual hardware. The hardware can even add registers by "renaming" them - the x86 instruction set has eight general purpose registers, but the CPUs usually have dozens, so if a group of instructions share a register but don't share data, the CPU can give two independent registers the same "name" as far as the instructions are concerned, and execute both groups at the same time.
Yes, a compiler might assign registers more effectively, but only if you knew before hand what CPU was going to be used.
Compilers can group instructions together in such a way that the CPU is more likely to detect the right patterns. In this case, the optimizations are "implied", and a smart CPU will pick up on them and run faster, a dumb CPU won't, but it normally won't hurt performance. This means that you can have both high level, compiler optimizations as well as low level, CPU optimizations that the compiler could not perform.
The implications of this are more profound than it first seems. First, this removes the requirement that CPUs be particularly compatible with anything. In fact, some of the Transmeta CPUs aren't compatible with each other - yet they all run the same programs.
This frees up the design incredibly. For comparison, the Transmeta CPUs (which Linus writes code morphing software for) and the IA-64 (which he thinks is crap) are both VLIW architectures, with about the same issue width and number of registers, and so on. They are more similar to each other than either is to the x86.
However, the Transmeta CPUs leave a lot of things like the ordering of instructions and handling of exceptions exposed to software. The code morpher takes care of those jagged edges, making them disappear - as a result, the CPU implementation is very simple but very effective.
In comparison, the IA-64 needs to explicitly specify what instruction combinations are allowable in the input packets, needs to store exception information (there are at least 128 bits that do nothing but remember if an exception happened when using a register), and so on. The result is that the sharp jaggie bits now look like soft jaggie bits, yet the mechanisms needed to keep the whole thing consistent bog it down. I don't think an IA-64 could be implemented at all in a CPU as simple as Transmeta's.
The second important thing is that the CPU is not tied to a single instruction set. I think Transmeta has made a mistake in not including support for other instruction sets, but it has demonstrated that it can. I remember reading that the first demo showing a Transmeta CPU running Doom was written in x86 machine language, except for the inner loop which was written in Java bytecode. Every iteration the CPU+software switched instruction sets without missing a frame.
If Transmeta is aiming for the low power, embedded marketplace, the dominant player there is ARM, not x86. If they could offer the ability to mix popular embedded ARM code with popular desktop x86 code, they might have a real winner there.
But as yet, the revolutionary aspects of the Transmeta designs are unexploited, and mostly even unnoticed. But I'm convinced that even if they don't do it, it will get done another way, whether it's Java, .NET, some Open Source project (Parrot?), or something that's not even noticed yet (Tao Group Elate?). But I think they are revolutionary.
Because then the University can't take the code and sell it at a profit. It's pretty standard practice that all work you do, including new, patentable ideas, are considered "work for hire" and are owned by the school. "Existing project" and "OSS" tends to mean, "we can't exploit this to get money".
I've been called a "Fucking Dick" by better people than you.