Why Doesn't the Itanium Get the Respect It's Due?
happycorp wonders: "As in recent years, the Itanium does well, easily beating x86 processors even at its low clock speed (1.4 GHz). The supercomputer people are serious about benchmarking (no easily tricked microbenchmarks or reliance on closed-source commercial apps), so the discrepancy between the performance and perception of this chip is serious. With a single-CPU Itanium2 system at around $2000, their price is already reasonable, and the price would come down (and software would be ported) if the Itanium ever became a mass market chip. Having an affordable chip one step above a Xeon or Opteron in floating-point performance would not be such a bad thing for gaming enthusiasts (or 3D artists). So, the recent article on the Top 500 supercomputers list brings up a question I've been meaning to ask: Why do we see so many disparaging opinions of the Itanium processor (all those 'Itanic' jokes, etc.)?"
"It seems computing enthusiasts' sentiment is set against this processor, and it's likely that it's going to be abandoned sooner or later. We'll be paying for x86 compatibility indefinitely (recall that the Xeon has roughly three times the number of transistors of the ppc970, for example, but we hardly get three times the performance).
These are a couple scores from the top 20, with the total gigaflops divided by the number of processors to obtain a per-processor speed:
rank  processor  ghz   (gflops / #procs)  speed
#5    ppc970     2.2   (27910 / 4800)     5.81
#7    itanium2   1.4   (19940 / 4096)     4.86
#10   opteron    2.0   (15250 / 5000)     3.05
#20   xeon       3.06  (9819 / 2500)      3.92
Given this, consider what a 2 or 3 GHz Itanium could do.
(fine print: I am not affiliated with the Itanium or the top500 list in any way)."
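The per-processor arithmetic in the submission can be reproduced with a short script. This is only a sketch: the function names are made up for illustration, the numbers are copied from the post, and the "what could a 2 or 3 GHz Itanium do" projection assumes throughput scales linearly with clock speed, which is an optimistic assumption that real chips rarely meet.

```python
# Figures copied from the submission's table.
# Each tuple: (rank, processor, clock in GHz, total gigaflops, processor count)
ENTRIES = [
    (5, "ppc970", 2.2, 27910, 4800),
    (7, "itanium2", 1.4, 19940, 4096),
    (10, "opteron", 2.0, 15250, 5000),
    (20, "xeon", 3.06, 9819, 2500),
]


def per_processor(gflops, procs):
    """Total gigaflops divided by processor count (the post's 'speed' column)."""
    return gflops / procs


def per_ghz(gflops, procs, ghz):
    """Per-processor gigaflops normalized by clock speed."""
    return per_processor(gflops, procs) / ghz


def naive_scaled(gflops, procs, ghz, target_ghz):
    """Projected per-processor gigaflops at target_ghz.

    ASSUMPTION: performance scales linearly with clock speed. This ignores
    memory bandwidth, interconnect, and thermal limits.
    """
    return per_ghz(gflops, procs, ghz) * target_ghz


if __name__ == "__main__":
    for rank, name, ghz, gflops, procs in ENTRIES:
        print(
            f"#{rank:<3} {name:<9} "
            f"{per_processor(gflops, procs):.2f} gflops/proc, "
            f"{per_ghz(gflops, procs, ghz):.2f} gflops/proc/GHz"
        )
    # Hypothetical projection for the Itanium2 entry at 2 and 3 GHz.
    _, _, ghz, gflops, procs = ENTRIES[1]
    for target in (2.0, 3.0):
        print(f"itanium2 at {target} GHz (linear guess): "
              f"{naive_scaled(gflops, procs, ghz, target):.2f} gflops/proc")
```

Rounding to two decimal places gives 4.87 and 3.93 for the Itanium2 and Xeon rows, so the 4.86 and 3.92 in the post appear to be truncated rather than rounded. These are also whole-machine Linpack totals, so the per-processor figures fold in interconnect and memory system differences, not just the CPU.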
I had to study the chip in one of my EE classes. The technology in it is really, really impressive. I love the memory architecture provisions!
One, it gets no respect because nobody uses it. Where is the kudos for the transputer? Why does nobody love the Apple ///?
Second, yes it beats the x86 into the ground. I'm not surprised. Now show me how it compares against a real CPU. We've already seen that the Itanium is competing in a different space (supercomputers), so show me how it compares with the MIPS that SGI have ditched in its favour. I wouldn't be surprised if an n GHz MIPS stuffs an n GHz Itanic into the floor.
Probably because when it mattered, a single-CPU Itanic was more like $12,000 and not $2,000. After fucking up all their marketing and delivery strategies, no one wants one anymore.
I'm Rick James with mod points biatch!
Itanium was a huge project jointly developed with many partners; most of the significant ones have long since abandoned the effort.
It was supposed to be the future of Intel, shipping units on the order of the Pentium line: a redesign from scratch of how processors "should" be designed.
It's taken far longer, cost far more, and yielded far less than promised.
That's basically it.
Also, I'd be willing to bet Intel staked a bigger part of its decision on the availability of platform-independent binaries making serious inroads, which hasn't really materialized. Platform independence of the major OSS and commercial apps is obtained through porting and source-level compatibility.
I may be entirely wrong, but I believe the dislike for the Itanium stems from the fact that you can't compile any decently optimized code for it. Apparently, even Intel can't create a good compiler/linker and toolkit for creating machine code that makes good use of EPIC. Even though the processor itself is more efficient and faster, the same thing compiled to machine code running side by side with an Opteron or any other x86-64 chip will see the x86 win. If somebody could come up with a decent compiler/linker that provided full EPIC optimizations, they would be bangin, but they don't have it so we don't use it.
The people who work on scientific applications take performance seriously. They put a lot of effort into optimization. The Itanium architecture is hard to optimize for, and the compilers just aren't there yet for the general case. So you wind up with a disparity between the performance in scientific applications and general purpose applications.
Other reasons Itanium can't compete:
1) Compare the performance of Itanium with Xeon/Opteron in running native x86 code.
2) Compare the costs of building real end user systems.
3) Compare the availability of Windows XP drivers.
"Who is the Journal of Quantum Physics going to believe?" --Stephen Hawking
to compile for Itanium. Speaking as a compiler researcher, Itanium is great for generating research papers because there are all sorts of things that you can do from a compiler perspective. The problem is, outside a research environment, someone has to implement a lot of the ideas in an Itanium compiler to make it useful. Unfortunately, most of the stuff in the Itanium research papers isn't easy to implement, and most of what gets put into commercial compilers is the easily implementable ideas.
I only ever called it the Itanic because one of my professors, who works (or worked) at Intel and researched the architecture very extensively to document it, also called it the Itanic. According to him, it was basically what everyone else has been saying so far: great idea, bad execution.
It's like sex, except I'm having it!
I worked at a startup that was building a database ~70 gigs in size. It took 2 months to build said database. Lots and lots of very small lookups and inserts.
Memory was our bottleneck. More RAM equals more speed. So we spent BIG bucks and bought a quad Itanium with 12 or 16 gigs of memory (I forget exactly how much it had).
The Itanium was slower than a dual x86 with 2 gigs of memory! And not just a little slower. We spent weeks trying to get the database optimized.
Why does no one respect the Itaniums? Intel made a slow chip. Then they released the sequel. I've already paid my dues on that line once. I'm not playing this round.
Agile Artisans
Well, there are many reasons the Itanic failed. It was a great architecture, a neat idea. Shift all of the intelligence in the chip up to the compiler, execute in-order, optimised code, get rid of deep bypassing, etc. Generally, get rid of the extra 50% of the chip that's dedicated to turning an instruction stream into a series of vectors.
Note, it *was* a neat architecture.
Then, everybody got involved. Imagine a roomful of architecture, compiler, and systems PhDs, each with their own pet idea. And this chip had them ALL in it. Anybody remember the i432? In a way, this was the i433.
BUT. This meant a complete break with the current codebase, and in the final analysis Intel didn't have the guts for it. Especially once their hopes for compilers weren't being borne out (once, Intel was a HUGE player in the market for compiler PhDs). So the guys at Intel decided to add x86 hardware compatibility to this. Then, since their compiler plans weren't working out, they added out-of-order execution.
Now, all of these things had crazy interactions. Suddenly, who knew what it was doing? Then the power... all those units, executing all those dead instructions - it ran HOT. Then the fact that x86 compat and o-o-o were a gigantic boat anchor in terms of chip real estate, driving the cost through the roof, pretty much sealed its fate. It became a "server processor". And if you get 7 or 8 P4s for the price of one Itanium... well, your cluster is better served with those 7 or 8 P4s.
Pride goeth.
I think the big problem is that it cannot run x86 software very quickly.
Yeah, that is why semi trailers don't get respect like Dodge Neons. They use diesel fuel instead of unleaded!
My point is that if you're buying a fast 64-bit system in order to run your old 32-bit programs slowly, it's the wrong tool for the job.
I've got 65 Itanium processors downstairs. They are fast and reliable for high memory bandwidth floating point calculations, which is what we use them for. They may be a disappointment when running IE or Outlook, but for crunching numbers they are great. I have yet to try an Opteron but will in the next couple of weeks. From what I understand, those too have become great at high memory bandwidth number crunching, but I'll wait for the numbers rather than the marketing speak. Now, Itaniums do suck in the power consumption and heat dissipation department.
Itaniums get such a bad rap here on Slashdot because it's cool to give them one. Itaniums are made by the "big guy", Intel. If they were made by AMD they would not get the same rap as they do.
The other big thing against the Itaniums is market need. A generic x86 that you can throw in the trash and replace for about $1k if there are any problems is sufficient, if not preferred, for 99% of the servers out there. Now, what other market would want a fast 64-bit architecture with high memory bandwidth? Databases. Sun and Oracle fill this void. Well, except for the fast and high memory bandwidth part, but Oracle+Sun is a proven combination with years of experience. Solaris does not run on Itaniums. Linux does (flawlessly), but even Oracle+Linux is not that widely adopted. I have no clue about the state of Windows on an Itanium. I see no real use in running Windows on an Itanium; someone else might, but I doubt it's very common.
Intel still has some way to go with the low-voltage Itaniums, which are capped at 1.3 GHz, but they are working on that. Also, Intel has dropped the price of these guys considerably. Price too was an issue with Itaniums, but it has dropped by about half over the years.
IMHO, Intel should keep working on the power management issues and price, and market these chips more for number crunching. Their performance on the top500 site is impressive, but even if all of the top 500 computers used 4,000 Itanium processors each, that would only be 2,000,000 processors total, and a supercomputer that size is not purchased very frequently.
Most amusing to me was that the early versions had the chip serial numbers on the area covered with the heatsink. Removing the heatsink voided your warranty. You needed that serial number to get warranty work done on the processor.
This sig has absolutely no significance and serves only to take up screen space and waste the time of the reader.
That question answers itself: You think differently from most people. Highly specialized, hand optimized massively parallel predictable crunching seems to matter to you. It doesn't to most people. You're in a minority. Get used to it.
BTW, i860 and Alpha suffered from basically the same problem.
A couple of points that seem to have been missed when looking at why the Itanium is less widespread:
- each CPU is quite large, with a footprint of about 2" x 5", and it's about 2" high
- That area includes a voltage regulator and the passive cooling fans
- It doesn't include any of the necessary active cooling
If you add these physical factors to the points already made about heat, power and EFI BIOS, it's obvious that Itanium won't run in your mini-ATX desktop or laptop. This isn't a slam on the design, as it was never designed to run in those form factors, but it's hard to see how any CPU today is going to see wide use if it isn't available for both desktops and servers. Once you eliminate the desktop market (and I'm going to lump the workstation market in with the servers), the number of places you can sell these processors drops considerably. Once you start adding in the lack of Windows support for Itanium, the strides that the x86_64 architecture has made in capability, and the low number of current adopters, it's not looking like Itanium will ever gain widespread acceptance.
The Internet has no garbage collection
There are two reasons why this chip is not for me:
1: I'm not buying one before the software is ported to it -- and at a comparable price to its PC equivalent!
2: It may be a step above an Opteron for floating point, but is it still that step above a dual processor Opteron that I can buy today for less money than a mono-processor Itanium?
As for the "Itanic" jokes (all of which are way off-base, since heat output of any H.M.S. Itanic would melt any iceberg long before it could do any damage), blame The Register. I saw them use the term long before anyone else.
"It's the height of ridiculousness to say for those 9 lines you get hundreds of millions."
In real supercomputing you do not want your processor to 'auto-schedule' or rearrange your code.
In the end, really special code is still hand optimized, since no compiler or built-in rescheduling algorithm can actually know what I really want to achieve.
Maybe I just want to accept the half ready value because I don't care for part of it.
Maybe I want to put one instruction way ahead to prime a set of registers for what is coming.
A processor which is always auto-scheduling can achieve only performance within the foresight that the rescheduling design put into it. But not for my very special algorithm for just this one dumb equation I want to solve.
It is not a 'fail' criterion for a processor to strictly adhere to what I tell it, and thus provide an exactly reproducible solution path each time.
The result of this automatic rescheduling is that execution times in the end become nondeterministic. In some cases you just want to avoid that.
The scope is for sure not gaming, but hand optimized supercomputer grade code: check the compiler's result over and manually squeeze clock cycles out by doing things that seem to put cycles in, but pay off in the end because one 'senseless' instruction may just have served to prime a cache, register file, or vector set with new content for the next run, just in time when the pipe runs empty.
Curious why the DEC Alpha does not appear in these posts....
They exist; they fail miserably at optimizing most C and C++ code, or at least they aren't good enough at it to make up for Itanic's handicapped clock speed.
You just need to read the other posts here about how hard it is to develop compilers that can find 4-way parallel instructions to cram into the VLIW at compile time. You find a lot more opportunities at runtime using dynamic scheduling, at the price of complexity in the CPU.
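To make the compile-time packing problem concrete, here is a toy sketch of a greedy in-order bundler. It is not how any real Itanium compiler works: the three-field instruction format, the `pack_bundles` name, and the rule that a bundle closes on any read-after-write or write-after-write conflict are all simplifying assumptions, and the width of 4 simply mirrors the "4-way" figure above.

```python
def pack_bundles(instructions, width=4):
    """Greedily pack instructions, in program order, into bundles.

    Each instruction is a hypothetical (name, dest_register, source_registers)
    tuple. An instruction may join the current bundle only if the bundle has
    room and the instruction neither reads nor overwrites a register written
    earlier in that same bundle. Otherwise a new bundle is started.
    """
    bundles = []
    current = []      # names of instructions in the bundle being built
    written = set()   # registers written by instructions in `current`

    for name, dest, sources in instructions:
        conflicts = dest in written or any(src in written for src in sources)
        if conflicts or len(current) == width:
            bundles.append(current)
            current, written = [], set()
        current.append(name)
        written.add(dest)

    if current:
        bundles.append(current)
    return bundles


if __name__ == "__main__":
    # Four independent operations fit in a single bundle.
    independent = [
        ("a", "r1", ["r10"]),
        ("b", "r2", ["r11"]),
        ("c", "r3", ["r12"]),
        ("d", "r4", ["r13"]),
    ]
    print(pack_bundles(independent))

    # A dependency chain gets no parallelism at all: one instruction per bundle.
    chain = [
        ("a", "r1", ["r0"]),
        ("b", "r2", ["r1"]),
        ("c", "r3", ["r2"]),
    ]
    print(pack_bundles(chain))
```

The second case is the one that hurts: pointer-chasing C and C++ code looks much more like the chain than like the independent block, and a static scheduler cannot see past dependencies that are only resolved at runtime.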
Maybe someday the compilers will be really good, Pentium/AMD CPU clocks will hit the wall, and Itanic will reign supreme. Intel is one of the few companies with pockets deep enough to keep it alive and keep pouring billions into both the CPU and the compiler until it starts outshining x86_64 on anything other than vectorizable Fortran. I wouldn't necessarily count on that confluence of events happening in time to save it. I'd really like to see how much Intel has sunk into Itanic versus the ROI. It must be appalling. Only a company with a near monopoly elsewhere could survive it.
Me, I'll take an AMD 3400+. My whole computer cost $800 versus $2000 for just an Itanic CPU, it has 2 GB/sec memory bandwidth, runs IA32 apps really fast, is running Gentoo Linux so everything is taking advantage of all the new registers and instruction set improvements, and I have 64-bit addressing. It's sweet and sensible.
I'm not arguing that Itanic won't hold its niche in supercomputing. There aren't many people who are going to put one on a desktop or in a server.
@de_machina