Itanium Update

What a dog by nagora · 2001-09-03 06:31 · Score: 1, Flamebait

This thing is garbage. The power and the insane complexity of writing a decent compiler for its instruction set just makes me wonder what Intel were thinking. Not to mention the speed.

--
"Encyclopedia" is to "Wikipedia" what "Library" is to "Some people at a bus stop"

Re:What a dog by deranged+unix+nut · 2001-09-03 06:44 · Score: 3, Informative

I'm sure it is proprietary, but Intel has written its own optimizing compiler for the IA64 instruction set.

It is an interesting solution to the performance problem: Rather than just increase clock speed again, figure out the performance details at compile time and arrange the code to help the processor run it more efficiently.

For example, if you have an if statement and the compiler can determine that 95% of the time the TRUE block will be executed, the code can be arranged so the branch prediction will choose the more frequent route and the pipeline penalty won't need to be paid as frequently. (This is just a simple case of optimization, the IA64 will require insanely complex optimizations, but that is just expanding on what compiler writers have been doing for years.)
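To put rough numbers on the 95% case above, here is a minimal sketch of the expected cost of a statically predicted branch. The 10-cycle misprediction penalty is a made-up illustrative figure, not a measured value for any IA64 or x86 part, and the function name is hypothetical.

```python
def expected_branch_penalty(p_true, penalty_cycles, predict_true):
    """Average cycles lost per execution of one branch under a fixed (static) prediction.

    p_true         -- fraction of executions in which the TRUE block runs
    penalty_cycles -- assumed cost of one misprediction (illustrative, not measured)
    predict_true   -- True if the code is laid out to favor the TRUE block
    """
    p_mispredict = (1.0 - p_true) if predict_true else p_true
    return p_mispredict * penalty_cycles


# The 95% case from the post, with an assumed 10-cycle flush penalty.
favoring_true = expected_branch_penalty(0.95, 10, predict_true=True)
favoring_false = expected_branch_penalty(0.95, 10, predict_true=False)
print(f"favor TRUE block:  {favoring_true:.2f} cycles per branch")
print(f"favor FALSE block: {favoring_false:.2f} cycles per branch")
```

Under those assumptions the average cost drops from roughly 9.5 cycles per branch to roughly 0.5, which is the whole argument for arranging the code around the common path.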

It makes the compiler orders of magnitude more complex, but it could potentially increase execution speed by a couple orders of magnitude too.
Re:What a dog by vanguard · 2001-09-03 06:46 · Score: 1

Am I responding to flamebait?

Anyway, this thing is not garbage. I've wondered for a long time why chip designers couldn't do what Intel is calling "hyperthreading". It will soon become a reality. I'm excited about it.

--
That which does not kill me only makes me whinier
Re:What a dog by be-fan · 2001-09-03 06:54 · Score: 2

Intel has a lot of smart people working for it (smarter than either you or I). They have done some dumb things, but overall they've been on the mark a surprising number of times (even the P4 looks pretty impressive now that clock-speeds have ramped up). It would be a serious mistake to underestimate them like that. If Intel is putting this product on the market, you can bet that they've fixed the compiler problem. Initial benchmarks of the Itanium seem to show that it can keep right up there with the Alphas and SPARCs in terms of performance (fp, at least).

As for compilers, don't discount Intel so easily. They make incredible compilers. The features of ICL for x86 make compiler designers cream their pants. Read this article for some info about Itanium's compiler design.

--
A deep unwavering belief is a sure sign you're missing something...
Re:What a dog by nagora · 2001-09-03 08:32 · Score: 2

It is an interesting solution to the performance problem: Rather than come up with an efficient design, come up with a bad one that wastes huge amounts of energy (where do you think all that heat is coming from?) and try to make up for it by adding loads more complexity to the instruction decode.

but that is just expanding on what compiler writers have been doing for years.

What they've been doing wrong for years.

Simplicity is the correct answer; Intel clearly didn't understand the question.

TWW

--
"Encyclopedia" is to "Wikipedia" what "Library" is to "Some people at a bus stop"
Re:What a dog by jallen02 · 2001-09-03 09:30 · Score: 2

I don't care if you have the smartest people in the world, which is likely with Intel, if you mismanage and drive products the way Intel has it does not matter.

Sure you have a ton of smart people but I just have a lack of faith in the whole architecture. You can have the brightest bunch of people in the world but if you make them cook burgers does it matter? Not to discredit your post on other means, but saying because Intel has smart people is kinda silly. They also have stupid people by the same token, does that make them less likely to succeed?

Jeremy
Re:What a dog by be-fan · 2001-09-03 10:02 · Score: 2

My point was that you can't outright say "why is Intel doing such a dumb thing?" (Which the original poster did). Intel is not a stupid company (like MS). They have been very successful, not based only on their monopoly status (AMD has been making significant inroads into their turf) but because of the quality of their products. Within the constraints of the x86 architecture, Intel's chips have been incredible. Even now, the P4 looks really promising in its 2GHz+ variants. You just can't discount such a company so easily.

--
A deep unwavering belief is a sure sign you're missing something...
Re:What a dog by roca · 2001-09-03 14:33 · Score: 3

> It is an interesting solution to the performance
> problem: Rather than just increase clock speed
> again, figure out the performance details at
> compile time and arrange the code to help the
> processor run it more efficiently.

That is neither interesting nor a solution. People (i.e. compiler writers) have been working on this for forty years with some (limited) success.

> the IA64 will require insanely complex
> optimizations, but that is just expanding on
> what compiler writers have been doing for years.

Just because the IA64 demands heroic compiler optimization to make up for its shortcomings doesn't mean that the ability to write such compilers will suddenly spring out of nowhere.

Compiler researchers haven't just been sitting on their butts for the last forty years.

> For example, if you have an if statement and the
> compiler can determine that 95% of the time the
> TRUE block will be executed, the code can be
> arranged so the branch prediction will choose
> the more frequent route and the pipeline penalty
> won't need to be paid as frequently.

This was a bad example. Dynamic branch predictors (such as you find in any modern fast CPU) do a great job in practice, better than any known static predictors.
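A toy model of the difference, for anyone who wants to play with it: the sketch below compares a prediction fixed at compile time against a 2-bit saturating counter, one of the simplest textbook dynamic schemes. It is not the predictor of any particular CPU, and the branch trace is a hypothetical one chosen so that the branch changes behavior partway through the run.

```python
def static_mispredictions(outcomes, predict_taken=True):
    """Count misses for a prediction fixed at compile time."""
    return sum(1 for taken in outcomes if taken != predict_taken)


def two_bit_mispredictions(outcomes, state=3):
    """Count misses for a 2-bit saturating counter.

    States 0-1 predict not-taken, states 2-3 predict taken. The counter moves
    one step toward each actual outcome and saturates at 0 and 3.
    """
    misses = 0
    for taken in outcomes:
        if (state >= 2) != taken:
            misses += 1
        state = min(state + 1, 3) if taken else max(state - 1, 0)
    return misses


# Hypothetical branch whose behavior changes halfway through the run.
outcomes = [True] * 100 + [False] * 100
print("static predict-taken misses:", static_mispredictions(outcomes))
print("2-bit counter misses:       ", two_bit_mispredictions(outcomes))
```

On this trace the fixed prediction is wrong for the entire second half, while the counter needs only two misses to adapt. A trace that never changes behavior would show no such gap, so this only illustrates why run-time information can help, not how much it helps in practice.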
Re:What a dog by roca · 2001-09-03 14:37 · Score: 2

It's already a reality. What Intel calls hyperthreading is coming in the next generation Alpha, is already shipping in POWER-based AS/400 systems, and is also in some specialized network processors.
Re:What a dog by roca · 2001-09-03 14:52 · Score: 2

Groups of people often act much more stupidly than their constituent members. Intel has certainly made a few stupid moves over recent years:
-- IA64
-- Rambus
-- The home wireless network standard they pushed that got beaten by 802.11
Re:What a dog by roca · 2001-09-03 14:54 · Score: 2

> If Intel is putting this product on the market,
> you can bet that they've fixed the compiler
> problem.

Your faith is touching. Another possibility is that the Itanium project was way behind schedule and that they had to ship something, anything, because their competitors and the rest of the industry were laughing at them. And so they shipped a CPU with the worst SpecInt number in the industry and even warned their customers that this was really just a development chip and 'real' hardware would have to wait for the next generation.
Re:What a dog by nagora · 2001-09-03 22:03 · Score: 3, Insightful

Well it's fairly obvious that you are an expert on cpu design.

I've programmed about a dozen chips in both the games field and compiler-writing field, I don't design chips any more than Eddie Irvine designs racing cars. But I don't think I'll ever see him getting into a tractor for his qualifying lap.

Raw speed became less important for most applications, so intel added mmx to speed up multimedia.

What planet are you on? MS and Intel have conspired to make raw speed as important as possible for years. I personally have been offered payment by Intel to produce slower software as part of their "everybody must upgrade" roadmap. MMX came as a direct response to the increasing performance of 3D boards which reduced the need for a faster CPU. Intel fear anything which reduces the need to upgrade so they tried to fight back with MMX. That fear led to the only significant addition to the instruction set since the 386.

Once a few quality compilers are around this won't even be an issue.

You grossly underestimate the difficulty of this instruction set. I doubt there will ever be more than one (ie Intel's) good compiler and I doubt there will ever be even one which is reliable and predictable.

--
"Encyclopedia" is to "Wikipedia" what "Library" is to "Some people at a bus stop"
Re:What a dog by be-fan · 2001-09-04 08:40 · Score: 2

-- IA64 cannot officially be called "dumb" yet.
-- RDRAM was a mistake, but it wasn't just Intel. Nintendo bought into RDRAM, as did Sony and several graphics card makers. RDRAM fizzling was NOT something that could have been predicted. I kept up with the reporting back when RDRAM was still called nDRAM (as in unknown), and nobody expressed any objections.

--
A deep unwavering belief is a sure sign you're missing something...

328 registers??? by PollMastah · 2001-09-03 06:33 · Score: 1

Did I read that right? 328 registers?

If that's what I think it is... that's an AWESOME improvement over previous x86 incarnations :-) Just imagine the extent of freedom your C++ compiler will have with register allocation ... this will cut down memory accesses by at least an order of magnitude!

Of course, this all depends on whether these registers are general purpose. They'd better be, 'cos I can't imagine needing 300+ registers for special purposes while still giving you the klunky ole EAX, EBX, ... & co. registers.

--

Poll Mastah

Re:328 registers??? by glassware · 2001-09-03 06:43 · Score: 1

Actually, this is technically inaccurate. The IA-64 architecture (click on the link, it's the assembly language reference for the Itanium) has 128 integer registers and 128 floating point registers - on each side, 127 real ones, and R0, which is fixed to return 0.
What's not commonly known is that the P3 and P4 also have dozens if not hundreds of registers. The trick is register renaming: the P3 and P4 speculatively execute instructions as fast as they can, and they assign the results to temporary registers. If the processor needs these results, they reassign them back to the real registers like EAX, EBX, and so on.
So, overall, I'm not sure where the 328 number comes from. :P
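A rough sketch of the renaming trick described above, for illustration only. The `rename` function, the `p0`, `p1`, ... naming, and the three-instruction program are all hypothetical, and real hardware draws from a finite pool of physical registers and recycles them, which this toy version ignores.

```python
from itertools import count


def rename(instructions, arch_regs=("EAX", "EBX", "ECX", "EDX")):
    """Rewrite (dst, src1, src2) triples to use fresh physical registers.

    Every write gets a new physical register, so a later write to EAX no
    longer has to wait for earlier readers or writers of EAX to finish.
    """
    fresh = count()                                    # endless supply of physical register ids
    table = {r: f"p{next(fresh)}" for r in arch_regs}  # current architectural -> physical mapping
    renamed = []
    for dst, src1, src2 in instructions:
        s1, s2 = table[src1], table[src2]   # read sources through the current mapping
        table[dst] = f"p{next(fresh)}"      # allocate a new physical register for the result
        renamed.append((table[dst], s1, s2))
    return renamed


# Two computations that both reuse EAX as a scratch register.
program = [
    ("EAX", "EBX", "ECX"),   # EAX = EBX op ECX
    ("EDX", "EAX", "EAX"),   # EDX = EAX op EAX  (true dependency on the line above)
    ("EAX", "EBX", "EBX"),   # EAX = EBX op EBX  (reuses the name EAX only)
]
for before, after in zip(program, rename(program)):
    print(before, "->", after)
```

After renaming, the first and third instructions write different physical registers, so the third no longer conflicts with the second instruction's read of the old EAX. The program itself still only names EAX through EDX, which is why the extra registers do nothing for the compiler's register allocation.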
Re:328 registers??? by nester · 2001-09-03 06:51 · Score: 1

328 *physical* registers, not logical (ISA accessible). With 128, context switches will hurt ia64 big time. Yet another bad design decision of the Itanic.
Re:328 registers??? by David+Greene · 2001-09-03 06:55 · Score: 1

So, overall, I'm not sure where the 328 number comes from.

Probably 128 integer regs + 128 float regs + 64 branch/predicate regs (note: NOT general-purpose) + miscellaneous regs like the IP, etc.
While the P3/P4 have lots of registers, they aren't registers in the sense most people think about them. They solve the dynamic antidependency problem. The static data allocation problem is a separate beast. Those renaming registers aren't visible to the compiler so you'll still have the same number of memory operations in the program.
Same deal with SMT/Hyperthreading. More registers are needed, but they aren't the sorts of registers the compiler can use.
It's interesting that a write to R0 is defined to fault. Is this just for Itanium or is it an IA64 architectural decision? If so, it seems like a very poor one to me.

Re:328 registers??? by be-fan · 2001-09-03 06:56 · Score: 2

Yea, they are. As I remember it, 128 general purpose integer and 128 general purpose fp. Download the Intel C++ docs for some info about the ASM-viewpoint architecture of the Itanium.

My only question is for the OS guys out there. How does an OS handle context switches with 328 registers? With 8 bytes per register, that's more than 2K of data to dump out every context switch!
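The arithmetic behind that estimate, as a quick sketch. It takes the 328 figure and the 8-bytes-per-register assumption at face value; the registers are not all the same width (predicate registers are single bits, for example) and an OS need not save every one on every switch, so treat the result as a rough estimate only.

```python
REGISTER_COUNT = 328    # figure quoted in this thread
BYTES_PER_REGISTER = 8  # simplifying assumption: every register is 64 bits wide


def context_bytes(registers=REGISTER_COUNT, width=BYTES_PER_REGISTER):
    """Bytes to save if every register were written out at the given width."""
    return registers * width


total = context_bytes()
print(f"{total} bytes, about {total / 1024:.2f} KiB")
```

That comes to 2624 bytes, so "more than 2K" holds under these assumptions.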

--
A deep unwavering belief is a sure sign you're missing something...
Re:328 registers??? by VAXman · 2001-09-03 07:34 · Score: 3, Interesting

328 *physical* registers, not logical (ISA accessible). With 128, context switches will hurt ia64 big time. Yet another bad design decision of the Itanic.

A context switch happens once in a blue moon. Fast context switches are not going to make up for sluggish performance for the real work the machine is doing between context switches. Registers are considerably faster than cache; the absolutely fastest cache in the world is P4's L1 cache which has a load latency of 2 cycles, and on most architectures it is 3 cycles. Putting 128 qwords into registers is an absolutely dramatic speedup for programs which have a working set more than 8 dwords (all that IA-32 gives you).
Re:328 registers??? by dmt · 2001-09-03 07:42 · Score: 1

It's not nearly as expensive as you might think. For example, on a 733MHz Itanium, lmbench reports a 2 process context switch time of almost exactly 1 microsecond. For comparison, a 1.2GHz Athlon does this in about 0.88 microseconds. So yes, it's a little bit slower, but it's clearly right up there. Of course, by the time you get to 16 processes touching 64KB of data each, the Itanium does a context switch in about 60 microseconds and the Athlon is up in the 112 microsecond range.
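Since the two machines run at very different clocks, it can help to restate the 2-process figures in cycles. This is only a unit conversion of the numbers quoted above (treating "almost exactly 1 microsecond" as 1.0); the function name is made up for the example.

```python
def switch_cycles(time_us, clock_mhz):
    """Convert a context-switch time in microseconds to clock cycles.

    One MHz is one cycle per microsecond, so cycles = microseconds * MHz.
    """
    return time_us * clock_mhz


# (context switch time in microseconds, clock in MHz), figures as quoted in the post
measurements = {
    "Itanium 733MHz": (1.0, 733),
    "Athlon 1.2GHz": (0.88, 1200),
}
for name, (time_us, mhz) in measurements.items():
    print(f"{name}: about {switch_cycles(time_us, mhz):.0f} cycles")
```

By this crude measure the Itanium spends fewer cycles per switch (about 733 against about 1056) even though its wall-clock time is slightly longer.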
Re:328 registers??? by Nickoty · 2001-09-03 08:32 · Score: 1

It's interesting that a write to R0 is defined to fault. Is this just for Itanium or is it an IA64 architectural decision? If so, it seems like a very poor one to me.

Why poor, except perhaps that it seems a bit like wasted effort to check for accesses? Why would you want a /dev/null in asm?

--

-- Cure for Cancer instead of SETI! (only w32 yet - mail and beg)
Re:328 registers??? by naasking · 2001-09-03 11:40 · Score: 1

Well the x86 architecture is poor to begin with when dealing with user/supervisor transitions. The absolute minimum spent is something like 50 cycles on a PIII. Typical RISC architectures have transition times on the order of 4 cycles. It's really pathetic and the source of a lot of performance problems with microkernels. It's one of the main reasons why they haven't been widely adopted.

--
Higher Logics: where programming meets science.
Re:328 registers??? by murphj · 2001-09-03 12:50 · Score: 1
So, overall, I'm not sure where the 328 number comes from
From the help file you posted: 128 integer, 128 floating point, 64 predicate, 8 branch.
--
SONY. Because caucasians are just too damn tall.
Re:328 registers??? by dmt · 2001-09-03 13:43 · Score: 1

We are talking about a full context switch here, not just entering/exiting the kernel. In my experience, x86 has traditionally been among the leaders here, because it has so few (architected) registers.

If you have numbers to the contrary, please post them. Don't just make unsubstantiated claims.
Re:328 registers??? by David+Greene · 2001-09-04 01:45 · Score: 1

On the Alpha, a load into R0 acts as a memory prefetch. If the IA64 architecture defines this as faulting, separate prefetch instructions will be needed. Now, they might be needed anyway to get various levels of service/semantics, but a "basic" prefetch is often implemented this way.

Re:328 registers??? by renoX · 2001-09-04 07:14 · Score: 1

Well you know there are intermediate steps: usually other RISC architectures have either 32 or 64 registers..

And having 128 registers has other drawbacks than long context switches: there is a size versus speed trade-off..

These problems only get worse when you use SMT: a four-way SMT CPU needs at least four times as many registers..
This plus the in-order design: I don't believe that Intel will ever use SMT for the IA-64, more likely they'll go CMT..

Apart from the "mine is bigger than yours" factor I'm wondering what kind of speed-up you get when you go from 64 to 128 registers..
It is probably very limited apart from very special problems..
Re:328 registers??? by naasking · 2001-09-04 12:06 · Score: 1

Substantiated claims? Ever heard of Jochen Liedtke and the L4 microkernel? He and many other researchers wrote the following papers if you'd care to read them:

The performance of microkernel based systems

Achieved IPC performance

Microkernels must and can be small

On Microkernel construction

Improved Address-Space Switching on Pentium Processors by Transparently Multiplexing User Address Spaces

If you'd like a breakdown, here it is: x86 sucks for context switch times for two main reasons, a) user/supervisor transition times are an order of magnitude slower than other architectures and b) a poorly designed TLB cache results in a flush whenever a context switch occurs (only if the address space changes of course). Any advantage the x86 architecture gains by having few registers is lost (and then some) by these other factors. This can be readily seen in operating systems and kernels that rely heavily on context switching such as microkernels.

For quick and dirty evidence: here. That's a link showing context switching times for Linux running on an 850 MHz PIII. The times typically hover around 12 microseconds. The papers I linked to above show achieved IPC times for L4 which are steady around 3 microseconds on much lower-end hardware. That's IPC, not just context switching, i.e. context switch and copying data to another address space. The L4 teams have tweaked their implementation as fast as it can go on x86, and have achieved performance an order of magnitude higher than Linux (at least in this area). This demonstrates some pretty solid expertise. Given this, they admit that x86 is very poor in this respect and a great hindrance in designing a good operating system. I believe some of the papers briefly discuss other architectures, but most of the focus is on the x86 platform because it's such a performance problem.

P.S. the fact that the x86 is register poor is not a good thing. Having too few registers to manipulate data is often a hindrance. Saving 3 times as many registers doesn't take that long anyhow.

--
Higher Logics: where programming meets science.

When will we see some improvements from the Alpha? by bconway · 2001-09-03 06:34 · Score: 2

As someone with a few friends that recently made the move from Compaq's Alpha division over to Intel, what I'm most curious about is what revision of the chip will we see any improvements being incorporated from the Alpha design. I can't imagine Intel would want to let out any news on work that they bought instead of engineering themselves, but I think it'd be interesting to hear what exactly was directly ported over in the designs, if anything, as well as a detailed comparison of the two processors. Any info, anyone? Perhaps the second big revision of the IA64 chips?

--
Interested in open source engine management for your Subaru?

Re: "HyperThreading" in IA-64 by 2002 by Bodero · 2001-09-03 06:34 · Score: 5, Informative

Hyperthreading, as implemented, exists in the Pentium 4 line.

Right. And there's no indication that something similar will appear in IA64 until at least 2006 (which is the *earliest* that the Alpha team could likely add it to that complex - or if you prefer messy - an architecture if the hooks for it weren't already built in).

It's a weak second to SMT. With HT, as I understood it, if a processor happens to have a floating point op and an integer op on hand at the same time, it can run both of 'em at once, instead of sequentially. That's the limit to the HT magic. It can't do two FP or integer ops at once.

Well, real-world server applications could be sped up by 30%, which would mean that HT could execute multiple *non*-FP instructions at once (and the article doesn't say it can't, just that it can't execute two FP ops at once).

It actually seems to look quite a bit like EV8's SMT, except that we don't know if it currently adds more execution units to the P4 architecture and whether all execution units can be applied to service a single thread if multiple threads aren't present. And, of course, it only supports two concurrent threads rather than four.

Intel stole and then implemented Alpha technologies for its Pentium, and only much later did it negotiate with Digital to get the official right to use that stuff.

No: I'm assessing the situation, unlike your propensity for drawing conclusions based on vague speculation and no data.

IA64 has to all appearances been developed with zero attention paid to things like out-of-order execution (in fact, it was developed explicitly to *avoid* out-of-order execution). OOO and SMT are intimately intertwined in EV8's SMT design, and apparently also in HT's. There's no indication that Intel has until now given any thought toward incorporating SMT/HT technology in EPIC, and every indication that it will thus take at least close to 5 years before such IA64 technology hits the street (especially as incorporating it into EPIC will almost certainly involve radically different internal approaches than those used to incorporate it into EV8 and P4).

Pentium 4 Multithreading? by glassware · 2001-09-03 06:36 · Score: 2, Insightful

Did anyone notice that in the middle of the article it says that the Pentium 4 chip has hardware multithreading, yet it was disabled "until the company comes out with its first Xeon processor with multithreading."

Shades of the whole 486SX debacle?

Re:Pentium 4 Multithreading? by be-fan · 2001-09-03 07:00 · Score: 2

No, it's just that the technology probably isn't mature enough to release. It's different from the 486SX, which is closer to the whole Celeron thing.

--
A deep unwavering belief is a sure sign you're missing something...
Re:Pentium 4 Multithreading? by VAXman · 2001-09-03 07:43 · Score: 2

Not really. On the 486 the FP unit was a discrete part of the chip, which could potentially have a defect, and thus be disabled. It was done only to increase yield. MT on P4 is spread completely through the chip, and it is unlikely that a defect would prevent MT from working but let the chip run in single-thread mode (since almost all parts of the chip are shared in MT). The reason MT is not enabled is not a manufacturing issue (like on 486SX) but mostly for paranoia about pioneering a totally new feature.
Re:Pentium 4 Multithreading? by RedAlert99 · 2001-09-03 21:04 · Score: 1

It was not done ONLY to increase yield. It was also done to differentiate products and discriminate (in the economic sense) between buyers. Those who could afford it bought the DX, others bought the SX, but it would've been too expensive to actually run completely separate fab lines, so they just disabled it on some, made less profit on those, but sold units they otherwise wouldn't have sold.

--
Cats know what you're thinking. They don't care, but they know.

Re:When will we see some improvements from the Alp by Dr.+Spork · 2001-09-03 06:36 · Score: 3, Funny

The way I understand it, Intel bought Alpha not to praise it, but to bury it.

Well, This isn't it. by BiggestPOS · 2001-09-03 06:37 · Score: 1

I'm running a Coppermine 850 right now, and at the time it was a sensible upgrade from my Katmai 450, just as the 450 was a good upgrade from the PII 233 before it. But right now I'm scratching my head. My next CPU upgrade will most likely require a new Motherboard as well, so I now have the freedom to go to a completely NEW architecture. (The ABIT BX6 will probably go in our media box) But I don't really want a P4, the Itanium definitely isn't for what I do, and I have never really been an AMD fan, I just don't know their stuff. So where is Intel's next chip for ME?

--
What, me worry?

On-board OS by omega9 · 2001-09-03 06:39 · Score: 1

This isn't so much about the CPU itself, but the chipset it fits to:

The BIOS on all Itanium chipsets (AFAIK) is set up to have a small kernel onboard, i.e., you can boot the system with limited functionality even if there's no floppy, HDD, or other boot medium present. If you do have a filesystem present, the "BIOS-boot" will even give you access to it.

Not the biggest feature on the block but helpful nonetheless.

--
I'm against picketing, but I don't know how to show it.

Re:On-board OS by driehuis · 2001-09-03 12:28 · Score: 2

Alpha and Sparc architecture had this (almost) from day 1.

True, but probably unimportant. OpenFirmware has been around (I think even as an IEEE standard) for ages, but apparently the PC world doesn't care.

If I sound frustrated, it's because I am. OpenFirmware is such a small bone to throw the techies that I think it's criminal it never came about in the PC world. After years of haggling the BIOS vendors, we now have BIOSes on some machines that can optionally emulate an ANSI terminal for console access on a serial port. Which means you can use Kermit under DOS to manage the machine remotely. Even "tip" on UNIX is suboptimal here, and the best this BIOS will do is emulate the full-screen config menu. Beeeuuurk.

--
Bert Driehuis -- All I asked was a friggin' rotatin' chair. Throw me a bone here, people.
Re:On-board OS by AlgUSF · 2001-09-05 04:51 · Score: 1

All decent server architectures have something like this. I know Sun boxes and Digital boxes do. Digital (I still refuse to call them Compaq) boxes have a kinda VMSesque command line; I guess they want to stay with their VMS roots. :-)

--

I want my rights back. I was actually using them when our government stole them after 9/11.

Any support for virtualization? by iconian · 2001-09-03 06:40 · Score: 1

Does anybody know if Itanium or that 64-bit AMD processor will have better support for CPU virtualization?

Re:Any support for virtualization? by Graymalkin · 2001-09-03 18:37 · Score: 1

Considering the chip is marketed towards high end servers and workstations it will probably never see Windows 95 installed on or around it. There is no need for Windows 95 on Itanium based systems. Check out SGI's 750 or the HP i2000: SGI only offers Linux as of yet on theirs, while HP offers HP-UX, Linux, and Windows XP on theirs. DOS is nowhere to be seen on these boxes and will not be seen on them.

--
I'm a loner Dottie, a Rebel.

*idiotic consumer point of view* by supabeast! · 2001-09-03 06:43 · Score: 2

I will now translate the thoughts of an average American regarding this article:

"Wow, a 64-bit processor with 6MB? I can finally have a computer more powerful than my N64! I hope it doesn't let little Billy access all of that satanic-internet-porn any faster, though...."

A Pentium-Equivalent Rating for these? by Dr.+Spork · 2001-09-03 06:45 · Score: 1

I would laugh if Intel eventually decided to sell these impressive-looking chips for desktop systems and had to do a big campaign about how clock speed is not terribly relevant to how the chip performs, in hopes of silencing Athlon owners saying "Ha ha, a whole Gigahertz!? How much did you pay?"

How will they market that? by Ghoser777 · 2001-09-03 06:45 · Score: 3, Interesting

So when most people go out and buy a computer, they see a lot of MHz and think it's really fast. So if they're used to 2GHz+ Pentiums, why would they even think of buying a 1GHz Itanium? Sure, I know it'll probably be faster, but how does Intel plan to market these? Will they also drop MHz ratings like AMD? Or will they go on some major re-education campaign, like Apple?

F-bacher

--
James Tiberius Kirk: "Spock, the women on your planet are logical. No other planet in the galaxy can make that claim."

Re:How will they market that? by HerrNewton · 2001-09-03 06:53 · Score: 3, Funny

And Intel will, of course, be working upstream against its own past marketing push which is largely responsible for the MHz Myth. Nice.

--

----
Am I the only one who thinks Microsoft is a misnomer? Perhaps Macrosoft would be a better fit?
Re:How will they market that? by Glonk · 2001-09-03 07:16 · Score: 1

Somehow I think the McKinley is aimed at a slightly different market than the consumers.

The people who buy McKinleys know about it first; you'd have to for something that expensive.
Re:How will they market that? by VAXman · 2001-09-03 07:48 · Score: 2

Remember that Itanium is marketed solely towards IT people, who know that gigahertz does not equal performance, and who do real performance studies before deploying a system. Look at the success of HP and Compaq whose chips are reasonably fast, yet have slow megahertz ratings (or Sun and IBM, whose chips are slower, and have low megahertz ratings, but sell very well).

It is only the consumer market which looks at gigahertz. Which means that Intel will have to make a high megahertz version if it expects Itanium to enter the consumer market.

Diminishing clock speeds by ral · 2001-09-03 06:46 · Score: 3, Insightful

...the Itanium product line will see its speed increase from 800 to 1 GHz, which is half the frequency of the company's fastest 2-GHz Pentium 4....Intel contends, however, that the faster front-side bus, more on-chip memory and redundant logic resources will more than make up for the processor's lag in clock speed.

We can only hope that this chip helps the media away from using clock speed as the primary (often only) measure of performance.

Re:Diminishing clock speeds by Skuld-Chan · 2001-09-03 07:13 · Score: 1

But clock speeds are often an indicator of performance - anyone who has played Rocky's Boots and knows anything about clocking transistors knows that.
Re:Diminishing clock speeds by WNight · 2001-09-03 09:03 · Score: 2

Is there a modern version of this? The last I saw it was on the Apple ][ in the early 80s.

I wouldn't mind checking it out again if you can point me to a copy.
Re:Diminishing clock speeds by Jordy · 2001-09-03 09:56 · Score: 2

Intel will most likely not face the same problems AMD has with regards to marketing of their CPU due to the target market.

When you buy a machine with $2000-$5000 CPUs, you tend to do real research on the performance of the system you are buying.

--
The world is neither black nor white nor good nor evil, only many shades of CowboyNeal.
Re:Diminishing clock speeds by roca · 2001-09-03 15:04 · Score: 2

> When you buy a machine with $2000-$5000 CPUs, you
> tend to do real research on the performance of
> the system you are buying.

Which makes you wonder who would possibly buy an Itanium (especially for non-FPU-intensive servers where Intel's pushing it).
Re:Diminishing clock speeds by Skuld-Chan · 2001-09-03 19:32 · Score: 1

There was a version for the C64 - which I played with.

E-mail me sometime; I can send you the disk file and point you to a good emulator.

Re:Yeah but... by Bjarke+Roune · 2001-09-03 06:46 · Score: 1

for ((...) i++) is just WRONG. The proper way to do it is for ((...) ++i). The other way is both less efficient, and it makes it seem as if you need the original value of i for some special processing, while in fact you don't.

--
Bjarke Roune

Chick magnet by doorbot.com · 2001-09-03 06:47 · Score: 3, Funny

for me, it means single instruction xor for the 64 bit hash codes used in chess transposition tables

Watch where you say that, or you'll be using that nifty Itanium to repel the hordes of women instinctively flocking to you like the salmon of Capistrano.

Re:Chick magnet by quietlysubversive · 2001-09-03 08:01 · Score: 1

lots of women like smart guys

--
----(o)----
Re:Chick magnet by madcow_ucsb · 2001-09-03 08:09 · Score: 1

Apparently you haven't seen "Dumb and Dumber".

"I'm talking about a place where the beer flows like wine, where the women instinctively flock like the salmon of Capistrano. I'm talking about Aspen."

IA64 is the "heir apparent" by dpilot · 2001-09-03 06:50 · Score: 5, Insightful

Is anyone else as completely stunned as I am that essentially everyone (except AMD) has rolled over and allowed the IA64 to be crowned heir apparent as the new high-end microprocessor? The Alpha is dead by acquisition, HPPA is dead by partnership, MIPS is lost somewhere in the low end, and Sparc and Power4 are both retreating upstream.

It's amazing that ANYONE can field the number of mistakes that Intel has, and get away with it. For some time now, their first-outs have been essentially flops:

Pentium: Remember the 5V room heaters?

Pentium: Then the 3.3V units with floating point bugs?

Pentium Pro: The ancestor of the Pentium II/III line was a good CPU in its own right, and worked well for Unix and OS/2. But it completely missed the market, performing terribly on 16 bit code.

Celeron: DeCeleron, until they put the cache back on. From another point of view, the whole Celeron program has been a disaster, either by its own crippling, or by revealing how overpriced the PII/PIII line is.

Pentium III: CPUID - A 'workstation idea' that once again missed its market. Maybe if they'd found a way to node-lock software that can't be used for machine tracing. Maybe that's not what they were after.

Pentium 4: Let's face it, this CPU is just plain uneven and imbalanced. After a round of redesign to even it out, just like with the others, it could very well be an excellent CPU. Tame the prefetch, expand the trace cache, etc.

Itanium: Didn't even make it out the door before spin-doctoring began. "Just wait for McKinley!" I've already heard one set of rumors that McKinley isn't going to *really* do it either, so just wait for IA64-III.

Is all this any better than the "Just wait for this new release!" that Microsoft keeps pulling? Though I guess Intel does generally get each family right on the second shot.

AMD has a good product, I just wish they were a little less mum, and had a better response than warmed-over P-numbers. I also wish we could hear a bit more noise about the Hammers.

--
The living have better things to do than to continue hating the dead.

Re:IA64 is the "heir apparent" by mz001b · 2001-09-03 06:57 · Score: 1

MIPS is lost somewhere in the low end
Umm... have you ever used a MIPS chip? The R10k and R12k are beautiful processors and very fast. Don't let the low MHz rating fool you. The SGI compilers are also very good -- they do a lot of optimization and the profiling tools are some of the best around. There are lots of hardware counters on the R10k (32 I believe) that make it easy to find out where in your code all your FLOPS are, the secondary cache misses, branch mispredictions, ...
I wish SGI/MIPS would continue along with these chips. They are a wonderful platform to develop on.
Re:IA64 is the "heir apparent" by Jah-Wren+Ryel · 2001-09-03 07:02 · Score: 2

Sounds like you haven't heard about a little chip called the Power4
Plus Sun sure hasn't rolled over either, Sparc performance has always been subpar, but they make up for it with a good OS (Solaris) and tons of applications.

--
When information is power, privacy is freedom.
Re:IA64 is the "heir apparent" by DarkEdgeX · 2001-09-03 07:02 · Score: 3, Informative

Pentium III: CPUID - A 'workstation idea' that once again missed its market. Maybe if they'd found a way to node-lock software that can't be used for machine tracing. Maybe that's not what they were after.

I think you're confusing CPUID with the Processor Serial Number (PSN). PSN, IMHO, was a good idea, but the privacy zealots cried foul and ruined an otherwise good way to lock software to a specific individual's CPU. (YES, I know there are work-arounds that pirates can use, from simply hex-editing the instructions that check for the PSN to writing drivers that return false info.) I really wish Intel hadn't backed down on PSN and had included it in the P4 (after all, for those naysayers that don't want PSN, or their identity, revealed to websites or software, you can disable it in the BIOS).
Oh well. Thought I'd clear that up. CPUID is GOOD. PSN is BAD (to the privacy folk, anyways).

--
All I know about Bush is I had a good job when Clinton was president.
Re:IA64 is the "heir apparent" by Anonymous Coward · 2001-09-03 07:11 · Score: 1, Insightful

You didn't mention the i860 and the iAPX 432, the processors Intel wants you to forget about, and it seems that they succeeded.
Re:IA64 is the "heir apparent" by kilrogg · 2001-09-03 07:14 · Score: 2, Insightful

Hey, how could you forget Rambus!
Re:IA64 is the "heir apparent" by warpSpeed · 2001-09-03 07:54 · Score: 1

Where can I get $50-100 motherboards for these chips? Until I can, they are not going to enjoy the market dominance of Intel.

Intel will just continue the legacy until you can get commodity MoBos from other vendors for different chips. I would be all over a Power4 system _IF_ I could get it as cheaply as an Intel/AMD based system.

~Sean
Re:IA64 is the "heir apparent" by sheldon · 2001-09-03 08:03 · Score: 2

While you have certainly kept track of Intel, I believe you've completely ignored the rest of the world's history.

That is, all these companies have had their share of problems.

When the Alpha was first released it ran *HOT*. I had one of the early DEC3000/300 on my desk. DEC had other problems with the Alpha. The CPU itself was denied its future because of the poor-quality boxes it was put into.

I'm not quite as familiar with Sparc or PowerPC, but we shouldn't forget that Sun was having difficulties with the Sparc found in the E10000 not too long ago. To the companies who had paid millions for these boxes, it was a bigger deal than the Pentium floating point problem.

AMD has had their share of flops. The early 386 and 486 designs were good, but should we all forget the K5 and the early K6?

I had a Cyrix 486DX/50 clone back in '94, and it wouldn't work with a variety of software under Linux such as ghostscript. Cyrix replaced it, reluctantly... I had to argue with them on the phone despite Infoworld articles reporting the problem.

I don't see Intel as having a significantly worse track record than others. Their product is certainly used in higher numbers, and thus the failures are higher profile.
Re:IA64 is the "heir apparent" by Glock27 · 2001-09-03 08:06 · Score: 1

First off, I agree with your comments regarding Intel's missteps lately.
AMD has a good product, I just wish they were a little less mum, and had a better response than warmed-over P-numbers. I also wish we could hear a bit more noise about the Hammers.
I'm hoping that AMD will really trumpet the Hammer message once they have actual silicon. It should be a great alternative to Itanium, and will be positioned to immediately make inroads in the desktop/workstation market as well. I hope AMD will do a good job (as Intel has with P4 and Itanium) with helping the compiler writers get the most out of the chip. The Linux based Hammer simulation tools are at least a good start.
I know AMD released less than stellar numbers regarding current sales (although it may have been mainly flash related), but I'd have to think Athlon would be doing very, very well in the current economic climate...
At any rate, if AMD can execute well on Hammer, it should quickly make major inroads at the high end, IMO. AMD's reputation has grown tremendously with the success of Athlon.
186,282 mi/s...not just a good idea, its the law!

--
Galileo: "The Earth revolves around the Sun!"
Score: -1 100% Flamebait
Re:IA64 is the "heir apparent" by Jah-Wren+Ryel · 2001-09-03 08:14 · Score: 1

You gotta compare apples to apples - You won't be seeing IA64 mobos for even $500 for at least a couple of years, if ever (a lot of things can change in a couple of years). When the CPU is at least 3 grand, you can bet the infrastructure is going to be similarly expensive.

--
When information is power, privacy is freedom.
Re:IA64 is the "heir apparent" by dpilot · 2001-09-03 08:35 · Score: 2

I mentioned both - they're retreating upstream into the high-end server market.

--
The living have better things to do than to continue hating the dead.
Re:IA64 is the "heir apparent" by DarkEdgeX · 2001-09-03 09:46 · Score: 1

If you upgrade your system enough that it involves upgrading your CPU, then you should discuss it with the individual vendor. I really don't see your point, since in fact my idea is a more lenient system than Microsoft's Product Activation crap in Windows XP (change enough items and you're screwed, even if the CPU stays the same).
PSN works better for this, IMHO.

--
All I know about Bush is I had a good job when Clinton was president.
Re:IA64 is the "heir apparent" by Jah-Wren+Ryel · 2001-09-03 10:20 · Score: 2

You are right that you did mention it, but I disagree with the "retreating upstream" part. Sparc is in some very cheap boxes and IA64 is just as upstream, if not more so, than any recent PA-RISC, Alpha or Power cpu.

--
When information is power, privacy is freedom.
Re:IA64 is the "heir apparent" by dpilot · 2001-09-03 10:40 · Score: 2

True, but Intel has a mindshare that I suspect would let it move IA64 down into consumer-level in the next 5 years, if it wants to and if P4 has run out of steam, or if there is some AMD move that needs countering. I doubt either Power4, Sparc, or PA-RISC could play there.

--
The living have better things to do than to continue hating the dead.
Re:IA64 is the "heir apparent" by dpilot · 2001-09-03 10:46 · Score: 2

Didn't. But that was mostly a chipset, motherboard, and support chip issue. I presume you mean the last-minute delay of the 820 launch.

Another aspect of Rambus is the untamed prefetch on P4. It's so aggressive that only Rambus can provide enough bandwidth to keep it running, at least until dual-channel DDR. According to the reviews, most of that bandwidth is wasted, but it's needed to keep the processor fully fed.

--
The living have better things to do than to continue hating the dead.
Re:IA64 is the "heir apparent" by dpilot · 2001-09-03 10:53 · Score: 2

The difference is in what happens after failure. Intel has had a string of failures, and manages to come back, be forgiven, and continue to dominate the market. The others aren't so lucky. Even with as strong a product as the K7, AMD just hasn't cracked the higher profit markets. Intel would like us to believe that the heyday of the K7 is fading, and if they do a good respin on the P4, they'll be right.

--
The living have better things to do than to continue hating the dead.
Re:IA64 is the "heir apparent" by kilrogg · 2001-09-03 11:28 · Score: 1

I presume you mean the last-minute delay of the 820 launch.
Nope, I was referring to the decision to go with Rambus for the P4. Rambus is generally regarded as a bunch of crooks who tried to screw over fellow Jedec members by secretly patenting their ideas, then tried to extort money from them (the courts agree).
The fact that Intel associated themselves with this trash is a major blunder, IMHO. (Never mind the fact that DDR is arguably superior, lower cost, and easier to implement.)
Re:IA64 is the "heir apparent" by listen · 2001-09-03 11:48 · Score: 1

Sorry, you aren't making sense.
Microsoft plan a stupid thing that tramples on consumers' rights, so you support something very nearly as stupid? Huh? In case you haven't noticed, two things can both be bad without being the same. There isn't always a good guy in every comparison.

Nodelocking is broken technology - it relies on ignorance, and its only purpose is to undermine the doctrine of first sale. It does nothing to stop piracy. No nodelocking technology is worth the bits it is encoded in.
Re:IA64 is the "heir apparent" by NovaX · 2001-09-03 11:53 · Score: 1

Close. The K5 was an AMD design, which absolutely stunk. The K6 design was originally based on the K5 architecture, but the NexGen core replaced it. IMHO, AMD at that time was worse than Cyrix, and buying NexGen saved the company from another poorly designed chip, and slowly built them up. Cyrix was not so lucky, and thus continued on designing slow chips and trying to compensate with their integrated solutions (which flopped).

A quick architecture page to back me up on the K5 not being NexGen: here

--

"Open Source?" - Press any key to continue
Re:IA64 is the "heir apparent" by styopa · 2001-09-03 13:01 · Score: 3, Interesting

Someone needs to defend the SPARC chipset, and what [I see] Sun Microsystems is doing, so here I am.

Sure, the single-processor, or even up-to-8-processor, results are not the greatest thing out there. In the single- through four-processor units Intel beats them, and higher up the Power series takes over. What one tends to forget is that, for a processor that is designed for SMP, A) scaling linearly to 1024 processors is damn good, and B) it is relatively cheap for a server-class processor. Also the SPARC line is known to have the least number of hardware bugs of any major processor out there.

Sun really doesn't need a sports car of a chip anyway. Servers and workstations need uptime. They don't need to attack the user market yet. First they seem to be more actively attacking the workstation market with the sub-$1000 SunBlades. With a Sun solution the workstation only needs to be moderately fast, but the server needs to be DAMN fast because the most intensive processes run on the server and display over the network. Small steps.

--
Disclamer - Opinion of Person
Re:IA64 is the "heir apparent" by drsoran · 2001-09-03 13:12 · Score: 1

Sun Blade 100, Netra X1, etc. The motherboards in those are probably about $100, but Sun isn't in the business of selling separate components any more than Intel is in the business of selling workstations and complete PCs.
Re:IA64 is the "heir apparent" by Chagrin · 2001-09-03 13:31 · Score: 1

The Sparc chips may be many things, but cheap is definitely not one of them.

--
I/O Error G-17: Aborting Installation
Re:IA64 is the "heir apparent" by DarkEdgeX · 2001-09-03 14:14 · Score: 1

I don't see how it's broken, considering it works. If you write a special piece of software (say, something proprietary) and want a way to lock it so it only executes on certain workstations, PSN is the way to do it that doesn't heavily trample on the user's rights. The only upgrade they could make that would break the software is if they switched to a new CPU-- change the motherboard, change the videocard, change anything else but the CPU, and the software keeps working.

As I said, it does have weaknesses, but for general users this weakness isn't easily exploitable. As for Microsoft, shrug, it all depends on your point of view.. If *I* wrote a piece of software that did something that big business might find useful and wanted to sell it to corporations and STILL make sure it wasn't run on more than one workstation, *I'D* implement PSN in my code to stop them from using it on different systems than sold for.

But clearly your logic diverges from mine on this singular point, so you'll never agree.

--
All I know about Bush is I had a good job when Clinton was president.
Re:IA64 is the "heir apparent" by roca · 2001-09-03 15:11 · Score: 2

> I hope AMD will do a good job (as Intel has with
> P4 and Itanium) with helping the compiler writers
> get the most out of the chip.

It's just warmed-over x86 (and I mean that in a good way). Should be dead simple to modify an x86 compiler to target x86-64 ... and I read it's been done for gcc.
Re:IA64 is the "heir apparent" by njdj · 2001-09-03 18:35 · Score: 1

It's amazing that ANYONE can field the number of mistakes that Intel has, and get away with it

I don't think they'll get away with this 130-watt room-heater. Remember that 130 watts is just the processor. Everybody thought that the first Alphas had a power problem - and they were only 30 watts! Forget this thing for the desktop, it's going to need cooling fans that will make so much noise you won't be able to think in the same room.
Re:IA64 is the "heir apparent" by DarkEdgeX · 2001-09-04 03:38 · Score: 2

Realizing now that I never explained what CPUID really was: CPUID is an instruction introduced in the original Pentium (and some late-model 486s, though undocumented and unsupported) that returns a plethora of useful info on what kind of processor is being used, as well as what features it has. AMD and Intel share a lot of the same data layout, but diverge in places. In Intel's incarnation, a set of bit flags is returned that exposes the status of such features as an on-board FPU, MMX, SSE and SSE2, as well as some individual instructions such as CMPXCHG8B. Both AMD and Intel reveal Family, Model and Stepping information, as well as an ASCII string identifying the vendor (in Intel CPUs, "GenuineIntel"). Even newer processors tell you their name and speed in an ASCIIZ string.

Quite useful, and pretty much does away with arcane checks to see what processor the code is really being run on (like the various methods of checking to see if you're running on an 8086 or 286, or 386 vs. 486, for example). =) Unfortunately, if you want to run on these golden oldies, you still have to do those arcane checks, but once you establish that you're working with a Pentium or higher processor, you simply do a CPUID and you're done.
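
To make the layout concrete, here is a minimal sketch of decoding what CPUID hands back. It does not execute the instruction itself (with GCC or Clang on x86 you could fetch the registers via __get_cpuid from <cpuid.h>); it only interprets register values passed in. The function and struct names (vendor_string, decode_features, Features) are made up for illustration. The bit positions are the ones Intel documents for leaf 1 EDX: bit 0 FPU, bit 8 CMPXCHG8B, bit 23 MMX, bit 25 SSE, bit 26 SSE2.

```cpp
#include <cstdint>
#include <initializer_list>
#include <string>

// Assemble the 12-character vendor string from the registers that
// CPUID leaf 0 returns. The bytes come in the order EBX, EDX, ECX,
// least significant byte first.
std::string vendor_string(std::uint32_t ebx, std::uint32_t edx, std::uint32_t ecx) {
    std::string out;
    for (std::uint32_t reg : {ebx, edx, ecx}) {
        for (int shift = 0; shift < 32; shift += 8) {
            out.push_back(static_cast<char>((reg >> shift) & 0xFF));
        }
    }
    return out;
}

// A handful of the feature flags reported in EDX by CPUID leaf 1.
// (Hypothetical struct, for illustration only.)
struct Features {
    bool fpu;        // bit 0: on-chip FPU
    bool cmpxchg8b;  // bit 8: CMPXCHG8B instruction
    bool mmx;        // bit 23
    bool sse;        // bit 25
    bool sse2;       // bit 26
};

Features decode_features(std::uint32_t edx) {
    auto bit = [edx](int n) { return ((edx >> n) & 1u) != 0; };
    return Features{bit(0), bit(8), bit(23), bit(25), bit(26)};
}
```

Feeding in 0x756e6547, 0x49656e69, 0x6c65746e (EBX, EDX, ECX) spells out the Intel string; an AMD part returns different bytes in the same three registers.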

--
All I know about Bush is I had a good job when Clinton was president.
Re:IA64 is the "heir apparent" by DarkEdgeX · 2001-09-04 08:51 · Score: 1

Back in the day, before AMD made it big, Intel basically used this as a slap in the face to imitators since it'd be illegal to reproduce this text string in an AMD processor (or any other maker's, Cyrix being another contender at the time). Of course this does let you verify that you have an Intel CPU in your system without tearing it apart (for those who bought them pre-assembled off the shelf, for example).

--
All I know about Bush is I had a good job when Clinton was president.
Re:IA64 is the "heir apparent" by dpilot · 2001-09-05 12:04 · Score: 2

Glad to know that AMD is responsible for the bugs in Via chipsets for the Athlon. Is Intel responsible for the bugs in Via chipsets for the Pentia? Or is AMD responsible for those bugs, too?

--
The living have better things to do than to continue hating the dead.

power by lavaforge · 2001-09-03 06:55 · Score: 2

The article states that the Itanium pulls 130 watts of power. That seems rather high, even for the space heaters that we like to call cpu's nowadays. Is the Itanium using the new all-copper .13 micron process, or an older technology?

Re:power by itarget · 2001-09-03 09:01 · Score: 1

0.18 micron with copper interconnects, but the internal traces are aluminum. It's going to be, in a word, a furnace.

--

"Where shall the word be found, where will the word resound? Not here, there is not enough silence." -T.S. Eliot
Re:Power by beardcz · 2001-09-03 19:40 · Score: 1

You mean one for each CPU, of course...

--
No sig for me - too lazy to fill one in...

So, McKinley isn't a properly designed system? by deranged+unix+nut · 2001-09-03 06:55 · Score: 5, Insightful

This is a rather odd quote from the article:
(bolding is my emphasis)

To protect against heat-related system meltdowns, McKinley includes a programmable thermal trip that can throttle processor performance by 40 percent to cut power consumption. But the company sees that more as a safety net, not as an answer to thermal issues. "This should never be needed in a properly designed system," said Naffziger.

Re:So, McKinley isn't a properly designed system? by Saidin · 2001-09-03 07:44 · Score: 2, Informative

McKinley is the CPU; the system is the box that the CPU gets plugged into. So, if the box (system) is properly designed, the CPU never needs to throttle itself.
Re:So, McKinley isn't a properly designed system? by tswinzig · 2001-09-03 12:05 · Score: 2

You do realize McKinley is a PROCESSOR, not a SYSTEM, right?

--

"And like that ... he's gone."

Re:Compiler by anonymous+loser · 2001-09-03 07:04 · Score: 5, Informative

Apparently you're not familiar with VLIW processor design. It's not "throwing it off" to the software guys because it's too difficult to implement. It is dramatically reducing the complexity of the pipeline, thereby increasing throughput by orders of magnitude (see CISC vs. RISC).

And the compiler has far *more* information than the runtime hardware has. The scheduling hardware is only capable of looking at a few instructions at a time to decide how to enhance ILP, whereas the compiler by its very nature has access to the entire program at once, and can perform optimizations not possible in hardware.

This is further enhanced by a development cycle that includes profiling. As you use the program during development, the compiler can use the same profiling information that is used to "manually" optimize code to perform its own optimizations. With an advanced OS, this becomes extremely powerful, as some of the registers on the processor actually keep track of profile data at runtime. Then, during page swaps to/from virtual memory, the processor has the opportunity to dynamically optimize and recompile the code.

Re:Yeah but... by Anonymous Coward · 2001-09-03 07:04 · Score: 1, Interesting

Funny how identical code

; 6 : for(i=0;i<10;i++) j+=1;

  00006 c7 45 fc 00 00 00 00  mov DWORD PTR _i$[ebp], 0
  0000d eb 09                 jmp SHORT $L468
$L469:
  0000f 8b 45 fc              mov eax, DWORD PTR _i$[ebp]
  00012 83 c0 01              add eax, 1
  00015 89 45 fc              mov DWORD PTR _i$[ebp], eax
$L468:
  00018 83 7d fc 0a           cmp DWORD PTR _i$[ebp], 10 ; 0000000aH
  0001c 7d 0b                 jge SHORT $L470
  0001e 8b 4d f8              mov ecx, DWORD PTR _j$[ebp]
  00021 83 c1 01              add ecx, 1
  00024 89 4d f8              mov DWORD PTR _j$[ebp], ecx
  00027 eb e6                 jmp SHORT $L469
$L470:

; 7 : for(i=0;i<10;++i) j+=1;

  00029 c7 45 fc 00 00 00 00  mov DWORD PTR _i$[ebp], 0
  00030 eb 09                 jmp SHORT $L471
$L472:
  00032 8b 55 fc              mov edx, DWORD PTR _i$[ebp]
  00035 83 c2 01              add edx, 1
  00038 89 55 fc              mov DWORD PTR _i$[ebp], edx
$L471:
  0003b 83 7d fc 0a           cmp DWORD PTR _i$[ebp], 10 ; 0000000aH
  0003f 7d 0b                 jge SHORT $L473
  00041 8b 45 f8              mov eax, DWORD PTR _j$[ebp]
  00044 83 c0 01              add eax, 1
  00047 89 45 f8              mov DWORD PTR _j$[ebp], eax
  0004a eb e6                 jmp SHORT $L472
$L473:

is less efficient. Perhaps you need to get a decent compiler.

Fat pipe by Anonymous Coward · 2001-09-03 07:04 · Score: 1, Funny

Could it be that Intel(tm) is learning that it's not how long your pipe is but how you use it?

Thanks dude, I just sent this to Bjarne Stroustrup by Anonymous Coward · 2001-09-03 07:06 · Score: 1, Funny

To: bs@resaerch.att.com

C++ is just WRONG. The proper way to do it is ++C. The other way is both less efficient, and it makes it seem as if you need the original value of C for some special processing, while in fact you don't.

Re:Yeah but... by AndrewHowe · 2001-09-03 07:09 · Score: 2

Yes, any decent compiler will perform this optimisation. In C there's no difference, because you're not using the value of the expression. In C++ 'i' could be of a class with an overloaded operator++(int) which would involve creating a temporary. It's not always possible to optimise that away.
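
For the C++ case, a rough sketch of where the temporary comes from. Counter is a made-up class for illustration, and its copy constructor deliberately bumps a global tally so the cost is observable (which also stops the compiler from quietly removing the copy). The exact number of copies per post-increment can be one or two depending on whether the compiler elides the copy on return, so the sketch only relies on "at least one".

```cpp
#include <cstddef>

// Global tally of copy-constructor calls (illustration only).
static std::size_t g_copies = 0;

// Hypothetical counter type, not from any library.
struct Counter {
    int value = 0;

    Counter() = default;
    Counter(const Counter& other) : value(other.value) { ++g_copies; }
    Counter& operator=(const Counter&) = default;

    // Pre-increment: modify in place and return a reference, no temporary.
    Counter& operator++() {
        ++value;
        return *this;
    }

    // Post-increment: must hand back the old value, so it copies first.
    Counter operator++(int) {
        Counter old(*this);
        ++value;
        return old;
    }
};

// Count the copies made by an n-step loop written with ++c.
std::size_t copies_with_pre(int n) {
    g_copies = 0;
    for (Counter c; c.value < n; ++c) {}
    return g_copies;
}

// Count the copies made by the same loop written with c++.
std::size_t copies_with_post(int n) {
    g_copies = 0;
    for (Counter c; c.value < n; c++) {}
    return g_copies;
}
```

For a plain int, both loops compile to the same thing. The difference only shows up when operator++(int) has a copy the optimizer cannot prove is dead.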

20 - 8 means a slower clock??? by anonymous+loser · 2001-09-03 07:10 · Score: 1

With an 8 stage pipeline, as opposed to the 20 stage pipeline in the P4, clock frequencies are obviously not as high (~1 GHz).

What??? That's totally false, not to mention counter-intuitive. The whole reason for the shorter pipeline is to increase throughput. Think of Henry Ford and the classic assembly line. If you have stages that involve scheduling instructions to be fed into different (parallel) pipelines, as opposed to DIRECTLY COPYING instructions from cache into the appropriate pipeline, which do you think should be faster?

Re:20 - 8 means a slower clock??? by ralfp · 2001-09-03 07:58 · Score: 1

The whole idea is that you do less in each step, so that each step can run faster.

In an assembly line, say there are 21 screws to put in. If each step has one person inserting 3 screws, it will take 7 steps to do it. Now if each step has one person inserting one screw, it will take 21 steps, but each step can go three times as fast.

The steps are serial. In the first case you would have 7 people, in the second case you would have 21 people (stages), but you could do three times as much work per unit time.
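
The screw-counting above can be put in numbers with a small sketch. It assumes an idealized pipeline with no stalls or flushes, and pipeline_time is a made-up helper, not a model of any real CPU.

```cpp
// Time for `items` units of work to pass through a serial pipeline with
// `stages` stages, each taking `stage_time` ticks: the first item needs
// `stages` steps to emerge, and every later item follows one step behind.
long pipeline_time(long stages, long stage_time, long items) {
    if (items <= 0) return 0;
    return (stages + items - 1) * stage_time;
}
```

With one tick per screw, a single item takes 21 ticks either way: pipeline_time(7, 3, 1) and pipeline_time(21, 1, 1) are both 21. For 1000 items the 7-stage line needs 3018 ticks and the 21-stage line 1020, so throughput approaches the 3x figure as the line stays full. What this leaves out is the cost of emptying and refilling the longer line (a mispredicted branch, say), which grows with depth.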

The link to the intel doc posted by Anonymous Coward contains a space in "article", remove it and the link will work fine (this must be because that is where the line break occurs in the "Comment" box if you insert the link).

328 registers? by NoMoreNicksLeft · 2001-09-03 07:14 · Score: 2, Interesting

My god. I'll never learn assembly on a modern chip. I tried on the 386/486, but gave up, and opted for the 65c02 (a fine little chip). I'm getting to the point where it's time to move on, and I was going to attempt the 68k or even PPC (no altivec though). I think I might actually manage to learn that, but I can't even begin to imagine 328 registers. Especially arranged the way intel tends to arrange them...

Will anyone outside of cpu engineers and compiler authors even learn asm on this monster? Or have we truly moved past the point where programmers understand the cpu?

Re:328 registers? by Alanzilla · 2001-09-03 08:54 · Score: 1

Or have we truly moved past the point where programmers understand the cpu?

We moved past programmers understanding the hardware LONG ago. At least around here....
Re:328 registers? by DarkEdgeX · 2001-09-03 16:43 · Score: 2

256 of them are numbered generic registers (so instead of EAX, EBX, ECX and EDX, you get r0 through r127, then 128 (or 127) more floating point registers). They're also 64-bit in size, vs. 32-bit on x86 based CPU's.

I'm not sure what the other 100+ registers are, though I believe there are 64 "predicate" registers that have a 1-bit accuracy (eg: set to 1 or 0) and can't be used as generics (and wouldn't be useful even if they could).

--
All I know about Bush is I had a good job when Clinton was president.
Re:328 registers? by netnic30 · 2001-09-04 15:51 · Score: 1

For the most part, general-purpose assembly language code will be for really specialized tasks only. Rely on the compiler team to produce a good compiler with profiling to improve performance (unless you can think like a pipelined CPU, that is).

Re:Yeah but... by asherlangton · 2001-09-03 07:29 · Score: 1

From Knuth's rng-double.c:

/* This program is copyright (c) 2000 by D E Knuth;
[. . .]
for (j=0;j<KK;j++) aa[j]=ran_u[j];
for (;j<n;j++) aa[j]=mod_sum(aa[j-KK],aa[j-LL]);
for (i=0;i<LL;i++,j++) ran_u[i]=mod_sum(aa[j-KK],aa[j-LL]);
for (;i<KK;i++,j++) ran_u[i]=mod_sum(aa[j-KK],ran_u[i-LL]);
[. . .]

Excuse me? by x136 · 2001-09-03 07:31 · Score: 1, Flamebait

A hundred and thirty watts?!? For just the chip?

Holy frickin' crap! I've got whole computers that use less juice than that!

"I'm sorry, sir. that 400 watt power supply is insufficient to run your new Itanium. You will have to buy at least a 1.2 kilowatt power supply..."

--
SIGFEH

Re:Excuse me? by Saidin · 2001-09-03 07:49 · Score: 1

McKinley is not a desktop CPU, nor was it ever meant to be. The sort of power supplies you see in serious workstation and servers are not the 400W things you see in a desktop. Also, while 130W is high for this space, it is not astronomical. CPUs were pushing the 80-100W envelope easily.
Re:Excuse me? by nick-less · 2001-09-03 09:19 · Score: 1

Holy frickin' crap! I've got whole computers that use less juice than that!

As said before, this is a server CPU; even IBM's Power4 sucks over 125 watts, but its small desktop brother has a low power consumption.

If there will ever be an ItanicIII CPU for desktops, I'm sure that it will be around 40-50 watts and 1GHz (and Intel telling that GHz is not all, registers are ;-)
Re:Excuse me? by rchatterjee · 2001-09-03 09:23 · Score: 1

I'm not sure how serious a workstation/server you're talking about but here at my work we have some dual 64-bit UltraSPARC systems that work fine off of 400 watt power supplies. So if you're talking about the 64-bit workstation/server space, 130 watts per processor is in fact fairly astronomical.
Re:Excuse me? by Midnight+Thunder · 2001-09-03 10:01 · Score: 1

I can imagine the next California black outs being labelled 'Intel Inside'.

--
Jumpstart the tartan drive.
Re:Excuse me? by kcRabbi · 2001-09-03 11:41 · Score: 1

I have an itanium based system on my desk right now. It's the HP i2000 workstation. It has an 800W power supply. It runs HP-UX and RedHat 7.1 very nicely but Windows XP is an absolute dog. And that's with 1GB RAM!

kcRabbi

130 Watts by color+of+static · 2001-09-03 07:41 · Score: 2

Damn, I guess a large scale SMP machine will be dual use convection oven then. Oh wait, by the time I add in FSB buffering and memory maybe that will be true of the workstations :-).

Re:130 Watts by OmegaDan · 2001-09-03 08:03 · Score: 3, Funny

Mattel offered "barbie" and "hot wheels" computers earlier this year ... maybe intel could go in with Mattel and offer an Easy Bake Itanium computer.

--
Free Techno/Jazz/DNB/MI Music by guys obsessed with monkeys!
Re:130 Watts by Midnight+Thunder · 2001-09-03 10:06 · Score: 1

This is what they mean by convergence.

--
Jumpstart the tartan drive.

Whew! It's fun to be over your head. by Kintara · 2001-09-03 07:43 · Score: 2, Interesting

I think I've figured out what the whole 64-bit thing is about. It means that each instruction (right term?) has more capacity to carry data. This doesn't necessarily mean that it will be twice as fast, of course, because not all instructions are that large.

What I'm confused about is how it affects programming. Does this mean that everything will need to be optimized for you to take advantage of the higher bitrate? How will programs that are written for 32-bit systems handle it; can they handle it? How about backwards compatibility?

Do any other people read these sort of threads even though they know that it will be over their heads most of the time?

--
--Kintara

hmm. by silent_poop · 2001-09-03 07:48 · Score: 1

"6mb onchip L3 cache..." == good news for lazy programmers.

--

--
silence is poetry.

There was already a 64-bit xor! by Anonymous Coward · 2001-09-03 07:50 · Score: 1, Informative

for me, it means single instruction xor for the 64 bit hash codes used in chess transposition tables

Try PXOR in the MMX instruction set. It's been there for years.
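
For anyone wondering why a single 64-bit XOR matters here: transposition-table hashes are commonly maintained Zobrist-style, where each move XORs one key out and another in. A minimal sketch follows, assuming that scheme. The names piece_square_key and apply_move are made up, and the key function is a deterministic stand-in (a splitmix64-style mixer) for the table of random 64-bit values a real engine would use.

```cpp
#include <cstdint>

// Hypothetical key lookup: one 64-bit key per (piece, square) pair.
// A fixed mixing function stands in for a random table so the sketch
// is deterministic.
std::uint64_t piece_square_key(int piece, int square) {
    std::uint64_t x = static_cast<std::uint64_t>(piece) * 64u
                    + static_cast<std::uint64_t>(square) + 1u;
    // splitmix64-style mixing, used only to produce well-spread bits.
    x += 0x9E3779B97F4A7C15ull;
    x = (x ^ (x >> 30)) * 0xBF58476D1CE4E5B9ull;
    x = (x ^ (x >> 27)) * 0x94D049BB133111EBull;
    return x ^ (x >> 31);
}

// Moving a piece: XOR out the key for the old square, XOR in the new one.
std::uint64_t apply_move(std::uint64_t hash, int piece, int from, int to) {
    hash ^= piece_square_key(piece, from);
    hash ^= piece_square_key(piece, to);
    return hash;
}
```

Because XOR is its own inverse, undoing a move is the same two operations in reverse, which is why the hash can be updated incrementally. On a 32-bit x86 each `^=` on a uint64_t becomes two integer instructions unless the compiler uses MMX or similar; on a 64-bit register file it is one.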

Obvious AMD marketing campaign... by Glock27 · 2001-09-03 07:50 · Score: 1

"Like the high-end Intel Itanium, the AMD Athlon gets more done with lower clock speeds!"

186,282 mi/s...not just a good idea, its the law!

--
Galileo: "The Earth revolves around the Sun!"
Score: -1 100% Flamebait

email from intel by xted · 2001-09-03 07:57 · Score: 2, Interesting

I received this from one of my Intel comrades; it was sent to all of the Intel employees.

Speed is important. On Monday, Intel launched the Intel® Pentium® 4 processor at 2 GHz. Tuesday, during his keynote at the Intel Developer Forum, Paul Otellini, executive vice president and general manager, Intel Architecture Group, demonstrated a processor operating at fully 3.5 GHz.

But that's not the half of it. Otellini went on to note that the Pentium 4 microarchitecture is expected to scale to a whopping 10 GHz.

Now that's a "Wow!"

But, exciting as speed is, it isn't everything. While it is important, "it is not sufficient to drive the levels of growth and innovation that will allow our industry to prosper," Otellini said.

Speaking before an audience of 4,000 developers, designers, and executives Tuesday, Otellini noted that as the computing industry has grown and new technologies have evolved, purchasing criteria are changing. "We all need to change the pattern of our investments," he cautioned the crowd. "We need to think beyond gigahertz and build substantially better computers."

Buyers now look to a variety of features, noted Otellini: style, form factor, security, power consumption, reliability, communications functions, price, and overall user experience. Combinations of these and other features are driving end-user technology requirements in individual market segments. Intel plans to develop technologies that will help address these changing requirements in each of the key market segments.

Here are just a few of the ways Intel plans to go beyond gigahertz, as Otellini revealed in his keynote address:

It's like multiple processors on a single chip
Otellini introduced the audience to a breakthrough in processor design called hyper-threading. This technology allows microprocessors to handle more information at the same time by sharing computing resources more efficiently. The technology provides a 30 percent performance boost in certain server and workstation applications and will first appear next year in the Intel® Xeon[tm] processor family.

130 Watts. by istartedi · 2001-09-03 08:04 · Score: 3, Interesting

This makes me wonder, how many Crusoe processors could you put in a box (all other components equal) and equal this power consumption? Would the performance of such a box meet or exceed the performance of an Itanium box for real-world servers?

--
For all intensive purposes, "whom" is no longer a word. That begs the question, "who cares"?

Re:130 Watts. by Chagrin · 2001-09-03 09:13 · Score: 2

Well, the Crusoe processor uses about 1-2 watts. So, you're talking about 65 Crusoe processors to eat up the power of a single Itanium. If you're going by the entire motherboard with its components, an RLX 324 uses 15.7 watts of power.
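The division behind that estimate can be sketched quickly. This is only a back-of-the-envelope check using the figures quoted in this thread (130 W for the Itanium, 1-2 W per Crusoe, 15.7 W per RLX 324 board); the function name is made up for illustration, and it says nothing about relative performance.

```python
ITANIUM_WATTS = 130.0  # figure quoted in the thread title


def units_within_budget(budget_watts, unit_watts):
    """Return how many whole units of unit_watts fit inside budget_watts."""
    return int(budget_watts // unit_watts)


# Bare Crusoe CPUs, using the 1-2 W range from the post.
crusoe_at_2w = units_within_budget(ITANIUM_WATTS, 2.0)
crusoe_at_1w = units_within_budget(ITANIUM_WATTS, 1.0)

# Whole RLX 324 boards at the quoted 15.7 W each.
rlx_boards = units_within_budget(ITANIUM_WATTS, 15.7)

print(crusoe_at_2w, crusoe_at_1w, rlx_boards)
```

Counting whole boards rather than bare CPUs shrinks the number a lot, which is probably the fairer comparison for a server box.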

--
I/O Error G-17: Aborting Installation

Questions.... by stikves · 2001-09-03 08:12 · Score: 1

I have some questions. Can anyone answer them?

1. Are all these >300 registers "general purpose" or do we still have CISC "features" like doing divisions with only DX? If so this "is" a performance enhancement (also answers power problem).
2. L3 is 6MB, OK. But what about L1 and L2? Also, is L1 unified or separated?
3. Is this a complete RISC design (ie: there is no DIV operation)?
4. Will we be able to do "microprogramming"?
5. What about floating point? Does anyone know the FLOPS rating of Itanium? (FLOPS: floating-point operations per second)

Thanks for the answers :)

Re:Questions.... by Alanzilla · 2001-09-03 08:57 · Score: 1

Addressing #2, the concept of L3 is from marketing.

From the technological standpoint, that is still L2 cache. What is unique about the IA64 machines is their L0 (what marketing is calling L1): 2 clock reads and writes for data (3 clock reads and writes for instructions).
Re:Questions.... by DarkEdgeX · 2001-09-03 16:50 · Score: 1

Are all these >300 registers "general purpose" or do we still have CISC "features" like doing divisions with only DX? If so this "is" a performance enhancement (also answers power problem).

I don't know the answers off-hand to your other questions, but this one I do-- there are 128 general purpose 64-bit registers, and 128 floating point registers (also general purpose). IA-64 allows code to "allocate" registers as needed from this set of 128, and also supports register renaming. Gone are the days of EAX, EBX, ECX and EDX for the most part.

--
All I know about Bush is I had a good job when Clinton was president.

The consumer will never see an IA64 processor. by Alanzilla · 2001-09-03 08:38 · Score: 1

It is designed for servers, and possibly extremely high-end workstations.

Re:The consumer will never see an IA64 processor. by garbuck · 2001-09-03 11:16 · Score: 1

It is designed for servers, and possibly extremely high-end workstations

Until next year.
Re:The consumer will never see an IA64 processor. by Alanzilla · 2001-09-03 14:27 · Score: 1

Servers and high end workstations are a dead market. The only people buying high end boxes these days are gamers.

Yeah. I know you're trolling, but it's so... hard... to... resist..... Aaaaaaaagggggggghhhhhhhh!!!!!!!
Re:The consumer will never see an IA64 processor. by thogard · 2001-09-04 11:18 · Score: 1

And why do you say this is a troll? I work for a company that's in the business of selling PCs among other things and the high end server business is dead. People are now buying "low end servers." They don't want 256MB of RAM in them and don't want to pay for all that extra memory even though it adds $20 to the price. The same is true for general workstations. They don't want to pay for the biggest and fastest when the slowest PC you can get is faster than all the workstations they currently have. Every "server" I've seen in the last few months is just a generic PC assigned server duties. I don't see there ever being a market for dedicated servers for small business again.

MOD PARENT UP! ;)))) by Nickoty · 2001-09-03 08:40 · Score: 1

Besides, if there had indeed been different code generated, that'd be a typical example of a situation in which you should fix it in the compiler, NOT in the code.

--

-- Cure for Cancer instead of SETI! (only w32 yet - mail and beg)

Re:Compiler by nagora · 2001-09-03 08:41 · Score: 2

It is dramatically reducing the complexity of the pipeline, thereby increasing throughput by orders of magnitude (see CISC vs. RISC).

Brought to us by the same people that told us the big pipeline would solve all our problems and that RISC was a deadend, that bought up and squashed the ARM, that thought that no one would need more than 8 registers or 640K of memory and all the other crap Intel have spouted since it invented the 4004 and then proceeded to get everything else wrong.

Intel has spent the last twenty years proving how little it knows and how much it depends on MS for a free ride onto the desktop.

TWW

--
"Encyclopedia" is to "Wikipedia" what "Library" is to "Some people at a bus stop"

Treat r0 more like /dev/zero by yerricde · 2001-09-03 08:56 · Score: 1

Why, except perhaps that it seems a bit like wasted effort to check for accesses.

It's only a few gates, and it can help spot bugs earlier.

Why would you want a /dev/null in asm?

Not /dev/null but /dev/zero. For example, MIPS doesn't have "load immediate" but does have a 3-way "or immediate", ori r16, r0, 3, which loads register r16 with 3 ORed with the value in r0. It also removes the need for a 'negate' instruction, as sub r16, r0, r16 will negate r16.
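To make the /dev/zero idea concrete, here is a toy model of a register file whose register 0 always reads as zero. It is only a sketch: the class and helper names are invented for illustration, it models just the two instructions mentioned above, and it silently discards writes to r0 (as MIPS does) rather than faulting.

```python
MASK32 = 0xFFFFFFFF


class RegisterFile:
    """Toy 32-entry register file where register 0 is hard-wired to zero."""

    def __init__(self):
        self._regs = [0] * 32

    def read(self, n):
        return 0 if n == 0 else self._regs[n]

    def write(self, n, value):
        if n != 0:  # assumption: writes to r0 are dropped, not trapped
            self._regs[n] = value & MASK32


def ori(rf, rt, rs, imm):
    """rt = rs OR a 16-bit immediate."""
    rf.write(rt, rf.read(rs) | (imm & 0xFFFF))


def sub(rf, rd, rs, rt):
    """rd = rs - rt, wrapping to 32 bits."""
    rf.write(rd, rf.read(rs) - rf.read(rt))


rf = RegisterFile()
ori(rf, 16, 0, 3)    # "load immediate": r16 = 0 | 3
loaded = rf.read(16)
sub(rf, 16, 0, 16)   # "negate": r16 = 0 - r16
negated = rf.read(16)
print(loaded, hex(negated))
```

The point is that one always-zero source operand lets ordinary three-operand instructions stand in for load-immediate and negate, so neither needs its own opcode.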

--
Will I retire or break 10K?

Re:Treat r0 more like /dev/zero by Nickoty · 2001-09-03 09:15 · Score: 1

These examples just require that you can read r0 and get zeroes from it, right? That is present in IA64 and useful and nice, as you clearly demonstrate.

I think it was just writing to r0 that wasn't allowed, and I can't figure out why anybody would want to do that.

--

-- Cure for Cancer instead of SETI! (only w32 yet - mail and beg)
Re:Treat r0 more like /dev/zero by scheme · 2001-09-03 11:17 · Score: 2

I think it was just writing to r0 that wasn't allowed, and I can't figure out why anybody would want to do that

Think NOP. I believe on the alpha a nop was implemented as add r0, r0, r0. Possibly on other architectures as well.

--
"When you sit with a nice girl for two hours, it seems like two minutes. When you sit on a hot stove for two minutes, it
Re:Treat r0 more like /dev/zero by tricorn · 2001-09-04 05:31 · Score: 1

r31 is the "zero" register on the Alpha. Standard-form NOP instruction is:

bis r31,r31,r31

"bis" is the "bit sum" instruction, otherwise known as "or".

Translation Time...arrggg... by Cylix · 2001-09-03 09:14 · Score: 4, Funny

With an 8 stage pipeline, as opposed to the 20 stage pipeline in the P4, clock frequencies are obviously not as high (~1 GHz).
This beast has a small wang... its not the size that counts, but how you use it. (no giggling from the girls damn't)

130 Watts power consumption...
Who needs space heaters anyway?

...6mb of on die cache...
OY! Hold your wallet tight, not for the light bank accounted!

I'm sure many people can appreciate 64 bit integer ops; for me, it means single instruction xor for the 64 bit hash codes used in chess transposition tables.

Not quite what the intel boys will be using in their next commercial. However, the wizards in marketing will be stressing the enhanced features of porn browsing. The fourth blue intel commando will be a scantily clad woman... further emphasizing the need for this processor which will not just make the internet faster, but will speed on your favorite pron sights.

--
"You should always go to other people's funerals; otherwise, they won't come to yours." -- Yogi Berra

assembly language reference? by Skapare · 2001-09-03 09:30 · Score: 2

So where do I download a free reader that runs on Linux for that file of binary garbage?

--
now we need to go OSS in diesel cars

Power by man_ls · 2001-09-03 09:31 · Score: 1

Hmm, maybe the cases with redundant power supplies will find a slightly higher market now. One for everything except the processor, and one for the CPU chip itself.

!!!

And my 750-Athlon runs hot as it is.

J.Koebel

Re:Compiler by Anonymous Coward · 2001-09-03 09:43 · Score: 1, Insightful

Actually I am very familiar with VLIW and I suggest you read up on some of the comments by the Alpha development team on Merced. You clearly need a second viewpoint.

You're saying the compiler has knowledge of registers, and what branch will be taken? Further you're saying the compiler has knowledge of the *current* memory structure? Latency of a particular memory fetch/store (whether the data is in L1/L2/L3/L4/L5 memory?). When DRAM refresh is going to hit (if ever). Or that an interrupt may come in randomly.

All this info is VERY useful for the processor to reorder its instructions to avoid a pipeline stall. But of course, you'd say the compiler knew all this detail ahead of time - right? ;)

Further, the compiler typically does NOT have access to the entire program at once. Many, many programmers do not have one huge .C file for their entire project. Opting rather to make many .o's and link them together. The linker typically does not do optimization. The compiler attempts to, but often assumes uniform memory access latency (the linker typically decides the memory map).

And i'd really prefer to avoid optimized recompiles during VM page swaps. They take long enough as is. And really, on a decent system - you shouldn't be paging!

Tom

What? A dog? by doorbot.com · 2001-09-03 09:57 · Score: 2

Simplicity is the correct answer, Intel clearly didn't understand the question.

That's assuming they were listening in the first place.

If Intel had big plans for the long run, they'd create a "simple" processor, let's take the original Pentium as a bad example:

Add MMX. Customers upgrade.
Change processor form factor. Upgrades galore.
Add SSE. More upgrades.
Change processor form factor again. Upgrade.
Change form factor, add SSE2 and slap on a few marketing terms. Further upgrades.

The advantage is each time you can say the processor is "new and improved" so people will buy new ones. Does it really matter that a Pentium III 600 is more than enough power for 90% of computer owners? Of course not.

What makes me laugh, though, is how Intel switched to the Slot 1 form factor so it would be easier for customers to install processors (how often does that really happen?) and then switched back. I'll bet they were planning it all along.

Re:What? A dog? by johnjones · 2001-09-03 10:34 · Score: 2

the best one yet is cache

put more cache on the chip because after all its the same structure on the die and get huge gains in performance
(as long as your MMU and cache lines are done right)
but takes up more die so more expensive

Charge an ARM and a LEG

oh come on you arnt serious ? I hear you cry

ask an intel engineer the diff between XEON Px and plain Px

answer cache
(yes I know that designing a decent cache is hard but compare it to a real change in arch ;-)

result (foolish) customers upgrade

fun of the fair

regards

john jones

Re:High performance or what ? by be-fan · 2001-09-03 10:05 · Score: 2

Apparently, this poster thinks that cutting edge designs get written at the moment of their release.

--
A deep unwavering belief is a sure sign you're missing something...

Re:Compiler by anonymous+loser · 2001-09-03 10:29 · Score: 1

You're saying the compiler has knowledge of registers, and what branch will be taken? Further you're saying the compiler has knowledge of the *current* memory structure? Latency of a particular memory fetch/store (whether the data is in L1/L2/L3/L4/L5 memory?). When DRAM refresh is going to hit (if ever). Or that an interrupt may come in randomly. All this info is VERY useful for the processor to reorder its instructions to avoid a pipeline stall. But of course, you'd say the compiler knew all this detail ahead of time - right? ;)

Having more data at compile time does not preclude having the same branch prediction and memory access data in hardware, as you imply. Itanium still has the ability to do branch prediction and handle memory latency the same as any modern processor. Why don't you read the documentation and get back to me?

And i'd really prefer to avoid optimized recompiles during VM page swaps. They take long enough as is. And really, on a decent system - you shouldn't be paging!

The reason you do this during a page swap is because that is when the processor would normally be stalled/idle waiting for data or instructions anyway. If implemented properly, this requires no extra cycles to perform. Every system page swaps, which is why TLBs (translation lookaside buffers) exist: to translate between virtual memory addresses and physical memory addresses.

AMD by Decimal · 2001-09-03 10:42 · Score: 1

But I don't really want a P4, the Itanium definitely isn't for what I do, and I have never really been an AMD fan, I just don't know their stuff. So where is Intels next chip for ME?

You seem quite adamant that your next chip should come from Intel, and one of the reasons you give is that you don't know much about AMD. So why don't you look into AMD, and learn their stuff? They really are a great company and right now their 1.4 GHz Athlon runs as fast as (or faster than) a 1.7 GHz P4. The Sledgehammer chip will have a mode for backwards compatibility with x86, use 64-bit instructions and you can be sure that it will run cooler and faster than anything Intel will put out at the same time.

--

Remember "Bring 'em on"? *sigh

Re:AMD by BiggestPOS · 2001-09-03 11:23 · Score: 1

For some reason when I think AMD, I think of shitty VIA chipsets that have fucked up compatibility issues with my favorite video card maker's stuff. (I had a Riva128, 4 meg, Played the Quake3 IHV really well actually on my PII-233).
If this is no longer the case with AMD, I might have to go that route.

--
What, me worry?
Re:AMD by BiggestPOS · 2001-09-03 11:58 · Score: 1

You don't remember all the muleshit with the original Slot-A athlons and the current Nvidia chips? VIA sucks.

--
What, me worry?

CPUID vs PSN by dpilot · 2001-09-03 10:44 · Score: 2

You're right, I had a brain blip on that. I meant PSN instead of CPUID.

I merely wish they had looked into some PSN-type technique that would let software be nodelocked without being usable for tracking. I don't believe PSN must be bad, at least not to anyone other than a fanatical Free Software type, who believes NO software should need to be paid for. I'm sure a technique can be used which will not alarm privacy advocates.

--
The living have better things to do than to continue hating the dead.

Re:thread switching? by pslam · 2001-09-03 10:51 · Score: 1

Switching is usually at around 100Hz. 100 * 3072 bytes = 300KB/sec. This is a relatively small cost and should hit L1 cache the vast majority of time. It does effectively knock 3KB out of the L1 cache.
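For anyone who wants to check the arithmetic, a quick sketch follows. The 100 Hz switch rate and 3072 bytes of register state are the figures from this post; counting only one direction (save, not save plus restore) is an assumption made to match the 300KB/sec number.

```python
SWITCHES_PER_SECOND = 100    # context switch rate from the post
REGISTER_STATE_BYTES = 3072  # register state per switch, from the post

# Bytes written per second if every switch saves the full register set once.
bytes_per_second = SWITCHES_PER_SECOND * REGISTER_STATE_BYTES
kib_per_second = bytes_per_second / 1024

print(bytes_per_second, kib_per_second)
```

If each switch both saves the old context and restores the new one, the traffic doubles, which is still small next to cache bandwidth.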

I'd guess that the Itanium CPU has some scheme to reduce register swapping on context switching. I can instantly think of at least one way - having "dirty" bits for segments of the register set, so it can be broken into, say 32 register chunks. I'll have to grab the tech ref manual at some point.

Re:G4 kicks butt. by PurpleBob · 2001-09-03 10:59 · Score: 2

So, by the way that's phrased, either you have no idea what the number of bits per instruction has to do with anything, or you're dumbing down your language because you think the rest of Slashdot doesn't.

"G4 has 128 bits in it! Bits make computer go fast! Bits good!"

--
Win dain a lotica, en vai tu ri silota

In terms of puffs and gives by yerricde · 2001-09-03 11:38 · Score: 1

In an assembly line, say there are 21 screws to put in. If each step has one person inserting 3 screws, it will take 7 steps to do it. Now if each step has one person inserting one screw, it will take 21 steps, but each step can go three times as fast.

And you lose time while the work is moving to the next person.

In the first case you would have 7 people, in the second case you would have 21 people (stages), but you could do three times as much work per unit time.

Not necessarily. Call each insertion of a screw (or each layer of logic) a "puff" and moving the work to the next worker (or setup and hold for flops between pipeline stages) a "give." If a give takes a significant amount of time, fewer puffs per give can actually bog down performance. No matter how long a puff takes, the slowest worker's puff-puff-give time (or "critical path") always determines the clock frequency of the processor.
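A small model of the puff/give trade-off, as a sketch only. It assumes perfectly balanced stages and a pipeline that is always full, and the times (1.0 per puff, 0.5 per give) are made-up units chosen for illustration, not measurements of any real chip.

```python
def clock_period(total_puffs, stages, t_puff, t_give):
    """Time for one stage: its share of the puffs plus one give."""
    puffs_per_stage = total_puffs / stages
    return puffs_per_stage * t_puff + t_give


def latency(total_puffs, stages, t_puff, t_give):
    """Time for one item to travel through every stage."""
    return stages * clock_period(total_puffs, stages, t_puff, t_give)


T_PUFF, T_GIVE = 1.0, 0.5  # hypothetical units

shallow = clock_period(21, 7, T_PUFF, T_GIVE)   # 3 puffs + 1 give
deep = clock_period(21, 21, T_PUFF, T_GIVE)     # 1 puff + 1 give

# With a full pipeline, one item finishes per clock, so the throughput
# gain is the ratio of the two clock periods.
speedup = shallow / deep

latency_shallow = latency(21, 7, T_PUFF, T_GIVE)
latency_deep = latency(21, 21, T_PUFF, T_GIVE)

print(shallow, deep, round(speedup, 2), latency_shallow, latency_deep)
```

With these numbers, tripling the stage count buys about 2.3 times the throughput rather than 3, and each individual item takes longer to get through. The larger the give time relative to a puff, the worse that gets.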

--
Will I retire or break 10K?

Re:Compiler by anonymous+loser · 2001-09-03 12:08 · Score: 1

Brought to us by the same people that told us the big pipeline would solve all our problems and that RISC was a deadend, that bought up and squashed the ARM, that thought that no one would need more than 8 registers or 640K of memory and all the other crap Intel have spouted since it invented the 4004 and then proceeded to get everything else wrong.

Actually most of the design work came from HP's VLIW research team. I also find it disheartening that you are trying to prove your case by defaming Intel, which is the weakest form of argument (pathos) according to Socrates.

The chipset is the magic... by driehuis · 2001-09-03 12:16 · Score: 2

I've been extremely reluctant in going the AMD route. My first AMD processor was a 133MHz 486, which was branded in a way as to resemble the Pentium 75 (on the premise that it was as fast). I put it into my firewall, which was not getting heavy duty at the time.

The thing sucked eggs, and I threw the motherboard in the trash and used the CPU as a paperweight.

At some stage, I needed a faster CPU, needed a motherboard to go with it, so I made the jump to a PentiumIII/450. I needed to revive my firewall, so I bought a decent ASUS 486 mobo at a fair. On a hunch, I put my paperweight AMD133 in. I was pleasantly surprised, and I only replaced the thing when I got a real cheap 300MHz Cyrix mobo.

Bottom line, it's the motherboard (or rather the chipset on it) that makes or breaks the CPU. I'm now running an ASUS A7V-E with a 1GHz Athlon, and I've been a happy camper. I'm not an overclocker (matter of fact, I underclock some machines just because I don't need CPU power for other things than video recoding, and some machines are on the other end of the globe, so I don't want to lose sleep over fan failure).

My main gripe with the VIA KTA133 chipset is the fact that I have to sign a $#@#%$#% NDA to get decent specs on it. FreeBSD doesn't seem to grok its I2C based hardware monitoring, and without those docs I'm SOL. Apart from that, it's working great. Even under Carmageddon^WWindows.

--

Bert Driehuis -- All I asked was a friggin' rotatin' chair. Throw me a bone here, people.

Re:The chipset is the magic... by driehuis · 2001-09-03 14:16 · Score: 1

Huh, I will beat you on that one. My main squid proxy for the longest time was a 386SX with 12MB of RAM, and a really slow IDE disk scavenged from a laptop.

Only reason I upgraded was that I got tired for waiting for the prompt to appear when ssh'ing into the machine.

--
Bert Driehuis -- All I asked was a friggin' rotatin' chair. Throw me a bone here, people.

I hope Apple and AMD pick up on this... by guttentag · 2001-09-03 12:17 · Score: 1

in their "Megahertz Myth" literature:

"So you're a speed demon, huh? You bought the most powerful microwave available so your popcorn would be ready 20 seconds before your neighbor's.

You could run out and buy a PC with Intel's new 2 Ghz Pentium 4 processor for around $1,800...
or step up to their higher-performance 1 Ghz Itanium processor (insert link to Intel's Itanium literature here) for between $8,000 and $15,000.

Then again, now that you know that megahertz (and gigahertz) don't equal speed, you could come to your senses and buy an 800Mhz Power Mac G4 for about $1,500.

Feels good to have the inside info, doesn't it? Welcome to Apple."

130 watts?!?! by Guppy06 · 2001-09-03 12:20 · Score: 2

Why do I have visions of new computers plugging into a 230V AC socket, like dryers and ovens? 130 watts is an awful lot of juice when you consider most power supplies only put out around 5 volts DC or so.

For those that don't remember their EE or physics courses: watts = volts * amps. And one amp through your torso is enough to kill just about anybody.
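Plugging the thread's numbers into that formula, as a rough sketch: the 5 V and 230 V figures come from this post, the helper name is invented, and real CPU core voltages differ, so treat the result as illustrative only.

```python
def current_amps(power_watts, volts):
    """Rearranged from watts = volts * amps."""
    return power_watts / volts


amps_at_5v = current_amps(130, 5)      # on a 5 V DC rail
amps_at_230v = current_amps(130, 230)  # at a 230 V wall socket

print(amps_at_5v, round(amps_at_230v, 2))
```

The same 130 watts is a large current at low voltage and well under one amp at wall voltage, which is why the voltage matters when talking about how much "juice" a part draws.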

Re:130 watts?!?! by MikeBabcock · 2001-09-03 14:00 · Score: 2

If you go to American Power Conversion's website and look around, they have some good information on why running your servers at 230 is more efficient than 115 anyway.

--
- Michael T. Babcock (Yes, I blog)

From Dell's web site by clovis · 2001-09-03 12:40 · Score: 1

http://rcommerce.us.dell.com/rcomm/config.asp?order_code=H1054&conum=70&ConfigType=3

Check out the choices of Operating System.
And no, I haven't called to see if they're shipping today.

Remember 6502? by driehuis · 2001-09-03 12:42 · Score: 2

In its heyday, the 6502 was an eight bit RISC processor avant la lettre. It featured a whopping 256 memory locations that could be accessed with near-zero overhead. The famous page zero.

Needless to say, this great concept had gone to the dogs before the first consumer laid his/her hands on the device. Oblivious to the CPU design, a major manufacturer of operating systems (we called them BASIC interpreters at the time, by the way) had decided that most of page zero should be allocated to the OS^WBasic interpreter. I'll leave it to our hidden conscience to name the perpetrator of this gruesome mistake.

I have long grown over the idea of using assembly as a faster programming language. The number of times I beat an assembly program with something hacked up in Perl, I don't even want to remember. Not because Perl is the best thing since sliced bread, but because humans are so poor at dealing with complexity. Get it working first, and leave optimization to the compiler. Then, if you have a bottleneck, analyze it, and fix the bottleneck in a targeted piece of code (whether C, or assembly, or something else).

--

Bert Driehuis -- All I asked was a friggin' rotatin' chair. Throw me a bone here, people.

Re:Remember 6502? by driehuis · 2001-09-03 14:13 · Score: 2

Seems like you're in for the heck of it :-)

That's good. I'm happy to be reminded every once in a while that whatever half baked wisdoms I spout, they are usually based on a corporate image of what makes economical sense and what doesn't.

The art of computing wasn't furthered that much by the corporates, I know!

--
Bert Driehuis -- All I asked was a friggin' rotatin' chair. Throw me a bone here, people.

Re:When will we see some improvements from the Alp by thogard · 2001-09-03 12:59 · Score: 1

Intel remembers history...
Remember when Shugart made the best drives in the world? Someone pissed off a bunch of engineers and they founded Seagate. Where is Shugart now?

Special Cases for Intel chips by Anonymous Coward · 2001-09-03 13:05 · Score: 1, Funny

Kind of like the P4 case with the special bolts to hold up the cpu, the Itanium is going to require a special case with a lightning rod to provide the 130 jigawatts.

Re:When will we see some improvements from the Alp by pjbass · 2001-09-03 13:54 · Score: 1

It's actually funny that you mention that; Intel bought them so they can replace the VAX's they use in factories, which in turn will be replaced with Itanium/McKinley once someone can write an OS that can support a fab (like VMS). I hate VMS, but point me to an OS that will run on anything non-DEC or non-IBM under-run (and don't mention Solaris, because it WON'T do it...), and be able to support the fabs and everything that is tied to them (WIP movements, billing, shipping, ordering, cross-site processing, etc.).

Slashdot needs 'Ironic' rating... by salimma · 2001-09-03 14:06 · Score: 1

I completely agree that 130 watt is a rather ridiculously high power output, OTOH of course Intel was referring to the system makers properly cooling the McKinley when they referred to 'a properly designed system'.

Heck, if I just bought myself a $3K CPU I would not want it melting down either.

Michel

--
Michel
Fedora Project Contribut

Re:Compiler by roca · 2001-09-03 14:39 · Score: 2

> Having more data at compile time does not
> preclude having the same branch prediction and
> memory access data in hardware, as you imply.
> Itanium still has the ability to do branch
> prediction and handle memory latency the same as
> any modern processor.

No, it does not. In the quest for increased scalability they threw out "out of order" execution. All instructions must retire in order. This cripples its ability to tolerate unpredictable memory latencies.

Re:Compiler by roca · 2001-09-03 14:47 · Score: 2

> It's not "throwing it off" to the software guys
> because it's too difficult to implement. It is
> dramatically reducing the complexity of the
> pipeline, thereby increasing throughput by
> orders of magnitude

That's what they say, yet somehow decreasing the complexity of the pipeline hasn't produced many benefits in practice. The clock speed is low and the throughput (as measured by benchmarks) hasn't increased by orders of magnitude ... or whatever improvements there have been are trashed by other problems in real-world applications.

> The scheduling hardware is only capable of
> looking a few instructions at a time to decide
> how to enhance ILP
This is quite false. Modern CPUs can have over 100 simultaneously executing instructions in flight. Furthermore modern CPUs take advantage of hardware such as branch predictors which records information on hundreds or thousands of instructions in order to make better execution decisions.

Profile-based optimization is a cool idea in theory but despite decades of research, it's seldom used. I suspect that one reason why is that (in C programs) reoptimization can reveal bugs in your code that were previously hidden (like an uninitialized variable that, by luck, always happened to be zero when the code was optimized a certain way). People don't like it when their system suddenly starts exhibiting new bugs that no-one else can reproduce.

Mem subsystem is what I've heard. by Sangui5 · 2001-09-03 15:38 · Score: 1

From what I've heard from a friend who's gone from DEC to Compaq to Intel, they're mostly interested in the memory subsystem and the bus. The moveover from "Alpha" to "Intel" is supposed to take a while though. He says that one more release of the Alpha is going to be made, a few minor revisions, and then no more. The whole Alpha team will be engulfed by 2007 (I think?), but gradually.

One can hope that one of the first subteams to move over is whoever designed the Alpha bus. It might be a bit more expensive, but it's better than that POS that Intel is currently using.

I'm finding it a little scary that all of the people I know who really disliked Intel are now working for them. Alpha just plain got bought, and everyone I know at HP is helping with McKinley. Freaky.

!= Correct by DarkEdgeX · 2001-09-03 16:46 · Score: 1

Wrong. Good news for data intensive apps that tend to read a lot of the same data (say, database indexes) from memory. Cache doesn't have a whole lot to do with "lazy programmers" other than allow code and data fetching to occur faster (especially in loops or, in this case, larger loops, I'd imagine).

--
All I know about Bush is I had a good job when Clinton was president.

Re:When will we see some improvements from the Alp by Jordy · 2001-09-03 17:57 · Score: 2

Alan Shugart (the man credited for creating the floppy drive) left Shugart Associates in 1974 due to a dispute about the direction of the company.

In 1979, Finis Conner (who later founded Conner which was bought by Seagate) approached Shugart to develop 5 1/4" hard drives and the two founded Seagate.

I believe Shugart Associates was purchased by Xerox around the time when Seagate was founded.

--
The world is neither black nor white nor good nor evil, only many shades of CowboyNeal.

Re:Ridiculous power consumption by Graymalkin · 2001-09-03 18:14 · Score: 1

If you want to know why this is off topic look up the specs of an UltraSPARC III or POWER3 processor and figure out how many watts of power they each eat up. An Itanium is right up there in line with them. A bajillion registers and 6 megs of on chip cache lends itself to sucking down a whole bunch of amperage. It isn't like Athlons plus the Golden Orb fan you've got on top don't suck down their fair amount of power.

--
I'm a loner Dottie, a Rebel.

PXOR can do xor on 64bit numbers by Utopia · 2001-09-03 18:31 · Score: 1

Regarding "single instruction xor for the 64 bit hash codes used in chess transposition tables":
Some 64-bit operations are already possible with MMX.
The PXOR instruction in the MMX set can XOR two 64-bit numbers.
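For readers wondering why a 64-bit XOR matters here, a minimal sketch follows. It assumes the hash codes in question are Zobrist-style keys (a common scheme for transposition tables, though the thread does not name it); the table layout, seed, and function names are all made up for illustration.

```python
import random

MASK64 = (1 << 64) - 1
rng = random.Random(2001)  # fixed seed so the table is reproducible

# One random 64-bit key per (piece kind, square): 12 kinds x 64 squares.
ZOBRIST = [[rng.getrandbits(64) for _ in range(64)] for _ in range(12)]


def toggle(hash_value, piece, square):
    """Add or remove a piece on a square; XOR is its own inverse."""
    return hash_value ^ ZOBRIST[piece][square]


def move(hash_value, piece, from_sq, to_sq):
    """Update the position hash for a quiet move with two XORs."""
    return toggle(toggle(hash_value, piece, from_sq), piece, to_sq)


start = 0
after = move(start, 0, 12, 28)     # piece kind 0 from square 12 to 28
undone = move(after, 0, 28, 12)    # moving back restores the old hash
print(hex(after), undone == start)
```

Each `^` here is one 64-bit XOR, which is a single instruction on a 64-bit integer unit or via PXOR on MMX, and two 32-bit XORs on plain 32-bit x86 registers.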

Re:PXOR can do xor on 64bit numbers by nagora · 2001-09-03 21:46 · Score: 2

Doesn't really matter if you're playing Japanese chess with 81 squares. The interesting point here is that it shows how over-specialised chess programs have become and how little they tell us about artificial intelligence. A really intelligent chess program could play either game (and any other variant) equally well. The 64bit transposition tables tell us nothing about how a human plays chess.

--
"Encyclopedia" is to "Wikipedia" what "Library" is to "Some people at a bus stop"

New house by Graymalkin · 2001-09-03 19:00 · Score: 1

Intel is finally letting its vendors get into the REAL high-end games. I'm curious now to see how Itanium fares against both the USPARC3 and POWER3/4 in real world performance. Does Intel really have a chance here by deviating from the RISC-like quo of the market? Well I suppose Intel will only be making the chips and everyone else will be building the boxes. This raises another question; what are Sun and IBM going to do to compete? Sun and IBM are both in charge of production of their high end chips and thus have fairly fine control over the margins. Other OEMs on the other hand like HP and (let's say) Dell are getting their chips from an outside producer who's producing them in much higher quantities than IBM or Sun. This puts Intel in the position of eventually cheapening their chips enough to where HP and Dell can undercut Sun and IBM in price/performance. So anybody have access to a high end workstation they can jam a Red Hat install onto for some Mindcraft style tests? Mindcraft in the sense they are simulated real world tests as opposed to pure benchmarking.

--
I'm a loner Dottie, a Rebel.

Re:New house by supersnail · 2001-09-04 00:59 · Score: 1

Both SUN and IBM are currently shipping chips that outperform the Itanium (which is very much sample ware) on every real world benchmark.

Furthermore the latest generation of the Power4 chip (shipping in October) comes with two processors with a built-in shared L3 cache on a single chip!

In addition it features several other speed enhancing innovations.

SUN are rumored to have something similar in mind for their next generation Sparcs.

Basically INTEL are dying on the altar of "386" compatibility. In trying to make a 64 bit chip compatible at instruction level with the world's worst 16 bit instruction set INTEL have set themselves an impossible task.

--
Old COBOL programmers never die. They just code in C.

Re:G4's, the Megahertz Myth and the BPI Myth by Saint+Fnordius · 2001-09-03 20:10 · Score: 1

Well, you're reading a lot into the comment there. The speed of a system depends on so many factors, it's amazing that you can even compare across systems.

Rather than extolling the virtues of bits per instruction, the post we're both replying to is actually being skeptical about the whole BPI/Megahertz thingy, and adding a parting shot about how he loves his Mac.

(BTW, I love my Mac, too, mainly because Apple has managed to make a system that actually lets me get things done without a hassle. It's this foresight in the architecture that lets my old 200Mhz 604e keep on trucking as a productive workstation!)

Re:Whew! It's fun to be over your head. by doug363 · 2001-09-03 20:21 · Score: 2, Informative

I think I've figured out what the whole 64-bit thing is about. It means that each instruction (right term?) has more capacity to carry data. This doesn't necessarily mean that it will be twice as fast, of course, because not all instructions are that large.

Yup, exactly right. It means that the CPU tends to deal mainly with 64-bit (8 byte) chunks of data at a time, instead of the more common 32-bit chunks. As far as programming goes, not everything needs larger instructions. For example, to program a user interface, 32 bit integers are quite sufficient for most purposes (unless you have over 4 billion items in a listbox or something). If you only need to store a number from 1 to 10, using 8 bytes instead of 4 is a waste of memory. (This happens a lot.) However, it is useful for many operations, such as multimedia, games, DSP applications, crypto, etc. etc. These applications would run faster on a 64-bit processor because they can use 1 instruction to manipulate a 64 bit number instead of 2 or more that are necessary to do the same thing on a 32-bit processor.
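
To make the "1 instruction instead of 2 or more" point concrete, here is a rough sketch of what a 32-bit machine has to do to add two 64-bit numbers: add the low halves, then add the high halves plus the carry. It is written in Python purely for illustration; the names MASK32, split64, join64 and add64_on_32bit are made up for this example, and the masking only mimics 32-bit registers.

```python
MASK32 = 0xFFFFFFFF  # keeps a Python int inside one simulated 32-bit register


def split64(x):
    """Split a 64-bit value into (low, high) 32-bit halves."""
    return x & MASK32, (x >> 32) & MASK32


def join64(lo, hi):
    """Rebuild a 64-bit value from its 32-bit halves."""
    return (hi << 32) | lo


def add64_on_32bit(a_lo, a_hi, b_lo, b_hi):
    """Add two 64-bit values using only 32-bit-wide steps."""
    lo = a_lo + b_lo                      # first add: low words
    carry = lo >> 32                      # 1 if the low-word add overflowed
    lo &= MASK32
    hi = (a_hi + b_hi + carry) & MASK32   # second add: high words plus carry
    return lo, hi


a = 0x00000001FFFFFFFF
b = 0x0000000000000001
print(hex(join64(*add64_on_32bit(*split64(a), *split64(b)))))
```

A 64-bit processor does the same work with a single add, which is where the speedup for this kind of code comes from.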

The other reason to use 64-bit processors is that it makes it easier to use 64-bit memory addressing. (For various reasons, it's a little easier to program if memory addresses are the same size as integers.) If you have more than 4 GB of RAM (or, more precisely, you want more than a 4 GB address space), then you need larger pointers. At the moment x86 programs use 32-bit pointers, but the Pentium Pro and above actually have 36 address lines, so they can use up to 64 GB of RAM. Anyway, a 4 GB address space will be fairly cramped in about 10 years, so it's time they bumped that up a bit.
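
The 4 GB and 64 GB figures are just powers of two: n address bits can name 2^n bytes. A quick sanity check, with helper names (addressable_bytes, GIB) invented for this sketch:

```python
GIB = 1 << 30  # bytes in one binary gigabyte


def addressable_bytes(address_bits):
    """Number of distinct byte addresses for a given address width."""
    return 1 << address_bits


for bits in (32, 36, 64):
    print(bits, "bits ->", addressable_bytes(bits) // GIB, "GB")
```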

Intel has an emulation mode in the IA-64 series to allow people to run existing 32-bit programs, but at the moment it's dog slow. (It runs at about the speed of a Pentium 133, if that, when the processor is running at around 700 MHz.) The IA-64 architecture is completely different from the current IA-32 (x86) stuff. I get the impression that the 32 bit emulation doesn't use as many tricks as the existing processors to get programs to run faster. They're also overhauling the motherboard/BIOS stuff that's been around for a long while. (Some of it since the original IBM PC.)

Of course, just because a processor can do 64-bit operations, it doesn't mean that it's actually faster than its predecessors. For instance, IA-64 has a few weaknesses:

It doesn't have an integer multiply instruction. You have to convert to floats and back if you don't want to program the multiply using shifts or something.
It doesn't support a floating-point type with better precision than 64 bits (called "double" in many programming languages). This makes it unsuitable for high-precision calculations. Current IA-32 chips can use up to 80 bit floating point values.
Intel seems to have tried to include every feature (except see above) but the kitchen sink in the instruction set. Loads of processor hints about instruction grouping, branch prediction, cache hints, and heaps of other stuff. This makes quite a complex design that could be difficult to implement and write really good compilers for. (Then again, Intel could always sell their own...)
And all of the space-heater comments.

Anyway, it remains to be seen what effect the above points will have on its acceptance.
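
For anyone wondering what "program the multiply using shifts" looks like, here is a minimal sketch of the classic shift-and-add method. It is illustrative Python, not IA-64 code; MASK64 and shift_add_multiply are names made up for the example, and the mask only imitates 64-bit wraparound.

```python
MASK64 = (1 << 64) - 1  # imitate a 64-bit register


def shift_add_multiply(a, b):
    """Multiply two unsigned 64-bit values using only shifts, adds and bit tests."""
    a &= MASK64
    b &= MASK64
    product = 0
    while b:
        if b & 1:                          # lowest bit of b set: add current a
            product = (product + a) & MASK64
        a = (a << 1) & MASK64              # a doubles for the next bit position
        b >>= 1                            # move on to the next bit of b
    return product


print(shift_add_multiply(6, 7))
```

The loop runs once per bit of the multiplier, which is why doing this in software is so much slower than a hardware multiply.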

Re:Compiler by nagora · 2001-09-03 21:42 · Score: 2

So you're saying that Intel should have either broken backwards compatibility, or designed a super chip twenty years ago?

Backwards compatibility did not require the retention of a tiny register set (no general purpose registers - Jesus Christ!) and was a fairly bogus concept anyway when the 386 came in.

The 386 family is a bad design and if you'd ever programmed it you'd know. There is nothing good about the design.

TWW

--
"Encyclopedia" is to "Wikipedia" what "Library" is to "Some people at a bus stop"

Re: "HyperThreading" in IA-64 by 2002 by arri · 2001-09-03 22:10 · Score: 1

Intel stole and then implemented Alpha technologies for its Pentium, and only much later did it negotiate with Digital to get the official right to use that stuff.

Nope, this one is definitely a bad line. The story goes back into the mists of time, and what actually happened is rather confusing: at some point Digital sued Intel for patent infringement and everyone started shouting "Pentium copies Alpha". The apparent truth, as told to me by a Digital chap at the time, is that the VAX CPU cache design was copied in the Pentium Pro.

Digital sued and the settlement was that Digital sold its networking division to Intel for an undisclosed but not trivial sum and an oldish fab (with outdated lithography equipment) and they left it at that (including the fact that the PPro line was EOL'd). This is why the old DEC Tulip network cards started appearing as Intel parts.

--Arrigo

Re:Compiler by DivineOb · 2001-09-03 23:15 · Score: 1

Actually all processors retire instructions in order... some of them just execute them out of order...

--

I must burn in hell, suffer and pay for my sins
But Gods the one who's losing, Satan always wins!

Backyard Foundry - Intel Inside! by BigBlockMopar · 2001-09-04 01:47 · Score: 2

if you dont have the cash for the kilowatts,

Dude. 130 watts of power dissipation. My 17" monitor only draws 125 watts. What's the surface area of the packaged chip?

Forget the old 5V Pentiums (P60/66) being nicknamed "coffee warmers". They were known for all sorts of overheating problems, but they only drew 3.2 amps at 5V. P = I x E = 3.2 x 5 = 16 watts of power.

I could use one of these new chips for the heater in my backyard foundry.

There's soon gonna be a boom market for tungsten and ceramic heat sinks.

Sheesh.

--
Fire and Meat. Yummy.

64-bit operations useful...sometimes by Junks+Jerzey · 2001-09-04 01:55 · Score: 2

I'm sure many people can appreciate 64 bit integer ops; for me, it means single instruction xor for the 64 bit hash codes used in chess transposition tables.

Yes, 64-bit operations have a handful of general uses, but when you weigh the benefits against the huge increases in transistor count, power consumption, and memory usage, are they worth it? I argue that they aren't. Doubling the size of almost every unit on the chip is a steep price to pay.
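
For context on the quoted use case: chess programs commonly build those 64-bit hash codes with a scheme known as Zobrist hashing, where every (piece, square) pair gets a random 64-bit key and a position's hash is the XOR of the keys for all pieces on the board. The sketch below assumes that scheme; the table layout, the seed and the function names are invented for illustration.

```python
import random

PIECES = 12    # 6 piece types x 2 colours (illustrative numbering)
SQUARES = 64

rng = random.Random(2001)  # fixed seed so the key table is reproducible
ZOBRIST = [[rng.getrandbits(64) for _ in range(SQUARES)] for _ in range(PIECES)]


def hash_position(placements):
    """Hash a position given as (piece, square) pairs by XORing their keys."""
    h = 0
    for piece, square in placements:
        h ^= ZOBRIST[piece][square]
    return h


def move_piece(h, piece, from_sq, to_sq):
    """Update a hash incrementally: XOR the piece out of one square, into another."""
    return h ^ ZOBRIST[piece][from_sq] ^ ZOBRIST[piece][to_sq]


start = hash_position([(0, 12), (6, 52)])
after = move_piece(start, 0, 12, 28)
print(after == hash_position([(0, 28), (6, 52)]))
```

Each move costs a couple of 64-bit XORs, so on a 32-bit chip every update is two instructions where a 64-bit chip needs one.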

Is Intel fighting the Laws of computation? by SysKoll · 2001-09-04 03:06 · Score: 1

There is a famous naysayer who claims that IA-64 is doomed to be an underperformer. The problem with this naysayer is:

1. He's often hard to understand.
2. He's dead.

His name is Alan Turing.

I'll shamelessly quote myself here. Here's an excerpt of an article
in The Register in which I said:

The problem is that the only way to predict what a program
will do is to run it (that's a consequence of Turing's
theorem). A classical processor executes the program code and
knows what happens with the current data (loops, jumps, etc). It can
infer locality relationship from its instruction flow and make
on-the-fly optimizations. Meanwhile, the compiler of a non-optimizing
VLIW processor such as the IA-64 can only look at the code beforehand
and make assumptions about what kind of data that code will
process. Turing's theorem says that the only way to know what a
program will do with a given data set is to execute the program.

Thus the theorem predicts that the IA-64 pre-compiled
optimization will always be inferior to a good on-the-fly RISC
processor. At best, the precompiled optimization will yield an optimal
path for a given set of assumptions, which may not be true on every
run, since run-time data can defeat these assumptions.

I'd gladly be proved wrong. Can someone please tell me why Turing
is wrong and Intel right? Or is Intel fighting an uphill battle
against Uncle Alan's laws?
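
A toy model of the narrower point in the excerpt, that a guess fixed at compile time can be defeated by run-time data. It compares a branch guess that never changes with a 2-bit saturating counter that adapts while the program runs. This is only a sketch under made-up assumptions (one branch, two invented data sets, hypothetical function names); it says nothing about how Itanium or any real chip actually predicts branches, and it does not settle the question asked above.

```python
def static_hits(outcomes, assume_taken):
    """Correct guesses when the guess is fixed before the program runs."""
    return sum(1 for taken in outcomes if taken == assume_taken)


def dynamic_hits(outcomes):
    """Correct guesses from a 2-bit saturating counter updated at run time."""
    counter = 2   # range 0..3; values 2 and 3 mean "predict taken"
    hits = 0
    for taken in outcomes:
        if (counter >= 2) == taken:
            hits += 1
        # Nudge the counter toward what the branch actually did.
        counter = min(counter + 1, 3) if taken else max(counter - 1, 0)
    return hits


mostly_taken = [True] * 95 + [False] * 5       # matches the compile-time guess
mostly_not_taken = [False] * 95 + [True] * 5   # run-time data defeats the guess

for name, data in (("mostly taken", mostly_taken),
                   ("mostly not taken", mostly_not_taken)):
    print(name, static_hits(data, True), dynamic_hits(data))
```

The fixed guess is right 95 times out of 100 on the first data set and only 5 times on the second, while the counter stays in the high 90s on both.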

--
Mad science! Robots! Underwear! Cute girls! Full comic online! http://www.girlgeniusonline.com/

Re:Is Intel fighting the Laws of computation? by SysKoll · 2001-09-05 02:18 · Score: 1

Thanks for the explanation. I know that some modern compilers include runtime feedback to reoptimize the code as more profiling data points are collected. But I am not aware that these optimizations are available in the IA-64 compilers. A cursory check of Intel's website did not reveal anything conclusive.
HP might make them available for the IA-64 line, but their R&D is now in considerable turmoil due to the acquisition of Compaq whose own R&D dept will have to be integrated with HP's, forcing a major reorg. So I'll not bet on anything smart coming from this side of HP for a few months.
Can someone point to an announcement or paper showing that IA-64 compilers are using or will use that kind of dynamic optimization?
-- SysKoll

--
Mad science! Robots! Underwear! Cute girls! Full comic online! http://www.girlgeniusonline.com/

This is outside the x86 realm by Junks+Jerzey · 2001-09-04 04:57 · Score: 2

The Itanium is not a clear replacement for the x86 line by any means. If we're going to toss the x86 architecture completely, then there are lots of options: PowerPC, StrongARM, Alpha, SPARC, something else. Now switching the entire PC world to a SPARC chip sounds crazy, but it's not any crazier than switching to Itanium.

For the record, Intel has cooked up x86 "replacements" before, like the i860 and i960.

Bigger, not faster by Weasel+Boy · 2001-09-04 05:50 · Score: 1

I pretty much agree with everything doug363 said. I'll sum it up: Bigger, not faster. The difference between 32-bit and 64-bit chips is the chunks are bigger. If a 32-bit chip is juggling tennis balls, the 64-bit chip is juggling softballs. In a given amount of time, both chips toss about the same number of balls; the 64-bit ones just represent bigger numbers. That's it.

You can do 64-bit computation with a 32-bit chip, but then you _do_ take a huge performance hit. If you're not trying to fake a 32-bit processor into doing 64-bit computation (i.e., you're programming each processor in its native mode), then from the programmer's perspective it's exactly the same.

If it sounds like I'm trying to downplay the benefit of 64-bit chips, I am. The only time you benefit from a 64-bit architecture is when you need to use really huge numbers (e.g., scientific or cryptographic computing) or access really huge data (e.g., databases and suchlike).

Fortunately for makers of 64-bit chips, there are a lot of scientists and databases out there.

X86-64 by dpilot · 2001-09-05 12:01 · Score: 2

Ever read "Soul of a New Machine"?

One goal of the protagonists was to have the architecture extensions be clean, and if there was a wart, it would be the legacy part. After this topic came up, I took a quick look at some X86-64 stuff, and it looks as if AMD may have done just that. The 8 new GPRs are really GPRs, and I suspect the whole batch of 16 64-bit GPRs really are GPRs. It may be a cleaner 64 bit machine than it was 32 bit. I hope so.

Actually, I had to learn 8080 pretty thoroughly in college, learned a fair amount of 8086, less 80286, and by the time 80386 came around, was pretty well ensconced in HLLs. So I can't speak very authoritatively on that side of it.

--
The living have better things to do than to continue hating the dead.
