Slashdot Mirror


Porting Linux Software to the IA64 Platform

axehind writes "In this Byte.com article, Dr Moshe Bar explains some of the differences between IA32 and IA64. He also explains some things to watch out for when porting applications to the IA64 architecture."

160 comments

  1. Awesome! by Wakko+Warner · · Score: 3, Funny

    Now I, and the other two IA64 users, will have some programs to run on our Linux-64 boxes!

    Can someone please port nethack for us?

    - A.P.

    --
    "Remember when the U.S. had a drug problem, and then we declared a War On Drugs, and now you can't buy drugs anymore?"
    1. Re:Awesome! by morbid · · Score: 0

      Poor you. Maybe you can swap it with someone for an electric storage heater or a gas central heating boiler? The latter would be cheaper to run.

      --
      I'm out of my tree just now but please feel free to leave a banana.
    2. Re:Awesome! by rcw-home · · Score: 3, Informative
      Can someone please port nethack for us?

      Maybe you could try the patches here?

    3. Re:Awesome! by conway · · Score: 1

      This might be a huge surprise to you, but a very large percentage of Linux apps (over 90%) port to linux ia-64 without any modifications.
      This is largely thanks to the fact that linux already runs on 64-bit architectures -- Alpha, Sparc, etc. and most apps have been adapted to that already. There's not much conceptual difference in the high-level programmer's view between IA64 and any other 64-bit linux platform.

  2. I'm a wacked out dude by Anonymous Coward · · Score: 0, Insightful

    AND IT FEELS GOOD!

    1. Re:I'm a wacked out dude by morbid · · Score: 0

      You been sniffing the fumes the itanic gives off? I mean it runs so hot it must melt something somewhere...

      --
      I'm out of my tree just now but please feel free to leave a banana.
  3. Major difference by Anonymous Coward · · Score: 1, Funny

    The major difference between IA32 and IA64 is price.

  4. There are a number of adjustments to make. by Anonymous Coward · · Score: 1, Funny

    IA64 is twice as wide as IA32. Therefore, it will be necessary to remember to halve the size of all variables to compensate in your programs. Additionally, we now have to type twice as much for each command or function. It really sucks that we will no longer see Ms. Portman onscreen in Star Wars anymore. So, in conclusion, nuts to IA64: I'm sticking with my Athlon, thank you very much.

  5. MOSIX + porting by Ed+Avis · · Score: 3, Funny

    Well obviously what we'll see next is a kernel extension that dynamically 'ports' all your applications to IA-64 and transparently migrates them to IA-64 machines elsewhere in the cluster. When Intel's next Great Leap Forward is released, you'll be able to transparently migrate to that as well. In fact it will be so transparent, you won't notice any difference and you can continue working at your 80286-based machine without any interruption.

    --
    -- Ed Avis ed@membled.com
    1. Re:MOSIX + porting by morbid · · Score: 2, Funny

      Last summer at the London Linux Expo I asked this HP reseller (who had a big itanic display) "what about legacy code?"

      He replied,"16-bit code?"

      I sighed and moved on...

      --
      I'm out of my tree just now but please feel free to leave a banana.
    2. Re:MOSIX + porting by gr · · Score: 2

      Unlikely.

      Migration of a running process, even when going between identical processors, is expensive. Going even to a similar processor would be more so. (And going from, say, a Sparc to a m68k is totally out of the question, not that you're suggesting that.)

      It's *really* hard to justify a policy of process migration in a cluster except with extremely long-running, massively parallel jobs. For most stuff, you'll waste less time just letting it finish. (GLUnix does do process migration. Note that when you come back to a workstation that's been horfed by GLUnix, you'll be waiting about two minutes before you get your UI back.)

      As for *starting* IA32 binaries on an IA64 processor, that's doable, but most cross-platform clustering systems function by keeping binaries for all their constituent processor types and having a hacked shell to convert PATH to the architecture-dependent path. (And by "most cross-platform clustering systems", I mean most that have been designed, since I know of none that work.)

      --
      Do you have a /. uid shorter than five digits? No? Then piss off.
    3. Re:MOSIX + porting by Salsaman · · Score: 2
      I don't think automatic porting would be possible.

      You'd have to have a compiler that was smart enough to recognise when a pointer was cast to an int and then instead cast it to a long.

      But now your code has changed, instead of a variable being an int, it's now a long - this is bound to cause problems elsewhere in your code !

      I learnt this lesson long ago when somebody tried to compile a C program I'd written on an Alpha machine, and it complained about casting pointers to ints (I'd wrongly assumed pointers and ints would be the same size on every architecture).

      What I do now is to typedef a pointer_t which can be either int or long, and make sure to use that everywhere pointer arithmetic is required.
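
      A minimal sketch of that typedef idea, assuming a C99 compiler that provides the optional uintptr_t type in <stdint.h>. The names pointer_t and byte_distance are made up for illustration, and the arithmetic is only meaningful on machines with a flat address space:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical name taken from the comment above. uintptr_t is the C99
   integer type intended for round-tripping a pointer through an integer,
   so the typedef does not have to be switched by hand per architecture. */
typedef uintptr_t pointer_t;

/* Byte distance between two pointers into the same object, computed on
   integers. Assumes a flat address space (true on IA32 and IA64 Linux). */
static size_t byte_distance(const void *lo, const void *hi)
{
    pointer_t a = (pointer_t)lo;
    pointer_t b = (pointer_t)hi;
    assert(b >= a);            /* caller must pass lo <= hi */
    return (size_t)(b - a);
}
```

      Called as byte_distance(&a[0], &a[3]) on an int array, this should give 3 * sizeof(int) whether pointers are 32 or 64 bits wide.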

    4. Re:MOSIX + porting by Anonymous Coward · · Score: 0

      Any compiler could emit a warning when a pointer cast discards information.

    5. Re:MOSIX + porting by rew · · Score: 2

      What I do now is to typedef a pointer_t which can be either int or long, and make sure to use that everywhere pointer arithmetic is required.

      Ouch.

      First: Keep pointers in pointer variables. Try not to cast them back and forth to integer variables.

      If you have to, use longs. The C standard requires a long to be able to hold a pointer.

      Roger.

    6. Re:MOSIX + porting by gr · · Score: 2
      I learnt this lesson long ago when somebody tried to compile a C program I'd written on an Alpha machine, and it complained about casting pointers to ints (I'd wrongly assumed pointers and ints would be the same size on every architecture).
      This is a really easy problem to solve: follow the POSIX standards on type names. If you want a u_int32_t, say so. If you want a pointer-to-something, declare it that way, rather than trying to stuff it in some architecture-dependently-sized variable.

      (Note that NetBSD's code is primarily arch-independent--the dependent stuff is mostly hardware initialization--and it compiles just fine on quite a wide array of processor architectures.)
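
      As a rough illustration of the "say what size you want" advice, here is a sketch using the C99 <stdint.h> spelling uint32_t (the u_int32_t name above is the older BSD spelling of the same idea). The struct record_header and read_be32 names are hypothetical, not from any real code base:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical record header: each field is exactly 32 bits wide,
   so the layout is the same on IA32 and IA64. */
struct record_header {
    uint32_t magic;
    uint32_t length;
};

/* Decode a 32-bit big-endian value from a byte buffer, independent of
   the host's word size and byte order. */
static uint32_t read_be32(const unsigned char *p)
{
    assert(p != 0);
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}
```

      Had the fields been declared as long, the header would be 8 bytes on IA32 and 16 bytes on IA64 Linux, which is exactly the kind of surprise being warned about.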
      --
      Do you have a /. uid shorter than five digits? No? Then piss off.
    7. Re:MOSIX + porting by Salsaman · · Score: 1
      That's fine unless you're doing low level memory work, and you want to do integer (or long arithmetic) operations on pointer types.

      I was writing my own binary search array handling routines and I needed to return signed values and so on.

      But thanks for the tip about using longs :-)

  6. Key difference by Anonymous Coward · · Score: 0

    The key difference between IA32 and IA64 is not 32 or 64 bit technology. It is price/performance, as Intel is sadly aware of.

  7. Just learning assembly now by El_Nofx · · Score: 1

    I figured this would be coming now, I just started my first Assembly class as a CS undergrad, a whole new group of registers to memorise!

    --
    It's not the OS it's the user that sucks. If it's user friendly, you get stupider people. - clinko
    1. Re:Just learning assembly now by Tower · · Score: 1

      I sincerely hope that your assembly class won't be using x86 (or any derivative thereof)... a nice 6811/68332/PowerPC would be far more useful as a learning tool without the cruft... PowerPC assembly is actually fun...

      --
      "It's tough to be bilingual when you get hit in the head."
    2. Re:Just learning assembly now by El_Nofx · · Score: 1

      The Instructor actually brought that up the first day, he said in the past there has been demand for a PPC version of the class, but since each platform has its own unique instruction set there would be no overlap and you would just have to learn the language all over again for Intel coding. They mostly go on the demand of the market, (they just switched their main taught language from c/c++ to java) so they pretty much just teach the x86 version now.

      I would have to agree with them that there would be a lot more demand for someone programming assembly on an Intel box than on a Mac.

      I would say what we have done so far is fun though. Any programming can be if you make it.

      --
      It's not the OS it's the user that sucks. If it's user friendly, you get stupider people. - clinko
    3. Re:Just learning assembly now by Anonymous Coward · · Score: 1, Funny

      So, you learn x86 assembly and Java. I guess you'll do XOR in the long run, eh?

    4. Re:Just learning assembly now by Tower · · Score: 1

      True, it can all be fun. I think the register set of the PPC lends itself to some more creative solutions to some problems, and when you look at low level programming in assembly, much of the work is in the embedded space, where there are a *ton* of PowerPCs (and Motorola chips). I wasn't thinking as much about the PC/Mac situation.

      --
      "It's tough to be bilingual when you get hit in the head."
    5. Re:Just learning assembly now by Wildcat+J · · Score: 2, Insightful
      When I was in college, the only assembly programming we did was for MIPS. For our compiler project, we originally put out MIPS assembly and then retargeted it for the Sparc. I never once had to do any x86 assembly in school.

      There's really not that much demand for any assembly in the industry at large. Even microcode is being done in high-level languages these days. I would wager that most of the people doing assembly coding now are in highly specialized fields, especially embedded programming. So, there isn't necessarily any more demand for x86 assembly programmers than for any other (possibly non-standard) architecture. In my opinion (and this is only opinion), while you should learn an assembly language in school to understand the basic building blocks, the choice of architecture isn't crucial. However, since it's not crucial to learn one or the other, I think they should stick with a simple one. x86 is kind of a mess; MIPS was easy to learn. As far as access to the hardware goes, there are simulators for most processors, which is sufficient for education.

      -J

    6. Re:Just learning assembly now by drewness · · Score: 1

      I'm taking an assembler class myself right now. The real point of taking an assembler class anymore is to help you understand how computers work at a lower level, so you make better decisions programming in a higher level language. Very few programs should need to have asm anymore. Even linux kernel drivers are mostly written in C.

      The x86 is an odd choice if that's the goal, because it is just kludge upon kludge trying to make an 8 bit processor be 16 bit, then 32, and now 64. I don't know any x86 asm, but it is rather wonky and makes you jump through some hoops, as I am told.

      At OSU we are learning SPARC asm. When Sun went from 32 to 64 bit I think that for the most part they just had to change all the register sizes to 64 bit, because it was designed with the future a little bit more in mind than the x86. I'm just taking a really basic class (it's actually called "Introduction to Computer Systems"), so we aren't going to deal with things like the differences between a SPARC and UltraSPARC, but like I said it is apparently an easy transition. I'd imagine that the PPC is probably easy too. (Both are 32bit bigendian with the possibility of 64bit in the future designed in, I think)

    7. Re:Just learning assembly now by tzanger · · Score: 2

      I would wager that most of the people doing assembly coding now are in highly specialized fields, especially embedded programming.

      As an embedded systems designer I can tell you that even here in the embedded world, x86 assembly is nowhere to be found, except for maybe in the low-level init. Even there, though, it's used to get the environment ready for C and calls a C function to start all the real work, very much in the same manner as the Linux kernel source shows.

      Assembly programming is everywhere in the embedded world, just not x86 or anything powerful enough to be able to use a C compiler. I routinely do large Microchip PIC systems entirely in assembler, but that's only because of one of two reasons: they're not suited for C (the 18Cxxx is a different story now), or I need every last word of program and data space.

    8. Re:Just learning assembly now by Anonymous Coward · · Score: 0

      > When Sun went from 32 to 64 bit I think that for the most part they just had to change all the register sizes to 64 bit

      Welcome to the world of Intel i386, and AMD x86-64.

    9. Re:Just learning assembly now by Anonymous Coward · · Score: 0

      OSU has one fucked up CS department. May God have mercy on your soul.

  8. What's the deal with IA64? by ArchMagus · · Score: 1

    Isn't that the instruction set of the Itanium processor that isn't selling worth crap? I was under the impression that intel was going to eventually drop (or push to a back burner) support for this and go with x86-64 (the AMD 64 bit architecture being rolled out with the Opteron.)

    1. Re:What's the deal with IA64? by Cheeko · · Score: 1

      Hardly. HP and Intel are pushing full speed ahead with these. Supposedly there will be commercial systems by the end of the year. Also if IA64 was to be pushed back, Intel would likely switch to its own 386-64 architecture, currently codenamed Yamhill, if I recall.

    2. Re:What's the deal with IA64? by Anonymous Coward · · Score: 0

      Stick a fork in it!

      Intel can't stick with IA64 now that AMD is rolling out their 64bit chips. They'd just fall too far behind the curve.

      After all the IA64 chips are too expensive and too slow.

    3. Re:What's the deal with IA64? by NanoGator · · Score: 2, Interesting

      "Intel can't stick with IA64 now that AMD is rolling out their 64bit chips. They'd just fall too far behind the curve."

      Yeah, I mean it's not like Intel knows how to develop chips or stay in business or anything.

      --
      "Derp de derp."
    4. Re:What's the deal with IA64? by Cheeko · · Score: 0

      IA64 and AMD's 386-64 don't even compete for the same market. One is a high-end chip to replace the big iron RISC chips, while the other is a chip for low end intel servers that currently run on IA32, but could benefit from an increased address space.

    5. Re:What's the deal with IA64? by guacamole · · Score: 1

      The current generation of IA64 is not really meant for the general public. It is useful only for early adopters (that is, developers). We'll be able to tell whether IA64 succeeded or not a few years down the road when it is somewhere in its third generation.

    6. Re:What's the deal with IA64? by Master+Bait · · Score: 2, Informative
      Intel hasn't made any announcements about their Yamhill, and HPQ still seems to think that IA64 is a go. The new(!) Itanium II is supposed to make this pathetic architecture up to 50% faster. Then it will have integer op performance comparable to today's fastest Celeron.

      Look for Sun and/or IBM to be selling 8-way Hammer machines by this time next year, according to my Spirit Guides.

      --
      "Only in their dreams can men truly be free 'twas always thus, and always thus will be."
      --Tom Schulman
    7. Re:What's the deal with IA64? by Anonymous Coward · · Score: 1, Insightful

      Look, IA64 and AMDs 64 bit instruction set are two very different things. One will succeed and one will fail, if the market doesn't dictate this Microsoft will. The IA64 products may never reach the performance of the competing chips and the price to performance ratio will NEVER touch that of the AMD 64 chips.

      Give me one reason anyone will care about the IA64 chips if cheaper faster 64bit chips will already be out.

      IA64 is significantly more expensive than the problem it was trying to solve. Oops.

    8. Re:What's the deal with IA64? by Anonymous Coward · · Score: 0

      "Pathetic architecture?"

      First people rave about how bad the X86 is and how much we need to move to something completely different, then Intel comes out with an awesome new architecture and everyone chickens out in favor of some crappy 64-bit extension to the X86?

      Itanium's performance may be disappointing, but the chip has a LOT of potential. Far more than X86-64 has in the long run.

      Itanium floating point performance was actually quite good. I think the chip hardware will improve very much within a few revisions and performance will be much, much better because of that and better compilers.

      VLIW-style instruction level parallelism isn't easy to code, but if done properly, the processor can really fly.

    9. Re:What's the deal with IA64? by Anonymous Coward · · Score: 0

      1. It's x86-64, not 386-64. Building a 386 with 64-bit processing power is sorta weird. It would be the equivalent of 886 (and not 8086 :P).
      2. Under intel's current plans, they will switch to ia64 desktop chips eventually. I doubt that they'll ever get far enough with ia64 tho. It'll probably be scrapped for Yamhill.

    10. Re:What's the deal with IA64? by Anonymous Coward · · Score: 0

      >First people rave about how bad the X86

      True enough, it's not good. However, ia64 is not what we looked for.

      > Is and how much we need to move to something completely different, then Intel comes out with an awesome new architecture and everyone chickens out in favor of some crappy 64-bit extension to the X86?

      This is how the market works. Intel should have learned this 15+ years ago.

      > Itanium's performance may be disappointing, but the chip has a LOT of potential. Far more than X86-64 has in the long run.

      How so? The x86 has progressed A LOT from the 8086/8088. Whenever people thought that there was a block to future development, it has always pulled through. Right now nobody even thinks that x86 can't go on.

      > Itanium floating point performance was actually quite good. I think the chip hardware will improve very much within a few revisions and performance will be much, much better because of that and better compilers.

      It is much, much, much worse than current generation x86 chips. This shows ia64's worth.

      > VLIW-style instruction level parallelism isn't easy to code, but if done properly, the processor can really fly.

      I work for a really large company that makes a very ubiquitous operating system (ok, ok, didn't want to mention the word on slashdot, but I work for Microsoft). Most of us developers are not that happy with ia64. It's simply not pragmatic in the way i386 always has been. Management is more ambivalent about it, but support has died down quite a bit over the last six months (in comparison to as recently as a year ago, when ia64 was a hot ticket item around here).

      You can argue that ia64 is the better architecture to hell and back, but until you convince programmers of that, it'll fail in the general non-server market.

    11. Re:What's the deal with IA64? by Anonymous Coward · · Score: 0

      The info revealed on Itanium 2 is that it will run typical applications between 1.5x and 2x faster than Itanium.

      Oh and I know what the real number is for my application. Obviously I can't tell. You were trolling so you wouldn't care about the truth anyway :-)

    12. Re:What's the deal with IA64? by Anonymous Coward · · Score: 0

      "Most of us developers are not that happy with ia64. It's simply not pragmatic in the way i386 always has been."

      Most developers aren't happy with the fact they can't get their compilers performing well enough. But progress is being made.

      I think the IA-64 has the potential to be exactly what is needed in an X86 replacement. Its performance isn't necessarily worse than X86, either.

      Last time I saw SPECint marks, an 800MHz IA-64 actually outperformed an Athlon 1.4GHz in the SPECint-per-MHz department. Sure, such benchmarks are unreliable, and even more so when you try to get per-MHz figures, but really, the IA-64 wasn't doing that bad.

      Floating point is where it shines. If IA-64 doesn't capture the desktop, you can bet it will bury IBM's POWER series and the Alpha (which is dead anyway.) Apparently the people at CERN are quite impressed with what IA-64 can do with their high end floating point number crunching software -- supposedly it beats their Alpha machines.

      What IA-64 isn't doing well at is simple integer based desktop stuff like web browsing and word processing. The IA-32 compatibility is also painfully slow.

      However, where it really matters, Itanium II might really shine. With much more cache, faster RAM, and some architectural bottlenecks nullified, you can bet this processor is going to smoke the competition if clocked high enough.

      Itanium also scales very well.

      X86-64 really can't compete in the high end market. It will put up a fight on desktops, however. Intel is going to make a strong showing with Yamhill or IA-64 simply because of its clout.

      I still think X86 is dying and eventually even the desktop world will have to give it up. Given a little time, IA-64 can mature into something that will blow X86-64 out of the water. The market is going to want it eventually.

      Anyway, the original point of my post was just to dispute the fact that IA-64 is a "piece of crap architecture." I think most would agree that it's a very nice and interesting architecture with a lot of potential for performance, if not for success in the market place.

    13. Re:What's the deal with IA64? by Master+Bait · · Score: 3, Informative
      I think the IA-64 has the potential to be exactly what is needed in an X86 replacement. Its performance isn't necessarily worse than X86, either. Last time I saw SPECint marks, an 800MHz IA-64 actually outperformed an Athlon 1.4GHz in the SPECint-per-MHz department. Sure, such benchmarks are unreliable, and even more so when you try to get per-MHz figures, but really, the IA-64 wasn't doing that bad.

      What SPEC needs to benchmark is SPECint-per-$. Considering that commodity Athlons, Pentiums, Celerons and Durons handily beat the extremely expensive Itanic in a straight SPECint benchmark, what's the advantage of the IA64 performing more efficiently per MHz?

      What IA-64 isn't doing well at is simple integer based desktop stuff like web browsing and word processing. The IA-32 compatibility is also painfully slow.

      It was very silly of Intel to graft a 386 unit onto the IA64 chip, that's for sure. Fast int ops are important for running databases. They are essential in supporting that 64-bit I/O.

      However, where it really matters, Itanium II might really shine. With much more cache, faster RAM, and some architectural bottlenecks nullified, you can bet this processor is going to smoke the competition if clocked high enough.

      That's been Intel's promise since they announced the chip project many, many, many years ago. They also promised that the chip would be inexpensive. It isn't very fast, and it isn't a good value compared to today's 32-bit commodity CPUs.

      Itanium also scales very well.

      From what I've read, the Itanic scales in a way very similar to the Hammer -- 8 CPUs at a time and if you want more then you have to run a pipe between each group of eight. Hammer claims a Hypertransport link between each set with a one cycle wait state (Intel simply calls theirs a pipe), but really, anything more than 8-way is still going to be the realm of POWER4, UltraSparc, etc. IMO. To tell the truth, the Itanic and the X86-64 will have very similar scalability, the x86-64 is less than half the die size of the Itanic and better performing. Its NUMA setup gives greater throughput between multiple CPUs in an 8-way or less. It may be ugly on the inside, but both CPUs do about the same thing. And one will be faster and a whole lot cheaper. And don't forget AMD's 4-way chipset. The Taiwanese motherboard makers are going to be moving into that space with this chipset. Commoditization.

      X86-64 really can't compete in the high end market. It will put up a fight on desktops, however. Intel is going to make a strong showing with Yamhill or IA-64 simply because of its clout.

      Well, just take a 32-bit commodity CPU and kludge it to 64 bits, gain about 25% speedup in doing so and SELL IT FOR AROUND $400 maximum and you will quickly see that the Itanic is sinking! Sure the x86 instruction set is lame, but that's the roll of the dice. If the Motorola 68000 had been chosen by IBM for the PC, we would be singing the same tune. I think the x86 instruction set will be around ad infinitum. Just like the accelerator pedal is on the right side, the clutch is on the left and the brake pedal is in the middle. Totally arbitrary, but it somehow stuck.

      Anyway, the original point of my post was just to dispute the fact that IA-64 is a "piece of crap architecture." I think most would agree that it's a very nice and interesting architecture with a lot of potential for performance, if not for success in the market place.

      The Itanic wasn't a piece of crap 5 years ago, but it is obsolete today. Intel raves about its "266mhz" memory bus and its 66mhz-64-bit PCI support. You can get this in a commodity motherboard and two Athlon CPUs for around $600. You can get the Pentium 4 with 133mhz X4 quad-pumped memory bus nowadays. The Itanic's parallel execution method is nice, but why did they wait till the CPU was released before they began making compilers that took advantage of this? Completely useless without the right tools (assuming decent tools can be made).

      --
      "Only in their dreams can men truly be free 'twas always thus, and always thus will be."
      --Tom Schulman
  9. Even Better by Anonymous Coward · · Score: 0

    Even more exciting is porting Linux to the N64 platform!

  10. size_t by $pacemold · · Score: 2, Informative

    Oh please.

    return (char *) ((((long) cp) + 15) & ~15);

    is not portable.

    return (char *) ((((size_t) cp) + 15) & ~15);

    is much better.

    1. Re:size_t by morbid · · Score: 1, Informative

      Sad isn't it?
      What he doesn't mention, is that most Linux people have gcc, and last time I looked, the object code produced by gcc on IA64 was +20% of the speed of the intel compiler. This isn't a criticism of gcc, it's just that the IA64 arch. is so different that you absolutely _must_ have the intel compiler to get any performance out of it.

      --
      I'm out of my tree just now but please feel free to leave a banana.
    2. Re:size_t by Karel+Capek · · Score: 1

      Actually, you probably want to use ptrdiff_t

    3. Re:size_t by Bert64 · · Score: 1

      This is the case on every architecture, gcc massively underperforms compared to a vendor compiler.. x86 is the architecture where the difference is the smallest, and it's still significant.

      --
      http://spamdecoy.net - free throwaway anonymous email - avoid spam!
    4. Re:size_t by Brainchild · · Score: 1
      Actually, you probably want to use ptrdiff_t

      No. ptrdiff_t is a signed type. cp is a pointer, and hence an unsigned type. size_t is the correct type to use for the typecast.

      --

      :: "I am non-refutable." --Enik the Altrusian ::

    5. Re:size_t by Anonymous Coward · · Score: 0

      There's no guarantee that intmax_t is large enough to store a pointer, much less size_t. If you want to do arithmetic on pointers to arbitrary data, use (char*).

    6. Re:size_t by Isle · · Score: 1

      Not any longer, with icc 6 the margin is now the same on x86 as on most platforms.
      Of course the largest margin is on the alpha platform where cxx outperforms gcc by 4x to 5x on floating point and 2x-3x on integer. Ouch!

    7. Re:size_t by Anonymous Coward · · Score: 0

      Neither size_t nor ptrdiff_t is guaranteed to be wide enough. Use intptr_t or uintptr_t.
      They are optional types, but if a C99 implementation does not support them, it probably doesn't have any suitable type and you at least get a clear compiler error.

      This still won't work if the mapping between pointers and integers is something weird, like having a byte-of-word index in the most significant bits.
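
      For what it's worth, a sketch of the round-up-to-16 snippet from the top of this thread rewritten with uintptr_t, assuming a C99 compiler that provides it and a flat address space. align_up is a made-up name, and the power-of-two check is an addition the original one-liner did not have:

```c
#include <assert.h>
#include <stdint.h>

/* Round cp up to the next multiple of align, which must be a power of two.
   The mask is built at uintptr_t width, so no high address bits are lost
   regardless of how wide int or long happen to be. */
static char *align_up(char *cp, uintptr_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    uintptr_t addr = (uintptr_t)cp;
    addr = (addr + (align - 1)) & ~(align - 1);
    return (char *)addr;
}
```

      align_up(cp, 16) then plays the role of the (long) and (size_t) versions above. As already noted, none of this helps on a machine with a weird pointer-to-integer mapping.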

    8. Re:size_t by Bert64 · · Score: 1

      Yes, if only linux or freebsd and associated libraries/tools were compilable using ccc, we alpha users would get VASTLY superior performance immediately..
      Performant code isn't just important for heavy duty data processing applications, recompiling the entire system with better optimization can HUGELY improve the usage experience at no extra hardware cost.

      Basically, gcc SUCKS, but not half as much as software which forces you to use it, by making use of "embrace and extend" nonstandard features.

      --
      http://spamdecoy.net - free throwaway anonymous email - avoid spam!
  11. The more things change ..... by binaryDigit · · Score: 3, Informative

    Ah, porting to homogeneous isa but with a bigger word size. Funny how it's the same old issues over and over again. Structs change in size, bad assumptions about the size of things such as size_t, sizeof(void *) != sizeof(int) (though sizeof(void *) == sizeof(long) seems to be pretty good at holding true here), etc. Of course now there are concerns about misaligned memory accesses, which on IA32 was just a performance hit. Most IA32 types are not used to being forced to be concerned about this (of course many *NIX/RISC types are very used to this).
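
    A small sketch of two of those issues, under the assumption of a typical ILP32 vs LP64 Linux setup. struct node and load_u32_unaligned are hypothetical names chosen for illustration:

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical struct: the pointer member grows from 4 to 8 bytes on an
   LP64 target, and padding after id typically grows the struct further. */
struct node {
    int   id;
    void *next;
};

/* Read a 32-bit value from a possibly misaligned address. Using memcpy
   lets the compiler pick a safe access sequence, instead of dereferencing
   a misaligned uint32_t pointer directly. */
static uint32_t load_u32_unaligned(const void *p)
{
    uint32_t v;
    assert(p != NULL);
    memcpy(&v, p, sizeof v);
    return v;
}
```

    On a typical IA32 Linux build sizeof(struct node) comes out as 8, and on IA64 as 16, so anything that wrote the struct to disk or sent it over the wire with the old size breaks.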

    When things were shifting from 16 to 32 bit (seems like just yesterday, oh wait, for M$ it was just yesterday), we had pretty much the same issues. Never had to do any 8 -> 16bit ports (since pretty much everything was either in BASIC, where it didn't matter, or assembler, which you couldn't "port" anyway).

    Speaking of assembler, I guess the days of hand crafting code out of assembler are really going to take a hit if IA64 ever takes off. The assembler code would be so tied to a specific rev of EPIC, that it would be hard to justify the future expense of doing so. It would be interesting to see what type of tools are available for the assembler developer. Does the chip provide any enhanced debugging capabilities (keeping writes straight at a particular point in execution, can you see speculative writes too?). It'd be cool if the assembler IDE could automagically group parallelizable (is that a word?) instructions together as you are coding.

    1. Re:The more things change ..... by CFN · · Score: 4, Informative

      Well, the days of hand-crafted assembly, except for a few special purposes, have long since passed. And no one expects assembly writers to be competitive with the compiler's ability to discover and exploit ILP.

      But the example you mention won't actually cause assembly writers any problems: the code won't be tied to a specific version of EPIC.

      The IA-64 assembly contains so-called "stop bits", which specify that the instruction(s) following the bit cannot be run in parallel with those before the bit.
      Those bits have nothing to do with the actual number of instructions that the machine is capable of handling.
      For example, if a program consisted of 100 independent instructions, the assembly would not contain any stop bits. Now the actual machine implementation might only handle 2 or 4 or 8 instructions at a time, but that does not appear anywhere in the assembly. The only requirement is that the machine respect the stop bits.

      Now, you might question how it deals with load-value dependencies (ie. load a value into a register, use that register). Obviously, the load and use must be on different sides of a stop bit, but that would still not guarantee correctness. I'm not sure how IA64 actually works (and someone should reply with the real answer) but I imagine that either: a) loads have a fixed max latency, and the compiler is required to insert as many stop bits between the load and the use to ensure correctness, or b) the machine will stall (like current machines).

      Either way, the whole point of speculative loads is to avoid that being a problem.

    2. Re:The more things change ..... by binaryDigit · · Score: 2

      Actually my point was that for anyone to code in assembler usually implies coding for max performance therefore you would maximize the number of parallel instructions for the particular version of EPIC you were targeting. That in turn would make your code either non portable (going down in # of EU's) or non optimized (going up in # of EU's).

      I too would be interested in hearing about how the cpu handles the dependencies. The only modern "general purpose" cpu that I know of that _doesn't_ stall is the MIPS.

    3. Re:The more things change ..... by wik · · Score: 1

      It handles dependencies by stalling, if necessary. A true VLIW wouldn't do this. Intel took a whole bunch of good architecture ideas, and then tied themselves to a wall with requirements for compatibility and a desire to shove more features into the package.

      You'd think that starting out with a new architecture/ISA, they'd at least try to keep it simple and then let it grow hairy with age. :)

      --
      / \
      \ / ASCII ribbon campaign for peace
      x
      / \
    4. Re:The more things change ..... by Anonymous Coward · · Score: 0

      The Itanium processor uses a scoreboard, so the pipeline will stall on the consumption of a pending load. It does provide several new instructions to try to move those loads up in the code, so you don't run into the problem of missing the cache. These instructions include an advanced load for moving loads ahead of stores and a speculative load for moving loads ahead of branches. There's even an advanced speculative load for, you guessed it, moving a load ahead of a store and a branch. Hope this helps.

    5. Re:The more things change ..... by pne · · Score: 2

      Funny how it's the same old issues over and over again. Structs change in size, bad assumptions about the size of things such as size_t, sizeof(void *) != sizeof(int) (though sizeof(void *) == sizeof(long) seems to be pretty good at holding true here), etc.

      But did you notice that on Windows/IA64, even that won't work? They have a "strange P64 model", where ints and longs stay 32 bits and only pointers are 64 bits. So this kind of thing isn't even homogeneous within the architecture (the Windows guys will have to use long longs or _int64_t's explicitly, I guess).
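      One way to stay out of trouble under both models is to stop relying on `long` and use the C99 `<stdint.h>` types. The sketch below is only an illustration under that assumption; `byte_distance`, `long_is_narrower_than_pointer` and `add_large` are hypothetical helper names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Distance in bytes between two pointers into the same object, held in an
 * integer type that is pointer-sized whether or not long is 64 bits wide. */
intptr_t byte_distance(const void *a, const void *b)
{
    assert(a != NULL && b != NULL);
    return (intptr_t)((const char *)b - (const char *)a);
}

/* Returns 1 when long cannot hold a pointer, as in a model where long
 * stays 32 bits and only pointers grow to 64 bits. */
int long_is_narrower_than_pointer(void)
{
    return sizeof(long) < sizeof(void *);
}

/* Arithmetic that needs 64 bits asks for int64_t explicitly instead of
 * assuming anything about long. */
int64_t add_large(int64_t a, int64_t b)
{
    return a + b;
}
```

      Code written this way should not care which of the two data models the compiler picks.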

      --
      Esli epei etot cumprenan, shris soa Sfaha.
    6. Re:The more things change ..... by Kitanin · · Score: 1
      Now, you might question how it deals with load-value dependencies (ie. load a value into a register, use that register). Obviously, the load and use must be on different sides of a stop bit, but that would still not guarantee correctness. I'm not sure how IA64 actually works (and someone should reply with the real answer) but I imagine that either: a) loads have a fixed max latency, and the compiler is required to insert as many stop bits between the load and the use to ensure correctness, or b) the machine will stall (like current machines).

      According to the documentation, it would be b. The machine stalls until it's retrieved. Of course, according to the documentation at the time, floating-point division worked fine on the first Pentiums. :-)

      --


      Teach your kids: "C++ made baby Jesus cry."
  12. printf() by Anonymous Coward · · Score: 0


    It's always bugged me that there's no portable
    way to print out most int-like datatypes.

    I usually just cast them to long. So if I had
    a pid_t, I'd print it like this:
    printf( "%ld\n", (long int)pid );

    The way it *should* work, if I were king of the
    universe, would be:
    printf( "%{pid}\n", pid );
    printf( "%{uid_t}\n", getuid() );
    etc.
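    For what it's worth, C99 gets partway there: cast to the widest integer type and use the `j` length modifier. A minimal sketch, assuming a C99 library and a POSIX `<sys/types.h>`; `format_pid` and `format_uid` are hypothetical names, and the unsigned cast for `uid_t` assumes the platform uses an unsigned type (as Linux does).

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <sys/types.h>

/* Format a pid_t without guessing its width: widen to intmax_t and
 * print with the C99 "%jd" conversion. */
int format_pid(char *buf, size_t size, pid_t pid)
{
    assert(buf != NULL && size > 0);
    return snprintf(buf, size, "%jd", (intmax_t)pid);
}

/* Same idea for uid_t, assuming it is an unsigned type on this platform. */
int format_uid(char *buf, size_t size, uid_t uid)
{
    assert(buf != NULL && size > 0);
    return snprintf(buf, size, "%ju", (uintmax_t)uid);
}
```

    It is still a cast at every call site, so it only fixes the width problem, not the typing one.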

    1. Re:printf() by $pacemold · · Score: 1

      The way it *should* work, if I were king of the universe, would be:

      printf( "%{pid}\n", pid );
      printf( "%{uid_t}\n", getuid() );

      etc.
      #define MYPRINTF(fmt, var) myprintf((fmt),sizeof(var),(var))

      Designing the rest of the API, writing the myprintf() function and dealing with macros with variable number of parameters is left as an exercise to the implementor.
    2. Re:printf() by Anonymous Coward · · Score: 0

      This is what iostreams are for. Encoding a type in a string offers nothing more than an opportunity to get it wrong.

    3. Re:printf() by descubes · · Score: 2
      The way it *should* work, if I were king of the
      universe, would be:
      printf( "%{pid}\n", pid );
      printf( "%{uid_t}\n", getuid() );
      etc.


      The way it *does* work in the little universe where I am the king is:

      procedure Write(pid_t pid; others) is
      // Write "(pid) 1FEDDE" on output
      Write "(pid) ", HEX, pid as integer
      // Write other arguments
      Write others

      procedure Write(uid_t uid; others) is
      // Write "(uid) 1FEDDE" on output
      Write "(uid) ", HEX, uid as integer
      // Write other arguments
      Write others

      // Let's add a "WriteLn" capability
      procedure WriteLn(others) is
      Write others
      Write NewLineCharacter

      // And use it:
      procedure Main() is
      var pid_t pid := GetPID()
      var uid_t uid := GetUID()
      WriteLn "Hello, PID=", pid, " and UID=", uid


      This way is arguably better, because it's type safe, and easier on the users. Of course, since it's not Compatible With C, it will never be used by anybody :-(
      --
      -- Did you try Tao3D? http://tao3d.sourceforge.net
  13. The number of bits is like the length of your dick by Anonymous Coward · · Score: 0

    After you reach a certain point longer isn't better, it becomes inconvenient. I'm sticking to my 32-inch architecture thank you.

  14. i386 not designed for servers? by Ed+Avis · · Score: 3, Interesting
    From the article:
    Back in the early '80s, nobody at Intel thought their microprocessors would one day be used for servers; the inherent architecture of the i386 family shows that clearly.
    That's funny, I thought that the i386 was specifically designed to run MULTICS, which was the very definition of a 'server' operating system (computing power as a utility, like water and electricity). The early 80s was the time Intel designed the i386 wasn't it?
    --
    -- Ed Avis ed@membled.com
    1. Re:i386 not designed for servers? by Anonymous Coward · · Score: 0

      By the 80s Multics was very dead. It was in the history section of my college OS textbook.
      Unix was very much on people's minds then. OS/2 had a bad start because its version 1 did not have a GUI, but I think that was the late 80s.
      Was the 386 designed as a server? I doubt it. Servers were not a big thing then. I think the 386 was supposed to be a workstation chip to take on Sun and Apollo.

    2. Re:i386 not designed for servers? by Russ+Steffen · · Score: 3, Interesting

      What's really funny is that I have an Intel propaganda book for the "brand new 80386." It spends two whole chapters talking about how the 386 is the perfect CPU for LAN servers. Of course, it also had to spend almost that much space describing what a LAN is and what a server might do, since very few people had ever heard of a LAN at that point, much less had one.

    3. Re:i386 not designed for servers? by Anonymous Coward · · Score: 0

      Yeah! The first servers I ever saw (I saw older ones after that, but these were the first I saw) were IBM PS/2 i386+FPU, I *think* 8 or even 16MB RAM... those pretty towers.

    4. Re:i386 not designed for servers? by david+duncan+scott · · Score: 2

      Yeah, and the early 80's was also when Honeywell stopped Multics development. FWIW, I'd describe MULTICS as a "timesharing" OS, rather than "server", which to me implies "client".

      --

      This next song is very sad. Please clap along. -- Robin Zander

    5. Re:i386 not designed for servers? by hey! · · Score: 3, Interesting
      386 designed for Multics? I doubt it. Running multics on a 386 would be like scoring Beethoven's ninth for a kazoo.


      Multics was pretty much tied to its unique mainframe hardware with loads more weird addressing and virtual memory management features that would never have fit the paltry 275,000 transistors of the 80386. Also, at the time (1985) Multics was a legacy system; Unix was seen as the operating system of the future, in particular because it was portable to microprocessors and didn't require much special hardware.

      --
      Post may contain irony: discontinue use if experiencing mood swings, nausea or elevated blood pressure.
    6. Re:i386 not designed for servers? by Ed+Avis · · Score: 1

      I have one of those at home, ex-Midland Bank fileserver, I put Linux on it. Although it now has an IBM Blue Lightning CPU instead of the original 386 processor, and more than the original 12 megs of memory.

      --
      -- Ed Avis ed@membled.com
    7. Re:i386 not designed for servers? by sparkz · · Score: 2

      Ohh, you're taking me back to my ICL Ei TeamServers... the big ones had 486DX2/25 with 32MB... and were they cheap?!

      --
      Author, Shell Scripting : Expert Re
    8. Re:i386 not designed for servers? by karlm · · Score: 2

      Don't forget about the 32 rings of protection on the Honeywell hardware instead of the 4 found on i386. As long as you have 2 rings, you can emulate as many rings as you want, but it's slow and a PITA. MULTICS was a beast of an OS that needed a beast of a machine. Thank G_d your classic arcade machines didn't run MULTICS, or it'd take decades to write a good efficient emulator to run them.

      --
      Copyright Violation:"theft, piracy"::Anti-Trust Violation:"thermonuclear price terrorism"<-Overly dramatic language.
  15. Debian on the IA64 by hereward_Cooper · · Score: 5, Informative

    Debian is already ported to the IA64 -- not sure about the number of packages ported yet, but I know they intend to release the new 3.0 (woody) with an IA64 port.

    See here for more details

    --
    zadok.org.uk
    1. Re:Debian on the IA64 by BacOs · · Score: 3, Informative

      From #debian-ia64 on irc.openprojects.net:

      Topic for #debian-ia64 is 95.70% up-to-date, 96.07% if also counting uploaded pkgs

      There are over 8000 packages for i386 (the most up to date architecture) - ia64 currently has about 7650 or so packages built

      More stats are available at buildd.debian.org/stats/

    2. Re:Debian on the IA64 by morbid · · Score: 0

      > There are over 8000 packages for i386 (the most up to date architecture) - ia64 currently has about 7650 or so packages built

      ....and 93.2% of those are themeable IRC clients for X with alpha transparency and built-in MP3 streaming.

      Sorry, couldn't resist...it's late and I have Kronenbourg 1664 :-)

      --
      I'm out of my tree just now but please feel free to leave a banana.
    3. Re:Debian on the IA64 by Anonymous Coward · · Score: 0

      Mandrake released both 8.1 and now 8.2 for IA64. This isn't exactly news, though I guess it is if you take into account Debian's release frequency.

    4. Re:Debian on the IA64 by BadlandZ · · Score: 1

      Maybe I'm wrong here, but I would think that getting compiler optimizations right is going to matter a lot more than just getting everything under the sun to build on IA64. And the question on my mind is, will that happen BEFORE ia64 systems reach the market or not?

    5. Re:Debian on the IA64 by morbid · · Score: 0

      IA64 machines have already hit the market (last year) and been withdrawn. You could buy a single processor machine from Dell for about $25k. They stopped selling them recently.

      --
      I'm out of my tree just now but please feel free to leave a banana.
  16. PA-RISC and IA32 Native Execution by morbid · · Score: 3, Interesting

    In the article he mentions that itanic can execute IA32 code _and_ PA-RISC code natively, as well as its own, but these features will be taken away sometime in the future.
    Does anyone remember the leaked benchmarks that showed the itanic executing IA32 code at roughly 10% of the speed of an equivalently-clocked PIII?
    I wonder how it shapes up on PA-RISC performance?
    It has to offer some sort of advantage over existing chips, or no one will buy it.
    On the other hand, maybe its tremendous heat dissipation will reduce drastically when they remove all that circuitry for running IA32 and PA-RISC code.
    Which leads me to think, why didn't they invest the time and money in software technology like dynamic recompilation, which Apple did very successfully when they made the transition from 69k to PPC?

    --
    I'm out of my tree just now but please feel free to leave a banana.
    1. Re:PA-RISC and IA32 Native Execution by Anonymous Coward · · Score: 0

      You mean 68k to PPC.

    2. Re:PA-RISC and IA32 Native Execution by morbid · · Score: 0

      Indeed I do :-)
      My eyesight and tryp[ing ain;t what they used to nbe :-)

      --
      I'm out of my tree just now but please feel free to leave a banana.
    3. Re:PA-RISC and IA32 Native Execution by descubes · · Score: 2

      In the current Itanium, only user-space IA-32 instructions are implemented with hardware assistance. Since this is essentially microcode, this is not too fast. The architecture specifies how the instructions work, which IA-64 registers they use to store IA-32 registers, etc. But the whole thing can be implemented in firmware or software in future revisions of the chip.

      IA-64 machines also offer firmware emulation of IA-32 system instructions. This allows you, in theory, to boot an unmodified IA-32 OS. I've never used it myself, however.

      Last, the PA-RISC support is a piece of software integrated in HP-UX. There's no help from the hardware, except numerous design similarities (IA-64 began its life as HP PA-Wide Word). So you won't be able to run PA-RISC Linux binaries on IA-64 Linux any time soon...

      --
      -- Did you try Tao3D? http://tao3d.sourceforge.net
    4. Re:PA-RISC and IA32 Native Execution by NovaX · · Score: 2

      Actually, the IA-64 instruction set is based off of PA-RISC, as it is the next generation of that architecture. Various projects designing processors with high levels of ILP were conducted at HP, blooming into the partnership between HP and Intel (who had been floating around an idea of a 64-bit x86 architecture, but received a poor response) that created IA-64. HP-UX developers have stated that only minor changes must occur to port an application, and have created what equates to a shell process that converts a PA-RISC instruction directly into its IA-64 counterpart.

      So, PA-RISC is native via design. The x86 instructions were tacked on; the x86 support was originally supposed to be an entire processor, but that proved to be too costly. You have to remember that x86 is hardly needed, as it's mostly important for developers porting and testing applications, and for Microsoft to run 'legacy' applications. McKinley has a newer design that should boost the x86 performance substantially. If extra is needed, I'm sure something similar to Sun's x86 PCI card will be devised.

      As to heat and the rest, taking out the x86 would help of course. From what I've heard, the control logic on current IA-64 chips is actually smaller than that of the Pentium 4, which was the point of the architecture - simplify. Simplifying meant spending more time on higher level logic rather than OOO techniques, etc. that could be done via software. The chip is so large due to *lots* of cache.

      Anyways, a few good links are:
      here and here.

      --

      "Open Source?" - Press any key to continue
    5. Re:PA-RISC and IA32 Native Execution by morbid · · Score: 0

      Thanks. That's very interesting. PA-RISC always was a very good design.

      --
      I'm out of my tree just now but please feel free to leave a banana.
  17. Why can't i386 assembler be used? by Ed+Avis · · Score: 3, Insightful
    From the article:
    Quite obviously, inline assembly must be rewritten from scratch.

    I don't see what is so obvious - isn't one of the selling points of Itanium its backward i386 compatibility? Even if running the 64-bit version of Linux it should still be possible to switch the processor into i386-compatible mode to execute some 386 opcodes and then back again. After all, the claim is that old Linux/i386 binaries will continue to work. Or is there some factor that means the choice of 32 bit vs 64 bit code must be made process-by-process?

    Interesting question: which would run faster, hand-optimized i386 code running under emulation on an Itanium, or native IA-64 code produced by gcc? They say that writing a decent IA-64 compiler is difficult, and I'm sure Intel has put a lot of work into making the backwards compatibility perform at a reasonable speed (if not quite as fast as a P4 at the same clock).

    --
    -- Ed Avis ed@membled.com
    1. Re:Why can't i386 assembler be used? by NanoGator · · Score: 4, Interesting

      " isn't one of the selling points of Itanium its backward i386 compatibility?"

      If I remember correctly, the 386 instructions are interpreted instead of being on the chip. That means that those instructions will execute a lot slower. It would work, but it wouldn't work well. It's nice because you could transition to IA-64 now and wait for the new software to arrive.

      Personally, I don't think that selling point is that worthwhile, but I'll let Intel do their marketing without me.

      --
      "Derp de derp."
    2. Re:Why can't i386 assembler be used? by $pacemold · · Score: 1

      > I'm sure Intel has put a lot of work into making the backwards compatibility perform at a reasonable speed

      :)

      Look up what happened when:

      1. 80286 was emulating 8086 in protected mode
      2. Pentium Pro was running 16-bit code

    3. Re:Why can't i386 assembler be used? by Slashamatic · · Score: 1
      It is an interesting comparison to look at what Digital did to get people from the VAX to the Alpha. They had a sophisticated binary translator, and for low-level code where you have the source, VAX assembler can be compiled.

      The end result is that it wasn't too difficult to move architectures, even though the Alpha does not know the VAX instruction set and no interpreter was provided.

      The only gotcha is that Digital had to provide some special extra instructions to implement some primitives used by the OS, such as interlocked queues.

      Intel is primarily a hardware company so they would tend to ignore software solutions, but the one-architecture approach kept the Alpha from getting too complicated.

    4. Re:Why can't i386 assembler be used? by iabervon · · Score: 3, Informative

      Changing modes for a single assembly block is not going to work. All of your data is in IA-64 registers, the processor pipeline is filled with IA-64 instructions, and so forth. Switching is a major slowdown (might as well be another process), and the point of having sections in assembly is to speed up critical sections.

      In any case, what makes it difficult to write an IA-64 compiler is taking advantage of the things that the new instruction set lets you tell the processor. It's not hard to write code for the IA64 that's as good as some code for the i386. It's just that you won't get the benefits of the new architecture until you write better code, and the processors aren't optimized for running code that doesn't take advantage of the architecture.

    5. Re:Why can't i386 assembler be used? by Chris+Burke · · Score: 4, Informative

      I don't see what is so obvious - isn't one of the selling points of Itanium its backward i386 compatibility?

      Yes. Compatibility. Nothing more. Your old apps will run, but not fast. It's basically a bullet point to try to make the transition to Itanium sound more palatable.

      Or is there some factor that means the choice of 32 bit vs 64 bit code must be made process-by-process?

      It is highly likely that the procedure to change from 64 to 32 bit mode is a privileged operation, meaning you need operating system intervention. Which means the operating system would have to provide an interface for user code to switch modes, just so a small block of inline assembly can be executed. I highly doubt such an interface exists (ick... IA-64 specific syscalls).

      Interesting question: which would run faster, hand-optimized i386 code running under emulation on an Itanium, or native IA-64 code produced by gcc?

      An interesting question, but one for which the answer is clear: gcc will be faster, and by a lot. Itanium is horrible at 32-bit code. It isn't designed for it, it has to emulate it, and it stinks a lot at it.

      They say that writing a decent IA-64 compiler is difficult, and I'm sure Intel has put a lot of work into making the backwards compatibility perform at a reasonable speed (if not quite as fast as a P4 at the same clock).

      Writing the compiler is difficult, but a surmountable task. And your surety does not enhance IA-64 32-bit support in any way. It is quite poor, well behind a P4 at the same clock, and of course at a much lower clock. Even with a highly sub-optimal compiler and the top-notch x86 assembly, you're better off going native on Itanium.

      --

      The enemies of Democracy are
    6. Re:Why can't i386 assembler be used? by n0ano · · Score: 1
      isn't one of the selling points of Itanium its backward i386 compatibility

      The article was referring to inline assembly in the kernel code. The IA32 compatibility built into the IA64 CPU is strictly for user mode; all system functions are executed in IA64 mode. Although it would be technically possible to enter kernel mode, switch to the IA32 instruction set, exec some IA32 code and then switch back, in practice this is unfeasible. The IA32 code would be using different data structures, and it couldn't call any of the kernel internal routines without somehow finding a way to switch from IA32 to IA64 mode and back on each subroutine call.


      The problems of mixing IA32 and IA64 code, especially inside the kernel, are just too difficult and provide little benefit. For these reasons the Linux/IA64 team decided not to support this.

      --
      Don Dugger
      "Censeo Toto nos in Kansa esse decisse." - D. Gale
    7. Re:Why can't i386 assembler be used? by Ed+Avis · · Score: 2

      If the entire critical loop is in assembler (not just a small part of it) then it could be worth switching. Although based on what another poster wrote, it sounds like the emulation is so lousy that no matter how suboptimal gcc's code generation

      --
      -- Ed Avis ed@membled.com
    8. Re:Why can't i386 assembler be used? by Ed+Avis · · Score: 2

      The PPro sucked at running 16-bit code - because at the time it was designed Intel didn't anticipate that people would _still_ be running 16-bit stuff in the mid-90s - but the next iteration, the Pentium II, was better. I wonder if McKinley is expected to give a boost to legacy code compared to Itanic.

      --
      -- Ed Avis ed@membled.com
    9. Re:Why can't i386 assembler be used? by juggleme · · Score: 1

      Here's some benchmarks of a 666 MHz Itanium on x86 code. About as good as a Pentium 100. Not exactly a compelling reason to buy a thousand dollar chip...

  18. Re:The number of bits is like the length of your d by Anonymous Coward · · Score: 0

    I'm running on a very fat little 8-bit machine...

  19. Has anyone thought of... by Ben+Edwards · · Score: 0, Offtopic

    It's an old frustration I've had with Windows having to do with the time it takes to boot. Why can't they put Windows on an EPROM chip (perhaps on the motherboard, perhaps on a card) so that the OS is all in hardware? Booting would be so much faster.

    Has anyone thought of doing this with Linux?

    1. Re:Has anyone thought of... by ocelotbob · · Score: 1
      Has anyone thought of doing this with Linux?
      Of course. They're mostly used in clustering situations, but they are definitely out there.
      --

      Marxism is the opiate of dumbasses

    2. Re:Has anyone thought of... by Anonymous Coward · · Score: 0

      > ...I'm the one who slaughtered those
      > COWABUNGA guys everyone heard
      > rapping under their toilettes.

      That's a sad, sad thing, they were such great artist. Check the Louvre in their memory if you're in Europe.

  20. NULL barfage by dark-nl · · Score: 3, Informative

    The examples he gives for usage of null pointers are both wrong. When a null pointer (whether written as 0 or NULL) is passed to a varargs function, it should be cast to a pointer of the appropriate type. See the comp.lang.c faq for details. The relevant questions are 5.4 and 5.6. But feel free to read them all!

    1. Re:NULL barfage by ScottMaxwell · · Score: 3
      The examples he gives for usage of null pointers are both wrong. When a null pointer (whether written as 0 or NULL) is passed to a varargs function, it should be cast to a pointer of the appropriate type.


      Indeed. In the particular case in question, passing a pointer to printf(), this should be (void *) 0 or (void *) NULL.

      At least he's right when he says "The following is coded wrong." :-)

      Bar is also mistaken on at least one other ANSI/ISO C-related point. He writes:

      values of type size_t should use the Z size modifier [to printf], like so:


      In fact, the Z modifier in the %Zu construction is non-standard. There was no portable way to print a size_t in the original ANSI/ISO C (C89). C99 (the 1999 revision of the ISO C standard) uses a lower-case z instead, so portable code should use %zu instead. Of course, the kernel is intended for compilation with gcc, not just any compiler, so Bar's example is correct for the kernel but is not (as he claims) standard.
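      For illustration, both points in a form that should compile with any C99 library. This is a user-space sketch, not kernel code, and `format_size`, `format_pointer` and `format_null_pointer` are hypothetical helper names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Print a size_t with the standard C99 'z' length modifier. */
int format_size(char *buf, size_t size, size_t value)
{
    assert(buf != NULL && size > 0);
    return snprintf(buf, size, "%zu", value);
}

/* "%p" expects a void *, so other pointer types are cast explicitly
 * before being passed through the variadic argument list. */
int format_pointer(char *buf, size_t size, const int *p)
{
    assert(buf != NULL && size > 0);
    return snprintf(buf, size, "%p", (void *)p);
}

/* A null pointer passed to a variadic function gets an explicit cast too;
 * a bare 0 would be passed as an int, which is narrower than a pointer
 * on an LP64 target. */
int format_null_pointer(char *buf, size_t size)
{
    assert(buf != NULL && size > 0);
    return snprintf(buf, size, "%p", (void *)NULL);
}
```

      The text that `%p` produces is implementation-defined, so only the `%zu` output is predictable across platforms.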

      --

      ``Life results from the non-random survival of randomly varying replicators.'' -- Richard Dawkins
  21. How is that different from a PPC? by jmv · · Score: 3, Interesting

    A while ago, I tried compiling and running my program (http://freespeech.sourceforge.net/overflow.html) on a Linux PPC machine and (to my surprise) everything went fine. Does that mean that it should work on ia64 too since (AFAIK) both are big-endian 64-bit architectures?

    1. Re:How is that different from a PPC? by Gothmolly · · Score: 2

      No, because as he says in the article, IA64 is little endian.

      --
      I want to delete my account but Slashdot doesn't allow it.
    2. Re:How is that different from a PPC? by gr · · Score: 2

      Whatchew talkin' 'bout, Willis?

      PowerPC is 32-bit and IA64 is little endian.

      Duh?

      --
      Do you have a /. uid shorter than five digits? No? Then piss off.
    3. Re:How is that different from a PPC? by Anonymous Coward · · Score: 0

      No, IA64 is technically bi-endian. Although it wouldn't surprise me if Intel boxes will only really work as little endian.

      HP/UX wants to be big-endian, and that's why Itanium is not a fixed-endian system.

      Tom

    4. Re:How is that different from a PPC? by jmv · · Score: 2

      PowerPC is 32-bit and IA64 is little endian.

      (After a quick check) It does seem like the PowerPC is a 64-bit chip (though maybe linux uses it as a 32-bit for some operations). Also, both PPC and Itanium can act like big-endian or little-endian.
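      Either way, code that never looks at the in-memory layout of an integer does not care which mode the box runs in. A minimal sketch of that idea; `host_is_little_endian` and `read_be32` are hypothetical helper names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Report the byte order of the machine this code runs on by inspecting
 * the first byte of a known 16-bit value. */
int host_is_little_endian(void)
{
    const uint16_t probe = 1;
    return *(const unsigned char *)&probe == 1;
}

/* Decode a big-endian 32-bit value from a byte buffer. Shifts operate on
 * values, not memory layout, so this gives the same answer on
 * little-endian and big-endian hosts. */
uint32_t read_be32(const unsigned char *p)
{
    assert(p != NULL);
    return ((uint32_t)p[0] << 24) |
           ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |
            (uint32_t)p[3];
}
```

      A program that reads its file formats and network data this way would not need the probe at all; it is there mostly for diagnostics.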

    5. Re:How is that different from a PPC? by sagi · · Score: 1

      Actually, the poster was right - the regular PowerPC is 32-bit.

      From http://penguinppc.org/intro.shtml:
      There are actually two separate ports of Linux to PowerPC: 32-bit and 64-bit. Most PowerPC cpus are 32-bit processors and thus run the 32-bit PowerPC/Linux kernel. 64-bit PowerPC cpus are currently only found in IBM's eServer pSeries and iSeries machines. The smaller 64-bit pSeries and iSeries machines can run the 32-bit kernel, using the cpu in 32-bit mode. This web page concentrates primarily on the 32-bit kernel. See the ppc64 site for details of the 64-bit kernel port.

    6. Re:How is that different from a PPC? by gr · · Score: 2

      Extremely few PowerPC processors are 64-bit.

      Certainly none you're likely to be compiling software on with any kind of regularity. (By which I mean: Apple's never sold a 64-bit processor. ;^>)

      --
      Do you have a /. uid shorter than five digits? No? Then piss off.
  22. Star Wars post by Anonymous Coward · · Score: 0

    But I was going to tashi station to pick up some power converters!

  23. IASixtyTroll by Anonymous Coward · · Score: 0

    This is a troll post. There are thousands more like it, but this one is mine.

  24. No FP in kernel? by d-rock · · Score: 1

    When I was reading the article, the part about no Floating Point in the Kernel stuck out for me. Is this an absolute, or a "don't do it, it's bad"? I looked at the Mossberg presentation on the IA-64 kernel and it looked like they were using some of the fp registers for internal state, but it didn't look like all of them.

    Derek

    --
    Don't Panic...
    1. Re:No FP in kernel? by T-Punkt · · Score: 1

      It's for performance reasons I guess - NetBSD does the same for quite a few of its ports (e.g. the m68k and powerpc ones). The kernel does nearly no floating point calculations and if you do the few ones the kernel does with soft-float to avoid using the floating point instructions you manage to keep the contents of the FP registers unchanged. So there's no need to save and restore them when the CPU switches between user and kernel mode (syscalls etc.). Storing/loading n floating point registers in memory for every syscall is quite expensive, you know.

      This of course is not necessary if the CPU has two (at least partly) different sets of FP registers for kernel (supervisor, privileged, ...) and user (unprivileged) mode or an instruction to quickly exchange (parts of) the FP register sets. (SPARCs do have this to some degree with its concept of register windows).

    2. Re:No FP in kernel? by descubes · · Score: 4, Informative

      There are two reasons:

      1/ The massive amount of FP state in IA-64 (128 FP registers). So the Linux kernel is compiled in such a way that only some FP registers can be used by the compiler. This means that on kernel entry and exit, only those FP registers need to be saved/restored. Also, by software conventions, these FP registers are "scratch" (modified by a call), so the kernel need not save/restore them on a system call (which is seen as a call by the user code).

      2/ The "software assist" for some FP operations. For instance, the FP divide and square root are not completely implemented in hardware (it's actually dependent on the particular IA-64 implementation, so future chips may implement it). For corner cases such as overflow, underflow, infinities, etc., the processor traps ("floating-point software assist" or FPSWA trap). The IA-64 Linux kernel designers decided not to support FPSWA from the kernel itself, which means that you can't do an FP divide in the kernel. I suspect this is what is more problematic for the application in question (load balancer doing FP computations, probably has some divides in there...)
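      If the divides in such a load balancer are just ratios, scaled integer arithmetic is one common way around the restriction. The sketch below is a guess at what that could look like, not anything taken from the application or the article; `SHARE_SCALE` and `load_share_permille` are hypothetical names, and it is written as plain user-space C.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical scale factor: shares are expressed in thousandths. */
#define SHARE_SCALE 1000u

/* Fraction of the total load carried by one server, in thousandths,
 * computed with integer multiply and divide only. The 64-bit intermediate
 * keeps load * SHARE_SCALE from overflowing 32 bits. */
uint32_t load_share_permille(uint32_t load, uint32_t total)
{
    assert(total != 0);
    return (uint32_t)(((uint64_t)load * SHARE_SCALE) / total);
}
```

      For example, `load_share_permille(1, 4)` gives 250 and `load_share_permille(1, 3)` gives 333; the result is truncated rather than rounded.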

      XL: Programming in the large

      --
      -- Did you try Tao3D? http://tao3d.sourceforge.net
    3. Re:No FP in kernel? by plastik55 · · Score: 2
      The kernel interrupt handlers don't bother to save the state of the FP registers, mainly for performance reasons. That means if you use FP in the kernel you'll probably fubar any user-space process that's using the FPU.


      It's not specific to IA64 or Linux-- PPC and IA32 also work this way, and Windows does the same thing. You can get around it, possibly, by inlining some assembly which saves and restores the FP registers before and after you use them. You need to be careful that the kernel won't switch out of context or go back to userland while you're using FP registers--preemptive kernels make this much harder.


      However, there really aren't many reasons why you would want to use FP in the kernel in the first place. Real-time data acquisition and signal processing is the only example that comes to mind, but you'd be better off using something like RTLinux in that case.

      --

      I have a positive modifier on Troll. When I mod someone Troll their karma should go UP!

  25. Power consumption with no IA32 & PA support by Anonymous Coward · · Score: 0

    The Itanium consumes quite a lot of power compared to, say, Hitachi SH4, ARM/Intel XScale etc.

    In the link Moshe Bar writes that Itanium has hardware support for both the IA32-legacy (I knew that) and HP's PA-RISC architecture (new to me). Does anyone know how much less power the Itanium would consume if those were to be dropped?

    Since the IA32-core in Itanium is slow anyway, how much slower would it be to use software emulation like Apple did (i.e. emulating a MC680x0 with the PowerPC CPU)?

  26. Re:Sparc and Alpha ahead of Itanium by Anonymous Coward · · Score: 0

    If you want a pure 64-bit environment then go with Sparc or Alpha. AMD has a 64-bit version too. Intel has an advantage in keeping the i386 opcodes: it makes it easier to continue using 32-bit code, but there is a performance hit. If you need to use 32 bit then go with Itanium or AMD; if you need the raw power of 64 bit then go with Sparc or Alpha. I cannot see any reason to code 32 bit once 64-bit Itanium is released to the desktop, unless you have legacy hardware that needs the support. My two cents: pure 64 bit would be best used on your database servers that need the raw power to crunch all that data. You can still run your office apps off a 32-bit server, as there would be little performance gain on pure 64 bit. Remember it's performance that matters, and what you are going to use the server or workstation for figures into the hardware choice. CAD developers would love pure 64 bit, and they have been using 64-bit SPARC for years because Intel bites on performance when crunching data like floating point etc.

  27. See, Linux is not dying by Anonymous Coward · · Score: 0

    Even though Linux is blamed for setting back the state of computing by 10 years reinventing the VM and networking stack which works much better in FreeBSD than in Linux, Linux is still being ported to new platforms. Linux is like cockroaches---once you get them, you never get rid of them.

  28. LinuxBIOS and OpenBIOS by isolation · · Score: 0

    google is your friend.

    Nuff Said

    --
    Free Unix? Free Windows. http://www.reactos.com
  29. Will 64 bit chips ever make it? by 00_NOP · · Score: 3, Interesting

    When I started messing about with computers 8 bit chips were standard on the desktop and 4 bit in the embedded sphere.

    Within four years 16 bit was the emerging standard for the desktop and four years after that 32 bit was emerging.

    In the 12 years since then, well...

    32 bit rules in both the desktop world and in the embedded world. Can someone tell me why we aren't on 128 bit chips or more by now? Why do 64 bit chips not make it - is this a problem of the physics of mobos or what?

    1. Re:Will 64 bit chips ever make it? by Chris+Burke · · Score: 5, Insightful

      It's really not that complicated.

      While 4-bit and 8-bit chips were cool and all, no one really thought they were -sufficient-. The limitations of an 8-bit machine hit you in the face, even if you're coding fairly simple stuff. 16 bits was better but, despite an oft quoted presumption suggesting otherwise, that as well was clearly not going to work for too long.

      Then, 32 bits came around. With 32-bit machines, it was natural to work with up to around 4 GB of memory without any crude hacks. Doing arithmetic on fairly large numbers wasn't difficult either. The limitations of the machine were suddenly a lot farther away. Thus it took longer for those limitations to become a problem. You'll notice that for those spaces where 4GB was a limiting factor the switch to 64 bits happened a long time ago. The reason we are hearing so much about 64 bits now is that the "low end" servers that run on the commodity x86 architecture are getting to the point where 4GB isn't enough anymore. Eventually I imagine desktops will want 64 bits as well. I've already got 1.5GB in the workstation I'm typing this on.

      When will 128 bit chips come about? I don't know, but I'm sure it will take longer than it will take for 64 bits to become mainstream. The reason is simple: Exponential growth. Super-exponential, in a way. 64 bits isn't twice as big as 32 bits, it's 2^32 times bigger. While 2^32 was quite a bit of ram, 2^64 is really, really huge. I won't say that we'll never need more than 2^64 bytes of memory, but I feel confident it won't be any time soon.
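
      To put rough numbers on that, here is a minimal Python sketch. The function name `addressable_bytes` is made up for illustration, and it assumes a flat, byte-addressed space where every address bit is usable, which real chips don't always give you:

```python
# Sketch: how many bytes a flat, byte-addressed space of a given width covers.
# Assumption: every address bit is usable (real CPUs often wire up fewer).

GIB = 2 ** 30


def addressable_bytes(bits: int) -> int:
    """Bytes reachable with a flat address of the given width."""
    return 2 ** bits


for bits in (8, 16, 32, 64):
    print(f"{bits:>2}-bit: {addressable_bytes(bits):,} bytes")

# Going from 32 to 64 bits multiplies the space by 2^32, not by 2.
growth_factor = addressable_bytes(64) // addressable_bytes(32)
print("64-bit space / 32-bit space =", growth_factor)
```

      The ratio comes out as 2^32, so each doubling of the word size squares the size of the address space rather than doubling it.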

      An interesting end to this: At some point, there -is- a maximum bit size. For some generation n with a bit size 2^n and a maximum memory space of 2^2^n you have reached the point where you could use the quantum state of every particle in the universe to store your data, and still have more than enough bits to address it. Though this won't hold true if, say, we discover that there are an infinite number of universes (that we can use to store more data). Heh.

      --

      The enemies of Democracy are
    2. Re:Will 64 bit chips ever make it? by Chief+Typist · · Score: 1

      I have my doubts about the success of 64-bit chips, too (sounds like we both started messing around with 8080's and 4040's :-)

      The only thing that I see the 64-bit architecture getting you is more addressable memory (2^64 vs. 2^32). Most large scale systems these days are highly distributed; you throw lots of CPUs at the problem. You don't throw a large memory space at the problem.

      You don't need a 64-bit processor for the instruction size. RISC uses less, not more.

      Of course, there are advantages to parallelization of the instruction pipeline, but multi-processor systems or vector processing units (Altivec rocks) are better at this.

      I remember being involved in an early port on a 64-bit DEC Alpha. It was a pain in the butt and the performance gain wasn't enough to justify the expense.

      -ch

    3. Re:Will 64 bit chips ever make it? by Chris+Burke · · Score: 2

      It is true that all 64 bits really brings to the table is more addressable memory. That's an entirely adequate reason for its adoption.

      Most large scale systems do have a large number of processing nodes, but each node needs to be able to access a large amount of data easily. 4GB isn't that much memory, even for one node. Besides, for inter-node communication, a unified memory space is the easiest method by far. For large multi-way servers (as opposed to Beowulf-style) this is also quite natural.

      64 bits is a good thing. It's already a success. It's only in the low-end server and desktop markets where it still hasn't taken over.

      --

      The enemies of Democracy are
    4. Re:Will 64 bit chips ever make it? by morbid · · Score: 0

      ...because 90% of the world thinks that M$ (which has only recently become 32-bit clean) is the be-all and end-all of computing.
      Yes it's a rant, but there's probably a grain of truth in there.
      The 386 came out in '85. It took M$ until about 1993(?) to get to NT3.51.
      Most people were only ever interested in an OS if they could run their DOS 2.0 apps unmodified, specifically Lotus-1-2-3, Norton Commander, Borland Sidekick etc.
      Technological superiority was never an issue. Hence, the humble Xenix was passed by, as was OS/2, etc..

      The rest, as they say, is history (my dad wouldn't let me have a 68k-based (internally 32-bit) Amiga or ST because "it doesn't run DOS and doesn't run Lotus and has Mickey-Mouse graphics which are only for playing games which is what hooligans and junkies do").

      Kids, when you become parents, please take heed, and don't be a dick as described above.

      --
      I'm out of my tree just now but please feel free to leave a banana.
    5. Re:Will 64 bit chips ever make it? by Chief+Typist · · Score: 1

      I hadn't thought about using the address space as a way of communicating between nodes. It certainly puts a new and interesting twist on larger systems...

      Still, I don't see 64-bit systems working their way into the lower end of the market anytime soon. I distribute my work across many machines. It's cheap and easy to have lots of servers...

      -ch

    6. Re:Will 64 bit chips ever make it? by Chris+Burke · · Score: 2

      I hadn't thought about using the address space as a way of communicating between nodes. It certainly puts a new and interesting twist on larger systems...

      My reaction depends on how you define "node" or "larger system". I think of a "node" as a small processing unit of a few processors connected by a router to other nodes in some form of network. But the node need not be a standalone machine, and the network need not be cat-5. I think of a "large system" as more of the massively MP systems like mainframes and such that can have a thousand processors in one (big) box.

      Anyway, there are generally two methods of doing IPC in such a system: shared memory, and message passing. While message passing has its advantages, it is much more difficult to program for. Shared memory is simple.

      If you don't consider 1024-way mainframes "low end" (heh), the same argument is true in a 2-way server. You have two processors, each of which may need to access more than 4 GB of memory.

      Still, I don't see 64-bit systems working their way into the lower end of the market anytime soon. I distribute my work across many machines. It's cheap and easy to have lots of servers...

      As soon as your database gets larger than 4GB, you'll want 64 bits. Maybe your front end web server won't need it, but your back end will. And your back end may still be considered "low end".

      But you are right, in that it won't be a fast transition. 4GB is still a lot to a lot of people. But that number will inevitably decrease, and perhaps as more 64 bit applications come online in the low-end world sometimes known as wintel that will drive the change faster.

      --

      The enemies of Democracy are
    7. Re:Will 64 bit chips ever make it? by be-fan · · Score: 2

      Yes they will. If only because you can make Linux use workarounds like high-mem (instead of mapping in all physical memory) for only $200 on pricewatch.

      --
      A deep unwavering belief is a sure sign you're missing something...
    8. Re:Will 64 bit chips ever make it? by 00_NOP · · Score: 1

      (sounds like we both started messing around with 8080's and 4040's :-)

      Z80s actually. Was amazed to see that they are still in use - in Gameboys (though that too has gone 32bit now)

    9. Re:Will 64 bit chips ever make it? by rew · · Score: 2

      32 bit rules in both the desktop world and in the embedded world. Can someone tell me why we aren't on 128 bit chips or more by now?

      It's Moore's law.

      With computer capacity doubling almost every year or so, you hit the addressing limits of a 4 bit processor pretty quickly, the 4 extra bits of an 8bit processor will last you about 4 years (78-82). The "life time" of 16 bit processors is therefore about 8 years (80-88), and 32 bit should last some 16 years ('88-2004) before you regularly hit the addressing limit of the processor.
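
      That rule of thumb is simple enough to write down. A minimal sketch, assuming (as above) that memory needs double once a year, so every extra address bit buys one year; the function name `years_of_headroom` is only for illustration:

```python
# Sketch of the rule of thumb: if memory needs double every
# `doubling_period_years`, each extra address bit lasts one doubling period.
# The one-year doubling period is the assumption from the comment above,
# not a measured figure.


def years_of_headroom(old_bits: int, new_bits: int,
                      doubling_period_years: float = 1.0) -> float:
    """Years until the wider address space is used up again."""
    return (new_bits - old_bits) * doubling_period_years


for old, new in ((4, 8), (8, 16), (16, 32), (32, 64)):
    print(f"{old:>2} -> {new:>2} bits: about "
          f"{years_of_headroom(old, new):.0f} years")
```

      By the same reasoning the jump from 32 to 64 bits would be good for about 32 years, and twice that if capacity only doubles every two years.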

      Sure there are some "advanced" processors that are ahead of the curve. The 32bit 68000 was launched in '80. The alpha has been 64 bit for quite a while already. But the mainstream will have to move to 64 bit in a couple of years.

      Roger.

    10. Re:Will 64 bit chips ever make it? by JKR · · Score: 1
      Eventually I imagine desktops will want 64 bits as well. I've already got 1.5GB in the workstation I'm typing this on.

      Increased address space doesn't necessarily require the jump to a full 64 bit architecture. Some Intel chips already support 36 bit addresses, while having 32 bit word length. It's a bodge, because pointers are still 32 bit (think back to 16 bit DOS and segmented memory architecture) but it's already there. You can have 16 GB in your desktop, if you can afford it.

      Jon.

  30. Porting applications. by Bert64 · · Score: 1

    Surely porting of applications written in C and other high level languages shouldn't be difficult. In that respect Itanium is nothing new, a 64bit little-endian architecture.. alpha anyone?
    64bit machines have been commercially available for at least 10 years, you'd think coders would have got used to writing 64bit clean software by now.

    --
    http://spamdecoy.net - free throwaway anonymous email - avoid spam!
  31. It's pretty cool... by Time+Doctor · · Score: 2

    nvidia already has drivers out for Linux/IA64 with some of their higher end cards (quadro line).

    --
    Check out ioquake3.org for a great, free, First-Person Shooter engine!
    1. Re:It's pretty cool... by Anonymous Coward · · Score: 0

      Note that you're probably screwed if you bought an affordable NVIDIA card. And you deserve it for knowingly doing business with a vendor that ships proprietary drivers instead of supporting (or at least documenting) what you bought.

    2. Re:It's pretty cool... by Time+Doctor · · Score: 2

      Not that any commercial games are compiled for IA64 yet.

      You might try getting to know the facts before posting, even as a coward.

      --
      Check out ioquake3.org for a great, free, First-Person Shooter engine!
    3. Re:It's pretty cool... by Anonymous Coward · · Score: 0
      Proprietary games (whether gratis or commercial) probably haven't been yet, but there's no reason to assume they never will (just as they rebuilt for IA32 in the early DOS4GW and then Win32 days). After that GeForce cards will become doorstops, because their customers have already proven they're willing to let NVIDIA dictate how they use their own property.

      Every platform change is an opportunity for vendors to screw us, unless we demand the capability to support ourselves.

  32. stupid question by prizzznecious · · Score: 1

    asked in sincerity: does this mean faster chips, or what?

    --

    visit the hwky website for a lyrical genius infusion.
  33. Use Java by matsh · · Score: 3, Funny

    And forget about the problem!

    Mats

    1. Re:Use Java by WetCat · · Score: 1

      One more Java dumb.
      He is speaking about writing SYSTEM software?
      Have you done any driver development on Java recently?

  34. Re:64 BIT Assembler Project by Anonymous Coward · · Score: 0

    Gotcha! Coders, instead of pissing about how hard it is to write a good compiler, why do we not start the 64 BIT Assembler Project? It would not take much time to get some code out for testing. SourceForge could host this project and perhaps IBM, SUN, INTEL, HP, AMD, REDHAT, SUSE, TURBO, MANDRAKE, CALDERA etc. would provide support for such a project. This way there would be a good 64BIT compiler that had some agreed standards that would allow porting of code.

  35. Re:What Is Linux Torvalds View On This by Anonymous Coward · · Score: 0

    I would like to know if Linus Torvalds and the Linux Kernel Hackers have been asked about this issue of 32BIT and 64BIT code, because it is going to affect kernel development. Could 32BIT calls break the Linux Kernel when mixed with 64BIT calls? Another question is on sockets and networking code and protocols. Would it not be better to make a clean 64BIT kernel and dump legacy 32BIT? The break, while painful at first, would be the best solution going forward. A compromise would be to make 32BIT mode a module that you could enable by a kernel recompile if you needed it, sort of like the IBCS emulation layer. I would hope we get some clarification from Linus Torvalds as to the Linux Kernel Roadmap regarding this, so that companies can plan the migration of their software and hardware to 64BIT. The goal should be well documented good clean code that does not break the Linux Kernel. Coders need a roadmap on how best to code 64BIT and how best to port legacy code to 64BIT.

  36. although to be fair by Anonymous Coward · · Score: 1

    There's not any conceptual difference in the high-level programmer's view (nor the C programmer's view for that matter) between IA64 and any other POSIX platform, 64-bit or otherwise, either. The code that breaks, unless it's really CPU-specific stuff, breaks because it was coded poorly. Most of the unportable code out there is really unwarranted.

  37. OT: your sig by theCoder · · Score: 1

    Karma 39 and still posting at 0.

    How do you do that? It'd be great for off-topic posts like this one (that should be modded to 0 anyway)

    --
    "Save the whales, feed the hungry, free the mallocs" -- author unknown
    1. Re:OT: your sig by morbid · · Score: 0

      Maybe the editors don't like my opinions, after all I'm skeptical of IBM.... and I use Slackware on my home-made K6-2/500.

      --
      I'm out of my tree just now but please feel free to leave a banana.
  38. Is there a JRE on IA-64? by Michael+Wardle · · Score: 1

    Java might be a good cross-platform development language, but I haven't seen a JRE for IA-64 at this point.

    Is there a JRE for IA-64? How can Java bytecode be executed/interpreted on Itanium systems at this stage?

    Does the IA-32 emulation work with a IA-32 JRE? If so, wouldn't the dual layers of Java and IA-32 emulation make it too slow to be practical?

  39. Article is inaccurate by leek · · Score: 2, Interesting
    The article is inaccurate.

    First of all, IA-64 is now called IPF (Itanium Processor Family), although I've heard rumors that this is changing again, to a third name.

    Although the initial acceptance of Itanium-based servers and workstations has been slow, there is little doubt that it will eventually succeed in becoming the next-generation platform.

    Actually, as /. readers know, there have been some doubts. Itanium is 5 years late. Right now Itanium ranks lowest in SPEC numbers, and Itanium 2 (McKinley), while it addresses some of the problems, can't expect to compete with Hammer or Yamhill when it comes to integer code.

    For tight floating-point loops, Itanium 2 is great -- 4 FP loads + 2 FMAs per clock. But on integer code with lots of unpredictable branches, the entire IPF architecture leaves a lot to be desired. Speculation and predication were supposed to address that, but it is very hard for compilers to exploit speculation, and predication does not address issues such as the limitations of static scheduling.

    (Also, Itanium 2 removes any benefit that the SIMD instructions had on Itanium, because on Itanium 2, SIMD instructions such as FPMA are split and issued to both FPU ports, negating any performance benefit they had on Itanium. So while Itanium can perform 8 FP ops per clock with FPMA, Itanium 2 can only perform 4 FP ops per clock. This does not look good for the future of IPF implementations. But Itanium 2's bigger memory bandwidth is probably more important than SIMD instructions anyway. Itanium 2 is built more for servers, while Itanium is built more for workstations, which might benefit from SIMD MMU instructions, although the rest of Itanium, and its price/performance, make almost anything else better.)

    Superscalar processors with dynamic scheduling are improving much better than was expected during IPF's design (witness the P4 and AMD chips). So Itanium's static instruction scheduling design may be a liability more than an asset today. It puts considerable burden on the compiler.

    The x86 emulation and stacked register windows take up a lot of real estate on the chip, which could be better used for something else.

    The IA64 can be thought of as a traditional RISC CPU with an almost unlimited number of registers.

    Nonsense!!! No CPU has unlimited registers. When writing code by hand or with a compiler, registers are a limited resource which are used up quickly.

    And even though IPF has "stacked" general purpose registers which are windowed in a circular queue with a lazy backing store, these windows are of limited utility in real code. How many times does real code use subroutine calls which can take heavy advantage of register windows, before call branch penalties start to negate any benefit the windowing provides?

    It's a great idea in theory, but windowing just adds to the complexity of the implementation, taking up real estate that could be better used elsewhere.

    The IA64 has another very important property: It is both PA-RISC 8000 compatible and IA32 compatible. You can thus boot Linux/IA64, HP-UX 11.0, and Windows on an Itanium-powered box.

    Absolutely false: PA-RISC emulation was dropped years ago, before the first implementation, although it was originally planned. Also, HP-UX 11.0, which is PA-RISC only, is not supported on IPF. Only HP-UX 11.20 and later are supported. HP-UX 11.22 is the first customer-visible release of HP-UX on IPF.

    The endianism (bit ordering) is still "little," just like on the IA32, so you don't have to worry about that at all.

    Misleading -- the endianism is still a part of the processor state (i.e. context-dependent). This means it can be both big and little endian, and can switch when an OS switches context. HP-UX, for example, is big-endian on IPF.
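
    For anyone unsure what is at stake when the byte order differs, here is a small Python sketch. It only uses the standard `struct` module; the value `0x0A0B0C0D` is an arbitrary example, and nothing here is IPF-specific:

```python
import struct
import sys

value = 0x0A0B0C0D  # arbitrary 32-bit example value

little = struct.pack("<I", value)  # least significant byte first
big = struct.pack(">I", value)     # most significant byte first

print("little-endian bytes:", little.hex())
print("big-endian bytes:   ", big.hex())
print("this host is:", sys.byteorder)

# Reading data with the wrong byte order gives a different number,
# with no error raised anywhere.
misread = struct.unpack(">I", little)[0]
print("little-endian data read as big-endian:", hex(misread))
```

    Code that writes raw integers to disk or to the network without fixing the byte order is the kind that breaks when the same source is built for a big-endian OS.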

    The rest of the article had generic ANSI C programming tips which everyone knows already -- nothing specific to IPF.

  40. IA64 backward compatible? by phoebe · · Score: 1
    I thought all the fuss over the IA64 was because it was not backward compatible? The chip is used for new HP/UX 11.2 boxes and some big PC servers but there are special versions of Linux / Oracle for that. Is the issue purely performance based? When AMD announced the x86-64 chip named Hammer articles are saying things like:

    "Intel's Itanium processors handle 64 bits, but the Pentium family handles 32 bits."

    "The Hammer family of processors ... will be able to run conventional 32 bit applications ... as well as 64 bit applications"

    The press announcements also got Intel to change its mind and start developing a new 32/64 bit combo chip.

    1. Re:IA64 backward compatible? by smash · · Score: 1
      The IA64 cpus to my knowledge are backwards compatible - but through some form of hardware emulation.

      This is in contrast to the AMD-64 bit architecture in that the AMD cpus retain the full IA32 register/instruction set, and simply add new instructions and a few 64 bit registers.

      This means the AMD cpus run IA32 code much quicker, but the Intel 64 bit cpus are quicker when running native code.

      At least that's how I understand it, in layman's terms.

      smash.

      --
      I run: Windows, OS X, Linux, FreeBSD. Just because you have a hammer, doesn't mean everything is a nail.
    2. Re:IA64 backward compatible? by morbid · · Score: 0

      >This means the AMD cpus run IA32 code much quicker, but the intel 64 bit cpus are quicker when running native code.
      At least thats how I understand it, in laymans terms.

      Wrong! The AMD Hammer runs 64-bit and 32-bit code at the same (full) speed since the 64-bit stuff is a logical and transparent extension of the 32-bit stuff. The 64-bit instruction set is RISCy in that to keep it simple and keep the speed up, they only implemented the simpler and more useful instructions in 64-bits.
      Go to www.x86-64.org and look at what they did to the register set (twice as wide, twice the number of registers, i.e. the 386 registers are the upper-right quarter of the entire register set) etc.
      Oh, and doesn't it have 11 execution units and run at 2GHz?

      --
      I'm out of my tree just now but please feel free to leave a banana.
  41. Two questions by Ignominious+Cow+Herd · · Score: 0

    1) RE: "(Back in the early '80s, nobody at Intel thought their microprocessors would one day be used for servers; the inherent architecture of the i386 family shows that clearly.)" What the heck is he talking about?

    2) The article said that all instructions are assumed to work in parallel thus Explicitly Parallel (EPIC). Isn't that backwards? Wouldn't that be Implicit parallelism? I thought that you bundled instructions together to indicate that they could execute in parallel.

    --
    Lump lingered last in line for brains, and the ones she got were sorta rotten and insane.
  42. You're dead wrong by cameldrv · · Score: 1

    The compiler most certainly can be beaten, even more so today than in the past. I haven't done much asm programming on RISC machines, but on x86, the stuff the compiler puts out is generally garbage. If you're using GCC, it's trivial to beat, as it doesn't know how to deal with the extreme lack of registers on the x86. GCC is constantly going to memory when, with some rearrangement, it's possible to keep many more things in registers. The Intel compiler is much better, but it still isn't hard to beat.

    Furthermore, with the SIMD stuff in the newer x86 processors (MMX, SSE, SSE2), an asm programmer can get huge speedups which the compiler just doesn't know how to exploit. The Intel compiler will use these features in some instances, but far from optimally. Mind you, you have to know the processor well, and for the big wins, you have to optimize for a specific processor, but if you're doing computationally intensive stuff, the gains can be huge.

    1. Re:You're dead wrong by jquirke · · Score: 2

      Umm, we are talking about IA64 here... Have a look at the manual, you might then understand.

    2. Re:You're dead wrong by cameldrv · · Score: 1

      I've read the IA64 manual, I understand, and I've seen the code that comes out of the Intel compiler for IA64. If you're clever and you have enough time you can beat it. Ultimately the issue is that the compiler can only act at the level of description of the code. The compiler is never going to do something like reorganize your data structures to take advantage of a particular instruction. There are many things that the compiler can't assume in the general case that will speed your program up. If you know that it will work in your specific case, you can do it yourself. For example, C guarantees left-to-right evaluation. This may not matter in a lot of cases, but the compiler is required to honor it unless it can prove that it doesn't make any difference.

    3. Re:You're dead wrong by CFN · · Score: 2

      I'll agree that a clever programmer might save a few instructions here or there, but I'll argue that in the age of RISC, and especially EPIC, instruction count does not have any effect on performance.

      With a modern machine executing at 1 GHz, let's assume its throughput is close to 1 billion (10^9, I think these things are different in England) instructions per second.
      So for any normal application (everything except weapon guidance, etc.) that runs for a few minutes, even if you can save 100 million (dynamic) instructions, you are not going to even notice. And just imagine how hard it is to eliminate 100 M dynamic instructions for a real, non trivial, program.
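
      A back-of-the-envelope version of that estimate, as a Python sketch. The 1 GHz clock and one instruction per cycle are the assumptions stated above; the three-minute run time is my own stand-in for "a few minutes":

```python
# Back-of-the-envelope sketch. All figures are assumptions taken from the
# comment above, not measurements.
CLOCK_HZ = 1_000_000_000        # 1 GHz
INSTRUCTIONS_PER_CYCLE = 1.0    # "throughput close to 10^9 per second"
RUN_TIME_SECONDS = 3 * 60       # stand-in for "a few minutes"


def seconds_saved(instructions_removed: int) -> float:
    """Wall-clock time saved by executing that many fewer instructions."""
    return instructions_removed / (CLOCK_HZ * INSTRUCTIONS_PER_CYCLE)


saved = seconds_saved(100_000_000)
fraction = saved / RUN_TIME_SECONDS
print(f"saved {saved:.2f} s, or {fraction:.4%} of the run")
```

      Under these assumptions, removing 100 million dynamic instructions buys about a tenth of a second on a three-minute run.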

      For IA-64, we expect a lot of performance to come from the fact that it can execute many instructions in parallel (that's what Intel is betting on).
      It is much easier for a machine to find ILP, than for a human.

      And I'm not claiming that there will not be a few cases where a programmer could write better assembly than the compiler, but that even an expert assembly programmer will get beat 99 out of 100 (at least) for IA-64.

      As an aside, reorganizing data structures (usually to take advantage of the memory hierarchy) is a very hot research topic right now. Reorganizing algorithms, i.e. loop tiling, etc., has been studied for about 10 years, and is finally beginning to make its way into commercial compilers.
      It was easier to beat a compiler when it was just doing register allocation for 4 GP registers. Now, as the compilers are getting more and more advanced, it is much much harder to do better than them.

    4. Re:You're dead wrong by cameldrv · · Score: 1

      To reply to the first part of your message, yes, you're never going to assembly optimize large parts of your program (I assume that's what you mean by "dynamic" instructions). And it's true that if the program doesn't have a long running time, it probably doesn't matter if you optimize it. However, there are still situations where this kind of thing is very valuable. The last time I seriously got my hands dirty with asm was with a computer vision application that ran on 64 Pentium IIs. The objective was to reconstruct the shape of an object in real-time. This program spent over 90% of its time in a loop which was about ten lines of C++ (C really as that part didn't use any objects). Proper assembly optimization allowed a 400% improvement in overall speed. This was partially due to being able to effectively use MMX, which is very difficult for the compiler, because one has to organize the data in a very specific way to take advantage of MMX.

      Unfortunately now there isn't a good reference for the P4. When I was doing P2 programming, I used the Intel manuals and Agner Fog's site. From these sources it is possible to know the pipeline cold. You can look at code and see almost exactly if you're going to run out of decode bandwidth, whether a branch is easily predictable, whether you're using all the functional units you can, etc. GCC doesn't know any of this. The Intel compiler knows some of this, but it is limited.

      IA64 is a very different assembly language, and perhaps the compiler will be harder to beat. In an SIMD situation, I believe the human has the upper hand, as there are many things the compiler can't do because it can't prove that it will be equivalent. IA-64 is MIMD, so that's not so much of an issue. Ultimately, though the basic issue remains that a compiler has relatively little knowledge and ingenuity, but lots of patience. Compilers have been getting better but processors have been getting more complex. IA-64 changes that equation for sure. However, it's far from clear that IA-64 is going to win in the marketplace anyway, so we will see.

  43. stupid answer by dark-nl · · Score: 1

    The Itanium isn't really a "faster chip". If you count clock cycles, it's actually slower than the mainstream. It gets its speed from better instruction scheduling, so that each clock cycle does more work. The architecture provides for this in several ways:

    • The processor can assume that a sequence of instructions can all be executed simultaneously, unless it is told otherwise. This passes the burden of figuring this out from the processor (which has to do it on every run) to the compiler (which will have to do it only at compile time, and has a lot more information to work with too).
    • The architecture is designed so that conditional jumps are needed less often. Conditional jumps interfere with efficient instruction scheduling by making it harder to predict which instructions should be executed next, so reducing the need for them is a good thing.
    • There are lots of general-purpose registers, which means that fewer instructions are wasted on just shuffling data between the registers and main memory. This is an old trick, but the x86 architecture is even older... when I look at a compiled x86 program, about half of the instructions seem to be that kind of data shuffling. What makes it worse is that main memory is significantly slower than the processor.
    Note that the first point creates a problem mentioned in the article: it relies on the compiler to determine which instructions can be executed together. It is difficult for a compiler to take full advantage of this, particularly when compiling C code. It's an interesting area for experimentation, though.
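
    To give a feel for what "figuring out which instructions can go together" means, here is a toy Python sketch. It is not how a real IA-64 compiler works: the `Instr` type and `split_into_groups` are invented for illustration, only register dependencies are modelled, and instructions are never reordered. It just starts a new group whenever an instruction reads or overwrites a register written earlier in the current group:

```python
from typing import List, NamedTuple, Set, Tuple


class Instr(NamedTuple):
    """Toy instruction: one destination register, some source registers."""
    text: str
    dest: str
    srcs: Tuple[str, ...]


def split_into_groups(program: List[Instr]) -> List[List[Instr]]:
    """Greedily split an in-order program into dependency-free groups.

    A new group starts when an instruction reads (RAW) or rewrites (WAW)
    a register already written in the current group.
    """
    groups: List[List[Instr]] = []
    current: List[Instr] = []
    written: Set[str] = set()
    for ins in program:
        reads_pending = any(src in written for src in ins.srcs)
        rewrites = ins.dest in written
        if current and (reads_pending or rewrites):
            groups.append(current)  # the "stop" between groups goes here
            current, written = [], set()
        current.append(ins)
        written.add(ins.dest)
    if current:
        groups.append(current)
    return groups


program = [
    Instr("r1 = r10 + r11", "r1", ("r10", "r11")),
    Instr("r2 = r12 + r13", "r2", ("r12", "r13")),
    Instr("r3 = r1 * r2", "r3", ("r1", "r2")),
    Instr("r4 = r14 - r15", "r4", ("r14", "r15")),
    Instr("r5 = r3 + r4", "r5", ("r3", "r4")),
]

groups = split_into_groups(program)
for number, group in enumerate(groups, start=1):
    print(f"group {number}:", "; ".join(ins.text for ins in group))
```

    A real compiler also has to reorder instructions, reason about memory accesses and pointers, and fit everything into the hardware's bundle formats, which is where C makes life hard.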
  44. My openMosix software by Anonymous Coward · · Score: 0

    My openMosix software

    My openMosix software

    My openMosix software

    Great PR.