Slashdot Mirror


Alpha 21364 EV7 Specs Released

Jon Carroll writes " HP has revealed their Alpha roadmap today at RDF and the schedule goes as previously planned. Alpha 21364 (EV7) is based on 0.18 micron to be shipped by this year end and EV79 based on 0.13 micron SOI will be up next. EV7 will be at 1.2Ghz while EV79 will be at 1.6Ghz. The Alpha 21364 EV7 chip will have 152M transistors, 1.75MB integrated on-die L2 cache, 32GB/s of network bandwidth, integrated RDRAM memory controller with 8 channels up to 12.8GB/s of memory bandwidth. "

174 comments

  1. Yay by Lord+Squirrel · · Score: 2, Funny

    Alpha Lives! Yay! I can die happy now.

    --

    Lord of the Squirrels, Ambassador to the Moles, Minister of Rodential Information

  2. alpha still lives? by Indy1 · · Score: 3, Interesting

    Wait, i am confused here. I thought Dec was bought out by Compaq, which then butchered Dec and their Alpha technology to the point that Compaq finally sold off what remained of the Alpha to Intel (and a bunch of former Alpha engineers also went to AMD if memory serves correctly). Can any one clarify what really happened to Alpha ? I hope that Alpha sticks around, as i feel its a good archtecture (forgive spelling) compared to the x86 stuff.

    --
    Lawyers, MBA's, RIAA? A jedi fears not these things!
    1. Re:alpha still lives? by Henry+V+.009 · · Score: 4, Informative

      Your sketch was more or less right on. When Compaq sold ALPHA to Intel, they said there would only be one more ALPHA chip. Damn them to hell anyway. ALPHA was the best.

    2. Re:alpha still lives? by jbridge21 · · Score: 2

      The plan was to finish the 21364 because their engineers were already pretty far along, and because it would take a while to move over to Itanic anyway. The design of the EV8 was cancelled, unfortunately :-(

      THE ALPHA IS DEAD! LONG LIVE THE ALPHA!

    3. Re:alpha still lives? by LoRdTAW · · Score: 1

      Your wrong, APHA is the best!

    4. Re:alpha still lives? by Strog · · Score: 1

      Where does this all leave Samsung in this whole mess?

      Will they continue the Alpha line or will this be where it ends?

  3. barf, RDRAM by Indy1 · · Score: 1

    i just noticed the bottom part, "integrated RDRAM memory controller". RDRAM is WAYYYYY too $$$$ to be used in servers, and the latency on it sucks balls as well. I dont understand why they dont go with dual DDR (ala nforce style or intels p4 chipset thats due out next time some year).

    --
    Lawyers, MBA's, RIAA? A jedi fears not these things!
    1. Re:barf, RDRAM by Anonymous Coward · · Score: 3, Informative

      It is an EIGHT channel RDRAM controller though. Compare to the TWO channel RDRAM controller of the i850 for example. That gives the Alpha 4x the memory bandwidth of the i850. RAMBUS and DDR both have their advantages and disadvantages. I doubt that RDRAM would have been used without a good reason - most likely the need for high memory bandwidth. Graham
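
      The channel arithmetic in this comment can be checked with a short sketch. It assumes PC800 RDRAM (16-bit channels at 800 million transfers per second), which is consistent with the 12.8GB/s the story quotes for eight channels. The function names are made up for illustration.

```c
#include <assert.h>

/* Peak bandwidth of one channel in MB/s:
 * (bus width in bits / 8) bytes per transfer * million transfers per second.
 * Assumption: PC800 RDRAM, i.e. width_bits = 16 and mega_transfers = 800. */
unsigned long channel_mb_per_s(unsigned width_bits, unsigned mega_transfers)
{
    assert(width_bits % 8 == 0);   /* whole bytes per transfer only */
    return (unsigned long)(width_bits / 8) * mega_transfers;
}

/* Aggregate peak bandwidth of several identical channels, in MB/s. */
unsigned long total_mb_per_s(unsigned channels, unsigned width_bits,
                             unsigned mega_transfers)
{
    return channels * channel_mb_per_s(width_bits, mega_transfers);
}
```

      Under those assumptions, total_mb_per_s(8, 16, 800) gives 12800 MB/s for the EV7 and total_mb_per_s(2, 16, 800) gives 3200 MB/s for the i850, which is the 4x ratio mentioned above. These are peak figures only and say nothing about latency.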

    2. Re:barf, RDRAM by mfago · · Score: 1
      "RDRAM is WAYYYYY too $$$$ to be used in servers"
      Too expensive for servers? You must be kidding. Ever price an alpha cluster?

      I'm no fan of RDRAM though. Not that I necessarily dislike the technology, but the tactics.

    3. Re:barf, RDRAM by jmv · · Score: 5, Informative

      the latency on it sucks balls

      It does in a PC, where they only put two 16-bit channels so you need two accesses to each bank to fetch the 64-bit bus-width (it's serialization).

      In Alpha, there's no serialization. You've got an eight-channel (16 bit each, unless they use the newer 32-bit wide?) configuration. That means that they are 128 bits wide. In order to get the same performance from DDR, you'd need to have a bus that's 1024-bit wide or something like that, which is not practical...

      I don't like RAMBUS at all, but the industry has to come up with something faster because it's clearly the fastest on platforms where it's used correctly (I don't include the current PC in that category).

    4. Re:barf, RDRAM by LinuxParanoid · · Score: 3, Informative

      You must be buying cheap servers. RDRAM is used in more expensive servers, in part due to the high bandwidth it provides (and also, in part due to engineering decisions made years ago.) 8 channels of RDRAM yields 12.8 GB/sec of memory bandwidth which is certainly more than you get with PCs these days, even PC servers. Then again, the 21364 isn't shipping yet. But I don't think Intel plans on shipping that sort of CPU bandwidth by the end of the year.

      And back to your point about economics of RDRAM, there is money out there that will pay a premium for performance scalability (at least when combined with reliability). About 11 percent of all servers -- command as much as 60 percent of all server revenue.

      I just wonder how it'll stack up performance-wise on this chart versus Power4 and Itanium2.

      But the main reason I suspect one would buy one of these is because you want binary compatibility with all your old high-performance Alpha code that you invested so many man-years in.

      --LP

    5. Re:barf, RDRAM by Indy1 · · Score: 2

      all of you make good points. I didnt stop to consider that a high end alpha solution is massive dollars. Does using multiple channels get around the latency in rdram? One of my friends who is a developer says that applications that do mostly serialized type memory accesses do great with rdram, but unless your app is written to take advantage of rdram, ddr is far better (due to much better latency). I am not a big iron expert, so can anyone comment if your usual type big iron apps depend on latency more vs raw bandwidth? I know with Hard drives, brute STR isnt as godly as random access speeds unless you're doing stuff like streaming video.

      --
      Lawyers, MBA's, RIAA? A jedi fears not these things!
  4. How sad... by Glock27 · · Score: 5, Insightful
    to see Itanium steamroller a much better architecture.

    Alpha is brilliant, too bad it didn't receive the development and marketing dollars it deserved. Compaq should be ashamed.

    Thank goodness AMD is here to take up the slack with Hammer! =)

    --
    Galileo: "The Earth revolves around the Sun!"
    Score: -1 100% Flamebait
    1. Re:How sad... by abdulla · · Score: 1

      With a hammer, or with The Hammer. ;)

    2. Re:How sad... by Anonymous Coward · · Score: 1

      Hammer is not quite something to be happy about. From what I've been told by others x86 is not the best architecture out there, and having a clean break from this to a better architecture would be a good idea. Although, I'm not sure the Itanic is the way to go...

    3. Re:How sad... by ksymoops · · Score: 2, Insightful

      > to take up the slack with Hammer!

      To take up the slack? How could a glorified x86 chip with a broken/inefficient instruction set possibly be better than a chip with a new from-scratch architecture.

      --
      Never put off till run-time what you can do at compile-time. -- D. Gries
    4. Re:How sad... by stripes · · Score: 5, Insightful
      How could a glorified x86 chip with a broken/inefficient instruction set possibly be better than a chip with a new from-scratch architecture.

      Well you have the x86 with basically all the market forces behind it driving huge R&D budgets...that's how the x86 managed to slam the MIPS, SPARC, POWER, and pretty much all the other RISC chips. It doesn't matter that you are basically sticking solid rockets onto a large not-so-aerodynamic brick. It flies.

      That's the past. Now in the present we have the same market forces behind the x86, and a stunningly bizarre new creature called the IA64, which may not be the poster child for "broken/inefficient", but is clearly a great one for "will drive compiler writers over the brink into the spinning abyss of madness". It is definitely stunningly hard to write things for, that's for sure. More so in most cases than figuring out what "RISCops" your x86 instructions are broken into, where they are shoved, how long that takes, and what a better set would be.

      Intel will send you the IA64 instruction set manuals for free. Go take a peek...if your mind is strong. Or you don't mind a bit of gibbering.

    5. Re:How sad... by DarkHelmet433 · · Score: 1

      One thing I can comfortably say after spending quite some time porting OS code to IA64 is "everything you know is wrong". Much of our accumulated knowledge about OS/hardware interaction has gone out the window with this beastie. The same for compilers. RISC made a mess of our accumulated compiler knowledge, and IA64 (VLIW^H^H^H^HEPIC) takes us back to square one again too.

      Itanium is going to take years to reach critical mass. x86-64 is going to be eating its lunch for quite a good while. Especially if Intel dont hurry up and make a version suitable for desktop/workstation use.

  5. Too little to late by synoniem · · Score: 3, Informative

    I used to use Alphas but left the platform 3 years ago because of lack of progress in the development of the Alpha. Especially now that Compaq is dead too, the Alpha is a sitting duck. HP already has PA-RISC and a very good relationship with Intel and their Itanium chip. Too bad!

    1. Re:Too little to late by susehat · · Score: 1, Interesting

      Don't forget that HP killed PA-RISC when they went to do Itanium. The Alpha is very cool. It really doesn't need to be killed off, and HP saw this. Besides, Itanium is lame, it has bad press behind it, and is also a sitting duck. So, the best to the Alpha, since it seems that HP may want to get off the Intel Bandwagon.

    2. Re:Too little to late by jbbernar · · Score: 1

      Yes, the Alphas have lost their main advantage, a high clock speed. Unless the architecture has fundamentally changed, the Alpha can't hope to compete at 1.3-1.7 Ghz.

    3. Re:Too little to late by DiscoBiscuit · · Score: 1
      HP Killed PA-RISC? It ain't dead yet. HP recently released specs for the PA-8800 and it's slated to beat McKinley into the weeds...their own Itanium processor. The roadmap includes an 8900 processor too. As for HP being on the Itanium 'bandwagon', Itanium belongs more to HP from a technology standpoint than it does to Intel from what I understand. I can only imagine that HP decided to let Intel take most of the credit for marketing reasons.

      Viva la PA-RISC!

    4. Re:Too little to late by Bert64 · · Score: 1

      The alpha still has an advantage over competing RISC processors (PA-RISC, Sparc, MIPS etc) in the clock rate department, and when compared to x86.. the alpha always trashed x86 chips at half the clock rate.

      --
      http://spamdecoy.net - free throwaway anonymous email - avoid spam!
    5. Re:Too little to late by Anonymous Coward · · Score: 0
      the alpha always trashed x86 chips at half the clock rate.

      I remember Alphas used to routinely run at double the clock speed of the best Intel chips (not to mention the other RISCs). They were also faster on a per-cycle basis, but not too much (at least on the integer side), and generally quite a bit slower per-cycle than the other RISCs.

      The later Alphas squeezed more out of each cycle, but I don't think enough to make up for the lagging clock speeds (versus x86 at least, where the performance gap has been steadily closing, and may have even inverted -- I don't follow Alpha any more). How sad to see Alphas now trailing so far behind the x86 in clock-speed terms, when they once had such a huge advantage. They still scale much better, though, so with enough CPUs...

    6. Re:Too little to late by be-fan · · Score: 3, Informative

      The cool thing is, (in SPEC fp scores at least, which are decent benchmarks) Alphas at 1 GHz are just about even (10% either way in different tests) with a 2.53GHz P4. And its only about 40% slower on integer code. Clock-speed is nice, but the Alpha had one mad FP architecture!

      --
      A deep unwavering belief is a sure sign you're missing something...
  6. 8 RDRAM channels. by Anonymous Coward · · Score: 0

    Holy cow. 8 RDRAM channels? That is *insane*.

  7. That's what happened by Anonymous Coward · · Score: 0

    Your sketch was more or less right on. When Compaq sold ALPHA to Intel, they said there would only be one more ALPHA chip.

    Damn them to hell anyway. ALPHA was the best.

    1. Re:That's what happened by Anonymous Coward · · Score: 0

      > ALPHA was the best
      >
      It just goes to show, that also hardware eventually needs to go "open-source". Humanity deserves the best solution(s) and that can only be achieved by having the designs and specs fully available to everyone.

    2. Re:That's what happened by Strog · · Score: 1

      Compaq still owns Alpha. Intel just licensed several Alpha technologies

  8. Will it run... by Lokni · · Score: 0, Offtopic

    Now all of that is great and all, but will it run my WINDOZE!!

    1. Re:Will it run... by Chemicalscum · · Score: 1
      Should still run NT. At work we have a alpha box which our IT department is running NT on - Too many dumb MCSE's to run a *NIX. Remember NT is just VMS plus the Windoze GUI so it should be at home on an old DEC box.

      Rumour has it that M$ ported NT to alpha to head off DEC from complaining about it ripping off VMS for NT.

    2. Re:Will it run... by Anonymous Coward · · Score: 0

      It's funny that ridiculous tales like this pop up so often amongst the Slashbots. If NT is really VMS, it must be written in VAX assembler, which, erm, doesn't run on MIPS, x86, PowerPC or Alpha. Hmm. It must also use all of the VAX privilege modes, which, erm, don't exist on MIPS, x86 or PowerPC, and aren't supported by the NT or UNIX PALcode layers on Alpha (they require the VMS PALcode). Hmm. It must also use a single-threaded kernel -- oh, it doesn't? Well, it surely must use the VAX virtual-memory system! No? But wait... aha! It uses the .EXE extension for executables, so it MUST be VMS -- well, or MS-DOS, or DR-DOS, or CP/M, or, erm... may I go home now?

      I always get a good laugh from these conspiratorial "the Earth is really flat!" stories. The sad thing is there are a lot of people who actually believe them.

    3. Re:Will it run... by Slashamatic · · Score: 3, Informative
      NT running on Alpha was probably connected with Cutler (a former VMS architect) who was technical lead for NT.

      In reality NT does have some VMS-like features in the kernel, but it is *not* VMS. If it was it would be a little slower and a BSOD would be strictly mythological.

    4. Re:Will it run... by Slashamatic · · Score: 2
      Actually Macro-32 (the VAX assembler) runs very nicely on an Alpha. It works as a translator there. Otherwise, I agree with you about the PAL support.

      Digital did start a project to get VMS onto other architectures, namely MIPS and INTEL, but they gave up even before the feasibility study was fully completed.

    5. Re:Will it run... by Anonymous Coward · · Score: 0
      Rumour has it that M$ ported NT to alpha to head off DEC from complaining about it ripping off VMS for NT.

      I am not certain about that rumour since the one I read in a book (written by a cloudy minded computer industry journalist) is that some MS higher ups wanted to push Andy Grove of Intel into providing MS with some secret undocumented instructions in the next x86 instruction set. When Andy refused MS quickly went about porting NT to various RISC chips including PPC, SPARC, Alpha, and maybe MIPS(?). They stopped with NT 3.51 on most of the other RISC chips, but kept the port alive on Alpha thanks to the help of a few hundred DEC software engineers. Those engineers were fired more than a year ago and NT 4 was the last Alpha NT release.

  9. I remeber when by Himmit · · Score: 1

    I first read about their Alpha 21xxx something 64 bit processor and it was running at a whopping 300mhz at that time, which I think was around 94 or 95, compared to my measly i486 dx/2 66mhz that was mind boggling fast.. good to see that at least a part of my old favorite chip maker still lives

    1. Re:I remeber when by puto · · Score: 2, Interesting

      Alphas were running at 500 megahertz at the time the P60,90 were out.

      I remember because I almost bought one with the special version of NT for the Alpha. They only cost a small amount more and ran like scalded dogs.

      The only problem was that there was very little peripheral support and huge driver issues. But most NT stuff ran on them and ran real fast.

      AMD is the bastard child of the Alpha.

      Puto

      --
      The Revolution Will Not Be Televised
    2. Re:I remeber when by fatphil · · Score: 1

      You misremember.

      The 21064A was released in October 1993 at 275MHz, according to Bhandarkar, and went into the 3000/900 and 7000/700 systems in mid 1994. Dec didn't reach 300MHz until the release of the 21164 in Sept 1994, which reached systems in 1995. 500+MHz Alphas, such as the one I'm sitting at currently (which has not been booted into NT since about a week after I bought it, thanks to RedHat and more recently Debian), only came later.

      The original Pentiums (60/66MHz) were released in 1993. They were at P6 by 1995.

      So, timewise,
      Intel 60MHz DEC 200MHz
      Intel 166MHZ DEC 300MHz

      Phil

      --
      Also FatPhil on SoylentNews, id 863
  10. Not really Re:barf, RDRAM by ppetrakis · · Score: 3, Informative

    Sure RDRAM is 'slow' when used on PC architecture however on an Alpha which has VERY WIDE memory bus it can actually use all that memory bandwidth. The latency doesnt matter anymore. As for cost. If you are buying one of these you probably had to get the job done 'yesterday' :-)

    Peter

    --
    www.alphalinux.org
  11. Your ideas intrigue me... by ringbarer · · Score: 0, Troll

    ... and I would like to subscribe to your newsletter

    --
    "Why did they cancel my favorite Sci-Fi show? I downloaded ALL the episodes!"
  12. No relevance since HP admitted it will kill it by maitas · · Score: 5, Informative

    After HP's announcement that Alpha is a dead end, this is of no relevance... SADDDLY!!

    http://www.hp.com/hpinfo/newsroom/press/07may02b.htm

    They are dropping Alpha and PA-RISC for Itanium... baaadddd move!!

    1. Re:No relevance since HP admitted it will kill it by BlueFall · · Score: 3, Insightful

      This is kinda weird. Certainly, no new customers (usually corporate/research, i.e. not hackers) would buy a chip that will be discontinued and it looks like HP itself acknowledges that:

      AlphaServer systems will be focused on the Alpha installed base. - from the press release cited above.

      But this also means that of the existing customers, probably only those who can't find another alternative soon will buy the new Alpha. Seems like kind of a harsh thing to do the Alpha. If they (Compaq) released this chip then said that they were stopping the line, that would be one thing, but in this case, they're stopping the line before releasing the chip! This is certainly a bizarre move.

    2. Re:No relevance since HP admitted it will kill it by rodgerd · · Score: 4, Informative

      Digital and Compaq did a bunch of deals with customers, especially in the supercomputer space, that were predicated on the appearance of this iteration of the Alpha architecture - they'd be in breach to the tune of hundreds of millions, perhaps even billions, if they hadn't pushed this out the door. It's not about whether new customers pick it up, it's about not being sued by old customers.

      Furthermore, they've got customers on Tru64 and VMS who have nowhere to move at the moment, but may need more grunt; they'll buy upgrades until they've ported VMS to Itanic and the Tru64 customers have migrated to HP-UX (or give up on the Digital->Compaq->HP fiasco in disgust and move to AIX or Solaris).

      Bear in mind that until fairly recently Digital/Compaq were selling new VAX systems to customers who had VAX/VMS setups that worked just fine and no particular desire to upgrade.

    3. Re:No relevance since HP admitted it will kill it by sl3xd · · Score: 2

      They are dropping Alpha and PA-RISC for Itanium... baaadddd move!!

      One thing you seem to have omitted. The Itanium is a joint HP-Intel processor. HP was intimately involved with the design of the Itanium, and intended it as a replacement for the PA-RISC from the beginning of the design. HP had better 'know-how' in 64-bit RISC, and Intel had the fab facilities to produce the Itanium on a large (and more inexpensive) scale. The Itanium was not, and has never been, intended to be an x86 competitor. It was designed to replace the PA-RISC and to compete with MIPS & SPARC, among others. In fact, originally, the Itanium was supposed to be backwards/binary-compatible with the PA-RISC. (I'm not sure if the final product actually IS, but I lost interest in the Itanium several years ago...)

      It was merely hoped that one day the architecture the Itanium uses would finally replace the x86 architecture.

      The Alpha, I suspect, is somewhat of a white elephant in HP's acquisition of Compaq, and I suspect we can expect to see many of the Alpha's technologies rolled into next-generation Itaniums.

      --
      -- Sometimes you have to turn the lights off in order to see.
    4. Re:No relevance since HP admitted it will kill it by hache_the_boss · · Score: 1

      I hope this happens, but How can you be sure? I really don't think that the X86 architecture will expire sometime... I think that a multiplatform/multichip environmet is the best deal (using Linux and Unix flavors). The open environment is the right deal. About Alpha, I prefer to think that Alpha will survive in two or three years I will like to hear about a new company based in Alpha server running against Cray, Sun and several others... or who knows... maybe in a couple of years, Sun could buy alpha technology and add it to the sparc platform... who knows?? Cheers.-

  13. MOD PARENT DOWN (-1, Propaganda) by Anonymous Coward · · Score: 0

    Get to the basement, troll

  14. The last Alpha? by iankerickson · · Score: 4, Funny

    And while they're at it, they can change the name to "Omega".

    --
    Democracy. Whiskey. Sexy. Pick any two.
  15. *mniam* by zdzichu · · Score: 1

    mummy, mummy, buy mi this!
    Up to 256 GB of ECC memory
    Over 51 GB/s aggregate internal bandwidth
    4 MB or 8 MB ECC memory onboard cache per CPU
    Up to 224 PCI slots on 64 PCI buses

    (the image in linked news announcement has this page (www.compaq.com/alphaserver/index.html) link).

    --
    :wq
    1. Re:*mniam* by lhaeh · · Score: 1

      It begs the question: just what do I put in those 224 PCI slots?

  16. alphas and optimisation by Zurk · · Score: 5, Interesting

    just a short comment on how good the alpha high performance math libraries really are (and the alpha engineers -- may alpha rest in peace).
    I was writing code for a simple matrix transform using the algorithm as follows :
    for (a=0;a<100;a++){for(b=0;b<100;b++){
    txarray[a][b]=oldarray[b][a];}}
    using the alpha libraries to do the transform instead rated me a 10x boost in speed.
    this was weird as i didnt see how the above algorithm could be optimized...tearing apart the assembly i saw :
    for (a1=0;a1<100;a1=a1+10){for b1..{for(a=0;a<10;a++){for(b...

    evidently they had optimised it so that reads and writes would occur from closely spaced regions of memory and less time would be spent writing.
    result ? a 10x boost on a simple algorithm and a neat hack at the same time.
    just an example of how awesome the engineering of the alpha was.
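
    For anyone who wants to try the restructured loop described above, here is a minimal sketch of a blocked (tiled) transpose next to the plain one. It is a reconstruction from the fragment in this post, not the actual Alpha library code: the 100x100 size and the step of 10 come from the post, while the function names, the macros and the double element type are assumptions. Any speedup depends on the machine and its cache sizes.

```c
#include <assert.h>
#include <stddef.h>

#define N 100     /* matrix dimension, as in the loop quoted in the post */
#define TILE 10   /* block edge; 10 matches the a1 = a1 + 10 step */

/* Plain transpose: the reads from src walk down a column, so consecutive
 * reads are N elements apart in memory. */
void transpose_naive(double dst[N][N], double src[N][N])
{
    for (size_t a = 0; a < N; a++)
        for (size_t b = 0; b < N; b++)
            dst[a][b] = src[b][a];
}

/* Blocked transpose: process one TILE x TILE sub-matrix at a time so the
 * source and destination rows in use stay within a small region of memory. */
void transpose_tiled(double dst[N][N], double src[N][N])
{
    assert(N % TILE == 0);   /* this sketch has no partial-tile handling */
    for (size_t a1 = 0; a1 < N; a1 += TILE)
        for (size_t b1 = 0; b1 < N; b1 += TILE)
            for (size_t a = a1; a < a1 + TILE; a++)
                for (size_t b = b1; b < b1 + TILE; b++)
                    dst[a][b] = src[b][a];
}
```

    Both functions produce the same result; only the order in which elements are visited differs.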

    1. Re:alphas and optimisation by Anonymous Coward · · Score: 0

      That is not assembly, Einstein.

    2. Re:alphas and optimisation by Anonymous Coward · · Score: 0

      its C. it was converted from looking at the assembly dimwit.

    3. Re:alphas and optimisation by BlueFall · · Score: 2, Insightful

      This sounds to me like a standard compilers loop unrolling optimization. Almost all modern processors run this kind of code faster. Sounds like you had a cool compiler, though the alpha itself is cool for other reasons.

    4. Re:alphas and optimisation by John+Whitley · · Score: 5, Informative

      No, this isn't loop unrolling at all. This library (and not the compiler, note) is using this scheme to maintain cache-locality. A general rule of optimization is to aggressively utilize the memory hierarchy, be it at the L1/L2 cache level, VM, etc. This means maintaining good data-locality in the algorithm's access patterns at the relevant scales (i.e. cache, VM pages, etc). Failure to manage this (for this example) means a performance hit due to greatly increased cache misses, often in the form of unnecessary loading, dirtying, flushing, reloading and redirtying cache lines continuously during the course of processing. Ideally, one wants to load the cache line once, do all work in the cache, then flush/write back and move on to other data.

      This principle can be seen in how the GIMP stores image data in tiles for rapid processing, in matrix math libraries, in the design of FFTW (The Fastest Fourier Transform in the West, www.fftw.org), and many other systems.

    5. Re:alphas and optimisation by BlueFall · · Score: 2

      Ok, I misread the comment. It looks like this may have been hand-written in the library. Nonetheless, this is definitely loop unrolling -- there is a loop and certain parts of it have been written explicitly. Locality in code and data is indeed part of loop unrolling. While most people think that loop unrolling is only for code locality (i.e. to prevent unnecessary branches), data locality as you already mentioned is in fact handled by certain special purpose compilers through loop unrolling.
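
      For comparison, here is a small sketch of what is usually meant by loop unrolling in the textbook sense. The function names and the unroll factor of 4 are assumptions chosen for illustration. The unrolled version visits elements in exactly the same order as the rolled one and only replicates the loop body, whereas the blocked loop in the parent post changes the order in which memory is visited.

```c
#include <assert.h>
#include <stddef.h>

/* Rolled loop: one add and one loop test per element. */
long sum_rolled(const int *v, size_t n)
{
    assert(v != NULL || n == 0);
    long s = 0;
    for (size_t i = 0; i < n; i++)
        s += v[i];
    return s;
}

/* Unrolled by 4: same visiting order, but the body is replicated so there
 * is one loop test per four elements. */
long sum_unrolled4(const int *v, size_t n)
{
    assert(v != NULL || n == 0);
    long s = 0;
    size_t i = 0;
    for (; i + 4 <= n; i += 4)
        s += (long)v[i] + v[i + 1] + v[i + 2] + v[i + 3];
    for (; i < n; i++)   /* leftover elements when n is not a multiple of 4 */
        s += v[i];
    return s;
}
```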

  17. Only if they support AlphaBIOS Re:Will it run... by ppetrakis · · Score: 1

    Watch the firmware directories on gatekeeper.dec.com for 'nt' firmware on the EV7 boxen. If they release it, It just may run :-).

    Peter

    --
    www.alphalinux.org
  18. I'd rather have an UltraSparc(tm) myself by Anonymous Coward · · Score: 0

    My next machine will be an UltraSparc(tm) based machine.

    Incidentally, Intel(tm) now have the rights to the Alpha(tm) chip.

    1. Re:I'd rather have an UltraSparc(tm) myself by Anonymous Coward · · Score: 0

      StrongARM(tm), without a doubt. I would switch to StrongARM(tm) now, but I don't think that it would be practical for my main machine.

      Once I can afford to build the UltraSparc(tm) machine, and then buy a StrongARM(tm) development board, I'm going to see if I can do an official StrongARM(tm) port of a popular Linux distribution.

  19. nice 64bit by johnjones · · Score: 2

    I would like to see one of these give a specFp result

    I bet that it could cane IA64 in the specInt but the real test would be floating point and to do IEEE754 properly you need 64 bit otherwise you end up emulating it

    now we have the true 64 bit microprocessors

    Sun Microsystems - Processors which are a Sparc

    PA-risc which is MIPS like

    and MIPS64 which I like alot

    of the ports linux to 64bit for linux HPPA and the oldie but goodie linux Alpha and linux sparc64 of course not forgetting linux for IA64 but unfortunately the linux for MIPS is not 64bit so if ever there was a challenge as linux is mostly 64bit clean it's to do a MIPS64 port

    oh and intel wont like to say linux for hammer which is not real 64bit just has some 64bit registers tacked on (but hey you can do fp right ;-)

    1. Re:nice 64bit by Anonymous Coward · · Score: 1, Interesting

      Wait till you see what IA64-2 (Itanium 2) can do for FPU ops its just plain frightening. I'm talking 1.3K+ specfpu2000 score. From what a rep from a major OEM mentioned to me, IA64-3 (Madison) should double that.

      The really interesting thing is the parallels between IA64 and Alpha, that they both sucked hard in their early variants. It is a little known fact that Alpha 21064 cost more than all other 64bit CPU's at the time and performed far worse.

      Oh I should finish off by mentioning that the current IA64 tools for linux suck donkey balls in the performance stakes hopefully this is one area where serious optimisation will be made.

    2. Re:nice 64bit by ToLu+the+Happy+Furby · · Score: 5, Interesting

      I bet that it could cane IA64 in the specInt but the real test would be floating point and to do IEEE754 properly you need 64 bit otherwise you end up emulating it

      x86 processors have had 64-bit floating point registers (actually 80-bit) for as long as they have done native floating point. x86 does not have 64-bit integer registers; this has nothing to do with floating point.

      The reason x86 has traditionally sucked at floating point is because the x87 floating point ISA only allows for a stack of 8 fp registers, instead of a flat set of 32 registers like most RISC architectures. This has been worked around to some degree in current x86 processors through the use of a flat virtual register set and good compilers, although there is only so much a compiler can do when it is limited to 8 target registers. Nowadays the continued leadership in SPECfp by 64-bit RISC chips is mostly due to higher memory bandwidth and particularly large L2/L3 caches which help a great deal with certain SPECfp subtests.

      While not quite as high as its world-beating SPECint scores, the P4's SPECfp scores are still damn good, and would be even better if Intel would officially support PC1066 RDRAM (the current scores on spec.org are PC800 only). Put another way, they will be even better when Intel releases their dual-channel DDR chipset in a few months.

      That said, EV7 will clearly have the SPECfp score to beat for quite some time. (Probably SPECint as well.) And Itanium2's SPECfp scores are reported to vault it well ahead of the also impressive Power4. But, again, this is all to do with higher DRAM bandwidth and larger caches, not with any inherent limitations of x86 for performing double-precision fp.

    3. Re:nice 64bit by Anonymous Coward · · Score: 0

      Yeh, word, it is just like Intel vs. AMD, right now it is... disappointing. But IA64-2... etc... They are thinking seriously of the future... The tools... ok. But the chip is gonna be some serious shit!!!

    4. Re:nice 64bit by AndrewHowe · · Score: 2

      You are right, in that I cannot correct you in any way, but you have to admit... x86 is backward compatible with the 8086*... That may (!) not matter any more but they have done a f##king good job... Credit where it's due and all that... Don't you think?
      * 8088?

    5. Re:nice 64bit by egoots · · Score: 1

      It's not the hardware that is holding back IEEE754 properly, but rather the compilers

      According to W. Kahan (one of the fathers of IEEE754) (see this link to PDF article)

      "The widest precision that's not too slow on today's most nearly ubiquitous “Wintel” computers is not double (8 bytes wide, 53 sig. bits) but IEEE 754 double extended or long double (10 bytes wide, 64 sig. bits). This is the format in which all local scalar variables should be declared, in which all anonymous variables should be evaluated by default. C99 would permit this (not require it, alas), but …

      Microsoft's compilers for Windows NT, 2000, … disable that format.

      Java disallows it.

      Most ANSI C, C++ and Fortran compilers spurn it.

      (Apple's SANE got it right for 680x0-based Macs, but lost it upon switching to Power-Macs.)"
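
      A quick way to see which format a given compiler actually provides for long double is to ask the standard float.h macros, as in the sketch below. The function names are made up for illustration. The 53-bit figure for double holds on IEEE 754 platforms; the long double result depends on the compiler and target, so no particular value is assumed for it.

```c
#include <assert.h>
#include <float.h>

/* Significand bits of double: 53 on IEEE 754 platforms. */
int double_sig_bits(void)
{
    assert(FLT_RADIX == 2);   /* the *_MANT_DIG macros count base-FLT_RADIX digits */
    return DBL_MANT_DIG;
}

/* Significand bits of long double: 64 where it maps to the x87 double
 * extended format, but other values on other compiler/target combinations. */
int long_double_sig_bits(void)
{
    return LDBL_MANT_DIG;
}

/* Nonzero if long double really carries more precision than double. */
int long_double_is_wider(void)
{
    return LDBL_MANT_DIG > DBL_MANT_DIG;
}
```

      If long_double_is_wider() returns 0, the compiler treats long double as plain double, which is the situation the quote complains about.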

    6. Re:nice 64bit by ToLu+the+Happy+Furby · · Score: 3, Insightful

      You are right, in that I cannot correct you in any way, but you have to admit... x86 is backward compatible with the 8086*... That may (!) not matter any more but they have done a f##king good job... Credit where it's due and all that... Don't you think?

      No doubt. If it wasn't clear from my post: the fact that AMD and Intel can get almost equivalent single-CPU SPEC performance (and SPEC is oriented toward workstation/server/HPC workloads!) to the top 64-bit CPUs, despite maintaining backwards compatibility with a much uglier ISA and costing ~50x less, is a huge credit to their engineering teams. As well as pretty strong proof that the fitness of your ISA is much less important than the manufacturing process you use and the engineering resources you have.

      And second, while it of course no longer matters that the P4/Athlon are backwards compatible with the 8086, it mattered hugely that the 286 was, and that the 386 was compatible with the 286, and so on. Tremendously. The immense size of the x86 backwards compatible market has meant that Intel and AMD sell their CPUs in volumes large enough to make owning their own fabs (and keeping them on the cutting edge of process tech) worthwhile...which in turn is what has kept x86 performance so competitive (along with other effects from selling into such a huge market).

    7. Re:nice 64bit by fatphil · · Score: 1

      "PA-risc which is MIPS like"

      Erm:

      PA has a segmented memory architecture, MIPS doesn't. (OK they call it an 'address space identifier', but really it's a segment, as the virtual addressing modes are all 32-bit, unlike MIPS' 64bit.)
      PA has packed decimal types, MIPS doesn't.
      PA has variable bit-field data types, MIPS doesn't.
      PA has 58 SP registers which can be paired for DP, MIPS has a flat DP FP register set.
      Both have branch delay slot but PA has optional nullification, MIPS doesn't.
      PA doesn't have a branch-likely extension, MIPS does.
      PA has conditional moves, MIPS doesn't.
      PA doesn't have a single-instruction divide, MIPS does.

      MIPS-III may have filled in some of the above gaps, but I stopped looking at MIPS at the MIPS-II stage.

      PA was a 'brainiac' design (lots per tick), MIPS was a 'speed demon' traditional RISC design (lots of ticks).

      FatPhil

      --
      Also FatPhil on SoylentNews, id 863
    8. Re:nice 64bit by fatphil · · Score: 1

      "It is a little known fact that Alpha 21064 cost more than all other 64bit CPU's at the time and performed far worse."

      Show us the figures, or shut up.

      The 21064, shipping in 1992, was level (Int & FP) with the PA-7150 from 1994, level to PPCs in SPECInt92 with the 604 from 1995, but thrashed the 604 in FP.

      i.e. the 21064 was _2 years_ ahead of the field.

      The 21064A, shipping in 1994 was superior in Int and FP to every processor by _every_ other RISC manufacturer before 1996 apart from MIPS's R8000 from the same year, which was better at FP, but lousy at Int.

      i.e. the 21064A was _nearly 2 years_ ahead of nearly the whole field.

      My figures from MPR, from vendor SPEC releases.

      Sure, they weren't cheap, but neither were Sparcs, or PAs.

      The ding-dong battle between HP and DEC started after that, and basically DEC would spend 75% of the time top of the SPEC FP tables, but HP would always manage to throw a system into the #1 slot for about 25% of the time. Intel/AMD coming anywhere near either of those, and Power now actually beating it, are relatively new concepts considering the 10-year length of Alpha history.

      FatPhil

      --
      Also FatPhil on SoylentNews, id 863
  20. Can you imagine a beowulf cluster of those? by ethelred · · Score: 0, Funny

    oh no. I didn't just write that, did i?

    --

    Remember: If you buy anything from spammers, you have a small penis.
    1. Re:Can you imagine a beowulf cluster of those? by Tuzanor · · Score: 3, Funny

      Not funny.

    2. Re:Can you imagine a beowulf cluster of those? by evilviper · · Score: 2

      Unlike x86 or MacPPC, you don't NEED to cluster them. Want to have the speed of 200 machines??? Just stick 200 processors in one of them. The SMP abilities of Alpha are absolutely incredible. (not to mention the threading, performance, et al)

      Hey, no need for distributed file systems, expensive high-speed ethernet, etc. It's just too bad Alpha never caught on.

      And for all those that think it's dead, there are still other companies with vested interest in the Alpha.

      --
      Slashdot gets worse every day... Pipedot: News for nerds, without the corporate slant
    3. Re:Can you imagine a beowulf cluster of those? by nuintari · · Score: 2

      yeah, I bet it'd look a lot like this, only faster.

      --

      --Nuintari

      slashdot : where an opinion can be wrong.

    4. Re:Can you imagine a beowulf cluster of those? by Slashamatic · · Score: 2
      I am working on a VMScluster of multiprocessor Alphas. VMSclusters beat the sh*t out of beowulf because of the better cross cluster synchronisation. Digital has been producing clusters since 1980 and they are quite frankly, boringly reliable.

      Our Alphas don't calculate much, they just run the biggest electronic futures and options market in the world (at least the production cluster does). Most of the backend code is even written in COBOL.

    5. Re:Can you imagine a beowulf cluster of those? by th3_l33t_h4x0r · · Score: 1

      Hilarious.

    6. Re:Can you imagine a beowulf cluster of those? by fatphil · · Score: 1

      "Our Alphas don't calculate much..."

      Wanna donate some time to a distributed computing project? :-)

      (I'm one of the few prime-number nuts (Ernst Mayer being the other) who codes stuff for Alpha.)

      FatPhil

      --
      Also FatPhil on SoylentNews, id 863
    7. Re:Can you imagine a beowulf cluster of those? by Slashamatic · · Score: 2
      Actually, the processors keep quite busy, but with other problems like reformatting and comparing data.

      They are also a *long* way in connectivity terms away from the Internet as all trading by the members goes via a private WAN (better control of transaction times).

    8. Re:Can you imagine a beowulf cluster of those? by be-fan · · Score: 2

      Actually, the EV7 is limited to 128 processors per node. Not chicken-feed, but not 200 ;)

      --
      A deep unwavering belief is a sure sign you're missing something...
    9. Re:Can you imagine a beowulf cluster of those? by Anonymous Coward · · Score: 0

      Yeah, I can - and you can bet your 1040 on it...

      I just hope that they upgrade the power grid in
      Anne Arundel County in time to support it...

      You could say it's a Marvelous idea, particularly when each "node" in the Beowulf consists of 32 CPUs arranged in 4 or 8 NUMA partitions and have more memory than most PC's could write to their disks.

      In fact, a Beowulf cluster of, say, 128 "nodes" each consisting of 8 TruClustered Marvels, each having 32 CPUs and 256GB of memory would indeed be interesting. Hopefully, it will tell us something more interesting than "42" :-)
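
      For scale, the numbers in that hypothetical cluster can be multiplied out. This is just the arithmetic from the paragraph above written down in Python; the variable names are made up for illustration and the GB-to-TB conversion assumes binary units (1 TB = 1024 GB).

```python
# Figures taken from the comment above; names are illustrative only.
beowulf_nodes = 128        # "nodes" in the Beowulf
marvels_per_node = 8       # TruClustered Marvels per node
cpus_per_marvel = 32
memory_gb_per_marvel = 256

total_marvels = beowulf_nodes * marvels_per_node
total_cpus = total_marvels * cpus_per_marvel
total_memory_gb = total_marvels * memory_gb_per_marvel
total_memory_tb = total_memory_gb / 1024  # assumes 1 TB = 1024 GB

print(total_marvels, total_cpus, total_memory_gb, total_memory_tb)
```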

      BTW, there are some interesting articles in The Inquirer (http://www.theinquirer.net) about how the Houstonians tried (and, well, succeeded) in killing "DEC" over in Europe. Curly and Shane ought to be ashamed of themselves - they were the architects of the Alpha sell-out. But, they apparently enjoy doing business with a convicted felon. Maybe Carly will convince them that the margins on Alphas are a tad higher than on peecees.

  21. Such a shame.. by popeydotcom · · Score: 1

    Only 'real' techies seem to have any time for Alpha. A company I work at has just thrown some out which ran a development SAP system only 18 months ago. Now it's been recycled by one of my colleagues and runs RedHat.

    Works as a nice door stop too! ;)

    1. Re:Such a shame.. by Neon+Spiral+Injector · · Score: 2

      Not a door stop. A heater. I literally turned off the heat in my room after I bought an Alpha (dual 21164s) this winter. Of course when summer came around, I had to run my air conditioner all the time, and it was getting out of hand (70 year old wiring, 20 amp circuit breaker that was always tripping). So I had to get it co-located in a real data center. It is happy there now.

      Anyway, I have always loved the Alpha and wanted one since I was a boy. But after having one, and finding how poorly they are supported these days, I can't wait to get my dual Opteron system.

  22. TestDrive by SignoffTheSourcerer · · Score: 2, Informative

    They have been available for the compaq testdrive project for a couple of weeks
    cpu Alpha
    cpu model EV7
    system variation Marvel/EV7
    cycle frequency 800000000
    BogoMIPS 2140.20
    platform string Compaq AlphaServer ES80 7/800
    cpus detected 2
    cpus active 2

    This has been restructured a bit to pass through the junk filter as well as condense it to the most important info.

    --
    Ordo Militum Unix.
  23. 0.18? by Anonymous Coward · · Score: 1, Insightful

    Why do they use 0.18 by the end of the year?
    0.13 should be capable of such a chip.
    IBM uses 0.13 already for their power4.
    One of the foundries like UMC or TSMC would be proud to produce the alpha.

  24. Q.E.D. by ringbarer · · Score: 0, Offtopic

    Thus we have come full circle. islam is National Socialism.

    --
    "Why did they cancel my favorite Sci-Fi show? I downloaded ALL the episodes!"
    1. Re:Q.E.D. by AndrewHowe · · Score: 0, Offtopic

      RE: your .sig, I never saw that place before, I can't personally see your copyright problem, but the graphs are f'ing sweet. Thankyou!!!!1

    2. Re:Q.E.D. by AndrewHowe · · Score: 1, Offtopic

      I'd just like to say that I was moderated (unfairly) overrated for that. Burn in hell, moderator...

  25. Re:happy july 4th by Anonymous Coward · · Score: 0

    Just 30 more comments and you'll hit 666 comments posted - do you get a karma bonus for that?

  26. RDRAM? Bye-bye! by Anonymous Coward · · Score: 2, Funny
    The Slashdot nightmare:
    • ALPHA RULEZ! Intel sux0rs!
    • RDRAM sux0rs! Evil patents! Evil company! Evil! Information wants to be w4r3z3d!
    • ALPHA RDRAM ??
    * Sound of Explosion *
  27. Thoughts on the world by Anonymous Coward · · Score: 0

    Religion is a corrupt organized cult created by power hungry men and women who want to control the minds of innocent and vulnerable people to put them in control of what they want on this Earth and how they want other people to live and abide to this

    Religion must die

    1. Re:Thoughts on the world by AndrewHowe · · Score: 2

      Offtopic.. But strangeky true... Word!

    2. Re:Thoughts on the world by Anonymous Coward · · Score: 0

      Before you even bother... k is next to l... fuck off!

  28. Missing feature by red_dragon · · Score: 5, Funny
    152M transistors, 1.75MB integrated on-die L2 cache... integrated RDRAM memory controller with 8 channels...

    They should go all the way and integrate either one of these into the packaging:

    • Heat exchanger for Freon-based cooling system;
    • 1,000,000-CFM fan with exhaust duct (might require special municipal permit to get installed);
    • Chimney;
    • Frying pan.

    Suddenly, Athlons seem mighty cool (literally).

    --
    In Soviet Russia, Jesus asks: "What Would You Do?"
    1. Re:Missing feature by TheMatt · · Score: 2

      Oh, this is true. I use an EV67 for my research and the thing puts XPs to shame. Once, I opened it right after shutdown and the sheer heat coming off the proc was amazing.

      --

      Fortran programmer...oh yeah. Array math for life!

    2. Re:Missing feature by morcheeba · · Score: 1

      We had a room-sized temperature chamber for baking/freezing satellites that used water cooling. When we'd fire it up, we rolled out a fire hose (literally!) into the parking lot, so that it could drain into the sewer. Technically it's not legal to put treated water into the storm sewer system (it's for rainwater, not chlorinated water), so we were quick to roll it back up.

  29. tiling, not loop unrolling by Anonymous Coward · · Score: 1, Informative

    This is not loop unrolling, it's a technique called tiling. The idea is that accesses to your rectangular array are performed in small square sections. This optimizes cache usage during the transform, where sequential access in 1 of the 2 dimensions would otherwise be cache-unfriendly.
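
    The access pattern described above can be sketched as a blocked matrix transpose. This assumes the transform in question is transpose-like; `transpose_tiled` and the `tile` parameter are illustrative names, and in Python the cache benefit itself will not show up, only the loop structure.

```python
def transpose_tiled(matrix, tile=4):
    """Transpose a rectangular list-of-lists by visiting small square tiles.

    Illustrative sketch: in a compiled language each tile would be sized to
    fit in cache, so both the reads and the writes stay local.
    """
    rows = len(matrix)
    cols = len(matrix[0]) if rows else 0
    out = [[None] * rows for _ in range(cols)]
    for i0 in range(0, rows, tile):          # step over tile rows
        for j0 in range(0, cols, tile):      # step over tile columns
            # Work entirely inside one tile before moving to the next.
            for i in range(i0, min(i0 + tile, rows)):
                for j in range(j0, min(j0 + tile, cols)):
                    out[j][i] = matrix[i][j]
    return out


grid = [[r * 7 + c for c in range(7)] for r in range(5)]
flipped = transpose_tiled(grid, tile=2)
print(len(flipped), len(flipped[0]))
```

    The result is the same as a plain row-by-row transpose; only the order in which elements are touched changes.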

    1. Re:tiling, not loop unrolling by morcheeba · · Score: 2

      I'd recommend this book: High Performance Computing. It covers this trick and many others -- if your compiler doesn't do them automatically, then you can hand code it.

  30. GOD BLESS THE ALPHA! by Anonymous Coward · · Score: 1, Interesting
    AlphaServer 1200 and 164/LX user for several years.

    Once again, capitalism destroys superior technology, as the DECHPaq behemoth kills off all its own engineering masterpieces to appease Intel.

    Long live Alpha, long live Alpha/x86 binary translation technology, long live PA-RISC, long live HP instrumentation and calculators. Long live control by the competent, rather than the short term profit-minded!

    May Fiorina and Cappellas be given the softest of pillows to relieve their nightmares of guilt.

    4th July? Why are we celebrating independence from a nation now no less free than our own?

  31. Re:Moderation - The tool of islamic terror by AndrewHowe · · Score: 0, Offtopic

    Although you were off-topic (where can you post such important stuff and be on-topic) I completely agree with you. Although I play chaotic-neutral in general, that is merely an experiment... And I know exactly how to meta-moderate.
    I am a Microsoft user and proud, well, at least not ashamed, to admit it. But there are some things we can agree on. Idiots do us no favours. I welcome, and thrive on, the cut and thrust of intellectual debate. Long live debate! And long may trolls starve, living only on the skinny bones of the lesser creatures.

  32. Re:RDRAM? Bye-bye! by Anonymous Coward · · Score: 0

    What does Alpha do next? ...profit?

  33. Will we see an Itanium like this? by Anonymous Coward · · Score: 0


    Will we see fast RAMBUS controllers, single-chip MP mesh network, in a future version of Itanium, the architecture formerly known as Merced?

    Alpha, you will be assimilated. Resistance is futile.

  34. Can you imagine a beowulf cluster of those? No by The+Creator · · Score: 1

    Considering how many CPU's you can get in a single box it's often not necessary.

    --

    FRA: STFU GTFO
  35. Samsung? by Anonymous Coward · · Score: 0

    I hope there will be a Samsung/generic offering, and that mainboards, etc will be available for hobbyists, open source developers and researchers.

    Mostly because I have sworn to never buy anything ever again from Compaq/HP because of some screw ups I have experienced with ordering Alpha products from Compaq, that 1 1/2 years later, they cannot resolve. I have finally taken the matter to my bank card company, and the Better Business Bureau. Their customer service, fulfillment, and transaction/order tracking suck, but they are good at taking your money. So no Samsung, no more alpha for me, thanks.

    Also, because I am cheap. But I like the Alpha Processor, and OpenVMS had a pretty sophisticated design for its time.

  36. About time. by SlashdotTroll · · Score: 1

    It's news like this what makes Slashdot:

    News for Nerds. Stuff that matters.

    Thankyou chrisd, now please cover the stories on the Warcraft3 Linux porting effort.

    --

    I am the nightmare of nightmares.

  37. Too bad: little Linux support. by Anonymous Coward · · Score: 0
    About the only hope for Alpha these days is to run OS/F, because Linux has been taken over by code that breaks on Alpha. Sadly, even GCC has never materialised the optimisations that every other architecture has, but instead only hands Alpha compiled code of mixed quality.

    I'm not irked. Merely yet still in the mourning stages.

    1. Re:Too bad: little Linux support. by DMDx86 · · Score: 2

      The Compaq C/C++ compilers are available for "free", albeit not GPL (or "open source"). They seem to produce excellent and well optimized code.

    2. Re:Too bad: little Linux support. by Anonymous Coward · · Score: 0
      Compaq has its own compilers for the Alpha? Joy. Be sure to tell the few organisations that distribute Alpha-based binaries to compile with it... seeing as how these compilers are part of every Alpha-Linux distribution. Like I said, little Linux support. Next you'll say that we can still emulate with em86. The attrition rate of support has been staggering in the last year.

      Rather than disseminating good hardware on the cheap as companies get out, Alphas are just going to disappear -- for everyone.

    3. Re:Too bad: little Linux support. by Anonymous Coward · · Score: 0

      How much is Sun or IBM paying you to trash the Alpha?

  38. The Hammer is NOT a good thing... by sl3xd · · Score: 2, Interesting

    While I prefer AMD processors over Intel's, and I have an x86-PC, as I understand the situation, the Hammer is not a good thing in any way. My understanding being that the Hammer is simply an extension of the x86 architecture from 32 to 64 bits. (in a remarkably similar fashion to how the 80386 was a 32-bit extension to the 80286/8086, which was a 16-bit extension to the 8085, 8080, & 8008. I'm not sure if the 8008 was an 8-bit extension of the 4004 or not; the 4004 was a 4-bit processor, and is considered to be the world's first microprocessor.)

    So the x86 architecture/instruction set still has a great deal of commonality with the Altairs running CP/M.

    The 'x86' architecture was only intended to be used for a few years. IBM first extended it from the Altair (8085, 64k) to the PC (8086/8, 1M). The popularity of the PC led to the decision to extend the PC to the AT (80286, 16M). After that, IBM decided that the architecture needed replacement and then tried to kill it. IBM created an entirely new, superior architecture, complete with a new, superior OS. (The PS/2 and OS/2).

    This failed miserably. (Not in small part to the fact it was a 'closed' architecture-- just like Macintosh)

    Instead most of the world chose to stay with the 'x86' architecture (and the more economical clones), maintain backwards compatibility, and deal with its limitations. (I won't say flaws, because the original architecture was never meant to be extended this far to begin with. Of course, that was back with the 8080 and 8085, 64k (max) memory, the Altair, and CP/M.)

    And now, the x86 architecture is one extension upon another, finally arriving at the monstrosity we know today.

    The Hammer (and Intel's 64-bit extension to the Pentium... NOT the Itanium) will be yet another generation of an architecture originally intended to handle no more than 64k of memory.

    It's sick; the best comparison I can think of is if the 'x86' architecture is compared to bare hands, the only tools we have are gloved hands with speed/power assist. No wheel, no lever -- just hands.

    The sooner we kill the x86 architecture, the better. It was ancient 15 years ago. Humanity gave up horses and slaves in favor of automobiles and machinery. We can give up the old x86 architecture for something better. Maintaining it is inhumane.

    But getting Intel, AMD, and others to cooperate (and share valuable, patented technologies with each other) is like asking Microsoft to GPL the source for Windows.

    --
    -- Sometimes you have to turn the lights off in order to see.
    1. Re:The Hammer is NOT a good thing... by Anonymous Coward · · Score: 0

      Well, a bit like what Amiga and Apple did when they switched from 68000 to PowerPC.
      It would take hell of a lot of convincing for Wintel to do so. Unless the need for Speed prevent any evolution above WinXP2008 or something.
      Then maybe they might review their copy.

    2. Re:The Hammer is NOT a good thing... by bbbl67 · · Score: 3, Insightful

      Your characterization about the x86 architecture and PC history is completely wrong. It's one of those tales about an innocent comment that has been passed down from mouth to mouth and along the way the details have gotten embellished completely out of proportion. Having been an assembly language programmer in the x86 world, I know what I'm talking about. There was nothing hard to learn about the x86 instruction set, it was actually quite useful because there were so many complex instructions to choose from that could do some complex tasks with a single instruction. If you didn't feel like learning all of the complex instructions, you could stick to the common instructions to do everything you want.

      Now, the problems with the x86 instruction set that have been embellished out of proportion have all been basically taken care of over the years and fixed, but the bashing still continues. It continues mostly from people who aren't aware the problems have been fixed because they are simply bashers for the sake of bashing. One deficiency about the x86 was its integer register set: it was too small, only 8 general purpose integer registers, and in some of the more complex instructions, only specific GPRs could be used. This has been taken care of by the x86-64 instruction set, they doubled the register set to 16, and these registers are truly general-purpose. Then there is the weirdness of the stack-based floating-point unit: again this has been taken care of because they are using SSE for floating point, which uses random-access floating point registers rather than a stack. Still, there were some advantages to using the stack-based FPU, such as some of the complex floating point instructions you got with it: tangents, sines, cosines, logarithms, etc.

      Now, your knowledge of PC history is woefully inaccurate. When IBM tried to make its powergrab with the PS/2 hardware and OS/2 software architectures, it wasn't trying to get rid of the x86 instruction set. On the contrary, it was getting much deeper into x86 than at any point in its past. With PS/2, it had tried to change away from the ISA bus towards a new generation bus called MCA, without accommodating the existing ISA bus; the shift away from ISA wouldn't successfully take place until many years later when they introduced the PCI bus, which maintained backwards compatibility with ISA. PCI was successful because it allowed a gradual transition away from ISA, MCA on the other hand tried to force everyone to switch away completely all at once. The OS/2 operating system was a similar story, it actually tried to use the x86 architecture in greater depth than any OS previously, by using the 286's new "Protected" operating mode, which gave access to much larger amounts of memory. The only problem was that the 286's Protected mode was not yet full featured, it was more of a running experiment, and it wouldn't become truly useful until the 386 came along and added all kinds of features to Protected mode that allowed for greater flexibility and backwards-compatibility at the same time.

    3. Re:The Hammer is NOT a good thing... by Glock27 · · Score: 4, Insightful
      Sorry I didn't reply sooner, I was away from the keyboard most of yesterday.

      The sooner we kill the x86 architecture, the better. It was ancient 15 years ago. Humanity gave up horses and slaves in favor of automobiles and machinery. We can give up the old x86 architecture for something better. Maintaining it is inhumane.

      This is a silly argument, for two reasons.

      First, almost all programmers can (thankfully) ignore the underlying instruction set and program in a higher level language - therefore it is irrelevant. x86-64 is actually quite an improvement over IA32 regardless.

      Second, if an instruction set is sufficiently efficient to allow the processor to be the fastest microprocessor in the world, it can't be so bad - can it? If my information is correct, Hammer and Opteron will debut with absolutely world-class performance. This isn't so surprising, given that many ex-Alpha engineers are working on it.

      Backwards compatibility is simply a nice bonus, which will be crucial in Hammer attaining critical mass quickly.

      Time to pick up some AMD stock!!! =)

      --
      Galileo: "The Earth revolves around the Sun!"
      Score: -1 100% Flamebait
    4. Re:The Hammer is NOT a good thing... by DarkHelmet433 · · Score: 1

      x86-64 isn't a pure extension to x86. AMD chopped out a *lot* of stuff. In 64 bit mode, segmentation is completely gone, for example. When the OS is in 'long mode' (ie: the OS is 64 bit) then vm86 is gone. real mode is gone. etc. All that is left when running under a 64 bit OS is 64 bit protected mode and 32 bit protected mode. While the 32 bit apps see what looks like "segments", the supervisor side of it is mostly gone.

      From reading the AMD manuals, it looks like somebody wrote up a list of what sucks about x86 from an OS perspective and the design engineers did a damn good job at getting rid of just about everything on the list. There is still some nastiness, but it is a damn sight better than plain x86.

      The x86-64 application view is dramatically cleaned up too.

      And about damn time!

    5. Re:The Hammer is NOT a good thing... by tricorn · · Score: 1

      The main thing wrong with x86 backwards compatibility isn't that the machine code is awkward; you're absolutely right that if you can make it run fast, who cares? One problem is that it makes it difficult to run fast, so it would run faster without the cruft. However, the biggest problem is that it encourages manufacturers to continue producing machines that are basically the same crap as we've always had. IDE, lousy serial ports, the parallel port for gosh sake, the same lousy BIOS architecture, ISA ports and IRQs. The PC world really needs to take the plunge the way Apple did - use a decent boot architecture (hey, maybe they could use Open Firmware!), drop serial ports, go to FireWire/USB. Apple's only mistake was justifiable, going to IDE (due to the ridiculous price differential between SCSI/IDE drives, which was due to a self-perpetuating cycle of being more expensive because it wasn't as widely used).

    6. Re:The Hammer is NOT a good thing... by tricorn · · Score: 1
      the shift away from ISA wouldn't successfully take place until many years later when they introduced the PCI bus, which maintained backwards compatibility with ISA. PCI was successful because it allowed a gradual transition away from ISA

      PCI is not compatible in any sense with ISA. I think you're thinking of EISA, which was the industry response to IBM's attempt to corner the market by patenting various aspects of MCA so that no one else could make compatible devices or systems.

      Another thing he got wrong was that the 8086 chip was not backwards compatible with the 8080 line. It was similar in architecture (limited non-orthogonal register set, awkward instruction set), and there were 8080 -> 8086 cross-assemblers (sometimes producing more than one 8086 instruction for each 8080 instruction), but it wasn't backwards compatible in the same way as 8086 -> 80186/286/386/486/Pentium were.

      Wow, a whole 16 general-purpose registers, my heart flutters. Bah, might as well use a 64-bit extension to the 6502.

    7. Re:The Hammer is NOT a good thing... by bbbl67 · · Score: 1
      PCI is not compatible in any sense with ISA. I think you're thinking of EISA, which was the industry response to IBM's attempt to corner the market by patenting various aspects of MCA so that no one else could make compatible devices or systems
      PCI is compatible with ISA in the sense that it allows an ISA-to-PCI bridge to exist, and the ISA bus acts as a client of the PCI bus. MCA never allowed any sort of backward compatibility. EISA (and later VL-Bus) were directly compatible with ISA (as opposed to bridged compatibility), true.
      Wow, a whole 16 general-purpose registers, my heart flutters. Bah, might as well use a 64-bit extension to the 6502.
      16 GPR's are well within the current norms for RISC processors. Don't forget this is a CISC processor, so more than likely there will be all kinds of hidden internal registers for register renaming available. BTW, the 6502 had a whole 2 GPRs available to it.
    8. Re:The Hammer is NOT a good thing... by bbbl67 · · Score: 1
      IDE, lousy serial ports, the parallel port for gosh sake, the same lousy BIOS architecture, ISA ports and IRQs.

      Hey before you get too high on your anti-establishment high-horse, check some of those facts first.

      IDE is now in use by such high-end server/workstation makers like Sun Microsystems, HP PA-RISC, etc., who use them in their personal workstation lines, because it makes absolute sense both in terms of economics and performance to do so. A workstation just requires a boot disk, and maybe some personal storage space, and IDE does this extremely well. Most large-scale data storage can and should be accomplished off of network-attached or SAN-attached storage devices.

      Don't know what you're complaining about those IRQ's, all systems in the world have something similar in concept to IRQ's. And in fact most systems throughout the world are now standardized on PCI, so they use the same IRQ mechanism as PC's.

      And what about them "lousy serial ports"? That's absolutely essential in maintaining control over large groups of Unix servers. Their consoles are invariably serial-port based. They do have nice modern GUI consoles, but when it comes to stacking them into a server room and controlling them all from a single input/output source, nothing beats the simplicity of a serial console tty device. And since they're X Window or Java based, you can simply do all of your graphical stuff from the comfort of your own PC logging in remotely, but the local administration can be done over non-graphical serial ports.

    9. Re:The Hammer is NOT a good thing... by bbbl67 · · Score: 1
      64 bit mode, segmentation is completely gone, for example.

      Those segments could have been put to some extremely good use in Protected Mode. They basically allowed you to have completely separate code and data segments which never overwrote each other, allowing unprecedented levels of memory protection for applications, not only from other apps but from themselves. It also would make the task of writing OSes easier because the hardware itself could be employed to enforce protection.
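
      The kind of hardware-enforced separation described above can be sketched with a toy model. This is a deliberate simplification in Python, not the real x86 descriptor format: `Segment`, `ProtectionFault` and `translate` are hypothetical names, and the model only checks an offset against a limit plus a write/execute permission.

```python
from dataclasses import dataclass


class ProtectionFault(Exception):
    """Raised where real hardware would trap instead of completing the access."""


@dataclass(frozen=True)
class Segment:
    base: int         # linear address where the segment starts
    limit: int        # highest valid offset inside the segment
    writable: bool
    executable: bool


def translate(seg: Segment, offset: int, access: str) -> int:
    """Return the linear address for an access, or raise ProtectionFault."""
    if not 0 <= offset <= seg.limit:
        raise ProtectionFault("offset outside segment limit")
    if access == "write" and not seg.writable:
        raise ProtectionFault("segment is not writable")
    if access == "execute" and not seg.executable:
        raise ProtectionFault("segment is not executable")
    return seg.base + offset


# Separate code and data segments: code cannot be overwritten, data cannot run.
code = Segment(base=0x1000, limit=0xFFF, writable=False, executable=True)
data = Segment(base=0x8000, limit=0xFF, writable=True, executable=False)

print(hex(translate(data, 0x10, "write")))
```

      In this model a stray write through the code segment, or a jump into the data segment, is rejected before it reaches memory, which is the property the comment is pointing at.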

      In fact, the original Linux was written this way. Linus's original intention for Linux was to design an operating system to see how much of the Intel 386 architecture's features could be used. Obviously considering how quickly he got the kernel designed and running, the Intel architecture made his life very easy. This was in the pre-1.0 days, as of 1.0 and later they shifted to a more generic kernel that could be ported across platforms. But those early pre-1.0 kernels were extremely small and fast.

      When the OS is in 'long mode' (ie: the OS is 64 bit) then vm86 is gone. real mode is gone. etc. All that is left when running under a 64 bit OS is 64 bit protected mode and 32 bit protected mode. While the 32 bit apps see what looks like "segments", the supervisor side of it is mostly gone.

      But I think any 64-bit OS can switch easily between "long" and "legacy" mode, right? So if there is a requirement to use VM86 mode, they can still do so by putting it into a legacy segment?

    10. Re:The Hammer is NOT a good thing... by sl3xd · · Score: 2

      16 GPR's are well within the current norms for RISC processors. Don't forget this is a CISC processor, so more than likely there will be all kinds of hidden internal registers for register renaming available. BTW, the 6502 had a whole 2 GPRs available to it.

      Funny... I seem to remember most RISC processors I've known (or designed) to have at least 32 GPR's.

      Besides... The point of moving away from CISC is so a processor doesn't use over 1/2 its transistors just to decode the instruction. The instruction decode section of the pipeline shouldn't be the single most complex part; unfortunately on a CISC processor, that's where ~50% of the transistors are.

      I'm also fully aware of the 'evolution' of PC architecture. I've been programming x86 asm for quite a while as well. Many of the x86 (even modern ones) ways of doing things are just... inelegant (or ugly)

      --
      -- Sometimes you have to turn the lights off in order to see.
    11. Re:The Hammer is NOT a good thing... by sl3xd · · Score: 2

      Don't know what you're complaining about those IRQ's, all systems in the world have something similar in concept to IRQ's. And in fact most systems throughout the world are now standardized on PCI, so they use the same IRQ mechanism as PC's.

      Exactly true. Although the number and arrangement of the interrupts may be different. I would prefer not to think of how dog slow computers would be if they had to actively poll system devices (from video cards to keyboards). It's sooo much nicer to use an interrupt system.

      And what about them "lousy serial ports"? That's absolutely essential in maintaining control over large groups of Unix servers. Their consoles are invariably serial-port based. They do have nice modern GUI consoles, but when it comes to stacking them into a server room and controlling them all from a single input/output source, nothing beats the simplicity of a serial console tty device. And since they're X Window or Java based, you can simply do all of your graphical stuff from the comfort of your own PC logging in remotely, but the local administration can be done over non-graphical serial ports.

      While not arguing this point in the least, I will say one thing: The way the serial ports are set up on the x86 is a bit messy. The Unix boxen I've worked with had a more elegant system for serial ports. (Although most of them also didn't have the same backwards-compatibility problems x86 has).

      --
      -- Sometimes you have to turn the lights off in order to see.
    12. Re:The Hammer is NOT a good thing... by sl3xd · · Score: 2

      First, almost all programmers can (thankfully) ignore the underlying instruction set and program in a higher level language - therefore it is irrelevant. x86-64 is actually quite an improvement over IA32 regardless.

      Oooh! A higher level language!!!

      So is BASIC! And you can get it for any platform and your code will run.

      Whoopee! It's still dog slow and takes up more resources than is necessary to get the job done. Even compiled (C) code usually runs several times slower and requires more memory than assembler.

      Second, if an instruction set is sufficiently efficient to allow the processor to be the fastest microprocessor in the world,

      First, an instruction set has little to do with the speed of the processor. The whole CISC vs. RISC thing has more than shown that. An instruction set has more to do with the difficulty and/or complexity of the processor's design. The CISC instruction set requires more (electrical) power, and more transistors to do the same job.

      Second, it's to be the fastest in the world? By what method is this measured? Clock speed? Size of the pipeline? Number of pipelines? Clocks per (integer, float, or instruction)?

      The hammer isn't even meant to compete with workstation processors in terms of speed. I'll take a SPARC or Itanium any day. (It's a sad thing that so many seem to forget that the Itanium is an HP design, the successor to its PA-RISC, and that newer versions of the Itanium will include many of the Alpha's technologies).

      --
      -- Sometimes you have to turn the lights off in order to see.
    13. Re:The Hammer is NOT a good thing... by Glock27 · · Score: 2
      First, almost all programmers can (thankfully) ignore the underlying instruction set and program in a higher level language - therefore it is irrelevant. x86-64 is actually quite an improvement over IA32 regardless.

      Oooh! A higher level language!!!

      So is BASIC! And you can get it for any platform and your code will run.

      Whoopee! It's still dog slow and takes up more resources than is necessary to get the job done. Even compiled (C) code usually runs several times slower and requires more memory than assembler.

      You've just proven you have no practical knowledge of software development. Far less than 1% of desktop/workstation/server software is programmed in assembler. Perhaps the inner loop of some game engines might be, but I doubt even that in most cases.

      One of the main points of developing faster processors with large amounts of memory was to enable the use of more programmer-friendly languages. It is simply not worth the cost to develop systems of any size in assembly.

      Finally, if you think C code "usually" runs several times slower than assembler, you're just plain out to lunch.

      Second, if an instruction set is sufficiently efficient to allow the processor to be the fastest microprocessor in the world,

      First, an instruction set has little to do with the speed of the processor. The whole CISC vs. RISC thing has more than shown that. An instruction set has more to do with the difficulty and/or complexity of the processor's design. The CISC instruction set requires more (electrical) power, and more transistors to do the same job.

      The instruction set (and associated issues like register count) certainly does have an effect on speed. Next!

      Second, it's to be the fastest in the world? By what method is this measured? Clock speed? Size of the pipeline? Number of pipelines? Clocks per (integer, float, or instruction)?

      I'll settle for SPEC2000 benchmarks. You know, real world codes optimized to the hilt for the target processor.

      The first Hammer is supposed to debut at a PR 3400, and the first Opteron with a PR 4000. Multiply the current Athlon SPEC scores by the ratios of the PR numbers...that should give you a good idea of what's to come.
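
      The scaling suggested above is simple proportional arithmetic. A minimal sketch, assuming a purely linear relationship between PR rating and benchmark score (which real hardware need not follow); the function name and the sample score and rating are hypothetical, not published figures:

```python
def scale_spec_score(current_score, current_pr, target_pr):
    """Scale a benchmark score linearly by the ratio of two PR ratings.

    This is only the back-of-the-envelope estimate described in the
    comment; it assumes performance tracks the PR number exactly.
    """
    if current_pr <= 0:
        raise ValueError("current_pr must be positive")
    return current_score * (target_pr / current_pr)


# Hypothetical inputs: a score of 700 measured at a PR rating of 2100.
estimate_pr3400 = scale_spec_score(700.0, 2100, 3400)
estimate_pr4000 = scale_spec_score(700.0, 2100, 4000)
print(round(estimate_pr3400, 1), round(estimate_pr4000, 1))
```

      Any real projection would also have to account for cache size, memory latency, and compiler differences, none of which this ratio captures.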

      The hammer isn't even meant to compete with workstation processors in terms of speed. I'll take a SPARC or Itanium any day.

      You are absolutely incorrect. First off, Athlon MP and Xeon are already workstation solutions, albeit 32-bit.

      Secondly, the Opteron versions of Hammer (with dual memory controllers, more than 2-way capability and large cache) are squarely aimed at high-end workstation and server applications, up to at least 8-way. Do some homework and you'll see this to be the case. Dell recently announced that it's skipping Itanium 2, and evaluating Hammer/Opteron.

      (It's a sad thing that so many seem to forget that the Itanium is an HP design, the successor to its PA-RISC, and that newer versions of the Itanium will include many of the Alpha's technologies).

      It is a dual HP + Intel design, and so far it has been a colossal dud by anyone's measure. With poor backwards compatibility and anemic performance, it is very vulnerable to Hammer, if AMD can pull it off. So far Hammer is looking great! Working silicon has been demoed, and things look on track for a 4Q release of the first Athlon-64 (desktop Hammer). Opteron will follow 1Q 2003.

      (BTW, it'll be interesting to see how Intel spins the low clock speeds of Itanium. THAT will require some chutzpah! I hope AMD nails Intel on that score.)

      --
      Galileo: "The Earth revolves around the Sun!"
      Score: -1 100% Flamebait
    14. Re:The Hammer is NOT a good thing... by DarkHelmet433 · · Score: 1

      Actually, when I said "segmentation is gone", I over simplified. Segments for cs/ds/es/ss are essentially hardwired to start at zero, with a 64bit "limit". fs/gs however are still alive and simply have a floating base and limit count. ie: you can still use them for thread-local-storage. All the protection mechanisms are gone though.
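
      The segment behaviour described above can be sketched as a tiny address calculation. This is a simplified model, assuming only what the comment states (cs/ds/es/ss based at zero, fs/gs with a movable base); the function and parameter names are hypothetical, and limits, canonical-address checks, and paging are deliberately left out:

```python
MASK64 = (1 << 64) - 1  # addresses wrap at 64 bits in this model

FLAT_SEGMENTS = {"cs", "ds", "es", "ss"}


def linear_address(segment, offset, fs_base=0, gs_base=0):
    """Return the linear address for a memory reference in 64-bit mode.

    cs/ds/es/ss are treated as having a base of zero, so the offset is
    the address. fs/gs add their own base, which is what makes them
    usable for thread-local storage.
    """
    if segment in FLAT_SEGMENTS:
        base = 0
    elif segment == "fs":
        base = fs_base
    elif segment == "gs":
        base = gs_base
    else:
        raise ValueError(f"unknown segment register: {segment}")
    return (base + offset) & MASK64


# Two threads with different fs bases see different storage at the same offset.
thread_a = linear_address("fs", 0x10, fs_base=0x7000)
thread_b = linear_address("fs", 0x10, fs_base=0x9000)
print(hex(thread_a), hex(thread_b))
```

      The point of the sketch is that the same offset through fs lands in a different place per thread, while the other segments no longer translate anything.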

      Separation and protection of code and data etc is done at the page level.

      Regarding 'long' and 'legacy' modes. You either have a 64 bit OS (long mode), or you have a legacy OS (looks just like x86).

      long mode has a 64 bit OS and supervisor model. However, long mode allows *applications* to run in either flat 64 bit mode, or an emulated 32/16 bit protected mode. It isn't true traditional protected mode, but it is enough for applications. In this application mode, you still do have segments, protection is enforced etc, but you are really running on 64 bit page tables etc where you simply cannot generate 64 bit memory references and cannot use any of the 64 bit instructions or registers. If your application traps or makes a syscall, the OS handles it in 64 bit mode. Since the supervisor part of this model is gone, things like vm86 and switching to real mode are not possible.

      ia64 does something very similar for x86 emulation, except that the simulated internal segmentation protection mechanisms are even weaker. For example, on ia64, you can edit your GDT and change your %cs etc as you please. It just simply doesn't do anything interesting because you are mapped onto a 64 bit address space. There are no "privileges" granted by the segmentation system in this mode.

      Hammer's 'OS legacy mode' makes it look and feel just like an x86.

      In theory you could switch between 'OS legacy mode' and '64 bit OS mode' but you really wouldn't want to. It is expensive, and the supervisor interface is radically different. It would be tremendously expensive to do so regularly. It just isn't worth it. To hell with vm86 and real mode code!

    15. Re:The Hammer is NOT a good thing... by sl3xd · · Score: 2

      You've just proven you have no practical knowledge of software development. Far less than 1% of desktop/workstation/server software is programmed in assembler. Perhaps the inner loop of some game engines might be, but I doubt even that in most cases.

      If desktop computers accounted for more than a tiny fraction of the whole computer market, I might actually care about that statement. Fortunately, the vast majority of computers are embedded systems, and a substantial portion of embedded code is pure asm.

      One of the main points of developing faster processors with large amounts of memory was to enable the use of more programmer-friendly languages. It is simply not worth the cost to develop systems of any size in assembly.

      No, that's the software designers point of view. The hardware designers point of view is to maintain the performance of software written by overworked programmers who don't have the time to do it right.

      Finally, if you think C code "usually" runs several times slower than assembler, you're just plain out to lunch.

      First off, C code does execute several times slower than assembler. On the order of 5-10x is typical. Compilers really aren't that wonderful.

      --
      -- Sometimes you have to turn the lights off in order to see.
    16. Re:The Hammer is NOT a good thing... by Glock27 · · Score: 2
      If desktop computers accounted for more than a tiny fraction of the whole computer market, I might actually care about that statement. Fortunately, the vast majority of computers are embedded systems, and a substantial portion of embedded code is pure asm.

      Er, wait a sec. This discussion was about Hammer, vis a vis Itanium, SPARC, etc. Remember?

      None of them are aimed at the embedded market.

      No, that's the software designers point of view. The hardware designers point of view is to maintain the performance of software written by overworked programmers who don't have the time to do it right.

      Don't worry. Even with the widespread use of high level languages, computers can do far more now than a few years ago - or are you claiming you could run Quake 3 on a 386, if it were written in assembly? ;-)

      First off, C code does execute several times slower than assembler. On the order of 5-10x is typical. Compilers really aren't that wonderful.

      Except for pathological cases, C code will run a few percent slower than hand-tuned assembler - if that. As I said, out to lunch...

      --
      Galileo: "The Earth revolves around the Sun!"
      Score: -1 100% Flamebait
    17. Re:The Hammer is NOT a good thing... by tricorn · · Score: 1

      The only reason IDE is cheap is because everyone uses it, because it is cheap. Thus, we're stuck with it. FireWire is one possible way out of it.

      As for IRQs, I wasn't suggesting that computers shouldn't have interrupts, but that the PC architecture for handling them is lousy. PCI indeed handles things much better - but the PC is still saddled with the old architecture. In the meantime, the Mac had NuBus ages ago, and never had any such problems.

      As for serial ports, again I wasn't saying that serial ports are bad, I'm saying that PC serial ports are lousy. Taking the Mac again as an example, they had great serial ports, could do RS-422 as well as pseudo-RS-232, could handle asynch and synchronous, had a larger internal buffer than standard PC serial chips. Serial connections still have their uses, but for most home use, they are utterly useless now.

    18. Re:The Hammer is NOT a good thing... by tricorn · · Score: 1

      If that's compatible, then S100 is compatible with VMEbus is compatible with NuBus is compatible with ISA/EISA/PCI. Just because IBM never licensed their MCA patents to anyone to make an ISA/MCA bridge doesn't mean it couldn't have been done. IBM was dumb in the way they handled MCA, but that doesn't make PCI "compatible" with ISA.

      I know the 6502 had a limited set of registers. That was my point. It simplifies the architecture immensely, which means you can do load/operate/store very quickly. If you're going to have a limited number of registers, might as well go all the way. Besides, the 6502 effectively had 256 registers in low memory; in a modern version of the 6502, those locations would certainly be implemented on-chip.

    19. Re:The Hammer is NOT a good thing... by sl3xd · · Score: 2

      The discussion is about Hammer vs (insert processor)

      But all are (or will be used) in embedded system design anyway, so that's where my train of thought was leading. The Hammer mainly has 'momentum' going for it. Just about everything else is against it.

      First, the Hammer is a design of a few orders of magnitude more complex than anything else ever attempted. The engineers at DEC dropped the VAX processors and designed the Alpha to avoid the same complexity issues the Hammer is trying to tackle.

      First, it uses the x86 set, which has both more instructions, and more complexity (some would say features) per instruction than a pure RISC processor. About half the Athlon's design is just to decode the instructions it's given. After decode from x86 into its internal RISC structure, it then schedules the pipeline, and finally actually sends the data into the appropriate pipeline for execution. There is a huge amount of overhead just to decode what needs to be done.

      Pure RISC designs use about 15% of the chip's transistors for decode, and that's if you include pipeline scheduling.

      This is the crux of the problem for AMD's hammer. The hammer will be forced to use a much larger transistor count than its RISC competitors. The higher transistor count results in several problems: It's far more complex and expensive to design. It takes a more complicated and expensive process to fab. The die is larger, which results in a slower processor. And it uses more power.

      Which means that while AMD may have some momentum going for it, the Hammer is far more costly to design and produce than its competition. This will make things very hard for AMD; especially if Intel is able to use its (considerably greater) resources to get computer makers to move from x86 to IA-64 at the same time they move from 32 to 64-bit.

      And since HP/Compaq, Dell, Gateway, Micron, and IBM have all thrown in with IA-64... Things look grim for the Hammer.

      The good thing is that I would bet that the RISC back end to the hammer is designed so it can be mated with an IA-64 interface should the x86-interface core not take off.

      or are you claiming you could run Quake 3 on a 386, if it were written in assembly? ;-)

      Not exactly a fair comparison, given that 3D Acceleration was a rather expensive solution back in the days of the '386. Most of which were used for military flight sims. The accelerator was about the size of a refrigerator, and connected to the 'host' computer, which was usually a SPARCstation @ 33 MHz. (At least in the case of sims made by Evans & Sutherland, which had the market for them pretty much cornered)

      However, I wouldn't be too surprised that the non-graphics portions of Q3A would run fairly well (but not great) on a '386 ( if it had a '387 FPU as well.).

      I'll say this: Without a dedicated 3D card, it would take a Power4 module to tackle Q3A at max settings. (Of course, a Power4 module isn't a single processor-- it's 8 processor cores roughly analogous to a PowerPC. And just one of the current Power4 cores outruns AMD's 'best-case' specs for the Hammer (which is still in development).)

      Except for pathological cases, C code will run a few percent slower than hand-tuned assembler

      More the opposite; except for pathological cases, C code runs a few hundred percent slower than assembler. (Although on the IA-64 architecture, this is not necessarily true, as it relies entirely on the compiler to explicitly state operation order. The IA-64 does not re-order operations (or do any pipeline scheduling) at all, which is one of the primary reasons the Itanium runs x86 code so slowly.)

      And, the IA-64 arch. is about the only one out there where a C compiled program stands a fair chance against pure asm, (since it requires the pipeline scheduling to be explicitly stated by the programmer, which is an extremely difficult task for mere mortals.)

      Frankly, I'm not about to argue any further. The asm vs c/compiled is older than vi vs. emacs; except the 'vi vs. emacs' doesn't have much impact on the speed of programs written in it. I know from my own experience how much faster assembler is than compiled languages. I stand by my numbers. So do all the hardware engineers I know, including a couple who have Ph.D.'s in compiler design.

      C is great because it compiles well, and is cross-platform. Asm doesn't require THAT much more development time than C does... But ASM is so device specific that unless you're writing software for a driver or embedded devices, the advantage of C's more portable nature outweighs ASM's speed.

      I mean... think of what Carmack would give up if he wrote his graphics engines in asm: NO cross-platform capability, a nightmare interfacing with graphics card drivers, and almost no flexibility in the graphics engine.

      And since the advantages of ID's graphics engines have always been broad platform and hardware support, and extreme graphics engine flexibility, he'd lose a significant part of his market if he wrote it in asm. Plus, other companies would have graphics engines that, while somewhat slower, would be available far sooner than an asm implementation.

      --
      -- Sometimes you have to turn the lights off in order to see.
    20. Re:The Hammer is NOT a good thing... by Glock27 · · Score: 2
      But all are (or will be used) in embedded system design anyway, so that's where my train of thought was leading. The Hammer mainly has 'momentum' going for it. Just about everything else is against it.

      They are years (possibly decades) away from widespread use as embedded processors. Given their capability and memory sizes, again it will be far less than 1% programmed in assembler. Most likely Java will be the dominant embedded language by then.

      I don't have time to rehash this entire argument, but I will touch on one point:

      This is the crux of the problem for AMD's hammer. The hammer will be forced to use a much larger transistor count than its RISC competitors. The higher transistor count results in several problems: It's far more complex and expensive to design. It takes a more complicated and expensive process to fab. The die is larger, which results in a slower processor. And it uses more power.

      Hammer has a substantially smaller die than P4, its main competitor. Itanic, er Itanium, not only has a large die, but is priced with extreme margins. It's an easy target for Hammer. There is the issue of OEM support, but if Hammer meets spec it will be in high demand.

      More the opposite; except for pathological cases, C code runs a few hundred percent slower than assembler.

      Why don't you go to the Usenet group comp.compilers and state that "Except for pathological cases, C code runs a few hundred percent slower than assembler".

      The resulting blood bath should be amusing. ;-)

      Let me know if you do it, I want to watch...

      --
      Galileo: "The Earth revolves around the Sun!"
      Score: -1 100% Flamebait
    21. Re:The Hammer is NOT a good thing... by bbbl67 · · Score: 1
      IBM was dumb in the way they handled MCA, but that doesn't make PCI "compatible" with ISA.
      No doubt today if IBM had to do MCA all over again, then they would've implemented an ISA bridging chip. But they didn't and that's what killed them. Besides, I don't think that back when MCA was created, that the level of technology existed to really create the quality of ISA bridges that could be had when PCI came out.
    22. Re:The Hammer is NOT a good thing... by bbbl67 · · Score: 1
      Funny... I seem to remember most RISC processors I've known (or designed) to have at least 32 GPR's.
      UltraSparcs are 16 GPR's, so were the last bunch of PA-RISC processors I looked at (PA-RISC 7200's).
      The instruction decode section of the pipeline shouldn't be the single most complex part; unfortunately on a CISC processor, that's where ~50% of the transistors are.
      Doesn't seem to have done them much harm in getting good performance out of these processors. Besides, over 50% of the processor die isn't the decode units, it's the caches.
    23. Re:The Hammer is NOT a good thing... by bbbl67 · · Score: 1
      Exactly true. Although the number and arrangement of the interrupts may be different. I would prefer not to think of how dog slow computers would be if they had to actively poll system devices (from video cards to keyboards). It's sooo much nicer to use an interrupt system.
      About the only problems with the PC IRQ system in the past has been that: (1) you had to manually choose an IRQ on a plugin device with a jumper, (2) you had to know enough to avoid using the same IRQ twice, and (3) you couldn't share an IRQ between two or more devices.

      All of these had been taken care of by various technologies (mostly introduced with PCI). (1) Plug'n'Play eliminated having to set jumpers on the cards. (2) Again, P'n'P chose the appropriate IRQ for you, thus eliminating conflicts, but also PCI has the level-sensitive IRQs which allow multiple devices to be multiplexed onto the same IRQ number, which also answers point #(3).
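
      The IRQ sharing that level-sensitive interrupts make possible can be sketched as follows. This is a toy simulation, not driver code: the `Device` and `SharedIrqLine` classes and their methods are hypothetical names, and the model assumes each handler can tell whether its own device raised the request:

```python
class Device:
    """A device that can assert a shared, level-sensitive interrupt line."""

    def __init__(self, name):
        self.name = name
        self.pending = False  # True while this device holds the line active

    def raise_interrupt(self):
        self.pending = True

    def service(self):
        """Handle and clear this device's request; True if it had one."""
        if not self.pending:
            return False  # not ours, let the next handler look
        self.pending = False
        return True


class SharedIrqLine:
    """One IRQ number multiplexed between several devices."""

    def __init__(self):
        self.devices = []

    def attach(self, device):
        self.devices.append(device)

    def asserted(self):
        # Level-sensitive: the line stays active while ANY device is pending.
        return any(d.pending for d in self.devices)

    def dispatch(self):
        """Run every handler until the line drops; return who was serviced."""
        serviced = []
        while self.asserted():
            for device in self.devices:
                if device.service():
                    serviced.append(device.name)
        return serviced


line = SharedIrqLine()
nic, sound = Device("nic"), Device("sound")
line.attach(nic)
line.attach(sound)
nic.raise_interrupt()
sound.raise_interrupt()
print(line.dispatch())
```

      Because the line is sampled as a level rather than an edge, a second device's request is not lost while the first is being handled, which is why the dispatcher can simply keep walking the handler list until the line goes quiet.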

      While not arguing this point in the least, I will say one thing: The way the serial ports are set up on the x86 is a bit messy. The Unix boxen I've worked with had a more elegant system for serial ports.
      Well, the Unix boxes all use RS-232 serial interfaces just like PC's, so there shouldn't be much, if any difference. But the one major difference is that PC BIOSes don't redirect their console output through the serial ports like Unix boot firmware does.
    24. Re:The Hammer is NOT a good thing... by bbbl67 · · Score: 1
      The only reason IDE is cheap is because everyone uses it, because it is cheap. Thus, we're stuck with it. FireWire is one possible way out of it.
      Well, whose fault is it that IDE became cheap and SCSI didn't? It's the SCSI manufacturers' fault, for trying to keep everything about it artificially high priced. The electronics in a SCSI drive are no more complicated than the electronics in an IDE drive; in fact, in many cases it's the exact same drive with a slightly different connector and slightly different firmware that does the same job whether it is IDE or SCSI.

      Firewire will never take over from IDE (or even SCSI) as a disk drive standard, simply because there is no need to. IDE is already much faster than Firewire, it's more or less Plug'n'Play (using cable selects), and IDE is evolving towards a high-speed serial standard, Serial-ATA.

      Taking the Mac again as an example, they had great serial ports, could do RS-422 as well as pseudo-RS-232, could handle asynch and synchronous, had a larger internal buffer than standard PC serial chips. Serial connections still have their uses, but for most home use, they are utterly useless now.
      Yes, yes, the Mac was always superior to the PC, that's why it was such a raging success.
    25. Re:The Hammer is NOT a good thing... by bbbl67 · · Score: 1
      Separation and protection of code and data etc is done at the page level.
      Which is not surprising because that's the direction that x86 memory protection had evolved to anyways, under most OSes.

      Say, what size can the pages get in Hammer anyways? I assume that they start at 4K and go up to something from there.

      long mode has a 64 bit OS and supervisor model. However, long mode allows *applications* to run in either flat 64 bit mode, or an emulated 32/16 bit protected mode.
      In Protected mode segmentation, you had up to 4 levels of privileges (with ring 3 being where applications usually resided with the least privileges, and ring 0 where the OS resided with the most privileges). Most OSes never used anything other than ring 0 and ring 3, ignoring rings 1 and 2. Can this emulated 16/32-bit protected mode emulate all of the original segment ring levels, or does it just treat it as simply privileged vs. non-privileged?
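
      For reference, the ring comparison in classic protected mode can be sketched as below. It models only the data-segment rule of 32-bit x86 (access is permitted when the segment's DPL is numerically at least both the CPL and the selector's RPL) and says nothing about how Hammer's compatibility mode treats rings; the function name is hypothetical:

```python
def data_segment_access_allowed(cpl, rpl, dpl):
    """Classic x86 protected-mode check for loading a data segment.

    cpl: current privilege level of the running code (0 = most privileged)
    rpl: requested privilege level carried in the segment selector
    dpl: descriptor privilege level of the target segment
    """
    for level in (cpl, rpl, dpl):
        if level not in (0, 1, 2, 3):
            raise ValueError("privilege levels must be 0..3")
    # The less privileged (numerically larger) of CPL and RPL must still
    # be privileged enough for the segment.
    return max(cpl, rpl) <= dpl


# Ring 3 application code reaching for a ring 0 (kernel) data segment.
print(data_segment_access_allowed(cpl=3, rpl=3, dpl=0))
# Ring 0 kernel code reaching for a ring 3 (user) data segment.
print(data_segment_access_allowed(cpl=0, rpl=0, dpl=3))
```

      An OS that only uses rings 0 and 3 reduces this comparison to a plain privileged vs. non-privileged test.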
      ia64 does something very similar for x86 emulation, except that the simulated internal segmentation protection mechanisms are even weaker. For example, on ia64, you can edit your GDT and change your %cs etc as you please. It just simply doesn't do anything interesting because you are mapped onto a 64 bit address space. There are no "privileges" granted by the segmentation system in this mode.
      You know, I always assumed that IA64, despite it being an x86-emulator, was at least a full x86-emulator. So I had assumed that it could be made to boot up DOS if you ever so wanted to. But this makes it sound like they haven't bothered to emulate Real Mode or anything under IA64.
    26. Re:The Hammer is NOT a good thing... by sl3xd · · Score: 2

      Hammer has a substantially smaller die than P4, its main competitor.

      Again, that's comparing a fab tech that is in the near-future compared to one that's been used for over a year. Not a fair comparison by any means. It's like saying that the Athlon has a smaller die than the K6. Completely different chip generations.

      And since both Intel and AMD are working together (with about every other semiconductor maker) on researching new fab techs, you can bet Intel will have the same fab tech of the Hammer. (I do know the Itanium II uses a 0.09 micron fab tech, which is unprecedented for the scale.)

      Supposedly the Itanium was (more or less) a rushed release (similar to the PowerPC G4). The Itanium II seems to have improved by a few orders of magnitude in efficiency, as well as speed. For that matter, the PowerPC G5 (which is not being rushed out) specs about 2x faster than IBM's Power4 core.

      And, remember, as I said before, the Itanium is currently targeted at the Workstation/high-end server market; NOT the PC market. When I say workstation, I mean "ultra-high performance, ultra-high stability (and typically, ultra-high cost)" market. The Itanium is priced similarly to the primary competitors in the arena, those being UltraSPARC, Power, Alpha, and PA-RISC. The first-gen Itanium is not (and was never intended) to be anywhere near your local consumer electronics store (or your local system builder, for that matter).

      The Athlon MP is not real workstation class by any stretch of the imagination. No competent engineer even trusts the architecture with critical tasks. I have yet to see anybody design computer hardware (or vehicles, or perform complex simulations, scientific calculations, or true enterprise-level work) on x86 hardware. The hardware, while cheap, still crashes far, far too often... it doesn't have anywhere near as good of a memory (and system) architecture... the list goes on and on.

      The reason PC's are used for 'render farms' are because they're so cheap. If a computer crashes, then they just have to re-boot it and re-render the current frame (losing only a few hours work at most, and even then in a relatively non-critical task).

      To be short: Sun dominates the workstation market, followed by HP, IBM, and SGI. None of their workstations (with exceptions to SGI's lowest-cost graphic workstations), run x86. That's over 95% of the workstation market.

      There is the issue of OEM support, but if Hammer meets spec it will be in high demand.
      Unquestionably. However, that doesn't mean it will be successful. AMD once made the world's most popular RISC processor (hands down). It literally blew everything else away in terms of sales. AMD discontinued production because, in spite of very high demand for the hardware, they couldn't come close to competing with the other architectures (or, to be more specific, although hardware makers loved it, nobody wrote software for it.)

      If the Hammer isn't compatible with IA-64 compiled binaries, then AMD will have to fund the development of Hammer-compiled versions, as software developers, following the money, will support IA-64 first. AMD has done this in the past already, but had to give up because it wasn't profitable. (Not coincidentally, it's the same RISC processor that was in such high demand that was the source of this headache).

      Why don't you go to the Usenet group comp.compilers and state that "Except for pathological cases, C code runs a few hundred percent slower than assembler".

      The resulting blood bath should be amusing. ;-)


      As I said, the assembler vs compiled fight is quite long running. Stating asm vs compilers arguments in a compiler newsgroup would get a similar response to a windows user extolling the virtues of WinXP in a Mac (or linux) group.

      And, unsurprisingly, stating that C is anywhere near as efficient as pure asm in an assembly newsgroup would be a bloodbath as well.

      The main argument for using C is that it is generally faster software development, and generates code that is 'acceptable'.

      Pure asm takes more time to develop, but results in significantly tighter/faster code. The L4 microkernel is a great example of this: The C implementation is much slower than the asm implementation.

      But, unsurprisingly, the C implementation is a bit easier to work with.

      HP did some research a while back (2-3 years) with software optimisation. They discovered a few interesting things: They could 'emulate' (using full architecture emulation) compiled programs with ~5-15% greater performance than running the same binary natively. (The emulator was emulating the PA-RISC architecture, and ran on top of PA-RISC hardware, so the test was conducted on the same machine) The emulator was capable of making up for inefficiencies the compiler added into the code. In spite of the (large) overhead of the emulation, the program still ran faster while emulated.

      While it has yet to really see more than tech demo releases, the Amiga OS4 technologies are quite similar: They are able to run the exact same binary on multiple platforms (PowerPC, x86, IA-64, SPARC, and MIPS) with no drop in performance (compared to natively-compiled versions of the same code). Again, this is due to (current) compiler problems. (PS- I'm not an Amiga fan per se, but I do admire how well-engineered they were for their day)

      Another good example is the speed difference between different compilers on the same platform. If they compiled to anything remotely close to the speed of asm, then there wouldn't be a 15-20% speed difference between a highly specialized compiler (such as Intel's) versus a more generic (cross-platform) compiler (such as gcc).

      Don't get me wrong: There's nothing really wrong with using a compiled (or interpreted) language. There are very definite benefits to their use (development and maintenance time being primary considerations). Compiled languages are acceptably fast, and compilers are getting steadily better.

      But I doubt we'll ever see more than a fraction of embedded devices use a higher-level language. A price difference of $0.01 per unit adds up to real money (and reduced cost) in commercial production runs. In addition, even where compiled languages are used, the resulting code is still de-compiled and the results scrutinized closely. (Which isn't much different than just writing the whole thing in asm anyway).

      --
      -- Sometimes you have to turn the lights off in order to see.
    27. Re:The Hammer is NOT a good thing... by bbbl67 · · Score: 1
      Hammer has a substantially smaller die than P4, its main competitor.
      Again, that's comparing a fab tech that is in the near-future compared to one that's been used for over a year. Not a fair comparison by any means. It's like saying that the Athlon has a smaller die than the K6. Completely different chip generations.
      AMD has stated that adding the 64-bit extensions to Hammer has only increased its core size by about 5% over K7, at the same process size! Why do you think people are so excited about Hammer? It will not only add substantial processing power to the existing core, it will also be not much harder to manufacture. It will be substantially smaller than the P4 core even with 64-bit extensions, let alone an Itanium!
    28. Re:The Hammer is NOT a good thing... by sl3xd · · Score: 2

      AMD has stated that adding the 64-bit extensions to Hammer has only increased its core size by about 5% over K7, at the same process size!

      That's not too surprising... I'd say the figure is about right. With as large an instruction decode stage as an x86 (or any CISC) has, changing from 32 to 64 bits isn't going to change the size of the chip much. (The 64-bit extensions, from what I understand, do not add more than a couple of instructions; they simply reuse the ones already there. Hence the Decode stage won't grow too much.)

      The thing is the Decode stage takes up so much of the overall die (and number of transistors, etc) in any CISC processor, that even sweeping changes in the remainder of the chip will result in a nearly identical die size.

      That being said, the actual RISC processing core (of the Hammer) is significantly larger than the K7's RISC core. (On the order of 20-30%). It's just that the decode stage is so huge that it hardly makes any difference.

      Why do you think people are so excited about Hammer?

      A couple of things: First, there is a significantly large anti-Intel crowd. (Not surprisingly, they're also anti-Microsoft). So any upcoming non-Intel chip is exciting to them.

      My feelings as to 'why AMD?' comes down to a simple factor: Price. AMD chips are loved by so many because they're cheap x86-compatibles (games being a key factor). If Apple hardware were similarly priced, and had the game market that x86 offers, Apple (and PowerPC) would be a favorite.

      Processors can be related to cars fairly well, as long as you forget about being compatible with Windows for a moment; And frankly, as far as I'm concerned, the programs that run on it don't make a difference to the actual hardware.

      The Hammer is akin to a pickup truck: A fairly inexpensive, medium-quality vehicle. It's loved because it does its job at a bargain price. It's utilitarian. It's the 'people's truck', and is affordable to most of the population.

      Workstation processors (Such as Power, SPARC, Alpha, PA-RISC, Itanium) are compared to a semi-truck (Kenworth, International, Caterpillar): They don't necessarily go any faster, but they can tow huge cargos, but the corresponding rise in cost is far from linear.

      And Apple (PowerPC) processors are BMW's or an Audi: They don't really run any better (or worse) than a pickup truck-- but it's a higher-quality 'luxury' car, and gives a better ride. You pay for the quality and experience, though.

      And, basically, there are a lot of people who are perfectly happy with their pickup truck. They're not about to pay more (at a very uneven scale) for more performance of a semi-truck, nor do they care for the luxury of a BMW.

      (And, the Itanium isn't as great as the other workstation processors, but it's also the only 1st gen chip in the bunch; The 1st gen SPARC, Power, and PA-RISC processors weren't wonderful either.)

      The Itanium also has one major problem with regard to die size: It's binary compatible with both x86 and PA-RISC processors; meaning that while the pure IA-64 architecture part of the chip is smaller than the Hammer, it then has the circuitry to decode x86 (which is a huge # of transistors, and hence, huge die area), PA-RISC (a much simpler/smaller addition to the x86 decode), and the IA-64's own VLIW decode.

      If the Hammer had three separate instruction decoders (one CISC, one RISC, one VLIW), then it would have a huge die area too. But the Hammer has one (CISC). And even the Athlons would be half their current size if they were pure RISC rather than CISC. (Of course, they wouldn't be x86 compatible then, but that's markets for ya.)

      The 64-bit extensions don't comprise an entirely new instruction set, primarily because they're just that: extensions. The Hammer's mechanism to extend from 32 to 64 bits is identical to the way the '386 extended from 16 to 32 bits. (This is from AMD's data). The '386 also added a couple more instructions (and registers) to the '286 design. That doesn't make an entirely different instruction set and/or decode.

      --
      -- Sometimes you have to turn the lights off in order to see.
    29. Re:The Hammer is NOT a good thing... by bbbl67 · · Score: 1
      That being said, the actual RISC processing core (of the Hammer) is significantly larger than the K7's RISC core. (On the order of 20-30%). It's just that the decode stage is so huge that it hardly makes any difference.
      There you go about the decode stage again. It's a broken record, it's obvious that this hasn't hurt AMD or Intel, or any of the x86 crowd in the least. In fact, this has been the big secret behind their big performance advantage. It's the x86's experience and expertise in designing large decoder stages that has allowed both Intel and AMD to reach the 1Ghz+ frequency stage so far ahead of any of the RISC crowd, despite all of RISC's much vaunted simplicity. Yes, frequency is just one factor in performance, but it is a design aspect that most RISC processors had not been able to exploit very well up until now; and they are still having trouble exploiting it.

      The Alpha was making a run for this crown, and it was the only horse in this race for the longest time, and then all of a sudden from out of nowhere Intel and AMD both overhauled the Alpha as if it wasn't there. I think most of us had assumed that the Alpha would be the first to 1Ghz, but instead it was the Athlon.

      The Hammer is akin to a pickup truck: A fairly inexpensive, medium-quality vehicle. It's loved because it does its job at a bargain price. It's utilitarian. It's the 'people's truck', and is affordable to most of the population.

      Workstation processors (Such as Power, SPARC, Alpha, PA-RISC, Itanium) are compared to a semi-truck (Kenworth, International, Caterpillar): They don't necessarily go any faster, but they can tow huge cargos, but the corresponding rise in cost is far from linear.

      Oh, I see. I can agree with some of your characterizations, but it is still woefully wrong. You can call x86 processors pickup trucks, but only in certain guises. A pickup truck is basically a small truck with an upgraded car engine. An x86 inside a desktop or laptop PC is a car. But a server x86 (Xeon, Athlon MP) inside a server is a pickup truck.

      That leaves the whole category of heavy-haul trucks unanswered by x86 at the moment. But what distinguishes a heavy-haul truck from a pickup? The ability to pull large loads. Is that all achieved by the truck's engine? No! Large trucks have incredible 18-speed transmissions, and stiff chassis, etc. In other words it's the overall package that distinguishes a heavy-hauler from a pickup. Is any of this sounding familiar? If it doesn't, then it should, because they describe a similar approach to how you distinguish a RISC processor-based (heavy haul) server from a PC (pickup) processor-based one.

      So how's this got anything to do about Hammer? Well, what it leads to is that Hammer has been designed right from the start to be everything from a car engine, to a pickup engine, to a heavy haul engine. That's because of its various features, such as Hypertransport, and onboard DRAM controller. Well, actually, only the Clawhammer (two Hypertransport channels) will work as a car or pickup engine, but Sledgehammer (three Hypertransport channels) will be a truck engine. They've been able to design an engine family that can fulfill many different roles. That's why Hammer is so exciting, and it's garnering so much anticipation.

    30. Re:The Hammer is NOT a good thing... by sl3xd · · Score: 2
      It's the x86's experience and expertise in designing large decoder stages that has allowed both Intel and AMD to reach the 1Ghz+ frequency stage so far ahead of any of the RISC crowd

      Actually, it's primarily because Intel pushed better fab processes into production earlier than the RISC crowd, of whom only Motorola & IBM fab their own.

      The Alpha was making a run for this crown, and it was the only horse in this race for the longest time, and then all of a sudden from out of nowhere Intel and AMD both overhauled the Alpha as if it wasn't there.

      Never underestimate the damaging effects of a corporate sale. When DEC was split between Intel and Compaq (well before the 1 GHz barrier), it was the death knell for the Alpha-- there was simply too much disruption in the shift of companies. (Not to mention the fact that many of Alpha's engineers wanted nothing to do with Intel or Compaq, so they left.) Neither AMD nor Intel was bought out, as DEC was. And AMD even ended up with some of Alpha's engineers!

      That leaves the whole category of heavy-haul trucks unanswered by x86 at the moment. But what distinguishes a heavy-haul truck from a pickup? The ability to pull large loads. Is that all achieved by the truck's engine? No! Large trucks have incredible 18-speed transmissions, and stiff chassis, etc. In other words it's the overall package that distinguishes a heavy-hauler from a pickup... [it] describes a similar approach to how you distinguish a RISC processor-based (heavy haul) server from a PC (pickup) processor-based one.

      So how's this got anything to do about Hammer?


      Easy... Architecture. As you say, the engine is only a small (but significant) part of the entire package that makes the distinction. The rest is the architecture around which the engine is built. Frankly, even though there's been many improvements of the x86 design (primarily by eliminating ISA and replacing it with PCI/AGP), it still has its problems; which is why it will never be a true replacement for high-end workstations and servers.

      Well, what it leads to is that Hammer has been designed right from the start to be everything from a car engine, to a pickup engine, to a heavy haul engine. That's because of its various features, such as Hypertransport, and onboard DRAM controller.

      If it were designed from the ground up, it wouldn't be x86 compatible; not, at least, if the designers wanted a truly great processor. Rather, AMD hopes to ride the x86-compatibility market and is therefore adapting a phenomenal RISC core to the pre-existing x86 set. It's like bolting a jet engine on a farm tractor.

      Hypertransport (as well as a built-in DRAM controller) is only useful on multiprocessor systems. (I'm not downplaying their usefulness at all.) The onboard DRAM controller allows each processor to have its own separate memory (whereas many, including the IA-64, share the same memory through the system bus). Combined with the increased multiprocessing efficiency Hypertransport offers, the Hammer processor line seems to be clearly designed for multiprocessor systems. (Hypertransport and onboard DRAM don't provide any real benefit to a single-processor system.)

      It will be great for companies that want to upgrade their x86 server hardware, but want to keep their old software. It'll do great in the 3D animation and rendering studios, many of whom use a Unix-like OS anyway. But for the general desktop machine, there will be only one CPU, robbing the user of the benefits Hypertransport and the onboard DRAM module give.

      One key here is that Hypertransport is not unique to the Hammer; SUN, HP, Motorola, SGI and Apple are all members of Hypertransport consortium, and intend to incorporate it into their processor designs.

      The primary benefit of an onboard DRAM controller per chip (no longer sharing the same memory pool via a bus) is already implemented on other architectures by using multiple DRAM controllers.

      My argument all along was that the Hammer isn't a good thing because it:

      Keeps the paleolithic x86 architecture.

      Could operate far faster if its RISC core didn't adapt itself to x86

      We would be better off junking the x86 architecture sooner than later.

      The Hammer, while an excellent x86 design, seeks to make the transition 'later', if at all.
      Most of the responses I've seen are remarkably similar to a PC fan's reasons why they don't want to switch to a better machine than x86 can provide: They're cheap (the machines, although it can apply to a few users). Actual reasons as to the Hammer's 'superiority' are in no way particular to the Hammer, and are found on many of its competitors' drawing boards as well.

      And outside the Free software world, where the software typically only requires a recompile, the Hammer faces some serious, possibly fatal obstacles once 64-bit compiled commercial packages begin to replace the older 32-bit code. The commercial reality is that to be successful, the Hammer has to have natively-compiled 64-bit code (in Windows). To get it, AMD has to have developers who will support Hammer/64 in addition to the IA-64. Those developers will have to either sell two different versions (somewhat similar to the sales of Mac vs PC / or Win32 vs x86Linux games), or have both binaries in one package. Both are expensive propositions, and with Intel's virtually guaranteed market-share, it may not be worth the effort to support Hammer.

      For a brief history on AMD and binary incompatibility-- Jim Turley, a CPU/Architecture analyst, said the following: "Backing Intel's newest and heavily promoted next-generation architecture is a foregone conclusion for vendors that want to stay in business. Supporting AMD becomes more problematic. Will the added market share be worth the effort? Suddenly AMD finds itself in the same boat as Apple with a different, yet competitive, product that requires dedicated software support to survive.

      Grimly, AMD itself lived through this tragedy not so many years ago, and the wound was self-inflicted. AMD unceremoniously axed its entire 29000 family, one of the most popular RISC processors of the early 1990s, due to the cost of software support. The company decommissioned the second-best-selling RISC in the world because subsidizing the independent software developers was sapping all the profits from 29K chip sales. As "successful" as it was, AMD had to abandon the 29K, the only original CPU architecture it ever created. "
      (emphasis added)

      I'm not saying that the Hammer isn't a good processor.

      I'm saying that it's putting a jet engine in a 1940's John Deere tractor. I'm saying the mechanic should dump the tractor, and put a jet engine in an aircraft-- not an ancient, over-extended farm tool. The tractor could still do its job, but it's just such a waste of the engine's potential.

      I'm sorry, but the x86 instruction set is old and inefficient; it doesn't allow compilers or programmers to access a modern CPU's (including the Hammer) features-- So the Hammer has to deal with the limits inherited from the x86 set.

      IA-64 allows explicit branch/pipeline ordering and load optimization; this allows the compiler's larger view to create code that keeps all the pipelines busy.

      As all branch/pipeline and load optimization is done in the compiler, there is much more time to find the most optimal instruction order and path. (Fractions of nanoseconds vs. seconds/minutes/hours)

      An instruction set (such as IA-64) capable of direct access to branch ordering, or a greater number of registers is more powerful, in that it allows for developers (directly, or via a compiler) to 'take the time' and resources to find the most optimal/efficient way to use the processor's full capabilities.

      x86/Hammer does not allow explicit branch/pipeline ordering or load optimization, as x86 was purely single-pipeline until the first Pentium. (Although technically x87 is another pipeline, it served an entirely different purpose... the branching I speak of is of two or more identical pipelines)

      As a result, the (Pentium, Athlon, K6, Hammer) must look at its instruction cache, and from that (very limited) amount of information, attempt to optimize the branch/pipelines and provide load-balancing. Time is extremely limited (to fractions of nanoseconds), as are resources to perform any re-ordering. But as time is limited, it frequently executes a suboptimal route and/or order.

      Even though the Hammer has all kinds of ultra-modern features and resources, nearly all of them are inaccessible to the programmer/compiler; while the built-in management of these features/resources is quite good, it is also far from perfect (having a far more limited scope than a compiler does, after all) Cycles that could have been put to good use end up being wasted.
      Lastly, I'll say that I'm not so much a fan of the IA-64 as I am of the VLIW concept; Non-VLIW processors (Sparc, Power, Alpha) have the same pipeline scheduling concerns as the Hammer. But at least they offer greater access to the processor's resources (such as double or more the accessible GP registers of 64-bit Hammer).
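
      The "limited view" argument above can be illustrated with a toy model. This is only a sketch under loud assumptions: every instruction takes one cycle, the machine can issue two instructions per cycle, and the scheduler may only pick from the first `window` not-yet-issued instructions in program order. The function name `cycles_needed` and the dependency lists are made up for illustration; no real CPU or compiler works exactly like this.

```python
def cycles_needed(deps, width, window):
    """Count cycles to issue every instruction in a toy in-order-window machine.

    deps[i] lists the (earlier) instructions that instruction i depends on.
    width   = maximum instructions issued per cycle (assumed value).
    window  = how many pending instructions the scheduler can see (assumed value).
    """
    issued_in = [None] * len(deps)      # cycle in which each instruction issued
    pending = list(range(len(deps)))    # not-yet-issued, in program order
    cycle = 0
    while pending:
        cycle += 1
        chosen = []
        for i in pending[:window]:      # only the visible part of the program
            if len(chosen) == width:
                break
            # Ready only if every dependency issued in an earlier cycle.
            if all(issued_in[d] is not None and issued_in[d] < cycle
                   for d in deps[i]):
                chosen.append(i)
        for i in chosen:
            issued_in[i] = cycle
            pending.remove(i)
    return cycle


# Two independent dependency chains, laid out one after the other:
# 0 -> 1 -> 2 -> 3   and   4 -> 5 -> 6 -> 7
chains = [[], [0], [1], [2], [], [4], [5], [6]]

print(cycles_needed(chains, width=2, window=2))  # narrow, hardware-like view
print(cycles_needed(chains, width=2, window=8))  # whole-program, compiler-like view
```

      With the narrow window the second chain is invisible until the first is nearly done, so the second issue slot mostly sits idle; with the full view the two chains are interleaved. Real out-of-order hardware has much larger windows than this, so treat the gap as exaggerated.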

      --
      -- Sometimes you have to turn the lights off in order to see.
    31. Re:The Hammer is NOT a good thing... by bbbl67 · · Score: 1
      Actually, it's primarily because Intel pushed better fab processes into production earlier than the RISC crowd, of whom only Motorola & IBM fab their own.
      The reason for the x86 crowd's frequency increases was partly better fab processes, but it also had quite a bit to do with splitting up the decoder stages into multiple stages. That allowed them to conveyor-belt several atomic instructions simultaneously.

      For example, the Athlon and the P3 were both identically at 10 pipeline stages and 0.18um, but the Athlon was able to overhaul the P3 by quite a bit due to newer design, and quite a bit better fab processes (copper interconnects, etc.). The P3 at 0.18um topped out at 1.0Ghz, but the Athlon soared all of the way to 1.7Ghz at the same process node.

      Then the P4 came out with 20 pipeline stages (twice the P3 or Athlon), and using the same process technology as the P3 at 0.18um node, it was able to touch 2.0Ghz. So doubling the pipeline stages allowed it to double the speed just by itself.

      So with the improved process technology they were able to get 70% better speeds (Athlon vs. P3), but with increased pipeline stages (P4 vs. P3) they were able to get 100% better speeds.
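
      For anyone checking the percentages, here is the arithmetic as a short sketch. The clock speeds are the ones quoted in this comment; the helper name `percent_gain` is just for illustration, and this compares clock frequency only, not delivered performance.

```python
def percent_gain(base_ghz: float, new_ghz: float) -> float:
    """Relative clock-speed increase of new_ghz over base_ghz, in percent."""
    return (new_ghz - base_ghz) / base_ghz * 100.0


# Top clock speeds at the 0.18um node, as quoted in the comment above.
P3_GHZ, ATHLON_GHZ, P4_GHZ = 1.0, 1.7, 2.0

print(f"Athlon vs. P3: {percent_gain(P3_GHZ, ATHLON_GHZ):.0f}%")
print(f"P4 vs. P3:     {percent_gain(P3_GHZ, P4_GHZ):.0f}%")
```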

    32. Re:The Hammer is NOT a good thing... by sl3xd · · Score: 2

      So with the improved process technology they were able to get 70% better speeds (Athlon vs. P3), but with increased pipeline stages (P4 vs. P3) they were able to get 100% better speeds.

      Interesting side note: One reason the Alpha does so well is that the physical design is very closely tuned to its fab process.

      And a question: Do you mean a greater number of pipelines, or more pipeline stages?

      I ask because more pipeline stages don't really increase speed very much (i.e. there can be one instruction in each pipeline stage, but as each instruction takes one clock to move to the next stage, there isn't any improvement in speed). In fact, shorter pipelines are often faster, as they don't have as much potential for stage bubbles.

      A stage bubble arises like this: take a 5-stage pipeline where instruction A comes immediately before B, but B requires that A finish the entire pipeline before it can begin executing. B then has to wait 4 more cycles before it can execute (A must finish, which essentially clears out the pipeline). In a 10-stage pipeline, B would have to wait 9 cycles.
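
      The stall arithmetic can be sketched as follows, under the same simplified model as the example above: a dependent instruction cannot enter the pipeline until its predecessor has left it, and there is no forwarding or out-of-order execution. The function names and the instruction counts are hypothetical.

```python
def stall_cycles(depth: int) -> int:
    """Extra cycles B waits if it cannot start until A has left a depth-stage pipeline."""
    return depth - 1


def total_cycles(n_instructions: int, n_dependent: int, depth: int) -> int:
    """Cycles to run n_instructions when n_dependent of them must wait for a full drain."""
    # Ideal pipelining: fill the pipe once, then finish one instruction per cycle.
    ideal = depth + (n_instructions - 1)
    return ideal + n_dependent * stall_cycles(depth)


print(stall_cycles(5))                 # the 5-stage case from the comment
print(total_cycles(100, 20, depth=5))  # 100 instructions, 20 of them dependent
print(total_cycles(100, 20, depth=10)) # same program on a 10-stage pipeline
```

      In cycle counts the deeper pipeline loses more to each stall; whether it still wins overall depends on how much higher it can be clocked, which this sketch does not model.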

      Out-of-order execution can help keep the pipeline busy with other tasks while B is waiting to be executed; but it doesn't always work out.

      Additional pipelines (which is what I think you meant) is adding a second (or third, fourth...) identical pipeline, so that tasks unrelated to the A,B instructions (above) can be executed as well. Again, out-of-order execution helps keep things busy, but not always.

      Which comes to the nice thing about VLIW design: The compiler (or, in the case of VLIW, the masochistic asm coder) is able to take a larger look at the program than is possible in a non-VLIW design (which, AFAIK for the mass-produced chips, is everything except the Crusoe and Itanium). And that results in a more efficient run than having the hardware attempt to do it.

      Of course, as far as design complexity goes, I'm not entirely sure which is easier to design: The out-of-order prediction chip, or a VLIW chip. I tend to believe the VLIW chip is more complex in design.

      --
      -- Sometimes you have to turn the lights off in order to see.
    33. Re:The Hammer is NOT a good thing... by bbbl67 · · Score: 1
      If it were designed from the ground up, it wouldn't be x86 compatible; not, at least, if the designers wanted a truly great processor. Rather, AMD hopes to ride the x86-compatibility market and is therefore adapting a phenomenal RISC core to the pre-existing x86 set. It's like bolting a jet engine on a farm tractor.
      It's precisely this x86-compatibility which will make it successful. There is nothing wrong with the x86 ISA that is holding it back in the least. It's got a set of simple instructions which are just as atomic and simple as anything in the RISC world, and it's got a set of complex instructions that help immensely in simplifying compiler design -- which is the great advantage of CISC design. Bolting a jet engine to a tractor? Haven't you heard of gas turbines? Jet engines are just one example of one, but not the most appropriate one, gas turbines power everything from diesel-electric locomotives to jets.
    34. Re:The Hammer is NOT a good thing... by bbbl67 · · Score: 1
      Hypertransport (as well as a built-in DRAM controller) is only useful on multiprocessor systems. (I'm not downplaying their usefulness at all.) The onboard DRAM controller allows each processor to have its own separate memory (whereas many, including the IA-64, share the same memory through the system bus). Combined with the increased multiprocessing efficiency Hypertransport offers, the Hammer processor line seems to be clearly designed for multiprocessor systems. (Hypertransport and onboard DRAM don't provide any real benefit to a single-processor system.)

      You have a lot of overarching opinions, but not a lot of background. How could you possibly say that onboard DRAM controller doesn't provide any benefits to SP systems? All you have to do is see the history of the PC to know how foolish that statement is. Even the recent history shows that a reduction in memory latency has a greater effect on PC performance than an increase in bandwidth.

      As for Hypertransport, the idea behind that is not just absolute performance increases, but also design flexibility. So the same chipset that serves as a PC chipset, may also be able to serve as an 8-way server chipset, with few design changes (perhaps by adding or subtracting a few more HTT channels). Even within a desktop environment, you can easily separate out shared PCI/AGP buses, into multiple switched PCI/AGP buses with Hypertransport underlying them. There's lots of potential even within a desktop environment.

      One key here is that Hypertransport is not unique to the Hammer; SUN, HP, Motorola, SGI and Apple are all members of Hypertransport consortium, and intend to incorporate it into their processor designs.

      Absolutely, the more the merrier. But it's not all of the other players it has to worry about, just one player: Intel. Intel may be allowed to use the HTT, but it's absolutely certain they would rather die than use their great competitor's designs. All of the other players are small-fry in terms of volume compared to the x86 camp.

      Anyways, the only RISC player that is likely to use HTT is Sun, and they will likely use it in their upcoming Opteron servers. HP has no need to use HTT in its processors, simply because it has no processors anymore, all of them (PA-RISC and Alpha) have been EOL'ed according to their own roadmaps, so what are they going to use them for, Itanium? It's likely that IBM, HP, in addition to Sun all have Opteron plans secretly already devised.

    35. Re:The Hammer is NOT a good thing... by sl3xd · · Score: 2
      How could you possibly say that onboard DRAM controller doesn't provide any benefits to SP systems? [...] Even the recent history shows that a reduction in memory latency has a greater effect on PC performance than an increase in bandwidth.

      This argument seems to be more a Rambus vs. DDR thing; and even then on commodity boxen. But I digress. In both cases there is currently an off-chip memory controller. The big reason for the difference in latency is not the controller itself, but the (completely different) methods of transferring data. Rambus uses a serial data transfer, which is easy to scale up (in terms of speed and bandwidth), but has higher latency. DDR is an older, parallel technology. DDR has lower latency, but has lower bandwidth and is much harder to scale up. This is primarily because of electromagnetic crosstalk (and other E&M interference problems) within DDR's (parallel) data paths.

      There is a point of diminishing returns with the low latencies DDR offers; the point is frequently reached on high-performance computers (workstations, scientific processing, and high-end servers) where the bandwidth is the key factor. When you're transferring a few GB of memory, who cares that it takes a few µs longer to start receiving data-- overall, the entire transfer (from request to completion) takes much less time. Even Wintel boxen are beginning to reach this point.
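
      The latency-versus-bandwidth trade-off can be sketched with a simple model: total time = startup latency + size / bandwidth. The two memory profiles below use made-up numbers chosen only to show the crossover; they are not measurements of any real DDR or Rambus part.

```python
def transfer_time(size_bytes: float, latency_s: float, bandwidth_bps: float) -> float:
    """Seconds from request to completion: fixed latency plus streaming time."""
    return latency_s + size_bytes / bandwidth_bps


# Hypothetical profiles (assumed values, for illustration only).
LOW_LATENCY = {"latency_s": 50e-9, "bandwidth_bps": 2.1e9}      # quick to start, slower stream
HIGH_BANDWIDTH = {"latency_s": 100e-9, "bandwidth_bps": 3.2e9}  # slow to start, faster stream

for size in (64, 4096, 2**30):
    low = transfer_time(size, **LOW_LATENCY)
    high = transfer_time(size, **HIGH_BANDWIDTH)
    winner = "low-latency" if low < high else "high-bandwidth"
    print(f"{size:>12} bytes: {winner} memory finishes first")
```

      Under these assumed numbers the low-latency memory wins only for very small transfers; once the transfer is large, the startup latency is lost in the noise.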

      Personally, I wonder how RAMBUS even got a patent. I don't see how a serial memory bus is 'non-obvious to the trade's practitioners'. But, that's the USPTO for you.

      Another major problem is the physical distance to (as well as speed of) DRAM. Silicon technology has already reached the point where a signal often travels faster through logic gates (such as an off-CPU controller) than it does through wire. So long as the memory controller is physically located between the DRAM and the CPU, there is little chance there will be any performance drop. At current CPU speeds, it takes 2-3 clock cycles for any signal to even reach the DRAM (even light-speed is slow at 1 GHz). Then it takes several more before the DRAM addresses and returns data. Then another 2-3 clock cycles before it gets back to the CPU. An off-CPU DRAM controller may or may not take an additional cycle. For large (sequentially addressed) memory transfers, this one cycle is a one-shot deal. Even with a million tiny, single-byte (randomly selected) transfers per second, there are only one million extra clock cycles 'burned up'. This would result in a performance drop of 0.05% on a 2GHz CPU (one million out of two billion cycles per second), and less as speeds increase.
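
      The 0.05% figure works out as below. This sketch assumes the million transfers all happen within one second and that each costs exactly one extra cycle; both are assumptions made to reproduce the number above, and `overhead_percent` is a hypothetical helper.

```python
def overhead_percent(transfers: int, extra_cycles_each: int,
                     clock_hz: float, seconds: float = 1.0) -> float:
    """Share of all available clock cycles spent on the extra controller hop, in percent."""
    extra_cycles = transfers * extra_cycles_each
    return 100.0 * extra_cycles / (clock_hz * seconds)


print(f"{overhead_percent(1_000_000, 1, 2e9):.2f}%")   # 2 GHz CPU
print(f"{overhead_percent(1_000_000, 1, 4e9):.3f}%")   # faster clock, smaller share
```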

      As for Hypertransport, the idea behind that is not just absolute performance increases, but also design flexibility. So the same chipset that serves as a PC chipset, may also be able to serve as an 8-way server chipset, with few design changes (perhaps by adding or subtracting a few more HTT channels).

      This is true; but as I said, it only really makes things better for the multiprocessing crowd. Chip makers don't usually pass the costs of a higher-complexity/performance chip to the buyers of a lower-complexity chip. The SP chipset would be the hands-down highest-volume seller. An MP chipset that is based from the SP design would cost less than a wholly-redesigned MP chipset. This suits the MP buyers fine... but it doesn't give any benefit to the SP buyers. The benefit is to MP alone.

      Even within a desktop environment, you can easily separate out shared PCI/AGP buses, into multiple switched PCI/AGP buses with Hypertransport underlying them.

      You can, but why? For all intents and purposes, the PCI/AGP bus is essentially idle 100% of the time. (The times when it is used are more of a statistical anomaly than fact; a figment of the deranged observer's imagination.) Even in applications when there actually is heavy bus activity, the PCI/AGP bus is far from being saturated. There are cases (such as multiport gigabit ethernet cards) where any single PCI slot is unable to handle the load -- but the PCI bus itself still has massive amounts of idle bandwidth; it's just that it's not possible to transfer the data between the network card and the PCI bus fast enough. (Which is a limitation of PCI's component interface, but not of its bus).

      I've seen many servers that have multiple network interfaces, where each NIC saturates the PCI card slot. The actual PCI bus, however, is not saturated, and handles the full load of multiple saturated interfaces quite well.

      In other words, it doesn't matter how wide the freeway is; the tollbooth (AKA the PCI Slot interface) is the bottleneck, and is the real limiter of performance. A HyperTransport-switched PCI bus would be like adding more lanes to a highway that has nearly no traffic on it. It doesn't change how fast you can drive. It's the long wait at the toll-booth at the on and off-ramps that is the speed problem.

      Especially as on many motherboards, AGP and PCI are on entirely different buses, so heavy AGP usage (such as DoomIII, or 3D Animation) doesn't even affect the PCI bus. For the desktop user, there is no benefit to such a scheme. Even a power-hungry gamer, using his AGP8X card to its fullest potential, compiling XFree86, and hosting multiple P2P file transfers couldn't do much to dent the PCI bus's capabilities. It's other x86 problems that are most likely to cause speed drops; not PCI or AGP.

      Only in ultra-high-end applications would there be a benefit.

      But it's not all of the other players it has to worry about, just one player: Intel. Intel may be allowed to use the HTT, but it's absolutely certain they would rather die than use their great competitor's designs.

      That's completely untrue, in several respects. First, the NIH (Not Invented Here) syndrome has burned just about everybody. No company that is too proud to use a technology that was NIH lasts long. The managers at Intel are not that stupid. But they aren't going to jump on the bandwagon and spend any money just yet; they'll wait until they see how the results fare on the market before they invest anything in HyperTransport. If it's in Intel's best interest, they'll use it. If not, they'll design an alternative. To call AMD their 'great competitor' is rather short-sighted as well. They're only the biggest competitor in the x86 arena, and one with a minority of the market. That's the reality, whether you like it or not. And I like (and have recently bought) AMD processors.

      All of the other players are small-fry in terms of volume compared to the x86 camp.

      That is an entirely baseless statement. The x86 camp is extremely small in terms of the 'other players'. Or weren't you aware that approximately 0% of all computers use an x86 chip? AMD has a very small production volume; so small they don't even fab their own chips. The only major competitor that is fab'd in such small volumes is SPARC. But Power & PowerPC, Itanium, and even ARM processors are all fab'd in greater volumes than AMD's. Intel plans on abandoning x86 entirely; their Yamhill (Hammer-like) processor is a contingency plan, to 'steal the Hammer's thunder.'

      HP has no need to use HTT in its processors, simply because it has no processors anymore

      Patently false. HP's processor is the Itanium. (more below)

      all of them (PA-RISC and Alpha) have been EOL'ed according to their own roadmaps, so what are they going to use them for, Itanium?

      Their roadmap EOL's the PA-RISC, but points straight to Itanium. The Itanium is 100% PA-RISC compatible (in addition to supporting x86 and its own architecture). It is the next-gen PA-RISC. They are only supporting the next couple of releases of PA-RISC to appease people who already have PA-RISC hardware, and wish to upgrade the processors in their pre-existing hardware. Alpha was acquired well after the Itanium was complete; a white elephant of sorts. It was never part of the plan. It's entirely likely that HP will incorporate Alpha technologies into next-gen IA-64 chips. If there is customer demand (especially if it's from Itanium's co-designers at HP), HyperTransport will be included as well.

      Anyways, the only RISC player that is likely to use HTT is Sun, and they will likely use it in their upcoming Opteron servers. It's likely that IBM, HP, in addition to Sun all have Opteron plans secretly already devised.

      Opteron is the Hammer's new brand-name, and Sun will definitely not be using it.

      Sun is 100% SPARC, has been for more than a decade, and they have no plans to abandon it. There is no such thing as an 'Opteron server' from Sun. Sun only sells SPARC boxen.

      I already covered HP -- they're Itanium. Their roadmaps still point to it.

      SGI's roadmap leads to Itanium for their workstations and servers. They will use Intel's answer to HyperTransport (whether it is HyperTransport or not)

      IBM is all about their own Power and PowerPC processors, which have better SPECint and SPECfp scores than anything else to begin with.

      It's likely that IBM has an Opteron-based PC and Windows.net server, but the Opteron won't be used in their high-end servers or workstations. IBM already scales well past the point where HyperTransport would be beneficial; and IBM is in the same boat as Intel: If it's worth their while, they'll either use or design an alternative for HyperTransport. But for IBM, it may be completely unnecessary to begin with.

      Apple is likely to use HyperTransport, as they have a great deal of flexibility in what technologies are to be used in their machines. Apple is also a member of the HyperTransport consortium. Apple's market is definitely not a trivial one.
      Which goes to show my point: Just because AMD's Opteron has great features, they are in no way unique to the Opteron. And its competitors have a better system architecture than x86 to boot.

      --
      -- Sometimes you have to turn the lights off in order to see.
  39. You forget Samsung by DABANSHEE · · Score: 2

    Samsung have the right to develop & manufacture Alphas for as long as they want, no matter what Intel & HPaq say or do.

  40. You're forgetting Samsung by DABANSHEE · · Score: 2

    Samsung have the right to develop, manufacture & sell Alphas for as long as they want, no matter what Intel & HPaq say or do.

  41. Alpha, Linux and MCSE by Anonymous Coward · · Score: 0

    Hi!

    I still own an Alpha XL 266 with 128MB RAM and a 21064 266MHz CPU. I am an MCSE. I run Debian Linux on my Alpha. It is the production server for the company that I own.

    Regarding some moron on this forum who said MCSEs are too dumb to run Linux........!#$! OFF!