Slashdot Mirror


AMD Delays Hammer

TeJarz writes "C|Net reports that AMD's next processor (Hammer) has been rescheduled from its original Q4 release to Q1 2003. To quote C|Net: 'The delays are occurring to accommodate the release of a new version of Athlon with a 333MHz bus, said Crank. Current Athlons come with a 200MHz bus and 256KB of secondary cache.' Let's hope this doesn't get moved again."

27 of 346 comments

  1. Current Athlons by Anonymous Coward · · Score: 3, Insightful

    ...have a 266MBz bus

    1. Re:Current Athlons by Anonymous Coward · · Score: 3, Funny

      What is that, MegaBizzatch?

    2. Re:Current Athlons by packeteer · · Score: 4, Informative

      In case not everyone knows what "double pumped" or "DDR FSB" means, let me explain. The clock that sets how often data is transferred ticks over and over to keep the pace. On an Athlon, data is transferred twice for every tick. On a Pentium 4 it's 4 times per tick. Right now most Athlons run at 133MHz "DDR FSB". Mine already runs at 166MHz (overclocked, of course) and let me tell you, it's sweet. I can't wait to see everyone have access to 166MHz FSB Athlons.
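
      To put numbers on that, here's a rough sketch in Python. The function name and the little table are made up for illustration; the transfers-per-tick figures are the ones described above (2 for the Athlon, 4 for the Pentium 4). The Pentium 4 row assumes a 100MHz base clock.

```python
def effective_rate_mhz(base_clock_mhz, transfers_per_clock):
    """Marketing "MHz" of a bus: real clock rate times transfers per tick."""
    return base_clock_mhz * transfers_per_clock

# (real clock in MHz, transfers per tick)
# 133 and 166 are nominally 133.33 and 166.67, which is why the
# marketing numbers come out as 266 and 333 rather than 266 and 332.
buses = {
    "Athlon, 100MHz DDR": (100, 2),
    "Athlon, 133MHz DDR": (133, 2),
    "Athlon, 166MHz DDR": (166, 2),
    "Pentium 4, 100MHz quad pumped (assumed base clock)": (100, 4),
}

for name, (clock, pumps) in buses.items():
    print(f"{name}: {effective_rate_mhz(clock, pumps)} 'MHz'")
```

      The real clock never changes; only the number of transfers per tick does.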

      --
      unzip; strip; touch; finger; mount; fsck; more; yes; unmount; sleep
  2. Not surprising by Anonymous Coward · · Score: 4, Funny

    AMD = All Microprocessors Delayed

  3. 266 Bus by clinko · · Score: 3, Interesting

    Current Athlons have a 266 bus. You can still get the older 200 bus, but it died out about a year ago. They're listed sorted by price on Pricewatch.

  4. Comment removed by account_deleted · · Score: 4, Insightful

    Comment removed based on user account deletion

  5. Re:Comment non-sense by Paul+Komarek · · Score: 5, Insightful

    I, for one, am hoping to replace our Alphas with cpus from the AMD Hammer series. We're about to buy a bunch of P4-based machines despite the problems we've had with certain tight loops in scientific code performing 80 times slower than a similarly clocked Athlon (according to Athlon advertised "speed", not actual clock). No, I'm not exaggerating, and this has been verified independently -- the P4 cpu has some huge weak spots that really suck if you hit them. If Hammer were out and working properly, we probably wouldn't buy the P4 machines to hold us over.

    We need 64 bit machines to accommodate massive memory for our research. I'm really hoping the Hammer can provide a relatively inexpensive and *commoditized* 64 bit platform for us to work on, compared to existing 64 bit (workstation/server) platforms. And I want it yesterday. Actually, I want it last year.

    I have no idea what the editors or submitter meant, of course.

    -Paul Komarek

  6. Honesty? by wray · · Score: 3, Interesting

    What is the reason for the delay? Can it really be that it's just a business decision (as they seem to say) rather than a technological problem? It seems that AMD _needs_ this jump to 64 bit computing, the SSE2 registers, and a boost in performance over Intel. So to me, if it is a business decision, it is a poor one.

    Everything I have seen shows that Intel is doing much better in performance and climbing. AMD claims there is no real technological reason, yet there must be. Anyone have insights? It seems that it would be prudent for AMD to issue better explanations -- how could it hurt to be honest? I want to see competition. If they are going to lag in performance, then they present no reason for people to buy. (A similarly performing Intel chip is close in price right now.)

    --
    Guess what? I got a fever! And the only prescription.. is more cowbell!
  7. Good by Billly+Gates · · Score: 3, Informative

    That means a delay for Palladium, which will be included by default starting with the Hammer. It was probably delayed because Longhorn, aka DRM-Windows, was delayed, and it's needed to actually use the cryptography in the CPU.

    1. Re:Good by Billly+Gates · · Score: 5, Informative
      Oops, I forgot to include this from the FAQ.



      Q: Can Linux, FreeBSD or another open source OS run on "Palladium" hardware?

      A: Virtually anything that runs on a Windows-based machine today will still run on a "Palladium" machine (there are some esoteric exceptions[1]). If you currently have a machine that runs both Linux and Windows, you would be able to have that same functionality on a "Palladium" machine.

      The exceptions are listed below.



      [1] These exceptions include the following:

      1.) Some debuggers may need to be updated to work in the "Palladium" environment, but they can still work.

      2.) Some special performance tools may need to be updated.

      3.) Software that writes directly to TCPA hardware will need to be updated.

      4.) Memory scrub routines (at the hardware level) will need attention.

      5.) Third-party crash dump software may need to be updated.

      6.) BIOS mode hibernation features will need to be updated to work with "Palladium."



      It's for these 6 reasons that Palladium is still in beta and that AMD is probably waiting before releasing Hammer.

  8. Re:Comment non-sense by Archfeld · · Score: 3, Interesting

    Any place I can look for some docs on that issue?
    We are migrating from our Alphas to dual P4s and seeing a serious drop in performance that should not exist :( The fingers have all been pointed at software optimization and we are doing some heavy duty examinations, but it sounds all too pat to me...

    --
    errr....umm...*whooosh* *whoosh* Is this thing on ?
  9. the other side... by tanveer1979 · · Score: 3, Informative
    A lot of posts are screaming "again, again"... but the fact is a 64 bit processor is one devil to design.
    The biggest problem with current processors is that to design such devices we *have* to use dynamic logic. Ask any VLSI design engineer.. that is no joke. In fact many multipliers and dividers have to be hand edited! So delays are expected, and they do not reflect badly upon the designers and companies in any way.

    Before you ask.. I do not work for AMD, I work at another VLSI company, that's why I say.. it's tough. Millions of gates, thousands to be hand edited, it's a bitch.. but as they say, the fruits of labour are sweet... and for AMD, Hammer is going to be the sweetest.

    --
    My Aurora : http://www.youtube.com/watch?v=o91ZsGwJYyg
    FB : https://www.facebook.com/TanveersPhotography
  10. some people use these for work, you know by g4dget · · Score: 3, Insightful
    Why should we hope it gets released now instead of later? Do you have anything riding on it?

    Hard as that may be to believe, some people use their computers for real work. And some of those people run into that dreaded 4G limit--4G is not a lot of memory anymore these days. And many of these people would love to have the choice of a Hammer over Itanium.
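
    For anyone wondering where the 4G figure comes from, it falls straight out of 32-bit addressing. A tiny Python sketch (the function name is made up, and the 64-bit figure is only the architectural upper bound; real chips typically wire up fewer address bits than that):

```python
def address_space_bytes(pointer_bits):
    """Number of distinct byte addresses a pointer of this width can name."""
    return 2 ** pointer_bits

GIB = 2 ** 30  # bytes in one GiB

print(address_space_bytes(32) // GIB, "GiB with 32-bit pointers")
print(address_space_bytes(64) // GIB, "GiB with 64-bit pointers (upper bound)")
```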

  11. The real reason by PaxTech · · Score: 5, Funny

    They're waiting so they can ship the new chip bundled with Duke Nukem Forever. ;)

    --
    All movements for social change begin as missions, evolve into businesses, and end up as rackets.
  12. Re:Comment non-sense by Paul+Komarek · · Score: 4, Interesting

    I can probably send you some test code (same for anyone else who asks), but I'll have to check with my advisor first. The smallest I've made the test code is a bit under 300 lines. It's been run on Alpha 21264 EV67, Athlon C, Athlon XP, P4, and P-III, and one other Pentium-ish platform. At least two (I believe it's actually three) profilers have been run to find the bottleneck; it appears to be the floating point unit stalling for data.

    Here are the timings. Note that these are just via "time" on GNU/Linux or a wall clock on Windows (or something -- I didn't do the Windows tests).

    P4 dual Xeon 1.7GHz/gcc: 82 seconds
    P3 1000/msvc: 18 seconds
    Athlon C 600/msvc: 2 seconds
    P3 1000/msvc, using floats and sse: 2 seconds
    Alpha 667/gcc: 2 seconds
    Athlon XP 1900+: 0.88 seconds

    I guess the Athlon's clock was closer to the P4's clock than I recalled in my original post. Either way, the slowdown on the Pentiums can be easily seen.
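
    For anyone who wants the ratios rather than the raw times, here is a quick Python sketch. The numbers are copied from the list above; the function name is made up, and it ignores the compiler and clock differences between the runs, so treat the factors as rough.

```python
# Wall-clock timings from the list above, in seconds.
timings = {
    "P4 dual Xeon 1.7GHz/gcc": 82.0,
    "P3 1000/msvc": 18.0,
    "Athlon C 600/msvc": 2.0,
    "P3 1000/msvc, floats and sse": 2.0,
    "Alpha 667/gcc": 2.0,
    "Athlon XP 1900+": 0.88,
}

def slowdowns(times):
    """Divide every time by the fastest one, giving a slowdown factor."""
    fastest = min(times.values())
    return {name: t / fastest for name, t in times.items()}

# Print from fastest to slowest.
for name, factor in sorted(slowdowns(timings).items(), key=lambda kv: kv[1]):
    print(f"{name}: {factor:.1f}x")
```

    By that measure the P4 run is roughly 93 times slower than the Athlon XP run, and 41 times slower than the Athlon C 600.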

    -Paul Komarek

  13. we believed it was hammer time by deft · · Score: 5, Funny

    but it turns out you can't touch this.

    --

    There's nothing Intelligent about Intelligent Design.
    1. Re:we believed it was hammer time by Hadlock · · Score: 3, Funny

      and my case will be cleverly constructed entirely of parachute pants...

      --
      moox. for a new generation.
  14. Re:Invested in AMD. by Perdo · · Score: 3, Interesting

    They got beaten today.

    Down 7% on Intel's 2 cent per share dividend.

    They'll get beaten again tomorrow.

    They'll get beaten at Christmas.

    They'll get beaten until Sledgehammer is released... not Clawhammer, which will have no x86-64 desktop software support right off the bat, and will have to rely solely on its pure x86 performance.

    Microsoft shafted them on the X-box because Intel paid Microsoft 200 million to use the Pentium III. Nvidia was stuck with an unused AMD integrated chipset for X-box and nForce was born.

    Intel will pay Microsoft to shaft them again. No x86-64 Windows XP for AMD, despite AMD testifying on Microsoft's behalf in the anti-trust case in exchange for that support. AMD made an unenforceable agreement with Microsoft. To enforce it, they would have to perjure themselves.

    So Intel wins again.

    Until Sledgehammer arrives. Sledgehammer is a server/workstation chip and will have full support of the dominant server operating system, Linux. Microsoft must support Sledgehammer or risk losing more of their already weak server market share.

    Long after Microsoft has done the work to get Windows running on the Server, Microsoft will incorporate x86-64 support into their desktop OS.

    Probably about the same time as they support Hyper-Threading and SSE3 for Intel.

    --

    If voting were effective, it would be illegal by now.

  15. Re:To anyone complaing because they have old syste by bogie · · Score: 3, Insightful

    I'm glad you're happy with your slow Celeron, but don't assume that the rest of us don't need the fastest CPU possible. Time is money, and the faster I or someone else can get our work done the better. There are a ton of apps out there right now where that speed CPU is just no longer a viable solution.

    I do happen to agree with not freaking out about a processor release date. But do realize that people are excited about this cpu for a reason.

    BTW many of us here were using computers and programming before you were born. You're only 20, for Pete's sake.

    --
    If you wanna get rich, you know that payback is a bitch
  16. short rant and a question by Erpo · · Score: 5, Interesting

    Everyone always makes the same really annoying mistake when it comes to athlon fsbs. Athlon front side busses do not run at 200MHz and 266MHz. They offer bandwidth equivalent to 200MHz and 266MHz by using both sides of the clock (DDR) on 100MHz and 133MHz fsbs. All new athlons use 133MHz DDR fsbs. The hammers will support 166MHz DDR memory busses, offering performance equivalent to 333MHz SDR memory.

    However, the notion of "fsb" is a little blurred with the hammer. Hammers will be directly connected to dimm banks and have integrated memory controllers, so the speed of the fsb will no longer be a determining factor in memory bandwidth. (* see mp note below) The traditional fsb to the traditional northbridge will be replaced by a "high speed" hypertransport link to a chip that connects to the agp slot, and has another (slower) hypertransport link to what could be called the south bridge. This "south bridge" will then connect the pci bus, serial ports, hard drives, usb ports, and any other devices that need to talk to the processor or main memory.

    *What does this mean for MP systems? Well, that's actually the really cool part. By moving the memory controller onto the processor and providing communication between processors over a hypertransport link (3.2GB/sec for dual, 6.4GB/sec for quad and above), memory bandwidth actually increases as more cpus are added! This is in contrast to a normal MP system where as more cpus are added, there is increased competition for a fixed resource (main memory) which is already the bottleneck in many single processor applications.
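
    A back-of-the-envelope sketch of that scaling argument, in Python. The 2.7 GB/sec per-cpu figure is my assumption (the peak of one 64-bit channel of 166MHz DDR), not a number from AMD, and the function names are made up. It only counts peak local bandwidth and ignores any traffic over the hypertransport links between cpus.

```python
def shared_bus_total(n_cpus, bus_gb_s):
    """Traditional MP: every cpu competes for the same memory bus,
    so the total stays fixed no matter how many cpus are added."""
    return bus_gb_s

def per_cpu_controller_total(n_cpus, local_gb_s):
    """Hammer-style MP: each cpu brings its own memory controller,
    so peak total bandwidth grows with the cpu count."""
    return n_cpus * local_gb_s

LOCAL_GB_S = 2.7  # assumption: one 64-bit channel of 166MHz DDR, peak

for n in (1, 2, 4):
    shared = shared_bus_total(n, LOCAL_GB_S)
    hammer = per_cpu_controller_total(n, LOCAL_GB_S)
    print(f"{n} cpus: shared bus {shared:.1f} GB/sec total "
          f"({shared / n:.2f} per cpu), "
          f"integrated controllers {hammer:.1f} GB/sec total")
```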

    That's my rant on terminology. Here's the question:

    I'm no kernel hacker, and I certainly don't know anything about writing schedulers, but it seems like this would require a change in how processes are handled in hammer mp systems. In traditional mp systems, every processor has equal access to main memory. If a process gets moved from one cpu to another, there's initial overhead to do the moving, but after that it can still get to its areas in memory without any problems. On a hammer mp system, migrating a process from one cpu to another would mean that in order to access its memory it would have to reach out of its cpu's hypertransport link, into another cpu's memory controller (which may or may not be busy) and into the attached ram. Considering there would not be enough bandwidth available on the 3.2GB/sec hypertransport bus (in the case of a dp system) for both processors to reach into each other's 166MHz DDR memory at the same time without suffering a performance hit, it seems like there would definitely be an advantage to keeping processes close to their data.

    What changes would this require to scheduling and process management code, if any? Has this already been addressed, or are there people working on it in the linux kernel?

    1. Re:short rant and a question by Erik+Hensema · · Score: 4, Informative

      Essentially this would be a NUMA system (non-uniform memory access). As far as I know Linux 2.6 will have support for these systems.

      In a real NUMA machine there would be a hierarchy of clusters of processors. Each cluster functions a bit like a traditional SMP system, but the clusters are interconnected over "low"-bandwidth busses. This makes memory accesses across clusters slower than direct accesses into the clusters' memory.

      Both the VM and the scheduler will have to know about this.
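
      To be clear, the kernel work is about the scheduler and VM making these decisions themselves. As a loose user-space illustration of the same idea (keep a process on one cpu so it stays near its data), here is a Python sketch using os.sched_setaffinity, which Python only exposes on some Unix platforms such as Linux. The helper name pin_to_cpu is made up, and pinning a cpu says nothing about where the memory actually gets allocated.

```python
import os

def pin_to_cpu(cpu):
    """Pin the calling process to a single cpu and return the new affinity set.

    Returns None on platforms where Python does not expose
    sched_setaffinity.
    """
    if not hasattr(os, "sched_setaffinity"):
        return None
    os.sched_setaffinity(0, {cpu})  # pid 0 means "the calling process"
    return os.sched_getaffinity(0)

if hasattr(os, "sched_getaffinity"):
    allowed = os.sched_getaffinity(0)   # cpus we may currently run on
    print("pinned to:", pin_to_cpu(min(allowed)))
    os.sched_setaffinity(0, allowed)    # restore the original mask
```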

      Another point with NUMA systems is the possibility of gaps in the main memory (discontiguous memory). Kernel hackers are currently working on support for that (discontigmem patch, merged in 2.5.34).

      --

      This is your sig. There are thousands more, but this one is yours.

  17. Why Pentium IVs are slow by stewartjm · · Score: 4, Informative

    The P4's x87 FPU and x86 ALU are just plain slow compared to P3s and Athlons. Though I am surprised your code is running 82x slower. I'd expect more like 2-8x slower for compute bound code. You can get a somewhat sensationalistic overview of why it's so slow at this link.
    If you want more in-depth numbers you can compare appendix C of the Intel Pentium 4 Optimization Manual with chapter 29 of Agner Fog's Pentium/II/III Optimization Manual. You can see the Athlon numbers in Appendix F of AMD's Athlon Optimization Manual.
    If you want to do number crunching with Pentium 4s your best bet is to use the SSE2 instructions/registers. You should be able to get a noticeable speedup by using the Intel C++ compiler and telling it to use SSE2 instructions. If you want to eke out max performance you'll have to use assembly language. Though you can probably get most of the way there using the Intel C++ Compiler's SSE2 intrinsics.
    I'm curious as to why your code is so much slower on a P4 than on an Athlon. The best way to find out would be to look at the assembly code that gcc is producing. You can do that by using gcc's -S option. If you'd like, send me the C code and the output from -S and I'll see if I see anything obvious.
    I'm somewhat paranoid about posting my email address. My paranoia seems to work, as I've received no more than the occasional spam in the last few years. My email address is my slashdot user name at woh.rr.com.

  18. They're having clock speed issues with Hammers... by Heretic2 · · Score: 5, Interesting

    You ever notice how all the Hammers are clock speed locked at 800MHz? Yeah, there's a reason for that. They're having problems cranking the clock speed up. For 800MHz they're fast as hell, beating a P4 with twice the frequency, but they're not gonna release them until they clock faster than current Athlons, so they're trying different types of transistors and whatnot.

    How the hell do I know that??? Look where I live, take a guess...The birds outside my window know things.

  19. Re:Comment non-sense by MSG · · Score: 3, Interesting

    it appears to be the floating point unit stalling for data.

    Well, if it's stalling for data, your problem is probably that the P4 has a *tiny* L1 data cache compared to... uh... anything. It's only 8K, compared to the Athlon's 64K. See the following URLs:
    http://www.tomshardware.com/cpu/02q2/020402/p4_2400-01.html
    http://www.geek.com/procspec/intel/northwood.htm
    http://www.geek.com/procspec/amd/k7select.htm

    It's probably also worth noting that Intel does NOT list the P4 as a "server processor". The P4 is listed as a desktop or workstation processor. Only P3, Xeon, and Itanium chips are recommended for server use:
    http://www.intel.com/products/browse/processor.htm?iid=Homepage+Find_Products_Processors&

    You might want to show that to management and reconsider your purchase of P4 equipment. Even a P3 is likely to perform better.

  20. is there a real difference? by Trepidity · · Score: 3, Interesting

    Something I've never seen a good explanation of -- is there performance-wise any difference between a 266 MHz clock with data transferred once per clock and a 133 MHz clock with data transferred twice per clock (despite the actual clock ticking rate of course)?

    1. Re:is there a real difference? by kimmo · · Score: 5, Informative

      Latency.

      With single data rate a new address can be sent every clock for all memory requests.

      With double data rate a new address can be sent only with every other "clock", while the data transmission rate stays the same. Effectively this means transferring double the data for each request, while the number of requests doesn't change.

      This isn't a very serious problem, since single bytes/bus wide data aren't usually transferred, but whole cachelines of 32/64 bytes. They will generate 4/8 sequential burst requests, nullifying much of the potential latency problem of "halfclocked" address generation.
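
      A small Python sketch of that burst arithmetic. The 8-byte (64-bit) bus width is an assumption on my part, chosen because it is what makes 32/64-byte cachelines come out as 4/8 transfers; the function names are made up.

```python
def burst_transfers(cacheline_bytes, bus_width_bytes=8):
    """Back-to-back data transfers needed to move one cacheline
    (ceiling division, in case the line is not a multiple of the bus width)."""
    return -(-cacheline_bytes // bus_width_bytes)

def bus_clocks(transfers, transfers_per_clock):
    """Real clock ticks the burst occupies on the data bus."""
    return -(-transfers // transfers_per_clock)

for line in (32, 64):
    n = burst_transfers(line)
    print(f"{line}-byte cacheline: {n} transfers, "
          f"{bus_clocks(n, 2)} clocks at double data rate, "
          f"{bus_clocks(n, 1)} clocks at single data rate")
```

      Each burst needs only one address but keeps the data bus busy for several clocks, so sending addresses at the plain clock rate is rarely the limit.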

      Ok, so why can't the addresses be sent the way the data is? That's another question, which someone else with more knowledge might explain.. Maybe it would complicate things too much, since the request-answer mechanism would have to be pipelined to accept new requests before previous requests are served. Or maybe the physical bus has some limitations, like using the same pins for address/data, which would simply make it impossible to send new addresses simultaneously (on the falling edge of the clock) while receiving data.

  21. Seems smart to me... by DeathPenguin · · Score: 3, Interesting

    Why should they rush the Hammer when the Itanium is failing as is? They know they can't push people to use their 64-bit capabilities, just like people didn't switch to Alphas. Squeeze every ounce of strength from the Athlon as they possibly can for now. Let Intel push the IA64 standard on everyone first to create a demand to migrate from 32-bit to 64-bit. That's where AMD plans to make their killing.

    I would imagine it would be better to release Hammer ASAP and create the 64-bit market themselves. Then again, I don't know the logistics required for such a launch, nor do I know exactly how much better, if any better, x86-64 would perform. Let's face it, not many people care about 64-bit versus 32-bit, they only know what the dork at CompUSA tells them. And if Hammers can't outscore P4s in the 32-bit apps that very short-sighted people care about, then there is really no place for Hammer in the consumer market.

    What I've heard, mostly from internet gossip, is that AMD is having problems making Hammer scale high enough to beat the P4 in 32-bit apps, although it only requires roughly 1 Hammer MHz to beat 3 P4 MHz. I've also heard that AMD is having problems making Hammers run above 800MHz. With the expected debut of the P4 at clock speeds above 3GHz, the Hammer doesn't stand much of a chance in 32-bit apps.

    In short, don't expect to see Hammers until Intel manages to salvage the Itanic.