Slashdot Mirror


AMD64 Preview

Araxen writes "Over at Anandtech.com they have an interesting preview of AMD's 64-bit processor on an nForce3 mobo. The results are very impressive, with the Athlon64 soundly beating Intel's best P4 processor in their gaming benchmarks. This was only in 32-bit mode, no less! I can't wait for 64-bit benchmarks to come out!"

58 of 290 comments

  1. Opteron Benchmarks, not Athlon 64 by ultor · · Score: 5, Informative

    The benchmarks are from a 2GHz Opteron, not an Athlon 64. They are intended to give an example of the performance of the new chip. Unfortunately, upon introduction, only the Athlon FX, running on ECC memory, will be capable of using dual-channel memory. And from what I've heard, this CPU will cost in the vicinity of $600+. The first non-ECC dual-channel platform will be introduced in 2004.

    1. Re:Opteron Benchmarks, not Athlon 64 by WoTG · · Score: 2, Interesting

      The top-end chip might be in the $600 range, but the cheaper chips will be significantly less than that.

      For comparison, the 1.8GHz Opterons are in the $460 range on Pricewatch. So the A64's will have to be somewhat lower than that in price. (Unless they skip 1.8 altogether)

      Also, for many benchmarks, dual-channel memory isn't that important. What is most important with the A64's (and Opterons) for desktop application speed is the on-chip memory controller. This gives these chips dramatically lower latency. So, we can still expect the low end A64's to be good in many, many applications - including games, I think.

  2. Not an Athlon64, but an Opteron by doormat · · Score: 4, Informative

    Anandtech is only comparing single-processor Opteron performance against everything else, not, in fact, Athlon64 performance. The primary difference is that the Opteron has a dual-channel memory subsystem, whereas the Athlon64 has a single-channel system. This difference will have an effect on performance.

    --
    The Doormat

    If you're not outraged, then you're not paying attention.
    1. Re:Not an Athlon64, but an Opteron by heli0 · · Score: 4, Informative

      There will actually be two lines at launch. The 940-pin Athlon64 FX (a 1-way Opteron) will have dual-channel DDR, while the cheaper 754-pin Athlon64 will have single-channel DDR.


      Athlon64 Showing Up
      Pricing for Athlon 64 leaks: 939 pin chip won't be compatible with 940 CPU

      --
      Whenever the offence inspires less horror than the punishment, the rigour of penal law is obliged to give way...
    2. Re:Not an Athlon64, but an Opteron by MBCook · · Score: 5, Informative
      Your comment is somewhat misleading. There are TWO Athlon 64s being launched (or as Overclockers.com called them, the "Opteron that's not an opteron but is an opteron" and the "opteron that's really not an opteron" or something like that). Anandtech compared the equivalent of an Athlon64 FX, not an Athlon64. Here is the skinny:

      Athlon64 FX
      This is a 1xx Opteron. It's still dual channel, and it uses ECC memory (for now?). This is the "performance" part, the high-end one. If we're trying to find who has the fastest CPU, this is the one to test. Their tests are quite valid for this, IMHO.

      Athlon64
      This is the "budget" Athlon64. It only has one memory channel; I don't know if it has ECC or not. Yes, this will be slower, but it will also be cheaper, and the motherboards for it could be cheaper too (since it doesn't have that second memory channel).

      So, I think that this is a very important article. Look how fast an Opteron/Athlon64 FX is compared to a P4. A 2 GHz Opteron/Athlon64 FX is beating a 3 GHz P4. This is all on a 32-bit OS and software. When you run 64-bit software that knows about all the extra registers and can do 64-bit math natively should it need to, the computer will be faster still. Tim Sweeney said that native versions of UT2003 (or something) were running up to 20% faster on x86-64 without optimisations, just from going to 64-bit mode. And for most of us, the fact that it can manage over 4GB of memory easily is, for now, only icing on the cake.

      AMD has a great processor. I can't wait to see more info on these things. The fact that it does so well in 32-bit mode is important, since you currently can't get Windows for the processor (there is no x86-64 version of Windows out yet). If it were a great processor, but you were forced to put up with terrible performance for 6+ months after buying one (because it wasn't good with 32-bit software like Windows and what you run), would anyone buy it? This thing is faster today, and should only get faster when you run native software. I'm saving my pennies (and yes, I know it will take a lot of pennies ;).

      --
      Comment forecast: Bits of genius surrounded by a sea of mediocrity.
    3. Re:Not an Athlon64, but an Opteron by sirsnork · · Score: 5, Informative

      All Opterons are made on the same line; they all have the exact same core. They are then tested to see if they work in 8-way; if not, they are tested for 4-way, and if they fail that they are tested for 2-way. In 2-way they only use 2 of the 3 HT controllers, and only one talks to another CPU (the second connects to the HT controller). In a 4-way config the CPUs use 2 HT controllers to talk to other CPUs, and one of them uses the third to talk to an HT controller (in fact, sometimes they have 2 CPUs talking to HT controllers, one for PCI-X and the other for all the rest). Finally, in 8-way, 4 of the CPUs use all 3 of their HT controllers and the 4 at the edges only use 2, but again some also have to talk to the HT controller(s) on the "outside".

      --

      Normal people worry me!
    4. Re:Not an Athlon64, but an Opteron by mjuarez · · Score: 2, Interesting

      It's not vaporware. You can buy it from a lot of places on the Net. See here for one:

      http://tinyurl.com/mhn9

      You can also find it in PriceWatch, at least 5 vendors offer it currently.

    5. Re:Not an Athlon64, but an Opteron by Jeff+DeMaagd · · Score: 3, Informative

      I don't know how this fits with your listing, but it looks like there might be another derivative budget chip at or below A64 that might not have a 64 bit mode at all:


      AMD to ship Athlon 64s as Athlon XPs

      I do find it amusing that people are commenting how good something is or is not before the damn product has been released, particularly when there is so little hard information on what it will really amount to.

      So far one difficulty I see is the lack of Hammer boards that have AGP _and_ PCI-X slots or at least 64 bit/66MHz PCI slots, and they commented on this in that review last I checked. I think part of the assumption was that because these systems are for servers, AGP isn't needed, or if AGP is needed, it was assumed that PCI I/O slots weren't that critical.

  3. Re:Interesting by Dun+Malg · · Score: 3, Informative
    I just wonder if it can compete with the Intel x86-64 line of processors.

    Huh? There's no such thing as an "Intel x86-64" processor. x86-64 is solely AMD's implementation.

    --
    If a job's not worth doing, it's not worth doing right.
  4. Re:64-benchmarks wont be good by Slack3r78 · · Score: 4, Informative

    I actually read this this morning, and there are a couple of important things to note - the chip being 'previewed' isn't actually an Athlon64 - it's a 1.8GHz Opteron overclocked to 2.0GHz, which is the expected clock rate of the first A64, prorated at 3200+. It'll give us an idea of what to expect, but nothing too specific.

    The other important thing to note is that the comparisons were mostly against P4s and an Athlon XP, with a dual 3.06GHz Xeon thrown in for good measure, all 32-bit chips. And the 'Athlon64' owned most of the competitions, showing that its 32-bit mode is just as good as rumored. There were no Itaniums in the competition, so only 32-bit modes can be compared here. However, if the A64 turns out to be as good in its native 64-bit mode as the 32-bit numbers might lead you to believe, the Athlon 64 looks like it very well could be a force to be dealt with.

  5. Re:Idiots... by Slack3r78 · · Score: 2, Interesting
    They have announced physical packaging changes scheduled about every 4 months until 2005.
    Do you have a source on this? Everything I've read on the Athlon64s for months on end now has mentioned nothing but Socket 754. I have a sneaking suspicion that you're a troll. After all, I seem to recall Intel changing the P4 socket midway through the game. But I take it that's different because they're Intel?
  6. Re:64-benchmarks wont be good by robbyjo · · Score: 2

    If you read the article, the comparison is against a dual P4 Xeon. Some of the tests didn't enable any hyper-threading stuff (and thus didn't take any advantage of the dualies). The Opteron beat the P4 by a very high margin, except for content creation and general usage stuff, where the P4 wins. But take that with a grain of salt.

    64-bit tests won't be fair to either side. It's like comparing apples to oranges. For me, I'm looking forward to seeing a head-to-head comparison on programs that are optimized for each platform. For example: a program that is optimized for both Itanium and Opteron, to see how they fare.

    --
    Error 500: Internal sig error
  7. Will it be secure? by samjam · · Score: 4, Interesting

    When are some of these newer processors going to implement the executable permissions bit in the MMU so that the STACK can be NON-EXECUTABLE? (OK, I know some trampoline stuff needs executable stacks; well, they can ask for it where needed by setting the executable bit for a small region.)

    And when are some of these new processors going to be fully virtualizable? I'm talking about PUSHF and POPF generating exceptions like directly setting the interrupt flag does.

    Think how easy plex86 would be to run on a processor that did this properly?

    Code-morphing Transmeta (come on!), AMD (maybe?), Intel (no chance?)

    Sam

    1. Re:Will it be secure? by Amoeba · · Score: 4, Funny

      The sad thing is I understood everything you just said.

      My God, I *am* a geek.

      --
      Do not taunt Happy-Fun Ball
    2. Re:Will it be secure? by MROD · · Score: 3, Informative

      Not exactly.

      Within the MMU look-up tables the memory pages can be marked as being executable or not. Hence, if a program tries to jump to memory in a protected page (ie. not marked as executable) it will cause an exception.

      The current x86 MMU doesn't have this ability, unlike some processors such as the Sun UltraSPARC (though not any versions previous to this).
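      A toy sketch of the check described above, in Python. Every name here (ToyMMU, map_page, fetch_instruction) is made up for illustration, and real page-table formats differ from CPU to CPU; the only point is that an instruction fetch from a page without the executable permission raises an exception.

```python
# Toy model of an MMU with a per-page "executable" permission.
# All names are hypothetical; this is not any real CPU's page-table layout.
PAGE_SIZE = 4096


class PageFault(Exception):
    """Raised when an access violates the permissions of its page."""


class ToyMMU:
    def __init__(self):
        # page number -> set of permission letters, e.g. {"r", "w", "x"}
        self.pages = {}

    def map_page(self, address, perms):
        """Record the permissions for the page containing `address`."""
        self.pages[address // PAGE_SIZE] = set(perms)

    def fetch_instruction(self, address):
        """Allow the fetch only if the page is mapped and marked executable."""
        perms = self.pages.get(address // PAGE_SIZE)
        if perms is None or "x" not in perms:
            raise PageFault(f"fetch from non-executable page at {address:#x}")
        return "ok"


mmu = ToyMMU()
mmu.map_page(0x00400000, "rx")  # code page: readable and executable
mmu.map_page(0x7FFF0000, "rw")  # stack page: readable and writable only
```

      With this setup, fetching from the code page succeeds, while a jump into the stack page (the classic buffer-overflow payload location) raises PageFault instead of running.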

      --

      Agrajag: "Oh no, not again!"
  8. Semantics, maybe, but... by Murdock037 · · Score: 3, Informative

    Intel doesn't have an x86-64 line of processors. They have an IA64 line of processors.

    The two apparently aren't interchangeable. There's a coming battle in which software companies have to choose between the two, or support both, which would be tough on both them and consumers.

    Apparently, AMD's x86-64 set is easier to deal with, and more of a natural progression from where the processors are now. (It also apparently runs 32-bit code at rates comparable to 32-bit chips at the same clock speed.) Intel's IA-64 is a total reworking, and a bitch to work with, from what I've read.

    In the end, it seems like the smart choice would be for everybody to toss their hat in with x86-64 (which means Intel would have to, as well, and essentially concede defeat and lose face); it probably won't happen, though, because Intel is Intel.

    Check out this article at the Inquirer, which I've basically just paraphrased, but it does go into some interesting Windows 64 dealings.

  9. 64bit performance gains... by Natalie's+Hot+Grits · · Score: 5, Informative

    Before anybody starts talking about how little 64bit CPUs actually increase performance, let me tell everyone what 64 bit mode will actually bring to the table over the Opteron/Athlon64 32 bit modes:

    1) more registers. This will get us a fair performance increase from the start, as compilers will have more registers to work with when doing calculations on multiple pieces of data.

    2) support for larger system memory sizes. This won't help you in video games, but it will help you doing high end Photoshop work, and other applications (provided you spend the money to get more memory put into your system)

    3) native operations on 64 bit data. Typically, when someone wants to do operations on a 64 bit integer in a 32 bit CPU, you have to split up the work in software. Now with 64 bit registers, you will be able to do operations on 64 bit integers in the same time as it takes to do the same operation on a 32 bit integer.

    4) when using native 64 bit mode, certain legacy instructions of x86-32 are deprecated. This is a cleanup for the x86 ISA, which in the past has contained literally EVERYTHING that the previous generation of CPU supported. AMD's x86-64 ISA eliminates these legacy features and moves them into firmware emulation (don't worry, it won't degrade any modern 32 bit code, just terribly outdated stuff from the 386 days, which doesn't need 2GHz of power in the first place)
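    To make point 3 concrete, here is a rough Python sketch of what "splitting up the work in software" means for a 64 bit add on a 32 bit CPU: add the low halves, carry into the high halves (the ADD / ADC pattern). The function names are made up for illustration, and Python integers only stand in for fixed-width registers.

```python
MASK32 = 0xFFFFFFFF
MASK64 = 0xFFFFFFFFFFFFFFFF


def add64_with_32bit_halves(a, b):
    """Add two unsigned 64-bit values using only 32-bit-wide pieces."""
    a_lo, a_hi = a & MASK32, (a >> 32) & MASK32
    b_lo, b_hi = b & MASK32, (b >> 32) & MASK32

    lo_sum = a_lo + b_lo
    carry = lo_sum >> 32                 # plays the role of the carry flag after ADD
    lo = lo_sum & MASK32
    hi = (a_hi + b_hi + carry) & MASK32  # plays the role of ADC; wraps past 64 bits

    return (hi << 32) | lo


def add64_native(a, b):
    """The same addition as a single 64-bit operation."""
    return (a + b) & MASK64
```

    Both functions give the same answer; the difference is that the first needs two dependent additions plus carry handling, while a 64 bit register does it in one step.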

    On top of these performance enhancements that 64 bit mode brings you, you get all of this just because you are using AMD's Opteron/Athlon64 CPU:

    1) Dual channel DDR Memory interface, with memory controller on the die of the CPU. This reduces latency and improves memory bandwidth so dramatically that even Intel's off die memory controller can't keep up (this is why video games are so much faster on the amd64 platform than on athlon-32 platform)

    2) HyperTransport bus to the south bridge, which will give high bandwidth access to the PCI bus, PCI-X, and other IO intensive controllers. Eventually AGP slots will be phased out for PCI Express slots, which will be universal for both video and other devices.

    3) when using multiple CPUs in the same system, the new AMD-64 platform gives you dedicated memory bandwidth to each CPU installed. On the Intel and athlon-32 platforms, all the CPUs in the system share the same memory controller, which runs either single or dual channel DDR anywhere from 266MHz - 400MHz.
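    A back-of-the-envelope sketch of point 3, in Python. It assumes a standard 64 bit (8 byte) DDR channel and uses peak theoretical numbers only; the function names are invented for the example, and it ignores contention, latency, and the cost of one CPU reaching memory attached to another CPU.

```python
def peak_bandwidth_gb_s(mega_transfers_per_s, channels, bus_bytes=8):
    """Peak theoretical bandwidth of a DDR memory controller in GB/s.

    Assumes each channel is 64 bits (8 bytes) wide.
    """
    return mega_transfers_per_s * 1e6 * bus_bytes * channels / 1e9


def per_cpu_shared(controller_gb_s, cpus):
    """One shared controller: every CPU competes for the same bandwidth."""
    return controller_gb_s / cpus


def per_cpu_dedicated(controller_gb_s, cpus):
    """One controller per CPU: each CPU keeps its own local bandwidth."""
    return controller_gb_s


dual_ddr400 = peak_bandwidth_gb_s(400, channels=2)   # dual channel DDR at 400MHz
shared_2way = per_cpu_shared(dual_ddr400, cpus=2)
dedicated_2way = per_cpu_dedicated(dual_ddr400, cpus=2)
```

    Under these assumptions a 2-way box with one shared dual channel DDR400 controller averages half the peak per CPU, while the per-CPU controller design keeps the full figure for each CPU's local memory.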

    --
    Two infinite things: your stupidity and mine. But I'm not sure about the latter. If my sig offends you, I'm sorry.
    1. Re:64bit performance gains... by p3d0 · · Score: 4, Informative
      Nice summary. I would only add a couple of things:
      • 64-bit math on IA32 requires register pairs. With 8 GPRs, one of which is reserved for the stack pointer, that means you can only keep 3 long-longs in registers. On AMD64, even if you dedicate another register to the frame pointer, you can still get 14 long-longs in registers: almost a factor of 5 improvement.
      • The benefits from the memory subsystem will be offset by the fact that objects containing pointers will be twice as big as on IA32. That means objects could have twice the cache footprint and twice the memory bandwidth requirements.
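      The register arithmetic in the first bullet, spelled out as a small Python sketch (the helper name is made up; the register counts are the ones given above):

```python
def values_held_in_registers(total_gprs, reserved, regs_per_value):
    """How many 64-bit values fit in the general purpose registers at once."""
    return (total_gprs - reserved) // regs_per_value


# IA32: 8 GPRs, stack pointer reserved, each long long needs a register pair.
ia32 = values_held_in_registers(total_gprs=8, reserved=1, regs_per_value=2)

# AMD64: 16 GPRs, stack pointer and frame pointer reserved, one register each.
amd64 = values_held_in_registers(total_gprs=16, reserved=2, regs_per_value=1)

ratio = amd64 / ia32
```

      That gives 3 versus 14, a ratio of about 4.7, which is where "almost a factor of 5" comes from.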
      --
      Patrick Doyle
      I mod down every jackass who puts his moderation policy in his sig. Oh, wait a sec....
    2. Re:64bit performance gains... by barawn · · Score: 2, Insightful


      The benefits from the memory subsystem will be offset by the fact that objects containing pointers will be twice as big as on IA32. That means objects could have twice the cache footprint and twice the memory bandwidth requirements.


      Except that pointers make up only a small fraction of the code footprint of an executable - most of it is ints, which still are 32-bit by default on x86-64. In general you can easily minimize the number of pointers in code by doing math (i.e., with 32-bit ints) on one base pointer.

      The estimate is that code size will increase by about 10-15% on x86-64. Considering that the L2 cache is 1MB, as opposed to the standard size of 512KB nowadays, it's a net win. Presumably in the future they'll increase the cache size even more.

    3. Re:64bit performance gains... by timeOday · · Score: 3, Interesting
      The benefits from the memory subsystem will be offset by the fact that objects containing pointers will be twice as big as on IA32. That means objects could have twice the cache footprint and twice the memory bandwidth requirements.
      I wonder if it will be possible to use 32 bit pointers within the x86-64 ISA? This would save memory on pointers but give you access to the extra registers, instructions, and one-whack 64 bit math (which should be great for encryption and compression, without using special MMX instructions).

      I thought I remembered SPARC being able to do this, but it looks like SPARC programs must be compiled with 64 bit pointers to efficiently perform 64 bit arithmetic.

    4. Re:64bit performance gains... by p3d0 · · Score: 4, Informative
      I meant data objects, as in object-oriented programming. Not object files. OO data tends to have a lot of pointers.
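      A rough way to see how much the pointer mix matters, sketched in Python. The layout rule (each field aligned to its own size, struct padded to its widest field) is a simplification of what C compilers typically do, and both example objects are invented for illustration, not taken from any real program.

```python
def struct_size(field_sizes):
    """Size of a C-style struct where each field is aligned to its own size."""
    offset = 0
    for size in field_sizes:
        offset = (offset + size - 1) // size * size  # pad up to the field's alignment
        offset += size
    align = max(field_sizes)
    return (offset + align - 1) // align * align     # tail padding


def pointer_heavy_object(pointer_bytes):
    """Hypothetical OO node: class/vft pointer, two link pointers, one 32-bit int."""
    return struct_size([pointer_bytes] * 3 + [4])


def int_heavy_object(pointer_bytes):
    """Hypothetical record: class/vft pointer followed by six 32-bit ints."""
    return struct_size([pointer_bytes] + [4] * 6)


pointer_heavy_growth = pointer_heavy_object(8) / pointer_heavy_object(4)
int_heavy_growth = int_heavy_object(8) / int_heavy_object(4)
```

      Under these assumptions the pointer-heavy node doubles (16 to 32 bytes), while the int-heavy record grows only from 28 to 32 bytes, so the real-world hit depends heavily on how pointer-dense the data is.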

      Having said that, object files will be bigger too. I'm not sure where you're getting your 10-15%; have you actually checked? I don't have access to our AMD64 boxes right now so I can't take a look at the object files, but I think the difference could easily be more than that for object-oriented code, for a number of reasons:

      • Probably 2/3 of the instructions in hot code will need a REX prefix, either because they use registers r8-r15, or because they manipulate addresses.
      • Only mov instructions can use an 8-byte immediate. Anything else that needs an 8-byte immediate must load that immediate into a register first with a 10-byte mov instruction, possibly spilling whatever was in that register. We could be talking about 3 extra instructions totalling maybe 18 bytes on an instruction that used to occupy maybe 6 bytes. Class tests in a polymorphic inline cache are particularly affected by this. Also, relocations (ie. jumps between different DLLs) must be 64 bits because there's no reason to think DLLs will be loaded within a 32-bit offset of each other.
      • Autos that are pointers now occupy twice as much stack space, making your stack frame that much less likely to fit within an 8-bit signed offset (ie. 127 bytes). That means you can't use [esp+12h] addressing to access your locals, but rather [rsp+12345678h], which requires three extra bytes (not to mention the REX prefix). Highly optimized functions often have lots of variables, especially after inlining, and in OO code, lots of the variables are pointers, so this one could hurt.
      • Similarly, the AMD64 linkage convention on Linux has 6 parameters passed in registers (while IA32 has none) which also makes the stack frame bigger. This can be mitigated by using a frame pointer, but if you don't dedicate a register as a frame pointer, then you need to access your parameters with the stack pointer (rsp), and the parameters are always at the largest offsets from rsp. Result: parameters are likely not to be reachable with an 8-bit offset from rsp.
      If I had to estimate off the top of my head, I'd guess code would be more like 25% bigger, while OO data could be as much as 50% bigger. (Remember that each object contains a pointer to its class or vft, and many object fields are pointers.)
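      Two of the byte counts from the bullets above, checked with a small Python sketch. The helper names are made up; the encodings shown are the plain "load an immediate into eax/rax" forms, and the displacement helper ignores the zero-offset special case.

```python
def encode_mov_eax_imm32(value):
    """mov eax, imm32: opcode B8 followed by a 4-byte little-endian immediate."""
    return bytes([0xB8]) + value.to_bytes(4, "little")


def encode_mov_rax_imm64(value):
    """mov rax, imm64: REX.W prefix (48), opcode B8, 8-byte immediate."""
    return bytes([0x48, 0xB8]) + value.to_bytes(8, "little")


def displacement_bytes(offset):
    """Bytes needed to encode a nonzero stack offset.

    One byte if the offset fits in a signed 8-bit value, otherwise four.
    """
    return 1 if -128 <= offset <= 127 else 4


short_mov = encode_mov_eax_imm32(0x12345678)
long_mov = encode_mov_rax_imm64(0x123456789ABCDEF0)
extra_disp = displacement_bytes(0x12345678) - displacement_bytes(0x12)
```

      The 32-bit load is 5 bytes, the 64-bit load is the 10-byte mov mentioned above, and a far stack offset costs 3 more displacement bytes than a near one.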
      --
      Patrick Doyle
      I mod down every jackass who puts his moderation policy in his sig. Oh, wait a sec....
    5. Re:64bit performance gains... by Performer+Guy · · Score: 2, Informative

      1) Is NOT an innate property of a 64 bit processor. You could build a 32 bit processor with more registers.

      2) is valid

      3) True but almost no software I use does much 64 bit type processing.

      4) You could do this with a compiler, if the instruction is slow. You don't save die area, because you still need to support it in 32 bit mode.

      You missed out the biggest winner, the massive cache on this processor, 1MB I believe, that's a big step up.

      You put a cache that size on a 32 bit Athlon and you will see some big improvements.

      I don't think it's right to say 64 bit is inherently faster, if your application needs it then yes, but for 32 bit class apps, 32 bit mode is faster.

    6. Re:64bit performance gains... by Ninja+Programmer · · Score: 3, Insightful
      Object code size will *NOT* be bigger -- it should be *SMALLER* if anything:
      • Pointers inside objects occupy run-time memory from the *HEAP* -- i.e., they don't have any presence in the object file.
      • The use of REX to access r8-r15 is the register based alternative to using a SIB byte, and offset for an [ebp+offset] encoding for directly accessing the stack. I.e., paying the cost of an extra prefix byte saves in both execution speed and actual code size versus the spill/fill style or direct stack based alternative.
      • Auto areas that are larger than 256 bytes because they are filled with a bazillion pointers are indicative of more serious program design flaws (that people don't generally have) than the statistical potential of loss from using far offset values from it. This is an extremely marginal case at best.
      • I don't understand your linkage complaint -- the more parameters passed in registers, the fewer that will end up on the stack.
    7. Re:64bit performance gains... by p3d0 · · Score: 2, Interesting
      Pointers inside objects occupy run-time memory from the *HEAP* -- i.e., they don't have any presence in the object file.
      Duh, yeah. What's your point?
      The use of REX to access r8-r15 is the register based alternative to using a SIB byte, and offset for an [ebp+offset] encoding for directly accessing the stack. I.e., paying the cost of an extra prefix byte saves in both execution speed and actual code size versus the spill/fill style or direct stack based alternative.
      Good point about r8-r15. However, the problem with needing REX prefixes for address manipulations is still a pure loss.
      Auto areas that are larger than 256 bytes because they are filled with a bazillion pointers are indicative of more serious program design flaws (that people don't generally have) than the statistical potential of loss from using far offset values from it. This is an extremely marginal case at best.
      Huh? Do you do compiler work? Surely you have seen methods with more than 128 bytes of local variables after inlining has occurred?

      Besides, as compiler writers, we don't have the luxury to tell application developers to "just redesign your code".

      I don't understand your linkage complaint -- the more parameters passed in registers, the fewer that will end up on the stack.
      Forget the linkage complaint, it's bogus. I was thinking of a different parameter-related problem that is specific to the compiler I work on right now. It's not a general problem.
      --
      Patrick Doyle
      I mod down every jackass who puts his moderation policy in his sig. Oh, wait a sec....
  10. Re:64-benchmarks wont be good by kenneth_martens · · Score: 2, Informative
    Intel's IA-64 emulates 32-bit unlike AMD's 64-bit chips which have 32-bit hardware. So we can expect AMD to beat Intel easily in 32-bit stuff.

    If you had RTFA, you would know that the benchmarks compared the Athlon64 against Pentium 4s and Xeons, not against IA64. What the benchmarks show is that the 32-bit performance of the Athlon64 is on par or better than the best Pentium 4 processors, and is better than the current Xeons. IA64 is not benchmarked in the article.

    The 64-bit performance of the Athlon64 is not being benchmarked in the article; it is the 32-bit performance relative to leading 32-bit processors that is the issue.
  11. Re:Intel's response by Glasswire · · Score: 3, Informative

    Prescott with PNI new instructions, 1MB L2 cache, clocking up to 4GHz and beyond, 800MHz front side bus and increased software support for Hyperthreading (e.g. 2.6.x Linux kernels know how to do HT scheduling much more efficiently).

    Watch the Xmas benchmarks, that's when it matters...

  12. About 64-bit gaming performance by yourruinreverse · · Score: 2, Interesting

    This was only in 32-bit mode no less! I can't wait for 64-bit benchmarks come out!

    The above seems to imply that game benchmark results will be better at 64-bit. Now, if those games needed access to many gigabytes of game data, that would be an entirely correct assumption.

    Apart from the utter pointlessness of 64-bit gaming for the coming years because of the comparatively humble data requirements of current games, a benchmark of 64-bit gaming performance (say, its 3D calculation or its AI plotting performance) would be mostly a waste of time, as you would very likely see only equal performance at best.

    --
    JeR
    1. Re:About 64-bit gaming performance by amorsen · · Score: 4, Informative
      a benchmark of 64-bit gaming performance (say, its 3D calculation or its AI plotting performance) would be mostly a waste of time, as you would very likely see only equal performance at best.

      This would have been the case if IA-32 was a sane architecture. Athlon64 in IA-32 mode has only 8 visible general purpose registers, whereas it has 16 in 64-bit mode. That makes 64-bit mode a win in almost all cases. Technically it would have made sense for AMD to introduce a new 32-bit mode, but it would probably have been bad for marketing.

      --
      Finally! A year of moderation! Ready for 2019?
    2. Re:About 64-bit gaming performance by mjuarez · · Score: 2, Insightful

      The above seems to imply that game benchmark results will be better at 64-bit.

      With a little tweaking and register optimization, they will be better. You have double-sized registers, and many more general purpose registers. In tight inner loops, being able to complete a loop in 10 vs. 20 clocks makes a hell of a difference.

      Now, if those games needed access to many gigabytes of game data, that would be an entirely correct assumption.

      We are getting to that point. I believe Doom 3's textures are approaching the gigabyte size, and you need many of those in memory at the same time to be able to correctly display a level. Of course, even if it were not necessary, being able to load ALL the textures into memory will make the game so much more playable. In general, if the RAM is there, gaming companies will find a way to use it to make the game better/faster.

      Marcos

    3. Re:About 64-bit gaming performance by Ospeovedizer · · Score: 3, Informative

      What you say is true, if the only improvement of AMD64 is 64-bit support. However, AMD64 also doubles the number of general-purpose and XMM (for SSE, SSE2) registers to 16 of each. This will make many programs run faster, as having 8 general-purpose registers is just not enough. Far too much time is given to swapping data into and out of registers on x86.
      The additional registers are really what I like about AMD64. I couldn't care less about 64bit for now.

      --
      "We demand rigidly defined areas of doubt and uncertainty!" - Vroomfondel, H2G2
    4. Re:About 64-bit gaming performance by MROD · · Score: 2, Insightful

      Hmm.. you say that having bigger and more registers is going to increase the speed of programs..

      Well, this may be true if the only code running is the game and doesn't transfer double the data from the memory to process (64bits rather than 32).

      However, what happens when the operating system does a context switch or some other exception occurs? The latency from saving the processor context is going to go way up as you have to save far more data to memory and then load the same large amount of data in for the new context.

      If you double the size of the registers and double the number of registers (and possibly add to the size of the CPU's other program registers) you suddenly quadruple the amount of data which has to be changed over. On a system with many threads and processes running this can add up to a significant deficit.
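      The "quadruple" figure as plain arithmetic, in a Python sketch. It counts only the general purpose registers, plus the XMM registers as a second data point (128 bits each, 8 of them in 32 bit mode and 16 in 64 bit mode); everything else a real context switch saves is left out, and the function name is made up.

```python
def register_file_bytes(num_registers, register_bytes):
    """Bytes needed to save one set of registers to memory."""
    return num_registers * register_bytes


# General purpose registers: 8 x 32-bit versus 16 x 64-bit.
gpr_32 = register_file_bytes(8, 4)
gpr_64 = register_file_bytes(16, 8)

# XMM registers are 128 bits wide in both modes; only the count doubles.
xmm_32 = register_file_bytes(8, 16)
xmm_64 = register_file_bytes(16, 16)

gpr_ratio = gpr_64 / gpr_32
combined_ratio = (gpr_64 + xmm_64) / (gpr_32 + xmm_32)
```

      The integer registers alone do go from 32 to 128 bytes (4x), but once the XMM state is counted the growth in this simplified tally is closer to 2.4x (160 to 384 bytes).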

      Now, if your programmers decide that they want to work on 64bit wide data instead of the 32bit they used to on the old system, you suddenly find that your processor is having to move double the amount of data around the system. You have to hope that any increases in memory bandwidth the engineers included are enough to cater for this.

      I think the main thing I'm trying to say is that 64bit computing isn't necessarily faster than 32 bit computing. Indeed, because some of the overheads can be double or quadruple, it can be a performance hit.

      Sorry for possibly raining on your parade, but that's how the cookie crumbles.

      --

      Agrajag: "Oh no, not again!"
    5. Re:About 64-bit gaming performance by mjuarez · · Score: 2, Informative

      However, what happens when the operating system does a context switch or some other exception occurs? The latency from saving the processor context is going to go way up as you have to save far more data to memory and then load the same large amount of data in for the new context.

      There is no "context-switch" delay. The processor takes exactly the same amount of time doing a context switch at 64 bits as at 32 bits. Remember that the processor has to do a certain number of clocks per second, and it cannot "fall behind" or get delayed.

      Now, if your programmers decide that they want to work on 64bit wide data instead of the 32bit they used to on the old system, you suddenly find that your processor is having to move double the amount of data around the system. You have to hope that any increases in memory bandwidth the engineers included are enough to cater for this.

      If you read the article, you will have noticed that Opteron has an integrated memory controller. In this case, it means the controller was moving data at 2.0GHz. This adds up to significant increases in performance in the benchmarks, as could be seen in the article.

      I think the main thing I'm trying to say is that 64bit computing isn't necessarily faster than 32 bit computing. Indeed, because some of the overheads can be double or quadruple, it can be a performance hit.

      Absolutely true. It can be slower (just take a look at Itanium :-), but it shouldn't! Did you even read the article? In most of the benchmarks, the Opteron was even faster than dual-Xeons (although I'm not sure the benchmarks were fully using the additional processor) I didn't see a "performance hit" anywhere in the benchmarks.

    6. Re:About 64-bit gaming performance by katz · · Score: 2, Insightful

      Your analysis is detailed and insightful, and at one time this was a big issue. However, today's sheer clock speeds and superscalar pipelines render it far less of a burden. How fast does your OS switch contexts? Every few milliseconds? "iostat" on my 1.0 GHz Athlon T-bird says 351 cs/sec; 1.0/351 ~= 2.8 ms execution time per context. This is enough time for even the tightest [...] minuscule compared to the time tight loops have at their disposal. This, coupled with the fact that context switches are so carefully and constantly streamlined to be as efficient as possible, makes the context switch, which was an impediment at one time, insignificant now.
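      The numbers above, worked through in a short Python sketch. The 351 cs/sec and 1.0 GHz figures are the ones quoted; the 10,000-cycle cost per switch is purely an assumed placeholder (not a measurement) to show how the overhead fraction would be computed, and the function names are made up.

```python
def time_per_context_ms(switches_per_second):
    """Average run time between context switches, in milliseconds."""
    return 1000.0 / switches_per_second


def cycles_per_context(clock_hz, switches_per_second):
    """Average number of clock cycles a task gets between switches."""
    return clock_hz / switches_per_second


def switch_overhead_fraction(switch_cost_cycles, clock_hz, switches_per_second):
    """Share of all cycles spent switching, for an assumed per-switch cost."""
    return switch_cost_cycles * switches_per_second / clock_hz


slice_ms = time_per_context_ms(351)
slice_cycles = cycles_per_context(1.0e9, 351)

# ASSUMPTION: 10,000 cycles per switch is an illustrative guess only.
overhead = switch_overhead_fraction(10_000, 1.0e9, 351)
```

      At 351 switches per second each context gets a bit under 3 ms, or roughly 2.8 million cycles at 1 GHz; with the assumed per-switch cost the overhead is a fraction of one percent, and even a switch several times more expensive would stay small.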

      Roey

    7. Re:About 64-bit gaming performance by be-fan · · Score: 2, Informative

      It has to lok like its doing this, but doesn't have to do this :) A P4 has about 128 internal registers, and uses renaming hardware to present 8 to a given task. A given task may use more than 8 registers, where the CPU figures out ways to avoid spilling a register and doing a rename instead. Now, during a context switch, the CPU doesn't actually have to dump the full context of the processor out to memory. Most of the state gets buffered, either in the internal register file, or in one of the write queues. Also, I doubt modern processors flush the i-cache. The i-cache on the P4 is actually the trace cache, and flushing it would involve dumping about 8kb of traces that took a lot of work to make. In reality, its probably lazily replaced with new traces as the new process executes.

      FYI: The big win with the AMD64 is not that the processor has more physical registers (it probably doesn't) but that its larger window of 16 GPRs enables the compiler's optimizer to do a much better job with register allocation.

      --
      A deep unwavering belief is a sure sign you're missing something...
    8. Re:About 64-bit gaming performance by drinkypoo · · Score: 2, Interesting

      IBM had a RISC chip called the 801 way the hell back in time but never commercialized it, and so the ARM was the first RISC CPU that anyone was able to buy. I went hunting for dates once and wrote this writeup on E2 which has the dates of these assorted processors. The 801 is from 1979, the ARM2 in 1985 (ARM1 is also in 1985, but never commercialized) and ROMP in 1986. POWER happened in 1990. There is enough time between 801 and ROMP, and further enough time between ROMP and POWER, to ensure that each processor somehow advanced the others, if only because IBM was busy laying their share of the groundwork for how RISC processors and processors in general would work. IBM has always advanced the science of computer technology by at least their fair share, if not more.

      Other interesting factoids for those too lazy to visit the link, or to wait for the page to load, though probably anyone who has drilled down this far will fire it up in another tab or window: The Motorola 68020 (1984) was the first 32-bit processor. The first general-purpose 16-bit microprocessor was the Texas Instruments TMS9900 in the TI 99/4(A), in 1976.

      I know about AIX on the RT, I know that was the primary OS, but the fact is that the system tanked because it was mismarketed as a PC, though it's true it was priced like one, scaled up for performance. I managed to track down both AOS and BSD 4.3 (IBM and not IBM, as you apparently know) for my RTs.

      --
      "You're right," Fisheye says. "I should have set it on 'whip' or 'chop.'"
  13. Athlon64 will be in short supply by afidel · · Score: 4, Interesting

    or so says Ars Technica. In addition, most of the initial shipments will go to motherboard manufacturers for bundling with their boards. I really don't like the idea of that becoming common practice, as that much purchasing power will mean tight pricing controls. Read more Here.

    --
    There are 4 boxes to use in the defense of liberty: soap, ballot, jury, ammo. Use in that order. Starting now.
  14. Re:64-benchmarks wont be good by Distinguished+Hero · · Score: 5, Informative

    How the frell did this get modded up? Please RTFA before commenting/modding.

    The benchmark was against a P4 (as well as a dual Xeon), which runs IA-32 natively, not the Itanium.

    The A64 is a consumer chip, designed to be purchased and used by consumers. The Itanium processor costs more than a whole top-of-the-line consumer computer. The A64 and the Itanium are not targeted at the same market segment, and neither is the Opteron, which is supposed to go up against the Xeon.

    The reason everyone is looking forward to a benchmark of an A64 running a native 64-bit application on a 64-bit OS is that not only is X86-64 considerably cleaner than IA-32, but the A64 also has two times as many SSE2 and General Purpose registers, which should yield significantly better results than the A64 running in 32-bit mode (which is already outperforming the P4 in a lot of benchmarks).

    By the way, before someone points out that the benchmarked processor is an overclocked Opteron and not an A64, AMD is currently planning on releasing a version of the A64 which is just a rebranded Opteron 1xx along with the single-channel version of the A64.

    --
    Uttering logically derived and empirically supported truths to the disciples of the orthodox establishment.
  15. Intel Itanium vs. AMD Opteron/Athlon64 by mjuarez · · Score: 4, Informative

    Just to set some things straight:

    - Itanium, Intel's 64-bit chip, uses a totally different architecture (EPIC) from the current Pentium x86 line of chips. This architecture is NOT compatible with x86, so you effectively need a recompile for existing software to work on Itanium. There is an EMULATION mode for x86 in Itanium, which is absolutely unusable according to various sources on the Net. You will DEFINITELY not want to run a game on it. Finally, prices for a low-end 1.0 GHz Itanium chip start at approx $800.

    - AMD's Opteron/Athlon64 chips are compatible with everything you are running right now at 32 bits. You can install a complete 32-bit operating system on one, and everything will run just as it does today, albeit a little bit faster. There is no need for an "emulator". And, of course, you can already run Linux in full 64-bit mode, with versions available from SuSE, Red Hat and Mandrake. Also, Microsoft will release a 64-bit version of XP at the end of the year.

    Marcos

  16. But is it representative? by eddy · · Score: 2, Interesting

    While true, isn't the whole point of this "preview" to demonstrate the true Athlon64 performance without breaking the NDA by actually publishing Athlon64 benchmarks?

    I'm guessing they have access to Athlon64 hardware, and simply "tweaked" the Opteron until it produced similar enough results to be published as a "preview", since those can be published. It's almost a little like what AMD did with their PR rating, which is officially based on the Thunderbird line, but everyone compares it to the P4 core freq. instead.

    But yes, we have no way of knowing how accurately these results reflect the final Athlon64 3200+ or whatever model they're previewing (am I the only person who got several pages without content in the preview?)

    (everything above is pure conjecture)

    --
    Belief is the currency of delusion.
    1. Re:But is it representative? by Slack3r78 · · Score: 2, Insightful
      (am I the only person who got several pages without content in the preview?)

      I read this article this morning long before it hit slashdot and didn't have that problem. What it likely was is that several of the pages were nothing but images (charts) and poor anand was suffering a slashdotting when you tried accessing them. Hence, nothing came up. Might want to try again when the frenzy dies down some.
  17. Re:Intel's response by mjuarez · · Score: 4, Interesting

    Of course, you can buy a dual-Opteron or even a quad-Opteron TODAY if you want, or you can wait until late this year to buy a Prescott system, which is neither 64-bit nor multi-processing.

    By the way, did you know Prescott, along with its mobile version Dothan, was delayed because it was dissipating almost 103 watts? For the record, Opteron is dissipating about 60 watts.

    Marcos

  18. First Look at Windows XP 64bit for AMD64 by rchatterjee · · Score: 5, Informative

    GamePC is running a first look of Windows XP 64bit edition for the AMD64 (x86-64) architecture.

  19. Re:Intel by mjuarez · · Score: 2, Insightful

    This will all but signal the end of IA64, it will at that point probably only be used for HPaq's large servers.

    Yamhill has been rumored since 2000. The rumors appear to be true, but Intel has been denying it ever since.

    The problem is that they committed themselves to Itanium for 64-bits. And, in doing that, they also committed SGI, HP, IBM and a number of other vendors. These vendors will NOT be happy if Itanium is obsoleted later on. HP alone has probably invested more than $1 billion in porting their HP/UX and Tru64 software to Itanium architecture, and there are even some customers that have made the full switch. (I'm not talking small shops here, I'm talking huge corporations which replaced their main servers with Itanium hardware).

    I believe that, eventually, Intel will release a Yamhill-type chip, but not before they get battered to death by the press and technical community out there for not releasing an equivalent-to-Opteron processor. That will probably not be until at least the end of 2004 or the beginning of 2005. So AMD has at least a full year to itself to gather momentum. Which I believe it will.

  20. Re:Intel by afidel · · Score: 2, Insightful

    Actually IBM doesn't care; they have sold WAY more Opteron systems than Itanium systems despite the fact that Opteron has been out for about 20% of the length of time that Itanium has. Besides which, their real 64-bit chip is the POWER series. They are already performing initial work on the POWER6 and some research on stuff to include in the POWER7, even though the currently shipping generation is the POWER4. SGI is irrelevant these days, so the only big player attaching their horse to Itanium is HPaq, and they were doing it because they hoped it would pay off by reducing the cost of development of their chip used for their high-end systems like the Superdome. In that sense Itanium has already reached its goals for HPaq: even if Intel never gets volume pricing on the chip, Intel has already subsidized HPaq's development efforts =)

    --
    There are 4 boxes to use in the defense of liberty: soap, ballot, jury, ammo. Use in that order. Starting now.
  21. Re:64-benchmarks wont be good by parkanoid · · Score: 4, Informative

    Sorry, but Hyper-Threading isn't really used to "take any advantage of the dualies". From the Intel page: "Hyper-Threading Technology is a form of simultaneous multi-threading technology (SMT) where multiple threads of software applications can be run simultaneously on one processor" (emphasis mine)

  22. 20% Gain by MBCook · · Score: 3, Informative
    IIRC, Tim Sweeney said that by recompiling one of the versions of UT (2003 maybe?) for the x86-64 platform without optimizations, they saw up to a 20% performance boost. Now if they were to optimize the code on top of that, they could probably get a little more.

    So even for programs that don't need to use 64 bit math, moving them to the x86-64 platform can speed them up. It won't improve your typing speed in Word, but it can probably speed up most if not all your games if they are simply recompiled.

    --
    Comment forecast: Bits of genius surrounded by a sea of mediocrity.
  23. So why didn't Intel do this? by fm6 · · Score: 3, Insightful

    Basically, you're saying that this is an important incremental improvement over previous x86 processors. Which describes every new x86 processor right back to the 8088. So you have to ask: why did Intel abandon the incremental approach with the Itanium? That approach has locked them in as the dominant CPU maker since forever.

  24. Re:So why didn't Intel do this? Politics by fm6 · · Score: 3, Insightful
    Intel has sunk a lot of money and time into the Itanium architecture, almost a decade's worth.
    Well, that explains why they're pushing the Itanium now. But the real question is what they were thinking 10 years ago, when they committed so much to a non-compatible processor. They knew going in that developing the Itanium was going to gobble up a lot of resources. So much so, they had to bring in HP to help. Imagine a project that's so big that Intel can't handle it solo!

    Perhaps somebody was bored with the whole Pentium architecture.

  25. Re:Well I'm hopefull. by mjuarez · · Score: 2, Informative

    Tyan and Arima already have dual motherboards out there. The Tyan K8W looks really nice for a workstation or high-end gaming machine. The 4P motherboards are not "available" per se; they're only sold as complete systems. Check out Appro, Angstrom Computer or Racksaver for some 4P servers if you're looking for Opteron servers.

  26. Re:So why didn't Intel do this? Politics by Anonymous Coward · · Score: 3, Insightful

    10-15 years ago, everyone else in the industry thought x86 was a dead end. Massive amounts of investment poured into RISC alternatives like Alpha and PPC.

    Perhaps Intel believed the conventional wisdom and felt they had to eventually drop x86.

  27. One More thing by Nazmun · · Score: 2, Informative

    Most of the slashdotters already pointed out the other important stuff...

    But I'd like to point out that the Itanium will not be competition for the Opteron in most cases. Itaniums are super expensive chips that run in servers and are totally incompatible with x86 (32-bit or 64-bit) software unless they run it in emulation mode, which is very slow. If you were to run x86 software on an Itanium, then more than likely the Opterons would easily win anyway.

    --
    Hmmm... Pie...
  28. no more 'next page' style, please ;-( by TheGratefulNet · · Score: 3, Informative
    here ( http://www.anandtech.com/printarticle.html?i=1856) is the printable (all continuous) version.

    causing hit counters to go up artificially just to see 'next page' drives me nuts!

    --
    "It is now safe to switch off your computer."
  29. Re:So why didn't Intel do this? Politics by afidel · · Score: 4, Informative

    And conventional wisdom was correct. They just underestimated the power of the entrenched software library. Intel processors since the Pentium Pro have basically been RISC cores with an x86->RISC translator in front. This allows them to ramp up the speed of the core, and even change core architectures, while still running all the old code. It comes at the fairly small cost of the gates needed for the translation frontend. It has another advantage in that CISC operations take up less room in cache, so you get much better utilization out of your expensive cache resources. Intel started the Itanium project for two reasons: one, HP needed a new flagship chip and they are a large enough customer to sway Intel; and two, they were tired of Cyrix and AMD copying their designs, so they were going to make a tightly controlled architecture where EVERYTHING was covered by patents and copyright. That way they thought they could have the whole pie to themselves. What they didn't realize is that while they are a big player, the only reason people keep using their chips is that they have maintained that backwards compatibility path. Throw that away and Intel is just another chip maker, and others like IBM, Motorola, etc. may look better.

    --
    There are 4 boxes to use in the defense of liberty: soap, ballot, jury, ammo. Use in that order. Starting now.
  30. AMD64 already has non-executable pages. by tamyrlin · · Score: 4, Informative

    Quoting the AMD64 Architecture Programmer's Manual Volume 2: System Programming:

    "The NX bit in the page-translation tables specifies whether instructions can be executed from the page."

    So non-executable pages are already present in AMD64.
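
    As a rough illustration of the quoted sentence, here is a minimal sketch that decodes the NX flag from a raw 64-bit page-table entry. It assumes the long-mode layout in which the present flag is bit 0 and NX is bit 63; the sample entry values are made up, and a real system also requires NX support to be enabled by the OS before the bit has any effect.

```python
# Sketch: interpreting the NX (no-execute) bit of a 64-bit page-table entry.
# Bit positions are assumed from the long-mode layout (present = bit 0,
# NX = bit 63); the example entries below are invented for illustration.

PRESENT_BIT = 1 << 0
NX_BIT = 1 << 63


def is_present(pte: int) -> bool:
    """True if the entry maps a page at all."""
    return bool(pte & PRESENT_BIT)


def is_executable(pte: int) -> bool:
    """True if the page is mapped and instruction fetches are allowed."""
    return is_present(pte) and not (pte & NX_BIT)


if __name__ == "__main__":
    code_page = 0x0000_0000_0040_0001           # present, NX clear
    data_page = 0x0000_0000_0080_0001 | NX_BIT  # present, NX set
    for name, pte in (("code", code_page), ("data", data_page)):
        print(f"{name}: executable={is_executable(pte)}")
```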

  31. This is good, but don't count on XP 64-Bit by aelfwyne · · Score: 2, Informative

    If the AMD64 version of Windows XP 64-Bit is as stripped down as the current Intel version... then don't bother considering what performance would be like there anyway... check here for a list of things *NOT* included in XP 64-bit:

    http://www.microsoft.com/technet/treeview/default.asp?url=/technet/prodtechnol/winxppro/reskit/prka_fea_tfiu.asp

    But I guess we can do without features like Media Player, POSIX Compliance, Power Management, Windows Installer, and more... I guess..... just to have a 64-bit OS...

    --
    -- If it ain't broke - overclock it more.
  32. Not just more memory, more address space by Namarrgon · · Score: 3, Insightful
    Obviously for some high-demand apps, having >4 GB of memory is a Very Good Thing. But for some apps (especially under Windows), a 64-bit processor can bring another big benefit to the table: a full 64-bit address space. Obviously this is needed for more memory, but even with only 2 GB of RAM, a Windows app that uses large contiguous areas of memory can run into serious address space fragmentation long before it runs out of memory.

    In Windows, you only get 2 GB of address space for your process (WinXP & expensive Win2K Server versions can give 3 GB, which helps). Into this address space is loaded your executable code (including all system DLLs) and your stack (by default 1 MB of address space is reserved for every thread), and these tend to be scattered around a bit, which breaks up the available address range considerably.

    Now if your app needs to allocate large (200+ MB) areas of memory, how many of those do you think you can get from a 2 GB RAM machine? Not enough :-) In fact you may find that as little as 50-60% of your available RAM can be allocated in large chunks, and all the rest is only available as countless smaller fragments. The larger the contiguous RAM blocks you want, the fewer of them you can allocate.

    With a 64 bit CPU, there's no more problem. The MMU can map scattered pages of your available physical RAM to any contiguous section of the massive 64 bit address range, and you can utilise all the RAM you have in any size chunk you wish :-)
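
    The fragmentation effect described above can be sketched with a toy model. The layout below (five 1 MB regions standing in for DLLs and thread stacks, scattered every 350 MB through a 2 GB space) is entirely hypothetical and was picked only to show the shape of the problem, not to mirror any real process.

```python
# Toy model of address-space fragmentation in a 2 GB process address space.
# The reserved regions are invented placements standing in for DLLs and
# thread stacks; only the arithmetic is meant to be taken seriously.

MB = 1 << 20
GB = 1 << 30


def free_gaps(space_size, reserved):
    """Return the sizes of the free gaps left by (start, size) reservations."""
    gaps = []
    cursor = 0
    for start, size in sorted(reserved):
        if start > cursor:
            gaps.append(start - cursor)
        cursor = max(cursor, start + size)
    if cursor < space_size:
        gaps.append(space_size - cursor)
    return gaps


def max_chunks(gaps, chunk):
    """How many contiguous chunks of the given size fit into the gaps."""
    return sum(gap // chunk for gap in gaps)


if __name__ == "__main__":
    reserved = [(i * 350 * MB, 1 * MB) for i in range(1, 6)]
    gaps = free_gaps(2 * GB, reserved)
    free_total = sum(gaps)
    chunk = 200 * MB
    fragmented = max_chunks(gaps, chunk)
    contiguous = free_total // chunk  # what one unbroken range would allow
    print(f"free: {free_total // MB} MB in {len(gaps)} gaps")
    print(f"200 MB chunks: {fragmented} fragmented vs {contiguous} contiguous")
```

    In this made-up layout only 5 MB is actually reserved, yet the number of 200 MB blocks that fit drops from ten to six, which lands in the same 50-60% range mentioned above.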

    --
    Why would anyone engrave "Elbereth"?
  33. Re:About the wattage... by sirsex · · Score: 2, Informative

    To the first order, power increases linearly with clock speed and with the square of voltage: P = CFV^2
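
    A quick worked example of that relation, as a sketch: the scaling factors below (20% more clock, 10% more voltage) are arbitrary illustrative inputs, and the formula is only the first-order dynamic power term, ignoring leakage.

```python
# First-order dynamic power model: P = C * F * V^2.
# Inputs in the example are arbitrary; leakage and other effects are ignored.


def dynamic_power(c_farads: float, f_hz: float, v_volts: float) -> float:
    """Dynamic power in watts for switched capacitance C, frequency F, voltage V."""
    return c_farads * f_hz * v_volts ** 2


def relative_power(freq_scale: float, volt_scale: float) -> float:
    """Power ratio after scaling frequency and voltage, with C held constant."""
    return freq_scale * volt_scale ** 2


if __name__ == "__main__":
    # Overclock by 20% and raise the core voltage by 10% to keep it stable.
    ratio = relative_power(1.2, 1.1)
    print(f"power rises by a factor of {ratio:.3f}")
```

    With those inputs the power goes up by roughly 45%, most of it from the frequency term but a noticeable share from the squared voltage.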

  34. Will we have a library nightmare? by r6144 · · Score: 2, Interesting
    For optimal performance and compatibility, we need at least three sets of libraries: a pure 32-bit version (old apps have to run in 32-bit mode with a 64-bit kernel because the new instruction set is not entirely compatible with the old one), a "small" 64-bit version in which pointers are 32-bit in memory (so that most applications can get 64-bit and the extra registers, etc., without wasting memory on pointers), and a regular 64-bit version for the apps that really need the large address space. Seems that the nightmares of tiny/small/medium/large/huge/compact memory models in 16-bit x86 will come again.
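
    To put a number on "wasting memory on pointers", here is a small sketch comparing the size of a hypothetical doubly linked list node (two pointers plus a 32-bit payload) under 4-byte and 8-byte pointers. The node layout and the pad-to-pointer-size alignment rule are simplifying assumptions; allocator overhead is ignored.

```python
# Sketch: per-node memory cost of pointer-heavy data with 32-bit vs 64-bit
# pointers. The node shape (two pointers + 4-byte payload) is hypothetical,
# and padding is modelled simply as rounding up to the pointer size.


def node_size(ptr_bytes: int, n_ptrs: int = 2, payload_bytes: int = 4) -> int:
    """Size of one node in bytes, padded to a multiple of the pointer size."""
    raw = n_ptrs * ptr_bytes + payload_bytes
    return -(-raw // ptr_bytes) * ptr_bytes  # ceiling to alignment


if __name__ == "__main__":
    nodes = 1_000_000
    for label, ptr in (("32-bit pointers", 4), ("64-bit pointers", 8)):
        size = node_size(ptr)
        print(f"{label}: {size} bytes/node, {size * nodes / 1e6:.0f} MB per million nodes")
```

    Under these assumptions the same list takes twice the memory with 64-bit pointers, which is the motivation for the "small" 64-bit library variant described above.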

    Anyway, those running other existing 64-bit CPUs should be able to give some advice.