AMD Athlon 64 FX-57 Review

Posted by Zonk on Sunday June 19, 2005 @07:41AM from the heavy-hardware dept.

Duane writes "GDHardware.com has the first review of AMD's upcoming Athlon 64 FX-57 CPU clocked at 2.8GHz. They benchmark it against Intel's current fastest 3.8GHz P4 and the Athlon 64 X2." From the article: "Clocked at 2.8GHz, the FX-57 continues the 'San Diego' core AMD released with the FX-55, but is stepped up a paltry 200MHz faster. What's interesting is that while 200MHz on the Intel side of things doesn't always mean that great of a performance gain, not so with AMD."

16 of 167 comments

Summary by 823723423 · 2005-06-19 07:50 · Score: 5, Informative

[page 1]
AMD continues to raise the bar in performance - both in dual core with its recent X2 chip and now once again in the single core design with its pending FX-57 launch due on June 27th.
[page 2]
The FX-57 is armed with a total of 1152KB of cache (128KB L1 and 1024KB L2) which greatly speeds up commonly called data cues and is a great sized buffer between the CPU and system RAM.
[conclusion]
However, at this point in the game we'd have a hard time giving a full recommendation to anyone to spend close to or over $1000 on a chip that isn't dual core
Re:Clockrate differences... by tomstdenis · 2005-06-19 07:55 · Score: 2, Informative

No, and why would it? The original Pentium core only went up to around 233MHz. The other Pentiums that hit the GHz range are the P3 and PM.

They both have a higher IPC than the P4, so no, a 1GHz P4 is not the same as a 1GHz P3... [it's slower].

In the AMD world they're not always the same either. A 1GHz AMD64 would be faster in most cases than a 1GHz AMD32 [e.g. Barton] because of the extra registers and more decode/execute resources [e.g. larger instruction scheduler, more DirectPath opcodes, etc].

Tom

--
Someday, I'll have a real sig.
Re:Clockrate differences... by XaXXon · 2005-06-19 07:55 · Score: 3, Informative

Yes. The chip itself will operate exactly 3.8 times faster than a 1GHz chip of the same architecture. Note that there was no 1GHz P4 chip nor was there a 1GHz Athlon 64.

What you don't understand is that different architectures lead to different performance characteristics. This results in a similarly clocked AMD chip outperforming its Intel rival.

Also, many other systems affect how fast your program runs -- it's not just processor speed.
Re:Clockrate differences... by NaruVonWilkins · 2005-06-19 08:07 · Score: 2, Informative

A lot more - in some cases, 2x as efficient. The Pentium M at 2GHz is as fast as the 3.x P4s, if not faster - it's based on the P3 architecture.
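
The comparisons in this thread come down to a rough rule of thumb: throughput is approximately instructions per clock (IPC) times clock rate. A minimal sketch of that arithmetic follows. The IPC figures are made-up illustrative values, not measurements of any real chip, and the function names are hypothetical.

```python
def relative_performance(ipc: float, clock_ghz: float) -> float:
    """Rough throughput in billions of instructions per second (IPC x clock)."""
    return ipc * clock_ghz


def clock_needed_to_match(target_perf: float, ipc: float) -> float:
    """Clock rate in GHz that a design with the given IPC needs to hit target_perf."""
    return target_perf / ipc


# Hypothetical IPC values, chosen only to illustrate the trade-off:
# a higher-IPC design at a lower clock can match a lower-IPC design at a higher clock.
high_ipc_chip = relative_performance(ipc=1.5, clock_ghz=2.0)
low_ipc_chip = relative_performance(ipc=1.0, clock_ghz=3.0)
print(high_ipc_chip, low_ipc_chip)
```

Under these assumed numbers both designs land at the same throughput, which is the sense in which one chip can be "as fast as" another with a much higher clock.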
Re:640x480 gaming by NerveGas · 2005-06-19 08:10 · Score: 4, Informative

Maybe you don't understand: They were benchmarking a CPU. Not a graphics card, a CPU. The more they turn up the resolution and detail, the more the video card will be a factor, and mask the benefits of the CPU. Even with the same video card, the more the card becomes the limiting factor, the more all of the CPUs will look the same.

Now, that's not to say that it wouldn't have been interesting to have some 1600x1200 benchmarks, but in and of itself, the choice of 640x480 is not a bad one.

steve

--
Oh, you're not stuck, you're just unable to let go of the onion rings.
Re:Great, but... by juhaz · 2005-06-19 08:14 · Score: 2, Informative

I realize the price will go down over time, but seriously, who is going to buy this chip?

The same people who always buy flagship chips: kids with rich parents and other folks with a whole load of money on their hands.

Ok, I know some gamers with too much money on their hands will buy it, but it's still going to be surpassed when the dual cores start gaining ground, especially in gaming (think Christmas '05).

I doubt too many games that can take advantage of dual cores will be done by Christmas, but if I'm wrong, do you think the gamers in question will really need to think twice before getting another $1000 CPU for Christmas, that time a dual-core flagship?

Until I saw the pricetag I thought this might be an option for my next build, but not anymore. There are other options, at much lower prices.

Of course there are. Things that cost three times more than something only slightly slower are not for people who concern themselves with money; "if you need to ask how much it is, then you can't afford it" is how it tends to be with the absolute latest and finest.
Re:Clockrate differences... by cynical+kane · 2005-06-19 08:24 · Score: 2, Informative

Who the heck modded this informative? This is a completely juvenile misinterpretation of how CPUs work.

IIRC the Athlon 64 can "only" dispatch 3 'muops' per cycle to its execution units, which themselves are broken-down x86 instructions. The P4 is similar.

Secondly, the mu-ops must be in a certain sequence if you're ever going to dispatch more than 1 per cycle.

Thirdly, in order to keep the dispatcher dispatching, you must keep it busy with new operations and data to execute. Which means you need a good memory architecture and cache.

And: "The size of the pipes is important (you may be doing more per cycle, but the cycles take longer)." I can't fully understand what this sentence is trying to say, but it's probably wrong.

Put it all together, and IIRC the Athlon 64 can barely beat 1 x86 instruction per cycle under optimal conditions. It won't get close to 9.
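
The point about dispatch width and instruction ordering can be illustrated with a toy model. This is only a sketch under loud assumptions: every instruction has a latency of one cycle, there are no cache misses or branches, and `cycles_to_issue` is a hypothetical helper, not a model of the real Athlon 64 or P4 schedulers.

```python
def cycles_to_issue(deps: list[list[int]], width: int) -> int:
    """Cycles needed to issue every instruction, assuming unit latency.

    deps[i] lists the indices of earlier instructions that instruction i
    must wait for. Each cycle issues at most `width` instructions whose
    inputs were all produced in an earlier cycle.
    """
    done_cycle: dict[int, int] = {}
    pending = list(range(len(deps)))
    cycle = 0
    while pending:
        cycle += 1
        # An instruction is ready only if every dependency finished before this cycle.
        ready = [i for i in pending
                 if all(done_cycle.get(d, cycle) < cycle for d in deps[i])]
        for i in ready[:width]:
            done_cycle[i] = cycle
            pending.remove(i)
    return cycle


independent = [[] for _ in range(9)]               # no instruction waits on another
chain = [[]] + [[i - 1] for i in range(1, 9)]      # each waits on the previous one

print(cycles_to_issue(independent, width=3))
print(cycles_to_issue(chain, width=3))
```

With a width of 3, the nine independent instructions issue in 3 cycles, while the nine-instruction dependent chain needs 9 cycles. The hardware width is the same in both cases; only the instruction sequence differs.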
Re:Clockrate differences... by doormat · 2005-06-19 08:26 · Score: 5, Informative

How the fuck did that get moderated up. It makes no fucking sense and is completely inaccurate (and yes, IAACompEng).

Athlons have higher IPC (instructions per clock) than a P4. Why? The length of the pipeline. Athlon 64s have a SINGLE pipeline, with a length of about 15 (aka "a 15 stage pipeline"). A P4 Prescott (90nm version) has a 31 stage pipeline. The P4 Northwood had a 20 stage pipeline (note that those are for integer instructions; floating point operations have more stages through the FPU). A64s do not have 9 pipelines, nor does the P4 have 6. And neither gets anywhere near the ops/clock you claim. They do have parallel execution units, however, and maybe that's where you get your numbers from, but even then they're still not right.

So it takes an integer operation 15 or so cycles to be complete in an Athlon, and 30 cycles in a P4. Thus the higher IPC. Other things that also influence performance are cache hit ratio and branch prediction. And that's the reason why the Prescott didn't fall on its face: more cache as well as a better Branch Prediction Unit (BPU). A lot of improvements went into the 90nm Prescott to keep IPC close to what the P4 Northwood had. There were some articles at Anandtech when it first came out, comparing it to the Northwood.

To parent: Go read some Ars Technica articles about how CPUs are organized before you talk out of your ass about stuff you don't know.

--
The Doormat

If you're not outraged, then you're not paying attention.
Re:Clockrate differences... by jadavis · 2005-06-19 09:09 · Score: 4, Informative

So it takes an integer operation 15 or so cycles to be complete in an Athlon, and 30 cycles in a P4.

I want to add that a long pipeline isn't as bad as you make it seem. Assuming the branch prediction and cache are working effectively (and there aren't too many data hazards, etc), there will be several instructions in the same pipeline at the same time in different stages.

One integer operation may take 30 cycles on a P4 and 15 on an Athlon. But one million integer operations might approach 1 integer operation per cycle on both processors. This is under very ideal circumstances, and realistically there will always be fewer instructions in the pipeline than there are stages.
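
The latency-versus-throughput distinction is easy to check with arithmetic. A minimal sketch, assuming an idealized single-issue pipeline with no stalls at all (the function names are hypothetical): the first instruction takes `depth` cycles, and each later one retires one cycle after its predecessor.

```python
def ideal_cycles(n_instructions: int, depth: int) -> int:
    """Cycles to retire n instructions on an ideal, stall-free, single-issue pipeline."""
    return depth + n_instructions - 1


def ideal_ipc(n_instructions: int, depth: int) -> float:
    """Average instructions retired per cycle over the whole run."""
    return n_instructions / ideal_cycles(n_instructions, depth)


for depth in (15, 30):
    print(depth, ideal_cycles(1, depth), ideal_ipc(1_000_000, depth))
```

A single instruction costs 15 or 30 cycles, but over a million instructions both depths average more than 0.9999 instructions per cycle in this idealized model.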

--
Social scientists are inspired by theories; scientists are humbled by facts.
Re:Clockrate differences... by thorndt · 2005-06-19 09:23 · Score: 5, Informative

Correct, as far as it goes.
However: it's not the pipeline length causing "15 cycles versus 30 cycles" that will actually harm performance. It's pipeline STALLS that kill performance. In a perfect world, for example, a hypothetical 10,000-stage single-pipeline processor running at 1 GHz would retire 1 BILLION instructions per second, albeit with a 10,000 clock initial pipeline fill upon powerup.

Do something that causes the pipeline to need to be flushed and refilled, however, and you just lost 10,000 clocks.

This is where the P4 has problems relative to the Athlon: keeping its pipeline filled, and the subsequent pipeline flush/bubble penalties.

Note that there's lots more to this discussion than I wrote here (can you say branch predictors, trace caches, lookaside buffers, etc.), but ultimately all that stuff has to do with KEEPING THE PIPELINE FILLED, and what happens when you don't.
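
A back-of-the-envelope model of the flush penalty, as a sketch only. It assumes every flush costs a full pipeline refill of `depth` cycles, which overstates the cost on real chips where a mispredicted branch is resolved before the last stage; `effective_ipc` is a hypothetical helper.

```python
def effective_ipc(depth: int, flushes_per_instruction: float) -> float:
    """Steady-state IPC of a single-issue pipeline that pays `depth` cycles per flush."""
    cycles_per_instruction = 1.0 + flushes_per_instruction * depth
    return 1.0 / cycles_per_instruction


# Assumed flush rate of one per 100 instructions, purely for illustration.
for depth in (15, 30, 10_000):
    print(depth, effective_ipc(depth, flushes_per_instruction=0.01))
```

With no flushes, every depth gives an IPC of 1.0, including the 10,000-stage case. At the assumed rate of one flush per 100 instructions, the 15-stage pipeline drops to roughly 0.87 and the 30-stage pipeline to roughly 0.77, so the deeper design loses more for the same number of mistakes.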

--
- The race is not [always] to the swift, nor the battle to the strong. -
Re:AMD Reaping the benefits of HyperTransport by EndlessNameless · 2005-06-19 10:31 · Score: 2, Informative

The northbridge is nowhere near being the biggest bottleneck in a modern PC. AMD's design reduces latency substantially, which results in slightly improved bandwidth.

For newer games, graphics processing is the performance bottleneck. For scientific work, it is generally either memory bandwidth or execution resources on the CPU. For servers, it is generally memory bandwidth and/or I/O bandwidth from the hard disks.

Integrating the northbridge onto the CPU die does net a modest performance boost, but it does little to affect performance in most usage scenarios.

--

---
According to the latest ruleset, this post should be modded as Vorpal Flamebait +5.
Re:Clockrate differences... by Hektor_Troy · 2005-06-19 11:01 · Score: 2, Informative

And that is not correct either. The chip doesn't scale that way with clock speed.

A 3800+ AMD chip will perform, roughly, 3.8 times as well as a 1 GHz Athlon. This is not true in all cases: it will perform better in some situations, worse in others.

Optimized architecture also means that the 800 MHz Athlon 64 FX (underclocked by Cool'n'Quiet) could still outperform a 1 GHz Athlon, hence giving it a performance rating of more than 1000+.

While you've been moderated quite "informative", your comment isn't really that - it's partly right, just like it's partly right that the earth is somewhat flat ;)

--
We do not live in the 21st century. We live in the 20 second century.
FX57 and FX59 benchmarks at 3GHz. by ruiner5000 · 2005-06-19 11:08 · Score: 2, Informative

You can find the new San Diego core FX benched at FX57 and FX59(3GHz) speeds here.

--
ignorance is bliss. googlefiberatx.com
you're mistaken by YesIAmAScript · 2005-06-19 11:15 · Score: 2, Informative

Intel generally leads AMD in memory bandwidth, at least without any overclocking.

Intel was doing 6.4GB/s (dual channel PC3200 RAM) when AMD was at 2.7GB/sec. (single channel PC2700).
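
Those figures follow from the usual peak-bandwidth arithmetic: transfers per second times bus width times channel count. A small sketch, assuming the standard module naming (PC3200 is DDR-400, PC2700 is DDR-333) and a 64-bit (8-byte) channel; the helper name is hypothetical and the results are theoretical peaks, not measured throughput.

```python
def peak_bandwidth_gb_s(mega_transfers_per_s: float,
                        bus_bytes: int = 8,
                        channels: int = 1) -> float:
    """Theoretical peak bandwidth in GB/s (1 GB = 10**9 bytes)."""
    return mega_transfers_per_s * 1e6 * bus_bytes * channels / 1e9


dual_pc3200 = peak_bandwidth_gb_s(400, channels=2)    # dual channel DDR-400
single_pc2700 = peak_bandwidth_gb_s(333, channels=1)  # single channel DDR-333
print(dual_pc3200, single_pc2700)
```

The dual channel PC3200 case comes to 6.4 GB/s and the single channel PC2700 case to about 2.7 GB/s, matching the numbers quoted above.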

Also, note that memory accesses don't go over HyperTransport on an Athlon. The memory controller is built into the CPU. This is nice for latency, but bad because it means that Athlon users are stuck with whatever memory technology AMD has selected. At the moment, that means Athlon systems are stuck with DDR right now even as DDR2 prices fall below DDR prices.

Also, HyperTransport isn't all that insanely fast. Amongst other things, clocking cycles are part of the "GHz" rating on HT, and so the bandwidth is lower than it might seem.

--
http://lkml.org/lkml/2005/8/20/95
Re:640x480 gaming by be-fan · 2005-06-19 12:52 · Score: 2, Informative

There is no load balancing between the CPU and the GPU. The two always do the same part of the work. If one can't keep up, the other doesn't take over the work. Instead, the slower part just becomes a bottleneck. At 640x480, you're testing how fast the CPU can feed the graphics card data. At 1600x1200, the graphics card becomes the bottleneck, and as long as the CPU can feed the graphics card fast enough, they'll get the same results.
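
The bottleneck argument can be sketched as a two-stage pipeline in which the slower stage sets the frame rate. All numbers here are hypothetical (per-frame CPU cost, per-pixel GPU cost), and the model assumes GPU time scales linearly with pixel count, which real cards only approximate.

```python
def fps(cpu_ms: float, gpu_ns_per_pixel: float, width: int, height: int) -> float:
    """Frames per second when CPU and GPU work in parallel and the slower one dominates."""
    gpu_ms = gpu_ns_per_pixel * width * height / 1e6
    return 1000.0 / max(cpu_ms, gpu_ms)


# Two hypothetical CPUs (5 ms vs 4 ms of work per frame) paired with the same GPU.
for width, height in ((640, 480), (1600, 1200)):
    slow = fps(5.0, 10.0, width, height)
    fast = fps(4.0, 10.0, width, height)
    print(width, height, round(slow, 1), round(fast, 1))
```

At 640x480 the assumed GPU finishes each frame in about 3 ms, so the two CPUs show different frame rates. At 1600x1200 the GPU needs about 19 ms per frame and both CPUs report the same result, which is why a low resolution is the more revealing CPU test.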

--
A deep unwavering belief is a sure sign you're missing something...
you're also mistaken. by YesIAmAScript · 2005-06-19 13:08 · Score: 2, Informative

When Intel came out with the 800MHz FSB (6.4GB/s), they handily beat AMD on benchmarks. AMD didn't have the internal memory controller at the time. Plus they had a slow FSB. Plus they were in that awkward time before NVidia jumped on the bandwagon, and so the chipsets for AMD were terrible, most of them bad VIA performers.

AMD may have the upper hand in many benchmarks right now (I guess you don't look at video compression), but it hasn't been that way for long. AMD's most recent rise above Intel really started with Intel's 3.4GHz offerings. In the period before that (the early 875P chipset days, and the aforementioned 800FSB and 3.0GHz range), Intel was beating AMD handily, although not at nearly equivalent prices. This is especially true on memory-intensive benchmarks. It was quite a step forward for a company that at that time was just emerging from the dark days of RDRAM stupidity.

Also note that saying the HyperTransport bus is why AMD's dual cores run faster than Intel's dual cores is pure speculation. Do you have any real reasons to put behind that or just assertions?

Even if Intel does use the standard FSB for inter-processor communications, their FSB is currently faster than AMD's HT. The bandwidth of the HT on AMD chips is 8.0GB/sec. The bandwidth of the FSB on Intel's fastest processors is 10.7GB/sec.

I'm not dumping on AMD. I like AMD. But the amount of misinformed speculation and assertions as to why they are doing well is astounding.

Also, I find the "troll" moderation on my post above insulting. It seems silly to me, but disregarding that, I find it ridiculous that people automatically moderate "troll" apparently just for speaking any nice things about Intel.

--
http://lkml.org/lkml/2005/8/20/95