AMD Athlon 64 FX-57 Review

Summary by 823723423 · 2005-06-19 07:50 · Score: 5, Informative

[page 1]
AMD continues to raise the bar in performance - both in dual core with its recent X2 chip and now once again in the single core design with its pending FX-57 launch due on June 27th.
[page 2]
The FX-57 is armed with a total of 1152KB of cache (128KB L1 and 1024KB L2) which greatly speeds up commonly called data cues and is a great sized buffer between the CPU and system RAM.
[conclusion]
However, at this point in the game we'd have a hard time giving a full recommendation to anyone to spend close to or over $1000 on a chip that isn't dual core

Great, but... by wbren · 2005-06-19 07:51 · Score: 4, Interesting

From TFA:

If you have tons of money to spend, and aren't attracted at all by the AthlonX2 then get this chip; however, at this point in the game we'd have a hard time giving a full recommendation to anyone to spend close to or over $1000 on a chip that isn't dual core.

I realize the price will go down over time, but seriously, who is going to buy this chip? Ok, I know some gamers with too much money on their hands will buy it, but it's still going to be surpassed when the dual cores start gaining ground, especially in gaming (think Christmas '05). Until I saw the pricetag I thought this might be an option for my next build, but not anymore. There are other options, at much lower prices.

--
-William Brendel

Re:Great, but... by juhaz · 2005-06-19 08:14 · Score: 2, Informative

I realize the price will go down over time, but seriously, who is going to buy this chip?

The same people who always buy flagship chips: kids with rich parents and other folks with a whole load of money on their hands.

Ok, I know some gamers with too much money on their hands will buy it, but it's still going to be surpassed when the dual cores start gaining ground, especially in gaming (think Christmas '05).

I doubt too many games that can take advantage of dual cores will be done by Christmas, but if I'm wrong, do you think the gamers in question will really need to think twice before getting another $1000 CPU for Christmas, that time a dual-core flagship?

Until I saw the pricetag I thought this might be an option for my next build, but not anymore. There are other options, at much lower prices.

Of course there are. Things that cost three times more than something only slightly slower are not for people who concern themselves with money; the old "if you need to ask how much it is, then you can't afford it" line is how it tends to be with the absolute latest and finest.

Re:Clockrate differences... by tomstdenis · 2005-06-19 07:55 · Score: 2, Informative

No, and why would it? The original Pentium core only went up to around 233MHz. The other Pentiums that hit the GHz range are the P3 and PM.

They both have a higher IPC than the P4, so no, a 1GHz P4 is not the same as a 1GHz P3... [it's slower].

In the AMD world they're not always the same either. A 1GHz AMD64 would be faster in most cases than a 1GHz AMD32 [e.g. Barton] because of the extra registers and more decode/execute resources [e.g. larger instruction scheduler, more DirectPath opcodes, etc].

Tom

--
Someday, I'll have a real sig.

Re:Clockrate differences... by XaXXon · 2005-06-19 07:55 · Score: 3, Informative

Yes. The chip itself will operate exactly 3.8 times faster than a 1GHz chip of the same architecture. Note that there was no 1GHz P4 chip nor was there a 1GHz Athlon 64.

What you don't understand is that different architectures lead to different performance characteristics. This results in a similarly clocked AMD chip outperforming its Intel rival.

Also, many other systems affect how fast your program runs -- it's not just processor speed.

Re:Clockrate differences... by NaruVonWilkins · 2005-06-19 08:07 · Score: 2, Informative

A lot more - in some cases, 2x as efficient. The Pentium M at 2GHz is as fast as the 3.x P4s, if not faster - it's based on the P3 architecture.

Re:640x480 gaming by NerveGas · 2005-06-19 08:10 · Score: 4, Informative

Maybe you don't understand: They were benchmarking a CPU. Not a graphics card, a CPU. The more they turn up the resolution and detail, the more the video card will be a factor, and mask the benefits of the CPU. Even if they used the same video card, the more the card becomes a limiting factor, the more all of the CPUs will look the same.

Now, that's not to say that it wouldn't have been interesting to have some 1600x1200 benchmarks, but in and of itself, the choice of 640x480 is not a bad one.

steve

--
Oh, you're not stuck, you're just unable to let go of the onion rings.

AMD Reaping the benefits of HyperTransport by Peter+Amstutz · 2005-06-19 08:12 · Score: 5, Interesting

From what I've read, while Intel can keep cranking up the core speed of their chips, all those clock cycles are wasted if it spends most of its time waiting around for memory. The northbridge on Intel motherboards is now their biggest bottleneck. So at least part of the reason AMD can get better throughput at a lower clockrate is that it eliminates the northbridge altogether, puts the memory controller on the CPU, and ties everything else together using their insanely fast "HyperTransport" system bus. Any engineers who know more about it care to comment?

Re:AMD Reaping the benefits of HyperTransport by EndlessNameless · 2005-06-19 10:31 · Score: 2, Informative

The northbridge is nowhere near being the biggest bottleneck in a modern PC. AMD's design reduces latency substantially, which results in slightly improved bandwidth.

For newer games, graphics processing is the performance bottleneck. For scientific work, it is generally either memory bandwidth or execution resources on the CPU. For servers, it is generally memory bandwidth and/or I/O bandwidth from the hard disks.

Integrating the northbridge onto the CPU die does net a modest performance boost, but it does little to affect performance in most usage scenarios.

--
According to the latest ruleset, this post should be modded as Vorpal Flamebait +5.

Re:AMD Reaping the benefits of HyperTransport by Ezdaloth · 2005-06-19 10:33 · Score: 3, Insightful

Funny thing is, it isn't purely their HyperTransport. It was developed together with Digital for their Alpha CPUs. Way to go Digital; like many other "modern" features, the Alpha chips had this baby first. Too bad they died.

You can also look at Alpha systems (or for that matter any "real workstation design") to see how to fix this, e.g. with memory interleaving. With 64 memory DIMMs supplying data to the CPU, it will be the memory running circles around your CPU. :-)

Same goes for IO: most cheap-ass computers are quite fast considering the CPU, but come with crappy disks, a slow PCI bus, etc. Go S-ATA, go PCI-e; maybe we'll have the possibility to build decent affordable PCs after all somewhere in the near future.

Re:AMD Reaping the benefits of HyperTransport by Afrosheen · 2005-06-19 11:52 · Score: 2, Interesting

The drawback, from what I can gather, is that while Intel is using DDR3200 now (2x400), it's just a 64bit memory path.

The latest NForce4 boards, socket 939 for AMD, are using Dual Channel DDR. This nets you a fatter memory path. You get the same 2x400 but it's in a double-wide bandwidth path, 128bit.

From what I've seen in informal testing here, using dual channel memory is a huge difference. Throw in a SATA/150 drive instead of an IDE drive, and a Windows XP install gets shaved down to 15 minutes. Not real scientific, I know, but that gives you an idea of I/O improvement with Hypertransport and SATA. Usually the bottlenecks with OS installs are hard drive i/o times and cd/dvdrom i/o times. Memory still plays a big role though (caching).

All in all, I won't build anything but socket939 athlons from now on. Everything else is inferior, particularly for the price.

Damn you by skomes · 2005-06-19 08:17 · Score: 3, Insightful

Damn you Intel and AMD, always teasing me with the absolute most bleeding edge hardware that I CAN'T AFFORD. Come on, let's work on bringing down prices as well as bringing up performance.

Anyone else find the graphs confusing? by spitefowl · 2005-06-19 08:18 · Score: 3, Interesting

Some of them say "Lower is better", some of them say "FPS", and some of them just don't say anything. It makes it hard to gauge whether higher or lower is better. I mean, some things are obvious, like the 3DMark 2005 results, but then it says "4D rendering". What the heck is that? Is it measuring FPS?

Agh, eeh gads!

Re:Anyone else find the graphs confusing? by thrashbluegrass · 2005-06-19 08:50 · Score: 2, Funny

Growth in confusing graphs on the internet, 1989-2005:

=====75
========80
===========80
========75

Re:Clockrate differences... by cynical+kane · 2005-06-19 08:24 · Score: 2, Informative

Who the heck modded this informative? This is a completely juvenile misinterpretation of how CPUs work.

IIRC the Athlon 64 can "only" dispatch 3 'muops' per cycle to its execution units, which themselves are broken-down x86 instructions. The P4 is similar.

Secondly, the mu-ops must be in a certain sequence if you're ever going to dispatch more than 1 per cycle.

Thirdly, in order to keep the dispatcher dispatching, you must keep it busy with new operations and data to execute. Which means you need a good memory architecture and cache.

And: "The size of the pipes is important (you may be doing more per cycle, but the cycles take longer)." I can't fully understand what this sentence is trying to say, but it's probably wrong.

Put it all together, and IIRC the Athlon 64 can barely beat 1 x86 instruction per cycle under optimal conditions. It won't get close to 9.

Re:Clockrate differences... by doormat · 2005-06-19 08:26 · Score: 5, Informative

How the fuck did that get moderated up. It makes no fucking sense and is completely inaccurate (and yes, IAACompEng).

Athlons have higher IPC (instructions per clock) than a P4. Why? The length of the pipeline. Athlon 64s have a SINGLE pipeline, with a length of about 15 (aka "a 15 stage pipeline"). A P4-prescott (90nm version) has a 31 stage pipeline. The P4 northwood had a 20 stage pipeline (note that those are for integer instructions; floating point operations have more stages through the FPU). A64s do not have 9 pipelines, nor does the P4 have 6. And neither gets anywhere near the ops/clock you claim. They do have parallel execution units, however, and maybe that's where you get your numbers from, but even then they're still not right.

So it takes an integer operation 15 or so cycles to be complete in an Athlon, and 30 cycles in a P4. Thus the higher IPC. Other things that influence performance are cache hit ratio and branch prediction. And that's the reason why the prescott didn't fall on its face: more cache as well as a better Branch Prediction Unit (BPU). A lot of improvements went into the 90nm prescott to keep IPC close to what the P4-northwood had. There were some articles at Anandtech when it first came out, comparing it to the northwood.

To parent: Go read some Ars Technica articles about how CPUs are organized before you talk out of your ass about stuff you don't know.

--
The Doormat

If you're not outraged, then you're not paying attention.

Re:Clockrate differences... by rookworm · 2005-06-19 08:30 · Score: 2, Insightful

From article: Although the FX-57 runs at 2.8GHz, we did have some room to overclock things a bit by raising the bus speeds - we were able to safely clock it to a steady 3GHz and found an average performance gain of near 20-percent at that level.

This seems to imply Athlon scales better than linearly (?!) How does that work?
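The arithmetic behind that question can be checked in a few lines. This is only a sketch using the two clock speeds and the "near 20-percent" figure quoted from the article; the variable names are made up for illustration.

```python
# Compare the clock increase from the quoted overclock with the quoted gain.
stock_ghz = 2.8        # FX-57 stock clock, from the quoted article
overclocked_ghz = 3.0  # clock the reviewers reached, from the quoted article

clock_gain_pct = (overclocked_ghz / stock_ghz - 1) * 100
reported_gain_pct = 20.0  # "near 20-percent", from the quoted article

print(f"clock increase:   {clock_gain_pct:.1f}%")
print(f"reported gain:    {reported_gain_pct:.0f}%")
print(f"gain / clock gain: {reported_gain_pct / clock_gain_pct:.1f}x")
```

The clock only rises by about 7.1%, so a 20% average gain cannot come from the core clock alone. Since the overclock was done by raising bus speeds, other subsystems were presumably sped up too, or the quoted figure is off.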

--
The toad can't burp - and for some reason can't fart either, so it swells up and eventually explodes. --Anonymous Coward

Re:Great. Just Great. by Phoenixhunter · 2005-06-19 08:33 · Score: 3, Funny

You must be new here.

Re:No way. by GISGEOLOGYGEEK · 2005-06-19 09:08 · Score: 4, Insightful

Have you ever actually looked at your task manager?

You don't think that the 'system idle' process is really sucking up all your power do you?

The point you make is meaningless here. The OS is not taking 90% of the resources of this system.

--
George Bush + Linux = "I will not let information get in the way of the fight against Windows"

Re:Clockrate differences... by jadavis · 2005-06-19 09:09 · Score: 4, Informative

So it takes an integer operation 15 or so cycles to be complete in an Athlon, and 30 cycles in a P4.

I want to add that a long pipeline isn't as bad as you make it seem. Assuming the branch prediction and cache are working effectively (and there aren't too many data hazards, etc), there will be several instructions in the same pipeline at the same time in different stages.

One integer operation may take 30 cycles on a P4 and 15 on an Athlon. But one million integer operations might approach 1 integer operation per cycle on both processors. This is under very ideal circumstances, and realistically there will always be fewer instructions in the pipeline than there are stages.
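A toy model makes the point concrete. It assumes an ideal single-issue pipeline with no stalls or flushes, where the first instruction needs one cycle per stage and every later instruction retires one cycle after the previous one. The function name is invented for this sketch.

```python
def cycles_to_retire(n_instructions: int, pipeline_depth: int) -> int:
    """Cycles needed by an ideal single-issue pipeline with no stalls or flushes."""
    if n_instructions == 0:
        return 0
    # The first instruction takes `pipeline_depth` cycles to fill the pipe;
    # each following instruction completes one cycle later.
    return pipeline_depth + n_instructions - 1


for depth in (15, 30):
    for n in (1, 1_000_000):
        cycles = cycles_to_retire(n, depth)
        print(f"depth={depth:2d}  n={n:>9,d}  cycles={cycles:>9,d}  IPC={n / cycles:.6f}")
```

For a single instruction the depth is the whole cost, but over a million instructions the fill time is noise and both depths land just under one instruction per cycle.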

--
Social scientists are inspired by theories; scientists are humbled by facts.

Re:Clockrate differences... by thorndt · 2005-06-19 09:23 · Score: 5, Informative

Correct, as far as it goes.
However: it's not the pipeline length causing "15 cycles versus 30 cycles" that will actually harm performance. It's pipeline STALLS that kill performance--in a perfect world, for example, a hypothetical 10,000-stage single-pipeline processor running at 1 GHz would retire 1 BILLION instructions per second, albeit with a 10,000 clock initial pipeline fill upon powerup.

Do something that causes the pipeline to need to be flushed and refilled, however, and you just lost 10,000 clocks.

This is where the P4 has problems relative to the Athlon: keeping its pipeline filled, and the subsequent pipeline flush/bubble penalties.

Note that there's lots more to this discussion than I wrote here (can you say branch predictors, trace caches, lookaside buffers, etc.), but ultimately all that stuff has to do with KEEPING THE PIPELINE FILLED, and what happens when you don't.
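As a rough illustration of the flush penalty, here is a sketch that charges one full pipeline refill per flush. The flush rate of 10 per 1,000 instructions is a made-up number, and real penalties depend on where in the pipe the flush is detected, so treat the output as a shape rather than a measurement.

```python
def effective_ipc(pipeline_depth: int, flushes_per_1000: float) -> float:
    """Instructions per cycle when each flush wastes one full pipeline refill.

    Toy model: one instruction retires per cycle otherwise, and every flush
    costs `pipeline_depth` cycles.
    """
    instructions = 1000
    cycles = instructions + flushes_per_1000 * pipeline_depth
    return instructions / cycles


FLUSHES_PER_1000 = 10  # hypothetical flush rate, not a measured figure

for depth in (15, 31, 10_000):
    print(f"depth={depth:>6,d}  IPC={effective_ipc(depth, FLUSHES_PER_1000):.4f}")
```

With no flushes every depth gives an IPC of 1.0; at the same flush rate the deeper pipe always loses more, which is the P4-versus-Athlon gap described above.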

--
- The race is not [always] to the swift, nor the battle to the strong. -

Re:No way. by Deliveranc3 · 2005-06-19 10:12 · Score: 2, Insightful

Whoever modded this guy insightful is a moron.

When you overclock you upgrade several different subsystems different amounts.

Perhaps the clock speed of the busses on his video card and CPU memory interface increased by 40%.

Also there is obviously no overhead on the new cpu cycles... the list goes on.

Dodgy Slashdot stories by rsynnott · 2005-06-19 10:22 · Score: 2, Funny

Why is it that the dodgiest news/review sites so often get written up here?

--
Me (Blog)

market for high end products by Anonymous Coward · 2005-06-19 10:46 · Score: 5, Interesting

I realize the price will go down over time, but seriously, who is going to buy this chip?

You're asking the wrong question. Even if no one buys this chip, the chip is still worthwhile to have on the market.

A few years ago Wendy's found that almost no one was buying their triple cheeseburgers, so they took triples off the menu. When they did this, they found that sales of their double cheeseburgers dropped to almost nothing. The problem, as they discovered later, was that the presence of triple cheeseburgers on the menu helped to legitimize the double cheeseburgers as mainstream items. Without triple cheeseburgers, the double cheeseburgers became the high end item and mainstream buyers went for the singles instead.

Since profit margins on double cheeseburgers are higher, the chain was forced to bring back triple cheeseburgers, even though triples weren't selling at all, because the sales of their double cheeseburgers depended on having triples on the menu.

Point is, although this is a fast food example, the same thing applies to the computer industry. You HAVE to have a high end item available if you are to have any hope of positioning the more profitable midrange items as mainstream.

Re:Clockrate differences... by Hektor_Troy · 2005-06-19 11:01 · Score: 2, Informative

And that is not correct either. The chip doesn't scale that way with clock speed.

A 3800+ AMD chip will perform, roughly, 3.8 times as well as a 1 GHz Athlon. This is not true in all cases - it will perform better in some situations, worse in others.

Optimized architecture also means that the 800 MHz Athlon 64 FX (underclocked by Cool'n'Quiet) could still outperform a 1 GHz Athlon, hence giving it a performance rating of more than 1000+.

While you've been moderated quite "informative", your comment isn't really that - it's partly right, just like it's partly right that the earth is somewhat flat ;)

--
We do not live in the 21st century. We live in the 20 second century.

FX57 and FX59 benchmarks at 3GHz. by ruiner5000 · 2005-06-19 11:08 · Score: 2, Informative

You can find the new San Diego core FX benched at FX57 and FX59(3GHz) speeds here.

--
ignorance is bliss. googlefiberatx.com

you're mistaken by YesIAmAScript · 2005-06-19 11:15 · Score: 2, Informative

Intel generally leads AMD in memory bandwidth, at least without any overclocking.

Intel was doing 6.4GB/s (dual channel PC3200 RAM) when AMD was at 2.7GB/sec. (single channel PC2700).
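Those two figures follow from transfer rate times bus width times channel count. The sketch below assumes the standard 64-bit DDR channel and the nominal rates of 400 MT/s for PC3200 and 333 MT/s for PC2700; the function name is invented here, and these are theoretical peaks rather than measured throughput.

```python
def peak_bandwidth_gbs(mega_transfers_per_s: float,
                       bus_width_bits: int = 64,
                       channels: int = 1) -> float:
    """Theoretical peak memory bandwidth in GB/s (1 GB = 10**9 bytes)."""
    bytes_per_transfer = bus_width_bits / 8
    return mega_transfers_per_s * bytes_per_transfer * channels / 1000


dual_pc3200 = peak_bandwidth_gbs(400, channels=2)    # dual channel PC3200
single_pc2700 = peak_bandwidth_gbs(333, channels=1)  # single channel PC2700

print(f"dual channel PC3200:   {dual_pc3200:.1f} GB/s")
print(f"single channel PC2700: {single_pc2700:.1f} GB/s")
```

Going from one 64-bit channel to two is the same as doubling the path to 128 bits, which is why dual channel doubles the peak at an unchanged transfer rate.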

Also, note that memory accesses don't go over HyperTransport on an Athlon. The memory controller is built into the CPU. This is nice for latency, but bad because it means that Athlon users are stuck with whatever memory technology AMD has selected. Right now, that means Athlon systems are stuck with DDR even as DDR2 prices fall below DDR prices.

Also, HyperTransport isn't all that insanely fast. Amongst other things, clocking cycles are part of the "GHz" rating on HT, and so the bandwidth is lower than it might seem.

--
http://lkml.org/lkml/2005/8/20/95

Re:640x480 gaming by be-fan · 2005-06-19 12:52 · Score: 2, Informative

There is no load balancing between the CPU and the GPU. The two always do the same part of the work. If one can't keep up, the other doesn't take over the work. Instead, the slower part just becomes a bottleneck. At 640x480, you're testing how fast the CPU can feed the graphics card data. At 1600x1200, the graphics card becomes the bottleneck, and as long as the CPU can feed the graphics card fast enough, they'll get the same results.
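A minimal bottleneck model shows the effect. All numbers are invented: two CPUs that can prepare 150 and 200 frames per second, and a graphics card whose cost is assumed to scale purely with pixel count, which real cards only roughly follow.

```python
def frame_rate(cpu_fps: float, gpu_pixels_per_s: float, width: int, height: int) -> float:
    """Frames per second when the slower of CPU and GPU sets the pace."""
    gpu_fps = gpu_pixels_per_s / (width * height)
    return min(cpu_fps, gpu_fps)


GPU_PIXELS_PER_S = 120e6  # hypothetical throughput of the graphics card

for width, height in ((640, 480), (1600, 1200)):
    slow = frame_rate(150, GPU_PIXELS_PER_S, width, height)  # hypothetical slower CPU
    fast = frame_rate(200, GPU_PIXELS_PER_S, width, height)  # hypothetical faster CPU
    print(f"{width}x{height}: slower CPU {slow:.1f} fps, faster CPU {fast:.1f} fps")
```

At 640x480 the two CPUs score differently because the card is never the limit; at 1600x1200 both collapse to the card's rate, so the benchmark stops telling the CPUs apart.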

--
A deep unwavering belief is a sure sign you're missing something...

you're also mistaken. by YesIAmAScript · 2005-06-19 13:08 · Score: 2, Informative

When Intel came out with the 800MHz FSB (6.4GB/s), they handily beat AMD on benchmarks. AMD didn't have the internal memory controller at the time. Plus they had a slow FSB. Plus they were in that awkward time before NVidia jumped on the bandwagon, and so the chipsets for AMD were terrible, most of them bad VIA performers.

AMD may have the upper hand in many benchmarks right now (I guess you don't look at video compression), but it hasn't been that way for long. AMD's most recent rise above Intel really started with Intel's 3.4GHz offerings. In the period before that (the early 875P chipset days, and the aforementioned 800FSB and 3.0GHz range), Intel was beating AMD handily, although not at nearly equivalent prices. This is especially true on memory-intensive benchmarks. It was quite a step forward for a company that at that time was just emerging from the dark days of RDRAM stupidity.

Also note that saying the HyperTransport bus is why AMD's dual cores run faster than Intel's dual cores is pure speculation. Do you have any real reasons to put behind that or just assertions?

Even if Intel does use the standard FSB for inter-processor communications, their FSB is currently faster than AMD's HT. The bandwidth of the HT on AMD chips is 8.0GB/sec. The bandwidth of the FSB on Intel's fastest processors is 10.7GB/sec.

I'm not dumping on AMD. I like AMD. But the amount of misinformed speculation and assertions as to why they are doing well is astounding.

Also, I find the "troll" moderation on my post above insulting. It seems silly to me, but disregarding that, I find it ridiculous that people automatically moderate "troll" apparently just for speaking any nice things about Intel.

--
http://lkml.org/lkml/2005/8/20/95

Re:640x480 gaming by neye_eve · 2005-06-19 15:59 · Score: 2, Funny

I don't think so. At 640x480 the vid card can probably handle everything all by itself. You need to put a big load on it so that the work has to run on the cpu since the vid card can't do it all itself.

i know. I wish more people understood this. I'm having a bit of trouble though with the geforceMX card that I carefully modded to map into the 762 pin socket on my motherboard. The darn thing just don't wanna boot!

I mean, it has more gigaflops and bogomips than a G4, which we all know is a national security risk! o_0 so i figured it would run my computer super fast when I used it to replace my aging AthlonXP 1800 .

pls can anyone help me???? kthxbai!

my points by YesIAmAScript · 2005-06-19 16:35 · Score: 2, Interesting

My point was that Intel has not been behind AMD in memory bandwidth in recent memory. At most times, including right now, they are ahead of AMD in memory bandwidth. The parent (now super parent) poster was wrong in saying AMD was ahead.

Intel reached the memory bandwidth levels AMD is at right now almost 3 years ago with the 3.0GHz/800FSB Pentium 4.

Being stuck with DDR isn't a problem as far as performance. But right now memory (esp. Taiwanese) vendors are dropping their prices on DDR2 trying to accelerate the switch to DDR2 from DDR. This is presumably to get out from under the license fees they pay to Rambus for DDR. But regardless of the reasons, as DDR2 drops in price below DDR, many Athlon users are going to wish they could use DDR2.

As to your comments that memory manufacturers say DDR2 prices aren't going to drop, I could find nothing like that at all. Most news sources say DDR2 prices will drop below DDR prices in the 2nd half of the year. More specific news says things like I mentioned above: http://www.tomshardware.com/hardnews/20050609_065449.html

As fast as having your own onboard memory controller is, it does stifle innovation as far as what memory can be used on motherboards completely. So you had better be SURE your bet is right. I'm not 100% sure AMD's is.

I dunno about Intel copying AMD's plans. I haven't heard anything of it. To do so requires adding at least 160 pins to the CPU package. And it means you can't do multi-chip multi-processing.

As to your comments that this means that I/O traffic doesn't tie up the FSB, you are incorrect. When you do I/O, the data doesn't go directly to the CPU (into the CPU registers), it goes into RAM. And AMD has the memory controller on the CPU, so that means that when you are moving data from the disk to the RAM, the data on an AMD has to go into the CPU on the HT bus, and out on the FSB (RAM) pins. So I/Os still tie up the FSB.

On an Intel, the data never even goes to the CPU, it comes in the south bridge (ATA, including SATA, most other stuff) or directly into the north bridge (GigE), and then goes out on the RAM pins (the magic of DMA). So there's no more or less competition for memory bandwidth in an Intel than on an AMD.

Essentially, AMD just moved the northbridge into the CPU. Why this decreases the memory latency, I'm not sure. I'm not saying it doesn't either, as the numbers seem to indicate it does.

I dunno if Intel is going to copy AMD. Right now, Intel is busy moving the GPU onto the northbridge to save money (esp. in laptops). That means Intel probably isn't going to move the northbridge onto the CPU, at least not on all systems.

--
http://lkml.org/lkml/2005/8/20/95
