FreeBSD on the Athlon64 in 64bit vs Pentium4 3.2E

Posted by Hemos on Monday April 5, 2004 @02:45AM from the compare-and-contrast dept.

veliath writes "Came by a comparison from about three weeks ago, between two systems running FreeBSD. One is an Athlon64 running FreeBSD in 64bit mode and the other a Pentium4 3.2E running FreeBSD in 32bit mode."

10 of 74 comments

Old news... by hmallett · 2004-04-05 03:11 · Score: 2, Informative

This is the same article as was linked to from the FreeBSD site a few weeks ago. Everyone's probably read this already. Basically, the Athlon64 is faster.
Re:What about multiple processors? by Homology · 2004-04-05 05:02 · Score: 4, Informative

When I run top, it sure is nice to every now and then see 2 processors at almost 100% utilization, yet also show 50% idle.

It shows that you have spare capacity for starting other processes. It also shows that your system is slower than it could be. Some food for thought relating to the uses of hyperthreading.
AMD 3200 won with only 512k cache. by BrookHarty · 2004-04-05 05:04 · Score: 3, Informative

I noticed they used the AMD64 3200+, but the 3200+ has only half the cache of the 3400+; that extra cache should boost the build process even more.

Tom's Hardware has a nice review and benchmarks of the 3400+ vs. the P4 3.4.

Also, did anyone notice that in both articles the P4 cleans house on synthetic benchmarks, but in the real world (the build process) the AMD cleans house?
1. Re:AMD 3200 won with only 512k cache. by Too+Much+Noise · 2004-04-05 05:55 · Score: 4, Informative
  
  I think you're mistaking the Athlon64 3200+ for the 3000+. 3200+ has 1M cache, while 3000+ has 512k. 3400+ has the same 1M cache, plus the 0.2GHz speed bump.
  
  Come to think of it, this can actually be found on the very page you linked to.
Re:HT & threads by ratboy666 · 2004-04-05 07:06 · Score: 4, Informative

What HyperThreading is...

Out of order execution takes the processor to a particular level of performance. Unfortunately, (and especially with the X86 IA), we run out of steam rather quickly, and the processor blocks waiting on registers or memory. The idea behind HT is that the processor's execution elements can then be reassigned to something else waiting in cache.

Of course, this means we need a big fat cache, and something else to execute. Could be another thread or process, but the important thing is that the second job be independent.

This can increase the utilization of the processor's compute elements.
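
A rough way to see the utilization argument, as a toy sketch only: assume each thread independently has work ready with some probability, and that one ready thread is enough to keep the execution units busy. The function name `busy_fraction` and the 60% figure are made up for illustration; real HT behavior depends on cache pressure and shared resources that this ignores.

```python
def busy_fraction(ready_prob: float, threads: int) -> float:
    """Fraction of cycles in which at least one thread has work ready.

    Toy model: each thread is independently ready with probability
    ready_prob, and a single ready thread keeps the execution units busy.
    """
    return 1.0 - (1.0 - ready_prob) ** threads


# Hypothetical: a thread is stalled on memory or registers 40% of the time.
one_thread = busy_fraction(0.6, threads=1)
two_threads = busy_fraction(0.6, threads=2)
print(f"1 thread: {one_thread:.2f}, 2 threads: {two_threads:.2f}")
```

Under these assumptions the second independent job lifts the busy fraction from roughly 0.60 to roughly 0.84, which is the whole appeal; if the two jobs fight over the cache, the independence assumption breaks and the gain shrinks.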

So, yes, the "builds with multiple jobs running at the same time" test makes sense.

I would like to see a benchmark with CPU stalls and utilization summarized at the end. Can't do it myself, because I am far too cheap to replace my current system (and yes, it is an MP box - dual 200MHz PPro - and it still does quite nicely).

Anyway, it does look like Intel took a hit in this benchmark; too bad for them. I looked over the methodology -- and it looked reasonable given the scope of the project.

Ratboy.

--
Just another "Cubible(sic) Joe" 2 17 3061
64 bit is faster by Anonymous Coward · 2004-04-05 07:43 · Score: 3, Informative

In the end I think the initial point is made with this review though, and that is that 64-bit does make a difference to the "average user" as well as the power user or administrator, but that performance advantage may not be evident in all situations. When under heavy load or dealing with large blocks of data, the Athlon64 (and we can assume that the Opteron and Athlon64-FX also apply) in 64-bit mode achieves superior performance to the same machine in IA32 (x86) mode. This is not so much because of the 64-bit addressing as it is the fact that there are twice as many general-purpose registers available.
Re:Ultimate 64 bit Nethack box! by Henry+V+.009 · 2004-04-05 11:55 · Score: 4, Informative

So sad to see that the parent is yet another victim of the megahertz myth.

Imagine for a moment that a CPU maker created a chip that performed 10 times the number of operations per cycle that either Intel or AMD could achieve. But also imagine that because of the complexity, they could only get the chip to run at 50MHz. Not very useful, huh?

Intel has gone with a design that allows them to ramp up clock speed. AMD has gone with a design that allows them to use clock cycles more efficiently.

Both of those approaches are a perfectly good way to do things. All that matters is how fast the user's applications run in the end.
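
To put numbers on the thought experiment, here is a minimal sketch assuming throughput is simply operations per cycle times clock frequency. The figures are hypothetical (a baseline of 1 op per cycle at 3200 MHz against 10 ops per cycle at 50 MHz), not measurements of any real chip, and `throughput_mops` is a name invented for this example.

```python
def throughput_mops(ops_per_cycle: float, clock_mhz: float) -> float:
    """Millions of operations per second = ops per cycle x clock in MHz."""
    return ops_per_cycle * clock_mhz


# Hypothetical figures, not measurements of any real chip.
wide_slow = throughput_mops(ops_per_cycle=10.0, clock_mhz=50.0)
narrow_fast = throughput_mops(ops_per_cycle=1.0, clock_mhz=3200.0)
print(wide_slow, narrow_fast)
```

Neither factor means much alone: the 10x-per-cycle chip still loses badly here because its clock is 64 times slower.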
Re:When is 5.3-RELEASE coming out? by Anonymous Coward · 2004-04-05 23:01 · Score: 1, Informative

Don't get your hopes up: it still seems pretty far from stable. Although of course they could release 5.3 soon, it probably wouldn't be -STABLE.

Either way, I'd expect there won't be a 5-STABLE branch until the end of the year if not longer.
Re:Holy Crap! by mobby_6kl · 2004-04-07 09:05 · Score: 2, Informative

I can't RTFA, but from the article summary it is a regular Prescott, not ExtremeEdition. IIRC, "E" stands for Prescott, "C" would be for Northwood core.
Re:Ultimate 64 bit Nethack box! by jasonsingha · 2004-04-08 04:54 · Score: 3, Informative

All chips are designed to run "at the edge" of their upper frequency limit, so AMD doesn't have an inherent advantage just because it does more work per clock cycle while Intel does less work per cycle at a higher frequency. All chip-makers hit the same physical limitations at about the same time, and neither has the advantage because it runs at a higher or lower frequency today.

The primary determinant of clock speed (besides process technology, of course) is the number of transistors and the length of the wires in the critical path of each pipeline stage. For a chip with a higher clock speed on the same process technology, this means that it has less wire or fewer transistors in the longest path of each stage, so it can be clocked faster. The presumption is that this is achieved by having more stages or better logic than some other design, but it really doesn't matter as far as the physics is concerned. All chips max out when the frequency is so high that the signals flowing through the circuits don't have enough time to get from one stage to the next, and from this perspective the only thing that matters is how much wire and how many transistors. If AMD were able to make faster chips, they would. Likewise with Intel.

It all boils down to this: if I have a path with 10 units of delay and all of your paths have 8 or fewer units of delay, you will achieve a higher clock speed if we use the same manufacturing process. Neither design is better at getting sped up when moving to the next process technology, since they use the same transistors and wire and are running at the same edge "node" in the current process technology.
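
The 10-versus-8 example can be sketched as follows, assuming the clock period only has to cover the slowest stage (this ignores setup time, clock skew, and everything else a real timing analysis accounts for). The stage delays are invented, in arbitrary units, and `max_clock` is a hypothetical helper.

```python
def max_clock(stage_delays):
    """Highest clock rate (cycles per delay unit) the slowest stage allows."""
    return 1.0 / max(stage_delays)


# Invented per-stage delays in arbitrary units.
mine = [6, 10, 7]        # one path with 10 units of delay
yours = [8, 8, 7, 8]     # every path has 8 or fewer units

speedup = max_clock(yours) / max_clock(mine)
print(f"your design clocks {speedup:.2f}x faster than mine")
```

Only the worst path matters in this model; shaving the other stages of either design changes nothing.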

Where AMD *may* have an advantage is that the top speed of chips may start to be affected by power consumption, since if the chips get too hot, they will melt. Power consumption for CMOS is determined by a dynamic component (when CMOS gates change their state they burn power) and a static component (determined by the total number of transistors). You used to be able to ignore the static component, but as the feature size decreases, the leakage current begins to become quite noticeable. [AMD is using SOI already to help with this problem. Intel is eventually going to introduce its SOI work-alike (the name escapes me because it was invented by a marketing person).] Most advanced chip designs include circuits which burn power all of the time, but I'll assume that both Intel and AMD use the same tricks with the same small percentage of their transistors (and it is the same transistors with a high dynamic power usage anyways since they are important circuits).
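
As a back-of-the-envelope sketch of those two components, the usual first-order model is dynamic power = activity x switched capacitance x Vdd^2 x frequency, plus static power = Vdd x total leakage current. All numbers below are hypothetical placeholders, not figures for any Intel or AMD part, and `cmos_power_watts` is a name made up for this example.

```python
def cmos_power_watts(activity, capacitance_f, vdd, freq_hz, leakage_a):
    """Return (dynamic, static) power in watts under a first-order CMOS model."""
    dynamic = activity * capacitance_f * vdd ** 2 * freq_hz  # switching power
    static = vdd * leakage_a                                 # leakage power
    return dynamic, static


# Hypothetical chip: 10% of 100 nF switches each cycle at 1.4 V and 3.2 GHz,
# with 10 A of total leakage current.
dynamic, static = cmos_power_watts(
    activity=0.1, capacitance_f=100e-9, vdd=1.4, freq_hz=3.2e9, leakage_a=10.0
)
print(f"dynamic: {dynamic:.1f} W, static: {static:.1f} W")
```

The dynamic term scales with frequency while the static term does not, so a leaky process costs power even when the chip is clocked down.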

It is conceivable that one design, Intel's P4 or AMD64, is "more efficient" in that it uses less power, both statically and dynamically, to achieve the same computations. At the end of the day, the more efficient processor may be able to compute the result quicker because it won't have to turn itself off just to cool down compared to the less efficient design. Currently this kind of limitation is only present in very small laptops.

However, you can't automatically assume that low frequency means the AMD64 design is more efficient. It may do more wasted work (speculation) per cycle in trying to do more work per cycle. It might have more transistors in its ALUs burning power, etc. One thing that hurts Intel is that, because they have a longer pipeline, they have a higher penalty for branch mispredictions, but they may just be able to tweak the branch predictor or make it larger to recoup this deficit.
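
The pipeline-length trade-off can be sketched with the textbook estimate: effective cycles per instruction = base CPI + (fraction of branches) x (misprediction rate) x (penalty in cycles). The penalties and rates below are hypothetical round numbers, not measured values for the P4 or Athlon64, and `effective_cpi` is an invented helper.

```python
def effective_cpi(base_cpi, branch_fraction, mispredict_rate, penalty_cycles):
    """Average cycles per instruction including branch misprediction stalls."""
    return base_cpi + branch_fraction * mispredict_rate * penalty_cycles


# Hypothetical: 20% of instructions are branches, base CPI of 1.0.
short_pipe = effective_cpi(1.0, 0.2, 0.05, penalty_cycles=12)
long_pipe = effective_cpi(1.0, 0.2, 0.05, penalty_cycles=30)
# Same long pipeline with a better predictor (2% misses instead of 5%).
long_pipe_tuned = effective_cpi(1.0, 0.2, 0.02, penalty_cycles=30)
print(short_pipe, long_pipe, long_pipe_tuned)
```

With these made-up numbers the longer pipeline pays more per miss, but cutting the miss rate from 5% to 2% brings it back level with the shorter pipeline, which is the "tweak the branch predictor" escape hatch.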

At the end of the day, you'd probably find out that the two chips are about as efficient as each other (and that PowerPC processors are a little bit more efficient since they have much simpler decoders). I know that AMD chips and Intel chips both burn a lot of power. If AMD sticks with SOI and Intel uses inferior technology, then AMD will win. But Intel has a lot of money and they probably won't get out-classed in technology by anyone.