Benchmark Program Rewritten to Favor Intel?
BrookHarty writes "Interesting article over at Van's Hardware reporting that BAPCo, the maker of the SysMark benchmarking program, has rewritten its SysMark 2002 benchmark program in favor of Intel's P4. AMD joined BAPCo in order to "correct" these "broken" results. AMD reports that BAPCo's SysMark 2002 (written by Intel engineers) is a collection of tasks meant to summarize "Real World" performance. Interestingly, these tasks are selected to favor Intel's performance, while certain tasks that favor AMD have been removed. Van's Hardware has additional information on BAPCo's shady history."
Obviously, the best bet for CPU benchmarks would be an open-source one compiled using a standard compiler. This is a case where open-source really shines.
GoatPigSheep, the 3 most important food groups
Intel has used the "once compilers catch up..." scam for years, and every time people find themselves with a long-obsolete processor by the time the software that theoretically exploits it arrives.
My general practice is to ignore any synthetic benchmarks because they represent no real-world value whatsoever. Instead I look to application benchmarks, like compressing DivX movies or rendering 3D scenes, if that is the use I have planned for my PC.
BAPCo's headquarters are on the Intel campus. It's been Intel-biased from day 1 (back when AMD was making K5s and thinking about making K6s) and AMD has known this.
The fact is, prior to the release of the Athlon, nearly all benchmarks were biased towards Intel. AMD's strategy when they released the Athlon was to make a CPU so good, it could beat Intel's CPUs even on these benchmarks. Sysmark just happens to be the one benchmark where Intel exercises so much control that it could literally say whatever Intel wanted it to say.
What you are seeing is AMD just starting to switch strategies from "let's just beat them on every benchmark under the sun regardless of bias" to "let's expose the bias where it is at its worst so people can know the truth".
This is all just preparation for the K8 launch I think. If AMD can properly put Sysmark results into perspective, maybe everything that is left will show what a monster K8 is versus any Intel offering. It is indicative that the K8 may not be winning on Sysmark on internal testing, or may not be winning by a sufficient margin.
As compilers become tuned to exploit this, it's plausible that the Athlon's performance is going to lag quite a bit more than it already does. That there is some benchmark out there that is specifically designed to show off this strength of the P4 is no real surprise to anyone, is it?
That's not the complaint at all. Read the linked article. The complaint is that Sysmark 2002 has been systematically altered relative to Sysmark 2001 so as to favour the P4 over Athlon.
For example, the PhotoShop test in Sysmark 2001 had 13 filters, of which 8 run faster on the Athlon and 5 faster on P4. The Sysmark 2002 PhotoShop test has 6 filters, of which 3 are filters from Sysmark 2001 on which P4 wins and the other 3 are additions on which the P4 also wins. The 8 filters on which the Athlon does better have all been removed.
There are several other examples in the article. Read the article.
BTW, an interesting point is that this whole thing is basically an AMD publication that AMD have chosen to proxy via Van's. Van is at least open about it. The AMD presentation containing all the information in that article is linked at the end and is available here
When digging through reviews of the latest CPU and/or mainboard, I initially groaned at the increasing number of benchmarks folks would put out. But it is more than just increasing click-through rates (well, maybe not for some, but...) - it lets me see the applications that I actually use. Synthetic benchmarks and politicians' promises garner the same level of trust from me.
Anyhow, I game and code but use games to judge where my cash goes. When the P4 came out, I saw it did a great job with Quake and I started to get excited about the CPU. Then I saw the benchmarks on the games I actually play - UT, CS, and a few others - and it was not black and white. After the ATI fiasco, Quake is up there with synthetic benchmarks IMHO. As for Photoshop, you can pick which platform you want to 'win' by tuning the filters. Apple does it and their dually box wipes out the competition; the others do it and the tables are turned.
There are great graphs out there that show benchmarks using different sizes of data. It's like comparing a small turbocharged engine to a larger normally aspirated one - so what RPM were you at when you ran your test? BMW's M5 feels slower than an Audi S4 at the start, but get the RPMs up there and it is a different story. Even pickup trucks can beat a Ferrari if you tune the test to take advantage of a sweet spot.
I've done my homework, and my personal cluster is mostly AMD today. I still have one Celeron 566@800 as a CS server, but my workstation (an Intel Xeon box) was replaced by AMD MP chips. Secondary boxes are all XP chips, but they used to be PIIs and PIIIs back when Cyrix and the K5 sucked. They run Oracle, Weblogic, LDAP, and other stuff quite well when I'm working, and one swap of a hard drive later I'm getting some solid fragging in on the same box. In another year or so, if Intel really holds the crown, the price is right, and my boxes are 'only fast enough for web browsing and email', I'll choose them.
+++ UGUCAUCGUAUUUCU
A couple of days after some lawyers get together for a class-action suit alleging that Pentium IVs are slower than the AMD competition, new BAPCo tests 'prove' that the Pentium IV was quicker all along.
Nice one Intel. At the very least, this should muddy the waters with respect to which one is quicker being a matter of opinion.
I use a 7 Watt Via C3 as opposed to one of the 60 Watt P4/Athlon chips and do not really care either way.
Mielipiteet omiani - Opinions personal, facts suspect.
Also, the story didn't imply this was a big deal. It only reminds us of all the dirty tricks Intel is forced to resort to when they try to maintain a market lead with a grossly inferior product. As long as people know this, benchmark-cooking is really no big issue.
I think this speaks volumes: Intel are schmoozing the benchmarkers while AMD are designing kick-ass processors. I hope the stockholders are listening!
If AMD would stick to making totally Intel compatible chips instead of trying to infuse their own personality, we wouldn't have this problem. Hint: my software shouldn't need to know it's running on an AMD chip.
This is so wrong on so many counts...
1. Intel's chips aren't "totally Intel compatible". The Pentium 4 contains instructions that were not present in the Pentium, P2, and P3. Why should your software have to "know it's running on a" Pentium 4 rather than a P3, P2, or Pentium? Hell, there was even a Pentium and a Pentium MMX (the latter adding the MMX instructions).
2. Intel tries every trick possible to patent their instructions to prevent people from implementing them. They do it with hardware, too. Remember when you could plug a K6-2 in place of an Intel Socket 7 CPU? Starting with Slot 1, Intel used patents to prevent others from making compatible CPUs, which is why AMD and Intel motherboards are now incompatible.
3. Why should AMD not provide useful processor extensions that improve on Intel's base instructions? That's what provides useful competition and makes the industry grow.
4. What interest do you have in seeing AMD in a constant catch-up mode? In your scenario, Intel gets an advantage every time they release new instructions -- that will take AMD months to implement in silicon. Do you own Intel stock?
5. Why doesn't Intel just stick to providing processors that are 'totally AMD compatible'?
That's what SysMark is supposed to be: they measure "real-world" performance figures - they run a slew of Photoshop filters, and time it, and other crap.
Unfortunately, SysMark's testing strategy is really terrible. I'm even a bit confused how it works: they say that they scale each test based on how long it takes to complete: but is the scaling from a "reference system" or from each system? If it's from a reference system, then it's biased against whatever that reference system is good at (since the difficult bits get weighted more). If it's from each system on the fly, then it's really meaningless, as one poorly-chosen benchmark can skew the whole thing.
Worse yet: in SysMark 2002, AMD claims that BAPCo uses the same benchmark, multiple times: this is just plain bad, because not only does it magnify the importance of this benchmark, it shrinks the importance of all of the other ones. It's just plain idiotic. Take 3 tests, run them 4 times each, and use the results from all of the runs? It's a very very obvious bias - the only reason you would do that is if you wanted to cheat for one specific processor, and you knew which filters it was good at.
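To make the weighting point concrete, here is a minimal sketch. It assumes the composite is a geometric mean over every counted run; that is only an assumption for illustration, since the thread does not say how SysMark actually combines results. The filter names and speedup figures are invented, not measurements.

```python
from math import prod

def geometric_mean(values):
    """Geometric mean of a sequence of positive numbers."""
    values = list(values)
    return prod(values) ** (1.0 / len(values))

# Hypothetical per-test speedups of CPU A over CPU B (>1 means A is faster).
# Invented numbers for illustration only.
speedup = {"filter_x": 1.25, "filter_y": 0.80, "filter_z": 1.00}

def composite(run_counts):
    """Composite score when each test is run run_counts[name] times
    and every run is counted as a separate result."""
    runs = [speedup[name] for name, n in run_counts.items() for _ in range(n)]
    return geometric_mean(runs)

# Each test counted once: the two CPUs come out even.
balanced = composite({"filter_x": 1, "filter_y": 1, "filter_z": 1})

# The test that favors CPU A is counted four times: CPU A now "wins".
skewed = composite({"filter_x": 4, "filter_y": 1, "filter_z": 1})

print(f"balanced composite: {balanced:.3f}")
print(f"skewed composite:   {skewed:.3f}")
```

Nothing about either chip changes between the two runs; only the number of times one test is counted does, and that alone moves the composite from a tie to a lead of roughly 12% for CPU A.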
Which, surprise surprise, they do indeed remind people of! And if this is true, they'd be right that it was a smoking gun w.r.t. that lawsuit, too.
Let them go on being an AMD fanboy site. I don't see INTEL fanboy sites breaking this story.
Wouldn't better CPU benchmarks be obtained by using the chipmakers' own compilers?
... almost the antithesis of what Intel is trying to do. Has this been a longstanding strategy on AMD's part?).
No.
The chipmaker would simply then optimize their compiler for the benchmark(s) in question, rather than for code more generally. In other words, what you suggest would still allow the chipmaker to cheat.
In order to have complete transparency in the benchmarking, both the benchmarks and the compiler should be open source (ideally free software, so that anyone can run and verify the benchmarks as well, allowing repeatable experimentation in the broadest scientific sense). If the chip maker wishes to submit optimizations to such a compiler they would be free to do so, since any such optimizations would in turn be open source (or free software) and subject to peer review.
A good candidate would be gcc, which runs on numerous platforms, and on several operating systems on AMD and Intel hardware.
Cheating would be much harder in this case, perhaps even impossible, something we need given the sordid history of benchmarking by all parties involved (except perhaps AMD? Can anyone recall an instance where AMD has cooked results? I ask because their current chip rating system is extremely conservative
The Future of Human Evolution: Autonomy
The Pentium IV has other questionable design decisions that hurt performance as well. It has 8K of L1 cache, the same amount found in the ancient 486 processor, whereas the Athlon has that amount squared and doubled (128K).
Obviously you flunked your freshman-level computer architecture course. The P4 8K L1's 2-cycle load-use latency is 50% better than Athlon 128k L1's 3-cycle load-use latency (not even accounting for P4's clock speed advantage). The difference in hit rate between 8k and 128k is only about 5% meaning that it is substantially faster to go with the small/fast cache than the big/slow cache. Do the math - even an infinitely large 3-cycle load-use cache is slower than an 8k 2-cycle load-use cache.
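The "do the math" step can be sketched with the usual average-latency formula: hit latency plus miss rate times miss penalty. Only the 2-cycle and 3-cycle hit latencies and the "about 5%" hit-rate gap come from the post; the absolute miss rates and the miss penalties below are hypothetical, and the 5% is read here as five percentage points.

```python
def average_load_latency(hit_cycles, miss_rate, miss_penalty_cycles):
    """Average load latency in cycles: hit time plus expected miss cost."""
    return hit_cycles + miss_rate * miss_penalty_cycles

def break_even_penalty(small_hit, small_miss, big_hit, big_miss):
    """Miss penalty (cycles) at which both caches average the same latency."""
    return (big_hit - small_hit) / (small_miss - big_miss)

# From the post: 2-cycle 8K cache vs 3-cycle 128K cache.
SMALL_HIT, BIG_HIT = 2, 3
# Hypothetical miss rates, chosen so they differ by five percentage points.
SMALL_MISS, BIG_MISS = 0.08, 0.03

for penalty in (10, 30):  # hypothetical miss penalties in cycles
    small = average_load_latency(SMALL_HIT, SMALL_MISS, penalty)
    big = average_load_latency(BIG_HIT, BIG_MISS, penalty)
    print(f"penalty {penalty:2d} cycles: small/fast {small:.2f}, big/slow {big:.2f}")

threshold = break_even_penalty(SMALL_HIT, SMALL_MISS, BIG_HIT, BIG_MISS)
print(f"break-even miss penalty: {threshold:.1f} cycles")
```

Under these made-up numbers the small, fast cache wins while a miss costs less than about 20 cycles and loses above that, so the comparison hinges on how quickly the next cache level can answer.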
Cache size comparisons are more meaningless than megahertz comparisons. Whenever somebody tries to justify a big cache size without looking at performance, just walk away. AMD is playing marketing games with their slow-as-molasses (but massive) L1 cache.
I won't bother to address the rest of the technical errors in your post...
Apples to apples please - when you clock the Athlon at the same clockspeed as the Intel chip (which is possible with the new chips that AMD just released), the FPU is far faster on the Athlon chips.
Not that it matters, but P4 and Athlon performance are approximately equivalent on a per-clock basis: Athlon does 624 / 1800 MHz (0.347) "SPECfp points per megahertz" (whatever that means), and P4 does 861 / 2533 MHz (0.340). Of course, since P4 can clock much faster, it has better absolute FP performance.
This is what indicates a superior FPU design, not a comparison based on a ~700 MHz difference in clockspeed.
Let's throw Itanium 2 into the picture, which does a whopping 1356 SPECfp at an equally astonishing 1.0 GHz, which puts it at this silly "SPECfp points per megahertz" of 1.356 - roughly quadruple that of Athlon/P4.
Does that mean Itanium 2's FPU is four times better than Athlon?
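For anyone who wants to check the arithmetic, here is a small sketch using only the SPECfp scores and clock speeds quoted in this comment; the variable names are just for illustration.

```python
# (SPECfp score, clock in MHz) as quoted in the comment above.
scores = {
    "Athlon": (624, 1800),
    "Pentium 4": (861, 2533),
    "Itanium 2": (1356, 1000),
}

# "SPECfp points per megahertz" for each chip.
per_mhz = {name: score / mhz for name, (score, mhz) in scores.items()}

for name, value in per_mhz.items():
    print(f"{name:10s} {value:.3f} SPECfp points per MHz")

# How far ahead Itanium 2 is on this per-clock metric.
ratio = per_mhz["Itanium 2"] / per_mhz["Athlon"]
print(f"Itanium 2 vs Athlon per clock: {ratio:.2f}x")
```

The Athlon and P4 land within a few percent of each other per clock, while Itanium 2 comes out a bit under four times higher, which is the point of the rhetorical question: the per-megahertz figure says little on its own.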
Both of these companies spend *billions* of dollars on producing these processors. Both companies run lots of simulations to determine what design choices best fit with the rest of their design. When you're spending that much time and money developing these CPUs, you can't afford NOT to consider every option.
When it comes down to it, both AMD and Intel have really good engineers, and both companies listen to them when figuring out how to build CPUs.
So consider that the P4's 8KB L1 data cache is so small because that's as big as they could make it while keeping the latency down to 2 cycles -- something that was critical to keeping their double-pumped ALUs busy (and thus their IPC up as much as they can) -- and that they could compensate by working a bit harder on a fast L2.
Perhaps AMD decided that they could live with an extra cycle of latency in the L1 because they have enough instructions in flight that blocking on a cache access wouldn't hurt them as much as a low hit rate would.
Or, perhaps there are multiple sweet spots in size/hitrate, especially when you factor in die size and cost. Honestly, I don't know the reasons why they made these decisions, and I'd love to find out why-- but I have 100% confidence that all the options were carefully considered.
When it comes down to it, both architectures are performing really well! And for YEARS, they have been competitive with each other. So while you may have your favorite (I, for example, think the P4 SMT and trace cache stuff is pretty neat), you've got to realize that zealously promoting one over another just makes you look silly.
Cheers!
-Ed
point 1.
;-)
/dislikes for both companies...but CURRENTLY, i'm going to recommend amd 90% of the time. What most Intel FANS (read: biased) DON'T UNDERSTAND, is while they are bashing AMD, if it were not for AMD they would be paying a lot more for their precious little Intel processors. Very likely around $800-$900 for a dinky 1 Gigahertz P3 right about now. How can you read slashdot, and know how important competition is and still be an Intel cheerleader? They must be ignorant or stupid.
amd is, mhz for mhz, dollar for dollar, pound for pound (the currency & weight
is THE FASTEST x86 cpu out there.
If I give you $300 and tell you to go buy the fastest x86 cpu you can for the money....it's an AMD cpu... period.
point 2. i own 2 amd based systems, and about 7 intel based systems. i'm not biased, i have likes
Point 3. it's simple. a competitor like AMD simply means that if you prefer AMD, you are getting a damn good processor with a LOT of value. If you prefer Intel, while not quite the value of AMD, you are still getting a very good cpu, with good value. Because of AMD, we the consumers are enjoying greater value.
Without AMD, I can guarantee you that the current situation would not exist.
Intel would definitely be the equivalent of Microsoft....a monopoly. Trying to control all aspects of the industry, charging ridiculous prices, trying to build their own 40 billion in liquidity.