PCMark Memory Benchmark Favors GenuineIntel
javy_tahu writes "A review by Ars Technica disclosed that PCMark 2005 Memory benchmark favors GenuineIntel CPUID. A VIA Nano CPU has had its CPUID changed from the original VIA to fake GenuineAMD and GenuineIntel. An improvement of, respectively, 10% and 47% of the score was seen. The reasons of this behavior of FutureMark product are not yet known."
The reasons of this behavior of FutureMark product are not yet known
Easy. Intel paid them to make it that way.
"When life gives you lemons, don't make lemonade. Make life take the lemons back!" -- Cave Johnson
I'm a GenuineIntel, mod me 47% higher!
No, but I did throw granola at a deaf person once
Seems obvious, but follow the money trail: does PCMark get backing from Intel?
So rise up, all ye lost ones, as one, we'll claw the clouds.
A VIA Nano CPU has had its CPUID changed from the original VIA to fake GenuineAMD and GenuineIntel. An improvement of, respectively, 10% and 47% of the score was seen.
It sounds to me like this could possibly be explained by some kind of conditional optimization that the compiler puts in for various chips, to take advantage of differences in their designs that can improve performance.
Then again, probably not.
This definitely requires clarification from the creator of the benchmark.
It is possible that the benchmark uses the CPUID to change how the benchmark works, for example, to work around known flaws in a given chip. If this is the case, then the problem is not "omyghoshitplaysfavorites" but rather lack of full disclosure that the benchmarks are not directly comparable across different chips. In the most benign scenario, this could be someone at the benchmark creator's shop forgetting to tell the documentation team. This is still a very serious issue, but it's not fraud.
Knowledge is how to play a game, intelligence is how to win, wisdom is knowing what game to play.
Is this like changing the user agent in a browser?
Pretty much, yes.
Could it be that FutureMark uses the GenuineIntel and AMD flags to enable processor-specific extensions, then does a whole bunch of math with those extensions and never bothers to check the result?
This would indicate some really terrible code on FutureMark's part, and VIA should be flagging those op-codes as illegal op-codes, but it might be possible that something like this could happen. It is even possible that the CPUID checks are duplicated in some library somewhere that actually gets the code sequence right, and the main FutureMark code disables the advanced functions of the library whenever the GenuineIntel and AMD flags are missing. Thus FutureMark may feature both code sequences that work and those that don't, and the resulting incompatibilities are what cause the issues.
Why would you even consider running a benchmark program you don't have source code for and cannot compile yourself? (If you are worried about random compiler differences messing up the results, you can check an MD5 sum of the final binary against the published one, but it is important that you can reproduce the binary from source and you can read the source to find out what it does.)
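For what it's worth, the checksum comparison described above could look something like this. It is only a sketch: `file_digest` and `matches_published` are names made up for illustration, and the published checksum is assumed to be a plain hex string. MD5 is used because that is what the comment names; passing `"sha256"` would be the more collision-resistant choice.

```python
import hashlib


def file_digest(path, algorithm="md5", chunk_size=65536):
    """Return the hex digest of a file, reading it in chunks."""
    h = hashlib.new(algorithm)
    with open(path, "rb") as f:
        # iter() with a sentinel stops when read() returns an empty bytes object.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def matches_published(path, published_hex, algorithm="md5"):
    """Compare a locally built binary against a published checksum (hypothetical helper)."""
    return file_digest(path, algorithm) == published_hex.strip().lower()
```

A match only means something if the build is reproducible, which is the point of the parenthetical above.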
If compilers like ICC cripple their code depending on CPUID, that will just lead all manufacturers to set CPUID to GenuineIntel, just as moronic websites (with help from Microsoft) ensured that all browsers call themselves 'Mozilla'.
-- Ed Avis ed@membled.com
Well, PC Mark 2005 is no longer good for testing processors against processors of another maker, i.e. only good for intra-AMD, etc.
Colin Dean Go a year without DRM
That should be AuthenticAMD, not GenuineAMD.
But that would be expecting editors to actually, you know, edit.
The CPUID instruction provides feature bits that software should use to determine which instructions are available. Using the vendor string is not a reasonable way of detecting the presence/absence of instruction set extensions like SSE.
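To make the point concrete, here is a rough sketch of dispatching on feature bits instead of the vendor string. The bit positions are the documented CPUID leaf 1 flags (EDX bit 23 = MMX, 25 = SSE, 26 = SSE2; ECX bit 0 = SSE3). The register values are passed in as plain integers because Python cannot execute CPUID itself, and the routine names (`copy_sse2` and friends) are invented for illustration; nobody outside FutureMark knows what the benchmark actually does.

```python
# CPUID leaf 1 feature flags: bit positions within the EDX and ECX registers.
EDX_FEATURES = {"mmx": 23, "sse": 25, "sse2": 26}
ECX_FEATURES = {"sse3": 0}


def has_feature(name, edx, ecx):
    """True if the leaf 1 register values advertise the named extension."""
    if name in EDX_FEATURES:
        return bool((edx >> EDX_FEATURES[name]) & 1)
    if name in ECX_FEATURES:
        return bool((ecx >> ECX_FEATURES[name]) & 1)
    raise KeyError(name)


def pick_copy_routine(edx, ecx):
    """Choose a (hypothetical) code path from feature bits, never from the vendor."""
    if has_feature("sse2", edx, ecx):
        return "copy_sse2"
    if has_feature("mmx", edx, ecx):
        return "copy_mmx"
    return "copy_generic"
```

Written this way, a chip that advertises SSE2 gets the SSE2 path whether its vendor string says GenuineIntel, AuthenticAMD or CentaurHauls.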
My server
V+I+A == 224
G+e+n+u+i+n+e == 715;
Genuine+A+M+D == 925
Genuine+I+n+t+e+l == 1223
The bigger the number, the faster the processor. And you get 20% extra when you pass 1000.
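For anyone who wants to check the sums, they are just the ASCII codes of the characters added up. A throwaway sketch (`ascii_sum` is a name made up here):

```python
def ascii_sum(text):
    """Add up the ASCII code of every character in the string."""
    return sum(ord(ch) for ch in text)


# The four strings from the comment above.
for name in ("VIA", "Genuine", "GenuineAMD", "GenuineIntel"):
    print(name, ascii_sum(name))
```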
Ignore this signature. By order.
It sounds to me like this could possibly be explained by some kind of conditional optimization that the compiler puts in for various chips, to take advantage of differences in their designs that can improve performance.
People are trusting closed-source benchmarks? Well, golly gee, who'd'a thunk there'd be errors, oversights, or shenanigans?
If this was used for anything more than entertainment value, any methodical person would have at least compared multiple closed-source benchmarks. If that proved to be inappropriately favoring a vendor, then, OK, start calling 'conspiracy', but this just sounds like an error in a tool that was never validated.
My God, it's Full of Source!
OUTSIDE_IP=$(dig +short my.ip @outsideip.net)
VIA's is "CentaurHauls"
AMD's is "AuthenticAMD"
Intel's is "GenuineIntel"
There's no "VIA" nor is there "GenuineAMD".
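For reference, CPUID leaf 0 returns the 12-character vendor ID packed into three 32-bit registers, read in the order EBX, EDX, ECX, four ASCII bytes each, little-endian. A small sketch of the decoding, with the register values written out by hand, since Python cannot execute CPUID directly (`vendor_string` is an illustrative name):

```python
import struct


def vendor_string(ebx, edx, ecx):
    """Rebuild the 12-byte vendor ID from CPUID leaf 0 registers (EBX, EDX, ECX)."""
    return struct.pack("<III", ebx, edx, ecx).decode("ascii")


# Register values for the three vendor strings listed above.
KNOWN_VENDORS = {
    "GenuineIntel": (0x756E6547, 0x49656E69, 0x6C65746E),
    "AuthenticAMD": (0x68747541, 0x69746E65, 0x444D4163),
    "CentaurHauls": (0x746E6543, 0x48727561, 0x736C7561),
}

for name, regs in KNOWN_VENDORS.items():
    print(name, vendor_string(*regs))
```

Changing the ID on the Nano amounts to changing those twelve bytes; the feature bits reported by the other CPUID leaves are a separate matter.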
Clearly PCMark2005 is buggy (at best) and cannot be used to compare different CPU families in this test. At worst it is intentionally flawed, and shouldn't be used at all.
It's a shame that not one VIA Nano review benchmarked the built-in Padlock functionality. Not one OpenSSL benchmark.
It's a 3+ year old benchmark being let loose on 2008 vintage CPUs and making mistakes in its optimisations. I wouldn't expect anything else. It's going to have a 3 year old view on the kind of things these CPUs can do and will act accordingly.
I want a list of atrocities done in your name - Recoil
The benchmark is looking at the ID and making assumptions. These benchmarks run on Windows, so another possibility is that MS does an optimization that few know about. Any of these is plausible. The simplest answer should rule until proven otherwise: bad assumptions made in the OS or the program.
I prefer the "u" in honour as it seems to be missing these days.
If I were an evil fraudster at PCMark, paid by Intel to deliver worse scores to rivals, I would make sure that these rivals had no easy way of uncovering the fraud. Testing for an ID looks much more like bad code paths than like "sneaky fraud".
There is no shortage of alternative quirks that can be used to see whether a given processor belongs to one family or another. Should enough of these quirks be combined, it would be *very* hard to discover an evil-related cause.
Of course, choosing the 'bad' path given an ID may just be blatant enough to provide plausible deniability for the developers that "messed up". However, being a firm proponent of Hanlon's Razor, I would rather call it a bug than a "sponsored feature".
On the other hand, kudos to the guys at Ars who thought of changing the ID and, when the numbers did not add up, made further tests to nail down the argument, instead of just forgetting about the problem and performing a "review as usual", which would doubtless have required less effort. Yay for inquisitive hacker-reviewers.
You've got a point there that is important to the discussion: if the source is closed, how can we know if the test is fair?
if(cpuid == "GenuineIntel")
{
Run_really_fast();
}
else if(cpuid == "AuthenticAMD")
{
Run_not_so_fast();
}
else
{
Run_slow();
}
That's a pretty good analogy.
If Futuremark is indeed enabling CPU features based upon the CPUID, then this situation is a lot like the webpages that render incorrectly in Firefox unless the user agent is set to Internet Explorer.
Does it really matter whether the cause was "incredibly sloppy coding" or "Intel bribed them?" Either way, their benchmark cannot be trusted, and trustworthiness is ESSENTIAL for a benchmark. If anyone pays serious attention to this (which, having read TFA, it seems to merit), then FutureMark is toast.
"My strength is as the strength of ten men, for I am wired to the eyeballs on espresso."
GenuineIntel/5.0 (X11; U; Linux i686; en-US; rv:1.9.0.1) Gecko/2008072820 Firefox/3.0.1
I can already feel the speed!
The Phoronix Test Suite.
It's Linux only, but a CPU that performs better on Linux will perform better on Windows.
I agree with you.
I was wondering if there is some way we can get code audited by the community on a more formal basis, perhaps with a bounty system and a reputation system, so that one might donate to get the KDE4 code audited by me ($10), or some KDE contributor ($300), or Linus Torvalds ($10000). Then these people could develop a formal reputation system, like + or - votes on SourceforgeAuditVoting.org. They'd use their PGP signature to sign the audits.
Or something. I would view this as the next phase of the open source economy. Eventually companies might hire people with good reputations, to audit their own intra-company code.
404555974007725459910684486621289147856453481154 in hex is "You sank my Battleship?"
[GPG key in journal]
I'll give you credit for coming up with a scenario that replaces malice with a heaping dose of incompetence. If what you say is true, then that's not a benchmark at all. After all, you're not comparing the same things; for all you know, you're comparing the skill of the programmer at writing for the VIA processor with the skill of the programmer at writing for the AMD processor.
You might as well write a benchmark to see how long it takes for various processors to divide 4195835.0 by 3145727.0 and come up with 1.333739068902037589! (Note: The correct answer is 1.333820449136241002.)
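The division is easy to try for yourself. This sketch divides the two numbers and compares the result with the two values quoted above; `looks_flawed` and its tolerance are arbitrary choices made for illustration.

```python
CORRECT = 1.333820449136241002  # correct quotient, as quoted above
FLAWED = 1.333739068902037589   # the wrong quotient quoted above


def looks_flawed(quotient, tolerance=1e-9):
    """True if the quotient is further from the correct value than the tolerance."""
    return abs(quotient - CORRECT) > tolerance


quotient = 4195835.0 / 3145727.0
relative_error = abs(FLAWED - CORRECT) / CORRECT

print("this machine:", quotient, "flawed" if looks_flawed(quotient) else "ok")
print("relative error of the wrong answer:", relative_error)
```

The wrong answer is off by about 6 parts in 100,000, which means it already goes wrong in the fifth significant digit.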
404555974007725459910684486621289147856453481154 in hex is "You sank my Battleship?"
[GPG key in journal]
It's not a call, it's an instruction. Are you talking about intercepting it with virtualization? Or are you talking about modifying the benchmark code?
Patrick Doyle
I mod down every jackass who puts his moderation policy in his sig. Oh, wait a sec....