PCMark Memory Benchmark Favors GenuineIntel
javy_tahu writes "A review by Ars Technica disclosed that PCMark 2005 Memory benchmark favors GenuineIntel CPUID. A VIA Nano CPU has had its CPUID changed from the original VIA to fake GenuineAMD and GenuineIntel. An improvement of, respectively, 10% and 47% of the score was seen. The reasons of this behavior of FutureMark product are not yet known."
The reasons of this behavior of FutureMark product are not yet known
Easy. Intel paid them to make it that way.
"When life gives you lemons, don't make lemonade. Make life take the lemons back!" -- Cave Johnson
I'm a GenuineIntel, mod me 47% higher!
No, but I did throw granola at a deaf person once
Intel is faster. The commercials say so, and commercials don't lie.
HA! As a GenuineSlashdot post, I should get even better than that!
Seems obvious, but follow the money trail, does PCMark get backing from Intel?
So rise up, all ye lost ones, as one, we'll claw the clouds.
A VIA Nano CPU has had its CPUID changed from the original VIA to fake GenuineAMD and GenuineIntel. An improvement of, respectively, 10% and 47% of the score was seen.
It sounds to me like this could possibly be explained by some kind of conditional optimization that the compiler puts in for various chips, to take advantage of differences in their designs that can improve performance.
Then again, probably not.
Is this like changing the user agent in a browser?
This definitely requires clarification from the creator of the benchmark.
It is possible that the benchmark uses the CPUID to change how the benchmark works, for example, to work around known flaws in a given chip. If this is the case, then the problem is not "omyghoshitplaysfavorites" but rather lack of full disclosure that the benchmarks are not directly comparable across different chips. In the most benign scenario, this could be someone at the benchmark creator's shop forgetting to tell the documentation team. This is still a very serious issue, but it's not fraud.
Knowledge is how to play a game, intelligence is how to win, wisdom is knowing what game to play.
Could it be that FutureMark uses the GenuineIntel and AMD flags to enable processor-specific extensions, and then does a whole bunch of math with those extensions and never bothers to check the result?
This would indicate some really terrible code on FutureMark's part, and VIA should be flagging those op-codes as illegal op-codes, but it might be possible that something like this could happen. It is even possible that the CPUID checks are duplicated in some library somewhere that actually gets the code sequence right, and the main FutureMark code disables the advanced functions of the library whenever the GenuineIntel and AMD flags are missing. Thus FutureMark may feature both code sequences that work and those that don't, and the resulting incompatibilities are what cause the issues.
Why would you even consider running a benchmark program you don't have source code for and cannot compile yourself? (If you are worried about random compiler differences messing up the results, you can check an MD5 sum of the final binary against the published one, but it is important that you can reproduce the binary from source and you can read the source to find out what it does.)
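For what it's worth, the checksum comparison could look something like the Python sketch below. The function names are made up for illustration, the published digest would have to come from whoever ships the benchmark, and it only proves anything if the build is byte-for-byte reproducible, which is an assumption. MD5 is used only because that is the hash named above.

```python
import hashlib

def md5_of_file(path, chunk_size=65536):
    """Return the hex MD5 digest of the file at `path`, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read fixed-size chunks until read() returns an empty bytes object.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def matches_published(path, published_hex):
    """True if the locally built binary hashes to the published digest."""
    return md5_of_file(path) == published_hex.strip().lower()
```

You would build from source, run matches_published() on the result with the vendor's digest, and only then trust that the binary you benchmark with is the code you read.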
If compilers like ICC cripple their code depending on CPUID, that will just lead all manufacturers to set CPUID to GenuineIntel, just as moronic websites (with help from Microsoft) ensured that all browsers call themselves 'Mozilla'.
-- Ed Avis ed@membled.com
Well, PC Mark 2005 is no longer good for testing processors against processors of another maker, i.e. only good for intra-AMD, etc.
Colin Dean Go a year without DRM
Could the difference be that the benchmark program is utilizing additional processor instructions typically found only on those types of processors? The VIA's CPU obviously supports those instructions, but perhaps the typical generic CPU does not.
Better known as 318230.
That should be AuthenticAMD, not GenuineAMD.
But that would be expecting editors to actually, you know, edit.
I will not say anything about possibilities here without my anti-conspiracy-haters-shield online (needs a lot of power), but this is really strange for a benchmark (supposed to be neutral). Well, I do not really expect neutrality from a benchmark with sponsorship (or partnership?) from hardware makers like nVidia.
Religion: The greatest weapon of mass destruction of all time
I think you mean 'AuthenticAMD'.
It's all about money, ain't a damn thing funny.
Need an automatic screenshot taker? Try here.
V+I+A == 224
G+e+n+u+i+n+e == 715;
Genuine+A+M+D == 925
Genuine+I+n+t+e+l == 1223
The bigger the number, the faster the processor. And you get 20% extra when you pass 1000.
Ignore this signature. By order.
There, fixed that.
The only way the maker of PCMark can EVER get their credibility back is if their future releases are open source.
Obama's legacy: (N)othing (S)ecure (A)nywhere and (T)error (S)imulation (A)dministration
It sounds to me like this could possibly be explained by some kind of conditional optimization that the compiler puts in for various chips, to take advantage of differences in their designs that can improve performance.
People are trusting closed-source benchmarks? Well, golly gee, who'd'a thunk there'd be errors, oversights, or shenanigans?
If this was used for anything more than entertainment value, any methodical person would have at least compared multiple closed-source benchmarks. If that proved to be inappropriately favoring a vendor, then, OK, start calling 'conspiracy', but this just sounds like an error in a tool that was never validated.
My God, it's Full of Source!
OUTSIDE_IP=$(dig +short my.ip @outsideip.net)
VIA's is "CentaurHauls"
AMD's is "AuthenticAMD"
Intel's is "GenuineIntel"
There's no "VIA" nor is there "GenuineAMD".
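In case anyone wonders where those 12 characters come from: CPUID leaf 0 returns them packed into EBX, EDX and ECX, four ASCII bytes per register. A rough Python sketch of the packing follows; the helper names are mine, and it only decodes register values you hand it, since Python cannot execute CPUID itself.

```python
import struct

def vendor_string(ebx, edx, ecx):
    """Assemble the 12-character vendor ID from CPUID leaf 0 registers.

    The string is stored in EBX, EDX, ECX (in that order), four ASCII
    bytes per register, least significant byte first.
    """
    return struct.pack("<III", ebx, edx, ecx).decode("ascii")

def vendor_registers(name):
    """Inverse of vendor_string: split a 12-character ID into (ebx, edx, ecx)."""
    if len(name) != 12:
        raise ValueError("vendor ID must be exactly 12 characters")
    return struct.unpack("<III", name.encode("ascii"))

# Register values an Intel part returns for leaf 0.
print(vendor_string(0x756E6547, 0x49656E69, 0x6C65746E))
```

Faking the ID, as Ars did on the Nano, amounts to making the chip return a different trio of register values.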
Clearly PCMark2005 is buggy (at best) and cannot be used to compare different CPU families in this test. At worst it is intentionally flawed, and shouldn't be used at all.
It's a shame that not one VIA Nano review benchmarked the built-in Padlock functionality. Not one OpenSSL benchmark.
This isn't the first time they've been caught doing something "odd" with their code and it likely won't be the last.
That said, keep in mind it's a 3 year old benchmark. Whatever relevance this benchmarking program has today is far more lessened by its age than by any results shown from this research. Don't get me wrong. I'm not defending Futuremark at all. I don't particularly like their suite of benchmarking tools, and not just because of the "odd" results.
How well a platform scores in Futuremark is less relevant than how well it plays your games or movies or compiles your code or rips your movies/CDs. It's my humble belief that a proper benchmark of a system is how well it will perform in the role you want to use the computer.
If I can play GRID at 1920x1200 at the maximum settings possible with playable frame rates I'm happy.
If I can play Crysis at the same resolution and settings, cool.
If AOC runs well at those settings, then I built a nice system.
If Futuremark runs well...so?
Sig Follows: "Suppose you were an idiot. And suppose you were a member of Congress. But I repeat myself." -- Mark Twain
It's a 3+ year old benchmark being let loose on 2008 vintage CPUs and making mistakes in its optimisations. I wouldn't expect anything else. It's going to have a 3 year old view on the kind of things these CPUs can do and will act accordingly.
I want a list of atrocities done in your name - Recoil
The benchmark is looking at the ID and making assumptions. These benchmarks run on Windows, so another possibility is that MS does an optimization that few know about. Any of these are plausible. The simplest answer should rule until proof is shown otherwise: bad assumptions made in the OS or the program.
I prefer the "u" in honour as it seems to be missing these days.
That's why I never use any of that benchmark software. I run my own programs, say the Crysis demo for FPS, or rendering a specific image for render time, then look at results before and after upgrades and make up my own mind.
but AuthenticAMD!
wut is a benchmark
If I were an evil fraudster at PCMark, paid by Intel to deliver worse scores to rivals, I would make sure that these rivals had no easy way of uncovering the fraud. Testing for an ID looks much more like bad code paths than like "sneaky fraud".
There is no shortage of alternative quirks that can be used to see whether a given processor belongs to one family or another. Should enough of these quirks be combined, it would be *very* hard to discover an evil-related cause.
Of course, choosing the 'bad' path given an ID may just be blatant enough to provide plausible deniability for the developers that "messed up". However, being a firm proponent of Hanlon's Razor, I would rather call it a bug than a "sponsored feature".
On the other hand, kudos to the guys at Ars who thought of changing the ID and, when the numbers did not add up, ran further tests to nail down the argument instead of just forgetting about the problem and performing a "review as usual", which would doubtless have required less effort. Yay for inquisitive hacker-reviewers.
Oh...wait....nevermind.
Given Intel's track record involving anti-competitive practices, I have no doubt in my mind that Intel paid off PCMark.
you got a point there which is important to the discussion, if the source is closed, how can we know if the test is fair?
if(cpuid == "GenuineIntel")
{
    Run_really_fast();
}
else if(cpuid == "AuthenticAMD")
{
    Run_not_so_fast();
}
else
{
    Run_slow();
}
... and synthetic benchmarks.
I just hopped over to FutureMark's website and in the community section there's a list of "Most Popular Processors in 3DMark Vantage (Last 7 Days)"
Could be a coincidence but at the moment, they're all Intel.
IIRC, VIA CPUs have been buggy even in their 686 instruction set implementation (the ubuntu-686 kernel crashing on VIA CPUs). There might be something similar here.
Busted!
Shameless plug alert: Game server control panel
Does it really matter whether the cause was "incredibly sloppy coding" or "Intel bribed them?" Either way, their benchmark cannot be trusted, and trustworthiness is ESSENTIAL for a benchmark. If anyone pays serious attention to this (which, having read TFA, it seems to merit), then FutureMark is toast.
"My strength is as the strength of ten men, for I am wired to the eyeballs on espresso."
It's sloppy and doesn't excuse Futuremark, but there is one theoretically "sane" (when viewed under a certain light) explanation for what's been noticed: they took a number of CPUs and measured which memory access instruction had the least latency in itself, for example to decode it, activate the proper CPU paths, etc. So, for example, if an MMX instruction on CPU A took X cycles before even trying to access memory, and SSE took X+n cycles, then for this particular CPU MMX is better than SSE for measuring memory performance. Of course this is really lame, since new CPUs are released constantly, and a little tweak in the hardware or the microcode can invalidate the data they gathered from such tests.
This is probable because when assembler was still popular it was "well known" that certain CPUs perform certain operations faster. For a time, while it was worth it, a good assembler programmer had to know this and insert microoptimizations that depend on CPU type. Unfortunately (or fortunately), those assumptions broke sometime in the late nineties, since a) the number of CPU models on the market became huge and b) even CPUs that were theoretically in the same family started having different characteristics. I remember seeing just this for Athlon and Athlon XP (or maybe even for "early" Athlon XP and its later versions) - it was obvious that assuming anything about the CPU itself without actually measuring it on the spot is useless. A good example of this is in the Linux kernel - the MD (RAID) driver will actually measure (when kernel is booting) which instruction combination for calculating parity (among "plain" instructions, "SSE", "MMX", etc.) is faster and use that one.
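To illustrate the "measure it on the spot" idea, here is a toy Python sketch. It is not the kernel's code (that is C, timing its own XOR routines at boot); the two candidate functions and the pick_fastest helper are invented for the example, and the round count is arbitrary.

```python
import time

def xor_bytewise(a, b):
    """XOR two equal-length byte strings one byte at a time."""
    return bytes(x ^ y for x, y in zip(a, b))

def xor_bigint(a, b):
    """XOR two equal-length byte strings as two large integers."""
    n = int.from_bytes(a, "little") ^ int.from_bytes(b, "little")
    return n.to_bytes(len(a), "little")

def pick_fastest(candidates, a, b, rounds=20):
    """Time each candidate on the running machine and return the quickest."""
    best_fn, best_time = None, float("inf")
    for fn in candidates:
        start = time.perf_counter()
        for _ in range(rounds):
            fn(a, b)
        elapsed = time.perf_counter() - start
        if elapsed < best_time:
            best_fn, best_time = fn, elapsed
    return best_fn

# Two 4 KiB test blocks; the winner is chosen by measurement, not by CPU name.
block_a = bytes(range(256)) * 16
block_b = bytes(reversed(range(256))) * 16
parity = pick_fastest([xor_bytewise, xor_bigint], block_a, block_b)
print("using", parity.__name__)
```

The point is that nothing here looks at a vendor string or a model table, so a new chip gets whichever routine is actually faster on it.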
-- Sig down
and none for 'others'. A simple answer to why.
I've assumed this was the case for some time.
You can actually ask the processor which advanced instruction sets it's capable of using. Enabling/disabling certain features based on the vendor string and not based on what the processor actually claims to support is braindead.
That's like putting diesel fuel in all Volkswagens because some of them support Diesel. And then putting gasoline in a Freightliner because it's not a Volkswagen. (YAY CAR ANALOGY!)
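A rough sketch of what "ask the processor" means, in Python for readability. The bit positions are the documented CPUID leaf 1 feature bits for MMX, SSE, SSE2 and SSE3; the function names and the path labels are made up, and the register values are passed in because Python cannot run CPUID directly.

```python
# Bit positions in the registers returned by CPUID leaf 1.
EDX_BITS = {"mmx": 23, "sse": 25, "sse2": 26}
ECX_BITS = {"sse3": 0}

def decode_features(edx, ecx):
    """Return the set of feature names whose bits are set."""
    found = {name for name, bit in EDX_BITS.items() if (edx >> bit) & 1}
    found |= {name for name, bit in ECX_BITS.items() if (ecx >> bit) & 1}
    return found

def choose_path(features):
    """Pick the best code path from what the chip reports, ignoring the vendor."""
    for name in ("sse3", "sse2", "sse", "mmx"):
        if name in features:
            return name
    return "plain"

# A chip reporting MMX, SSE and SSE2 but not SSE3, whoever made it.
example_edx = (1 << 23) | (1 << 25) | (1 << 26)
print(choose_path(decode_features(example_edx, 0)))
```

Notice the vendor string never enters into it: a VIA, AMD or Intel part that sets the same bits gets the same code path.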
Maxim: People cannot follow directions.
Increases in truth directly with the length of time spent explaining them
Obviously a conspiracy! Where's Ralph Nader?!?!
Here's a perhaps simpler explanation. CPU benchmarks need to parse CPUID output to decide which instructions to use. Most likely, the benchmark had never heard of these VIA CPUs that implement hot new SSE12 (or whatever) instructions; by claiming to be another vendor, the CPU got the benchmark to use a different instruction mix. I don't know for certain that this is what happened, but I'd bet solid money something like this is the story; we've seen analogous performance degradations at VMware when we fiddle with CPUID too aggressively.
Depends on the game, but most games come in demo form, and I suspect that most of the demos can be used to perform some kind of benchmark.
Doom3, for example, has a "timedemo" benchmark, and this runs entirely on levels included in the demo. So unless they explicitly disabled it in the demo version, I think that qualifies.
Can't speak for UT3, though.
Don't thank God, thank a doctor!
What I find hilarious about this is that it shows how hardcore, bare-to-the-metal programmers have to deal with exactly the same stupid issues as web developers.
Because there's a lot of potential here. Via suing Futuremark, Futuremark suing Ars and Intel, obviously, suing everyone.
How do we know other code isn't simply optimized for Intel CPUs? Granted, it's not in the best interest of software makers to do this unless some other incentive is in place.
More fuel for the AMD vs. Intel lawsuit: first Skype, now this. Intel may have a lot more stuff that will give them a black eye when it comes out in court.
That's why I use mostly/only software compiled with GCC; at least I get the performance out of the hardware I bought.
The Phoronix Test Suite.
It's Linux only, but a CPU that performs better on Linux will perform better on Windows.
...the fact that the benchmark was developed and tweaked on an INTEL BOX??
Surely not!
Operation Guillotine is in effect.
The big problem with Future Mark is they have absolutely no credibility or transparency. They list all the major hardware manufacturers as "partners", so how can they possibly be impartial? Their test scores are commonly used to compare different brands, and these numbers command great influence over the market... whoever gets the highest scores is almost guaranteed to outsell their competitors, especially in the high-end segment where buyers are primarily interested in having the fastest product available, and where the high prices result in more time spent researching each purchase.
I also cannot imagine very many people buy the Pro versions of their benchmarking tools, other than major sites and publications that routinely publish detailed benchmark results. Most people are perfectly satisfied with the free one. This means the money has to come from other sources. I know for a fact, I wouldn't bother developing custom game-based benchmarks unless I was making more money than I would making actual games.
-Billco, Fnarg.com
This is why I don't trust any benchmark that a vendor would print on their packaging. I tend to go with benchmarks from sites that run whole suites of tests, including some real-world tests. The problem with this route is of course you don't get a single number to compare which bit of hardware is the fastest.
“Common sense is not so common.” — Voltaire
I'd say, let's use http://phoronix-test-suite.com/. BTW, sorry for the GenuineAMD typo ;D As it is already tagged: I meant AuthenticAMD.
I agree with you.
I was wondering if there is some way we can get code audited by the community on a more formal basis, perhaps with a bounty system and a reputation system, so that one might donate to get the KDE4 code audited by me ($10), or some KDE contributor ($300), or Linus Torvalds ($10000). Then these people could develop a formal reputation system, like + or - votes on SourceforgeAuditVoting.org. They'd use their PGP signature to sign the audits.
Or something. I would view this as the next phase of the open source economy. Eventually companies might hire people with good reputations, to audit their own intra-company code.
404555974007725459910684486621289147856453481154 in hex is "You sank my Battleship?"
[GPG key in journal]
I'll give you credit for coming up with a scenario that replaces malice with a heaping dose of incompetence. If what you say is true, then that's not a benchmark at all. After all, you're not comparing the same things; for all you know, you're comparing the skill of the programmer at writing for the VIA processor with the skill of the programmer at writing for the AMD processor.
You might as well write a benchmark to see how long it takes for various processors to divide 4195835.0 by 3145727.0 and come up with 1.333739068902037589! (Note: The correct answer is 1.333820449136241002.)
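For anyone who wants to check those two numbers, a few lines of Python will do, assuming the machine running them divides correctly. The "flawed" value is simply the one quoted above, not something this code computes.

```python
x, y = 4195835.0, 3145727.0

quotient = x / y                  # what a correct divider returns in double precision
flawed = 1.333739068902037589     # the wrong value quoted above
relative_error = abs(flawed - quotient) / quotient

print(f"{quotient:.15f}")
print(f"relative error of the flawed result: {relative_error:.2e}")
```

The wrong answer is off by roughly 6 parts in 100,000, which is tiny for a game and enormous for a divider.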
404555974007725459910684486621289147856453481154 in hex is "You sank my Battleship?"
[GPG key in journal]
that's why a smart person like me never benches, I just buy AMD every time
It's not a call, it's an instruction. Are you talking about intercepting it with virtualization? Or are you talking about modifying the benchmark code?
Patrick Doyle
I mod down every jackass who puts his moderation policy in his sig. Oh, wait a sec....
Something wrong with SPEC? The suite itself (as a bundle) is wrapped up in some stupid license but the individual programs in it are free. You can benchmark each of them the way SPEC does and end up with a comparable score. You're just not allowed to refer to that score as a SPEC measurement because of the trademark.
It's authenticamd FFS
note: i'm known as plugwash most places but i screwed up registering that here somehow in the past and now can't register
If the code were open, you can bet that every hardware vendor would pore over it for anything that might be unfair to their product. It doesn't prove that it isn't, but you at least know that if there was something obviously anti-AMD in it, AMD would complain. Without the source open, you don't get the benefit of that scrutiny.
10 PRINT CHR$(205.5+RND(1)); : GOTO 10
It's been well known for years now, that Intel's compiler emits code that checks for the GenuineIntel CPUID result and completely disables its generated SSE2/SSE3 code if it's not found.
They have the (flimsy) excuse of only being sure of compatibility with their own chips; the real reason (obviously) is to make them look better in benchmarks than their competitors.
I would guess that when testing the same RAM with the different CPUs, different numbers would come out. So to even out the scores, they corrected by +47% for the slower Intel CPUs.
Trust is the problem. If you have the source code, you don't have to rely on trust.