Intel Caught Cheating In 3DMark Benchmark
EconolineCrush writes "3DMark Vantage developer Futuremark has clear guidelines for what sort of driver optimizations are permitted with its graphics benchmark. Intel's current Windows 7 drivers appear to be in direct violation, offloading the graphics workload onto the CPU to artificially inflate scores for the company's integrated graphics chipsets. The Tech Report lays out the evidence, along with Intel's response, and illustrates that 3DMark scores don't necessarily track with game performance, anyway."
Thanks for telling all of us that the best measure of hardware's performance ingame is... to benchmark it with a game.
A bullet may have your name on it but splash damage is addressed "To whom it may concern."
Intel has cheated, can AMD avoid cheating?
Muchas Gracias, Señor Edward Snowden !
I thought offloading graphics computations to the CPU was the whole *point* of integrated video.
TODO: Something witty here...
Why are we surprised? They are a marketing company!
On the one hand, a mechanism that uses the CPU for some aspects of the graphics process seems perfectly reasonable (whether or not it is a good engineering decision is another matter, and would depend on whether it improves performance under desired workloads, what it does to energy consumption, total system cost, etc.), so I wouldn't blame Intel for that alone.
On the other hand, though, the old "run 3Dmark, then run it again with the executable's name changed" test looks pretty incriminating. Historically, that has been a sign of dodgy benchmark hacks.
In this case, however, TFA indicates that the driver has a list of programs for which it enables these optimizations, which includes 3DMark, but also includes a bunch of games and things. Is that just an extension of dodgy benchmark hacking, taking into account the fact that games are often used for benchmarking? Or is this optimization feature risky in some way (either unstable, or degrades performance) and so only enabled for whitelisted applications?
If the former, Intel is being scummy. If the latter, I'm not so sure. From a theoretical purist standpoint, the idea that graphics drivers would need per-application manual tweaking kind of grosses me out; but, if in fact that is the way the world works, and Intel can make the top N most common applications work better through manual tweaking, I can't really say that that is a bad thing (assuming all the others aren't suffering for it).
I'm shocked shocked shocked, I tell you.
Just look at the pics. Changing the name of the executable changed the results dramatically. The driver is apparently detecting when it's running a 3DMark (or some other specific apps) and switches to some other mode to boost its scores/FPS markings.
"In a 32-bit world, you're a 2-bit user. You've got your own newsgroup, alt.total.loser." -Weird Al
Is 3DMark the benchmark that will give a higher score to a VIA graphics card if the Vendor ID is changed to Nvidia?
Intel fully admits that the integrated chipset graphics aren't that great. They freely admit that they offload rendering to the CPU in some cases. This isn't a secret.
The newest GPUs have 2 billion transistors. Why wouldn't you put them to use? That's the trend anyways, even nVidia is going to release a 3 billion transistor GPU that's able to run general programs. I'm a PC gamer, I could care less if Intel or ATI or nVidia cheat on their benchmarks. In fact they should be encouraged to release hand coded or special drivers to improve performance in specific games.
It's funny that Intel simply creates an INF file and uses that to detect apps and optimize for performance. I mean, if you are detecting a file name and enabling performance optimizations, why not detect the app behaviour itself and make the optimizations generic? Clearly you know the app behaviour and you know the performance optimizations work. This seems to me a case where people were asked to ship it out fast and, instead of taking the time to plug the optimization into the tool, they just made it a hack. A really bad one too!!!
In true White Goodman fashion, cheating is something losers come up with to make them feel better about losing.
Is Intel the 500 lb. gorilla in chipsets? Sure, and they got there by 'cheating.' Which is winning.
Aint capitalism grand?
For all you losers who don't know who the great White Goodman is: http://www.imdb.com/title/tt0364725/
http://www.maxineudall.com/2010/02/should-economists-be-sued-for-malpractice.html
http://www.xbitlabs.com/articles/mainboards/display/amd785g-intelg45_6.html#sect1
Quote:
The obtained numbers are pretty interesting. The thing is that although AMD 785G solution is ahead of Intel in 3DMark06, it falls behind the competitor in 3DMark Vantage suite. It is especially strange keeping in mind that Radeon HD 4200 is considerably more powerful than GMA X4500HD according to formally calculated theoretical performance. However, the fact is undeniable: Intel G45 chipset does produce higher 3DMark Vantage score in Windows 7. By the way, this is only true for the upcoming operating system, because Intel graphics accelerator can't repeat its success in Windows Vista. And it means that we can conclude that this sudden success demonstrated by Intel G45 can only be explained by certain driver optimizations and not the GPU architecture.
It's nothing new that integrated/cheap gpus use the cpu for various things. By itself, this is not cheating. it's just a subpar solution. It's only cheating if the drivers are fudging the settings per-application without telling the user. If they fudge for 3dmark and not for other applications, this might mislead the user's intuition about the gpu's performance elsewhere. The per-game profiling offered in the control panels for ati/nvidia are different because they can be switched off and the user is made aware of them.
That was my first thought, too.
Here's the thing, though: They took 3DMarkVantage.exe and renamed it to 3DMarkVintage.exe, and much of that offloading was dropped. So this isn't a general-purpose optimization, which would make sense -- it's a targeted optimization, aimed at and enabled specifically for a benchmark, in order to get higher scores in said benchmark.
It reminds me of the days when Quake3.exe would give you higher benchmarks, but worse video, than Quack3.exe.
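The rename test described above boils down to running the same binary under two file names and comparing the scores. A minimal sketch in Python, assuming a score where higher is better; the function name, the 5% noise tolerance, and the numbers in the usage note are illustrative assumptions, not figures from the article:

```python
def looks_name_sensitive(score_original, score_renamed, tolerance=0.05):
    """Return True if renaming the executable changed the score by more
    than `tolerance`, measured relative to the original-name score.

    `tolerance` is an assumed allowance for ordinary run-to-run noise.
    """
    if score_original <= 0:
        raise ValueError("score_original must be positive")
    relative_change = abs(score_original - score_renamed) / score_original
    return relative_change > tolerance
```

With made-up scores, looks_name_sensitive(1000, 980) is False (a 2% difference is within the assumed noise), while looks_name_sensitive(1000, 700) is True. A True result only says the driver treats the two names differently; it does not say why.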
Don't thank God, thank a doctor!
The whole idea of a coprocessor is to distribute the work. Oddly, Intel seems to be using the CPU as the graphics coprocessor, instead of the other way around. However, if your task is not CPU constrained, then this actually makes sense. It's weird, and shady not to admit it, but if Intel came out and said "The following executables are not CPU constrained on processor X, therefore we shift graphics back to the CPU for improved performance" everyone would applaud them for being clever.
I want to delete my account but Slashdot doesn't allow it.
Effectively dividing tasks among CPUs is not the issue here. They want to benchmark the GPU and they wanna make sure you don't enable optimizations that are targeted specifically for the benchmark which Intel was doing shamelessly.
Please mod this up; it really is that simple.
It is a miracle that curiosity survives formal education. - Einstein
I'm not defending Intel at all, but...
ATI's done it: http://www.xbitlabs.com/news/video/display/20030526040035.html
NVIDIA's done it: http://www.theinquirer.net/inquirer/news/1048824/nvidia-cheats-3dmark-177
They've probably done it several times in the past with other benchmarking software as well.
They're all dishonest. Don't trust anyone!
I'm seeing a potential other side to this that doesn't seem to be being explored (unless I've missed something) -- if the optimizations are specific to .exes listed in the driver's .inf file, has anyone tried adding other games to the list (or alternately, just renaming another executable to match one in the list)?
It would seem like an interesting turn if the optimizations are generic, but only enabled for games/applications that Intel has spent time testing them on.
You'd think you'd have logic in the GPU that could determine when a certain load was being achieved, certain 3D functionality was being called, etc., and offload some work to a multicore CPU if it was hitting a certain performance threshold (as long as the CPU itself wasn't being pounded...but most games are mainly picking on the GPU and hardly taking full advantage of a quad core CPU or whatever). That makes a degree of sense...using your resources more effectively is a good thing. If that improves your performance scores, well...so what? It measures the fact that your drivers are better than the other card's drivers. That seems like fair play, from a consumer's standpoint. If the competitors can't be bothered to write drivers that work efficiently, that's their problem. Great card + bad drivers = bad investment, as far as I'm concerned. That's the real point of these benchmarking tests, anyway. It's just product marketing.
But trapping a particular binary name to fix the results? That's being dishonest to customers. They're deliberately trying to trick gamers who just look at the 3DMark benchmarks into buying their hardware, but giving them hardware that won't necessarily perform at the expected level of quality. I generally stick up for Intel, having worked there in the past as a contractor and generally liking the company and people...but this is seriously bad form on their behalf. I'm surprised this stuff got through their validation process...I know I'd have probably choked on my coffee laughing if I were on that team and could see this in their driver code.
Hasn't every chipset maker- ever- been busted for fudging benchmark results at some point? Multiple times, usually?
And then they get caught out by the old exe-renaming technique.
Why do they keep trying it? The mind boggles.
I would have thought by now that a standard tool in the benchmarker's repertoire would be one that copied each benchmark exe to a different name and location and launched that, followed by a launch with the default name; and that the more popular benchmarks would have options to tweak the test ordering and methodology slightly to make application profiling difficult.
Ya know, with the PCI bus going on-die for the i5s, it looks like this is just a first step. Next-gen chips will almost all have to have one core dedicated to graphics.
Marketing execs change all the time. Each one says "Hey! I have an idea...." The programmer who is asked to put in the cheat is not wildly enthusiastic about the idea, knows it won't work and does a quick and dirty hack.
If that's the case:
1: Find a normal app.
2: Rename it to Crysis.exe or 3DMarkVantage.exe
3: ???
4: PROFIT.
But they're expected not to get caught. The truth will screw up the inflated stock values. Shareholders get rabid, which makes the lawyers have to work slightly longer than an hour. Just weed out the inferior ones who fail at lying and stealing and cheating like a professional capitalist, and send them off to Radio Shack in Moldova.
The behaviour their driver has in the benchmark is also used in several games... ie Crysis Warhead. RTFA.
So intel will not be as fast to render a bsod as Ati or nvidia?
As an Internet advocate for your obvious lack of a thesaurus, I give you http://thesaurus.reference.com/browse/retarded
Also, I must put in a Princess Bride quote
"you keep using that word. I do not think it means what you think it means".
As you have just done yet again.
3DMark Vantage was never a legit benchmark. Heavily tuned for Intel CPU and nVidia GPU architectures, it never actually meant a damn thing.
Just compare performance of gf285/295 v. radeon 4870/5870 (any review) in 3DMark and in games. In 3DMark Vantage nVidia cards have close to 50% advantage while in real games radeons sometimes score higher.
The statistical anomaly alone is sufficient to dismiss 3DMark Vantage results as outlier.
All hope abandon ye who enter here.
The article isn't loading for me, but: can't they simply measure the amount of CPU used during the benchmark and use that information in the benchmark? I don't think it's basically evil to perform that kind of offloading (except in this case when the rules of 3DMark forbid using empirical data on it to optimize performance; but then again, I would imagine many other pieces of software also get this treatment without bad effects on quality or game experience), but dynamically detecting the situation would definitely be complicated; and it might even sometimes give the wrong answer.
One pretty useful heuristic for this kind of optimization would however be "is the CPU usage high without offloading GPU work to CPU: if so, don't do it". Hey, maybe the drivers could have a 'profiling'-mode, which would perhaps slow the performance but figure out the optimal parameters for running the program.
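The heuristic above can be sketched as a small decision function. This is only an illustration in Python: the function name, the 0.75 threshold, and the `gpu_saturated` input are assumptions made for the example, and nothing here reflects how any real driver decides.

```python
def should_offload(cpu_util_without_offload, gpu_saturated, cpu_busy_threshold=0.75):
    """Decide whether to move some graphics work onto the CPU.

    cpu_util_without_offload: CPU utilisation (0.0 to 1.0) measured while
        the GPU does all the graphics work, e.g. during a profiling run.
    gpu_saturated: True if the GPU is the bottleneck (assumed to be known).
    cpu_busy_threshold: hypothetical cut-off above which the CPU is
        considered too busy to take on extra work.
    """
    if not 0.0 <= cpu_util_without_offload <= 1.0:
        raise ValueError("cpu_util_without_offload must be between 0.0 and 1.0")
    if not gpu_saturated:
        # Nothing to gain if the GPU is keeping up on its own.
        return False
    # "Is the CPU usage high without offloading? If so, don't do it."
    return cpu_util_without_offload < cpu_busy_threshold
```

A decision like this depends on measured load rather than on the executable's file name, so a renamed program would be treated the same way as the original.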
optimizing != benchmark. just optimizing for one thing does NOT mean that you are cheating. cheating is making the world think your card can do something it can't. like what intel did, i.e. unloading the load onto the cpu.
you seem to be a witless fanboi tho, since you havent been able to muster the courage to post with your own userid. so never mind this reply. its kinda wasted on you probably.
Read radical news here
Because of the shared memory nature of built-in graphics, the CPU is demanding memory so it can load up the details of the units coming on screen, but the video card is demanding memory so it can load up the graphics and more memory so it can display the graphics of the units coming on screen.
Genuine video memory is more expensive because it is multi-ported: it can read, write and update the screen and not tread on each others' toes.
System memory isn't.
but the + performance thing is real. you can experience it live, as opposed to benchmarks. whereas the raw 'processing power' intel supposedly sells to people doesnt translate into real world tasks as it should.
im an end user. i care about multimedia, video, games, internet, daily tasks. im not going to run long batches of arithmetic calculations or compile thousands of lines of code. i dont give a flying fuck about what number a cpu has on it - what i care is what i SEE in front of my eyes as performance.
they REDUCED image quality, boosted performance. there isnt ANYthing wrong with that technically. you are yourself allowed to reduce image quality and boost performance through settings on any graphics cards.
it means that they TRADED OFF quality for performance. not showed as if their card was capable of delivering both, through cheating.
please get fucking real.
Puts a spin on "trusted computing", hmm?
I'd put this junk as ultra-proto-AI. Instead of just delivering a processor and even a handy "here's some fun hack settings", the *chip itself* tries to make its own choices!!
"We detected Windows Seven and therefore gave it more of the discretionary processing power to improve performance...
"We detected iTunes.exe and earmarked it as a nonessential process..."
My first Journal Entry ever, in 8 years! http://slashdot.org/journal/365947/aphelion-scifi-fantasy-horror-poetry-webzine
Mod the parent up: what his link shows is that Intel are not keeping it a secret that they offload to the processor; they have a published document saying that they do this for 3DMark as well as other software for the XP and Vista driver. I don't know whether they have yet published a similar document for Win7 driver, but Win7 is not yet on the shelves, so it's a bit hard to criticize them for not disclosing for that.
It's not really cheating is it, if you are open about what you are doing; I think the title and tone on the article is inappropriate.
IMO it's debatable whether this is sensible for a benchmark or not - but it's not something that they've kept secret in the hope of gaming benchmarks - which is what a lot of other commenters seem to think.
I have no relationship to Intel apart from occasionally buying their products. I also buy other brand microprocessors and graphics hardware. I have mod points, but I think it's more important to point out why this comment is important than to mod it up myself.
Both ATI and nVidia have been caught cheating (and by cheating I mean specifically targeting the FutureMark benchmarks to make their products look better than they actually are). The above link is only a single instance. A quick google will net you a good sampling over the last decade or two.
Optimizing a driver for a specific game is not cheating as long as it doesn't affect quality. Optimizing your driver to get inflated scores specifically in a benchmark is cheating.
Well, if the GPU becomes saturated, I could imagine the rest of the load spilling over to the CPU (one or many cores). Obviously the GPU is more efficient at video tasks, but if the video task is priority for the user, why not offload to the CPU as well? Makes sense to me.
If you do that for a benchmark app then you are not really testing (just) the performance of the graphics hardware, so turning on that optimization without disclosing it is probably not really a fair comparison of the hardware. To make it 'fair' you really need to make the benchmark app to be aware of the feature and be able to turn it on or off under software control, or at least know if it is enabled or not. I wonder if similar optimisations could be made to any 3D video driver...
In the real world, if the user wants high graphics performance and there are CPU cores doing nothing then like you said, offloading to them makes perfect sense.
It's only half unfair though. In optimized games like Crysis, Call of Juarez, etc., they get a boost just like 3DMark Vantage shows. In other words, 3DMark's performance is indicative of how those games will perform. However, in any game not specifically mentioned in the drivers, the 3DMark results don't match up with actual games' performance.
As most people have stated, it would be much better if they could do this based on actual performance statistics, rather than just based on the filename. The flip side is that you might be able to get more performance out of other games by simply renaming their files to match one of the listed games, or by adding your game's executable to the list.
The behaviour their driver has in the benchmark is also used in several games... ie Crysis Warhead. RTFA.
The issue is that the driver treats different games differently, based on filename. Some get this boost and some don't. Whether you put 3DMark into the boosted or unboosted category, its results will be indicative of some games and not of others.
"We have engineered intelligence into our 4 series graphics driver such that when a workload saturates graphics engine with pixel and vertex processing, the CPU can assist with DX10 geometry processing to enhance overall performance. 3DMarkVantage is one of those workloads, as are Call of Juarez, Crysis, Lost Planet: Extreme Conditions, and Company of Heroes. We have used similar techniques with DX9 in previous products and drivers. The benefit to users is optimized performance based on best use of the hardware available in the system. Our driver is currently in the certification process with Futuremark and we fully expect it will pass their certification as did our previous DX9 drivers. "
The article confirms that the driver also plays crysis faster if you don't rename it. Maybe 3DMark is obsolete now that drivers are optimizing for individual games.
Two solutions come to mind immediately for this.
First off, here is the offending list of apps:
***
[Enable3DContexts_CTG_AddSwSettings]
HKR,, ~3DMark03.exe, %REG_DWORD%, 1
HKR,, ~3DMark06.exe, %REG_DWORD%, 1
HKR,, ~dreamfall.exe, %REG_DWORD%, 1
HKR,, ~FEAR.exe, %REG_DWORD%, 1
HKR,, ~FEARMP.exe, %REG_DWORD%, 1
HKR,, ~HL2.exe, %REG_DWORD%, 1
HKR,, ~LEGOIndy.exe, %REG_DWORD%, 1
HKR,, ~RelicCOH.exe, %REG_DWORD%, 1
HKR,, ~Sam2.exe, %REG_DWORD%, 1
HKR,, ~SporeApp.exe, %REG_DWORD%, 1
HKR,, ~witcher.exe, %REG_DWORD%, 1
HKR,, ~Wow.exe, %REG_DWORD%, 1
HKR,, ~3DMarkVantage.exe, %REG_DWORD%, 2
HKR,, ~3DMarkVantageCmd.exe, %REG_DWORD%, 2
HKR,, ~CoJ_DX10.exe, %REG_DWORD%, 2
HKR,, ~Crysis.exe, %REG_DWORD%, 2
HKR,, ~RelicCoH.exe, %REG_DWORD%, 2
HKR,, ~UAWEA.exe, %REG_DWORD%, 2
***
As you can see, it's targeted at exactly the standard list of reviewed apps on most sites. The easy solutions to fix this problem would be:
1 - Have the benchmark test check the config files for the video cards and flat out refuse to run tests for which it finds the app's name in the list. But this would probably be quickly hard-coded somewhere in the chipset or drivers to avoid this type of detection, so it's only a temporary fix. The sneakier solution would be to have the benchmark randomly renamed by reviewers before running, or have it randomly rename the program files. Crysis becomes "potato46.exe" or something impossible to catch. A nice random 1000-2000 word file like they use for email spam would be easy to add to the benchmark program (a few dozen K at most) but would create chaos for drivers trying to replicate every combination.
2 - Have review sites randomly pick games that are non standard. This would have the advantage of eventually making that list of optimized games grow to dozens if not hundreds.
Note - I wonder how much difference it would make to add the games you regularly play to the config files? Just add the 20-30 games? Could it be that simple? (I don't see why all games aren't benefiting from these optimizations)
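To check whether a given executable name is on a list like the one above, the entries can be parsed with a few lines of Python. This is a sketch under stated assumptions: it only reads the text format shown in the excerpt, the helper names are made up for the example, and matching names case-insensitively is a guess (the list itself contains both RelicCOH.exe and RelicCoH.exe). The thread does not say what the leading "~" or the values 1 and 2 mean, so the code just records them.

```python
import re

# A few entries copied from the excerpt above (not the full driver INF).
INF_EXCERPT = """\
[Enable3DContexts_CTG_AddSwSettings]
HKR,, ~3DMark06.exe, %REG_DWORD%, 1
HKR,, ~HL2.exe, %REG_DWORD%, 1
HKR,, ~RelicCOH.exe, %REG_DWORD%, 1
HKR,, ~3DMarkVantage.exe, %REG_DWORD%, 2
HKR,, ~Crysis.exe, %REG_DWORD%, 2
HKR,, ~RelicCoH.exe, %REG_DWORD%, 2
"""

# Matches lines of the form: HKR,, ~Name.exe, %REG_DWORD%, N
ENTRY = re.compile(r"^HKR,,\s*~(?P<exe>[^,]+?),\s*%REG_DWORD%,\s*(?P<value>\d+)\s*$")


def parse_entries(text):
    """Return {lowercased exe name: set of integer values seen for it}."""
    entries = {}
    for line in text.splitlines():
        match = ENTRY.match(line.strip())
        if match:
            name = match.group("exe").lower()
            entries.setdefault(name, set()).add(int(match.group("value")))
    return entries


def is_listed(exe_name, entries):
    """True if exe_name appears in the parsed list (case-insensitive)."""
    return exe_name.lower() in entries
```

For example, after entries = parse_entries(INF_EXCERPT), is_listed("3DMarkVantage.exe", entries) is True and is_listed("3DMarkVintage.exe", entries) is False, which matches the rename test the article used. Whether adding a name to the real list changes driver behaviour is exactly the open question in the note above; the sketch does not answer it.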
I hope that new Apple systems don't get stuck with this carp video + a dual core CPU. It's the new iMac, thinner than ever, with an Intel Core i3 CPU and half-as-fast video, starting at $1200. To get a real video card, the starting price is $1800.
Mac mini with the slowest corei3 and 2gb of ram starting at $500-$600.
APPLE IF you plan to pull that carp at least have a real desktop at $800-$1500+.
3DMark is the most ridiculous "benchmark" there has ever been. It gives some meaningless numerical value to a set of various small tests which will rarely correspond to the actual performance of current market games. There is no way to translate 3DMark to, say, Crysis FPS. It might be easier to run some 3DMark and give some meaningless number that only serves to confuse people, but the only thing actually useful to anyone looking into the performance of any system or hardware for gaming is hard, *real* benchmarks of *actual* games that people are going to buy.
I really wish 3DMark would disappear. So many places just give some meaningless 3DMark score for graphics benchmarks on mainstream systems. Laptop Magazine is a particular offender, often just giving some offhand 3DMark score for a reviewed system, with nothing to corroborate it, like a real game's performance numbers running at details suited to the system's hardware. You don't benchmark Crysis on a GMA4500, you benchmark HL2 or something, on medium settings. 3DMark only serves to confuse people about graphics hardware. You can assign any number you want to a system and call it elite3dbenchmark2009Premium or something, but that number is never going to give anyone an idea of a given game's performance on that hardware. The only way to determine that with 3DMark numbers is to go and compare the scores with other hardware, and that takes a lot more work and hunting than just reading an actual performance result for a set of games people are going to buy and play and expect to run decently.
Exactly. If they want to offload GPU processing to the CPUs, then they should do that for ALL programs, not just certain ones in a list.
What if they want to offload GPU processing to the CPUs only for those programs that would benefit from offloading GPU processing to the CPUs, and they don't want the customer support nightmare of letting the end user guess which programs would benefit?
Wouldn't it be trivial to have the benchmark app randomly rename itself every time it runs? It would be far less trivial to optimize for djdusah89efhsl123d.exe...
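A rough sketch of that idea in Python: copy the executable to a randomly named file and run the copy instead. The function name and the 16-character default are assumptions for illustration. A real benchmark may depend on files next to its executable, and a driver could identify a program by something other than its file name, so this only defeats the simple name match discussed in the thread.

```python
import secrets
import shutil
import string
from pathlib import Path


def copy_with_random_name(exe_path, dest_dir, length=16):
    """Copy exe_path into dest_dir under a random name, keeping the extension.

    Returns the path of the new copy. The random stem uses lowercase
    letters and digits, so it is very unlikely to match a fixed list.
    """
    src = Path(exe_path)
    alphabet = string.ascii_lowercase + string.digits
    stem = "".join(secrets.choice(alphabet) for _ in range(length))
    dest = Path(dest_dir) / (stem + src.suffix)
    shutil.copy2(src, dest)  # copies file contents and metadata
    return dest
```

The returned path could then be launched with something like subprocess.run([str(dest)]), and the score compared against a run under the default name.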
They should consider putting together some of their code with a timer. The idea being to run a set sequence that exercises what THEY consider important to the game. Once they start doing that, they will become the owner of benchmarks. More importantly, the chip designers will spend a LOT more time working with them.
I prefer the "u" in honour as it seems to be missing these days.
The "rules" for 3dMark are basically a statement of what drivers they will/will not approve for use with 3dMark.
The driver in question is not approved for 3dMark. Where's the cheating?
If anything, they should provide a GUI to let users enable CPU enhancement for a game that's not listed in the INF by default.
But that just assumes that we really want to judge the performance of the graphics hardware in isolation, as opposed to judging it as part of a larger system that includes (a) the drivers, (b) the graphics library, and (c) the computer.
Suppose one graphics card manufacturer discovers that for most games that people play, there's more bang for the buck to be had in improving the CPU/GPU balancing approach than in increasing the pure GPU power, and invests their efforts accordingly into improving the drivers. Their competitor doesn't realize this, and puts their effort into a purely faster GPU. And assume that in practice, the first company's approach delivers slightly better performance than the second one's, and for a lower price to boot.
Now you run your pure GPU power benchmark, and lo and behold, the second company's card comes out on top, despite the fact that it produces inferior performance. In that case, if you ask me, you really should have designed a more realistic test that simulated the load that would be put on the whole computer system during a game or other actual use of the card.
This scenario is quite possibly hypothetical when it comes to graphics cards, but things like this are already happening in the field of digital camera lens testing. Panasonic and Olympus' recent Micro Four Thirds system is designed to use lenses with quite large geometrical distortion and chromatic aberration, on the premise that these lens defects can be automatically and effectively corrected by the image processing software. The lenses, coupled with the intended software processing of the images, produce images with good resolution and less distortion and chromatic aberration than the older, optically-corrected designs. Some testers, however, have bypassed the intended software processing and attempted to evaluate the performance of the lenses in isolation, and gotten hung up on their "bad quality."
The point is that testing just the card (or just the lens) can be a very unfair comparison when the designers managed to think outside the box and create a system where the performance doesn't just come from the one component, but also from other pieces.
Are you adequate?
Isn't there some way to punish companies for this sort of thing? Advertising false data is illegal; shouldn't this be as well?
IMHO that's something that's missing in their lineup, a mid-range mini-tower. I don't want a mac-mini (underpowered and not-upgradeable). Same goes for iMac (even worse, if say the LCD breaks out of warranty, you can't just swap another monitor). Mac PRO, *way* too expensive. I don't need Xeon processors, 2 optical drive bays and all of that. They could build a single-socket machine (and use a standard CPU, not a Xeon), one optical drive bay, 2 HD bays and 4 RAM slots. 2500$ for a tower is too expensive if you don't need a workstation
I've got better things to do tonight than die.
So now it's Intel corking their drivers - and this is news? Maybe they're just getting in their practice so that when it comes to Laughabee they'll actually try to make it look competitive.
AMD bought ATI in a panic after they couldn't merge with Nvidia (there wasn't room for 2 giant egos in that single company) to combat Intel Core + Larrabee on the same MCM. Now Larrabee is a no-show for months, if not years, to come and it's time for Intel to start panicking. Core i7 may be good, but AMD + ATI on the same die might be the better fit for most people.
"It's the height of ridiculousness to say for those 9 lines you get hundreds of millions."
IDK Dude, but I just Facebook (TM)'d a picture of myself Twitter (TM)'ing this to my BFF on my iPod (TM) Touch.