FutureMark Confirms nVidia's Benchmark Cheating
jlouderb writes "As first reported by ExtremeTech, Futuremark has confirmed that nVidia is cheating on its 3DMark2003 benchmark through eight driver optimizations. The 3D graphics performance war just keeps getting more and more interesting!" See our previous story.
Test with the applications/games people really use, and they can't optimize for them without, well, optimizing for them! If they want to make Quake III faster, great.
You don't base your findings on one benchmark. Whenever I go to a site like tomshardware.com they have several different ways to benchmark. Each card has its own strengths, and if a card has cheated it will show up like that.
Paint.NET, a Free Image Editor, with Source Code Available!
WHAT?? My FX 5800 Leaf Blower only has a range of five feet and not six? I want a refund!
/^[A-Z0-9._%+-]+@[A-Z0-9.-]+\.[A-Z]{2,4}$/i
There are lies, damn lies, and statistics.
I remember when SPEC benchmarking meant something, and companies putting in special routines to make chips seem faster than they were.
That's why "Real world testing" is important. While not always the greatest comparison, it's much better in most cases.
While this isn't a huge surprise, I am happy that there are smart folks out there who spend time to uncover this kind of information. Kudos to you for your efforts!
Videocard Benchmarks are about as believable as the 'World's Best Grampa' award.
-n
http://www.remix.net/
How can a company proceed to do its business while blatantly lying to its customers!!??
Oh wait, my medication just kicked in. It's just business as usual. I will just go on checking my MSN e-mail, while watching MSNBC, drinking my Coke and eating my McDonalds burger.
Never mind.
Wearing pants should always be optional.
So it's quite likely that NVidia was just anticipating optimizations and not outright "cheating."
Calling them optimizations gives what nVidia is trying to do a level of legitimacy which is undeserved. If you read the Futuremark paper, you will see that they are clearly cheating.
It would be as if a CPU manufacturer substituted its own algorithms stealthily in a CPU performance benchmark and only when running that benchmark.
Sure, you get a higher number, but you aren't measuring what the benchmark designer intended to measure.
Thank you for submitting this to Slashdot. With Futuremark slashdotted to death, NOBODY will be able to get the evidence! *maniacal laughter*
This has been done for many years, even the last decade. A good friend of mine works and has worked for almost every major video card company in the business for the last decade. What is his job? Make sure THEIR video card gets the best scores on the latest and greatest video cards.
I am sorry to tell you all, but just because Nvidia was CAUGHT this time, doesn't mean they haven't been "cheating" (by optimizing for a specific benchmark) for the last 6 years.
I would bet every driver release contains code to help out benchmarks and even specific games. Why do you think Nvidia just said with their latest driver release "*Up to 30% faster frame rates (*With Unreal Tournament 2002)"?
It's just that once in a great while someone notices a performance jump TOO big, or just wants some newsworthiness and decides to put out a nice PDF file.
- Jeff
Modesty is one of life's greatest attributes
From what I read in [H]ardOCP's benchmark with Doom 3, it kills nvidia's card. And who cares? Aren't you supposed to optimize your card?
They also have another benchmark here where they compare the 5900 Ultra and the Radeon 9800 Pro. In that article it says that NVIDIA told them not to use 3DMark03. I recommend reading that article.
9th grade, you told me cheaters never make money
well 'pbhtbhtbthbth'
I thought that ATI did the same with their Radeon 8500 drivers 2 years ago, making their Quake 3 scores look better by "cheating". Isn't that just the status quo in the video card manufacturing world?
Our investigations reveal that some drivers from ATI also produce a slightly lower total score on this new build of 3DMark03. The drop in performance on the same test system with a Radeon 9800 Pro using the Catalyst 3.4 drivers is 1.9%. This performance drop is almost entirely due to 8.2% difference in the game test 4 result, which means that the test was also detected and somehow altered by the ATI drivers. We are currently investigating this further.
It's not about cheating... but about how much you cheat.
"Engineers do the work of man, Physicists do the work of God"
http://198.3.92.62/3dmark03_audit_report.pdf Just don't kill me now. ;-)
A test system with GeForceFX 5900 Ultra and the 44.03 drivers gets 5806 3DMarks with
3DMark03 build 320.
The new build 330 of 3DMark03 in which 44.03 drivers cannot identify 3DMark03 or the tests in
that build gets 4679 3DMarks - a 24.1% drop.
Our investigations reveal that some drivers from ATI also produce a slightly lower total score on
this new build of 3DMark03. The drop in performance on the same test system with a Radeon
9800 Pro using the Catalyst 3.4 drivers is 1.9%. This performance drop is almost entirely due to
8.2% difference in the game test 4 result, which means that the test was also detected and
somehow altered by the ATI drivers. We are currently investigating this further.
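As a quick check on the arithmetic in the quoted numbers, here is a minimal sketch. The two scores come from the report excerpt above; the variable names are my own. Worth noting: the quoted 24.1% matches the gap measured against the build 330 score, while the same gap measured against the build 320 score comes out at roughly 19.4%.

```python
# Scores quoted in the report excerpt (GeForceFX 5900 Ultra, 44.03 drivers).
score_build_320 = 5806  # build the driver can identify
score_build_330 = 4679  # build the driver cannot identify

gap = score_build_320 - score_build_330

# Gap relative to the build 330 score (how inflated the old result was).
inflation_pct = 100 * gap / score_build_330
# Gap relative to the build 320 score (how far the score fell).
drop_pct = 100 * gap / score_build_320

print(f"gap: {gap} 3DMarks")
print(f"relative to build 330: {inflation_pct:.1f}%")
print(f"relative to build 320: {drop_pct:.1f}%")
```

Either way you slice it, the gap is far larger than the 1.9% reported for the ATI card.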
We should have a constant for each 3d company that we can multiply their benchmarks against...
...
Maybe nvidia is 0.80 and ATI is 0.90
so then 100fps on a GeForce card is really 80 fps, and it would be 90 on an ATI...
The "optimization" relied on the benchmark camera being on 'rails'. It always shows the exact same angles, and there are some things that the benchmark would have the graphics card render, even though it's impossible for the viewer to see.
HOWEVER, in the development version of 3dmark 2k3, you can take the camera "offroading". When you do that, it becomes apparent that things are being drawn incorrectly -- that there are hard-coded limits that result in the video card doing less work than the program requests.
For those of you whining about how they should use "real life" games for benchmarks, this technique could be applied to anything where the camera path is predetermined. It has nothing to do with 3dmark 2k3 specifically.
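To make the "rails" point concrete, here is a toy sketch, not anything taken from the actual drivers. The scene, the 1-D positions, and both function names are made up for illustration. It contrasts visibility computed from the real camera position with visibility baked in advance for the one position the benchmark is known to use.

```python
# Hypothetical scene: object name -> position on a 1-D line (toy stand-in for 3-D).
SCENE = {"ship": 2.0, "asteroid": 5.0, "station": 9.0}
VIEW_RANGE = 3.0  # the camera sees objects within this distance


def visible_at_runtime(camera_pos):
    """Honest culling: decide visibility from the actual camera position."""
    return {name for name, pos in SCENE.items()
            if abs(pos - camera_pos) <= VIEW_RANGE}


# The shortcut: visibility baked in advance for the known rail position.
RAIL_CAMERA_POS = 4.0
PRECOMPUTED_VISIBLE = visible_at_runtime(RAIL_CAMERA_POS)


def visible_hard_coded(camera_pos):
    """Ignores the real camera and returns the baked answer."""
    return PRECOMPUTED_VISIBLE


on_rails = RAIL_CAMERA_POS
off_rails = 8.0  # the "offroading" camera

print("on rails :", sorted(visible_hard_coded(on_rails)),
      "vs", sorted(visible_at_runtime(on_rails)))
print("off rails:", sorted(visible_hard_coded(off_rails)),
      "vs", sorted(visible_at_runtime(off_rails)))
```

On the rail the two agree, so nobody notices. Off the rail the baked answer drops the station entirely, which is the kind of missing geometry you see once the camera is free.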
No, because ATI did a much worse job of cheating. Nvidia got a 24% boost out of some of the benchmarks while the best ATI could do was a measly 8%. This clearly shows that ATI must cheat harder if they want to keep up with Nvidia.
I read the internet for the articles.
Different graphics cards have different strengths and weaknesses, much more so than in previous years.
E.g. fillrate, vertex manipulation, texture rasterization, shader technology, texture sampling techniques, shadow buffering, etc.
Some cards will be better than others at these tasks, and some games will take advantage of differing ratios of these technologies.
The Unreal engine relies on poly-count and texture resolution, and it looks like the Doom engine will tend to tax shader and multitexture units more than the polygon throughput side of things.
In other words, gfx cards are now so flexible that their abilities in these individual areas must be assessed in isolation depending on your choice of game/engine/technology.
As little as 2 years ago all that mattered was fillrate, and this was essentially what the Direct3D/OpenGL APIs could stress in hardware.
IMO, price seems to be the most useful benchmark for the newest cards.
\\ Mitch
Let me just say that this occurs not just on this test, but on all imaginable tests, as well as all games that are somewhere used as benchmarks. Many of the cheats are hard to detect because they don't break the test in the way that this cheat did. For instance, at some point there was a trick for a test with lots of occlusion to clip (discard) polygons that would eventually be occluded. However, these discarded polygons were actually calculated at run-time and not precomputed, so if you changed the test, it would still work right. For Quake (I or II, can't remember) they had a hack where they wouldn't need to clear the framebuffer. That version of Quake would do a glClear at each frame, which takes some time, and prior to framebuffer compression, there was a hack where you wouldn't need to clear the framebuffer if you swapped the Z-check and only used half of the Z span every frame. That hack's probably been backed out now because with framebuffer compression, you're actually better off doing the glClear each frame.
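For anyone who hasn't seen the framebuffer-clear trick, here is a rough simulation of the idea as I understand it, not driver or game code. The function name, the two-pixel buffer, and the fragment tuples are all invented for the sketch. Even frames use the lower half of the depth range with a "less or equal" test, odd frames use the upper half with a "greater or equal" test, so stale depth values from the previous frame can never win.

```python
def draw_frame(depth, color, fragments, frame_index):
    """Draw (pixel, z, name) fragments without clearing; z is 0.0 (near) to 1.0 (far)."""
    even = frame_index % 2 == 0
    for pixel, z, name in fragments:
        if even:
            stored = 0.5 * z                  # lower half: 0.0 (near) .. 0.5 (far)
            passes = stored <= depth[pixel]   # "less or equal" depth test
        else:
            stored = 1.0 - 0.5 * z            # upper half: 1.0 (near) .. 0.5 (far)
            passes = stored >= depth[pixel]   # "greater or equal" depth test
        if passes:
            depth[pixel] = stored
            color[pixel] = name


depth = [1.0, 1.0]   # cleared once at startup, never again
color = [None, None]

# Frame 0: a near crate in front of a far wall at pixel 0, wall only at pixel 1.
draw_frame(depth, color, [(0, 0.9, "wall"), (0, 0.2, "crate"), (1, 0.9, "wall")], 0)
frame0 = list(color)

# Frame 1: the crate is gone; its stale depth must not hide the new geometry.
draw_frame(depth, color, [(0, 0.3, "door"), (0, 0.9, "wall"), (1, 0.9, "wall")], 1)
frame1 = list(color)

# Frame 2: only the wall is left.
draw_frame(depth, color, [(0, 0.9, "wall"), (1, 0.9, "wall")], 2)
frame2 = list(color)

print(frame0, frame1, frame2)
```

The catch is that it only works if every pixel gets drawn every frame, and it costs you half of the depth precision.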
Anyway, I'm posting this as an AC for obvious reasons.
the pdf for bittorrent
Yeah, here's a mirror of that 760k file - though it won't be up for long, since I've only got 1.9 GB of transfer left for this month.
Be nice and download the zip or the bzip2'd version instead, if you're able.
0x0D 0x0A
nVidia Rep: Just look at how fast Quake III is running!
Reviewer: Sure but why is it just in wireframe?
Trolling is a art,
That's why "Real world testing" is important. While not always the greatest comparison, it's much better in most cases.
:)
"Real world testing" is great, if you're just a gamer. The problem is with independent developers who want to know about the performance of a card. Not only are features important but also how well can the card perform. I put both in consideration when I'm looking for a card. I don't consider current game benchmarks much because those games won't matter in 6 months, or by the time I finally finish my game.
-B
Let's not forget that about 4 months ago Nvidia deemed 3D Mark2003 a poor representation of real world scenarios, so how could they be "Cheating" if they pointed this out beforehand? And what about all the other FX5900 benchmarks where Nvidia had a steady 20 to 30% lead on ATI? This article was posted before the FX cards were released; Nvidia's not trying to "SNEAK" anything by us here.
"The primary goal of any benchmark is to arm the consumer with the right information to make the best possible purchase decision. As the gamers' benchmark, 3DMark 03 must emulate as closely as possible the kind of experience that the gaming enthusiast will expect on their machine. It must exercise graphics hardware in the same manner that consumer games will. The graphics features, rendering paths, and effects must all emulate games, or the consumer will be misinformed and their expectations misguided. 3DMark 03 combines custom artwork with a custom rendering engine that creates a set of demo scenes that, while pretty, have very little to do with actual games. It is much better termed a demo than a benchmark. The examples included in this report illustrate that 3DMark 03 does not represent games, can never be used as a stand-in for games, and should not be used as a gamers' benchmark. The ultimate injury to the consumer of such a benchmark is three-fold. First, of course, the consumer is misguided. A purchase decision based on ineffectual data will lead consumers to wrong conclusions. Second, it causes graphics hardware manufacturers to focus attention and engineering resources on optimizing for artificially fabricated cases that are a-typical of games. Such optimizations generally do nothing to improve real game performance, and provide no benefit to the consumer. Finally, the extra engineering effort focused on such benchmarks reduces the effort available for activities beneficial to consumers--improving the actual gaming experience."
http://www6.tomshardware.com/column/20030219/index.html
I would be amused to see ATI try and sue over this considering that they also appeared to cheat the benchmark on game test 4. I wonder if this is because they weren't able to catch and manipulate any other tests. New benchmark for driver writers: how effectively can the coder cheat the performance benchmarks?
Of course if the article title was, "Everybody cheats on our benchmark!" then that would do more to undermine their benchmark than anything else. Instead they made the focus of the article the fact that NVidia is cheating.
Lasers Controlled Games!
Mirror. Slashdot into oblivion.
According to the article, that's only half the story. I could almost accept it if they were "optimizing" in the sense that, in certain situations, they slightly reduced image quality for a significant gain. That's kind of sketchy, as the card isn't then doing what it's claiming, but you could argue, perhaps, that the tradeoff is worth it. And if this activity were optional, it might be a benefit.
What they're doing here is different, and much worse. They're actually detecting what program is running - whether it is 3D Mark or not. Effectively, what it does is disobey 3DMark, and only 3DMark, when it issues certain commands that would reduce throughput. That has no purpose but to deceive.
So, not only are these not optimizations in that they don't really improve performance, they're not optimizations in that they don't even take effect when you run a program not called 3DMark.
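A bare-bones sketch of what "detecting what program is running" amounts to, under loud assumptions: the executable name, the table, and both functions are hypothetical, and nothing here is taken from the real driver. The point is only that the same request gets different treatment depending on who asks.

```python
def compile_shader(source):
    """Stand-in for the honest path: does exactly what the application asked."""
    return ("full_quality", source)


# Hypothetical table of programs the driver treats specially.
SPECIAL_CASED = {"3dmark03.exe"}


def driver_compile(process_name, source):
    """Sketch of application detection: same request, different behavior."""
    if process_name.lower() in SPECIAL_CASED:
        # Only taken for the benchmark; no other program ever benefits.
        return ("cheaper_substitute", source)
    return compile_shader(source)


print(driver_compile("3DMark03.exe", "water shader"))
print(driver_compile("quake3.exe", "water shader"))
print(driver_compile("renamed.exe", "water shader"))
```

Anything the detection does not recognize falls through to the normal path, which is consistent with the scores dropping on a build the driver cannot identify.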
Quite frankly, I think this could be considered false advertising and nVidia should get in deep shit for this. This is the worst kind of cheating, and quite frankly, this could be what puts nVidia down the Voodoo path. I don't know whether I'll ever buy another of their cards.
-Looking for a job as a materials chemist or multivariat
...Slashdot was to host a BitTorrent of this and similar files for faster, cooperative downloading?
I've said it before, and I'll say it again: doing this would be a win-win situation. It's a pity that the editorial team are too busy playing with MAME/whatever to actually do something of real benefit to the wider community.
"Accept that some days you are the pigeon, and some days you are the statue." - David Brent, Wernham Hogg
I wonder why this driver cheat was discovered by ExtremeTech? If you're a video card manufacturer, wouldn't you have your engineers go over every one of the competition's driver releases with a fine-toothed comb, just hoping to find some kind of cheat? You'd think ATI has better testing facilities and resources than ET.
Certainly any negative publicity for NVidia is good for ATI and vice versa.
I am NOT a man!
I am a free number!
I thought the same thing, until I actually RTFA. This is blatant cheating. Everything looks fine until you take the camera off the rails, and then there are clipping and display problems galore.
Further, the problems change depending on which part of the demo you're in (for instance, the "background not being cleared" bug conveniently only shows up in the part of the space demo where a largely black sky is being displayed, and so no background clear is necessary). This is cheating, plain and simple.
ZFS: because love is never having to say fsck
Just think about this the next time you do a 5MB driver download. How much of that code is specifically for detecting and defeating benchmarks? How many of the cheats are part of the instability problems in your system?
Here is an interesting quote from the article that seems to have been overlooked so far.
"Our investigations reveal that some drivers from ATI also produce a slightly lower total score on this new build of 3DMark03. The drop in performance on the same test system with a Radeon 9800 Pro using the Catalyst 3.4 drivers is 1.9%. This performance drop is almost entirely due to 8.2% difference in the game test 4 result, which means that the test was also detected and somehow altered by the ATI drivers. We are currently investigating this further."
Gasp, what a shock. Everyone seems to be guilty of having cheated on synthetic benchmarks at some time. This has happened before, it will happen again.
Funny, I seem to remember Tom's Hardware being rabidly AMD fanboyish about 1.5 years ago when AMD still had the fastest processor. I'm not saying they aren't biased fanboys, what I'm saying is they're fairweather fans.
Isn't that the definition of a good reviewer? Fans of the current top of the line stuff - damn their history?
To keep it on-topic, I also seem to remember ATI doing the exact same thing nVidia is now doing with quake "optimization" for the 8500 cards... Do a google search for "quake quack"
Case in point...
So, ATI results drop from this new patch too? Doesn't this mean that ATI is also cheating? If so, then how do we know that there aren't more cheats ATI is using, as this new patch is only made to expose the nVidia ones? ATI has access to the developer version of 3DMark, so they could hide their cheats much more efficiently.
ATI had their own cheating debacle a few years back.
Quake 3 vs Quack 3
From the article:
Our investigations reveal that some drivers from ATI also produce a slightly lower total score on this new build of 3DMark03. The drop in performance on the same test system with a Radeon 9800 Pro using the Catalyst 3.4 drivers is 1.9%. This performance drop is almost entirely due to 8.2% difference in the game test 4 result, which means that the test was also detected and somehow altered by the ATI drivers. We are currently investigating this further.
I think it's awesome that Futuremark has come out swinging on this one. NVidia has obviously cheated horribly on these benchmarks. ATI apparently has also taken the low road on these but not as low as NVidia.
NVidia is losing. Their chips and cards are worse than ATI's. What's worse than that, though, is that they are still trying to pretend that it's not the case. They need to seriously sit down and work on their designs but instead they are pissing money away working on cheating on benchmarks. That is a really bad sign for a company. It means management is diverting money away from becoming successful towards appearing to be successful. A mentality like that is disastrous to the real value of a company.
SELL! SELL NOW! Buy again when they have fixed their management and design issues.
Contravertial != Overrated. Reply if you disagree, I'll read it.
set softtabstop=4 shiftwidth=4 expandtab nocp worlddomination
"Who came out with a standard API that ALL manufactures could use without resorting to the arcane obfuscation of OpenGL? That's right, cuntfaces...
It was Microsoft."
Right. All manufacturers... whose hardware works with Windows. I'll take cross-platform compatibility, thank you very much.
Before you might argue that nobody uses OpenGL, what about all those licensees of the Quake 3 engine? And what about all those who will license the Doom 3 engine?
You could not prove to any court that NVidia is using deceit. NVidia improved their driver so that a certain set of operations runs faster. There is nothing deceitful about this.
Even if they were to state on the box that they have the card that performs best on the 3DMark2003 benchmark, it would still be a truthful statement. Logically, it's a flaw of the benchmark that it is able to be exploited.
If there is any deceit involved, it would be if someone were to claim that the result of this one benchmark conclusively proves that the NVidia card is superior.
They've never done you any harm. And except for recent accusations of revenue massaging, they don't lie.
Well, friend. It's time you learn that nothing is sacred. Yes, Virginia, even Coca-Cola lies and squashes people to keep its bottom line intact. Read the sad and infuriating tale of judicial corruption and corporate fraud of Bob Kolody vs. Coca-Cola. I was outraged for days.
If it's for-profit but free, you're not the customer -- you're the product (e.g., the Slashdot Beta's "audience").
Let's not jump on nVidia too harshly for this. Sure, this spectacle seems to have gained a lot more publicity than ATi's own cheating ( link link link ). At least when nVidia cheated in 3DMark, they publicly denounced synthetic benchmarks.
You forgot this part:
"And my doctor thinks this twitch will eventually go away"
Is 3DMark03 a synthetic benchmark or eye candy?
If I remember correctly, some of the people who founded Futuremark had something to do with a demo named "Second Reality", a good old-school demo on 2 discs.
If 3DMark were TRULY a benchmark, it would rely on the kind of code we find in games!! Opts are expected for those... even more for stuff...
What if you told Carmack that the opts he made for Quake and the tweaked OpenGL implementations are just cheats? Surely you remember the 3dfx OGL implementation and the Riva128 drivers...
What if you told people from the 'scene that their demos suck because they don't properly handle Z buffering?
They all rely on tricks (a better word than opts or cheating, from a coder's point of view); even processors rely on those. They're based on user experience, not bogomips or whatever. Page-flipping was improper behavior back when VESA was not yet VESA and the scene called it mode-X; eventually it became best practice. Hard-coding sprites in asm was the same, and most 2D shooters are based on that.
I'm pretty sure people at Futuremark include some kind of sleazy code in their bench, as coders always do.
The only difference between cheating and proper optimization is PR. If nvidia had told us "wow! we made an optimization that runs 3DMark faster", as it would with a game, no one would complain.
It's just that for a lot of us 3DMark is supposedly an untouchable thing. It's not. It should reflect real-world 3D, and in real life you expect that kind of code workaround.
Then I ask myself a question... why doesn't Futuremark freely distribute a playable bench?
Why put us in front of a demo claiming it's a synthetic bench, and then why aren't we believing it?
Because it's a lie. Either they're real-world gaming and tricks are OK, or they're pure demos and tricks are not an option.
Dear nvidia / ATI / etc.,
Please optimize your drivers and hardware for the actual applications and games I run, not the synthetic benchmarks designed to simulate workloads. Benchmarks don't use your products, end-users do.
Hell I'd read the articles if they weren't slashdotted 90% of the time, such as the current story.
"A good conspiracy is an unprovable one." -Conspiracy Theory
You have just described an optimization, not a cheat. The point of cheats is that they take advantage of knowledge that's not available to normal processes. If your "cheat" takes no such advantage (e.g. calculating its shortcuts at runtime based only on the actual rendering data) then it's actually an optimization.
I do live television special effects for sports, and while I care a -GREAT DEAL- about the performance of the graphics cards (If I screw up, millions of people see it.. right away), we have small enough volume (you only need one system to make graphics for millions of people), that ATI and NVidia don't just hand out their cards to us.
Would I prefer it that way? (Who doesn't like free goodies??) Heck yes! I'd like to get the latest card and evaluate its robustness (Very important to television...) right away so that I can qualify it for use in our systems only a few weeks after it comes out instead of months.
On that note, I'm also constrained by lack of support for Linux on the latest cards (at times). For example, the 9800 doesn't yet have an accelerated linux driver. Dangit! Now, I love the 9700 pro, but I'd love to have that 256 meg on-card.. It is amazing how quickly you can eat up texture memory when you're doing things the card manufacturers didn't think of (like chroma-keying, video mapping, interlaced frame rendering, blah blah)
Does the DMCA make this type of exposure illegal?
I'd go on a Vegan diet but the delivery time from Vega is too long. --brownkitty
"Real World Testing" in general means that they're testing the card on games that are out on the shelf, finished products, right now; i.e. games which were targeted at video cards years old. In other words, one card does 150fps at the highest quality settings, another does 155fps, and when both of them are run on my 80hz refreshing monitor, the results are exactly the same.
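The refresh-rate point can be put in one line of arithmetic. This is a minimal sketch that assumes vsync is on, so the screen never shows more frames than the monitor refreshes; the function name is mine.

```python
def displayed_fps(rendered_fps, refresh_hz):
    """Frames per second that actually reach the screen, assuming vsync is on."""
    return min(rendered_fps, refresh_hz)


REFRESH_HZ = 80  # the monitor from the comment above

card_a = displayed_fps(150, REFRESH_HZ)
card_b = displayed_fps(155, REFRESH_HZ)
print(card_a, card_b)
```

Both cards land on the same number; the difference only starts to matter once a game drags one of them below the refresh rate.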
Instead, I want testing that approximates the sorts of games that I'll want to buy years from now. Unfortunately those games don't exist yet. In lieu of those games existing, I can look at these eye candy benchmarks to get some idea of what the performance of video cards will be once they're pushed to their limits. How many polygons or how many dynamic lights can programmers squeeze into a scene before the frame rate drops to something unacceptable? How fast can the card whip through those pixel shader programs that everybody is going to be rendering fur and metal and such with soon? That's what these sorts of benchmarks are supposed to do: tell me how my prospective new purchase will perform on games in the future.
Beyond3d is reporting on the ATI part of this issue.
ATI's official statement:
The 1.9% performance gain comes from optimization of the two DX9 shaders (water and sky) in Game Test 4. We render the scene exactly as intended by Futuremark, in full-precision floating point. Our shaders are mathematically and functionally identical to Futuremark's and there are no visual artifacts; we simply shuffle instructions to take advantage of our architecture. These are exactly the sort of optimizations that work in games to improve frame rates without reducing image quality and as such, are a realistic approach to a benchmark intended to measure in-game performance. However, we recognize that these can be used by some people to call into question the legitimacy of benchmark results, and so we are removing them from our driver as soon as is physically possible. We expect them to be gone by the next release of CATALYST.
NVidia immediately put out a rebuttal to these claims, and I'm not sure why they weren't reported along with this article. But, I guess I really can't say that I'm not used to biased or ignorant reporting from slashdot.
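To show what "shuffle instructions" can mean while staying mathematically identical, here is a toy register machine. It is purely illustrative: the instruction format, register names, and inputs are invented, and it says nothing about what ATI's driver actually does. Two instructions that do not depend on each other are swapped, and the result is unchanged.

```python
def run(program, inputs):
    """Execute a list of (dest, op, a, b) instructions over a register dict."""
    regs = dict(inputs)
    for dest, op, a, b in program:
        if op == "mul":
            regs[dest] = regs[a] * regs[b]
        elif op == "add":
            regs[dest] = regs[a] + regs[b]
    return regs["out"]


original = [
    ("t0", "mul", "x", "y"),
    ("t1", "mul", "z", "w"),
    ("out", "add", "t0", "t1"),
]

# t0 and t1 do not depend on each other, so their order can be swapped.
shuffled = [original[1], original[0], original[2]]

inputs = {"x": 0.5, "y": 0.25, "z": 2.0, "w": 0.125}

result_original = run(original, inputs)
result_shuffled = run(shuffled, inputs)
print(result_original, result_shuffled)
```

Whether a reordering like this is fair game in a benchmark is the policy question; the sketch only shows that it need not change the output.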
From Bluesnews (from an unlinked CNet article):
"Recently, there have been questions and some confusion regarding 3DMark 03 results obtained with certain Nvidia" products, Futuremark said in the statement. "We have now established that Nvidia's Detonator FX drivers contain certain detection mechanisms that cause an artificially high score when using 3DMark 03."
A representative at Nvidia questioned the validity of Futuremark's conclusions. "Since Nvidia is not part of the Futuremark beta program (a program which costs of hundreds of thousands of dollars to participate in), we do not get a chance to work with Futuremark on writing the shaders like we would with a real applications developer," the representative said. "We don't know what they did, but it looks like they have intentionally tried to create a scenario that makes our products look bad."
I know more than you drink.
There isn't really much difference between a "cheat" and a true optimization. As long as the "cheated" driver produces acceptable results, and produces them faster, I don't see what the problem is.
Some of the cheats potentially reduce image quality, but we are talking about OpenGL and DirectX here - nobody really aims for 100% visual quality, and indeed there is no target to shoot for since neither standard specifies "correct" rendering down to the pixel level.
You might complain that 3DMark is being treated specially, that other software wouldn't receive the same speedups. That is true. But application-specific optimization has a long history. Just look at Windows - the more recent versions detect and flag certain programs that are known to break or run slowly due to compatibility issues. Nobody says Windows is "cheating" because it refuses to install a driver that its internal knowledge base knows will trash your system. In the CAD world, video card makers almost always tweak drivers to support specific CAD and 3D applications. (3DLabs' control panel used to have a box where you could select "optimize for AutoCAD/3D Studio/Maya/etc...")
ATI should be happy that NVIDIA engineers are wasting time fixing specific benchmarks when they could instead be improving performance in general. But I wouldn't read much more than that into this.
Making your buying decision based on a synthetic benchmark, rather than in-context with your intended application, is always going to distort the picture. (Looking at SPEC benchmarks, Itanium blows the competition away - just tell that to the millions of people who are *not* buying IA64 chips!)
If you, the OpenGL developer, end up writing the next wildly-successful game, I'm sure NVIDIA will be happy to tweak their drivers for it.