Slashdot Mirror


Making a Fair Gfx Benchmarking Utility?

Moggie68 asks: "Whenever the big two release new GPUs and graphics cards that reach astounding heights in their benchmark scores, the same heated debate about unfair benchmarking utilities arises again. But what about the flip side of the coin? Would it really be that easy to construct a fair benchmarking utility for GPUs and graphics cards? What facts need to be considered? What problems need to be solved?"

8 of 40 comments

  1. Re:Can't be done if driver authors want to skew it by BusterB · · Score: 2, Interesting

    Benchmarkers can always just rename their benchmark programs to something else when testing. Isn't this how a lot of recent driver optimizations were discovered in the first place? How about a benchmark installer that installs a differently named executable every time?
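
    The installer idea can be sketched roughly as follows. This is only an illustration, not an existing tool: the function name `install_with_random_name` and the `app_` prefix are made up here, and it assumes the benchmark is a single file that can simply be copied.

```python
import secrets
import shutil
from pathlib import Path


def install_with_random_name(source: Path, install_dir: Path) -> Path:
    """Copy the benchmark binary into install_dir under a fresh random name.

    Hypothetical helper: the name and the "app_" prefix are assumptions
    made for this sketch, not part of any real installer.
    """
    install_dir.mkdir(parents=True, exist_ok=True)
    # Keep the original extension (e.g. ".exe") so the OS still runs the file.
    random_name = f"app_{secrets.token_hex(8)}{source.suffix}"
    target = install_dir / random_name
    # copy2 copies the contents and also preserves permission bits.
    shutil.copy2(source, target)
    return target
```

    Each call yields a different file name, so a driver that matches on the executable name alone would not recognize the benchmark. The file contents are unchanged, which matters for the objection raised further down the thread.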

  2. Those who do not learn from history are doomed to by Mad+Quacker · · Score: 3, Interesting

    ...repeat it

    Does anyone still care about MIPS, MFLOPS, Dhrystone, Whetstone, or SPEC? Why do we want to rehash history with GPUs?

    If you want a synthetic benchmark, the companies will make their product work well with the benchmark, and little else. When the inevitable happens (as it has with both major players), you should neither get upset nor demand a better benchmark; instead, laugh when someone fronts a synthetic benchmark score.

    So you want to know if a card you are going to buy will work well for a game that is going to come out in 6 months to a year. We'd all like to know the future as well; I'd prefer a crystal ball.

    --
    "I don't know that atheists should be considered citizens, nor should they be considered patriots." George HW Bush
  3. Mutual generation of fair tests by G4from128k · · Score: 3, Interesting

    One possibility is to have each vendor create two test suites -- a suite that the vendor thinks highlights the best performance features of their own system and a suite that highlights the worst performance features of the competitor's system. For two vendors, this results in a total of four test suites (vendor 1's favorites, vendor 1's killer for vendor 2, vendor 2's favorites, vendor 2's killer for vendor 1).

    Then run all four suites on both systems and take normalized averages. The best system can win only by being robust and of overall high performance. With four tests in all, the vendor's own "best foot forward" suite can't overweight the result. And with the other vendor looking for any weaknesses, the downsides of each vendor's system become quite evident.
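
    One way to read "normalized averages" is sketched below. The normalization rule is an assumption, since the comment does not specify one: each suite's scores are divided by the best score in that suite, then averaged per system with a plain arithmetic mean. The suite names and all numbers are invented purely for illustration.

```python
from typing import Dict

# Made-up scores (higher is better); not measurements of any real hardware.
HYPOTHETICAL_RESULTS: Dict[str, Dict[str, float]] = {
    "vendor1_favorites": {"vendor1": 120.0, "vendor2": 90.0},
    "vendor1_killer": {"vendor1": 80.0, "vendor2": 40.0},
    "vendor2_favorites": {"vendor1": 70.0, "vendor2": 100.0},
    "vendor2_killer": {"vendor1": 30.0, "vendor2": 60.0},
}


def normalized_averages(results: Dict[str, Dict[str, float]]) -> Dict[str, float]:
    """Average each system's per-suite score after scaling by the suite's best."""
    totals: Dict[str, float] = {}
    for scores in results.values():
        best = max(scores.values())
        for system, score in scores.items():
            # The best system in a suite gets 1.0; others get a fraction of it.
            totals[system] = totals.get(system, 0.0) + score / best
    suite_count = len(results)
    return {system: total / suite_count for system, total in totals.items()}
```

    Because every suite contributes at most 1.0 to a system's total, no single suite can dominate the final figure, which is the property the comment is after. A geometric mean would be another reasonable choice.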

    Such testing may not produce over-optimized one-application super-stars, but it should lead to well-rounded graphics boards for high performance on a range of graphical display tasks.

    I bet that ATI and NVidia will never go for this approach because it would lead to real head-to-head fair competition as opposed to carefully staged, optimized, marketing-controlled demos.

    --
    Two wrongs don't make a right, but three lefts do.
  4. Re:Can't be done if driver authors want to skew it by molo · · Score: 2, Interesting

    Then the drivers will check an md5sum of the executable... or they'll search for certain signatures within the file... plenty of options. It would be an arms race of sorts. There's no way to guarantee it.
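
    Why renaming alone does not help can be shown with a small sketch. This is not how any actual driver works; `md5_of_file` and `contains_signature` are hypothetical helpers that only illustrate the two detection options mentioned above.

```python
import hashlib
from pathlib import Path


def md5_of_file(path: Path) -> str:
    """Return the MD5 hex digest of a file's contents (the name plays no part)."""
    digest = hashlib.md5()
    with path.open("rb") as handle:
        # Read in chunks so large executables are not loaded all at once.
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def contains_signature(path: Path, signature: bytes) -> bool:
    """Check whether a known byte sequence appears anywhere in the file."""
    return signature in path.read_bytes()
```

    A renamed copy has the same digest as the original, so a hash check sees through the rename. Changing even one byte defeats the hash, but a signature search can still match, which is where the arms race starts.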

    -molo

    --
    Using your sig line to advertise for friends is lame.
  5. One thing most benchmark folk miss by TheLink · · Score: 2, Interesting

    Those typical office/desktop benchmarks aren't real world.

    Why? Coz they don't have antivirus software running in the background. AV software running in the background could change results significantly.

    In most offices, the desktop PCs have AV software installed. If they don't have AV software installed, they usually have worms and viruses and those tend to take up more CPU.

    That's real world.

    Which AV software to use in the benchmark is one question that they may not want to deal with ;).

    But, hey, doesn't anyone want to know whether AV+apps works better with or without Hyperthreading enabled etc? Whether it works better with Athlons or P4s?

    Oh well..

  6. OK, So here's what we do: by Rick+the+Red · · Score: 2, Interesting

    We take a bunch of gamers and group them by what video card they own. We give each of them the test board. After one month we take away the test board and give them their old one back. The benchmark is: How many out of 10 owners of board X would buy the test board? Because that's what you really want to know, right? And who better to tell you this than people who own the same board you do?
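
    The tally behind that "out of 10" figure could look like the sketch below. The response format (pairs of current board and a yes/no answer) and the function name are assumptions made for illustration only.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def would_buy_per_ten(responses: Iterable[Tuple[str, bool]]) -> Dict[str, float]:
    """Scale each owner group's 'would buy' share to a score out of 10.

    responses: (current_board, would_buy) pairs, one per participant.
    """
    asked = defaultdict(int)
    said_yes = defaultdict(int)
    for current_board, would_buy in responses:
        asked[current_board] += 1
        if would_buy:
            said_yes[current_board] += 1
    return {board: 10 * said_yes[board] / asked[board] for board in asked}
```

    Grouping by the board people already own keeps the answer relevant to a reader with the same hardware, which is the point of the proposal.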

    --
    If all this should have a reason, we would be the last to know.
  7. Worms (no, not the game) by yerricde · · Score: 2, Interesting

    Writing drivers that will survive running malicious code takes time away from addressing other programming issues, and the thing is that no one except your competitor is writing that kind of code into their app.

    What if somebody finds a way to break Windows through a video driver bug? What if somebody puts that exploit into the next Windows worm?

    The more fundamental problem is that all any kind of test can ever measure is your ability to do well at that test.

    And if that test measures a video card's ability to process OpenGL instructions without bringing down the computer, I'm all for it.

    --
    Will I retire or break 10K?
  8. Re:Cheating 101 by Anonymous Coward · · Score: 1, Interesting

    Yes, that'll work... right up until the drivers decide to drop that huge polygon that was supposed to be part of a mountainside.