FPS Benchmarks No More? New Methods Reveal Deeper GPU Issues
crookedvulture writes "Graphics hardware reviews have long used frames per second to measure performance. The thing is, an awful lot of frames are generated in a single second. Calculating the FPS can mask brief moments of perceptible stuttering that only a closer inspection of individual frame times can quantify. This article explores the subject in much greater detail. Along the way, it also effectively illustrates the 'micro-stuttering' attributed to multi-GPU solutions like SLI and CrossFire. AMD and Nvidia both concede that stuttering is a real problem for modern graphics hardware, and benchmarking methods may need to change to properly take it into account."
Why not report min, max, and average time per frame alongside FPS?
Twelve pages of graphs and data? Couldn't he just have said "standard deviations and percentiles" and be done with it?
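For anyone wondering what "standard deviations and percentiles" would look like in practice, here is a minimal sketch using only the Python standard library. The function name `summarize_frame_times` and the sample trace are made up for illustration; they are not from the article.

```python
import statistics


def summarize_frame_times(frame_times_ms):
    """Summarize a list of per-frame times given in milliseconds."""
    ordered = sorted(frame_times_ms)
    mean_ms = statistics.fmean(ordered)
    # quantiles(n=100) returns the 99 cut points for the 1st..99th percentiles.
    cuts = statistics.quantiles(ordered, n=100, method="inclusive")
    return {
        "avg_fps": 1000.0 / mean_ms,
        "mean_ms": mean_ms,
        "stdev_ms": statistics.pstdev(ordered),
        "p50_ms": cuts[49],
        "p99_ms": cuts[98],
        "max_ms": ordered[-1],
    }


# Made-up trace: 95 smooth frames plus 5 stutters.
sample = [16.7] * 95 + [50.0] * 5
summary = summarize_frame_times(sample)
for name, value in summary.items():
    print(f"{name}: {value:.2f}")
```

With this trace the average still works out to more than 54 FPS, while the 99th percentile frame time exposes the 50 ms stutters that the average hides.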
Our eyes detect 'deltas' better than 'speeds', so if the odd-numbered frames have a shorter delay than the others, our eyes will detect it. But this only affects setups with multiple GPUs, and it is easy to fix: just calculate the delta of the latest frame and force the same delta, maybe using a buffer. This is not a problem once it has been detected. It may need some minor changes to engines, but that's all. IMHO.
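A rough sketch of that "force the same delta" idea, simulated on timestamps rather than a real swap chain. Everything here is an assumption for illustration: the function `pace_frames`, the four-frame averaging window, and the alternating 5 ms / 28 ms pattern are invented, and no driver is claimed to work this way.

```python
from collections import deque


def pace_frames(ready_times_ms, window=4):
    """Return presentation times for frames that finished rendering at ready_times_ms.

    Each frame is held until the average of the last `window` render-to-render
    deltas has passed since the previous presentation. A frame is never shown
    before it is ready.
    """
    if not ready_times_ms:
        return []
    recent_deltas = deque(maxlen=window)
    presented = [ready_times_ms[0]]
    for prev_ready, ready in zip(ready_times_ms, ready_times_ms[1:]):
        recent_deltas.append(ready - prev_ready)
        target = sum(recent_deltas) / len(recent_deltas)
        presented.append(max(ready, presented[-1] + target))
    return presented


def intervals(times):
    """Differences between consecutive timestamps."""
    return [b - a for a, b in zip(times, times[1:])]


# Made-up alternate-frame-rendering pattern: deltas alternate 5 ms / 28 ms.
ready = [0, 5, 33, 38, 66, 71, 99, 104]
paced = pace_frames(ready)
print("raw intervals:  ", intervals(ready))
print("paced intervals:", [round(x, 1) for x in intervals(paced)])
```

The catch is that holding a finished frame back adds latency, so the smoothing is not free.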
-Woof woof woof!
tldr; benchmarks ignore consistency in their measurements and are therefore nonscientific marketing devices.
The dangers of knowledge trigger emotional distress in human beings.
After all, your average geek tends to know that movies happen at 24 FPS
Movies happen at a motion-blurred 24 fps. Video games could use an accumulation buffer (or whatever they call it in newer versions of OpenGL and Direct3D) to simulate motion blur, but I don't know if any do.
and television at 30 FPS
Due to interlacing, TV is either 24 fps, when a show is filmed and telecined, or a hybrid between 30 and 60 fps, when a show is shot live or on video. Interlaced video can be thought of as having two frame rates in a single image: parts in motion run at 60 fps and half vertical resolution, while parts not in motion run at 30 fps and full resolution. It's up to the deinterlacer in the receiver's DSP to find which parts are which using various field-to-field correlation heuristics.
and any PC gamer who has done any tuning probably has a sense of how different frame rates "feel" in action.
Because of the lack of motion blur, 24 game fps doesn't feel like 24 movie fps. And because of interlacing, TV feels a lot more like 60 game fps than 30.
This will certainly make benchmarking a bit more complex. One hopes that the gamers like going back to stats class.
You'll need the FPS value, as before (ideally with a worst-case FPS reported), but you'll also want a measure of the deviation of every frame's draw time from the average draw time being reported. And likely a measure of how atypically bad frames are distributed (i.e., 5 seconds of super-low framerate during some sort of loading is annoying. Twenty 25-millisecond frames scattered throughout action-heavy areas is really annoying...)
It would also be interesting to see what this does to the (traditionally poor) reputation of the sucker-edition cards that get loaded up with relatively huge amounts of slow memory in order to make them seem like a good deal (i.e., if 2GB of GDDR5 is the lunatic fringe, and 512MB of GDDR5 is the solid-value-gamer special, you'll see cards with 1GB of DDR2/3, marketed to the unsuspecting as alternatives to the solid-value line. Their average framerates are usually pretty tepid, because DDR is slow; but they honestly do have a lot of it, so they needn't hit the PCIe bus to load something from system RAM as often...)
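On the question of how the bad frames are distributed, one possible measure is the length of each run of consecutive slow frames. This is only a sketch: the function name `slow_frame_runs`, the 25 ms threshold, and both traces are assumptions chosen to mirror the two cases described above.

```python
from itertools import groupby


def slow_frame_runs(frame_times_ms, threshold_ms=25.0):
    """Return the length of each run of consecutive frames at or above threshold_ms."""
    runs = []
    for is_slow, group in groupby(frame_times_ms, key=lambda t: t >= threshold_ms):
        if is_slow:
            runs.append(sum(1 for _ in group))
    return runs


# One long hitch (25 frames of 200 ms is 5 seconds, e.g. loading)...
loading = [16.0] * 50 + [200.0] * 25 + [16.0] * 50
# ...versus twenty isolated 25 ms frames scattered through the action.
scattered = ([16.0] * 5 + [25.0]) * 20

print("loading runs:  ", slow_frame_runs(loading))
print("scattered runs:", slow_frame_runs(scattered))
```

One long run and many runs of length one look very different in this output, even though an average FPS figure would not tell them apart so clearly.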
The point of doing an FPS benchmark is to reveal how the graphics card performs in the games that most people play. People don't care about the theoretical performance. They just want to know if it can run the new cool game with the good graphics. Either a game renders fast enough, or it is so slow that you can't turn on the special effects that make the game look really good. It's all about the game. The other stuff is not so important.
At this point a large use of GPUs seems to be for processes where they are more efficient than CPU. The most obvious is vector processing. If one is doing heavy computational work then the standard benchmarks seem fine. What fraction of the GPU market is for actual graphical use?
less than 60 fps.
kthxbye.
The Cloud - because you don't care if your apps and data are up in the air.
That sort of sounds like the solution presented on page 11: "More intriguing is another possibility Nalasco mentioned: a 'smarter' version of vsync that presumably controls frame flips with an eye toward ensuring a user perception of fluid motion."
Well played Sir. I used my overclocked Radeon HD 3650 to post this (almost as quick) reply...
This tagline was transcoded to result in at least one smirk. If you experience failure to smirk, please consult your Gen
Don't you only need 60 FPS to have the illusion of animation anyway?
On the last page, in the last paragraph, he indicates that all of the data you just read through is shit and probably invalid. Turns out he was measuring the wrong place in the pipeline - before rendering - and what he measured doesn't track with the actual user experience.
I'd like my 5 minutes back, please.
HBI's Law: Frequency of calling others Nazis is directly correlated with the likelihood of the accuser being Communist.
Most of the time, I notice stuttering at the same moment my hard drive lights up. Usually either the game or some background service decides to flood the disk with IO requests. In a few instances I've even had Windows become completely unresponsive until whatever disk operation is running completes. It doesn't matter how much RAM I have. I haven't purchased any SSDs yet but I'm sure they help a lot to alleviate the problem. The question is: is this a fault in how programs or the operating system handle secondary storage? Why should a disk-intensive operation halt the rest of my OS, especially when the entire OS could fit in RAM?
As if the issue of micro-stuttering hasn't already been covered in great detail numerous times in the past. I ran SLI for a while; if it's a problem, then features like vsync can help. If you are only running one GPU like 99.9% of folks out there, then you don't need to waste your time on this article.
FWIW, FPS is still a fine benchmark. Like any benchmark, it only tells part of the story. That's why you use tools like 3dmark that run a battery of benchmarks to aggregate a rating, and then measure actual performance in games/applications. Review sites seem to have caught onto this say.. 15 years ago?
Undersupported, underutilized, full of bugs, and overall just a colossal waste of money. Not to mention how much more heat that means your case has to deal with.
And I mean just that. Displays. Monitors. That $500 Dell UltraSharp with the IPS panel that has 4ms response time. That is a factor in all this, isn't it?
I find the content and discussion very interesting. For me, this was something obvious because of my line of work, but I can imagine that most people reading (and writing) GPU reviews had no clue whatsoever about this.
As much as I find the content interesting, its presentation is awful. Although it is interesting to present some figures on a frame-count basis, most of the overview figures should be on an equivalent time base, allowing a proper comparison of the test sequences. I'd have shown one frame-based graph to explain what was going on and then used this frame-based scale only for the "zooms" illustrating specific features or effects.
Also, the author probably never heard of histograms and/or distributions. Nor of variance, standard deviation, etc.
You should've stuck with nVidia...
I've always wondered about the insistence of various parties on using FPS as The Benchmarking Standard[TM] while ignoring all the things that contribute to or limit the FPS achieved. A real benchmark should at least keep track of CPU usage, GPU usage, bus bandwidth usage (i.e., whether the GPU is idling a lot of the time because the bus can't keep up) and memory bandwidth usage. Then it would be much easier to find bottlenecks and make proper comparisons by ensuring that only the item being benchmarked is causing bottlenecks, and no other part of the equation.
Then again, I am not aware of a single benchmarking suite (or website, for that matter!) actually caring about bus saturation or providing meaningful information, only FPS numbers or some other inflated score to shake e-peens at.
Well played Sir. I used my overclocked Radeon HD 3650 to post this (almost as quick) reply...
by Anonymous Coward on Friday September 09, @10:10AM
by webmistressrachel (903577) on Friday September 09, @10:39AM
19 minutes? Sounds about right.
I'm a good cook. I'm a fantastic eater. - Steven Brust
The only way I have ever been able to test what real performance will be like in a given game or rendering in a given program is to play that game or render in that program. Even built-in benchmarks like in HL2 don't seem to take gameplay into account well enough. While (at best) benchmarks can be a help in deciding what to buy in a very general way, I have learned to be skeptical and trust my experience only. Even framerate monitors in games often don't reflect the smoothness of the experience of the game. Rift would show around 30-40 FPS, WoW would show 75-100, yet Rift would seem to feel far smoother.
"We live as though the world were as it should be, to show it what it can be." - Joss Whedon via Angel
I'm glad somebody started looking at ms per frame instead of frames per second. Since what we really care about for game performance is whether frames are rendered quickly enough to give satisfactory reaction times etc, using frames per second is misleading.
Another example where the same thing happens is fuel consumption: we keep talking about miles per gallon, but what we primarily care about is the fuel consumed in our driving, not the driving we can do on a given amount of fuel, so this is misleading. To use Wikipedia's example, people would be surprised to realize that the move from 15mpg to 19mpg (saving 1.4 gallons per 100 miles) has a much bigger environmental and economic impact than the move from 34mpg to 44mpg (saving 2/3 of a gallon per 100 miles).
Similarly, moving from 24 fps to 32 fps has a bigger impact on the illusion of motion, fluidity, and response times than moving from 40 fps to 60 fps (10.4 ms difference vs 8.3 ms difference in time between frames). I think everyone should have been using ms per frame all along.
(note: yes, I already said this on their forum, I just think it should be repeated here)
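The arithmetic behind those figures can be checked in a few lines. This is just a sketch; the helper names `frame_time_ms`, `gallons_per_100_miles`, and `saving` are made up here.

```python
def frame_time_ms(fps):
    """Time between frames in milliseconds at a given frame rate."""
    return 1000.0 / fps


def gallons_per_100_miles(mpg):
    """Fuel used over 100 miles at a given miles-per-gallon figure."""
    return 100.0 / mpg


def saving(cost, before, after):
    """How much the 'cost' metric drops when the rate goes from before to after."""
    return cost(before) - cost(after)


print(f"24 -> 32 fps: {saving(frame_time_ms, 24, 32):.1f} ms less per frame")
print(f"40 -> 60 fps: {saving(frame_time_ms, 40, 60):.1f} ms less per frame")
print(f"15 -> 19 mpg: {saving(gallons_per_100_miles, 15, 19):.2f} gal saved per 100 miles")
print(f"34 -> 44 mpg: {saving(gallons_per_100_miles, 34, 44):.2f} gal saved per 100 miles")
```

In both cases the "rate" number (fps, mpg) is the reciprocal of the quantity you actually care about, which is why equal-looking jumps at the high end buy less.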
Some very snooty replies here. For me this was one of the most interesting articles I've read all year, and a nice change from all the identical articles I've read about graphics card benchmarking.
He could just provide a histogram of frame times. It would show the number of long frame times, and it would show jitter (the histogram would have two bumps).
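A minimal sketch of such a histogram in plain Python. The 5 ms bin width, the function names, and the alternating 8 ms / 25 ms trace are assumptions picked to show the two-bump shape.

```python
from collections import Counter


def frame_time_histogram(frame_times_ms, bin_width_ms=5.0):
    """Count frames per bin; each key is the lower edge of a bin in ms."""
    counts = Counter(int(t // bin_width_ms) * bin_width_ms for t in frame_times_ms)
    return dict(sorted(counts.items()))


def render_histogram(hist, bin_width_ms=5.0):
    """Draw the histogram as rows of '#' characters, one row per populated bin."""
    rows = []
    for lower, count in hist.items():
        rows.append(f"{lower:5.0f} to {lower + bin_width_ms:<5.0f} ms | {'#' * count}")
    return "\n".join(rows)


# Made-up micro-stutter trace: frame times alternate between 8 ms and 25 ms.
trace = [8.0, 25.0] * 10
hist = frame_time_histogram(trace)
print(render_histogram(hist))
```

A card with even pacing would pile everything into one bin, while the alternating pattern shows up as two separate clusters.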
Yea, or 29 minutes ... depending on if you want correct math or not. Hope that's not what you're using that GPU for :/
Persistent Volume manager for Kubernetes - https://github.com/dwimsey/openshift-pvmanager
They already show Min Frame Rate next to average on Tom's Hardware....
Tom's Hardware posted a similar sort of analysis a few weeks ago: http://www.tomshardware.com/reviews/radeon-geforce-stutter-crossfire,2995.html
if ($question !~ m/bb|[^b]{2}/i) { die(); }
I see someone is still using an old Pentium I. *ducks*
Well, there's spam egg sausage and spam, that's not got much spam in it.
Put two computers side by side, one with a 60Hz display and the other with a 120Hz display. Go to the Windows desktop and drag a window around the screen on each. Wonder in amazement as the 120Hz display produces an easily observable higher fluidity in the animation.
I have a 60Hz monitor and a 120Hz monitor side by side on this computer. There was no perceptible difference in window dragging - looked the same on both monitors. I bet the author just connected his 120Hz on a better GPU than his 60Hz monitor has.
Why would anyone need a framerate faster than the refresh rate of the display you're using?
I've never understood why anyone would push a graphics card faster than the refresh rate of the display they're using. Why not just cap it at the max refresh rate, and let the card take more time rendering each frame.....
It seems as though there should be some sort of "dynamic rendering" option. You want the framerate to match the refresh rate of the monitor, so why can't the rendering engine decide what to spend more or less time on?
For instance, there are the core objects and lights and maps that make up the main scene, then from there there's particle engines, reflections, additional shading, etc. If the card has the capability to do 500 fps, I'd rather it focus on making a REALLY AMAZING 90Hz or 120Hz (or whatever my refresh rate is)....
And the flip side is true as well. If I'm playing a game, I'd rather it keep up with the monitor refresh rate than paint a pretty picture. It doesn't make sense for it to render a beautiful scene while I'm getting whomped on.
The rendering engine for video games should dynamically choose what to render based on what your computer is capable of. All special effects and anti-aliasing and everything should be turned on when it starts up ... and it should scale back the unnecessary items as it can't keep up... and throughout the game one room might have different settings on than another depending on everything going on.
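The scale-back idea could be sketched as a simple feedback loop on frame time. This is a toy model, not how any particular engine does it: the quality levels, the 4 ms per level cost model, and the 80% headroom rule are all assumptions for illustration.

```python
def adjust_quality(level, frame_time_ms, budget_ms, max_level=4, headroom=0.8):
    """Drop one quality level when over budget, raise one when comfortably under."""
    if frame_time_ms > budget_ms and level > 0:
        return level - 1
    if frame_time_ms < headroom * budget_ms and level < max_level:
        return level + 1
    return level


def simulated_frame_time(level, base_ms):
    """Hypothetical cost model: each quality level adds 4 ms to the base cost."""
    return base_ms + 4.0 * level


budget_ms = 1000.0 / 60.0  # target: one frame per 60 Hz refresh
level = 4                  # start with everything turned on
history = []
for base_ms in [6.0] * 5 + [12.0] * 5:  # the scene gets heavier halfway through
    level = adjust_quality(level, simulated_frame_time(level, base_ms), budget_ms)
    history.append(level)

print(history)
```

The headroom band is there so the level does not flip back and forth every frame when the frame time sits right at the budget.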
That was my point... whoosh...
I would much rather see a histogram of the number of consecutive missed frames and the total number of missed frames from each card. That data would capture all of the physical information that's visible to the viewer in a meaningful way. There wouldn't be any confusion if triple buffering hides latency or other ambiguous crap. My last paragraph says how this could be measured.
With vertical sync and fixed screen refresh rate (and especially with triple buffering) there is only one number (ignoring power consumption) that matters: the # of missed frames. If your screen has a 60Hz refresh rate, and the graphics card sends the screen the same frame twice in a row, then your computer just missed a frame. If it sends the same frame 5 times, then it missed 4 frames. And that is the ONLY thing visible to the user! If one frame takes a long time to render, but triple buffering hides the render time such that no frames are skipped, then the user still sees the maximum FPS. Frame times are a detail for the developer, not the user!
I admit, I don't know how to measure skipped frames in software. But why trust software? It would be easy to count missed frames by sending the video output to another computer. The receiving computer can then check if consecutive frames are identical.
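Assuming the capture box hands you one image per display refresh, the counting itself is short. This sketch uses made-up names (`missed_frame_stats`, string labels standing in for captured images) and treats any exact repeat as a miss, which would overcount on a genuinely static scene.

```python
from collections import Counter
from itertools import groupby


def missed_frame_stats(captured_frames):
    """Count repeated frames in a capture with one entry per display refresh.

    Returns (total_missed, histogram), where histogram maps
    'number of consecutive missed frames' to how often that happened.
    """
    run_lengths = [sum(1 for _ in group) for _, group in groupby(captured_frames)]
    misses = [n - 1 for n in run_lengths if n > 1]
    return sum(misses), Counter(misses)


# Labels stand in for captured images (in practice, e.g. the bytes of each frame).
capture = ["A", "B", "B", "C", "C", "C", "C", "C", "D"]
total, histogram = missed_frame_stats(capture)
print("total missed:", total)
print("consecutive-miss histogram:", dict(histogram))
```

Frame "B" shown twice counts as one miss and frame "C" shown five times counts as four, matching the definition above.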
When Black Ops came out I was getting decent FPS but the stuttering made it unplayable. It seems that this game requires a CPU with an onboard memory controller *and* 3 or 4 available cores. If you don't meet both those requirements the engine will stutter just as this article describes. This was a problem for users of Core 2 series CPUs; even Core 2 Quads were inadequate. Activision refuses to acknowledge the severity of this problem to this day.
Note: Not all TV shows are at 50 or 60 Hz. Some TV shows use the progressive encoding scheme that halves framerate.
The two fields of interlace are hijacked to form a single still image. It's a single image built from two passes. Hence the name "progressive".
They sacrifice framerate for the sales pitch. I don't know why though. Possibly to make the freeze-frame look better, dunno.
They sacrifice framerate for the sales pitch. I don't know why though.
TV producers use film to make the show appear more cinematic. Audiences are used to seeing film-like frame rates and other artifacts of film for scripted programming as opposed to video for "reality" programming such as news and sports.
Possibly to make the freeze-frame look better
The freeze-frame doesn't look better when you hit it at the wrong field dominance.
You can sure as hell tell the difference between 24fps and 60fps video. It is real, REAL easy to see. I have an AVCHD cam that will shoot 24p, 30p, or 60p (HDC-TM900 if you are interested) and it is amazing how smooth the 60p video looks. You just don't get that kind of smoothness in movies.
Some people don't like it, they think it looks "too smooth" and "not cinematic". In other words, they've gotten used to the choppiness of 24fps and associate that with movies.
What's more, there's a market in TVs that try to deal with this. You'll find many "120Hz" or "240Hz" TVs on the market. They don't accept input signals of that rate (with extremely few exceptions). What they do is motion upsampling. They try and generate intermediate frames, when you turn that mode on. The result is actually amazingly good. It has artifacting at times and doesn't work on all sources but overall, you get much smoother motion than just displaying the source directly. It's similar to the resolution upsampling many TVs and DVD/Blu-ray players can do for DVDs, just in the time domain instead.
24fps is not "all we need." It was decided to be "good enough" for film. You can real easy see the difference with higher fps stuff, even with video with motion blur. If you get a chance I encourage you to try it out. Find a 60p AVCHD camera, which are not that hard to find these days, and shoot some video in 60p and 24p mode. Then play it back on an HDTV that accepts both inputs (most do and the cameras have HDMI out). You'll be amazed at the silky smoothness of the 60p video. It looks so... real compared to 24fps.