Fixing Steam's User Rating Charts

Posted by Soulskill on Monday October 6, 2014 @02:21PM from the hmm-same-distribution-as-professional-reviews dept.

lars_doucet writes: Steam's new search page lets you sort by "user rating," but the algorithm they're using is broken. For instance, a DLC pack with a single positive review appears above a major game with a 74% score and 15,000+ ratings.

The current "user rating" ranking system seems to divide everything into big semantic buckets ("Overwhelmingly Positive", "Positive", "Mixed", etc.), stack those in order, then sort each bucket's contents by the total number of reviews per game. Given that Steam reviews skew massively positive (about half are "very positive" or higher), this is virtually indistinguishable from a standard "most popular" chart.
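
As a rough illustration of the ranking described above, here is a minimal Python sketch. The bucket cut-offs, the `Game` tuple, the function names and the review counts are all assumptions made up for the example; Steam's real thresholds are not given here.

```python
from typing import List, NamedTuple


class Game(NamedTuple):
    name: str
    positive: int  # number of positive reviews
    total: int     # total number of reviews


# Hypothetical lower cut-offs for the semantic buckets, best bucket first.
# These values are invented for illustration only.
BUCKET_FLOORS = [0.95, 0.80, 0.70, 0.40, 0.0]


def bucket_index(game: Game) -> int:
    """Return the index of the first bucket whose floor the game reaches."""
    share = game.positive / game.total if game.total else 0.0
    for index, floor in enumerate(BUCKET_FLOORS):
        if share >= floor:
            return index
    return len(BUCKET_FLOORS) - 1


def bucket_ranking(games: List[Game]) -> List[Game]:
    """Stack buckets in order, then sort inside each bucket by review count."""
    return sorted(games, key=lambda g: (bucket_index(g), -g.total))


# Illustrative numbers: a 74% game with 15,000 reviews vs. a one-review DLC.
games = [Game("Major game", 11100, 15000), Game("DLC pack", 1, 1)]
for game in bucket_ranking(games):
    print(game.name)
```

Under these assumed cut-offs the one-review DLC lands in the top bucket and is listed first, which is the failure the story describes.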

Luckily, there's a known solution to this problem — use statistical sampling to account for disparate numbers of user reviews, which gives "hidden gems" with statistically significant high positive ratings, but less popularity, a fighting chance against games that are already dominating the charts.

Valve Time by UnknownSoldier · 2014-10-06 14:33 · Score: 4, Insightful

Like all things Valve does on Valve Time, Steam is _slowly_ getting better so I'd imagine this will get fixed ... eventually.
At least we can give a thumbs up or down to games. The ability to write reviews takes advantage of the best kind of marketing: word of mouth.
1. Re:Valve Time by Anonymous Coward · 2014-10-06 21:58 · Score: 5, Insightful
  
  As much as people forget, Valve are not in this to make gaming a better place. They're there to make gobs of money, and have been rather successful at doing so thus far.
  Valve makes gobs of money *because* they make gaming a better place.
  Valve is the sole reason why I'm probably not buying a console this generation, or maybe ever again. Gaming is so much better on PC these days that it just doesn't make sense to lock yourself into the console market anymore. And that's all on Valve.
  So you might be cynical and say Valve only cares about money, but the fact of the matter is that in order to generate that money they need a healthy market. Their interests are aligned with ours.
  
  Considering the sort of talent they hire and have hired in the past, if they truly wanted to fix things, they'd be fixed. If they're not, they either don't consider it important or have a reason for not fixing it.
  Maybe they've just got better things to do. Valve has their hands on so many pies that I'm sure they could double their workforce tomorrow and still keep everyone occupied.
  The new recommendations system is still new, I'm sure it's under close scrutiny and updates will be coming. But as someone else said, it'll happen on Valve Time.
2. Re:Valve Time by Marc_Hawke · 2014-10-07 05:27 · Score: 3, Insightful
  
  Nobody liked Steam when it came out either. There were a lot of things that kept most people away from it:
  1. Always on. This was a problem both for internet connections (which were much flakier back then) and for PC memory usage. Background processes were a gamer's worst nightmare before RAM sizes gained a few extra digits.
  2. "Vaulted Access." People still wanted physical copies. They didn't trust Steam to be around in 5 years and figured they wouldn't have access to their games anymore.
  3. Other things.
  So, Steam was ignored by a lot of people, except for the games that 'forced' them to use it (Valve games: CounterStrike and HL2 mostly). However (and this is the magic Microsoft needs to find), Valve made Steam not suck. People learned to trust it. "Yes" it will be available. "Yes" it will be convenient. "No" it won't hose your experience. And most of all... "Yes" it will be economical.
  Steam was considered draconian, until it proved not to be. And...importantly...it was 'optional' during that testing phase.
  
  --
  --Welcome to the Realm of the Hawke--
Discretionary XKCD by Blaskowicz · 2014-10-06 19:02 · Score: 3, Insightful

Yep there's one about it!
It made me not get very enthusiastic about app stores and such.
Re:Doesn't even look like an algorithm by hey! · 2014-10-07 00:49 · Score: 3, Insightful

It's not an algorithm, except in the trivial sense. It's a formula for calculating an adjusted rating value that discounts extreme ratings for items with small numbers of reviewers.
This actually matches what you do intuitively when you see an item with a single rating of 5.0 at the top of a list, just above another item with an average rating of 4.9 from a thousand users. You mentally deduct a bit from the "top rated" item because you know it's probably too high. Likewise a 1.0 rating from a single user is probably too low, so you mentally add a bit to that.
The question is, how much to deduct from or add to the score?
The approach suggested is to ask a slightly different question. Instead of "what is the average rating of the product", you ask "what percentage of positive ratings can I be 95% certain the product would score at least if *everyone* rated it?" It turns out there are a number of mathematical formulas that are supposed to tell you precisely that.
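
For illustration, one such formula is the lower bound of the Wilson score interval, sketched below in Python. The function name and the default z = 1.96 (the usual two-sided 95% value) are assumptions for the example, not anything Steam is known to use; a strictly one-sided 95% bound would use roughly 1.645 instead.

```python
import math


def wilson_lower_bound(positive: int, total: int, z: float = 1.96) -> float:
    """Lower bound of the Wilson score interval for a share of positive ratings.

    z is the normal quantile for the chosen confidence level (assumed 1.96).
    """
    if total == 0:
        return 0.0  # no ratings at all: nothing can be claimed
    p_hat = positive / total
    z2 = z * z
    centre = p_hat + z2 / (2 * total)
    margin = z * math.sqrt((p_hat * (1 - p_hat) + z2 / (4 * total)) / total)
    return (centre - margin) / (1 + z2 / total)


# Illustrative numbers only: one positive review vs. 74% of 15,000 reviews.
print(round(wilson_lower_bound(1, 1), 3))
print(round(wilson_lower_bound(11100, 15000), 3))
```

With these inputs the single positive review scores about 0.21, while the 74% item with 15,000 ratings scores about 0.73, so the heavily rated item sorts first.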
There's still a lot of arbitrariness in this approach. Why 95%? I'm reasonably sure that results would be just as intuitively reasonable if we chose 80% instead. But if 95% seems to generate intuitively reasonable results there's no particular reason to monkey with that parameter.
BUT, I think, the level of arbitrariness involved probably means we could choose a simpler approximation than the Wilson interval if we could dream one up. The more familiar Wald interval taught in basic statistics courses is somewhat simpler, but not so much that it's worth worrying about, at least not if you're doing the calculation on a database server which typically has a few CPU cycles to spare.
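
A sketch of the Wald lower bound mentioned above, in Python. The function name and the default z = 1.96 are assumptions for the example.

```python
import math


def wald_lower_bound(positive: int, total: int, z: float = 1.96) -> float:
    """Lower bound of the Wald (normal approximation) interval for a proportion."""
    if total == 0:
        return 0.0  # no ratings at all: nothing can be claimed
    p_hat = positive / total
    margin = z * math.sqrt(p_hat * (1 - p_hat) / total)
    return max(0.0, p_hat - margin)  # clamp so the bound stays a valid share


# Illustrative numbers only: one positive review vs. 74% of 15,000 reviews.
print(wald_lower_bound(1, 1))
print(round(wald_lower_bound(11100, 15000), 3))
```

One thing to watch for: when every rating agrees, the Wald margin is zero, so a single positive review still scores 1.0. For large samples such as the 15,000-rating case the Wald and Wilson bounds are nearly identical.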
If I were to attempt something like this on a massive scale in an environment where CPU cycles were precious, I'd probably devise some kind of simple algebraic scaling formula that tweaked scores toward the mean, depending on the number of ratings. The results wouldn't be quite as good as the Wald or Wilson intervals, but maybe not so much less good that anyone would notice.
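
One possible form of such a scaling formula, sketched in Python: blend each item's observed share of positive ratings with a global mean, weighted by a pseudo-count. The names `site_mean` and `prior_weight` and their default values are hypothetical and would have to be tuned on real data.

```python
def shrunk_score(positive: int, total: int,
                 site_mean: float = 0.7, prior_weight: float = 20.0) -> float:
    """Pull an item's positive share toward site_mean.

    prior_weight acts like that many imaginary ratings at the site-wide mean,
    so items with few real ratings stay close to the mean (both assumed values).
    """
    return (positive + prior_weight * site_mean) / (total + prior_weight)


# Illustrative numbers only: one positive review vs. 74% of 15,000 reviews.
print(round(shrunk_score(1, 1), 3))
print(round(shrunk_score(11100, 15000), 3))
```

With these assumed defaults the single positive review comes out near 0.71 and the 74% item near 0.74, using only one addition, one multiplication and one division per item beyond the raw counts.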

--
Post may contain irony: discontinue use if experiencing mood swings, nausea or elevated blood pressure.