Slashdot Mirror


MetaFuture Talks Review Inflation

MetaFuture, a game journalism analysis site, has recently refocused on review scores from the big gaming sites. The author takes an interesting approach, taking a look at Gamespot's review spread and IGN's tendencies. Unsurprisingly, both sites tend towards the 7 to 9 range, making it debatable whether their numbers are actually useful. The site's eventual goal is to normalize the review scores from the major sites, and actually make them useful. From the article: "Games will still get an average score from all contributing reviews. But a site's contribution to that average will depend on that site's own individual normal curve-- with the immediate left and right of the bell's tip signifying three stars on a scale of one to five. Watch the drama as the biggest sweethearts see their 8.4 score for Gun and Car IV get pegged as three stars." This is the reason Slashdot videogame reviews don't have numbers anymore.
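The quoted scheme can be sketched roughly as follows. This is only one reading of it, not MetaFuture's actual method: it assumes each site's scores are summarized by their mean and standard deviation, that the site mean maps to three stars, and that one standard deviation is worth one star. The function name `to_stars` and the score history are made up for illustration.

```python
from statistics import mean, pstdev

def to_stars(score, site_scores):
    """Map a raw score onto 1-5 stars using the site's own score spread.

    Assumption (not from the article): the site mean is 3 stars and
    each standard deviation above or below it adds or removes one star.
    """
    mu = mean(site_scores)
    sigma = pstdev(site_scores)
    if sigma == 0:
        return 3.0  # every score identical, so everything is "average"
    z = (score - mu) / sigma
    # Clamp to the 1-5 star range.
    return max(1.0, min(5.0, 3.0 + z))

# Hypothetical score history for a site that clusters in the 7 to 9 range.
site_history = [7.0, 7.5, 8.0, 8.4, 8.5, 9.0, 9.2, 7.8, 8.8, 8.4]

# An 8.4 from this site sits just above the site's own mean.
stars = to_stars(8.4, site_history)
print(round(stars, 1))
```

Under these assumptions an 8.4 from a generous site lands close to three stars, which is the effect the article describes.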

6 of 42 comments

  1. Selection effects? by JackBuckley · · Score: 4, Insightful
    I agree that fanboy magazine ratings should be viewed with scepticism, but I also worry about "normalizing" the reviews. There is an underlying assumption here that the population of games is symmetric in the distribution of quality. This may not be true, if, for example, games which are in the low quality tail are not released (game companies are strategic actors, right?) at a higher rate than games in the high quality side of the distribution. In addition, gaming magazines do not necessarily choose which games to review at random either--they either review interesting games with a higher probability of being of high quality (if you want to be kind to the industry) or else review in response to payola/swag (in which case it is the company's strategy which matters again.)

    So, the question is, conditional on nonrandom selection of games to release and nonrandom selection of games to review, what should we expect the distribution of quality to look like? My guess is that this distribution is non-normal and is skewed with more observations in the higher quality tail. This does not necessarily mean, however, that the reviews are "fair," but it suggests that the question is more complicated than a simple "grade inflation" argument.

    Note that I am also making an assumption that quality is judged in some sort of absolute terms, and not relative to the other games that are released. There are probably some other assumptions lurking in there as well. Just my $.02
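The selection argument above can be illustrated with a small simulation. Everything in it is an assumption chosen for illustration: true quality is drawn from a symmetric normal curve centered on 5, weak games (below 4) are released only 20% of the time, and the chance of being reviewed rises with quality. None of these numbers come from real data.

```python
import random
from statistics import mean

rng = random.Random(0)  # fixed seed so the sketch is repeatable

# Hypothetical "true quality" of every game developed: symmetric around 5.
developed = [rng.gauss(5.0, 1.5) for _ in range(10_000)]

def released(quality):
    """Assumed release rule: weak games are usually cancelled."""
    return quality >= 4.0 or rng.random() < 0.2

def reviewed(quality):
    """Assumed review rule: better games are more likely to be covered."""
    return rng.random() < max(0.0, min(1.0, quality / 10.0))

# Games that survive both selection steps and end up in review archives.
in_reviews = [q for q in developed if released(q) and reviewed(q)]

def share_above(values, cutoff):
    """Fraction of values strictly above the cutoff."""
    return sum(v > cutoff for v in values) / len(values)

print("mean quality, all developed games:", round(mean(developed), 2))
print("mean quality, reviewed games:     ", round(mean(in_reviews), 2))
print("share above 5, reviewed games:    ", round(share_above(in_reviews, 5.0), 2))
```

Even with a perfectly symmetric starting population, the reviewed games in this sketch are shifted toward the high end, so a reviewer's archive averaging above the midpoint is not by itself evidence of inflation.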

    1. Re:Selection effects? by iocat · · Score: 2, Insightful
      Since I actually read reviews, and don't just look at the scores, I know I vastly prefer to read a review by a fan of the genre (who might be more forgiving on the score) than someone who knows nothing about it, regardless of any grade inflation that may go on.

      The great examples are NASCAR and wrestling games. The typical NASCAR game review is like "well, if you like turning left a lot and you have a lobotomy, you may enjoy this... 3/5." Contrast that with Ivan Sulic's exceptionally well written review of NASCAR 2006: Total Team Control at IGN. Sulic ran down every parameter of the game, even going so far as to explain his play style, while doing the review. He didn't take the piss, and while it's clear he's a NASCAR fan, he delivered enough information to enable hardcore and casual fans to have everything they needed to evaluate the game. He scored it high (8.5) and obviously liked it. But, since it was the only NASCAR game to come out in 2006, according to TFA, it should be normalized to a 3. Retarded. In the end, I decided that NASCAR 2006 wasn't for me, despite its high score (the total team control stuff seemed like too much work), so I bought Flatout instead, but it was a great example of why you need to read the review, not just the score.

      Additionally, this notion that the "average" game must score 5 on a 10 point scale is retarded. If you have a 10 question quiz, your average person will probably get 7 right. That's why a 70% is usually a C (average). The scoring systems at IGN, GameSpot, Game Informer, etc. may trend towards 7 - 9, but that's simply in line with the way most people grade things on a 10 point scale. On a 5 point scale, you see a lot more threes, but so what? A 3 doesn't tell you any more than a 7 -- ultimately you need to actually read the review.

      --

      Dude, I think I can see my house from here.

  2. Reviews are only useful when... by mikeisme77 · · Score: 4, Insightful

    Just looking at the end score in the review for ANYTHING is useless. The usefulness in a review is in reading the comments of the reviewer and understanding the reviewer's preferences in games by looking at their reviews of other games you're interested in. The trick is to find a reviewer with similar thoughts on genres and such as your own, that way their review is relevant to you. The other trick is finding well thought out, well explained reviews--ones that tell you EXACTLY what the shortcomings and pros of the title were, this way you can decide if the shortcomings are shortcomings to you or if you just think the reviewer is being anal.

  3. The culture of inflation by UbuntuDupe · · Score: 2, Insightful

    I think what reviewers say going in is that, "okay, 5 is average, if you get above 5, hey, you're doing something right." But then they hand out 6's and 7's and the companies are like "OMG!!!! totally unfair!!! That's a failing grade! And it's a good game!" (I think this actually came up in Electronic Gaming Monthly about 10+ years ago when they wanted to defend giving a game a 7 on just that basis - it *was* above average, but that's not "good enough" a reason to leave it at 7. Maybe the game was Super Empire Strikes Back.)

    To answer your question, you *should* see a 1-10 bell curve peaking at 5. But they won't use a genuine 1-10 scale because people will read it like an (American) grading scale, where 6 (of 10) is failing. So what you should expect in reality is some sort of bizarre pseudo-logarithmic asymmetric scale: "normal" is 6 or 7, and you have to get disproportionately better or suckier to get to 8 or 9 or to 4 or 3.

    Like with grade inflation, no one is really served by that. But it doesn't help that there are no objective, well-defined units for an inherently subjective experience. What exactly does it mean that a game is "twice as good" as another? Or that a game is "one point" better?

  4. Part of the problem by miyako · · Score: 2, Insightful

    I think a big part of the problem that people have is that they simply don't know how to read game ratings properly. I know that, personally, I usually find that when I look at the ratings for a game, along with the text of the review, it can lead me to a fairly accurate understanding about how well I will like it.
    The first thing is that I think people think of the ratings system in terms of absolute "10 is a great game" "5 is mediocre" "1 is crap" scores. That's not really accurate. In general the score must be considered from within its genre. A football game with a rating of 10 might be excellent for people who like football games, but I certainly wouldn't enjoy a football game with a 10 rating any more than a football game with a 1 rating, because I don't like football games. Likewise I might enjoy an RPG with a score of 7 or 8, but other people would find it tedious, because they don't like RPGs.
    The other problem is that I think people expect scores to fall in a fairly normal distribution. The problem is that game quality isn't a normal distribution. There are a lot of games that are made that people might not consider fun, but they are at least semi-playable. If you consider a game that might get a 1 or a 2 rating, it would have to be something with severe software flaws that kept the game from even being playable. On consoles at least, no matter how bad a game is, it's rare for a game to be so bad that a determined person couldn't play it (even if they didn't enjoy themselves). If you look higher in the ratings, it's similar. Most games tend to be clustered in the 7-9 range. To understand why, I think you need to really understand what the ratings in that range mean. When I'm looking at a review, and I see a game with a 10, that tells me that the game is well executed and should appeal to the majority of gamers even if they aren't particularly fans of the genre of the game. A 9 generally says that the game is on par with the best games of that genre, and introduces some new concepts to extend it. A rating of an 8 generally tells me that the game is solid and people who are fans of the genre will probably enjoy it, but it might not appeal to people who aren't specifically fans of the genre, or of the series. A game with an 8 might either have a few flaws that lower the overall experience, or it might be a solid game that fails to offer anything innovative. A rating of a 7 generally says that the game is weak. A 7 tells me that someone who was a big fan of the genre or series might enjoy the game, but that there are probably some flaws that other games in the series or genre have fixed, and that the game either has some fairly large flaws that non-fans won't be able to overlook, or it's very formulaic and will be boring to someone who isn't a huge fan of that formula.
    Finally, looking in the mid range of 3 to 6, you generally see games that are lacking something that generally a game should have, but which doesn't render the game unplayable (when I say unplayable, I mean physically the game won't run, as opposed to the game having unresponsive or unintuitive controls). A 5 or 6 for example says that a game probably has some severe playability issues that interfere with enjoyment of the game, as well as having some bugs and lacking features that are standard for the genre. A 3 or 4 generally says that either because of various bugs or lack of features, there isn't much "game" to the game at all.
    What I think it boils down to is that games first of all need to be rated only within their own genre, because it's hard to set a single scale for games across different genres. Using a normalized scale seems intuitive, but it doesn't work because the quality of games isn't a normal distribution. Instead it's skewed so that there are generally a lot of "Ok" and "so-so" games in the 7 to 8 range, a lot of games that lag behind because of various problems, and a very few gems that get the coveted 9's and 10's. It's also hard to quantify fun. A reviewer can really only rate a game based on what he

    --
    Famous Last Words: "hmm...wikipedia says it's edible"
  5. Should be more like film critics by Bender0x7D1 · · Score: 3, Insightful

    People should rely more on reviews from people that have the same gaming opinions as they do instead of some number. Consider: How many poorly rated movies do well at the box office, and how many highly rated movies do poorly? A lot. Check out Yahoo! Movies or a similar site and compare the critics to the people. They are never the same and rarely similar. Why? Movie critics see a lot of movies, so are biased towards the storyline and acting instead of a big action sequence. So, they view movies differently than I do.

    Extending that to video games; a reviewer who enjoys FPS games is going to give a high rating to the latest shooter with great graphics. I like older FPS games, but hate the direction that the industry has gone with newer games. So, if a reviewer is a fan of the genre, and I'm not, should I use their review? Of course not! I hate RTS games, so even if one had a 10 rating I wouldn't buy it. However, maybe someone does something new and it is worth my time and money to give it a shot. How do I know? I need to find a reviewer who doesn't like RTS games and get their rating - if they give it a 7 or 8, but they don't like RTS, then I should look into it.

    So, how do you find these reviewers? Give ratings to the games you have played, maybe separated by genre, and then go looking for reviews that are close to your own and look at the name of the reviewers. Then search by reviewer to see how close their ratings are to your own, pick the closest (or some sort of combination - Alice for RTS and Bob for FPS). Now you have some reviewers you can trust will like the same games you do, and you can shop accordingly.
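The matching procedure described above could be sketched like this. It is a minimal illustration, assuming "close" means the smallest average point difference across games that both you and the reviewer have rated. The game titles, scores, and the helper names `closeness` and `best_reviewer` are all hypothetical.

```python
def closeness(mine, theirs):
    """Average absolute score gap over games both parties have rated.

    Returns None when there is no overlap to compare.
    """
    shared = mine.keys() & theirs.keys()
    if not shared:
        return None
    return sum(abs(mine[game] - theirs[game]) for game in shared) / len(shared)

def best_reviewer(mine, reviewers):
    """Name of the reviewer whose scores sit closest to mine, or None."""
    gaps = {name: closeness(mine, scores) for name, scores in reviewers.items()}
    gaps = {name: gap for name, gap in gaps.items() if gap is not None}
    return min(gaps, key=gaps.get) if gaps else None

# Hypothetical ratings on a 1-10 scale.
my_ratings = {"Shooter A": 4, "Shooter B": 8, "Strategy A": 3}
reviewers = {
    "Alice": {"Shooter A": 5, "Shooter B": 8, "Strategy A": 3},
    "Bob": {"Shooter A": 9, "Shooter B": 9, "Strategy A": 8},
}

print(best_reviewer(my_ratings, reviewers))
```

To get the per-genre combination (Alice for RTS and Bob for FPS), the same function could be run once per genre on ratings filtered to that genre.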

    --
    Reading code is like reading the dictionary - you have to read half of it before you can go back and understand it.