Slashdot Mirror


Are Review Scores Pointless?

donniebaseball23 writes: With Eurogamer being the latest popular video games site to ditch review scores, some are discussing just how valuable assigning a score to a game actually is these days. It really depends on whom you ask. "I've always disliked the notion of scores on something as abstract and subjective as games," says Vlambeer co-founder Rami Ismail. From the press side, though, former GameSpot editor Justin Calvert still believes in scores. "I've been basing my own game-purchasing decision on reviews ever since I picked up the first issue of Zzap! 64 magazine in the UK almost 30 years ago," he says, while admitting that YouTube is certainly changing the landscape today: "There's something very appealing about watching a game being played and knowing that the footage hasn't been edited in a way that might misrepresent the experience."

23 of 135 comments

  1. Meta scores and user's meta scores by Deffexor · · Score: 4, Informative

    I've found Metacritic to be a good aggregator of scores, but more importantly, the "users" scores (and reviews) tend to be more reliable in terms of not being overly critical of games that are generally pretty good, but don't meet the expectations of "hard core" gamers.

    1. Re:Meta scores and user's meta scores by jclarker6 · · Score: 5, Informative

      Totally agree with this. Taking it a step further, one could say any single score on its own is not that reliable, but when taken in aggregate the cream definitely rises to the top.

    2. Re: Meta scores and user's meta scores by Anonymous Coward · · Score: 2, Interesting

      I mostly read negative reviews. You can quickly differentiate between drama queens and people with legitimate gripes, then evaluate what they find frustrating and compare it with your own expectations.

    3. Re:Meta scores and user's meta scores by houghi · · Score: 3, Informative

      Scores by themselves are useless in many cases. I once was heavily involved in a customer service survey. It was basically "From 1 to 10, how do you like the service?" What we noticed was that Nordic countries gave completely different numbers compared to Mediterranean countries.

      First they thought it was because the service was much better in some countries compared to others. Looking into it and asking customers we found nothing.

      We then started asking a second question, "What service did you expect?", and measured the difference. So if you expect a 6 and you get a 6, it is much better than expecting a 9 and getting a 7. So a 6 can be better than a 7.
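
      The comparison above can be sketched in a few lines of Python. This is only an illustration: the class and function names are made up here, and the "gap" is assumed to be a plain subtraction of the expected score from the received score, which the survey may or may not have used.

```python
from dataclasses import dataclass


@dataclass
class SurveyResponse:
    """One customer's answers (hypothetical structure, both on a 1-10 scale)."""
    expected: int  # "What service did you expect?"
    received: int  # "From 1 to 10, how do you like the service?"


def expectation_gap(response: SurveyResponse) -> int:
    """Positive means expectations were beaten, negative means a let-down."""
    return response.received - response.expected


met = SurveyResponse(expected=6, received=6)
let_down = SurveyResponse(expected=9, received=7)

print(expectation_gap(met))       # 0
print(expectation_gap(let_down))  # -2
```

      Ranked by gap rather than by raw score, the customer who gave a 6 comes out ahead of the one who gave a 7.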

      The issue was that the first time we did not have a base to start from. In school-tests the base is pretty easy. 100% is perfect without any errors.

      Compare it to Americans and the English, where one would say "Wow, this is AMAZING. It is the best I have ever seen." and the other would say "It's not bad." (I hope you get what I mean.)

      When I look at scores for movies, restaurants, books or whatever, I read the comments to know WHAT they thought about it.

      --
      Don't fight for your country, if your country does not fight for you.
    4. Re:Meta scores and user's meta scores by space_jake · · Score: 2

      0/10 has day one DLC.

    5. Re:Meta scores and user's meta scores by Ravaldy · · Score: 2

      I get what you're saying with the customer service surveys, as I've been involved in those too. First, I found it was important to keep surveys at 3-5 questions. If you exceed the 5-question mark you discourage the positive reviewers, since they don't have the motivation to finish the survey. Negative reviewers are usually far more motivated, since they are on a mission to display their dissatisfaction.

      In addition, the questions need to be easily rated. "Was the service good?" is too general. You need to narrow it down to: "Rate how easy the agent was to understand", "Rate how friendly the agent was", "Rate how easy it was to obtain the correct department", ... Different questions will be asked depending on the type of service you provide.

      I strongly believe a score out of 10 is too big of a scale. We usually kept it at 3 (Mediocre, Average and Exceptional). As a company you don't need any more information than that.

    6. Re: Meta scores and user's meta scores by Anubis+IV · · Score: 4, Informative

      I'd go even further than that and say that it depends on the type of scale being used as well.

      When it comes to user reviews, if the reviews are thumbs up or down, I'll do the same as you and read the thumbs down reviews first, since it's easier to filter out the extreme reviewers and get a sense for the common issues. If it's a 5-point scale, I'll read through the 2s and 4s, since those reviews can give you a quick understanding of the pros and cons for the product, without nearly the level of overstatement that you'll need to filter through in the 1s and 5s. And I don't even bother reading reviews based on 10-point scales, since the way that everyday users grade on a 10-point scale is arbitrary to the point of uselessness (e.g. some people treat it like a 5-point scale with better granularity, while others treat it like an academic scale).

    7. Re:Meta scores and user's meta scores by Anonymous Coward · · Score: 2, Insightful

      Are you kidding me? The user scores are just full of people who mindlessly rate the game 0/10 or 10/10 based on the current score in order to pull it up or down.

      Or whenever there's a controversy in the game, they'll rate it 0/10 for a tiny reason to bring the score down.

      There's nothing useful in user reviews of popular games.

      Steam's system is a little better, it only allows a positive or negative experience, and the reviews just show the consensus as "mixed positive". Still, there's bandwagoning but at least there's no illusion of a numbered score.

    8. Re:Meta scores and user's meta scores by Dutch+Gun · · Score: 3, Insightful

      the "users" scores (and reviews) tend to be more reliable in terms of not being overly critical of games that are generally pretty good

      In my experience, users are very extreme in their assigned scores. If they enjoy the game, they assign it a 100 (ZOMG, Best Game EVAR!!!). If they didn't enjoy the game for some reason, it rates a 0 (WTF?! Worst !@#$!@ game of ALL TIME!!!). There are often relatively few scores in the middle. Also, user ratings will often pick up on issues that the press doesn't touch, though, which is a good thing. For instance, when a company introduces intrusive DRM, or if an online-only game has a very bad launch, users will flood the systems with very low scores, where professional scores would not have touched (or perhaps seen) these issues.

      Generally speaking, if a game gets universal praise, there's probably something worthwhile about it, at least to many people. If it generally gets horrid scores, you know that there's something seriously wrong. No, review scores aren't pointless at all. If you want to get the details, then read the actual review, and you can find out if you agree with those specific points or not.

      Eurogamer isn't really dropping the score, incidentally. They're just moving to a "four star" system ("Avoid", "No Recommendation", "Recommended", and "Essential"). In truth, I think that's probably a more honest way of scoring, because it's sort of silly to try to rank different games based on one or two percentage points of difference, which is probably completely arbitrary. For instance, what's the difference between a game that ranks 90% on Metacritic and one that ranks 89%? Answer: one more high-profile review gave it five out of five stars instead of four out of five stars. This also avoids the problem of having to try to rank very different genres against each other, or try to convey what a particular score "means" (there's almost always a chart along with the score). In a sense, giving it one of four rankings is sort of cutting out the score as a middleman.
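
      A rough sketch of that 90-versus-89 arithmetic, with loud assumptions: the review numbers below are invented, five stars is taken as 100 and four stars as 80, and the aggregate is a plain unweighted mean (a real aggregator may weight outlets differently).

```python
def mean_score(scores):
    """Unweighted average of review scores on a 0-100 scale (an assumption)."""
    return sum(scores) / len(scores)


# 19 hypothetical reviews that are identical for both games.
other_reviews = [90] * 9 + [89] * 10

# The 20th review is the only difference: four stars versus five stars.
game_a = other_reviews + [80]   # 4/5 stars taken as 80
game_b = other_reviews + [100]  # 5/5 stars taken as 100

print(mean_score(game_a))  # 89.0
print(mean_score(game_b))  # 90.0
```

      With 20 reviews, one reviewer moving by a single star shifts the average by a full point, which is the whole gap between the two games.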

      Also, honestly, I sort of wonder if dropping numeric ratings is a way for gaming sites to give themselves an "out" with publishers, who may apply pressure if their review scores are too low. I've heard of bonuses and such being tied to Metacritic review scores, which is a pretty nasty thing to do to your employees, IMO. Also, I'm guessing websites don't care to have their reviews simply aggregated by Metacritic into a single, unified score.

      --
      Irony: Agile development has too much inertia to be abandoned now.
    9. Re:Meta scores and user's meta scores by Anonymous Coward · · Score: 2, Informative

      About Nordic countries giving different answers...
      Look up which grading systems are used in schools.

      Norway has a scale of 1 to 6, with 6 being best (grade school only).
      Finland has a scale of 4 to 10, with 10 being the best (grade school), and 1 to 5, 5 being the highest (higher education).
      Sweden uses letter grades, A, B, C etc.
      Iceland has a scale of 0 to 10, with 5 - 5.99 being 3rd, 6.0 - 7.24 being 2nd and 7.25 - 8.99 being 1st grade. 9.00 and up is a Fine grade. This is very similar to the grading system used in Greece, and somewhat similar to the one in Spain.
      Denmark has a 7-step grade, 12 being highest and -3 being lowest (don't ask, I have no idea either).

      It's weird all around Europe. In Germany, grades are 1 to 6 but nr 1 is best! In France, it's 1 to 20.

      I never realized it was this bad. I just lost a lot of respect for numerical review scales...

    10. Re: Meta scores and user's meta scores by dj245 · · Score: 2

      I'd go even further than that and say that it depends on the type of scale being used as well.

      When it comes to user reviews, if the reviews are thumbs up or down, I'll do the same as you and read the thumbs down reviews first, since it's easier to filter out the extreme reviewers and get a sense for the common issues. If it's a 5-point scale, I'll read through the 2s and 4s, since those reviews can give you a quick understanding of the pros and cons for the product, without nearly the level of overstatement that you'll need to filter through in the 1s and 5s. And I don't even bother reading reviews based on 10-point scales, since the way that everyday users grade on a 10-point scale is arbitrary to the point of uselessness (e.g. some people treat it like a 5-point scale with better granularity, while others treat it like an academic scale).

      The best scale I have seen used is a 4-part scale:
      Is it worth the money even if you don't like the genre? Yes/No
      Is it worth the money if you do like the genre? Yes/No
      Is it worth the money if you like a very specific subgenre? Yes/No
      Is it not worth your money? Yes/No

      Some YouTube reviewers use this format and I've found it much more meaningful than any pure point scale.

      --
      Even those who arrange and design shrubberies are under considerable economic stress at this period in history.
    11. Re:Meta scores and user's meta scores by drinkypoo · · Score: 2

      Their product has to work when I want it to, not when they get around to fixing it.

      The game also has to not be complete bullshit. In Simcity 4, sims have routes and destinations. In Simcity 5, they just make all that shit up, with disastrous results. Why would I pay for a game which is less technically adept than the prior title?

      Hopefully they'll get it right again in 6, but I'm sure not going to be waiting with bated breath.

      --
      "You're right," Fisheye says. "I should have set it on 'whip' or 'chop.'"
    12. Re:Meta scores and user's meta scores by AFCArchvile · · Score: 2

      One example of Metacritic scores being contractually tied to bonuses was with Fallout: New Vegas. Gamasutra reported on it almost 3 years ago: http://www.gamasutra.com/view/...

      I never played Fallout:NV, but I remember hearing Jeff Gerstmann describe it as "...very well written... and kinda broken." I also remember someone posting a Windows 7 Reliability Monitor graph, with the only crashes reported coming from "FalloutNV.exe".

      One thing I really like about Eurogamer's approach is that they're simultaneously:

      1. Moving away from a traditional arbitrary numeric score (which is among the old systems of 100-point, 10-point, 5-star, and academic-style scoring)

      2. Retaining a set of summary badges where they can make it easy for readers to find the most notable games, and avoid the games that are broken (as much as the AAA games industry wants to deflect and deny it, broken games are released, and are marketed as though they are 100% functional and are the most outstanding game ever... because that's how all AAA games are marketed).

      3. Forbidding Metacritic from aggregating their scores. This is probably the most important point here; Eurogamer is essentially saying, "Yeah, we're going to score our game reviews with an honest, non-traditional ratings system, and you can't subvert it for your commercially exploitative purposes."

      I hope that more game reviewing outlets take this stand, in order to attempt to stem the worst aspect of Metacritic: its influence on game development. Of course, that won't stop the bigger problem of dishonest marketing plans (E3 shenanigans, mock reviews, cherry-picking outlets for review copies, or not sending review copies at all for games that are forecast to tank while customers who are none the wiser may still be eagerly preordering or obsessing over screenshot and video galleries).

      --
      "Ancillary does not mean you get to rule the world." --U.S. Circuit Judge Harry Edwards, speaking to the FCC's lawyer
  2. Fix the title by neminem · · Score: 2

    Betteridge's Law indicates that the answer is "no", when of course, the answer is actually "duh".

  3. False dichotomy much? by fuzzyfuzzyfungus · · Score: 2

    This seems like a rather pointless question, since 'reviews' and 'review scores' serve somewhat different purposes.

    If you want a comparatively deep examination of a game, strengths, weaknesses, what is it trying to do?, does it succeed?, who is it aimed at?, etc. an answer like "65" or "8" is practically useless. If you want to do a metacritic-style survey(or decide what long-form reviews to read when faced with 2,000 games), though, 3 pages of prose and musing, each, from two dozen sources isn't going to cut it.

    Anyone pretending that a hundred-point score is actually that precise is likely fooling themselves; but there's a much stronger argument that you can get at least a 1-10 or so scoring system unless you are a pure, handwaving "It's all, like, intersubjective, man..." type.

  4. Game reviews have always been broken by Piata · · Score: 4, Interesting

    Almost no games get below 40, while any game that doesn't get 80 or more is considered a failure. Then you have people giving games 3 out of 5 stars, which translates to a score of 60, which skews things even more. Plus, tent-pole games like CoD can be executed extremely well but offer nothing new, so how do you review that? There are games with low interaction (point and click) or high interaction (RTS). How do you compare one against the other? Good reviews are also often given despite massive bugs, incomplete games being released or week 1 launch disasters (like Diablo III).

    It's issues like that which make me understand the no score review trend.
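
    To make the star-conversion point concrete, here is a small sketch. Both function names are made up for illustration; the first is the straight conversion that turns 3 stars into 60, and the second is one possible alternative (an assumption, not something any site is known to use) that maps the lowest star rating to 0.

```python
def stars_to_percent_naive(stars, max_stars=5):
    """Straight proportion: 3 of 5 stars becomes 60."""
    return 100 * stars / max_stars


def stars_to_percent_rescaled(stars, max_stars=5):
    """Alternative mapping where 1 star is 0 and max_stars is 100."""
    return 100 * (stars - 1) / (max_stars - 1)


print(stars_to_percent_naive(3))     # 60.0
print(stars_to_percent_rescaled(3))  # 50.0
print(stars_to_percent_naive(1))     # 20.0
```

    Under the straight conversion a middle-of-the-road 3 stars lands above the midpoint, and even the worst possible rating still contributes 20 points.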

    1. Re:Game reviews have always been broken by Kjella · · Score: 2

      If those are your only two complaints, I fail to see the problem. As long as they rank them correctly within the genre and you can apply your own mind to realize whether you'd like that genre or not isn't it simple and great? I mean Schindler's List is a great movie but if you're looking for a romantic comedy you're way off target, likewise I won't suddenly play Portal instead of Skyrim because one got 0.1 higher score than the other.

      The greater problem is all the reviews that are basically bought PR, where the reviewers have their tongues so far up the publisher's backside that they make all sorts of bizarre excuses for things that aren't working but will surely be fixed before release, in return for exclusive access or photos or interviews and whatnot. If I could trust them to be even moderately honest I would, but these days I always wait until I see the user response first, which can be quite different.

      --
      Live today, because you never know what tomorrow brings
  5. Re:No. by aliquis · · Score: 2

    And watching a game on Youtube gives a very misleading impression of gameplay. In fact, I can't fucking stand watching other people play games - especially the kind of games I like, e.g. adventure, where the whole excitement comes from the challenge,

    Try Twitch!

    Trying to watch that with Flashblock / FlashControl and so on installed can be a real challenge! ;D

  6. Re:No. by Piata · · Score: 2

    Sorry to burst your bubble, but this is one situation where Betteridge's law might actually falter. Game review scores have been broken for some time and removing them entirely might be a step in the right direction.

  7. Steam User Score beats traditional scores by Holammer · · Score: 4, Interesting

    The Steam User Score is currently my most trusted metric for how good a game is. Something which is considered "overwhelmingly positive" with a couple thousand user reviews is usually a worthy purchase.
    For non-steam users, imagine Metacritic except you can only submit your score/review if you own the actual game and it's either thumbs up or down.

  8. Scores, yes. Reviews, no. by LaurenCates · · Score: 2

    The thing that I find increasingly aggravating nowadays is how much is hung on score rather than substantive view of the content of a thing.

    For instance (on a sort-of related topic): when a highly-anticipated movie like "The Avengers" is released for critics and the scores start coming in, and it turns out critics were overwhelmingly positive about the movie, the fans get all hopped up when someone dares to give the film a "rotten" instead of "fresh", ruining a "perfect" score, as if there was somehow some personal investment in a movie getting 100% of critics to like it (or spoiling of their enjoyment of it if a mere 1% did not).

    Except for the fact that not all critics thought the movie was perfect, and the Tomatometer merely indicates that the movie was at least good enough not to be considered bad.

    The score is the headline, sure, to draw in people to read the review in the first place. But a lot of people gloss over it and stop engaging their critical faculties, brandishing a metric over true criticism as validation of their personal tastes (like Rotten Tomatoes readers; if you don't believe that people do this, find out what happened to critic Eric D. Snider after he posted a fake negative review of "The Dark Knight Rises" before he'd actually seen it).

    I don't have any "infamous" examples of games to point to, though I'm sure examples exist; in fact I wandered into this topic curious about which games were controversial in the same way, since both media have the same kinds of fanatics attached to them.

    My thought is to get rid of scores so that people actually consume opinions, not reduce them to a single number, but that's just me.

    --
    Some people don't believe in fairies. I don't believe in The Patriarchy.
  9. Steam User Score beats traditional scores by Torvac · · Score: 2

    It's a good direction, but someone should come up with an adjustment because:
    a) pissed off payers rant more
    b) pleased customers usually don't care about feedback
    c) too many "happy" reviews that list more negative than positive points
    x) positive reviews of sorts "please give them 1-n patches to fix a quadrillion bugs before you vote negative"
    etc.

  10. They didn't drop number ratings... by Yakasha · · Score: 2

    They just changed it to a 3 number scale and gave them names.

    games will be considered Recommended, Essential or Avoid.

    Translated to a 1-10 scale, that is 5-8, 9+, and 1-4.
    Translated to 5 stars, that is 3+, 4.5+, and 1-2.
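
    That 1-10 translation can be written out as a tiny lookup. The function name is made up, and the thresholds are simply the ones given above, not anything Eurogamer has published.

```python
def badge_for(score: int) -> str:
    """Map a 1-10 score onto the three named ratings, using the cut-offs above."""
    if not 1 <= score <= 10:
        raise ValueError("score must be between 1 and 10")
    if score >= 9:
        return "Essential"
    if score >= 5:
        return "Recommended"
    return "Avoid"


for score in (3, 7, 9):
    print(score, badge_for(score))
# 3 Avoid
# 7 Recommended
# 9 Essential
```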

    Better is to find a specific reviewer that favors the same types of games that you favor and read what they have to say about a particular game. Reviewers themselves should be given scores in different genres to reflect their interest, and scores in different aspects of games that don't necessarily translate between genres and are not necessarily used on every game (perhaps each reviewer chooses 3 most important factors of a set list of say, 10 different areas); then have multiple reviewers on each game.

    How should we score an excellent game with severe networking issues? A flawlessly polished game with a hackneyed design? A brilliantly tuned multiplayer experience with dreadful storytelling? If you expect the score to encompass every aspect of a game, the task becomes an exercise in futility. Add an inflated understanding of the scoring scale in many quarters - whereby 7/10 and even sometimes 8/10 are construed as disappointing scores - and you have a recipe for mixed messages.

    Excellent game with networking issues:
    "Mary the FPS guru" says:

    Polish: 9.5/10 "It's pretty!"
    Networking: 4/10 "Networking problems ruin everything."
    Replayability: 8/10 "Single player scenarios keep me coming back."

    "Matt Foley the puzzle champ" says:

    Team Balance: 8/10 "Pick your army, it's all about skill"
    Networking: 6/10 "It's ok because I live in a trailer down by the river!"
    Price: 10/10 "Freeware, freeware, freeware."

    You get the idea. Sorry for the babbling. No time to reword this.