Slashdot Mirror


The Problem With Metacritic

Metacritic has risen to a position of prominence in the gaming community — but is it given more credit than it's due? This article delves into some of the problems with using Metacritic as a measure of quality or success. Quoting: "The scores used to calculate the Metascore have issues before they are even averaged. Metacritic operates on a 0-100 scale. While it's simple to convert some scores into this scale (if it's necessary at all), others are not so easy. 1UP, for example, uses letter grades. The manner in which these scores should be converted into Metacritic scores is a matter of some debate; Metacritic says a B- is equal to a 67 because the grades A+ through F- have to be mapped to the full range of its scale, when in reality most people would view a B- as being more positive than a 67. This also doesn't account for the different interpretation of scores that outlets have -- some treat 7 as an average score, which I see as a problem in an of itself, while others see 5 as average. Trying to compensate for these variations is a nigh-impossible task and, lest we forget, Metacritic will assign scores to reviews that do not provide them. ... The act of simplifying reviews into a single Metascore also feeds into a misconception some hold about reviews. If you browse into the comments of a review anywhere on the web (particularly those of especially big games), you're likely to come across those criticizing the reviewer for his or her take on a game. People seem to mistaken reviews as something which should be 'objective.' 'Stop giving your opinion and tell us about the game' is a notion you'll see expressed from time to time, as if it is the job of a reviewer to go down a list of items that need to be addressed — objectively! — and nothing else."

3 of 131 comments (clear)

  1. Re:Solve it with Machine Learning by bWareiWare.co.uk · · Score: 4, Interesting

    Your description is not so much machine learning as basic math. If you just want each scoring system to have equal weight on the results then compensating for the variation is trivial.

    Where machine learning would come in is to find underlying pattens in the data.

    This could be used to weed out reviewers to lazily copy scores, or are subject to influence.

    It would also allow them to test your scores for some games against the population of reviewers to find reviewers with similar tastes.

    You could also use clustering algorithms to find niche games which got a few really strong scores but whose average was really pulled down because they don't have wide appeal.

  2. Re:Is that so? by SpooForBrains · · Score: 5, Interesting

    I work for a review platform. We have decided that you only really need four ratings, Bad, Poor, Good, Excellent. We don't have a neutral option because really neutral tends to mean bad.

    Of course, quite a lot of our users (and our marketing department) seem to prefer stars. Because an arbitrary scale is so much more useful that simply saying what you think of something. Apparently.

    --
    "The dew has clearly fallen with a particularly sickening thud this morning"
  3. Rottentomatoes by cheesecake23 · · Score: 4, Interesting

    By that logic, Rottentomatoes (which averages reviews using only a binary fresh/rotten scale) should be utterly useless. Except it isn't. It's IMHO the most dependable rating site on the net.

    It seems the magic lies not in the rating resolution, but in the quality and size of the reviewer pool (100+ for Rottentomatoes). In other words, make the law of averages work for you.