Slashdot Mirror


Details on XBox TrueSkill Ranking System

rupert0 writes "A research paper on the Microsoft website gives an insight into the way that gamers will be ranked on the new-style Xbox Live. The paper outlines some existing ranking systems, as well." From the article: "The TrueSkill(TM) ranking system is a skill-based ranking system designed to overcome the limitations of existing ranking systems, and to ensure that interesting matches can be reliably arranged within a league. It uses a technique called Bayesian inference for ranking players. Rather than assuming a single fixed skill for each player, the system characterises its belief using a bell-curve belief distribution (also referred to as Gaussian) which is uniquely described by its mean (speak [mju:]) ("peak point") and standard deviation (speak [sigma])("spread")."

50 comments

  1. The bell curve by TimmyDee · · Score: 2, Insightful

    Who would have thought this to be a novel way to rank people's skills?*

    *I do realize that's not the upshot of the paper. Still, I think all this emphasis on Gaussian distributions for dividing people according to "skill" (read IQ, test scores, etc.) is a bit overdone. Convenient, but overdone.

    --
    Per Square Mile, a blog about density
    1. Re:The bell curve by Westacular · · Score: 1

      Perhaps you should read up on the central limit theorem.

    2. Re:The bell curve by illerd · · Score: 1

      Central limit theorem is totally played out. So is e=mc^2. Where's the hot new theorems of today's generation?

  2. The proof of the pudding is in the tasting by LordNimon · · Score: 2, Insightful

    I'll believe this method works if I can join a Halo 3 game as a level 1 player and not have my ass kicked. As it is today, if a new Halo 2 player joins an online game, he will be destroyed by the other players regardless of what their ranks are.

    --
    And the men who hold high places must be the ones who start
    To mold a new reality... closer to the heart
    1. Re:The proof of the pudding is in the tasting by MrScience · · Score: 1

      I saw a presentation on this. I can only say that we [excluding griefers] are all going to be much happier when this ranking system is commonplace. In the paper, notice that Alice increased her level by 50% in one game... she won't be staying low level for long.

      --

      You quitting proves that the karma kap worked. The most annoying of the whores shut up. --CmdrTaco

    2. Re:The proof of the pudding is in the tasting by Joe+Random · · Score: 5, Informative
      I'll believe this method works if I can join a Halo 3 game as a level 1 player and not have my ass kicked.
      It's not possible to guarantee that. Even if the ranking system puts you in a game with other newbie players, it has no idea what any of the players' skill levels are prior to them playing their first match. However, after getting your ass kicked a couple of times, the ranking system should be able to more reliably group you with players of an approximately equal skill level.

      But it is necessary that your very first game is more likely to have some major asskicking, and you may be on the giving end or the receiving end depending on your prior, unrecorded experience.
    3. Re:The proof of the pudding is in the tasting by bigman2003 · · Score: 2, Insightful

      If you don't want your ass handed to you, there are a couple of options:

      #1- Buy the game on the release date, and go on Live right away. You will be on an even playing field with the other players. I've done this quite a few times, and many of the best games I've played have been on day 1. Nobody knows the 'secrets' yet. Nobody knows where to hide, or what the shortcuts are (in a racing game). It's fair, and you can learn along with everyone else.

      #2- Play the single-player mode all the way through. DO THIS! I really hate when people go on Live, and they don't even know how to work the controls of the game. Of course you get your ass kicked...you've never practiced!

      #3- Start playing in the morning. Yes, it sounds stupid, but I've noticed I do much, much better in the morning...am I better then? No...but the people who are really serious about it usually play in the evening/night.

      --
      No reason to lie.
    4. Re:The proof of the pudding is in the tasting by tomstdenis · · Score: 1

      #4 - have enough disposable income to verbally threaten the others by saying you'll fly out there to beat the shit out of them if they don't stop spawn camping. :-)

      Most of the time in online games people aren't playing to test other people's skill and have a challenge. They simply want "more points". If they were in it for the sport they wouldn't stalk players (e.g. repeatedly kill the same guy, or go after them when it's illogical, such as when you're in danger), spawn camp (e.g. hide at their spawn point with a sniper rifle or something with high splash damage), or use the dozens of other ways (griefing, TK, etc.).

      Basically it's just not that interesting to play online once you realize that the guy on the other end who is pissing you off so much is 12 yrs old, lives in their parents' house and bought the game with an allowance. All while you're 23, living on your own and had to work 3 hours to EARN the money for the game.... /rant :-)
      Tom

      --
      Someday, I'll have a real sig.
    5. Re:The proof of the pudding is in the tasting by LordNimon · · Score: 1
      So you're saying that the solution to being able to play online when you're not that good is to become good? That's a stupid answer! What if I don't have the time or energy to practice the game that much? What if I'm just not capable of becoming that good, no matter how much I practice?

      The whole point behind the TrueSkill system is to accurately match up people by skill level, and that obviously includes people of low skill. Your so-called "solution" is really just ignoring the problem! No thanks, buddy.

      --
      And the men who hold high places must be the ones who start
      To mold a new reality... closer to the heart
    6. Re:The proof of the pudding is in the tasting by bigman2003 · · Score: 1

      You are expecting to go on-line with no 'skill' at all and still be competitive.

      Buying the game on day 1 does not mean that you have to be good.

      Playing in the morning does not mean that you have to be good.

      (those were two of my three suggestions)

      Playing the game through on single player means you have to have enough skill to get through the single player game, in order to learn the controls and have at least a few hours with the game.

      If you really think that I was saying that the only way to have a good game online was to be good...then you are mistaken. I'm 37 years old- and I know that my chances of being able to pick up Unreal Championship 2 at this point, and be a decent player, are zero...it's not going to happen. But if I get into Unreal Championship 3 early...and settle into my 'real' ranking...then I'll be able to play for a long time.

      Of course, this still means that I will only come in first about every 20th game. But I won't be in the cellar very often.

      If you are looking to WIN a lot of games...well, then you just have to be good. That's life.

      --
      No reason to lie.
  3. It has to be an improvement. by numbski · · Score: 4, Interesting

    Given that the current system could be described as a "bubble sort".

    That is, the cheaters and glitchers bubble to the top, the people looking for a fair and fun game drop to the bottom and eventually give up and stop playing. :\

    --

    Karma: Chameleon (mostly due to the fact that you come and go).

    1. Re:It has to be an improvement. by game+kid · · Score: 1

      Indeed. Anyone with SOCOM 2 might know the feeling...

      --
      You can hold down the "B" button for continuous firing.
    2. Re:It has to be an improvement. by Anonymous Coward · · Score: 0

      Why not use quick sort? It's much more efficient.

    3. Re:It has to be an improvement. by KDR_11k · · Score: 1

      Was that supposed to be a joke?

      --
      Justice is the sheep getting arrested while an impartial judge declares the vote void.
    4. Re:It has to be an improvement. by Anonymous Coward · · Score: 0

      Are you?

  4. Most importantly... by game+kid · · Score: 4, Interesting

    ...one must not be able to rank up with simple tactics.

    --
    You can hold down the "B" button for continuous firing.
  5. Re:Crikey by Anonymous Coward · · Score: 1, Insightful

    You are retarded.

    1) A lot of nerds enjoy video games.
    2) The 360 debuts in 14 days.
    3) Due to the impending release there is a lot of news, speculation, and hype.

    Given the above, it is perfectly reasonable for there to be a mass amount of 360 articles. The same shit happened with the DS and PSP and will happen for the PS3 and Revolution as their release dates approach.

    Don't like it? Then fuck off.

  6. Bayesian inference by Short+Circuit · · Score: 1

    It uses a technique called Bayesian inference for ranking players.

    The submitter didn't really have to give Bayesian a magical aura...We geeks who remember the first Bayesian spam filters already know it's magical. :)

  7. Microsoft Invented Mathematics? by ObsessiveMathsFreak · · Score: 1, Funny

    Great. A Microsoft announcement detailing their new development of a "bell curve" according to its "mean" and "standard deviation". Are MS claiming to have invented the Gaussian curve?

    What's next? Embrace, extend and extinguish calculus? I can just hear the boardroom conversation when the mathematicians try to claim precedent.

    Ballmer: "I'm going to f***ing bury that guy, I have done it before, and I will do it again. I'm going to f***ing kill Carl Friedrich Gauss!"

    --
    May the Maths Be with you!
    1. Re:Microsoft Invented Mathematics? by Maian · · Score: 1
      Have you RTFA? Nowhere does MS say they invented the Gaussian curve. They even have a link to Math World on that page.

      I swear, the first thing some /.ers think when they see anything related to MS is "FUD" or "EEE" or "world domination". Just like that Avalanche research paper. *sigh*

    2. Re:Microsoft Invented Mathematics? by Westacular · · Score: 1

      Just wait until someone tells Ballmer that he can't integrate it.

    3. Re:Microsoft Invented Mathematics? by Khuffie · · Score: 1

      Wow. Get your head out of your ass for a minute, and realize that all Microsoft is doing is outlining what they're using in terms of mathematics and other means to rank players. As an 'obsessive math freak', you should be interested in the article and the fact that they publicly released this information.

  8. A Bayesian slight of hand? by Anonymous Coward · · Score: 5, Interesting

    Let me start this by saying I have no problem with the Bayesian framework... this usually can be a source for a fight in statistics departments ... I've seen a situation where a professor snapped a potential professor hire for his usage of the Bayesian framework in his presentation.

    Typically the Elo system works on an iterative mechanism that updates after a match... that is to say that you have a prior rank, you play your game, and you have a post-match rank (not technical terms, mind you). Usually it's R[player,after] = R[player,before] + ("speed factor") * (Result(0/1) - WinExpectancy(yourrank, theirrank)). For chess the logit function is used with some additional scaling factor: 1/(1+exp(-(yours-theirs)/400)).
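The update rule described above can be sketched in a few lines of Python. This is a minimal illustration, not any federation's official implementation: the function names, the "speed factor" default of 32, and the base-10 form of the logistic curve are assumptions chosen for the example (the base-10 form is the one usually quoted for chess; it differs from the exp form above only by a constant factor in the scale).

```python
def expected_score(rating_a: float, rating_b: float, scale: float = 400.0) -> float:
    """Win expectancy of A against B under a base-10 logistic curve.

    The scale of 400 comes from the comment above; base 10 is an assumption.
    """
    return 1.0 / (1.0 + 10.0 ** (-(rating_a - rating_b) / scale))


def elo_update(rating: float, opponent: float, result: float, k: float = 32.0) -> float:
    """Return the post-match rating.

    result is 1 for a win, 0 for a loss (0.5 for a draw);
    k is the "speed factor" (32 is an illustrative value, not a standard).
    """
    return rating + k * (result - expected_score(rating, opponent))


if __name__ == "__main__":
    # Two equally rated players: the winner gains exactly k / 2 points.
    print(elo_update(1500, 1500, 1))  # 1516.0
    # A 400-point favourite gains far less from the expected win.
    print(elo_update(1900, 1500, 1))
```

Because the two win expectancies sum to 1, the points gained by one player equal the points lost by the other when both use the same speed factor.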

    This new system is taking advantage of the Bayesian framework; for those not familiar with it: P(A[i]|B)=P(B|A[i])*P(A[i])/sum(P(B|A[j])*P(A[j]), j=1..n)... or in more typical terms of likelihood the posterior density (density shaped by events) is proportional to the likelihood times a "prior" belief, or distribution.

    What's the problem in this... the Elo system can adapt to change in ability of the player over time. A Bayes based system can't. The Bayes estimates converge to the true values of the parameters in the model as the number of samples increases to infinity. So an estimate given under frequentist frameworks (MLE, UMVUE, Method of Moments) converges to the same estimate given under Bayes frameworks (mode of the posterior, mean of the posterior). Your estimate can only become more refined with data... if you play enough games, the framework cannot account for any substantial changes in skill. The variance of your strength or ability rating will decrease. Of course the property of the updating scheme is a nice one to have... I've got to figure there is some usage of approximations... they are very scant in terms of either a prior or a modeled distribution for the event (logit, probit, cloglog, some arbitrary cumulative density function). I'm not going to complain about the normality assumptions in prior or posterior though... I wouldn't know about this situation in particular but the Central Limit Theorem pops up over and over again in statistics (there is a Bayesian CLT but I don't know what that entails). Especially in linear model variants such as this. I see this more as improvement by obfuscation... throw out cute words like "True Skill" and the idea that you have this better system, and people will impart their own hopes on it. So, if you want to game the system, just win a lot really early before your variance decreases... looking at the formulae there doesn't seem to be any indication that they are utilizing a conjugate prior (which can help for giving forms for solutions that will not want to eat your pets)... otherwise they'd be staring at a gigantic matrix... and anybody who has dealt with multiple parameters in this situation knows that there will be a correlation structure... which seems to be ignored in the formulae.

    Most approaches I have seen thus far (as a second year grad student) in Bayes methods work away from a nice framework... many sound methods and realistic problems require Markov Chain Monte Carlo or similar techniques (which can be a nightmare in a computational sense)... either you have to have something of an incredibly nice form or you need to make use of approximations to make a system like this work. I don't see how this will be all that appropriate in online gaming... sports I can see since you aren't taking somebody who mastered squash and putting them right into rugby. (Irrelevant to topic at hand-->) I've always had an interest in methods of ranking players/competitors but mainly from a sports angle. I've got a system drawn up in my head, it's a matter of seeing if I can get it to work and then having a few hundred breaks work my way in terms of implementation (again, MCMC is probably the best tool in a lot of cases... but it's a computational/time killer). The Bayesian framework is great if you have a good sense of prior belief as it can help guide you to a solution... but it is naturally biased (not necessarily from a point of malice) and at times it is used inappropriately.

    1. Re:A Bayesian slight of hand? by Ralf+Herbrich · · Score: 5, Informative
      Thanks a lot for your comments; you certainly understood the math behind TrueSkill(TM) really well! But there are a few details (that we did not want to bore people with on our web pages) which address all your concerns:

      • The first problem you point out is that Bayesian estimates (in general) asymptotically converge to the maximum likelihood estimates and, hence, in the TrueSkill system the sigmas would eventually go to zero and not allow for adaptation to changes in the player's "true" skill. This is true for stationary models but not for models with dynamics (think of a Kalman filter, for example). In fact, in the TrueSkill system we have a dynamics factor in our model equation that says that the skill of every gamer can slightly go up or down (zero mean, small variance) between two consecutive games. If you want to see this at work, please go to http://www.research.microsoft.com/mlp/trueskill/RankCalculator.aspx and put every gamer's Sigma at 0.5; then press Recalculate Skill Level Distribution and you will see that the Sigmas after the game are slightly bigger (they should be 0.504). We have worked out the asymptotic value of the uncertainty, sigma, theoretically and compared our solution to empirical findings on 3 million games; our asymptotic limit was close up to 3 digits of precision. This limit is reasonably large to allow constant adaptation for skill changes.
      • The second problem you point out is that of a conjugate prior. Unfortunately, there is no conjugate prior for the probit likelihood in any representation. The approximation method we are using is called "Expectation Propagation" (see http://research.microsoft.com/~minka/papers/ep/roadmap.html) or belief propagation in factor graphs. This IS an "incredibly nice" algorithm, to say it in your words :)
      • The third problem you point out is that the whole correlation structure would be gigantic and you are absolutely right when considering that there are millions of people on Xbox Live so this matrix would be a couple of million rows times a couple of million columns. However, we only save the diagonal of the matrix, that is, the uncertainty in the skill of every gamer. Please note, though, that we do build up the whole correlation structure (temporarily) for all gamers within a game (to make the approximation of the update step as exact as possible).
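The first point (a dynamics factor keeps the uncertainty from collapsing to zero) can be illustrated with a scalar Kalman-style variance recursion. This is only a sketch under loud assumptions: it treats each game as a direct Gaussian observation of skill, which is not the TrueSkill update (that uses win/loss outcomes and Expectation Propagation), and the names `obs_noise_var` and `dynamics_var` and all numeric values are hypothetical.

```python
import math


def posterior_variance(prior_var: float, obs_noise_var: float, dynamics_var: float = 0.0) -> float:
    """One step of a scalar Gaussian (Kalman-style) variance update."""
    # Between games the skill may drift, so the uncertainty grows a little.
    predicted = prior_var + dynamics_var
    # Observing one (assumed Gaussian) result shrinks it again.
    return predicted * obs_noise_var / (predicted + obs_noise_var)


def variance_after(n_games: int, prior_var: float = 1.0,
                   obs_noise_var: float = 1.0, dynamics_var: float = 0.0) -> float:
    """Variance of the skill belief after n_games updates."""
    var = prior_var
    for _ in range(n_games):
        var = posterior_variance(var, obs_noise_var, dynamics_var)
    return var


def steady_state_variance(obs_noise_var: float, dynamics_var: float) -> float:
    """Fixed point of the recursion: the positive root of p^2 + q*p - q*r = 0."""
    q, r = dynamics_var, obs_noise_var
    return (-q + math.sqrt(q * q + 4.0 * q * r)) / 2.0


if __name__ == "__main__":
    # Stationary model: the variance keeps shrinking towards zero.
    print(variance_after(1000))
    # Model with dynamics: the variance settles at a nonzero limit.
    print(variance_after(1000, dynamics_var=0.01), steady_state_variance(1.0, 0.01))
```

In the stationary case the variance after n games is 1/(n+1) with these defaults, which is the behaviour the parent comment worried about; any positive dynamics variance gives a nonzero asymptotic limit instead.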

      Best wishes,
      Ralf Herbrich & Thore Graepel, Microsoft Research Cambridge (UK)

  9. How will I know it's working? by MilenCent · · Score: 0, Redundant

    I'll know it's working if it ranks me up near the top. Yeah, that'll be what decides it. If it gives me any of that "below average" stuff I'll know it's utter crap.

    More seriously, FINALLY, an idea concerning X-Box 360 that I actually like.

  10. I believe, unless my statistics course was in vain (highly probable), that the term for mean is mu, not mju.

    yes, I know it's a spelling error, look at my name!

  11. In modern Greek by tepples · · Score: 1

    At one stage of Greek, possibly the stage where Latin borrowed heavily (and added K, X, Y, and Z to its alphabet), the letter upsilon was pronounced somewhere between "oo" [u] and "ee" [i], perhaps like a Japanese unrounded u [M] or a German rounded i spelt ü [y]. In the latter case, the name of the letter mu would be pronounced [my], and "myoo" [mju]/[mjUw] would be a decent English approximation given the homophony of "Führer" and "furor" in English dialects.

    But in modern Greek, words spelled with upsilon are pronounced with an "ee" [i] sound, so the name of the letter mu sounds like the English word "me" [mi]. To get an "oo" [u] sound in modern Greek orthography, you need to write ou, omicron+upsilon.

    (Pronunciations use ad-hoc respelling followed by X-SAMPA notation.)

  12. Starcraft / Chess ratings are best by CrazyJim1 · · Score: 1

    The beauty of the Chess rating system is that the higher the peak on the bell, the more complex your game is. If the peak never rises very high, your game doesn't have a large skill differential, and is probably on par with a 2nd rate Street Fighter 2 clone that is nothing but button mashing. But if your peak goes above and beyond Starcraft or Chess, then your game is epically complex.... Now we'll never be able to compare games together because they'll always use different systems, but if they adopted the chess ranking system as its system, you can at least compare it to chess in depth.

    1. Re:Starcraft / Chess ratings are best by Anonymous Coward · · Score: 0

      Chess and Starcraft on the same level as one another? You think far too much of Starcraft, no matter its popularity.

      There are many other time-tested games. Go for one, with a curve that makes Chess look like a child's game.

  13. huh? by sho222 · · Score: 3, Funny

    say what? (speak [english])

  14. Microsoft's pseudo-science masquerading as news by Kevin143 · · Score: 0, Redundant

    So why is this news? The Microsoft engineers came up with a reasonably obvious method of ranking people, got a trademark on a stupid name, and now this standard rating method is getting free advertising space on Slashdot, and the methodology behind it is sound enough that nobody calls them on it. They didn't do anything. They're just measuring the skill of a player as a standard deviation up or down from where they are for matching players.

  15. Re:Crikey by MBraynard · · Score: 1

    This is less an Xbox story than a story of a highly statistical method for ranking and pooling players together in an online environment.

  16. Bayesian Inference by NitsujTPU · · Score: 2, Informative

    Bayesian inference refers to a rather large class of algorithms. It would be nice if the summary were more specific.

    To give a heads up as to what this all is. Bayesian statistics are based on the idea that a probability can be updated based on additional information.

    So, perhaps you have a prior of 0.5. There is a 50/50 chance that whoever you're looking at is better than another player.

    Ok, so, 0.5 is the prior. Or, perhaps he's won 90% of the games played, so 0.9 is the prior. Now, 50% of games against the 2nd ranked person, he won, but only 20% against the third... and so on. That would be one form of Bayesian inference.

    Other forms? Naive Bayes is a fairly simple type of machine learning algorithm. There are also graphical models, which are rather advanced Bayesian machine learning models.
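The idea of updating a prior with new information can be sketched with Bayes' rule on a single yes/no hypothesis ("this player is the better one"). The function name and the likelihood values 0.7 and 0.3 are assumptions made up for the example; they do not come from the article or from TrueSkill.

```python
def bayes_update(prior: float, p_win_if_better: float, p_win_if_worse: float, won: bool) -> float:
    """Posterior probability that the player is the better one after one game.

    prior            -- belief before the game (e.g. 0.5 for a 50/50 guess)
    p_win_if_better  -- assumed chance of winning a game if he really is better
    p_win_if_worse   -- assumed chance of winning a game if he really is worse
    won              -- observed outcome of the game
    """
    like_better = p_win_if_better if won else 1.0 - p_win_if_better
    like_worse = p_win_if_worse if won else 1.0 - p_win_if_worse
    evidence = like_better * prior + like_worse * (1.0 - prior)
    return like_better * prior / evidence


if __name__ == "__main__":
    belief = 0.5  # the 50/50 prior from the comment above
    for outcome in (True, True, False):
        belief = bayes_update(belief, 0.7, 0.3, outcome)
        print(outcome, belief)
```

Each posterior becomes the prior for the next game, which is the "probability updated on additional information" described above.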

    1. Re:Bayesian Inference by NitsujTPU · · Score: 0

      K, so, I RTFA... TrueSkill actually rocks...

  17. I think the real question is by Johnny+Mnemonic · · Score: 2, Interesting

    Seriously, three Xbox 360 articles on the front page? There's astroturfing, and then there's Slashtroturfing. Give it a rest, guys. I mean, who gives a shit?

    Gaming is an interesting phenomenon, and I follow the market and read Penny-Arcade--but I don't need to see all this advertising for Microsoft's next attempt to win another market.

    --

    --
    $tar -xvf .sig.tar
    1. Re:I think the real question is by Bobsledboy · · Score: 1

      It's a next generation console that is coming out in less than a month. Of course it's going to be all over the front page. It's called "news"

    2. Re:I think the real question is by steveo777 · · Score: 1

      In all fairness this is probably the first piece of actual news about the 360 in a few weeks. I too am getting sick of hearing about everyone's take on the 360 shortage. Every story about the shortage is a dup of the last, and the comments are the same too...

      --
      This sig isn't original enough, it's time to come up with something witty...
  18. Accuracy by bclark · · Score: 1

    Bayesian inference is typically used when you have a guess about a certain distribution, in this case a player's skill, and you can take observations that give you some information about the true distribution to get a fairly good approximation that converges to the real distribution as the number of samples increases. There are a couple of problems with the framework here. First, it says that it takes 50-100 games to converge to the real value. This becomes problematic when you factor in the fact that each player's skill is changing fairly constantly, probably at a rate faster than the convergence factor, and moreover the entire skill of all the players on the network probably increases over time fairly steadily too. I don't know how to work the math out, but it seems like this may not be accurate for the given application. They would be better off with some sort of temporal model, but these tend to be much more complicated. Take this with a grain of salt though, I'm just an undergrad and may be reading it completely wrong.

    1. Re:Accuracy by Ralf+Herbrich · · Score: 1
      The model has a temporal component. Please see our reply at http://games.slashdot.org/comments.pl?sid=167684&threshold=-1&commentsort=0&tid=211&mode=thread&pid=13984349#13987309. The actual number of games it takes until convergence is much smaller, see http://www.research.microsoft.com/mlp/trueskill/ (there is a link to a detailed explanation of how we arrived at these numbers). The numbers stated there assume that the skill does not change; if the skill also needs to be tracked then the convergence figure will slightly increase.

      Best wishes
      Ralf Herbrich, Microsoft Research Cambridge (UK)

  19. I've been thinking about this problem by NonSequor · · Score: 1

    An idea I had for a ranking system would be to adapt a method used to determine the most influential person in a social network.

    You just put all of the records of how many times each player has killed every other player in a big matrix and compute the eigenvector with the largest positive real eigenvalue. The larger a player's entry is in this eigenvector, the more skilled they are. This method takes into consideration both how many people a player has killed and how many kills those people have and how many kills the people they've killed have and so on. Using this method you can also develop methods for estimating the probability that one player will defeat another particular player.

    The matrix can be viewed as the adjacency matrix of a multigraph. The skill level of a player is determined by the number of paths starting at that player's vertex. If computing the eigenvector is too difficult you can come up with a reasonable approximation of the players' skill levels by only considering paths up to a certain length.
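A rough sketch of the eigenvector idea, using plain power iteration. Everything here is illustrative: the function name, the three-player kill matrix, and the convention that `kills[i][j]` counts how often player i killed player j are assumptions. Power iteration only converges to the dominant eigenvector when the matrix is well behaved (for example, every player has killed every other player at least once), and stopping after a few iterations corresponds to the "paths up to a certain length" approximation.

```python
def power_iteration(matrix, iterations=200):
    """Approximate the dominant eigenvector of a non-negative square matrix.

    Scores are normalised to sum to 1 after every step. Assumes the matrix
    is not all zeros and is connected enough for the iteration to converge.
    """
    n = len(matrix)
    scores = [1.0 / n] * n
    for _ in range(iterations):
        # A player's new score is the kill-weighted sum of the victims' scores.
        nxt = [sum(matrix[i][j] * scores[j] for j in range(n)) for i in range(n)]
        total = sum(nxt)
        scores = [x / total for x in nxt]
    return scores


if __name__ == "__main__":
    # Hypothetical data: kills[i][j] = number of times player i killed player j.
    kills = [
        [0, 5, 8],
        [3, 0, 6],
        [1, 2, 0],
    ]
    scores = power_iteration(kills)
    ranking = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    print(scores)
    print(ranking)  # player indices, most skilled first
```

With this data player 0 ranks first and player 2 last, since player 0 both kills the most and kills opponents who themselves score well.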

    --
    My only political goal is to see to it that no political party achieves its goals.
  20. WoW ranks by smallguy78 · · Score: 2, Interesting

    Possibly the worst ranking system I've seen is the World of Warcraft PVP ranking system. It basically rewards how often you play the game rather than the system which I'm used to in FPS - kills:death ratio.

    The system has 14 ranks, which get progressively harder as you get higher up. The amount of damage you deal to a person decides how much 'honor' you get. Your honor is added up and your rank improves each week.

    Now the problem with this is obviously the more you play, the more honor you get (much like XP).

    In order to overtake ranks above you and achieve the top rank, you have to compete with their honor. With people playing all day everyday this is virtually unachievable to anyone but students or the unemployed.

    Why reach the top rank? Well the rewards (weapons,armour) are within the top 3 in the game. This is the only reason I play (I don't have time to dedicate 5 hours a night to go to Molten Core or Black Wing Lair - and besides being a Paladin means I never get the decent sword as it will always go to a Warrior or Rogue. I get stuck with 'epic' armour that is meant for player-versus-environment healing).

    I'm hoping they make the system a lot more intelligent in the future. Not kill:death ratio based, but make the objectives give much greater honor rewards, reward teamplay: e.g. healing,traps,saps and also honor-per-time-played used instead of the stupid grind crap at the moment.

    --
    Nothing costs nothing
    1. Re:WoW ranks by cluke · · Score: 1

      MMORPGs aren't set up to reward skill, they are designed to reward time spent, and little else. Not knocking WoW, I love it, but I can only shrug at the time requirements for the best equipment and higher PVP ranks. I just have to accept I won't ever be able to compete at that level.

    2. Re:WoW ranks by BenjyD · · Score: 1

      An MMORPG with a monthly subscription that only rewards time spent? What a surprise!

  21. Next Up: XBox Live BCS Edition by hal2814 · · Score: 2, Funny

    This is all looking pretty complicated. It's interesting what all has to go on to try to keep players playing similarly skilled players. So I guess the next step is to have two human polls and a computer ranking system. Of course, they would only really be shooting to pit #1 against #2 and then just make as much money as possible off the rest of us. And don't even think about a playoff to determine the best players.

  22. Re:Crikey by rwven · · Score: 1

    I actually agree with you. This is EXACTLY what MS wants... For a bunch of people who largely proclaim hatred for MS, everyone's playing right into their hands...

  23. TrueSkill(TM) ? by HTH+NE1 · · Score: 1

    I don't know about the rest of you, but I get an uneasy feeling when companies start trademarking truth.

    --
    Oh, say does that Star-Spangled Banner entwine / The myrtle of Venus with Bacchus's vine?
  24. RTFSummary by pkhuong · · Score: 1

    That's not what they're doing. They're representing a player's skill by a tuple of an average and standard dev. I.E., it can make the difference between a player who has a lot of ups and downs, versus a consistent player, even though they have the same "average skill."

    Ah, what wouldn't one do for an FP...

    --
    Try Corewar @ www.koth.org - rec.games.corewar