Both the Elo and TrueSkill systems are derived from a generative model of the game outcome (i.e., the thing to be predicted) and, thus, both systems are well suited to the task of prediction.
In a way, matchmaking addresses this issue insofar as well-matched games are games where the outcome cannot be predicted, that is, where each outcome is equally likely.
Ralf Herbrich Microsoft Research Ltd., Cambridge, UK
Thanks a lot for your comments; you certainly understood the math behind TrueSkill(TM) really well! But there are a few details (that we did not want to bore people with on our web pages) which address all your concerns:
The first problem you point out is that Bayesian estimates (in general) asymptotically converge to the maximum likelihood estimates and, hence, in the TrueSkill system the sigmas would eventually go to zero and not allow for adaptation to changes in the player's "true" skill. This is true for stationary models but not for models with dynamics (think of a Kalman filter, for example). In fact, in the TrueSkill system we have a dynamics factor in our model equation that says that the skill of every gamer can go slightly up or down (zero mean, small variance) between two consecutive games. If you want to see this at work, please go to http://www.research.microsoft.com/mlp/trueskill/RankCalculator.aspx and put every gamer's Sigma at 0.5; then press Recalculate Skill Level Distribution and you will see that the Sigmas after the game are slightly bigger (they should be 0.504). We have worked out the asymptotic value of the uncertainty, sigma, theoretically and compared our solution to empirical findings on 3 million games; our asymptotic limit agreed with the empirical value to 3 digits of precision. This limit is reasonably large to allow constant adaptation to skill changes.
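To illustrate the Kalman filter analogy, here is a minimal sketch of why a dynamics term keeps the uncertainty away from zero. It is not the TrueSkill update: it assumes a one-dimensional skill that is observed directly with Gaussian noise after each game, and the names `dynamics_var` and `noise_var` are made up for this example.

```python
import math


def variance_after_game(var, dynamics_var, noise_var):
    """One predict/update cycle of a 1-D Kalman filter, tracking the variance only."""
    # Dynamics step: the skill may drift between games, so the uncertainty grows.
    predicted = var + dynamics_var
    # Update step: a noisy Gaussian observation of the skill shrinks it again.
    return predicted * noise_var / (predicted + noise_var)


def asymptotic_variance(dynamics_var, noise_var):
    """Fixed point of the cycle above.

    Solving v = (v + q) * r / (v + q + r) gives v^2 + q*v - q*r = 0,
    whose non-negative root is returned (q = dynamics_var, r = noise_var).
    """
    q, r = dynamics_var, noise_var
    return (-q + math.sqrt(q * q + 4.0 * q * r)) / 2.0


def run(var, dynamics_var, noise_var, games):
    """Apply the predict/update cycle for a number of games."""
    for _ in range(games):
        var = variance_after_game(var, dynamics_var, noise_var)
    return var


if __name__ == "__main__":
    # With dynamics the variance settles at a positive limit ...
    print(run(100.0, 0.01, 1.0, 1000), asymptotic_variance(0.01, 1.0))
    # ... and without dynamics (stationary model) it keeps shrinking towards zero.
    print(run(100.0, 0.0, 1.0, 10000))
```

In this toy model the limit depends only on the two variances, not on the starting uncertainty, which is the behaviour described above for the sigma of a long-time player.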
The second problem you point out is that of a conjugate prior. Unfortunately, there is no conjugate prior for the probit likelihood in any representation. The approximation method we are using is called "Expectation Propagation" (see http://research.microsoft.com/~minka/papers/ep/roadmap.html) or belief propagation in factor graphs. This IS an "incredibly nice" algorithm, to put it in your words :)
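The core step of Expectation Propagation is moment matching: the non-Gaussian product of a Gaussian and a probit factor is replaced by the Gaussian with the same mean and variance. The sketch below shows that single step for one variable, assuming a factor of the form Phi(x / beta). It is an illustration of the idea, not the message-passing schedule used in TrueSkill, and the function names are made up for this example.

```python
import math


def normal_pdf(z):
    """Density of the standard normal distribution."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def normal_cdf(z):
    """Cumulative distribution function of the standard normal distribution."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def moment_match_probit(mean, var, beta):
    """Mean and variance of p(x) proportional to N(x; mean, var) * Phi(x / beta).

    These are the parameters of the Gaussian that EP substitutes for the
    exact (non-Gaussian) posterior. Note that normal_cdf(z) underflows for
    very negative z, so a production version would need a stable ratio.
    """
    scale = math.sqrt(var + beta * beta)
    z = mean / scale
    v = normal_pdf(z) / normal_cdf(z)
    new_mean = mean + var * v / scale
    new_var = var - (var * var / (scale * scale)) * v * (v + z)
    return new_mean, new_var


if __name__ == "__main__":
    # Standard normal prior times Phi(x): the mean moves up, the variance shrinks.
    print(moment_match_probit(0.0, 1.0, 1.0))
```

For a standard normal prior and beta = 1 the exact product is a skew normal distribution, so the matched moments can be checked in closed form: mean 1/sqrt(pi) and variance 1 - 1/pi.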
The third problem you point out is that the whole correlation structure would be gigantic, and you are absolutely right: there are millions of people on Xbox Live, so this matrix would be a couple of million rows times a couple of million columns. However, we only save the diagonal of the matrix, that is, the uncertainty in the skill of every gamer. Please note, though, that we do build up the whole correlation structure (temporarily) for all gamers within a game (to make the approximation of the update step as exact as possible).
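The storage scheme can be sketched for two players as follows. This is a simplified stand-in, not the TrueSkill update: it assumes the game yields a real-valued, noisy observation of the skill difference, so the joint posterior is an exact 2x2 Gaussian. The function name, the `skills` dictionary and all numbers are hypothetical.

```python
def update_two_players(skill_a, skill_b, observed_diff, noise_var):
    """Condition two independent Gaussian skills on a noisy skill difference.

    Each skill is a (mean, variance) pair. The full 2x2 posterior covariance
    is formed, but only its diagonal is meant to be stored; the off-diagonal
    term is returned separately so the caller can see what is discarded.
    """
    (mean_a, var_a), (mean_b, var_b) = skill_a, skill_b
    # Variance of the predicted observation: skill_a - skill_b + noise.
    innovation_var = var_a + var_b + noise_var
    residual = observed_diff - (mean_a - mean_b)

    # Posterior means (standard Gaussian conditioning).
    post_mean_a = mean_a + var_a * residual / innovation_var
    post_mean_b = mean_b - var_b * residual / innovation_var

    # Full posterior covariance: two diagonal entries and one correlation term.
    post_var_a = var_a - var_a * var_a / innovation_var
    post_var_b = var_b - var_b * var_b / innovation_var
    post_cov_ab = var_a * var_b / innovation_var

    return (post_mean_a, post_var_a), (post_mean_b, post_var_b), post_cov_ab


if __name__ == "__main__":
    # Persistent store: one (mean, variance) pair per player, nothing else.
    skills = {"alice": (25.0, 4.0), "bob": (25.0, 4.0)}
    skills["alice"], skills["bob"], discarded_cov = update_two_players(
        skills["alice"], skills["bob"], observed_diff=2.0, noise_var=2.0
    )
    print(skills, discarded_cov)
```

The memory cost of the store grows linearly with the number of players, while the temporary matrix is only as large as the number of players in one game.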
Best wishes,
Ralf Herbrich & Thore Graepel, Microsoft Research Cambridge (UK)