Chess Ratings — Move Over Elo
databuff writes "Less than 24 hours ago, Jeff Sonas, the creator of the Chessmetrics rating system, launched a competition to find a chess rating algorithm that performs better than the official Elo rating system. The competition requires entrants to build their rating systems based on the results of more than 65,000 historical chess games. Entrants then test their algorithms by predicting the results of another 7,809 games. Already three teams have managed to create systems that make more accurate predictions than the official Elo approach. It's not a surprise that Elo has been outdone; after all, the system was invented half a century ago, before we could easily crunch large amounts of historical data. However, it is a big surprise that Elo has been bettered so quickly!"
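For anyone who wants to see the shape of the exercise, here is a rough sketch of "fit on one set of games, predict another" using plain Elo as the baseline. The K-factor of 20, the starting rating of 1500, the squared-error score, and the toy games are all assumptions made up for illustration; the summary does not say how the competition scores entries, and the white-piece advantage is left out.

```python
def expected_score(r_a, r_b):
    """Elo expected score for player A against player B (logistic form)."""
    return 1.0 / (1.0 + 10 ** ((r_b - r_a) / 400.0))


def fit_elo(games, k=20.0, start=1500.0):
    """Apply sequential Elo updates over (white, black, white_score) tuples."""
    ratings = {}
    for white, black, score in games:
        r_w = ratings.get(white, start)
        r_b = ratings.get(black, start)
        e_w = expected_score(r_w, r_b)
        # Zero-sum update: what white gains, black loses.
        ratings[white] = r_w + k * (score - e_w)
        ratings[black] = r_b - k * (score - e_w)
    return ratings


def mean_squared_error(ratings, games, start=1500.0):
    """Average squared gap between predicted and actual white scores."""
    total = 0.0
    for white, black, score in games:
        p = expected_score(ratings.get(white, start), ratings.get(black, start))
        total += (score - p) ** 2
    return total / len(games)


# Toy data, invented purely for illustration (1.0 = white wins, 0.5 = draw).
train = [("A", "B", 1.0), ("B", "C", 0.5), ("A", "C", 1.0), ("C", "A", 0.0)]
test = [("A", "B", 1.0), ("C", "B", 0.5)]

ratings = fit_elo(train)
baseline = mean_squared_error({}, test)  # everyone rated equal: always predicts 0.5
fitted = mean_squared_error(ratings, test)
print(ratings, baseline, fitted)
```

A competing system would replace fit_elo and expected_score with its own rating and prediction rules and be judged on the same held-out games.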
Indeed, Sagarin has shown that applying Elo in sports where the winner is based on points scored is not optimal, since the average margin of victory is a better predictor of strength than won-loss record. But this has nothing to do with applying the Elo method to its original setting of chess, where the outcome of the game is only "win/draw/loss" and there is no margin of victory.
Ah man, no matter how inadequate the Elo system may be for chess, it's much worse seeing it applied to other games where it doesn't belong, which happens regrettably often. The trouble is that the Elo system depends on the premise that nothing affects the outcome of a game other than the skill of each player (and who gets the white pieces).
In chess, that assumption is a pretty good approximation to reality, since every tournament game is run the same way. But many games do have variations in rules or format across different events, such as different maps or races in a real-time strategy game, or different card pools in Magic: The Gathering. Then Elo ratings are biased by how often a player gets the chance to play to their strengths. Players in turn are compelled to game the system: "I should avoid this event because they're using Format X and my rating will stay stronger if I stick to Format Y." The Elo system is meant precisely to obviate that kind of gamesmanship: chess players should need to think only about the strengths of their opponents, which (in principle) will be weighted fairly when calculating rating adjustments. But if there are other competitive factors, which is true for almost any popular game invented in the last 30 years, Elo ratings become that much less meaningful.
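To put rough numbers on the format problem, here is a small sketch. It assumes a hypothetical player whose true strength is 1700 in Format X and 1500 in Format Y, always facing 1500-rated opponents, and asks what single rating the Elo curve would settle on for a given mix of formats. The figures and function names are made up for illustration.

```python
import math


def expected_score(r_a, r_b):
    """Elo expected score for player A against player B (logistic form)."""
    return 1.0 / (1.0 + 10 ** ((r_b - r_a) / 400.0))


def implied_rating(avg_score, r_opp):
    """Rating that would produce avg_score against r_opp under the Elo curve."""
    return r_opp - 400.0 * math.log10(1.0 / avg_score - 1.0)


def blended_rating(strength_x, strength_y, share_x, r_opp=1500.0):
    """Single rating implied by a mix of two formats with different true strengths."""
    avg = (share_x * expected_score(strength_x, r_opp)
           + (1.0 - share_x) * expected_score(strength_y, r_opp))
    return implied_rating(avg, r_opp)


# Hypothetical player: 1700 strength in Format X, 1500 in Format Y.
mostly_x = blended_rating(1700, 1500, share_x=0.9)
mostly_y = blended_rating(1700, 1500, share_x=0.1)
print(round(mostly_x), round(mostly_y))
```

Same player, same skill, but the single number moves by well over a hundred points depending only on which events they enter, which is exactly the incentive to dodge Format Y.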
"This algorithm runs in constant time. Come on, 2,147,483,648 is a constant..."
The first time I heard Bev Bevan had joined Sabbath I kind of went "WTF?". But they're all Brummies, along with a lot of heavy metal bands around that time. Priest, Magnum ... they probably all played in pubs together when they were 15.
Similarly you couldn't be a serious goth in the 80s unless you were from Leeds, or a flare-wearing floppy-mopped tossbag in the 90s if you weren't a Manc.
Confucius say, "Find worm in apple - bad. Find half a worm - worse."
Yes, and count how many of them are better than the Elo approach.
Quidnam Latine loqui modo coepi?
I don't think you understand what the discussion in this post is about. The game of chess has no element of randomness -- but the players do, and it's the players we are trying to model. Just because, on average, player A is better than player B, doesn't mean that player A will win every game. The fact is that the same player will play at different levels of ability on different days, and that is the randomness that is relevant to models trying to predict outcomes of chess games.
Basically all rating systems are based on the assumption that a player's ability in a given game fluctuates around an "average ability level" according to some distribution, and the goal of the rating system is to discover the average (and perhaps the spread) of this individual distribution. So even under the best conditions, the most the system can do is predict the outcome with an error coming from the distribution of abilities. Now assume the distributions are relatively wide -- then there will be a large statistical error even for the best system.
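A quick sketch of that argument, assuming each player's per-game performance is normally distributed around their average and that the better performance on the day wins (draws are ignored). The ratings and spreads are made-up numbers, and the normal model is just one possible choice of distribution.

```python
import math


def win_probability(mean_a, mean_b, spread_a, spread_b):
    """P(A outperforms B) when each performance is normal around its mean."""
    diff_sd = math.sqrt(spread_a ** 2 + spread_b ** 2)
    z = (mean_a - mean_b) / diff_sd
    # Standard normal CDF evaluated at z.
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def irreducible_error(p):
    """Expected squared error when predicting the true win probability p."""
    return p * (1.0 - p)


# Same 100-point gap in average ability, narrow vs. wide day-to-day spread.
narrow = win_probability(1600, 1500, 50, 50)
wide = win_probability(1600, 1500, 300, 300)
print(narrow, irreducible_error(narrow))
print(wide, irreducible_error(wide))
```

With the wide spread the win probability drifts toward a coin flip, and even a system that knows the true averages and spreads exactly is left with a squared error close to the maximum of 0.25 per game.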
Returning to the main point, the argument in the previous paragraph has nothing to do with whether chess is deterministic. If anything, the absence of randomness in the game itself makes things easier.