A Look At Competitive Ranking Systems
Christopher Allen writes "Competitive ranking is used in sports, chess, and other games, and by online services, such as the new TrueSkill system used on Microsoft's Xbox 360 Live service. Rankings establish who is best, create fair competitive matches between players, and handicap players with differing skill levels. An article at 'Life with Alacrity' discusses a number of approaches to competitive ranking on the internet, each with different issues and advantages."
Chess rankings are awesome because the higher the bubble, the more skill is involved. I think all systems should use chess rankings so you can compare the bubble height of one game with the bubble of another. Starcraft had it right, except it allowed players to pick their partners instead of auto-matching them. I think automatching should search within +/-10 points during the first second, +/-20 points during the second, and so on up to a 200-point difference. Once you go past 200 points, the game isn't worth playing for rankings, nor is it a very fair competition.
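A rough sketch of that widening search, in Python. The function names, the plain list of waiting players' ratings, and the choice to take the closest candidate are all assumptions made for illustration; this is not how Starcraft or any real service does it.

```python
def window_for_second(second, step=10, max_diff=200):
    """Allowed rating gap during the given second of searching (1-based)."""
    return min(step * second, max_diff)


def automatch(rating, waiting_ratings, step=10, max_diff=200):
    """Widen the search by `step` points each second, up to `max_diff`.

    Returns (opponent_rating, second_found), or None if nobody in the
    pool is within `max_diff` points.
    """
    last_second = max_diff // step
    for second in range(1, last_second + 1):
        window = window_for_second(second, step, max_diff)
        candidates = [r for r in waiting_ratings if abs(r - rating) <= window]
        if candidates:
            # Hypothetical tie-break: take the closest rating in the window.
            return min(candidates, key=lambda r: abs(r - rating)), second
    return None  # past the cap the match isn't worth ranking


if __name__ == "__main__":
    print(automatch(1500, [1535, 1700]))
    print(automatch(1500, [1701]))
```

With a 1500-rated player and someone at 1535 waiting, the match only shows up in the 4th second, when the window reaches +/-40.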
God spoke to me.
There's an interesting ranking system emerging over at beatpaths.com. It started as a way of ranking NFL teams based on who has beaten whom, perhaps fueled by a Broncos fan frustrated with his team being ranked too low by other systems. There are plans to analyze the NBA and MLB, but it seems generally applicable to most competitions.
Another cool thing: dig the graphs straight out of graphviz, a nice open source tool for building graphs from textual specifications.
The article talks about the ranking system that I invented and implemented in A Tale in the Desert for use in the Discipline of Conflict. Great to see the coverage, but unfortunately the algorithm didn't work well in practice, and we've since abandoned it.
The problem was that it took too long to converge. Of course all the parameters can be adjusted for faster convergence, but then it becomes too easy to metagame! I concluded that any continuous system that collapses the result to a small amount of data (like a rank (Elo), or a rank+confidence (TrueSkill) or a bitvector (eGenesis)) after a match would suffer from this problem.
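To make the convergence-versus-metagaming tradeoff concrete, here is the standard Elo update in Python. This is plain textbook Elo, not the eGenesis algorithm, and the function names and the default K of 32 are just illustrative choices.

```python
def expected_score(rating_a, rating_b):
    """Standard Elo expected score for player A against player B."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def elo_update(rating_a, rating_b, score_a, k=32):
    """Return new (rating_a, rating_b) after one game.

    score_a is 1 for an A win, 0.5 for a draw, 0 for a loss.
    The whole match collapses into this one pair of numbers.
    """
    delta = k * (score_a - expected_score(rating_a, rating_b))
    return rating_a + delta, rating_b - delta


if __name__ == "__main__":
    print(elo_update(1500, 1500, 1))        # default K
    print(elo_update(1500, 1500, 1, k=64))  # larger K, bigger swing
```

Doubling K doubles how far each result moves the ratings, so a new player reaches a sensible rating in fewer games, but two players trading thrown games can also shift points twice as fast.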
"A Tale in the Desert II" replaced the eGenesis Ranking System with an asynchronous king-of-the-hill method. You start at rank 1, and must play someone at rank 1. It's asynchronous because you don't hold anyone up by not playing - the system never assigns a match. Instead, you just walk up to another rank 1 player and challenge them. They must agree to the match. The winner becomes rank 2, and the loser is "out". If you're out, you can reset back to rank 1, but only once/week. You can metagame your way through a few levels, but it takes an exponential number of co-conspirators to attain a given level. (I've simplified the system a bit. The full system is documented here.)
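The exponential cost is easy to see with a few lines of Python. This is a sketch of the simplified rules as described above only: it assumes every win must come against someone of the same rank and ignores the weekly reset, and the function names are made up for the example.

```python
def co_conspirators_needed(target_rank):
    """Helpers (besides the climber) needed to reach target_rank.

    Beating a rank-k player requires someone who already climbed to
    rank k, so the group doubles with every level: 2**(rank-1) players
    in total, minus the climber.
    """
    return 2 ** (target_rank - 1) - 1


def top_rank_from(n_players):
    """Highest rank a group can produce by pairing off equal-rank players."""
    rank, remaining = 1, n_players
    while remaining >= 2:
        # Each match yields one winner; losers are out, and an odd
        # player left without an opponent is simply dropped here.
        remaining //= 2
        rank += 1
    return rank


if __name__ == "__main__":
    for rank in range(1, 8):
        print(rank, co_conspirators_needed(rank))
```

Under these assumptions, getting one player to rank 5 takes 15 helpers, and rank 10 already takes 511.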
Unfortunately, the Conflict Discipline was only popular with a very small number of players, and it's being replaced in ATITD 3.
Note: A related problem is Judging Systems, where players rate in-game works of art. We've tried a number of algorithms there, and just recently have come up with one that