A Look At Competitive Ranking Systems

Posted by Zonk on Saturday January 14, 2006 @08:31AM from the i-win-no-i-win dept.

Christopher Allen writes "Competitive ranking is used in sports, chess, and other games, and by online services, such as the new TrueSkill system used on Microsoft's Xbox 360 Live service. Rankings establish who is best, create fair competitive matches between players, and handicap players with differing skill levels. An article at 'Life with Alacrity' discusses a number of approaches to competitive ranking on the internet, each with different issues and advantages."

8 comments

Chess is my favorite by CrazyJim1 · 2006-01-14 09:19 · Score: 1

Chess rankings are awesome because the higher the bubble, the more skill is involved. I think all systems should use chess rankings so you can compare the bubble height of one game with the bubble of another. Starcraft had it right, except it allowed players to pick their partners instead of auto-matching them. I think automatching should search for opponents within +/-10 points during the first second, +/-20 points during the second, and so on up to a 200-point difference. Once you go past 200 points, the game isn't worth playing for rankings, nor is it a very fair competition.
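
A minimal sketch of the widening search described above, assuming the window grows by 10 points per whole second waited and stops at 200. The function names, the `waiting` dictionary, and the closest-rating tie-break are all hypothetical choices made for illustration, not part of any real matchmaking service.

```python
def search_window(second, step=10, max_diff=200):
    """Allowed rating difference during the given second of waiting (1-based)."""
    if second < 1:
        raise ValueError("second must be 1 or greater")
    return min(step * second, max_diff)


def find_opponent(my_rating, waiting, second):
    """Pick the closest-rated waiting player inside the current window.

    `waiting` maps a player name to a rating. Returns None when nobody
    is close enough yet, so the caller can retry one second later.
    """
    window = search_window(second)
    candidates = [
        (abs(rating - my_rating), name)
        for name, rating in waiting.items()
        if abs(rating - my_rating) <= window
    ]
    return min(candidates)[1] if candidates else None
```

Calling `find_opponent` once per second with an increasing `second` gives the behaviour in the comment: close matches are found quickly, and nobody more than 200 points away is ever paired.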

--
God spoke to me.
1. Re:Chess is my favorite by ChristopherA · 2006-01-14 09:58 · Score: 2, Interesting
  
  An interesting comment was posted by F. Randall Farmer about ever escalating ELO (chess-style) rankings:
  A data point on ELO cheating for you: Yahoo! Games uses ELO rankings for several of their two-player games. Before recent abuse mitigation changes, some people used robots to accumulate scores in excess of 6,000,000 points. The abuse-the-ranking game had become a totally separate competition.
  For now, Yahoo! has capped the ELO scores at 3,000 (I think.) This removed most of the cheating incentive.
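
For readers who have not seen the arithmetic, here is a rough sketch of a standard Elo update with a ceiling applied afterwards. The K-factor of 32 and the way the cap is enforced are assumptions made for illustration; the comment above only says scores were capped at about 3,000, and nothing here is taken from Yahoo!'s actual implementation.

```python
def expected_score(rating_a, rating_b):
    """Expected score of player A against player B (logistic Elo model)."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def update(rating_a, rating_b, score_a, k=32, cap=3000):
    """Return both players' new ratings after one game.

    score_a is 1 for an A win, 0.5 for a draw, 0 for an A loss.
    k and cap are illustrative defaults, not values from the source.
    """
    e_a = expected_score(rating_a, rating_b)
    new_a = rating_a + k * (score_a - e_a)
    new_b = rating_b + k * (e_a - score_a)  # B moves by the opposite amount
    # Clamp at the ceiling so farming wins cannot push a rating past it.
    return min(new_a, cap), min(new_b, cap)
```

Without the `min(..., cap)` step, a bot can keep gaining points for as long as it can find slightly higher-rated accounts to beat, which is consistent with the runaway scores described above.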
Interesting stuff going on at beatpaths.com by Swinging+Man · 2006-01-14 09:35 · Score: 3, Interesting

There's an interesting ranking system emerging over at beatpaths.com. It started as a way of ranking NFL teams based on who has beaten whom, perhaps fueled by a Broncos fan frustrated with his team being ranked too low by other systems. There are plans to analyze the NBA and MLB, but it seems generally applicable to most competitions.

Another cool thing: dig the graphs straight out of graphviz, a nice open source tool for building graphs from textual specifications.
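
To illustrate the "who has beaten whom" idea and the kind of textual specification graphviz reads, here is a small sketch. It treats a "beatpath" simply as a chain of wins leading from one team to another, which is an assumption based on the description above and not the site's actual algorithm. The team names in the usage note are placeholders.

```python
def to_dot(results):
    """Render (winner, loser) pairs as a Graphviz DOT digraph string."""
    lines = ["digraph beatpaths {"]
    for winner, loser in results:
        lines.append(f'    "{winner}" -> "{loser}";')
    lines.append("}")
    return "\n".join(lines)


def has_beatpath(results, start, target):
    """True if a chain of wins leads from `start` to `target`."""
    wins = {}
    for winner, loser in results:
        wins.setdefault(winner, set()).add(loser)
    stack, seen = [start], {start}
    while stack:  # depth-first search over the win graph
        team = stack.pop()
        for beaten in wins.get(team, ()):
            if beaten == target:
                return True
            if beaten not in seen:
                seen.add(beaten)
                stack.append(beaten)
    return False
```

With `results = [("A", "B"), ("B", "C")]`, `has_beatpath(results, "A", "C")` is true because A beat B and B beat C, and the string from `to_dot(results)` can be saved to a file and passed to the `dot` command to draw the graph.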
1. Re:Interesting stuff going on at beatpaths.com by ChristopherA · 2006-01-14 10:37 · Score: 1
  
  I'd not run into beatpaths before -- it is quite interesting. It appears to be sort of a tourney system for when you can't complete a full set of round-robin or double-elimination competitions, as happens in the NFL during the fall season.
  What is also interesting to me is that it introduces a goal for ranking systems that I'd not thought of before -- prediction. The purpose of the beatpath system is actually focused on predicting the outcome of the next set of weekend games.
2. Re:Interesting stuff going on at beatpaths.com by Ralf+Herbrich · 2006-01-14 22:17 · Score: 1
  
  Both the ELO and TrueSkill systems are derived from a generative model of the game outcome (i.e., the thing to be predicted) and, thus, both systems are well suited for the task of prediction.
  In a way, matchmaking addresses this issue insofar as well matchmade games are games where the outcome cannot be predicted, that is, where each outcome is equally likely.
  Ralf Herbrich
  Microsoft Research Ltd., Cambridge, UK
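
As a worked illustration of the point above for the ELO case, here is the logistic form commonly used for ELO ratings, ignoring draws. The symbols R_A and R_B for the two players' ratings are notation introduced here, not taken from the comment. The predicted win probability is exactly one half only when the ratings are equal, which is the sense in which a well-matched game is one whose outcome cannot be predicted.

```latex
P(A \text{ beats } B) = \frac{1}{1 + 10^{(R_B - R_A)/400}},
\qquad
P(A \text{ beats } B) = \tfrac{1}{2} \iff R_A = R_B .
```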
eGenesis Ranking System by Teppy · 2006-01-15 03:41 · Score: 3, Interesting

The article talks about the ranking system that I invented and implemented in A Tale in the Desert for use in the Discipline of Conflict. Great to see the coverage, but unfortunately the algorithm didn't work well in practice, and we've since abandoned it.

The problem was that it took too long to converge. Of course all the parameters can be adjusted for faster convergence, but then it became too easy to metagame! I concluded that any continuous system that collapses the result to a small amount of data (like a rank (ELO), or a rank+confidence (TrueSkill) or a bitvector (eGenesis)) after a match would suffer from this problem.

"A Tale in the Desert II" replaced the eGenesis Ranking System with an asynchronous king-of-the-hill method. You start at rank 1, and must play someone at rank 1. It's asynchronous because you don't hold anyone up by not playing - the system never assigns a match. Instead, you just walk up to another rank 1 player and challenge them. They must agree to the match. The winner becomes rank 2, and the loser is "out". If you're out, you can reset back to rank 1, but only once/week. You can metagame your way through a few levels, but it takes an exponential number of co-conspirators to attain a given level. (I've simplified the system a bit. The full system is documented here.)
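
The "exponential number of co-conspirators" claim can be checked with a short sketch. It assumes the simplified rules as stated above (every match is between two players of equal rank, the winner moves up one rank, the loser is out) and ignores the weekly reset. The function names are made up for illustration.

```python
def players_needed(rank):
    """Players required to produce one player at `rank`, counting that player."""
    if rank < 1:
        raise ValueError("rank must be 1 or greater")
    # Each rank above 1 requires beating someone who reached the same rank,
    # so the requirement doubles with every level.
    return 2 ** (rank - 1)


def highest_rank(num_players):
    """Best rank a group can reach if all of them cooperate."""
    rank, remaining = 1, num_players
    while remaining >= 2:
        remaining //= 2  # pair players up; half of them advance
        rank += 1
    return rank
```

So a ring of 8 colluding players tops out at rank 4, while reaching rank 11 would take 1,024 of them.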

Unfortunately, the Conflict Discipline was only popular with a very small number of players, and it's being replaced in ATITD 3.

Note: A related problem is Judging Systems, where players rate in-game works of art. We've tried a number of algorithms there, and just recently have come up with one that
1. Re:eGenesis Ranking System by Teppy · 2006-01-15 03:44 · Score: 2, Insightful
  
  Screwed up that last link. It should say:
  
  Note: A related problem is Judging Systems, where players rate in-game works of art. We've tried a number of algorithms there, and just recently have come up with one that seems to work.
  
  Yeah, I know, "preview button", blah.
2. Re:eGenesis Ranking System by MilenCent · 2006-01-15 14:30 · Score: 1
  
  Dude! When I heard about your system I went out and took a look at it.
  
  I think that's a very cool system, even if it was unsuitable for the game you made it for, not the least reason for which is because I came up with something similar to it once, heh. I get the feeling we're on something of the same page on these things.
  
  If you ever want to discuss such things, let me know.