Slashdot Mirror


Elo Chess Rating System Topped By Proposed Replacements

databuff writes "About six weeks ago, Slashdot reported a competition to find a chess rating algorithm that performed better than the official Elo rating system. The competition has just reached the halfway mark and the best entries have outperformed Elo by over 8 per cent. The leader is a Portuguese physicist, followed by an Israeli mathematician and then a pair of American computer scientists."

24 of 102 comments

  1. Sweet by Anonymous Coward · · Score: 2, Funny

    Castle this.

  2. Re:what now? by cappp · · Score: 5, Interesting

    To be fair that owning represents a difference of 0.000629 in the RMSE between the two of them - hardly the sound thrashing those snooty mathematicians rightly deserve.
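
For readers unfamiliar with the metric mentioned in this comment, RMSE (root mean squared error) can be sketched as below. This is a minimal illustration, not the competition's scoring code: the thread does not say how the competition aggregates predictions, and the function name `rmse` and the sample numbers are assumptions made up for the example.

```python
import math

def rmse(predicted, actual):
    """Root mean squared error between predicted and actual scores."""
    if len(predicted) != len(actual) or not predicted:
        raise ValueError("need two non-empty sequences of equal length")
    squared_errors = [(p - a) ** 2 for p, a in zip(predicted, actual)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))

# Hypothetical predictions (expected scores) and results (1 = win, 0 = loss).
predictions = [0.5, 0.76, 0.3]
results = [1.0, 1.0, 0.0]
print(rmse(predictions, results))
```

A lower value means the predicted scores sat closer to the actual results, which is why a gap of 0.000629 between two entries is a very narrow lead.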

  3. Re:what now? by jhoegl · · Score: 3, Funny

    Yes, I agree. We should also fight amongst professions because we simply do not have enough to fight about.

    Long live Physicists and their physicisteries!

  4. Re:Interesting by JoshuaZ · · Score: 4, Informative

    This is a chess rating algorithm. The goal is to predict, given a matchup between two players with known histories, how they will likely fare in a game or series of games against each other. Elo is the standard rating system and has been for some time. These algorithms are improvements on that, so they predict better who will win. They have nothing to do with playing actual chess, so the Turk is irrelevant to this discussion (aside from the not minor issue that the operator has been dead for some time).
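
The prediction step described in this comment can be sketched with the standard Elo expected-score formula, which turns a rating difference into an expected result between 0 and 1. This is only an illustration of the baseline: the function name `expected_score` and the sample ratings are assumptions, and the competition entries presumably use different, more elaborate models.

```python
def expected_score(rating_a: float, rating_b: float) -> float:
    """Expected score of player A against player B under the Elo model.

    1.0 means a certain win for A, 0.5 an even game, 0.0 a certain loss.
    """
    return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / 400.0))

# Hypothetical ratings, chosen only for illustration.
print(round(expected_score(1600, 1600), 3))  # 0.5
print(round(expected_score(1800, 1600), 3))  # 0.76
```

The two players' expected scores always sum to 1, so a single number per matchup is enough to compare a prediction with the actual result.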

  5. Re:what now? by cappp · · Score: 5, Funny

    Whoah there partner, we don't want a full-scale fight between all professions - some of those guys are pretty buff. Pick off the mathematicians and physicists first because the law of the playground must be respected - the small, weak, bifocaled, or curiously gifted with numbers should be taken down first. Then nap time.

  6. Can't be so by Waffle+Iron · · Score: 5, Funny

    A friend called me on my telephone line and told me out of the blue that the Elo rating system had been bested. I was so stunned I almost turned to stone. I said, "Dude, don't bring me down!". But the news slowly sunk in, and now I can't get it out of my head. But I'll tell you what, the jury is still out. I think there's gonna be a showdown, and then Elo will be back on top.

    1. Re:Can't be so by definate · · Score: 3, Funny

      I don't get it.

      REVEAL YOUR SECRETS!

      Wow, Slashdot won't allow me to post with that ratio of non-caps to caps. So I need to write all of this to correct the ratio. The error says "Filter error: Don't use so many caps. It's like YELLING.".

      Dear robotic automated moderating overlord,
      I know it's like yelling, that's the effect I was going for. Obviously your algorithm is shit, because you don't seem to understand context... or love.
      Sincerely,
      definate

      --
      This is my footer. There are many like it, but this one is mine.
    2. Re:Can't be so by halestock · · Score: 2, Funny

      I dunno, I heard the new system has an IQ of 1001, has a jumpsuit on, and is also a telephone.

    3. Re:Can't be so by Paradise+Pete · · Score: 2, Informative

      REVEAL YOUR SECRETS!

      His post is chock full o' snippets of ELO songs.

  7. Not surprising at all by IICV · · Score: 5, Insightful

    This is entirely unsurprising. The Elo system was, in a sense, designed to be easily calculable in a time before things like computers or databases or data mining were especially common (after all, it was adopted by the US Chess Federation in 1960!), and it hasn't been revised much, if at all, since then. Of course statisticians using modern methods and number crunching capabilities and huge databases of both game results and game moves are going to be able to beat it by a lot - this isn't like the Netflix prize, where a bunch of teams were competing to improve something that had been in active development up until that very year.
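
To show how little computation the Elo system needs, here is a minimal sketch of the standard update after a single game. The K-factor of 32 is an assumption for illustration (federations use various values), and the function name `elo_update` is made up for this example.

```python
def elo_update(rating_a, rating_b, score_a, k=32.0):
    """Return the new (rating_a, rating_b) after one game.

    score_a is 1.0 if A won, 0.5 for a draw, 0.0 if A lost.
    k is the K-factor; 32 is an assumed value, not an official one.
    """
    # Expected score of A from the rating difference alone.
    expected_a = 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / 400.0))
    # A gains what B loses, in proportion to how surprising the result was.
    delta = k * (score_a - expected_a)
    return rating_a + delta, rating_b - delta

# Two hypothetical 1500-rated players; A wins.
print(elo_update(1500, 1500, 1.0))  # (1516.0, 1484.0)
```

Only the two current ratings and the result go in, so the whole thing can be done by hand with a lookup table, which fits the point about its pre-computer origins.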

  8. Re:Errata by Chuck+Chunder · · Score: 4, Funny

    Not any more.

    --
    Boffoonery - downloadable Comedy Benefit for Bletchley Park
  9. Whole History Rating by Vintermann · · Score: 4, Interesting

    The French computer scientist Rémi Coulom, well-known for the pioneering computer Go program Crazy Stone, has published some very interesting research on this issue. He claims not only to beat Elo, but also Glicko, Microsoft's TrueSkill and decayed-history approaches.

    I was going to see if I could implement his ideas for the competition, since he's not going to participate himself. But it doesn't look like I have time for it.

    Here's the paper in case anyone wants to give it a try. I suspect the approach is a bit more solid than the ad-hoc approaches of the quants.

    --
    xkcd is not in the sudoers file. This incident will be reported.
    1. Re:Whole History Rating by Vintermann · · Score: 3, Informative

      Glicko isn't designed to take advantage of all the information that's available in this competition. To calculate your new Glicko rating, you just need the Glicko ratings of both players + the result. I bet all serious contenders in the competition use the whole history somehow. (I talked with one who uses a decayed history scheme; he beats Glicko).

      As to the leaderboard, it's really not so clear. Almost certainly, some of the contenders are accidentally overfitting to the leaderboard test data.

      --
      xkcd is not in the sudoers file. This incident will be reported.
  10. Obvious question by glwtta · · Score: 3, Funny

    So, how did they rank the entries?

    --
    sic transit gloria mundi
  11. Re:Errata by Anonymous Coward · · Score: 2, Funny

    No, Portrugal. Between Spairn and the Atlantirc.

  12. Portrugese by Anonymous Coward · · Score: 5, Funny

    True facts about Portrugese:
    1. More than 250 million peoprle spek Portruguese, making it the firfth most sproken language in the wrorld.
    2. Portrugese is an adjective describing thrings relatd to Portrugal.
    3. Christropher Colurmbus spoke Portrugese.
    4. Portrugese is the officiral langurage of ther Repulic rof Angorlra.
    5. Hery trhe Navgatror, a Portugese prirnce, was in lrge partr resposible for Portugese effortrs durirng the age of explorartion.
    1. Re:Portrugese by xtracto · · Score: 5, Funny

      Hery trhe Navgatror, a Portugese prirnce, was in lrge partr resposible for Portugese effortrs durirng the age of explorartion.

      Wait just a second! you cannot go changing the subject suddenly like that... focus!, we are talking about Portrugese here!

      --
      Ubuntu is an African word meaning 'I can't configure Debian'
    2. Re:Portrugese by Anonymous Coward · · Score: 5, Funny

      That's just a typo! Don't be such a grammar nazi!

  13. and what about rock/paper/scissors by Paradigma11 · · Score: 2, Interesting

    Many rating systems seem to assume transitive dominance structures. If you are playing rock/paper/scissors, no rating would be sufficient to predict the outcome of a tournament. Many games (using Battle.net, TrueSkill...) probably are not interested in finding nontransitive structures, since players want to be the best and fans want to know who is the best, which is kind of pointless with r/p/s.
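
The point about transitivity can be checked with a small sketch. It assumes three hypothetical "players" who each always throw the same move, and a predictor that simply picks the higher-rated player. Trying every possible ranking shows that a single rating number per player never gets all three matchups right.

```python
from itertools import permutations

PLAYERS = ("rock", "paper", "scissors")
# (winner, loser) pairs: the dominance relation forms a cycle.
BEATS = {("rock", "scissors"), ("scissors", "paper"), ("paper", "rock")}

def correct_predictions(ratings):
    """Count matchups where the higher-rated player is the actual winner."""
    return sum(1 for winner, loser in BEATS if ratings[winner] > ratings[loser])

def best_possible():
    """Best score over all strict rankings of the three players.

    Only the ordering of the ratings matters to this predictor, and a tie
    cannot pick a winner, so checking every ranking is exhaustive.
    """
    best = 0
    for order in permutations(PLAYERS):
        ratings = {player: rank for rank, player in enumerate(order)}
        best = max(best, correct_predictions(ratings))
    return best

print(best_possible())  # 2
```

At most two of the three matchups can be predicted correctly, because any ranking puts someone on top who still loses to one of the players below.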

  14. Re:what now? by drewhk · · Score: 2, Informative

    Yeah, and his name is Él(Lowercase O-double acute), not Elo, but I understand that "Hungarian umlauts" cause significant cognitive stress :)

    Even for Slashdot it seems...

  15. Re:what now? by mutu310 · · Score: 2, Informative

    Actually he was born Élő Árpád Imre but changed his name to a more Americanized Arpad Emrick Elo.

  16. Re:what now? by sleeping143 · · Score: 3, Informative

    Careful, those physicists have arsenals of powerful lasers at their disposal...

  17. Re:what now? by Another,+completely · · Score: 3, Insightful

    But do they have sharks on which to mount them?

  18. Re:what now? by turbidostato · · Score: 2, Informative

    "But do they have sharks on which to mount them?"

    We must keep them from teaming up with Biologists at all costs!