The Elo Benchmark was submitted a second time. I wrote to Sonas about this. Apparently the rating system has to be seeded. He tried a different approach to calculating seed ratings and this performed better - pushing him one place higher in the rankings.
Great comments, thanks!
To address your most incisive comments (as I see them):
c) Competitions on Kaggle aren't polls. Competitions are framed in a way that requires serious data analysis. For example, the Eurovision Forecasting Comp requires contestants to forecast the voting matrix (who votes for whom) rather than simply who will win.
b, d, e) Getting people to make lots of predictions should separate the talented from the lucky. Having forecasters predict in the same place over and over is a good way to build a long enough history to discover the truly talented.
Thanks for the post. I hadn't heard of the DELPHI method - so now I'm a little bit wiser.
According to the Wikipedia article, the DELPHI method tries to get a panel of experts to agree on a single forecast. Kaggle (assuming the wisdom of crowds is the method of choice) cherishes diversity. It takes everybody's forecasts and 'combines' them in the hope that individual forecast errors will cancel out.
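As a rough illustration of forecast errors cancelling out, here is a minimal Python sketch. It assumes the simplest possible way of combining forecasts, a plain average, and it assumes each forecaster's error is independent and unbiased; neither assumption is a description of how Kaggle actually combines entries. The function name and all parameter values are made up for the example.

```python
import random
import statistics

def simulate_crowd(n_forecasters=50, truth=100.0, noise_sd=10.0, seed=0):
    """Compare the typical individual error with the error of the averaged forecast.

    Assumption: every forecast is the true value plus independent Gaussian noise.
    """
    rng = random.Random(seed)
    forecasts = [truth + rng.gauss(0, noise_sd) for _ in range(n_forecasters)]

    # Average absolute error of the individual forecasters.
    mean_individual_error = statistics.fmean(abs(f - truth) for f in forecasts)

    # Absolute error of the simple average of all forecasts.
    crowd_error = abs(statistics.fmean(forecasts) - truth)
    return mean_individual_error, crowd_error

mean_individual_error, crowd_error = simulate_crowd()
print(f"typical individual error: {mean_individual_error:.2f}")
print(f"error of averaged forecast: {crowd_error:.2f}")
```

The averaged forecast can never do worse than the typical individual here, and with independent errors it usually does much better. If everybody shares the same bias, averaging will not remove it, which is one reason diversity matters.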
Kaggle, unlike prediction markets, is designed to deal with complex tasks where data modeling is required.
For example, a prediction market can be used to get the crowd's view on who will win the Eurovision Song Contest. But Kaggle is asking contestants to forecast the voting matrix.
Funny and insightful - nice comment!
The post was terse, so I didn't explain that competitions on Kaggle aren't polls. Competitions are framed in a way that requires serious data analysis. For example, the Eurovision Forecasting Comp requires contestants to forecast the voting matrix (who votes for whom) rather than simply who will win.
I totally agree, past performance does not guarantee future performance. However, the more forecasts you get statisticians to make, the less likely it is that their prediction-history reflects chance rather than skill.
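To put a number on the chance-versus-skill point, here is a small Python sketch. It assumes each forecast is a yes/no call that pure guessing gets right half the time, and that forecasts are independent; the 70 per cent hit rate and the function name are just illustrative choices. It computes the probability that a forecaster with no skill at all would match a given track record.

```python
from math import comb

def prob_at_least(k, n, p=0.5):
    """Probability of at least k correct calls out of n when each is right with probability p."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))

# How likely is a 70% hit rate by luck alone, as the track record gets longer?
for n in (10, 50, 200):
    k = (7 * n) // 10  # number of correct calls needed for a 70% hit rate
    print(f"{k}/{n} correct by chance: {prob_at_least(k, n):.2e}")
```

With 10 forecasts, a pure guesser gets 7 or more right about 17 per cent of the time, so a short record says little. The same hit rate over 50 or 200 forecasts becomes far harder to explain by luck.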
I do believe we rely on predictions more today than at any time in history because we can make them more reliably (we have so much historical data to base them on).
Thanks everyone for your comments! Sounds like many of you are skeptical that 'wisdom of crowds' can work in this setting. It'll be an interesting experiment, but I'm encouraged by the Netflix Prize case study.
Out of interest, does anybody have any interesting ideas for prediction competitions? I'd love to hear from you either in the comments area or at statsbuff@gmail.com.
According to the leaderboard, Glicko is being beaten by ~5 per cent. Coulom's system better be pretty good!
How do you test current relative rankings without using them to make predictions?
Data only shows results - so there's no scope for gauging the margin of victory.
Did you see this? http://www.fivethirtyeight.com/2009/12/world-cup-2010-advancement.html Would be nice if he entered. Otherwise, it looks like there's enough info for somebody to enter on his behalf.
Worse than that, JP Morgan picked Slovenia to finish fourth. Ahead of teams like Germany and Slovenia.