Slashdot Mirror


Collaborative Filtering and the Rise of Ensembles

igrigorik writes "First the Netflix challenge was won with the help of ensemble techniques, and now the GitHub challenge is over, and more than half of the top entries are also based on ensembles. Good knowledge of statistics, psychology and algorithms is still crucial, but the ensemble technique alone has the potential to make the collaborative filtering space a lot more, well, collaborative! Here's a look at the basic theory behind ensembles, how they shaped the results of the GitHub challenge, and how this pattern can be used in the future."

1 of 58 comments

  1. Re:I'm sorry by Trepidity · Score: 4, Informative

    Yeah, the term dates back at least to the 1990s. The classic survey paper on the subject (over 1000 citations!) is "Ensemble Methods in Machine Learning" [pdf] by Tom Dietterich (2000), for those who want a quick overview. Be warned that some of its specific conclusions are now dated; for example, a *lot* has been written since then, in both statistics and machine learning, on what boosting "really" is and why it works.

    Dietterich presents the machine-learning view, focused on algorithms, combination of predictions, iterative refinement, etc. The best survey from a statistical perspective is probably Ch. 16 of this book by three Stanford profs, some of which you can probably read on Google Books.
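
For anyone who wants a concrete picture of "combination of predictions", here is a minimal sketch of the simplest kind of ensemble: a plurality vote over the labels produced by several classifiers. It is not taken from Dietterich's paper or from any challenge entry. The function names and the three model outputs are made up for illustration, and the toy data is chosen so that each model errs on a different example.

```python
from collections import Counter


def majority_vote(predictions):
    """Combine per-model label lists into one label per example by plurality vote."""
    combined = []
    for votes in zip(*predictions):
        # most_common(1) gives the label with the highest count;
        # on a tie, the label that appeared first among the votes wins.
        combined.append(Counter(votes).most_common(1)[0][0])
    return combined


def accuracy(predicted, truth):
    """Fraction of examples where the predicted label matches the true label."""
    hits = sum(1 for p, t in zip(predicted, truth) if p == t)
    return hits / len(truth)


# Hypothetical outputs of three classifiers on five examples (illustrative only).
truth = [1, 0, 1, 1, 0]
model_a = [1, 0, 1, 0, 0]  # wrong on the 4th example
model_b = [1, 1, 1, 1, 0]  # wrong on the 2nd example
model_c = [1, 0, 1, 1, 1]  # wrong on the 5th example

ensemble = majority_vote([model_a, model_b, model_c])

for name, preds in [("a", model_a), ("b", model_b), ("c", model_c), ("ensemble", ensemble)]:
    print(name, accuracy(preds, truth))
```

Each individual model gets four of the five examples right, but because their mistakes do not overlap, the vote recovers every label. When the models make the same mistakes, voting buys nothing, which is why the diversity of ensemble members matters so much in practice.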