Slashdot Mirror


Netflix Announces Second Data Mining Contest

John Snodgrass writes "Neil Hunt, Chief Product Officer at Netflix, has announced on the Netflix Prize Forums that the company is planning to hold a new data mining competition. The second competition will have some twists and is expected to be shorter in duration. It will feature two grand prizes, to be awarded at 6 and 18 months. A previous competitor still active on the board has already dubbed it 'The Sparse Matrix: Reordered' and 'The Sparse Matrix: Factorizations.'"

2 of 56 comments

  1. Re:Usefulness? by sottitron · Score: 4, Interesting

    I used to think I was unique in the rare movies and music I liked, until I met someone whose collection was almost identical to mine. On top of that, we even owned some of the same clothes. Netflix researches these data mining techniques because our tastes really do cluster into groups. For some, it might be because they like De Niro films and Spaghetti Westerns. For others, it might be that they like the same two screenwriters, though they may never know it. The payoff for Netflix in getting this right is that a customer who uses the recommendations to fill their queue is much less likely to cancel Netflix anytime soon.

  2. Re:Contests by Trepidity · Score: 4, Interesting

    It also allows the researchers to "cheat" a bit via an argument from authority, which is not always good, but does at least make the researcher's job easier. A big issue in data mining is that it isn't a purely technical field, but one with both conceptual and technical issues. The overarching goal is something like "get useful and/or interesting information out of data." But what is "useful," what is "interesting," and how do we measure whether we've gotten it? Usually you have to defend why your problem is the right one, why your metric is the right way to measure success on it, and so on. Working on the Netflix competition lets you sidestep all that, because Netflix has already decreed exactly what the goal is and what performance metric will be used to judge success at that goal, leaving only the technical problems.