Netflix Prize Contest Ends, Down To the Wire
suraj.sun updates us on the Netflix Prize now that the competition has officially closed. We discussed the new leader with one day to go in the contest: The Ensemble, taking the lead from long-time leader BellKor's Pragmatic Chaos, the first contestant to submit an entry that broke the 10% barrier. In the contest's final day, BellKor re-took the lead with 20 minutes to go, then The Ensemble apparently pulled a Michael Phelps with 4 minutes to go, squeaking ahead by 0.01%. At least so the leaderboard claims — but those numbers are posted by the competing teams. The NY Times reports that an official winner will not be named until September — Netflix needs that much time to pore through the complex entries and read the code. Netflix contacted BellKor on Sunday to tell them the team remained in first place; The Ensemble has had no such notification.
They realized that all movies starring Matthew McConaughey and Kate Hudson were actually the same movie. The compression on that alone was enough.
What they need to start is a contest to improve their incredibly lousy on-demand service; the Silverlight player is beyond terrible. All this effort (and money) over getting 10% more accurate guesses that the same guy who liked "Terminator" will like "Terminator 2" is nice and all, but it's a bit of a time waster, don't you think?
team a makes algorithm improvement b
team c takes algorithm improvement b and makes algorithm improvement b(+d)
team e takes algorithm improvement b(+d) and makes algorithm improvement b(+d)->f
the guy who squeaked out the extra 0.01% did that on top of someone else's code that eked out 0.05%, etc., ad nauseam
so how do you ascertain who won? all the teams won
they should take the final prize money and try to fractionate each incremental improvement in the algorithm and proportionally dole out the money that way. anything else is unfair
intellectual property law is philosophically incoherent. it is your moral duty to ignore it or sabotage it
Netflix calculates the score shown on the leaderboard from a set of rating predictions submitted by a team. The team does not, and will not, know the correct answers. For testing their algorithms, the teams use another dataset. The two datasets, part of the package made available to the competitors, are known as "qualifying" and "probe".
No, they lost to a German wearing a polyurethane suit and then declared they wouldn't race any more until the suits are banned.
The reason BellKor is still first is that the published scores are irrelevant. The scores that matter for the prize are based on an unpublished data set known only to Netflix (to prevent people submitting answers that are optimized for the challenge data and work poorly on everything else). On this secret data set, BellKor's algorithm apparently performs better than The Ensemble's.
main(c,r){for(r=32;r;) printf(++c>31?c=!r--,"\n":c<r?" ":~c&r?" `":" #");}
The contest has been going on for several years straight, and /. has had several stories about it. The article takes knowledge of the contest as a given.
See Wikipedia and Netflix's own site for details.
Where have you been?
[2009-07-26]New Leader In Netflix Prize Race With One Day To Go
[2009-07-26]Netflix Prize May Have Been Achieved
[2007-11-27]Anonymity of Netflix Prize Dataset Broken
[2007-11-14]Close but no Cigar for Netflix Recommender System
[2006-10-02]Build a Better Netflix, Win a Million Dollars?
[2008-11-22]Interest Still High In the Netflix Algorithm Competition
[2009-10-09]Netflix Prize Competitor Already Beats Netflix
etc..
No, that link you posted to a web comic we've all seen a hundred times is not "obligatory."
...I'm sitting here wondering how stable these algorithms are over long periods of time. I'm assuming that the "practice" data set and the "test" data set are equal in terms of time distribution (date of movie release; date of review). But 10 years from now, 20 years from now, I see the RMSE numbers slowly drifting upwards as the algorithm was optimized to the 2000-2009 data set, not the 2000-2020 data set or the 2000-2030 data set. But this is not my area of expertise so I'm wondering what others have to say on this topic.
They have invented this nifty thing called "the Google", you know?
They used root mean squared error (RMSE) for the competition.
Basically, the difference between the guess and the real answer for each vote is squared, giving a value between 0 and 16 (as the biggest error is 4, when you guess 5 on a vote that is 1 or vice versa). These squared errors are summed over every vote in the test and then divided by the number of votes in the test. Finally you take the square root of that.
The winning score in the competition is around 0.855, which is smaller than the target of 0.9514*0.9 (about 0.8563), where 0.9514 was the score of the Netflix algorithm.
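For anyone who wants to see that calculation spelled out, here is a rough Python sketch. The function name `rmse` and the four ratings are made up for illustration (they are not Netflix data, and this is not Netflix's scoring code); only the 0.9514 baseline and the 10% target come from the figures above.

```python
import math

def rmse(predictions, actuals):
    """Root mean squared error between two equal-length lists of ratings."""
    if len(predictions) != len(actuals) or not predictions:
        raise ValueError("need two non-empty sequences of equal length")
    # Square each error, average the squares, then take the square root.
    squared_errors = [(p - a) ** 2 for p, a in zip(predictions, actuals)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))

# Made-up ratings on the 1-5 scale, purely for illustration.
actual = [1, 3, 5, 4]
guess = [5, 3, 4, 4]   # the first guess is off by 4, the worst possible miss
score = rmse(guess, actual)

# The target: 10% better than the quoted baseline score.
BASELINE_RMSE = 0.9514
target = BASELINE_RMSE * 0.9

print(f"example RMSE: {score:.4f}")
print(f"target RMSE:  {target:.5f}")
```

Lower is better, so a score of around 0.855 just clears the target.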
I hope that explains everything.