New Leader In Netflix Prize Race With One Day To Go
brajesh writes "The Netflix Prize, an algorithm competition to improve the Netflix Cinematch recommendation system by more than 10%, has a new leader, The Ensemble, just one day before the competition ends. The 30-day race to the end was kicked off after BellKor's Pragmatic Chaos submitted the first entry to break the 10% barrier, with the results showing a 10.08% improvement. The Ensemble, made up of three teams who chose to join forces ('Grand Prize Team,' 'Opera Solutions' and 'Vandelay United'), has managed to overtake BellKor with a score of 10.09%, an improvement of .01% over the former leaders. From the article on Techcrunch: 'The competition will end [today], so teams still have a little bit of time left to make their last-second submissions, but things are looking good for The Ensemble. This has to be absolutely brutal for team BellKor.'"
that other websites should do this as well.
Slashdot, for instance, could have a contest to unbreak their fucking code by 10%.
What did they do, make sure that all of Uve Boll's movies never came up as a "Recommended for you" movie?
If our elected representatives no longer represent us, do we still live in a Democracy?
Back when I first began using Amazon.com, I never bought a book based on the recommended items. I felt the recommendations were trite, ill-advised, and typically only peripherally related to the item I was buying.
Then the recommendations got better. Much better. I started to find myself buying things right out of the recommended section, and the product combination deals also became very tempting.
If Netflix can turn their recommendation engine into something similar, they will be sitting on a goldmine. As they say, people hate being sold to but they love buying.
Rather than declaring their best result early, the BellKor team should have employed a bit of strategy and only declared a lesser result (if any). That would give the other teams something to aim at, without giving away their best results. Those would be held back right up until the last minute and then submitted, so that other teams would not have time to make any further improvements (in fact, maybe this IS what they're doing). It's been a successful bidding strategy on eBay for years, so why wouldn't it translate into other competitive areas too?
politicians are like babies' nappies: they should both be changed regularly and for the same reasons
Uwe Boll. It only sounds like a v because he's German.
It does seem like a slight flaw in the rules if there is only one 30-day countdown timer. That is, if a competing team can hold off until the last moment to release their version that bests the current leader, as is the case here. Now that this improvement has been made public, there should be something like a 10-day response time for the other competing teams.
Why not wait another day before submitting the improvement? All they did now was give the other team one day to respond, and if that team succeeds, I doubt The Ensemble will be able to submit yet another improvement. So why not simply wait until an hour or so before the deadline, or am I missing something about the rules, e.g. any submitted improvements prolong the deadline by one day?
For the grand prize, there was a final 30-day countdown from the time the first entry that achieved greater than 10% was received, which was a month ago. So it seems like this will indeed come down to an eBay-like sniping situation in the last few hours.
I wouldn't feel too sorry for BellKor/KorBell though -- they've got many, many best paper awards at conferences and a huge degree of publicity out of the whole endeavor. In fact, in KDD 2009, they detailed most of the methods that most likely got them to the top -- i.e. they incorporated the fact that tastes and preferences drift over time. Simple, in retrospect of course. If you have an ACM subscription, you can read the 2009 paper here.
Plus, since they work for AT&T/Yahoo Research, I remember Yehuda Koren stating that the money wouldn't have gone to them anyway -- possibly a large bonus, but I think they're entitled to that anyway. So I wouldn't feel too sorry for them.
An old-timer with old-timey ideas.
Many teams actually combined multiple methods to get a better score. In fact, "BellKor's Pragmatic Chaos" is a combination of three teams, I'm guessing - BellKor, BigChaos and Pragmatic Theory.
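To illustrate why combining methods helps, here is a minimal sketch of a linear blend of two predictors. The ratings, the two "models" and the grid search over weights are all made up for illustration; this is not the method any of the named teams actually used.

```python
import math

def rmse(predictions, ratings):
    """Root mean squared error between predicted and actual ratings."""
    total = sum((p - r) ** 2 for p, r in zip(predictions, ratings))
    return math.sqrt(total / len(ratings))

def blend(preds_a, preds_b, weight):
    """Linear blend: weight * A + (1 - weight) * B, rating by rating."""
    return [weight * a + (1 - weight) * b for a, b in zip(preds_a, preds_b)]

# Hypothetical held-out ratings and two hypothetical models' predictions.
actual = [4.0, 3.0, 5.0, 2.0, 4.0]
model_a = [4.6, 3.4, 4.5, 2.7, 4.2]  # tends to overshoot
model_b = [3.6, 2.8, 5.4, 1.5, 3.9]  # tends to undershoot

# Try blend weights 0.00, 0.05, ..., 1.00 and keep the one with the lowest RMSE.
candidates = [i / 20 for i in range(21)]
best_weight = min(candidates, key=lambda w: rmse(blend(model_a, model_b, w), actual))
best_rmse = rmse(blend(model_a, model_b, best_weight), actual)

print(f"model A RMSE: {rmse(model_a, actual):.4f}")
print(f"model B RMSE: {rmse(model_b, actual):.4f}")
print(f"blend RMSE:   {best_rmse:.4f} at weight {best_weight:.2f}")
```

Because the two toy models err in opposite directions, their errors partly cancel and the blend beats either one alone. In practice the weights would be fit on data held out from training, not on the ratings being scored.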
Also, it helps to remember that what's posted on the leaderboard is the result of the "quiz" set - half of the actual set of recommendations you're asked to make. The other half, the "test set," is used for final judging. With such a small difference between BellKor's Pragmatic Chaos and The Ensemble on the quiz set (.0001 RMSE), the test set rank may actually end up reversed.
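For a sense of scale, the arithmetic linking the leaderboard percentages to that RMSE gap can be sketched as follows. The baseline value is an assumption (the Cinematch quiz-set RMSE is not stated in this thread), and the helper names are hypothetical.

```python
def improvement_pct(baseline_rmse, rmse):
    """Percentage improvement of an RMSE over the baseline RMSE."""
    return 100.0 * (baseline_rmse - rmse) / baseline_rmse

def rmse_for_improvement(baseline_rmse, pct):
    """RMSE a team needs in order to show a given percentage improvement."""
    return baseline_rmse * (1 - pct / 100.0)

# ASSUMPTION: Cinematch's quiz-set RMSE; this number does not come from the thread.
BASELINE_RMSE = 0.9514

bellkor_rmse = rmse_for_improvement(BASELINE_RMSE, 10.08)
ensemble_rmse = rmse_for_improvement(BASELINE_RMSE, 10.09)
gap = bellkor_rmse - ensemble_rmse

print(f"BellKor's Pragmatic Chaos: {bellkor_rmse:.4f}")
print(f"The Ensemble:              {ensemble_rmse:.4f}")
print(f"gap:                       {gap:.6f}")
```

Under that assumed baseline, a 0.01 percentage point lead works out to a gap of roughly 0.0001 RMSE, which is in line with the figure quoted above.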
I claim first use of "Error No. 0B" - or "No. 0B error." It'll be the new ID 10T!
My question is whether there will be any winner at all other than Netflix. One of the rules for the competition was that you could not form multiple teams. This was to prevent people from gaining multiple submissions per day. Otherwise a five person group could create 30 teams and thus be able to submit 30 attempts per day. I believe both teams that have exceeded the 10% threshold and thus are eligible for the grand prize are composed of members from other teams and could be disqualified.
'The tyrant will always find pretext for his tyranny.' - Aesop's Fables
Call me crazy, but if you actually *read* the rules it says the contest is going until at least October 2nd, 2001.
Actually, yes, I think I will call you crazy.
coding is life
Okay, you're crazy :-)
So, there's approximately minus 2855 days left?
I just want to know if Netflix gets to keep John Titor's time machine ... the time frame (2001) is right ...
They could improve the predictive value immensely if they allowed me and my wife to each rank the movies we watch together separately. With the current system, some movies are rated by just me, some by just her, and some have a consensus rating. It leads to a dataset full of garbage.
It's also true that the winner is not the person who gets the highest score on the leaderboard. Most people seem to miss this.
The leaderboard gives the score on the QUIZ dataset, which is half of the answers that the team submits. The WINNER of the million dollars is the person who does best on the TEST dataset, the other half of the answers they submit. Nobody knows how well these guys are doing on the TEST set; either team could be overfitting the quiz set.
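A toy sketch of how that can play out: one submission is scored twice, once on the quiz half and once on the test half, and the two rankings need not agree. All numbers, the team labels and the half-and-half index split below are invented for illustration; they are not real contest data.

```python
import math

def rmse(predictions, actual, indices):
    """RMSE restricted to the given positions of a submission."""
    squared = [(predictions[i] - actual[i]) ** 2 for i in indices]
    return math.sqrt(sum(squared) / len(squared))

# Hypothetical true ratings for the whole qualifying set.
actual = [4, 3, 5, 2, 4, 1, 3, 5]

# Two hypothetical submissions covering the same eight ratings.
team_a = [4.1, 3.1, 4.9, 2.1, 4.6, 1.5, 3.5, 4.4]
team_b = [4.3, 3.2, 4.7, 2.3, 4.2, 1.2, 3.3, 4.8]

# Teams submit everything; only the organizer knows which half is which.
quiz_idx = [0, 1, 2, 3]   # scored publicly on the leaderboard
test_idx = [4, 5, 6, 7]   # scored privately, decides the prize

for name, preds in (("A", team_a), ("B", team_b)):
    quiz = rmse(preds, actual, quiz_idx)
    test = rmse(preds, actual, test_idx)
    print(f"team {name}: quiz RMSE {quiz:.3f}, test RMSE {test:.3f}")
```

In this made-up case team A tops the leaderboard on the quiz half while team B has the lower error on the test half, which is the kind of reversal that tuning against the public score can produce.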
It's interesting that the fearmongering of the prior /. post about AI got hundreds of responses but this /. post, which is far more relevant to real AI, has gotten less than a hundred responses thus far.
Anyway, congratulations to Netflix for doing the right thing for their business in response to The Hutter Prize.
Seastead this.
In fact, according to the second post by Yehuda Koren in this thread, it looks like BellKor does have the best test error rate and will be declared the winner. http://www.netflixprize.com/community/viewtopic.php?id=1498