Close but no Cigar for Netflix Recommender System
Ponca City, We Love You writes "In October 2006, Netflix, the online movie rental service, announced that it would award $1 million to the first team to improve the accuracy of Netflix's movie recommendations by 10% based on personal preferences. Each contestant was given a set of data from which three million predictions were made about how certain users rated certain movies and Netflix compared that list with the actual ratings and generated a score for each team. More than 27,000 contestants from 161 countries submitted their entries and some got close, but not close enough. Today Netflix announced that it is awarding an annual progress prize of $50,000 to a group of researchers at AT&T Labs, who improved the current recommendation system by 8.43 percent but the $1 million grand prize is still up for grabs and a $50,000 progress prize will be awarded every year until the 10 percent goal is met. As part of the rules of the competition, the team was required to disclose their solution publicly. (pdf)"
Any chance of not tagging this story with this meme?
If the people who created Netflix's system are still with the company, I'd say they deserve some retroactive recognition (and bonuses). That's pretty damn good optimization if it's that hard to improve upon, and there seem to have been some really sophisticated people trying to beat them.
Will Netflix incorporate the near-winners' ideas into their current system? If so, won't future teams be aiming at a moving (improving) target? If not, won't current Netflix customers know that their recommendations could be better if Netflix just incorporated a now publicly-disclosed algorithm into their servers?
"We can categorically state we have not released man-eating badgers into the area." - UK military spokesman, July 2007
The prize was clearly a million dollars, not a cigar! I guess the editors don't even read the summary.
Tsunami -- You can't bring a good wave down!
AT&T Labs = bunch of people from former Bell Labs = welfare for AI researchers ;)
The most noteworthy aspect of the winning entry is that it works by combining 107 different types of prediction strategies.
They state that you can get pretty far by blending the 3-4 best strategies, but of course doing so would not have netted them the progress prize.
It is kind of a sad realization that there actually is no single better method. Your best bet is to use brute force and try to find some weighting scheme that combines known methods. By the way, this is a well-known issue in protein structure prediction competitions: for many years now, so-called meta-servers (whose predictions work by merely combining other predictions) have won all the time. The joke is that we now need meta-meta-servers to combine the results of the combiners.
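To make the "weighting" idea concrete, here is a minimal sketch that blends just two predictors with a single weight chosen by least squares. The ratings and both sets of predictions are made-up toy numbers, and the function names are mine; this is not the AT&T team's actual blending procedure, which combined far more predictors.

```python
import math

def rmse(preds, actual):
    """Root mean squared error between predictions and true ratings."""
    return math.sqrt(sum((p - y) ** 2 for p, y in zip(preds, actual)) / len(actual))

def best_blend_weight(preds_a, preds_b, actual):
    """Weight w minimizing the squared error of w*a + (1-w)*b (closed form)."""
    num = sum((a - b) * (y - b) for a, b, y in zip(preds_a, preds_b, actual))
    den = sum((a - b) ** 2 for a, b in zip(preds_a, preds_b))
    return num / den if den else 0.5  # identical predictors: any weight works

def blend(preds_a, preds_b, w):
    """Weighted combination of two prediction lists."""
    return [w * a + (1 - w) * b for a, b in zip(preds_a, preds_b)]

# Toy data (hypothetical): true ratings and two imperfect predictors.
actual = [4.0, 3.0, 5.0, 2.0, 1.0]
model_a = [3.5, 3.5, 4.0, 2.5, 2.0]
model_b = [4.5, 2.0, 5.0, 1.0, 1.5]

w = best_blend_weight(model_a, model_b, actual)
blended = blend(model_a, model_b, w)
print("weight on model A:", w)
print("RMSE A, B, blend:", rmse(model_a, actual), rmse(model_b, actual), rmse(blended, actual))
```

On the data used to fit the weight, the blend can never be worse than either predictor alone, because w = 0 and w = 1 are among the candidates. A real entry would fit the weights on held-out ratings so the blend is not flattered by overfitting.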
Also, a clarification on the progress prize: to get it you need at least a 1% improvement over the previous result. Considering that there is only 1.57% to go, there is room for only one more progress prize before someone hits the Grand Prize (10% improvement over the original results).
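The arithmetic behind that, as a small sketch. It assumes, as the comment does, that the 1% step is measured in the same percentage points as the 8.43% and 10% figures, and that each progress prize takes only the minimum step.

```python
GRAND_PRIZE_TARGET = 10.0   # percent improvement needed for the $1 million
CURRENT_BEST = 8.43         # percent improvement of the AT&T Labs entry
MIN_STEP = 1.0              # assumed minimum gain required for a progress prize

remaining = GRAND_PRIZE_TARGET - CURRENT_BEST

# Count how many minimum-size steps fit before the grand prize line is crossed.
progress_prizes_left = 0
level = CURRENT_BEST
while level + MIN_STEP < GRAND_PRIZE_TARGET:
    level += MIN_STEP
    progress_prizes_left += 1

print("remaining:", round(remaining, 2))
print("progress prizes left:", progress_prizes_left)
```

One step takes the best score to 9.43%, and a second step of the same size would already pass 10%, which is the grand prize rather than another progress prize.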
if ($director eq "Michael Bay") {
print "Not recommended";
}
That should improve the system by at least 20%
Right - but the AT&T method is public - so if you can get that the rest of the way to 10%..... I guess step 3 is profit.
It's hard to believe that's how Micronians are made. Why don't we see it right now by having you both kiss one another?
That's not how the contest works. It's based on the RMSE that the original Netflix algorithm got at the beginning of the contest. This is fixed and does not change. See the contest site for more details.
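For anyone unsure what "based on the RMSE" means, here is a minimal sketch of scoring an entry against a fixed baseline. The baseline and candidate RMSE values are placeholders chosen so the result matches the 8.43% from the summary; they are not the real contest numbers, and the function names are mine.

```python
import math

def rmse(predicted, actual):
    """Root mean squared error over paired predicted and actual ratings."""
    squared = [(p - a) ** 2 for p, a in zip(predicted, actual)]
    return math.sqrt(sum(squared) / len(squared))

def improvement_pct(baseline_rmse, candidate_rmse):
    """Percent reduction in RMSE relative to a fixed baseline."""
    return 100.0 * (baseline_rmse - candidate_rmse) / baseline_rmse

# Placeholder numbers (hypothetical): the baseline never moves during the contest.
BASELINE_RMSE = 1.0
candidate_rmse = 0.9157

print("improvement:", round(improvement_pct(BASELINE_RMSE, candidate_rmse), 2), "%")
```

Because the baseline is frozen, publishing the leading method does not move the goalposts for later teams.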
Two reasons I can think of. One is the challenge. I like to code but I'm not great at coming up with projects to do myself. This kind of thing would be nice for that.
The other is the experience. If you get second in this, no, you won't win the prize. But you can bet that having that on your resume would make getting many jobs much easier. Amazon would like your skills. So would many other retailers.
Also, as a side note, it's not a lottery. There is a three-prong legal test in the US to determine if something is a lottery. I think the three parts are that you have to pay to enter, everyone has an equal chance of winning, and there is no skill involved. I'm not positive about the second part. This is free to enter and is based quite a bit on skill, so it's not like a lottery.
Don't exaggerate.
This isn't a way to get free work. It's a way to get very smart job candidates to find you. It's a recruiting tool. You don't honestly think that they will take the winning idea, pay the $1m, and then just say "bye", do you? They will offer that person a job if at all reasonable (if it's a team of 500 students, obviously they couldn't).
Comment forecast: Bits of genius surrounded by a sea of mediocrity.
From my experience with the Netflix Prize, and ML/stat.learning techniques in general, that last 1.57% is going to be the hardest. There is a diminishing returns effect going on here, i.e. the effort required for each successive 1% increase gets progressively larger.
An old-timer with old-timey ideas.
"Any contestants reading this? Maybe you could enlighten the rest of us on why you bothered competing?"
There are two immediate reasons I can think of why anyone would bother competing:
1) To win money.
2) Because they enjoy the challenge of trying to solve an interesting problem.
I'm just a simple coder, and knew that I didn't have any realistic chance of winning money. But I still found it very satisfying to try to come up with a solution and send it in and see how I did. I don't regret spending hours of my own leisure time on the project.
That said, eventually I gave it up. It was very clear that I'm not smart enough to meet the challenge. I had my fun, and it was time to move on to the next project. In summary, I don't think it's safe to assume that everyone is in it for the money.
"Pardon my cynicism, but seems like contests like this are a way to get a lot of ideas and work for very little money."
I call it "brilliant". Netflix probably put some price tag on what it would pay to get >10% improvement on their system. That price tag is probably more than $1 million. That means profit!
I think you have to consider that Netflix is working off a very large user base with a very large list of titles. In this sense, computation time is going to go way up the more you keep adding all these factors. I'm sure they've had projects internal to Netflix to use more data, but found that it just didn't pay off with the increased computation time. It's much better to get good recommendations onto the page instantly than make the user wait 2 seconds for great recommendations. The same is possibly true for doing recommendations ahead of time and having to spend the extra compute time and storage space.
Plus, I think there's always going to be some level of "noise" in the system: people rating things incorrectly (clicked on the wrong number), people changing their minds, etc. And then there are the cases where it makes no logical sense that if I liked movies A, B and C I should hate movie D. The question is, how good can a recommendation system get when it will always be thrown off by the noise?
So while I agree with you in theory, I think it may not work out to be such a great thing in practice.
Why not give users more control over their recommendations? Heck, even a bunch of checkboxes would be useful.
For example, Netflix frequently recommends R-rated movies to my family, but we have never rented a single R-rated movie and have no desire to do so. Moreover, every time we get a recommendation for an R-rated movie, we rate it "Not Interested." I've probably marked dozens of R-rated movies "Not Interested," but they continue to be recommended. (Either Netflix is trying to tell me to just give in and rent one already, or they really don't understand my family's movie preferences.)
A simple checkbox for "Do not recommend R-rated movies" would be all Netflix needs to substantially improve its accuracy for my family. I imagine Netflix could add checkboxes for similar criteria as well. In any case, I think a key point is giving more control over recommendations to the users themselves.
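A rough sketch of what such a checkbox could do behind the scenes: drop anything with an excluded rating before the list is shown. The titles, the `mpaa_rating` field, and the function name are all made up for illustration; I have no idea how Netflix actually stores this.

```python
def filter_recommendations(recommendations, excluded_ratings):
    """Return only the recommendations whose rating is not excluded by the user."""
    return [movie for movie in recommendations
            if movie["mpaa_rating"] not in excluded_ratings]

# Hypothetical recommendation list produced by some upstream predictor.
recommendations = [
    {"title": "Example Family Film", "mpaa_rating": "PG"},
    {"title": "Example Thriller", "mpaa_rating": "R"},
    {"title": "Example Comedy", "mpaa_rating": "PG-13"},
]

# The user ticked "Do not recommend R-rated movies".
user_excluded = {"R"}

for movie in filter_recommendations(recommendations, user_excluded):
    print(movie["title"], movie["mpaa_rating"])
```

The filter runs after prediction, so it works with whatever algorithm produces the list.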
The most interesting part of the research paper was this: "More specifically, if movie i was rated x days later than movie j, we multiply their similarity by exp(-x/600). The denominator 600 (days) was determined by cross validation, and reflects the fact that after two years, similarity decays by approximately a factor of 3." Apparently Joe Average's tastes in movies slowly evolve over time, and something you liked three years ago may not be that attractive today.
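The quoted decay is easy to check numerically. A minimal sketch: the 600-day constant and the exp(-x/600) form come from the quote, while the function names and the use of an absolute value (so the order of the two ratings does not matter) are my own assumptions.

```python
import math

DECAY_DAYS = 600.0  # denominator from the quoted paper, found there by cross validation

def time_decay(days_apart, scale=DECAY_DAYS):
    """Multiplier exp(-x/scale) applied to a similarity for ratings x days apart."""
    return math.exp(-abs(days_apart) / scale)

def decayed_similarity(similarity, days_apart):
    """Similarity between two movies, discounted by the gap between their ratings."""
    return similarity * time_decay(days_apart)

two_year_factor = time_decay(730)
print("multiplier after two years:", round(two_year_factor, 3))
print("shrink factor:", round(1.0 / two_year_factor, 2))
print("0.8 similarity after one year:", round(decayed_similarity(0.8, 365), 3))
```

After 730 days the multiplier is a little under 0.3, so the similarity shrinks by slightly more than a factor of 3, consistent with the paper's "approximately a factor of 3".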
This raises the question, should someone's age affect the denominator? People in or just out of college generally see their tastes evolve quickly, while people in retirement homes might take decades to get tired of something.
I also wonder if this decay factor applies to other fields. Not just books or music, but toothpaste or politicians. In the US, your representative is presumably re-elected before your opinion has time to change much; the president just as you're getting tired of him. It makes me wonder how Senators get re-elected at all.
Nothing for 6-digit uids?