Slashdot Mirror


Netflix Prize May Have Been Achieved

MadAnalyst writes "The long-running $1,000,000 competition to improve on the Netflix Cinematch recommendation system by 10% (in terms of the RMSE) may have finally been won. Recent results show a 10.05% improvement from the team called BellKor's Pragmatic Chaos, a merger between some of the teams who were getting close to the contest's goal. We've discussed this competition in the past."

29 of 83 comments

  1. No info about the Netflix prize by Daimanta · · Score: 2, Insightful

    C'mon, the Netflix prize isn't THAT well known. At least you could have given some basic info about it.

    --
    Knowledge is power. Knowledge shared is power lost.
    1. Re:No info about the Netflix prize by Reikk · · Score: 5, Informative

      Background: The Netflix Prize is an ongoing open competition for the best collaborative filtering algorithm that predicts user ratings for films, based on previous ratings. The competition is held by Netflix, an online DVD-rental service, and is open to anyone (with some exceptions). The grand prize of $1,000,000 is reserved for the entry which bests Netflix's own algorithm for predicting ratings by 10%.
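      For readers unfamiliar with RMSE (root-mean-square error), here is a minimal sketch of how a "10% improvement" could be measured. The function names and the sample ratings are made up for illustration; this is not Netflix's scoring code.

```python
import math

def rmse(predicted, actual):
    """Root-mean-square error between two equal-length lists of ratings."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("need two non-empty lists of equal length")
    squared = [(p - a) ** 2 for p, a in zip(predicted, actual)]
    return math.sqrt(sum(squared) / len(squared))

def improvement_pct(baseline_rmse, new_rmse):
    """Percentage reduction in RMSE relative to a baseline (lower RMSE is better)."""
    return 100.0 * (baseline_rmse - new_rmse) / baseline_rmse

# Hypothetical data: true ratings and two sets of predictions.
actual = [4, 3, 5, 2, 1]
baseline = [3.5, 3.5, 4.0, 3.0, 2.0]
candidate = [3.8, 3.2, 4.5, 2.5, 1.5]
gain = improvement_pct(rmse(baseline, actual), rmse(candidate, actual))
```

      Under this reading, an entry qualifies for the grand prize when its improvement over the baseline's RMSE reaches 10 or more.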

    2. Re:No info about the Netflix prize by MrMista_B · · Score: 4, Informative

      What, you didn't even read the /summary/?

      I know, this is Slashdot, but 'some basic info about it' is /right there/.

    3. Re:No info about the Netflix prize by bogjobber · · Score: 4, Informative

      If the first sentence didn't explain it enough, perhaps you could RTFA.

    4. Re:No info about the Netflix prize by Korin43 · · Score: 2, Informative

      Except it doesn't mention what an improvement of 10% means (unless you know what RMSE means, which I don't).

    5. Re:No info about the Netflix prize by JohnnyBGod · · Score: 2, Informative
  2. 1 Million split 7 ways by basementman · · Score: 4, Funny

    Let's see, $1,000,000 split 7 ways gives us $142,857.14 each. Let's say taxes take half, now you are down to only $71,428.57 each. Unless one of them kills all of their partners like in The Dark Knight that ain't much of a prize.

    1. Re:1 Million split 7 ways by quanticle · · Score: 5, Insightful

      Well, just like the Ansari X Prize didn't cover the costs of developing and launching a suborbital rocket, the Netflix Prize isn't really meant to be a large enough prize to fully fund the development of a new recommendation algorithm. The purpose of the prize is to stimulate interest and get people started. The real reward will come when they turn their algorithm into commercialized software - the rewards from making such a thing applicable outside of Netflix could be large indeed.

      --
      We all know what to do, but we don't know how to get re-elected once we have done it
    2. Re:1 Million split 7 ways by neokushan · · Score: 5, Insightful

      Pretty sure having it on their CV means they can effectively write their own pay cheque in terms of job opportunities.

      --
      +1 IDisagreeSoHeMustBeATrollOrAnAstroturferOrAShill
  3. Well done! by Slurpee · · Score: 4, Informative

    Well done Bellkor.

    But now the real race begins.

    Now that the 10% barrier has been reached, people have 30 days to submit their final results. At the end of the 30 days, whoever has the best result wins.

    This is going to be a great month!

    1. Re:Well done! by nmb3000 · · Score: 2, Informative

      Now that the 10% barrier has been reached, people have 30 days to submit their final results. At the end of the 30 days, whoever has the best result wins.

      That's true, but like the story title indicates, the prize may have been achieved. From the contest rules:

      The RMSE for the first "quiz" subset will be reported publicly on the Site; the RMSE for the second "test" subset will not be reported publicly but will be employed to qualify a submission as described below. The reported RMSE scores on the quiz subset provide a public announcement that a potential qualifying score has been reached and provide feedback to Participants on both their absolute and relative performance.

      So the publicly available submission beat the 10% mark, but only by a narrow margin of 0.05%. The private submission must also have surpassed 10% for them to be considered a preliminary winner, otherwise the contest goes on (at least, this is my understanding). In any case, I agree that the outcome should be interesting.

      --
      "What do you despise? By this are you truly known." --Princess Irulan, Manual of Muad'Dib
      /)
    2. Re:Well done! by Wildclaw · · Score: 2, Interesting

      Actually, this email has been sent out

      "As of the submission by team "BellKor's Pragmatic Chaos" on June 26, 2009 18:42:37 UTC, the Netflix Prize competition entered the "last call" period for the Grand Prize. In accord with the Rules, teams have thirty (30) days, until July 26, 2009 18:42:37 UTC, to make submissions that will be considered for this Prize. Good luck and thank you for participating!"

  4. Re:Do they keep the prize money? by lee1026 · · Score: 3, Informative

    AT&T have committed to giving all money to charity. The person at Yahoo developed his entry while working at AT&T, so I will be surprised if Yahoo gets any of it.

  5. they were able to get the extra 0.5% over the top by circletimessquare · · Score: 4, Funny

    by simply ignoring data from anyone who ever rented SuperBabies: Baby Geniuses 2, Gigli, From Justin to Kelly, Disaster Movie, any movie by Uwe Boll and any movie starring Paris Hilton

    suddenly, everything made sense

    --
    intellectual property law is philosophically incoherent. it is your moral duty to ignore it or sabotage it
  6. most of them did it by circletimessquare · · Score: 4, Insightful

    for simple intellectual satisfaction, like a giant puzzle or a game of chess

    money is not the motivation for everything in this world

    --
    intellectual property law is philosophically incoherent. it is your moral duty to ignore it or sabotage it
    1. Re:most of them did it by morgan_greywolf · · Score: 2, Insightful

      Well, it was for AT&T. No, they don't want the prize money; they're donating it to charity. But what they do have now is an algorithm that can be turned into a commercial product or service. The individual researchers may not have had money as their primary motivator, but their employer sure as hell did.

  7. Interesting by coaxial · · Score: 5, Informative

    I published a paper using Netflix data. (Yeah, that group.)

    It's certainly cool that they beat the 10% improvement, and it's a hell of a deal for Netflix, since hiring the researchers would have cost them more than the prize money paid out. The interesting thing is whether or not this really advances the field of recommendation systems.

    The initial work definitely did, but I wonder how much of the quest for the 10% threshold moved the science, as opposed to just tweaking an application. Recommender systems still don't bring up rare items, and they still have problems with diversity. None of the Netflix Prize work addresses any of these problems.

    Still, I look forward to their paper.

    1. Re:Interesting by Trepidity · · Score: 3, Insightful

      Trying to recommend unpopular movies is problematic. Is the computer program going to be able to discern under-rated (Glengarry Glen Ross) or just crap (Ishtar)?

      That is indeed an interesting question, and I think it's what the grandparent meant when he pointed out Netflix's contest didn't really address it. The performance measure Netflix used was root-mean squared error, so every prediction counts equally in determining your error. Since the vast majority of predictions in the data set are for frequently-watched films, effectively the prize was focused primarily on optimizing the common case: correctly predict whether someone will like or not like one of the very popular films. Of course, getting the unpopular films right too helps, but all else being equal, it's better to make even tiny improvements to your predictions of films that appear tons of times in the data set, than to make considerable improvements to less popular films' predictions, because the importance of getting a prediction right is in effect weighted by the film's popularity.

      You could look at error from a movie-centric perspective, though, asking something like, "how good are your recommender algorithm's predictions for the average film?" That causes you to focus on different things, if an error of 1 star on Obscure Film predictions and an error of 1 star on Titanic predictions count the same.
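      A rough sketch of the two perspectives described above, assuming each row is a (movie, predicted, actual) tuple. The data and function names are hypothetical; the pooled version lets heavily rated films dominate, while the per-movie version counts every film once.

```python
import math
from collections import defaultdict

def pooled_rmse(rows):
    """RMSE over all ratings; films with more ratings carry more weight."""
    return math.sqrt(sum((p - a) ** 2 for _, p, a in rows) / len(rows))

def per_movie_rmse(rows):
    """Average of each film's own RMSE; every film counts equally."""
    squared = defaultdict(list)
    for movie, p, a in rows:
        squared[movie].append((p - a) ** 2)
    per_film = [math.sqrt(sum(s) / len(s)) for s in squared.values()]
    return sum(per_film) / len(per_film)

# Hypothetical data: nine perfect predictions for a popular film,
# one prediction that is two stars off for an obscure one.
rows = [("Titanic", 4.0, 4.0)] * 9 + [("Obscure Film", 2.0, 4.0)]
pooled = pooled_rmse(rows)        # dominated by the nine Titanic rows
by_movie = per_movie_rmse(rows)   # Titanic and Obscure Film count once each
```

      With this toy data the pooled error comes out well below the per-movie error, because the single bad prediction is diluted by the nine good ones.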

    2. Re:Interesting by Trepidity · · Score: 2, Insightful

      That's true, but since there's not a huge range in ratings, that root-squaring doesn't have nearly as big an effect as the many orders of magnitude difference in popularity. I don't recall the exact numbers offhand, but I think the top-10 movies, out of 17,500, account for fully half the weight.

    3. Re:Interesting by coaxial · · Score: 2, Insightful

      1.) Rare could also be defined as unpopular. Trying to recommend unpopular movies is problematic. Is the computer program going to be able to discern under-rated (Glengarry Glen Ross) or just crap (Ishtar)?

      You know what. I actually like Ishtar. I really do. The blind camel, and the line "We're not singers! We're songwriters!" gets me every time.

      So really, the even harder problem is to know when to buck your friends and go with the outlier. It's hard, because kNN methods work pretty well, and they're all about going with the consensus of whatever cluster you're in.
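      To make the "consensus of your cluster" point concrete, here is a toy user-based kNN predictor. The similarity measure, the names, and the ratings are all assumptions chosen for brevity, not anything a Netflix Prize team is known to have used.

```python
def similarity(a, b):
    """Negative mean absolute difference over co-rated films (higher means more alike)."""
    shared = set(a) & set(b)
    if not shared:
        return float("-inf")
    return -sum(abs(a[m] - b[m]) for m in shared) / len(shared)

def knn_predict(ratings, user, movie, k=2):
    """Predict a rating as the plain average over the k most similar users who rated the film."""
    candidates = [
        (similarity(ratings[user], r), r[movie])
        for name, r in ratings.items()
        if name != user and movie in r
    ]
    candidates.sort(key=lambda c: c[0], reverse=True)
    top = [rating for _, rating in candidates[:k]]
    return sum(top) / len(top) if top else None

# Hypothetical ratings on a 1-5 scale.
ratings = {
    "alice": {"Titanic": 5, "Fight Club": 4},
    "bob":   {"Titanic": 5, "Fight Club": 4, "Ishtar": 1},
    "carol": {"Titanic": 4, "Fight Club": 5, "Ishtar": 2},
    "dave":  {"Titanic": 1, "Fight Club": 1, "Ishtar": 5},
}
guess = knn_predict(ratings, "alice", "Ishtar", k=2)
```

      Here alice's nearest neighbours both disliked Ishtar, so the prediction is low no matter what she would actually think of it. That is the outlier problem in miniature.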

  8. "recommendations" by Gothmolly · · Score: 2

    Who listens to these sort of things anyway?

    --
    I want to delete my account but Slashdot doesn't allow it.
  9. i was joking, however by circletimessquare · · Score: 5, Interesting

    from the excellent nyt article about the competition in november:

    http://science.slashdot.org/article.pl?sid=08/11/22/0526216

    it isn't bad movies that are the problem, taste in bad movies can still be uniform

    the real problem is extremely controversial movies, most notably Napoleon Dynamite

    http://www.imdb.com/title/tt0374900/

    not controversial in terms of dealing with abortion or gun control, but controversial in terms of some people really found the movie totally stupid, while some people really found the movie to be really funny

    movies like Napoleon Dynamite are genre edge conditions, and people who apparently agree on everything else about movies in general encounter movies like this one and suddenly dramatically differ on their opinion of it, in completely unpredictable ways

    --
    intellectual property law is philosophically incoherent. it is your moral duty to ignore it or sabotage it
    1. Re:i was joking, however by bill_mcgonigle · · Score: 3, Interesting

      Yeah, all the recommendation systems where I've bought or rented movies, and most of my friends, said I needed to see 'Fight Club', so I did, and ... meh.

      Consider this list of movies I've bought/rated highly:

        12 Monkeys
        V for Vendetta
        Lost in Translation
        Donnie Darko
        A Beautiful Mind
        Dogma

      I might be grouped with folks who enjoy flicks about identity, man vs. man, those who aren't easily offended, etc. But there doesn't seem to be as clear a way to find a group of people who find aggression offensive, which is basically the driving theme of Fight Club. Perhaps given enough negative ratings it could be possible, but even though I've clicked 'Not Interested' on all the Disney movies, they keep suggesting I want their latest direct-to-DVD crapfest, so I'm left to assume they're rating mostly based on positive ratings.

      --
      My God, it's Full of Source!
      OUTSIDE_IP=$(dig +short my.ip @outsideip.net)
    2. Re:i was joking, however by Pollardito · · Score: 3, Interesting

      Perhaps given enough negative ratings it could be possible, but even though I've clicked 'Not Interested' on all the Disney movies, they keep suggesting I want their latest direct-to-DVD crapfest, so I'm left to assume they're rating mostly based on positive ratings.

      A co-worker gets almost no recommendations at all from Netflix, and customer service told him that they generate recommendations based on ratings of 4 or 5 (though you'd think that the recommendations that they do generate would have to filter through similar movies that you've rated at 0). He was told to rate the movies that he likes higher in order to fix it, but that's never really accomplished anything as he has several hundred movies in the 4-to-5 range and maybe a dozen recommendations total.

      I'm pretty sure that the Disney/children's movie recommendation flood that most everyone seems to be getting is driven by parents who don't actually love those movies, but are rating those movies on behalf of their children. That causes a weird connection to movies that they themselves enjoy, and it makes it seem like the same audience is enjoying both types of movie. They need to have an "I'm a parent" flag somewhere to help them sort that out.

  10. Re:they were able to get the extra 0.5% over the t by interkin3tic · · Score: 2, Funny

    by simply ignoring data from anyone who ever rented SuperBabies: Baby Geniuses 2, Gigli, From Justin to Kelly, Disaster Movie, any movie by Uwe Boll and any movie starring Paris Hilton

    Hey, I (along with the rest of my frat, our school hockey team, and most of the town) was in a movie starring Paris Hilton, you insensitive clod!

  11. Re:real world by Wildclaw · · Score: 2, Interesting

    ~0.85 points (on a five-point scale)

    Actually the scale is not 0-1-2-3-4 but 0-1-4-9-16 as they use Root-Mean-Square. Just thought it was worth pointing out.
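    A small sketch of the squaring described above. The helper names are made up. Each error is squared before averaging, so a miss of 0-4 stars is penalised as 0-1-4-9-16, and the square root at the end brings the result back to star units.

```python
import math

def squared_penalties(max_error=4):
    """Penalty contributed by each whole-star error before averaging."""
    return [e ** 2 for e in range(max_error + 1)]

def rmse_from_errors(errors):
    """RMSE computed directly from a list of prediction errors."""
    return math.sqrt(sum(e * e for e in errors) / len(errors))

def mean_abs_error(errors):
    """Mean absolute error, shown only for comparison."""
    return sum(abs(e) for e in errors) / len(errors)

# One 2-star miss among four predictions scores the same RMSE
# as four 1-star misses, although its mean absolute error is lower.
one_big_miss = [0, 0, 0, 2]
four_small_misses = [1, 1, 1, 1]
```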

  12. Film recommendations by michuk · · Score: 3, Interesting

    Does anyone find Netflix recommendations any good anyway? I used http://criticker.com/ for quite a while and was very happy about the recommended stuff. Recently switched to http://filmaster.com/ (which is a free service) and it's equally good, even though both probably use a pretty simple algorithm compared to Netflix.

    --
    Polish your GNU/Linux! http://polishlinux.org
    1. Re:Film recommendations by jfengel · · Score: 2, Interesting

      Reasonably good, actually. I often add 4 star movies to my queue, and rarely regret it.

      The problem is the bell curve. There aren't a lot of 5 star movies out there, and I've seen them. There are a lot of 3 star films, but my life is short and I don't want to spend a lot of time on movies I merely "like".

      In fact, it's not really a bell curve. I rarely provide 1-star or 2-star ratings simply because it's not at all difficult for me to identify a film I'm going to truly hate. I don't have to waste two hours of my life to find out whether I'd merely dislike the new Transformers movie or whether it will fill my soul with disgust.

      The left side of the curve is actually quite fat with movies that simply won't interest me at all. The existing algorithm is actually fairly good at telling me I won't like them. The hard part is picking out the very few movies that ARE worth my time.

      They do show both the average and expected rating for each film. What I'd really like to see is a list sorted by the difference: where do I stand out from the crowd? Such movies are likely to have extra appeal.
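      The "sorted by the difference" idea could look something like the sketch below. The field names `average` and `expected` and the film entries are hypothetical stand-ins for the two numbers shown for each film.

```python
def standouts(films):
    """Sort films by (personal expected rating - crowd average), largest gap first."""
    return sorted(films, key=lambda f: f["expected"] - f["average"], reverse=True)

# Hypothetical catalogue entries.
films = [
    {"title": "Film A", "average": 3.9, "expected": 4.0},
    {"title": "Film B", "average": 2.8, "expected": 4.2},
    {"title": "Film C", "average": 4.1, "expected": 3.0},
]
ranked_titles = [f["title"] for f in standouts(films)]
```

      Film B rises to the top here even though its average rating is the lowest, because it is the one where the personal prediction departs most from the crowd.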

      So the 10% difference isn't completely worthless, but the real problem is that they're pursuing the wrong goal. There's a lot of information they're dropping on the floor.

    2. Re:Film recommendations by coaxial · · Score: 3, Informative

      I believe that Netflix is still using Cinematch. You could look into movielens. It's from the GroupLens group at U Minn.

      [E]ven though both probably use a pretty simple algorithm compared to Netflix.

      You do know that Netflix said at the outset "You're competing with 15 years of really smart people banging away at the problem." and it was beaten in less than a week.

      That's not meant as a knock against Netflix's engineers, but more about the fact that they didn't really build a state of the art recommender system. Simple SVD (aka latent semantic indexing) outperformed them as well. They did something a bit more than straight up kNN clustering, but that was pretty much it.
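      As a rough illustration of the latent-factor idea behind "SVD" recommenders, the sketch below fits user and item factor vectors by stochastic gradient descent. This is a swapped-in simplification, not a true singular value decomposition and not Cinematch or any contestant's code. All names, hyperparameters, and ratings are assumptions.

```python
import random

def factorize(ratings, n_users, n_items, k=2, steps=500, lr=0.01, reg=0.02, seed=0):
    """Fit factors so that dot(P[u], Q[i]) approximates each known rating."""
    rng = random.Random(seed)
    P = [[rng.uniform(0.1, 0.5) for _ in range(k)] for _ in range(n_users)]
    Q = [[rng.uniform(0.1, 0.5) for _ in range(k)] for _ in range(n_items)]
    for _ in range(steps):
        for u, i, r in ratings:
            err = r - sum(pu * qi for pu, qi in zip(P[u], Q[i]))
            for f in range(k):
                pu, qi = P[u][f], Q[i][f]
                # Gradient step on squared error with L2 regularisation.
                P[u][f] += lr * (err * qi - reg * pu)
                Q[i][f] += lr * (err * pu - reg * qi)
    return P, Q

def predict(P, Q, u, i):
    """Predicted rating for user u and item i."""
    return sum(pu * qi for pu, qi in zip(P[u], Q[i]))

def train_rmse(ratings, P, Q):
    """RMSE of the fitted model on the ratings it was trained on."""
    total = sum((r - predict(P, Q, u, i)) ** 2 for u, i, r in ratings)
    return (total / len(ratings)) ** 0.5

# Hypothetical (user index, item index, rating) triples.
ratings = [(0, 0, 5), (0, 1, 4), (1, 0, 5), (1, 1, 4), (1, 2, 1),
           (2, 0, 1), (2, 2, 5), (3, 1, 2), (3, 2, 4)]
P, Q = factorize(ratings, n_users=4, n_items=3)
```

      Calling predict(P, Q, 0, 2) then gives a guess for a user/item pair that has no rating in the training triples, which is the whole point of the exercise.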