Slashdot Mirror


Build a Better Netflix, Win a Million Dollars?

An anonymous reader writes "In a quest to better movie recommendations, Netflix is opening their database (nytimes, registration and first child required) to users to try to craft a better recommendation technology. The problem is not easy. Says one researcher: 'You're competing with 15 years of really smart people banging away at the problem.'" Recommender systems are really an interesting problem, and that is likely very interesting data to play with.

12 of 197 comments

  1. Seems like a free gift for Netflix to me... by garcia · · Score: 2, Insightful

    If no one wins within a year, Netflix will award $50,000 to whoever makes the most progress above a 1 percent improvement, and will award the same amount each year until someone wins the grand prize.

    But if someone does win within a year they will still have the ability to use others' code, free of charge, as part of their product.

    The article doesn't say but how will you know if your code is making choices better than their existing system? I wouldn't be submitting my code unless I was sure I was going to win. Then again I'm not a gambler or a coder ;)
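
    One common way to answer the "is my code better?" question is to hold back some known ratings, predict them, and compare error against a baseline. The sketch below assumes root-mean-square error (RMSE) as the yardstick; the summary above does not say which metric the contest uses, and the ratings and function names here are made up for illustration.

```python
import math

def rmse(predicted, actual):
    """Root-mean-square error between predicted and actual ratings."""
    squared = [(p - a) ** 2 for p, a in zip(predicted, actual)]
    return math.sqrt(sum(squared) / len(squared))

def improvement_pct(baseline_rmse, candidate_rmse):
    """Percent reduction in error relative to the baseline (positive is better)."""
    return 100.0 * (baseline_rmse - candidate_rmse) / baseline_rmse

# Hypothetical held-out ratings and two sets of predictions for them.
actual = [5, 3, 1, 4]
baseline_pred = [4, 4, 2, 3]    # stand-in for the existing system
candidate_pred = [5, 3, 2, 3]   # stand-in for a new algorithm

base = rmse(baseline_pred, actual)
cand = rmse(candidate_pred, actual)
print(f"baseline={base:.3f} candidate={cand:.3f} "
      f"improvement={improvement_pct(base, cand):.1f}%")
```

    Under this assumption, "above a 1 percent improvement" would mean improvement_pct comes out greater than 1 on data the submitter never saw.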

    1. Re:Seems like a free gift for Netflix to me... by Sparr0 · · Score: 2, Insightful

      Because each option you add cuts the number of responses in half. A vast majority of users use the one-rating system. Almost no one would fill out a 20-question survey about every movie they watch.

  2. Suggestion by 99BottlesOfBeerInMyF · · Score: 5, Insightful

    As a NetFlix user I have one suggestion for their recommendation system that can make it much better. Make it aware of the connection between series. That is to say, if you rent season 1 of something, suggest season 2, not season 4 (even if season 4 has better review ratings). If I mark season 1 of something as "not interested" instead of giving it a user rating, don't suggest every other season of that same show at the top of my recommendations. I mean how many times do I have to tell you I don't want to see any season of "Friends" ever, even if you pay me?

    1. Re:Suggestion by Xentor · · Score: 3, Insightful

      Hmm, I see your point.

      I was about to mention that I mark things as Not Interested when I own them, to avoid being recommended the rest (usually because I prefer to buy series I like, and rent actual movies), but then I realized that fits into what you said perfectly.

      Point conceded.

      --
      "The amount of intelligence on this planet is a constant. The population is growing." -Cole's Axiom
  3. RSSTimes by eldavojohn · · Score: 4, Insightful
    In a quest to better movie recommendations, Netflix is opening their database (nytimes, registration and first child required)...
    Not quite, you can find it here (or the minimalist version for anyone sick of ads).

    Why is it that the Slashdot editors are just too damn lazy to look up the RSS feed links to these pages?

    The problem is not easy. Says one researcher: "You're competing with 15 years of really smart people banging away at the problem."
    While this may be true, I wouldn't let it deter you. Collaborative filtering is a field that is far from dead. The interesting thing about collaborative filtering is that on the surface it seems pretty straightforward, but once you dig into the mechanics of it, there is actually a lot of playing you can do. Ironically, the way you display the data to the end user is often what determines how good a job you did.

    Allow me to take a naïve approach at this topic and say we generate a movie index of each person. I would have A Clockwork Orange and Koyaanisqatsi at 5 while The Ring 2 would be at the very low end. My friend might have similar movies. If he has A Clockwork Orange up there, you might be able to compute a Euclidean distance between us. However, this approach falls apart when there is little overlap: no one has seen Koyaanisqatsi, and the 20 movies I've ranked highly are hard to find.
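
    A rough sketch of that distance idea, assuming each user's ratings live in a dict of title to score. The friend's scores, the stranger, and the function name are invented for the example; the point is that the distance can only be computed over movies both people have rated, which is exactly where it breaks down.

```python
import math

def rating_distance(a, b, min_overlap=1):
    """Euclidean distance between two users over the movies both have rated.

    Returns None when fewer than min_overlap movies are shared, since
    there is nothing meaningful to compare in that case.
    """
    shared = a.keys() & b.keys()
    if len(shared) < min_overlap:
        return None
    return math.sqrt(sum((a[m] - b[m]) ** 2 for m in shared))

# Ratings on a 1-5 scale; everything except "me" is hypothetical.
me = {"A Clockwork Orange": 5, "Koyaanisqatsi": 5, "The Ring 2": 1}
friend = {"A Clockwork Orange": 4, "The Ring 2": 2}
stranger = {"Some Other Movie": 3}

print(rating_distance(me, friend))    # small: we roughly agree on two movies
print(rating_distance(me, stranger))  # None: no movies in common
```

    Koyaanisqatsi contributes nothing to the first distance because the friend never rated it, so the movies that say the most about me are the ones that get ignored.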

    You don't have to stop there, however. You could also database the movies I marked as "uninterested" or the movies that were presented to me but I didn't vote on. Like if I had seen the offer to mark J-Lo's latest flop but didn't, wouldn't that tell you something about me?

    So these caveats present themselves all along the way and, at the end computation, you have many different strategies for this data. For example, while you might not be able to link my friend and me through movies, how far apart are we on a node network? What I mean is, if you plotted every user in their own dimension depending on the movies they ranked and attempted to compute as good a distance as possible between all users, how far would I be away from my friend by hopping on these nodes? There's a lot of information to be gleaned in this sort of friend-of-a-friend collaborative approach.
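
    One way to picture the hopping, as a sketch only: treat every user as a node, link two users whenever they have rated at least one movie in common, and count hops with a breadth-first search. The linking rule, the names, and the sample ratings are all assumptions for illustration; a real system would probably weight the links by how closely the ratings agree.

```python
from collections import deque
from itertools import combinations

def build_user_graph(ratings):
    """Connect every pair of users who share at least one rated movie."""
    graph = {user: set() for user in ratings}
    for u, v in combinations(ratings, 2):
        if ratings[u].keys() & ratings[v].keys():
            graph[u].add(v)
            graph[v].add(u)
    return graph

def hops(graph, start, goal):
    """Fewest links between two users (breadth-first search), or None."""
    if start == goal:
        return 0
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        user, dist = queue.popleft()
        for neighbour in graph[user]:
            if neighbour == goal:
                return dist + 1
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append((neighbour, dist + 1))
    return None  # no chain of shared movies connects the two users

# Hypothetical users: "me" and "friend" share no movies directly.
ratings = {
    "me": {"Koyaanisqatsi": 5, "A Clockwork Orange": 5},
    "middle": {"A Clockwork Orange": 4, "The Ring 2": 2},
    "friend": {"The Ring 2": 1},
    "loner": {"Some Other Movie": 3},
}
graph = build_user_graph(ratings)
print(hops(graph, "me", "friend"))  # linked only through "middle"
print(hops(graph, "me", "loner"))   # not linked at all
```

    Here my friend and I have no movie in common, yet we are two hops apart because someone in the middle overlaps with both of us.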

    Now you need to present this information to the user. Do you just up and recommend him a movie? Do you take Amazon's approach and say "Other people did this -- so should you."? Or do you give them some sort of three dimensional flash plotting of you versus the people nearest to you? Do you allow the user to contact those closest to them? Those farthest away?

    My point is that while 15 years of research has been done, it doesn't mean there's been 15 years of testing and implementation which, in the end of creating products, is where most of the importance lies.
    --
    My work here is dung.
  4. About no-login links on /. by Ilgaz · · Score: 2, Insightful

    You can trick the NY Times personally but you can't do it from a front page of a widely popular commercial site.

    I think it is the reason.

    Slashdot can't send thousands of users with a fake referrer to NY Times. That link you provided is for people using RSS readers and subscribed to NY Times RSS feed.

    I think they should talk with NY Times web team to allow slashdot readers with referrer=slashdot without needing login. They can arrange it for sure, this isn't a "no name" site.

    It would be nice for NY Times for statistics too. I bet they currently have to tweak the statistics for "fake" RSS links from Slashdot.

    About the "no ads" version: it would be like the NY Times mentioning Slashdot and sending people to some other domain (slashdot sux? I forgot) that doesn't carry the Slashdot ads which make this site work and pay for its costs. That also means hundreds of thousands of users.

    I am not apologising for NY Times or trying to start a discussion about advertising, I just say my end user point of view and plain guesses.

  5. Remove Artificial Supply Limitations by dduardo · · Score: 2, Insightful

    If Netflix doesn't have the movie in stock it should burn the movie on demand.

  6. 5 star rating is flawed by BMonger · · Score: 3, Insightful

    I personally weigh movies on a number of different factors. I might give 3 stars to a movie because it has 4 of my favorite actors in it even if I didn't care for the plot. I might give 3 stars to a different movie with horrible acting but interesting camera angles (From Dusk Til Dawn 2). I tend to average out my ratings dependent on many things a movie has to offer.

    The problem is that that is my rating system. It works for me. But it does little good to anybody else because they are rating based purely on something else.

    I think they need to implement the ability to rate more aspects of the movie. I'm sure some people out there rate the movie poorly if their disc is scratched or the transfer quality is poor even. A simple 1 to 5 system doesn't cut it. People rate things that aren't "Was the (romance) plot good?", "Do you like this director?", "Do you like these actors?". People rate things that aren't on the box.

  7. Re:only a million? by illegalcortex · · Score: 2, Insightful

    To win and take home either prize, your qualifying submissions must have the largest accuracy improvement verified by the Contest judges, you must share your method with (and non-exclusively license it to) Netflix, and you must describe to the world how you did it and why it works.

    So, you could take the money from Netflix, use it to start your business, then license it to the other players, too.

  8. Re:Privacy issues? by Shihar · · Score: 2, Insightful

    The AOL search was an issue because you could look at search requests for places and figure out where someone was very quickly. If I use Google to plot the route to the nearest IKEA or porn store, it is a pretty simple matter to trace back who someone is. Short of some serious stupidity, I couldn't imagine Netflix giving away any information valuable for identity theft. A list of movies is highly unlikely to lead to anyone's address or identity.

  9. Re:Adapting their business model by arachnoprobe · · Score: 2, Insightful

    I don't think this is a "programmer's problem". From thinking about it, and reading the approaches discussed here, it looks more like a mathematical problem. Finding a good strategy for linking the data and making suggestions seems far more important than hacking a good (my)SQL-query.

  10. Re:uh huh by Anonymous Coward · · Score: 1, Insightful

    Releasing customer movie preferences is completely different from releasing customer search queries.

    The problem with "anonymous" search queries is that when aggregated, they provide data that can expose the searcher's identity. Taken as a whole, searches like "Tupelo Mississippi", "STD symptoms", and "Dr. Smith" reveal quite a bit of information about the searcher.

    However, the knowledge that user #1234567 rated "Beach Babes From Beyond" 5 stars, and yet gave "The Godfather" a paltry 2, while interesting, does nothing to help me determine who user #1234567 actually is.