Netflix Announces Second Data Mining Contest
John Snodgrass writes "Neil Hunt, Chief Product Officer at Netflix, has announced on the Netflix Prize Forums that they are planning to hold a new data mining competition. The second competition will have some twists and is expected to be shorter in duration. It will feature two grand prizes, to be awarded in a 6 and 18 month time frame. A previous competitor still active on the board has already dubbed it: 'The Sparse Matrix: Reordered' and 'The Sparse Matrix: Factorizations.'"
> Why do tens or possibly hundreds of thousands of dollars worth of work just for the chance that you might get paid? It seems absurd.
Challenge and notoriety.
For that matter, just about everything you do has a chance of failure, so why do anything?
=Smidge=
Most of the time people want to tinker/play with concepts. They just need some sort of motivation to get them going. The chance that you might get paid is apparently enough for some people.
"When life gives you lemons, don't make lemonade. Make life take the lemons back!" -- Cave Johnson
Most of these are research groups that would publish their results and research anyway. This gives them a practical application and a chance for some fame and money -- the research still gets done and published.
I'll add one more thing. Netflix has done the community a favor by providing a large dataset for testing algorithms. Data mining requires data. It requires more than just raw data. It is really difficult to know how well your algorithm works without data that has known answers to compare to. A good test dataset lets you compare your results to other results.
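To make the "known answers" point concrete, here is a minimal sketch of scoring two predictors against held-out ratings with root-mean-square error, the metric the first Netflix Prize used. The ratings and both predictors below are made up for illustration and are not taken from the Netflix dataset.

```python
import math

def rmse(predicted, actual):
    """Root-mean-square error between two equal-length rating lists."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("need two non-empty lists of equal length")
    squared_errors = [(p - a) ** 2 for p, a in zip(predicted, actual)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))

# Hypothetical held-out ratings (1-5 stars) whose true values are known.
known = [4, 3, 5, 2, 1]

# Two hypothetical predictors: a constant guess and a slightly smarter one.
baseline = [3.0, 3.0, 3.0, 3.0, 3.0]
model = [3.5, 3.0, 4.5, 2.5, 1.5]

print("baseline RMSE:", round(rmse(baseline, known), 4))
print("model RMSE:   ", round(rmse(model, known), 4))
```

Lower is better, and because everyone scores against the same held-out answers, the numbers are directly comparable between teams.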
People who take on challenges just because they can? I do believe that they seem "weird" to you. It would seem that way to many lazy people who just want cash for their clunker and to be left aloe to play Halo.
Why is it so hard to only have politicians for a few years, then have them go away?
As someone who used to watch 3 movies a day for about 3 years straight, I still found the system to be useful.
I thought I'd seen everything that was worth watching, but if you're really dedicated to finding more quality films then any help is good help, and this is one of the better systems for finding new films (more accurate than trawling IMDb, but maybe not quite as fun).
There's nothing at all wrong with studying how the human automatic processes work, but "Psychology for Prizes" does have a very Neal Stephenson feel to it.
The public eagerly jumping for the chance to teach corporate bodies how to better advertise to them seems a little preposterous. In a world where everybody's objective is openness and self-study for the betterment of humankind, this sort of thing would be laudable, but here it's a bald-faced attempt to fine-tune manipulation techniques.
What would be cool would be if Netflix, upon offering you a suggestion, would also explain what reasoning they used to offer that suggestion to you. Open-source advertising. If every billboard had an explanation of the psychology behind it, we could learn much more about ourselves. The amount of free will that we use every day versus automatic behavior can only increase when the illusion of free will is broken down and examined.
-FL
I used to think I was unique in what rare movies and music I liked until I met someone who had an almost identical collection to mine. On top of that, we both had some of the same clothes. The reason Netflix researches these data mining techniques is that our tastes really do cluster into groups. For some it might be because they like DeNiro films and Spaghetti Westerns. For others it might be that they like two screenwriters, though they never know it. The payoff for Netflix in getting this right is that a customer who uses the recommendations to fill his/her queue is much less likely to cancel Netflix anytime soon.
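A rough way to see the "tastes cluster" idea is to measure how alike two people's ratings are. This is only a sketch using mean-centered cosine similarity over the movies both people rated; it is not a claim about what Netflix actually runs, and the users and ratings are invented.

```python
import math

def centered(ratings):
    """Subtract the user's own average so 'likes' are positive and 'dislikes' negative."""
    mean = sum(ratings.values()) / len(ratings)
    return {movie: stars - mean for movie, stars in ratings.items()}

def taste_similarity(a, b):
    """Cosine similarity of mean-centered ratings over movies both users rated."""
    shared = set(a) & set(b)
    if not shared:
        return 0.0
    ca, cb = centered(a), centered(b)
    dot = sum(ca[m] * cb[m] for m in shared)
    norm_a = math.sqrt(sum(ca[m] ** 2 for m in shared))
    norm_b = math.sqrt(sum(cb[m] ** 2 for m in shared))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # one user rated every shared movie exactly at their average
    return dot / (norm_a * norm_b)

# Hypothetical users and ratings (1-5 stars).
alice = {"Taxi Driver": 5, "The Good, the Bad and the Ugly": 5, "Notting Hill": 1}
bob = {"Taxi Driver": 4, "The Good, the Bad and the Ugly": 5, "Notting Hill": 2}
carol = {"Taxi Driver": 1, "The Good, the Bad and the Ugly": 2, "Notting Hill": 5}

print("alice vs bob:  ", round(taste_similarity(alice, bob), 3))
print("alice vs carol:", round(taste_similarity(alice, carol), 3))
```

Users who score close to 1 with each other fall into the same taste group, and one person's favorites become candidate recommendations for the other.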
Apparently recommendations are important, otherwise they wouldn't put that much money towards it. There are tens of thousands of movies you have never heard of, but chances are you might like some of them.
I may pick Movie X to watch because the wife and I each had a hard week, but Movie X may be something that we'd never view under any other circumstance. A discrete system has a very hard time categorizing something as fluid as mood and could easily be led to make very inaccurate recommendations on the whole.
If it has a hard time categorizing it, it's because you gave it bad data with your ratings. If the movie's a one-off thing, either don't rate it, or rate it down.
That said, a sophisticated rating system should be able to recognize multi-modal distributions. I like some dumb comedies, some cerebral Science Fiction, and some action thrillers. A good system should pick out my trends amongst each of these to make suggestions within each genre, some crossovers, and really wouldn't be affected by the one oddball movie I'll never watch again.
It doesn't seem too out of line to assume that if you watch enough "mood movies", a good system will make several suggestions for the next time your wife is in a mood for a similar movie, as well as suggestions for your movie watching other times. They aren't mutually exclusive. It's just up to you to look at the appropriate "Because you liked Romantic Comedies/Foreign Films/Summer Blockbusters/Erotic Thrillers" suggestion box for the mood you're in. The system isn't just telling you "Watch this one movie now, I know more than you", but it can give you a much refined set of choices.
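As a toy illustration of the "Because you liked ..." boxes, the sketch below groups one viewer's ratings by genre and keeps only the genres with enough well-rated titles, so a single oddball movie does not start its own suggestion box. The thresholds (`min_count`, `min_mean`), the genre labels, and the history are all assumptions for the example, not anything Netflix has described.

```python
from collections import defaultdict

def liked_genres(history, min_count=2, min_mean=3.5):
    """Return genres the viewer has rated at least min_count times with a mean of min_mean or more."""
    by_genre = defaultdict(list)
    for genre, stars in history:
        by_genre[genre].append(stars)
    return sorted(
        genre
        for genre, stars in by_genre.items()
        if len(stars) >= min_count and sum(stars) / len(stars) >= min_mean
    )

# Hypothetical viewing history: (genre, stars). The lone musical is the one-off.
history = [
    ("comedy", 4), ("comedy", 5),
    ("sci-fi", 5), ("sci-fi", 4),
    ("thriller", 4), ("thriller", 4),
    ("musical", 5),
]

for genre in liked_genres(history):
    print("Because you liked", genre)
```

Each surviving genre gets its own box, which is how several separate tastes can coexist without being averaged into one muddled profile.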
Write your representatives! Repeal the 2nd Law of Thermodynamics!
It allows the researchers to "cheat" a bit too via an argument by authority, which is not always good, but does at least make the researcher's job easier. A big issue in data mining is that it isn't purely a technical field, but one with both conceptual and technical issues. The over-arching goal is something like, "get useful and/or interesting information out of data". But what is "useful", what is "interesting", and how do we measure when we've gotten it or not? Usually you have to defend why your problem is the right one, why your metric is the right way to measure success on it, etc. Working on the Netflix competition lets you sidestep all that, because Netflix has already decreed exactly what the goal is, and what performance metric will be used to judge success at that goal, leaving only the technical problems.
10 PRINT CHR$(205.5+RND(1)); : GOTO 10
> Why do tens or possibly hundreds of thousands of dollars worth of work just for the chance that you might get paid? It seems absurd.
Perhaps you have become lost on these internets.
I suggest trying this website.
or perhaps this one.
You will likely find them much more aligned with your interests than slashdot.
Newslily: Social News. No lolcats allowed.
The purpose of this contest is to figure out who won the previous contest.
> lazy people who just want cash for their clunker and to be left aloe to play Halo
Well, to be fair, aloe does help with the chafing after 48 hours of non-stop Halo.
Yeah. Would you choose a neurosurgeon who pokes around people's brains in his spare time? I wouldn't.