Augmenting Data Beats Better Algorithms
eldavojohn writes "A teacher is offering empirical evidence that when you're mining data, augmenting data is better than a better algorithm. He explains that he had teams in his class enter the Netflix challenge, and two teams went two different ways. One team used a better algorithm while the other harvested augmenting data on movies from the Internet Movie Database. And this team, which used a simpler algorithm, did much better — nearly as well as the best algorithm on the boards for the $1 million challenge. The teacher relates this back to Google's page ranking algorithm and presents a pretty convincing argument. What do you think? Will more data usually perform better than a better algorithm?"
"What do you think? Will more data usually perform better than a better algorithm?"
I need more data.
With reasonable men I will reason; with humane men I will plead; but to tyrants I will give no quarter. -- William Lloyd
Say what you want about computer scientists, but without them you'd probably be complaining on a chalkboard.
Mathematics is physics without purpose, Chemistry is physics without thought, Engineering is physics - CliffsNotes edition.
Mathematics is physics without purpose, Chemistry is physics without thought, Engineering is physics without tenure.
Sorry, I'm a writer. That makes you raw material.
Riiight. And mathematical research is just finding a Hamiltonian cycle in a graph defined by the set of axioms used.
☭ = 卐