Turning Data Science Into a Spectator 'Sport'
vu1986 writes "Kaggle has a 'predictive-modeling competition platform that makes public the competitors in invite-only private competitions. Think of it like watching a major tournament in golf or tennis, where you can watch the best in the world shoot it out to see whose algorithms are king. Kaggle's tagline is "We're making data science a sport." Maybe now it can make data science a spectator sport.'"
We have plenty of armchair "quarterbacks" right here when it comes to science. They can blab on and on about what's right and wrong but damn it if they're ever asked to put their shoulder into the effort. Most of them seem to know less than a high school chemistry student. But, you know, these guys think they're on the ball.
Announcer: Holy cow, that recursive data-parsing algorithm discovered a secret code hidden within the Book of Revelation in 18.5897923 seconds! "All your base are be...." Wha- What the hell is this crap!?!?
sudo make me a sandwich
They could make it go faster than televised bass fishing.
Seriously, no one not wearing white polyester pants up to below their chest and golf shoes, or someone wearing hip waders and holding a fly reel, would have the patience to watch this.
Even if you could trick someone into watching it, you're never going to get beyond the "accumulate points" stage, unless there's an end goal, and you can see progress toward that goal well enough that the representation would allow you to predict a winner or a close race.
If it goes anywhere, it'll be because Jeff Bezos or Larry Ellison favors a team and drops a bunch of machines into that team's cluster. Actually, if it's Larry Ellison, expect him to drop just enough computers into the underdog to be able to claim a tax write-off and fix the Vegas odds to the point he can switch the support at the last minute and cash in.
If they get hot Russian chicks in short skirts, I am so there!
...people who are fanatically devoted to one viewpoint, ignoring all evidence to the contrary, and demonizing their opponents. Yeah, science needs to be more like sports.
I hope the OP gets cancer and dies...
So instead of being employed, we're all expected to work, for free, in the hopes that we win a contest? I sure as hell hope this violates all kinds of labor laws.
The labor market has become the Hunger Games. We all lose.
1^2=1; (-1)^2=1; 1^2=(-1)^2; 1=-1; 1=0.
Here is today's schedule for the most boring TV channel:
Cricket
Canoeing
Bass Fishing
Poker
Bowling
Darts
Predictive-modeling competition
I've been working on the Heritage Health Prize that Kaggle is running for over a year now. It's a fantastic way to learn data science and tackle real-world problems with real data and a co-op-etitive spirit. The forums and winning solutions are great for learning the art, and if you've never used R, it's a great opportunity to learn it and talk to people who have a ton of experience in the area.
I've entered a couple of Kaggle competitions, but I'm kinda put off by the opaque results.
After the first one ended (predict HIV progression), the released full dataset indicated that the data had been sorted before it was separated into train and test sets. IOW, after being sorted by length, all the short sequences were put into the training set, and the longer ones into the test set. This mistake may have invalidated the competition, and I strongly suspect it would have invalidated any paper written about the results.
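To see why sorting before the split matters, here is a minimal Python sketch. Everything in it is an assumption for illustration: the data is made-up random-length strings (not the actual HIV sequences), and `split` and `mean_length` are hypothetical helpers, not anything Kaggle is known to use.

```python
import random


def split(items, train_fraction=0.8):
    """Cut a list into (train, test) at a fixed position, with no shuffling."""
    cut = int(len(items) * train_fraction)
    return items[:cut], items[cut:]


def mean_length(seqs):
    """Average length of the sequences in a list."""
    return sum(len(s) for s in seqs) / len(seqs)


# Made-up stand-in data: 1000 strings with lengths between 50 and 500.
rng = random.Random(0)
sequences = ["A" * rng.randint(50, 500) for _ in range(1000)]

# The failure mode described above: sort by length, then split.
# Every training sequence ends up no longer than any test sequence.
sorted_train, sorted_test = split(sorted(sequences, key=len))

# The usual remedy: shuffle first, then split.
shuffled = sequences[:]
rng.shuffle(shuffled)
shuffled_train, shuffled_test = split(shuffled)

print("sorted split:   train mean", mean_length(sorted_train),
      "test mean", mean_length(sorted_test))
print("shuffled split: train mean", mean_length(shuffled_train),
      "test mean", mean_length(shuffled_test))
```

With the sorted split, the model is trained only on short sequences and scored only on long ones, so the leaderboard measures extrapolation to a different length range rather than ordinary generalization. The shuffled split gives both sets roughly the same length distribution.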
More recently, the organizers of one competition stated flatly in the forums that they would release the entire data set once the competition had ended, but then didn't. I inquired about this, and a Kaggle data scientist replied saying "we almost never release the test data".
I'm not sure that Kaggle is all that scientific. If the full dataset can't be examined after the competitions close, there's no way to verify the results.
This smells like old SPAM (both kinds).
Why is Snark Required?
I hope these spectators like endurance sports. My natural language processing models take between 2 and 7 days to create. While I set the model creation going and have a few beers, watch TV etc, they can sit and watch a terminal with an incomprehensible progress report going on.
"Wow! He's completed 87% of the tokenisation! He'll be shooting to score any week now!"
Never mind, as long as they pay.
bang goes my karma... again...
I am opting for HD data while spectating. And maybe some Good N Plenty.