Replacing Sports Bloggers With an Algorithm
tesmar tips a report up at TechCrunch that begins "Here come the robo sports journalists. While people in the media biz worry about content mills like Demand Media and Associated Content spitting out endless SEO-targeted articles written by low-paid Internet writers, at least those articles are still written by humans. We may no longer need the humans, at least for data-driven stories. A startup in North Carolina, StatSheet, today is launching a remarkable network of 345 sports sites, one dedicated to each Division 1 college basketball team in the US. For instance, there is a site for the Michigan State Spartans, North Carolina Tar Heels, and Ohio Buckeyes. Every story on each site was written by a robot, or to put it more precisely, by StatSheet's content algorithms. 'The posts are completely auto-generated,' says founder Robbie Allen. 'The only human involvement is with creating the algorithms that generate the posts.'"
I tried reading the first article on the Tar Heels, and even setting aside how much I hate reading anything about the Tar Heels, the sentences just don't flow together. It's disjointed and mentally uncomfortable to read. I can't imagine anyone using it as an actual replacement for even semi-well-written content.
This post was written by a robot.
I'm a good cook. I'm a fantastic eater. - Steven Brust
I've read a couple of articles and they are no worse than the SEO-targeted content written by freelancers on oDesk for $2/hr (with English as a second or third language).
Seems as though the "algorithm" is quite elaborate, taking the odds of winning into account as well. It produces lines such as "The [team] was not supposed to win this game, but made it happen" and, from combined player statistics, "Coming off a poorly put together team last year, this year, the [team] looks to have greater talent."
It reminds me of how someone in junior high would write. Impressive. Similar to MIT's paper generator: http://web.mit.edu/newsoffice/2005/paper.html
PHP + MySQL + Mad Libs for Sports.
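For anyone curious what "Mad Libs for Sports" might look like in practice, here is a minimal sketch in Python. It is a guess, not StatSheet's actual code: the function name `recap`, the `winner_pregame_odds` parameter, and the second template are all made up for illustration. Only the "not supposed to win" wording comes from the lines quoted earlier in the thread.

```python
# Hypothetical sketch of odds-aware template filling.
# None of these names come from StatSheet; they are assumptions for illustration.

# Wording borrowed from the generated line quoted in the thread.
UPSET_TEMPLATE = (
    "The {winner} were not supposed to win this game, but made it happen, "
    "beating the {loser} {w_score}-{l_score}."
)
# Invented wording for the case where the favorite wins.
EXPECTED_TEMPLATE = (
    "The {winner} took care of business against the {loser}, "
    "{w_score}-{l_score}."
)


def recap(winner, loser, w_score, l_score, winner_pregame_odds):
    """Fill a canned sentence with game data.

    winner_pregame_odds is an assumed pre-game win probability (0.0 to 1.0)
    for the team that ended up winning.
    """
    # Treat the result as an upset if the winner was the pre-game underdog.
    if winner_pregame_odds < 0.5:
        template = UPSET_TEMPLATE
    else:
        template = EXPECTED_TEMPLATE
    return template.format(
        winner=winner, loser=loser, w_score=w_score, l_score=l_score
    )


print(recap("Spartans", "Tar Heels", 71, 68, 0.35))
print(recap("Tar Heels", "Spartans", 80, 62, 0.70))
```

Swap the two constants for a table of a few hundred templates and you would presumably get closer to what the real sites do, but that part is speculation.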
Now I just need to find a robot to read all these sports blogs to free up time for things I want to do.
Now we need a sports fan algorithm to rid ourselves of all these needless sports fans in the world and replace them with something more worth the resources.
Mission. Fucking. Accomplished.
This is the DJ 3000. It plays CDs automatically, and it has three distinct varieties of inane chatter:
- Hey hey -- how about that weather out there?
- Whoa, that was the caller from hell.
- Well, hot dog -- we have a wiener.
- Those clowns in Congress did it again -- what a bunch of clowns.
How does it keep up with the news like that?
I read the first article on the first linked site and I was impressed. I wouldn't have known it was generated by a computer. Even knowing that it was computer-generated, I'd still be happy with the quality for this kind of reporting. Very good.
I am going to guess that there will not be any humans involved in reading the output either.
Can you copyright the output of an algorithm? Seriously, copyright requires a creative element...
Why am I suddenly reminded of this t-shirt? :)
As I suspected it would, the first sentence includes the word "momentum."
Q: What does the "B." in Benoit B. Mandelbrot stand for? A: Benoit B. Mandelbrot
At least we know that Slashdot isn't generated by robots. A robot wouldn't make the idiotic mistakes that the current human (for want of a better word) editors do. E.g. "one dedicated to each Division 1 college basketball tam in the US." Robots don't suffer from dyslexia, and aren't too lazy to use a spell check.
Now sports editors have something to show novice reporters. "If you can't give me something a whole lot better than this, you're fired".
It's a reminder that standards for every knowledge-based profession are going up every year, driven by the combination of the Internet, globalization, and Moore's Law. And this is just the start of it for journalism.
Part of good sports writing is that it evokes emotions. I read some samples and they're devoid of feeling. The algorithm is also completely unable to recount similar events in the past. In fact, no actual events are mentioned beyond statistical data. I want to know about fights during a game or the nearly perfect game that got spoiled.
You mean like BBC News? I just clicked on the first UK article I found to give an example: http://www.bbc.co.uk/news/uk-england-nottinghamshire-11751079
c++;
I've always been amused by how sports reporters vary the verb used to describe a win. They can't just keep saying "Team A beat Team B" over and over, so they mix it up, based on how wide the score was. For a win with a small margin, they might say "Detroit edged Ottawa", or "The Rangers slid past the Ducks". For a large margin, perhaps "The Coyotes pummeled the Blues". I give extra credit if the verb matches the subject, as in "So-and-so doused the Flames".
I think it would be a lot of fun to write a program for this.
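A rough sketch of such a program in Python, assuming hockey-style scores. The margin thresholds, the middle-tier verbs, "routed", and the function name `headline` are my own assumptions. The verbs "edged", "slid past", "pummeled", and the "doused the Flames" pun come from the examples above.

```python
# Hypothetical verb picker: thresholds and most names are assumptions.

# (largest margin covered, verbs to choose from), checked in order.
VERBS_BY_MARGIN = [
    (1, ["edged", "slid past"]),              # one-goal games
    (3, ["beat", "topped"]),                  # comfortable wins
    (float("inf"), ["pummeled", "routed"]),   # blowouts
]

# Extra credit: a verb that puns on the losing team's name.
PUN_VERBS = {"Flames": "doused"}


def headline(winner, loser, w_score, l_score, pick=0):
    """Build a one-line result, choosing the verb from the winning margin.

    pick selects among the synonyms in a tier so repeated headlines
    do not all use the same verb.
    """
    margin = w_score - l_score
    if margin <= 0:
        raise ValueError("winner must have the higher score")

    if loser in PUN_VERBS:
        verb = PUN_VERBS[loser]
    else:
        for limit, verbs in VERBS_BY_MARGIN:
            if margin <= limit:
                verb = verbs[pick % len(verbs)]
                break

    return f"{winner} {verb} {loser}, {w_score}-{l_score}"


print(headline("Detroit", "Ottawa", 3, 2))    # Detroit edged Ottawa, 3-2
print(headline("Coyotes", "Blues", 7, 1))     # Coyotes pummeled Blues, 7-1
print(headline("Oilers", "Flames", 4, 3))     # Oilers doused Flames, 4-3
```

The pun table takes priority over the margin, which is a design choice: a one-goal "dousing" is a stretch, so a fussier version might only allow the pun for larger margins.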
"The RoboSportReporter is broken again. It looks and smells like someone poured a beer into him."
They were just trying to make him more realistic.
Warning: this article may contain humor, sarcasm, parody, and perhaps even irony. Read at your own risk.