Facebook Kills Dataset of Crawled Public Profiles
holy_calamity writes "Internet entrepreneur Pete Warden wrote a crawler that collated the public profiles of 210 million Facebook users and was set to release an anonymised version to researchers. The pages crawled can be read by any web user, and the robots.txt did not forbid crawling. However, Facebook claimed he had violated its terms of service and threatened legal action. Fearing the legal costs, Warden has now destroyed his dataset. For a snapshot of the insights that data could have allowed, see Warden's post on how the friend networks of the 120 million US users in his data segregated into seven clusters." Of course, if he had it, anyone else who wants it has presumably made their own version.
I see very little problem with an automated scan that respects robots.txt.
By not blocking automated access to the profiles, Facebook is squarely at fault.
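For anyone wondering what "respects robots.txt" looks like in practice, here is a minimal sketch using Python's standard urllib.robotparser. The robots.txt content, the user agent name and the example.com URLs are made up for illustration; this is not Facebook's actual file and not Warden's crawler.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content for illustration only.
# This is NOT Facebook's real file.
EXAMPLE_ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt text permits user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() takes an iterable of lines, so no network access is needed here.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # "example-crawler" is a made-up user agent name.
    for url in ("https://example.com/people/alice",
                "https://example.com/private/settings"):
        verdict = "allowed" if is_allowed(EXAMPLE_ROBOTS_TXT, "example-crawler", url) else "blocked"
        print(url, verdict)
```

A real crawler would fetch the site's live robots.txt (RobotFileParser also has set_url() and read() for that) and check each URL before requesting it. Note that robots.txt is only a convention: passing this check says nothing about a site's terms of service, which is exactly the gap this story is about.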
Isn't this the golden egg of Facebook? I thought this was what they were selling. That data is fascinating: it is completely anonymous, yet at the same time very insightful for marketing purposes. I think Facebook is just upset because they plan on selling the same data that Pete had.
Since this is publicly available information, and all he did was send a program to go grab it (much akin to asking your web browser to download it), does this mean Facebook has essentially threatened him for no more than reading too much of Facebook too quickly? Sounds absurd to me.
I don't see Facebook going after Google, even though the data that they possess is ostensibly the same as Warden's. The primary difference that I see is that Warden was offering analysis and results for free, not trying to monetize them. Maybe that's what made them mad.
Why do you think they threatened him? They want to sell this data themselves.
"In America, first you get the sugar, then you get the power, then you get the women..." -H. Simpson
They did something similar to FB Purity, a Greasemonkey script that allows users to filter out apps and other stuff they don't want to see in their feed. Facebook argued that the script's authors were misusing Facebook's "FB" trademark... eventually Facebook let them continue under the name "fluff busting purity", probably due to the PR backlash that shutting them down would bring.
They've also shut down the Facebook portion of the Web 2.0 Suicide Machine, which runs scripts that allow a user to delete their social profiles as thoroughly as sites will allow. In that case, they argued that the Suicide Machine was violating their "Statement of Rights and Responsibilities"... which isn't even a law! Nonetheless, the Suicide Machine didn't have the financial ability to fight even frivolous claims like that, so they folded that section.
Facebook apparently believes that its users will continue using the site regardless of the ridiculous access policies that its legal department creates and defends. I hope they're wrong.
It's better to vote for what you want and not get it than to vote for what you don't want and get it.
- E. Debs