Facebook Crawler Speaks Back
Last week we ran a story about Facebook suing to get a crawled dataset offline. This week we have a bit of a
response written by Pete Warden, the guy who actually did the crawling. He followed robots.txt, and then Facebook's lawyers went after him. It's actually quite an interesting little tale and worth your time.
Mark Zuckerberg is the most unethical guy in the industry today, as is obvious from the origins of Facebook, his infamous hacking of journalists' passwords during the the-facebook era, and countless other fiascoes that come up in the news from time to time. Everyone who has ever dealt with him has bad things to say about him.
If he is the face of the next generation of entrepreneurs, then God save the industry.
The guy's work looks somewhat interesting. I don't see why he can't just make it a facebook app or something that just happens to cross over onto the rest of the internet as well. Maybe that would have helped him fly under their radar, if it was seen as something that enhanced facebook.
But seems like his problem all along was lack of publicity, which /. will surely help with.
That said, call me old-school, but I've had more fun with things like ircstats. So I'm mostly still waiting for this new social crap to catch up.
matter if he doesn't have the big $ needed to hire lawyers
Thank you. I ran an open source project for a few years and came home one night to find that my webhost had taken its site down after being contacted by a company with a similar name. The company claimed they'd tried to contact me, explained how my project was causing them harm, but the simple fact of the matter was that my project's name did not infringe on theirs.
I ended up renaming the project. I've told the story dozens of times, and the response is always the same. "That's BS! They can't do that! Go to court!" People don't understand that $20 a month in unmanaged Google ads doesn't cover lawyers the same way that company's actual paying customers do.
Whale
Assuming what he did produces a valuable result.
If it's defensible in court by an entity with enough cash or lawyer might, why is there no such entity doing the same thing and then fighting facebook in court?
If it isn't defensible in court, why does it matter that he didn't fight because he didn't have the money?
how about if he rejigged his crawler to get the data from the google cache instead? So he'd never get anything from facebook or enter into any implied agreement with them.
Naive perhaps but also depressing. Our system does not offer equal justice. It only offers justice for those who can afford to pay handsomely for it and thus guarantees injustice for those who cannot. Hurray.
Of course, even coming up with a hypothetical system of justice that would solve this inequity is incredibly difficult so the system we have endures.
I am not anything even approaching a lawyer, but I suspect his actions were probably legal. The Internet is a public medium; unless you specifically put walls around content, it has the same protection as if you posted fliers on a physical bulletin board in a public place. Yes, you retain copyright over your content, but you have ZERO ability to say "by reading this, you agree to additional terms". If I want to produce a review of all the fliers posted around town, I can. If I want to make excerpts (within "Fair Use") I can. Pretty much the only thing I can't legally do is deface them or copy them outright. Unless he was doing this from a logged-in account, I can't see how they can limit what sorts of derivative works he makes. (So long as the derivative doesn't violate copyright)
It takes a babe in the woods to think he can just waltz in and take that away with a "But your robots.txt didn't say I *couldn't* do it" defense, without expecting a big legal fight.
Yes. Apart from anything else, he's just about entirely missing Facebook's point. Facebook don't give a shit how he accesses their site; this has nothing to do with the fact that he spidered it in a way that their robots.txt file allows, and everything to do with the fact that he was *redistributing their data* without consent.
Now, the question becomes whether what he was distributing falls under fair use. This is a very tricky question, and has nothing to do with how he acquired it.
American justice might be blind, but it knows what money smells like. One more reason why we need judicial reform to prevent abuses like this. Of course fighting it wouldn't be worth it, as even if you won, your "winnings" would have only been the ability to continue using the name. Another good example is http://www.nissan.com, where he actually fought and won, at a great price. His name is Nissan, and his computer business and name existed back when the cars were called "Datsun", but they sued anyway. This is another one of those "We are bigger than you, thus more deserving of the domain name than you" cases.
Tequila: It's not just for breakfast anymore!
If they want to sell data (as they clearly do given that's what their business model is built upon) then they should take greater precautions to ensure that it is protected. If they leave that information out in the open, for anyone with a hint of insight to find, then they should not be surprised to find their valuable data in the hands of someone else. He didn't delve into their private information - he simply accessed publicly available information that anyone with an internet connection could view.
Facebook got lucky - the data was gathered by just an average Joe without the backing to fight a legal battle. Had it been someone significantly larger, the result may have been "go ahead and sue - we'll see you in court." And, quite frankly, I'd be shocked if Facebook would win that sort of battle. And that's a battle that Facebook decidedly does not want to lose - it would mean the end of their business...
I'd be curious to learn if that information is still available (as I am certain it is...) because someone/some company might decide that's pretty valuable _PUBLIC_ information and might, just might, decide they're willing to battle Facebook's legal team for it... Expensive legal battle over very valuable marketing data... If you have the resources for the fight, it might be a fight worth waging...
Facebook may have gotten lucky once but they may not be so lucky next time...
Threats of legal action are not a lawsuit. He didn't get sued. He got bluffed. I don't blame him for caving in, but he shouldn't mislead people by referring to the receipt of threats from lawyers as being sued (this is the sort of error I expect from the Slashdot editors, of course).
Warning: this article may contain humor, sarcasm, parody, and perhaps even irony. Read at your own risk.
robots.txt isn't a legal document; it's an accepted industry-standard way for web sites to limit what web spiders and other web robots can search for. Facebook's robots.txt file basically welcomes everyone to come on in and search their site. Complaining that someone used the data that you gave them permission to access is like realtors complaining that someone is visiting open houses they sponsor and then publishing an analysis of houses for sale based on data gathered during those visits. If Facebook doesn't like that others can aggregate data on their site they should get the industry to agree to a new standard tag that permits crawling but forbids aggregation.
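For anyone who hasn't written a spider, here is roughly what "following robots.txt" means in practice. This is only a minimal sketch using Python's standard urllib.robotparser. The robots.txt text, the crawler name "example-crawler" and the example.com URLs are all made up for illustration; this is not Facebook's actual file, and it is not necessarily how the crawler in the story did it.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content for illustration only.
# This is NOT Facebook's actual file.
EXAMPLE_ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt text lets user_agent fetch url."""
    parser = RobotFileParser()
    # parse() takes the file as a list of lines, so no network access is needed.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # "example-crawler" and the URLs below are hypothetical.
    for url in (
        "https://example.com/people/someone",
        "https://example.com/private/data",
    ):
        verdict = "allowed" if is_allowed(EXAMPLE_ROBOTS_TXT, "example-crawler", url) else "disallowed"
        print(url, "->", verdict)
```

Note that the check only answers "may a robot fetch this URL?". The file has no way to say what the robot may do with the page afterwards, which is the gap between crawling and aggregation described above.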
One reform that would help - not completely, because deep pockets would still win out a lot - is the idea of "loser pays." It's one of the few decent legal ideas out of Europe, and it helps prevent frivolous suits...
Then we have these "tea party" idiots. Loudmouths simply looking for 10 minutes of fame who really have no desire to protect freedom.
I suppose you attended one of their rallies and spoke with a few of them about their views before making that judgment. Oh, what's that? You didn't? Oh, you heard from someone that tea partiers are all ultra conservative nutbags. You saw on tv how redneck they all look? Right.
Two serious points I have to make. One is that these "tea parties" have become the only outlet that many of us conservatives have for expressing ourselves politically. Many of us feel totally disillusioned by the Republican party, and are reaching for some other outlet. Tea parties are pointedly not organized under a consolidated traditional leadership; they are intended for all different flavors of "conservatives" to come together to speak against things that we all oppose.
Second, I hope you are not seriously hoping and waiting for an uprising. Aside from the tragic death and devastation and the decades of anarchy before proper government would be restored, who exactly do you think would lead that uprising? (hint: it's the ultraconservative nutbags that own most of the guns).
You want people to fight for their freedoms but when someone does, all of a sudden they're a bunch of idiots.
Actually, they're idiots because they're stupid, not because they're fighting for their freedoms.
As one person put it, the teabag movement is like the French Revolution in reverse - a bunch of people running around demanding *more* power for the aristocracy.
You want people to fight for their freedoms but when someone does, all of a sudden they're a bunch of idiots.
The teabaggers aren't fighting for freedoms, they're fighting for the "right" to not pay taxes. They're against big government and wasteful spending, kudos to them for that, but where were they when Bush and the Republican congress took Clinton's balanced budget and ran up the biggest deficit in history? You can't undo eight years of damage in a single year. Their "vote against incumbents" rings hollow, since we voted the Republicans out and the Democrats in. Where was their cry to vote against incumbents when the Republicans held the majority and were running up the debt?
As to "freedom", the Constitution specifically grants Congress the right to lay taxes. Taxes do not infringe your freedom, and government is impossible without them.
Free Martian Whores!
If they wanted their freedom, they'd be crying out over the mass imprisonment of Americans for victimless crimes. They'd also realize that being able to acquire health insurance when you're between jobs, or starting your own consulting business makes you MORE free. Nothing I've seen suggests that they actually care one bit about freedom.
Give me Classic Slashdot or give me death!
You know, for a group that's supposedly so disillusioned with the Republican Party, I sure see a lot of lifelong Republican Party bigwigs getting cheered at your rallies and waving the Teabag flag at every opportunity. You say you want to get rid of the Republicans who ran up the deficit under Bush, but those are the very same people I see headlining most of your rallies. I mean, you do realize that your heroes Mitch McConnell and John Boehner were part of that group that ran up the deficit, right? Did I miss the throngs of Teabaggers calling for them to be replaced? No...is that silence I hear?!?!?!?
So either you're lying, or you're a bunch of dupes--which is it?
SJW: Someone who has run out of real oppression, and has to fake it.
I've read through the visible comments, and all of them seem to miss the point: the legal system has just operated in reverse. Rather than preventing the stronger entity from stealing from the weaker, it was actually the means by which the stronger DID the stealing.
Here, so far as I can tell, is what happened: The guy pulled a bunch of PUBLICLY AVAILABLE data from Facebook, connected it in new and interesting ways, offered to sell the product of his hard work to other entities, and then had to delete it all because Facebook got antsy and sued him, and he didn't have the money to defend himself. And of course, Facebook will now take the same ideas, and build up and sell their own datasets.
This is akin to bullies using school rules to steal homework from nerds and turn it in under their own names.