Google Should Be Logging In To Facebook
In the dust-up over the revelation that Facebook had paid a PR firm to plant negative stories about Google indexing Facebook's site, one point was often overlooked: the allegation that Google had been creating dummy Facebook accounts, and using them to log in to Facebook and spider information that was only available to Facebook users. This was denied by Google and never proven, but the denial obscures a more important point. Paradoxically, rather than hurting user privacy, it would have helped to protect user privacy in the long run if Google actually had been logging in to Facebook, spidering the information that was available to members, and making that information available in Google search results.
To review the facts not in dispute: When you create a Facebook profile, Facebook by default makes certain categories of information viewable to other users. Most of your personal information (in particular, your contact information) is viewable to other members that you confirm as your Facebook friends. A narrower set of information — usually including your name and your interests, but not including your contact information — is viewable to other Facebook members who are signed in to Facebook, but who are not in your friends list. (Let's call this the "Facebook stranger" version of your profile.) Finally, since 2007 Facebook has made an even smaller subset of information available in a "public search listing," which can be viewed without being logged in to Facebook or even having an account. Facebook explicitly stated that one reason for creating these public search listings was to make the profiles more easily findable by Google.
Now, the op-ed that Burson-Marsteller was trying to plant in the press strongly suggested that Google was using tactics like creating fake Facebook accounts in order to log into Facebook and scrape the "Facebook stranger" version of people's accounts, and not just the public search listings. (For one thing, the op-ed accused Google of likely "violating the Terms of Service" of Facebook. While scraping the public search listing obviously doesn't violate the TOS, creating dummy accounts to log in to Facebook and spider content automatically certainly does — and that's the only thing Google could do on Facebook beyond spidering the public search listing.) Of this allegation, Wired senior writer Steven Levy wrote:
"This information is a lot easier to unearth from inside Facebook, but actually logging into Facebook to purloin information would indeed be troublesome. For one thing, it would violate the terms of service agreement. Is Google doing this? One of the Burson operatives implied that it is. But Google says the company does not go inside Facebook to scrape information, and I find this credible. (If Facebook has logs to prove this serious charge, let's see them.)"
But why is this such a scurrilous charge anyway?
When you search for a person's name on Google, you might be looking for information about that person, or you might be doing research on what other people in the world can find about that person (particularly if that person is yourself). If a certain fact about you (for example, the members of your Facebook friends list) is viewable to anyone with a Facebook account as long as they're logged in to Facebook, then anybody in the world can obtain that information about you anyway, by getting their own Facebook account. So it's perfectly legitimate for Google to report that as a fact that anyone can find about you, if you Google your own name. You may not like the fact that Facebook exposes that information about you to anyone with a Facebook account, but it's Facebook, not Google, that makes the information available to anyone. If you Google your own name and Google tells you that some piece of information is available to any Facebook user, Google is doing you a favor.
For that matter, it's not that easy to view your own "stranger Facebook profile" on Facebook, to see for yourself what other users can see about you. You can't just click your own profile while signed in, since that will show you all of your own personal information. You can't sign out and then click your own profile, since that will show you your public search listing (which is shown to non-logged-in users). You would have to, instead, create a second dummy Facebook account (already a violation of Facebook's TOS), which usually requires creating a second email address that you can tie to your second Facebook account, then signing in with your second account and trying to view your "real" one... How many people — even the most privacy-conscious ones who pore over every article about Facebook allegedly exposing their data — have ever tried that experiment? Having the information already spidered by Google would make it much easier.
When would you actually derive some privacy benefit from not having your "Facebook stranger" profile information listed in Google? Really, only if you're being looked up by a particularly lazy stalker who searches your name on Google — but then doesn't even bother signing in to Facebook and searching for your name on Facebook. If they're motivated enough to find you on Facebook and view your "Facebook stranger" profile there, then you've gained nothing by blocking that information from Google.
Notice this argument does not extend to some general principle that webmasters shouldn't be able to tell Google not to index parts of their website. Many websites have specified, using the Robots Exclusion Standard, that they don't want Google indexing certain documents on their site. (The Robots Exclusion Standard allows webmasters to create a file called robots.txt on their website, which tells search engines not to index any files listed in the robots.txt file. It would be technically possible for a search engine to ignore that directive and index the documents anyway, but virtually all search engines do follow it.) In that scenario, even if a document listed in robots.txt contains personal information about someone, there's no argument that "someone could find it anyway by searching, so Google is doing you a favor by listing it," because nobody would be able to find it by searching unless Google lists it. What makes Facebook a special case is that (a) it has its own search function, and (b) more importantly, it's already the place that everybody knows to go looking if they're searching for a person. These two facts mean that people can find you on there without Google's help.
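To make the robots.txt mechanism concrete, here is a minimal sketch of how a well-behaved crawler might consult the file before fetching a page. It uses Python's standard urllib.robotparser module. The robots.txt contents, the "ExampleBot" user agent, and the example.com URLs are all made up for illustration, and this is not a description of how Google's crawler is actually implemented.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt for an imaginary site. The paths are
# assumptions for illustration, not taken from any real website.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
Disallow: /members/
"""


def build_parser(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt text that has already been downloaded."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


def may_fetch(parser: RobotFileParser, user_agent: str, url: str) -> bool:
    """Return True if the rules allow this user agent to fetch the URL."""
    return parser.can_fetch(user_agent, url)


parser = build_parser(ROBOTS_TXT)

# A page outside the disallowed paths may be crawled.
allowed_public = may_fetch(parser, "ExampleBot", "https://example.com/index.html")

# A page under /private/ is off limits to a compliant crawler.
allowed_private = may_fetch(
    parser, "ExampleBot", "https://example.com/private/profile.html"
)

print(allowed_public, allowed_private)
```

Note that the check is entirely voluntary: nothing stops a crawler from skipping the call to may_fetch and requesting the page anyway, which is the sense in which a search engine could technically ignore the directive.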
That might sound unfair to Facebook — that simply because they've achieved success, different rules should apply to them, and Google ought to be allowed to violate their TOS by logging in to their system and spidering people's Facebook-stranger information. But it's the only way for Google to display honest answers, if a user comes to Google to ask: What can strangers on the Internet find out about me?
P.S.: I received many useful suggestions in response to a previous article, in which I described an algorithm for crowdsourcing the abuse-complaint-review process on Facebook, and offered a $100 prize split between users who sent in the best criticisms or improvements. So I'm going to do it again in a more free-form approach: I'll offer a $50 prize to be split between readers who email me the best negative comment or counterargument to the argument that I've just made here. Entries have to be submitted by email, although of course you can and should post your thoughts in the comment threads as well. Email bennettSPAMMERS at SUCKpeacefire dot org with "googlebot" in the subject. You can also donate your winnings to a charity of your choice.
Who the fuck cares...
The world's burning. Moped Jesus spotted on I50. Details at 11.
Using this logic, Google should be given an account on every forum, every blog, every tube site, every system with a login, so that users can see if they are talked about somewhere on the internet. Sounds like a stupid idea. Why should they be able to log into Facebook when they can't log into that small little web forum?
After all, many of them are just fake accounts to help generate credits for facebook games.
And then you have to add the multiple shill accounts that companies like my former employer, starmedia created and used in what must be one of the lamer attempts to create "buzz".
Throw in the bot accounts that you can get to "follow you" for under a penny apiece in bulk (search for "buy facebook fans" - another of their scummy practices that I had to laugh at).
Last, we have the "real dummy" accounts - you know, the ones that are used to post all sorts of inanities, and the reason why most facebook posts are never seen by human eyes.
In that scenario, even if a document listed in robots.txt contains personal information about someone, there's no argument that "someone could find it anyway by searching, so Google is doing you a favor by listing it," because nobody would be able to find it by searching unless Google lists it.
Which is why you download robots.txt when you want to hack a site. Oh, look, this file contains the root password? ssh ... oh that worked. :D Yes this has happened a lot.
Yes.
Privacy > Customise Settings > Preview my Profile.
By default it shows what "most people" (i.e. strangers) see. You can then customise it for individual friends on your list.
> Can't you just look at your Facebook settings to see
> what information is available to other people who are
> logged in to Facebook?
Yes, if you trust Facebook to show you *exactly* what other people see. Personally, I would prefer a 3rd party to do this.
There's a jump here: that just because anyone with a Facebook account can find this fact about you, this justifies anyone at all finding you on Facebook just because they're using Google.
I see two ways to support this notion. The first is that Google has a right to presume that you have your own Facebook account, or are willing to get one to view a search result. The second is that information that is available to anybody with an account on a site ought to be considered public. In other words, either there is something special about Facebook, or there is no difference between restricting something to a huge group that it is easy to join and not restricting it at all.
While I agree with the author that it could help expose Facebook's failure to protect users' privacy, I don't think I feel strongly enough about that to grant either form of the supposition that public in Facebook is equivalent to public. Facebook is not a public utility. I deleted my account, as can anyone else (provided you follow their rather arcane instructions). And I don't want Google to have some kind of "information eminent domain" right that makes it OK for them to anonymously spider sites that require login from everyone else. Google is also not a public utility.
The solution to Facebook sucking should not be granting unnecessary power to Google.
I mean... they (Google) should be focussing on marketing their authentication services using OAuth.
As it stands now, what I see are more and more websites asking potential contributors to use Facebook, Yahoo, AOL or Hotmail. As for Google, it's nowhere to be seen!
One wonders whether Google is just a sleeping giant or whether these sites are engaged in a conspiracy to sideline Google.
Good point! But at least people aren't stupid enough to enter every detail of their lives into a massive database that could describe the network of people they associate with, along with photos of them and said people. Or couple such information to their portable GPS-enabled electronic devices that can be remotely enabled and tracked at any point in time. That would just be stupid! Who would do that! BTW, what's your Facebook name? I'll hit you up from my phone's Facebook app!
Google should not be spidering private Facebook information. Now I'd love nothing more than to shock Facebooktards with a summary of how much info they're spilling, but my reason for disagreeing has nothing to do with that.
Private Facebook data shouldn't be indexed for the same reason info behind a paywall shouldn't be indexed. Because it's information that the organization doesn't want to make available to non-members. And on the Internet, the price of exclusivity should be obscurity. It would just be false advertising (benefiting only the content provider) and a disservice to searchers otherwise, like showing a paywalled news article in a search result.
The price for presenting useless information should also be obscurity. A Facebook teaser page that says "look, we found the person you're looking for! Just sign up to get access!" is useless info, and like the various content farming sites, should likewise be relegated to the later pages of search results with the other low-relevance junk.
"When information is power, privacy is freedom" - Jah-Wren Ryel