Why Google's Wi-Fi Payload Collection Was Inadvertent
Reader Lauren Weinstein found a blog post that gives a good, fairly technical explanation of why Google's collection of Wi-Fi payload data was incidental, and why it's easy to collect Wi-Fi payload data accidentally in the course of mapping Wi-Fi access points. "Although some people are suspicious of their explanation, Google is almost certainly telling the truth when it claims it was an accident. The technology for Wi-Fi scanning means it's easy to inadvertently capture too much information, and be unaware of it. ... It's really easy to protect your data: simply turn on WPA. This completely stops Google (or anybody else) from spying on your private data. ... Laws against this won't stop the bad guys (hackers). They will only unfairly punish good guys (like Google) whenever they make a mistake. ... [A]nybody who has experience in Wi-Fi mapping would believe Google. Data packets help Google find more access-points and triangulate them, yet the payload of the packets do nothing useful for Google because they are only fragments."
Of course it was accidental. After all, their corporate slogan is "Don't be evil". Obviously they wouldn't do anything that would be evil.
Tequila: It's not just for breakfast anymore!
Inadvertent or not, Google broke laws in some countries. Accidentally breaking the law doesn't eliminate responsibility or culpability - even if people shouldn't have left their WiFi unsecured.
If I accidentally run over someone with my car because I wasn't paying attention to what I was doing, it doesn't absolve me of the liability - even if that old lady had it coming, er, was jaywalking.
Laws won't stop the bad guys, but if you have laws you can at least punish them if you catch them. Claiming Google are the good guys (based on what? their motto?) and saying therefore there should not be laws is just ridiculous.
So far, nothing explains why they stored the data. Recording names of access points? Okay. Recording locations of access points? Mmmmaybe. Recording data retrieved by connecting to unsecured access points? No. How can that data be used for any honest purpose? And let's be clear about this: collecting and storing data is an act directed by software, which was written by a person or persons acting under direction, ostensibly to a specification. Find those specifications and the people who directed them, and you will come closer to finding the truth as well as those responsible.
The argument is that capturing data packets is useful to find the SSID of access points which send beacon frames with blank SSID field or where only a client is within range but not the access point itself. That argument is bogus. The mobile devices which will later use the mapped SSIDs and BSSIDs to calculate their own position do not see anything but the beacon frames. It is therefore entirely sufficient to capture just the beacon frames.
There is a legitimate argument that Google was just lazy (or "scientific") by capturing everything they can get in the field and analyzing later. There is however no technical reason for this and we should not make one up to defend Google.
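To make the "beacon frames only" point concrete, here is a minimal sketch of such a filter. It assumes each captured frame is a raw 802.11 MAC frame with any radiotap header and trailing FCS already stripped. The names `Beacon`, `parse_beacon`, and `keep_beacons_only` are made up for illustration and are not taken from Google's software or any particular capture tool.

```python
from typing import Iterable, List, NamedTuple, Optional, Tuple

MGMT_TYPE = 0          # 802.11 frame type: management
BEACON_SUBTYPE = 8     # management subtype: beacon
MAC_HEADER_LEN = 24    # frame control, duration, 3 addresses, sequence control
BEACON_FIXED_LEN = 12  # timestamp (8) + beacon interval (2) + capabilities (2)
SSID_TAG = 0           # tagged parameter ID carrying the SSID


class Beacon(NamedTuple):
    bssid: str
    ssid: str


def frame_type_subtype(frame: bytes) -> Tuple[int, int]:
    """Read type and subtype from the first Frame Control byte."""
    fc0 = frame[0]
    return (fc0 >> 2) & 0x3, (fc0 >> 4) & 0xF


def parse_beacon(frame: bytes) -> Optional[Beacon]:
    """Return BSSID and SSID for a beacon frame, or None for anything else."""
    if len(frame) < MAC_HEADER_LEN + BEACON_FIXED_LEN:
        return None
    if frame_type_subtype(frame) != (MGMT_TYPE, BEACON_SUBTYPE):
        return None  # data frames and other management frames are dropped
    bssid = ":".join(f"{b:02x}" for b in frame[16:22])  # address 3
    pos = MAC_HEADER_LEN + BEACON_FIXED_LEN
    ssid = ""
    # Walk the tagged parameters (tag ID, length, value) looking for the SSID.
    while pos + 2 <= len(frame):
        tag, length = frame[pos], frame[pos + 1]
        if tag == SSID_TAG:
            ssid = frame[pos + 2:pos + 2 + length].decode("utf-8", errors="replace")
            break
        pos += 2 + length
    return Beacon(bssid, ssid)


def keep_beacons_only(frames: Iterable[bytes]) -> List[Beacon]:
    """Discard every frame that is not a beacon; no payload is ever stored."""
    return [b for b in map(parse_beacon, frames) if b is not None]
```

With a filter like this in front of the disk, payload bytes from data frames never get written anywhere, at the cost of not seeing access points with hidden SSIDs.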
So what TFA is saying is that the issue isn't simply Google snooping on networks and collecting data? And that there may have been a legitimate reason for this whole situation? And that it's blown out of proportion? STOP RUINING MY REASONS TO BE ANGRY AT GOOGLE!
My concern with what Google, and many other firms, are doing is that they are dedicating huge amounts of resources to collecting huge amounts of data on people. As profit-making entities, these firms must at some point monetize this data to get a return on investment. Therefore, if Google is keeping data other than basic access point information, then they must be planning to do something with it.
"She's a scientist and a lesbian. She's not going to let it slide." Orphan Black
Despite what everyone thinks (and how it seems to the uninformed) it very likely was accidental. If I were tasked with correlating access points to their locations, the simplest way would be to dump raw wireless traffic to one file, and raw GPS data to another. Later, you can zip them both up and run some analysis, and get the data you want out.
It'd be real easy to forget to filter the packets you dump to only anonymous, non-data-carrying packets. More than likely the people who designed it just forgot to, or figured it would be no big deal if they just never used that info. Sloppy engineering maybe, but certainly not malicious.
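The "dump two files, join them later" approach can be sketched roughly like this. It assumes both logs carry timestamps from the same clock, that each Wi-Fi record has already been reduced to a timestamp and a BSSID, and that a two-second gap is an acceptable tolerance. The function name `locate_sightings` and the record layouts are hypothetical.

```python
from bisect import bisect_left
from typing import List, Sequence, Tuple

GpsFix = Tuple[float, float, float]   # (timestamp_s, latitude, longitude)
Sighting = Tuple[float, str]          # (timestamp_s, bssid)


def locate_sightings(
    sightings: Sequence[Sighting],
    gps_fixes: Sequence[GpsFix],
    max_gap_s: float = 2.0,
) -> List[Tuple[str, float, float]]:
    """Attach the nearest-in-time GPS fix to each access point sighting."""
    fixes = sorted(gps_fixes)                 # order by timestamp
    times = [t for t, _, _ in fixes]
    located = []
    for t, bssid in sightings:
        i = bisect_left(times, t)
        # The nearest fix is either just before or just after the sighting.
        candidates = fixes[max(i - 1, 0):i + 1]
        if not candidates:
            continue
        fix_t, lat, lon = min(candidates, key=lambda f: abs(f[0] - t))
        if abs(fix_t - t) <= max_gap_s:       # skip sightings with no nearby fix
            located.append((bssid, lat, lon))
    return located
```

Note that nothing in this join needs packet payloads; only the timestamp and the BSSID from each frame are used.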
There's a very sensitive infrared camera and microphone outside your house right now, and we're disturbed by your interactions with your plushie. In the spirit of blind justice, I'm going to upload to /b/ and let the People decide.
If you broadcast your movements via radio (and air movements), why on earth would you expect anyone to consider it private?
A thick Faraday cage. If you need it, use it.
Whether or not they are the good guys, laws that attempt to contravene physics are a bad idea. If the packets had been encrypted, it wouldn't have mattered that Google captured them--without the key, they're just noise. You could pass a law saying that capturing packets broadcast without encryption is illegal, or you could pass a law saying that if you want your packets to be private, you should encrypt them, and if you don't encrypt them, you have no expectation of privacy. Which of these two laws do you honestly think makes the most sense?
Normally wiretapping involves a deliberate act of bypassing some kind of lock, if only the lock on the box that contains the wires. Here there was no lock, and the packets were hitting the antenna without any special effort on Google's part, and Google did have a legitimate purpose in putting up the antenna and listening for packets. Yes, they got more packets than their legitimate purpose required. Maybe they did so deliberately, although I can't see any reason why that would have been useful to them. But making it illegal is a really expensive way to solve the problem, and it doesn't solve the fundamental problem, which is that people are sending their personal information over the network in the clear.
Yes, they should have only saved the SSID, location, and signal strength. Instead, they used off-the-shelf software which saved more data. There is no reason to believe this was intentional.
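For what it's worth, location and signal strength alone are enough for a rough position estimate. The sketch below uses a signal-weighted centroid, which is a simple stand-in chosen for illustration and not necessarily what Google or any mapping tool actually does. It assumes signal strength is reported in dBm and that the observations are close enough together for plain averaging of latitude and longitude to be reasonable. The name `estimate_ap_position` is hypothetical.

```python
from typing import Sequence, Tuple

Observation = Tuple[float, float, float]  # (latitude, longitude, rssi_dbm)


def estimate_ap_position(observations: Sequence[Observation]) -> Tuple[float, float]:
    """Estimate an access point's position as a signal-weighted centroid."""
    if not observations:
        raise ValueError("need at least one observation")
    # Convert dBm to linear power so stronger readings count for more.
    weights = [10 ** (rssi / 10.0) for _, _, rssi in observations]
    total = sum(weights)
    lat = sum(w * o[0] for w, o in zip(weights, observations)) / total
    lon = sum(w * o[1] for w, o in zip(weights, observations)) / total
    return lat, lon
```

Two equally strong readings put the estimate halfway between them, while a much stronger reading pulls the estimate toward the spot where it was taken.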
That's fine and legal to do in the USA, as you have no expectation of privacy using unencrypted broadcast:
http://www.law.cornell.edu/uscode/uscode18/usc_sec_18_00002511----000-.html
TITLE 18 > PART I > CHAPTER 119 > 2511
(g) It shall not be unlawful under this chapter or chapter 121 of this title for any person—
(i) to intercept or access an electronic communication made through an electronic communication system that is configured so that such electronic communication is readily accessible to the general public;
(v) for other users of the same frequency to intercept any radio communication made through a system that utilizes frequencies monitored by individuals engaged in the provision or the use of such system, if such communication is not scrambled or encrypted.
In the US, if you transmit in the clear on unlicensed spectrum, they can legally pick it up under two different, non-overlapping legal clauses. (Note: I am not a lawyer, this is not legal advice, this is only one of several possibly relevant laws, etc.)
The problem is they didn't need to do so, and it creeps people in the US out. So even here where it is legal, they probably shouldn't have from a PR point of view.
In some other countries it is not legal to collect that data, and doing so unintentionally might lower your penalties, but it still does not make it legal.
Blessed are the pessimists, for they have made backups.
Any geek worth their salt also never makes mistakes. Myself, I think I made a mistake once many years ago, and I was rightfully whipped for my negligence. Now of course I never make them; my work is always perfect.
The thing most people forget to ask, but was asked in this article, is something you conveniently forgot to mention. Here it is:
What possible use could Google have for this data? What would be their motive here?
As the article says, there's almost no personal data in the emails. Even if there is, there's so little of it that what useful purpose could it serve? You'd have a hard time correlating it to any one person, or even finding out what it is. There's going to be so little data here, and it'll be so fragmented, that turning it into anything useful would be impossible.
On the other hand, why would Google risk collecting this data when they knew what would happen if it got out? The risk vs. reward here just doesn't make sense. They're going to risk their reputation on... what? Collecting a few fragments of unencrypted Wi-Fi traffic that probably contain very little information and could very well have been generated by a bot running on your machine.
I'm not going to believe Google did this on purpose until someone can give me a motive that doesn't sound like something from a UFO convention.
You make an excellent point.
For my part, I'd like to point out that if Google wanted to read your email, they wouldn't bother collecting wifi data. They'd just read yer fucking email.