Ask Slashdot: What Does Your Data Mean To Google? (google.com)
shanen writes: Due to the recent kerfuffles, I decided to try again to see what Google had on me. This time I succeeded and failed, in contrast to the previous pure failures. Yes, I did find Google's takeout website and downloaded all of "my data," but no, it means nothing to me. Here are a few sub-questions I couldn't answer:
1. Much more data than I ever created, so where did the rest come from?
2. How does the data relate to the characteristic vector that Google uses to characterize me?
3. What tools do Googlers use to make sense of the data?
Lots more questions, but those are the ones that are most bugging me right now. Question 2. is probably heaviest among them, since I've read that the vector has 700 dimensions... So do you have any answers? Or better questions? Or your own takeout experiences to share? Oh yeah, one more thing. Based on my own troubled experience with the download process, it is clear that Google doesn't really want us to download the so-called "our own" data. My Question 4. is now: "What is Google hiding about me from me?"
My question is: who else is getting data about me from Google? Does Google sell it outright? I suppose that is their business model, but it would be nice to know how my metadata is distributed.
Uh? What question are you trying to answer? And how does that question relate to any of the questions I posed? At first I thought you were trying to say something about derived data, but now I have no idea...
However, one of the categories of data I was looking for was data about me from other sources. For example, in terms of marketing my data to the advertisers, such external data as my credit history would seem to be highly relevant. Perhaps I can find my credit report somewhere in there?
In the original questions I left out one of the peculiarities I already discovered. A lot of "my" data that the google sent me was actually links to other places where I had posted things. In other cases the links seemed completely unrelated to me, as with a Google Play app to some game I don't believe I've ever downloaded or played.
Freedom = (Meaningful - Coerced) Choice != (Speech | Beer^2), and sad sock puppets' bad mods avail them naught.
Seriously, do you really think that with anything short of a court order or an order from Congress (or maybe a gun pointed at their heads) they're really going to show you how much actual data they have collected on you? When you signed up for their 'services' using your real name, you handed them the Keys to the Kingdom, regardless of any agreement (that you likely never read in the first place). The only way to win this game was to have not played in the first place.
Uh? Are you saying that they are hiding it by sending it to me? If so, then what I am seeking could be rephrased along those lines. Right now it looks like I have a gigantic pile of data that's even messier than my actual life, which is saying something.
The 700 dimensions vector (if it's true) is not something you can make sense of. It's an embedding vector that represents your characteristics in relation to all the other people. Each individual dimension doesn't have a meaning.
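To make that point concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the vectors are random numbers, the names `alice`, `bob` and `carol` are made up, and 700 is just the unverified figure quoted above. It shuffles and sign-flips the dimensions of every vector in the same way. Each person's individual coordinates change completely, yet the similarity between people stays the same, which is why only the relations between vectors carry meaning.

```python
import math
import random


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def scramble(vec, perm, signs):
    """Reorder the dimensions and flip some signs (an orthogonal change of basis)."""
    return [signs[i] * vec[perm[i]] for i in range(len(vec))]


rng = random.Random(0)
DIM = 700  # figure quoted in the thread; not verified

# Made-up "users": bob is a noisy copy of alice, carol is unrelated.
alice = [rng.gauss(0, 1) for _ in range(DIM)]
bob = [a + 0.3 * rng.gauss(0, 1) for a in alice]
carol = [rng.gauss(0, 1) for _ in range(DIM)]

# One fixed scrambling, applied identically to everyone.
perm = list(range(DIM))
rng.shuffle(perm)
signs = [rng.choice((-1.0, 1.0)) for _ in range(DIM)]

before = (cosine(alice, bob), cosine(alice, carol))
after = (
    cosine(scramble(alice, perm, signs), scramble(bob, perm, signs)),
    cosine(scramble(alice, perm, signs), scramble(carol, perm, signs)),
)

print("alice~bob   before/after:", before[0], after[0])
print("alice~carol before/after:", before[1], after[1])
```

The two "before" and "after" numbers agree up to floating-point rounding, even though no single coordinate of `alice` survives the scrambling in its original position.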
I used the provided link to "download all your data" and had it save a "takeout" ZIP file on my Google Drive. I then tried adding a few files to Drive, removing them, and then "really" removing them. In both cases a "removed" file (in the Trashcan but not "really" removed) did not appear in the Takeout archive. I then created a new Takeout archive and had it sent as an email to my Gmail account. In both cases it's everything from my drive, calendar, all emails, contacts, bookmarks, photos, etc.
In the expanded ZIP, under the root "Takeout" dir, there's an "index.html" with details on all the files. The second archive I created even contained the first archive in its entirety, from the "Takeout" folder on my Drive.
Are you seeing something other than this?
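For anyone trying to work out where the bulk of their archive comes from, a rough per-folder tally can help. The following is only a sketch in Python using the standard `zipfile` module. It assumes the layout described above (a root "Takeout" directory with one sub-folder per product); the function name `summarize_takeout` and the folder and file names in the demo are made up, and the demo builds a tiny fake archive in memory rather than reading a real one.

```python
import io
import zipfile
from collections import Counter
from pathlib import PurePosixPath


def summarize_takeout(archive):
    """Return {folder: (file_count, total_uncompressed_bytes)} for a Takeout-style ZIP.

    `archive` may be a path or a file-like object. Assumed layout:
    Takeout/<product>/...; files sitting directly in the root are
    reported under "(root)".
    """
    counts = Counter()
    sizes = Counter()
    with zipfile.ZipFile(archive) as zf:
        for info in zf.infolist():
            if info.is_dir():
                continue
            parts = PurePosixPath(info.filename).parts
            if parts and parts[0] == "Takeout":
                parts = parts[1:]  # drop the common root directory
            product = parts[0] if len(parts) > 1 else "(root)"
            counts[product] += 1
            sizes[product] += info.file_size
    return {name: (counts[name], sizes[name]) for name in counts}


# Demo on a fake in-memory archive (all names here are illustrative only).
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as zf:
    zf.writestr("Takeout/index.html", "<html></html>")
    zf.writestr("Takeout/Drive/notes.txt", "hello")
    # An earlier archive saved to Drive ends up nested inside the new one.
    zf.writestr("Takeout/Drive/Takeout/old-takeout.zip", b"\0" * 100)
    zf.writestr("Takeout/Mail/all.mbox", "From someone")
buf.seek(0)

summary = summarize_takeout(buf)
for name, (n_files, n_bytes) in sorted(summary.items()):
    print(f"{name:10s} {n_files:4d} files {n_bytes:8d} bytes")
```

To try it on a real download, pass the path of the ZIP file instead of `buf`. A nested earlier archive, like the one described above, would show up as extra bytes under whichever folder it was saved in.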
The Russians have won. They have made the world a cesspool of distrust, greed, fear and hate.
I can't believe we have deteriorated so far as to let a corporation stalk us.
With Google Chrome you can turn many of their tracking features off, although if you are feeling paranoid there are other web browsers you can use. It does get more difficult to control or stop information being sent to one or more interested parties if the operating system you are using is configured by default to do so, and you can't blame Google Chrome for that.
Like it or not, any site you visit with a web browser will log your information as metadata. Under normal circumstances, metadata is only used for debugging purposes unless a court order is presented to the appropriate managers (ah, the good old days); however, depending on the privacy policies of the company, that metadata can be sold to interested parties.
It must be noted that most computers from the 1950s onward have logged metadata, which, as I explained above, is extremely useful for debugging purposes. Under normal circumstances, metadata was only kept for a few days or months (depending on company policy). However, it appears metadata can be used for other purposes, and depending on which country you live in, there may be government policies in place that require retention of metadata for years.
BTW, I run Linux as my primary operating system and I have instant access to four web browsers: Google Chrome, Firefox, Konqueror and Qupzilla. There are other browsers I could install (it takes about a minute or two) but I choose not to. No matter which browser I use, any site I visit will log my activity as metadata, even if I am using incognito settings. At least I don't have to worry that my operating system is sending data to interested parties.
There ain't no such thing as proprietary standards only proprietary formats. Standards are by definition open.
Sorry, Trax3001BBS, but I have to conclude that you are a terrible writer. Perhaps indifferent to communicating? If so, why write at all?
I'm really trying to strain my imagination for some meaning in any of your comments. Perhaps your last comment is supposed to mean that you think I'm advocating on behalf of Facebook in some sense of its superiority to the google? If so, I would say that I basically have the same questions (and concerns) about the Facebook data, even though there was so much less of it. At least based on Facebook's claim to have three orders of magnitude less data about me...
2. Google doesn't have all that data unified. The takeout project is actually the most unified view of your data.
3. Googlers in general don't have access to your data. Systems do, and use it in an automated fashion. There is break-glass access for some engineers for some types of troubleshooting, but using it triggers alarms.
In general, during my > 5 years at Google, I realized it's a company I'll trust with my data for many years to come. The "Data Liberation Front", which ensures that data takeout is available, is huge. Also, GDPR in Europe ensures that data takeout needs to be very easy for many years to come. Google was just years ahead of the law there.
The main thing to understand here is that there are two types of data:
- Your raw data
- Their 'derived data'
This 'Derived data' (as the databroker industry calls it) is where the real value is. These algorithmically formed 'opinions' about you are the valuable distilled product they sell. In the USA this derived data doesn't belong to you. It's protected as a form of corporate free speech.
In the EU this is a little different, as these 'opinions' are also considered personal data. The question is to what extent you get access to it. For example, the threshold for personal data is when a piece of data can be traced back to fewer than 11 people. So the trick here is to create opinions about small groups of which you are a part. For example, knowing that someone with cancer lives in one of three adjacent houses is not considered personal data.
So far in my explorations of the data I haven't seen any browser history data, though I strongly suspect the google is collecting it.
Unless you have web history enabled (check the settings in myactivity.google.com), I'm quite certain Google is not storing your browser history. I think this is a distinct question from tracking your web browsing through Google Analytics, assuming you haven't opted out of that. In the latter case, Google gets information about the sites you visit from those sites and uses it to update your interest profile, but doesn't store the actual visit history.
Note that there is almost certainly data Google has about you which it cannot show you, because it can't be 100% certain that you are you. Data derived from logged-out interactions can be tentatively correlated with you, but since there's no way to be completely certain you're the same person, it would be a violation of the privacy of whoever actually had that logged-out interaction (which might be you) to show it to you. In the case of logged-in interactions, of course, it's reasonable to presume that anything done while logged into account A can be safely shown to account A.
Note to ACs: I usually delete AC replies without reading them. If you want to talk to me, log in.