Slashdot Mirror


Google to Anonymize Users' Search Data

Google's official blog states that they are working to anonymize their search data after 18-24 months. After previously fighting an order to turn search data over to the feds, it looks like they are striking another blow against the "think of the children" crowd. Any bets on whether MSN or Yahoo! will follow suit?
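
A minimal sketch of what age-based anonymization could look like, assuming log records are dictionaries with hypothetical `time`, `ip`, `cookie`, and `query` fields. The summary does not describe the mechanism at this level, so the field names and the roughly 18-month cutoff are illustrative only.

```python
from datetime import datetime, timedelta

# Roughly 18 months; the exact window is an assumption for illustration.
RETENTION = timedelta(days=18 * 30)


def anonymize_old(records, now):
    """Blank the IP and cookie ID of every record older than RETENTION."""
    out = []
    for rec in records:
        if now - rec["time"] > RETENTION:
            # Copy the record so the caller's data is not modified in place.
            rec = {**rec, "ip": None, "cookie": None}
        out.append(rec)
    return out
```

Note that the query text itself is kept, so anything identifying inside a query survives this step.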

151 comments

  1. The real WTF is.. by b100dian · · Score: 1, Interesting

    ..the "off the record" button, in the first place!

    --
    gtkaml.org
    1. Re:The real WTF is.. by jacquesm · · Score: 3, Insightful

      I never got why google needs to keep all that history without anonymizing it.

      There is - as far as I can see - no rational argument that has to do with improving search results because you have them tied to individuals.

      And yes, keeping tabs on half the globe is evil too...

    2. Re:The real WTF is.. by LiquidCoooled · · Score: 2, Interesting

      Not only that, but is the history of searches you made over 2 years ago relevant to your current searches performed today?

      --
      liqbase :: faster than paper
    3. Re:The real WTF is.. by dammy · · Score: 1

      Let's see here, they are worried about turning data over to the US government but they have no qualms about getting on their knees to the Communist Chinese government? Am I ever glad I no longer spend my company's money on AdWords.

      Dammy

    4. Re:The real WTF is.. by JackMeyhoff · · Score: 1

      Individuals? You mean "BigFatMamma2002" or "BigBirdDork18m"?

      --
      http://www.rense.com/general79/wdx1.htm
    5. Re:The real WTF is.. by Dunbal · · Score: 4, Funny

      Not only that, but is the history of searches you made over 2 years ago relevant to your current searches performed today?

            Studies have shown that 43% of all people who search for "Donkey Love" will buy our product within 3 years if they see our ads.

      --
      Seven puppies were harmed during the making of this post.
    6. Re:The real WTF is.. by Anonymous Coward · · Score: 0

      It's called a "something interesting" tag. Please tag all your communications that way so we can distinguish them.

      Thank you,
      Googlers

      "In Soviet Russia, evil does you!"

    7. Re:The real WTF is.. by MightyYar · · Score: 1

      The only way to know for sure is to keep records of people's searches for 2 years :)

      --
      W..w..W - Willy Waterloo washes Warren Wiggins who is washing Waldo Woo.
    8. Re:The real WTF is.. by Vexorian · · Score: 1

      There's something called abuse. There are scripts that use Google to search for terms like the name of an exploit-riddled gallery system in order to get a list of vulnerable pages, and there are people who want to modify Google Trends results. There are actually a lot of reasons to abuse web search, so I would have made a logger as well...

      --

      Copyright infringement is "piracy" in the same way DRM is "consumer rape"
    9. Re:The real WTF is.. by paulpach · · Score: 1

      This information is very valuable to an ad provider. Just do a little data mining, and you will find stuff like "people who searched for pregnancy clothes 5 years ago are more likely to click on a children's clothes ad today" and many other not so obvious relationships.

      The only reason google is willing to throw this information away (and money with it) is because customers are concerned about their privacy.

    10. Re:The real WTF is.. by gratemyl · · Score: 1

      They keep it for purposes such as personalized search results and the like.

      A.I. is also far more effective at linguistic analysis (which Google may wish to introduce in the future, if they haven't already) when relations between results are known, and can be mapped to one user.

      The type of things a single user would search for are often limited to certain categories of knowledge and thus a linguistic analysis engine could determine query relations which would improve search results for future users.

      --
      hackerkey://v4sw5/7BCHJMPRUY$hw3ln3pr6/7FOP$ck6ma8+9u6L$w4/7CGUXm0l6DLRi82NCe3+9t5Sb7HMOPRen5a17s0DSr1/2p-3.62/-5.23g3/5
    11. Re:The real WTF is.. by Chris+whatever · · Score: 1

      Not if it helps to catch pedophiles.

    12. Re:The real WTF is.. by NDPTAL85 · · Score: 1

      No, keeping tabs on half (or even all) of the globe is NOT evil. If you don't want anyone to keep tabs on you then you always have the simple and easy option of committing suicide.

      If you don't, then either you don't mind Google keeping tabs on you, or you are a wuss.

      --
      Mac OS X and Windows XP working side by side to fight back the night.
    13. Re:The real WTF is.. by Impy+the+Impiuos+Imp · · Score: 2, Insightful

      People searching for their social security numbers just for the hell of it, or their CC numbers, and presto! Now real numbers exist in some "Google history list" for ever and ever.

      There's a goldmine of data there. "Anonymizing" it doesn't affect this, unless they have filters to try to recognize such and get rid of it.

      Still, if it's in the form of "User X" searched for these 132 terms last month, some terms might identify them and hence link them to other things like their unfortunate search for "donkey love".

      E.g.

      1234 Fake Street (suppose it's your real address)

      +britney +bald +"bald down there"

      What does "bedonk-i-donk" mean?

      fat asses with tiny waists

      --
      (-1: Post disagrees with my already-settled worldview) is not a valid mod option.
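
A rough sketch of the kind of filter mentioned above, one that tries to recognize social security and credit card numbers in queries and get rid of them. The patterns are assumptions (US-style SSNs written 3-2-4, and 13 to 19 digit runs that pass the Luhn checksum), not anything Google is known to use.

```python
import re

# Hypothetical patterns: SSNs written as 3-2-4 digits, and card-like digit runs.
SSN_RE = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")
CARD_RE = re.compile(r"\b(?:\d[ -]?){12,18}\d\b")


def luhn_ok(digits):
    """Return True if the digit string passes the Luhn checksum."""
    total = 0
    for i, ch in enumerate(reversed(digits)):
        d = int(ch)
        if i % 2 == 1:  # double every second digit, counting from the right
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def scrub_query(query):
    """Replace SSN-like and Luhn-valid card-like numbers with placeholders."""
    query = SSN_RE.sub("[SSN]", query)

    def _card(match):
        digits = re.sub(r"\D", "", match.group(0))
        return "[CARD]" if luhn_ok(digits) else match.group(0)

    return CARD_RE.sub(_card, query)
```

A filter like this does nothing for the street address or the other identifying terms in the example history; it only catches numbers with a recognizable shape.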
    14. Re:The real WTF is.. by Impy+the+Impiuos+Imp · · Score: 1

      "Don't think of the children!"

      --
      (-1: Post disagrees with my already-settled worldview) is not a valid mod option.
    15. Re:The real WTF is.. by Anonymous Coward · · Score: 0

      Off the Record (a.k.a. otr) is an encryption tool which can be used for many things. There is a gaim plugin and an aim proxy available at http://www.cypherpunks.ca/otr/.

    16. Re:The real WTF is.. by Peter+Trepan · · Score: 3, Funny

      Studies have shown that 43% of all people who search for "Donkey Love" will buy our product within 3 years if they see our ads.

      ...and that number rises to 98.3% if we mention we found that item in their search history.

      --

      Step into a huge movement. Don't Tread In Me.

    17. Re:The real WTF is.. by D'Sphitz · · Score: 1

      yeah they should stand their ground, after all it's only 1.5b people who would no longer have access to their product(s). who needs them?

    18. Re:The real WTF is.. by amRadioHed · · Score: 1

      Oh sure, that's brilliant. Kill yourself then you end up in Google News where the whole world can keep tabs on you.

      --
      We hope your rules and wisdom choke you / Now we are one in everlasting peace
    19. Re:The real WTF is.. by paulpach · · Score: 1

      "Anonymizing" it doesn't affect this

      Anonymizing this data greatly reduces its value. They can no longer use your queries from 5 years ago to determine what you are likely to buy. For example, they can no longer see that you were the one looking for maternity clothes 5 years ago, so they cannot know you are more likely to buy children's clothes.

      Of course, like you point out, the issue is that of privacy (it always is with data mining), and google offsets the loss by gaining customer confidence.

    20. Re:The real WTF is.. by Anonymous Coward · · Score: 0

      "Who the f*** decided that sentences on the Internet shall no longer be formatted with two spaces after a period?!"

      Standard practice with non-fixed-width fonts. Double spaces were used to make text easier to parse when everyone was using typewriters; now that we have computers we don't need that crutch any more.

    21. Re:The real WTF is.. by Impy+the+Impiuos+Imp · · Score: 1

      Standard practice with non-fixed-width fonts. Double spaces were used to make text easier to parse when everyone was using typewriters; now that we have computers we don't need that crutch any more.


      People keep responding to this similarly.

      Can someone explain how two spaces are needed with Courier, but not with Arial? It seems to be useful in both. I wonder if it's some "conserve space" thing gone amok. I do know stream formatters, which web pages are, should insert "proper spacing". That may not be two full spaces, but it's more than one space, which is what is currently in use.

      I need people to join me in sparking this revolution.
      --
      (-1: Post disagrees with my already-settled worldview) is not a valid mod option.
  2. Uhm by giorgiofr · · Score: 2, Interesting

    All they have to do is erase the logs every day or just not keep them. It doesn't "take an effort". Anonymous proxies have been doing this for years.

    --
    Global warming is a cube.
    1. Re:Uhm by Rakishi · · Score: 4, Insightful

      And anonymous proxies do not need to make money or provide much of a service unlike google, logs are very useful for such things.

    2. Re:Uhm by Whiney+Mac+Fanboy · · Score: 4, Insightful

      All they have to do is erase the logs every day or just not keep them. It doesn't "take an effort". Anonymous proxies have been doing this for years.

      I know where you're coming from, but that would kinda fuck with their targeted advertising business model dontcha think?

      --
      There are shills on slashdot. Apparently, I'm one of them.
    3. Re:Uhm by jacquesm · · Score: 2, Insightful

      it doesn't have to, after all the targeted ads are supposedly targeted to the *content* of the pages and your search query. No need to keep that for two years in order to target it better unless you have other plans with my data (such as selling my 'profile').

    4. Re:Uhm by Anonymous Coward · · Score: 0

      Yep, migrating god-only-knows-how-many thousands of servers onto a different data aggregating model, piece of cake...

    5. Re:Uhm by daeg · · Score: 3, Insightful

      I'm between the two extremes of agreeing with you and agreeing that data needs to be retained. As any of us who have taken a statistics class (or four) can tell you, you don't need access to the whole sample to provide accurate data. So, say, for instance, the Google engineers were working on a specific niche of the web, say, dog lovers. If I were designing something to better suit dog lovers, my first step would be pulling a report on the common search patterns of people that search for dog-related topics.

      Historical data that identifies a unique user is extremely useful. I do the same thing with our Intranet search and report tools. If I want to improve something, oftentimes the logs will give a very telling tale. (This accounting department employee searched for "expense", then "expense excel", then "expense spreadsheet", then "expense log", finally getting his document. I can then add the keywords 'excel' 'spreadsheet' to the actual document entry.) That said, you don't actually need to know who the unique user is, for all intents and research purposes, User5486734067 is just as useful as an IP+Cookie.
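
The point above, that an opaque label such as User5486734067 is as useful for research as an IP+cookie pair, can be sketched with a keyed hash. This is only an illustration: the `pseudonym` function, the key, and the label format are all hypothetical.

```python
import hashlib
import hmac

# Hypothetical secret; it would have to be stored separately from the logs.
SECRET_KEY = b"replace-with-a-real-secret"


def pseudonym(ip, cookie_id, key=SECRET_KEY):
    """Map an (IP, cookie) pair to a stable opaque label."""
    message = f"{ip}|{cookie_id}".encode("utf-8")
    digest = hmac.new(key, message, hashlib.sha256).hexdigest()
    # The same input always yields the same label, so searches can still be chained.
    return "User" + digest[:12]
```

The label hides the IP and cookie, but because it is stable, a long enough chain of queries can still point back to a person.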

    6. Re:Uhm by Anonymous Coward · · Score: 0

      Let me assure you, my "dog lover" searches were just a phase which in no way reflects my relationship with the animal kingdom. DELETE THE LOGS NOW!

    7. Re:Uhm by jacquesm · · Score: 1

      Even for the example you give I would not need to know *who* made those searches.

      There are two good reasons to keep the data, as far as I can see, the first is to avoid sending
      the same ad to someone twice (but for that you only need a history of what ads they've seen, not
      what they have searched for, though of course that does help to tag a user as a 'programmer' or
      an 'accountant'), the second is when you go in to the massive selling of profiles business.

      There are some companies that do this (Schober comes to mind, there is an 'umlaut' over the
      o but I have no idea how to put it there...), and if google would ever decide to augment
      their revenues like this to make the next 3 month target then we are all going to be in for
      a lot of trouble from DM people.

      I think I'll hack a network sniffer to record my own searches for a few weeks and see what
      kind of profile you could build up from there, I'm actually pretty curious about that.

    8. Re:Uhm by daeg · · Score: 1

      I never said they need the "who", just the unique ID to chain the searches together.

      From my experience with AdSense, Google doesn't give direct access to any of the information. In fact, it makes sense for them to strongly protect their profiles. If they sell them, they lose control over them. Sure, they can retain legal control, but once they're out, they're out. Google isn't dumb, they'd rather make $1 for every profile access versus $100 up front, as the $1s will add up over time (not actual dollars, just an example). They make money by granting access through AdSense to more strongly target ads.

      And I can't find the link, but the search profile study has been done, although I'm not sure about how accurate it is. I bet with any sizable search history for a given unique user, a good portion of them could be traced back to an individual, particularly with programmers and system administrators. Why? Because we search for very specific things. You could, for instance, see "/etc/passwd cygwin permission denied mkpasswd xp" in my search history from today. Seeing that I searched a dozen permutations of that afterward, you could theorize I had an issue I wasn't finding, and could then connect that with various mailing lists and Google Groups and search for the same terms (or various IRC log aggregations) and find my e-mail address and, likely, my name, and maybe a signature with my name and company name, along with e-mail headers yielding an IP address.

    9. Re:Uhm by jacquesm · · Score: 1

      true enough, apologies.

      As for the search profile study that was AOL's blunder, and after examining
      the data that AOL provided in some detail (several weeks worth of work) I am
      absolutely amazed at how privacy invasive this stuff is.

      That is why I'm eagerly awaiting a competitor to the big G that has a really
      strong privacy statement.

      If the quality is anywhere near comparable I'll switch in a heartbeat. But I
      do not doubt that I'd be one of very few people to do so. Not because I have
      something to hide, just because I have seen what you could do with that data.

    10. Re:Uhm by Rakishi · · Score: 1

      Like the other poster said, depending on what a data analyst is doing such information could be useful. IPs, I may add, could be useful on their own to provide geo-location and ISP information, for example.

      For example, it is likely that google alters its search page with different setups to test various things, in which case your long term reaction to such different ad methods could be useful. Likewise seasonal trends require long term data to find. There is a big difference between using data in production and using data for exploratory analysis; the latter could very well use a lot of extra data just to check possible problems with the method.

      Also the ad system is more complex than just showing ads to users, user search (and ad view) data could be used to help in whatever model is used for the bidding and ad ranking systems. Information on which ads were shown, how they were shown, who clicked on them and so on can be very useful even at long time periods.

      Remember that this information once deleted cannot be restored and the possibility that someone may need it in the future for some yet unknown reason likely far outweighs the storage costs. You don't want to be told "well if I had 2 years of data I could get another 2% revenue out of this method but I guess I can't so that's a few million down the toilet."

    11. Re:Uhm by Rakishi · · Score: 1

      That is why I'm eagerly awaiting a competitor to the big G that has a really
      strong privacy statement.


      None ever will most likely, not enough people care and G will simply kick their ass due to having better data to model things with. Search isn't exactly an easy field to break into right now; there is a possibility for more or less niche search engines but not at a google/yahoo/msn level. Sure someone could make a brilliant new algorithm but then it's very unlikely that they'd also have a strong privacy policy, it just doesn't make much business sense.

    12. Re:Uhm by Anonymous Coward · · Score: 0
      The problem with the 'erase the logs' idea is that the feds will just mandate that services keep the logs. Google has already tipped its hand by admitting that they have the resources to store them for as long as 24 months.


      How about designing a protocol that is stateless WRT past searches? Or at least linked to a cookie set to expire upon restarting ones browser? I'm not certain how valuable past search terms are to an advertizer anyway. Odds are that if I was looking for something last week, I've either found it by now or am no longer interested. Then Google can just say, "Logs, what logs?"

  3. right.... by Anonymous Coward · · Score: 0

    And how are they going to comply to the EU regulations, which stipulate a much longer retention-time?

    1. Re:right.... by Anonymous Coward · · Score: 0

      Why would Google have to comply with EU regulations? :?

    2. Re:right.... by ag0ny · · Score: 4, Insightful

      Why would Google have to comply with EU regulations? :?

      Maybe because they do business in Europe?

    3. Re:right.... by skrolle2 · · Score: 5, Informative

      http://eur-lex.europa.eu/LexUriServ/LexUriServ.do?uri=CELEX:32006L0024:EN:NOT

      The data retention directive only applies to ISPs, and only deals with who you "communicate" with. It does not explicitly say that a record of which websites you visit should be retained, and it explicitly says that the content of the communication must not be retained.

      However, as for all EU directives, it only contains the baseline of regulation. Directives are never law themselves, but have to be implemented in each respective member state by each respective legislative body. These, in turn, are free to implement whatever they want ABOVE the baseline, so some member states may have longer retention periods for this data, some member states may require ISPs to retain additional data.

      The deadline for this directive is September this year, but if you read it, a few member states have reserved the option to postpone parts of the directive, typically the internet-related traffic. This basically means that they recognize the difficulties in implementing it, and want more time to think about how to do it, or possibly obstruct it.

      What all of this boils down to is that maybe, sometime in the future, if you have a European ISP, they may be required to store all the URLs that you access. Google search data is transmitted as querystring parameters that are part of the URL, which means that your search data may be stored by your ISP, in a non-anonymized way. There's nothing in this possible future that Google has to comply with, as long as they are not a European ISP.
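
To illustrate why a stored URL gives away the search: the terms sit in the querystring and can be read back with the standard library. A small sketch, assuming the search terms travel in a parameter named `q`; the `search_terms` helper is made up for this example.

```python
from typing import Optional
from urllib.parse import parse_qs, urlsplit


def search_terms(url: str, param: str = "q") -> Optional[str]:
    """Return the decoded search query carried in a URL's querystring, if any."""
    query = parse_qs(urlsplit(url).query)
    values = query.get(param)
    return values[0] if values else None
```

Anyone holding a log of full URLs, an ISP included, could run something like this over it.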

    4. Re:right.... by ObsessiveMathsFreak · · Score: 1

      Lobbyists

      --
      May the Maths Be with you!
    5. Re:right.... by Anonymous Coward · · Score: 0

      Gmail, an end-user e-mail service, and Google Talk (an end-user VoIP service, at least in part) are certainly within the common definitions of Internet services and fall directly within the remit of incoming and existing data retention directives.

    6. Re:right.... by Anonymous Coward · · Score: 0

      It still doesn't apply to them.

    7. Re:right.... by mikkelm · · Score: 1

      EU regulations apply to any business a company does within the EU, no matter where the company is registered. That means Google, too. There'd be little point to these regulations otherwise.

      Go away.

    8. Re:right.... by skrolle2 · · Score: 1
      You are right and I am wrong. :-)

      Most of the other requirements are at least in some sort of feasible realm, they deal with which DSL modem at what address had what IP at what time, and which cellphone called which other. It's intrusive and bad, but at least tied to hardware and physical location. However, I missed this part:

      (2) concerning Internet e-mail and Internet telephony:
      (i) the user ID or telephone number of the intended recipient(s) of an Internet telephony call;
      (ii) the name(s) and address(es) of the subscriber(s) or registered user(s) and user ID of the intended recipient of the communication;

      Given a wide interpretation of "internet e-mail", say that we include forum posts, that would require everyone who has a PhpBB or equivalent to store the headers of all posts the required time. This is insane. Also, say that we include Skype in "internet telephony call", just who is supposed to store the call details if it's done peer-to-peer? If only the peers know that a call has taken place, why would they store it somewhere if they don't want to?

      It's pretty clear that this directive is written by people who think of the internet in the same terms as plain old telephony, which is highly centralized and under the control of a few corporate entities with business interests in the member states. Luckily, we're moving away from that model faster and faster.
  4. Finally we search for those pics of Britneys vagina without fear of harming our permanent google record.

    1. Re:Yes! by MrClownLovesYourMom · · Score: 0, Offtopic

      All we need now, is to be able to edit our typo's on ./

  5. Mine already is by solevita · · Score: 2, Informative

    Although I did have to install the AnonymizeGoogle Firefox plugin to get it.

    1. Re:Mine already is by solevita · · Score: 5, Informative

      Ignore that post above - I'm a moron. I meant to say CustomizeGoogle Firefox plugin. Get it here.

      I guess that's what happens when you Slashdot before caffeine. I'm sorry.

    2. Re:Mine already is by Anonymous Coward · · Score: 0

      Wow .. so you can customize Firefox to interact with Google without an IP?

      Impressive, most impressive. In a related development, I heard that if you turn off "Send Caller ID" on your cellphone then you can make prank bomb threats to the White House! They'll NEVER find you!

    3. Re:Mine already is by solevita · · Score: 3, Informative

      Your IP usually isn't the problem, especially in my case where my ISP sends it all through their regional proxy anyway. What CustomizeGoogle does is randomize your Google UID. Take another look at the recent AOL breach - people weren't suffering privacy loss due to their IP address, but rather because AOL gave each and every user a number that could be tracked through the system. Thanks to CustomizeGoogle, that won't happen to me and my searches.
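
A sketch of the idea of randomizing a tracking ID, not the plugin's actual code. It assumes a cookie value of the form `ID=<hex>:TM=...:LM=...`; that layout and the `randomize_uid` name are assumptions for illustration.

```python
import re
import secrets

# Assumed layout: an "ID=<hex digits>" field somewhere in the cookie value.
ID_FIELD = re.compile(r"(?<![A-Za-z])ID=[0-9a-fA-F]+")


def randomize_uid(cookie_value):
    """Swap the ID field for 16 fresh random hex digits, leaving other fields alone."""
    new_id = secrets.token_hex(8)  # 8 random bytes -> 16 hex characters
    return ID_FIELD.sub("ID=" + new_id, cookie_value, count=1)
```

If the ID changes on every request, the server-side log no longer has a single number tying a user's searches together, although the IP address is still visible.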

    4. Re:Mine already is by number11 · · Score: 1

      I meant to say CustomizeGoogle Firefox plugin

      That helps.

      Of course, if you want to shorten log retention further than Google's "only 2 years!", you can go through a proxy like Anonymizer or Tor. If the fullbore proxies are too much of a hassle, there's always the search proxies like Scroogle Scraper (where the log retention is 48 hours).

      Another approach is to poison the data mine with TrackMeNot by generating thousands of random searches in the background.
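
The data-poisoning approach can be sketched as mixing real queries with random decoys. This is a toy illustration only: it sends nothing over the network, and the `DECOY_TERMS` list and `with_decoys` function are invented here rather than taken from TrackMeNot.

```python
import random

# Hypothetical decoy vocabulary; a real tool would need a far larger, changing pool.
DECOY_TERMS = ["weather", "recipes", "football scores", "cheap flights",
               "movie times", "used cars", "python tutorial", "garden tools"]


def with_decoys(real_queries, decoys_per_real=3, rng=None):
    """Return the real queries mixed, in shuffled order, with random decoy queries."""
    rng = rng or random.Random()
    mixed = list(real_queries)
    for _ in range(decoys_per_real * len(real_queries)):
        mixed.append(rng.choice(DECOY_TERMS))
    rng.shuffle(mixed)
    return mixed
```

Whether decoys like these hold up against someone filtering for unusual or highly specific queries is an open question.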

  6. How about by squoozer · · Score: 1

    anonymizing it straight away! That would be an even quicker solution to the problem.

    --
    I used to have a better sig but it broke.
  7. 0 months? by pr0nbot · · Score: 1

    Why not anonymise the data after zero months? Are they required by law not to?

    1. Re:0 months? by Barny · · Score: 1

      In some countries, yes, they are required to.

      --
      ...
      /me sighs
    2. Re:0 months? by cdrudge · · Score: 4, Insightful

      My guess is they don't do it immediately because there is internal business value in mining the data. User patterns, length of stay, etc. After 18 or 24 months, the internal value has dropped significantly as things change quickly. I would have thought that the value would have dropped even quicker than that, say after 6 months or maybe a year.

    3. Re:0 months? by Rakishi · · Score: 1

      Even if they weren't legally required it makes more business sense to keep as much data as possible as you never know when someone will need it for some project.

    4. Re:0 months? by Anonymous Coward · · Score: 0

      You also never know when someone will need it to sue you...

    5. Re:0 months? by steelfood · · Score: 1

      This isn't quite true yet. Most people who use the internet are not very savvy when it comes to protecting their privacy. With everyone having gmail accounts, they can effectively trace a person's search habits over years. Especially if people use the same computer, or log in to gmail before every session. So no, the data doesn't become less useful for most users. On the contrary, it becomes more, and utility is only going to increase as google releases more and more services.

      An example off the top of my head (so forgive me if I seem like I'm rambling), you can map shifts in popularity over time by location. So if a cultural movement happens in New York, you can, using search queries, watch it move across the country, and possibly see it move over oceans. And the interesting thing is, you can see where different types of cultural shifts get transmitted first and how quickly they're adopted relative to their initial introduction. With identifiable information, you can identify the user or set of users who have the most clout in certain areas. Certain ads for those people will be different (and perhaps more expensive) from the same type of ads for people who are merely followers of a cultural movement. And, over an extended period of time, you can actually see the rise and fall of that particular user's clout on a particular subject if the difference is substantial enough.

      Now, imagine if the US government got a hold of this. They can pinpoint people who are influential in certain areas (like political movements) and target them should they need to. Communism getting popular again? Violent revolution? Find out who the major influencers are (sometimes, they're not necessarily the leaders) and take 'em out. Don't even have to assassinate these people now. Call 'em a terrorist and lock 'em up indefinitely.

      One thing I think would really boost browse/search anonymity and something I really would like to see on browsers is the ability to tie session cookies to individual tabs and/or windows only. That and a MAC address randomizer. But the latter is probably far more difficult than the former.

      --
      "If a nation expects to be ignorant and free in a state of civilization, it expects what never was and never will be."
    6. Re:0 months? by Anonymous Coward · · Score: 0
      Why not anonymise the data after zero months? Are they required by law not to?

      If /. had linked a real FA instead of Google's OWN BLOG you would know the answer. From a FA from a real newspaper:

      And in Europe, the law mandates that such records be preserved. Last year, the European Union ordered phone and Internet companies to retain traffic data tied to individual computer addresses for six to 24 months to help police investigate crimes. The EU left the exact time frame to each member nation to decide.
      See also Google to Adopt New Privacy Measures By Michael Liedtke

    7. Re:0 months? by ACMENEWSLLC · · Score: 1

      The problem is that any time they get a subpoena for a case, they have to make a backup of any relevant data and keep it. So even if they anonymize your data after 20 months, if there is a pending case your data could be kept elsewhere for a very long time.

      What you need to do is buy my handy dandy white noise generator. It creates random searches on Google at a variable rate per second. It injects searches from a database of 40 million stored searches. This buries your real searches in a lot of fake ones, making it a needle in a haystack.

      This can be yours for only.... wait, what is that? Oh no, it's the guys in black suits telling me asd=389

  8. I for one... by Anonymous Coward · · Score: 0

    I, for one, will be very glad that they won't be able to pin my searches for "Goldfish porn" and "Kinky sofa covers" back to my IP.
    Signed,
    John Jacob Smith
    123 Brookfield Lane
    Towarg, South Carolina

    1. Re:I for one... by Anonymous Coward · · Score: 0

      I agree.

      -Bill Gates
      One Microsoft Way

    2. Re:I for one... by Dunbal · · Score: 1

      "Goldfish porn" and "Kinky sofa covers"

            Funny you mention that, I was searching just the other day for "sofa porn" and "kinky Goldfish covers"...

      --
      Seven puppies were harmed during the making of this post.
  9. Shouldn't be collecting that info anyway by Anonymous Coward · · Score: 2, Informative

    Google should not be collecting any of that huge pile of information AT ALL, not just anonymising it after 18 months. As the AOL case showed, search queries can be used to identify individuals even after AOL anonymized them, so it's not IP addresses they are recording, it's PEOPLE.

    There is no need to collect the IP addresses of searchers that haven't opted in to Google's personalized search. There is no law that requires it.

    There is no need to store the IP addresses of individual visitors to websites when Google analytics is used on a web page.

    There is no need to store IP addresses of pages delivered to adsense viewers. Clicks maybe for a short time to prevent click fraud, but viewers, no.

    None of this information should be recorded, and further the EU privacy directive should be enforced to ensure that none of that information is recorded. The law says we have privacy, Google should be forced to comply with that law.

    1. Re:Shouldn't be collecting that info anyway by GweeDo · · Score: 2, Insightful

      There is no need? What about the monetary need? Google doesn't really care who you are, but they do care about what you are looking for. The more they know about what you are looking for the better their AdSense program can do. The better it does, the more money they make.

      As for your whole "we have privacy" bit, sure you do. In your own home while using your stuff. The moment you send your request out over the internet in plain text to a third party (that is a corporation out to make money you know) you lose that.

    2. Re:Shouldn't be collecting that info anyway by rtb61 · · Score: 1
      Technically speaking you as the search end user can make better use of personalised search history and refinement of results. Everybody tends to use search phrases and search styles in a different manner, especially in relation to the experience level of the user.

      Searching will only get more and more complex as time progresses and things like automatic language translations finally start to appear. Privacy on one hand or the search engine adapting to your search style, not really as clear cut a choice as it first appears, especially for the novice searcher (Boolean WTF is Boolean, a skinny ghost?).

      Targeted marketing is a bit of a fudge. I generally buy a new PC every two years; marketing a PC at me in between will not induce me to go out and buy a PC. Adsense is more about marketing to the people who want to pay for advertising, it makes them feel better about the money they are spending.

      The real focus is on making sure the end user is happy with the search results, so they wont go else where and will specifically seek to use Google search.

      --
      Chaos - everything, everywhere, everywhen
    3. Re:Shouldn't be collecting that info anyway by kalirion · · Score: 1

      Google should not be collecting any of that huge pile of information AT ALL, not just anonymising it after 18 months. As the AOL case showed, search queries can be used to identify individuals even after AOL anonymized them, so it's not IP addresses they are recording, it's PEOPLE.

      AOL did not anonymize correctly. True anonymization would not have queries linked by "userid". Giving you 100 queries and saying "these 10 were made by one user, these 7 by another, etc." is far different from just giving you 100 queries and saying "these were made by anywhere between 1 and 100 users, inclusively."
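
      The difference between the two kinds of release can be sketched in a few lines. This is only an illustration: the log rows, IDs and function names below are made up, and it says nothing about what AOL or Google actually ran.

```python
import random

# Hypothetical log rows: (user_id, query). IDs and queries are invented.
LOG = [
    (101, "cheap flights"),
    (101, "hotel deals"),
    (202, "python tutorial"),
    (101, "car rental"),
    (202, "regex cheat sheet"),
]

def pseudonymize(rows):
    """Swap each real ID for an opaque token.

    Queries from one person still share a token, so they stay linkable.
    """
    tokens = {}
    released = []
    for user_id, query in rows:
        if user_id not in tokens:
            tokens[user_id] = len(tokens) + 1
        released.append((tokens[user_id], query))
    return released

def unlink(rows, seed=0):
    """Drop the ID entirely and shuffle, so no per-user grouping survives."""
    queries = [query for _, query in rows]
    random.Random(seed).shuffle(queries)
    return queries

linked = pseudonymize(LOG)   # still says "these three came from one user"
unlinked = unlink(LOG)       # just a bag of five queries
```

      With the first release you can still read one person's queries side by side, which is what makes re-identification possible. With the second you only know the queries came from somewhere between 1 and 5 users.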

    4. Re:Shouldn't be collecting that info anyway by mysticgoat · · Score: 1

      Google could not exist without collecting this information. This data is central to its business model, and key to its differentiation from other search engines. Its history of growth (of individuals choosing to use Google over similar products) validates this approach and also demonstrates that the methodology is generally accepted. The great majority of web users see nothing wrong with the method even though concerns about it are getting a fair amount of publicity.

  10. According to TFA by ReallyEvilCanine · · Score: 4, Insightful

    Google plan to make it "more anonymous". Like pregnancy, data either ARE anonymous or they ain't. You can't qualify an absolute, and "anonymous" is an absolute condition indicating lack of information.

    1. Re:According to TFA by DrEldarion · · Score: 1

      we will anonymize our server logs

      so that it can no longer be identified with individual users

      Sounds anonymous to me.
    2. Re:According to TFA by Cytlid · · Score: 1

      So you're saying "Data are either impregnated with anonymity or they ain't?"

      I need another cup of coffee.

      --
      FLR
    3. Re:According to TFA by catbutt · · Score: 1

      Anonymous is not an absolute. That is ridiculous. Like almost everything in the world, there are many shades of gray.

      In this case, it can be determined that a search was within a group of 256 people, but they can't tell which one. What if they just stored the country of the user? Same thing, just larger group. More anonymous.

      There are all kinds of degrees of anonymity. I'm not advocating any side to the issue, but if you are going to look at the issue intelligently, seeing it in simplistic black and white terms doesn't help.
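
      The "group of 256" idea can be sketched as masking the low bits of an IPv4 address. This assumes the figure comes from dropping the last octet, which is a guess about the mechanism rather than anything Google has published; the function names are invented for the example.

```python
import ipaddress

def coarsen(ip, keep_bits=24):
    """Zero out the low (32 - keep_bits) bits of an IPv4 address."""
    network = ipaddress.ip_network(f"{ip}/{keep_bits}", strict=False)
    return str(network.network_address)

def crowd_size(keep_bits):
    """Number of distinct IPv4 addresses that collapse to one stored value."""
    return 2 ** (32 - keep_bits)

masked_octet = coarsen("192.0.2.77")        # keeps the first three octets
masked_more = coarsen("192.0.2.77", 16)     # keeps only the first two
```

      Keeping fewer bits means a bigger crowd to hide in, so anonymity here is a dial rather than a switch. Note that 256 addresses is not the same as 256 people, since addresses get shared and reassigned.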

  11. It's there servers by tomstdenis · · Score: 1, Troll

    Stop googling for "jihad death to american president" if you're worried about getting caught.

    I should point out that your google query goes over plaintext HTTP so anyone in between can eavesdrop on your queries.

    Tom

    --
    Someday, I'll have a real sig.
    1. Re:It's there servers by solevita · · Score: 5, Insightful

      Stop googling for "jihad death to american president" if you're worried about getting caught.
      You're correct. The only people that demand privacy are those up to no good. How about I come over to your house later, sit in your bed for a bit, go through your drawers and your phone records, take some pictures of you and your friends, ask the neighbours some pressing questions?

      If you've got nothing to hide, you should have no problem with this.
    2. Re:It's there servers by tomstdenis · · Score: 1

      Ah, the out of context argument. My house is private by the definition that I have locks on the doors and blinds on the windows. Your analogy may make sense if, say, a public walkway passed through my living room.

      I'm not saying people shouldn't have privacy, I'm saying if you export your secrets outside of your domain, you shouldn't expect privacy.

      You don't do your personal finances on a city bus do you?

      --
      Someday, I'll have a real sig.
    3. Re:It's there servers by garcia · · Score: 2, Insightful

      Stop googling for "jihad death to american president" if you're worried about getting caught.

      Excuse me?! I live in America and if I want to research the results of the search terms "jihad death to american president" I'm well within my fucking rights.

      Fuck you for saying otherwise.

    4. Re:It's there servers by tomstdenis · · Score: 2, Interesting

      Well you're describing a law enforcement problem not a privacy issue.

      Google is within their rights to gather as much information as you feed them (your ip, time of day, host strings, query string, etc).

      My point was if you were planning on committing crimes, you shouldn't use google to find tips.

      Tom

      --
      Someday, I'll have a real sig.
    5. Re:It's there servers by Dunbal · · Score: 2, Funny

      If you've got nothing to hide, you should have no problem with this.

            Yeah while we're there we can install the webcam in his bathroom and broadcast on the net every time he takes a crap. I have a pair of guys willing to do the commentary on wiping techniques to add to the video...

      --
      Seven puppies were harmed during the making of this post.
    6. Re:It's there servers by Dunbal · · Score: 4, Interesting

      Ah, the out of context argument. My house is private by the definition that I have locks on the doors and blinds on the windows.

            Funny - my computer is in my house, behind locks and blinds too. Hey, Google's computers are also behind lock and key, and they even have security guards and alarm systems. I don't ever remember giving Google permission to disclose any information shared between them and me - oh and heaven forbid I go around giving away the information Google found for me - I'd get sued!

            Why would the whole world automatically be party to the information Google and I shared one evening? My computer sent that information to a specific internet address, and the answer came back specifically to my computer.

            Not so out of context...

      --
      Seven puppies were harmed during the making of this post.
    7. Re:It's there servers by solevita · · Score: 1

      Google is within their rights to gather as much information as you feed them (your ip, time of day, host strings, query string, etc).
      I see the problem now; you clearly don't understand the extent of Google's monitoring. They're not logging just IP addresses, they're logging people. The AOL data that came out showed how you could follow tracking cookies to see exactly what people, not IP addresses, were searching for.

      I don't see why you have such a problem with it anyway. Many people around the world asked for greater privacy, Google gave it to them and you got your panties in a twist. Why is that?
    8. Re:It's there servers by Anonymous Coward · · Score: 0

      Why would the whole world automatically be party to the information Google and I shared one evening?

      You should have noticed that it isn't. The topic is that Google will stop remembering that it was you who searched for monkey porn (or stock quotes - I can only guess) two years ago. The situation is like this:

      Dunbal: Can you keep a secret?
      Google: No, according to my privacy statement, I will rat out on you if the cops ask me. Also, if I'm drunk and blab it out, you have no recourse.
      Dunbal: Ok, here's the secret.
      Google: Thanks. By the way, in two years I will forget that you told me this.

    9. Re:It's there servers by tomstdenis · · Score: 3, Insightful

      This is why it pays to have a modicum of computer knowledge.

      Assuming you're not trolling...

      When you send a query to google, it goes over the "internet" in the clear. That is, not encrypted. Anyone who can see it can read it. Well, who can read it? Turns out a lot of people. Between me and google are probably 10 different boxes, 5 of which are just my ISP's routers. The other five are boxes on other networks, not even related to Google.

      There is no inherent requirement for privacy like there is with telephones (maybe there ought to be one). But that said, you're giving your data to Google, willingly no less. That gives them every right to record it. You gave them permission by using their service; I guess you never read their TOS, which is your fault, not theirs. Think about the analogy in the real world. This is like you handing your driver's license to every stranger you meet, then getting upset when some of them write it down.

      If you don't want your assets [IP, location, name, platform, etc] leaked to Google you should use an anonymous proxy.

      Tom

      --
      Someday, I'll have a real sig.
    10. Re:It's there servers by tomstdenis · · Score: 2, Insightful

      I'm not against google cleaning their logs. I'm against people claiming this is a privacy issue.

      Google logging all your queries: Not a privacy problem.

      Bank leaking your SSN via stolen laptop: Privacy problem.

      AOL knowing that you like midget porn: Not a privacy problem.

      Government using sub-standard contractor to manage passport data, later turns up on broken into computer: Privacy problem.

      By screaming wolf every time "data" is mentioned you desensitize people to real privacy problems.

      --
      Someday, I'll have a real sig.
    11. Re:It's there servers by kjart · · Score: 1

      Stop googling for "jihad death to american president" if you're worried about getting caught.

      When you use language like "caught" you are obviously not referring to Google, but rather to some external agency (i.e. the government). You are changing the parties involved to strengthen your argument.

    12. Re:It's there servers by Anonymous Coward · · Score: 0

      I don't understand people's idea that information that I pay to receive is anyone else's business. If I want to look up plastic explosives, I can do it. If I want to look up anti-American punk rock, I can do it. Unless I do hurt someone, I'm not a criminal.

      Here's an idea, maybe the government should stop pissing people off, that way they wouldn't have to worry.

      Yeah yeah, freedom hater, terrorist, whine bitch moan. Sorry I don't do search queries for the same boring shit that you do. Waiit, no I'm not. Take your snooping and wiretapping laws and suck them like a club-hopping slut on Saturday night, I'll do whatever I like with my time.

    13. Re:It's there servers by tomstdenis · · Score: 1

      Whoa, step off. I'm not saying you should be denied searching that. All I'm saying is don't think it's private. So if you are worried about your privacy, don't use public search engines.

      People seem to infer that I mean to say you should only search for telescreen approved subjects. Hell no. Just don't expect privacy when you're using someone else's server, over the Internet IN CLEARTEXT.

      Tom

      --
      Someday, I'll have a real sig.
    14. Re:It's there servers by everphilski · · Score: 1

      Funny - my computer is in my house, behind locks and blinds too.

      But your search queries leave the house, unencrypted, with no guarantee of protection and travel to Google. That's where the analogy has fault.

    15. Re:It's there servers by QCompson · · Score: 1

      I'm not saying people shouldn't have privacy, I'm saying if you export your secrets outside of your domain, you shouldn't expect privacy.

      Although really, there is a good argument to be made that people have the expectation of privacy when they use the internet from their own homes, even if it is not technically feasible.

      To use the house analogy, I assume you don't keep your blinds down on your windows 24/7. Wouldn't it feel wrong if someone were using a telescopic lens from 200 feet away and watching your every move through your windows? What if you were writing a private letter on an airplane, and when you got up to use the lavatory, the stranger sitting next to you unfolded the letter and started reading it? Or if you fell asleep on a bus and someone started going through your cell-phone data. None of these actions are technically illegal, but I assume they would upset you, just as people get upset when you suggest that all their internet movements are being snooped on.

      I'm not completely disagreeing with you, but I just wanted to point out that just because something can be easily eavesdropped on or monitored doesn't mean that we should abandon all privacy rights because of it. People expect privacy in their communications via snail mail and telephone calls (and we have laws to protect this), why should email or web searches be any different?

      Personally, I think the old postcard vs. sealed letter analogy is a good one for the internet, but I wish more connections were encrypted by default, and stronger privacy laws were enacted to cover everyday internet usage.
    16. Re:It's there servers by tomstdenis · · Score: 1

      I should point out it's legal to be naked in your home, but not in front of a window where others can see.

      There is a certain question about whether you can use information eavesdropped off the internet in legal proceedings. But that's a question of law, not privacy. If you're worried about privacy, you must keep your secrets to yourself.

      And frankly, you don't have a contract with Google to not log your searches. Add to that you're doing it over http and it's hard to argue anything else.

      I could see it if you used google via https and had an agreement that your searches wouldn't be logged. Then you could argue you deserve privacy. But that's not what you are doing.

      Tom

      --
      Someday, I'll have a real sig.
    17. Re:It's there servers by QCompson · · Score: 1

      But that's a question of law, not privacy.
      Which is why there should be a law to protect privacy on the internet. Law and privacy are not mutually exclusive.

      This isn't simply a matter of reading TOS's. I don't see why we should have to wait for a corporation to offer it to us before arguing we deserve privacy. Again, there is an expectation of privacy for telephone and U.S. mail communication, so why should we throw up our hands and abandon all hope of privacy for the internet?
    18. Re:It's there servers by Anonymous Coward · · Score: 0

      I understand completely, but how would I go about this, then? What exactly is a NON-public search engine?

      Why should information going from me to the source and back again be anyone else's business? That's my real question. Why CAN'T it be private?

    19. Re:It's there servers by tomstdenis · · Score: 1

      arrg..

      Ok let me explain this to you.

      Even over the phone, you have no privacy, even though it's illegal to wiretap without a warrant. There is a difference between privacy and "inadmissible in a court of law."

      Imagine you were a spy, and you wanted to communicate with your handler. Would you talk plainly and openly over the phone because wiretapping without a warrant is illegal? No. You'd encrypt the message [codewords, etc.]

      And while yes, I think the government should require warrants before wiretapping your net connection, I don't see that your queries with Google are specifically private. If google, a party to the communication, decides to divulge the nature of the data, that's their business. More so, I don't think google is leaking the data, I think they use it internally to target the ads better.

      Point is, if you don't want people knowing your secrets, don't broadcast them for all to see.

      Tom

      --
      Someday, I'll have a real sig.
    20. Re:It's there servers by o'reor · · Score: 1

      If google, a party to the communication, decides to divulge the nature of the data, that's their business.
      Right. So, some day, you go to see your doctor, and he finds you terribly ill. You know the disease will evolve into a really crippling illness, and your health insurance is just about to be renewed. Question : do you mind if your doctor, "as a party to the communication" you just had with him, "decides to divulge the nature" of the disease to your insurer ? Is that "their business" and theirs only ?


      It is only reasonable to expect some degree of privacy between your service provider (Google) and you as a client.

      Point is, if you don't want people knowing your secrets, don't broadcast them for all to see.
      When you are querying Google, you don't expect your queries to be "broadcast" on public display. If you did expect such a thing, why not set up a home page with your real name on it, and a "My dirty secrets" column, listing all the queries you've submitted to Google since you first used it ? Now that is broadcasting secrets, overtly and knowingly. Google is not expected to do that.
      --
      In Soviet Russia, our new overlords are belong to all your base.
    21. Re:It's there servers by cparker15 · · Score: 1

      Your comparison to telephones (POTS) regarding privacy is misdirected. You have no privacy when communicating by telephone. In fact, you probably have even less privacy via telephone than you do via Internet.

      When you place a telephone call, your call doesn't go directly to the receiving party. It passes through at least one third party--your telephone service provider. If you're calling next door, chances are, your neighbor has the same telephone service provider you do (again, not considering VoIP phone services). You dial the number on your phone, but the call is actually placed in your local telephone exchange's automated switchboards (routers). These switchboards are entirely digital. This presents an easily exploitable opportunity for eavesdropping and recording. Sure, the phone company might tell you they ain't been droppin' no eaves, but they're the Phone Company. They can. And without a trace of it left behind.

      If you're calling across the country or across the globe, there are probably going to be two or three other phone companies involved, with satellite links bridging gaps between continents. Your call can be (and most likely is) intercepted and recorded at any point along this path.

      Not to mention the fact that someone could really just strip some insulation off of your phone line(s) going into your house and hook up a handset to the exposed wires. One could even go down to their local Radio Shack, pay $5 for some QuickPort connectors (basically miniature punch blocks with built-in RJ11 jacks), and use one of those with a regular residential phone instead.

      See also http://en.wikipedia.org/wiki/PSTN for more info on how the POTS phone system works and http://en.wikipedia.org/wiki/Telephone_tapping for more info on just how easy it is to listen in on people's "private" phone conversations.

      --
      Have you driven a fnord... lately?

    22. Re:It's there servers by Anonymous Coward · · Score: 0

      Legally you're absolutely right. Everyone can write their TOS as they like as long as it's not outside the law.

      However, in real-world transactions, different companies have different information about you. For example your bank/credit card company knows something about your finances (interestingly, kept private by law in these parts), your health insurance knows something else (also private) and your favorite shop knows still something else.

      On the internet, the big search/ad/social network providers are the ones that know by far the most about any one customer. If you're online a lot, they basically have a history of your hobbies, some business info, some political info, how tech-savvy you are etc etc. That shows that Google et al. are pretty special in that sense.

      The ISPs potentially have still more information, since every request goes through their pipes, but they have no obvious ways to make money out of that, since they are not in the target-ad business or something similar. Also they might be bound by some laws (I have no info on that).

      Bottom line is: big info collectors must be kept an eye on continuously.

    23. Re:It's there servers by tomstdenis · · Score: 1

      You go to Google for medical treatment?

      Your argument makes no sense, for what you are talking about is doctor-patient confidentiality. As far as I know, there is no such thing as Google-searchee confidentiality.

      Look, it's this simple. If you transmit your queries, host strings and other info, over plaintext, to a private server, with whom you have no contract, don't assume that the information you transmit is not being seen by other people's eyes.

      Tom

      --
      Someday, I'll have a real sig.
    24. Re:It's there servers by Crizp · · Score: 1

      But your search queries leave the house, unencrypted, with no guarantee of protection and travel to Google. That's where the analogy has fault.


      It's like sending letters without envelopes and demanding that the USPS makes sure no-one can snap up and read the letters while in transit. Or going to the post office in the nude and demand the post office makes sure nobody can see your penis before you get there.

      What? You don't have a penis, and I'm an insensitive clod? Sorry.
    25. Re:It's there servers by sgholt · · Score: 1

      HMMM...you know that "thought crime" would be the next step...using your search data as evidence is one thing...but the next logical step is to start prosecuting what you might do. Don't be content that you are doing nothing wrong, so nothing will happen to you. Google is doing the right thing, I hope someday the data will not be available at all.

    26. Re:It's there servers by Anonymous Coward · · Score: 0

      All this talk of fancy "ISP Routers" - I don't believe in that mumbo jumbo. Everyone knows Layer 3 of the OSI model is called "Black Magic" and Layer 4, well hell we can't even prove it exists!

    27. Re:It's there servers by grolschie · · Score: 1

      ... How about I come over to your house later, sit in your bed for a bit...

      If you've got nothing to hide, you should have no problem with this.

      It all depends. Are you hot? ;-)
  12. IAO by lundqvist · · Score: 1

    I bet that means the IAO has their project running properly now so they no longer need to use Google Logs ...

  13. I'm feeling lucky: by Anonymous Coward · · Score: 0
  14. We still think of the children! by Anonymous Coward · · Score: 1, Interesting

    After previously fighting turning over search data to the feds, it looks like they are striking another blow to the "think of the children" crowd.

    Anybody who remembers this incident probably also remembers the article 'Google in bed with the CIA' too:

    "Google was a little hypocritical when they were refusing to honor a Department of Justice request for information because they were heavily in bed with the Central Intelligence Agency, the office of research and development," said Steele. http://www.prisonplanet.com/articles/october2006/271006googlecia.htm

    Makes me wonder how fast does the CIA anonymize their material? Ha!
    1. Re:We still think of the children! by turing_m · · Score: 1

      Yep, I love how we hear all this great theatre about Google "not being evil", "fighting subpoenas" and "anonymizing search records" while at the same time they become more firmly embedded in the US spy services. What else would one expect from a business that is (according to another poster) "primarily a media company, like NBC"?

      Here's a quote from William Colby, former Director of the CIA:
      "The Central Intelligence Agency owns everyone of any major significance in the major media."

      Plus ça change...

      --
      If I have seen further it is by stealing the Intellectual Property of giants.
    2. Re:We still think of the children! by NDPTAL85 · · Score: 1

      That's the way it's supposed to be. If the CIA didn't own people then they would have to be shut down for negligence.

      --
      Mac OS X and Windows XP working side by side to fight back the night.
    3. Re:We still think of the children! by Goaway · · Score: 1

      And what more reliable source could one imagine than a 9/11 conspiracy theorist?

  15. Because Google's primarily a media company... by xxxJonBoyxxx · · Score: 4, Informative

    Why not anonymise the data after zero months?
    Because Google's primarily a media company, like NBC, only with much finer detail about what you want to see. Like any media company, Google finds demographic data incredibly valuable because it allows them to "connect" you with the "correct" advertisers. There's no way in hell Google would let people be completely anonymous; it goes against their business plan. (I'd also bet three years from now we'll find through some court case that backup tapes somewhere really extend "anonymous after 18 months" to 4-5 years.)
  16. rom the poof-your-gone dept. by 1u3hr · · Score: 0, Offtopic

    "you're gone" [you are]

    1. Re:rom the poof-your-gone dept. by Anonymous Coward · · Score: 0

      no- "your gone" [your bases gone - belong to us]

  17. This is quite significant, by j_heisenberg · · Score: 1

    since that data could be abused in any number of ways, including credit scoring, insurance scoring or leaks of "interesting details" to the press. Probably those would hurt Google's reputation more than any additional income it could generate, but it's still the better policy.

  18. Firefox can already anonymize Google by seandiggity · · Score: 1

    If you're worried about privacy, I recommend Firefox and the Customize Google extension. I'm also a fan of Googlepedia.

    --
    Geeks like to think that they can ignore politics, you can leave politics alone, but politics won't leave you alone.-rms
  19. 18-24 months? by JackMeyhoff · · Score: 2, Insightful

    Which is it? 18, 19, 20, 21, 22, 23 or 24?

    --
    http://www.rense.com/general79/wdx1.htm
    1. Re:18-24 months? by NereusRen · · Score: 1

      It might not be a fixed number at all, in which case their estimate is as exact as they can be without going into explicit detail.

      I'm guessing they will have a new process, executed every 6 months, which anonymizes all logs older than 18 months. How long would any given search remain non-anonymous under that approach? 18-24 months.
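
      That guess can be checked with a little arithmetic. The 6-month batch interval and 18-month minimum age are the numbers assumed above, not anything Google has confirmed, and the function name is made up for the sketch.

```python
def age_when_anonymized(logged_at, min_age=18, interval=6):
    """Age in months of a log entry when the first eligible batch job hits it.

    Batch jobs run at months 0, interval, 2*interval, ... and each one
    anonymizes every entry that is at least min_age months old.
    """
    threshold = logged_at + min_age
    batches_needed = -(-threshold // interval)  # ceiling division
    return batches_needed * interval - logged_at

# Entries logged at each whole month over five years.
ages = [age_when_anonymized(month) for month in range(60)]
youngest, oldest = min(ages), max(ages)
```

      An entry logged right on a batch date gets scrubbed at exactly 18 months; one logged just after a batch date waits almost a full extra cycle, so it gets close to 24 months but never reaches it. Hence "18-24 months" without ever naming a single number.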

  20. Yeah Right by Psx29 · · Score: 1

    This means nothing. If you click the link: "By anonymizing our server logs after 18-24 months..." That's still far too long and is most likely motivated more by logistical concerns in retaining so much data than out of any act of benevolence. However it definitely makes good PR to paint this as 'Taking steps to improve privacy'...

    1. Re:Yeah Right by Alascom · · Score: 1

      >That's still far too long and is most likely motivated more by logistical concerns in
      >retaining so much data than out of any act of benevolence. However it definitely makes
      >good PR to paint this as 'Taking steps to improve privacy'...

      I am sure that while statistical analysis is one possible use, another use is fraud prevention. Google makes money off each search query. However, there are people who try to scam the system using the AdSense and AdWords programs, and keeping a year or two's worth of data would probably be very useful in tracking down 'slow burn' fraud. It's easy to spot someone clicking on a webpage ad 50 times in a minute, but it requires longer-term data to see the same fraud occurring with 1 click per week over a year. While most people on Slashdot will refuse to accept this argument, people like myself who actually SPEND money on advertising appreciate reasonable efforts to combat fraud.

  21. You won't be anonymous, and it doesn't matter by guanxi · · Score: 2, Interesting
    To quote them:
    "It is difficult to guarantee complete anonymization, but we believe these changes will make it very unlikely users could be identified."

    "Changing the bits of an IP address makes it less likely that the IP address can be associated with a specific computer or user. Cookie anonymization makes it less likely that a cookie can be used to identify a user."

    "[I]t's possible that data retention laws will obligate us to retain logs for longer periods."

    "How many subpoenas for server log data does Google receive each year?
    As a matter of policy, we don't provide specifics on law enforcement requests to Google."


    I don't think it will mean much unless they publish their anonymization technique. Even Google seems to have doubts about it, and considering the resources of some attackers (e.g., national governments), if the anonymization can be broken it will be.

    But Google's anonymization does not have to be perfect: Google isn't the only place your google.com activity is recorded: There's your personal computer, possibly your ISP, other sites (referrer links show Google search terms), etc. As long as Google makes their anonymity difficult enough to break that it's significantly easier to go elsewhere for the information, they've done their job. If you need to be anonymous, I hope you are taking other steps.

    I, for one, welcome the merciful intentions of our benign new overlords.

    1. Re:You won't be anonymous, and it doesn't matter by davenaff · · Score: 1

      Yeah, if they really want to anonymize the data, they need to publicize the techniques they use.

      Also, people are still missing several key items:

      - Search History is turned on by default. These records are NEVER purged.
      - As far as I can tell from the article, this only applies to Search. Adsense, Gmail, Maps, etc. may be excluded from these new policies.

      - They are still retaining all historical searches. So, they will always be able to look back at stats such as search term frequency, etc. They are not losing much in the way of long-term analytics.

  22. No Consent by Anonymous Coward · · Score: 4, Interesting

    Exactly, it's to Google's MONETARY benefit that they record this information. The EU Privacy law says THEY CANNOT RECORD MORE PERSONAL INFORMATION THAN IS NEEDED FOR A TRANSACTION. Now that it's clear that search data is personally identifiable, the EU Privacy law should be used to FORCE GOOGLE TO QUIT IT.

    "The moment you sent your request out over the internet in plain text to a third party (that is a corporation out to make money you know) you lost that."

    Not so, the law says we have to consent and we didn't consent!

    And what about when that party isn't Google? Google Analytics is not on Google's site, it's embedded on third-party sites, and Google's AdSense is on other people's sites too. I didn't consent to handing my data to Google when I surfed to a third party's site; Google took that data and recorded it in violation of EU privacy laws.

    There has also been a lawsuit over exactly this issue before, which resulted in DoubleClick backing down.

    http://archives.cnn.com/2000/TECH/computing/01/28/double.click.lawsuit.idg/

    "A California woman has filed suit against DoubleClick, accusing the U.S.-based online advertising company of unlawfully obtaining and selling consumers' personal information, according to a statement issued by her attorney's office."

    "Hariett M. Judnick filed the suit in Marin County Superior Court in California, on behalf of the "general public of the state of California," the statement said.
    The suit alleges that DoubleClick employs Internet cookies to identify users and track their movements on the Internet. The company tracks and records the sites an individual visits, as well as the information transmitted on the sites, such as names, ages, addresses, shopping patterns and financial information."

  23. Um... by superbus1929 · · Score: 1

    Didn't AOL get into a lot of trouble for this?

    Personally... we knew this was going to happen. Anyone that's surprised is a fool.

    --
    Let's stop dilly-dallying and just change "-1: Overrated" to "-1: Disagree" or "-1: Doesn't Subscribe to Groupthink".
  24. Things That Bit Butts, Part Deux by WED+Fan · · Score: 5, Insightful

    List of nifty little phrases that have bitten their speakers in the ass:

    • They will never bomb Berlin
    • Read my lips, no new taxes
    • I did not have sex with that woman
    • Mission accomplished
    • Don't be evil

    Now Google brings us:

    Let's just be less evil, now that we've been caught.

    --
    Politics is the art of looking for trouble, finding it everywhere, diagnosing it incorrectly and applying the wrong fix.
  25. well by DuroSoft · · Score: 1

    The 'think of the children crowd' should be very pleased by this - children search for sketchy things all the time... and then their parents get blamed for it.

    'Twould be better if it all stayed anonymous, in my opinion

    1. Re:well by Anonymous Coward · · Score: 0

      This is what happens when you get a bunch of Jews running a company. They know they won't be evil except when they can sell you down the river.
      You're right it should all be anon.

  26. ADVERTIZING by everphilski · · Score: 2, Insightful

    It's all about the advertising. Google's knowledge of you lets them advertise to you more effectively.

    1. Re:ADVERTIZING by Anonymous Coward · · Score: 0

      Yep, that's the bottom line. And it's really disturbing, all the ways they are going about it. It seems they have no limits: whatever is necessary to bring in as many people as possible and then target them with ads to profit from. So many Google services rely completely on other people's content. Google is not in the business of content generation. They profit completely off of others' content. I really wonder when, or even if, people will start to realize the consequences of supporting Google.

      More and more they add little services here and there. For example, you can now find music information provided by Google: track listings for albums, and even lyrics for each song. And of course it's all taken from other sites. They even nicely state in small print at the bottom where it was taken from.

      Doesn't seem like a big deal now. But that's their goal. They want people to go to Google for every possible thing, because then they basically take the highest bidder out of a group of companies and send those people to go buy the music from them, or go get news from them, whatever the topic.

      It just seems wrong to me. I don't understand why no one else sees how serious this really is.

      Why produce content anymore? The real money lies in aggregating all the content. Forget all that hard work of creating.

  27. It Is About Context by EXTomar · · Score: 2, Interesting

    It isn't that Google necessarily cares that it is "you" (actually they might, but that is another thread...), but "you" are doing a search and then clicking on links in a particular order, which is a context that is important for ranking. At an abstract level, the relationship between what you searched and the links you tried is stuff Google wants to track to help enhance relevancy and search results. The problem is that, with modern technology, doing this requires knowing some things that aren't anonymous and that can be abused.

    If they can come up with a way to do this without tying it all back to a computer and the individual who made the request, then we are probably all better off, not because of privacy issues (though that is a great side effect) but because you get better results from removing the irrelevant data from ranking consideration. The closer they get to a truly anonymous search system, the better the results should theoretically be.
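
    To make the idea above concrete, here is a rough sketch of click-based ranking that never stores who clicked. The log format, the rank_by_clicks name, and the example URLs are all made up for illustration; nothing here is known to match what Google actually does.

```python
from collections import Counter

def rank_by_clicks(click_log, query):
    """Order URLs for `query` by how often they were clicked.

    `click_log` is a list of (query, url) pairs. No IP address, cookie,
    or user ID is recorded, so the ranking is built from anonymous counts.
    """
    counts = Counter(url for q, url in click_log if q == query)
    return [url for url, _ in counts.most_common()]

# Hypothetical anonymous click log: only the search string and the clicked link.
click_log = [
    ("linux kernel", "https://example.org/kernel"),
    ("linux kernel", "https://example.com/howto"),
    ("linux kernel", "https://example.org/kernel"),
    ("gtk tutorial", "https://example.net/gtk"),
]

ranking = rank_by_clicks(click_log, "linux kernel")
```

    What this drops is the per-user click order, which is the part that needs some kind of session identifier.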

    1. Re:It Is About Context by Impy+the+Impiuos+Imp · · Score: 1

      > At an abstract level, the relationship between what you searched and the
      > links you tried is stuff Google wants to track to help enhance relevancy and search results.

      I thought AskJeeves or someone had this patented already -- to let the links clicked on more for a given search string float to the top.

      --
      (-1: Post disagrees with my already-settled worldview) is not a valid mod option.
  28. Bigger than just marketing. by MindKata · · Score: 2, Interesting

    "Not only that, but is the history of searches you made over 2 years ago relevant to your current searches performed today"

    It is to Google as they want to know more about you, so they can build up a clearer profile about you. Just because they (say they) are going to delete the data after 2 years, doesn't mean they will not use the data in that two years to build up a profile about what you like. Then they can still keep updating that profile over time while deleting data. So even once they delete the data after two years the profile will still persist (in an ever changing and growing form).

    The whole Google "do no harm" talk sounds more like PR spin to cover up what their real intentions are... it's like the old saying, "Knowledge Is Power".

    From a research point of view, Google is basically a vast data mining research company. They are forever looking for more new ways to do data mining.

    So now imagine that, a few years from now, you could work out how to build up a profile of searches from a company instead of a person. Then you would be able to know what that company is interested in. It's the logical extension of profiling individuals. But it would also be pure industrial espionage. But we are told Google will do no harm, so it's OK then. Imagine how valuable that profiling data would be if it were sold to a competitor of that company.

    I think that in a few years we will see countries starting to create their own search engines so that all their research doesn't get fed through other countries' search engines, which are basically gigantic information filtering and collection systems for what people (and companies) are interested in.

    --
    There are 10 kinds of people in the world... those who understand binary and those who don't.
    1. Re:Bigger than just marketing. by nnkx00 · · Score: 1

      Doesn't matter, unless every country's search site is better for their citizens than every other one. For the most part, if North Korea makes a search site better than Google, I think you'll find an awful lot of American traffic heading to North Korea whether the US wants it or not. (Or until Bush makes an executive order to make a big American firewall to keep out terrorists and then blocks North Korea, anyways).

  29. Hash the IP addresses? by sherriw · · Score: 2, Insightful

    Personally I think it's all a load of BS. If they really cared about our privacy, and if all they really needed my IP addy for is to aggregate my searches to 'better serve me', then all they have to do is one-way hash my IP addy. Then they can still tie all my searches together, and my gmail and such, but they wouldn't be able to back track it. And the govn't could demand all they want... you want the IP of the user who searched this? Here it is Mr. Bush... go nuts: x867:%dsgfk435j>67&*g[fg

    So forgive me if I don't get all thankful for Google's big gesture. Heh.
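
    For what it's worth, the one-way hashing idea above could look roughly like this. It assumes unsalted SHA-256 over the dotted-quad string; the pseudonymize_ip and group_searches names and the sample log are invented for the example, not anything Google is known to use.

```python
import hashlib
from collections import defaultdict

def pseudonymize_ip(ip: str) -> str:
    """Return the hex SHA-256 digest of the IP string (unsalted, deterministic)."""
    return hashlib.sha256(ip.encode("ascii")).hexdigest()

def group_searches(log):
    """Group (ip, query) pairs by hashed IP so the raw address is never a key."""
    grouped = defaultdict(list)
    for ip, query in log:
        grouped[pseudonymize_ip(ip)].append(query)
    return dict(grouped)

# Hypothetical log entries using documentation-only address ranges.
sample_log = [
    ("192.0.2.10", "cheap flights"),
    ("192.0.2.10", "hotel reviews"),
    ("198.51.100.7", "python tutorial"),
]

grouped = group_searches(sample_log)
```

    Because the hash is deterministic, the same address always lands in the same bucket, which is what keeps the searches tied together.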

    1. Re:Hash the IP addresses? by santiago · · Score: 5, Insightful

      There are 2^32 IP addresses under IPv4. If Google is doing the hashing, then they know the hash function. How long do you think it would take them to brute-force break the hash by hashing every possible IP address and creating a map from the hashed values back to the originals? Express your answer in microseconds.

      (If your solution is to increase the space of inputs by adding a variable salt value, please explain how this allows them to use the resulting hashes for aggregation.)
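
      A small sketch of that brute-force attack, assuming the hash is unsalted SHA-256 over the dotted-quad string (the hash choice and the function names are assumptions for the demo). It only enumerates one /16 to stay quick; covering all of IPv4 is the same loop over 2^32 inputs.

```python
import hashlib
import ipaddress

def hash_ip(ip: str) -> str:
    """The 'anonymizing' hash: unsalted SHA-256 of the IP string."""
    return hashlib.sha256(ip.encode("ascii")).hexdigest()

def build_reverse_map(network: str) -> dict:
    """Hash every address in `network` and map digest -> original IP."""
    return {hash_ip(str(addr)): str(addr) for addr in ipaddress.ip_network(network)}

# One /16 is 65,536 addresses; the full IPv4 space is 65,536 such blocks.
reverse_map = build_reverse_map("10.1.0.0/16")

# Anyone who knows the hash function can look a "hashed" IP straight back up.
leaked_digest = hash_ip("10.1.42.7")
recovered = reverse_map[leaked_digest]
```

      The lookup table only has to be built once, after which every hashed address in the logs is a dictionary lookup away from the original.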

  30. 127.0.0.1 by supun · · Score: 3, Funny

    Just hard code the function that grabs "HTTP_REMOTE_ADDR" to return "127.0.0.1." That way the feds will think all the kiddie p0rn searches came from the computer they are using.

    --
    :w!
  31. False by Anonymous Coward · · Score: 0

    "Google could not exist without collecting this information. This data is central to its business model, and key to its differentiation from other search engines."

    False, if it was essential to their business they would not be able to 'anonymize' the data after 18 months.

    There is no gain to be had from targeting an advert to a site I surfed yesterday, let alone 18 months ago. There is no gain to be had from 'tweaking' the results to be more like something someone was searching from my IP address yesterday, or the day before, or 18 months ago.
    And even click fraud has its limits of detection: Google does after all decide the clicks are valid and then pays the bill once a month after a short delay.

    "The great majority of web users see nothing wrong with the method even though concerns about it are getting a fair amount of publicity."

    You don't speak for the majority of web users so you're not able to make this remark with any authority.

  32. Not exactly by nova_ostrich · · Score: 1

    AOL got in trouble for releasing it publicly. Google isn't doing that.

    --
    It's scary being a Flash and Flex developer on Slashdot. You guys are unnaturally rabid.
  33. Does that mean ... by digitig · · Score: 1

    ...I can stop adding "-lolita" when searching for "Nabokov"?

    --
    Quidnam Latine loqui modo coepi?
  34. Anonymous is just as scary by tim90402 · · Score: 1

    Everyone is worried about their own personal privacy, without thinking about the power Google is accumulating even if the data is totally anonymous. E.g., if everyone suddenly starts searching for a certain product, Google knows before anyone else, and could buy out the company who makes it, or sell that information to others. As long as the Google data repository is limited to www searches and click-through behavior, there is some bound on their power. It would become really scary if they were able to analyze what people are talking about in e-mail each day.

  35. Google as mutual fund by tim90402 · · Score: 2, Interesting

    What I thought was a future concern may already be happening. According to http://www.computers.net/2006/08/google_in_dange.html Google holds 5.8 billion in marketable securities. This is more than 40% of their assets, which by SEC rules means they are an "investment fund" and subject to different reporting and operating rules.

    1. Re:Google as mutual fund by jacquesm · · Score: 1

      Holy smokes, that is probably the single most relevant piece of information that I
      have *ever* come across on slashdot.

      If this is true then not only is the stockmarket in serious danger, it may also mean
      that to 'beat the market' now means to 'beat google', and you had better not use
      google as a research tool if you're an investment banker (I'm pretty sure that's an
      easy profile to make) or the game is up....

      amazing...

      somebody *please* mod parent up

  36. Google doesn't deserve any good press over this by Nom+du+Keyboard · · Score: 1

    Why is Google getting any favorable press at all for this? They never should have been doing it in the first place.

    --
    "It's the height of ridiculousness to say for those 9 lines you get hundreds of millions."
  37. 18 months? by J'raxis · · Score: 1

    There is absolutely no reason for them to retain logs linking searches to IP addresses for even 18 seconds, let alone 18 months -- this isn't "improving Google" for any of their users, no matter how much they claim it is.

    Keeping search history for logged-in users is one thing; I can see how some users could find that useful, just like browser history autocomplete. Perhaps they want to keep logs of non-logged-in users around for something like geographical targeting, but there's no reason they can't process out the IP information immediately, or on a quick rolling schedule such as every 24 hours. Or, just keep the /24 or /16 form of the IP address; that effectively anonymizes the data but still provides enough information for geo-targeting or other forms of aggregation. If they want to track the flow of requests (a user searched this, then that, clicked here, then...), they can use their cookie for that, or do something like generate a hash of each IP's hostname* and track requests by the hash.

    18-24 months, however, is about the right length of time that this data could be useful for the government for purposes of intelligence gathering or criminal prosecution.

    * Hashing the IP itself is useless as there aren't enough IPs (4,294,967,296 in theory, far fewer in practice due to all the reserved /8s) to make reversing the hash back to the IP difficult. However, the domain of valid hostnames is incredibly large (any alphanumeric string up to about 255 characters), such that one can be reasonably confident the hostname cannot be computed from the hash.
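
    The "/24 or /16 form" suggestion is easy enough to sketch with Python's standard ipaddress module. The truncate_ip name and the sample address are made up here; this only illustrates the idea and says nothing about Google's actual pipeline.

```python
import ipaddress

def truncate_ip(ip: str, prefix_len: int = 24) -> str:
    """Zero the host bits of an IPv4 address, keeping only the network prefix."""
    # strict=False lets ip_network accept an address with host bits set
    # and mask them off instead of raising ValueError.
    network = ipaddress.ip_network(f"{ip}/{prefix_len}", strict=False)
    return str(network)

coarse_24 = truncate_ip("203.0.113.77")      # keeps the first three octets
coarse_16 = truncate_ip("203.0.113.77", 16)  # keeps the first two octets
```

    With a /24, up to 256 addresses collapse into one log value, and with a /16 up to 65,536 do, while the block is still coarse enough to be useful for geo-targeting.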

  38. Business Plan by Anonymous Coward · · Score: 0

    When their application server writes to the log file only log what and when, forget about the who. Let me have the code that writes to the log file, I will have it done in 1 week. If it's taking years to make this change, I expect they won't mind receiving an invoice for $1,000,000 from me. (Corporate reasoning: Years = $MILLIONS)
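
    Something like the following is what "log what and when, forget about the who" might look like. The request dict, its field names, and make_log_entry are hypothetical stand-ins; the real logging code is obviously not public.

```python
import json
from datetime import datetime, timezone

def make_log_entry(request: dict) -> str:
    """Serialize only the query ('what') and a UTC timestamp ('when').

    Any identifying fields on the request, such as the IP address or a
    cookie, are simply never copied into the log entry.
    """
    entry = {
        "when": datetime.now(timezone.utc).isoformat(),
        "what": request["query"],
    }
    return json.dumps(entry)

# Hypothetical incoming request with identifying fields attached.
request = {"query": "slashdot", "ip": "192.0.2.1", "cookie": "abc123"}
line = make_log_entry(request)
```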

  39. I don't understand by JustNiz · · Score: 1

    Why don't they just not save search data in the first place?

  40. How About "Let's Not Allow Children To Think"? by Jane+Q.+Public · · Score: 1

    I am so sick of the myriad bad laws and regulations passed because they were supposedly "good for the children".

    Bollocks.

    People have been creating a world with a lid that is so "screwed down" by "authority" that if the trend continues, children will be growing up in a living hell, in which they are not allowed to think for themselves even after becoming adults.

    Is this good for them? Is it good for *anybody*??

    I think not.

    1. Re:How About "Let's Not Allow Children To Think"? by Anonymous Coward · · Score: 0

      If they wanted children to think they wouldn't be forcing them into the Public Education System.

      1984

      But don't worry yourself too much with such thoughts. Once the national ID system goes into effect and they legislate the required use of your ID for online activities they can log your politically incorrect remarks and send you in for political correctness training as such thoughts might lead to hate crimes. Be sure and keep your friends list at Slashdot up to date so they can gain the benefits of this training too.

      Doubleplusgood day to you Jane Q. Public.

      Oh and I am using my ID on one of my computers to compile Linux from source, could I borrow yours for a while to log in to the internet on my other computer? I really hate that the hardware won't boot up anymore without the ID inserted. /1984 off

  41. so do it yourself who needs google by talledega500 · · Score: 1
  42. Well, we actually do evil, but we'll stop in 2 yrs by pcause · · Score: 1

    Google is gathering a huge trove of information about us and this shows it is not anonymous. Search is only part of what they have. The more Google services you use the more you let them build a very detailed profile of you. And the more you do that the less privacy you have.

    They know what you search for, who you IM and email and about what, where you have appointments and what you bought. You essentially have no privacy.

    If you value your privacy do not use any single provider and spread your searches, IM, email and purchases across multiple service providers. The government can use its powers to get your data and correlate it, but no commercial entity should have the equivalent power. Commercial interests of Google or any other provider run counter to protecting your privacy.

  43. Maybe they will stop helping the Chinese communist by Anonymous Coward · · Score: 0

    I doubt it... Seems like the only time Google is concerned about anonymous access is when some kind of kiddie pron is involved. They have no problem with helping the communist chinese snoop on their citizens and block sites, right?

  44. Losing money entering a goverment info market by Anonymous Coward · · Score: 0

    Remember, it's governments, not just the US, that they now have to report to! And either way they are just going to lose money on this: they have to hand over the data, which will be resold to government business partners (Google's competition) for a hell of a lot less than it cost them to collect it.