Slashdot Mirror


Research To "Reveal the Unseen World of Cookies"

An anonymous reader writes "The Guardian newspaper has teamed up with Mozilla to research the monitoring of online behavior through cookies and other web trackers. After downloading the Collusion add-on for Firefox, you can generate a visual representation of all the cookies that have been downloaded which are linked to the sites you have visited. This shows quite an interesting picture. The Guardian staff then want the data from Collusion to be uploaded to their site, after which they say 'we can build up a picture of this unseen world. When we've found the biggest players, we'll start tracking them back — finding out what data are they monitoring, and why.'"

29 of 108 comments

  1. Great Idea by thesaintar · · Score: 3, Interesting

    I hope that implementing it in the right way (with publicly accessible statistical and analysis methods) will shed some light on how we're being tracked. Is there an equivalent of Collusion for Chrome?

    1. Re:Great Idea by WrongSizeGlass · · Score: 5, Funny

      Is there an equivalent of Collusion for Chrome?

      I believe it's called Google Ads ;-)

    2. Re:Great Idea by Anonymous Coward · · Score: 2, Insightful

      Who goes on the internet to BUY porn?!

    3. Re:Great Idea by Anonymous Coward · · Score: 4, Insightful

      You mean those ads which are displayed on all browsers and are in no way tied or targeted to Chrome?

      Either you're a troll or that was a bad joke.

      It's interesting to me how someone joking around is considered a 'troll' to you. You are what is wrong with /. these days. 'Troll' is the new 'I disagree with you'.

      Did he REALLY evoke an emotional response from you by saying what he said? Did it truly upset you to the point where you were incensed and bitter over his words? If so, then maybe he is, indeed, a troll. Otherwise, shut up.

    4. Re:Great Idea by tehcyder · · Score: 2

      I'm not saying they might be trolling because I disagree. I said it because the entire premise of the joke is factually without basis.

      Hate to break the news to you, but jokes don't have to be factually accurate or even vaguely plausible.

      You seem to have issues with people criticising Google in a humorous way. I suppose at least you're not an Apple fanboy, but a Google fanboy isn't much better.

      Get over it, they're computer companies not our Lord Jesus Christ.

      --
      To have a right to do a thing is not at all the same as to be right in doing it
  2. How to get rid of them by GameboyRMH · · Score: 5, Informative

    On Firefox, disable HTML5/DOM storage, install CookieMonster 1.5 and BetterPrivacy.

    --
    "When information is power, privacy is freedom" - Jah-Wren Ryel
  3. Pot kettle spy. by FatLittleMonkey · · Score: 5, Insightful

    we'll start tracking them back — finding out what data are they monitoring, and why.

    Well, here's my contribution;

    The Guardian page in the link has six trackers:
    24/7 Real Media
    Audience Science
    ForeSee
    Maxymiser
    Optimizely
    Quantcast

    I don't know what any of them do, and I blocked them all. Fuck 'em.

    --
    Science is all about firing a drunk pig out of a cannon just to see what happens.
    1. Re:Pot kettle spy. by Lucky75 · · Score: 2

      I actually see 9:

      24/7 Real Media
      Audience Science
      ForeSee
      Google Adsense
      Maxymiser
      Omniture
      Optimizely
      Quantcast
      Twitter Button

      --
      DNA -- National Dyslexic Association
    2. Re:Pot kettle spy. by FatLittleMonkey · · Score: 5, Funny

      Story of my life. I brag about having 6, and the other guy has 9.

      --
      Science is all about firing a drunk pig out of a cannon just to see what happens.
    3. Re:Pot kettle spy. by bfree · · Score: 2

      You missed some more!

      googleapis
      simplifydigital
      guim
      llnwd
      ophan
      ytimg
      youtube
      quantserve
      wunderloop
      revsci
      cogmatch
      imrworldwide

      I'll leave it as an exercise for the reader to de-dupe the above list (e.g. quantserve Vs quantcast and ytimg Vs youtube) and decide for themselves which ones are innocuous.

      I didn't even bother to let any of them run any javascript to discover what else they might try to sneak in. I'm also willing to bet I missed something.

      You have to love the "obfuscation" and attempts to get past blocking, from the simple noscript web-bugs to

      document.write('<scr' + 'ipt type="text/javascript"

      --

      Never underestimate the dark side of the Source

    4. Re:Pot kettle spy. by Anonymous Coward · · Score: 5, Interesting

      Hi,

      I'm the Guardian journalist working on this.

      Unsurprisingly, if you install Collusion after reading an article on The Guardian, you tend to log cookies that our website sets. So we're noticing quite a few of the trackers we use on guardian.co.uk turn up in the project. :)

      We're ok with that - better to be open that our website uses cookies for registration, analytics and advertising (just like most others!), than pretend or hide away the fact. Actually, we did another article on the same day showing how we use them: http://www.guardian.co.uk/technology/2012/apr/13/new-law-cookies-affect-internet-browsing.

      The ones in that list above are a mix of third-party advertising cookies, analytics and A/B testing (so I'm learning!).

      When it comes to the data we're going to try and get from the Collusion info - we can't really infer much about what behaviours have been tracked from the exported data. However, it gives us a nice long JSON string that associates certain cookies as being set when visiting certain sites. At the moment we're using that to find out how many instances of each type of tracker we're seeing across multiple sites.

      We're then going to take the most prolific ones and find out more about what they do, who owns them, how they work, etc. However, we're going to be using old-fashioned journalism to do that - research and phone calls.

      However, I was thinking of putting up open documents like this: https://docs.google.com/document/d/1lCp8H9i-MJwyORj_MOZflH6BCt9j6HIbQkyS2536knM/edit
      so you could see where I'd got to and put me right if I was going off track (as it were). Good idea? Bad idea?

      Joanna.
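
The counting step described above (how many sites each tracker turns up on) could be sketched roughly as follows. This is only an illustration: the export layout, the `SAMPLE_EXPORT` data, the `.test` domain names and the function name `count_tracker_sites` are all assumptions, and the real Collusion export may be structured quite differently.

```python
import json
from collections import Counter

# Hypothetical export: each visited site maps to the third-party domains
# that set cookies during the visit. All names here are made up, and the
# real Collusion JSON may use a different layout.
SAMPLE_EXPORT = """
{
  "news-site.test": ["tracker-a.test", "tracker-b.test", "tracker-c.test"],
  "blog-site.test": ["tracker-a.test", "tracker-d.test"],
  "shop-site.test": ["tracker-d.test", "tracker-a.test", "tracker-a.test"]
}
"""


def count_tracker_sites(export_json: str) -> Counter:
    """Count, for each tracker domain, how many distinct sites it appeared on."""
    visits = json.loads(export_json)
    counts = Counter()
    for _site, trackers in visits.items():
        # Use a set so a tracker listed twice for one site is counted once.
        counts.update(set(trackers))
    return counts


if __name__ == "__main__":
    for tracker, n_sites in count_tracker_sites(SAMPLE_EXPORT).most_common():
        print(f"{tracker}: seen on {n_sites} site(s)")
```

Sorting with `most_common()` puts the most widespread trackers first, which matches the stated plan of researching the most prolific ones.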

    5. Re:Pot kettle spy. by SpaceLifeForm · · Score: 2

      You will all see different cookies because they are coming from various machines on the net. Upstream intermediates are inserting them on the fly.

      --
      You are being MICROattacked, from various angles, in a SOFT manner.
    6. Re:Pot kettle spy. by FatLittleMonkey · · Score: 2

      Eh? If Ms Geary puts it anywhere public online, Google can see it anyway. (As can the actual NSA.) So unless you're saying that Google will censor her work, your comment makes no sense.

      --
      Science is all about firing a drunk pig out of a cannon just to see what happens.
  4. Cookieculler by MLCT · · Score: 5, Informative

    Bit of a shoutout for the firefox extension cookieculler.

    I have never found anything that matches cookieculler for features: it doesn't just delete cookies, it operates with a whitelist-based system (the way everything on the web should work). Cookieculler deletes all cookies each time you close the browser, except the ones you have whitelisted as "protected", which keep login information etc. as you choose.

    Along with noscript, cookieculler is the main reason I stay on firefox.

    1. Re:Cookieculler by Lucky75 · · Score: 2

      I've found "Ghostery" to be pretty damn good. Blocks them rather than allowing and then deleting them.

      --
      DNA -- National Dyslexic Association
    2. Re:Cookieculler by emilv · · Score: 2

      How is cookieculler different from setting a default policy in Firefox and then using the built-in whitelist in Firefox to give permissions for certain sites?

    3. Re:Cookieculler by MLCT · · Score: 4, Informative

      Granted firefox can offer something close, but not quite. Cookieculler offers finer control, because you can whitelist the *cookies* rather than the domain. So I can (and do) choose to protect my /. cookie, but not anything else that /. place in my browser (hypothetical example, as /. don't place any other cookies).
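
The difference between whitelisting individual cookies and whitelisting a whole domain can be shown with a small sketch. This is not CookieCuller's or Firefox's actual code; the `Cookie` type, the function names and the `forum.test` cookies are hypothetical and exist only to illustrate the idea.

```python
from typing import NamedTuple


class Cookie(NamedTuple):
    domain: str
    name: str
    value: str


def cull_by_cookie(cookies, protected):
    """Keep only cookies whose (domain, name) pair is explicitly protected."""
    return [c for c in cookies if (c.domain, c.name) in protected]


def cull_by_domain(cookies, allowed_domains):
    """Keep every cookie from an allowed domain, whatever its name."""
    return [c for c in cookies if c.domain in allowed_domains]


# Hypothetical cookie jar at the moment the browser closes.
JAR = [
    Cookie("forum.test", "login", "abc123"),
    Cookie("forum.test", "ad_id", "xyz789"),
    Cookie("tracker.test", "uid", "42"),
]

# Per-cookie whitelist: protect only the login cookie.
PROTECTED = {("forum.test", "login")}

if __name__ == "__main__":
    print([c.name for c in cull_by_cookie(JAR, PROTECTED)])
    print([c.name for c in cull_by_domain(JAR, {"forum.test"})])
```

With the per-cookie whitelist only the login cookie survives, while the per-domain policy also keeps the second cookie set by the same site.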

    4. Re:Cookieculler by plover · · Score: 2

      Citation really needed.

      --
      John
  5. Cookies or COOKIES!?!? by Anonymous Coward · · Score: 2, Informative

    Anyone else read the title and thought people were taking a deeper look at why those delicious baked goods are so tantalizing?

  6. Internet marketing by Roberticus · · Score: 4, Interesting

    If average folks become aware of how many cookies get set (along with getting a user-friendly way* of turning them off), that could have a huge and entertaining effect on the world of Internet marketing**.

    For example, right now I can assume that almost 100% of website visitors have JavaScript enabled (so it's not worth writing HTML for the case where they don't). But if I can only reasonably assume that, say, 50% of my visitors/email through-clickers/etc. have cookies active, that plays havoc with my reporting.

    * "User-friendly" defined as "something my dad can do without asking me for help".
    ** I spend all day every workday in this world.

  7. Facebook by Lucky75 · · Score: 4, Informative

    You'd be shocked at how many cookies come from facebook across multiple sites. I use an extension called Ghostery (https://addons.mozilla.org/en-US/firefox/addon/ghostery/) to block most of them.

    --
    DNA -- National Dyslexic Association
    1. Re:Facebook by 19thNervousBreakdown · · Score: 2

      Spoiler: It's practically every site.

      --
      <xml><I><am><so><damn>Web 2.0</damn></so></am></I></xml>
  8. Yo Dawg by Z80xxc! · · Score: 3, Funny

    Yo dawg... I heard u dislike being tracked, so we put a tracker in your trackers so you could be tracked while we track.

  9. "What Data they are monitoring and why" by dmomo · · Score: 3, Interesting

    It will be interesting to see not only the results of this analysis, but also how they come to any conclusions they reach.

    Many cookies are used only to store a unique identifier. The data that many websites actually store about a user is housed and maintained on their servers, keyed by the unique id. This could include "pages visited", "duration of visit", "browser/system specs/settings" along with any derived demographic data.

    It would be hard (though not necessarily impossible) to determine this from a cookie analysis.
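
A minimal sketch of that arrangement, assuming a made-up `TrackerServer` class and a cookie named `uid` (neither comes from any real tracker): the cookie carries only an opaque identifier, and everything interesting lives in server-side storage keyed by it.

```python
import uuid
from collections import defaultdict


class TrackerServer:
    """Toy tracker: the browser holds an id, the server holds the profile."""

    def __init__(self):
        # visitor id -> list of pages seen; never sent to the browser
        self.profiles = defaultdict(list)

    def handle_request(self, cookies: dict, page: str) -> dict:
        """Record a page view and return the cookies to set on the response."""
        # Reuse the id from the cookie if present, otherwise mint a new one.
        visitor_id = cookies.get("uid") or uuid.uuid4().hex
        self.profiles[visitor_id].append(page)
        return {"uid": visitor_id}


if __name__ == "__main__":
    server = TrackerServer()
    jar = server.handle_request({}, "/news")    # first visit, no cookie yet
    jar = server.handle_request(jar, "/sport")  # cookie sent back by browser
    print("cookie seen by the user:", jar)
    print("profile held by the server:", server.profiles[jar["uid"]])
```

Inspecting the cookie alone reveals a random-looking string and nothing about the pages recorded against it, which is why a cookie-only analysis struggles to show what is being tracked.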

  10. Collusion is quite fascinating... by dryriver · · Score: 3, Interesting

    I found out using its automated "graph-builder" that the 3 - 4 supposedly "safe" sites I visit most often actually pass my user data on to Google, Facebook, DoubleClick, Mediaplex, Adroll and other services. It's quite educational to watch the graph go from a blank page to a fairly complex network of interconnections as you continue to browse. It's going to be interesting to see what results from this when the Guardian gets all the aggregate data from Collusion. It does seem indeed that there is such a thing as a "secret world of cookies" out on the internet, and I personally support uncovering this "secret world" fully, so we get to see what entities are clandestinely mining our supposedly "private" user information as we surf. --- The whole thing also reminds me of the book "Brandwashed", where the author explains at length how commercial establishments collect all sorts of data on us, and exploit it to sell us more products.

    --
    Why did the chicken cross the road? Because Elon Musk put an AI chip in its head.
  11. porn by Sperbels · · Score: 2

    finding out what data are they monitoring, and why

    Well, all the porn websites seem to know that I prefer brunettes over blonds.

    1. Re:porn by joannageary · · Score: 2

      But do they also know that you buy your underwear from Marks & Spencers? That's the interesting sort of thing I'm hoping we'll find out - what companies are tracking over such varied sites and what information (if any) they then sell back to their clients.

  12. Cookies not the only way to do this... by isaac · · Score: 4, Informative

    Cookies are not the only evidence of tracking, even if you also count Flash LSOs, HTML5 local storage, etc.

    There's a surprising amount of identifying information in request headers and in what's available to JavaScript (see http://panopticlick.eff.org/ for a demonstration). That means one often needn't accept or store a cookie to be tracked.

    A really comprehensive pro-privacy browser extension would munge request headers and the enumeration of fonts, plugins, screen resolutions, etc. to match one of, say, the top 5 most common desktop browser fingerprints, and would change which one every so often. (Changing per request would itself be a trivially detectable signature.)

    -Isaac
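
To make the cookieless tracking point concrete, here is a rough sketch of how a server might fold request headers and script-visible attributes into one identifier. It is an illustration under assumptions, not Panopticlick's method: the `fingerprint` function, the attribute names and the sample values are all invented for the example.

```python
import hashlib


def fingerprint(attributes: dict) -> str:
    """Hash a set of browser attributes into a short, stable identifier."""
    # Sort the keys so the result does not depend on attribute order.
    canonical = "\n".join(f"{key}={attributes[key]}" for key in sorted(attributes))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()[:16]


# Hypothetical values a site could read from headers and from JavaScript.
VISITOR = {
    "user_agent": "ExampleBrowser/1.0 (ExampleOS)",
    "accept_language": "en-GB,en;q=0.8",
    "screen": "1680x1050x24",
    "fonts": "Arial,Comic Sans MS,Helvetica",
    "plugins": "ExamplePDF 9.0;ExampleFlash 11.2",
}

# The same visitor after installing one extra font.
VISITOR_NEW_FONT = dict(VISITOR, fonts="Arial,Comic Sans MS,Helvetica,Ubuntu")

if __name__ == "__main__":
    print(fingerprint(VISITOR))
    print(fingerprint(VISITOR_NEW_FONT))
```

The same attributes always produce the same identifier with nothing stored in the browser. An exact hash like this breaks whenever one attribute changes, which is also why a fingerprint that changes on every request would stand out.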

    --
    I am not a lawyer, and this is not legal advice. For Entertainment Purposes Only.
  13. Methodology Issues by TaoPhoenix · · Score: 2

    You know already who the "Big Players" are - Google, Facebook, Microsoft, your choice of a couple more related ones.

    Then it descends into all these little companies. I would expect that some of them are subsidiaries of the big guys etc.

    The ideal goal of each of these "thingies" (cookies, flash objects, etc etc) is to pin down who visits, down to a unique user if possible.

    So just copy the Ghostery block list, maybe the AdBlock block list, your choice of a couple more tools.

    If you want a "market share per ad company" report then get one of those.

    There's something bothering me with your study design but it's not clear yet.

    --
    My first Journal Entry ever, in 8 years! http://slashdot.org/journal/365947/aphelion-scifi-fantasy-horror-poetry-webzine