Slashdot Mirror


New Method of Tracking UIP Hits?

smurray writes "iMediaConnection has an interesting article on a new approach to web analysis. The author claims that he is describing 'new, cutting edge methodologies for identifying people, methodologies that -- at this point -- no web analytics product supports.' What's more interesting, the new technology doesn't seem to be privacy intrusive." Many companies seem unhappy with the accepted norms of tracking UIP results. Another approach to solving this problem was also previously covered on Slashdot.

5 of 174 comments (clear)

  1. Step 4. . . by SpaceAdmiral · · Score: 5, Insightful

    We know some IP addresses cannot be shared by one person. These are the ones that would require a person to move faster than possible. If we have one IP address in New York, then one in Tokyo 60 minutes later, we know it can't be the same person because you can't get from New York to Tokyo in one hour.

    If my company had computers in New York and Tokyo, I could ssh between them in much less than 60 minutes. . .

  2. Still doesn't help deleted cookies by mattso · · Score: 5, Insightful

    They make some silly assumptions that I don't think work with users using proxy agents, but in the end it still boils down to the existence of cookies. Which would be ok, if the problem they are trying to solve wasn't that users are deleting and not storing cookies at all. They do mention using Flash to store cookies, which I suspect will have to be the next area users will have to start cleaning up. But just because cookies don't overlap in time and the IP address is the same doesn't mean it's the same person. A bunch of users that use the same browser and share an IP address that always delete their cookies with this system will look like one user. Vastly under counting. Which I don't think web sites are interested in. Vast over counting is profitable. Under counting, not so much.

    In the end there is no way they can even mostly recognize repeat web site visitors if the VISITOR DOESN'T WANT THEM TO.

    The big problem is stated at the top of the article:

    "We need to identify unique users on the web. It's fundamental. We need to know how many people visit, what they read, for how long, how often they return, and at what frequency. These are the 'atoms' of our metrics. Without this knowledge we really can't do much."

    If knowing who unique users are is that important they need to create a reason for the user to correctly identify themselves. Some form of incentive that makes it worth giving up an identification for.

  3. Tragically flawed by tangledweb · · Score: 5, Insightful

    The article's "Sky is Falling" tone rests on a single factoid. "30 to 55% of users delete cookies" therefore current analytics products are out by "at least 30 percent, maybe more".

    That is of course complete nonsense. Let's say we accept the author's assertion that different studies have given cookie deletion rates across that range. I can accept that a significant number of users might delete cookies at some point, but what percentage of normal, non-geek, non-tinfoil-hat-wearing users are deleting cookies between page requests to a single site in a single session? If it is 30%, then I will eat my hat.

    Most cookie deletion amoung the general populace will be being done automatically by anti-spyware software and is not done in realtime.

    The author clearly knows that even the most primitive of tools also use other metrics to group page requests into sessions, so even if 30% of users were deleting cookies, it would not result in a 30% inaccuracy.

    Of course "researchers propose more complex heuristic that looks to be slightly more accurate than current pracice" does not make as good a story as "paradigm shift" blah blah "blows out of the water" blah blah "We've been off by at least 30 percent, maybe more." blah blah.

  4. Paradigm shift ?!? by rduke15 · · Score: 5, Insightful

    When I read "paradigm shift" in the very first paragraph, my bullshit sensor sound such a loud alarm that it's hard to continue reading...

  5. Typical web analysis junk by Sinner · · Score: 5, Insightful

    About 20% of my time on my last job was spent doing web analysis. It drove me insane.

    The problem is with the word "accurate". To management, "accurate statistics" means knowing exactly how many conscious human beings looked at the site during a given period. However, the computer cannot measure this. What it can measure, accurately, is the number of HTML requests during a given period.

    You can use the latter number to estimate the former number. But because this estimate is effected by a multitude of factors like spiders, proxies, bugs, etc., management will say "these stats are clearly not accurate!". You can try to filter out the various "undesirable" requests, but the results you'll get will vary chaotically with the filters you use. The closer you get to "accurate" stats from the point of view of management, the further you'll be from "accurate" stats from a technical point of view.

    Makers of web analysis software and services address these problems by the simple of technique of "lying". In fact, a whole industry has built up based on the shared delusion that we can accurately measure distinct users.

    Which is where this article comes in. The author has discovered the shocking, shocking fact that the standard means of measuring distinct users are total bollocks. He's discovered that another technique produces dramatically different results. He's shocked, shocked, appalled in fact, that the makers of web analysis software are not interested in this new, highly computationally-intensive technique that spits out lower numbers.

    My advice? Instead of doing costly probability analysis on your log files, just multiple your existing user counts by 0.7. The results will be just as meaningful and you can go home earlier.

    --
    fish and pipes