Slashdot Mirror


Browser Detection of Website Statistics Services

An anonymous reader submits "David Naylor has reviewed the browser recognition capabilities of seven free website statistics services. Among other things, he kills the myth that many Opera users go unaccounted for in website statistics because of the default UA spoofing. Furthermore, he shows that the quality of browser recognition ranges from absolutely abysmal to near perfect."

25 comments

  1. What would be more interesting: by Anonymous Coward · · Score: 3, Interesting

    The same but with web server log analyzers. For example, I suspect Webalizer of not being accurate concerning user agents.

    1. Re:What would be more interesting: by ogre57 · · Score: 1

      Iirc from job 2 years back, depends on how you have it config'd. Yes, you can set it up very sloppily, thus easily spoofed. Otoh default setup on RH9 was maybe too accurate, in that it logged the mozilla's that came with RH, Debian, and SuSE on 3 separate rows even tho all were (supposedly) the same version.

  2. Nice work - some minor suggestions for 'ya by xmas2003 · · Score: 2, Interesting
    Nice analysis David. I'd personally love to see Analog (an oldie but googie) added to your table and my guess is that Urchin would be another popular one.

    As you know, one can easily spoof the User Agent (and Firefox makes this totally trivial) - any idea on what percentage of folks are doing this type of stuff? Too bad that Slashdot didn't put this on the front page, because then you could analyze that inbound traffic.

    P.S. FYI FWIW: using Analog, here's my browser percentages for Christmas/2004. I also have a Browser Info Page for those folks interested in seeing real-time what their browser is reporting.

    --
    Hulk SMASH Celiac Disease
    1. Re:Nice work - some minor suggestions for 'ya by MarkRose · · Score: 1

      I notice your stats don't list Konqueror -- is it really that rarely used? I love it... it's by far my favourite browser.

      --
      Be relentless!
    2. Re:Nice work - some minor suggestions for 'ya by naylor83 · · Score: 1

      Well, this test hasn't included log analyzers - it's simply hit statistics services. I don't have access to any server logs, so I'm not able to do such a comparison either.

    3. Re:Nice work - some minor suggestions for 'ya by DuncMan · · Score: 1
      I also have a Browser Info Page for those folks interested in seeing real-time what their browser is reporting.

      It's broken :-) . I'm in England, United Kingdom, not United States!

  3. MSN Explorer by joeljkp · · Score: 1

    I haven't noticed this in any stats I've viewed... does MSN Explorer use IE's UA string, or can it be counted on its own?

    It would be helpful for determining how many people out there actually want a candy-interface all-in-one browsing/email/chat experience.

    --
    WeRelate.org - wiki-based genealogy
    1. Re:MSN Explorer by naylor83 · · Score: 1

      That's a good question. Maybe I'll look into it. Do you have MSN Explorer installed?

    2. Re:MSN Explorer by joeljkp · · Score: 1

      Nope, sorry. I may look into it if I come across it, though.

      --
      WeRelate.org - wiki-based genealogy
  4. Thanks for the post by museumpeace · · Score: 2, Interesting

    I noticed that many blogs run stats/hitcounters and was wondering which one I should put on mine.

    --
    SLASHDOT: news for people who can't concentrate on work or have no life at all and got tired of yelling back at the TV.
    1. Re:Thanks for the post by naylor83 · · Score: 2, Informative

      Use statcounter - as far as I know it's the best, period. (It's also invisible!)

  5. Adblock extension interference? by Anonymous Coward · · Score: 1, Interesting

    I've heard it claimed that Adblock drops Mozilla stats for services that use web-bug-like things to track. Anyone have an idea of just how much this warps reported stats?

  6. Cannot be accurate by Anonymous Coward · · Score: 2, Interesting

    he kills the myth that many Opera users go unaccounted for in website statistics because of the default UA spoofing

    It's just not possible to build accurate statistics from observing HTTP traffic, and I wish people would stop trying. HTTP logs are good for one thing: performance tuning your server.

    Don't believe me? Here's a simple test. Fire up a traffic logger, Internet Explorer, and Opera. Go to a typical dynamic website in Internet Explorer, click on a link, and hit the back button. Now do the same in Opera.

    Check your logs. Notice how Internet Explorer made another request for the page when you hit the back button, but Opera didn't? That's because Opera follows the rules of RFC 2616 (the HTTP 1.1 specification), and Internet Explorer doesn't. It means that in this little experiment, the exact same actions in Internet Explorer accounted for 50% more requests than in Opera.

    That's just one of many, many ways in which HTTP traffic can differ wildly even if the browsers had equal market share. Whoever brandishes HTTP logs as any kind of evidence of market share (Firefox guys, I'm looking at you) is either unqualified to make such statements, or is being deliberately misleading.
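
    The arithmetic behind that experiment can be sketched as follows. The per-session request counts (3 for Internet Explorer, 2 for Opera) are taken from the description above rather than measured, and the audience of 100 users per browser is made up for illustration.

```python
# Requests each browser sends for: open page, follow link, press Back.
# Counts follow the comment above; they are not measured values.
REQUESTS_PER_SESSION = {
    "Internet Explorer": 3,  # re-requests the page on Back
    "Opera": 2,              # shows the Back page without a new request
}


def hit_share(users_per_browser):
    """Return each browser's share of total requests, as a fraction."""
    hits = {
        browser: users * REQUESTS_PER_SESSION[browser]
        for browser, users in users_per_browser.items()
    }
    total = sum(hits.values())
    return {browser: count / total for browser, count in hits.items()}


# Hypothetical audience: exactly the same number of users per browser.
shares = hit_share({"Internet Explorer": 100, "Opera": 100})
for browser, share in shares.items():
    print(f"{browser}: {share:.0%} of hits")
# Internet Explorer: 60% of hits
# Opera: 40% of hits
```

    With equal user counts, raw hits would still report a 60/40 split, which is the distortion being described.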

    1. Re:Cannot be accurate by naylor83 · · Score: 1

      When you say 'observing HTTP traffic', do you mean normal hitcounter services with code which you insert into your HTML? I'm a little confused.

    2. Re:Cannot be accurate by Anonymous Coward · · Score: 0

      Any observation of browsers, really. That's covering embedded Javascript, embedded images, analysis of Apache logs, etc. I've not heard of any type of monitoring that doesn't have showstoppers. The only way I can see of getting reliable browser usage statistics is out-of-band study, i.e. asking people on the street.

    3. Re:Cannot be accurate by Anonymous Coward · · Score: 0

      It means that in this little experiment, the exact same actions in Internet Explorer accounted for 50% more requests than in Opera.

      You can somewhat get around this by setting and reading a cookie with each request to track individual users. That way you can through out page views from the same user for the same page that happen within a minute of one another.
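
      A minimal sketch of that filtering idea, assuming each request has already been tagged with a visitor ID read from a cookie. The function name, the tuple layout, and the sample data are hypothetical; only the one-minute window comes from the comment.

```python
def count_page_views(views, window_seconds=60):
    """Count views, skipping a repeat of the same page by the same visitor
    that arrives within window_seconds of the last counted view.

    views: iterable of (timestamp_seconds, visitor_id, page), in time order.
    visitor_id stands in for the value of a tracking cookie.
    """
    last_counted = {}  # (visitor_id, page) -> timestamp of last counted view
    counted = 0
    for timestamp, visitor_id, page in views:
        key = (visitor_id, page)
        previous = last_counted.get(key)
        if previous is not None and timestamp - previous < window_seconds:
            continue  # treated as a reload or Back-button re-request
        last_counted[key] = timestamp
        counted += 1
    return counted


# Made-up sample: visitor "a" opens /index, follows a link, then hits Back.
sample = [
    (0, "a", "/index"),
    (10, "b", "/index"),    # different visitor: counted
    (20, "a", "/article"),
    (35, "a", "/index"),    # 35 s after the first view: skipped
    (200, "a", "/index"),   # a later visit outside the window: counted
]
print(count_page_views(sample))  # 4
```

      Visitors who refuse cookies have no stable ID, so they would have to be handled separately or left out.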

    4. Re:Cannot be accurate by Anonymous Coward · · Score: 0

      You can somewhat get around this by setting and reading a cookie with each request to track individual users.

      Surely you aren't suggesting that cookies are reliable?

      Even if they were, your suggestion would dissuade users of browsers like Lynx that prompt you for every cookie from visiting your site. Also, it would bias your statistics in favour of browsers that allow fine control over cookies and browsers that don't support cookies.

      That way you can through out page views from the same user for the same page that happen within a minute of one another.

      That's the second time I've seen somebody make that mistake today. It's "throw out", not "through out".

    5. Re:Cannot be accurate by naylor83 · · Score: 1

      I can't help thinking that that would give a serious over-estimation of IE users, since those who use the Internet a lot (and hence aren't running the streets all day long) more often use A Good Browser(tm).

    6. Re:Cannot be accurate by Anonymous Coward · · Score: 0

      Sorry, I meant "e.g." when I wrote "i.e.". Basically any normal way of surveying which computer applications are used. My "ask people on the street" example was merely to demonstrate what I meant by "out-of-band".

    7. Re:Cannot be accurate by PhilipMatarese · · Score: 1

      When measuring browser usage statistics, the number of hits isn't as important as the number of unique connections made. Even if IE met the RFC correctly, hits wouldn't be very telling. A person who opens every page on the site 1000 times each should count the same as a user who opens one page - they shouldn't count as 100,000 users.

      Measuring connections can have drawbacks, too. Did the user close and open their browser, or did a different user from inside the same proxy make a connection? Even so, it's a lot more telling than watching hits.
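
      The difference between hits and unique connections can be sketched like this. Using an IP address as the visitor key is an assumption made for illustration, and it has exactly the proxy problem mentioned above; the log entries are made up.

```python
from collections import Counter


def browser_counts(requests):
    """Return (hits, visitors) per browser.

    requests: iterable of (visitor_key, browser) pairs, one per request.
    visitor_key is whatever identifies a connection, for example an IP
    address; a shared proxy would merge several people into one key.
    """
    hits = Counter()
    seen = set()
    for visitor_key, browser in requests:
        hits[browser] += 1
        seen.add((visitor_key, browser))
    visitors = Counter(browser for _, browser in seen)
    return hits, visitors


# Made-up log: one Opera user reloads heavily, three IE users visit once.
log = [("10.0.0.1", "Opera")] * 1000 + [
    ("10.0.0.2", "MSIE"),
    ("10.0.0.3", "MSIE"),
    ("10.0.0.4", "MSIE"),
]
hits, visitors = browser_counts(log)
print(hits["Opera"], hits["MSIE"])          # 1000 3
print(visitors["Opera"], visitors["MSIE"])  # 1 3
```

      By hits Opera dominates this sample; by unique keys it is one visitor out of four.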

  7. Nice site name by archeopterix · · Score: 3, Funny
    Nice analysis David. I'd personally love to see Analog [analog.cx] (an oldie but googie) added to your table and my guess is that Urchin would be another popular one.
    I'm not going to click on anything that starts with "anal" and ends with ".cx".
  8. Spoof the spoofers by OhHellWithIt · · Score: 1
    To the best of my recollection, Internet Explorer spoofs Netscape by saying it's Mozilla in the HTTP User-Agent header, then adding parenthetical remarks that indicate it is really MSIE. So the notion of Opera pretending to be MSIE means that it has to pretend it's MSIE being Mozilla/Netscape. Microsoft introduced this spoofing because of some bug in web servers, I think, that caused MSIE to fail.

    I just wish that all the browsers would send what they really are, and that websites wouldn't deliberately kick out certain browsers because they aren't what the developers had in mind. (I know, this is kind of like asking for world peace, but heck, it's a start!)
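
    The layering described above is why a statistics service has to test tokens in a particular order. Below is a rough sketch, not how any of the reviewed services actually works. The function name is hypothetical and the sample strings are illustrative examples of the User-Agent format of that era.

```python
def classify_user_agent(user_agent):
    """Guess the browser family from a User-Agent string.

    Order matters: Opera may announce itself as MSIE, and MSIE announces
    itself as Mozilla, so the most specific token is checked first.
    """
    if "Opera" in user_agent:
        return "Opera"
    if "MSIE" in user_agent:
        return "Internet Explorer"
    if "Firefox" in user_agent:
        return "Firefox"
    if user_agent.startswith("Mozilla/"):
        return "Mozilla/Netscape (or compatible)"
    return "Unknown"


# Illustrative sample strings (assumed, not taken from the review).
samples = [
    "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)",
    "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; en) Opera 8.0",
    "Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.7.5) "
    "Gecko/20041107 Firefox/1.0",
]
for ua in samples:
    print(classify_user_agent(ua))
```

    This only works while the spoofing browser still leaves its own name somewhere in the string; a User-Agent that has been replaced entirely cannot be told apart from the real thing this way.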

    --
    "Who controls the past controls the future. Who controls the present controls the past." -- George Orwell
    1. Re:Spoof the spoofers by naylor83 · · Score: 1

      I know, this is kind of like asking for world peace, but heck, it's a start!

      Yeah, I agree ;-)

  9. AWstats by CMU_Ken · · Score: 1

    I'd rather just use AWstats locally than ask someone else for my site's traffic statistics.