Why Good Data Can Be Hard to Find Online

Posted by ScuttleMonkey on Friday April 18, 2008 @01:14PM from the still-don't-trust-alexa dept.

WSJdpatton writes to mention that Carl Bialik has an interesting look at why good data can be hard to find, much less understand, online. He cites two examples: Google's first-quarter performance numbers and Alexa's revamp of its number-tracking process. "Now Alexa is incorporating other sources of data -- though it says the prior ranking 'wasn't wrong before, but it was different.' Some sites saw big changes in their rankings following Alexa's move: The tech blog TechCrunch said it fell far from its prior position in Drudge Report territory (rarefied air in Web-traffic terms). On Friday afternoon, Drudge Report ranked 545th, compared with TechCrunch's ranking of 1,784th, according to Alexa's new math."

Alexa? No. by Slashdot+Suxxors · 2008-04-18 13:17 · Score: 4, Informative

This isn't exactly on topic, but I think you should give it a read before you form a final opinion on what the article is trying to say.
1. Re:Alexa? No. by Firehed · 2008-04-18 14:33 · Score: 4, Informative
  
  Maybe relative tracking can't be done by simple means since it requires participation on everyone's part, but absolute local tracking is trivially easy on any server that supports server-side scripting and has some sort of database access. A couple lines of code at the bottom of your page to insert a new row on a page load and you've got nearly perfect visitor logs that can easily go beyond your standard server logs.
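
As a rough illustration of the "insert a new row on a page load" idea, here is a minimal sketch in Python using the standard-library sqlite3 module. The table name `page_views`, its columns, and the helper names `open_log`, `log_page_view`, and `views_per_path` are all made up for this example; a real site would call something like `log_page_view` from whatever server-side handler renders the page.

```python
import sqlite3
from datetime import datetime, timezone


def open_log(db_path=":memory:"):
    """Open the database and create the hypothetical page_views table if needed."""
    conn = sqlite3.connect(db_path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS page_views ("
        " id INTEGER PRIMARY KEY,"
        " path TEXT NOT NULL,"
        " ip TEXT,"
        " user_agent TEXT,"
        " viewed_at TEXT NOT NULL)"
    )
    return conn


def log_page_view(conn, path, ip=None, user_agent=None):
    """Insert one row per page load (assumed to be called by the page handler)."""
    conn.execute(
        "INSERT INTO page_views (path, ip, user_agent, viewed_at)"
        " VALUES (?, ?, ?, ?)",
        (path, ip, user_agent, datetime.now(timezone.utc).isoformat()),
    )
    conn.commit()


def views_per_path(conn):
    """Return a dict mapping each path to the number of logged page loads."""
    rows = conn.execute("SELECT path, COUNT(*) FROM page_views GROUP BY path")
    return dict(rows.fetchall())
```

Because every load becomes a row, absolute counts per page fall out of a single GROUP BY query. Nothing here says anything about other sites, which is the relative-tracking limitation mentioned above.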
  
  Again, useless for relative popularity unless you have everyone's data. But it still tells you how popular your site is which is great for ego boosting and advertiser stats if nothing else.
  
  (I'd suggest that Google Analytics is going to be a lot more useful in the long run and at least has the potential to provide relative data in addition to the absolute, but anything that relies on client-side scripting is going to give less accurate numbers since clients can disable or screw around with scripting)
  
  --
  How are sites slashdotted when nobody reads TFAs?
Another Example: Hitslink by Anonymous Coward · 2008-04-18 14:01 · Score: 1, Informative

Another example besides Alexa of "readjustment" is Hitslink. Last November, they revised their figures for OS share for March through October 2007. Linux went from a reported .81% share in October, to .50%. They made only a brief allusion on their site to filtering out "unrepresentative" hits from their data. Recently, they again revised their Linux share for January 2008, from the original .67% to .64%. Even though Hitslink seems to have trouble deciding how many Linux users there are, that doesn't keep people (like Westlake, who keeps posting Hitslink numbers on Slashdot) from citing them.
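
To put the size of those two revisions in perspective, a quick back-of-the-envelope calculation in Python, using only the four share figures quoted above (the function name `relative_revision` is just for this sketch):

```python
def relative_revision(original, revised):
    """Fractional change from the originally reported share to the revised one."""
    return (revised - original) / original


# Linux share figures (in percent) as quoted in the comment above.
october_2007 = relative_revision(0.81, 0.50)
january_2008 = relative_revision(0.67, 0.64)

print(f"October 2007 revision: {october_2007:.1%}")  # -38.3%
print(f"January 2008 revision: {january_2008:.1%}")  # -4.5%
```

So the first revision cut the reported Linux share by more than a third, while the second was a much smaller adjustment.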