Hits or Misses: Who is Your Website's Audience?
securitas writes "The Christian Science Monitor's Gregory M. Lamb wrote a story interesting to anyone who runs a website: How do you accurately and reliably measure the audience for your website? From the article: 'Most websites have no idea how many people view their content. This inherent fuzziness is causing problems for commercial websites, especially online publications desperate to make money from Internet advertising... How can you charge for ads when it's nearly impossible to tell advertisers how many people will see them?' The article discusses the flaws and problems with Nielsen/NetRatings and comScore Media Metrix - they grossly undersample workplace users - and the rise in the number of sites requiring user registration."
By height.
I always just set a cookie with a tracking ID, and then use that to keep track of the anon user. Counting the number of tracking cookies given out each day, and how long they were used for, seems to work sufficiently for me... or is there some problem with that I don't know about?
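A minimal sketch of the counting described above, assuming each hit has already been reduced to a (day, tracking ID) pair. The record layout, the function name and the sample data are all made up for illustration; nothing here is the poster's actual code.

```python
from collections import defaultdict
from datetime import date


def daily_unique_visitors(hits):
    """Count distinct tracking-cookie IDs seen on each day.

    `hits` is an iterable of (day, tracking_id) pairs (hypothetical layout).
    """
    ids_by_day = defaultdict(set)
    for day, tracking_id in hits:
        ids_by_day[day].add(tracking_id)
    return {day: len(ids) for day, ids in ids_by_day.items()}


# Made-up sample data: two browsers on day one, one returning on day two.
hits = [
    (date(2004, 6, 1), "a1"),
    (date(2004, 6, 1), "a1"),  # same browser, second page view
    (date(2004, 6, 1), "b2"),
    (date(2004, 6, 2), "a1"),
]
counts = daily_unique_visitors(hits)
```

This counts browsers, not people: a reader who deletes cookies, or who reads from two machines, still shows up as two IDs.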
Call me oblivious, but wasn't this one of the reasons why cookies were created?
Hmmm.
Isn't that why most online advertising is priced on a 'per click' basis?
"How can you charge for ads when it's nearly impossible to tell advertisers how many people will see them" --- These people use access logs??
I don't know what the real strategy of most online newspaper websites is, but they seem to follow this pattern:
1. Make content available online, free of cost
2. Wait for people to start using and monitor the growth in number of hits
3. Reduce the website response to a crawl with mind-numbing popups, Flash ads, QuickTime ads, and generally anything that would make sure the user "spends" more than a few minutes on the homepage
4. Wait for most users to go away to some other website.
5. The brave few who remain - force them to register to read any of the content, since you want to chart your users by demographics.
6. Finally, now make most of the content premium - based upon the data collected in step 5, however inaccurate it is. Flood the site with more ads, if possible
7. Moan and bitch that there is no revenue generated.
8. Repeat cycle
http://efil.blogspot.com/
may think their audience is a bunch of nerds, but in reality it's a bunch of suave playboys that get to have sex with many hot women. I suggest they make the appropriate content changes.
This is completely backwards. In fact, it's exactly the opposite. It's quite simple to tell how many people view your webpage, and a hell of a lot easier (and more accurate) than with radio or TV.
This is the source of the problem with web advertising: your numbers are fairly accurate and based on actual events, not some statistically questionable sampling method. There's little room for fudging.
Demographics on the other hand are a little more complicated. There, you actually have to ask.
---
Less Talk, More Beer.
I suppose the CSM is about to discover how many slashdotters view the content of this website...
Mandatory reg. puts people off using the site in the first place (Why register if you can see the content? If you can't see the content, who knows if it's worth registering for?).
IP addresses are half the problem (everyone behind one company firewall looks like 1 user).
Cookies are ok so long as your users are ok with you "tracking their browsing habits".
It's a tricky puzzle...
http://twitter.com/onion2k
I would put a CGI page counter at the bottom of every page. I think the one with flame numbers works the best for this, but the digital-looking one also works well.
Great ideas often receive violent opposition from mediocre minds. - Albert Einstein
I don't care who looks at my site as long as my statistics page reports more than just me.
Anyway, the exact numbers don't really tell you anything. You really need to know the differences between two sub-populations (are visitors from pay-per-click ads or visitors from standard search results more likely to buy?). A program which makes this sort of comparison easy will give you far more insight than one which tries to get the total number of visitors closer to some mythical "true" number.
(I am the author of analog and CTO of ClickTracks, but I'm writing in a personal capacity).
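A rough sketch of that kind of sub-population comparison, assuming you already have visitor and buyer counts per traffic source. The two-proportion z-score is one standard way to judge whether the gap is bigger than noise; the function names and the counts are invented for the example, and this is not how analog or ClickTracks does it.

```python
import math


def conversion_rate(buyers, visitors):
    """Fraction of visitors in a segment who went on to buy."""
    return buyers / visitors


def two_proportion_z(buyers_a, visitors_a, buyers_b, visitors_b):
    """z-score for the difference between two conversion rates (pooled)."""
    p_a = conversion_rate(buyers_a, visitors_a)
    p_b = conversion_rate(buyers_b, visitors_b)
    pooled = (buyers_a + buyers_b) / (visitors_a + visitors_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / visitors_a + 1 / visitors_b))
    return (p_a - p_b) / std_err


# Invented counts: pay-per-click visitors vs. standard search visitors.
ppc_rate = conversion_rate(30, 1000)
search_rate = conversion_rate(50, 1000)
z = two_proportion_z(30, 1000, 50, 1000)
```

Both segments carry the same counting errors, so the comparison holds up even when the absolute visitor totals are off.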
11.0010010000111111011010101000100010000101101000
I found this article to be rather insightful. I personally run a small IT/science-news site (in Finnish) and I'm really having a hard time figuring out who visits the site. Of course I can get some data from the log-analysis software (awstats and webalizer are being used for the site) but it really doesn't tell me what I want. It seems that the website logs don't always tell the truth. For example, I'm getting about 20-30 hits a day with a referrer pointing to a site that's a search engine for blogs (${god} knows why the site has been tagged as a blog), but browsing through the actual logs reveals that the hits belong to that site's indexing robot, which is a little too enthusiastic.
The most reliable way to find out about the visitors to a given site would be a user survey. It would not be complete, since not everyone would fill it out, but it would give an idea of the habits of your most frequent visitors. If I were an advertiser, I would be interested in more than just the number of hits and visits, and most advertisers would be baffled by stuff like "we got XXXYYYZZZ HTTP requests last month". Personally I would prefer to advertise on sites with a well-built sense of community and an active userbase that's keen to interact with the website. When I browse a site for the first time, or a site that I visit infrequently, I rarely click on banners or ads. I'm more prone to clicking ads on sites which I visit daily or so: it gives me a feeling of supporting the site I like, and I just might buy something from the advertiser if they are offering something that I need. Focused advertising is therefore the key, and for that, again, you need to know your users.
Logs tell you numbers but you need the visitors themselves to tell you who they really are and how often they visit your site.
Your first line is that advertisers shouldn't care how many people visit... but then you go on to talk about how you increase traffic to your own website.
If your site uses an ad-supported business model, you (and your advertisers) should care how many people are visiting your site. Advertisers want to spend their money somewhere that they know will be seen.
The Super Bowl charges more for a 30-second spot than your local cable channel; that's because of the sheer number of people that will be watching. If you (and your advertisers) know how many people are visiting the site, then you can put some numbers to your business model - and that's a smart way to run a business.
...amaze me. I recently helped a friend put together a website for his bakery. Why did he want a website? Because it was something to do that he hadn't done before. Will it drive customers to his place? I doubt it; most small companies like that survive on local ads and word of mouth. I guess my point is that I am still, after all this time, doubtful when it comes to the accuracy or usefulness of ads or sites based on visits, click-throughs, etc. I don't think knowledge of the availability of a product is enough; a site must be informative and interactive above and beyond what other forms of advertising can do. While some companies do a great job of this, too many others are like my friend's site---little more than a billboard.
Don't be a looter...and yes, I know that it's spelled with an "A" instead of an "E".
Alexa's model is interesting - they hand out a "free" toolbar that gives you Google search, as well as pinging Alexa and showing you every page's Alexa rank.
Unfortunately, the toolbar also slows down your browsing (especially if you're on dialup). And the more tech-savvy a user is, the less likely they are to want that toolbar on their system. Thus tech sites are going to be depressed in those rankings, always.
Alexa also can't tell a subdomain from a regular domain - so subpages of IGN.com or UGO wind up just increasing IGN or UGO's rank, and blogs hosted at X.BlogHost.Com just raise BlogHost.com's rank without being able to tell what the particular blog's rank might be.
Finally, the biggest flaw in Alexa's ranking system is that it's based on voluntary input; rather than finding 'Net users and trying to get a representative sample (which is the goal of the Nielsen TV setup), they take anyone who'll put in their toolbar. Sure, they can get a pretty large number of idiots to install the thing, but they're still idiots - there are demographics that the toolbar just won't get adopted by in that fashion.
The other sad thing is, there are companies that use Alexa's page rankings to decide how much they'll pay for ads. Go figure.
I use webalizer, cookies, and two stats packages for my CMS (geeklog). Only the admin has privileges for one of the stats packages, which gives me very detailed, accurate info such as time, IP, which page was viewed, referrers, UID (user ID), links followed, country, browser, platform, etc. All open source. Does the job for me.
"If the facts don't fit the theory, change the facts." -Albert Einstein
Karma? There's a serial modder out there.
The CSM is essentially secular. See the 'about us' pages. Seems that the naming of the CSM was a rather unpopular move by the paper's creator, Mary Baker Eddy - the rest of the staff didn't seem to want to call it that, since it's not really Christian at all...
---
"I did nothing. I did absolutely nothing and it was everything that I thought it could be."
Tracking unique visitors?
Not that hard if a small margin of error is OK.
Charging for ads when you don't know how many page views you will get?
What about CPM (cost per 1k impressions) rates? Want 10k impressions? Pay for 10k impressions.
Target demographics?
How about tracking which article topics are popular, how many return readers each topic gets, etc.?
These are not that hard to do with the right people. The guy who writes the "techie column" in many cases is not the right person.
I guess if you think like a newspaper, you end up with these problems seeming impossible to figure out.
Have I lost my marbles, or is this really not that hard?
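For what it's worth, the CPM arithmetic is a one-liner. A quick sketch, where the $5 rate is a made-up number purely for illustration:

```python
def cpm_cost(impressions, cpm_rate):
    """Price of an ad buy at `cpm_rate` dollars per 1,000 impressions."""
    return impressions / 1000 * cpm_rate


def impressions_for_budget(budget, cpm_rate):
    """How many impressions a given budget buys at that rate."""
    return int(budget / cpm_rate * 1000)


# Hypothetical rate: $5 per thousand impressions.
cost = cpm_cost(10_000, 5.0)
bought = impressions_for_budget(50.0, 5.0)
```

The advertiser pays for impressions actually served, so the site doesn't need to predict its traffic up front; it only needs to count what it delivers.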
-Pete
Soccer Goal Plans
Their ISP killed their account after 3 reported strikes.
Then there's em3.net, a scumware site that tried this last year. Following the links triggered attempted spyware downloads.
(If anyone is truly interested, I have a partial list at http://idunno.org/misc/referralSpammers.aspx)
Read the article. They are complaining that one user may read the content from work and from home, and so count as two users. One might also point out that sometimes two people may use the same computer, and only count as one person.
My wife and I both read the same article/section in the newspaper we got yesterday, even though we only got a single paper. (We "logged" 1 impression even though 2 were made.)
I understand that is the opposite of what you suggest, so...
Not only that, but we had some sections delivered to us that we (gasp!) threw out without even reading even though we may have been part of the target demographic. (We "logged" 1 impression even though 0 were made.)
And the web is different how?
-Pete
Soccer Goal Plans
After careful review of our target audience, we have begun work on our new bulk Prozac and Lithium banner ad campaign.
I find the value of web logs lies more in the relative growth of traffic over time, or in the differences from section to section. Since one can assume relatively the same degree of error each month (e.g. 2 users on the same computer, 1 user on 2 computers, etc.), you can gain a lot of information just by comparing logs over time. The same goes for section by section. If your web site has 5 distinct sections, you can compare them against each other and then over time. Advertisers like to know absolute numbers, but being able to tell them that they'll get 2x the exposure by advertising on a particular portion of your site, and that this section likely attracts a certain type of visitor, is very valuable. At the least it gives you some solid direction about what your users want, so you can build a better site and eventually get more ad revenue from it.
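Here is a small sketch of that relative comparison, assuming you have already pulled per-section monthly visit counts out of your logs. The section names and numbers are invented; the point is that a roughly constant counting error mostly cancels out of the ratios.

```python
def relative_growth(previous, current):
    """Fractional change from one period to the next (0.25 means +25%)."""
    return (current - previous) / previous


# Invented monthly visit counts per section: [last month, this month].
monthly_visits = {
    "news": [4000, 5000],
    "reviews": [2000, 2000],
}

growth = {
    section: relative_growth(counts[0], counts[1])
    for section, counts in monthly_visits.items()
}

# How much more traffic one section gets than another this month.
news_vs_reviews = monthly_visits["news"][1] / monthly_visits["reviews"][1]
```

With these numbers the news section grew 25% and draws 2.5x the traffic of reviews, which is the sort of figure you can hand an advertiser without claiming an exact head count.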
and they could call it metamoderation? Yeah, they should implement that.
My user number is prime. Is yours?
I'm not going to click on your banner. Nope. Not a chance. Not happening.
It's not that I'm not interested in your product. Online adverts I see actually tend to be:
1. Something unavailable to me (wrong country).
2. Something of no interest to me.
3. Something I own already (this happens a _lot_ with Gamespy).
But that's not the point. The point is, I'm at the web site because I'm looking for something, and it's probably not your product. When watching TV, I never watch an advert, and immediately decide to research/buy that product. At best I'll make a mental note to have a look out for information on it later, in most cases I won't think about it until I'm looking for that kind of product, at which point I'll probably remember your advert.
An example might be easier. I frequently see adverts for car insurance. I don't drive, for a variety of reasons, but if I was going to learn and buy a car, I'd probably start calling around the companies whose names I remembered from adverts. Well, actually I'd Google for a comparison site, but let's pretend I'm too lazy to do that, okay?
Oh, also, pop-ups/unders are a really good way of persuading me to avoid your company, your advertiser, and whatever site I got the pop-up/under from.
Who cares about demographics? We're trying to figure out what people's interests are, what types of ads they'll respond to.
Well, duh. If a visitor looks at the sports pages during work hours, you have a fair deal of information about that person already. Isn't that already enough to serve up ads that would likely be relevant?
If these dead-tree publishers of yesterday's news got a clue, they might also realize that web-ads are actionable, and actions can be counted. Do people click on the ads? Do they generate leads or sales? There's this interesting industry called affiliate marketing they should look into (my guess is they'd make good money off personals and job ads).
What they read, when they read it, and what ads they want to learn more about. WTF more do they need?
Information: "I want to be anthropomorphized"
For those who haven't figured it out already, the web is not an advertising medium. Yes, you can find people who will pay for advertising, but it's a peripheral and unimportant element of the service.
Hasn't the dot-com-bust taught us anything? Revenue models based on advertising are not going to work except for the rare few who have market share and a steady stream of gullible businesses that want to cheat and try to buy an audience instead of building one.
Anyone who needs to know how many people are on his/her site, and their nature, will already know, and will already have things in place to measure and qualify this. The most obvious of these is sales of their products/services. Traffic reports are amusing but otherwise irrelevant unless you're in the business of selling traffic reports (like Nielsen - another bottom feeder that is providing a crutch to businesses in an effort to perpetuate the myth that online advertising is worthwhile).
This is all about earning the right to count your visitors.
There is NO WAY I am going to spend time giving up my privacy and demographic information if the site has not earned the right to waste my time.
When you walk into any store in the mall there is a small laser that is counting foot traffic. Each person or closely walking couple breaks the beam once to enter and once to exit. It isn't precise, but it is close enough, and further, the store EARNED THE RIGHT to count visitors because there is a reward - viewing the merchandise. Plus, there is a very low cost (exposure to a low-powered laser).
Compare this to a website that would require you to fill out a form, presumably with valid info (the article mentions 90210 as the most popular zip code on the web), and THEN you get to see the content. No thanks. Potentially valuable content is not worth the bother.
Now if there was some technology that would allow you to store this reader profile, and it would be transmitted when you visited a website without the need to fill out a form, I bet some people would use it.
But no one wants to give their driver's license to the Gap store clerk before entering, and there will never be a time when, no matter how valuable it would be for a web site owner, people provide valid, accurate data on who they are to view site content that has not earned the right to ask for that information.
I only came here to do two things; kick some ass, and drink some beer...looks like we're almost out of beer.
With a relatively compact bit of javascript embedded into a page, the user gets hopefully relevant ads that are not obtrusive or flashy, same as the Google Adwords text-only ads you see on the right side of the Google results pages. And you can customize the colors and format to suit your own pages. Google, while they do serve the ads based on your site's content, do allow you to prohibit certain keywords, so you can block out competitors' ads.
To make it useful to the host, Google allows you to create "channels", so within one AdSense account you can track different pages. You can get a detailed report of how many pageviews each channel generates, as well as click-thrus (which of course leave your site).
To sweeten the deal, you get paid for click thrus. That means you get paid when someone leaves your site, but my philosophy is that if they do that, they weren't planning on sticking around anyway, so I might as well profit from it.
In my case, my site generates about 3000 pageviews and 15 clickthrus, and that translates into about $1 a day in revenue. It's not much, but I roll that back into the Google AdWords campaigns that I run, which generate inbound traffic. I'd rather have people coming to my site that want to be here, than those that don't, so I see it as a fair trade.
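To put those figures in the usual ad-metric terms, here is a back-of-the-envelope sketch. It assumes the pageview and click-thru counts are per day, like the revenue, which is my reading and not something stated outright; the function name is made up.

```python
def adsense_summary(pageviews, clicks, revenue):
    """Derive click-through rate, revenue per click and effective CPM."""
    return {
        "ctr": clicks / pageviews,                # clicks per pageview
        "revenue_per_click": revenue / clicks,    # dollars per click
        "ecpm": revenue / pageviews * 1000,       # dollars per 1,000 pageviews
    }


# Figures from the post, assumed to be daily: 3000 pageviews, 15 clicks, ~$1.
summary = adsense_summary(pageviews=3000, clicks=15, revenue=1.00)
```

That works out to a 0.5% click-through rate, a bit under 7 cents per click, and an effective CPM of about 33 cents.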
And in the end, the reporting and tracking are handled by Google, and provide a tangible benefit to my business.
Oh, and if you want to see an example in operation, look at the very bottom of our site's main page.
--Brandon / Split Infinity Music
Since the above was written I discovered that a common practice of sysadmins and help desks is to suggest manually deleting all cookies (since you can't do it selectively with MS-IE) to get over site bugs. And now the increasingly popular spyware removal tools (e.g., Spybot) remove third-party cookies used just to count unique visitors, in the name of removing spyware and viruses from your computer.
Originally I thought of defining a visitor for HTTP domains as the cookie if it exists, and the client IP address otherwise. The flaw in this is that it will double-count first-time HTTP visitors: once for the log line of their first hit, which has no cookie, and again for the subsequent hits. With streaming logs, using the GUID (effectively a cookie these days) together with the client IP address is more useful as a unique-visitor key. The log lines in streaming are actually the summary of a sequence of request/reply transactions, so the first "hit" log line does have a GUID/cookie logged.
What follows is additional research I turned up:
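The double count is easy to reproduce. A minimal sketch, assuming each log line has been parsed into a dict with `ip` and `cookie` fields (a hypothetical layout; the addresses and cookie value are made up):

```python
def visitor_key(hit):
    """Cookie value if the request carried one, otherwise the client IP."""
    return hit["cookie"] or hit["ip"]


# One new visitor making three requests from the same address.
hits = [
    {"ip": "192.0.2.10", "cookie": None},      # first hit: no cookie set yet
    {"ip": "192.0.2.10", "cookie": "abc123"},  # browser now returns the cookie
    {"ip": "192.0.2.10", "cookie": "abc123"},
]

naive_visitors = {visitor_key(hit) for hit in hits}
```

The set ends up holding both the IP address and the cookie value, so one person is reported as two visitors.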
says: `` A visitor is defined as "a unique IP address with heuristic." To properly account for visits, the Web site needs to identify a "visitor" so that visitor activity is properly tracked. Registration and/or cookies are the best way to track a visitor's activity through the Web site. Unfortunately, a lot of Web sites do not require registration, nor do they use cookies [and browsers can disable cookies]. If cookies are used, it is the clients' responsibility to provide the auditor with details on how the server sets the cookie, the cookie format and how the cookies are used. An alternative that has been suggested is to use the IP address AND user-agent in combination, to identify a unique visitor. The interaction with the site by this "visitor" is then analyzed to determine the number of visits which should be recorded. Using only the IP address to identify a visitor is not acceptable due to the number of visitors that may not be accurately reported because they are operating behind a proxy server or firewall. ''
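A small sketch of the IP-plus-user-agent alternative mentioned in that quote, compared against counting by IP alone. The parsed-hit layout, the helper name and the sample requests are assumptions for illustration, and the "heuristic" for splitting a visitor's activity into visits is not attempted here.

```python
def count_visitors(hits, key):
    """Number of distinct visitor keys among the hits."""
    return len({key(hit) for hit in hits})


# Made-up requests from two different browsers behind one proxy address.
hits = [
    {"ip": "203.0.113.5", "user_agent": "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)"},
    {"ip": "203.0.113.5", "user_agent": "Mozilla/5.0 (X11; Linux i686) Gecko/20040510"},
    {"ip": "203.0.113.5", "user_agent": "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)"},
]

by_ip = count_visitors(hits, lambda hit: hit["ip"])
by_ip_and_agent = count_visitors(hits, lambda hit: (hit["ip"], hit["user_agent"]))
```

Counting by IP alone sees one visitor; the combined key sees two. It still undercounts an office where everyone runs the same browser build, so it narrows the proxy problem without removing it.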