Calculating Number of Users Based on Amount of Unique IPs?
pjdepasq asks: "I run a small but growing web site. Currently the site has optional registration (for the message boards), though we know we have a larger number of anonymous users. Is there an industry standard for calculating number of unique users based on the unique IP addresses over a period of time (1 week? 1 month?). We'd like to get a handle on the number of users we have. Sure, I know about dynamic IP addresses and ISPs like AOL which can dilute or confuse the numbers, but surely there's some benchmark calculation we can use."
But I'm told by people in the magazine business that the industry standard there is to assume that the number of readers is 5x the number of issues sold. Of course that will vary widely by magazine, but that's the ratio they all use when making readership claims in their rate cards.
This is exactly the question the original poster was asking, but for the web: everybody knows that getting an exact answer is impossible; he's just looking for a rule of thumb.
11.0010010000111111011010101000100010000101101000
One problem is that it would depend very much on the type of website and thus the type of users you had. If you have a B2B website, and most of your visitors are from companies, your (unique user):(unique IP) ratio will look very different to a site with mostly home visitors coming through large ISPs.
The industry seems to be more concerned with developing more and more reliable versions of the half-hour timeout metric. Of course, they're chasing the wind. (And furthermore, all the different versions of their metric are then not comparable -- see this study from Xerox PARC (PDF, 228kb).)
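For anyone who hasn't met the half-hour timeout metric: as I understand it, the rule is that hits from the same visitor count as one visit until there's a gap of more than 30 minutes. A minimal sketch, assuming your log is already parsed into (visitor_key, unix_timestamp) pairs; the function name and the choice of key (IP, cookie ID, whatever) are mine, not any standard:

```python
from collections import defaultdict

TIMEOUT = 30 * 60  # the "half-hour" cutoff, in seconds


def count_visits(hits, timeout=TIMEOUT):
    """Count visits in an iterable of (visitor_key, unix_timestamp) pairs.

    A new visit starts whenever the gap between two consecutive hits
    from the same key is longer than `timeout` seconds.
    """
    by_key = defaultdict(list)
    for key, ts in hits:
        by_key[key].append(ts)

    visits = 0
    for stamps in by_key.values():
        stamps.sort()
        visits += 1  # the first hit from this key always opens a visit
        for prev, cur in zip(stamps, stamps[1:]):
            if cur - prev > timeout:
                visits += 1
    return visits
```

Note this counts visits, not people, and the answer moves whenever you change the cutoff or the key, which is why numbers from different versions of the metric don't compare.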
I leave you with this thought from my essay How the Web Works:
11.0010010000111111011010101000100010000101101000
Exactly. All the people posting here going "cookies don't work because people turn them off" are on planet Slashdot.
HOWEVER - one thing that's not been mentioned yet is that if you use mod_usertrack to cookie your logs and you get a user who does not accept a cookie, it creates a stream of unique IDs - one for every request that user makes.
So - those people who turn their cookies off and go to Apache servers could be looking like hundreds or even thousands of users! Hooray!
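If that's what is happening in your logs, one rough way to see the damage is to separate cookie IDs that show up on two or more requests from IDs that show up exactly once. This is only a sketch, assuming you've already pulled the logged cookie ID out of each request line; the function name is made up:

```python
from collections import Counter


def split_cookie_ids(request_ids):
    """Split logged cookie IDs into (repeated, singletons).

    `request_ids` holds one cookie ID per logged request. An ID seen on
    two or more requests was sent back by the browser at least once, so
    that user accepted the cookie. An ID seen exactly once is ambiguous:
    it may be a one-page visitor or one of many IDs handed to a single
    user who refuses cookies.
    """
    counts = Counter(request_ids)
    repeated = {cid for cid, n in counts.items() if n >= 2}
    singletons = {cid for cid, n in counts.items() if n == 1}
    return repeated, singletons
```

The repeated set is a floor on your cookie-accepting users. The singletons can't be trusted as a user count without something else (IP plus user agent, say) to group them.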
G
"And the meaning of words; when they cease to function; when will it start worrying you?"
You can't do it.
First, there are too many corporations using NAT, and it's impossible to know how many people are NAT'ed. A company may have 100 employees, but only have 3 static IPs.
Second, there are many servers which have a ton of virtual hosts on them, each with their own IP. A server could have 20 or 30 or more IPs assigned to it; there's no way to know. Furthermore, a server could have multiple NICs, assigning different virtual hosts to different NICs, making it even harder to figure out.
-Cire
IP-based ratios won't work because AOL will fuck you over. I've seen the same users come from different IPs in their proxy space, plus their caching means some users will never make it to your page.
Some users have plug-ins that request the page from a second IP (NBCI's quick click anyone?) that will skew your numbers. Exact numbers will not happen, and ratios will vary widely based on your clientele.
Set cookies and go on your way.
You can always check whether the user has accepted the cookie or not: set it, then redirect from one page to another. If the user arrives at the second page with the cookie, it was accepted. You can then estimate the proportion of non-cookie users.
And if you base your stats on a short period of time, say 2 or 3 weeks, users clearing cookies will be a minority.
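The redirect check is easy to sketch with nothing but Python's standard library. This isn't tied to any particular server; the cookie name "probe" and the URL "/cookie-check" are placeholders I made up, and you'd wire the two functions into whatever handles your requests:

```python
from http.cookies import SimpleCookie

COOKIE_NAME = "probe"        # hypothetical cookie name
CHECK_URL = "/cookie-check"  # hypothetical second page


def start_probe():
    """Step 1: answer the first request by setting a cookie and redirecting."""
    cookie = SimpleCookie()
    cookie[COOKIE_NAME] = "1"
    cookie[COOKIE_NAME]["path"] = "/"
    headers = [
        ("Set-Cookie", cookie[COOKIE_NAME].OutputString()),
        ("Location", CHECK_URL),
    ]
    return 302, headers


def cookie_accepted(cookie_header):
    """Step 2: on the second page, check whether the cookie came back.

    `cookie_header` is the raw value of the request's Cookie header,
    or None/"" if the browser sent none.
    """
    cookie = SimpleCookie()
    cookie.load(cookie_header or "")
    return COOKIE_NAME in cookie
```

Count how often cookie_accepted() comes back False on the second page and you have your proportion of non-cookie users.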
IPs are misleading: every user behind a proxy server shows up as the same IP address. There could be thousands of users behind the same proxy.
MadCow.
I used to have a sig, but I set it free and it never came back.
I have no problem with people refuting my claims/advice, but you offer no reasoning whatsoever as to why your "method" is better than using cookies. I certainly DO live in the "real world", and the methods suggested using cookies would definitely provide simple, accurate, and objective measurements of unique users.
Do you not have experience using cookies? Do you not know what they do or how they work? Do you understand the problems with using IP addressing, as indicated in the cookie discussions above? Do you have any actual "substance" with which to argue those points?
I'll gladly submit to better ideas, if only you can show me the flaws in my own arguments and convince me of yours. Unfortunately, your post lacked any (possibly quite correct) details to support your claims.
MadCow.
I used to have a sig, but I set it free and it never came back.
There are too many confounding factors that can mess with your conclusions if you use IP numbers.
Depending on your target audience, the bias can correlate with geography (people in areas without vibrant independent ISP culture are more likely to use services like AOL that run users through proxies, and some countries - such as New Zealand - have almost everyone behind them), it can correlate with specific large institutions (if half your market is Time-Life, you'll only get one unique IP in the logs), it can correlate with almost anything.
Cookies are easy and painless and the small number of crackpots who are afraid of them are more likely to cut evenly across various demographics than are the entanglements created by assumptions about IP-human correspondence.
"Patriotism is your conviction that this country is superior to all other countries because you were born in it." -- GBS
No, you don't have to, but the cost to you is nil in terms of time and disk space and any other resources, so in the general case there's no particular reason not to unless you get intrinsic reward from being a curmudgeon.
But you can count the number of cookie-refusers and account for them by extrapolation. This way the confounding effects of IP-based tracking only affect the minority of your users who won't accept your ginger snaps.
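The extrapolation is just a scale-up. A back-of-envelope sketch, assuming cookie-refusers visit about as often as everyone else (a big assumption) and that you measured the refusal rate on a sample of visitors; all the names here are made up:

```python
def estimate_total_users(cookie_users, sampled, refused):
    """Scale a count of cookie-identified users up to cover refusers.

    cookie_users: unique users counted by cookie
    sampled:      visitors on whom cookie acceptance was checked
    refused:      how many of those visitors refused the cookie
    """
    if sampled <= 0 or not 0 <= refused < sampled:
        raise ValueError("need 0 <= refused < sampled")
    # Same as cookie_users / (1 - refused / sampled), written to stay exact
    # for as long as possible before dividing.
    return cookie_users * sampled / (sampled - refused)
```

So with 900 users counted by cookie and 100 refusers out of 1000 sampled visitors, you'd estimate about 1000 users in total.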
"Patriotism is your conviction that this country is superior to all other countries because you were born in it." -- GBS
The question says "we know we have a larger number of anonymous users" [than people who register]. A majority of the people who turn off cookies are going to be in this group, and there's no incentive to keep the cookie even if you got people to accept it.
However, there is an easy way to associate IP address with "real human". Put the IP address in the cookie. Everyone loves that.