Unique Visitors = 1/10th of Unique IPs?
Max Fomitchev submitted a little blog entry where he proposes that the ratio of unique IPs to actual unique users is 10:1. This flies in the face of the numbers you usually see attached to these sorts of things. I'm not sure about the logic he uses to come up with these numbers either.
The 10 was a hypothetical...the only point was that you can't trust the number of recurring visitors that a site reports, because the users come back with a different IP (obvious) and get counted twice. Couldn't one use cookies and IPs in combination to get a better gauge? The IP may change but the cookie would not. Sure, some may delete it, but it'll still improve accuracy at least a little bit.
IPs = Visitors*(DSL_FRACTION*VISITS_PER_TIME_PERIOD + CABLE_FRACTION)
If DSL_FRACTION = CABLE_FRACTION = 0.5 and VISITS_PER_TIME_PERIOD = 10, then we get
IPs = 5.5*Visitors or Visitors = IPs/5.5
That's the formula he uses to get the number of IPs. No idea if the math is right; anyone who has an idea, please enlighten me.
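For anyone who wants to poke at the numbers, here is a minimal Python sketch of that formula. The function names are made up for illustration, and VISITS_PER_TIME_PERIOD = 10 is an inference: with both fractions at 0.5 it is the only value that yields the 5.5 factor quoted above.

```python
def estimated_ips(visitors, dsl_fraction, cable_fraction, visits_per_period):
    """IPs = Visitors * (DSL_FRACTION * VISITS_PER_TIME_PERIOD + CABLE_FRACTION)."""
    return visitors * (dsl_fraction * visits_per_period + cable_fraction)


def estimated_visitors(ips, dsl_fraction, cable_fraction, visits_per_period):
    """Invert the same formula to go from an IP count back to visitors."""
    return ips / (dsl_fraction * visits_per_period + cable_fraction)


# Assumed inputs: a 50/50 DSL/cable split and 10 visits per period.
print(estimated_ips(1, 0.5, 0.5, 10))          # 5.5 IPs per visitor
print(estimated_visitors(5500, 0.5, 0.5, 10))  # 1000.0 visitors
```

Note that the formula has no term for several visitors sharing one IP, so it can only ever push the IP count above the visitor count.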
So, he's saying my website has 1/10th of a visitor?
This guy's the limit!
I help keep this in balance by using my neighbor's wireless, that IP has a load of unique users.
Isn't this what cookies are for? Sure lots of people wash them. But I'm betting the majority of people do not.
I've hosted several servers from home for years at a time without my dynamic IP address ever changing, and I've known many others in the same situation. I think this 10x rule might be a bit extreme...
I think anyone in the tech sector knows this already. It's pretty common knowledge that people surf from multiple IP addresses. How about a solution instead? How about adding MAC addresses to stats, or modem IDs (whatever their equivalent of a MAC is), and having web browsers send that data too? Then stats applications would have something more unique than an IP address to identify a customer by.
So if I access a site at work and then access it at home, that's two IP's (two different computers and two different ISPs) but one user. Seems a bit high to say 10 but I could see 2 - 3...
I'm not clever enough for a sig...
"Presuming that the latter have static IPs the former draw different IP from some pool each time they connect." I have dsl and according to dyndns.org, my dynamic IP host, my ip was last updated on April 13. Just because someone is using PPPoE doesn't mean that each time they visit your site they will have a new ip.
I guess it depends on the definition of "visitor". Maybe the site tracks unique IP addresses, or maybe it ignores them where the client has a permanent cookie installed.
However, I can attest that my ADSL connection is pretty DHCP heavy. Sometimes my IP won't change for weeks, but I've had 5 or 6 IPs in 24 hours on several occasions.
Actually, the article is about exaggeration of the number of visitors. The number 10 is just a vague, unsubstantiated figure given in the blog entry, without supporting data, to illustrate the magnitude of that exaggeration.
I do not believe in karma. "Funny"=-6. Do good and forbid evil. Yours, Oft-Offtopic Flamebaiting Troll.
Cookies are a much better indicator of what browser you are communicating with.
Also, most spiders don't bother with cookies, so that's another way to tell something isn't a real user.
Unfortunately, some users disable cookies. And then all you can do is fall back on their IP address.
It would be nice to see cookie-tracking support in Open Source stats engines like awstats.
Bruce
Bruce Perens.
Can he find a formula for the number of /. articles posted vs. the actual unique articles?
Both a grammatical and a math/logic error in the first sentence!
I think the moral of the story here is that you can glean no true information as to how many visitors your site really has by unique IPs. This convenient unique visitors = 1/10th of unique IPs idea is no more accurate than simply assuming each new IP is a new visitor. There will be people who visit your site from 10 different locations and thus 10 different IPs, and there will be whole families on one IP visiting your site. Or perhaps one of those 10 different locations one person uses is used by others. Since we have no way of knowing these things, any sort of formula is as inaccurate as the next.
You're right, I wouldn't steal a car. But if it were possible, I sure as hell would download one!
I see what the guy is saying - dynamically assigned IPs at clients mean that one person can view a site from multiple source IPs over a period of time. Both DSL and cable use dynamic IPs - but they are not often disconnected/reconnected, and when they are, DHCP is likely to pull the same IP address back anyway.
Besides that, think of all the people at work on internal LANs, each presenting the same public IP source address to the same web server. This effect more than makes up for the dynamic IP nonsense the blogger spouted off.
Maybe he's just mad about the size of his epenis, and is trying to compensate with illogic.
Web 2.0 == Giant Blogspam Circle Jerk
This argument is flawed. Logging to Slashdot now from my house and two hours from now from my friend's house should count for two visits, and so it rightfully does. The article writer seemed to have a problem with this? ZOMG 2 different IPs...
And if my IP has changed but I'm still here... that's because I haven't surfed for many hours at least otherwise the lease will be renewed and the address will stay the same. So it should still count for two visits. Duh.
Global warming is a cube.
I have Comcast Cable and to be honest, they never assign me a different IP address. I've had the same one for 6 months, the entire time in this new apartment. My old connection was the same, DHCP getting the same IP address over and over again. Does this guy account for my wife and me visiting the same website, being two different users, yet having a shared IP address?
Fixed 4:1 ratio here tho i can't speak for anyone else.
1 office and 3 home computers.
10 seems a little excessive, timeframe probably matters to actual ratio: unique per day? month? year?
I do most of my news browsing at work, where several hundred people show up as one IP (home computer is exclusively for WoW).
Besides, the assumption that stated unique visitors = actual unique IPs is inaccurate. Lots of companies track users with some kind of UID cookie, for more accurate stats. True, this isn't perfect either, and will reset when users wipe their cookies or it expires, but it is probably closer to the real number than IPs.
Ooh, wow... IP's aren't good indicators of uniqueness... I'm sure the Slashdot editors will tell you how valid that is when they're troll hunting.
But I don't think dynamic DSL IPs are that big of a problem. What about DSL users that are connected 24/7? My DSL provider rarely kicks me off and I can hold the same IP address for weeks.
What about laptop users at wireless cafes or users who post/read from work? Surely the same IP that reads a tiny website from home is likely also the "same IP" that reads it from a nearby workplace.
If only the same was true with our little OS over here.
2) He skips a few major technical details about the IP system itself.
d) He's mulling over a random loopy theory in a personal blog post, which isn't quite news. If it were, I'd be William Randolph Hearst by now.
Slashdot Burying Stories About Slashdot Media Owned
If you are only using IP to generate your visitation metrics, then you're fooling yourself, for the reasons outlined in the blog. You can't guarantee an IP is unique to a user, any more than you can guarantee that a user is unique to an IP (think Internet cafe or library; different users, same machine with potentially the same IP)
You have to use a combination of log data to try and scope out exactly who's visiting: IP, browser type (can't count robots in your stats), membership id (if the site uses/requires it), and possibly cookie data (you assign a unique id when a user visits the site and they carry that data to every page). Here's a good breakdown of metrics processing and its pitfalls.
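A rough Python sketch of that kind of combination, under stated assumptions: the `Hit` fields, the robot heuristic, and the priority order (membership id, then cookie, then IP plus browser type) are all illustrative choices, not anything a particular stats package is known to do.

```python
from typing import NamedTuple, Optional


class Hit(NamedTuple):
    ip: str
    user_agent: str
    cookie_id: Optional[str] = None  # hypothetical per-browser ID set by the site
    member_id: Optional[str] = None  # hypothetical login ID, if the site has accounts


ROBOT_MARKERS = ("bot", "crawler", "spider")  # crude, assumed heuristic


def is_robot(hit):
    """Very rough robot check based on the user-agent string."""
    ua = hit.user_agent.lower()
    return any(marker in ua for marker in ROBOT_MARKERS)


def visitor_key(hit):
    """Pick the strongest identifier available for this hit."""
    if hit.member_id:
        return ("member", hit.member_id)
    if hit.cookie_id:
        return ("cookie", hit.cookie_id)
    return ("ip+ua", hit.ip, hit.user_agent)


def count_unique_visitors(hits):
    return len({visitor_key(h) for h in hits if not is_robot(h)})


hits = [
    Hit("192.0.2.1", "Mozilla/5.0", cookie_id="abc"),
    Hit("192.0.2.2", "Mozilla/5.0", cookie_id="abc"),  # same browser, new IP
    Hit("192.0.2.3", "Mozilla/5.0"),                   # cookies disabled
    Hit("192.0.2.3", "Opera/9.0"),                     # same NAT IP, other browser
    Hit("192.0.2.4", "ExampleBot/1.0"),                # robot, not counted
]
print(count_unique_visitors(hits))  # 3
```

It still miscounts in both directions: a member who browses once logged in and once logged out shows up twice, and two people sharing one browser show up once.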
GetOuttaMySpace - The Anti-Social Network
... but didn't read the detail. A DHCP client gets the same IP address it had previously, so unless the pool is in short supply of free addresses it will get the same address as before.
And why does he suggest that DSL clients have static addresses while cable users have dynamic ones?
Also, most home users (I'm allowed a presumption too) have routers instead of bridge/modems or PCI card modems, and they are kept on all the time. While the router is powered on it will keep renewing the existing IP address.
I have a dynamic IP address but it's stable enough to run servers on it.
I'm sorry if I haven't offended anyone
AOL represents a large chunk of traffic and all users appear to come from one of a dozen proxy servers. Did he factor this in?
I am a little puzzled by his assumption that DSL users get a new ip while cable users have a static one. I had a DSL account with a static ip and a cable with a changable one.
Also if you got a good ISP that doesn't drop your always on connection you won't be changing your IP all that often. Hell even my crap cable provider rarely changed my ip.
So no, for a "large" site you can't really determine unique visitors by IP. That only works for really small sites which maybe get one visitor per ISP worldwide.
Even if IP was unique that still doesn't account for proxies. I have personally had to explain a couple of times to customers that no user X wasn't reloading the site constantly, that is the proxy for a university.
So my conclusion? Wake up and smell the coffee. Everyone who knows anything about measuring site metrics knows that you can't make any claims based on IPs.
His 10:1 claim is just random guessing.
If you absolutely must know, then you need to do what newspapers do when they study how many people read their newspaper (and therefore see the ads) vs. how many newspapers are sold. Take a wild stab at it and choose the figure you think you are going to get away with.
MMO Quests are like orgasms:
You may solo them, I prefer them in a group.
I forgot something.
What about the other way?
Do they see the 10 people on the office NAT as one IP ?!?
That would skew it in the other direction and average things out, wouldn't it? Now 10 is definitely excessive.
First of all, a DHCP server is typically going to give you the same IP address each time your computer requests it, unless there are more users than IP addresses, in which case there will be some shuffling. But that tends to be when there are more users than available IPs.
There are entire domains hidden behind a NAT device of some sort. This would be many users per IP address. TFA didn't mention this at all.
So I think TFA is indeed arbitrary, and also wrong.
bp
If you have a NAT and only one IP facing the net, but a bunch of machines using network address translation, can the web site determine how many unique users there are attached to that IP?
...which is that most ISPs deliver dynamic IP addresses, so users very well may be drawing multiple IPs over the period of time that they visit a website. But instead of running with the idea and doing something useful with it, like conducting even a limited study (correlate IPs & cookies, etc), he simply pulls some numbers out of his ass.
It's not so hard to come up with ideas like this - the real work is in verifying the idea, or coming up with a helpful method for figuring how those who care about such numbers can figure out what the real numbers are like.
It's lazy journalism when (1) the guy with the idea doesn't do any kind of validation of his concept before publishing, and (2) a big news site like Slashdot simply picks it up and runs shoddy work like this, tossing out a very potentially inflated 10x number...
Nobody knows exactly how many true unique visitors there are to a given website. And given the various ways to determine what is "unique", this muddles the pie further.
However, the important thing is, advertising rates aren't affected since they have been market-corrected for this. If an advertiser can make money, he will buy. If he can't, he won't. Whatever the true number is, it's already been factored in.
eTrade SUCKS
What about NAT and proxies?
There is no mention of NAT in his analysis. NAT is even included in some DSL modems these days from SBC. Lots of companies will have 100+ computers behind a single IP address.
Why is this guy's post news? Can I sit around and write bad formulas to get my blog linked by slashdot too?
I wonder if the other major ISPs do the same.
yeah, i agree with the other commenter who mentioned comcast. as a cable customer, although i technically have a dynamic ip address, i don't really. it changes maybe twice a year. although i would say that perhaps 2 to 1 or 3 to 1 would be a decent guess, maybe.
and he is totally not considering situations where it works the other way around. that is to say, he is not considering situations where different visitors come from the SAME ip address (as opposed to the other way around). if a huge lan exists, such as a university, which is behind a router, and 100 people from that lan visit the site, they will all appear to have the same ip. i think, right? taking that into consideration may throw the numbers in the opposite direction.
If you use http://www.mrunix.net/webalizer/ then it counts the number of visits: the same IP is still the same visit if the user of that IP (re)loads a page within 30 minutes. After that it's a new visit... And it should be safe to assume one visit is one user. Saying that 1:10 is the ratio for IP/users is simply saying that every user will visit the site ten times, which seems like a worthless number without limiting it to a time period. Also, the number seems to be taken out of nowhere.
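The 30-minute rule as described above can be sketched in a few lines of Python. This is not Webalizer's code, just the rule as stated in the comment; measuring the gap from the IP's most recent hit (rather than from the start of the visit) is an assumption.

```python
from datetime import datetime, timedelta

VISIT_TIMEOUT = timedelta(minutes=30)  # timeout taken from the comment above


def count_visits(hits, timeout=VISIT_TIMEOUT):
    """hits: iterable of (ip, timestamp) pairs. Returns the total visit count."""
    last_seen = {}
    visits = 0
    for ip, ts in sorted(hits, key=lambda h: h[1]):
        prev = last_seen.get(ip)
        # First hit from this IP, or a gap longer than the timeout: new visit.
        if prev is None or ts - prev > timeout:
            visits += 1
        last_seen[ip] = ts
    return visits


t0 = datetime(2006, 5, 1, 12, 0)
hits = [
    ("192.0.2.1", t0),
    ("192.0.2.1", t0 + timedelta(minutes=10)),  # same visit
    ("192.0.2.1", t0 + timedelta(minutes=50)),  # 40-minute gap, new visit
    ("192.0.2.2", t0 + timedelta(minutes=5)),
]
print(count_visits(hits))  # 3
```

Two unique IPs produce three visits here, which is why the time period matters so much for any IP-to-visitor ratio.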
9/11: Never forget it was a false-flag operation
Conversely, NAT, Proxy servers, CDNs (like Coral, CoDeeN), etc. decrease the number of IP addresses that access a page while having a large number of users see the content...
Single IP addresses could be multiple people. Check.
Multiple IP addresses could be individual people. Check.
Cookies cannot be trusted to be persistent, since people routinely clear their caches. Check.
However,
Not all DSL customers are on dynamic ip.
Not all cable customers are on static ip.
The reverse of the above is also not true, so why even get into that?
So, what can we learn about IP address->Unique visitors from the above collection of information? ABSOLUTELY NOTHING.
However, you could come up with a reasonable approximation if you went to the effort of constructing a sample of known individuals and recorded their behavior against the selection of IP addresses they use throughout a day/week/month/year. Hell, take a site like /. and run the stats on registered users. THAT would be [vaguely] useful. This, on the other hand, was just pure, pull-it-out-of-your-butt speculation.
Can we get a "From the 'Well, Duh' department" subheading for things like this?
We have 54 employees going through one firewall, and having one external IP address. On our company website, only that one IP address shows... So for that IP, it is not 1/10th of a unique visitor, it is 54 unique visitors. His numbers are baseless and skewed.
I'm out of my mind right now, but feel free to leave a message.....
You can use statistics to prove anything.
40% of people know that. . .
How about the people who connect from behind a router and have the same public IP? Wouldn't that have the opposite effect? Sure, these people _look_ like the same user, but could easily be a lot more than that.
If I send my sister a link on our home network, she could go to the site and it looks like the same visitor, etc. Everyone forgets about this too. Surely unique IPs != unique visitors, but it is somewhat close.
Judges and senates have been bought for gold; Esteem and love were never to be sold.
I agree you don't REALLY know for sure what the ratio of unique IPs to actual human visitors is (even assuming you filter out the BOTS) ... but my guess is that it is actually pretty close to 1:1 ... and in fact, unique IPs may actually undercount the number of unique visitors.
Hulk SMASH Celiac Disease
This random blogger (*) proposes something fairly wild without any proof whatsoever. Slashdot reports it simply because it is a wild guess.
Hmmm... I think I'll guess that there are only 10 unique internet users in the world excluding Comic Book Guy [tm], maybe that will get me reported on Slashdot giving me 10 hits of sweet, sweet advertisement money.
(*) Well, I've never heard of him.
There are about 500 people in my city routed through one IP at our popular local ISP. That makes a 1:500 unique IP to unique user ratio. Now consider that many providers use this kind of routing.
At Wikipedia, for example, I have seen my IP address listed under edits to articles on local subjects many times, though I never touched those articles myself.
No joke, we have 800 people going out over one IP from here. Kinda a pisser when I hit the 'slow down cowboy! you just posted' message. As much as the stats are inflated by dynamic IPs and multiple logon points, they are deflated by NAT and Proxies.
-Rick
"Most people in the U.S. wouldn't know they live in a tyrannical state if it walked up and grabbed their junk." - MyFirs
You can't trust web stats, that much I agree with. The rest is a bunch of hand waving.
DSL customers do not get a new IP every time they turn on their computer. Maybe some do, but my IP changes maybe once every few months, max.
He fails to mention the effect of NAT'ing and mega proxies, both of which are in heavy use and have the OPPOSITE effect. All of AOL emerges through a small number of IP addresses, clearly more eyeballs than IPs.
I agree that IP != eyeball, but that's it, there could be more eyeballs than ips or less, who knows, and it probably varies from site to site, based on demographic. There is no way to know for sure. Cookies will only tell you the number of computers.
I'd say he'd have to look at a specific population.
Among college students and younger, it may very well be 10:1, or worse.
For those of us accessing from work and home. That will be 2:1 assuming the same site.
For those of us behind corporate firewalls or other traffic aggregate points. It could very well be 1:1000, or higher.
Without some other data point, unique IP address statistics are next to worthless, except in comparisons like "We had xxx,xxx average daily last month and xxx,xxx + xx,xxx average daily this month."
all of these techniques help quantify unique users and monitor the trends in their online behavior. as far as noise in unique IP counting, i think that the biggest issue with relying solely upon unique IPs is that a simple count of unique IP addresses will include all robot noise. the major web search engine spiders will not influence this count much, but the gratuitous IPs logged by script-kiddie bots can eclipse the human traffic on smaller sites.
about sean dreilinger
What the author is pointing out is merely the obvious: when a site says they have X visitors they're making a guess. In fact, this link from April 30th both explains and shows why web site statistics are not accurate.
This need to say how many visitors a site has is nothing more than marketers trying to justify their costs. The trend to shove commercials down our throats using every conceivable idea including the possibility of preventing you from switching channels when a commercial comes on serves only the marketers since they're the ones who are reaping the most from inflated statistics.
We will bankrupt ourselves in the vain search for absolute security. -- Dwight D. Eisenhower
Firstly, cable users while on DHCP keep that address for weeks, thus one unique IP is one unique visitor. Secondly, I'd wager that a huge portion of internet surfing occurs at the workplace, which is masked behind a SINGLE IP: an enterprise of 5,000 or 10,000 employees is represented as 1 unique IP.
Back in the day of modems and less inet prevalence, I generally thought the numbers balanced out. Today, I think there are many MORE visitors than unique IPs.
I almost knew not to read the article on the basis of the statement "the logic he used". The article did not disappoint. The logic he used is irrelevant. The whole argument is pointless because he tried to argue it logically. There are plenty of ways to inflate or deflate this number, however, as above comments have pointed out. One should not try to come at the answer with logic. Just measure it. (Yes, measuring it is not necessarily easy, but a difficult-to-obtain right answer is always better than an easily-argued wrong one.)
I'm no expert, but I have about 20 people in my office, connecting to the net through a NAT'ed firewall. If we all looked at this guy's blog at work, he would technically have 20 unique visitors coming from a single IP address, so that would actually deflate his number of visitors based on IP address. It seems like Max's law is a little one-sided.
Per ardua ad astra.
I'm not sure about the logic he uses to come up with these numbers either.
News for Nerds! News! This isn't news. This is a random and arbitrary article that passes itself off as fact when it's just speculation. That's not news! Why the hell did you post it Taco if it isn't news???? Please.
Webmasters the world over know that the numbers we get aren't entirely accurate because of DHCP servers, NATs, and all the things we do for IP addressing. That's not news and no webmaster really needs to be reminded of it!
What would be news is a statistical study that tries to provide facts to back up this 1/10 number, or a new, more accurate way to count people who visit your website.
"All great wisdom is contained in .signature files"
Still zero.
--- This
When I worked for a search engine company we relied on a combination of IP address and HTTP cookie to identify unique users. True, many people disable cookies, delete them, etc. but by making use of multiple tracking methods you get a much more accurate idea of usage.
"Don't trust Stats. Except mine..."
1/10th = 1/10 of make sense
Some examples:
I don't really know why it matters in any case. For advertising, clickthrough rate is more important than number of users, and they are not very closely related. Sadly, the poorer your site's navigation the higher the clickthrough rate (and the fewer pages on your site people will see each visit, as the ads take them away sooner).
Live barefoot!
free engravings/woodcuts
This makes no sense.
a) As stated in the post, we don't know what this guy bases his information on.
b) 10:1 sounds rather inflated. I know plenty of people that have 1 computer per IP (NAT or not). So if the average is 10:1, then there has to be a lot of people with more than 20 computers behind an IP.
c) It's not really news, it's just some guy ranting about something.
c) We post it as NEWS anyway.
I've been known to browse sites on my two machines at work (with different real-world IP addresses), then visit them later when I get home.
:-)
So I would appear as three unique visitors. Cookies wouldn't solve this problem.
Using just cookies wouldn't work if a person used multiple browsers on their machines, like me.
A Hybrid would work best.
I like using lynx so I can block cookies and not have to deal with ads.
I'm not sure about his article and his formula, but it is already a debate in the web analytics industry. http://www.omniture.com/blog/ Even using cookies it's nearly impossible to get correct unique visitor counts, and that is why the industry is moving more towards unique visits, because a visit is a visit; it doesn't matter who the visitor was... The only way to really measure how far off visitor data could be is comparing unique customers (customer id, coming from a login) to the number of unique visitors they create. That way they could see the effects of single customers on multiple machines and browsers and also see the effects of multiple customers on a single machine and browser.
Proxies, especially ISP proxies (AOL, anyone?), can potentially hide tens of thousands of unique users.
Also, as far as i've seen DSL IPs don't change that often.
The article title looks like "Unique Visitors = 1/10th of Unique PIs?"
For the sites I do traffic analysis for, I've noticed that certain end-users will have their IP address change during a session (i.e., between one page request and another, often within minutes or even seconds of each other). AOL users seem to be in this situation, along with one other major ISP (I forget which). When the IP changes that often, I started trying to figure out other ways to count unique visitors. I still haven't come up with anything particularly good.
So, if the main users of the sites in TFA are from AOL (or some such), the 10 to 1 effect is one of the likely possibilities.
However, TFA does appear to use speculation rather than actual numbers...
Let S_n = {nst+us+vt : s,t in Z \ {0}, u,v in {-1,1}}. For all n in Z where |n| > 2, Z \ S_n is infinite... right?
This guy is so far off base, that this is really just sensational, and hardly worthy of being /.ed
This article is from CMP media.
This is a technology article from CMP media.
...
...
So other than advertising a blog entry that, at best, is deeply flawed due to the complete misunderstanding of NAT, DHCP lease times and the fact that no reputable site uses IP addresses as a basis for their stats, what exactly was this article about?
It's that this is a Marketing Person who has realised that IP != Unique User.
That places him amongst a tiny minority of marketing people, even if his reasoning and ideas on methodology are just as batshit insane as the rest of his kin.
-EvilMagnus
Statistically speaking, and in the long run (meaning tending toward infinity), the "profiles" of people that browse site A and site B would fall out more or less the same. Meaning it's not that the overcounting and the undercounting cancel each other; it's that the over/undercounting for site A is more or less the same as the over/undercounting for site B.
Therefore, if site A has 10 million IPs logged yesterday and site B has 5 million IPs logged yesterday, it's fair to say that site A has double the traffic of site B.
That is, unless you can prove that the user "profiles" (PPP, PPPoE, people that use lots of different computers versus proxies, NATted people etc) are wildly different between site A and site B -- which can occur e.g. in pr0n sites (which would be blocked by most corporate proxies but not home computers etc), the number of distinct visiting IPs is a good measure of site traffic.
It's better to be the foot on the boot than the face on the pavement. ~~ tkx Kadin2048
I've got cable with supposed dynamic addressing but in truth it has remained the same for over a year. There was a time it changed every few days but I presume they upped the allocations or whatever. I think it's interesting what this guy is saying but (as many people are saying) the data is very confounded and many different reasons could apply.
spoonerize "magic trackpad"
I recently put up a blog post entitled, "The Secrets of the Universe, As Released From My Mystic Corn-Hole" and I got practically NO TRAFFIC. What's the deal here?
This tripe gets posted, but my butt-wisdom goes ignored?
It should be obvious to everyone that Slashdot is under the control of the Scientologists.
I am from a small, grease-loving country in the north called Ca-na-da.
I work for a contractor that maintains/develops/hosts a govn't site. I'm working right now to transition our Webstats Analysis tool off of an old copy of Webtrends (crappy product), to Urchin (decent product). Measuring Unique visitors is IMMENSELY difficult on a government site because you aren't allowed to leave cookies that exist beyond the user session!
I _have_ to go by IP+User Agent. Cookies would make everything 10000x easier, but because of privacy laws, it's not usable.
All the suits want is a number, any number. When working for an online news company, I tried to explain to a group of suits that the traffic numbers we had (unique IPs) weren't really an accurate figure -- that their only real value was in comparison to themselves over time, to show traffic growth.
To illustrate, I showed that a large portion (30% +) of the IP addresses visiting our (local news) site were originating in either Virginia or Washington State -- and that that was because AOL and MSN have their operations there.
No one wanted to hear it. Someone suggested instead that maybe it was because there are large military installations in both places, and soldiers from our area were checking for local news. Plausible, but probably not to that extent.
Eventually, it was agreed that we'd keep telling advertisers what we had been telling them, which was how many 'visitors' the site had, based on IP addresses.
This kept everyone happy; our advertisers thought they knew what they were getting for the money, and our salespeople continued making sales with no ambiguity.
Nobody cares about accuracy -- they just don't want ambiguity. I've seen that this applies to lots and lots of fields, not just web traffic stats.
I did a quick analysis of a 250,000 line entry server log. I counted unique ip addresses, unique useragent cgi values, and then the number of unique combinations.
A useragent value looks like this: Mozilla/4.0 (compatible; MSIE 6.0; Windows 98;
Although even this is hardly reliable, since useragent can be faked and useragent isn't unique enough to be a client fingerprint, it's still helpful in this context.
One can make the assumption that a given user's "useragent" value isn't going to change much on a day to day basis, though it will not stay the same over time as versions get updated. GENERALLY speaking, the same IP address but different USERAGENT values would indicate different people from behind the same NAT firewall, or different users assigned the same DHCP address.
Here's what I got for results -- it looked like counting only unique IP's gave you only about 85% of the unique hits.
Total Hits Looked At: 249861
Unique IPs: 10309
Unique UAs: 1578
Unique Combos: 12232
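A minimal Python sketch of this kind of count, for anyone who wants to try it on their own logs. It assumes Apache-style "combined" log lines, where the client IP is the first field and the user agent is the last quoted field; the regex, function name, and sample lines are illustrative, not the script used for the numbers above.

```python
import re

# Assumed format: Apache/NCSA "combined" logs, IP first, user agent last in quotes.
LINE_RE = re.compile(r'^(\S+) .*"([^"]*)"\s*$')


def count_uniques(lines):
    """Count hits, unique IPs, unique user agents, and unique (IP, UA) pairs."""
    ips, uas, combos = set(), set(), set()
    hits = 0
    for line in lines:
        match = LINE_RE.match(line)
        if not match:
            continue  # skip lines that do not look like combined-format entries
        ip, ua = match.groups()
        hits += 1
        ips.add(ip)
        uas.add(ua)
        combos.add((ip, ua))
    return {"hits": hits, "ips": len(ips), "uas": len(uas), "combos": len(combos)}


sample = [
    '192.0.2.1 - - [01/May/2006:12:00:00 +0000] "GET / HTTP/1.0" 200 512 "-" "Mozilla/4.0 (compatible; MSIE 6.0)"',
    '192.0.2.1 - - [01/May/2006:12:01:00 +0000] "GET /a HTTP/1.0" 200 512 "-" "Opera/8.5"',
    '192.0.2.2 - - [01/May/2006:12:02:00 +0000] "GET / HTTP/1.0" 200 512 "-" "Opera/8.5"',
]
stats = count_uniques(sample)
print(stats)

# Ratio behind the "about 85%" figure, from the totals reported above.
print(round(10309 / 12232, 3))  # 0.843
```

To run it on a real file, pass the open file object: `count_uniques(open("access.log"))`, where the filename is whatever your server writes.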
The problem with quotes on the internet, is that nobody bothers to check their veracity. -- Abraham Lincoln
While he has a point (I visit particular sites from at least two different networks on a regular basis) there are a couple things he doesn't take into account.
Firstly, every DSL I have ever worked with is a persistent connection. (Why would anyone bother with an on-demand DSL?) It may not be a fixed IP, but it is a pretty darn sticky IP. It only changes when one endpoint of the DSL circuit is reset, and that generally doesn't happen much more often than monthly. If you're tracking unique IPs over the life of your site, then sure... hits from IPs assigned to DSL users will be very inflated by this effect. But if you're tracking unique IPs in the last month or so, then the inflation will be pretty minimal.
(Incidentally, the same effect will occur as users change ISPs; your unique users will move to different IP pools each time they switch. Is that worth taking into account?)
Secondly, he doesn't take into account NAT. That's obviously not a big deal when counting home users, but it is a big deal when counting anyone on a corporate network. There could be one or one thousand unique users behind every NATted address... if you're counting just unique IPs, there could be a serious undercount of unique users in this space.
Wouldn't it be easier to say "Unique IP hits bear only a loose relationship to unique user hits" and leave it at that? The measurement may be inflated, but it is similarly inflated for most websites. And when comparing one site to another, that seems good enough.
With reasonable men I will reason; with humane men I will plead; but to tyrants I will give no quarter. -- William Lloyd
He's correct about visitors that visit from many different IP addresses, particularly AOL users that weren't really mentioned. The same AOL user can have several different IP addresses on just one visit to a web site, due to the way that their proxies and such work. I distinctly remember phpBB running into this issue because it wanted to associate each user's login cookie with their IP address for security, but with AOL users there's just no rhyme nor reason to their IP addresses.
The flipside that's not considered at all though is the number of places that have any number of unique visitors all using the same IP address. Everyone in my office will have the same external IP address for any web site we visit. I know many other offices are exactly the same way.
What does this mean overall? I think they balance out for the most part. Some people are overreported, some are underreported, and it probably all balances out in the end.
Real unique visitor numbers are always much higher than IPs, or anything that the stats programs comment as 'Unique Visitors'.
Stats programs rely upon IPs, browsers, and other identifying factors to assess what traffic is 'viewed'.
That is a thing of yesterday. Today, with all the 'privacy' and 'security' hype going on, networks present all of their users to the remote server as if they were a single user, and people at home use privacy-blocking software to keep much of the identifying info about the client PC from being submitted to the remote server that hosts the site.
Thus, it is natural that much actual visitor traffic is now being interpreted as 'not viewed' traffic. But how much of it, and what is the ratio? That depends entirely on the visitor spectrum of the site, the hardware and measures visitors use on their own desktops, and so on.
Read radical news here
Unique Slashdot Articles = 1/10th of Total Slashdot Articles?
"So, what IS the typical holding interval for a DSL ip?"
Anyone who has a router set up as the primary access point will always have the same IP (barring an ISP reset of some kind - or turning off the router for some reason). I've had the same IP for over 2.5 years now and my pathetic Cisco 678 router gets it via DHCP (I know, I checked in CBOS). The router needs to be rebooted every so often (which promptly causes it to reset the NAT table to some pseudo-arbitrary point in the past, disregarding any changes even though "written") and I still get the same address.
DHCP virtually always assigns a MAC the same IP address it was last seen with, provided that address has not been taken already. Since IPs are assigned by FILO, losing your address would mean the stack has been completely used up. Unless the ISP has an extraordinarily tight supply of addresses (in which case you'd probably do better with a different ISP), that's not likely to happen if you use your computer regularly.
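To make the "same MAC gets its old address back" behaviour concrete, here is a toy allocator in Python. It is only a sketch under my own assumptions: the class `ToyDhcpPool` is hypothetical, released addresses simply go to the back of the free list, and no real DHCP server is claimed to work exactly this way.

```python
class ToyDhcpPool:
    """Toy lease allocator that prefers the address a MAC held last time."""

    def __init__(self, addresses):
        self.free = list(addresses)  # addresses not currently leased
        self.last_seen = {}          # MAC -> address it held most recently
        self.active = {}             # MAC -> address currently leased

    def request(self, mac):
        if mac in self.active:
            return self.active[mac]
        previous = self.last_seen.get(mac)
        if previous in self.free:
            # The old address is still unclaimed, so hand it back.
            self.free.remove(previous)
            addr = previous
        elif self.free:
            addr = self.free.pop(0)
        else:
            raise RuntimeError("pool exhausted")
        self.active[mac] = addr
        self.last_seen[mac] = addr
        return addr

    def release(self, mac):
        # Assumption: a released address goes to the back, so it is reused last.
        self.free.append(self.active.pop(mac))


pool = ToyDhcpPool(["a", "b", "c"])
first = pool.request("mac-1")
pool.release("mac-1")
pool.request("mac-2")           # another customer takes a different address
again = pool.request("mac-1")   # old address is still free, so it comes back
```

In this toy model a customer only loses the old address once every other address in the pool has been handed out in the meantime.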
I urge you people! Mandatory registration on each site! Three names, e-mail, address, age, sex, and social security id!
My IP has been 127.0.0.1 for a really long time now. Ever since I got my first internet connection, actually. That must be why it's such a "nice" number and not those horribly complicated ones other people always seem to have.
I work for one of the big four accounting firms and commonly attest traffic reports from websites that are publicly listed. There is a requirement for this where internet traffic is published in financial statements.
The calculations used are pretty accurate and certainly not what you would get reported straight from Webtrends or similar. We take a unique user as a combination of unique IP and Cookie. However, we also identify and exclude cookie storms - this is where a browser refuses a cookie and so you get a new cookie for every hit (yes I do mean hit and not page, or visit).
I am sure that non-publicly listed companies aren't quite as rigorous.
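A rough sketch of how an IP-plus-cookie count with cookie-storm handling could look, in Python. Everything here is an assumption for illustration: the function name, the `storm_threshold` parameter, and the rule that a storm is "several single-hit cookies from one IP" collapsed into one user. It is not the firm's actual procedure.

```python
from collections import Counter, defaultdict

def count_unique_users(hits, storm_threshold=3):
    """hits: iterable of (ip, cookie_id) pairs, one pair per hit.

    Assumption: a cookie seen on exactly one hit is a storm suspect. If one IP
    has at least `storm_threshold` suspects, they are counted as one user.
    """
    hits = list(hits)
    cookie_hits = Counter(cookie for _, cookie in hits)
    users = set()
    suspects = defaultdict(set)  # ip -> single-hit cookies seen from that ip
    for ip, cookie in hits:
        if cookie_hits[cookie] == 1:
            suspects[ip].add(cookie)
        else:
            users.add((ip, cookie))
    for ip, cookies in suspects.items():
        if len(cookies) >= storm_threshold:
            users.add((ip, "<storm>"))  # the whole storm counts once
        else:
            users.update((ip, c) for c in cookies)
    return len(users)


example = [
    ("1.1.1.1", "A"), ("1.1.1.1", "A"),   # one returning browser
    ("2.2.2.2", "B"),                     # one single-hit visitor
    ("3.3.3.3", "s1"), ("3.3.3.3", "s2"),
    ("3.3.3.3", "s3"), ("3.3.3.3", "s4"), # browser refusing cookies
]
naive = len(set(example))
adjusted = count_unique_users(example)
```

Here the naive IP+cookie count comes to 6, while the storm-adjusted count is 3.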
Eric
ummmm, what about Proxy and NAT? If anything, the opposite of what this article concludes is true. What a waste of my 4 minutes. I want my money back!
To get a bigger set of sample data, hmmm... I suppose even if you have to allow for a % of cookie-blocking users, one could still code the more complex dynamic stuff against a server-side cache / session (to avoid forks), and then a noncritical "Uniqueness" cookie would probably still yield enough data for a meaningful estimate on the IP-count question.
(and with that, I'm off to go determine just how un-unique my visitors are...)
Pi Ran Out
There's lies, damn lies and statistics.
1/10? In decimal, it's 0.5, is it? That's not that bad...
I wouldn't mind you in my head, if you weren't so clearly mad -Lews Therin Telamon
My point is that many non-publicly owned sites report unique IPs as a measure of unique visitors, and this number is totally incorrect because of the mixture of dynamic and static IPs.
I further insist that there is no general formula except to conduct a study on how frequently folks visit the site and what the ratio of unique IPs/cookies to unique visitors is.
A study is needed indeed! But who is willing to participate and admit as a result that they have only a fraction of unique visitors?
On the other hand, if there's a whole lab behind one IP address, it's quite possible that each system has the exact same configuration...
I have bad news for you, buddy... I h4x0r3d j00 last night!!!!111
Gamingmuseum.com: Give your 3D accelerator a rest.
And it gets worse! If users also use an RSS reader, it has its own useragent signature. If you assign cookies, plus use useragent, plus use the ip, you start to get close I suppose, but there really is no good way to make valid measurements.
The pool is normally smaller than the number of users. Otherwise, there would be no point in having a pool - each user could have his/her own IP address.
It works because, if you have enough users, they are never all using the internet at the same time.
I'm not sure if it's still true, but a few years ago it was the case that virtually all international web traffic for the entire UK academic network was funnelled through 15 or 16 cache servers.
The period of time involved here is too short for a large number of users to have different IP's each time. I'd have to do more complete analysis to know for sure, for example if the IP's are in similar address groups or not.
It's never going to be close to perfect, but it clearly refutes the 10:1 statement in the article.
Most real web analytics services use page tagging to measure unique browsers. IP address analysis is just not valid. Real analytics packages screen out spiders and other non-human traffic as well. Of course, even page tagging with cookies is not totally foolproof - two people could use the same computer with the same login, and one person may use a different computer at home and at work - but it is better than any other method of measurement.
Note: If you want to see how page tagging works the easiest way is to check out Google Analytics.
[Please type your sig here.]
the 1:10 notation would mean 1/11th not 1/10
... to just register a cookie variable that's unique to that browser that doesn't have to expire for a long time and that key is checked in a database. Sure, it's not secure, but if you want to inflate the perception of unique visitors, there are always ways around that. Actually, just checking for the cookie alone lets you know if someone's been to the site before. There are lots of ways to track users coming and going even without logins, by using cookies.
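A minimal sketch of that long-lived cookie idea, using only Python's standard library. The cookie name `visitor_id`, the one-year lifetime, and the in-memory `known_ids` set standing in for the database are all assumptions for illustration.

```python
import uuid
from http.cookies import SimpleCookie

COOKIE_NAME = "visitor_id"        # hypothetical cookie name
ONE_YEAR = 365 * 24 * 60 * 60     # lifetime in seconds

def identify_visitor(cookie_header, known_ids):
    """Return (visitor_id, is_new, set_cookie_value_or_None).

    cookie_header is the raw Cookie request header (or None);
    known_ids stands in for the database of keys already issued.
    """
    cookie = SimpleCookie()
    cookie.load(cookie_header or "")
    if COOKIE_NAME in cookie and cookie[COOKIE_NAME].value in known_ids:
        return cookie[COOKIE_NAME].value, False, None
    # No usable cookie: issue a fresh random key and remember it.
    visitor_id = uuid.uuid4().hex
    known_ids.add(visitor_id)
    out = SimpleCookie()
    out[COOKIE_NAME] = visitor_id
    out[COOKIE_NAME]["max-age"] = ONE_YEAR
    out[COOKIE_NAME]["path"] = "/"
    return visitor_id, True, out[COOKIE_NAME].OutputString()


known = set()
vid, is_new, set_cookie = identify_visitor(None, known)               # first visit
vid2, is_new2, set_cookie2 = identify_visitor("visitor_id=" + vid, known)  # return visit
```

The number of unique browsers is then just `len(known)`, with the usual caveat that anyone who clears or refuses cookies gets counted again.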
Neither has static IP addresses in most cases. Cable may be more likely to hold an IP lease longer, but it basically depends on the stability of the connection and the lease times the ISP gives out.
A lot of ISPs in Australia also offer sticky IPs (== long lease time), which makes tracking naughty customers easier. I'm sure the US providers do this as well.
If a DSL customer has 10 times as many IP addresses as a Cable customer in a month, that's a reflection of the ADSL quality - rather than anything else.
EMail: 0110001101100010010000000110001101110010 0110000101111010011011100110000101110010 0010111001100011011011110110
I think I'll use this headline from now on when I find bullshit!
The author has a point: the number of recorded IPs has no real connection to the number of users.
NAT router: lots of unique users with the same IP.
Interesting site like /. : I connect to it with a different IP each day, and with several different IPs on some days if I use a different computer, e.g. at work.
However, the article is bullshit, because the author reversed the explanation:
While it is obvious that DSL visits would be inflated they are drawn from a limited pool. Given enough time the entire pool will be drawn by DSL customers thus giving the impression that there are as many unique visitors as there are IPs in the pool
So, the number of visitors equals the number of IPs? If there are indeed so many visits, that is not that bad.
However: if EVERY customer of that DSL provider visits the web site in question, you still ONLY see as many visitors as there are different IPs, but you have far more visitors than IP numbers. The opposite of the author's conclusion is true.
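A quick simulation of that argument, as a sketch in Python. It assumes the worst case for DSL, namely that every single visit draws a fresh random address from the ISP's pool; the function name and all the numbers are invented.

```python
import random

def simulate(num_customers, pool_size, visits_each, seed=0):
    """Count the unique IPs a site logs when every visit draws a random pool address."""
    rng = random.Random(seed)
    seen = set()
    for _ in range(num_customers * visits_each):
        seen.add(rng.randrange(pool_size))
    return len(seen)

# Few customers relative to the pool: unique IPs overstate the visitors.
few = simulate(num_customers=50, pool_size=1000, visits_each=10)
# Many customers relative to the pool: unique IPs are capped at the pool size.
many = simulate(num_customers=5000, pool_size=1000, visits_each=10)
print(few, many)
```

With 50 customers the site logs far more than 50 addresses, but with 5000 customers it can never log more than the 1000 addresses in the pool, so the count flips from inflated to deflated.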
IMHO: the whole idea of measuring visitors via IPs is moot anyway. If you want to measure that, use a user login/password and/or a cookie. User login/password combinations are easy to get by offering customization (e.g. like Amazon offering you stuff you might be interested in, inferred from your last purchases).
angel'o'sphere
Cost free eBook I read (by iBook/Kobo/Amazon/ObookO/Gutenberg etc.): "The Green Odyssey" by Philip Jose Farmer.
No modern webbrowser allows cookies to "leave a trail" of where you've been anymore.
Cookies are only sent in requests to the domains that created them in the first place.
I guess one valid point would be that ad networks can sort of track you by determining which of the sites that use their banners you visit, based on referrer headers. The issue there is:
1) There's no user-identifiable information that links your cookie to the pages you visited that feature that ad network's banners, aside from your IP address at the time
2) Adnetworks don't care _who_ you are. They use the cookie to determine what sites should be considered "related" so they can better determine how to distribute banners. (That is, banners that do well on one site might do well on sites visited by people who visited the first site)
3) You can always block referrers to images/iframes hosted on external domains (ad network stuff). IIRC this is the default now on Firefox and Safari... so... it's irrelevant now.
THIS THING CAN TURN ON A DIME, MACROSSZERO STYLE ALSO FUCK BETA, ~NYORON
I'm not certain what site Mr. Fomitchev was benchmarking, but on one website I administer, I get ~25,000 unique visitors per day, and nearly 20,000 downloads of the 3.5 megabyte software package on the site. Now, I don't believe very many people will be downloading multiple copies of the software, nor do I think most of the bots that hit the page are going to waste their downstream to grab copies, either.
Not only is his maths flawed, but it is "fewer" not "less".
not one to ten, neither one direction thing.
then, why?
Servant of karma
I visit a webSITE.
I aim with my gunSIGHT.
In truth, you are all blameless. English is a terrible language, and people should try to avoid using it as much as possible. Of course, most already succeed without even trying.
Great men are almost always bad men--Lord Acton's Corollary
"How are we supposed to know what our visitors find interesting?"
... and yet turn around and argue that everything is groovy when some random webmaster does it?
Ummm, to answer that question maybe you could try tracking your own damned web site, instead of tracking users?
Let's say you have four web pages. Page three gets the most hits. Golly, maybe people like the content on Page 3 better? Ya think?
You are lying to yourself if you think you need to track OTHER PEOPLE instead of tracking your content.
I don't get it. Why does the Slashdot crowd scream bloody murder when the government or industry uses RFID or some type of surveillance technology
Surveillance is surveillance. With limited court-supervised exceptions, it is always immoral and dishonest and should be illegal.
People don't turn off cookies and the like because they feel mildly pestered by a website's tracking. WE FUCKING HATE IT. Tracking is quite simply wrong behavior by websites, and indefensible.
so why can't the virtual host just be sent along with the HTTP data in the encrypted SSL stream?
When I lived in DC, I had Verizon DSL. It connected via PPPoE and I only had a total of two IP addresses in 3 years. Comcast, where I live now, uses DHCP, and again I've had the same IP address for 2 years and only 3 IP addresses in 5 years. Dialup is the only time I've ever gotten a different IP every time I connected.
Also, where I work has 4000 people + 1000 public internet servers all behind just a few IP addresses. The parent and the article are just totally wrong.
Lucky you! Mine is 127.231.17.68... Much harder to remember. Wait, the net mask is 255.0.0.0... Are we on the same ISP? That must be a huge ISP, to use an entire A space...
Rethinking email