Fedora Metrics Help Whole Linux Community
lisah writes "When Fedora released Fedora Core 6 late last year, the team decided to track the number of users with unique IP addresses who connected to yum in search of updates for a new installation of FC6. According to the data they collected, FC6 crossed the one-million user mark in just 74 days. Fedora Project Leader Max Spevack says that while it's great to use metrics to better understand what users want, the real value lies in their ability to encourage hardware vendors to offer more Linux-oriented goods and services. Spevack told Linux.com: '[W]e always say we wish hardware vendors had more [Linux-capable] drivers. Well, if you can go to them and say, "Hey, there's millions of people using this," then maybe they will listen. In the real world, you need data to prove your case. Well, here it is.'" Linux.com and Slashdot are both owned by OSTG.
Doesn't collecting data make you evil?
"reality has a well-known liberal bias" - Steven Colbert
I have legacy hardware, and too little knowledge, so I'm too afraid to switch from Core 3 to 6. God only knows what would break, and I sure don't know enough to work around it. But if I could get 6, I'd be in their statistic too. There's bound to be more people like me, who can't get 6 for some reason. So that number is a low estimate!
Sadly, this metric will very quickly be attacked because of all the users whose broadband connections get a new IP address every 24 hours.
...
Maybe counting how many distinct IPs downloaded *one* given critical update would be more precise (based on the assumption that even users with non-permanent IPs will download the patch once to secure their machines, and then won't download it again).
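The single-update idea above can be sketched in a few lines of Python. This is only an illustration, not how Fedora actually counts: it assumes the mirror writes Common Log Format access logs, and the file paths and addresses in the sample are made up.

```python
import re

# Start of a Common Log Format line: client IP, two ignored fields,
# a bracketed timestamp, then the quoted request ("GET <path> ...").
LOG_LINE = re.compile(r'^(\S+) \S+ \S+ \[[^\]]*\] "GET (\S+)')


def distinct_ips_for_path(log_lines, target_path):
    """Return the set of client IPs that requested target_path at least once."""
    ips = set()
    for line in log_lines:
        match = LOG_LINE.match(line)
        if match and match.group(2) == target_path:
            ips.add(match.group(1))
    return ips


# Hypothetical sample data; paths and addresses are invented for illustration.
SAMPLE_LOG = [
    '192.0.2.10 - - [01/Jan/2007:10:00:00 +0000] "GET /updates/critical-fix.rpm HTTP/1.1" 200 1024',
    '192.0.2.10 - - [02/Jan/2007:10:00:00 +0000] "GET /updates/other.rpm HTTP/1.1" 200 2048',
    '198.51.100.7 - - [02/Jan/2007:11:00:00 +0000] "GET /updates/critical-fix.rpm HTTP/1.1" 200 1024',
    '198.51.100.7 - - [03/Jan/2007:11:00:00 +0000] "GET /updates/critical-fix.rpm HTTP/1.1" 200 1024',
    '203.0.113.5 - - [03/Jan/2007:12:00:00 +0000] "GET /updates/other.rpm HTTP/1.1" 200 2048',
]

if __name__ == "__main__":
    ips = distinct_ips_for_path(SAMPLE_LOG, "/updates/critical-fix.rpm")
    print(len(ips), sorted(ips))
```

A repeat download from the same address collapses into one entry, but a user whose address changed between two downloads of the same file would still be counted twice.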
But even if it lacks precision, it is still a good indicator that Linux *IS* in fact popular and much more widespread than people think.
It just lacks sales figures to prove it.
Especially when compared to the many "Vista didn't get a warm welcome" reports we read these days.
"Sufficiently advanced satire is indistinguishable from reality." - [Tips: 1DrYakQDKCQ6y52z6QbnkxHXAocMZJE61o ]
I don't think commercial support is about the number of users, it's more about not wanting their precious IP laid bare to anyone who can download the kernel source (ie. everyone).
Also, in these times of big companies patenting everything, the source could reveal infringement.
IP addresses are necessarily unique ("one of a kind"). You mean "distinct" here.
"Skill shows through where genius wears thin." -Wittgenstein || Religion: uniting aviation and architecture.
Not saying there isn't a vast number of Linux users (I'm sure there are well over a million individual Linux users - that's a third of 1% of just the American population), just that numbers from data like this can be skewed.
I just installed FC6 on a machine yesterday, and they made it impossible to do anything without connecting to their server. I'm keeping the machine off the network, but apparently there's no way to install packages from the DVD without first downloading the update lists from their mirrors.
The Add/Remove gui (and yum) crashes if DNS isn't available. After some research, I was able to hack the yum .repo files to point to the DVD instead of the internet, but it still crashes with mysterious errors about media uris. I finally gave up and installed Ubuntu instead. So no, this doesn't help the whole Linux community. We'd be furious if Microsoft imposed this sort of requirement on new installations.
One issue they mention (and many people here will mention) is
"1) Users who have dynamic IP addresses will likely be counted multiple times, which inflates the number by some amount."
To counteract this, once you hit the 6-month mark you simply delete IPs that haven't been used in 1-2 months. By doing that you practically guarantee that whatever number you have is an underestimate, and that number becomes a lot more authoritative.
Still, it's awesome to see the numbers for Fedora are that high considering the disappointing Linux penetration I see even among CS people. Heck, we could all band together and form our own city! We could call it Fedoraville and our sports teams could compete against our rival city Ubuntuville!!
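A rough sketch of that pruning rule, assuming you keep a last-seen timestamp per IP address. The dictionary layout, the function name, and the 60-day default are illustrative choices, not anything Fedora has described.

```python
from datetime import datetime, timedelta


def count_recent_ips(last_seen, now, max_idle_days=60):
    """Count IPs whose most recent check-in is within max_idle_days of now.

    last_seen maps an IP address string to the datetime of its latest request.
    """
    cutoff = now - timedelta(days=max_idle_days)
    return sum(1 for seen in last_seen.values() if seen >= cutoff)


# Hypothetical sample data for illustration only.
LAST_SEEN = {
    "192.0.2.10": datetime(2007, 4, 20),   # checked in recently
    "198.51.100.7": datetime(2007, 1, 5),  # stale, probably a recycled dynamic IP
    "203.0.113.5": datetime(2007, 3, 15),  # inside a 60-day window, outside 30
}
NOW = datetime(2007, 5, 1)

if __name__ == "__main__":
    print(count_recent_ips(LAST_SEEN, NOW))
    print(count_recent_ips(LAST_SEEN, NOW, max_idle_days=30))
```

A shorter idle window drops more addresses, so it trades completeness for a more defensible lower bound.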
I stole this Sig
Given the numbers coming out, I'd think that it sure can't hurt for these guys to post the number they are.
Here (2nd page) Mark Shuttleworth mentioned Ubuntu having 8 million active users:
http://redherring.com/PrintArticle.aspx?a=20497&sector=Briefings
Now what are the hardware vendors waiting for? Permission from Microsoft?
LoB
"Anyone who stands out in the middle of a road looks like roadkill to me." --Linus
Personally, I don't understand the shyness, lack of will, and underrating of ourselves in these cases. Look at Firefox: they built a whole PR campaign around those numbers! And if you think they don't matter... THEY DO. They are true numbers that can be verified, checked, compared, etc.
I think most of the problem with using the "look at the numbers, the user counts are huge, man" meme is that there are a lot of geeks who simply don't see this argument as valid (those numbers can't be wrong, etc. etc.). They would rather convince hardware developers that they MUST get those damn specs out (by some hidden morality or simple common sense, which, I agree, exists in this case too) than try to wow them over to the community side (presentations, numbers, proof of concept (you don't have to care about the driver, etc.)).
We need more actions like SpreadFirefox, period. Done right, they just work.
user@ubuntubox:~$ stfu This server is going down for shutdown NOW!
Collecting non-personally-identifying data that would be logged anyway during normal server operation (httpd/ftpd daemons log connections anyway, whether or not the FC owners choose to do something with them) and publishing only the compiled form (the total number, as opposed to the complete obfuscated [rot5 scrambled?] list, AOL-style) ISN'T EVIL. (It's just like the "number of visitors" counters back in the old Web 1.0 days.)
Collecting data in an opt-in manner, like http://counter.li.org/, to do statistics ISN'T EITHER.
Collecting data that doesn't necessarily need to be collected for technical reasons (IP address vs. Pentium serial number), without telling the user first and without asking the user's permission first, THAT IS EVIL (and regularly done by Microsoft and other objects of hatred of the /. crowd).
"Sufficiently advanced satire is indistinguishable from reality." - [Tips: 1DrYakQDKCQ6y52z6QbnkxHXAocMZJE61o ]
It counts how many new systems are checking in for the first time, and most people probably aren't constantly reinstalling and downloading updates, so there's very little you could attack in these figures.
I'm the guy who actually maintains that Statistics page on the Fedora wiki.
The real "story" here is a couple of things:
THING 1 -- We're making the best effort that we can at showing the world how many installations of Fedora Core 6 we know about.
THING 2 -- We're being upfront about the assumptions and caveats that go along with that number. Quoting:
"Accuracy of metrics
We believe it is reasonable to equate a "new IP address checking in" with "a new installation of FC6", with the following caveats:
1. Users who have dynamic IP addresses will likely be counted multiple times, which inflates the number by some amount.
2. Users who are behind NAT, corporate proxies, or who rsync updates to a local mirror before updating will not be counted at all.
The anecdotal evidence that we receive from different groups, companies, and organizations makes it quite clear that group (2) is significantly larger than group (1). As such, we believe that the true numbers in the field are higher than the numbers on this page."
THING 3 -- We're also being upfront about how that number is generated.
I'm not trying to spin the data in any way. I'm just putting it up there, and trying to do so as objectively as possible. Anyone can draw their own conclusions, or compare it to data from other distributions, if you can find similar reporting.
One thing I haven't noticed being mentioned much is that this only counts the Fedora Core 6 installs. This would have to be a lowball figure, as there are so many machines with some other distro installed that would have no reason to download a patch from Fedora. If they've already admitted that their numbers are low because the factor that causes an undercount is greater than the factor that causes an overstatement, and they're already estimating a million, then once you count in all the other distros, I think 'millions' is quite reasonable.
The greatest thing you'll ever learn is just to love and be loved in return.
So just fire up a live CD with a recent kernel and try it out. You don't have to upgrade if it doesn't work. Hardware drivers are in the kernel, so just testing the right kernel on your system will tell you whether it works (mostly).
FC3 uses kernel 2.6.9
FC6 uses kernel 2.6.18
Intron: the portion of DNA which expresses nothing useful.
you need to learn to use Slackware, it is the best distro for old hardware...
Politics is Treachery, Religion is Brainwashing
that's why i only have an unformatted HD in my computer, i shall do no evil.
-Thus began the age of darkness-
How are they using the IP address for marketing purposes? They're using the number of IP addresses. No one can take the information they've released and determine that a computer at x.x.x.x is running Fedora. (And the information they have, they would have had anyway -- just like Slashdot knows the IP address you posted from.) As the GP said, it's no different from a website processing its server logs and reporting that it had X unique visitors during period Y.
Come to think of it, since yum fetches data over HTTP, it is a website processing its server logs and reporting the number of unique visitors.
Personally, I rsync from a mirror and have a local repository, so I have a whole bunch of machines that don't get counted. Stuff like that will result in the numbers being a bit off.
"so I'm too afraid to switch from Core 3 to 6."
If you upgrade that rarely, I'd suggest you take a look at CentOS. CentOS 4 will be a far smaller leap (RHEL4 is close to FC3/FC4), and you'd be on a maintained platform again.
There might be an outcry if Microsoft did that, just because people hate Microsoft and think Microsoft is evil, but that wouldn't mean that doing it would be evil. (So, Microsoft may in fact be evil, but not necessarily everything they do is evil, and moreover, just because they could do something, doesn't make it evil.)
There's nothing wrong with saying "x people accessed Windows Update this [year|month|day]." That's no different from the hit counters that used to exist on every web site. (And which were tacky, and I thank God that people finally realized this.)
What would be evil, and the temptation they need to avoid, is to take their server logs and start mining them for data that can be sold or used for malicious purposes; i.e. personally identifying information about what users are using what versions of Windows, or even how often they're updating, etc.
Aggregate information about hits is something that HTTP servers and their operators do all the time. Where it gets evil is when you have cookies tracking particular users across multiple sites, etc.
"Ladies and gentlemen, my killbot features Lotus Notes and a machine gun. It is the finest available."
Not to mention multiple computers hiding behind NAT; they would probably appear to be one system, due to the single IP address, unless the software for determining "hits" is smart enough to look at the transactions and realize that the same IP address just requested the same data 4 times over, and thus is probably 4 machines on a LAN behind a NAT router. I suspect that it is not, though, and thus you're almost certainly underestimating the number of installed systems.
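The "same IP requested the same data 4 times" heuristic could look roughly like the sketch below. It is a guess at what such logic might be, not anything the Fedora servers are known to do; the function name and sample requests are hypothetical, and a client that simply retries a failed download would be miscounted as an extra machine.

```python
from collections import Counter


def estimate_machines_per_ip(requests):
    """Estimate machines behind each IP from repeated fetches of the same file.

    requests is an iterable of (ip, path) pairs. For each IP, the estimate is
    the largest number of times any single path was fetched from that address.
    """
    fetches = Counter(requests)  # maps (ip, path) to number of requests
    estimate = {}
    for (ip, _path), count in fetches.items():
        estimate[ip] = max(estimate.get(ip, 0), count)
    return estimate


# Hypothetical sample: one NAT address fetching the same update four times.
SAMPLE_REQUESTS = [
    ("192.0.2.10", "/updates/critical-fix.rpm"),
    ("192.0.2.10", "/updates/critical-fix.rpm"),
    ("192.0.2.10", "/updates/critical-fix.rpm"),
    ("192.0.2.10", "/updates/critical-fix.rpm"),
    ("192.0.2.10", "/updates/other.rpm"),
    ("198.51.100.7", "/updates/critical-fix.rpm"),
]

if __name__ == "__main__":
    per_ip = estimate_machines_per_ip(SAMPLE_REQUESTS)
    print(per_ip, sum(per_ip.values()))
```

On the sample data, a plain distinct-IP count gives 2 while this estimate gives 5, which illustrates how much a NAT can hide.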
That doesn't mean the metric is worthless though, if anything it makes it more useful to use in badgering hardware manufacturers, since you can pretty reliably quote it as a minimum. E.g., "there are at least 1M people using this software as of 1Q07, probably more..."
"Ladies and gentlemen, my killbot features Lotus Notes and a machine gun. It is the finest available."
I'll concur with that quote. I have enough Fedora boxes behind NATs at home and at work to make up for several dozen dynamic IPs.
I use Kubuntu, but the concept is the same. When I use aptitude, it hits something.archive.ubuntu.com, and I get counted as one person, since I am behind NAT.
However, I have six machines, all of them on Ubuntu server or Kubuntu. One is AMD64, the rest are i386.
So, that skews the numbers for sure.
I wish the Linux Counter were taken more seriously. They used to put an automated email message in Slackware, so the likelihood of you registering was high. Otherwise, it is only good for comparative studies, not absolute numbers.
2bits.com, Inc: Drupal, WordPress, and LAMP performance tuning.
I had no idea people still had these kinds of problems; they are what drove me from RH/Mandrake years ago. I moved to Debian Sarge (before it was even "stable") and even did a dist-upgrade from sarge to ubuntu on one system. "apt" upgrades are rarely a problem even when the system is "live" and not booted off a CD, and never an issue if done from the console so that the X server doesn't crash on you when upgrading libs.
The oldest system I do this with is a 486DX2 50MHz with 32 MB of RAM, and there's never a problem. It's actually an HP Network Scanjet 5 with an Ubuntu "command line" install and the Enhanced scripts to run the interface. I have no idea where else a 486 with Linux would be all that useful to maintain, though I'm sure there are some out there.
- Disclaimer: Information in this post deemed reliable but not guaranteed.
I just did a retro-fit upgrade and an install on two machines and neither went to the "yum" repository mirrors to do an update till after they finished their first reboot where I had to activate the update manually (and get the gpg keys installed).
- I remember that "install" at some point gave me an option to install against latest package in the "yum" repositories, which I do not do for speed.
- I remember the "upgrade" and "install" screens from Anaconda being different. The "upgrade" never asked me to update against the "yum" repositories.
"pup", which is the graphical tool analog to "yum", handles rotating through the mirrors properly as far as I remember where it just fails over to the next if the current one can't be reached. I've had my Internet die while trying to do this, I don't recall it ever crashing on me and this is doing many installs and upgrades across every version of Fedora.
I don't blame you for switching to something else given these problems. I'm just stumped how you got these problems.
Or if he waited that long, why not wait for CentOS 5 (based on FC6), which is due out 2-3 months after RHEL5, which is due out before the end of February.
It's called "unstable" for a reason; if you are running it, you should expect problems anyway. I went from sarge to ubuntu 5.04 on a server (no ubuntu-desktop package madness) and it was fairly smooth. I think I had to remove a couple of oddball things that had new names and reinstall them, but it wasn't a hassle since they weren't "core" packages and it was all manageable from aptitude. I haven't tried going ubuntu->debian, and I'm not sure why anyone would want to for a "stable" version.
- Disclaimer: Information in this post deemed reliable but not guaranteed.
I tried live CDs of Ubuntu Breezy, Dapper, and Edgy. (I'm an Ubuntu fan, for obvious reasons.) None of them would boot up, so I gave up. I probably gave up too soon. The machine is a Sharp MP30 laptop, which, since it's very light, is still worth several times its weight in gold for me. It came with Linux pre-installed and a bunch of custom drivers for some of the weird hardware. It's also my main computer with my entire life on it. If it crapped out, I'd just have to change my name and go start a new life in the South Seas. But I think, given some of the ideas mentioned on this thread, I'll have another shot at trying live CDs.
(Um, I'd never heard of CentOS. (red face here) Thanks for the tip. I'll try that out.)
Time to restart my PPPoE dialer and call yum a few times....
I'll concur with that quote. I have enough Fedora boxes behind NATs at home and at work to make up for several dozen dynamic IPs.
Hey, probably time to set up your own yum repo then - the mirror should adjust your stats back to normal.
My God, it's Full of Source!
OUTSIDE_IP=$(dig +short my.ip @outsideip.net)
But even if it lacks precision, it is still a good indicator that Linux *IS* in fact popular and much more widespread than people think.
It might be much more popular than they think since yum-updatesd is broken in FC6.
My God, it's Full of Source!
OUTSIDE_IP=$(dig +short my.ip @outsideip.net)
In a way, I think slashdot should take credit for this metric coming to pass.
It's been obvious to people who "work" with linux, that Ubuntu is trying to be the friendly, cool, slick linux, and portray other distros as the geeks. Just like high school. Of course just like real life, there are more geeks than you realize and they are a lot more productive than the "slick" people. To counter this, the slick people employ marketing science -- which actually has nothing to do with marketing or science -- it's just lying about the other guy to make yourself look good.
I don't have a problem with Ubuntu, and I quite appreciate what they are trying to do in usability, but to pretend that they are the most popular linux distro is (quite literally) belittling of the others.
Actually, yes. Since most of their users are Windows users, they want their drivers to be Microsoft approved and signed. Microsoft can take however long they want in this process, or even extend it indefinitely. I really doubt Microsoft will be as friendly to companies that ship Linux drivers as they are to those that don't. They've never shown such altruism in the past.
more grammar checking might be more gooder-er here.
Furry cows moo and decompress.
They know how many people use Linux; that's not the real reason they don't support it. We all know that it's most likely illegal deals they have with Microsoft. Hush hush, wink wink.
You could be right. Laptops have some strange hardware and the kernel developers only seem to be testing on the mainline, new desktop systems. It might take more knowledge and hacking than you want to make a working kernel for your system.
Intron: the portion of DNA which expresses nothing useful.