Netcraft Web Server Stats Challenged
kolchak writes "An article in The Age has an interesting analysis of the Netcraft Web Server Usage Reports. According to Port80 Software, Netcraft's surveys are biased towards domain name parkers and very small web sites, not taking into account how popular a site may be - there are some interesting results in the competing Port80 survey." However, it should be pointed out that Port80 "develops software products to enhance the security, performance and user experience of Microsoft's Internet Information Services (IIS) Web server."
From their Partners page:
"Port80 Software's Strategic Partners:
Microsoft, Inc."
Strategic in what way? FUD?
Ok, so the Microsoft connection makes it easy to write the whole thing off as astroturfing, but they have a point.
Parked domain names usually aren't separate websites; they're usually hundreds or thousands of domains pointing to the same server or service that's trying to sell them for profit. In addition, Netcraft counts www.yahooo.com and www.yahoo.com as separate sites, even though they both go to Yahoo.
In this manner, Netcraft's method *is* unfair, because there's no weight as to the location to which the domains point.
The theory of relativity doesn't work right in Arkansas.
Umm, how can you claim that they are sampling correctly when your only evidence of the way they sample comes from an app that crashes on Linux/Apache servers?
I am the Alpha and the Omega-3
"A developer of tools for Microsoft's web server software..."
Come on. I expect them to pull for their team but let's get real. They are not a neutral party and it is in their interest for people to believe that IIS is more common, whether or not that is actually the case. I don't exactly blame them for trying to spin the "facts" in their favor but following the money does hurt their credibility in this matter.
If it wasn't so sad that people can charge $50 for what in Apache is a one-line config change, it'd be pretty funny.
It's hard to be religious when certain people are never incinerated by bolts of lightning.
One box running multiple sites should not be less valued than multiple boxes running one site each for this simple reason:
Linux can do it better than Windows and therefore more Linux boxes are going to run multiple sites!
So why should a criterion of "large companies" be better than "all websites"? Large companies aren't going to select a better web server just because they're large, and the corporate culture of large companies can be its own sort. If you're going to limit yourself to certain types of companies, shouldn't they limit themselves to, say, the 1000 largest dot-coms? Look at companies that couldn't exist without their website. I rather doubt there'll be much IIS among them...
Give a man a fire, and he'll be warm for a day, but set him on fire, and he'll be warm for the rest of his life.
...this story is a plant to sell their ServerMask software.
You know, I wouldn't mind reading this "research" if only the companies involved were forced by some law to declare where their funding's coming from.
"Yep, we've just proven that Linux is the number one desktop in the world today. This statement brought to you by Novell/SuSE" would sit just fine with me; I could file the statement accordingly.
As things currently stand,
- I get to treat all such "research" as crap, regardless of whether it is or not.
- I get to continually challenge corporate decisions that are made on the basis of such research. "XYZ Research Inc says XYZ is the best product, and they also say they're in no way related to XYZ Inc. It must be true because it's in this magazine"
I know exactly where it all started, and I'm gonna whack those guys from the "Ponds Institute" if I ever find out who they are...
You have to look at their survey. It's talking about CORPORATE web servers. I work for a major corporate America company. We have close to 4000 servers handling our "web" environment. That consists of web, app, and database servers. There's more IIS than anything else out there for sure in corporate America, especially on the WEB front end. In a corporate environment there are about 20 Windows boxes to 1 Unix box, mostly because Windows servers are so cheap and can't handle as much load per server. But on the DATABASE back end there is much more UNIX than Windows.
Another thing is that corporate America is barely getting its feet wet with Linux/Apache. The UNIX boxes that are installed are not running Apache; they're running something from a major vendor (e.g. Netscape). Up until this year there was NO Linux in the company I work for. If a MAJOR vendor will not support a product, corporate America will not install it. They love to point the finger at the vendors. If there's nobody to point a finger at when something goes wrong, it will not get installed.
Until Red Hat started selling Linux for $5k, corporate America wouldn't even bat an eye at it. Now they're eating it up like hot cakes because it's EXPENSIVE! Linux is no longer a free thing. Now powerful execs can point fingers, and also throw around the "L" buzzword and feel like they're pushing the envelope.
Yes, security through obscurity does work ;-)
...Unless, of course, you're dealing with a completely clueless (or just plain sneaky) kiddie who throws every single exploit he has (regardless of the server) at your box. That's when security through obscurity stops working.
I signed up for a
It doesn't matter if the domain is parked or serving thousands of pages...domains are just as easily parked on IIS as on Apache.
slashdot, news for crazed liberal socialist zealots
Can you take a company seriously if they cannot do some simple ASP/SQL code?
Please, I am all for skepticism, but you are using it to help prop up your world view, which is not what being a skeptic is about. Being a skeptic is about being open-minded until you get all the information. While this is not all the information, there is a thing called professionalism. If you cannot present yourself in a professional manner, then you do not deserve the luxury of being thought of as credible. Look at an interview as an example: if you act rude, you will not get the job, even if you are a really nice person who is very well educated in the field you are trying to join. The first impression is everything.
I am the Alpha and the Omega-3
I'll ignore for the moment the question of the quality of their data. I'm sure others will endlessly debate it (and I'll probably join in). Let's look at something else: The quality of their presentation.
First, let's take a look at the most recent Netcraft server survey. Let's see: clean display. The scale grid is subtle and doesn't draw attention to itself, but makes it easy to see exactly where a line falls. There is little wasted pixel data. It's easy to see trends and make comparisons. For the curious, the exact numbers for the last two samples are listed (regrettably, only two samples are listed). The graph labels the data it shows ("Market Share for Top Servers Across All Domains August 1995 - November 2003"), leaving the reader to form his own opinions. On the down side, the scale confusingly marks 7% increments and the yellow line for Netscape/SunOne almost disappears into the background. Still, a well above average graph. Definitely room to improve, but better than most people expect to see.
Now let's examine the Port80 server survey. Wow, what a difference. The grid is a much more dominant element. The 3D effect means that bars further in the back appear taller (by up to 15 pixels, or about 7%) and makes it hard to compare a specific data point against the scale. The complexity of the 3D bars complicates things: the "top" of the bar is actually larger than the month-to-month shift in the numbers. The "area" of the bars implies size (intellectually you know it isn't, but your gut says otherwise); this means that the largely obscured middle bars (Netscape and Apache) seem smaller. Ultimately bars are the wrong choice: we're examining points over time (suggesting a line chart), not clusters of data. The chart is labeled with a conclusion ("Microsoft IIS Maintains Dominance Of the Corporate Web Server Market"), suggesting interpretations to the reader. On the up side, they provide heavily broken up information for the most recent sample point (regrettably it's a graphic). They include a worthless pie chart. If you want to show market share, a line chart showing historical data would be much more enlightening.
Conclusion? Port80's graphs suck. Hard. It's a stunning example of how not to create high quality graphs. The creators need to be beaten with copies of Tufte's information display books until they get it. This is the sort of amateur crap I expect on PowerPoint slides from people more interested in being cool than being useful, or perhaps from the graphics department at USA Today. As an engineer I'm disappointed.
Search 2010 Gen Con events
Anyone else notice that the spokesman for Port80 claims that they have been running the survey all year "except for a period between February and June"? That means they've been running for about eleven months, except for the five months when they weren't running...
I don't think they have much in the way of credibility, even without their transparent bias. They seem to have a creative way with arithmetic.
It is a woman's prerogative to change other people's minds.
"Netcraft is biased"
"develops software products to enhance the security, performance and user experience of Microsoft's Internet Information Services (IIS) Web server."
Entities that could be accused of having a conflict of interest ought not to bother at all with statements like these. It will only end up making them lose integrity.
War crimes, torture, lies, illegal spying... Would someone give Bush a blowjob, already, so he can be impeached?
What about boxes like the ones where I work that run many (dozens, hundreds even) domains on one physical server? That's where the real difference creeps in; it's how 60-whatever % of sites run on Linux while 60-whatever % of boxes running web servers run Windows. Lots of the Linux boxes run multiple sites (and I don't just mean www.foo.com and images.foo.com; I mean they run www.foo.com and www.bar.com and www.baz.com and www.qxt.com on the single box).
So, take one of my boxes at work: it currently hosts 53 second-level domains and about 200 subdomains from them. The one I'm thinking of has its own class C netblock, but we have similar ones that just have a single IP address for their dozens of sites. Do you want that counted as one server, as 53, or as 200? Netcraft says it's 200. Port80 says it's 1. I'd like to count it as 53. Netcraft's way tells you what people who make web hosting decisions like. Port80's way tells you what people who make hardware and software buying decisions like.
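To make the three ways of counting concrete, here's a minimal Python sketch. The records and the box label are made up for illustration, and the "last two labels" rule for second-level domains is a simplifying assumption (it would miscount names like foo.co.uk).

```python
def count_three_ways(records):
    """Count the same hosting data by hostname, by second-level domain, and by box.

    records is an iterable of (hostname, box_id) pairs; box_id is a
    hypothetical label for the physical server, not necessarily an IP.
    """
    hostnames = {host.lower() for host, _ in records}
    # Naive assumption: the second-level domain is the last two labels.
    domains = {".".join(host.lower().split(".")[-2:]) for host, _ in records}
    boxes = {box for _, box in records}
    return len(hostnames), len(domains), len(boxes)


# Illustrative data only: five hostnames, three domains, one box.
records = [
    ("www.foo.com", "box-1"),
    ("images.foo.com", "box-1"),
    ("www.bar.com", "box-1"),
    ("www.baz.com", "box-1"),
    ("baz.com", "box-1"),
]

by_hostname, by_domain, by_box = count_three_ways(records)
print(by_hostname, by_domain, by_box)
```

Same data, three different "server counts", which is the whole disagreement in miniature.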
All's true that is mistrusted
Not quite. See, if you're saying you're running IIS but actually aren't, you're immune to all IIS-exclusive hacks. They simply aren't gonna work against Apache... so you give the illusion you're Superman when they fire bullets at you. Of course, you're still at risk from kryptonite should an Apache exploit be released... but hackers looking for Apache servers to hit will think you're an IIS server and hopefully not bother with you.
It's security by misdirection... a cousin to security by obscurity. Not a complete security solution, but it does help a bit in convincing hackers looking for an easy target that you're not one, so move on to the next victim.
ok so like,
You are attempting to defend an indefensible position based on the credibility of an obviously biased company attempting to manipulate reality to render their desired outcome, yet you feel the need to rail against someone because of some spelling errors? I tend to give someone who does not speak English as their first language a measure of respect, especially when they destroy a pathetic point I'm trying to make with better English and a better thought-out argument than mine, even if there are a few misspellings. You need to drift slightly farther away from zealot to be taken seriously.
For every annoying gentoo user, are three even more annoying anti-gentoo crybabies. Take Yosh from #Gimp for example.
while it is not a substitute for a good security policy, it is an excellent augmentation. the old saying goes that the only secure computer is one that isn't connected to the network. well, that's not really possible if you're running a web server, but you definitely don't need to advertise that you're connected... or how you're connected.
let's use a military analogy (ugh). you may put your soldiers in an armoured transport... but they still wear camouflage.
i mean, after all, we all turn off ping before we put our servers up... don't we?
2 1337 4 u!
They list the 995 sites they include (they're using the Fortune 1,000 and, looking at some of the earlier reports, apparently 5 Fortune 1,000 companies don't have sites). (If they're still Slashdotted, you can download the pages from Google's cache. Start here.)
A bit of quick Perl hackery pulls back the following values, roughly in line with what they report. The second column is actual sites found.
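The script isn't shown here. As a rough idea of what such a tally involves, here's a minimal Python sketch that buckets Server header strings into families and counts them. The sample headers are illustrative only, not survey data, and the substring rules are an assumption about how one might group them.

```python
from collections import Counter

# Substring markers for each family. This grouping is an assumption made
# for the sketch, not the rule set used by Port80 or Netcraft.
FAMILIES = (
    ("IIS", ("microsoft-iis",)),
    ("Apache", ("apache",)),
    ("Netscape", ("netscape", "iplanet", "sun-one")),
)


def server_family(header):
    """Map a raw Server header value to a coarse family name."""
    value = (header or "").strip().lower()
    if not value:
        return "unknown"  # header missing or empty
    for family, markers in FAMILIES:
        if any(marker in value for marker in markers):
            return family
    return "other"


def tally(headers):
    """Count how many headers fall into each family."""
    return Counter(server_family(h) for h in headers)


# Illustrative header strings, not data from either survey.
sample_headers = [
    "Microsoft-IIS/5.0",
    "Apache/1.3.27 (Unix)",
    "Netscape-Enterprise/4.1",
    "Microsoft-IIS/4.0",
    "",
]

for family, count in tally(sample_headers).most_common():
    print(family, count)
```

Note that this trusts whatever the server says about itself, so a masked or rewritten header lands in the wrong bucket.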
That said, I doubt the usefulness of the survey. It's a survey of Fortune 1,000 companies. These are often companies whose web presence is minimal. What does a giant holding company need with a web site? Heck, five of the companies didn't have any site at all! Of those sites that exist, many lack any sort of complexity (say, thousands of pages, or lots of dynamic pages). Simply put, many of these sites would run fine on almost anything; they don't represent Hard Work. I'm a lot more interested in what Google and Yahoo choose to run than in what the Radian Group and Kiewit run.
Now Netcraft does have the problem they cite: Netcraft weights everyone equally. Perhaps that introduces bias. Perhaps we should select a set of sites that is high bandwidth, typically has at least some dynamic systems in place (say, to handle selling accounts), and is a popular target for hackers? How about porn sites? Porn operators have a hard job; thanks to Smutcraft, you can see what they run.
Second, it looks like they've chosen one site for each company. For Amerco, for example, they chose UHaul.com running IIS. Reasonable enough (UHaul is part of Amerco), but it's interesting that they skipped amerco.com (running Apache). Not a great example, surely (especially since uhaul.com is certainly doing more real work than the very thin amerco.com), but it shows that there is a selection process of some sort, and any selection process risks introducing bias.
Search 2010 Gen Con events
$5 / month hosted VPS on linux = awesome!
So basically, they're using a (questionably biased) survey of "servers" running IIS vs. others.
Now excuse me, but wouldn't being able to run 100 sites on one Apache box without problems beat the pants off having to run 100 separate IIS boxen?
I mean, if say, 70% of the websites in the world were to be run on 30% of the servers, I'd say those 30% of servers had something over the other 70%...
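Working through those hypothetical numbers as a quick Python sketch (the 70/30 split is the made-up figure from above, not a measured one):

```python
def sites_per_server_ratio(site_share, server_share):
    """How many times more sites per box group A hosts than group B.

    site_share:   fraction of all sites hosted by group A
    server_share: fraction of all servers belonging to group A
    Group B holds the remainder of both.
    """
    density_a = site_share / server_share
    density_b = (1 - site_share) / (1 - server_share)
    return density_a / density_b


# 70% of sites on 30% of servers, versus 30% of sites on 70% of servers.
ratio = sites_per_server_ratio(0.70, 0.30)
print(f"{ratio:.2f}x as many sites per box")
```

With that split, each box in the 30% group carries a bit over five times as many sites as a box in the other group.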
There is not much point in bashing one or the other survey as being biased. Of course they are (whether intentionally or not), since a single survey will only ever show a single perspective.
- Netcraft shows servers by hostnames
- Port80 shows servers for US Fortune 1000 companies
Both are interesting (even though the Port80 graphs suck, and their software is broken).
But both are meaningless by themselves if you want a serious view of server software usage.
Adding Netcraft's SSL survey (which isn't free) would help to get yet another perspective.
Then a breakdown by IP addresses instead of hostnames would be interesting, but Netcraft doesn't seem to publish that.
And what about non-US Fortune-N companies?
And web servers whose main business relies on the web (as this post suggests)?
And stuff you definitely cannot get, like the sites with the most traffic? (Maybe you could get "sites-with-a-lot-of-traffic-which-do-banner-advertising-with-major-banner-advertising-companies".)
If you take the survey for what it is, it's interesting. Just don't expect it to tell you more than it can.
Port80 is not about market share, it's about market share in US-based Fortune 1000 companies this summer. A very limited, but nonetheless interesting survey (if you care for surveys, that is).
Who will do a survey of slashdotted sites? Shouldn't be too difficult. Anybody bored in some rainy region of the globe?
Another poster commented that Netcraft offers similar surveys to members. They say their Fortune 1000 results are very similar to this report.
Settle down. Relax. Linux will be where you think it is today within 3 or 4 years.
Selling this type of software is just admitting that Apache is more secure than IIS.
You will never protect yourself by faking a weaker server program, because it will only increase your cracking traffic!
A script kiddie might still attack you because he's just a brute forcer. Anybody with brains won't trust your server's self-identification... so who are we fooling here?
In skimming threads, it looks like people have missed the real problem: that they have pre-selected their sample.
Their sample is the servers of the "Fortune 1000 companies". Now, I don't know how the Fortune 1000 chooses its companies, but I'll bet they don't choose those companies that have succeeded due to good IT choices. Microsoft will be on the list... but how much money does Google make? Is it on the list?
Moreover, and this is the really important point, they are completely ignoring every other kind of site. Government, educational, research, NGO, military, etc, etc. It ignores all the sites that don't make any money but are vitally important.
OK, they're just doing the study to prove that _companies_ use MS IIS. But even that's bad: it only proves that BIG companies use Microsloth. This may be an intelligent decision for big companies, but not for small ones.
So, in general, the only thing that Port80 really says in its study is that big, rich companies use Microsoft. This implies no causality: few of these companies make money from the web.
The Netcraft survey shows that PEOPLE use Apache.. and I think that's much more interesting.
---Nathaniel
The parent poster's point is that their site grabber program can get IIS sites but crashes on some Apache sites
More to the point, if they understand HTTP so badly that they can't even get server headers and parse them correctly, do you really want to trust such a company with the HTTP-rewriting, compression, caching, and wildcard-DNS services that are their main products?
Seems to me that those sorts of programs require a good deal of knowledge to get working correctly. Maybe a few levels above what you need to implement a webserver or DNS server. It seems odd that someone with so much knowledge would make any errors in handling the internet protocols...
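For a sense of how little is involved in the header step, here's a minimal Python sketch that pulls the Server header out of a raw HTTP response. It's an illustration only, not how Port80's grabber works; the function name and sample responses are made up, and it ignores edge cases like folded header lines.

```python
def extract_server_header(raw_response):
    """Return the Server header value from raw HTTP response bytes, or None.

    Header names are matched case-insensitively, and a response with no
    Server header is handled instead of blowing up.
    """
    head, _, _ = raw_response.partition(b"\r\n\r\n")  # drop the body
    lines = head.split(b"\r\n")
    for line in lines[1:]:  # skip the status line
        name, sep, value = line.partition(b":")
        if sep and name.strip().lower() == b"server":
            return value.strip().decode("latin-1")
    return None


# Made-up responses for illustration.
with_header = (
    b"HTTP/1.1 200 OK\r\n"
    b"server: Apache/1.3.27 (Unix)\r\n"
    b"Content-Length: 0\r\n\r\n"
)
without_header = b"HTTP/1.1 200 OK\r\nContent-Length: 0\r\n\r\n"

print(extract_server_header(with_header))
print(extract_server_header(without_header))
```

The two cases that tend to trip up sloppy parsers are a lowercase header name and a missing header, and both are covered here.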
So let's see, they want to sell us a product which supposedly increases the security of IIS boxes without actually increasing the security in the process, but rather by mangling the headers to look like Apache, in the hope someone will skip over it.
Since when do the web server scanning viruses actually check the headers to see what type of server it is?
I would think that someone who was scanning for vulnerable web servers would notice "This is a server" or "Yes we are using ServerMask" quickly and realize that someone is playing a game of hide the IIS server. That's one hell of a big fucking red flag.
None of their products actually offer any *real* security from what I see. They just hide the errors and the obvious from normal people. It won't stop someone from nmapping the IIS box and seeing that it's running Windows NT/2k/2k3. It won't stop those lovely Windows-based viruses that scan for exploitable webservers.
Let's not forget what happens when SQL/ODBC errors pop up and completely give away that you're an IIS slave. It's so freakin' easy to cause a server's script to throw back errors for analysis.
If anything, they are saying that, "Yeah, IIS sucks, look how we can make IIS pretend to be like the much more secure and powerful Apache web server."
Why not just run Apache in the first place? You don't have to pay money to a third party just to change basic configurations, and you get the most secure web server in existence.
It seems painfully obvious.
Brielle
by this logic, you should post your email address all over the web and rely completely on your spam filter.
the bottom line is this: hiding your server decreases the number of scans and attempted exploits on your box. since security can never be 100%, a reduction in attacks translates to a reduction in breaches. basic math.
2 1337 4 u!