MSN Search Engine Favors IIS
Scud writes "It appears that if you want to rise up in the rankings over at the MSN search engine you would do well to host your page on IIS. Ivor Hewitt has done a study and it appears that by using IIS, you are likely to increase your odds of a higher listing by several percent."
It's clearly biased towards Internet Explorer too: the results I get back in Firefox are mostly irrelevant blogs and pages full of adverts.
"Those who cast the votes decide nothing; those who count the votes decide everything." (attrib. Joseph Stalin)
So what's going on? I have no idea, I doubt it's all a big conspiracy... but some possible explanations spring to mind: Perhaps the MSN search has simply been coded by developers used to talking to IIS machines and so it just does that job better? Perhaps the MSN spider is taking advantage of some specific IIS features to provide enhanced indexing?
In other words, there are some explanations out there other than "MS is biased and there's a conspiracy and they are trying to take over the world"...
One man's Funny is another man's Offtopic.
... to think ms wouldn't use all it has. Obviously it hasn't yet learned from google, that being evil is bad. And bad guys get punished.
Is there any truth to the rumor that having a picture of Bill Gates on your site makes you #1 in your category?
I'm a big tall mofo.
And who is the silly person who would expect it to be otherwise? Have you actually been listening to the news at all over the past decade? Have you learned nothing? The real story would be if the ranking did not rise if it were housed on an IIS server. Otherwise it's a nothing, I would have assumed that.
... and they still think they can beat Google to the game. When are they going to realize that what made Google so successful was the fact that it has been so unbiased in all ways imaginable, including not accepting payments to get higher rankings.
Google makes money by prioritising quality. Microsoft makes money by prioritising money.
Go figure.
...you gotta do something to pump up your buggy, non-mainstream, insecure webserver.
Yeah, right.
My page is titled "San Andreas Radio" and if you Google it, it comes out #1 or #2 every time.
MSN it and it comes out about #7. Either they're being paid to reduce its rank (it's a bit subversive), or they don't like the fact I'm hosted on Linux, or they simply don't have a very good search engine.
If I put the exact unique title of a page into an engine, I expect that page to be #1.
Mirror of Ivor Hewitt's site
I think I have never used MSN search in my life. I suppose other people do, but how many? Anybody know MSN search share percentages?
San Francisco Photographers
The control over what webserver you will use is typically limited by your hosting provider. While many provide the choice between Unix-based servers and Windows-based servers, many do not.
For those who use hosts that do not provide these services, I don't think it appropriate to think that they are simply SOL. Rather, the better quality your website provides, the more relevant it is to the topic you discuss, the better it will fare in any search engine. The type of webserver you are using becomes nothing more than the tiniest fraction of your search ranking.
If it favours IIS machines, it makes it that much easier for virus writers / script kiddies to play about with them, since it displays them in preference to other web servers.
To be conclusive, it needs to be a controlled experiment with the same text and same outgoing/incoming links.
Just the webserver alone changing. This can happen by taking a popular site and then changing what it reports to the MSN search robots.
But until such an experiment is done, the data is open to too many interpretations.
The link to MSN search on the main story links to beta.search.msn.com. It should be noted that MSN Search has been out of beta for a while now - the correct link should be http://search.msn.com. It's not like it's Google or something - trying to keep everything in beta for years to escape criticism.
It's really unlike MS to skew things towards their products... I'm sure it's a mistake or a "Linux Zealots" distorting the facts...
My site is first or nearly first in google using relevant search terms. But in MSN it never shows (even if listed). Maybe also the use of PHP is harmful for MSN ranking? M.
For years now, the company where I work has had all its Apache systems reporting that they are IIS 5.0 systems. Just a quick change in a single file before compiling and there you go!
Microsoft has its own search engine? When did this happen? This is the first I heard of it. I have never heard one of my friends say, "Hey just MSN Search it!"
* Sorry for that image.
- Crow T. Trollbot
and this is "news", why?
Well, MSN never promised they wouldn't be evil.
So it seems fair to me.
Just change the server response line if the GET or POST comes from Redmond, WA to say you are some version of IIS. I can already see the recommendations coming from the SEO folks.
Apparently it's a very slow news day. In the interests of being remotely on topic.. (yes my karma will suffer dearly for this)
Why would this be any real surprise to anyone? MSN being MS is obviously going to give preferential treatment to their own products. This may be by design or strictly because IIS servers respond to some proprietary (yes I said it) requests that other servers won't.
I don't necessarily see it as an evil thing, but it's not entirely philanthropic either.
The world according to SComps
One of our clients has a web app hosted on an IIS box and their main website hosted on apache. The web app ranks higher than the main website when doing a search for them.
Most of those useless keyword, domain parking/hijacking, and spam sites out there run on Linux+Apache because the owner can host thousands of those domains fairly inexpensively, and that's the key to all spam: minimization of operating expenses so you only need 1 out of 100,000 users to click/buy to turn a profit.
These sites don't have any real content, they just point to other sites and/or exist to spam you with advertisements. Some of them have googlebombed their way higher into the rankings.
My guess is that MSN does a slightly better job of filtering those useless sites out of the index at the present time, OR the "googlebombing" techniques they use aren't as effective with MSN's indexing. Since they almost exclusively use Apache that would have the false appearance of favoring IIS.
This is just a guess, but it seems plausible.
Natural != (nontoxic || beneficial)
MSN Search should be banned for being dishonest.
Banned from where and by whom?
MSN search can do whatever they like. I don't know anybody who actually uses it. Even non-tech oriented people that use IE (against recommendations) set their startup page to something else. Google, mostly, but also "My Yahoo" and their webmail or portal of preference.
No sig
...Googled for anything using MSN!
If you disagree with me on social issues, then it's pretty clear that you are a narrow-minded bigot.
Dear Microsoft Employees: It has come to my attention that some of you are under the impression that there is another web server other than IIS. Please know this is not true. It is about as silly as this rumor that there is another browser besides Internet Exploder and another email program besides Outlook and Outlook Express. The EVIL Linux / Open Source movement has it out for us. However, upon review of the data collected from our super-worm, it appears that Linux and Open Source may too be a big hoax. We can find no evidence that either one exists. So, please continue on making your software as insecure and unstable as possible. Our marketing records indicate that "features" such as the "Blue Screen of Death" actually cause people to upgrade their older versions of Windows to our newer, more intrusive versions. Thank you for your hard work. Until we rule the world, Bill Gates
I'm not a troll, but I play one on Slashdot.
The article actually links to an older, smaller version of the analysis. There's a more comprehensive wordlist at: http://www.ivor.it/goog
is when all the extra traffic from higher rankings crashes the IIS servers that much faster!
Reject Fear - Embrace Hope
Remember, the guys working on the MSN search engine certainly use IIS to host their intranet sites, and whatever internal webservers they use to test against are probably IIS as well, at least in most cases. They are likely to consider bogus results for their own sites (both internal and external) more critical... that's not malice, that's just human nature. Even if they consciously work against that, they're more likely to notice problems there first.
And search engine tweaking is more an art than a science. It's an evolutionary process, with feedback loops and strange attractors. So if there's any difference in the behaviour or design of Apache or IIS that would be visible to a search engine, it's likely to lead to a slight bias in favor of the server software that the servers they pay more attention to run.
Leads me to think: is it significant? That is, can we exclude (to a reasonable certainty, say at the 95% confidence level, i.e. p < 0.05) the possibility that the effect seen is down to chance or to some other criterion MSN uses?
Ivor says at some point: "The initial set of words indeed showed a significant difference between the results from Google and the results from the Beta MSN search."
But what does he mean? I would be interested in what kind of significance test was applied and what the exact results were. Just looking at the ratio of percentages doesn't tell me enough... One should go back to the original data (seems provided, good) and check if the effect is actually trustworthy or just, in Ivor's words, "Odd. Pure coincidence perhaps."
Before seeing some analysis of significance, I don't believe anything...
The only difference in the HTTP response is just that IIS adds headers and that IIS has that stupid HTTP Continue on handling SOAP via ASPNET.
Just telnet to almost any Apache web server and type GET / and then to an IIS server and do the same thing. Look at the top. Almost all non-IIS web servers return no default headers.
Microsoft.com:
redhat.com
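If you'd rather script the header check than telnet by hand, something like this rough Python sketch would do. The function names and the sample response are made up for illustration; only the standard-library `http.client` calls are real. It just reads back whatever `Server` header a host chooses to report, which (as others point out in this thread) can be faked.

```python
import http.client
from typing import Optional


def server_from_raw(raw_response: str) -> Optional[str]:
    """Return the Server header value from a raw HTTP response head, if any."""
    head = raw_response.split("\r\n\r\n", 1)[0]
    for line in head.split("\r\n")[1:]:  # skip the status line
        name, sep, value = line.partition(":")
        if sep and name.strip().lower() == "server":
            return value.strip()
    return None


def fetch_server_header(host: str, timeout: float = 10.0) -> Optional[str]:
    """Send a HEAD request for / and return the Server header the host reports."""
    conn = http.client.HTTPConnection(host, 80, timeout=timeout)
    try:
        conn.request("HEAD", "/")
        return conn.getresponse().getheader("Server")
    finally:
        conn.close()


# Hypothetical response head, made up for illustration only.
SAMPLE = (
    "HTTP/1.1 200 OK\r\n"
    "Server: Microsoft-IIS/6.0\r\n"
    "Content-Type: text/html\r\n"
    "\r\n"
)
```

Calling `fetch_server_header("example.com")` needs network access; `server_from_raw(SAMPLE)` works offline on pasted telnet output.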
When people mention a "government conspiracy", it is related to several agencies, or at least should be.
:)
The IRS is not conspiring to get all your money. It is just company policy.
morcego
The Fine Article states that while Google's results are comparable to Netcraft's server survey results (that is, their share of Apache and IIS represents the respective market share), MSN seems to favor IIS. So no, Google does not favor Apache.
Have they gone ahead and implemented that thing about assigning you a hosting provider at birth then? What a shame. Back in my day, we used to be able to pick our hosting provider based on what they provided and what they charged for it.
Ah, the good ol' 1900's.
--MarkusQ
"msnbot/1.0 (+http://search.msn.com/msnbot.htm)"
"Time is an abstract concept devised by carbon-based lifeforms to monitor their ongoing decay." - Thundercleese
Sadly, this fails to surprise me. This news story doesn't sound too different from the one posted yesterday about Kerry-contributors being banned from some engineer gathering.... Why should a search engine give a flying fuck about what http server a box is running?
You can have a "conspiracy of one" if that person acts in multiple roles.
As an example, let's say that one person is a company's bookkeeper and CFO. (This isn't uncommon in small companies.)
As a bookkeeper she cooks the books to cover her embezzlement.
As CFO she prepares false financial documents for her company and its investors.
One person, criminal acts in two roles, so in many states she can be charged with conspiracy in addition to embezzlement.
BTW, this isn't a "conspiracy" in the legal sense since it's not a crime to give preferential service on the basis of web server. It's sleazy unless it's fully disclosed, but it's not a crime unless they actually sell the search engine as an unbiased tool.
For every complex problem there is an answer that is clear, simple, and wrong. -- H L Mencken
"In other words, there are some explanations out there other than 'MS is biased and there's a conspiracy and they are trying to take over the world'..."
It's called plausible deniability. "Why, no, we had no idea this would happen. You say it's an interaction with an IIS feature that causes this to happen? Heavens to Betsy, we never thought of that."
Microsoft people aren't stupid, and they ARE trying to take over the computer world, or haven't you been paying attention to what they say and what they have done? The engineers that built MSN Search would certainly be aware of any interaction that fits with IIS features to provide enhanced indexing. They would have been all over it from the beginning. And a side-effect means that IIS sites come out higher? Great! It's a feature that benefits us, they would think.
Of course MS is biased. Of course they would have noticed this. Of course they like it.
Where's the paired t test?
It's just coincidence that there happens to be a bias that makes IIS-hosted sites measure higher by this metric. ;-)
As copyright owner of this comment, I authorize everyone to defeat any technological measure which limits access to it.
Apache freely issues advisories and patches. It will issue an advisory if even one user faces a minor risk.
Microsoft (and nearly all other proprietary software companies) tries to hide problems to protect their perception in the marketplace. You usually only see advisories for major problems that will become public knowledge anyway, and numerous other fixes are piggybacked on the big ones.
But beyond that advisories don't really address the quality of a product. They're one metric, but nothing more.
For every complex problem there is an answer that is clear, simple, and wrong. -- H L Mencken
A "government conspiracy" normally involves the government conspiring with oil companies, local steel or similar.
I am trolling
Then, in 2007, came Longhorn, with integrated web search using not Google, but MSN. Joe Sixpack didn't care, but MSN was so damn convenient he forgot about Google - effectively forcing Google Inc. with its costly development department out of business. Later - oh surprise - all results you got for "Linux" on MSN were advisories to ditch it for Windows. He who controlled the search result, controlled the industry. (Maybe I should put some fake Frontpage-Meta-Header to my webpages to increase Rankings on MSN ... just to be sure)
Screw the FSM - Real geeks believe in the Invisible Pink Unicorn
Believe it...
First off, I looked at the difference in means for Apache rankings in MSN and Google: 61.5% (MSN) vs. 64.3% (Google) for 970 observations. Right there, you ought to be able to eyeball it and see significance. But, to make sure, here are the results of a t-test which checks the likelihood that two matched sets have different means (forgive the crappy formatting):
                              MSN           Google
Mean                          0.615061856   0.642948454
Variance                      0.01100624    0.008740111
Observations                  970           970
Hypothesized Mean Difference  0
df                            969
t Stat                        -10.51551356
P(one-tail)                   7.26569E-25
t Critical one-tail           1.646427658
P(two-tail)                   1.45314E-24
t Critical two-tail           1.962415113
As you can see, the P is 1.45 x 10^-24, which at least makes us think the results are not pure coincidence. I don't intend to speculate on the causality, though...
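For anyone who wants to redo this from the raw per-word data, here is a minimal sketch of the paired t statistic in plain Python. A df of 969 (n - 1) is consistent with a paired test, but that is my reading of the table, not something stated outright. The toy lists are made up for illustration and are NOT the study's numbers; `paired_t` is just a name I picked.

```python
import math
import statistics


def paired_t(xs, ys):
    """Return (t statistic, degrees of freedom) for a paired t-test."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length samples with at least 2 pairs")
    diffs = [x - y for x, y in zip(xs, ys)]
    mean_diff = statistics.mean(diffs)
    # Standard error of the mean difference, using the sample std deviation
    std_err = statistics.stdev(diffs) / math.sqrt(len(diffs))
    return mean_diff / std_err, len(diffs) - 1


# Toy data, made up for illustration; NOT the study's per-word numbers.
msn = [1.0, 2.0, 3.0, 4.0]
google = [2.0, 4.0, 5.0, 4.0]
t_stat, df = paired_t(msn, google)

# Standard error of the mean difference implied by the figures quoted above
# (difference of the two means divided by the reported t statistic).
implied_se = (0.615061856 - 0.642948454) / -10.51551356
```

Turning the t statistic into a p-value needs the t distribution's CDF, which the standard library doesn't provide, so that step is left out here.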
You could just change the HTTP Server header that Apache sends out. Someone should try it for a few weeks and see if it really makes any difference.
If you have mod_headers installed, just add the below line to httpd.conf:
Header set Server "Microsoft-IIS/6.0"
In major countries like Germany, IIS is already down to around 3% of the server market. Even world wide, most people have the sense to run Apache. You can look at the percentages, but every time an IIS farm is rolled out, shortly thereafter, they wise up and drop it for Apache or any other product actually suited for being connected to the network.
Frankly, I'm not sure why this article even made it to Slashdot. Is slashdot or OSDN participating in this year's marketing tsunami by doing product placement ads? Please let's go a week without MS articles, there's enough shilling going on in the discussion without them.
Beta is broken and the link to classic doesn't work. Stop wasting our time or there won't be anybody left here.
Unfortunately most people still think that to get to a page on the web they use MSN search. That default home page for explorer is very sneaky. Customers that I've spoken with don't even know what the address bar is for. They hit the 'home' button and type the URL in the search. Some things that appear obvious to computer users are apparently not obvious to everyone else.
Add something like this pseudocode to your server:
if $Browser = "MSNSearchBot" then $Server = "Microsoft-IIS/6.0"
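Fleshing that pseudocode out a bit, here is a toy sketch using Python's built-in `http.server`. The `msnbot` substring comes from the crawler's user-agent string quoted elsewhere in this thread; the handler class, the function name and the `"Apache"` default are my own assumptions for illustration, and this is not something you'd run in production.

```python
from http.server import BaseHTTPRequestHandler

IIS_BANNER = "Microsoft-IIS/6.0"


def server_banner(user_agent, default="Apache"):
    """Pick the Server header value based on the requesting User-Agent."""
    if user_agent and "msnbot" in user_agent.lower():
        return IIS_BANNER
    return default


class BannerSwitchingHandler(BaseHTTPRequestHandler):
    """Toy handler that reports a different Server header to msnbot."""

    def version_string(self):
        # send_response() calls this to fill in the Server header.
        # headers may be missing if the request could not be parsed.
        headers = getattr(self, "headers", None)
        user_agent = headers.get("User-Agent") if headers is not None else None
        return server_banner(user_agent)

    def do_GET(self):
        body = b"hello\n"
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)
```

To try it, pass `BannerSwitchingHandler` to `http.server.HTTPServer(("", 8000), BannerSwitchingHandler).serve_forever()` and compare the headers you get back with and without an msnbot user-agent.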
One line blog. I hear that they're called Twitters now.
Or see if, as at least three other posters indicated they do, Apache users are blocking MSN.
If you don't know where you are going, you will wind up somewhere else.
When I learned that Microsoft was trying to muscle into the search business, I added a robots.txt rule to exclude them from my web site. I figure that the less useful their results are, the harder it will be to do evil.
I could imagine (some) other Apache users like myself doing something like that, but I can't imagine any IIS user doing it...
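For the curious, a rule like that can be sanity-checked with Python's standard `urllib.robotparser`. The two-line robots.txt below is my guess at what such an exclusion looks like (the poster didn't show theirs), and example.com is a placeholder; the msnbot user-agent string is the one quoted elsewhere in this thread.

```python
from urllib.robotparser import RobotFileParser

# Assumed robots.txt contents: shut out msnbot, leave everyone else alone.
ROBOTS_TXT = """\
User-agent: msnbot
Disallow: /
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

MSNBOT_UA = "msnbot/1.0 (+http://search.msn.com/msnbot.htm)"
PAGE = "http://example.com/index.html"  # placeholder URL

msn_allowed = parser.can_fetch(MSNBOT_UA, PAGE)
google_allowed = parser.can_fetch("Googlebot", PAGE)
```

If enough Apache admins do this and few IIS admins do, the Apache share in MSN's index would drop without any ranking bias at all.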
Couldn't you infer that the results of this study show that Google favors Apache? They are in competition with Microsoft after all. I am sure that both search engines are using different algorithms, so why assume that Google's ranking is truly "correct"? Why not study a greater number of search engines and see if all show a bias one way or the other?
Mod Me down!
I hadn't read his F'ing Link and thought 1000 were each individual webservers for a particular search term.
He meant 1000 different searches, which is a sensible way to do it.
His stats may be fine.
My site is a webcomic/rant site that uses some fairly... colourful language. Apparently MSN search has a porn filter, so we get all sorts of traffic from porn searches that didn't turn up any porn. In fact, MSN search is our number 2 referer, beating out every marketing campaign we've ever done.
If you search MSN for things like "anal fucker", "hardcore sites", or "why is leah remini fat now" there's a good chance UAC will be right there on the first page. And our site is PHP and Apache all the way.
Bite the hand.
Try typing "online music".
On Google the top two references are iTunes and iTMS. On MSN you'll have to go through a few pages before you'll see anything about iTunes.
Yeah, I trust Microsoft to provide unbiased search results. Sure I do.
m.m.
Yeah the trouble is that the person who submitted the story linked to the old "original" result set i.e. the "/orig/" in the url rather than the more complete more recent results at: http://www.ivor.it/goog
I guess the "MSN against Google" report is more attention grabbing.
Along your line of hardware and services, the oems work for them [thus they do not need their own hardware division], and they do [or have a "partner" that does] all of the consulting/services you could ask for - with one caveat, you must buy their products.
You are forgetting a couple of things. While your arguments are indeed valid, MS will continue to exist due to their insulation from the Karmic Wheel by HUGE PILES OF CASH. So, even if everyone said "fuck MS, I will not give them another dime, I'm moving to Linux" MS will dump a small portion of their HUGE PILE of cash into something that will generate revenue. Even if they did nothing viable, it would take a long time to deplete their cash stash. I believe they would even make a MS/Linux before we saw their demise. If MS got into the linux game, I believe it *could* hurt many of the distros out now. Could you imagine a linux kernel wrapped in ms proprietary bs? Then they would have most of the advantages of linux [aside from being open] and the advantages of windows [manufacturer support]. Yeah, it would hurt at first [kinda like a skinned knee] but they could get right back into the game. They may be taken down a few pegs, but you are NOT going to see MS die anytime soon.
ymmv
Thanks for the link to the original. However, now I'm even more convinced it's nothing! Look at the variation between the four engines: the MSN results actually don't stand out, even though they are the lowest for Apache. For example, there is more difference between Google and Teoma than between Google and MSN. So, are we going to accuse the other search engines of manipulation, too? They exhibit the same level of variation from the apparently unquestionable Google reference.