Inspecting MSN Search
ins0maniac writes "I compared Yahoo, Google and MSN's image search. I noticed that MSN's search had images from only a few sites. I searched for the keywords britney spears, randomly checked a few pages up to page number 20, and found that the 400 images came from only 3 domains :| 5in9.com, celebritypicturesarchive.com and nabou.com. This is totally weird, as it doesn't seem like a search engine but a collection of a few online galleries." There are a number of other interesting notes in the entry about the new search engine. Also, Britney.
I already have all 400 of those.
This is a standard Microsoft tactic. It shouldn't surprise anyone.
1. Launch a web site in a particular genre but don't actually have any real functionality
2. Distribute a press release
3. PROFIT!!
I'm a big tall mofo.
... that's why I love science. You can find the best reasons to do the weirdest things ... ;-)
..Is a revenue stream. The galleries in question probably pay for dominance. Yeah, this seems contrary to a full free search, but at least the results are on subject.
The real task, it would seem, would be to find a way to have the engine return the proper pictures for the proper searches (so typing in Daddy's birthday doesn't result in pictures of some 50 something dude banging some barely legal chick with a party hat on.)
Stuff like that.
So, did you turn it off before the search? I did.
Other searches don't appear to be similar. I'm guessing that perhaps these companies have paid for higher placement on the example used in the article?
I searched for "britney spears nude goat dildo sparcstation" and didn't find a single thing.
I'm going to have to perform this experiment myself.
In the interest of the truth, you know.
Watch the Teaser Trailer for "The Lightning Thief" Her
But what does "Also Britney" mean ?
It means that Rob's little head is less articulate than his big one.
we /.-ers certainly hate her music but apparently she's not as painful to look at. :D
msn search may not be as good as google/yahoo, but the prominent-cleavages-to-image-number ratio is quite high for all three search engines. who's complaining? :P
Yeah, but how do you turn off the damn safe search in MSN... :-)
discussions that- if google put adwords on the image search results, they were potentially crossing the line of using copyrighted works without permission- to turn a profit - perhaps MSN is only image searching/displaying where they have been given permission to display copyrighted images...
every day http://en.wikipedia.org/wiki/Special:Random
http://mirrordot.com/stories/5defdb2c0e9cac7c89624a2594f96717/index.html
mirrordot doesn't seem to have archived all the images yet though...
For research, I checked out some of those pictures returned by the Britney search.
Many of the thumbnails displayed aren't the same picture that's retrieved when you click on the link. So, their cache must be outdated already. When I'm browsing thumbnails, I expect...no I demand...my search engine to return the appropriate photos!
-Barkeep, a draft of your most hazardous brew, for the world is slowly stepping into focus, and I don't like what I see.
I don't really expect anything from MSN search at this point, it will require some major fine-tuning to become really powerful.
On the other hand, I don't expect any reviews of MSN search to be any good so early on either. Simply because, if you're a googler or some other search engine user, you like what that one offers for a reason; switching is hard.
I'm no MS supporter, but do you think this might be because the new search engine has been crawling the web for a fraction of the length of time Yahoo and Google have been crawling the web?
"Also, Britney" is an indirect reference to the amusement one feels due to the relevance of Britney Spears in this story. It also serves as a masculine form of code-speech in the form of silence, the silent element which follows "Also, Britney" best interpreted as "You know what I mean? She's hot, right?"
I think I'll stop here.
...and the very first link on the page (under "sponsored sites") is:
www.microsoft.com
Windows outperforms Linux: Industry case studies and test lab results provide insight into the advantages of the Microsoft®...
-CausticPuppy "Of all the people I know, you're certainly one of them." -Somebody I don't know
The original article has been /.'ed already, but there's a cogent point to be made:
Unless the images are titled, tagged, annotated, etc., there's no good way to index them.
If I just throw a bunch of images up on a web site, there's no good technology, other than some pretty advanced facial recognition stuff, that can determine who, or what, a particular picture represents.
Change the resolution, color depth, etc. and I change the checksum for the image, so the index fails to recognize that one picture is the "same" as another, just resized, etc.
I see a lot of that on Google's image search - but can't find a way around it, either.
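The checksum problem described above can be illustrated with a small sketch. This is not how Google or MSN index images (nobody in this thread knows that); it only contrasts an exact checksum with a simple "average hash", a common perceptual-hashing trick swapped in here for illustration. The toy grayscale image and the helper names (checksum, shrink, average_hash, upscale) are all hypothetical.

```python
import hashlib

def checksum(pixels):
    """MD5 over the raw pixel bytes: any change in size changes the digest."""
    data = bytes(v for row in pixels for v in row)
    return hashlib.md5(data).hexdigest()

def shrink(pixels, size=8):
    """Box-average a square grayscale image down to size x size."""
    block = len(pixels) // size
    out = []
    for by in range(size):
        row = []
        for bx in range(size):
            total = 0
            for y in range(by * block, (by + 1) * block):
                for x in range(bx * block, (bx + 1) * block):
                    total += pixels[y][x]
            row.append(total / (block * block))
        out.append(row)
    return out

def average_hash(pixels, size=8):
    """One bit per cell of the shrunken image: 1 if brighter than the mean."""
    flat = [v for row in shrink(pixels, size) for v in row]
    mean = sum(flat) / len(flat)
    return tuple(1 if v > mean else 0 for v in flat)

def upscale(pixels, factor):
    """Nearest-neighbour enlargement, standing in for a 'resized' copy."""
    return [[v for v in row for _ in range(factor)]
            for row in pixels for _ in range(factor)]

# Toy 16x16 image and a copy enlarged to 32x32.
original = [[(x * 7 + y * 13) % 256 for x in range(16)] for y in range(16)]
bigger = upscale(original, 2)

print(checksum(original) == checksum(bigger))          # exact-match comparison
print(average_hash(original) == average_hash(bigger))  # perceptual comparison
```

With this toy data the checksums differ while the average hashes agree. Real resizing filters and recompression would make the hashes only approximately equal, so a real index would more likely compare them by Hamming distance than by equality.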
...to the search, and turn SafeSearch off, then MSN gives you a whopping 12 results! Hmm...perhaps MSN is trying to censor the net, even when we set the preferences not to.
Clicking most of the Britney pictures from celebritypicturesarchive.com displays a completely different picture than the one you clicked (still a picture of britney, though).
:-/
Sig Nature
Britney? Come on, she's so last season, did you even see the Google.com zeitgeist? It's all about runner up Paris Hilton this year. lol.
Ubuntu, the way linux should be.
Try Ubuntu FREE! --
this is contrary to google image search where it's not simply searching for filenames. google search seems to understand that images of britney spears need not have "britney" and "spears" in the filename.
The MSN Search right now is too new to get an accurate reading on how it is going to ultimately perform.
Google has been around for years spidering sites where MSN Search has only been around for a few months.
The real test is going to be a year from now, when it's had more than enough time to spider a good portion of the web. Even Google's search paled in comparison to Altavista at first until at least 6 months passed. After a year passed its searches were much better since a good portion of the web was spidered by it.
At this point in the game, it would have to be an absolutely amazing site to take Google out, and I don't think MSN Search is the site that's going to do it.
In Soviet Russia, Trojan exploits YOU!
http://search.msn.com/images/results.aspx?q=kelly+ripa+camel+toe&srch_type=2&FORM=QBIN
I'm sorry, but this is where I draw the line-- it's completely unusable
what about "natalie portman naked petrified hot grits"?
Way to put a spin on what you were actually doing. Inspecting? Sure...
One man's Funny is another man's Offtopic.
Would you rather the author did an image search on RMS?
My beliefs do not require that you agree with them.
It was MSN Search you were inspecting?
Searching for 'bill gates' in MSN returns the page Bill Gates As Mabus. Apparently this project is dedicated to finding the human manifestation of the anti-Christ.
None of the first 10 results (searching from the uk) return his homepage.
Searching with Google turns up Bill Gates' Web Site - Home Page.
Which means: Stick to Google.
Treo + Kaffi = Traffi
MSN Search
... Last Updated: Monday, January 31, 2005 - 12:00 A.M. Pacific Time Manage
while the first few results are still remotely related to what I expected (sex offender registries, sex - by teens for teens), the ninth link is cool:
Microsoft Corporation
The entry page to Microsoft's Web site. Find software, solutions, answers, support, and Microsoft
* www.microsoft.net
I'm amazed how stupid and desperate these guys must be.
Assuming you are writing from a warm room, somewhere near sea level, you have a density of approximately 1g/cm^3. So yes, you are relatively dense.
To Turn off SafeSearch:
E HP
goto:
http://search.msn.com
click settings:
[Which will bring you to:]
http://search.msn.com/settings.aspx?ru=%2f&FORM=S
Try not to get confused and think you're using google...
On the third section from the top click "off"
You'll find the "Save" button in the lower right hand corner if you scroll down.
I was going to read through the source code and post a GET link which would turn it off for you... but I'm not about to read through that code at 8:45 in the morning. Sorry, folks.
PS.. I notice there are different language settings.. do you suppose MS will offer translation services?
first two things i searched on the new engine, betterontoast and seriouslogic, didn't come up with their official dns registered pages. www.seriouslogic.com and www.betterontoast.com (these are real).
I found this rather strange and brings into question the size of the msn cache.
This guy's nameservers are down. It's not that the webserver is down; you can browse it by the IP address listed in his whois information. It's that the webserver has a default Apache start page as its default and his domain as a vhost, but none of his nameservers are up to resolve requests for his domain.
I'm amazed not only that so many posts were made "about" the story from various diagonal points of view, but without anyone actually browsing his site. It's even more interesting that his story got posted at all without the referenced content being reachable. I read a great story once at a web site that's no longer up; maybe I should post it!
-j
The current barrier to entering the market for search engines is low. The technology is relatively simple as the multitude of search-engine companies will attest.
The advantage that M$ has, over Google, is its huge R&D budget. M$ labs is the modern-day equivalent of the venerable Bell Laboratories, which is shriveling under the management of Lucent. M$ has plucked numerous professors from the computer science departments at top universities by offering incredibly high salaries.
"I personally dont think that Britney would have more than 10k pictures online (or may be offline too?)" (from the review)
At last count I have 2,347 pictures of Britney Spears.
And I don't even like her music...:-)
I wonder how many photos of the Corrs MSN can find...I've got 2,080 photos of them...
How about 1,228 of Salma Hayek?
1,406 of Angelina Jolie?
1,083 of Carmen Electra?
24 of Chelsea Clinton? Waitaminnit, WTF?
Richard Steven Hack - This sig is TOO GODDAMN SHORT TO DO ANYTHING USEFUL WITH! MORONS!
Try to submit feedback. It does not go through. I tried to send MSN feedback that there is no way to get back to the main page after searching. It pops up a window containing the submission information and gives up. Btw, I use Firefox and didn't bother to check whether the same happens in IE.
Senthil
An error occurred while attempting to run a script on this page.
http://www.msn.com/ line 149:
TypeError: Attempt at calling a function that expects a HTMLDocument on a Window.
Je fume. Tu fumes. Nous fûmes!
Just did a quick comparison of search.msn.com, google.com and www.yahoo.com. Here are my results:
Search term: microsoft sucks
Google: results about 862,000
Yahoo: results about 762,000
MSN: results about 1,856,364
There's a joke in there somewhere dying to get out.
I find it unusable; the local weighting is so strong that you get local results regardless of the language you're searching with or what you're searching for.
.NET and you don't get a Microsoft site, you get nothing but French pages in France. Switch to English language setting and you get English pages for French sites.
.NET comes up as this in Spain:
/ index.html
Search
Connect through a Spanish ISP and you get Spanish ones.
For example
1. www.lobocom.es/regdom.html
2. www.clikear.com
3. www.clikear.com/aspnet
4. mexico.clikear.com
5. www.empleo.net
6. www.ciberaula.com/curso/masteraspnet
7. www.ciberaula.com/curso/puntonet
8. venezuela.clikear.com
9. ecuador.clikear.com
10. www.filipenses.net
No mention of Microsoft and 2,3,4,8,9 are all parts of the same site!
Search for "Marseille by Night", Google gives you sites about the nightlife of Marseille. MSN gives me:
1. www.maketon.com/conciertos.php
A concert by a DJ called "Dj Jack de Marseille"
2. www.fyl.uva.es/~wgeolid/fajg/solidario/forosocial
About volunteers working NIGHT and day and having a meeting in MARSEILLE.
3. www.infoconciertos.com
Again it mentions a concert in MARSEILLE and a DJ playing at Gomma NIGHT.
It's all complete crap, absolute crap; it's not turfing to say it, these results suck big time. They are not even in the same league as Google, Yahoo and the rest.
What a weird result. I don't think it's a "sponsored link" effect, though. It looks, instead, like the ordering algorithm clusters sites, so that sites with lots of pictures of Spears show up near the front, and sites with fewer images show up later. If you hack the query to look at pages around 200, you find many more sources on each page.
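If the ordering really does cluster by site, as guessed above, the effect would look something like this sketch. The rule used here (the site with the most images goes first) is only the hypothesis from this comment, not documented MSN behavior, and the function name and example URLs are made up.

```python
from collections import Counter
from urllib.parse import urlparse

def cluster_by_domain(urls):
    """Order results so that domains with the most hits come first.

    Hypothetical ordering rule: sort by domain size (largest first), then by
    where the domain first appeared, then by original position.
    """
    domains = [urlparse(u).netloc for u in urls]
    counts = Counter(domains)
    first_seen = {}
    for i, d in enumerate(domains):
        first_seen.setdefault(d, i)
    order = sorted(
        range(len(urls)),
        key=lambda i: (-counts[domains[i]], first_seen[domains[i]], i),
    )
    return [urls[i] for i in order]

# Made-up result list with three sites of different sizes.
results = [
    "http://a.example/1.jpg",
    "http://b.example/1.jpg",
    "http://a.example/2.jpg",
    "http://c.example/1.jpg",
    "http://a.example/3.jpg",
    "http://b.example/2.jpg",
]
for url in cluster_by_domain(results):
    print(url)
```

With a few hundred images on each of the largest sites, an ordering like this would fill the first 20 pages with a handful of domains, which matches what the submitter saw.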
Yeah, but if you take out the 'kelly ripa' and just search under 'camel toe' you do get some hits. The eerie thing is that Britney Spears somehow manages to make it into the first page of these search results too!
Though, as if to prove your original point, adding 'britney' to that search also gets no results.
P.S. You'll almost certainly like cameltoe.bolt.com
Signatures are a waste of bandwi (buffering...)
Better wait to look at this one when you are at home.
In Republican America phones tap you.
Microsoft just indexes the search results from the Google and Yahoo image searches. It then leaves them with about 200,000 results to dig up on their own, which they can get from their sponsors.
I'm not really being serious... or AM I!!??
Please stop stalking me, bro.
That just isn't the best example for research, since this term is very competitive, so the results are bound to be heavily manipulated by search engine optimizers. One cannot draw any meaningful conclusions from such "research".
Those of us who've been around a while know the well-worn pattern:
(1) MS sits on arse for years doing no innovation while another company produces an innovative, excellent, useful product and spends several years refining it and making it even better
(2) Start to take notice as another company starts to get a lot of limelight in some mainstream market "space" it never occurred to you to enter
(3) Announce intention to compete.
(4) Spend the next couple of years with half-hearted attempt to play catch-up, producing a mediocre equivalent that's not really even terribly good. After a few hit-and-miss betas, announce "version 1" with much fanfare and lots of fawning press releases, with a product that basically brings customers what was already available five years ago from the innovative competitor, blatantly copied down to detailed elements of the user interface but it 'feels' like 'just a poor clone'
(5) Spend another couple of years watching in frustration at low adoption rates of your product. Slowly improve product until it meets a "good enough" standard (still not as good as competitors, but "good enough"), and then ...
(6) ... shove it down customers' throats by abusing desktop OS monopoly: Integrate own product into the next version of Windows so tightly that people almost have to use it, e.g. put MSN search box right into taskbar thus making it far less convenient to use other search engines.
(7) Gain market share rapidly. Fawning press hails you as a great innovator. Ten years later, everyone thinks you practically pioneered Internet searching.
Will it work this time? Probably.
Mark my words, Longhorn will have an MSN search box built into the taskbar.
Google vs MSN
Bubble sort
G - A useful site
M - A portal
Abiword crash copying
G - the correct site with the bug report
M - no helpful site
The cure
G - www.thecure.com
M - msn music & ringtone site
All in all MSN search looks nice but performs disappointingly.
Btw, how to change it so it will search in international sites first instead of just dutch sites?
Found 1,474 images...
Including some I never saw before...
And yes, a lot of the images link to 404 pages, but I've seen that on Google, too.
Richard Steven Hack - This sig is TOO GODDAMN SHORT TO DO ANYTHING USEFUL WITH! MORONS!
I don't think it does fuzzy searching as well.
Looking through various search queries in Google and MSN, I noticed that Google finds images in pages that barely mention the keyword (and does it accurately). With MSN, on the other hand, the pages have more references to the keyword I am searching for.
I'm not sure why this is; I guess it's just the algorithm they use to index.
I'm betting MSN will improve a bit; it takes a while to index the net to the level that Google did. It takes a long while.
I'll be saving a few queries and comparing them over time to see how they change. I'm guessing in 6 months the query results will be fairly different.
Perhaps this guy didn't know the default setting on MSN search is to group results by domain. Maybe he should try again with this setting off if he wants to see a greater variety of domains providing Britney pics.
Remember... ZG9uJ3QgZm9yZ2V0IHRvIGRyaW5rIHlvdXIgb3ZhbHRpbmU=
MSN
and check out the 6th picture.
google
This time its the first
Right - not a problem...
Richard Steven Hack - This sig is TOO GODDAMN SHORT TO DO ANYTHING USEFUL WITH! MORONS!
Many pictures include this sort of search-rich information, either from the camera or added manually, using cataloging software. Google's Picasa 2 freeware (Windows only) embeds its key words just so. Microsoft Research's excellent freeware (Windows only) World-Wide Media eXchange tools do the same for geo-coding photos. There are numerous other tools that can do the same, leading to a significant set of internally 'tagged' material.
So, why aren't the search engines taking advantage of this? They're already loading the images and creating thumbnails; how much extra work is it to extract any additional information in the file and use that in their indexing too, especially compared to the potentially increased accuracy?
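As a rough idea of how little work the extraction step might be, here is a minimal sketch that walks the marker segments at the start of a JPEG and collects the ones that usually carry metadata: APP1 (Exif/XMP), APP13 (Photoshop/IPTC) and COM (plain comment). It only locates the raw payloads; decoding the Exif or IPTC structures is a separate job. The function name and the synthetic test bytes are assumptions for illustration, not anything a search engine is known to run.

```python
import struct

# JPEG marker bytes that commonly hold embedded metadata.
METADATA_MARKERS = {0xE1: "APP1", 0xED: "APP13", 0xFE: "COM"}
START_OF_SCAN = 0xDA

def jpeg_metadata_segments(data):
    """Return (segment_name, payload) pairs found before the image data."""
    if data[:2] != b"\xff\xd8":
        raise ValueError("not a JPEG stream")
    found = []
    pos = 2
    while pos + 4 <= len(data):
        if data[pos] != 0xFF:
            break  # malformed or unexpected data: stop scanning
        marker = data[pos + 1]
        if marker == START_OF_SCAN:
            break  # compressed image data follows, no more headers
        # The 2-byte big-endian length counts itself but not the marker.
        (length,) = struct.unpack(">H", data[pos + 2:pos + 4])
        payload = data[pos + 4:pos + 2 + length]
        if marker in METADATA_MARKERS:
            found.append((METADATA_MARKERS[marker], payload))
        pos += 2 + length
    return found

# Synthetic stream: start-of-image, one comment segment, start-of-scan.
comment = b"britney spears"
fake_jpeg = (b"\xff\xd8" + b"\xff\xfe"
             + struct.pack(">H", 2 + len(comment)) + comment + b"\xff\xda")
print(jpeg_metadata_segments(fake_jpeg))
```

An indexer that already downloads each file for thumbnailing could run a pass like this at the same time and feed any keywords it finds into the text index.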
I don't read ACs: If a post isn't worth so much as a nom de plume to its author then I wont bother either.
I just want to speculate about the technology both search engines use, especially for image search. When you search for a picture using, for example, 'monkey' as the search key, you will see pictures whose names do not include 'monkey' even on the first page of Google's results. That's because Google uses context references for images *better*. But with MSN you won't see any pictures in those so-called first 20 pages that do not have 'monkey' in the file name.
So basically the MSN search engine gives high priority to files with the search key in the name. That's why it keeps showing pictures from the same domain until the names containing britney spears run out.
MSN search might be a good service for MSN users, but it's nowhere near racing with Google in either search index or technology. But you know MS and its strategy: IE was a silly program that didn't even have 10% market share in its first year. Thanks to MS and its monopoly strategy it's now 90%, and we've got a web full of garbage.
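The difference speculated about above, matching on the file name only versus also using the surrounding page text, can be sketched in a few lines. Neither scoring function is claimed to be what MSN or Google actually does; the field names (filename, page_text) and the sample records are invented for the example.

```python
def term_hits(query, text):
    """Count how many query terms occur somewhere in the given text."""
    text = text.lower()
    return sum(term in text for term in query.lower().split())

def filename_score(query, image):
    """Filename-only matching, as the comment guesses MSN does."""
    return term_hits(query, image["filename"])

def context_score(query, image):
    """Filename plus surrounding page text, as the comment guesses Google does."""
    return term_hits(query, image["filename"] + " " + image["page_text"])

# Two invented image records: one well named, one only described by its page.
images = [
    {"filename": "monkey01.jpg", "page_text": "photo gallery"},
    {"filename": "img_4432.jpg", "page_text": "a capuchin monkey at the zoo"},
]

by_filename = [img["filename"] for img in images
               if filename_score("monkey", img) > 0]
by_context = [img["filename"] for img in images
              if context_score("monkey", img) > 0]
print(by_filename)
print(by_context)
```

Under filename-only scoring the second picture never surfaces, however relevant its page is, which would also favor big galleries that name every file after the celebrity.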
Why don't you try this one for a change..
href="results.aspx?q=britney+spears&format=rss&FORM=ZZRE"
Anyone got any idea why they'd have an RSS feed at the bottom of the page? I'm sure there is one, but I'm having a hard time coming up with a practical reason for monitoring the content of a search engine result.
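One plausible use, offered as a guess rather than as Microsoft's stated reason, is getting notified when a new result appears for a saved query. A minimal sketch of that idea, assuming the feed is ordinary RSS 2.0 with item and link elements (the actual MSN feed format is not shown in this thread, and the sample feeds below are invented):

```python
import xml.etree.ElementTree as ET

def item_links(rss_text):
    """Return the <link> of every <item> in an RSS 2.0 document."""
    root = ET.fromstring(rss_text)
    return [item.findtext("link") for item in root.iter("item")]

def new_links(previous_rss, current_rss):
    """Links present in the current feed but not in the previous poll."""
    seen = set(item_links(previous_rss))
    return [link for link in item_links(current_rss) if link not in seen]

# Two invented polls of the same search feed.
yesterday = (
    "<rss version='2.0'><channel>"
    "<item><link>http://a.example/1</link></item>"
    "</channel></rss>"
)
today = (
    "<rss version='2.0'><channel>"
    "<item><link>http://a.example/1</link></item>"
    "<item><link>http://a.example/2</link></item>"
    "</channel></rss>"
)
print(new_links(yesterday, today))
```

A feed reader does essentially this on every refresh, so subscribing to a query would act as a crude alert for new pages mentioning it.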
My blog, "Kitchen Sink Gazette", http://sundroid.blogspot.com/ has a link to a site where you'll find a photo of Bill Gates and the entire MSN Search Team and a list of news articles about MSN Search.
Sun and Fun
I hope MS doesn't try and copyright "SafeSearch" and make google change the language in their settings. Conversely, I wonder if Google has it copywrit.
Also, anyone else notice the "Location" setting on MSN? I wonder how this is affecting our "scientific research"
Personally I'm just pleased I'm 1st for 'fuck microsoft' :)
This just looks like a bug, plain and simple. If you go to settings, there is an option to group images from the same site, checked by default, but unchecking it has no effect, so if one site, as in this case, has A LOT of images, it's going to be a long way before you get to the next site. Which you can do pretty easily.
Everything about this article is just based on one dumb-luck search, and not a lot else it seems. Sure it's Microsoft, so it's easy to get all het up, whereas if Google made the same mistake, everyone would be much more likely to try to figure out what the real deal was.
I saw the light at the end of the tunnel... But it was just someone with a flashlight bringing more work.
Take it easy, my fine friend. I have no time to lose using MS products; I simply don't need them. If MS wants to improve its products, it had better pay for the testing. When I decide what to beta test for free, I choose open source projects.
Reminds me of a joke.
A scientist is conducting experiments on cockroach behavior. On the first day he cuts off one leg of a cockroach and shouts "walk". The cockroach is able to limp along. On the second day he cuts off the second leg and shouts "walk". The cockroach is still able to move around. On the third day he cuts off the third leg and shouts "walk". The cockroach tries hard to move and manages it. On the fourth day he cuts off the last leg and shouts "walk", and obviously the cockroach is unable to move. The conclusion: when you cut off all four legs of a cockroach, the cockroach goes deaf!!!
All of the sponsored sites are Overture links. Do the same search at Overture and compare the results. Also, compare the search tool bar itself at yahoo.com with msn.com. Built from the ground up, my ass. http://slashdot.org/article.pl?sid=05/02/01/1259235&tid=109
I only need one search engine! www.sexloupe.com
Wheeeeeeee
Just to play, I did a search for my web site "homelessirc"; it had about 4 pages' worth of hits. So I started looking through the 4 pages of hits and was rather disappointed to find that most of the results were duplicates from the same page or very, very closely related pages (the sort of hits Google would filter out automatically).
Then when I hit the third page I noticed that there were now magically 7 pages of results. So I looked back at the other pages, and their page counter jumps between 4, 5, 7 and back down to 4.
further grumblings about poorly copying google's home page....
Setting up two pages, one on a MS host and the other on a Linux host, I noticed that google will only find the linux webpage and the MSN search only finds the MS hosted webpage (created in MS publisher). To me, the results in the MSN search seem to provide results for servers/hosts using MS products. I wonder what the real truth is behind all this...
It's not the size of a msn's cache that matters, but how they use it...
--LordPixie
You may want to tweak this joke a little. You know a cockroach is an insect, right? How many legs does an insect have?
Hey Gaurav, you just /.'ed your own site. anyways i have a similar one too at http://deydas.com/archives/2005/02/01/the-all-new-msn-search-a-review/19/.
Compare apples to apples buddy :)
Search for Bill gates on the search.msn.com and www.google.com and the results are very similar.
So there's definitely something funny going on with these two-legged roaches that don't move...
Interested in a Flash-based MAME front end? Visit mame.danzbb.com
You know darn well they are gonna integrate msn search into Longhorn. If google pushes the hell out of firefox to gain the search bar on the desktop, then they stand a chance of staying in the game. They had better get going or they will be in the long line of monopoly victims.
Got Code?
Come on, I MUST KNOW!
Literally. I've seen it. I had a contract there. Racks and racks and racks of 2U dual processor boxes from Rackable Systems, as opposed to the countless HP boxes & Sun boxes. (Yes, Microsoft has a LOT of Sun hardware in production!)
:-)
I don't know if they run Linux, because we (the data center peeps) didn't have access to any of them, only the Rackable Systems people did.
They sure did have a lot of blinking lights.
In order for it to work, MSN would have to be the default search engine from the desktop.
Two problems - the first is that most people will just type "google" and reach the site that way.
The second is that Spyware will not let MSN stand as the default search engine, instead redirecting people to ad-laden search sites even from default search boxes in the browser.
I think it's the whole reason MS is fighting spyware as hard as they are (buying that company) because it's actually threatening an aspect of lockin.
"There is more worth loving than we have strength to love." - Brian Jay Stanley
I opened IE, clicked on the britney search link (ahem) and... the photo sites installed spyware! Unbelievable.
I don't know, but when the last Slashdot article about MSN Search came up, I tried it out just to see what the fuss was about. And one thing immediately struck me: the site's URL is too complicated. I mean, compare www.google.com to search.msn.com - there isn't even a 'www' in MSN's URL (and I don't mean that as funny, Joe Sixpack usually adds a 'www' to any URL thrown at him because 'that is how the internet works').
I guess it is the same with this as it is with "The Internet" (you know, the blue "e" on the Windows desktop launching IE) or "computers" (these TV like things where you can write letters with this Windows thing on it and a harddisk under the desk). "Google" today is already a synonym to searching on the web, and just by being superior in the results alone will not be enough to dethrone Google (and we are not even sure yet if MSN will be better).
Personally, I like the image portion of M$N Search and find it to be extremely accurate in finding relevant images (and it really is hard for me to admit that I like something that M$ has developed).
Not that my site is the center of the universe, but my logs tell me msn only just started looking at my pages a few days ago. That might have something to do with the incomplete results...
msn
and
google
the MSN result makes no sense compared to google's
so why get the puggin?
What would be really good is for them to capture the EXIF data in JPG files and index based on that or find a way to incorporate it. Yes, for a file named &!^@%#!!!!!.jpg to show up in a Britney Spears image search is ridiculous.
dustmite, you couldn't have said it better.
Both DNS servers for gsharma.com seem to be down at the moment, or at least not responding properly to DNS requests. Web site's still up, though -- if you're having problems, add the following to your hosts file:
207.58.133.70 www.gsharma.com
I tried searching for "draves" with the new
msn search. It gives rather different
results from all other search engines. My
brother works for microsoft and has a home
page hosted on their site. My home page is
draves.org.
Google, yahoo, altavista, dogpile, and
lycos all return my page as #1 and my
brother's page at microsoft as #4, #13, or
greater than #30. MSN returns my brother's
page as #1 and mine as #4.
What do you make of that (besides that i'm
an egoist)? I would conclude that MSN
search inflates the ranking of results from
microsoft.com.
Scott Draves
Internet Explorer and Office have never been superior to the competition nor will they ever be. They only dominate because Joe Average does not know about the superior alternatives.
It isn't very important, I know, but there are still some bugs in the results page. One I've noticed is the wrong numbering of the pages. If you search "abababa" you'll find 25 pages full of results, but it shows you 28. Try: http://search.msn.com/results.aspx?q=abababa&first=231&count=10&FORM=PERE4 and click on 28; you'll only go to 25.
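For reference, the page numbers can be worked out from the query string itself. The sketch below assumes that "first" is a 1-based index of the first result on the page and "count" is the page size, which is a guess from the URL above (1, 11, 21, ... 231), and it uses a made-up total of 250 results; the function name is hypothetical.

```python
from math import ceil
from urllib.parse import urlparse, parse_qs

def page_info(url, total_results):
    """Return (current_page, last_page) for a results URL.

    Assumes 'first' is the 1-based offset of the first hit on the page and
    'count' is the number of hits per page.
    """
    params = parse_qs(urlparse(url).query)
    first = int(params["first"][0])
    count = int(params["count"][0])
    current_page = (first - 1) // count + 1
    last_page = ceil(total_results / count)
    return current_page, last_page

url = ("http://search.msn.com/results.aspx"
       "?q=abababa&first=231&count=10&FORM=PERE4")
print(page_info(url, total_results=250))
```

If the pager shows 28 pages while only 25 can be reached, the displayed total is probably an estimate that is not being trimmed once the real result set runs out.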
As far as I'm concerned, Microsoft has never put out a successful piece of software. Every one of them has bitten me (and anyone who actually wants to get real work done) in one way and another with unfulfilled promises.
Money may be useful as a way to represent value, but only if everyone agrees to use it that way. Using money to trade value and using it to play power games are mutually contradictory activities.