Slashdot Mirror


Microsoft To Launch Homegrown Search Engine

Mr. Christmas Lights writes "While Google is currently the king of the hill in search engines, Microsoft continues to lag in market share and uses Yahoo's technology/results. But CNET reports that they'll launch their own homegrown search engine on Thursday, although it appears this is mostly a face-lift (despite a year of development and a $100 million investment). According to Bill Gates, they 'will introduce a homegrown web crawler and algorithmic search engine ... later this year,' which is almost certainly their tech preview (you can look at this now) -- but will that be ready for prime time in less than two months?"

11 of 300 comments

  1. this article is oooooold by evil_one666 · · Score: 4, Informative

    This article is from June 30th.

  2. Doubtful this will take any ground. by Ambient_Developer · · Score: 3, Informative

    I say it time and time again: Microsoft is not a company of innovation (besides user interfaces); it is a company that acquires other companies. It is doubtful that a home-grown engine will beat the likes of Google, especially this late in the game. And what good will a face-lift do? Google is already one of the easiest things out there, so how can Microsoft make search even easier? THAT is the 100 million dollar question!

  3. Almost there... by Luigi30 · · Score: 2, Informative

    They have 99 million matches for Linux. Google has 162 million.

    --
    503 Sig Unavailable

    The Signature could not be accessed. Please try again later or contact the administrator
  4. THE bot? by knipknap · · Score: 5, Informative

    I wonder whether that's the bot that has been scanning my website for three days by attempting to "crawl" through all session ids and causing more than 1 GByte of traffic.

    "msnbot/0.11 (+http://search.msn.com/msnbot.htm)"

    It could only be stopped by blocking the IP. (robots.txt was read only once, before it started.) Great, smart bot, really.

  5. Re:Netcraft says the hosting servers run on Linux by alib001 · · Score: 2, Informative

    Mostly wrong.

    ...the DNS directs us to a server operated by Akamai... Akamai's http caching servers run Linux, and so we report Linux as the operating system. However Akamai also forwards the http Server: header from the original server as part of the cached content, and so we report "Microsoft-IIS/6.0" as the web server.

  6. Re:About time by FireFury03 · · Score: 4, Informative

    It pays attention to robots.txt directives (finally, a small amount of standards compliance!)

  7. Re:About time by AndroidCat · · Score: 3, Informative

    It looks like it checks for meta tags too. (Useful when /robots.txt isn't convenient.) MSNBot page and other info. Also note that it only checks /robots.txt once a day, so policy changes might not take effect right away.

    --
    One line blog. I hear that they're called Twitters now.
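
A minimal sketch of the meta-tag approach mentioned in the comment above, assuming the standard `<meta name="robots" content="...">` convention. It shows how a crawler might read per-page directives when /robots.txt is not an option. The class name, function name, and sample page are hypothetical, and nothing here is taken from MSNBot itself.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects the directives found in <meta name="robots"> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        # Attribute names arrive lowercased; values keep their original case.
        if (attr.get("name") or "").lower() != "robots":
            return
        content = attr.get("content") or ""
        for part in content.split(","):
            part = part.strip().lower()
            if part:
                self.directives.add(part)


def robots_directives(html):
    """Return the set of robots directives declared in an HTML page."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return parser.directives


# Hypothetical page used only for illustration.
PAGE = (
    '<html><head><meta name="robots" content="NOINDEX, nofollow">'
    "</head><body>hi</body></html>"
)

if __name__ == "__main__":
    print(sorted(robots_directives(PAGE)))
```

A well-behaved crawler would skip indexing when "noindex" is present and skip following links when "nofollow" is present.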
  8. Re:3 bad results. by BuilderBob · · Score: 3, Informative
    Orange. No results for Orange, the mobile phone company.
    Linux. No pointers to linux.org.
    Google. Returns the Dutch/Belgian version of the page. Why?

    These are no longer true. I know it used to do this but now ...

    'Orange' returns Orange.co.uk.
    'Linux' returns linux.org
    'google' returns google.com
    'microsoft sucks' returns fuckmicrosoft.com
    'abu graib' returns the photographs of inside the prison.
    'lindows' returns lindows.com

    This is from Firefox 0.8 on Red Hat Linux.

    BB

  9. Wow by Anonymous Coward · · Score: 1, Informative

    I had never used Yahoo search before. It looks *exactly* like Google, apart from the big Yahoo logo.

    MSN Search is really similar as well, from the way text ads are done to the font colours, etc. Complete rip-offs of Google.

  10. Re:About time by flithm · · Score: 2, Informative

    Actually, it says it pays attention to robots.txt, but my test results show that it does not behave as expected. After noticing how much bandwidth it was consuming, and that it wasn't following the rules I had already specified (which other crawlers obey nicely), I created a robots.txt based on the examples on their web site.

    Unconvinced? Here are some stats from my logs:

    MSNBot hits: 10217+77 bandwidth: 441.67 MB
    Googlebot hits: 116+90 bandwidth: 16.13 MB

    This is after the modifications of the robots.txt file, and this is only for a 2 week period in October. MSN bot was drawing nearly 1 gigaBYTE of upstream per month, just from my lowly site! No thank you... I promptly did this:

    iptables -A INPUT -p all -s 65.54.0.0/16 -j DROP

    I encourage all other webmasters to do the same.
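
For anyone who wants to produce per-crawler numbers like the ones in the comment above, here is a rough sketch that tallies hits and bytes per bot from an access log. It assumes the Apache "combined" log format. The function name, the 192.0.2.x addresses, and the sample lines are made up for illustration and are not the poster's data.

```python
import re

# Assumed Apache "combined" format:
# ip ident user [time] "request" status bytes "referer" "user-agent"
LOG_LINE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[[^\]]*\] "[^"]*" '
    r'(?P<status>\d{3}) (?P<size>\d+|-) "[^"]*" "(?P<agent>[^"]*)"'
)


def tally_by_bot(lines, bots=("msnbot", "googlebot")):
    """Return {bot: [hits, bytes]} for user agents containing each bot name."""
    totals = {bot: [0, 0] for bot in bots}
    for line in lines:
        match = LOG_LINE.match(line)
        if not match:
            continue  # skip lines that are not in the assumed format
        agent = match.group("agent").lower()
        size = 0 if match.group("size") == "-" else int(match.group("size"))
        for bot in bots:
            if bot in agent:
                totals[bot][0] += 1
                totals[bot][1] += size
                break
    return totals


# Made-up sample lines for illustration only.
SAMPLE = [
    '65.54.1.2 - - [10/Oct/2004:13:55:36 -0700] "GET /a?sid=1 HTTP/1.0" 200 2048 '
    '"-" "msnbot/0.11 (+http://search.msn.com/msnbot.htm)"',
    '65.54.1.2 - - [10/Oct/2004:13:55:37 -0700] "GET /a?sid=2 HTTP/1.0" 200 1024 '
    '"-" "msnbot/0.11 (+http://search.msn.com/msnbot.htm)"',
    '192.0.2.10 - - [10/Oct/2004:14:00:00 -0700] "GET / HTTP/1.0" 200 512 '
    '"-" "Googlebot/2.1 (+http://www.google.com/bot.html)"',
    '192.0.2.20 - - [10/Oct/2004:14:05:00 -0700] "GET / HTTP/1.1" 304 - '
    '"-" "Mozilla/5.0 (X11; Linux i686)"',
]

if __name__ == "__main__":
    for bot, (hits, size) in tally_by_bot(SAMPLE).items():
        print(f"{bot}: hits={hits} bandwidth={size / 1024:.1f} KB")
```

Matching on the user-agent string is only a heuristic, since any client can claim to be a given bot; cross-checking the source address range is a reasonable follow-up.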

  11. Re:About time by FireFury03 · · Score: 4, Informative

    It paid attention for me with:

    User-agent: msnbot
    Disallow: /

    iptables -A INPUT -p all -s 65.54.0.0/16 -j DROP

    Or even better, if you have the TARPIT module:
    iptables -A INPUT -p tcp -s 65.54.0.0/16 -j TARPIT :)
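
To sanity-check what the robots.txt rule quoted in this comment should mean to a compliant crawler, here is a small sketch using Python's standard `urllib.robotparser`. The helper name `is_allowed` and the example.com URL are placeholders. This only shows how a standards-following parser reads the rule, not how MSNBot actually behaves.

```python
from urllib.robotparser import RobotFileParser

# The two-line rule quoted in the comment above.
ROBOTS_TXT = """\
User-agent: msnbot
Disallow: /
"""


def is_allowed(robots_txt, user_agent, url):
    """Return True if user_agent may fetch url under the given robots.txt text."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # example.com is a placeholder host.
    for agent in ("msnbot", "Googlebot"):
        verdict = is_allowed(ROBOTS_TXT, agent, "http://example.com/page.html")
        print(agent, "allowed" if verdict else "blocked")
```

With only an msnbot section in the file, other crawlers have no matching rule and are allowed by default.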