Slashdot Mirror


Will Google Become Another Netscape?

kaluta asks: "The Economist has a typically clear and concise story about bringing Google to the stockmarket. Basically, is it going to be the next eBay or Amazon, or will it 'simply be the next overhyped share sale to make its founders rich only to wither away miserably, either for lack of a sustainably profitable business model, or, like Netscape, because it finds itself in the path of that mighty wrecker, Microsoft?' Cool picture too."

2 of 299 comments

  1. Re:Room for improvement in Google by cioxx · · Score: 4, Informative
    1) Phrase Searches. Google still can't do them: if you want accuracy, you have to go elsewhere.

    I guess someone didn't bother to read the Google Manual before using it.

    Google has an excellent Phrase Search capability. You just need to put the phrase in "double quotes".
  2. Re:Be careful for what you wish for by Corgha · · Score: 5, Informative

    Can a hosting provider create a robots.txt file outside of my control?

    Well, yeah, they can do whatever the hell they want (though some things might alienate their customers). Keep in mind that your hosting provider could also just have firewalled away the Google crawlers. They can also try to block them by User-Agent, but I just checked and they don't appear to be doing the latter. From the looks of it, they're not that competent, anyway.

    re-checking now I see no such file exists

    That's not what your web server says, according to the HTTP protocol it claims to be following.

    When I request http://www.holocaustnow.org/robots.txt, I get a 302 redirect to http://64.202.166.210/index.html, which returns 200 but says "Page Not Found" in the text (it should return 404 if it means to say "Page Not Found").

    That is silly, non-standards-compliant behavior, and the resulting page is totally unparsable as a robots.txt file. Basically your web hosting provider is saying to the robot that robots.txt does exist, but it's over there, and it's a big blob of incomprehensible HTML.

    Now, of course, I don't know for sure, but I wouldn't be surprised if well-behaved robots (i.e. not grub) found this behavior to be confusing, and decided therefore not to index the site just to be safe and avoid stepping on any toes.
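
For anyone who wants to repeat the check described above, here is a rough sketch in Python. It requests a URL once without following redirects, then reports what a robot would see. The function names `fetch_no_redirect` and `diagnose_robots` and the User-Agent string are made up for illustration, and the HTML detection is a crude heuristic, not how any particular crawler decides.

```python
import http.client
from urllib.parse import urlsplit


def fetch_no_redirect(url, timeout=10):
    """Fetch url once; http.client never follows redirects on its own.

    Returns (status, headers, body) with header names lowercased.
    """
    parts = urlsplit(url)
    if parts.scheme == "https":
        conn = http.client.HTTPSConnection(parts.netloc, timeout=timeout)
    else:
        conn = http.client.HTTPConnection(parts.netloc, timeout=timeout)
    try:
        # The User-Agent value is an arbitrary placeholder.
        conn.request("GET", parts.path or "/",
                     headers={"User-Agent": "robots-check-sketch"})
        resp = conn.getresponse()
        body = resp.read().decode("utf-8", errors="replace")
        headers = {name.lower(): value for name, value in resp.getheaders()}
        return resp.status, headers, body
    finally:
        conn.close()


def diagnose_robots(status, headers, body):
    """Describe a robots.txt response the way a cautious robot might read it."""
    if 300 <= status < 400:
        return "redirect to " + headers.get("location", "(no Location header)")
    if status == 404:
        # Conventionally, a missing robots.txt means no restrictions.
        return "no robots.txt (404): nothing is disallowed"
    if status != 200:
        return "unexpected status %d" % status
    content_type = headers.get("content-type", "").lower()
    # Crude heuristic (an assumption): an HTML page is not a robots.txt file.
    if "html" in content_type or "<html" in body.lower():
        return "200 but body is HTML: not parsable as robots.txt"
    return "looks like a robots.txt file"
```

Calling `diagnose_robots(*fetch_no_redirect("http://www.holocaustnow.org/robots.txt"))` would show the first hop; feeding the redirect target through the same two functions shows the second. What comes back today may well differ from the 302 and 200 described in the comment.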