Will Google Become Another Netscape?

Posted by Cliff on Friday October 31, 2003 @10:16AM from the history-may-or-may-not-repeat-itself dept.

kaluta asks: "The Economist has a typically clear and concise story about bringing Google to the stockmarket. Basically, is it going to be the next eBay or Amazon, or will it 'simply be the next overhyped share sale to make its founders rich only to wither away miserably, either for lack of a sustainably profitable business model, or, like Netscape, because it finds itself in the path of that mighty wrecker, Microsoft?' Cool picture too."

Re:Room for improvement in Google by cioxx · 2003-10-31 11:07 · Score: 4, Informative

1) Phrase Searches. Google still can't do them: if you want accuracy, you have to go elsewhere.

I guess someone didn't bother to read the Google Manual before using it.

Google has an excellent Phrase Search capability. You just need to wrap the phrase in double quotes ("...").

Re:Be careful for what you wish for by Corgha · 2003-10-31 11:37 · Score: 5, Informative

Can a hosting provider create a robots.txt file outside of my control?

Well, yeah, they can do whatever the hell they want (though some things might alienate their customers). Keep in mind that your hosting provider could also have just firewalled away the Google crawlers. They could also try to block them by User-Agent, but I just checked and they don't appear to be doing the latter. From the looks of it, they're not that competent, anyway.

re-checking now I see no such file exists

That's not what your web server says, according to the HTTP protocol it claims to be following.

When I request http://www.holocaustnow.org/robots.txt, I get a 302 redirect to http://64.202.166.210/index.html, which returns 200 but says "Page Not Found" in the text (it should return 404 if it means to say "Page Not Found").

That is silly, non-standards-compliant behavior, and the resulting page is totally unparsable as a robots.txt file. Basically, your web hosting provider is telling the robot that robots.txt does exist, but it's over there, and it's a big blob of incomprehensible HTML.

Now, of course, I don't know for sure, but I wouldn't be surprised if well-behaved robots (i.e. not grub) found this behavior to be confusing, and decided therefore not to index the site just to be safe and avoid stepping on any toes.
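
To make the failure mode above concrete, here is a minimal Python sketch of how a cautious robot might classify what it gets back when it asks for robots.txt. Everything in it is an assumption for illustration: the Hop record, the function names, the "no-rules" / "parse" / "suspect" labels, and the text/html content type in the usage example are all made up, and this is not how Google's crawler (or any real robot) is known to work. It takes an already-recorded chain of responses, so it makes no network requests.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Hop:
    """One response seen while fetching /robots.txt (hypothetical record)."""
    url: str
    status: int
    content_type: str = ""
    body: str = ""


def looks_like_html(hop: Hop) -> bool:
    """Heuristic: does this response look like a web page, not a robots.txt?"""
    head = hop.body.lstrip()[:200].lower()
    return "html" in hop.content_type.lower() or head.startswith(("<!doctype", "<html"))


def classify_robots_fetch(chain: List[Hop]) -> str:
    """Classify the final response in a redirect chain.

    Returns one of three hypothetical labels:
      'no-rules': an honest 404, so there is no robots.txt to obey
      'parse':    a 200 that plausibly is a robots.txt file
      'suspect':  anything else, including a 200 that serves an HTML page
    """
    final = chain[-1]
    if final.status == 404:
        return "no-rules"
    if final.status == 200 and looks_like_html(final):
        # The server claims success but hands back a web page ("soft 404").
        return "suspect"
    if final.status == 200:
        return "parse"
    return "suspect"


# The chain described in this comment: a 302, then a 200 "Page Not Found" page.
# The content type and exact HTML body are assumed, not taken from the server.
observed = [
    Hop("http://www.holocaustnow.org/robots.txt", 302),
    Hop("http://64.202.166.210/index.html", 200, "text/html",
        "<html><body>Page Not Found</body></html>"),
]
verdict = classify_robots_fetch(observed)
```

With a policy like this, the chain above comes out as "suspect" rather than "no-rules", which is the distinction that matters: a real 404 would have told the robot there are no restrictions, while a 200 carrying an HTML error page leaves it guessing.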