Slashdot Mirror


Who Isn't Paying Attention to ROBOTS.TXT?

Kickstart asks: "After being hit hard for three hours by a very unfriendly spider and then wading through the Apache logs, I see that there appear to be real, legitimate search engines that do not follow robots.txt rules. Looking around, I see that some specialized search engines make no mention of their policy on this, nor do they say what servers their spiders come from. Does anyone have information on who follows this standard and who doesn't?"

4 of 85 comments

  1. Here is your problem: by Neil Blender · Score: 5, Funny

    All spiders are going to ignore your ROBOTS.TXT file. Instead, they look for a file called robots.txt.

    1. Re:Here is your problem: by AndroidCat · Score: 2, Funny

      What a lot of sites need is a slashdot.txt file.

      --
      One line blog. I hear that they're called Twitters now.
  2. Re:Spammers are bad (of course) by Dancing Primate · Score: 2, Funny
    You mean
    User-agent: *
    Disallow: /
    yes?
  3. Re:Spammers are bad (of course) by timothv · Score: 2, Funny

    HAHAHA! Apparently not. See his own robots.txt
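
For anyone who wants to see what the two-line rule quoted above (User-agent: * followed by Disallow: /) actually asks for, here is a minimal sketch of the check a well-behaved spider would run before fetching a page. It uses Python's standard urllib.robotparser; the helper name is_allowed, the bot name ExampleBot, and the example.com URL are made up for illustration. Note that this only shows what a compliant crawler does. Nothing in robots.txt can force a spider to obey it, which is the whole complaint in the question.

```python
from urllib.robotparser import RobotFileParser


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if robots_txt permits user_agent to fetch url.

    is_allowed is a hypothetical helper written for this example,
    not part of any crawler mentioned in the thread.
    """
    parser = RobotFileParser()
    # parse() takes the file's lines, so no network access is needed here.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# The rule quoted in the thread: every agent, whole site off limits.
BLOCK_ALL = """User-agent: *
Disallow: /
"""

# "ExampleBot" and example.com are placeholders, not a real spider or site.
print(is_allowed(BLOCK_ALL, "ExampleBot", "http://example.com/index.html"))  # False
print(is_allowed("", "ExampleBot", "http://example.com/index.html"))         # True
```

An empty file permits everything, so a site that wants no crawling at all has to say so explicitly. And, per the first comment, crawlers request the file by its lowercase name, /robots.txt, at the root of the site.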