Slashdot Mirror


White House Website Limits Iraq-Related Crawling

oscarcar writes "Dan Gillmor is reporting on the White House website's use of its robots.txt file to prevent search engines from crawling certain material. Many excluded items in the robots.txt file involve mentions of Iraq, possibly to prevent people from finding changes to past statements and information when archived elsewhere."

12 of 837 comments

  1. Funny by sulli · · Score: 5, Funny

    whitehouse.com doesn't have that problem.

    --

    sulli
    RTFJ.
    1. Re:Funny by sulli · · Score: 5, Funny
      Also, TRUE God loving Americans would absolutely love to see that filthy, degrading website taken down because of the damage it causes to children who go there on accident.

      I feel the same way about whitehouse.gov. Couldn't have said it better myself.

      --

      sulli
      RTFJ.
    2. Re:Funny by jovlinger · · Score: 5, Interesting

      true. true. Apparently some poor fool made similar remarks on k5 a while back, and did indeed receive a personal visit from the SS. No charges filed, but 'tis a rude awakening indeed when your online words come and knock on your door.

  2. upside by 514x0r · · Score: 5, Funny

    it's good to see the whitehouse embracing technology so much.

    --

    !(^((ri)|(mp))aa$)
  3. Re:Oh please by phritz · · Score: 5, Insightful
    Congratulations to simoniker, poster of the most inanely paranoid comment I have ever read here on slashdot. And that's saying something.

    I have to admit, when I first read the story I thought someone was being paranoid. But you really should RTF robots.txt file before you accuse the poster of being paranoid. The disallowed files are extraordinarily specific. I really can't come up with a plausible explanation beyond simoniker's.

  4. Truly Frightening. by Dlugar · · Score: 5, Funny

    Obviously, they're keeping people from accessing the top-secret teeball Iraq files! Besides:

    Disallow: /teeball/iraq/
    check out these other frightening examples of censorship:
    Disallow: /kids/spotty/iraq
    Disallow: /kids/eggroll/iraq
    Disallow: /kids/barney/iraq
    Disallow: /easter/iraq
    Disallow: /mrscheney/iraq
    Disallow: /national-anthem/iraq

    Truly frightening.
    --
    Computer Go: Writing Software to Play the Ancient Game of Go
  5. I, for one... by wardomon · · Score: 5, Funny

    welcome our White House Robot Overlords. It would be funnier if it weren't true.

    --

    - - - If the sun is a star, why can't I see it at night?
  6. Missing Iraq and 9.11 files by jjn1056 · · Score: 5, Informative

    Looks like they removed a bunch of files where they were making claims that Saddam was behind 9/11. One could be led to suspect that now that Bush got his war he doesn't need that lie anymore, and wants to erase all history of it since it undermines his authority.

    --
    Peace, or Not?
  7. Barney, agent provocateur of the CIA? You Decide by mykepredko · · Score: 5, Funny

    Downloading the "robots.txt" file and doing a quick ctrl-f on different words, I discovered that there are six instances of "Barney" coming up in the robots.txt:

    Disallow: /holiday/2002/barney/iraq
    Disallow: /holiday/2002/barney/text
    Disallow: /kids/barney/iraq
    Disallow: /kids/barney/text
    Disallow: /kids/photoessays/barney/iraq
    Disallow: /kids/photoessays/barney/text

    That is the same number as "cheney"; "powell" had 4, "saddam" didn't have any, and "bush" only comes up in "bushpets".

    Clearly, there is something to do with Barney and Iraq that The White House doesn't want you to know about.
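
    For anyone who would rather not ctrl-f by hand, the same tally can be sketched in a few lines of Python. This is only a rough illustration: the function name count_keyword is made up, and SAMPLE holds just the six Barney lines quoted above (plus a User-agent line I added so it looks like a real file), not the full White House robots.txt.

```python
def count_keyword(robots_text, keyword):
    """Count Disallow lines whose path contains keyword (case-insensitive)."""
    needle = keyword.lower()
    count = 0
    for line in robots_text.splitlines():
        # Split "Disallow: /path" into the field name and its value.
        field, _, value = line.partition(":")
        if field.strip().lower() == "disallow" and needle in value.lower():
            count += 1
    return count


# Only the lines quoted in the comment above; the User-agent line is an assumption.
SAMPLE = """\
User-agent: *
Disallow: /holiday/2002/barney/iraq
Disallow: /holiday/2002/barney/text
Disallow: /kids/barney/iraq
Disallow: /kids/barney/text
Disallow: /kids/photoessays/barney/iraq
Disallow: /kids/photoessays/barney/text
"""

if __name__ == "__main__":
    for word in ("barney", "saddam"):
        print(word, count_keyword(SAMPLE, word))
```

    Note that this counts matching lines rather than raw occurrences, which gives the same number as long as a word shows up at most once per path.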

    myke

  8. Re:Drawing farfetched conclusions by johnnyb · · Score: 5, Insightful

    It really doesn't look like it. It looks like someone screwed up, because none of those directories appear to exist at all. I mean really, what are the chances of /firstlady/photos/2003/01/iraq actually having at some time contained real data?

    It looks like someone did a

    find . -type d|perl -e 'while(<>){chomp; print "${_}/iraq\n"; print "${_}/text\n";}' > robots.txt

    I have no idea what the purpose would be, but it seems like a funny thing to do if you were trying to hide something.
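
    If you want to play with that guess without find and perl, here is a rough Python sketch of the same idea. It only illustrates the hypothesis above and is not how the White House file was actually produced; the function name disallow_lines and the "Disallow: " prefix on each line are my assumptions.

```python
import os


def disallow_lines(root):
    """For every directory under root, emit a '<dir>/iraq' and '<dir>/text' rule."""
    lines = []
    for dirpath, dirnames, _filenames in os.walk(root):
        dirnames.sort()  # make the traversal order deterministic
        rel = os.path.relpath(dirpath, root)
        # The root itself maps to the empty prefix, so its rules are /iraq and /text.
        prefix = "" if rel == "." else "/" + rel.replace(os.sep, "/")
        lines.append("Disallow: " + prefix + "/iraq")
        lines.append("Disallow: " + prefix + "/text")
    return lines


if __name__ == "__main__":
    # Walk the current directory, like "find . -type d" would.
    print("\n".join(disallow_lines(".")))
```

    Run against a document root, this produces an iraq and a text entry for every directory whether or not such a subdirectory exists, which would fit a file full of paths that 404.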

    By the way, who is going around looking at people's robots.txt files?

  9. Re:Other, arguably more reasonable explanations by greenhide · · Score: 5, Funny

    How can they be Iraq related if they didn't exist to begin with?

    A question that GW gets asked all the time. :-)

    --
    Karma: Chevy Kavalierma.
  10. Re:Other, arguably more reasonable explanations by EinarH · · Score: 5, Insightful
    Didn't think so, not a single one that I went to is a valid URL, and I highly doubt that they were valid to begin with.
    From
    http://www.bway.net/~keith/whrobots/disdirs.html
    Some of the directories that 404 truly are empty of files. For instance:
    http://www.whitehouse.gov/news/timeline/iraq

    doesn't have files.

    But at least some of the directories that 404 above do have files in them, just not an index file. For instance:

    http://www.whitehouse.gov/infocus/iraq/100days

    does not have an index page, so just entering that URL will give a 404.

    However, the directory has the following files in it:

    http://www.whitehouse.gov/infocus/iraq/100days/100days.pdf
    http://www.whitehouse.gov/infocus/iraq/100days/introduction.html
    http://www.whitehouse.gov/infocus/iraq/100days/part1.html
    http://www.whitehouse.gov/infocus/iraq/100days/part2.html
    http://www.whitehouse.gov/infocus/iraq/100days/part3.html
    http://www.whitehouse.gov/infocus/iraq/100days/part4.html
    http://www.whitehouse.gov/infocus/iraq/100days/part5.html
    http://www.whitehouse.gov/infocus/iraq/100days/part6.html
    http://www.whitehouse.gov/infocus/iraq/100days/part7.html
    http://www.whitehouse.gov/infocus/iraq/100days/part8.html
    http://www.whitehouse.gov/infocus/iraq/100days/part9.html
    http://www.whitehouse.gov/infocus/iraq/100days/part10.html

    All those files are excluded by the directory disallow entry in robots.txt

    And yes, these files *are* relevant.
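
    To see how a directory entry ends up covering every file under it, here is a small sketch using Python's standard urllib.robotparser. The RULES text is an assumption on my part: it contains only a single Disallow line for the 100days directory, written the way I would expect it to look, not a copy of the real file. The helper name is_allowed and the agent name "ExampleBot" are made up too.

```python
from urllib.robotparser import RobotFileParser

# Assumed rule for illustration only, not copied from the real robots.txt.
RULES = """\
User-agent: *
Disallow: /infocus/iraq/100days
"""


def is_allowed(rules, url, agent="ExampleBot"):
    """Return True if a crawler named agent may fetch url under rules."""
    parser = RobotFileParser()
    parser.parse(rules.splitlines())
    return parser.can_fetch(agent, url)


if __name__ == "__main__":
    base = "http://www.whitehouse.gov/infocus/iraq/"
    for url in (base, base + "100days/part1.html", base + "100days/100days.pdf"):
        print(url, is_allowed(RULES, url))
```

    The match is a plain prefix match on the path, so a well-behaved crawler skips anything whose path starts with the disallowed string, index page or no index page.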
    --

    Melius mori in libertate quam vivere in servitute.