Slashdot Mirror


National Archives Cuts Back On Web Site Archiving

hhavensteincw writes "The National Archives and Records Administration (NARA) is coming under fire for a new policy to stop the "harvesting" of a digital snapshot of all federal agency and Congressional Web sites after every Presidential and Congressional term. NARA, which archived more than 75 million Web sites in 2004 after George Bush's first term ended, will not harvest agency and Congressional Web sites when his current term is over because it says agencies are supposed to be archiving Web content on their own. But NARA has been criticized by some for opting out of preserving these important historical archives on the Web."

14 of 45 comments

  1. It's not History by nurb432 · · Score: 3, Insightful

    If you don't document it.

    --
    ---- Booth was a patriot ----
  2. interesting in consideration..... by 3seas · · Score: 3, Interesting

    ... the price of storage dropping as it has.

    So what is the real reason for this? It's certainly not cost.

    Is it possible that nobody is interested in the data?

    1. Re:interesting in consideration..... by bumburumbi · · Score: 5, Insightful

      Is it possible that nobody is interested in the data?
      People may not be interested in the data now, but as time passes, it will become more and more important. I am a bit surprised that the National Archives and the Library of Congress collect so little of the American cultural heritage. In Iceland, where I live, the National Library collects everything on the national TLD (.is) three times a year; important sites are crawled more frequently. I know that the US web is several orders of magnitude larger than the Icelandic web. One would, however, assume that the resources available to the NARA and LC are significantly larger than what the Icelandic National Library has to spend on collecting websites. Collecting a subset of the US web every four years should be well within the means of the US government.
    2. Re:interesting in consideration..... by Kwirl · · Score: 4, Insightful

      I think we all know that the less history remembers of George W Bush's term as president of the free world, the better off we will look in our children's eyes. If he gets lucky he might get off easy with a 'worst president to ever hold the office' footnote.

    3. Re:interesting in consideration..... by Foobar of Borg · · Score: 3, Insightful

      If he gets lucky he might get off easy with a 'worst president to ever hold the office' footnote.
      Of course, like with Nixon, you will still have slavering beasties defending him for the next few decades and blaming everything on liberals and campus radicals.
  3. Wrong Time to Quit by Doc Ruby · · Score: 5, Insightful

    The NARA should not be considering quitting right when the Bush regime is caught red-handed deleting vast amounts of incriminating digital content that it was legally required to archive.

    If anything, NARA should be required to archive even more now, to guard against losing the unique copies at the other ends of official communications and publications. It should upgrade to a policy of redundant archivers keeping separate copies under separate policies, so that a rogue Executive can't flip one switch and toss all the evidence of their actions into the fire.

    --
    make install -not war

    1. Re:Wrong Time to Quit by Anonymous Coward · · Score: 2, Insightful

      I'm not certain if you read TFA (or TFS, for that matter), but these are public websites that the NARA was archiving. They were doing it ONCE every term. If you want to see just what the NARA was doing, click on "Cached" on Google's search page...same idea.

      Honestly, I'm not pro-Bush by any stretch of the imagination, but the NARA's decision is NOT going to help the Bush "regime" hide anything that wasn't already readily accessible to the public.

    2. Re:Wrong Time to Quit by the pickle · · Score: 4, Insightful

      The NARA should not be considering quitting right when the Bush regime is caught red-handed deleting vast amounts of incriminating digital content that it was legally required to archive.

      Am I the only one who read this story and thought that maybe the NARA isn't choosing to do this? I think it's a mighty strange coincidence that they'd be doing this on their own in the last year of a presidency that, for the past seven years, has shown a willful disregard for the law, especially when it comes to the administration's own recordkeeping. Dubya's White House has made the missing files associated with the Clintons look like a single lost receipt by comparison.

      p

    3. Re:Wrong Time to Quit by Doc Ruby · · Score: 2, Insightful

      I think something is better than nothing. The volume of evidence and the "virality" of its distribution mean that even a snapshot will preserve traces that are hard to totally expunge from the entire Federal government's public records. But if that snapshot isn't even taken, preserving them is much harder.

      The drop from inadequate archiving to none has crossed a threshold where people are now paying attention and demanding adequacy. The inadequacy of the prior policy means that both those in power in the Bush regime and many outside it agree on changing the program, which is a start for political compromise. Switching to the Internet Archive is a mediocre interim measure, but one that Republicans probably don't like: even though it's their trademark privatization, it's still publicly funded, and the money doesn't go to a crony who skims an expensive contract while failing to fulfill it. All of this creates political conditions and momentum toward a more distributed archival process, which could fund archives, including libraries, as I described.

      So instead of giving up, now is a good time to demand more and better. Because it's the right thing to do, and because the way it's happening shows a path to actually getting it.

      --
      make install -not war

  4. Should we be surprised . . . by TXISDude · · Score: 2, Interesting

    It really should not come as a surprise that yet another federal agency has decided not to do its job, but only what it wants to do. The reality of the situation is simple: the web is becoming a major communications method for the government, and the content will be a lens into the history of the government's interaction with the people. I am actually afraid that this "ignoring the present" is not some form of conspiracy to prevent the recording of history, but more a case of senior government officials not understanding the world as it is. Not recording the communications of the government to the people, in the form and context in which they were presented, is a complete abdication of the responsibilities assigned to NARA, and I hope that this story gets the US Congress to intervene and tell the agency to do its job. Of course, I also hoped that Santa would bring me a new car, and the Easter bunny would bring golden eggs. So, I am ready for another disappointment.

    --
    Hope is the worst of evils, for it prolongs the torment of man. -- Friedrich Nietzsche
  5. These archives are useless.... by Anonymous Coward · · Score: 4, Insightful

    Any archives done by the government are useless because those who control the government can modify them if they so desire. This data needs to be archived by multiple independent private parties.

  6. The national archives exists for exactly this. by DragonTHC · · Score: 4, Informative

    Their job is to archive public records. Every document produced by the US government is public record unless classified.

    --
    They're using their grammar skills there.
  7. doublespeak by osssmkatz · · Score: 4, Interesting

    Back when archive.org was archiving whitehouse.gov, we saw changes in speeches to match the current rationales, etc. Is this why they don't want to archive?

    --Sam

  8. Problem is bigger than Natl. Archives. by joebob2000 · · Score: 2, Informative

    Private archiving coverage (e.g. archive.org) is not what it once was either, though maybe for different reasons.

    More and more operators are choosing to protect their "intellectual property" using robots exclusion rules, noarchive directives, or similar policies.

    More and more websites use dynamic methods to present data, or use more complex interfaces involving JavaScript, Flash, Java, etc. that make them technically hard to capture.

    Conversations that formerly occurred on Usenet now happen on proprietary bulletin board systems that are technically difficult to crawl. Furthermore, most BBS terms of service forbid automated crawling.

    It is interesting that as more and more content is backed by databases, it is getting harder and harder to access and search for the desired content.
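
For readers unfamiliar with the "robots exclude" and "noarchive" policies mentioned in the last comment, here is a minimal Python sketch of how a well-behaved archiving crawler might honor them. It is an illustration only, not how archive.org or NARA actually crawl. The robots.txt text, the HTML page, the `ExampleArchiveBot` user agent, the `example.gov` URLs, and the helper names `may_crawl` and `forbids_archiving` are all hypothetical; only the standard-library `urllib.robotparser` and `html.parser` modules are real.

```python
from html.parser import HTMLParser
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt, invented for illustration: one archiving bot is
# shut out entirely, everyone else is only kept out of /private/.
ROBOTS_TXT = """\
User-agent: ExampleArchiveBot
Disallow: /

User-agent: *
Disallow: /private/
"""

# Hypothetical page that opts out of cached copies via a robots meta tag.
PAGE_HTML = (
    '<html><head><meta name="robots" content="noarchive, follow"></head>'
    "<body>Press release</body></html>"
)


def may_crawl(robots_txt, user_agent, url):
    """Return True if the given robots.txt text lets user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


class RobotsMetaParser(HTMLParser):
    """Collects directives from <meta name="robots" content="..."> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr_map = {name: (value or "") for name, value in attrs}
        if attr_map.get("name", "").lower() == "robots":
            for item in attr_map.get("content", "").split(","):
                self.directives.add(item.strip().lower())


def forbids_archiving(html):
    """Return True if the page carries a 'noarchive' robots meta directive."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return "noarchive" in parser.directives


if __name__ == "__main__":
    print(may_crawl(ROBOTS_TXT, "ExampleArchiveBot", "http://example.gov/index.html"))
    print(may_crawl(ROBOTS_TXT, "OtherBot", "http://example.gov/index.html"))
    print(forbids_archiving(PAGE_HTML))
```

Both mechanisms are voluntary: they only keep out crawlers that choose to check them, which is why a site operator's opt-out translates directly into gaps in a private archive.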