Slashdot Mirror


Saving Digital History

Gavinsblog writes "The Washington Post is reporting that the Library of Congress in the U.S. plans to initiate the $100 million National Digital Information Infrastructure and Preservation Program (NDIIPP). It is hoped that the project will lead to the preservation of data that is constantly changing on the Internet. But I wonder who will choose what is worth saving?" This may remind you of the LOC's effort to preserve and digitize the audio collection in the National Recording Registry.

5 of 133 comments

  1. What about the DMCA? by kfg · · Score: 3, Interesting

    Good question. Why not sue them for infringement for reproducing your post and find out?

    KFG

  2. Re:New Media Doesn't Last by Xzzy · · Score: 4, Interesting

    > There are plenty of books that last hundreds of
    > years if kept in appropriate conditions.

    My suspicion is that punch cards will make a return at some point. ;) No, really. Not only would that resolve the longevity issue, but it could also solve the problem of obsolete reading hardware (seems to me it'd be easier for a distant generation to rig up a punch card reader than a CD-ROM drive). Punch cards are in a rather obvious format as well. If worst came to worst and humanity nuked itself back to the stone age, in ten thousand years a disc that looks like a mirror is probably harder to translate than a piece of paper with regularly spaced holes.

    I think the only difference will end up being the material used; how many centuries could a stainless steel plate with pin sized holes last in a library's basement?
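
    A rough sketch of why a grid of holes is an "obvious" format. The layout here (one column per ASCII character, eight rows, top row is the highest bit) is made up for illustration and is not the historical Hollerith card code; the names punch and read are hypothetical too.

```python
def punch(text: str) -> list[str]:
    """Encode ASCII text as 8 rows of holes, one column per character.

    'O' marks a hole (bit set), '.' marks solid card (bit clear).
    Row 0 holds the most significant bit. This layout is an assumption
    made for illustration, not the real Hollerith code.
    """
    data = text.encode("ascii")
    rows = []
    for bit in range(7, -1, -1):
        rows.append("".join("O" if (byte >> bit) & 1 else "." for byte in data))
    return rows


def read(rows: list[str]) -> str:
    """Decode the rows produced by punch() back into text."""
    chars = []
    for column in zip(*rows):
        bits = "".join("1" if cell == "O" else "0" for cell in column)
        chars.append(int(bits, 2))
    return bytes(chars).decode("ascii")


if __name__ == "__main__":
    card = punch("LOC")
    print("\n".join(card))
    print(read(card))
```

    Anyone who notices that every column has the same number of positions can work out the rest with pencil and paper, which is the whole appeal.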

  3. From the viewpoint of meme theory... by asparagus · · Score: 3, Interesting

    The important information will save itself without outside help.

    For example, if talkorigins.org were wiped out of existence tomorrow, the theories it has created would live on in the minds of those who have read them. These essays could easily be recreated by re-reading the various creationist works. On the other hand, if the various creationist works were destroyed, they would probably not be recreated because they have already been refuted.

    The history of information is the history of massive portions of it being eliminated, but then either re-printed, re-discovered, or re-invented centuries later.

    The Catholic church 'knew' the earth was the center of the universe.

    Along came Copernicus with his heliocentric theory, and when Galileo defended it, the church confined him to his house for the rest of his life.

    Now, if the modern versions of these men were to make the same claim, they would be soundly laughed at.

    So, while this is a noble effort, it is merely a collection of data. Time itself is the Bayesian filter that will determine which parts of the internet are important.

    -Brett

  4. Preservation vs DRM by dpilot · · Score: 4, Interesting

    Since the public domain died back in the 1920s, and since this is about digital content, it stands to reason that pretty much all of the content that the LOC is talking of preserving will be covered by some sort of copyright, and an increasing portion will be protected by some sort of DRM. What will the LOC's stance be on this?

    Since the LOC seems to hold some of the strings over implementation of the DMCA, they can obviously craft a loophole for themselves. But it will be interesting to see what that loophole is, and how it will work. Will they simply leave the stuff under DRM and keep their own copy of the keys, or will they manage to have an unprotected copy?

    Enquiring minds want to know.

    --
    The living have better things to do than to continue hating the dead.

  5. Archive.org, and its limitations by Animats · · Score: 3, Interesting

    There is, of course, archive.org. That's a surprisingly small operation for what it does. A few volunteers work on the server farm (less than a thousand commodity PCs), and there's a little office at the Presidio of San Francisco. The web crawl is done at Alexa, and the Archive is filled from Alexa's backup tapes, which is why it runs so far behind.

    There's a live backup of the Internet Archive at the Library of Alexandria in Egypt. Thus, no single government can censor the archive. More duplicates may be established in other countries.

    Perhaps unfortunately, it's easy to remove material from the archive. Just put a "robots.txt" file on your site, and not only will it not be captured again, the archive will immediately refuse to display copies of the blocked site. This seems to be enough to keep the militant copyright holders happy.
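
    To make the robots.txt point concrete, here is a minimal sketch using Python's standard urllib.robotparser. The user-agent name ia_archiver (the name commonly associated with the Alexa crawl) and the example.com URL are assumptions for illustration, and the helper is_blocked is hypothetical; the Archive's own matching rules may well differ.

```python
from urllib.robotparser import RobotFileParser

# Example robots.txt a site owner might publish. The agent name
# "ia_archiver" is an assumption, not taken from the post above.
ROBOTS_TXT = """\
User-agent: ia_archiver
Disallow: /
"""


def is_blocked(robots_txt: str, agent: str, url: str) -> bool:
    """Return True if robots_txt forbids `agent` from fetching `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return not parser.can_fetch(agent, url)


if __name__ == "__main__":
    page = "http://example.com/page.html"
    print(is_blocked(ROBOTS_TXT, "ia_archiver", page))
    print(is_blocked(ROBOTS_TXT, "someotherbot", page))
```

    A rule scoped to a single user-agent like this only affects that one crawler; other robots that honor robots.txt would still be allowed in unless a "User-agent: *" section says otherwise.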

    Most text is saved, but not all pictures, and very little video. This is good enough for most historical purposes.