Saving Digital History
Gavinsblog writes "The Washington Post is reporting that the Library of Congress in the U.S. plans to initiate the $100 million National Digital Information Infrastructure and Preservation Program (NDIIPP). It is hoped that the project will lead to the preservation of data that is constantly changing on the Internet. But I wonder who will choose what is worth saving?" This may remind you of the LOC's effort to preserve and digitize the audio collection in the National Recording Registry.
is another person's treasure. I'd say just save it all and allow others to sift through and decide what is worthwhile and what is worthless, just like the library.
No need to add slashdot as one of the websites. They keep reposting stories here as an initiative to preserve their own history.
The irony is that, while digital files could be preserved indefinitely in absolute perfection, many are being completely lost in much less time than it would take a book to turn to dust.
Kudos to the folks at the Library of Congress, and other projects like the Wayback Machine, who are working to preserve a surprisingly ephemeral medium.
- None can love freedom heartily, but good men; the rest love not freedom, but license. -- John Milton
Isn't this already being done by the WaybackMachine (http://www.waybackmachine.org)?
cogito ergo sig...
I deleted all my porn, and I was afraid I wouldn't be able to get it again when I need it.
Sheesh, evil *and* a jerk. -- Jade
How much energy should humanity spend remembering its past? I love history, but frankly I'd rather they fund more discoveries (e.g. NASA) than archive drivel like my slashdot musings.
Good question. Why not sue them for infringement for reproducing your post and find out?
KFG
Plus not all the data can be saved anyway... sites such as the Internet Movie Database, Amazon.com, and even Multimap are database-driven. Even assuming you get access to the underlying database, you still need to preserve the code which gets used to generate the pages. And for what purpose?
Add to that the problem of accessibility. If the data isn't laid out in an easy-to-browse fashion then it's as good as dead anyway. I prefer to browse a library by topic, not searching for keywords and hoping a nice book pops out.
Sorry, but my karma just ran over your dogma.
This may sound like a joke, but I really hope they save the big red dot. I don't know if the website is still in existence, but a while back there was a website that had a big red button. When you clicked it, it said you have clicked the big red dot. The counter had some ridiculous number. This was back when it was en vogue to show off your hit count.
Well, maybe they can come up with a system where people post what they think is important in history, and then some of the same people moderate that up or down, using a unit called Mod Points, to decide whether it is worth saving... maybe call it sloshdat.
A mechanism would be devised to protect the figures that make history against the people reading the history, an effect that could be called Sloshdatted.
I'm sure that with a system like this, historic figures such as many of the presidents would be Modded Down, while anyone who trashes an established monopolistic corporation would appear in the history books.
A system like this would, without any doubt, save and Mod Up a comment like the present one for future generations.
"There is no teacher but the enemy."-Mazer Rackham
we have an incredible fascination with spending today looking at where we were yesterday instead of where we are or where we're going.
I'm not talking about history. I love history. My shelves are well stocked with various dead trees delineating history.
I'm talking about our own lives. When we go on vacation we tend to spend most of our time *documenting* our trip rather than living it. Then we live it "in absentia" as a kind of recreational post mortem.
It's a fascinating thing to observe, but I admit it puzzles the hell out of me.
This point was driven home to me a while ago when someone pointed out how odd it was that I only have one photograph of my SO of 10 years. I only have it because my mother took it. In my mind why would I want a photograph when I could just look at *her*?
KFG
that the goatse man will NOT be preserved in this way......
*shudders*
RoseColor red={0, 0xffff, 0x0000, 0x0000};VioletColour blue={0, 0x0000, 0x0000, 0xffff};find / -name *mybase*|chown you
It is hoped that the project will lead to the preservation of data that is constantly changing on the Internet.
One possible reason: because the OIA and Company might need the data to track down terrorists, etc. (Much the same way that the FBI keeps a collection of outdated phone books.)
After all, when the events of Iran-Contra blew over, Congress quietly passed a bill authorizing the CIA to use any Federal agency for cover. Why not the Library of Congress? Indeed, where else? Makes perfect sense.
-kgj
The answer is simple... whatever represents the government mindset of the day will be chosen to represent that mindset in the future. Cynical? Of course not. Why would they be even-handed? Will they store what Al Jazeera says rather than what the Washington Post says? Why would the views of Palestine be represented over the views of Israel?
Or of course they will steer clear of politics and pick only science and absolute news, thus making it pointless for future historians.
Saving what is said OVER what is already saved is an interesting idea, but will this be targeted beyond those people who already retain everything (like CNN and the BBC), or will it include them? The BBC store everything, "just in case". Will this money record that information yet again, or will it concentrate on other fields after ensuring that the BBC information is already available?
Historians of the future will have more information than historians of any other generation. Their problem will be that the myriad of views reflected in this information doesn't mean a wider spectrum of political opinion, just the ability of everyone to be opinionated.
Their worst problem is that the leaders of the day (Bush, Blair et al.) don't stand out like the leaders of previous years. Will anyone rate the speeches of Powell or Bush against those of Churchill or Kennedy? Nope. So how to judge the politics of today, and how to judge what should be stored? We have no leaders of merit, only rhetoric. So choose what to store, and realise that history will judge your choice of what to save as much as the material you saved. This is a different problem from the one that has faced historians up till now.
An Eye for an Eye will make the whole world blind - Gandhi
Dear U.S. Library of Congress,
Although not a U.S. citizen, I implore you to retain redundant backups of the website goatse.cx. Losing this website to a disaster would be tantamount to losing the collective works of Shakespeare, DaVinci and Picasso. The goatse.cx guy is an artist in the truest sense of the word.
Yours very truly,
grubby
Trolling is a art,
We need to take extra precautions to preserve some "movies", because, ahhh, they contain certain "positions" unlikely to be witnessed before or since outside of their "industry." I will therefore generously donate 500 burnt CD's of such movies to the people compiling this digital library.
Cyde Weys Musings - Scrutinizing the inscrutable
From the article:
On top of the $5 million the library received for planning the initiative in 2000, the plan approved yesterday releases another $20 million of funding to develop a system for evaluating and storing digital information. Just as the library receives more than 20,000 printed pieces each day but keeps less than half, it now faces the herculean task of deciding what digital information should be saved for future generations.
--
The library doesn't keep all of the printed information it receives, and keeping all of the information online is an enormous, if not impossible, task. Archive.org has terabytes upon terabytes of data, and they don't even come close to having everything that was on the web at any one time. With the budget they're talking about, keeping all of this information would most definitely not be possible.
So what I want to know is, if one of Disney's movies gets archived, will they sue the Library of Congress?
I'm only paranoid because everyone is against me...
I noticed in the article that one of the topics being preserved was 9/11, and that got me thinking.
On a broader scale, news media love the internet because they can make outlandish claims when a story first breaks and then modify them as the facts become available. How do we know what's being preserved is accurate?
Secondly, do we trust the people controlling all this nice, easily modified information not to change it to suit some political whim?
They say the victor writes the history book. Digital storage will allow the victors to run a few drafts by their spin doctors first.
Do not try to read the dupe, that's impossible. Instead, only try to realize the truth
What truth?
There is no dupe
Geocities web pages may be exactly what a future historian is interested in. They tell you something about the common culture and people. Why do you think archaeologists are so fond of ancient trash dumps?
Mea navis aericumbens anguillis abundat
The important information will save itself without outside help.
For example, if talkorigins.org was wiped out of existence tomorrow, the theories it has created will live on in the minds of those who have read them. These essays can be easily recreated by re-reading the various creationist works. On the other hand, if the various creationist works were destroyed, they would probably not be recreated because they have already been refuted.
The history of information is the history of massive portions of it being eliminated, but then either re-printed, re-discovered, or re-invented centuries later.
The Catholic church 'knew' the earth was the center of the universe.
Along came Galileo, defending Copernicus's heliocentric theory, and the Church kept him under house arrest for the rest of his life.
Now, if the modern versions of these men were to make the same claim, they would be soundly laughed at.
So, while this is a noble effort, it is merely a collection of data. Time itself is the Bayesian filter that will determine which parts of the internet are important.
-Brett
In Nineteen Eighty Four, The Party embraced the digital revolution because they could easily control what the news said about them. (Who controls the past controls the future...)
Anyway, the point is the government may not be the best to be in charge of this.
</rant>
I believe you are talking about The Really Big Button That Doesn't Do Anything.
A novel concept in its time, it was a strangely addictive big red button on a website. Established in 1994, and linking back to itself, it was more repetitive than Taco's story postings.
As interest in it waned, though, they added a message board-ish thing that let people comment on the button. As it was quickly misused, the best comments were left and the worst deleted.
There, the very first MS bashing in large amounts began with comments like, "Huh? A button that does nothing? Must be a new Microsoft product..."
Although dead at the age of 5, its final resting place is in its original home, Spatula City.
Work sucked, until it became unemployment, when it became slightly more tolerable. -Tet
Since the public domain died back in the 1920s, and since this is about digital content, it stands to reason that pretty much all of the content that the LOC is talking of preserving will be covered by some sort of copyright, and an increasing portion will be protected by some sort of DRM. What will the LOC's stance be on this?
Since the LOC seems to hold some of the strings over implementation of the DMCA, they can obviously craft a loophole for themselves. But it will be interesting to see what that loophole is, and how it will work. Will they simply leave the stuff under DRM, and have their own copy of keys, or will they manage to have an unprotected copy?
Enquiring minds want to know.
The living have better things to do than to continue hating the dead.
The actual cost of storage is not that high. The highest costs are involved when human intervention enters into the equation.
Google does not evict anything from their cache. They just keep adding capacity. Hence Google can already see changes to websites. Granted, I'm sure that this data isn't durable.
There's a live backup of the Internet Archive at the Library of Alexandria in Egypt. Thus, no single government can censor the archive. More duplicates may be established in other countries.
Perhaps unfortunately, it's easy to remove material from the archive. Just put a "robots.txt" file on your site, and not only will it not be captured again, the archive will immediately refuse to display copies of the blocked site. This seems to be enough to keep the militant copyright holders happy.
Most text is saved, but not all pictures, and very little video. This is good enough for most historical purposes.
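For anyone wondering what the robots.txt exclusion looks like in practice, here is a rough sketch using Python's standard urllib.robotparser. The helper name is_blocked and the example.com URL are made up for illustration, and "ia_archiver" is assumed to be the user agent the archive's crawler answers to. This only shows how a crawler would read the file, not how the archive itself is implemented.

```python
from urllib.robotparser import RobotFileParser


def is_blocked(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if robots_txt forbids user_agent from fetching url.

    is_blocked is a hypothetical helper, not part of any archive's code.
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return not parser.can_fetch(user_agent, url)


# A robots.txt that shuts out every well-behaved crawler.
BLOCK_ALL = """\
User-agent: *
Disallow: /
"""

# A robots.txt that shuts out only one crawler (user agent name assumed).
BLOCK_ARCHIVER_ONLY = """\
User-agent: ia_archiver
Disallow: /
"""

if __name__ == "__main__":
    page = "http://example.com/page.html"
    print(is_blocked(BLOCK_ALL, "ia_archiver", page))
    print(is_blocked(BLOCK_ARCHIVER_ONLY, "ia_archiver", page))
    print(is_blocked(BLOCK_ARCHIVER_ONLY, "Googlebot", page))
```

The second file is the narrower option: it asks only that one crawler to stay away and leaves search engines alone.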
It's always a good idea to save a piece of history. Traditionally, it's been done by writing a book. As we've seen, a book can be read thousands of years later. But what about digital information? Media types change rapidly, and today's storage is obsolete tomorrow. So, how will the historians read a "Seedee" 100 years from now? Ok, assuming they actually manage to build a device that can read the data off a CD, the data will most likely be corrupted, since CDs have a limited lifespan.
Now, the only way to accomplish this is to make it dynamic storage. That is, go with the flow, and when a new sooper dooper storage device is invented, copy the data to that, thus ensuring two things. 1) The data is "refreshed" 2) The data can be read by the contemporary hardware.
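A minimal sketch of that "copy it forward" idea, assuming the old and new media are both just mounted directories. The function names migrate and sha256_of are invented for this example. The only thing added beyond the description above is a checksum comparison, so a corrupted copy is caught at migration time rather than a century later.

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Hash a file in chunks so large files need not fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def migrate(old_root: Path, new_root: Path) -> dict:
    """Copy every file from old_root to new_root and verify each copy.

    Returns a manifest mapping relative paths to SHA-256 checksums,
    which can be kept for comparison at the next migration.
    """
    manifest = {}
    for source in sorted(old_root.rglob("*")):
        if not source.is_file():
            continue
        relative = source.relative_to(old_root)
        target = new_root / relative
        target.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(source, target)  # copies contents and timestamps
        expected = sha256_of(source)
        if sha256_of(target) != expected:
            raise IOError(f"copy of {relative} does not match the original")
        manifest[relative.as_posix()] = expected
    return manifest
```

Keeping the manifest alongside the data means the next migration can also check that the source itself has not silently rotted since the last one.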