Vint Cerf Warns About the Perishability Of Human Knowledge (vice.com)
Vint Cerf "worries about the decreasing longevity of our media, and, thus, about our ability as a civilization to self-document -- to have a historical record that one day far in the future might be remarked upon and learned from." An anonymous Slashdot reader quotes Motherboard:
Magnetic films do not quite have the staying power as clay tablets. Clay tablets are more resilient than papyrus manuscripts are more resilient than parchment are more resilient than printed photographs are more resilient than digital photographs. At stake, according to Cerf, is "the possibility that the centuries well before ours will be better known than ours will be unless we are persistent about preserving digital content.
"The earlier media seem to have a kind of timeless longevity while modern media from the 1800s forward seem to have shrinking lifetimes. Just as the monks and Muslims of the Middle Ages preserved content by copying into new media, won't we need to do the same for our modern content...? Unless we face this challenge in a direct way, the truly impressive knowledge we have collectively produced in the past 100 years or so may simply evaporate with time."
He points out that much of this century's digital documents can't be viewed without software. Do we need to start carving our web pages into clay tablets?
The vast majority of things that are worth knowing will always be remembered and preserved. If the few that are forgotten become necessary, they will be reinvented.
The world will continue spinning. No need for alarm.
The best way to preserve knowledge is to disseminate it widely. Or, to paraphrase Linus Torvalds, someone somewhere will mirror all the really important stuff.
It seems that there is an inverse proportionality between the durability of a storage medium and its storage density, and I don't know if we can overcome that easily, as we have the law of entropy working against us. A stone carving or a clay tablet can withstand hundreds and thousands of quantum events, and they will still be stone and clay. A papyrus starts to rot when its molecules break up; it gets brittle and is more easily destroyed. Printed paper is thinner and has smaller letters than a handwritten papyrus, and thus even small damage can erase whole words or paragraphs. And with a hard disk or flash memory, even single quantum events can erase or flip a bit, a two-bit error is already unrecoverable, and any more damage loses large swaths of the file.
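One way to read the "two-bit error" remark is in terms of single-error-correcting codes. Here is a minimal Python sketch using a Hamming(7,4) code. That code is my choice for illustration, not something the comment names, and real disks and flash controllers use much stronger codes; the function names are made up for this example. It shows a single flipped bit being repaired and a double flip being silently "repaired" into the wrong data.

```python
def hamming74_encode(data):
    """Encode 4 data bits into a 7-bit Hamming codeword (positions 1..7)."""
    d1, d2, d3, d4 = data
    p1 = d1 ^ d2 ^ d4  # parity over positions 1, 3, 5, 7
    p2 = d1 ^ d3 ^ d4  # parity over positions 2, 3, 6, 7
    p3 = d2 ^ d3 ^ d4  # parity over positions 4, 5, 6, 7
    return [p1, p2, d1, p3, d2, d3, d4]


def hamming74_decode(codeword):
    """Correct at most one flipped bit, then return the 4 data bits."""
    c = list(codeword)
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]
    s3 = c[3] ^ c[4] ^ c[5] ^ c[6]
    syndrome = s1 + 2 * s2 + 4 * s3  # 1-based position of the suspected bad bit
    if syndrome:
        c[syndrome - 1] ^= 1
    return [c[2], c[4], c[5], c[6]]


def flip(codeword, *indexes):
    """Return a copy of the codeword with the given 0-based bits inverted."""
    damaged = list(codeword)
    for i in indexes:
        damaged[i] ^= 1
    return damaged


data = [1, 0, 1, 1]
stored = hamming74_encode(data)

one_flip = hamming74_decode(flip(stored, 4))
two_flips = hamming74_decode(flip(stored, 0, 1))

print("single flip recovered:", one_flip == data)
print("double flip recovered:", two_flips == data)
```

With this particular code, every single-bit flip is corrected and every double flip decodes to the wrong data, which is the density-versus-durability trade-off in miniature: the fewer redundant bits per stored bit, the less damage it takes to lose something.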
Contrariwise... my family has left an immense amount of information. Boxes and boxes of pictures, some films (!), postcards, letters, college studies... I am planning to digitize all of it. In physical form, it takes an immense amount of room, can only be held by one person, and is not backed up. It will be much more flexible, useful, and safe as computer data.
We had this exact problem at a former place of employment: we had contract requirements to provide access to original oil field data for the 25-year lifetime of the field. The problem was that most of this data was in the form of seismic data locked into a specific version of the exploration sw.
The solution we came up with depended on making a virtual machine image of everything needed to run the original application & data, including license files and user databases, and then freezing the system clock. This way we could restart that image at any point in the future, and as far as the sw would know it was still 2005.
We would still need regular maintenance, to make sure that newer versions of the virtualization platform could still run the original image. In the worst case we expected to have to add an additional virtualization layer, i.e. so we could run the 2005 sw inside a 2015 virtual machine which would run inside a 2030 VM host.
This approach has of course been used to good effect in order to save classic games.
Terje
"almost all programming can be viewed as an exercise in caching"
The issue isn't just that the media will decay, it's that the media is too cheap. There is no incentive to curate our documents, and we will end up with so many still in existence, no-one in future ages will have the inclination to wade through the rubbish.
When people had paper photographs, they soon accumulated boxes of albums, and by 1990, those holiday snaps from 1970 were kind of dusty and not worth keeping. So people chucked them out. But of course they looked through them first and kept a couple of photos, maybe even got those framed. All of which means that when they died in 2010, their kids had only maybe 100 photos to look through, and decide what was worth holding on to.
Now, our holiday snaps are uploaded to the cloud. They aren't a nuisance, and we never get rid of any. When we die, maybe our kids will be able to get a drive or an account key, or something, with 20,000 photos on. Do you really think they will do more than look at a few random ones, before adding them to their own 5,000 photos?
Same with our emails, our WhatsApp messages, our blog posts.
The total amount of media from our age will still be significant - the sheer quantity produced ensures that much will remain. But what remains intact won't do so because of its significance to our age. We don't bury our most valuable items in the ground for safety, or lock them in huge chests, or keep them in safes.
-----
I like your logic there. I'd even say they should be able to extend the protection by paying the difference, and that the first 5 years should be a free, implied protection on anything.
0-5 years is free and implied on any work.
Before the 5 years expire, you need to pay $10.24 and its registration is extended to 10 years (very minor investment if you foresee your work becoming profitable)
Before the 10 years expire, you need to pay $317.44 and its registration is extended to 15 years
Before the 15 years expire, you need to pay $10,158.08 and its registration is extended to 20 years
Before the 20 years expire, you need to pay $325,058.56 and its registration is extended to 25 years
Before the 25 years expire, you need to pay $10,401,873.92 and its registration is extended to 30 years
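The amounts above appear to follow one rule: N years of protection cost 2^N cents in total, the first 5 years are free, and each extension pays the difference between the new total and what has already been paid. A minimal Python sketch that reproduces the listed figures under that assumption (the rule is inferred from the numbers, and the function names are made up for illustration):

```python
def extension_fees(max_years=30, step=5):
    """Return (pay_before_year, new_expiry_year, fee_in_cents) tuples.

    Assumes a cumulative cost of 2**years cents, with the first
    `step` years free, so nothing has been paid before the first extension.
    """
    fees = []
    already_paid = 0
    for expiry in range(2 * step, max_years + 1, step):
        cumulative = 2 ** expiry
        fees.append((expiry - step, expiry, cumulative - already_paid))
        already_paid = cumulative
    return fees


def format_usd(cents):
    """Format an integer number of cents as dollars without float rounding."""
    return f"${cents // 100:,}.{cents % 100:02d}"


for pay_before, expiry, fee in extension_fees():
    print(f"before year {pay_before}: pay {format_usd(fee)} -> expires at year {expiry}")
```

Because the cumulative cost doubles every year, each 5-year extension costs roughly 32 times the previous one, which is what pushes the 30-year mark past ten million dollars.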
I can't think of many works that would still be worth 10 million after 25 years. Perhaps a book to movie deal like LOTR, but I have to imagine with 150 million copies of the book being sold, it's fair to say Tolkien was already more than fairly rewarded for his work and it should have long since been put into the public domain by that point.
He points out that much of this century's digital documents can't be viewed without software.
Umm, I'd say 100% of digital documents cannot be viewed without software. If they could be viewed without software they wouldn't be digital documents.
I think what's more relevant is that they can't be viewed without special hardware. That's one reason why we're always chasing some kind of optical storage. If you have a sufficiently advanced optical reader, you can adapt it to read other kinds of optical storage... so long as their resolution is lower than your scanner.
What he actually said was "That many of the digital objects to be preserved will require executable software for their rendering is also inescapable." What that seems to say [to me, anyhow] is that without knowledge of the formats, getting meaningful data out will be nigh-impossible.
You guys have good points, but are missing the fact that professional archivists have already been debating and discussing this problem for decades. Vint Cerf may have just stumbled upon the idea, or maybe he is just trying to "spread the word". I agree that more people being aware of how easy it is to experience data loss is only a good thing, but mostly just to individuals for family history reasons. The "really important" stuff such as collected scholarly knowledge, research, etc. - essentially the billions of dollars worth of stuff contained in the collective university library systems - is already being closely guarded by some very smart people.
They still prefer tried and true technology for the most part. Microfiche is still manufactured and used regularly because they know it will still work for a long time with pretty much nothing needed to access it beyond some light and a magnifying glass. They very rightly do not trust digital systems that have not been proven successful over a long time span. However, when done right, a digital archive is vastly more useful for research. You need metadata and OCR, among other things, but finding your particular needle (or more importantly, every related needle) is very fast. The downside is storage media longevity, and access needs more complex systems. Offline RAID and internet distributed systems are valid theories being explored. Believe it or not, public-key encryption is a good thing. By encrypting archival material you can be quite certain it has not suffered bit-rot or been tampered with. Total loss is not the only thing archivists fear; degradation of media and intentional alteration are both to be prevented.
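Strictly speaking, the bit-rot and tampering check mentioned above comes from cryptographic hashing, with a public-key signature over the hash adding the tamper evidence, rather than from encryption as such. A minimal Python sketch of just the hashing step, using only the standard library; the sample record and function names are invented for illustration, and the signing step is left out because the standard library has no public-key signing:

```python
import hashlib
import hmac


def fixity_digest(data: bytes) -> str:
    """Return the SHA-256 digest of an archived object as a hex string."""
    return hashlib.sha256(data).hexdigest()


def is_intact(data: bytes, recorded_digest: str) -> bool:
    """Check the object against the digest recorded when it was archived."""
    return hmac.compare_digest(fixity_digest(data), recorded_digest)


# Hypothetical archived record; in practice this would be a file's contents.
original = b"Seismic survey, block 7, recorded 2005"
recorded = fixity_digest(original)  # stored alongside the object at ingest

# Simulate bit rot: one bit flipped somewhere in storage.
rotted = bytearray(original)
rotted[3] ^= 0x01

print("original intact:", is_intact(original, recorded))
print("rotted intact:  ", is_intact(bytes(rotted), recorded))
```

A bare digest only catches accidental damage, since anyone able to alter the object could also replace the digest. Signing the digest with a private key that is kept elsewhere is what would make deliberate alteration detectable.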
Archivists are not stupid, and the products available to them are very specialized. Here's one you can read about:
Opus
Having all your eggs in a digital basket is scary in terms of total nuclear war, but as long as you have accessible bootstrapping knowledge it's not so bad. If you have specs for PDF/A, public key encryption, enough computer design history and semiconductor fab skills, etc., it would be enough to get at all the rest. A small file cabinet full of microfilm can contain enough knowledge, schematics, and blueprints to function as the bootstrap to restart civilization. Start small with things like a language primer (can't assume that whoever finds it knows your language, so make it easy for them to learn), basic math, making steel and other tools, high-yield agriculture and associated chemistry, and basic medicine, and your hypothetical post-nuclear war hunter-gatherers have a good shot at rebuilding before becoming extinct. It may take generations to get to the point we are at now, or with this knowledge available it may only take years. And when it comes time to make the bootstrap file cabinet, according to the first rule of government spending, why buy one when you can have two at twice the price? (Or thousands, why not?)
Millions of tweets and facebook posts may be lost forever, but I don't see that as such a bad thing...
Oh, heavily abbreviated (I'm Canadian, by the way):
if the mouth points down, it's a bottom-feeder, if it points up, it's a surface feeder, if it points straight, it's carnivorous.
whether it has a dorsal fin, thin fins, or big flippers denotes its relative speed.
vertical tail fin, it lives in reeds, horizontal tail fin, it doesn't.
scales vs no-scales, eyes on the sides vs on the top, big eyes vs small eyes.
belly-colour vs dorsal colour.
so really basic observations can give you a pretty good idea of whether or not it can attack, defend, move through tall plants or narrow coral, is often seen from underneath or is often seen from above, lives in dark deep waters or shallow ones, moves fast or slow. Put it all together, and you can come pretty close to exactly what it is and where it lives.
And if you're in Mr. Mawson's class, there was a quiz ten minutes after the lesson, just to prove that you weren't really paying any attention, so everyone failed every time, and knew exactly what they needed to study in time for the test next week.