On Data Obsolescence and Media Decay
mouthbeef asks: "What's the future of storage media? With CDs and tapes prone to relatively speedy decay, and hard drives an entropic nightmare of moving parts, how will we keep our data safe over the long haul? I just got some e-mail from a writer pal who isn't really technologically sophisticated, alarmed because someone told him that his backup CDs would decay and rot in 20 years. He's an sf writer, and he was thinking 'big picture': a coming infopocalypse in which sysadmins devote their every waking moment to re-archiving their old backup data." Is such a scenario likely? Why or why not?
"I wrote back that I didn't think that would happen, because:
- Every time I buy a computer, it's got more storage on-board than all the computers I've owned until then, and I just migrate all the data files I've ever created or saved to the new box, like a hermit-crab changing shells
- With broadband becoming more real and more cheap, it makes sense that in the long run we'll store most (if not all) of our data on remote servers -- encrypted, of course -- that are managed by trained pros with access to mirror drives, climate-controlled vaults, etc. etc.
- Even if this doesn't happen, most of your data files will be in stupid, proprietary formats like Word 3.0 that won't be openable, anyway
How reasonable does this seem to you folks? What do you do with data that you need to preserve for the ages?"
The Dead Media Project (begun by Bruce Sterling, second in command was Richard Kadrey, now hosted by Tom Jennings!) has been documenting this issue for many years now. The collected works of the DMP are at...
http://wps.com/dead-media
... Enjoy!
- Trevor Blake
Unfortunately, that's not a viable option if the data you want to save is of a personal nature, and you don't feel like letting the whole world see it.
You sound like those hosers who argued that tubes are better than transistors. All the available research and engineering couldn't develop a measurement that would show where solid state was inferior to tubes, in fact most tests only illuminated that tubes were generally less capable of reliably reproducing the original signal. Yet the audiophiles insisted that they could "hear" the difference. And finally they were given a blind test and couldn't beat the results provided by a random selection.
Oh yeah, I do agree that mp3 is lower fidelity than a pure digital signal. But a high quality digital signal is about as pure as you can get.
Will your grandchildren be able to view images of you after you're done? How about the recordings of your children you made when they were speaking their first words? You see, there is a lot of personal data which is worth keeping. And worrying about nuclear strikes is the least of your worries.
Copying data from disk to disk, lots of trouble there. Tape formats are going out of style about every 5 years and the media only lasts about 10 anyway. CD-ROM and DVD are too new to be proven in the field. Floppies won't last more than a few years and things like ZIP and JAZ only seem to last a few months.
Let's take a look:
Digital media is a lot more problematic than anything we've come up with so far. Yes, so our grandchildren will be able to get the bits off an old disk. So what? They'll have absolutely no way to determine how to piece the bits into anything meaningful. This is quite different from "Rosetta Stone" technology, where only translation rather than interpretation is needed.
Paper can be read even if it is stained; stained bits are just noise.
There was a good article on just this thing a few years ago in Scientific American magazine; start there.
Err, you mean Johnny Mnemonic. Don't mix up your Keanu.
Dude, if that is your home setup, I want to know your *work* setup ....
With DVD-RAMs coming out within the past month, users can back up massive amounts of data, up to 5.4 GB on each disc, so as far as storage goes, that's probably the future. Raul
I hope that if I make it to my 80s I can still access my p0rn collection
Who-hoo - score -1 irrelevant
Sorry, this is a bit off-topic, but does anyone know where I can get some of Lem's books in Russian translation? They were my favorite books when I was young (and didn't speak English)... *sigh*
Thanks.
It doesn't matter if it fills up as long as it leaves an interface...
Nice idea, when you have no clue about biology. The error rate for the average base pair can approach 10e-6 per generation. For some bacteria (the only likely bio-ROM we'll have any time soon), the rate is much higher. Coding redundancy can be built in (witness the 3 base pair coding system already used by life forms), and copying can be done with more efficiency.
Unfortunately, any such scheme is vastly more expensive (vats, medium, scientists; or pens, fodder, and farmers for pig-ROMs) than magnetic or optical storage. Holographic type memory has the advantage of almost 100% redundancy, and is already available.
Additionally, adding more DNA to a critter can only go so far before you get errors from mitosis and meiosis (recombination, x-overs) at a larger rate. Add in bit switching by UV or ionizing radiation, and the idea gasps for traction. Finally, can you imagine a virus attack on a bio-computer? A simple retrovirus encoding a retrotransposon could give your database cancer. No, the idea flops on its own lack of merit.
I've heard this third hand over my last 20 years as a programmer. No one even knows the source or citation of the statement any longer. It's become a legend, evoking images of cheesy Superman movies with the long crystals you drop into the tubes to "play" 'em back. The closest we've come is with crystals spread flat over a disc: CD-RW (phase-change crystals) or magneto-optical media. No multilayer (DVD material is not a crystal), no full 3D crystals. I don't see any real progress in this direction.
How many people still know the format of files created on PDP-11 or earlier systems? Who says a byte will still be remembered as 8 bits 50 years from now? What's the chance of anyone still being able to interpret TRS-80 Scripsit+ files on a Newdos 2.0 disk a few decades from now? I don't have the solution to this, but you can expect many system managers to become involved in constant data conversion/translation cycles.
Rob Turk (rob.turk@gelrevision.nl)
Standards, even voluntary standards, get in the way of progress. They facilitate communication for the second or third generation using them (the first has to pay for and implement the switch), but the half-life of a standard can be measured just like the half-life of CDs or tape backup.
The central problem of the "infocalypse" is one of decoding. Any literate peon can decode a book about 100 years old or so. It takes a good education to understand English much older, and significant scholarship to decipher Old English, let alone other tongues. So "standard" English has undergone significant change in only the past 100 years.
Computer media are deployed at a higher rate, and coding algorithms and devices are lost or damaged. Data generated using the old system is then nothing more than noise. Worse than hieroglyphics, since the Rosetta Stone has no spare parts, etc. So why save it? If it can't be accessed, it's not data.
We must either innovate some seriously intelligent agents to manage the archives (sorting the wheat from the chaff) and keep moving it to new formats, or accept that we have small brains and vastly increase the depth of the circular file.
No standard, however arbitrary or kewl, will ever replace knowing what needs storing and what needs pitching. And none of them can make future generations (or equipment) literate enough to understand everything we might try to save.
How much data do you get on a 7-track 12" tape? I'm just wondering if the amount of data would be reasonable for the average home user using today's systems.
DNA is pretty interesting too ...
Between Moore's law (well, not *exactly* Moore's law, but its storage technology equivalent... ah, hell, you know what I mean...) and obsolescence, I doubt it will be an issue for a while yet. One thing that recently struck me as illuminating on this matter: I found and downloaded Windows 1.x and every version in between up to 3.1, just so I can say I have such "old" stuff, I guess, 'cause all I did was zip 'em all up and stash 'em someplace. Several entire operating systems that were once considered the height of bloatware took me an hour or so to pull down over a 33.6 connection. Anyway, generally if something is important to you it gets converted along with technology. If you have something you wanted on the cassette drive of your old Commodore PET, you've probably either migrated it along through 5.25 floppy-->3.5 floppy-->CDROM, or else you don't care about it anymore. Also, think about how many new PCs have 128+ MB of RAM---it wasn't really all that long ago a hard drive wasn't even that big, and it takes a wad of floppies to make 128 MB, too.
What do I do with these 10 inch Bernoulli disks with 40 megs of data apiece? ---Answer: Trash them. Let go!!!! Ya gotta let go, move from Castle Wolfenstein to Unreal... Just let go!
You ever read Battlefield Earth? That gold record was the cause of the demise of mankind!
Seriously, is there a Library of Congress archivist reading this? I believe they have a serious effort going to tackle the problem of changing formats and disintegrating media.
concerning mp3's: from the perspective of someone actually concerned with fidelity, they're shit. just plain shit. they're getting better, yes, but especially not the stuff that's floating around my campus, ripped using the free trial of audio catalyst at only 128 kbps. i mean the fact is that analog recording will always be superior to digital, being capable of catching subtle things that digital leaves behind, blind to their existence. but, yes, analog is hard to deal with and expensive. the main reasons that mp3's are so prevalent are that they're small and almost always free. people who are using the pos speakers that came with their computer can't tell the difference. usually. but someone with decent fidelity and an ear to hear? ya, babe, mp3's suck. bigtime. that is my rant. next.
"S okay, them tricorders they'll have in the future will read anything. Probably can pick up a CDR dug out of the ground and wave a tricorder over it, and it will 'read' the pits in the CD by the arrangement of the individual constituent atoms of the disk or something. (and then boot the slackware it found on it in an on-the-fly generated virtual machine inside the tricorder, which it would infer in seconds by examining the code and inferring what the CPU it ran on must have been like :-) Well, it must be something like that--ever notice the computers in Star Trek never have cross platform communication problems? They pick up viruses off of unknown alien probes, transfer data with computers designed by other species....
Here's a fascinating site: http://www.well.com/user/jonl/deadmedia/ In 1990, to preserve their greatest accomplishment, Microsoft put away ten sets of diskettes for Windows 3.0 in a hermetically sealed vault where they wouldn't get stolen. They opened it a few weeks ago to see how the diskettes held up over a decade. They found 62 sets, including several that had been physically abused, so linux must be doing pretty well.
Older film based on celluloid turns into vinegar. If it doesn't burn up first, that is.
Hmmm.. I assume that everyone worrying about CDs surviving an EMP blast from a nuclear explosion will have radiation-proof clothing to protect themselves??? I'm thinking: forget the data, what about me? We are, after all, electromagnetic creatures ourselves. What about the data stored in your head? Will it still be intact?
"just imagine an archivist a hundred years from now having a (miraculously intact) multilayer CD dumped on
their desk and them having to figure out how bits are stored, the byte size, the checksums, the character
encodings, the filesystem, GZ/JFIF/PNG/Word/encrypted files, etc. from scratch. You can't assume
documentation is going to exist for it all!"
You forget this is the future. That archivist will have open source reverse compilers (that do it on the fly and in any language) plus the help of AI that is smarter than anyone on the planet. Decoding old data in old formats is not a real problem. Hell, it was written by people in the first place, so other people can read it. If people can break the Enigma code, I'm sure people can break some old file format.
Joshua corning
hook@majik3d.org
How anglo-centric of you. What about all of the other languages out there with their own characters? A better alternative would be Unicode.
Since I bought a CD burner, I have been making images of all my floppies and storing them on CD. I've gotten through 90% of them so far, and have found that about three to five floppies out of every hundred have faded. Their formatting is no longer recognized.
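For anyone doing a similar imaging run, here is a minimal sketch of how each copy might be checked once the floppy has already been dumped to an image file. The function names and the choice of SHA-256 digests are illustrative assumptions, not something the poster above describes doing.

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of(path, chunk_size=64 * 1024):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def copy_and_verify(source, destination):
    """Copy source to destination and confirm both files hash identically.

    Returns the digest on success and raises IOError on a mismatch.
    Both paths are hypothetical; point them at an image and its archive copy.
    """
    source, destination = Path(source), Path(destination)
    shutil.copyfile(source, destination)
    before = sha256_of(source)
    after = sha256_of(destination)
    if before != after:
        raise IOError(f"copy of {source.name} does not match the original")
    return after
```

Keeping the returned digests in a plain-text list next to the images would let a later reader notice a faded copy before the original is thrown away.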
After all, how hard will it be to include MPEG-2 decompression in next generation video players? The cost of an MPEG-2 decoding circuit probably won't be very high anymore.
Raw parts cost about five years ago (when I designed them): around $150. It's likely way cheaper now.
Isn't that kind of narrow minded? Much of the Apollo data has been lost, because the media in many cases degraded, and in others, the data format has been lost. The military can't even come up with a comprehensive list of soldiers involved in nuclear tests -- the information, stored on old, low density mag tape or worse yet, punch cards, has been lost. CDs are only expected to last around 100 years, tops, in a humidity/temperature controlled environment. 20-25 may be a little low, but they will oxidize, and important information will probably be lost. Ask any archivist. There's more data in the world than just your Quicken files.
Stan kenton is a bitch man !!!! Long Live Stan and cuban groove !
Paper conservation was the excuse, but the real reason was to optimize the signal to noise ratio for the propaganda. . .
DAT reliability?
AAAAAAAAARRRRRRRGGGGGGHHHHH!!!!!!!!!!
Translation: Yesterday I was trying to restore a user's mail file on a Novell Groupwise server (which BTW sucks donkey-balls) from a 12G DAT (Seagate drive, HP tape). I ran a 'build session files' pass, which read the whole tape and indexed the contents. Then when I went back to actually restore the session I needed, the drive refused to read the tape at all. The tape was write-protected, and the drive had been cleaned recently. Other tapes are still readable, but the one I *need* is toast.
What really sucks is that I just need to find one little message, probably a few kilobytes. That's about 0.0001% of that tape's capacity, but it's buried (1) behind an unreadable tape header segment and (2) inside Novell's cryptic and baffling Groupwise database structure.
In summary,
AAAAAAAAARRRRRRRGGGGGGHHHHH!!!!!!!!!!
p.s. Any recommendations on a good data-recovery shop (preferably in Canada)?
gnee! gnee!
Since storage media are changing constantly, it shouldn't be a problem. As long as we have to keep copying all our information over to new media every few years, deterioration isn't really a problem. If product makers fail to come out with new media like DVDs, then we stop the transferring process and it becomes likely that we will lose vast amounts of our information. Using paper to store information is no longer a viable option because of the sheer amount of information we generate thanks to our computers and other wonders of technology. Imagine trying to store every webpage there is on paper. So we must look to technology to solve the problems it has caused. Special storage devices should be manufactured with longer lives, preferably a few hundred years. Information that people feel should be stored can be placed on these media. If anyone wants this information in a few hundred years, then it can be copied to a new disk. If it hasn't been touched, then we probably don't need that information anymore.
"I don't know," said Isaac Newton. "I suppose I could make sure future generations have a copy of my work on physics, but I don't care. We must live in the present, because that's the only way to live."
Here's to all the morons who didn't buy SIDF-compliant tape backup software 4 years ago, and let the only non-proprietary open tape format die.
The new, improved speed won't help until you're moving the data *off* of it, onto something newer, bigger, and faster.
A recent article in the UK paper The Guardian commented that the sheepskin on which the earliest known version of Beowulf is written had lasted far longer than any modern medium, and was therefore superior :-) So go for holes punched in sheepskin: the storage medium of the last millennium. They were trying to make a beowulf cluster.
Y'know, I used to think the MP3 format sucked compared to CD, but if you use a quality encoder rather than the fastest one you can find, normalize the WAV before encoding and otherwise put a bit of effort into it, I'll be damned if I can tell the difference on some things even at 128kbps. Note that the player used makes a hell of a difference, too. Half the MP3's out on the net are 128kbps which is OK by itself, but they are recorded overloaded so the peaks are all fuzzed out, or are conversely not loud enough, and most of all are just encoded by crappy encoders designed to rip as fast as possible no matter what else happens. (what usually happens is dynamic range SUCKS and the high end gets flattened) If you don't have a good sound card and speakers, you'll probably never notice any of this anyway. Oddly, I notice when feeding the SBLIVE output to my stereo, MP3s (ones I've ripped myself) carry more fine detail than the original CD does, played in the CDROM drive on said computer, and then fed through the sound card. Cheap D/A on the CDROM drive maybe? Should check into getting a CDROM with digital out to go to the SBLIVE and see what happens then.
Then I forgive the U.S. government all its sins. Obviously nobody can read the Constitution anymore.
Most aging papers, etc can be turned into a digital format via OCR (like most large companies do with much of their papers and mail these days anyway) and then reprinted at leisure.
There are many ways to restore and preserve the data held on older disks, and technology will continue to grow in this respect.
But the important question here is not whether or not we can save it, but do we want to? Is your email from your Uncle Binky talking about having his bunion removed at the doctors really something that you will want to have 60 years from now?
Things that are of value will be saved and preserved as the days go on, and things not so worthy will fall by the wayside.
I'm sure the Democratic National Committee would have loved it if your attitude had prevailed. "Just another anti-war kook living in England. His name is Billy Clinton. Throw it away."
I preferred paper tape to cards, although I see a couple of advantages to cards: width (12 bits/position?) and the character is printed on it - fewer worries about forgetting the encoding that was used. (And they're almost unbeatable for randomising sequences :) I'd like to see an 8- or 16-bit paper tape machine that printed the Unicode glyph near the holes.
Personally, I've already lost lots of stuff to time... first cards and paper tape, then C= cassettes, then floppies (both 5¼ and 3½, first C= format, then Amy, then IBM), now ZIPs (some Winduhs format, some BSD4.4)... I'm familiar with that "whatever happened to that thingy I had a couple years ago?" feeling :(
If you're really worried about digital storage for the centuries-scale (even just the decades-scale), you need something that's human-readable. The bit pattern must be obvious. It should be relatively easy to build a reading machine (archivists are always underfunded). The character encoding must be obvious; the bits are no use without knowing what's an A and what's an octothorpe. And verbose textual formats like XML or TeX are to be preferred to anything binary.
Just imagine an archivist a hundred years from now having a (miraculously intact) multilayer CD dumped on their desk and them having to figure out how bits are stored, the byte size, the checksums, the character encodings, the filesystem, GZ/JFIF/PNG/Word/encrypted files, etc. from scratch. You can't assume documentation is going to exist for it all!
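A minimal sketch of what "obvious" might look like in practice, assuming a made-up plain-text record layout: the header keywords (`RECORD`, `NOTE`, `LENGTH-BYTES`, `CRC32`, `BEGIN`, `END`) and both function names are hypothetical, invented here for illustration and not taken from any standard. The point is only that the record states its own byte size, character encoding and checksum in words a person can read.

```python
import binascii
import zlib


def to_readable_record(name, payload, encoding="utf-8"):
    """Wrap bytes in a plain-text record that describes itself.

    `name` is assumed to be a single line of ASCII text.
    """
    hex_body = binascii.hexlify(payload).decode("ascii")
    lines = [hex_body[i:i + 64] for i in range(0, len(hex_body), 64)]
    header = [
        "RECORD name=" + name,
        "NOTE each pair of hex digits below is one 8-bit byte",
        "NOTE text encoding of the bytes: " + encoding,
        "LENGTH-BYTES " + str(len(payload)),
        "CRC32 " + format(zlib.crc32(payload) & 0xFFFFFFFF, "08x"),
        "BEGIN",
    ]
    return "\n".join(header + lines + ["END"]) + "\n"


def from_readable_record(record):
    """Recover the payload bytes and check them against the stated CRC32."""
    body = []
    in_body = False
    expected_crc = None
    for line in record.splitlines():
        if line == "BEGIN":
            in_body = True
        elif line == "END":
            in_body = False
        elif in_body:
            body.append(line)
        elif line.startswith("CRC32 "):
            expected_crc = int(line.split()[1], 16)
    payload = binascii.unhexlify("".join(body))
    if expected_crc is not None:
        if zlib.crc32(payload) & 0xFFFFFFFF != expected_crc:
            raise ValueError("checksum mismatch: record is damaged")
    return payload
```

The hex body roughly doubles the size of the data, which is the trade the comment above is arguing for: verbosity in exchange for not needing outside documentation.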
Flintstones, meet the flintstones, they're the modern storage family....
[G]
If you are looking for antique tape equipment, search at universities with computer science departments dating back to the era of tape. At the uni I work at we just decommissioned 2 Sequent machines with tape drives. I have tapes in my office that are older than me! Some of the data on them has been moved to CD, but it is labor intensive and no one seems to care much about it. Who knows if there is anything interesting on them or not. Summary: machines still work, but no one cares.
Not to be picky, but the number of boxes
is even more ridiculous considering that there
are 8 bits to a byte:
8*650e6/(12*80) = 10832 boxes
and still some more... (1MB!=1Mb)
Curiosa, I have never seen a box of punchcards,
what are the dimensions?
-anders
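The arithmetic above can be checked with a short sketch. Evaluated as written, 8*650e6/(12*80) comes to about 5.4 million, which is a count of cards; the 10832-box figure is consistent with roughly 500 cards per box, but that box size is an inference here and is not stated in the comment. The code assumes, as the comment does, that every one of the 12 x 80 punch positions carries an independent bit and that "650 MB" means decimal megabytes. The function names are illustrative.

```python
BITS_PER_BYTE = 8
CD_BYTES = 650 * 10**6   # 650 MB, decimal megabytes as in the comment
CARD_ROWS = 12           # punch positions per column, as in the comment
CARD_COLUMNS = 80        # columns per card, as in the comment


def cards_needed(total_bytes, rows=CARD_ROWS, columns=CARD_COLUMNS):
    """Number of cards needed, rounding up to a whole card."""
    bits = total_bytes * BITS_PER_BYTE
    bits_per_card = rows * columns
    return -(-bits // bits_per_card)  # integer ceiling division


def boxes_needed(cards, cards_per_box):
    """Number of boxes needed for a given (assumed) box size."""
    return -(-cards // cards_per_box)


if __name__ == "__main__":
    cards = cards_needed(CD_BYTES)
    print("cards:", cards)
    # Box sizes below are assumptions for comparison only.
    for per_box in (500, 2000):
        print(per_box, "cards per box ->", boxes_needed(cards, per_box), "boxes")
```

Swapping `CD_BYTES` for `650 * 2**20` shows how much the answer moves if megabytes are taken as binary rather than decimal.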
I'm willing to believe that the high-speed, high-capacity distributed servers of tomorrow will have VMWare-style emulators for every chipset and every OS ever made on them as public utilities like grep or perl
That would be interesting. While I can see that people might want to emulate a Windows based PC on a powerful server, I couldn't believe that they would want to emulate a Commodore 64. The next generation probably wouldn't want to emulate Windows 95 either, but would want to emulate a system that can emulate Windows 95. The result would be that you would have to emulate a big machine to emulate Windows to emulate a C64 to read the data from a disk that you had to spend lots of money finding a disk drive for. After this, you find that you still need a copy of Mini Office.
Will people even be able to understand it? How many people can read and understand the Declaration of Independence? Do most people really know what Thomas Jefferson meant by 'pursuit of happiness'? Will financial information have any real value to people in the future? Europe is already moving to a unified currency. Hmm, that spork (model number 730-00835) costs $185.98; is that equal to 185 Euros? Information that's worth storing now (everything) might not even mean anything to people 100 years from now, let alone anything significant. (1000 years...)
The Preserving Machine by Philip K. Dick - great story!
... it just gets re-packaged. Take a look at http://www.void.demon.nl/TZXformat.html
to see how the ZX Spectrum people dealt with the issue of tape decay.
Who among us can read Cuneiform? Hieroglyphics? Hieratic? Latin?
Seriously, the infopocalypse is an old problem which is simply getting worse.
Sysadmins have never spent and will never spend 100% of their time migrating data to new media. This has been and will always be performed on an "as-needed" basis.
That which is not needed (at the time of conversion) will be lost for all practical purposes. It may exist on tape somewhere, but after a certain time nobody will ever have a great enough need for it that it will ever be restored.
What we're talking about here is the "short term memory" of our society.
How do we preserve our data? I don't believe we are capable of doing this intentionally yet.
Just as books go out of print, get shelved, don't get translated, and eventually rot, nearly everything we do in software will be temporary.
The difference is that data expressed electronically are far more temporary.
Are you joking? As a geology student, I can tell you that 5 meters is a joke. It will fill up in, oh, ten, fifteen years.
It really depends on the scenario, if computers are still active. Many people believe that the web contributes to the preservation of data through distribution and self-propagation, through archived emails or mirrored FTP sites. But even this will only work for information that is of interest to someone. Ultimately you are still dependent on physical media. It is refreshing to think that Nsync CDs will eventually melt.
Good old NASA, finding new and ingenious ways to waste billions of dollars on absolutely nothing.
That's nothing to me. I'm an Immortal, I'm gonna live forever. What I wanna know is, are CDs Quickening-proof?
Assuming your university is in one of the countries involved in WWII, another explanation of the brittle-paper phenomenon is "wartime paper conservation regulations". Many of those journals were lucky to be published at all.
And yeah, given the state of our republic, the physical condition of the Constitution is probably fitting.
The first DVD drives had a single (red, around 650 nm) laser that couldn't read CD-R media reliably. People griped, and new DVD drives like mine have a second (infrared, around 780 nm) laser so that they can.
Many files are compressed to save space on the media. Compression removes redundancy, and a few bit errors can make the file unreadable. Are there any utilities that, instead of just compressing files, also encode them with extra bits for error correction (not just detection)? Could you encode the files in such a way that whole blocks of missing data could be reproduced as well as missing bits? Perhaps it's easiest just to put extra copies of your digital files or microscopic text or what have you on different sections of your media.
bradst12@pobox.com
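On the question of rebuilding whole missing blocks: one simple approach is a single XOR parity block, the same idea used by parity RAID. The sketch below is an illustration under stated assumptions, not a description of any particular utility. It can rebuild exactly one block that is known to be missing; it does not locate silent bit flips, and stronger codes such as Reed-Solomon tolerate more losses. All function names are made up for this example.

```python
def split_blocks(data, block_size):
    """Split data into equal blocks, zero-padding the last one."""
    if not data:
        raise ValueError("nothing to protect")
    padded = data + b"\x00" * (-len(data) % block_size)
    return [padded[i:i + block_size] for i in range(0, len(padded), block_size)]


def xor_blocks(blocks):
    """XOR equal-length blocks together byte by byte."""
    result = bytearray(len(blocks[0]))
    for block in blocks:
        for index, value in enumerate(block):
            result[index] ^= value
    return bytes(result)


def add_parity(data, block_size):
    """Return the data blocks followed by one parity block."""
    blocks = split_blocks(data, block_size)
    return blocks + [xor_blocks(blocks)]


def recover(blocks_with_parity):
    """Rebuild the single block marked None and return the data blocks.

    The caller must remember the original length to strip the padding.
    """
    missing = [i for i, b in enumerate(blocks_with_parity) if b is None]
    if len(missing) > 1:
        raise ValueError("XOR parity can rebuild only one missing block")
    repaired = list(blocks_with_parity)
    if missing:
        survivors = [b for b in repaired if b is not None]
        repaired[missing[0]] = xor_blocks(survivors)
    return repaired[:-1]
```

For example, after `stored = add_parity(data, 8)`, setting `stored[2] = None` to stand in for an unreadable sector and calling `recover(stored)` gives back the original blocks. The cost is one extra block per group, far less than keeping full duplicate copies.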
You contract them to re-write the contract in the current language every year as well. Then they add +$2000 / year to the cost, and you run out of money sooner too. :-}
Clearly the source for the database system must be universally available, in fact it needs to be archived just as well as the content is. Proprietary single-platform tools may make okay proofs of concept for the few who will ever see them, but for real use they're dead ends.
Suggestion for prob with Seagate DAT drive: Put your tape in an HP DAT drive. I have been using HP DAT drives for almost ten years now, but in the past 18 months got five Dell servers with Seagate DAT drives. I have replaced these pieces of junk on all five servers because they would suddenly and without warning stop reading some tapes. In each case an HP drive was able to read them.
I have 4 47gig SCSI drives on my Athlon 600 system. Plus a SCSI 8x cd-rw and a SCSI CD ROM and a SCSI ORB drive. All that is gonna be unreadable in 20 years. Help!!!
This is already a problem for the US government in particular. IIRC they have literally millions of 9-track tapes and entire "warehouses" to store these tapes in. They have entire staffs devoted to nothing in the world but pulling these old tapes when their lifetime is almost up and copying them off to the newest archival medium.
As someone who just loves books .. most are not printed on acid free paper anymore, and a huge number of them are going to be lost within the next 10 to 30 years.
Add to that the usual environmental pollution and such. As an example, putting up your photographs in some new piece of furniture can be a bad idea; the chemicals can ruin them.
Leaves the question whether current data is really that important .. or whether there is that much to be proud of.
CDs are fine for the average Joe as they will outlast the human life span [I don't know where the 20 years came from; CDs are rated at 70-100 yrs at 1m reads]. Should our storage last longer? Ask yourself how many family photo albums are landfilled each year. What do you have in digital form that warrants usefulness after even 20 years? I have a stack of software CDs and diskettes that are now obsolete after 2 years. What do we need to save? Software? Personal records? I can't find anything in my house older than 20 years except me!!!
I never thought of this! oh, my porn... my precious, precious PORN!!!!!!!!!!!!!!!!!!!!!!! :-)
Even formats decay.
Word 98 cannot read old rtf files anymore (the accents are garbled).
So what about more complex data formats?
Running an emulator in an emulator to transform the files simply is not the solution, as the chances you've lost some part of the path get higher with each generation.
Given the fact that software also decays (no emulator will ever be 100% compatible), with each generation you lose some data.
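To put a rough number on the compounding loss described above, here is a toy model. It assumes every emulator layer independently preserves the same fraction of behaviour, which is a simplification made up for illustration; real incompatibilities are neither uniform nor independent, and the 0.99 figure is invented.

```python
def surviving_fidelity(per_layer_compatibility, layers):
    """Fraction preserved after stacking `layers` emulators.

    Assumes each layer independently preserves the same fraction,
    so the losses multiply: p ** n.
    """
    if not 0.0 <= per_layer_compatibility <= 1.0:
        raise ValueError("compatibility must be between 0 and 1")
    if layers < 0:
        raise ValueError("layers must not be negative")
    return per_layer_compatibility ** layers


def fidelity_table(per_layer_compatibility, max_layers):
    """List of (layers, surviving fraction) pairs for 0..max_layers."""
    return [
        (n, surviving_fidelity(per_layer_compatibility, n))
        for n in range(max_layers + 1)
    ]


if __name__ == "__main__":
    # 0.99 is an invented figure, used only to show the trend.
    for layers, fraction in fidelity_table(0.99, 10):
        print(layers, round(fraction, 4))
```

Even with generous per-layer numbers the product only ever goes down, which is the commenter's point about chains of emulators.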
Check out the Long Now Foundation, http://www.longnow.org if you are interested in that kind of thing.
Actually, from what I've read from most merchants specializing in CD-Rs, the blue-on-silver discs last about 5 years, and the gold ones last 20 years.
If you write data to a CD and never read it, does it matter if it's there?
Even the Buddhist masters cannot answer this one.
... doesn't have much experience with tape.
Tape sucks. The cheapest CDRs will be readable long after the last DAT has crumbled into a nonpermeable acetate polymer of randomness.
You're discussing a tool that has advanced so much in 50 years that if transportation followed the same trend we could move across the nation in 50 seconds for 50 cents. Do you really think that data storage experts will not solve and simplify the way present and future data is archived? I'm no expert but I have faith in those that are... leave the theories to them and let's discuss something else.
I, for one, would relish the thought of having a "Capt. Pisshard For President!" poster.
Yes, even very recent word processing formats (and data formats in general) are out of date, and abandoned. That's why I wrote all my documents with troff (now happily supported by open source groff). As long as I can find a C compiler, I'll be able to process my documents. Head hunters ask for my CV in Word and I just laugh. They get it on paper or in pdf format nowadays. But a world full of fools continues to support Microsloth and a dozen other companies that push their own proprietary formats. Open data formats are dead before they start because they aren't sexy, and you can't lock the users in. No use at all for corporations. We are doomed.
This should be the main argument in the DeCSS case. The MPAA is arguing that DeCSS is used for making copies. Copyright law allows an authorized owner of a work to make a backup copy. Bring up the fact that laserdiscs have been failing, and that a purchaser of the latest DVD movie should be able to transfer the content to a new media format when it replaces DVD's.
Well, so the storage companies must work very hard to preserve your data because if they fail, they are in very, very, very deep shit. For example, one storage company I read about has 3 data vaults in secret locations around the US, and your data is stored in all of them. Somehow I doubt that you can top that.
Bullshit. I wish people would stop trying to bullshit about stuff they don't know, instead of just saying 'I don't know,' or even not saying anything at all. 1 second of 16-bit samples at 2^20 samples/sec would take up more than two megabytes. And that's just 16 bits. I won't even entertain 2^2^20 samples/sec. I won't even bother to refute the rest of your post; it is left as an exercise to the reader.
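The size figure above is easy to check. A minimal sketch, assuming uncompressed PCM and a single channel (the comment does not say mono or stereo); the function name is illustrative.

```python
def pcm_bytes(sample_rate_hz, bits_per_sample, seconds=1, channels=1):
    """Size in bytes of uncompressed PCM audio."""
    return sample_rate_hz * bits_per_sample * channels * seconds // 8


if __name__ == "__main__":
    one_second = pcm_bytes(2**20, 16)      # the case from the comment
    print(one_second, "bytes")
    print(one_second / 10**6, "decimal megabytes")
    print(one_second / 2**20, "binary megabytes")
    # For comparison: CD audio, 44.1 kHz, 16-bit, stereo.
    print(pcm_bytes(44100, 16, channels=2), "bytes per second of CD audio")
```

Whether the result counts as "more than two megabytes" depends on reading megabytes as decimal; in binary units it is exactly two.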
I do not know what the other writers mean by rotting CDs. But here in Bombay (India), I work on my PC in a non-airconditioned home with a lot of suspended dust and very high humidity. *ALL* my floppies (including so-called anti-fungus floppies) have become unreadable due to fungus growth. (The fungus layer is thin and can be seen by the naked eye when the floppy surface is held inclined to the light.) The same has happened to my CDs, but so far the fungus is thin and I am able to read the CDs. Also, so far I have managed to clean the CDs with a wet cloth early enough to remove the initial traces of the fungus. (Thankfully, in all cases the fungus has first taken root on the printed side of the CD, so it gives me early warning.) The only thing the fungus seems to live on is high-technology material (e.g. the inside of camera lenses :-( ). Preserving data for home users like me is a real problem.
Well, I personally own quite a number of paper documents more than 100 years old, and while I consider myself a scholar none of these came to me AS a scholar, anyone could have obtained them if they so chose.
As for limited utility, it's true, the autobiography of P.T. Barnum is of limited utility. Of course, it was of limited utility when it came out! Danielle Steele is of even more limited utility even if less than a month old.
I already have 60GB of hard disks on my home PC for video editing, and I'm thinking of doubling that in the next six months and expecting to have at least a terabyte by 2005. I'd need twelve of your *massive* DVD disks just to back up my current system!
Face it, when you're talking about low-compression video data, DVDs are about as useful for backup as 1.44MB floppies...
Researchers at Los Alamos are exploring using a Focused Ion Beam to make very small (micrometer-scale) microfiche-like storage in silicon. It's not exactly a solution for home users, since FIBs cost ~$1.5 million, but for very important stuff, it has a very good survivability rate.
Please bear in mind that current state of the art storage capacity on a single CD is apparently 140G.
Ever put a CD in a microwave? [um - if you must try this, 3 seconds. Max.]
Danielle Steele is of even more limited utility even if less than a month old.
I think this speaks for itself.
A lot of DVDs aren't encrypted, particularly the 'cult' and low-budget end of the market where distributors aren't willing to pay royalties for a CSS technology which doesn't gain them anything.
You're right though, this will cause problems if the idiots manage to get decryption banned; however, the movie companies love it because it means you'll have to buy a new copy of the movie in the new format when DVD becomes obsolete!
Try using a microwave oven on a CD. That will give you an idea of what EMP could do to a CD. (Hint: the media layer will act as an antenna, destroying the data. It's basically the same as how semiconductors are destroyed by EMP.)
Ach, forget digital storage. Dye on dead trees lasts hundreds of years (think Gutenberg Bible). Dye on dead animal skins lasts longer (think Medieval illuminated texts). Dye on rock lasts millennia, as does carved rock (think Egypt and Mesopotamia).
... but, frankly, I own a handful of books printed in the 19th Century. And, you know what? I can read them perfectly well, and I'm positive my grandchildren will also be able to read them. The typesetting, binding, and paper are all of better quality than what I would have had from their modern counterparts ... and often the old versions are cheaper.
Bonus: you don't have to rely on electronics *or* proprietary formats.
Of course, analog storage isn't perfect; you still have to worry about language drift and styles of print evolving.
Flintstones! Meet the Flintstones...
I know this is high-flying, but... a way to store data would be to embed them into organisms, for example into a type of tree.
Modify the genetic structure to contain some sort of information inside it and somehow make that part immutable. Then just plant those seeds around the planet and if the plant ends up successful, people can read your data after 2000 years with no problem. Or even after longer periods of time, eg. the fern, hedgehog, alligator etc.
But what information would be so important? "It's... a cookie recipe!" Nope. Maybe something like "Ye shall not collide antihadrons with dilithium pellets lest a cataclysmic reaction occureth" or something.
The recording and movie industries will stand in the way of any removable storage medium large enough to hold their precious works, no matter how much the masses may need it. The recording industry may allow a compromise: the data self-destructs after a few days.
Cant think of much.....nope.
Of course, when talking about how long media will last it depends how l_o_n_g you really mean. In the end all the above suggestions are jiggered by, amongst other things, proton decay: the expected lifetime of protons is in the region of 10^30 to 10^35 years. I suppose that's enough to satisfy most people though :-P
Hmmm, maybe you should just ensure that STP is added to the standard for the medium?
I know of older military applications where they were making nuclear-safe non-silicon read-only memory by "stitching" in a matrix of metal wires. The downside is the time-consuming work. Perhaps a textile factory could start making ROMs for you...
Sure, CD-Rs may decay in 20 years (though I was told it was 100 years), but it makes no difference. Somehow, I doubt in 20 years 650MB will take more than about a square centimetre of space. It won't matter if your CD collection decays, because you'll already have the whole thing backed up on a small optical cube that cost you $10.
Actually, I believe that IBM is working on using crystals for storage. Assuming that they can find a way to do this economically, we will have a storage medium that could last until the end of time.
A lot of you do not seem to understand that 100 years is not really a long time, not by human race standards, not by Earth standards and certainly not by the Universe's standards. What we need is something that will last 1000's of years, if not 100,000's or millions, if we are truly serious about preserving all the information the human race has gathered, about our History, Cultures, Art, Science, Philosophy etc etc. Once information is lost, it is lost forever, and you can never truly recreate the original, no matter how clever you are, excepting time travel, smart arses! Some people have joked about using stone tablets, which isn't that big a joke when you consider they are durable and have been thoroughly tested. What we need is some data storage medium with the durability and longevity of stone tablets (or better) and the accessibility of at least plain paper, if not high speed electronic access, which is optimal, and unfortunately there seems to be no solution to this very real problem on the horizon, anytime soon. Judging by the responses to this forum, it is not something that is really taken very seriously by the majority, which I think is pretty sad, considering how important this issue is, if you take even a minute to think about it! Cheers Whisper
You're absolutely wrong about the stability of motion-picture film. Actually, 35mm motion picture film is an excellent long-term storage medium when it is properly stored (in the proverbial cool, dry place) and regularly rewound and inspected. Nearly all current film-preservation work ends with a 35mm film negative as the final product. Most films that have been lost ended up that way due to carelessness or neglect and not due to an inherent instability in the storage medium. The original negative to "The Great Train Robbery" still exists and is printable. Film prints made today should retain their color and be projectable for at least fifty years (and longer if stored in vault conditions). Digitization sounds good on paper, but is actually a really bad idea--compressed formats are universally frowned upon for archival work (for many reasons, most of which should be obvious) and digital storage formats change constantly, while 35mm film has been standard for over a century. Also, the cost of digitizing 35mm film at full resolution is about $4 per frame. Compare that with about $1 per foot at 16 frames per foot for making new film elements...
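To put the two prices quoted above on the same footing, here is a rough sketch. The per-frame and per-foot costs and the 16 frames per foot come from the post; the 90-minute running time and 24 frames per second are assumptions added only to give a sense of scale.

```python
# Per-frame cost comparison using the figures quoted above.
DIGITIZE_COST_PER_FRAME = 4.00    # dollars, from the post
FILM_COST_PER_FOOT = 1.00         # dollars, from the post
FRAMES_PER_FOOT = 16              # 35mm, from the post

film_cost_per_frame = FILM_COST_PER_FOOT / FRAMES_PER_FOOT
ratio = DIGITIZE_COST_PER_FRAME / film_cost_per_frame

# Hypothetical feature: 90 minutes at 24 frames per second (assumed).
frames = 90 * 60 * 24
digitize_total = frames * DIGITIZE_COST_PER_FRAME
film_total = frames * film_cost_per_frame

print(film_cost_per_frame)                   # 0.0625
print(ratio)                                 # 64.0
print(frames, digitize_total, film_total)    # 129600 518400.0 8100.0
```

On those numbers, digitizing costs 64 times as much per frame as striking new film elements.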
Maybe some bits would get corrupted, ruining the whole encryption scheme, or maybe they wouldn't recognise the data as an encrypted Hollywood blockbuster, but just discard it as useless junk?
Uh, I don't see a difference between the two frankly. 99% of Hollywood movies are useless junk.
FDE
In a year or two I'll have an implant that will upload any and all info that I want to my brain (a la The Matrix). When I die, you can just transfer the info to someone else's brain. No need for CDs or anything.
Rocket science is easy. Neurosurgery, now *that's* difficult.
our valuable data is like money. if we keep our money in big piles and no one buys or sells anything, the money is worthless. witness the 1930's depressions. to keep things going, we buy and sell goods and services, keeping the money flowing. if we let our data sit in local clusters of decaying storage, not only is the physical media being destroyed, but the value of said data is diminished in that no one is looking at it. i think the answer is to keep the networks busy, seeing as we'll all have fiber to the home, just passing our backups around. ahh, but it's not secure you say. and to answer you, i should question your faith in Open Source. and now i feel quite silly for writing all this, as it's probably going to show as the 347th anonymous post in the last hour, and ain't none a yous gonna readit. but i got it out there, it's not just backed up in the recesses of my mind.
BTW, I hang old CDs from strings outside the house as decorations and have noticed something. The Sun destroys the green ones and the gold ones, but the blue ones do not fade at all. I took a blue one down after a year of hard sun's rays (I live in the deserts of AZ), and read back a full and complete ISO image of Red Hat 4.2! The other colored CDs weren't even recognized as valid discs. The blue ones were generic, blank label Verbatims. Don't forget to think about ability to survive abuse when choosing backup media.
There is a subtle point being missed on this thread. We are not facing a data extinction problem, we are facing two, with different time constants.
Data within an institution typically has a short half-life, say years to decades (banks, tax info, etc.) The problem here is moving all still useful data into a format that is still readable by the rest of the firm, an in-house job in most cases. The hermit crab analogy is particularly apt, going from tape to (say) CD to solid state.
This emerging problem will demand innovation, and specialists. Specialists to resurrect or maintain the old formats and reading machines, specialists to oversee the transfer, and specialists to find the latest and greatest encoding scheme. The real fulcrum here is the manager (information management, I suspect, will become a new and major field), who must schedule the maintenance, oversee the buying and employment of equipment (and stay in budget and on time!!!), and most importantly, get the biggest bang for the buck by keeping only the most necessary data.
The second problem of the infopocalypse has a slower time constant, decades to centuries. It is largely irrelevant to institutions (all but the very few that will survive that long, and who can predict that?). Works of art, philosophy, and science are the major players at this time scale.
Whereas the first problem (of institutions) is particularly Sisyphean, pushing the data up one hill only to have it roll back down in another 5-10 years, the second problem is not really a problem. Unless you're so anal that you consider all works of art and science to be worth saving.
Think about it, how many physics students read Newton's Principia Mathematica? I know of none. They get the summary and biography in the textbook. From the library. How many art students need to see the original? None. They get a print. From the library. The enduring themes and ideas of our culture last because they are enduring, not because someone chooses to furiously keep copying them down. Sure, there are more new scientific ideas per day now than in Newton's time, but the distilled product is kept in the text, while the old theories and bad ideas are not.
As fields, science and art and history advance on their own, students getting the necessary detail from their teachers (not, say, Bacon or Descartes or Michelangelo). It would be arrogant indeed to assume that we know ahead of time that this or that is worth saving. If it is, someone will save it.
We need to dispense with the silly notion that Visa's database needs to be saved for 20,000 years, or that string theory needs to be repeatedly transferred from CD to solid state to quantum computers. Only the most boneheaded of archaeologists would hope to save all of our present culture for future generations to laugh at.
Only the most idiotic fool would want them to.
A binary-punched card has 80 columns of 12 bits (possible holes) each, which is 960 bits or 120 bytes. 650e6/120 is 5,416,667 cards. Was it 500/box? I remember carrying more than 4 boxes was a chore, but 10,834 boxes?
Also consider what one CD is in terms of plain ASCII, equivalent to typing novels on a one-font typewriter. If you type 120wpm and words average 5 characters and a space, that's 650e6/(120*6) minutes of typing, or approximately 1.7 years of continuous 24/7 typing (about 41 years at an hour a day). I don't think the average person will ever fill up a CD that way. So you can assume all your novels and all your source code and all your tax data will fit on a CD.
Of course, your digital movies are another matter. It's just interesting to see relative scales.
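Here is a rough check of those relative scales, as a sketch rather than anything authoritative. It treats 650e6 as bytes and a card as 960 bits (120 bytes), so the two are in the same units before dividing. The 500 cards per box is the post's own guess, and the one-hour-a-day typing schedule is an assumption added for comparison.

```python
import math

CD_BYTES = 650_000_000           # 650e6, as in the post

# Punched cards: 80 columns x 12 rows = 960 bits = 120 bytes per card.
BYTES_PER_CARD = 80 * 12 // 8
CARDS_PER_BOX = 500              # the post's own guess

cards = math.ceil(CD_BYTES / BYTES_PER_CARD)
boxes = math.ceil(cards / CARDS_PER_BOX)

# Typing: 120 words/minute, 6 characters per word including the space.
CHARS_PER_MINUTE = 120 * 6
hours = CD_BYTES / CHARS_PER_MINUTE / 60
years_nonstop = hours / 24 / 365       # typing around the clock
years_hour_a_day = hours / 365         # assumed: one hour every day

print(cards, boxes)                                          # 5416667 10834
print(round(years_nonstop, 1), round(years_hour_a_day, 1))   # 1.7 41.2
```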
The same general principles could probably be applied to transmitting less important messages.
Carve your data 5 meters deep into a mountain range. Immune to most natural disasters except impending geologic eras.
With the exception of the first formula for CD and CD-R, the potential for CD decay is an urban myth that is being exploited by the media, including librarians.
Please reality check your perceptions at:
http://www.cd-info.com/CDIC/Industry/news/media-problem.html
If you are still worried about decay, there is a permanent CD format that will last millions of years. HD-ROM (200 gigabyte) and HD-ROSETTA are immune to technology obsolescence and electromagnetic failures, and withstand the effects of time.
http://www.norsam.com/
Say, ever heard of the three-body problem?
Stars move in literally unpredictable ways. After a while, your mapping will produce gibberish. Even more ironic, you'll find that your key will be as large, if not larger than the information you are trying to record.
Claude Shannon will not be denied.
- a 6-foot stack of source code, punch cards, old Algol and FORTRAN programs. Unreadable by current hardware. Value:zero/sentimental.
:)
:)
Historian: "What's with all these pieces of paper? Do you think those holes could be encoding something?" (researcher then trips, falls, and unsettles the whole 6-foot deck)
- a stack of Apple ][ disk with "all Apple ][ software ever written". Unreadable by current hardware. Value:near-zero.
Apple IIs are like cockroaches
- A couple of thousand 400K diskettes containing Mac System 1.0, Microsoft Word 1.0, Adobe Photoshop 1.0 and similar stuff. Unreadable by current hardware. Value:who knows?
Still good for competitive upgrades in some situations
Evidence? I have several (audio) CDs from the early 80s which are no longer readable.
No longer readable, or just no longer readable by your equipment? One thing that doesn't seem to have been mentioned is that our CD readers are generally designed for speed, rather than making absolutely sure every last pit is read. Given that mass-produced CDs have physical pits in the media, I think it's likely the pits themselves will remain, and be readable to some device which reads them rather more slowly than a 40x (or even a 1x) CD-ROM.
Ooh, a sarcasm detector. Oh, that's a real useful invention.
It's easy to deal with. Just costs a lot of money.
data goes to disk, disk migrates to tape.
Deleted
There was another Slashdot article about this a while back.
RFC2119
true but there are people that don't want to be responsible for their own backups. like the writer who submitted the post. he's a writer not a computer guy. anyway, i never thought about it this way until scott mcnealy said something that rang true. he said that keeping money on your hard drive or anywhere locally is like stuffing money under your mattress. we put money in banks because we know that our money is safe and secure there. if your hard drive crashes or you walk through an airline xray with your laptop or your house burns down, you're shit out of luck. i may get a feeling of security with a local backup but i still think remote storage is safer.
"The lie, Mr. Mulder, is most convincingly hidden between two truths."
--
And Justice for None
I worked for a company that (in 1988) did all their backups onto paper tape, because it would last longer than magnetic media.
They weren't so forward thinking in all respects though. This was the same company that had a Y1988 bug caused by using 1-digit years in their databases.
I'd say the whole thing is a fairly moot point - it used to be (way back) that if you wanted to save some data, you threw it on a floppy. Floppies tend to degrade (depending on quality, conditions, treatment, etc...) after a few years. Of course, hard drives got bigger, so people didn't need to move so much off onto floppies...then programs got bigger (bloat) so hard disk space was once again at a premium. The Zip drive came along into the mainstream (I believe it was in use by graphics houses for a while before it became popular with the average joe) --- so now there was a larger (capacity-wise) storage medium that didn't degrade quite so fast...but even Zip drives are being replaced by CDRs, as the drives become less expensive. I'd say that it isn't inconceivable that in the next 20 years we'll see a new, larger-capacity storage medium that will outlast CDRs by a LARGE factor come into play...and by the time THAT is starting to break down, we'll have something better.
Anything that's actually important enough to keep forever will survive by any means necessary (barring Murphy's Law taking hold). The rest can peacefully degrade.
Mankind has always dreamed of destroying the sun.
Case in point: If you go to the Museo del Oro (sp?) in Lima, Peru, you can see some of the few Incan gold artifacts that the Spanish didn't melt down into gold bars. There aren't that many religious items there, but, well, if you wonder what they did for fun...
---- "If we have to go on with these damned quantum jumps, then I'm sorry that I ever got involved" - Erwin Schrodinger
It seems to me the best storage material would be stainless steel platters from 5 1/4 to 12 inches in diameter. Information could be recorded by literally blasting pits in the material on both sides. The first disk in the archive set would be marked so it would be obvious that it is the first. I would make each disk a different color, and that color would follow the sequence of colors in the spectrum. That way, if the disks got out of order, there would be an easy, natural way to tell what order to put them in. A society advanced enough to get to the archive would know what the spectrum is.
I thought about using gold, since it has really good properties for storing lasting information, but decided against it for several reasons. One, gold is soft and can easily deform, destroying the information. Stainless steel will last forever where I want to put it: not years, decades, or even millennia, but billions of years. Second, gold is valuable. Stainless steel itself is practically worthless. Only the information on the disks has any real value, other than the fact that the disks themselves would tell a small story about how technologically advanced in materials research our society was.
The first disk I would make side one low density. So low that you could tell there was information on it by rubbing your fingers across it. The first few tracks would contain a Rosetta stone on how to read the rest of the disks. The next few tracks would contain a summary of what is in the archive and why it is here. The next tracks would contain all the information on how to build a reader for the much higher density disks to follow. Hell, I would include a "simple" mechanical reader in the archive for the first disk and base the higher density readers on the simple one.
Now where would I put this archive? I would put it in the most stable, secure environment I could think of: space. The first archive I would put on the moon, in Tycho crater. The second archive I would put in a secure orbit around Jupiter, and the third I would put in deep space beyond the orbit of Pluto. Now here is where I start ripping off 2001. I would put each archive set inside a black, obsidian monolith of perfect dimensions. You want it so that any intelligent creature that came across it would know from looking at it that this was placed here by another intelligence. So you want it to look as unnatural as possible.
I would have the archive set inside a special case in each monolith that would resonate when a high-intensity radio beam was focused inside the monolith, so that any probing intelligence would know there is something inside it. The only way to open it would be to break it.
Now why put it on the moon and in space? Well, I couldn't think of a better place to put them. Certainly couldn't keep them here on Earth. By placing it on the moon, if humans were blasted back to the stone age we would have to climb to a level of technology equal to the '60s before we could retrieve the first archive. Since by that time we would have forgotten about the second archive, the first one would point to it on the last disk. The second archive would contain all the contents of the first archive plus much more. The third archive would contain the first and second but no new information. See, the third archive is not meant for humans, it is meant for others.
In 4 billion or so years the sun is going to expand and consume the first and maybe the second archive if they are not found. The third archive will last forever in deep orbit outside Pluto. It would be for any future aliens that came along in a few billion years. It would be our way of saying "we were here, this is us and what we were." It would point to the second and first archives, but the first and second would not reference it. We want humans to forget about it since it is not for us.
What would you put in such an archive? I would put us. Our cultures, our history, copies of our music, stories, religious beliefs, and genetic makeup. I doubt I would put much technology in it because that would be redundant. Any society advanced enough to retrieve the archives past the first would be far in advance of our society when we placed them there.
Somebody is going to see the word "space" involved and balk because of some imaginary cost they are going to pull out of their ass. While I don't know how much such a project would cost, I would rather spend a couple billion on this than on some multibillion-dollar Pentagon war toy. Besides, I don't think it would cost a couple billion.
I read at +2. If your post doesn't reach that level I will not see or respond to it.
Of course, you should be careful how you store those (vertical or horizontal), considering glass is really an extremely viscous fluid.
My Mobile Fidelity Sound Lab Gold CDs (with pre-recorded music) claim a 200 year shelf life.
pronoblem
Go on, GPL your work, stick it on sunsite and let the whole Internet be your backup. Now we know what RMS really had in mind all those years ago...
> nobody is making any effort to archive old computer games
In the UK the Museum of the Moving Image started such a project. They were worried that so much had been lost of the early days of movie making that they didn't want the same to happen with computer games - especially given the UK's leading position in the industry.
DAT used to be wonderful -- most 2GB drives work great. The newer high density drives are unreliable in my experience.
Backup "as we know it" on removable media is probably going away in the next 5-10 years. Retention of tapes, or copying old media to new ones is not practical. It is better to simply archive to hard drives, and their prices are falling much faster than tape prices.
Benny
Finally! A year of moderation! Ready for 2019?
Since the 140G CD is apparently on the way, and capable of holding the audio from 2000 standard CDs (suitably MP3'ed) I would guess that the copying-old-data cycle will happen only once more before we start storing video rather than just text and occasionally sound.
At the moment, I could fit every hard disk in the house (many) and all of my audio CDs on a small part of one of those. Who knows how big the next shell will be for this hermit-crab data?
When we get down to storing data at the molecular level, in crystals or similar, the crystal can spend some of its idle time rewriting and refreshing the data, the integrity guaranteed by multiple copies and serious checksums.
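The "multiple copies and serious checksums" idea does not need crystals to try out. The following is only a minimal sketch of a refresh pass: it keeps several replicas plus a stored SHA-256 digest, and rewrites any replica that no longer matches from one that does. The function names `digest` and `scrub` are made up for illustration, and the sketch assumes the stored digest itself is kept safe.

```python
import hashlib

def digest(data: bytes) -> str:
    """Checksum used to decide whether a replica is still intact."""
    return hashlib.sha256(data).hexdigest()

def scrub(replicas: list[bytes], expected: str) -> list[bytes]:
    """Return the replicas with any corrupted copy replaced by a verified one."""
    good = next((r for r in replicas if digest(r) == expected), None)
    if good is None:
        raise ValueError("no replica matches the stored checksum")
    return [r if digest(r) == expected else good for r in replicas]

original = b"all of my audio CDs"
stored = digest(original)

# The middle copy has suffered a one-character error.
copies = [original, b"all of my audi0 CDs", original]
repaired = scrub(copies, stored)
print(repaired == [original] * 3)  # True
```

If every replica has rotted, the pass can only report the loss, which is why the number of copies and how often they are refreshed both matter.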
No huhu.
Got time? Spend some of it coding or testing
To my knowledge, current B/W microfilms remain in excellent condition for 500 years if stored properly, and should remain readable for 1000 years.
Color microfilms are expected to last only 100 years due to their more complex chemical structure.
Currently there aren't any real long time (500 years for example) preservation solutions for digital information. Best recommendations are to use open and standard file formats, refresh to new media frequently and develop emulations for current computer systems so that not only images, text etc. files but also complete applications like multimedia CD-ROMs can be accessed in the future.
I've read about the matter a bit at work (Helsinki University Library's Centre for Microfilming and Conservation).
...and if they don't want to replace 'em, how 'bout they let folks like MP3.COM put 'em up on their site so we can listen to 'em after the originals, which we paid for, degrade?
Interesting angle I never thought of before...
--------------Rev. C.C.Chips---------------- For the real truth, visit
Bulk eraser.
The idea is to expose the medium to a strong, reversing magnetic field---one that will get the magnetic domains in the object vibrating to it. Leave the medium in the path of the eraser, maybe circulate it around a bit. But here's the kicker...
After awhile, slowly move the medium away from the eraser until it's about 3 or 4 feet away. Then, and only then, turn off the power to the eraser.
Maybe, just maybe, you'll have gotten rid of most of the magnetic fluxes.
--------------Rev. C.C.Chips---------------- For the real truth, visit
But I think in general, most important data will always be recopied onto newer media before the old media decays. No "infopocalypse", though it makes a good buzzword. Link it to a magic date and you can start another media/marketing scam. Data decay is certainly an issue, but it is an ancient issue. And increased technology has only made it less and less of an issue. No longer are we at the point where in a thousand years there will only be a few scraps of manuscript surviving of someone's life; the real issue is preventing personal information from being published and publicly archived for all posterity. I'll bet a lot of early USENETters aren't entirely happy about their early online days being publicly archived for eternity. Well, actually, I know; I'm one of them!
They could be buying them just to distribute stuff. That's why I buy these cheapo disks. The shelf life of the data I'm distributing is about 4 weeks. (And don't talk about ftp etc., Some of my clients are lucky to have radio phones)
--
--
"Insert witty quote here."
especially given the UK's leading position in the industry.
You left out the emoticon...
;)
Sheepskin rules! Think of all the other great works from that same time period which were preserved because someone had the foresight to put them on sheepskin. There's that great document about... and that other poem with the guy who... and then there's the... OTOH, maybe Beowulf just got lucky.
Hmm... what happens when the paint shop no longer understands the contract?! ;)
The value of saving information is not purely to pass on what we think of as valuable or worthwhile. Keeping a lot of trash stored away so that people can look at it later is important if they are to understand us. How could a future historian understand the 1990's if they only had access to the "good" stuff? To understand late 20th century America, one would have to be able to see infomercials, read tabloids, and listen to crap. If we actively filter out what people of the future can or can't see about our society, we are trying to rewrite (or prewrite) history for them.
This does not mean we shouldn't pick out things that we think are particularly valuable and make them easier to find--that would be good. I'm just opposed to the idea of actively throwing out information in the name of making our culture easier to digest for others.
Also, our standard of what is worthwhile or not worthwhile could be very different from what people think 100 years from now. Today, we worship pieces of art that most people considered worthless when they were created. In the distant, distant future, perhaps, Ishtar will be rediscovered for the masterpiece that it is!
This _should_ be done. I admit I haven't read any of his books, the reason being his inane 'Chaos Manor' column in Byte, which sadly continues in the on-line version. If I were to read about one more of his Mac Hypercard apps, or how much he loves M$, or how important he feels are quality SCSI terminators. Grrrrrr!
This is a major source of headache for NASA, see this slashdot story. As for the idea of putting data in remote storage.. well, it is a good idea but it doesn't solve this particular problem since it means that while *I* am not forced to deal with the issue, someone else is.
didn't isaac asimov push microfiche? (maybe in the foundation books?) i think that's the answer! man, i've looked up newspapers from nearly a century ago on microfiche..
can you imagine the human genome project using microfiche for data archiving? =)
- pal
let me just say that this is an understatement. i don't think i've ever used a 3.5" that lasted more than a week. and this isn't just with one drive, either! as amazing as it seems, i don't think i _ever_ lost anything i put on 5.25" (which admittedly wasn't very much). but 3.5" disks are almost entirely useless. i don't store _anything_ on those suckers that is even of marginal importance.
- pal
There's an interesting article on how the librarians view this problem at:
http://www.dlib.org/dlib/september99/vanderwerf/09vanderwerf.html
I know that some data is really important ("critical") on a short-term view. But why is it so important to keep so much information for so long? Of course, we should protect a few things (good and bad) for future archeologists.. But this is not the point here.
:) I mean, we will all die someday, and our data should die someday too. We should not impose our trash on next generations.
Archeologists and historians will always want to know more than what we have kept for them. That's their job.
We should just face the fact that we can't archive everything. And that even archives can (and sometimes do) get destroyed.
When data gets lost, we should see this as a very good occasion to reconstruct what was fragile and redesign what was poorly done the first time. Don't tell me how much that costs and how much work has been lost and all that crap..
What is [seen as being] good will survive. And if it doesn't survive, it will be re-invented. As the system goes, this might just be a very nice way to get new patents and suck the blood out of your neighbour.
Orzak
1. Buy 10 cd-rw disks. Should cost you about 30 USD or so. If you're paranoid, buy all 10 different companies of disks - all good name brands.
2. Set up a schedule of backups, for instance: every Sunday at 6am your crontab emails you to remind you to make a backup. You pop cd #1 in and back up your homedir. Next week you take disk #2 and back up your homedir, and so on.
3. If you're really paranoid you can md5sum iso of your homedir before burning, then retrieve iso from the disk and md5sum it and of course compare md5sums. This will help you to get rid of fast-decaying brands too.
Take a look at the alias I have that erases cdrw and makes backup:
alias bkup='cdrecord -v speed=2 dev=0,0,0 -blank=fast; mkisofs -r -o ~/.etc/backup_img ~homedir/; cdrecord -fs=16m -v speed=2 dev=0,0,0 -data ~/.etc/backup_img; rm ~/.etc/backup_img'
My drive needs the disk reloaded before cleaning and it's a caddy drive so I can't do unattended backups. If you can do them, this whole procedure can be extremely easy when set up properly in crontab.
Of course there are a few problems:
1. fire, tsunami, earthquake, flood, volcano: back up essential data to a remote server. I know there are such services. The idea is that even though they aren't rock-steady, it is highly unlikely that your hd, your backups and their server all die at the same time.
2. 650 MB may not be enough. In that case, consider DVD-RW when available, and tape backups. Tape backups are a good idea anyway but I already had the CD-RW drive around..
This should be enough for almost anybody.
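The rotation and the checksum comparison described above can be sketched in a few lines of Python. This is only an illustration, not part of the original alias: the function names and file names are made up, and it assumes you have already read the burned image back from the disc into an ordinary file.

```python
import hashlib
import tempfile
from pathlib import Path


def file_digest(path, algorithm="md5", chunk_size=1 << 20):
    """Hash a file in chunks so a 650 MB image need not fit in memory."""
    digest = hashlib.new(algorithm)
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def backup_matches(image_path, readback_path):
    """True if the image read back from the disc equals the one we burned."""
    return file_digest(image_path) == file_digest(readback_path)


def disc_for_week(week_number, disc_count=10):
    """Rotate through discs 1..disc_count, one disc per weekly backup."""
    return (week_number - 1) % disc_count + 1


if __name__ == "__main__":
    # Stand-in files; a real run would use the mkisofs image and the
    # image read back from the drive (both paths are hypothetical here).
    with tempfile.TemporaryDirectory() as tmp:
        image = Path(tmp, "backup_img")
        readback = Path(tmp, "readback_img")
        image.write_bytes(b"home directory contents" * 1000)
        readback.write_bytes(image.read_bytes())
        print("disc to use in week 11:", disc_for_week(11))
        print("read-back matches:", backup_matches(image, readback))
```

A disc whose read-back stops matching is the one to retire, which is how the comparison helps spot fast-decaying brands.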
-- ATTENTION: do not read this sig. It doesn't say much.
Why do we keep relying on mechanical devices? I'm tired of all these moving parts in my machines. Solid state is the way to go for me. Can't wait for the price to drop.
I drank what? -- Socrates
What about color film? I thought that there were problems with the stability of the organic dyes used in some motion picture film. Some old color prints have a weird looking color balance, like they had been left out in the Sun.
One thing that flashed through my mind as I read through the posts here:
One poster made the comment about relatively cheap copies to be made by request -- obviously applying this theory/knowledge to CD media. What occurs when we come up with a proprietary format like DVD (view/copy protection)? Understandable that people of the future may not necessarily want to watch South Park or The Matrix, but if the intent is to get the data off the disc as easily as possible, then why have this at all?
I'm sure that not all DVD's are encoded/encrypted (DVD ROM's aren't, are they? Not having a DVD-ROM player...) but I do have a feeling as we tumble through 2000 and beyond, we'll start to see more "copy/view/listen/play" protection.
Karnal
If I wish to ensure my data is kept through time, it's time to fire up the printer. Reams and reams of paper. Yes, paper can burn, but I have more faith in paper than current media.
What's the current thinking on the lifetime of laser-printed material? I guess the toner's plain old carbon with an organic binder. Laser toner flakes off with mechanical stress, but will it desiccate with age?
I think your best chance is some acid-free paper printed with an inkjet fed with permanent ink, and then stored in a dry environment.
Either that or papyrus.. it seems to last well! =)
While losing data is a problem, a more important issue for most people is being able to find _useful_ and _relevant_ information on an ongoing basis.
The main dangers are:
1 necessary information has disappeared due to obsolescence - it has already been noted that most necessary information has a limited lifespan, so this problem is limited as long as standards have a 'reasonable' lifespan
2 too much junk - low signal to noise ratio - this can be tackled with processing power, but can still be a problem
3 'Non-essential' information has not been preserved, because it isn't seen as valuable to those creating the archives. An example - Titanic will continue to be a money-spinner, and will probably be 'ported' to any new medium. A documentary on racism, or conglomerate control of the media, may be of educational use, but may not be preserved in a central archive, because it will not be sufficiently profit-making, or will damage the image of the parent company. Another example is Fantasia - can _you_ find a copy of the original version?
Andrew
I'd rather go down in familiar flames than be lost in that endless blue.
*Perhaps* the dark side of Mercury might suffice, but I doubt it...even the dark side, while not receiving radiation directly, is probably still impacted by the gigantic magnetic fields of the Sun.
DNA is a Turing machine. You, however, being dynamic and emergent, are not.
January 1, 1990 I walked into work at the D.O.T. and looked over a couple of traffic reports and noted the date - 1980.
The party's over
Of course, although your punch cards have survived your storage room you no longer have a card reader. Fortunately, you can read them with your sheetfed scanner or holding them in front of your video camera...
Because they'll TRY to scan everything with them?
The simple solution to making a warning sign for 500 years in the future is issue a yearly contract to a paint shop to repaint the sign in the current language. ;-)
Do what the US does to protect the Constitution. Not too expensive really: just fill a vault with lots of pure nitrogen or helium... an inch of lead and a couple more inches of steel, hermetically sealed, ought to go a long way towards improving those projected life spans.
Ruler of creeper, mortal and scallop.
This isn't a new issue, or even one related to technology. If you read Simon Singh's The Code Book, he takes a detour from the fascinating history of cryptography to relate how the Rosetta Stone allowed hieroglyphics to be read, and how Linear B (the script of Mycenaean Greek, first found on Crete) was deciphered. Great stuff, and it shows that the problem of old material being in dead languages is an old problem.
ben_ the technologist and platform agnostic
Many of the compressed media formats have a resilience to loss. Look at mp3s for example. You can lose a block and all you get is a short burst of noise. The overall signal is still retained. Same goes with DVDs. DVDs are in fact a lot more tolerant of errors than other digital formats.
A trade-off is available: you design your compression algo so that a given amount of error will only affect a given amount of data. If you have enough data, then a loss is damaging but not terminal.
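To make that trade-off concrete, here is a rough Python sketch in which every block is compressed on its own, so damage to one block cannot spread to its neighbours. It is not how mp3 or DVD actually work; the block size and function names are arbitrary choices for illustration, and zlib stands in for whatever codec you would really use.

```python
import zlib

BLOCK_SIZE = 4096  # arbitrary: smaller blocks limit damage but compress worse


def compress_blocks(data, block_size=BLOCK_SIZE):
    """Compress each block independently and keep a CRC32 of the original."""
    blocks = []
    for start in range(0, len(data), block_size):
        chunk = data[start:start + block_size]
        blocks.append((zlib.crc32(chunk), len(chunk), zlib.compress(chunk)))
    return blocks


def recover_blocks(blocks, filler=b"\x00"):
    """Decompress what survives; replace damaged blocks with filler bytes."""
    recovered, damaged = [], []
    for index, (crc, length, payload) in enumerate(blocks):
        try:
            chunk = zlib.decompress(payload)
            if zlib.crc32(chunk) != crc or len(chunk) != length:
                raise zlib.error("checksum mismatch")
        except zlib.error:
            chunk = filler * length
            damaged.append(index)
        recovered.append(chunk)
    return b"".join(recovered), damaged


if __name__ == "__main__":
    original = bytes(range(256)) * 64          # 16 KB, i.e. four blocks
    stored = compress_blocks(original)
    stored[1] = (stored[1][0], stored[1][1], b"bit rot")  # simulate decay
    data, lost = recover_blocks(stored)
    print("damaged blocks:", lost)
    print("bytes still intact:", sum(a == b for a, b in zip(data, original)))
```

Compressing the whole file as one stream would usually give a better ratio, but a single bad byte could then make everything after it unreadable.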
Encryption becoming a problem for future generations isn't reliant upon the US relaxing crypto export rules. It's perfectly legit for me to use a 64 kbit key on my data, I just can't export the software to a country not on the "ok" list. I can publish the algo in a book, describe it in a natural language and ship it anywhere I want.
Advances in information technology may reduce the complexity of understanding unknown data formats.
It's also interesting to think about possible directed evolutionary changes and how they can affect the way we store and transmit information. Look at audio data. Nature uses analog wave through a medium to transmit signal to the ear. The ear has developed the techniques to translate that information into a form that the brain understands. FHG did a bunch of research and determined that a lot of the information provided isn't necessary to convey the same concept (mp3). Given that they have developed this perceptual knowledge base of what we can hear, it's possibly only a matter of time until we learn to process that information directly and do not require the translation from the "minimal" dataset to the "natural" data set. There's a lot of implications that I can't see right now, but may really change the way we think about information and communication.
They don't have to be slow reading. Think about a high resolution digital photograph of the entire disk. It's just a matter of interpreting it. Nothing dictates that the same method of reading the data be used in future times.
Technically it's possible to take a picture of a track on a vinyl LP and then use image analysis to play it back. In fact I'm surprised that nobody has done this already. Seems like you could point a laser beam into the grooves and then use the reflected light to recreate the audio signal, thus reducing the amount of wear and tear on the record. A lot of collectors would probably love this. Maybe I should go into business...
Store enough data and the language problem sorts itself out.
There's an 8-inch drive and some 8-inch floppies within a few feet of me...
So I don't see your point.
I have seen the future, and it is inconvenient.
The National Archives identified this problem several years ago. Most data archived to CD is going to be useless within 10 years anyway. If it needs to be kept longer, what recourse have we? Yeah, magnetic tape lasts longer, but not that much longer in the big scheme of things. If the Constitution of the United States of America were stored on DLT, we'd have lost it 180 years ago.
There is some promise in storage of data in plastic blocks. I remember reading that IBM had been working on a tech that would use a laser to change the color of individual molecules within a small block of translucent plastic, thus giving one 1's and 0's. I don't happen to know the lifetime of such a technology, but it seems that it would be less susceptible to decay.
The primary problem is that language evolves over time and makes the data harder and harder to interpret. Of course, in the real sense of things, not much needs to last more than 20 years in electronic format.
The information that is truly needed will be accessed and backed up more often than once every 20 years. We're not going to suddenly forget the constitution or how calculus works. Important knowledge is widely distributed and accessed frequently. Your tax returns and downloaded porn are not important 20 years later.
Finally, CDR's will corrupt only if they are mistreated or accessed frequently with an older CD-ROM drive. Put them in a jewel case in a cool environment and they'll last a lot longer than 20 years.
STFU & GBTW
actually - if your house burns down and your money was in your mattress the treasury department has investigators that can figure out how much money there was - something about money not burning that well or something.
slashdot username - at - email.domain.name
You might look into the storage products offered by Norsam Technologies, such as HD-ROSETTA. The idea is to basically etch either bits or actual text (readable under a microscope) onto a metal disk. This technology is being considered for the library envisioned by the Long Now Foundation.
You know, this is one thing that has bothered me (and I guess, could even be considered about the originals):
Something like the Constitution (and the DOI): the concepts are very important, but what is to stop anyone with a digital copy from changing certain words and such to match their own agenda (say over a period of several decades)? Indeed, who is to say the paper copy of the Constitution is the true original?
We not only need a way of protecting the information from bit rot, we also need to remember to implement the ways (which exist!) to protect these documents from alteration!
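One of the existing ways alluded to here is a cryptographic digest: publish the fingerprint of a document widely, and anyone can later check whether their copy has been altered. A minimal Python sketch, assuming SHA-256 as the hash and with function names invented for the example:

```python
import hashlib
import hmac


def fingerprint(text):
    """SHA-256 digest of the text, as a hex string short enough to print."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def is_unaltered(text, published_fingerprint):
    """Compare a copy against a fingerprint published when it was archived."""
    return hmac.compare_digest(fingerprint(text), published_fingerprint)


if __name__ == "__main__":
    original = "We the People of the United States, in Order to form a more perfect Union"
    published = fingerprint(original)

    tampered = original.replace("People", "Rulers")
    print("original copy ok:", is_unaltered(original, published))
    print("tampered copy ok:", is_unaltered(tampered, published))
```

This only detects changes; it does not say which copy is the true original, so the fingerprint itself has to be distributed in enough independent places that quietly rewriting all of them is impractical.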
Or am I being overly paranoid?
Reason is the Path to God - Anon
I don't think the strength of the magnet has much to do with it...It has more to do with moving the magnet to get the electrons on the disk to get up and dance...
I used to take a fridge magnet and move it back and forth across the disk a few times on floppies I wanted to destroy.
I patented screwing your mom. But it got revoked for "prior art."
Unfortunately, I don't see an easy translation to computers, maybe beyond a remote server that upgrades a RAID every few years.
However, one unique problem that plagues data, as opposed to books, film and photographs, is that the really useful stuff changes, even in minor ways, frequently. One of the larger issues, aside from strict data preservation, is knowing how, say, Amazon's first site looked.
My employer, it is fairly assumed around the office, doesn't have its first site in code or graphics, only screenshots. I think this is true for many organizations beyond mine.
Oddly, the archival market in photography developed after a few decades of seeing early photographs fade, then determining what environmental factors lead to image decay. I suspect something similar will emerge for data, beyond the archival tape "cover your ass from data loss" market.
--Humpty Dumpty was pushed!
Ever put a CD in a microwave? How about the dashboard of your car?
That metallic substrate makes a great antenna for all that EMP. Probably gets hot enough to warp the plastic if nothing else.
Shut up and eat your vegetables!!!
I tried all sorts of different media types for different purposes and have found:
My JVC deck hates FujiFilm, BASF, and no-name blue-greens (cyanine?). It is lukewarm to Kodak gold (100 year lifespan) (I have to eject and re-insert a disc every time I kill the power to the player). It loves Verbatim discs.
My girlfriend's personal JVC stereo is completely temperamental (sometimes it plays burns, sometimes it won't), but it's quite a few years old, and the older players seem to only like traditional aluminum discs.
My suspicions are, however, that the combination of burner and media is what makes the disc readable or non-readable by players. The trouble is, I haven't had an opportunity to test this empirically - every time I get an audio disc from a friend it's always something different from what I have.
Don't like my sig? I don't either.
I'm not sure you're right that punch cards last well. What I mean is this. If you have 650MB worth of punch cards, what are the chances of every single bit of data surviving for longer than, say, if you have 3 copies of the same CD?
You'd have to house them in a sterile warehouse or something. You could probably get CDs to last better, spending equivalent $/MB on looking after them. Or at least magnetic tape.
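The comparison can be put into rough numbers. The sketch below assumes standard 80-column cards holding one byte per column and treats failures as independent; the survival probabilities in the demo are invented purely for illustration and are not measurements of any real medium.

```python
def cards_needed(total_bytes, bytes_per_card=80):
    """Number of 80-column cards needed, rounding up to a whole card."""
    return -(-total_bytes // bytes_per_card)


def all_survive(unit_survival, units):
    """Chance that every one of `units` independent items survives."""
    return unit_survival ** units


def at_least_one_survives(copy_survival, copies):
    """Chance that at least one of several independent copies survives."""
    return 1.0 - (1.0 - copy_survival) ** copies


if __name__ == "__main__":
    cards = cards_needed(650 * 10**6)
    print("cards for 650 MB:", cards)
    # Made-up figures: each card 99.9999% likely to survive, each CD only 50%.
    print("every card survives:", all_survive(0.999999, cards))
    print("one of three CDs survives:", at_least_one_survives(0.5, 3))
```

The point of the arithmetic is that with millions of cards, even a very small per-card loss rate makes "every single bit survives" unlikely, while a few redundant copies help a lot.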
perl -e 'fork||print for split//,"hahahaha"'
>Strange how they haven't tested it
>for 20 years.....Much less 200....
Strange how you didn't even visit/read the page before posting this... in 2 minutes you would have been reading words like "a very complex and statistics-based process" and "extensive media longevity studies" and "mathematical modeling techniques".
Like you, I'll believe it when I see it, but I wouldn't go shooting off my mouth without at least checking things out.
'Intellectual Properties' are uncontrollable in the wild. To base an economy on them is just stupid.
What I've noticed is that most of the data we're accumulating is quickly becoming useless. 10 year old schoolwork isn't something so worthy of archiving. The data you really want to keep shouldn't be very large anyway...
I would have to disagree on this point.
Firstly, people will start using digital media instead of photo albums, family videos, and the like. I for one intend to digital video my family every six months and archive it (once I have a digital video camera). This data I'll want to keep until the day I die, and presumably my descendants will wish to keep it after that. This is gigabytes of data that anyone using digital media for a record of family and friends will wish to archive on a very long term basis.
Secondly, what about digital artists? I use my computer to do audio multitrack recording - a five minute song with four or five tracks can take up two or three hundred megabytes, and a bigger more complex track can be a gigabyte or more. Although I probably won't get famous, someone who uses home studio digital recording will. Now if someone wishes to remix one of the tracks these people did in the future (and someone will, look at the remixes of old songs coming out now), an archival copy of the individual tracks will be invaluable. For my own stuff, it will be very hard to remix, as I've only got space to keep the mp3 version of my songs, and not a version with each track separate (I need a CDR burner!)
postmoderncore - art and creation are a higher purpose
Anyone care to explain why they are TRI-corders? Seem to do a hell of a lot more than 3 things..
Blessed are the pessimists, for they have made backups.
Too bad the human brain leaks.......didn't you ever see flight of the navigator? :)
Please. Microwave ovens output 'microwaves' (specifically, about 2.45 GHz), which are electromagnetic (radio) waves, not "nuculer" radiation...
h-bar_hack
www.backwoodsengineer.com
The CD-R suppliers claim 20-200 years. The upper bound is simply because they don't know of any real failure mechanisms. The lower bound is the wear and tear of 20 years of careful use. Can someone point us to some *real* data, not just speculation?
(bogometer reading: obvious blatant falsehood, temporarily assuming perfect recording and interpretation to prove a point. imagine...)
They recently discovered an authentic audiovisual recording of a gladiator fight in the Flavian Amphitheatre dated Circa 90 AD. Captured in the ancient Roman film are Titus, hundreds of high-ranking, hobnobbing patricians, a bloody battle that makes Ben Hur look terribly tame, and a view of an architectural achievement that scientists have previously been able to only theorize about: The Colosseum's roof.
Scientists at the University of Greece have added this tape to the recently discovered collections of Socrates' filmed lessons and thousands of morality plays that served as entertainment of the day.
"We had always had evidence that the Romans were an advanced culture, but only recently...discovered how barbaric those fights actually were..."
"It's especially surprising, considering the last five minutes of the _film_, that it survived at all.
"At one point in the film, the person filming was apparently struck down by a Colosseum Guard... One voice suggested that the filming was ordered by Titus himself... the guard didn't like hearing that...we heard someone lecturing him on the value of the film they were using, and berating him on using it to film common, everyday entertainment... An altercation apparently became bloody off camera, and the film ends shortly after that.
"Apparently the film was stolen by someone else at the event...the last five minutes give us pause to realize how significant this discovery actually is."
My point: The historical significance of poop culture (misspelt, but it sticks) is not so much about the show, but what's around the entertainment. Who are we to say what will be irrelevant in thousands of years?
_______________________________
Then the future historians will dig around and find a nifty source code of "DeCSS" and their problem is solved.
You want fidelity? Have you heard of Sony's new HCD format? Uses DVD-like disks, but audio is recorded in 2x2^20 samples per second, much higher than CD's 44.1 kHz. In the future, who is going to care about hearing 1/2000000 of a second of audio? The point of anyone in the future hearing, reading, or seeing anything we make today would be for quantity of information, not quality. So what if a document is 50% degraded? 50% is still useful... Data recovery labs today, though expensive, can recover drives that have had lightning surges or been burned, and I have even heard of torn up floppies being put back together. I'm sure aging of disks will be only a small problem.
I think the real problem is that, as sensible as digital storage seems today, it is the worst way to go. We already have problems reading 20-year-old digital tapes because there are no drives left. A good example: only recently was the German intelligence agency able to read the tapes from their East German counterpart, which had used old IBM tape drives (well, actually an East German rebuild) and its own, now lost, file format. The data was not even encrypted; it only took so long because they could not find a tape drive that could read the tapes, since everybody who had one got rid of it years ago when it became obsolete. I am sure there are many other cases where something like this has happened.
Another problem is that, besides drive technology being lost (this will surely also affect DVDs and CDs), filesystem standards will be forgotten. What are you going to do as an archeologist if, 30 years from now, you find a CD-ROM, and after you have managed to build a drive and recover the digital data on it, you cannot find a way to tell where a new file starts without an effort just as great as building the drive itself? And then you have probably found a QUAKE CD, and chances are good that you never find out what it is. Just think: by that time people might have 1024-bit quantum computers, so it is unlikely they will even think of the possibility of a computer game written for a 32-bit computer - although it might tell them much about today's society! Or think about text files. Who knows how long ASCII might last until it is forgotten? We have only had it for a couple of decades, and I'm sure nobody will know what it is in the 22nd century.
The only way to keep data for a long time (I am thinking of a couple of millennia) is to store it as hardcopy. This can work much like the ancient stone tablets - a friend of my parents is an archeologist who managed to translate 2000-year-old reports from the Middle East. I think we should write our important data on corrosion-proof sheet metal (with a laser, for example) and store it in several places around the world which are geologically stable. There we can put down important facts about our society, and the plans to build the stuff needed to read our digital media if they have survived. Because we have no guarantee that there will be no breaks in human development in the near future which are as deep as the end of the Roman Empire.
--Ulrich
On no accounts allow a Vogon to read poetry at you
IMO this is no real problem as long as the price of storage keeps falling.
Every ten years you can store about 100 times as much data for the same price in nearly the same time.
It is no problem today to keep everything someone has written in his whole life on a laptop, buy a new one every 5 years and use ten percent of the disk for *everything* that was on the last machine.
And if there will be a demand for offline long term storage media, it will be met.
Or in case of books, you can still print them out or even engrave them into platinum sheets if you think your writing deserves it and you are willing to pay for it.
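The "ten percent" figure follows from the growth rate quoted above: a hundredfold increase per decade is a tenfold increase every five years, so a completely full old disk fills a tenth of the new one. A small Python check of that arithmetic, assuming the growth is smooth and exponential (the function names are just for this example):

```python
def growth_factor(years, factor_per_decade=100.0):
    """How much more storage the same money buys after `years` years."""
    return factor_per_decade ** (years / 10.0)


def share_of_new_disk(years_between_machines, old_disk_fill=1.0):
    """Fraction of the new disk taken up by everything from the old one."""
    return old_disk_fill / growth_factor(years_between_machines)


if __name__ == "__main__":
    print("capacity growth over 5 years:", growth_factor(5))
    print("share of new disk used by a full old disk:", share_of_new_disk(5))
    print("share after two such migrations:", share_of_new_disk(10))
```

As long as that rate holds, each migration shrinks the relative footprint of everything older, which is why carrying it all forward stays cheap.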
Peace and Prosperity
Pavel
Without order, nothing can exist. Without chaos, nothing can be created.
At some point, the constitution will fade away to the point that it can't be recovered, but it's been copied so many times, it would be very difficult to truly remove it from the culture.
That's because the constitution is a meme, an idea that floats through a civilization as though it were alive, like Shakespeare or the Bible.
Document preservation is for those documents that don't get so much attention.
let me just say that this is an understatement. i don't think i've ever used a 3.5" that lasted more than a week. and this isn't just with one drive, either! as amazing as it seems, i don't think i _ever_ lost anything i put on 5.25" (which admittedly wasn't very much). but 3.5" disks are almost entirely useless. i don't store _anything_ on those suckers that is even of marginal importance.
I feel your pain. In fact I feel like a soulbrother or something. When I have to use floppies (which is getting more and more rare) I always save it twice (with another extension) or use two floppies. My last two computers I have built without any floppy drive at all, and I just use bootable CDs and CD-R or CD-RW for storage. Man, what an improvement this is.
"There is no substitute for thinking" - Bjarne Stroustrup
If anyone had the time, I imagine that someone could retrofit an existing 1/2" audio production deck with a new head-stack that could read the tapes. ??? If Les Paul could invent the multi-track recorder in the '50s, someone today could build a head stack that would read the tapes. We in the video industry are having the same problem with tape rot. I see a lot of Memorex tapes from about 20 years ago whose oxide just sheds off! As the stuff decomposes, it gives off the strangest smell. Ewwwww. :)
It's mandatory to wash your hands before returning to the land of Dairy Queen.
Let me guess: your primary "production result" is text, right? If you work with heavier media than that, one DVD disk is definitely not enough. Heh, sometimes it's nice to be a coder; we're safe.
main(O){10<putchar(4^--O?77-(15&5128 >>4*O):10)&&main(2+O);}
Anyone care to explain why they are TRI-corders? Seem to do a hell of a lot more than 3 things..
Seem is the operative word. Supposedly, everything they do can be boiled down to a combination of three functions: scanning, analyzing, and recording. Hence the word, tri-corder.
Does this
You're right! This may be as big of a problem as Y2K!!!
We could use the technology from the Reagan Star Wars program to etch BarCoadFS data onto giant titanium pallets which would be copied and have the backup of the backup be sent to a deep cave in the moon for storage.
(Then again, maybe I should stop toking off my Klein Bottle Bong and seriously worry about this.)
In the county where I live they recently discovered all their old data tapes are largely unreadable. The paper originals for this data are all in the landfill by now. Ultimately, what people are going to discover is that, for all its frailty, paper lasts forever - all you have to do is keep it dry and away from fire. Don't forget, it isn't the fragility of the media that should worry us, it's the speed at which this stuff becomes obsolete. Readable DVD's, CD-ROM's, whatever ain't no good fer nuttin' if the machines to read them are worn out and there are no parts anywhere to fix them. I have a beautiful Wollensak tape recorder that 30 years ago was a very expensive reel-to-reel machine, the best the average person could afford. I got it at a flea market for $20, and when it stops working, it goes to the landfill, and my tapes, pristine as they are, may as well be paperweights. And, if you had a like-new collection of anything on Beta, how would you play them back? Yet, I can remember a time when Beta machines were everywhere. We don't need durable media, we need machine standards that will not change from decade to decade, or better yet, century to century. But, that is totally unreasonable. Anybody got any ideas? And, anybody got an ol' Wollensak I can use for parts? I love those old Stan Kenton tapes !......
If CD's decayed that quickly, a lot of environmentalists would be very happy. Plastic just doesn't decay that quickly in air, unless pollution has become a lot worse lately.
Where is my mind?
Check out Project Upper/Mute, an all-around awesome compiler fra
The solution is simple:
Create a storage medium that is composed mainly from soda cans, plastic rings from around six-packs, plastic grocery bags, styrofoam food containers, etc. . .
I understand that these things take eons to decompose.
;)
In 20 years you'll probably be lucky to find a cdrom drive.
The filter of decay has served mankind well so far - sorting out that which somebody treasured enough to save from the vast ocean of lesser stuff. In this century the Dead Sea Scrolls were discovered nicely preserved for over a millennium because somebody thought them worthwhile.
Yeah, that works fine; at least until the Romans come in and burn down the library you were keeping all the important stuff in. You destroy the library, you destroy the culture. You destroy the culture, and the land is yours for the taking. Do it often enough, and you have yourself an Empire.
Is this post not nifty? Sluggy Freelance. Worshi
I have many audio CD's that are over 10 years old (probably around 150 exceed 10 years), and not a single one exhibits any perceptible degradation, except for a very small number that have been physically damaged. Some of these date back as far as 1983, and even those have no audible defects. These have had quite a bit of use, and no special handling or storage beyond the jewel boxes that came with them. From time to time, they've even been exposed to temperatures above 100F and below 32F. I'd say 10 years for consumer audio CD's is rather conservative.
---
Peace,
vilvoy
Yup, I was incredibly surprised by the same phenomenon when I wanted to really erase some of my audio tapes before recording on them again. I was getting sick and tired of hearing U2 come through my new Coltrane recordings (!). I have always been very cautious about letting my tapes/disks/etc get anywhere near anything magnetic, so when I wanted to get rid of that U2, I figured this weakness could be put to good use. I laid a speaker magnet several cm across on top of the rewound spool of the cassette and left it overnight. The next day, I was quite surprised to find Bono still singing at full volume. Hmm. After some experimentation, I found that the only way to "erase" anything with the magnet was to have the magnet actually touch the tape ribbon. The next problem was: how was I going to get the entire length of the tape ribbon to rub up against this magnet; I sure didn't want to wind the whole thing by hand. I ended up taking out the motors/circuit board from my 8 (?) yr old Sony walkman, laying the tape on the board sideways...so that one of the spoolers fit into one of the spools and exposed the tape ribbon, and pushed the magnet up against the tape where the read head normally goes. What a fun and amusing hassle! My friends thought I was nuts. And you know what....I can still hear Bono coming through sometimes. Good grief!
// ///#\)
______________________(
Amen to that. I've recently had occasion to look over my accumulation of old data. Items:
- a 6-foot stack of source code, punch cards, old Algol and FORTRAN programs. Unreadable by current hardware. Value:zero/sentimental.
- one 1200-foot, 9-track, 1200-dpi, magnetic tape. Content:More Algol programs, complete database for printing out a 6-by-9 ft. Playboy poster on a Burroughs high-speed line printer. Unreadable by current hardware. Value:zero.
- a stack of Apple ][ disks with "all Apple ][ software ever written". Unreadable by current hardware. Value:near-zero.
- A couple of thousand 400K diskettes containing Mac System 1.0, Microsoft Word 1.0, Adobe Photoshop 1.0 and similar stuff. Unreadable by current hardware. Value:who knows?
- several Syquest 44MB cartridges and drive, containing "important backups". Incompatible with current hardware/software. Value:zero.
- several Ricoh optical cartridges and drive, ditto, ditto.
- several dozen DAT DDS-II backup tapes and drive, containing "backups of everything!". Can still be read on occasion, but it's been over a year since I _had_ to. Value:who knows?
All the important stuff is constantly migrating from hard drive to hard drive, anyway. Granted, I'm an individual programmer - a company or institution will view things differently, but I don't worry...
The moral of this story is that for any archival medium that requires technology to interpret, one must archive the necessary hardware to read it, along with spare parts, the software to control the hardware (another little problem), not to mention the software to interpret the data and a machine (or machines) to run that. This is not a simple problem.
In the future of Star Trek, they never seem to have trouble figuring out that collections of similar-looking small loose objects are data storage devices. And they always seem to manage to figure out how to read the things, too. Of course, after 400 years of people coming out with new data storage devices every few years, maybe they would have invented every possible mechanism for storing data. :-)
I think the only media that have a significant probability of surviving some decades are MO (magneto-optical) disks. And you might even get drives that can still read them in the future, as the major drive manufacturers have committed themselves to support at least the last 3 generations of media. As the drives are SCSI, there is a long-standing command set that specifies how to read the disks. In contrast to CD-R, the MO disks come with a sturdy casing (ever asked yourself what happens to your precious backup on CD-R if you drop the disk??), and drives do a verify and automatic error management. At present the manufacturers claim a data life of 50+ years.
On the downside, the drives are relatively slow (ca. 300 KB/sec writing, ca. 800 KB/sec reading), and drives are somewhat expensive (ca. 250 Euro/USD for the 3 1/2" variant), but media are cheap (9 Euro/USD per 650 MB disk) and it's a standard removable disk, not some media you can only write in large chunks.
I think as long as preserving your bits is concerned, no other presently available system is comparable in reliability and potential to be still readable in the future (apart from printouts on good paper). And if you need to store lots of data, you can use the 5 1/4" variant that stores up to 5GB of data per disk.
Now considering what exactly you can store so that it is still interpretable some decades into the future is a much more complicated question. The first problem is not even the files, but the filesystem. I really don't know what to use here. Maybe no filesystem at all, but just a plain tar into the device? Tar might indeed have the potential to be still around in 30+ years.
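To make the "plain tar into the device" idea concrete, here is a minimal sketch in Python. It assumes the device can be treated as an ordinary byte stream; an in-memory buffer stands in for the raw disk, and the helper names `write_tar_stream` and `read_tar_stream` are made up for this illustration.

```python
import io
import tarfile


def write_tar_stream(files, stream):
    """Write {name: bytes} into `stream` as a plain POSIX (ustar) tar archive."""
    with tarfile.open(fileobj=stream, mode="w", format=tarfile.USTAR_FORMAT) as tar:
        for name, data in files.items():
            info = tarfile.TarInfo(name=name)
            info.size = len(data)
            tar.addfile(info, io.BytesIO(data))


def read_tar_stream(stream):
    """Read every regular file back out of a tar stream into {name: bytes}."""
    result = {}
    with tarfile.open(fileobj=stream, mode="r:") as tar:
        for member in tar:
            if member.isfile():
                result[member.name] = tar.extractfile(member).read()
    return result


# Assumption: an in-memory buffer stands in for the raw device (no filesystem).
device = io.BytesIO()
write_tar_stream({"notes.txt": b"plain ascii survives\n"}, device)
device.seek(0)
restored = read_tar_stream(device)
```

The archive is just a run of 512-byte blocks with simple headers in front of each file, which is a large part of why it stands a chance of being decodable without the original tools.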
Next problem is what format to store your files in. For sources plain ascii should be best. LaTeX might work. For processed text consider PostScript level 1 with fonts embedded. Maybe RTF, but I really don't know.
This is just one of these questions everybody thinks is easy, but which in fact is very hard. And cheap solutions like CD-R were never designed to be used for reliable and/or long-term storage.
Most ACs are not even worth the keystrokes to insult them. Be generically insulted by this and ignored otherwise.
This is happening in Sweden too. I saw a news show a few months ago about how there is so much data to transfer that we won't have new copies of everything until the things we copied first are in danger of corruption.
This may not be as offtopic as you may think. It's funny sometimes when you read posts on here. For example, take a look at the MPAA/DeCSS arguments on the story just a few articles down on here. Within you will find a plethora of people arguing that 'copying DVD is just not feasible, the size itself will stop anyone, who wants to store 5-6GB of a movie? If you compress it, the quality sucks.' Then take a look at this thread. People posting on how terabytes of storage will be readily available soon, how new CDs will hold more information than ever dreamed possible before. Anyone see a connection here? Now let's go a little more offtopic for just a second. The MPAA is well aware that hardly anyone out there is going to bother trying to copy a 5-6 gig movie to the hard drive or span it over 7 CDs. Even compressed, they look horrible. Don't think the MPAA does not know this. What they do know, however, is that the technology of very massive media storage is right around the corner. We are talking about it right now. Who knows, within the next 2 years I may be able to burn 100 gigs on a single CD, or store 500 gigs on a hard drive costing no more than $500. That's what scares the MPAA. I don't think they are as stupid as we think they are. With data storage skyrocketing in size and reliability it is perfectly feasible to be able to store 5 full length copied DVD movies on a single CD-ROM within the next 2 years. Now the whole DeCSS and reverse engineering the code is a whole new issue, and I stand against the MPAA on that one. I just don't think they are as stupid as people think they are when it comes to data storage and DVD size. Dirk Daring
This encryption might make it more difficult for future historians to decipher the digital data, even though they'll have far more powerful computers.
Maybe some bits would get corrupted, ruining the whole encryption scheme, or maybe they wouldn't recognise the data as an encrypted Hollywood blockbuster, but just discard it as useless junk?
I suffer from attention surplus disorder.
Anyway, it turned out I can't moderate if I respond to a discussion.
Why would floppies be unreadable? In the future, I don't think engineers will have such a hard time figuring out that the magnetic patterns on the disk are like the holes in the punchcards you mentioned.
:-)
But I'll moderate your post up anyway.
Verbatim...anyone remember the DataLife Plus 5.25" disk? I actually poured coffee on one and got fingerprints on it...then I pulled the disk out of the plastic shield and wiped it with alcohol...then I used it...and it worked.
Amazing quality. I wouldn't think for more than a second before choosing Verbatim's media.
I used to work at the Royal Greenwich Observatory in the UK, and there were often debates about the long term preservation of archives. I'd recommend slashdot readers interested in serious long term archive issues should check what the big national museums and archives are doing. Several issues to consider:
Format of actual data. Pick something that people are likely to read in 1000 years. Hard enough as one of the slashdotters has said to unpick an early version of MS Word, let alone a 1000 year old format.
Usual story - the more complex the medium, the more likely it's going to mess up somewhere :-)
Hmm...you basically just described the process for making a conventional CD master, except on the master stamp all the pits are inverted. What you might want to suggest is that they instead make a conventional stamped CD, and merely use more durable materials. This way no special reader is necessary. (although a caddy would be preferred to preserve the life of the disc)
If anyone's interested, the English translation (by Kandel, natch) is entitled Memoirs found in a Bathtub and is an excellent read. It's a claustrophobic, Kafkaesque black comedy set in a vast underground military establishment. There's a lot on information and particularly encryption... in one particularly brilliant scene, a cryptographer runs lines of Shakespeare through his computer to discover their "true" meaning.
What is making the CDs rot? Is this an oxidation reaction (with air), or an instability in the CDs themselves? Wouldn't keeping your CDs in a vacuum plastic bag help? You can buy such bags to use with a pump to store stuff for longer periods of time.
- Steeltoe
http://www.debunkingskeptics.com/
Yes, some data will be lost. That has happened before. We don't have complete records of our past, and we are doing fine as is. Can't we expect that our children's children will be able to do just fine even if they can't find the /. source?
The truth is that most records are lost, not primarily because of physical decay but because nobody cares enough.
If you want your work to survive to future generations, you make sure that enough people find it and make copies. That way you increase the chance that at least one copy will survive.
Actually, I sometimes think that the urge to preserve everything is a sign of decay. Monuments are built by the empire at its peak, wishing to be remembered, not by the rising competitor.
All opinions are my own - until criticized
At the current rate of growth in storage media sizes, you can fit _all_ your data on a disk (CD, DVD, etc...).
So you would only have to make a copy of your old backups every, say, ten years (to be on the safe side).
Javier
When his defense asked, "Which computer has Jon Johansen trespassed upon?" the answer was: "His own."
I'm asking now, how should you do so?
One paradigm I use is the concept of graduated copies. The idea is that the most vital data - information which is required every day for tasks you use should be the most accessible and the easiest to change. As data becomes less useful it should be moved into less accessible, more static regions of storage, until it moves into your equivalent of a "permanent" archive. Nothing is truly permanent unless we find a shortcut around entropy, but the permanent archive would have the feature that the data stored within never changes, is slow/difficult to access, and degrades very very slowly, but inexorably.
For example, your hard drive, in a few high-traffic folders, is the fastest and most accessible place for data. You can subdivide your hard drive, too, by making folders where you move data you don't think you're using. Eventually the hard drive goes onto your conventional backup media (CD-R's in my case), and you delete the non-vital data in the junk folders from the hard drive. Now the next time the drive gets a backup you're only backing up fresh data; your other junk already HAS a home, on that CD-R.
How you decide what is vital and what is not isn't easy, but here's one way: write a shell script that sorts all files on the drive by last access time, and prints out the bottom, say, 10%. (The bottom being the least-recently-accessed.) This is easier in Linux than in, say, Windows. That data probably never gets accessed, and you should uninstall it or kick it out to your permanent storage.
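The sort-by-last-access idea above can be sketched in a few lines. A shell script would do just as well; this version uses Python, and the function name `least_recently_accessed` and the 10% default are only illustrative choices. It also assumes the filesystem actually records access times, which is not a given (volumes mounted with options such as `noatime` do not update them).

```python
import os


def least_recently_accessed(root, fraction=0.10):
    """Return the `fraction` of files under `root` with the oldest access times."""
    entries = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                entries.append((os.stat(path).st_atime, path))
            except OSError:
                continue  # file vanished or is unreadable; skip it
    entries.sort()  # oldest access time first
    # Always report at least one file when there is anything to report.
    count = max(1, int(len(entries) * fraction)) if entries else 0
    return [path for _atime, path in entries[:count]]
```

Calling `least_recently_accessed("/home/me")` (a placeholder path) returns the candidates for the junk folder or the permanent archive; reviewing the list by hand before moving anything is the safer habit.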
It's rare that you're presented with a knob whose only two positions are Make History and Flee Your Glorious Destiny.
Perhaps a plastic tape could be used for backups. A clear tape could have black stripes printed across it. Then just shine a laser through it to a reader on the other side. This would provide a flashing laser light, and that will end up as 1s and 0s.
"Reality is less than television."-Brian Oblivion
Let's look back at history. Are we seeing people race against the clock to try and back up their DOS 1.0 disks? Nope. Why? Because it's useless. I have some old floppy disks which have stuck with it for over 10 years (Win 3.0, DOS 5.0...). They still work, but am I ever going to install them on anything else? Course not.
My church just upgraded their computer system. They were using good ol' Word Perfect 5.1 for DOS, but now they have WP 9. What'd they do with all their data files? Either trash them, or use the dot-matrix printer one last time to print them to hard copy. Granted, we still could have opened up the DOS files in WP9 and saved them in that format, but most of them we didn't need.
A few companies around my area that I see upgrading their computers scroll through the old datafiles, print out the financial ones they need to keep record for, and trash the rest. Why worry about a business letter typed up to a company 15 - 20 years ago?
Case in Point: Much ado about nothing.
Has anyone read "A Canticle for Leibowitz" by Walter M Miller Jr? A post-apocalypse book, it is based on the premise that it is not just random environmental degradation that destroys our data and hence our science after the nuclear holocaust, but that people were so disgusted at how our knowledge had been used that there was a mass orgy of book burnings.
Even given that you can't book-burn an optical disc (though it could easily be destroyed) surely an equal worry to the slow deterioration of our media is that people may not want to preserve data that future generations may consider valuable?
My university has an archiver with a "preserve everything" policy because when they migrated their systems at the end of the 70s they only saved the "important" stuff like source code. They had a request a few years ago for binary code from an obsolete system that someone was trying to emulate - and they couldn't give it, even though it would have been exceedingly useful, because it had been deemed "worthless" and lost.
Perhaps we should be equally concerned about what we're preserving, as well as how we're doing it.
--
Tom Harris
http://www.harris.ukgateway.net
"The other possibility I see is that bandwith gets cheap enough so that we may consider remote storage vaults. That has a couple of privacy issues I'm certain you can see... But it's incredibly convenient and will probably be adopted by everyone if we just find a way to have a high speed switched pipe to everybody's home at a reasonable cost.."
:)
Just an idea, but if everything the remote storage vault saw was already strongly encrypted, there wouldn't be much loss of privacy. There would still be some risk, but I think I'd be comfortable with it.
Mmmmm.... high speed switched pipe to everybody's home... yummy.
i'd better run and patent that!
DAT hasn't been reliable in my experience. I found corruption in the middle of one after a few months. Sure it might have been a dud, but then so might your backup tape... And that was with a supposedly high quality HP drive.
In my experience, CD-R is about similarly safe. I found corruption on one of those once, it was a Linux distro in fact. Maybe it wasn't the medium at fault in that case.
enjoy,
- Jamie
Punched cards? pfar! They'd rot. Carved stone tablets are the way to go.
Hmm. but friction from the read (write?) heads could eat it over time...
Ok... how about we use huge blocks of neutronium...
Shawn Poulsen (Fruan)
"On Slashdot, many obvious things are insightful." - Annonymous Coward, 2000/7/9
I hear that some people are worried that in the future, the formats we use today will be so arcane that it will be practically impossible to find out what the data we leave behind actually means.
I really do not think this will be much of a problem. Computing power is increasing exponentially, as is our knowledge of how to use that computing power.
Finding out how to read a format is essentially the same as breaking an encryption scheme, except that most formats aren't made specifically to be hard to read, as most encryption schemes are, so the task is actually easier.
I don't know about formatting and stuff like that, but certainly, people in the future will be able to figure out what we have been sending, provided they have the data. I mean, *today* we can provably reverse engineer today's formats, so why shouldn't people in the future, with vastly greater knowledge about basically everything, be able to do so, should they so wish?
Bjarke Roune
You think so?
I'm thinking of adding an old 2X to my main system because the DVD I have in there cannot read any CDR at all. It can read factory CD's but cannot read any recorded CDs.
If I want to copy a CD, I have to copy the whole thing to disk first.
Compatibility would be nice. My CDRW cannot read DVD (of course), and my DVD cannot read CDR or CDRW.
In passing, what is the expected life of a CDRW under ideal conditions?
Philips helped develop CD technology. Their internal projections on disk lifetime, as I remember, were approximately 10 years for a consumer CD (i.e., a Modern Jazz Quartet audio CD), no more than 10 years for the green CD-Rs, and "over 50 years" for a glass master. All assume the media is stored under controlled conditions (temperature, humidity, atmosphere, and radiation). These are projections based on testing, so be sceptical. The message: don't count on archival storage -- over a hundred years, say, like a book printed on acid-neutralized paper and stored in a good library -- from CD technology.
There was a similar story on PBS awhile back concerning the 1960 Mars probes' telemetry data. The data were archived at a university in (I believe) the Northwest. Test reads of that data suggested that about 20% of the tapes' contents were no longer readable.
Perhaps IT curricula (the kind that train people who say, "I don't code, I manage") can include a kind of data life-cycle management approach. The costs of updating both hardware- and software-based storage formats on an ongoing basis are probably less for crucial data than the cost of a crash-recovery project once the data are urgently needed. An analog would be the fact that maintenance of large Cobol-based systems was considered unglamorous, unsexy, and a backwater job, until Y2K concerns made those jobs quite remunerative for awhile.
- Dave
These computers are forgotten by most, but not by us. People throw them away, and we take them home.
Don't call me crazy, that's what they called me back in the home!
All, yes ALL, of the data storage that has the ability to easily migrate across generations requires neither a special reader nor a connection. From paintings on walls to carvings. Some of the actual meanings have been lost to time along with the coders. But the data still remains and most likely will be decoded.
Current tech has only been around for the past 100 years. Not even close enough for a real duration test. Most of the data archived to data tape in the early 50's is lost to time. Movie reels from the turn of the century are falling apart. Museums are always seeking film restoration experts, because the vault is disappearing faster than it is being restored.
Floppies just require a small accident, CD's just need to be scratched once good. Remote storage just requires an unplanned critical fault.
If I wish to ensure my data is kept through time, it's time to fire up the printer. Reams and reams of paper. Yes, paper can burn, but I have more faith in paper than current media.
Strange how they haven't tested it for 20 years.....Much less 200.... *Kill the opossum....He just makes problems for the possum*
I can think of a couple, not including inability of current technology to measure precisely enough:
Marking and measurement technology, of course, are additional issues.
Hey, my old Commodore 128 and Apple II data remains readable to this day on double sided low density floppies (the kind where you flip the disk..). I guess most of them are about 14 years old now. Incidentally, there are some 8 inch disks at my school that are older than I am...
I don't think your cds would be affected that much from nuclear radiation either... :)
--
"I'm surfin the dead zone
In the twilight, unknown"
Even knowing that old data becomes less and less valuable (it's a fact), there is still the need for retaining it, else we would lose all our history just by labeling it as "old stuff".
The same kind of problems apply to the idea of a "Virtual Cemetery", which I think is definitely interesting.
Maybe it sounds funny, or futuristic, to think about it now, but I personally would rather be remembered for what I have done (including source code :-) and binaries, pictures, a .wav of me singing, whatever) than by a picture on a piece of marble somehow.
But then how do you trust an "eternal provider"? Sure, the provider, being an "active" agent (e.g. getting money for it), can refresh its data periodically so there is no "long term media" problem, but then who can guarantee it will stay in business and not burn it all someday?
The only solution to this then becomes the same as the solution to long term storage: some sort of consortium of distributed storage that everybody agrees on and pays some fees to, capable of distributed redundancy. One solution for long term storage AND something that would give me a little hope for my little virtual tomb :).
This ass_u_mes that the internet KNOWS the difference between best and junk. Present day archaeologists gain tremendous insights from ancient garbage dumps.
The best solution for existing digital data is to continually copy forward, and document in parallel what you did. This way, the future will know what something is, and what we thought about it.
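As a rough sketch of "copy forward, and document in parallel", the following Python copies a directory tree to new media, verifies each copy against a SHA-256 checksum, and writes a manifest next to the data. The function names, the `MANIFEST.json` file name, and the manifest layout are all assumptions made for illustration, not an established archival format.

```python
import hashlib
import json
import os
import shutil
import time


def sha256_of(path):
    """Hash a file in chunks so large archives need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def copy_forward(src_dir, dst_dir, note):
    """Copy every file to new media, verify it, and write a manifest alongside."""
    os.makedirs(dst_dir, exist_ok=True)
    manifest = {"note": note, "copied_at": time.strftime("%Y-%m-%d"), "files": {}}
    for dirpath, _dirs, filenames in os.walk(src_dir):
        for name in filenames:
            src = os.path.join(dirpath, name)
            rel = os.path.relpath(src, src_dir)
            dst = os.path.join(dst_dir, rel)
            os.makedirs(os.path.dirname(dst), exist_ok=True)
            shutil.copy2(src, dst)  # copy2 also preserves timestamps
            checksum = sha256_of(src)
            if sha256_of(dst) != checksum:
                raise IOError("copy of %s does not match the original" % rel)
            manifest["files"][rel] = checksum
    # Hypothetical manifest name; the note records what was done and why.
    with open(os.path.join(dst_dir, "MANIFEST.json"), "w") as fh:
        json.dump(manifest, fh, indent=2, sort_keys=True)
    return manifest
```

The free-text `note` is where "what we thought about it" goes: what the files are, which program wrote them, and why they were kept. The checksums let a later copy detect silent corruption rather than faithfully propagating it.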
There is an example of Prior Art here. The Irish monks maintained a great deal of learning through the European Dark Ages, material that was not retained in the Moorish libraries.
The stuff in real-space, that is another matter. Curators and Librarians have spent centuries working on those problems.
oh well, typos happen I guess. How embarrassing :)
..and replace drives as they fail. Of course, that seems to be only cheaper in the relatively short term, but as drives themselves continue to become cheaper, it will hold over the long term too, assuming he's got enough business.
We do have a great long term storage media, stone!
Thousands of years after people made monuments in Egypt and other places, hundreds of years after people had forgotten what the messages meant, the message still remained, because it was etched into stone.
Stone works very well as a long term storage media, as it is sufficiently heavy and difficult to work with. And there is usually little benefit in trying to make it into something else. (Example. The ancient Greeks made truly great statues from marble and bronze. The bronze statues, for the most part, were melted down by other peoples, because they could be used for war purposes. The marble statues, however, lasted, because they were not very useful outside of being statues.) What if we make a very reliable metal disk, and all of them are melted down for a future war effort? Or perhaps we run out of oil, and all the CDs are melted down to power our cars?
For real permanence, we need to use something that will last when people do not care about it, or create something of such significance that people will always care about it.
After researching many formats and mediums he figured none would really stand the test of time. His solution: post it on the net. The reasoning being that others would find it and hopefully enjoy it the way he had. If others took copies, hopefully it would be converted to various formats and mediums, morphing over time from an html document to xml to ... This sounds like the best available theory to me.
May not be exactly the same case but it provides good ideas for a way to approach the problem I think. It gives the best chance of something of value being maintained.
"Patience is a virtue, afforded those with nothing better to." - I don't remember
"Patience is a virtue, afforded those with nothing better to do." - I don't remember
I used to use one in the early 80's and its heads were worn out BEFORE we got it.
We used to offer a tape to disk service in those days when floppies started becoming popular, and the only stuff which really caused problems were the old word processor (proprietary formatted) files. Even strings didn't help.
If you do find a junked one, apart from the obvious things to repair (all the electronics), the vacuum system is critical and the head can actually be ground back to some extent (polished).
Then of course there are the problems of actual data conversion (EBCDIC or whatever: did they actually use a standard code or was the material stored as the binary image like we used to do with paper tapes?).
That far back in time, chances are the coding scheme used was proprietary... although in the case of your system you may have the executables as well as the source.
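If the tape does turn out to use a standard code, the conversion step itself is small. The sketch below tries a handful of candidate code pages on the same bytes so a human can eyeball which one yields text. The candidate list and the helper name `try_decodings` are assumptions; `cp037` and `cp500` are two of the EBCDIC variants that ship with Python, and a proprietary code would need a hand-built table instead.

```python
# Assumed candidates: two EBCDIC variants plus two ASCII-family encodings.
CANDIDATE_CODEPAGES = ["cp037", "cp500", "ascii", "latin-1"]


def try_decodings(raw, codepages=CANDIDATE_CODEPAGES):
    """Decode `raw` under each code page; None marks a code page that fails."""
    results = {}
    for codepage in codepages:
        try:
            results[codepage] = raw.decode(codepage)
        except UnicodeDecodeError:
            results[codepage] = None
    return results


sample = bytes([0xC8, 0xC5, 0xD3, 0xD3, 0xD6])  # the letters HELLO in cp037
results = try_decodings(sample)
```

Here the EBCDIC code pages give readable letters, plain ASCII rejects the bytes outright, and Latin-1 "succeeds" but produces accented gibberish, which is why a successful decode alone proves nothing and the printout remains a useful key.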
Methinks the printout would be more easily handled with OCR.
Good luck!
- JR
This problem isn't new. I heard a story a few years ago about the problem of finding data from the moon landings. Not only did they find it difficult to find where the data was archived, but they found it was on a recording format which wasn't used any more, and they had to find the one remaining compatible tape reader, which was about to get chucked. But actual media decay? Well, punch cards - people already have stacks of punch cards which have lasted 30 years+ - so they are known and tested for long term reliability.
According to the Dead Media mailing list's Working Note 32.4:
"DATA STORAGE: FROM DIGITS TO DUST
"Surprise == computerized data can decay before you know it
By Marcia Stepanek in New York
"Up to 20% of the information carefully collected on Jet Propulsion Laboratory computers during NASA's 1976 Viking mission to Mars has been lost. Some POW and MIA records and casualty counts from the Vietnam War, stored on Defense Dept. computers, can no longer be read. And at Pennsylvania State University, all but 14 of some 3,000 computer files containing student records and school history are no longer accessible because of missing or outmoded software.
(...)
"For consumers, the biggest worry is CD-ROMs. Unlike paper records, CD-ROMs often don't show decay until it's too late. Experts are just beginning to realize that stray magnetic fields, oxidation, humidity, and material decay can quickly erase the information stored on them.
"Says Robert Stein, founder of New York-based Voyager Co., which makes commercial CD-ROM books and games: 'CDs have a tendency to degrade much faster than anybody, at least in the companies that make them, is willing to predict.' Stein doesn't expect the CD-ROMs Voyager sells to last more than 5 or 10 years, and neither, he says, should customers."
I must say that for me personally, it is not a big deal. I really don't see myself in 20 years using the same data I am today. For nostalgic reasons, I may be interested in playing older games, but by then, it will probably be through emulators since no modern computer will have equipment that supports such antique technology as trilinear filtering. =-) However, I am worried about corporations and the government. What is going to happen when the IRS backup tapes decay? Farfetched, I know, but it is a thinking question. Banks and government agencies are going to have the most trouble with this, but I am sure they have come across this problem by now and already have solutions. For people worried about their data: 1) always create a backup 2) if it is really important, have a backup on two different media. I mean, if the average floppy has a lifespan of 3 years, and you go back 5 years later and realize that both floppies are bad, what are you going to do? and of course 3) never keep your original and backup in the same place. I think I just restated stuff that all of you know. Anyways, back to the original question, no, I am not incredibly worried. At least at the moment. I mean, if a CD starts going bad on me, I can always make a copy of it. Of course eventually, you are going to run into generation errors or something, but, hey, just take care of your CDs, know how to handle them, and always put them back in the case when you are through.
When our remains are dug over in a few hundred/thousand years what will be left?
We buried piles of junk a few years ago but now it's all recycled (a good idea!) but all our data will disappear. Our books will rot, our CDs and tapes will rot or simply lose data. We will leave no rosetta stone of ASCII or other data types and all the fragments that will survive will just be 0's and 1's to future generations.
The first 50 years of cinema have all but gone because the films have corroded. Early photography has disappeared. What can we do now to ensure our cultures are not lost over time?
Lets not let them forget us!
Sparkes
PS Don't bother posting if you just wanna inspire flames!
*** www.linuxuk.co.uk relaunches 1 Mar 2000 ***
blog and junk
As you say, really important stuff will be kept on servers with mirrors. Alternatively, if a computer medium has a known lifetime (and the reader for that medium still exists), it's easy enough to transfer it onto the new medium.
;-)
The thing is, how much stuff do we want to keep? And does it matter anyway? I can't see the loss of the original copy of the Declaration of Independence, or anything like that, as being a big deal. The concepts are what's important here, not the physical object. And if information is important and it's in the public domain, it'll be copied, mirrored, reused and so on in so many different formats, you're never likely to lose it. I mean, there aren't many first editions of the Bible around, are there?
The future of growth means, probably, that atomic scale memory is the densest. This'll be reached within half a century and from then on, media problems will be over - however, interface problems won't be.
These interface problems (e.g. different filesystems, different data types) will continue for a long, long time: probably many careers will be made in 'programming archaeology.'
I don't think the VM idea will work; who's made a VM or tape drivers for NASA's store of 1960s telemetry data, which might one day be needed? Some lucky person will have to hack the tapes directly.
I see data decay becoming a problem for people out there who buy a new computer every.......10 years or so. As for the rest of us in the world, all we have to worry about is MS blue screens and hd crashes (both of which appear to like my computer...) ;)
So you rewrite your CD-ROM backups onto fresh media every 15 years. Not hard.
Besides, how useful are the contents of those disks going to be 20 years from now? Gimme a break.
73 de N5VB (ex-KD5BIV) AR SK
The flash chips, like the ones in MP3 players are what we will need to go to. NO MOVING PARTS! and incredibly fast read/write!
The Computer Conservation Society software
preservation people in the UK
(http://www.personal.leeds.ac.uk/~ecldh/ccs/preserve.html)
have been trying for several months to find
somewhere here that can read 7-track tapes with
historic software on. There is a tape I would
like to get read that is thought to contain the
source to the Supervisor for the Cambridge TITAN
system (http://www.cam.ac.uk/CambUniv/Societies/cucps/titan/) -
the first timesharing system developed outside
the USA. There is another copy of this source -
a three-inch high pile of line printer output
from 1973 - which should at least help as a key
to identify character sets if the tape can be
read. (The appendix on character sets in the
Titan Programming Manual runs to 27 pages.) The
tape was copied from a TITAN-specific format
to 7-track tape before TITAN was shut down in
1973 so it could still be read afterwards, then
forgotten about until after Cambridge no longer
had any 7-track drives - then found again last
year after being supposed lost.
Have any copies of the source to CTSS (or, indeed,
any timesharing operating system that was
operational before that on TITAN went live on
20 March 1967) survived?
heh, and no, I'm not a Salesman for SanDisk
As for Word 3.0 files not being readable in the future, I have one word: ASCII.
--
steve jenson
stevej@katmango.com
http://www.katmango.com
If all our data is destroyed, except the really important stuff, future historians will have a tough time finding out how a normal person lived in our time. No records of what food we ate, what our pay was, no movies or music or anything. Maybe there will be a couple of copies of everything left (a writer saves his works), but they will probably not survive (i.e. get backed up again) for very long.
Why would floppies be unreadable?
:-)
Well, I was thinking that they might find a floppy and not even realise that it might be used for data storage. You would have to analyse it using a magnetic reader to actually spot the data. I just felt that people would be more likely to analyse it with a microscope.
But I'll moderate your post up anyway.
Thanks, but the fact that you responded to my comment is a much better vote of confidence than moderation.
We have Tomb Raider. That puts us at least two points ahead:)
I think the point was that most server admins won't want people playing IK+ on their machines, so they won't emulate a C64. They won't mind people using Word, so they will emulate a PC
Nevertheless, I think it is important to make sure that these are kept and that people in the future can play them. In 500 years time, they will be as important a resource as any entertainment from 500 years ago is today.
Anyway, you missed Fort Apocalypse
And Invaderload.
And Spy Hunter.
Ignoring the first two is reasonable, but how COULD you forget the classic game that every C64 had to have?
Except I didn't have it.
"One problem is that the next generation may not care about a particular piece of data, but the one after that would find that data invaluable."
Unfortunately, it is the judgement call of the 'next generation' about what gets archived and what doesn't. This is their right, because they are meeting the cost of storing and maintaining it. It is market driven.
Religions have been built on this premise. The preaching and prophecies of each religion have had a huge amount of human investment, in terms of money and lives, to stop their information content from disappearing into the celestial bit bucket. In the end, the heroic acts of the 'keepers of the flame' become part of the folklore of the religion, which gets handed down, and even embellishes, the original message.
Stephen Hawking has written another book. It's about time as well.
Is this just an incoherent rant ?
It is coherent but not deep enough. I read with some horror the narrow use and time views, such as when you said, "Modern word processing still opens really old file formats like Windows .WRI and Word 1.0". Old? I have books in TRS-80 Model I Electric Pencil format!
But it's so much broader than that. We have created a society of ephemeral materials, increasingly so each year. As we moved from stone to paper to magnetic and optical media, we gave up durability for fluidity and speed. But I won't repeat your arguments, only point to some other examples and questions...
Who has time for this? Every year there is more data to back up, more information to get in order. As a composer, I have scores in software now six versions old. The ability to understand the meaning of the data is compromised with each upgrade, so I have to re-work as well as convert and transfer them. And there are sequences created as far back as my hand-built digital box. Some I've brought forward through a TRS-80 Model I all the way into Cakewalk 9. So I'm a composer whose time is split between creating the new and re-archiving the old!
Sure, who cares if I can't recover my KIM-1 data (even now)? As an artist, I do care, especially if great works are lost. The breakthrough music of David Behrman was done on a KIM-1. Frozen documents (CDs) of them have been released, but his music was interactive as far back as 1977. Behrman is one of the last century's musical lights. His work will be lost unless some hardware is kept up or some software moved to another system. Who will do it? I may not come up to Behrman's genius, but I have several dozen interactive works starting in 1978, and some of these technologies are long lost already.
As an individual artist with a body of work spanning nearly 40 years, I have a room full of decaying and obsolete media ... artistic creations that only function on KIM-1 and TRS-80 or OSI or Color Computer with dozens of data formats and custom interfaces. Paper tapes, wafer tapes, 8-inch and 5-inch and 3-inch disks. 4-channel Dolby-B tapes. 2 channel 4-track dbx-1 tapes. 2-channel dbx-2 cassettes. Fostex 4-channel cassettes. DAT tapes. Minidiscs. 8mm and 16mm film. Beta video. 2-inch slides. Mylar overlays. Negatives in many formats. Even a bloody set of endless-loop Elcassettes and 8-tracks for a sound installation! And I'm just one guy.
But it's not just media decay. It's knowledge and understanding. Someone else pointed to a tricorder of the future, which could read the data and determine its purpose. A good idea, if such a tricorder could contain the historical thought of each individual. But even from Beethoven's sketchbooks, who could determine the 'correct' ending for a symphony? Reconstructing data might be possible; understanding it will be impossible.
A rant for a rant!
Dennis
http://maltedmedia.com/
We do have the technology now, as the poster says, to migrate our data ever forwards into new storage, assuming no cataclysm occurs.
But we don't have the time. You can increase data density and processor speeds, but not the human time to make the decisions on what to migrate where, when, and how. Not to mention you're only talking about digital data here, not the real world.
Heck, I was gone to a rehearsal one day and came back to find comments to this topic essentially over. With that kind of attention span, who's gonna do this stuff?
Dennis
http://maltedmedia.com
Yes to this. Who gives a rat's ass about files several years old? Looking at old backups on my floppies is painful just from the boring contents. Let's hope Uncle's (Sam's) media on its citizens suffers a similar fate. :)
The guys at The Long Now Foundation seem to think that digital media will deteriorate. They have a plan to counter it.
siener's youtube channel
If we are to leave our culture to future generations, movies are always a good source. Unfortunately our DVD movies will be encrypted, and if they have lost the keys they will not be accessible. The only way they could get at them would be to hack the encryption, and if the future governments were very primitive in their thinking then hacking encryption might be illegal and therefore they would get arrested. Then again no government could be that pathetic.
I cannot add much about natural media decay over time, but if you're concerned about surviving something like a nuclear disaster or a big meteor hit, I suggest KEO. Just filter your stuff to the very essential and expect your data to last for about 50,000 years.
I couldn't help but laugh when I read this, for a number of reasons:
1) If your sysadmin is worth anything at all, he does backups anyway. Probably every day. Heck, he might even get cron to do it every night when he's not even there. If it's important, it goes to tape every night. This may appear to be a temporary backup (since most organizations cycle about six or seven tapes through the week, overwriting each of them probably every seven days) but if the Very Important Data you want archived is on the drive every night, then it gets backed up every night. When the tape fails, it's on the hard drive. When the drive fails, it's on one of six or seven tapes. Replace as needed.
2) The vast majority of (unix) sysadmins are good at automating tasks. If any of us wind up doing one particular task all day long (as if it could possibly get that far...) we'd just write some program or other to do it for us. Even if it took a week to build, it would slash through the backlog in no time.
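The weekly tape cycle from point 1 is simple enough to sketch in a few lines. This is only an illustration, assuming a seven-tape rotation keyed to the day of the week; the tape labels and the function name are made up here, not taken from any real backup tool.

```python
import datetime

# Hypothetical labels for a seven-tape weekly rotation (an assumption,
# the post only says "about six or seven tapes").
TAPES = ["mon", "tue", "wed", "thu", "fri", "sat", "sun"]


def tape_for(day: datetime.date) -> str:
    """Return the label of the tape that gets overwritten on the given day."""
    # date.weekday() returns 0 for Monday through 6 for Sunday.
    return TAPES[day.weekday()]


if __name__ == "__main__":
    today = datetime.date.today()
    print("Tonight's backup goes to tape:", tape_for(today))
```

A nightly cron job could call something like this to decide which tape to ask the operator for. Any file that sits on the drive all week ends up on every tape in the set, which is the redundancy the post is describing.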
The issue of Archiving Things For Posterity Without Constant Maintenance (the Library of Congress, for instance) however, is a different can of worms that we need to worry about. However, if you consider the fact that you can still pick up a copy of Principia Mathematica at your local library, despite the fact that the original paper is probably long since gone, it's a testament to the durability of Important Information, and the public domain, not to mention the printing press. Nothing short of a global catastrophe such as a nuclear war is going to prevent the important stuff from being handed down over the next million years. And if that happens, well, there won't be any archeologists to dig up the pieces anyway.
---
I can't wait for proper speech-recognition.
"No problem. I have the capacity to do infinite work so long as you don't mind that my quality approaches zero."-Dilbert
I don't believe any study that states CDs have a lifespan of 100 years - not until someone actually finds a way to accelerate time for testing!
DNA as a storage medium is interesting because it is already very well understood (well, the parts that we use at least, and synthesis and translation/transcription), but using it has a few challenges. One, although you can pump out a custom sequence with a desktop synthesizer, the accuracy is not guaranteed. Imagine that your CDRW randomly dumped 15MB of data every time you burnt a CD. Another problem is storage. DNA requires specific conditions or it starts breaking down, and even in an ideal environment it starts to cook. The human body is constantly repairing its DNA, but we have cellular mechanisms to do that. DNA stored within a cell, such as a simple bacterium, is a concept, but do you really want your data escaping the lab? And the contents of your hard drive probably don't code for any useful proteins... Of course, DNA is easy to copy and propagate, just fragile and slow to build. The idea of some guy popping his open source software into a PCR machine and making 1x10^12 copies is cool.
"Life's funny sometimes." "And sometimes it isn't." --Cat's Cradle
As far as film goes - even film today is degrading at amazing rates. The reason why it's not a problem is because we can archive the film digitally, and reprints can be made and sent to theatres. However, film itself is an unstable media. For instance, the campus cinema has to send back the film immediately after they are done or pay for it - because it will have degraded a couple days afterwards, if not immediately returned to an archive state. A good example of this is when they were going to show Army Of Darkness - the only reel they could find had disintegrated beyond repair. (Bummer, as it's one of my favorite films). As for worrying about our civilization being remembered....Who the hell cares? You think they want an archive of www.mcdonalds.com and www.lotsaporninc.com? No. Civilization does not hinge on the Internet. Instead of worrying about what will be found 3,000 years from now...how about worrying what will be found 10 years from now? Live in the present, because that's the only way to live.
The only (admittedly large) advantage of using the Moon as a storage site is that it doesn't undergo geological activity - you could probably leave data there for millions of years without worrying about earthquakes or continental plates colliding.
However, if you're not too worried about keeping your data for that long, you may as well just put it into a high Earth orbit where it isn't likely to run into anything, or bury it in a mountain. This has the advantage of being cheaper than having to throw it all the way to the Moon, vacuum-proof whatever storage device you're using and also radiation-proof it.
After all, inside the Earth's orbit you don't have to worry so much about radiation.
You have to keep in mind, though, that eventually anything in orbit around the Earth, the Moon included, is eventually going to fall down and crash into us.
There's always the solution of building some kind of self-replicating data store that will avoid the damaging effects of radiation and generally getting destroyed by natural causes, but then you've got the problem of the information in the data store 'mutating' through 'reproduction'. Really, the only solution is to build some kind of ultra-protected probe and throw it out into interstellar space.
Unfortunately, even *then* you'll be faced with ablation of the data store material into the vacuum. Fast forward a few eons, and then the fundamental particles of the data store itself begin to liquify and generally mess up.
So, it's a no win situation, really.
Someone will come up with a way to map all your data against, say, the position of all the stars in the Milky Way (with of course an appropriate algorithm to go back in time) and give you, say, a 1.44 Mbyte key. Then you can keep a floppy of each day's key forever. This should finally offset the number of AOL disks in the universe.
Hey,
.. but i remember that i once read an interesting article about the r&d department of NASA, who really have a heavy amount of data coming in all day, and who figured out that just to backup and re-backup the data they already have is almost IMPOSSIBLE, not to mention the loads of data still to come.
it is some time ago
I think regarding the average user at home, the decay of storage media will not be a major trouble, but regarding governmental or professional institutions this may be a real threat.
Just think of all the ecommerce transaction stuff, and the mouse-movement-and-click-recordings (;-) the big retailers NEED to save for this millennium!!!
Any idea of the whole amount of data produced around the world each day, and the relation of stuff that needs to be stored to the stuff that is deleted?!?
Periodical media replacement is the only way to go and the fact that almost all data is/can be stored digitally makes it easy (but not cheap).
Punched cards are a nice idea, but not as reliable as one might think. Paper also dissolves quite rapidly due to acids used during production.
ps: I don't think I need my p0rn collection after a nuclear strike...
Very interesting thread, thought about that quite often myself...
:)
:) So emulation is crucial, too.
I have a nice antique collection of Apple II disks (with bits so large, 140kB/side, that most of them are still readable - have three Apples for hardware redundancy, too).
Once I manage to back up all my old stuff to e.g. magneto optical drives, I'm quite confident that the data is safe, and I can rely on emulators to access the data. Nice.
However, I can't find the *time* to copy all the stuff from the Apple II to my Mac! Assuming 5 minutes per disk side, I have several days to spend, just for swapping floppies. No chance.
(OK, most of the stuff is available on the Web anyway, but I do have a lot of self written programs for the Apple II)
BTW, another aspect - most folks are talking about the data, but I'm also interested in the nostalgic games on the Apple II, no ASCII reader can help me there
Same for my Atari ST stuff, the 3,5" disks are rotting on some shelf... at least I managed to copy the Atari's hard disks to MO (the Mac could fortunately read the Atari/PC SCSI partitions) and even got the *really* odd floptical 20MB disk data onto MO, but still...
Only to introduce still another topic: Records, not CDs. Hundreds of them to archive. However, they seem to last longer than magnetic media (oops, reminds me of my cassette tape collection!) - and no means of copying those quickly to digital media, 30-35 minutes per record side... Well, most of them are bound to exist somewhere as MP3s, so I should focus on really rare stuff or own recordings, hm? Time, need more time...
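To put rough numbers on the "need more time" problem, here is a back-of-the-envelope sketch. The per-item times (5 minutes per disk side, 30-35 minutes per record side) come from the post; the collection sizes below are invented purely for illustration.

```python
def total_hours(items: int, minutes_each: float) -> float:
    """Hours of hands-on time needed to transfer `items` pieces of media."""
    return items * minutes_each / 60.0


if __name__ == "__main__":
    # Hypothetical collection sizes, not figures from the post.
    disk_sides = 600
    record_sides = 400

    floppy_hours = total_hours(disk_sides, 5)        # 5 min per disk side
    record_hours = total_hours(record_sides, 32.5)   # midpoint of 30-35 min

    print(f"Floppies: {floppy_hours:.0f} hours")
    print(f"Records:  {record_hours:.0f} hours")
```

Even modest collections add up to dozens or hundreds of hours of swapping media by hand, which is why the bottleneck here is human time rather than storage capacity.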
Did anyone watch the TV special about the NY Times time capsule? They solved this problem by engraving type and images into a piece of platinum with a laser; they managed to place several hundred pages of text and photos on a disk the size of a CD. Assuming that it isn't melted down in a fire, it should be indestructible. They mentioned at the time that this is the same way the government stores data that needs to be protected from a nuclear blast. As long as you're only trying to store text, this would be more than adequate.
I've never noticed it before but my thinking cap does sort of resemble a hockey helmet
Oh, and make sure the bar doesn't lose any dimension to oxidation or other factors. (I.E. Rust, diffusion, proton decay???) The amount of material actually lost is vanishingly small, true, unless you took your iron bar from the body of a '72 Pinto, but again, how precise do you need it to be? Say you store a CD-R worth on there. Call it 650MB. Turn 650MB into a single decimal number. That's a required precision of roughly 1.6 billion decimal places. (Assuming no compression system.) Whoa. The genetic thing is looking better and better...
D
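For anyone who wants to check the digit count for a CD-R's worth of data, here is a quick sketch. It assumes 1 MB means 2^20 bytes and that the whole disc is treated as one big unsigned integer; the function name is just for illustration.

```python
import math


def decimal_digits_for(n_bytes: int) -> int:
    """Decimal digits needed to write any n_bytes-long value as one integer.

    The largest value is 2**bits - 1, which has floor(bits * log10(2)) + 1
    decimal digits.
    """
    bits = 8 * n_bytes
    return math.floor(bits * math.log10(2)) + 1


if __name__ == "__main__":
    cd_bytes = 650 * 2**20  # assumption: 650 MB with 1 MB = 2**20 bytes
    print(f"{decimal_digits_for(cd_bytes):,} decimal digits")
```

Under these assumptions the answer comes out on the order of 1.6 billion digits, so each byte costs about 2.4 decimal places of precision on the bar.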
Well if we do not like our current storage media (magnetic, optical, or whatever), perhaps we should develop and utilize something different, like Organic Storage based upon long strands of DNA. Backups? Well, DNA is after all self-replicating (isn't it?) and would in turn back itself up automatically and even repair lost or damaged strands of data over time. Yet of course there seem to be some inherent downfalls to this storage medium which would question its durability and stability, i.e. natural viruses (that would be messy!), data mutation, and plain old evolution, which limits the reliability of the storage in non-sterile environments, which is pretty much everywhere. Hmmm, perhaps I'm just stupid (ignore this message) The Nein-Ja (^_^) forget it.
Civilizations that are built around a big stream (Ancient Egypt, China) have a longer life cycle...
Early film, called nitrate film, was actually explosive under the right temperature/moisture conditions. Not only would it destroy itself, but sometimes the entire library it was sitting with. I don't really have hopes that CDs will last the 100 years people talk about. Case in point: I had a Maria Callas box set, and the red ink printed on the other side actually ate through the CD, making the CD unreadable. (This was after six months from release, so the company exchanged them.) But, I think we need to think less about the media but more about the inks on CDs decomposing over time. Especially CDRs and CDRWs. Writing on them with Sharpies is dangerous, since the Sharpie ink does actually decompose over time (Sharpie has noted this). Just wait.
Typically they store a disk for a while in really bad storage conditions (high temperature, humidity, elephants etc) and then extrapolate from those results to guess how long a disk would last if it was stored 'properly'.
Personally I was ripping a lot of my CD collection over the weekend (just so much more convenient to click on a playlist entry rather than having to find a CD every time I want to play it), and several of the early disks that I haven't touched for years have oxidation tracks curling in from the edges. Luckily not far enough yet to destroy data, but had they been 70-minute CDs rather than 40-minute CDs many of the tracks would probably be unreadable.
Incidentally, this isn't just a problem in the digital domain; finding good prints of old movies is becoming harder and harder, and apparently when the Babylon 5 folks got their negatives back from Warner Bros to re-edit the pilot episode they found that many of the rolls had been soaked in a flood in the Warner film vaults, and others had been eaten by rats! In fact, it's quite possible that current DVDs will be the best version of many older movies that will be available in a century from now.
Try this. Get a strong magnet. I used an old speaker magnet. It had lots of force, enough to support a hammer. Pass the magnet over the zip drive a few times. Heck, leave it on there a while. Now try and read it. Do the same with a floppy and see whether you can read it. I could read both just fine. I would have thought a strong magnet would totally wipe them but apparently not. A buddy of mine asked how he could quickly "destroy evidence" on storage media. I told him a strong magnet was bound to do it. I had another thought coming. Apparently, magnets are not a reliable way to destroy data.
Wansu, th' chinese sailor
I work for STK (the company that owns the tape storage market for big companies with lots of data). Our customers already have this problem.
NASA (which has all that satellite and other automatically collected data that needs to be stored; not all of it has been processed yet despite being 20 years old or more) is in the habit of migrating to the latest tape technology every couple of years (3? no more than 6), because the latest and greatest allows them to get double the storage in the same space. They do this not only for the space savings, but also to keep that data from becoming unreadable.
They are not alone, but I can't remember the specifics. (I'm also not sure I'm allowed to mention more)
STK equipment has a reputation for reliability. Then again, you pay a minimum of $20,000 for a tape drive and it goes up to $150,000. (Or buy the OEMed DLT drives for $6,000)
As a linux user, right now the best you can do is copy to a new medium every couple of years. Make sure you do a verified write, and keep a copy offsite (in case of fire, if not for protection from overzealous law enforcement). Better yet is a vaulting company; these do in fact exist, but they are immature at this point. (Meaning that you shouldn't trust your data to them without research into them; there are good ones and there are those that will lose your data. Pricing may also be more than you want to spend.) I would not trust any one medium to be my backup.
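A "verified write" can be as simple as copying the file and then comparing checksums of source and destination. Here is a minimal sketch using only the Python standard library; the function names are made up for this example, and it assumes the new medium is mounted as an ordinary filesystem.

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verified_copy(src: Path, dst: Path) -> str:
    """Copy src to dst, then re-read both and compare checksums.

    Returns the checksum on success and raises OSError on a mismatch.
    """
    shutil.copyfile(src, dst)
    src_sum = sha256_of(src)
    dst_sum = sha256_of(dst)
    if src_sum != dst_sum:
        raise OSError(f"verification failed: {src} and {dst} differ")
    return dst_sum
```

One caveat: reading the copy back straight away may be served from the operating system's cache rather than the medium itself, so for removable media it is probably safer to unmount, remount and run `sha256_of` again. Keeping the returned checksum next to the archive also lets you re-verify at the next migration.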
Remember that most data isn't worth backing up. (linux source - except for local mods that are not yet in the source, /usr, most jpegs . . .) Think carefully: what is worth saving to backup? Probably "My dog by jessica age 6" (mementos of your kids), pictures of the family, the project you are working on today. Tax records (for three years in most cases). There is more, but the majority of your 50 gig hard drive isn't worth the bother.
Don't forget what others have said about reliability of the medium. They appear to have more data than me, so I don't cover that ground. They had other insightful things to say too.
Colin Smith notes,
Have a look at http://www.norsam.com/rom.html for digital archiving and http://www.norsam.com/rosetta.html for analog archival storage. The basic technology is to use particle beams to write very high resolution to silicon wafers ("high-performance rock" :-), which are extremely durable as long as you don't go after them with a sledgehammer or something.
The digital version stores 200 GB on a side of a 5 1/4 inch platter (with 10-disk and 300-disk jukeboxes, making possible a "petabyte machine room"), with very high speed (30 MB/s) write rate and reasonable (3 MB/s) read-rate. The analog version you can think of as "super-microfiche", writing analog page-images to the wafer (at something like the entire Encyclopedia Britannica on one wafer); it is readable by even such lo-tech methods as a good microscope (so it shouldn't suffer from reader-obsolescence). Norsam is partially funded by IBM venture capital, by the way.
"My opinions are my own, and I've got *lots* of them!"
magnetic tape and punched card formats which can no longer be read, because there are no surviving readers
Actually, punched cards are the easiest legacy format to read. A reader is an easily constructed electro mechanical device. If a great many have to be read, optical is the way to go. If speed isn't an issue, precision alignment isn't required either. A sheetfeed scanner and simple software can also read the cards. For that matter, if it's important enough, they can be manually transcribed. The above all applies to paper tape as well. All other storage techniques (magtape etc) require more sophisticated readers. If the punched cards have not been read, the data on them must not be all that important.
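To show how "simple software" can do the job once a scanner has located the holes, here is a sketch of decoding card columns. It assumes the scanner stage has already produced, for each column, the set of punched rows (12, 11, 0 through 9), and it handles only blanks, digits, uppercase letters and "/" in the common Hollerith card code. The names are invented for this example.

```python
# Zone punch -> characters selected by the digit punch 1..9 in the same column.
# Assumption: the common IBM-style Hollerith code (12-1 = A, 11-1 = J, 0-2 = S).
ZONES = {12: "ABCDEFGHI", 11: "JKLMNOPQR", 0: "/STUVWXYZ"}


def decode_column(punches) -> str:
    """Decode one card column given the set of punched row numbers."""
    rows = set(punches)
    if not rows:
        return " "                      # no punches: blank column
    if len(rows) == 1:
        row = next(iter(rows))
        return str(row) if 0 <= row <= 9 else "?"
    if len(rows) == 2:
        zone = next((z for z in (12, 11, 0) if z in rows), None)
        if zone is not None:
            (digit,) = rows - {zone}
            if 1 <= digit <= 9:
                return ZONES[zone][digit - 1]
    return "?"                          # anything this sketch does not cover


def decode_card(columns) -> str:
    """Decode a sequence of columns into a string."""
    return "".join(decode_column(col) for col in columns)
```

For example, `decode_card([{12, 8}, {12, 9}, set(), {2}])` gives `"HI 2"`. Punctuation and special characters varied between card codes, so a real reader would need the table for the specific keypunch that produced the deck.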
On the other hand, what would one use (other than a CDROM drive) to read a CD? Any option I can think of requires far more effort than for magtape or punched tape/cards. The best bet (since it's not practical to transfer everything to punched tape) is to keep updating storage media. When denser or more durable media comes into wide usage, that's the time to make a transfer. If the data is to survive civilization itself, the reader should be documented in an easy to read form (such as diagrams and text on hard plastic plates).
The biggest problem I see is stupid copyright protections. When such measures are employed, the media and data are ACTIVELY HOSTILE to archiving/preservation efforts. For example, over 20 years ago, the BBC lost several early episodes of Dr. Who. A number of them were recovered because fans had recorded them on professional video tape machines (home VCRs were not available at that time). Had copy prevention mechanisms been in place (like MPAA and others want to have now), those episodes would simply be gone because nobody could have recorded them. Consider, how long is forever for a DIVX silver disk?
A few TV shows may not seem all that important (and with TV, they probably aren't), but consider encyclopedias, textbooks, novels, etc. All of which are moving to electronic form, and all of which will probably be in some stupid proprietary copy protected form. That's a problem even now. Just try reading an old Excel spreadsheet today, and then consider trying it in 50 years (Good luck).
On the other hand, nice simple comma delimited ASCII is pretty easy to read no matter how old it is. Even if the fields are not documented, it's not too hard to guess.
In summary, unprotected open standard formats are the way to go if preservation is important.
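As a small illustration of how easy undocumented comma-delimited ASCII is to pick apart, here is a sketch that guesses a type for each column. It uses Python's standard csv module; the sample data and function names are invented for this example, and the guessing rule (try int, then float, else text) is deliberately naive.

```python
import csv
import io


def guess_type(values) -> str:
    """Guess 'int', 'float' or 'text' for a column of strings."""
    for kind, cast in (("int", int), ("float", float)):
        try:
            for value in values:
                cast(value)
            return kind
        except ValueError:
            continue
    return "text"


def describe_columns(text: str) -> list:
    """Return a guessed type for each column of comma-delimited text."""
    rows = [row for row in csv.reader(io.StringIO(text)) if row]
    # zip(*rows) transposes rows into columns (truncating to the shortest row).
    return [guess_type(column) for column in zip(*rows)]


if __name__ == "__main__":
    sample = "1999,12.50,Widget\n2000,7,Gadget\n"
    print(describe_columns(sample))
```

Try doing the same with a binary spreadsheet file and no copy of the program that wrote it.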
They had a test done at Los Alamos National Labs where they tested the media for corruption after exposure to extreme heat and corrosive conditions.
It's not quite ready for people to have an HDROM burner in their home PCs, but I suspect that when the patents run out in a dozen years, many will take interest in the technology...
If you're not part of the solution, you're part of the precipitate.
Then they found the disk drive for them.
Many other TV stations did the same thing. When ABC TV (the UK station by that name) was bought by Thames TV, all the old ABC tapes were left in a pile outside. Anything not picked up by collectors was trashed. That would have included the early episodes of The Avengers (many of which ARE now lost forever).
Some programs were lucky and were missed by the raving hordes of vandals and Huns that inhabited TV at the time. Sapphire & Steel escaped by turning into a door-stop.
It's a small world and it smells funny; I'd buy another if it wasn't for the money; Take back what I paid (SoM)
most likely services like xdrive will be used for storage we want to keep. even now this is safer than keeping it on your local drive because they handle all the backup and if your local drive crashes you're S.O.L.
"The lie, Mr. Mulder, is most convincingly hidden between two truths."
--
And Justice for None
Stored properly, writable CD's last 100 years or more
Evidence? I have several (audio) CDs from the early 80s which are no longer readable. OTOH, I have 9-tracks made at the same time which are still OK (presumably; I don't have access to a 9-track drive currently, but they were fine five years ago).
Personally, I think that microfiche is the way to go. Plastic lasts quite a while, and OCR software is already good enough to read in straight text in a standard typeface. And even if civilization collapses, all you'll need is a decent lens and a mirror to review your pre-cataclysm tax records...
Just junk food for thought...
I think the answer is a self-contained recorder and playback device which is sealed and can accept a wide variety of power source. Call it "BackAnywhere"! A data time capsule.
The premise is simple: encase the actual storage device (likely solid-state and non-magnetic for obvious reasons) into a case, write the data out, and seal it. The catch is in the interface - since 100 years from now we can't be certain ASCII will still be in use, we shouldn't necessarily write the data in that format. However, it's been shown by history that languages with a sufficiently large text-base can be deciphered even if they are thousands of years old (or a rosetta stone can be found to translate)... I suggest we put a well-known book into the encoding stream. When you start it up and press one of the buttons, out comes Shakespeare or something. After the archeologists have figured out what it says, press another button and there's the stored data - whatever it may be. Hell, you could bury an entire library into a 6"x6"x4" space.
The thing about the power supply is the only problem: electronics require power. How this power is put into such a system in a way that ensures that you won't blow the thing to kingdom come if you plug it in wrong will be the problem. After all, after WWIII in 3200 when we're rediscovering the lightbulb somebody might have the "bright" idea of plugging it into a 30kV generator.
Something to think about....
From Kodak's CD-R Overview.
The InfoGuard protection system includes a special coating that resists damage due to scratches, dirt, rough handling or other common mishaps. As a result, it's reasonable to expect a life of 100 years or more when discs are stored in average home or office conditions.
--
Why pay for drugs when you can get Linux for free ?
echo '[q]sa[ln0=aln80~Psnlbx]16isb572CCB9AE9DB03273snlbxq' |dc
...an article in the New Yorker dealt with this issue last year, and discussed the National Archives, where more than half the staff is now dedicated to transferring items from optical discs (the archival format of choice in the 1980s) to more modern media, while plenty of archival information (newspaper, video, gov't papers), both pre-1980 and current, is still not yet archived. The problem is that optical disc turned out to be a dead end; the format they standardized on no longer has anyone manufacturing players for it! The article finished by pointing out that we now generate more 'recordable' information than ever before, but we are also losing it at a higher rate than ever before. The format least likely to be obsolete, ironically: paper. --Philip
"It's amazing how our industry is strewn with beautiful, dead technology and bitter engineers." --M. Huyck
>Up until the 1930's somewhere, journals are pretty well
>preserved. Then they suddenly get awful as paper mills switched to new methods.
s/9/8/ for most of printed materials.
I have several books I inherited that were printed in the 1800s. The two oldest -- bound periodicals from the 1830s -- handle like they were just printed a few years ago. The books from the 1870s are very brittle, & when I have children, I'll have to hide them away from the rugrats until they're old enough to understand just how fragile the darned things are.
And we're not talking quality literature here: the bound periodicals are examples of popular magazines, full of sentimental stories & poetry. At some point the covers were torn off, & my grandmother rescued them just before they were tossed into a fire pit. The one book from the 1870s, translations of Schiller, was far more carefully produced & has an inscription from my great-uncle to my great-aunt.
In a few hundred years, a lot of stuff from the 19th & 20th centuries will be lost. And it'll puzzle people how it happened.
Geoff
I think I see a trend here. Maybe for them it really would be easier to muzzle the entire internet than to produce p
It's interesting how some of the oldest technologies for "data" storage are proving to be the longest lasting.
Here's an example:
Wire recorders and wire recordings. These date back to the 1940s and 1950s. Instead of using a magnetic tape as we know it today, you record on a stainless steel wire.
The disadvantages are:
o Mono. It's a single wire. You can't put multiple tracks on it.
o Frequency response. Not so good, but acceptable for voice recording and radio recording.
The advantages are:
o Little to no hiss! Tape hiss is mostly due to the fact that magnetic tape is covered with small, irregularly shaped magnets. A stainless steel wire is continuous, with no individual magnetic particles. Wire recordings sound surprisingly clean.
o In theory, they can last forever! Tape formulations tend to break down over time. The plastic backing dries out and the oxide flakes. A wire recording is just a spool of stainless steel wire. It doesn't deteriorate. I have recordings from the late 40s that still sound pristine, and may well last forever.
A couple more examples of "obsolete" technologies that are incredibly archival:
o Black and white photography: Daguerreotypes. These were made by plating silver on copper, sensitizing with iodine, developing with mercury fumes, fixing with salt, and toning with gold chloride. The image is basically gold on silver, and the images do not fade.
o Color photography: Technicolor. Technicolor pictures were originally made on a special camera that performed color separation in the camera, and produced three black and white negatives, each representing one of the primary colors red, green, or blue. These negatives were then used to create "matrices", which are essentially printing plates. Finally, the three matrices were used to print the release films -- using highly stable, acid-based cyan, magenta, and yellow dyes.
The Technicolor process was replaced in the 70s by monopack film, which has three color layers in the film base. Monopack film is much cheaper to produce and easier to use, but the dyes used are dictated by the chemistry requirements of the process, and the dyes are not stable. This is why original prints of such films as "The Wizard of Oz" retain their color unfaded, while most films from the late 70s and early 80s have faded to shades of pink and red.
Another example is punched cards. As someone pointed out, they can rot, but in a hundred years, if you found a stack of punched cards in the bottom of a desk drawer, next to a magnetic tape, I'll lay odds that you can recover the data off the cards, but not the tape.
We need a standard for the long term storage of data. This would consist of a number of mini-standards for media and file formats; call it LTDS (Long Term Data Storage standard). LTDS 1.0 might support the CD-R format as the media, and a flotilla of file formats - say MP3/WAV for audio, MPEG3 for video, HTML 4 for documents, PDF, and Java for programs, and of course some sort of file system standard (probably the current ISO CD file system standard).
Hardware and software 'readers' would then be certified as LTDS 1.0 compatible, meaning that it can read all the physical media in the standard and all the file formats in the standard.
As time progresses LTDS 2.0 will of course be developed say on DVD-RAM with newer file formats, but LTDS 1.0 would be a subset of the 2.0 standard. Hardware and Software readers would have to be LTDS 1.0 compatible as well as LTDS 2.0 compatible to be certified LTDS 2.0 compatible. You would always be able to read your stuff, no matter how old the format you saved it in.
There is still the problem of physical media decay, but I am sure that the media manufacturers can address this and make some especially long-lived CD-R packaging (or DVD-RAM in the future, or what have you).
-josh
It's also important not to store content in formats that become undecodable. So, Word 97 is out for archival storage. If the content is in ASCII or Unicode format, you can probably hack up a parser that gets most of the information back. It is also useful to store, along with the content, source code that can decode it, written in a common and reasonably simple language (C, Fortran, core Java, Scheme; not C++). For example, for encrypted data, I usually store a source copy of the crypto program along with it. Formats I consider good for long-term storage include HTML, PBM, JPEG (with decoder), MPEG, and Sun audio format.
Well, actually....
Tubes are better than transistors in certain applications just because they DON'T work right. They color the sound in a way that is appealing to audiophiles. It has nothing to do with clarity; it has to do with the listening experience. Tubes apply a sort of dynamic EQ to playback as their electrical properties muddle with the sound. A lot of it is foolishness on the part of gearheads, but a fair amount of it is actual fact: tubes are perceived to sound better. Remember this is all perception; it's not black and white.
About digital versus analog: find someone with a good quality audio card and record something at 22 kHz, 44 kHz, and 48 kHz. Now do the same across 16, 20, and 24 bits. If you're using good listening and recording equipment, you WILL hear the difference. That doesn't mean it won't sound better than certain analog gear, but it does mean that in theory analog has the ability to do better. Until I sit in a recording studio and hear state of the art digital vs state of the art analog of the same event recorded at the same time, I'll have to side with analog.
The other thing at work here is that people take tapes and make them digital and then whine about digital's quality. That doesn't make much sense, as the medium is PART of the recording. The limitations and strengths of analog are part of an analog recording; running that through A/D converters to change the format is going to be lossy. Similarly, taking a digital recording and transcribing it to analog is probably lossy as well.
As for MP3, yeah, the quality is bad. Any self-respecting audiophile would never archive to MP3 given analog as an option. Of course most people don't have recording studio quality analog gear, and the leap in clarity of a digital reproduction of the master tapes is tremendous when compared with consumer analog devices.
-Rich
..or make a gold gramophone disc of your vital data, then nail it to the side of the nearest convenient space probe.
Bit difficult to retrieve, though.
I agree. We don't know what we lost, so we can't judge the usefulness.
It is unlikely that the library of Alexandria contained scientific knowledge we haven't rediscovered (I don't believe in lost Atlantis, etc.), but it certainly contained facts which are now forever lost, like more historical records of the time than exist in biased records (the Bible, etc.). To have lost that library, and others similar, is tragic from a historian's POV.
So, we should have a way to record all the data that we want, such that none is ever lost accidentally.
For this, data havens aren't great. If the owner of the data is lost, it's all too easy for the data to be meaningless to everyone, strongly encrypted until it appears to be white noise. Physical data is handy this way: if your backups are in a safety deposit box, you might decide on less encryption, enabling heirs to read your files if you didn't pass on encryption keys in your will.
Speaking of which, we need a strong encryption system whereby you can unlock data with a certain number of secondary keys, or a master key, but where the data doesn't get easier to unlock with fewer than the required number of secondary keys. For instance, the boss can unlock the data, as can any five of the seven employees, but if four conspire, the cracking is no easier. This will let keys be passed on after death, etc., in wills and by delayed mail, such that records can be unlocked, but in such a way that a dishonest person can't look at your will and gain premature access to sensitive data.
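The "any five of seven" property is what threshold secret-sharing schemes are meant to provide. Shamir's secret sharing is one known construction. Below is a minimal sketch of it, assuming the key is a number smaller than a fixed prime. The names `split_secret` and `recover_secret` and the choice of prime are illustrative, not anything the poster specified, and the "boss" case is modelled simply as someone who holds the key itself. It is a toy, not vetted crypto.

```python
import secrets

# Illustrative modulus: 2**127 - 1 is a Mersenne prime. Secrets must be smaller than it.
PRIME = 2**127 - 1

def split_secret(secret, threshold, shares):
    """Split `secret` into `shares` points; any `threshold` of them recover it."""
    if not 0 <= secret < PRIME:
        raise ValueError("secret out of range")
    # Random polynomial of degree threshold-1 whose constant term is the secret.
    coeffs = [secret] + [secrets.randbelow(PRIME) for _ in range(threshold - 1)]
    points = []
    for x in range(1, shares + 1):
        y = 0
        for c in reversed(coeffs):  # Horner evaluation modulo PRIME
            y = (y * x + c) % PRIME
        points.append((x, y))
    return points

def recover_secret(points):
    """Lagrange-interpolate the polynomial at x = 0 to get the constant term."""
    secret = 0
    for i, (xi, yi) in enumerate(points):
        num, den = 1, 1
        for j, (xj, _) in enumerate(points):
            if i != j:
                num = (num * -xj) % PRIME
                den = (den * (xi - xj)) % PRIME
        # Modular inverse of den via Fermat's little theorem.
        secret = (secret + yi * num * pow(den, PRIME - 2, PRIME)) % PRIME
    return secret

master_key = 123456789  # stand-in for the key the boss holds
employee_shares = split_secret(master_key, threshold=5, shares=7)
recovered = recover_secret(employee_shares[:5])
```

Any five of the seven shares give back `master_key`. Four shares interpolate a different polynomial and, short of a vanishingly unlikely coincidence, a different number, which is the "no easier with four" behaviour asked for above.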
On the subject of easily recovered digital information with a fairly high density, have you considered printing digital data to paper as a series of light/dark areas? This way it can easily be scanned into a bitmap (something we'll always have the ability to do) and a programmer could whip up a translator in an hour or two. Then print an intro page describing the text format (65 -> 'A'), etc., and the encoding (if you need to use anything special) as well as the dimensions, etc. These pages could be printed on high quality paper and laminated, or in the most paranoid case, photographically etched onto non-reactive metal film (which allows a better resolution, btw.)
Testing of this method allowed 2048x2688 (or so) resolution, which translates into 672KB / page, or just over two pages/floppy disk.
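The arithmetic checks out if each pixel carries one bit. Here is a rough sketch of the page-capacity sum plus the light/dark layout itself, assuming 1 = dark and 0 = light and rows read left to right. The function names and the 1,474,560-byte figure for a "1.44 MB" floppy are my own assumptions for illustration, not part of the test described above.

```python
def page_capacity_bytes(width_px, height_px):
    """One bit per pixel (light or dark), eight bits per byte."""
    return width_px * height_px // 8

def bytes_to_rows(data, width_px):
    """Lay the bits of `data` out as rows of 0/1 pixels, padding the last row with 0."""
    bits = [(byte >> shift) & 1 for byte in data for shift in range(7, -1, -1)]
    bits += [0] * (-len(bits) % width_px)
    return [bits[i:i + width_px] for i in range(0, len(bits), width_px)]

def rows_to_bytes(rows, length):
    """Inverse of bytes_to_rows; `length` is the original byte count."""
    bits = [bit for row in rows for bit in row]
    out = bytearray()
    for i in range(0, length * 8, 8):
        value = 0
        for bit in bits[i:i + 8]:
            value = (value << 1) | bit
        out.append(value)
    return bytes(out)

capacity = page_capacity_bytes(2048, 2688)   # 688128 bytes, i.e. 672 * 1024
pages_per_floppy = 1_474_560 / capacity      # a bit over two pages per floppy

sample = b"65 -> 'A'"
rows = bytes_to_rows(sample, width_px=16)
restored = rows_to_bytes(rows, len(sample))
```

The decoder needs to know the row width and the byte count, which is exactly the sort of thing the intro page would have to spell out.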
It's the longest term data storage we could think of because if you did the paranoid route and used metal film, it would theoretically last thousands of years, and all it requires to access is a scanner, which we have to assume there will be in the future, and a semi-talented programmer.
It does have file-format problems, but those can be handled if you completely document the file format in text at the beginning (it could even be very small, requiring a magnifying glass or scan+enlarge to read), or at least provide bootstrap info: describe how to read a text file, then include the first data as a text file describing what to do with the rest of the digital data, etc.
With metal film, or even very good paper and photographic printing, you could get 4-8MB / page. At 500+ pages per volume, it's pretty compact storage, and it would be used for rosetta stone type info, or the most important records. Everything else can just be translated from one format to the next every 10-20 years, as storage should allow ten times as much data on the same size medium in that time.
Some of the data is already encrypted, like DVDs (OK, bad encryption, but it is encryption), and soon the USA will make it even easier for everyone to encrypt data by relaxing its crypto policy even more.
Also, the media is evolving, but usually it is to accommodate more data. The durability of the data is usually less important; if the media can survive 5 years then that is more than enough (after all, in 5 years this media will be obsolete).
So what is the memory that our civilization will leave for future archaeology? Tiny little disks, with 1000s of terabytes of encrypted and compressed data that will probably be half damaged. Even if they could read the data, which remember is probably stored at almost the atomic level, they would still have to find the decompression and decryption algorithms and a key.
--
"take the red pill and you stay in wonderland and I'll show you how deep the rabbit hole goes"
[]'s Victor Bogado da Silva Lins
^[:wq
However, I think he was mistaken. Ancient societies left stone tablets, cave paintings and the like behind, and there's no-one who fully understands the languages or the contexts (when an archaeologist says an object is of "ritual significance" he actually means he doesn't know what it's for). We do have the technology now, as the poster says, to migrate our data ever forwards into new storage, assuming no cataclysm occurs. And even if it does, it is far more important, in terms of recovering data, that the language (source code) survives, rather than CD ROM drives, Minidisc players etc (the binaries), because then data recovery is an essentially straightforward task.
The point is that stone tablets are damn durable, while digital media (take your pick) aren't. When you see a stone tablet you see the inscription and you can say, "Golly gee! There's something written on this! It looks like a horse." or what not. When you see a CD, you look at it and say, "Hmm, kind of shiny. Mirror?" or if you're smart/lucky, "CD!" Then of course you have to figure out whether it's filled or empty, whether it's an audio CD or a data CD. Okay, now there are files on it. Is it using Rock Ridge, Joliet, or plain ISO? Is this file data, or is it an executable, or some support file like a library? Is it for Mac, Windows, Solaris, Linux, Be...?
We have this problem today. I can give you an 8 inch floppy disk and say, "Behold! The answer to all the world's problems lies within. All you must do is read it and begin." Do you know where to even get an 8 inch drive? I sure don't. The only one I've ever seen was in "Wargames".
You talk about the fact that it's more important for source code to survive. That way you can reconstruct the system that produced the data, and then you can read it. Sounds reasonable enough. One problem. What is source code typically stored on? Big stuff sure as hell isn't stored on paper. (However, I did one time see PGP source code printed and bound in an appendix to a book; I don't remember which one, though. It had something to do with PGP. Surprise, surprise.) Source code is typically stored on a digital medium, because it makes it easier to use. It's a catch-22.
Now don't tell me about "well everyone will know" because "everyone" knew back in the past how to read Mayan, and we all know how well that turned out.
A good point. But honestly I don't think this will be a problem in 20 years. More recent standards like DVD look like they're going to maintain upward compatibility with CD-ROMs. A hundred years might be more of a problem, but hopefully, within a hundred years, there will be ample time to transfer the data to a newer format.
The figure to look at is the product of capacity and lifespan, divided by price. For a CD, you have something like 5*10^10 o.yr/$ (that's byte-years per dollar). For printed paper, it's maybe 3*10^5 o.yr/$, so we've definitely made some progress. Floppies are utterly worthless, by the way: nowadays, they survive about one week before going bad. Anyone care to calculate how much a tape is worth, by this standard (I don't know how much they cost)?
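For anyone who wants to plug in their own numbers (tape, say), the measure is a one-liner. A small sketch follows; every capacity, lifespan and price in it is a guess picked only to land near the orders of magnitude quoted above, not a measurement.

```python
def byte_years_per_dollar(capacity_bytes, lifespan_years, price_dollars):
    """Figure of merit from the post: capacity times lifespan divided by price."""
    return capacity_bytes * lifespan_years / price_dollars

# Hypothetical inputs, for illustration only.
media = {
    "CD-R": byte_years_per_dollar(650e6, 100, 1.30),          # roughly 5e10
    "printed page": byte_years_per_dollar(3_000, 100, 1.00),  # 3e5
    "floppy": byte_years_per_dollar(1.44e6, 1 / 52, 0.50),    # one week of life
}

# Media names ordered from best to worst by this measure.
ranking = sorted(media, key=media.get, reverse=True)
```

With these made-up inputs the floppy ends up below even the printed page, which matches the "utterly worthless" verdict. Change the lifespan guess and the ranking can move a lot, so the measure is only as good as the lifespan estimate.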
A recent article in the UK paper The Guardian commented that the sheepskin on which the earliest known version of Beowulf is written had lasted far longer than any modern medium, and was therefore superior :-)
So go for holes punched in sheepskin: the storage medium of the last millennium.
ben_ the technologist and platform agnostic
Many people seem to think that it's about storage size; it isn't. There will be no problem finding the space.
The problem instead is actually migrating the data. Ideally, the data should be kept in a live state, transferred from old storage media and converted to more modern formats (and classified and indexed!) during the available migration period, when such migration is supported. That, for even a medium-sized organization, will be a full-time job for a few people.
In the worst case, you're only transferring from old media. Then recovering any data becomes a full-time job of locating it, researching storage formats, finding something able to read those formats and eventually converting the documents to something readable.
Of course, it mostly becomes a problem if your organization is using proprietary data formats. Using the simplest, most standard formats, such as ASCII or SGML formatted documents, makes it far easier.
One problem is that the next generation may not care about a particular piece of data, but the one after that would find that data invaluable.
My parents have recently gotten into genealogy and I find it very interesting as well. The thing is that we continually find people who didn't write down their parents' or grandparents' names, because they "knew" that information. But two or three generations on, you don't know who your great-grandparents were, where they lived, or anything else about them and their lives.
People who keep regular journals, even of the most mundane and day-to-day activities and events, are invaluable to those who are trying to find out about their lives. Trivial information provides glue that ties historical events into perspective and shows how things relate to the individual and not just what is represented by the history books.
Seems to me that there are already two solutions that will handle exactly what you describe. They are both low cost, intuitive and user-friendly. Arguably, once you've used either product, you may find it difficult to manage without them (at least, I know I do ;->). The products are:
1) "The Brain" by Natrificial. You can check it out at thebrain.com
A relational File-manager for Windows.
2) BeOS
Clearly the solution is here. The question is: will enough people adopt it to make it work?
--sugarman--
I've read of one decent long-term solution. It's not particularly convenient, but it *is* known to be capable of surviving hundreds of years.
Print the data on acid-free paper. Use an impact printer, not a laser printer. (Are you sure that the toner binding agents will last hundreds of years?)
You could print textual material in an OCR-friendly manner (e.g., the source listings in the "Cracking DES" book). This will obviously take a lot of paper and space, but it could be read by a human.
Or you could print the material in a "barcode" type style with plenty of embedded checksums. If you use 1 mm square cells (which should be large enough to allow scanners to adjust for paper warp, water damage, etc.), the amount of data which fits onto a single sheet of paper isn't much more than you get with raw text... until you consider that this is true 8-bit data with error recovery, not 7-bit text.
I think it goes without saying that this format is not intended for frequent use. But if you had information that you *had* to archive for centuries and you had unlimited access to vast underground storage vaults, this is probably the most stable medium known today.
For every complex problem there is an answer that is clear, simple, and wrong. -- H L Mencken
A closely related problem is the question of whether you're storing data in the first place.
This question has come up twice in the past decade. In the first case, a tape backup drive quietly failed and there was no indication of a problem until they attempted to retrieve a file.
In the second case, the person responsible for performing backups carefully ticked off the paperwork... but as far as anyone could tell he never actually swapped any tapes. The company discovered this after he (intentionally) corrupted the Netware database and then walked out the door.
Both problems can be solved by simple procedural changes (e.g., always "verify" tapes after writing, have someone else run "verify" or rotate the duty).
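The "verify" step can be as simple as reading the copy back and comparing digests. A minimal sketch, assuming plain files rather than a tape drive; the file names and the `backup_and_verify` helper are made up for the example.

```python
import hashlib
import os
import shutil
import tempfile

def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so large archives need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def backup_and_verify(source, destination):
    """Copy the file, then re-read the copy and compare it to the source."""
    shutil.copyfile(source, destination)
    return sha256_of(source) == sha256_of(destination)

# Demo in a scratch directory (hypothetical file names).
workdir = tempfile.mkdtemp()
src = os.path.join(workdir, "ledger.dat")
dst = os.path.join(workdir, "ledger.bak")
with open(src, "wb") as handle:
    handle.write(b"quarterly figures\n" * 1000)

verified = backup_and_verify(src, dst)

# Simulate a drive that quietly wrote nothing, then check again.
with open(dst, "wb") as handle:
    handle.write(b"")
still_good = sha256_of(src) == sha256_of(dst)
```

This only catches the quietly failing drive. The second case, where nobody swaps the tapes at all, still needs a second person to run the check.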
Yet... twice in the past decade I have direct knowledge of data loss. In one case this happened despite a competent and dedicated IT staff. Assuming this wasn't just a statistical fluke, it follows that there must be a significant risk that archival data is bad at the moment it's produced - perhaps a 5-20% chance of one or more bad media per year per backup group.
It doesn't make much sense to invest in premium media if you're saving garbage.
For every complex problem there is an answer that is clear, simple, and wrong. -- H L Mencken
This raises an interesting point. Given the efforts made by the PGP group, and with fewer limitations, would it be possible to generate pages of text on paper, or possibly on something more resilient like plastic, with both English text and a binary encoding of the same information? Something like an efficient barcode-style encoding down the margin (which is rarely used in documents), alongside the standard English text and pictures in the document? Scanners these days are easily capable of distinguishing text from pictures, so the pictures needn't be encoded, leaving one page of text plus checksums, which surely can be binary-encoded within the margins?
Just a thought.
You can't win a fight.
Get yourself a big radio transmitter, and just beam the stuff into space with lots of error correction. When you wanna retrieve it, you just have to hope for a faster-than-light drive. No media decay problems, and with technology advances your ability to retrieve the data properly increases every year. Depending on the rate, this could mean thousands of years of archiving. Even better is if some alien races pick it up and store it as well.
You can't win a fight.
That is totally true, let me give another example. I loved the Atari 2600 when I was a kid (who didn't?) and I have two emus for it. I downloaded all my favourite games and I was happy to know that I actually remembered how to play them a good 10, 12 years after the fact.
But, I downloaded a ton of games that I never had but always wanted - but was not able to find manuals for them all online. Now I am a bit confused as to some of the games...what the blinking blob means or why x happens when my little guy picks up item y. So I have sadly given up on some games that looked cool, but I'll be durned if I know how to play. Given too that a lot of the early games lacked "begin" or "end" screens.
I've also read about Finnish scientists who are trying to come up with signs to last at least 500 years in a language/medium that people will understand in the future, or perceive as a danger sign. Our yellow and black shield will probably have as much meaning to people in the future as Venus figurines do to people now.
This is an old thought experiment that has one flaw (see if you can spot it!) But it makes an interesting case for low tech solutions.
Iron Bar Storage System
First, take all your data and stream it out as one long multi-digit number. Now place a decimal point in front of the number and treat it as a very precise fraction. With the length of your iron bar treated as 1, measure off the distance along the bar equal to your data's fractional value and file a little notch in the bar.
That's it. The notch now represents the fraction that is your data. Anytime you want to recover your data, just measure the distance to the notch, divide it by the length of the bar, remove the decimal place, and convert the number back into your data bytes.
Voila! Instant low tech storage!
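One way to poke at the scheme is to model it with exact fractions and then with a ruler of finite precision. A rough sketch, assuming each byte becomes three decimal digits and a 1 m bar read to the nearest micrometre. Both choices are mine, and the precision issue it shows may or may not be the flaw the poster has in mind.

```python
from fractions import Fraction

def to_notch(data):
    """Write each byte as three decimal digits and read the lot as 0.ddd..."""
    digits = "".join(f"{byte:03d}" for byte in data)
    return Fraction(int(digits), 10 ** len(digits))

def from_notch(position, byte_count):
    """Turn a notch position back into bytes (a digit group above 255 would raise ValueError)."""
    digit_count = 3 * byte_count
    digits = str(int(position * 10 ** digit_count)).zfill(digit_count)
    return bytes(int(digits[i:i + 3]) for i in range(0, digit_count, 3))

message = b"HELLO WORLD"
position = to_notch(message)

# With a perfect measurement the data comes back intact.
exact = from_notch(position, len(message))

# Assumed ruler: a 1 m bar measured to the nearest micrometre (six decimal places).
measured = Fraction(round(position * 10**6), 10**6)
blurry = from_notch(measured, len(message))
```

Here `exact` is the original message, while `blurry` keeps only the first two bytes and returns zeros for the rest. Eleven bytes already call for 33 decimal places of measurement on the bar.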
Shut up and eat your vegetables!!!
As some other posters also touched upon, I'm not as concerned about how we save data, or how much we can save, as I am about /what/ we save. It would be a tragedy to save /everything/ and then find that anything of any significance is entirely drowned out by the noise of the irrelevant and insignificant.
We are getting to a point at which traditional "file systems" are going to become archaic. When "file systems" were first created, drives had very low volume, and very few files. The name-space-to-"file" ratio was very high. It was possible for everything you could fit on a disk to have a uniquely identifiable name/location which gave you instant insight into what that file contained or was for. However, we now have /very/ high volume capacities. Traditional "file systems" (systems of files) don't scale very well. It is not feasible to uniquely name every file on your hard drive; if it is possible, it is at least very tortuous to maintain. The problem is that traditional file systems have only one dimension. They account for only one stratum, one plane, one cross-section, of the many attributes files have - namely their "name"/location. But files have much more meta-data than just their "name". They have content. When we want to obtain a file, we aren't looking for its name, but its content.
A new paradigm needs to be introduced. I think traditional file systems will need to acquire characteristics of relational databases. What good is a 17 GB drive if it takes you half an hour to find something you want?? Today we have much richer and more diverse content in our data, and our storage systems need to accommodate that. We need to be able to make intelligent, high-level queries, like "All email files which contain spreadsheets on last week's product demonstration". This is what we are looking for, not "prddemosprsheet012500.text". Files aren't just of one type, or one attribute, anymore.
Our data contains many planes of meta-data. We need a storage system that understands that, and allows us to make intelligent and intuitive high-level use of it. We need an associative/relational storage mechanism, whereby files are stored not according to an absolute location, but according to their attributes and relations to other things.
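To make the idea concrete, here is a toy sketch of a store where items are found by attributes rather than by path. The class names and attribute keys are invented for the example, and a real system would index the attributes instead of scanning every item.

```python
from dataclasses import dataclass, field

@dataclass
class StoredItem:
    content: bytes
    attributes: dict = field(default_factory=dict)

class AttributeStore:
    """Items have no name or location, only attributes to match against."""

    def __init__(self):
        self._items = []

    def add(self, content, **attributes):
        self._items.append(StoredItem(content, attributes))

    def query(self, **wanted):
        """Return every item whose attributes include all the wanted values."""
        return [
            item for item in self._items
            if all(item.attributes.get(key) == value for key, value in wanted.items())
        ]

store = AttributeStore()
store.add(b"...", kind="email", attachment="spreadsheet", topic="product demo")
store.add(b"...", kind="email", attachment="none", topic="lunch")
store.add(b"...", kind="spreadsheet", topic="product demo")

hits = store.query(kind="email", attachment="spreadsheet", topic="product demo")
```

The query reads much like the "all email files which contain spreadsheets on the product demonstration" request above, and nobody had to invent a file name along the way.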
Jazilla.org - the Java Mozilla
It's 10 PM. Do you know if you're un-American?
Hmm...you basically just described the process for making a conventional CD master, except on the master stamp all the pits are inverted. the life of the disc)
well, partially (the process is close) but the idea is to get a single, playable CD with as durable a construction as possible. metal and glass rather than foil and plastic, and so forth
--
-=DaveHowe=-
Make a standard backup of your vital data.
Take it to a special "data preservation agent" who will probably do it as a sideline for normal Disaster Recovery stuff.
Agent makes an optical mask of a CDR image onto a blank metal disk.
The optical mask is acid-etched to give a metallic CD (using two metals with noticeably different optical properties, or burning right through if the disk is thin enough).
In a suitable atmosphere, mold glass around the metal disk to give you a metal-and-glass CD.
Place it in a padded, light-opaque metal case, and state that you only guarantee the data readable if it is kept in that case full-time.
Obviously, the DRA would need to keep some hardware capable of reading these, but as he will probably be offering vaulting services for these disks anyhow, he will be wanting to access them on demand in any case. Any comments?
--
-=DaveHowe=-
Modern word processing still opens really old file formats like Windows .WRI and Word 1.0, and I don't see that likely to change in the near future.
But it could happen at any time and without warning.
In my experience the typical non-geek computer user buys a computer and uses it hard for eight or ten years. I know people who still use AppleWorks on an Apple //e on a daily basis, and many more people who are still using old word processors (Word or Word Perfect) for Windows 3.1.
All someone at Microsoft has to do is say in a meeting "You know, I wonder if it's time we dropped support for ancient version of Word X.YZ?" and it could easily happen.
Most data has a useful lifetime after which it is of little use to the owner. My tax returns are only useful to keep for a few years, the same with the financial documents that support them. My birth certificate is useful for my lifetime and has some value to genealogists after that. Flame mail to the network that aired some boneheaded Y2K alarmist story 7 weeks ago is already obsolete.
The problem is to organize data in a way that highlights how long it is needed. It is difficult to give a date in advance after which some things will be obsolete. If I write a book, when I am finished writing it, if no one buys it, the file is useful as long as I want it. If it becomes a best-seller, my biographer will probably want the rough drafts in 20 years. But I don't know that when I save them.
The solution to this is to learn a better strategy for identifying data. Some file formats already make provisions for this. LaTeX and DocBook already provide tags for quite a bit of identifying information about the source. Meta information can be placed into HTML. CVS stores records of who made changes and why in addition to retaining a record of each revision.
In fact, now that I think about it, CVS provides a good model for data storage. You get a way to retrieve each version of a file. You get a way to link together corresponding revisions of several files. And you have a record of when, why and by whom all of the changes were made. But at its heart, it is a system for data that is still alive. It is not a system for organizing the historical records of a person, company or government. And it doesn't address the question of media decay because it is independent of the specific media.
The net will not be what we demand, but what we make it. Build it well.
Interestingly enough, with all the recent hubbub over "millennium capsules," the proposals for the NYTimes capsule wrestled with this very question of data loss. One group came up with a pretty impressive solution: genetically splice/embed data into the DNA of a cockroach - then reproduce it and set them in the wild.
http://www.nytimes.com/library/magazine/millennium/m6/design-lanier.html
That data will survive everything.
Several groups are looking into this technology as a possible way to stably maintain their archives over a very long period of time. Take a look at the Long Now Foundation library for an example.
Fortune favors the bold. -Virgil
Magneto-optical is probably one of the most stable storage mediums available.
Theory and practice diverge in an unhelpful manner. 3 years ago I worked on a project to convert a 5 year old MO system to another MO system, simply because the old drives were no longer available and ongoing maintenance was a hassle. Owing to stupid cost-cutting on my project, the "new" drives we used were already becoming obsolete. Today no-one still makes drives that can read either set of disks and on-going maintenance of the #2 system is dubious.
At my site, we're tasked with creating and maintaining the archive of satellite data for the GOES series of weather satellites operated by NOAA. Currently the archive spans about 25 years and 150+ terabytes of data. Most of the data lives on Sony U-matic tapes (video production quality equipment). It was very cutting edge at the time and required some interesting H/W hacks to get it interfaced with the dish electronics.
Currently the system is hopelessly obsolete and the remaining units are being carefully nursed as we begin the migration effort. Furthermore, many of the older tapes have become "read once" media, so you can't afford to miss anything.
Many of the suggestions about formats and media life ignore some of the realities and complexities of the real world. Our archive necessities break a large number of these assumptions (as I'm sure do many others).
1) ASCII and higher order representations are not adequate for scientific data.
2) Selecting media with a longer life span only defers the problem to a later date and makes the migration process longer. It also makes it even less likely that adequate readers will be available when the migration begins.
3) Raw data formats can change arbitrarily often during the lifetime of the archive.
4) There is unlikely to be adequately stable online storage media available to hold the entire archive as well as the "live" data set (data volumes will increase to match existing storage capacity).
So, what can be done? Many of the suggestions already posted are good and should be incorporated into any archive strategy. So here are some suggestions based on things we're looking at:
1) Identify what needs to be archived. As many have noted, most things don't need to be archived.
2) Build a migration strategy into the plan right away.
3) Keep source code and any auxiliary data needed to access the data available with the data itself.
4) Keep at least two copies of the archive. It's amazing how many archives exist in only one place (depressing, really).
Of course the biggest challenge is to keep all that data in a meaningful form. That's really the biggest part of the problem, and it's likely to get worse as data volumes grow. Things are coming down the road that will make our current demands look pretty small. That's good and bad. On the good side, our existing problem will fit easily into any solution we come up with at that point. On the down side, it's not clear what those solutions will be.
This brings up two personally relevant items for me that others may find illustrative.
The first is laserdiscs. They were advertised as permanent. After all, what could go wrong? The media was sealed in plastic and couldn't get any air, so deterioration was impossible, right? Wrong! "Laser rot" became obvious within a few years of the introduction of the technology. There are now gazillions of first-generation (and later) discs that are simply unreadable. I have dozens of them and can personally attest to the sinking feeling that comes with seeing data degrade and become unavailable. (In a similar vein, music CDs will degrade while the vinyl they "replaced" will, given proper care, soldier on for another century or two. And records sound better/hold more data, too. CDs were supposed to be an improvement?!?) The lesson? Don't trust industry shills who tell you a technology is good for 100 years. They simply don't know.
The second example is more personal. As a former photographer, I have some works that I want to preserve forever. Maybe I'm conceited enough to think that in a thousand years my works will be found and I'll be proclaimed a great artist. Maybe I'm just anal. Either way, I want my photos to be around for a long, long time. Now, properly processed silver halide-based film is pretty stable. My negatives will last a long time. But for the ultimate in longevity, I've begun making platinum-based prints on a variety of media, including plastic squares and enamel tiles. If I can find a source of enameled titanium squares (about 10 inches or so), I'll have a combination of media and chemicals that can reasonably be expected to last for a couple of thousand years. The lesson? True long-term data integrity sometimes requires an open-minded approach. If I rejected platinum processes because they were 100 years old, I'd have never discovered their permanence.
Until someone can come up with a novel way to store data that is truly permanent, I'll rely on the "bigger hard drives, cheaper, every year" theory to keep my data safe. But I don't really feel good about it.
What I've noticed is that most of the data we're accumulating is quickly becoming useless. 10 year old schoolwork isn't something so worthy of archiving. The data you really want to keep shouldn't be very large anyway...
This may be true of households and many businesses, however there are government requirements for keeping data for things like clinical trials and funded research which are subject to this problem.
There is also a public-interest issue here. Imagine if the tobacco industry, which has in effect lied to the public and hence murdered for the past three or four decades, had been required to have its research data archived in a retrievable format. Another example is the archived PROFS email correspondence of the National Security Council members during the Reagan era, which led to the smoking guns of the Iran-Contra scandal. A final example: a small city near where I live recently had difficulties deciding what its City Charter actually said, because of poor recordkeeping of its officially adopted amendments over the years.
While most data is of little use to us after a year or a few years, there are longer-term projects and public-interest requirements that make it a public issue. True, I doubt if anyone would want to save most of those Letterman Top-10 lists, blonde-jokes and similar net-chaff for very long. I do think the Stephen Wright stuff will endure, however....
-Dave
and the listing itself explains why. In the (relatively) not-so-distant future, it's very possible that an entire century's worth of data could be stored on thousands or hundreds of dollars' worth of equipment (as opposed to millions) - keep in mind that not much data was produced in the centuries previous to this one, and not much has been produced in this one compared to future ones. (By this century I mean the 20th: one more year folks).
:) I suppose it'll be found somewhere on ENCYCLOPEDIA, DISC 1: PREHISTORY-2012.
Perhaps the amount of data will increase faster than the amount (price) of storage, but I doubt it. 640k should be enough for anyone! In any case, all the data generated thus far is likely to remain safely stored somewhere until extinction, if it is ever digitized and made publicly available (and anybody cares to store it).
I can imagine them in 3000 CE looking back on the logs of the (at that future time) most popular web site ever to have existed, and reading this very thread
Two years ago, I was involved in an effort to preserve the archives of the Stanford AI lab from the 1960s and 1970s. Several alumni spent weeks taking turns reading 9-track backup tapes through the last two working 9-track tape drives at Stanford. The raw data was sent over the Internet to a big file server at IBM Almaden. There, somebody who remembered how the old SAIL tapes were formatted had written a program that extracted the files from the backup tapes.
Once the files had been extracted, they were processed into standard formats. (The SAIL machine had its own weird character set and its own image formats.) Text was converted to Unicode and (monochrome) images were converted to GIF. The MD5 hash of each file was computed, and duplicate files (these were backup tapes) were removed based on the MD5 hash.
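The duplicate-removal step described above can be sketched roughly like this. This is only an illustration, not the program the SAIL team used: the function names and the idea of walking a directory of already-extracted files are assumptions made for the example.

```python
import hashlib
from pathlib import Path


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def unique_files(root):
    """Map each distinct MD5 digest to the first file found with that content.

    Files are visited in sorted path order, so the "first" copy is stable
    from run to run. Later files with the same digest are treated as
    duplicates and skipped.
    """
    seen = {}
    for path in sorted(Path(root).rglob("*")):
        if path.is_file():
            seen.setdefault(md5_of_file(path), path)
    return seen
```

Calling unique_files() on a directory of extracted files gives one path per distinct content. MD5 is used here only because the post names it; for anything where deliberate collisions matter, a stronger hash such as SHA-256 would be the safer choice.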
The material was then indexed with a web-spider type program, so it could be searched readily. CD-ROMs were made of the content belonging to individuals, and sent to those who could be identified. Permission is being obtained from each individual to have their data published. (The files include private E-mail, for example.) Data approved for public release will be visible on the Web in a year or so.
If you ever had a SAIL account at Stanford, Bruce Baumgard at IBM Almaden has your stuff, and you can contact him for a copy.
It's a lot of work. And this was only about 10GB of data. This gives you a sense of how hard the problem is.
The concept: build a hyper-reliable storage device or something, something that can last for millennia. Then ship it to the moon. Wah-lah! Permanent information deep-freeze.
-troll taker
Assuming that I have no information about a disk drive, then I agree that it would be unreadable, but not punchcards. If you know about the alphabet, then ASCII punchcards containing source code probably wouldn't be too hard to decipher.
Newspapers are frequently archived using photographic reduction. These are expected to last a very long time. Reading these just requires a very good magnifying glass.
Even CDs would probably be quite easy to decode and decipher with technology equivalent to ours.
Also, we keep a lot of data around purely for future historians. I think destroying everything, or even just that part of all the world's data stored to prevent decay, would be virtually impossible.
What might be harder to decipher than the technology is the language that these are written in. We need to start making some Rosetta Stones, and burying them all over the world.
It is reckoned that 90% of paper, once filed, is never read again. How many millions of trees' worth of redundant paper do we have in cold storage, just in case someone needs it ? Do we want to repeat the mistakes of history, by wasting resources and lives in archiving every piece of data in sight, for fear of losing it one day ? Do you realise that we are generating data faster than ever before (yet genuine information remains at a premium) ?
......
The answer is simple. If it is of relevance, and people want to keep it (remember Oliver North ?), then people will keep it. People will judge the value, and take appropriate action. Otherwise, it goes to the bit bucket in the sky, and remains there
In years to come, will we be complaining of 'data pollution', caused by spurious archives that no-one dare ditch, or noxious chemicals from discarded CDs ?
Stephen Hawking has written another book. It's about time as well.
Most of the stuff people archive tends to be logs or other such chunks of 'low value per byte' data - expensive stuff tends to get kept live and backed up regularly. The only other thing that I think is habitually backed up is stuff like important configs, important content - in other words, things which must NOT be lost, but which are frequently obsolete in a few months. With hard drives expanding fast, most critical data can just be migrated onto ever more capable storage systems.
I know where I work (an ISP), customers get backups done for them, and no-one thinks about the long-term viability of those backups because all critical data is backed up regularly, because it is kept live. Who cares about last year's web pages? As for an entropic nightmare of backups of backups :> The tools for controlling backups and the speed of backups are improving fast enough that we spend LESS time doing backups, not more.
Plus with speciality data vaults coming more into play, I see the future being one of many people and companies with data, backed up by their choice of 'backup provider', who keep backups, and who in turn are backed up by national or international 'secure backup services', unseen companies known only to those in the business, who aren't interested in backing up less than a coupla hundred terabytes per customer.
There have been a few rumbles about very high-density CDs and laser-read semi-biological crystals as storage media, the notable point being a move from '2-D' storage media to '3-D'. Which makes me wonder if some physical process of copying a crystal block might take over the informational process of writing a new one.
Data usage is expanding to fill the services that are provided, and technology races to stay ahead of that demand. Given that we are not yet storing data by the electron excitation states of atoms ("NEW from Store-TEK - a Titanium based storage cell, allowing a whole byte per atom! Forget your 4-bit arrays and buy the new...") I think that the storage media industry has got plenty of cards to play to meet the expanding needs of data storage.
Cache-Boy
Error 404: There is no spoon
One thing that has always disturbed me is the chaotic way in which information technology evolves. It is natural that it is so, considering evolution comes through individual steps taken here and there. But I believe that if we opted for a 1-year freeze in new developments and set up new standards, everyone would benefit in the long run.
It sounds absurd, ok, but I would like to see a standard defining a general information storage format, which would encapsulate whatever format would be chosen to really organize data (FAT, NFS, journaling, orthogonal, etc.), much in the way that IP encapsulates other specific protocols. If well designed, such a standard could allow for really substantial growth in storage capacity and still provide backwards compatibility, like reading a 720kB floppy disk in a 1.44MB drive. It could be designed to work on disks, tapes, chips and so on.
But that, of course, is just wishful thinking...
Magneto-optical is probably one of the most stable storage media available. It can be rewritten up to 10 million cycles, has a shelf life of 50+ years, and is only affected by magnetic fields if you heat the surface to 300 degrees.
Another good contender with a higher data density is AME tape technologies, such as AIT, Mammoth and VXA-1. AME tapes are good for around 20,000 passes, with an archival life of 30+ years.
SLR is another good contender for high-capacity data storage, with a shelf life of 20+ years.
For large scale storage, using a form of hierarchical data management would be the best approach, with MO drives (which have a "mere" capacity of 5+ GB) serving out files that are still accessed on a regular basis, and using large capacity tapes on the backend (such as SLR100 or AIT-2, each boasting 100GB compressed).
As data warehousing becomes a more important industry, HSM systems will likely integrate auto migration from media that is reaching the end of its archival lifecycle.
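To make the auto-migration idea concrete, here is a rough sketch of how such a check might look. It is not taken from any real HSM product: the Volume record, the due_for_migration() function and the 80% safety margin are all hypothetical, and the lifetimes are simply the lower bounds quoted above (50 years for MO, 30 for AME tape, 20 for SLR).

```python
from dataclasses import dataclass

# Archival lifetimes in years, using the lower bounds quoted in the post.
ARCHIVAL_LIFE_YEARS = {"MO": 50, "AME": 30, "SLR": 20}


@dataclass
class Volume:
    """One piece of archive media (hypothetical record layout)."""
    label: str
    media: str         # key into ARCHIVAL_LIFE_YEARS
    written_year: int  # year the data was written to this volume


def due_for_migration(volumes, current_year, safety_margin=0.8):
    """Return the volumes whose age has reached a fraction of their rated life.

    safety_margin is an assumption: migrate at 80% of the rated archival
    life rather than waiting for the media to reach its nominal limit.
    """
    due = []
    for volume in volumes:
        age = current_year - volume.written_year
        if age >= ARCHIVAL_LIFE_YEARS[volume.media] * safety_margin:
            due.append(volume)
    return due
```

A real system would also want to track pass counts for tape and read-error rates, not just calendar age, but the age check is the part the paragraph above is describing.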
punched cards. The only media to survive a nuclear blast radiation! Hope this helps.
Only guaranteed storage mechanism! Good for thousands of years.
Capacity: 2Kb/tablet
I/O: 1byte/hr
Media cost: £50/tablet
Error rate*: 1 per 100bytes
Note: Error rate assumes fully qualified and certified stone mason.
I can't speak for anyone else here, but so far my personal experience has been that Maxell CD-Rs are the absolute worst available out there and Verbatim have so far been the best.
By saying that I'm referring to how I bought my first CD-R about three years ago, and of the 20 or so Maxell disks that I've archived data onto, only one is still readable by any CD-ROM/CD-R drive that I insert it into. By contrast, every one of the Verbatim disks that I've burned, which were stored in exactly the same environment as the Maxells, is fully readable, and I haven't had any problems with them.
I've also used a few Sony and Memorex disks with which I haven't had any problems (that I'm aware of) but I have found my Verbatim disks to be incredibly durable. I burned 20 or so audio CDs onto Verbatim disks two years ago before leaving on a cross-country road trip, and despite vast changes of heat and cold, as well as being literally tossed around my car, every one of those CDs is also still working.
Again, this is just my personal experience, but whenever I see someone picking up a spindle of 50 or so no-name brand disks at a local computer store, I have to wonder how important the data they're putting on there must be...
--Cycon
Your Brain + EEG + LEGO Robots = Brainstorms
This is a very real problem, but it won't amount to an apocalypse unless we ignore the issue.
As others have pointed out, the exponential increase in storage capacity makes it relatively easy to "keep buying more disk" and migrating your data all the time. Certainly the convenience of having everything online is nice, too. And everything online should have periodic backups happening. I've managed to do this for the past decade with my data, but I've lost the eight or so years before that, and I miss some of it.
But there's logical as well as physical bitrot. The media itself deteriorates, making it hard to get the information back, but understanding what that bitstream represents after a few years can be a real problem. If you've got binary word processor files from an Apple2 or C64, you'll probably not be able to read them unless you also have the binary and can get it running in an emulator. Given the amazing progress that's been made in the last 150 years deciphering the records of dead civilizations, I wouldn't say that reading your MS Word 5 documents will be impossible in twenty years, but it might not be worth the effort. Open standards and open source really help a lot with this issue. If you can find a document describing the file format, you're saved. And the same applies to hardware formats. Also, it's much easier to keep open source software alive--essentially carrying the 'make a copy on the new system' over to executables.
I'd say the solution is pretty much that simple: keep track of your data, plan to make a complete copy every 5-10 years, and choose formats that are publicly documented and that (you hope) will be easy for future software to support.
This approach will work well with anything that is in daily use by a reasonably large group of people. Also it works best with information already stored in digital form. There is other information worth keeping: historical data, literature, even texts intended only for reading once (adverts, notes, email) may give later generations an insight into present everyday life and hence be worth keeping.
Many of these texts are not yet broadly available in digital form and are not important or interesting enough for enough people to be kept handy. Try looking for some older book by a not so famous author. Even encyclopaedic works are worked over for each new edition and older bits of information have to make place for newer ones.
With historical facts it's even worse, in most cases there's at least two versions of one event and who was in the right is mostly determined by who survived. Just have a look how warfare now concentrates on media control or try to imagine the twisted version of history if the nazis had won WWII, even now there are some denying the existence of the holocaust.
I think all this information is well worth keeping, and since it's difficult to see today what later generations might find worthy, the 'evolutionary' approach (if i/we don't want to keep it later generations won't want it either) doesn't work. And it doesn't suffice to just keep this information somewhere, it has to be kept in an accessible form, on media readable with modern equipment (who will go through the trouble of reading an old magnetic tape?) and indexed (if you have 1GB of unsorted texts/textfragments on a harddisk are you ever going to wade through that to get that piece of information presently of interest?)
"By the way if anyone here is in advertising or marketing... kill yourself." -- Bill Hicks
I disagree almost entirely.
Very little of the data volume becomes useless, because we don't know what "useless" will be to the readers in the future. Contemporary archaeologists spend much useful time sifting the contents of rubbish pits and latrines - if that turns out to be interesting, how can we ever say that data won't be ? Maybe your schoolwork is dull and uninteresting to you, but how about an educational historian in a century or so ? Wouldn't you like to know how teaching was carried out in the past ?
Also the majority (by volume) of data will always be automatically generated sensor data (humans can't type fast enough to keep up), and that tends not to become useless with time. NASA have already lost interesting telemetry data.
Authors have definitely lost early book drafts because modern WPs don't open old WP formats. Word 1.0 isn't old ! that's not even a decade ago. What about stuff from the '70s on hardware formats that no longer have players ? CP/M WP formats used by some of the first great novelists to work digitally ? (mind you, losing the whole of Pournelle is fine by me). Personally I'd find it very hard to read my own degree work, and I'd probably have to do it by scanning in the paper copies
Solutions ? I'm not a hardware guy, so I can only talk about the soft data side of it. I think XML (and similar) has a big part to play here. Let's stop thinking of data formats subjectively as "the data format that belongs to SprongWriter 4.2a" and instead work with formats that have objective definitions that extend beyond the client app of the day. Why should I need a copy of that particular WP to open the data, if the data is already in a format that's inherently accessible ? We already have the technical skills and tools for this, I call on all developers to make use of them and to stop writing these proprietary data oubliettes.
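As a small illustration of what "inherently accessible" could mean in practice, here is a sketch that stores a document as plain XML and reads it back with nothing but a generic XML parser. The element names (document, title, author, body, para) are made up for the example and are not any real word processor's schema.

```python
import xml.etree.ElementTree as ET


def document_to_xml(title, author, paragraphs):
    """Serialise a simple document to an XML string (hypothetical layout)."""
    root = ET.Element("document")
    ET.SubElement(root, "title").text = title
    ET.SubElement(root, "author").text = author
    body = ET.SubElement(root, "body")
    for text in paragraphs:
        ET.SubElement(body, "para").text = text
    return ET.tostring(root, encoding="unicode")


def xml_to_paragraphs(xml_text):
    """Recover the paragraph text using only a generic XML parser."""
    root = ET.fromstring(xml_text)
    return [para.text or "" for para in root.iter("para")]
```

The point is that the reader side needs no knowledge of the application that wrote the file: any XML parser, plus a human looking at the tag names, is enough to get the text out.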
Book Recommendation: The Clock of the Long Now Stewart Brand Why this sort of thing matters, and what a few people are trying to do about it. Best book I've read this year.
PS - SciAm also had a piece on digital data loss, a year or so back.
This is not a new problem. People have been dealing with the question of recovering data from old media for years. As a first data point, a number of years ago, about 5 IIRC, some people finally decided that some old music tapes had to be rescued.
The method used was to find this old RCA gentleman who had retired more than a few years before then. They then went to the Smithsonian and got the last remaining version of the tape recording/playback device that had been used to make the original master tapes. The RCA guy used the specs and his knowledge to tune the tape deck to perfection. They then put a high quality amp and splitter downstream of the tape deck to feed 2 digital tape decks (the professional version, not DAT, more bits and a bit faster sampling rate) and a couple of analog tape decks as well.
After testing, they carefully placed one of the Master tapes on the deck, started all the recorders and pressed "play". As the Master tape played it just came apart. They had to keep the heads clean but this was a one time, one chance thing. They succeeded.
From the recordings they made some wonderful CDs. Amazingly enough, the Master tape had almost no "hiss" in it.
Data point two. MIT, I believe it was, decided to move some of their older theses to CDROM for easier online access. The first thing they noticed is that many of the data tapes they had stored things on were 7 track tapes, and of course they had no 7 track tape drives any more. Again people went to the museums, got out a 7 track drive, spent the time to fix it and make it work, then built an interface box to connect it all up and away they went.
3rd data point. Somebody sent out to a mailing list that they were looking for some old code to run on an emulator for a PDP11(?). We ended up going into our machine room and found some old release tapes. This included a copy of BRLUNIX (based on a BSD release) and, I think, an AT&T Sixth Edition. These were 9 track reel to reel tapes. We went into the machine room, powered up the tape drive, and copied the tapes verbatim to disk. We set it up to do the least amount of reading. These tapes were around 15 or 20 years old.
Because of this rescue, which happened late last year, we saved the tape drive when the machine was tossed due to "Inability to prove Y2K compliance". So the tape drive still sits on the machine room floor. The operators turn it on and clean it once a week. It isn't currently hooked up to anything, but we expect it to be hooked up to something again in the next year or two, just to be able to read all those old tapes we still have.
At home I use EXABYTE-8200s for my backups. I have 3 drives and you can still get them refurbished. While each tape only holds 2GB (compared to a max of 150MB for a 9 track tape), the media is small and low cost. The Exabyte encoding also has a great deal of redundancy in it, making it an excellent choice for long term storage.
At work they do much of their backups on EXABYTE 8500s. For the Crays, they used to use IBM 3480 tape cartridges; when they changed tape formats, they spent a few weeks moving all the data from the older format to the new format.
Of course our most reliable storage medium to date has been our paper tape and punch cards. They may be low density, and sometimes we've had to make readers for them (auto feed to a flatbed scanner which scanned the card, process the card for holes and voilà).
CDROMs have the problem of decaying due to light contamination. If you want to keep them for years and years and years, they have to be kept out of sunlight. And because our long term, low cost storage methods keep dropping in cost and increasing in size, I suspect that what we will find in 3 years is that everybody is carefully copying all their data from CDROM to DVDs which will have a twenty year life span.
The basic rules on saving your data for the long term are:
Chris
This suggests that ALL data should be made freely available for archiving. If NASA had made an effort to make sure as many people as possible had copies of that data, then you wouldn't need to do all this transferring. It would have been transferred to newer systems by someone already.
Apart from MAME, nobody is making any effort to archive old computer games. The BBC managed to destroy a lot of valuable original video tapes (apparently they taped over their copy of the moon landings). These show that data is kept around much longer if copying is encouraged rather than discouraged.
I am an archivist. My job is to sift through data and decide what is worth saving. Generally about 5 percent of collections of modern records are saved. Popular culture is indeed documented to some degree in any historical library and there are several repositories which are dedicated specifically to the preservation of popular culture.
The filter of decay has served mankind well? How illogical, when you have no idea of what has been destroyed how do you know mankind has been served well? Was mankind well served by the destruction of the Library of Alexandria, the Aztec library destroyed by the Spanish, the historical libraries destroyed by the Serbs in the Balkans?
Sure CDs may last 100 years (we really don't know) but it is unlikely they will be able to be read by anything. Paper is still the most stable format available (although it is impractical for many reasons to transfer digital data to paper as some of my colleagues are prone to doing) and there are many vast libraries of data open to the public. We had well over 40,000 researchers use our library last year and less than 1 percent were scholars.
My profession is wrestling with two technology related questions.
1. How to make paper collections accessible electronically. For example the papers of ONE congressman (approx. 400k documents) took 5 years and nearly 3 million dollars to digitize. We have one collection which has 32M documents. Sure digital copies are cheap - IF the original was electronic and in a form easily translated.
2. How to preserve much of the information which currently only exists in electronic form, be it governmental databases, personal computer files or web pages. We did an interesting experiment a couple of years ago when we captured about six dozen web sites which documented the devastating Red River flood in Minnesota, North Dakota and Canada. Most of these sites existed on the internet for only 2-3 months and were disappearing as we captured them. I think it will be possible to study how the internet was used as a tool in response to catastrophe from the governmental level to local churches and organizations. Of course current copyright law makes it illegal for us to post this database of websites on the internet but that's another issue.
Aging Newbie is correct in the assertion that only a small percentage of data need be preserved, yet I feel that conscious, reasoned choices about what should be saved serve mankind far better than the filter of decay. I also believe the solution ultimately will involve a combination of strategies including electronic.
Skavvy(whose firewall apparently won't allow him to register)
WOW, I cannot believe that half of the /. readers are not working on data recovery as we speak. I spent a good couple of months of my life running back and forth across hallways doing tape retrieval because the machines that were made in the late 70s, early 80s couldn't be replaced. This was made even worse by the fact that half the tapes were corrupted. Fact is, we have lost a lot of the data from the Voyager space probe missions. With data centers poorly funded, the race to copy all the data from older 7 track format tape to new media is slow and grueling. 7 track machines are NO LONGER MADE and the companies outfitting newer tape heads to read the old data are charging way more than the scientific centers can afford. Not only Voyager, but Magellan and so forth.. GONE... and going as we speak. As the few machines that can retrieve the data struggle to re-read the tapes literally hundreds of times trying to recover those last missing bits, tapes yet to be re-archived are falling apart. Once the data is stored, what does one DO with half-complete 1970s computer records? There is not yet an "emulator" to read most of this stuff. Fact is, it is gone, and anyone who says this problem isn't going to pop up again has yet to store anything important on a floppy drive.
bortbox
However, I think he was mistaken. Ancient societies left stone tablets, cave paintings and the like behind, and there's no-one who fully understands the languages or the contexts (when an archaeologist says an object is of "ritual significance" he actually means he doesn't know what it's for). We do have the technology now, as the poster says, to migrate our data ever forwards into new storage, assuming no cataclysm occurs. And even if it does, it is far more important, in terms of recovering data, that the language (source code) survives, rather than CD ROM drives, Minidisc players etc (the binaries), because then data recovery is an essentially straightforward task.
I expect acid-free paper to survive long enough after an ecological catastrophe or, say, a meteor strike, to be useful to the survivors (better start moving the engineering textbooks down into the bunkers). And of course, Ship-It awards will outlast the end of time, not to mention non-biodegradable shopping bags.
As a civilisation, if we wish to preserve a legacy, we currently possess the skills and technologies to do so - if we choose to.
From what I've understood, the lifespan of a CD-R is around 20yr for those which are based on cyanine or AZO (and which appear blue or blue-green when you look at them) and around 100yr for those based on phthalocyanine (which appear golden to the eye).
Of course, it depends very much on the way you treat those CDs. If you put one in a light-free, dust-free, safe deposit box, it can probably survive several kyr (uh, thousands of years) without damage.
The unfortunate thing, however, is that because the error correcting codes work so well, it is not always easy to tell that a CD has begun noticeably deteriorating until the data is actually unreadable, and then it is too late. It would be nice if the drives could return some sort of ``CD quality'' status.
I always write down (on paper) the md5 fingerprint of the raw ISO image when I burn a CD. That way, I can check later whether I still have pristine data. (And if I make copies, I can be sure the copy is exactly identical to the original.)
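For anyone who wants to do the same check by script rather than by eye, here is a minimal sketch. The function names are made up for the example, and the optional length argument is an assumption on my part: a raw read of a disc may return padding past the end of the image, so it can help to hash only as many bytes as the original ISO contained.

```python
import hashlib


def md5_fingerprint(path, length=None, chunk_size=1 << 20):
    """Return the hex MD5 of a file or device, optionally of its first `length` bytes."""
    digest = hashlib.md5()
    remaining = length
    with open(path, "rb") as handle:
        while remaining is None or remaining > 0:
            want = chunk_size if remaining is None else min(chunk_size, remaining)
            chunk = handle.read(want)
            if not chunk:
                break  # reached the end of the file or disc
            digest.update(chunk)
            if remaining is not None:
                remaining -= len(chunk)
    return digest.hexdigest()


def matches_recorded(path, recorded_hex, length=None):
    """True if the current fingerprint equals the one written down at burn time."""
    return md5_fingerprint(path, length) == recorded_hex.strip().lower()
```

Note that this only tells you whether the data is still bit-for-bit intact; it does not solve the problem mentioned above of seeing deterioration coming before the disc becomes unreadable.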
This information is provided in the hope that it will be useful but WITHOUT ANY WARRANTY. Without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
Yamaha CD-R site
Josh
One could always do what Linus did for backing up his work --sharing it with the world. I heard he didn't have a tape drive for many years until he was given an Alpha, but his work could always be found somewhere on the internet in good hands.
The internet will always save your best work and discard the junk.
As someone who just loves books .. most are not printed on acid free paper anymore and a huge amount of them is going to be lost within the next 10 to 30 years.
I'm sorry to hear that. I've been fascinated by this phenomenon in our university library. Up until the 1930's somewhere, journals are pretty well preserved. Then they suddenly get awful as paper mills switched to new methods. Pages are yellowed and brittle. In the 1950's the error was discovered and pages become white again with the switch back to acid-free paper.
Let's hope we don't make the same mistake with digital media. And it could be worse: almost all the film from the first half of the century is lost to self-rot and environmental damage. For all its faults, DVD is probably the best thing that's ever happened to film from a historical perspective.
What I've noticed is that most of the data we're accumulating is quickly becoming useless. 10 year old schoolwork isn't something so worthy of archiving. The data you really want to keep shouldn't be very large anyway...
Modern word processing still opens really old file formats like Windows .WRI and Word 1.0, and I don't see that likely to change in the near future. The filters will probably stay, but be optional. If you want to future-proof your documents, run a mass conversion utility on them and convert them to a more "standard" format than Word or WordPerfect. Say, pure ASCII, HTML or RTF. Sure, you're going to lose formatting, but if those are documents you're not likely to use ever again, yet there may be a slight chance you will, then losing formatting isn't important. If you need the content again, you shouldn't mind too much having to redo the formatting correctly again...
Floppy disks are degrading rapidly, but most people's floppy collection can fit on a single CD-R. Then again, most people just don't care about their floppy collection, and will just let it die. The data contained on it isn't useful anymore.
Let's see about Audio CDs. They degrade over time (scratches) and possibly rot. I believe that what will happen is that we're going to convert them to some format like MP3. I'm fairly certain that MP3 capability will continue to be implemented in computers for a very long time.. And if it shows signs of getting phased out, then you might simply batch-convert everything to the new format. Or just rerip your Audio CDs that are sitting in storage, if you really care about the quality (since batch conversion will result in degradation, unless we find a way to actually enhance the audio quality... which might or might not happen...)
Movies. VHS tapes degrade... Probably, we'll be converting what we really want onto some kind of optical disk in the future. And the rest will decay, and we won't care about it decaying. When the format (DVD-R perhaps ?) is being phased out, since it's in digital format, it should be possible quite easily to simply transfer our DVD-Rs to the higher capacity medium... Perhaps 10 discs on a single one... Saving a lot of space, and having the format live another 20 years. After all, how hard will it be to include MPEG-2 decompression in next generation video players ? The cost of an MPEG-2 decoding circuit probably won't be very high anymore.
The other possibility I see is that bandwidth gets cheap enough so that we may consider remote storage vaults. That has a couple of privacy issues I'm certain you can see... But it's incredibly convenient and will probably be adopted by everyone if we just find a way to have a high speed switched pipe to everybody's home at a reasonable cost..
If we do indeed have high bandwidth in every house, I see that the media companies might also get their acts together and start putting up their own gigantic media-archive. They could offer a monthly media-license that'd give you access to any music or movie you want. Or perhaps just make you pay for every access to the archive. Of course, such a thing.. I can think of so many ways it could go wrong. What if they decide to have only censored material on the archive ? What about independent artists ? Perhaps we'll just see a protocol to access and pay for access to media archives, and have a dozen appear. Let's say, DisnABCTimeAOL could have theirs, AndoTransmeVAMicrosoChryslerDaimler could have theirs...
This could be so horrible if not properly done - a lot of "non approved" content could suddenly become unavailable if you killed the distribution channels except those media-archives... So. Is this just an incoherent rant ? Would you care to add any constructive comment to it ? Answers ? Questions ? Anything at all.
In many later books Lem refers to an informatic catastrophe: sometimes it is caused by a necro-virus, a product of computer evolution (the arms race was banned from Earth and transported to the Moon, where sophisticated computer systems worked automatically on weapon development. Each nation was allowed to get the weapons back on Earth, but that meant others could equally prepare; somehow, the automata on the Moon get out of control and start evolving, finally leading to a nanobot-virus thriving on silicon chips - therefore the title, "Peace on Earth"), sometimes by basic physical properties (in a humorous story "Prof. A. Donda" the title hero discovers a basic equality between energy, mass *and* information, and one of the consequences is that if information achieves a certain density it changes into matter, that is, a new universe. God's word was counting from infinity to zero in an infinitely small time :-) ).
I admit - I was shaped by Lem's writing. Many of his ideas from the sixties and seventies came to life in the nineties (e.g. virtual reality or sciences which deal only with information retrieval). I do believe that information storage is a problem - but not because the medium would not last forever, but because of the signal / noise ratio you have even in your personal files. As I look at the four Macs we work with in our lab, and the couple of Gigabytes of data, and then dozens of GB of backups, different versions, obsolete versions, alternate versions, gel pictures you have no idea where they came from and who needs them, and so on, and so on... Yes, there are better solutions than using a Macintosh in a multiuser environment, but that's not the point. I've been using Linux for years and have my personal data at home, and I seem to have a GB or so of data I'm too afraid to remove just in case. And there are so many alternatives of storage, backup, databases... and I'm just a simple biologist!
Returning to Lem - yes, I do believe we are approaching a critical point, like a bifurcation in a chaotic equation, and the word "chaotic" fits in here especially well. What happens next? He who cometh and giveth us a system (not OS, but an information retrieval system), he hath the power and our souls. Well, mine at least. Hope he doesn't come from Redmond, though.
Regards,
January