Will There Be Historical Records from the Digital Age?
magarity asks: "NPR's Morning Edition today aired a segment on the Medici Archive Project where every letter sent and received by the ruling Medici family of Renaissance-era Italy is being stored. The interviewer, Bob Edwards, casually joked that it was a good thing the Medicis didn't use email or else all this history would have been lost. It is easy to predict that at a similar distance in the future little will be known about our time period. After all, it is already problematic to retrieve 25-year-old data from 8-inch floppies, simply because the reading mechanisms are hard to find even if the media has retained the data. The same thing will happen to CDs in 50 years. How should the dawn of the digital age be recording itself for history, especially casual correspondence that gives insight into day-to-day life?"
"The Medici Project concerns itself with the rulers, and given the recent report of US Congress members not making use of email, one assumes they are still using good old long-term archivable paper. Will the President and Congress in 2030 or even 2020 feel the same way? The main problem is that digital records are so much more easily tampered with than old paper. It's not as easy to do carbon dating or other such tests with a bunch of bits. Remember: the victors always have and always will rewrite history as much as possible."
There was a company doing this, digitally encoding data with some etching device on a silicon-based wafer/platter/somethingorother. Wish I could find the link, but it's supposed to be *the ultimate* in long-term data storage.
Any loss of historical documents caused by use of email today in lieu of paper documents will be offset by the sheer volume of information available. Imagine how much data is locked up in Slashdot postings alone. A good chunk could be rescued from some college student's web cache on their hard drive. Writing was for the most part the exclusive province of a privileged few who were actually literate and didn't have to worry about going hungry. Now at least a quarter (lowball estimate?) of all Americans use computers in some form. Future archaeologists will have plenty of information to deal with.
Important information survives (usually). Trivial information gets lost. This is how it should be. There's no reason to preserve every bit of data for 'historical' reasons.
Who is trying to erase any record of his past drug convictions? Who has been firing scientists that produce studies contrary to what his big-oil pals want to see? Who is cutting the taxes on his rich cronies? Which party quashed certain elements of the Census in order to maintain their power structure?
That's right kiddies, the right-wing and their puppet GWB.
So go ahead, preach on about Vince Foster -- meanwhile, we'll all be slowly poisoned by right wing tyranny and their greed-politics.
I can pick up a book that was printed 500 years ago and still read it. 500 years from now can I do that with what I'm writing right now? What about letters? This is important in a historical sense. We have learned so much of how life was in the past just by reading letters from plain people. Of course, back then, when you got an education, you got one that was better than what they get today. Listen to some of the privates, simple nobody privates, in the Civil War and their letters sound almost like poetry. Show me a teen today that writes to his parents or friends with the same vocabulary or command of the language. They're too busy "gettin jiggy wit it" I suppose. But what medium can surpass the printed word? Or photographs for that matter. People seem to be buying digital cameras left and right, but what you buy today is outdated next year. On the other hand, I could pick up a Leica that was made 50 years ago and STILL slap film into it and shoot away. Will my little Canon S20 digital last me for the next 50 years? I can view platinum/palladium photographic prints that were made over 150 years ago and they're STILL as vibrant as when they were made. Can a print off an inkjet last for 150 years? We're so quick to get new technology that we don't think how this will affect those in years to come. Yes, I care about this. I see photographs of my great grandparents and read their letters to one another when my grandfather had to leave my grandmother to come to America in the 1910s. By reading their letters I feel I'm so much closer to my heritage and can tell how they lived and loved. (No, I didn't mean this to turn into a trip down memory lane.) What I wonder is if my great grandchildren will be reading my "emails". Of course not, they're gone forever. So much information is simply lost in the wind. Shame.
No, the trick is that a picture is worth 1000 words. Since graphics usually compress worse than text (limited dictionary), we simply want the 1000 words because it saves us space on our servers. :)
Server Administrator
National Archives
Soon my quest for world data domination will be fulfilled!!!
You know, one feature of Babylon 5's story line that always bothered me was the apparent dearth of historical material available to them from merely 1000 years before - consider how sketchy their knowledge was of the previous Shadow Wars, whereas our own historical view of a similar timeframe is fairly well fleshed out, at least as far as major political/military events are concerned. Could it be that in this, as in many other things, JMS is exceptionally prescient?
What may be the most difficult part of the problem isn't the long term storage, but conveying what's stored.
Think about Egyptian culture. We wouldn't have a clue without the Rosetta Stone. It wasn't enough that they left writing and markings that have lasted thousands of years. We needed a tablet with the same message in several languages to figure out what they were trying to say.
So what you really want in your storage is a long term package, no moving parts or power supply, some generic and easily understood interface, and a primer that cannot be misunderstood.
Also, for those thinking we can just have plain ASCII text, it's not that simple. ASCII is an encoding scheme. You have to have something in the primer to tell the reader how to decode the data and then what those letters and words mean, and so forth. In 2000 years we invented Latin, French, German, English, but modern German speakers would find Old High German hard to comprehend.
This gets worse as time goes on. It's already hard to explain feudalism to people; try explaining the Roman Republic's governmental structure. Now, try explaining American Democracy in 500 years.
It's not just the media, it's the culture. And a primer is how you get them able to follow enough of the conversation to get a grip on it.
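To make the encoding point concrete, here is a minimal Python sketch. The sample string is purely illustrative (it comes from no archive). The same stored bytes yield different text depending on which decoding table the reader assumes, and nothing in the bytes themselves says which table is right.

```python
# The same stored bytes read differently under different decoding tables.
# The sample text is an illustrative assumption, not archive data.
stored = "naïve café".encode("utf-8")   # what the archivist wrote to the medium

as_utf8 = stored.decode("utf-8")        # a reader whose primer names the right scheme
as_latin1 = stored.decode("latin-1")    # a reader who guesses: no error, just wrong text

print(len(stored), "bytes on the medium")
print("decoded as UTF-8:  ", as_utf8)
print("decoded as Latin-1:", as_latin1)
print("same text?", as_utf8 == as_latin1)
```

The wrong guess does not fail loudly, which is the uncomfortable part: the future reader gets plausible-looking characters and has no way to know they are wrong unless the primer says so.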
Making copies of data, even for historical preservation, without permission of the copyright holder is illegal under the DMCA. You THIEVES!
Microfiche records of old newspapers (and old magazines back in the stacks) are a researcher's staple, and an important one. Now it seems many publishers are trying to stop this. There have been questions on Slashdot and elsewhere about what's going to happen to scholarly journals when they become purely electronic, and access to them is all pay-to-play. The same question ought to be asked about popular media, too.
Before the Internet got big, I spent a lot of my time in library reference rooms, mostly at either the Enoch Pratt (Baltimore public) main branch or at the Johns Hopkins Eisenhower Library, which kindly gave me a courtesy card to use even though I was not associated with the University.
I have also spent many happy hours at the Library of Congress in Washington, D.C.
I have dug up -- and based many stories on -- little, overlooked nuggets of information I picked up browsing through old newspapers and magazine articles and obscure references. For example, while hanging out in a private library owned by an insurance company (The Equitable) and scrounging through old city directories, I noticed that in 1800 or so, Baltimore had many "Tea Rooms for Refined Young Ladies" located near the shipping docks (Fells Point).
I wondered why in hell a rough sailor's district had so many such establishments until I realized that they were all whorehouses, which led to a wonderful little article about prostitution in Baltimore over two centuries.
I had a lot of fun researching that piece... lots more fun than writing (ahem) about software...
See what can happen when you hang around in libraries and museums? And why the DMCA is a bad idea? Think of future journalists who won't be able to get editors to pay them to hang out with whores, because they won't have access to enough historical records to justify writing that kind of story.
- Robin
They will say:
Thank you, Ministry of Historical Perspective!
I would think so. Yes, there is a lot of stuff going on on the net that no one cares about now and no one will care about in 50 years. On the other hand, we have most of the letters people like Washington and Jefferson wrote, because they made personal copies in a diary before they sent them (which made sense in a day and age when letters might not get there). And they are of great interest to many people. And there are many other records from that period and before, including a very complete set of records covering several hundred years of the Cairo Jewish community in the Middle Ages that was found about 100 years ago. That one existed because Jewish law requires some written records (those containing G-d's name) to be stored or disposed of properly. And the community just got into the habit of saving everything. It's literally hundreds of volumes of stuff.
In 50 or 100 or even 500 years will historians be able to access what we have done today? I hope so but I don't really know.
Erlang Developer and podcaster
Well, limiting the number of formats that you accept has the major advantage that in 100 years people will still be able to read them. The downside of ASCII is that it will only do English text. If you want to archive a document in Greek, Hebrew, Yiddish, German, Russian, Chinese or whatever, you can't do that with 7-bit ASCII.
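A quick way to see the 7-bit limit, sketched in Python. The sample words are my own illustration, and UTF-8 is used here only as one example of an encoding that can hold them, not as something the post recommends.

```python
# Which sample words survive a 7-bit ASCII archive format?
# The words are illustrative assumptions; each means roughly "history" or "street".
samples = {
    "English": "history",
    "German": "Straße",
    "Greek": "ιστορία",
    "Russian": "история",
}

def fits_in_ascii(text: str) -> bool:
    """Return True if every character of `text` is representable in 7-bit ASCII."""
    try:
        text.encode("ascii")
        return True
    except UnicodeEncodeError:
        return False

for language, word in samples.items():
    utf8_size = len(word.encode("utf-8"))
    print(f"{language}: ascii={fits_in_ascii(word)}, utf-8 bytes={utf8_size}")
```

Only the English sample passes the ASCII check; the others need a wider encoding, and that encoding then becomes one more thing the archive has to document.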
We just have more medium-term storage. The sorts of things that won't last more than a couple dozen years are generally things which, in the old days, wouldn't have lasted a minute: music couldn't be stored at all until recently, and many conversations we have by email (which could degrade) would have been done in person and never stored at all.
Stewart Brand addresses this issue on the Longnow website:
http://www.longnow.org/10klibrary/library.htm
DNA seems to work even better than paper.
Once the computer age "wild west" mentality wears off a little, people will hire services to perform automatic nightly backups of all their data, just as sure as they buy homeowners insurance and wear bike helmets. It's just common sense. Data loss will become extremely rare, and even scandalous.
This is no coincidence, because Knuth's main oeuvre, a multi-volume work on computer science, already has a related aspect:
Computer science changes very fast, and Knuth decided to include just those parts of computer science that have settled and reached a maturity that makes them unlikely to change radically in the future. Hard task. And indeed the material he put into his three released volumes is highly mathematical, because such material is typically mature enough, but still he did not quite manage it: the RISC architecture, for example, pushed him to update his machine language MIX.
At some point, when Knuth got some copies of his TAOCP, he was frustrated by the typographic quality getting worse. So he decided to take some time off to develop a system that turned into TeX (who else but a professor can take a 10-year sabbatical to do such a thing :-)
To shorten the story:
Knuth developed TeX, the program that assembles boxes into lines, lines into pages, pages into documents. He developed Metafont, the program that takes the mathematical description of font families (= a meta font) and renders them into bitmaps. He developed the Computer Modern fonts in Metafont format. Plus he invented a system called literate programming, which allows program code and documentation to be derived from the same sources.
All this has been released in the form of five books.
This means that even in hundreds of years, everyone with those 5 books, something like a computer, and the ability to read mathematical texts plus the computer science knowledge to implement a Pascal-like language, will be able to reconstruct Knuth's whole system!!!
If at that point .tex sources are available (at least as printed listings!), they will be able to hack device drivers for their then-common output devices and print all of Knuth's works in original typographical quality!
That is the real, deep reason for Knuth's TeX: longevity of information.
Embossed metal would be good.
No better. Metal gets corroded by water (worse yet: saline water), melted by fire, cracked by cold etc.
Besides rock, which has proven pretty good throughout the ages, there's one thing that could hold up the promise, and that's mineral paper. (Aka, asbestos paper.)
Karma karma karma karma karmeleon: it comes and goes, it comes and goes.
Have you ever heard of gold? Not to mention titanium, hafnium, rhodium, platinum, nickel, chromium?
Hardly affordable metals, aren't they? I'm talking something remotely accessible, not gold-plated disks to be sent to outer space...
Were YOU awake in economics class?
"Important information survives (usually). Trivial information gets lost. This is how it should be. There's no reason to preserve every bit of data for 'historical' reasons"
But the selection forces change, just as being big as a dinosaur was great in the Jurassic, but it wasn't so great when the extinction came.
I've worked on research projects whose primary source was day-to-day accounting records of a small business running in Egypt during the 11th century.
Yes, but we don't need the records of every small business in every country in every century. Just some sampling will do. We lose information but it's a tradeoff for space and conservation work. The same about modern data.
__
__
Men with no respect for life must never be allowed to control the ultimate instruments of death.
GW Bu
Orwell was a well-known member of the U.K. socialist party if memory serves.
Doubleplusungood! Thought Police! Here! I have found a crimethinker! He must be an agent of Emmanuel Goldstein, spreading misinformation!
Put Doctor K with his brother in the Castle!
Nowadays it seems that it's the place where artistic (or allegedly artistic) works used to go. Don't look for Mickey Mouse to show up there any century soon.
I see even classic Slashdot is now pretty much unusable on dial up anymore.
Digital rot of our records, I mean.
Think about it -- what do we have to pass on to future generations of the past 20-30 years? Boy George, N'Sync, Lyndon LaRouche, Hare Krishnas, Monica Lewinsky, Rush Limbaugh, Al Gore, Rob "CmdrTaco" Malda...
It might be a good idea for ALL these things to slowly melt away ...
"Beware by whom you are called sane."
Potato chips are a by-yourself food.
The Long Now Foundation is making an attempt to preserve at least _some_ of our world for posterity.
-- Out of cheese error! Redo from start.
I hope your box will survive for the next 200 years :) ...
Freaker / TuC
--- I am known for the ones who want to find me on the net. Is that a privacy risk or a privilege? One might wonder..
cat internet | lpr
Already, a lot of truly historic correspondence from the dawn of the Digital Age is possibly gone forever. As another Slashdot article some time ago pointed out, we can read Galileo's correspondence on astronomy in the 17th century but we can't read Marvin Minsky's correspondence on AI in the 20th century. Storage technology is changing so rapidly, and these concerns about "digital rights management" that CPRM and the DVD-CCA are supposed to address will make it even more difficult. These "Digital Rights" basically include the "right to allow your content to vanish in the mists of time".
Give me six lines written by the hand of the most honest man, and I will find in them something to hang him.
I'm sure the presidential libraries and stuff about important famous people, the Medici of the digital age, will continue to be well preserved - at least that part that they want to be remembered for - but a vast majority of information, 98% probably, isn't worth the trouble of saving.
Currently I'm about to pick up a used Super-8 projector to show some films that are in great shape.
Also just got a 1930's Burroughs adding machine for $15 from a hamfest that, with a few drops of oil and cleaning is in 'like new' condition and will probably be in working condition hundreds of years from now if kept in the right environment (room temp, low light and humidity - basements, attics, garages and sheds are hell on that stuff).
try { do() || do_not(); } catch (JediException err) { yoda(err); }
The accounting ledger is only of interest today because it is largely all that survives of the culture. You have to be careful when making assumptions about older societies based on a handful of spotty records. If all you can find are commercial records, it is far too easy to assume that commerce was the most important thing in people's lives when it very well may not have been.
I'm not worried about what records will survive and won't survive from our era. The Romans, the Greeks, they didn't worry about such things. They worried about what legacy they would leave for the future (fat lot of good it did them) which is what kind of world they were leaving for their children. This is far more important, IMHO.
Just be sure to wear the gold uniform when you beam down -- you know what happens when you wear the red one.
All of our digital archives are deteriorating at a rate unparalleled since the introduction of acid-based paper.
If it's not the medium (read an 8" diskette lately? How about a 14" 5MB cartridge? How about a reel of mag tape?) it's the software (M$ Word document formats were deliberately sabotaged to force people to migrate to the newer versions. [I don't know anyone who actually needed M$ Word '97 until they found that they had to upgrade when M$'s biggest clients, who'd got their copies for dirt, did.])
There will be thousand-year-old documents and last week's flimsies and nothing in between. Just an Orwellian silent testimony to greed and obsolescence, planned and otherwise.
But that said, have we said or written down anything worth keeping?
MSBPodcast.com The opinions expressed here are my own. If you don't like 'em... Think up your own stuff.
Use modern circuit etching technology on long-lived media such as corrosion resistant metal.
Etch text, not binary codes.
The future can read this with a computer or magnifying glass.
That applies to 5 years ago or 2000 years ago.
Even paper disintegrates, albeit in centuries.
Only a tiny fraction of stuff is copied now or then.
Everything has value to someone at some time. I have the sick habit of collecting the Internet (custom spiders suck large parts of the web and Usenet onto my hard disks) just for the heck of sorting through it to see what I find. In 100 years my hard disk full of odds and ends could be a great find for some historical researcher. I'd disagree with the original poster though. Our culture will be better documented than any culture before us. We're an information culture and we leave our data all over the place. Someone who digs up a stack of CDs would have a huge collection of multimedia information, and all they'd have to do is figure out how to read the discs (which is referenced in other documents, both printed and digital, so there is a key). Sure, it's important to keep copies of disks, email, music, etc., despite lame IP claims, for historical reasons, but this is fairly easy to do. Copy the other sources into raw data files (ISO images, CD rips, game ROMs, etc.) and copy the files around as much as possible. Email and other personal files which are quickly deleted or may be encrypted may be the hardest data to save, but the large amount of email that ends up cached or forgotten all over the place would still probably exist, and by that time I expect the future culture to have the computing power to easily decrypt our files.
At what price learning? At what cost wisdom? The price is a man's peace of mind, and the cost is his life.
Copied all my e-mail from Outlook to the standard text format when I went to Linux: .mbx format.
How did you do that? I have tons of old mail I would love to archive, but can't seem to convert from
-josh
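One possible route, as a sketch and not the method the poster above actually used: once the messages are available as plain sender/subject/body text (getting them out of the mail client is not covered here), Python's standard `mailbox` module can write them to a plain-text mbox file. Whether that layout matches the ".mbx" file mentioned above is an assumption. The sample messages, addresses and file name are hypothetical.

```python
import mailbox
import os
import tempfile
from email.message import EmailMessage

def archive_to_mbox(messages, path):
    """Append (sender, subject, body) tuples to a plain-text mbox file at `path`."""
    box = mailbox.mbox(path)              # creates the file if it does not exist
    try:
        for sender, subject, body in messages:
            msg = EmailMessage()
            msg["From"] = sender
            msg["Subject"] = subject
            msg.set_content(body)
            box.add(msg)
        box.flush()                       # write pending changes to disk
    finally:
        box.close()

# Hypothetical sample data standing in for exported mail.
sample = [
    ("alice@example.org", "Hello", "First archived message.\n"),
    ("bob@example.org", "Re: Hello", "Second archived message.\n"),
]

archive_path = os.path.join(tempfile.mkdtemp(), "archive.mbox")
archive_to_mbox(sample, archive_path)

# Read the archive back to confirm the messages are there.
subjects = [message["Subject"] for message in mailbox.mbox(archive_path)]
print(subjects)
```

The resulting file is ordinary text with one "From " separator line per message, so it can still be inspected with any text viewer if the reading software disappears.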
We need to define a long term storage standard which is a suite of storage media and standard file formats. Call it LTSS 1.0. To be a LTSS 1.0 compliant reader you have to support all media and file formats. This could be a dedicated reader, or a general computer with some specialized software and hardware.
LTSS 2.0 might have whizbang new file formats and storage media which support 100 times as much information density, but it must be compatible with version 1.0.
LTSS 1.0 could support WAV, MP3, GIF, TIFF, Text/ASCII, Text/Unicode, HTML version whatever, and perhaps even Java for interpretation of arbitrary file formats. The media: CD-R, or perhaps one of the writeable DVD formats when they mature.
-josh
I agree. The advantage of the digital age is that it makes it practical to keep things like correspondence. I've been archiving virtually all the personal email I send or receive since 1989. There's no way I could justify keeping that much paper around. But in electronic form it is hard to justify not doing it.
Well, you could accept images in .xpm format, which is text but easily viewed as a pixmap image (it's sorta self-documenting). You just don't want to accept random binary data that you would have to retain a reader for as well.
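For anyone who hasn't looked inside one, here is a small Python sketch that writes a tiny two-colour image as XPM text. The glyph, the function name and the colour choices are illustrative assumptions; the point is only that the output is readable with nothing more than a text viewer.

```python
def to_xpm(rows, name="image"):
    """Render equal-length strings of '.' and '#' as XPM3 text (two colours)."""
    height = len(rows)
    width = len(rows[0])
    if any(len(row) != width for row in rows):
        raise ValueError("all rows must have the same length")
    lines = [
        "/* XPM */",
        f"static char *{name}[] = {{",
        f'"{width} {height} 2 1",',   # width, height, number of colours, chars per pixel
        '". c #FFFFFF",',             # '.' pixels are white
        '"# c #000000",',             # '#' pixels are black
    ]
    for index, row in enumerate(rows):
        separator = "," if index < height - 1 else ""
        lines.append(f'"{row}"{separator}')
    lines.append("};")
    return "\n".join(lines)

# A hypothetical 5x5 glyph, roughly a capital A.
glyph = [
    "..#..",
    ".#.#.",
    "#####",
    "#...#",
    "#...#",
]
xpm_text = to_xpm(glyph, "letter_a")
print(xpm_text)
```

Even with no XPM viewer left anywhere, the pixel rows in the printed text already show the picture, which is what "sorta self-documenting" amounts to.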
Your right to not believe: Americans United for Separation of Church and
We still haven't figured out exactly how to make Greek Fire, have we?
Yeah, I know it was sarcastic, that's why it was confusing. It sounded like you were trying to contradict what I was saying, but then you also were agreeing with me. Oh heck, it's too late and the meds are getting to me.
What?
Take the Ems telegram, for example: it was seriously altered, and it sparked a war between France and what would become Germany. Of course, we now know that it was altered, but at the time no one knew what happened.
If you think that people altering documents for their own good is anything new and ruins good historical records, you need to wise up and take a history class. This is nothing new, and we still have a good idea of what happened.
What?
Where are my funny mods when I need them? :)
--
These are easily readable, and will surely stand against the test of time.
Ok. More seriously, a large subset of the "digital" information can actually be stored on a lasting analog medium, where special hardware/software will not be required to retrieve it. For example, email and web pages can be stored on microfilm. Same goes for source code. Digital audio can be stored on audio tapes. Digital video goes on video tapes. These analog media are fairly simple to handle. They do degrade slowly with time, but I'll bet at any age people will always have the ability to retrieve large percentages of the information stored on them.
So, if you want information preserved forever, go analog. The software standards can change all they want... You can always re-digitize the analog information, but not the other way around.
The way to keep data long term is to form long-lasting institutions (like libraries, for example) whose purpose it is to perpetuate knowledge and information. Within the Earth, you can't see any medium as being eternal, so you have to create a social construct that will perpetuate the data, across media and societal changes. A good example of this is the Bible. The original 'Bible' is long gone, but it's one of the most solid pieces of historical data because there is a social construct, Christianity, that has a primary tenet of keeping that word alive. This isn't a religious rant, but just an example of ways to archive data beyond the lifespan of any given medium.
Kevin Fox
Though we're happy living here, the Earth is highly corrosive and chaotic. We don't see it because it happens in slow motion (by our perspective) but everything's getting worn away, oxidised, bleached, or otherwise transformed by chemical reactions.
If we want to save data we need to make redundant copies, in a form that is resistant to electromagnetic radiation (say, microetched in carbon, silicon, or another stable element), and put it into a heliocentric orbit one radius behind or ahead of the Earth's orbit (this way it's not in a Trojan point, which could result in collision damage, but is still in a 'mathematically likely' place).
Most of the corrosive factors would be left behind on Earth, and the data would be stable for the long haul. Alternatively, we could put data on the moon, where it would be stable until a meteor hit it or covered it up, likely tens or hundreds of millions of years, and if we put several down, they'd last longer.
Hmm, maybe a big micro-etched monolith buried just under the surface...
I think the ICQ logs from efront are a very important historical record. Even if most of it is inane, it provides an uncommonly frank and unclouded view of a crashing internet company. Some of the most valuable historical records _are_ the inane letters sent from person A to person B. How about the Diary of Anne Frank? A thirteen-year-old's AIM chats are one of the most important works of the century!
am I missing something?--I just opened a .mbx file, and it was mostly plain text?
Slackware: old school feel, new school gear.
For good examples of similar thinking, check out Danny Hillis' 10,000 year clock project. The first thing he did was toss out all "modern" technology because none of it would last as long as he needs it to. He had to go back to the Bronze Age, I think?
www.HearMySoulSpeak.com
"Old" does not mean "significant". There's a lot of noise in old records -- what little survived does get looked at, but someone's grocery list is not really of historical value. The famous Rosetta Stone was just a simple praise of a new king; it is known for how it said it, not what it said.
If our civilization comes to an end, why are the people that come after us going to care about 90% of the stuff we know or do? They might be more interested in where to find food and shelter. Really important information is still printed on paper, and I don't see that stopping any time soon.
www.timcoleman.com is a total waste of your time. Never go there.
For a virtual world we ought to separate the information from the media. We could store data and execute programs on several computers and use the majority result. See Askemos for how this would work.
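The "majority result" idea can be sketched in a few lines of Python. This is only an illustration of majority voting over replicas, not a description of how Askemos itself works; the replica functions and names below are hypothetical stand-ins for separate computers.

```python
from collections import Counter

def majority_result(results):
    """Return the value reported by more than half of the replicas.

    Raises ValueError when no value has a strict majority.
    """
    if not results:
        raise ValueError("no results to vote on")
    value, count = Counter(results).most_common(1)[0]
    if count * 2 <= len(results):
        raise ValueError("no majority among replicas")
    return value

# Hypothetical replicas: two compute the square correctly, one is corrupted.
def honest_node(x):
    return x * x

def faulty_node(x):
    return x * x + 1

replicas = [honest_node, honest_node, faulty_node]
answer = majority_result([node(7) for node in replicas])
print(answer)  # 49
```

With three replicas one bad copy is outvoted; tolerating more bad copies simply means storing and running on more machines.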
While we are at it, we might find that files are worse than paper for another reason. We had better have "write once" files. If reusable paper were better than normal paper, we would have it in the stores. Enough cycles of invention have gone over it already.
A democracy, a so called 'free society', can easily be manipulated and controlled by the person controlling the information. What happens when all information, except what comes from 'authorities' is suspect because it is so easily fabricated?
It reminds me of the Arnold Swarzen...(?) movie, "The Running Man". He's a police helicopter pilot who refuses to shoot unarmed people involved in a food riot. The powers that be manipulate the video tape evidence to make it appear that he massacres the people instead. People are shown the tape and cry for his death in a game show type fashion until some revolutionaries are able to show the real tape by hacking into the communications channel.
The temporality of public records has very serious implications for our social structure. If the only record of your speeding ticket is an entry in a database, what happens when a glitch makes you a drunken sloth who doesn't pay child support? If the entry showing Bush's drug convictions gets deleted, will there be any other record? Trust me on this, email is a politician's dream. Everything from here on has plausible deniability.
Aah, change is good. -- Rafiki
Yeah, but it ain't easy. -- Simba
Will anyone care what the day's Slashdot articles were 50 years from now? Do we care what they are today?
All Troll + "offtopic" mods are meta moderated as "Unfair", because you abused the system.
That sort of information can also be important to genealogists.
I know of several records from my own family that are part of inventory sheets, since the company kept the names of the folks who performed the inventory, and their contact addresses. It's -wonderful- information for dating when individuals were in certain places, and that sort of thing.
Historians may not be specifically interested in you, no, but what about your descendants?
The day-to-day information that we produce is the stuff that makes genealogists go nuts. It's the stuff that leads to books like "Roots". Biographies of people who, to themselves, seemingly did nothing with their lives, yet looking back at them a hundred years later we see how extraordinary they were.
Should -everything- be saved? No. Personal correspondence with friends and family should. (And hell, I have -every- piece of email that I've received at work over the last year saved. Talking roughly 500MB or so of gzipped archives, which balloon to about 1.5G.)
Although a smaller fraction of the data produced today will be readable in the future, there's so much more data produced that you wouldn't want to read much of it anyway. The fraction of it that's produced on long-lasting media like acid-free paper is still quite a lot.
I personally don't feel the need to copy any of my old floppies. All that I ever had on floppies and that mattered to me is now somewhere on my current hard disk (and a few past ones). All of it takes only a fraction of my 18GB drive. Assume I had 100 floppies that mattered: that's less than 200MB, which you can copy in a few seconds on modern digital media.
As a matter of fact, each time I get a new computer, I copy all the stuff from the old one, and it takes only a fraction of the space. The 40MB of my first (Atari ST) hard disk are there. The 160MB of my first Mac hard disk (120MB left after I copied the Atari hard disk onto it) are there. And so on.
The real issue is binary formats that have been forgotten. For instance, I have source code of programs I wrote in GFA Basic (a Basic for the Atari ST, in case you wonder.) But emulators come to the rescue there. Today, I can run Atari programs faster than on the real machine.
-- Did you try Tao3D? http://tao3d.sourceforge.net
When I was studying Latin in high school, I learned that what archeologists are most interested in is the way people lived in those times -- things which were so common that no one thought to write them down, or save the writings. They were taken for granted. Thus, the essential task of the archeologist is to not only read the works of ancient people, but to go digging up their houses and garbage pits to discover what their lives were REALLY like. They need to uncover the things about the cultures which were so ubiquitous that they were never recorded.
Having a series of backups of the Internet would help future archeologists tremendously, I think. It would give a true record of what our society was like. We can't consciously record what future generations will want to know about us, because we don't know how they will be different from us, how our values and customs will differ.
Of course, we're still writing plenty of books which will tell future generations a lot about our culture (if they don't all disintegrate first -- beware the paperback book!). And, lord knows, we are throwing away plenty of garbage to root through!
Ok. I am not a history scholar, but I have occasionally worked with different archives during the past years; with the Danish State archives, and The Berlin Document Center (has a new name now).
The amount of information (archives) that a state amasses is simply astounding, and that's just the bits that go into the archives; at least 90% of all paperwork is scrapped even before that.
An example: I helped a scholar do some statistics on black market crime during and after the war.
He examined a single lower court in the period 1940-1953. "Only" 8000 cases went through this court, but the verdicts alone, averaging 3 pages per case, amounted to 25000 pages, bound into fifty 500-page tomes. Each of these cases would also have generated a "file" containing e.g. police interrogations, wiretapping records, anonymous letters, forensic evidence, case evidence, court orders, affidavits, etc. A really conservative estimate would be that each case generated at least 20-40 pages, meaning that just this single court, in a few years, could have archived 100.000 - 200.000 pages. A totally impractical thing to do. Therefore these files were "cleansed" before being archived.
If all the papers that public institutions produce were preserved, we would be swamped in archives. Some stuff simply has to go.
Old-style paper archives have a physical storage problem. Modern "bit-based" archives should, in theory, be less burdened by this. (200.000 pages should fit handily onto a single CD-ROM.)
But on the other hand, modern information systems make it so much easier to generate and preserve information. (Just think of how many gigabytes of information a single company has on its servers.)
How many emails are sent every day? 5, 10, 20 million? If just a fraction of these, say 100.000, were preserved every day, think how many freaking million emails that would be during a short period of 50 years. But more importantly, how many (and which) emails would posterity need in order to say something about our time and the social pattern behind the phenomenon of email?
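To put a number on "how many freaking million": a quick back-of-the-envelope sketch in Python, using the 100.000-a-day and 50-year figures from above. The function name is made up for illustration, and ignoring leap days is a simplification.

```python
# Rough estimate of how many emails pile up if a fixed number is
# preserved every day. Leap days are ignored (an assumption).

def preserved_emails(per_day: int, years: int, days_per_year: int = 365) -> int:
    """Total messages kept after `years` of saving `per_day` messages daily."""
    return per_day * days_per_year * years


if __name__ == "__main__":
    total = preserved_emails(per_day=100_000, years=50)
    print(f"{total:,} emails")  # 1,825,000,000 emails
```

So even a small daily sample comes to well over a billion messages, which is the point: selection matters more than storage.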
The main problem with digital archives, is the same as with paper archives; You can't, and shouldn't try to preserve everything.
I don't doubt that over time even the majority of the information selected to be preserved will be lost, due to bit-rot, war, fire, carelessness, natural disasters etc., during the next 1000 years. But even if just a tiny, tiny fraction of this is preserved, there would be "enough" information about our time for the historians to make a good overall picture.
A single, modern "Statistical Yearbook", probably contains more demographic information, than all medieval archives put together.
A modern public library probably contains more works, and more written information about the last 100 years, than has been preserved from when man began to write until the Middle Ages. Still, a lot can be said about the Roman Empire, even though so precious little in writing has survived.
So to reduce future archives to a manageable size, the majority of information simply has to be discarded. Then it is more likely that there will be funding for preserving the rest in a proper way.
Considering the amount of time, money, blood, sweat and technology that goes into carefully extracting scrolls from the Pompeii site and making them readable, it should be a "trivial" task to recover any kind of non-encrypted data, no matter what digital media it resides on. However, the cost of doing so may not be trivial. Just think of how many data formats future historians would need to reverse engineer, just to cover this last decade.
"Remember: the victors always have and always will rewrite history as much as possible."
How I have come to loathe this dogma.
Originally, it stems from the fact that sometimes only one party's "history" survived from ancient times until today (Athens and Ancient Egypt spring to mind).
But the dogma really isn't true anymore. First, in democratic countries it is impossible for the state to directly control what history is written. Secondly, after having dealt with the massive "memory" rewrites among former Waffen-SS soldiers, I can only conclude that the losers are just as eager as the winners to rewrite history; there has been a huge amount of revisionist "history books" written since WW2 ended, from outright Holocaust denial to apologetic "Waffen-SS coffee-table books", where the W-SS soldiers are portrayed as just a bunch of happy, anti-communist boy scouts on a picnic in the USSR. None of them were ever Nazis or anti-Semitic, they never saw any war crimes (except those the Russians committed), blah, blah, blah. Total denial of facts.
So a better dogma would be:
"Remember: both the winners and the losers always have and always will rewrite history as much as possible."
Historians know this of course.
Grrr...
tagline
... hi bingo
The world will be destroyed 5 minutes before the completion of the backup of all heretofore created data, just as the dolphins predicted, and the little white mice will be upset that they didn't get to look at all our comic strips and juicy emails...
Insert mind here.
He's a Republican, he doesn't need facts. If Rush says so it must be true.
War is necrophilia.
(sarcasm mode)
What I think we should do is get some text-to-speech software, and have it dictate all of our emails and such into audio format. Then we can burn a few hundred audio CDs. You may all be saying "how is this different from putting it on a computer CD"... well my friends... if we have learned anything from the RIAA.... the music industry moves slower than molasses running uphill, and you can probably bet those CDs will be playable for at LEAST a hundred more years. Seriously... if any of us were given a vinyl record... who here could NOT find a player to play it on? And those things have been out for a LONG time.
Steve
I opened up each message, clicked "Save as" and saved as .eml. It was a bitch. I think there's a pst2eml perl script out there somewhere. Or maybe mbx2eml?
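For anyone who would rather not click "Save as" on every message, here is a minimal sketch. It assumes the mail has already been exported to a standard mbox file (Outlook .pst files are a different format and are not handled here). It uses Python's standard `mailbox` module; the function name `mbox_to_eml` and the numbered output names are just illustrative choices, not any existing tool.

```python
import mailbox
from pathlib import Path


def mbox_to_eml(mbox_path: str, out_dir: str) -> int:
    """Write every message in an mbox file to its own numbered .eml file.

    Returns the number of messages written.
    """
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    # create=False: fail loudly if the mbox does not exist instead of
    # silently creating an empty one.
    box = mailbox.mbox(mbox_path, create=False)
    count = 0
    try:
        for count, message in enumerate(box, start=1):
            # as_bytes() serializes headers and body without the mbox
            # "From " separator line, which is what an .eml file expects.
            (out / f"{count:05d}.eml").write_bytes(message.as_bytes())
    finally:
        box.close()
    return count
```

Calling `mbox_to_eml("old-mail.mbox", "eml-out")` (both paths hypothetical) would leave one plain-text file per message, which any mail client or text editor can open.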
Best Slashdot Co
One problem with archiving digital communications is the volume. One of the problems that were found during the many Clinton investigations was, when e-mail was subpoenaed, separating the wheat from the chaff. All the mail was backed up onto tapes, which weren't very well marked. And the first searches were done on subject lines. Quite a bit of relevant mail was missed, and turned up years later when people actually sat down and read every message.
The National Archives (here in the USA) is worried about preserving data. The various software and hardware formats used over the years make it difficult to track and retrieve the data. NASA has spent a fair amount of money moving old planetary exploration data from tapes to optical disks, and then to CD. My father worked on a project at DMA (now NIMA) to do the same thing there.
Best Slashdot Co
It is the only thing which has been proven to be usable after a huge period of time.
Or engraved metal or chiseled stone.
The surprise isn't how often we make bad choices; the surprise is how seldom they defeat us.
The main problem being digital records are so much more easily tampered with compared to old paper
Sometimes the answer to your question about how we do X with technology can be found by remembering the history of technology. In this case, what might be a better long-term storage medium than magnetic or optical media is good ol' paper tape. Now, some research should probably be done to increase both the durability of the tape material and the density of information stored on it, but it is the best solution I can think of, and probably the easiest to decipher by archaeologists of the far future.
--
SecretAsianMan (54.5% Slashdot pure)
Washington, DC: It's like Hollywood for ugly people.
There's actually some of this going on today.
I'm a bit fuzzy on details, but a few years ago I heard (from someone who worked in the field) about a project to resurrect old LANDSAT tapes from the 1970s. Someone figured out that the old data would make a great baseline for climate change studies, and the raw data could be processed in ways simply not possible 25 years ago.
The tapes were still around, stuck in a warehouse somewhere. To get them into a readable condition, they had to be slowly baked (in pizza ovens!) to drive out moisture they had absorbed, then scraped with a sapphire blade to...well, I forget why. Scrape off some gunk.
I believe they managed to dig some old recorders out of the scrapheap and get them working with the help of some old hands.
Wish I could recall more details, but that's all I know.
Tom Swiss | the infamous tms | http://www.infamous.net/
You cannot wash away blood with blood
Ever wonder what they do with all those communications? Maybe they can put them in escrow for 200 years :)
..don't panic
Don't steal this idea because I'm going to patent it and make lots of money, but here it is:
Everything2 is great for recording encyclopedic sorts of knowledge. What I'd really like to see is something that is designed just like Everything2, but instead it records *experiences*. Everybody writes experience and event nodes, and eventually we have a living history of everything that ever happened. Sure, a lot of that will be irrelevant, but just think of all the correlations and connections that could be made. Sort of like 6 degrees of separation, but for real life events.
It's 10 PM. Do you know if you're un-American?
No kidding. I'd hate to be in Deja/Google/whoever's shoes, trying to archive useful data, in face of terabytes of "Nude Asian Teens" email generated -- literally -- completely automatically at the click of a mouse button. Especially since the most useful spam filtering methods (outright router blocks, keyword triggers, a bullet to the head of the marketing agent) are frowned upon by nice people.
Paper libraries have a "volume" problem because the media itself takes up so much space, and must be carefully stored. Digital libraries have a "volume" problem because any old jackass can easily create fifty times the amount of information that's worth keeping, and it must be winnowed out by a human.
Just my rant today (cleaning out another twelve spam emails).
You cannot apply a technological solution to a sociological problem. (Edwards' Law)
At a personal level, I am currently denied access to email of my own from as little as 5 years ago. I would save it into files periodically, on whatever shell account I used at the time. But periodically there are non-recoverable file system errors, or shell accounts that just disappear in the dead of night (we'll see a lot more of this if the ISP burnout rate continues).
So forget this problem of losing our digital records as a society, what about losing my personal identity?
I still go back and look at physical letters of mine from 10+ years ago, but email from as recent as 1994 is hard to find. That frightens me, frankly.
Data that is easily destroyed goes hand in hand with data that is easily copied. I think data loss will always be more prevalent with digital media than it was with more conventional ones.
We have two of those sitting around in our basement, plus a few 8" floppies. There's also an old Z80 that might be able to actually use the things. My dad would probably be fine with sending one off to an organization that could actually use them, although I doubt we wanna pay shipping.
My mail server's down and I'm waiting for a new account, you can reach me on IRC at irc.edgeirc.net in #3ddr
If the data is there someone will figure out how to retrieve it. It's one of those human things, just like deciphering alphabets of languages long dead.
Chances are that there will be daunting amounts of information from the digital era. One of the reasons is that the rate of literacy is so much higher now than it's ever been. And if it turns out everyone stored their floppies by the speakers there will still be physical records: books, magazines, microfilm, letters, videotapes...
Hard drives, floppies and CDs are but a few of the media used today to store data.
And finally; History has always been up to the historian. Truth is in the eye of the beholder. Who is to say a false history is not better than the real one anyway?
Sattinger's Law: It works better if you plug it in.
If you've been on the Internet for a while, one thing you can do is to search for your own name (or online handle) on Google.
Of course, finding this can be rather painful. Much of the stuff that got archived for me dates back to when I was a clueless AOLer.
--
Obfuscated e-mail addresses won't stop sadistic 12-year-old ACs.
Win dain a lotica, en vai tu ri silota
...or at least as long as active and caring human society - are no problem.
But you have to get away from the mindset that seeks a "wearever" medium, everlasting standards, and indefinitely available hardware. That is the naive approach.
The word is "living archives". The archivists' work is never done.
The approach that works is just to regenerate all data from media that is wearing out, obsolescent media, and obsolescent standards - before it is in danger of being lost. This must be a constant process of renewal. Since the data is digital, and anyone with the slightest imagination would store redundant copies in physically separated locations, the process is lossless.
So when 3.5" diskettes become well established, and 5.25" diskettes start looking like orphans, you redub everything from 5.25" to 3.5". Then the same thing when CDs overtake 3.5" diskettes. And on and on (I seriously doubt CDs are forever in any sense of the word).
The trick is to know when the time is right each time. I won't minimize the problem. But the watchword is "be conservative".
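The renewal step described above (copy everything off the aging medium to redundant new locations, and make sure the copy really is lossless) could look roughly like the following. This is a sketch under stated assumptions: both media are mounted as ordinary directories, and the names `migrate` and `sha256_of` are invented for illustration. Checksumming with SHA-256 is one way to confirm a lossless copy, not the only one.

```python
import hashlib
import shutil
from pathlib import Path
from typing import Sequence


def sha256_of(path: Path) -> str:
    """Hash a file in chunks so large archives need not fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def migrate(old_root: str, new_roots: Sequence[str]) -> int:
    """Copy every file under old_root to each new root and verify each copy.

    Returns the number of source files migrated; raises IOError if any
    copy differs from its original.
    """
    source = Path(old_root)
    migrated = 0
    for src in sorted(p for p in source.rglob("*") if p.is_file()):
        expected = sha256_of(src)
        for root in new_roots:
            dst = Path(root) / src.relative_to(source)
            dst.parent.mkdir(parents=True, exist_ok=True)
            shutil.copyfile(src, dst)
            # Re-read the copy from the new medium before trusting it.
            if sha256_of(dst) != expected:
                raise IOError(f"copy of {src} to {dst} does not match the original")
        migrated += 1
    return migrated
```

Passing two or more destinations in `new_roots` gives the redundant, physically separate copies; the old medium should only be retired once `migrate` returns without error.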
Twenty years or so ago, the Smithsonian museum had an exhibit about fiber optics that included a working model of Alexander Bell's "light phone" (it mechanically modulated a beam of sunlight) and his original lab notebook (borrowed from Bell Labs' engineering records). The notebook was still legible because (a) the paper was acid-free and (b) the ink was pigment-based. Even though I keep a notebook, it will not be legible in 100 years (perhaps one of my great grandchildren will be interested) because either (a) the high-acid paper will have decomposed or (b) the parts written with dye-based ink will have faded.
The fairly recent PBS documentary on the US Civil War was based in large part on letters and journals written by soldiers using (you guessed it!) acid-free paper and pigment-based ink.
Make tomorrow's history! Write letters and keep journals using acid-free paper and pigment-based ink -- if it's all that survives, it will be the authoritative material on the typical daily life!
Engraved Nickel
The Rosetta Disk
AJ
The Medici project has experts working with fragile, hundreds-of-years-old paper documents. It is conceivable that in the future, there will be similar experts who have special tools and procedures for reading ancient media like CD's. However, IIRC, the lifespan of optical media like CD's is about 100 years. Perhaps future technology will be able to extract data from partially degraded CD's. Historians have always faced challenges in finding data that have been worn away by time. Future historians will be no different.
Don't forget that Friday is Hawaiian shirt day.
There was some talk on another thread about how long CD's would last. Audio CD's, and in fact all CD's that are 'pressed' (i.e. not CD-R's and CD-RW's), should last a very long time. These disks are NOT subject to 'laser rot'. Laser rot was what happened to early 12" laser video disks. Laser disks are two sided, and are made in the same way as audio CD's in that the information is hot pressed onto the plastic, and then aluminum is vacuum deposited onto the plastic to make it reflective. Two of these disks are then glued together. What was happening was that the glue was attacking the aluminum and moisture was getting in between the disks. Better glue formulas have mostly solved this problem. Audio and computer CD's that are factory pressed are single sided. The aluminum is protected by a coating of varnish which serves as the label. As long as this is not scratched the aluminum layer will remain intact and the data can be read. It might be possible to restore a damaged disk by stripping off the varnish and aluminum and vacuum depositing a new layer of aluminum. Not something you can do at home though. DVD's consist of two or four disks sandwiched together; they might have laser rot problems if the glue isn't good....
CD-R's and CD-RW disks record via a dye that changes color and reflectivity with heat from the laser. This dye can destabilize under light and heat. So keeping your CD-R's and CD-RW's in a dark cool place would be a good idea. Also the more they are 'played' the shorter their lifespan might be. So make a backup copy of any CD-R/RW you want to keep. CD-R's might be more stable than CD-RW's.
There's a good review of a Nicholson Baker rant against Librarians in general for their sins of deliberately pulping the paper records of the past 130 years and replacing them with decomposing and badly executed microfilm facsimiles.
It seems that Vannevar Bush's infatuation with microfilm was shared by many in the WW2 OSS community, and this seems to have led to a misguided attempt to replace papers and books with microfilm in the interests of "efficiency".
Da Blog
Of course, the NYTimes, etc, have archive searches as a premium service, but there are just tons of media outlets that don't seem to archive, or if they do, don't seem concerned with letting people get at it. This seems like at least as much of a concern as degrading media: the organization and maintenance of archives in the FIRST place.
This topic is one that is already being seriously considered by librarians and historians.
The USA's Library of Congress Preservation Reformatting Division is digitizing many items for preservation, and you can be sure that they're concerned that the digital preservation will be at least as effective as the original (analog, paper, whatever) form.
One of the current projects of the Research Libraries Group is data preservation. The RLG is an international group formed originally by Columbia, Harvard, and Yale universities and The New York Public Library in 1975, with current members from academia, government archives, public and private sector historical organizations.
A google search on digital data preservation gives plenty more linkage to groups actively looking at the issues involved in digital storage.
Of course, there is still a huge volume of personal and corporate data that will no doubt degrade to dust. For that, we all need to take the approach of wiredog to keep our personal data accessible by refreshing the media as technology advances.
Naturally, since this is Slashdot, all of this has been already covered. This article was a particularly good treatment of the topic and was posted as a followup to an older Ask Slashdot.
Really, how different will it be if the future only has the preserved personal effects and communications of an insignificant fraction of the general population? Today, archeologists make a career out of extrapolating whole civilizations out of building foundations and shards of pottery.
So, with a little care, I'm confident that my own data will be happily accessible as long as I need it. After that, the future will take care of itself.
Mozilla
Did I miss something? What historical records was Clinton accused of altering?
--
The shareholder is always right.
Which OCR software do you use? The one I tried wasn't incredibly accurate. (That was a few years ago, though.)
--
The shareholder is always right.
"As the well-known conservative George Orwell ..."
What?? Orwell was a well-known member of the U.K. socialist party if memory serves. He certainly wasn't very optimistic about socialism as demonstrated by 1984. (I tend to share his bleak outlook about utopian societies and thus I prefer my governments to have a touch of libertarian.)
Maybe you are working from a different definition of conservative (quite possible if you are not in the U.S.)
Kevin
I beg your pardon, but when I was doing research on dietary habits in Early Modern France, someone's grocery list would have been of extreme historical value! Luckily we do have some petitions for aid written to city authorities in which the petitioners detail the household consumption of bread, wine, etc ...
I hate to trivialize this and become just another /. naysayer, but if it's that important they can build a cd-rom drive.
Sigs are awesome huh?
Optical media is not really such a bad option. A useful, self-contained system for playback of optical media could be easily built. If nothing else, carefully preserved schematics for future readers of the media could be stored with it to make sure that if the machine is ruined and the media survives, it might still be read.
The real reason that old magnetic tape is hard to read now is that it was never a great format in the first place. The stuff falls apart. My last employer had an old HP reel-to-reel machine for reading data on tapes from a company we had purchased, but the tapes were so old that the chemicals on the tape itself turned to dust and fell off. This is not a problem with optical storage. Optical storage also has the option of being dedicated in very small spaces, unlike the van sized tape players of old.
Life is also not a big issue with optical media, because just as the books of the Medicis were recopied over and over into new languages and on better bindings, so can data be quickly copied from old optical media onto newer formats.
Agreed. But my point was not that some commoners' writings were important, but that a specific individual's writings generally are not. Commoners' writings are used to understand life for people of that stature, much like old paintings are used. If we had every writing from every commoner, much of it wouldn't even be read (although some CS informaticians might try to find NLP relationships within it).
-Moondog
I'm at a loss to understand why this question is perceived as being difficult to answer. Notice the posting talked of the *ruling* class. Today we look back at history and see people who kept records of their letters. They are usually wealthy and upper class.
The analogy would be to read emails from, say, the white house in 200 years. Do you think the white house is saving their emails? You bet. Do we have lots of examples of (from the general public) letters from 200 years ago? Certainly not as many as there will be emails in the future. Usenet archives, digital backups stored in basements, most emails are being stored two or more times at two or more places. I don't quite understand why someone would think that just because it isn't on paper, it isn't going to keep. We are going to have far more emails stored in the future than we will know what to do with.
As a society we think of ourselves as individuals to be pretty important, but let's face it, for the vast majority of us, no one is going to care in 150 years. With that in mind, the digital age is storing far more records than ever before and the future holds a new paradigm of historical record. I almost lament that I wasn't born 150 years after the advent of the digital age, where high resolution movies will look as good 1000 years from now as they do today.
-Moondog
Perhaps our clearest records may come to us through our own broadcasts, IF, there is a way around C.
If we could sidestep Einstein right now, stationing a probe 40 ly out would get us fantastic coverage of the early 60's. 30 ly out would get us Vietnam. 20 ly out would get us Reagan, and the mullet.
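The arithmetic behind those distances is simply that a broadcast takes one year to cross one light-year, so a listener d light-years out hears whatever was transmitted d years earlier. A tiny sketch, where the observation year 2001 is an assumption picked for illustration and `year_heard` is a made-up name:

```python
def year_heard(observer_year: int, distance_ly: int) -> int:
    """Year of the broadcasts now arriving at a point distance_ly light-years away.

    Radio travels one light-year per year, so the signal arriving
    today left Earth distance_ly years ago.
    """
    return observer_year - distance_ly


if __name__ == "__main__":
    # 2001 is an assumed "now"; the distances are the ones mentioned above.
    for distance in (40, 30, 20):
        print(f"{distance} ly out: broadcasts from {year_heard(2001, distance)}")
```

With that assumed year, 40, 30 and 20 light-years correspond to 1961, 1971 and 1981.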
I'd bet if there IS anyone out there listening we'll be very highly rated, at least for entertainment value. :)
In all seriousness I look for projects like http://www.keo.org to pass down records to far future generations. We have never before tried to think in two generation terms, much less in hundred or thousand generations. We have less than six hundred years of carefully-documented history.
In my eyes part of our growth as a sentient species will involve us learning how to carefully chronicle ourselves for distant-future generations, and how to think and plan in greater than ten year terms.
We still exist very much in the now, as short-lived creatures with even shorter-lived goals -- this may make no sense. My eyes are beginning to cross from fatigue. G'night.
Get off my virtual lawn, you damned virtual kids!
50 years is pretty optimistic for CD lifespans. I've found that after 2-3 years, a typical audio CDR blank will begin to show noticeable degradation and after 5 years is sometimes completely unreadable due to internal corrosion or other faults, even using good quality CDR blanks.
Of course, factory burned CD's last much longer but we're talking about digital archives, and those are typically burned by individuals using some sort of CDR or CDRW blanks.
Ok, so...you have the stuff you want to last the ages on CD, but if you want to read the stuff once "the ages" come, you'll need the instructions to make a CD-ROM drive, no? If the instructions also need to last as long as the stuff that's being archived, simply put the instructions to make a CD-ROM on a CD, and it'll last forever and a day (or 50-odd years as someone pointed out)...simple! So, now you've got your archived material, and instructions on how to make a CD-ROM. To read them, you just need a CD-ROM. All you need to do is build a CD-ROM to read the instructions on how to build a CD-ROM, which is stored on a CD so you can get to your archived material. In order to build this CD-ROM, you'll need the instructions on how to build a CD-ROM. These instructions should be stored on CD for it's good (I know...) archival (not) properties. Luckily, the instructions on how to make the instructions-reading CD-ROM's instructions reading instructions can be found on a CD-ROM (You obviously couldn't write this down on paper...there's no jokes about instructions to read paper to be had), so all you have to do is build such a CD-ROM using instructions that you previously CD-Instructioned Build ROM put on ages last put on dry kept cool place with build CD-ROM
$ Segmentation fault
$ su
# shutdown -r now
Ok...having built (counts on his fingers) a roughly infinite number of CD-ROMs in order to read the instructions on how to build the CD-ROMS so you can get the instructions to build the CD-ROMS (go get a coffee, this could take awhile), you've successfully built the final, archive-reading CD-ROM, you find that it is one of the following:
Completely Plausible CD Contents #1: A Win95 CD with the labeling worn off
Completely Plausible CD Contents #2: That cookie recipe, The Story of Mel (Love the story of Mel...just it makes a good example of things that get passed around a lot), The Tao of Programming, ASCII "Angela", The Halloween Document, a "Herbert Kornfield" editorial, "UNIX Command Line Fun" (bash: %blow: no such job), the "Mad Short DeCSS Implementation of the Moment", et al.
Completely Plausible CD Contents #3: 600 meg of B-rated porn, an "All Your Base Are Belong to Us" mpeg, and the lyrics to "Stairway to Heaven", and the IIS unicode exploit (Windows binary, zipped)
Completely Plausible CD Contents #4: "Slashdot Archive 2453 of 15,000", containing, among other things, the overcaffeinated post of one procrastinating w4nker who really should be studying for his ApSci exam tomorrow instead of wasting everyone's time by prattling on about instructions regarding CD-ROM building Instructions so you can build a CD-ROM to read the instructions to build a CD-ROM in order to get at the instru *manual reboot (a la the BOFH, found on CD #5)* I think I'll go make some macaroni and cheese.
I apologize in advance to my Karma
"These people look deep within my soul and assign me a number based on the order in which I joined" --Homer re:
Likewise, various people are trying to shut down the MAME ROM sites, but a lot of the hardware ROMs are deteriorating now and many of those games, which represent a golden age of creativity and a technical wonder of resource usage, will be gone forever. Kinda makes you sick, doesn't it?
I'm trying to teach myself to set people on fire with my mind... Is it hot in here?
You certainly have a point!
But how long does the actual data on a CD-ROM survive ("real CD-ROM" / CD-R / CD-RW)?
Btw, blatantly assuming you're talking about WordStar for DOS there are good utilities for conversion of WordStar to other, more common, formats here: http://www.petrie.u-net.com/wsdos/pages/downloads
--
Ner lbh sebz gur HFN? Gura lbh'ir whfg ivbyngrq gur QZPN!
Screw it. If cavemen could paint on dirt walls 1.5 million years ago and we can still find and decipher the crap, I really don't think we need to worry. One way or another the populace of the future will be able to learn about the past. The holes in the story will be filed away with the same holes we can't fill now.
Well, considering that some people complain that the "Digital Age" has caused paper consumption to increase, and reports of landfills keeping phone books from the 60's in near perfect condition, it seems like there will be plenty of stuff for the historians to ponder over in the next centuries.
DOS is dead, and no one cares...
If there's a Bourne Shell, I'll see you there
Heh why not just encrypt them and archive them?
Sure... it's not perfect, you need to keep the key safe (or at least the passphrase for the key).
The added bonus is that in a few hundred years when someone may want to add them to a historical record, the encryption key will probably be short enough (by that day's standards) as to take a day or so to break.
-Steve
"I opened my eyes, and everything went dark again"
So like, tell them that you lost it. Then let them try and prove that you didn't, in fact, lose it.
Or better yet.... you can't remember the passphrase, forgot...sorry.
Kind of hard to prove whether someone remembers something or not - especially under all the stress involved in court cases and what not.
-Steve
"I opened my eyes, and everything went dark again"
Hmmmm Company?
I was thinking individuals personally, not at the company level. Each individual should take care of that for themselves.
Besides. Whether the company has use for it or not is irrelevant. The historical record is more important. Hell...
Use public key cryptography and destroy the private key from the outset. JUST keep it as an encrypted archive for the deep future.
Course...then there is the matter of storing it... but better to develop practices first, worry about storage later.
-Steve
"I opened my eyes, and everything went dark again"
A truly wonderful example of this kind of thing are the early works of J.R.R. Tolkien. The early history of the Silmarillion is absolutely fascinating and a wonderful example of the development of a literary theme. That's a work that wasn't published for over 50 years after it was started, but some of the earliest drafts still exist. Because those drafts are available, it's possible to see how it developed. Will the same thing happen when authors write everything in Word and write over old versions every time they change anything? How about if they're still very careful about keeping copies of early drafts but the formats change so much that they can't be read anymore?
Enter VMS, which automatically saves every version of a file, until you manually delete them. If Unix had not wiped out VMS, everybody would have every file they ever worked on.
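For anyone who never used it, the idea can be imitated in a few lines. This is only a sketch of the behaviour, not how VMS implements it: the `name;N` suffix mimics the VMS version-number convention, and `save_new_version` is a hypothetical helper working on an ordinary filesystem.

```python
import re
from pathlib import Path


def save_new_version(path: str, data: bytes) -> Path:
    """Write data to the next free 'name;N' file instead of overwriting.

    Earlier versions are left untouched until someone deletes them by hand.
    """
    target = Path(path)
    target.parent.mkdir(parents=True, exist_ok=True)
    pattern = re.compile(re.escape(target.name) + r";(\d+)")
    versions = []
    for entry in target.parent.iterdir():
        match = pattern.fullmatch(entry.name)
        if match:
            versions.append(int(match.group(1)))
    next_version = max(versions, default=0) + 1
    versioned = target.with_name(f"{target.name};{next_version}")
    versioned.write_bytes(data)
    return versioned
```

Saving `draft.txt` twice this way leaves `draft.txt;1` and `draft.txt;2` side by side, so every earlier draft survives by default.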
Word actually does have a versioning feature which saves every version you worked on if you enable it.
My OS is going to have infinite versioning and journalling capabilities, so you can undo any change you ever made (not just on "file save" boundaries). When VMS was developed the typical hard drive was under 100 MB, and now that they sell 100 GB drives for a dime a dozen, we have the room to save everything. Why current OS have usage models which encourage people to delete everything is beyond me.
Slightly Offtopic. In geological time, regardless of what methodology we use (well pretty much regardless), there will be nothing left. I recall an article once, somewhere (New Scientist or SciAm probably) that went along the lines that if one was to exterminate all human life on earth today, the only evidence we ever lived would most likely be some spikes in the chemical record of the rock strata laid down in our time.
And remember, we still have hundreds of thousands to millions of years of technical advancement to go in the geological time period that forms the strata of this time. All the buildings (all the environmental damage etc) reduced to nothing more than a blip in the chemical composition of rock.
Of course this assumes that we are not around to have a continuous cultural record of history; after all, the dinosaurs lasted for tens of millions of years and we are much more powerful than they.
"The first thing to do when you find yourself in a hole is stop digging."
While I'm all for archiving data for future historical analysis, I think it's fairly certain that IM logs, "how's it goin?" e-mails, and detailed transcripts of #40yearoldsinglebaldguys will not be very useful to historians in three hundred years. Yes, they tell about our culture and practices, and yes they might be interesting, but we don't need all of it to extrapolate those conclusions. There is simply no room to store the vast quantity of information generated on the Internet on a daily basis, and considering the fact that 99.998% of it is of little value, I think that we can safely do without it.
Things are still floating around from the old days. We have Usenet archives from the 80s, and text files from even earlier. We can learn a lot about the culture based on those. Things that grab the public consciousness tend to stick around. They get mirrored, printed out, saved on disk, etc.
Does there need to be a giant warehouse that contains vacuum-sealed printouts of every wise thing said on the internet?
No. No, there doesn't.
Got Rhinos?
The bigger question isn't media, but software. I'm very confident we'll be able to get our files from ISO9660 discs, but I already have a bunch of WordStar and old MacWrite/MacPaint files I can't open and it's only been a decade. We'll be able to retrieve the raw data, but will we actually be able to interpret and make use of it?
Well, there are two issues here. One is keeping a readable copy of the software, the other is being able to run it. Since most software programs are used by large numbers of people, it seems likely that someone would have the foresight to keep a copy of the software to interpret the data along with the data itself. Running it also shouldn't really be a problem for future generations. Presumably, someone will have a copy of the specs for the architecture for which the software was written, and an emulator can be created. Of course, if the software's source is available, it would be even easier.
Also, reverse engineering a data format isn't that hard anyway. If you looked at the raw data of your MacWrite files, I'm sure you'd find your text in ASCII somewhere, possibly with embedded formatting information. Non-textual data is more difficult, but still possible, particularly if you have some fragments of information about the data format to go on.
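For what it's worth, here is a rough sketch of that "look for the text in the raw data" step, in the spirit of the Unix strings tool. The function name and the 4-character minimum are my own assumptions, and it knows nothing about MacWrite's actual layout; it just pulls out runs of printable ASCII. Old Mac files often use carriage returns and Mac Roman characters, which this deliberately ignores.

```python
import re


def extract_text_runs(raw: bytes, min_len: int = 4) -> list[str]:
    """Return every run of printable ASCII in raw that is at least min_len long.

    Hypothetical helper: it does not parse any real word-processor format,
    it only finds byte ranges that look like plain text.
    """
    # 0x20 (space) through 0x7e (~) is the printable ASCII range.
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    return [match.group().decode("ascii") for match in pattern.finditer(raw)]


if __name__ == "__main__":
    # Made-up sample: text fragments separated by binary formatting bytes.
    sample = b"\x00\x01Dear Bob,\x00\xffsee you Monday.\x00ab\x00"
    for run in extract_text_runs(sample):
        print(run)
```

Short runs like "ab" are dropped on purpose, since random binary data produces plenty of two- and three-character false hits.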
I have been trying to collect some old software
from CP/M-80. A *lot* of it is lost, perhaps
forever. Specifically, there was the Whitesmiths
C compiler (written by PJP), but he hasn't got
it, and the company that currently owns the
copyright may no longer be able to read the
mag-tape backup. This software may now be lost
forever. People and companies should give old
material like this to digital archives, and these
archives should be available on the 'net for
research purposes. *Before* the media disintegrates. Unfortunately, IP ownership makes
it unlikely that sources etc. will be freely
published, and by the time that the IP is
worthless, the company is likely to be gone as
well. When will we see Windows 1.03 and its
source published? Probably never (but thanks
to Caldera for publishing source and images
for CP/M, GEM, etc. That *is* the right thing
to do).
Ratboy666
Just another "Cubible(sic) Joe" 2 17 3061
Just kidding. I seem to remember another piece on this. Basically, it came to what storage medium had the longest life. Microfilm was out as film degrades. I seem to remember CD's being ruled out as well. Apparently they don't last as long as I thought. I can't seem to remember what the preferred medium was though.
http://www.gslis.utexas.edu/programs/pcs/.
Nice things are nicer than nasty ones.
the only thing worth preserving is the high quality pr0n... It's a lot more interesting than reading what some obscure 300 y/o programmer named Linus had to say about some equally obscure "MACH microkernel"....
Perhaps we can still use the same technique to solve the data archiving problem: Just broadcast all our data into space. To read it, all we need to do is invent FTL drive, pop out to the right point in time and read the data as it goes by.
I'm sure we could find other uses for the FTL to help recover the R&D investment.
-Eldurbarn
This is probably as good a place as any to mention the Dead Media Project ...
--
The real Captain Avatar is a fictional character, so I suppose he doesn't mind if I impersonate him.
Someone should just hook up a daisy wheel to the organic AI the CIA uses to read the world's email, and have the real Kevin Mitnick (not the DOJ's PR department's half-baked simulacrum) stack it all nice and neat on the secret underground continent they're using as an alien petting zoo.
I thought that was what things like Echelon and Carnivore were for????
Ok my karma is maxed out. When do I become Enlightened?
No, the trick is that a picture is worth 1000 words. Since graphics usually compress worse than text (limited dictionary)
The latest wavelet compression techniques can compress a good-sized color image to 8 kilobytes, or the size of a thousand English words plus light markup.
Will I retire or break 10K?
On July 20, 1969, Neil Armstrong was the first man to walk on the surface of the moon. Here is a picture, in an open, documented graphics format
And the format is called ASCII art. Just use this simple program to convert your 1-bit .bmp format images to images made of standard ASCII characters.
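The program linked above isn't reproduced here, so this is only a guess at the idea, not the poster's code. It assumes the 1-bit image has already been decoded into rows of 0/1 values (parsing the .bmp header is left out), and the function name is made up.

```python
def bits_to_ascii(rows, on="#", off=" "):
    """Render rows of 0/1 pixel values as lines of plain ASCII characters.

    Hypothetical helper: rows is any iterable of iterables of bits, one
    inner iterable per scanline, top scanline first.
    """
    return "\n".join(
        "".join(on if bit else off for bit in row) for row in rows
    )


if __name__ == "__main__":
    # A tiny made-up 5x3 bitmap.
    picture = [
        [0, 0, 1, 0, 0],
        [0, 1, 1, 1, 0],
        [1, 1, 1, 1, 1],
    ]
    print(bits_to_ascii(picture))
```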
Will I retire or break 10K?
You just don't want to accept random binary data that you would have to retain a reader for as well.
If binary is the problem, uuencode is the solution.
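A minimal sketch of what that looks like, assuming Python's standard binascii module (b2a_uu and a2b_uu, which work on at most 45 bytes per line). The wrapper names, the default file name and the 644 mode are placeholders of mine, not part of any archive spec.

```python
import binascii


def uuencode(data: bytes, name: str = "data.bin", mode: int = 0o644) -> str:
    """Wrap arbitrary bytes in a classic uuencode envelope of plain ASCII."""
    lines = [f"begin {mode:o} {name}"]
    # b2a_uu accepts at most 45 input bytes per call, one output line each.
    for start in range(0, len(data), 45):
        chunk = data[start:start + 45]
        lines.append(binascii.b2a_uu(chunk).decode("ascii").rstrip("\n"))
    lines.append("`")    # zero-length line marking the end of the data
    lines.append("end")
    return "\n".join(lines) + "\n"


def uudecode(text: str) -> bytes:
    """Inverse of uuencode() above; assumes exactly the layout it produces."""
    lines = text.splitlines()
    body = lines[1:-2]   # skip the "begin" header, the "`" line and "end"
    return b"".join(binascii.a2b_uu(line.encode("ascii")) for line in body)


if __name__ == "__main__":
    original = bytes(range(100))
    encoded = uuencode(original)
    print(encoded)
    print("round trip ok:", uudecode(encoded) == original)
```

The output is about a third larger than the input, which is the price of sticking to printable characters.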
If proprietary formats are the problem, then documented, unencumbered formats such as PNG, JPEG, FLAC, and Ogg Vorbis are the solution. Just make sure to archive documents (such as ISO and IEEE standards) that can be used to create a reader.
Will I retire or break 10K?
The bigger question isn't media, but sof[t]ware. I already have a bunch of WordStar and old MacWrite/MacPaint files I can't open... will [we] actually be able to interpret and make use of it?
For older formats, you can always emulate the computer for which the viewer software was designed, or write a new viewer from the format documentation. For example, QuickTime 4 can open MacPaint files, and so can a short C program I wrote. Remember, if you want to archive something, make sure you have the format documentation (or the viewer software and the architecture documentation) so that future generations will be able to create a usable viewer. (IEEE and ISO standards are Good Things[0].)
About five years ago I still had an old floppy controller with an odd WD chip on it that could talk to it using OS-9.
So install Mac OS X (the successor to Mac OS 9) on your machine and read that floppy.
Oh, you were talking about that OS 9.
[0] GOOD THING is U.S. Trademark No. 75,516,347 registered to Martha Stewart Living Omnimedia LLC. (Look it up at TESS.)
Will I retire or break 10K?
LTSS 1.0 could support WAV, MP3
s/MP3/Ogg Vorbis/
GIF
s/GIF/PNG/ because PNG is better documented and supports 24-bit color and alpha transparency. You partially address this with
TIFF
but s/TIFF/PNG/ because even without TIFF's LZW codec, TIFF is much larger than PNG and not as well standardized.
Text/ASCII
Non-European language advocates would complain.
Text/Unicode
Better. Thank you. This solves the script issue, but in what natural language would information be stored? How is it a valid assumption that future generations can read format specs written in US English of A.D. 2001 or in UK English of A.D. 2001?
HTML version whatever
Make sure it's run through W3C's HTML Validator if you want to archive it. MSHTML is a Bad Thing.
and perhaps even Java for interpretation of abirtrary [sic] file formats.
The Java(TM) language does not have the wealth of alternative implementations that the C language has. Both are nearly Turing complete (full Turing completeness requires unbounded storage) and equally fast when compiled to a native instruction set.
Will I retire or break 10K?
we'll put them on giant monolithic blocks, etched in binary (the universal language) and bury them in dry ice-caves on mars. when mankind is ready, they will find them.
The REAL sam_at_caveman_dot_org is user ID 13833.
2315 AD: It would appear that the entire society was obsessed with "NAKED HORNY CHEARLEEDERS WET AND WAITING FOR U!!!!!!!!!!", "online casinos", messages from some person named "bounce@" and worshipped a deity called "Viagra". No wonder they vaporized themselves.
What do you mean? No president before Clinton would ever do something so terrible as to manipulate records to try to hide his guilt. Honest Dick Nixon would never have stooped so low as to, say, erase 18 1/2 minutes of an incriminating taped conversation in order to cover his ass. Evil Bill Clinton is a complete departure from the behavior of every other president in the history of this fine nation.
There's no point in questioning authority if you aren't going to listen to the answers.
This stuff has always been volatile. We have a fraction of the historical data we would like to have from any time period. Yes, the letters of the Medici are still around and available; similarly the correspondence of the major players of our time will be archived (either electronically or in hard copy. Probably both.) The letters of the common man were as often discarded in times past as e-mail is today. Some of it will no doubt still be around (just as the data on many of those eight inch disks still survives on more modern media today), but the vast majority will be lost. This is fine, especially since there is a finite amount of data that historians can analyse anyway. Generally speaking it is nearly impossible to tell what will or will not be historically significant from the point of event origins anyway. I would venture to say that considering the level of literacy in our culture today, and the varied data storage mediums available, historians will have far more data from our time than current historians have from any time before World War II.
I don't need a million points of light, just two points of multi-mode fiber and a 10 Gig-E router.
With Raptor, the NSA, and other intelligence gathering organizations.
The trick will be recalling the data from those organizations.
DanH
Cav Pilot's Reference Page
UNIX - Not just for Vestal Virgins anymore
Paper holds up? Are you kidding? Paper can get wet, burn, be torn to shreds, ingested and colored on. Embossed metal would be good. (not to mention it makes a nifty Photoshop filter.)
--
Wooden armaments to battle your imaginary foes!
Broadcast everything important into space. If we ever need it again, we just zip out along the transmission wave at relativistic speeds, until we get to the bytes we want, slow down and read them, then zip back home. M@
Krispy Cream is people
I've heard of one researcher studying the history of diseases. He'd go into the rare-book rooms of libraries, open each book, stick his nose in and take a big sniff. Apparently during one particular epidemic period, people used a lot of vinegar to sterilize things - including books. By smelling each book, he could work out where & when the disease was passing through via the lingering odors of disinfectants (even after a few centuries). [Don't recall all the details, but that's the gist of it.]
Every time source media is lost, we lose more information than just the alphanumerics on the page.
Can we get a "-1 Wrong" moderation option?
Right now, the NSA is reading and cataloging all of our private e-mails -- there will be records of everything we say for generations to come!
"Grandpa, what was a EULA?"
If it ain't broke, it doesn't have enough features yet.
Archival quality paper is really the safest bet for any information that can be converted to this form. I've heard countless anecdotes about the strength and resilience of paper - capable of being reconstructed even from the ashes of some fires! (believe it or not, the CIA puts its burned documents into acid to corrode the paper ashes so they cannot be recovered.)
An excellent resource to learn more..
I strongly agree with AC's argument. But forms of government are a really bad example. How many Americans have any understanding of how their government works? Even those who have taken the time to study it (mostly naturalized citizens, who are required to know something about this stuff, unlike "real" Americans), mostly just read the Constitution and related documents -- which have roughly the same relation to actual government as physical chemistry has to cooking.
Feudalism is an even worse example. The word, in its modern sense, was first used by French revolutionaries, to describe the aristocratic regime they had just overthrown. (Before that, it was a legal term, applying to a certain kind of property law.) Since then there have been endless redefinitions of the term, all of them pretty conflicting.
A better example would be based on simple cultural icons. In 500 years, how many people will know that Neil Armstrong was a real person and Luke Skywalker wasn't?
__
Yes, a lot more information will be lost in the digital age than in previous times. But an awful lot more will be preserved, too.
The archeological record always seems to improve with technology. From stone etchings to written scrolls to printed matter and photography and on into the 20th century, the more technology people had, the better record they left of themselves.
What's my printer used for again?
Come on... we still use paper now-a-days, and anything important that was on that 8incher is on a harddrive or tapedrive (otherwise it wasn't important enough to keep). There are still books of info and stuff. When we get into digital books and remove paper from society entirely... that's when to ask this question.
Good quote, too many chars. Seriously, the slashdot 120 char limit sucks!
... too busy to record itself.
-Vic If you can't figure out my email, then don't.
Come on people. We're producing enough information today to keep the entire globe occupied with archive digging. Important stuff, like science articles, great literary works, etc. will be preserved because they're always transferred to new media as it comes along. The rest of it is mostly crap no one would care a flying fig about in 2060.
A penny for your thoughts.
A witty
We're now entering the new digital dark age, where all records are digital...
Old time radio and vintage porn would be gone by now but for the internet. Moore's law and the ever increasing size of hard drives have allowed most of us to keep ever larger collections of junk; some of it will be historically valuable. Is it really possible to listen to a 10gig mp3 collection? I'm sure no one would delete it for fear of wiping out the good stuff. 50 years from now we will probably have that mp3 and porn collection on a very tiny corner of some optical storage cube along with a bunch of old my document folders and /home directories.
To the contrary.
Historians are attempting to decipher burned scrolls written by everyday Roman citizens found buried in ash near Pompeii.
Letters from common Civil War soldiers are regularly read and studied.
Conformity is the jailer of freedom and enemy of growth. -JFK
Ok, I'll buy that.
I would point out that different groups of common people (eg. soldiers, businesspeople, farmers, etc) would be interesting.
Conformity is the jailer of freedom and enemy of growth. -JFK
You occasionally see old clips on (M)TV.
Still I agree that it's hard to find videoclips of many groups. Some big artists (e.g. Madonna, ZZtop) have Best Of DVD's on the market but if you want to find that memorable clip from 1985 from a one hit wonder you're generally out of luck.
Being of the videoclip generation I have better memories of some artist's videoclips than of their songs. It is a shame that the music industry doesn't cater to this need. But then they don't for many other needs in the market as well, like the one for cheap downloadable music.
First off, I would imagine the advanced peoples of the distant future will be able to figure out our primitive programming languages and bulky, clumsy storage mediums. Perhaps they will even be assisted by intelligent machines on which they rely completely but know nothing about. (see article http://slashdot.org/article.pl?sid=01/04/10/0436220&mode=nested)
On the other hand, if the incredibly advanced peoples of the future are somehow unable to interpret our digital storage mediums, it WILL save them from having to look at all of the utterly USELESS information posted on this site. Not to mention the banner ads and prOn which dominates the internet, the commercials on TV, DVD, VHS, soon-to-be video games and so much more.
They might believe A0L or Micro$oft ruled the world!
It wouldn't take much of all of this "information superhighway" for them to see why we splattered ourselves all over the evolutionary canvas I think...
Just one useless site/.
"You are not a beautiful and unique snowflake."...Tyler Durden
Related to this, check out the Bibliotheca Alexandrina, which is the project by the government of Egypt to rebuild the Library of Alexandria, but updated for the modern age (digital archives, etc):
Unesco Site: http://www.unesco.org/webworld/alexandria_new
MIT Tech Review Article: http://www.technologyreview.com/magazine/apr01/jenkins.asp
From the Tech Review Article:
-roop
To do so, they are using a new computer language called eXtensible Markup Language, or XML. It is a way of marking up electronic documents with easily understood tags instead of coding dependent on what will some day be obsolete software.
Naturally, NARA's main focus is the archiving of documents that are mainly of historical significance to Americans.
When I thought there might be a demand for Cobol (Y2K work) I started to collect cheap old Cobol books that were $1, $.50, $.10, or free (I went all out and spent $8 once). It's a fun, cheap hobby, and a lot of the old mainframe documentation isn't on the Internet (or is too hard to find). I live just outside of Cambridge, MA, which makes it easier, but think about it, especially if you're around the Valley (maybe SF would be better for buying old books???).
Maybe they could be called Abandon-Docs??
Just because a large bit of info on our culture may be lost doesn't mean it will all be lost. Sure, a lot of relevant stuff is stored digitally, but a ton of information of every kind is available on paper. If future historians want to know about our culture, let them dig up our old books and newspapers and magazines; they'll learn incredible amounts about us. And if they want to know about our digital culture, they can still hit the books. It's all on paper, somewhere.
Stupid like a fox!
I work at a university library as a 'technical specialist' (glorified technician), and recently sat in on a meeting involving how libraries are (and should be) archiving data via digital media. The long-run case is simply this: digital media saves space, and keeping up with a good five year plan keeps the data available, but it is expensive.
;)
Basically, the five year plan means rotating the data from one media type to a new media type every five years. Although computers change from day to day, the methods of data storage and retrieval remain approximately the same within a five year period. As long as the data is migrated every five years or so, it is always available. The price of keeping the data in this state of never-ending movement would be fairly static, since once a new method of storage comes of age and becomes a standard, it is pretty cheap. The real price comes from manpower. Which... could be solved by spending some time developing a software system that could be altered, on command, to handle the new media... Enter Linux!
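To make the "software system" part a little more concrete, here is a rough sketch of one migration pass: copy every file from the old medium's mount point to the new one and check each copy against a SHA-256 checksum. This is not the system discussed at the meeting; the function names and the idea of returning a manifest are my own assumptions.

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large archives do not need to fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def migrate(old_root: Path, new_root: Path) -> dict[str, str]:
    """Copy every file under old_root to new_root and verify each copy.

    Returns a manifest mapping relative paths to checksums, which can be
    stored alongside the data and re-checked at the next migration.
    Assumes new_root is not located inside old_root.
    """
    manifest = {}
    for source in sorted(p for p in old_root.rglob("*") if p.is_file()):
        relative = source.relative_to(old_root)
        target = new_root / relative
        target.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(source, target)  # copy2 also keeps timestamps
        checksum = sha256_of(source)
        if sha256_of(target) != checksum:
            raise IOError(f"checksum mismatch after copying {relative}")
        manifest[relative.as_posix()] = checksum
    return manifest
```

Keeping the manifest from one pass around means the next pass, five years on, can also tell whether the old medium has quietly rotted in the meantime.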
I could keep going on and on about this beautiful system, but I grow weary of trying to remember all of this stuff, and typing it, and looking like I am still working on something useful!
Besides, data formats are nothing, historians have decoded long forgotten scripts and languages which no-one speaks anymore. I think it will be comparatively easy to get at the files on a CDROM, 500 years from now.
They'll just put the thing into some sort of a 3-D scanner and work on the computer copy... "Oh look, these dents are 1's, and those are 0's and they write them in a spiral." Sure it may be tough work (file system, data formats), but they'll also have very sophisticated technology to analyze these things. They might just have to click on the "unknown media wizard" and get all the files. ;)
Another problem is deteriorating media, but on a historic level I don't think it matters much. Current data recovery companies can do amazing things already: restoring hard drives from totally burned-out PCs, or restoring data which has been overwritten multiple times.
It's one problem to keep your data so you can readily access it in 50 years, but I think on the scale interesting for historians we have no problem at all.
It seems almost impossible for things to disappear from cyberspace these days. There's an incessant number of mirrors and archive servers and the like floating around to keep just about everything.
Example: don't you hate it when you're searching your favorite search engine and you keep getting old and completely useless Usenet posts? That stuff will never disappear...
"If at first you don't succeed, lower your standards."
Or do things just not deteriorate in space?
The Internet is generally stupid
and store the box somewhere safe.
take your sig and shove it
In the future, after the oil and gas run out, you won't have electricity, so digital information is lost and CDs return to their natural state as drink coasters. Paul. (see http://www.dieoff.org/page1.htm)
Given the pain in the ass that reading clay tablets is at the moment (since our command of Sumerian is limited, our command of Hittite is even more limited, and our command of Eblaite is heavily dependent on our already-limited command of Sumerian, the only clay tablet language we can read fairly well is Akkadian, and even this is not fun), I honestly don't want to find myself in the position to have to read some ancient 400kB Mac GCR disks with Nisus Writer 1.0 files with text in Ethiopian on them.
As a state gets corrupt, its laws multiply; the most corrupt states have the most numerous laws. (Tacitus, Annales 3:27)
BTW I'd be grateful if you could contact me and give me some details about the Egypt-related project you were working on.
As a state gets corrupt, its laws multiply; the most corrupt states have the most numerous laws. (Tacitus, Annales 3:27)
Yes, but we don't need the records of every small business in every country in every century. Just some sampling will do. We lose information but it's a tradeoff for space and conservation work. The same about modern data.
You evidently have very little experience with historical work. Anywhere there was a "tradeoff" in the past historians nowadays curse those who were responsible. Data is invaluable. Reconstruction based on samples is worth very little when compared to reconstruction based on actual data.
As a state gets corrupt, its laws multiply; the most corrupt states have the most numerous laws. (Tacitus, Annales 3:27)
I hope your lpr is not connected over the network, otherwise you'll have a nasty recursion problem and will run out of paper eventually.
As a state gets corrupt, its laws multiply; the most corrupt states have the most numerous laws. (Tacitus, Annales 3:27)
I just deleted my life diary for Quake, well worth it.
I'm running into these problems now. My ancient software collection from the mid 1980's is sitting on 5.25" floppy disks. The low density ones at that.
...
.. it might be interesting in a few years.
Modern machines don't have the drives. Older drives are worn and potentially flakey. And the media is aging and suffering from bit rot. (I've had four read errors in about 120 diskettes.) And the media hasn't been made in almost 10 years.
I'm using 'dd' to make images of the diskettes and I'm going to burn the images to CD. The copy-protected diskettes are a real problem though; my old copy program (COPYIIPC) doesn't work on newer hardware, and even if it did, it would make another floppy, not an image I can burn to CD. Teledisk might work
I can't even imagine trying to do this with 8" floppies or older tape formats. Most of this data is of little worth now
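For anyone doing the same kind of imaging, here is a small sketch of the block-by-block copy, roughly what dd does with conv=noerror,sync: unreadable blocks are filled with zeros and their offsets are reported instead of aborting the whole copy. The function name and the 512-byte block size are assumptions of mine, it relies on os.pread (Unix only), and it does nothing for copy-protected disks.

```python
import os


def image_device(source: str, image: str, block_size: int = 512) -> list[int]:
    """Copy source (a device node or file) to image, one block at a time.

    Blocks that cannot be read in full are zero-filled in the image and
    their byte offsets are returned, so flaky sectors are at least known.
    """
    bad_offsets = []
    fd = os.open(source, os.O_RDONLY)
    try:
        size = os.lseek(fd, 0, os.SEEK_END)  # total size in bytes
        with open(image, "wb") as out:
            for offset in range(0, size, block_size):
                want = min(block_size, size - offset)
                try:
                    block = os.pread(fd, want, offset)
                except OSError:
                    block = b""  # treat a read error as an empty block
                if len(block) < want:
                    bad_offsets.append(offset)
                    block = block.ljust(want, b"\x00")
                out.write(block)
    finally:
        os.close(fd)
    return bad_offsets


if __name__ == "__main__":
    # Hypothetical device and file names; adjust for the actual drive.
    bad = image_device("/dev/fd0", "disk001.img")
    print("unreadable blocks at offsets:", bad)
```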
While we may think e-mails, IM logs, websites, and other forms of saved conversations between people are useless to us and future historians, we are forgetting about the sociologist or the psychologist who would find this information interesting.
It will be one of the best ways for people in the future to understand about our culture and our society. The things we do now reflect much of who we are and what we find important.
But here is a question: what if Slashdot (or another website) were to shut down or simply stop running? What happens to all the information that was accrued over the years? Is it all to be lost with a simple fdisk command?
I remember, back in my younger years, a radio station called WDRE which had a budding web site and many loyal listeners in Philadelphia. But, like all good things, they were bought and their format changed. The website no longer exists except for the archives on my computer.
But how many other websites have faced a fate like this? To be forgotten (except by a few) or replaced with another site.
The economy may be built on computers, but rest assured social record-keeping is not. For important documents, and permanent information, paper copies are still much preferred over their electronic cousins.
Think about the last time you read a novel on your laptop, instead of picking up the book. And the last contract you signed? It wasn't on those digital pads you find at Best Buy for signing receipts. Paper is still king, and it will be for years. It never gets obsolete, and it lasts just as long as anything else we have.
A new year calls for a new signature.
Given that scholars can still read clay tablets, I don't think that there will be problems reading any kind of computer media. Sure, it will be a matter for specialists. Basically it will be a new academic speciality: someone who can read old floppies and CD-ROMs.
Regards, Martin IT: http://methodsupport.com Personal: http://thereisnoend.org
Even two or three hundred years from now, a reasonably skilled technician or at worst a team of them will be able to dig up a CD mechanism from somewhere, fix it up and get it reading data. CD mechanisms are like Ford's Model T -- only much more common -- and let's face it, there are still a reasonable number of Model T's running around to auto shows, and there isn't nearly the historical incentive to keep a Model T running that there is to ensure that there will always be a CD-ROM reader running somewhere.
And it's likely that if most people are like I am (I value my data and my work) they will continue to migrate data to new formats as they emerge.
The bigger question isn't media, but software. I'm very confident we'll be able to get our files from ISO9660 discs, but I already have a bunch of WordStar and old MacWrite/MacPaint files I can't open and it's only been a decade. We'll be able to retrieve the raw data, but will we actually be able to interpret and make use of it?
P.S. I still have an old Siemens 8" floppy drive, single-sided, hard sector. About five years ago I still had an old floppy controller with an odd WD chip on it that could talk to it using OS-9. No way to talk to it with my Linux box, though...
STOP . AMERICA . NOW
Words are great but their meanings can change over the years. Being able to hear words and see the situations in which they are used can give a much better overall representation of how society really was. (Granted you need to be selective with what you watch. ;)
Willy
(And since it *is* the *digital* age that concerns us, I've got no way to tell you why.)
668: Neighbour of the Beast
Get all the world's governments to fund a special project:
Build a nuclear-safe room deep inside a mountain. Make sure it has no rust-inducing environment. Use as many servers and storage devices as needed for storing all encyclopedias, government material and other useful documents / images / videos of today.
Make generators which will start at the press of a button, with a sign that says PRESS HERE TO START in all the world's languages.
When the button is pressed, everything starts and the large screens on the wall display a menu which allows you to select a language for the user interface.
The purpose of writing the interface / messages in every one of the world's languages is to make sure someone can understand it a couple thousand years from now.
Imagine the archeologists from then finding this monstrous cave, starting the generators and getting access to information previously thought long gone.
I think it's a nice thought, and don't flame me saying this is impossible. I know it probably is far-fetched :)
Thanks to lawyers most of the organizations I've been a part of have a policy of shredding all paperwork after 7 years. Not a lot of history there.
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
~~ the real world is much simpler ~~
--- -- - -
Give me LIBERTY, or give me a check.
If you really believed what you were saying, you would believe that the modern-day Bible was a pack of lies, the product of thousands of years of manipulation of god's law by the elites. This would explain why there are so many contradictions in the Bible.
You don't believe what you wrote, it's just a convenient way to attack the Clinton Administration.
Which raises the question, are you Satan?
My other sig is extremely clever...
One of my projects, deferred for a couple of years, was something I called Interhist, the Internet Historical Preservation Society. I didn't have time to make it go back then, and never got enough interest from others, so we let the domain lapse... The 'what' of the project was to preserve ordinary documents (e-mail, Usenet postings & such) from ordinary people. The intended 'how' was to simply use live storage (i.e. hard disk) mirrored at a couple of locations, with a few standard forms for people to specify the terms on which we held their docs - i.e. don't release until I die, etc.
The 'why' is something I'd like to add to this thread. Specifically, I've read on a number of occasions of valuable historical finds that provide in detail a picture of ordinary life in a given historical era. Think Pompeii - what's always cited as so incredible is the slice of everyday Roman-era life that was preserved under the ash flow nearly intact. This led me to form the opinion that future historians would indeed find the records of ordinary people to be valuable, and therefore they are worth preserving now.
If anyone is interested in trying to help revive the interhist project, write me at scott@zorch.sf-bay.org (as soon as my home e-mail is fixed... :-( ).
I work for a certain government Agency (no, not THAT one, mine's on the mundane side) and am involved in the transition from paper-based to electronic record keeping. The transient nature of the storage media is certainly an issue, although I believe NARA (National Archives and Records Administration) advises folks to back up disks with paper. Verifying the integrity of a record is also an issue, one that comes into play mostly when a record is referenced in Federal Courts... For more info, try NARA's website http://www.nara.gov/records/index.html
Some of us have fallen in love with the notion of giving without reserve. -- Raoul Vaneigem, The Revolution of Everyday Life
I laugh when one of them prints his email, handwrites a response onto it, and then faxes it back to the origin. I guess the joke's on me.
Here lies one whose name was writ on electrons.
-Styopa
It's funny how a negligible quantity of written archives carried such incredible historical value until the 19th century. Archiving meant something back then: not much noise on the signal. In our wazzaaaa culture though...
It's just that those who sing out of tune have a heart too
If you use a film camera and throw the negatives and prints in a shoe box they will last almost forever
This is true of the hundred-year-old Bradyesque B&W's you mention, but the chemistry of color snapshots taken over the last fifty years makes them substantially less stable -- something to do with the organic dyes they use. Ever wonder why that old Kodachrome snappy of Grandpa from 1965 has that awful pink tinge? It'll only get worse, until eventually it's an unrecognizable blob.
However, the older B&W stuff will just get a little yellowish. Or "sepia" if you prefer.
Because, unless you handed over the key upon demand, then not only would you be guilty of obstruction of justice, but also contempt of court for failing to produce documents during discovery.
MacOS, Windows, BeOS, GNOME, KDE: they're all just Xerox copies
If I recall correctly, many attorneys are now advising clients to proactively delete archived email and other correspondence stored electronically, so that in presumed future legal actions the discovery process won't turn up incriminating evidence in the defendant's files.
The deletion, apparently, if prescheduled on all documents, doesn't constitute obstruction of justice, whereas conscious destruction of only selected material may be construed as obstruction.
Part of the problem in maintaining a useful archive into the future is storage media, but a bigger part is the attitude that we should be afraid to allow our routine communications to be stored permanently.
Oh. And by the way, IANAL.
Apparently, George W. was an inveterate user of email right up until the inauguration. At that point, he sent a farewell missive to his correspondents, in effect saying he could no longer use email because all such correspondence would be a public record and he didn't want his private musings made public.
So, no, many important communications will not be retained, unless someone is placing a wiretap on the president's phone.
"Obtuse Anger is that which is greater than Right Anger" - Lewis Carroll
The books which teach us all (students, hobbyists, workers), give good samples of what our life today is like. In both terminology and content, the books will be around for a long while in one form or another.
http://www.textfiles.com/ Enter Silly comment here.
Guess I'll do what I did when I pulled the 5 1/4" drive: Grab all the old 5 1/4" diskettes and move the data to the hard drive, burn a CD and away we go. So, right before you get rid of your CD drive forever, pull all your CD's and copy the data to whatever the new great media-of-the-year is.
Oh, and avoid the "no-copy" media.
My family has a central data server on which all of our data and backups of our programs are stored. It uses a nice IDE RAID system to ensure that our 40+ GB of files are safe (although this includes a library of 122 program install sets, such as Office, Windows, Photoshop, etc.). Our USER directory is over 21 GB of hard data files, which includes financial records, Office app save files (Word, Excel, PowerPoint, Access), image files (no porn, you dirty-minded geek!), and email. Also backed up are our web pages and other important information.
I foresee a time when almost every home has its own central server, and the data on such a server is, by design, self backing up. A family with such a server would never lose copies of their records, as long as a fire or something didn't kill the server.
Since the server in my house has so much important data, I also have a hard drive copy in my bank safe deposit box, wrapped in magnetic shielding (just in case). I update the backup once a year, which isn't very often, but better than nothing... This is something I doubt the average family will do in the future, but as online backup increases in durability, the backup will be held on internet servers instead.
And I use a different hard drive on each backup, BTW....
Basically, modern data media is dynamic, not static. The information has to be re-copied and compressed regularly or it is lost. I had a case of this before we went to the server: I lost all my AutoCAD files, which HURT!!! I had years of files just erased without a way to recover them. Can you say ouch?
Plug for my website:
Wireless LAN Hardware and Systems!
Network over an area of 15 or more miles!
www.techsplanet.com/wlan/
However there are some things which need preserving for posterity :
This puts me in mind of the Deja-News archives now owned by Google..
What will become of them?
-- Paris. Not the City
But what an impression!
The advent of digital media simply presents a new storage concern for the same old problem: archival. What do you archive? Same as always, what you find important. Whether this includes things one finds trivial today that might be priceless to an archeologist tomorrow is irrelevant. What do you consider priceless in your life? Bank statements? Your diary? How long is long enough? Your average person knows and cares nothing about archival, and concerns themselves essentially not at all with true long-term storage. Why should they, after all? Their data is probably far less long-term critical than the government's.
Regardless of data source, however, the storage medium is another issue. Anyone who thought magnetic floppy disks would be a storage format for future generations was just plain senseless and knew nothing of basic electricity and magnetism. CDs are certainly more durable, but don't forget they're just a cheap little slice of foil bonded to a slice of plastic. What is the lifetime? Debatable. Care to lightly scratch the top of a CD and see how long it lasts? By far the most durable common system is MO (magneto-optical), which requires a magnetic field and a laser to shift bits, and has at least a 30-50 year lifetime. The US Govt. has been archiving to MO for years. But these drives too (if you can even find them today) will be obsolete soon and new technologies will be unable to read old disks.
What is the solution? Good old printed microfiche? It's got a much longer lifetime, but far less storage density than new digital options (I don't have time to call Anacomp and get the latest numbers). Paper output to paper? Doubtful. The bottom line is to filter your storage needs down to something that can be regularly shifted to "upgraded" storage media as that space becomes cheaper. Near-line tape-to-tape or EMC Symmetrix-style solutions that get cycled regularly to new technology would seem preferable.
Choose your data carefully, and archive it to multiple locations. Unfortunately, reality is that people will continue to store to floppy, take Polaroid photos and print dye-based inkjet digital solutions. We'll see what those memories look like in ten years... /g
Tell my mom. She's good at remembering useless details that nobody cares about and explaining them to anyone who listens. Plus she was born before the advent of the telephone.
Hey freaks: now you're ju
Actually, paintings do deteriorate due to viewing, and quite quickly. Photons bombarding the pigment cause the colours to fade like an old photograph. There are regulations as to how bright lights in a gallery can be and how many there are, as well as how many days out of a year a painting is viewable (the rest of the time it's in a dark climate controlled room). And remember, the Gioconda is only about 500 years old... works from earlier times have only survived due to extreme storage facilities. The cave paintings around Cro Magnon, for example, survived because they've been in a cold fucking cave for ten thousand years. And the artifacts of Tutankhamun and Rameses II survived because they were buried in a stone coffin in one of the driest areas in the world.
The digital age gives us great hope for preservation of everything, because we can copy sounds, images, motion and even DNA structures with perfect reproduction. But it will only be through the careful preservation of this information that future generations will be able to access it.
If anything, and you can consider this a dig at DMCA if you like, it will be the number of copies of these artworks that will permit them to be preserved. Consider this: there is only one Mona Lisa -- if she fades, we can only guess at what her colour was. But there are millions of copies of Wing Commander IV. It's a relatively simple task to go through a few thousand of these, extract from each disc what data hasn't rotted away, and compare it to the others. Combine that with Huffman coding and CRCs and we can quickly reconstruct the original with perfection and certainty. You can't say that of the Venus de Milo. And unlike other generations' copied media, we can trust the intermediary -- the cold, heartless eye of the scanner and OCR software -- not to misspell anything or make up shit. Bemoan the need for proprietary copyrights if you like, but the digital age's perfect reproducibility is the factor that will decide its permanent etching in the databases of the future.
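For the curious, the "compare many rotted copies" idea can be sketched in a few lines of Python. This is only a rough illustration under some loud assumptions: every copy has the same length, damage hits different offsets on different discs, and a known-good CRC-32 of the original was recorded somewhere. The function names and the sample data are made up for the example.

```python
import zlib
from collections import Counter


def reconstruct(copies):
    """Rebuild a byte string by majority vote across equal-length damaged copies."""
    if not copies:
        raise ValueError("need at least one copy")
    length = len(copies[0])
    if any(len(c) != length for c in copies):
        raise ValueError("all copies must have the same length")
    result = bytearray(length)
    for i in range(length):
        # The most common byte value at this offset wins the vote.
        result[i] = Counter(c[i] for c in copies).most_common(1)[0][0]
    return bytes(result)


def verify(data, expected_crc):
    """Check the rebuilt data against a previously recorded CRC-32."""
    return zlib.crc32(data) == expected_crc


# Hypothetical original and its recorded checksum (assumed to have survived).
original = b"Wing Commander IV, disc 1"
known_crc = zlib.crc32(original)

# Three hypothetical discs, each rotted at a different offset.
damaged = []
for offset in (0, 5, 12):
    disc = bytearray(original)
    disc[offset] ^= 0xFF  # flip every bit of one byte
    damaged.append(bytes(disc))

recovered = reconstruct(damaged)
print(verify(recovered, known_crc))
```

The vote only works while most copies agree at each offset, and the CRC only tells you whether the result matches the recorded checksum, not how to repair it.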
The effect of this upon history is obvious. Originally, historians thought that the digital age would be great for doing history, because so much source material would be available. The growth in data warehouses and similar archives indicates that it's human nature not to throw anything away. (That, and my garage!) But now, to prevent the risk of exposure during email discovery, there won't be anything left.
What a shame.
In 10,000 years or so when people look back at 2001, they will know plenty. But only the important stuff. Depends on whether what they consider important is what the spin doctors persuaded the news media to publish, or what _really_ happened. If you want to know what really happened, you have to find records left by ordinary people.
They did have "reusable paper" in the dark ages. Most writing was done on parchment, which is a sort of leather. It was expensive enough that a good many warlords would have the lettering soaked and scraped off of books so as to re-use the parchment for their army payroll and tax records. Sometimes, monks of a later generation would try to recover the original text from the incomplete erasures...
I only see paper as the solution. Bulky, but it will hold up. But if given an option, I would like to see atomic scale writing on some sort of sheet (metal / plastic / ...). That should be able to hold up for a long time, as long as we have atomic scale readers.
As /. has pointed out in the past, digital storage sometimes gets lost in the sense that there are no readers (anybody recall the NASA tapes that are in storage because nobody can find a reader?)
ONEPOINT
my web site artistcorner.tv hip-hop news
please help me make it better
if you see me, smile and say hello.
Y-E Data, a floppy drive manufacturer, was making 8 inch drives until very recently. I am sure that if the data is important enough they could still do it quite easily, but it would be expensive: most of the parts would have to be made by hand or outsourced.
blade
http://www.ohlssonvox.com
That may be true for historians looking for big trends, but we can learn a lot from our own past by looking backwards in our own history. I still have emails saved from almost 9 years ago, and WOW! I was such a clueless college kid. Some of the mistakes I made, and talked about on email with friends, really make me think twice now about what motivated me to be so stupid. If nothing else, we should be archiving our records, for you never know when you might become the next most famous person on the planet, and then everyone will want to know what got you there. Linus Torvalds certainly didn't write Linux to make himself famous. He did it for fun in his spare time, and now he's internationally known! Wouldn't it be great, when one of us reaches such a pivotal moment in time, to be able to say to everyone: "Look, this is what made me who I am, and here is what you should and should not do to become a valuable member of society."?
The fact that digital media isn't persistent isn't an issue just yet; nearly everything human beings have decided to save as information has been published in books, which are collected in the library system -- which is basically an analog sort of distributed, persistent storage. People have private copies of books, students keep their textbooks forever, and libraries buy copies of their own... Our culture will survive through its books and papers, as has every other culture we know about (even the Babylonians had clay tablets, thousands of years ago).
Virtually all of our government's papers have been preserved through the system of Federal Repositories, large libraries usually associated with a large state university. Most are on microfiche and microfilm, although some are available in paper form. As I understand it, the repositories were designed to preserve our government's records in the event something, ah, *happened* to Washington D.C. (cough, cough NUKE, cough cough).
While it's true that our personal correspondence is tending towards the digital (phone conversations and email), this doesn't mean that information about our lives isn't being stored. Biographies, personal accounts of experiences, a whole lot of serious fiction, and the personal papers of a number of important people are regularly published. Once it gets into a book, it's more or less immune to time, unless one of us burns it. It may be true that some of that correspondence is tainted (perhaps even entirely full of shit), but not all of it is, and people will be able to check a number of sources to find things out...
Basically, I'm not too worried. As long as the tree huggers don't outlaw paper (or any plastic substitute we may be offered) we're pretty much A-OK.
crazyphilman@programmer.net
Sort of fat, good looking in a disheveled sort of way.
Does anyone know what the possibilities are of encoding it in DNA and then fossilizing it somewhat quickly? Seems like mostly that stuff lasts quite a while.
...then we can read anything.
In 50 years it may be trivial to use NMR or STM or newsprint and a flat-sided crayon to get the bits off an 8" floppy.
Questions of the value of the information are moot. Questions of technology rot are FUD. There is a balance between how one person values reading the data, and what he can afford to use to do it. The number of cases where the desire is great and the technology is nonexistent will probably remain constant and infinitesimal.
It's a silly damn question. Just as silly as it was 30 years ago when mag-tape stock started decaying or 50 years ago when steel-tape recorders started breaking down or 7000 years ago when Imhotep's little brother Shmohotep spilled mofo-juice on the papyrus with the tuning instructions for the pyramids...
--Blair
Actually, Kodachrome (specifically) is relatively long-lived. It's not necessarily archival, but Kodachrome will last much longer than most other color processes. Kodachrome film is similar to B&W film. The color is added in the developing of the film (which is why it's basically impossible to process Kodachrome film yourself -- you have to send it to a big lab). Most color films have the dyes in the film itself, and these dyes are not especially stable (though they've come a LONG WAY from those of the 50's/60's). However, for true archival quality, no film process will beat a properly developed B&W negative or print.
Our own personal e-mail correspondence, and bank records and that... well, simple answer, it won't be saved. Nor, really, should it. I'm not an important person, no historian will EVER care to read the crap that flows in my inbox. Surely the Medicis were worthwhile subjects whose mail is worth reading, but certainly not I... or probably anyone else except the top .001% of the population... Bill Gates, Linus, the President... those types of people.
In 10,000 years or so when people look back at 2001, they will know plenty. But only the important stuff. And that's all they would care about; after all, they would have 10,000 years of other crap to sift through without having to read the chain-mail jokes that my girlfriend sends me.
Now, as a non-electronic analogy, imagine if the Mona Lisa was designed so that it could only be viewed in one gallery. Copies of it, in any form, are impossible. Now imagine said gallery has a fire. A priceless work of art is gone for all eternity, save people's memories.
Remind anyone of cps2? Thankfully that's been cracked, as who knows whether Capcom would have ever released the data to the public (as they did with some cps1 games). And the cps2 boards didn't have a life span of a few decades, more like 5 YEARS. See cps2 suicide for the details. Now apply this sort of copyright madness to all modern forms of art, and ask where we'll be culturally in 50 years. Scary.
Endless arguments over trivial contradictions in books written by ignorant savages to explain thunder in the dark.
We're setting ourselves up for a similar disaster, but I'm not so worried about old floppies and tape machines. I'm much more worried about being locked in to proprietary formats (such as .doc).
Someday, there will be legislation not unlike the DMCA that will make reverse-engineering .doc illegal. Someday, Microsoft will require you to contact the "mother ship" to ensure your copy of Word is legit, or Word will be on some central server.
Someday, Microsoft won't be there to validate your key, or serve you the latest Word applet. The source for Word will be tied up in IP lawsuits and bureaucratic bungling... or worse, your .doc will be encrypted with keys that only Microsoft had at one time and no longer does, in which case even the source is no good.
Then what?
By placing all our eggs in one collective basket/format, and having that basket be controlled by a closed-source corporation, we are heading towards an information meltdown not seen since the destruction of the library of Alexandria.
"History does not repeat itself", Mark Twain once said. "It rhymes".
Ryan T. Sammartino
"Ancora imparo"
Point well taken. Kodak was sued over unstable colour images in the sixties (I believe).
Things have improved since then and modern colour photographs are quite stable. We really have to wait 100 years to find out.
My point is that the pinkish picture of grandpa will still give you a good idea of what grandpa looked like. An image on a disk that no one can read will keep grandpa's appearance a secret forever.
Kodachrome is a bad example to use as it is very stable mostly because only Kodak will process it. Ektachrome and colour prints are less so. I have seen an improperly processed B&W image degrade in a few months.
I am often asked about digital cameras and it is this exact problem that I like to point out. There are two problems: even if the media is stable enough to keep the data safe for hundreds of years, you are relying on technology being available to read the information.
If you use a film camera and throw the negatives and prints in a shoe box they will last almost forever and will be viewable as long as there is light.
Even some of the earliest photography has proven to be quite stable, look at the amount of Mathew Brady work that still survives and works just fine.
Digital technology is not the answer to everything.
We still have printed and physical works and we always will. The Internet has many books written about it and historians will undoubtedly be writing nostalgic recounts of the Internet age in another 50 years or so..
Even if it is all digital vapor that will vanish from existence someday, let's remember that many ancient civilizations left very little behind but their ruins and a few clues as to who they were, yet we always seek to understand and put the pieces of the puzzle together to try and picture the life and times of our ancient ancestors.
Besides, in 100 years, everyone alive today will be dead so who cares if The First Web Page(tm) is preserved and enshrined for eternity.
--
$ chown -R us:us yourbase
I have all my emails, ever since 1994.
How to do it: transfer as you change. I've had several HDs, diskette formats, have even used cassette tape drives (anyone remember the TK85?). Now I have everything on CDs; when everybody starts using DVD writers (when they get cheap enough, that is :) ) I'll migrate to that, too!
If the user concerns himself with his (her) own data instead of waiting for someone else to invent a "perfect" way, having access to history will be no big deal. And it makes for a great backup policy as well!
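A minimal sketch of the "transfer as you change" habit in Python, assuming the old and new media both show up as ordinary directories. The `migrate` helper, the directory names and the sample mailbox are all hypothetical. The idea is just: copy every file, then compare SHA-256 hashes so a bad copy gets noticed before the old media is thrown away.

```python
import hashlib
import shutil
import tempfile
from pathlib import Path


def sha256_of(path):
    """Hash a file in chunks so large archives do not need to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def migrate(old_root, new_root):
    """Copy every file under old_root to new_root; return files that failed verification."""
    old_root, new_root = Path(old_root), Path(new_root)
    failures = []
    for source in sorted(old_root.rglob("*")):
        if not source.is_file():
            continue
        target = new_root / source.relative_to(old_root)
        target.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(source, target)  # copy2 also keeps timestamps
        if sha256_of(source) != sha256_of(target):
            failures.append(source)
    return failures


# Demo with throwaway directories standing in for the old CD and the new DVD.
with tempfile.TemporaryDirectory() as tmp:
    old = Path(tmp) / "old_cd"
    new = Path(tmp) / "new_dvd"
    (old / "mail").mkdir(parents=True)
    (old / "mail" / "1994.mbox").write_bytes(b"From: me\n\nhello\n")
    bad = migrate(old, new)
    copied = (new / "mail" / "1994.mbox").read_bytes()
    print(bad)
```

An empty list back from `migrate` means every copied file hashed the same as its source. It says nothing about whether the source was already damaged before the move.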
Linux *is* user friendly. It's not idiot-friendly or fool-friendly!
It is easy to predict that at a similar distance in the future little will be known about our time period. After all, it is already problematic to retrieve 25 year old data from 8 inch floppies, simply because the reading mechanisms are hard to find even if the media has retained the data.
Yeah, 'cuz all of our history is written on computer media now that we live in a paperless society.
Oh, no, wait, that's what they said would happen 25 years ago on 8" floppies.
Ever notice how many books there are in a lowest-common-denominator bookstore like Borders or Barnes and Noble? Our history is not just on floppies/CDR/etc.
Dave
See Adrian Berry's The Next 500 Years (1995), Chapter 5, "The Death of History".
This has got to be the reason why all those ancient advanced civilizations such as Atlantis didn't leave any trace of their existence behind, except in oral traditions.
I don't think that devising a reader for antique magnetic or optical media will be the problem - I'd imagine that a future hacker would welcome the challenge of figuring out how to read an 8" floppy disk.
:)
The problem is that all physical media fail over time. Magnetic media degrades over just a few years. CDs experience "CD rot" (if anyone has any old video disks, you know what I mean), and even media designed for durability eventually breaks down.
I think humanity has to face the fact that over the millennia, we inevitably lose our old historical records.
At least archeologists will never be out of work.
Invisible Agent
This post is a mirror; when a monkey stares in, no hacker gazes out.
Why should we even worry about archiving data for the future? Since when has humanity ever consciously decided to preserve every little bit of information? The important scraps stay, the irrelevant ones are forgotten, some stuff will stick around and make historians feel warm and fuzzy inside, some will rot. This is how it should be.
It doesn't matter what medium we store our precious little scraps of nostalgia on. If you have something you want to save you just move it to a new medium when you feel the need to do so. The storage medium is irrelevant. We don't need some new storage device that will last for 20000 years, we need people to keep what they like and forget the rest. As it stands we're pretty damn good at that.
--
lasts longer than paper.
In fact, the cheap paper we make today, from wood-pulp and full of acid, degrades in a few years (leave a paper-back book on a sun-lit shelf for a bit). Archival stuff is made from rags, without chemicals.
Inscribed stone/clay, and metal are more permanent of course, but less flexible.
I know this has come up before, I'm sure I remember discussion about "programmer archaeologists" of the future - noble beings equipped with trowels and oscilloscopes, who reconstruct long-dead file formats from half-corroded CDs. Sounds like a neat job.
It's worth noting that one hundred years from now some of the most valuable land may be the former landfills. They're chock full of high quality resources. Copper, iron, recoverable plastics. We may rue the day that incinerators were promoted as a replacement.
Digital records are favored by our corrupt, foreign-dominated Federal tyranny for one very simple reason:
It's terrifyingly easy to alter them, or to dispose of them entirely.
This is frightening, but true: As the well-known conservative George Orwell observed in his great novel 1984, "Who controls the past controls the future." The "Party" in 1984 devoted itself to doing exactly what the Clinton regime did: They went through all historical records, altering, falsifying, modifying, deleting.
No one will ever know what the Clinton death count really was. No one will ever know what really happened. The "records" are malleable. You can trust no information that comes from the government, because it's all been "massaged" and "fixed up".
Will there be historical records? Not in any meaningful sense: There will be something that looks a lot like such material, but it will be a work of pure fiction.
Goodbye, America. We were great while we lasted.
"Offtopic, Inflammatory, Inappropriate, Illegal, or Offensive" -- hey, that's me!