Slashdot Mirror


Will There Be Historical Records from the Digital Age?

magarity asks: "NPR's Morning Edition today aired a segment on the Medici Archive Project where every letter sent and received by the ruling Medici family of Renaissance-era Italy is being stored. The interviewer, Bob Edwards, casually joked that it was a good thing the Medicis didn't use email or else all this history would have been lost. It is easy to predict that at a similar distance in the future little will be known about our time period. After all, it is already problematic to retrieve 25 year old data from 8 inch floppies, simply because the reading mechanisms are hard to find even if the media has retained the data. The same thing will happen to CDs in 50 years. How should the dawn of the digital age be recording itself for history, especially casual correspondence that gives insight into day to day life?"

"The Medici Project concerns itself with the rulers and given the recent report of US Congress members not making use of email one assumes they are still using good old long term archivable paper. Will the President and Congress in 2030 or even 2020 feel the same way? The main problem being digital records are so much more easily tampered with compared to old paper. It's not as easy to do carbon dating or other such tests with a bunch of bits. Remember: the victors always have and always will rewrite history as much as possible."

25 of 251 comments

  1. Of course there will! by Chris+Johnson · · Score: 3
    Of course there will be historical records from the Digital Age!

    They will say:

    • music thieves are like looters or other sorts of robbers, and right thinking people despise them
    • nobody has ever been motivated by anything other than self interest
    • people will trade off privacy for a bit of convenience
    • Microsoft has always been the world's web browser
    • Bush won
    • Oceania has always been at war with Eurasia

    Thank you, Ministry of Historical Perspective! :P
  2. We don't have less long-term storage by iabervon · · Score: 3

    We just have more medium-term storage. The sorts of things that won't last more than a couple dozen years are generally things which, in the old days, wouldn't have lasted a minute: music couldn't be stored at all until recently, and many conversations we have by email (which could degrade) would have been done in person and never stored at all.

  3. answer: by desslok · · Score: 4

    cat internet | lpr

  4. only copied stuff is "saved" by peter303 · · Score: 3

    That applies to 5 years ago or 2000 years ago.
    Even paper disintegrates, albeit in centuries.
    Only a tiny fraction of stuff is copied now or then.

    1. Re:only copied stuff is "saved" by Moofie · · Score: 3

      What are you talking about? I've got the DVD that Moses brought down from Mt. Sinai. Look! It says "10 Commandments" right there on the front!

      --
      Why yes, I AM a rocket scientist!
  5. We need a standard for long term storage by joshv · · Score: 4

    We need to define a long term storage standard which is a suite of storage media and standard file formats. Call it LTSS 1.0. To be a LTSS 1.0 compliant reader you have to support all media and file formats. This could be a dedicated reader, or a general computer with some specialized software and hardware.

    LTSS 2.0 might have whizbang new file formats and storage media which supports 100 times as much information density, but it must be compatible with version 1.0.

    LTSS 1.0 could support WAV, MP3, GIF, TIFF, Text/ASCII, Text/Unicode, HTML version whatever, and perhaps even Java for interpretation of arbitrary file formats. The media: CD-R, or perhaps one of the writable DVD formats when they mature.

    -josh

  6. Isn't this scary by Shotgun · · Score: 5

    A democracy, a so called 'free society', can easily be manipulated and controlled by the person controlling the information. What happens when all information, except what comes from 'authorities' is suspect because it is so easily fabricated?

    It reminds me of the Arnold Swarzen...(?) movie, "The Running Man". He's a police helicopter pilot who refuses to shoot unarmed people involved in a food riot. The powers that be manipulate the video tape evidence to make it appear that he massacres the people instead. People are shown the tape and cry for his death in a game show type fashion until some revolutionaries are able to show the real tape by hacking into the communications channel.

    The temporality of public records has very serious implications for our social structure. If the only record of your speeding ticket is an entry in a database, what happens when a glitch makes you a drunken sloth who doesn't pay child support? If the entry showing Bush's drug convictions gets deleted, will there be no other record? Trust me on this, email is a politician's dream. Everything from here on has plausible deniability.

    --
    Aah, change is good. -- Rafiki
    Yeah, but it ain't easy. -- Simba
  7. Some thoughts by wiredog · · Score: 4
    When the 3.5 inch floppy came out, I copied all my stuff on 5 inchers over. When CDR came out, I copied it all onto a cd. Made backups, too. Copied all my e-mail from outlook to the standard text format when I went to Linux. No doubt I will be copying my data to DVD-R someday. And, 20-30 years from now, to its successor.
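
    The copy-it-forward habit described above can be sketched in a few lines. This is only an illustration, assuming the old and new media are both mounted as ordinary directories; the function names `sha256_of` and `migrate` are made up for this example. It copies every file and re-reads both sides to confirm the bytes match, since a migration that silently corrupts data is worse than none.

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of(path):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def migrate(source_dir, target_dir):
    """Copy every file from source_dir to target_dir and verify each copy.

    Returns a manifest mapping relative paths to digests, which can be
    stored next to the data and checked again at the next migration.
    Assumes target_dir is not located inside source_dir.
    """
    source_dir, target_dir = Path(source_dir), Path(target_dir)
    manifest = {}
    for src in sorted(source_dir.rglob("*")):
        if not src.is_file():
            continue
        relative = src.relative_to(source_dir)
        dst = target_dir / relative
        dst.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(src, dst)  # copy2 also keeps timestamps where possible
        before, after = sha256_of(src), sha256_of(dst)
        if before != after:
            raise IOError(f"copy of {relative} does not match the original")
        manifest[relative.as_posix()] = after
    return manifest
```

    Keeping the returned manifest alongside the archive means the next move (to DVD-R or whatever follows it) can check the files against digests recorded years earlier.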

    One problem with archiving digital communications is the volume. One of the problems that were found during the many Clinton investigations was, when e-mail was subpoenaed, separating the wheat from the chaff. All the mail was backed up onto tapes, which weren't very well marked. And the first searches were done on subject lines. Quite a bit of relevant mail was missed, and turned up years later when people actually sat down and read every message.

    The National Archives (here in the USA) is worried about preserving data. The various software and hardware formats used over the years make it difficult to track and retrieve the data. NASA has spent a fair amount of money moving old planetary exploration data from tapes to optical disks, and then to CD. My father worked on a project at DMA (now NIMA) to do the same thing there.

    1. Re:Some thoughts by Erasmus+Darwin · · Score: 3
      The National Archives only accepts data in ASCII format. They view text as the lowest common denominator [...] You can understand their position after you sit down and think..this is our American history...

      So I'm sitting down and thinking, but I still don't understand their position. I can appreciate both the importance of ASCII text and its accessibility (hell, I still use lynx to browse the web), but I can't understand why you would restrict yourself to only text.

      Consider the following:

      On July 20, 1969, Neil Armstrong was the first man to walk on the surface of the moon.

      --versus--

      On July 20, 1969, Neil Armstrong was the first man to walk on the surface of the moon. Here is a picture, in an open, documented graphics format.

      There's just too much history that's more than just pure text. I can understand trying to make as much material as possible available as text, but you can't let such a decision allow you to exclude relevant materials that're more than just text.

  8. Paper tape by SecretAsianMan · · Score: 3

    The main problem being digital records are so much more easily tampered with compared to old paper

    Sometimes the answer to your question about how we do X with technology can be found by remembering the history of technology. In this case, what might be a better long-term storage medium than magnetic or optical media is good ol' paper tape. Now, some research should probably be done to increase both the durability of the tape material and the density of information stored on it, but it is the best solution I can think of, and probably the easiest to decipher by archaeologists of the far future.

    --
    SecretAsianMan (54.5% Slashdot pure)

    --

    Washington, DC: It's like Hollywood for ugly people.

  9. Automated chaff generation doesn't help by devphil · · Score: 3
    One problem with archiving digital communications is the volume. One of the problems that were found during the many Clinton investigations was, when e-mail was subpoenaed, separating the wheat from the chaff.

    No kidding. I'd hate to be in Deja/Google/whoever's shoes, trying to archive useful data, in the face of terabytes of "Nude Asian Teens" email generated -- literally -- completely automatically at the click of a mouse button. Especially since the most useful spam filtering methods (outright router blocks, keyword triggers, a bullet to the head of the marketing agent) are frowned upon by nice people.

    Paper libraries have a "volume" problem because the media itself takes up so much space, and must be carefully stored. Digital libraries have a "volume" problem because any old jackass can easily create fifty times the amount of information that's worth keeping, and it must be winnowed out by a human.

    Just my rant today (cleaning out another twelve spam emails).

    --
    You cannot apply a technological solution to a sociological problem. (Edwards' Law)
  10. It's Not Just Digital -- Microfilm Sucks as Well by meehawl · · Score: 3

    There's a good review of a Nicholson Baker rant against Librarians in general for their sins of deliberately pulping the paper records of the past 130 years and replacing them with decomposing and badly executed microfilm facsimiles.

    It seems that Vannevar Bush's infatuation with microfilm was shared by many in the WW2 OSS community, and this seems to have led to a misguided attempt to replace papers and books with microfilm in the interests of "efficiency".

    --

    Da Blog
  11. Re:Natural selection by spasm · · Score: 5

    "Important information survives (usually). Trivial information gets lost. This is how it should be. There's no reason to preserve every bit of data for 'historical' reasons."

    I've worked on research projects whose primary source was day-to-day accounting records of a small business running in Egypt during the 11th century. The records were preserved in part because they were at the bottom of a trash pile. The records gave us a huge amount of information about everything from transport methods to the ability of the state to collect tax. Most of the 'important information' from that period which people thought was worth preserving revolves around which ruler stomped which other ruler's butt. Our 'trivial information' gave us a lot of stuff which we knew nothing about before, stuff which helped explain why ruler X had the economic wherewithal to stomp ruler Y's butt and, well, more interestingly, what it was like to live under ruler X or Y.

    The same applies today. Yeah, a record of what your family ate for dinner for the past two weeks is truly trivial. But what it will say about daily life, the transport of food, diet, cooking technology, food storage & a whole lot more about life in the early 21st century might be invaluable to some historian in a thousand years.

    Your 'trivial information' is someone else's data goldmine and vice versa. One of the things I really like about computers is they allow you to keep a lot of personal shit you might otherwise have to trash because it gets bulky. The chances that I'll hang onto all my mail & all my parents' mail and all my grandparents' mail are pretty good when it fits onto a CD rather than choking up my small apartment with boxes. The chances that some future historian will get to read ordinary everyday mail rather than just the mail of presidents and kings in a thousand years are getting better.

  12. On optical media. by supabeast! · · Score: 3

    Optical media is not really such a bad option. A useful, self contained system for playback of optical media could be easily built. If nothing else, carefully preserved schematics for future readers of media could be stored with it to make sure that if the machine is ruined and media survives, it might still be read.

    The real reason that old magnetic tape is hard to read now is that it was never a great format in the first place. The stuff falls apart. My last employer had an old HP reel-to-reel machine for reading data on tapes from a company we had purchased, but the tapes were so old that the chemicals on the tape itself turned to dust and fell off. This is not a problem with optical storage. Optical storage also has the option of being dedicated in very small spaces, unlike the van sized tape players of old.

    Life is also not a big issue with optical media, because just as the books of the Medicis were recopied over and over into new languages and on better bindings, so can data be quickly copied from old optical media onto newer formats.

  13. The answer to this question is pretty obvious... by smoondog · · Score: 5

    I'm at a loss to understand why this question is perceived as being difficult to answer. Notice the posting talked of the *ruling* class. Today we look back at history and see people who kept records of their letters. They are usually wealthy and upper class.

    The analogy would be to read emails from, say, the white house in 200 years. Do you think the white house is saving their emails? You bet. Do we have lots of examples of (from the general public) letters from 200 years ago? Certainly not as many as there will be emails in the future. Usenet archives, digital backups stored in basements, most emails are being stored two or more times at two or more places. I don't quite understand why someone would think that just because it isn't on paper, it isn't going to keep. We are going to have far more emails stored in the future than we will know what to do with.

    As a society we think that, as individuals, we are pretty important, but let's face it, for the vast majority of us, no one is going to care in 150 years. With that in mind, the digital age is storing far more records than ever before and the future holds a new paradigm of historical record. I almost lament that I wasn't born 150 years after the advent of the digital age where high resolution movies will look as good 1000 years from now as they do today.

    -Moondog

  14. This is a known problem by Greyfox · · Score: 4
    This problem is aggravated by the current copyright laws. Long after the copyright holder's lost interest, it will be illegal to copy the content to fresh media. Lars may bitch and moan now about his songs being stolen but in 100 years will anyone know who his band is or hear his songs again? The DMCA will only make this problem worse, potentially making it impossible to preserve any works from this era.

    Likewise, various people are trying to shut down the MAME ROM sites, but a lot of the hardware ROMs are deteriorating now and many of those games, which represent a golden age of creativity and a technical wonder of resource usage, will be gone forever. Kinda makes you sick, doesn't it?

    --

    I'm trying to teach myself to set people on fire with my mind... Is it hot in here?

  15. Is it necessary? by zpengo · · Score: 4

    While I'm all for archiving data for future historical analysis, I think it's fairly certain that IM logs, "how's it goin?" e-mails, and detailed transcripts of #40yearoldsinglebaldguys will not be very useful to historians in three hundred years. Yes, they tell about our culture and practices, and yes they might be interesting, but we don't need all of it to extrapolate those conclusions. There is simply no room to store the vast quantity of information generated on the Internet on a daily basis, and considering the fact that 99.998% of it is of little value, I think that we can safely do without it.

    Things are still floating around from the old days. We have Usenet archives from the 80s, and text files from even earlier. We can learn a lot about the culture based on those. Things that grab the public consciousness tend to stick around. They get mirrored, printed out, saved on disk, etc.

    Does there need to be a giant warehouse that contains vacuum-sealed printouts of every wise thing said on the internet?

    No. No, there doesn't.

    --


    Got Rhinos?
    1. Re:Is it necessary? by rgmoore · · Score: 5

      Of course the flip side of this is that it's not always possible to tell who will be considered interesting in the future. In many cases, the most interesting use of archives is to look at the work of interesting people while they were working their way up and weren't of broad enough interest to attract major attention. Nobody knew that a 25 year old patent examiner named Albert Einstein was about to become a scientific star, but because we have his personal letters we can find out what he was doing scientifically and personally.

      You never know if the next great author might be posting his early, great works to some fan e-mail list because he can't get his foot in the door at a major publisher. Maybe the next great debater is getting started in flamewars on Slashdot. Maybe the next great OS designer is getting into arguments with established academics on USENET. Oh, wait, that already happened, and we can only read the argument because somebody thought to archive it. Maybe the next great philosopher who will be mostly ignored for 100 years is already publishing his early thoughts somewhere on the web. You can't always tell what will be valuable to the future until well after the fact, so preserving as much as possible is still a really good idea.

      A truly wonderful example of this kind of thing is the early work of J.R.R. Tolkien. The early history of the Silmarillion is absolutely fascinating and a wonderful example of the development of a literary theme. That's a work that wasn't published for over 50 years after it was started, but some of the earliest drafts still exist. Because those drafts are available, it's possible to see how it developed. Will the same thing happen when authors write everything in Word and write over old versions every time they change anything? How about if they're still very careful about keeping copies of early drafts but the formats change so much that they can't be read anymore?

      --

      There's no point in questioning authority if you aren't going to listen to the answers.

    2. Re:Is it necessary? by skoda · · Score: 3

      I've been reading Stephen Ambrose's books the past couple of years, and based on his work, I now think that the 'whassup' emails are of value, because they will tell historians about the common man.

      While the histories, news articles, and official documents of a given era are very important and informative, it is also necessary to have the personal accounts from the people involved in the society at the time to help provide perspective, and to help identify biases in the 'official' accounts.

      Considering how valuable even the most pedestrian of documents are from e.g. 3000 BC, I imagine that today's equivalent will be of equal value to historians in 7000 AD.
      -----
      D. Fischer

  16. Re:Does anyone care by rjamestaylor · · Score: 3
    Does anyone care...What the days slashdot articles are from 50 years ago?

    The problem with planning for the future is that it is hard to know today what will be important tomorrow. Perhaps the insignificant trolls on Slashdot will be of great import in the future (and, no, I'm not referring mainly to Jon Katz articles). Who woulda thunk that an accounting ledger from ancient Mesopotamia would be of any interest 2500 years later?

    --
    -- @rjamestaylor on Ello
  17. It's not the media, it's the SOFTWARE. by aussersterne · · Score: 4
    There's a large difference between 8" floppies and CD-ROM. The installed base of CD reading mechanisms (CD-ROM, CD-R, CD-RW, PlayStation, Dreamcast, SegaCD, Saturn, PS2, 3DO, VCD, home stereos, walkmans) is many orders of magnitude greater than the installed base of 8" floppy drives ever was.

    Even two or three hundred years from now, a reasonably skilled technician or at worst a team of them will be able to dig up a CD mechanism from somewhere, fix it up and get it reading data. CD mechanisms are like Ford's Model T -- only much more common -- and let's face it, there are still a reasonable number of Model T's running around to auto shows, and there isn't nearly the historical incentive to keep a Model T running that there is to ensure that there will always be a CD-ROM reader running somewhere.

    And it's likely that if most people are like I am (I value my data and my work) they will continue to migrate data to new formats as they emerge.

    The bigger question isn't media, but software. I'm very confident we'll be able to get our files from ISO9660 discs, but I already have a bunch of WordStar and old MacWrite/MacPaint files I can't open and it's only been a decade. We'll be able to retrieve the raw data, but will we actually be able to interpret and make use of it?
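
    To make the media-versus-software distinction concrete, here is a small sketch of the first step in interpreting recovered bytes: recognising which format they are in. It only knows a handful of well-documented signatures (GIF, TIFF, WAV) plus plain ASCII, which happen to be formats named elsewhere in this thread; the function name `sniff_format` is invented for the example, and it deliberately makes no attempt at WordStar or MacWrite, whose layouts are not assumed here.

```python
def sniff_format(data):
    """Guess the format of raw bytes from their leading signature.

    Returns a short label, or "unknown" when nothing matches. Only a few
    documented magic numbers are checked; this is not a full identifier.
    """
    # GIF files begin with the ASCII text "GIF87a" or "GIF89a".
    if data.startswith((b"GIF87a", b"GIF89a")):
        return "GIF"
    # TIFF files begin with a byte-order mark followed by the number 42.
    if data.startswith((b"II*\x00", b"MM\x00*")):
        return "TIFF"
    # WAV is a RIFF container whose form type at offset 8 is "WAVE".
    if data[:4] == b"RIFF" and data[8:12] == b"WAVE":
        return "WAV"
    # Fall back to checking whether every byte is 7-bit ASCII.
    try:
        data.decode("ascii")
    except UnicodeDecodeError:
        return "unknown"
    return "ASCII text"
```

    Anything that comes back "unknown" is exactly the case described above: the bits were retrieved, but without the original software or a published specification they remain uninterpreted.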

    P.S. I still have an old Siemens 8" floppy drive, single-sided, hard sector. About five years ago I still had an old floppy controller with an odd WD chip on it that could talk to it using OS-9. No way to talk to it with my Linux box, though...

    --
    STOP . AMERICA . NOW
  18. White House Email by cube+farmer · · Score: 4

    The analogy would be to read emails from, say, the white house in 200 years. Do you think the white house is saving their emails? You bet.

    Apparently, George W. was an inveterate user of email right up until the inauguration. At that point, he sent a farewell missive to his correspondents, in effect saying he could no longer use email because all such correspondence would be a public record and he didn't want his private musings made public.

    So, no, many important communications will not be retained, unless someone is placing a wiretap on the president's phone.

    --

    MacOS, Windows, BeOS, GNOME, KDE: they're all just Xerox copies

  19. Simple solution to digital recording by dasmegabyte · · Score: 3

    Tell my mom. She's good at remembering useless details that nobody cares about and explaining them to anyone who listens. Plus she was born before the advent of the telephone.

    --
    Hey freaks: now you're ju
  20. Re:On Bitrot by dasmegabyte · · Score: 4

    Actually, paintings do deteriorate due to viewing, and quite quickly. Photons bombarding the pigment cause the colours to fade like an old photograph. There are regulations as to how bright lights in a gallery can be and how many there are, as well as how many days out of a year a painting is viewable (the rest of the time it's in a dark climate controlled room). And remember, the Gioconda is only roughly 500 years old...works from earlier times have only survived due to extreme storage facilities. The cave paintings around Cro Magnon, for example, survived because they've been in a cold fucking cave for ten thousand years. And the artifacts of Tutankhamun and Rameses II survived because they were buried in a stone coffin in one of the driest areas in the world.

    The digital age gives us great hope for preservation of everything, because we can copy sounds, images, motion and even DNA structures with perfect reproduction. But it will only be through the careful preservation of this information that future generations will be able to access it.

    If anything, and you can consider this a dig at DMCA if you like, it will be the number of copies of these artworks that will permit them to be preserved. Consider this: there is only one Mona Lisa -- if she fades, we can only guess at what her colour was. But there are millions of copies of Wing Commander IV. It's a relatively simple task to go through a few thousand of these, extract from each disc what data hasn't rotted through, and compare it to the others. Combine that with Huffman coding and CRCs and we can quickly reconstruct the original with perfection and certainty. You can't say that of the Venus de Milo. And unlike other generations' copied mediums, we can trust the intermediary -- the cold, heartless eye of the scanner and OCR soft -- not to misspell anything or make up shit. Bemoan the need for proprietary copyrights if you like, but the digital age's perfect reproducibility is the factor that will decide its permanent etching in the databases of the future.
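
    The compare-many-copies idea can be sketched roughly as follows. This is an illustration under stated assumptions, not a description of any real recovery tool: it assumes every damaged copy has the same length, that rot hits different bytes on different discs, and that a CRC-32 of the original is known from somewhere trustworthy. It uses a plain per-byte majority vote and leaves Huffman coding out entirely; the names `reconstruct` and `verify` are made up for the example.

```python
import zlib
from collections import Counter


def reconstruct(copies):
    """Rebuild a byte string by per-position majority vote across copies."""
    if not copies:
        raise ValueError("need at least one copy")
    length = len(copies[0])
    if any(len(copy) != length for copy in copies):
        raise ValueError("this sketch assumes all copies have the same length")
    rebuilt = bytearray()
    # zip(*copies) walks the copies column by column, one byte position at a time.
    for column in zip(*copies):
        value, _count = Counter(column).most_common(1)[0]
        rebuilt.append(value)
    return bytes(rebuilt)


def verify(data, expected_crc):
    """Check the rebuilt data against a CRC-32 recorded for the original."""
    return zlib.crc32(data) == expected_crc


# Hypothetical example: three discs, each rotted in a different place.
original = b"WING COMMANDER IV"
known_crc = zlib.crc32(original)
damaged = [
    b"W?NG COMMANDER IV",
    b"WING COM?ANDER IV",
    b"WING COMMANDER ?V",
]
rebuilt = reconstruct(damaged)
```

    With only a few copies a vote can still go wrong if the same byte has decayed on most of them, which is why the CRC check at the end matters: it is what turns "probably right" into something closer to the certainty claimed above.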

    --
    Hey freaks: now you're ju
  21. Simple answer: "No." The reason should scare you. by theonomist · · Score: 4

    Digital records are favored by our corrupt, foreign-dominated Federal tyranny for one very simple reason:

    It's terrifyingly easy to alter them, or to dispose of them entirely.

    This is frightening, but true: As the well-known conservative George Orwell observed in his great novel 1984, "He who controls the past controls the future." The "Party" in 1984 devoted itself to doing exactly what the Clinton regime did: They went through all historical records, altering, falsifying, modifying, deleting.

    No one will ever know what the Clinton death count really was. No one will ever know what really happened. The "records" are malleable. You can trust no information that comes from the government, because it's all been "massaged" and "fixed up".

    Will there be historical records? Not in any meaningful sense: There will be something that looks a lot like such material, but it will be a work of pure fiction.

    Goodbye, America. We were great while we lasted.

    --
    "Offtopic, Inflammatory, Inappropriate, Illegal, or Offensive" -- hey, that's me!