Slashdot Mirror


Most Digital Content Not Stable

brunes69 writes "The CBC is running an article profiling the problems with archiving digital data in New Brunswick's provincial archives. Quote from the story: 'I've had audio tape come into the archives, for example, that had been submerged in water in floods and the tape was so swollen it went off the reel, and yet we were able to recover that. We were able to take that off and dry it out and play it back. If a CD had one-tenth of one per cent of the damage on one of those reels, it wouldn't play, period. The whole thing would be corrupted'. Given the difficulties with preserving digital data, is it really the medium we should be using for archival purposes?"

33 of 353 comments

  1. That's nothing, think of DRM by iamacat · · Score: 4, Insightful

    That content can not be preserved at all. We'll be a civilization without written history, like American Indians.

    1. Re:That's nothing, think of DRM by Rude+Turnip · · Score: 5, Funny

      And if they didn't insist on DRM in their smoke signals, they might still be a pretty formidable group today.

    2. Re:That's nothing, think of DRM by MBGMorden · · Score: 3, Insightful

      [quote]While whites did enough evil, like stealing the whole country[/quote]

      Well, I'm 1/8th Native American (but 7/8ths White) if that counts for anything, but this is always overblown. Whites/Europeans came in and conquered the land. That's what people have done throughout all of recorded history. The Romans conquered the Greeks, the Normans conquered the Saxons, etc. The list goes on and on. The case has ALWAYS been that if some other nation wanted your land and you couldn't stand up to them in a military confrontation, then you were gonna lose that land.

      Now I'm not saying that it's right or justified or anything, but European conquest into North America is always vilified much more than any other tale of conquest, and I'm not sure why.

      --
      "People who think they know everything are very annoying to those of us who do."-Mark Twain
    3. Re:That's nothing, think of DRM by saforrest · · Score: 3, Insightful

      If by forced you mean they lost the war then yes, they were forced. If somebody tried to claim your land would you ever stop fighting? I know I would stop when I was dead. They were just pussies. If they had any conviction we would be at war with them today.


      Ballsy words for an Anonymous Coward. Hopefully you'd stick to them if your hometown were invaded.

    4. Re:That's nothing, think of DRM by vertinox · · Score: 5, Insightful

      The Romans conquered the Greeks, the Normans conquered the Saxons, etc. The list goes on and on. The case has ALWAYS been that if some other nation wanted your land and you couldn't stand up to them in a military confrontation, then you were gonna lose that land.

      As a person who loves to study European antiquity I would point out some flaws in this thinking...

      1. When the Romans conquered the Greeks they actually adopted Greek culture and didn't kill off the Greeks.
      2. When the Normans conquered the Saxons, they didn't kill off the Saxons, nor did they really conquer their land so much as intermarry with them (hence Anglo-Saxon culture).

      The only wholesale genocides that history can come up with are the Crusaders' massacre of Jerusalem (which wasn't so much hatred of Muslims as it was starving Europeans killing off everyone in the city regardless of religion, out of rage at having to starve in the desert for several months) and the Mongol sack of Baghdad, which wasn't so much over land as out of spite for the execution of Mongol diplomats (considering they burned and salted the lands, the "take your lands" point of conquering was sort of a non-issue).

      Genocide and seizure of lands on this scale was never really seen before the colonization of the Americas. It wasn't so much that the Indians could not defend themselves as that the westerners thought they were subhuman.

      Which sadly we saw again in the European theatre in WW2.

      --
      "I am the king of the Romans, and am superior to rules of grammar!"
      -Sigismund, Holy Roman Emperor (1368-1437)
    5. Re:That's nothing, think of DRM by UncleTogie · · Score: 3, Informative

      To them, it was impossible to 'claim' their land -- since they didn't consider it 'their' land.
      Best summed up by Chief Seattle, in 1854: "This we know: the earth does not belong to man, man belongs to the earth. All things are connected like the blood that unites us all. Man did not weave the web of life, he is merely a strand in it. Whatever he does to the web, he does to himself."
      --
      Don't tell me to get a life. I'm a gamer; I have LOTS of lives!
  2. Multiple identical copies? by WinterSolstice · · Score: 4, Insightful

    Isn't that the point of digital? Lossless copies are possible (depending on format obviously). Why have one plastic cylinder that can be lost when you can have it in 5 or 10 locations?

    --
    An operating system should be like a light switch... simple, effective, easy to use, and designed for everyone.
    1. Re:Multiple identical copies? by t00le · · Score: 5, Informative

      Any good backup strategy will have multiple media types, so CD/DVD should not be your primary backup media type. If you prefer to have a medium for fast access, then it is still viable, as long as it is not your primary media type, which should be something with tried-and-true longevity.

      --
      When the only tool you have is a hammer, every problem looks like a nail
    2. Re:Multiple identical copies? by gad_zuki! · · Score: 4, Insightful

      The cost of multiple backups is very real. The real issue here is that this is a frivolous complaint. First off, wet tape being readable is an artifact of the medium. The Rosetta Stone in the British Museum is pretty readable, but we aren't exactly throwing out our modern media to go back to stone. Also, let's consider that a reel-to-reel tape is about 90 minutes (7-inch). 650 megabytes on a standard disc, at an encoding similar in quality to what you get out of a reel-to-reel tape, is something like 1,500 minutes. And it's smaller. So let's not go too crazy idealizing the past.

      Also, I'm certain that for every analog horror story there is a digital lucky story (and vice versa). Not to mention that digital encodings usually have some kind of redundancy. A small scratch does nothing, but the same scratch on an LP forever destroys some part of the track. I won't even go into the magic of data restoration (which the author ignores). There's really no 'tough medium for the ages' out there that can do it all. Just complaints and blind-luck stories.
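
      For anyone who wants to check the 650 megabytes / 1,500 minutes figure, here is a rough sketch of the arithmetic. The function names are made up for illustration, and it assumes 1 MB = 1,000,000 bytes. It only shows what average bitrate that figure implies; whether such a bitrate really matches reel-to-reel quality is the poster's claim, not something the code can settle.

```python
def implied_bitrate_kbps(capacity_mb, minutes, bytes_per_mb=1_000_000):
    """Average bitrate (kbit/s) needed to fit `minutes` of audio into `capacity_mb`."""
    total_bits = capacity_mb * bytes_per_mb * 8
    seconds = minutes * 60
    return total_bits / seconds / 1000


def minutes_at_bitrate(capacity_mb, kbps, bytes_per_mb=1_000_000):
    """Minutes of audio that fit into `capacity_mb` at a constant `kbps` bitrate."""
    total_bits = capacity_mb * bytes_per_mb * 8
    return total_bits / (kbps * 1000) / 60


if __name__ == "__main__":
    # Bitrate implied by "650 MB is about 1,500 minutes".
    print(round(implied_bitrate_kbps(650, 1500), 1), "kbit/s")
    # Same capacity filled with uncompressed 44.1 kHz, 16-bit stereo PCM
    # (44100 * 16 * 2 = 1411.2 kbit/s).
    print(round(minutes_at_bitrate(650, 1411.2), 1), "minutes")
```

      In other words, the 1,500 minute figure only works with lossy compression at well under a tenth of the uncompressed CD audio rate.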

    3. Re:Multiple identical copies? by eno2001 · · Score: 3, Funny

      Digital media is OK, it's the storage that sucks. That's your basic point. But I have to disagree with you on the ubiquity of CD-ROM and DVD-ROM drives. Trust me... of all those devices that exist today, you'll only find less than 1% in a serviceable state in another 75 years. What we really need is a self-replicating storage system that builds copies of itself. I propose that for proper storage of digital information, we should really be looking at systems that can store the data in a sequential chemical form (to represent the bits). These systems should be very compact and only contain a limited set of data + the ability to copy that data to neighboring units. (Death by a thousand paper cuts sort of thing) These small systems would be contained within larger systems whose sole responsibility would be acquiring the necessary physical resources (complex matter that could be broken down into the base chemicals needed by the smaller storage systems).

      The larger systems could also provide mirroring by interfacing with each other as directed by chemical interactions in order to preserve original data as well as integrate new data that may be useful in assuring that future units are even more resilient to any sorts of flaws or possible malfunction caused by inappropriate chemical input. The key to all of this is going to be to make sure that the larger units are impelled to continue the duplication and exchange of data ad infinitum. To do that, there should be some sort of mutual benefit that the engaged units acquire from the mirroring. Multiple levels of mutual benefit would likely be more successful than just one level. So I propose that at a base level, the units should be programmed with routines that make them feel more or less successful whenever a mirroring connection is attempted. I know that sounds strange, but it should be a pretty simple subroutine and will at least get the units to attempt mirroring.

      The next level would also be an expansion of the data mirroring to the actual manufacture of a tertiary (or even more) unit that contains selected data from both origination units. As part of the mutual benefit relationship between units, the origination units should be programmed to protect the manufactured unit in order to safeguard its data as it would be the freshest copy (chemically speaking) and therefore more viable. So the relationship between origination units and next generation manufactured units would be that of security and stability from the origination units as applied to the next generation.

      Another aspect to all of this that would add even more value would be to provide the larger units with various sensors that would store ANY and ALL possible forms of energy radiation and chemical exposure to the environment. This would assure that the units would not only contain the originally stored data, but would be constantly gathering the data in a parallel fashion in every corner of the world where the units are deployed.

      As you can see, this would ensure after several generations that all the original data is intact and could simply be retrieved by reading all units' chemical stores simultaneously and reassembling the original data as well as newly stored information. Imagine that... a sensor array that spans the planet with historical functions as well. And all self-sustaining and chemically based.

      --
      -"...bad old ideas look confusingly fresh when they are packaged as technology" - Jaron Lanier (Digital Maoism on Edge.o
    4. Re:Multiple identical copies? by Red+Flayer · · Score: 4, Funny

      So (-1, redundant) should now be (+1, redundant) for posterity's sake? And dupes are posted for archival reasons?

      I'm confused.

      --
      "Trolls they were, but filled with the evil will of their master: a fell race..." -- J.R.R. Tolkien on Olog-hai
    5. Re:Multiple identical copies? by Not_Wiggins · · Score: 3, Funny

      Exactly! Why store it on plastic at all?

      What I do is take files I care about, encrypt them, rename the file to something tempting like "Cheerleader Sex Orgy XXXIV.avi," note the MD5 (sticky note on the next of the monitor), and share it on a P2P network.

      Instant distributed backup! 8D

      --
      Diplomacy is the art of saying, "Nice doggie!" until you can find a rock.
    6. Re:Multiple identical copies? by sporkmonger · · Score: 3, Funny

      We know papyrus has tried-and-true longevity for sure. Everything else is just a pretty good guess.

  3. Stone tablets by IckySplat · · Score: 5, Funny

    Stone tablets. Just drill a hole for a zero and you're away and laughing
    Now we just need a large enough area to store them :)

    --
    Help! help!, the termites are eating my DRAM!!!
    1. Re:Stone tablets by kalirion · · Score: 4, Funny

      Hey, it worked for Moses...the 10 commandments are still around.

      That's out of the original 15.

  4. 3.5" by otacon · · Score: 4, Funny

    At the enterprise level we use 3.5" 1.44MB Floppy drives in an elaborate redundant array. It consists of roughly 70,000 Disks, each changed nightly. We haven't had any problems yet. Hopefully the rest of the world will play catch up soon.

    --
    In a world of acronyms, the words are the real victims.
  5. It's the messenger, not the message by Anonymous Coward · · Score: 4, Insightful

    Ridiculous. It's not the fact that content is digital, it's the fact that the media being used to store the information (CDs etc) is fragile. If these mythical audio tapes had been digital tapes, recovering the signal from them would have been just as easy.

  6. But what you got off the tape... by dave420 · · Score: 4, Insightful

    ... wasn't *exactly* what you put on. You have the appearance of stability, that you can retrieve something off a damaged tape, but the truth is something different. That's the beauty of analogue. The same simplicity and fault-tolerance of the format also means the format will naturally degrade over time. The contents may be retrievable, but they've degraded, and as such are not the same contents as when first written. Digital fails, but when it doesn't fail, you have exactly the same content as you did when you started. Archivists will not run from digital - their techniques will improve instead. or something.

  7. Crush and Preserve! by webword · · Score: 3, Funny

    Shouldn't it be possible to take all the media and just crush it? You know, like throw it into a Mega Power 3000 Digital Garbage Collector (TM) and crush it into a diamond or something? Let future generations figure out how to decompress it.

  8. wrong recovery method by Red+Flayer · · Score: 4, Informative

    If a CD had one-tenth of one per cent of the damage on one of those reels, it wouldn't play, period.
    That's because you're trying to optically read through the damaged part. It is possible to recover data from damaged discs, as long as only the coating (and not the reflective surface) is damaged. It is quite possible to polish the surface and read the data, or even to fill in some of the damage and repolish for reading.

    Just because it's harder to recover the data doesn't mean it's impossible.

    Of course, anyone using CDs or DVDs for large data backup must have a lot of interns to do the disc swapping.
    --
    "Trolls they were, but filled with the evil will of their master: a fell race..." -- J.R.R. Tolkien on Olog-hai
    1. Re:wrong recovery method by Criffer · · Score: 4, Informative
      Exactly. If you try to put a bent CD into a CD drive, you're obviously not going to be able to read it. But that doesn't mean it's not recoverable.

      To recover data from a CD, you can simply photograph it at high enough resolution. Even with huge scratches, even with parts of the disc physically missing, you can recover the data exactly as it was encoded. How? Reed-Solomon code.
      Quoth Wikipedia:

      The result is a CIRC that can completely correct error bursts up to 4000 bits, or about 2.5 mm on the disc surface. This code is so strong that most CD playback errors are almost certainly caused by tracking errors that cause the laser to jump track, not by uncorrectable error bursts
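
      A toy illustration of why interleaving helps with burst errors like the ones in that quote. To be clear about the assumptions: this is plain block interleaving, not the actual cross-interleaved scheme used on CDs, there is no Reed-Solomon decoding in it, and the function names and sizes are invented for the example. It only shows that a contiguous burst on the "disc" ends up as at most one damaged symbol per codeword, which is the kind of damage an error-correcting code can then repair.

```python
def interleave(symbols, rows, cols):
    """Treat `symbols` as `rows` codewords of `cols` symbols each (row-major)
    and emit them column by column."""
    assert len(symbols) == rows * cols
    return [symbols[r * cols + c] for c in range(cols) for r in range(rows)]


def deinterleave(symbols, rows, cols):
    """Inverse of interleave(): restore the row-major codeword order."""
    assert len(symbols) == rows * cols
    out = [None] * (rows * cols)
    for i, sym in enumerate(symbols):
        c, r = divmod(i, rows)  # stream position i holds column c of row r
        out[r * cols + c] = sym
    return out


def hits_per_codeword(burst_start, burst_len, rows, cols, interleaved=True):
    """Count how many symbols of each codeword a contiguous burst destroys."""
    hits = [0] * rows
    for pos in range(burst_start, burst_start + burst_len):
        row = pos % rows if interleaved else pos // cols
        hits[row] += 1
    return hits


if __name__ == "__main__":
    rows, cols = 4, 6
    data = list(range(rows * cols))
    assert deinterleave(interleave(data, rows, cols), rows, cols) == data
    # The same 4-symbol burst, with and without interleaving.
    print(hits_per_codeword(5, 4, rows, cols, interleaved=True))
    print(hits_per_codeword(5, 4, rows, cols, interleaved=False))
```

      With interleaving, any burst no longer than the number of codewords touches each codeword at most once; without it, the same burst can land mostly in a single codeword.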
  9. They could try harder by Waffle+Iron · · Score: 4, Interesting
    The CD wouldn't play with an off-the-shelf CD player. That doesn't mean that a special "archaeological" CD player can't be built that would perform extensive microscopic image analysis of the disk surface in order to read the data in the face of extensive corruption.

    Some analog technologies, like old color films, have also degraded and need image enhancement to recover the original content.

  10. Every Superman has his Kryptonite by elrous0 · · Score: 4, Insightful

    Yes, analog tape is durable. But let's take it and that "CD" and put them in front of a large electromagnet and see how each fares.

    --
    SJW: Someone who has run out of real oppression, and has to fake it.
  11. have people already forgotten? by phantomfive · · Score: 4, Informative

    Have people already forgotten the advantage of digital? If you have an analog tape, every time you make a copy of it, the quality will be degraded. But with digital, you can make a million copies and the final copy will be the byte-by-byte equivalent of the original. So what if CDs only last 10 years before becoming unusable? You can make another copy! So what if this guy wouldn't have been able to recover after physical damage to his media... if it was important, he should have had digital offsite backups! And those backups would have been 100% equivalent to the originals.
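
    One way to actually check the "byte by byte" claim for a set of copies is to compare cryptographic hashes. A minimal sketch, assuming the copies are ordinary files you can read; the function names are hypothetical and SHA-256 is just one reasonable choice of hash.

```python
import hashlib


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks to limit memory use."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def find_bad_copies(original, copies):
    """Return the paths of copies whose contents differ from the original."""
    expected = file_digest(original)
    return [str(c) for c in copies if file_digest(c) != expected]
```

    Running find_bad_copies() each time the archive is rolled onto new media tells you which copies have silently rotted, so they can be replaced from a good one.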

    --
    Qxe4
  12. 1% = Total Loss? by JesseL · · Score: 3, Interesting

    If losing 1% of the data on a CD means the data is a total loss, doesn't that say to you that you should be using a file system and data formats with more redundancy and parity?
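
    As a rough sketch of what "more redundancy and parity" can mean, here is the simplest scheme there is: one XOR parity block over a group of equal-sized data blocks, in the spirit of single-parity RAID. This is an assumption chosen for illustration, not something any particular file system or disc format in this thread is known to use, and the function names are made up. It can rebuild exactly one lost block per group, provided you know which block is missing.

```python
def xor_blocks(blocks):
    """XOR a list of equal-length byte blocks together."""
    length = len(blocks[0])
    if any(len(b) != length for b in blocks):
        raise ValueError("all blocks must have the same length")
    out = bytearray(length)
    for block in blocks:
        for i, byte in enumerate(block):
            out[i] ^= byte
    return bytes(out)


def make_parity(data_blocks):
    """Compute the parity block stored alongside the data blocks."""
    return xor_blocks(data_blocks)


def recover_block(blocks, parity):
    """Rebuild the single missing block (marked as None) from survivors and parity."""
    missing = [i for i, b in enumerate(blocks) if b is None]
    if len(missing) != 1:
        raise ValueError("single parity can rebuild exactly one missing block")
    survivors = [b for b in blocks if b is not None]
    return xor_blocks(survivors + [parity])


if __name__ == "__main__":
    blocks = [b"abcd", b"efgh", b"ijkl"]
    parity = make_parity(blocks)
    damaged = [b"abcd", None, b"ijkl"]  # the second block is unreadable
    print(recover_block(damaged, parity))
```

    The cost is one extra block per group; surviving more than one loss per group needs a stronger code.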

    Of course for the ultimate in durable electronically readable storage you should be burning everything to PROMs.

    --
    "Prefiero morir de pie que vivir siempre arrodillado!"
    1. Re:1% = Total Loss? by the+eric+conspiracy · · Score: 3, Informative

      CD Paranoia. I've used it to recover CDs that a $1000 player choked on.

  13. We can take this seriously. by Lethyos · · Score: 3, Informative
    --
    Why bother.
  14. I've said it before and I'll say it again... by eno2001 · · Score: 3, Funny

    ...the solution is simple. We need a way to take a quantum snapshot of the whole of the Earth at least once every 24 hours and then to send that data out into space as a broadcast in all directions. To retrieve the quantum structure, we'd simply pop out of a wormhole near where the data is passing and retrieve it, then retransmit it back to here and reconstruct the Earth as it was before catastrophe struck. The nice thing about this is that if we can find another M class star like Usolia (our sun), we don't even have to beam the data through the wormhole. We could just intercept it near the star and start the assembly process there. Point-in-time restores for the whole of the planet. Imagine that. You're welcome.

    --
    -"...bad old ideas look confusingly fresh when they are packaged as technology" - Jaron Lanier (Digital Maoism on Edge.o
  15. Remember the "Domesday Book" by hopbine · · Score: 3, Interesting

    In the 1980s they digitized the Domesday Book. Trouble was, the format they used is now obsolete. The good news (apart from still having the original) is that they have re-invented the wheel. See http://news.bbc.co.uk/2/hi/technology/2534391.stm for details.

    --
    Semper ubi sub ubi
  16. Umm.. by phasm42 · · Score: 4, Insightful

    If a CD had been submerged in water, it would've been fine. There's no point in making the comparison if it wouldn't have been damaged in the first place. They need to find a better example.

    --
    "No one likes working in a hamster wheel, and your shop smells of cedar shavings from here." - TaleSpinner
  17. It's already happened/happening. by Kadin2048 · · Score: 4, Insightful

    This also is the response to the other big cry-wolf thing, "What happens when the data is in a format that's too old???!!11one" The answer is we just keep copying it to new formats. I have digital copies of papers that I wrote in high school. They were written on an old copy of Works for Windows 3.1 and usually saved to floppy. I don't have a floppy any more but it isn't a problem. I long ago transferred them to a hard drive and I just keep transferring them to new drives when I get them. I also periodically load the old documents into whatever my current word processor is, convert them, and re-save them in a new format.

    I think you're missing an important element here. As you move along in time, the volume of data that must be converted to the format du jour only gets bigger and bigger.

    For a single person, it's probably not too bad. I, too, have pretty much everything I ever wrote since I first got a computer, and every few years I've committed to rolling the whole thing onto new media. So I've gone from offline backups on floppies, to Zip disks (in retrospect a mistake), to CDs, to DVD-R, and now to DVD+R (the -R discs were crappy and I've since heard that +R is a superior format anyway). This isn't much trouble, because the amount of data I have to backup hasn't really grown that much faster than the data density of available media. I'm probably up to a couple of DVDs for the stuff I really, really care about, maybe a binder if I include all the photos and video.

    But what's a basic Saturday-afternoon copy-and-burn job for an individual is a Sisyphean task for a large government agency or library, particularly one who is constantly generating new content. I've seen places that could barely keep up with archiving the stuff they were producing, much less roll their vast archives forward onto new media. So they'd have vaults of hard drives, sitting next to DLT cassettes, next to IBM 3480, next to racks of old half-inch open-reel tapes. Probably back in some dark corner there were piles of punched cards; it really wouldn't surprise me. The problem of data loss due to unreadable formats isn't some abstract 'maybe,' it's already happened in a lot of places (but nobody really wants to talk about it, so it mostly gets buried and whatever's on the tapes gets written off).

    The reason why there's so much interest in preservable formats is because while it may not be strictly impossible to constantly roll old backups and archives forward, it's very hard, and requires vast amounts of effort and expense. If you have a backup that's being written into a format that you know is going to be readable for a long time, even if it's more expensive to write initially, you can save a lot of money and time down the road by not having to copy it forward as often.

    People may get a little shrill when they're talking about these issues, but they're quite real.

    --
    "Ladies and gentlemen, my killbot features Lotus Notes and a machine gun. It is the finest available."
  18. data type is more important than medium by benmoreassynt · · Score: 3, Interesting

    This is a dual problem:

    1) Digital data needs to be moved about once every 5 years onto a new physical store, disk, whatever. Think of the amount of data sitting around on floppy disks that is being lost as we speak.

    2) Data has to be recorded in a way that presumes whatever software you use to create it will not exist in the future. Anyone who saved their life's work in some ancient binary word processor file will know what I mean. For most computer-based data storage, that requires that data be stored somewhere in plain text, using as open a format of 'markup' as possible, if any.

    In effect, from a historical/archival point of view, data does not exist unless it is kept in at least two places at all times, and unless whatever bit of software you use to create it can also save it in a non-binary format of some sort for access for future generations who don't have a copy of your software.

    Ok, that does not pertain to sound recordings or images, but even then some sort of 'permanent' standard is essential for all data.

    I used to work with medieval documents written on vellum - sheep skin. The original Domesday book was written on vellum, and is as readable today as it was in 1150. (It also doesn't need a power supply to work!) Meanwhile the digital 'Domesday' Laser Disk made in the early 80s in the UK had to be saved from oblivion a few years ago (with a great deal of work) because the computers and hardware that it was created to work with were utterly obsolete. Fortunately, and unusually, someone realised the problem before it was too late.

  19. Roman & Greeks != European & Native Americ by CasperIV · · Score: 3, Insightful

    The European invasion of North America hardly constitutes a genocide. The sole purpose was not to eradicate a race, but to destroy the fabric of the culture and remove them from the land. I do believe I have friends that have some Native American ancestry... The only difference is that it happened in a modern era, and the conquered people were allowed to retain some continuity. People act as if the inhuman treatment that befell the natives was in some way out of the ordinary for human nature. You cannot compare the destruction of the Native Americans to Rome conquering Greece. Greece was a well-developed empire that fell to another and was absorbed. There were technological and racial similarities that promoted integration. By comparison, the native people of North America had no such technology or literature, and had no relationship with the Europeans. In the beginning people negotiated, but the problem is that negotiations are a farce, and they only matter if neither side has an advantage. In the case of the Native Americans, they never really had a choice, and some of them knew it. They had absolutely no chance against European powers simply because of the lack of technology and cultural cohesion. One thing that people forget is that the idea of a superior people has been around forever and still continues. It is part of the human psyche and almost every major religion in the world. Don't think of it so much as a racial superiority, but rather religious. This is very much what is going on in the Middle East and why they can't have peace. The religions of the region believe they are chosen to possess the holy land, and they can't let the subhumans have it. This has happened throughout all of history to every race in the world (even among the same peoples)... just this one was more well documented.