Afterlife Will Be Costly For Digital Films
Andy Updegrove writes "For a few years now we've been reading about the urgency of adopting open document formats to preserve written records. Now, a 74-page report from the Academy of Motion Picture Arts and Sciences warns that digital films are as vulnerable to loss as digitized documents, but vastly more expensive to preserve — as much as $208,569 per year. The reasons are the same for video as for documents: magnetic media degrade quickly, and formats continue to be created and abandoned. If this sounds familiar and worrisome, it should. We are rushing pell-mell into a future where we only focus on the exciting benefits of new technologies without considering the qualities of older technologies that are equally important — such as ease of preservation — that may be lost or fatally compromised when we migrate to a new whiz-bang technology." Here's a registration-free link for the NYTimes article cited in Andy's post.
"Only wimps use tape backup: real men just upload their important stuff on ftp, and let the rest of the world mirror it."
- L. Torvalds
Here's to hoping for a brighter future... for our children.
Why is it more expensive to preserve a bunch of bits and bytes than, say, a reel with analog information, printed on some soon-to-be-brittle plastic? I'm very sure the latter will decay in a quicker fashion.
Preservation was a lot easier when the media lasted longer but by far the largest problem is the increase in the amount of data.
What is interesting is that old analog film & tape also degrade, but do so more gracefully. They also get degraded by reading, not just by storage. Archives of old footage etc. have largely been converted to digital to allow older signals to be accessed without damaging the originals.
Engineering is the art of compromise.
I agree. Even if there is gonna be some overhead (pay the guy who swaps out broken drives), it's never, ever gonna be over 200 Grand per movie.
That's silly. I'm pretty sure some "Consultant" came up with that figure.
release the file into the public domain and put it out on bit torrent? You'll get lots of backups made, for free. It will get converted to new formats, and backed up again, for free. Oh, you want future profits? Then quityerbitchin about the archival costs.
"National Security is the chief cause of national insecurity." - Celine's First Law
I can't help but relate some personal experience here. I know it's not production quality, or lots of information, but I recently pulled out my Apple IIe from storage. It included the original 5 1/4 floppy disks and drives.
There was also a cardboard box with ~150 floppy disks, some as old as 20+ years. NOT A SINGLE ONE WAS BAD. Yes, "Zork" still works!
Could it possibly be that the quality of media just isn't up to the demands of a longer life of storage anymore? We all know how Cadillac runs that racket, as in sell the crappy car, and make the money off replacement parts. Has media storage gone the same way? As in 'sell the media, but just good enough to work for x years' before being replaced. And with the demands to increase revenue year over year for public companies, perhaps that time-frame has become shorter and shorter over the years to keep the money flowing in.
Or am I just being too cynical? But you know, a world where such works as "Zork" can survive and "Legally Blonde" cannot, on their respective media, might not be that bad.
As jonadab once put it:
:-)
> Those who do not study history are doomed to repeat it
Yes, and those who do study history are doomed to watch in frustration
as it is unwittingly repeated by those who do not
Please correct me if I got my facts wrong.
The reasons are the same for video as for documents: magnetic media degrade quickly,
The myth of bit rot on hard drives is just that- a myth. It's been perpetuated for two decades by the idiot Steve Gibson, selling his own snake oil (Spinrite), and unfortunately, not enough people are calling him on it. I thought it actually did something too, until I read that post from someone who actually knows how modern drives work. As the author points out, there's a track that can only be written at the factory, and if what Gibson claimed were true, ALL drives would be dying left and right after a few years. Funny how I've found drives made almost a decade ago working just fine now...
The problem hasn't changed; it's mostly obsolescence in drive interfaces, and the drives themselves (for tapes.) PATA is common these days, but everything is going towards SATA, for example.
Both DAT and 8mm were in common use as little as 6-7 years ago...but you'd be fairly hard pressed to find a place to buy either now save eBay. And...do YOU want to entrust a backup to an eBay drive?
Please help metamoderate.
It's not just the finished product, it's all the footage they keep around for different editions, remastering, deleted scenes, etc. And the source material is often not compressed in a lossy format. Sure, 4000 TB will store a lot of DVDs, but it won't store many movies in raw format. And only a fool wouldn't also have backups.
Be relentless!
The answer is simple, copy it over frequently.
Yeah, from the article, several silly things are going on here:
Find free books.
If they want to permanently archive digital media, why not just keep the DVD glass masters around? They shouldn't degrade like plastic, and if carefully packaged it seems that they could last for millennia. If a special reader were developed that could optically scan the glass surface without the need for a rot-prone metal layer, then the information could be retrieved without having to risk damaging the master by making a new pressing.
Where he compares salt mine storage of analog media to storage of digital media, and decides to just multiply his made-up $208k figure by 100 years to come up with.. wait for it... $208 million. I guess that's why he went into journalism and not the sciences.
Leaving out the humongous math error, why can't you just store the digital fucking media in the same salt mine? The things that damage analog film are the same things that damage digital media.
Is it any wonder we have the expression "lies, damned lies, and statistics"? This article is all three, with some incompetency thrown in.
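For anyone who wants to check the multiplication being complained about, here is a minimal sketch. It uses the $208,569 per year figure from the summary and assumes, purely for illustration, a flat annual cost with no inflation or price changes over the century.

```python
# Annual figure quoted in the story summary (upper bound, per digital master).
ANNUAL_COST_USD = 208_569
YEARS = 100

# Assumption: the cost stays flat for the whole period.
century_cost = ANNUAL_COST_USD * YEARS

print(f"${century_cost:,}")  # $20,856,900
```

That is roughly $20.9 million, a factor of ten below the $208 million figure.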
It's rare that you're presented with a knob whose only two positions are Make History and Flee Your Glorious Destiny.
They had to hire an MSCE to migrate the data from proprietary Windows Long-Term Archival Backup Media Video format (.wltabmv) to the new, safer (from pirates and such, arrr) long-term Windows Long-Term Protected Archival Backup System Against Pirates And Intellectual Property Theft Format Media Video (.wltpabsapaiptfmv)
And some, god help them, migrated to Apple's Almost Better Than The Competition So You Can Feel Better About Using A Proprietary Format For Only Three Dollars a Pop Codec (.aabttcsycfbauapffotdapc). Those Apple Engineers cost bocoup bucks.
Please stop stalking me, bro.
1/ Draw each frame on a sheet of papyrus, staple the whole thing together on one edge, making a flip book, and hide the whole mess in jars in caves in the desert. Don't forget to include copies of the scripts.
2/ Devise an obscure religion based on your film, spread it to as many people as possible.
3/ Wait.
As nearly as I can tell, the whole concept of recorded history probably ended when we developed means to record reality directly, rather than transcribing it to clay slabs, stone, and paper.
I'd try flash memory or maybe even a punch-card-type system with machine readable data printed/stamped/cut on paper.
Yes, a punch-card system is perfect...until somebody drops the deck...
ZuluPad, the wiki notepad on crack
With all the push by the various arms of the media industry to keep finding ways to continue to generate revenue from their products, I'm sure they'll be pushing the envelope with long-term storage solutions. Large capacity storage used to be considered anything greater than 1 GB with technology that was available "way back when" (not that long ago, really). Nowadays, that's a ridiculously small amount of storage that I can (and do) carry around in my shirt pocket.
Computing power used to be awfully expensive, too. Now we've got desktops that are capable of scientific computing sitting around at 99% idle all day. If it weren't for Vista, we wouldn't even be using a tenth of the memory built into them (sorry, had to stick a dig in there somewhere).
My point is that as the market demands new capabilities, technologies emerge that satisfy those needs. As time goes on, the efficiency of these technologies increases while costs decrease. It's just how things work. Today's data retention problems for studios will contribute to tomorrow's advances in long-term storage technology.
I can think of at least a couple of major companies that also have a vested interest in long term archival... Google... cough... Google...
512 MB RAM, 20 GB disk, 200 GB transfer, five datacenters. $19.95/month.
Analog also decays. The difference is that it is easier to pull SOMETHING out of it as it decays. The downfall of analog is that it is MUCH more expensive to protect.
Back in 90/91, I worked for a company that did burning of CDs and Laserdisc (compressed data for the DOD). The CDs cost something like 5 or 10 each, and the laserdiscs were a couple of hundred each. IIRC, these were based on gold, and would last something like 50 or 100 years without losing a single pixel. I would guess that Hollywood could easily afford these.
I prefer the "u" in honour as it seems to be missing these days.
The standard motion picture format is MJPEG2000. It's not a very efficient format, but it's well defined and going to be around for a long time: there's both a lot of hardware and software that relies on it, and it scales up to high resolutions.
The consumer format wars between Microsoft, Apple, Sony, and other companies have no influence on this.
How about printing a few copies of a binary bar-code record in big books of archival quality paper for terms of a few centuries? Or how about blowing the bit pattern into any other format with some longevity on some nice passive substrate like a non-flowing glass if you'd like to keep them for a few millennia? Two hundred plus grand a year per film to maintain, my aching ass. Give me two million bucks - the supposed cost to archive just ten films - and I *guarantee or your money back* that I can design (and build a prototype of) an archive system that will reliably maintain digital films such that they can be recovered many centuries from now with no more "yearly archival cost per film" than a roof over its digital head. Error correction and all. All this story demonstrates is that someone isn't taking proper advantage of the technical community.
I've fallen off your lawn, and I can't get up.
Once again repeat after me... the benefit of digital is not that it LASTS FOREVER or is EASIER TO PRESERVE. It is that it is EASY TO COPY.
Who gives a rat's ass if a given copy of a film will degrade in 10 years? I can make a 100% perfect copy of the thing in minutes. Copy the data every year. Hell, copy it 100 times. Copying also makes the obsolescence of formats meaningless.
I still have emails and RTF documents written in 1994. These are 100% perfect copies of the original data. Is that somehow to be interpreted by brain-dead fear-mongers that any day now my data will be "obsolete" since the obviously 15-year old media is almost degraded beyond recognition? Or are people a bit more intelligent and realize I have already copied this from hard drive to disc and back about 30 different times?
A game of 52 Million Card Pickup, anyone?
Current movies are already printed to film for viewing in theaters, so the problem isn't at a crisis point yet. The problem will come when major film manufacturers quit making movie film.
If the major studios demand it and are willing to pay higher prices for low manufacturing runs, film manufacturers will still make the film. I predict this will happen for the foreseeable future.
By the way, nothing but cost says you can't take each element in a digital scene and print it out to its own frame in addition to or instead of printing out the movie frame-by-frame. Also, nothing says you have to use 35 or 70mm format: If your original digital image has more resolution than you can store on 70mm film you can use a larger format.
You can also use microfilm techniques to print technical information such as the descriptions of camera angles and even computer data files and computer code in human-readable, hex, or some other form to film for archiving, along with the computer code for the programs and enough information to build a virtual computer to interpret that code. Sure, it's a lot of information but remember, the goal is to put all of the information in a storage box and be able to retrieve it in 100 years and make use of it.
If they had done this level of preservation with old NASA computer data and data-descriptions we wouldn't have some of the problems we are having today with un-interpretable data.
Knowledge is how to play a game, intelligence is how to win, wisdom is knowing what game to play.
First, what kind of film was it that had a tendency to burn? Nitrate-based film?
Second, I just heard that the studio that produced Aerosmith's first album has lost the masters, so they're going to re-record it.
This kind of problem isn't new, and blaming it on electronic media is silly.
Yes, you do have to take steps to ensure the availability of it in the future - but the same is true of analog versions too. If you don't have a good filing system, or your 'vault' is the backseat of a car in southern California, the reels are going to get damaged/destroyed/lost, too.
I was on a railroad photographers' list for a while, and I remember the digital/analog debate came up one time. Someone said, "I'll be laughing when you lose all your files because your hard drive crashed and don't have pictures any more!" Obviously he never considered he could easily lose his negatives/slides, or have them damaged in a flood or fire. Analog media has different risks and storage requirements, but they BOTH require proper storage. (And, frankly, digital has the additional advantage that it can be easily backed up at multiple sites with no loss in quality.)
Technicolor dye transfer (imbibition) prints were much less fugitive. Color separations onto black and white film stock (often termed YCM for yellow, cyan, magenta) are much more robust. Production of these separations (and imbibition relief "matrix" films) was intrinsic to the Technicolor printing process (even if the film was shot in conventional tripack negative, then transferred to Technicolor for printing), and films where these intermediates were saved (or where someone presciently thought to have a set of YCMs made), are much safer for the future than anything kept only on color stock.
In the 70s there were some photo places (especially in Los Angeles) that marketed Eastman Color Negative 5247 movie film (short-end remnants from the movie industry) as a cheaper alternative for 35mm color negative still photography, and printed this onto 5283 color print film (same as movie prints) for 35mm slides.
I recently found a few boxes of these that I had shot back then (and stored under entirely careless, or Arrhenius/Murphy if you prefer, conditions). I am not good at evaluating color negatives by eye, but the positives were faded either to mutated colors or to almost nothing.
Even simple technologies can have amazingly short shelf lives under conditions of disuse. I recently turned on my stereo system after close to 3 years of not being used. The amplifier, CD player, and LP turntable all failed to operate. Part of this might have been due to de-formed electrolytic capacitors; these appear to have more-or-less repaired themselves after a couple of hours with the power turned on. Both the CD player and the turntable suffered additional electromechanical problems that required a combination of manual exercise and cleaning to rectify.
None of these devices have anywhere near the scary sophistication of a modern hard disk drive.
Seeing as I cannot remember what I last set my external firewall password to, imagine the additional challenge of future Hollywood being bitten deeply in the butt by present Hollywood's favored time-bombed destined-to-be-lost-art proprietary DRM technologies, with the keys long since dissipated in Hollywood's perennial miasma of mergers, acquisitions, lawsuits, cocaine, and personal vendettas.
Maybe they are living in 2007, where they are paying a $200,000 a year licensing fee to a patent troll who got a patent for "A business process which preserves digital motion pictures".
In all seriousness, the biggest obstacle to preserving a history of our culture is copyright. If the owner of the copyright doesn't care to preserve the piece of our history that they have their monopoly on, the information will simply deteriorate and there is nothing legally that can be done about it. We can only hope that the evil dirty thieving pirates save our history for future generations.
DPX or TIFF image sequences. (These are the standard formats for high-end digital post production already.)
This space unintentionally left unblank.
The DPX format commonly used for digital post production uses about 35 megabytes *per frame*.
My calculator says a 2 hour movie at 24 frames/sec will have about 175,000 frames.
A few more button presses tell me that's a bit north of 6 terabytes of data.
Let's quadruple that to include all the cut scenes and unused footage, to 25 terabytes.
TB drives are available now for $400 or so each. They use under 10 watts idle.
Building a 30 drive RAID would thus cost $12,000, and require perhaps 500 watts if run constantly, including cooling. Let's bump that to $15,000 to pay for controllers and chassis.
Three such arrays (in case of earthquakes, etc... keep 'em at opposite ends of the continent) would cost an initial $45,000, take up perhaps 7u of rack space, and need 50 kWh per day for all three. At 30 cents per kWh, that's 15 bucks a day, or $5500 per year. Let's double that, assuming those 7u cost you $5500 a year.
So... my numbers, triply redundant, come to an initial investment of $60,000 (profit, hey!), and a yearly cost of $20,000 (more profit!).
How the hell they came up with $208k is beyond me. I'm thinking I should start a company that does this for the studios, it's looking quite lucrative.
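The back-of-the-envelope numbers above can be replayed in a few lines. This is only a sketch of the estimate as stated in this comment: every constant is the commenter's own round figure, not a measured cost, and the variable names are made up for illustration. Terabytes are decimal (1 TB = 1,000,000 MB).

```python
# All inputs are the rough figures quoted in the comment above.
FRAME_MB = 35                 # DPX size per frame
FPS = 24
RUNTIME_SECONDS = 2 * 60 * 60

frames = FPS * RUNTIME_SECONDS               # 172,800 ("about 175,000")
movie_tb = frames * FRAME_MB / 1_000_000     # about 6.05 TB
archive_tb = 4 * movie_tb                    # about 24.2 TB, rounded up to 25

DRIVES_PER_ARRAY = 30
DRIVE_COST_USD = 400
drive_cost = DRIVES_PER_ARRAY * DRIVE_COST_USD   # $12,000 in bare drives
ARRAY_COST_USD = 15_000       # bumped for controllers and chassis
SITES = 3
initial_hardware = SITES * ARRAY_COST_USD        # $45,000

KWH_PER_DAY_ALL_SITES = 50    # the comment's rounded-up power draw
CENTS_PER_KWH = 30
power_per_year = KWH_PER_DAY_ALL_SITES * CENTS_PER_KWH * 365 / 100  # $5,475
running_per_year = 2 * power_per_year            # doubled to cover rack space

print(f"frames: {frames:,}")
print(f"one movie: {movie_tb:.2f} TB, with extras: {archive_tb:.2f} TB")
print(f"initial hardware: ${initial_hardware:,}")
print(f"power and space per year: ${running_per_year:,.0f}")
```

The unrounded results come in slightly under the comment's figures, which were all rounded up before the quoted prices of $60,000 and $20,000 were set.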
..in that order.
Yes - You don't need to have 5.25" drive now to read back data that you stored onto an 'old' IDE drive 2 years ago. And that's a bad example because you can still get 5.25" drives. 200 years from now when we're working with crystalline storage methods, we won't have to read back from HDD platters.. just from the holographic storage drives that things were transferred to with the last generation of storage devices.
Will we still have film projectors 200 years from now? Possibly not.
Who cares - because the formats used to store digital film aren't exactly H.264 or whatever fancy-schmancy codec the copyright-infringent care about; google 'digital intermediate'. And yes, those formats do tend to change, but they all remain lossless and, again, things can be transferred with each generation.
Will we still know what to do with film 200 years from now? Ahhh.. there's the kicker.. probably, yes.
This is also where the cost comes in - you have to keep upgrading to the latest formats and the latest storage devices to ensure that there will be no 'digital divide', so to speak.
With film, you don't incur this cost. It's lossy in an analog sense, but if somebody looks at a film reel 2,000 years from now - and we assume to still have the same visual system in our watersacks - it will be trivial for them to see, literally, that it is a series of pictures which, in succession, appear to animate. Even if there's no device to play them back then, it would be trivial to build one from scratch using very rudimentary knowledge.
With digital, even if you have the latest format and the latest hardware to read the device it's stored on, it is non-trivial for the layman to read this file and be able to put it back into a picture; in fact, it tends to take people with intricate knowledge of the device and the storage format.
Personally I'm all for doing both, costs be damned, if the material is important enough. That said, do we really need to hold on to all material forevermore? Like a history book, it should be enough to retain the highlights (be they positive or negative), and not cling onto minutiae, as a society. Similarly, like family archives, those who believe something to be well worth the preservation for future generations (either within the family or civilization as a whole), will - or at least should - do so on their own and have history prove them right, or wrong.
If you plan to fight entropy, you're on the losing side. EVERYTHING degrades eventually.
Seven puppies were harmed during the making of this post.
And they use precisely 0 watts when un-powered and in storage...
Apart from the idea that you would not use tapes I am in complete agreement. I would add they are stuck in a 1985 mindset where the internet does not exist.
It is a pretty simple problem to solve. You set up a smallish data centre on three continents. You install some LTO4 tape libraries and start replicating the data to each over the internet. With LTO4 you are looking at ~600TB per 19" rack, and when you are not accessing the data (most of the time) you are not consuming power. Add in some checksumming and patrol checking of the tapes and problem sorted. In 5 or 10 years' time you migrate to some new tape tech. That involves sticking some more frames in, hooking them up and telling the software to copy the data to the new tapes.
Remember as well this is a high assurance system not a high availability system, so some of the expense of a datacentre can be saved. No need for that diesel generator for example because it does not really matter if you cannot access the data today because of a power cut. What matters is that it is preserved and when the power returns you can access it.
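The "checksumming and patrol checking" step mentioned above might look something like the following. This is a rough sketch, not how any real tape library software does it: it works on ordinary files in a directory, and the names `sha256_of`, `build_manifest`, and `patrol` as well as the `reel_*.dpx` file names are invented for the example. The idea is simply to record a digest for every file at ingest, then periodically re-read everything and flag anything that no longer matches, so it can be restored from one of the other sites.

```python
import hashlib
import tempfile
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large files never have to fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root: Path) -> dict[str, str]:
    """Record a digest for every file under root at ingest time."""
    return {
        str(p.relative_to(root)): sha256_of(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def patrol(root: Path, manifest: dict[str, str]) -> list[str]:
    """Re-read every file and return the names that are missing or changed."""
    damaged = []
    for name, expected in manifest.items():
        path = root / name
        if not path.is_file() or sha256_of(path) != expected:
            damaged.append(name)
    return damaged


# Small demonstration with throwaway files standing in for archived frames.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "reel_0001.dpx").write_bytes(b"frame data " * 1000)
    (root / "reel_0002.dpx").write_bytes(b"more frame data " * 1000)
    manifest = build_manifest(root)

    clean_report = patrol(root, manifest)

    # Simulate silent corruption of one file.
    (root / "reel_0002.dpx").write_bytes(b"more frame dat\x00 " * 1000)
    damaged_report = patrol(root, manifest)

print(clean_report)    # []
print(damaged_report)  # ['reel_0002.dpx']
```

Since this is a high assurance rather than high availability setup, a patrol like this can run slowly in the background; a flagged file only needs to be re-copied from another site before the next pass.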
> I know that's modded funny, but that might actually be a very good argument for "open sourcing" movies.
I wouldn't call it "open sourcing" exactly, but let's just say that films won't soon go extinct, at least as long as there are people willing to copy them.
Actually, that's how books survived. The only ancient books we have now are the ones people thought were important enough to copy regularly, plus a few random things that survived for a ridiculously long time.
I'm not convinced we need to keep 90+% of youtube or Friends and similar crap for people to watch 100 years from now.
Engineering is the art of compromise.
No. The trick here is only half archival; the other half - and it's not complex, just apparently not obvious - is that it should take any half-competent tech no more than a day or so to rig up a reader using discrete components of current technology, the task having intentionally been made simple. An optical diode, resistors, a transistor, maybe a lens system and an XY table. Not "drives" and metaconstructs like them. This way, the components can be emulated if required (doubtful, but possible) by higher technology. The format needs to be blind-dumb-simple, as does the error correction; row-column EC will allow recovery of single lost datums and is trivial to implement. If it is easy to do today, it will be easy to do tomorrow. Once that is done, you can construct as sophisticated a reader as you like, all the while knowing that if worst comes to worst, some half-smart high schooler can recover the data given enough time and $100 in parts.
You misunderstood my guarantee, too; I was guaranteeing that I could get the job done and archive, and recover, a movie in this fashion, making a maintenance-free storage method that did not suffer from unrecoverability. I was not guaranteeing the data; they have to provide physical security for it, and I have no control over that, so I couldn't possibly make any promises in that area. I *could* sell them some land in Montana; I just bought two city lots and the 5000 sq ft building on them for 25 grand. Taxes are low, too. ;-) There's plenty more where that came from - hundreds and hundreds of square miles. Thousands, even. Storage space isn't a problem unless they insist it be in LA, which - of course - would be stupid. It should be in a geologically stable area with a high speed pipe and reliable power, that's all.
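To show how little is involved in the row-column error correction mentioned above, here is one possible sketch. It assumes the data is laid out as a grid of small integers (bytes, say) with one XOR parity value stored per row and per column; that layout, and the function names `parities` and `repair_single`, are choices made for this example rather than any particular archival format. A single damaged value shows up as exactly one bad row and one bad column, which both locates it and gives enough information to restore it.

```python
from functools import reduce
from operator import xor


def parities(grid):
    """Return (row_parity, col_parity): the XOR of each row and each column."""
    rows = [reduce(xor, row, 0) for row in grid]
    cols = [reduce(xor, col, 0) for col in zip(*grid)]
    return rows, cols


def repair_single(grid, row_parity, col_parity):
    """Fix at most one damaged value in place; return its (row, col) or None."""
    rows_now, cols_now = parities(grid)
    bad_rows = [r for r, (a, b) in enumerate(zip(rows_now, row_parity)) if a != b]
    bad_cols = [c for c, (a, b) in enumerate(zip(cols_now, col_parity)) if a != b]
    if not bad_rows and not bad_cols:
        return None  # nothing to fix
    if len(bad_rows) != 1 or len(bad_cols) != 1:
        raise ValueError("damage is not a single data value; cannot repair")
    r, c = bad_rows[0], bad_cols[0]
    # Current and stored row parity differ by (original XOR damaged),
    # so XOR-ing that difference into the cell restores the original.
    grid[r][c] ^= rows_now[r] ^ row_parity[r]
    return r, c


# Demonstration: store parity, lose one value, recover it.
data = [
    [0x10, 0x22, 0x37],
    [0x4A, 0x5B, 0x6C],
    [0x7D, 0x8E, 0x9F],
]
row_parity, col_parity = parities(data)

damaged = [row[:] for row in data]
damaged[1][2] = 0x00  # one datum lost

fixed_at = repair_single(damaged, row_parity, col_parity)
print(fixed_at, damaged == data)  # (1, 2) True
```

This only handles one bad value per grid, so a real layout would presumably use many small grids; the point is that the whole scheme is XOR and counting.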
I've fallen off your lawn, and I can't get up.
Although we are probably getting a little too carried away in making everything digital, there is a lot to be said for the long-term storage options of data in an analog form. Even if an item stored in an analog form is destroyed by 50% or more, it's not impossible to recover most of it with fairly reliable accuracy simply due to the amazing ability of the human mind to recognize common patterns and fill in the blanks. Even if the analog were warped out of its original order, odds are good we could recover it.
On the other hand, digital archival of data, which can offer incredible clarity and potentially 1:1 accuracy in restoration often becomes an all-or-nothing proposition if even a tiny bit of the data is lost or altered. Even with file formats/codecs that offer some form of error correction or redundancy, the final result we may end up seeing could be little more than randomized shifts between a blank screen and a perfect image... all of which are swapped in and out so quickly, we may not see the recoverable parts long enough to identify any usable pattern.
For example, try comparing something like the "scrambled" channels (mostly the porn channels) on cable television back in the early to mid 90s to something like DirecTV during a heavy rain storm. Even though the cable stuff was typically visibly warped and uncomfortable to look at, you at least had a good idea of exactly what was going on behind the scrambling, even without the audio channels. But, try watching a DirecTV signal under less than ideal weather conditions, and the best you get is a bounce between a random mosaic and pitch black, combined with severely degraded audio pops here and there. You're lucky if you can even get a useful picture of anything on the screen, let alone being able to comprehend what is going on in the show itself.
That said, how difficult would it be to create a micro-film drive (photosensitive analog scanner/burner) that could not only store any document on a computer in an analog form, but do so in a format that could be interpreted entirely by the human eye using a proper magnifying device? For that matter, why not create a hybrid device that would store both an easily visible analog form of a document as a high-resolution thumbnail, along with a digital version using a pattern of dots similar to how data would be stored on an optical disc? This way, no matter what device you use to extract the information, you'd always have the means to access the data you need.
8==8 Bones 8==8
Like Barbara Cartland? Or Penny Dreadfuls? Or the RFC Archive? Or YouTube?
Huge amounts of fundamental culture simply disappear because it is so transparent or ordinary to those it affects. The next generation comes along and they forget about it because of that apparent mediocrity. For example, breast feeding was normal, ordinary, and public in America up through the 1950's. Movie and later television rule-makers didn't allow showing it unless it was part of some National Geographic type presentation. Today, breast feeding is being re-discovered in a storm of controversy because an entire generation has not only forgotten, but confused the topic with beer commercials.
Then again, how many people want to remember Philippine Midget Snuff films? And why?
Pacifist paratroopers yell, "Gandhi!" when they jump.
I'd imagine the big G would fall over themselves to do it. And it would cost the movie industry zilch.
Flash memory is not really any more reliable than a hard drive for long-term storage. At least if you're talking about the cheap high capacity stuff that you would need to store a Tbyte or two of raw data.
A stack of archival CD-R or DVD-R, or actually pressing a master would let you hold the digital data for a few hundred years quite reliably. Just have a FORMAT.TXT on there to describe the encoding format(s) you used, just in case anyone forgets. And yes, a text file can be 1000 pages long, if it must be.
And the C programming language has been thriving for 30+ years; it might not be too much of an assumption to think someone could dig up a C compiler in 50 years and compile a straight ANSI C program. A program that converts My Weirdo Format(tm) to raw binary frames and audio, with comments in the source code, might be all that is necessary for transferring lost media. I suspect the source code for that could fit on your archival media and would take a tiny fraction of a percent of the space.
I suspect that since CDs and DVDs are so prevalent and such an open format, even a thousand years from now someone will be able to figure out how to read one and copy it to another medium. And the CD format is simple enough that it would be trivial to reverse engineer. If someone dug up our civilization in 10,000 years they could likely find thousands of the various dictionary and language CDs out there to use as a sort of Rosetta stone.
Obviously there would be data loss on 10,000 year old CDs, but theoretically you could pull something off the regular non CD-R kind.
“Common sense is not so common.” — Voltaire
...and then you need system administrators and a repair/replacement budget and technicians to do it plus a network connection and an extremely patient help desk support (given the MPAA's demonstrated understanding of technology). This will also mean you need a manager and a building to put the racks in, security staff etc etc etc. The staff and building can certainly be shared by multiple "films" but I can well imagine the costs of all these staff and their overheads will add considerably to the cost.
Beaucoup. It is spelled Beaucoup.
mod it 'informative', mes petits.