MXF+JPEG-2000+HDD = Future of Video Preservation?
Anonymous Archivist writes "Media Matters, a technical consultancy specializing in archival audio and video material, recently completed a Mellon Foundation-funded Digital Video Reformatting Preservation Project for the Dance Heritage Coalition. They conclude that MXF is the recommended container format, JPEG-2000 is the recommended encoding format and HDD is the recommended storage medium. It's a very valuable series of experiments and offers a strong indication of where the archival preservation of analogue video is heading."
OK, let's talk archiveability. Let's talk about a medium that you can leave in a shoebox for a hundred years and read just by shining a light through it. I'm not talking hypothetical here - this technology is proven by the fact that people used it a hundred years ago and it worked. And the technology is even better now, even more stable.
I am of course talking about film. It is very, very easy now to write digital images onto film, not very much more difficult than it is to scan film. There's no need to worry about whether the file format will be supported in the future, as I've already said. You don't need to shovel money into vendors' pockets every few years just to copy it to the latest, trendiest type of disc. You can build a machine to project film out of junk if you need to, or you can scan it if you want a digital image, and when you have a better scanner (e.g. a higher DMax), you can just scan it again.
The dude who wrote this report is just blowing smoke. He's trying to sell snake oil.
If you mirror across two disks and put them into storage, and one develops some minor errors, it is not possible to tell which one has the errors unless the data itself stores error checking and correction information. This is why RAID-5 was invented. Using 3 drives you can identify and repair any errors that develop on any one drive.
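A minimal sketch of that idea, in Python, for anyone who wants to see it concretely. It is not how any particular RAID controller works: it just keeps two data blocks plus an XOR parity block, and stores a SHA-256 checksum per block so the damaged one can be identified. The per-block checksums, the function names and the sample data are all assumptions made for illustration.

```python
import hashlib

def xor_blocks(a: bytes, b: bytes) -> bytes:
    """XOR two equal-length blocks byte by byte."""
    if len(a) != len(b):
        raise ValueError("blocks must be the same length")
    return bytes(x ^ y for x, y in zip(a, b))

def digest(block: bytes) -> str:
    """Checksum used to tell a good block from a damaged one."""
    return hashlib.sha256(block).hexdigest()

def make_stripe(d0: bytes, d1: bytes):
    """Return three 'drives' (two data, one parity) and their checksums."""
    blocks = [d0, d1, xor_blocks(d0, d1)]
    return blocks, [digest(b) for b in blocks]

def repair(blocks, checksums):
    """Find the one block that fails its checksum and rebuild it."""
    bad = [i for i, b in enumerate(blocks) if digest(b) != checksums[i]]
    if not bad:
        return list(blocks)
    if len(bad) > 1:
        raise ValueError("more than one damaged block; cannot repair")
    others = [b for j, b in enumerate(blocks) if j != bad[0]]
    fixed = list(blocks)
    # Any one block is the XOR of the other two.
    fixed[bad[0]] = xor_blocks(others[0], others[1])
    return fixed

blocks, sums = make_stripe(b"frame-0001", b"frame-0002")
blocks[1] = b"frame-0OO2"          # simulate silent corruption on one drive
restored = repair(blocks, sums)
```

With only two mirrored copies and no checksums, the same corruption would leave you with two differing blocks and no way to decide which one to trust.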
If you just mirror it on two hard drives and then put them into storage, they will last for a very long time.
Technically true, but my experience indicates that the most likely time for a drive to fail is when you power it up after a long period of inactivity. It's not exactly optimal to store your data for years only to have the drive(s) die when you first try to read from them...
I am TheRaven on Soylent News
It would be smarter to use PAR2 (or similar) on a filesystem basis than to use a RAID filesystem. It's easier to deal with user-space programs for reconstructing data.
"You're right," Fisheye says. "I should have set it on 'whip' or 'chop.'"
Definitely not!
Most if not all peer to peer networks require a certain level of interest in an item for it to be retained. Popular items are always easy to find while obscure / old items gradually disappear from the network.
Try finding a movie that's a few years old. You'll have more trouble finding the original Jurassic Park than Jurassic Park III.
Peer to peer is not a great way to reliably and systematically preserve cultural heritage.
There will always be multiple backup solutions, but the biggest trend continues to be towards using hard disks for backup. When your data files are enormous (such as with audio/visual data), HDD backup is even more attractive.
All this is is a method to line some guy's pockets. I'm sure the tape guys are gonna say, use XYZ type of tape. The disk guys are gonna say disk.
What makes this guy think that the interface to the HDD is going to be around in X years?
PCs have only had two dead (non-(E)IDE/ATA) interfaces: ESDI and ST-506/ST-412.
But what if you were trying to find a computer with an IPI (1960s mainframe) interface?
The Fed gov't has this problem with trying to find parts for their old 8/9-track tape drives.
Here's a good list of all the HDD interfaces over the years: http://www.i-t-s.com/corporate/terms.html
Stick with microfiche, film, that way we don't have to pay some vendor $$$/yr to keep alive a dead technology or pay some other vendor $$$/media to move them from old to new media.
Well, that's more a function of the cost and size of data storage. Give me a petabyte of solid-state nonvolatile storage, and I'll toss Jurassic Park I in there for giggles, along with '20s silent films, clips from "Bozo's Circus" on Chicago's WGN in 1969, the collected books of mankind, complete 3D terrain maps of Mars and every old-time radio recording in existence. Gimme a $200 unit that does this, and I'll preserve anything I can get my hands on!
And if you can spare the space, a directory with a wav file and a stack of uncompressed TIFF images is even better. Compression formats are complicated to reverse engineer.
Store .mng + .flac + source code for libmng and libflac, and you don't need to worry about any sort of complicated gnireenigne.
Because when you're archiving digital data, recoverability is paramount. You have to ask yourself, "What if all I had was a piece of this data, say, a hundred gigabytes from the middle of the disk? Could I turn that data into useful information?"
If you're dealing with a run-length-encoded array of packed pixels, the answer is obviously yes. That's among the simplest forms of encoding known. (If you don't RLE the data it's even simpler, but a trade-off between simplicity and storage requirements is okay as long as you maintain a lot of simplicity.) Even if you don't know how the data was encoded, you've got a good chance of figuring it out just by doing some simple analysis on the bytes. But with a complex encoding scheme, it's much more difficult to figure out what you're dealing with just by looking at it.
When talking about archiving, the objective is to be able to recover as much as possible given as little as possible.
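To make the recoverability point concrete, here is a rough Python sketch of a byte-oriented run-length encoding. The layout (pairs of count byte, value byte, runs capped at 255) is an assumption chosen for illustration, not any specific file format. The thing to notice is that a fragment cut from the middle of the encoded stream still decodes to a contiguous piece of the image with no header, tables or prior state.

```python
def rle_encode(pixels: bytes) -> bytes:
    """Encode pixels as (count, value) byte pairs, runs capped at 255."""
    out = bytearray()
    i = 0
    while i < len(pixels):
        run = 1
        while (i + run < len(pixels) and run < 255
               and pixels[i + run] == pixels[i]):
            run += 1
        out += bytes((run, pixels[i]))
        i += run
    return bytes(out)

def rle_decode(data: bytes) -> bytes:
    """Decode (count, value) pairs; a trailing half pair is ignored."""
    out = bytearray()
    for k in range(0, len(data) - 1, 2):
        out += bytes([data[k + 1]]) * data[k]
    return bytes(out)

pixels = bytes([0] * 5 + [255] * 3 + [7] * 4)
encoded = rle_encode(pixels)

# Pretend everything before byte 2 of the archive was lost.
fragment = encoded[2:]
recovered = rle_decode(fragment)   # the tail of the original pixel row
```

If the fragment happens to start on the wrong byte of a pair, the output looks wrong, but there are only two alignments to try. Compare that with a wavelet or DCT codec, where a mid-stream fragment may be useless without headers and entropy-coder state.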
indeed, lossless for archival preservation is the
only way, as it fits the basic rule of art restoration
technology -- never apply "improvements" which
cannot be reversibly undone to take advantage
of future science.
ironically then, the lossless format doesn't matter.
however, at least for the instant case of dance video,
the likely input (a myriad of digital tape formats)
is hopelessly neanderthal -- anything having to do with DV,
or MPEG, or even ATSC HDTV already tosses away much
color information. (4:1:1, 4:2:0, and 4:2:2 colorspace is embarrassing
to preserve "losslessly".) ditto for temporal
info, with interlacing being the culprit. even film at
24fps just will not cut it for motion such as dance.
so here's to better camera technology, whether it's
10- or 12-bit 4:4:4 RGB, or something like
carver mead's foveon made swift.
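For a sense of how much color those subsampling schemes discard, here is a small Python sketch using the usual J:a:b reading: in a block J pixels wide and 2 rows tall, each chroma plane keeps a samples on the first row and b on the second. The function names are made up for this example.

```python
from fractions import Fraction

def chroma_fraction(j: int, a: int, b: int) -> Fraction:
    """Fraction of chroma samples kept relative to full 4:4:4."""
    return Fraction(a + b, 2 * j)

def samples_per_pixel(j: int, a: int, b: int) -> Fraction:
    """One luma sample plus two chroma planes at the reduced rate."""
    return 1 + 2 * chroma_fraction(j, a, b)

for scheme in [(4, 4, 4), (4, 2, 2), (4, 2, 0), (4, 1, 1)]:
    kept = chroma_fraction(*scheme)
    total = samples_per_pixel(*scheme)
    print("%d:%d:%d keeps %s of the chroma, %s samples per pixel"
          % (scheme + (kept, total)))
```

By this count 4:2:2 keeps half the chroma samples, while 4:2:0 and 4:1:1 each keep a quarter, which is the information no lossless wrapper applied afterwards can bring back.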