Slashdot Mirror


MXF+JPEG-2000+HDD = Future of Video Preservation?

Anonymous Archivist writes "Media Matters, a technical consultancy specializing in archival audio and video material, recently completed a Mellon Foundation funded Digital Video Reformatting Preservation Project for the Dance Heritage Coalition. They conclude that MXF is the recommended container format, JPEG-2000 is the recommended encoding format and HDD is the recommended storage media. It's a very valuable series of experiments and offers a strong indication of where the archival preservation of analogue video is heading."

44 of 214 comments (clear)

  1. Re:Lossy file formats... by iezhy · · Score: 5, Informative

    The JPEG standard defines several encoding formats, which include lossless compression as well

  2. Re:MXF? by Raul654 · · Score: 5, Informative

    " The Material eXchange Format (MXF) is an open file format targeted at the interchange of audio-visual material with associated data and metadata. It has been designed and implemented with the aim of improving file based interoperability between servers, workstations and other content creation devices. These improvements should result in improved workflows and result in more efficient working than is possible with today's mixed and proprietary file formats." -- What is MXF

    --


    To make laws that man cannot, and will not obey, serves to bring all law into contempt.
    --E.C. Stanton
  3. JPEG-2000 by Anonymous Coward · · Score: 3, Interesting

    Why would they go with a compression format that doesn't do inter-frame compression?
    It might be nice for editing, but you could get more quality in the same space with something like h264, or even h263 if they have to do this right now (i.e. before h264 is quite ready for prime time).

  4. Recommended Storage Media by Antonymous+Flower · · Score: 5, Funny

    Recommended Storage Media: Peer to Peer network.

    1. Re:Recommended Storage Media by remahl · · Score: 4, Insightful

      Definitely not!

      Most if not all peer to peer networks require a certain level of interest in an item for it to be retained. Popular items are always easy to find while obscure / old items gradually disappear from the network.

      Try finding a movie that's a few years old. You'll have more trouble finding the original Jurassic Park than Jurassic Park III.

      Peer to peer is not a great way to reliably and systematically preserve cultural heritage.

    2. Re:Recommended Storage Media by fred911 · · Score: 2, Funny

      Along with the recommended exchange format:

      Multi RAR'ed 14mb archives of .BIN and .CUE files. Including the ever necessary .nfo files.

      --
      09 F9 11 02 9D 74 E3 5B - D8 41 56 C5 63 56 88 C0 45 5F E1 04 22 CA 29 C4 93 3F 95 05 2B 79 2A B2
    3. Re:Recommended Storage Media by Catbeller · · Score: 3, Insightful

      Well, that's more a function of the cost and size of data storage. Give me a petabyte of solid state nonvolatile storage, and I'll toss Jurassic Park I in there for giggles, along with 20's silent films, clips from "Bozo's Circus" on WGN on 1969's Chicago TV, the collected books of mankind, complete 3D terrain maps of Mars and every old time radio recording in existence. Gimme a $200 unit that does this, and I'll preserve anything I can get my hands on!

    4. Re:Recommended Storage Media by strider44 · · Score: 2, Funny

      every 14mb RAR will be shared except the last one.

  5. Why HDD? by Orinthe · · Score: 5, Interesting

    The HDD recommendation doesn't seem to make much sense. The article talks about cost-per-gigabyte, but obviously it is much cheaper to use CDRs or DVDRs. This is video preservation, after all, not storing indefinitely for video /editing/, which would require a more malleable storage medium. And before someone points out that there are studies showing that the longevity of CDR/DVDR discs is questionable, surely proper storage of discs (and not buying the Best Buy free-after-rebate special) would be sufficient. HDD, after all, is susceptible to head crashes, and being a magnetic medium can be more easily overwritten.

    --
    SELECT quote.text AS sig FROM quote NATURAL JOIN attribute WHERE attribute.description = 'witty';
    0 rows returned
    1. Re:Why HDD? by chotchki · · Score: 3, Informative

      If you just mirror it on two hard drives and then put them into storage, they will last for a very long time. HDDs die from wear while running, not from sitting on a shelf.

    2. Re:Why HDD? by TheRaven64 · · Score: 2, Insightful
      If you just mirror it on two hard drives and then put them into storage, they will last for a very long time.

      If you mirror across two disks and put them into storage, and one develops some minor errors, it is not possible to tell which one has the errors unless the data itself stores error checking and correction information. This is why God RAID-5 was invented. Using 3 drives you can identify and repair any errors that develop on any one drive.
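
      One way to store "error checking information" with the data is a checksum recorded at archive time. The sketch below is only an illustration: the function names and the choice of SHA-256 are my own assumptions, not anything from the report. It shows how a recorded digest tells you which of two mirrored copies is still intact.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    """Hex digest used as the recorded fingerprint of a file."""
    return hashlib.sha256(data).hexdigest()


def find_good_copy(copy_a: bytes, copy_b: bytes, recorded_digest: str):
    """Return (name, data) for the first copy matching the digest, else None."""
    for name, data in (("A", copy_a), ("B", copy_b)):
        if sha256_hex(data) == recorded_digest:
            return name, data
    return None


# Hypothetical archive contents and the digest recorded when it was written.
original = b"frame data " * 1000
digest = sha256_hex(original)

# Simulate a single flipped bit on mirror A; mirror B is untouched.
damaged = bytearray(original)
damaged[1234] ^= 0x01

result = find_good_copy(bytes(damaged), original, digest)
print("intact copy:", result[0] if result else "none")
```

      The digest only detects damage; repairing it still needs a second good copy or recovery data of the PAR2 kind.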

      If you just mirror it on two hard drives and then put them into storage, they will last for a very long time.

      Technically true, but my experience indicates that the most likely time for a drive to fail is when you power it up after a long period of inactivity. It's not exactly optimal to store your data for years only to have the drive(s) die when you first try to read from them...

      --
      I am TheRaven on Soylent News
    3. Re:Why HDD? by drinkypoo · · Score: 4, Insightful

      It would be smarter to use PAR2 (or similar) on a filesystem basis, than to use a RAID filesystem. It's easier to deal with user space programs for reconstructing data.

      --
      "You're right," Fisheye says. "I should have set it on 'whip' or 'chop.'"
    4. Re:Why HDD? by FireBug · · Score: 3, Interesting

      I didn't read the article (or the rest of the /. comments), but hard drives make much more sense than any optical storage medium in certain cases.

      Media will always wear out, regardless of what type it is. When you have huge amounts of data to back up, it's much nicer to be able to copy it to the latest greatest storage medium quickly and efficiently. Thousands of CDs/DVDs even with an automated "disc changer" would take a hell of a lot longer to transfer than a bunch of servers with hard drives.

      With a hard drive solution, you can just build a new server with new drives and copy everything over from the old one as fast as the hard drives and network allow. Couple this with RAID and multiple servers in different physical locations and you have a pretty damned resilient data archive. ... and just for fun, here's an (old) example of people using hard drives for large scale backups.

      http://www.tomshardware.com/storage/20030425/index.html

    5. Re:Why HDD? by Jah-Wren+Ryel · · Score: 4, Informative

      If you mirror across two disks and put them into storage, and one develops some minor errors, it is not possible to tell which one has the errors

      Exceptionally incorrect, prepare for smackdown.

      All data on a hard disk is protected by very sophisticated error detection and correction algorithms. The chance of getting "some minor errors" is effectively nil - either they are corrected by the disk's controller, or the controller returns a "sector unreadable" error - which is what keys any effective mirroring system to go get the data from the second disk. You just don't get bad data from modern hard disks.

      This is why God RAID-5 was invented.

      No, raid-5 was invented to maintain the I in RAID. Mirroring doubles your costs, RAID-5 only increases them by one disk out of the N disks in the parity group, where N is usually but not limited to 4-5 drives.
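
      For anyone who hasn't seen how single-disk parity works, here is a minimal sketch using plain XOR over toy 4-byte blocks. The block contents and function name are made up for illustration, and real RAID-5 also rotates parity across the drives, which this ignores. Note that it rebuilds a block you already know is missing, which is exactly the "sector unreadable" case.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, column by column."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# Three data blocks (one per data drive) plus one parity block.
data_blocks = [b"AAAA", b"BBBB", b"CCCC"]
parity = xor_blocks(data_blocks)

# Drive 1 reports its block unreadable: XOR the survivors with the parity.
survivors = [data_blocks[0], data_blocks[2], parity]
rebuilt = xor_blocks(survivors)

print(rebuilt)  # b'BBBB'
```

      With N data disks the extra cost is one disk in N+1, versus a full second set of disks for mirroring.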

      --
      When information is power, privacy is freedom.
    6. Re:Why HDD? by ultranova · · Score: 2, Informative

      The HDD recommendation doesn't seem to make much sense. The article talks about cost-per-gigabyte, but obviously it is much cheaper to use CDRs or DVDRs.

      Wrong. Here in Finland, a new 160 GB hard disk ( Maxtor DiamondMax 10) costs 89 euros. An empty 700 MB cd costs 1 euro. Assuming 1 GB = 1000 MB, it would take 160 GB / 0.7 GB = 229 CD's to get the same capacity as that one HDD. So, if you use CD's, you pay 229 euros, if you use HDD's, you pay 89 euros.

      The cost per gigabyte in my example HDD is about 0.55 euros, while the cost per gigabyte in CD is 1.43 euros, which is over twice as much.

      Please note that this is by no means the cheapest disk of this size you can find (or the cheapest price for this particular disk); the cheapest price for a 160 GB disk I found was 74 euros for a Seagate Barracuda. For that, the price per gigabyte is 0.46 euros, one third of the price with CD's.
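
      The arithmetic above is easy to redo with other prices. A small sketch using only the figures quoted in this comment (the function name is just for illustration):

```python
import math


def cost_per_gb(price_eur: float, capacity_gb: float) -> float:
    """Price in euros divided by capacity in gigabytes (1 GB = 1000 MB)."""
    return price_eur / capacity_gb


maxtor = cost_per_gb(89, 160)    # Maxtor DiamondMax 10, 160 GB
seagate = cost_per_gb(74, 160)   # Seagate Barracuda, 160 GB
cdr = cost_per_gb(1, 0.7)        # one 700 MB CD-R at 1 euro

# Whole CDs needed to match one 160 GB disk.
cds_needed = math.ceil(160 / 0.7)

print(f"Maxtor:  {maxtor:.2f} EUR/GB")
print(f"Seagate: {seagate:.2f} EUR/GB")
print(f"CD-R:    {cdr:.2f} EUR/GB")
print(f"CDs needed for 160 GB: {cds_needed}")
```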

      Add the fact that HDD's are much more convenient, and it becomes pretty obvious why HDD's are recommended :).

      Hmm. The lowest price for empty DVD-R's (4.7 GB) seems to be 1 euro, which would make the cost per gigabyte 0.21 euros... However, the same source also claimed that the lowest price for empty CD-R's is 0 euros, which puts its trustworthiness into some doubt. And in any case, HDD's keep on getting bigger, and are still more convenient (no constant CD switching).

      --

      Forget magic. Any technology distinguishable from divine power is insufficiently advanced.

  6. first step by same_old_story · · Score: 3, Funny

    make their report available on a format other than a '.doc' file. it is known to change a lot and therefore not suitable for long term storage.

  7. Nonsense by sql*kitten · · Score: 5, Insightful

    OK, let's talk archiveability. Let's talk about a medium that you can leave in a shoebox for a hundred years and read just by shining a light through it. I'm not talking hypothetical here - this technology is proven by the fact that people used it a hundred years ago and it worked. And the technology is even better now, even more stable.

    I am of course talking about film. It is very very easy now to write digital images onto film, not very much more difficult than it is to scan film. There's no need to worry about whether the file format will be supported in the future, as I've already said. You don't need to shovel money into vendors' pockets every few years just to copy it to the latest trendiest type of disc. You can build a machine to project film out of junk if you need to, or you can scan it if you want a digital image and when you have a better scanner (e.g. a higher DMax), you can just scan it again.

    The dude who wrote this report is just blowing smoke. He's trying to sell snake oil.

    1. Re:Nonsense by Jeff+DeMaagd · · Score: 3, Informative

      and no you cannot "build a machine to project film out of junk". do you know how much film projectors cost? (hint: a good lens alone is over $5000.00.)

      I think it is possible. Such an expensive lens isn't necessary. The best film projectors cost a lot, but I don't see it as that difficult to fabricate a basic one from scratch. It won't be the best but it could actually be watchable on a small scale.

      It might look hard to the monkeys that assemble ATX computers but I think a decent one could be made from scratch as a small senior engineering project for college, and probably could be adjustable with different sprockets and such. A little more complex than just shining a light through it. It may be hard to imagine, but there was a time when people had portable film cameras for home videos. It wasn't fancy and didn't need to be.

      Kodak announcing they'll stop producing film has little to do with anything, IMO. Five years is a lot of time but thus far, the drive to push digital projection is going much slower than people expected. Lucas wanted his Episode III to be exclusively projected in digital video, but it's not going to happen unless he wants to drastically cut the number of screens, I'm thinking a tenth of the screens is not an unrealistic figure.

      Of course, part of that is political and economic, because it saves the film distributors from major costs, but they refuse to pass on the savings to the theater companies that must invest as much as a quarter million dollars just to get started.

    2. Re:Nonsense by Waffle+Iron · · Score: 2, Insightful
      OK, let's talk archiveability. Let's talk about a medium that you can leave in a shoebox for a hundred years and read just by shining a light through it.

      Most if not all film from 100 years ago was made from nitrocellulose. If you left that in a shoebox for 100 years, you would probably end up with a box of dust.

      Most of the color film from 50 years ago was made with unstable dyes. If you kept that for 100 years, you'd have a box of transparent plastic.

      Now they think they have film that's more stable. Call me back in a century and we'll see if they're right.

      If there's any tendency for an analog copy to deteriorate, even if it takes centuries, you'll need to reprint it periodically to preserve it. The same holds for digital copies. The difference is that each successive analog copy loses more of the original information.

    3. Re:Nonsense by Alan+Partridge · · Score: 2, Informative

      "Archiving in film is still the absolute highest quality you can achieve."

      Nope.

      "Scanning it once does not deteriorate the analog copy for later scanning."

      Yes, it does. Scanning usually involves physical damage and dye fading.

      --
      That was classic intercourse!
  8. JPEG 2000 for video? Huh? by Zarhan · · Score: 2, Interesting

    Okay, so JPEG 2000 uses wavelets and is therefore quite advanced, but as I have understood, it's still geared for still images (ok, there is probably some form of motion jpeg 2000?).

    I would think that the best method would be to use something like DIRAC instead (or Ogg Theora). DIRAC uses wavelets and adaptive arithmetic coding, so it should be "on par" with JPEG 2000 - and should also be free of patent encumbrance.

    JPEG 2000 has one feature that might make it better in "archival" purposes - there is a lossless mode which still achieves higher compression ratios than PNG.

    1. Re:JPEG 2000 for video? Huh? by Zarhan · · Score: 2, Informative

      Dear kind AC,

      MPEG-2 has nothing to do with wavelets, MPEG-2 is based on DCT. In general, there are four methods for compression, discrete cosine transform (DCT), vector quantization (VQ), fractal compression, and discrete wavelet transform (DWT).

      MPEG codecs (1, 2, 4, H.26x) all use DCT. Have a nice day.

  9. OK, so when are we going to have support for it? by bersl2 · · Score: 3, Informative

    https://bugzilla.mozilla.org/show_bug.cgi?id=36351 (no link for obvious reasons) is the bug report, which has been around since April 2000 but has not progressed much due to licensing issues (copyright ones fixed, patent ones not?).

  10. Turn it up! by belg4mit · · Score: 3, Interesting

    Ummm what about the sound?!

    --
    Were that I say, pancakes?
  11. Graceful degradation by Dwonis · · Score: 4, Informative

    Avoiding inter-frame compression means that, if you have some small amount of data corruption, you only get one, maybe two corrupted frames of video.
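
    A toy simulation makes the point concrete. Everything here is an assumption for illustration: "frames" are short lists of integers, the interframe scheme is naive delta coding from the previous frame, and there are no periodic keyframes (real interframe codecs insert them, which limits the damage to one group of pictures).

```python
def encode_intra(frames):
    """Store every frame independently (copied so corruption stays local)."""
    return [list(frame) for frame in frames]


def encode_delta(frames):
    """Store the first frame, then each frame as a difference from the last."""
    encoded = [list(frames[0])]
    for prev, cur in zip(frames, frames[1:]):
        encoded.append([c - p for p, c in zip(prev, cur)])
    return encoded


def decode_delta(encoded):
    """Rebuild frames by accumulating the stored differences."""
    frames = [list(encoded[0])]
    for delta in encoded[1:]:
        frames.append([p + d for p, d in zip(frames[-1], delta)])
    return frames


def count_bad(decoded, original):
    """Number of decoded frames that differ from the originals."""
    return sum(1 for d, o in zip(decoded, original) if d != o)


frames = [[i, i + 1, i + 2, i + 3] for i in range(10)]

# Corrupt one value in stored frame 3 under each scheme.
intra = encode_intra(frames)
intra[3][0] += 99
delta = encode_delta(frames)
delta[3][0] += 99

intra_bad = count_bad(intra, frames)
delta_bad = count_bad(decode_delta(delta), frames)
print("intraframe: bad frames =", intra_bad)
print("delta-coded: bad frames =", delta_bad)
```

    In the intraframe case only the damaged frame is wrong; in the delta-coded case the error carries into every later frame.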

  12. Paper by FullMetalAlchemist · · Score: 2, Funny

    Storing digital information on paper is feasible and lots of research efforts have been put into it.

    Storing data on anything magnetic or optical is a bit worrisome. But then, it's not critical data so I guess it doesn't really matter.

    1. Re:Paper by RichardX · · Score: 2, Funny


      >>it's not critical data so I guess it doesn't really matter

      >Ouch, burn. There is nothing quite like the feeling of being told that your culture and history isn't important, and doesn't matter.


      Oh, c'mon.. I mean, culture.. history.. it's hardly porn. Who cares if a few decades of historical records get wiped? Heck, just make 'em up again. Losing part of your porn collection though.. now that's a disaster.

      --
      Curiosity was framed. Ignorance killed the cat.
  13. HDs by eno2001 · · Score: 2, Interesting

    I could have told people this as they've replaced video tape, and audio tape for me for the past decade. I find them much more convenient, portable and cross platform. I have SCSI drives from 1994 that will still work in a PC (Linux or Windows) or Mac today. They are easy to backup to and restore from. The HD is about as close to perfection as you can get in a storage medium. At least until you get flash drives that can store 1 terabyte at minimum, and have an infinite number of writes. At least a 100 year lifespan.

    --
    -"...bad old ideas look confusingly fresh when they are packaged as technology" - Jaron Lanier (Digital Maoism on Edge.o
  14. Re:"it" being JPEG2k by TheoMurpse · · Score: 4, Informative

    When there isn't patent litigation surrounding the format.

  15. Ars Technica... by EMIce · · Score: 3, Interesting

    ...had a guide on capturing analog video, said to be part of a 3-part series covering capturing, cleaning, and compressing. Only part I ever came out - Ars, do you read slashdot? - I am waiting on the remaining guides for some advice on how to preserve these rotting home VHS tapes.

    Meanwhile, does anyone else have advice on capturing and cleaning video since we are already talking about compression? What settings are good for capturing and what sort of software exists to clean up VHS and give it the appearance of more clarity? I am using a WinTV card as Ars recommended it.

    1. Re:Ars Technica... by Noose+For+A+Neck · · Score: 3, Informative

      I wouldn't bother with ArsTechnica. For the definitive guide to capturing analog video and digitally archiving it, you would want to read this guide on Doom9. Plus, they have many other video-related guides on that site and a forum that is second to none in terms of the sheer amount of expertise exhibited by the users there.

      --

      Software piracy is victimless theft.

  16. Simplistic by fm6 · · Score: 2, Interesting
    I suspect that your picture of the survivability of film stock is a little optimistic. But I'll leave that issue to somebody who actually knows the technology. What really bothers me about your argument is your focus on a single factor: keeping the data available as long as possible with an absolute minimum of maintenance. If that were the only consideration, then film is actually a bad choice. Many more archival techniques are obviously more survivable. You could, for example, etch the data on platinum plates.

    But survivability isn't the only consideration. Cost is always an issue. (So much for my platinum plates, though your approach isn't exactly cheap either.) You also want to be able to access the data in the short term. I worked my way through college operating film projectors. It is not a convenient medium!

    One thing I'd like to know is why archival-quality optical discs weren't considered. (Presumably there's something in the document about this, but it's a poorly structured Word file, and finding key facts is more work than I care to expend.) They cost 5 times as much as standard CD-Rs and recordable DVDs, but their manufacturers claim the data is good for 300 years. Of course, you need some fairly complicated technology to play them back, but CD and DVD drives are pervasive consumer devices -- they should be around for a very long time.

  17. The move to disk backup continues by RonBurk · · Score: 3, Insightful
    People objecting to the use of hard drives for backup miss several points.

    • Yes, they are (somewhat, not excessively) vulnerable to magnetism. But optical discs are vulnerable to light, fingerprints, chemicals, etc.
    • Optical discs continue to lag far behind in capacity. So, in the land of audio/video backup, the choice is between a single hard disk, or dozens of optical discs. The risk of failure of multiple optical discs is amplified by the increasing number of discs.
    • Bandwidth is another issue. Though hard disk bandwidth lags behind the growth of hard disk storage, optical disc bandwidth lags even further behind.
    • Restore time is another issue. You can line up a bunch of optical disc drives and try to make all your data available at once, but you're probably never going to get the restore speed of solutions like Massive Arrays of Inactive Disks.
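
    The "amplified by the number of discs" point can be put in numbers. The sketch below assumes failures are independent and that every unit has the same failure probability; the 1% figure is made up, and a hard disk and an optical disc will not really share a rate. The disc count is for 160 GB on 4.7 GB DVD-Rs.

```python
import math


def p_any_failure(p_single: float, n: int) -> float:
    """Chance that at least one of n independent units fails."""
    return 1 - (1 - p_single) ** n


P_SINGLE = 0.01  # hypothetical per-unit failure probability

# DVD-Rs (4.7 GB each) needed to hold what one 160 GB disk holds.
discs = math.ceil(160 / 4.7)

print(f"units needed: 1 hard disk vs {discs} DVD-Rs")
print(f"one hard disk:  {p_any_failure(P_SINGLE, 1):.3f}")
print(f"{discs} DVD-Rs:      {p_any_failure(P_SINGLE, discs):.3f}")
```

    Losing one disc out of many costs only part of the archive, while losing the single hard disk costs all of it, so this is a count of incidents, not of data lost.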


    There will always be multiple backup solutions, but the biggest trend continues to be towards using hard disks for backup. When your data files are enormous (such as with audio/visual data), HDD backup is even more attractive.
  18. Re:MXF? by The+Ultimate+Fartkno · · Score: 3, Funny

    MXF is the new, proprietary video compression method jointly sponsored by Microsoft and MTV. The new Most eXtreme Format is the video compression of choice for today's most hard-core, edgy, in-your-face artists with an attitude!

    Ashlee Simpson says "When I'm performing for a half-time show of 10,000 screaming fans, I want to make sure that every bit of the live energy is caught perfectly! I give 100% for my fans and want to make sure they get every bit of my performance!"

    MXF... in your FACE, Quicktime! This isn't your father's archive-quality lossless video compression algorithm!

    (and keep an eye out for Ogg Vorbis 2 - by Mountain Dew!)

  19. More JPEG-2000 stuff by fnord_uk · · Score: 2, Informative

    There is more to jpeg2000 than a compression scheme offering scalable quality and resolution within a single losslessly compressed file. There is also the interactive delivery mechanism offered by the JPIP protocol. Now there is something really useful...

    --
    In theory, theory and practice are the same. In practice, they're not.
  20. Re:Lossy file formats... by Leo+McGarry · · Score: 2, Insightful
    You absolutely can have TIFF video.
    Source: Macintosh HD:Users:Leo:Movies:Reagan.mov
    Format: Integer (big endian), Stereo, 48000 Hz, 16 bits
    TIFF, 720 x 480, Millions
    Movie FPS: 29.97
    Playing FPS: (Available when playing)
    Data Size: 1617.9 MB
    Data Rate: 19.9 MB/sec
    Current Time: 00:00:00.00
    Duration: 00:01:21.04
    Normal Size: 720 x 480 pixels
    Current Size: 720 x 480 pixels (Normal)
    TIFF would be a much better choice for archiving, because it's a much simpler format and is much easier to decode.
  21. Yawn, another technology of the day... by HockeyPuck · · Score: 3, Insightful

    All this is is a method to line some guy's pockets. I'm sure the tape guys are gonna say, use XYZ type of tape. The disk guys are gonna say disk.

    What makes this guy think that the interface to the HDD is going to be around in X years?

    PC's have only had two dead (non-(e)IDE/ATA) interfaces, the ESDI and the ST506/ST-412 interfaces.

    But what if you were trying to find a computer with IPI (1960s mainframe) interface.

    The Fed gov't has this problem with trying to find parts for their old 8/9track tape drives..

    Here's a good list of all the HDD interfaces over the years: http://www.i-t-s.com/corporate/terms.html

    Stick with microfiche, film, that way we don't have to pay some vendor $$$/yr to keep alive a dead technology or pay some other vendor $$$/media to move them from old to new media.

    1. Re:Yawn, another technology of the day... by Wesley+Felter · · Score: 2, Insightful

      This is not an issue because you never remove the hard disk from its computer. When the computer becomes obsolete, you buy a new one (with new disks), and copy the data over.

  22. Document the file format with C source code by tepples · · Score: 2, Insightful

    And if you can spare the space, a directory with a wav file and a stack of uncompressed TIFF images is even better. Compression formats are complicated to reverse engineer.

    Store .mng + .flac + source code for libmng and libflac, and you don't need to worry about any sort of complicated gnireenigne.

  23. Re:Decoder simplicity importance? by Leo+McGarry · · Score: 2, Insightful

    Because when you're archiving digital data, recoverability is paramount. You have to ask yourself, "What if all I had was a piece of this data, say, a hundred gigabytes from the middle of the disk? Could I turn that data into useful information?"

    If you're dealing with a run-length-encoded array of packed pixels, the answer is obviously yes. That's among the simplest forms of encoding known. (If you don't RLE the data it's even simpler, but a trade-off between simplicity and storage requirements is okay as long as you maintain a lot of simplicity.) Even if you don't know how the data was encoded, you've got a good chance of figuring it out just by doing some simple analysis on the bytes. But with a complex encoding scheme, it's much more difficult to figure out what you're dealing with just by looking at it.
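
    To show how little there is to it, here is a minimal run-length codec. The (count, value) byte-pair layout is one simple scheme chosen for illustration, not the encoding of any particular file format. The last lines decode a fragment taken from the middle of the encoded data, which is the recovery scenario described above.

```python
def rle_encode(pixels: bytes) -> bytes:
    """Encode as (run length, value) byte pairs, with runs capped at 255."""
    out = bytearray()
    i = 0
    while i < len(pixels):
        run = 1
        while (i + run < len(pixels) and run < 255
               and pixels[i + run] == pixels[i]):
            run += 1
        out += bytes((run, pixels[i]))
        i += run
    return bytes(out)


def rle_decode(encoded: bytes) -> bytes:
    """Expand (run length, value) pairs; a trailing odd byte is ignored."""
    out = bytearray()
    for k in range(0, len(encoded) - 1, 2):
        out += bytes([encoded[k + 1]]) * encoded[k]
    return bytes(out)


row = bytes([0] * 300 + [255] * 4 + [7])
encoded = rle_encode(row)
print(len(row), "bytes ->", len(encoded), "bytes")

# A fragment starting on a pair boundary still decodes to valid pixels.
fragment = encoded[2:]
recovered = rle_decode(fragment)
print(recovered == row[255:])
```

    A fragment that starts mid-pair decodes to garbage, but there are only two possible alignments to try.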

    When talking about archiving, the objective is to be able to recover as much as possible given as little as possible.

  24. Intraframe vs. interframe by Wesley+Felter · · Score: 4, Interesting

    For whatever reason (I'm not a video expert) many people prefer intraframe codecs for archival. As you probably guessed, Motion JPEG 2000 just treats each video frame as a still image and compresses it with JPEG 2000.

    Dirac will give much better compression than JPEG 2000, but it also introduces the possibility of interframe artifacts.

  25. Recoverability depends on seekability by tepples · · Score: 2, Funny

    Because when you're archiving digital data, recoverability is paramount.

    No, Viacom is paramount.

    "What if all I had was a piece of this data, say, a hundred gigabytes from the middle of the disk? Could I turn that data into useful information?"

    As long as your codec is seekable, this works. Motion JPEG is trivially seekable, consisting entirely of keyframes. Toss a redundant copy of the codec on the volume after every GB or so of video data, and recoverability is preserved.

  26. .doc? by b1t+r0t · · Score: 2, Funny

    For people concerned with the preservation of "data", they've sure picked an interesting format to write about it in.

    --
    "Open source is good." - Steve Jobs
    "Open source is evil." - Microsoft
  27. GIGO and the "born digital" problem by retiarius · · Score: 3, Insightful

    indeed, lossless for archival preservation is the
    only way, as it fits the basic rule of art restoration
    technology -- never apply "improvements" which
    cannot be reversibly undone to take advantage
    of future science.

    ironically then, the lossless format doesn't matter.

    however, at least for the instant case of dance video,
    the likely input (a myriad of digital tape formats)
    is hopelessly neanderthal -- anything having to do with DV,
    or MPEG, or even ATSC HDTV already tosses away much
    color information. (4:1:1, 4:2:0, and 4:2:2 colorspace is embarrassing
    to preserve "losslessly".) ditto for temporal
    info, with interlacing being the culprit. even film at
    24fps just will not cut it for motion such as dance.
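
    for scale, a rough sketch of how much chroma each scheme keeps,
    using the usual J:a:b reading (a reference block J pixels wide
    and 2 rows high, a chroma samples in the first row, b in the
    second). the function name is made up for illustration.

```python
from fractions import Fraction


def chroma_fraction(j: int, a: int, b: int) -> Fraction:
    """Fraction of full-resolution chroma samples kept by J:a:b sampling."""
    return Fraction(a + b, 2 * j)


def total_fraction(j: int, a: int, b: int) -> Fraction:
    """Total samples (luma + two chroma planes) relative to 4:4:4."""
    luma = 2 * j
    chroma = 2 * (a + b)
    return Fraction(luma + chroma, 3 * luma)


for scheme in [(4, 4, 4), (4, 2, 2), (4, 2, 0), (4, 1, 1)]:
    label = ":".join(str(n) for n in scheme)
    print(label, "chroma kept:", chroma_fraction(*scheme),
          "total data:", total_fraction(*scheme))
```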

    so here's to better camera technology, whether it's
    10- or 12-bit 4:4:4 RGB, or something like
    carver mead's foveon made swift.