Slashdot Mirror


MXF+JPEG-2000+HDD = Future of Video Preservation?

Anonymous Archivist writes "Media Matters, a technical consultancy specializing in archival audio and video material, recently completed a Mellon Foundation funded Digital Video Reformatting Preservation Project for the Dance Heritage Coalition. They conclude that MXF is the recommended container format, JPEG-2000 is the recommended encoding format and HDD is the recommended storage media. It's a very valuable series of experiments and offers a strong indication of where the archival preservation of analogue video is heading."

214 comments

  1. Lossy file formats... by Anonymous Coward · · Score: 0

    Isn't JPEG lossy? Why would it be recommended as the preferred storage mechanism, instead of TIFF or the like?

    1. Re:Lossy file formats... by iezhy · · Score: 5, Informative

      The JPEG standard defines several encoding formats, which include lossless compression as well.

    2. Re:Lossy file formats... by Peter+Cooper · · Score: 0, Redundant

      They're talking about video. You can't get TIFF videos. Also while JPEG 2000 is lossy it's considered to be 'good enough'. Of course, some people thought this about MPEG 1 when it came out, so you might have a point.

    3. Re:Lossy file formats... by Anonymous Coward · · Score: 0

      Why not use something like huffyuv?

    4. Re:Lossy file formats... by Anonymous Coward · · Score: 1, Informative

      JPEG2000 also supports a lossless wavelet compression mode. But yes, the 'lossy' version of JPEG2000 is supposedly better quality than traditional JPEG.

      http://www.jpeg.org/faq.phtml

    5. Re:Lossy file formats... by n0-0p · · Score: 1

      JPEG 2000 compression is wavelet encoding. This type of encoding allows you to selectively increase accuracy (at an increase in data size) until the compressed image is eventually identical to the original. From what I've read it even has a very good compression ratio at the lossless level. To vastly over-simplify, think of it as a far more advanced approach than interlaced GIF for progressive image retrieval.
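To make the "refine until identical" idea concrete, here is a minimal sketch of one level of a reversible integer wavelet step. It uses the simple Haar-style S-transform as a stand-in, not the actual filters JPEG 2000 specifies, and the function names are made up for illustration. Decoding with the detail coefficients zeroed gives a coarse approximation; decoding with them gives back the original exactly.

```python
def s_transform(signal):
    """One level of a reversible integer Haar (S) transform.

    Expects an even-length list of ints. Returns (averages, details).
    This is an illustrative stand-in, not the JPEG 2000 filter bank.
    """
    pairs = list(zip(signal[0::2], signal[1::2]))
    averages = [(a + b) >> 1 for a, b in pairs]  # floor of the pair mean
    details = [a - b for a, b in pairs]          # what the mean throws away
    return averages, details


def inverse_s_transform(averages, details):
    """Exactly undo s_transform using only integer arithmetic."""
    out = []
    for s, d in zip(averages, details):
        a = s + ((d + 1) >> 1)
        b = a - d
        out.extend([a, b])
    return out


row = [10, 12, 200, 202, 50, 54, 90, 91]
averages, details = s_transform(row)

# Coarse pass: pretend the detail coefficients were never transmitted.
coarse = inverse_s_transform(averages, [0] * len(details))

# Full pass: with the details included, the original comes back bit for bit.
exact = inverse_s_transform(averages, details)
```

Sending `averages` first and `details` later is the progressive part: the receiver can show `coarse` immediately and upgrade to `exact` once the rest arrives.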

    6. Re:Lossy file formats... by tonsofpcs · · Score: 1

      You can't get TIFF videos
      Yes you can. They are usually stored as image sequences, but you can.

    7. Re:Lossy file formats... by sjf · · Score: 1

      Some early video editing HW I worked on used TIFF as an intermediate stage between the MJPEG decompressor and the final raster output. Makes sense to me.

      -S

    8. Re:Lossy file formats... by Leo+McGarry · · Score: 2, Insightful
      You absolutely can have TIFF video.
      Source: Macintosh HD:Users:Leo:Movies:Reagan.mov
      Format: Integer (big endian), Stereo, 48000 Hz, 16 bits
      TIFF, 720 x 480, Millions
      Movie FPS: 29.97
      Playing FPS: (Available when playing)
      Data Size: 1617.9 MB
      Data Rate: 19.9 MB/sec
      Current Time: 00:00:00.00
      Duration: 00:01:21.04
      Normal Size: 720 x 480 pixels
      Current Size: 720 x 480 pixels (Normal)
      TIFF would be a much better choice for archiving, because it's a much simpler format and is much easier to decode.
    9. Re:Lossy file formats... by Anonymous Coward · · Score: 0

      You can't get TIFF videos.

      Bullshit you can't. All it takes to do it is someone to come up with a container format for multiple images and sound. As it is, back in the day, all MPEG encoders took multiple TIFFs, and usually only TIFF.

      They wouldn't do this, because that's not what TIFF was designed to do--but it WOULD work, and exactly like a Motion-JPEG (the precursor to MPEG that used what is in essence a bunch of JPEG files).

    10. Re:Lossy file formats... by Anonymous Coward · · Score: 0

      Unfortunately in JPEG2000 they don't include patent-free techniques.

  2. MXF? by slavemowgli · · Score: 1

    Enlighten me. What's MXF?

    --
    quidquid latine dictum sit altum videtur.
    1. Re:MXF? by Raul654 · · Score: 5, Informative

      " The Material eXchange Format (MXF) is an open file format targeted at the interchange of audio-visual material with associated data and metadata. It has been designed and implemented with the aim of improving file based interoperability between servers, workstations and other content creation devices. These improvements should result in improved workflows and result in more efficient working than is possible with today's mixed and proprietary file formats." -- What is MXF

      --


      To make laws that man cannot, and will not obey, serves to bring all law into contempt.
      --E.C. Stanton
    2. Re:MXF? by NoData · · Score: 1, Funny

      It's a dirty, dirty acronym for among the foulest of slurs:

      Motherchristfucker!

      Which is what most people utter when they discover they have no way of decoding MXF.

    3. Re:MXF? by slavemowgli · · Score: 1

      Thanks. ^^

      --
      quidquid latine dictum sit altum videtur.
    4. Re:MXF? by The+Ultimate+Fartkno · · Score: 3, Funny

      MXF is the new, proprietary video compression method jointly sponsored by Microsoft and MTV. The new Most eXtreme Format is the video compression of choice for today's most hard-core, edgy, in-your-face artists with an attitude!

      Ashlee Simpson says "When I'm performing for a half-time show of 10,000 screaming fans, I want to make sure that every bit of the live energy is caught perfectly! I give 100% for my fans and want to make sure they get every bit of my performance!"

      MXF... in your FACE, Quicktime! This isn't your father's archive-quality lossless video compression algorithm!

      (and keep an eye out for Ogg Vorbis 2 - by Mountain Dew!)

    5. Re:MXF? by T-Ranger · · Score: 1

      Couldn't you condense that down to "Joe"?

    6. Re:MXF? by jdmx · · Score: 1

      Ha, ok. That was pretty funny.

  3. Oh my god... by cartzworth · · Score: 0, Offtopic

    ...I just threw up in my mouth a little bit. That is a Word document!

    1. Re:Oh my god... by Anonymous Coward · · Score: 0

      Forget about virus vulnerabilities and proprietary lock-in for a moment. Given that they're realists about using universally accepted formats, their choice shouldn't be surprising. I mean, who *doesn't* have something that will open Word documents? Now think about how many can open an .sxw, or even know what it is.

    2. Re:Oh my god... by Em+Ellel · · Score: 1

      Forget about virus vulnerabilities and proprietary lock-in for a moment. Given that they're realists about using universally accepted formats, their choice shouldn't be surprising. I mean, who *doesn't* have something that will open Word documents? Now think about how many can open an .sxw, or even know what it is.

      Personally, by your own argument PDF would have been a much better choice - much more universally accepted - no need to worry about proprietary lock-in, (somewhat) open format, can be generated from any application, can be opened anywhere and, unlike Word, looks consistent on any platform or printer, yada yada yada. Also often several orders of magnitude smaller files than MS formats.

      just my 2 cents..

      -Em

      --
      RelevantElephants: A Somatic WebComic...
    3. Re:Oh my god... by Tim+C · · Score: 1

      Also often several orders of magnitude smaller files than MS formats.

      They're smaller, I'll grant you that, but not "several orders of magnitude", unless you're using a meaning I'm not familiar with. One order of magnitude is a factor of ten, and several is "three or so", so you're claiming that a pdf is "often a thousandth the size of" an equivalent file in an MS format.

      Sure about that? I have multi-megabyte pdfs; are you absolutely sure that the equivalent Word docs would be multi-gigabyte?

    4. Re:Oh my god... by Em+Ellel · · Score: 1

      Sure about that? I have multi-megabyte pdfs; are you absolutely sure that the equivalent Word docs would be multi-gigabyte?

      In my limited experience, yes. But thinking more on that, it probably has to do with me making very large docs using things like Visio and Word. I had Visio files compress from 8Meg to under 100K when output to PDF and I saw similar things using Word. Now why a several page Visio doc should be over 8meg is beyond me; MS app data files tend to bloat fast if you substantially edit them. But I guess you may be right as this may not represent average use.

      -Em

      --
      RelevantElephants: A Somatic WebComic...
    5. Re:Oh my god... by Cili · · Score: 1

      If you think binary (like any slashdotter ought to) 6 is one order of magnitude bigger than 3 and 6 is about three orders of magnitude smaller than 60.

      It all lies in the base of the logarithm.
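For anyone who wants to check the arithmetic, a small sketch (the helper name is made up here): the number of "orders of magnitude" between two sizes is just the logarithm of their ratio in whatever base you pick.

```python
import math


def orders_of_magnitude(larger, smaller, base=10):
    """How many factors of `base` separate two positive quantities."""
    return math.log(larger / smaller, base)


# Binary orders of magnitude, as in the comment above.
six_vs_three = orders_of_magnitude(6, 3, base=2)
sixty_vs_six = orders_of_magnitude(60, 6, base=2)

# Decimal orders for the Visio figures quoted earlier in the thread,
# taking "8Meg" as 8,000,000 bytes and "100K" as 100,000 bytes (an assumption).
visio_vs_pdf = orders_of_magnitude(8_000_000, 100_000)

print(f"6 vs 3, base 2:   {six_vs_three:.2f}")
print(f"60 vs 6, base 2:  {sixty_vs_six:.2f}")
print(f"8 MB vs 100 KB, base 10: {visio_vs_pdf:.2f}")
```

So the Visio example is a bit under two decimal orders of magnitude, and "several" only works if you count in base 2.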

  4. JPEG-2000 by Anonymous Coward · · Score: 3, Interesting

    Why would they go with a compression format that doesn't do inter-frame compression?
    It might be nice for editing, but you could get more quality in the same space with something like h264, or even h263 if they have to do this right now (i.e. before h264 is quite ready for prime time).

    1. Re:JPEG-2000 by Blakflag · · Score: 1

      From the report it seems they wanted each frame to be able to stand on its own as a freeze-frame. Dancers need to analyze a single frame for poses etc.

      And another bonus would be that since JPEG-2000 is a technology for encoding still bitmaps, one could grab a single frame of the movie and use it separately without any transcoding to another format.

      --
      *** DRINK MORE COFFEE ***
  5. Recommended Storage Media by Antonymous+Flower · · Score: 5, Funny

    Recommended Storage Media: Peer to Peer network.

    1. Re:Recommended Storage Media by Josh+Booth · · Score: 1

      This sounds funny, but it is absolutely true. It is like a giant distributed RAID and as long as two or more people want to keep the video, it will be there for a long time. If nobody wants the file, it is obviously not that important and probably shouldn't be saved anyway.

    2. Re:Recommended Storage Media by remahl · · Score: 4, Insightful

      Definitely not!

      Most if not all peer to peer networks require a certain level of interest in an item for it to be retained. Popular items are always easy to find while obscure / old items gradually disappear from the network.

      Try finding a movie that's a few years old. You'll have more trouble finding the original Jurassic Park than Jurassic Park III.

      Peer to peer is not a great way to reliably and systematically preserve cultural heritage.

    3. Re:Recommended Storage Media by fred911 · · Score: 2, Funny

      Along with the recommended exchange format:

      Multi RAR'ed 14mb archives of .BIN and .CUE files. Including the ever necessary .nfo files.

      --
      09 F9 11 02 9D 74 E3 5B - D8 41 56 C5 63 56 88 C0 45 5F E1 04 22 CA 29 C4 93 3F 95 05 2B 79 2A B2
    4. Re:Recommended Storage Media by tka · · Score: 1

      Just name it as sylvia.saint.hot.lesbian.action.on.sybian.avi
      There'll always be interest in that...

    5. Re:Recommended Storage Media by Catbeller · · Score: 3, Insightful

      Well, that's more a function of the cost and size of data storage. Give me a Petabyte of solid state nonvolatile storage, and I'll toss Jurassic Park I in there for giggles, along with 20's silent films, clips from "Bozo's Circus" on WGN on 1969's Chicago TV, the collected books of mankind, complete 3D terrain maps of Mars and every old time radio recording in existence. Gimme a $200 unit that does this, and I'll preserve anything I can get my hands on!

    6. Re:Recommended Storage Media by owlstead · · Score: 1

      Jurassic Park lost? Let's go with the peer to peer stuff!

    7. Re:Recommended Storage Media by Vombatus · · Score: 1
      Also known (in the archival community) as LOCKSS

      Lots Of Copies Keeps Stuff Safe

      --
      This sig is intentionally blank
    8. Re:Recommended Storage Media by strider44 · · Score: 2, Funny

      every 14mb RAR will be shared except the last one.

    9. Re:Recommended Storage Media by value_added · · Score: 1

      That's what the PARs are for.

    10. Re:Recommended Storage Media by Anonymous Coward · · Score: 0

      I don't see why Silvia Saint is so popular. She isn't all that good-looking, IMO.

    11. Re:Recommended Storage Media by Babbster · · Score: 1

      Give the actual copyright holders a $200 unit that does that and they'll buy several dozen of them and retain the things perfectly themselves.

    12. Re:Recommended Storage Media by Catbeller · · Score: 1

      Like they preserved all the movies prior to 1940? The vast majority of celluloid has turned to dust, the copyright holders not bothering to remaster over the decades. Keaton's works are almost all gone. The silents are just about gone. I've a hunch that the reason Disney won't release the old cartoon archives on DVD is because they sat on it for so long it disintegrated.

      Copyright holders are rarely the original artists. They are businessmen who hold no emotional or artistic attachment to the "property" they acquire. And they suck at taking care of it.

      This is another reason why copyright should expire. The owners are poor overseers. As Jefferson et al maintained, art spawns new art, endlessly borrowing from the past. Art dies under copyright.

    13. Re:Recommended Storage Media by Babbster · · Score: 1

      I'm no proponent of the ridiculous length of current copyright terms (in fact, I'm an OPPonent), but going back to the "golden age" to show how movie studios didn't preserve their material is weak sauce. They probably never thought they would be able to make money after 60+ years - copyright didn't extend that long and the ideas of consumers purchasing movies or a multitude of television channels wanting to show them would have been considered ridiculous.

  6. Great Except..... by chotchki · · Score: 1

    for all those lousy patent lawsuits you'll get now.

  7. Why HDD? by Orinthe · · Score: 5, Interesting

    The HDD recommendation doesn't seem to make much sense. The article talks about cost-per-gigabyte, but obviously it is much cheaper to use CDRs or DVDRs. This is video preservation, after all, not storing indefinitely for video /editing/, which would require a more malleable storage medium. And before someone points out that there are studies showing that the longevity of CDR/DVDR discs is questionable, surely proper storage of discs (and not buying the Best Buy free-after-rebate special) would be sufficient. HDD, after all, is susceptible to head crashes, and being a magnetic medium can be more easily overwritten.

    --
    SELECT quote.text AS sig FROM quote NATURAL JOIN attribute WHERE attribute.description = 'witty';
    0 rows returned
    1. Re:Why HDD? by chotchki · · Score: 3, Informative

      If you just mirror it on two hard drives and then put them into storage, they will last for a very long time. HDDs die from wear while they are running, not from just sitting on the shelf.

    2. Re:Why HDD? by Orinthe · · Score: 1

      I'd still prefer my archived data to not be affected by strong EM fields.

      I mean, really... for purposes of archival, optical (or possibly magneto-optical) storage methods are far more secure than purely magnetic ones.

      --
      SELECT quote.text AS sig FROM quote NATURAL JOIN attribute WHERE attribute.description = 'witty';
      0 rows returned
    3. Re:Why HDD? by Anonymous Coward · · Score: 0

      Exactly, an idle hard-drive should last 12-25 years. Whereas home-writ CDs often don't make 3 months.

      We may complain about hard-drives crashing, but they're actually one of the most reliable storage mechanisms. Industrial CompactFlash is probably the very best (I've seen projected lifespans of 100yrs+ on idle) though, but it'll co$t you.

    4. Re:Why HDD? by TheRaven64 · · Score: 2, Insightful
      If you just mirror it on two hard drives and then put them into storage, they will last for a very long time.

      If you mirror across two disks and put them into storage, and one develops some minor errors, it is not possible to tell which one has the errors unless the data itself stores error checking and correction information. This is why God RAID-5 was invented. Using 3 drives you can identify and repair any errors that develop on any one drive.

      If you just mirror it on two hard drives and then put them into storage, they will last for a very long time.

      Technically true, but my experience indicates that the most likely time for a drive to fail is when you power it up after a long period of inactivity. It's not exactly optimal to store your data for years only to have the drive(s) die when you first try to read from them...

      --
      I am TheRaven on Soylent News
    5. Re:Why HDD? by chotchki · · Score: 1

      There is a reason why most companies insist on off-site and on-site storage. So when you make that mirrored copy, you send one drive to your friend 500 miles away. It really, as always, comes down to how much your data is worth. Months to a few years: optical. Years to a decade: HDD. Decades: tape.

    6. Re:Why HDD? by drinkypoo · · Score: 4, Insightful

      It would be smarter to use PAR2 (or similar) on a filesystem basis, than to use a RAID filesystem. It's easier to deal with user space programs for reconstructing data.

      --
      "You're right," Fisheye says. "I should have set it on 'whip' or 'chop.'"
    7. Re:Why HDD? by Lacrymator · · Score: 1

      I disagree with the benefits of RAID in this case.
      While HDD is probably the best (for now) storage, I'd use it as a default due to lack of a better medium.

      A) HDD is simply too fragile to rely on. A drop, a static discharge, electromagnetic fields, OS errors... The list goes on.

      B) The purpose of the study is to find the best storage medium and format for a permanent archive of dance footage. Therein lies another problem for HDD usage. The footage is going to be a huge amount of data. Right now we are quickly approaching terabyte sized HDD's. If they were to use the biggest, most expensive drives available to date, they will still fill drives. This requires drives to be handled, which IMO is a mistake if the data on them is irreplaceable.

      I agree with the conclusion on codecs, and thank the poster for this info. I was needing this information just as I logged in.

      I do think the conclusion on storage was not the best one though. This conclusion is temporary at best. Hopefully somewhere someone has the next million-dollar idea in storage. Permanent, indestructible, versatile and gigantic. That's what we need.

    8. Re:Why HDD? by FireBug · · Score: 3, Interesting

      I didn't read the article (or the rest of the /. comments), but hard drives make much more sense than any optical storage medium in certain cases.

      Media will always wear out, regardless of what type it is. When you have huge amounts of data to back up, it's much nicer to be able to copy it to the latest greatest storage medium quickly and efficiently. Thousands of CDs/DVDs even with an automated "disc changer" would take a hell of a lot longer to transfer than a bunch of servers with hard drives.

      With a hard drive solution, you can just build a new server with new drives and copy everything over from the old one as fast as the hard drives and network allow. Couple this with RAID and multiple servers in different physical locations and you have a pretty damned resilient data archive. ... and just for fun, here's an (old) example of people using hard drives for large scale backups.

      http://www.tomshardware.com/storage/20030425/index.html

    9. Re:Why HDD? by shish · · Score: 1
      HDDs only die when run via wearing out and not just sitting on the shelf.

      But how long do they live after being knocked off of the shelf?

      --
      I mod down anyone who says "I will be modded down for this", regardless of the rest of their comment
    10. Re:Why HDD? by JeffTL · · Score: 1

      That's why you make backups, and transfer the data to new hard drives from time to time -- a process much easier with hard disks than with DVDs, if you can afford all the hard disks.

    11. Re:Why HDD? by sjf · · Score: 1

      I'd still prefer my archived data to not be affected by strong EM fields.

      Absolutely. The first thing I'm going to worry about after a nuclear attack is my porn collection. I hear nuclear winters are very long and cold.

      -S

    12. Re:Why HDD? by Jah-Wren+Ryel · · Score: 4, Informative

      If you mirror across two disks and put them into storage, and one develops some minor errors, it is not possible to tell which one has the errors

      Exceptionally incorrect, prepare for smackdown.

      All data on a hard disk is protected by very sophisticated error detection and correction algorithms. The chance of getting "some minor errors" is effectively nil - either they are corrected by the disk's controller, or the controller returns a "sector unreadable" error - which is what keys any effective mirroring system to go get the data from the second disk. You just don't get bad data from modern hard disks.

      This is why God RAID-5 was invented.

      No, raid-5 was invented to maintain the I in RAID. Mirroring doubles your costs, RAID-5 only increases them by one disk out of the N disks in the parity group, where N is usually but not limited to 4-5 drives.
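A minimal sketch of the parity idea, for anyone who hasn't seen it spelled out. This is a toy with three equal-sized "data disks" and one parity block, not how a real controller lays out stripes, and `xor_blocks` is a name made up for the example. The parity is the byte-wise XOR of the data blocks, so any one missing block can be rebuilt from the survivors plus parity.

```python
from functools import reduce


def xor_blocks(blocks):
    """Byte-wise XOR of equally sized byte strings."""
    return bytes(reduce(lambda x, y: x ^ y, column) for column in zip(*blocks))


# Three data blocks of equal length, one per "disk" (illustrative values).
data = [b"frame-01", b"frame-02", b"frame-03"]
parity = xor_blocks(data)

# Disk 1 reports its sector unreadable: rebuild it from the others plus parity.
rebuilt = xor_blocks([data[0], data[2], parity])

print(rebuilt)
```

Note that this only works because the failed disk says it failed. Parity alone can tell you that a stripe is inconsistent, but not which block is the wrong one, which is why the per-sector error detection on the drives matters.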

      --
      When information is power, privacy is freedom.
    13. Re:Why HDD? by ultranova · · Score: 2, Informative

      The HDD recommendation doesn't seem to make much sense. The article talks about cost-per-gigabyte, but obviously it is much cheaper to use CDRs or DVDRs.

      Wrong. Here in Finland, a new 160 GB hard disk ( Maxtor DiamondMax 10) costs 89 euros. An empty 700 MB cd costs 1 euro. Assuming 1 GB = 1000 MB, it would take 160 GB / 0.7 GB = 229 CD's to get the same capacity as that one HDD. So, if you use CD's, you pay 229 euros, if you use HDD's, you pay 89 euros.

      The cost per gigabyte in my example HDD is about 0.55 euros, while the cost per gigabyte in CD is 1.43 euros, which is over twice as much.

      Please note that this is by no means the cheapest disk of this size you can find (or the cheapest price for this particular disk); the cheapest price for a 160 GB disk I found was 74 euros for a Seagate Barracuda. For that, the price per gigabyte is 0.46 euros, one third of the price with CD's.

      Add the fact that HDD's are much more convenient, and it becomes pretty obvious why HDD's are recommended :).

      Hmm. The lowest price for empty DVD-R's (4.7 GB) seems to be 1 euro, which would make the cost per gigabyte 0.21 euros... However, the same source also claimed that the lowest price for empty CD-R's is 0 euros, which puts its trustworthiness into some doubt. And in any case, HDD's keep on getting bigger, and are still more convenient (no constant CD switching).
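The figures above are easy to re-run with other prices. A short sketch using only the numbers quoted in this comment (the function and dictionary names are just for illustration, and 1 GB is taken as 1000 MB as stated above):

```python
import math


def cost_per_gb(price_eur, capacity_gb):
    """Price of one gigabyte on a given medium, in euros."""
    return price_eur / capacity_gb


# (price in euros, capacity in GB), all taken from the comment above.
media = {
    "Maxtor DiamondMax 10, 160 GB": (89, 160),
    "Seagate Barracuda, 160 GB": (74, 160),
    "CD-R, 700 MB": (1, 0.7),
    "DVD-R, 4.7 GB": (1, 4.7),
}

for name, (price, capacity) in media.items():
    print(f"{name}: {cost_per_gb(price, capacity):.2f} EUR/GB")

# Number of CDs needed to match one 160 GB drive.
cds_needed = math.ceil(160 / 0.7)
print(f"CDs to hold 160 GB: {cds_needed}")
```

Swap in local prices (a spindle of 100 DVDs, say) and the ranking can flip, which is pretty much what the replies argue about.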

      --

      Forget magic. Any technology distinguishable from divine power is insufficiently advanced.

    14. Re:Why HDD? by Anonymous Coward · · Score: 0
      I'd still prefer my archived data to not be affected by strong EM fields.

      Let me guess...you don't know the first thing about magnetism.

    15. Re:Why HDD? by moonbender · · Score: 1

      For what it's worth, 1 Euro per writable CD is extremely expensive. Here in Germany, it's more like 0.30 Euro and less, depending, of course, on how many you buy. That said, DVDs have been far cheaper per megabyte for a long time now, so that's what you'd probably use. 1 Euro per DVD seems about right, if and only if you buy just one - if you buy a spindle of 100 DVDs, it's more like 0.35 apiece. If CDs and DVDs are really that expensive in Finland, I'd seriously consider importing them...

      --
      Switch back to Slashdot's D1 system.
    16. Re:Why HDD? by Anonymous Coward · · Score: 0
      being a magnetic medium can be more easily overwritten
      For a while there, it seemed as if 1984 would never get here.
    17. Re:Why HDD? by b1t+r0t · · Score: 1
      If you mirror across two disks and put them into storage, and one develops some minor errors, it is not possible to tell which one has the errors unless the data itself stores error checking and correction information.

      Actually, all hard drives store a CRC or other error detection method with each and every sector, and have been doing this for decades. Generally it should be easily possible and reliable to tell whether any given data is correct or not. It's the error recovery that's always been the tricky part.

      Mirroring the same data on two hard drives is relatively space-intensive (vs using RAID-5 or PAR2). Like RAID-5, one of the drives can die and the data is recoverable, but mirroring makes it a lot simpler to read the data.
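Here is a rough user-space analogy of that, assuming a CRC-32 stored next to each payload. Real drives do this in firmware with stronger codes, and `write_sector` and `read_mirrored` are invented names for the sketch. The point is only that a stored checksum lets a mirror tell which copy is the bad one.

```python
import zlib


def write_sector(payload: bytes):
    """Store a payload together with its CRC-32, like a sector with its check code."""
    return payload, zlib.crc32(payload)


def read_mirrored(copies):
    """Return the first copy whose payload still matches its stored CRC."""
    for payload, stored_crc in copies:
        if zlib.crc32(payload) == stored_crc:
            return payload
    raise IOError("sector unreadable on every mirror")


good = write_sector(b"dance footage")

# Simulate bit rot on the first mirror: the payload changed, the stored CRC did not.
rotted = (b"dance f0otage", good[1])

recovered = read_mirrored([rotted, good])
print(recovered)
```

If both copies fail the check, all you get is the error, which is the "recovery is the tricky part" bit.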

      --
      "Open source is good." - Steve Jobs
      "Open source is evil." - Microsoft
    18. Re:Why HDD? by pla · · Score: 1

      Wrong. Here in Finland, a new 160 GB hard disk ( Maxtor DiamondMax 10) costs 89 euros. An empty 700 MB cd costs 1 euro.

      WOW do blank media prices suck in Europe! I don't mean that as a taunt or as nationalistic gloating, but just... Wow.

      Here in the US, I can get a 100ct spool of blank name-brand 8x DVD+R for $35 (for noname 4x media, I found them as low as $19/100, but wouldn't really trust those for anything but a throwaway means of getting some large but replaceable files between two places). Comparing that to HDDs, I could almost get a cheap 60GB EIDE HDD. 450GB vs 60GB, making the HDD solution literally 7.5x more expensive, per GB.

    19. Re:Why HDD? by ChrisMaple · · Score: 1

      HDDs can also die when bearing lubricants harden, or for some other reason the spindle freezes. In some cases, this is more likely to happen when sitting on the shelf than when running.

      --
      Contribute to civilization: ari.aynrand.org/donate
    20. Re:Why HDD? by Anonymous Coward · · Score: 0

      I don't think the key is in the price. The problem with HDDs is that they have mechanical moving parts. All moving mechanisms deteriorate with time, partly due to oxidation, partly due to galvanization, even in a perfectly sterile environment (like the one inside the hard disk, so we can eliminate corrosion and environmental problems). I'm not sure you'd be able to actually plug in and use a hard disk after 100 years.

    21. Re:Why HDD? by Anonymous Coward · · Score: 0

      DVD-R costs 0.50/GB for the cheap shit (CAN) whereas the HDD, which is fifty times more likely to survive in ten years, is 0.75/GB (CAN) for the cheap shit.

      I have CD-R from five years ago that were stored properly, but have gone bad in the meantime. Lots of 'em. However my HDDs from ten years ago still work exactly as they did then.
      Seems a reasonable deal to me.

    22. Re:Why HDD? by owlstead · · Score: 1

      Dunno. I presume you would want to keep the files on one place on the disk. You could spread them out, but that would be like RAID 5. If you keep them at one spot, and you loose the disk, then PAR/PAR2 is not going to save you. If you do PAR over multiple files ... well ... is possible, but yuk. RAID has been proven technology as well.

      Personally I think this whole hard disk storage thing will work out fine for my personal collection, but for something that does not have to be online all the time I don't think disks will cut the mustard. The power requirements alone...

      ps. yes, I know, cut the mustard...psk.

    23. Re:Why HDD? by Anonymous Coward · · Score: 0

      Finns and Germans can now argue prices without currency conversion, all hail the Euro!!!

    24. Re:Why HDD? by Vombatus · · Score: 1
      I hear nuclear winters are very long and cold.

      Right after a fairly hot autumn.

      --
      This sig is intentionally blank
    25. Re:Why HDD? by Anonymous Coward · · Score: 0

      If you are talking about longevity of backups you are probably not buying the cheapest no-name brand media out there....

    26. Re:Why HDD? by drinkypoo · · Score: 1
      For something that has to be online all the time, hard disks are pretty much the only solution. For things that have to be near-online, we have optical disc, but AFAIK the largest capacity shipping right now is DVD9. The densest jukeboxes I've seen (in the audio world, but the design would be easy enough to adapt) put 100 discs in a space about 14 inches on a side. That's 8.4 * 100 = 840 GB, still not even 1 TB. So I'd say that's got absolutely horrible density. Plus, DVD9 disks are still quite expensive. Film might be a candidate if the density is high enough but I really have no idea how much data you can reasonably get out of it.

      I think, however, that the best solution is to have three caches on different continents (not all in the same hemisphere either) and just store hard drives in them. This probably provides the highest density of data storage currently available and ensures the maximum protection for the data.

      --
      "You're right," Fisheye says. "I should have set it on 'whip' or 'chop.'"
    27. Re:Why HDD? by Anonymous Coward · · Score: 0

      and you loose the disk

      "lose".

    28. Re:Why HDD? by GeckoX · · Score: 1

      A solution with your requirements already exists and has been in use for thousands of years.

      The trick is all in learning the technique required to stream all of that data through your tools and onto a chunk of stone with reasonable throughput.

      --
      No Comment.
  8. Re:Huh? by Orinthe · · Score: 1

    No, they mean Hard Disk Drive. RTFA.

    --
    SELECT quote.text AS sig FROM quote NATURAL JOIN attribute WHERE attribute.description = 'witty';
    0 rows returned
  9. Google? by Anonymous Coward · · Score: 0

    Enlighten me. How can I avoid a stupid FP?

    1. Re:Google? by slavemowgli · · Score: 0, Offtopic

      Enlighten me: how can I sign up for slashdot instead of being an anonymous coward (with an emphasis on COWARD)?

      That being said, it wasn't first post, and not intended to be, either.

      --
      quidquid latine dictum sit altum videtur.
  10. first step by same_old_story · · Score: 3, Funny

    make their report available on a format other than a '.doc' file. it is known to change a lot and therefore not suitable for long term storage.

    1. Re:first step by Guspaz · · Score: 1

      Really? I thought it hadn't changed substantially (if at all) since Office 97, which was probably released in 1996, so 9 years ago. In computer terms that's a long lasting format.

    2. Re:first step by same_old_story · · Score: 1

      sure. but you forget that, according to many /.ers, microsoft is about to collapse, and then who will maintain it?

    3. Re:first step by WaR.KiN · · Score: 1

      Would you prefer to read it on a .txt file?

    4. Re:first step by Guspaz · · Score: 1

      What? No. I was replying to the parent post that complained that .doc changed too frequently to use for longterm storage. I was saying that it hasn't changed much in 9 years, so it WAS suitable for longterm storage. Besides, most word processing apps will read .doc that predates the Word 97 standard.

      Me, I'm happy with .doc. They open fine in OpenOffice.org.

  11. Nonsense by sql*kitten · · Score: 5, Insightful

    OK, let's talk archiveability. Let's talk about a medium that you can leave in a shoebox for a hundred years and read just by shining a light through it. I'm not talking hypothetical here - this technology is proven by the fact that people used it a hundred years ago and it worked. And the technology is even better now, even more stable.

    I am of course talking about film. It is very very easy now to write digital images onto film, not very much more difficult than it is to scan film. There's no need to worry about whether the file format will be supported in the future, as I've already said. You don't need to shovel money into vendor's pockets every few years just to copy it to the latest trendiest type of disc. You can build a machine to project film out of junk if you need to, or you can scan it if you want a digital image and when you have a better scanner (e.g. a higher DMax), you can just scan it again.

    The dude who wrote this report is just blowing smoke. He's trying to sell snake oil.

    1. Re:Nonsense by drinkypoo · · Score: 1

      Do you have any information on the overall cost of film storage as opposed to hard disk drives? Specifically, one must account for the initial cost of equipment, the cost of storage, the cost of recovery of damaged data, and the cost of paying the people involved to implement everything. Since film and hard drives have [vaguely] similar storage requirements, I'd say the cost of storage is basically a function of density, another thing I have no idea about. I understand that film has been used for archival of digital data for some time but that it is largely deprecated today, and from what I understand it's due to the cost of purchasing and maintaining the equipment.

      --
      "You're right," Fisheye says. "I should have set it on 'whip' or 'chop.'"
    2. Re:Nonsense by same_old_story · · Score: 1

      Not really, though.
      Film is very expensive.
      Film must be taken very good care of.
      And no, you cannot "build a machine to project film out of junk". Do you know how much film projectors cost? (Hint: a good lens alone is over $5,000.)
      Also, "film" can mean a lot of things: 16mm, 35mm, IMAX. Which one are you talking about? Which aspect ratio, which emulsion?
      Also, Kodak announced they will stop producing film after 2010 (that limit may slide a bit). So no, no deal.
      Also, "you can scan it if you want a digital image and when you have a better scanner (e.g. a higher DMax), you can just scan it again": every time you convert from analog to digital or the other way you loose quality.

    3. Re:Nonsense by Mister+Incognito · · Score: 1

      Stone + Chisel. Will perdure. If it breaks you can still put together the two pieces.

      Anyway, film has its problems... talk with anyone who has to work with old reels.

    4. Re:Nonsense by Spy+der+Mann · · Score: 1

      I am of course talking about film. It is very very easy now to write digital images onto film, not very much more difficult than it is to scan film.

      And it is even EASIER to burn film. Yeah. Great preservation, indeed.

    5. Re:Nonsense by Anonymous Coward · · Score: 0

      The problem would be the conversion from digital to analog and back. You're going to lose something there no matter what you do.

    6. Re:Nonsense by jokell82 · · Score: 1

      Also, "you can scan it if you want a digital image and when you have a better scanner (e.g. a higher DMax), you can just scan it again": every time you convert from analog to digital or the other way you loose quality.

      Every kind of video goes through an ADC at one point; whether that happens inside the camera/recording device or through scanning later doesn't really matter. Archiving in film is still the absolute highest quality you can achieve. Scanning it once does not deteriorate the analog copy for later scanning. Also, going from digital to analog very rarely if ever involves a loss of quality, though it may make the digital imperfections more pronounced.

      I agree with you that the costs of archiving film can be quite high; however, if you can afford it, film is still the best way to archive right now.

      --
      I dunno who it is
      but it prolly is fhqwhgads.
    7. Re:Nonsense by Jeff+DeMaagd · · Score: 3, Informative

      And no, you cannot "build a machine to project film out of junk". Do you know how much film projectors cost? (Hint: a good lens alone is over $5,000.)

      I think it is possible. Such an expensive lens isn't necessary. The best film projectors cost a lot, but I don't see it as that difficult to fabricate a basic one from scratch. It won't be the best but it could actually be watchable on a small scale.

      It might look hard to the monkeys that assemble ATX computers, but I think a decent one could be made from scratch as a small senior engineering project for college, and it probably could be adjustable with different sprockets and such. It's a little more complex than just shining a light through it. It may be hard to imagine, but there was a time when people had portable film cameras for home movies. It wasn't fancy and didn't need to be.

      Kodak announcing they'll stop producing film has little to do with anything, IMO. Five years is a lot of time, but thus far the drive to push digital projection is going much slower than people expected. Lucas wanted his Episode III to be exclusively projected in digital video, but it's not going to happen unless he wants to drastically cut the number of screens; I'm thinking a tenth of the screens is not an unrealistic figure.

      Of course, part of that is political and economic, because it saves the film distributors from major costs, but they refuse to pass on the savings to the theater companies that must invest as much as a quarter million dollars just to get started.

    8. Re:Nonsense by Waffle+Iron · · Score: 2, Insightful
      OK, let's talk archiveability. Let's talk about a medium that you can leave in a shoebox for a hundred years and read just by shining a light through it.

      Most if not all film from 100 years ago was made from nitrocellulose. If you left that in a shoebox for 100 years, you would probably end up with a box of dust.

      Most of the color film from 50 years ago was made with unstable dyes. If you kept that for 100 years, you'd have a box of transparent plastic.

      Now they think they have film that's more stable. Call me back in a century and we'll see if they're right.

      If there's any tendency for an analog copy to deteriorate, even if it takes centuries, you'll need to reprint it periodically to preserve it. The same holds for digital copies. The difference is that each successive analog copy loses more of the original information.

    9. Re:Nonsense by Leo+McGarry · · Score: 1

      Every kind of video goes through an ADC at one point

      Eh, you kids. Forgettin' that video was once an inherently analog medium.

    10. Re:Nonsense by Leo+McGarry · · Score: 1

      You're a nut. Grinding lenses is one of the hardest things human beings know how to do. And the precision machining needed to build those leetle tiny gears with all those leetle tiny teeth shouldn't be underestimated.

      Yes, it's possible. No, it's not something just anybody could do. Think of it as being on the same plane as manufacturing a car out of slabs of raw aluminum and steel.

    11. Re:Nonsense by Alan+Partridge · · Score: 2, Informative

      "Archiving in film is still the absolute highest quality you can achieve."

      Nope.

      "Scanning it once does not deteriorate the analog copy for later scanning."

      Yes, it does. Scanning usually involves physical damage and dye fading.

      --
      That was classic intercourse!
    12. Re:Nonsense by vijayiyer · · Score: 1

      Kodachrome from 50 years ago looks as good today as when it was shot. And that's film that hasn't been stored optimally for preservation. It doesn't use unstable dyes; it instead uses silver, much like black and white film.

    13. Re:Nonsense by ultranova · · Score: 1

      Film must be taken very good care of.

      I was under the impression that film can be stored at normal room temperature and humidity. I could be wrong about that, but at least I know that the first films ever shot are still watchable, because I've seen the famous "man walks" movie - horrendous picture quality, but perfectly watchable.

      Besides, Finnish television sometimes broadcasts old newsreels from as far back as World War 2, and they appear to be in perfect quality - apart from being black and white, of course.

      And no, you cannot "build a machine to project film out of junk".

      Of course you can. What do you think the first film projectors were made of?

      I think you are confusing "watchable" with "good quality". A watchable format with poor quality is much better than good quality picture which can't be watched because no one can figure out the format.

      Do you know how much film projectors cost? (Hint: a good lens alone is over $5,000.)

      Well, I once bought one from a toy store for pocket change - I can't recall the exact price, but I was ten years old at the time, so it couldn't have been much.

      Of course, it wasn't a projector meant to project the film onto a 20-meter-wide canvas; it was a cheap toy with a plastic lens which you would look through.

      --

      Forget magic. Any technology distinguishable from divine power is insufficiently advanced.

    14. Re:Nonsense by aardvarko · · Score: 1

      No - grinding and then coating some of the precision elements used in modern compound lenses is one of the hardest things we know how to do.

      Grinding a SINGLE lens that can be used to examine (or project!) something is actually fairly simple; how do you think Matthew Brady shot his Civil War photographs? With his precision-engineered 70-200/2.8 made in Japan by an army of clean-suited workers?

      And gears with little tiny teeth are utterly unnecessary to look at or project something. Speaking as a photographer, you're the one that's a little off his rocker.

    15. Re:Nonsense by aardvarko · · Score: 1

      Also, Kodak announced they will stop producing film after 2010 (that limit may slide a bit)

      I would love to see a source for this statement.

    16. Re:Nonsense by Leo+McGarry · · Score: 1

      how do you think Matthew Brady shot his Civil War photographs? With his precision-engineered 70-200/2.8 made in Japan by an army of clean-suited workers?

      No, with precision-engineered lenses made in Germany by Zeiss. You don't seriously think Matthew Brady ground his own glass, do you?

      And gears with little tiny teeth are utterly unnecessary to look at or project something.

      The word you're groping for here is "sprocket." Without sprockets, you've got no way to pull film through the projector. And yes, sprockets have tiny little teeth that have to be very precisely machined to keep from tearing -- that is, destroying -- the film.

    17. Re:Nonsense by ChrisMaple · · Score: 1

      Kodachrome uses dyes; they are incorporated during processing instead of being part of the film as manufactured. Although Kodachrome's dyes are stable with temperature and reasonable humidity, they are not stable with exposure to light, and in fact are poorer than most other chromes in that regard.

      --
      Contribute to civilization: ari.aynrand.org/donate
    18. Re:Nonsense by Anonymous Coward · · Score: 0

      A hundred years ago they used SILVER on their films. Silver, as we all know, is exceptionally stable. It's not surprising that a film of the era could stand up to sitting in a shoebox in the middle of the Sahara.

      TODAY we use organic dyes. They don't last so long.

    19. Re:Nonsense by Anonymous Coward · · Score: 0

      the other way you loose quality

      "lose".

    20. Re:Nonsense by sql*kitten · · Score: 1

      Yes, it does. Scanning usually involves physical damage and dye fading.

      I've seen fabulous prints made from hundred-year-old negs. There's no more deterioration scanning than there is printing on a traditional enlarger (i.e. none, apart from mechanical if you're careless).

    21. Re:Nonsense by sql*kitten · · Score: 1

      If it breaks you can still put together the two pieces.

      Not if it's been broken up and turned into walls, roads, etc, which is how lots of Classical Greek writing was lost. Film can't really be repurposed for anything else, so there's no motivation for anyone to do anything but leave it where it is.

    22. Re:Nonsense by Mister+Incognito · · Score: 1

      Good reasoning.

      Unfortunately I'm sure the "modern arts" types will find creative uses for films in sculptures. I'm sure my sister would ;)

  12. Just open it in OOo by reality-bytes · · Score: 0, Redundant

    http://www.openoffice.org

    Now stop being such a pansy ;-)

    --
    Ripping a new rectum in the fabric of spacetime.
  13. Usage question... by teknikl · · Score: 0, Offtopic

    Can I use MXF to recompress my Tivo recordings of MXC?

    1. Re:Usage question... by Bitmanhome · · Score: 1

      No. Your Viewer License doesn't allow transcoding, only time-shifting. You can watch it once, then you must delete it.

      --
      Not that this wasn't entirely predictable.
    2. Re:Usage question... by sjf · · Score: 1

      MXF is a container (wrapper plus metadata) format, not a compression format.

      -S

    3. Re:Usage question... by Anonymous Coward · · Score: 0

      not where I live, pal...

  14. JPEG 2000 for video? Huh? by Zarhan · · Score: 2, Interesting

    Okay, so JPEG 2000 uses wavelets and is therefore quite advanced, but as I have understood, it's still geared for still images (ok, there is probably some form of motion jpeg 2000?).

    I would think that the best method would be to use something like DIRAC instead (or Ogg Theora). DIRAC uses wavelets and adaptive arithmetic coding, so it should be "on par" with JPEG 2000 - and should also be free of patent encumbrance.

    JPEG 2000 has one feature that might make it better in "archival" purposes - there is a lossless mode which still achieves higher compression ratios than PNG.
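
    For anyone wondering how a wavelet transform can be lossless at all, here is a minimal sketch. It rests on assumptions: it uses the integer Haar transform (sometimes called the S-transform) as a stand-in, not the 5/3 lifting filter that JPEG 2000's reversible mode actually uses, and the function names are made up for illustration. The only point is that integer averages plus differences can be inverted exactly, so nothing is lost before the entropy coder runs.

```python
def forward(samples):
    """One level of the integer Haar (S) transform on an even-length list of ints.

    Illustrative stand-in only: JPEG 2000's reversible mode uses a 5/3 lifting
    filter, not this one.
    """
    averages, details = [], []
    for a, b in zip(samples[0::2], samples[1::2]):
        d = a - b             # detail: difference between the pair
        s = b + (d >> 1)      # average: floor((a + b) / 2), integers only
        averages.append(s)
        details.append(d)
    return averages, details


def inverse(averages, details):
    """Undo forward() exactly, recovering the original integer samples."""
    samples = []
    for s, d in zip(averages, details):
        b = s - (d >> 1)
        a = b + d
        samples.extend([a, b])
    return samples


row = [52, 55, 61, 66, 70, 61, 64, 73]   # one row of pixel values
low, high = forward(row)
assert inverse(low, high) == row          # bit-exact round trip
```

    The details are mostly small numbers, which is what makes them cheap to encode; a lossy codec would round them off, a lossless one keeps them all.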

    1. Re:JPEG 2000 for video? Huh? by Anonymous Coward · · Score: 0

      DIRAC has a long way to go before it is useable.
      SNOW is also wavelet and is almost useable now.

      The simplest way to try SNOW out is with the
      multi-platform AV tool Avidemux.
      http://fixounet.free.fr/avidemux/
      No codecs to install.

    2. Re:JPEG 2000 for video? Huh? by UpnAtom · · Score: 1

      As I understand it, matching wavelets with motion compensation is very computationally intensive.

      Snow (cached) is the most promising attempt.

      Of traditional block-based MPEG codecs, Nero Recode's H.264 implementation is by far the best for low bitrates.
      For high bitrate archival purposes, XviD might be better.

    3. Re:JPEG 2000 for video? Huh? by Anonymous Coward · · Score: 0
      Okay, so JPEG 2000 uses wavelets and is therefore quite advanced, but as I have understood, it's still geared for still images (ok, there is probably some form of motion jpeg 2000?).

      Ever heard of MPEG-2 moron? Sheesh, get a fucking DVD and you'll see JPEG-2000 in action. Ignorance is bliss.

    4. Re:JPEG 2000 for video? Huh? by Zarhan · · Score: 2, Informative

      Dear kind AC,

      MPEG-2 has nothing to do with wavelets; MPEG-2 is based on DCT. In general, there are four methods for compression: discrete cosine transform (DCT), vector quantization (VQ), fractal compression, and discrete wavelet transform (DWT).

      MPEG codecs (1, 2, 4, H.26x) all use DCT. Have a nice day.

    5. Re:JPEG 2000 for video? Huh? by rillian · · Score: 1

      JPEG 2000 has one feature that might make it better in "archival" purposes - there is a lossless mode which still achieves higher compression ratios than PNG.

      Yes, lossless JPEG 2000 is a reasonable option. I'm not sure any lossy video codec counts as 'archival' storage. Might as well just put published DVDs in a preservation vault. The wide release of movies on DVD has done more for the preservation of movies themselves than anything else in history.

      Still, for a digital archive of the film masters, until the patent issues with JPEG 2000 are resolved, I'd just put MNG and FLAC in an Ogg file.

      And if you can spare the space, a directory with a wav file and a stack of uncompressed TIFF images is even better. Compression formats are complicated to reverse engineer.

    6. Re:JPEG 2000 for video? Huh? by Anonymous Coward · · Score: 0

      Well, he wasn't that far from the truth, as the discrete cosine transform can be represented with wavelets just as well - wavelets just give you another degree of freedom so you can get a better representation. The main thing is that MPEG-2 combines DCT with frame-to-frame compression, while JPEG2000 is just still-image compression. Of course there is nothing preventing anyone from developing frame-to-frame compression on top of JPEG2000, but we have similar efforts already (as mentioned before).

  15. OK, so when are we going to have support for it? by bersl2 · · Score: 3, Informative

    https://bugzilla.mozilla.org/show_bug.cgi?id=36351 (no link for obvious reasons) is the bug report, which has been around since April 2000 but has not progressed much due to licensing issues (copyright ones fixed, patent ones not?).

  16. Turn it up! by belg4mit · · Score: 3, Interesting

    Ummm what about the sound?!

    --
    Were that I say, pancakes?
    1. Re:Turn it up! by Anonymous Coward · · Score: 0

      MXF also supports wrapping of an open-ended number of sound formats (PCM, AC3, MPEG, etc).

    2. Re:Turn it up! by belg4mit · · Score: 1

      That's nice; my point was that audio is one half of a "video", and they didn't make any recommendations regarding it. By your logic they could have just said "HDD + MXF. Fin"

      --
      Were that I say, pancakes?
  17. Re:Keep up the good work. by Anonymous Coward · · Score: 0
    "Good work"? This is amusing?

    It's a really poor cut-and-paste troll. Woo fucking hoo.

  18. "it" being JPEG2k by bersl2 · · Score: 1

    the program being Mozilla's image library

    1. Re:"it" being JPEG2k by TheoMurpse · · Score: 4, Informative

      When there isn't patent litigation surrounding the format.

    2. Re:"it" being JPEG2k by abramsh · · Score: 1

      So Mozilla should drop JPEG too?

    3. Re:"it" being JPEG2k by Anonymous Coward · · Score: 0

      As Mozilla has become more popular it has become totally risk-averse. It's the same for CA certificates as for image formats: all the old ones cannot be removed because it would upset the users, but no new ones can be added because of legal risk. The double standard is obvious.

    4. Re:"it" being JPEG2k by TheoMurpse · · Score: 1

      I can't say about JPEG2000, but the link abramsh provides mentions a suit affecting programs that create and/or edit JPEG, and I hope that JPEG editing is not included in a browser suite! I don't know if the suit about JPEG2000 is over creating JPEG2000 or just anything at all to do with wavelet technology in images, as seems to be the case.

  19. Graceful degradation by Dwonis · · Score: 4, Informative

    Avoiding inter-frame compression means that, if you have some small amount of data corruption, you only get one, maybe two corrupted frames of video.

    1. Re:Graceful degradation by Stonent1 · · Score: 1

      Isn't that what key frames are for? Or is this better?

    2. Re:Graceful degradation by zootm · · Score: 1

      With this system, everything is a key frame. If you have corruption in an inter-frame compressed system, it lasts until the next keyframe - with a system such as this one, you only lose quality on the frame where the loss is.
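
      A minimal sketch of that difference, under a deliberately simplified model: each frame depends only on the one before it, and a key frame resets the chain (real codecs with B-frames are messier). The function name and the 250-frame key frame interval are assumptions for illustration, not anything from the report.

```python
def damaged_frames(num_frames, keyframe_interval, corrupt_frame):
    """Return indices of frames that show damage after one frame is corrupted.

    keyframe_interval == 1 models an intra-only stream (every frame is a key
    frame). Larger intervals model inter-frame coding, where the damage is
    carried forward until the next key frame.
    """
    next_keyframe = (corrupt_frame // keyframe_interval + 1) * keyframe_interval
    return list(range(corrupt_frame, min(next_keyframe, num_frames)))


intra_only = damaged_frames(num_frames=300, keyframe_interval=1, corrupt_frame=100)
inter_coded = damaged_frames(num_frames=300, keyframe_interval=250, corrupt_frame=100)
print(len(intra_only))   # 1
print(len(inter_coded))  # 150
```

      Same single bad frame, but in the inter-coded stream it smears across everything up to the next key frame.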

    3. Re:Graceful degradation by CryoPenguin · · Score: 1

      Avoiding inter-frame compression means that, if you have some small amount of data corruption, you only get one, maybe two corrupted frames of video.
      But inter-frames give a several-times lower bitrate for a given quality. If you spend that saved bitrate on error correction (e.g. PAR2), the result is much less error-prone than an intra-only codec. And PAR2 can completely correct errors, not just limit their scope.
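
      To make "spend the saved bitrate on error correction" concrete, here is a rough sketch. Big caveat: PAR2 uses Reed-Solomon codes and can rebuild several damaged blocks; the single XOR parity block below is a much weaker stand-in that rebuilds exactly one missing block whose position is known. The function names are hypothetical.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def recover(blocks_with_gap, parity):
    """Rebuild the single missing block (marked None) from survivors plus parity.

    Simplified stand-in for PAR2: one XOR parity block, one recoverable loss.
    """
    missing = [i for i, block in enumerate(blocks_with_gap) if block is None]
    if len(missing) != 1:
        raise ValueError("XOR parity can rebuild exactly one missing block")
    survivors = [block for block in blocks_with_gap if block is not None]
    repaired = list(blocks_with_gap)
    repaired[missing[0]] = xor_blocks(survivors + [parity])
    return repaired


blocks = [b"fram", b"e001", b"data"]
parity = xor_blocks(blocks)                # stored alongside the data
damaged = [blocks[0], None, blocks[2]]     # block 1 is unreadable
assert recover(damaged, parity) == blocks  # fully restored, not just contained
```

      The trade is one extra block of storage for the ability to undo a loss outright, which is the argument being made for inter-frame coding plus parity.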

    4. Re:Graceful degradation by Anonymous Coward · · Score: 0

      This is archiving--of potentially historic media! It already costs thousands of dollars to keep the reels around.

      It totally makes sense to avoid inter-frame and lossy encoding. So what if it costs a few hard drives for any particular piece? Big deal. At least it's still around, in pristine condition, to encode to a lossy inter-frame format for distribution a (few) hundred years from now.

    5. Re:Graceful degradation by Anonymous Coward · · Score: 0

      you only lose quality

      "loose".

    6. Re:Graceful degradation by Anonymous Coward · · Score: 0

      I would assume -- I would hope -- that they would keep several copies of the archived media in several different locations. The chance of two different disks in two different locations becoming corrupted in the same place is, hopefully, small.
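
      A sketch of how several copies could actually be used for repair, assuming three equal-length replicas and a made-up function name. With only two copies you can detect that they disagree but not tell which one is right; a third copy (or a stored checksum) breaks the tie.

```python
from collections import Counter


def split_blocks(data, block_size):
    """Cut a byte string into fixed-size blocks (the last one may be shorter)."""
    return [data[i:i + block_size] for i in range(0, len(data), block_size)]


def majority_repair(replicas, block_size=4096):
    """Rebuild data block by block, keeping the version most replicas agree on.

    Assumes all replicas have the same length; raises if no version of a
    block wins a strict majority.
    """
    per_replica = [split_blocks(replica, block_size) for replica in replicas]
    repaired = []
    for versions in zip(*per_replica):
        winner, votes = Counter(versions).most_common(1)[0]
        if votes <= len(replicas) // 2:
            raise ValueError("no majority for a block; replicas disagree too much")
        repaired.append(winner)
    return b"".join(repaired)


original = b"ABCDEFGHIJKL"
site_a = b"XBCDEFGHIJKL"   # corrupted in the first block
site_b = b"ABCDEFGHIJKZ"   # corrupted in the last block
site_c = original
assert majority_repair([site_a, site_b, site_c], block_size=4) == original
```

      This only fails when two sites go bad in the same block, which is exactly the unlikely case described above.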

  20. Media Matters? by Otter · · Score: 0, Offtopic
    Presumably this is unrelated to the David Brock-led (and, depending on who you listen to, George Soros-funded) Media Matters?

    I hope the digital archive guys have their trademark ducks pretty firmly in a row -- if they're playing in the same namespace as a destructive scumbag like Brock, just being in the right isn't going to help. And Soros or not, he has plenty of money, lawyers and press contacts.

    1. Re:Media Matters? by Leo+McGarry · · Score: 1

      The one that publicly skewered the Talon News guy for having a conservative agenda but remained conspicuously silent when CNN head Eason Jordan publicly defamed the US Army in Davos? That Media Matters?

      I hope you're right. Those guys are idiots.

      (Of course, these guys with their "MXF instead of QuickTime, JPEG 2000 instead of RLE" thing are pretty much looking like idiots too.)

    2. Re:Media Matters? by Anonymous Coward · · Score: 0

      Correction. Used to be a scumbag Right Wing Hitman. Now he's seen the light and is on the correct side and has shown the dirty tricks that the Right played all through the 90's. Don't like your dirty laundry being aired in public? TFB.

  21. Paper by FullMetalAlchemist · · Score: 2, Funny

    Storing digital information on paper is feasible and lots of research efforts have been put into it.

    Storing data on anything magnetic or optical is a bit worrisome. But then, it's not critical data so I guess it doesn't really matter.

    1. Re:Paper by TheoMurpse · · Score: 1

      it's not critical data so I guess it doesn't really matter

      Ouch, burn. There is nothing quite like the feeling of being told that your culture and history isn't important, and doesn't matter.

    2. Re:Paper by FullMetalAlchemist · · Score: 1

      But it isn't, unless you're also a fan of war etc.

    3. Re:Paper by RichardX · · Score: 2, Funny


      >>it's not critical data so I guess it doesn't really matter

      >Ouch, burn. There is nothing quite like the feeling of being told that your culture and history isn't important, and doesn't matter.


      Oh, c'mon.. I mean, culture.. history.. it's hardly porn. Who cares if a few decades of historical records get wiped? Heck, just make 'em up again. Losing part of your porn collection though.. now that's a disaster.

      --
      Curiosity was framed. Ignorance killed the cat.
    4. Re:Paper by fossa · · Score: 1

      Paper isn't an optical medium? Punchcards perhaps... I assume you are talking about better tech than my laser printer + scanner which I've estimated could hold about 1MB (which I believe is what the Paper Disk software can do) on an 8.5"x11" page. So what kind of data density can you achieve on paper, and what sort of printer and scanner are you using?

      But, it's an optical format... so why not design materials that can achieve much greater data densities? Like DVDs? Surely it's possible to design something like a DVD that would last as long as paper? Or maybe not...
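
      For what it's worth, the 1MB-per-page estimate is easy to sanity-check with back-of-the-envelope arithmetic. The sketch below assumes one bit per printed dot, a 300 dpi printer, no margins and no error-correction overhead; none of those numbers come from the Paper Disk software, they are just plausible guesses.

```python
PAGE_WIDTH_IN = 8.5
PAGE_HEIGHT_IN = 11.0


def page_capacity_bytes(dots_per_inch, bits_per_dot=1, overhead_fraction=0.0):
    """Rough upper bound on bytes per US Letter page at a given print resolution.

    overhead_fraction is the share of dots assumed lost to margins, alignment
    marks and error correction (0.0 means the whole page carries data).
    """
    dots = (PAGE_WIDTH_IN * dots_per_inch) * (PAGE_HEIGHT_IN * dots_per_inch)
    usable_bits = dots * bits_per_dot * (1.0 - overhead_fraction)
    return int(usable_bits // 8)


print(page_capacity_bytes(300))   # 1051875, i.e. roughly 1 MB
print(page_capacity_bytes(600))   # 4207500
```

      So about 1MB per page is what you'd expect at 300 dpi, and capacity grows with the square of the resolution until the scanner can no longer resolve the dots.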

    5. Re:Paper by Anonymous Coward · · Score: 0

      Data Matrix can accommodate up to 500 MB per square inch with a data capacity of 1 to 2335 characters. Data Matrix has a high degree of redundancy and resists printing defects.

    6. Re:Paper by be-fan · · Score: 1

      Culture and history is only important if you're a fan of war? War is a part of the human condition, but it's hardly the only part!

      --
      A deep unwavering belief is a sure sign you're missing something...
  22. HDs by eno2001 · · Score: 2, Interesting

    I could have told people this: hard drives have replaced video tape and audio tape for me for the past decade. I find them much more convenient, portable and cross platform. I have SCSI drives from 1994 that will still work in a PC (Linux or Windows) or Mac today. They are easy to back up to and restore from. The HD is about as close to perfection as you can get in a storage medium, at least until you get flash drives that can store 1 terabyte at minimum, have an infinite number of writes, and have at least a 100-year lifespan.

    --
    -"...bad old ideas look confusingly fresh when they are packaged as technology" - Jaron Lanier (Digital Maoism on Edge.o
    1. Re:HDs by Dan+East · · Score: 1

      Just out of curiosity, why do you list infinite writes as a requirement for the ideal medium for archival?

      Dan East

      --
      Better known as 318230.
    2. Re:HDs by eno2001 · · Score: 1

      Actually, I wasn't really thinking archival here. I was thinking more in terms of day to day use, which is how I use them. I have a backup system for the home server that completely wipes the entire backup drive system, reformats it and then copies everything from the live data drive system to it on a daily basis. So I write about 250 gigs per day to the backup system. It's not perfect though. I'm hoping to eventually set something up that works similar to an HP VA7400 series storage array for home use.

      --
      -"...bad old ideas look confusingly fresh when they are packaged as technology" - Jaron Lanier (Digital Maoism on Edge.o
    3. Re:HDs by Anonymous Coward · · Score: 0
      I have a backup system for the home server that completely wipes the entire backup drive system, reformats it and then copies everything from the live data drive system to it on a daily basis.

      You are asking for trouble. A few years ago, my main hard disk crashed while I was doing a backup using a similar method (I did not reformat, but I did a block-by-block copy). Result: the main hard disk was dead and the backup disk was working but unusable. A large part of the data had not been copied correctly and the block offsets in the inode table did not match the right disk sectors.

      Use rsync instead. Doing incremental backups will ensure that your backup disk is always in a consistent state. rsync is your friend.

      rsync -avxHS / /backup/
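
      One caveat on the command above: as written it copies and updates files but never removes anything from the backup. If the backup should mirror deletions too, rsync's --delete option does that. Here is a small sketch of a wrapper that builds such a command, assuming the same source and destination as above; the function name is made up, and it defaults to --dry-run so nothing is touched until you have looked at what it would do.

```python
def rsync_mirror_command(source, destination, dry_run=True):
    """Build an rsync argument list that mirrors source to destination.

    --delete removes files from the destination that no longer exist in the
    source; --dry-run only reports what would change without changing it.
    """
    command = ["rsync", "-avxHS", "--delete"]
    if dry_run:
        command.append("--dry-run")
    command += [source, destination]
    return command


print(" ".join(rsync_mirror_command("/", "/backup/")))
# To run it for real, pass the list to subprocess.run(..., check=True)
# with dry_run=False once the dry-run output looks right.
```

      Be careful with --delete: a typo in the destination path deletes whatever does not match the source.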

    4. Re:HDs by eno2001 · · Score: 1

      Thanks for the suggestion. I've been thinking over my current backup strategy and I see some potential for huge data loss in the case that something happens to the main drive while I'm in the middle of a backup. Or... when I do the mkreiserfs command to format the backup drive, if the main drive set is dead... well that's just nasty. One question though if you're still around AC... will the 'rsync' incremental backup allow me to keep the drives in perfect sync including files that I delete on the main data set? I hate the idea of having my backup take up a lot more space than my data drive and not being able to sort through what is good data and what isn't on the backup. A lot of times I copy multigigabyte video files to the data drive for temporary use and delete them. I want to make sure that my backup drive doesn't get filled with these and never delete them.

      --
      -"...bad old ideas look confusingly fresh when they are packaged as technology" - Jaron Lanier (Digital Maoism on Edge.o
  23. Personally by WormholeFiend · · Score: 0, Troll

    I can't wait until the neighbourhood drugstore starts selling HDDs instead of MiniDV cassettes.

    1. Re:Personally by Leo+McGarry · · Score: 1

      They already sell Compact Flash cards. It's not that much of a leap.

  24. Ars Technica... by EMIce · · Score: 3, Interesting

    ...had a guide on capturing analog video, said to be the first part of a 3-part series covering capturing, cleaning, and compressing in turn. Only part I ever came out - Ars do you read slashdot? - I am waiting on the last guides for some advice on how to preserve these rotting home VHS tapes.

    Meanwhile, does anyone else have advice on capturing and cleaning video since we are already talking about compression? What settings are good for capturing and what sort of software exists to clean up VHS and give it the appearance of more clarity? I am using a WinTV card as Ars recommended it.

    1. Re:Ars Technica... by Noose+For+A+Neck · · Score: 3, Informative

      I wouldn't bother with ArsTechnica. For the definitive guide to capturing analog video and digitally archiving it, you would want to read this guide on Doom9. Plus, they have many other video-related guides on that site and a forum that is second to none in terms of the sheer amount of expertise exhibited by the users there.

      --

      Software piracy is victimless theft.

    2. Re:Ars Technica... by Anonymous Coward · · Score: 0

      What the heck are you talking about?

      On the first page, it says "Part II can now be found here."

      Of course, it's only been up since ... July 2003.

      OK, they've been delinquent with part 3. But to say that "only part I ever came out" is a flat-out lie, and one easy to discover, using the link you so helpfully provided.

      Ars do you read slashdot?

      I don't see why they should. If you want to get ahold of them, posting a slashdot comment is not the way to do it. Hint: if you go to the first page of that guide (or any page in it, for that matter), the by-line is an email link to the author.

  25. Simplistic by fm6 · · Score: 2, Interesting
    I suspect that your picture of the survivability of film stock is a little optimistic. But I'll leave that issue to somebody who actually knows the technology. What really bothers me about your argument is your focus on a single factor: keeping the data available as long as possible with an absolute minimum of maintenance. If that were the only consideration, then film would actually be a bad choice. Many other archival techniques are obviously more survivable. You could, for example, etch the data on platinum plates.

    But survivability isn't the only consideration. Cost is always an issue. (So much for my platinum plates, though your approach isn't exactly cheap either.) You also want to be able to access the data in the short term. I worked my way through college operating film projectors. It is not a convenient medium!

    One thing I'd like to know is why archival-quality optical discs weren't considered. (Presumably there's something in the document about this, but it's a poorly structured Word file, and finding key facts is more work than I care to expend.) They cost 5 times as much as standard CD-Rs and recordable DVDs, but their manufacturers claim the data is good for 300 years. Of course, you need some fairly complicated technology to play them back, but CD and DVD drives are pervasive consumer devices -- they should be around for a very long time.

    1. Re:Simplistic by jonbryce · · Score: 1

      The 5.25" floppy was a pervasive consumer device.

      Try finding anything that can read them today.

    2. Re:Simplistic by fm6 · · Score: 1
      You can still buy 5.25" drives. But even if you were right, this format was never as popular as the CD is today. Not by many orders of magnitude.

      I'm not arguing that optical formats will be around forever. But they'll be around for a lot longer than floppy drives.

      Any format you could pick is a tradeoff between long term availability and the other factors I mentioned. No format will be around forever.

    3. Re:Simplistic by Anonymous Coward · · Score: 0

      The 5.25" floppy was a pervasive consumer device.

      Try finding anything that can read them today.


      MFM and RLL hard drives used to be pervasive. Try finding a PC today that can read them. Even today, Parallel EIDE is being slowly replaced by SATA. Hard drives are no more immune to obsolescence than any other technology.

    4. Re:Simplistic by sql*kitten · · Score: 1

      What really bothers me about your argument is your focus on a single factor: keeping the data available as long as possible with an absolute minimum of maintenance.

      Archivability is about more than just technology. It needs to be as near as possible to maintenance free because if you're planning for the long term, you have to assume that no-one will bother after you're gone, not until your era becomes "interesting" (for example, there's no-one alive who remembers it). Similarly, it needs to be resistant to theft: it needs to have little intrinsic value other than the data on it. Countless artifacts from Classical Greece have been lost because peasants smashed up the buildings, statues, etc. to get building materials; they didn't care what they were destroying. That's right: even engraving your data on bloody great hunks of stone doesn't work! People might steal platinum, but they won't steal gelatine for any reason I can think of.

    5. Re:Simplistic by fm6 · · Score: 1

      Great, film is a highly archivable medium. But why is archivability the only factor you care to consider?

    6. Re:Simplistic by sql*kitten · · Score: 1

      But why is archivability the only factor you care to consider?

      Because the thread is about preservation!

      I'm a photographer and for fast workflow, you can't beat a DSLR, and they've been "as good as" film for image quality since the old Nikon D1. Got my Nikon D2x on pre-order. But the stuff I shoot that I maybe want to show my grandkids one day, that goes onto traditional silver emulsions and prints through the Ilford Archival Sequence.

    7. Re:Simplistic by fm6 · · Score: 1

      But preservation doesn't happen in isolation. It's something people do for a specific purpose.

  26. The move to disk backup continues by RonBurk · · Score: 3, Insightful
    People objecting to the use of hard drives for backup miss several points.

    • Yes, they are (somewhat, not excessively) vulnerable to magnetism. But optical discs are vulnerable to light, fingerprints, chemicals, etc.
    • Optical discs continue to lag far behind in capacity. So, in the land of audio/video backup, the choice is between a single hard disk, or dozens of optical discs. The risk of failure of multiple optical discs is amplified by the increasing number of discs.
    • Bandwidth is another issue. Though hard disk bandwidth lags behind the growth of hard disk storage, optical disc bandwidth lags even further behind.
    • Restore time is another issue. You can line up a bunch of optical disc drives and try to make all your data available at once, but you're probably never going to get the restore speed of solutions like Massive Arrays of Inactive Disks.
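
    The point about risk growing with the number of discs can be made concrete. This is a minimal sketch, assuming every disc fails independently with the same probability; the 2% figure and the disc counts are illustrative and do not come from the report.

```python
def prob_any_failure(p_single: float, n_units: int) -> float:
    """Probability that at least one of n independent units fails.

    p_single is the assumed failure probability of one unit over the
    storage period; independence between units is an assumption.
    """
    return 1.0 - (1.0 - p_single) ** n_units


if __name__ == "__main__":
    p = 0.02  # hypothetical per-disc failure probability
    for n in (1, 12, 50):
        print(f"{n:3d} discs: {prob_any_failure(p, n):.3f}")
```

    Under these assumptions the chance of losing at least one disc climbs quickly with the disc count, which is the effect the second bullet describes.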


    There will always be multiple backup solutions, but the biggest trend continues to be towards using hard disks for backup. When your data files are enormous (such as with audio/visual data), HDD backup is even more attractive.
    1. Re:The move to disk backup continues by Videojunkie · · Score: 1

      As always with these sorts of debates, people tend to project their own experience and needs onto the recommendations they make.

      So you are right, people objecting to hard drives do miss those points. Hard drives are suitable for some purposes but not for others. Similarly, those suggesting film-based storage have their own reasons, and it is right in some situations (for example, when reliability is the most important thing) but less good in others (when it needs to be shared with many other users).

      Similarly with the debate on the best format to save the pictures, there is only one right answer: it depends. It depends on
      o Do you want to edit the contents?
      o Where will it be viewed?
      o And by whom?
      o What is the quality of the original?
      o Must the sound remain in tight sync?
      o .....

      You get the idea. There are loads more of those questions.

      Similarly for a wrapper format: this should be separated out from the question about the best compression for the pictures and sound. Of course you may not need to wrap the video and sound and other information all together, but increasingly nowadays you do. In that case MXF is a good choice as a wrapper format, though not the only one and not suitable for every purpose. (Another member says MXF is a metadata format. That is not exactly wrong: MXF wraps many different things together, including metadata.) In MXF, metadata is encoded in a way known as KLV (Key Length Value), standardised by the film and television industry.
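
      For readers who have not met KLV before, here is a minimal sketch of writing and reading Key-Length-Value triplets. It assumes the layout I understand SMPTE 336M to describe: a 16-byte key, a BER-encoded length, then the value. The key bytes and payloads are made up for illustration and are not real MXF labels, and the indefinite-length BER form is not handled.

```python
from typing import Iterator, Tuple

KEY_SIZE = 16  # assumed size of a SMPTE-style universal label


def encode_ber_length(n: int) -> bytes:
    """Short form for lengths below 128, long form otherwise."""
    if n < 0x80:
        return bytes([n])
    body = n.to_bytes((n.bit_length() + 7) // 8, "big")
    return bytes([0x80 | len(body)]) + body


def encode_klv(key: bytes, value: bytes) -> bytes:
    if len(key) != KEY_SIZE:
        raise ValueError("key must be 16 bytes")
    return key + encode_ber_length(len(value)) + value


def iter_klv(buf: bytes) -> Iterator[Tuple[bytes, bytes]]:
    """Yield (key, value) pairs from a buffer of back-to-back triplets."""
    pos = 0
    while pos < len(buf):
        if pos + KEY_SIZE + 1 > len(buf):
            raise ValueError("truncated KLV header")
        key = buf[pos:pos + KEY_SIZE]
        pos += KEY_SIZE
        first = buf[pos]
        pos += 1
        if first < 0x80:
            length = first
        else:
            n_len_bytes = first & 0x7F
            length = int.from_bytes(buf[pos:pos + n_len_bytes], "big")
            pos += n_len_bytes
        if pos + length > len(buf):
            raise ValueError("truncated KLV value")
        yield key, buf[pos:pos + length]
        pos += length
```

      Because each triplet carries its own length, a reader can skip over keys it does not recognise, which is part of what makes a wrapper like this workable for mixed content.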

      So the debate goes on because there are lots of options, and there are lots of options because there are a lot of different needs.

      Which is why you can't say what the best way of preserving video is until the questions are answered in your case, and the questions are:

      Why are you doing it?
      Who are you doing it for?
      What will they do with it?
      When will they do it?

    2. Re:The move to disk backup continues by Noose+For+A+Neck · · Score: 1
      What the hard drive advocates here seem to be forgetting is that it is far preferable to be able to separate your storage medium from the mechanism that plays it back. If your tape drive breaks for whatever reason, you buy another tape drive and continue storing/retrieving data as before. If your hard drive's electronics fail, the head starts sticking or any number of other things that can happen to this startlingly complex array of precisely manufactured components when left in cold storage for a long time, you've just lost not only your I/O mechanism but also the data contained within it, unless you want to go through some exorbitantly expensive data recovery process (to whose prices I'm sure the IT professionals here can attest).

      Not that tapes are necessarily the answer, but it is a better idea to get separation between storage medium and I/O mechanisms.

      --

      Software piracy is victimless theft.

  27. More JPEG-2000 stuff by fnord_uk · · Score: 2, Informative

    There is more to JPEG 2000 than a compression scheme offering scalable quality and resolution within a single losslessly compressed file. There is also the interactive delivery mechanism offered by the JPIP protocol. Now there is something really useful...

    --
    In theory, theory and practice are the same. In practice, they're not.
  28. Re:Keep up the good work. by Anonymous Coward · · Score: 0

    Hell yes, it's amusing.

    I haven't seen that posted in years now.

    People who dislike Slashdot trolls have no sense of humor.

  29. Digital duplicates by jfengel · · Score: 1

    The nice thing about digital media is that you can leave it in an unrefrigerated shoe box for a decade or two, then come back and make a perfect copy of it with absolutely no degradation.

    You can also make a perfect copy and stick it in numerous locations, making them harder to lose in a fire/terrorist attack/rampaging llama incident. They don't require refrigeration, and they take up a lot less room.

    But it's not perfect: there's an analog-to-digital step, and you lose information there. Even a print of the film is an analog-to-analog copy, which is even blurrier. The only way to preserve that perfectly is to preserve the original medium.

    This plan would be best done alongside preservation of the original media. You preserve media for things you have reason to believe are worth the expense of elaborate storage for a century. For the rest of it, you keep a digital copy which is as accurate as possible, and you revisit it very, very rarely to ensure that it is still viable.

    There is expense associated with refreshing the media every decade or so (at which time you also compress 10 CDs onto 1 DVD, then 10 DVDs onto a single holo-tera-whatever). But while I haven't seen numbers, I suspect that preserving the film stock is at least as expensive, and probably more so, and still prone to single points of failure.

    1. Re:Digital duplicates by ChrisMaple · · Score: 1

      There's no reason that film can't be used as the digital medium.

      --
      Contribute to civilization: ari.aynrand.org/donate
  30. Why not RAW? by Eunuch · · Score: 1

    There's an interesting prospect. RAW actually takes up less space than TIFF even though it holds potentially more information. But decoding it takes a lot of CPU time.

    --
    Transcend Humanity. Please.
    1. Re:Why not RAW? by Anonymous Coward · · Score: 1, Informative

      There's a good idea in there, but it needs to be clarified.

      We're all familiar with RAW files being smaller than TIFF while containing equal or greater content, in the case of digital cameras.

      The various RAW formats preserve a camera's individual RGB sensor element values, while a TIFF file is the product of processing on each element's neighbors to simulate (for example) 5 megapixels of 24 bit color from a matrix of 5 million 8 bit monochrome sensors. TIFF is a bit of a waste for digital camera pictures. It amounts to puffery for that application, basically inflating the size by 100% and that is done primarily for the unfortunate purpose of avoiding the camera manufacturer's proprietary RAW file format.

      But in other applications, such as the case of a flatbed scanner, each "pixel" is a real multicolored pixel with its own R, G and B sensor values... so there may actually be 24 (or more) bits of true raw data per pixel... and so correspondingly you don't see a proliferation of RAW file formats for that application. TIFF files are actually not really a waste for scanned images. They may be large, but at least they do not expand the size of the dataset for that application.

      Now, in the case of video, the scanning hardware will be quite different. There will be no optical sensors to encode RGB data (since the video camera did that already). Instead, you have this composite video signal, with embedded luminance and chroma values constantly being sampled by a high-speed analog-to-digital converter chip (ADC). Any subsequent "RGB" values would be only the result of processing... which of course expands out the video data somewhat analogously to how the TIFF format unnecessarily puffs out a digital camera sensor matrix. JPEG-2000 or any of the other compression formats are basically adding processing on top of this processing. I'm sure it would make plenty of sense to explore the feasibility of a "RAW" video format consisting essentially of the data stream from the ADC.

      That would truly be a raw digitized version of the video signal, on top of which compression could be added.

  31. why not jpeg2000 by Anonymous Coward · · Score: 0

    The main reason for not choosing other storage formats is that film has a bit depth greater than 10 bits per color channel. MPEG, MPEG-4, WMV, Quicktime, etc. do not satisfy this requirement. Also, the film people do not want their frames interpolated; they want absolute frame-accurate reproduction. It's no wonder that Media Matters has chosen JPEG2000: it's the format that has been ratified by the D-Cinema folks (meaning all future digital projection is JPEG2000), and there are already real-time JPEG2000 solutions available: http://www.digital-rapids.com/Brochures/CarbonHD.pdf (pdf file)

  32. Yawn, another technology of the day... by HockeyPuck · · Score: 3, Insightful

    All this is is a method to line some guy's pockets. I'm sure the tape guys are gonna say, use XYZ type of tape. The disk guys are gonna say disk.

    What makes this guy think that the interface to the HDD is going to be around in X years?

    PCs have only had two dead (non-(e)IDE/ATA) interfaces, the ESDI and the ST506/ST-412 interfaces.

    But what if you were trying to find a computer with an IPI (1960s mainframe) interface?

    The Fed gov't has this problem with trying to find parts for their old 8/9-track tape drives.

    Here's a good list of all the HDD interfaces over the years: http://www.i-t-s.com/corporate/terms.html

    Stick with microfiche, film, that way we don't have to pay some vendor $$$/yr to keep alive a dead technology or pay some other vendor $$$/media to move them from old to new media.

    1. Re:Yawn, another technology of the day... by Anonymous Coward · · Score: 0

      How many mainframes with IPI interfaces have been built? 50? 300? 5000? How many machines with ATA interface have been built? Billions?

      I think the chance to use ATA drives in 45 years (1960-2005) is much better than with IPI.

      Besides that, no matter if you use tape, discs or whatnot, you have to transfer to another medium eventually. And in the meantime, discs are just much more convenient.

    2. Re:Yawn, another technology of the day... by Wesley+Felter · · Score: 2, Insightful

      This is not an issue because you never remove the hard disk from its computer. When the computer becomes obsolete, you buy a new one (with new disks), and copy the data over.

    3. Re:Yawn, another technology of the day... by HockeyPuck · · Score: 1

      Have you ever run a large backup operation? I'm not referring to a few tape drives, but say a hundred LTO tape DRIVES, and you're backing up almost a petabyte a month? Some of those tapes have SEC regulations mandating that you keep them around for 7-10 years? That's A LOT of tapes...

      I may be using LTO now, but 10 years AGO I was using DLT-1.... That's a lot of tapes... in a dead technology. Especially since an LTO tape drive of today cannot read a DLT tape of 10 years ago.

  33. Heh. Magnetic tape probably safer. by Ayanami+Rei · · Score: 1

    Honestly... just re-copying a digital linear tape every 10 years is a safer bet than any other technology or solution.

    DLT tapes are nearly indestructible with a price/GB which is close to HDDs. And now we have LTO, which is even more dense in a similar form factor.

    I've recovered data from tapes that were left outside in the rain, kicked around in storage boxes, and stained with smoke (after cleaning of course). All you need to do is keep magnets away (and by this I mean keeping the field strength under a few gauss... )

    Any media relying on a chemical process to retain information is probably not a good idea... and I think it would cost more in the long run.

    --
    THIS THING CAN TURN ON A DIME, MACROSSZERO STYLE ALSO FUCK BETA, ~NYORON
    1. Re:Heh. Magnetic tape probably safer. by ChrisMaple · · Score: 1

      Magnetic tapes are flexed whenever they're used. Flaking off of the magnetic medium (oxide) is always a problem, unless the oxide is covered with a protective layer (as is done with some language lab tapes). Such a protective layer moves the oxide away from the head, which reduces data density.

      --
      Contribute to civilization: ari.aynrand.org/donate
  34. Actually... by Craig+Ringer · · Score: 1

    I'd be happy to be corrected, but my understanding is that RAID 5 is no more capable of telling you which drive is "right" about a block than RAID 1, at least in its most common default configuration.

    RAID 5 with one parity disk essentially stores the XOR of all the other disks' data on the parity disk. (It's not that simple because it stripes it across disks, but in terms of the logic that doesn't matter.)

    If I have three values, a, b, and p where:

    p = a XOR b

    and I find out that *one* of them is wrong because at a later stage I check and find out that now:

    p != a XOR b

    all I know is that one of a, b, or p is wrong. Not which one. As I understand it, the purpose of RAID 5 is that if I lose any one of a, b, or p I can reconstruct it from the two remaining values, and I can extend this scheme up to any number of values ("disks").

    For this reason, I'm pretty certain most decent RAID controllers reserve a small percentage of your disk's capacity to store checksum (probably CRC) information. This lets them checksum blocks and determine *which* block is wrong, so they can go from "one of these values is wrong" to "this value is wrong, let's reconstruct it from the others".

    My understanding is that RAID 1 frequently doesn't use this, because it's not doing any logical disk remapping like RAID 5 does anyway, and also typically wants to retain the option of using any one disk stand-alone in a non-RAID system.

    It should also be noted that IIRC high-end RAID systems permit you to dedicate two disks (or even more?) to parity. Much like the difference between a CRC value and an ECC value, this lets you recover one wrong value (without extra CRCs etc) and detect two wrong values.
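
    The XOR reasoning above can be sketched in a few lines. This is a toy model in which short byte strings stand in for disks; it is not how any particular RAID controller is implemented, and the block contents are made up.

```python
def xor_blocks(*blocks: bytes) -> bytes:
    """Byte-wise XOR of equally sized blocks."""
    if len({len(b) for b in blocks}) != 1:
        raise ValueError("need at least one block, all the same size")
    out = bytearray(len(blocks[0]))
    for block in blocks:
        for i, byte in enumerate(block):
            out[i] ^= byte
    return bytes(out)


def demo() -> None:
    a = b"dance"
    b = b"video"
    p = xor_blocks(a, b)  # parity: p = a XOR b

    # A disk that is known to be missing can be rebuilt from the others.
    assert xor_blocks(b, p) == a
    assert xor_blocks(a, p) == b

    # Silent corruption is detectable: the parity no longer matches.
    bad_a = b"dunce"
    assert xor_blocks(bad_a, b) != p

    # But parity alone cannot say which value is wrong. The same mismatch
    # is explained equally well by blaming b instead of a:
    alt_b = xor_blocks(bad_a, p)
    assert xor_blocks(bad_a, alt_b) == p


if __name__ == "__main__":
    demo()
    print("parity checks behaved as described")
```

    The last assertion is the crux: without an extra checksum, or an error reported by the drive itself, every member of the stripe is an equally plausible culprit.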

    1. Re:Actually... by Eric+Seppanen · · Score: 1

      You're assuming that a hard drive will always return the data stored on the platter, right or wrong. In truth, hard drives use enough CRC bits internally that the data will be returned correctly, or not at all. That means with two mirrored drives, you know precisely which one went bad, because it's the one that returns an error when you try to read certain sectors.

      --
      314-15-9265
    2. Re:Actually... by Harik · · Score: 1

      You're thinking of RAID 4. RAID 4 is parity, and given a "dumb" disk, it would fail if bad bits were returned. Again, disks return good data, or none at all. RAID 5 is more complex, using ECC, and you can't say "this disk is dedicated to ECC" because it's rotated through all the disks for performance reasons.

  35. Good to know by Craig+Ringer · · Score: 1

    Thanks for pointing that out - it's something I should've realised.

  36. Funny Story by ArcSecond · · Score: 1

    A friend of mine has a friend who was running lots of host machines (web servers, irc, file sharing, etc.) in a kind of ghetto set-up in Calgary. One day, everyone sitting in irc saw the connection drop, all the stuff being served was gone. No web site, no streaming audio, no irc.

    The servers were all sitting on a shelf... you know, the kind you use those brackets that screw into the wall, and put some board on top? I will leave it to your imagination to figure out what the technical problem was that day.

    Talk about servers crashing! :)

    --

    I've got a bad attitude and karma to burn. Go ahead. Mod me down.

  37. JPEG-2000 is encumbered by patents by runderwo · · Score: 1

    TSIA. It would probably be a bad idea to start publishing material with it until the patent expires.

  38. OpenOffice.org by tepples · · Score: 1

    Should Microsoft's Office business unit collapse, the OpenOffice.org maintainers will probably freeze the .doc format as whatever OO.o Writer's .doc import filter accepts.

    Besides, if you have Windows 98 or later, you can open simple Word 97 .doc files in WordPad.

  39. Document the file format with C source code by tepples · · Score: 2, Insightful

    And if you can spare the space, a directory with a wav file and a stack of uncompressed TIFF images is even better. Compression formats are complicated to reverse engineer.

    Store .mng + .flac + source code for libmng and libflac, and you don't need to worry about any sort of complicated gnireenigne.

  40. Decoder simplicity importance? by tepples · · Score: 1

    TIFF would be a much better choice for archiving, because it's a much simpler format and is much easier to decode.

    Does it really matter how simple something is to decode if you're including source code for the decoder libraries on each HDD?

    1. Re:Decoder simplicity importance? by Leo+McGarry · · Score: 2, Insightful

      Because when you're archiving digital data, recoverability is paramount. You have to ask yourself, "What if all I had was a piece of this data, say, a hundred gigabytes from the middle of the disk? Could I turn that data into useful information?"

      If you're dealing with a run-length-encoded array of packed pixels, the answer is obviously yes. That's among the simplest forms of encoding known. (If you don't RLE the data it's even simpler, but a trade-off between simplicity and storage requirements is okay as long as you maintain a lot of simplicity.) Even if you don't know how the data was encoded, you've got a good chance of figuring it out just by doing some simple analysis on the bytes. But with a complex encoding scheme, it's much more difficult to figure out what you're dealing with just by looking at it.

      When talking about archiving, the objective is to be able to recover as much as possible given as little as possible.
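
      A minimal sketch of the kind of run-length encoding described above, using a simple (count, value) byte-pair scheme chosen purely for illustration. Real formats, such as the PackBits scheme used in some TIFF files, differ in detail.

```python
def rle_encode(data: bytes) -> bytes:
    """Encode as repeated (run_length, value) pairs, runs capped at 255."""
    out = bytearray()
    i = 0
    while i < len(data):
        run = 1
        while i + run < len(data) and data[i + run] == data[i] and run < 255:
            run += 1
        out += bytes([run, data[i]])
        i += run
    return bytes(out)


def rle_decode(encoded: bytes) -> bytes:
    """Invert rle_encode; works on any pair-aligned fragment."""
    if len(encoded) % 2:
        raise ValueError("expected (count, value) pairs")
    out = bytearray()
    for i in range(0, len(encoded), 2):
        out += bytes([encoded[i + 1]]) * encoded[i]
    return bytes(out)


if __name__ == "__main__":
    pixels = b"aaabbbbcc"
    packed = rle_encode(pixels)
    assert rle_decode(packed) == pixels
    # A fragment from the middle still decodes without any other context.
    assert rle_decode(packed[2:]) == b"bbbbcc"
```

      The second assertion illustrates the recoverability argument: a fragment cut on a pair boundary decodes on its own, with no header or codec state needed.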

  41. Shortcomings of the study by Anonymous Coward · · Score: 0

    I noted several shortcomings of this study.

    1) He states that Jpeg2000 can be lossless or that it can "scale down" and be compressed. He does not subject the "scaled down" Jpeg 2000 codecs to the same rigor as he subjects other compressed formats, e.g. Mpeg4, Mpeg2, Sorenson, WM9, and real video. I think if anything like the hardware he suggests is put into place there will be extreme temptations to scale down the capture. He suggests that people searching these archives would be happy to have "scaled down" transcodings of the uncompressed file. How many MIPS does that take? How long does that take? Why won't he tell us how this lossy transcode looks compared to the other lossy transcodes? Of the people who go to the library to check out dance videos, how many of them are going to be pleased or even able to view a lossy transcode in MXF format?

    2) 640*480 captures??? Who uses this but amateurs??? It has the advantage of being square pixels, and many of the compressed formats he uses are also square pixels, but even his "uncompressed" AVI is going to be -DOWNSAMPLED- from the DVCAM source he lists. That's not something you want to do before upsampling to HD.

    3) The Mpeg 2 test lists 20Megabit Mpeg 2 but the source is 640*480 video. Mpeg 2 like the DVCam tape can do better, I don't understand not using "main profile" and not using a normal data rate. CBR like he did is not the only way to produce Mpeg2. Though I am a fan of CBR I wonder about things like 2 pass VBR and how that would affect the quality measurement.

    4) What software or software packages were used to do the respective compressions? It's not listed. Different packages can have wildly different quality results.

    5) I feel the author truly misunderstood Mpeg-2 when sentences like this are in the report: "While this form of encoding looks more or less attractive on a standard television screen, whole frames of video are thrown lost, thrown away in the digitizing process to get the file small enough to fit onto the DVD media." Not only does he seem to not quite get the techniques, he radically underestimates the advantages of Mpeg-2 and its omnipresence. You can get a wide variety of encoding and decoding packages and devices. You're not waiting around for a few "real time jpeg2000" capture boards to hit the market. You don't need a container format with Mpeg2 and it is HIGHLY optimized to reproduce what is on a video tape, so much so that the sampling is RARELY like his lame 640*480 captures.

    I know there are problems with Mpeg2 and I respect the instinct to go for high quality, but this study is just flawed. While he devalues the archivists for "hoping for the best" in maintaining their analog libraries, he is out there "hoping" for some jpeg2000 capture boards and some market adoption of MXF format. Goodness, he's making an expensive proposition based on Hope. I see no cost analysis besides a "projection" of dropping hard disk storage costs. Perhaps there are some extreme system costs to this sort of storage besides just bulk media???

  42. Intraframe vs. interframe by Wesley+Felter · · Score: 4, Interesting

    For whatever reason (I'm not a video expert) many people prefer intraframe codecs for archival. As you probably guessed, Motion JPEG 2000 just treats each video frame as a still image and compresses it with JPEG 2000.

    Dirac will give much better compression than JPEG 2000, but it also introduces the possibility of interframe artifacts.

    1. Re:Intraframe vs. interframe by andrewleung · · Score: 1

      There's more to Motion JPEG 2000 than just wrapping a bunch of JPEG 2000 files into 1 file.

      -timings and support with audio (file format)
      -description for storing interlace video (odd, even, first, last, etc.)
      -Color space and sampling (YUV 4:2:2, 4:2:0, etc.)
      -streaming writes and reads (faster access and writing.)
      -and more.

      Why do people want this for archival? It has to do with preserving the source as closely as possible. Film stock is frame by frame, why should interframe coding be used for archival?

      Also, you can seamlessly use the same codec for your image and video archive.

      What makes JPEG 2000 stand out from other wavelet based codecs (wavelets have been around for a LONG time and in other standards):
      -EBCOT
      -a whole bunch of features such as: random codestream access, scalability, network streaming, etc.

      if you're more interested in this, buzz me offline: rockin@gmail.com

  43. the problem is not film cost, but qc by cinnamon+colbert · · Score: 1

    U r 100% right that paper and film have a proven track record, which is more than u can say for anything digital (proven = > 100 years). Analog is also inherently superior to dig for archive in that partial loss is only partial loss.
    The required equipment could be built in volume at a reasonable price (and even custom glass lens are not that $$, as you can see from say photonics magazine), but as u c every day in cameras, you can mold pretty good lens out of polycarbonate or coc cheaply). The cost of 35 or 64 mm movie film is cheap, etc IN VOLUME
    which raises the question,if this is real, why is kodak or someone like that not doing this.
    I dont know the size of the data archive market, but it has to be at least a few hundred MM a year, small but i think just large enough to interest a company like kodak or fuji or agfa or even a zombie like polaroid.

    but..you still have analog to dig to get the data back, and you have the horrendous problem of cataloging analog data for a digital world and you have the problem of the declining cost of storage...which is probably the largest argument, from an ROI standpoint, against film
    IMHO this whole archive thing is silly..you do what every /. reader does, you transfer to new, much larger media every 5 years

    1. Re:the problem is not film cost, but qc by spoco2 · · Score: 1

      I'm sorry, I stopped reading after " U r "

      Geeze... if you have anything worthwhile to post, could you try an approximation to English?

    2. Re:the problem is not film cost, but qc by Anonymous Coward · · Score: 0

      what is wrong with u r for "you are" ?
      even fowler (far more stringent and unforgiving in his approach to english grammar than say strunk) approved of things that make life easier...(and if you do not know who Fowler and Jespersen [http://www.britannica.com/ebi/article?tocId=9275149] are, maybe you should learn a little before pontificating about english usage)
      You might also try re reading Orwell on usage...

    3. Re:the problem is not film cost, but qc by Anonymous Coward · · Score: 0

      What's wrong with U R?

      It makes the user sound like an uninformed cretinous shithead with the intellect of a granite countertop, therefore devoid of anything interesting or useful to say.

      U R is great on cellphone or IM between two dipshit friends; however if you're trying to make a point, it's automatically lost. It's like hanging a billboard on your face advertising "I'm too stupid/lazy to put forth the effort to type the six characters required to communicate my thought effectively, therefore ALL OF MY IDEAS ARE SHIT"

    4. Re:the problem is not film cost, but qc by spoco2 · · Score: 1

      (I shouldn't reply to ACs, but)
      " what is wrong with u r for "you are" ?"
      Urgh... A lot is wrong with it, it immediately puts one in the mindspace of SMS or childish speak... it's not going to have you be listened to by anyone with anything approaching a good education or something to say...

      And 're reading Orwell on usage'??? What, you think we should all move to newspeak? OK... I'll leave you to your doublethink.

  44. Re:MXF because they are stupid by Anonymous Coward · · Score: 0

    An archive format??
    1) Jpeg 2000 is not widely used, and is not much better than good old jpeg. It's easier to implement a jpeg decoder, or find info on how to make one.

    2) But really, the most flexible would be to use NO container format.(folder of jpegs, a wav file, and a text file telling you the fps)

    3) raw wav audio is high quality and small (compared to video) and can be read by anything

    4) storage is silly; 10-20 years a new format will be pushed, and you'll have to migrate everything anyway. The media might last, but the players will not. just pick what is cost effective now.

  45. Recoverability depends on seekability by tepples · · Score: 2, Funny

    Because when you're archiving digital data, recoverability is paramount.

    No, Viacom is paramount.

    "What if all I had was a piece of this data, say, a hundred gigabytes from the middle of the disk? Could I turn that data into useful information?"

    As long as your codec is seekable, this works. Motion JPEG is trivially seekable, consisting entirely of keyframes. Toss a redundant copy of the codec on the volume after every GB or so of video data, and recoverability is preserved.

    1. Re:Recoverability depends on seekability by Leo+McGarry · · Score: 0
      You're not understanding me. Let me go through this again.

      Here's a bunch of data:
      00 A0 3C 0D 18 99 5E ...
      Let's assume, for sake of argument, that you know this is video data. That's not a valid assumption, but let's grant it anyway. Your task is to split this data into frames and decode them into something that can be displayed.

      If the data consists of an array of pixels, one right after another, that's easiest. All you need to do is find the boundaries of each frame and then figure out whether the pixels represent color channels or what. Relatively easy.

      If the pixel array is run-length encoded, that's also fairly easy. There will be patterns in the data that make it fairly clear that you're dealing with run-length encoded data.

      If, on the other hand, the data is a list of quantized coefficients that need to be plugged into a transform algorithm in order to be turned into pixel data, you're screwed. You can brute-force it to try every known transform (DCT, wavelet, whatever) against every permutation of the data set, but at that point you're just grinding metal. The odds of getting useful information out of the data in a reasonable amount of time are so close to zero as to be hardly worth discussing.

      Generally speaking, the more clever you are in encoding your data, the less recoverable the encoded data will be. Normally this isn't important, but it becomes vitally important when you start talking about archiving. When it comes to archiving, recoverability is right up there with durability. Everything else is secondary.

      Do you understand now?
    2. Re:Recoverability depends on seekability by glyph42 · · Score: 1

      JPEG has easily recognizable headers, and is divided into blocks again with easily recognizable headers. You're not screwed. The JPEG committee thought of all these things long before you started griping about them. I think you need to do some more understanding yourself.
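
      As a rough illustration: a JPEG stream begins with the SOI marker bytes FF D8 and ends with the EOI marker bytes FF D9. The naive sketch below scans a fragment for those two markers. It assumes frames stored back to back with no embedded thumbnails, and it does not parse segment lengths, so treat it as a starting point rather than a robust recovery tool.

```python
from typing import List, Tuple

SOI = b"\xff\xd8"  # JPEG start-of-image marker
EOI = b"\xff\xd9"  # JPEG end-of-image marker


def find_jpeg_frames(buf: bytes) -> List[Tuple[int, int]]:
    """Return (start, end) byte offsets of candidate JPEG frames in buf.

    'end' is exclusive and points just past the EOI marker. A frame whose
    EOI is missing (for example, cut off at the end of the fragment) is
    not reported.
    """
    frames = []
    pos = 0
    while True:
        start = buf.find(SOI, pos)
        if start < 0:
            break
        end = buf.find(EOI, start + len(SOI))
        if end < 0:
            break
        frames.append((start, end + len(EOI)))
        pos = end + len(EOI)
    return frames
```

      On a fragment that starts mid-frame, the leading partial frame is simply skipped and scanning resumes at the next SOI.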

      --
      Music speeds up when you yawn, but does not change pitch.
    3. Re:Recoverability depends on seekability by tepples · · Score: 1

      Here's a bunch of data: 00 A0 3C 0D 18 99 5E ...

      That's seven bytes. You said we have a 100 GB fragment of extant data. Under the scheme I've been explaining, which semi-periodically embeds a copy of the decoder's source code, you seek forward until you find a long string of bytes with bit 7 clear, which is the signature of US-ASCII text. This string of bytes will contain the source code. Then you compile and link the decoder, and it will parse the extant 100 GB and extract MJPEG frames from it. Or do you assume that C and /* English */ will be dead languages by the time you want to recover this data?
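
      The seek step described above can be sketched as follows. The 64-byte minimum run length is an arbitrary threshold chosen for illustration; compressed data contains short runs of low bytes by chance, so a real tool would want a longer threshold and further checks before trusting a hit.

```python
from typing import Optional, Tuple


def find_ascii_run(buf: bytes, min_len: int = 64) -> Optional[Tuple[int, int]]:
    """Find the first run of at least min_len bytes with bit 7 clear.

    Returns (start, end) offsets with end exclusive, or None if no such
    run exists. Bit 7 clear means the byte value is below 0x80.
    """
    run_start = None
    for i, byte in enumerate(buf):
        if byte < 0x80:
            if run_start is None:
                run_start = i
        else:
            if run_start is not None and i - run_start >= min_len:
                return (run_start, i)
            run_start = None
    if run_start is not None and len(buf) - run_start >= min_len:
        return (run_start, len(buf))
    return None
```

      The function only locates a candidate region; deciding whether it really is decoder source, and compiling it, is left to whoever is doing the recovery.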

    4. Re:Recoverability depends on seekability by Leo+McGarry · · Score: 1

      That's seven bytes. You said we have a 100 GB fragment of extant data.

      Let me teach you about something wonderful. It's called the "ellipsis." It means "and so on, in that fashion." It means I don't have to actually type a hundred billion two-letter combinations to illustrate that I'm talking about a hundred gigabytes.

      Under the scheme I've been explaining, which semi-periodically embeds a copy of the decoder's source code, you seek forward until you find a long string of bytes with bit 7 clear, which is the signature of US-ASCII text.

      Ooops. Didn't find it. That part of the data was lost. It'd be nice if we were still able to recover something. Unfortunately, because Random Slashdot Dumbass #3 was responsible for the archive, and he chose some dumbass format that nobody remembers, all we have is this random list of coefficients with no clue whatsoever as to the algorithm that was used to encode them.

      You do realize that "encode" also means "to obscure," right?

      Or do you assume that C and /* English */ will be dead languages by the time you want to recover this data?

      Of course we assume that, you dumbass. That's why it's called an archive. It's meant to last, for all practical purposes, forever. Middle English was in widespread use a mere thousand years ago, but today it's a completely dead language. We don't even use the same character set they used. At the rate English is evolving, a thousand years hence this comment I'm writing right now will be about as understood as "Syððan wæs geworden þæt he ferde þurh þa ceastre and þæt castel" is to you today. The only people who can understand Middle English are scholars. If you found a shoebox in your attic filled with letters and postcards written in Middle English, translating them to Modern English would be a massive effort.

      The whole idea of creating an archive is to store information in a way that's as recoverable as possible. Expecting the people who want to access that information to (1) understand your language, (2) understand your programming language, and (3) understand all your baroque encoding algorithms is just fundamentally wrong. At that point it's not an archive. The data will be useful for maybe a few decades, possibly, if you're lucky. Beyond that, you're just recovering the archive to get at the refined metals in the disks.

  46. .doc? by b1t+r0t · · Score: 2, Funny

    For people concerned with the preservation of "data", they've sure picked an interesting format to write about it in.

    --
    "Open source is good." - Steve Jobs
    "Open source is evil." - Microsoft
  47. GIGO and the "born digital" problem by retiarius · · Score: 3, Insightful

    indeed, lossless for archival preservation is the
    only way, as it fits the basic rule of art restoration
    technology -- never apply "improvements" which
    cannot be reversibly undone to take advantage
    of future science.

    ironically then, the lossless format doesn't matter.

    however, at least for the instant case of dance video,
    the likely input (a myriad of digital tape formats)
    is hopelessly neanderthal -- anything having to do with DV,
    or MPEG, or even ATSC HDTV already tosses away much
    color information. (4:1:1, 4:2:0, and 4:2:2 colorspace is embarrassing
    to preserve "losslessly".) ditto for temporal
    info, with interlacing being the culprit. even film at
    24fps just will not cut it for motion such as dance.

    so here's to better camera technology, whether it's
    10- or 12-bit 4:4:4 RGB, or something like
    carver mead's foveon made swift.

  48. Future of Video Preservation. by Anonymous Coward · · Score: 0

    The heck with compression. We have an infinite digital storage in space. Beam it out as tightly focused megawatt microwave beams, and just remember the exact heading and the date.

    Our future FTL capability will ensure that we can recover every last digital bit, as often as we need.

    Downside? Those pesky aliens will not be paying their share of the royalties!

  49. Oh, fuck. by Anonymous Coward · · Score: 0

    I'm so very, very sorry. Please ignore the parent post.

  50. TAPE! USE TAPE! by Ironsides · · Score: 1

    I don't understand why they don't use LTO digital tape. LTO-3 currently holds 400 GB (without on-tape compression) and costs about US$0.30/GB:
    http://www.cdw.com/shop/search/results.aspx?grp=TM L
    It is very reliable and will last for a very long time. It is great for archiving and is what TV stations use. Also, if they are serious about archiving, why are they not considering higher bit rates? If they are going to do this, they should be considering 50 Mbit rather than 20 Mbit. Sure, it takes up a lot of space, but that is why you use LTO. Also, roughly every two years a new generation comes out with double the capacity of the previous one, and the new generations can read the older tapes.
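
    For a rough sense of scale, here is a back-of-the-envelope sketch of how much video fits on one 400 GB tape at 20 and 50 Mbit. It assumes decimal gigabytes, a constant video bit rate in Mbit per second, and no audio or container overhead; the function names are made up for the example.

```python
def hours_per_tape(capacity_gb: float, bitrate_mbit: float) -> float:
    """Hours of video that fit on a tape at a constant bit rate (Mbit/s)."""
    bits = capacity_gb * 1e9 * 8
    seconds = bits / (bitrate_mbit * 1e6)
    return seconds / 3600


def cost_per_hour(price_per_gb: float, bitrate_mbit: float) -> float:
    """Media cost of one hour of video at the given price per gigabyte."""
    gb_per_hour = bitrate_mbit * 1e6 * 3600 / 8 / 1e9
    return price_per_gb * gb_per_hour


if __name__ == "__main__":
    for rate in (20, 50):
        print(f"{rate} Mbit: {hours_per_tape(400, rate):.1f} h per tape, "
              f"${cost_per_hour(0.30, rate):.2f} per hour")
```

    Under those assumptions a tape holds roughly 44 hours at 20 Mbit and roughly 18 hours at 50 Mbit.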

    --
    Fly me to the moon Let me sing among those stars Let me see what spring is like On jupiter and mars
  51. Automated translation of programming languages by tepples · · Score: 1

    Ooops. Didn't find it. That part of the data was lost.

    In the 100 GB extant fragment you mention, there should have been at least a hundred copies.

    You do realize that "encode" also means "to obscure," right?

    The "EFM" part of the physical layer of CD or DVD storage and the "RLL" part of the physical layer of hard disk storage are also encodings or, in your definition, obscurings. You have a shiny platter and a drive motor that hasn't worked for centuries; how would you retrieve even a byte stream from the platter?

    If you found a shoebox in your attic filled with letters and postcards written in Middle English, translating them to Modern English would be a massive effort.

    By the Church-Turing thesis, every language of computation can be perfectly interpreted within another language of computation. You mention that only scholars can translate Old English to Modern English and Classical Latin to Modern Italian, but the difference between spoken languages such as Middle English and mathematical languages such as C is that a small team of scholars can create a perfect automated translator for a programming language.

    Expecting the people who want to access that information to (1) understand your language, (2) understand your programming language, and (3) understand all your baroque encoding algorithms is just fundamentally wrong.

    People who have the data and want to convert it to the format du jour only have to have access to a program that performs (2). Programming language archaeologists can create this translator program once and be done with it.

    And who's to say that automatic computers will even exist in the distant future? What if thinking machines are banned between now and then? Then how would you pull byte arrays off a platter and turn them into visually perceptible motion?

    1. Re:Automated translation of programming languages by Leo+McGarry · · Score: 1

      In the 100 GB extant fragment you mention, there should have been at least a hundred copies.

      What, stuck in the middle of a piece of video? No.

      You have a shiny platter and a drive motor that hasn't worked for centuries; how would you retrieve even a byte stream from the platter?

      Well, there you go. That's why electronic storage is a controversial topic for archivists. Its use is a compromise.

      Also, we simply don't know how to store video and audio for centuries. Text is easy: cf. the Dead Sea Scrolls. If you want to go back farther than a few thousand years, etch words and simple pictures into stone or, better, metal. But for video and audio, we just don't have a good solution. There's a very real possibility that these vast volumes of information will simply be lost to history.

      a small team of scholars can create a perfect automated translator for a programming language

      Sure. And a small team of scholars can translate Middle English to Modern English, too. But every aspect of the story that involves the use of the phrase "a small team of scholars" reduces the likelihood that the information will ever be recovered.

      And who's to say that automatic computers will even exist in the distant future?

      These are, in fact, exactly the kinds of questions archivists have to deal with.

  52. You only have to write the compiler once by tepples · · Score: 1

    What, stuck in the middle of a piece of video? No.

    Why not interleave the video with copies of the decoder? Haven't you seen the movie Contact, where the rules for interpreting the alien diagrams were placed around the edges of the blueprints?

    That's why electronic storage is a controversial topic for archivists. Its use is a compromise.

    Everything is a compromise. It depends on which scale of time you're trying to archive for. I see HDD as useful for archiving video over the course of a century or two. Beyond that, it has become increasingly likely that Armageddon could destroy everything.

    But every aspect of the story that involves the use of the phrase "a small team of scholars" reduces the likelihood that the information will ever be recovered.

    The difference is that a small team of scholars would only have to write a C compiler once in a modern programming language, and then any program written in ancient C would become executable.

    1. Re:You only have to write the compiler once by Leo+McGarry · · Score: 1

      Why not interleave the video with copies of the decoder?

      Because then it's even harder to tell what's content and what's not!

      Haven't you seen the movie Contact

      I'm going to pretend your source of insight for this conversation isn't a bad science fiction movie.

      I see HDD as useful for archiving video over the course of a century or two.

      The hardware? Yes, certainly. Stored in a dry place free from massive temperature fluctuations, a hard drive will last indefinitely.

      But it's the data that matters, not the medium itself.

      The difference is that a small team of scholars would only have to write a C compiler once in a modern programming language, and then any program written in ancient C would become executable.

      Sigh. I love the way you take subjects on which entire books, whole libraries full of books have been written and dismiss them with a wave of the hand.

      If you don't know what the fuck you're talking about, why don't you stop talking?

  53. Observations with MXF by heroine · · Score: 1

    My personal observations are that MXF is primarily used as a container for DV, the JPEG 2000 codecs are too slow to do much good, and MXF is hardly in use anywhere except trade shows.

  54. Framing and SF-inspired tech by tepples · · Score: 1

    [With interleaved decoder and coded content] it's even harder to tell what's content and what's not!

    As I've said, source code in any programming language that uses US-ASCII encoding will have bit 7 clear throughout. This is a strong correlation in bit 7, a method of framing which I think any future scholar would have little or no trouble discovering.

    I'm going to pretend your source of insight for this conversation isn't a bad science fiction movie.

    I'm going to pretend that a lot of engineers working on breakthrough technology didn't receive inspiration from speculative fiction. Without the Dick Tracy comics, for instance, do you think anybody would have bothered trying to squeeze a PDA into a wristwatch form factor?

    If you don't know what the fuck you're talking about, why don't you stop talking?

    Then please explain why you feel that I don't know what the intercourse I'm talking about. I am willing to learn.

  55. Less obvious once you factor storage and handling by leonbrooks · · Score: 1

    50 x long-life DVDs @AUD$1 each including cover: $50
    Labour @$10 an hour to feed these to a burner: $30
    Controlled storage for 50 DVDs for 20 years: $200?
    Drive to read and copy the suckers after 20 years: $X?
    TOTAL: $280+X

    vs

    1 x 200GB IDE HDD @AUD$160: $160
    1 x tray @AUD$10: $10
    Labour @$10 an hour to plug it in and walk away: $1
    Controlled storage for 20 years: $50?
    Functioning IDE bus to read and copy the suckers after 20 years: $Z?
    TOTAL: $221+Z

    If you mail it across Australia to the storage, the hard disk just won by a bit more. It fits in a 750g satchel for about $5, the DVDs won't fit in a 3kg satchel so they'll get "cubed" and probably be about $20-$25.

    --
    Got time? Spend some of it coding or testing
  56. RAID 4 / RAID 5 by Craig+Ringer · · Score: 1

    Thanks. I noted in my post that I was oversimplifying the description of RAID 5 as if the error correction data was all on one disk. I realise that in reality it's striped across all disks. In terms of the data recovery logic that doesn't matter, since for any one set of sectors across all disks there is still an error correction set.
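
    The recovery logic for one stripe can be sketched as follows, assuming plain XOR parity (one parity block per stripe). The block contents and function names are invented for the illustration, and the sketch ignores how parity is rotated across disks.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR a list of equal-length byte blocks together, byte by byte."""
    if len({len(b) for b in blocks}) != 1:
        raise ValueError("all blocks must be the same non-zero count and length")
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def rebuild_missing(surviving_blocks):
    """Rebuild the one missing block of a stripe from all the others.

    Because parity is the XOR of the data blocks, XOR-ing every surviving
    block (data and parity alike) yields the block that was lost.
    """
    return xor_blocks(surviving_blocks)


if __name__ == "__main__":
    data = [b"ABCD", b"1234", b"wxyz"]      # one stripe across three data disks
    parity = xor_blocks(data)               # stored on a fourth disk
    # Pretend the second disk died: rebuild it from the rest plus parity.
    recovered = rebuild_missing([data[0], data[2], parity])
    print(recovered)
    assert recovered == data[1]
```

    Note that this only rebuilds a block whose position is already known to be missing; on its own, a single parity block cannot say which disk returned bad data.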

    I was, however, unaware that RAID 5 used ECC rather than simple parity checks. How does it do this when only a single disk is dedicated to error detection? I would expect ECC data to take up more space than that (in fact, twice the space). If you have details I'd be interested to find out.