Domain: digitalpreservation.gov
Stories and comments across the archive that link to digitalpreservation.gov.
Comments · 24
-
Re:Yes: Thunderbird archive
Agreed, Thunderbird works well for an archive. There are just two gotchas I've encountered.
1. The MBOX format gloms all your mail into one continuous text file. It does not have a special string to denote the beginning of a new mail message. It uses "From " (F r o m + a space) to figure out where the beginning of a mail message is. Consequently, if an email has a line in the body where someone has actually typed "From " as the beginning of a sentence, Thunderbird can mistake that as the beginning of a new email (there are a couple other checks it does - read the link if you want the details).
2. If you used Thunderbird as an actual email client in the past, getting it to stop trying to log in to check for new emails can be problematic. My Thunderbird setup was extensively customized with mail sorted into different folders by subject. I don't want to lose that sorting, so I can't simply dump it into a new Thunderbird install (at least not without a lot of work setting it up again). So I just put up with the program occasionally hanging for 10-15 seconds while it tries to connect to a defunct mail server to download new mail. This may have been fixed - I've only had to look up 3 or 4 archived emails in the past 5 years, so I haven't bothered upgrading Thunderbird in a long time. -
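To illustrate the first gotcha in the Thunderbird comment above, here is a minimal Python sketch of how a reader that splits only on lines beginning with "From " can miscount messages. It is not Thunderbird's parser (the comment notes that Thunderbird does a couple of other checks). The sample messages, addresses, and helper names are all hypothetical, and the ">From " quoting shown is the "mboxrd" convention, assumed here as one common workaround.

```python
import re

# Hypothetical sample data: (separator line, headers, body).
# The first body contains a line that starts with "From ".
MESSAGES = [
    ("From alice@example.com Mon Jan  1 10:00:00 2024",
     "Subject: first",
     "Hello,\nFrom now on, please use the new address."),
    ("From bob@example.com Tue Jan  2 11:00:00 2024",
     "Subject: second",
     "Bye."),
]


def quote_body_from_lines(body):
    """mboxrd-style quoting: prefix body lines matching ^>*From  with '>'."""
    return re.sub(r"^(>*From )", r">\1", body, flags=re.MULTILINE)


def build_mbox(messages, quote):
    """Concatenate messages into one mbox-like text blob."""
    parts = []
    for from_line, headers, body in messages:
        if quote:
            body = quote_body_from_lines(body)
        parts.append(f"{from_line}\n{headers}\n\n{body}\n\n")
    return "".join(parts)


def naive_split(text):
    """Start a new message at every line that begins with 'From '."""
    found = []
    for line in text.splitlines(keepends=True):
        if line.startswith("From ") or not found:
            found.append("")
        found[-1] += line
    return found


if __name__ == "__main__":
    print(len(naive_split(build_mbox(MESSAGES, quote=False))))
    print(len(naive_split(build_mbox(MESSAGES, quote=True))))
```

With the unquoted file the naive splitter reports one message too many, because "From now on, ..." looks like a separator. With quoting applied when the file is written, the count matches the number of messages stored.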
Re:Over what time interval?
-
Re:And in open formats?
OpenDocument
Haha, good joke!
-
Re:Mixing up 'open' and 'standard'
The other side of that coin is that, if Google decided to include H.264 support in Chrome, other browser makers, without or with little money, could have been unable to pay the royalties needed to be paid to patent owners to support it in their browsers. Does Google saving itself $6.5 million annually pale in comparison to Google saving itself and everyone else up to $6.5 million annually?
The Web does not need H.264 to function, so if everyone drops support for it preemptively, then even the small browser makers (I'm thinking of Flock here, which is based on Chromium but has a user base of its own) won't need to pay royalties to anyone.
This is like the GIF patents by Unisys* driving adoption of the PNG format, except that the patents are being assessed now because we know of a few formats that are** suitable for the job but unencumbered already. For instance, WebM can, like H.264, be implemented by anyone, but in addition to that, WebM intentionally doesn't require that royalties be paid. It can become a standard, like PNG originally wasn't a standard but then became an ISO/IEC standard 8 years later.
________________
* = As well as other factors such as the lack of true-color support, alpha and ICC in a lossless format, but that's off-topic.
** = May not be for now, due to lack of hardware acceleration, but could become in the future. Some people don't really care about power usage or CPU usage, so it's already suitable for them. -
Re:Firefox, eh?
I thought at first that the mp3 standard was already "free":
Open standard. Developed by the Motion Pictures Expert Group (MPEG), Coding of audio, picture, multimedia and hypermedia information.
However, mp3 is not free...yet. Some of these patents are set to expire on their 20 year time frame in a couple of years it would seem.
-
Re:Can't say no to H.264 without reliable alternat
In a previous blog post, you explain that there is no such thing as a patent-free video codec. The reason being that the existence of prior-art is not sufficient to prevent a patent from being granted.
This implies that even video (or image-based) codecs in existence for nearly 20 years will still be patent-encumbered when any original patents expire in a few years.
While it may be prudent for a large player like Google to vet their codecs against the MPEG-LA license pool, the real problem is that software patents are unworkable.
-
Digital Stewardship : PDF vs PDF/A
PDF/A is already open. However, that doesn't mean that anyone knows how to produce it, especially some R.O.A.D. staffer or random hourly GS1.
Open or not, PDF/A is a display format and, in most cases, useless for information retrieval or automated data processing. PDF/A is a useful alternative to paper. However, the open government initiative is not talking about paper. It's about 'born digital', machine readable data.
-
Re:NO
Digital cable is actually pretty open... most cable boxes are MPEG-2 based, just like DVD. That is also the preferred format of the government for digital archiving. http://www.digitalpreservation.gov/formats/content/video_preferences.shtml That said, the companies do all sorts of funky stuff to mess with the MPEG-2 standard, but that is the cable company's fault. My problem with Flash isn't about it being more open (though that would be nice), it is that if I have anything Flash open on my computer it eats up memory and runs the heat through the roof. I don't know what is messed up in their code, but it can be sitting idle in the background and it will eventually bring my computer to a crawl. I've tried on a Dell desktop, Acer laptops (one XP, one Vista), and on both a PowerBook and a MacBook, and the results are the same: open a Flash movie, animation, etc., minimize it, forget about it, then realize that the computer starts to get REALLY slow after a few hours and the fan runs full blast. Close Flash, fan stops, computer returns to normal operation.
-
Re:Why make the leap in the first place?
Silverlight supports what users ask it to support.
Oh, and a link to one of the formats it supports
pwnd
-
Re:What a flood of garbage
Things nobody has mentioned:
- The relevant search term is "digital preservation". The Library of Congress has an active project.
- One fundamental problem is what the underlying storage medium should be. Microfiche is a well-established choice, and lasts at least 20 years.
- Once you can store a bit-stream, the other fundamental problem is what format to use. You could use a lowest-common-denominator format, and include directions on how to decode it. A better option is to use whatever format the digital preservation community is standardizing around, since they are likely to maintain open-source decoders for those in the future. For images, uncompressed TIFF and JPEG 2000 are common choices.
-
NARA and Library of Congress
Forget Wikipedia, ask the people who spend their lives trying to figure this out.
http://www.digitalpreservation.gov/you/digitalmemories.html
http://www.archives.gov/preservation/technical/guidelines.html
http://www.archives.gov/preservation/family-archives/digitizing-photos.html
-
Re:Hi, I'm your polar opposite.
It's a safe bet that those paper books will last far longer than any hard drive that you store files on
That's probably true. What's also true is that there's a whole discipline working on fixing that.
Those of us working in national copyright libraries are participating in the development of tools, mechanisms and practices to Do This.
http://www.digitalpreservation.gov/
http://www.dpconline.org/
http://www.planets-project.eu/
etc.
Even if we get perfect archival media (at least as good as vellum!) you've still got the problem that the bitstream might not mean jack to future generations.
ps - If anybody out there has an interest in the field, there's a serious lack of programmers/developers and it's a very lucrative niche, which doesn't take a huge effort to learn [google "digital repositories" "Fedora" (not the RH one) "DSpace" OAI-PMH METS...] -
left hand good, right hand evil
What's really stupid about this is that the Library of Congress, or at least a component within it, is seen to be a champion for open formats: http://www.digitalpreservation.gov/formats/intro/intro.shtml/ (that's a Library of Congress site). They've got a $3 million deal, but at the cost of a lot of credibility in the archival community.
-
Re:Meaning of words
I have a large variety of MP3 files to better understand the file format for possible future creation of my own codec... Does that work for ya?
Not when you can accomplish the same thing without violating copyrights.
http://en.wikipedia.org/wiki/Wikipedia:Sound/list
http://www.digitalpreservation.gov/formats/fdd/fdd000012.shtml
http://www.id3.org/mp3Frame
http://www.dv.co.yu/mpgscript/mpeghdr.htm -
Re:Fascinating position...
http://www.digitalpreservation.gov/formats/fdd/fdd000035.shtml was my source, but it could be a bad source. I can't conveniently look at the actual ISO pdf, so I can't verify, but I'd take someone's look at implementing code over some web page.
If they are different and MPEG-1 is better than H.261, then Nokia's suggestion is even worse than saying MPEG-1. -
Re:What we really want...
Advanced Audio Coding is MPEG-2 part 7, with enhancements in MPEG-4 part 3. It's not a replacement for MP3 (MPEG-2 part 3), it's an alternative, which has existed for just as long as MP3 has.
Close.
Except that MP3 was originally MPEG-1 part 3. And from page 1 of ISO/IEC 13818-7 (warning: PDF file) (page 7 of the PDF):
This International Standard describes the MPEG-2 audio non-backwards compatible standard called MPEG-2 Advanced Audio Coding, AAC [1], a higher quality multichannel standard than achievable while requiring MPEG-1 backwards compatibility.
Thus, what the GP said, ("AAC [wikipedia.org] isn't Apple's codec. It's the MPEG group's replacement for MP3."), is pretty much correct. -
Re:It is TIFF hijacked
Yes, because Wikipedia is always 100% accurate. Try here. BTW, the AC was right: Microsoft worked with Aldus on the development of TIFF.
-
Re:BS Case
And that is exactly why this is a monopoly. You see, Microsoft does allow others to play their file types; the specifications for the file formats are fairly open. You see, WMA is a subtype of ASF, documentation for which you can download here. And this is obviously a logical move, as Microsoft does not develop hardware. And it sure as hell wouldn't fend off iPods playing their file format as long as Apple would pay to use it. But Apple does not allow (as far as I've managed to read in the posts) implementing and playing of their formats by their competitors. What I'm trying to say is that, AFAIK, Apple will not allow Creative to develop a direct iPod competitor (because it won't allow them to play the music they purchase on iTunes) -- which is a monopoly.
-
Re:A joke, I know, but...
Research before you talk. The Quicktime file format is fully documented, and Apple's licensing is quite open. According to Digital Preservation, "Licensing by Apple appears to be limited to the software and other technology elements." The Wikipedia entry on Quicktime claims that "the QuickTime file format itself [is] openly documented and available for anyone to use royalty-free."
If you want to be very sure, you could always ask Apple directly, via their Quicktime Software Licensing page (which is related more to bundling actual Quicktime software with products, and using the Quicktime and Apple logo). Their email address is sw.license@apple.com.
That said, here's the actual Apple documentation for the Quicktime File Format, from the developer site. I think this is what you'd want; in its introduction it reads "if you are developing a non-QuickTime application that imports QuickTime files or works with QuickTime VR, you need to understand the material in this book."
So basically, it's nothing like the situation with ASF or WMV at all. Apple has lots of reasons to want people to implement the Quicktime file format -- in digital cameras, third-party software, wherever. A version of it is used in the ISO spec for MPEG-4 video, as well. The more people use it, the more interoperable Macs become; to encourage that, the spec is open. Obviously there are licensing issues on the codecs themselves, but in terms of the container format there don't seem to be any deal-breaking restrictions. It's only if you wanted to use Apple code to play the content of the containers/streams, or use any of their logos that there'd be a problem. -
Re:A couple of other interesting points..
Microsoft didn't make quicken. Microsoft DID make AVI.
- Microsoft made the AVI format, but they didn't make DivX/MP3, one of the multitude of third-party codec combinations you can use when building an AVI, nor must that file have been constructed using a Microsoft tool.
- Microsoft created the DLL format, but they didn't make Quicken, or any of the multitude of third-party modules created using the DLL format, nor must that file have been constructed using a Microsoft tool.
(By the way, AVI, the RIFF format from which it descends, DLL, and many other file formats were developed jointly by Microsoft and IBM for Windows and OS/2, respectively.) -
Re:the more I think about it...
I'd much rather see those hundreds of millions of dollars invested in, for instance, making all out of print recordings and books available on-line. It's a smaller problem (sounds like), but would benefit the world much more than online copies of every government employee's timecard records
They already invest hundreds of millions of dollars in that. It's called the Library Of Congress.
http://www.loc.gov/
http://www.digitalpreservation.gov/ -
Re:Elaborate
Reverse engineering?