Slashdot Mirror


Lockheed Chosen For Electronic Records Archives

TrentL writes "How will we be able to read 1990's email messages in the year 2090? Will GIF files still be accessible in 2105? The US National Archives - tasked with preserving records "for the life of the republic" - has chosen Lockheed Martin to solve exactly this problem. Lockheed was awarded the $308M Electronic Records Archives contract after a year-long design competition. Full Disclosure: I worked on Lockheed's demo team."

16 of 282 comments

  1. Why not? by Poromenos1 · · Score: 2, Interesting

    Analog media couldn't be restored because the machines that read it broke (couldn't they make new ones?) but as long as the specs exist, I don't see why they won't be able to read the digital data (assuming we still use binary in the future).

    --
    Send email from the afterlife! Write your e-will at Dead Man's Switch.
    1. Re:Why not? by TrentL · · Score: 3, Interesting

      All these people whinging about how CDs won't last - I'm pretty confident that if I bother to hold on to the CD-ROMs in my drawer, provided they're kept in their cases/good condition they'll be just as playable (on the same hardware) in 100 years. Frankly I hope (probably all) of the stuff in my e-mail isn't around in 100 years.

      The amount of data we are talking about is HUGE. There is no way humans could manually upgrade the data. It would be a technical and policy nightmare. As for preserving emails, the email messages of the executive branch contain much historical significance.

  2. Chick and Egg problem by Manip · · Score: 4, Interesting

    This has a fundamental chicken and egg problem: So you store the information, you also need to store the format of that information. So then how do you read "format of the information" document? What format is *that* in?

    You see; whatever format you used for anything has to be documented and you can't use paper because it won't last as long ... Do you carve it into stone?

    Worse still, you need some computer science grads to write up the format exactly, down to how long a char is and the bit/byte order. It is an extremely difficult task even if you don't take into consideration finding a storage medium that will last that long. :-(
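
    To make the byte-order point concrete, here is a minimal sketch (the value and variable names are made up for illustration) of how the same four bytes turn into a different number if the format document fails to say which end comes first:

```python
import struct

value = 0x12345678

# The same 32-bit number written with two different byte orders.
big = struct.pack(">I", value)      # most significant byte first
little = struct.pack("<I", value)   # least significant byte first

# A future reader who guesses the wrong byte order gets a different number,
# and nothing in the bytes themselves says that anything went wrong.
misread = struct.unpack("<I", big)[0]

print(big.hex(), little.hex(), hex(misread))
```

    The bytes are intact in both cases; only the written-down convention tells you which reading is the right one.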

  3. IDE Raid.. by markass530 · · Score: 3, Interesting

    Not sure where I read it, but there was an article about using good old cheap IDE RAID as a tape replacement. Some guy did it on a large scale for a university, and at a (relatively) low cost. Considering the low cost per GB and easy scalability, why not?

  4. from the US government? by zogger · · Score: 2, Interesting
    good luck! What's theirs is theirs! What's yours is theirs! from the PR:



    "The system's "initial operating capability" should be available during Fiscal Year 2007. Weinstein noted that "the system's architecture makes it flexible enough to accommodate evolving policy change," including the importance of "providing public access while protecting privacy and sensitive information.""..HAHAHAHAHA! Anything even *remotely* important or interesting, paid for by tax payers or not, sorry, "terrorism, security", yada yada yada.

  5. Momentous Task Indeed by Nerd+Systems · · Score: 2, Interesting
    Lockheed Martin is going to have fun with this one... preserving records for that length of time will be a considerable task... and hopefully they will figure out a way that will successfully archive records forever...

    Just look back at how much technology has changed in the past 10 years. 5.25" floppy drives were still in use back then, along with 3.5" floppies, and CD burners were just starting to become available at the speedy rates of 1-2x, not to mention hard drives were so small compared to the 500GB drives we have today... and Windows 95 was just released, wonderful system based on FAT architecture... not NTFS like we have today...

    Computer technology is increasing at such a rapid rate these days. I can only imagine how it will be in 10 years, much less 100 years from now. I am sure by then that clock speed will be in hundreds of gigahertz, memory in the terabytes, and storage in the petabyte range... if not even higher... who knows...

    I also wonder whether, in 2090, a CD-ROM equivalent will even exist to read this storage library. They may have long ago abandoned CD-ROMs for being too slow, and if data is stored in this format, how will it be read? Also, as hard drives get larger and larger, I am sure the IDE, SCSI, and SATA drives of today will not be readable by the BIOS of tomorrow... much less have connectors to fit...

    This is a huge undertaking... good luck Lockheed Martin...

    --
    Need a Nerd?
    Nerd Systems
  6. Software is easy. Re:Chick and Egg problem by hypnagogue · · Score: 3, Interesting

    This is not nearly as difficult as you make it seem: implement the parser in a standardized language. The formal specification of the standardized language can then be included with the source of the parser.

    Getting code to run on later architectures is not usually very difficult. I am fairly comfortable with the proposition of porting any code to any future architecture -- the "emulator scene" testifies to the viability of this strategy. The biggest problem to be solved is reading storage media for which no hardware exists.

    For example, how do I get to my college research stored on AmigaDos floppies? Tragically, the easiest solution is to try to get my Amiga running again, and then move the data over a serial cable with kermit. I'm awfully glad I have kermit on that computer, because I don't think I'd be able to find any 2400 baud Amiga BBSes around to download it.

    --
    Liberty you never use is liberty you lose.
  7. Google? by dustinbarbour · · Score: 4, Interesting

    Did Google compete for this contract? They're the ones with the largest infrastructure for such a project and the brains to give us a really slick interface to it all. Not to mention that they could probably have faster response times than archive.org which totally fuckin' blows.

  8. Re:GIF? by sleighb0y · · Score: 2, Interesting

    Correct me if I'm wrong, but the GIF patent boat has sunk. Save IBM's never-enforced LZW patent, GIFs are free.

    http://en.wikipedia.org/wiki/GIF#Unisys_and_LZW_patent_enforcement

  9. Standards are good by Anonymous Coward · · Score: 1, Interesting

    Now let's get the Federal, State, and local governments to create standards-based web sites as policy. (Remember FEMA and the US Copyright Office.)

    http://narnia.dnsalias.org/freegovernment/

    Next, let's go for standards for electronic office documents (word processor, spreadsheet). I doubt Lockheed Martin's system here is going to understand every obscure format---say Microsoft Word 5?

  10. Re:GIF? by bloo9298 · · Score: 2, Interesting
    The corporate Disney that we know today should not diminish the work of one of the 20th century's greatest imaginative minds.

    I agree, Walt was much more evil than corporate Disney. Credit where it's due.

  11. Lockhead - Martin data entry... by zenneth · · Score: 3, Interesting

    I worked with them for a while, as a data entry person back in the early 90's. Basically, we were responsible for keying in a parcel's 5-9 digit Zip code after it had been scanned into the system. By scanned, I mean the front of the package or envelope showing the send-to and return addresses was presented on a monochrome display, which allowed the person operating the terminal to enter the zip codes for the parcels. Then you'd hit a key and move to the next one, and so on and so on.

    The bizarre thing is that I found out a few of the individuals would "pad" their PPM (Parcels Per Minute) by typing in zip codes they were familiar with instead of reading what was on the display, just to enter a dozen or so really quickly. It didn't happen often, but it helped them keep up the pace and "clear" the system queue more quickly, thus gaining them and their workmates an early break. However, I've no idea what damage may have occurred by their lax attitudes, and I really don't want to know now.

    Which brings me to my point (I think): how can we be certain the data they're entering is one-hundred percent accurate, regardless of whether the medium lasts a century?

    --
    The Chronic *WHAT* les of Narnia!
  12. Chinese whispers by oliverthered · · Score: 2, Interesting

    I have code on a modern HDD that I typed into a BBC computer 15+ years ago from a magazine.

    I took it off of tape, via the BBC and a serial lead. I have all my chickens and all my eggs. So long as you move to a newer form of media before the old one perishes, you're going to be OK.
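
    One common way to check that each hop to new media copied the bits faithfully is to record a digest alongside the data and compare it after every move. This is a sketch of that general technique, not anything the ERA design is known to do; the function names and sample data are hypothetical:

```python
import hashlib

def sha256_of(data: bytes) -> str:
    """Hex digest used as the recorded fingerprint of a file."""
    return hashlib.sha256(data).hexdigest()

def migrate(data: bytes, recorded_digest: str) -> bytes:
    """Copy data to the 'new medium', refusing if the source is already damaged."""
    if sha256_of(data) != recorded_digest:
        raise ValueError("source no longer matches its recorded digest")
    copy = bytes(data)  # stands in for the real transfer (tape, serial lead, ...)
    if sha256_of(copy) != recorded_digest:
        raise ValueError("copy does not match the recorded digest")
    return copy

original = b'10 PRINT "HELLO"\n20 GOTO 10\n'
digest = sha256_of(original)          # written down when the file is first archived
copy = migrate(original, digest)      # succeeds: the bits are unchanged

damaged = original.replace(b"GOTO", b"GOT0")
try:
    migrate(damaged, digest)
    detected = False
except ValueError:
    detected = True                   # a single changed character is caught
```

    A digest only proves the bits survived the move. It says nothing about whether they were right in the first place or whether a later format conversion interpreted them correctly.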

    I think it's a Chinese whisper problem, not a chicken and egg problem: what happens when inaccuracies are introduced?

    e.g. someone writes a file in an odd charset, and nobody notices that the charset is different from ASCII when they convert the file into Unicode. In 20 years' time, will someone notice that the file has been converted badly, or will they think it's corruption? What happens when there are lots of tiny conversion errors like this?
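
    A small sketch of that failure mode, with Latin-1 and code page 437 picked purely as example charsets (the sample text is made up): the conversion "succeeds", the plain ASCII letters survive, and only the unusual characters quietly change.

```python
text = "caf\u00e9 \u00a35"          # "cafe" with an accented e, then a pound sign and 5
stored = text.encode("latin-1")     # the odd charset the file was really written in

# Strict ASCII decoding at least fails loudly on the non-ASCII bytes.
try:
    stored.decode("ascii")
    ascii_ok = True
except UnicodeDecodeError:
    ascii_ok = False

right = stored.decode("latin-1")    # converter that knows the real charset
wrong = stored.decode("cp437")      # converter that guesses the wrong one

# Both results can be saved as perfectly valid Unicode files.
converted_right = right.encode("utf-8")
converted_wrong = wrong.encode("utf-8")

print(ascii_ok, right == text, wrong == text)
```

    The wrongly converted file is valid UTF-8 and the same length in characters, so nothing downstream has any reason to flag it.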

    --
    thank God the internet isn't a human right.
  13. I can fix this. by sbaker · · Score: 2, Interesting

    Hmmm - I'd better email myself the GIF spec - maybe along with some source code to read it with - and a C compiler to compile the source with. Ah WTF - I'll just email myself the Linux sources. ...but seriously...there won't be any problem with reading GIF if anyone actually wants to - the file format is documented all over the place and in 100 years, if there are still GIF files on some kind of readable media - then the odds are very good that those documents will be easy to find. Programming a GIF reader (or a reader for almost any documented file format) is easy - presuming you are sufficiently motivated. A historian who is interested in 100 year old documents shouldn't have any problems getting them converted to whatever format is needed.
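
    As a rough illustration of how little it takes to get started from the published layout, here is a sketch that reads only the GIF signature and logical screen descriptor (no LZW image decoding). The function name and the hand-built sample bytes are made up for the example:

```python
import struct

def read_gif_header(data: bytes) -> dict:
    """Parse the 6-byte signature and 7-byte logical screen descriptor."""
    if len(data) < 13:
        raise ValueError("too short to be a GIF")
    signature = data[:6]
    if signature not in (b"GIF87a", b"GIF89a"):
        raise ValueError("not a GIF file")
    # Width and height are 16-bit little-endian, followed by three single bytes.
    width, height, packed, background, aspect = struct.unpack("<HHBBB", data[6:13])
    has_table = bool(packed & 0x80)                       # global color table flag
    table_size = 2 ** ((packed & 0x07) + 1) if has_table else 0
    return {
        "version": signature[3:].decode("ascii"),
        "width": width,
        "height": height,
        "global_color_table_entries": table_size,
        "background_index": background,
    }

# Hand-built header for a 320x200 image with a 256-entry global color table.
sample = b"GIF89a" + struct.pack("<HHBBB", 320, 200, 0x80 | 0x07, 0, 0)
info = read_gif_header(sample)
print(info)
```

    A full reader still has to walk the blocks and decode the LZW data, but each of those steps is documented in the same way.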

    The HUGE concern is the undocumented, encrypted or (worse) DRM'ed files. Reading those in 100 years may be exceedingly difficult.

    We can read documents written in hieroglyphs around 2000 years ago. The only problem is with languages for which no translations *ever* existed.

    Survival and longevity of antique media are a much bigger problem.

    --
    www.sjbaker.org
  14. An Arms Dealer to Guard the Memory Hole! by monk · · Score: 2, Interesting

    The articles were light (to the point of vacuum) on details about the approach proposed by the company.

    From the article: "The system's architecture makes it flexible enough to accommodate evolving policy change," including the importance of "providing public access while protecting privacy and sensitive information." From the sound of that I'm betting it's some wonky and ridiculous XML format infected with a sadly pathetic little DRM imp.
    The fact is that I can read anything if I have a copy of the software that originally viewed/created it and the machine (or an emulation of the machine) on which the software ran. Adding one more format to the mix just means we have to emulate one more machine and keep track of one more piece of software and all the doubtlessly expensive effort which will be spent in conversion is wasted.

    It's great to see the National Archives working on this but I would rather see the tax money farmed out in challenge grants to organizations like the Long Now that have a chance in Hell of delivering something useful than pouring money into yet another defense company to ensure that whatever technology we use to store records can be properly sanitized and locked away according to the whims of government and "changing policy."
    The biggest issue facing us right now is that most of the music, words and images created by our civilization are illegal to preserve. Ridiculous copyright extensions have ensured that the huge mass of data for which no rights owner can be found will simply rot instead of being digitized and stored.

    A software emulator can ensure that historic file formats are readable in the future, but Big Media would rather squeeze our history to death before letting go of the rights.

    This is like 1000 fires at the Library in Alexandria. Future generations will curse us for every scrap of information we allow to rot while we squabble.

    --
    [-- Trust the Monkey --]
  15. Re:IFF-ILBM (Interchange File Format) by Anonymous Coward · · Score: 1, Interesting

    That's funny. As an example of OBSCURITY, IFF is a poor pick. It's an expansion of the Macintosh resource concept that, like XML, allows for a hierarchical organization of data, where if you simply can't understand (or use) one of the internal data chunks, you can more or less safely skip over it (since each chunk has a length attached to it).
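
    The skip-what-you-don't-understand idea can be sketched in a few lines. This assumes the usual IFF chunk layout (4-byte ID, 4-byte big-endian length, data padded to an even length); the helper names and the tiny hand-built FORM, including the made-up XYZ1 chunk, are only for illustration:

```python
import struct

def iter_chunks(data: bytes):
    """Yield (chunk_id, body) pairs from a run of IFF-style chunks."""
    offset = 0
    while offset + 8 <= len(data):
        chunk_id, size = struct.unpack(">4sI", data[offset:offset + 8])
        yield chunk_id, data[offset + 8:offset + 8 + size]
        offset += 8 + size + (size & 1)   # odd-sized chunks carry one pad byte

def make_chunk(chunk_id: bytes, body: bytes) -> bytes:
    """Build one chunk (helper for constructing the sample data below)."""
    pad = b"\x00" if len(body) % 2 else b""
    return struct.pack(">4sI", chunk_id, len(body)) + body + pad

# A toy FORM of type ILBM containing one chunk this reader has never heard of.
form_body = (b"ILBM"
             + make_chunk(b"BMHD", b"\x00" * 20)
             + make_chunk(b"XYZ1", b"abc")
             + make_chunk(b"BODY", b"pixels"))
iff_file = make_chunk(b"FORM", form_body)

known = {b"BMHD", b"BODY"}
(form_id, body), = iter_chunks(iff_file)
form_type = body[:4]
inner = list(iter_chunks(body[4:]))
# Unknown chunks are stepped over using nothing but their length field.
understood = [cid for cid, _ in inner if cid in known]
print(form_type, [cid for cid, _ in inner], understood)
```

    The reader never needs to know what XYZ1 means to get safely from BMHD to BODY.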

    And, if you really need to understand it, I can probably spare one of my old Addison-Wesley Amiga guides that details IFF and its various implementations (it was also used for word processors, musical composition and multimedia presentations). ILBM is simply the Interleaved BitMap format.

    The Microsoft WAV file format is, in fact, a slightly perverted form of IFF.

    Now if your Windows apps can't READ IFF, I'd take it up with the vendor.