Slashdot Mirror


Examples of Obsolete File Formats?

reedk writes "I was having a discussion with my boss about long-term archives, and we got on the topic of older files becoming unreadable by newer versions of software. Not only are those old Ami Pro files unreadable by today's common word processors, but I have heard that newer versions of Office can't consistently open documents from very old versions of Office. With the increasing retention periods being forced by current and coming regulations, this could become a compliance problem in the future. We want to pursue this topic, but to build support for it internally, I am looking for examples of older file formats that are no longer readable, either by newer versions of the same software or because of the market death of the product. If true, this would lend a lot of force behind moving to products that have an open file format. Can Slashdot readers come up with examples of this, or ways they have had to get around these kinds of problems?"

159 comments

  1. Print to/create PDF? by aliquis · · Score: 1

    Easy as that. I guess PDF/PS is common enough too stay for long, and it's possible too make all prints become PDFs.

    1. Re:Print to/create PDF? by BoomerSooner · · Score: 1

      text, no formatting. Even my apple //e could save text files. the media is usually a bigger problem than the data document.

      pdf is good but make certain you have an older reader, old os and old machine to run it on.

      you could always encode it with alphabits as well. just glue to 8 1/2 by 11 paper and you're set.

    2. Re:Print to/create PDF? by twistedcubic · · Score: 1

      I think the author of this story is looking for reasons to get his/her boss to store stuff in PDF format.

    3. Re:Print to/create PDF? by Anonymous Coward · · Score: 0

      "too" is like "also." The word you're looking for is "to." Otherwise your sentence reads as:

      Easy as that. I guess PDF/PS is common enough also stay for long, and it's possible also make all prints become PDFs.

    4. Re:Print to/create PDF? by shadowmas · · Score: 1

      i prefer html. any text editor can create it. even in the odd chance of all web browsing software disappearing off the face of the earth you would still be able to extract most, if not all, of the text quite easily. plus it has good enough formatting to be used for pretty much any document.
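To make the "extract the text quite easily" claim concrete, here is a minimal sketch in Python using only the standard library's `html.parser`. The `TextExtractor` class, the `extract_text` helper, and the `BLOCK_TAGS` set are names made up for this illustration, and the list of tags treated as line breaks is an assumption, not a complete one.

```python
from html.parser import HTMLParser

# Tags treated as line breaks; an illustrative subset, not a complete list.
BLOCK_TAGS = {"p", "div", "br", "li", "tr", "h1", "h2", "h3", "h4", "h5", "h6"}


class TextExtractor(HTMLParser):
    """Collect text nodes and drop the markup around them."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        # Start a new line whenever a block-level element opens.
        if tag in BLOCK_TAGS:
            self.chunks.append("\n")

    def handle_data(self, data):
        # Called for every run of text between tags.
        self.chunks.append(data)


def extract_text(html):
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    text = "".join(parser.chunks)
    # Collapse runs of whitespace and drop empty lines.
    lines = (" ".join(line.split()) for line in text.splitlines())
    return "\n".join(line for line in lines if line)


sample = "<html><body><h1>Minutes</h1><p>Budget <b>approved</b>.</p></body></html>"
print(extract_text(sample))
```

Note that this keeps the contents of script and style elements too; a real converter would probably want to skip those.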

  2. Three Magic Words... by BurritoJ · · Score: 1

    Pee....
    Dee....
    Eff....

    1. Re:Three Magic Words... by catfoo · · Score: 1

      thats silly, your silly, PDF is great, tons of things read it, yada yada yada

      --
      no sig today, come back tomorrow
    2. Re:Three Magic Words... by walt-sjc · · Score: 1

      What about EDITING a PDF file? PDF is not designed for editing. So ya, you may be able to READ it, but not CHANGE it.

      Plain text is the MOST standard, but doesn't really handle modern needs (embedded images, tables, etc.)

      HTML may be a little better, because formatting isn't as important as content. Of course if your source is MS Word, the HTML generated is HORRIBLE.

      I think the bottom line is that there really isn't a good format that easily handles complex documents. Theoretically, XML w/ SVG should work, but with certain companies that are based in the northwest US not following standards and using proprietary extensions, this option is limited too.

    3. Re:Three Magic Words... by NanoGator · · Score: 1

      "So ya, you may be able to READ it, but not CHANGE it."

      Given the tax forms I download every year, I consider that a blessing.

      It's also nice that it's easy to convert documents such as manuals over to PDF. This is quite handy when buying second hand stuff. Okay, you can't edit a PDF (though you CAN fill out a form in PDF and save the options you've filled in. Again, PDF is a blessing for tax forms.) but there's still plenty of reason for it to hang around for quite a while.

      --
      "Derp de derp."
    4. Re:Three Magic Words... by Wolfrider · · Score: 1

      Acrobat 7.0 Professional will allow you to edit PDF files. You can also create fillable form fields, checkboxes, etc. in existing PDFs.

      http://www.adobe.com/products/acrobatpro/tryout.html

      --
      .
      == WolfriderV6 == I'm willing to admit that *I just might* be wrong... Are you??
    5. Re:Three Magic Words... by hcdejong · · Score: 1

      Have you ever tried doing this? It's horrible. No word wrap, for instance. It's meant to be used for minor corrections, not for writing whole pages. PDF is an output format, not an editing format.

    6. Re:Three Magic Words... by bhtooefr · · Score: 1

      Can't edit a PDF? WTF?

      http://www.foxitsoftware.com/pdf/pe_intro.php

      'Nuff said.

      (Actually, try their reader out, as well. Basically, thanks to them, my *shit... that's a PDF* reflex is dying.)

    7. Re:Three Magic Words... by Wolfrider · · Score: 1

      Yes, I just completed a project that made PDF forms interactive for use on the Web. Very limited features in some areas (you have to hit Tab to get to the next line when filling in fields) but it paid well.

      I agree it's limited in some ways, but useful for making things look the way you want AND making it interactive. In this case, the client had published a book and wanted to make some of the exhibits in it interactive; the Adobe tools have a 30-day free trial, and let you alter the PDF itself in-situ.

      --
      .
      == WolfriderV6 == I'm willing to admit that *I just might* be wrong... Are you??
    8. Re:Three Magic Words... by Anonymous Coward · · Score: 0

      'Cause, you know, it's so easy to crunch those numbers once they're in PDF format...

    9. Re:Three Magic Words... by pla · · Score: 1

      Pee.... Dee.... Eff....

      PDF itself actually seems like a nice idea, and one which no real competition has yet come along to challenge (even the biggest alternative, PostScript, never really took off as anything but a printer language).

      People just need to use the default choice of fonts, and avoid any features not in 4.0. 5.0 doesn't suck too badly, but moving to 6? Pain! I have better uses of my time than to wait over a full minute (on a reasonably new machine) just to read something comparable in content to a webpage (yeah yeah, talk about printing and typesetting and page layout all you want, but 99% of us only tolerate Acrobat because some moron thought a PDF would look more professional than a webpage - For exactly two documents per year do I care about its page layout capabilities - My federal and state tax forms).


      Now, to nominate my own format(s) - TIFF, PCX, even BMP (as a data format - as a container format, it works as well as anything else). We have PNG and lossless JPEG now, why would anyone still use uncompressed raw data? And of course, GIF - PNG gets better compression (without downsampling the color information), and no one "owns" it in the IP sense (yeah, the UNISYS patent has expired, but it still exists...)

    10. Re:Three Magic Words... by BurritoJ · · Score: 1

      I agree with many of your points, particularly the pain associated w/ V6 of Acrobat Reader. V7 is MUCH faster. The biggest benefit of PDF over the image formats you're so fond of is the fact that PDF can encapsulate text in a searchable and extractable format.

    11. Re:Three Magic Words... by NickFitz · · Score: 1

      I don't think it's a good idea to store business documents as pictures...

      --
      Using HTML in email is like putting sound effects on your phone calls. Just say <strong>no</strong>.
  3. Necessary. by FireFlie · · Score: 3, Informative

    For this same reason I usually suggest that, for very long-term backups (assuming the backups actually survive), people try to save their data in non-proprietary forms. I am not trying to make a closed source vs open source argument; however, if you want to save a large batch of Word documents that you will not need to access in the near future, try to convert them to plaintext where you can. Not foolproof, and not applicable for the majority of situations, but there are a few things that we can assume will not happen in the near future: 1) ASCII will probably not die, so plaintext is often a good idea, 2) many of the more common image formats will probably be supported in one form or another (gif, jpg), you know, stuff like that.

  4. example by Hes+Nikke · · Score: 2, Informative

    AppleWorks had no idea what to do with AppleWorks documents - assuming you can get a mac to read an Apple ][ floppy in the first place...

    For that matter, is there anything that can read VisiCalc files?

    Flame ON!

    --
    Don't call me back. Give me a call back. Bye. So yeah. But bye our, well, but alright we are on a shirt this chill.
    1. Re:example by secolactico · · Score: 1

      AppleWorks had no idea what to do with AppleWorks documents - assuming you can get a mac to read an Apple ][ floppy in the first place...

      Perhaps it wasn't carefully saved after all...

      --
      No sig
    2. Re:example by brwski · · Score: 2, Informative

      Well, Apple ][ AppleWorks can, if I remember correctly --- and if you can get your hands on early versions of ClarisWorks, there is little problem importing Apple ][ AppleWorks files. Then there is the late, great word proc, AppleWriter. At least it used in-line codes to make things work. Made moving to LaTeX pretty easy.

      --

      brwski
      "Because without beer, things do not seem to go as well''

    3. Re:example by jonadab · · Score: 1

      > if you can get your hands on early versions of ClarisWorks

      If you can get your hands on early versions of ClarisWorks, they won't run on any modern system. This is *exactly* the sort of difficulty the original question was talking about. Today, if you had ancient files and needed early versions of ClarisWorks to open them, you could probably solve the problem with a few hours of hunting around on eBay for an old 68k Mac, but with every passing year this will become more and more problematic.

      Ancient AppleWorks file formats are not the best example, though, for a couple of reasons. First, AppleWorks was *the* application (not just *the* word processor, but *the* application, period) for the Apple // series, and second, the Apple // series had and has an unnaturally large hobbyist community, which makes it quite a lot easier to find accurate information, obtain old versions, and so forth.

      There are much better examples of formats from about the same era that were, at the time, very popular, but today are virtually impossible to open. RapidFile springs immediately to mind. PC Write. Perhaps scarier are the formats used by ancient backup software, such as PC Backup. Your documents could have been in a format we can still open, plain ASCII text even, and yet you could be unable to retrieve them *even* if the media are still good (which is another rather scary thing...), if the backup software's backup format is obscure.

      I'm curious how easy it is to open really old Lotus 123 spreadsheets with today's spreadsheet software.

      --
      Cut that out, or I will ship you to Norilsk in a box.
    4. Re:example by bhtooefr · · Score: 1

      you could probably solve the problem with a few hours of hunting around on eBay for an old 68k Mac

      Or a few minutes of hunting around using Google for a copy of Basilisk II... I think there's even a Mac port...

      And, that's why the best backup format is .tar, possibly with gzip compression. What DOESN'T support gzip? And, GNU tar is GPLed...
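For what it's worth, a gzip-compressed tar can be written and read back with nothing but the Python standard library. This is a minimal in-memory sketch; the `make_archive` and `read_archive` helpers and the sample file names are invented for the illustration.

```python
import io
import tarfile


def make_archive(files):
    """Pack a {name: bytes} mapping into an in-memory gzip-compressed tar."""
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode="w:gz") as archive:
        for name, payload in files.items():
            info = tarfile.TarInfo(name=name)
            info.size = len(payload)  # the header must carry the exact size
            archive.addfile(info, io.BytesIO(payload))
    return buffer.getvalue()


def read_archive(blob):
    """Return a {name: bytes} mapping read from a gzip-compressed tar."""
    contents = {}
    with tarfile.open(fileobj=io.BytesIO(blob), mode="r:gz") as archive:
        for member in archive.getmembers():
            if member.isfile():
                contents[member.name] = archive.extractfile(member).read()
    return contents


backup = make_archive({"notes.txt": b"plain text survives\n", "data.csv": b"a,b\n1,2\n"})
print(sorted(read_archive(backup)))
```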

    5. Re:example by bhtooefr · · Score: 1

      Also, on opening Lotus 1-2-3 spreadsheets being easy...

      First, Lotus 1-2-3 itself survived until 2000.

      Second, Microsoft used the 1-2-3 format for the spreadsheet in the first version of MS Works. In fact, the MS Works spreadsheet format is to this day a fork of the 1-2-3 spreadsheet format.

    6. Re:example by ratboy666 · · Score: 1

      VisiCalc uses text files -- the format is simple.

      Basically, just a list of commands and data. Read them in, and plant into another spreadsheet.

      The formulas will have to be converted, of course.

      But, the format itself is trivial.

      Ratboy.

      --
      Just another "Cubible(sic) Joe" 2 17 3061
    7. Re:example by leighklotz · · Score: 1

      > For that matter, is there anything that can read VisiCalc files?
      Try this.

  5. .. but not for all kinds of data by aliquis · · Score: 1

    Not usable for all kinds of data, I guess, but do you expect to find a format which can handle them all? I guess that at least as long as you stay with free software you can find out HOW the format worked. With a proprietarian(spelling..) file format you might be screwed.

  6. PDF by xwizbt · · Score: 1

    I'd shove it in a PDF. Even if you can't manage it, utilities like FileJuicer can strip the main parts out of the document.

  7. Open up the standards by i.r.id10t · · Score: 2, Interesting

    Well, if it is an open format, nothing is stopping someone from writing something to read it and convert it to something "modern". If it is a closed format, and no longer in use, then the owner really should open it up. Would it be possible to set up an escrow of (closed) file formats - automatic open if the company goes defunct or the individual dies?

    Also, if you know what the end result data is supposed to look like, would it be possible to start "decompiling" it? Works with binary executables (sometimes)...

    --
    Don't blame me, I voted for Kodos
    1. Re:Open up the standards by Anonymous Coward · · Score: 1, Interesting

      The question and responses like to blame Microsoft, but certainly early versions of Word had a well-documented, open file format. I had the SDK complete with sample code. Nevertheless, old Word files are used here as an example of data that can't be retrieved.

      Even if the format is open, that's not really an assurance that you can really write a file converter to properly retrieve the data, especially for significantly complex file formats.

      An open source project will fare no better after everyone moves on to the next trendy replacement and their home page on sourceforge disappears. The question really has nothing to do with open source or not; just complexity and the accuracy of documentation.

  8. Simple by jZnat · · Score: 2, Insightful

    Formats that are worth using for old (and sometimes new) documents:
    * RTF (quite universal)
    * PDF (somewhat universal, will always have the same formatting)
    * Plaintext (never becomes unreadable unless the file's character set ceases to exist somehow)

    --
    'Yes, firefox is indeed greater than women. Can women block pops up for you? No. Can Firefox show you naked women? Yes.'
    1. Re:simple by HotNeedleOfInquiry · · Score: 1

      Well big Al, you might want to spend more time working on spelling, grammer and computer history before making silly posts on /.

      OTOH, since you're a senior, there's probably little hope. Go ahead and troll.

      --
      "Eve of Destruction", it's not just for old hippies anymore...
    2. Re:simple by DJCater · · Score: 2, Funny

      Yeah, spelling and grammer...

      --
      Sig Appended to the end of comments you post. 120 chars.
    3. Re:simple by HotNeedleOfInquiry · · Score: 1

      Damn, damn, damn. Every frigg'n time.

      Grammar.

      --
      "Eve of Destruction", it's not just for old hippies anymore...
    4. Re:simple by Larry+Lightbulb · · Score: 1

      Where possible we should encourage the elderly to have pride in their sexuality, not complaining about errors in their posts.

  9. EBCDIC and dead voters by markjugg · · Score: 2, Interesting
    I once worked on a research project for a newspaper to investigate voter fraud.

    To start, they used open records requests to get the details of people who recently voted, and details of those who recently died.

    The goal was to find people who continued to vote after they died, which may sound funny, but is still happening.

    The data the government gave us was on magnetic reels. The data on the reels was stored in a fixed-width EBCDIC format. Talk about a dead format!

    It turned out the local college still had a working magnetic reel reader, and was able to help me get the data out of EBCDIC into ASCII, but the project was cancelled anyway.
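Getting fixed-width EBCDIC into ASCII is mostly a matter of picking the right code page and slicing. Here is a rough sketch in Python, assuming code page 037 (one of several EBCDIC variants, so the real tapes may have used another) and an entirely hypothetical record layout; `LAYOUT`, the field names, and `parse_records` are made up for the example.

```python
# Hypothetical layout: (field name, width in bytes). Not the real voter file.
LAYOUT = [("voter_id", 8), ("surname", 12), ("year", 4)]
RECORD_LENGTH = sum(width for _, width in LAYOUT)


def parse_records(raw, encoding="cp037"):
    """Split raw bytes into fixed-length records and decode each field."""
    records = []
    # Step through the data one whole record at a time; a trailing
    # partial record is ignored.
    for start in range(0, len(raw) - RECORD_LENGTH + 1, RECORD_LENGTH):
        # cp037 is a single-byte encoding, so byte offsets equal character offsets.
        text = raw[start:start + RECORD_LENGTH].decode(encoding)
        record, offset = {}, 0
        for name, width in LAYOUT:
            record[name] = text[offset:offset + width].strip()
            offset += width
        records.append(record)
    return records


# Build one sample record by encoding ASCII-range text as EBCDIC.
sample = ("00001234" + "SMITH".ljust(12) + "2004").encode("cp037")
print(parse_records(sample))
```

Packed-decimal or binary fields, if the file has any, would need separate handling and are not covered here.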

    1. Re:EBCDIC and dead voters by Ratbert42 · · Score: 2, Informative
      The data on the reels was stored in a fixed-width EBCDIC format. Talk about a dead format!

      The physical media might be near death, but I work on modern C++ code that reads and writes fixed block EBCDIC files.

    2. Re:EBCDIC and dead voters by jesup · · Score: 1

      So, if you were to have old ASCII or EBCDIC 9-track tapes, where would one go to get them read? I have some dating to the late 80's I'd love to get the data off of. For amusement, mostly.

    3. Re:EBCDIC and dead voters by saintp · · Score: 2, Interesting

      Do what the GP did: ask your local university. We still have a nine-track drive around, although it hasn't been fired up in a few years. Lots of data from the state government, ACT test reports, etc., came on 9-track tapes until just five or six years ago, so lots of universities still have them around.

    4. Re:EBCDIC and dead voters by gradbert · · Score: 1

      so what part of the financial industry do you work in?

      EBCDIC is alive and well moving money around. All the credit card companies use it for the real-time and the settlement side.

      fixed width stuff is easy. I have code that does variable format records with binary data too in EBCDIC. and it's written in Tcl (-:

    5. Re:EBCDIC and dead voters by ibm1130 · · Score: 1

      Vote Fraud....
      Hmmm....
      So how IS life in Chicago.

      Parenthetically it looks like the law may finally be catching up with the Daleys. Not before time either.

    6. Re:EBCDIC and dead voters by Scorchio · · Score: 1

      Right. Did you ever get the impression the government agency might not have wanted you to complete the project?

    7. Re:EBCDIC and dead voters by markjugg · · Score: 1

      Not in this case. The newspaper changed their mind for other reasons. As it turned out, the file formats didn't slow us down much while the project was active.

  10. Wordstar 3.3 by HotNeedleOfInquiry · · Score: 1

    Only kidding. I do have Wordstar 3.3 files made under CP/M that will still open though...

    --
    "Eve of Destruction", it's not just for old hippies anymore...
  11. I have a hearsay hypothesis by Philip+K+Dickhead · · Score: 1
    It might be true, I haven't demonstrated the veracity of the claim myself. It does seem to resonate with a number of my prejudices - so I think it's safe to air this as a grievance to presumably sympathetic readers.

    Some of you can probably supply anecdotal evidence, no? I'd like to make broad recommendations in the future, and hope that some of you have little else to do.

    Thanks!

    --
    "Speaking the Truth in times of universal deceit is a revolutionary act." -- George Orwell
    1. Re:I have a hearsay hypothesis by Anonymous Coward · · Score: 0

      Should we therefore conclude that a solitary question is sufficient to elaborate upon the poignant paradigm we find ourselves traversing?

      Do robot sheep dream of electric shepherds?

  12. Engineering + smaller programs. by Goalie_Ca · · Score: 1

    Smaller programs, especially the ones we use in engineering, seem to become obsolete on a daily basis. That goes for a lot of electrical/computer engineering programs, especially those for devices or for programming these devices. Often the companies behind them get bought out or are lost forever. Then there's all the CAD and simulation software. I can't even think of it all, but I've come across a lot of essentially unusable stuff as a result.

    --

    ----
    Go canucks, habs, and sens!
    1. Re:Engineering + smaller programs. by Anonymous Coward · · Score: 0

      That's why when we do a product release, we archive all tools required to recreate the release (and have another engineer sign off that they were able to reproduce the release using only what was archived). We should be in good shape until hardware capable of running the tools doesn't exist anymore, and then we'll be totally screwed. Maybe we should be archiving open source emulators for the hardware that the tools run on, too!

  13. Note... by Otter · · Score: 2, Insightful
    If true, this would lend a lot of force behind moving to products that have an open file format.

    Well, yes and no. Let's say Ami Pro file format were fully documented. (I have no idea whether it is or isn't.) At what point would it be worthwhile for your company to actually write a file converter? I can certainly imagine a situation where it might be a cost-effective thing to do, but it's not the kind of thing that anyplace I've ever worked does routinely.

    And from a retention point of view, I don't know if you _want_ whatever scumbag lawyer is subpoenaing documents from you to be able to demand that you write him a converter. I'd rather be able to say "Here are our VisiCalc files. Enjoy!"

    1. Re:Note... by Anonymous Coward · · Score: 0
      We recently had to write a converter for the flat file databases used by our old 4GL inventory program. We created a parser and dumped the datasets into a modern RDBMS, the only difficulty was working out how the data was indexed.
      0x4d, 0x69, 0x63, 0x72, 0x6f, 0x73, 0x6f, 0x66, 0x74, 0x20, 0x66, 0x75,
      0x63, 0x6b, 0x69, 0x6e, 0x67, 0x20, 0x73, 0x75, 0x63, 0x6b, 0x21, 0x0a
      It was remarkably easier than we had originally anticipated.
    2. Re:Note... by Anonymous Coward · · Score: 0

      Trolling idiot. You suck...

    3. Re:Note... by Anonymous Coward · · Score: 0
      Yes, it took all those precious seconds to decode...
      perl -e'print chr for 0x4d, 0x69, 0x63, 0x72, 0x6f, 0x73, 0x6f, 0x66, 0x74, 0x20, 0x66, 0x75, 0x63, 0x6b, 0x69, 0x6e, 0x67, 0x20, 0x73, 0x75, 0x63, 0x6b, 0x21, 0x0a'
    4. Re:Note... by jonadab · · Score: 1

      > And from a retention point of view, I don't know if you _want_ whatever
      > scumbag lawyer is subpoenaing documents from you to be able to demand
      > that you write him a converter. I'd rather be able to say "Here are our
      > VisiCalc files. Enjoy!"

      No, no, you can do better than that...

      "Okay, these tapes contain the information you requested. These other tapes contain the in-house software that reads and writes the format that the data is in on the tapes. Now, _these_ tapes contain the in-house software that reads and writes the tapes. It all runs on TOPS-10. Enjoy! Hmmm... TOPS-10? Oh, we no longer have that, our site license ran out. You'll have to ask DEC."

      --
      Cut that out, or I will ship you to Norilsk in a box.
    5. Re:Note... by Anonymous Coward · · Score: 0

      Of course anyone can run PDP-10. Paul Allen (you know, Bill Gates's partner in crime) has a publicly running TOPS-10 system.

    6. Re:Note... by Walt · · Score: 1
      But then, there's The DEC PDP-10 Emulation Webpage: http://www.aracnet.com/~healyzh/pdp10emu.html

      Now was that 7- or 9-track tape?

      --
      (Unix & Network) (Security & SysMgmt)
  14. I can still use an old DOS program by Marxist+Hacker+42 · · Score: 1

    To read TI-99/4A-written Display Variable 80 (aka DV80) files, in which all of my early experiments with machine language and my high school word processing papers are saved. But, and this is a big but, I've got to find a 5.25", 360k drive to do it. So I've kept a few around; I doubt I could still find them new.

    Whenever possible, I convert those to plain text- and store them on CDs.

    --
    SJW: a person who perceives an injustice, and while correcting it, commits a greater injustice.
    1. Re:I can still use an old DOS program by DrSkwid · · Score: 1

      that's funny cos your CDs will last less time than the TI-99

      --
      There are places where the networks are not touching,and there are places where they are-Boeing's Lori Gunter
    2. Re:I can still use an old DOS program by Anonymous Coward · · Score: 0

      At least you had a disk drive for your TI-99/4A, I had to use the cassette tape drive to save items. Problem was my siblings thought they were audio tapes that had gone bad so they threw them out.
       
      The first program I ever wrote was on a TI. And I am not talking about typing in the source code for a game from some book we picked up. I was just getting into it when the tape was tossed.
      Ahh, memories

    3. Re:I can still use an old DOS program by aminorex · · Score: 1

      Of course, a 5.25" 1.2MB drive will read those just fine.

      --
      -I like my women like I like my tea: green-
    4. Re:I can still use an old DOS program by Marxist+Hacker+42 · · Score: 1

      As a format? No, I think OrangeBook will be around and still readable for a while. Or are you talking about archiving problems? I use the 100 year quality gold for such things, the type with two layers of plastic instead of one.

      --
      SJW: a person who perceives an injustice, and while correcting it, commits a greater injustice.
    5. Re:I can still use an old DOS program by Marxist+Hacker+42 · · Score: 1

      Not if it was formatted on a Myarc DCC. I'm not entirely sure why, but it seems to have to do with how that card used to run the stepper motor.

      --
      SJW: a person who perceives an injustice, and while correcting it, commits a greater injustice.
    6. Re:I can still use an old DOS program by Anonymous Coward · · Score: 0

      Also, seeing how much space a CD has, you can save it several times on it. Even if some part of the disc becomes unreadable, you can recover the files. Of course, this is only helpful if the CD survives as a whole. :>
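One way to make use of several copies on the same disc is to keep a checksum next to them and take the first copy that still matches. This is only a sketch: the SHA-256 checksum is an assumption added for the example (the idea above only mentions saving the file several times), and `sha256_hex` and `recover` are hypothetical helpers.

```python
import hashlib


def sha256_hex(blob):
    """Hex digest used to tell an intact copy from a damaged one."""
    return hashlib.sha256(blob).hexdigest()


def recover(copies, expected_digest):
    """Return the first copy whose digest matches, or None if all are damaged."""
    for blob in copies:
        if sha256_hex(blob) == expected_digest:
            return blob
    return None


original = b'10 PRINT "HELLO"\n20 GOTO 10\n'
digest = sha256_hex(original)  # stored alongside the copies when burning the disc

# Simulate a read where two of the three copies came back corrupted.
damaged = b'10 PRINT "HELLO"\n20 G\x00\x00\x00 10\n'
print(recover([damaged, original, damaged], digest) == original)
```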

  15. That's only part of the problem... by WarPresident · · Score: 1

    What about the storage media itself? I believe that the latest technology used in long-term durable media, in an easy-to-read format (at least for the moment), is quite old.

    --
    Here come da fudge!
    1. Re:That's only part of the problem... by Anonymous Coward · · Score: 0

      "store media themselves", or "storage medium itself".

  16. DARPA requirements to solve this by dbrossard · · Score: 3, Informative

    A boss I used to have, who worked on many DARPA-sponsored projects, had to archive ALL data related to those projects. In order to do this, not only did we have to archive the data itself, we had to archive a PC with all the pertinent software necessary to view/compile/manipulate that data, including workstations, servers, you name it. Of course the government standard may be overkill for many companies.....

    1. Re:DARPA requirements to solve this by Anonymous Coward · · Score: 0

      Wow, who maintains that PC? What if the hardware breaks or otherwise stops working? Even in those environmentally controlled underground archive places it might still go bad (think of the lithium battery used for the CMOS leaking out onto the motherboard or something). Maybe the answer is to archive two PCs?

      Sheesh, seems like there are better ways to handle that. Documented and/or open file formats seem like the best way to go. Dunno about the hardware, maybe upgrade the format as old ones become outdated (i.e. I have upgraded my old data from audio tape->floppy->CDROM->DVD->whatever is next).

  17. simple by bigalsenior · · Score: 0

    where possable use paper documents. there user readable and have a long life .

  18. I've had some wierd ones by squiggleslash · · Score: 4, Funny
    The weirdest I had to decipher essentially consisted of a bunch of hierarchical blocks using headers that constituted a description word and some properties, enclosed in less than and greater than signs.

    It was, frankly, awful. Someone had clearly designed it as some kind of "One size fits all" type thing, except that as it was text based it didn't really work that well. Typically graphics, for example, had to be represented by a block that contained a filename: yep, graphics, sound, anything more complicated than a word or a number had to be put in a separate file. Neither my colleagues nor I could understand why anyone would try to put so much effort into making it look hierarchical and extensible, and then not include support for data that isn't well represented as text. Hell, most of the files on our PCs can't easily be represented efficiently or usefully as text.

    It was also remarkably inefficient. To give you some idea, when we converted it into plain text files in a more efficient form, the files were typically 60-70% smaller. I've always found gzip a good indicator of the efficiency of a file format - usually, plain text compresses to about 30% of the original size. In this case, it was frequently 10%.

    Absolutely horrible format. I hope I never have to work with it again.
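The gzip rule of thumb above is easy to try. This sketch compares the compressed-to-original size ratio of the same made-up records written in a tag-heavy form and in a terse comma-separated form; `compressed_ratio` and the sample data are invented for the illustration, and the exact ratios will depend on the data.

```python
import gzip


def compressed_ratio(data):
    """Size after gzip divided by size before; lower means more redundancy."""
    return len(gzip.compress(data)) / len(data)


# The same 200 made-up records in two representations.
rows = [(i, i * 7) for i in range(200)]
verbose = "".join(
    f"<record><id>{a}</id><value>{b}</value></record>\n" for a, b in rows
).encode("ascii")
terse = "".join(f"{a},{b}\n" for a, b in rows).encode("ascii")

print(f"tag-heavy: {len(verbose)} bytes, ratio {compressed_ratio(verbose):.2f}")
print(f"terse:     {len(terse)} bytes, ratio {compressed_ratio(terse):.2f}")
```

The tag-heavy form shrinks to a much smaller fraction of its original size, which is the redundancy the rule of thumb is picking up on.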

    --
    You are not alone. This is not normal. None of this is normal.
    1. Re:I've had some wierd ones by Anonymous Coward · · Score: 1

      Typically graphics, for example, had to be represented by a block that contained a filename: yep, graphics, sound, anything more complicated than a word or a number had to be put in a separate file. Neither my colleagues nor I could understand why anyone would try to put so much effort into making it look hierarchical and extensible, and then not include support for data that isn't well represented as text.

      I know you're just kidding, but there are at least two ways of including images in HTML/XHTML - with data: URIs and with inline SVG (XHTML only). Of course, neither works in Internet Explorer, but that's not the W3C's fault.
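For anyone who hasn't met data: URIs, building one is a one-liner. A minimal Python sketch, with `to_data_uri` as a made-up helper and a text payload standing in for real image bytes:

```python
import base64


def to_data_uri(payload, mime_type):
    """Wrap raw bytes in a base64 data: URI of the given MIME type."""
    encoded = base64.b64encode(payload).decode("ascii")
    return f"data:{mime_type};base64,{encoded}"


# A tiny text payload stands in for image bytes to keep the example readable.
uri = to_data_uri(b"hello", "text/plain")
print(uri)

# For a real image, the URI would go straight into the src attribute.
print(f'<img src="{to_data_uri(b"...image bytes here...", "image/png")}" alt="inline" />')
```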

    2. Re:I've had some wierd ones by Anonymous Coward · · Score: 0

      Or more to the point, in XML you could define your own schema/doctype that would let you inline encoded blobs similar to how data URIs work. SVG is XML, so I don't understand where you're going there unless the spec allows embedded raster images?

    3. Re:I've had some wierd ones by pete-classic · · Score: 1
      Typically graphics, for example, had to be represented by a block that contained a filename: yep, graphics, sound, anything more complicated than a word or a number had to be put in a separate file.


      Clever post, and I hate to bust a fan's chops, but look into base64 and uuencode.

      This will get you started.

      -Peter
    4. Re:I've had some wierd ones by Anonymous Coward · · Score: 0

      Oh yeah, I've worked with that one. It sucked.

      I got a file from a web server. The headers of the web server said the text was "ISO-8859-1". However, the first line of this text said "UTF-8". Ok. Then I look through and there's a Windows curly quote in some text. Which is in neither ISO-8859-1 nor UTF-8.

      Just to make things exciting, there actually were UTF-8 characters elsewhere in the file.

      Oh, and if you use this format on Apple Macs, sometimes you'll find system files using UCS-2, which is for all intents and purposes a binary format (try looking at it with "less").

      Gotta love it. Where "love it" means "hate it".
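A file like that, with UTF-8 in some places and Windows curly quotes in others, can often be salvaged by decoding line by line with fallbacks. This is a heuristic sketch, not a guaranteed fix: `decode_line` and `decode_mixed` are hypothetical helpers, and a Windows-1252 line that happens to be valid UTF-8 would be decoded the wrong way.

```python
def decode_line(raw, encodings=("utf-8", "cp1252")):
    """Decode one line, trying each encoding in turn."""
    for name in encodings:
        try:
            return raw.decode(name)
        except UnicodeDecodeError:
            continue
    # latin-1 maps every byte to a character, so this last resort cannot fail.
    return raw.decode("latin-1")


def decode_mixed(raw):
    """Decode a byte string whose lines may use different encodings."""
    return "\n".join(decode_line(line) for line in raw.splitlines())


# First line is UTF-8; second line uses Windows-1252 curly quotes (0x93, 0x94).
sample = b"caf\xc3\xa9\n\x93quoted\x94\n"
print(decode_mixed(sample))
```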

    5. Re:I've had some wierd ones by squiggleslash · · Score: 1
      You're surely not suggesting that it's sane or rational to store data best treated as binary in base64 or uuencoded form? I ask because, yes, you can encapsulate binary data in this way (as I said at the end of the paragraph you quoted from, "Hell, most of the files on our PCs can't easily be represented efficiently or usefully as text."), but nobody in their right mind would do so unless actively forced to do so by other circumstances (e.g. "It must be in one file").

      It's horrendously inefficient. Saying "Yeah, you can include binary data, just uuencode it" isn't much different from saying "What do you mean, plain old analog phones aren't designed for data? I can convert data into tones and transmit it at 300 baud!"

      Like modems, it's a hack. Let's not pretend this isn't a problem with XML.
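
      To put a rough number on the inefficiency, here is a small measurement sketch on arbitrary sample bytes. base64 turns every 3 bytes into 4 characters; uuencode adds a length character and a newline to each 45-byte line on top of that.

```python
import base64
import binascii

raw = bytes(range(256)) * 12  # 3072 bytes of sample data

b64 = base64.b64encode(raw)

# uuencode body only (no "begin"/"end" lines), 45 raw bytes per line
uu = b"".join(binascii.b2a_uu(raw[i:i + 45]) for i in range(0, len(raw), 45))

b64_ratio = len(b64) / len(raw)
uu_ratio = len(uu) / len(raw)
print("base64: %.3f  uuencode: %.3f" % (b64_ratio, uu_ratio))
```

      That is about a third extra before any line breaks or XML escaping are counted.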

      --
      You are not alone. This is not normal. None of this is normal.
  19. Yup by Anonymous Coward · · Score: 0

    XML sukz0rz! I'll take the -1 flamebait as AC, so you don't have to.

  20. how about codecs? by artifex2004 · · Score: 2, Interesting

    While the file format itself may be a standard wrapper, there are many codecs out there that are obsolete and that only ever had proprietary drivers written for early MS Windows versions, for example.

  21. Some thoughts about it by Alpha27 · · Score: 2, Interesting

    It first depends on what you want to achieve with them, do you want them to be read only, or do you wish to edit them as well in the future? They may not be too much of an issue but something to think about.

    For images, I would look at the past to see what file formats were around before the internet was mainstream, circa 1995. I remember Paintbrush PCX as a file format, but haven't seen a file in that format since then. TGAs and TIFFs were around and still are today, so that might be one possibility. You also have SVG, which, being an XML file format, allows you to convert it to another format in the future.

    As for text documents, one definite possibility is XML. You can convert to many other formats from XML (HTML, PDF, RTF, etc.). Another possibility is RTF or plain text, though you might lose some of the more advanced features. You might even have to extend the XML to deal with anything special in your files. LaTeX or TeX might be another solution since they're still around, though I have no experience with them beyond being aware of them.

    I would also recommend keeping a copy of the original software you used at the time, in case you need to get access to the files with the program that actually created them. This way, you still have some sort of access. If that means you need to keep a copy of the original O/S as well, so be it.

    1. Re:Some thoughts about it by ratboy666 · · Score: 1

      XML is NOT a solution.

      It is simply a "data wrapper".

      TeX (LaTeX) is a solution for typeset material.

      Ratboy.

      --
      Just another "Cubible(sic) Joe" 2 17 3061
  22. I happen to have a computer museum at my disposal by gdav · · Score: 5, Interesting

    But even so, the other day I got a shock, seeing how quickly the door closes.

    A professor at the university where I work turned up with his original doctoral thesis from 1989 on disk. 3" disk, to be exact - the format that famously lost out to the ubiquitous 3.5" disk. He had written it on the Amstrad PCW 8256, a weird British CP/M machine from the mid 80s. No matter, I have several of these rotting in my loft!

    But they don't boot. At this point you brace yourself for the long haul. The drive belts used to perish on those models, but look! There are loads of drive belts in the Maplin Electronics catalogue. You just need to order the right size.

    No problem! You carefully dismantle the drive and dig out the belt. You broke it? No problem! Just makes it easier to measure. You can only measure the circumference, whereas Maplin only quotes the diameter? No problem! You are about to use Pi for the first and last time in your entire life! Order one that's slightly too big, and one that's slightly too small, just to feel safe.

    When the belt arrives, you fit it. You carefully re-assemble the drive. You insert that CP/M boot disk that you carefully prepared in 1987, the one with the custom PROFILE.SUB that copies important utilities to RAMDISK. You power up and it boots! You feel young again.

    Now you try your Locoscript boot disk - remember, Locoscript did not run under CP/M - it was an entire little operating system unto itself. It works, and when you swap disks (f7) you can read the Prof's work! It's yesterday once more! Shoo-bee-doo-lang-lang!

    At this point I got lucky - I had the LOCOLINK package including the special Amstrad Bus PC parallel port link cable, so I was able to go Locoscript PCW -> Locoscript PC -> Wordstar 3.3 -> Wordperfect 5.1 -> Winword. Those nice chaps at Ansible could have shortened that trip by a step or two.

    In the absence of the proprietary LOCOLINK cable I could also have gone Locoscript 1 PCW -> Locoscript 2 PCW -> ASCII on PCW -> ASCII on PC via Kermit -> Winword. But I'd have lost all his bolds and underlines.

    Now I got a fine bottle of Metaxa Greek Brandy out of this exchange, so I'm not exactly complaining. But I was shocked to realise that his files were younger than my eldest child, and she's got two years of school ahead of her.

    In the absence of any credible international initiative to create a reliable permanent archive format, I'd say print it to acid-free paper, multiple copies in separate places, and hope for the best, like Cassiodorus.

  23. Well duh... by DJCater · · Score: 3, Funny

    XML! Open-source! Standards-compliant! Rag-doll physics! (Oh wait, wrong buzzword-bank...)

    --
    Sig Appended to the end of comments you post. 120 chars.
  24. Obsolete files from Kazaa by Anonymous Coward · · Score: 2, Funny

    I keep double clicking on these ".mpg.avi.jpg.Donkey Bukkake Porn.wmv.exe" files and nothing happens!

    Maybe I should start using Windows?

  25. Can you say, "Upward Compatibility"? by TFGeditor · · Score: 1

    Back in the day, mainstream software providers included "upward compatibility" in all new software releases. In other words, any data or files created by the software from PP1 release would be readable by whatever the current version happened to be, including any in between.

    Modern code developers seemingly have no concept of upward compatibility. More's the pity.

    --
    Ignorance is curable, stupid is forever.
    1. Re:Can you say, "Upward Compatibility"? by fluxmov · · Score: 1

      Isn't that "downward compatibility", as opposed to upward/forward compatibility, meaning that the older software can read the newer version's files?

    2. Re:Can you say, "Upward Compatibility"? by TFGeditor · · Score: 1

      Well, it would seem that way, but the term "upward compatibility" was the standard, presumably meaning "upward compatible from old to new."

      Sorta like "upload" and "download." It would seem logical to "download" from one's own machine to the target machine, and vice-versa, but that ain't how the nomenclature works.

      --
      Ignorance is curable, stupid is forever.
  26. reStructured Plaintext by FFFish · · Score: 4, Interesting

    I contract out as a technical writer. For my primary client, I strongly encouraged and then delivered a plaintext solution: plaintext files stored in a CVS repository, written with the reStructuredText markup conventions and processed through Docutils; and an XSL-FO template that is used by XEP to render the Docutils XML to PDF. An autobuild system updates our documentation on a nightly basis.

    This system has worked superlatively. In addition to creating a documentation solution that will forevermore be accessible without special software, our authors can focus entirely on content without concern for layout and visual appearance, and our customers get a reasonably open file format (PDF) that looks as good on-screen as it does in print. It's win-win all around, by my reckoning.

    --
    Don't like it? Respond with words, not karma.
  27. Hardware issues too by Kevin+Burtch · · Score: 1


    I used to work at a major corporation (you very likely own something made by them) who had a requirement to keep archives of their older engineering documents. The fire-safe was loaded with various tapes ranging back to many dozens of old open-reel tapes.

    Of course, they hadn't had the (monstrous) tape drives to actually read these tapes for many years. I have no idea what they thought they were keeping them for.

    --
    - Preferences: Solaris 10 (servers), Ubuntu (desktops), Solaris 11 (personal servers) -
  28. What about the backup media formats? by Curmudgeonlyoldbloke · · Score: 2, Informative

    Before you can worry about reading individual files, you'll need to get them off the backup media.

    Assuming that you've got some hardware that can physically read whatever it is, what about the backup software?

    For example:
    http://support.microsoft.com/?kbid=305381, complete with quote "this behavior is by design".

  29. Realplayer (sort of) by Curmudgeonlyoldbloke · · Score: 2, Interesting

    No, don't laugh.

    Realplayer 10 doesn't support Realplayer 2 "out of the box". It will happily connect to Real to download said codec if you want - although obviously this assumes that Real will always be with us.

  30. Easy, MS Word when used for math by marat · · Score: 3, Informative

    Any MS Word ships with only one version of Equation Editor; it was 1.0 in Word 2, 2.0 in Word 6, and probably 3.0 or higher now. It means you cannot edit your old equations after switching to a newer version. Therefore most of those who tried to use Word for writing scientific papers left Word after version 6 came out; now only biologists and the like still use it because they don't need no bloody math.

  31. Amiga IFF by jesup · · Score: 1

    I wonder how many "mainstream" programs still read Amiga IFF files (for common types like Deluxe Paint/ILBM, WP files, etc) ... Sort of a more-efficient (binary) predecessor to XML; highly extensible with some basic functionalities you could extend (FORM, BODY, etc)

    I know Gimp supports it via a plugin. Here's a Newtek/Lightwave link:
    http://www.newtek.com/products/lightwave/developer/LW80/8lwsdk/docs/filefmts/ilbm.html
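
    The container layout is simple enough to walk by hand. A minimal sketch, assuming the usual IFF conventions (4-byte chunk IDs, 32-bit big-endian lengths, chunks padded to an even size) and an invented sample file rather than a real ILBM image:

```python
import struct


def make_chunk(chunk_id, payload):
    """Build one IFF chunk: ID, big-endian length, data, pad to even size."""
    pad = b"\x00" if len(payload) % 2 else b""
    return chunk_id + struct.pack(">I", len(payload)) + payload + pad


def parse_iff(data):
    """Return (form_type, [(chunk_id, payload), ...]) for a FORM file."""
    magic, size, form_type = struct.unpack(">4sI4s", data[:12])
    if magic != b"FORM":
        raise ValueError("not an IFF FORM file")
    end = 8 + size  # size counts everything after the 8-byte header
    pos = 12
    chunks = []
    while pos + 8 <= end:
        chunk_id, length = struct.unpack(">4sI", data[pos:pos + 8])
        payload = data[pos + 8:pos + 8 + length]
        chunks.append((chunk_id.decode("ascii"), payload))
        pos += 8 + length + (length & 1)  # skip the pad byte if length is odd
    return form_type.decode("ascii"), chunks


# Invented sample: a FORM of type ILBM holding two tiny chunks
body = b"ILBM" + make_chunk(b"NAME", b"abc") + make_chunk(b"BODY", b"\x01\x02")
sample = make_chunk(b"FORM", body)
form_type, chunks = parse_iff(sample)
```

    Making sense of what is inside each chunk (bitplanes, compression and so on) is a separate job for each FORM type.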

    1. Re:Amiga IFF by metamatic · · Score: 1

      Tons. IFF files (or minor variants thereof) are the basis of the Mac's AIFF format, Windows WAVE format, and the DLS sample file format. And that's just the ones I've personally tried to parse...

      Graphic Converter on the Mac reads Amiga IFF graphics images of at least one flavor.

      --
      GCHQ Quantum Insert installed. If only our tongues were made of glass, how much more careful we would be when we speak
  32. You wouldn't. by khasim · · Score: 1

    You'd buy a converter from a company that specialized in converters:
    http://www.w3.org/Tools/Word_proc_filters.html

    Remember, YOU want to know what you're sending to the lawyer BEFORE he does. Being surprised in Court is not a good thing.

  33. There's "open" and then there's *open*! by fm6 · · Score: 4, Interesting
    When people say "open format", they usually refer to documenting the details of the format. (Or, as with XML, using a format that's self-documenting.) Now, that does save a lot of work, but it doesn't address a much harder problem. Namely: OK, you've got the data, now how do you use it?

    Classic example: sharing MS Word files with other word processors. The problem isn't getting at the data in .DOC format (not an easy problem, but one that was solved years ago). The problem is rendering Word formatting using the conventions of other word processors. As anybody who's tried to import complex Word documents into Open Office will testify, that's a problem that's a long way from being solved -- if it ever is.

    I've been working on a project for an organization that has a bunch of certificates created in Adobe Illustrator 6. The files are saved in EPS format, which belongs to Adobe, but is very well documented. So accessing the files should be a snap, right? Wrong. I have Adobe Illustrator 11 (better known as Illustrator CS), which uses completely different conventions for creating an EPS file. It can read the old files OK -- but it horribly mungs the formatting. Somebody's going to have to sit down and undo all that munging, which will be a day or two of work. Then we can make the simple change (inserting a new signature), that's the only change we want to make!

    So true openness has more to it than knowing what all the bits and bytes do. It's making sure that all the different design teams for different products that use the format (or the same product at different times!) are on the same page when it comes to the fine details.

    1. Re:There's "open" and then there's *open*! by jayrtfm · · Score: 1

      that might be a font issue. even if you had a machine running AI6 it may still be munged.

    2. Re:There's "open" and then there's *open*! by fm6 · · Score: 1

      No, there's no font issue. The EPS file displays fine in Ghostscript and other Postscript renderers. The problem is that Adobe changed the way AI breaks EPS files into discrete user-manipulable entities.

    3. Re:There's "open" and then there's *open*! by webagogue · · Score: 1

      How good is RTF? How long should I expect to be able to read my RTF files? For the past 15 years of my computing career I have yet to run across a word processor that would not open RTF files. RTF may not be as ubiquitous and long-standing as ASCII text, but it is a lot nicer to work with.

      --

      Knowledge is valuable. Ignorance is dangerous. Censorship is unacceptable. http://slashdot.org/comments.pl?sid=10
    4. Re:There's "open" and then there's *open*! by Dachannien · · Score: 1

      The problem is rendering Word formatting using the conventions of other word processors.

      Considering how difficult it is getting Word to properly render complicated documents that you're working on, I'd say that problem is unsolvable. ;)

    5. Re:There's "open" and then there's *open*! by fm6 · · Score: 1
      For the past 15 years of my computing career I have yet to run across a word processor that would not open RTF files.
      They can open it. But they can't always format it correctly. As with DOC format, getting at the data is only part of the problem.
  34. The northwest by tepples · · Score: 0, Troll

    but with certain companies that are based in the northwest US not following standards and using proprietary extensions

    Would that be Nintendo? Or would it be the maker of the other major game console whose name doesn't have a P in it?

  35. RFC 2397 by tepples · · Score: 2, Insightful

    I don't understand where you're going there unless the spec allows embedded raster images?

    It's straightforward to make an <img /> or <object /> element that contains raster image data. Look up the data: URL scheme.
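
    A short sketch of what that looks like in practice. The helper names are made up, and the sample bytes are only the 8-byte PNG signature, not a displayable image:

```python
import base64
from html import escape


def to_data_uri(payload, mime_type):
    """Build an RFC 2397 data: URI with a base64-encoded body."""
    encoded = base64.b64encode(payload).decode("ascii")
    return "data:%s;base64,%s" % (mime_type, encoded)


def img_tag(payload, mime_type, alt):
    """Return an XHTML-style <img /> element with the data inlined."""
    return '<img src="%s" alt="%s" />' % (
        to_data_uri(payload, mime_type),
        escape(alt, quote=True),
    )


png_signature = b"\x89PNG\r\n\x1a\n"  # placeholder bytes, not a full image
uri = to_data_uri(png_signature, "image/png")
tag = img_tag(png_signature, "image/png", "inline example")
```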

    1. Re:RFC 2397 by Anonymous Coward · · Score: 0

      That's exactly what the post you were replying to was saying, are <img /> and <object /> tags part of the SVG spec? How about learning to read?

    2. Re:RFC 2397 by tepples · · Score: 1

      Several comments up someone mentioned XHTML, so what I said related to XHTML. But yes, SVG does support using images as textures, if the mentions of JPEG and PNG in the SVG spec are to be believed.

  36. Re:I happen to have a computer museum at my dispos by tepples · · Score: 1

    In the absence of any credible international initiative to create a reliable permanent archive format

    I'd guess that CD-R is here to stay, given that it shows 0 signs of becoming unsupported on newly manufactured HW. Every new DVD-ROM drive reads it, and likely so do BD-ROM and HD-DVD-ROM drives. If you don't trust off-site CD-R, then as you said, off-site paper backups using an OCR-friendly font are a safe way to go.

    That is, unless you're thinking about timescales in which English is likely to become a dead language. But by then, Christ will likely have come back, and the God of Wisdom will have won the fight against the Entropy Devil.

  37. Print using an OCR font, save in Rich Text Format. by Anonymous Coward · · Score: 2, Informative

    1)Two copies of archival quality hardcopy stored off site printed using an OCR font.
    2)Two copies of archival quality media stored off site saved as RTF as well as the working format.
    3) Regular on site archives.
    4) Regular on site backups.

    At least one off site facility should be a secure storage facility. The other should be accessible 24/7/365, therefore it should be on company property. Each site has paper and media. Archive quarterly.

    However, mostly it sounds like you need to hire a real Technical Writer and some competent IT people. This is 101 stuff.

  38. Retention requirements by booch · · Score: 1

    I would guess that most of the retention laws only require that you retain the files, not that you be able to load them into any particular program to make the files useful. So by just retaining the files, you're most likely already complying with the letter of the law.

    If someone asks for access to those files, it's their problem/responsibility to make use of them. Of course, if it's something that someone within your company needs, then it would be nice of you to help them access the files in a useful manner.

    BTW, translating the files into a different format than that in which they were originally used probably violates the letter of the retention laws.

    --
    Software sucks. Open Source sucks less.
  39. Re:how about codecs? - BINGO by wowbagger · · Score: 1

    I was just thinking of that very thing, and was going to make a comment anyway.

    Consider the old Motion Pixels MovieCD codec. By today's standards the codec isn't much, and yes, if you happen to own any of the old MovieCDs you would be better served just buying the DVD of the movie.

    However, precisely because the MovieCD format was killed deader than hell by the DVD, Motion Pixels went out of business, and the codec source, if it even still exists, is probably in some bankruptcy liquidator's sock drawer - I doubt that short of hiring a private investigator that you could even FIND the person with the source - and even if you did, you would be unlikely to be able to get the code released so that anybody could do anything with it.

  40. 101 by Anonymous Coward · · Score: 0

    You have two different problems here:
    1/ physical media: As has already been pointed out, would you be able to read magnetic reels, or 8" floppy disks?
    2/ Data format: Binary==Bad; Text==Better. Closed Format==Bad; Open Standard==Better. Closed Source==Bad; Open Source==Better.

    But you are lucky. Believe it or not, you are not the first guy confronting these kinds of problems. The solutions:

    1/ The lower the technology, the better. You can still go to Altamira or Lascaux and "read" what was painted there 10,000 to 15,000 years ago. Rocks are quite tough too: you can still read Roman inscriptions. Pure vegetable-fibre paper (ie: unchlorinated) in a proper atmosphere is quite good too: Egyptian papyri are still readable after almost 4000 years. So: if it is possible (it depends on the amount and kind of data), good old paper is the way to go; multiple copies, different places, proper environmental conditions. For anything not directly readable, make sure you store the "reading machine" with it. For simple mechanical devices (ie: punch card readers), add the engineering blueprints to the lot, in case you must build your own drive at some unknown future date; for anything more complex, deploy a strategy to test the machines from time to time. This way you will promptly discover a failing device and/or an overlooked issue (like your "engine" works at 110V where currently you are using 220V and you forgot about storing a transformer). If at all possible, these planned tests should also let you move the data forward to newer physical media (you may have trouble reading a 3" Amstrad diskette now, whereas moving it to 3.5" ten years ago, and then to a CD now, would have been trivial).
    2/ About electronic formats, go with the Army, man! SGML is still the way to go: easily readable/parseable, self-documented, and always customizable to your exact needs. Of course don't forget to include your DTD within your media!

  41. LaTeX by Anonymous Coward · · Score: 0

    That is one of the reasons why serious papers including any maths use LaTeX.

    1. Re:LaTeX by Anonymous Coward · · Score: 0

      >That is one of the reasons why serious
      >papers including any maths use LaTeX.

      Rather, that's one of the reasons why serious papers including any maths SHOULD use LaTeX. (And most do, of course. But not all.)

      I recently watched someone try to convert hundreds of pages of dense physics lecture notes from Macintosh WriteNow format into something that could be read today and distributed electronically.

      They eventually found a machine that would run the old software, but it had no networking built into it and no way to attach modern hardware.

      I think in the end they found an equally ancient printer for which ink could still be purchased, printed every page, used a screen reader to grab the text, and then laboriously TeX'd every equation by hand.

      Talk about a lot of wasted hours.

  42. Re:I happen to have a computer museum at my dispos by FLEB · · Score: 1

    I'd say print it to acid-free paper, multiple copies in separate places, and hope for the best, like Cassiodorus.

    How about that, with alternating lines of text and CODE128 (or similar) barcode to make it more easily machine-readable.

    --
    Information wants to be free.
    Entertainment wants to be paid.
    You just want to be cheap.
  43. just uuencode it... by da5idnetlimit.com · · Score: 1

    yeah, true, you shouldn't uuencode it...

    Best way to make this is to open the JPEG with a text editor and directly use the text data as is

    For, after all, a jpeg is just text file with a specific meaning to a specific parser, and the data is already compressed, so...

    --
    It takes 40+ muscles to frown, but only four to extend your arm and bitchslap the motherfucker
    1. Re:just uuencode it... by ivan256 · · Score: 1

      Please, dear god, tell me you're kidding and don't actually think this is a good idea/will work.

    2. Re:just uuencode it... by da5idnetlimit.com · · Score: 1

      well, I'm almost kidding...

      Someone told this guy to use uuencode ... at least my solution will save him some space 8)

      --
      It takes 40+ muscles to frown, but only four to extend your arm and bitchslap the motherfucker
    3. Re:just uuencode it... by ivan256 · · Score: 1

      at least my solution will save him some space 8)

      Yeah, I'd say... Throwing away 1 bit out of every byte tends to do that.
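
      To make the joke concrete: a sketch of what happens if binary data passes through anything that only keeps 7 bits per character. The function is hypothetical, but the sample bytes are how a JFIF JPEG really starts (SOI marker, then APP0 marker).

```python
def strip_high_bit(data):
    """Simulate a 7-bit-clean text channel by clearing bit 7 of every byte."""
    return bytes(b & 0x7F for b in data)


jpeg_start = bytes([0xFF, 0xD8, 0xFF, 0xE0])  # SOI marker + APP0 marker
as_7bit = strip_high_bit(jpeg_start)

# Two different input bytes now map to the same output byte,
# so there is no way to undo the damage.
collision = strip_high_bit(b"\x41") == strip_high_bit(b"\xc1")
```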

  44. Re:I happen to have a computer museum at my dispos by Johnny+Mnemonic · · Score: 2, Insightful

    I'd guess that CD-R is here to stay, given that it shows 0 signs of becoming unsupported on newly manufactured HW.

    20 years ago, you could have said the same thing about a 3.5" floppy. When the iMac first came out in, what, 98, it was widely denigrated for not having a floppy. It's now getting increasingly harder to get floppy drives on PCs, and I wouldn't be surprised at all if they were special-order in another 5 years. In 10 years, your .sig file will be larger than the contents of a 1.4 MB floppy, so why would anyone include them on new hardware?

    I think the only thing to do about data like this is to keep it on a fileserver, and then move the data as the server gets older. As long as it talks tcp/ip, you'll probably be able to get it off--that's one standard that's not going away for a long time, and will be backwards compatible when it does.

    --
    $tar -xvf .sig.tar
  45. Re:I happen to have a computer museum at my dispos by GoRK · · Score: 1

    Actually, the first Pioneer Blu-ray drive to hit the market, the BDR-1000, has absolutely no support for CD-R or even reading CDs, though it reads and writes every DVD recordable format. While this probably will not be the norm for Blu-ray or HD-DVD drives, it's certainly not out of the question to imagine a day when your computer can't read a CD-ROM.

  46. seagate jet data gone by Anonymous Coward · · Score: 0

    1997 or so, seagate sent me a free "jet drive" competition to iomega jazz i guess. 1.3 GB (a lot, at the time) external drive w/ removable cartridges @$40 apiece if i recall correctly.

    i loved the way it worked, i needed extra storage, and i put tons of stuff on a couple of the disks.

    now i cant get drivers to get it working and all that old data seems to be lost forever.

  47. QPW by dtfinch · · Score: 2, Informative

    Some might argue that Quattro Pro is still alive (they're still releasing new versions), but its default spreadsheet format is entirely unsupported by the rest of the world. Every time someone cracks their file format, they make a new one. WB1, WB2, WB3, and now QPW. QPW is already 8 years old and still few have figured it out well enough to even extract data from it. If Corel Office dies, many old spreadsheets will slip into oblivion unless converted manually (open, save as, close, and repeat for each of your 500+ spreadsheets).

  48. Re:I happen to have a computer museum at my dispos by vbrtrmn · · Score: 1

    The only problem with CD/DVD media is rot.
    http://www.mv.com/ipusers/richbreton/m/files/cd_rot.htm

    --
    it's a sig, wtf?
  49. Retention periods? by Vo0k · · Score: 1

    Well, retention periods aren't a major headache. Just produce given file on request and opening it should be a worry of whoever ordered it. Not changing the file format guarantees no original information is lost along the way.
    But if you -need- these files internally, just keep one-two boxen with all the legacy software you'd ever need.

    --
    Anagram("United States of America") == "Dine out, taste a Mac, fries"
  50. Vivo video files. by Chonine · · Score: 2, Informative
    Back in ~97 a friend and I compressed all of our stupid home movies from web cams into vivo format. It was designed for streaming, but made very small files that we would xfer over modem to each other.

    Now, playing viv files on windows is a pain: you have to install the archaic vivo player, which was designed for windows 95 or so. Also, after years of searching, I've found that no one makes an app to convert them to mpg, sans some commercial screen capturing programs that I wouldn't touch. MPlayer plays the files, and I'm pretty sure it's a simple command to output it into an MPG.

    Ever since I've been penguiny, I've wanted to do that - before the MPlayer team decides to deprecate vivo support from the latest versions.

    1. Re:Vivo video files. by juhaz · · Score: 1
      MPlayer plays the files, and I'm pretty sure it's a simple command to output it into an MPG.

      Indeed it is.
      mencoder -oac mp3lame -ovc lavc oldfile.viv -o newfile.avi
      should do the trick.
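
      For a whole folder of these, the same command can be wrapped in a loop. A sketch only: it assumes `mencoder` is on the PATH and that every source file ends in `.viv`; the function names are invented, and the flags are just the ones from the command above.

```python
import subprocess
from pathlib import Path


def mencoder_command(source):
    """Build the mencoder argument list for one .viv file."""
    target = source.with_suffix(".avi")
    return ["mencoder", "-oac", "mp3lame", "-ovc", "lavc",
            str(source), "-o", str(target)]


def convert_all(folder):
    """Convert every .viv file in folder; stops at the first failure."""
    for source in sorted(Path(folder).glob("*.viv")):
        subprocess.run(mencoder_command(source), check=True)


example = mencoder_command(Path("oldfile.viv"))
```

      Calling `convert_all("some/folder")` would do the actual work; nothing is converted until then.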
  51. Re:QPW is not closed by TeXMaster · · Score: 2, Informative
    Corel is not really in the "don't have a look at our file format" field, at all.

    With the SDK, which you can download for free, you get full reference of the file formats of WP, Presentation and QuattroPro.

    The problem is rather that nobody is interested in creating the conversion filters. For WordPerfect, there is now libwpd, which was built with the aforementioned reference. For QuattroPro, there isn't enough interest.

    A secondary problem is that Corel Office programs have, for most of their programs, more powerful/flexible/numerous features than their competition, which can make conversion clumsy.

    --
    "I'm never quite so stupid as when I'm being smart" (Linus van Pelt)
  52. On Office not being able to open its own files... by tod_miller · · Score: 1

    "Hey boss, here is the new version"

    "Hang on, didn't I tell you to remove that 4kb file reader for the last-last version?"

    "Why boss? I mean then people using the old version will suddenly find it has become obsole...t...e...aaaaaaaah I see!!!"

    "Good boy! Welcome to Microsoft"

    Use OpenOffice; for some reason they don't care if you open old Office formats, maybe because they are not trying to ass rape you.

    --
    #hostfile 0.0.0.0 primidi.com 0.0.0.0 www.primidi.com 0.0.0.0 radio.weblogs.com
  53. rtf, text, etc by VolciMaster · · Score: 1

    For long-term storage, I'd advocate rich text format, or straight-up text, maybe even HTML. Text is openable by anything, and hasn't changed since it was designed.

  54. CD to disappear? by tepples · · Score: 1

    20 years ago, you could have said the same thing about a 3.5" floppy.

    Difference is that nowadays the typical computer buyer is also the owner of dozens if not hundreds of musical phonorecords in CDDA format. The same couldn't generally be said about a floppy full of MIDI files in the floppy era. Until Sony brings out the SACD Walkman, CDDA is here to stay, and people are going to expect to be able to listen, rip, mix, and burn even on a new computer.

    In 10 years, your .sig file will be larger than the contents of a 1.4 MB floppy

    From dial-up to broadband, Slashdot's signature still didn't increase beyond 120 characters.

    As long as [your file server] talks tcp/ip, you'll probably be able to get [your data] off--that's one standard that's not going away for a long time

    Are you sure that 100BASE-TX, or PCI to add network cards that support a new layer 1, will still be supported?

    1. Re:CD to disappear? by Tim+C · · Score: 1

      Difference is that nowadays the typical computer buyer is also the owner of dozens if not hundreds of musical phonorecords in CDDA format.

      That's true, but they are also increasingly buying mp3 players and ripping their CDs to them. I have utterly non-technical friends who are now moving to mp3. Yes, they're still buying CDs, but they're all being ripped to mp3.

      I don't think CDs will disappear any time soon, but as the OP said, that's what I thought about floppies 10 years ago. I've not had a floppy drive in my home PC in about 3 years, and I've not missed it.

  55. Early adopter BD-ROM like DVD-ROM by tepples · · Score: 1

    Actually, the first Pioneer Blu-ray drive to hit the market, the BDR-1000, has absolutely no support for CD-R or even reading CDs

    The first few DVD-ROM drives' CD support was spotty as well. But as people demand combination BD/DVD/CD drives, those will become standard equipment.

    1. Re:Early adopter BD-ROM like DVD-ROM by GoRK · · Score: 1

      Yes, I remember having DVD drives that couldn't read CD's at all or drives that only read real 'pressed' CD's and not recordable. As I said, this drive will probably be the exception instead of the norm, and since it's marketed to professional dvd authoring companies etc and not at consumers anyway, it won't really matter. I would imagine they just built the drive using their existing dual laser components and the CD-tuned lasers got the shaft to get the product out to market. Consumer drives will probably employ the same method but use a single laser for both DVD's and CD's, the same as most non high-end DVD recorders do now.

  56. strings, continously and maybe be creative by v1z · · Score: 1

    The company I work for maintains among other things census data, the oldest of which is stored on punchcards. We have the cards, and a reader, but due to being stored in a too moist atmosphere, it's doubtful that the cards (a stack of about a 1000 cards or so) could be read by a punchcard reader.

    Luckily, the data has long since been converted to something a little more modern and stored in an SQL server, but I've always thought that if we needed that data, the most efficient way to get it would be to use a scanner with a sheet-feeder, scan the cards as images, write a script to process the images to numbers, and then convert that to something useful.
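
    A rough sketch of the last step of such a script, under heavy assumptions: some earlier image-processing stage (not shown) has already reduced each card column to 12 hole/no-hole flags in the row order 12, 11, 0, 1 ... 9, and the cards hold only digits, where a digit is a single punch in the row of the same name. Everything else comes out as "?".

```python
ROWS = ["12", "11", "0", "1", "2", "3", "4", "5", "6", "7", "8", "9"]


def decode_column(punched):
    """Map one column of 12 booleans to a digit, a space, or '?'."""
    rows = [ROWS[i] for i, hole in enumerate(punched) if hole]
    if not rows:
        return " "  # unpunched column
    if len(rows) == 1 and len(rows[0]) == 1:
        return rows[0]  # single punch in a digit row 0-9
    return "?"  # zone punches and letters are out of scope here


def decode_card(columns):
    """Decode a sequence of columns into a string."""
    return "".join(decode_column(col) for col in columns)


def column(*row_labels):
    """Test helper: build a column with holes in the named rows."""
    return [label in row_labels for label in ROWS]


card = [column("1"), column("9"), column(), column("0"), column("12", "1")]
decoded = decode_card(card)
```

    The hard part is of course the image processing that comes before this, especially with cards that have been stored damp.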

    However, the bottom line is: convert data as you go. For some "trivial" data, eg letters and such, pdf/ps might be a good format. But for anything approaching an application, eg spreadsheets or documents with macros, your only bet would be to continuously convert and update the data as you move from one platform to the next.

    As for old text/word processor documents, I've always had good success in getting the essential data with a simple "strings file > plain.txt". But it's not the same as having the actual formatted file, of course.
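
    Where the strings utility is not available, the same idea can be sketched in a few lines of Python. This is only an approximation of what strings does: it keeps runs of printable ASCII of at least four bytes and throws away everything else. The function name extract_strings and the min_len parameter are made up for this illustration.

```python
import re

def extract_strings(data: bytes, min_len: int = 4) -> list[str]:
    """Return every run of printable ASCII that is at least min_len bytes long."""
    # \x20-\x7e covers space through tilde; control bytes and 8-bit data end a run.
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    return [match.group().decode("ascii") for match in pattern.finditer(data)]

if __name__ == "__main__":
    # Stand-in for the bytes of an old word processor file.
    sample = b"\x00\x01Dear Sir\x00\x02ab\x00Invoice 42\xff"
    for text in extract_strings(sample):
        print(text)
```

    For a real file you would pass in open("file", "rb").read(). As with strings itself, formatting, tables and any text stored compressed or in a non-ASCII encoding will not survive.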

    Going with OpenOffice might help -- not only is the format open, but the code is free, which allows you to archive the implementation as well as the data. I think you'll be able to run code for x86 Linux for a long time, even if you might have to emulate the CPU in, say, 20 years' time. It might be possible to do the same for Windows code, of course.

    On a personal note, I have some CAD drawings made on the Amiga a few years ago, in a format I can't import anywhere; luckily I've exported most of that data as PostScript so I can at least view it. But it's not good for editing.

    Other posters have mentioned RTF as an alternative rich text format, and I think it could be a good choice. Spreadsheets might be a tougher nut to crack, although I expect MS Excel should be supported both by MS and by various competitors (open and closed source) for a long while still.

  57. Re:I happen to have a computer museum at my dispos by tepples · · Score: 1

    Does rot occur even in archival quality media stored vertically, at the proper temperature, in low humidity?

  58. Microsoft Works 3.0 by quamaretto · · Score: 1

    We have a backlog of Microsoft Works documents that could be traced back as much as 10 years. Unfortunately, these documents cannot be read by any version of Microsoft Office, or any later version of Microsoft Works, that I have tried. So to this day we borrow someone's copy of Microsoft Works 3.0 (ours is lost) every time we set up a new PC.

    And, sadly, it wasn't until just recently, maybe the past year or two, that my dad was persuaded to stop making all of his new documents and databases in MS Works 3.0. (As an added bonus, I have gotten him to stop putting his documents in random locations around the hard drive, and start putting them in a folder on the desktop. He still refuses to use "My Documents" for any such purpose.)

    --
    *is run over by rotten tomatoes*
  59. Re:I happen to have a computer museum at my dispos by Anonymous Coward · · Score: 0

    Are you referring to the 3.25" floppies that look like little 5.25" floppies? I have some of those lying around. I've never seen 3" floppies, maybe that's a Brit thing?
    Heh, how about wafer drives? On a Commodore 64? That's fucked up, right there!

  60. Capture printer output by samjam · · Score: 1

    For text-based systems you can try and capture the printer output.

    Serial is simplest; you would need some trickery to capture the parallel port.

    Then some Perl to decode the printer escape codes and re-apply formatting.
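
    A rough sketch of that decoding step, in Python rather than Perl. It assumes the captured stream uses a handful of two-byte Epson ESC/P style codes (ESC E / ESC F for bold on/off, ESC 4 / ESC 5 for italic on/off, ESC @ for reset) and turns them into simple tags. The code table, the tag names and the function decode_capture are illustrative assumptions; a real printer dialect has many more codes, some of which take parameters, so the table would have to be built from that printer's manual.

```python
ESC = 0x1B

# Assumed subset of two-byte escape codes and the markup to emit for each.
CODES = {
    ord("E"): "<b>",   # bold on
    ord("F"): "</b>",  # bold off
    ord("4"): "<i>",   # italic on
    ord("5"): "</i>",  # italic off
    ord("@"): "",      # printer reset, nothing to emit
}

def decode_capture(raw: bytes) -> str:
    """Replace known escape codes with tags and drop unknown ones."""
    out = []
    i = 0
    while i < len(raw):
        byte = raw[i]
        if byte == ESC:
            if i + 1 < len(raw):
                # Unknown codes are treated as two bytes long and discarded.
                out.append(CODES.get(raw[i + 1], ""))
            i += 2
        else:
            out.append(chr(byte))
            i += 1
    return "".join(out)

if __name__ == "__main__":
    captured = b"\x1b@Total: \x1bE42\x1bF\r\n"
    print(decode_capture(captured))
```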

    OK, it's not ideal, but it may have the most certain "finish time" of all the options, if you can do it with a serial port.

    Sam

  61. Ask DEC? How? :-) by Richard+Steiner · · Score: 1

    You must mean Compaq^H^H^H^H^H^HHewlett Packard. :-)

    --
    Mainframe/UNIX Bit Twiddler and long time Windows/Linux Hobbyist.
    The Theorem Theorem: If If, Then Then.
  62. I have a lot of GeoWrite documents... by Richard+Steiner · · Score: 2, Funny

    ...with embedded images and such that were created by Geoworks Ensemble back in the early 1990's, and converting them to another format has proven to be a bit of a pain due to the lack of good export filters in GeoWrite or its successors, and also due to the fact that nobody else seems to be able to read GeoWrite files.

    Thankfully, I can still get the PC/GEOS environment to work on various PCs at home, but at some point that won't be an option.

    --
    Mainframe/UNIX Bit Twiddler and long time Windows/Linux Hobbyist.
    The Theorem Theorem: If If, Then Then.
  63. Shrink Wrapped Apple IIe by sysadmn · · Score: 2, Interesting

    I work in an aerospace division of a very large corporation. I was talking to a design engineer about the FAA's data retention requirements - he said in most cases, it's the life of the product, plus a little cushion. For us, that's about 40 years. In addition to preserving data, you have to be able to recreate the analysis - so if you used a VisiCalc spreadsheet to perform an analysis, you have to be able to do it again. (I think this is more an "in case we get sued" requirement than an FAA one). I was joking about 40 years being a long time when the coworker said, "Just be glad you don't work for the medical division. They have to keep their design data for the lifespan of the patient. For a neonatal ultrasound product, that's effectively one hundred years!"

    --
    Envy my 5 digit Slashdot User ID!
    1. Re:Shrink Wrapped Apple IIe by Anonymous Coward · · Score: 0

      As a former employee of an aerospace division of a very large corporation (Rhymes with Sunnywell), I can tell you there's no need to shrink wrap an Apple IIe -- they're still using a couple of them today to produce the navigational database updates they provide to commercial airline customers.

  64. Re:I happen to have a computer museum at my dispos by 91degrees · · Score: 2, Informative

    Looks like one of these

    Unusual things. Harder case than a 3.25" disk, and slightly rectangular. Only ever really used on Amstrad machines.

  65. RTF, the unagreed file format by Sits · · Score: 1

    It's a step up from ASCII in that you have formatting, but I have no idea how well it copes with characters outside basic ASCII. If the worst comes to the worst, though, you will always have the option to grovel some text out of the document by opening it up in a text editor.

    The main problem with RTF is that there are several mildly incompatible implementations to choose from and consequently formatting (along with other features) may not be preserved/read properly between differing programs.

    These days RTF is a de facto MS format rather than a good interchange format between differing products, but that's an aside from archiving (unless you change to a different platform).

  66. Re:I happen to have a computer museum at my dispos by pthisis · · Score: 1

    In the absence of any credible international initiative to create a reliable permanent archive format, I'd say print it to acid-free paper

    I still have source code (BASIC) I wrote in 1983 or so, along with old IBM Word Processor documents, etc.

    It's just not that hard to copy your old files to your new computer when you get one. I mean, CDs may go obsolete, but there will be an interim period when you have a CD-ROM drive and a new whizz-bang drive.

    Though it's been 12 years since I actually used removable media to do this, I just mirror directories across the network. The filesystems, storage hardware, and networking protocols have all changed in that period, but the files themselves just keep on coming along with me.

    The only important thing to remember is to view your data as a library: any time you change storage formats, you need to copy the WHOLE library over. Luckily that's pretty trivial if you have any sort of organization for your files.
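
    One way to check that the WHOLE library really made it across is to compare checksums of every file in the old and new locations. The following is a minimal sketch, not a backup tool: the function names and the choice of SHA-256 are assumptions made for the illustration, and it only reports files that are missing or differ in the copy.

```python
import hashlib
from pathlib import Path

def digest(path: Path) -> str:
    """SHA-256 of a file, read in chunks so large files fit in memory."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()

def missing_or_changed(src: Path, dst: Path) -> list[str]:
    """List files under src that are absent from dst or differ there."""
    problems = []
    for original in sorted(src.rglob("*")):
        if original.is_file():
            rel = original.relative_to(src)
            copy = dst / rel
            if not copy.is_file() or digest(copy) != digest(original):
                problems.append(rel.as_posix())
    return problems
```

    Calling missing_or_changed(Path("old_disk"), Path("new_disk")) after a migration should return an empty list; anything it does return needs to be copied again before the old media is retired.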

    --
    rage, rage against the dying of the light
  67. business opportunity $$$ by Dukhat · · Score: 1

    This seems like a perfect extension of the disk recovery business. All you need is about 20 old computers with a bunch of old software. You can copy all the old files onto a CD-R in pdf format. Be sure to put your logo on the CD-R, so you can get their repeat business in 20 years.

  68. Not quite a file format by renehollan · · Score: 1

    ... but I have a copy of my Masters' Thesis on a nine-track 1600 bpi mag tape, in a CDC 6/12 character set.

    --
    You could've hired me.
    1. Re:Not quite a file format by coffeefrog · · Score: 1

      I thought they used seven track drives...

    2. Re:Not quite a file format by renehollan · · Score: 1

      Both. The seven track tape drives were older. Made saving display code and parity easy, of course.

      --
      You could've hired me.
  69. proprietary formats & obsolete media by rfisher · · Score: 1

    I have a lot of data from a very early version of Microsoft Word for Mac.

    Now, to be fair, every time I've come across someone complaining about MS Office not opening old MS Office files, it was simply that they didn't realize you had to specifically tell the installer to include the legacy filters. Pop the install CD in & pretty soon you're reading your old files.

    The problem I have is that I no longer own (& have no desire to own) ANY version of MS Word.

    Likewise, I have a bunch of data in ClarisWorks format & for a long time I didn't have a Mac, much less ClarisWorks.

    I have a Mac again, but I don't know whether AppleWorks or NeoOffice could open my old Word & ClarisWorks documents. You see, I no longer have any way to read 800K Mac floppies.

    (OK, not entirely true. Most of the ClarisWorks stuff is probably on 1.44M floppies, so I *could* try to get AppleWorks or NeoOffice to open them. I just haven't had the need to try yet.)

  70. No copy to University Microfilms? by Mark+of+THE+CITY · · Score: 1

    When I did a master's thesis and doctoral dissertation, I had to turn in two copies, one of which went off to University Microfilms in Ann Arbor, MI, for copying onto microfilm and storage wherever they keep it. Doesn't Britain (I'm guessing from his using an Amstrad that that's where he's from) have something similar?

    --
    The clearance system sounds logical. It is not. It is completely arbitrary. -- John Bolton
  71. Re:I happen to have a computer museum at my dispos by jpostel · · Score: 1

    I have to admit I have never seen one of those.

    I learned on Apples, the IBM PC AT & XT, and some C64 for spice. I did not get to the really juicy stuff until the mid 1990s when I worked for a pharma consulting company. We used to get mag-reels from the pharma companies to pull our data from. They were still using 10-15 year old tech and everyone that did business with them had to use it too.

    --
    Ummm, Jon, aren't you supposed to be dead...? - Otter(3800)
  72. Re:I happen to have a computer museum at my dispos by Homology · · Score: 1
    In the absence of any credible international initiative to create a reliable permanent archive format, I'd say print it to acid-free paper, multiple copies in separate places, and hope for the best, like Cassiodorus.

    Actually, very important papers that are intended to last for a very long time are printed on specially made paper. An example of this is a treaty between two states.

  73. Re:I happen to have a computer museum at my dispos by Homology · · Score: 1
    Does rot occur even in archival quality media stored vertically, at the proper temperature, in low humidity?

    Yes, but the media last longer with proper storage. What you need to do is to regularly copy the old media to new media, say every year or every few years. Do remember that machines to read the old media may not be available in the future. This is done by the Norwegian agency responsible for archiving, by the way.

  74. RapidFile, Final Writer by jgrahn · · Score: 1
    I still have some documents stored in Final Writer format, Final Writer being an Amiga word processor from the early to mid 1990s. Final Writer is a fitting name, because nothing and no one else will ever write to those documents again. Or read them.

    Then there were the records of the regional Ornithologist's Club, basically bird sightings from the early 1990s. My brother requested them for research purposes, but it turned out they were stored in RapidFile (an ancient personal database) and the RapidFile installation itself was long gone.

    With a hex editor and some Python scripting, I managed to retrieve most (but probably not all) of the information. If the format hadn't been so simple internally, it would have been gone forever. Even if there is a theoretical chance that someone has the original hardware and software (or a DOS emulator), no one will bother unless the data is really vital.
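
    To give an idea of what that kind of scripting looks like, here is a sketch for a made-up file layout, not the real RapidFile format: a fixed-size header followed by fixed-width, space-padded text records. The field widths, the header size, the cp437 (DOS) text encoding and the name read_records are all assumptions; in practice you read them off the hex dump.

```python
import struct

# Hypothetical record layout: species (20 bytes), place (10), date (8).
RECORD = struct.Struct("<20s10s8s")

def read_records(blob: bytes, header_size: int = 0) -> list[tuple[str, ...]]:
    """Slice the data after the header into fixed-width records."""
    rows = []
    last_start = len(blob) - RECORD.size
    for offset in range(header_size, last_start + 1, RECORD.size):
        fields = RECORD.unpack_from(blob, offset)
        # Decode as DOS text and strip the padding used to fill each field.
        rows.append(tuple(
            field.decode("cp437", errors="replace").rstrip(" \x00")
            for field in fields
        ))
    return rows

if __name__ == "__main__":
    sample = (b"HDR!" + b"Pica pica".ljust(20)
              + b"Lund".ljust(10) + b"19920514")
    print(read_records(sample, header_size=4))
```

    A trailing partial record is silently ignored here, which is one way "most but probably not all" of the data ends up recovered.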

  75. Re:I happen to have a computer museum at my dispos by Tim+C · · Score: 1

    Only ever really used on Amstrad machines.

    And the Sinclair ZX Spectrum +3, the one with the built-in 3" disk drive. Although actually, I guess it may have been Amstrad making them by the time that model was released.

  76. Bootable? by tepples · · Score: 1

    Yes, they're still buying CDs, but they're all being ripped to mp3.

    So if people keep buying CDs after they buy a new computer, how will they keep ripping them to Fraunhofer format after CDDA support is no longer standard? Or do you assume that people will start buying music exclusively in .drm format? As long as demand for CDDA discs continues, demand for CDDA compatibility as a standard feature will continue, and this usually comes with CD-ROM compatibility at no extra cost. In addition, one major difference between dropping CD and dropping floppy is that CD-ROM has the same form factor and can fit in the same drive mechanism as DVD-ROM, HD-DVD-ROM, and BD-ROM, unlike floppy that needs a separate drive.

    I've not had a floppy drive in my home PC in about 3 years, and I've not missed it.

    What about when you have had to make a boot disc for another PC in your household or in a relative's household, manufactured before booting from a USB stick became a standard feature? The CD writing software shipped with my CD-ROM drive (Roxio 4) could make a boot disc conformant to the "El Torito" standard only if a floppy drive was present on the computer and if a bootable floppy disk was present in the drive. Has this changed on newer versions of Roxio and Nero bundled with CD and DVD burners? And given that DVD uses UDF and not ISO 9660, is there even a widely used standard for DVD-ROM/-R/+R discs that are bootable on x86 PCs (not counting Xbox)?

  77. Solution by Anonymous Coward · · Score: 0

    Obviousman says:

    Well, you could get a used Win95 or Win98 box and copy your data to a hard disk....

  78. .wks? by yuri+benjamin · · Score: 1

    Here in New Zealand we are required by Inland Revenue to keep financial records for 7 years.

    I have a few 6~7 year old .wks spreadsheets created by MS-Works for DOS for a small business run by my father.
    During the 1990s we moved from MS-DOS to OS/2 to Linux, and all the while we kept using MS-Works for DOS (dosemu on Linux).
    We also have some 4~6 year old Star Office files (.sdc format) and most recently we are using OpenOffice (.sxc format).
    I suppose I could still install dosemu and MS-Works (I have the original disks lying around somewhere - if they're still readable).
    OpenOffice still reads the older StarOffice formats.

    But if my MS-Works disks become unreadable then my .wks files will be useless - unless I can recover them with the strings command.

    --
    You make the mistake of thinking you can educate the fundamental stupidity out of people. You can't.
  79. Sure, I've got an example of an obsolete format. by lw54 · · Score: 1

    .exe

  80. Re:I happen to have a computer museum at my dispos by vrai · · Score: 1
    The +2, +2a and +3 were all Amstrad machines. The +3 was a hideous cross between a Speccy and a CPC6128 (though it did have a proper serial port). The last true Spectrum was the 128K+ which was released shortly before the C5 debacle forced the sale of Sinclair Research's computer division.

    ... and yes, to this day I still refuse to buy anything with an Amstrad logo on it.

  81. A slightly off-topic example. . . by munpfazy · · Score: 1

    . . .is the Real Audio format.

    If you're not in the business of distributing content, then your boss may not be impressed. But as an example of the dangers of closed source lock-in, it's hard to beat Real audio.

    For a decade they've sold content distributors expensive encoding packages. They provided the only existing client to customers, and they made a big show of claiming that a closed, proprietary format would make it harder for people to archive programs.

    Content producers bought up their products en masse and encoded millions of hours of audio in their format.

    Then, without warning, Real decided to make all of their new players incompatible with their old codecs. All the time and money that went into digitizing content is now down the drain. Those who chose the Real Audio format to distribute their library, and believed they were signing on for a one-time fee, now find themselves with media that none of their customers can access. Their choice is between paying a huge maintenance fee to Real to re-encode their audio every time the folks at Real decide they want some extra revenue, or spending the time and expense to switch to a brand new format and re-encode everything from source files.

    Yes, I know that it's possible to install older versions of the Real Audio player and convince them not to interfere with each other. And yes, I know that there are now some alternatives for playing Real content, such as mplayer. But neither helps the average computer user or the companies that are trying to communicate with them.

    A few minutes trying to play random samples from albums at amazon.com with the latest real player will convince you that a lot of companies got burned trusting Real.

  82. .Mac Backup locked up my data. by Anonymous Coward · · Score: 0

    I almost lost all my data upgrading to Mac OS X Tiger.

    Using Apple OS X and the Backup Program I had from .Mac, I backed up all my data files.

    Just to be on the safe side, I used an external hard drive to make a second direct copy of all the data files.

    After upgrading to OS X Tiger - The old .Mac Backup program would not let me restore the data files.

    When I contacted Apple, they insisted I subscribe again to .Mac to get a working copy of Backup that would run under OS X Tiger.
    Basically holding my data hostage until I paid up another $100.00

    Well, I did not pay Apple another $100 for my own data that I backed-up with software I had already purchased.

    I used the external hard drive copies of the data and everything was restored.

    I couldn't see buying .Mac again, only to get pinched for another $100 down the line just to restore data.