Slashdot Mirror


Vendor Neutral File Formats?

timmyv asks: "I have recently been tasked with developing a corporate wide policy that will standardize all employee created documents on vendor neutral file formats. OASIS is good in theory, but I haven't been able to locate enough concrete examples of policies or implementation schemes that work at a corporate level. Does anyone work at a company where documents can only be saved as RTF, HTML, etc. or have any experience with this type of problem?"

83 comments

  1. RTF by Uber+Banker · · Score: 2, Informative

    Isn't vendor neutral.

    1. Re:RTF by Anonymous Coward · · Score: 0

      NeXT used it as their standard file format -- it's why TextEdit.app on MacOS X can do rtf. And NeXT wasn't exactly buddy-buddy with Microsoft. Do you know what the hell you're talking about?

    2. Re:RTF by Anonymous Coward · · Score: 0

      The fact that a file format is used by a few vendors doesn't make it vendor-neutral.

  2. I work in a nice company by ilithiiri · · Score: 3, Insightful

    and we, unfortunately, use _all_ the formats known to the world.

    I've already tried to encourage the adoption of hassle-free formats (rtf, html, TXT, whatever).. they don't pass.

    It seems that people simply can't get it.
    Unfortunately.

    --
    If anyone can hear me, slap some sense into me But you turn your head, and I end up talking to myself
  3. OpenOffice by saden1 · · Score: 2, Informative

    The OpenOffice file format is a good start. The format is an open standard. As governments around the world embrace it, companies will ultimately flock to the format.

    --

    -----
    One is born into aristocracy, but mediocrity can only be achieved through hard work.
    1. Re:OpenOffice by spud603 · · Score: 2, Interesting

      Although Microsoft may have successfully killed OOo's format acceptance in the US by "opening" their Office file formats. With the new XML-based Word docs, Microsoft may have defined the new standard for text formats in the US. At least it's better than that gobbledy-binary mess they had before..

    2. Re:OpenOffice by fm6 · · Score: 3, Insightful

      Well, that's not exactly "vendor neutral", since only one vendor supports it. Of course, that one vendor is an open-source project, and the format is well-documented XML. So if you want to break out of the Microsoft orbit, it's the obvious first choice.

    3. Re:OpenOffice by Anonymous Coward · · Score: 1, Informative

      Well, actually. The OASIS OpenDocument format will be supported by OpenOffice.org, KOffice, and apparently IBM. It is an OASIS standard and by next year it will be an ISO standard. And the EU is thinking of making it the standard format for pan-European government data. Finally, the specification is not controlled by OOo, but by OASIS, a non-profit standards group, and under the blessing of ISO.

      So... the format is pretty much vendor neutral.

      Cheers,
      Daniel Carrera.
      OpenOffice.org volunteer.

  4. Derrida’s Ghost by Anonymous Coward · · Score: 0, Funny

    Any postmodernist worth his or her salt would tell you there's no such thing as a vendor-neutral file format.

  5. Wrong question for the task. by Rahga · · Score: 2, Interesting

    "I have recently been tasked with developing a corporate wide policy that will standardize all employee created documents on vendor neutral file formats."

    Sorry, but looking at that statement, it seems to me that you are asking the wrong questions. Rather than getting concerned about formats and standards organizations, you should realize that to replace certain formats you will need to improve on open source projects without funding for the development of them. If they say "no" to this, then congratulations, you don't actually have to do this research. Nothing's quite as useless as an unfunded mandate.

    Sadly, I'm not sure if this post is meant to be funny.

  6. Not sure what the question is limited to by GoofyBoy · · Score: 4, Insightful


    There could be a huge number of different files you need. CAD files, images, Powerpoint presentations, complex spreadsheets will all mess up any format you can come up with (eg HTML). How would you even edit some of these things?

    Even OpenOffice formats are not vendor neutral, you have only one product out there that really uses it.

    --
    The surprise isn't how often we make bad choices; the surprise is how seldom they defeat us.
    1. Re:Not sure what the question is limited to by TheRealJFM · · Score: 2, Informative
      Well, KOffice may be adopting this format (if it hasn't already), and StarOffice also uses it (I would consider SO a separate project now, especially at version 2 of OO.o).

      also don't forget that it may be made an ISO standard.

      --
      Joseph Farthing
      http://josephfarthing.com
    2. Re:Not sure what the question is limited to by Mr.Ned · · Score: 1

      Abiword does as well.

  7. PDF by AkaXakA · · Score: 3, Interesting

    It might sound like Adobe lock-in,
    but with PDF printers (files are printed to PDFs) for Linux and Windows (I assume Mac has it built in), it's a good option for creating documents that'll be displayed everywhere in the same manner.

    1. Re:PDF by topham · · Score: 2, Informative

      Any standard application can print to PDF on a Mac. (running OS X). PDF is inherent to printing. (Very cool, means every program can use the built in viewer for print-preview and what the print-preview shows is what actually prints... unlike certain Microsoft applications under windows)

      The only issue with PDF is the tendency to be one-way. But there are programs out there designed to convert PDF documents to other formats.

    2. Re:PDF by Zzootnik · · Score: 2, Informative

      Yep-For a large part, it Is a lock-in.
      My company is standardized (at least for production work) on PDF format, which everything can make. The problem is getting things back out or editing such documents...
      It seems that the only truly accurate interpreter is Adobe's Acrobat software, but it 'just works' for the final output. Converting it to anything else usable doesn't seem to work very well or reliably.
      Editing these things is a bit of a pain, but it can be done, and we do for a chunk of the production----> but this is definitely beyond the capabilities of any of the PHB's or the multitudes of Customer reps/etc, so the 90% paperwork and other miscellaneous office/corporate documentation never sees PDF format. A lot of that gets done in MS Word/Excel. Oh well... I find myself idly wondering/dreading what the next version of Acrobat is introducing...

      --
      Sig currently under construction. Mind the gap....
    3. Re:PDF by samael · · Score: 1

      And how do you edit them? PDF editing is a complete nightmare...

    4. Re:PDF by __aafkqj3628 · · Score: 2, Informative

      The only issue with PDF is the tendency to be one-way. But there are programs out there designed to convert PDF documents to other formats.

      There's also -
      pdf2txt@adobe.com
      pdf2html@adobe.com

  8. Hmmm. by Pig+Hogger · · Score: 1

    XML maybe????

    1. Re:Hmmm. by fm6 · · Score: 2, Insightful

      XML isn't a format. It's a language for creating formats. Saying "we'll use XML" is like saying "we'll use an SQL database". It's a step, but only a small one. The big decisions remain.

    2. Re:Hmmm. by pauljlucas · · Score: 3, Insightful
      XML maybe?
      XML without a schema (and applications that can understand it) is useless. One needs something like DocBook.
      --
      If you reply, do so only to what I explicitly wrote. If I didn't write it, don't assume or infer it.
    3. Re:Hmmm. by TykeClone · · Score: 1
      "we'll use an SQL database"

      Make it mauve - I hear it's faster.

      --
      A fine is a tax you pay for doing wrong and a tax is a fine you pay for doing all right.
    4. Re:Hmmm. by fm6 · · Score: 1

      Mauve is so 90s!

    5. Re:Hmmm. by A.Chwunbee · · Score: 1

      No, if I am remembering well the jolly fine Dilbert, it is having the most RAM.

      --
      select * from base where originalOwner = 'you' and currentOwner != 'us'.
      0 rows returned.
    6. Re:Hmmm. by Phixxation · · Score: 2

      XML without a schema (and applications that can understand it) is useless. One needs something like DocBook

      I work at a company that regularly consumes vendor data - we're plagued by a certain unnamed corporate entity's lack of technical knowledge and insistence upon using XML. I don't understand what it is about that format that draws additional users, but it drives me fsckin nuts.

      --
      "In a world without walls or fences, who needs Windows or Gates?"
    7. Re:Hmmm. by pauljlucas · · Score: 1
      I don't understand what it is about that format that draws additional users, but it drives me fsckin nuts.
      The advantages of XML are that (1) it's plain text and therefore easy to read, and (2) it's easy to parse because all XML is the same, hence you need only one parser. Even if you don't have a schema, you can simply look at the data and pretty much figure it out. Try that with some weird binary format.

      But I do agree that there's too much hype around XML.
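
      A minimal sketch of the "one parser, then look at the data" point, using Python's standard-library XML parser. The sample feed and its element names are made up for illustration; nothing here comes from a real vendor format.

```python
import xml.etree.ElementTree as ET

# Hypothetical vendor feed; element and attribute names are invented.
SAMPLE = """<orders>
  <order id="1001"><customer>Acme</customer><total currency="USD">19.95</total></order>
  <order id="1002"><customer>Globex</customer><total currency="EUR">7.50</total></order>
</orders>"""

def outline(element, depth=0):
    """Return indented lines showing the tag, attributes, and text of each node."""
    text = (element.text or "").strip()
    lines = ["  " * depth + f"{element.tag} {element.attrib} {text!r}"]
    for child in element:
        lines.extend(outline(child, depth + 1))
    return lines

# The same generic parser works without any schema for this feed.
root = ET.fromstring(SAMPLE)
structure = outline(root)
print("\n".join(structure))
```

      The outline only shows what the data looks like. What the fields actually mean still has to come from a schema or from the vendor.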

      --
      If you reply, do so only to what I explicitly wrote. If I didn't write it, don't assume or infer it.
  9. PDF and the Things That Turn Into It by Anonymous Coward · · Score: 3, Interesting

    What you need is a toolchain that allows conversion back and forth between several different types. For example, I could write a short paper in XML, SGML, or LaTeX, and convert any of the three to PDF. I could convert the XML or SGML versions to LaTeX, then use latex2html to turn it into an HTML document. I don't know of converters that turn XML,SGML->HTML, but they probably exist.

    The point is that it doesn't matter which method I used to create the document; I can convert any of them into either of the other formats without losing information, and any of the three can be turned into HTML or PDF for display purposes.

    You've probably got several different types of documents to mess with. Technical papers with plots, accounting spreadsheets, secretary generated memos, and presentations with pretty pictures so that management can understand what's going on. LaTeX alone could handle all of these situations. Create document types and environments to match the needs of each type of document. XML, being completely generic, could also handle any of the situations, but it's easier to type LaTeX markup than it is XML. There is at least one caveat: you have to be careful what type of images you feed TeX.

    Heck, you could use Perl bindings to MS-Excel to snag data out of spreadsheets and export it into a format that some other chart making tool uses. You could use Excel itself to export as CSV files, which you could then use awk to convert into some other format.

    Basically, it doesn't matter what tool each person uses, as long as what they export off their own workstation is in a standard format.
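
    A rough sketch of the CSV step described above, in Python rather than awk. It assumes the spreadsheet has already been exported to CSV; the column names and the choice of JSON as the target format are placeholders, not anything the chart tool is known to require.

```python
import csv
import io
import json

# Stand-in for a CSV file exported from a spreadsheet (hypothetical columns).
CSV_TEXT = "quarter,revenue\nQ1,1200\nQ2,1350\n"

def csv_to_records(text):
    """Parse CSV text into a list of dicts keyed by the header row."""
    reader = csv.DictReader(io.StringIO(text))
    return [dict(row) for row in reader]

records = csv_to_records(CSV_TEXT)

# Re-emit the same data in another plain-text format.
as_json = json.dumps(records, indent=2)
print(as_json)
```

    Every value comes out as a string, so any numeric conversion is left to whatever consumes the output.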

    1. Re:PDF and the Things That Turn Into It by GoofyBoy · · Score: 2, Insightful

      Umm... you are moving from a vendor-specific system to an in-house expertise-specific system.

      --
      The surprise isn't how often we make bad choices; the surprise is how seldom they defeat us.
    2. Re:PDF and the Things That Turn Into It by zedkineece · · Score: 2, Interesting

      I agree with Anonymous Coward. Why not use XML as your standard format? You could use Word 2003 (or even the entire Office 2003 suite) or XMLSpy to author your documents, but store everything in XML. You could then write XSLTs (or obtain consulting to develop them, like we did) to convert the XML to whatever format you or your vendors require. One source format to virtually any format you need. It is also somewhat painless to have another XSLT developed when a future format is required, which eliminates the need to do wholesale changes in the future. I highly recommend the company we used, "Docsoft" (http://www.docsoft.com I think). They have some smart guys with probably the best support of any vendor we have dealt with. Since implementation, we have discovered a lot of additional "pluses" that we didn't consider, such as using the XML as a DMS (Docsoft has a search tool that indexes XML in relation to how data is tagged, which has turned out to be invaluable to us). We can even store images with XML metadata to find out what the subject and author are. Finding extremely specific data sometimes took us 40 hours; now it takes us 15 minutes or less. All because of XML. Just my 2 cents worth.

    3. Re:PDF and the Things That Turn Into It by Anonymous Coward · · Score: 0

      from a vendor-specific system to in-house expertise-specific system.

      Sounds good to me, but then again, I'm an engineer. A manager would probably rather be able to blame someone outside of the company.

      From a technology standpoint, there aren't any mysteries involved in this. Someone who understands exactly what was needed out of each department in their company, and moderately understood the technologies involved could probably draw up rudimentary LaTeX environments in the same amount of time this fellow has been given to investigate possible policies.

      Also, once initial scripts/templates have been drawn up, maintenance is an absolute breeze. Everything has been stored in a format where presentation is separate from content. Altering the look of documents later on is nothing.

    4. Re:PDF and the Things That Turn Into It by GoofyBoy · · Score: 1

      >From a technology standpoint, there aren't any mysteries involved in this. Someone who understands exactly what was needed out of each department in their company, and moderately understood the technologies involved could probably draw up rudimentary LaTeX environments ...

      And this isn't a mystery?

      --
      The surprise isn't how often we make bad choices; the surprise is how seldom they defeat us.
    5. Re:PDF and the Things That Turn Into It by Anonymous Coward · · Score: 1, Insightful

      And this isn't a mystery?

      No. It's a matter of researching documentation.

    6. Re:PDF and the Things That Turn Into It by tepples · · Score: 2, Informative

      I don't know of converters that turn XML,SGML->HTML, but they probably exist.

      The tool to convert from domain-specific XML to XHTML is called XSLT. For more info, Ask Google.

    7. Re:PDF and the Things That Turn Into It by iammaxus · · Score: 1

      mod parent down. "XML" and "SGML" are not file formats. They are formats for formats. "I don't know of converters that turn XML,SGML->HTML, but they probably exist." is horribly ridiculous because HTML is an SGML file format. XML-based formats are useful because of XSLT, as other posters mentioned. You can create an XML format and then automatically convert it into any other XML format (XHTML for one) or even to non-XML formats.
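
      To make the "convert one XML format into another" idea concrete, here is a small sketch. It is not XSLT: Python's standard library has no XSLT processor, so this uses plain ElementTree code as a stand-in for what a stylesheet would do. The `memo`, `title`, and `para` element names are invented, and the output omits the XHTML namespace and doctype for brevity.

```python
import xml.etree.ElementTree as ET

# Hypothetical domain-specific document; the tag names are made up.
MEMO = (
    "<memo><title>Format policy</title>"
    "<para>Use open formats.</para><para>Archive as PDF.</para></memo>"
)

def memo_to_xhtml(xml_text):
    """Map memo/title/para elements onto html/body/h1/p elements."""
    memo = ET.fromstring(xml_text)
    html = ET.Element("html")
    body = ET.SubElement(html, "body")
    ET.SubElement(body, "h1").text = memo.findtext("title", default="")
    for para in memo.findall("para"):
        ET.SubElement(body, "p").text = para.text
    return ET.tostring(html, encoding="unicode")

xhtml = memo_to_xhtml(MEMO)
print(xhtml)
```

      A real XSLT stylesheet would express the same mapping declaratively, which is what makes it easy to add another output format later.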

  10. Vendor neutral is not always the answer.... by Alpha27 · · Score: 3, Insightful

    Switching applications for people can be a task no one wants to undertake, for two main reasons.

    Comfort level:
    It's like having designers switch from Photoshop to The GIMP, or MS Word to OO Writer. Granted, the apps accomplish the same thing, but it's not the *same* program. People will resist the change because they know how to use the first program, and the reason for the change isn't a concern for them.

    Dominance:
    Going vendor neutral when the majority still use vendor-specific formats requires you to check whether your users rely on vendor-specific features that are not available in the neutral format. If those features aren't there, then what do you do? Write code to compensate for the feature, or get plugins, or do nothing if there's nothing you can do. Are there tools that can do as good a job as the old tools in this neutral environment?

    It would help more if you stated your case in more detail.

    1. Re:Vendor neutral is not always the answer.... by Anonymous Coward · · Score: 0

      the gimp isn't even close to paint shop pro's level yet, comparing it to photoshop makes artists/designers/whatever weep.

    2. Re:Vendor neutral is not always the answer.... by tepples · · Score: 1

      the gimp isn't even close to paint shop pro's level yet

      In what way, specifically?

    3. Re:Vendor neutral is not always the answer.... by amokk · · Score: 1

      I don't really like to bring it down to this level, but your assertion that Photoshop and Gimp accomplish the same thing is entirely wrong.

      It is not possible to move from Photoshop to Gimp in many, incredibly common, situations. Assuming one would even want to.

      --
      I think, therefore I am an Atheist.
  11. Vendor Neutral? by rueger · · Score: 1

    That seems like kind of an unclear idea. How many vendors do you have, and do they all use the same software in the same fashion?

    Unless you have pretty carefully surveyed all of those people you really can't choose one file format over another.

    In other words, you're asking the wrong question. Instead of trying to figure out what your employees can standardize on, you will first need to find out what the majority of your vendors have standardized on.

    Of course you'll have problems. HTML or PDF are horrible if you're circulating documents that need to be edited or excerpted. And vendors and suppliers will still send you documents in whatever their house file format is.

    Really, for this to be effective you need to involve your employees, management, vendors, and probably suppliers in order to get everyone working within the same set of file formats.

  12. Right motivation, wrong question... by moreati · · Score: 2, Insightful

    Avoiding vendor lock-in is of course A Good Thing. However, as others have said, there is no format completely vendor neutral - each platform has its own set of unique features that don't translate directly and must be stored somewhere in an extension or custom tag. I'm certain the OASIS/OOo format has a few StarOfficeisms in it.

    What matters is that the data you own is readily transformable into a Fully Open and documented format independent of your chosen platform; normally (but not necessarily) this will mean your native format is Fully Open and documented. This includes all data, styling, formatting, metadata and interrelationships. Basically you should be able to quickly jump ship, even if your vendor has been wiped off the earth or there are legal/technical issues preventing you from running the original platform, without loss or 'damage' of any information. There must be at least one other clear route to all your information, completely bypassing the original platform.

    As an example .doc would be unsuitable since the format is undocumented and you would be reliant on the correct version of office to correctly and completely read/export it, hence you would depend on Microsoft.

    Similarly, prior to its release as open source software, and even immediately after, .sxw would have been unsuitable (even though it was 'just zipped xml'), since OOo/StarOffice were the only way of performing any completely trustworthy export. Now that the format is formally documented and independent tools exist, it is suitable.

    There are grey areas such as databases, which have no common datafile format but do expose Fully Open interfaces such as ODBC or JDBC.

    With this in mind I would argue that forcing everyone to save documents in 'basic' formats such as HTML and RTF is counterproductive; they lack wide support for features such as styling and precise page layout. Any format will do as long as you can readily, fully & demonstrably extract all your information, independently of the platform that created it.

    Alex

  13. postscript/PDF and XML? by SHEENmaster · · Score: 3, Interesting

    XCircuit, a circuit layout app for X, uses postscript as its default format. If you have XCircuit, you can load the postscript file into it and edit it like any other circuit. If not, you can still print it or view it as you would any other postscript file.

    XML is a good start, because it's easy for a new app (the fictional YCircuit) to add support for the format, but you are still stuck unable to print it if you don't have the skills to write a conversion script and no one else has written it for you.

    Why not combine the two? XML embedded in a standard PDF file would allow any application with support for the creator's XML tagset to import the file, and at the very least those without any similar application could view and print the file.

    --
    You can't judge a book by the way it wears its hair.
  14. "Vendor Neutral"???!!! by fm6 · · Score: 4, Insightful
    You think RTF is "vendor neutral"? It's simply a 7-bit-safe version of Word's native format. There are lots of third-party tools that read and write RTF, but the same is true of Word native. Either way, you run up against all the formatting issues you always get when you're importing and exporting unstructured formats.

    HTML is only vendor neutral if you don't use any vendor-specific extensions. So you can't just say, "Everybody save your files as HTML". You also have to forbid anybody using apps (such as Word) that save to a non-standard HTML.

    In theory, you can create an XML-based format that looks the same in Word, OpenOffice, FrameMaker, and any other XML-aware app. But doing so means designing a schema in extreme nit-picking detail, and writing a lot of transformations to get that XML in and out of all the apps that need to read or write it. It's a lot of work, and nobody does it unless they have a specific application that requires highly-structured information. Like if you have a huge set of technical documentation that you need to update a lot. (I was involved in just such a project -- and the politics of converting all those documents to XML cost me my job.) Or if you have invoices or similar business documents that need to go into or out of a web services app.

    But for the big mass of unstructured documents, there just isn't a vendor-neutral solution, and nobody has any real incentive to create one. The solution remains the same: standardize on certain specific applications. Which boils down to using OpenOffice if you hate giving money to Bill and/or want a platform-neutral solution. Otherwise you standardize on Microsoft Office, because it's what everybody knows how to use.

    1. Re:"Vendor Neutral"???!!! by Anonymous Coward · · Score: 2, Informative
      You think RTF is "vendor neutral"? It's simply a 7-bit-safe version of Word's native format.
      That it is not.

      RTF does contain, in theory, sufficient control words to describe everything that Word 2000 can do, but it's hardly a direct translation and things get lost a lot. Furthermore, RTF contains a few control words that Microsoft didn't put there: such as \collapsed (added by NeXT to describe paragraphs that had been hidden by the user).

      There are lots of third-party tools that read and write RTF, but the same is true of Word native. Either way, you run up against all the formatting issues you always get when you're importing and exporting unstructured formats.
      There is a huge difference. RTF is a formally published, open specification and Microsoft openly encourages third-party implementations. It's been stable for 5 years now. Word .doc files are a closed spec that Microsoft jealously guards and changes often.
    2. Re:"Vendor Neutral"???!!! by fm6 · · Score: 1
      Technically, I suppose you're right. But Microsoft's past attempts to promote RTF as an open format have little practical meaning nowadays. I mean, if an unsuccessful platform is your best example of non-Microsoft development of RTF-based software, it doesn't say much for it as an industry standard. A "standard" technology that only one company fully implements is, for all practical purposes, proprietary.

      And although it's easier to find documentation for RTF than for Word native, the latter does exist. You just have to have the right developer's license to see it. I don't know whether products like OpenOffice, AbiWord, and WordPerfect use that documentation, or whether they just reverse-engineer the files. But however they go about it, they don't do any worse a job reading Word native format than reading RTF.

      So, yeah, characterizing RTF as "7-bit native" is a slight oversimplification. But not one that really matters to anybody trying to find a neutral format.

    3. Re:"Vendor Neutral"???!!! by Anonymous Coward · · Score: 0
      I mean, if an unsuccessful platform is your best example of non-Microsoft development of RTF-based software, it doesn't say much for it as an industry standard.
      You do realize who bought NeXT, right? RTF is the primary format for Apple's Cocoa text objects. TextEdit.app is all RTF. Apple's code documentation is all in RTF. RTF is the chief text transfer format for Apple's cut/copy/paste facility.
  15. Easy by Anonymous Coward · · Score: 1, Funny

    Store everything in giant PNGs.

  16. Re:OOo file format is open though by pbhj · · Score: 1

    The OpenOffice.org format may not be particularly vendor neutral (though like others said, KOffice at least uses it) but it is an open and prevalent format. MS .doc is prevalent, but as it's not open it's not necessarily going to have filters available for it in the future. I think OOo is safer in this respect. Also, the OOo format is (compressed) XML, so it can probably be parsed by XML readers (? - I haven't got a clue, really!!).
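
    A sketch of what "compressed XML" means in practice: unzip the file, then hand the XML member to an ordinary parser. OOo documents do keep their body text in a zip member named content.xml, but the XML below is a simplified stand-in and does not follow the actual OOo schema; the archive is built in memory so the example needs no real document.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

def make_fake_document():
    """Build an in-memory zip with one content.xml member (simplified, not the real schema)."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        archive.writestr(
            "content.xml",
            "<document><p>Hello</p><p>World</p></document>",
        )
    return buffer.getvalue()

def extract_paragraphs(data):
    """Open the zip container and read paragraph text with a generic XML parser."""
    with zipfile.ZipFile(io.BytesIO(data)) as archive:
        root = ET.fromstring(archive.read("content.xml"))
    return [p.text for p in root.iter("p")]

paragraphs = extract_paragraphs(make_fake_document())
print(paragraphs)
```

    A real document uses namespaced elements, so the tag lookup would need the namespace URIs from the published format specification.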

  17. You're asking the wrong questions by abb3w · · Score: 4, Insightful
    The first question is not what, or how; the first question is WHY. As in, why do this? And therefore, is there a better way to achieve this goal?

    Are they doing this to save money? to clamp down on the uppity workers? because the CEO got emailed an AppleWorks attachment with no file extension from some Mac user? to avoid the risks of single vendor lock-in?

    Many document formats can be converted back and forth with some degree of effectiveness. Yes, if you open a document from WordPerfect in Microsoft Office, the word spacing may change a little. However, this happens if you move from a machine connected to an HP4000 printer to one with an HP2100 printer as well. However, some formats give different feature capabilities; saving from DOC to RTF will cause (as an example) tables to shift about a bit. TXT format is readable by most anything, but the formatting capabilities are nigh nonexistent. (Ooh! Tabs!) While WordPerfect and Word will each open the other's documents, they aren't so good at saving in open formats.

    What formats are currently used? Why are they needed? Will everyone need to be able to write to them, or are pay-writer/free-reader combos acceptable? And, *ARE* there any "vendor neutral" formats out there? (For desktop publishing, the real answer is "no". Publisher is a joke, and while Adobe and Quark maintain some import compatibilities, the formats AREN'T neutral.)

    For myself, working in a small department, "Let a thousand flowers bloom" is just fine. I accept that I will occasionally get forwarded an e-mail with an attachment that the user can't figure out how to open -- usually Mac/PC file extension name issues solved easily by renaming. Once in a blue moon I have to explain to someone that no, not everyone has FooBarBaz market research organizer, since for most the $800 license cost for it would be more beneficially used for other things, and they will probably need to examine such data files once in their career, if that.

    Perhaps a list of universally accepted formats-- that is, formats that must be used for wide distribution-- would be more appropriate, after considering what features are needed in said formats. After all, Photoshop .PSD documents are harder to view outside Photoshop, but far more useful for subtle graphics work than JPEGs.

    I suspect you are being sent out on a project inadequately considered. Depending on the pointy-hairyness of the person who assigned it to you, you may find some substantial benefit to reconsidering the ground assumptions.

    --
    //Information does not want to be free; it wants to breed.
    1. Re:You're asking the wrong questions by RevDobbs · · Score: 1
      Once in a blue moon I have to explain to someone that no, not everyone has FooBarBaz market research organizer, since for most the $800 license cost for it would be more beneficially used for other things, and they will probably need to examine such data files once in their career, if that.

      I know it's illegal, but there was a torrent for the latest FooBarBaz on SuprNova just before it got shot down... you may be able to still find it out there.

    2. Re:You're asking the wrong questions by timmyv · · Score: 2, Informative
      I guess I neglected to mention that the "corporation" I work for is a state government. Therefore Open Standards are essential to allow for:
      The types of files we are talking about are essentially textual documents, spreadsheets, databases, etc. 2 of the 3 OOo provides, but I have a pretty good idea of how our user base would respond if we upped and replaced all their MS Office installations with Open Office, or for that matter how our DBAs would respond if we moved entirely to MySQL or MaxDB without a strong policy or incentives.
    3. Re:You're asking the wrong questions by Anonymous Coward · · Score: 0
      I know it's illegal, but there was a torrent for the latest FooBarBaz on SuprNova just before it got shot down... you may be able to still find it out there.

      Not worth the bother, it's a bad crack with an invalid key, and installs 180Solutions, Bullseye, and a rootkit to boot. :P

    4. Re:You're asking the wrong questions by abb3w · · Score: 2, Informative
      Ah, that's a somewhat more clear problem.

      For free access to documents by citizens, PDF is pretty good. There are viewers for most platforms (I don't know about BSD or Solaris, but Mac/PC/Linux all are OK); and there are non-Acrobat print-to-PDF knockoffs at economical prices. Requiring PDF publication of all publicly available printed documents in, say, PDFv1.2, PDFv1.3 or PDFv1.4 would be a useful and not overly onerous step. (Adding forms-completion ability to the PDF requirement might well be too much.) The PDF standards are public, although copyrighted.
      M$ Office has free viewers for older versions on Windows, but the Mac version isn't native on the current Apple OS, and OpenOffice is the only viewer I know for .DOC under Linux. =)

      As far as permanence of data, nothing beats the long term unkillability of a bare TXT file; it also allows improved handicapped accessibility to the data in the process. For databases (w/o queries) and single-page spreadsheets, CSV comma-separated text format is similarly hard to destroy. Most Office Suites will read in such applications. For charts and other pictures, JPG may eventually be replaced, but will probably be readable for a long time. Of course, data corruption is always a risk (especially for JPG), so backups should be made redundantly, and be prepared for at least one major media format migration (EG: CD to DVD-Blue, or whatever). Requiring that any software be able to import from and export to these as relevant would be a reasonable and not overly onerous step.

      Security is a more problematic issue. Some documents are meant to be kept non-public, barring (or even given) FOIA requests. Were it in my demesne to do so, I would still require the creation of the files for archive purposes, but storage off-site at a secure abandoned-salt-mine-type facility. Given that Security is oft diametrically opposed to Accessibility and/or Permanence, this may be a problem.

      Oh... and PDF has some built-in security features. Requiring them to be used only when such security is mandatory might be worth thinking about.

      --
      //Information does not want to be free; it wants to breed.
    5. Re:You're asking the wrong questions by Anonymous Coward · · Score: 0

      I think paper documents are the only form that can fully meet these requirements. Even if you use only .TXT files, the media those files are written on may be unreadable in a decade or two.

    6. Re:You're asking the wrong questions by fm6 · · Score: 1

      When you say "the word spacing may change a little", you're really underestimating the problem. If you ever do anything more than really simple memos with no nested lists, no complex tables, and no charts, you find yourself in a real mess trying to import documents from another vendor. It's something you can deal with if you just want to read other people's documents -- but normal business workflow requires that people pass documents back and forth, making changes and annotations. You simply can't do that without standardizing on a format. And I don't mean RTF, which is effectively a Microsoft proprietary format, despite Redmond's past attempts to get it adopted as "neutral".

    7. Re:You're asking the wrong questions by topham · · Score: 1


      I realize a lot of people do not like PDF; but any other format is asking for grief from end-users.

      A company I currently do a lot of work for is slowly migrating towards PDF; each step along the way has been pretty smooth. It's easy enough for the users to understand they 'print to PDF' to make a presentation version of a document.

      I don't believe intermediate documents (works in progress) should be stored in open formats. Not enough open formats support enough features; you would simply end up with a half dozen or more 'open formats' and have more difficulties than necessary for everyone involved.

      As described it is important you have the finished documents in a format that can be read without difficulty. PDF meets that need, as well as allowing re-printing of an archived document long after you've replaced the original program that created the document with something newer and different.

      My employer has provided solutions for Intranet websites to upload the documents (as PDF) and allow the users to view them on kiosks. Having a universal viewer like PDF is much better than using multiple add-on viewers for different document types. (Excel/Word, etc).

      If I were presenting drawings, as opposed to Documents I'd probably add SVG to the mix. But for presentation of a document as was originally intended PDF can't be beat.

    8. Re:You're asking the wrong questions by Anonymous Coward · · Score: 0

      But you have failed to consider the "access" part of the question. Paper implies physical access. It is very, very useful for long-term archiving but not for access.

  18. That is what i do... by Uber+Banker · · Score: 1

    ...when i get locked PDFs. Just take a screenshot of the document. Easy.

  19. LaTeX by KivlE · · Score: 2, Insightful

    Hmm, I'd say LaTeX would be a good alternative? There are interpreters for most platforms, the source files are plain text, and it can output a variety of readable formats (pdf,ps,html etc).

    1. Re:LaTeX by Planesdragon · · Score: 1

      Show us how to move a MS Word file to LaTeX with no loss of information (yes, formatting counts as "information") or human editing.

      If you can't do that, it's not worth his time.

    2. Re:LaTeX by homeobocks · · Score: 1

      It's possible, but most people who use MS Word don't form real headers/sections/numbering, they just increase the font size and centre things and do things manually. Because of that, it would be hard to turn style information into logical information.

      --
      MOUNT TAPE U1439 ON B3, NO RING
    3. Re:LaTeX by Planesdragon · · Score: 1

      You can turn a set font size to a header easily enough. Heck, you can do it with VB script.

      Got a link for "possible"?

  20. Bad Assignment by salesgeek · · Score: 2, Insightful

    I'd recommend you find a way to get out of the assignment. You will not find what you seek as it is one of the holy grails of computing that should exist but does not and does not for good reason (money).

    --
    -- $G
  21. What's the true question? by crath · · Score: 1

    There could be a huge number of different files you need. CAD files, images, ...

    Before starting, try to determine what the true question is. Were you asked to choose something that is truly vendor neutral, or were you asked to choose corporate standards that will interoperate with your customers and suppliers? The first question is *very* difficult to answer; the second one is easily solved (albeit in a non-Slashdot friendly manner).

    I will assume the latter question is the true question, and continue my posting based upon that assumption.

    For each major document type, determine who needs to be able to read and edit those documents. This question must be answered for your employees as well as your customers and suppliers. Then, choose a file type that is widely used in that community; which may mean standardising on an older version of a particular application.

    For example, in the case of word processed documents, MS Word 97 is a very safe, very widely readable (by other applications) format. Newer versions of MS Word can be configured to only create Word97 files, and many other non-MS applications are able to open and edit Word97 files. So, although Word97 format isn't vendor neutral, it is widely interoperable and makes a good corporate standard.

  22. what are you talking about? by Anonymous Coward · · Score: 1, Informative

    MS-office2003 is XML format but that does not mean it is open.

    It is restricted by patents, see..
    http://news.com.com/Microsoft+seeks+XML-related+patents/2100-1013_3-5146581.html

  23. Re:OOo file format is open though by michaela · · Score: 2, Insightful

    Yep. Just use unzip and you'll get several XML files, among them: content.xml is the document itself, meta.xml is the property sheet info, styles.xml is the stylesheet(s) in use when the document was saved.

    After that, you can use your favorite XML widget, such as the XML::Parser Perl module, to turn it into HTML or other things of your choosing.

    Or create an XSLT file and use something like Xalan to format it on the fly.

    Gotta love OOo and those open formats!
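
    The unzip-then-parse steps above can be sketched in a few lines of Python using only the standard library. This is an illustration under assumptions: the namespace URI is the one used by OASIS OpenDocument text elements (older OpenOffice.org 1.x files use a different one and would need it swapped), and the function names are made up for this example.

```python
import zipfile
import xml.etree.ElementTree as ET

# Namespace of text elements in OASIS OpenDocument files (assumption:
# the file is OpenDocument, not an older OpenOffice.org 1.x format).
TEXT_NS = "urn:oasis:names:tc:opendocument:xmlns:text:1.0"


def paragraphs_from_content(xml_bytes):
    """Return the plain text of each heading and paragraph in content.xml."""
    root = ET.fromstring(xml_bytes)
    wanted = {f"{{{TEXT_NS}}}h", f"{{{TEXT_NS}}}p"}
    return [
        "".join(element.itertext())  # flattens nested spans into one string
        for element in root.iter()
        if element.tag in wanted
    ]


def paragraphs_from_file(path):
    """Open the document as a zip archive and read its content.xml."""
    with zipfile.ZipFile(path) as archive:
        return paragraphs_from_content(archive.read("content.xml"))
```

    From there, wrapping each string in HTML tags or feeding it to an indexer is ordinary text processing; nothing in the extraction depends on the office suite that wrote the file.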

    --
    That is all.
  24. SVG instead by tepples · · Score: 1

    XML embedded in a standard PDF file would allow any application with support for the creator's XML tagset to import the file, and at the very least those without any similar application could view and print the file.

    For a more pure XML solution, it'd be better to embed domain-specific XML data in an SVG document, which Adobe's SVG viewer can display and print. In fact, it might even be possible to XSLT the XML into SVG.
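
    As a rough sketch of that idea, the snippet below draws a tiny bar chart as SVG and carries the underlying figures along as domain-specific XML inside the SVG metadata element. The "sales" namespace, the element name "figure", and the chart layout are all hypothetical, chosen only to show the shape of such a file.

```python
import xml.etree.ElementTree as ET

SVG_NS = "http://www.w3.org/2000/svg"
DATA_NS = "http://example.com/ns/sales"  # hypothetical domain-specific tagset

ET.register_namespace("", SVG_NS)
ET.register_namespace("sales", DATA_NS)


def bar_chart(values):
    """Build an SVG string with one bar per entry and the raw data embedded."""
    svg = ET.Element(f"{{{SVG_NS}}}svg", {"width": "200", "height": "120"})
    meta = ET.SubElement(svg, f"{{{SVG_NS}}}metadata")
    for index, (label, value) in enumerate(values.items()):
        # Machine-readable copy of the data, ignored by viewers.
        ET.SubElement(
            meta, f"{{{DATA_NS}}}figure", {"label": label, "value": str(value)}
        )
        # Visual rendering of the same value.
        ET.SubElement(
            svg,
            f"{{{SVG_NS}}}rect",
            {
                "x": str(10 + 40 * index),
                "y": str(110 - value),
                "width": "30",
                "height": str(value),
            },
        )
    return ET.tostring(svg, encoding="unicode")
```

    A viewer that knows nothing about the embedded tagset still shows and prints the bars, while an application that does understand it can read the figures back out of the metadata block.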

  25. NEXTSTEP is now Mac OS X by tepples · · Score: 1

    I mean, if an unsuccessful platform is your best example of non-Microsoft development of RTF-based software

    Unsuccessful my ass; learn why.

    1. Re:NEXTSTEP is now Mac OS X by fm6 · · Score: 1

      NextStep may be the platform on which OS X was built. (Just as NextStep itself was built on Project Mach.) But OS X is hardly a continuation of NextStep. How many NextStep applications have migrated to OS X?

    2. Re:NEXTSTEP is now Mac OS X by jbolden · · Score: 1

      It's been 10 years, and NeXTSTEP was primarily a development platform when Apple got it. But if you count apps in OS X like the Dock, Preview, NetInfo,... you get lots. If you count ideas from NeXT that moved to the whole of computing, like WYSIWYG fonts, then even more. The big one which is not OSy and moved directly is Interface Builder.

    3. Re:NEXTSTEP is now Mac OS X by Anonymous Coward · · Score: 0

      Just off the top of my head: OmniWeb, Create! ChartSMITH, Mathematica, PStill. That was with about 15 seconds of thought.

  26. Infer what? by tepples · · Score: 1

    True, but given an RTF using visual formatting, how can a program know in advance which font size was meant to be "heading level 1", which was meant to be "heading level 2", whether italics represent emphasis or the title of a work, etc?

    1. Re:Infer what? by Planesdragon · · Score: 1

      Two ways.

      Number one: the office tells them. I.e., "use everything that's size 14 as Heading 1, use italics as italics, etc."

      Number two: write a program to figure it out. This could be done in Office VB to apply and redefine headings for any given document.
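
      A minimal sketch of the "figure it out" approach, in Python rather than Office VB: treat whichever font size covers the most text as body text, and rank every larger size as a heading level, largest first. The input shape (a list of font-size/text pairs already extracted from the document) is an assumption, and the heuristic says nothing about the italics question, which still needs an office rule.

```python
from collections import Counter


def infer_heading_levels(runs):
    """Guess heading levels from (font_size, text) pairs.

    The size that covers the most characters is taken as body text;
    each larger size gets a level, with the largest size as level 1.
    """
    weight = Counter()
    for size, text in runs:
        weight[size] += len(text)
    if not weight:
        return {}
    body_size = weight.most_common(1)[0][0]
    larger = sorted((s for s in weight if s > body_size), reverse=True)
    return {size: level for level, size in enumerate(larger, start=1)}


def tag_runs(runs):
    """Label each run as a heading ("h1", "h2", ...) or a paragraph ("p")."""
    levels = infer_heading_levels(runs)
    return [
        (f"h{levels[size]}" if size in levels else "p", text)
        for size, text in runs
    ]
```

      Documents where people used bold or centring instead of size for headings would defeat this, which is why the first option (the office simply declares the mapping) is the safer of the two.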

  27. I always try and use portable files by The_Dougster · · Score: 3, Informative

    Well, for CAD, it's a screwed up world. The best/most portable format is probably IGES, except it's such a huge specification that nobody's IGES file is compatible with anybody else's. I'm an engineer and for myself I use TurboCAD 10 Professional at home. It reads/writes AutoCAD files and numerous other formats, and is somewhere in between AutoCAD and Pro/Engineer in terms of its capabilities. You'll have a tough time convincing any corporation to use TurboCAD though.

    For text documents, HTML would be good, except MS products tend to produce the most screwed up HTML files I've ever seen. All I can recommend is to use PDF files for important and official documents because they are essentially immutable and tend to produce consistent hardcopies from any computer.

    OpenOffice formats are nice, and if I were starting up a new business I would of course set up Linux workstations to use OO exclusively, and put a Windows machine down in the IT room so the IT staff could convert any troublesome documents that come through the email.

    For Visio, there is no equivalent, other than exporting the Visio file as a DXF or maybe a WMF. Windows Metafiles never seem to load right in other apps though, so that's something to think about. SVG files will probably be the future here if Dia starts using them.

    --
    Clickety Click ...
  28. OASIS Open Document vendor independent by SgtChaireBourne · · Score: 2, Interesting
    Open Document will be interesting to follow.

    Like HTML, which surprised people in the 1990s, the OASIS OpenOffice.org file format is indeed vendor independent, though it is now called Open Document. Anyone can use it or develop tools for it without restriction. Even Microsoft is part of the team at OASIS, at least on paper. And, even if MS doesn't get out of the way, interesting things will happen with Open Document.

    So far OASIS Open Document is being used by at least the following:

    • StarOffice
    • OpenOffice.org
    • AbiWord
    • KWord
    Unlike MS-WordML, which is encumbered by patents, trade secrets, and difficult licensing issues, OpenDocument is free to use. It also meets the requirements specified in European Interoperability Framework for Pan-European eGovernment Services. It's getting increasing attention:
    ... the adoption of an OASIS Open Office Standard should be welcomed, and industry actors not currently involved with the OASIS Open Document Format should consider participating in the standardisation process in order to encourage a wider consensus around the format.

    --EU Telematics between Administrations Committee, May 24, 2004

    Note that the only industry actor not currently involved in the OASIS Open Document Format has been and still is MS. MS is still trying to shoehorn old MS-Office 97 customers into DRM'd MS-Office 2003, which functions in effect like a roach motel for your data. So far the worst insult that Ballmer and Gates can cough up is that OpenOffice.org (OOo) is like MS-Office 97. However, I think even those two can see that OOo meets this group's functional requirements quite well, and is free and multiplatform. OOo is also available in more languages than MS-Office, handles long documents better, and does better with styles and stylesheets.

    Currently, there are many governments moving up to StarOffice or OpenOffice.org for the sake of these formats. Singapore comes to mind first, but there are many, many others that don't necessarily make the mainstream press, like Sarpsborg. Likewise, there are many small, medium and large businesses moving along. Some with an axe to grind (with good reason) speak up. However, most are silent until the move is being implemented, to keep the goon squad from Redmond from getting in the way.

    The current choice:

    • OASIS Open Document --
      1. be able to access your own data indefinitely as XML
      2. and change productivity tools, operating systems and hardware only if and when it suits you
    • MS-WordML --
      1. pay that Redmond tithe indefinitely
      2. and buy new productivity tools, operating systems and hardware when Chairman Bill tells you to
    Easy choice. You don't need to be a wizard to see which direction things are going to head.
    --
    Beta is broken and the link to classic doesn't work. Stop wasting our time or there won't be anybody left here.
  29. Nonsense. by jotaeleemeese · · Score: 1

    Where is he mentioning that the applications have to be Open Source ones?

    For all applications there are formats that are industry standards and unencumbered by patents (as far as it is possible to ensure this in certain litigious countries).

    The knee jerk reaction "boooh! Open Source software is not ready" should be only used when actually Open Source is a necessary part of a requested solution.

    --
    IANAL but write like a drunk one.
  30. What is your point? by jotaeleemeese · · Score: 1

    The article poster is explicitly stating they want to move to vendor neutral applications.

    In such a situation why would they need to do such conversions?

    --
    IANAL but write like a drunk one.
    1. Re:What is your point? by Planesdragon · · Score: 2, Interesting

      That's not what he said. He said vendor neutral file formats.

      This may result in dropping MS Office entirely -- or it may just result in changing the default "save as" settings for every install of Word, or the creation of an "archive and share" custom function that takes DOCs or WPSs or whatever and turns them into the new neutral format.

  31. RTF by alexo · · Score: 1


    > RTF does contain, in theory, sufficient control words to describe
    > everything that Word 2000 can do, but it's hardly a direct translation and
    > things get lost a lot.


    What gets lost?
    Examples please.

  32. The Shot Heard Round the World by garyedwards · · Score: 2, Informative


    There are no "StarOfficeisms" in the OASIS XML Open Document file format specification. Leastways not any we know of. By December of 2004, when the OASIS TC submitted the XML file format specification to ISO, all known references and anachronisms that might be called starisms were changed. Neutralizing changes were even made to such things as the file format extensions and MIME type registrations. We even changed the name from OASIS Open Office to OASIS/ISO Open Document.

    Separating the file format from any particular application or applications suite is a big deal. Especially if there is a rising demand from enterprise level end users for an applications independent universal structured file format solution.

    So the OASIS/ISO TC chose to keep that most powerful of technology terms, the word "Open", but lose the direct reference and/or suggestion to OpenOffice.org.

    The second reason for changing the name to "OASIS Open Document" is far more interesting, and directly relates to the European Union "TAC/IDA" task force recommendations based on the infamous Valoris Report. You will recall that by September of 2004, the EU had evaluated responses from both Sun and Microsoft regarding the Valoris recommendation that all EU information system purchases be required to support an open standards based XML file format specification.

    Microsoft's open XML proposal was determined by the EU to be "not open enough". This criticism was in the original Valoris Report, and not altered by subsequent Microsoft arguments. After much squealing, squawking, finger pointing, complaining and outrageous misrepresentation, in mid November of 2004 Microsoft finally conceded and agreed to meet EU requirements. More about this in a moment, but for now the important thing to note is that the EU held firm. A remarkable feat even though there is currently a range of cross platform alternative solutions that meet EU requirements, including the open and free OpenOffice.org, Sun's StarOffice, IBM's WorkPlace, and Novell's Open Office. And if Microsoft had not sold their share in Corel to a vulture investor outfit for pennies on the dollar, an investor who then proceeded to cut XML out of Corel, WordPerfect Office would also be OASIS/ISO XML compliant.

    Meanwhile, the EU was also not entirely satisfied with the OASIS XML specification as explained in Sun's response to the EU requirements recommendation. Three things in particular concerned the EU.

    First, that OASIS submit the file format specification to ISO. In September of 2004, OASIS management and the OASIS TC came to agreement with ISO that the file format specification would be submitted to ISO before year's end, but maintenance and improvement would remain with the current OASIS TC. Hence the combo moniker "OASIS/ISO".

    Second, there was a great deal of concern about "custom-defined schemas". Sometimes this issue is also referred to as "user-defined schemas". Others just call it a "forms" or "template" issue. Basically it refers to an application's ability to load (or consume) an externally defined schema template that might include specific user interfaces (forms), business-workgroup logic (routing), metadata interfaces, and other things related to the emerging world of collaborative computing.

    Microsoft of course champions the auxiliary Office productivity application, InfoPath. However, in September of 2004, the OASIS TC finished work on extending the specification to include XForms, SVG, and SMIL. Current OOo -v.2 builds fully demonstrate the powerful capabilities of these extensions, including the binding of web services and data to graphical objects and forms/template widgets. Move over InfoPath. Hello OASIS UBL!

    The third issue involves EU concerns fo

  33. Because OpenStep is now Cocoa by tepples · · Score: 1

    How many NextStep applications have migrated to OS X?

    Depends on whether the developer is still around. Mac OS X implements the Mac OS Toolbox API as "Carbon" and the OpenStep API as "Cocoa". If the developer still has the source code and wants to reach thousands of Mac users, porting starts with a recompile. But if your developer has gone out of business, on the other hand...

  34. Permanence by jbolden · · Score: 1

    Permanence of public data.

    I guess how permanent is permanent? It's very hard to store data electronically long term and have it be accessible years later. How many computer techs today could even deal with a 9-track data tape (a state of the art archival format 20 years ago)? While PCs can handle Bus and Tag data streams, the adapter card is $3k per. No one 30 years ago would have conceived of having individual users not connected in any meaningful way to an operations center.

    I've done a lot of work taking data in "will be good forever" formats like code 1 and moving them to formats that are actually usable by non-mainframes. I see no reason to believe that .pdf deployed on modern tape archives will be meaningfully usable in 30 years. If by "permanent" you mean 10 years or less, then no problem. If you mean 100, then in addition to all the other suggestions below, I'm going to say a Microfiche printer should be part of the solution. 100 years from now people may not have a clue what Microsoft Word was, and thus no idea what to do with a ".doc file" on a DVD or whatever, but they will know how to use a magnifying glass and a light source just fine.

    With one of these printers your users either export .jpg, .pdf, .doc... or they "print" to this printer, which captures 400 ppm very cheaply (server + printer + setup for a little over $10k). It may sound really really old fashioned but I think it is worth considering. Think about how you would get digital data from the systems you were using in 1975....