Vendor Neutral File Formats?
timmyv asks: "I have recently been tasked with developing a corporate wide policy that will standardize all employee created documents on vendor neutral file formats. OASIS is good in theory, but I haven't been able to locate enough concrete examples of policies or implementation schemes that work at a corporate level. Does anyone work at a company where documents can only be saved as RTF, HTML, etc. or have any experience with this type of problem?"
and we, unfortunately, use _all_ the formats known to the world.
I've already tried to encourage the adoption of hassle-free formats (RTF, HTML, TXT, whatever), but they never catch on.
It seems that people simply can't get it.
Unfortunately.
If anyone can hear me, slap some sense into me But you turn your head, and I end up talking to myself
There could be a huge number of different files you need. CAD files, images, PowerPoint presentations, and complex spreadsheets will all mess up any single format you can come up with (e.g. HTML). How would you even edit some of these things?
Even the OpenOffice formats are not vendor neutral; only one product out there really uses them.
The surprise isn't how often we make bad choices; the surprise is how seldom they defeat us.
It might sound like Adobe lock-in,
but with PDF printers (which print files to PDFs) for Linux and Windows (I assume the Mac has this built in), it's a good option for creating documents that will be displayed the same way everywhere.
What you need is a toolchain that allows conversion back and forth between several different types. For example, I could write a short paper in XML, SGML, or LaTeX, and convert any of the three to PDF. I could convert the XML or SGML versions to LaTeX, then use latex2html to turn it into an HTML document. I don't know of converters that turn XML,SGML->HTML, but they probably exist.
The point is that it doesn't matter which method I used to create the document; I can convert any of them into either of the other formats without losing information, and any of the three can be turned into HTML or PDF for display purposes.
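A rough sketch of that toolchain idea, in Python: treat each converter as an edge between two formats and search for a chain that gets from one to the other. Only latex2html is named above; the other converter labels are placeholders I made up, and the table lists only the conversions mentioned in this post, so treat it as an illustration rather than a working pipeline.

```python
from collections import deque

# (source format, target format) -> label of the tool that does the conversion.
# Only "latex2html" comes from the post; the other labels are placeholders.
CONVERTERS = {
    ("xml", "latex"): "xml-to-latex (placeholder)",
    ("sgml", "latex"): "sgml-to-latex (placeholder)",
    ("latex", "pdf"): "latex-to-pdf (placeholder)",
    ("latex", "html"): "latex2html",
}


def conversion_path(source, target):
    """Return the shortest list of converter labels from source to target.

    Returns an empty list when source == target, and None when no chain
    of known converters connects the two formats.
    """
    queue = deque([(source, [])])
    seen = {source}
    while queue:
        fmt, steps = queue.popleft()
        if fmt == target:
            return steps
        # Breadth-first search over every converter that starts at fmt.
        for (src, dst), tool in CONVERTERS.items():
            if src == fmt and dst not in seen:
                seen.add(dst)
                queue.append((dst, steps + [tool]))
    return None


if __name__ == "__main__":
    print(conversion_path("xml", "html"))
```

Adding a new format to the policy then means adding one or two edges to the table, not teaching everyone a new tool.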
You've probably got several different types of documents to mess with. Technical papers with plots, accounting spreadsheets, secretary generated memos, and presentations with pretty pictures so that management can understand what's going on. LaTeX alone could handle all of these situations. Create document types and environments to match the needs of each type of document. XML, being completely generic, could also handle any of the situations, but it's easier to type LaTeX markup than it is XML. There is at least one caveat: you have to be careful what type of images you feed TeX.
Heck, you could use Perl bindings to MS-Excel to snag data out of spreadsheets and export it into a format that some other chart making tool uses. You could use Excel itself to export as CSV files, which you could then use awk to convert into some other format.
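For the CSV route, here is a minimal sketch using Python's standard library in place of awk (my substitution, not something the tools above require). It assumes the exported CSV has a header row; the function name and the choice of JSON as the target format are just for illustration.

```python
import csv
import io
import json


def csv_to_json(csv_text):
    """Convert CSV text with a header row into a JSON array of objects.

    Every value stays a string; interpreting numbers or dates is left
    to whatever tool consumes the output.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    return json.dumps(list(reader), indent=2)


if __name__ == "__main__":
    # Hypothetical data standing in for a spreadsheet exported as CSV.
    exported = "quarter,revenue\nQ1,100\nQ2,120\n"
    print(csv_to_json(exported))
```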
Basically, it doesn't matter what tool each person uses, as long as what they export off their own workstation is in a standard format.
The idea of switching applications for people can be a task no one wants to undertake, for two reasons.
Comfort level:
It's like having designers switch from Photoshop to The GIMP, or MS Word to OO Writer. Granted, the apps accomplish the same thing, but it's not the *same* program. People will resist the change because they know how to use the first program, and the reason for the change isn't a concern for them.
Dominance:
Going vendor neutral when the majority still use vendor-specific formats requires you to check whether your users rely on vendor-specific features that are not available in the neutral format. If those features aren't there, then what do you do? Write code to compensate for the feature, get plugins, or do nothing if there's nothing you can do? Are there tools that can do as good a job as the old tools in this neutral environment?
It would help more if you stated your case in more detail.
Well, that's not exactly "vendor neutral", since only one vendor supports it. Of course, that one vendor is an open-source project, and the format is well-documented XML. So if you want to break out of the Microsoft orbit, it's the obvious first choice.
If you reply, do so only to what I explicitly wrote. If I didn't write it, don't assume or infer it.
XCircuit, a circuit layout app for X, uses postscript as its default format. If you have XCircuit, you can load the postscript file into it and edit it like any other circuit. If not, you can still print it or view it as you would any other postscript file.
XML is a good start, because it's easy for a new app (the fictional YCircuit) to add support for the format, but you are still stuck unable to print it if you don't have the skills to write a conversion script and no one else has written it for you.
Why not combine the two? XML embedded in a standard PDF file would allow any application with support for the creator's XML tagset to import the file, and at the very least those without any similar application could view and print the file.
You can't judge a book by the way it wears its hair.
HTML is only vendor neutral if you don't use any vendor-specific extensions. So you can't just say, "Everybody save your files as HTML". You also have to forbid anybody using apps (such as Word) that save to a non-standard HTML.
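If you did want to enforce a "plain HTML only" rule, you would need some way to spot the offenders. Below is a rough Python sketch that flags a few markers commonly seen in Word-exported HTML: namespaced tags such as `<o:p>`, `schemas-microsoft-com` namespace declarations, `mso-` style properties, and `[if ...]` conditional comments. The list of markers is my own assumption and far from complete; a real policy would lean on a proper validator.

```python
from html.parser import HTMLParser


class VendorMarkupFinder(HTMLParser):
    """Collects signs of vendor-specific markup while parsing HTML."""

    def __init__(self):
        super().__init__()
        self.findings = []

    def handle_starttag(self, tag, attrs):
        # Standard HTML element names never contain a colon.
        if ":" in tag:
            self.findings.append(f"namespaced tag <{tag}>")
        for name, value in attrs:
            value = value or ""
            if "schemas-microsoft-com" in value:
                self.findings.append(f"Microsoft namespace on <{tag}>")
            if name == "style" and "mso-" in value:
                self.findings.append(f"mso- style on <{tag}>")

    def handle_comment(self, data):
        # Conditional comments look like <!--[if gte mso 9]> ... <![endif]-->
        if data.lstrip().lower().startswith("[if"):
            self.findings.append("conditional comment")


def find_vendor_markup(html_text):
    """Return a list of human-readable findings; empty means nothing flagged."""
    finder = VendorMarkupFinder()
    finder.feed(html_text)
    finder.close()
    return finder.findings


if __name__ == "__main__":
    sample = '<p style="mso-margin-top-alt:auto">Hi<o:p></o:p></p>'
    for finding in find_vendor_markup(sample):
        print(finding)
```

An empty result only means none of these particular markers showed up, not that the file is standards-compliant.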
In theory, you can create an XML-based format that looks the same in Word, OpenOffice, FrameMaker, and any other XML-aware app. But doing so means designing a schema in extreme nit-picking detail, and writing a lot of transformations to get that XML in and out of all the apps that need to read or write it. It's a lot of work, and nobody does it unless they have a specific application that requires highly-structured information. Like if you have a huge set of technical documentation that you need to update a lot. (I was involved in just such a project -- and the politics of converting all those documents to XML cost me my job.) Or if you have invoices or similar business documents that need to go into or out of a web services app.
But for the big mass of unstructured documents, there just isn't a vendor-neutral solution, and nobody has any real incentive to create one. The solution remains the same: standardize on certain specific applications. Which boils down to using OpenOffice if you hate giving money to Bill and/or want a platform-neutral solution. Otherwise you standardize on Microsoft Office, because it's what everybody knows how to use.
Are they doing this to save money? to clamp down on the uppity workers? because the CEO got emailed an AppleWorks attachment with no file extension from some Mac user? to avoid the risks of single vendor lock-in?
Many document formats can be converted back and forth with some degree of effectiveness. Yes, if you open a document from WordPerfect in Microsoft Office, the word spacing may change a little. However, this also happens if you move from a machine connected to an HP4000 printer to one with an HP2100 printer. Some formats also offer different feature sets; saving from DOC to RTF will cause (as an example) tables to shift about a bit. TXT format is readable by most anything, but the formatting capabilities are nigh nonexistent. (Ooh! Tabs!) While WordPerfect and Word will each open the other's documents, they aren't so good at saving in open formats.
What formats are currently used? Why are they needed? Will everyone need to be able to write to them, or are pay-writer/free-reader combos acceptable? And, *ARE* there any "vendor neutral" formats out there? (For desktop publishing, the real answer is "no". Publisher is a joke, and while Adobe and Quark maintain some import compatibilities, the formats AREN'T neutral.)
For myself, working in a small department, "Let a thousand flowers bloom" is just fine. I accept that I will occasionally get forwarded an e-mail with an attachment that the user can't figure out how to open-- usually Mac/PC file extension name issues solved easily by renaming. Once in a blue moon I have to explain to someone that no, not everyone has FooBarBaz market research organizer, since for most people the $800 license cost would be better spent on other things, and they will probably need to examine such data files once in their career, if that.
Perhaps a list of universally accepted formats-- that is, formats that must be used for wide distribution-- would be more appropriate, after considering what features are needed in said formats. After all, Photoshop .PSD documents are harder to view outside Photoshop, but far more useful for subtle graphics work than JPEGs.
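Such a list is easy to check mechanically. A minimal Python sketch, assuming a purely hypothetical set of distribution formats (the extensions below are placeholders, not a recommendation) and judging files by extension only:

```python
from pathlib import PurePath

# Hypothetical list of formats approved for wide distribution.
DISTRIBUTION_FORMATS = {".pdf", ".html", ".rtf", ".txt"}


def outside_policy(filenames):
    """Return the file names whose extension is not on the approved list.

    Files with no extension at all are flagged too, since their format
    cannot be determined from the name.
    """
    return [
        name
        for name in filenames
        if PurePath(name).suffix.lower() not in DISTRIBUTION_FORMATS
    ]


if __name__ == "__main__":
    attachments = ["memo.PDF", "logo.psd", "notes.txt", "README"]
    print(outside_policy(attachments))
```

Working files like the .PSD can still live on the designer's own machine; the check only applies to what gets sent around.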
I suspect you are being sent out on a project inadequately considered. Depending on the pointy-hairyness of the person who assigned it to you, you may find some substantial benefit to reconsidering the ground assumptions.
//Information does not want to be free; it wants to breed.
Well, for CAD, it's a screwed up world. The best/most portable format is probably IGES, except it's such a huge specification that nobody's IGES file is compatible with anybody else's. I'm an engineer and for myself I use TurboCAD 10 Professional at home. It reads/writes AutoCAD files and numerous other formats, and is somewhere in between AutoCAD and Pro/Engineer in terms of its capabilities. You'll have a tough time convincing any corporation to use TurboCAD though.
For text documents, HTML would be good, except MS products tend to produce the most screwed up HTML files I've ever seen. All I can recommend is to use PDF files for important and official documents because they are essentially immutable and tend to produce consistent hardcopies from any computer.
OpenOffice formats are nice, and if I were starting up a new business I would of course set up Linux workstations to use OO exclusively, and put a Windows machine down in the IT room so the IT staff could convert any troublesome documents that come through the email.
For Visio, there is no equivalent, other than exporting the Visio file as a DXF or maybe a WMF. Windows MetaFiles never seem to load right in other apps though, so that's something to think about. SVG files will probably be the future here if Dia starts using them.
Clickety Click