Cross Platform Document Management Systems?
Alan asks: "I'm
looking for a way to do document management at the office. We have
Windows people and Linux people, some writing documents that are a few lines (developer notes, for example) and others writing documents full of charts, graphs, etc. Currently we have a file server that has shares
set up for the documentation, but it lacks any sort of revision
control, and with the salespeople writing in Microsoft Word there are
cross-platform issues. We were thinking of setting up a wiki or an everything-based site, but as it is
only text, it's not good enough for everyone. There is also the
matter of getting our master documentation (which is in PDF format)
accessible to everyone as well, possibly in an XML format that can be
imported into InDesign or PageMaker or something. There are lots of
solutions that work for different departments and different systems, but
it would be nice to have something that works for everyone."
cross-platform independent format (dunno if there's a Linux version, though).
You could've hired me.
PVCS
or any one of a million other source/document control systems.
Mmmmmmm
Our company uses a program we wrote for document management. It handles authentication, keeping track of documents attached to files, and shells out to Word/other programs to create and edit documents.
Downloading/saving files is done via HTTP (users can work via the same system at home).
That said, our company likes making our own wheels. In the end we find it's less work (especially on a simple project like this).
Let's not stir that bag of worms...
If you consider getting a commercial product, try Xerox's Docushare.
It's web based, features access controls and revisions, HTML rendering of Office documents, and a lot of other nice things.
Best simple document management system I've seen that scales from small teams to large groups.
free the mallocs!
Why not just stick with PDF for everybody? There are plenty of free (beer/speech) utilities out there to make any document a PDF in Linux, as well as (costly perhaps) Acrobat for Windows. That solves your common format problem...and you could use any one of the bazillions of version control systems to manage the PDFs.
Here is the best system for managing documents: it's simple, yet very effective. The technology or database here is a matter of preference - let's address "business flow".
You need a single directory to store all of your documents - no subdirectories. The categorization should be held in the database. This makes for simple backups and not having to heavily integrate your application with the server's file system.
You should have 5 ways to access the files - categorical, by date, by "uploader", by type, and a search function that indexes the complete text held within each document. Verity has a very nice offering for indexing MANY different types of files.
All of the 5 different methods of access should be linked. For instance, if I am browsing categorically and I run across a documentation style I like in particular, I should be able to click a link that takes me to the "uploader" filter that will show me all the other documents that individual uploaded.
Another example is if I did a full-text search for "widgets 123" and had a long listing of documents, you should list the categories they reside in so the user can click that category and be taken to all of the documents in that category.
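To make the flat-directory-plus-database idea concrete, here is a minimal sketch in Python with SQLite. Everything in it is an assumption for illustration: the table and column names, the `browse` and `search` helpers, and especially the `LIKE` query, which is only a naive stand-in for a real full-text indexer such as the Verity product mentioned above. The files themselves would sit in one directory under `stored_name`; all structure lives in the catalog.

```python
import sqlite3

# Columns a user may browse by (hypothetical names, chosen for this sketch).
FILTERS = {"category", "uploader", "doc_type", "uploaded_on"}


def open_catalog(path=":memory:"):
    """Open (or create) the catalog that holds all categorization."""
    db = sqlite3.connect(path)
    db.execute(
        """CREATE TABLE IF NOT EXISTS documents (
               id          INTEGER PRIMARY KEY,
               stored_name TEXT NOT NULL,  -- file name in the single flat directory
               category    TEXT NOT NULL,
               uploader    TEXT NOT NULL,
               doc_type    TEXT NOT NULL,
               uploaded_on TEXT NOT NULL,  -- ISO date, e.g. 2001-05-01
               body_text   TEXT NOT NULL   -- text extracted from the document
           )"""
    )
    return db


def add_document(db, stored_name, category, uploader, doc_type, uploaded_on, body_text):
    """Register one uploaded file in the catalog."""
    db.execute(
        "INSERT INTO documents "
        "(stored_name, category, uploader, doc_type, uploaded_on, body_text) "
        "VALUES (?, ?, ?, ?, ?, ?)",
        (stored_name, category, uploader, doc_type, uploaded_on, body_text),
    )


def browse(db, field, value):
    """List file names by category, uploader, type, or date."""
    if field not in FILTERS:
        raise ValueError(f"cannot browse by {field!r}")
    rows = db.execute(
        f"SELECT stored_name FROM documents WHERE {field} = ? ORDER BY stored_name",
        (value,),
    )
    return [row[0] for row in rows]


def search(db, phrase):
    """Naive text search; each hit carries its category and uploader so the
    UI can link straight to the other views."""
    rows = db.execute(
        "SELECT stored_name, category, uploader FROM documents "
        "WHERE body_text LIKE ? ORDER BY stored_name",
        (f"%{phrase}%",),
    )
    return rows.fetchall()
```

Returning the category and uploader with every search hit is what makes the access methods "linked": a result row already holds the values needed to call `browse` for the neighbouring view.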
Two phrases should be at the top of every thought you have regarding this system:
Fully Integrated
Stupid dumb easy to use
Adam.
my sig is so witty and fun - it tickles almost everyone who reads it.
Format documents in HTML. Simple, supports practically all platforms out there. Most tools will export to HTML these days...
(Or, alternatively, PDF, which is equally well supported. Though you lose the ability to browse the documentation off your favorite web browser.)
For revision control: Try CVS with SSH (instead of RSH) for updating. Cheap. Effective. Highly portable. Add a little script so that it automatically checks out the latest versions of whatever is checked in, and checks them out into a public place -- perhaps into your intra-web-server's (Apache's) directory.
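The "little script" could look something like the sketch below, written in Python. The repository root, module name, and web directory are made-up placeholders, and the function names are mine; the only CVS pieces relied on are the `CVS_RSH` environment variable, the global `-d` and `-q` options, and `checkout -P -d`. You would run it from cron or a commit hook, whichever suits your setup.

```python
import os
import subprocess


def publish_command(cvsroot, module, dirname):
    """Build the cvs command that checks `module` out into `dirname`."""
    return [
        "cvs", "-q",        # quiet
        "-d", cvsroot,      # repository root, e.g. :ext:user@host:/path
        "checkout",
        "-P",               # prune empty directories
        "-d", dirname,      # directory name to check out into
        module,
    ]


def publish_env(base_env=None):
    """Copy the environment and tell CVS to use SSH instead of RSH."""
    env = dict(os.environ if base_env is None else base_env)
    env["CVS_RSH"] = "ssh"
    return env


def publish(cvsroot, module, web_parent, dirname):
    """Check out the latest revision under the web server's document tree."""
    subprocess.run(
        publish_command(cvsroot, module, dirname),
        cwd=web_parent,     # the checkout directory is created relative to this
        env=publish_env(),
        check=True,         # raise if cvs exits with an error
    )


# Example call with placeholder values (not run here):
# publish(":ext:alan@cvs.example.com:/var/cvs", "docs", "/var/www/html", "docs")
```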
For your sales people, add commands (batch or bash scripts) to create/checkin files to their Windows Explorer browser commands list... So they can easily update/checkin files using their winX click and drool interfaces... (But, of course, it only works for the "right" file types...)
Web based, which solves the cross-platform bit. Includes a document distiller so that it will render HTML for documents thrown into it, which saves the bother of having to download the attachment. Will render a PDF, realtime, for anything else. Does a lot of other things, too. (Metadata, tracking, revision control, check-in/out, restricted access, ACLs,
http://www.lotus.com/home.nsf/welcome/domdoc
You probably won't find a good, free one, because lots of businesses (like legal departments) are willing to fork over lots of dough for this type of product. Thus there is a demand by paying customers.
PDF? Only useful if document revisions and annotations are only done by the original author!
The problem is compatibility between applications. Every application has its own format -- in some cases even different versions of the same app can't share files.
The software vendors would like you to believe that their products solve this with "import/export" filters. Bullshit. I have never, ever seen such a filter that's suitable for everyday use. Some require a lot of skill to use. But most just fail to parse this element and that. So you get data loss ranging from minor formatting errors to surprise content loss.
The closest I can suggest to a total solution is to make everybody standardize on a small set of formats. There's a minimum of three, for plain text (and don't forget the Mac/Unix/DOS line break issue!), rich text, and graphics (possibly more than one). Easy enough to find standard formats for each of these. The hard one is rich text, but not for any technical reason.
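The line break issue, at least, is mechanical. As a rough sketch (the function name is mine, and it assumes you have settled on Unix-style `\n` as the house standard), a check-in hook could normalize plain text files like this in Python:

```python
def normalize_newlines(data: bytes) -> bytes:
    """Convert DOS (CR LF) and classic Mac (CR) line breaks to Unix (LF)."""
    # Replace CR LF first so the remaining lone CRs are Mac-style breaks.
    return data.replace(b"\r\n", b"\n").replace(b"\r", b"\n")
```

It works on bytes so it doesn't care about the text encoding, but it should only ever be applied to plain text, never to Word files or graphics.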
Technically it's simple. Settle on a widely-used rich text format and forbid everything else. If you don't care about the content-formatting dichotomy, LaTeX is a good candidate -- techies can use their favorite text editors, techno-muggles can use any number of WYSIWYG tools. (Being a technical writer, I would insist on XML, but that doesn't make sense for every organization.) Problem solved, right?
Wrong. If your organization is at all typical, you've got a lot of people who have an investment in their Word, Powerpoint, and Excel skills, and would quit if they had to start over. That's a social engineering problem, and I don't have a solution for it.
I can't believe that nobody has mentioned Zope.
I've been looking at this lately and Zope is an ideal solution.
Zope can grok anything if you can find/write a product for it. It can also search it using ZCatalog.
I downloaded the MSWordDocument product and it kicks ass. When you stick a Word document into the Zope database it has its own 'type.' When you access the document it will, by default, render it in HTML (thanks to wvware) and display it, with a bar at the top with a 'download' link that retrieves the original document. What makes this even cooler is that, since Zope can extract the text, ZCatalog can give you a search interface.
I built a simple system with search in about five minutes using the web interface and DTML tags. No lie.
There's a similar product for PDF files and if I make one for StarOffice files, it'll be useful at the place where I work.
To top it all off, Zope has built-in versioning. You can even do diffs between arbitrary versions. It also has WebDAV support so that Windows users, with 'Web Folders', and Linux users, with davfs, can open and save files, with locking and everything, as if they were local.
All the little stuff is already there, too. User accounts and login handling are native, you can attach metadata to anything, and you can write scripts in Python, Perl, or PHP.
Needless to say, I highly recommend it.
Although wiTHinc markets panFora as just discussion forum software, we've been using it for both sharing documents and having online discussions on/around the shared documents.
It's neat that we can deal with simple lines of text as message text, and more complex documents (pdf, excel, word, powerpoint,...) as file uploads. The discussion postings are fully threaded, so you can upload new document revisions within the same thread. We track both the comments and revisions this way. People can make public comments online or privately via panFora's email.
panFora takes a fairly full subset of HTML formatting (tags: fonts,tables,...) in message postings. It will automatically correct HTML syntax coding errors for you, if you write your own.
Although this is a web server application, navigation within panFora's topic/folder structure using 3 frames is just like what you are used to on the desktop for newsreaders and email. This is panFora's unique strength.
We've been using it a while and found it easy to use. More recently, they released a free version you can download. I may put the free version on a couple more servers here running Linux and Apache.
http://www.withinc.com/home.html
Try a WebDAV-based solution. WebDAV allows you to interact with a WebDAV-enabled server to manipulate files, with support for locking, via extensions to HTTP/1.1.
Delta V adds version control to that.
There are WebDAV-enabled DMSs starting to appear (including, e.g., MS SharePoint); WebDAV and its related standards will become the standard API for interacting with a DMS.
While not yet a fully featured DMS, there are good open source implementations of WebDAV clients and servers; I use Slide, which works well with Tomcat 4.
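For anyone wondering what "locking via extensions to HTTP" looks like on the wire, here is a rough Python sketch of a WebDAV `LOCK` request using only the standard library. The host, path, and owner values are placeholders and the function names are mine; the request body follows the `lockinfo` element defined by the WebDAV spec. Treat it as an illustration, not a full client (no authentication, no lock refresh, no `UNLOCK`).

```python
import http.client
import xml.etree.ElementTree as ET

DAV_NS = "DAV:"


def build_lock_request(path, owner, timeout_seconds=600):
    """Return (method, path, body, headers) for an exclusive write lock."""
    ET.register_namespace("D", DAV_NS)
    lockinfo = ET.Element(f"{{{DAV_NS}}}lockinfo")
    scope = ET.SubElement(lockinfo, f"{{{DAV_NS}}}lockscope")
    ET.SubElement(scope, f"{{{DAV_NS}}}exclusive")
    locktype = ET.SubElement(lockinfo, f"{{{DAV_NS}}}locktype")
    ET.SubElement(locktype, f"{{{DAV_NS}}}write")
    owner_el = ET.SubElement(lockinfo, f"{{{DAV_NS}}}owner")
    owner_el.text = owner
    body = ET.tostring(lockinfo, encoding="utf-8")
    headers = {
        "Content-Type": "application/xml; charset=utf-8",
        "Depth": "0",                             # lock this resource only
        "Timeout": f"Second-{timeout_seconds}",   # requested lock lifetime
    }
    return "LOCK", path, body, headers


def lock(host, path, owner):
    """Send the LOCK request; returns (HTTP status, Lock-Token header or None)."""
    method, url, body, headers = build_lock_request(path, owner)
    conn = http.client.HTTPConnection(host)
    try:
        conn.request(method, url, body=body, headers=headers)
        response = conn.getresponse()
        return response.status, response.getheader("Lock-Token")
    finally:
        conn.close()


# Example with placeholder values (needs a WebDAV server to answer):
# status, token = lock("dav.example.com", "/docs/master.pdf", "alan")
```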
I'd tell you how old I am if I thought you could count that high!
You don't mention your budget or how much effort you want to put into this, but seriously consider FrameMaker. It's cross-platform (it runs on Unix, Mac, and Windows) and long tried and true for documentation and manuals, and even office and sales types "get it" pretty quickly. You will find it much less cumbersome than cobbling something together with PageMaker or InDesign, and it offers you much better management controls. It has great integration with PDF, SGML, XML, and more. No, I don't work for Adobe. I just used to do this kind of stuff, and it's a real workhorse for what you are after. Adobe's web site has lots of info.
no sig