Paperless Office Solutions Under Linux?

Posted by Cliff on Tuesday October 1, 2002 @05:55PM from the document-management-nightmares dept.

sholgate asks: "I've been asked to look into implementing a paperless office under Linux. We receive emails, letters, Word documents, PDFs, etc. and need a way of converting and storing them that makes searching and access easy. We've been offered two Windows solutions, one based on Canon ScanFile and the other using Lotus Notes. My office went with Canon back in 1995 and now has a load of unreadable CDs, as the original software was DOS-based and doesn't seem to work under Win98/XP. We now face paying for conversion to the new system plus new license fees. We are primarily Linux/Unix based here, so Windows is inconvenient, and history has shown that a closed product is not a good solution. I favour a directory browsing system based on thumbnails (such as Nautilus or Konqueror) and searching with grep, but I can see the benefits of more complex systems that store a database of search terms, etc. Have other Slashdotters thought about paperless offices? What answers did you come up with?"
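The directory-plus-grep approach mentioned in the question could be sketched roughly as follows. This is only an illustration, not anything the submitter described in detail: it assumes each stored document already has a plain-text rendering (for example OCR output) saved as a `.txt` file in the same tree, and the function name `grep_tree` is made up for this sketch.

```python
import os
import re


def grep_tree(root, pattern, extensions=(".txt",)):
    """Yield (path, line_number, line) for every matching line under root.

    Assumption: searchable text lives in plain-text files whose names end
    with one of `extensions`; binary originals are skipped.
    """
    regex = re.compile(pattern, re.IGNORECASE)
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in sorted(filenames):
            if not name.lower().endswith(extensions):
                continue
            path = os.path.join(dirpath, name)
            # errors="replace" keeps the scan going on badly encoded files.
            with open(path, encoding="utf-8", errors="replace") as handle:
                for number, line in enumerate(handle, start=1):
                    if regex.search(line):
                        yield path, number, line.rstrip("\n")
```

Like grep itself, this rereads every file on each query, which is simple and has no index to go stale, but gets slower as the archive grows.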

Google search appliance by Tomah4wk · 2002-10-01 18:16 · Score: 4, Interesting

A Google search appliance sounds like it would suit at least your search requirements. It can also look through MS Office documents (I assume these get emailed to you) and PDF documents and display them as HTML in your browser. With regard to your letters, Clara OCR is free (as in beer, not sure about as in speech) for Linux (it is packaged for Debian, anyway).

Hope this helps.
1. Re:Google search appliance by BornInASmallTown · 2002-10-02 01:59 · Score: 4, Informative
  Yikes! Having evaluated Google along with many other search vendors and open source search tools for the enterprise, I can say that this would be a bad idea long term. The Google search appliance:
  
  is closed
  
  requires an ongoing fee for no new functionality
  
  has a hard limit to the number of indexable docs
  
  can't really do anything that open source tools can't do
  
  I would recommend trying a combination of an open source search engine like Lucene along with its contributed filters (for PDFs and other document types). You can also use OpenOffice document filters for MS Office docs where necessary.
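As a rough illustration of what an indexing search engine does differently from scanning files on every query, here is a toy inverted index. It is not Lucene and does not use Lucene's API; the class and function names are hypothetical, and it assumes the text has already been extracted from each PDF or Office file by some filter.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric terms."""
    return re.findall(r"[a-z0-9]+", text.lower())


class InvertedIndex:
    """Toy term -> document-id index (illustrative only)."""

    def __init__(self):
        self._postings = defaultdict(set)

    def add(self, doc_id, text):
        """Index already-extracted text under the given document id."""
        for term in tokenize(text):
            self._postings[term].add(doc_id)

    def search(self, query):
        """Return ids of documents containing every term in the query."""
        terms = tokenize(query)
        if not terms:
            return set()
        result = None
        for term in terms:
            # .get() avoids creating empty postings for unknown terms.
            docs = self._postings.get(term, set())
            result = set(docs) if result is None else result & docs
        return result


index = InvertedIndex()
index.add("invoice.pdf", "Invoice from Canon for the scanner")
index.add("letter.txt", "Letter about the scanner lease")
print(sorted(index.search("canon scanner")))
```

A real engine adds ranking, stemming, persistence and incremental updates on top of this basic structure, which is why the per-format text filters matter: the index only ever sees extracted text.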
Nope by DreamerFi · 2002-10-01 19:14 · Score: 4, Funny

Every attempt I've ever seen to go "paperless office" has been a failure - if all you end up with is a set of unreadable CDs a few years later, you've done very well so far. Personally, I think a paperless office is about as useful as a paperless toilet.

-John
DjVu is better for this than PDF by 0x0d0a · 2002-10-02 01:59 · Score: 4, Interesting

I will grant that PDF can store scanned documents, but it's really designed for, and best at, storing files printed directly to PDF... otherwise, you end up with absolutely massive files. Unfortunately, it's commonly used for scans anyway. Even PNG would be much better.

DjVu is an interesting format that was primarily designed for storing scanned documents.

It uses a couple of techniques, such as OCR/pseudo-OCR, and multiple embedded images (JPEG/PNG) within the file for the raster parts of a page. The idea is that, say, a scanned magazine page with text and a photographic image is stored as text, a little bit of outline font information, and a JPEG of the photographic image.

--
May we never see th