Slashdot Mirror


Paperless Office Solutions Under Linux?

sholgate asks: "I've been asked to look into implementing a paperless office under Linux. We receive emails, letters, Word documents, PDFs, etc. and need a way of converting and storing them that provides easy searching and access. We've been offered two Windows solutions, one based on Canon ScanFile and the other using Lotus Notes. My office went with Canon back in 1995 and now has a load of unreadable CDs, as the original software was DOS based and doesn't seem to work under Win98/XP. We now face paying for conversion to the new system plus new license fees. We are primarily Linux/Unix based here, so Windows is inconvenient, and history has shown that a closed product is not a good solution. I favour having a directory browsing system based on thumbnails (such as Nautilus or Konqueror) and searching with grep, but I can see the benefits of more complex systems that store a database of search terms etc. Have other Slashdotters thought about paperless offices? What answers did you come up with?"

44 comments

  1. uhh... by Anonymous Coward · · Score: 1, Funny

    Why would you want an office without paper? What are we going to do with all those extra trees? Columbo always used a pen and paper....

  2. What good are thumbnails? by Anonymous Coward · · Score: 0

    Itty-bitty print hurts your eyes; not many documents can be recognized that way. It would seem to me that it would be nice to have them stored in a database with a 'flattened' version as text to search through them.

    Of course, while it's useful to have electronic copies of all your documents, I don't see it as being practical to be completely paperless. You spend too much time printing things...
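
    The "flattened text in a database" idea above can be sketched in a few lines. This is only an illustration, assuming the text has already been extracted from each document by some other tool; the table name, column names, and file names are made up for the example, and a real system would probably want a proper full-text index rather than a substring scan.

```python
import sqlite3


def build_store(docs):
    """Store each document's path and flattened text in an in-memory SQLite table.

    `docs` maps a (hypothetical) file path to its already-extracted plain text.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE documents (path TEXT PRIMARY KEY, body TEXT)")
    conn.executemany(
        "INSERT INTO documents (path, body) VALUES (?, ?)", docs.items()
    )
    conn.commit()
    return conn


def search(conn, term):
    """Return the paths whose flattened text contains `term`, ignoring ASCII case."""
    rows = conn.execute(
        "SELECT path FROM documents "
        "WHERE instr(lower(body), lower(?)) > 0 ORDER BY path",
        (term,),
    )
    return [path for (path,) in rows]


conn = build_store({
    "letters/0001.tif": "Invoice from Canon for ScanFile licenses",
    "mail/0002.eml": "Meeting notes about the filing system",
})
print(search(conn, "canon"))
```

    The original scan or file stays on disk untouched; only the searchable text lives in the database, so the index can be rebuilt if the tool is ever replaced.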

  3. Google search appliance by Tomah4wk · · Score: 4, Interesting

    A Google search appliance sounds like it would suit at least your search requirements. It can also look through MS Office documents (I assume these get emailed to you) and PDF documents and display them as HTML in your browser. With regard to your letters, Clara OCR is free (as in beer, not sure about as in speech) for Linux (it is packaged for Debian, anyway).

    Hope this helps.

    1. Re:Google search appliance by BornInASmallTown · · Score: 4, Informative

      Yikes! Having evaluated Google along with many other search vendors and open source search tools for the enterprise, I can say that this would be a bad idea long term. The Google search appliance:

      • is closed
      • requires an ongoing fee for no new functionality
      • has a hard limit to the number of indexable docs
      • can't really do anything that open source tools do


      I would recommend trying a combination of an open source search engine like Lucene along with its contributed filters (PDFs and other document types). You can also use OpenOffice document filters for MS Office docs where necessary.
    2. Re:Google search appliance by searchtools · · Score: 1

      I agree that the Google Box is closed, but as I understand it, there is no ongoing fee, except for support. You buy it, you own it.

  4. Zope by jalet · · Score: 2, Informative

    Zope might be a good start for you :

    http://www.zope.org

    --
    Vote green: crap in the ballot box!
  5. Nope by DreamerFi · · Score: 4, Funny

    Every attempt I've ever seen to go "paperless office" has been a failure - if all you end up with is a set of unreadable CDs a few years later, you've done very well so far. Personally, I think a paperless office is about as useful as a paperless toilet.

    -John

    1. Re:Nope by Anonymous Coward · · Score: 0

      It's not as if modern recycled paper lasts a hell of a lot longer, though...

    2. Re:Nope by gmhowell · · Score: 3, Funny

      What do you need paper for? I thought the clamshells replaced it?

      --
      Jesus was all right but his disciples were thick and ordinary. -John Lennon
  6. The Myth of the Paperless Office by Virtual+G.W.+Bush+Jr · · Score: 1

    In addition, I suggest you read the book The Myth of the Paperless Office by Abigail J. Sellen and Richard H. R. Harper. It sheds some decent light on why we will never reach the paperless office.

  7. HTDIG by Skord · · Score: 2, Informative

    htdig has support for MS Office docs and PDFs, and sounds a little cheaper than a Google search appliance (although I'm sure a shiny yellow solution does a good job).

    I've never used a modem in Linux, so I have no idea what the telephony capabilities are.

    I tend to agree with most of the replies here however. I tried my hardest to save a tree here and there and the other system administrator here prints EVERYTHING out. Until you can fire all the idiots and be left working alone, I'd skip on the "paperless office" idea and spend more time working on projects.

  8. Printer Ink = $$$ by awerg · · Score: 2, Informative

    HP makes 40% of their money in Printer cartridges. Printer stuff is consumable. That means I use it up and buy more.

    I have spent more on printer paper and ink than all my computer hardware put together in the last 5 years.

    Paperless office is a dream.

    --
    -- Andy
    1. Re:Printer Ink = $$$ by Anonymous Coward · · Score: 1, Informative

      To show how times have not changed. When I worked for a government agency in Canada in 1985 our annual budget for printing (we had a separate budget for printouts from our data systems) was in the order of $85,000. Oh, that was just for paper, doesn't even include ribbons or printer maintenance.

    2. Re:Printer Ink = $$$ by holstein · · Score: 1

      That's why I stayed away from the inkjet, bubble-jet and that kind of printers...

      I have here at home an LED-based Okidata (i.e. laser-like) that has cost me only two cartridges (at 35 CAN$ each) in more than five years. And it's not because I don't print very often: my girlfriend (well, now my wife.. ;o) and I did all our university homework on it (and after our undergraduate studies together she did a master's, with a 120-page thesis printed a _lot_ of times, and 2 more years at school). Based on the quantity of paper bought, that means something around 5000-6000 pages. With _2_ cartridges!

      So I consider this printer my best computer buy ever!

  9. Paperless Office by Old.UNIX.Nut · · Score: 1

    A solution I saw used in some Las Vegas Casinos 10 years ago (I did some unrelated VMS software upgrades for them) was accomplished by printing their reports to Optical drives. The data was pure ASCII (cannot get any more portable than that), which they could search and reformat as needed for later output to screen or printer using COBOL, Perl, etc. With some tweaks to allow for improvements in technology and your situation this technique could work for you.
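
    To show why plain ASCII keeps the searching side simple, here is a rough sketch of a grep-style search over a directory of archived report files. The "*.txt" naming and the directory layout are assumptions made for the example, not how the casinos actually stored their output.

```python
from pathlib import Path


def search_reports(root, term):
    """Return (file name, line number, line) for every line containing `term`.

    Walks `root` recursively and reads each *.txt file as ASCII; bytes that
    are not valid ASCII are replaced rather than raising an error.
    """
    needle = term.lower()
    hits = []
    for path in sorted(Path(root).rglob("*.txt")):
        with open(path, encoding="ascii", errors="replace") as handle:
            for number, line in enumerate(handle, start=1):
                if needle in line.lower():
                    hits.append((path.name, number, line.rstrip("\n")))
    return hits
```

    Because the archive is just text, the same files can later be fed to any other tool (grep, Perl, a database loader) without a conversion step.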

    1. Re:Paperless Office by smatthew · · Score: 1

      It's called COLD - Computer Output to Laser Disk. Very portable, and most data systems will deal with it.

      --
      slashdot username - at - email.domain.name
  10. Paperless Office by I'm+not+a+script · · Score: 0

    Hmm, try printing on plastic sheets.

    --
    kthx
  11. Solutions by dth · · Score: 2, Informative

    There are a few solutions. Basically what you're looking for is a nice front end to a virtual filesystem, with some bells and whistles.

    Take a bunch of paper, scan it, index it, file it. Additionally, do the same for non-scanned work (email, doc, pdf, ...)

    Windows-wise, Doctrieve (now Redmap Networks; look for a similar product) is a good solution. There's a range of products, all providing more or less similar functionality (some more bells here, some fewer whistles there...). Non-Windows-wise, there's an open source one called DocMgr which provides similar functionality, albeit a bit immature.

    OCR is really the big issue here with scanned work. I've only dabbled with OCR under Linux (using GOCR) with limited success. Bad OCR == bad indexing == useless searching.

    I'm currently in the process of writing something similar targeted at the higher-end market. If you're interested in testing or evaluating, drop me an email.

  12. Ignorant company? by Futurepower(R) · · Score: 2


    The Doctrieve link is not Mozilla friendly:

    "To view this site you must be using Microsoft® Internet Explorer 4 or above.

    If you do not have a copy of Microsoft® Internet Explorer please use the links below to download a FREE copy.

    We look forward to you visiting us at www.doctrieve.com"


    A software company that is that ignorant about how to make web pages might not be the best business partner.

    1. Re:Ignorant company? by GRW · · Score: 1

      Actually, the page just forwards to http://www.redmap.net/, which seems to be Mozilla friendly.

  13. Damn by Treeluvinhippy · · Score: 2

    We are primarily Linux/Unix based here so Windows is inconvenient

    You are a lucky SOB.

    --
    >
  14. You sure it was Lotus Notes they offered? by Anonymous Coward · · Score: 1, Informative

    I think it was more likely Lotus' product Domino.Doc, which has some very nice document revision, tracking, and indexing abilities. I think it was also tied in to Adobe Acrobat Distiller, and thus generated PDF files for everything you had in the system.

    Domino does run on Linux (and damned well). In fact, the newest version, 6.0, just went public on Monday of this week. Lotus has a strong history of supporting Unix platforms with their server product, but I'm not sure if Domino.Doc works/is available for the Linux platform.

    Domino.Doc is designed to work completely from the web, but the audience I've seen it used for were more the parts of corporations that had long lists of document handling requirements (like legal, HR, trademarking, etc). Might be overkill for your situation.

    The key thing here is templates, and a database to store things in. That breaks down when you start getting in to presentations (like PowerPoint), and further breaks down when you get into spreadsheets (because frankly, if you can use a template, then you should just put the spreadsheet into a database somewhere).

    I'd give Lotus a look, because they have a good client that runs on both the new Mac OS X, and on Windows platforms. The servers come in a variety of flavors, but Linux is on the list. Surf the Lotus forums at http://domino.lotus.com and see what you turn up.

  15. Content Management System by spike666 · · Score: 2

    Since it sounds like you are already receiving the documents electronically, you need a content management system. There are plenty out there, and the choice depends on the types of things you want to do. There's Stellent, which is primarily a content management system for documents, but I don't know what sort of Linux support they have. There's also Interwoven, which is a little more geared toward web deployment content management.

    Another poster has mentioned Lotus, but there is a product from IBM called IBM Content Manager that runs on DB2 and WebSphere (which both run on Linux) and gives you really powerful storage and delivery of your stored content.

    Of course, you could always check SourceForge which shows at least a dozen projects with "Content Management" in their descriptions...

    1. Re:Content Management System by Anonymous Coward · · Score: 0

      Content Management is not the same as Electronic Document Management. CM is a web publishing tool.

  16. Good scanners and outsourced proofreading. by duffbeer703 · · Score: 3, Interesting

    It depends on what you want to do.

    I've worked with a state agency which, not surprisingly, handles a lot of paperwork. They have a scanning solution which brings in the images, stores them in a graphics format (I think TIFF), and indexes each document under the case number it is associated with. Meta-info can be added by the people who work with the documents.

    Note that if you need to have legal proof of a signature or if your auditors require you to keep documents for x years, they must be in graphic format --- an OCR'd document in ASCII text won't fly.

    If you are looking to automate data entry, get a high speed commercial scanner (if you have large volume) from a company like Bell & Howell and outsource the OCR activity to another company. Tons of companies (Lockheed Martin does it for most federal agencies) do this. The outsourcers send your documents to a 3rd world country like Ghana for proofreading. OCR is only about 95% accurate, and automated OCR is not reliable enough for anything!

    The free Ziff-Davis magazine "Baseline" ran an article about this a couple of months ago; you might want to find their website (or look through the pile of free mags on your desk) and see if you can find it.

    Don't shop for a solution based on platform, "Free"/non-"Free", etc. A "Free" solution will take longer, and your cost driver will be the implementation, not the initial licensing cost.

    Get whatever provides you with the best solution, period.

    --
    Conformity is the jailer of freedom and enemy of growth. -JFK
  17. Linux has had this from day one by babbage · · Score: 2
    Have all of your workers wrestle out their own /etc/printcap entries and your office will go paperless remarkably quickly. Possible obstacles:
    • Some poor bastard will figure out how to point a print document at the printer, but will never -- this you can count on -- never ever figure out how to make it look even remotely correct. For a time this frustrated sap will burn through reams of paper before giving up in disgust.
    • Some people will try to print from Mac or Windows computers. Do not allow this! It is far too easy to let some unwitting boob get up & running with normal printing on these platforms, at which point your noble ecological pipe dream will end up flushed like yesterday's half digested tofu burger.

    Otherwise, really, that's about all there is to it -- normal Linux / Unix LP print services. Switch to that and you'll never have to replace your toner cartridges again!

    :-)

    1. Re:Linux has had this from day one by rjamestaylor · · Score: 1
      In my office I regularly receive printouts of webpages placed in my mailslot from other executive management (I'm the CTO) wanting to show me something they think I may be interested in seeing. Drives me bonkers!

      The same people are more likely to print a Word doc and fax it than send an email...

      --
      -- @rjamestaylor on Ello
    2. Re:Linux has had this from day one by Jon+Peterson · · Score: 1

      I like your RHAT vs MSFT link. Try this one:

      RHAT vs MSFT

      --
      ----- .sig: file not found
    3. Re:Linux has had this from day one by babbage · · Score: 1

      But that's the beauty of the old lp system -- force people to switch to that and they'll never want to print anything ever again :-)

    4. Re:Linux has had this from day one by Anonymous Coward · · Score: 0

      In my office I regularly receive printouts of webpages placed in my mailslot from other executive management (I'm the CTO) wanting to show me something they think I may be interested in seeing. Drives me bonkers!

      Same here. I never ever want to be given a piece of paper ever! There is e-mail for a reason.

      Even better, I've had people give me printouts of data they want changed. Ummm, I'll change the original data, but what good is a printout??? I'm not retyping what is already there!

      I for one see no reason why the paperless office couldn't work. I prefer to read off a monitor, however, so I'm probably not the best one to judge the feasibility.

    5. Re:Linux has had this from day one by rjamestaylor · · Score: 1
      • Same here. I never ever want to be given a piece of paper ever! There is e-mail for a reason.
      I've also made it known that voice mail -- the little flashing light on the phone -- will be rudely ignored for at least 24 hours. But email will receive an immediate response. I can't "reply" to a voice mail -- even though there is a "reply" function in our voice mail server, it is not in any way the same or nearly as convenient.

      Oh, even though we have a web-enabled corporate email system (*not* Exchange) our top execs still--this is gross--use their AOL accounts. Hey, I know AOL owns Netscape who sponsors Mozilla, but that doesn't mean I have to like it.

      --
      -- @rjamestaylor on Ello
  18. DjVu is better for this than PDF by 0x0d0a · · Score: 4, Interesting

    I will grant that PDF can store scanned documents, but it's really designed and best for storing printed-directly-to-PDF files...otherwise, you end up with absolutely massive files. Unfortunately, it's commonly used for said purpose. Even PNG would be much better.

    DjVu is an interesting format that was primarily designed for storing scanned documents.

    It uses a couple of techniques, such as OCR/pseudo-OCR, and multiple embedded images (JPEG/PNG) within the file for rasterable images. The idea is that, say, a scanned magazine page with text and a photographic image is stored as text, a little bit of outline font information, and a JPEG of the photographic image.

    1. Re:DjVu is better for this than PDF by greenhide · · Score: 2

      On the other hand, keep in mind that PDF is a very, very well supported format. You will have no problem finding reams and reams of solution providers, software, etc. that support the PDF format.

      DjVu may or may not be closed, but it's not exactly a standard, while PDF is. I'd at least keep a copy of every document in PDF format.

      --
      Karma: Chevy Kavalierma.
  19. Yeah sure. by Anonymous Coward · · Score: 0

    Paperless office, leisure society. All these early 80s buzzwords aren't dead yet?

    I mean, we certainly COULD have those things, but then capitalism would have to die and people would have to find things to do with their lives other than fucking each other over for a dollar.

  20. PDF is a Tree Killer by cyber_rigger · · Score: 1

    I understand the need to ensure that documents are complete
    but I think PDFs are a pain to view and/or print out.

    A better system would allow copy, paste and editing
    yet still indicate (by hash code, digital signature, etc.) whether or not it has been modified.
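
    The hash half of that idea is easy to sketch. This is a minimal illustration using SHA-256 from Python's standard library; the function names are made up for the example. Note that a bare hash only shows that the bytes changed. Proving who wrote the original would need a real digital signature with keys, which this sketch does not attempt.

```python
import hashlib


def fingerprint(data: bytes) -> str:
    """Return the SHA-256 digest of a document's bytes as a hex string."""
    return hashlib.sha256(data).hexdigest()


def is_unmodified(data: bytes, recorded: str) -> bool:
    """Check a document against the digest recorded when it was filed."""
    return fingerprint(data) == recorded


original = b"Dear Sir, please find the signed contract attached."
recorded = fingerprint(original)  # stored separately from the document

print(is_unmodified(original, recorded))
print(is_unmodified(original + b" (edited)", recorded))
```

    The document itself stays freely copyable and editable; only the recorded digest has to be kept somewhere the editor cannot quietly change it.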

  21. Notes on Linux, perhaps? by MightyTribble · · Score: 1

    You said you were offered a Notes Windows solution - have you considered running Notes/Domino on Linux? Domino runs native on Linux, and you can use Wine to run the Notes client quite nicely (although Codeweavers Crossover Office does a better job, for a little extra $$$). That'd get you a supported, commercial-grade software without having to pay the Windows tax.

    Plus I really like Domino for groupware. It's the only real challenger for MS Exchange out there.

  22. More ignorance: by Futurepower(R) · · Score: 2


    More ignorance: "With the recent release of the new operating system Windows XP by Microsoft, Redmap Networks support wishes to advise that existing ManageEzy and ManagePoint will not run on the Microsoft Windows XP platform. We are currently striving towards a solution for Windows XP and this is expected to be completed by the 3rd quarter of 2002."

    The company looks understaffed and underskilled. They gave themselves a year, and missed that deadline.

  23. You Dont Need "paperless office" software by jcasey · · Score: 2, Insightful

    Forget about buying "paperless office" software. This is a dumb idea that only serves to funnel money into some unimaginative software company's pocket. If you need to save your correspondence, save it to a directory (folder). Make a rule or standard for filing and naming these documents - hell, in the old days companies would hire 'secretaries' to do this sort of thing - they didn't have to be intelligent or useful either - just organized. The cost of hiring a secretary wasn't too bad either - still isn't.

    The problem with these "paperless office" or "document management" systems is that: 1. They are overkill. 2. They are usually proprietary. 3. It's one more thing for the average employee to f*ck up. 4. New employees will have to learn this system. 5. It costs money. 6. If you have a problem with it, you'd better hope the software provider is capable of fixing it. 7. It makes your data less mobile - if you decide to go with another system 10 years down the road, you will have to figure out a way to translate the data from the old system to the new system.

    My advice: K.I.S.S. - Keep it simple, stupid!
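
    A "rule or standard for filing and naming" can be as small as one function everybody's scripts agree on. The sketch below is one possible convention (date received, sender, subject), not anything prescribed above; the field order and the function name are assumptions for illustration.

```python
import re
from datetime import date


def filed_name(received: date, sender: str, subject: str, extension: str) -> str:
    """Build a file name of the form YYYY-MM-DD_sender_subject.ext.

    Sender and subject are lowercased and every run of characters other
    than a-z and 0-9 is collapsed to a single hyphen, so names sort by
    date and stay safe on any filesystem.
    """
    def slug(text: str) -> str:
        return re.sub(r"[^a-z0-9]+", "-", text.lower()).strip("-")

    ext = extension.lower().lstrip(".")
    return f"{received.isoformat()}_{slug(sender)}_{slug(subject)}.{ext}"


print(filed_name(date(2002, 10, 3), "Canon UK", "License renewal!", ".PDF"))
# → 2002-10-03_canon-uk_license-renewal.pdf
```

    With names like these, a plain directory listing is already sorted by date, and grep or a shell glob on the sender part does most of what a document management front end would.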

    --
    X
  24. Paperless Solution by Anonymous Coward · · Score: 1, Interesting

    One great system we've installed where I work is from a company called Stellent (www.stellent.com). Great piece of software, configurable and changeable to your heart's content. The server is Java based and runs on Linux/Solaris/Windows with Apache.
    It can convert documents automatically to PDF, and stores both the PDF and the native file in a specified directory in the filesystem.
    The only sad thing is it's not open source (but you can modify anything you want to anyway), and it can get expensive depending on the number of users that will be checking in files.

    We've been using it here for over a year now and most people love it. Documents are easier to find than before, and we don't lose documents like we used to.
    Just wanted to pass this one on.

  25. my situation by jjshoe · · Score: 1
    I am currently writing a simple database program for the company my mom works for. I am delivering this application for $0, for several reasons: 1) they won't pay for it, 2) the business is owned by three brothers, one of whom would die before switching to entirely paperless, 3) why not?


    2) is my biggest concern; however, it's easily worked around. The database holds first, middle, and last name, home/work phone, address, balance due, and product(s) bought.


    Currently all this information is stored on a 5x8 sheet of paper, one for each customer, which totals over a thousand of these. Each month my mom has to copy each one, stuff it in an envelope, and mail it. What a waste. I said I'd make her some database software to lighten the load on her. She claimed her boss would never use it because another similar company was doing all computer-based record keeping and lost it all to some bug. I then told my mother the easiest way to do it: you make another box for the cards and put a sign on it saying "changes". Every time they change a card they throw it in that box; she then takes the card out, enters/changes the information in the database, prints the new card, and files it back in the first box. That way a person can do whatever they are comfortable with, and at the end of the month my mother can just run her statements.


    No extreme setup fees or OCR needed. Whenever a card shows up in the "changes" box you enter the information into the computer, whether it be a first entry or the third entry.


    Backup -- the database will be backed up nightly as well as on paper. Every time a change is made you print out a new card and file it.


    Versatile -- they don't have to learn a damn thing if they don't want to. They can stick with what they know, or learn, whichever they feel is quicker.


    Me personally, I realize moving to paperless isn't easy or even quicker than filling out paper. But when it comes down to it, it at least makes the office look nicer :)

    --
    -- botsex is {grep;touch;strip;unzip;head;mount} /dev/girl -t {wet;fsck;fsck;yes;yes;yes;umount} {/de
  26. Canon software & DOS ? by chthon · · Score: 1

    You say you are mainly a Linux/Unix based operation? Then why haven't you tried DOSEMU yet with your DOS based Canon software? At least this could give you access to your documents on CD.

    1. Re:Canon software & DOS ? by xyster · · Score: 0

      Better yet, why not write a DOS application to read the files and convert them to a new format? If you're unable to do that yourself, why not hire someone who can? (There are more than enough programmers out there looking for work :) )

  27. Greenstone www.greenstone.org by Anonymous Coward · · Score: 0

    Greenstone is digital library software designed for this sort of thing. It can manage massive amounts of text, build indices and thumbnails, etc. www.greenstone.org