Slashdot Mirror


Preserving Old Research Notes and Documents?

twistedcubic asks: "I have several thousand 8.5 x 11 inch dead tree pages of notes and research that take up too much storage space. I would like to have all these notes scanned into PDF files (for example) so I can recycle the pages and reclaim storage space. Does anyone know of a store that provides this service, or an inexpensive machine that will do the job in a reasonable amount of time?"

10 of 101 comments

  1. Re:Not the ideal solution, but a start.. by NanoGator · · Score: 3, Informative

    Sorry to reply to my own post, but I felt bad about the unhelpfulness of my previous comment. I headed over to Visioneer's site (www.visioneer.com) and found a few scanners that handle like 25 pages at a time. The more you spend, the faster it scans. Sorry, I cannot personally recommend a scanner in particular. Never had one like this.

    Good luck!

    --
    "Derp de derp."
  2. Legal Services Firm by Anonymous Coward · · Score: 1, Informative

    There are tons of companies that specialize in electronic document scanning & OCR, usually for the legal industry. It would probably cost $0.05 to $0.10 a page, but you might be able to cut a deal as an individual rather than a law firm.
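
    A rough sketch of that arithmetic, assuming the rates are 5 to 10 US cents a page and assuming 3,000 pages (the question only says "several thousand"). The function name and the page count are made up for illustration:

    ```python
    def scanning_cost(pages, low_cents=5, high_cents=10):
        """Return the (low, high) outsourcing cost in dollars for `pages` pages."""
        # Work in whole cents first so the result is not skewed by float rounding.
        return pages * low_cents / 100, pages * high_cents / 100


    # 3,000 pages is an assumed figure, not one taken from the question.
    low, high = scanning_cost(3000)
    print(f"${low:,.2f} to ${high:,.2f}")  # prints "$150.00 to $300.00"
    ```

    So at those guessed rates the whole pile would land in the same price range as a cheap sheet-fed scanner, before any deal you manage to cut.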

  3. Scan to PDF with OCR behind the image by fatboy-fitz · · Score: 2, Informative

    There are companies that will do this for you. For example, IMC in WV (http://www.imcwv.com/). They can scan it all to PDF using the image as what you see in the PDF backed up with the OCR'd text. That way the document is somewhat searchable, but you always see the exact scan of the doc when you look at the PDF.

    --
    I'm better, because I'm bigger
  4. imDex by cstew · · Score: 3, Informative

    Disclaimer: I used to work for this company as a co-op student.

    I would contact PRG Schultz, as they have done this for large clients in the past. They have a program called imDex which is pretty slick. Basically, it's a searchable, cross-indexable database, so you'll have OCR'd text along with TIFFs or PDFs of the documents. If you would like more information, let me know.

  5. What are you going to store them on? by the+eric+conspiracy · · Score: 2, Informative

    The problem is then you have to come up with a safe long term way to store digital data.

    Clue:

    There isn't one.

    The best thing to do is NOT convert the paper to digitized format. Find some space instead, and store the paper. Your data will be much safer.

    1. Re:What are you going to store them on? by aminorex · · Score: 3, Informative

      Not unless the notebooks in question were made of acid-free archival paper. I've seen cheap paper falling apart in 5 years, irrecoverable in 10. Phase-change media, like CD-RW, will easily outlast my children.

      --
      -I like my women like I like my tea: green-
  6. Go low tech? by andreMA · · Score: 2, Informative

    If you just want to have it to refer to very infrequently and (possibly) print a page, look into having it filmed as microfiche. Viewers are fairly cheap and in a pinch a strong lens (loupe, possibly) will do.

    Many libraries have reader-printers on which you can print a copy for a small fee (e.g., $0.20/page?).

    Most of the expense with fiche is the production of the silver halide original; diazo copies are relatively cheap. If it's really important to you, have a copy made and lock the original film in a safe deposit box (or at least offsite).

  7. Re:In a few months time... by sribe · · Score: 2, Informative

    Check out the Fujitsu ScanSnap. It's their lowest-end document scanner, but it's still faster than all the slow consumer-level junk, and it comes with a version of Acrobat that will OCR the images and put the text in a "hidden" layer for searching.

  8. Re:Maybe you should try djvulibre by twistedcubic · · Score: 2, Informative

    Dude! I already found a $100 scanner that does the job and works in Linux (HP Officejet 4215). It scans really fast. My only problem up till now was that PDF rendering was too slow. But then I compared the results to DjVu... Wow! The DjVu files render incredibly fast! Thanks!

  9. DjVu, not PDF by TeXMaster · · Score: 2, Informative

    There is a file format which is specifically created for this kind of stuff, and it's called DjVu. There is a free (as in open source) reference library, and proprietary tools by LizardTech.

    (Of course, you will still need to spend lots of time scanning, naming and classifying those pages. The ADF and 10yo nephew suggested in another post might be useful for that.)

    DjVu offers a very compact representation without the need to OCR the document (I've converted a 13 MB scanned PDF into a 600K DjVu which was much faster and easier to read), and optionally a "hidden text layer" if you want to OCR it to make it searchable.
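
    To put a number on that one conversion, here is a quick sketch of the ratio, assuming 1 megabyte = 1024K (the helper name is made up for illustration, and other scans will compress differently):

    ```python
    def compression_ratio(original_kb, compressed_kb):
        """Return how many times smaller the compressed file is."""
        return original_kb / compressed_kb


    pdf_kb = 13 * 1024  # the 13 megabyte scanned PDF, assuming 1 MB = 1024K
    djvu_kb = 600       # the resulting DjVu file
    ratio = compression_ratio(pdf_kb, djvu_kb)
    print(f"about {ratio:.0f}x smaller")  # prints "about 22x smaller"
    ```

    That is a single data point from one document, so treat it as a ballpark rather than something to expect from every scan.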

    --
    "I'm never quite so stupid as when I'm being smart" (Linus van Pelt)