Preserving Old Research Notes and Documents?
twistedcubic asks: "I have several thousand 8.5 x 11 inch dead tree pages of notes and research that take up too much storage space. I would like to have all these notes scanned into PDF files (for example) so I can recycle the pages and reclaim storage space. Does anyone know of a store that provides this service, or an inexpensive machine that will do the job in a reasonable amount of time?"
"I have several thousand PDF files taking up too much disk storage space. I would like to have all these files printed on to 8.5 x 11 inch dead tree pages of notes so I can delete the files, empty the recycle bin and reclaim storage space. Does anyone know of a store that provides this service, or an inexpensive machine that will do the job in a reasonable amount of time?"
For future reference, I suggest a printer.
--BladeMelbourne
10-year-old nephew and a scanner.
Sorry to reply to my own post, but I felt bad about the unhelpfulness of my previous comment. I headed over to Visioneer's site (www.visioneer.com) and found a few scanners that handle like 25 pages at a time. The more you spend, the faster it scans. Sorry, I cannot personally recommend a scanner in particular. Never had one like this.
Good luck!
"Derp de derp."
ADF (Automatic Document Feeder) scanners are fairly pricey (good ones are in the US$400 - US$1000 range), but you can get a cheapie Brother MFC-3240C All-In-One (C$140) that has a 20-page document feeder and then get a slave (e.g. some grad student) to feed in your pages for you.
My Brother MFC-2340C scanner comes with the PaperPort application, which generates PDFs and supports double-sided scanning even though the scanner doesn't support it. (You just flip over the whole stack once you've scanned one side, and start scanning the other side. PaperPort knows how to automatically reconcile the pages.)
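For anyone whose scanning software doesn't do the flip-the-stack trick for you, the page reconciliation is easy to sketch yourself. This is only a guess at the general idea, not how PaperPort actually does it: the function name reconcile_duplex and the page labels are made up for illustration, and it assumes that flipping the whole stack means the backs come through the feeder in reverse order (check this against your own scanner).

```python
def reconcile_duplex(fronts, backs):
    """Merge two single-sided scan passes into reading order.

    fronts: pages from the first pass (front sides, in order).
    backs:  pages from the second pass, after flipping the whole stack.
    Assumption: the flip reverses the sheet order, so the back of the
    last sheet is scanned first.
    """
    if len(fronts) != len(backs):
        raise ValueError("both passes must contain the same number of pages")
    ordered = []
    # Walk the backs in reverse so each one is paired with its own front.
    for front, back in zip(fronts, reversed(backs)):
        ordered.extend([front, back])
    return ordered


fronts = ["p1", "p3", "p5"]  # first pass: odd pages in order
backs = ["p6", "p4", "p2"]   # second pass: even pages, reversed by the flip
print(reconcile_duplex(fronts, backs))
# prints ['p1', 'p2', 'p3', 'p4', 'p5', 'p6']
```

The length check matters in practice: a double feed or a skipped sheet in either pass throws off every pairing after it, so it is better to fail loudly than to produce a silently scrambled PDF.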
If you have Acrobat Professional, you can do a Paper Capture(TM) which is basically doing an OCR on the PDF and then storing the recognized words as "keywords" so that the PDF is searchable via Spotlight or other indexing mechanisms.
A document scanner is indeed a very useful piece of equipment -- I use it to scan notes and scrap paper containing rough ideas, often with lots of mathematics. Sometimes writing stuff on paper is just easier than typing in LaTeX...
The eminent computer scientist Edsger Dijkstra also liked to write stuff using pen and paper. His digitized works, called EWDs (after his initials, Edsger Wybe Dijkstra) are available here:
http://www.cs.utexas.edu/users/EWD/
Are the notes graphics-heavy (i.e., scientific/engineering)?
If not, give it to a typing service. Once you show them how much "stuff" you have, I'm sure they'll give you a discount. They might even agree to use OpenOffice2 (because it handles huge documents well, the files are small, and it has an excellent PDF exporter).
You'd still have to scan in the pictures/drawing/graphs, and place them appropriately, which will take time.
Also, there are firms that specialize in digitizing paper documents (mostly forms and regularized documents for businesses). Depending on the amount of handwriting & graphics, it might not be appropriate, though.
All in all, no matter how you do it, the project will
"I don't know, therefore Aliens" Wafflebox1
Disclaimer: I used to work for this company as a coop student.
I would contact PRG Schultz as they have done this for large clients in the past. They have a program called imDex which is pretty slick. Basically, it's a searchable, cross-indexable database, so you'll have OCR'd text, along with TIFFs or PDFs of the documents. If you would like more information, let me know.
Not unless the notebooks in question were made of acid-free archival paper. I've seen cheap paper falling apart in 5 years, irrecoverable in 10. Phase-change media, like CD-RW, will easily outlast my children.
-I like my women like I like my tea: green-
The fact of the matter is, documents on paper are not nearly as available as electronic copies. Hell, you could let thousands of people read all those documents at once for just a tiny amount of money in bandwidth costs (unless you have a university host it for free, which I'm sure they will). For most of us, this accessibility is easily worth keeping a backup of the data, even if it also requires us to move it to new media as time goes on (i.e. switch from floppies to CD-Rs to DVD-Rs to whatever every 5-10 years).
Go buy a modem and grab an old fax machine, then fax the documents to yourself. You should be able to fax a decent number of pages at a time and can walk away and leave it running. These will be saved as multi-page TIFFs which, while not searchable PDFs, at least solve part of your problem.
RandomAndInteresting.com defending the world from stupidity since 1979