Large-Scale Paper-To-Digital Conversion?
An anonymous reader writes "I've just been asked to digitize several dozen sets of lecture outlines at the university where I work. Basically, professors want to hand me a big (often 100+ page) stack of their handwritten lecture notes (with messy text, equations, and diagrams; sometimes double-sided) and expect me to post a PDF-or-something-similar to their course's web page. However, every desktop scanner I've ever used takes 1-2 minutes of user attention per page, and the resulting files end up huge, impossible to read, or both. All I have at my disposal is my PowerBook, Acrobat, a couple hundred dollars of department funds for a new scanner, and, if I ask nicely, overnight use of the secretary's Win2k box. Any ideas? Sheet-fed scanner recommendations? Better file formats than PDF (or better PDF settings)? Do any of you students have usability advice?"
Just fax the documents to a computer.
I think you're right on the money. It may well be worth taking the job to an outside agency. Many print shops use Xerox Docutechs, which scan in many hundreds of sheets at once to print copies of documents. The scanning takes barely a second a page, and it wouldn't surprise me if the document format stored inside the Docutech is something that can be used for this purpose.
I've had a similar job, where our school's lecturers wanted their notes in the same style, so one of my jobs as admin assistant was retyping chapters from textbooks and inserting the original illustrations. That wasn't too bad at first, until lecturers started basing course notes on entire quarters of books and expecting them to be retyped completely in their own style. Give them an inch and they'll try to take a mile: use the few hundred dollars to get it professionally scanned.
Do it the open source way.
Get several (dozen) other students to use their own equipment and time in exchange for a copy/copies of the completed work.
I would hazard a guess that there are more than a few people who would like to have a copy of the complete series of the lecture outlines.
Get some students of the professor's course to type them into LaTeX. Give them some points they'd otherwise get for homework.
a) Publication quality DVI/PS/PDF files
b) The student can deepen their knowledge of the topic
Everyone is happy. It used to work like this at the university I went to. And you may even be lucky enough to find that some student has already typed these notes in for himself.
I tend to scan lots of documents, and I have set up a simple Perl script that uses the 'scanimage' command line tool to do the scanning. Using my Epson Perfection 1650 scanner (a pretty standard flatbed scanner) I can scan an 8"x10" page in black & white mode in about 10 seconds.
I actually added a button to the Nautilus GUI shell so I can move to the directory I want and hit the button to scan a page to that directory. Very convenient.
I scan to TIFF and then use the convert utility (part of ImageMagick) to convert to PNG. The resulting files typically run about 100 KB to 200 KB depending on the content.
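For anyone who wants the general shape of that scan-then-convert step without reading Perl, here is a minimal Python sketch. It is not the script mentioned in this post, just an illustration under stated assumptions: the function names, the page001-style file naming, the 300 dpi resolution, and the "Lineart" mode name are all made up for the example, and the mode names your scanner accepts depend on its SANE backend (check `scanimage --help`).

```python
import subprocess
from pathlib import Path


def scanimage_command(mode="Lineart", resolution=300):
    """Build the scanimage argument list; scanimage writes the image to stdout.

    "Lineart" and 300 dpi are assumptions; valid values depend on the scanner backend.
    """
    return ["scanimage", "--format=tiff", "--mode", mode,
            "--resolution", str(resolution)]


def convert_command(tiff_path, png_path):
    """Build the ImageMagick convert call that turns the TIFF into a PNG."""
    return ["convert", str(tiff_path), str(png_path)]


def page_paths(out_dir, page_number):
    """Return (tiff, png) paths such as page007.tiff / page007.png (hypothetical naming)."""
    stem = f"page{page_number:03d}"
    out_dir = Path(out_dir)
    return out_dir / f"{stem}.tiff", out_dir / f"{stem}.png"


def scan_page(out_dir, page_number, run=subprocess.run):
    """Scan one page to a temporary TIFF, convert it to PNG, and delete the TIFF.

    The `run` parameter exists only so the function can be exercised without a scanner.
    """
    tiff_path, png_path = page_paths(out_dir, page_number)
    with open(tiff_path, "wb") as tiff_file:
        run(scanimage_command(), stdout=tiff_file, check=True)
    run(convert_command(tiff_path, png_path), check=True)
    tiff_path.unlink()  # keep only the smaller PNG
    return png_path
```

Calling `scan_page("notes", 1)` once per sheet (with an existing "notes" directory) would leave one PNG per page; a loop that waits for Enter between pages is the obvious next step for a flatbed.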
If anyone's interested in seeing the Perl script, I've posted it at: www.ollies.net/scanscript.html
Steve