Slashdot Mirror


Automated OCR for Forms Processing?

Oscar Carrillo asks: "We have a large NIH grant that collects tons of data, and much of that data is in the form of questionnaires. The forms will be available on the web, but it's mostly not feasible to have the subjects sit in front of the computer all day (not to mention that people get annoyed sitting in front of a computer all day). The study is being conducted at several universities and institutions around the country. Using Linux/JSP/Struts/PostgreSQL will take care of most of our needs. But it would save a lot of data entry if all forms could be scanned at each site, the images uploaded to the website, and then automatically put through OCR (Optical Character Recognition) to extract only the relevant raw data that subjects wrote. Does anyone know of something that can handle this? Are there any open source projects that can handle this? Any good commercial alternatives?"

2 of 30 comments

  1. p1ss fr0st by Anonymous Coward · Score: -1, Offtopic

    Fist P0st

    Alarm. Alarm. Baxter Meowmix alarm.

  2. UCFPKF by poopbot by Anonymous Coward · Score: -1, Offtopic

    How are things in the civilized world? You probably don't know who I am. That's
    okay. I'm here to inform you of my mission, what I've found, and what I hope to
    teach all of you.

    I work for the United Christians Food for Poor Kids Foundation, and let me tell
    you, there's a lot of poor kids in Afghanistan. As in most countries in the
    Middle East, most people are unemployed, and therefore poor. And where there's a
    lot of poor people, UCFPKF is needed.

    UCFPKF always has the latest in technology. In this instance, we had access to
    some Pentium 4's(r) 2GHz. Obviously, we needed an operating system that could
    handle the power of Intel's beast. Unfortunately, we didn't have any computer
    experts on hand up to the task, so it was going to be trial and error.

    We'd heard good things about Linux and its "ACL's". Little did we know of its
    incompatibility with modern hardware. It didn't even support Token Ring
    networking, the newest form of Ethernet(r), which we require to always keep
    in contact between bases. Also, it didn't seem to use SSE optimizations, which
    when processing food amounts, are also very important. Also, there were
    homo-erotic implications in the structure of Linux, which is strictly
    unallowable in a Christian organization such as ours.

    The next obvious step was to install Windows. We hesitated because we knew that
    it was common knowledge that Windows crashed incessantly. Our experience was
    less than stellar. It also didn't support Token Ring networking. Security is
    important in this region because many people try to steal food, but "Windows
    2000" (which I hear didn't even come out in 2000) doesn't even allow you to
    have separate permissions. Once again, the SSE optimizations were not used.

    I was in a situation that seemed impossible. The two most famous operating
    systems had failed me. I walked around the base in a dazed stupor. What was I
    going to do for our ultra-important network? A boy saw me pouting and sighing,
    and asked me what was wrong. I said nothing, but we exchanged names, and little
    did I know, that young Junis had a gift for computers.

    Junis saw me the next day, slaving away at the sparse terminal that "Windows
    2000" makes you type in. He asked what I was doing with that primitive OS. I
    laughed and told him that I was doing inventory. He ran to his village, into his
    hut, and pulled out a box I had never seen before. The box said "SCO Xenix" on the
    front. I had never seen or heard of this Xenix before. But I soon learned that
    Junis was a computer genius.

    All we had to do was put the Xenix CD into the computer, and everything worked
    like magic (not the devil's magic... good magic:) ). Our Token Ring network
    integrated flawlessly with it. And it even used SSE optimizations. Well, me and
    Junis are now on a new mission. We're spreading the word. It might not be the
    word of the lord, but then again, maybe it is ;).

    SCO Xenix: The Unix of Tomorrow.

    Janet Milman
    Network Administrator, UCFPKF
    Afghanistan base

    - posted by poopbot: information likes to be narrow

    B5P7oqT3PH