Search Engines for Handwritten Documents
An anonymous reader writes "Researchers at the University of Massachusetts have created a tool for automatically searching handwritten historical documents, such as the 140,000 pages that make up George Washington's personal papers in the Library of Congress. The most interesting part is that the papers are scanned versions of the originals and the search tool actually recognizes the handwritten text from these images."
In America, handwriting is only for old people.
The most interesting part is that the papers are scanned versions of the originals and the search tool actually recognizes the handwritten text from these images.
How else would it search handwritten documents? Am I missing something here?
Huh? Well, let's see how well it keeps up with my doctor's handwriting...
Somebody invented a way for computers to recognize handwriting.
Like, so 10 years ago.
No OCR is performed on the documents. The search tool operates on the image.
Fair is where you take your cow to be judged.
Wow, looking at some of those examples, I was amazed by the fact that I couldn't READ most of the words. It looks completely foreign to me; I might as well be trying to read Japanese.
How good is the accuracy? The OCR technology of today might not be able to recognize the "flowery" text of most historical documents (look at "We the People" in the Constitution).
got sig?
These documents are old and handwritten. Why waste the processing power deciphering results for each search when you can decipher the text once with a similar algorithm and search an index built that way? It's not like the information is ever going to change. (unless we do rewrite history)
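A minimal sketch of the "decipher once, then search an index" idea from the comment above. It assumes a one-time recognition pass has already produced a text transcription per page; the names `build_index` and `search` and the sample pages are made up for illustration, and this is not a description of how the UMass tool works.

```python
from collections import defaultdict

PUNCTUATION = '.,;:!?"\'()'


def build_index(pages):
    """Build an inverted index: word -> set of page ids containing it.

    `pages` maps a page id to its transcribed text. The transcription is
    assumed to come from a one-time recognition pass (hypothetical here).
    """
    index = defaultdict(set)
    for page_id, text in pages.items():
        for raw in text.lower().split():
            word = raw.strip(PUNCTUATION)
            if word:
                index[word].add(page_id)
    return index


def search(index, query):
    """Return the ids of pages that contain every word in the query."""
    words = [w.strip(PUNCTUATION) for w in query.lower().split()]
    words = [w for w in words if w]
    if not words:
        return set()
    result = set(index.get(words[0], set()))
    for word in words[1:]:
        result &= index.get(word, set())
    return result


# Invented sample transcriptions, for illustration only.
pages = {
    1: "Orders for the regiment at Winchester",
    2: "Letters and orders, October 1755",
}
index = build_index(pages)
print(search(index, "orders winchester"))
```

The index is built once and each query is then a few set intersections, which is the trade-off the comment is pointing at: the expensive recognition step is paid up front instead of on every search.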
Google already did it! Well, it's not handwritten, but that's just a logical progression.
I hate reading/producing anything longer than a post-it note that's in handwriting.
The owls are not what they seem
I took a lot of notes in College. I took a lot more notes in graduate school. I've even taken notes on books I've read for the fun of it. If I could run all of these through my scanner & search them from an application on my desktop, I could be really obnoxious in an argument.
Trying to use sarcasm in text-based forums does not work.
You have to be able to handle a quill pen to use it.
Sometimes seventeen/Syllables aren't enough to/Express a complete
It's an interesting approach that should be extended to languages other than English. Most of the world's history is not about the US and it has certainly not been written down in English. What I would really like to have is a similar tool that can search, say, Greek, or Latin, (or whatever) handwritten text. Imagine being able to query Ovid for an item of interest without having to consult everything he's written. I can imagine that this might encourage people to study the classics (a pet peeve of mine is that many people lack historical sense...) and it would certainly facilitate research in this area.
If you can put the queries in English, with the search engine taking care of translation, it would be even better. Then, extended historical study comes within everyone's reach and the classical studies (or humaniora) might be transformed.
----- One learns to itch where one can scratch.
How pleafant that they've done what waf neceffary to make this happen. How did they train the foftware to recognize the quirky 18th Century handwriting?
And the brethren went away edified.
We could use it as a jobs program for monks. Their predecessors wrote the manuscripts, and now they could transcribe them into digital form...
A fine is a tax you pay for doing wrong and a tax is a fine you pay for doing all right.
It's "Pixelative Text Cognizance."
It's different. With OCR these rays of light scan the original, translate each scanpoint to discrete RGB values, and do pattern recognition.
With this system, they just read the discrete RGB values directly from pixels of documents scanned in with rays of light, then they do recognition of patterns. See, it's totally different.
If only Nicolas Cage had this tool at his disposal, it would have made things much, much easier.
Somebody invented a way for computers to recognize handwriting. Like, so 10 years ago.
I worked on an OCR system about 20 years ago. No pre-defined bitmaps of text, you trained the system on the font to be recognized. After a few hours you could turn it loose and it did fairly well. While goofing off we tried handwritten text. With good penmanship it worked to a degree.
great. now people are just going to spoof documents and put pr0n or enlargement spam in the pdfs when i search for anything academic. i'm glad i don't have that problem yet when finding pdf papers via google.
The only real threat is fire, and it is no more dangerous than it is to CDs or hard drives.
Go back and look at some old notebooks - if they used acid-based paper, then they'll be getting rather fragile.
Although it is hard to OCR text and very hard to OCR cursive text written in historical documents, performing searches on those documents does not require a complete comprehension of the text and is therefore much easier to do.
For instance, the software may be unable to distinguish the word bug from dog in one person's handwriting, but can still mark it with probabilities of the word's possible meanings.
If a person later searches for the word bug or dog along with other terms, a mathematical calculation can be done for the likelihood of the match, and the searcher can make his/her own judgement as to the meaning of the text.
---
Conrad Barski
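A rough sketch of the probabilistic matching described in the comment above, under stated assumptions: each word position on a page carries a dictionary of candidate readings with probabilities (produced by some recognizer that is not shown), and positions are treated as independent. The function names and the sample data are hypothetical, not taken from the UMass system.

```python
def score_page(page, query_terms):
    """Estimate the probability that a page contains every query term.

    `page` is a list of word positions; each position is a dict mapping a
    candidate reading to its probability, e.g. {"bug": 0.6, "dog": 0.4}.
    Positions are assumed independent, which is a simplification.
    """
    score = 1.0
    for term in query_terms:
        # Probability that no position on the page reads as this term.
        p_absent = 1.0
        for candidates in page:
            p_absent *= 1.0 - candidates.get(term, 0.0)
        score *= 1.0 - p_absent
    return score


def rank(pages, query):
    """Return (score, page_id) pairs with a nonzero score, best first."""
    terms = query.lower().split()
    scored = [(score_page(page, terms), page_id) for page_id, page in pages.items()]
    return sorted((pair for pair in scored if pair[0] > 0), reverse=True)


# Invented recognizer output, for illustration only.
pages = {
    "a": [{"bug": 0.6, "dog": 0.4}, {"report": 0.9}],
    "b": [{"dog": 0.8, "bug": 0.2}],
}
for score, page_id in rank(pages, "bug"):
    print(page_id, round(score, 2))
```

The ambiguous word is never forced into a single reading. Both pages come back for "bug", ordered by how likely the match is, and the searcher makes the final call by looking at the image.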