Google Pushes Open Source OCR
SocialWorm writes "Google has just announced work on OCRopus, which it says it hopes will 'advance the state of the art in optical character recognition and related technologies.' OCRopus will be available under the Apache 2.0 License. Obviously, there may be search and image search implications from OCRopus. 'The goal of the project is to advance the state of the art in optical character recognition and related technologies, and to deliver a high quality OCR system suitable for document conversions, electronic libraries, vision impaired users, historical document analysis, and general desktop use. In addition, we are structuring the system in such a way that it will be easy to reuse by other researchers in the field.'"
The goal of the project is to stop the damn email image spammers.
Among other things, sure, but it's got to be a high priority for Google.
... for Captchas? If Google is pushing OCR I could see it eventually becoming good enough to parse at least some types of captchas.
I've been hoping that someone with deep pockets (Google, IBM, Sun) would take on this area for a while.
There is a major need for an OSS OCR package, and right now the field is pretty bare. There's GOCR, and a commercial offering called OCRShop, and, at least among what I've run across, that's about it. Nothing is really on par with OmniPage or other commercial packages for other platforms.
I think there are some really neat applications for OCR that have never really been investigated, because it's so expensive to build that capability into other products. A free OCR engine that really worked could lead to some very neat book-scanning applications, just for starters. I don't think that there are really any integrated packages around for helping people scan books and manuscripts. (Right now you have to photograph the pages, keep them organized, then OCR them and proofread the text against the images. Bit of a nightmare.) I'd love to see a free application for libraries that let a user batch scan (via a digital camera -- let's not get into what I think of SANE and scanners generally) a book, and then provided a nice interface for proofreading the OCRed text against the original image.
Something like that could have a huge social impact. There are a lot of libraries where I'm sure they'd love to scan some of their out-of-copyright assets and provide them to patrons in a digital form, but it's just too technically complicated. An easy-to-use program that let the proofreading be done by nontechnical users (maybe remotely, as long as we're dreaming) could vastly increase the volume of digital materials available.
"Ladies and gentlemen, my killbot features Lotus Notes and a machine gun. It is the finest available."
And of course, as a side effect they'll probably wind up with a lovely distributed system for solving captcha. ;)
True, but CAPTCHAs always seemed like a bit of an inelegant hack anyway. First, they're horrible from a disabled-access standpoint, and second they're really not all that effective against a concerted enemy when there's a lot of money on the line. Spammers can just pay a few kids in some Third World country to sit there all day and solve CAPTCHAs if they want to.
Since message boards, which are the major users of CAPTCHAs, are practically by design little fiefdoms, I don't think they're nearly as hard to patrol as a common-carrier network like email. The solution to message-board spam is to either institute a moderator-delay (for small blogs and boards), or simply make enough admins with IP-ban powers so that the second someone starts spamming, they get banned and the spam gets deleted. Lameness filters working on the same principles as email spam-filters are probably helpful, too.
This'll be a much-needed boost for us Linux users who want to help out Project Gutenberg.
Okay, so one thing will lead to another and soon Google will be creating technology to recognize non-symbol shapes... How long before I can log in to my G-Accounts by smiling at my computer?
All you people who are worried about this breaking captchas seem to be missing something--there have been a number of fairly decent OCR packages out there for a long time. The goal of this Google project is to create an open-sourced one that does a good job deciphering HUMAN-READABLE TEXT. Captchas are far from human-readable (the good ones at least), and I seriously doubt this project will help very much in that arena.
This guy's the limit!
When we can make a computer that can tell the difference between a kitten and an adult cat (or hell, even another furred mammal) with any kind of accuracy, I think the LEAST of your problems at that point is coming up with captchas. You should be more worried about how you're going to escape from Skynet.
Have you tried GOCR? It's nice as a random number generator, but besides that... it's pretty much garbage.
CAPTCHAs are not restricted to images of letters. For example, you could ask people to solve a regular text question (this would also fix accessibility issues).
What about a free service to upload scanned images to and receive HTML in return?... Please....
Because you can - or because you should?
Now wait a second... you would rather upload a scanned image, which should be at a pretty decent resolution if you want good results, than run the OCR software locally? What, are you using a system with a 33MHz CPU or something?
"You're right," Fisheye says. "I should have set it on 'whip' or 'chop.'"
This just slows down the spammers but can't stop them. If you have a small number of choices, it's just a matter of how many times the spammer must try to get through. You present 10 images with 1 correct, the spammer has a 10% chance, and that's enough to make his business work. I'm not really in favor of CAPTCHAs, but multiple choice won't work for long.
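The 10% figure above can be worked through with a rough sketch. It assumes every attempt is an independent, uniformly random guess among the choices, which is a simplification and not something measured here. The function names are made up for illustration.

```python
def bypass_probability(choices: int, attempts: int) -> float:
    """Chance that at least one of `attempts` random guesses is correct.

    Assumes each guess is independent and uniform over `choices` options.
    """
    if choices < 1 or attempts < 0:
        raise ValueError("choices must be >= 1 and attempts must be >= 0")
    miss_once = 1.0 - 1.0 / choices          # probability a single guess fails
    return 1.0 - miss_once ** attempts       # complement of failing every time


def expected_successes(choices: int, attempts: int) -> float:
    """Average number of correct guesses out of `attempts` random tries."""
    if choices < 1 or attempts < 0:
        raise ValueError("choices must be >= 1 and attempts must be >= 0")
    return attempts / choices


if __name__ == "__main__":
    # 10 images, 1 correct: compare a single try with repeated tries.
    for tries in (1, 10, 50):
        print(tries, bypass_probability(10, tries))
    print(expected_successes(10, 1000))
```

Under those assumptions a bot that submits 1000 guesses gets through about 100 times on average, which is the point: a small choice set only taxes the spammer, it doesn't block him.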
Life is not a pastafrola.
Just for you, I made one, because I'm that fucking cool.
Prepare to be unimpressed. Results follow:
JABBERWOCKY Lewis Carroll
(from Through the Looking-Glass and What Alice Found There, 1872) `Twas bri11ig,_ andjghe 4s1it_hy toyes Digl gyre amid gimblejn thg wabe: All xiiimsy wei^e thg borogovgs, And theamome raths outigrabe. ''ggwqre thg Jalgbervvpck,_my sqn! The jaw; that bijtel the clayksathat catch! Bgyvaiie the Jubjub bird, anti shun The frumidus Bandersnatch!' I-Ie took his yorpal sword in hand: Long timg tlgewmangome foe he sought So rgSted he by the Tu_mtum tree, And stood awhile in thought. And, as in uffish thought he stood, The Jalgbgjwoclg, with eyes of flame, Cqmgwhjfflixgg through fhe tulgey wood, And burbled as it came! Qne, two! One, two! And through and thIi`Ollgh The jrorpgal b]ade went; snicaker-snack! I-Ie left iifdead, and with its head He went galumphing back. ''And, has thou slain thejabbexfwpck? Cpmg to my a_rxps!_my ljgaxjgishboyl Ojralqjousi dwgy! Qalladhl Callayl' He chortled in his joy. S
\ A S
X A ?`^s :
, ' Was ga. ka%#* mm. -- M 1 1 Q at ) a iv 2. `Ail A it 3*,* `i 2 (V H ;. ````( * 4 ^Nq@ Eu..*s..%im X M is ? lgh ~ ``A? S [ A Fax I /),2*gE it ^`* 4 ~ *: ' X A mg x ix, ,t~;;;..: v' it ix '~ t ~ ^ ,4~ ---= =-^ A A i gv ; * XX, x> . . N S A ft 1 A-`A 3; `> ' ''YY \Jh ^***`(?i* , ~~ x `* at -;v- *<~ ' H ~~~-=.- ; `Twas bri11ig,_ and_the 4s1it_hy toyes Dig gyre arid gimblejn the wabe; All Qiixjnsy wei^e thq borogovgs, And thdmome raths outvgrabe.
dshaw@iabbenNockv.com
Return to Glorious Nonsense Return to Lewis Carroll
Results End.
Beautiful, eh? I also tried a 100 dpi grayscale scan, which came out even more like hash (one big paragraph) and a 300 dpi bitmap (1bpp) which was about the same as the 100 dpi gray scan in quality, though a bit better.
Looks like OCRopus has a while to go before it can slay the Jabberwock instead of thejabbexfwpck.
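For anyone who wants to put a number on "like hash" rather than eyeball it, one common approach is a character error rate based on edit distance. This is a minimal sketch and not anything OCRopus ships; the function names are made up for illustration, and it assumes you have the correct text on hand to compare against.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: fewest single-character inserts, deletes,
    or substitutions needed to turn `a` into `b`."""
    prev = list(range(len(b) + 1))           # distances from "" to prefixes of b
    for i, ca in enumerate(a, start=1):
        cur = [i]                            # distance from a[:i] to ""
        for j, cb in enumerate(b, start=1):
            cur.append(min(
                prev[j] + 1,                 # delete ca
                cur[j - 1] + 1,              # insert cb
                prev[j - 1] + (ca != cb),    # substitute, or match at no cost
            ))
        prev = cur
    return prev[-1]


def char_error_rate(reference: str, ocr_output: str) -> float:
    """Edit distance divided by the length of the reference text."""
    if not reference:
        raise ValueError("reference text must not be empty")
    return edit_distance(reference, ocr_output) / len(reference)


if __name__ == "__main__":
    reference = "And the mome raths outgrabe."
    scanned = "And theamome raths outigrabe."   # one line of the output above
    print(edit_distance(reference, scanned))
    print(char_error_rate(reference, scanned))
```

Running the same comparison on the 100 dpi and 300 dpi results would give a like-for-like score instead of "a bit better".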