What is the average time to identify a card, start to finish? I saw one project, but the time per card was something like 10-60 seconds, and I can type a lot faster than that.
Card damage is a concern. However, this could be avoided by feeding cards loaded into protective sheets, provided the scanner was (a) robust enough to take that thickness and (b) gentle enough not to curl them.
My issue is OCR. The preprocessing is complex and makes my poor head hurt.
Odd that you should ask this question a few days after I started trying to create a solution for myself. This is a strictly for-profit venture for me. Apparently paying for my kid's college fund is naughty in some circles. Not sure how that works out for the world economy, but I digress.
I've spent about six days on this and might be able to save you some dead ends. Mostly I've found a lot of frustration.
My plan was to develop an app which could scan images of cards on a flatbed scanner 9 at a time, crop those to single images, then extract the trading card title. It would then run the title against any number of online databases for the card's current value.
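For anyone trying the same thing, here is a rough sketch of the "crop those to single images" step. It only computes the nine crop rectangles; it assumes the cards sit in a regular, straight 3x3 grid that fills the scan, which real scans may not. The function name grid_boxes and the margin parameter are made up for illustration. The tuples are in (left, upper, right, lower) order, which is the order an imaging library such as Pillow expects for cropping.

```python
def grid_boxes(width, height, rows=3, cols=3, margin=0):
    """Split a scan of the given pixel size into rows x cols crop boxes.

    Assumes the cards are laid out in an evenly spaced, unrotated grid
    that fills the whole scan (an assumption, not a given for real scans).
    Each box is (left, upper, right, lower) in pixels. `margin` trims that
    many pixels off every side of each cell, e.g. to drop sleeve edges.
    """
    boxes = []
    for r in range(rows):
        for c in range(cols):
            left = c * width // cols + margin
            upper = r * height // rows + margin
            right = (c + 1) * width // cols - margin
            lower = (r + 1) * height // rows - margin
            boxes.append((left, upper, right, lower))
    return boxes


if __name__ == "__main__":
    # Hypothetical scan size: a letter-size page at 300 dpi.
    for box in grid_boxes(2550, 3300, margin=10):
        print(box)
```

Each box could then be handed to whatever tool does the actual cropping. If the cards are not aligned to the grid, this simple split will cut through them, and some kind of edge detection would be needed instead.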
Going into it, I did not expect major issues. I've done OCR on many types of trading cards using Microsoft OneNote, and text extraction is nearly 100% accurate. So I figured this was simple. Not so.
I decided to use Tesseract, which seems to be the open source gold standard for OCR. However, I discovered rather quickly that Tesseract does almost no preprocessing of the image and spits out text that is perhaps 5% accurate at best on these cards. So I went to ImageMagick and GraphicsMagick to see if I could use them to format my incoming scans in a way that Tesseract could use. The Tesseract and ImageMagick communities have been very helpful in trying to find a solution; however, the reality is that no simple, or even sort of simple, solution exists. I'm shocked and amazed, but it seems that no real-world, out-of-the-box solution exists for open source OCR.
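To give an idea of what "preprocessing" means here, below is a minimal sketch of one common step: turning a grayscale image into pure black and white with Otsu's method, which picks the cutoff that best separates dark pixels from light ones. This is not something the Tesseract or ImageMagick folks gave me, just a textbook technique written in plain Python over a flat list of 0-255 gray values. The function names are made up, and in practice an image library would do this step. Whether it is enough for busy card art is an open question.

```python
def otsu_threshold(pixels):
    """Return the gray level (0-255) that best splits pixels into two groups.

    `pixels` is a flat list of integer gray values. The chosen level
    maximizes the between-class variance (Otsu's method).
    """
    hist = [0] * 256
    for p in pixels:
        hist[p] += 1

    total = len(pixels)
    sum_all = sum(level * count for level, count in enumerate(hist))

    best_t, best_var = 0, -1.0
    w_dark = 0    # number of pixels at or below the candidate level
    sum_dark = 0  # sum of gray values of those pixels
    for t in range(256):
        w_dark += hist[t]
        sum_dark += t * hist[t]
        if w_dark == 0:
            continue  # nothing on the dark side yet
        w_light = total - w_dark
        if w_light == 0:
            break     # everything is on the dark side, no split left
        mean_dark = sum_dark / w_dark
        mean_light = (sum_all - sum_dark) / w_light
        var_between = w_dark * w_light * (mean_dark - mean_light) ** 2
        if var_between > best_var:
            best_var, best_t = var_between, t
    return best_t


def binarize(pixels, threshold):
    """Map every pixel to 0 (black) or 255 (white) around the threshold."""
    return [255 if p > threshold else 0 for p in pixels]


if __name__ == "__main__":
    # Toy data: dark "text" pixels and light "background" pixels.
    sample = [20, 25, 30, 22] * 10 + [200, 210, 220, 205] * 10
    t = otsu_threshold(sample)
    print(t, binarize(sample, t)[:4])
```

A single global cutoff like this tends to struggle with text printed over artwork or colored title bars, which may be part of why the results on cards are so poor without more work.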
That is the stick. At this point I am at a crossroads. I have neither the programming skills nor the time to devote to creating OCR for this. There is a good project on GitHub for MTG cards using a webcam. However, it is specific to Magic cards and, from what I can tell, actually does image matching more than OCR. I am either going to abandon the project or, and this is corny, write a script to drop the files into OneNote and use that clunky interface for OCR. It's an awful, awful solution, but due to my limited programming skills and the lack of integration between image preprocessing and open source OCR, I think that is where I'm at.
I may be missing something, but I think this is where I'm at.
You get the text back in a second? That, my friend, is very impressive.
Not a bad idea. Except that the card names in MTG are hardly normal English words. That might complicate matters for voice.