Slashdot Mirror


Ask Slashdot: Automated Tool To OCR CCGs Like Magic: the Gathering?

An anonymous reader writes: I buy massive collections of trading card games: Magic: The Gathering, Yu-Gi-Oh!, Pokemon, Weiss Schwarz, Cardfight Vanguard, etc. I've gotten the process fairly streamlined as far as price checking, grading, and sorting go. Part of my process involves using higher-quality webcams positioned over the top of the cards, which are in a stack. I keep a cam window on the screen to show a larger, brighter version of the card. What I'm wondering: is there an OCR solution out there that will look at the same spot on the screen, capture it, OCR it, and dump the result to the clipboard? I've tried several open source solutions, but none of them quite fit my needs. What I'd really like is to be able to hit a hotkey and have my clipboard populated with the text from the graphics in a pre-set x,y window range. I may be asking for a lot, but then again, I'm sure someone out there has needed this type of setup before. Anyone have any recommendations?

59 of 96 comments

  1. Re:Why not get someone to make it for you? by binarylarry · · Score: 2

    I bet wizards of the coast will be totally cool with that.

    --
    Mod me down, my New Earth Global Warmingist friends!
  2. Re:Why not get someone to make it for you? by Anonymous Coward · · Score: 2

    $25 seems like a good deal, or did you mean $25,000 rather than $25.000?

  3. Image Database by Anonymous Coward · · Score: 2, Interesting

    A different method would be to have frames from the webcam be compared to a database of images and tally the matches. Space bar could serve as the "capture and compare image" function. Similar to http://www.tineye.com but local and with a limited data set.

    1. Re:Image Database by Anonymous Coward · · Score: 4, Interesting

      And, as a bonus, that application has already been done specifically for MTG cards, e.g. https://github.com/tenderlove/magic_scan

    2. Re:Image Database by hjf · · Score: 3, Interesting

      This is what I did: https://www.youtube.com/watch?...

  4. I don't have a solution by halivar · · Score: 5, Funny

    But I just wanted to say that you are perhaps the biggest nerd I have ever been aware of. I mean that as a sign of respect.

    1. Re:I don't have a solution by hjf · · Score: 2
    2. Re:I don't have a solution by antdude · · Score: 1

      I thought I was the biggest nerd/geek. :P

      --
      Ant(Dude) @ Quality Foraged Links (AQFL.net) & The Ant Farm (antfarm.ma.cx / antfarm.home.dhs.org).
  5. CCG by backslashdot · · Score: 1

    I read the title and thought this article was going to be about DNA and the amino acid proline.

  6. ImageMagick by Anonymous Coward · · Score: 5, Informative

    Grab an OCR system off of https://help.ubuntu.com/community/OCR. Get ImageMagick. Get streamer (package xawtv). Create a script on the order of:

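    # Capture one webcam frame, crop the region of interest, and OCR it.
    now=$(date --iso-8601=ns)
    file="$now.png"
    outfile="$now-cropped.png"
    streamer -c /dev/video0 -b 32 -o "$file"
    # Crop geometry is WIDTHxHEIGHT+X+Y; adjust it to the card's text area.
    convert "$file" -crop 40x80+150+120 "$outfile"
    gocr "$outfile" > "$now.txt"
    rm "$outfile"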

    Now create a keyboard shortcut with your window manager to run this script, or open a terminal and get used to pressing up and enter a lot.

    If you're not on Linux, sorry.
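
    The submitter also asked for the text to land on the clipboard. A rough Python sketch of the same pipeline follows. It reuses the streamer, convert, and gocr commands from the script above; the use of xclip for the clipboard step, and the function and file names, are assumptions for illustration and not something from the original post.

```python
import subprocess


def crop_geometry(width, height, x, y):
    """Format an ImageMagick geometry string: WIDTHxHEIGHT+X+Y."""
    return f"{width}x{height}+{x}+{y}"


def build_pipeline(frame, cropped, width, height, x, y, device="/dev/video0"):
    """Return the capture, crop, and OCR commands as argument lists."""
    return [
        ["streamer", "-c", device, "-b", "32", "-o", frame],
        ["convert", frame, "-crop", crop_geometry(width, height, x, y), cropped],
        ["gocr", cropped],
    ]


def capture_to_clipboard(frame, cropped, width, height, x, y):
    """Run the pipeline and copy the OCR text to the X11 clipboard.

    Assumes streamer, ImageMagick, gocr, and xclip are installed.
    """
    capture, crop, ocr = build_pipeline(frame, cropped, width, height, x, y)
    subprocess.run(capture, check=True)
    subprocess.run(crop, check=True)
    text = subprocess.run(ocr, check=True, capture_output=True, text=True).stdout
    # xclip reads the text to copy from standard input (assumed tool choice).
    subprocess.run(["xclip", "-selection", "clipboard"], input=text, text=True, check=True)
    return text
```

    Binding a call to capture_to_clipboard() to a window-manager shortcut would give the one-hotkey workflow the submitter describes.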

    1. Re:ImageMagick by jones_supa · · Score: 2

      This. He might be looking for a single monolithic program, but his problem is actually completely solvable with clever usage of UNIX. It's the perfect platform for creating a customized pipeline for this kind of task.

    2. Re:ImageMagick by Anonymous Coward · · Score: 3, Funny

      This. He might be looking for a single monolithic program, but his problem is actually completely solvable with clever usage of UNIX. It's the perfect platform for creating a customized pipeline for this kind of task.

      Also in 2 days it will be integrated into systemd.

    3. Re:ImageMagick by Anonymous Coward · · Score: 3, Funny

      Which means it was already part of emacs.

    4. Re:ImageMagick by CronoCloud · · Score: 4, Informative

      Or just have it run continuously, snapping pictures every 8 seconds or so, then all they have to do is swap cards.

      while true; do
          echo "Preparing to scan new MTG card in 8 seconds"
          # Count down so there is time to swap in the next card.
          for i in $(seq 8 -1 1); do
              echo "$i"
              sleep 1
          done
          now=$(date --iso-8601=ns)
          file="$now.png"
          outfile="$now-cropped.png"
          streamer -c /dev/video0 -b 32 -o "$file"
          convert "$file" -crop 40x80+150+120 "$outfile"
          gocr "$outfile" > "$now.txt"
          rm "$outfile"
      done

    5. Re:ImageMagick by Anonymous Coward · · Score: 1

      All it needs now is a decent text editor.

  7. Re:Why not get someone to make it for you? by halivar · · Score: 1

    Some European countries swap the , and ..

  8. Re:Why not get someone to make it for you? by tepples · · Score: 1

    Some European countries swap the , and ..

    True, but do any of those countries have English as an official language? I thought the choice of thousands or decimal separator depended on the language of the surrounding words: French uses one convention, German another, etc.

  9. Re:There are really good options today. by Garridan · · Score: 1

    The submitter is actually handling meatspace items. RTFS: price checking, grading, sorting, etc.

  10. Decked Builder by DarrenBaker · · Score: 5, Informative

    I use it every day. The Android app is phenomenal at picking the right card from the database based on the picture. The only real problem is that it doesn't have all the alternate art versions of cards from older MTG sets. The interface is a bit sloppy on the desktop version, but the recognition is pretty good.

  11. Re:Why not get someone to make it for you? by halivar · · Score: 4, Insightful

    I can tell you that when I lived in Germany, even if I was writing in German, I got the decimal notation wrong every single time. I was just too used to my way of doing it.

  12. Computer Vision by TheCreeep · · Score: 1

    It can be done by scraping the database of cards, creating a model out of them, then matching the new card to the database.

    So how much is this worth to you?

    1. Re:Computer Vision by hjf · · Score: 1

      This is what I did, exactly for what OP wants: https://www.youtube.com/watch?...

    2. Re:Computer Vision by hjf · · Score: 1
  13. IDK by ssam · · Score: 4, Funny

    OMG WTF TLA OCR CCGs?

    1. Re:IDK by ArhcAngel · · Score: 1

      I feel embarrassed. The only acronym I didn't know was TLA.

      --
      "A person is smart. People are dumb, panicky dangerous animals and you know it." - K
    2. Re:IDK by Anonymous Coward · · Score: 1

      Person A: What does TLA stand for?
      Person B: Three-letter Acronym
      Person A: I know it's a three-letter acronym, but what does it stand for?
      Person B: Three-letter Acronym
      Person A: I said I already know that! What does it stand for?
      Person B: It's three-letter acronym.
      Person A: I said I already know that it's a three-letter acronym! I JUST WANT TO KNOW WHAT IT STANDS FOR!
      Person A goes to Google...
      Person A: Oh, I get it. Wait, can't it also stand for two-letter acronym too?
      Person B facepalms

    3. Re:IDK by jellomizer · · Score: 1

      That is a problem with Slashdot.
      Different geeks have their own areas of specialty and their own sets of acronyms, which often collide with acronyms from other fields. Then you mix in political acronyms and company acronyms. It gets messed up.

      Also, there are times when an acronym isn't used much, and the poster just decides to use it anyway.
      For example, "Network Neutrality" becomes NN. There can be a big topic on, say, how Verizon is fighting NN, and you are left guessing what the story is about. Is NN some sort of wireless frequency name? Perhaps they are talking about New Nodes.

      The general convention, if you are going to use an acronym, is to spell it out once so we get the gist.

      But what makes it worse is that we are so proud of our geekiness we rarely ask what does it mean, as it would make it seem like we need to hand in our geek card.

      --
      If something is so important that you feel the need to post it on the internet... It probably isn't that important.
    4. Re:IDK by gstoddart · · Score: 1

      I feel embarrassed. The only acronym I didn't know was TLA.

      LOL!!

      --
      Lost at C:\>. Found at C.
  14. Camera, timer, done. by YrWrstNtmr · · Score: 1

    A regular digital camera, on a tripod, 5 second timer
    A Canon P&S, CHDK (intervalometer), swap the card before it clicks again.

  15. Big deal, already done. by www.sorehands.com · · Score: 1

    Seto Kaiba has already done it, but added holograms.

  16. Re: There are really good options today. by Garridan · · Score: 1

    The submitter is collecting magic cards, and cataloging them for resale. Sounds "for profit" to me... and not like that fact is hidden.

  17. There is already shortcut for this. by Anonymous Coward · · Score: 3, Funny

    In Emacs: Ctrl + M + T + G. Also runs a Monte Carlo on the last 3000 cards scanned and outputs the optimal 60 card deck and registers you in the nearest FNM.

    1. Re:There is already shortcut for this. by Immerman · · Score: 2

      Oh crap, I thought that was Ctrl + F + H + T + A + G + N. What the hell have I been doing!?!?! Come to think of it though, that might explain all the encounters I've been having with transdimensional horrors gibbering for my soul.

      --
      --- Most topics have many sides worth arguing, allow me to take one opposite you.
  18. Re:Yes. There is. by UnknownSoldier · · Score: 1

    /sarcasm ...

    Gee, if only someone would invent a device to do repetitive work.

    It would follow a set of what I'll call instructions.

    And instead of hard-coding them, it would be programmable, so that it is more flexible.

    I even have a name for it! A computer, because it "computes" the math along the way it needs.

    Nah, that will never sell.

  19. Re:There are really good options today. by Immerman · · Score: 1

    Hey, people can be convinced to buy any old ridiculous thing - just look at baseball cards. Or stamps. Or tulip bulbs. It seems a certain percentage of the population has an obsessive compulsion to hoard things and, all in all, playing cards beat old chicken bones or pizza boxes.

    --
    --- Most topics have many sides worth arguing, allow me to take one opposite you.
  20. Re:Tineye or similar? by Immerman · · Score: 1

    Hashes/checksums are unlikely to be of any use - all it takes is one pixel being a slightly different color and the hash will change completely, unless it's a fairly worthless hash to begin with.

    There are various techniques by which you could "fingerprint" images in a more variation tolerant manner, but they have nothing to do with hashes/checksums, which are specifically designed to be able to detect even single-bit changes.
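
    The single-bit sensitivity of a checksum-style hash is easy to see with a small Python sketch. SHA-256 is used here only as a familiar example of such a hash, and the byte string is a made-up stand-in for image data:

```python
import hashlib


def bit_difference(a: bytes, b: bytes) -> int:
    """Count the bit positions where two equal-length byte strings differ."""
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))


def digest_change(pixels: bytes, index: int) -> int:
    """Flip the lowest bit of one pixel and count how many digest bits change."""
    altered = bytearray(pixels)
    altered[index] ^= 1
    before = hashlib.sha256(pixels).digest()
    after = hashlib.sha256(bytes(altered)).digest()
    return bit_difference(before, after)


# Stand-in for 1024 grayscale pixels; real input would be decoded image data.
image = bytes(range(256)) * 4
print(digest_change(image, 0), "of 256 digest bits changed")
```

    Typically about half of the digest bits flip, so the two digests say nothing about how similar the images were.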

    --
    --- Most topics have many sides worth arguing, allow me to take one opposite you.
  21. Re:Why not get someone to make it for you? by Anonymous Coward · · Score: 1

    I just use spaces for the digit grouping and a period or comma for the decimal. That way seems the least ambiguous to me.

  22. If you're looking for OCR software, try Debian by Johnny+Loves+Linux · · Score: 1

    I just did a quick check for OCR software:

    $ apt-cache search ocr | grep -v ^lib | grep -i ocr | grep -i -v language | grep -v motocross
    fonts-ocr-a - ANSI font readable by the computers of the 1960s
    fuzzyocr - spamassassin plugin to check image attachments
    gimagereader - Graphical GTK+ front-end to tesseract-ocr
    gocr - Command line OCR
    gocr-tk - tcl/tk wrapper around gocr
    python-gamera.toolkits.greekocr - toolkit for building OCR systems for polytonal Greek
    hocr-gtk - GTK+ frontend for Hebrew OCR
    python-gamera.toolkits.ocr - toolkit for building OCR systems
    ocrad - optical character recognition program
    ocrfeeder - Document layout analysis and optical character recognition system
    ocrodjvu - tool to perform OCR on DjVu documents
    r-cran-rocr - GNU R package to prepare and display ROC curves
    tesseract-ocr - Command line OCR tool
    tesseract-ocr-dev - transitional dummy package

  23. Re:Why not get someone to make it for you? by Hognoxious · · Score: 1

    Maybe if it's a price, but it wouldn't be valid as an amount. Or can you tell me whose head is on the one mil coin?

    --
    Confucius say, "Find worm in apple - bad. Find half a worm - worse."
  24. Re:Why not get someone to make it for you? by meerling · · Score: 2

    The smallest coin in the USA is the 1 cent coin, yet all gas stations have their prices end with 9 tenths of a cent. (Obviously they automatically round it up and keep it. Essentially it's a real world version of the stealing a fraction of a cent scam that has been used in movies for a very long time.)

  25. Let me know what you come up with by tehlinux · · Score: 1

    No hurry though, still waiting to get my holographic Charizard back from Mt Gox...

    --
    Most linux users don't know this, but the man pages were named after Chuck Norris. Chuck Norris fsck'ing hates noobs!
  26. Re:Why not get someone to make it for you? by binarybum · · Score: 1

    I think he meant if you are dumb enough to take this AC post seriously, I will happily take your $25.

    --
    ôó
  27. Re: There are really good options today. by EdwardFurlong · · Score: 2

    I do not know all that much about MTG, but my SO's kid plays it. There seems to be a new release of cards every few months. Certain cards get re-released, some old ones cannot be used in tournament play, and a damaged card can't be used either. There are all sorts of rare cards on whatever scale they use. I really think it is what is keeping comic book shops going. People will spend hundreds if not thousands buying cards. I am sure you can imagine the person who has to win spending that much on just one card.

  28. Re:Tineye or similar? by retchdog · · Score: 3, Interesting

    uh, you're thinking of cryptographic/non-invertible/fast-mixing/whatever hashes specifically. it's not exactly defined what a hash is, but generally it means a possibly many-to-one (i.e. lossy) function of data, usually with outputs of fixed (or parametrizable) size.

    for example, an OCR is a hash; it (ideally) hashes images of arbitrary dimension into an output space of characters according to which one it most resembles; similarly for any other image recognizer.

    --
    "They were pure niggers." – Noam Chomsky
  29. I've been working on this project for a week by jswolcott · · Score: 1

    Odd that you should ask this question a few days after I started trying to create a solution for myself. This is a strictly for-profit venture for me. Apparently paying for my kid's college fund is naughty in some circles. Not sure how that works out for the world economy, but I digress. I've spent about six days on this and might be able to save you some dead ends. Mostly I've found a lot of frustration.

    My plan was to develop an app which could scan images of cards on a flatbed, 9 at a time, crop those to single images, then extract the trading card title. It would then run the title against any number of online databases for the current value of the card. Going into it I did not expect major issues. I've done OCR on many types of trading cards using Microsoft OneNote, and text extraction is nearly 100% accurate. So I figured this was simple.

    Not so. I decided to use Tesseract, which seems to be the open source gold standard for OCR. However, I discovered rather quickly that Tesseract does almost no preprocessing of the image and spits out text that is perhaps 5% accurate at best on these cards. So I went to ImageMagick and GraphicsMagick to see if I could use them to format my incoming scans in a way that Tesseract could use. The Tesseract and ImageMagick communities have been very helpful in trying to find a solution, but the reality is that no simple, or even sort of simple, solution exists. I'm shocked and amazed, but it seems that no real-world, out-of-the-box solution exists for open source OCR. That is the sticking point.

    At this point I am at a crossroads. I have neither the programming skills nor the time to devote to creating OCR for this. There is a good project for MTG cards using a webcam on GitHub. However, it is specific to Magic cards and, from what I can tell, actually does image matching more than OCR. I am either going to abandon the project or, and this is corny, write a script to drop the files into OneNote and use that clunky interface for OCR.

    It's an awful, awful solution, but due to my limited programming skills and the lack of integration between image preprocessing and open source OCR, I think that is where I'm at. I may be missing something.
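
    One thing that can soften the accuracy problem: the OCR output only has to be close enough to pick the right title out of a known list. Below is a minimal Python sketch of that idea using the standard library's difflib. The four card names and the 0.6 cutoff are illustrative assumptions; a real list would come from whichever card database is being used.

```python
import difflib

# Hypothetical sample of known titles; a real list would be far longer.
CARD_NAMES = ["Lightning Bolt", "Lightning Helix", "Black Lotus", "Counterspell"]


def match_title(ocr_text, names=CARD_NAMES, cutoff=0.6):
    """Return the known card name closest to noisy OCR text, or None."""
    # Collapse whitespace and ignore case before comparing.
    cleaned = " ".join(ocr_text.split()).lower()
    lowered = {name.lower(): name for name in names}
    hits = difflib.get_close_matches(cleaned, list(lowered), n=1, cutoff=cutoff)
    return lowered[hits[0]] if hits else None


print(match_title("Llghtning  Bo1t\n"))  # prints "Lightning Bolt"
```

    This does nothing for scans where the OCR output is almost entirely wrong, so it complements image preprocessing and does not replace it.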

  30. Re: There are really good options today. by Feyshtey · · Score: 1

    Why is that a problem? There's nothing at all nefarious about buying and selling MTG or any other CCGs. In fact, the producers of the cards rely on it. It fuels a good portion of their sales.

    As far as becoming a rival to a site, so what? If he can do it better, then more power to him. If he can get people to help him do it better for free, even more so. In the end, what do you care, unless you have some vested interest in one of the services that already does this kind of thing?

    --
    "But we have to pass the bill so that you can find out what is in it,..." - Nancy Pelosi
  31. Re:Why not automate? by jswolcott · · Score: 1

    Card damage is a concern. However this could be avoided by feeding cards loaded in to protective sheets provided the scanner was a. robust enough to take that thickness, and b. gentle enough not to curl them. My issue is ocr. The preprocessing is complex and makes my poor head hurt.

  32. Re: There are really good options today. by CanHasDIY · · Score: 1

    Why is that a problem?

    Because it's another example of turning Ask Slashdot into Slashdot, Please Do My Job.

    Remind me to never ask you for directions. Sheesh.

    --
    An enigma, wrapped in a riddle, shrouded in bacon and cheese
  33. Re:There are really good options today. by CanHasDIY · · Score: 1

    Search Ebay for sold items with "Box Only" in the title. You'd be amazed how much some OEM boxes sell for.

    --
    An enigma, wrapped in a riddle, shrouded in bacon and cheese
  34. Re:There are really good options today. by Immerman · · Score: 1

    Indeed. I'm unconvinced though that a lot of those prices aren't due to careless idiots not paying close enough attention to what they're buying. Especially considering the deceptive images often included.

    --
    --- Most topics have many sides worth arguing, allow me to take one opposite you.
  35. Re:There are really good options today. by Immerman · · Score: 1

    Oh, and then I suppose there's money laundering as well. It's not uncommon to see things selling on ebay, Amazon, etc. with prices that are hard to explain any other way.

    --
    --- Most topics have many sides worth arguing, allow me to take one opposite you.
  36. Re:Voice recognition by jswolcott · · Score: 1

    Not a bad idea. Except that the card names in MTG are hardly normal English words. That might complicate matters for voice.

  37. I've had this working for a few months... by HanClinto · · Score: 1

    Currently I'm using OpenCV and a lot of glue code to scan real-time video and recognize cards for MtG. The database is easily extendable for Pokemon, Yugioh, L5R, and other card games.

    I wrote it in Python on the PC, and recently ported it over to native Android. So far it works really well, and you can see a screenshot of it in action right here:
    http://imgur.com/gallery/v44gIbB

    Like others, I'm trying to put my kids through college, and am not quite willing to open-source my months of work just yet. However, I'm not looking to scalp anyone, and my rates are very reasonable. Feel free to PM me if you would like me to license this library to you -- it would be a fairly turn-key solution for you.

    1. Re:I've had this working for a few months... by jswolcott · · Score: 1

      What is the average time to identify a card, start to finish? I saw one project, but the time per card was something like 10-60 seconds, and I can type a lot faster than that.

    2. Re:I've had this working for a few months... by HanClinto · · Score: 1

      Maybe 0.1 seconds, on average? Not quite fast enough for 30 fps, but close enough that the lag isn't really noticeable to the user.

    3. Re:I've had this working for a few months... by jswolcott · · Score: 1

      You get the text back in a second? That, my friend, is very impressive.

    4. Re:I've had this working for a few months... by HanClinto · · Score: 1

      Thanks!

      Yeah, it could possibly be sped up a bit, but right now I'm doing a linear search for the nearest Hamming distance in a data set of about 25k cards (all of the MtG cards printed) -- if I were to optimize the Hamming search with a tree of sorts (similar to the algorithms used for spell-checker algorithms) I could possibly speed it up, but no need to prematurely optimize things at this point.
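
      A linear nearest-Hamming-distance search of the kind described above could look roughly like the Python sketch below. The 8-bit fingerprints and card names are made up for illustration, and nothing here is taken from the actual implementation.

```python
def hamming(a, b):
    """Number of bit positions where two integer fingerprints differ."""
    return bin(a ^ b).count("1")


def nearest_card(query, database):
    """Scan every fingerprint and return (name, distance) of the closest one.

    `database` maps card names to integer fingerprints.
    """
    best_name, best_dist = None, None
    for name, fingerprint in database.items():
        dist = hamming(query, fingerprint)
        if best_dist is None or dist < best_dist:
            best_name, best_dist = name, dist
    return best_name, best_dist


# Made-up 8-bit fingerprints; real ones would be longer.
database = {"Card A": 0b11110000, "Card B": 0b00001111, "Card C": 0b10101010}
print(nearest_card(0b11110001, database))  # prints ('Card A', 1)
```

      The scan is O(n) per lookup, which is why a tree-style index only becomes worth the effort once the data set or the frame rate grows.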

    5. Re:I've had this working for a few months... by HanClinto · · Score: 1

      I should also note that I'm not doing an OCR based method -- I'm using a "fingerprinting" perceptual hashing method that instead looks at the entire picture of the card (similar to how Google's "Find Similar Images" function works)
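
      For readers unfamiliar with perceptual hashing, here is a minimal Python sketch of one simple variant, the average hash. It is an assumed example of the general technique, not the method used in the project described above: shrink the image to a small grid, then record which cells are brighter than the mean.

```python
def average_hash(pixels, size=8):
    """Return a size*size-bit fingerprint of a grayscale image.

    `pixels` is a list of rows of brightness values (0-255). Its height and
    width are assumed to be multiples of `size`.
    """
    height, width = len(pixels), len(pixels[0])
    bh, bw = height // size, width // size
    # Shrink the image by averaging each bh x bw block into one cell.
    cells = []
    for by in range(size):
        for bx in range(size):
            total = 0
            for y in range(by * bh, (by + 1) * bh):
                for x in range(bx * bw, (bx + 1) * bw):
                    total += pixels[y][x]
            cells.append(total / (bh * bw))
    mean = sum(cells) / len(cells)
    # One bit per cell: 1 if the cell is brighter than the overall mean.
    bits = 0
    for value in cells:
        bits = (bits << 1) | (1 if value > mean else 0)
    return bits


def hamming(a, b):
    """Number of bit positions where two fingerprints differ."""
    return bin(a ^ b).count("1")


# Synthetic 16x16 test images: a left-to-right and a top-to-bottom gradient.
card = [[x * 16 for x in range(16)] for y in range(16)]
other = [[y * 16 for x in range(16)] for y in range(16)]
speck = [row[:] for row in card]
speck[0][0] = 255  # one altered pixel, e.g. glare or dust

print(hamming(average_hash(card), average_hash(speck)))  # prints 0
print(hamming(average_hash(card), average_hash(other)))  # prints 32
```

      Small local changes leave the fingerprint intact while a different picture lands far away, which is what makes a nearest-Hamming-distance lookup workable.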

  38. Re:Why not get someone to make it for you? by RingDev · · Score: 1

    It gets even more interesting when the last digit is a 5. Accounting rules kick in and you round towards the nearest even number so $22.995 = $23, but $22.985 = $22.98.

    This one threw me for a loop when I first hit it, as in some programming languages the default Math.Round function follows the RNE (round to nearest even) definition.

    -Rick
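
    The two examples above can be reproduced with Python's decimal module, shown here as a small sketch. The helper name round_cents is made up for illustration, and amounts are passed as strings so that binary floating point does not blur the exact half-cent.

```python
from decimal import Decimal, ROUND_HALF_EVEN, ROUND_HALF_UP


def round_cents(amount, rounding=ROUND_HALF_EVEN):
    """Round a decimal string to whole cents using the given rounding mode."""
    return Decimal(amount).quantize(Decimal("0.01"), rounding=rounding)


print(round_cents("22.995"))                 # prints 23.00 (9 is odd, so round up)
print(round_cents("22.985"))                 # prints 22.98 (8 is even, so stay)
print(round_cents("22.985", ROUND_HALF_UP))  # prints 22.99
```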

    --
    "Most people in the U.S. wouldn't know they live in a tyrannical state if it walked up and grabbed their junk." - MyFirs