Image Recognition on Mobile Phones
mysticalgremlin writes "In a recent presentation, Semacode founder Simon Woodside presents his company's bar code scanning technology that is used in mobile phones. Simon also discusses many places where bar code scanning powered phones are being used. Not bad for an 'image recognizer for a 100 MHz mobile phone processor with 1 MB heap, 320x240 image, on a poorly-optimized Java stack'"
And here I thought a bar code was a hand signal you used to let everyone in a large crowd, in a noisy bar, know where you were going next.
Like standing up and holding up five fingers to let everyone know the next bar is the "Five Spot".
Oh well, live and learn.
Skivvy Niner? Email me!
HEY! Look left just ONE MORE TIME!
Surely you mean "phone-powered bar code scanning", i.e. using the phone to scan bar codes, not powering the phone by scanning bar codes...
It's official. Most of you are morons.
Believe it or not, this is pretty impressive. Computer vision gets quite difficult when you don't have a lot of pixels to work with, as the shapes are all "helpfully" smeared together by the imager. And with the cheap lenses in camera phones, edges can be smeared by more than one pixel. In some of my prior work doing vision systems for Sony Aibos for RoboCup, we had to deal with similar problems (find an orange ball in an image that may be only 3x2 pixels, while ignoring the boundaries between red and yellow objects). So, kudos for the technical achievement, and hopefully they find a better application than the cuecat :)
Some years ago, I read an article about the possibility of printing tiny barcodes in newspaper stories that would code for a website address. You'd use a special reader that interfaces with your PC to visit the referenced site. This was supposed to be easier than typing in a lengthy, complicated URL.
We've got around this, mostly by having nice succinct URLs and tinyurl.com for everything else, and who wants to carry a barcode reader with them when they're reading the paper?
However, I wonder whether this idea may have some re-interest. If your mobile phone can read barcodes, we could print them anywhere - in papers, on billboards, TV adverts - and all you'd need to do is take a photo and your phone automatically loads the webpage in its built-in browser.
That might be useful.
Argh.
"You're everywhere. You're omnivorous."
Once a barcode is read you just get the product code. What good is that?
You then need to look that code up in a database for real info.
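The lookup step described above might look like the following minimal sketch. The `PRODUCTS` table, its contents, and the `lookup` function are hypothetical, made up for illustration; on a phone the database would more likely sit behind a network request than in a local dictionary.

```python
# Hypothetical product database keyed by the decoded barcode digits.
PRODUCTS = {
    "012345678905": {"name": "Example cereal", "price": 3.49},
}


def lookup(code, database=PRODUCTS):
    """Return the record for a decoded product code, or None if unknown."""
    return database.get(code)


print(lookup("012345678905"))
print(lookup("000000000000"))  # None: this code is not in the database
```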
> Not bad for an 'image recognizer for a 100 MHz mobile phone processor with 1 MB heap, 320x240
> image, on a poorly-optimized Java stack'"
10 or so years ago we had 3D games on 7 MHz machines with 512 KB of RAM and pretty much the same screen resolution, yadda yadda. This isn't so impressive.
My colleague once wrote a prototype doing the same thing (barcode recognition). This is also a nice solution for building tickets. The main advantage is that you can give the guy at the entrance just one phone and he'll be able to scan entry tickets without the need for a computer or heavy equipment.
We even have a video showing this technology being used for payment. Note that in the video you see the Java recognition engine running on a PC with a webcam, but the same engine runs on many MIDP 2.0 phones (like a Nokia 6230) and is also able to find a barcode instantly. In this case the phone is only used as a client for the payment concept.
Obviously, the web server does not run on the mobile phone.
8 of 13 people found this answer helpful. Did you?
I've seen some Japanese phones that have apparently had this ability for quite some time now. I was absolutely amazed when a friend showed me one that even OCR'd English text out of a snapshot!
And there's a company called Grabba that makes commercial bar-code scanning solutions out of PDAs and PDA-phones (among other things). A friend of mine works there... interesting stuff; they also sell a dock thing that a PDA can clip into, which gives it a camera so you don't need to use a mobile phone. Popular with inventory/warehouse type applications, it also does 2D barcodes.
Well, it is fairly impressive. J2ME is many things, but convenient and fast are not amongst them. Getting a Java phone to do anything useful at all is quite a feat of programming; getting one to recognise barcodes in realtime is damn near a miracle. Bear in mind the "virtual machine" on most phones is in fact simply a slow interpreter - it makes BASIC look souped-up.
My phone (which I have had for more than half a year) has, besides the bar code reader, OCR for Roman and Japanese characters. The most impressive use of this on the phone is the ability to input a Japanese word (yes, in kanji) directly into the dictionary. Really impressive for us non-native Japanese speakers. My phone is a Sanyo W32SA; in the link you can read about it in the part on OCR kino.
"We all know Linux is great...it does infinite loops in 5 seconds." -- Linus
Yes, the image resolution is low, but you need to take into account that another spec mentioned is 1MB of Java heap memory. The captured images are stored uncompressed in Java Image objects. So the memory requirement for a QVGA (320x240) four-bytes-per-pixel RGBA image is 307,200 bytes, which will fit well within the 1MB of heap memory.
The phone camera will probably be able to capture images with a higher resolution (up to megapixels), but because of this Java heap memory limitation they probably need to limit themselves to QVGA resolution. And besides, if the image processing / recognition algorithms work sufficiently well with images of QVGA resolution, there is no need to use higher resolutions.
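The arithmetic above can be checked with a short sketch. The function name and the 1280x960 "megapixel" size are assumptions chosen for illustration; only the QVGA size, the four bytes per pixel, and the 1 MB heap come from the thread.

```python
# Uncompressed image size versus a 1 MB Java heap.
HEAP_BYTES = 1024 * 1024  # the 1 MB heap quoted in the summary


def raw_image_bytes(width, height, bytes_per_pixel=4):
    """Bytes needed to hold an uncompressed image (RGBA by default)."""
    return width * height * bytes_per_pixel


qvga = raw_image_bytes(320, 240)        # 307,200 bytes
megapixel = raw_image_bytes(1280, 960)  # assumed example of a ~1.2 MP capture

print(qvga, qvga / HEAP_BYTES)            # roughly 29% of the heap
print(megapixel, megapixel > HEAP_BYTES)  # larger than the whole heap
```

Under these assumptions a single QVGA frame leaves room for working buffers, while one uncompressed megapixel frame would not fit in the heap at all.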
Your comment got me thinking about how much information you could squeeze into one of those barcodes.
At most, a 320x240 tag would give you 76,800 bits of information, or slightly less than 11,000 7-bit ASCII characters. That's assuming you could match the pixels of the tag to the camera's sensor exactly.
I assume you probably wouldn't want to use any more than half of the camera's vertical and horizontal resolution though, which leaves you with 160x120 (for 2,700 characters), and I assume you'd need to have a few rows blacked-out on at least two sides to identify the border of the tag (so subtract (160+120)*2 pixels for bordering...that leaves 2662 characters) and you'd probably want to have a hash or checksum (lose 128 bits).
Still, that leaves about 2,643 characters in an image, which is about a page and a half of typed text using the old guideline of approximately 1,500 characters per page.
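The estimate above can be reproduced with a few lines of arithmetic. This is only a sketch of the same back-of-the-envelope reasoning (one bit per pixel, 7-bit ASCII); the variable names are made up for illustration. Integer-dividing the remaining bits gives 2,644 characters, one more than the figure quoted above, which comes from subtracting rounded character counts.

```python
# Rough capacity of a 2D tag read by a 320x240 camera, one bit per pixel.
BITS_PER_CHAR = 7  # 7-bit ASCII


def chars(bits):
    """Whole 7-bit characters that fit in the given number of bits."""
    return bits // BITS_PER_CHAR


full_frame = 320 * 240    # 76,800 bits if tag pixels matched the sensor exactly
half_res = 160 * 120      # 19,200 bits at half resolution in each direction
border = (160 + 120) * 2  # 560 pixels reserved for the border
checksum = 128            # bits reserved for a hash or checksum

payload_bits = half_res - border - checksum

print(chars(full_frame))         # 10971
print(chars(half_res))           # 2742
print(chars(half_res - border))  # 2662
print(chars(payload_bits))       # 2644
```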
That's pretty impressive; provided you could make your reader focus on objects near to the lens, so that you could make the tag suitably small (less than an inch or so across), that's a much more efficient way to convey textual information than actually writing it out. Instead of just embedding a URL link, you could put written information on there; maybe stuff that would clutter up the packaging / display / poster if you wrote it all down. If these things became ubiquitous, I could see whole advertising campaigns in urban areas (e.g. subways) where the "ad" got you interested, and then you could get more information via the tag.
They say a picture's worth a thousand words, and it sounds like it may not be far off from that.
"Ladies and gentlemen, my killbot features Lotus Notes and a machine gun. It is the finest available."
Could you tell me which approach was used in your project? I mean, I don't need an uber-detailed description, just some key facts; e.g. "we used correlation", or maybe you applied some sort of scaling/rotation-invariant techniques, etc.
As a student, I experimented with image processing last year, and I was amazed by all the cool things that could be done with different algorithms, but I never managed to write a tool that could recognize an object on an image. It sort of worked, but I haven't had time to finalize it and release a version that would work for others too, not only for me, only when launched with a debugger, and only at step-by-step execution :-)
How reliable is object detection on a 3x2 sample? Looking for an orange ball on such a small image... Hmm, won't it be just an orange pixel on such a small image?
Another question - was that pattern recognition? i.e. your program was fed with images of orange balls and it attempted to find them on the target images, or did you somehow define an orange ball (ex: "a closed curve, the color of which must be within the specified RGB range") and the program had to figure the rest by itself?
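For what it's worth, the second option in the question (defining the ball by a color range) could be sketched roughly as below. This is not the RoboCup team's method, which the thread does not describe; the thresholds, function names, and the synthetic frame are all assumptions for illustration.

```python
# Find the centroid of pixels whose RGB values fall in an "orange" range.

# Assumed thresholds; real values would be calibrated for camera and lighting.
ORANGE_MIN = (200, 80, 0)
ORANGE_MAX = (255, 180, 80)


def is_orange(pixel):
    """True if every channel lies within the assumed orange range."""
    return all(lo <= c <= hi
               for c, lo, hi in zip(pixel, ORANGE_MIN, ORANGE_MAX))


def find_ball(image):
    """Return ((row, col) centroid, pixel count) of orange pixels, or None."""
    hits = [(r, c)
            for r, row in enumerate(image)
            for c, pixel in enumerate(row)
            if is_orange(pixel)]
    if not hits:
        return None
    n = len(hits)
    centroid = (sum(r for r, _ in hits) / n, sum(c for _, c in hits) / n)
    return centroid, n


# Tiny synthetic frame: green field with an orange blob 3 wide and 2 tall.
G = (30, 140, 40)
O = (240, 130, 20)
frame = [
    [G, G, G, G, G, G],
    [G, G, O, O, O, G],
    [G, G, O, O, O, G],
    [G, G, G, G, G, G],
]

print(find_ball(frame))  # ((1.5, 3.0), 6)
```

A plain threshold like this would also fire on the blurred edge between a red and a yellow object, which is the problem mentioned earlier in the thread, so a real system would presumably need extra checks on blob shape or neighbouring colors.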
The saddest poem
> Bear in mind the "virtual machine" on most phones is in fact simply a slow interpreter - it makes BASIC look souped-up.
Presumably you're referring to the KVM (the J2ME JVM) which is slow. I think you're out of date.
AFAIK modern phones have Sun's CLDC HotSpot VM (the "CLDC HI VM"), which has speeds equivalent in relative terms to a JVM on a desktop PC. The BlackBerry phones in particular have a great JVM. When more phones have decent ARM-based gigahertz processors, Java speed will stop being an issue, much as it has on the desktop.