Cell Phone with Camera = Scanner
An anonymous reader writes "TechJapan has posted a translation of an Impress Watch Article regarding a new technology developed by NEC and the Nara Institute of Science and Technology, that lets people use their cellular phones with cameras as scanners. It says all you have to do is move your phone over the surface of the piece of paper while recording a movie, and the technology (some sort of software I presume) will construct a high resolution image from the individual frames of the video.
Here is the original (Japanese) NEC press release." I'd love to see before and afters to see how well this works.
does it make phone calls?
I can make phonecalls with my scanner?
Why make a hi-res image? Why not just OCR it? That could probably even be done on the phone. Then you could email or send it as a plain text document, a much smaller file than an image.
My ghEtt0 webpage.
I remember seeing news about Japanese scanner pens (smaller than any cell phone nowadays) that would let you write with them, OCR-scan text, and store the text. I don't have a link right now because I'm lazy. But those were a few hundred dollars back then - maybe eight years ago.
This is probably just a combination of that technology (which never took off here) and the cell phone feature craze.
NEC and the Nara Institute of Science and Technology have cooperated to develop technology which allows for phones with cameras - even low resolution cameras - to act as scanners, by having users move their camera over the surface of the page.
NEC and the Nara Institute of Science and Technology have developed technology which uses movie recordings to produce high quality images, on par with those of a scanner. This technology will be aimed at cellular phones and video cameras.
The technique involves recording a part of the subject to a movie, while moving the camera; the "Mosaicing Technology" analyzes the moving image and estimates the three-dimensional position of the subject, and under the supervision of the "Ultra Resolution Technology," the joining points of the image are deleted, thereby optimizing it so that even low resolution cameras can produce scanner-like output. In other words, even cellular phones and video cameras can produce high quality images.
Up until now, there were certain cameras that contained equipment to turn low quality images into high quality ones, but this technology marks the first time that this sort of technique can be accomplished with existing equipment. For example, a high quality image can be produced of an A4 size sheet of paper from video cameras currently on the market.
Inspired by:
http://k-tai.impress.co.jp/cda/article/news_toppage/17729.html
News Release:
http://www.nec.co.jp/press/ja/0402/2303.html
Enable 3D printed prosthetics!
Stay tuned for the explosive shockumentary, where we demonstrate how two tin cans and a piece of string make for a handy alternative to VoIP.
I wonder if you can use it to rip tracks from vinyl records, as described in this Slashdot article.
Whether some trick like this makes it happen sooner rather than later, only time will tell, but eventually, just in terms of raw resolution, camera-equipped cell phones will be functional full-color scanners.
And this is where things get interesting, because fair use permits copies of material in the library for research. But if enough students scan journals at high resolution and then organize and exchange them through the Net, there will be an enormous levelling of the academic playing field. That is a time I look forward to with eager anticipation.
It only mentions paper as the object to take a picture of, but it might also work for objects further away. This could solve the problem of the often very narrow angle lenses those tiny cameras have.
Stitching multiple images automatically is nothing new but is CPU intensive. So Moore's law will take care of that.
Not so good, but it could be worse
All the people standing in front of some national icon (e.g. Liberty Bell, Eiffel Tower, Big Ben) waving their phones at each other... that could make tourists even *more* amusing!
Windows isn't the answer... it's the question. NO is the answer!
Computers can do that?
And magazine publishers (especially in Japan!) thought they had problems before with pirating articles...perhaps this is another forced movement towards changing the way we see and envision publishing content.
I'm sure James Bond and MacGyver have been using this very technique for quite some time now.
They could also fit the phone with an optical sensor like the ones found in optical mice and log where the phone is on the sheet. This would probably make it easier to rebuild the entire page.
The difference is that the phone will create an aggregate image from the multiple frames of the movie with a resulting higher resolution. Also, the camera will NOT use OCR. That's different from what you describe (using one frame at regular resolution, and running OCR software on it).
Under capitalism man exploits man. Under communism it's the other way around.
Take care!
Erick
http://www.busyweather.com/
A few years back a company had the same sort of software that used panning video from a DV camera to create wide-angle/fully panoramic images. It was extremely smart and fast. I doubt anything truly new is being done here. Of course, I'd kill to have this on my 3650.
There is one thing that is not clear yet: how far away is "over the surface"? I mean, looking at a piece of paper from 1 meter away is "over the surface", but I doubt that will give a high-res picture on a cell-phone camera. If I get closer to the thing I want to scan, then the field of view gets smaller. In the end that means that a cell-phone camera lying on the piece of paper that it should scan will only "see" a very small part of the image. So if I have to move it along to get the whole image, I'll be busy for a while, the data stream will be quite big, and if I'm unlucky the camera's shadow will darken my scanned movie. If I scan from a distance of, let's say, 10 cm, then the question is how much influence a variation of this distance will have on the result. And how I know that I got all the details of the picture. And when the camera memory is exceeded. :-)
This software is from the mid to late 90s and unfortunately not available anymore. iPIX purchased the company and discontinued all of its products. There are a few links to buy it but they say it's unavailable and I haven't ever been able to find it on file sharing.
Another interesting program they had is VideoBrush Panorama. It can only stitch vertical and horizontal pans (don't even try zig-zag). It's pretty cool to be able to get panoramas from video pans, and the software is very easy to use. There is no need for a tripod. You can get an evaluation copy here. This and a resource editor might come in handy if you want to use it.
Last July there was an article here on /. about Japanese publishers' concerns that people were using their phones' digital cameras to photograph magazine pages. I'd bet they are really worried now.
Bureaucracy loves company.
Combined with those handheld printers that work by "rubbing" the printer over paper, I can foresee some serious "wax on, wax off" action in the future.
Sounds like something of a workaround to get behaviour similar to that of the old handheld scanners that used to be around (I loved my monochrome 300dpi handheld, it was just something of a black art sometimes to try and keep the alignment good).
I can see why people might want to do this (panoramic photos suddenly spring to mind...), and if I hadn't been surprised by the uptake in camera phones, I might be jumping on the Slashdot bandwagon of "Who'll use it? I want a phone that only makes phone calls! I hate cell phones!!", but camera phones have *seriously* caught on, certainly here in the UK.
With the problems that the DoD has had in the past with Cell Phones with Cameras I wonder if this will get them even more scared of such technology.
Imagine if they freaked out over 1 megapixel cameras because they could take FUZZY pictures of classified docs - this kind of technology will send the DoD over the edge. As it is right now, cell phones with cameras are prohibited in all classified environments (at least by the Navy, that I know of).
A cell phone with this kind of tech could be banned from the ENTIRE base/post/shipyard etc. One of the things that they drill into your brain in the service is that over time a bunch of little bits of unclassified data can be made into a very informative report that borders on the classified.
Just my 3MegaPix Worth
or it will look pants. The big limiting factor in phone pic quality is the shocking quality of the 'lens'. Almost all the phones I've seen have had awful sharpness, low contrast and the most horrible pincushion distortion.
They are getting slightly better, but not much. In fact, at the moment there is no point upping the resolution on camera phones because of the poor lenses.
I'd give you a link to my Nokia Art page except I fear it getting /.ed :)
This is almost certainly using a technique usually called super-resolution. The basic idea is:
- Take multiple offset images with a low resolution sensor (usually a motion sweep)
- Stitch the images together
- In the overlapping areas, you can now generate the most probable underlying pixels at a higher resolution.
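The three steps above can be sketched in one dimension. This is only a toy "shift-and-add" illustration in Python, not NEC's method: it assumes the frame offsets are already known exactly, that each low-res pixel is a point sample (no lens blur, no noise), and the names `capture_frame` and `shift_and_add` are made up for the example.

```python
def capture_frame(scene, factor, offset):
    """Simulate one low-res frame: keep every `factor`-th sample, starting at `offset`."""
    return scene[offset::factor]


def shift_and_add(frames, offsets, factor, length):
    """Put every low-res sample back on the fine grid and average any overlaps."""
    total = [0.0] * length
    count = [0] * length
    for frame, offset in zip(frames, offsets):
        for i, value in enumerate(frame):
            pos = offset + i * factor
            total[pos] += value
            count[pos] += 1
    # Positions that no frame covered stay None rather than being guessed.
    return [t / c if c else None for t, c in zip(total, count)]


# A made-up 16-sample "page" and four frames at 1/4 resolution, each shifted by one sample.
scene = [float((x * 7) % 11) for x in range(16)]
factor = 4
offsets = [0, 1, 2, 3]
frames = [capture_frame(scene, factor, o) for o in offsets]

restored = shift_and_add(frames, offsets, factor, len(scene))
missing_one = shift_and_add(frames[:3], offsets[:3], factor, len(scene))
```

With all four offsets the fine grid is fully recovered; drop one frame and every fourth sample is simply unknown. Real cameras average light over each pixel and the offsets have to be estimated, which is where the "most probable underlying pixels" part comes in.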
You can read about some of the underlying ideas here and here. It's a pretty cool area of research.
Cell phones are so overrated anyway.
I do a bit of computer vision, and here is a basic method that could get this to work (not that hard :)
for each image
  fit an affine transform to the previous image
  [this should work easily because
   1) the paper is planar
   2) the paper and its background are hopefully different - with nice edges in between
   3) the lighting conditions are the same (depending on how you hold the phone)
   4) the paper is not moving]
Each of these transforms can be applied cumulatively to the following images, though error is reduced by mapping everything to the center image.
This takes care of the registration problem (other techniques like KLT might be useful...maybe [ http://vision.stanford.edu/~birch/klt/ ] )
Then you can apply super-resolution techniques to get a higher resolution image [ http://www.ri.cmu.edu/projects/project_323.html ]
Try it.
Having a rectangular, planar, still, evenly-lit piece of paper helps!
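For anyone who wants to try the registration step, here is a rough Python sketch of fitting an affine transform to matched points by least squares and chaining transforms frame to frame. It assumes you already have point matches between two frames (finding them is the job of something like KLT), and `fit_affine`, `apply_affine` and `compose` are names invented for this example. Strictly speaking a tilting camera over a flat page gives a projective mapping, so affine is an approximation.

```python
def solve3(m, v):
    """Solve a 3x3 linear system m * x = v by Gaussian elimination with pivoting."""
    a = [row[:] + [rhs] for row, rhs in zip(m, v)]
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(col + 1, 3):
            f = a[r][col] / a[col][col]
            for c in range(col, 4):
                a[r][c] -= f * a[col][c]
    x = [0.0] * 3
    for r in (2, 1, 0):
        s = sum(a[r][c] * x[c] for c in range(r + 1, 3))
        x[r] = (a[r][3] - s) / a[r][r]
    return x


def fit_affine(src, dst):
    """Least-squares affine map from src points to dst points.

    Returns ((a, b, tx), (c, d, ty)) with x' = a*x + b*y + tx and y' = c*x + d*y + ty.
    """
    rows = [(x, y, 1.0) for x, y in src]
    # Normal equations (R^T R) p = R^T target, solved once per output coordinate.
    rtr = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    params = []
    for k in (0, 1):
        rtt = [sum(r[i] * d[k] for r, d in zip(rows, dst)) for i in range(3)]
        params.append(tuple(solve3(rtr, rtt)))
    return tuple(params)


def apply_affine(t, point):
    (a, b, tx), (c, d, ty) = t
    x, y = point
    return (a * x + b * y + tx, c * x + d * y + ty)


def compose(outer, inner):
    """Transform equivalent to applying `inner` first, then `outer`."""
    (a, b, tx), (c, d, ty) = outer
    (e, f, ux), (g, h, uy) = inner
    return ((a * e + b * g, a * f + b * h, a * ux + b * uy + tx),
            (c * e + d * g, c * f + d * h, c * ux + d * uy + ty))


# Made-up example: page corners seen in one frame, and where they land in the previous frame.
actual = ((1.0, 0.1, 3.0), (-0.1, 1.0, -2.0))
src = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0), (10.0, 10.0)]
dst = [apply_affine(actual, p) for p in src]
fitted = fit_affine(src, dst)
```

`compose` is the "apply cumulatively" part: chaining frame 3 to 2 with frame 2 to 1 gives frame 3 to 1. Errors accumulate along the chain, which is why mapping everything to the center image helps.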
Robo-Blogs of the world: UNITE!
OK, with all the built-in anti-counterfeiting tech in scanners now, is NEC being forced to push the anti-counterfeit measures?
Hmm I think I'd like to do this, email a moviemail to a server, or send what is normally a lousy video stream from a FOMA phone to a server and do it there. Seen this sort of algorithm around in Siggraph once upon a time, wonder if there is anything linuxy that can be bent into shape to use the cycles on my new VPS..
To me this looks like what was documented in Graphica Obscura - projective warping of multiple photos - by SGI researcher Paul Haeberli. Actually his site has lots of info (I haven't seen code though) for doing wild things with color, depth of field, resolution, and so on using neat algorithms, style, and mucho fast computers. But now we can approach more closely the power he had 10 years ago. I love how slashdot forces me to look for sites I loved and lost. This seems to be a repetitive cycle for me with a period of a year. Help Mr. Gelernter!
Here is more info from the original article.
It is a quickie so if someone wants to take a shot at translating the whole thing it might be good.
This is jointly announced by NEC and Nara Advanced Science and Technology University Graduate Program, which were working together as an example of biz-academic collaborations the government has been trying to foster.
It's based on two technologies, "mosaicing" and "ultra high resolution imaging". Mosaicing is defined as making an image of a flat surface or virtually flat distant scene with a wider angle than the camera is normally capable of capturing, by changing the position and angle of the camera, and later composing the resulting images into a single one. (So this is just a definition of a mosaic)
Ultra-res imaging technology is defined as oversampling by turning the object through slightly different angles and composing the resulting images into a single one. (So this is like Magellan's oversampling).
It says they were aiming at using consumer video cameras and camera-equipped phones to make a low-cost, low-annoyance way to do imaging, with a goal of say 15 megapixels, or like what you would get with an A4 page scanned at 400dpi.
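The 15 megapixel figure checks out as simple arithmetic. A quick Python sanity check, assuming the standard ISO A4 size of 210 x 297 mm (the helper name `scan_pixels` is just for this example):

```python
MM_PER_INCH = 25.4
A4_MM = (210, 297)  # ISO 216 A4 sheet, width x height in millimetres


def scan_pixels(size_mm, dpi):
    """Pixel dimensions of a sheet scanned at the given dots per inch."""
    return tuple(round(side / MM_PER_INCH * dpi) for side in size_mm)


width, height = scan_pixels(A4_MM, 400)
megapixels = width * height / 1e6
print(width, height, round(megapixels, 1))  # prints: 3307 4677 15.5
```

So an A4 page at 400dpi is roughly 3307 x 4677 pixels, about 15.5 million, which matches the stated goal.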
The development was done without any special sensors or whatnot, and they claim they are able to get similar quality to what a scanner would get by just using a consumer video camera to scan an A4 sheet of paper with this technique.
Then there's some marketing speak and it is presented as research results, no discussion of exactly what the system is or if it will be provided to the public.
It's called "super-resolution" and there are a bunch of papers on it, using very different techniques, and different sources of images.
I also think this is used on Mars by the MOC team to produce 0.5m resolution images from 1.5m source data.
You can do this with a normal digicam btw, download registax 2 for example. Just take consecutive images of the same static subject, and combine them.
(come to think of it, I don't own a scanner either =/)
The World Wide Web is dying. Soon, we shall have only the Internet.
Meet the wearcam!
http://wearcam.org/
http://wearcam.org/orbits/alanalda.html
You can even get the code from sourceforge, although now he seems more interested in his studies into what he calls "Comparametric Toolkit", which seems to mix Video Orbits with software based on the Wyckoff principle (how to get high dynamic range pictures from one underexposed pic and one overexposed pic, for those who don't RTFL).
I suppose the amount of processing power in those phonecams must be insane, or maybe the algorithm they use is more generic, but it is good to know all this Moore's Law horsepower is being applied towards useful stuff, not just Laracroftish games (ducks).
Finally, it is worth noting that, although Mann's software is now GPL (I don't recall it being Free, or even released, last time I checked three years ago), at least one of the algorithms is under US Patent 5,706,416, which of course is not nice, unless he plans to license it free of charge for GPL software.
http://barrapunto.com/ - News for nerds, en español
... that was on slashdot some time ago. Had the same concept, instead of printing the normal linear way, you roll it around and the printer somehow figures what to print (or something like that). I remember reading jokes about how easy it'll be to print "kick me" notes on people's backs ;)
Founder of Mirror Moon - Tsukihime Game Trans
I know some fitness centers have already banned cell phone usage in their locker rooms. Maybe other places such as lavatories and clothing store fitting rooms are or will be doing the same. Guess this could mean bans on cell phones at the football stadium as well (think: Janet Jackson).
Although countries other than the US generally don't have the same hang-ups about nudity, it would be interesting to know what policies they have regarding cell phone usage and personal privacy.
To-do List: Receive telemarketing call during a tornado warning. Check.
I use my normal 2MPixel digital camera to take photos of articles all the time.
Works well, and it is cheaper than photocopies.
I think it would be cool to be able to combine images like this.
But I'm not an imaging expert.
I would also like to build 3d models from several photos, not that anyone cares, but I think it would be neat.
It's different because it's not just making a bunch of small pics into one big one, it's making a bunch of lo-res pics into a hi-res one. It's also different because it doesn't require YOU to do any alignment or adjustment of your composition.
In theory, you could take a 320x240 movie of the *whole page* at once, moving around, and when the movie got sufficiently long the software would reconstruct a high-res image of the whole page, as in 300 dpi or some such scanner-type resolution.
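As a back-of-the-envelope check on "sufficiently long", here is a small Python sketch of the bare minimum number of 320x240 frames needed just to supply as many pixels as a 300 dpi A4 scan. It is a lower bound only: it ignores noise, blur and overlap between frames, the 15 fps rate is an assumption, and `min_frames` is a made-up helper.

```python
import math


def min_frames(target_px, frame_px):
    """Fewest frames whose combined pixel count reaches the target (a lower bound only)."""
    return math.ceil(target_px[0] * target_px[1] / (frame_px[0] * frame_px[1]))


a4_300dpi = (2480, 3508)  # A4 (210 x 297 mm) at 300 dpi, rounded to whole pixels
frames_needed = min_frames(a4_300dpi, (320, 240))
seconds_at_15fps = frames_needed / 15  # 15 fps is an assumed phone video rate
```

That comes to 114 frames, or under eight seconds of video at the assumed rate. A real reconstruction would want considerably more than that, since the frames carry heavily redundant information.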
I realize that this is Slashdot, but you might try RTFA. You won't lose karma for that, I promise.
That's not a real robot move!
The HP CapShare scanner did the same a few years back; I remember seeing the demo at COMDEX.
You rubbed it on a document like you were erasing a whiteboard. It assembled all the bits it saw into a single image.
It listed for $1295 in 1999
http://www.canada.hp.com/cpo/home.html
*ducks*
Stick Men
I don't see how this could *not* require a ridiculously steady hand. I have enough trouble making my digital camera photos not blurred!
This Washington Post article (blah blah reg req'd) may shed some light for you.
In the future, I would want to not be isolated from my friends in the Space Station.
+1 funny
I'd love to see the conversation where someone has to explain to the base general that mr. smith is a security risk because of his photographic memory.
While it'd be nice to block cell/GSM/etc signals on/in gov't facilities, just think of the leakage from any large military installation. You'll have residents for miles screaming bloody murder.
[Fuck Beta]
o0t!
Thanks for the good post.
I notice that Mann's work appears to deal with flat scenes-mobile camera, or stationary camera-arbitrary scene.
Do you know what the state of the art in 3d model building is? Is there effective work on arbitrary camera, arbitrary scene?
I know that CMU has a bunch of work that can pull off some of this, but I think that it may be special-cased (i.e. determining the location of the camera using other methods) and may have sensors other than vision (like laser range finders and the like) involved.
May we never see th
When I first got my digital camera (2 megapixels) I experimented with this sort of thing and gave up. The camera resolution was more than sufficient to capture an entire book page well enough to OCR it... IF I put the page under glass to flatten it and lit it very carefully, with two lights at 45 degrees each as with a copy stand.
If I just handheld the camera over the page and pushed the button, the page curl prevented the page from being evenly in focus. The lighting was so uneven--even on pages that looked flat and readable to the naked eye--that the images were very unpleasant to read and completely impossible to OCR. If you set the threshold properly for the center of the page, the area within a couple of inches of the gutter went completely black. Using available light and using the camera's built-in flash produced very different, but equally unusable results.
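The threshold problem described above is easy to reproduce on a toy scanline. The Python sketch below compares one fixed threshold with a local (adaptive) threshold that compares each pixel with the mean of its neighbours. This is a standard workaround swapped in for illustration, not something the article says NEC does, and the brightness numbers are invented.

```python
def global_threshold(row, level):
    """Mark a pixel as ink (1) when it is darker than one fixed level."""
    return [1 if v < level else 0 for v in row]


def local_threshold(row, radius, margin):
    """Mark a pixel as ink when it is darker than its neighbourhood mean by `margin`."""
    out = []
    for i, v in enumerate(row):
        window = row[max(0, i - radius): i + radius + 1]
        out.append(1 if v < sum(window) / len(window) - margin else 0)
    return out


# Hypothetical scanline: paper brightness falls from 200 at mid-page to 90 near the gutter,
# and ink is 60 levels darker than the paper around it.
paper = [200 - 10 * i for i in range(12)]
ink_at = {2, 6, 10}
row = [p - 60 if i in ink_at else p for i, p in enumerate(paper)]
truth = [1 if i in ink_at else 0 for i in range(len(row))]

fixed = global_threshold(row, 150)                    # tuned for the bright end of the page
adaptive = local_threshold(row, radius=2, margin=20)  # follows the shading
```

On this made-up data the fixed threshold turns everything near the gutter black, while the local one recovers exactly the three ink pixels. Page curl and focus are a separate problem that thresholding does not touch.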
The ability to wave a phone over a reference book in a library and capture a page would be genuinely useful, but I am rather skeptical that the software is really clever enough to synthesize a flat, evenly lit, in-focus image from the resulting set of images.
Yes, I've considered bringing a sheet of Plexiglass, a table-top tripod, and a couple of battery-powered fluorescent lamps into the library with me. And thought better of it.
"How to Do Nothing," kids activities, back in print!
I suspect that certain space telescopes do something like this. I wonder if, for example, the images we have of Pluto are constructed by taking multiple photos that are slightly offset and mining higher-resolution data from them.
In Denmark the subscription plans are not allowed to last more than 6 months. After this period the phone company is required by law to remove any SIM-lock, allowing you to use the phone with any other phone company. Of course you can also get the phone without getting locked into the subscription plans, but you have to pay a lot more to get it. /Spiff
I can't believe nobody's mentioned the HP CapShare.
Link
Picture
I was doing some consulting for a lawyer in 1999, and he showed me some 'new' HP scanner he just got for some outrageous price. He told me they didn't even have it in the stores/catalogs. It was a very 'James Bond' device, you could swipe it over a large page, and the image was automatically stitched together. You could store/view pages on the scanner, or send them to an HP printer or a laptop via IR. Very cool.
eBay has a couple of them for sale.
That still left the question how the tricorder came into being. Did someone sit down one day and say to himself, "I am going to build myself a tricorder?" That just doesn't seem very likely to me.
But now I finally figured that out too. The tricorder will evolve from the mobile phone! Every year you can see how more and more sensor functionality is added, while the physical size of the phone is getting smaller and smaller. First they could just acquire audio signals. Then came video signals. Soon it will be able to monitor your heartrate, body temperature, and various other vital signs, and maybe even automatically call 911 if you get into trouble. Sensors for electricity, magnetism, seismic waves, spectral analysis, alien energy, and other things will invariably follow, driven as they are by our lust for gadgets, useless functionality, and the latest and greatest. Meanwhile rest assured that ever-increasing software capabilities will provide the ability to make rudimentary medical diagnosis, do chemical analysis, and contain drivers for every alien Bluetooth-enabled device in a thousand lightyears.
While we are at it, you can rest assured that the very moment someone develops a universal translator, it will be embedded in a mobile phone.
So there we have it: the tricorder in a small, handy package. There are only two downsides that I can see: if we are to believe Star Trek, it will at some point lose its communication functionality (Kirk was always using a separate communicator), and based on current trends the battery life may not exceed 2-3 minutes...
Evolve! Evolve, damnit!
ALE is an open source tool that does this nicely. It is normally intended for turning a large number of images of the same thing into one higher quality image, but when you use the --follow and --extend flags, it can turn a sequence of images from a video into a single larger image.
To quote from their site: ALE is a free software program that renders high-fidelity images of real scenes by aligning and combining many similar images from a camera or scanner. The correct similarity between images is roughly that achieved by a somewhat unsteady hand holding a camera.
So you want it to remember phone numbers. How about some function so that you can add a small note to each entry? Like who that person is? How about an alternative number if they can't be reached on that one?
Soon you have a complete agenda. All its functions are perfectly reasonable to the people who use them.
Sure, a phone company could make a phone that just makes calls. I'll tell you a little secret: they all did. Then they introduced phones with extra features and the old simple models stopped selling.
But there still are simple phones being sold. It is just that 99% of the buyers want the gadgets, so that is where all the developments are taking place. Why should they release a new model that doesn't do anything new? The old phones from 4 years ago do what you want. Why design a new one?
Shop around, they are still out there.
MMO Quests are like orgasms:
You may solo them, I prefer them in a group.
http://www.chiltonwebb.com/iStill/
Now you can have both!
---- Take the Space Quiz!
The technique is called "super-resolution". Some references:
motion super-resolution, super-resolution in forensic science, super-resolution in astrophotography, "Bayesian Image Super-resolution", "Example-Based Super-Resolution", "Limits on Super-Resolution and How to Break Them".
A lot of my friends who are in the Navy stationed out of Japan already have cell phones that can do OCR. Not exactly a scanner per se, but they can scan in text from a form, and considering that these cell phones can also interface with the new Memory Stick Pro (1GB), you can just go to the library/book store, and stand there and scan your brains away without buying the book. (Which is why a lot of book stores in Japan now have their books in shrink wrap to keep people from leeching/OCRing for free.)
The phones can also read barcodes, too.
I have vague recollections of the kids in Iron Eagle using a portable handheld scanner. I assume these phones will do similar things. It was neat, but how necessary is it? More toys I guess.
No sig for you. YOU GET NO SIG!
Geekier-than-thou Steve Mann already demonstrated this as video orbits, and there has been plenty of other work on the same subject. Most of the research actually looks at general mosaicing, not just documents. Use with cell phones so far has simply been limited by the limited availability of cell phones with video capability, not by any conceptual problems.
The ads were in English-speaking publications and of course it implied it works with this alphabet, but perhaps there was also a Japanese version.
I assume that they haven't seen actual camera-phones. If they did, they would know that the quality SUCKS. It would be much better to use regular digital-cameras for this.
Oh... you mean like the idea seen in Earth: Final Conflict with the Global?
:)
They used to scan those over an object all the time. Ever since that show first came out I've wanted one.
I want my bluetooth cell phone with camera to function as a mouse... and I can think of thousands of others that would also like that.
to raise to a fine art form the act of prank faxing people a picture of certain portions of your anatomy!
"Freedom means freedom for everybody" -- Dick Cheney
It also seems that it doesn't do a lot of the things that VideoBrush Panorama does. It doesn't blend images but just sticks them there. It can't make 360 degree panoramas, it can't output QuickTime VR, it can't capture video itself, you can't fix its misalignments and you can't do any basic image processing on the panorama before saving. It looks more like an alpha or the result of some research in progress than a software product that's actually being sold!
I remember reading about this, like forever ago.
It's called "Video Orbits," I guess. Originally, it was made to make panoramic stills from video. But it can also do the same thing mentioned in the article, sort of scanner like.
Here's the writeup and
you can download it over here.
I played with it a bit using the movie function of my digital camera, transferring to computer, then using
mplayer -vo png movie.mov && mogrify -format pnm *png && estcement.pl *pnm
(make sure the binaries and scripts are in your path)
This will produce cemented.pnm.
You can play with the $steps= line in estpairwise.pl to change the settings. Also, I like to take out the -display in estpairwise.pl in order to speed things up; otherwise it draws each image on screen as it tries to match them up.
This works both as the article talks about, like a scanner, but it also makes kickass hires panoramic shots from crappy 320x240 video.
Note: turn off automatic brightness / auto white balance when taking your video, or it may look a little funny.
no idea if any of this stuff works under windows. but it works like a charm under linux.
sig? uhh, umm, ok
Build in =good= OCR software, and such a thing would actually be useful. Phones have pathetic storage.
:P
I'd say the ultimate phone, at the time being, would be the following:
- a phone. obviously.
- built-in LED flashlight
- USB interface for uploading stuff to
- reasonable storage (32M? 128M?)
- voice recorder (c'mon, this makes complete sense!)
- (given this development) scanner + OCR
- (I'm dreaming here) run on linux so that I can ssh to it over WiFi
~/ssh slashdot.org ssh: connect to host slashdot.org port 22: too many beers
...so?
In December, when my wife and I went to the local INS (sorry, BCIS) office to get her green card, in addition to the other mounds and mounds of ridiculous security procedures that cost everyone a half-hour of their lives, the security guards were inspecting cell phones for lenses and asking "can this phone take pictures?". I wonder what sort of incredible state secret you're going to uncover by taking snapshots inside a waiting room for an hour.
"A great democracy must be progressive or it will soon cease to be a great democracy." --Theodore Roosevelt
See this paper for some details: A. K. Jain and A. Ross, "Fingerprint Mosaicking", Proc. of IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), Orlando, Florida, May 13-17, 2002.
Here, 'the joining points of the image are deleted', so any overlap is not analyzed further.
In theory you could develop this algorithm further to account for the fact that the overlap looks at slightly different areas of the page, thereby deriving sub-pixel information (like Fuji's octagonal CCD technology), but I think the point is that this technology does not try to make too many assumptions about the quality of the source image.
This would also significantly increase the complexity of the algorithm. Assuming this technology is also being targeted for phones with cameras, a simpler algorithm is also one that will not drain the battery (as much).
Besides the software, why is this new? OK, so I don't have a cell with a video camera on it, but I do have a cell with a regular camera and I've taken tons of pictures of text for later reference, from test scores to notes. OK, sure, the quality isn't good enough for an 8x10, but I can snap a half-page and it'll be readable. Are you telling me I'm the only person on here that has ever taken a picture of text?
Besides, how many cellphones in the US can even take full-motion video? Last I checked the newest US phones only took individual pictures.
my karma will be here long after I'm gone
new technology = cell phone + old technology
There is a printer that works in a very similar fashion. I am looking for the link right now.
I wonder if I could turn my mouse into a scanner too. Should I post this into What (non-PC) Hardware Do You Hack??
In this blog post I demonstrated how my Japanese cell phone scans URLs from a 2D bar code. With pictures.
Well, the article is a translation from the Japanese original, and there are some statements made in there which are contradictory or confusing... but I stand by my statements.
Your "in theory" part describes exactly what I believe is being done - several images of the same subject, from slightly different vantages, are analyzed to produce a higher resolution image of the subject. Not a larger image, a higher-resolution image.
If you got the sense from the article that the algorithm is concerned entirely with stitching together small images to make a large one, you may be right. But I think you're mistaken.
That's not a real robot move!