Software To Flatten a Photographed Book?

Posted by kdawson on Sunday September 27, 2009 @08:01AM from the shadows-in-the-gutter dept.

davidy writes "I have photographed some pages of a book for reading on my PDA. This is much faster than scanning and I don't have to carry the heavy books. However, the photographed books are not as nice: curved, skewed, and shadowed, as opposed to the much flatter, cleaner scanned books. I have searched for software that can flatten the pages for better reading on the PDA. So far I have come across Unpaper and Scan Tailor. Unpaper doesn't seem to have a Windows GUI, and Scan Tailor doesn't unskew well. I remember reading about Google's technique of converting books to e-books with a camera and a laser overlay. Is there any home user software that can do a similar job without the need for a laser overlay or other sophisticated (and patented) technology?"

Snapter by brusk · 2009-09-27 08:05 · Score: 4, Informative

Snapter is a bit cumbersome but that's what it does.

--
.sig withheld by request
1. Re:Snapter by DingerX · 2009-09-27 08:48 · Score: 2, Informative
  
  Okay, as per my previous post, I'm trying Snapter. It might have crashed, for all I know. I'm at 3 bars (out of about 20) on the left side of the first page, and one processor is pegged. We'll see if it comes out.
2. Re:Snapter by DingerX · 2009-09-27 09:44 · Score: 3, Informative
  
  Restarted. 30 minutes later, it threw a fatal exception.
  
  My short review: FAIL.
3. Re:Snapter by brusk · 2009-09-27 13:23 · Score: 3, Informative
  
  I include that in my definition of cumbersome.
  
  --
  .sig withheld by request
4. Re:Snapter by DingerX · 2009-09-27 20:26 · Score: 4, Informative
  
  Okay, cool. I found out what my problem was:
  
  1. The book must be on a uniform surface.
  2. All the edges of the book must be in the frame.
  3. Only hold the book down from the side.
  4. The photograph must be taken directly over the book.
  5. Use a dSLR for best results.
  
  Okay, so now try holding a dSLR directly over an open book that you're holding with another hand, from the side, and at a range where the entire book fits in the frame. At that point, you might as well build that book scanning rig.
  
  In short: FAIL.
Anonymous Coward by Anonymous Coward · 2009-09-27 08:08 · Score: 4, Informative

Get a thick, heavy piece of glass and lay it atop the pages to flatten them out before you photograph them. Use ambient light and avoid the flash.
1. Re:Anonymous Coward by polymeris · 2009-09-27 08:23 · Score: 3, Informative
  
  Also use a zoom lens and take the shot from as far as possible, to reduce curvature. The longer the focal distance, the flatter the picture will appear.
2. Re:Anonymous Coward by Anonymous Coward · 2009-09-27 08:37 · Score: 2, Informative
  
  It doesn't have to be glass. Target stores have these nice plexiglass photo boxes. An advantage of them over glass is that the edge of the box helps hold the opposing page up.
3. Re:Anonymous Coward by Anonymous Coward · 2009-09-27 10:20 · Score: 2, Informative
  
  That's not what polymeris is getting at. Wide angle lenses create strong perspective foreshortening. That's why there is a sweet spot for portrait photography: too wide makes noses look big, too long leaves no perspective. Lens distortion is easily removed because it is inherent to the lens, so you only need to calibrate once and can use the profile for all pictures shot at the same focal length. Perspective distortion depends on the scene, so there is no "calibrate once, correct all" option without creating a repeatable setup.
4. Re:Anonymous Coward by jonbryce · 2009-09-27 10:27 · Score: 2, Informative
  
  Barrel distortion can be easily fixed in Photoshop, and once you get the right settings for your first pic, you can batch process the rest of them.
5. Re:Anonymous Coward by Anonymous Coward · 2009-09-27 11:24 · Score: 1, Informative
  
  Also use a zoom lens and take the shot from as far as possible, to reduce curvature. The longer the focal distance, the flatter the picture will appear.
  I'm sure you mean a tele lens. Zoom just means variable focal length, and a zoom lens could even be a wide-angle one.
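The advice above about shooting from farther away can be put into rough numbers. The following is only a sketch under a simple pinhole-camera assumption, not anything a poster in this thread wrote: a part of the page that bulges toward the camera by height h, when the flat part sits at distance d, is magnified by d / (d - h) relative to the flat part. The function name and the 2 cm bulge are hypothetical values chosen for illustration.

```python
def relative_scale_error(camera_distance_m, bulge_height_m):
    """Extra magnification of a raised part of the page (pinhole model).

    A point raised by bulge_height_m toward the camera is magnified by
    d / (d - h) compared with points lying flat at distance d, so the
    relative error is d / (d - h) - 1.
    """
    d, h = camera_distance_m, bulge_height_m
    if not 0 <= h < d:
        raise ValueError("bulge must be non-negative and closer than the camera")
    return d / (d - h) - 1.0


# Hypothetical example: a page that bulges 2 cm near the gutter.
for distance in (0.5, 1.0, 2.0, 4.0):
    err = relative_scale_error(distance, 0.02)
    print(f"{distance:>4.1f} m: {err * 100:.2f}% scale error")
```

The error shrinks roughly in proportion to the distance, which is why a longer lens used from farther back makes the page look flatter. It does not remove the curve, it only makes it smaller relative to the frame.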
ahhh - book scanning by ZERO1ZERO · 2009-09-27 08:22 · Score: 3, Informative

As with most scanning and other things, you can save yourself immense amounts of hassle, time, and money later by spending a fraction of that time up front sorting out the 'input'. A bit of glass over the book, using a scanner, or even getting a friend to hold the book will mean that your source image is much better to start with.
Not everyone has 5-10mm thick pieces of book-sized glass lying around, and it can be hard to carry that sort of thing about in case you need to photograph a book.
There is software called Book Restorer that does this (curve removal, 'geometrical correction', etc.), but it's pricey.
I've tried Unpaper and it's pretty decent for what it does, but it has some limitations and it's not the most convenient to use.
Deskewing, cropping, filling, etc. are all easily done, and I've even written ImageMagick batch scripts in Windows to do these things. The major trick is the curve removal.
There are various ways you can determine the curve from a scanned image. If you have the edge of the page, you can calculate the movement required to straighten that, and then apply it to the whole image. You can use text-based curve removal, similar to the well-known deskew algorithms for text, but one that takes into account that different parts of the text may be skewed by different amounts, i.e. a 'sliced' deskew rather than a single rotational deskew. This needs to be done from the top to the middle and from the bottom to the middle.
If you have a good 'shape' of the page, and know the true size of the page, you can use a kind of morph operator to morph the corners back to the right position and hope the image follows.
Using a greyscale/colour source will generally work better than a black and white source image.
The other option, if the scanned or photographed page is actually of reasonably good quality but just a bit squint, is to OCR it to a PDF and generate a new document from the OCR text. That will be pin-sharp, compress a lot better, and be easier to use, although it may not be ideal if there are too many OCR errors.
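The edge-based curve removal described in the comment above can be sketched roughly as follows. This is a minimal illustration in plain Python, not the algorithm of any program named in this thread. It assumes the page's top edge has already been located in every pixel column (the `edge_rows` input is hypothetical, and finding it is a separate step not shown), and it only shifts columns vertically from a single edge; it does not handle the top-and-bottom-towards-the-middle variant or the horizontal squeezing near the gutter.

```python
def straighten_columns(image, edge_rows, fill=255):
    """Shift each column up so the detected page edge becomes a straight line.

    image     -- list of rows, each a list of grey values (0-255)
    edge_rows -- edge_rows[x] is the row where the page's top edge was found
                 in column x (hypothetical input, detection not shown)
    fill      -- value for pixels shifted in from outside the image
    """
    height, width = len(image), len(image[0])
    if len(edge_rows) != width:
        raise ValueError("need one edge position per column")
    target = min(edge_rows)  # straighten to the highest point of the edge
    out = [[fill] * width for _ in range(height)]
    for x in range(width):
        shift = edge_rows[x] - target  # rows this column must move up
        for y in range(height - shift):
            out[y][x] = image[y + shift][x]
    return out


# Tiny made-up page: the edge sags in the middle, and a one-pixel "text
# line" sits one row below the edge in every column.
edge = [1, 2, 3, 2]
page = [[255] * 4 for _ in range(6)]
for x, e in enumerate(edge):
    page[e + 1][x] = 0

flat = straighten_columns(page, edge)
for row in flat:
    print("".join("#" if v == 0 else "." for v in row))
```

After the shift, the sagging text line ends up on a single row. A real page would need sub-pixel interpolation and a smoothed edge curve, since a noisy edge estimate would make neighbouring columns jitter.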
Re:No it wouldn't be faster by Anonymous Coward · 2009-09-27 08:34 · Score: 1, Informative

Seriously, have you ever compared the time photographing a book vs. scanning it? The fastest scanners run like photocopiers. With a book, all you need is to set up a decent or ghetto rig for the camera and turn the pages. Until now, I've been shooting with a DSLR at the same lighting/camera settings for each shot, and applying a batch transform process followed by a universal levels setting, finishing up with a PDF assembly. But I'll report back on how Snapter works on the same files.
Exactly, the document scanners used in libraries and archives are pretty much high resolution cameras on an adjustable stand. They don't work like flatbed desktop scanners where you have to squash the book flat on a plate of glass. As a result they are much faster, easier on the books and you get better quality scans for OCR processing.
Use a homemade book scanner. by s4m7 · 2009-09-27 08:46 · Score: 4, Informative

If you have ~$300 to drop on the project, Make has plans for a nice book scanner: http://blog.makezine.com/archive/2009/04/how-to_book_scanner_on_the_cheap.html It seems to hold the pages at an angle so there's little-to-no distortion on the page.

--
This comment is fully compliant with RFC 527.
Re:Contact Scan Tailor Author? by Anonymous Coward · 2009-09-27 08:48 · Score: 1, Informative

Whoo, Mod +1 Funny!
Any time I've ever done that I've either gotten crickets or flames.
How about a $300 home-built scanner? by plover · 2009-09-27 09:18 · Score: 5, Informative

Some guy posted a great Instructable on building your own high-speed book scanner, purposely designed to rapidly photograph book pages without curves. He even includes a software stream that OCRs the contents and sticks them into PDFs.
It's been quite popular -- so much so that he's created an online forum at http://www.diybookscanner.org/ dedicated to discussions from DIY book scanners all over the place, where they talk about builds, parts, and software.
I've been very tempted to build one myself just to avoid carrying heavy books around in my backpack.

--
John
Re:What does "and patented" have to do with it? by Dachannien · 2009-09-27 09:22 · Score: 3, Informative

Really, if you are doing this for yourself and have no intention of selling your product, then you are free to use their method all you want.
35 U.S.C. 271 (a) Except as otherwise provided in this title, whoever without authority makes, uses, offers to sell, or sells any patented invention, within the United States, or imports into the United States any patented invention during the term of the patent therefor, infringes the patent.
Yes, it's extremely unlikely that anyone would ever sue you for infringing a patent in the privacy of your own home because the damages would be minuscule and it would be very difficult to prove infringement, but it's still an infringement.
Re:Contact Scan Tailor Author? by Anonymous Coward · 2009-09-27 10:13 · Score: 1, Informative

Or, you know, if you find that there is a Linux-only app that is exactly what you're looking for, you could just use Linux for it. Many of us are stuck with a Windows box, partition, or VM for the same reason. With Linux you can even run it off a CD or USB drive.
QT3 by gd2shoe · 2009-09-27 16:10 · Score: 2, Informative

No need. At a quick glance, Scan Tailor is programmed in Qt3 (a cross-platform C++ framework, used by KDE). This is a multi-platform environment, making it very easy to fix something on all supported platforms at once. If unskew doesn't work well, then that should be addressed in both versions. Fixing the Linux version will fix the Windows version too (unless he's relying on platform-specific libraries in addition to Qt).

--
I won't join Slashcott. OTOH, If Beta goes live, I just won't be back until it's fixed. Sorry Dice.
Re:a nifty new program by Taxman415a · 2009-09-28 09:08 · Score: 2, Informative

Basically, you look up the internal command names, such as file-jpeg-save, and the arguments they take, then write either a GIMP plugin or a non-interactive script. You can do it in Scheme, which they refer to as Script-Fu, or in Python, where it's called Python-Fu. The former is lightly documented and the latter barely at all. The only way I've found to look up all the command names for the Python interface is to run GIMP, go to Filters -> Python-Fu -> Console, and hit the Browse button. But yeah, basically it's either learn to program in Scheme or Python at this point.