Slashdot Mirror


Scanning Large Amounts of Pictures?

ClintJCL asks: "My wife & I are involved in scanning every photo I've ever taken in my life. She can lay down 4 or 5 pictures into the flatbed scanner at once, thereby saving the scanning time which is the bottleneck. But then she has to split them with Photoshop, which is also somewhat time-consuming. I've searched on the net for hours for a piece of software that would automatically split these 'Batch image scans' into single images and it just doesn't seem to exist. There are plenty of pieces of software to split a single image arbitrarily into sections for the purpose of loading faster on an HTML page (which I disagree with anyway and is not what I'm looking for). But -nothing- that seems to do any sort of edge-detection to determine what pictures exist in a given 'scan batch'. I'm out of resources. I've nowhere else to go. Perhaps someone can clue me in on a piece of software that can do this for me."

28 of 72 comments

  1. be a hacker by photon317 · · Score: 3, Insightful


    Finding the image edges from her bulk scans is one of the more trivial operations you can do on an image. Grab a handy-dandy image library for your chosen format (pnglib, jpeg, whatever) and write a couple pages of code and you're done. GPL it and help others with the same problem.
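
    One way to read "finding the image edges", as a minimal sketch rather than a finished tool: assume the scan has already been decoded into a grayscale grid (a list of rows of 0-255 values) and that the scanner lid shows up as near-white. Any column that is entirely near-white is then a gap, and each run of non-gap columns is one photo's horizontal extent. The name `content_runs` and the `background` and `tolerance` parameters are hypothetical, picked for illustration.

```python
def content_runs(gray, background=255, tolerance=16):
    """Return (start, end) column spans that contain non-background pixels.

    gray is a list of rows, each a list of 0-255 values (an assumed,
    already-decoded scan). end is exclusive.
    """
    width = len(gray[0])
    # A column counts as content if any pixel in it differs from the background.
    is_content = [
        any(abs(row[x] - background) > tolerance for row in gray)
        for x in range(width)
    ]
    runs, start = [], None
    for x, flag in enumerate(is_content):
        if flag and start is None:
            start = x                  # a photo begins here
        elif not flag and start is not None:
            runs.append((start, x))    # the photo ended just before x
            start = None
    if start is not None:
        runs.append((start, width))    # the photo touches the right edge
    return runs
```

    Running the same function on the transposed grid, `content_runs(list(zip(*gray)))`, would give the vertical spans. This only works when the photos are laid out with clear gaps between them.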

    --
    11*43+456^2
    1. Re:be a hacker by ConceptJunkie · · Score: 2

      That was my first thought.

      If you were doing this project, what image library would you use?

      I know there are lots of easy to find standard libraries for PNG, JPG, etc, etc. But what are some good options for image manipulation?

      --
      You are in a maze of twisty little passages, all alike.
    2. Re:be a hacker by KnightStalker · · Score: 3, Insightful
      --
      * And remember, it's spelled N-e-t-s-c-a-p-e, but it's pronounced "Mozilla."
    3. Re:be a hacker by photon317 · · Score: 2


      Yes, I suggested he fix his own problem. Consider it my small protest against Ask Slashdot. This is a great do it yourself job.

      --
      11*43+456^2
  2. Coincidence by Henry+V+.009 · · Score: 3, Interesting

    I was asked just yesterday to find something very much like this (except for some text processing and databasing thrown in). I haven't found anything out of the box that fits (not that I was smart enough to do an ask slashdot on it), so I think that I may have to code my own in the next week. Are you willing to pay a little cash for a custom solution?

  3. The scanner driver itself by Evro · · Score: 3, Informative

    It's been quite a while since I scanned anything, but I remember that with whatever scanner I was using at the time, in Photoshop you would do file->acquire->twain and it would bring up the scanning program. This program scanned the image lightly as a preview, and then let you select however many "jobs" you wanted at once, so you could select 1 square as job 1, another as job 2, another as job 3. Then it would make one pass and generate an image for each job.

    As I said, this was a while ago and I don't remember the scanner, but it was probably some UMAX. The name "Mira scanner" stands out in my mind as the scanning software. Your scanner may have this capability also; poke around a bit.

    --
    rooooar
  4. Re:perhaps sorting the photos ahead of time by FunkyRat · · Score: 2

    This was my thinking. ImageMagick is well suited for the job. There is even a Win32 TWAIN API for Perl. Combined with the Perl API for ImageMagick, you could even write that custom application that will automate the whole process.

  5. hardware solution? by Brewst3r · · Score: 2, Interesting

    Why not get an automatic document feeder for the scanner?

  6. Photoshop Actions by Diamon · · Score: 2

    Why not just set up a Photoshop action to make 6 copies of the current image and set the action to crop each to an individual image? I don't know if you can script the name to save under automatically, but even if you can't, you can just set up the actions to work on 10 or so original images and save to specific filenames, and then just batch rename between scanning batches.

  7. umax magic scan.. You're scanning Prints???? by acomj · · Score: 3, Informative

    I have a higher end scanner (PowerLook 3000). It allows you to do multiple scans in one pass.
    i.e. you preview
    you box all the pictures
    you click scan

    you can adjust each box's color/exposure separately.
    it scans each image as a separate file. Of course you have to preview each image, which takes time.

    You could write some software to do it. It might help to use a background mat of a consistent color though.

    I think you really should consider scanning only images you care about and adjusting each one individually. If you really care about your images get a negative scanner. Scanning negatives is far, far better than scanning prints.

  8. You Have Everything You Need by edthemonkey · · Score: 3, Informative

    Photoshop should be just fine for what you want to do.

    There should be a palette around that lets you create actions. You basically hit the record button and go through the steps you want to do. Map that to a keystroke and voila.

    When selecting the picture, you could even use a fixed width box, so all of the pictures will have the same dimensions. For the action, you could have it copy what you've selected, create a new image and paste the selection into it. Then have it merge layers and auto-adjust the levels/contrast and then have it bring up the save dialog box.

    All you're really left doing is scanning the set of pictures, clicking the mouse once (for the fixed width box), hitting the keystroke and then typing in the name of the file.

  9. Re:Learn to use the scanner software by isorox · · Score: 2

    Karma: Totally excellent! (mostly affected by Bill and Ted and one Anonymous Coward)

    Rufus is not an Anonymous Coward!!

  10. A couple suggestions on how to implement by blackcoot · · Score: 5, Insightful
    The solution to this problem should be fairly easy (this doesn't mean trivial) to implement, assuming that a couple conditions hold:
    1. The border around every image shows a high contrast against the scanner background (which is usually white). This shouldn't be too much of a problem, unless you take lots of pictures of very light things.
    2. Your photographs are rectangular. This may sound silly, but it's a lot easier to find a rectangle than some arbitrary n-gon.
    3. Your photographs are placed so that edges of the photograph run parallel to the bed (i.e. you put the pictures down squarely)
    If (1), (2) and (3) hold, then implementing this shouldn't be too bad --- I would use this algorithm:
    1. take a scan of the background of the scanner (i.e. hit the scan button with no pictures on the bed) and remember this background image
    2. for each image in the input set
      1. perform edge detection
      2. use a Hough transform to detect lines in the edge map
      3. calculate the difference image D. If B[k](i,j) is the value of color band k at pixel (i,j) in the background image and F[k](i,j) is the value of color band k at pixel (i,j) in the input image, then D(i,j) = max(abs(F[r](i,j)-B[r](i,j)), abs(F[g](i,j)-B[g](i,j)), abs(F[b](i,j)-B[b](i,j)))
      4. divide the difference image and the input image into rectangular regions using the lines that were detected with the Hough transform. For each region, calculate the average difference; if this value is large enough (i.e. over ~ 8-16), then consider that region to be a picture, otherwise it's blank space.
    Intel's OpenCV library (look it up at SourceForge) can do most of the "tough" stuff for you (i.e. the edge detection and the Hough Transform). Hope this helps :-)
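
    The difference-image and region-classification steps above can be sketched in plain Python. This is only an illustration under stated assumptions: images are lists of rows of (r, g, b) tuples, and the cut positions `row_cuts` and `col_cuts` are assumed to come from the edge detection and Hough transform steps, which are not implemented here. The function names are hypothetical, and the default threshold of 12 is just a value picked from the 8-16 range mentioned above.

```python
def difference_image(fg, bg):
    """D(i,j) = max over the color bands k of |F[k](i,j) - B[k](i,j)|."""
    return [
        [max(abs(f - b) for f, b in zip(fpix, bpix))
         for fpix, bpix in zip(frow, brow)]
        for frow, brow in zip(fg, bg)
    ]


def classify_regions(diff, row_cuts, col_cuts, threshold=12):
    """Split diff into rectangles at the cut lines and keep the "picture" ones.

    row_cuts / col_cuts are assumed outputs of the line-detection step.
    Returns (top, left, bottom, right) boxes (bottom/right exclusive) whose
    average difference exceeds threshold.
    """
    height, width = len(diff), len(diff[0])
    rows = [0] + sorted(row_cuts) + [height]
    cols = [0] + sorted(col_cuts) + [width]
    photos = []
    for top, bottom in zip(rows, rows[1:]):
        for left, right in zip(cols, cols[1:]):
            if bottom == top or right == left:
                continue  # a cut on the image border leaves an empty region
            total = sum(sum(diff[i][left:right]) for i in range(top, bottom))
            mean = total / ((bottom - top) * (right - left))
            if mean > threshold:
                photos.append((top, left, bottom, right))
    return photos
```

    Each returned box can then be cropped out of the input image and saved as its own file.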
  11. Scratch the itch by Smack · · Score: 3, Interesting

    Of course there are ways to do this on Windows. But it's hard to believe that no one has yet implemented it the unix way. There should be a utility where you type "picsplit *.jpg", and you get a directory of split files, sequentially numbered.

    Programmers are always looking for projects, this sounds like a relatively easy one.

  12. Re:is this the orthodoxy chatroom by TRACK-YOUR-POSITION · · Score: 2

    i dont understand how do i point my AOL to the free doctoral thesis channel.

  13. ImageMagick tools by austad · · Score: 3, Interesting

    ImageMagick comes with some command line utils that you can use to pull sections out of an image. Since you are scanning photos, I assume they are all the same size and have the same placement on the scanner. You could just write a script to grab 5 sections (or however many you are scanning) out of the image produced by the scanner. It won't automatically find the boundaries between each photo, but if you place them on the scanner carefully, you should be able to get consistent results.
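
    As a rough sketch of the fixed-position idea, here is the same approach written against Pillow in Python instead of the ImageMagick command line tools (a swapped-in library, not what the comment names). It assumes you have measured each photo's position on the bed once, in inches, and that Pillow is installed. The function names, the `out_pattern` default and the box values are all hypothetical.

```python
def inches_to_pixels(box_in, dpi):
    """Convert a (left, top, right, bottom) box in inches to whole pixels."""
    return tuple(round(v * dpi) for v in box_in)


def crop_fixed_boxes(scan_path, boxes_in, dpi, out_pattern="photo_{:02d}.png"):
    """Cut the same hand-measured boxes out of one scan and save each piece."""
    # Pillow is an assumption here; it is imported only when cropping is requested.
    from PIL import Image

    with Image.open(scan_path) as scan:
        for n, box_in in enumerate(boxes_in, start=1):
            # Image.crop takes (left, upper, right, lower) in pixels.
            piece = scan.crop(inches_to_pixels(box_in, dpi))
            piece.save(out_pattern.format(n))
```

    A call might look like `crop_fixed_boxes("scan.png", [(0.5, 0.5, 6.5, 4.5)], dpi=300)`, with one box per photo position on the bed.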

    The NetPBM package may have similar capabilities.

    --
    Need Free Juniper/NetScreen Support? JuniperForum
  14. Good suggestion by MarkusQ · · Score: 2

    This is almost exactly what I was about to suggest, except that I would use string taped down at the edges (& pulled tight) instead of sticks. Why? It makes it easier to remove the pictures afterwards. You'll probably want to do the following:
    1. Measure & make light pen marks on the sides/ends of the scanner where you want the strings to be (in case you have to replace them).
    2. Tape the strings down, but not too tightly.
    3. Tape two strips of wood (e.g. rulers) along two adjacent edges, over the strings.
    4. Lap the strings over & tape them to the top of the wood.
    5. Do the same on the other two edges, this time pulling the strings tight enough to "snap"

    Make a point to check that your strings haven't moved every hundred scans or so, and periodically check the cut images as well, just to save yourself from plunging ahead with a miscalibrated setup.

    -- MarkusQ

    1. Re:Good suggestion by cr0sh · · Score: 2

      How about a thin piece of glass with a grid made from black pinstripe tape - sandwich the pictures between the glass and a piece of cardboard (or similar material), then place the sandwich on the scanner, and scan - use software to pull the images based on coordinate or edge detection (detecting the grid)...

      --
      Reason is the Path to God - Anon
    2. Re:Good suggestion by adamjaskie · · Score: 2

      The problem with this is that the glass would move the picture away from the scanner element, which would cause the image to be out of focus. Even the thickness of a thin sheet of glass would most likely be enough to bring the image out of focus.

      The best way to do this is to sort the pictures by size, and put, say, 3 4x6 photos on the bed at once, all landscape. Tell the scanner to scan 7200 pixels high, 3600 pixels wide, and save all the 4x6 images this way. Now, use image manipulation programs (there are some programs that are commandline based for resizing, cropping, etc) to write a script that divides each image into 3 images 2400 pixels high each, saves them with auto-numbered filenames, and deletes the 3 picture images. Open the whole folder in xv or something, and go thru each photo with next, rotating as necessary.
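
      The arithmetic behind that script is small enough to sketch. Assuming the scan is exactly 3600 x 7200 pixels with three photos stacked top to bottom, the helpers below compute the three crop boxes and a run of auto-numbered filenames; the actual cropping would be left to whatever image tool you prefer. Both function names and the filename pattern are hypothetical.

```python
def strip_boxes(width, height, count):
    """Divide a width x height scan into `count` equal horizontal strips.

    Returns (left, top, right, bottom) boxes, top strip first.
    Any remainder rows at the bottom are ignored.
    """
    strip = height // count
    return [(0, n * strip, width, (n + 1) * strip) for n in range(count)]


def numbered_names(first, count, pattern="photo_{:05d}.png"):
    """Return `count` consecutive filenames starting at number `first`."""
    return [pattern.format(first + n) for n in range(count)]
```

      For the 4x6 example, `strip_boxes(3600, 7200, 3)` yields three boxes that are each 2400 pixels high.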

      --
      /usr/games/fortune
    3. Re:Good suggestion by cr0sh · · Score: 2
      Maybe, maybe not - I thought about this when I posted, that it would be out of focus, but many scanners actually have good focus up to about an inch away from the surface. I am not saying perfect, obviously if you move away from the surface it will be worse, but if you scan at a high enough res, when you reduce the res (why would you want ultra-high res - storage would be tougher, and you would start to pick up the grain of the photo), what little focus issue there is wouldn't matter.

      That said, I do like your idea, it is the most logical thing to do - but I was trying to think up some way that photos could be loaded quickly off-bed as one set is scanning, so that you could scan/swap/load/swap/scan/swap, etc...

      One thing that I want to build or buy is a backlight for my scanner for negatives/slides. However, this need is becoming less and less as I use my cheapo digital camera...

      --
      Reason is the Path to God - Anon
  15. Scan the negatives by debugdave · · Score: 2, Interesting

    Why don't you just scan the negatives? (that is if you still have them) It would be more economical than spending the next 2 years chopping up 8"x10"x150dpi.

    Just a thought.

    Dave

  16. Organizing Images by fm6 · · Score: 2
    You can create individual files when you need to. But when do you need to? Browsing through them? Printing them out? You don't need them in individual files for that.

    It is useful to have individual files to share with other people. But these might not be the same files you originally captured. You'd probably scan in the photos at the highest resolution and color depth your scanner can do. But when you attach an image to an email, you probably need to step it down a bit.

    Anyway, you just have to have some kind of organizing software, if you've got so many photos that optimizing scanner time becomes an issue. If I were designing software that did this, I'd use a database, like the biolife example that comes with Delphi and Kylix. But now that I look at what's available, I see that most apps do store images as individual files.

    Still, the principle's the same. A good organizer can capture new images from your clipboard. So the procedure works out like this:

    1. Set organizer to automatically capture from clipboard.
    2. Place a bunch of images on the scanner.
    3. Scan. (This is the time-consuming part we need to minimize.)
    4. Drag a rectangle around an image, and click "copy".
    5. Repeat previous step for all images we just scanned.
    6. Go to step 2.
    I guess that is a little less efficient than software that recognizes all the image boundaries. But the slight extra effort might not be as hard as finding software with the right kind of boundary recognition feature.
  17. A few approaches... by rthille · · Score: 2

    First, scanning negatives with a film scanner will give you better results, but you state that you're poor, so that's out.

    Second, you can split the image files from the scanner by size. This would require that you set up the photos so that you scan groups of photos that are all the same size together, and get them placed in the same place on the scanner. Even then, you'd end up with either some white space around the photos, or they'd be cropped a tiny bit.

    The third option (the one I think you had in mind) is image processing. pnmcrop (http://netpbm.sourceforge.net/doc/pnmcrop.html) will crop off the white or black border around a photo, but probably won't handle 4 photos at once. If you can fit your 4 photos within 4 static boxes, you can crop to those boxes by size, then use pnmcrop to chop off the extra white border.
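
    To show what that final border-trimming pass amounts to, here is a minimal sketch in Python. It is not how pnmcrop is implemented, only the same idea in miniature: it assumes one rough crop decoded into a grayscale grid (a list of rows of 0-255 values) with a near-white border, and returns the box that encloses everything else. The name `trim_box` and its parameters are hypothetical.

```python
def trim_box(gray, background=255, tolerance=16):
    """Return (left, top, right, bottom) of the non-background area, or None.

    right and bottom are exclusive. None means the crop is all background.
    """
    def is_content(value):
        return abs(value - background) > tolerance

    # Rows and columns that contain at least one non-background pixel.
    rows = [y for y, row in enumerate(gray) if any(is_content(v) for v in row)]
    if not rows:
        return None
    cols = [x for x in range(len(gray[0]))
            if any(is_content(row[x]) for row in gray)]
    return (cols[0], rows[0], cols[-1] + 1, rows[-1] + 1)
```

    A photo with a white sky running to its edge would be trimmed too far by this, which is the same caveat that applies to any automatic border cropping.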

    --
    Awesome furniture, accessories and cabinetry in Santa Rosa, CA: http://humanity-home.com/
  18. gimp by pizza_milkshake · · Score: 2

    this is an interesting problem, and since I've got GIMP installed and know a bit of Lisp (similar to the Scheme that "Script-Fu" uses) I'm going to try and see what I can do using gimp's built-in functions. i'll set up a page at www.parseerror.com/split/ with my progress. i wouldn't gamble on when i'll have something worth using, but hey, you never know. i'll post back to this thread if i make any major break-throughs.

    1. Re:gimp by adamjaskie · · Score: 2

      There is a version of GIMP for Windows. Do a Google search for GIMP Windows, and you should find it. The GIMP is a very nice program. It is not as stable under Windows as it is under Linux, but it is comparable with Photoshop. From what I can tell, it is quite a bit harder to use, but a LOT more extensible than Photoshop. Of course, it is Open Source, so you can extend it however much you want hehe.

      --
      /usr/games/fortune
  19. Re:Hamrick Software Vuescan by Cy+Guy · · Score: 2

    Please mod up parent, it sounds like the best solution for a non-techie.

    I figured an out of the box solution existed since most OCR software does this automatically when you scan a page of mixed text and images: it will block out the page into sections to be OCR'd and images to be jpg compressed. I know Adobe Acrobat's scanning version is pretty good at this and lets you pre-define the default compression for the images.

  20. Re:is this the orthodoxy chatroom by TRACK-YOUR-POSITION · · Score: 2

    Every moment of my life the universe asks "Would you like to die now? It's free!" And the correct answer/reaction to give the universe is always NO.

  21. Re:is this the orthodoxy chatroom by TRACK-YOUR-POSITION · · Score: 2

    Slow down, space cowboy!