Slashdot Mirror


Content-Aware Image Resizing

An anonymous reader writes "At the SIGGRAPH 2007 conference in San Diego, two Israeli professors, Shai Avidan and Ariel Shamir, have demonstrated a new method to shrink images. The method is called 'Seam Carving for Content-Aware Image Resizing' (PDF paper here) and it figures out which parts of an image are less significant. This makes it possible to change the aspect ratio of an image without making the content look skewed or stretched out. There is a video demonstration up on YouTube."

43 of 174 comments

  1. The paper via ACM by xenocide2 · · Score: 4, Informative

    The author's website was pegged serving that 20MB PDF before Slashdot got ahold of it; I doubt it'll survive now. The paper is also hosted by the ACM, if you're a subscriber.

    --
    I Browse at +4 Flamebait

    Open Source Sysadmin

    1. Re:The paper via ACM by spydir31 · · Score: 4, Informative

      The Coral Cache has it also.

    2. Re:The paper via ACM by Anonymous Coward · · Score: 5, Informative

      I used a lossy compression algorithm on their paper and got this...

      Shrink image:
      Step 1: Run an edge detection algorithm.
      Step 2: Find minimal energy (fewest edges crossed) path from top to bottom or left to right (dynamic programming).
      Step 3: Remove pixels along that path.
      Step 4: Repeat steps 2 and 3 as necessary.

      Extend image:
      Step 1: Run an edge detection algorithm.
      Step 2: Find minimal energy (fewest edges crossed) path from top to bottom or left to right (dynamic programming).
      Step 3: Insert pixels along that path (interpolated from neighbors)
      Step 4: Repeat steps 2 and 3 as necessary.

      Remove objects:
      Step 1: Run an edge detection algorithm.
      Step 2: Mask object by giving its pixels low/negative energy values.
      Step 3: Find minimal energy (fewest edges crossed) path from top to bottom or left to right (dynamic programming).
      Step 4: Remove pixels along that path.
      Step 5: Repeat steps 3 and 4 as necessary.
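
The "Shrink image" steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' code: the image is a grayscale list of rows, the energy is a simple sum of absolute horizontal and vertical differences, and the function names (`energy_map`, `find_vertical_seam`, `shrink_width`) are made up for this example. The energy is recomputed after every seam, which is slow but keeps the sketch short.

```python
# Minimal seam-carving sketch on a grayscale image stored as a list of rows.
# Assumption: energy = |horizontal difference| + |vertical difference|.

def energy_map(img):
    """Gradient-style energy with clamped borders."""
    h, w = len(img), len(img[0])
    e = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            dx = img[y][min(x + 1, w - 1)] - img[y][max(x - 1, 0)]
            dy = img[min(y + 1, h - 1)][x] - img[max(y - 1, 0)][x]
            e[y][x] = abs(dx) + abs(dy)
    return e


def find_vertical_seam(e):
    """Return one column index per row for the cheapest top-to-bottom path."""
    h, w = len(e), len(e[0])
    cost = [row[:] for row in e]
    # Accumulate: each cell adds the cheapest of its three upper neighbours.
    for y in range(1, h):
        for x in range(w):
            lo, hi = max(x - 1, 0), min(x + 1, w - 1)
            cost[y][x] += min(cost[y - 1][lo:hi + 1])
    # Backtrack from the cheapest cell in the bottom row.
    x = min(range(w), key=lambda i: cost[h - 1][i])
    seam = [x]
    for y in range(h - 1, 0, -1):
        lo, hi = max(x - 1, 0), min(x + 1, w - 1)
        x = min(range(lo, hi + 1), key=lambda i: cost[y - 1][i])
        seam.append(x)
    seam.reverse()
    return seam


def remove_vertical_seam(img, seam):
    """Drop one pixel per row, at the column the seam passes through."""
    return [row[:x] + row[x + 1:] for row, x in zip(img, seam)]


def shrink_width(img, n):
    """Remove n vertical seams, recomputing the energy each time."""
    for _ in range(n):
        img = remove_vertical_seam(img, find_vertical_seam(energy_map(img)))
    return img
```

On a toy image with a bright stripe in the middle, `shrink_width` should eat the flat background columns first and leave the stripe alone. Horizontal seams can be handled by transposing the image and reusing the same functions.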

    3. Re:The paper via ACM by Anonymous Coward · · Score: 5, Insightful

      I think you've got it except for a small detail in the "Remove objects", which the narrator alludes to around timestamp 4:01 of the video. You might want to add:

      Step 6: Extend image to match original size using the previous extend image algorithm

      (Of course, I leave the obligatory Profit step as an exercise for the reader).
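
A rough Python sketch of the "Remove objects" steps (1 to 5) follows. It is an illustration only: the grayscale list-of-rows image, the 0/1 `mask`, the gradient energy, and the names `masked_energy` and `remove_object` are all assumptions for this example. Masked pixels get an energy negative enough that every minimal seam must pass through at least one of them. Step 6 (re-growing the image to its original size) is left out here.

```python
def masked_energy(img, mask):
    """Gradient energy, with masked pixels forced to a large negative value."""
    h, w = len(img), len(img[0])
    e = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            dx = img[y][min(x + 1, w - 1)] - img[y][max(x - 1, 0)]
            dy = img[min(y + 1, h - 1)][x] - img[max(y - 1, 0)][x]
            e[y][x] = abs(dx) + abs(dy)
    # Bigger than the cost of any seam through unmasked pixels, so a seam
    # touching the mask always beats a seam that avoids it.
    big = 1 + sum(max(row) for row in e)
    for y in range(h):
        for x in range(w):
            if mask[y][x]:
                e[y][x] = -big
    return e


def find_vertical_seam(e):
    """Cheapest top-to-bottom path, one column index per row."""
    h, w = len(e), len(e[0])
    cost = [row[:] for row in e]
    for y in range(1, h):
        for x in range(w):
            lo, hi = max(x - 1, 0), min(x + 1, w - 1)
            cost[y][x] += min(cost[y - 1][lo:hi + 1])
    x = min(range(w), key=lambda i: cost[h - 1][i])
    seam = [x]
    for y in range(h - 1, 0, -1):
        lo, hi = max(x - 1, 0), min(x + 1, w - 1)
        x = min(range(lo, hi + 1), key=lambda i: cost[y - 1][i])
        seam.append(x)
    seam.reverse()
    return seam


def remove_object(img, mask):
    """Remove vertical seams until no masked pixel is left."""
    img = [row[:] for row in img]
    mask = [row[:] for row in mask]
    while any(any(row) for row in mask):
        seam = find_vertical_seam(masked_energy(img, mask))
        # The mask is carved with the same seam so it stays aligned.
        img = [r[:x] + r[x + 1:] for r, x in zip(img, seam)]
        mask = [r[:x] + r[x + 1:] for r, x in zip(mask, seam)]
    return img
```

The result is narrower than the input by however many seams it took to consume the mask; the re-extension mentioned in this comment would then insert that many seams back.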

    4. Re:The paper via ACM by DotDotSlasher · · Score: 2, Informative

      This excellent website http://trowley.org/sig2007.html has a host of links to almost all of the papers presented at SIGGRAPH 2007.

  2. nice! by White+Shade · · Score: 4, Interesting

    It seems like a little bit of work is left to make it as completely automated as you would need to have it just "always work" on any platform or device, but it seems like they're already working on that...

    Other than that though, that's pretty awesome... I'm sure there are more instances where it doesn't look right than what they showed, but it's definitely cool how well it works as it stands!

    I can imagine it would be extremely useful for ex-boyfriends or ex-girlfriends; just load up all their photos of them and their ex, wave the magic eraser, and *boom* you don't have to delete all your old vacation shots ;)

    I wonder how well it would work for the porn industry too; nice automatic resizing of breasts without ruining the picture! Fetishists will be SO happy! :)

    1. Re:nice! by Allicorn · · Score: 2, Interesting

      It seems like a little bit of work is left to make it as completely automated as you would need to have it just "always work"

      Completely right, yes. The images in the video have been selected to show this technique in the best possible light. There's a great variety of images that'll really not work quite right with a completely automated treatment. Speaking from experience, having implemented this last week.

      That said, as pointed out in the paper there's plenty of room for a higher level of analysis over the top of the basic seam-carving procedure. The function used to calculate the energy of a given pixel is easily swapped out with any one of dozens of different approaches. A more user-friendly implementation could attempt seam-carving based on a number of different feature maps and work out which is likely to produce the least distortion for a given image.
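
To make the "swappable energy function" point concrete, here is a hedged Python sketch in which the energy function is just a parameter, plus a crude chooser that tries several energy functions and keeps the one whose removed seam costs least under a shared yardstick. Everything here is hypothetical: the names (`gradient_energy`, `flat_energy`, `carve_one`, `pick_best`), the list-of-rows grayscale image, and especially the distortion measure, which is only a stand-in for whatever a real implementation would use.

```python
def gradient_energy(img):
    """|dx| + |dy| with clamped borders (one possible energy function)."""
    h, w = len(img), len(img[0])
    return [[abs(img[y][min(x + 1, w - 1)] - img[y][max(x - 1, 0)])
             + abs(img[min(y + 1, h - 1)][x] - img[max(y - 1, 0)][x])
             for x in range(w)] for y in range(h)]


def flat_energy(img):
    """A deliberately useless energy: every pixel looks equally unimportant."""
    return [[0] * len(row) for row in img]


def find_vertical_seam(e):
    """Cheapest top-to-bottom path, one column index per row."""
    h, w = len(e), len(e[0])
    cost = [row[:] for row in e]
    for y in range(1, h):
        for x in range(w):
            lo, hi = max(x - 1, 0), min(x + 1, w - 1)
            cost[y][x] += min(cost[y - 1][lo:hi + 1])
    x = min(range(w), key=lambda i: cost[h - 1][i])
    seam = [x]
    for y in range(h - 1, 0, -1):
        lo, hi = max(x - 1, 0), min(x + 1, w - 1)
        x = min(range(lo, hi + 1), key=lambda i: cost[y - 1][i])
        seam.append(x)
    seam.reverse()
    return seam


def carve_one(img, energy_fn, yardstick):
    """Remove one seam chosen by energy_fn; score it with the yardstick."""
    seam = find_vertical_seam(energy_fn(img))
    ref = yardstick(img)
    cost = sum(ref[y][x] for y, x in enumerate(seam))
    carved = [row[:x] + row[x + 1:] for row, x in zip(img, seam)]
    return carved, cost


def pick_best(img, energy_fns, yardstick=gradient_energy):
    """Try each named energy function and keep the least-costly result."""
    results = {name: carve_one(img, fn, yardstick)
               for name, fn in energy_fns.items()}
    best = min(results, key=lambda name: results[name][1])
    return best, results[best][0]
```

Scoring with the gradient yardstick obviously favours the gradient energy, so a real chooser would need a more neutral distortion measure; the point is only that the seam search never needs to know which energy function it was given.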

      Anyhow, cracking bit of work IMHO, with boatloads of potential applications.

      Alli

      --
      OMG!!! Ponies!!!
    2. Re:nice! by aliquis · · Score: 5, Funny

      I've never understood this hate-your-ex thing. The person was part of your life for some time, but you have decided to hate them and want to erase them from it?
      Better never to get a partner at all if you are going to hate the person once it no longer works.

      But then I'm a regular slashdot visitor and don't have any exes so what do I know.

  3. Practical uses by themushroom · · Score: 2, Funny

    Finally, a way to reduce the space between surgically augmented breasts and lengthen wangs on Flickr!

  4. Slightly Strange by JamesRose · · Score: 3, Interesting

    Okay, I get that they remove the pixels with the least energy, so the unimportant information is lost when shrinking. It kinda works; it looks a bit strange, but it's okay. However, when they make an image larger they also add the least information, so you end up with a large image where the useful information is the same size and the useless low-energy background gets duplicated. To me that's kinda pointless: you're adding stuff you've analysed and found NOT to be the focus of the picture. This may work for pictures with no obvious background, but landscapes, like one of the examples, have such an obvious background that only that gets enlarged, which just gives you more background. You may as well just add a nice blue frame round the edge of the picture to make it fit.

    1. Re:Slightly Strange by Anonymous Coward · · Score: 2, Interesting

      when they make an image larger they also add the least information so you end up with a large image- but the useful information is the same size and the extra/useless low energy or background gets duplicated- to me,

      According to the video, the added background information is actually the averaging of the extra "low energy" information around it. So it's not quite duplicated.
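
A small Python sketch of that averaging step, assuming a seam (one column index per row) has already been found. The function name `insert_vertical_seam` and the exact averaging rule (mean of the seam pixel and its right-hand neighbour) are assumptions for illustration; the video only says the new pixels are averaged from their neighbours.

```python
def insert_vertical_seam(img, seam):
    """Widen a grayscale image (list of rows) by one pixel per row.

    For each row, a new pixel is inserted just right of the seam column.
    Its value is the mean of the seam pixel and the pixel to its right
    (clamped at the image border, where it simply repeats the seam pixel).
    """
    out = []
    for row, x in zip(img, seam):
        right = row[min(x + 1, len(row) - 1)]
        new_pixel = (row[x] + right) / 2
        out.append(row[:x + 1] + [new_pixel] + row[x + 1:])
    return out
```

For example, inserting at column 2 of the row `[0, 0, 10, 20]` gives `[0, 0, 10, 15.0, 20]`. Repeating this naively tends to pick the same low-energy seam again and again, so an implementation that grows an image by many pixels would probably want to choose several distinct seams up front.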

    2. Re:Slightly Strange by Nutty_Irishman · · Score: 4, Insightful

      I think you're missing the point of their method, which is to provide realistic images during rescaling that aren't corrupted by blind interpolation (equal averaging). In downscaling the images, it preserves parts of the images that would lose their information through downscaling (e.g. complex textures, people), while at the same time removing textures that would not lose information through downscaling (sky, water, sand). The sky, water and sand will still look like sky, water and sand whether it's at 1/4 or 10x resolution; people, however, look much different if you try to downscale or upscale them (they would appear blurry and hard to distinguish). The same works in reverse. The sky is still going to look like the sky whether you scale it to 10x or 5x-- it would still look natural. Trees, on the other hand, would not. Once you start to scale up the trees you would expect to be seeing different characteristics-- leaves, branches, etc. Any type of scaling up of a tree would make it seem very blurry and unnatural (lacking leaves, branches, etc.)-- you cannot create additional information that isn't present in the original image. Therefore, the most natural looking image would be to increase the sky.

      It's not perfect of course. I'm guessing that if you had a picture of two people next to each other, one with a solid colored shirt, and the other with a striped shirt, the solid colored shirt guy would get skinnier than the striped one when shrinking, and the reverse when enlarging. However, it's a neat idea, and I look forward to reading the paper.

  5. Re:I For One by cnettel · · Score: 3, Insightful
    It's not compression as we know it, Jim. It's more like scaling on totally overcool steroids. The basic idea seems rather simple. I would even imagine you could get a bit of enhanced picture quality by coding simplified vector info on seams, and then doing a normal JPEG of a downscaled picture. That would be a quite contrived way to get a kind of VBR-like behavior in normal JPEG. One issue with JPEG is, after all, that redundancy is detected and handled on the block level, while this algorithm works along arbitrary paths.

    I'm really impressed. Again, maybe not too hard to implement at first, but probably damn hard to get working perfectly, and I might just be ignorant (and I'm entitled to, it's far from my field of work), but I've not seen anyone doing it before.

  6. A picture speaks a thousand words... by Aphrika · · Score: 3, Insightful

    So does this mean you're taking some of those words away?

    There are probably a few situations where the 'unimportant' bits of an image are still as relevant as the rest. Sports photos for instance - especially those played on grass - would not give you a true picture (literally) of what's going on in the scene.

    This'd be good for reference photos - like the animals at the start of the YouTube video, but applications where precision and distance are required wouldn't benefit. Nice bit of work though and I reckon with some smart scaling embedded too (rather than its 'folding effect'), it'd cater for most image retargeting requirements.

    1. Re:A picture speaks a thousand words... by Fred+Ferrigno · · Score: 5, Informative

      It's not removing any more pixels than normal resizing or cropping would, it's just doing it such that the least important ones are removed first. Instead of:

      he uic bownfoxjumed verthelaz yelowdog

      You get:

      Th qik brwn fx jmpd ovr th lzy ylo dog

      Which reduces the total size by the same amount, but retains more information than treating every bit of information the same.

    2. Re:A picture speaks a thousand words... by random735 · · Score: 5, Interesting

      while this is technically true, you're also rearranging the relative positioning of those pixels. cropping something out doesn't change the relationship of what is left in the photo (though it may remove critical details).

      if you have 3 people in a picture and you crop it down to 2, you've erased a person, but you haven't changed who is seated next to whom. if you use this method and the middle person is erased, you make it appear as though the outer two people were in fact seated next to each other when they weren't.

      we are used to the idea that a picture can be cropped (mentally considering what might be just outside the frame). We aren't yet used to the concept that the photo has effectively been cut and pasted together to create new relationships between the objects in the photo (though of course photoshop is getting us there).

      to continue your analogy, if we take:
      the quick brown fox jumped over the lazy dog

      and drop letters, we can create:
      the cow jumped over the dog

      whereas "cropping" might let us say:
      the quick brown fox jumped

      I think it's clear that one of these is more misleading than the other, though in both cases you're just removing information. (in one case, some of that information happens to be spaces between letters/words)

    3. Re:A picture speaks a thousand words... by zippthorne · · Score: 2, Insightful

      I don't know whether I'm "used to it" or not, but after watching the video, I'm totally ready for more intelligent image resizing that isn't quite scaling. Most of the applications I see this being used in don't really require that the exact photographic position (which really isn't the same as what you'd see if you were there) relationships be maintained anyway.

      Hopefully someone will write a GIMP plugin and we can all experiment with it. Also a Firefox plugin. Obviously some metadata will eventually need to be included in the images to delineate faces and whatnot, but web designers can easily handle sloppy painting-over in Photoshop type tasks.

      --
      Can you be Even More Awesome?!
    4. Re:A picture speaks a thousand words... by pclminion · · Score: 2, Insightful

      There are probably a few situations where the 'unimportant' bits of an image are still as relevant as the rest. Sports photos for instance - especially those played on grass - would not give you a true picture (literally) of what's going on in the scene.

      Sorry -- "true picture?" That assumes such a thing can exist in the first place. Take a color-blind viewer for instance. Can he (and I say he because statistically, most color-blind people are male) look at ANY image and say that he is seeing the "true image?" How is his experience any more or less true than the experience YOU have when you look at the image?

      Any scaling of an image, by definition, must remove (or insert, if up-scaling) information in an image. Usual scaling techniques insert or remove a constant information density across the image. This means that areas with low information lose just as much fidelity as regions with high information. A better method would have removed more information from the area that is already low in information to begin with, leaving more information in the area where it matters. This is exactly what this new algorithm does.

      So it is fairly obvious that this method is superior, from a purely information-theoretic standpoint, to typical scaling algorithms. Are there images where its application might be inappropriate? Yes. Compressing an image of an abstract piece of art might do unforgivable damage to it. There is a simple solution -- do not use this algorithm on such images.

  7. DP Approach by xquark · · Score: 3, Interesting

    This method is quite interesting, though it falls over in situations where the detail level or entropy of the background is as great as the foreground. Also, the paper doesn't go into much detail about the dynamic programming approach they used to find the path of least energy; I guess that aspect of it is patentable. Another thing they could investigate is the use of diagonal seams instead of just staggered vertical and horizontal seams.

    All in all a very interesting read.

    --
    Arash Partow's Philosophy: Be a person who knows what they don't know, and not a person who doesn't know.
    1. Re:DP Approach by The+New+Andy · · Score: 2, Insightful
      I certainly hope it isn't patented, since by just watching the video once (without sound) I was able to make my own implementation in C in under two hours. I completely agree that it is a cool idea, but I think the reason it is so cool is that the parts they used to build it are all so simple/well known - it is just a really novel combination of ideas that people have already come up with. The idea of a patent (I believe) is so that an inventor won't keep their invention to themselves, so that people can see how it all works and it benefits the public. There aren't any hidden tricks here - the (image-processing) public can easily work out how it is working just by looking at it.

      Just in case I haven't been clear - I think that the idea is awesome, novel and brilliant. And I believe that it is possible for something to be awesome, novel and brilliant but also 'obvious'. Just like in maths when they showed you complex numbers, and how they bring some sanity into the system. Once they give you the hint that the square root of a negative number can be defined, then you can go away and easily derive all the cool things like Euler's form and whatnot. Now replace 'the square root of a negative number can be defined' with 'you can crop a jagged column from an image' and you have a pretty good parallel.

    2. Re:DP Approach by pclminion · · Score: 2, Informative

      Also the paper doesn't go into too much detail about the dynamic programming approach they used to find the path of least energy, I guess that aspect of it is patentable.

      Not so much patentable, as "Easy enough for the reader to implement that it deserves little mention."
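
For what it's worth, the dynamic program for a vertical seam is usually written as a one-line recurrence. The notation below is an assumption for illustration: e(i, j) is the energy of the pixel in row i and column j, M(i, j) is the cheapest cumulative energy of any seam ending at that pixel, and the image has n rows. Terms that would fall outside the image are simply dropped from the minimum.

```latex
M(1, j) = e(1, j), \qquad
M(i, j) = e(i, j) + \min\bigl( M(i-1,\, j-1),\; M(i-1,\, j),\; M(i-1,\, j+1) \bigr),
\quad i = 2, \dots, n
```

The seam ends at the column j that minimises M(n, j) and is recovered by walking back up, at each row picking whichever of the three upper neighbours gave the minimum.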

  8. I Think You'll Find by JamesRose · · Score: 3, Insightful

    10 Seconds of work there, most probably a good deal longer finding a picture that is easy to do it to...

  9. Prior art by SamP2 · · Score: 2, Informative

    The technique was already invented by the Soviets in the '30s:

    Before

    After

    Insignificant person removed.

    1. Re:Prior art by francium+de+neobie · · Score: 2, Informative

      No, your image is just an often-cited example of what image inpainting could do. And image inpainting has nothing to do with the new resize algorithm talked about in the article, although similar effects are achieved in this specific case.

  10. Whao by Arthur+B. · · Score: 4, Funny

    Ths s rly gret !

    --
    \u262D = \u5350
  11. Gimp! by larry+bagina · · Score: 5, Interesting

    Although they demonstrated on Windows, a friend of mine is one of their graduate students and was peripherally involved. He said it was originally developed as a GIMP plug in, but moved to a separate Windows app to show off the realtime resizing, etc. Hopefully they'll release the GIMP plugin? More likely Adobe will write them a check and license it to make sure that never happens.

    --
    Do you even lift?

    These aren't the 'roids you're looking for.

  12. I can see the spam now by MarkovianChained · · Score: 3, Funny

    Shrink the rest of your body, and increase your penis size by up to 20 pixels!

  13. Does Anyone Find It Ironic by szyzyg · · Score: 4, Funny

    I find a small irony in the fact that the video is posted on YouTube, a site which stretches and squeezes video to fit into a 4:3 aspect ratio.

  14. Paranoia! It's not just for Gimps by Anonymous Coward · · Score: 2, Insightful

    "More likely Adobe will write them a check and license it to make sure that never happens."

    Is that check going to cover the removal of their paper from above and the ACM archives, let alone OUR archives?

  15. Re:The Commissar Vanishes by larry+bagina · · Score: 2, Interesting

    It could be worse.

    In December 2001 The New York Fire Department unveiled plans for a statue based on the photograph to be placed at the Brooklyn headquarters. In an effort to be politically correct, the statue was to include black, white, and Hispanic firefighters. However, it was cancelled in an outcry about rewriting history -- the depicted firefighters are white.

    --
    Do you even lift?

    These aren't the 'roids you're looking for.

  16. My Implementation by The+New+Andy · · Score: 5, Interesting

    I thought it was pretty cool, so I made my own version after seeing the video. It obviously won't be as awesome as their one, but if you want to play around with it, you can get my C source and have a play around. It is GPL3.

  17. Re:Let us be wholly thankful... by kennygraham · · Score: 3, Funny

    It's times like this when I become truly aware of my own gaping inadequacies, and feel the deep, deep obligation to rectify my own short comings.

    hehe... gaping... deep deep... rectum... i mean rectify... hehe

    i need to get some sleep

  18. removing the intended layout by __aapbzv4610 · · Score: 2, Insightful

    What about artistic photographs? Most photos in that sense are planned to have a certain layout, composition, empty spaces, etc. Say I make a nice panorama shot with a 6:1 aspect ratio. Now my photo that took careful planning is reduced to a 4:3 with all the 'unimportant' spaces removed? Maybe it's just me, but there seem to be lots more instances where this would hurt than help. Journalistic images? Sports photos? Oh, the image can't fit, let's get rid of everything between the 50 and 20 yard lines. There aren't any players standing there. I really only see this being beneficial for web ads. Instead of creating square, vertical, and horizontal versions of the same ad, just make one and let the image be 'resized' accordingly.

  19. Re:Not ready for Prime Time by pclminion · · Score: 4, Insightful

    It has nothing to do with edge detection. The algorithm simply detects paths of minimal gradient which lead from one side of the image to the opposite side. This can be used to produce a "pretty picture" which shows the edges -- but this is merely fallout.

    They showed what I thought were several realistic photos with complex backgrounds, and the algorithm did well overall, except on structures where people are closely attuned to exact detail -- such as human faces. If we weren't innately wired to process faces in incredible detail, we wouldn't even notice the distortion.

    So it's not perfect. Can you show me something in this world that is? And I don't think there has been any mention of "prime time" application, whatever that means.

  20. Ariel Shamir by Schraegstrichpunkt · · Score: 2, Informative

    ... not to be confused with Adi Shamir (the cryptographer).

  21. some code by Arthur+B. · · Score: 3, Interesting

    Too much caffeine in the blood, couldn't sleep... I can't get my hands on the paper, but the YouTube presentation was extremely clear and I just wrote this C code based on libgd2. Basically it lowers the height of an image by 1 pixel; you can run it multiple times to remove more lines.

    http://rafb.net/p/jinioy45.html

    (yeah my coding sucks but it produces awesome results and I reverse engineered the algorithm from YouTube so please grovel...)

    I'll improve it soon to remove an arbitrary number of lines, horizontally or vertically:
      - no full recalculation of the gradient; only the gradient near the removed line needs to be recomputed
      - precompute a file that stores the order in which pixels should be removed

    I need help with something though. I understand how the algorithm can precompute a file which says in which order pixels should be removed, but I don't see how this can work in *both* directions. Suppose you want to reduce vertically and horizontally at the same time: the horizontal change should completely break the precomputed vertical changes. How would you handle that?
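
For the one-direction case, the precomputed "order" file could look something like the Python sketch below. It is an illustration only, with assumed names (`removal_order`, `shrink_with_map`), a grayscale list-of-rows image, and a simple gradient energy. It carves vertical seams until nothing is left and records, for every original pixel, which seam took it out. It does not answer the two-direction question: a map built this way is only valid for width changes.

```python
def energy_map(img):
    """|dx| + |dy| with clamped borders."""
    h, w = len(img), len(img[0])
    return [[abs(img[y][min(x + 1, w - 1)] - img[y][max(x - 1, 0)])
             + abs(img[min(y + 1, h - 1)][x] - img[max(y - 1, 0)][x])
             for x in range(w)] for y in range(h)]


def find_vertical_seam(e):
    """Cheapest top-to-bottom path, one column index per row."""
    h, w = len(e), len(e[0])
    cost = [row[:] for row in e]
    for y in range(1, h):
        for x in range(w):
            lo, hi = max(x - 1, 0), min(x + 1, w - 1)
            cost[y][x] += min(cost[y - 1][lo:hi + 1])
    x = min(range(w), key=lambda i: cost[h - 1][i])
    seam = [x]
    for y in range(h - 1, 0, -1):
        lo, hi = max(x - 1, 0), min(x + 1, w - 1)
        x = min(range(lo, hi + 1), key=lambda i: cost[y - 1][i])
        seam.append(x)
    seam.reverse()
    return seam


def removal_order(img):
    """order[y][x] = index of the seam that removes original pixel (y, x)."""
    h, w = len(img), len(img[0])
    cur = [row[:] for row in img]
    # cols[y][x] remembers which original column a surviving pixel came from.
    cols = [list(range(w)) for _ in range(h)]
    order = [[0] * w for _ in range(h)]
    for k in range(w):
        seam = find_vertical_seam(energy_map(cur))
        for y, x in enumerate(seam):
            order[y][cols[y][x]] = k
            del cur[y][x]
            del cols[y][x]
    return order


def shrink_with_map(img, order, n):
    """Narrow the image by n pixels using only the precomputed map."""
    return [[p for p, k in zip(row, krow) if k >= n]
            for row, krow in zip(img, order)]
```

Once `removal_order` has been run, any target width is a single filtering pass with `shrink_with_map`, which is presumably how the real-time resizing in the video works.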

    --
    \u262D = \u5350
    1. Re:some code by Arthur+B. · · Score: 2, Interesting
      --
      \u262D = \u5350
    2. Re:some code by jez9999 · · Score: 2, Funny

      jpegin = fopen("test.png","rb");
      jpegout = fopen("out.png","wb");
      WTF?
  22. Video is on youtube.... by Tmack · · Score: 2, Informative
    Been up for a while now too, at least I saw it a few days ago...

    Clicky

    Tm

    --
    Support TBI Research: http://www.raisinhope.org
  23. Re:Great by compro01 · · Score: 2, Funny

    I'm not sure what else you want. For it to automatically remove less-significant faces?
    --
    upon the advice of my lawyer, i have no sig at this time
  24. Re:The Commissar Vanishes by compro01 · · Score: 2, Insightful

    I think it's obvious that this technique is just plain cool and has great potential for beneficial use, even if it might be used for ill. What do you expect? It's technology. Technology works for the highest bidder, not the people with the highest morals.
    --
    upon the advice of my lawyer, i have no sig at this time
  25. Re:Great - We can do this, but should we? by vasanth · · Score: 4, Insightful

    Your comment reads like a tabloid headline. Just because a technology could be used for negative purposes does not mean that it should not be developed. If your reasoning were used, we would all still be living in caves.

    By your reasoning
    Cars can be used by criminals to travel faster.
    A knife can be used to kill
    Electricity can be used to kill
    Computers can be used by the government to collect more information about us effectively

    Is that really what we want?

    see the flaw in the logic?

  26. Re:Great - We can do this, but should we? by Dogtanian · · Score: 2, Funny

    Is that really what we want?

    Reminds me of that Harry Enfield sketch.... Is that what you want? 'Cos that's what'll happen. Won't make any sense to Americans, but who cares :-)
    --
    "Slashdot - News and Chat Sites Deviant". (Click "homepage" link above for details).