Choosing Better-Quality JPEG Images With Software?

Posted by timothy on Thursday July 16, 2009 @10:02AM from the on-the-tip-of-my-script dept.

kpoole55 writes "I've been googling for an answer to a question and I'm not making much progress. The problem is image collections, and finding the better of near-duplicate images. There are many programs, free and costly, CLI or GUI oriented, for finding visually similar images — but I'm looking for a next step in the process. It's known that saving the same source image in JPEG format at different quality levels produces different images, the one at the lower quality having more JPEG artifacts. I've been trying to find a method to compare two visually similar JPEG images and select the one with the fewest JPEG artifacts (or the one with the most JPEG artifacts, either will serve.) I also suspect that this is going to be one of those 'Well, of course, how else would you do it? It's so simple.' moments."

26 of 291 comments

File size by Tanman · 2009-07-16 10:06 · Score: 2, Insightful

it is lossy compression, after all . . .
1. Re:File size by teko_teko · 2009-07-16 10:16 · Score: 3, Insightful
  
  File size may not be accurate if it has been converted multiple times at different quality, or if the source is actually lower quality.
  The only way to properly compare is if you have the original as the control.
  If you compare between 2 different JPEG quality images, the program won't know which parts are the artifacts. You still have to decide yourself...
2. Re:File size by Anonymous Coward · 2009-07-16 10:19 · Score: 3, Insightful
  
  File size doesn't tell you anything. If I take a picture with a bunch of noise (eg. poor lighting) in it then it will not compress as well. If I take the same picture with perfect lighting it might be higher quality but smaller file size.
  Why this is modded up, I don't know. Too many morons out there.
3. Re:File size by Qzukk · 2009-07-16 11:27 · Score: 2, Insightful
  
  actually one of the meta values that is stored is a quality indicator.
  And when you save a max quality copy of a min quality jpeg, the picture still looks like crap.
  
  --
  If I have been able to see further than others, it is because I bought a pair of binoculars.
4. Re:File size by nabsltd · 2009-07-16 12:41 · Score: 3, Insightful
  
  Unfortunately, that's a subjective term based on the 'codec' used to make the jpg. Not everyone's 100 is the same nor is everyone working off the same scale (i.e. 1-10 vs 1-100).
  In addition, I bought a program (Windows only, sorry) that allows the user to pick the areas of the image that need the most bits. Basically, it allows you to pick the quality for any arbitrary region (using standard selection tools like lasso) when saving the JPEG.
  I mostly got it for the batch processing and its excellent image quality when you set it to minimum compression.
5. Re:File size by sbeckstead · 2009-07-16 13:26 · Score: 2, Insightful
  
  But another bit of meta data there is "generation" so at least you could see how far it went from the place it started. The meta data actually has a purpose and people that process images without preserving it should be shot. And if the image hasn't got meta data and you are a professional you won't use it anyway. I hate tools like Paint because they destroy all that beautiful meta data you could have used to make this determination much easier. Assuming of course that image was generated and stored by someone who used the meta data in the first place. Alas you may be hosed here.
  
  --
  Why bother
6. Re:File size by timeOday · 2009-07-16 14:11 · Score: 5, Insightful
  
  This is the kind of problem you can solve in 2 minutes with 95% accuracy (by using file size), or never finish at all by listening to all the pedants on slashdot. When people know a little too much they love to go on about stuff like entropy and information gain, just because they (sort of) can.
  Try file size on the set of images of interest to you and see if it coincides with your intuition. If it does, you're done.
Try compressing both further by Ed+Avis · 2009-07-16 10:09 · Score: 2, Insightful

I suppose you could recompress both images as JPEG with various quality settings, then do a pixel-by-pixel comparison computing a difference measure between each of the two source images and its recompressed version. Presumably, the one with more JPEG artefacts to start with will be more similar to its compressed version, at a certain key level of compression. This relies on your compression program generating the same kind of artefacts as the one used to make the images, but I suppose that cjpeg with the default settings has a good chance of working.
Failing that, just take the larger (in bytes) of the two JPEG files...

--
-- Ed Avis ed@membled.com
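A rough sketch of the recompress-and-compare idea above, using only the Python standard library. It is not a real JPEG codec: it works on a single 8x8 block of grey values and quantizes every DCT coefficient with one uniform step, whereas real JPEG uses a per-frequency table, colour planes and entropy coding. The function names and the step of 40 are assumptions made for illustration. The point it shows is that a block which has already been through coarse quantization barely changes when it is put through the same quantization again, while a clean block changes a lot.

```python
import math

N = 8  # JPEG works on 8x8 blocks


def _scale(k):
    # Normalisation factor that makes the DCT basis orthonormal.
    return math.sqrt(1 / N) if k == 0 else math.sqrt(2 / N)


# BASIS[k][n]: value of the k-th DCT-II basis function at sample n.
BASIS = [[_scale(k) * math.cos((2 * n + 1) * k * math.pi / (2 * N))
          for n in range(N)] for k in range(N)]


def dct2(block):
    """2-D DCT of an 8x8 block given as a list of 8 rows."""
    return [[sum(BASIS[u][x] * BASIS[v][y] * block[x][y]
                 for x in range(N) for y in range(N))
             for v in range(N)] for u in range(N)]


def idct2(coeffs):
    """Inverse of dct2."""
    return [[sum(BASIS[u][x] * BASIS[v][y] * coeffs[u][v]
                 for u in range(N) for v in range(N))
             for y in range(N)] for x in range(N)]


def recompress(block, step):
    """Simulated lossy save: snap each DCT coefficient to a multiple of
    `step`, decode, and round back to integer pixel values in 0..255."""
    quantized = [[round(c / step) * step for c in row] for row in dct2(block)]
    return [[min(255, max(0, round(p))) for p in row]
            for row in idct2(quantized)]


def mean_abs_diff(a, b):
    """Average absolute pixel difference between two 8x8 blocks."""
    return sum(abs(p - q) for ra, rb in zip(a, b)
               for p, q in zip(ra, rb)) / (N * N)


def recompression_change(block, step=40):
    """How much a block changes when recompressed with the given step."""
    return mean_abs_diff(block, recompress(block, step))


# Usage: a made-up textured block, and a copy that was already "saved"
# once at low quality. The degraded copy should report the smaller change.
original = [[128 + (x * 37 + y * 91 + x * y * 13) % 121 - 60
             for y in range(N)] for x in range(N)]
degraded = recompress(original, 40)
print(recompression_change(original), recompression_change(degraded))
```

For real files you would decode both images, run the comparison over every block, and try several steps, since the effect is strongest when the test step is close to the one used when the file was saved.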
Translation: Please help me with my porn... by Chyeld · 2009-07-16 10:13 · Score: 5, Insightful

Dear Slashdot,
Recently I checked my porn drive and realized that I have over 50 gigabytes of jpg quality porn collected. Unfortunately, I've noticed that a good portion of these are all the same picture of Natalie Portman eating hot grits. Could you please point me to a free program that will allow me to find the highest resolution, best quality version of this picture from my collection and delete the rest?
Many Thanks!
use the JPEG underlying details by cellurl · 2009-07-16 10:15 · Score: 2, Insightful

To make a JPEG, you cut it into blocks, run the DCT on each block and mess with the 4:2:2 color formula and pkzip the pieces... That said, I would think measuring the number of blocks would be related to number of artifacts... In my barbaric approach to engineering, (assuming there is no other suggested way on slashdot), I would get the source code to the JPEG encoder/decoder and print out statistics (number of blocks, block size) of each image...
It's easy by Anonymous Coward · 2009-07-16 10:16 · Score: 5, Insightful

Run the DCT and check how much it's been quantized. The higher the greatest common factor, the more it has been compressed.
Alternatively, check the raw data file size.
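A minimal sketch of the greatest-common-factor check, assuming you already have the dequantized integer DCT coefficients for one frequency position, collected across all blocks of an image (getting those out of a file needs a decoder, which is not shown here). The function name is made up for illustration.

```python
from functools import reduce
from math import gcd


def estimate_quant_step(coefficients):
    """Greatest common divisor of the absolute coefficient values.

    Dequantized coefficients are integer multiples of the quantization
    step, so their GCD is the step or a multiple of it. Returns 0 when
    every coefficient is zero, in which case nothing can be inferred.
    """
    return reduce(gcd, (abs(c) for c in coefficients), 0)


# Usage with made-up values for one frequency position:
print(estimate_quant_step([48, -96, 0, 144]))
```

With few non-zero samples the result can overshoot (every value might happen to be an even multiple), so it is safer on positions that have plenty of non-zero coefficients.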
quantization tables by angryargus · 2009-07-16 10:17 · Score: 3, Insightful

Others have mentioned file size, but another good approach is to look at the quantization tables in the image as an overall quality factor. E.g., JPEG over RTP (RFC 2435) uses a quantization factor to represent the actual tables, and the value of 'Q' generally maps to quality of the image. Wikipedia's doc on JPEG has a less technical discussion of the topic, although the Q it uses is probably different from the example RFC.
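To make the Q idea concrete, here is a sketch of the scaling rule that RFC 2435 gives for turning a Q value into a luminance table (the same rule the IJG library is known for), plus a brute-force search that maps a table back to the nearest Q. The function names are made up, and the estimate only means something for encoders that scale the standard Annex K table; programs with custom tables will just get the closest match.

```python
# Standard luminance table (JPEG Annex K) in the zigzag order that
# JPEG files and RFC 2435 use.
BASE_LUMA = [
    16, 11, 12, 14, 12, 10, 16, 14,
    13, 14, 18, 17, 16, 19, 24, 40,
    26, 24, 22, 22, 24, 49, 35, 37,
    29, 40, 58, 51, 61, 60, 57, 51,
    56, 55, 64, 72, 92, 78, 64, 68,
    87, 69, 55, 56, 80, 109, 81, 87,
    95, 98, 103, 104, 103, 62, 77, 113,
    121, 112, 100, 120, 92, 101, 103, 99,
]


def scaled_table(q):
    """Luminance table for quality q (clamped to 1..99)."""
    q = min(99, max(1, q))
    scale = 5000 // q if q < 50 else 200 - 2 * q
    return [min(255, max(1, (v * scale + 50) // 100)) for v in BASE_LUMA]


def estimate_q(table):
    """Q in 1..99 whose scaled table is closest to `table`
    (sum of absolute differences, 64 values in zigzag order)."""
    def distance(q):
        return sum(abs(a - b) for a, b in zip(scaled_table(q), table))
    return min(range(1, 100), key=distance)


# Usage: round-trip a table generated at Q = 75.
print(estimate_q(scaled_table(75)))
```

Of two near-duplicate files, the one with the higher estimated Q was saved with the gentler quantization on its last save.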
Re:AI problem? by nametaken · 2009-07-16 10:46 · Score: 2, Insightful

You're right, it needs to be done by humans to be sure.
Amazon's Mechanical Turk should do the trick.
https://www.mturk.com/mturk/welcome
Re:Measure sharpness? by uhmmmm · 2009-07-16 10:52 · Score: 3, Insightful

Even faster is to look at the DCT coefficients in the file itself. Doesn't even require decoding - JPEG compression works by quantizing the coefficients more heavily for higher compression rates, and particularly for the high frequency coefficients. If more high frequency coefficients are zero, it's been quantized more heavily, and is lower quality.
Now, it's not foolproof. If one copy went through some intermediate processing (color dithering or something) before the final JPEG version was saved, it may have lost quality in places not accounted for by this method. Comparing quality of two differently-sized images is not as straightforward either.
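A small sketch of the zero-counting measure, assuming the quantized coefficients have already been pulled out of the file as one list of 64 integers per block in zigzag order (low frequencies first). The function name and the cutoff of 32 are arbitrary choices for illustration.

```python
def high_freq_zero_fraction(blocks, cutoff=32):
    """Fraction of coefficients at zigzag positions >= cutoff that are zero.

    `blocks` is an iterable of 64-element lists of quantized coefficients.
    A higher fraction suggests heavier quantization.
    """
    zeros = 0
    total = 0
    for block in blocks:
        tail = block[cutoff:]
        zeros += sum(1 for c in tail if c == 0)
        total += len(tail)
    return zeros / total if total else 0.0


# Usage with two made-up blocks:
blocks = [[5] * 32 + [0] * 32, [5] * 48 + [0] * 16]
print(high_freq_zero_fraction(blocks))
```

Compare the two candidate images with the same cutoff, and only when they have the same dimensions.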
Re:DCT by mikenap · 2009-07-16 10:54 · Score: 3, Insightful

This seems to me the best suggestion, and there's a simple visual way to accomplish it! The hardest hit part of the image is going to be the chroma information, which your eye normally has reduced resolution sensitivity for in a normal scene. To overcome this, load your JPEGs into your favorite image editor and crank the saturation to the max (this throws away the luminance data). Now the JPEG artifacts in the chroma information will HIT YOU IN THE FACE, even in images that seemed rather clean before. Pick the least blocky of the two, and there you go!
Re:AI problem? by CajunArson · 2009-07-16 10:56 · Score: 2, Insightful

And to reply to myself.. several other posters have noted that taking the DCT of the compression blocks in the image will give information on how highly compressed the image is... there's one example.

--
AntiFA: An abbreviation for Anti First Amendment.
Re:use a "difference matte" by uhmmmm · 2009-07-16 11:03 · Score: 2, Insightful

So, that will show you which parts differ. How do you tell which is higher quality? Sure, you can probably do it by eye. But it sounds like the poster wants a fully automated method.
Re:Easy by Random+Destruction · 2009-07-16 11:06 · Score: 3, Insightful

Ok, so you know how two images differ. Which one is closer to the original? You don't know, because you don't have the original to compare.

--
:x
Re:DCT by Anonymous Coward · 2009-07-16 11:37 · Score: 1, Insightful

Or just take the 2D FFT of the entire images. Higher JPEG compression should result in fewer high frequency components in an image.
Re:AI problem? by moderatorrater · 2009-07-16 12:06 · Score: 2, Insightful

Even simpler mathematical analysis would include such techniques as seeing which one takes up more disk space. Last I checked, that was very highly correlated with compression level.
Re:AI problem? by Spy+der+Mann · 2009-07-16 12:37 · Score: 4, Insightful

Here's a simple but expensive formula:
1. Get the image
2. Compress it severely.
3. Compare the difference between original and the compressed.
The lower the difference, the lower the image quality.
4. Profit!
Or you could just measure the amount of data in the DCT space. Duh.
Re:DCT by eggnoglatte · 2009-07-16 14:12 · Score: 3, Insightful

That works, but only if you have exact, pixel-to-pixel correspondence between the photos. It won't work if you just grab 2 photos from Flickr that both show the Eiffel tower, and you wonder which one is "better".
Luckily, there is a simple way to do it: use jpegtran to extract the quantization table from each image. Pick the one with the smaller values. This can easily be scripted.
Caveat: this will not work if the images have been decoded and re-coded multiple times.
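As one way to script this without any external tool, here is a sketch that reads the quantization tables straight out of the JPEG header with the Python standard library. It walks the marker segments up to the first scan and collects every DQT (0xFFDB) segment. The function names and file names are made up for illustration, and it assumes an ordinary file where all tables appear before the image data.

```python
import struct


def read_quant_tables(data):
    """Return {table_id: [64 values in zigzag order]} from JPEG bytes."""
    if data[:2] != b"\xff\xd8":
        raise ValueError("not a JPEG stream (missing SOI marker)")
    tables = {}
    pos = 2
    while pos + 4 <= len(data):
        if data[pos] != 0xFF:
            raise ValueError("expected a marker at offset %d" % pos)
        marker = data[pos + 1]
        if marker == 0xFF:           # fill byte before a marker
            pos += 1
            continue
        if marker in (0xD9, 0xDA):   # end of image / start of scan
            break
        (length,) = struct.unpack(">H", data[pos + 2:pos + 4])
        segment = data[pos + 4:pos + 2 + length]
        if marker == 0xDB:           # DQT: one or more tables
            i = 0
            while i < len(segment):
                precision, table_id = segment[i] >> 4, segment[i] & 0x0F
                i += 1
                if precision == 0:   # 8-bit entries
                    values = list(segment[i:i + 64])
                    i += 64
                else:                # 16-bit big-endian entries
                    values = list(struct.unpack(">64H", segment[i:i + 128]))
                    i += 128
                tables[table_id] = values
        pos += 2 + length
    return tables


def mean_quant_value(tables):
    """Average of all table entries; smaller means gentler quantization."""
    values = [v for table in tables.values() for v in table]
    if not values:
        raise ValueError("no quantization tables found")
    return sum(values) / len(values)


# Usage (hypothetical file names):
# with open("a.jpg", "rb") as fa, open("b.jpg", "rb") as fb:
#     a = mean_quant_value(read_quant_tables(fa.read()))
#     b = mean_quant_value(read_quant_tables(fb.read()))
# print("keep a.jpg" if a < b else "keep b.jpg")
```

The caveat above still applies: the tables only describe the most recent save, not anything that happened to the image before it.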
Re:AI problem? by VanessaE · 2009-07-16 14:44 · Score: 2, Insightful

Just checking the size of the file (or, I suspect, just the size of the DCT data) won't always work. Sometimes an image can end up growing in size slightly while losing quality, depending on the nature of the image and the settings of the imaging program.
Things such as thin wires, multi-colored ribbon cable, close-ups of a circuit board, and other images with lots of similar details seem to benefit most from this kind of tweaking, mainly thanks to the placement and qualities of the artifacts, rather than their mere existence or apparent severity.
I've had this happen many times - set an icon for, say, 35% quality and it will probably look kinda grungy, but step it down by just one or two percent and suddenly the artifacts shift around or change their appearance, sometimes in a manner that better suits the image - almost like constructive interference.
Re:AI problem? by scdeimos · 2009-07-16 18:39 · Score: 2, Insightful

That's only a reasonable indicator if the two copies of the same image you are comparing are also the same resolution. It's not hard to have a higher resolution image consume less disk space if the compression level has been bumped up. Also, different programs usually produce different JFIF streams even when set to the same compression level and using the same *uncompressed* source image, making the DCT size approach even less reliable.
Re:AI problem? by SlashWombat · 2009-07-16 21:31 · Score: 2, Insightful

Unfortunately, it's not all that easy to compare. In general, the file with the higher byte count will be the better image, BUT ... the problem is that there are different ways to compress the same picture. There are several "controls", even in baseline JPEG: where the "quantisation" steps occur, and where the high frequency cutoff for each macroblock occurs. Then there are different ways for the JPEG engine to entropy encode the bitstream. (I.e., arithmetic coding is allowed by the JPEG standard; however, due to patent issues, most implementations use Huffman coding, which is slightly less efficient.) It should be remembered that the JPEG standard is just a baseline. Any implementer is free to improve upon the baseline coding, as long as it still decodes correctly. There used to be JPEG viewing software that decompressed and cleaned up images that looked terrible using "standard" JPEG decoding software. (I am not sure, but I suspect the blockiness and quantisation errors were smoothed out, improving the displayed image immensely.)

Of course, what you really need is the NCIS image enhancement package.
Re:AI problem? by nahdude812 · 2009-07-16 23:42 · Score: 2, Insightful

This just about gets to the heart of it. "Better" is a subjective term, so choosing better quality images is not going to be something everyone can agree on. Your example nails it. If you have two copies of the same image, one is higher resolution than the other, but saved with a higher compression rate, which is better? The answer is going to be "it depends on if the noise introduced by the higher compression annoys me more than the reduced information in the lower resolution image."
If the compression on the high resolution image is high enough, you might still have better detail in the lower resolution image. If the higher resolution image isn't actually higher resolution, just higher dimensions (it's the smaller image scaled up), this is automatically a lower quality image (you can always recreate the higher resolution image from the lower resolution image, but not vice versa as rounding errors cause information loss whenever you scale an image).
There may also be subjective differences like brightness/contrast/tone mapping differences.
Given that the question being asked is a subjective one, the correlation of file size to subjective image quality should be so high that you may gain only a few percent better predictability with an extremely complex algorithm.

--
Slay a dragon... over lunch!