Scaling Algorithm Bug In Gimp, Photoshop, Others
Wescotte writes "There is an important error in most photography scaling algorithms. All software tested has the problem: The Gimp, Adobe Photoshop, CinePaint, Nip2, ImageMagick, GQview, Eye of Gnome, Paint, and Krita. The problem exists across three different operating systems: Linux, Mac OS X, and Windows. (These exceptions have subsequently been reported; the following software does not suffer from the problem: the Netpbm toolkit for graphic manipulations, the developing GEGL toolkit, 32-bit encoded images in Photoshop CS3, the latest version of Image Analyzer, the image exporters in Aperture 1.5.6, the latest version of Rendera, Adobe Lightroom 1.4.1, Pixelmator for Mac OS X, Paint Shop Pro X2, and the Preview app in Mac OS X starting from version 10.6.) Photographs scaled with the affected software are degraded, because of incorrect algorithmic accounting for monitor gamma. The degradation is often faint, but probably most pictures contain at least an area where the degradation is clearly visible. I believe this has happened since the first versions of these programs, maybe 20 years ago."
Photographs scaled with the affected software are degraded, because of incorrect algorithmic accounting for monitor gamma.
Seriously!
I have a theory on why this has gone unnoticed for so long, but I'll keep it to myself...
To display the pictures, it makes sense to use the monitor gamma. But to actually modify the data using that information which is probably flawed in 99.9999999% of cases? That's just wrong.
This is only a bug depending on what you are doing with your final images. One of the things that annoys me is that many image manipulation programs do not actually explain the primitives they are using. The result can be a complete mess depending on what you are trying to accomplish. This article is an example of this effect.
If you want photo-realistic results, then you need to take Gamma into account. However, very few file formats specify the Gamma, the grey level, the white level, the black level or the colour space of the original image. The result is that many imaging operations must be wrong, as they can never be carried out the way they were intended. For the most part, no one cares. This person found an application where people care.
Is what this article is about.
This matter has been known for a long time, and there's a reason why so much software ignores it:
it hardly matters. That and it's also way more complicated to do it properly.
Gain / Pain is clearly inferior to 1 there.
Come on, this isn't news...
Helmut Dersch (of Panorama Tools fame) certainly posted about this before;
http://www.all-in-one.ee/~dersch/gamma/gamma.html - Interpolation and Gamma Correction
There's no factual error in the scaling algorithm, as the /. headline would like you to believe - it's a color space (linearity) issue; you have to do your calculations in linear space which means a typical photo off of a camera/scanner gets the inverse of an sRGB curve applied (a gamma of 0.454545 is 'close enough' if you can't do the proper color bits). Then scale. Then re-apply the curve.
And no - for real life imagery, nobody really cares - the JPEGs out of the cameras and subsequent re-compression to JPEG after scaling will have 'destroyed' far more data than the linearity issue.
They're nice example images in the story, but they should be called 'academic'.
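The linearize, scale, re-apply sequence described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names (srgb_to_linear, linear_to_srgb, halve_row) are made up for this post, the constants are the standard sRGB piecewise curve, and the "scaler" is a plain box filter averaging pixel pairs in one grayscale row, not what any of the programs in the summary actually does.

```python
def srgb_to_linear(v8):
    """Decode one 8-bit sRGB value to linear light in [0, 1]."""
    c = v8 / 255.0
    if c <= 0.04045:
        return c / 12.92
    return ((c + 0.055) / 1.055) ** 2.4


def linear_to_srgb(lin):
    """Encode linear light in [0, 1] back to an 8-bit sRGB value."""
    if lin <= 0.0031308:
        c = lin * 12.92
    else:
        c = 1.055 * lin ** (1 / 2.4) - 0.055
    return max(0, min(255, round(c * 255)))


def halve_row(row, gamma_aware=True):
    """Halve a grayscale row by averaging adjacent pixel pairs (box filter)."""
    out = []
    for i in range(0, len(row) - 1, 2):
        a, b = row[i], row[i + 1]
        if gamma_aware:
            # Average in linear light, then re-apply the sRGB curve.
            mean = (srgb_to_linear(a) + srgb_to_linear(b)) / 2
            out.append(linear_to_srgb(mean))
        else:
            # What a gamma-unaware scaler does: average the stored values.
            out.append(round((a + b) / 2))
    return out


# Alternating black/white pixels, the pathological case from the story.
stripes = [0, 255] * 4
naive = halve_row(stripes, gamma_aware=False)
aware = halve_row(stripes, gamma_aware=True)
print("naive:      ", naive)
print("gamma-aware:", aware)
```

On the stripe pattern the naive path lands on 128 while the gamma-aware path lands on 188, which is the kind of brightness shift the test images in the story are built to expose.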
The basic issue here has to do with gamma curves and the way they're being handled (they're not).
Most image files on your computer (BMP, JPG, PNG, etc.) are stored in the sRGB color space. sRGB defines the use of a gamma curve, which is a nonlinear transformation applied to each of the components (R, G, and B). The issue here is that most scalers make the assumption that the components are linear, rather than try to process the gamma curve. While this does save processing time (undoing the gamma curve then redoing it), it does add some error, especially when the values being scaled are not near each other.
So does this matter? Well, in some pathological cases where there are repeated sharp boundaries (such as alternating black-white lines or fine checkerboard patterns), this would make a difference. This is because the linear average of the pixels (what most image scalers use) yields a different result than if the gamma value was taken into account. For most images (both photographic and computer generated), this shouldn't be a big problem. Most samples are close in value to other nearby samples, so the error resulting from the gamma curve is very small. Sparse light-dark transitions also wouldn't be noticeable as there would only be an error right on the boundary. Only when you exercise this case over a large area does it become obvious.
One final point: this gamma scaling effect would occur regardless of the actual scaling algorithm. Bilinear, bicubic, and sinc would all have the same issue. Nearest neighbor interpolation would be unaffected, but in these cases, the output would look far worse.
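To put rough numbers on "the error is small when neighbouring samples are close", here is a small Python sketch comparing the two ways of averaging a pair of pixels. It assumes a plain power-law gamma of 2.2 as a stand-in for the real sRGB curve, and the helper names are hypothetical.

```python
GAMMA = 2.2  # assumption: simple power-law approximation of the sRGB curve


def naive_average(a, b):
    """Average two stored 8-bit values as if they were linear."""
    return (a + b) / 2


def linear_light_average(a, b):
    """Decode to linear light, average, and re-encode to the 0-255 scale."""
    lin = ((a / 255) ** GAMMA + (b / 255) ** GAMMA) / 2
    return 255 * lin ** (1 / GAMMA)


def averaging_error(a, b):
    """How much brighter the linear-light result is than the naive one."""
    return linear_light_average(a, b) - naive_average(a, b)


# Pairs ordered from nearly equal neighbours to a full black/white edge.
pairs = [(120, 130), (100, 150), (50, 200), (0, 255)]
for a, b in pairs:
    print(f"{a:3d} and {b:3d}: naive {naive_average(a, b):6.1f}, "
          f"linear-light {linear_light_average(a, b):6.1f}, "
          f"difference {averaging_error(a, b):5.1f}")
```

The difference stays well under one level for the nearly equal pair and grows to dozens of levels for the black/white pair, which fits the point that only large areas of fine high-contrast detail make the problem obvious.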
That might be true, but it's no reason to turn this into an undocumented and unavoidable feature.
One filter where this issue really bites is "unsharp mask". Non-linear gamma results in very noticeable artifacts around high contrast edges. With linear gamma, you can crank the filter strength much higher without producing these artifacts, resulting in very crisp pictures that do not look obviously sharpened.
It is not a scaling algorithm problem. It's a problem that affects all filters which combine pixels. The only reasonable way to avoid the problem is to hold the image data in memory in a linear representation (as converting it back and forth on the fly every time a filter is applied would quickly accumulate rounding errors). Unsurprisingly every image editing software worth its money already offers that conversion, and doesn't force it on the user because it's not without downsides: The conversion isn't lossless and distributes the tone curve unfavorably in the allocated bits, resulting in either loss of precision at the same bit depth or bigger memory allocations for minimal loss of precision.
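The precision trade-off mentioned above can be checked with a quick experiment: push every 8-bit sRGB code through an integer linear buffer of a given bit depth and count how many distinct codes come back. This is a rough sketch, not a description of how any particular editor stores its data; surviving_codes is a hypothetical helper and the curve constants are the standard sRGB ones.

```python
def srgb_to_linear(c):
    """Decode an sRGB-encoded value in [0, 1] to linear light."""
    if c <= 0.04045:
        return c / 12.92
    return ((c + 0.055) / 1.055) ** 2.4


def linear_to_srgb(lin):
    """Encode linear light in [0, 1] with the sRGB curve."""
    if lin <= 0.0031308:
        return lin * 12.92
    return 1.055 * lin ** (1 / 2.4) - 0.055


def surviving_codes(linear_bits):
    """Count distinct 8-bit sRGB codes left after a round trip through an
    integer linear-light buffer with the given number of bits."""
    top = (1 << linear_bits) - 1
    survivors = set()
    for v in range(256):
        stored = round(srgb_to_linear(v / 255) * top)  # quantise in linear space
        back = round(linear_to_srgb(stored / top) * 255)  # re-encode to 8 bits
        survivors.add(back)
    return len(survivors)


for bits in (8, 12, 16):
    print(f"{bits}-bit linear buffer keeps {surviving_codes(bits)} of 256 codes")
```

An 8-bit linear buffer merges many of the dark codes together, while a 16-bit one keeps all 256 apart at twice the memory, which is the choice between lost precision and bigger allocations described above.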
Well, as far as what's desired goes, this also affects how a lot of filters and effects work. For example, it causes most Gaussian blur implementations to 'flare' brights into darks more than they should. And that's been happening for so long that it's now the expected/wanted behavior out of 'Gaussian Blurs'. If you changed that, you would have some confused/annoyed users.
"There is an important error in most photography scaling algorithms."
No, there isn't. If millions of professional users haven't been bothered by it over the course of two decades, it is CLEARLY not important.
Dear Slashdot: next time you want to mess with the site, add a rich-text editor for comments.
Note that I'm not excusing the software programs from handling this better - certainly not Photoshop - but it's 1. not a new revelation and 2. certainly not a "scaling algorithm bug".
In what sense is it not a scaling algorithm bug? The images look different after scaling than before, when interpreted in accordance with the appropriate specs. It seems to me that the specification for the scale function is something like "returns an image that is as visually similar as possible to the original, but reduced in size by the specified amount." It might be known, and it might be better described as using the wrong algorithm than an algorithm bug, but it's definitely a bug in the program.
The rest of the post I basically agree with: the differences are minor except in weird test images. However, if I want to adjust the brightness, I'll do that. If I edit the photo at full res and then save at lower res for use on the web, I don't want the result to look different. I might not be able to tell the difference without flipping between the two versions, but one might still look better.
Same with Windows font rendering. It is plainly inferior in all objective measures of typeface fidelity.
Like I said, that's because fidelity isn't the primary goal. Windows goes out of its way to distort rendered text in order to align the character outlines with pixel boundaries.
You might as well claim that donuts are "plainly inferior in all objective measures of bagel quality": they crumble apart when you try to spread anything on them, you can't buy them pre-sliced, none of them come with raisins inside... and that's all fine, because donuts aren't trying to be bagels.
Subjective taste is subjective. Measurable fidelity is not, and handwaving about one being suited more to a given task than another doesn't make it so.
Once again, you're assuming fidelity is the only goal of a font renderer, but that's not the case.
When you take a font designed to be printed at 600 DPI and render it on a 96 DPI screen where each character is only a few pixels high, you face a tradeoff between fidelity and legibility. Apple decided to prioritize fidelity; Microsoft decided to prioritize legibility.
The link you provided does not support that argument. Legibility is not a significant issue.
Perhaps you missed this part: "Microsoft generally believes that the shape of each letter should be hammered into pixel boundaries to prevent blur and improve readability, even at the cost of not being true to the typeface."
I personally find that working in Windows-based terminals on LCD monitors is far more straining than Mac-based ones or CRT monitors--the text is too sharp and loses distinctiveness.
Fair enough. That's a matter of subjective taste.
Others may be accustomed to something else, and that's fine, but it's flat-out falsehood to claim that grid priority makes for better onscreen legibility.
At the point size in the article's example? Yes, you're right, both lines are legible.
But at smaller sizes, it's a plain statement of fact: distorting characters to keep the lines distinct does indeed make for more legible text than accurately rendering them into amorphous blobs.
Like I said, that's because fidelity isn't the primary goal.
Talk about changing the goalposts! This whole Slashdot story is about fidelity.
You might as well claim that donuts are "plainly inferior in all objective measures of bagel quality"
No, I'm pretty sure that font display is measured by fidelity to the creator's intention and design, just as photographic display is being measured by fidelity to accurate gamma values in TFA.
Unless you're saying a typeface stops being a typeface when it shows up on a screen and becomes a...bagel, I think you're being evasive.
The technical ins and outs of photo editing and display are all about fidelity; why would that not be the case with typeface rendering and display?
Once again, you're assuming fidelity is the only goal of a font renderer, but that's not the case.
And again, this is the most bizarre argument I've ever heard.
That's like saying rendering the proper color gamut isn't the purpose of a monitor. Or that fidelity to the original source material isn't the purpose of speakers.
Perhaps you missed this part: "Microsoft generally believes that the shape of each letter should be hammered into pixel boundaries to prevent blur and improve readability..."
No, but there's nothing to back up that statement of belief with reality, and in fact the results clearly show it not to be an issue. So the question is, if Microsoft had bothered to put some effort into proper rendering, would there be any meaningful loss of legibility?
Empirically, the answer is obviously "no". As even that article points out, the determining factor is familiarity.
Moreover, "blur" is not a bad thing. Smoothness of form is a critical element of successful typeface design. The ungainly reproduction of Microsoft's snap-to-grid shortcut ruins the flow of the best and most famous typefaces, hindering legibility.
Why the article pretends that Microsoft's decision was anything other than lack of interest in fine-tuning is rather curious. It wasn't a conscious decision about legibility that led them here--it was instead a desire not to alienate what had become familiar by 1998 when ClearType started to take shape. Spolsky is neither a typographer nor a neutral commentator--he has an open preference for ClearType and no formal training in typeface design or in perceptual optics.
When you take a font designed to be printed at 600 DPI and render it on a 96 DPI screen where each character is only a few pixels high, you face a tradeoff between fidelity and legibility.
A false dichotomy.
Pixel grid rendering is simply easier to implement; it is not quantifiably better in any way. Its distortion of precisely designed typeface lines interferes with the work of the experts, the typeface designers and font engineers who carefully construct not only the aesthetics but also the science.
Microsoft commissioned fonts to work with their rendering technology to improve readability--had their system provided an overall advantage in actual legibility, such an effort would have been redundant. In fact, Verdana (and related fonts) exists to address the typographical shortcomings of ClearType.
The fact of the matter is that the article provides nothing to suggest that legibility is improved with the Microsoft method, and provides both the empirical counterexample and the "Verdana paradox".
But at smaller sizes, it's a plain statement of fact: distorting characters to keep the lines distinct does indeed make for more legible text than accurately rendering them into amorphous blobs.
Setting aside for the moment the serious legibility issues of using small font sizes period for extensive work, that's the reason why subpixel rendering is disabled at a certain font size (default on OS X is 8pt;
The summary already names a fix for Gimp (GEGL), but the posters only seem interested in whining instead of RTFS. Sigh.
I was talking about scientific and engineering uses, which often depend on the gamma curve even if most authors ignore it.
Most photographic software is oriented towards deliberately messing with the gamma curves arbitrarily to achieve aesthetic goals. Consumer cameras are even starting to do this onboard the camera, in, as far as I know, completely undocumented ways (indeed, they probably consider them trade secrets). See features like iContrast in Canon cameras.
"the assumption one makes is that these integer values are not photons ^( 1/gamma) but simple photon counts (scaled to the 0-255 range)."
That is a very reasonable assumption to make, and one that most people who don't know anything about gamma make. Unfortunately, it's flat out false. Both MATLAB and PIL return the gamma-compressed data, unless you used a linear machine vision camera (and you should if you're serious about this stuff), which of course had no gamma compression to begin with. If you need proof, load and save an image. The image data will be bit for bit identical to the original, indicating that no conversions were performed. (Note that the header might have slightly different metadata, and JPEG re-compression is almost always lossy.)
Gamma is so rarely handled properly, even by scientists and engineers, that OpenCV (the most popular library for computer vision) does not even contain a function for doing gamma (de)compression.