Why EWA/circular scaling operators?

hjulenissen · Post by **hjulenissen** » 2012-10-14T11:42:59-07:00

fmw42 wrote:Anti-aliasing quality can be better with EWA (elliptical/circular) filters, especially if not a simple constant scaling. See perspective transformations at http://www.imagemagick.org/Usage/distorts/#horizon. The elliptical filters can do a better job of "averaging" pixels over a spread out region. The tensor filters are limited to small rectangular regions about a given pixel.

I assume that for such (extreme) distortions, you really want the filter "cutoff frequency" to be a local function of the input/output density. That may mean oddly shaped kernels that may be impossible to obtain using "tensor" methods?

-h

Post by **anthony** » 2012-10-15T21:43:40-07:00

hjulenissen wrote:That may mean oddly shaped kernels that may be impossible to obtain using "tensor" methods?
-h

This was the original reason for using EWA for distortions. It is also why EWA is not used for one of the polar distortions (the de-polar distort method): there, the source 'area' that a pixel covers is actually a circular arc and not an ellipse.
http://www.imagemagick.org/Usage/distorts/#polar_tricks
That does not mean an elliptical area would be a bad compromise; I just felt it was better not to bother using EWA in that situation, as the blurring near the center of the resulting image would be too severe.

However, when resizing or scaling images, the sampling areas are either rectangles or circles. You then have two possibilities: tensor (2-pass orthogonal) weighted filters, and cylindrical (radially weighted) filters. Resize does the first, distort does the second.
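
To make the tensor versus cylindrical distinction concrete, here is a minimal Python sketch. The function names are hypothetical and the kernels are picked for illustration only; they are not ImageMagick's defaults. A tensor filter weights a source pixel by the product of two 1-D kernel values, while a cylindrical filter uses a single kernel value evaluated at the radial distance.

```python
import math

def triangle(x):
    """Triangle (tent) kernel with support [-1, 1]."""
    return max(0.0, 1.0 - abs(x))

def gaussian(x, sigma=0.5):
    """Gaussian kernel; sigma is an arbitrary illustrative value."""
    return math.exp(-(x * x) / (2.0 * sigma * sigma))

def tensor_weight(kernel, dx, dy):
    """Separable weight: one 1-D kernel per axis, multiplied together."""
    return kernel(dx) * kernel(dy)

def cylindrical_weight(kernel, dx, dy):
    """Radial weight: one kernel evaluated at the Euclidean distance."""
    return kernel(math.hypot(dx, dy))

# The two weights agree along the axes but differ on the diagonal.
print(tensor_weight(triangle, 0.5, 0.0), cylindrical_weight(triangle, 0.5, 0.0))
print(tensor_weight(triangle, 0.5, 0.5), cylindrical_weight(triangle, 0.5, 0.5))
```

With the triangle kernel the two schemes give the same weight on the axes and different weights off them. The Gaussian is the special case where both coincide, since a product of two 1-D Gaussians is itself a radial Gaussian.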

Nicolas and I are also considering another Area Weighting sampling method that is really more like a distorted 2-pass area resampling for distort. In a scaling distortion (resize type) situation this would directly devolve to the same result as a tensor 'resize', but in a single (less optimized) pass. In heavy distortions it would have a rhombus-like sampling area.

For image scaling there is another group of possibilities. Single 'point' interpolation (not an interpolative filter), which produces heavy aliasing when reducing image size, while interpolative filters default to this for enlargements. Or a super-sampled variation (take many points, using a grid, disk, or some other sampling of the source area that contributes to the destination pixel).

And there is also a third possibility... forward mapped 'splatting', though that is used more for the display of 3D objects that have a sparse surface sampling than for 2D image distortions.

Just remember a resize is just a special restricted case of a normal image distortion, and as such all the introductory explanations given in Image Distortions actually apply, even before you get into 'areas', 'super-sampling', or 'tensor filters'.

See General Distortion Techniques
http://www.imagemagick.org/Usage/distort/#summary

And in final summary (the second last line in the above section)...

Remember however all resampling techniques are just methods for determining the color of each individual pixel.

hjulenissen · Post by **hjulenissen** » 2012-10-16T02:44:18-07:00

As the documentation explains, any "linear" resampling is a subset of out(i,j) = sum(in * Mij), where Mij is an array the size of the input image that changes values for every output pixel (might be seen as 4-d). Finding the visually most appealing and/or most accurate linear scaling should reduce to finding the desired Mij matrix. Finding a real-world "usable" linear scaler adds the additional burden of approximating that Mij by a very sparse version (mostly zeros), some special symmetry properties (so as to reduce cost through separable dimensions), and/or quantizing the i,j indexes so that the same M matrix can be reused for several similar filter "phases". Since image boundary artifacts are often dealt with using linear functions of the image, I am betting that the M matrix can take account of those as well (this might make the Mij matrix a lot less "smooth" close to edges, though).
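
As a rough illustration of the out = sum(in * M) view, here is a minimal 1-D Python sketch. The function names, the pixel-centre coordinate convention and the choice of a triangle kernel are all assumptions made for the example, not a description of any particular implementation. Each row of the matrix holds the normalized weights for one output sample, and most entries are zero.

```python
def triangle(x):
    """Triangle (tent) kernel with support [-1, 1]."""
    return max(0.0, 1.0 - abs(x))

def resize_matrix_1d(n_in, n_out, kernel=triangle):
    """Build the n_out x n_in matrix M such that out = M * in (1-D case).

    Assumed convention: output sample j sits at input coordinate
    (j + 0.5) * n_in / n_out - 0.5, and the kernel is widened by the
    scaling ratio only when downsampling.
    """
    ratio = n_in / n_out
    scale = max(1.0, ratio)
    rows = []
    for j in range(n_out):
        center = (j + 0.5) * ratio - 0.5
        weights = [kernel((i - center) / scale) for i in range(n_in)]
        total = sum(weights)
        rows.append([w / total for w in weights])  # each row sums to 1
    return rows

def apply_matrix(m, signal):
    """Plain matrix-vector product."""
    return [sum(w * s for w, s in zip(row, signal)) for row in m]

m = resize_matrix_1d(8, 4)
print(apply_matrix(m, [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0]))
```

A separable (tensor) 2-D scaler corresponds to applying one such matrix along rows and another along columns. Changing either the input or the output size changes every row of the matrix.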

My point being that if we _only_ want to figure out "what is the best possible linear image scaler" (for a given input/output grid, given source image, given display characteristic etc etc), then it is sufficient to figure out what the best possible Mij matrices are for that scenario. Once we have found this highly expensive, inefficient ideal, we may approximate it using whatever methods map well to our hardware/implementation language limitations, and depending on how long the user is prepared to wait.

I have experimented with DCT-based scaling as suggested in the Gustafsson thesis, and as expected the results were not as visually pleasing as e.g. lanczos3. I expect that there is some "optimal" filter design criterion (at least for the simple scaling operation) that (given high-quality, properly prefiltered input images) tends to give the most visually pleasing output. The lanczos family are, after all, just windowed sinc functions that trade pass-band vs stop-band vs spatial extent. Windowed sincs tend to be the easiest filters to synthesise and understand, but the search-space for other filters (that minimize some unknown cost function) is large, even for neighborhoods of 4x4=16 or 9x9=81 pixels.
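
For reference, the 1-D Lanczos kernel mentioned above is a sinc windowed by a wider sinc. A minimal Python sketch follows (the function names are mine; `a=3` gives lanczos3):

```python
import math

def sinc(x):
    """Normalized sinc: sin(pi x) / (pi x), with sinc(0) = 1."""
    if x == 0.0:
        return 1.0
    px = math.pi * x
    return math.sin(px) / px

def lanczos(x, a=3):
    """Lanczos kernel: sinc(x) windowed by sinc(x / a), zero outside |x| < a."""
    if abs(x) >= a:
        return 0.0
    return sinc(x) * sinc(x / a)

# Sample the kernel at half-integer steps to see the lobes.
for k in range(0, 7):
    x = k / 2.0
    print(x, round(lanczos(x), 4))
```

The kernel is 1 at the origin, essentially zero at the other integers, and has negative lobes in between, which is where both the sharpness and the ringing come from.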

There is a BBC whitepaper that suggests that image resolution can be somewhat reduced if the reconstruction filter is "sharp". I.e., by applying sharpening after upsampling an image, you can perceptually achieve the same thing as actually having somewhat higher resolution. This may not surprise you guys, but it is nice to have a properly conducted test-panel conclude this in a "blind" test.

-h

NicolasRobidoux · Post by **NicolasRobidoux** » 2012-10-16T04:08:20-07:00

If we had an accurate model of the HVS that could be turned into a "score", we could, at least in principle, select a large class of matrices (and a class of color space transformations, because, really, with sRGB, for example, there already is nonlinearity introduced into the system) that would, for a single image or family of images, for a single or family of viewing conditions, and for a single or a family of geometrical transformations, optimize this score.
Unfortunately, quality metrics, generally, are not very good. Starting with good ol' RMSE or PSNR.
There is correlation with perceived quality, yes. Correlation is not quite enough when comparing decent schemes.
Linear light RMSE, unfortunately, is not very well correlated with human perception of "defects" or "faithfulness".
And this does not account for taste.
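
For readers unfamiliar with the metrics being criticized, here is a minimal Python sketch of RMSE, PSNR and a linear-light RMSE. It assumes pixel values are floats in [0, 1] stored with the standard sRGB transfer curve; the function names are mine.

```python
import math

def srgb_to_linear(c):
    """Decode one sRGB-encoded value in [0, 1] to linear light."""
    if c <= 0.04045:
        return c / 12.92
    return ((c + 0.055) / 1.055) ** 2.4

def rmse(a, b):
    """Root mean square error between two equal-length sequences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)) / len(a))

def psnr(a, b, peak=1.0):
    """Peak signal-to-noise ratio in dB; infinite for identical inputs."""
    err = rmse(a, b)
    return math.inf if err == 0.0 else 20.0 * math.log10(peak / err)

def linear_light_rmse(a_srgb, b_srgb):
    """RMSE computed after decoding both inputs to linear light."""
    return rmse([srgb_to_linear(v) for v in a_srgb],
                [srgb_to_linear(v) for v in b_srgb])

# The same encoded error of 0.1 counts very differently in linear light
# depending on whether it occurs in the shadows or in the highlights.
print(linear_light_rmse([0.1], [0.2]), linear_light_rmse([0.8], [0.9]))
```

The example shows one way such a score can diverge from perception: an identical encoded difference is weighted far less in the shadows than in the highlights once the values are decoded.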

NicolasRobidoux · Post by **NicolasRobidoux** » 2012-10-16T04:35:38-07:00

What complicates things is that if you change the geometrical operation, you change the matrix completely, and then, in a sense, you must extract a recipe or pattern from a collection of (parameterized?) matrices based on a score that does not match what humans want very well.

NicolasRobidoux · Post by **NicolasRobidoux** » 2012-10-16T04:51:18-07:00

I believe in post-sharpening (so it's interesting to hear about a well-conducted experiment), despite what's in "The Recommendations".
I've not quite figured out how to articulate this into a useable recommendation, however.

NicolasRobidoux · Post by **NicolasRobidoux** » 2012-10-16T04:52:37-07:00

Here is an example of complications: bilinear (Triangle, in IM) is actually a decent scheme when you enlarge "a little", or when you downsample.
It's horrid if you enlarge a lot.

hjulenissen · Post by **hjulenissen** » 2012-10-16T10:32:01-07:00

NicolasRobidoux wrote:Here is an example of complications: bilinear (Triangle, in IM) is actually a decent scheme when you enlarge "a little", or when you downsample.
It's horrid if you enlarge a lot.

I thought that "bilinear" as a term was used for 2-d triangular filtering where the triangle is scaled so that a maximum of 4 input pixels contributed to one output pixel. If that is your definition as well, then I would be very sceptical about bilinear filtering for downscaling by large factors: most input pixels do not affect the output.

-h

hjulenissen · Post by **hjulenissen** » 2012-10-16T10:35:26-07:00

NicolasRobidoux wrote:What complicates things is that if you change the geometrical operation, you change the matrix completely, and then, in a sense, you must extract a recipe or pattern from a collection of (parameterized?) matrices based on a score that does not match what humans want very well.

I have not pondered much about the general, complex distorted case. Would it be possible to assume "local uniformity" over a suitable neighbourhood of pixels, calculate a local spatial cutoff frequency (either rotationally symmetric or tensor), and design a local kernel tailored for the local conditions?
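
One way to picture the "local uniformity" idea is to linearize the destination-to-source mapping around each output pixel and read the local stretch off its Jacobian. The Python sketch below is only an illustration under that assumption: `mapping` is a hypothetical function taking destination coordinates to source coordinates, and the helper names are mine, not ImageMagick's.

```python
import math

def jacobian(mapping, x, y, h=1e-3):
    """Central-difference Jacobian (du/dx, du/dy, dv/dx, dv/dy) of mapping."""
    u1, v1 = mapping(x + h, y)
    u0, v0 = mapping(x - h, y)
    u3, v3 = mapping(x, y + h)
    u2, v2 = mapping(x, y - h)
    return ((u1 - u0) / (2 * h), (u3 - u2) / (2 * h),
            (v1 - v0) / (2 * h), (v3 - v2) / (2 * h))

def singular_values(a, b, c, d):
    """Singular values (major, minor) of the 2x2 matrix [[a, b], [c, d]]."""
    s1 = a * a + b * b + c * c + d * d
    s2 = math.sqrt((a * a + b * b - c * c - d * d) ** 2
                   + 4.0 * (a * c + b * d) ** 2)
    return math.sqrt((s1 + s2) / 2.0), math.sqrt(max(0.0, (s1 - s2) / 2.0))

def local_support_scales(mapping, x, y):
    """Kernel widening factors along the major and minor local axes.

    A singular value above 1 means one output pixel spans more than one
    source pixel in that direction, so the kernel is widened accordingly.
    """
    major, minor = singular_values(*jacobian(mapping, x, y))
    return max(1.0, major), max(1.0, minor)

# Hypothetical mapping: shrink 2x horizontally, enlarge 2x vertically.
print(local_support_scales(lambda x, y: (2.0 * x, 0.5 * y), 10.0, 10.0))
```

The two singular values describe an ellipse in the source image, which is the kind of footprint an elliptical filter is built around; a tensor variant would instead use per-axis scales.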

-h

hjulenissen · Post by **hjulenissen** » 2012-10-16T10:44:35-07:00

NicolasRobidoux wrote:If we had an accurate model of the HVS that could be turned into a "score", we could, at least in principle, select a large class of matrices (and a class of color space transformations, because, really, with sRGB, for example, there already is nonlinearity introduced into the system) that would, for a single image or family of images, for a single or family of viewing conditions, and for a single or a family of geometrical transformations, optimize this score.
Unfortunately, quality metrics, generally, are not very good. Starting with good ol' RMSE or PSNR.
There is correlation with perceived quality, yes. Correlation is not quite enough when comparing decent schemes.
Linear light RMSE, unfortunately, is not very well correlated with human perception of "defects" or "faithfulness".
And this does not account for taste.

Ideas:
1. Include a model of the image generation and display mechanism (both tend to be highly "nonideal"). Minimize some simple numeric error within a limited spatial window (decreasing complexity and implicitly modelling the "spatial extent more important than spatial frequency" property of the HVS). What happens?
2. Color spaces may be related to the DSP concept of "homomorphic processing": i.e. expressing a nonlinear problem as a linear problem pre/post-processed by an invertible nonlinear transform e.g. gamma. Perhaps that literature contains clues? I believe that the reason that we have gamma today (besides legacy, history etc) is that the inevitable signal quantization is "perceptually linearized", i.e. sRGB using its gamma can be quantized at 8 bits while being perceptually similar to linear sRGB quantized to 12-13 bits (see Poynton).
3. If the system/solution cannot be modelled sufficiently well as a linear system, or a daisy-chaining of linear systems with memory-less non-linear systems, you are off into general nonlinear memory systems. You might be able to do something cool with relatively general non-linear systems where the weights are linear (e.g. Volterra filters) but that is beyond my mathematical skills.
4. It is not necessary to find _the_ best scaler. If you are able to calculate a list of 10 or 100 candidates that are very likely to include _the_ best scaler, then it is practically possible to do the final sorting using perceptual experiments involving real humans.
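
As a rough sanity check on the bit-depth remark in idea 2, the Python sketch below asks how many bits a linear encoding would need for its step size to be as fine as the finest step between adjacent 8-bit sRGB codes. This is a crude step-size argument under the standard sRGB transfer curve, not a perceptual measurement, and the function names are mine.

```python
import math

def srgb_to_linear(c):
    """Decode one sRGB-encoded value in [0, 1] to linear light."""
    if c <= 0.04045:
        return c / 12.92
    return ((c + 0.055) / 1.055) ** 2.4

def smallest_linear_step(bits=8):
    """Smallest linear-light gap between adjacent sRGB codes at this depth."""
    levels = (1 << bits) - 1
    values = [srgb_to_linear(k / levels) for k in range(levels + 1)]
    return min(b - a for a, b in zip(values, values[1:]))

def linear_bits_needed(bits=8):
    """Bits a uniform linear encoding needs to match that smallest gap."""
    return math.ceil(math.log2(1.0 / smallest_linear_step(bits)))

print(linear_bits_needed(8))
```

For 8-bit sRGB this comes out at 12 bits, which is in the same neighbourhood as the 12-13 bits quoted above.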

-h

NicolasRobidoux · Post by **NicolasRobidoux** » 2012-10-16T10:50:37-07:00

hjulenissen wrote: I thought that "bilinear" as a term was used for 2-d triangular filtering where the triangle is scaled so that a maximum of 4 input pixels contributed to one output pixel. If that is your definition as well, then I would be very sceptical about bilinear filtering for downscaling by large factors: most input pixels do not affect the output.

Nobody sane downsamples like this unless they have very very special reasons.
When downsampling, you scale the filter so that the triangle has width 2r, where r is the ratio between the original width and the final width (when using the "align corners" image geometry convention, as is done in ImageMagick).
For example, when you reduce an image from, say 512x2048 to 256x1024, tensor Triangle produces each output pixel using exactly nine input pixels (you'd think you need 16, but the ones at the edge of the square have weights equal to 0).

NicolasRobidoux · Post by **NicolasRobidoux** » 2012-10-16T10:52:58-07:00

hjulenissen wrote:...
I have not pondered much about the general, complex distorted case. Would it be possible to assume "local uniformity" over a suitable neighbourhood of pixels, calculate a local spatial cutoff frequency (either rotationally symmetric or tensor), and design a local kernel tailored for the local conditions?

I was not talking about warping. The matrix involved in, say, enlarging an image from 10x10 to 11x11 is not the same as the matrix involved in enlarging the same image to 12x12, or enlarging a 9x9 to 11x11.

NicolasRobidoux · Post by **NicolasRobidoux** » 2012-10-16T10:56:09-07:00

hjulenissen wrote: Ideas:
1. Include a model of the image generation and display mechanism (both tend to be highly "nonideal"). Minimize some simple numeric error within a limited spatial window (decreasing complexity and implicitly modelling the "spatial extent more important than spatial frequency" property of the HVS). What happens?
2. Color spaces may be related to the DSP concept of "homomorphic processing": i.e. expressing a nonlinear problem as a linear problem pre/post-processed by an invertible nonlinear transform e.g. gamma. Perhaps that literature contains clues? I believe that the reason that we have gamma today (besides legacy, history etc) is that the inevitable signal quantization is "perceptually linearized", i.e. sRGB using its gamma can be quantized at 8 bits while being perceptually similar to linear sRGB quantized to 12-13 bits (see Poynton).
3. If the system/solution cannot be modelled sufficiently well as a linear system, or a daisy-chaining of linear systems with memory-less non-linear systems, you are off into general nonlinear memory systems. You might be able to do something cool with relatively general non-linear systems where the weights are linear (e.g. Volterra filters) but that is beyond my mathematical skills.
4. It is not necessary to find _the_ best scaler. If you are able to calculate a list of 10 or 100 candidates that are very likely to include _the_ best scaler, then it is practically possible to do the final sorting using perceptual experiments involving real humans.

Sounds to me like a research program.
Unfortunately, not mine.
I don't want to wait 5 years for results that are not necessarily better.
I want better now.

hjulenissen · Post by **hjulenissen** » 2012-10-16T10:58:39-07:00

NicolasRobidoux wrote:
hjulenissen wrote: I thought that "bilinear" as a term was used for 2-d triangular filtering where the triangle is scaled so that a maximum of 4 input pixels contributed to one output pixel. If that is your definition as well, then I would be very sceptical about bilinear filtering for downscaling by large factors: most input pixels do not affect the output.
Nobody sane downsamples like this unless they have very very special reasons.

Agreed. Nevertheless, I think that is the agreed-upon definition for bilinear resampling. It is only a matter of words; technically I totally agree with you. If ImageMagick chose to use the (technically much saner) scaling of the triangular prototype function, then I think that some users might avoid the function needlessly:
http://en.wikipedia.org/wiki/Bilinear_sampling

"Bilinear filtering uses these points to perform bilinear interpolation between the four texels nearest to the point that the pixel represents (in the middle or upper left of the pixel, usually).
...
Bilinear filtering is rather accurate until the scaling of the texture gets below half or above double the original size of the texture - that is, if the texture was 256 pixels in each direction, scaling it to below 128 or above 512 pixels can make the texture look bad, because of missing pixels or too much smoothness."

NicolasRobidoux · Post by **NicolasRobidoux** » 2012-10-16T10:59:49-07:00

Note: When upsampling, filtering with the Triangle filter ("hat function") gives the exact same results as doing the usual bilinear interpolation because the Triangle filter defines a partition of unity.
Hence the abuse of language.
Given that literal bilinear interpolation does not make sense, from a practical viewpoint, when downsampling, the former extension takes over the latter in this context.
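
The equivalence is easy to check numerically in 1-D. The Python sketch below (function names are mine, and the sample values are arbitrary) compares ordinary linear interpolation with an unscaled Triangle-filter weighted sum, and also checks that integer-shifted triangles sum to 1 everywhere.

```python
import math

def triangle(x):
    """Triangle (tent) kernel with support [-1, 1]."""
    return max(0.0, 1.0 - abs(x))

def linear_interp(samples, x):
    """Ordinary linear interpolation between the two nearest samples."""
    i = math.floor(x)
    t = x - i
    return (1.0 - t) * samples[i] + t * samples[i + 1]

def triangle_filter(samples, x):
    """Normalized weighted sum using the unscaled Triangle kernel."""
    num = 0.0
    den = 0.0
    for i, s in enumerate(samples):
        w = triangle(i - x)
        num += w * s
        den += w
    return num / den

def partition_sum(x, radius=3):
    """Sum of integer-shifted triangles at x; a partition of unity gives 1."""
    base = math.floor(x)
    return sum(triangle(x - k) for k in range(base - radius, base + radius + 1))

samples = [1.0, 4.0, 2.0, 8.0]
for x in (0.25, 1.5, 2.9):
    print(x, linear_interp(samples, x), triangle_filter(samples, x))
```

Because the shifted triangles already sum to 1, the normalization in `triangle_filter` does nothing in the interior, and the weighted sum reduces to the familiar two-point blend. In 2-D the tensor product of two such filters gives bilinear interpolation in the same way.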
