3/31/2011

03-31-11 - Some image filter notes

Say you have a filter like F = [1,2,3,2,1] . The normal thing to do is compute the sum and divide so you have pre-normalized values and you just do a bunch of madd's. eg. you make N = [1/9,2/9,3/9,2/9,1/9].

Now there's the question of how you handle the boundaries of the image. The normal thing to do is to take the pre-normalized filter N and apply all over, and when one of the taps samples off edge, you have to give it something to sample. You can use various edge modes, such as :


SampleBounded(int i, int w) :

clamp :
  return Sample( Clamp(i,0,w-1) );

wrap :
  return Sample( (i+256*w)%w );

mirror (no duplicated edge pixel) :
  if ( i < 0  ) return SampleBounded( -i, w );
  if ( i >= w ) return SampleBounded( -i + 2*w - 2 , w );
  else return Sample( i );

mirror with duplicated edge pixel :
  if ( i < 0  ) return SampleBounded( - i - 1, w );
  if ( i >= w ) return SampleBounded( -i + 2*w - 1 , w );
  else return Sample( i );

(the correct edge mode depends on the usage of the image, which is one of those little annoying gotchas in games; eg. the mips you should make for tiling textures are not the same as the mips for non-tiling textures). (another reasonable option not implemented here is "extrapolate" , but you have to be a bit careful about how you measure the slope at the edge of the image domain)

The reason we do all this is because we don't want to have to accumulate the sum of filter weights and divide by the weight.

But really, in most cases what you really should be doing is applying the filter only where its domain overlaps the image domain. Then you sum the weights in the area that is valid and renormalize. eg. if our filter F is two pixels off the edges, we just apply [3,2,1] / 6 , we don't clamp the sampler and put an extra [1,2] on the first pixel.

ADDENDUM : in video games there's another special case that needs to be handled carefully. When you have a non-tiling texture which you wish to abutt seamlessly to another texture. That is, you have two textures T1 and T2 that are different and you wish to line them up beside each other without a seam.

I call this mode "shared", it sort of acts like "clamp" but has to be handled specially in filtering. Lets say T1 and T2 are layed against eachother horizontally, so they abutt along a column. What the artist should do is make the pixels in that border column identical in both textures (or you could have your program enforce this). Then, the UV mapping on the adjacent rectangles should be inset by half a pixel - that is, it picks the center of the pixels, not the edge of the texture. Thus the duplicated pixel edge only appears to be a single column of pixels.

But that's not the special case handling - the special case is whenever you filter a "shared" image, you must make border column pixels only from other border column pixels. That is, that shared edge can only be vertically filtered, not horizontally filtered. That way it stays identical in both images.

Note that this is not ideal with mipping, what happens is the shared edge gets fatter at higher mip levels - but it never develops a seam, so it is "seamless" in that sense. To do it right without any artifacts (eg. to look as if it was one solid bigger texture) you would have to know what image is on the other side of the shared edge and be able to filter tap into those pixels. Obviously that is impossible if your goal is a set of terrain tiles or something like that where you use the same shared edge in multiple different ways.

(is there a better solution to this issue?)


I did a little look into the difference between resizing an image 8X by either doubling thrice or directly resizing. I was sanity checking my filters and I thought - hey if I use a Gaussian filter, it should be the same thing, because convolution of a Gaussian with a Gaussian is a Gaussian, right?

In the continuous case, you could either use one Gaussian with an sdev of 8 (not actually right for 8X mag, but you get the idea). If you had a Gaussian with sdev 2 and convolved it 3 times - you should get a Gaussian with sdev of 8.

So I tried it on my filters and I got :


Gaussian for doubling, thrice :

1.0000,0.9724,0.8822,0.7697,0.5059,0.3841,0.2635,0.1964,0.1009,0.0607,0.0281,0.0155,0.0067,0.0034,0.0012,0.0004,...

Gaussian for direct 8x :

1.0000,0.9439,0.8294,0.6641,0.4762,0.3057,0.1784,0.0966,0.0492,0.0235,0.0103,0.0041,0.0014,0.0004,...

and I was like yo, WTF they're way off, I must have a bug. (note : these are scaled to make the max value 1.0 rather than normalizing because it's easier to compare this way, they look more unequal after normalizing)

But then I realized - these are not really proper Gaussians. These are discrete samples of Gaussians. If you like, it's a Gaussian multiplied by a comb. It's not even a Gaussian convolved with a box filter - that is, we are not applying the gaussian over the range of the pixel as if the pixel was a box, but rather just sampling the continuous function at one point on the pixel. Obviously the continuous convolution theorem that Gauss [conv] Gauss = Gauss doesn't apply.

As for the difference between doing a direct 8X and doubling thrice, I can't see a quality difference with my eyes. Certain the filters are different numerically - particularly filters with negatives, eg. :


sinc double once : 
1.0000,0.6420,0.1984,-0.0626,-0.0974,-0.0348,0.0085,0.0120,
sinc double twice : 
1.0000,0.9041,0.7323,0.5193,0.3042,0.1213,-0.0083,-0.0790,-0.0988,-0.0844,-0.0542,-0.0233,-0.0007,0.0110,0.0135,0.0107,0.0062,0.0025,0.0004,-0.0004,-0.0004,
sinc double thrice : 
1.0000,0.9755,0.9279,0.8596,0.7743,0.6763,0.5704,0.4617,0.3549,0.2542,0.1633,0.0848,0.0203,-0.0293,-0.0645,-0.0861,-0.0960,-0.0962,-0.0891,-0.0769,-0.0619,-0.0459,-0.0306,-0.0169,-0.0057,0.0029,0.0087,0.0120,0.0133,0.0129,0.0116,0.0096,0.0073,0.0052,0.0033,0.0019,0.0008,0.0001,-0.0003,-0.0004,-0.0004,-0.0004,-0.0002,

sinc direct 8x : 
1.0000,0.9553,0.8701,0.7519,0.6111,0.4595,0.3090,0.1706,0.0528,-0.0386,-0.1010,-0.1352,-0.1443,-0.1335,-0.1090,-0.0773,-0.0440,-0.0138,0.0102,0.0265,0.0349,0.0365,0.0328,0.0259,0.0177,0.0097,0.0029,-0.0019,-0.0048,-0.0059,

very different, but visually meh? I don't see much.


The other thing I constantly forget about is "filter inversion". What I mean is, if you're trying to sample between two different grids using some filter, you can either apply the filter to the source points or the dest points, and you get the same results.

More concretely, you have filter shape F(t) and some pixels at regular locations P[i].

You create a continuous function f(t) = Sum_i P[i] * F(i-t) ; so we have placed a filter shape at each pixel center, and we are sampling them all at some position t.

But you can look at the same thing a different way - f(t) = Sum_i F(t-i) * P[i] ; we have a filter shape at position t, and then we are sampling it at each position i around it.

So, if you are resampling from one size to another, you can either do :

1. For each source pixel, multiply by filter shape (centered at source) and add shape into dest, or :

2. For each dest pixel, multiply filter shape (centered at dest) by source pixels and put sum into dest.

And the answer is the same. (and usually the 2nd is much more efficient than the first)


And for your convenience, here are some doubling filters :


box        : const float c_filter[1] = { 1.00000 };
linear     : const float c_filter[2] = { 0.25000, 0.75000 };
quadratic  : const float c_filter[3] = { 0.28125, 0.68750, 0.03125 };
cubic      : const float c_filter[4] = { 0.00260, 0.31510, 0.61198, 0.07031 };
mitchell0  : const float c_filter[4] = { -0.02344, 0.22656, 0.86719, -0.07031 };
mitchell1  : const float c_filter[4] = { -0.01476, 0.25608, 0.78212, -0.02344 };
mitchell2  : const float c_filter[4] = { 0.01563, 0.35938, 0.48438, 0.14063 };
gauss      : const float c_filter[5] = { 0.00020, 0.20596, 0.78008, 0.01375, 0.00000 };
sqrtgauss  : const float c_filter[5] = { 0.00346, 0.28646, 0.65805, 0.05199, 0.00004 };
sinc       : const float c_filter[6] = { 0.00052, -0.02847, 0.23221, 0.87557, -0.08648, 0.00665 };
lanczos4   : const float c_filter[4] = { -0.01773, 0.23300, 0.86861, -0.08388 };
lanczos5   : const float c_filter[5] = { -0.04769, 0.25964, 0.89257, -0.11554, 0.01102 };
lanczos6   : const float c_filter[6] = { 0.00738, -0.06800, 0.27101, 0.89277, -0.13327, 0.03011 };

These are actually pairs of filters to create adjacent pixels in a double-resolution output. The second filter of each pair is simply the above but in reverse order (so the partner for linear is 0.75, 0.25).

To use these, you scan it over the source image and apply centered at each pixel. This produces all the odd pixels in the output. Then you take the filter and reverse the order of the coefficients and scan it again, this produces all the even pixels in the output (you may have to switch even/odd, I forget which is which).

These are created by taking the continuous filter function and sampling at 1/4 offset locations - eg. if 0 is the center (maximum) of the filter, you sample at -0.75,0.25,1.25, etc.

And here's the same thing with a 1.15 X blur built in :


box        : const float c_filter[1] = { 1.0 };
linear     : const float c_filter[2] = { 0.30769, 0.69231 };
quadratic  : const float c_filter[3] = { 0.00000, 0.33838, 0.66162 };
cubic      : const float c_filter[5] = { 0.01586, 0.33055, 0.54323, 0.11034, 0.00001 };
mitchell0  : const float c_filter[5] = { -0.05174, 0.30589, 0.77806, -0.03143, -0.00078 };
mitchell1  : const float c_filter[5] = { -0.02925, 0.31410, 0.69995, 0.01573, -0.00052 };
mitchell2  : const float c_filter[5] = { 0.04981, 0.34294, 0.42528, 0.18156, 0.00041 };
gauss      : const float c_filter[6] = { 0.00000, 0.00149, 0.25842, 0.70629, 0.03379, 0.00002 };
sqrtgauss  : const float c_filter[6] = { 0.00000, 0.01193, 0.31334, 0.58679, 0.08726, 0.00067 };
sinc       : const float c_filter[7] = { 0.00453, -0.05966, 0.31064, 0.78681, -0.03970, -0.00277, 0.00015 };
lanczos4   : const float c_filter[5] = { -0.05129, 0.31112, 0.78006, -0.03946, -0.00042 };
lanczos5   : const float c_filter[6] = { 0.00499, -0.09023, 0.33911, 0.80082, -0.04970, -0.00499 };
lanczos6   : const float c_filter[7] = { 0.02600, -0.11420, 0.34931, 0.79912, -0.05497, -0.00837, 0.00312 };

The best doubling filters to my eyes are sinc and lanczos5, they have a good blend of sharpness and lack of artifacts. Stuff like gauss and cubic are too blurry, but are very smooth ; lanczos6 is sharper but has more ringing and stair-steps; wider lanczos filters get worse in that way. Sinc and lanczos5 without any blur built in can have a little bit of visible stair-steppiness (there's an inherent tradeoff when linear upsampling of sharpness vs. stair-steps) (by stair steps I mean the ability to see the original pixel blobs).

3 comments:

ryg said...

"But then I realized - these are not really proper Gaussians. These are discrete samples of Gaussians. If you like, it's a Gaussian convolved with a comb filter."
This is if you use a sampled Gaussian as your filter. There's an alternative approach - the basic idea is that convolution with a Gaussian is the solution to a simple continuous uniform linear diffusion equation (at a time proportional to the desired variance). You can then consider that same diffusion problem on a discrete space (i.e. grid) and use the solutions as your discrete Gaussian approximation.

The 1D kernel coefficients are given by

K[sigma](n) = exp(-sigma) * I_n(sigma)

where I_n is the modified Bessel function of order n. For 2D you just do the usual separable filtering thing; the resulting kernel doesn't have perfect rotational symmetry, but it does have the convolution theorem: K[sigma1] * K[sigma2] = K[sigma1+sigma2].

There's a nice introduction on Wikipedia, and the original paper by Lindeberg is available online.

castano said...

> what you really should be doing is applying the filter only where its domain overlaps the image domain.

Interesting... and if you are using a polyphase filter (case 2 below) you are probably precomputing the kernel weights for each column, so you can normalize the edge case in advance and it comes out for free.

I still think that wrapping properly is preferable if you have that information.

cbloom said...

"I still think that wrapping properly is preferable if you have that information."

Yeah, video game textures are sort of a special case. Note : addendum added to original post on this subject.

Also, for the small filters shown here, the issue of off-edge sampling is not very important (assuming your image is large - 3 pixels of edge being slightly not perfect on a 1920 wide image is invisible).

In some cases however (SCIELAB for example) I've used some huge Gaussians, like 100 pixels wide, and then the off-edge contribution to the filter can be very significant.

old rants