04-27-11 - Things we need

Things that the world needs :

1. A real privacy law in America.

It should be illegal to give my personal information away without my express permission. It's absolutely sickening that my banks and credit cards are selling information about me.

Furthermore, it should be illegal to require personal information for marketting purposes in exchange for discounts. eg. stores that ask for your phone number when you check out, stores that use "club cards" to compell you to give your personal info, etc.

Nobody should be able to check your credit report without your explicit permission. eg. when some credit card company goes to ping your credit info, it should email you and say "is this allowed?" and you can just say no.

2. An academic journal that's free, peer reviewed, and gives the writers full ownership of their own work.

2.A. Information should be free. There's no reason for technical papers to be anything but online, so the costs are near zero, so there is absolutely no excuse for charging money for it. The only paid position is the editor, and that could easily be supported through donations.

2.B. Peer review is important to weed out junk. I would go further actually, I don't think the typical academic peer review structure is good enough. I'd like an organized way for other experts to post official counter arguments or addenda. The best journal I've ever seen is the Physical Review, in which you would frequently see a paper, and then a small counter-paper right after it.

2.C. It's absolutely sickening that major publishers take away the rights of the author. Authors would retain all rights to reproduce, distribute, whatever they want (however, the full free access to the paper could never be revoked).

2.D. I would like to see every paper start with a declaration of whether the authors or their organization have (or will try to get) patents on the described technique. This would also be a spot where they could state that they will not ever patent the work described, or they could donate the work to the defensive patent collection described next.

3. A viral defensive patent collection. I've written about this before, but the idea in brief is to create a pool of patents which is "viral" like the GPL is. That is, you can use absolutely any patent in the pool, if and only if you do not own any patent that is outside the pool. If you don't comply, then the patents in the pool cost money, and that money pays for administration of the pool and law suits against violators and so on. This is probably not realistic due to the necessity of corporations to cross-license, eg. even if someone like Google wanted be altruistic and do this, they can't because they need a patent portfolio to make cross-license arrangements with the other fuckers.

04-27-11 - Mexico

Just back from Mexico and a few random thoughts I want to record.

We were there during Semana Santa which is this huge Mexican holiday where everyone goes to the beach and parties. Lots of people told us not to go during that time, it was horrible, blah blah blah. Well those people suck, they're the kind of people who say things like "Mexico would be great if it wasn't full of Mexicans". I thought it was very cool to see. For one thing it meant that these gringo tourist towns were actually full of Mexicans, which hides some of their horribleness. It's also just a wonderful lively atmosphere, it's like being in New Orleans for Mardi Gras or something. Obviously you have to be smart about where you book your hotel - not right on a busy beach, not right on the town square - but other than that it's fine.

It was really cool seeing all the families. So the families come to the beach and find a table at a palapa and just sit there all day. Mexican restaurant hospitality believes in never rushing the patron or making them feel like they have to leave, so you can just sit there all day. The families even often bring their own food and drink, and you just have to order a tiny bit from the restaurant over the course of the day. Then the parents just sit there and the kids run around and play all day. It's an absolutely wonderful system.

It's so good for me just to be away from computers and television. And it's lovely to be somewhere warm where you can be outside all the time. I love the evening promenade, after the sun is down and it's cool enough, everyone strolls around town or hangs out in the square. It's tragic that as soon as I get home I immediately go for the TV and computer and fuck myself over. I have such peace when I wake up in the morning and sit on the porch and I know that all I can do right then is sit there because there is no computer, there is no TV.

Whenever I get back from a third world country, it strikes me how little freedom we actually have in the US. We are an incredibly repressed people. For example, during Semana Santa tons of people descend on the beaches and camp. Many of the places that they camp are technically private property or even public beaches but with forbidden camping; it's allowed because there's just not very much enforcement and because it's tradition. If a house is abandoned or something of course you move into it. If you're poor, you can make work by selling stuff on the beach, or opening up a food cart.

The latest trend I'm seeing in America that really bothers me is that even parking lots are gated or chained off. God forbid somebody goes on your parking lot when you're not using it. In Seattle city area I don't think there's a single parking lot that isn't fenced/chained at night. There's this oppressive feeling that you have to stay within the lines all the time.

We spent most of the time in the Costa Alegre area, a huge, empty, dry, hot space. It sort of reminded me of the trips to Mexico we used to take when I was a kid. We went to Isla Mujeres primarily, but it was 20 years ago and it was very different than it is now. There was only one big hotel for gringos, and the rest of the town was still local fisherman or a few small Mexican hotels (concrete boxes with no AC, very cheap). We went every summer for many years, and it was gradually changing, getting more developed, but the big change happened when some huge hurricane wiped the island clean, and of course it was rebuilt by the big money tourism developers, and now it's like a Disneyland styrofoam facsimile of a "Mexican island village".

There are lots of little farm towns around the Costa Alegre area. They're dusty and poor and feel like time moves very slowly there. It occurs to me that little farm towns like these are all over Mexico; I have no idea if this is true, but it feels like the towns are full of children and old people, like the middle aged have all left to seek work. It's sort of tragic for the world that it's not possible for people to make a living in these little farm towns. The amount of money they need to make to stay in the town is so tiny too, probably only a dollar a day or so, and they can't even make that.

It seems to me that it's in the best interest of the whole world to make subsistence farming viable. It would prevent mass immigrations and refugee problems. If you are anti-illegal-immigrant in the US, forget building a wall, what you need to do is stop the US government from subsidizing crop prices and domestic agribusiness.

The other issue is that the wonderful handicrafts they make in these towns just can't support them. There's lots of wonderful weavers, leather workers, etc. with great skills, and in theory they could make that stuff and sell it to the city folk, but they just can't sell enough of it for high enough prices. The problem is that western consumers just don't want quality hand made stuff, we want stuff that's the latest trend, and we don't care if it's cheap factory made junk. Certainly part of the problem is that the country handiworkers don't cater to modern tastes enough; I feel like there is an opportunity there for some charity to work with the small town craftsmen to teach them how to make things more aligned with what western consumers want. But it's not clear that would make a big difference, there just aren't enough consumer who care about getting a hand made belt vs a machine made one.

Anyway, I'm just amazed by the tacos you get for fifty cents. The tortillas are always made to order, even at the most ghetto road side cart, of course they make the tortillas to order! Someone (usually an older woman) grabs a handful of masa, slaps it around to shape it, presses it, and tosses it on the griddle (comal). The cooked tortilla has a subtle flavor of malty browned corn. My favorite thing was probably the quesadillas (which aren't really like our quesadillas at all), because they use a larger thicker tortilla which lets you get more of the flavor of the masa.

Some random impressions of my youth in Mexico :

Packs of wild dogs. On Isla Mujeres there was a pack of some 20 dogs or so that roamed the streets, constantly barking, stopping to smell things then running to catch up, breaking into sudden vicious fights with each other, fucking each other, snarling. It was variable in size, because dog ownership in Mexico is sort of a gray area; some people own dogs like Americans do, coddling them and keeping them indoors and feeding them, but others own them in a looser way, sort of just allowing a semi-wild dog to hang around, and the semi-wild dog might run off and hang with the pack for a while before coming home. We would chase them around town, or be chased by them, always sort of afraid of their wildness and fascinated by the way they could roam the streets and nobody seemed to be bothered much by it.

Rebar. Every building is ready to have another floor added. There's a shop on the first floor, the second floor is poured concrete, but unfinished and empty, and above that rebar juts into the sky. Everywhere, even on buildings that look perfectly finished, there's odd bits of rebar sticking out. Concrete walls always have a bit of extra rebar sticking out in case you want to go higher.

Wonderful loose hand-woven hammocks. I think the best hotel we ever stayed at was a desperation find after something else didn't work out. It was a concrete box with one wall missing facing the sea. There were beds, but also piles of giant cockroaches, cockroaches the size of a deck of playing cards, and it was hot as hell, so nobody slept in the beds. You can always tell you're in a real proper Mexican hotel when there are hammock hooks drilled into the walls. We all slept in hammocks and the sea air blew in through the side of the room that had no wall and it was delightful. (delightful other than the fact that you had to walk down to the ground floor to operate a hand pump to get water pressure up to the third floor).

I remember eating hamburgers a lot. And the great Cristal sodas (Toronja por favor, to which they would always say "Naranja?" and I would have to say "no, toe-ron-ja"). In the summer a little travelling carnival would come to town for a few weeks; the rides were all hand made, rickety, mostly human-powered. One of my favorites was this large wooden box that was held off the ground by an axle through the middle of the box. You would sit inside and then two strong men would just grab the outside of the box and rock it back and forth increasingly hard until it spun all the way around.


04-12-11 - Some TV Reviews

Circus - PBS reality show about a shitty circus ; they don't really do a good job of tapping into the crack-rock capabilities of reality TV, it's too much like an interview/documentary. The biggest problem is that the Circus managers and the head clown and the all the people in power are just giant douchebags, they're literally like the Office Space manager, "I'm going to need you to work through the weekend mmmkay" and you just hate them, but not in a compelling reality TV way, more in just a "I don't want to watch this any more way". Anyhoo, it's saved by some of the Circus performers, who are just such sympathetic characters, so full of art and passion, such as Sarah Schwarz . Meh, can't recommend it, but if you fast-forward extensively it's okay.

Luther - Idris Elba (Stringer Bell) as the hammy over-acted ever-so-intense John Luther. It's enjoyable and well made, though rather heavy on the gruesomeness and trying too hard to be serious all the time. Yeah it's ridiculous, over the top, cliched, but Idris has an awful lot of charisma and carries it pretty well.

Sherlock - modern resetting of some Holmes stories ; at first it seems appealing, because it is at least trying to be intellectual; unfortunately it's only "smart" on a wafer-thin vapid surface level. Tolerable but not worth seeking out.

Long Way Round / Long Way Down - Ewan McGregor motorbikes around the world. Actually, to be more accurate I should say, "Ewan McGregor plans to motorbike around the world in agonizing detail, whines about every little inconvenience, travels with a massive support crew, and then goes on and on about what a difficult thing it is". The main thing that struck me was how someone who's so rich and outgoing can be so bad at life. They seem to have no concept of how to travel. Point of advice #1 would be don't bring a film crew. And then #2 is don't plan out an hourly itenerary for yourself in advance. The whole show is just sort of upsetting because you watch them botching their journey so badly, just constantly worrying about getting to the next checkpoint (or coming across a small puddle and calling in the support team to help them across it). It's pathetic.

I did find watching Ewan somewhat inspiring though. Actually I got the same thing from Michael Palin in his travel shows. You can watch both of them turn on the performance personality when it's called for, even when they don't really want to, they're tired, they're embarassed, it's a situation where a normal person like me would just beg off "no no, I can't" but you watch them sort of take a deep breath and steel themselves and get on with it, make a speech or sing a song or whatever it is that's called for. Performance situations happen constantly in life, not just on stage, you come across a moment when the appropriate thing to do is to tell a funny story, or to dance, or whatever you should do to smooth out the situation or make it fun for everyone in that moment, and unless you're a psychopath, much of the time you won't really want to do it or you'll be embarassed or afraid or whatever, so the question is whether you just get on with it, or whether you wuss out and be bland and boring.

They both also do a very good job of talking to people in a nice way but without sacrificing their own dignity. You know like when you're in a weird place and you talk to some guy and he turns out to be a rabid racist or some awkward shit, there are two easy ways to deal with it which most people fall back on (1) is to just pretend that you agree with what he's saying, lots of nodding and mm-hmm, (2) is to just go "you're a nutter" or get sarcastic or just generally not engage with him any more. The hard thing is to stay engaged and talk to the guy in a joking, friendly manner, but without agreeing, and even making it clear that you disagree. I'm very impressed with that.


04-08-11 - Friday Filters

So in the last post we made the analysis filters for given synthesis filters that minimize L2 error.

Another common case is if you have a set of discrete data and wish to preserve the original values exactly. That is, you want to make your synthesis interpolate the original points.

(obviously some synthesis filters like sinc and box are inherently interpolating, but others like gauss and cubic are not).

So, the construction proceeds as before, but is actually simpler. Rather than using the hat overlaps we only need the values of the synthesis at the lattice points. That is :

Given synthesis/reconstruction filter S(t)
and discrete points P_i
Find points Q_i such that :

f(t) = Sum_i Q_i * S(t - i)

f(j) = P_j for all integers j

We can write this as a matrix equation :

Sij = S(j - i) = S(i - j)

S * Q = P

note that S is band-diagonal. This is the exact same kind of matrix problem that you get when solving for B-splines.

In general, if you care about the domain boundary effects, you have to construct this matrix and solve it somehow. (matrix inversion is a pretty bad way, there are fast ways to solve sparse band-diagonal matrices).

However, if the domain is large and you don't care too much about the boundary, there's a much simpler way. You just find S^-1 and look at a middle row. As the domain size goes to infinity, all the rows of S^-1 become the same. For finite domain, the first and last rows are weird and different, but the middle rows are about the same.

This middle row of S^-1 is the analysis filter.

A = middle row of S^-1

Q = A <conv> P

Q_i = Sum_j A(i-j) P_j

note A is only defined discretely, not continuously. Now our final output is :

f(t) = Sum_i Q_i * S(t - i)

f(t) = Sum_i Sum_j A(i-j) P_j * S(t - i)

f(t) = Sum_j C(t-j) * P_j

C(t) = Sum_i A(i) * S(t-i)

where C is the combined analysis + synthesis filter. The final output is just a simple filter applied to the original points.

For example, for the "canonical cubic" synthesis , (box convolved thrice), we get :

cubic analysis :
const float c_filter[11] = { -0.00175, 0.00876, -0.03327, 0.12434, -0.46410, 1.73205, -0.46410, 0.12434, -0.03327, 0.00876, -0.00175 };
chart : 

(A is 11 wide because I used an 11x11 matrix; in general it's infinitely wide, but gets smaller as you go out)

The synthesis cubic is piece-wise cubic and defined over four unit intervals from [-2,2]
The combined filter is piece-wise cubic; each unit interval is a linear combo of the 4 parts of synthesis

{ ADDENDUM : BTW this *is* the B-spline cubic; see for example : Neil Dodgson Image resampling page 128-129 ; the coefficients are exactly the same }

So, you could do all this work, or you could just use a filter that looks like "combined" from the start.

gaussian analysis :
const float c_filter[11] = { -0.00001, 0.00008, -0.00084, 0.00950, -0.10679, 1.19614, -0.10679, 0.00950, -0.00084, 0.00008, -0.00001 };
chart : 

mitchell1 analysis :
const float c_filter[11] = { -0.00000, 0.00002, -0.00028, 0.00446, -0.07115, 1.13389, -0.07115, 0.00446, -0.00028, 0.00002, -0.00000 };
chart : 

Now, not surprisingly all of the "combined" filters look very similar, and they all look rather a lot like windowed sincs, because there simply aren't that many ways to make interpolating filters. They have to be = 1.0 at 0, and = 0.0 at all other integer locations.

ADDENDUM : well, I finally read the Nehab/Hoppe paper "Generalized Sampling" , and guess what, it's this. It comes from a 1999 paper by Blu et.al. called "Generalized Interpolation: Higher Quality at no Additional Cost".

The reason they claim it's faster than traditional filtering is that what we have done is to construct a sinc-like interpolating filter with wide support, which I call "combined", which can be separated into a simple compact "synthesis" filter, and a discrete matrix (S^-1). So for the case where you have only a few sets of data and you sample it many times (eg. texture in games), you can obviously implement this quickly by pre-applying the discrete matrix to the data, so you no longer have samples of the signal as your pixels, instead you have amplitudes of synthesis basis functions (that is, you store Q in your texture, not P). Now you can effectively sample the "combined" filter by applying only the "synthesis" filter. So for example Nehab/Hoppe suggests that you can do this fast on the GPU by using a GPU implementation of the cubic basis reconstruction.

Both of the papers are somewhat misleading in their claims of "higher quality than traditional filtering". You can of course compute a traditional linear filter that gives the same result, since their techniques are still just linear. They are higher quality *for the same complexity of filter* , and only if you have pre-done the S^-1 multiplication. The speed advantage is in being able to precompute the S^-1 multiplication.


04-07-11 - Help me I can't stop

Some common synthesis filters and their corresponding analysis filter :
BEGIN review

if you want to approximate f(t) by

Sum_i P_i * synthesis(t-i)

you can find the P's by :

P_i = Convolve{ f(t) analysis(t-i) }

END review
a note on the method :

BEGIN note

the H overlap matrix was computed on a 9x9 domain
because my matrix inverse is ungodly slow

for sanity checking I compared to 11x11 a few times and found the difference to be small
for example :

linear filter invH 9x9 :


linear filter invH 11x11 :


(ideally I would use a very large matrix and then look at the middle row, because that is
where the boundary has the least effect)

for real use in a high precision environment you would have to take the domain boundary more seriously

also, I did something stupid and printed out the invH rows with the maximum value scaled to 1.0 ; the
unscaled values for linear are :


but, I'm not gonna redo the output to fix that, so the numbers below have 1.0 in the middle.

END note

For box synthesis, analysis is box.

linear : invH middle row : = 

(note: we've see this linear analysis filter before when we talked about how to find the optimum image such that when it's bilinear interpolated you match some original as well as possible)

quadratic invH middle row : = 

gauss-unity : invH middle row : = 

note : "unity" means no window, but actually it's a rectangular window with width 5 ; the gaussian has sdev of 0.5

sinc half-width of 20 :

sinc-unity : invH middle row : = 

note : obviously sinc is its own analysis ; however, this falls apart very quickly when you window the sinc at all, or even just cut it off when the values get tiny :

sinc half-width of 8 :

sinc-unity : invH middle row : = 

lanczos6 : invH middle row : = 

lanczos4 : invH middle row : = 

Oh, also, note to self :

If you print URLs to the VC debug window they are clickable with ctrl-shift, and it actually uses a nice simple internal web viewer, it doesn't launch IE or any such shite. Nice way to view my charts during testing.

ADDENDUM : Deja vu. Rather than doing big matrix inversions, you can get these same results using Fourier transforms and Fourier convolution theorem.

cbloom rants 06-16-09 - Inverse Box Sampling
cbloom rants 06-17-09 - Inverse Box Sampling - Part 1.5
cbloom rants 06-17-09 - Inverse Box Sampling - Part 2


04-06-11 - And yet more on filters

In the comments of the last post we talked a bit about reconstruction/resampling. I was a bit confounded, so I worked it out.

So, the situation is this. You have some discrete pixels, P_i. For reconstruction, each pixel value is multiplied by some continuous impulse function which I call a "hat", centered at the pixel center. (maybe this is the monitor's display response and the output value is light, or whatever). So the actual thing you care about is the continuous output :

Image(t) = Sum_i P_i * hat( t - i)

I'll be working in 1d here for simplicity but obviously for images it would be 2d. "hat" is something like a box, linear, cubic, gaussian, whatever you like, presumably symmetric, hopefully compact.

Okay, so you wish to resample to a new set of discrete values Q_i , which may be on a different lattice spacing, and may also be reconstructed with a different basis function. So :

Image'(t) = Sum_j Q_j * hat'( t - r*j )

(where ' reads "prime). In both cases the sum is on integers in the image domain, and r is the ratio of lattice spacings.

So what you actually want to minimize is :

E = Integral_dt {  ( Image(t) - Image'(t) )^2 }

Well to find the Q_j you just do derivative in Q_j , set it to zero, rearrange a bit, and what you get is :

Sum_k H'_jk * Q_k = Sum_i R_ij * P_i

H'_jk = Integral_dt { hat'(t-r*j) * hat'(t-r*k) }

R_ij = Integral_dt { hat'(t-r*j)) * hat(t-i) }

or obviously in matrix notation :

H' * Q = R * P

"H" (or H') is the hat self-overlap matrix. It is band-diagonal if the hats are compact. It's symmetric, and actually

H_ij = H( |i-j| )

that is, it only depends on the absolute value of the difference :

H_ij = 
H_0 H_1 H_2 ...
H_1 H_0 H_1 H_2 ...
H_2 H_1 H_0 ...

and again in words, H_i is the amount that the hat overlaps with itself when offset by i lattice steps. (you could normalize the hats so that H_0 is always 1.0 ; I tend to normalize so that h(0) = 1, but it doesn't matter).

If "hat" is a box impulse, then H is the identity matrix (H_0 = 1, else = 0). If hat is the linear tent, then H_0 = 2/3 and H_1 = 1/6 , if hat is Mitchell1 (compromise) the terms are :

H_0 : 0.681481
H_1 : 0.176080
H_2 : -0.017284
H_3 : 0.000463

While the H matrix is very simple, there doesn't seem to be a simple closed form inverse for this type of matrix. (is there?)

The R matrix is the "resampling matrix". It's the overlap of two different hat functions, on different spacings. We can sanity check the trivial case, if r = 1 and hat' = hat, then H = R, so Q = P is the solution, as it should be. R is also sparse and sort of "band diagonal" but not along the actual matrix diagonal, the rows are shifted by steps of the resize ratio r.

Let's try a simple case. If the ratio = 2 (doubling), and hat and hat' are both linear tents, then :

H_0 = 2/3 and H_1 = 1/6 , and the R matrix has sparse rows made of :

...0..., 0.041667, 0.250000, 0.416667, 0.250000, 0.041667 , .... 0 ....

Compute  H^-1 * R =

makes a matrix with rows like :

0 .. ,0.2500,0.5000,0.2500,0 ..

which is a really complicated way to get our triangle hat back, evaluated at half steps.

But it's not always that trivial. For example :

if the hat and hat' are both cubic, r = 2, 

H (self overlap) is :

0 : 0.479365
1 : 0.236310
2 : 0.023810
3 : 0.000198

R (resize overlap) is :

0 : 0.300893
1 : 0.229191
2 : 0.098016
3 : 0.020796
4 : 0.001538
5 : 0.000012

and H^-1 * R has rows of :


which are actually the values of the standard *quadratic* filter evaluated at half steps.

So the thing we get in the end is very much like a normal resampling filter, it's just a *different* one than if you just evaluated the filter shape. (eg. the H^-1 * R for a Gaussian reconstruction hat is not a Gaussian).

As noted in the previous comments - if your goal is just to resize an image, then you should just choose the resize filter that looks good to your eyes. The only place where this stuff might be interesting is if you are trying to do something mathematical with the actual image reconstruction. Like maybe you're trying to resample from monitor pixels to rod/cone pixels, and you have some a-priori scientific information about what shape reconstruction functions each surface has, so your evaluation metric is not ad-hoc.

.. anyhoo, I'm sure this topic has been covered in academic papers so I'm going to leave it alone.

ADDENDUM : another example while I have my test app working.

Gaussian filter with sdev = 1/2, evaluated at half steps :


The rows of (H^-1 * R) provide the resizing filter :


which comes from :

filter self overlap (H) :
0 : 0.886227
1 : 0.326025
2 : 0.016231
3 : 0.000109
4 : 0.000000
5 : 0.000000

filter resample overlap (R) :
0 : 1.120998
1 : 0.751427
2 : 0.226325
3 : 0.030630
4 : 0.001862

Let me restart from the beginning on the more general case :

Say you are given a continuous function f(t). You wish to find the discrete set of coefficients P_i such that under simple reconstruction by the hats h(t-i) , the L2 error is minimized (we are not doing simple sampling such that P_i = f(i)). That is :

the reconstruction F is :

F(t) = Sum_i P_i * h(t-i)

the error is :

E = Integral_dt { ( f(t) - F(t) ) ^2 }

do derivative in P_i and set to zero, you get :

Sum_j H_ij * P_j = Integral_dt { f(t) * h(t-i) }

where H is the same hat self-overlap matrix as before :

H_ij = h_i <conv> h_j

(with h_i(t) = h(t-i) and conv means convolution obviously)

or in terse notation :

H * P = f <conv> h

(H is a matrix, P is a vector )

rearranging you can also say :

P_j = f <conv> g


g_i(t) = Sum_j [ H^-1_ij * h(t-j) ]

what we have found is the complementary basis function for h. h (the hat) is like a "synthesis wavelet" and g is like an "analysis wavelet". That is, once you have the basis set g, simple convolution with the g's produces the coefficients which are optimal for reconstruction with h.

Note that typically H^-1 is not compact, which means g is not compact - it has significant nonzero value over the entire image.

Also note that if there is no edge to your data (it's on infinite domain), then the g's are just translations of each other, that is, g_i(t) = g_0(t-i) ; however on data with finite extent this is not the case (though the difference is compact and occurs only at the edges of the domain).

It should be intuitively obvious what's going on here. If you want to find pixel P_i , you take your continuous signal and fit the hat at location i and subtract that out. But your neighbors' hats also may have overlapped in to your domain, so you need to compensate for the amount of the signal that they are taking, and your hat overlaps into your neighbors, so choosing the value of P_i isn't just about minimizing the error for that one choice, but also for your neighbors. Hence it becomes non-local, and very much like a deconvolution problem.


04-04-11 - Yet more notes on filters

Topics for today :

N-way filters

symmetry of down & up

qualitative notes

comb sampling

1. N-way filters. So for a long time cblib has had good doublers and halvers, but for non-binary ratios I didn't have a good solution and I wasn't really sure what the solution should be. What I've been doing for a long time has been to use doublers/halvers to get the size close, then bilinear to get the rest of the way, but that is not right.

In fact the solution is quite trivial. You just have to go back to the original concept of the filters as continuous functions. This is how arbitrary float samplers work (see the Mitchell papers for example).

Rather than a discrete filter, you use the continuous filter impulse. You put a continuous filter shape at each pixel center, multiplied by the pixel value. Now you have a continuous function for your whole image by just summing all of these :

Image(u,v) = Sum[ all pixels ] Pixel[i,j] * Filter_func( u - i, v - j )

So to do an arbitrary ratio resize you just construct this continuous function Image, and then you sample it at all the fractional u,vs.

Now, because of the "filter inversion" principle that I wrote about before, rather than doing this by adding up impulse shapes, you can get the exact same output by constructing an impulse shape and convolving it with the source pixels once per output pixel. The impulse shape you make should be centered on the output pixel's position, which is fractional in general. This means you can't precompute any discrete filter taps.

So - this is cool, it all works. But it's pretty slow because you're evaluating the filter function many times, not just using discrete taps.

There is one special case where you can accelerate this : integer ratio magnifications or minifications. Powers of two obviously, but also 3x,5x, etc. can all be fast.

To do a 3x minification, you simply precompute a discrete filter that has a "center width" of 3 taps, and you apply it in steps of 3 to make each output pixel.

To do a 3x magnification, you need to precompute 3 discrete filters. The 3 filters will be applied at each source pixel location to produce 3 output pixels per source pixel. They correspond to the impulse shape offset by the correct subpixel amounts (for 3X the offets are -1/3,0,1/3). Note that this is just the same as the arbitrary ratio resize, we're just reusing the computation when the subpixel part repeats.

(in fact for any rational ratio resize, you could precompute the filters for the repeating sub-integer offsets ; eg. to resize by a ratio of 7/3 you would need 21 filters ; this can save a lot of work in some cases, but if your source and dest image sizes are relatively prime it doesn't save any work).

2. Symmetry of down & up . If you look at the way we actually implement minification and magnification, they seem very different, but they can be done with the same filter if you like.

That is, the way we actually implement them, as described above for 3X ratio for example :

Minify : make a filter with center width 3, convolve with source at every 3rd pixel to make 1 output

Magnify : make a filter with center width 1, convolve with source at every 1/3rd pixel to make 1 output

But we can do magnify another way, and use the exact same filter that we used for minify :

Magnify : make a filter with center with 3, multiply with each source pel and add into output

Magnify on the left , Minify on the right :

As noted many times previously, we don't actually implement magnify this way, but it's equivalent.

3. Qualitative notes. What do the filters actually look like, and which should you use ?

Linear filters suffer from an inherent trade-off. There is no perfect filter. (as noted previously, modern non-linear techniques are designed to get around this). With linear filters you are choosing along a spectrum :

blurry -> sharp / ringy

The filters that I've found useful, in order are :

[blurry end]

gauss - no window
gauss - cos window (aka "Hann")
gauss - blackman

Mitchell blurry (B=3/2)
Mitchell "compromise" (B=1/3)
Mitchell sharp (B=0)

sinc - blackman
sinc - cos
sinc - sinc (aka Lanczos)

[sharp end]

there are lots of other filters, but they are mostly off the "pareto frontier" ; that is, one of the filters above is just better.

Now, if there were never any ringing artifacts, you would always want sinc. In fact you would want sinc with no window at all. The output from sinc resizing is sometimes just *amazing* , it's so sharp and similar to the original. But unfortunately it's not reliable and sometimes creates nasty ringing. We try to limit that with the window function. Lanczos is just about the widest window you ever want to use with sinc. It produces very sharp output, but some ringing.

Note that sinc also has a very compelling theoretical basis : it reproduces the original pixels if you resize by a factor of 1.0 , it's the only (non-trivial) filter that does this. (* not true - see later posts on this topic where we construct interpolating versions of arbitrary filters)

If you are resizing in an interactive environment where the user can see the images, you should always start with the sharp filters like Lanczos, and the user can see if they produce unnacceptable artifacts and if so go for a blurrier filter. In an automated environment I would not use Lanczos because it is too likely to produce very nasty ringing artifacts.

The Mitchell "compromise" is a very good default choice in an automated environment. It can produce some ringing and some blurring, but it's not too horrible in either way. It's also reasonably compact.

The Gauss variants are generally more blurring than you need, but have the advantage that all their taps are positive. The windowed gaussians generally look much better than non-windowed, they are much sharper and look more like the original. They can produce some "chunkiness" (raster artifacts), that is under magnification they can make edges have visible stair steps. Usually this is preferrable to the extreme blurriness of the true non-windowed gaussian.

4. comb sampling

Something that bothers me about all this is the way I make the discrete filters from the continuous ones is by comb sampling. That is, I evaluate them just at the integer locations and call that my discrete filter.

I have some continuous mother filter function, f(t) , and I make the discrete filter by doing :

D[] = { f(-2), f(-1), f(0), f(1), f(2) }

and then I apply D to the pixels by just doing mult-adds.

But this is equivalent to convolving the continuous filter f(t) with the original image if the original pixels has dirac delta-functions at each pixel center.

That seems kind of whack. Don't I want to imagine that the original pixels have some size? Maybe the are squares (box impulses), or maybe they are bilinear (triangle hats), etc.

In that case, my discrete filter should be made by convolving the base pixel impulse with the continuous filter, that is :

D[] = { f * hat(-2) , f * hat(-1) , .. }

(* here means convolve)

Note that this doesn't really apply to resizing, because in resizing what we are doing when we apply a filter is we are representing the basic pixel impulses. This applies if I want to apply a filter to my pixels.

Like say I want to apply a Gaussian to an image. I shouldn't just evaluate a gaussian function at each pixel location - I should convolve pixel shapes with the gaussian.

Note that in most cases this only changes the discrete filter values slightly.

Also note that this is equivalent to up-sizing your image using the "hat" as the upsample filter, and then doing a normal comb discrete filter at the higher res.


04-03-11 - Some more notes on filters

Windowing the little discrete filters we use in image processing is a bit subtle. What we want from windowing is : you have some theoretical filter like Gauss or Sinc which is non-zero over an infinite region and you wish to force it to act only on a finite domain.

First of all, for most image filtering operations, you want to use a very small filter. A wider filter might look like it has a nicer shape, and it may even give better results on smooth parts of the image, but near edges wide filters reach too much across the edge and produce nasty artifacts, either bad blurring or ghosting or ringing.

Because of that, I think a half-width of 5 is about the maximum you ever want to use. (in the olden days, Blinn recommended a half-width of 3, but our typical image resolution is higher now, so I think you can go up to 5 most of the time, but you will still get artifacts sometimes).

(also for the record I should note that all this linear filtering is rather dinosaur-age technology; of course you should use some sort of non-linear adaptive filter that is wider in smooth areas and doesn't cross edges; you could at least use steam-age technology like bilateral filters).

So the issue is that even a half-width of 5 is actually quite narrow.

The "Windows" that you read about are not really meant to be used in such small discrete ranges. You can go to Wikipedia and read about Hann and Hamming and Blackman and Kaiser and so on, but the thing they don't really tell you is that using them here is not right. Those windows are only near 1.0 (no change) very close to the origin. The window needs to be at least 4X wider than the base filter shape, or you will distort it severely.

Most of the this stuff comes from audio, where you're working on 10000 taps or something.

Say you have a Gaussian with sigma = 1 ; most of the Gaussian is inside a half-width of 2 ; that means your window should have a half-width of 8. Any smaller window will strongly distort the shape of the base function.

In fact if you just look at the Blackman window : (from Wikipedia)

The window function itself is like a cubic filter. In fact :

Odd Blackman window with half-width of 3 :

const float c_filter[7] = { 0.00660, 0.08050, 0.24284, 0.34014, 0.24284, 0.08050, 0.00660 };

Odd Nutall 3 :

const float c_filter[7] = { 0.00151, 0.05028, 0.24743, 0.40155, 0.24743, 0.05028, 0.00151 };

can be used for filtering by themselves and they're not bad. If you actually want it to work as a *window* , which is just supposed to make your range finite without severely changing your action, it needs to be much wider.

But we don't want to be wide for many reasons. One is the artifacts mentioned previously, the other is efficiency. So, I conclude that you basically don't want all these famous windows.

What you really want for our purposes is something that's flatter in the middle and only get steep at the very edges.

Some windows on 8 taps :

from sharpest to flattest :

//window_nutall :
const float c_filter[8] = { 0.00093, 0.02772, 0.15061, 0.32074, 0.32074, 0.15061, 0.02772, 0.00093 };

//blackman :
const float c_filter[8] = { 0.00435, 0.05122, 0.16511, 0.27932, 0.27932, 0.16511, 0.05122, 0.00435 };

//cos : (aka Hann)
const float c_filter[8] = { 0.00952, 0.07716, 0.17284, 0.24048, 0.24048, 0.17284, 0.07716, 0.00952 };

//window_blackman_sqrt :
const float c_filter[8] = { 0.02689, 0.09221, 0.16556, 0.21534, 0.21534, 0.16556, 0.09221, 0.02689 };

//window_sinc :
const float c_filter[8] = { 0.02939, 0.09933, 0.16555, 0.20572, 0.20572, 0.16555, 0.09933, 0.02939 };

//sin :
const float c_filter[8] = { 0.03806, 0.10839, 0.16221, 0.19134, 0.19134, 0.16221, 0.10839, 0.03806 };

I found the sqrt of the blackman window is pretty close to a sinc window. But really if you use any of these on a filter which is 8-wide, they are severely distorting it.

You can get flatter windows in various ways; by distorting the above for example (stretching out their middle part). Or you could use a parameterized window like Kaiser :

Kaiser : larger alpha = sharper

// alpha = infinite is a delta function
//window_kaiser_6 :
const float c_filter[8] = { 0.01678, 0.07635, 0.16770, 0.23917, 0.23917, 0.16770, 0.07635, 0.01678 };
//window_kaiser_5 :
const float c_filter[8] = { 0.02603, 0.08738, 0.16553, 0.22106, 0.22106, 0.16553, 0.08738, 0.02603 };
//window_kaiser_4 :
const float c_filter[8] = { 0.03985, 0.09850, 0.16072, 0.20093, 0.20093, 0.16072, 0.09850, 0.03985 };
//window_kaiser_3 :
const float c_filter[8] = { 0.05972, 0.10889, 0.15276, 0.17863, 0.17863, 0.15276, 0.10889, 0.05972 };
// alpha = 0 is flat line

Kaiser-Bessel-Derived :

//kbd 6 :
const float c_filter[8] = { 0.01763, 0.10200, 0.17692, 0.20345, 0.20345, 0.17692, 0.10200, 0.01763 };
//kbd 5 :
const float c_filter[8] = { 0.02601, 0.10421, 0.17112, 0.19866, 0.19866, 0.17112, 0.10421, 0.02601 };
//kbd 4 :
const float c_filter[8] = { 0.03724, 0.10637, 0.16427, 0.19213, 0.19213, 0.16427, 0.10637, 0.03724 };
//kbd 3 :
const float c_filter[8] = { 0.05099, 0.10874, 0.15658, 0.18369, 0.18369, 0.15658, 0.10874, 0.05099 };

these are interesting, but they're too expensive to evaluate at arbitrary float positions; you can only use them to fill out small discrete filters. So that's mildly annoying.

There's something else about windowing that I should clear up as well : Where do you put the ends of the window?

Say I have a discrete filter of 8 taps. Obviously I don't put the ends of the window right on my last taps, because the ends of the window are zeros, so they would just make my last taps zero, and I may as well use a 6-tap filter in that case.

So I want to put the end points somewhere past the ends of my discrete filter range. In all the above examples in this post, I have put the end points 0.5 taps past each end of the discrete range. That means the window goes to zero half way between the last tap inside the window and the first tap outside the window. That's on the left :

On the right is another option, which is the window could go to zero 1.0 taps past the end of the discrete range - that is, it goes to zero exactly on the next taps. In both cases this produces a finite window of the desired size, but it's slightly different.

For a four tap filter, the left-side way uses a window that is 4.0 wide ; the right side uses a window that is 5.0 wide. If you image the pixels as squares, the left-side way only overlaps the 4 pixels cenetered on the filter taps; the right side way actually partially overlaps the next pixels on each side, but doesn't quite reach their centers, thus has zero value when sampled at their center.

I don't know of any reason to strongly prefer one or the other. I'm using the 0.5 window extension in my code.

Let me show some concrete examples, then we'll talk about them briefly :

//filter-windower-halfwidth :

//gauss-unity-5 :
const float c_filter[11] = { 0.00103, 0.00760, 0.03600, 0.10936, 0.21301, 0.26601, 0.21301, 0.10936, 0.03600, 0.00760, 0.00103 };

//gauss-sinc-5 :
const float c_filter[11] = { 0.00011, 0.00282, 0.02336, 0.09782, 0.22648, 0.29882, 0.22648, 0.09782, 0.02336, 0.00282, 0.00011 };

//gauss-blackman-5 :
const float c_filter[11] = { 0.00001, 0.00079, 0.01248, 0.08015, 0.23713, 0.33889, 0.23713, 0.08015, 0.01248, 0.00079, 0.00001 };

//gauss-blackman-7 :
const float c_filter[15] = { 0.00000, 0.00000, 0.00015, 0.00254, 0.02117, 0.09413, 0.22858, 0.30685, 0.22858, 0.09413, 0.02117, 0.00254, 0.00015, 0.00000, 0.00000 };

//gauss-blackman-7 , ends cut :
const float c_filter[11] = { 0.00015, 0.00254, 0.02117, 0.09413, 0.22858, 0.30685, 0.22858, 0.09413, 0.02117, 0.00254, 0.00015 };

//gauss-blackman-13, ends cut :
const float c_filter[11] = { 0.00061, 0.00555, 0.03085, 0.10493, 0.21854, 0.27906, 0.21854, 0.10493, 0.03085, 0.00555, 0.00061 };

1. The gaussian, even though technically infinite, goes to zero so fast that you really don't need to window it *at all*. All the windows distort the shape quite a bit. (technically this is a rectangular window, but the hard cut-off at the edge is invisible)

2. Windows distorting the shape is not necessarily a bad thing! Gauss windowed by Sinc (for example) is actually a very nice filter, somewhat sharper than the true gaussian. Obviously you can get the same thing by playing with the sdev of the gaussian, but if you don't have a parameterized filter you can use the window as a way to adjust the compactness of the filter.

3. Cutting off the ends of a filter is not the same as using a smaller window! For example if you made gauss-blackman-7 you might say "hey the ends are really close to zero, it's dumb to run a filter over an image with zero taps at the end, I'll just use a width of 5!". But gauss-blackman-5 is very different! Making the window range smaller changes the shape.

4. If you want the original filter shape to not be damaged too much, you have to use wider windows for sharper window functions. Even blackman at a half-width of 13 (full width of 27) is distorting the gaussian a lot.


04-02-11 - House rambling

There are a lot of short sales and a few foreclosures popping up in the area. (one of the houses we stopped by had a lovely sign saying "this house is the property of xx bank and if you enter you will be prosecuted"). Some of the short sales look like good deals, but everything I've read says to stay the fuck away from them. Apparently what's happening is neither the owner nor the banks really want to sell their short sales. While the short sale is listed, the owner can live their for free, so they don't actually want to sell. And, the banks would have to write down a loss if the sale went through, so they want to delay it to keep it off their books.

So, no short sales for me I think. Foreclosures might be okay though, dunno. I have seen a few places that are cheap enough that I believe you could buy them and immediately have positive cash flow from rent, which is where you want to be with real estate investing. (mainly places that have "Mother in Law" units because you get two rents; I can't believe how much those things rent for; the one in my house is rented out for $850 and it's an absolute hole). (you wouldn't actually be cash+ because that's not counting property tax or upkeep or months when it's not rented, etc, etc, but then it's also not counting appreciation; in any case rent = mortgage is a rough sign of price sanity, or maybe just a sign that rents are awfully high around here).

There are houses in the Beacon Hill area that are under $200k. I could buy that outright and have no mortgage! My depression-era financial mentality finds that rather appealing, but it's really bad investment. One of the biggest advantages that houses have as an investment vehicle is leverage (the other of course is tax benefits) (and another is that the market for houses is actually propped up by the government, they actually subsidize your risk).

My understanding on fees is that the 6% is usually locked into the seller's contract, then they give 3% to the buyer's agent. This is an obviously illegal way of making you take a buyer's agent.

One option is to use Redfin, which gives you back half their fee (1.5% back). On a 500k house that's 7500, so it's not trivial, but maybe a good buyer's agent would save you 7500 worth of time and haggling and fixes on the house, so it's not a clear win. I've read good and bad things about buying through Redfin. It seems like sort of a shoddy operation, but I do like doing things online and I hate talking to humans, so that aspect is right up my alley.

The advantage of Redfin is that they will at least make appointments for seeing houses and take you around, so you don't have to call up agents. There's another one here called "$500 Realty" which gives you back 75% of the 3%, but they literally do nothing, you have to do all the touring yourself.

On top of the 6% real estate agent fee you get various closing costs which I understand are 3-5% typically. WTF WTF. 10% in fees on this transaction. This is such a fucking scam.

And I was thinking about mortgages a bit. Obviously fixed rate is the way to go right now to lock in the low interest rates. I'm kind of tempted by 15-year mortgages because I like the idea of paying it off (depression era youth, there you are again!), but I'm pretty sure the 30 year is actually a better deal. For one thing, the spread between 15 and 30 is very small right now; apparently sometimes the spread is much bigger and that makes more of a case for the 15. If you look at the total amount of payment difference, it looks like a big difference (almost twice as much), but that's a lie. You have to account for inflation - the dollars paid after the first 15 years are worth *way* less, and you have to account for investment income on all the dollars you didn't put towards the mortgage, and you have to count the larger tax deduction with the 30 year, and I believe the result of all that is a decent positive on the side of the 30 year. One thing I spotted that I wasn't aware of is pre-payment penalties. WTF, you fucking scammers. So, have to watch out for that.

I dunno.


04-01-11 - Pots 1

This is sort of a test of putting some images in my blog. I'm not sure how the cbloom.com bandwidth will handle it...

I've been doing pottery classes for the last number of weeks. It's quite pleasant. We're finally getting some finished pieces out of the kiln so I'll post some pics and notes.

It's very good meditation for me. You sit with the clay and stare into it; you have to really relax and go slow, because pots don't like to be forced around. I like to watch a point on the pot as it goes around, which gets you into a rocking rhythm with the wheel.

I really like the way any one pot is no big deal. There are a million ways you can ruin a pot, and they can happen at any time (surprise air bubbles, off centering, dry spots in throwing, cutting through the bottom trimming, S cracks, etc etc etc). So you can't get too attached to any one pot, you have to be ready to just throw it away and not care. At first I thought that was stressful because I cared too much about each pot, but once you change your viewpoint and let go, it's actually really relaxing and liberating. If you screw up a throw, no biggie, do another, you can experiment with new techniques and get it wrong many times, no biggie.

For one thing, it's important to get out of the house. If you live with someone, it's extremely inconsiderate to be home all the time. Everybody deserves some alone time in their own house, and it's just tacky to impose your presence on them constantly. It's especially good to have a weekly scheduled time when you will be gone, so that they can count on it and look forward to it, so that they can crossdress and eat a gallon of ice cream, or whatever it is they want to do. I know that I am generally pretty bad about being home too much, and I feel bad about that, but I just hate the outside world (when it's gray and wet and cold), so this is my way to get out a bit.

These are the first pieces of crap I made :

You have to make a bunch of crap and get it in the pipeline. Pottery is a lot like a deeply pipelined CPU - the latency from grabbing a hunk of clay to getting a finished pot might take a month, but your throughput could be very high; you can easily throw 10 pots an hour, but not see them come out for a month. So in the beginning my goal was just to get some stuff in the pipeline.

Pro tip : centering : if you are even slightly off center it will be annoying to deal with. The things that really help me are : 1. putting my elbow in my groin, so that the forward pressure is braced against my whole body, and 2. when you release the pressure make sure it's very gradual, and equal with both your hands - you can get something perfectly centered under pressure, and then you release unevenly and it's back off center.

Pro tip : wedging : don't fold the clay over itself, you can create air bubbles; it's not really aggressive like kneading, it's more a gentle motion like massaging. Coning up and down can substitude for good wedging.

Pro tip : fixing to the wheel ; a slightly convex bottom of your lump is better than flat or concave when you slam it down, because it ensures the middle touches. You can press the edge onto the wheel before centering to help it stick.

This is my first bowl :

It's not bad, thought I was messing around with different glazes too much and that just made everything look really messy. I need to glaze simpler and cleaner. I also tend to not leave enough foot on things, which is okay, but it's just a pain in the ass to dip things if you don't have a good foot to grab. A lot of the things you're supposed to do in throwing are really just to make life easier on yourself later; eg. you can always fix things in trimming, but trimming is a major pain in the ass, so it's just easier if you throw it as close to the right form as possible to begin with.

Pro tip : bowls should not get too wide too soon, or they can get weak and fall. Basically you want to throw an upside down cone shape first, and then you can round out the bottom as the last thing you do.

Pro tip : pulling is pretty basic; some notes to self that I forget some times. The most important thing first is that the pot is evenly wet, if there are dry spots they will catch during the pull and fuck the pot. Go slow and steady, don't stop, if something unexpected happens along the way, just keep moving slow and steady. Try to get your eyes directly over the pull and look straight down , if your head is off to one side you won't pull straight. Don't rest your elbows on your body, it makes you pull an arc, get them out in space. Make sure the two hands are well connected. Do not apply much pressure, you aren't squeezing hard, just gentle pressure and slide up. One finger tip should be slightly higher than the other, if the outside hand is lower you can narrow the pot, if the outside hand is higher you can widen the pot.

I like to open "Simon Leach style" against the whole flat of my hand and do the first rough pull that way. One trick I've seen for wetting is potters hold their sponge in the heel of their hand while they pull with their fingertips, so if they hit dry they can squeeze the sponge right there as they go. You can also pull directly against a sponge.

Some little tea bowl type things :

My first attempts at vase shapes :

Necking down to narrow is quite difficult. If the clay gets too wet or too thin, it will ribbon as you neck like the bottom pot. I've learned two different techniques for necking, one is the "six points of contact" technique which is very slow and delicate, the other is to just grab the clay with both hands like you're trying to strangle it and brute force it. The main thing is that it's much easier to widen than it is to narrow, so you want to do your initial open and pull narrow, and keep it narrow at the top the whole time, don't throw out and try to bring it back in.

04-01-11 - Dirty Coder Tricks

A friend reminded me recently that one of the dirtiest tricks you can pull as a coder is simply to code something up and get it working.

Say you're in a meeting talking about the tech for your game, and you propose doing a new realtime radiosity lighting engine (for example). Someone else says "too risky, it will take too much time, etc. let's go with a technique we know".

So you go home and in your night hours you code up a prototype. The next day you find a boss and go "look at this!". You show off some sexy realtime radiosity demo and propose now it should be used since "it's already done".

Dirty, dirty coder. The problem is that it's very hard for the other person to win the argument that it's too hard to code once you have a prototype working. But in reality a sophisticated software engineer should know that a demo prototype proves nothing about whether the code is a sensible thing to put into a big software product like a game. It's maybe 1% of the total time you will spend on that feature, and it doesn't even prove that it works, because it can easily work in the isolation of a demo but not be feasible in real usage.

I used to pull this trick all the time, and usually got away with it. I would be in a meeting with my lead and the CTO, the lead would not want some crazy feature I was pushing, and then I would spring it on him that it was "already working" and I'd show a demo to the CTO and there you go, the feature is in, and the lead is quietly smoldering with futile anger. (the most dickish possible coder move (and my favorite in my youth) is to spring the "it's already done, let me show you" as a surprise in the green light meeting).

But I also realize now that my leads had a pretty good counter-play to my dirty trick. They would make me put the risky feature they didn't like in an optional DLL or something like that so it wasn't a crucial part of the normal production pipeline, that way my foolishness didn't impact lots of other people, and the magnified workload mainly fell only on me. Furthermore, optional bits tend to decay and fall off like frostbitten toes every time you have to change compiler or platform for directx version. In that way, the core codebase would clean itself of the unnecessary complicated features.

old rants