If you have any different insights on these topics, please share them. Maybe some day some of these ideas will make it to the big leagues and become pro-verbs. I also have a blog for technical computery stuff - zachstechnotes.blogspot.com.

Wednesday, January 16, 2013

Kalman Filter Lessons (Bayes' Rule in Your Life)

The Kalman Filter is one of the most elegant and beautiful devices used in control systems engineering and is among my favorites in the list of remarkable things that I have stumbled upon in my journey to becoming an engineer. Not only is it a useful tool, but I think that we can learn something about how to think from it. So what exactly is a Kalman Filter? "Kalman" is simply the name of the guy who first put it all together, but perhaps "filter" is not the best word to describe it to someone who hasn't taken an electrical engineering class and eaten the fruit of the Tree of Knowledge of Differential Equations.

Instead, I would call it more of an "information manager". It is a set of mathematical equations that takes in pieces of information that are uncertain and combines them into a single estimate of what is really going on. If some specific conditions are satisfied (chiefly linear dynamics and Gaussian noise, which, of course, are never quite satisfied in real life), this estimate contains all of the information that we could possibly know about the thing we are observing; it is the best estimate that we could have. The Kalman filter is built on Bayes' Rule, the statistics tool that has been hyped so much in the nerd community because it helped blogger Nate Silver to correctly predict the electoral outcome in all 50 states in the 2012 presidential election.

So how does the Kalman filter work? I like to describe it using the metaphor of a punt returner catching a punt in a football game. When he sees the ball being kicked, he has some initial idea of how far away it is and how fast it comes off of the punter's foot. As he watches the ball, though, all he can really measure with his eyes is the angle to the ball; from so far away, he can't accurately judge the distance to the ball through vision alone. Furthermore, he cannot keep his eyes on the ball the whole time. Instead he must look down to check to see how close the coverage is. But even with these challenges, the player is able to make the catch. The key is that he also knows the physics of the system. He has practiced enough to know how the ball will behave even if his observation is limited. What is going on in his brain is very similar to what a Kalman filter does. It combines knowledge of the physics of the system with observed measurements to get the best idea of what is actually going on.

Another way to think of the Kalman filter is as a device to remove sensor noise. Let's look at an example. Say that you're trying to track the angle of a swinging pendulum. Why would you be doing this? Maybe you're trying to lower a robot to the surface of another planet; maybe you're just contriving an example that's simple enough to work out in an afternoon. Either way, the math is the same. Here is the motion of our pendulum (blue), along with the measurements that our imaginary angle sensor makes (red dots).

Our measurements are clearly very noisy (apparently the managers decided we needed to save money by buying bad sensors), and since our sensors are the only way that we can observe the system, all we have to work with is this:

This looks like a dire situation. The human eye can't even really tell that there is a sine wave in there somewhere. If we don't use any filtering, and simply string all of these sensor measurements together, we get this estimate (in red) of what the pendulum is doing.

This estimate of the pendulum's behaviour is appalling. It doesn't make any sense for the pendulum to swing like this, and we cannot trust our estimates at all. But, before we despair, we remember that we know what the physics of the pendulum should be, and thus we can write a Kalman filter. If we apply our new filter to these measurements, voilà! The new estimate matches the actual system response almost exactly:

I first learned about the Kalman Filter at the beginning of the summer of 2011 when I was working with one of my mentors, Dr. Suman Chakravorty, at the Air Force Research Lab in Albuquerque. I triumphantly presented him with a graph like the one above, immensely satisfied with my ability to vanquish sensor noise. He then proceeded to graciously explain to me that I was missing the main point.

The Kalman Filter does not just give us a single estimate of what the system is doing; it gives us an entire probability distribution of where the system might be. With a Kalman Filter, we not only know what the pendulum angle most likely is at a given time, but we also know how far away from that estimate it plausibly could be. The graph below shows the 1σ bounds of our estimate probability distribution. If the Kalman Filter is correctly programmed, the actual pendulum angle has about a 68% chance of being between the 1σ boundaries*. In the picture below, notice how the 1σ boundaries start far away from each other, but gradually come together, showing the Kalman Filter's accumulation of information; as more measurements are received, the filter becomes more certain about the angle. At the beginning of the simulation, the filter believes that the angle could plausibly be anywhere from -8 to 46 degrees, while at the end, it expects the angle to be between 25 and 35 degrees, a much narrower range.
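For readers who want to see the shrinking 1σ band in code, here is a minimal sketch in plain Python. It is not the code behind the plots in this post: it assumes a small-angle (linear) pendulum, an angle-only sensor, and made-up values for the pendulum length, noise levels, and initial uncertainty. The names `kalman_pendulum`, `simulate`, `q_var`, and `r_var` are purely illustrative.

```python
import math
import random

def kalman_pendulum(measurements, dt, omega_n, r_var, q_var, x0, p0):
    """Linear Kalman filter for a small-angle pendulum.

    Returns the angle estimates and their 1-sigma uncertainties (radians).
    """
    c, s = math.cos(omega_n * dt), math.sin(omega_n * dt)
    # Exact discrete-time transition matrix F of a harmonic oscillator.
    f11, f12, f21, f22 = c, s / omega_n, -omega_n * s, c
    th, om = x0              # state estimate: angle, angular rate
    p11, p12, p22 = p0       # entries of the symmetric 2x2 covariance P
    angles, sigmas = [], []
    for z in measurements:
        # Predict: x = F x, P = F P F^T + Q (Q is assumed to act on the rate only).
        th, om = f11 * th + f12 * om, f21 * th + f22 * om
        a11, a12 = f11 * p11 + f12 * p12, f11 * p12 + f12 * p22
        a21, a22 = f21 * p11 + f22 * p12, f21 * p12 + f22 * p22
        p11 = a11 * f11 + a12 * f12
        p12 = a11 * f21 + a12 * f22
        p22 = a21 * f21 + a22 * f22 + q_var
        # Update with an angle-only measurement (H = [1, 0]).
        innovation_var = p11 + r_var
        k1, k2 = p11 / innovation_var, p12 / innovation_var
        residual = z - th
        th, om = th + k1 * residual, om + k2 * residual
        p11, p12, p22 = (1 - k1) * p11, (1 - k1) * p12, p22 - k2 * p12
        angles.append(th)
        sigmas.append(math.sqrt(p11))
    return angles, sigmas

def simulate(n, dt, omega_n, amplitude, noise_std, seed=0):
    """Small-angle truth plus Gaussian sensor noise (illustrative values only)."""
    rng = random.Random(seed)
    truth = [amplitude * math.cos(omega_n * (k + 1) * dt) for k in range(n)]
    meas = [th + rng.gauss(0.0, noise_std) for th in truth]
    return truth, meas

def rms(a, b):
    """Root-mean-square difference between two equal-length sequences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)) / len(a))

omega_n = math.sqrt(9.81 / 1.0)   # assumed 1 m pendulum
truth, meas = simulate(200, 0.05, omega_n, amplitude=0.3, noise_std=0.2, seed=1)
angles, sigmas = kalman_pendulum(
    meas, 0.05, omega_n, r_var=0.2 ** 2, q_var=1e-6,
    x0=(0.0, 0.0), p0=(0.25, 0.0, 1.0),
)
print("raw sensor RMS error:", rms(meas, truth))
print("filtered RMS error:  ", rms(angles, truth))
print("1-sigma at start/end:", sigmas[0], sigmas[-1])
```

The list `sigmas` plays the role of the 1σ boundaries: plotting `angles[k] ± sigmas[k]` should give a band that starts wide and tightens as measurements accumulate, under the assumptions stated above.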

By understanding that the Kalman Filter's estimate is actually a probability distribution, we are getting close to the fundamental enlightenment that the filter can give us about how to weigh ideas. The key that makes the Kalman Filter work so well is that it uses the uncertainty in its current estimate to make a decision about how much it should trust new measurements that it receives.
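To make that weighting concrete, here is a small sketch of a single update for a one-dimensional quantity. The function name `fuse` and the numbers are my own illustration; the formula is the standard scalar form of the Kalman (Bayes) update for Gaussian beliefs.

```python
def fuse(prior_mean, prior_var, z, meas_var):
    """One scalar Bayes/Kalman update: blend a prior belief with a measurement."""
    gain = prior_var / (prior_var + meas_var)   # 0 = ignore z, 1 = believe z fully
    mean = prior_mean + gain * (z - prior_mean)
    var = (1.0 - gain) * prior_var              # an update never adds uncertainty
    return mean, var, gain

# Same measurement, two different levels of prior confidence.
print(fuse(10.0, 4.0, 12.0, 4.0))    # → (11.0, 2.0, 0.5): equally unsure, meet halfway
print(fuse(10.0, 0.01, 12.0, 4.0))   # confident prior: the estimate barely moves
```

The gain is just the fraction of the total uncertainty that belongs to the current estimate, so an uncertain estimate defers to the measurement and a confident one mostly holds its ground.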

In a sense, the Kalman Filter is constantly optimally adjusting its open-mindedness to new measurements, and I submit that this is how people should handle new ideas. The level of open-mindedness in the Kalman Filter is based on the accuracy of the physics model and the reliability of the measurements. If the observer has a very accurate model of the physics, and has received many measurements in the past, it will not be led astray by new inaccurate measurements as would be the case in the absence of a filter. If, on the other hand, the observer has few previous measurements to work with, and doesn't have a great physics model, it is open to the innovation brought by new measurements. If the physics model isn't great, and the measurements are unreliable, of course, the observer simply cannot make a good judgment about the state of the system.
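The three cases above can be illustrated with a toy experiment. This sketch assumes the simplest possible model, a scalar random walk, where `q_var` stands in for how inaccurate the physics model is and `r_var` for how unreliable the measurements are. Both names and all of the values are hypothetical.

```python
def steady_state_gain(q_var, r_var, steps=500):
    """Iterate a scalar random-walk Kalman filter until its gain settles."""
    p = 1.0        # arbitrary starting variance; the limit does not depend on it
    gain = 0.0
    for _ in range(steps):
        p += q_var                 # predict: model imperfection adds uncertainty
        gain = p / (p + r_var)     # weight given to the next measurement
        p *= (1.0 - gain)          # update: the measurement removes uncertainty
    return gain

# Good model + noisy sensor, evenly matched, poor model + precise sensor.
for q_var, r_var in [(1e-4, 1.0), (1.0, 1.0), (1.0, 1e-4)]:
    print(f"q={q_var:g} r={r_var:g} gain={steady_state_gain(q_var, r_var):.3f}")
```

A gain near 0 is the "closed-minded" filter that trusts its model and history; a gain near 1 is the "open-minded" filter that follows each new measurement. What matters in this toy model is the ratio of the two variances, not their absolute sizes.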

I am not saying that we should use some sort of cold, hard mathematics to make decisions. On the contrary, it is usually quite impossible to make any quantitative statements about the decisions that we face in everyday life. Instead, I am saying that when we are told new things or are trying to discern what is right, we should make a conscious effort to determine the degree of reliability of the premises we hold and observations that we take in.

It seems to me that people tend to fall into one of two extreme categories: either they are tossed to and fro by every new idea that they hear (this is the side that I tend to err on) or they are almost always immovably set in the ideas that they hold. Bayes' rule and the ideas behind the Kalman Filter suggest that the "inertia" of our ideas should be dynamic and should have differing magnitudes with regard to different facets of life based on the certainty of knowledge we have in a particular area.

We are all becoming experts in certain fields. By "field" I mean any related set of activities that require specialized knowledge. A field could be anything from psychology to fishing to interacting with a specific family member. After becoming experts in a field, we can be confident that our ideas are correct and can safely reject conflicting ideas (assuming that the foundational facts haven't changed since we developed the ideas). However, before we become experts, we should be open to new ideas because our body of knowledge is still in its infancy.

People are too often afraid to say that they don't know something. In the extremely complex world we live in, there are a great many processes that are incomprehensible to our brains. Our minds are optimized for dealing with things on human scales: they are adept at conceptualizing things that take place over times from 1 second to 10 years, over distances from 1 millimeter to 5 kilometers, in groups of 1 to 500, and so on. They don't handle bigger or smaller processes well, and we can only begin to grasp such processes after years of absorbing abstract information and undergoing extensive training. Furthermore, since our brains make so many approximations and use so many heuristics, they are not very effective at understanding things based purely on reason and the testimony of others rather than direct experience. Thus, it is rare for someone to be able to fully comprehend something without having personally experienced it.

For these reasons, I think that we should admit that we simply can't be sure of the answers to most of the questions that we ask in our lives. This doesn't mean that we are unable to make decisions. Decisions don't wait to be made until we know enough. Sometimes, if no more information is available, we have to go with our best estimate of what is right even if we are not certain about it. This is completely reasonable. However, some people will make a further jump that is not reasonable. They will assume that since they have taken a course of action or made a statement, they are bound to believe that the course of action or statement was right. "I acted on this idea, so it must be right; otherwise I am a fool," they say in their minds. I submit that a decision should not cause any change in our level of certainty. Instead, we should be candid with ourselves about our doubtful ideas and seek real evidence to determine whether our decisions were justifiable.

In a sense, a human's journey through life can be thought of as a Kalman Filter or Bayesian inference process (though it is certainly also much more than that). When we start out, we know essentially nothing about the world. As we make our way through our time on earth, we gradually accumulate information, tying ideas together and examining their consistency. This accumulation of knowledge is a collective effort by all of humanity, so we must not let pride or other forces inflate our certainty in our ideas. Instead, we should approach new fields in humility with an open mind in light of our lack of knowledge, but maintain with confidence the conclusions that we have come to based on a wealth of sturdy evidence, and treat those who have ideas that oppose ours with civility and understanding whenever possible.

* This number is not strictly correct for this example problem because the pendulum is slightly nonlinear and thus the Kalman Filter is not perfectly tuned, but it is close enough for illustrative purposes.