Cannon Fire


I spend most of my life feeling like I've been shot out of a cannon

I spend most of my life feeling like I've been shot out of a cannon. - Molly Ivins

The Kalman Filter


I have been using this blog to post techniques that I have been trying to learn. I hope eventually to apply these to more serious data than the examples I have been  using. I'm going to continue and take a look at a very simple application of a Kalman filter. The Kalman filter allows us to use noisy measurements of a dynamic process to make reasonably accurate predictions of the process's state. Kalman filters are important for guidance, navigation, and control of vehicles and robots, aircraft and spacecraft. It is also widely used in time series analysis, signal processing, and econometrics.

If you Google Kalman filter, you will find many web pages describing the method in varying levels of  mathematical difficulty. My favorite source for Kalman filter details is Kalman and Bayesian Filters in Python by Roger Labbe. The book's website provides an Juypter notebook and supporting software. 

 I'm not going to go into the derivation of the filter. Instead I'm going to look at a simple application, the motion of a projectile launched with an initial velocity and negligible air resistance.

Kalman filters rely on measurements taken at repeated periods. They are recursive in the sense that the prediction of the future state depends on the present state and the measurement. The state represents the quantities that are important for the dynamics of the system. In a projectile example, these quantities could be things like position, velocity, and acceleration. The filter also includes control information such as rudder position. The state is represented as a vector of real numbers. At each  time increment, a linear operator is applied to the state to generate the new state. The prediction of the new state includes noise, and optionally some information from the controls on the system. Then, another linear operator mixed with more noise generates the observed outputs from the true state. To build the filter requires:
  • A mathematical model of the system.
  • An estimate of the initial state of the system.
  • An estimate of the error inherent in the model.
  • Estimates of the process and measurement error.
At each step, the filter must be provided with a vector containing the control state and a vector of measurements. The filter calculations generate a current estimate of the state of the  system and an estimate of the overall error of the system.

The Kalman Filter Equations

The Kalman filter consists of two stages, a prediction step and an update step. The fitter iterates between these steps. In essence, the Kalman filter is a predictor-corrector model. The prediction step estimates a new state vector and the system noise covariance based on the current state, covariance and control input. The update step takes system measurements and the predictions and uses them to calculate a corrected state and covariance.

Prediction

The prediction step predicts the state and covariance, i.e. how much error, at each time step. The filter model assumes the true state at time k is evolved from the state at (k − 1) according to
\[{{\mathbf{x}}_k} = {\mathbf{F}}{{\mathbf{x}}_{k - 1}} + {\mathbf{B}}{{\mathbf{u}}_k}\]

  • ${\mathbf{x}}$ is the state vector.
  • ${\mathbf{u}}$ is called the control vector. It represents control input to the system.
  • ${\mathbf{F}}$ is called the state transition matrix. It is applied to the previous state to obtain an updated state vector.
  • ${\mathbf{B}}$ is the control model. It is applied to the control vector to produce the control effect on the state.
The covariance is updated as

\[{{\mathbf{P}}_k} = {\mathbf{F}}{{\mathbf{P}}_{k - 1}}{{\mathbf{F}}^T} + {\mathbf{Q}}\]

  • ${\mathbf{P}}$ is the covariance matrix. It is the estimate of the error for each part of the system. ${{\mathbf{F}}^T}$ is the transpose of the state transition matrix.
  • ${\mathbf{Q}}$ is the estimated process error covariance. Getting a good model for this matrix can be difficult.

Update

The update step consists of a series of linear equations to produce an updated value for ${\mathbf{x}}$ and ${\mathbf{P}}$.

\[{\mathbf{\tilde y}} = {{\mathbf{z}}_k} - {\mathbf{H}}{{\mathbf{x}}_k}\]
\[{\mathbf{S}} = {\mathbf{H}}{{\mathbf{P}}_k}{{\mathbf{H}}^T} + {\mathbf{R}}\]
\[{\mathbf{K}} = {{\mathbf{P}}_k}{{\mathbf{H}}^T}{{\mathbf{S}}^{ - 1}}\]
\[{{\mathbf{x}}_k} = {{\mathbf{x}}_k} + {\mathbf{K\tilde y}}\]
\[{{\mathbf{P}}_k} = ({\mathbf{I}} - {\mathbf{KH}}){{\mathbf{P}}_k}\]

  • ${\mathbf{z}}$ is the input measurement vector. It contains measurements of the state  components.
  • ${\mathbf{H}}$ is  the observation matrix. It translates the state vector to a measurement vector.
  • ${\mathbf{\tilde y}}$ is called the innovation or residual. It compares the state prediction against the measurement.
  • ${\mathbf{S}}$ is the residual covariance. It compares the real error against the prediction.
  • ${\mathbf{R}}$ is the measurement covariance.
  • ${\mathbf{K}}$ is the Kalman gain. ${{\mathbf{S}}^{ - 1}}$ is the matrix inverse of ${\mathbf{S}}$.
  • ${\mathbf{I}}$ is the identity matrix.
To design a filter, we have to provide matrices ${\mathbf{F}}$, ${\mathbf{B}}$, ${\mathbf{H}}$, and ${\mathbf{R}}$. Like ${\mathbf{Q}}$, finding good estimates for ${\mathbf{R}}$ can be difficult.

A Filter Class

Given all of this linear algebra, it's pretty straight forward to write to write code for a filter. The tricky part is designing the matrices. Here's a Python class for a Kalman filter. It's not especially efficient. If you want efficiency, you can always use CUDA.

import numpy as np

"""
KalmanFilter - a class. This is pretty self-explanatory given some documentation of the Kalman Filter
               Based on https://drive.google.com/file/d/0By_SW19c1BfhSVFzNHc0SjduNzg/view
"""
class KalmanFilter(object):
    def __init__(self, x0, P, F, B, Q, H, R):
        """
        Initialization
        arguments:
        x0 - initial state vector
        
        """
        self._x = x0 # state vector
        self._P = P  # state covariance
        self._F = F  # state transition matrix
        self._B = B  # control matrix
        self._Q = Q  # process covariance
        self._H = H  # observation covariance
        self._R = R  # measurement covariance

    # a couple of getters for state and covariance
    @property
    def x(self):
        return self._x

    @property
    def P(self):
        return self._P

    def predict(self, u):
        """
        predict - predict new state
                  calculates x = Fx + Bu; P = FPF.T + Q
        arguments:
        u - control vector
        """
        self._x = np.dot(self._F, self._x) + np.dot(self._B, u)
        self._P = np.dot(np.dot(self._F, self._P), self._F.T) + self._Q

    def update(self, z):
        """
        update - Kalman update
        arguments:
        z - measurement vector
        """
        y = z - np.dot(self._H, self._x) # residual
        S = np.dot(np.dot(self._H, self._P), self._H.T) + self._R # innovation covariance
        K = np.dot(np.dot(self._P, self._H.T), np.linalg.inv(S))  # Kalman gain
        self._x = self._x + np.dot(K, y) # new state
        self._P = np.dot(np.identity(self._P.shape[0]) - np.dot(K, self._H), self._P) # new state covariance



Fire The Cannon

As a simple example of the use of the Kalman filter, we'll consider an example from elementary physics, simple projectile motion. An object, such as a cannon shell or an arrow, is launched with an initial velocity and travels under the influence of gravity until it hits the ground. We will ignore air resistance, Coriolis effects, and other forces that make this a tough problem in the real world.

Here are the equations of motion for the projectile as recurrence relations. Dt is the time step and g is gravitational acceleration.

\[{x_t} = {x_{t - 1}} + {v_{t - 1}}\Delta t\]
\[{v_{x,t}} = {v_{x,t - 1}}\]
\[{y_t} = {y_{t - 1}} + {v_{y,t - 1}}\Delta t - \tfrac{1}{2}g\Delta {t^2}\]
\[{v_{y,t}} = {v_{y,t - 1}} - g\Delta t\]

The left hand side of these equations is our state vector. The right hand side represents prediction equations transforming the old state and control into the new state. The only control for the projectile is g, the acceleration of gravity.

Rewriting

\[{{\mathbf{x}}_t} = {\mathbf{F}}{{\mathbf{x}}_{t - 1}} + {\mathbf{B}}{{\mathbf{u}}_t}\]

\[{{\mathbf{x}}_t} = \left[ {\begin{array}{*{20}{c}}
  1&{\Delta t}&0&0 \\
  0&1&0&0 \\
  0&0&1&{\Delta t} \\
  0&0&0&1
\end{array}} \right]{{\mathbf{x}}_{t - 1}} + \left[ {\begin{array}{*{20}{c}}
  1&0&0&0 \\
  0&1&0&0 \\
  0&0&1&0 \\
  0&0&0&1
\end{array}} \right]\left[ {\begin{array}{*{20}{c}}
  0 \\
  0 \\
  { - \tfrac{1}{2}g\Delta {t^2}} \\
  { - g\Delta t}
\end{array}} \right]\]

We can see that

\[{\mathbf{x}} = \left[ {\begin{array}{*{20}{c}}
  x \\
  {{v_x}} \\
  y \\
  {{v_y}}
\end{array}} \right]\]

\[{\mathbf{F}} = \left[ {\begin{array}{*{20}{c}}
  1&{\Delta t}&0&0 \\
  0&1&0&0 \\
  0&0&1&{\Delta t} \\
  0&0&0&1
\end{array}} \right]\]

\[{\mathbf{B}} = \left[ {\begin{array}{*{20}{c}}
  1&0&0&0 \\
  0&1&0&0 \\
  0&0&1&0 \\
  0&0&0&1
\end{array}} \right]\]

The control vector is a constant.

\[u = \left[ {\begin{array}{*{20}{c}}
  0 \\
  0 \\
  { - \tfrac{1}{2}g\Delta {t^2}} \\
  { - g\Delta t}
\end{array}} \right]\]

We need to supply a few other matrices to the filter.

${\mathbf{Q}}$ is the estimated process error covariance. Since the process is just elementary physics, we will set ${\mathbf{Q}}$ to a 4X4 matrix of zeros. ${\mathbf{H}}$  translates the state vector to a measurement vector. We assume our measurements are on the same scale as the projectile data, so ${\mathbf{H}}$ will be an identity matrix.

${\mathbf{R}}$, the measurement covariance. We will assume that the variance is the same for each element of the vector and there is no covariance. That is, the measurement of velocity doesn't affect the position measurement.. This probably a bad assumption, but for the purposes of example, it should be OK.

\[{\mathbf{R}} = \left[ {\begin{array}{*{20}{c}}
  {{\sigma ^2}}&0&0&0 \\
  0&{{\sigma ^2}}&0&0 \\
  0&0&{{\sigma ^2}}&0 \\
  0&0&0&{{\sigma ^2}}
\end{array}} \right]\]

To get started, we need an initial state vector and covariance. The initial state is

\[{{\mathbf{x}}_0} = \left[ {\begin{array}{*{20}{c}}
  {{x_0}} \\
  {\left| v \right|\cos (\theta )} \\
  {{y_0}} \\
  {\left| v \right|\sin (\theta )}
\end{array}} \right]\]

${\theta}$ is the initial launch angle and ${\left| v \right|}$ is the launch speed.

With all of this in mind, we can set up the Kalman filter.

def init_projectile(initial_position, initial_speed, initial_angle, measurement_cov, dt):
    """
    init_projectile - get the cannon ready to fire
    arguments:
    initial_position - tuple (initial x, initial y) positions
    initial_speed - launch speed in m/sec
    initial_angle - launch angle in degrees
    measurment_cov - measurement covarinace. We are assuming this doesn't change during flight
    dt - measurement time step
    returns:
    kf - a Kalman Filter object
    F - state transition matrix
    B - control matrix
    Note: see https://drive.google.com/file/d/0By_SW19c1BfhSVFzNHc0SjduNzg/view to get an
          idea of what all these matrices mean
          state vector is np.array([x_pos, vx, y_pos, vy]).T
    """
    # set up initial state vector
    x = np.array([[initial_position[0], initial_speed * np.cos(np.radians(initial_angle)),
                   initial_position[1], initial_speed * np.sin(np.radians(initial_angle))]]).T
    
    # state transition matrix
    F = np.identity(4)
    F[0,1]=dt
    F[2,3]=dt

    # control matrix
    B = np.zeros((4,4))
    B[2,2] = 1
    B[3,3] = 1

    # measurement covarinace
    R = measurement_cov  # measurement covarinace
    H = np.identity(4)   # observation matrix Hx is a measurement vector
    P = np.identity(4)   # initial state covariance
    Q = np.zeros((4,4))  # error covariance - we're assuming it's null

    kf = KalmanFilter(x, P, F, B, Q, H, R)  # create the filter

    return kf, F, B


Let's try out the  filter and see how well the fitter matches the trajectory.

Here's the Newton's law version.

def update_flight(F, B, u, x):
    """
    update_flight - update the state for a flight governed by Netwon's laws and no air
                             resistance.
                             Calculates x_new = Fx_old + Bu
    arguments:
    see filter.py
    returns:
    the new state vector. In this case a (4,1) numpy array
    """
    return np.dot(F, x) + np.dot(B, u)
    

Here's the body of the projectile simulation.

def main():
    num_obs = 200
    dt = 0.1
    measurement_cov = np.identity(4) * 20
    real_initial_pos = (0, 0)
    test_initial_pos = (0, 0)
    initial_speed = 100
    initial_angle = 45

    # initialize
    kf, F, B = init_projectile(test_initial_pos, initial_speed, initial_angle, measurement_cov, dt)

    # control - the only control is gravity
    g = -9.81
    u = np.array([[0, 0, (1/2) * g * dt**2, g * dt]]).T

    x = kf.x  # inital state

    # some lists for plotting
    x_flight = [np.array([[real_initial_pos[0], initial_speed * np.cos(np.radians(initial_angle)),
                           real_initial_pos[1], initial_speed * np.sin(np.radians(initial_angle))]]).T]  
    z_list = [np.random.multivariate_normal(x_flight[-1].flatten(), measurement_cov)]
    x_list = [kf.x]
    
    while True:
        # update flight and measurement. Assume flight is perfectly governed by Newton's laws
        x_f = update_flight(F, B, u, x_flight[-1])
        if x_f[2,0] <= 0:  # don't go underground
            break;
        z = np.array([np.random.multivariate_normal(x_f.flatten(), measurement_cov)]).T
        x_flight.append(x_f)
        z_list.append(z)

        # the real work - predict new state and then update prediction based on measurement
        kf.predict(u)
        kf.update(z)
        
        x_list.append(kf.x)

    # extract some saved data for plotting
    flight_xs = [x_flight[i][0][0] for i in range(len(x_flight))]
    flight_ys = [x_flight[i][2][0] for i in range(len(x_flight))]
    z_xs = [z_list[i][0] for i in range(len(z_list))]
    z_ys = [z_list[i][2] for i in range(len(z_list))]
    xs = [x_list[i][0][0] for i in range(len(x_list))]
    ys = [x_list[i][2][0] for i in range(len(x_list))]

    # plot it
    fig = plt.figure()
    ax = fig.add_subplot(1,1,1)
    ax.plot(flight_xs, flight_ys, label='Flight')
    ax.plot(z_xs, z_ys, 'ro', label='Measurements')
    ax.plot(xs, ys, label='Fit')
    plt.xlabel('x')
    plt.ylabel('y')
    ax.legend()
    plt.show()


The results are plotted below. The fit from the Kalman filter follows the trajectory closely.


What happens if we increase the measurement error? In the the plot above, the measurement error standard deviation was ${{\sigma^2}  = 20}$. ]The scatter in the measurements looks relatively small on the scale of the plot. What happens if we increase the measurement error.

With ${{\sigma^2}  = 200}$, the fit from the filter is still pretty reasonable given the high measurement error and large step time step size.


If we increase the measurement step size, the fit gets better, but the plot gets so crowded with data points that it's hard to see the fit.

The code is available here.

No comments:

Post a Comment