Local Trend Kalman Filter


Don't follow trends, start trends.
Frank Capra

I have to tell you about the Kalman filter, because what it does is pretty damn amazing.
Tim Babb

In a previous post, I looked at an application of a Kalman filter for simple projectile motion. In this post, I'll explore a different use of the Kalman filter, analyzing the trend in time series data. A trend is a long-term increase, decrease, or stationary period in the data. It may be linear or nonlinear.

Consider this data from the NASA Global Climate Change website.


The points represent the change in global surface temperature relative to 1951-1980 average temperatures. Seventeen of the 18 warmest years in the 136-year record have occurred since 2001; the one exception is 1998. As you can see, there is a definite upward trend in the data that accelerates somewhere around 1980.

The Local Trend Model


The local trend Kalman filter comes from Ruey S. Tsay's book Analysis of Financial Time Series. It is also available online as lecture notes for Tsay's Business 4191 class at the University of Chicago Booth School of Business.

We will model the time series data $\{ {y_t}\}$ with a simple local trend model:

\[{y_t} = {\mu _t} + {\varepsilon _t}\]
\[{\mu _{t+1}} = {\mu _t} + {\eta _t}\]

${\varepsilon _t} \sim N(0,\sigma _\varepsilon ^2)$ and ${\eta _t} \sim N(0,\sigma _\eta ^2)$ are Gaussian white noise. ${\mu _t}$ is the local mean or trend of the data. We can observe the data $\{{y_t}\}$, but not ${\mu _t}$. We're going to assume the noise variances $\sigma _\varepsilon ^2$ and $\sigma _\eta ^2$ are constant, but they don't have to be.

This model says that the data is described by random variations around a trend. One way to think about the data shown above is that there is some continuous process, global climate change, represented by ${\mu _t}$, that emits noisy observations. There are good models for climate change. We could use a climate model ${\mu _t} = f(t,\theta )$, where ${\theta}$ represents a whole bunch of parameters and forcing functions, and fit that model to the data. Unfortunately, climate models are complicated. Instead, we'll keep things simple and use statistical inference and a Kalman filter to model the trend.
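To get a feel for the model, here is a minimal sketch that simulates a series from these two equations. The series length and noise levels are arbitrary illustrative choices, not values from the climate data.

import numpy as np

# simulate y_t = mu_t + eps_t with a random-walk trend mu_{t+1} = mu_t + eta_t
rng = np.random.default_rng(0)
n = 200
sigma_eps = 1.0   # observation noise standard deviation (illustrative)
sigma_eta = 0.1   # trend noise standard deviation (illustrative)

mu = np.zeros(n)
y = np.zeros(n)
for t in range(n):
    y[t] = mu[t] + rng.normal(0.0, sigma_eps)           # observed value
    if t + 1 < n:
        mu[t + 1] = mu[t] + rng.normal(0.0, sigma_eta)  # trend random walk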

I won't go through the full derivation of the local trend Kalman filter. It can be found in Tsay's book and lecture notes, and in Durbin and Koopman's Time Series Analysis by State Space Methods. However, we do need to define a few terms.

${\mu _{t|t - 1}} = E({\mu _t}|{y_1}...{y_{t - 1}})$ is the expected value of ${\mu_t}$ given the data from the beginning of the series up to the preceding data value. Similarly, ${y_{t|t - 1}} = E({y_t}|{y_1}...{y_{t - 1}})$ is the expected value of ${y_t}$ given the preceding data. What value do we expect for ${y_t}$? Since ${\varepsilon _t}$ has zero mean, we expect it to be ${\mu _t}$, so ${y_{t|t - 1}} = {\mu_{t|t-1}}$.

We will define some more quantities.
\[{v_t} = {y_t} - {\mu _{t|t - 1}}\]
\[{\Sigma _{t|t - 1}} = \operatorname{var} ({\mu _t} - {\mu _{t|t - 1}}|{y_1}...{y_{t - 1}})\]
\[{V_t} = \operatorname{var} ({v_t}|{y_1}...{y_{t - 1}}) = \operatorname{var} ({\mu _t} + {\varepsilon _t} - {\mu _{t|t - 1}}|{y_1}...{y_{t - 1}}) = {\Sigma _{t|t - 1}} + \sigma _\varepsilon ^2\]

With these terms, the joint distribution of ${\mu _t}$ and ${v_t}$, given the data up to time $t - 1$, is

\[\left[ {\begin{array}{*{20}{c}} {{\mu _t}}\\ {{v_t}} \end{array}} \right] \sim N\left( {\left[ {\begin{array}{*{20}{c}} {{\mu _{t|t - 1}}}\\ 0 \end{array}} \right],\left[ {\begin{array}{*{20}{c}} {{\Sigma _{t|t - 1}}}&{{\Sigma _{t|t - 1}}}\\ {{\Sigma _{t|t - 1}}}&{{V_t}} \end{array}} \right]} \right)\]
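The filter equations in the next section drop out of the standard formula for conditioning in a bivariate normal: once ${y_t}$ (and hence ${v_t}$) is observed,

\[{\mu _{t|t}} = E({\mu _t}|{y_1}...{y_t}) = {\mu _{t|t - 1}} + \frac{{{\Sigma _{t|t - 1}}}}{{{V_t}}}{v_t}\]
\[{\Sigma _{t|t}} = \operatorname{var} ({\mu _t}|{y_1}...{y_t}) = {\Sigma _{t|t - 1}} - \frac{{\Sigma _{t|t - 1}^2}}{{{V_t}}} = {\Sigma _{t|t - 1}}\left( {1 - \frac{{{\Sigma _{t|t - 1}}}}{{{V_t}}}} \right)\]

Then, because ${\mu _{t + 1}} = {\mu _t} + {\eta _t}$, the one-step-ahead quantities are ${\mu _{t + 1|t}} = {\mu _{t|t}}$ and ${\Sigma _{t + 1|t}} = {\Sigma _{t|t}} + \sigma _\eta ^2$, which is exactly the recursion below.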

The Local Trend Kalman Filter


Using the definitions above, it is possible to derive a simple Kalman filter.
\[\begin{array}{l} {v_t} = {y_t} - {\mu _{t|t - 1}}\\ {V_t} = {\Sigma _{t|t - 1}} + \sigma _\varepsilon ^2\\ {K_t} = \frac{{{\Sigma _{t|t - 1}}}}{{{V_t}}}\\ {\mu _{t + 1|t}} = {\mu _{t|t - 1}} + {K_t}{v_t}\\ {\Sigma _{t + 1|t}} = {\Sigma _{t|t - 1}}(1 - {K_t}) + \sigma _\eta ^2 \end{array}\]

${v_t}$ is the one-step-ahead prediction error, also called the residual. ${V_t}$ is the estimate of the variance of ${v_t}$. ${K_t}$ is the Kalman gain.

That looks pretty straightforward. Here is the code:

class LocalTrendKalmanFilter(object):
    def __init__(self, y_data, obs_var=0.1,
                 state_var=0.1, debug=False):
        """
        Initialization
        arguments:
        y_data - numpy array of time series data
        obs_var - variance of the observed data (sigma_epsilon squared)
        state_var - variance of mu, the state (sigma_eta squared)
        debug - if True, prints values at each step
        """
        self._ydata = y_data  # observations
        self._obs_var = obs_var  # data variance
        self._state_var = state_var  # model variance
        self._step = 1 # what time step are we on?
        self._debug = debug

        # state estimate and its variance
        self._mu = y_data[0]       # start the trend at the first observation
        self._v = 0                # one-step-ahead prediction error
        self._cov = self._obs_var  # initial variance of the trend estimate

        self._V = 0
        self._K = 0

    # some getters for state and covariance
    @property
    def mu(self):
        return self._mu

    @property
    def v(self):
        return self._v

    @property
    def cov(self):
        return self._cov

    def predict(self):
        """
        predict - predict new state        
        """
        # are we at the end of the data?
        if self._step == len(self._ydata):
            self._mu = None
            self._v = None
            return

        # update error prediction and variance
        self._v = self._ydata[self._step] - self._mu
        self._V = self._cov + self._obs_var

    def update(self):
        """
        update - Kalman update
        """
        # if predict() ran past the end of the data, there is nothing to fold in
        if self._v is None:
            return

        self._K = self._cov / self._V  # Kalman gain
        self._mu = self._mu + self._K * self._v # new state
        self._cov = self._cov * (1 - self._K) + self._state_var  # new variance

        if self._debug:
            print('y =', self._ydata[self._step], 'mu =', self._mu, 'v =', self._v,
                  'V =', self._V,  'K =', self._K,  'cov =', self._cov, '\n')
            
        self._step += 1


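To run the filter, the class is stepped through the data with alternating predict and update calls. Here is a minimal driver sketch; the file name temps.csv and its column layout are assumptions for illustration, not part of the class.

import numpy as np

# temps.csv with a year column and an anomaly column is an assumed layout
data = np.loadtxt('temps.csv', delimiter=',', skiprows=1)
years, anomalies = data[:, 0], data[:, 1]

kf = LocalTrendKalmanFilter(anomalies, obs_var=1.0, state_var=0.1)
trend = [kf.mu]  # the filter starts at the first observation
for _ in range(len(anomalies) - 1):
    kf.predict()         # form the prediction error v_t and its variance V_t
    kf.update()          # fold v_t into the trend estimate mu and its variance
    trend.append(kf.mu)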

Here is the result of applying the filter to the temperature anomaly data with $\sigma _\varepsilon ^2 = 1.0$ and $\sigma _\eta ^2 = 0.1$.

The fit follows the trend pretty well. The predicted error bounces around 0, but goes up a bit at the end.

The filter is somewhat sensitive to the estimates of the variances. Watch what happens with $\sigma _\varepsilon ^2 = 0.1$ and $\sigma _\eta ^2 = 0.5$.


The fit appears a bit jumpier, although now the error mean stays closer to 0.

In an extreme case, $\sigma _\varepsilon ^2 = 0.000001$ and $\sigma _\eta ^2 = 1$, the filter simply overfits.


The code and data can be downloaded from here.
