INTRODUCTION
Many models in economics and finance depend on data that are not observable. These unobserved data usually arise in settings where a model is expected to predict future events. The Kalman filter has been used to estimate an unobservable source of jumps in stock returns, unobservable noise in equity index levels, unobservable parameters and state variables in commodity futures prices, unobservable inflation expectations, unobservable stock betas, and unobservable hedge ratios across interest rate contracts. (1) In the field of engineering, a Kalman filter (Kalman 1960) is employed for similar problems involving physical phenomena. The technique is appearing more frequently in finance and economics. However, the available resource material can make the technique very difficult to understand.
When viewing chapter thirteen of Hamilton's (1994) Time Series Analysis text, one can understand why the topic of Kalman filters is generally reserved for the graduate classroom. However, as we will demonstrate, the technique is not quite as difficult as it may initially appear and has similarities to standard linear regression analysis. Consequently, if placed in the correct context, it is accessible to the undergraduate student. To make the Kalman filter more accessible, this article develops an Excel application that works the student through the mechanics of the process.
In the first section, a derivation of the Kalman filter algorithm is presented in a univariate context and a connection is made between the algorithm and linear regression. In the second section, the Kalman filter is combined with maximum likelihood estimation (MLE) to create an iterative process for parameter estimation. In the third section, an Excel example works through the Kalman filter/MLE iterative routine.
DEVELOPING THE KALMAN FILTER ALGORITHM
There are two basic building blocks of a Kalman filter: the measurement equation and the transition equation. The measurement equation relates an unobserved variable ([X.sub.t]) to an observable variable ([Y.sub.t]). In general, the measurement equation is of the form:
[Y.sub.t] = [m.sub.t] x [X.sub.t] + [b.sub.t] + [[epsilon].sub.t] (1)
To simplify the exposition, assume the constant [b.sub.t] is zero and [m.sub.t] remains constant through time, eliminating the need for a subscript. Further, [[epsilon].sub.t] has a mean of zero and a variance of [r.sub.t]. Equation (1) becomes:
[Y.sub.t] = m x [X.sub.t] + [[epsilon].sub.t] (2)
The transition equation is based on a model that allows the unobserved variable to change through time. In general, the transition equation is of the form:
[X.sub.t+1] = [a.sub.t] x [X.sub.t] + [g.sub.t] + [[theta].sub.t] (3)
Again, to simplify the exposition, assume the constant [g.sub.t] is zero and [a.sub.t] remains constant through time, eliminating the need for a subscript. Further, [[theta].sub.t] has a mean of zero and a variance of [q.sub.t]. Equation (3) becomes:
[X.sub.t+1] = a x [X.sub.t] + [[theta].sub.t] (4)
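The simplified model in Eqs. (2) and (4) can be simulated in a few lines, which may help a student see what the filter is later asked to recover. The sketch below is illustrative only: the function name `simulate`, the use of normal draws, and the constant variances `q` and `r` are assumptions (the text specifies only a mean and a variance for each error term).

```python
import numpy as np

def simulate(a, m, q, r, mu0, sigma0, T, seed=0):
    """Simulate the transition (Eq. 4) and measurement (Eq. 2) equations.

    Assumptions not stated in the text: errors are normally distributed
    and the variances q and r do not change through time.
    """
    rng = np.random.default_rng(seed)
    x = np.empty(T + 1)          # unobserved variable X_0 ... X_T
    y = np.full(T + 1, np.nan)   # observed variable; no observation at t = 0
    x[0] = rng.normal(mu0, sigma0)
    for t in range(T):
        # Eq. (4): X_{t+1} = a * X_t + theta_t
        x[t + 1] = a * x[t] + rng.normal(0.0, np.sqrt(q))
        # Eq. (2): Y_{t+1} = m * X_{t+1} + epsilon_{t+1}
        y[t + 1] = m * x[t + 1] + rng.normal(0.0, np.sqrt(r))
    return x, y

# Hypothetical parameter values chosen only for illustration.
x, y = simulate(a=0.9, m=1.0, q=0.5, r=1.0, mu0=0.0, sigma0=1.0, T=100)
```

In practice only `y` would be available to the analyst; `x` is returned here so that filtered estimates can be compared with the simulated truth.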
To begin deriving the Kalman filter algorithm, insert an initial value [X.sub.0] into Eq. (4) (the transition equation) for [X.sub.t]. [X.sub.0] has a mean of [[mu].sub.0] and a standard deviation of [[sigma].sub.0]. It should be noted that [[epsilon].sub.t], [[theta].sub.t], and [X.sub.0] are uncorrelated. (Note: these variables are also uncorrelated relative to lagged variables.) Equation (4) becomes:
[X.sub.1P] = a x [X.sub.0] + [[theta].sub.0] (5)
where [X.sub.1P] is the predicted value for [X.sub.1].
[X.sub.1P] is inserted into Eq. (2) (the measurement equation) to get a predicted value for [Y.sub.1], call it [Y.sub.1P]:
[Y.sub.1P] = m x [X.sub.1P] + [[epsilon].sub.1] = m x [a x [X.sub.0] + [[theta].sub.0]] + [[epsilon].sub.1] (6)
When [Y.sub.1] actually occurs, the error, [Y.sub.1E], is computed by subtracting [Y.sub.1P] from [Y.sub.1]:
[Y.sub.1E] = [Y.sub.1] - [Y.sub.1P] (7)
The error can now be incorporated into the prediction for [X.sub.1]. To distinguish the adjusted predicted value of [X.sub.1] from the predicted value of [X.sub.1] in Eq. (5), the adjusted predicted value is called [X.sub.1P - ADJ]:
[X.sub.1P - ADJ] = [X.sub.1P] + [k.sub.1] x [Y.sub.1E] = [X.sub.1P] + [k.sub.1] [[Y.sub.1] - [Y.sub.1P]] = [X.sub.1P] + [k.sub.1] [[Y.sub.1] - m x [X.sub.1P] - [[epsilon].sub.1]] = [X.sub.1P] [1 - m x [k.sub.1]] + [k.sub.1] x [Y.sub.1] - [k.sub.1] x [[epsilon].sub.1] (8)
where [k.sub.1] is the Kalman gain, which will be determined shortly.
The Kalman gain variable is determined by taking the partial derivative of the variance of [X.sub.1P - ADJ] relative to [k.sub.1] in order to minimize the variance based on [k.sub.1] (i.e., the partial derivative is set to zero and then one finds a solution for [k.sub.1]). For ease of exposition, let [p.sub.1] be the variance of [X.sub.1P] (technically, [p.sub.1] equals: [(a x [[sigma].sub.0]).sup.2] + [q.sub.0]). The solution for the Kalman gain is as follows (see Joseph, 2007, for a numerical example):
Var([X.sub.1P - ADJ]) = [p.sub.1] x [[1 - m x [k.sub.1]].sup.2] + [k.sup.2.sub.1] x [r.sub.1] (9)
[partial derivative]Var([X.sub.1P - ADJ])/[partial derivative][k.sub.1] = -2m x [1 - m x [k.sub.1]] x [p.sub.1] + 2 x [k.sub.1] x [r.sub.1] = 0 (10)
=> [k.sub.1] = [p.sub.1] x m/([p.sub.1] x [m.sup.2] + [r.sub.1]) = Cov([X.sub.1P], [Y.sub.1P])/Var([Y.sub.1P]) (11)
Notice that the Kalman gain is equivalent to a [beta]-coefficient from a linear regression with [X.sub.1P] as the dependent variable and [Y.sub.1P] as the independent variable. One would not have a sufficient set of data to perform such a regression, but the idea that a [beta]-coefficient is set to reduce error in a regression is equivalent to the idea of the Kalman gain being set to reduce variance in the adjusted predicted value for [X.sub.1].
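Both claims about the gain can be checked numerically: that Eq. (11) minimizes the variance in Eq. (9), and that it matches a regression slope. The sketch below is only an illustration under stated assumptions. The values of `p1`, `m`, and `r1` are hypothetical, and the simulated draws are taken to be normal with [X.sub.1P] centered at zero, neither of which the text requires.

```python
import numpy as np

def kalman_gain(p, m, r):
    """Closed-form gain from Eq. (11)."""
    return p * m / (p * m**2 + r)

def adjusted_variance(k, p, m, r):
    """Variance of the adjusted prediction from Eq. (9)."""
    return p * (1.0 - m * k) ** 2 + k**2 * r

# Hypothetical values chosen only for illustration.
p1, m, r1 = 2.0, 1.5, 0.8
k_star = kalman_gain(p1, m, r1)

# Check 1: brute-force search for the k that minimizes Eq. (9).
grid = np.linspace(0.0, 1.0, 10001)
k_grid = grid[np.argmin(adjusted_variance(grid, p1, m, r1))]

# Check 2: slope from regressing simulated X_1P on simulated Y_1P.
# Normal draws and a zero mean are assumptions made for the simulation.
rng = np.random.default_rng(0)
x_pred = rng.normal(0.0, np.sqrt(p1), size=200_000)
y_pred = m * x_pred + rng.normal(0.0, np.sqrt(r1), size=200_000)
beta = np.cov(x_pred, y_pred)[0, 1] / np.var(y_pred, ddof=1)

print(k_star, k_grid, beta)
```

The three printed numbers should agree closely, up to the grid spacing for `k_grid` and sampling error for `beta`.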
The next step is to use [X.sub.1P - ADJ] in the transition equation (Eq. (4)) for [X.sub.t] and start the process over again to find equivalent values when t = 2. However, before ending this section, it is important to note the advantages of [X.sub.1P - ADJ] over [X.sub.1P]. Recall that the variance for [X.sub.1P] is [p.sub.1]. Substituting Eq. (11) into Eq. (9), the variance of [X.sub.1P - ADJ] is:
Var([X.sub.1P - ADJ]) = [p.sub.1] x [[1 - 1/(1 + [r.sub.1]/([p.sub.1] x [m.sup.2]))].sup.2] + [k.sup.2.sub.1] x [r.sub.1] (12)
The portion of the equation that pertains to the variance of [X.sub.1P], i.e., [p.sub.1], has a bracketed term that is less than one (and is further reduced because the "less than one quantity" is squared). This means that the portion of the variance attributed to estimating [X.sub.1] has been reduced by using [X.sub.1P - ADJ] instead of [X.sub.1P].
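The remaining [k.sup.2.sub.1] x [r.sub.1] term can also be substituted out. The following worked simplification is not part of the original derivation; it uses only the symbols already defined and assumes [p.sub.1] > 0 and [r.sub.1] >= 0. It suggests that the total variance of the adjusted prediction, not just the first portion, is no larger than [p.sub.1]:

```latex
\begin{aligned}
1 - m k_1 &= 1 - \frac{p_1 m^2}{p_1 m^2 + r_1} = \frac{r_1}{p_1 m^2 + r_1}, \\
\operatorname{Var}(X_{1P-ADJ})
  &= p_1 \left( \frac{r_1}{p_1 m^2 + r_1} \right)^{2}
   + \left( \frac{p_1 m}{p_1 m^2 + r_1} \right)^{2} r_1 \\
  &= \frac{p_1 r_1 \left( r_1 + p_1 m^2 \right)}{\left( p_1 m^2 + r_1 \right)^{2}}
   = \frac{p_1 r_1}{p_1 m^2 + r_1}
   = p_1 \left( 1 - m k_1 \right) \le p_1 .
\end{aligned}
```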
For reference, the Kalman filter algorithm is summarized in Table 1.
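As a companion to the summary, the recursion derived in Eqs. (5) through (11) can be sketched in code. This is a minimal illustration, not the article's Excel implementation: the function name `kalman_filter` and its return layout are hypothetical, and the variances `q` and `r` are assumed constant through time.

```python
import numpy as np

def kalman_filter(y, a, m, q, r, mu0, sigma0):
    """Univariate filter following Eqs. (5)-(11).

    Assumes constant variances q and r. Returns one row per observation:
    (predicted X, its variance p, gain k, adjusted X, adjusted variance).
    """
    x_adj, p_adj = mu0, sigma0**2      # start from the mean and variance of X_0
    rows = []
    for y_t in y:
        x_pred = a * x_adj             # expected value of Eq. (5)
        p_pred = a**2 * p_adj + q      # variance of the predicted X
        y_pred = m * x_pred            # expected value of Eq. (6)
        y_err = y_t - y_pred           # Eq. (7)
        k = p_pred * m / (p_pred * m**2 + r)                # Eq. (11)
        x_adj = x_pred + k * y_err                          # expected value of Eq. (8)
        p_adj = p_pred * (1.0 - m * k) ** 2 + k**2 * r      # Eq. (9)
        rows.append((x_pred, p_pred, k, x_adj, p_adj))
    return np.array(rows)

# Hypothetical inputs chosen so the first step is easy to verify by hand:
# p = 1, k = 1/2, adjusted X = 1, adjusted variance = 1/2.
result = kalman_filter([2.0, 1.0, 1.5], a=1.0, m=1.0, q=0.0, r=1.0,
                       mu0=0.0, sigma0=1.0)
```

Each pass through the loop feeds the adjusted value and its variance back into the transition equation, which is the "start the process over again" step described above.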
In the next section, it will be necessary to use the mean and variance of [X.sub.1P - ADJ] and of [Y.sub.1P]. Although some of these quantities have already been calculated, all are presented below for reference purposes with the time index variable t incorporated (t = 1 to T) and the adjusted predicted values for [X.sub.t] incorporated into [Y.sub.tP]:
E[[X.sub.tP - ADJ]] = E[[X.sub.tP] + [k.sub.t] x [Y.sub.tE]] = E[[X.sub.tP]] + [k.sub.t] x ([Y.sub.t] - E[[Y.sub.tP]]) (13)
Var[[X.sub.tP - ADJ]] = [p.sub.t] x [[1 - 1/(1 + [r.sub.t]/([p.sub.t] x [m.sup.2]))].sup.2] + [k.sup.2.sub.t] x [r.sub.t] (14)
E[[Y.sub.tP]] = E[m x ([X.sub.tP - ADJ]) + [[epsilon].sub.t]] = m x E[[X.sub.tP - ADJ]] (15)
Var[[Y.sub.tP]] = Var[[X.sub.tP - ADJ]] x [m.sup.2] + [r.sub.t] (16)
Note: [[epsilon].sub.t] technically appears within Eq. (13) within the [Y.sub.tE] term and within Eq. (15); however, these error terms are independent of each other. In other words, Eqs. (15) and (16) refer to an "updated" or "adjusted" version of the [Y.sub.tP] term in Eqs. (13) and (14). Consequently, the error terms corresponding to [Y.sub.tP] within the two sets of equations are uncorrelated.
In the classroom setting, it is important to keep the application in a univariate setting initially to allow the student to follow the logic of the filter. Further, it is suggested that the instructor reinforce the logic of the algorithm using Table 1 in conjunction with an assignment (such as the assignment developed in this article) or a quiz. Because this presentation is not reliant on many expectation calculations and only one variance calculation, it is a more palatable introduction of the Kalman filter than what many texts present. Consequently, this presentation works best as an introduction to the technique, which can eventually lead to the more sophisticated presentations available in many time series texts. If the instructor only requires an introduction to the Kalman filter technique with the ability to create an assignment, then this presentation of the algorithm will be sufficient without a text.
APPLYING MAXIMUM LIKELIHOOD ESTIMATION TO THE KALMAN FILTER
The Kalman filter provides output throughout the time series in the form of estimated values for an unobservable variable: [X.sub.tP - ADJ] with a mean and a variance defined in Eqs. (13) and (14). Further, the observable variable has a time series of values and a distribution based on its predicted value, [Y.sub.tP], which has a mean and variance defined by Eqs. (15) and (16). What the Kalman filter cannot determine are the unknown model parameters: the variance of [[epsilon].sub.t] in the measurement equation, Eq. (2) (note: m is a constant and assumed known), and a and the variance of [[theta].sub.t] in the transition equation, Eq. (4). Consequently, it is necessary to have a means of estimating these parameters and, once they are estimated, to let the Kalman filter generate the desired time series of the unobservable variable.
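One common way to connect the filter to parameter estimation is to score each candidate parameter set by the likelihood of the one-step prediction errors. The sketch below is an assumption-laden illustration, not the article's routine: it assumes normally distributed errors, constant variances `q` and `r`, and the standard form in which each prediction of [Y.sub.t] is built from the predicted (not yet adjusted) value of [X.sub.t]. The function name `log_likelihood` and the simple grid search over `a` are hypothetical.

```python
import numpy as np

def log_likelihood(y, a, m, q, r, mu0, sigma0):
    """Gaussian log-likelihood of the one-step prediction errors.

    Assumptions: normal errors and constant variances q and r.
    """
    x_adj, p_adj = mu0, sigma0**2
    ll = 0.0
    for y_t in y:
        x_pred = a * x_adj                 # predicted X
        p_pred = a**2 * p_adj + q          # variance of predicted X
        err = y_t - m * x_pred             # prediction error for Y
        f = m**2 * p_pred + r              # variance of the prediction error
        ll += -0.5 * (np.log(2.0 * np.pi * f) + err**2 / f)
        k = p_pred * m / f                 # Kalman gain, Eq. (11)
        x_adj = x_pred + k * err           # adjusted prediction
        p_adj = p_pred * (1.0 - m * k) ** 2 + k**2 * r   # Eq. (9)
    return ll

# Hypothetical data: simulate Eqs. (2) and (4) with a = 0.8, m = q = r = 1.
rng = np.random.default_rng(1)
x, y = 0.0, []
for _ in range(300):
    x = 0.8 * x + rng.normal(0.0, 1.0)
    y.append(1.0 * x + rng.normal(0.0, 1.0))

# Simplification: search over a only, holding the variances at known values.
grid = np.linspace(0.0, 0.99, 100)
a_hat = max(grid, key=lambda a: log_likelihood(y, a, 1.0, 1.0, 1.0, 0.0, 1.0))
print(a_hat)
```

A full routine would search over the variances as well as `a`, typically with a numerical optimizer rather than a grid; the grid is used here only to keep the example self-contained.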



