The Kalman Filter is an optimal estimation algorithm that infers
parameters of interest from indirect, inaccurate and uncertain
observations. It is recursive, so that new measurements can be processed
as they arrive.
It projects each new measurement onto the state estimate: finding the best estimate from noisy data amounts to filtering out the noise. The state is hidden and follows a first-order Markov chain, making the model the continuous-state analogue of a hidden Markov model (HMM).
Optimal:
- If the noise is Gaussian, the Kalman filter minimizes the mean square error of the estimated parameters.
- If the noise is non-Gaussian, it is the best linear estimator given the mean and covariance of the noise.
Why popular?
- Good results in practice due to optimality and structure
- Convenient form for online, real-time processing
- Easy to formulate and implement with a basic understanding
- Measurement equations need not be inverted
Common applications:
- Navigation systems (GPS + inertial measurement units)
- Target tracking (radar, computer vision)
- Signal processing and sensor fusion
- Robotics and autonomous vehicles
- Financial time series estimation
When to use it:
- Variables of interest can only be measured indirectly
- Multiple sensors provide measurements subject to noise
- Real-time online processing is needed
Pros and Cons:
- allows noise in estimates, transitions, and observations
- doesn't require historical data
- efficient, suitable for real-time applications
- optimal only for linear-Gaussian state space models; in practice the state transitions may be non-linear or the noise non-Gaussian, and variants (EKF, UKF, particle filter) exist to deal with this
Notation

| Symbol | Meaning | Dimension |
|---|---|---|
| n_x | State dimension | scalar |
| n_y | Observation dimension | scalar |
| x_k | State vector at time step k | n_x \times 1 |
| y_k | Observation/measurement at time step k | n_y \times 1 |
| u_k | Control input at time step k | n_u \times 1 |
| F | State transition matrix | n_x \times n_x |
| G | Control input matrix | n_x \times n_u |
| H | Observation matrix (maps state to measurement space) | n_y \times n_x |
| w_k | Process noise: w_k \sim (0, Q) | n_x \times 1 |
| v_k | Observation noise: v_k \sim (0, R) | n_y \times 1 |
| Q | Process noise covariance matrix | n_x \times n_x |
| R | Observation noise covariance matrix | n_y \times n_y |
| P_k | Estimate error covariance matrix | n_x \times n_x |
| K_k | Kalman gain matrix | n_x \times n_y |
| \hat{x}_k | Estimated state at time k | n_x \times 1 |
| \hat{x}_{k\|k-1} | Predicted state (before observing y_k) | n_x \times 1 |
| \hat{x}_{k\|k} | Filtered state (after observing y_k) | n_x \times 1 |
State Space Model
A Kalman filter operates on a linear state space model consisting of a state equation and an observation equation. As a running example, we want to estimate the current position x_k of a car using:
State equation (speedometer u_k):
x_k = x_{k-1} + u_k \Delta t + w_k where w_k \sim (0, q)
Here F=1, G=\Delta t, Q=q
Observation equation (GPS y_k):
y_k = x_k + v_k where v_k \sim (0, r)
Here H=1, R=r
To estimate the true position, we fuse both distributions, weighting each by its inverse variance (lower uncertainty → higher weight).
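A minimal sketch of this one-dimensional example; the values chosen for \Delta t, q, and r are illustrative assumptions, not from the text:

```python
import numpy as np

# Scalar car-position model from the text:
#   state:       x_k = x_{k-1} + u_k*dt + w_k,  w_k ~ N(0, q)   (F=1, G=dt, Q=q)
#   observation: y_k = x_k + v_k,               v_k ~ N(0, r)   (H=1, R=r)
# dt, q, r below are assumed values for illustration only.

def kf_1d_step(x_est, p_est, u, y, dt=1.0, q=0.5, r=2.0):
    """One predict + correct cycle of the scalar Kalman filter."""
    # Predict with the speedometer (state equation)
    x_pred = x_est + u * dt
    p_pred = p_est + q
    # Correct with the GPS reading (observation equation),
    # weighting by inverse variance: low uncertainty -> high weight
    k = p_pred / (p_pred + r)          # Kalman gain, between 0 and 1
    x_new = x_pred + k * (y - x_pred)  # blend prediction and measurement
    p_new = (1 - k) * p_pred           # uncertainty shrinks after correction
    return x_new, p_new

rng = np.random.default_rng(0)
x_true, x_est, p_est = 0.0, 0.0, 1.0
for _ in range(50):
    u = 1.0                                        # constant speed
    x_true += u * 1.0 + rng.normal(0, np.sqrt(0.5))
    y = x_true + rng.normal(0, np.sqrt(2.0))       # noisy GPS reading
    x_est, p_est = kf_1d_step(x_est, p_est, u, y)
print(f"true={x_true:.2f} est={x_est:.2f} var={p_est:.2f}")
```

Note that the error variance converges to a fixed point regardless of the data: uncertainty depends only on q, r, and the model, not on the measurements themselves.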
Prediction, Filtering, and Smoothing
Assuming we have observations up to time k:
- Prediction: forecast future states \hat{x}_{k+1|k}, \hat{x}_{k+2|k}, \ldots
- Filtering: correct the current state \hat{x}_{k|k}
- Smoothing: correct previous states \hat{x}_{k-1|k}, \hat{x}_{k-2|k}, \ldots
The Kalman filter (forward algorithm) performs prediction + filtering
at each time step. The Kalman smoother (forward-backward algorithm) runs
the Kalman filter forward, then corrects past states backward.
Kalman Filter Intuition
Prediction (prior), before observing y_k:
\hat{x}_{k|k-1} = F \hat{x}_{k-1|k-1} + G u_k
(predicted system state)
P_{k|k-1} = F P_{k-1|k-1} F^T + Q
(predicted uncertainty)
Correction (posterior), after observing y_k, corrects the state prediction and updates the uncertainty:
K_k = P_{k|k-1} H^T (H P_{k|k-1} H^T + R)^{-1}
(Kalman gain)
\hat{x}_{k|k} = \hat{x}_{k|k-1} + K_k (y_k - H \hat{x}_{k|k-1})
(corrected state)
P_{k|k} = (I - K_k H)P_{k|k-1}
(corrected prediction uncertainty)
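In matrix form the two steps translate directly into code. The constant-velocity model at the bottom is a made-up illustration (assumed values), not from the text:

```python
import numpy as np

# Predict/correct cycle in the notation of the table (F, G, H, Q, R, P, K).

def kf_predict(x, P, F, G, u, Q):
    x_pred = F @ x + G @ u                 # prior state  \hat{x}_{k|k-1}
    P_pred = F @ P @ F.T + Q               # prior covariance P_{k|k-1}
    return x_pred, P_pred

def kf_correct(x_pred, P_pred, y, H, R):
    S = H @ P_pred @ H.T + R               # innovation covariance
    K = P_pred @ H.T @ np.linalg.inv(S)    # Kalman gain K_k
    x = x_pred + K @ (y - H @ x_pred)      # posterior state \hat{x}_{k|k}
    P = (np.eye(len(x)) - K @ H) @ P_pred  # posterior covariance P_{k|k}
    return x, P

# Illustrative 2-D constant-velocity model:
# state = [position, velocity], only position is measured.
dt = 0.1
F = np.array([[1.0, dt], [0.0, 1.0]])
G = np.zeros((2, 1)); u = np.zeros(1)      # no control input
H = np.array([[1.0, 0.0]])
Q = 0.01 * np.eye(2); R = np.array([[0.25]])
x, P = np.zeros(2), np.eye(2)
x, P = kf_predict(x, P, F, G, u, Q)
x, P = kf_correct(x, P, np.array([1.0]), H, R)
```

After one measurement of 1.0, the position estimate moves most of the way toward the measurement because the prior covariance is large relative to R.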
Kalman Smoother
Motivation
The Kalman filter gives the best estimate given observations up to
the current time. However, for offline applications
(e.g., analyzing recorded data), we have all observations available. The
smoother uses future observations to refine past state estimates,
reducing estimation error.
- Filter: \hat{x}_{k|k} (uses y_1, \ldots, y_k)
- Smoother: \hat{x}_{k|T} (uses y_1, \ldots, y_T where T > k)
The smoother always has lower or equal error variance: P_{k|T} \leq P_{k|k}.
Kalman Smoother Algorithm
Run the Kalman filter forward for k = 1, \ldots, T to compute \hat{x}_{k|k}, P_{k|k}, \hat{x}_{k|k-1}, P_{k|k-1}.
Then run backward for k = T-1, \ldots, 1:
A_k = P_{k|k} F^T P_{k+1|k}^{-1}
\hat{x}_{k|T} = \hat{x}_{k|k} + A_k (\hat{x}_{k+1|T} - \hat{x}_{k+1|k})
P_{k|T} = P_{k|k} + A_k (P_{k+1|T} - P_{k+1|k}) A_k^T
Here A_k is the smoother gain, analogous to the Kalman gain but for backward information propagation. It measures how much correction from future states affects the current state estimate.
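For the scalar car model (F = 1, no control input), the forward-backward recursion can be sketched as follows; the noise variances q and r and the prior are illustrative assumptions:

```python
import numpy as np

# Forward Kalman filter pass, then backward (RTS) smoothing pass,
# for the scalar model x_k = x_{k-1} + w_k, y_k = x_k + v_k.

def rts_smooth(ys, q=0.5, r=2.0):
    T = len(ys)
    xf, pf = np.zeros(T), np.zeros(T)       # filtered \hat{x}_{k|k}, P_{k|k}
    xp, pp = np.zeros(T), np.zeros(T)       # predicted \hat{x}_{k|k-1}, P_{k|k-1}
    x, p = 0.0, 1.0                          # assumed prior
    for k in range(T):                       # forward filter pass
        xp[k], pp[k] = x, p + q              # predict (F = 1)
        g = pp[k] / (pp[k] + r)              # Kalman gain
        x = xp[k] + g * (ys[k] - xp[k])      # correct
        p = (1 - g) * pp[k]
        xf[k], pf[k] = x, p
    xs, ps = xf.copy(), pf.copy()            # smoothed \hat{x}_{k|T}, P_{k|T}
    for k in range(T - 2, -1, -1):           # backward correction pass
        A = pf[k] / pp[k + 1]                # smoother gain A_k (F = 1)
        xs[k] = xf[k] + A * (xs[k + 1] - xp[k + 1])
        ps[k] = pf[k] + A**2 * (ps[k + 1] - pp[k + 1])
    return xs, ps, xf, pf
```

The backward pass can only shrink the variance, which is the P_{k|T} \leq P_{k|k} property stated above.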
Sequential Kalman Filter
Applicable when the measurements are uncorrelated (R diagonal): the components of y_k are then processed one at a time as scalar updates, which avoids inverting an n_y \times n_y matrix in the measurement step. This reduces computation and can improve numerical stability. Systems with non-diagonal R can be handled by first transforming the measurements so that R becomes diagonal.
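A hedged sketch of the sequential measurement update, assuming R is passed as a vector of diagonal entries (the test matrices are illustrative):

```python
import numpy as np

# Fold in each scalar measurement y[i] in turn; every update needs only
# a scalar division, never an n_y x n_y matrix inverse. Algebraically
# equivalent to the batch update when R is diagonal.

def sequential_update(x, P, y, H, R_diag):
    for i in range(len(y)):
        h = H[i]                          # i-th row of H (1 x n_x)
        s = h @ P @ h + R_diag[i]         # scalar innovation variance
        k = (P @ h) / s                   # n_x gain vector
        x = x + k * (y[i] - h @ x)        # scalar correction
        P = P - np.outer(k, h @ P)        # (I - k h) P
    return x, P
```

Equivalence with the batch update is easy to check numerically, which also makes a good unit test for an implementation.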
For systems with non-linear dynamics or non-Gaussian noise, the
linear Kalman filter may not be optimal.
Extended Kalman Filter (EKF)
Linearizes non-linear transition and observation functions around the
current estimate using first-order Taylor expansion (Jacobians). Nearly
as efficient as standard KF but can diverge if non-linearity is
strong.
Unscented Kalman Filter (UKF)
Uses a set of carefully chosen sample points (“sigma points”) to
capture mean and covariance of non-linear transformations without
linearization. More accurate than EKF for moderate non-linearities,
slightly higher computational cost.
Particle Filter
Represents the state distribution as a collection of samples and
updates via importance weighting. Handles arbitrary non-linear and
non-Gaussian systems but requires many more samples (computationally
expensive).
Learning Parameters
The parameters \Theta = \{F, G, H, Q, R, E[x_0], \text{Var}[x_0]\} are often unknown and must be estimated from data.
EM Algorithm
The Expectation-Maximization (EM) algorithm is suitable for this:
- E-step: run the Kalman filter and smoother to estimate the hidden states x_1, \ldots, x_T given the current parameter estimates
- M-step: update the parameters \Theta to maximize the likelihood of the observations
- Repeat until convergence
Use EM when you have recorded observation sequences and want principled parameter estimation. For real-time systems, manual tuning of Q and R based on sensor specifications and process knowledge is usually faster.
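For a concrete feel, here is a hedged sketch of EM that estimates only the noise variances in the scalar random-walk model (F = H = 1 held fixed); the general M-step also updates F, G, H and the initial moments. The initial values and diffuse prior are assumptions for illustration:

```python
import numpy as np

# EM for x_k = x_{k-1} + w_k (w ~ N(0,q)), y_k = x_k + v_k (v ~ N(0,r)),
# estimating Theta = {q, r} from one observation sequence.

def em_scalar(ys, q=1.0, r=1.0, n_iter=25):
    T = len(ys)
    for _ in range(n_iter):
        # E-step, forward: Kalman filter
        xf, pf = np.zeros(T), np.zeros(T)   # filtered moments
        xp, pp = np.zeros(T), np.zeros(T)   # predicted moments
        x, p = 0.0, 10.0                    # diffuse prior (assumption)
        for k in range(T):
            xp[k], pp[k] = x, p + q
            g = pp[k] / (pp[k] + r)
            x = xp[k] + g * (ys[k] - xp[k])
            p = (1 - g) * pp[k]
            xf[k], pf[k] = x, p
        # E-step, backward: RTS smoother
        xs, ps = xf.copy(), pf.copy()
        for k in range(T - 2, -1, -1):
            A = pf[k] / pp[k + 1]
            xs[k] = xf[k] + A * (xs[k + 1] - xp[k + 1])
            ps[k] = pf[k] + A**2 * (ps[k + 1] - pp[k + 1])
        A_all = pf[:-1] / pp[1:]            # smoother gains A_k
        cross = A_all * ps[1:]              # Cov(x_k, x_{k+1} | y_{1:T})
        # M-step: expected squared increments and residuals
        q = np.mean((xs[1:] - xs[:-1])**2 + ps[1:] + ps[:-1] - 2 * cross)
        r = np.mean((ys - xs)**2 + ps)
    return q, r
```

Each iteration alternates a full filter+smoother pass (E-step) with closed-form variance updates (M-step), matching the loop described above.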
Numerical Stability
The standard covariance update P_{k|k} =
P_{k|k-1} - K_k H P_{k|k-1} involves subtraction and can lose
symmetry and positive-definiteness due to rounding errors, especially in
high-dimensional systems or long-running filters.
Use the Joseph form covariance update instead: P_{k|k} = (I - K_k H)P_{k|k-1}(I - K_k H)^T + K_k R K_k^T
This form:
- maintains symmetry and positive-definiteness numerically
- costs more computation (~3x) but ensures stability
- is the standard remedy: if the standard-form covariance diverges or loses symmetry, switch to the Joseph form
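The two update forms can be compared directly; the matrices below are illustrative assumptions:

```python
import numpy as np

# Standard vs. Joseph-form covariance update. With the optimal gain the
# two agree algebraically, but the Joseph form is symmetric by
# construction and stays positive semi-definite under rounding error
# (and remains a valid covariance even for a suboptimal gain K).

def update_standard(P, K, H):
    return (np.eye(P.shape[0]) - K @ H) @ P

def update_joseph(P, K, H, R):
    I_KH = np.eye(P.shape[0]) - K @ H
    return I_KH @ P @ I_KH.T + K @ R @ K.T

P = np.array([[2.0, 0.4], [0.4, 1.0]])
H = np.array([[1.0, 0.0]])
R = np.array([[0.5]])
S = H @ P @ H.T + R
K = P @ H.T @ np.linalg.inv(S)      # optimal Kalman gain
P_std = update_standard(P, K, H)
P_jos = update_joseph(P, K, H, R)
```

In long-running or high-dimensional filters, it is also common to re-symmetrize with P = (P + P.T) / 2 after each update as a cheap safeguard.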
State Observers
A state observer (state estimator) reconstructs unobserved internal
states from input-output measurements. The Kalman filter is the optimal
observer for linear systems with Gaussian noise, providing:
- Estimates \hat{x}_k of hidden states x_k
- Automatic weighting of model predictions vs. sensor measurements
- Uncertainty quantification via covariance P_k
State observers are fundamental in control theory and are used to
implement state-feedback controllers when full state measurement is
unavailable.