Kalman Filters

Introduction

The Kalman Filter is an optimal estimation algorithm that infers parameters of interest from indirect, inaccurate and uncertain observations. It is recursive, so that new measurements can be processed as they arrive.

Each new measurement is folded into the state estimate via the observation matrix, which projects the hidden state into measurement space. Finding the best estimate from noisy data amounts to filtering out the noise. The state is hidden and follows a first-order Markov chain. TODO: HMMs

Optimal: minimizes the mean squared error of the state estimate, provided the system is linear and the noise is Gaussian.

Why popular? It is recursive (constant memory, no need to store past measurements), computationally cheap, and optimal under linear-Gaussian assumptions.

Common applications: navigation and GPS/INS sensor fusion, object tracking, robot localization, signal processing, and time-series analysis.

When to use it: when the dynamics and measurement models are (approximately) linear, the noise is roughly Gaussian, and estimates are needed online as measurements arrive.

Pros and Cons: efficient and optimal under its assumptions, but limited to linear-Gaussian models and dependent on the noise covariances Q and R, which are often hard to specify.

Notation

| Symbol | Meaning | Dimension |
| --- | --- | --- |
| n_x | State dimension | scalar |
| n_y | Observation dimension | scalar |
| x_k | State vector at time step k | n_x \times 1 |
| y_k | Observation/measurement at time step k | n_y \times 1 |
| u_k | Control input at time step k | n_u \times 1 |
| F | State transition matrix | n_x \times n_x |
| G | Control input matrix | n_x \times n_u |
| H | Observation matrix (maps state to measurement space) | n_y \times n_x |
| w_k | Process noise: w_k \sim (0, Q) | n_x \times 1 |
| v_k | Observation noise: v_k \sim (0, R) | n_y \times 1 |
| Q | Process noise covariance matrix | n_x \times n_x |
| R | Observation noise covariance matrix | n_y \times n_y |
| P_k | Estimate error covariance matrix | n_x \times n_x |
| K_k | Kalman gain matrix | n_x \times n_y |
| \hat{x}_k | Estimated state at time k | n_x \times 1 |
| \hat{x}_{k\|k-1} | Predicted state (before observing y_k) | n_x \times 1 |
| \hat{x}_{k\|k} | Filtered state (after observing y_k) | n_x \times 1 |

State Space Model

A Kalman filter operates on a linear state space model consisting of:

- a state (dynamics) equation: x_k = F x_{k-1} + G u_{k-1} + w_{k-1}, with process noise w_{k-1} \sim (0, Q)
- an observation equation: y_k = H x_k + v_k, with measurement noise v_k \sim (0, R)

Example: Car position estimation

We want to estimate the current position x_k of a car using:

- a prediction from the motion model (the previous estimate propagated forward), with some uncertainty, and
- a noisy measurement of position (e.g., from GPS), with its own uncertainty.

Modeling each source as a Gaussian, we estimate the true position by fusing the two distributions, weighting each by its inverse variance (lower uncertainty → higher weight).
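This fusion can be sketched numerically. A minimal example; the prediction and measurement values below are made-up illustrations, not real sensor data:

```python
# Hypothetical 1D car-position example: a motion-model prediction and a
# noisy position measurement, each modeled as a Gaussian.
mu_pred, var_pred = 10.0, 4.0   # prediction: mean 10 m, variance 4 m^2
mu_meas, var_meas = 12.0, 1.0   # measurement: mean 12 m, variance 1 m^2

# Inverse-variance weighting: the more certain source gets more weight.
w_pred = 1.0 / var_pred
w_meas = 1.0 / var_meas
mu_fused = (w_pred * mu_pred + w_meas * mu_meas) / (w_pred + w_meas)
var_fused = 1.0 / (w_pred + w_meas)   # fused variance is below either input
```

Note that the fused estimate lands closer to the measurement (variance 1) than to the prediction (variance 4), and the fused variance is smaller than both: combining two independent sources always reduces uncertainty.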

Prediction, Filtering, and Smoothing

Assuming we have observations up to time k-1:

- Prediction estimates a future state: p(x_k | y_{1:k-1})
- Filtering estimates the current state once y_k arrives: p(x_k | y_{1:k})
- Smoothing estimates a past state using all observations: p(x_k | y_{1:T}) for k < T

The Kalman filter (forward algorithm) performs prediction + filtering at each time step. The Kalman smoother (forward-backward algorithm) runs the Kalman filter forward, then corrects past states backward.

Kalman Filter Intuition

For the 1D case, the product of the two Normal PDFs — the predicted density \mathcal{N}(\mu_{t|t-1}, \sigma^2_{t|t-1}) and the measurement likelihood \mathcal{N}(y_t, r) — is again Normal, with

\mu_{t|t} = \mu_{t|t-1} + K_t (y_t - \mu_{t|t-1}), \qquad \sigma^2_{t|t} = (1 - K_t)\, \sigma^2_{t|t-1}

where K_t is the Kalman gain defined as K_t = \frac{\sigma^2_{t|t-1}}{\sigma^2_{t|t-1} + r}

Interpretation:

- If the measurement noise r is small, K_t \to 1 and the update trusts the new measurement.
- If the prediction uncertainty \sigma^2_{t|t-1} is small, K_t \to 0 and the update trusts the model prediction.
- The posterior variance never grows in the update: \sigma^2_{t|t} \leq \sigma^2_{t|t-1}.

Kalman Filter Algorithm

For k = 1, 2, 3, \ldots:

Prediction step:

\hat{x}_{k|k-1} = F \hat{x}_{k-1|k-1} + G u_{k-1}
P_{k|k-1} = F P_{k-1|k-1} F^T + Q

Update (filtering) step:

K_k = P_{k|k-1} H^T (H P_{k|k-1} H^T + R)^{-1}
\hat{x}_{k|k} = \hat{x}_{k|k-1} + K_k (y_k - H \hat{x}_{k|k-1})
P_{k|k} = (I - K_k H) P_{k|k-1}
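The recursion can be sketched in NumPy. The constant-velocity model, noise levels, and initial conditions below are illustrative assumptions, not part of any particular system:

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative 1D constant-velocity model: state = [position, velocity].
dt = 1.0
F = np.array([[1.0, dt], [0.0, 1.0]])   # state transition
H = np.array([[1.0, 0.0]])              # we observe position only
Q = 0.01 * np.eye(2)                    # process noise covariance
R = np.array([[1.0]])                   # measurement noise covariance

def kf_step(x, P, y):
    """One Kalman filter iteration: predict, then update with measurement y."""
    # Prediction
    x_pred = F @ x
    P_pred = F @ P @ F.T + Q
    # Update
    S = H @ P_pred @ H.T + R              # innovation covariance
    K = P_pred @ H.T @ np.linalg.inv(S)   # Kalman gain
    x_new = x_pred + K @ (y - H @ x_pred)
    P_new = (np.eye(2) - K @ H) @ P_pred
    return x_new, P_new

# Simulate a noisy track and filter it.
x_true = np.array([0.0, 1.0])
x_est, P = np.zeros(2), 10.0 * np.eye(2)
for _ in range(50):
    x_true = F @ x_true + rng.multivariate_normal(np.zeros(2), Q)
    y = H @ x_true + rng.multivariate_normal(np.zeros(1), R)
    x_est, P = kf_step(x_est, P, y)
```

The large initial P encodes "we know nothing about the starting state"; the filter quickly shrinks it as measurements arrive.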

Kalman Smoother

Motivation

The Kalman filter gives the best estimate given observations up to the current time. However, for offline applications (e.g., analyzing recorded data), we have all observations available. The smoother uses future observations to refine past state estimates, reducing estimation error.

The smoother always has lower or equal error variance: P_{k|T} \leq P_{k|k}.

Kalman Smoother Algorithm

For all k = 1, \ldots, T: run the Kalman filter forward, storing the predicted moments (\hat{x}_{k|k-1}, P_{k|k-1}) and filtered moments (\hat{x}_{k|k}, P_{k|k}).

For k = T-1, \ldots, 1:

J_k = P_{k|k} F^T P_{k+1|k}^{-1}
\hat{x}_{k|T} = \hat{x}_{k|k} + J_k (\hat{x}_{k+1|T} - \hat{x}_{k+1|k})
P_{k|T} = P_{k|k} + J_k (P_{k+1|T} - P_{k+1|k}) J_k^T

Here J_k is the smoother gain, analogous to the Kalman gain but for backward information propagation. It measures how much the correction from future states affects the current state estimate.
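The forward-backward structure can be sketched on a scalar random-walk model (all matrices are 1x1 to keep the bookkeeping visible; the noise levels are illustrative):

```python
import numpy as np

rng = np.random.default_rng(1)

# Illustrative scalar random-walk model.
F = np.array([[1.0]]); H = np.array([[1.0]])
Q = np.array([[0.1]]); R = np.array([[1.0]])

T = 30
x, ys = 0.0, []
for _ in range(T):
    x += rng.normal(0, np.sqrt(Q[0, 0]))
    ys.append(x + rng.normal(0, np.sqrt(R[0, 0])))

# Forward pass: store predicted and filtered moments at every step.
x_f, P_f = np.zeros((1, 1)), np.array([[10.0]])
xp, Pp, xf, Pf = [], [], [], []
for y in ys:
    x_pred = F @ x_f
    P_pred = F @ P_f @ F.T + Q
    K = P_pred @ H.T @ np.linalg.inv(H @ P_pred @ H.T + R)
    x_f = x_pred + K @ (np.array([[y]]) - H @ x_pred)
    P_f = (np.eye(1) - K @ H) @ P_pred
    xp.append(x_pred); Pp.append(P_pred); xf.append(x_f); Pf.append(P_f)

# Backward (RTS) pass: correct each filtered estimate using the future.
xs, Ps = [None] * T, [None] * T
xs[-1], Ps[-1] = xf[-1], Pf[-1]
for k in range(T - 2, -1, -1):
    J = Pf[k] @ F.T @ np.linalg.inv(Pp[k + 1])       # smoother gain
    xs[k] = xf[k] + J @ (xs[k + 1] - xp[k + 1])
    Ps[k] = Pf[k] + J @ (Ps[k + 1] - Pp[k + 1]) @ J.T
```

After the backward pass, every smoothed variance Ps[k] is at most the corresponding filtered variance Pf[k], matching the guarantee stated above.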

Sequential Kalman Filter

Applicable when the measurements are uncorrelated (R diagonal). The components of y_k are then processed one at a time as scalar updates, which avoids inverting a matrix of size n_y \times n_y in the measurement step. This reduces computation and can improve numerical stability. If R is not diagonal, the measurements can first be decorrelated (e.g., via a Cholesky factorization of R).

System Model

Algorithm
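A sketch of the sequential measurement update, assuming diagonal R; the demo system and numbers are made up for illustration:

```python
import numpy as np

def sequential_update(x, P, y, H, R_diag):
    """Process each scalar measurement in turn; assumes R is diagonal.

    x: (n_x,) predicted state, P: (n_x, n_x) predicted covariance,
    y: (n_y,) measurement, H: (n_y, n_x), R_diag: (n_y,) noise variances.
    """
    for i in range(len(y)):
        h = H[i]                       # i-th measurement row
        s = h @ P @ h + R_diag[i]      # scalar innovation variance
        k = P @ h / s                  # gain is just a vector; no matrix inverse
        x = x + k * (y[i] - h @ x)
        P = P - np.outer(k, h @ P)
    return x, P

# Demo: compare against the standard batch update on a small system.
x0 = np.array([1.0, 2.0])
P0 = np.array([[2.0, 0.3], [0.3, 1.0]])
H = np.array([[1.0, 0.0], [0.0, 1.0]])
R_diag = np.array([0.5, 0.5])
y = np.array([1.5, 1.8])
x_seq, P_seq = sequential_update(x0, P0, y, H, R_diag)

S = H @ P0 @ H.T + np.diag(R_diag)
K = P0 @ H.T @ np.linalg.inv(S)
x_batch = x0 + K @ (y - H @ x0)
P_batch = (np.eye(2) - K @ H) @ P0
```

For diagonal R the sequential and batch updates are algebraically identical; the sequential form simply replaces one n_y \times n_y inverse with n_y scalar divisions.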

Extensions: Non-linear Variants

For systems with non-linear dynamics or non-Gaussian noise, the linear Kalman filter may not be optimal.

Extended Kalman Filter (EKF)

Linearizes non-linear transition and observation functions around the current estimate using first-order Taylor expansion (Jacobians). Nearly as efficient as standard KF but can diverge if non-linearity is strong.
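A minimal EKF sketch: linear dynamics with a non-linear range measurement y = \sqrt{p_x^2 + p_y^2} + noise. The system and all numbers are illustrative assumptions:

```python
import numpy as np

# Illustrative EKF: linear dynamics, non-linear range observation.
F = np.eye(2)
Q = 0.01 * np.eye(2)
r = 0.1   # measurement noise variance (scalar measurement)

def h(x):
    return np.sqrt(x[0]**2 + x[1]**2)      # non-linear observation function

def H_jac(x):
    """Jacobian of h: the first-order linearization used in place of H."""
    d = h(x)
    return np.array([[x[0] / d, x[1] / d]])

def ekf_step(x, P, y):
    x_pred = F @ x                          # prediction (linear here)
    P_pred = F @ P @ F.T + Q
    Hk = H_jac(x_pred)                      # linearize around the prediction
    S = Hk @ P_pred @ Hk.T + r
    K = P_pred @ Hk.T / S                   # S is 1x1, so division suffices
    x_new = x_pred + (K * (y - h(x_pred))).ravel()
    P_new = (np.eye(2) - K @ Hk) @ P_pred
    return x_new, P_new

x_est, P = np.array([1.0, 1.0]), np.eye(2)
x_est, P = ekf_step(x_est, P, y=1.6)
```

The only change from the linear filter is that H is replaced by the Jacobian evaluated at the current estimate, which is exactly why a strongly non-linear h (or a poor linearization point) can make the EKF diverge.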

Unscented Kalman Filter (UKF)

Uses a set of carefully chosen sample points (“sigma points”) to capture mean and covariance of non-linear transformations without linearization. More accurate than EKF for moderate non-linearities, slightly higher computational cost.

Particle Filter

Represents the state distribution as a collection of samples and updates via importance weighting. Handles arbitrary non-linear and non-Gaussian systems but requires many more samples (computationally expensive).
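A minimal bootstrap particle filter sketch for a scalar random walk with Gaussian observation noise; the model and parameters are illustrative:

```python
import numpy as np

rng = np.random.default_rng(2)

N = 1000                  # number of particles
q, r = 0.1, 0.5           # process and measurement noise variances

particles = rng.normal(0.0, 1.0, N)   # initial particle cloud

def pf_step(particles, y):
    # Propagate each particle through the (possibly non-linear) dynamics.
    particles = particles + rng.normal(0, np.sqrt(q), N)
    # Importance weights from the Gaussian observation likelihood.
    w = np.exp(-0.5 * (y - particles) ** 2 / r)
    w /= w.sum()
    # Resample to avoid weight degeneracy (multinomial resampling).
    idx = rng.choice(N, size=N, p=w)
    return particles[idx]

# Track a synthetic signal; the posterior mean serves as the point estimate.
x_true, estimates = 0.0, []
for _ in range(30):
    x_true += rng.normal(0, np.sqrt(q))
    y = x_true + rng.normal(0, np.sqrt(r))
    particles = pf_step(particles, y)
    estimates.append(particles.mean())
```

Here the dynamics and likelihood happen to be Gaussian so a Kalman filter would suffice; the point is that both could be replaced by arbitrary non-linear, non-Gaussian models without changing the algorithm's structure.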

Learning Parameters

The parameters \Theta = \{F, G, H, Q, R, E[x_0], \text{Var}[x_0] \} are often unknown and must be estimated from data.

EM Algorithm

The Expectation-Maximization (EM) algorithm is suitable for this:

- E-step: run the Kalman smoother with the current parameters to obtain the expected sufficient statistics of the hidden states.
- M-step: update the parameters in closed form to maximize the expected complete-data log-likelihood.
- Iterate until the log-likelihood converges.

Use EM when you have multiple observation sequences and want principled parameter estimation. For real-time systems, manual tuning of Q and R based on sensor specifications and process knowledge is usually faster.

Numerical Stability

The standard covariance update P_{k|k} = P_{k|k-1} - K_k H P_{k|k-1} involves subtraction and can lose symmetry and positive-definiteness due to rounding errors, especially in high-dimensional systems or long-running filters.

Use the Joseph form covariance update instead: P_{k|k} = (I - K_k H)P_{k|k-1}(I - K_k H)^T + K_k R K_k^T

This form:

- guarantees a symmetric result,
- preserves positive semi-definiteness,
- is valid for any gain K_k, not only the optimal one, at the cost of a few extra matrix multiplications.
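A small sketch comparing the two updates; for the optimal gain they agree algebraically, and the Joseph form is symmetric by construction. The example system is illustrative:

```python
import numpy as np

def joseph_update(P_pred, K, H, R):
    """Joseph-form covariance update: symmetric and PSD-preserving."""
    A = np.eye(P_pred.shape[0]) - K @ H
    return A @ P_pred @ A.T + K @ R @ K.T

def standard_update(P_pred, K, H):
    """Standard update: cheaper, but can lose symmetry to rounding."""
    return P_pred - K @ H @ P_pred

# Illustrative 2-state system with one measurement.
P_pred = np.array([[2.0, 0.5], [0.5, 1.0]])
H = np.array([[1.0, 0.0]])
R = np.array([[0.5]])
K = P_pred @ H.T @ np.linalg.inv(H @ P_pred @ H.T + R)  # optimal gain

P_joseph = joseph_update(P_pred, K, H, R)
P_std = standard_update(P_pred, K, H)
```

In double precision on a toy problem the two results match; the Joseph form earns its extra cost in long-running or high-dimensional filters where rounding errors accumulate.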

When to Use

Prefer the Joseph form in long-running filters, high-dimensional states, or limited-precision arithmetic; the standard update is fine for short, well-conditioned problems.

State Observers

A state observer (state estimator) reconstructs unobserved internal states from input-output measurements. The Kalman filter is the optimal observer for linear systems with Gaussian noise, providing the minimum mean-squared-error state estimate together with an error covariance that quantifies its uncertainty.

State observers are fundamental in control theory and are used to implement state-feedback controllers when full state measurement is unavailable.

References