Particle filtering

In this tutorial we explain how to use ProbNum for particle filtering. We assume that you have read the tutorial on non-linear filtering.

[1]:
import numpy as np
import matplotlib.pyplot as plt

from probnum import filtsmooth, randvars, randprocs
from probnum.problems import TimeSeriesRegressionProblem
[2]:
rng = np.random.default_rng(seed=123)
[3]:
# Make inline plots vector graphics instead of raster graphics
%matplotlib inline
from matplotlib_inline.backend_inline import set_matplotlib_formats

set_matplotlib_formats("pdf", "svg")

# Plotting
plt.style.use("../../probnum.mplstyle")

# Consistent plotting styles for particles and for "true" latent states
particle_style = {
    "color": "C1",
    "marker": "o",
    "markersize": 5,
    "linestyle": "None",
    "alpha": 0.5,
}
latent_state_style = {"color": "C0", "linewidth": 5, "linestyle": "-", "alpha": 0.5}

Pendulum

We begin by setting up the pendulum problem, a standard non-linear test problem. The code below is taken from the non-linear filtering notebook.

First, we assemble a NonlinearGaussian transition as a model of the prior dynamics.

[4]:
state_dim = 2
observation_dim = 1

# approx. gravitational constant
g = 9.81
delta_t = 0.05


def pendulum_rhs(state):
    """Right-hand side of an ODE that defines pendulum dynamics"""
    x1, x2 = state
    y1 = x1 + x2 * delta_t
    y2 = x2 - g * np.sin(x1) * delta_t
    return np.array([y1, y2])


def pendulum_jacobian(state):
    """Jacobian of the pendulum ODE"""
    x1, x2 = state
    dy1_dx = [1.0, delta_t]
    dy2_dx = [-g * np.cos(x1) * delta_t, 1.0]
    return np.array([dy1_dx, dy2_dx])


dynamics_transition_function = lambda t, state: pendulum_rhs(state)
dynamics_transition_jacobian_function = lambda t, state: pendulum_jacobian(state)

dynamics_diffusion_matrix = 1.0 * (
    np.diag(np.array([delta_t ** 3 / 3, delta_t]))
    + np.diag(np.array([delta_t ** 2 / 2]), 1)
    + np.diag(np.array([delta_t ** 2 / 2]), -1)
)

# Create discrete, non-linear Gaussian dynamics model
dynamics_model = randprocs.markov.discrete.NonlinearGaussian(
    input_dim=state_dim,
    output_dim=state_dim,
    transition_fun=dynamics_transition_function,
    noise_fun=lambda t: randvars.Normal(mean=np.zeros(state_dim), cov=dynamics_diffusion_matrix),
    transition_fun_jacobian=dynamics_transition_jacobian_function,
)
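
As a quick sanity check (not part of the original notebook), we can roll out the noise-free part of the transition and confirm that the discretized pendulum oscillates:

# Roll out the deterministic dynamics from a deflected pendulum at rest.
state = np.array([1.0, 0.0])
trajectory = [state]
for _ in range(200):
    state = pendulum_rhs(state)
    trajectory.append(state)
trajectory = np.asarray(trajectory)
# The angle (first column) swings back and forth around zero.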

Next, we set up the measurement model as another discrete, non-linear Gaussian transformation.

[5]:
def pendulum_measurement(state):
    x1, x2 = state
    return np.array([np.sin(x1)])


def pendulum_measurement_jacobian(state):
    x1, x2 = state
    return np.array([[np.cos(x1), 0.0]])


measurement_function = lambda t, state: pendulum_measurement(state)
measurement_jacobian_function = lambda t, state: pendulum_measurement_jacobian(state)

measurement_variance = 0.12 ** 2
measurement_covariance = measurement_variance * np.eye(observation_dim)

# Create discrete, non-linear Gaussian measurement model
measurement_model = randprocs.markov.discrete.NonlinearGaussian(
    input_dim=state_dim,
    output_dim=observation_dim,
    transition_fun=measurement_function,
    noise_fun=lambda t: randvars.Normal(np.zeros(1), cov=measurement_covariance),
    transition_fun_jacobian=measurement_jacobian_function,
)
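
To see what the measurement model does (a hypothetical check, not from the original notebook): an observation is the sine of the pendulum angle plus Gaussian noise, so we can generate one by hand:

# One noisy observation of a deflected pendulum.
test_state = np.array([1.0, 0.0])
noise = rng.normal(scale=np.sqrt(measurement_variance), size=observation_dim)
single_observation = pendulum_measurement(test_state) + noise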

Finally, we define the initial distribution: a Gaussian random variable. The artificial observations are generated further below.

[6]:
mu_0 = np.ones(state_dim)
sigma_0 = measurement_variance * np.eye(state_dim)
initial_state_rv = randvars.Normal(mean=mu_0, cov=sigma_0)
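
ProbNum random variables can be sampled directly; for instance (assuming the .sample(rng=...) interface of probnum.randvars):

# Draw one sample from the initial state distribution; shape (state_dim,).
initial_sample = initial_state_rv.sample(rng=rng)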

Now we’re essentially good to go. With this setup we could do non-linear Gaussian filtering (extended Kalman filtering, unscented Kalman filtering), but here, we avoid the Gaussian approximations and do particle filtering.

To construct a particle filter, we provide the prior process (which bundles the dynamics model and the initial random variable) and a random number generator, as well as

  • The number of particles: num_particles. The larger the better, but also the larger the slower (it is a Monte Carlo approximation, after all).

  • An importance distribution: importance_distribution. ProbNum implements two types of particle filters: a bootstrap particle filter, which uses the prior as importance distribution (this always works, but tends to require many particles; see the sketch after the next cell), and a particle filter that uses an approximate Gaussian filter as importance density. In the pendulum example we do the latter, because the extended Kalman filter has already proven successful on this problem.

[7]:
num_particles = 20
prior_process = randprocs.markov.MarkovSequence(
    transition=dynamics_model, initrv=initial_state_rv, initarg=0.0
)
importance_distribution = filtsmooth.particle.LinearizationImportanceDistribution.from_ekf(
    dynamics_model
)

# Construct the PF
pf = filtsmooth.particle.ParticleFilter(
    prior_process,
    rng=rng,
    importance_distribution=importance_distribution,
    num_particles=num_particles,
)
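
For comparison, the bootstrap variant mentioned above would propose from the prior dynamics instead. A minimal sketch, assuming ProbNum's filtsmooth.particle module also exposes a BootstrapImportanceDistribution with the same constructor pattern (an assumption, not code from the original tutorial):

# Hypothetical bootstrap particle filter: propose from the prior dynamics.
bootstrap_importance = filtsmooth.particle.BootstrapImportanceDistribution(
    dynamics_model
)
bootstrap_pf = filtsmooth.particle.ParticleFilter(
    prior_process,
    rng=rng,
    importance_distribution=bootstrap_importance,
    num_particles=10 * num_particles,  # bootstrap filters typically need more
)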

Generate Data for the State-Space Model

randprocs.markov.utils.generate_artificial_measurements() is used to sample both latent states and noisy observations from the specified state-space model.

[8]:
time_grid = np.arange(0.0, 3.5, step=delta_t)
latent_states, observations = randprocs.markov.utils.generate_artificial_measurements(
    rng=rng,
    prior_process=prior_process,
    measmod=measurement_model,
    times=time_grid,
)
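
The returned arrays stack latent states and observations along the time grid. As a quick shape check (assuming the shapes follow the state and observation dimensions defined above):

assert latent_states.shape == (len(time_grid), state_dim)
assert observations.shape == (len(time_grid), observation_dim)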
[9]:
regression_problem = TimeSeriesRegressionProblem(
    observations=observations,
    locations=time_grid,
    measurement_models=[measurement_model] * len(time_grid),
)

The rest works via the same .filter() interface that we know from Gaussian filtering. After calling it, we plot the mode of the posterior distribution.

[10]:
posterior, _ = pf.filter(regression_problem)
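
Each state of the posterior is a categorical distribution over particles, so its mode is the particle carrying the largest weight. A small illustration (assuming each state exposes .support and .probabilities, as ProbNum's Categorical random variables do):

# The mode of a particle approximation is its highest-weight particle.
final_state = posterior.states[-1]
heaviest = final_state.support[np.argmax(final_state.probabilities)]
assert np.allclose(heaviest, final_state.mode)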
[11]:
plt.plot(
    time_grid, np.sin(latent_states[:, 0]), **latent_state_style, label="Latent state"
)
plt.plot(
    time_grid,
    np.sin(posterior.states.mode[:, 0]),
    **particle_style,
    label="Mode of posterior"
)
plt.xlabel("t")
plt.ylabel(r"$x$")
plt.legend()

plt.show()
../../_images/tutorials_filtsmooth_particle_filtering_18_0.svg

It seems that the true latent state is recovered fairly well. To quantify this, we compute the root-mean-square error (RMSE) of the posterior mode and compare it with the RMSE of the raw observations:

[12]:
num_points = len(time_grid)
rmse_mode = np.linalg.norm(
    np.sin(posterior.states.mode[:, 0]) - np.sin(latent_states[:, 0])
) / np.sqrt(num_points)
# Note: observations has shape (len(time_grid), 1), so we take column 0
# to avoid accidental broadcasting against the (len(time_grid),) state array.
rmse_data = np.linalg.norm(
    observations[:, 0] - np.sin(latent_states[:, 0])
) / np.sqrt(num_points)
print(f"Mode of PF: \t{rmse_mode}\nObservations: \t{rmse_data}")

The strength of a particle filter is not the point estimate, but the posterior distribution. Let us look at a few more particles. Since we want to exclude the unlikely states from the visualization, we resample the posterior before plotting. This way, only those states that have sufficient probability to be resampled are shown.
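
Conceptually, resampling draws particles with replacement according to their weights, so low-weight particles tend to vanish while high-weight particles are duplicated. A minimal standalone sketch of multinomial resampling (a hypothetical helper, not ProbNum's internal implementation):

def multinomial_resample(support, probabilities, rng):
    """Draw len(support) particles with replacement, weighted by probabilities."""
    indices = rng.choice(len(support), size=len(support), p=probabilities)
    return support[indices]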

[13]:
# Remove the unlikely particles by resampling the posterior
resampled_states = posterior.states.resample(rng=rng)

plt.plot(
    time_grid, np.sin(latent_states[:, 0]), **latent_state_style, label="Latent state"
)
plt.plot(
    time_grid,
    np.sin(resampled_states.support[:, :, 0]),
    **particle_style,
    label="Particles"
)
plt.xlabel("t")
plt.ylabel(r"$x$")

# Remove duplicate legend entries before showing the legend
handles, labels = plt.gca().get_legend_handles_labels()
by_label = dict(zip(labels, handles))
plt.legend(by_label.values(), by_label.keys())

plt.show()
../../_images/tutorials_filtsmooth_particle_filtering_22_0.svg

It seems that the distribution nicely concentrates around the true state.