tfp.substrates.numpy.sts.impute_missing_values

Runs posterior inference to impute the missing values in a time series.

This method computes the posterior marginals p(latent state | observations), given the time series at observed timesteps (a missingness mask should be specified using tfp.sts.MaskedTimeSeries). It pushes this posterior back through the observation model to impute a predictive distribution on the observed time series. At unobserved steps, this is an imputed value; at other steps it is interpreted as the model's estimate of the underlying noise-free series.
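The two steps described above (posterior marginals over the latent state, then a push through the observation model) can be sketched for the simplest possible case. The following is a toy NumPy illustration, not TFP's implementation: it assumes a scalar local-level model with a single fixed parameter setting, and the function name `impute_local_level` and all of its arguments are hypothetical. NaN entries are treated as missing, mirroring the convention described for observed_time_series.

```python
import numpy as np

def impute_local_level(y, level_scale, noise_scale, prior_mean=0.0,
                       prior_scale=10.0, include_observation_noise=False):
    """Toy Kalman filter + RTS smoother for a local-level model (hypothetical helper).

    Model: level[t] = level[t-1] + N(0, level_scale^2); y[t] = level[t] + N(0, noise_scale^2).
    NaN entries in `y` are treated as missing. Returns per-timestep mean and stddev.
    """
    y = np.asarray(y, dtype=float)
    n = y.shape[0]
    q, r = level_scale ** 2, noise_scale ** 2

    pred_mean, pred_var = np.empty(n), np.empty(n)
    filt_mean, filt_var = np.empty(n), np.empty(n)

    m, v = prior_mean, prior_scale ** 2  # prior on the level at the first timestep
    for t in range(n):
        if t > 0:
            v = v + q  # random-walk transition: mean unchanged, variance grows
        pred_mean[t], pred_var[t] = m, v
        if not np.isnan(y[t]):  # skip the measurement update at missing steps
            gain = v / (v + r)
            m = m + gain * (y[t] - m)
            v = (1.0 - gain) * v
        filt_mean[t], filt_var[t] = m, v

    # Backward (Rauch-Tung-Striebel) pass: p(level[t] | all observations).
    smooth_mean, smooth_var = filt_mean.copy(), filt_var.copy()
    for t in range(n - 2, -1, -1):
        c = filt_var[t] / pred_var[t + 1]
        smooth_mean[t] = filt_mean[t] + c * (smooth_mean[t + 1] - pred_mean[t + 1])
        smooth_var[t] = filt_var[t] + c ** 2 * (smooth_var[t + 1] - pred_var[t + 1])

    # Observation model is y = level + noise, so the mean passes through unchanged;
    # the noise variance is added only when requested.
    out_var = smooth_var + (r if include_observation_noise else 0.0)
    return smooth_mean, np.sqrt(out_var)

series = [-1., 1., np.nan, 2.4, np.nan, 5.]
mean, stddev = impute_local_level(series, level_scale=1.0, noise_scale=0.5)
_, noisy_stddev = impute_local_level(
    series, level_scale=1.0, noise_scale=0.5, include_observation_noise=True)
```

In this sketch the uncertainty is larger at the two missing steps than at their observed neighbors, and setting include_observation_noise=True adds noise_scale squared to every variance. The real method additionally averages over posterior parameter samples rather than using one fixed setting.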

Args

model tfp.sts.Sum instance defining an additive STS model.
observed_time_series float Tensor of shape concat([sample_shape, model.batch_shape, [num_timesteps, 1]]) where sample_shape corresponds to i.i.d. observations, and the trailing [1] dimension may (optionally) be omitted if num_timesteps > 1. Any NaNs are interpreted as missing observations; missingness may also be explicitly specified by passing a tfp.sts.MaskedTimeSeries instance.
parameter_samples Python list of Tensors representing posterior samples of model parameters, with shapes [concat([ [num_posterior_draws], param.prior.batch_shape, param.prior.event_shape]) for param in model.parameters]. This may optionally also be a map (Python dict) of parameter names to Tensor values.
include_observation_noise If False, the imputed uncertainties represent the model's estimate of the noise-free time series at each timestep. If True, they represent the model's estimate of the range of values that could be observed at each timestep, including any i.i.d. observation noise. Default value: False.
timesteps_are_event_shape Deprecated, for backwards compatibility only. If False, the predictive distribution will return per-timestep probabilities. Default value: True.
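The difference between a joint probability over all timesteps and per-timestep probabilities can be illustrated in plain NumPy. This is a hedged sketch rather than the library's code: it assumes an equally weighted mixture whose components are independent Normals at each timestep, and the helper names `normal_log_prob` and `log_mean_exp`, as well as the numbers, are made up for illustration.

```python
import numpy as np

def normal_log_prob(x, loc, scale):
    """Elementwise log density of Normal(loc, scale)."""
    return -0.5 * ((x - loc) / scale) ** 2 - np.log(scale) - 0.5 * np.log(2.0 * np.pi)

def log_mean_exp(values, axis=0):
    """Numerically stable log(mean(exp(values))) along `axis`."""
    values = np.asarray(values, dtype=float)
    peak = values.max(axis=axis, keepdims=True)
    return np.squeeze(peak, axis=axis) + np.log(np.mean(np.exp(values - peak), axis=axis))

# Hypothetical per-draw marginals: 2 posterior draws, 3 timesteps.
locs = np.array([[0.0, 1.0, 2.0],
                 [2.0, 1.0, 0.0]])
scales = np.ones((2, 3))
x = np.array([0.0, 1.0, 2.0])

log_probs = normal_log_prob(x, locs, scales)  # shape [num_draws, num_timesteps]

# Timesteps as event shape: each component scores the whole series,
# then the mixture averages over draws. Result is a scalar.
joint_log_prob = log_mean_exp(log_probs.sum(axis=-1), axis=0)

# Timesteps as batch shape: the mixture averages over draws separately
# at each timestep. Result has shape [num_timesteps].
per_step_log_prob = log_mean_exp(log_probs, axis=0)
```

With more than one draw, the joint log probability generally differs from the sum of the per-timestep values, because the joint form keeps each draw's timesteps tied to the same mixture component.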

Returns

imputed_series_dist a tfd.MixtureSameFamily instance with event shape [num_timesteps] if timesteps_are_event_shape else [] and batch shape concat([sample_shape, model.batch_shape, [] if timesteps_are_event_shape else [num_timesteps]]), with num_posterior_draws mixture components.
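Because the result is a mixture with one component per posterior draw, its mean and standard deviation combine the per-draw moments. The sketch below shows that arithmetic in NumPy via the law of total variance. It assumes the components are equally weighted, which this page does not state explicitly; the function name `mixture_mean_and_stddev` and the input values are hypothetical.

```python
import numpy as np

def mixture_mean_and_stddev(component_means, component_stddevs):
    """Moments of an equally weighted mixture, with components along axis 0.

    Both inputs have shape [num_posterior_draws, num_timesteps].
    """
    component_means = np.asarray(component_means, dtype=float)
    component_stddevs = np.asarray(component_stddevs, dtype=float)
    mean = component_means.mean(axis=0)
    # Law of total variance: average within-draw variance
    # plus the variance of the per-draw means.
    variance = (component_stddevs ** 2).mean(axis=0) + component_means.var(axis=0)
    return mean, np.sqrt(variance)

# Hypothetical per-draw marginals: 2 posterior draws, 3 timesteps.
means = np.array([[0.0, 1.0, 2.0],
                  [2.0, 1.0, 0.0]])
stddevs = np.ones((2, 3))
mix_mean, mix_stddev = mixture_mean_and_stddev(means, stddevs)
# mix_mean is [1, 1, 1]; mix_stddev is [sqrt(2), 1, sqrt(2)]
```

Disagreement between posterior draws therefore widens the reported stddev even when every individual draw is confident.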

Example

To specify a time series with missing values, use tfp.sts.MaskedTimeSeries:

import numpy as np
import tensorflow as tf
import tensorflow_probability as tfp

time_series_with_nans = [-1., 1., np.nan, 2.4, np.nan, 5.]
observed_time_series = tfp.sts.MaskedTimeSeries(
  time_series=time_series_with_nans,
  is_missing=tf.math.is_nan(time_series_with_nans))

Masked time series can be passed to tfp.sts methods in place of an observed_time_series Tensor:

# Build model using observed time series to set heuristic priors.
linear_trend_model = tfp.sts.LocalLinearTrend(
  observed_time_series=observed_time_series)
model = tfp.sts.Sum([linear_trend_model],
                    observed_time_series=observed_time_series)

# Fit model to data
parameter_samples, _ = tfp.sts.fit_with_hmc(model, observed_time_series)

After fitting a model, impute_missing_values will return a distribution over the imputed series:

# Impute missing values
imputed_series_distribution = tfp.sts.impute_missing_values(
  model, observed_time_series, parameter_samples=parameter_samples)
print('imputed means and stddevs: ',
      imputed_series_distribution.mean(),
      imputed_series_distribution.stddev())
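A common follow-up is to build a completed series that keeps the original observations and substitutes the imputed means only where values were missing. The sketch below shows one way to do that with NumPy. The helper `fill_missing` is hypothetical, and the mean values are placeholders standing in for the output of imputed_series_distribution.mean(); they are not results from a fitted model.

```python
import numpy as np

def fill_missing(observed, imputed_mean):
    """Keeps observed values; substitutes imputed means where `observed` is NaN."""
    observed = np.asarray(observed, dtype=float)
    imputed_mean = np.asarray(imputed_mean, dtype=float)
    return np.where(np.isnan(observed), imputed_mean, observed)

observed = [-1., 1., np.nan, 2.4, np.nan, 5.]
# Placeholder numbers standing in for imputed_series_distribution.mean();
# these are NOT output from a fitted model.
placeholder_means = [-0.9, 0.9, 1.7, 2.4, 3.6, 4.8]
filled = fill_missing(observed, placeholder_means)
# filled is [-1., 1., 1.7, 2.4, 3.6, 5.]
```

Keeping the observed values is a design choice: at observed steps the returned distribution describes the model's estimate of the noise-free series, so replacing observations with those means would smooth the data rather than only fill the gaps.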