Named tuple encoding a time series Tensor and optional missingness mask.
tfp.substrates.numpy.sts.MaskedTimeSeries(time_series, is_missing)
Structural time series models handle missing values naturally, following the rules of conditional probability. Posterior inference can be used to impute missing values, with uncertainties. Forecasting and posterior decomposition are also supported for time series with missing values; the missing values will generally lead to correspondingly higher forecast uncertainty.
All methods in the tfp.sts API that accept an observed_time_series Tensor should optionally also accept a MaskedTimeSeries instance.
The time series should be a float Tensor of shape [..., num_timesteps] or [..., num_timesteps, 1]. The is_missing mask must be either a boolean Tensor of shape [..., num_timesteps], or None. True values in is_missing denote missing (masked) observations; False denotes observed (unmasked) values. Note that these semantics are the opposite of low-level TensorFlow methods like tf.boolean_mask, but consistent with the behavior of NumPy masked arrays.
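The mask convention can be checked in plain NumPy. The following is a minimal sketch, not part of the TFP API: the array values are made up for illustration, and ordinary boolean indexing stands in for tf.boolean_mask, which likewise keeps the entries where the mask is True.

```python
import numpy as np

series = np.array([-1.0, 1.0, 0.5, 2.4, 3.1, 5.0])
# True marks a MISSING observation, following the MaskedTimeSeries convention.
is_missing = np.array([False, False, True, False, True, False])

# NumPy masked arrays use the same convention: mask=True hides the entry.
masked = np.ma.masked_array(series, mask=is_missing)
observed_mean = masked.mean()  # averages only the four unmasked values

# Boolean indexing KEEPS entries where the mask is True, so the mask has to
# be inverted to select the observed values.
observed_values = series[~is_missing]
```

Passing is_missing directly to a "keep where True" operation would therefore select exactly the values that are supposed to be ignored.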
The batch dimensions of is_missing must broadcast with the batch dimensions of time_series.
A MaskedTimeSeries is just a collections.namedtuple instance, i.e., a dumb container. Although the convention for the elements is as described here, it's left to downstream methods to validate or convert the elements as required. In particular, most downstream methods will call tf.convert_to_tensor on the components. To prevent duplicate Tensor creation, you may (if memory is an issue) wish to ensure that the components are already Tensors, as opposed to NumPy arrays or similar.
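To illustrate what "dumb container" means in practice, here is a small sketch using a hypothetical stand-in namedtuple (MaskedSeries, defined below) with the same two field names. It is not the TFP class, and np.asarray is used only as an analogue of the conversion that downstream methods perform.

```python
import collections
import numpy as np

# Hypothetical stand-in with the same two fields; NOT the TFP class itself.
MaskedSeries = collections.namedtuple(
    'MaskedSeries', ['time_series', 'is_missing'])

# A namedtuple stores whatever it is given: no conversion, no validation.
# Here the mask is even the wrong length, and nothing complains.
raw = MaskedSeries(time_series=[1.0, 2.0, 3.0], is_missing=[False, True])

# Converting once, up front, lets later code reuse the same arrays.
converted = MaskedSeries(
    time_series=np.asarray([1.0, 2.0, 3.0], dtype=np.float32),
    is_missing=np.asarray([False, True, False], dtype=bool))

# Converting an existing array again returns the same object, not a copy.
again = np.asarray(converted.time_series)

# The container unpacks like any other tuple.
series, mask = converted
```

Because nothing is checked at construction time, shape or dtype mistakes only surface later, inside whichever method consumes the tuple.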
To construct a simple MaskedTimeSeries instance:
    observed_time_series = tfp.sts.MaskedTimeSeries(
        time_series=tf.random.stateless_normal([3, 4, 5]),
        is_missing=[True, False, False, True, False])
Note that the mask we specified will broadcast against the batch dimensions of the time series.
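The effect of that broadcast can be sketched with plain NumPy, assuming standard NumPy broadcasting rules; the zero-filled array is a placeholder for real data and no TFP code is involved.

```python
import numpy as np

time_series = np.zeros([3, 4, 5])  # batch shape [3, 4], num_timesteps = 5
is_missing = np.array([True, False, False, True, False])  # shape [5]

# Broadcasting aligns trailing dimensions, so the same mask is applied to
# each of the 3 * 4 series in the batch.
full_mask = np.broadcast_to(is_missing, time_series.shape)

# Two missing timesteps in each of the 12 series.
num_missing = int(full_mask.sum())
```

A mask of shape [4, 5] or [3, 1, 5] would broadcast in the same way, letting the missingness pattern vary along some batch dimensions but not others.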
For time series with missing entries specified as NaN 'magic values', you can generate a mask using tf.math.is_nan:
    import numpy as np
    from tensorflow_probability.python.internal.backend import numpy as tf
    import tensorflow_probability as tfp; tfp = tfp.substrates.numpy

    time_series_with_nans = [-1., 1., np.nan, 2.4, np.nan, 5]
    observed_time_series = tfp.sts.MaskedTimeSeries(
        time_series=time_series_with_nans,
        is_missing=tf.math.is_nan(time_series_with_nans))

    # Build model using observed time series to set heuristic priors.
    linear_trend_model = tfp.sts.LocalLinearTrend(
        observed_time_series=observed_time_series)
    model = tfp.sts.Sum([linear_trend_model],
                        observed_time_series=observed_time_series)

    # Fit model to data
    parameter_samples, _ = tfp.sts.fit_with_hmc(model, observed_time_series)

    # Forecast, using the fitted parameter samples
    forecast_dist = tfp.sts.forecast(
        model, observed_time_series, parameter_samples, num_steps_forecast=5)

    # Impute missing values, using the fitted parameter samples
    observations_dist = tfp.sts.impute_missing_values(
        model, observed_time_series, parameter_samples)
    print('imputed means and stddevs: ',
          observations_dist.mean(),
          observations_dist.stddev())
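When the series carries the trailing unit dimension ([..., num_timesteps, 1]), a NaN-derived mask has one dimension too many, since the mask is expected to have shape [..., num_timesteps]. The sketch below shows one way to handle this in plain NumPy, with made-up values and np.isnan standing in for the TensorFlow-style call.

```python
import numpy as np

# Batch of 2 series, 4 timesteps, trailing unit dimension: shape [2, 4, 1].
series = np.array([[[0.1], [np.nan], [0.3], [0.4]],
                   [[np.nan], [1.2], [1.3], [np.nan]]])

# np.isnan preserves the input shape, so index away the trailing dimension
# to obtain a mask of shape [2, 4].
is_missing = np.isnan(series)[..., 0]

# Number of observed (unmasked) timesteps in each series of the batch.
observed_counts = (~is_missing).sum(axis=-1)
```

The resulting series and is_missing arrays match the shape convention described above and could be passed as the two fields of a MaskedTimeSeries.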