# tfp.sts.AutoregressiveIntegratedMovingAverage

Represents an autoregressive integrated moving-average (ARIMA) model.

Inherits From: `StructuralTimeSeries`

An autoregressive moving-average (ARMA) process is defined by the recursion

```
level[t + 1] = (
  level_drift
  + noise[t + 1]
  + sum(ar_coefficients * level[t : t - order : -1])
  + sum(ma_coefficients * noise[t : t - order : -1]))
noise[t + 1] ~ Normal(0., scale=level_scale)
```

where `noise` is an iid noise process. An integrated ([ARIMA](
https://en.wikipedia.org/wiki/Autoregressive_integrated_moving_average))
process corresponds to an ARMA model of the
`integration_degree`th-order differences of a sequence, or equivalently,
taking `integration_degree` cumulative sums of an underlying ARMA process.
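
As an illustrative sketch (not TFP's implementation), the recursion and the integration step can be written in plain Python. The helper names `simulate_arma`, `integrate`, and `difference` are hypothetical, and starting from an all-zero history is an assumption made for simplicity.

```python
import random


def simulate_arma(ar_coefficients, ma_coefficients, level_drift=0.0,
                  level_scale=1.0, num_timesteps=50, seed=0):
    """Simulates the ARMA recursion, assuming an all-zero pre-history."""
    rng = random.Random(seed)
    order = max(len(ar_coefficients), len(ma_coefficients))
    # Imagined history of `order` steps before the first step, newest last.
    levels = [0.0] * order
    noise = [0.0] * order
    for _ in range(num_timesteps):
        new_noise = rng.gauss(0.0, level_scale)
        # Coefficient k multiplies the value k steps back (k = 0 is time t).
        ar_term = sum(c * levels[-1 - k] for k, c in enumerate(ar_coefficients))
        ma_term = sum(c * noise[-1 - k] for k, c in enumerate(ma_coefficients))
        levels.append(level_drift + new_noise + ar_term + ma_term)
        noise.append(new_noise)
    return levels[order:]


def integrate(series, integration_degree):
    """Applies `integration_degree` cumulative sums to `series`."""
    for _ in range(integration_degree):
        total, summed = 0.0, []
        for value in series:
            total += value
            summed.append(total)
        series = summed
    return series


def difference(series, integration_degree):
    """Takes differences, treating the value before the first step as zero."""
    for _ in range(integration_degree):
        series = [series[0]] + [b - a for a, b in zip(series, series[1:])]
    return series


arma = simulate_arma([0.5, -0.2], [0.3], num_timesteps=20)
arima = integrate(arma, integration_degree=2)
recovered = difference(arima, integration_degree=2)
```

Differencing the integrated series `integration_degree` times recovers the underlying ARMA series (up to floating-point error), which is the equivalence described above.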

<!-- Tabular view -->
<table class="responsive fixed orange">
<colgroup><col width="214px"><col></colgroup>
<tr><th colspan="2"><h2 class="add-link">Args</h2></th></tr>

<tr>
<td>
`ar_order`<a id="ar_order"></a>
</td>
<td>
scalar Python positive `int` specifying the order of the
autoregressive process (`p` in `ARIMA(p, d, q)`).
</td>
</tr><tr>
<td>
`ma_order`<a id="ma_order"></a>
</td>
<td>
scalar Python positive `int` specifying the order of the
moving-average process (`q` in `ARIMA(p, d, q)`).
</td>
</tr><tr>
<td>
`integration_degree`<a id="integration_degree"></a>
</td>
<td>
scalar Python non-negative `int` specifying the number
of times to integrate an ARMA process (`d` in `ARIMA(p, d, q)`).
Default value: `0`.
</td>
</tr><tr>
<td>
`ar_coefficients_prior`<a id="ar_coefficients_prior"></a>
</td>
<td>
optional `tfd.Distribution` instance specifying a
prior on the `ar_coefficients` parameter. If `None`, a default standard
normal (`tfd.MultivariateNormalDiag(scale_diag=tf.ones([ar_order]))`)
prior is used.
Default value: `None`.
</td>
</tr><tr>
<td>
`ma_coefficients_prior`<a id="ma_coefficients_prior"></a>
</td>
<td>
optional `tfd.Distribution` instance specifying a
prior on the `ma_coefficients` parameter. If `None`, a default standard
normal (`tfd.MultivariateNormalDiag(scale_diag=tf.ones([ma_order]))`)
prior is used.
Default value: `None`.
</td>
</tr><tr>
<td>
`level_drift_prior`<a id="level_drift_prior"></a>
</td>
<td>
optional `tfd.Distribution` instance specifying a prior
on the `level_drift` parameter. If `None`, the parameter is not inferred
and is instead fixed to zero.
Default value: `None`.
</td>
</tr><tr>
<td>
`level_scale_prior`<a id="level_scale_prior"></a>
</td>
<td>
optional `tfd.Distribution` instance specifying a prior
on the `level_scale` parameter. If `None`, a heuristic default prior is
constructed based on the provided `observed_time_series`.
Default value: `None`.
</td>
</tr><tr>
<td>
`initial_state_prior`<a id="initial_state_prior"></a>
</td>
<td>
optional `tfd.Distribution` instance specifying a
prior on the initial state, corresponding to the values of the process
at a set of size `order` of imagined timesteps before the initial step.
If `None`, a heuristic default prior is constructed based on the
provided `observed_time_series`.
Default value: `None`.
</td>
</tr><tr>
<td>
`ar_coefficient_constraining_bijector`<a id="ar_coefficient_constraining_bijector"></a>
</td>
<td>
optional `tfb.Bijector` instance
representing a constraining mapping for the autoregressive coefficients.
For example, `tfb.Tanh()` constrains the coefficients to lie in
`(-1, 1)`, while `tfb.Softplus()` constrains them to be positive, and
`tfb.Identity()` implies no constraint. If `None`, the default behavior
constrains the coefficients to lie in `(-1, 1)` using a `Tanh` bijector.
Default value: `None`.
</td>
</tr><tr>
<td>
`ma_coefficient_constraining_bijector`<a id="ma_coefficient_constraining_bijector"></a>
</td>
<td>
optional `tfb.Bijector` instance
representing a constraining mapping for the moving average coefficients.
For example, `tfb.Tanh()` constrains the coefficients to lie in
`(-1, 1)`, while `tfb.Softplus()` constrains them to be positive, and
`tfb.Identity()` implies no constraint. If `None`, the default behavior
is to apply no constraint.
Default value: `None`.
</td>
</tr><tr>
<td>
`observed_time_series`<a id="observed_time_series"></a>
</td>
<td>
optional `float` `Tensor` of shape
`batch_shape + [T, 1]` (omitting the trailing unit dimension is also
supported when `T > 1`), specifying an observed time series. Any `NaN`s
are interpreted as missing observations; missingness may also be
explicitly specified by passing a <a href="../../tfp/sts/MaskedTimeSeries"><code>tfp.sts.MaskedTimeSeries</code></a> instance.
Any priors not explicitly set will be given default values according to
the scale of the observed time series (or batch of time series).
Default value: `None`.
</td>
</tr><tr>
<td>
`name`<a id="name"></a>
</td>
<td>
the name of this model component.
Default value: 'ARIMA'.
</td>
</tr>
</table>
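
To make the effect of the constraining bijectors concrete, here is a minimal pure-Python sketch of the three mappings mentioned for `ar_coefficient_constraining_bijector` and `ma_coefficient_constraining_bijector`. The function names are hypothetical stand-ins for the forward transforms of `tfb.Tanh()`, `tfb.Softplus()`, and `tfb.Identity()`; this is not TFP code.

```python
import math


def tanh_constrain(x):
    """Maps any real number into the open interval (-1, 1)."""
    return math.tanh(x)


def softplus_constrain(x):
    """Maps any real number to a positive value, log(1 + exp(x)), stably."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def identity_constrain(x):
    """Applies no constraint."""
    return x


# Hypothetical unconstrained values, as an optimizer or sampler might propose.
unconstrained = [-3.0, -0.5, 0.0, 0.5, 3.0]

tanh_values = [tanh_constrain(x) for x in unconstrained]
softplus_values = [softplus_constrain(x) for x in unconstrained]
identity_values = [identity_constrain(x) for x in unconstrained]
```

Inference operates on the unconstrained values, and the bijector ensures that the coefficients seen by the model always lie in the intended range.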

<!-- Tabular view -->
<table class="responsive fixed orange">
<colgroup><col width="214px"><col></colgroup>
<tr><th colspan="2"><h2 class="add-link">Attributes</h2></th></tr>

<tr>
<td>
`batch_shape`<a id="batch_shape"></a>
</td>
<td>
Static batch shape of models represented by this component.
</td>
</tr><tr>
<td>
`init_parameters`<a id="init_parameters"></a>
</td>
<td>
Parameters used to instantiate this `StructuralTimeSeries`.
</td>
</tr><tr>
<td>
`initial_state_prior`<a id="initial_state_prior"></a>
</td>
<td>

</td>
</tr><tr>
<td>
`integration_degree`<a id="integration_degree"></a>
</td>
<td>

</td>
</tr><tr>
<td>
`latent_size`<a id="latent_size"></a>
</td>
<td>
Python `int` dimensionality of the latent space in this model.
</td>
</tr><tr>
<td>
`name`<a id="name"></a>
</td>
<td>
Name of this model component.
</td>
</tr><tr>
<td>
`parameters`<a id="parameters"></a>
</td>
<td>
List of Parameter(name, prior, bijector) namedtuples for this model.
</td>
</tr>
</table>

## Methods

<h3 id="batch_shape_tensor"><code>batch_shape_tensor</code></h3>

<a target="_blank" class="external" href="https://github.com/tensorflow/probability/blob/v0.17.0/tensorflow_probability/python/sts/structural_time_series.py#L114-L127">View source</a>

<pre class="devsite-click-to-copy prettyprint lang-py tfo-signature-link">
<code>batch_shape_tensor()
</code></pre>

Runtime batch shape of models represented by this component.

<!-- Tabular view -->
<table class="responsive fixed orange">
<colgroup><col width="214px"><col></colgroup>
<tr><th colspan="2">Returns</th></tr>

<tr>
<td>
`batch_shape`
</td>
<td>
`int` `Tensor` giving the broadcast batch shape of
all model parameters. This should match the batch shape of
derived state space models, i.e.,
`self.make_state_space_model(...).batch_shape_tensor()`.
</td>
</tr>
</table>
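
The broadcast batch shape can be pictured with the usual NumPy-style broadcasting rules. The following is a minimal pure-Python sketch; `broadcast_shapes` and the example parameter batch shapes are hypothetical and not part of TFP.

```python
def broadcast_shapes(*shapes):
    """Returns the broadcast of several shapes, aligning from the right."""
    result = []
    for shape in shapes:
        shape = list(shape)
        rank = max(len(result), len(shape))
        # Left-pad the shorter shape with 1s so both have equal rank.
        left = [1] * (rank - len(result)) + result
        right = [1] * (rank - len(shape)) + shape
        merged = []
        for a, b in zip(left, right):
            if a != b and a != 1 and b != 1:
                raise ValueError(f'Incompatible dimensions: {a} and {b}')
            merged.append(b if a == 1 else a)
        result = merged
    return tuple(result)


# Hypothetical batch shapes of the priors on three parameters.
parameter_batch_shapes = {
    'ar_coefficients': (3, 1),
    'level_scale': (2,),
    'level_drift': (),
}
batch_shape = broadcast_shapes(*parameter_batch_shapes.values())
```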

<h3 id="copy"><code>copy</code></h3>

<a target="_blank" class="external" href="https://github.com/tensorflow/probability/blob/v0.17.0/tensorflow_probability/python/sts/structural_time_series.py#L172-L187">View source</a>

<pre class="devsite-click-to-copy prettyprint lang-py tfo-signature-link">
<code>copy(
**override_parameters_kwargs
)
</code></pre>

Creates a deep copy.

Note: the copy distribution may continue to depend on the original
initialization arguments.

<!-- Tabular view -->
<table class="responsive fixed orange">
<colgroup><col width="214px"><col></colgroup>
<tr><th colspan="2">Args</th></tr>

<tr>
<td>
`**override_parameters_kwargs`
</td>
<td>
String/value dictionary of initialization
arguments to override with new values.
</td>
</tr>
</table>

<!-- Tabular view -->
<table class="responsive fixed orange">
<colgroup><col width="214px"><col></colgroup>
<tr><th colspan="2">Returns</th></tr>

<tr>
<td>
`copy`
</td>
<td>
A new instance of `type(self)` initialized from the union
of self.init_parameters and override_parameters_kwargs, i.e.,
`dict(self.init_parameters, **override_parameters_kwargs)`.
</td>
</tr>
</table>
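
The override semantics can be sketched with a small stand-in class. `ToyComponent` is hypothetical and only mimics the documented behavior of rebuilding an instance from `dict(self.init_parameters, **override_parameters_kwargs)`; it is not the TFP implementation.

```python
class ToyComponent:
    """Hypothetical component that records its initialization arguments."""

    def __init__(self, ar_order, ma_order, integration_degree=0, name='ARIMA'):
        self.init_parameters = dict(
            ar_order=ar_order,
            ma_order=ma_order,
            integration_degree=integration_degree,
            name=name)

    def copy(self, **override_parameters_kwargs):
        # Later keys win, so overrides replace the recorded arguments.
        merged = dict(self.init_parameters, **override_parameters_kwargs)
        return type(self)(**merged)


original = ToyComponent(ar_order=2, ma_order=1)
modified = original.copy(integration_degree=1, name='ARIMA_d1')
```

Arguments that are not overridden are carried over unchanged, and the original instance is left untouched.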

<h3 id="joint_distribution"><code>joint_distribution</code></h3>

<a target="_blank" class="external" href="https://github.com/tensorflow/probability/blob/v0.17.0/tensorflow_probability/python/sts/structural_time_series.py#L256-L397">View source</a>

<pre class="devsite-click-to-copy prettyprint lang-py tfo-signature-link">
<code>joint_distribution(
observed_time_series=None,
num_timesteps=None,
trajectories_shape=(),
initial_step=0,
mask=None,
experimental_parallelize=False
)
</code></pre>

Constructs the joint distribution over parameters and observed values.

<!-- Tabular view -->
<table class="responsive fixed orange">
<colgroup><col width="214px"><col></colgroup>
<tr><th colspan="2">Args</th></tr>

<tr>
<td>
`observed_time_series`
</td>
<td>
Optional observed time series to model, as a
`Tensor` or <a href="../../tfp/sts/MaskedTimeSeries"><code>tfp.sts.MaskedTimeSeries</code></a> instance having shape
`concat([batch_shape, trajectories_shape, num_timesteps, 1])`. If
an observed time series is provided, the `num_timesteps`,
`trajectories_shape`, and `mask` arguments are ignored, and
an unnormalized (pinned) distribution over parameter values is returned.
Default value: `None`.
</td>
</tr><tr>
<td>
`num_timesteps`
</td>
<td>
scalar `int` `Tensor` number of timesteps to model. This
must be specified either directly or by passing an
`observed_time_series`.
Default value: `None`.
</td>
</tr><tr>
<td>
`trajectories_shape`
</td>
<td>
`int` `Tensor` shape of sampled trajectories
for each set of parameter values. Ignored if an `observed_time_series`
is passed.
Default value: `()`.
</td>
</tr><tr>
<td>
`initial_step`
</td>
<td>
Optional scalar `int` `Tensor` specifying the starting
timestep.
Default value: `0`.
</td>
</tr><tr>
<td>
`mask`
</td>
<td>
Optional `bool` `Tensor` having shape
`concat([batch_shape, trajectories_shape, num_timesteps])`, in which
`True` entries indicate that the series value at the corresponding step
is missing and should be ignored. This argument should be passed only
if `observed_time_series` is not specified or does not already contain
a missingness mask; it is an error to pass both this
argument and an `observed_time_series` value containing a missingness
mask.
Default value: `None`.
</td>
</tr><tr>
<td>
`experimental_parallelize`
</td>
<td>
If `True`, use parallel message passing
algorithms from <a href="../../tfp/experimental/parallel_filter"><code>tfp.experimental.parallel_filter</code></a> to perform time
series operations in `O(log num_timesteps)` sequential steps. The
overall FLOP and memory cost may be larger than for the sequential
implementations by a constant factor.
Default value: `False`.
</td>
</tr>
</table>
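
As a small illustration of the `mask` convention (`True` means missing), the sketch below derives a mask from `NaN` entries in a plain Python list. The series values are made up for illustration, and real inputs would be `Tensor`s rather than lists.

```python
import math

# A hypothetical series with two missing observations encoded as NaN.
series = [1.2, float('nan'), 0.7, float('nan'), 1.9]

# True marks a missing step, matching the convention of the `mask` argument.
mask = [math.isnan(value) for value in series]

# The steps that remain once the missing ones are ignored.
observed_values = [v for v, is_missing in zip(series, mask) if not is_missing]
```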

<!-- Tabular view -->
<table class="responsive fixed orange">
<colgroup><col width="214px"><col></colgroup>
<tr><th colspan="2">Returns</th></tr>

<tr>
<td>
`joint_distribution`
</td>
<td>
joint distribution of model parameters and
observed trajectories. If no `observed_time_series` was specified, this
is an instance of `tfd.JointDistributionNamedAutoBatched` with a
random variable for each model parameter (with names and order matching
`self.parameters`), plus a final random variable `observed_time_series`
representing a trajectory(ies) conditioned on the parameters. If
`observed_time_series` was specified, the return value is given by
`joint_distribution.experimental_pin(
observed_time_series=observed_time_series)` where `joint_distribution`
is as just described, so it defines an unnormalized posterior
distribution over the parameters.
</td>
</tr>
</table>

#### Example:

The joint distribution can generate prior samples of parameters and
trajectories:

```python
from matplotlib import pylab as plt
import tensorflow as tf
import tensorflow_probability as tfp

# Sample and plot 100 trajectories from the prior.
model = tfp.sts.LocalLinearTrend()
prior_samples = model.joint_distribution(num_timesteps=200).sample([100])
plt.plot(
    tf.linalg.matrix_transpose(prior_samples['observed_time_series'][..., 0]))
```

It also integrates with TFP inference APIs, providing a more flexible alternative to the STS-specific fitting utilities.

```python
jd = model.joint_distribution(observed_time_series)

# Variational inference.
surrogate_posterior = (
    tfp.experimental.vi.build_factored_surrogate_posterior(
        event_shape=jd.event_shape,
        bijector=jd.experimental_default_event_space_bijector()))
losses = tfp.vi.fit_surrogate_posterior(
    target_log_prob_fn=jd.unnormalized_log_prob,
    surrogate_posterior=surrogate_posterior,
    optimizer=tf.optimizers.Adam(0.1),
    num_steps=200)
parameter_samples = surrogate_posterior.sample(50)

# No U-Turn Sampler.
samples, kernel_results = tfp.experimental.mcmc.windowed_adaptive_nuts(
    n_draws=500, joint_dist=jd)
```

### `joint_log_prob`

View source

Build the joint density `log p(params) + log p(y|params)` as a callable. (deprecated)

<!-- Tabular view -->
<table class="responsive fixed orange">
<colgroup><col width="214px"><col></colgroup>
<tr><th colspan="2">Args</th></tr>

<tr>
<td>
`observed_time_series`
</td>
<td>
Observed `Tensor` trajectories of shape
`sample_shape + batch_shape + [num_timesteps, 1]` (the trailing `1`
dimension is optional if `num_timesteps > 1`), where `batch_shape` should
match `self.batch_shape` (the broadcast batch shape of all priors on
parameters for this structural time series model). Any `NaN`s are
interpreted as missing observations; missingness may also be explicitly
specified by passing a `tfp.sts.MaskedTimeSeries` instance.
</td>
</tr>
</table>

<!-- Tabular view -->
<table class="responsive fixed orange">
<colgroup><col width="214px"><col></colgroup>
<tr><th colspan="2">Returns</th></tr>

<tr>
<td>
`log_joint_fn`
</td>
<td>
A function taking a `Tensor` argument for each model parameter, in
canonical order, and returning a `Tensor` log probability of shape
`batch_shape`. Note that, unlike `tfp.distributions` `log_prob` methods,
the `log_joint` sums over the `sample_shape` from `y`, so that
`sample_shape` does not appear in the output log_prob. This corresponds to
viewing multiple samples in `y` as iid observations from a single model,
which is typically the desired behavior for parameter inference.
</td>
</tr>
</table>
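
The summing behavior can be illustrated with a toy model that has a single scalar parameter and scalar observations. This is a deliberately simplified sketch rather than a time series model; `normal_log_prob`, `make_log_joint_fn`, and the standard normal prior are assumptions chosen for illustration.

```python
import math


def normal_log_prob(x, loc, scale):
    """Log density of Normal(loc, scale) evaluated at x."""
    z = (x - loc) / scale
    return -0.5 * z * z - math.log(scale) - 0.5 * math.log(2.0 * math.pi)


def make_log_joint_fn(observations):
    """Builds log p(loc) + sum_i log p(y_i | loc) for iid observations y_i."""

    def log_joint_fn(loc):
        log_prior = normal_log_prob(loc, 0.0, 1.0)
        # Summing over the observations removes the sample dimension.
        log_likelihood = sum(normal_log_prob(y, loc, 1.0) for y in observations)
        return log_prior + log_likelihood

    return log_joint_fn


# Three iid observations yield one scalar log probability, not three.
log_joint_fn = make_log_joint_fn([0.1, -0.3, 0.4])
value = log_joint_fn(0.2)
```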

### `make_state_space_model`

View source

Instantiate this model as a Distribution over specified `num_timesteps`.

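<!-- Tabular view -->
<table class="responsive fixed orange">
<colgroup><col width="214px"><col></colgroup>
<tr><th colspan="2">Args</th></tr>

<tr>
<td>
`num_timesteps`
</td>
<td>
Python `int` number of timesteps to model.
</td>
</tr><tr>
<td>
`param_vals`
</td>
<td>
a list of `Tensor` parameter values in order corresponding to
`self.parameters`, or a dict mapping from parameter names to values.
</td>
</tr><tr>
<td>
`initial_state_prior`
</td>
<td>
an optional `Distribution` instance overriding the default prior on the
model's initial state. This is used in forecasting ("today's prior is
yesterday's posterior").
</td>
</tr><tr>
<td>
`initial_step`
</td>
<td>
optional `int` specifying the initial timestep to model. This is relevant
when the model contains time-varying components, e.g., holidays or
seasonality.
</td>
</tr><tr>
<td>
`**linear_gaussian_ssm_kwargs`
</td>
<td>
Optional additional keyword arguments to the base
`tfd.LinearGaussianStateSpaceModel` constructor.
</td>
</tr>
</table>

<!-- Tabular view -->
<table class="responsive fixed orange">
<colgroup><col width="214px"><col></colgroup>
<tr><th colspan="2">Returns</th></tr>

<tr>
<td>
`dist`
</td>
<td>
a `LinearGaussianStateSpaceModel` Distribution object.
</td>
</tr>
</table>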

### `prior_sample`

View source

Sample from the joint prior over model parameters and trajectories. (deprecated)

<!-- Tabular view -->
<table class="responsive fixed orange">
<colgroup><col width="214px"><col></colgroup>
<tr><th colspan="2">Args</th></tr>

<tr>
<td>
`num_timesteps`
</td>
<td>
Scalar `int` `Tensor` number of timesteps to model.
</td>
</tr><tr>
<td>
`initial_step`
</td>
<td>
Optional scalar `int` `Tensor` specifying the starting timestep.
Default value: `0`.
</td>
</tr><tr>
<td>
`params_sample_shape`
</td>
<td>
Number of possible worlds to sample iid from the parameter prior, or more
generally, `Tensor` `int` shape to fill with iid samples.
Default value: `[]` (i.e., draw a single sample and don't expand the
shape).
</td>
</tr><tr>
<td>
`trajectories_sample_shape`
</td>
<td>
For each sampled set of parameters, number of trajectories to sample, or
more generally, `Tensor` `int` shape to fill with iid samples.
Default value: `[]` (i.e., draw a single sample and don't expand the
shape).
</td>
</tr><tr>
<td>
`seed`
</td>
<td>
PRNG seed; see `tfp.random.sanitize_seed` for details.
Default value: `None`.
</td>
</tr>
</table>

<!-- Tabular view -->
<table class="responsive fixed orange">
<colgroup><col width="214px"><col></colgroup>
<tr><th colspan="2">Returns</th></tr>

<tr>
<td>
`trajectories`
</td>
<td>
`float` `Tensor` of shape
`trajectories_sample_shape + params_sample_shape + [num_timesteps, 1]`
containing all sampled trajectories.
</td>
</tr><tr>
<td>
`param_samples`
</td>
<td>
list of sampled parameter value `Tensor`s, in order corresponding to
`self.parameters`, each of shape
`params_sample_shape + prior.batch_shape + prior.event_shape`.
</td>
</tr>
</table>
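
The shape arithmetic for the return values can be checked with plain list concatenation. The helper `prior_sample_shapes` is hypothetical and only reproduces the two shape formulas stated for `trajectories` and `param_samples`; it accepts shapes as Python lists and does not call TFP.

```python
def prior_sample_shapes(num_timesteps,
                        params_sample_shape=(),
                        trajectories_sample_shape=(),
                        prior_batch_shape=(),
                        prior_event_shape=()):
    """Returns the documented shapes of `trajectories` and one parameter."""
    trajectories_shape = (list(trajectories_sample_shape)
                          + list(params_sample_shape)
                          + [num_timesteps, 1])
    param_shape = (list(params_sample_shape)
                   + list(prior_batch_shape)
                   + list(prior_event_shape))
    return trajectories_shape, param_shape


# 10 parameter draws, 5 trajectories per draw, 200 timesteps, and a
# parameter whose prior has event shape [2] (for example, two coefficients).
trajectories_shape, param_shape = prior_sample_shapes(
    num_timesteps=200,
    params_sample_shape=[10],
    trajectories_sample_shape=[5],
    prior_event_shape=[2])
```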

### `__add__`

View source

Models the sum of the series from the two components.
