tf.contrib.timeseries.ARModel

Auto-regressive model, both linear and non-linear.

Features to the model include time and values of input_window_size timesteps, and times for output_window_size timesteps. These are passed through a configurable prediction model, and then fed to a loss function (e.g. squared loss).

Note that this class can also be used to regress against time only by setting the input_window_size to zero.
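
As a rough illustration of the windowing described above, the sketch below splits one window of a toy series into the pieces the model would see: times and values for the first input_window_size steps, and times only for the following output_window_size steps. The helper `split_window` and the toy data are hypothetical and are not part of tf.contrib.timeseries.

```python
def split_window(times, values, input_window_size, output_window_size):
    """Split one window into model inputs and prediction targets (illustrative only)."""
    window_size = input_window_size + output_window_size
    if len(times) != window_size or len(values) != window_size:
        raise ValueError("expected exactly input + output window steps")
    return {
        "input_times": times[:input_window_size],
        "input_values": values[:input_window_size],
        "output_times": times[input_window_size:],
        "targets": values[input_window_size:],
    }


times = [0, 1, 2, 3, 4]
values = [1.0, 1.5, 2.0, 2.5, 3.0]

autoregressive = split_window(times, values, input_window_size=3, output_window_size=2)
# With input_window_size=0 there are no past values: regression against time only.
time_only = split_window(times[:2], values[:2], input_window_size=0, output_window_size=2)
```

In the time-only case the input values are empty, so predictions can depend only on the times in the output window.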

Each periodicity in the periodicities arg is divided into num_time_buckets time buckets, each of which is represented as a feature added to the model.

A good heuristic for picking an appropriate periodicity for a given data set is the length of the cycles in the data. For example, energy usage in a home is typically cyclic each day. If the time feature in a home energy usage dataset is in units of hours, then 24 would be an appropriate periodicity. Similarly, a good heuristic for num_time_buckets is how often the data is expected to change within the cycle. For the aforementioned home energy usage dataset with a periodicity of 24, 48 would be a reasonable value if usage is expected to change every half hour.
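
Both heuristics reduce to simple divisions. The sketch below works through the home energy example; the helper names `pick_periodicity` and `pick_num_time_buckets` are made up for illustration and do not exist in the library.

```python
def pick_periodicity(cycle_length, time_unit):
    """Cycle length expressed in units of the time feature."""
    return cycle_length / time_unit


def pick_num_time_buckets(cycle_length, change_interval):
    """Number of intervals per cycle in which the data is expected to change."""
    return round(cycle_length / change_interval)


# Daily cycle (24 hours), time feature in hours, usage changing every half hour.
periodicity = pick_periodicity(cycle_length=24.0, time_unit=1.0)
num_time_buckets = pick_num_time_buckets(cycle_length=24.0, change_interval=0.5)
```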

Each feature's value for a given example with time t is the difference between t and the start of the time bucket it falls under. If it doesn't fall under a feature's associated time bucket, then that feature's value is zero.

For example: if periodicities = (9, 12) and num_time_buckets = 3, then 6 features would be added to the model, 3 for periodicity 9 and 3 for periodicity 12.

For an example data point where t = 17:

  • It's in the 3rd time bucket for periodicity 9 (2nd period is 9-18 and 3rd time bucket is 15-18)
  • It's in the 2nd time bucket for periodicity 12 (2nd period is 12-24 and 2nd time bucket is between 16-20).

Therefore the 6 added features for this row with t = 17 would be:

Feature name (periodicity#_timebucket#), feature value

  • P9_T1, 0 # not in first time bucket
  • P9_T2, 0 # not in second time bucket
  • P9_T3, 2 # 17 - 15 since 15 is the start of the 3rd time bucket
  • P12_T1, 0 # not in first time bucket
  • P12_T2, 1 # 17 - 16 since 16 is the start of the 2nd time bucket
  • P12_T3, 0 # not in third time bucket
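
The bucketing rule above can be sketched in a few lines of plain Python. This is only an illustration of the scheme as described here, reproducing the t = 17 example; the function `time_bucket_features` is hypothetical, and the library's actual implementation may compute these features differently.

```python
def time_bucket_features(t, periodicities, num_time_buckets):
    """Return {feature_name: value} following the bucketing scheme described above."""
    features = {}
    for periodicity in periodicities:
        bucket_width = periodicity / num_time_buckets
        position = t % periodicity  # position of t within its current period
        active_bucket = min(int(position // bucket_width), num_time_buckets - 1)
        for bucket in range(num_time_buckets):
            name = "P{}_T{}".format(periodicity, bucket + 1)
            if bucket == active_bucket:
                # Offset of t from the start of the bucket it falls in.
                features[name] = position - bucket * bucket_width
            else:
                features[name] = 0.0
    return features


features = time_bucket_features(17, periodicities=(9, 12), num_time_buckets=3)
```

Only one feature per periodicity is non-zero for any given t, so the number of added features is always the number of periodicities multiplied by num_time_buckets.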

<!-- Tabular view -->
 <table class="responsive fixed orange">
<colgroup><col width="214px"><col></colgroup>
<tr><th colspan="2"><h2 class="add-link">Args</h2></th></tr>

<tr>
<td>
`periodicities`
</td>
<td>
Periodicities of the input data, in the same units as the time
feature (for example 24 if feeding hourly data with a daily periodicity, or
60 * 24 if feeding minute-level data with daily periodicity). Note this can
be a single value or a list of values for multiple periodicities.
</td>
</tr><tr>
<td>
`input_window_size`
</td>
<td>
Number of past time steps of data to look at when doing
the regression.
</td>
</tr><tr>
<td>
`output_window_size`
</td>
<td>
Number of future time steps to predict. Note that
setting it to > 1 empirically seems to give a better fit.
</td>
</tr><tr>
<td>
`num_features`
</td>
<td>
Number of input features per time step.
</td>
</tr><tr>
<td>
`prediction_model_factory`
</td>
<td>
A callable taking arguments `num_features`,
`input_window_size`, and `output_window_size` and returning a
`tf.keras.Model`. The `Model`'s `call()` takes two arguments: an input
window and an output window, and returns a dictionary of predictions.
See `FlatPredictionModel` for an example. Example usage:

```python
import functools
# Pass the resulting callable as the prediction_model_factory argument.
prediction_model_factory = functools.partial(
    FlatPredictionModel, hidden_layer_sizes=[10, 10])
```

The default model computes predictions as a linear function of flattened
input and output windows.
</td>
</tr><tr>
<td>
`num_time_buckets`
</td>
<td>
Number of buckets into which to divide (time %
periodicity). This value multiplied by the number of periodicities is
the number of time features added to the model.
</td>
</tr><tr>
<td>
`loss`
</td>
<td>
Loss function to use for training. Currently supported values are
SQUARED_LOSS and NORMAL_LIKELIHOOD_LOSS. Note that for
NORMAL_LIKELIHOOD_LOSS, we train the covariance term as well. For
SQUARED_LOSS, the evaluation loss is reported based on un-scaled
observations and predictions, while the training loss is computed on
normalized data (if input statistics are available).
</td>
</tr><tr>
<td>
`exogenous_feature_columns`
</td>
<td>
A list of <a href="../../../tf/feature_column"><code>tf.feature_column</code></a>s (for example
<a href="../../../tf/feature_column/embedding_column"><code>tf.feature_column.embedding_column</code></a>) corresponding to
features which provide extra information to the model but are not part
of the series to be predicted.
</td>
</tr>
</table>





<!-- Tabular view -->
 <table class="responsive fixed orange">
<colgroup><col width="214px"><col></colgroup>
<tr><th colspan="2"><h2 class="add-link">Attributes</h2></th></tr>

<tr>
<td>
`exogenous_feature_columns`
</td>
<td>
`tf.feature_column`s for features which are not predicted.
</td>
</tr>
</table>



## Methods

<h3 id="define_loss"><code>define_loss</code></h3>

<a target="_blank" href="https://github.com/tensorflow/tensorflow/blob/v1.15.0/tensorflow/contrib/timeseries/python/timeseries/model.py#L172-L203">View source</a>

<pre class="devsite-click-to-copy prettyprint lang-py tfo-signature-link">
<code>define_loss(
    features, mode
)
</code></pre>

Default loss definition with state replicated across a batch.

Time series passed to this model have a batch dimension, and each series in
a batch can be operated on in parallel. This loss definition assumes that
each element of the batch represents an independent sample conditioned on
the same initial state (i.e. it is simply replicated across the batch). A
batch size of one provides sequential operations on a single time series.

More complex processing may operate instead on get_start_state() and
get_batch_loss() directly.

<!-- Tabular view -->
 <table class="responsive fixed orange">
<colgroup><col width="214px"><col></colgroup>
<tr><th colspan="2">Args</th></tr>

<tr>
<td>
`features`
</td>
<td>
A dictionary (such as is produced by a chunker) with at minimum
the following key/value pairs (others corresponding to the
`exogenous_feature_columns` argument to `__init__` may be included
representing exogenous regressors):
TrainEvalFeatures.TIMES: A [batch size x window size] integer Tensor
with times for each observation. If there is no artificial chunking,
the window size is simply the length of the time series.
TrainEvalFeatures.VALUES: A [batch size x window size x num features]
Tensor with values for each observation.
</td>
</tr><tr>
<td>
`mode`
</td>
<td>
The tf.estimator.ModeKeys mode to use (TRAIN, EVAL). For INFER,
see predict().
</td>
</tr>
</table>
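
To make the expected shapes concrete, here is a minimal sketch of a `features` dictionary for a batch of two series with a window of four steps and one feature per step. It assumes the string values `"times"` and `"values"` for `TrainEvalFeatures.TIMES` and `TrainEvalFeatures.VALUES`, and uses nested lists in place of Tensors; the `shape` helper is hypothetical.

```python
# Assumed string values of TrainEvalFeatures.TIMES / TrainEvalFeatures.VALUES.
TIMES = "times"
VALUES = "values"

features = {
    # [batch size x window size] integer times
    TIMES: [[0, 1, 2, 3], [4, 5, 6, 7]],
    # [batch size x window size x num features] observed values
    VALUES: [[[0.1], [0.2], [0.3], [0.4]], [[0.5], [0.6], [0.7], [0.8]]],
}


def shape(nested):
    """Shape of a rectangular nested list."""
    dims = []
    while isinstance(nested, list):
        dims.append(len(nested))
        nested = nested[0]
    return dims


times_shape = shape(features[TIMES])
values_shape = shape(features[VALUES])
```

In real use the dictionary would hold Tensors (or arrays fed through an input function) rather than Python lists.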



<!-- Tabular view -->
 <table class="responsive fixed orange">
<colgroup><col width="214px"><col></colgroup>
<tr><th colspan="2">Returns</th></tr>
<tr class="alt">
<td colspan="2">
A ModelOutputs object.
</td>
</tr>

</table>



<h3 id="generate"><code>generate</code></h3>

<a target="_blank" href="https://github.com/tensorflow/tensorflow/blob/v1.15.0/tensorflow/contrib/timeseries/python/timeseries/ar_model.py#L335-L337">View source</a>

<pre class="devsite-click-to-copy prettyprint lang-py tfo-signature-link">
<code>generate(
    number_of_series, series_length, model_parameters=None, seed=None
)
</code></pre>

Sample synthetic data from model parameters, with optional substitutions.

Returns `number_of_series` possible sequences of future values, sampled from
the generative model with each conditioned on the previous. Samples are
based on trained parameters, except for those parameters explicitly
overridden in `model_parameters`.

For distributions over future observations, see predict().

<!-- Tabular view -->
 <table class="responsive fixed orange">
<colgroup><col width="214px"><col></colgroup>
<tr><th colspan="2">Args</th></tr>

<tr>
<td>
`number_of_series`
</td>
<td>
Number of time series to create.
</td>
</tr><tr>
<td>
`series_length`
</td>
<td>
Length of each time series.
</td>
</tr><tr>
<td>
`model_parameters`
</td>
<td>
A dictionary mapping model parameters to values, which
replace trained parameters when generating data.
</td>
</tr><tr>
<td>
`seed`
</td>
<td>
If specified, return deterministic time series according to this
value.
</td>
</tr>
</table>



<!-- Tabular view -->
 <table class="responsive fixed orange">
<colgroup><col width="214px"><col></colgroup>
<tr><th colspan="2">Returns</th></tr>
<tr class="alt">
<td colspan="2">
A dictionary with keys TrainEvalFeatures.TIMES (mapping to an array with
shape [number_of_series, series_length]) and TrainEvalFeatures.VALUES
(mapping to an array with shape [number_of_series, series_length,
num_features]).
</td>
</tr>

</table>



<h3 id="get_batch_loss"><code>get_batch_loss</code></h3>

<a target="_blank" href="https://github.com/tensorflow/tensorflow/blob/v1.15.0/tensorflow/contrib/timeseries/python/timeseries/ar_model.py#L726-L882">View source</a>

<pre class="devsite-click-to-copy prettyprint lang-py tfo-signature-link">
<code>get_batch_loss(
    features, mode, state
)
</code></pre>

Computes predictions and a loss.


<!-- Tabular view -->
 <table class="responsive fixed orange">
<colgroup><col width="214px"><col></colgroup>
<tr><th colspan="2">Args</th></tr>

<tr>
<td>
`features`
</td>
<td>
A dictionary (such as is produced by a chunker) with the
following key/value pairs (shapes are given as required for training):
TrainEvalFeatures.TIMES: A [batch size, self.window_size] integer
Tensor with times for each observation. To train on longer
sequences, the data should first be chunked.
TrainEvalFeatures.VALUES: A [batch size, self.window_size,
self.num_features] Tensor with values for each observation.
When evaluating, `TIMES` and `VALUES` must have a window size of at
least self.window_size, but it may be longer, in which case the last
window_size - self.input_window_size times (or fewer if this is not
divisible by self.output_window_size) will be evaluated on with
non-overlapping output windows (and will have associated
predictions). This is primarily to support qualitative
evaluation/plotting, and is not a recommended way to compute evaluation
losses (since there is no overlap in the output windows, which for
window-based models is an undesirable bias).
</td>
</tr><tr>
<td>
`mode`
</td>
<td>
The tf.estimator.ModeKeys mode to use (TRAIN or EVAL).
</td>
</tr><tr>
<td>
`state`
</td>
<td>
Unused
</td>
</tr>
</table>



<!-- Tabular view -->
 <table class="responsive fixed orange">
<colgroup><col width="214px"><col></colgroup>
<tr><th colspan="2">Returns</th></tr>
<tr class="alt">
<td colspan="2">
A model.ModelOutputs object.
</td>
</tr>

</table>



<!-- Tabular view -->
 <table class="responsive fixed orange">
<colgroup><col width="214px"><col></colgroup>
<tr><th colspan="2">Raises</th></tr>

<tr>
<td>
`ValueError`
</td>
<td>
If `mode` is not TRAIN or EVAL, or if static shape information
is incorrect.
</td>
</tr>
</table>
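
The evaluation behavior described for `features` can be sketched as follows. The example assumes `self.window_size` equals `input_window_size + output_window_size` and works out which times would receive predictions when a longer series is passed in `EVAL` mode; `evaluated_times` is a hypothetical helper that follows the prose above, not the library's code.

```python
def evaluated_times(times, input_window_size, output_window_size):
    """Times that receive predictions during evaluation, per the description above."""
    window_size = input_window_size + output_window_size
    if len(times) < window_size:
        raise ValueError("need at least input_window_size + output_window_size times")
    num_windows = (len(times) - input_window_size) // output_window_size
    num_evaluated = num_windows * output_window_size
    tail = times[len(times) - num_evaluated:]
    # Non-overlapping output windows covering the tail of the series.
    return [tail[i:i + output_window_size]
            for i in range(0, num_evaluated, output_window_size)]


windows = evaluated_times(list(range(10)), input_window_size=3, output_window_size=2)
```

With ten times, seven remain after the first input window, but only six are evaluated because seven is not divisible by the output window size of two.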



<h3 id="get_start_state"><code>get_start_state</code></h3>

<a target="_blank" href="https://github.com/tensorflow/tensorflow/blob/v1.15.0/tensorflow/contrib/timeseries/python/timeseries/ar_model.py#L320-L329">View source</a>

<pre class="devsite-click-to-copy prettyprint lang-py tfo-signature-link">
<code>get_start_state()
</code></pre>

Returns a tuple of state for the start of the time series.

For example, a mean and covariance. State should not have a batch
dimension, and will often be TensorFlow Variables to be learned along with
the rest of the model parameters.

<h3 id="initialize_graph"><code>initialize_graph</code></h3>

<a target="_blank" href="https://github.com/tensorflow/tensorflow/blob/v1.15.0/tensorflow/contrib/timeseries/python/timeseries/ar_model.py#L307-L318">View source</a>

<pre class="devsite-click-to-copy prettyprint lang-py tfo-signature-link">
<code>initialize_graph(
    input_statistics=None
)
</code></pre>

Define ops for the model, not depending on any previously defined ops.


<!-- Tabular view -->
 <table class="responsive fixed orange">
<colgroup><col width="214px"><col></colgroup>
<tr><th colspan="2">Args</th></tr>

<tr>
<td>
`input_statistics`
</td>
<td>
A math_utils.InputStatistics object containing input
statistics. If None, data-independent defaults are used, which may
result in longer or unstable training.
</td>
</tr>
</table>



<h3 id="loss_op"><code>loss_op</code></h3>

<a target="_blank" href="https://github.com/tensorflow/tensorflow/blob/v1.15.0/tensorflow/contrib/timeseries/python/timeseries/ar_model.py#L458-L472">View source</a>

<pre class="devsite-click-to-copy prettyprint lang-py tfo-signature-link">
<code>loss_op(
    targets, prediction_ops
)
</code></pre>

Creates the loss op comparing `targets` against the predictions in `prediction_ops`.
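
For intuition, the two supported loss types can be written out in plain Python. This is a sketch of the standard formulas for a squared loss and a normal negative log-likelihood averaged over observations; the function names are hypothetical, and the library's exact reduction, scaling, and numerical safeguards may differ.

```python
import math


def squared_loss(targets, means):
    """Mean squared difference between predictions and targets."""
    return sum((m - t) ** 2 for t, m in zip(targets, means)) / len(targets)


def normal_likelihood_loss(targets, means, variances):
    """Mean negative log-likelihood under independent normal distributions."""
    total = 0.0
    for t, m, v in zip(targets, means, variances):
        total += 0.5 * math.log(2.0 * math.pi * v) + (t - m) ** 2 / (2.0 * v)
    return total / len(targets)


targets = [1.0, 2.0, 3.0]
means = [1.0, 2.5, 2.0]
variances = [1.0, 1.0, 1.0]

sq = squared_loss(targets, means)
nll = normal_likelihood_loss(targets, means, variances)
```

With unit variances the likelihood loss is the squared loss halved plus a constant, which is why the likelihood loss only differs meaningfully once the covariance term is trained.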


<h3 id="predict"><code>predict</code></h3>

<a target="_blank" href="https://github.com/tensorflow/tensorflow/blob/v1.15.0/tensorflow/contrib/timeseries/python/timeseries/ar_model.py#L488-L664">View source</a>

<pre class="devsite-click-to-copy prettyprint lang-py tfo-signature-link">
<code>predict(
    features
)
</code></pre>

Computes predictions multiple steps into the future.


<!-- Tabular view -->
 <table class="responsive fixed orange">
<colgroup><col width="214px"><col></colgroup>
<tr><th colspan="2">Args</th></tr>

<tr>
<td>
`features`
</td>
<td>
A dictionary with the following key/value pairs:
PredictionFeatures.TIMES: A [batch size, predict window size]
integer Tensor of times, after the window of data indicated by
`STATE_TUPLE`, to make predictions for.
PredictionFeatures.STATE_TUPLE: A tuple of (times, values), times with
shape [batch size, self.input_window_size], values with shape [batch
size, self.input_window_size, self.num_features] representing a
segment of the time series before `TIMES`. This data is used
to start the autoregressive computation. This should have data for
at least self.input_window_size timesteps.
And any exogenous features, with shapes prefixed by shape of `TIMES`.
</td>
</tr>
</table>



<!-- Tabular view -->
 <table class="responsive fixed orange">
<colgroup><col width="214px"><col></colgroup>
<tr><th colspan="2">Returns</th></tr>
<tr class="alt">
<td colspan="2">
A dictionary with keys "mean" and "covariance". The
values are Tensors of shape [batch_size, predict window size,
num_features] and correspond to the values passed in `TIMES`.
</td>
</tr>

</table>
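
Multi-step prediction with a window-based model is typically done by feeding predictions back in as inputs. The sketch below shows that general idea on a single univariate series, starting from the values in a state window; `predict_autoregressively` and `toy_model` are hypothetical stand-ins, and this is not a description of how the library implements `predict`.

```python
def predict_autoregressively(state_values, num_steps, input_window_size,
                             output_window_size, step_model):
    """Roll a window-based model forward, feeding predictions back as inputs."""
    history = list(state_values)
    predictions = []
    while len(predictions) < num_steps:
        window = history[-input_window_size:]
        chunk = step_model(window, output_window_size)
        predictions.extend(chunk)
        history.extend(chunk)
    return predictions[:num_steps]


def toy_model(window, output_window_size):
    """Hypothetical stand-in model: continue the last observed difference."""
    slope = window[-1] - window[-2]
    return [window[-1] + slope * (i + 1) for i in range(output_window_size)]


predicted = predict_autoregressively(
    state_values=[1.0, 2.0, 3.0], num_steps=5,
    input_window_size=3, output_window_size=2, step_model=toy_model)
```

The toy model needs at least two values in its window, so this sketch assumes an input window size of two or more.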



<h3 id="prediction_ops"><code>prediction_ops</code></h3>

<a target="_blank" href="https://github.com/tensorflow/tensorflow/blob/v1.15.0/tensorflow/contrib/timeseries/python/timeseries/ar_model.py#L371-L446">View source</a>

<pre class="devsite-click-to-copy prettyprint lang-py tfo-signature-link">
<code>prediction_ops(
    times, values, exogenous_regressors
)
</code></pre>

Compute model predictions given input data.


<!-- Tabular view -->
 <table class="responsive fixed orange">
<colgroup><col width="214px"><col></colgroup>
<tr><th colspan="2">Args</th></tr>

<tr>
<td>
`times`
</td>
<td>
A [batch size, self.window_size] integer Tensor, the first
self.input_window_size times in each part of the batch indicating
input features, and the last self.output_window_size times indicating
prediction times.
</td>
</tr><tr>
<td>
`values`
</td>
<td>
A [batch size, self.input_window_size, self.num_features] Tensor
with input features.
</td>
</tr><tr>
<td>
`exogenous_regressors`
</td>
<td>
A [batch size, self.window_size,
self.exogenous_size] Tensor with exogenous features.
</td>
</tr>
</table>



<!-- Tabular view -->
 <table class="responsive fixed orange">
<colgroup><col width="214px"><col></colgroup>
<tr><th colspan="2">Returns</th></tr>
<tr class="alt">
<td colspan="2">
Tuple (predicted_mean, predicted_covariance), where each element is a
Tensor with shape [batch size, self.output_window_size,
self.num_features].
</td>
</tr>

</table>



<h3 id="random_model_parameters"><code>random_model_parameters</code></h3>

<a target="_blank" href="https://github.com/tensorflow/tensorflow/blob/v1.15.0/tensorflow/contrib/timeseries/python/timeseries/ar_model.py#L332-L333">View source</a>

<pre class="devsite-click-to-copy prettyprint lang-py tfo-signature-link">
<code>random_model_parameters(
    seed=None
)
</code></pre>






## Class Variables

* `NORMAL_LIKELIHOOD_LOSS = 'normal_likelihood_loss'` <a id="NORMAL_LIKELIHOOD_LOSS"></a>
* `SQUARED_LOSS = 'squared_loss'` <a id="SQUARED_LOSS"></a>