tfp.substrates.numpy.sts.fit_with_hmc

Draw posterior samples using Hamiltonian Monte Carlo (HMC).

View aliases

Main aliases

tfp.experimental.substrates.numpy.sts.fit_with_hmc

tfp.substrates.numpy.sts.fit_with_hmc(
    model,
    observed_time_series,
    num_results=100,
    num_warmup_steps=50,
    num_leapfrog_steps=15,
    initial_state=None,
    initial_step_size=None,
    chain_batch_shape=(),
    num_variational_steps=150,
    variational_optimizer=None,
    variational_sample_size=5,
    seed=None,
    name=None
)

Markov chain Monte Carlo (MCMC) methods are considered the gold standard of Bayesian inference; under suitable conditions and in the limit of infinitely many draws they generate samples from the true posterior distribution. HMC [1] uses gradients of the model's log-density function to propose samples, allowing it to exploit posterior geometry. However, it is computationally more expensive than variational inference and relatively sensitive to tuning.

This method attempts to provide a sensible default approach for fitting StructuralTimeSeries models using HMC. It first runs variational inference as a fast posterior approximation, and initializes the HMC sampler from the variational posterior, using the posterior standard deviations to set per-variable step sizes (equivalently, a diagonal mass matrix). During the warmup phase, it adapts the step size to target an acceptance rate of 0.75, which is thought to be in the desirable range for optimal mixing [2].

Args
`model`	An instance of `StructuralTimeSeries` representing a time-series model. This represents a joint distribution over time-series and their parameters with batch shape `[b1, ..., bN]`.
`observed_time_series`	`float` `Tensor` of shape `concat([sample_shape, model.batch_shape, [num_timesteps, 1]])` where `sample_shape` corresponds to i.i.d. observations, and the trailing `[1]` dimension may (optionally) be omitted if `num_timesteps > 1`. Any `NaN`s are interpreted as missing observations; missingness may be also be explicitly specified by passing a `tfp.sts.MaskedTimeSeries` instance.
`num_results`	Integer number of Markov chain draws. Default value: `100`.
`num_warmup_steps`	Integer number of steps to take before starting to collect results. The warmup steps are also used to adapt the step size towards a target acceptance rate of 0.75. Default value: `50`.
`num_leapfrog_steps`	Integer number of steps to run the leapfrog integrator for. Total progress per HMC step is roughly proportional to `step_size * num_leapfrog_steps`. Default value: `15`.
`initial_state`	Optional Python `list` of `Tensor`s, one for each model parameter, representing the initial state(s) of the Markov chain(s). These should have shape `concat([chain_batch_shape, param.prior.batch_shape, param.prior.event_shape])`. If `None`, the initial state is set automatically using a sample from a variational posterior. Default value: `None`.
`initial_step_size`	Python `list` of `Tensor`s, one for each model parameter, representing the step size for the leapfrog integrator. Must broadcast with the shape of `initial_state`. Larger step sizes lead to faster progress, but too-large step sizes make rejection exponentially more likely. If `None`, the step size is set automatically using the standard deviation of a variational posterior. Default value: `None`.
`chain_batch_shape`	Batch shape (Python `tuple`, `list`, or `int`) of chains to run in parallel. Default value: `[]` (i.e., a single chain).
`num_variational_steps`	Python `int` number of steps to run the variational optimization to determine the initial state and step sizes. Default value: `150`.
`variational_optimizer`	Optional `tf.train.Optimizer` instance to use in the variational optimization. If `None`, defaults to `tf.train.AdamOptimizer(0.1)`. Default value: `None`.
`variational_sample_size`	Python `int` number of Monte Carlo samples to use in estimating the variational divergence. Larger values may stabilize the optimization, but at higher cost per step in time and memory. Default value: `1`.
`seed`	PRNG seed; see `tfp.random.sanitize_seed` for details.
`name`	Python `str` name prefixed to ops created by this function. Default value: `None` (i.e., 'fit_with_hmc').

Returns
`samples`	Python `list` of `Tensors` representing posterior samples of model parameters, with shapes `[concat([[num_results], chain_batch_shape, param.prior.batch_shape, param.prior.event_shape]) for param in model.parameters]`.
`kernel_results`	A (possibly nested) `tuple`, `namedtuple` or `list` of `Tensor`s representing internal calculations made within the HMC sampler.

Examples

Assume we've built a structural time-series model:

  day_of_week = tfp.sts.Seasonal(
      num_seasons=7,
      observed_time_series=observed_time_series,
      name='day_of_week')
  local_linear_trend = tfp.sts.LocalLinearTrend(
      observed_time_series=observed_time_series,
      name='local_linear_trend')
  model = tfp.sts.Sum(components=[day_of_week, local_linear_trend],
                      observed_time_series=observed_time_series)

To draw posterior samples using HMC under default settings:

samples, kernel_results = tfp.sts.fit_with_hmc(model, observed_time_series)
print("acceptance rate: {}".format(
  np.mean(kernel_results.inner_results.inner_results.is_accepted, axis=0)))
print("posterior means: {}".format(
  {param.name: np.mean(param_draws, axis=0)
   for (param, param_draws) in zip(model.parameters, samples)}))

We can also run multiple chains. This may help diagnose convergence issues and allows us to exploit vectorization to draw samples more quickly, although warmup still requires the same number of sequential steps.

from matplotlib import pylab as plt

samples, kernel_results = tfp.sts.fit_with_hmc(
  model, observed_time_series, chain_batch_shape=[10])
print("acceptance rate: {}".format(
  np.mean(kernel_results.inner_results.inner_results.is_accepted, axis=0)))

# Plot the sampled traces for each parameter. If the chains have mixed, their
# traces should all cover the same region of state space, frequently crossing
# over each other.
for (param, param_draws) in zip(model.parameters, samples):
  if param.prior.event_shape.ndims > 0:
    print("Only plotting traces for scalar parameters, skipping {}".format(
      param.name))
    continue
  plt.figure(figsize=[10, 4])
  plt.title(param.name)
  plt.plot(param_draws.numpy())
  plt.ylabel(param.name)
  plt.xlabel("HMC step")

# Combining the samples from multiple chains into a single dimension allows
# us to easily pass sampled parameters to downstream forecasting methods.
combined_samples = [np.reshape(param_draws,
                               [-1] + list(param_draws.shape[2:]))
                    for param_draws in samples]

For greater flexibility, you may prefer to implement your own sampler using the TensorFlow Probability primitives in tfp.mcmc. The following recipe constructs a basic HMC sampler, using a TransformedTransitionKernel to incorporate constraints on the parameter space.

transformed_hmc_kernel = tfp.mcmc.TransformedTransitionKernel(
    inner_kernel=tfp.mcmc.DualAveragingStepSizeAdaptation(
        inner_kernel=tfp.mcmc.HamiltonianMonteCarlo(
            target_log_prob_fn=model.joint_distribution(observed_time_series).log_prob,
            step_size=step_size,
            num_leapfrog_steps=num_leapfrog_steps,
            state_gradients_are_stopped=True,
            seed=seed),
        num_adaptation_steps = int(0.8 * num_warmup_steps)),
    bijector=[param.bijector for param in model.parameters])

# Initialize from a Uniform[-2, 2] distribution in unconstrained space.
initial_state = [tfp.sts.sample_uniform_initial_state(
  param, return_constrained=True) for param in model.parameters]

samples, kernel_results = tfp.mcmc.sample_chain(
  kernel=transformed_hmc_kernel,
  num_results=num_results,
  current_state=initial_state,
  num_burnin_steps=num_warmup_steps)

References

[1]: Radford Neal. MCMC Using Hamiltonian Dynamics. Handbook of Markov Chain Monte Carlo, 2011. https://arxiv.org/abs/1206.1901 [2] M.J. Betancourt, Simon Byrne, and Mark Girolami. Optimizing The Integrator Step Size for Hamiltonian Monte Carlo. https://arxiv.org/abs/1411.6669

tfp.substrates.numpy.sts.fit_with_hmc

View aliases

Args

Returns

Examples

References