tfp.substrates.numpy.distributions.MarkovChain

Distribution of a sequence generated by a memoryless process.

Inherits From: Distribution

Main aliases

tfp.experimental.substrates.numpy.distributions.MarkovChain

tfp.substrates.numpy.distributions.MarkovChain(
    initial_state_prior,
    transition_fn,
    num_steps,
    experimental_use_kahan_sum=False,
    validate_args=False,
    name='MarkovChain'
)

A discrete-time Markov chain is a sequence of random variables in which the variable(s) at each step is independent of all previous variables, conditioned on the variable(s) at the immediate predecessor step. That is, there can be no (direct) long-term dependencies. This 'Markov property' is a simplifying assumption; for example, it enables efficient sampling. Many time-series models can be formulated as Markov chains.
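
Written out, the Markov property gives the joint density of a chain a simple factored form. The notation here (states $x_0, \dots, x_{T-1}$, with $T$ standing for `num_steps`) is introduced only for illustration and does not appear elsewhere on this page:

```latex
p(x_0, x_1, \dots, x_{T-1}) = p(x_0) \prod_{t=1}^{T-1} p(x_t \mid x_{t-1})
```

In this reading, $p(x_0)$ corresponds to `initial_state_prior` and each factor $p(x_t \mid x_{t-1})$ to the distribution returned by `transition_fn`.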

Instances of tfd.MarkovChain represent fully-observed, discrete-time Markov chains, with one or more random variables at each step. These variables may take continuous or discrete values. Sampling is done sequentially, requiring time that scales with the length of the sequence; log_prob evaluation is vectorized over timesteps, and so requires only constant time given sufficient parallelism.
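
The contrast between sequential sampling and vectorized density evaluation can be sketched in plain NumPy for a unit-scale Gaussian random walk. This is an illustrative sketch, not the library's implementation; the helper names `normal_log_prob`, `sample_walk`, and `walk_log_prob` are hypothetical.

```python
import numpy as np


def normal_log_prob(x, loc, scale):
    """Log density of Normal(loc, scale), evaluated elementwise at x."""
    z = (x - loc) / scale
    return -0.5 * z**2 - np.log(scale) - 0.5 * np.log(2.0 * np.pi)


def sample_walk(rng, num_steps):
    """Sampling is sequential: each step needs the previous state."""
    x = np.empty(num_steps)
    x[0] = rng.normal(loc=0.0, scale=1.0)  # draw from the prior
    for t in range(1, num_steps):
        x[t] = rng.normal(loc=x[t - 1], scale=1.0)  # one transition
    return x


def walk_log_prob(x):
    """Log density is vectorized: all transition terms are computed at once."""
    prior_term = normal_log_prob(x[0], loc=0.0, scale=1.0)
    transition_terms = normal_log_prob(x[1:], loc=x[:-1], scale=1.0)
    return prior_term + np.sum(transition_terms)


rng = np.random.default_rng(0)
walk = sample_walk(rng, num_steps=100)
lp = walk_log_prob(walk)
```

The loop in `sample_walk` cannot be parallelized across steps, whereas `walk_log_prob` touches every transition in a single array expression because the whole sequence is already known.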

Related distributions

The discrete-valued Markov chains modeled by tfd.HiddenMarkovModel (using a trivial observation distribution) are a special case of those supported by this distribution. tfd.HiddenMarkovModel exploits the discrete structure to enable exact inference over the values in an unobserved chain. Continuous-valued chains with linear Gaussian transitions are supported by tfd.LinearGaussianStateSpaceModel, which can similarly exploit the linear Gaussian structure for exact inference of hidden states. Both of these distributions are limited to chains that have the respective (discrete or linear Gaussian) structure.

Autoregressive models that do not necessarily respect the Markov property are supported by tfd.Autoregressive, which is, in that sense, more general than this distribution. These models require a more involved specification, and sampling in general requires quadratic (rather than linear) time in the length of the sequence.

Exact inference for unobserved Markov chains is not possible in general; however, particle filtering exploits the Markov property to perform approximate inference, and is often a well-suited method for sequential inference tasks. Particle filtering is available in TFP using tfp.experimental.mcmc.particle_filter, and related methods.
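
To make the idea concrete, here is a minimal bootstrap particle filter in NumPy. It is a sketch under assumed conditions (a latent unit-scale Gaussian random walk observed with Gaussian noise) and is not how `tfp.experimental.mcmc.particle_filter` is implemented; the function name and the model are hypothetical.

```python
import numpy as np


def bootstrap_particle_filter(observations, num_particles, obs_scale, rng):
    """Approximate filtering means for a latent Gaussian random walk.

    Assumed model (hypothetical, for illustration only):
      x_0 ~ Normal(0, 1), x_t ~ Normal(x_{t-1}, 1), y_t ~ Normal(x_t, obs_scale).
    """
    particles = rng.normal(0.0, 1.0, size=num_particles)  # draw from the prior
    filtered_means = []
    for t, y in enumerate(observations):
        if t > 0:
            # Propagate: by the Markov property, only the current particles
            # are needed to propose the next state.
            particles = rng.normal(particles, 1.0)
        # Weight each particle by the likelihood of the new observation.
        log_w = -0.5 * ((y - particles) / obs_scale) ** 2
        w = np.exp(log_w - log_w.max())
        w /= w.sum()
        filtered_means.append(np.sum(w * particles))
        # Resample (multinomial) to concentrate on high-probability particles.
        idx = rng.choice(num_particles, size=num_particles, p=w)
        particles = particles[idx]
    return np.array(filtered_means)


rng = np.random.default_rng(0)
observations = np.full(20, 3.0)  # constant observations, for a simple check
means = bootstrap_particle_filter(observations, 2000, obs_scale=0.5, rng=rng)
```

With constant observations, the filtered mean should settle close to the observed value after a few steps.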

Example: Gaussian random walk

One of the simplest continuous-valued Markov chains is a Gaussian random walk. This may also be viewed as a discretized Brownian motion.

import tensorflow as tf
import tensorflow_probability as tfp

tfd = tfp.distributions

gaussian_walk = tfd.MarkovChain(
  initial_state_prior=tfd.Normal(loc=0., scale=1.),
  transition_fn=lambda _, x: tfd.Normal(loc=x, scale=1.),
  num_steps=100)
# ==> `gaussian_walk.event_shape == [100]`
# ==> `gaussian_walk.batch_shape == []`

x = gaussian_walk.sample(5)  # Five independent walks; `x.shape == [5, 100]`.
lp = gaussian_walk.log_prob(x)  # ==> `lp.shape == [5]`.

Example: batch of random walks

To spice things up, we'll now define a batch of random walks, each following a different distribution (in this case, different starting locations). We'll also demonstrate scales that differ across timesteps.

scales = tf.convert_to_tensor([0.5, 0.3, 0.2, 0.2, 0.3, 0.2, 0.7])
batch_gaussian_walk = tfd.MarkovChain(
  # The prior distribution determines the batch shape for the chain.
  # Transitions must respect this batch shape.
  initial_state_prior=tfd.Normal(loc=[-10., 0., 10.],
                                 scale=[1., 1., 1.]),
  transition_fn=lambda t, x: tfd.Normal(
    loc=x,
    # The `num_steps` dimension will always be leftmost in `x`, so we
    # pad the scale to the same rank as `x` to make their shapes line up.
    scale=tf.reshape(tf.gather(scales, t),
                     tf.concat([[-1],
                                tf.ones(tf.rank(x) - 1, dtype=tf.int32)],
                               axis=0))),
  # Limit to eight steps since we only specified scales for seven transitions.
  num_steps=8)
# ==> `batch_gaussian_walk.event_shape == [8]`
# ==> `batch_gaussian_walk.batch_shape == [3]`

x = batch_gaussian_walk.sample(5)  # ==> `x.shape == [5, 3, 8]`.
lp = batch_gaussian_walk.log_prob(x)  # ==> `lp.shape == [5, 3]`.
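
The reshape of the per-step scale can be checked with NumPy broadcasting alone. The shapes below are assumptions chosen to match this example (7 transitions, 5 samples, a batch of 3 walks, step axis leftmost); they are not read from the library.

```python
import numpy as np

scales = np.array([0.5, 0.3, 0.2, 0.2, 0.3, 0.2, 0.7])

# Hypothetical shapes: 7 transitions, 5 samples, batch of 3 walks, with the
# step axis leftmost as described in the example's comment.
x = np.zeros([7, 5, 3])
t = np.arange(7)

# A rank-1 scale of shape [7] would be aligned with the trailing axis of `x`
# (size 3), so it cannot broadcast.
try:
    np.broadcast_shapes(scales[t].shape, x.shape)
    naive_broadcasts = True
except ValueError:
    naive_broadcasts = False

# Padding with trailing singleton axes aligns the scale with the step axis.
padded_shape = [-1] + [1] * (x.ndim - 1)
padded_scale = np.reshape(scales[t], padded_shape)  # shape [7, 1, 1]
broadcast_shape = np.broadcast_shapes(padded_scale.shape, x.shape)

# With a scalar step and no step axis (assumed here to be the case when a
# single step is evaluated), the same reshape yields shape [1, 1].
x_step = np.zeros([5, 3])
step_scale = np.reshape(scales[0], [-1] + [1] * (x_step.ndim - 1))
```

The same reshape expression therefore works whether `t` is a single step or a vector of steps, which is why the example computes the target shape from the rank of `x` instead of hard-coding it.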

Example: multivariate chain with longer-term dependence

We can also define multivariate Markov chains. Besides modeling the joint evolution of multiple variables, multivariate chains can work around the Markov limitation by folding state history into the current state as one or more auxiliary variables. The next example, a second-order autoregressive process with dynamic coefficients and scale, contains multiple time-dependent variables and uses an auxiliary previous_level variable so that the transition function can access the previous two steps of history:


def transition_fn(_, previous_state):
  return tfd.JointDistributionNamedAutoBatched(
      # The transition distribution must match the batch shape of the chain.
      # Since `log_scale` is a scalar quantity, its shape is the batch shape.
      batch_ndims=tf.rank(previous_state['log_scale']),
      model={
          # The autoregressive coefficients and the `log_scale` each follow
          # an independent slow-moving random walk.
          'coefs': tfd.Normal(loc=previous_state['coefs'], scale=0.01),
          'log_scale': tfd.Normal(loc=previous_state['log_scale'],
                                  scale=0.01),
          # The level is a linear combination of the previous *two* levels,
          # with additional noise of scale `exp(log_scale)`.
          'level': lambda coefs, log_scale: tfd.Normal(  
              loc=(coefs[..., 0] * previous_state['level'] +
                   coefs[..., 1] * previous_state['previous_level']),
              scale=tf.exp(log_scale)),
          # Store the previous level to access at the next step.
          'previous_level': tfd.Deterministic(previous_state['level'])})

process = tfd.MarkovChain(
    # For simplicity, define the prior as a 'transition' from fixed values.
    initial_state_prior=transition_fn(
        0, previous_state={
            'coefs': [0.7, -0.2],
            'log_scale': -1.,
            'level': 0.,
            'previous_level': 0.}),
    transition_fn=transition_fn,
    num_steps=100)
# ==> `process.event_shape == {'coefs': [100, 2], 'log_scale': [100],
#                              'level': [100], 'previous_level': [100]}`
# ==> `process.batch_shape == []`

x = process.sample(5)
# ==> `x['coefs'].shape == [5, 100, 2]`
# ==> `x['log_scale'].shape == [5, 100]`
# ==> `x['level'].shape == [5, 100]`
# ==> `x['previous_level'].shape == [5, 100]`
lp = process.log_prob(x)  # ==> `lp.shape == [5]`.
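
The state-folding trick can be checked in isolation with plain Python. The sketch below simplifies the example above by assuming fixed coefficients and a fixed noise sequence, so that a direct second-order recursion and its first-order augmented form can be compared exactly; the function names are hypothetical.

```python
def ar2_direct(coefs, noise, level0=0.0, level_minus1=0.0):
    """Second-order recursion that looks back two steps explicitly."""
    levels = [level_minus1, level0]
    for eps in noise:
        levels.append(coefs[0] * levels[-1] + coefs[1] * levels[-2] + eps)
    return levels[2:]


def ar2_as_markov(coefs, noise, level0=0.0, level_minus1=0.0):
    """The same process as a first-order chain over an augmented state."""
    state = {'level': level0, 'previous_level': level_minus1}
    levels = []
    for eps in noise:
        # The transition reads only the immediately preceding state.
        state = {
            'level': (coefs[0] * state['level'] +
                      coefs[1] * state['previous_level'] + eps),
            'previous_level': state['level'],
        }
        levels.append(state['level'])
    return levels


coefs = [0.7, -0.2]
noise = [0.5, -1.0, 0.25, 0.0, 1.5]  # fixed noise so both runs are comparable
direct = ar2_direct(coefs, noise)
markov = ar2_as_markov(coefs, noise)
```

Both functions produce the same sequence of levels, although `ar2_as_markov` never looks further back than one step: the extra `previous_level` entry carries the needed history.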

Args
`initial_state_prior`	`tfd.Distribution` instance describing a prior distribution on the state at step 0. This may be a joint distribution.
`transition_fn`	Python `callable` with signature `current_state_dist = transition_fn(previous_step, previous_state)`. The arguments are an integer `previous_step`, and `previous_state`, a (structure of) Tensor(s) like a sample from the `initial_state_prior`. The returned `current_state_dist` must have the same `dtype`, `batch_shape`, and `event_shape` as `initial_state_prior`.
`num_steps`	Integer `Tensor` scalar number of steps in the chain.
`experimental_use_kahan_sum`	If `True`, use Kahan summation to mitigate accumulation of floating-point error in log_prob calculation.
`validate_args`	Python `bool`, default `False`. Whether to validate input with asserts. If `validate_args` is `False`, and the inputs are invalid, correct behavior is not guaranteed.
`name`	The name to give ops created by this distribution.
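
For readers unfamiliar with Kahan summation, the following pure-Python sketch shows the technique on a list of floats. It is an illustration of compensated summation in general, not the code path that `experimental_use_kahan_sum` enables; the function names and the test data are hypothetical.

```python
import math


def naive_sum(values):
    """Plain left-to-right accumulation."""
    total = 0.0
    for v in values:
        total += v
    return total


def kahan_sum(values):
    """Compensated summation: carries the rounding error forward."""
    total = 0.0
    compensation = 0.0  # running estimate of the low-order bits lost so far
    for v in values:
        y = v - compensation
        t = total + y
        compensation = (t - total) - y
        total = t
    return total


# One large term followed by many terms too small to register individually.
terms = [1.0] + [1e-16] * 100_000
exact = math.fsum(terms)  # correctly rounded reference
naive_error = abs(naive_sum(terms) - exact)
kahan_error = abs(kahan_sum(terms) - exact)
```

The same effect matters when many small per-step log-probability terms are added to a large running total, which is the situation the `experimental_use_kahan_sum` option is described as mitigating.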

Attributes
`allow_nan_stats`	Python `bool` describing behavior when a stat is undefined. Stats return +/- infinity when it makes sense. E.g., the variance of a Cauchy distribution is infinity. However, sometimes the statistic is undefined, e.g., if a distribution's pdf does not achieve a maximum within the support of the distribution, the mode is undefined. If the mean is undefined, then by definition the variance is undefined. E.g. the mean for Student's T for df = 1 is undefined (no clear way to say it is either + or - infinity), so the variance = E[(X - mean)**2] is also undefined.
`batch_shape`	Shape of a single sample from a single event index as a `TensorShape`. May be partially defined or unknown. The batch dimensions are indexes into independent, non-identical parameterizations of this distribution.
`dtype`	The `DType` of `Tensor`s handled by this `Distribution`.
`event_shape`	Shape of a single sample from a single batch as a `TensorShape`. May be partially defined or unknown.
`experimental_shard_axis_names`	The list or structure of lists of active shard axis names.
`initial_state_prior`
`name`	Name prepended to all ops created by this `Distribution`.
`num_steps`
`parameters`	Dictionary of parameters used to instantiate this `Distribution`.
`reparameterization_type`	Describes how samples from the distribution are reparameterized. Currently this is one of the static instances `tfd.FULLY_REPARAMETERIZED` or `tfd.NOT_REPARAMETERIZED`.
`trainable_variables`
`transition_fn`
`validate_args`	Python `bool` indicating possibly expensive checks are enabled.
`variables`

Args
`value`	`float` or `double` `Tensor`.
`name`	Python `str` prepended to names of ops created by this function.
`**kwargs`	Named arguments forwarded to subclass implementation.

Args
`other`	`tfp.distributions.Distribution` instance.
`name`	Python `str` prepended to names of ops created by this function.

experimental_default_event_space_bijector

Args
`*args`	Passed to implementation `_default_event_space_bijector`.
`**kwargs`	Passed to implementation `_default_event_space_bijector`.

experimental_fit

Args
`value`	a `Tensor` valid sample from this distribution family.
`sample_ndims`	Positive `int` Tensor number of leftmost dimensions of `value` that index i.i.d. samples. Default value: `1`.
`validate_args`	Python `bool`, default `False`. When `True`, distribution parameters are checked for validity despite possibly degrading runtime performance. When `False`, invalid inputs may silently render incorrect outputs. Default value: `False`.
`**init_kwargs`	Additional keyword arguments passed through to `cls.__init__`. These take precedence in case of collision with the fitted parameters; for example, `tfd.Normal.experimental_fit([1., 1.], scale=20.)` returns a Normal distribution with `scale=20.` rather than the maximum likelihood parameter `scale=0.`.

experimental_local_measure

Args
`value`	`float` or `double` `Tensor`.
`backward_compat`	`bool` specifying whether to fall back to returning `FullSpace` as the tangent space, and representing R^n with the standard basis.
`**kwargs`	Named arguments forwarded to subclass implementation.

Returns
`log_prob`	a `Tensor` representing the log probability density, of shape `sample_shape(x) + self.batch_shape` with values of type `self.dtype`.
`tangent_space`	a `TangentSpace` object (by default `FullSpace`) representing the tangent space to the manifold at `value`.

experimental_sample_and_log_prob

Args
`sample_shape`	integer `Tensor` desired shape of samples to draw. Default value: `()`.
`seed`	PRNG seed; see `tfp.random.sanitize_seed` for details. Default value: `None`.
`name`	name to give to the op. Default value: `'sample_and_log_prob'`.
`**kwargs`	Named arguments forwarded to subclass implementation.

Returns
`samples`	a `Tensor`, or structure of `Tensor`s, with prepended dimensions `sample_shape`.
`log_prob`	a `Tensor` of shape `sample_shape(x) + self.batch_shape` with values of type `self.dtype`.

Args
`sample_shape`	`Tensor` or python list/tuple. Desired shape of a call to `sample()`.
`name`	name to prepend ops with.

parameter_properties

Args
`dtype`	Optional float `dtype` to assume for continuous-valued parameters. Some constraining bijectors require advance knowledge of the dtype because certain constants (e.g., `tfb.Softplus.low`) must be instantiated with the same dtype as the values to be transformed.
`num_classes`	Optional `int` `Tensor` number of classes to assume when inferring the shape of parameters for categorical-like distributions. Otherwise ignored.

sample

Args
`sample_shape`	0D or 1D `int32` `Tensor`. Shape of the generated samples.
`seed`	PRNG seed; see `tfp.random.sanitize_seed` for details.
`name`	name to give to the op.
`**kwargs`	Named arguments forwarded to subclass implementation.

Methods

batch_shape_tensor

cdf

copy

covariance

cross_entropy

entropy

event_shape_tensor

experimental_default_event_space_bijector

experimental_fit

experimental_local_measure

experimental_sample_and_log_prob

is_scalar_batch

is_scalar_event

kl_divergence

log_cdf

log_prob

log_survival_function

mean

mode

param_shapes

param_static_shapes

parameter_properties

prob

quantile

sample

stddev

survival_function

unnormalized_log_prob

variance

__getitem__

__iter__
