Builds a bijector to approximately transform N(0, 1) into distribution.

This represents a distribution as a bijector that transforms a (multivariate) standard normal distribution into the distribution of interest.

Args

distribution: A tfd.Distribution instance; this may be a joint distribution.
name: Python str name for ops created by this method.

Returns

distribution_bijector: A tfb.Bijector instance such that distribution_bijector(tfd.Normal(0., 1.)) is approximately equivalent to distribution.


This method may be used to convert structured variational distributions into MCMC preconditioners. Consider a model containing funnel geometry, which may be difficult for an MCMC algorithm to sample directly.

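import tensorflow as tf
import tensorflow_probability as tfp

tfb = tfp.bijectors
tfd = tfp.distributions

model_with_funnel = tfd.JointDistributionSequentialAutoBatched([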
    tfd.Normal(loc=-1., scale=2., name='z'),
    lambda z: tfd.Normal(loc=[0., 0., 0.], scale=tf.exp(z), name='x'),
    lambda x: tfd.Poisson(log_rate=x, name='y')])
pinned_model = tfp.experimental.distributions.JointDistributionPinned(
    model_with_funnel, y=[1, 3, 0])

We can approximate the posterior in this model using a structured variational surrogate distribution, which will capture the funnel geometry, but cannot exactly represent the (non-Gaussian) posterior.

# Build and fit a structured surrogate posterior distribution.
surrogate_posterior = ...  # surrogate construction not shown
_ = ...  # variational fitting step not shown

Creating a preconditioning bijector allows us to obtain higher-quality posterior samples, without any Gaussianity assumption, by using the surrogate to guide an MCMC sampler.

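surrogate_posterior_bijector = (
    tfp.experimental.bijectors.make_distribution_bijector(surrogate_posterior))
samples, _ = tfp.mcmc.sample_chain(
    ...,  # kernel, initial state, and step counts not shown
    trace_fn=lambda _0, _1: [])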
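
To illustrate what preconditioning does, here is a minimal pure-Python sketch that does not use TFP. It assumes a one-dimensional target N(10, 5) and a hypothetical affine bijector x = 10 + 5 z standing in for a fitted surrogate. A random-walk Metropolis sampler (swapped in here for simplicity; it is not the sampler used above) runs in the whitened z space and maps each state back through the bijector. The names target_log_prob, forward, whitened_log_prob, and sample_preconditioned are illustrative only and are not part of any TFP API.

```python
import math
import random

LOC, SCALE = 10.0, 5.0  # assumed surrogate parameters (illustrative only)

def target_log_prob(x):
    # Unnormalized log density of the assumed target, N(10, 5).
    return -0.5 * ((x - 10.0) / 5.0) ** 2

def forward(z):
    # Hypothetical preconditioning bijector: whitened z -> original x.
    return LOC + SCALE * z

def whitened_log_prob(z):
    # Change of variables: log p(forward(z)) + log |d forward / dz|.
    return target_log_prob(forward(z)) + math.log(SCALE)

def sample_preconditioned(num_results, step_size=1.0, seed=0):
    rng = random.Random(seed)
    z = 0.0
    samples = []
    for _ in range(num_results):
        proposal = z + step_size * rng.gauss(0.0, 1.0)
        log_accept = whitened_log_prob(proposal) - whitened_log_prob(z)
        # Metropolis accept/reject in the whitened space.
        if rng.random() < math.exp(min(0.0, log_accept)):
            z = proposal
        # Report samples in the original space.
        samples.append(forward(z))
    return samples

samples = sample_preconditioned(5000)
mean = sum(samples) / len(samples)
```

Because the bijector absorbs the location and scale of the target, the sampler sees a roughly standard normal density and a single step size of order one is adequate. The same idea carries over to funnel-shaped posteriors when the bijector captures the varying scale.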

Mathematical details

The bijectors returned by this method generally adhere to the following principles, although the specific bijectors returned may vary without notice.

Normal distributions are reparameterized by a location-scale transform.

b = tfp.experimental.bijectors.make_distribution_bijector(
  tfd.Normal(loc=10., scale=5.))
# ==> tfb.Shift(10.)(tfb.Scale(5.))

b = tfp.experimental.bijectors.make_distribution_bijector(
  tfd.MultivariateNormalTriL(loc=loc, scale_tril=scale_tril))
# ==> tfb.Shift(loc)(tfb.ScaleMatvecTriL(scale_tril))
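
As a rough illustration of the scalar case, the following pure-Python sketch (no TFP) writes out the location-scale map and its inverse by hand. The helper make_affine_bijector is hypothetical, and loc=10, scale=5 are taken from the example above.

```python
from statistics import NormalDist

def make_affine_bijector(loc, scale):
    """Returns (forward, inverse) callables for x = loc + scale * z."""
    def forward(z):
        return loc + scale * z
    def inverse(x):
        return (x - loc) / scale
    return forward, inverse

forward, inverse = make_affine_bijector(loc=10.0, scale=5.0)

# Push the standard normal median and the +1 sigma point through the map.
x_median = forward(0.0)
x_plus_one_sigma = forward(1.0)

# The standard normal CDF at z matches the N(10, 5) CDF at forward(z),
# which is what it means for the map to transform N(0, 1) into N(10, 5).
z = 0.3
p_base = NormalDist(0.0, 1.0).cdf(z)
p_target = NormalDist(10.0, 5.0).cdf(forward(z))
```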

The distribution's quantile function is used, when available:

d = tfd.Cauchy(loc=loc, scale=scale)
b = tfp.experimental.bijectors.make_distribution_bijector(d)
# ==> tfb.Inline(forward_fn=d.quantile, inverse_fn=d.cdf)(tfb.NormalCDF())
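
The composition above can be sketched by hand in pure Python, without TFP: the standard normal CDF maps z to a uniform variate, and the Cauchy quantile maps that uniform variate to the target. The closed-form Cauchy CDF and quantile are standard; loc=1 and scale=2 are assumed values chosen only for illustration.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist(0.0, 1.0)

def cauchy_quantile(u, loc, scale):
    return loc + scale * math.tan(math.pi * (u - 0.5))

def cauchy_cdf(x, loc, scale):
    return 0.5 + math.atan((x - loc) / scale) / math.pi

def forward(z, loc=1.0, scale=2.0):
    # z ~ N(0, 1)  ->  u = Phi(z) ~ Uniform(0, 1)  ->  x ~ Cauchy(loc, scale)
    return cauchy_quantile(STD_NORMAL.cdf(z), loc, scale)

def inverse(x, loc=1.0, scale=2.0):
    # x ~ Cauchy(loc, scale)  ->  u = CDF(x)  ->  z = Phi^{-1}(u)
    return STD_NORMAL.inv_cdf(cauchy_cdf(x, loc, scale))

x = forward(0.0)                    # the median of N(0, 1) maps to loc
z_roundtrip = inverse(forward(0.7))
```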

Otherwise, a quantile function is derived by inverting the CDF:

d = tfd.Gamma(concentration=alpha, rate=beta)
b = tfp.experimental.bijectors.make_distribution_bijector(d)
# ==> tfb.Invert(
#  tfp.experimental.bijectors.ScalarFunctionWithInferredInverse(fn=d.cdf))(
#    tfb.NormalCDF())
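
The idea of deriving a quantile function from the CDF can be sketched with plain bisection. This is only an assumed stand-in for illustration and makes no claim about how ScalarFunctionWithInferredInverse finds the inverse. To keep the example checkable, it uses Gamma(concentration=1, rate), which is the exponential distribution and so has a closed-form quantile; RATE = 2 is an assumed value, and invert_cdf and gamma1_cdf are hypothetical helper names.

```python
import math
from statistics import NormalDist

def invert_cdf(cdf, u, lo=0.0, hi=1.0, num_steps=200):
    """Finds x >= lo with cdf(x) ~= u by bracketing and bisection."""
    while cdf(hi) < u:  # grow the bracket until it contains the root
        hi *= 2.0
    for _ in range(num_steps):
        mid = 0.5 * (lo + hi)
        if cdf(mid) < u:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

RATE = 2.0  # assumed rate; Gamma(concentration=1, rate) is Exponential(rate)

def gamma1_cdf(x):
    return 1.0 - math.exp(-RATE * x)

def forward(z):
    # z ~ N(0, 1) -> u = Phi(z) -> x = CDF^{-1}(u), found numerically.
    u = NormalDist().cdf(z)
    return invert_cdf(gamma1_cdf, u)

x = forward(0.5)
# Closed-form exponential quantile, used only to check the numeric inverse.
expected = -math.log(1.0 - NormalDist().cdf(0.5)) / RATE
```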

Transformed distributions are represented by chaining the transforming bijector with a preconditioning bijector for the base distribution:

b = tfp.experimental.bijectors.make_distribution_bijector(
  tfb.Exp(tfd.Normal(loc=10., scale=5.)))
# ==> tfb.Exp(tfb.Shift(10.)(tfb.Scale(5.)))
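
A small pure-Python sketch of the same chaining, without TFP, may help make the order of application explicit: the innermost map (scale) is applied first, then the shift, then the exponential. The chain helper is hypothetical and only mimics function composition.

```python
import math

def chain(*fns):
    """Composes functions so that the last one listed is applied first."""
    def composed(z):
        for fn in reversed(fns):
            z = fn(z)
        return z
    return composed

def scale(z):
    return 5.0 * z

def shift(y):
    return 10.0 + y

# Mirrors Exp(Shift(10.)(Scale(5.))): x = exp(10 + 5 * z).
forward = chain(math.exp, shift, scale)

def inverse(x):
    # Undo each step in the opposite order.
    return (math.log(x) - 10.0) / 5.0

x_median = forward(0.0)
z_roundtrip = inverse(forward(0.2))
```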

Joint distributions are represented by a joint bijector, which converts each component distribution to a bijector with parameters conditioned on the previous variables in the model. The joint bijector's inputs and outputs follow the structure of the joint distribution.

jd = tfd.JointDistributionNamed(
    {'a': tfd.InverseGamma(concentration=2., scale=1.),
     'b': lambda a: tfd.Normal(loc=3., scale=tf.sqrt(a))})
b = tfp.experimental.bijectors.make_distribution_bijector(jd)
whitened_jd = tfb.Invert(b)(jd)
x = whitened_jd.sample()
# x <=> {'a': tfd.Normal(0., 1.).sample(), 'b': tfd.Normal(0., 1.).sample()}
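
The conditioning structure can be sketched by hand for this model in pure Python, without TFP. The sketch assumes the closed-form CDF of InverseGamma(concentration=2, scale=1), F(a) = exp(-1/a) (1 + 1/a), and inverts it by bisection; the bijector for 'b' is a location-scale map whose scale depends on the already-transformed value of 'a'. The function names are illustrative and this is not how TFP constructs its joint bijector internally.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()

def inverse_gamma_cdf(a):
    # Closed form for InverseGamma(concentration=2, scale=1).
    return math.exp(-1.0 / a) * (1.0 + 1.0 / a)

def inverse_gamma_quantile(u):
    lo, hi = 1e-6, 1.0
    while inverse_gamma_cdf(hi) < u:  # grow the bracket to contain the root
        hi *= 2.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if inverse_gamma_cdf(mid) < u:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def forward(z):
    # 'a' is transformed first; 'b' is then conditioned on the value of 'a'.
    a = inverse_gamma_quantile(STD_NORMAL.cdf(z['a']))
    b = 3.0 + math.sqrt(a) * z['b']
    return {'a': a, 'b': b}

def inverse(x):
    z_a = STD_NORMAL.inv_cdf(inverse_gamma_cdf(x['a']))
    z_b = (x['b'] - 3.0) / math.sqrt(x['a'])
    return {'a': z_a, 'b': z_b}

z = {'a': 0.4, 'b': -1.2}
x = forward(z)
z_roundtrip = inverse(x)
```

The inputs and outputs are dicts with the same keys as the joint distribution, which mirrors how the joint bijector follows the structure of the model.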