Builds a bijector to approximately transform N(0, 1) into distribution.
tfp.experimental.bijectors.make_distribution_bijector(
    distribution, name='make_distribution_bijector'
)
This represents a distribution as a bijector that transforms a (multivariate) standard normal distribution into the distribution of interest.
Args | |
---|---|
distribution | A tfd.Distribution instance; this may be a joint distribution.
name | Python str name for ops created by this method.
Returns | |
---|---|
distribution_bijector | A tfb.Bijector instance such that distribution_bijector(tfd.Normal(0., 1.)) is approximately equivalent to distribution.
Examples
This method may be used to convert structured variational distributions into MCMC preconditioners. Consider a model containing funnel geometry, which may be difficult for an MCMC algorithm to sample directly.
model_with_funnel = tfd.JointDistributionSequentialAutoBatched([
    tfd.Normal(loc=-1., scale=2., name='z'),
    lambda z: tfd.Normal(loc=[0., 0., 0.], scale=tf.exp(z), name='x'),
    lambda x: tfd.Poisson(log_rate=x, name='y')])
pinned_model = tfp.experimental.distributions.JointDistributionPinned(
    model_with_funnel, y=[1., 3., 0.])
We can approximate the posterior in this model using a structured variational surrogate distribution, which will capture the funnel geometry, but cannot exactly represent the (non-Gaussian) posterior.
# Build and fit a structured surrogate posterior distribution.
surrogate_posterior = tfp.experimental.vi.build_asvi_surrogate_posterior(
    pinned_model)
_ = tfp.vi.fit_surrogate_posterior(pinned_model.unnormalized_log_prob,
                                   surrogate_posterior=surrogate_posterior,
                                   optimizer=tf.optimizers.Adam(0.01),
                                   num_steps=200)
Creating a preconditioning bijector allows us to obtain higher-quality posterior samples, without any Gaussianity assumption, by using the surrogate to guide an MCMC sampler.
surrogate_posterior_bijector = (
    tfp.experimental.bijectors.make_distribution_bijector(surrogate_posterior))
samples, _ = tfp.mcmc.sample_chain(
    kernel=tfp.mcmc.DualAveragingStepSizeAdaptation(
        tfp.mcmc.TransformedTransitionKernel(
            tfp.mcmc.NoUTurnSampler(pinned_model.unnormalized_log_prob,
                                    step_size=0.1),
            bijector=surrogate_posterior_bijector),
        num_adaptation_steps=80),
    current_state=surrogate_posterior.sample(),
    num_burnin_steps=100,
    trace_fn=lambda _0, _1: [],
    num_results=500)
Mathematical details
The bijectors returned by this method generally follow these principles, although the specific bijectors returned may vary without notice.
Normal distributions are reparameterized by a location-scale transform.
b = tfp.experimental.bijectors.make_distribution_bijector(
    tfd.Normal(loc=10., scale=5.))
# ==> tfb.Shift(10.)(tfb.Scale(5.))
b = tfp.experimental.bijectors.make_distribution_bijector(
    tfd.MultivariateNormalTriL(loc=loc, scale_tril=scale_tril))
# ==> tfb.Shift(loc)(tfb.ScaleMatvecTriL(scale_tril))
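The scalar case can be illustrated with a small pure-Python sketch (the standard library's statistics.NormalDist stands in for TFP here; the values 10. and 5. follow the example above): the map z -> loc + scale * z carries standard-normal quantiles to Normal(10., 5.) quantiles.

```python
from statistics import NormalDist

loc, scale = 10.0, 5.0
std = NormalDist(0.0, 1.0)
target = NormalDist(loc, scale)

def forward(z):
    # What tfb.Shift(loc)(tfb.Scale(scale)) computes for a scalar input.
    return loc + scale * z

for p in (0.1, 0.25, 0.5, 0.9):
    z = std.inv_cdf(p)  # standard-normal quantile
    # Pushing z through the location-scale map lands on the target quantile.
    assert abs(forward(z) - target.inv_cdf(p)) < 1e-9
```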
The distribution's quantile function is used, when available:
d = tfd.Cauchy(loc=loc, scale=scale)
b = tfp.experimental.bijectors.make_distribution_bijector(d)
# ==> tfb.Inline(forward_fn=d.quantile, inverse_fn=d.cdf)(tfb.NormalCDF())
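A pure-Python sketch of this quantile-composed-with-NormalCDF chain (stdlib only; the Cauchy quantile and CDF have closed forms, and the loc/scale values below are illustrative):

```python
import math
from statistics import NormalDist

loc, scale = 1.0, 2.0  # illustrative Cauchy parameters
std = NormalDist()

def cauchy_quantile(p):
    # Closed-form quantile of Cauchy(loc, scale).
    return loc + scale * math.tan(math.pi * (p - 0.5))

def cauchy_cdf(x):
    return 0.5 + math.atan((x - loc) / scale) / math.pi

def forward(z):
    # quantile(NormalCDF(z)) maps z ~ N(0, 1) to Cauchy(loc, scale).
    return cauchy_quantile(std.cdf(z))

def inverse(x):
    # The inverse chain: inverse-NormalCDF of the Cauchy CDF.
    return std.inv_cdf(cauchy_cdf(x))

# The median maps to the median: z = 0 -> p = 0.5 -> loc.
assert abs(forward(0.0) - loc) < 1e-9
# Round trip through the chain.
assert abs(inverse(forward(0.7)) - 0.7) < 1e-9
```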
Otherwise, a quantile function is derived by inverting the CDF:
d = tfd.Gamma(concentration=alpha, rate=beta)
b = tfp.experimental.bijectors.make_distribution_bijector(d)
# ==> tfb.Invert(
# tfp.experimental.bijectors.ScalarFunctionWithInferredInverse(fn=d.cdf))(
# tfb.NormalCDF())
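The inversion idea can be sketched in pure Python by root-finding on a monotone CDF. An Exponential CDF is used here instead of Gamma (whose CDF is not in the standard library) because its quantile has a closed form to check against; the actual ScalarFunctionWithInferredInverse may use a different root-finder.

```python
import math

rate = 1.5  # illustrative Exponential rate

def exp_cdf(x):
    return 1.0 - math.exp(-rate * x)

def inferred_quantile(p, lo=0.0, hi=1e6, tol=1e-12):
    # Bisection on the monotone CDF: find x with exp_cdf(x) == p.
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if exp_cdf(mid) < p:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Check against the closed-form Exponential quantile, -log(1 - p) / rate.
p = 0.8
assert abs(inferred_quantile(p) - (-math.log(1.0 - p) / rate)) < 1e-6
```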
Transformed distributions are represented by chaining the transforming bijector with a preconditioning bijector for the base distribution:
b = tfp.experimental.bijectors.make_distribution_bijector(
    tfb.Exp()(tfd.Normal(loc=10., scale=5.)))
# ==> tfb.Exp()(tfb.Shift(10.)(tfb.Scale(5.)))
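A stdlib sketch of the chained map: exp composed with shift-and-scale carries a standard-normal input to a log-normal value, so taking its log recovers the corresponding Normal(10., 5.) quantile.

```python
import math
from statistics import NormalDist

loc, scale = 10.0, 5.0
std = NormalDist()

def forward(z):
    # The chain: transforming bijector (exp) after the base distribution's
    # preconditioner (shift-and-scale).
    return math.exp(loc + scale * z)

for p in (0.1, 0.5, 0.9):
    z = std.inv_cdf(p)
    # log of the chained output equals the Normal(loc, scale) quantile.
    assert abs(math.log(forward(z)) - NormalDist(loc, scale).inv_cdf(p)) < 1e-9
```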
Joint distributions are represented by a joint bijector, which converts each component distribution to a bijector with parameters conditioned on the previous variables in the model. The joint bijector's inputs and outputs follow the structure of the joint distribution.
jd = tfd.JointDistributionNamed(
    {'a': tfd.InverseGamma(concentration=2., scale=1.),
     'b': lambda a: tfd.Normal(loc=3., scale=tf.sqrt(a))})
b = tfp.experimental.bijectors.make_distribution_bijector(jd)
whitened_jd = tfb.Invert(b)(jd)
x = whitened_jd.sample()
# x <=> {'a': tfd.Normal(0., 1.).sample(), 'b': tfd.Normal(0., 1.).sample()}
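The conditioning structure can be sketched in pure Python with a simpler Normal-Normal joint model (a stand-in for the InverseGamma example above, since the standard library has no InverseGamma quantile): the bijector for each component is conditioned on the already-transformed previous variables.

```python
# Simplified analog of the joint bijector for the hypothetical model
#   a ~ Normal(0., 2.),  b | a ~ Normal(a, 1.)
def forward(z):
    a = 0.0 + 2.0 * z['a']   # location-scale bijector for a
    b = a + 1.0 * z['b']     # bijector for b, conditioned on transformed a
    return {'a': a, 'b': b}

def inverse(x):
    # Whitening: invert each component's bijector in model order.
    a = x['a']
    return {'a': (a - 0.0) / 2.0, 'b': (x['b'] - a) / 1.0}

# Round trip preserves the structured input, as with the joint bijector.
z = {'a': 0.3, 'b': -1.1}
recovered = inverse(forward(z))
assert abs(recovered['a'] - z['a']) < 1e-12
assert abs(recovered['b'] - z['b']) < 1e-12
```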