Builds a joint variational posterior that factors over model variables. (deprecated arguments)
tfp.experimental.vi.build_factored_surrogate_posterior(
    event_shape=None, bijector=None, constraining_bijectors=None,
    initial_unconstrained_loc=_sample_uniform_initial_loc,
    initial_unconstrained_scale=0.01,
    trainable_distribution_fn=_build_trainable_normal_dist, seed=None,
    validate_args=False, name=None
)
By default, this method creates an independent trainable Normal distribution for each variable, transformed using a bijector (if provided) to match the support of that variable. This makes extremely strong assumptions about the posterior: that it is approximately normal (or transformed normal), and that all model variables are independent.
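For intuition, a single factor of such a surrogate can be written out by hand. The sketch below builds one trainable Normal for a positive-valued scalar variable and transforms it with a Softplus bijector; the names loc, scale, and factor are illustrative simplifications of what this function constructs internally, not part of its API.
import tensorflow as tf
import tensorflow_probability as tfp
tfd = tfp.distributions
tfb = tfp.bijectors

# Trainable Normal over the unconstrained reals; the scale is kept positive
# by parameterizing it through a Softplus-transformed variable.
loc = tf.Variable(0., name='loc')
scale = tfp.util.TransformedVariable(0.01, bijector=tfb.Softplus(), name='scale')

# Transforming by a Softplus bijector constrains samples to be positive.
factor = tfd.TransformedDistribution(tfd.Normal(loc=loc, scale=scale),
                                     bijector=tfb.Softplus())
The full surrogate is, in effect, a joint distribution of such independent factors, one per model variable.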
Args

event_shape: Tensor shape, or nested structure of Tensor shapes, specifying the event shape(s) of the posterior variables.

bijector: Optional tfb.Bijector instance, or nested structure of such instances, defining support(s) of the posterior variables. The structure must match that of event_shape and may contain None values. A posterior variable is modeled as tfd.TransformedDistribution(underlying_dist, bijector) if a corresponding constraining bijector is specified; otherwise it is modeled as supported on the unconstrained real line.

constraining_bijectors: Deprecated alias for bijector.

initial_unconstrained_loc: Optional Python callable with signature tensor = initial_unconstrained_loc(shape, seed), used to sample real-valued initializations for the unconstrained representation of each variable. May alternately be a nested structure of Tensors giving specific initial locations for each variable; these must have structure matching event_shape and shapes determined by the inverse image of event_shape under bijector, and may optionally be prefixed with a common batch shape.
Default value: functools.partial(tf.random.uniform, minval=-2., maxval=2., dtype=tf.float32).

initial_unconstrained_scale: Optional scalar float Tensor initial scale for the unconstrained distributions, or a nested structure of Tensor initial scales for each variable.
Default value: 1e-2.

trainable_distribution_fn: Optional Python callable with signature trainable_dist = trainable_distribution_fn(initial_loc, initial_scale, event_ndims, validate_args). This is called for each model variable to build the corresponding factor in the surrogate posterior. The returned distribution must be supported on unconstrained real values. (A minimal custom builder is sketched after this list.)
Default value: functools.partial(tfp.experimental.vi.build_trainable_location_scale_distribution, distribution_fn=tfd.Normal), i.e., a trainable Normal distribution.

seed: Python integer to seed the random number generator. This is used only when explicit initial locations (initial_unconstrained_loc) are not specified.

validate_args: Python bool. Whether to validate input with asserts. This imposes a runtime cost. If validate_args is False and the inputs are invalid, correct behavior is not guaranteed.
Default value: False.

name: Python str name prefixed to ops created by this function.
Default value: None (i.e., 'build_factored_surrogate_posterior').
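For illustration, a custom trainable_distribution_fn matching the signature documented above might look like the following sketch. The Laplace family and the helper name trainable_laplace are illustrative choices, not part of the API:
def trainable_laplace(initial_loc, initial_scale, event_ndims, validate_args):
  # Trainable location; the scale stays positive via a Softplus
  # reparameterization of the underlying variable.
  loc = tf.Variable(initial_loc, name='loc')
  scale = tfp.util.TransformedVariable(
      initial_scale, bijector=tfb.Softplus(), name='scale')
  # The returned distribution must be supported on the unconstrained reals;
  # the rightmost `event_ndims` dimensions are folded into the event shape.
  return tfd.Independent(
      tfd.Laplace(loc=loc, scale=scale, validate_args=validate_args),
      reinterpreted_batch_ndims=event_ndims)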
Returns

surrogate_posterior: A tfd.Distribution instance whose samples have shape and structure matching that of event_shape or initial_unconstrained_loc.
Examples
Consider a Gamma model with unknown parameters, expressed as a joint Distribution:
Root = tfd.JointDistributionCoroutine.Root
def model_fn():
  concentration = yield Root(tfd.Exponential(1.))
  rate = yield Root(tfd.Exponential(1.))
  y = yield tfd.Sample(tfd.Gamma(concentration=concentration, rate=rate),
                       sample_shape=4)
model = tfd.JointDistributionCoroutine(model_fn)
Let's use variational inference to approximate the posterior over the data-generating parameters for some observed y. We'll build a surrogate posterior distribution by specifying the shapes of the latent concentration and rate parameters, and that both are constrained to be positive.
surrogate_posterior = tfp.experimental.vi.build_factored_surrogate_posterior(
    event_shape=model.event_shape_tensor()[:-1],  # Omit the observed `y`.
    bijector=[tfb.Softplus(),   # Concentration is positive.
              tfb.Softplus()])  # Rate is positive.
This creates a trainable joint distribution, defined by variables in surrogate_posterior.trainable_variables. We use fit_surrogate_posterior to fit this distribution by minimizing a divergence to the true posterior.
y = [0.2, 0.5, 0.3, 0.7]
losses = tfp.vi.fit_surrogate_posterior(
    lambda concentration, rate: model.log_prob([concentration, rate, y]),
    surrogate_posterior=surrogate_posterior,
    num_steps=100,
    optimizer=tf.optimizers.Adam(0.1),
    sample_size=10)
# After optimization, samples from the surrogate will approximate
# samples from the true posterior.
samples = surrogate_posterior.sample(100)
posterior_mean = [tf.reduce_mean(x) for x in samples] # mean ~= [1.1, 2.1]
posterior_std = [tf.math.reduce_std(x) for x in samples] # std ~= [0.3, 0.8]
If we want to initialize the optimization at a specific location, we can specify one when we build the surrogate posterior. This function requires the initial location to be specified in unconstrained space; we do this by inverting the constraining bijectors. (Note that this section also demonstrates the creation of a dict-structured model.)
initial_loc = {'concentration': 0.4, 'rate': 0.2}
bijector = {'concentration': tfb.Softplus(),  # Concentration is positive.
            'rate': tfb.Softplus()}           # Rate is positive.
initial_unconstrained_loc = tf.nest.map_structure(
    lambda b, x: b.inverse(x) if b is not None else x, bijector, initial_loc)
surrogate_posterior = tfp.experimental.vi.build_factored_surrogate_posterior(
    event_shape=tf.nest.map_structure(tf.shape, initial_loc),
    bijector=bijector,
    initial_unconstrained_loc=initial_unconstrained_loc,
    initial_unconstrained_scale=1e-4)
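To fit this dict-structured surrogate we also need a dict-structured model. The following sketch assumes a JointDistributionNamed version of the Gamma model above, and assumes that fit_surrogate_posterior unpacks dict-valued surrogate samples as keyword arguments to the target log-prob function; the model definition here is illustrative rather than taken from this page.
model = tfd.JointDistributionNamed({
    'concentration': tfd.Exponential(1.),
    'rate': tfd.Exponential(1.),
    'y': lambda concentration, rate: tfd.Sample(
        tfd.Gamma(concentration=concentration, rate=rate), sample_shape=4)})
y = [0.2, 0.5, 0.3, 0.7]
# Surrogate samples arrive as a dict, so the target accepts them as kwargs
# and merges in the observed `y` before evaluating the joint log-prob.
losses = tfp.vi.fit_surrogate_posterior(
    lambda **latents: model.log_prob(dict(latents, y=y)),
    surrogate_posterior=surrogate_posterior,
    num_steps=100,
    optimizer=tf.optimizers.Adam(0.1),
    sample_size=10)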