Wrapper to defer initialization of a `tf.Module` instance.

```python
tfp.experimental.util.DeferredModule(
    build_fn, *args, also_track=None, **kwargs
)
```
`DeferredModule` is a general-purpose mechanism for creating objects that are
'tape safe', meaning that computation occurs only when an instance
method is called, not at construction. This ensures that method calls made
inside a `tf.GradientTape` context will produce gradients to any underlying
`tf.Variable`s.
TFP's built-in Distributions and Bijectors are tape-safe by contract, but
this does not extend to cases where computation is required
to construct an object's parameters prior to initialization.
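For contrast, a minimal sketch of that contract: when a variable is passed
directly as a distribution parameter, gradients flow without any wrapping.

```python
import tensorflow as tf
import tensorflow_probability as tfp
tfd = tfp.distributions

rate = tf.Variable(2.)
dist = tfd.Exponential(rate=rate)  # Tape-safe: `rate` is used directly.
with tf.GradientTape() as tape:
  lp = dist.log_prob(5.0)
tape.gradient(lp, rate)  # ==> defined (not `None`).
```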
For example, suppose we want to construct a Gamma
distribution with a given mean and variance. In a naive implementation,
we would convert these to the Gamma's native `concentration` and
`rate` parameters when the distribution is constructed. Any future method
calls would produce gradients to `concentration` and `rate`, but not to the
underlying mean and variance:
```python
mean, variance = tf.Variable(3.2), tf.Variable(9.1)
dist = tfd.Gamma(concentration=mean**2 / variance,
                 rate=mean / variance)
with tf.GradientTape() as tape:
  lp = dist.log_prob(5.0)
grads = tape.gradient(lp, [mean, variance])
# ==> `grads` are `[None, None]` !! :-(
```
To preserve the gradients, we can defer the parameter transformation using
`DeferredModule`. The resulting object behaves just like a
`tfd.Gamma` instance; however, instead of running the `Gamma` constructor just
once, it internally applies the parameter transformation and constructs a
new, temporary instance of `tfd.Gamma` on every method invocation.
This ensures that all operations needed to compute a method's return value
from any underlying variables are performed every time the method is invoked.
A `tf.GradientTape` context will therefore be able to trace the full
computation:
```python
def gamma_from_mean_and_variance(mean, variance, **kwargs):
  rate = mean / variance
  return tfd.Gamma(concentration=mean * rate, rate=rate, **kwargs)

mean, variance = tf.Variable(3.2), tf.Variable(9.1)
deferred_dist = tfp.experimental.util.DeferredModule(
    build_fn=gamma_from_mean_and_variance,
    mean=mean,  # May be passed by position or by name.
    variance=variance)

with tf.GradientTape() as tape:
  lp = deferred_dist.log_prob(5.0)
grads = tape.gradient(lp, [mean, variance])
# ==> `grads` are defined!
```
Note that we could have achieved a similar effect by using
`tfp.util.DeferredTensor` to defer each parameter individually (see the
sketch below). However, this would have been significantly more verbose, and
would not share any computation between the two parameter transformations.
`DeferredTensor` is often idiomatic for simple transformations of
a single value, while `DeferredModule` may be preferred for transformations
that operate on multiple values and/or contain multiple steps.
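As a minimal sketch (reusing the `mean` and `variance` variables from the
Gamma example above), the `DeferredTensor` version might look like the
following. Each parameter is deferred separately, so the division by
`variance` is computed twice:

```python
# Sketch only: each parameter gets its own deferred transformation.
# `variance` is not the pretransformed input of either transformation,
# so it must be tracked explicitly via `also_track`.
concentration = tfp.util.DeferredTensor(
    mean, lambda m: m**2 / variance, also_track=variance)
rate = tfp.util.DeferredTensor(
    mean, lambda m: m / variance, also_track=variance)
dist = tfd.Gamma(concentration=concentration, rate=rate)
```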
Objects derived from a `DeferredModule` are no longer deferred, so
they will not preserve gradients. For example, slicing into a deferred
`Distribution` yields a new, concrete `Distribution` instance:
```python
def normal_from_log_scale(scaled_loc, log_scale):
  return tfd.Normal(loc=5 * scaled_loc, scale=tf.exp(log_scale))

dist = tfp.experimental.util.DeferredModule(
    build_fn=normal_from_log_scale,
    scaled_loc=tf.Variable([1., 2., 3.]),
    log_scale=tf.Variable([1., 1., 1.]))

dist.batch_shape  # ==> [3]
len(dist.trainable_variables)  # ==> 2

slice = dist[:2]  # Instantiates a new, non-deferred Distribution.
slice.batch_shape  # ==> [2]
len(slice.trainable_variables)  # ==> 0 (!)

# If needed, we could defer the slice with another layer of wrapping.
deferred_slice = tfp.experimental.util.DeferredModule(
    build_fn=lambda d: d[:2],
    d=dist)
len(deferred_slice.trainable_variables)  # ==> 2
```
| Args | |
|---|---|
| `build_fn` | Python callable specifying a deferred transformation of the provided arguments. This must have signature `module = build_fn(*args, **kwargs)`, where the return value `module` is an instance of `tf.Module`. |
| `*args` | Optional positional arguments to `build_fn`. |
| `also_track` | Optional instance or structure of instances of `tf.Variable` and/or `tf.Module`, containing any additional objects that `build_fn` depends on beyond the given `args` and `kwargs`, so that their variables are also tracked. Default value: `None`. |
| `**kwargs` | Optional keyword arguments to `build_fn`. |
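For illustration, a hedged sketch of `also_track`: if `build_fn` closes over
a variable rather than receiving it as an argument, passing that variable via
`also_track` keeps it among the module's tracked variables:

```python
# Sketch only: `shift` is closed over by `build_fn` rather than passed in,
# so we list it in `also_track` to keep its variable tracked.
shift = tf.Variable(1.)
deferred = tfp.experimental.util.DeferredModule(
    lambda scale: tfd.Normal(loc=shift, scale=scale),  # build_fn
    tf.Variable(2.),  # Positional argument: `scale`.
    also_track=shift)
len(deferred.trainable_variables)  # ==> 2
```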
| Attributes | |
|---|---|
| `name` | Returns the name of this module as passed or determined in the ctor. |
| `non_trainable_variables` | Sequence of non-trainable variables owned by this module and its submodules. |
| `submodules` | Sequence of all sub-modules. Submodules are modules which are properties of this module, or found as properties of modules which are properties of this module (and so on). |
| `trainable_variables` | Sequence of trainable variables owned by this module and its submodules. |
| `variables` | Sequence of variables owned by this module and its submodules. |
with_name_scope( method )
Decorator to automatically enter the module name scope.
```python
class MyModule(tf.Module):
  @tf.Module.with_name_scope
  def __call__(self, x):
    if not hasattr(self, 'w'):
      # `x.shape[1]` (not `x.shape`) gives the input feature size.
      self.w = tf.Variable(tf.random.normal([x.shape[1], 3]))
    return tf.matmul(x, self.w)
```

Using the above module would produce `tf.Variable`s and `tf.Tensor`s whose names include the module name:

```python
mod = MyModule()
mod(tf.ones([1, 2]))
# ==> <tf.Tensor: shape=(1, 3), dtype=float32, numpy=..., dtype=float32)>
mod.w
# ==> <tf.Variable 'my_module/Variable:0' shape=(2, 3) dtype=float32,
#      numpy=..., dtype=float32)>
```
| Args | |
|---|---|
| `method` | The method to wrap. |

| Returns |
|---|
| The original method wrapped such that it enters the module's name scope. |
__abs__( )
Return the absolute value of the argument.
__add__( b, / )
Same as a + b.
__and__( b, / )
Same as a & b.
__bool__( )
bool(x) -> bool
Returns True when the argument x is true, False otherwise. The builtins True and False are the only two instances of the class bool. The class bool is a subclass of the class int, and cannot be subclassed.
__call__( *args, **kwargs )
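Calls (like the other operators listed here) forward to a freshly built
instance of the underlying module. A brief sketch, assuming
`tfb = tfp.bijectors`:

```python
# Sketch: `__call__` forwards to the bijector built from the current
# value of `scale`.
scale = tf.Variable(2.)
deferred_bij = tfp.experimental.util.DeferredModule(
    build_fn=tfb.Scale, scale=scale)
deferred_bij(3.)  # ==> 6.0, the same as `tfb.Scale(scale)(3.)`.
```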
__contains__( b, / )
Same as b in a (note reversed operands).
__eq__( b, / )
Same as a == b.
__exit__( exc_type, exc_value, traceback )
__floordiv__( b, / )
Same as a // b.
__ge__( b, / )
Same as a >= b.
__getitem__( b, / )
Same as a[b].
__gt__( b, / )
Same as a > b.
__invert__( )
Same as ~a.
__iter__( )
iter(iterable) -> iterator
iter(callable, sentinel) -> iterator
Get an iterator from an object. In the first form, the argument must supply its own iterator, or be a sequence. In the second form, the callable is called until it returns the sentinel.
__le__( b, / )
Same as a <= b.
__len__( )
Return the number of items in a container.
__lshift__( b, / )
Same as a << b.
__lt__( b, / )
Same as a < b.
__matmul__( b, / )
Same as a @ b.