tfp.substrates.numpy.math.value_and_gradient

Computes f(*args, **kwargs) and its gradients with respect to args and kwargs.

The function f is invoked according to one of the following rules:

  1. If f is a function of no arguments then it is called as f().

  2. If len(args) == 1, len(kwargs) == 0, auto_unpack_single_arg == True and isinstance(args[0], (list, tuple)) then args is presumed to be a packed sequence of args, i.e., the function is called as f(*args[0]).

  3. Otherwise, the function is called as f(*args, **kwargs).

Regardless of how f is called, gradients are computed with respect to args and kwargs.
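
The three invocation rules above can be sketched in plain Python. This is only an illustration, not the library's implementation: the helper name `call_as_documented` is hypothetical, and detecting "a function of no arguments" with `inspect.signature` is an assumption made for this sketch.

```python
import inspect


def call_as_documented(f, args, kwargs, auto_unpack_single_arg=True):
    """Hypothetical helper mirroring the three documented invocation rules."""
    # Rule 1 (assumption: detected via the signature): f takes no arguments.
    if not inspect.signature(f).parameters:
        return f()
    # Rule 2: a single list/tuple positional argument is treated as packed args.
    if (len(args) == 1 and len(kwargs) == 0 and auto_unpack_single_arg
            and isinstance(args[0], (list, tuple))):
        return f(*args[0])
    # Rule 3: default call.
    return f(*args, **kwargs)


def total(values):
    # Takes one sequence argument and sums it.
    return sum(values)


x = 2.0
print(call_as_documented(lambda: x + 1.0, (x,), {}))              # rule 1: 3.0
print(call_as_documented(lambda a, b: a * b, ([2.0, 3.0],), {}))  # rule 2: 6.0
print(call_as_documented(total, ([1.0, 2.0, 3.0],), {},
                         auto_unpack_single_arg=False))           # rule 3: 6.0
```

Note that in the rule 1 call the positional argument is not passed to f; per the text above, it would still serve as the basis for the gradient.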

Examples

import tensorflow as tf
import tensorflow_probability as tfp

tfd = tfp.distributions
tfm = tfp.math

# Case 1: argless `f`.
x = tf.constant(2.)
tfm.value_and_gradient(lambda: tf.math.log(x), x)
# ==> [log(2.), 0.5]

# Case 2: packed arguments.
tfm.value_and_gradient(lambda x, y: x * tf.math.log(y), [2., 3.])
# ==> [2. * np.log(3.), (np.log(3.), 2. / 3)]

# Case 3: default.
tfm.value_and_gradient(tf.math.log, [1., 2., 3.],
                       auto_unpack_single_arg=False)
# ==> [(log(1.), log(2.), log(3.)), (1., 0.5, 0.333)]

Args

f Python callable to be differentiated. If f returns a scalar, this scalar will be differentiated. If f returns a tensor or list of tensors, the gradient will be the sum of the gradients of each part. If desired, the sum can be weighted by output_gradients (see below).
*args Arguments as in f(*args, **kwargs) and basis for gradient.
output_gradients A Tensor or structure of Tensors the same size as the result ys = f(*args, **kwargs) and holding the gradients computed for each y in ys. This argument is forwarded to the underlying gradient implementation (i.e., either the grad_ys argument of tf.gradients or the output_gradients argument of tf.GradientTape.gradient). Default value: None.
use_gradient_tape Python bool indicating that tf.GradientTape should be used rather than tf.gradients, regardless of tf.executing_eagerly(). (It is only possible to use tf.gradients when not use_gradient_tape and not tf.executing_eagerly().) Default value: False.
auto_unpack_single_arg Python bool which, when False, means a single list or tuple argument will not be interpreted as a packed sequence of arguments. (See case 2.) Default value: True.
has_aux Python bool indicating whether f(*args, **kwargs) returns two outputs, the first being y and the second being an auxiliary output for which no gradients are computed. Default value: False.
name Python str name prefixed to ops created by this function. Default value: None (i.e., 'value_and_gradient').
**kwargs Named arguments as in f(*args, **kwargs) and basis for gradient.
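
The description of f says that a multi-part result is differentiated as a sum, optionally weighted by output_gradients. A minimal, dependency-free sketch of that arithmetic follows. It swaps in central finite differences for automatic differentiation, and the helper `weighted_sum_gradient` and its `weights` parameter are hypothetical names used only for illustration.

```python
def f(x):
    # Two outputs: y0 = x**2 and y1 = 3*x.
    return (x ** 2, 3.0 * x)


def weighted_sum_gradient(f, x, weights=None, eps=1e-6):
    """Hypothetical helper: central-difference estimate of d/dx sum_i w_i * y_i(x)."""
    def total(v):
        ys = f(v)
        # With no weights, every output contributes with weight 1.
        ws = weights if weights is not None else [1.0] * len(ys)
        return sum(w * y for w, y in zip(ws, ys))
    return (total(x + eps) - total(x - eps)) / (2.0 * eps)


# Unweighted: d/dx (x**2 + 3*x) = 2*x + 3, which is 7 at x = 2.
unweighted = weighted_sum_gradient(f, 2.0)
# Weighted by (1.0, 0.5): d/dx (x**2 + 1.5*x) = 2*x + 1.5, which is 5.5 at x = 2.
weighted = weighted_sum_gradient(f, 2.0, weights=[1.0, 0.5])
print(unweighted, weighted)
```

The weights here play the role that output_gradients plays in the argument list above: one weight per output of f.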

Returns

If has_aux is False:
  y: y = f(*args, **kwargs).
  dydx: Gradients of y with respect to each of args and kwargs.
Otherwise, a tuple ((y, aux), dydx), where y, aux = f(*args, **kwargs) and dydx are the gradients of y with respect to each of args and kwargs.
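
To make the two return structures concrete, here is a scalar-only stand-in written in plain Python. It is a sketch under stated assumptions, not the library's code: `value_and_gradient_sketch` is a hypothetical name, it accepts a single float argument, and it estimates the gradient with central finite differences rather than automatic differentiation.

```python
def value_and_gradient_sketch(f, x, has_aux=False, eps=1e-6):
    """Hypothetical scalar-only stand-in illustrating the return structure."""
    def y_only(v):
        # Only the first output is differentiated when has_aux is True.
        return f(v)[0] if has_aux else f(v)

    dydx = (y_only(x + eps) - y_only(x - eps)) / (2.0 * eps)
    if has_aux:
        y, aux = f(x)
        return (y, aux), dydx
    return f(x), dydx


def loss_with_aux(x):
    # Returns the value to differentiate plus an auxiliary output.
    y = x ** 2
    return y, {"doubled": 2.0 * x}


def loss(x):
    return x ** 2


# has_aux=True: the result unpacks as ((y, aux), dydx).
(y, aux), dydx = value_and_gradient_sketch(loss_with_aux, 3.0, has_aux=True)
# has_aux=False: the result unpacks as (y, dydx).
y_plain, dydx_plain = value_and_gradient_sketch(loss, 3.0)
```

The auxiliary output is passed through unchanged, which suits values such as diagnostics that should be returned alongside y without being differentiated.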