|  View source on GitHub | 
Backward differentiation formula (BDF) solver for stiff ODEs.
Inherits From: Solver
tfp.substrates.jax.math.ode.BDF(
    rtol=0.001,
    atol=1e-06,
    first_step_size=None,
    safety_factor=0.9,
    min_step_size_factor=0.1,
    max_step_size_factor=10.0,
    max_num_steps=None,
    max_order=bdf_util.MAX_ORDER,
    max_num_newton_iters=4,
    newton_tol_factor=0.1,
    newton_step_size_factor=0.5,
    bdf_coefficients=(-0.185, -1.0 / 9.0, -0.0823, -0.0415, 0.0),
    evaluate_jacobian_lazily=False,
    make_adjoint_solver_fn=None,
    validate_args=False,
    name='bdf'
)
Implements the solver described in [Shampine and Reichelt (1997)][1], a variable step size, variable order (VSVO) BDF integrator with order varying between 1 and 5.
Algorithm details
Each step involves solving the following nonlinear equation by Newton's method:
0 = 1/1 * BDF(1, y)[n+1] + ... + 1/k * BDF(k, y)[n+1]
  - h ode_fn(t[n+1], y[n+1])
  - bdf_coefficients[k-1] * (1/1 + ... + 1/k) * (y[n+1] - y[n] - BDF(1, y)[n]
                                                        -  ... - BDF(k, y)[n])
where k >= 1 is the current order of the integrator, h is the current step
size, bdf_coefficients is a list of numbers that parameterizes the method,
and BDF(m, y) is the m-th order backward difference of the vector y. In
particular, BDF(0, y)[n] = y[n] and
BDF(m + 1, y)[n] = BDF(m, y)[n] - BDF(m, y)[n - 1] for m >= 0.
Newton's method can fail because
- the method has exceeded the maximum number of iterations,
- the method is converging too slowly, or
- the method is not expected to converge (the last two conditions are determined by approximating the Lipschitz constant associated with the iteration).
When evaluate_jacobian_lazily is True, the solver avoids evaluating the
Jacobian of the dynamics function as much as possible. In particular, Newton's
method will try to use the Jacobian from a previous integration step. If
Newton's method fails with an out-of-date Jacobian, the Jacobian is
re-evaluated and Newton's method is restarted. If Newton's method fails and
the Jacobian is already up-to-date, then the step size is decreased and
Newton's method is restarted.
Even if Newton's method converges, the solution it generates can still be rejected if it exceeds the specified error tolerance due to truncation error. In this case, the step size is decreased and Newton's method is restarted.
References
[1]: Lawrence F. Shampine and Mark W. Reichelt. The MATLAB ODE Suite. SIAM Journal on Scientific Computing 18(1):1-22, 1997.
| Attributes | |
|---|---|
| name | |
Methods
solve
solve(
    ode_fn,
    initial_time,
    initial_state,
    solution_times,
    jacobian_fn=None,
    jacobian_sparsity=None,
    batch_ndims=None,
    previous_solver_internal_state=None,
    constants=None
)
Solves an initial value problem.
An initial value problem consists of a system of ODEs and an initial condition:
dy/dt(t) = ode_fn(t, y(t), **constants)
y(initial_time) = initial_state
Here, t (also called time) is a scalar float Tensor and y(t) (also
called the state at time t) is an N-D float or complex Tensor.
constants is are values that are constant with respect to time. Passing
the constants here rather than just closing over them in ode_fn is only
necessary if you want gradients with respect to these values.
Example
The ODE dy/dt(t) = dot(A, y(t)) is solved below.
t_init, t0, t1 = 0., 0.5, 1.
y_init = tf.constant([1., 1.], dtype=tf.float64)
A = tf.constant([[-1., -2.], [-3., -4.]], dtype=tf.float64)
def ode_fn(t, y):
  return tf.linalg.matvec(A, y)
results = tfp.math.ode.BDF().solve(ode_fn, t_init, y_init,
                                   solution_times=[t0, t1])
y0 = results.states[0]  # == dot(matrix_exp(A * t0), y_init)
y1 = results.states[1]  # == dot(matrix_exp(A * t1), y_init)
If the exact solution times are not important, it can be much
more efficient to let the solver choose them using
solution_times=tfp.math.ode.ChosenBySolver(final_time=1.).
This yields the state at various times between t_init and final_time,
in which case results.states[i] is the state at time results.times[i].
Gradients
The gradients are computed using the adjoint sensitivity method described in [Chen et al. (2018)][1].
grad = tf.gradients(y1, y0) # == dot(e, J)
# J is the Jacobian of y1 with respect to y0. In this case, J = exp(A * t1).
# e = [1, ..., 1] is the row vector of ones.
This is not capable of computing gradients with respect to values closed
over by ode_fn, e.g., in the example above:
def ode_fn(t, y):
  return tf.linalg.matvec(A, y)
with tf.GradientTape() as tape:
  tape.watch(A)
  results = tfp.math.ode.BDF().solve(ode_fn, t_init, y_init,
                                     solution_times=[t0, t1])
tape.gradient(results.states, A)  # Undefined!
There are two options to get the gradients flowing through these values:
- Use tf.Variablefor these values.
- Pass the values in explicitly using the constantsargument:
def ode_fn(t, y, A):
  return tf.linalg.matvec(A, y)
with tf.GradientTape() as tape:
  tape.watch(A)
  results = tfp.math.ode.BDF().solve(ode_fn, t_init, y_init,
                                     solution_times=[t0, t1],
                                     constants={'A': A})
tape.gradient(results.states, A)  # Fine.
By default, this uses the same solver for the augmented ODE. This can be
controlled via make_adjoint_solver_fn.
References
[1]: Chen, Tian Qi, et al. "Neural ordinary differential equations." Advances in Neural Information Processing Systems. 2018.
| Args | |
|---|---|
| ode_fn | Function of the form ode_fn(t, y, **constants). The inputtis
a scalar floatTensor. The inputyand output are bothTensors
with the same shape anddtypeasinitial_state.constantsis are
values that are constant with respect to time. Passing the constants
here rather than just closing over them inode_fnis only necessary if
you want gradients with respect to these values. | 
| initial_time | Scalar float Tensorspecifying the initial time. | 
| initial_state | N-D float or complex Tensorspecifying the initial state.
Thedtypeofinitial_statemust be complex for problems with
complex-valued states (even if the initial state is real). | 
| solution_times | 1-D float Tensorspecifying a list of times. The solver
stores the computed state at each of these times in the returnedResultsobject. Must satisfyinitial_time <= solution_times[0]andsolution_times[i] < solution_times[i+1]. Alternatively, the user can
passtfp.math.ode.ChosenBySolver(final_time)wherefinal_timeis a
scalar floatTensorsatisfyinginitial_time < final_time. Doing so
requests that the solver automatically choose suitable times up to and
includingfinal_timeat which to store the computed state. | 
| jacobian_fn | Optional function of the form jacobian_fn(t, y). The inputtis a scalar floatTensor. The inputyhas the same shape anddtypeasinitial_state. The output is a 2N-DTensorwhose shape isinitial_state.shape + initial_state.shapeand whosedtypeis the
same asinitial_state. In particular, the(i1, ..., iN, j1, ...,
jN)-th entry ofjacobian_fn(t, y)is the derivative of the(i1, ...,
iN)-th entry ofode_fn(t, y)with respect to the(j1, ..., jN)-th
entry ofy. If this argument is left unspecified, the solver
automatically computes the Jacobian if and when it is needed.
Default value:None. | 
| jacobian_sparsity | Optional 2N-D boolean Tensorwhose shape isinitial_state.shape + initial_state.shapespecifying the sparsity
pattern of the Jacobian. This argument is ignored ifjacobian_fnis
specified.
Default value:None. | 
| batch_ndims | Optional nonnegative integer. When specified, the first batch_ndimsdimensions ofinitial_stateare batch dimensions.
Default value:None. | 
| previous_solver_internal_state | Optional solver-specific argument used to
warm-start this invocation of solve.
Default value:None. | 
| constants | Optional dictionary with string keys and values being (possibly
nested) float Tensors. These represent values that are constant with
respect to time. Specifying these here allows the adjoint sentitivity
method to compute gradients of the results with respect to these values. | 
| Returns | |
|---|---|
| Object of type Results. |