Class for representing a trainable relative quantile constraint.

Inherits From: NeuralConstraint, BaseConstraint

This constraint class implements a relative quantile constraint such as

Q_tau(action) >= Q_tau(baseline_action)

or

Q_tau(action) <= Q_tau(baseline_action)
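
To make the notation concrete, the sketch below reads Q_tau(a) as the tau-quantile of the reward observed for action a, estimates it from samples, and checks the relative constraint. It is a minimal illustration in plain Python, not the library's TensorFlow code. The helper names `empirical_quantile` and `satisfies_constraint` and the sample rewards are hypothetical.

```python
import math
import operator


def empirical_quantile(samples, tau):
    """Smallest sample q such that at least a tau fraction of samples are <= q."""
    ordered = sorted(samples)
    # Position of the ceil(tau * n)-th order statistic, clamped to at least 1.
    k = max(1, math.ceil(tau * len(ordered)))
    return ordered[k - 1]


def satisfies_constraint(action_rewards, baseline_rewards, tau,
                         comparator=operator.ge):
    """Checks comparator(Q_tau(action), Q_tau(baseline_action)) on samples."""
    return comparator(
        empirical_quantile(action_rewards, tau),
        empirical_quantile(baseline_rewards, tau),
    )


# Hypothetical reward samples for a candidate action and the baseline action.
action_rewards = [0.2, 0.9, 0.5, 0.7, 0.4]
baseline_rewards = [0.1, 0.3, 0.6, 0.2, 0.5]

# ">=" form of the constraint at the median (tau = 0.5).
feasible_ge = satisfies_constraint(action_rewards, baseline_rewards, tau=0.5)
# "<=" form of the same constraint.
feasible_le = satisfies_constraint(
    action_rewards, baseline_rewards, tau=0.5, comparator=operator.le
)
```

In the class itself the quantiles are estimated by the constraint network rather than computed from raw samples. The comparison step is the part this sketch is meant to show.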

time_step_spec A TimeStep spec of the expected time_steps.
action_spec A nest of BoundedTensorSpec representing the actions.
constraint_network A network used to provide estimates of action feasibility. The input structure should be consistent with the observation_spec.
quantile A float between 0. and 1., the quantile we want to regress.
comparator_fn A comparator function, such as tf.greater or tf.less.
baseline_action_fn A callable that, given the observation, returns the baseline action. If None, the baseline action is set to 0.
name Python str name of this constraint. All variables in this module will fall under that name. Defaults to the class name.






Computes loss for training the constraint network.

observations A batch of observations.
actions A batch of actions.
rewards A batch of rewards.
weights Optional scalar or elementwise (per-batch-entry) importance weights. The output batch loss will be scaled by these weights, and the final scalar loss is the mean of these values.
training Whether the loss is being used for training.

loss A Tensor containing the loss for the training step.
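
The weighting described above can be sketched as follows. The source does not state which per-example loss is used. This sketch assumes the pinball (quantile) loss, the standard objective for quantile regression, and the function names `pinball_loss` and `weighted_batch_loss` are hypothetical. It is written in plain Python for illustration and is not the library's implementation.

```python
def pinball_loss(prediction, target, tau):
    """Quantile (pinball) loss for one prediction/target pair (assumed loss)."""
    error = target - prediction
    # Under-prediction is penalized by tau, over-prediction by (1 - tau).
    return tau * error if error >= 0 else (tau - 1.0) * error


def weighted_batch_loss(predictions, rewards, tau, weights=None):
    """Scales each per-example loss by its weight, then averages the batch."""
    if weights is None:
        weights = [1.0] * len(rewards)
    elif isinstance(weights, (int, float)):
        # A scalar weight applies to every batch entry.
        weights = [float(weights)] * len(rewards)
    losses = [
        w * pinball_loss(p, r, tau)
        for p, r, w in zip(predictions, rewards, weights)
    ]
    return sum(losses) / len(losses)


predictions = [0.5, 0.5, 0.5, 0.5]  # hypothetical predicted quantiles
rewards = [1.0, 0.0, 0.5, 1.5]      # hypothetical observed rewards

unweighted = weighted_batch_loss(predictions, rewards, tau=0.9)
weighted = weighted_batch_loss(
    predictions, rewards, tau=0.9, weights=[1.0, 0.0, 1.0, 2.0]
)
```

A weight of 0 removes an entry's contribution to the loss while still counting it in the mean, which matches the description of the final scalar as the mean of the scaled values.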



Returns an op to initialize the constraint.



Returns the probability of input actions being feasible.
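
As a rough illustration of how comparator_fn and baseline_action_fn could combine into a per-action feasibility value, consider the plain-Python sketch below. It assumes discrete action indices, one predicted quantile per action, and hard 0.0/1.0 outputs. The function name `feasibility` and its signature are hypothetical, and the library's actual computation may differ.

```python
import operator


def feasibility(predicted_quantiles, comparator_fn=operator.ge,
                baseline_action_fn=None, observation=None):
    """Returns a 0.0/1.0 feasibility value per action for one observation.

    predicted_quantiles: list holding one predicted quantile per action index.
    """
    # Documented default: with no baseline function, the baseline action is 0.
    if baseline_action_fn is None:
        baseline = 0
    else:
        baseline = baseline_action_fn(observation)
    baseline_value = predicted_quantiles[baseline]
    # An action is feasible when its quantile compares favorably to the baseline.
    return [float(comparator_fn(q, baseline_value)) for q in predicted_quantiles]


quantiles = [0.4, 0.7, 0.1]  # hypothetical predictions for three actions

default_baseline = feasibility(quantiles)
custom_baseline = feasibility(quantiles, baseline_action_fn=lambda obs: 1)
less_or_equal = feasibility(quantiles, comparator_fn=operator.le)
```

The baseline action is always feasible under a non-strict comparator, since its quantile is compared with itself.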