Class for representing a trainable constraint using a neural network.
Inherits From: BaseConstraint
tf_agents.bandits.policies.constraints.NeuralConstraint(
    time_step_spec: tf_agents.typing.types.TimeStep,
    action_spec: tf_agents.typing.types.BoundedTensorSpec,
    constraint_network: Optional[tf_agents.typing.types.Network],
    error_loss_fn: tf_agents.typing.types.LossFn = tf.compat.v1.losses.mean_squared_error,
    name: Optional[Text] = 'NeuralConstraint'
)
This constraint class uses a neural network to compute action feasibility. Because the network is trainable, the class exposes its loss function so that the network weights can be trained, typically by the agent that uses this constraint.
Attributes | |
---|---|
constraint_network | |
observation_spec | |
Methods
compute_loss
compute_loss(
    observations: tf_agents.typing.types.NestedTensor,
    actions: tf_agents.typing.types.NestedTensor,
    rewards: tf_agents.typing.types.Tensor,
    weights: Optional[types.Float] = None,
    training: bool = False
) -> tf_agents.typing.types.Tensor
Computes loss for training the constraint network.
Args | |
---|---|
observations | A batch of observations. |
actions | A batch of actions. |
rewards | A batch of rewards. |
weights | Optional scalar or elementwise (per-batch-entry) importance weights. The output batch loss will be scaled by these weights, and the final scalar loss is the mean of these values. |
training | Whether the loss is being used for training. |
Returns | |
---|---|
loss | A Tensor containing the loss for the training step. |
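
The effect of the `weights` argument can be illustrated with a small pure-Python sketch. This is not the library's implementation: the helper name `weighted_mean_loss` is hypothetical, and it assumes a squared-error per-entry loss (in the spirit of the default `error_loss_fn`) computed on plain lists instead of tensors. It only shows the arithmetic described above: each batch entry's loss is scaled by its weight, and the scalar loss is the mean of the scaled values.

```python
def weighted_mean_loss(predictions, targets, weights=None):
    """Toy stand-in for the weighting described in the Args table.

    Hypothetical helper, not part of tf_agents. Computes a squared error
    per batch entry, scales it by `weights`, and returns the mean.
    """
    per_entry = [(p - t) ** 2 for p, t in zip(predictions, targets)]

    if weights is None:
        # No weights: plain mean of the per-entry losses.
        scaled = per_entry
    elif isinstance(weights, (int, float)):
        # Scalar weight: every entry is scaled by the same factor.
        scaled = [weights * e for e in per_entry]
    else:
        # Elementwise weights: one weight per batch entry.
        if len(weights) != len(per_entry):
            raise ValueError("weights must match the batch size")
        scaled = [w * e for w, e in zip(weights, per_entry)]

    # The final scalar is the mean over the batch, not a weight-normalized sum.
    return sum(scaled) / len(scaled)


predictions = [1.0, 2.0, 4.0]
targets = [0.0, 2.0, 2.0]

unweighted = weighted_mean_loss(predictions, targets)
halved = weighted_mean_loss(predictions, targets, weights=0.5)
elementwise = weighted_mean_loss(predictions, targets, weights=[1.0, 0.0, 0.5])
```

Note that in this sketch a zero weight removes an entry's contribution to the numerator but the entry still counts toward the batch size in the mean, which follows from "the final scalar loss is the mean of these values".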
initialize
initialize()
Returns an op to initialize the constraint.
__call__
__call__(observation, actions=None)
Returns the probability of input actions being feasible.