tf_agents.bandits.multi_objective.multi_objective_scalarizer.HyperVolumeScalarizer

Implements the hypervolume scalarization.

Inherits From: Scalarizer

Given a vector of (at least two) objectives M, a unit-length vector V with non-negative coordinates, a slope vector A, and an offset vector B, all having the same dimension, the hypervolume scalarization of M is defined as:

min_{i: V_i > 0} max(A_i * M_i + B_i, 0) / V_i.

See https://arxiv.org/abs/2006.04655 for more details. Users are advised to choose A_i and B_i so that the transformed objectives A_i * M_i + B_i are non-negative.
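As a rough illustration of the formula above, the following is a minimal pure-Python sketch for a single objective vector. It is not the TF-Agents implementation: the function name `hypervolume_scalarize` and its argument names are hypothetical, and the direction is assumed to be already normalized to unit length.

```python
def hypervolume_scalarize(objectives, direction, slopes, offsets):
    """Hypothetical helper: min over {i: V_i > 0} of max(A_i * M_i + B_i, 0) / V_i."""
    ratios = []
    for m, v, a, b in zip(objectives, direction, slopes, offsets):
        if v > 0:  # coordinates with V_i == 0 do not take part in the minimum
            transformed = max(a * m + b, 0.0)  # clamp the transformed objective at zero
            ratios.append(transformed / v)
    return min(ratios)


# Example with two objectives, identity transform (slope 1, offset 0),
# and the unit-length direction (0.6, 0.8).
value = hypervolume_scalarize(
    objectives=[3.0, 2.0],
    direction=[0.6, 0.8],
    slopes=[1.0, 1.0],
    offsets=[0.0, 0.0],
)
```

Here the two candidate ratios are 3.0 / 0.6 and 2.0 / 0.8, and the scalarized value is the smaller of the two. The class itself operates on batched Tensors rather than Python lists.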

Args
direction A Sequence representing a directional vector, which will be normalized to have unit length. Coordinates of the normalized direction whose absolute values are less than HyperVolumeScalarizer.ALMOST_ZERO will be treated as zero.
transform_params A Sequence of namedtuples HyperVolumeScalarizer.PARAMS, each containing a slope and an offset for transforming an objective to be non-negative.

Raises
TypeError if not isinstance(direction, Sequence).
ValueError if any([x < 0 for x in direction]).
ValueError if the 2-norm of direction is less than HyperVolumeScalarizer.ALMOST_ZERO.
TypeError if not isinstance(transform_params, Sequence).
ValueError if len(transform_params) != len(self._direction).
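The checks on direction listed above, together with the normalization step, can be sketched in plain Python as follows. This is an illustrative approximation under stated assumptions, not the library's code: `normalize_direction` is a hypothetical helper, and the module-level `ALMOST_ZERO` simply mirrors the documented value of HyperVolumeScalarizer.ALMOST_ZERO.

```python
import math
from collections.abc import Sequence

ALMOST_ZERO = 1e-16  # mirrors the documented HyperVolumeScalarizer.ALMOST_ZERO


def normalize_direction(direction):
    """Hypothetical helper: validate `direction` and scale it to unit length."""
    if not isinstance(direction, Sequence):
        raise TypeError("direction must be a Sequence")
    if any(x < 0 for x in direction):
        raise ValueError("direction must have non-negative coordinates")
    norm = math.sqrt(sum(x * x for x in direction))  # 2-norm of the direction
    if norm < ALMOST_ZERO:
        raise ValueError("direction has near-zero length")
    unit = [x / norm for x in direction]
    # Coordinates with a tiny absolute value are treated as exact zeros.
    return [0.0 if abs(u) < ALMOST_ZERO else u for u in unit]


unit_direction = normalize_direction([3.0, 4.0])
```

For the input (3.0, 4.0) the 2-norm is 5, so the result is approximately (0.6, 0.8).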

Child Classes

class PARAMS

Methods

call

Implementation of scalarization logic by subclasses.

__call__

Returns a single reward by scalarizing multiple objectives.

Args
multi_objectives A Tensor of shape [batch_size, number_of_objectives], where each column represents an objective.

Returns
A Tensor of shape [batch_size] representing scalarized rewards.

Raises
ValueError if multi_objectives.shape.rank != 2.
ValueError if multi_objectives.shape.dims[1] != self._num_of_objectives.

Class Variables

ALMOST_ZERO 1e-16