Module: tf_agents.bandits.networks.heteroscedastic_q_network

Network Outputting Expected Value and Variance of Rewards.

Classes

class HeteroscedasticQNetwork: Network Outputting Expected Value and Variance of Rewards.

class QBanditNetworkResult: Named tuple with fields q_value_logits and log_variance.
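
The sketch below shows one way the two fields of a QBanditNetworkResult might be consumed. It is an illustration under stated assumptions, not the library's implementation: the named tuple is redefined locally as a stand-in (so the snippet runs without TensorFlow), the helper gaussian_nll is hypothetical, and it assumes q_value_logits holds per-action predicted mean rewards while log_variance holds the log of the per-action predicted reward variance.

```python
import collections
import math

# Local stand-in mirroring the documented field names. It is NOT imported
# from tf_agents, so the sketch runs without TensorFlow installed.
QBanditNetworkResult = collections.namedtuple(
    "QBanditNetworkResult", ("q_value_logits", "log_variance"))


def gaussian_nll(result, action, reward):
    """Gaussian negative log-likelihood of `reward` for the chosen action.

    Hypothetical helper: assumes q_value_logits[action] is the predicted mean
    reward and log_variance[action] is the log of its predicted variance.
    """
    mean = result.q_value_logits[action]
    log_var = result.log_variance[action]
    squared_error = (reward - mean) ** 2
    return 0.5 * (math.log(2.0 * math.pi) + log_var
                  + squared_error / math.exp(log_var))


# Made-up outputs for a single observation with three actions.
result = QBanditNetworkResult(
    q_value_logits=[0.2, 1.0, -0.5],
    log_variance=[0.0, math.log(4.0), -1.0])

# Loss for observing reward 3.0 after taking action 1.
nll = gaussian_nll(result, action=1, reward=3.0)

# Predicted standard deviations, recovered from the log-variances.
std_devs = [math.exp(0.5 * lv) for lv in result.log_variance]
```

Predicting the log of the variance rather than the variance itself keeps the implied variance positive for any real-valued network output, which is a common reason for this parameterization.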

Other Members

absolute_import: Instance of __future__._Feature

division: Instance of __future__._Feature

print_function: Instance of __future__._Feature