Module: tf_agents.bandits.policies.ranking_policy

Ranking policy.

Classes

class CosinePenalizedPlackettLuce: A distribution that samples items based on scores and cosine similarity.

class DescendingScoreRankingPolicy: A policy that deterministically ranks elements based on their scores.

class DescendingScoreSampler: A sampler that returns items in descending order of their scores.

class NoPenaltyPlackettLuce: Identical to PlackettLuce, with input signature modified to our needs.

class NoPenaltyRankingPolicy: A class implementing ranking policies in TF Agents.

class PenalizeCosineDistanceRankingPolicy: A Ranking policy that penalizes scores based on cosine distance.

class PenalizedPlackettLuce: A distribution that samples permutations and penalizes item scores.

class RankingPolicy: A class implementing ranking policies in TF Agents.
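
The class summaries above mention two ideas: deterministic ranking by descending score, and Plackett-Luce sampling in which item scores are penalized (for example by cosine similarity) as slots are filled. The following is a minimal, library-free sketch of both ideas. It is not the TF-Agents implementation. All function names and parameters (`descending_score_ranking`, `penalized_plackett_luce_sample`, `penalty`, `rng`) are hypothetical, and the subtractive form of the penalty is an assumption made for illustration only.

```python
import math
import random


def descending_score_ranking(scores, num_slots):
    """Return the indices of the `num_slots` highest-scoring items, best first."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return order[:num_slots]


def cosine_similarity(u, v):
    """Cosine similarity of two vectors; 0.0 if either has zero norm."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm > 0 else 0.0


def penalized_plackett_luce_sample(scores, features, num_slots, penalty, rng):
    """Sample a ranking slot by slot, penalizing items similar to earlier picks.

    Each slot is drawn with probability proportional to exp(score) over the
    remaining items (the Plackett-Luce model). After each draw, every remaining
    score is reduced by `penalty` times its cosine similarity to the chosen
    item. The subtractive penalty is an illustrative assumption.
    """
    remaining = list(range(len(scores)))
    current = list(scores)
    ranking = []
    for _ in range(min(num_slots, len(scores))):
        # Unnormalized softmax weights, shifted by the max for stability.
        top = max(current[i] for i in remaining)
        weights = [math.exp(current[i] - top) for i in remaining]
        chosen = rng.choices(remaining, weights=weights, k=1)[0]
        ranking.append(chosen)
        remaining.remove(chosen)
        # Penalize the remaining items by similarity to the item just chosen.
        for i in remaining:
            current[i] -= penalty * cosine_similarity(features[i], features[chosen])
    return ranking


if __name__ == "__main__":
    scores = [0.2, 1.5, 0.9, 1.4]
    features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.0, 2.0]]
    print(descending_score_ranking(scores, num_slots=3))
    print(penalized_plackett_luce_sample(scores, features, 3, 1.0, random.Random(0)))
```

With `penalty=0.0` the sampler reduces to plain Plackett-Luce sampling, which loosely corresponds to the "no penalty" variants listed above. Larger penalties make items that resemble earlier picks less likely to fill later slots.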

Except as otherwise noted, the content of this page is licensed under the Creative Commons Attribution 4.0 License, and code samples are licensed under the Apache 2.0 License.

Last updated 2024-04-26 UTC.