tfp.substrates.jax.distributions.DeterminantalPointProcess

Determinantal point process (DPP) distribution.

Inherits From: AutoCompositeTensorDistribution, Distribution

Main aliases

tfp.experimental.substrates.jax.distributions.DeterminantalPointProcess

tfp.substrates.jax.distributions.DeterminantalPointProcess(
    eigenvalues,
    eigenvectors,
    validate_args=False,
    allow_nan_stats=False,
    name='DeterminantalPointProcess'
)

The DPP distribution is parameterized by the eigenvalues and eigenvectors of the L-ensemble matrix. The L-ensemble matrix indicates the degree of "repulsion" between pairs of items.

Mathematical details

A Determinantal Point Process is a distribution over subsets of n items, called the ground set. The DPP is parameterized by a positive definite matrix of shape n x n, the L-ensemble matrix. It assigns to any subset S of {1, ..., n} the probability:

Pr(S) = det(L_S) / det(I + L)

where:

L is the L-ensemble matrix parameterized by eigenvalues and eigenvectors, i.e. L = U D U^T, where the columns of U are the eigenvectors and D is the diagonal matrix of eigenvalues.
L_S is the principal submatrix of L indexed by items in S. In Numpy slicing notation, L_S = L[S, :][:, S].
det is the matrix determinant.
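
As a small numerical check of the formula above, the following NumPy sketch enumerates every subset of a 3-item ground set and confirms that the probabilities sum to one. The matrix `L` and the helper `subset_prob` are hypothetical names chosen for illustration; they are not part of the TFP API, and indices here are 0-based.

```python
import itertools
import numpy as np

# Hypothetical 3-item L-ensemble matrix (symmetric, positive definite).
L = np.array([[2.0, 0.5, 0.1],
              [0.5, 1.0, 0.3],
              [0.1, 0.3, 1.5]])
n = L.shape[0]
normalizer = np.linalg.det(np.eye(n) + L)

def subset_prob(S):
    """Pr(S) = det(L_S) / det(I + L), with det(L_S) = 1 for the empty set."""
    S = list(S)
    if not S:
        return 1.0 / normalizer
    L_S = L[S, :][:, S]  # principal submatrix indexed by S
    return np.linalg.det(L_S) / normalizer

# All 2**n subsets of {0, ..., n - 1}.
subsets = [S for k in range(n + 1)
           for S in itertools.combinations(range(n), k)]
probs = {S: subset_prob(S) for S in subsets}
total = sum(probs.values())
print(total)
```

The sum equals one because the sum of det(L_S) over all subsets S is det(I + L), which is why det(I + L) appears as the normalizer.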

Marginal probabilities, i.e. the probability that a sample from the DPP contains the subset S, are obtained by way of the marginal kernel:

K = L (I + L)^{-1}

where (I + L)^{-1} is the matrix inverse of I + L.
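
Because K is a function of L alone, it has the same eigenvectors as L, with each eigenvalue d mapped to d / (1 + d). The sketch below checks this on a hypothetical 3 x 3 matrix using NumPy; it illustrates the relationship and is not how the library necessarily computes `marginal_kernel`.

```python
import numpy as np

# Hypothetical 3-item L-ensemble matrix (symmetric, positive definite).
L = np.array([[2.0, 0.5, 0.1],
              [0.5, 1.0, 0.3],
              [0.1, 0.3, 1.5]])
eigenvalues, eigenvectors = np.linalg.eigh(L)

# Direct definition: L times the inverse of (I + L).
K_direct = L @ np.linalg.inv(np.eye(3) + L)

# Eigen form: same eigenvectors, eigenvalues d / (1 + d).
K_eig = (eigenvectors * (eigenvalues / (1.0 + eigenvalues))) @ eigenvectors.T

print(np.allclose(K_direct, K_eig))
```

Since every eigenvalue of L is positive, every eigenvalue of K lies strictly between 0 and 1.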

When sampling a random set A from the DPP, the marginal probability of S, given by exp(dpp.marginal_log_prob(S)), is:

Pr(A is a superset of S) = det(K_S)

This is a marginal probability in the following sense. If we think of the DPP as a joint distribution over n binary indicator variables, each telling whether a given element is in a given subset S, then we can consider the marginal distribution obtained by "summing" out some of these binary indicators. The resulting marginal distribution happens also to be a DPP. What is referred to as the marginal_log_prob of S (under the original DPP) is just the log_prob of S under the marginal DPP, obtained by summing out the indicators of the complement of S. This tells us the (log) probability that a sample from the full DPP includes S as a subset.

Written in terms of sets, with each S' a subset of the complement of S:

det(K_S) = sum_{S' s.t. S' intersect S is empty} [ Pr(S union S') ]

where Pr(S union S') is the probability of sampling exactly S union S' from the DPP.

For further detail, see Theorem 2.2 of [3].
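
The identity between det(K_S) and the sum over supersets can be verified by brute force on a tiny ground set. This is a minimal NumPy sketch with a hypothetical matrix `L` and hypothetical helpers `det_sub` and `exact_prob`; indices are 0-based.

```python
import itertools
import numpy as np

# Hypothetical 3-item L-ensemble matrix (symmetric, positive definite).
L = np.array([[2.0, 0.5, 0.1],
              [0.5, 1.0, 0.3],
              [0.1, 0.3, 1.5]])
n = L.shape[0]
K = L @ np.linalg.inv(np.eye(n) + L)
normalizer = np.linalg.det(np.eye(n) + L)

def det_sub(M, S):
    """Determinant of the principal submatrix M_S (1 for the empty set)."""
    S = list(S)
    return 1.0 if not S else np.linalg.det(M[np.ix_(S, S)])

def exact_prob(S):
    """Probability of sampling exactly the set S."""
    return det_sub(L, S) / normalizer

S = (0, 2)
rest = [i for i in range(n) if i not in S]

# Sum Pr(S union S') over every S' contained in the complement of S.
marginal_by_sum = sum(
    exact_prob(tuple(sorted(S + extra)))
    for k in range(len(rest) + 1)
    for extra in itertools.combinations(rest, k))

marginal_by_kernel = det_sub(K, S)
print(marginal_by_sum, marginal_by_kernel)
```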

Repulsion

Rewriting L = B B^T (which in particular can be done using B = U sqrt(D), where D are the eigenvalues and U the eigenvectors), we have

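Pr(S) is proportional to det(L_S) = Vol^2(b_s1, b_s2, ..., b_sk)

where b_s1 is the s1th row of B (so that L[i, j] is the inner product of b_i and b_j), and Vol is the volume of the parallelepiped spanned by the given vectors. Hence, the probability of sampling two points simultaneously decreases as their corresponding vectors become more collinear.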
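
To make the geometric picture concrete, here is a small NumPy sketch. It takes b_i to be the i-th row of a hypothetical matrix `B`, so that L[i, j] is the inner product of b_i and b_j, and compares the unnormalized probability det(L_S) of a nearly parallel pair with that of an orthogonal pair. The values in `B` are made up for illustration.

```python
import numpy as np

# Hypothetical feature vectors, one row per item. Items 0 and 1 are nearly
# parallel; item 2 is orthogonal to item 0.
B = np.array([[1.0, 0.0, 0.0],
              [0.99, 0.14, 0.0],
              [0.0, 0.0, 1.0]])
L = B @ B.T  # positive definite because B is invertible

def unnormalized_prob(S):
    """det(L_S): squared volume spanned by the rows of B indexed by S."""
    S = list(S)
    return np.linalg.det(L[np.ix_(S, S)])

similar_pair = unnormalized_prob((0, 1))  # squared area of a thin parallelogram
diverse_pair = unnormalized_prob((0, 2))  # squared area of a unit square
print(similar_pair, diverse_pair)
```

The nearly parallel pair spans a parallelogram of area 0.14, so its squared volume is far smaller than that of the orthogonal pair, which spans a unit square.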

Sampling

Sampling is implemented following the algorithm introduced in [2], and proceeds in two phases.

Given an eigendecomposition L = U D U^T with orthonormal eigenvectors U:

First, an elementary DPP (E-DPP) is built by sampling a subset of eigenvectors S: each eigenvector is selected by an independent Bernoulli trial with success probability D / (D + 1), computed elementwise from its eigenvalue. This E-DPP has the same eigenvectors U as L, but each of its eigenvalues is 1 if the corresponding Bernoulli trial was successful and 0 otherwise.
Then, k points are selected iteratively from the elementary DPP, where k is the number of selected eigenvectors. After sampling a point i, the kernel is updated by projecting it onto the subspace orthogonal to the ith basis vector.
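
The two phases can be sketched in plain NumPy as follows. This is an unbatched illustration of the textbook procedure under stated assumptions, not the TFP implementation; the function name `sample_dpp`, the matrix `L`, and the choice of QR for re-orthonormalization are all assumptions made for this example.

```python
import numpy as np

def sample_dpp(eigenvalues, eigenvectors, rng):
    """Draws one subset (as sorted 0-based indices) from the DPP."""
    n = eigenvectors.shape[0]
    # Phase 1: keep each eigenvector with probability d / (d + 1).
    keep = rng.random(n) < eigenvalues / (eigenvalues + 1.0)
    V = eigenvectors[:, keep]  # orthonormal basis of the elementary DPP
    selected = []
    # Phase 2: pick one item per remaining basis vector.
    while V.shape[1] > 0:
        probs = np.sum(V ** 2, axis=1)
        probs = probs / probs.sum()
        i = int(rng.choice(n, p=probs))
        selected.append(i)
        # Eliminate coordinate i: use one column as a pivot, drop it, and
        # subtract it from the others so that row i becomes zero.
        j = np.argmax(np.abs(V[i, :]))
        pivot = V[:, j]
        V = np.delete(V, j, axis=1)
        if V.shape[1] == 0:
            break
        V = V - np.outer(pivot, V[i, :] / pivot[i])
        # Re-orthonormalize the remaining basis.
        V, _ = np.linalg.qr(V)
    return sorted(selected)

# Hypothetical 3-item L-ensemble matrix (symmetric, positive definite).
L = np.array([[2.0, 0.5, 0.1],
              [0.5, 1.0, 0.3],
              [0.1, 0.3, 1.5]])
eigenvalues, eigenvectors = np.linalg.eigh(L)
rng = np.random.default_rng(0)
draws = [sample_dpp(eigenvalues, eigenvectors, rng) for _ in range(4000)]

# The empirical inclusion frequency of item 0 should approach K[0, 0].
K = L @ np.linalg.inv(np.eye(3) + L)
freq_item0 = np.mean([0 in s for s in draws])
print(freq_item0, K[0, 0])
```

Comparing the empirical inclusion frequency against the diagonal of the marginal kernel is a convenient sanity check for any DPP sampler.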

Examples

Sample points on the unit square grid:

import itertools
import numpy as np
from tensorflow_probability.python.internal.backend import jax as tf
import tensorflow_probability as tfp; tfp = tfp.substrates.jax
import matplotlib.pyplot as plt

tfd = tfp.distributions
tfpk = tfp.math.psd_kernels

grid_size = 16
# Generate grid_size**2 pts on the unit square.
grid = np.arange(0, 1, 1./grid_size)
points = np.array(list(itertools.product(grid, grid)))

# Create the kernel L that parameterizes the DPP.
kernel_amplitude = 2.
kernel_lengthscale = 2. / grid_size
kernel = tfpk.ExponentiatedQuadratic(kernel_amplitude, kernel_lengthscale)
kernel_matrix = kernel.matrix(points, points)

eigenvalues, eigenvectors = tf.linalg.eigh(kernel_matrix)
dpp = tfd.DeterminantalPointProcess(eigenvalues, eigenvectors)

# The inner-most dimension of the result of `dpp.sample` is a multi-hot
# encoding of a subset of {1, ..., ground_set_size}.

plt.figure(figsize=(6, 6))
for i, samp in enumerate(dpp.sample(4, seed=(1, 2))):  # 4 x grid_size**2
  plt.subplot(221 + i)
  plt.scatter(*points[np.where(samp)].T)
  plt.xticks([])
  plt.yticks([])
plt.tight_layout()
plt.show()

# Like any TFP distribution, the DPP supports batching and shaped samples.

kernel_amplitude = [2., 3, 4]  # Build a batch of 3 PSD kernels.
kernel_lengthscale = 2. / grid_size
kernel = tfpk.ExponentiatedQuadratic(kernel_amplitude, kernel_lengthscale)
kernel_matrix = kernel.matrix(points, points)  # 3 x 256 x 256

eigenvalues, eigenvectors = tf.linalg.eigh(kernel_matrix)
dpp = tfd.DeterminantalPointProcess(eigenvalues, eigenvectors)
print(dpp)  # batch shape: [3], event shape: [256]
samps = dpp.sample(2, seed=(10, 20))
print(samps.shape)  # shape: [2, 3, 256]
print(dpp.log_prob(samps))  # tensor with shape [2, 3]
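
Since each sample is a multi-hot vector over the ground set, it is often convenient to convert it to explicit index sets. The sketch below does this in NumPy on a hand-written array standing in for the output of `dpp.sample`; the array contents and the helper `to_index_sets` are hypothetical, and the indices it returns are 0-based.

```python
import numpy as np

# Hypothetical stand-in for samples of shape [2, 3, 5]:
# sample shape [2], batch shape [3], ground set of 5 items.
samples = np.array([
    [[1, 0, 0, 1, 0], [0, 0, 0, 0, 0], [0, 1, 1, 0, 1]],
    [[0, 0, 1, 0, 0], [1, 1, 0, 0, 0], [0, 0, 0, 0, 1]],
])

def to_index_sets(multi_hot):
    """Flattens leading dimensions and lists the selected indices per row."""
    flat = multi_hot.reshape(-1, multi_hot.shape[-1])
    return [np.flatnonzero(row).tolist() for row in flat]

index_sets = to_index_sets(samples)
subset_sizes = samples.sum(axis=-1)  # number of selected items, shape [2, 3]
print(index_sets)
print(subset_sizes)
```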

References

[1]: Odile Macchi. The coincidence approach to stochastic point processes. Advances in Applied Probability, 1975.

[2]: J. Ben Hough, Manjunath Krishnapur, Yuval Peres, Balint Virag. Determinantal point processes and independence. Probability Surveys, 2006. https://arxiv.org/abs/math/0503110

[3]: Alex Kulesza, Ben Taskar. Determinantal point processes for machine learning. Foundations and Trends in Machine Learning, 2012. https://arxiv.org/abs/1207.6083

Args
`eigenvalues`	`float` `Tensor` representing the eigenvalues of the DPP kernel (a.k.a. "L"). All eigenvalues must be > 0. Shape has the form `[b1, ..., bB, n]` where `n` is the number of points in the ground set.
`eigenvectors`	`float` `Tensor` representing the column eigenvectors of the DPP kernel ("L"), provided in the same order as the eigenvalues. Shape has the form `[b1, ..., bB, n, n]` where `n` is the number of points in the ground set. The batch shape components need not be identical to those of `eigenvalues`, but must be broadcast compatible with them.
`validate_args`	Python `bool`, default `False`. When `True` distribution parameters are checked for validity despite possibly degrading runtime performance. When `False` invalid inputs may silently render incorrect outputs. Default value: `False`.
`allow_nan_stats`	Python `bool`, default `False`. When `True`, statistics (e.g., mean, mode, variance) use the value "`NaN`" to indicate the result is undefined. When `False`, an exception is raised if one or more of the statistic's batch members are undefined. Default value: `False`.
`name`	Python `str` name prefixed to ops created by this class.
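
As an illustration of the expected argument shapes, the following NumPy sketch builds a batch of three positive definite matrices and decomposes them. `np.linalg.eigh` is used here as a stand-in for `tf.linalg.eigh`; the matrices and variable names are hypothetical.

```python
import numpy as np

n = 4
rng = np.random.default_rng(0)
A = rng.normal(size=(n, n))
base = A @ A.T + n * np.eye(n)            # symmetric positive definite
scales = np.array([1.0, 2.0, 3.0])
L_batch = scales[:, None, None] * base    # shape [3, n, n]

# eigh operates on the last two dimensions and returns column eigenvectors.
eigenvalues, eigenvectors = np.linalg.eigh(L_batch)
print(eigenvalues.shape)   # (3, 4)    i.e. [b1, n]
print(eigenvectors.shape)  # (3, 4, 4) i.e. [b1, n, n]

# Rebuild L = U D U^T for every batch member to confirm the convention.
L_rebuilt = np.einsum('bij,bj,bkj->bik', eigenvectors, eigenvalues, eigenvectors)
print(np.allclose(L_rebuilt, L_batch))
```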

Attributes
`allow_nan_stats`	Python `bool` describing behavior when a stat is undefined. Stats return +/- infinity when it makes sense. E.g., the variance of a Cauchy distribution is infinity. However, sometimes the statistic is undefined, e.g., if a distribution's pdf does not achieve a maximum within the support of the distribution, the mode is undefined. If the mean is undefined, then by definition the variance is undefined. E.g. the mean for Student's T for df = 1 is undefined (no clear way to say it is either + or - infinity), so the variance = E[(X - mean)**2] is also undefined.
`batch_shape`	Shape of a single sample from a single event index as a `TensorShape`. May be partially defined or unknown. The batch dimensions are indexes into independent, non-identical parameterizations of this distribution.
`dtype`	The `DType` of `Tensor`s handled by this `Distribution`.
`eigenvalues`
`eigenvectors`
`event_shape`	Shape of a single sample from a single batch as a `TensorShape`. May be partially defined or unknown.
`experimental_shard_axis_names`	The list or structure of lists of active shard axis names.
`name`	Name prepended to all ops created by this `Distribution`.
`parameters`	Dictionary of parameters used to instantiate this `Distribution`.
`reparameterization_type`	Describes how samples from the distribution are reparameterized. Currently this is one of the static instances `tfd.FULLY_REPARAMETERIZED` or `tfd.NOT_REPARAMETERIZED`.
`trainable_variables`
`validate_args`	Python `bool` indicating possibly expensive checks are enabled.
`variables`

Args
`value`	`float` or `double` `Tensor`.
`name`	Python `str` prepended to names of ops created by this function.
`**kwargs`	Named arguments forwarded to subclass implementation.

Args
`other`	`tfp.distributions.Distribution` instance.
`name`	Python `str` prepended to names of ops created by this function.

experimental_default_event_space_bijector

Args
`*args`	Passed to implementation `_default_event_space_bijector`.
`**kwargs`	Passed to implementation `_default_event_space_bijector`.

experimental_fit

Args
`value`	a `Tensor` valid sample from this distribution family.
`sample_ndims`	Positive `int` Tensor number of leftmost dimensions of `value` that index i.i.d. samples. Default value: `1`.
`validate_args`	Python `bool`, default `False`. When `True`, distribution parameters are checked for validity despite possibly degrading runtime performance. When `False`, invalid inputs may silently render incorrect outputs. Default value: `False`.
`**init_kwargs`	Additional keyword arguments passed through to `cls.__init__`. These take precedence in case of collision with the fitted parameters; for example, `tfd.Normal.experimental_fit([1., 1.], scale=20.)` returns a Normal distribution with `scale=20.` rather than the maximum likelihood parameter `scale=0.`.

experimental_local_measure

Args
`value`	`float` or `double` `Tensor`.
`backward_compat`	`bool` specifying whether to fall back to returning `FullSpace` as the tangent space, and representing R^n with the standard basis.
`**kwargs`	Named arguments forwarded to subclass implementation.

Returns
`log_prob`	a `Tensor` representing the log probability density, of shape `sample_shape(x) + self.batch_shape` with values of type `self.dtype`.
`tangent_space`	a `TangentSpace` object (by default `FullSpace`) representing the tangent space to the manifold at `value`.

experimental_sample_and_log_prob

Args
`sample_shape`	integer `Tensor` desired shape of samples to draw. Default value: `()`.
`seed`	PRNG seed; see `tfp.random.sanitize_seed` for details. Default value: `None`.
`name`	name to give to the op. Default value: `'sample_and_log_prob'`.
`**kwargs`	Named arguments forwarded to subclass implementation.

Returns
`samples`	a `Tensor`, or structure of `Tensor`s, with prepended dimensions `sample_shape`.
`log_prob`	a `Tensor` of shape `sample_shape(x) + self.batch_shape` with values of type `self.dtype`.

param_shapes

Args
`sample_shape`	`Tensor` or python list/tuple. Desired shape of a call to `sample()`.
`name`	name to prepend ops with.

parameter_properties

Args
`dtype`	Optional float `dtype` to assume for continuous-valued parameters. Some constraining bijectors require advance knowledge of the dtype because certain constants (e.g., `tfb.Softplus.low`) must be instantiated with the same dtype as the values to be transformed.
`num_classes`	Optional `int` `Tensor` number of classes to assume when inferring the shape of parameters for categorical-like distributions. Otherwise ignored.

sample

Args
`sample_shape`	0D or 1D `int32` `Tensor`. Shape of the generated samples.
`seed`	PRNG seed; see `tfp.random.sanitize_seed` for details.
`name`	name to give to the op.
`**kwargs`	Named arguments forwarded to subclass implementation.

Methods

batch_shape_tensor

cdf

copy

covariance

cross_entropy

entropy

event_shape_tensor

experimental_default_event_space_bijector

experimental_fit

experimental_local_measure

experimental_sample_and_log_prob

is_scalar_batch

is_scalar_event

kl_divergence

l_ensemble_matrix

log_cdf

log_prob

log_survival_function

marginal_kernel

marginal_log_prob

mean

mode

param_shapes

param_static_shapes

parameter_properties

prob

quantile

sample

stddev

survival_function

unnormalized_log_prob

variance

__getitem__

__iter__
