tfp.stats.cholesky_covariance
Cholesky factor of the covariance matrix of vector-variate random samples.
tfp.stats.cholesky_covariance(
x, sample_axis=0, keepdims=False, name=None
)
This function can be used to fit a multivariate normal to data.
import tensorflow as tf
import tensorflow_probability as tfp
import matplotlib.pyplot as plt
tfd = tfp.distributions
# Assume data.shape = (1000, 2). 1000 samples of a random variable in R^2.
observed_data = read_data_samples(...)
# The mean is easy
mu = tf.reduce_mean(observed_data, axis=0)
# Get the scale matrix
L = tfp.stats.cholesky_covariance(observed_data)
# Make the best-fit multivariate normal (the maximum likelihood estimate).
mvn = tfd.MultivariateNormalTriL(loc=mu, scale_tril=L)
# Plot contours of the pdf.
xs, ys = tf.meshgrid(
tf.linspace(-5., 5., 50), tf.linspace(-5., 5., 50), indexing='ij')
xy = tf.stack((tf.reshape(xs, [-1]), tf.reshape(ys, [-1])), axis=-1)
pdf = tf.reshape(mvn.prob(xy), (50, 50))
CS = plt.contour(xs, ys, pdf, 10)
plt.clabel(CS, inline=1, fontsize=10)
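As a cross-check, the quantity this function computes can be sketched in plain NumPy: the Cholesky factor of the sample covariance matrix. This is an illustrative re-implementation, not TFP's code; it assumes `1/N` normalization of the sample covariance, which may differ from TFP's choice.

```python
import numpy as np

def cholesky_covariance_np(x, sample_axis=0):
    """NumPy sketch: Cholesky factor of the sample covariance of x.

    Assumes 1/N normalization (no Bessel correction).
    """
    x = np.moveaxis(np.asarray(x, dtype=np.float64), sample_axis, 0)
    n = x.shape[0]
    centered = x - x.mean(axis=0, keepdims=True)
    cov = centered.T @ centered / n       # (d, d) sample covariance
    return np.linalg.cholesky(cov)        # lower-triangular factor

rng = np.random.default_rng(0)
samples = rng.normal(size=(1000, 2))
L = cholesky_covariance_np(samples)
# L @ L.T reconstructs the sample covariance matrix.
print(np.allclose(L @ L.T, np.cov(samples, rowvar=False, bias=True)))
```

`np.cov(..., bias=True)` uses the same `1/N` normalization, so the reconstruction matches exactly up to floating-point error.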
Why does this work?
Given vector-variate random variables `X = (X1, ..., Xd)`, one may obtain the sample covariance matrix in `R^{d x d}` (see `tfp.stats.covariance`).

The Cholesky factor of this matrix is analogous to standard deviation for scalar random variables: suppose `X` has covariance matrix `C`, with Cholesky factorization `C = L L^T`. Then multiplying a vector of iid unit-variance random variables by `L` produces a vector with covariance `L L^T`, which is the same as the covariance of `X`.
observed_data = read_data_samples(...)
L = tfp.stats.cholesky_covariance(observed_data, sample_axis=0)
# Make fake_data with the same covariance as observed_data.
uncorrelated_normal = tf.random.normal(shape=(500, 10))  # 500 iid samples in R^10
fake_data = tf.linalg.matvec(L, uncorrelated_normal)     # applies L to each sample row
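The claim above can be verified numerically. The following NumPy sketch uses a made-up lower-triangular `L` (a stand-in for a fitted factor, not from the example above) and checks that samples transformed by `L` have sample covariance close to `L @ L.T`:

```python
import numpy as np

rng = np.random.default_rng(42)
# A fixed lower-triangular "scale" matrix standing in for a fitted L.
L = np.array([[2.0, 0.0],
              [1.0, 0.5]])
z = rng.normal(size=(200_000, 2))   # iid samples with unit variance
correlated = z @ L.T                # row-wise L @ z, like tf.linalg.matvec(L, z)
sample_cov = np.cov(correlated, rowvar=False)
# With many samples, the sample covariance approaches L @ L.T.
print(np.allclose(sample_cov, L @ L.T, atol=0.05))
```

The tolerance is loose because the sample covariance only converges to `L @ L.T` at a rate of roughly `O(1/sqrt(N))`.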
| Args | |
|---|---|
| `x` | Numeric `Tensor`. The rightmost dimension of `x` indexes events, e.g. dimensions of a random vector. |
| `sample_axis` | Scalar or vector `Tensor` designating the axis holding samples. Default value: `0` (leftmost dimension). Cannot be the rightmost dimension (since this indexes events). |
| `keepdims` | Boolean. Whether to keep the sample axis as singletons. |
| `name` | Python `str` name prefixed to Ops created by this function. Default value: `None` (i.e., `'covariance'`). |

| Returns | |
|---|---|
| `chol` | `Tensor` of same `dtype` as `x`. The last two dimensions hold lower-triangular matrices (the Cholesky factors). |
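To illustrate `sample_axis` with batched input, here is a NumPy sketch (a hypothetical shape, with `1/N` normalization assumed as above): for a batch of sample sets with samples along axis `1`, one Cholesky factor is produced per batch member.

```python
import numpy as np

rng = np.random.default_rng(1)
# batch=3, samples=500, events=4; analogous to sample_axis=1 in TFP.
x = rng.normal(size=(3, 500, 4))

# One (4, 4) lower-triangular Cholesky factor per batch member.
out = np.stack([
    np.linalg.cholesky(np.cov(batch, rowvar=False, bias=True))
    for batch in x
])
print(out.shape)  # (3, 4, 4)
```

The last two dimensions of the result hold the lower-triangular factors, matching the shape contract described in the Returns table.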
Except as otherwise noted, the content of this page is licensed under the Creative Commons Attribution 4.0 License, and code samples are licensed under the Apache 2.0 License. For details, see the Google Developers Site Policies. Java is a registered trademark of Oracle and/or its affiliates.
Last updated 2023-11-21 UTC.