A running variance computation.
Inherits From: RunningCovariance, AutoCompositeTensor
tfp.experimental.stats.RunningVariance(
num_samples,
mean,
sum_squared_residuals,
event_ndims,
name='RunningCovariance'
)
This is just an alias for RunningCovariance, with the event_ndims set to 0
to compute variances.
RunningVariance is meant to serve general streaming variance needs.
For a specialized version that fits streaming over MCMC samples, see
VarianceReducer in tfp.experimental.mcmc.
Methods
covariance
covariance(
ddof=0
)
Returns the covariance accumulated so far.
| Args | |
|---|---|
| ddof | Requested delta degrees of freedom for the covariance calculation. For example, use ddof=0 for population covariance and ddof=1 for sample covariance. Defaults to the population covariance. |

| Returns | |
|---|---|
| covariance | An estimate of the covariance. |
from_example
@classmethod
from_example(
example
)
Starts a RunningVariance from an example.
| Args | |
|---|---|
| example | A Tensor. The RunningVariance will accept samples of the same dtype and broadcast-compatible shape as the example. |

| Returns | |
|---|---|
| var | An empty RunningVariance, ready for incoming samples. Note that, by convention, the supplied example is used only for initialization and is not counted as a sample. |
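
The convention that the example only seeds the accumulator can be illustrated with a plain-Python sketch. `RunningVarianceSketch` is a hypothetical scalar stand-in written for this illustration, not the TFP class; it uses Welford's single-sample update and assumes the example's value is never folded into the statistics.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RunningVarianceSketch:
    """Hypothetical scalar stand-in for a running variance (not the TFP class)."""
    num_samples: float
    mean: float
    sum_squared_residuals: float

    @classmethod
    def from_example(cls, example):
        # The example only confirms that samples are numeric; its value is discarded.
        float(example)
        return cls(num_samples=0.0, mean=0.0, sum_squared_residuals=0.0)

    def update(self, new_sample):
        # Welford's single-sample update; returns a new object instead of mutating.
        n = self.num_samples + 1.0
        delta = new_sample - self.mean
        mean = self.mean + delta / n
        m2 = self.sum_squared_residuals + delta * (new_sample - mean)
        return RunningVarianceSketch(n, mean, m2)

    def variance(self, ddof=0):
        return self.sum_squared_residuals / (self.num_samples - ddof)


state = RunningVarianceSketch.from_example(100.0)  # 100.0 is not counted
empty_count = state.num_samples
for x in [1.0, 2.0, 3.0]:
    state = state.update(x)
```

After the loop the sketch has seen three samples, and the seed value 100.0 has no influence on the mean or the variance.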
from_shape
@classmethod
from_shape(
shape=(), dtype=tf.float32
)
Starts a RunningVariance from shape and dtype metadata.
| Args | |
|---|---|
| shape | Python Tuple or TensorShape representing the shape of incoming samples. This is useful to supply if the RunningVariance will be carried by a tf.while_loop, so that broadcasting does not change the shape across loop iterations. |
| dtype | Dtype of incoming samples and the resulting statistics. By default, the dtype is tf.float32. Any integer dtype will be cast to the corresponding float dtype (e.g. tf.int32 will be cast to tf.float32), because the intermediate calculations perform floating-point division. |

| Returns | |
|---|---|
| var | An empty RunningVariance, ready for incoming samples. |
from_stats
@classmethod
from_stats(
num_samples, mean, variance
)
Initialize a RunningVariance object with given stats.
This allows the user to initialize knowing the mean, variance, and number of samples seen so far.
| Args | |
|---|---|
| num_samples | Scalar float Tensor, for number of examples already seen. |
| mean | float Tensor, for starting mean of estimate. |
| variance | float Tensor, for starting estimate of the variance. |

| Returns |
|---|
| RunningVariance object, with given mean and variance estimate. |
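
Resuming from summary statistics can be sketched in plain Python. The helper names `state_from_stats` and `update` are hypothetical, and the sketch assumes the supplied variance is the population variance (ddof=0), so that the sum of squared residuals can be recovered as num_samples * variance. This page does not state how the library maps the arguments onto its internal state.

```python
import statistics


def state_from_stats(num_samples, mean, variance):
    # Assumption: `variance` is the population variance (ddof=0), so the
    # sum of squared residuals equals num_samples * variance.
    n = float(num_samples)
    return (n, float(mean), n * float(variance))


def update(state, x):
    # Welford's single-sample update on a (count, mean, sum_sq_resid) tuple.
    n, mean, m2 = state
    n += 1.0
    delta = x - mean
    mean += delta / n
    m2 += delta * (x - mean)
    return (n, mean, m2)


seen = [2.0, 4.0, 4.0, 5.0]
incoming = [7.0, 9.0]

# Resume from the summary statistics of the samples already seen.
state = state_from_stats(
    len(seen), statistics.fmean(seen), statistics.pvariance(seen))
for x in incoming:
    state = update(state, x)

resumed_variance = state[2] / state[0]
full_variance = statistics.pvariance(seen + incoming)
```

Under that assumption, `resumed_variance` matches the variance computed over all six samples at once, up to floating-point rounding.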
tree_flatten
tree_flatten()
tree_unflatten
@classmethod
tree_unflatten(
metadata, tensors
)
update
update(
new_sample, axis=None
)
Update the RunningCovariance with a new sample.
The update formula is from Philippe Pebay (2008) [1]. This implementation supports both batched and chunked covariance computation. A "batch" is the usual parallel computation: a batch of size N implies N independent covariance computations, each stepping one sample (or chunk) at a time. A "chunk" of size M implies incorporating M samples into a single covariance computation at once, which is more efficient than incorporating them one by one.
To further illustrate the difference between batching and chunking, consider the following example:
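```python
import tensorflow as tf
import tensorflow_probability as tfp

# Treat as 3 samples from each of 5 independent vector random variables of
# shape (2,). The batch shape is (5,); the chunk of 3 samples lies along axis 0.
sample = tf.ones((3, 5, 2))
running_cov = tfp.experimental.stats.RunningCovariance.from_shape(
    (5, 2), event_ndims=1)
# Fold all 3 samples into each of the 5 computations in a single update.
running_cov = running_cov.update(sample, axis=0)
final_cov = running_cov.covariance()
final_cov.shape  # (5, 2, 2)
```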
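
The arithmetic behind a chunked update can be sketched for scalar samples in plain Python. This is a pairwise merge of the kind described in [1], written for illustration only; it is not the library's implementation, and `merge_chunk` and the `(count, mean, sum_sq_resid)` tuple layout are hypothetical.

```python
import statistics


def merge_chunk(state, chunk):
    """Fold a whole chunk of scalar samples into (count, mean, sum_sq_resid)."""
    n, mean, m2 = state
    m = float(len(chunk))
    chunk_mean = sum(chunk) / m
    chunk_m2 = sum((x - chunk_mean) ** 2 for x in chunk)
    total = n + m
    delta = chunk_mean - mean
    # Combine the two partial means and the two partial residual sums.
    new_mean = mean + delta * m / total
    new_m2 = m2 + chunk_m2 + delta ** 2 * n * m / total
    return (total, new_mean, new_m2)


data = [1.0, 4.0, 2.0, 8.0, 5.0, 7.0]
empty = (0.0, 0.0, 0.0)

# Chunked: two chunks of three samples each.
chunked = merge_chunk(merge_chunk(empty, data[:3]), data[3:])

# One at a time: every sample is treated as a chunk of size one.
one_by_one = empty
for x in data:
    one_by_one = merge_chunk(one_by_one, [x])

chunked_variance = chunked[2] / chunked[0]
one_by_one_variance = one_by_one[2] / one_by_one[0]
```

Both routes arrive at the same population variance. Batching, by contrast, would keep one such state per independent variable and advance all of them in parallel.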
| Args | |
|---|---|
| new_sample | Incoming sample with shape and dtype compatible with those used to form this RunningCovariance. |
| axis | If chunking is desired, this is an integer that specifies the axis with chunked samples. For individual samples, set this to None. By default, samples are not chunked (axis is None). |

| Returns | |
|---|---|
| cov | Newly allocated RunningCovariance updated to include new_sample. |
References
[1]: Philippe Pebay. Formulas for Robust, One-Pass Parallel Computation of Covariances and Arbitrary-Order Statistical Moments. Technical Report SAND2008-6212, 2008. https://prod-ng.sandia.gov/techlib-noauth/access-control.cgi/2008/086212.pdf
variance
variance(
ddof=0
)
Returns the variance accumulated so far.
| Args | |
|---|---|
| ddof | Requested delta degrees of freedom for the variance calculation. For example, use ddof=0 for population variance and ddof=1 for sample variance. Defaults to the population variance. |

| Returns | |
|---|---|
| variance | An estimate of the variance. |
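
The effect of ddof can be checked with a small plain-Python sketch. It assumes the usual convention that the accumulated sum of squared residuals is divided by num_samples - ddof; the local `variance` function is a hypothetical stand-in, not the library method.

```python
import statistics

samples = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
num_samples = len(samples)
mean = sum(samples) / num_samples
sum_squared_residuals = sum((x - mean) ** 2 for x in samples)


def variance(ddof=0):
    # Assumed convention: the divisor shrinks by ddof, giving N for the
    # population variance and N - 1 for the sample variance.
    return sum_squared_residuals / (num_samples - ddof)


population_variance = variance(ddof=0)  # 32 / 8 = 4.0
sample_variance = variance(ddof=1)      # 32 / 7
```

The two results agree with `statistics.pvariance(samples)` and `statistics.variance(samples)` respectively.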