View source on GitHub
|
Entropy model for power-law distributed random variables.
tfc.entropy_models.PowerLawEntropyModel(
coding_rank, alpha=0.01, bottleneck_dtype=None
)
This entropy model handles quantization and compression of a bottleneck tensor and implements a penalty that encourages compressibility under the Elias gamma code.
The gamma code has code lengths 1 + 2 floor(log_2(x)), for x a positive
integer, and is close to optimal if x is distributed according to a power
law. Being a universal code, it also guarantees that in the worst case, the
expected code length is no more than 3 times the entropy of the empirical
distribution of x, as long as probability decreases with increasing x. For
details on the gamma code, see:
"Universal Codeword Sets and Representations of the Integers"
P. Elias
https://doi.org/10.1109/TIT.1975.1055349
Given a signed integer, run_length_gamma_encode encodes zeros using a
run-length code, the sign using a uniform bit, and applies the gamma code to
the magnitude.
The penalty applied by this class is given by:
log((abs(x) + alpha) / alpha)
This encourages x to follow a symmetrized power law, but only approximately
for alpha > 0. Without alpha, the penalty would have a singularity at
zero. Setting alpha to a small positive value ensures that the penalty is
non-negative, and that its gradients are useful for optimization.
Args | |
|---|---|
coding_rank
|
Integer. Number of innermost dimensions considered a coding
unit. Each coding unit is compressed to its own bit string, and the
estimated rate is summed over each coding unit in bits().
|
alpha
|
Float. Regularization parameter preventing gradient singularity around zero. |
bottleneck_dtype
|
tf.dtypes.DType. Data type of bottleneck tensor.
Defaults to tf.keras.mixed_precision.global_policy().compute_dtype.
|
Attributes | |
|---|---|
alpha
|
Alpha parameter. |
bottleneck_dtype
|
Data type of the bottleneck tensor. |
coding_rank
|
Number of innermost dimensions considered a coding unit. |
name
|
Returns the name of this module as passed or determined in the ctor. |
name_scope
|
Returns a tf.name_scope instance for this class.
|
non_trainable_variables
|
Sequence of non-trainable variables owned by this module and its submodules. |
submodules
|
Sequence of all sub-modules.
Submodules are modules which are properties of this module, or found as properties of modules which are properties of this module (and so on).
|
trainable_variables
|
Sequence of trainable variables owned by this module and its submodules. |
variables
|
Sequence of variables owned by this module and its submodules. |
Methods
compress
compress(
bottleneck
)
Compresses a floating-point tensor.
Compresses the tensor to bit strings. bottleneck is first quantized
as in quantize(), and then compressed using the run-length gamma code. The
quantized tensor can later be recovered by calling decompress().
The innermost self.coding_rank dimensions are treated as one coding unit,
i.e. are compressed into one string each. Any additional dimensions to the
left are treated as batch dimensions.
| Args | |
|---|---|
bottleneck
|
tf.Tensor containing the data to be compressed. Must have at
least self.coding_rank dimensions.
|
| Returns | |
|---|---|
A tf.Tensor having the same shape as bottleneck without the
self.coding_rank innermost dimensions, containing a string for each
coding unit.
|
decompress
decompress(
strings, code_shape
)
Decompresses a tensor.
Reconstructs the quantized tensor from bit strings produced by compress().
| Args | |
|---|---|
strings
|
tf.Tensor containing the compressed bit strings.
|
code_shape
|
Shape of innermost dimensions of the output tf.Tensor.
|
| Returns | |
|---|---|
A tf.Tensor of shape tf.shape(strings) + code_shape.
|
penalty
penalty(
bottleneck
)
Computes penalty encouraging compressibility.
| Args | |
|---|---|
bottleneck
|
tf.Tensor containing the data to be compressed. Must have at
least self.coding_rank dimensions.
|
| Returns | |
|---|---|
Penalty value, which has the same shape as bottleneck without the
self.coding_rank innermost dimensions.
|
quantize
quantize(
bottleneck
)
Quantizes a floating-point bottleneck tensor.
The tensor is rounded to integer values. The gradient of this rounding operation is overridden with the identity (straight-through gradient estimator).
| Args | |
|---|---|
bottleneck
|
tf.Tensor containing the data to be quantized.
|
| Returns | |
|---|---|
A tf.Tensor containing the quantized values.
|
with_name_scope
@classmethodwith_name_scope( method )
Decorator to automatically enter the module name scope.
class MyModule(tf.Module):@tf.Module.with_name_scopedef __call__(self, x):if not hasattr(self, 'w'):self.w = tf.Variable(tf.random.normal([x.shape[1], 3]))return tf.matmul(x, self.w)
Using the above module would produce tf.Variables and tf.Tensors whose
names included the module name:
mod = MyModule()mod(tf.ones([1, 2]))<tf.Tensor: shape=(1, 3), dtype=float32, numpy=..., dtype=float32)>mod.w<tf.Variable 'my_module/Variable:0' shape=(2, 3) dtype=float32,numpy=..., dtype=float32)>
| Args | |
|---|---|
method
|
The method to wrap. |
| Returns | |
|---|---|
| The original method wrapped such that it enters the module's name scope. |
__call__
__call__(
bottleneck
)
Perturbs a tensor with (quantization) noise and computes penalty.
| Args | |
|---|---|
bottleneck
|
tf.Tensor containing the data to be compressed. Must have at
least self.coding_rank dimensions.
|
| Returns | |
|---|---|
A tuple (self.quantize(bottleneck), self.penalty(bottleneck)).
|
View source on GitHub