tf.quantization.fake_quant_with_min_max_vars
Fake-quantize the 'inputs' tensor of type float via global float scalars
```python
tf.quantization.fake_quant_with_min_max_vars(
    inputs: Annotated[Any, _atypes.Float32],
    min: Annotated[Any, _atypes.Float32],
    max: Annotated[Any, _atypes.Float32],
    num_bits: int = 8,
    narrow_range: bool = False,
    name=None
) -> Annotated[Any, _atypes.Float32]
```
Fake-quantize the `inputs` tensor of type float via global float scalars `min` and `max` to an `outputs` tensor of the same shape as `inputs`.
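For example, a minimal usage sketch (the input values and the `[0.0, 1.0]` clamping range below are illustrative assumptions, not from the original docs):

```python
import tensorflow as tf

# Illustrative inputs; the clamping range [0.0, 1.0] is an assumed choice.
inputs = tf.constant([-0.2, 0.0, 0.3, 0.7, 1.4], dtype=tf.float32)

outputs = tf.quantization.fake_quant_with_min_max_vars(
    inputs=inputs, min=0.0, max=1.0, num_bits=8, narrow_range=False
)
# Values are clamped to [0.0, 1.0], snapped to one of 2^8 = 256 levels,
# and returned as float32 with the same shape as `inputs`.
print(outputs)
```

The result keeps the shape and `float32` dtype of `inputs`; only the values change, each being snapped to a representable quantization level.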
Attributes

- `[min; max]` define the clamping range for the `inputs` data.
- `inputs` values are quantized into the quantization range (`[0; 2^num_bits - 1]` when `narrow_range` is false and `[1; 2^num_bits - 1]` when it is true) and then de-quantized and output as floats in the `[min; max]` interval (a rough sketch of this round trip follows the list).
- `num_bits` is the bitwidth of the quantization; between 2 and 16, inclusive.
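As a rough illustration of that clamp/quantize/de-quantize round trip, here is a plain-NumPy sketch under the simplifying assumption that `min` and `max` need no further adjustment (the adjustment rules are described next); it is not the op's actual kernel:

```python
import numpy as np

def fake_quant_sketch(x, min_val, max_val, num_bits=8, narrow_range=False):
    """Rough sketch: clamp to [min; max], map to integer levels, map back to floats."""
    quant_min = 1 if narrow_range else 0
    quant_max = 2 ** num_bits - 1
    scale = (max_val - min_val) / (quant_max - quant_min)
    clamped = np.clip(x, min_val, max_val)                      # clamp to [min; max]
    levels = np.round((clamped - min_val) / scale) + quant_min  # integer quantization levels
    return (levels - quant_min) * scale + min_val               # de-quantize back to floats

print(fake_quant_sketch(np.array([-0.2, 0.3, 0.7, 1.4]), 0.0, 1.0))
```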
Before quantization, the `min` and `max` values are adjusted with the following logic. It is suggested to have `min <= 0 <= max`. If `0` is not in the range of values, the behavior can be unexpected (a plain-Python sketch of the three cases follows the list):

- If `0 < min < max`: `min_adj = 0` and `max_adj = max - min`.
- If `min < max < 0`: `min_adj = min - max` and `max_adj = 0`.
- If `min <= 0 <= max`: `scale = (max - min) / (2^num_bits - 1)`, `min_adj = scale * round(min / scale)` and `max_adj = max + min_adj - min`.
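The three cases above can be read as the following plain-Python sketch (an illustrative transcription of the rules, not the kernel's actual code):

```python
def adjust_range(min_val, max_val, num_bits=8):
    """Sketch of how min/max are nudged so that zero is exactly representable."""
    if 0 < min_val < max_val:          # range entirely above zero
        return 0.0, max_val - min_val
    if min_val < max_val < 0:          # range entirely below zero
        return min_val - max_val, 0.0
    # min_val <= 0 <= max_val: shift so that 0 falls exactly on a quantization level
    scale = (max_val - min_val) / (2 ** num_bits - 1)
    min_adj = scale * round(min_val / scale)
    return min_adj, max_val + min_adj - min_val

print(adjust_range(-0.1, 0.9))   # zero inside the range: small shift to align levels
print(adjust_range(0.2, 1.0))    # range above zero: shifted down to start at 0
```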
This operation has a gradient and thus allows for training `min` and `max` values.
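Because gradients flow to `min` and `max`, the bounds can be held in `tf.Variable`s and updated by gradient descent; a minimal sketch (the toy loss, batch, and initial bounds are assumptions for illustration):

```python
import tensorflow as tf

min_var = tf.Variable(-1.0)   # learnable lower clamp bound (assumed initial value)
max_var = tf.Variable(1.0)    # learnable upper clamp bound (assumed initial value)
x = tf.random.uniform([32], minval=-2.0, maxval=2.0)

with tf.GradientTape() as tape:
    y = tf.quantization.fake_quant_with_min_max_vars(x, min_var, max_var)
    # Toy objective: minimize the fake-quantization error on this batch.
    loss = tf.reduce_mean(tf.square(y - x))

d_min, d_max = tape.gradient(loss, [min_var, max_var])
min_var.assign_sub(0.1 * d_min)   # one plain gradient-descent step on each bound
max_var.assign_sub(0.1 * d_max)
```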
| Args | |
|----------------|------------------------------------------|
| `inputs`       | A `Tensor` of type `float32`.            |
| `min`          | A `Tensor` of type `float32`.            |
| `max`          | A `Tensor` of type `float32`.            |
| `num_bits`     | An optional `int`. Defaults to `8`.      |
| `narrow_range` | An optional `bool`. Defaults to `False`. |
| `name`         | A name for the operation (optional).     |
| Returns | |
|---|---|
| A `Tensor` of type `float32`. | |