tensorflow::ops::QuantizeV2

#include <array_ops.h>

Quantize the 'input' tensor of type float to 'output' tensor of type 'T'.

Summary

[min_range, max_range] are scalar floats that specify the range for the 'input' data. The 'mode' attribute controls exactly which calculations are used to convert the float values to their quantized equivalents. The 'round_mode' attribute controls which rounding tie-breaking algorithm is used when rounding float values to their quantized equivalents.
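
To illustrate what a rounding tie-breaking rule means, the sketch below contrasts two common choices: rounding halves away from zero and rounding halves to the nearest even value. The helper names are hypothetical and are not part of the op's API. This is not the op's kernel code, and the second helper assumes the default floating-point rounding mode (FE_TONEAREST) is in effect.

```cpp
#include <cmath>

// Hypothetical helper: ties round away from zero (2.5 -> 3, -2.5 -> -3).
inline float RoundHalfAwayFromZero(float x) { return std::round(x); }

// Hypothetical helper: ties round to the nearest even integer (2.5 -> 2, 3.5 -> 4).
// Assumes the default rounding mode FE_TONEAREST has not been changed.
inline float RoundHalfToEven(float x) { return std::nearbyint(x); }
```

The two rules differ only for values that lie exactly halfway between two integers; every other value rounds the same way under both.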

In 'MIN_COMBINED' mode, each value of the tensor will undergo the following:

out[i] = (in[i] - min_range) * range(T) / (max_range - min_range)
if T == qint8: out[i] -= (range(T) + 1) / 2.0

Here, range(T) = numeric_limits<T>::max() - numeric_limits<T>::min()

MIN_COMBINED Mode Example

Assume the input is type float and has a possible range of [0.0, 6.0] and the output type is quint8 ([0, 255]). The min_range and max_range values should be specified as 0.0 and 6.0. Quantizing from float to quint8 will multiply each value of the input by 255/6 and cast to quint8.

If the output type were qint8 ([-128, 127]), the operation would additionally subtract 128 from each value prior to casting, so that the range of values aligns with the range of qint8.
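
The MIN_COMBINED formula and the example above can be sketched in plain C++ as follows. This is a minimal illustration, not the op's kernel: the function name is hypothetical, std::uint8_t and std::int8_t stand in for quint8 and qint8, and rounding with std::round followed by clamping to the range of T is an assumption about the final cast.

```cpp
#include <algorithm>
#include <cmath>
#include <cstdint>
#include <limits>

// Hypothetical helper: MIN_COMBINED quantization of one float value.
// T is std::uint8_t (stand-in for quint8) or std::int8_t (stand-in for qint8).
template <typename T>
T QuantizeMinCombined(float in, float min_range, float max_range) {
  const float lo = static_cast<float>(std::numeric_limits<T>::min());
  const float hi = static_cast<float>(std::numeric_limits<T>::max());
  const float range_t = hi - lo;  // range(T), 255 for 8-bit types
  float out = (in - min_range) * range_t / (max_range - min_range);
  if (std::numeric_limits<T>::is_signed) {
    out -= (range_t + 1.0f) / 2.0f;  // shift by 128 for the signed 8-bit case
  }
  // Assumption: round to nearest and clamp before casting.
  out = std::min(hi, std::max(lo, std::round(out)));
  return static_cast<T>(out);
}
```

With min_range = 0.0 and max_range = 6.0, an input of 6.0 maps to 255 for the unsigned type and to 127 for the signed type.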

If the mode is 'MIN_FIRST', then this approach is used:

num_discrete_values = 1 << (# of bits in T)
range_adjust = num_discrete_values / (num_discrete_values - 1)
range = (range_max - range_min) * range_adjust
range_scale = num_discrete_values / range
quantized = round(input * range_scale) - round(range_min * range_scale) +
  numeric_limits<T>::min()
quantized = max(quantized, numeric_limits<T>::min())
quantized = min(quantized, numeric_limits<T>::max())

The biggest difference between this and MIN_COMBINED is that the minimum range is rounded first, before it is subtracted from the rounded value. With MIN_COMBINED, a small bias is introduced, so repeated iterations of quantizing and dequantizing introduce a larger and larger error.
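
The MIN_FIRST steps above translate almost line for line into C++. The sketch below is illustrative only: the function name is hypothetical, std::uint8_t and std::int8_t stand in for quint8 and qint8, intermediate values are held in double, and std::round is assumed for round().

```cpp
#include <algorithm>
#include <cmath>
#include <cstdint>
#include <limits>

// Hypothetical helper: MIN_FIRST quantization of one float value.
// Assumes T is an integer type of at most 32 bits.
template <typename T>
T QuantizeMinFirst(float input, float range_min, float range_max) {
  const double lowest = std::numeric_limits<T>::min();
  const double highest = std::numeric_limits<T>::max();
  const double num_discrete_values =
      static_cast<double>(1LL << (8 * sizeof(T)));
  const double range_adjust = num_discrete_values / (num_discrete_values - 1.0);
  const double range =
      (static_cast<double>(range_max) - range_min) * range_adjust;
  const double range_scale = num_discrete_values / range;
  // The minimum is rounded on its own before it is subtracted.
  double quantized = std::round(input * range_scale) -
                     std::round(range_min * range_scale) + lowest;
  quantized = std::max(quantized, lowest);
  quantized = std::min(quantized, highest);
  return static_cast<T>(quantized);
}
```

For a range of [0.0, 6.0] the scale works out to 42.5, so an input of 2.0 maps to 85 for the unsigned 8-bit type and to 85 - 128 = -43 for the signed one.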

SCALED Mode Example

SCALED mode matches the quantization approach used in QuantizeAndDequantize{V2|V3}.

If the mode is SCALED, the quantization is performed by multiplying each input value by a scale_factor. The scale_factor is determined from min_range and max_range to be as large as possible such that the range from min_range to max_range is representable within values of type T.

  const int min_T = std::numeric_limits<T>::min();
  const int max_T = std::numeric_limits<T>::max();
  const float max_float = std::numeric_limits<float>::max();

  const float scale_factor_from_min_side =
      (min_T * min_range > 0) ? min_T / min_range : max_float;
  const float scale_factor_from_max_side =
      (max_T * max_range > 0) ? max_T / max_range : max_float;

  const float scale_factor = std::min(scale_factor_from_min_side,
                                      scale_factor_from_max_side);

We next use the scale_factor to adjust min_range and max_range as follows:

  min_range = min_T / scale_factor;
  max_range = max_T / scale_factor;

For example, if T = qint8 and initially min_range = -10 and max_range = 9, we would compare -128 / -10.0 = 12.8 to 127 / 9.0 = 14.11 and set scale_factor = 12.8. In this case, min_range would remain -10, but max_range would be adjusted to 127 / 12.8 = 9.921875. So we will quantize input values in the range (-10, 9.921875) to (-128, 127).

The input tensor can now be quantized by clipping values to the range min_range to max_range, then multiplying by scale_factor as follows:

  result = round(min(max_range, max(min_range, input)) * scale_factor)

The adjusted min_range and max_range are returned as outputs 2 and 3 of this operation. These outputs should be used as the range for any further calculations.
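
The SCALED steps can be put together in a small self-contained sketch that reproduces the qint8 example with min_range = -10 and max_range = 9. The struct and function names are hypothetical, std::int8_t stands in for qint8, and std::round is assumed for round(). It is an illustration of the formulas above, not the op's kernel.

```cpp
#include <algorithm>
#include <cmath>
#include <cstdint>
#include <limits>

// Hypothetical container for the scale factor and the adjusted range.
struct ScaledRange {
  float scale_factor;
  float min_range;
  float max_range;
};

// Computes the largest scale factor that keeps [min_range, max_range]
// representable in T, then adjusts the range to match it.
template <typename T>
ScaledRange ComputeScaledRange(float min_range, float max_range) {
  const int min_T = std::numeric_limits<T>::min();
  const int max_T = std::numeric_limits<T>::max();
  const float max_float = std::numeric_limits<float>::max();
  const float from_min_side =
      (min_T * min_range > 0) ? min_T / min_range : max_float;
  const float from_max_side =
      (max_T * max_range > 0) ? max_T / max_range : max_float;
  const float scale_factor = std::min(from_min_side, from_max_side);
  return {scale_factor, min_T / scale_factor, max_T / scale_factor};
}

// Clips the input to the adjusted range, scales it, and rounds.
template <typename T>
T QuantizeScaled(float input, const ScaledRange& r) {
  const float clipped = std::min(r.max_range, std::max(r.min_range, input));
  return static_cast<T>(std::round(clipped * r.scale_factor));
}
```

Calling ComputeScaledRange<std::int8_t>(-10.0f, 9.0f) gives a scale factor of about 12.8 and an adjusted max_range of about 9.921875. Inputs beyond the adjusted range saturate at -128 or 127.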
  
narrow_range (bool) attribute

If true, the minimum quantized value is not used. For example, for int8 the quantized output is restricted to the range -127..127 instead of the full -128..127 range. This is provided for compatibility with certain inference backends. (Only applies to SCALED mode.)
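
One way to picture narrow_range is as raising min_T by one before the SCALED scale factor is computed. The sketch below makes exactly that assumption. The function names are hypothetical and std::int8_t stands in for qint8, so treat it as an illustration of the restricted output range rather than a description of the kernel.

```cpp
#include <algorithm>
#include <cmath>
#include <cstdint>
#include <limits>

// Hypothetical helper: SCALED scale factor for int8.
// Assumption: narrow_range raises min_T from -128 to -127.
inline float ScaleFactorInt8(float min_range, float max_range,
                             bool narrow_range) {
  const int min_T =
      std::numeric_limits<std::int8_t>::min() + (narrow_range ? 1 : 0);
  const int max_T = std::numeric_limits<std::int8_t>::max();
  const float max_float = std::numeric_limits<float>::max();
  const float from_min_side =
      (min_T * min_range > 0) ? min_T / min_range : max_float;
  const float from_max_side =
      (max_T * max_range > 0) ? max_T / max_range : max_float;
  return std::min(from_min_side, from_max_side);
}

// Hypothetical helper: quantizes one value with the scale factor above.
inline std::int8_t QuantizeScaledInt8(float input, float min_range,
                                      float max_range, bool narrow_range) {
  const int min_T =
      std::numeric_limits<std::int8_t>::min() + (narrow_range ? 1 : 0);
  const int max_T = std::numeric_limits<std::int8_t>::max();
  const float scale = ScaleFactorInt8(min_range, max_range, narrow_range);
  const float clipped =
      std::min(max_T / scale, std::max(min_T / scale, input));
  return static_cast<std::int8_t>(std::round(clipped * scale));
}
```

Under this assumption, a range of [-10, 9] gives a scale factor of about 12.7 instead of 12.8, and the most negative input maps to -127 rather than -128.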
  
axis (int) attribute

An optional axis attribute can specify a dimension index of the input tensor, such that quantization ranges are calculated and applied separately for each slice of the tensor along that dimension. This is useful for per-channel quantization.

If axis is specified, min_range and max_range must be 1-D tensors whose size matches the axis dimension of the input and output tensors.

If axis=None, per-tensor quantization is performed as normal.
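
Per-channel quantization can be sketched for a row-major [rows, channels] matrix quantized along axis 1 in SCALED mode, with one (min_range, max_range) pair per channel. The function name and the flat std::vector layout are assumptions made for this illustration, and std::int8_t stands in for qint8.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <limits>
#include <vector>

// Hypothetical helper: SCALED quantization with a separate range per channel.
// input is a row-major [rows, channels] matrix; min_range and max_range
// each hold one entry per channel.
inline std::vector<std::int8_t> QuantizePerChannel(
    const std::vector<float>& input, std::size_t channels,
    const std::vector<float>& min_range,
    const std::vector<float>& max_range) {
  const int min_T = std::numeric_limits<std::int8_t>::min();
  const int max_T = std::numeric_limits<std::int8_t>::max();
  const float max_float = std::numeric_limits<float>::max();
  std::vector<std::int8_t> output(input.size());
  for (std::size_t i = 0; i < input.size(); ++i) {
    const std::size_t c = i % channels;  // channel of element i
    const float from_min_side =
        (min_T * min_range[c] > 0) ? min_T / min_range[c] : max_float;
    const float from_max_side =
        (max_T * max_range[c] > 0) ? max_T / max_range[c] : max_float;
    const float scale = std::min(from_min_side, from_max_side);
    // Clip to the adjusted range of this channel, then scale and round.
    const float clipped =
        std::min(max_T / scale, std::max(min_T / scale, input[i]));
    output[i] = static_cast<std::int8_t>(std::round(clipped * scale));
  }
  return output;
}
```

Each channel gets its own scale factor, so a channel with range [-1, 1] and a channel with range [-10, 10] both use most of the int8 range even though their magnitudes differ by a factor of ten.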
  
ensure_minimum_range (float) attribute

Ensures the minimum quantization range is at least this value. The legacy default value for this is 0.01, but it is strongly suggested to set it to 0 for new uses.

Arguments:

  scope: A Scope object
  min_range: The minimum value of the quantization range. This value may be adjusted by the op depending on other parameters. The adjusted value is written to output_min. If the axis attribute is specified, this must be a 1-D tensor whose size matches the axis dimension of the input and output tensors.
  max_range: The maximum value of the quantization range. This value may be adjusted by the op depending on other parameters. The adjusted value is written to output_max. If the axis attribute is specified, this must be a 1-D tensor whose size matches the axis dimension of the input and output tensors.

Returns:

  Output output: The quantized data produced from the float input.
  Output output_min: The final quantization range minimum, used to clip input values before scaling and rounding them to quantized values. If the axis attribute is specified, this will be a 1-D tensor whose size matches the axis dimension of the input and output tensors.
  Output output_max: The final quantization range maximum, used to clip input values before scaling and rounding them to quantized values. If the axis attribute is specified, this will be a 1-D tensor whose size matches the axis dimension of the input and output tensors.

Constructors and Destructors
`QuantizeV2(const ::tensorflow::Scope & scope, ::tensorflow::Input input, ::tensorflow::Input min_range, ::tensorflow::Input max_range, DataType T)`
`QuantizeV2(const ::tensorflow::Scope & scope, ::tensorflow::Input input, ::tensorflow::Input min_range, ::tensorflow::Input max_range, DataType T, const QuantizeV2::Attrs & attrs)`

Public attributes
`operation`	`Operation`
`output`	`::tensorflow::Output`
`output_max`	`::tensorflow::Output`
`output_min`	`::tensorflow::Output`

Public static functions
`Axis(int64 x)`	`Attrs`
`EnsureMinimumRange(float x)`	`Attrs`
`Mode(StringPiece x)`	`Attrs`
`NarrowRange(bool x)`	`Attrs`
`RoundMode(StringPiece x)`	`Attrs`

Structs
`tensorflow::ops::QuantizeV2::Attrs`	Optional attribute setters for QuantizeV2.

Public attributes

operation
  Operation operation

output
  ::tensorflow::Output output

output_max
  ::tensorflow::Output output_max

output_min
  ::tensorflow::Output output_min

Public functions

QuantizeV2
  QuantizeV2(
    const ::tensorflow::Scope & scope,
    ::tensorflow::Input input,
    ::tensorflow::Input min_range,
    ::tensorflow::Input max_range,
    DataType T
  )

QuantizeV2
  QuantizeV2(
    const ::tensorflow::Scope & scope,
    ::tensorflow::Input input,
    ::tensorflow::Input min_range,
    ::tensorflow::Input max_range,
    DataType T,
    const QuantizeV2::Attrs & attrs
  )

Public static functions

Axis
  Attrs Axis(
    int64 x
  )

EnsureMinimumRange
  Attrs EnsureMinimumRange(
    float x
  )

Mode
  Attrs Mode(
    StringPiece x
  )

NarrowRange
  Attrs NarrowRange(
    bool x
  )

RoundMode
  Attrs RoundMode(
    StringPiece x
  )
