Performs a quantized matrix multiplication of a by the matrix b with bias add.
tf.raw_ops.QuantizedMatMulWithBias(
    a,
    b,
    bias,
    min_a,
    max_a,
    min_b,
    max_b,
    Toutput=tf.dtypes.qint32,
    transpose_a=False,
    transpose_b=False,
    input_quant_mode='MIN_FIRST',
    name=None
)
The inputs must be two-dimensional matrices and a 1D bias vector. The inner
  dimension of a (after being transposed if transpose_a is non-zero) must
  match the outer dimension of b (after being transposed if transpose_b is
  non-zero). A broadcast add with the bias values is then applied to the
  matrix multiplication result. The bias size must match the inner
  dimension of b.
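For orientation, here is a minimal usage sketch. The shapes, values, and
quantization ranges are made up for illustration, the float inputs are
quantized with tf.quantization.quantize (its default MIN_COMBINED mode is
used here purely for simplicity), and the quantized kernel may only be
registered in CPU builds with oneDNN/MKL support.

import tensorflow as tf

# Illustrative float inputs: a is [2, 3], b is [3, 4], bias is [4] and
# matches the inner dimension of b.
a_float = tf.constant([[1.0, 2.0, 3.0],
                       [4.0, 5.0, 6.0]], dtype=tf.float32)
b_float = tf.constant([[ 0.5, -0.5,  0.25, 1.0],
                       [ 1.0,  0.0, -0.25, 0.5],
                       [-1.0,  0.5,  0.75, 0.0]], dtype=tf.float32)
bias = tf.constant([0.1, -0.2, 0.3, 0.0], dtype=tf.float32)

# Quantize a to quint8 and b to qint8; tf.quantization.quantize returns
# (values, min, max), which become the min_*/max_* arguments below.
a_q, min_a, max_a = tf.quantization.quantize(a_float, 0.0, 6.0, tf.quint8)
b_q, min_b, max_b = tf.quantization.quantize(b_float, -1.0, 1.0, tf.qint8)

# Quantized matmul plus bias add; raw ops take keyword arguments only.
out, min_out, max_out = tf.raw_ops.QuantizedMatMulWithBias(
    a=a_q, b=b_q, bias=bias,
    min_a=min_a, max_a=max_a,
    min_b=min_b, max_b=max_b,
    Toutput=tf.qint32,
    transpose_a=False, transpose_b=False,
    input_quant_mode='MIN_FIRST')
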
Args:
    a: A Tensor. Must be one of the following types: qint8, quint8, qint32, qint16, quint16.
      A matrix to be multiplied. Must be a two-dimensional tensor of type quint8.
    b: A Tensor. Must be one of the following types: qint8, quint8, qint32, qint16, quint16.
      A matrix to be multiplied and must be a two-dimensional tensor of type qint8.
    bias: A Tensor. Must be one of the following types: float32, qint32.
      A 1D bias tensor with size matching inner dimension of b (after being
      transposed if transpose_b is non-zero).
    min_a: A Tensor of type float32.
      The float value that the lowest quantized a value represents.
    max_a: A Tensor of type float32.
      The float value that the highest quantized a value represents.
    min_b: A Tensor of type float32.
      The float value that the lowest quantized b value represents.
    max_b: A Tensor of type float32.
      The float value that the highest quantized b value represents.
    Toutput: An optional tf.DType from: tf.qint8, tf.quint8, tf.qint32, tf.qint16, tf.quint16. Defaults to tf.qint32.
    transpose_a: An optional bool. Defaults to False.
      If true, a is transposed before multiplication.
    transpose_b: An optional bool. Defaults to False.
      If true, b is transposed before multiplication.
    input_quant_mode: An optional string from: "MIN_FIRST", "SCALED". Defaults to "MIN_FIRST".
      Input data quantization mode. Either MIN_FIRST (default) or SCALED.
    name: A name for the operation (optional).
Returns:
    A tuple of Tensor objects (out, min_out, max_out).
    out: A `Tensor` of type `Toutput`.
    min_out: A `Tensor` of type `float32`.
    max_out: A `Tensor` of type `float32`.
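min_out and max_out give the float range represented by the quantized
output, so a typical follow-up (continuing the sketch above, and not part
of this op) is to dequantize the qint32 result back to float32:

# Recover an approximate float result from the quantized output and its range.
out_float = tf.quantization.dequantize(out, min_out, max_out)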