QuantizedMatMulWithBiasAndReluAndRequantize

public final class QuantizedMatMulWithBiasAndReluAndRequantize

Performs a quantized matrix multiplication of `a` by the matrix `b`, fused with bias add, relu, and requantize.

The inputs must be two two-dimensional matrices and a 1D bias vector. The inner dimension of `a` (after being transposed if `transpose_a` is non-zero) must match the outer dimension of `b` (after being transposed if `transpose_b` is non-zero). The bias values are then broadcast-added to the matrix multiplication result; the bias size must match the inner dimension of `b`. A relu activation is then applied to make the result non-negative, and the result is requantized to produce the final uint8 output.
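The steps described above can be sketched in plain Java. This is an illustrative reference under stated assumptions, not the kernel's implementation: it dequantizes to float with a simple linear mapping, takes a float bias, and does not model `transpose_a`, `transpose_b`, or the input quantization mode. The class name `FusedMatMulSketch` and its helper methods are hypothetical.

```java
import java.util.Arrays;

final class FusedMatMulSketch {
    // Assumption: a quantized level in [0, levels-1] maps linearly onto [min, max].
    static float dequantize(int level, int levels, float min, float max) {
        return min + level * (max - min) / (levels - 1);
    }

    // Map a float onto the nearest uint8 level for [min, max], clamped to [0, 255].
    static int quantizeToUint8(float value, float min, float max) {
        float scaled = (value - min) * 255f / (max - min);
        return Math.max(0, Math.min(255, Math.round(scaled)));
    }

    // a: [m][k] uint8 levels (0..255); b: [k][n] int8 values (-128..127); bias: [n] floats.
    static int[][] run(int[][] a, float minA, float maxA,
                       int[][] b, float minB, float maxB,
                       float[] bias, float minOut, float maxOut) {
        int m = a.length, k = b.length, n = bias.length;
        int[][] out = new int[m][n];
        for (int i = 0; i < m; i++) {
            for (int j = 0; j < n; j++) {
                float acc = 0f;
                for (int p = 0; p < k; p++) {
                    float av = dequantize(a[i][p], 256, minA, maxA);
                    // Shift int8 values to levels 0..255 before mapping.
                    float bv = dequantize(b[p][j] + 128, 256, minB, maxB);
                    acc += av * bv;                                 // matrix multiplication
                }
                acc += bias[j];                                     // broadcast bias add
                acc = Math.max(0f, acc);                            // relu
                out[i][j] = quantizeToUint8(acc, minOut, maxOut);   // requantize to uint8
            }
        }
        return out;
    }

    public static void main(String[] args) {
        int[][] out = run(new int[][] {{2, 3}}, 0f, 255f,
                          new int[][] {{1, -1}, {1, -1}}, -128f, 127f,
                          new float[] {1f, 1f}, 0f, 255f);
        System.out.println(Arrays.deepToString(out)); // prints [[6, 0]]
    }
}
```

The ranges in `main` are chosen so that each quantized level equals the float it represents, which makes the result easy to check by hand: the products are 5 and -5, the bias makes them 6 and -4, and relu clips the second to 0.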

Nested Classes

Constants

String OP_NAME The name of this op, as known by the TensorFlow core engine

Public Methods

static <W extends TType> QuantizedMatMulWithBiasAndReluAndRequantize<W>
create(Scope scope, Operand<? extends TType> a, Operand<? extends TType> b, Operand<? extends TType> bias, Operand<TFloat32> minA, Operand<TFloat32> maxA, Operand<TFloat32> minB, Operand<TFloat32> maxB, Operand<TFloat32> minFreezedOutput, Operand<TFloat32> maxFreezedOutput, Class<W> Toutput, Options... options)
Factory method to create a class wrapping a new QuantizedMatMulWithBiasAndReluAndRequantize operation.
static QuantizedMatMulWithBiasAndReluAndRequantize.Options
inputQuantMode(String inputQuantMode)
Output<TFloat32>
maxOut()
The float value that the highest quantized output value represents.
Output<TFloat32>
minOut()
The float value that the lowest quantized output value represents.
Output<W>
out()
static QuantizedMatMulWithBiasAndReluAndRequantize.Options
transposeA(Boolean transposeA)
static QuantizedMatMulWithBiasAndReluAndRequantize.Options
transposeB(Boolean transposeB)

Inherited Methods

org.tensorflow.op.RawOp
final boolean
equals(Object obj)
final int
hashCode()
Operation
op()
Return this unit of computation as a single Operation.
final String
toString()
java.lang.Object
boolean
equals(Object arg0)
final Class<?>
getClass()
int
hashCode()
final void
notify()
final void
notifyAll()
String
toString()
final void
wait(long arg0, int arg1)
final void
wait(long arg0)
final void
wait()
org.tensorflow.op.Op
abstract ExecutionEnvironment
env()
Return the execution environment this op was created in.
abstract Operation
op()
Return this unit of computation as a single Operation.

Constants

public static final String OP_NAME

The name of this op, as known by the TensorFlow core engine

Constant Value: "QuantizedMatMulWithBiasAndReluAndRequantize"

Public Methods

public static <W extends TType> QuantizedMatMulWithBiasAndReluAndRequantize<W> create (Scope scope, Operand<? extends TType> a, Operand<? extends TType> b, Operand<? extends TType> bias, Operand<TFloat32> minA, Operand<TFloat32> maxA, Operand<TFloat32> minB, Operand<TFloat32> maxB, Operand<TFloat32> minFreezedOutput, Operand<TFloat32> maxFreezedOutput, Class<W> Toutput, Options... options)

Factory method to create a class wrapping a new QuantizedMatMulWithBiasAndReluAndRequantize operation.

Parameters
scope current scope
a A matrix to be multiplied. Must be a two-dimensional tensor of type `quint8`.
b A matrix to be multiplied. Must be a two-dimensional tensor of type `qint8`.
bias A 1D bias tensor whose size matches the inner dimension of `b` (after being transposed if `transpose_b` is non-zero).
minA The float value that the lowest quantized `a` value represents.
maxA The float value that the highest quantized `a` value represents.
minB The float value that the lowest quantized `b` value represents.
maxB The float value that the highest quantized `b` value represents.
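minFreezedOutput The float value that the lowest quantized output value represents after requantize.
maxFreezedOutput The float value that the highest quantized output value represents after requantize.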
options carries optional attributes values
Returns
  • a new instance of QuantizedMatMulWithBiasAndReluAndRequantize

public static QuantizedMatMulWithBiasAndReluAndRequantize.Options inputQuantMode (String inputQuantMode)

Parameters
inputQuantMode Input data quantization mode. Either MIN_FIRST (default) or SCALED.

public Output<TFloat32> maxOut ()

The float value that the highest quantized output value represents.

public Output<TFloat32> minOut ()

The float value that the lowest quantized output value represents.

public Output<W> out ()

public static QuantizedMatMulWithBiasAndReluAndRequantize.Options transposeA (Boolean transposeA)

Parameters
transposeA If true, `a` is transposed before multiplication.

public static QuantizedMatMulWithBiasAndReluAndRequantize.Options transposeB (Boolean transposeB)

Parameters
transposeB If true, `b` is transposed before multiplication.
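The effect of `transposeA` and `transposeB` on the shape requirements can be sketched as a small check in plain Java. This is a hypothetical helper, not part of the TensorFlow API, and it assumes that the bias length must equal the number of columns of `b` after any transposition, which is one reading of "inner dimension of `b`".

```java
final class MatMulShapes {
    // Returns {rows, cols} of the product, or throws if the operands are incompatible.
    // aShape and bShape are {rows, cols} of the stored (untransposed) matrices.
    static int[] outputShape(int[] aShape, int[] bShape, int biasLength,
                             boolean transposeA, boolean transposeB) {
        if (aShape.length != 2 || bShape.length != 2) {
            throw new IllegalArgumentException("a and b must be two-dimensional");
        }
        // Effective dimensions after the optional transposes.
        int aRows = transposeA ? aShape[1] : aShape[0];
        int aCols = transposeA ? aShape[0] : aShape[1];
        int bRows = transposeB ? bShape[1] : bShape[0];
        int bCols = transposeB ? bShape[0] : bShape[1];
        if (aCols != bRows) {
            throw new IllegalArgumentException(
                "inner dimension of a (" + aCols + ") != outer dimension of b (" + bRows + ")");
        }
        // Assumption: the bias is added along the output columns.
        if (biasLength != bCols) {
            throw new IllegalArgumentException(
                "bias length (" + biasLength + ") != columns of b (" + bCols + ")");
        }
        return new int[] {aRows, bCols};
    }
}
```

For example, a stored `a` of shape 3x2 with `transposeA` set and a stored `b` of shape 4x3 with `transposeB` set behave like a 2x3 by 3x4 product, so the helper returns `{2, 4}` and expects a bias of length 4.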