Mfcc

public final class Mfcc

Transforms a spectrogram into a form that's useful for speech recognition.

Mel Frequency Cepstral Coefficients (MFCCs) are a way of representing audio data that has proven effective as an input feature for machine learning. They are created by taking the spectrum of a spectrogram (a 'cepstrum') and discarding some of the higher frequencies that are less significant to the human ear. They have a long history in speech recognition; https://en.wikipedia.org/wiki/Mel-frequency_cepstrum is a good resource to learn more.

Nested Classes

class Mfcc.Options Optional attributes for Mfcc

Constants

String OP_NAME The name of this op, as known by the TensorFlow core engine

Public Methods

Output < TFloat32 >
asOutput ()
Returns the symbolic handle of the tensor.
static Mfcc
create ( Scope scope, Operand < TFloat32 > spectrogram, Operand < TInt32 > sampleRate, Options... options)
Factory method to create a class wrapping a new Mfcc operation.
static Mfcc.Options
dctCoefficientCount (Long dctCoefficientCount)
static Mfcc.Options
filterbankChannelCount (Long filterbankChannelCount)
static Mfcc.Options
lowerFrequencyLimit (Float lowerFrequencyLimit)
Output < TFloat32 >
output ()
static Mfcc.Options
upperFrequencyLimit (Float upperFrequencyLimit)

Constants

public static final String OP_NAME

The name of this op, as known by the TensorFlow core engine

Constant Value: "Mfcc"

Public Methods

public Output < TFloat32 > asOutput ()

Returns the symbolic handle of the tensor.

Inputs to TensorFlow operations are outputs of another TensorFlow operation. This method is used to obtain a symbolic handle that represents the computation of the input.

public static Mfcc create ( Scope scope, Operand < TFloat32 > spectrogram, Operand < TInt32 > sampleRate, Options... options)

Factory method to create a class wrapping a new Mfcc operation.

Parameters
scope current scope
spectrogram Typically produced by the Spectrogram op, with magnitude_squared set to true.
sampleRate How many samples per second the source audio used.
options carries optional attribute values
Returns
  • a new instance of Mfcc

public static Mfcc.Options dctCoefficientCount (Long dctCoefficientCount)

Parameters
dctCoefficientCount How many output channels to produce per time slice.
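To illustrate what limiting the coefficient count means, here is a minimal sketch in plain Java, not the op's actual implementation. It assumes the textbook MFCC step in which a discrete cosine transform (DCT-II) is applied to the log filterbank energies of one time slice and only the first few coefficients are kept. The class and method names are hypothetical, and the scaling of the real op's output may differ.

```java
final class DctTruncationSketch {

    // Unnormalized DCT-II of one time slice of log filterbank energies,
    // keeping only the first coefficientCount terms. Hypothetical helper:
    // the real op may apply a different normalization.
    static double[] dctKeepFirst(double[] logEnergies, int coefficientCount) {
        int n = logEnergies.length;
        if (coefficientCount < 0 || coefficientCount > n) {
            throw new IllegalArgumentException("coefficientCount must be in [0, " + n + "]");
        }
        double[] coefficients = new double[coefficientCount];
        for (int k = 0; k < coefficientCount; k++) {
            double sum = 0.0;
            for (int i = 0; i < n; i++) {
                sum += logEnergies[i] * Math.cos(Math.PI * k * (i + 0.5) / n);
            }
            coefficients[k] = sum;
        }
        return coefficients;
    }

    public static void main(String[] args) {
        // Made-up log energies for a single time slice with 8 filterbank channels.
        double[] slice = {2.0, 1.5, 1.2, 0.9, 0.7, 0.4, 0.3, 0.1};
        double[] mfcc = dctKeepFirst(slice, 4);
        System.out.println(java.util.Arrays.toString(mfcc));
    }
}
```

In this sketch the low-order coefficients capture the overall shape of the slice, while the dropped high-order ones carry the fine detail, which is why a small count is often sufficient.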

public static Mfcc.Options filterbankChannelCount (Long filterbankChannelCount)

Parameters
filterbankChannelCount Resolution of the Mel bank used internally.

public static Mfcc.Options lowerFrequencyLimit (Float lowerFrequencyLimit)

Parameters
lowerFrequencyLimit The lowest frequency to use when calculating the cepstrum.

public Output < TFloat32 > output ()

public static Mfcc.Options upperFrequencyLimit (Float upperFrequencyLimit)

Parameters
upperFrequencyLimit The highest frequency to use when calculating the cepstrum.
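The frequency limits and the filterbank channel count together determine where the Mel bank's channels sit. The following plain-Java sketch shows one common way to place channel centers evenly on the mel scale between two limits. It assumes the widely used mapping mel = 1127 * ln(1 + f / 700), which this page does not specify; the class and method names are hypothetical, and the op's internal layout may differ.

```java
final class MelChannelSketch {

    // Common Hz-to-mel mapping (an assumption, not taken from this page).
    static double hzToMel(double hz) {
        return 1127.0 * Math.log1p(hz / 700.0);
    }

    // Inverse of hzToMel.
    static double melToHz(double mel) {
        return 700.0 * Math.expm1(mel / 1127.0);
    }

    // Center frequencies in Hz of channelCount filters spaced evenly on the
    // mel scale, strictly between lowerHz and upperHz.
    static double[] centerFrequencies(double lowerHz, double upperHz, int channelCount) {
        if (channelCount < 1 || lowerHz < 0.0 || upperHz <= lowerHz) {
            throw new IllegalArgumentException("need channelCount >= 1 and 0 <= lowerHz < upperHz");
        }
        double melLow = hzToMel(lowerHz);
        double melHigh = hzToMel(upperHz);
        double step = (melHigh - melLow) / (channelCount + 1);
        double[] centers = new double[channelCount];
        for (int i = 0; i < channelCount; i++) {
            centers[i] = melToHz(melLow + step * (i + 1));
        }
        return centers;
    }

    public static void main(String[] args) {
        // Illustrative values only; they are not claimed to be the op's defaults.
        double[] centers = centerFrequencies(20.0, 4000.0, 40);
        for (double center : centers) {
            System.out.printf("%.1f Hz%n", center);
        }
    }
}
```

Because the mel scale is roughly logarithmic in frequency, the centers produced here are packed closely at low frequencies and spread out toward the upper limit. Raising the channel count increases the resolution of the bank, and narrowing the limits concentrates that resolution on a smaller band.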