API

Features

SpeechFeatures.Features — Type

abstract type Features end

Abstract supertype of all feature extractors.

Base.summary — Method

Base.summary(f::Features)

Display a table summarizing each field of the given feature (name, value, type).

Only available in HTML context.

Base.summary — Method

Base.summary(c::ComposedFunction{<:Any,<:Features})

Display a summary for each feature in the composed function.

AudioSources.load — Method

AudioSources.load(s::AbstractAudioSource, withprops::Bool; subrange::Union{AbstractRange, Colon}=:, ch=1)

Similar to load(s::AbstractAudioSource), with an additional option: set withprops to true to return a FeaturesProperties object along with the audio data.

Properties

SpeechFeatures.FeaturesProperties — Type

FeaturesProperties{T}

The type parameter T identifies the feature these properties are associated with.

Fields

fs is the sampling frequency of the input to the feature extractor
fs_init keeps track of the initial fs when several Features are chained
scale is a Vector giving the feature resolution, in its given unit

SpeechFeatures.DataProps — Type

DataProps{T}::DataType = Tuple{AbstractArray, FeaturesProperties{T}}

Type alias for the input and output type of every Features function.
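
Because every extractor consumes and returns the same (data, properties) tuple, extractors can be chained with Julia's ∘ operator, which is also what the Base.summary method for ComposedFunction dispatches on. The sketch below illustrates the pattern with toy stand-ins only: ToyFeature, Scale, Offset and the NamedTuple used in place of FeaturesProperties are hypothetical and are not part of SpeechFeatures.

```julia
# Toy stand-ins (hypothetical, not the package's types) showing why a shared
# (data, properties) tuple lets callable structs be chained with ∘.
abstract type ToyFeature end

struct Scale <: ToyFeature
    factor::Float64
end

struct Offset <: ToyFeature
    amount::Float64
end

# Each stage takes a (data, props) tuple and returns a tuple of the same shape.
(f::Scale)((x, props)::Tuple{AbstractArray, NamedTuple}) =
    (x .* f.factor, merge(props, (last = :Scale,)))
(f::Offset)((x, props)::Tuple{AbstractArray, NamedTuple}) =
    (x .+ f.amount, merge(props, (last = :Offset,)))

pipeline = Offset(1.0) ∘ Scale(2.0)      # Scale is applied first, then Offset
y, props = pipeline(([1.0, 2.0], (fs = 16_000,)))

# The inner (first-applied) stage is the second type parameter of ComposedFunction.
pipeline isa ComposedFunction{<:Any, <:ToyFeature}
```

Here y is [3.0, 5.0] and props.last is :Offset, since the properties are updated by each stage in turn.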

Base.summary — Method

Base.summary(props::FeaturesProperties)

Display a table summarizing the given feature properties.

Frames

Default field values:

Frames
field	value	type
frameduration	0.025	Float64
framestep	0.01	Float64
dithering	0.0	Float64
preemph	0.97	Float64
removedc	true	Bool
windowname	hann	String
padding	0	Int64
dropedge	true	Bool

SpeechFeatures.Frames — Type

Frames <: Features

Segments resulting from framing a signal. This representation is usually the input to the short-term Fourier transform.

SpeechFeatures.Frames — Method

Frames(; <keyword arguments>)

Initialize Frames with default values if not specified.

Arguments

frameduration = 0.025 duration of a frame in seconds
framestep = 0.01 time between the starts of two consecutive frames in seconds
dithering = 0.0 amount of Gaussian noise added to the signal
preemph = 0.97 pre-emphasis coefficient; boosts high frequencies to improve the signal-to-noise ratio
removedc = true
windowname = "hann" framing window, one of ["hann", "hamming", "povey", "rectangular"]
padding = 0 amount of padding
dropedge = true
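
To make the preemph argument concrete, here is a minimal sketch of the standard first-order pre-emphasis filter, y[n] = x[n] - preemph * x[n-1]. The function name preemphasis is hypothetical, and leaving the first sample unchanged is an assumption; the package may treat the boundary differently.

```julia
# Standard first-order pre-emphasis: y[n] = x[n] - preemph * x[n-1].
# Assumption: the first sample has no predecessor and is left unchanged.
function preemphasis(x::AbstractVector{<:Real}, preemph::Real=0.97)
    y = float.(x)                    # new array, x itself is not modified
    for n in 2:length(x)
        y[n] = x[n] - preemph * x[n - 1]
    end
    return y
end

x = [1.0, 1.0, 1.0, 1.0]             # constant (DC-like) signal
y = preemphasis(x)                   # samples after the first shrink to about 0.03
```

A constant signal is almost cancelled, which shows how the filter attenuates low frequencies relative to high ones.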

SpeechFeatures.Frames — Method

(f::Frames)((x, props)::DataProps{AudioSources.AbstractAudioSource})

Apply Frames to the given signal.

Result is a matrix of size (frame length, number of frames), and the new FeaturesProperties.
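
The output shape can be illustrated with a plain-Julia framing sketch. It assumes no padding and that trailing samples which do not fill a whole frame are dropped; dithering, pre-emphasis, DC removal and windowing are left out. The name frame_signal is hypothetical and this is not the package's implementation.

```julia
# Split a signal into overlapping frames, one frame per column.
# Assumptions: no padding, incomplete trailing frame dropped, no windowing.
function frame_signal(x::AbstractVector, fs::Real; frameduration=0.025, framestep=0.01)
    framelen = round(Int, frameduration * fs)    # samples per frame
    step = round(Int, framestep * fs)            # samples between frame starts
    nframes = length(x) < framelen ? 0 : 1 + (length(x) - framelen) ÷ step
    frames = Matrix{eltype(x)}(undef, framelen, nframes)
    for t in 1:nframes
        start = (t - 1) * step + 1
        frames[:, t] = x[start:start + framelen - 1]
    end
    return frames
end

fs = 16_000
x = randn(fs)                    # one second of noise at 16 kHz
frames = frame_signal(x, fs)
size(frames)                     # (400, 98)
```

With the default values, a 16 kHz signal gives 400-sample frames that start every 160 samples.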

STFT

SpeechFeatures.STFT — Type

STFT <: Features

Short-Term Fourier Transform.

SpeechFeatures.STFT — Method

(f::STFT)((X, props)::DataProps{Frames})

Apply Short-Term Fourier Transform to the given Frames matrix.

Result is a matrix of size (≈ frame length / 2, number of frames), and the new FeaturesProperties.
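
The "≈ frame length / 2" row count reflects the fact that the spectrum of a real signal is symmetric, so only about half of the bins need to be kept. The sketch below computes a one-sided DFT naively to show the bin count N ÷ 2 + 1 for a frame of N samples. The function onesided_dft is hypothetical; whether the package zero-pads frames or keeps exactly this number of bins is not stated here and is an assumption.

```julia
# One-sided DFT of a real frame, written out naively for clarity (O(N^2)).
# A practical implementation would use an FFT; zero-padding is not modelled.
function onesided_dft(frame::AbstractVector{<:Real})
    N = length(frame)
    nbins = N ÷ 2 + 1                # non-redundant bins for real input
    return [sum(frame[n + 1] * cis(-2π * k * n / N) for n in 0:N-1) for k in 0:nbins-1]
end

frame = [cos(2π * 3 * n / 16) for n in 0:15]   # 3 cycles over 16 samples
spectrum = onesided_dft(frame)
length(spectrum)                 # 9 bins for a 16-sample frame
argmax(abs.(spectrum)) - 1       # 3: the bin that carries the energy
```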

FBANK

Default field values:

FBANK
field	value	type
numfilters	26	Int64
lofreq	80	Int64
hifreq	-400	Int64

SpeechFeatures.FBANK — Type

FBANK <: Features

Mel spectrum.

SpeechFeatures.FBANK — Method

FBANK(; <keyword arguments>)

Initialize FBANK with default values if not specified.

Arguments

numfilters = 26 number of filters ("triangles")
lofreq = 80 lowest frequency to keep
hifreq = -400 offset from the Nyquist frequency; the highest frequency to keep is Nyquist frequency + hifreq
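
As a worked example of these arguments, the sketch below resolves hifreq against the Nyquist frequency and places the filter edges evenly on the mel scale. The mel formula is the common HTK-style one and is an assumption, as are the helper names hz2mel, mel2hz and filterbank_edges; the package may use a different mapping.

```julia
# Common HTK-style mel mapping (an assumption, not taken from the package).
hz2mel(f) = 2595 * log10(1 + f / 700)
mel2hz(m) = 700 * (10^(m / 2595) - 1)

# Edge frequencies in Hz for numfilters triangular filters.
function filterbank_edges(fs; numfilters=26, lofreq=80, hifreq=-400)
    nyquist = fs / 2
    hi = nyquist + hifreq            # highest frequency to keep, as documented
    # numfilters triangles need numfilters + 2 edge points, evenly spaced in mel
    mels = range(hz2mel(lofreq), hz2mel(hi); length=numfilters + 2)
    return mel2hz.(mels)
end

edges = filterbank_edges(16_000)
(first(edges), last(edges))      # ≈ (80.0, 7600.0)
length(edges)                    # 28
```

For a 16 kHz signal the Nyquist frequency is 8000 Hz, so the default hifreq = -400 keeps frequencies up to 7600 Hz.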

SpeechFeatures.FBANK — Method

(f::FBANK)((X, props)::DataProps{STFT})

Apply Mel filterbank to the given matrix.

Result is a matrix of size (numfilters, number of frames), and the new FeaturesProperties.

MFCC

Default field values:

MFCC
field	value	type
nceps	13	Int64
liftering	22	Int64

SpeechFeatures.MFCC — Type

MFCC <: Features

Mel Frequency Cepstral Coefficients.

SpeechFeatures.MFCC — Method

MFCC(; <keyword arguments>)

Initialize MFCC with default values if not specified.

Arguments

nceps = 13 number of cepstral coefficients
liftering = 22 lifter value
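
To give an idea of what the liftering value does, here is a sketch of the widely used sinusoidal lifter, w[n] = 1 + (L/2) sin(πn/L) with L = liftering, which rescales the cepstral coefficients. Whether the package applies exactly this formula is an assumption, and lifter_weights is a hypothetical helper.

```julia
# Sinusoidal lifter weights for cepstral indices n = 0, ..., nceps - 1.
# Assumption: the HTK-style formula w[n] = 1 + (L / 2) * sin(π * n / L).
lifter_weights(nceps, liftering) =
    [1 + (liftering / 2) * sin(π * n / liftering) for n in 0:nceps-1]

w = lifter_weights(13, 22)
w[1]                             # 1.0: the zeroth coefficient is left unchanged
w[12]                            # ≈ 12.0: index n = 11 = L/2 gets the largest weight
```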

SpeechFeatures.MFCC — Method

(f::MFCC)((X, props)::Union{DataProps{FBANK}, DataProps{STFT}})

Apply MFCC to the given matrix.

Result is a matrix of size (nceps, number of frames), and the new FeaturesProperties.

Autocorr

SpeechFeatures.Autocorr — Type

Autocorr <: Features

Autocorrelation.

SpeechFeatures.Autocorr — Method

(f::Autocorr)((X, props)::DataProps{Frames})

Apply autocorrelation to the given frames.

Result is a matrix of size (frame length, number of frames), and the new FeaturesProperties.
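
The output keeps the frame length because one value is produced per lag, from lag 0 up to frame length - 1. A minimal sketch of an unnormalized autocorrelation applied column by column follows; the normalization used by the package is not stated here, so the raw sums are an assumption, and autocorr and autocorr_frames are hypothetical names.

```julia
# Unnormalized autocorrelation of one frame for lags 0, ..., N - 1.
# Assumption: raw sums of products, with no normalization or FFT shortcut.
function autocorr(frame::AbstractVector)
    N = length(frame)
    return [sum(frame[n] * frame[n + lag] for n in 1:N-lag) for lag in 0:N-1]
end

# Apply it to every column of a (frame length, number of frames) matrix.
autocorr_frames(X::AbstractMatrix) = hcat((autocorr(col) for col in eachcol(X))...)

X = [1.0 0.0; 2.0 1.0; 3.0 0.0]  # two frames of three samples each
R = autocorr_frames(X)
size(R)                          # (3, 2): same shape as the input
R[:, 1]                          # [14.0, 8.0, 3.0]
```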

AddDeltas

Default field values:

AddDeltas
field	value	type
order	2	Int64
winlen	2	Int64

SpeechFeatures.AddDeltas — Type

AddDeltas <: Features

Add the derivatives to the features.

SpeechFeatures.AddDeltas — Method

AddDeltas(; <keyword arguments>)

Initialize AddDeltas with default values if not specified.

Arguments

order = 2 derivative order
winlen = 2 length of delta window

SpeechFeatures.AddDeltas — Method

(f::AddDeltas)((X, props)::DataProps{<:Features})

Apply AddDeltas to the given Features matrix.

Result is a matrix of size (number of input matrix rows * (order+1), number of frames), and the new FeaturesProperties.
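
The row count follows from stacking the input with its derivatives up to the given order. The sketch below uses the common regression formula for deltas, with edge frames repeated at the boundaries. Both choices are assumptions about the method, and deltas and add_deltas are hypothetical names rather than the package's functions.

```julia
# Regression-based deltas along the time axis (columns).
# Assumptions: HTK-style formula, first and last frames repeated at the edges.
function deltas(X::AbstractMatrix, winlen::Int=2)
    nframes = size(X, 2)
    denom = 2 * sum(k^2 for k in 1:winlen)
    D = zeros(float(eltype(X)), size(X))
    for t in 1:nframes, k in 1:winlen
        ahead = X[:, min(t + k, nframes)]
        behind = X[:, max(t - k, 1)]
        D[:, t] .+= k .* (ahead .- behind) ./ denom
    end
    return D
end

# Stack the features with their derivatives of order 1, ..., order.
function add_deltas(X::AbstractMatrix; order::Int=2, winlen::Int=2)
    blocks = [float.(X)]
    for _ in 1:order
        push!(blocks, deltas(blocks[end], winlen))
    end
    return vcat(blocks...)
end

X = randn(13, 50)                # e.g. 13 coefficients over 50 frames
size(add_deltas(X))              # (39, 50) = (13 * (2 + 1), 50)
```

With the defaults, 13 input rows become 39: the static features, the first derivatives and the second derivatives.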

Index

SpeechFeatures.AddDeltas
SpeechFeatures.Autocorr
SpeechFeatures.DataProps
SpeechFeatures.FBANK
SpeechFeatures.Features
SpeechFeatures.FeaturesProperties
SpeechFeatures.Frames
SpeechFeatures.MFCC
SpeechFeatures.STFT
AudioSources.load
Base.summary