API
Features
SpeechFeatures.Features — Type
abstract type Features end
Abstract type for feature extractors.
Base.summary — Method
Base.summary(f::Features)
Display a table summarizing each field of the given feature (name, value, type).
Only available in HTML context.
Base.summary — Method
Base.summary(c::ComposedFunction{<:Any,<:Features})
Provide a summary for each feature in the composition.
AudioSources.load — Method
AudioSources.load(s::AbstractAudioSource, withprops::Bool; subrange::Union{AbstractRange, Colon}=:, ch=1)
Similar to load(s::AbstractAudioSource), with an additional option: set withprops to true to return a FeaturesProperties object along with the audio data.
Properties
SpeechFeatures.FeaturesProperties — Type
FeaturesProperties{T}
The type parameter T identifies the feature these properties are associated with.
Fields
- fs: the sampling frequency of the input to the feature extractor
- fs_init: keeps track of the initial fs when several Features are chained
- scale: a Vector giving the feature resolution, in its given unit
SpeechFeatures.DataProps — Type
DataProps{T}::DataType = Tuple{AbstractArray, FeaturesProperties{T}}
Type alias for the input and output of all Features functions.
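The (data, properties) tuple convention can be illustrated with a minimal sketch. Everything below is a hypothetical stand-in written in plain Julia: `ToyProps`, `Halve`, and `Scale` are not part of SpeechFeatures, and the real FeaturesProperties has more structure. The sketch only shows how callable objects that each take and return such a tuple can be chained with `∘`.

```julia
# Hypothetical stand-ins, NOT the SpeechFeatures types.
struct ToyProps
    fs::Float64        # sampling frequency of the current representation
    fs_init::Float64   # sampling frequency of the original signal
end

struct Halve end       # toy stage: keeps every second sample
struct Scale           # toy stage: multiplies the data by a constant
    gain::Float64
end

# Each stage maps a (data, props) tuple to a new (data, props) tuple.
(::Halve)((x, p)::Tuple{AbstractArray,ToyProps}) = (x[1:2:end], ToyProps(p.fs / 2, p.fs_init))
(s::Scale)((x, p)::Tuple{AbstractArray,ToyProps}) = (s.gain .* x, p)

# Composition reads right to left: Halve runs first, then Scale.
pipeline = Scale(2.0) ∘ Halve()
y, props = pipeline((collect(1.0:8.0), ToyProps(16000.0, 16000.0)))
```

Because every stage has the same input and output shape, stages can be reordered or extended without changing how the pipeline is called.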
Base.summary — Method
Base.summary(props::FeaturesProperties)
Display a table summarizing the given features properties.
Frames
Default field values for Frames:

| field | value | type |
|---|---|---|
| frameduration | 0.025 | Float64 |
| framestep | 0.01 | Float64 |
| dithering | 0.0 | Float64 |
| preemph | 0.97 | Float64 |
| removedc | true | Bool |
| windowname | hann | String |
| padding | 0 | Int64 |
| dropedge | true | Bool |
SpeechFeatures.Frames — Type
Frames <: Features
Segments obtained by framing the signal. This representation is typically the input to the short-term Fourier transform.
SpeechFeatures.Frames — Method
Frames(; <keyword arguments>)
Initialize Frames, using the default value for any keyword argument that is not specified.
Arguments
- frameduration = 0.025: frame duration in seconds
- framestep = 0.01: time between two frames in seconds
- dithering = 0.0: add Gaussian noise to the signal
- preemph = 0.97: pre-emphasis coefficient; improves the signal-to-noise ratio by boosting high frequencies
- removedc = true
- windowname = "hann": framing window, one of ["hann", "hamming", "povey", "rectangular"]
- padding = 0: amount of padding
- dropedge = true
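To make the preemph and removedc options more concrete, here is a minimal sketch of the two operations as they are commonly defined. The helper names `remove_dc` and `preemphasis` are hypothetical, and the package may handle details differently, for example the first sample or whether the operations are applied per frame.

```julia
# Illustrative helpers (hypothetical names, not package functions).

# Remove the DC offset by subtracting the mean.
remove_dc(x) = x .- sum(x) / length(x)

# First-order pre-emphasis: y[n] = x[n] - k * x[n-1].
# Assumption: the first sample is left unchanged.
function preemphasis(x::AbstractVector, k::Real)
    y = float.(x)                 # new array, x is not modified
    for n in 2:length(x)
        y[n] = x[n] - k * x[n-1]
    end
    return y
end

flat = preemphasis([1.0, 1.0, 1.0], 0.97)   # a constant signal is strongly attenuated
centered = remove_dc([1.0, 2.0, 3.0])
```

A constant (zero-frequency) input is reduced to 3% of its amplitude after the first sample, which is the sense in which pre-emphasis favors high frequencies.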
SpeechFeatures.Frames — Method
(f::Frames)((x, props)::DataProps{AudioSources.AbstractAudioSource})
Apply Frames to the given signal.
The result is a matrix of size (frame length, number of frames), together with the new FeaturesProperties.
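The output shape can be worked out with a small, self-contained sketch. The function `frame_signal` is a hypothetical illustration, not the package implementation: it ignores dithering, pre-emphasis, windowing, and padding, and it assumes that an incomplete trailing frame is simply dropped.

```julia
# Sketch of framing under the assumptions stated above.
function frame_signal(x::AbstractVector, fs::Real; frameduration=0.025, framestep=0.01)
    framelen = round(Int, frameduration * fs)   # samples per frame
    step = round(Int, framestep * fs)           # samples between frame starts
    nframes = length(x) < framelen ? 0 : 1 + (length(x) - framelen) ÷ step
    X = zeros(float(eltype(x)), framelen, nframes)
    for j in 1:nframes
        start = (j - 1) * step + 1
        X[:, j] = x[start:start+framelen-1]     # one frame per column
    end
    return X
end

x = collect(1.0:16000.0)     # one second of dummy samples at 16 kHz
X = frame_signal(x, 16000)
size(X)                      # (400, 98)
```

With the default values at 16 kHz, each frame holds 400 samples and consecutive frames start 160 samples apart, so one second of audio gives 1 + (16000 - 400) ÷ 160 = 98 frames in this sketch.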
STFT
SpeechFeatures.STFT — Type
STFT <: Features
Short-Term Fourier Transform.
SpeechFeatures.STFT — Method
(f::STFT)((X, props)::DataProps{Frames})
Apply Short-Term Fourier Transform to the given Frames matrix.
Result is a matrix of size (≈ frame length / 2, number of frames), and the new FeaturesProperties.
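The "≈ frame length / 2" row count follows from the symmetry of the Fourier transform of a real signal: only about half of the bins carry independent information. The sketch below uses a naive DFT in plain Julia to show this. The name `half_spectrum` is hypothetical, and keeping n ÷ 2 + 1 bins is an assumption; the package may use a different FFT size.

```julia
# Naive DFT of one real-valued frame, keeping only the non-redundant bins.
# Quadratic cost: for illustration only, a real implementation would use an FFT.
function half_spectrum(frame::AbstractVector{<:Real})
    n = length(frame)
    nbins = n ÷ 2 + 1     # assumption: bins from 0 up to the Nyquist frequency
    return [sum(frame[t] * cis(-2π * (k - 1) * (t - 1) / n) for t in 1:n) for k in 1:nbins]
end

# A cosine with exactly 2 cycles over 8 samples.
frame = cos.(2π .* 2 .* (0:7) ./ 8)
S = half_spectrum(frame)
length(S)                 # 5 bins for an 8-sample frame
```

The energy of the test cosine is concentrated in the third bin, which corresponds to 2 cycles per frame.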
FBANK
Default field values for FBANK:

| field | value | type |
|---|---|---|
| numfilters | 26 | Int64 |
| lofreq | 80 | Int64 |
| hifreq | -400 | Int64 |
SpeechFeatures.FBANK — Type
FBANK <: Features
Mel spectrum.
SpeechFeatures.FBANK — Method
FBANK(; <keyword arguments>)
Initialize FBANK, using the default value for any keyword argument that is not specified.
Arguments
- numfilters = 26: number of filters ("triangles")
- lofreq = 80: lowest frequency to keep
- hifreq = -400: offset added to the Nyquist frequency; Nyquist frequency + hifreq is the highest frequency to keep
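As a worked example of how lofreq and hifreq bound the filterbank, the sketch below computes the edge frequencies of the triangles for a 16 kHz signal. The function `filter_edges` is hypothetical, and the mel formula used (1127 ln(1 + f / 700)) is one common convention assumed here; the package may use another.

```julia
# Assumed mel scale convention.
hz2mel(f) = 1127 * log(1 + f / 700)
mel2hz(m) = 700 * (exp(m / 1127) - 1)

# Edge frequencies (in Hz) of numfilters triangles, equally spaced on the mel scale.
function filter_edges(fs; numfilters=26, lofreq=80, hifreq=-400)
    nyquist = fs / 2
    hi = nyquist + hifreq            # with fs = 16000: 8000 - 400 = 7600 Hz
    mels = range(hz2mel(lofreq), hz2mel(hi); length=numfilters + 2)
    return mel2hz.(mels)
end

edges = filter_edges(16000)          # 28 edges: from 80 Hz up to 7600 Hz
```

Each triangle spans three consecutive edges, which is why numfilters + 2 edge frequencies are needed.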
SpeechFeatures.FBANK — Method
(f::FBANK)((X, props)::DataProps{STFT})
Apply Mel filterbank to the given matrix.
Result is a matrix of size (numfilters, number of frames), and the new FeaturesProperties.
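The (numfilters, number of frames) shape comes from multiplying a filter weight matrix with the spectrum. The following sketch builds triangular weights from a list of edge frequencies and applies them to a dummy spectrum. `triangular_filterbank` is a hypothetical helper, and the tiny linear edge list is chosen only to keep the numbers easy to follow; it is not how the package spaces its filters.

```julia
# Triangular weights: filter m rises from edges[m] to edges[m+1], then falls to edges[m+2].
function triangular_filterbank(edges::AbstractVector, binfreqs::AbstractVector)
    numfilters = length(edges) - 2
    W = zeros(numfilters, length(binfreqs))
    for m in 1:numfilters
        lo, mid, hi = edges[m], edges[m+1], edges[m+2]
        for (k, f) in enumerate(binfreqs)
            if lo < f <= mid
                W[m, k] = (f - lo) / (mid - lo)    # rising slope
            elseif mid < f < hi
                W[m, k] = (hi - f) / (hi - mid)    # falling slope
            end
        end
    end
    return W
end

binfreqs = 0:100:1000                     # centre frequency of each spectrum bin (11 bins)
edges = [0.0, 200.0, 400.0, 600.0]        # 4 edges define 2 filters
W = triangular_filterbank(edges, binfreqs)
S = ones(length(binfreqs), 5)             # dummy spectrum: 11 bins, 5 frames
F = W * S                                 # size (2, 5): one row per filter
```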
MFCC
Default field values for MFCC:

| field | value | type |
|---|---|---|
| nceps | 13 | Int64 |
| liftering | 22 | Int64 |
SpeechFeatures.MFCC — Type
MFCC <: Features
Mel Frequency Cepstral Coefficients.
SpeechFeatures.MFCC — Method
MFCC(; <keyword arguments>)
Initialize MFCC, using the default value for any keyword argument that is not specified.
Arguments
- nceps = 13: number of cepstral coefficients
- liftering = 22: lifter value
SpeechFeatures.MFCC — Method
(f::MFCC)((X, props)::Union{DataProps{FBANK}, DataProps{STFT}})
Apply MFCC to the given matrix.
Result is a matrix of size (nceps, number of frames), and the new FeaturesProperties.
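For orientation, the textbook MFCC computation takes the logarithm of the filterbank energies, applies a discrete cosine transform, and optionally reweights the coefficients with a lifter. The sketch below follows that recipe in plain Julia. All names are hypothetical, and the unnormalized DCT-II, the inclusion of the zeroth coefficient, and the sinusoidal lifter formula are assumptions rather than a description of the package internals.

```julia
# Unnormalized DCT-II basis: nceps rows, one column per filter.
dct_matrix(nceps::Int, numfilters::Int) =
    [cos(π * (i - 1) * (j - 0.5) / numfilters) for i in 1:nceps, j in 1:numfilters]

# Sinusoidal lifter weights (assumed formula): 1 + (L / 2) * sin(π n / L), n = 0, 1, ...
lifter(nceps::Int, L::Real) = [1 + (L / 2) * sin(π * (n - 1) / L) for n in 1:nceps]

function toy_mfcc(fbank::AbstractMatrix; nceps=13, liftering=22)
    C = dct_matrix(nceps, size(fbank, 1)) * log.(fbank)   # (nceps, number of frames)
    return liftering > 0 ? lifter(nceps, liftering) .* C : C
end

fbank = fill(exp(1.0), 26, 7)    # dummy filterbank output: 26 filters, 7 frames
M = toy_mfcc(fbank)              # size (13, 7)
```

A flat filterbank output has no spectral shape, so in this sketch only the first coefficient is non-zero.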
Autocorr
SpeechFeatures.Autocorr — Type
Autocorr <: Features
Autocorrelation.
SpeechFeatures.Autocorr — Method
(f::Autocorr)((X, props)::DataProps{Frames})
Apply autocorrelation to the given frames.
The result is a matrix of size (frame length, number of frames), together with the new FeaturesProperties.
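A minimal sketch of per-frame autocorrelation shows why the output keeps the (frame length, number of frames) shape: each frame yields one value per lag, from 0 to frame length - 1. The function `frame_autocorr` is hypothetical and computes the plain, unnormalized sum; the package may normalize or compute it differently.

```julia
# Unnormalized autocorrelation of each column (frame) of X.
function frame_autocorr(X::AbstractMatrix)
    n, nframes = size(X)
    R = zeros(float(eltype(X)), n, nframes)
    for j in 1:nframes, lag in 0:n-1
        # Sum of products of samples that are `lag` positions apart.
        R[lag+1, j] = sum(X[t, j] * X[t+lag, j] for t in 1:n-lag)
    end
    return R
end

X = reshape([1.0, 2.0, 3.0], 3, 1)   # a single 3-sample frame
R = frame_autocorr(X)                # lags 0, 1, 2 give 14, 8, 3
```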
AddDeltas
Default field values for AddDeltas:

| field | value | type |
|---|---|---|
| order | 2 | Int64 |
| winlen | 2 | Int64 |
SpeechFeatures.AddDeltas — Type
AddDeltas <: Features
Add the derivatives to the features.
SpeechFeatures.AddDeltas — Method
AddDeltas(; <keyword arguments>)
Initialize AddDeltas, using the default value for any keyword argument that is not specified.
Arguments
- order = 2: derivative order
- winlen = 2: length of the delta window
SpeechFeatures.AddDeltas — Method
(f::AddDeltas)((X, props)::DataProps{<:Features})
Apply AddDeltas to the given Features matrix.
The result is a matrix of size (number of input rows * (order + 1), number of frames), together with the new FeaturesProperties.
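The row count (number of input rows * (order + 1)) can be seen in a short sketch that stacks the input with its successive derivatives. The names `delta` and `add_deltas` are hypothetical, and the regression formula with replicated edge frames is the common textbook definition, assumed here rather than taken from the package.

```julia
# Regression-based delta over a window of `winlen` frames on each side.
function delta(X::AbstractMatrix, winlen::Int)
    nrows, nframes = size(X)
    D = zeros(float(eltype(X)), nrows, nframes)
    denom = 2 * sum(k^2 for k in 1:winlen)
    for t in 1:nframes, k in 1:winlen
        next = X[:, min(t + k, nframes)]   # edge frames are replicated
        prev = X[:, max(t - k, 1)]
        D[:, t] .+= k .* (next .- prev) ./ denom
    end
    return D
end

# Stack the features with their first `order` derivatives along the rows.
function add_deltas(X::AbstractMatrix; order=2, winlen=2)
    blocks = [float.(X)]
    for _ in 1:order
        push!(blocks, delta(blocks[end], winlen))
    end
    return vcat(blocks...)
end

X = [1.0 2.0 3.0 4.0 5.0 6.0]   # 1 feature, 6 frames, linear ramp
Y = add_deltas(X)               # size (3, 6): features, deltas, delta-deltas
```

For a 13-row MFCC matrix and the default order of 2, the same stacking gives 39 rows.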
Index
SpeechFeatures.AddDeltas
SpeechFeatures.Autocorr
SpeechFeatures.DataProps
SpeechFeatures.FBANK
SpeechFeatures.Features
SpeechFeatures.FeaturesProperties
SpeechFeatures.Frames
SpeechFeatures.MFCC
SpeechFeatures.STFT
AudioSources.load
Base.summary