Supported Datasets
AVID
The Aalto Vocal Intensity Database (AVID) contains speech and electroglottography (EGG) recordings from 50 speakers (25 male, 25 female) who varied their vocal intensity across four categories (soft, normal, loud, and very loud).
Authors
Manila Kodali, Paavo Alku, Sudarsana Reddy Kadiri
INA Diachrony
Voice recordings and transcriptions sorted by time period, sex, and speaker.
Keyword arguments
(ina_csv_dir = nothing,)
Mini LibriSpeech
Subset of the LibriSpeech corpus intended for regression testing.
Authors
Vassil Panayotov, Daniel Povey
Subsets
train, dev
Keyword arguments
(subset = "",)
Multilingual LibriSpeech
The Multilingual LibriSpeech (MLS) dataset is a large multilingual corpus suitable for speech research. It is derived from read audiobooks from LibriVox and covers 8 languages: English, German, Dutch, Spanish, French, Italian, Portuguese, and Polish.
Authors
Vineel Pratap, Qiantong Xu, Anuroop Sriram, Gabriel Synnaeve, Ronan Collobert
Subsets
train, dev, test
Keyword arguments
(lang = "eng", subset = "")
TIMIT
The TIMIT corpus of read speech was designed to provide speech data for the acquisition of acoustic-phonetic knowledge and for the development and evaluation of automatic speech recognition systems.
Authors
John S. Garofolo, Lori F. Lamel, William M. Fisher, Jonathan G. Fiscus, David S. Pallett, Nancy L. Dahlgren, Victor Zue
Subsets
train, dev, test
Keyword arguments
(formantsdir = nothing, audio_fmt = "SPHERE", subset = "")
Speech2Tex
Recordings of read equations, with literal transcriptions and LaTeX transcriptions.
Authors
Lorenzo Brucato