Supported Datasets

AVID

The Aalto Vocal Intensity Database (AVID) includes speech and electroglottography (EGG) recordings produced by 50 speakers (25 male, 25 female) who varied their vocal intensity across four categories (soft, normal, loud, and very loud).

Source

Authors

Manila Kodali, Paavo Alku, Sudarsana Reddy Kadiri

INA Diachrony

Voice recordings and transcriptions sorted by time period, sex and speaker.

Keyword arguments

(ina_csv_dir = nothing,)

Mini LibriSpeech

A subset of the LibriSpeech corpus intended for regression testing.

Source

Authors

Vassil Panayotov, Daniel Povey

Subsets

train, dev

Keyword arguments

(subset = "",)

Multilingual LibriSpeech

The Multilingual LibriSpeech (MLS) dataset is a large multilingual corpus suitable for speech research. It is derived from read audiobooks from LibriVox and covers 8 languages: English, German, Dutch, Spanish, French, Italian, Portuguese, and Polish.

Source

Authors

Vineel Pratap, Qiantong Xu, Anuroop Sriram, Gabriel Synnaeve, Ronan Collobert

Subsets

train, dev, test

Keyword arguments

(lang = "eng", subset = "")
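
The tuple above reads as a Julia NamedTuple of default values. As a minimal sketch, assuming a caller wants to override only some of those defaults, the snippet below merges a set of overrides into them. The names `defaults`, `overrides` and `kwargs` are illustrative only, and how the resulting values are passed to the dataset loader is not specified in this section.

```julia
# Defaults exactly as listed above for Multilingual LibriSpeech.
defaults = (lang = "eng", subset = "")

# Hypothetical caller-side overrides; "dev" is one of the listed subsets.
overrides = (subset = "dev",)

# `merge` on NamedTuples keeps the fields of the left operand and
# lets fields of the right operand take precedence on name clashes.
kwargs = merge(defaults, overrides)

println(kwargs)
```

Fields that are not overridden (here `lang`) keep their default value.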

TIMIT

The TIMIT corpus of read speech has been designed to provide speech data for the acquisition of acoustic-phonetic knowledge and for the development and evaluation of automatic speech recognition systems.

Source

Authors

John S. Garofolo, Lori F. Lamel, William M. Fisher, Jonathan G. Fiscus, David S. Pallett, Nancy L. Dahlgren, Victor Zue

Subsets

train, dev, test

Keyword arguments

(formantsdir = nothing, audio_fmt = "SPHERE", subset = "")

Speech2Tex

Recordings of read equations, with literal transcriptions and LaTeX transcriptions.

Authors

Lorenzo Brucato