Supported Datasets

AVID

The Aalto Vocal Intensity Database (AVID) contains speech and electroglottography (EGG) recordings from 50 speakers (25 male, 25 female) who varied their vocal intensity across four categories: soft, normal, loud, and very loud.

Source

Authors

Manila Kodali, Paavo Alku, Sudarsana Reddy Kadiri


INA Diachrony

Voice recordings and transcriptions sorted by time period, sex and speaker.

Keyword arguments

(ina_csv_dir = nothing,)

Mini LibriSpeech

A subset of the LibriSpeech corpus intended for regression testing.

Source

Authors

Vassil Panayotov, Daniel Povey

Subsets

train, dev

Keyword arguments

(subset = "",)

Multilingual LibriSpeech

The Multilingual LibriSpeech (MLS) dataset is a large multilingual corpus suitable for speech research. It is derived from read audiobooks from LibriVox and covers 8 languages: English, German, Dutch, Spanish, French, Italian, Portuguese, and Polish.

Source

Authors

Vineel Pratap, Qiantong Xu, Anuroop Sriram, Gabriel Synnaeve, Ronan Collobert

Subsets

train, dev, test

Keyword arguments

(lang = "eng", subset = "")

TIMIT

The TIMIT corpus of read speech was designed to provide speech data for acquiring acoustic-phonetic knowledge and for developing and evaluating automatic speech recognition systems.

Source

Authors

John S. Garofolo, Lori F. Lamel, William M. Fisher, Jonathan G. Fiscus, David S. Pallett, Nancy L. Dahlgren, Victor Zue

Subsets

train, dev, test

Keyword arguments

(formantsdir = nothing, audio_fmt = "SPHERE", subset = "")
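
The keyword-argument entries on this page read as Julia named tuples of default values. The following is a minimal sketch in plain Julia of how such defaults could be overridden and checked against the listed subsets. It assumes nothing about the package's loading API: `check_subset` is a hypothetical helper, and treating the empty string as "no subset selected" is an assumption, not something this page states.

```julia
# Defaults copied from the TIMIT entry above.
timit_defaults = (formantsdir = nothing, audio_fmt = "SPHERE", subset = "")

# Override only the subset; the other fields keep their defaults.
timit_opts = merge(timit_defaults, (subset = "train",))

# Hypothetical helper (not part of any library): accept the empty string
# (assumed to mean "no subset selected") or one of the listed subset names.
function check_subset(subset::AbstractString, allowed)
    if subset != "" && !(subset in allowed)
        names = join(allowed, ", ")
        throw(ArgumentError("unknown subset '$subset'; expected one of: $names"))
    end
    return subset
end

# Subset names copied from the TIMIT entry above.
check_subset(timit_opts.subset, ("train", "dev", "test"))
```

The same pattern applies to the other entries, for example merging `(lang = "eng", subset = "")` with a different `subset` value for Multilingual LibriSpeech.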

Speech2Tex

Recordings of read equations, with literal transcriptions and LaTeX transcriptions.

Authors

Lorenzo Brucato