expresso
Created by TA Nguyen, W-N Hsu, A D’Avirro, B Shi, I Gat, M Fazel-Zarandi, T Remez, J Copet, G Synnaeve, M Hassid, F Kreuk, Y Adi, E Dupoux
| Field | Value |
|---|---|
| version | 1.0.0 |
| license | |
| usage | research |
| languages | eng |
| format | wav |
| channel | 1, 2 |
| sampling rate | 48000 |
| bit depth | 24 |
| duration | 1 days 18:23:42.549854165 |
| files | 11954, duration distribution: 0.6 s |
| segments | 27545, duration distribution: nan s |
| repository | audb-public |
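As a quick sanity check on the figures above, the total duration can be divided by the file count to get a mean file length. This is a minimal sketch using only the standard library. The values are copied from the table, and the digits beyond microseconds are dropped because `datetime.timedelta` has microsecond resolution.

```python
from datetime import timedelta

# Values copied from the datacard: "1 days 18:23:42.549854165" and 11954 files.
total = timedelta(days=1, hours=18, minutes=23, seconds=42, microseconds=549854)
n_files = 11954

total_seconds = total.total_seconds()
mean_seconds = total_seconds / n_files

print(f"total: {total_seconds:.1f} s, mean per file: {mean_seconds:.2f} s")
```

The mean comes out at a little under 13 s per file. It says nothing about how the individual file durations are spread.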
Description
Expresso is a dataset of expressive speech recordings. It contains read speech and singing in various styles including default, confused, enunciated, happy, laughing, narration, sad, singing, and whisper. The dataset is part of the TextlessLib project.
Example
audio_48khz/read/ex02/happy/base/ex02_happy_00339.wav
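The example path appears to encode several of the scheme labels listed on this page: `read` (`speaking_style`), `ex02` (`speaker`), `happy` (`style`), and `base` (`corpus`). The sketch below splits a path into those parts, assuming every file follows the layout `audio_48khz/<speaking_style>/<speaker>/<style>/<corpus>/<name>.wav`. That layout is inferred from this single example and is not confirmed by the datacard. `parse_expresso_path` is a hypothetical helper, not part of any library.

```python
from pathlib import PurePosixPath


def parse_expresso_path(path: str) -> dict:
    """Split a relative media path into its assumed components.

    Assumed layout (inferred from one example, not guaranteed):
    audio_48khz/<speaking_style>/<speaker>/<style>/<corpus>/<name>.wav
    """
    parts = PurePosixPath(path).parts
    if len(parts) != 6:
        raise ValueError(f"unexpected path layout: {path!r}")
    _root, speaking_style, speaker, style, corpus, filename = parts
    return {
        "speaking_style": speaking_style,
        "speaker": speaker,
        "style": style,
        "corpus": corpus,
        "stem": PurePosixPath(filename).stem,
    }


info = parse_expresso_path("audio_48khz/read/ex02/happy/base/ex02_happy_00339.wav")
print(info)
```

Paths without exactly six components raise a `ValueError`. Other recordings, for instance conversational ones, may use a different layout, so the table columns are likely the more reliable source for these labels.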
Tables
| ID | Type | Columns |
|---|---|---|
| dev.channel0 | segmented | speaker, style, corpus |
| dev.channel1 | segmented | speaker, style, corpus |
| files | filewise | id, speaking_style |
| test.channel0 | segmented | speaker, style, corpus |
| test.channel1 | segmented | speaker, style, corpus |
| train.channel0 | segmented | speaker, style, corpus |
| train.channel1 | segmented | speaker, style, corpus |
| vad.channel0 | segmented | channel |
| vad.channel1 | segmented | channel |
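The split tables follow a `<split>.channel<N>` naming pattern. The sketch below builds a table ID and loads only that table's metadata with `audb`. It assumes `audb` is installed and the `audb-public` repository is reachable. `table_id` and `load_split` are hypothetical helper names, and the `audb` import is deferred so the ID helper works without the package.

```python
SPLITS = ("train", "dev", "test")
CHANNELS = (0, 1)


def table_id(split: str, channel: int) -> str:
    """Build a table ID such as 'train.channel0' from the listing above."""
    if split not in SPLITS:
        raise ValueError(f"unknown split: {split!r}")
    if channel not in CHANNELS:
        raise ValueError(f"unknown channel: {channel!r}")
    return f"{split}.channel{channel}"


def load_split(split: str, channel: int):
    """Return one split table as a pandas DataFrame (metadata only)."""
    import audb  # deferred import: needs `pip install audb` and network access

    table = table_id(split, channel)
    db = audb.load(
        "expresso",
        version="1.0.0",
        tables=[table],
        only_metadata=True,  # skip downloading the audio files
        full_path=False,     # keep relative paths as stored in the table
        verbose=False,
    )
    return db[table].get()


print(table_id("train", 0))
```

Calling `load_split("train", 0)` should return a frame with the `speaker`, `style`, and `corpus` columns listed for that table. Dropping `only_metadata=True` would also fetch the referenced audio, which is a much larger download.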
Schemes
| ID | Dtype | Labels |
|---|---|---|
| channel | int | 0, 1 |
| corpus | str | base, longform |
| id | str | 00001, 00002, 00003, 00004, 00005, 00006, 00007, […], 014, 015, 016, 017, 018, 019, 020, 021 |
| speaker | str | ex01, ex02, ex03, ex04 |
| speaking_style | str | conversational, read |
| style | str | angry, animal, animal_directed, awe, bored, calm, child, […], narration, non_verbal, projected, sad, sarcastic, sleepy, sympathetic, whisper |
Duration distribution range (plots omitted): files 0.6 s to 1346.7 s, segments nan s to nan s.