vadtoolkit

Created by Kim Jaeseok

version	1.1.0
license	GPLv3
source	https://github.com/jtkim-kaist/VAD
usage	commercial
languages	kor
format	wav
channel	1
sampling rate	16000
bit depth	16, 32
duration	0 days 02:00:09.703062500
files	4, duration range: 1801.5 s to 1804.0 s
segments	588, duration range: 0.0 s to 51.1 s
repository	data-public
published	2024-01-02 by audeering

Description

VAD Toolkit: A Database for Voice Activity Detection. In each environment, conversational speech by two Korean male speakers was recorded. The ground-truth labels were annotated manually. Because the recordings were made in the real world, the dataset contains unexpected noises such as a baby crying, insects chirping, and mouse clicks.
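Since the header lists the dataset as published by audeering in the data-public repository, one plausible way to fetch it is the audb Python package. The following is a minimal sketch under that assumption. It presumes that audb is installed and configured to reach the repository, and the helper name `load_vadtoolkit_segments` is hypothetical.

```python
def load_vadtoolkit_segments(version: str = "1.1.0"):
    """Return the ``segments`` table of vadtoolkit as a pandas DataFrame.

    Assumes the dataset is reachable through audb under the name
    "vadtoolkit"; this helper is illustrative, not part of any library.
    """
    # Imported lazily so that defining the helper does not require audb.
    import audb

    # only_metadata=True skips the WAV files and fetches only the tables.
    # Remove the argument to download the audio as well.
    db = audb.load("vadtoolkit", version=version, only_metadata=True)

    # "segments" is the table ID listed under Tables.
    return db["segments"].get()
```

Calling `load_vadtoolkit_segments()` should return a data frame indexed by file, start, and end with a single noise column, matching the preview under Tables.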

Tables

ID	Type	Columns
segments	segmented	noise

file	start	end	noise
bus_stop.wav	0 days 00:00:02.173875	0 days 00:00:03.775937	0
bus_stop.wav	0 days 00:00:05.641875	0 days 00:00:07.691937	0
bus_stop.wav	0 days 00:00:08.876875	0 days 00:00:11.498937	0
bus_stop.wav	0 days 00:00:12.468875	0 days 00:00:13.644937	0
bus_stop.wav	0 days 00:00:14.182875	0 days 00:00:16.498937	0
588 rows x 1 column
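The start and end values are time offsets into each file, so a segment's duration is simply end minus start. A minimal sketch using only the Python standard library and the five preview rows shown above; the names `ROWS`, `parse_timedelta`, and `segment_durations` are illustrative.

```python
import re
from datetime import timedelta

# The five preview rows of the "segments" table, copied from above.
ROWS = [
    ("bus_stop.wav", "0 days 00:00:02.173875", "0 days 00:00:03.775937", 0),
    ("bus_stop.wav", "0 days 00:00:05.641875", "0 days 00:00:07.691937", 0),
    ("bus_stop.wav", "0 days 00:00:08.876875", "0 days 00:00:11.498937", 0),
    ("bus_stop.wav", "0 days 00:00:12.468875", "0 days 00:00:13.644937", 0),
    ("bus_stop.wav", "0 days 00:00:14.182875", "0 days 00:00:16.498937", 0),
]

# Matches strings such as "0 days 00:00:02.173875".
_PATTERN = re.compile(r"(\d+) days (\d+):(\d+):(\d+(?:\.\d+)?)")


def parse_timedelta(text: str) -> timedelta:
    """Convert a 'D days HH:MM:SS.ffffff' string into a timedelta."""
    match = _PATTERN.fullmatch(text.strip())
    if match is None:
        raise ValueError(f"unrecognised time offset: {text!r}")
    days, hours, minutes, seconds = match.groups()
    return timedelta(
        days=int(days),
        hours=int(hours),
        minutes=int(minutes),
        seconds=float(seconds),
    )


def segment_durations(rows):
    """Return the duration in seconds (end minus start) of every row."""
    return [
        (parse_timedelta(end) - parse_timedelta(start)).total_seconds()
        for _, start, end, _ in rows
    ]


durations = segment_durations(ROWS)
for (file, start, end, noise), duration in zip(ROWS, durations):
    print(f"{file}  {duration:.3f} s  noise={noise}")
```

Applied to the full table of 588 rows, the same subtraction would give the segment durations summarised in the header.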

Schemes

ID	Dtype	Labels	Mappings
noise	int	0, 1, 2, 3	✓