vadtoolkit

Created by Kim Jaeseok

version: 1.1.0
license: GPLv3
source: https://github.com/jtkim-kaist/VAD
usage: commercial
languages: kor
format: wav
channel: 1
sampling rate: 16000 Hz
bit depth: 16, 32
duration: 0 days 02:00:09.703062500
files: 4, duration distribution: 1801.5 s to 1804.0 s
segments: 588, duration distribution: 0.0 s to 51.1 s
repository: data-public
published: 2024-01-02 by audeering
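The header figures can be cross-checked against each other: the total duration divided by the number of files should fall inside the reported per-file duration range. The following is a minimal sketch using only the Python standard library. The helper name `parse_duration` is hypothetical and not part of any dataset tooling, and the parser assumes the `N days HH:MM:SS.fraction` layout shown in the duration field.

```python
import re
from datetime import timedelta


def parse_duration(text: str) -> timedelta:
    """Parse a string such as '0 days 02:00:09.703062500' (hypothetical helper)."""
    match = re.fullmatch(r"(\d+) days (\d+):(\d+):(\d+)(?:\.(\d+))?", text.strip())
    if match is None:
        raise ValueError(f"unrecognised duration: {text!r}")
    days, hours, minutes, seconds, fraction = match.groups()
    # timedelta has microsecond resolution, so digits beyond the sixth are dropped.
    microseconds = int((fraction or "0")[:6].ljust(6, "0"))
    return timedelta(
        days=int(days),
        hours=int(hours),
        minutes=int(minutes),
        seconds=int(seconds),
        microseconds=microseconds,
    )


# Values copied from the header: total duration and number of files.
total_seconds = parse_duration("0 days 02:00:09.703062500").total_seconds()
mean_file_seconds = total_seconds / 4

print(f"total: {total_seconds:.3f} s, mean per file: {mean_file_seconds:.3f} s")
```

The mean per-file duration computed this way lies between the 1801.5 s and 1804.0 s listed for the file duration distribution, so the header values are mutually consistent.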

Description

VAD Toolkit: A Database for Voice Activity Detection. In each environment, conversational speech by two Korean male speakers was recorded. The ground truth labels were manually annotated. Because the recordings were made in the real world, the dataset includes unexpected noises such as a baby crying, insects chirping, and mouse clicks.

Tables

ID        Type       Columns
segments  segmented  noise

file          start                   end                     noise
bus_stop.wav  0 days 00:00:02.173875  0 days 00:00:03.775937  0
bus_stop.wav  0 days 00:00:05.641875  0 days 00:00:07.691937  0
bus_stop.wav  0 days 00:00:08.876875  0 days 00:00:11.498937  0
bus_stop.wav  0 days 00:00:12.468875  0 days 00:00:13.644937  0
bus_stop.wav  0 days 00:00:14.182875  0 days 00:00:16.498937  0

588 rows x 1 column
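Each row of the segments table is a (file, start, end) index with one `noise` value. To cut a segment out of the audio, the time offsets have to be turned into sample positions. The sketch below does this for the five preview rows, assuming the 16000 Hz sampling rate from the header applies to every file. The names `preview`, `to_sample`, and `rows` are illustrative only, and the `noise` values are passed through without interpretation because the scheme lists no label mapping.

```python
from datetime import timedelta

SAMPLING_RATE = 16000  # Hz, taken from the dataset header (assumed for all files)

# (file, start, end, noise) copied from the five preview rows of the segments table.
preview = [
    ("bus_stop.wav", timedelta(seconds=2, microseconds=173875),
     timedelta(seconds=3, microseconds=775937), 0),
    ("bus_stop.wav", timedelta(seconds=5, microseconds=641875),
     timedelta(seconds=7, microseconds=691937), 0),
    ("bus_stop.wav", timedelta(seconds=8, microseconds=876875),
     timedelta(seconds=11, microseconds=498937), 0),
    ("bus_stop.wav", timedelta(seconds=12, microseconds=468875),
     timedelta(seconds=13, microseconds=644937), 0),
    ("bus_stop.wav", timedelta(seconds=14, microseconds=182875),
     timedelta(seconds=16, microseconds=498937), 0),
]


def to_sample(offset: timedelta) -> int:
    """Convert a time offset to the nearest sample position."""
    return round(offset.total_seconds() * SAMPLING_RATE)


rows = []
for file, start, end, noise in preview:
    rows.append({
        "file": file,
        "start_sample": to_sample(start),
        "end_sample": to_sample(end),
        "duration_s": (end - start).total_seconds(),
        "noise": noise,
    })

for row in rows:
    print(row)
```

Rounding to the nearest sample is a design choice of this sketch; a pipeline that needs segments to tile the file exactly may prefer to floor the start and ceil the end instead.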

Schemes

ID     Dtype  Labels      Mappings
noise  int    0, 1, 2, 3