Emodb example

In this example we download the small emodb database, which contains sentences spoken with different emotions by different actors. The audio is stored as WAV files.

Get source database

First we download the source emodb database to the folder emodb-src.

import os
import urllib.request

import audeer


# Get database source
source = 'http://emodb.bilderbar.info/download/download.zip'
src_dir = 'emodb-src'
if not os.path.exists(src_dir):
    urllib.request.urlretrieve(source, 'emodb.zip')
    audeer.extract_archive('emodb.zip', src_dir)

os.listdir(src_dir)
['erklaerung.txt', 'silb', 'wav', 'erkennung.txt', 'lablaut', 'labsilb']

Gather metadata and annotations

Afterwards we collect all metadata and annotations that we would like to store in the audformat version of the database.

First, have a look at the file names.

os.listdir(os.path.join(src_dir, 'wav'))[:3]
['12b09Ac.wav', '11b02Fd.wav', '13a01Lb.wav']

As described in the emodb documentation, the characters of each file name encode the following information.

Position  Encoding
0..1      speaker
2..4      spoken sentence
5         emotion
6         repetition
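
The position encoding can be illustrated with a small, self-contained sketch. The helper `decode_name` and the `EmodbName` tuple are hypothetical names introduced only for this illustration; they are not part of emodb or audformat.

```python
from typing import NamedTuple


class EmodbName(NamedTuple):
    """Fields encoded in an emodb file name (hypothetical helper type)."""
    speaker: int
    sentence: str
    emotion: str
    repetition: str


def decode_name(name: str) -> EmodbName:
    """Split a name such as '03a01Fa' or '03a01Fa.wav' into its fields."""
    # Accept names with or without the '.wav' extension
    if name.lower().endswith('.wav'):
        name = name[:-4]
    if len(name) != 7:
        raise ValueError(f'expected 7 characters, got {name!r}')
    return EmodbName(
        speaker=int(name[0:2]),   # positions 0..1
        sentence=name[2:5],       # positions 2..4
        emotion=name[5],          # position 5
        repetition=name[6],       # position 6
    )


print(decode_name('03a01Fa.wav'))
```

Slicing with `name[0:2]` and `name[2:5]` uses exclusive end indices, which is why the ranges 0..1 and 2..4 from the table become `0:2` and `2:5` in code.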

Further information is provided for each speaker.

Speaker ID  Gender  Age
03          male    31
08          female  34
09          female  21
10          male    32
11          male    26
12          male    30
13          female  32
14          female  35
15          male    25
16          female  31

For the sentences we have transcriptions.

Code  Transcription
a01   Der Lappen liegt auf dem Eisschrank.
a02   Das will sie am Mittwoch abgeben.
a04   Heute Abend könnte ich es ihm sagen.
a05   Das schwarze Stück Papier befindet sich da oben neben dem Holzstück.
a07   In sieben Stunden wird es soweit sein.
b01   Was sind denn das für Tüten, die da unter dem Tisch stehen?
b02   Sie haben es gerade hoch getragen und jetzt gehen sie wieder runter.
b03   An den Wochenenden bin ich jetzt immer nach Hause gefahren und habe Agnes besucht.
b09   Ich will das eben wegbringen und dann mit Karl was trinken gehen.
b10   Die wird auf dem Platz sein, wo wir sie immer hinlegen.

The emotion codes map to the following emotions.

Code  Emotion
W     anger
L     boredom
E     disgust
A     fear
F     happiness
T     sadness
N     neutral

As stated in the emodb paper, the acted emotions were further evaluated by 20 participants, who had to assign emotion labels to the audio presentations. The agreement of their ratings is stored in the erkannt column of the file erkennung.txt. We read this file and use the annotations to add a confidence column to the emotion table.

import audformat
import pandas as pd

# Prepare functions for getting information from file names
def parse_names(names, from_i, to_i, is_number=False, mapping=None):
    for name in names:
        key = name[from_i:to_i]
        if is_number:
            key = int(key)
        yield mapping[key] if mapping else key


description = (
   'Berlin Database of Emotional Speech. '
   'A German database of emotional utterances '
   'spoken by actors '
   'recorded as a part of the DFG funded research project '
   'SE462/3-1 in 1997 and 1999. '
   'Recordings took place in the anechoic chamber '
   'of the Technical University Berlin, '
   'department of Technical Acoustics. '
   'It contains about 500 utterances '
   'from ten different actors '
   'expressing basic six emotions and neutral.'
)

files = sorted(
    [os.path.join('wav', f) for f in os.listdir(os.path.join(src_dir, 'wav'))]
)
names = [audeer.basename_wo_ext(f) for f in files]

emotion_mapping = {
    'W': 'anger',
    'L': 'boredom',
    'E': 'disgust',
    'A': 'fear',
    'F': 'happiness',
    'T': 'sadness',
    'N': 'neutral',
}
emotions = list(parse_names(names, from_i=5, to_i=6, mapping=emotion_mapping))

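# Read the rater agreement (in percent) for every file.
# sep=r'\s+' and .squeeze('columns') replace the arguments
# delim_whitespace=True and squeeze=True,
# which are deprecated or removed in recent pandas versions
y = pd.read_csv(
    os.path.join(src_dir, 'erkennung.txt'),
    usecols=['Satz', 'erkannt'],
    index_col='Satz',
    sep=r'\s+',
    encoding='Latin-1',
    decimal=',',
    converters={'Satz': lambda x: os.path.join('wav', x)},
).squeeze('columns')
y = y.loc[files]
# Strip non-breaking spaces and convert decimal commas to points
y = y.replace(to_replace='\xa0', value='', regex=True)
y = y.replace(to_replace=',', value='.', regex=True)
confidences = y.astype('float').values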

male = audformat.define.Gender.MALE
female = audformat.define.Gender.FEMALE
language = audformat.utils.map_language('de')
speaker_mapping = {
    3: {'gender': male, 'age': 31, 'language': language},
    8: {'gender': female, 'age': 34, 'language': language},
    9: {'gender': female, 'age': 21, 'language': language},
    10: {'gender': male, 'age': 32, 'language': language},
    11: {'gender': male, 'age': 26, 'language': language},
    12: {'gender': male, 'age': 30, 'language': language},
    13: {'gender': female, 'age': 32, 'language': language},
    14: {'gender': female, 'age': 35, 'language': language},
    15: {'gender': male, 'age': 25, 'language': language},
    16: {'gender': female, 'age': 31, 'language': language},
}
speakers = list(parse_names(names, from_i=0, to_i=2, is_number=True))

transcription_mapping = {
    'a01': 'Der Lappen liegt auf dem Eisschrank.',
    'a02': 'Das will sie am Mittwoch abgeben.',
    'a04': 'Heute abend könnte ich es ihm sagen.',
    'a05': 'Das schwarze Stück Papier befindet sich da oben neben dem '
           'Holzstück.',
    'a07': 'In sieben Stunden wird es soweit sein.',
    'b01': 'Was sind denn das für Tüten, die da unter dem Tisch '
           'stehen.',
    'b02': 'Sie haben es gerade hochgetragen und jetzt gehen sie '
           'wieder runter.',
    'b03': 'An den Wochenenden bin ich jetzt immer nach Hause '
           'gefahren und habe Agnes besucht.',
    'b09': 'Ich will das eben wegbringen und dann mit Karl was '
           'trinken gehen.',
    'b10': 'Die wird auf dem Platz sein, wo wir sie immer hinlegen.',
}
transcriptions = list(parse_names(names, from_i=2, to_i=5))
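
The clean-up of the erkannt column above can be illustrated in isolation. The following is a minimal sketch using only the standard library. The helper `parse_agreement` is a hypothetical name, and the sample strings are made up for illustration; they are not taken from erkennung.txt. It assumes the values are percentages that may contain a decimal comma and a non-breaking space, as the replacements in the code above suggest.

```python
def parse_agreement(raw: str) -> float:
    """Convert a percentage string such as '95,5' to a value in [0, 1]."""
    # Remove non-breaking spaces and surrounding whitespace,
    # then turn the decimal comma into a decimal point
    cleaned = raw.replace('\xa0', '').strip().replace(',', '.')
    value = float(cleaned) / 100.0
    if not 0.0 <= value <= 1.0:
        raise ValueError(f'agreement out of range: {raw!r}')
    return value


# Made-up sample strings, not taken from erkennung.txt
samples = ['90', '95,5\xa0', ' 100 ']
confidences_sketch = [parse_agreement(s) for s in samples]
print(confidences_sketch)
```

The range check mirrors the minimum and maximum of 0 and 1 that a confidence scheme would typically declare, so malformed input is caught before it reaches the database.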

Create audformat database

Now we create the database object and assign the information to it.

db = audformat.Database(
    name='emodb',
    source=source,
    usage=audformat.define.Usage.UNRESTRICTED,
    languages=[language],
    description=description,
    meta={
        'pdf': (
            'http://citeseerx.ist.psu.edu/viewdoc/'
            'download?doi=10.1.1.130.8506&rep=rep1&type=pdf'
        ),
    },
)

# Media
db.media['microphone'] = audformat.Media(
    format='wav',
    sampling_rate=16000,
    channels=1,
)

# Raters
db.raters['gold'] = audformat.Rater()

# Schemes
db.schemes['emotion'] = audformat.Scheme(
    labels=[str(x) for x in emotion_mapping.values()],
    description='Six basic emotions and neutral.',
)
db.schemes['confidence'] = audformat.Scheme(
    audformat.define.DataType.FLOAT,
    minimum=0,
    maximum=1,
    description='Confidence of emotion ratings.',
)
db.schemes['speaker'] = audformat.Scheme(
    labels=speaker_mapping,
    description=(
        'The actors could produce each sentence as often as '
        'they liked and were asked to remember a real '
        'situation from their past when they had felt this '
        'emotion.'
    ),
)
db.schemes['transcription'] = audformat.Scheme(
    labels=transcription_mapping,
    description='Sentence produced by actor.',
)

# Tables
index = audformat.filewise_index(files)
db['files'] = audformat.Table(index)

db['files']['speaker'] = audformat.Column(scheme_id='speaker')
db['files']['speaker'].set(speakers)

db['files']['transcription'] = audformat.Column(scheme_id='transcription')
db['files']['transcription'].set(transcriptions)

db['emotion'] = audformat.Table(index)
db['emotion']['emotion'] = audformat.Column(
    scheme_id='emotion',
    rater_id='gold',
)
db['emotion']['emotion'].set(emotions)
db['emotion']['emotion.confidence'] = audformat.Column(
    scheme_id='confidence',
    rater_id='gold',
)
db['emotion']['emotion.confidence'].set(confidences / 100.0)

Inspect database header

Before storing the database, we can inspect its header.

db
name: emodb
description: Berlin Database of Emotional Speech. A German database of emotional utterances
  spoken by actors recorded as a part of the DFG funded research project SE462/3-1
  in 1997 and 1999. Recordings took place in the anechoic chamber of the Technical
  University Berlin, department of Technical Acoustics. It contains about 500 utterances
  from ten different actors expressing basic six emotions and neutral.
source: http://emodb.bilderbar.info/download/download.zip
usage: unrestricted
languages: [deu]
media:
  microphone: {type: other, format: wav, channels: 1, sampling_rate: 16000}
raters:
  gold: {type: human}
schemes:
  confidence: {description: Confidence of emotion ratings., dtype: float, minimum: 0,
    maximum: 1}
  emotion:
    description: Six basic emotions and neutral.
    dtype: str
    labels: [anger, boredom, disgust, fear, happiness, sadness, neutral]
  speaker:
    description: The actors could produce each sentence as often as they liked and
      were asked to remember a real situation from their past when they had felt this
      emotion.
    dtype: int
    labels:
      3: {gender: male, age: 31, language: deu}
      8: {gender: female, age: 34, language: deu}
      9: {gender: female, age: 21, language: deu}
      10: {gender: male, age: 32, language: deu}
      11: {gender: male, age: 26, language: deu}
      12: {gender: male, age: 30, language: deu}
      13: {gender: female, age: 32, language: deu}
      14: {gender: female, age: 35, language: deu}
      15: {gender: male, age: 25, language: deu}
      16: {gender: female, age: 31, language: deu}
  transcription:
    description: Sentence produced by actor.
    dtype: str
    labels: {a01: Der Lappen liegt auf dem Eisschrank., a02: Das will sie am Mittwoch
        abgeben., a04: Heute abend könnte ich es ihm sagen., a05: Das schwarze Stück
        Papier befindet sich da oben neben dem Holzstück., a07: In sieben Stunden
        wird es soweit sein., b01: 'Was sind denn das für Tüten, die da unter dem
        Tisch stehen.', b02: Sie haben es gerade hochgetragen und jetzt gehen sie
        wieder runter., b03: An den Wochenenden bin ich jetzt immer nach Hause gefahren
        und habe Agnes besucht., b09: Ich will das eben wegbringen und dann mit Karl
        was trinken gehen., b10: 'Die wird auf dem Platz sein, wo wir sie immer hinlegen.'}
tables:
  emotion:
    type: filewise
    columns:
      emotion: {scheme_id: emotion, rater_id: gold}
      emotion.confidence: {scheme_id: confidence, rater_id: gold}
  files:
    type: filewise
    columns:
      speaker: {scheme_id: speaker}
      transcription: {scheme_id: transcription}
pdf: http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.130.8506&rep=rep1&type=pdf

Inspect database tables

First check which tables are available.

list(db.tables)
['files', 'emotion']

Then list the first 10 entries of every table.

db['files'].get()[:10]
                 speaker  transcription
file
wav/03a01Fa.wav        3            a01
wav/03a01Nc.wav        3            a01
wav/03a01Wa.wav        3            a01
wav/03a02Fc.wav        3            a02
wav/03a02Nc.wav        3            a02
wav/03a02Ta.wav        3            a02
wav/03a02Wb.wav        3            a02
wav/03a02Wc.wav        3            a02
wav/03a04Ad.wav        3            a04
wav/03a04Fd.wav        3            a04

db['emotion'].get()[:10]
                   emotion  emotion.confidence
file
wav/03a01Fa.wav  happiness                0.90
wav/03a01Nc.wav    neutral                1.00
wav/03a01Wa.wav      anger                0.95
wav/03a02Fc.wav  happiness                0.85
wav/03a02Nc.wav    neutral                1.00
wav/03a02Ta.wav    sadness                0.90
wav/03a02Wb.wav      anger                1.00
wav/03a02Wc.wav      anger                1.00
wav/03a04Ad.wav       fear                0.90
wav/03a04Fd.wav  happiness                1.00

You can access additional header information in a table with the map argument of audformat.Table.get(); see Map scheme labels for extended documentation.

db['files'].get(map={'speaker': ['speaker', 'age', 'gender']})[:10]
                 speaker  transcription  age  gender
file
wav/03a01Fa.wav        3            a01   31    male
wav/03a01Nc.wav        3            a01   31    male
wav/03a01Wa.wav        3            a01   31    male
wav/03a02Fc.wav        3            a02   31    male
wav/03a02Nc.wav        3            a02   31    male
wav/03a02Ta.wav        3            a02   31    male
wav/03a02Wb.wav        3            a02   31    male
wav/03a02Wc.wav        3            a02   31    male
wav/03a04Ad.wav        3            a04   31    male
wav/03a04Fd.wav        3            a04   31    male
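
Conceptually, mapping looks up each label in the scheme's label dictionary and adds the requested fields as extra columns. The following plain-Python sketch illustrates only this idea; it is not how audformat implements the map argument, and `expand_labels` is a hypothetical helper. The speaker entries are copied from the speaker table of this example.

```python
# Subset of the speaker labels used in this example
speaker_labels = {
    11: {'gender': 'male', 'age': 26},
    12: {'gender': 'male', 'age': 30},
    13: {'gender': 'female', 'age': 32},
}

# Rows as they might come from a table without mapping
rows = [
    {'file': 'wav/12b09Ac.wav', 'speaker': 12},
    {'file': 'wav/11b02Fd.wav', 'speaker': 11},
    {'file': 'wav/13a01Lb.wav', 'speaker': 13},
]


def expand_labels(rows, column, labels, fields):
    """Add the requested label fields of `column` to every row."""
    expanded = []
    for row in rows:
        new_row = dict(row)  # copy, so the input rows stay unchanged
        attributes = labels[row[column]]
        for field in fields:
            new_row[field] = attributes[field]
        expanded.append(new_row)
    return expanded


mapped = expand_labels(rows, 'speaker', speaker_labels, ['age', 'gender'])
for row in mapped:
    print(row)
```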

Store database to disk

Now we store the database in the folder emodb. Note that we have to make sure ourselves that the media files are located at the correct position.

import shutil


db_dir = audeer.mkdir('emodb')
shutil.copytree(
    os.path.join(src_dir, 'wav'),
    os.path.join(db_dir, 'wav'),
)
db.save(db_dir)

os.listdir(db_dir)
['db.emotion.csv', 'wav', 'db.yaml', 'db.files.csv']
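
Note that shutil.copytree() raises a FileExistsError by default if the target folder already exists, so running the copy step a second time fails. A minimal sketch of a repeatable copy, assuming Python 3.8 or newer for the dirs_exist_ok argument, could look as follows. The helper `copy_media` is a hypothetical name, and the demonstration uses temporary folders with an empty placeholder file instead of real audio.

```python
import shutil
import tempfile
from pathlib import Path


def copy_media(src_wav, db_root):
    """Copy the media folder into the database root, tolerating re-runs."""
    target = Path(db_root) / 'wav'
    # dirs_exist_ok=True (Python >= 3.8) allows an existing target folder
    shutil.copytree(src_wav, target, dirs_exist_ok=True)
    return target


with tempfile.TemporaryDirectory() as tmp:
    src = Path(tmp) / 'emodb-src' / 'wav'
    src.mkdir(parents=True)
    # Empty placeholder file, not real audio
    (src / '03a01Fa.wav').write_bytes(b'')
    db_root = Path(tmp) / 'emodb'
    db_root.mkdir()
    copy_media(src, db_root)
    copy_media(src, db_root)  # the second call does not raise
    copied = sorted(p.name for p in (db_root / 'wav').iterdir())

print(copied)
```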

You can read the database from disk as well.

db = audformat.Database.load(db_dir)
db.name
'emodb'