Load a database

To load a database you only need its name. However, we recommend specifying its version as well. This is not required, as audb.load() automatically searches for the latest available version, but it ensures that your code returns the same data even if a new version of the database is published.

db = audb.load(
    "emodb",
    version="1.4.1",
    verbose=False,
)

audb.load() will download the data, store it in a cache folder, and return the database as an audformat.Database object. The most important part of that object is the set of database tables.

db.tables
emotion:
  type: filewise
  columns:
    emotion: {scheme_id: emotion, rater_id: gold}
    emotion.confidence: {scheme_id: confidence, rater_id: gold}
emotion.categories.test.gold_standard:
  type: filewise
  split_id: test
  columns:
    emotion: {scheme_id: emotion, rater_id: gold}
    emotion.confidence: {scheme_id: confidence, rater_id: gold}
emotion.categories.train.gold_standard:
  type: filewise
  split_id: train
  columns:
    emotion: {scheme_id: emotion, rater_id: gold}
    emotion.confidence: {scheme_id: confidence, rater_id: gold}
files:
  type: filewise
  columns:
    duration: {scheme_id: duration}
    speaker: {scheme_id: speaker}
    transcription: {scheme_id: transcription}

They contain the annotations of the database, and can be requested as a pandas.DataFrame.

db["emotion"].get()
emotion emotion.confidence
file
/home/runner/audb/emodb/1.4.1/d3b62a9b/wav/03a01Fa.wav happiness 0.90
/home/runner/audb/emodb/1.4.1/d3b62a9b/wav/03a01Nc.wav neutral 1.00
/home/runner/audb/emodb/1.4.1/d3b62a9b/wav/03a01Wa.wav anger 0.95
... ... ...
/home/runner/audb/emodb/1.4.1/d3b62a9b/wav/16b10Td.wav sadness 0.95
/home/runner/audb/emodb/1.4.1/d3b62a9b/wav/16b10Wa.wav anger 1.00
/home/runner/audb/emodb/1.4.1/d3b62a9b/wav/16b10Wb.wav anger 1.00

535 rows × 2 columns

You can also request a single column directly as a pandas.Series.

db["files"]["duration"].get()
file
/home/runner/audb/emodb/1.4.1/d3b62a9b/wav/03a01Fa.wav 0 days 00:00:01.898250
/home/runner/audb/emodb/1.4.1/d3b62a9b/wav/03a01Nc.wav 0 days 00:00:01.611250
/home/runner/audb/emodb/1.4.1/d3b62a9b/wav/03a01Wa.wav 0 days 00:00:01.877812500
... ...
/home/runner/audb/emodb/1.4.1/d3b62a9b/wav/16b10Td.wav 0 days 00:00:03.934187500
/home/runner/audb/emodb/1.4.1/d3b62a9b/wav/16b10Wa.wav 0 days 00:00:02.414125
/home/runner/audb/emodb/1.4.1/d3b62a9b/wav/16b10Wb.wav 0 days 00:00:02.522499999

535 rows × 1 columns

As you can see, the index of the returned object holds the paths to the corresponding media files.

For a full overview of how to handle the database object, see the corresponding audformat documentation. We also recommend familiarizing yourself with how to combine tables and how to map labels.

The following sections discuss Media conversion and flavors, how to load Metadata and header only, Loading on demand, and Streaming.

Media conversion and flavors

When loading a database, audio files can be automatically converted. This creates a new flavor of the database, represented by audb.Flavor. The following properties can be changed.

bit_depth:
  - 8
  - 16
  - 24
  - 32 (WAV only)
format:
  - 'wav'
  - 'flac'
channels:
  - 0        # select first channel
  - [0, -1]  # select first and last channel
  - ...
mixdown:
  - False
  - True
sampling_rate:
  - 8000
  - 16000
  - 22500
  - 44100
  - 48000
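
To make the constraints in the list above concrete, here is a minimal sketch of a check that rejects unsupported combinations before a flavor is requested. The helper check_flavor and the ALLOWED table are hypothetical and not part of audb. The allowed values are simply copied from the list, and channels is skipped because it accepts arbitrary channel indices.

```python
# Allowed values copied from the list above (assumption: the list is complete).
# "channels" is left out because it accepts arbitrary channel indices.
ALLOWED = {
    "bit_depth": {8, 16, 24, 32},
    "format": {"wav", "flac"},
    "mixdown": {False, True},
    "sampling_rate": {8000, 16000, 22500, 44100, 48000},
}


def check_flavor(**flavor):
    """Hypothetical helper: raise ValueError for unsupported flavor values."""
    for key, value in flavor.items():
        if value is None:
            continue  # property was not requested, nothing to check
        if key in ALLOWED and value not in ALLOWED[key]:
            raise ValueError(f"{key}={value!r} is not supported")
    # The list above marks a bit depth of 32 as "WAV only".
    if flavor.get("bit_depth") == 32 and flavor.get("format") == "flac":
        raise ValueError("bit_depth=32 is only available for WAV")
    return flavor


# Accepted: FLAC at 44100 Hz.
requested = check_flavor(format="flac", sampling_rate=44100)
```

A combination such as check_flavor(format="flac", bit_depth=32) raises a ValueError in this sketch. How audb itself reports unsupported values may differ.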

The next example will convert the original files to FLAC with a sampling rate of 44100 Hz. For each flavor a sub-folder will be created inside the cache.

db = audb.load(
    "emodb",
    version="1.4.1",
    format="flac",
    sampling_rate=44100,
    verbose=False,
)

The flavor information of a database is stored inside the db.meta["audb"] dictionary.

db.meta["audb"]["flavor"]
{'bit_depth': None,
 'channels': None,
 'format': 'flac',
 'mixdown': False,
 'sampling_rate': 44100}

You can list all available flavors and their locations in the cache with:

df = audb.cached()
df[["name", "version", "complete", "format", "sampling_rate"]]
name version complete format sampling_rate
/home/runner/audb/emodb/1.4.1/9c872f39 emodb 1.4.1 False flac 44100
/home/runner/audb/emodb/1.4.1/d3b62a9b emodb 1.4.1 False None None

The "complete" column tells you whether a database flavor is completely cached, or whether some tables or media files are still missing.
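
Because the result is an ordinary pandas.DataFrame, it can be filtered like any other table, for example to find the cache folder of one flavor. The sketch below does not call audb. It rebuilds a toy data frame from the two rows shown above (an assumption about the layout, limited to the columns printed there) and selects from it.

```python
import pandas as pd

# Toy stand-in for the data frame returned by audb.cached(),
# rebuilt by hand from the two rows shown above.
df = pd.DataFrame(
    {
        "name": ["emodb", "emodb"],
        "version": ["1.4.1", "1.4.1"],
        "complete": [False, False],
        "format": ["flac", None],
    },
    index=[
        "/home/runner/audb/emodb/1.4.1/9c872f39",
        "/home/runner/audb/emodb/1.4.1/d3b62a9b",
    ],
)

# Cache folders that hold the FLAC flavor of emodb.
is_flac = (df["name"] == "emodb") & (df["format"] == "flac")
flac_folders = list(df.index[is_flac.to_numpy()])

# Cache folders in which tables or media files are still missing.
incomplete = list(df.index[(~df["complete"]).to_numpy()])
```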

Metadata and header only

It is possible to request only the metadata (header and annotations) of a database. In that case, the header and all tables are loaded, but the media files are not.

db = audb.load(
    "emodb",
    version="1.4.1",
    only_metadata=True,
    verbose=False,
)

For databases with many annotations, this can still take some time. If you are only interested in header information, you can use audb.info.header(). If you only need parts of the header, have a look at the audb.info module. For example, it can list all table definitions.

audb.info.tables(
    "emodb",
    version="1.4.1",
)
emotion:
  type: filewise
  columns:
    emotion: {scheme_id: emotion, rater_id: gold}
    emotion.confidence: {scheme_id: confidence, rater_id: gold}
emotion.categories.test.gold_standard:
  type: filewise
  split_id: test
  columns:
    emotion: {scheme_id: emotion, rater_id: gold}
    emotion.confidence: {scheme_id: confidence, rater_id: gold}
emotion.categories.train.gold_standard:
  type: filewise
  split_id: train
  columns:
    emotion: {scheme_id: emotion, rater_id: gold}
    emotion.confidence: {scheme_id: confidence, rater_id: gold}
files:
  type: filewise
  columns:
    duration: {scheme_id: duration}
    speaker: {scheme_id: speaker}
    transcription: {scheme_id: transcription}

Or get the total duration of all media files.

audb.info.duration(
    "emodb",
    version="1.4.1",
)
Timedelta('0 days 00:24:47.092187500')

See audb.info for a list of all available options.

Loading on demand

It is possible to request only specific tables or media of a database.

For instance, many databases are organized into train, dev, and test splits. Hence, to evaluate the performance of a machine learning model, we don’t have to download the full database, but only the table(s) and media of the test set.

Or, if we want the data of a specific speaker, we can do the following. First, we download the table with information about the speakers (here db["files"]):

db = audb.load(
    "emodb",
    version="1.4.1",
    tables=["files"],
    only_metadata=True,
    full_path=False,
    verbose=False,
)
db.tables
files:
  type: filewise
  columns:
    duration: {scheme_id: duration}
    speaker: {scheme_id: speaker}
    transcription: {scheme_id: transcription}

Note that we set only_metadata=True since we only need the labels at the moment. By setting full_path=False we further ensure that the paths in the table index are relative and therefore match the paths on the backend.

speaker = db["files"]["speaker"].get()
speaker
file
wav/03a01Fa.wav 3
wav/03a01Nc.wav 3
wav/03a01Wa.wav 3
... ...
wav/16b10Td.wav 16
wav/16b10Wa.wav 16
wav/16b10Wb.wav 16

535 rows × 1 columns

Now, we use the column with speaker IDs to get a list of media files that belong to speaker 3.

media = db["files"].files[speaker == 3]
media
file
0 wav/03a01Fa.wav
1 wav/03a01Nc.wav
2 wav/03a01Wa.wav
... ...
46 wav/03b10Nc.wav
47 wav/03b10Wb.wav
48 wav/03b10Wc.wav

49 rows × 1 columns
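
The selection above relies on plain pandas boolean indexing. As a sketch that runs without downloading anything, the following toy example repeats the step on a hand-made series that contains only the six rows visible in the speaker output above. The series is a stand-in for db["files"]["speaker"].get(), not the real table.

```python
import pandas as pd

# Toy stand-in for db["files"]["speaker"].get(),
# limited to the six rows visible in the output above.
speaker = pd.Series(
    [3, 3, 3, 16, 16, 16],
    index=pd.Index(
        [
            "wav/03a01Fa.wav",
            "wav/03a01Nc.wav",
            "wav/03a01Wa.wav",
            "wav/16b10Td.wav",
            "wav/16b10Wa.wav",
            "wav/16b10Wb.wav",
        ],
        name="file",
    ),
    name="speaker",
)

# Boolean mask that is True for all files of speaker 3.
mask = speaker == 3

# Keep only the matching rows and read the file paths from the index.
media = speaker[mask].index.tolist()
```

Here media holds the three relative paths of speaker 3. Any sequence of relative paths built this way has the same form as the media list used in this section.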

Finally, we load the database again and use the list to request only the data of this speaker.

db = audb.load(
    "emodb",
    version="1.4.1",
    media=media,
    full_path=False,
    verbose=False,
)

This will also remove entries of other speakers from the tables.

db["emotion"].get()
emotion emotion.confidence
file
wav/03a01Fa.wav happiness 0.90
wav/03a01Nc.wav neutral 1.00
wav/03a01Wa.wav anger 0.95
... ... ...
wav/03b10Nc.wav neutral 0.80
wav/03b10Wb.wav anger 0.95
wav/03b10Wc.wav anger 1.00

49 rows × 2 columns

Streaming

audb.stream() provides a pseudo-streaming mode, which helps to load large datasets. It loads only batch_size rows of a selected table into memory and downloads only the matching media files in each iteration. The table and media files are still stored in the cache.

db = audb.stream(
    "emodb",
    "emotion",
    version="1.4.1",
    batch_size=4,
    full_path=False,
    verbose=False,
)

It returns an audb.DatabaseIterator object, which behaves like an audformat.Database object, but also lets you iterate over the database:

next(db)
emotion emotion.confidence
file
wav/03a01Fa.wav happiness 0.90
wav/03a01Nc.wav neutral 1.00
wav/03a01Wa.wav anger 0.95
wav/03a02Fc.wav happiness 0.85
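
The idea behind reading a table in batches can be sketched in a few lines of plain Python. The generator below is an illustration only and not audb's implementation. The name iter_batches is hypothetical, and a list of file names stands in for the table rows.

```python
from itertools import islice


def iter_batches(rows, batch_size):
    """Yield lists of at most batch_size rows, reading rows lazily."""
    iterator = iter(rows)
    while True:
        # Pull the next batch_size rows; fewer if the source runs out.
        batch = list(islice(iterator, batch_size))
        if not batch:
            return
        yield batch


# Five example rows, taken from file names shown in this section.
files = [
    "wav/03a01Fa.wav",
    "wav/03a01Nc.wav",
    "wav/03a01Wa.wav",
    "wav/03a02Fc.wav",
    "wav/16b10Td.wav",
]
batches = list(iter_batches(files, batch_size=4))
```

With batch_size=4 the five rows are split into one batch of four and a final batch of one, so only a single batch has to be held in memory at a time.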

With shuffle=True, the data is returned in random order. audb.stream() then loads buffer_size rows into a buffer and selects randomly from those.

import numpy as np
np.random.seed(1)
db = audb.stream(
    "emodb",
    "emotion",
    version="1.4.1",
    batch_size=4,
    shuffle=True,
    buffer_size=100_000,
    only_metadata=True,
    full_path=False,
    verbose=False,
)
next(db)
emotion emotion.confidence
file
wav/14a05Fb.wav happiness 1.0
wav/15a05Eb.wav disgust 1.0
wav/12a05Nd.wav neutral 0.9
wav/13a07Na.wav neutral 0.9
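
To illustrate how a buffer limits the randomness of a shuffled stream, here is a minimal sketch that reads rows in chunks of buffer_size and shuffles each chunk before returning it. This is an assumption about the general technique, not a description of audb's internals. The function name shuffled_stream and its seed argument are hypothetical.

```python
import random
from itertools import islice


def shuffled_stream(rows, buffer_size, seed=None):
    """Yield rows in random order, shuffling one buffer at a time."""
    rng = random.Random(seed)
    iterator = iter(rows)
    while True:
        # Fill the buffer with the next buffer_size rows.
        buffer = list(islice(iterator, buffer_size))
        if not buffer:
            return
        # Rows can only change places within the current buffer.
        rng.shuffle(buffer)
        yield from buffer


rows = list(range(6))
small_buffer = list(shuffled_stream(rows, buffer_size=2, seed=1))
large_buffer = list(shuffled_stream(rows, buffer_size=100, seed=1))
```

In this sketch a small buffer only reorders neighbouring rows, whereas a buffer at least as large as the table shuffles across all rows. That would match the example above, where buffer_size=100_000 exceeds the 535 rows of the table.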