Tables

A table links labels to media files. It consists of one index column (filewise) or three index columns (segmented), followed by an arbitrary number of label columns. Labels refer either to whole files or to parts of files. An empty label means that no label has been assigned (yet).

There are two types of tables:

  • Filewise: labels refer to whole files

  • Segmented: labels refer to specific parts of files (segments)

Each type comes with a characteristic index.
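
As a rough picture of what "characteristic index" means, the two index shapes can be sketched with plain pandas objects. This is an illustration only, assuming the indices behave like a pandas.Index with one level and a pandas.MultiIndex with three levels; the variable names are chosen for this example and are not part of audformat.

```python
import pandas as pd

# Filewise: a single index level named "file".
filewise = pd.Index(["f1", "f2", "f3"], name="file")

# Segmented: three index levels named "file", "start" and "end".
# Times are stored as timedeltas; a missing end becomes NaT.
segmented = pd.MultiIndex.from_arrays(
    [
        ["f1", "f1", "f2"],
        pd.to_timedelta(["0s", "1s", "0s"]),
        pd.to_timedelta(["1s", "2s", None]),
    ],
    names=["file", "start", "end"],
)

print(filewise.nlevels, list(filewise.names))
print(segmented.nlevels, list(segmented.names))
```

The sections below build the same two structures with audformat.filewise_index() and audformat.segmented_index().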

Filewise

Index columns   Description
file            Path to media file

audformat implementation

Create a filewise index:

import audformat
import audformat.testing


filewise_index = audformat.filewise_index(
    ["f1", "f2", "f3"],
)
filewise_index
file
f1
f2
f3

Create a database and add a table with a filewise index:

db = audformat.testing.create_db(minimal=True)
db["filewise"] = audformat.Table(filewise_index)
db["filewise"]["values"] = audformat.Column()
db.tables["filewise"]
type: filewise
columns:
  values: {}

Assign labels to a table:

values_list = [1, 2, 3]
values_dict = {"values": values_list}
db["filewise"].set(values_dict)

Access labels as pandas.DataFrame:

db["filewise"].get()
file  values
f1    1
f2    2
f3    3

Assign labels to a column:

db["filewise"]["values"].set(values_list)

Access labels as pandas.Series:

db["filewise"]["values"].get()
file  values
f1    1
f2    2
f3    3

Access labels and convert index to a segmented index:

db["filewise"]["values"].get(as_segmented=True)
file  start   end  values
f1    0 days  NaT  1
f2    0 days  NaT  2
f3    0 days  NaT  3
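
The output above suggests that the conversion gives every file a start of 0 and an end of NaT. A minimal pandas sketch of that mapping follows. It is not audformat's own implementation, and the variable names are made up for this example.

```python
import pandas as pd

# Filewise labels: one value per file.
labels = pd.Series(
    [1, 2, 3],
    index=pd.Index(["f1", "f2", "f3"], name="file"),
    name="values",
)

# Build a three-level index: every file starts at 0 and has no explicit end.
n = len(labels)
segmented = pd.MultiIndex.from_arrays(
    [
        labels.index,
        pd.to_timedelta([0] * n, unit="s"),
        pd.to_timedelta([pd.NaT] * n),
    ],
    names=["file", "start", "end"],
)

# Same values, now addressed by (file, start, end).
as_segmented = pd.Series(labels.to_numpy(), index=segmented, name="values")
print(as_segmented)
```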

Access labels from a filewise table with a segmented index:

segmented_index = audformat.segmented_index(
    files=["f1", "f1", "f1", "f2"],
    starts=["0s", "1s", "2s", "0s"],
    ends=["1s", "2s", "3s", None],
)
db["filewise"].get(segmented_index)
file  start            end              values
f1    0 days 00:00:00  0 days 00:00:01  1
      0 days 00:00:01  0 days 00:00:02  1
      0 days 00:00:02  0 days 00:00:03  1
f2    0 days 00:00:00  NaT              2

Access labels from a filewise column with a segmented index:

db["filewise"]["values"].get(segmented_index)
file  start            end              values
f1    0 days 00:00:00  0 days 00:00:01  1
      0 days 00:00:01  0 days 00:00:02  1
      0 days 00:00:02  0 days 00:00:03  1
f2    0 days 00:00:00  NaT              2
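
In the two results above, every requested segment carries the label of the file it belongs to. The following pandas sketch reproduces that lookup under the assumption that this is all that happens; it only mirrors the observed output and makes no claim about how audformat implements it. All names are illustrative.

```python
import pandas as pd

# Filewise labels: one value per file.
filewise = pd.Series(
    [1, 2, 3],
    index=pd.Index(["f1", "f2", "f3"], name="file"),
    name="values",
)

# Segments we want labels for.
requested = pd.MultiIndex.from_arrays(
    [
        ["f1", "f1", "f1", "f2"],
        pd.to_timedelta(["0s", "1s", "2s", "0s"]),
        pd.to_timedelta(["1s", "2s", "3s", None]),
    ],
    names=["file", "start", "end"],
)

# Look up the file-level label once per requested segment.
files = requested.get_level_values("file")
result = pd.Series(filewise.loc[files].to_numpy(), index=requested, name="values")
print(result)
```

Files that are not requested (here f3) simply do not appear in the result.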

Segmented

Index columns   Description
file            Path to media file
start           Start time of the segment (relative to the beginning of the file)
end             End time of the segment (relative to the beginning of the file)
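
Start and end times are offsets from the beginning of the file, so they are naturally expressed as durations. The sketch below uses pandas.to_timedelta to parse the same strings that appear in the example further down. Whether audformat parses them in exactly this way is an assumption here.

```python
import pandas as pd

# Offsets relative to the beginning of each file.
starts = pd.to_timedelta(["0s", "1s", "2s", "0s", "1m"])
ends = pd.to_timedelta(["1s", "2s", "3s", None, "1h"])

# A missing end is represented as NaT.
print(ends[3])

# Segment durations; NaT propagates where the end is missing.
durations = ends - starts
print(durations)
```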

audformat implementation

Create a segmented index:

segmented_index = audformat.segmented_index(
    files=["f1", "f1", "f1", "f2", "f3"],
    starts=["0s", "1s", "2s", "0s", "1m"],
    ends=["1s", "2s", "3s", None, "1h"],
)
segmented_index
file  start            end
f1    0 days 00:00:00  0 days 00:00:01
      0 days 00:00:01  0 days 00:00:02
      0 days 00:00:02  0 days 00:00:03
f2    0 days 00:00:00  NaT
f3    0 days 00:01:00  0 days 01:00:00

Add a table with a segmented index:

db["segmented"] = audformat.Table(segmented_index)
db["segmented"]["values"] = audformat.Column()
db.tables["segmented"]
type: segmented
columns:
  values: {}

Assign labels to the whole table:

values_list = [1, 2, 3, 4, 5]
values_dict = {"values": values_list}
db["segmented"].set(values_dict)

Access all labels as pandas.DataFrame:

db["segmented"].get()
file  start            end              values
f1    0 days 00:00:00  0 days 00:00:01  1
      0 days 00:00:01  0 days 00:00:02  2
      0 days 00:00:02  0 days 00:00:03  3
f2    0 days 00:00:00  NaT              4
f3    0 days 00:01:00  0 days 01:00:00  5

Assign labels to a column:

db["segmented"]["values"].set(values_list)

Access labels from a column as pandas.Series:

db["segmented"]["values"].get()
file  start            end              values
f1    0 days 00:00:00  0 days 00:00:01  1
      0 days 00:00:01  0 days 00:00:02  2
      0 days 00:00:02  0 days 00:00:03  3
f2    0 days 00:00:00  NaT              4
f3    0 days 00:01:00  0 days 01:00:00  5

Access labels from a segmented table with a filewise index:

filewise_index = audformat.filewise_index(
    ["f1", "f2"],
)
db["segmented"].get(filewise_index)
file  start            end              values
f1    0 days 00:00:00  0 days 00:00:01  1
      0 days 00:00:01  0 days 00:00:02  2
      0 days 00:00:02  0 days 00:00:03  3
f2    0 days 00:00:00  NaT              4

Access labels from a segmented column with a filewise index:

db["segmented"]["values"].get(filewise_index)
file  start            end              values
f1    0 days 00:00:00  0 days 00:00:01  1
      0 days 00:00:01  0 days 00:00:02  2
      0 days 00:00:02  0 days 00:00:03  3
f2    0 days 00:00:00  NaT              4
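
Judging from the output above, requesting a segmented table with a filewise index returns all segments that belong to the listed files. A minimal pandas sketch of that selection, with illustrative names and no claim to match audformat's internals:

```python
import pandas as pd

# Segmented labels: one value per (file, start, end).
index = pd.MultiIndex.from_arrays(
    [
        ["f1", "f1", "f1", "f2", "f3"],
        pd.to_timedelta(["0s", "1s", "2s", "0s", "1m"]),
        pd.to_timedelta(["1s", "2s", "3s", None, "1h"]),
    ],
    names=["file", "start", "end"],
)
segmented = pd.Series([1, 2, 3, 4, 5], index=index, name="values")

# Files we want labels for.
requested = pd.Index(["f1", "f2"], name="file")

# Keep every segment whose file is among the requested files.
mask = segmented.index.get_level_values("file").isin(requested)
result = segmented[mask]
print(result)
```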