Tables¶
A table links labels to media files. It consists of one or three index columns followed by an arbitrary number of label columns. Labels can either refer to whole files or part of files. An empty label means that no label has been assigned (yet).
There are two types of tables:
Filewise: labels refer to whole files
Segmented: labels refer to specific parts of files (segments)
Each type comes with a characteristic index.
Filewise¶
Index columns |
Description |
---|---|
file |
Path to media file |
audformat implementation¶
Create a filewise index:
>>> import audformat
>>> filewise_index = audformat.filewise_index(["f1", "f2", "f3"])
>>> filewise_index
Index(['f1', 'f2', 'f3'], dtype='string', name='file')
Create database and add table with a filewise index:
>>> db = audformat.Database("mydb")
>>> db["filewise"] = audformat.Table(filewise_index)
>>> db["filewise"]["values"] = audformat.Column()
>>> db.tables["filewise"]
type: filewise
columns:
values: {}
Assign labels to a table:
>>> values_list = [1, 2, 3]
>>> values_dict = {"values": values_list}
>>> db["filewise"].set(values_dict)
Access labels as pandas.DataFrame
:
>>> db["filewise"].get()
values
file
f1 1
f2 2
f3 3
Assign labels to a column:
>>> db["filewise"]["values"].set(values_list)
Access labels as pandas.Series
>>> db["filewise"]["values"].get()
file
f1 1
f2 2
f3 3
Name: values, dtype: object
Access labels and convert index to a segmented index:
>>> db["filewise"]["values"].get(as_segmented=True)
file start end
f1 0 days NaT 1
f2 0 days NaT 2
f3 0 days NaT 3
Name: values, dtype: object
Access labels from a filewise table with a segmented index:
>>> segmented_index = audformat.segmented_index(
... files=["f1", "f1", "f1", "f2"],
... starts=["0s", "1s", "2s", "0s"],
... ends=["1s", "2s", "3s", None],
... )
>>> db["filewise"].get(segmented_index)
values
file start end
f1 0 days 00:00:00 0 days 00:00:01 1
0 days 00:00:01 0 days 00:00:02 1
0 days 00:00:02 0 days 00:00:03 1
f2 0 days 00:00:00 NaT 2
Access labels from a filewise column with a segmented index:
>>> db["filewise"]["values"].get(segmented_index)
file start end
f1 0 days 00:00:00 0 days 00:00:01 1
0 days 00:00:01 0 days 00:00:02 1
0 days 00:00:02 0 days 00:00:03 1
f2 0 days 00:00:00 NaT 2
Name: values, dtype: object
Segmented¶
Index columns |
Description |
---|---|
file |
Path to media file |
start |
Start time of the segment (relative to the beginning of the file) |
end |
End time of the segment (relative to the beginning of the file) |
audformat implementation¶
Create a segmented index:
>>> segmented_index = audformat.segmented_index(
... files=["f1", "f1", "f1", "f2", "f3"],
... starts=["0s", "1s", "2s", "0s", "1m"],
... ends=["1s", "2s", "3s", None, "1h"],
... )
>>> segmented_index
MultiIndex([('f1', '0 days 00:00:00', '0 days 00:00:01'),
('f1', '0 days 00:00:01', '0 days 00:00:02'),
('f1', '0 days 00:00:02', '0 days 00:00:03'),
('f2', '0 days 00:00:00', NaT),
('f3', '0 days 00:01:00', '0 days 01:00:00')],
names=['file', 'start', 'end'])
Add table with a segmented index:
>>> db["segmented"] = audformat.Table(segmented_index)
>>> db["segmented"]["values"] = audformat.Column()
>>> db.tables["segmented"]
type: segmented
columns:
values: {}
Assign labels to the whole table:
>>> values_list = [1, 2, 3, 4, 5]
>>> values_dict = {"values": values_list}
>>> db["segmented"].set(values_dict)
Access all labels as pandas.DataFrame
:
>>> db["segmented"].get()
values
file start end
f1 0 days 00:00:00 0 days 00:00:01 1
0 days 00:00:01 0 days 00:00:02 2
0 days 00:00:02 0 days 00:00:03 3
f2 0 days 00:00:00 NaT 4
f3 0 days 00:01:00 0 days 01:00:00 5
Assign labels to a column:
>>> db["segmented"]["values"].set(values_list)
Access labels from a column as pandas.Series
:
>>> db["segmented"]["values"].get()
file start end
f1 0 days 00:00:00 0 days 00:00:01 1
0 days 00:00:01 0 days 00:00:02 2
0 days 00:00:02 0 days 00:00:03 3
f2 0 days 00:00:00 NaT 4
f3 0 days 00:01:00 0 days 01:00:00 5
Name: values, dtype: object
Access labels from a segmented table with a filewise index:
>>> filewise_index = audformat.filewise_index(["f1", "f2"])
>>> db["segmented"].get(filewise_index)
values
file start end
f1 0 days 00:00:00 0 days 00:00:01 1
0 days 00:00:01 0 days 00:00:02 2
0 days 00:00:02 0 days 00:00:03 3
f2 0 days 00:00:00 NaT 4
Access labels from a segmented column with a filewise index:
>>> db["segmented"]["values"].get(filewise_index)
file start end
f1 0 days 00:00:00 0 days 00:00:01 1
0 days 00:00:01 0 days 00:00:02 2
0 days 00:00:02 0 days 00:00:03 3
f2 0 days 00:00:00 NaT 4
Name: values, dtype: object