Database
An audformat database consists of a header,
several tables,
and media files.
On hard disk all of them are stored inside a single folder.
The header is stored as a YAML file,
the tables contain labels stored in (possibly) multiple CSV or PARQUET files,
and the media files are usually stored in sub-folders.
Media files are not restricted to a particular file type.
Usually, they consist of audio, video, or text files.
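The folder layout described above can be illustrated with a small sketch using only the Python standard library. It sorts the files of a database folder into header, tables, and media purely by location and suffix. The helper name `summarize_database_folder` is hypothetical and not part of audformat, and treating every top-level YAML file as the header is a simplifying assumption.

```python
from pathlib import Path

# Suffixes of table files, as named in the text above
TABLE_SUFFIXES = {".csv", ".parquet"}


def summarize_database_folder(root):
    """Group files below ``root`` into header, tables, and media.

    Hypothetical helper for illustration, not part of audformat.
    Assumption: header and tables sit directly in ``root``,
    everything else is treated as a media file.
    """
    root = Path(root)
    summary = {"header": [], "tables": [], "media": []}
    for path in sorted(root.rglob("*")):
        if not path.is_file():
            continue
        relative = path.relative_to(root)
        top_level = len(relative.parts) == 1
        suffix = path.suffix.lower()
        if top_level and suffix == ".yaml":
            summary["header"].append(relative.as_posix())
        elif top_level and suffix in TABLE_SUFFIXES:
            summary["tables"].append(relative.as_posix())
        else:
            summary["media"].append(relative.as_posix())
    return summary
```

Calling `summarize_database_folder("path/to/database")` returns a dictionary with the keys `header`, `tables`, and `media`, each holding a list of paths relative to the database folder.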
Each table column is linked to a scheme and/or to a rater.
Each table row is linked to a media file,
or,
if applicable,
a specific segment in a media file.
If a table contains no links to media files,
it is called a miscellaneous table,
or misc table for short.
The database is implemented as audformat.Database.
| File | Content |
|---|---|
| | Meta information, schemes, list of raters |
| | Table with files or file segments as index and columns holding annotations |
| | Misc table with unspecified index and columns holding annotations |
| | Media files referenced in the tables |
The connection between the header, media files and a table is highlighted in the following sketch:
The connection between the header and a misc table is highlighted in the following sketch:
The annotations stored in the tables
can be accessed as pandas.DataFrame.
The following sketch shows an example instance of a database: