Dataset
- class audbcards.Dataset(name, version, *, cache_root=None, load_tables=True)
Dataset representation.
Dataset object that represents a dataset that can be loaded with audb.load().
- Parameters
name (str) – name of dataset
version (str) – version of dataset
cache_root (Optional[str]) – cache folder. If None, the environment variable AUDBCARDS_CACHE_ROOT, or audbcards.config.CACHE_ROOT is used
load_tables (bool) – if True, values extracted from tables are cached. Set this to False if loading the tables takes too long or the tables do not fit into memory
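The fallback described for cache_root can be illustrated with a minimal sketch. It is not the library's implementation: resolve_cache_root and CONFIG_CACHE_ROOT are hypothetical names, the default path is a placeholder, and checking the environment variable before the config value is an assumption based on the order in which the two are listed above.

```python
import os
from typing import Optional

# Hypothetical stand-in for audbcards.config.CACHE_ROOT;
# the actual default value is not stated in this section.
CONFIG_CACHE_ROOT = "/path/to/default/cache"


def resolve_cache_root(cache_root: Optional[str] = None) -> str:
    """Return the cache folder, applying the documented fallbacks."""
    # An explicitly passed folder always wins
    if cache_root is not None:
        return cache_root
    # Otherwise use the environment variable if it is set and non-empty
    # (assumed to take precedence), else the config value
    return os.environ.get("AUDBCARDS_CACHE_ROOT") or CONFIG_CACHE_ROOT
```

With this sketch, resolve_cache_root("/data/cache") returns the given folder unchanged, while resolve_cache_root() consults the environment first.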
example_media
- Dataset.example_media
Example media file.
The media file is selected by its median duration from all files in the dataset with a duration between 0.5 s and 300 s. In addition, the media file needs to be stored in an archive with fewer than 100 media files. If no media file meets these criteria, None is returned instead.
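The selection rule can be sketched in plain Python. This is an illustration under stated assumptions, not the library's code: select_example_media and its two dictionary arguments are hypothetical, the duration bounds are treated as inclusive, both filters are applied before the median is taken, and for an even number of candidates the upper of the two middle files is chosen.

```python
from typing import Dict, Optional


def select_example_media(
    durations: Dict[str, float],
    archive_sizes: Dict[str, int],
) -> Optional[str]:
    """Pick the file with the median duration among all candidates.

    durations: file name -> duration in seconds
    archive_sizes: file name -> number of media files in its archive
    """
    # Keep files between 0.5 s and 300 s (inclusive bounds assumed)
    # that are stored in an archive with fewer than 100 media files
    candidates = {
        file: duration
        for file, duration in durations.items()
        if 0.5 <= duration <= 300 and archive_sizes[file] < 100
    }
    if not candidates:
        return None
    # Sort candidates by duration and take the middle element
    ordered = sorted(candidates, key=lambda file: candidates[file])
    return ordered[len(ordered) // 2]
```

Returning None for an empty candidate set mirrors the documented behavior when no media file meets the criteria.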
license_link
- Dataset.license_link
Link to license of dataset.
If no link is available, None is returned.
schemes_summary
- Dataset.schemes_summary
Summary of dataset schemes.
It lists all schemes in a string, showing additional information on schemes named 'emotion' and 'speaker', e.g. 'speaker: [age, gender, language]'.
schemes_table
- Dataset.schemes_table
Schemes table with name, type, min, max, labels, mappings.
The table is represented as a dictionary with column names as keys.
tables_columns
- Dataset.tables_columns
Number of columns for each table of the dataset.
- Returns
dictionary with table IDs as keys and number of columns as values
Examples
>>> ds = Dataset("emodb", "1.4.1")
>>> ds.tables_columns["speaker"]
3
tables_preview
- Dataset.tables_preview
Table preview for each table of the dataset.
Shows the header and the first 5 lines of each table as a list of lists. All table values are converted to strings, stripped of HTML tags and newlines, and limited to a maximum length of 100 characters.
- Returns
dictionary with table IDs as keys and table previews as values
Examples
>>> from tabulate import tabulate
>>> ds = Dataset("emodb", "1.4.1")
>>> preview = ds.tables_preview["speaker"]
>>> print(tabulate(preview, headers="firstrow", tablefmt="github"))
|   speaker |   age | gender   | language   |
|-----------|-------|----------|------------|
|         3 |    31 | male     | deu        |
|         8 |    34 | female   | deu        |
|         9 |    21 | female   | deu        |
|        10 |    32 | male     | deu        |
|        11 |    26 | male     | deu        |
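The clean-up applied to preview values (string conversion, removal of HTML tags and newlines, length limit) might look roughly like the following sketch. The helper names clean_cell and build_preview are hypothetical, as are the regular expressions; replacing newlines by a single space and truncating without an ellipsis are assumptions, since the text above does not specify either.

```python
import re
from typing import Any, List


def clean_cell(value: Any, max_length: int = 100) -> str:
    """Convert a table value to a short, single-line string."""
    text = str(value)
    # Remove anything that looks like an HTML tag
    text = re.sub(r"<[^>]+>", "", text)
    # Replace runs of newlines by a single space (assumption)
    text = re.sub(r"[\r\n]+", " ", text)
    # Limit the length by plain truncation (assumption)
    return text[:max_length]


def build_preview(
    header: List[str],
    rows: List[List[Any]],
    n: int = 5,
) -> List[List[str]]:
    """Return the header followed by the first n cleaned rows."""
    return [list(header)] + [
        [clean_cell(value) for value in row] for row in rows[:n]
    ]
```

The result is a list of lists with the header as its first entry, which is the shape that tabulate expects when called with headers="firstrow".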