Table¶

class audformat.Table(index=None, *, split_id=None, media_id=None, description=None, meta=None)[source]¶

Table conforming to the table specifications.

Consists of a list of file names to which it assigns numerical values or labels. To fill a table with labels, add one or more audformat.Column and use audformat.Table.set() to set the values. When adding a column, the column ID must be different from the index level names, which are 'file' for a filewise table and 'file', 'start' and 'end' for a segmented table.

Parameters

index (Optional[Index]) – index conforming to the table specifications. If None, an empty filewise table is created
split_id (Optional[str]) – split identifier (must exist)
media_id (Optional[str]) – media identifier (must exist)
description (Optional[str]) – table description
meta (Optional[dict]) – additional meta fields

Raises

ValueError – if index does not conform to the table specifications

Examples

>>> index = filewise_index(["f1", "f2", "f3"])
>>> table = Table(
...     index,
...     split_id=define.SplitType.TEST,
... )
>>> table["values"] = Column()
>>> table
type: filewise
split_id: test
columns:
  values: {}
>>> table.get()
     values
file
f1      NaN
f2      NaN
f3      NaN
>>> table.set({"values": [0, 1, 2]})
>>> table.get()
     values
file
f1        0
f2        1
f3        2
>>> table.get(index[:2])
     values
file
f1        0
f2        1
>>> table.get(as_segmented=True)
                values
file start  end
f1   0 days NaT      0
f2   0 days NaT      1
f3   0 days NaT      2
>>> index_new = filewise_index("f4")
>>> table_ex = table.extend_index(
...     index_new,
...     inplace=False,
... )
>>> table_ex.get()
     values
file
f1        0
f2        1
f3        2
f4      NaN
>>> table_ex.set(
...     {"values": 3},
...     index=index_new,
... )
>>> table_ex.get()
     values
file
f1        0
f2        1
f3        2
f4        3
>>> table_str = Table(index)
>>> table_str["strings"] = Column()
>>> table_str.set({"strings": ["a", "b", "c"]})
>>> (table + table_str).get()
     values strings
file
f1        0       a
f2        1       b
f3        2       c
>>> (table_ex + table_str).get()
     values strings
file
f1        0       a
f2        1       b
f3        2       c
f4        3     NaN

__add__()¶

Table.__add__(other)¶

Create new table by combining two tables.

The new combined table contains index and columns of both tables. Missing values will be set to NaN.

If both tables conform to the table specifications and at least one of them is segmented, the output has a segmented index.

Columns with the same identifier are combined to a single column. This requires that:

both columns have the same dtype
in places where the indices overlap the values of both columns match or one column contains NaN

Media and split information, as well as references to schemes and raters, are discarded. If you intend to keep them, use update().

Parameters

other – the other table

Raises

ValueError – if columns with the same name have different dtypes
ValueError – if values in the same position do not match
ValueError – if levels and dtypes of the indices do not match

__eq__()¶

Table.__eq__(other)¶

Compare if table equals other table.

Return type: bool

__getitem__()¶

Table.__getitem__(column_id)¶

Return a view of a column.

Parameters: column_id (str) – column identifier
Return type: Column

__len__()¶

Table.__len__()¶

Number of rows in table.

Return type: int

__setitem__()¶

Table.__setitem__(column_id, column)¶

Add new column to table.

Parameters

column_id (str) – column identifier
column (Column) – column

Raises

BadIdError – if a column with a scheme_id or rater_id is added that does not exist
ValueError – if the column ID equals one of the index level names
ValueError – if the column is linked to a scheme that is using labels from a misc table, but the misc table the column is assigned to is already used by the same or another scheme

Return type

Column

columns¶

Table.columns¶: Table columns

copy()¶

Table.copy()¶

Copy table.

Returns: new table object

db¶

Table.db¶

Database object.

Returns: database object or None if not assigned yet

description¶

Table.description¶: Description

df¶

Table.df¶

Table data.

Returns: data

drop_columns()¶

Table.drop_columns(column_ids, *, inplace=False)¶

Drop columns by ID.

Parameters

column_ids – column IDs
inplace – drop columns in place

Returns

new object if inplace=False, otherwise self

drop_files()¶

Table.drop_files(files, *, inplace=False)[source]¶

Drop files.

Remove rows with a reference to listed or matching files.

Parameters

files (Union[str, Sequence[str], Callable[[str], bool]]) – list of files or condition function
inplace (bool) – drop files in place

Return type

Table

Returns

new object if inplace=False, otherwise self

drop_index()¶

Table.drop_index(index, *, inplace=False)¶

Drop rows from index.

Parameters

index – index object
inplace – drop index in place

Returns

new object if inplace=False, otherwise self

Raises

ValueError – if levels and dtypes of index do not match the table index

dump()¶

Table.dump(stream=None, indent=2)¶

Serialize object to YAML.

Parameters

stream – file-like object. If None serializes to string
indent (int) – indent

Return type

str

Returns

YAML string

ends¶

Table.ends¶

Segment end times.

Returns: timestamps

extend_index()¶

Table.extend_index(index, *, fill_values=None, inplace=False)¶

Extend table with new rows.

Parameters

index – index object
fill_values – replace NaN with these values (either a scalar applied to all columns or a dictionary with column name as key)
inplace – extend index in place

Returns

new object if inplace=False, otherwise self

Raises

ValueError – if levels and dtypes of index do not match the table index

files¶

Table.files¶

Files referenced in the table.

Returns: files

from_dict()¶

Table.from_dict(d, ignore_keys=None)¶

Deserialize object from dictionary.

Parameters

d (dict) – dictionary of class variables to assign
ignore_keys (Optional[Sequence[str]]) – variables listed here will be ignored

get()¶

Table.get(index=None, *, map=None, copy=True, as_segmented=False, allow_nat=True, root=None, num_workers=1, verbose=False)[source]¶

Get labels.

By default, all labels of the table are returned; use index to get a subset.

Examples are provided with the table specifications.

Parameters

index (Optional[Index]) – index conforming to the table specifications
copy (bool) – return a copy of the labels
map (Optional[Dict[str, Union[str, Sequence[str]]]]) – map scheme or scheme fields to column values. For example, if your table holds a column speaker with speaker IDs, which is assigned to a scheme that contains a dict mapping speaker IDs to age and gender entries, map={'speaker': ['age', 'gender']} will replace the column with two new columns that map ID values to age and gender, respectively. To also keep the original column with speaker IDs, you can do map={'speaker': ['speaker', 'age', 'gender']}
as_segmented (bool) – if set to True and table has a filewise index, the index of the returned table will be converted to a segmented index. start will be set to 0 and end to NaT or to the file duration if allow_nat is set to False
allow_nat (bool) – if set to False, end=NaT is replaced with file duration
root (Optional[str]) – root directory under which the files are stored. Provide if file names are relative and database was not saved or loaded from disk. If None audformat.Database.root is used. Only relevant if allow_nat is set to False
num_workers (Optional[int]) – number of parallel jobs. If None will be set to the number of processors on the machine multiplied by 5
verbose (bool) – show progress bar

Return type

DataFrame

Returns

labels

Raises

FileNotFoundError – if file is not found
RuntimeError – if table is not assigned to a database
ValueError – if trying to map without a scheme
ValueError – if trying to map from a scheme that has no labels
ValueError – if trying to map to a non-existing field

index¶

Table.index¶

Table index.

Returns: index

is_filewise¶

Table.is_filewise¶

Check if filewise table.

Returns: True if filewise table.

is_segmented¶

Table.is_segmented¶

Check if segmented table.

Returns: True if segmented table.

load()¶

Table.load(path)¶

Load table data from disk.

Tables can be stored on disk as PKL and/or CSV files. If both files are present, the PKL file is loaded as long as its modification date is newer; otherwise an error is raised that asks you to delete one of the files.

Parameters

path (str) – file path without extension

Raises

RuntimeError – if table file(s) are missing
RuntimeError – if CSV file is newer than PKL file

map_files()¶

Table.map_files(func)[source]¶

Apply function to file names in table.

If speed is crucial, see audformat.utils.map_file_path() for further hints on how to optimize your code.

Parameters: func (Callable[[str], str]) – map function

media¶

Table.media¶

Media object.

Returns: media object or None if not available

media_id¶

Table.media_id¶: Media ID

meta¶

Table.meta¶: Dictionary with meta fields

pick_columns()¶

Table.pick_columns(column_ids, *, inplace=False)¶

Pick columns by ID.

All other columns will be dropped.

Parameters

column_ids – column IDs
inplace – pick columns in place

Returns

new object if inplace=False, otherwise self

pick_files()¶

Table.pick_files(files, *, inplace=False)[source]¶

Pick files.

Keep only rows with a reference to listed files or matching files.

Parameters

files (Union[str, Sequence[str], Callable[[str], bool]]) – list of files or condition function
inplace (bool) – pick files in place

Return type

Table

Returns

new object if inplace=False, otherwise self

pick_index()¶

Table.pick_index(index, *, inplace=False)¶

Pick rows from index.

Parameters

index – index object
inplace – pick index in place

Returns

new object if inplace=False, otherwise self

Raises

ValueError – if levels and dtypes of index do not match the table index

save()¶

Table.save(path, *, storage_format='csv', update_other_formats=True)¶

Save table data to disk.

Existing files will be overwritten.

Parameters

path (str) – file path without extension
storage_format (str) – storage format of table. See audformat.define.TableStorageFormat for available formats
update_other_formats (bool) – if True it will not only save to the given storage_format, but update all files stored in other storage formats as well

set()¶

Table.set(values, *, index=None)¶

Set labels.

By default, all labels of the table are replaced; use index to select a subset. If a column is assigned to a Scheme, values will be automatically converted to match its dtype.

Examples are provided with the table specifications.

Parameters

values (Union[Dict[str, Union[int, float, str, Timedelta, Sequence[Union[int, float, str, Timedelta]], ndarray, Series]], DataFrame]) – dictionary of values with column_id as key
index (Optional[Index]) – index

Raises

ValueError – if values cannot be converted to match the scheme's dtype

split¶

Table.split¶

Split object.

Returns: split object or None if not available

split_id¶

Table.split_id¶: Split ID

starts¶

Table.starts¶

Segment start times.

Returns: timestamps

to_dict()¶

Table.to_dict()¶

Serialize object to dictionary.

Return type: dict
Returns: dictionary with attributes

type¶

Table.type¶

Table type.

See audformat.define.IndexType for possible values.

update()¶

Table.update(others, *, overwrite=False)¶

Update table with other table(s).

The table that calls update() must be assigned to a database. Media and split must match for all tables.

Columns that are not yet part of the table will be added and referenced schemes or raters are copied. For overlapping columns, schemes and raters must match.

Columns with the same identifier are combined to a single column. This requires that both columns have the same dtype and if overwrite is set to False, values in places where the indices overlap have to match or one column contains NaN. If overwrite is set to True, the value of the last table in the list is kept.

The index type of the table must not change.

Parameters

others – table object(s)
overwrite – overwrite values where indices overlap

Returns

the updated table

Raises

RuntimeError – if table is not assigned to a database
ValueError – if split or media does not match
ValueError – if overlapping columns reference different schemes or raters
ValueError – if a missing scheme or rater cannot be copied because a different object with the same ID exists
ValueError – if values in the same position do not match and overwrite is False
ValueError – if level and dtypes of table indices do not match
