MiscTable
- class audformat.MiscTable(index, *, split_id=None, media_id=None, description=None, meta=None)
Miscellaneous table.
Note
Intended for tables whose index does not conform to the table specifications. Otherwise, use audformat.Table.
To fill a table with labels, add one or more audformat.Column and use audformat.MiscTable.set() to set the values. When adding a column, the column ID must differ from the index level names. When initialized with a single-level pandas.MultiIndex, the index will be converted to a pandas.Index.
- Parameters:
- Raises:
ValueError – if level names of index are empty or not unique
Examples
>>> index = pd.MultiIndex.from_tuples(
...     [
...         ("f1", "f2"),
...         ("f1", "f3"),
...         ("f2", "f3"),
...     ],
...     names=["file", "other"],
... )
>>> index = utils.set_index_dtypes(index, "string")
>>> table = MiscTable(
...     index,
...     split_id=define.SplitType.TEST,
... )
>>> table["match"] = Column()
>>> table
levels: {file: str, other: str}
split_id: test
columns:
  match: {}
>>> table.get()
            match
file other
f1   f2       NaN
     f3       NaN
f2   f3       NaN
>>> table.set({"match": [True, False, True]})
>>> table.get()
            match
file other
f1   f2      True
     f3     False
f2   f3      True
>>> table.get(index[:2])
            match
file other
f1   f2      True
     f3     False
>>> index_new = pd.MultiIndex.from_tuples(
...     [
...         ("f4", "f1"),
...     ],
...     names=["file", "other"],
... )
>>> index_new = utils.set_index_dtypes(index_new, "string")
>>> table_ex = table.extend_index(
...     index_new,
...     inplace=False,
... )
>>> table_ex.get()
            match
file other
f1   f2      True
     f3     False
f2   f3      True
f4   f1       NaN
>>> table_ex.set(
...     {"match": True},
...     index=index_new,
... )
>>> table_ex.get()
            match
file other
f1   f2      True
     f3     False
f2   f3      True
f4   f1      True
>>> table_str = MiscTable(index)
>>> table_str["strings"] = Column()
>>> table_str.set({"strings": ["a", "b", "c"]})
>>> (table + table_str).get()
            match  strings
file other
f1   f2      True        a
     f3     False        b
f2   f3      True        c
>>> (table_ex + table_str).get()
            match  strings
file other
f1   f2      True        a
     f3     False        b
f2   f3      True        c
f4   f1      True      NaN
__add__()
- MiscTable.__add__(other)
Create new table by combining two tables.
The new combined table contains the index and columns of both tables. Missing values will be set to NaN.
If the table conforms to the table specifications and at least one table is segmented, the output has a segmented index.
Columns with the same identifier are combined to a single column. This requires that:
both columns have the same dtype
in places where the indices overlap, the values of both columns match or one column contains NaN
Media and split information, as well as references to schemes and raters, are discarded. If you intend to keep them, use update().
- Parameters:
other (Self) – the other table
- Raises:
ValueError – if columns with the same name have different dtypes
ValueError – if values in the same position do not match
ValueError – if level and dtypes of indices do not match
- Return type:
Self
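The rule for combining columns with the same identifier can be illustrated with a minimal sketch in plain Python. This is not how audformat implements __add__(); the helper names is_missing and combine_columns are hypothetical, columns are modeled as dicts that map index keys to values, and the dtype check is left out.

```python
import math


def is_missing(value):
    """Return True for None or a float NaN."""
    return value is None or (isinstance(value, float) and math.isnan(value))


def combine_columns(left, right):
    """Combine two columns given as {index_key: value} dicts.

    Keys present in only one column are kept as they are.
    Where keys overlap, the values must match or one must be missing.
    """
    combined = dict(left)
    for key, value in right.items():
        if key not in combined or is_missing(combined[key]):
            # New row, or the left side has no value yet: take the right value
            combined[key] = value
        elif not is_missing(value) and combined[key] != value:
            # Both sides hold a value and the values disagree
            raise ValueError(f"values at {key!r} do not match")
    return combined


left = {("f1", "f2"): True, ("f1", "f3"): float("nan")}
right = {("f1", "f3"): False, ("f2", "f3"): True}
merged = combine_columns(left, right)
```

Here the NaN at ("f1", "f3") is filled from the other column, while two conflicting non-missing values at the same key would raise ValueError.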
__eq__()
__getitem__()
__len__()
__setitem__()
- MiscTable.__setitem__(column_id, column)
Add new column to table.
- Parameters:
- Raises:
BadIdError – if a column with a scheme_id or rater_id is added that does not exist
ValueError – if column ID is not different from level names
ValueError – if the column is linked to a scheme that is using labels from a misc table, but the misc table the column is assigned to is already used by the same or another scheme
- Return type:
columns
- MiscTable.columns
Table columns
copy()
- MiscTable.copy()
Copy table.
- Return type:
Self
- Returns:
new table object
db
- MiscTable.db
Database object.
- Returns:
database object or None if not assigned yet
description
- MiscTable.description
Description
df
- MiscTable.df
Table data.
- Returns:
data
drop_columns()
drop_index()
- MiscTable.drop_index(index, *, inplace=False)
Drop rows from index.
- Parameters:
- Return type:
Self
- Returns:
new object if inplace=False, otherwise self
- Raises:
ValueError – if levels and dtypes of index do not match the table index
dump()
extend_index()
- MiscTable.extend_index(index, *, fill_values=None, inplace=False)
Extend table with new rows.
- Parameters:
- Return type:
Self
- Returns:
new object if inplace=False, otherwise self
- Raises:
ValueError – if levels and dtypes of index do not match the table index
from_dict()
get()
- MiscTable.get(index=None, *, map=None, copy=True)
Get labels.
By default, all labels of the table are returned; use index to get a subset.
Examples are provided with the table specifications, and for map in Map scheme labels.
- Parameters:
index (Index) – index
copy (bool) – return a copy of the labels
map (dict[str, str | Sequence[str]]) – map scheme or scheme fields to column values. For example, if your table holds a column speaker with speaker IDs, which is assigned to a scheme that contains a dict mapping speaker IDs to age and gender entries, map={'speaker': ['age', 'gender']} will replace the column with two new columns that map ID values to age and gender, respectively. To also keep the original column with speaker IDs, you can do map={'speaker': ['speaker', 'age', 'gender']}
- Return type:
- Returns:
labels
- Raises:
FileNotFoundError – if file is not found
RuntimeError – if table is not assigned to a database
ValueError – if trying to map without a scheme
ValueError – if trying to map from a scheme that has no labels
ValueError – if trying to map to a non-existing field
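The effect of the map argument described above can be sketched in plain Python. This is only an illustration under stated assumptions, not audformat's implementation: map_labels is a hypothetical helper, rows are modeled as dicts, the scheme labels are a dict that maps speaker IDs to their fields, and the example values are made up.

```python
def map_labels(rows, column, fields, labels):
    """Replace `column` in each row with the requested `fields`.

    `labels` plays the role of the scheme labels: {id: {field: value}}.
    Listing `column` itself in `fields` keeps the original ID column.
    """
    mapped = []
    for row in rows:
        key = row[column]
        # Start from all other columns, then add the mapped fields
        new_row = {k: v for k, v in row.items() if k != column}
        for field in fields:
            if field == column:
                new_row[field] = key
            elif field not in labels[key]:
                raise ValueError(f"field {field!r} does not exist")
            else:
                new_row[field] = labels[key][field]
        mapped.append(new_row)
    return mapped


# Hypothetical scheme labels and table rows
labels = {
    "spk1": {"age": 33, "gender": "female"},
    "spk2": {"age": 30, "gender": "male"},
}
rows = [{"file": "f1", "speaker": "spk1"}, {"file": "f2", "speaker": "spk2"}]

replaced = map_labels(rows, "speaker", ["age", "gender"], labels)
kept = map_labels(rows, "speaker", ["speaker", "age", "gender"], labels)
```

In this sketch, replaced no longer contains the speaker column, whereas kept holds speaker, age and gender side by side.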
index
- MiscTable.index
Table index.
- Returns:
index
levels
- MiscTable.levels
Index levels.
load()
- MiscTable.load(path)
Load table data from disk.
Tables are stored on disk as CSV, PARQUET and/or PKL files. If the PKL file exists, it is loaded as long as its modification date is the newest; otherwise an error is raised that asks to delete one of the files.
- Parameters:
path (str) – file path without extension
- Raises:
RuntimeError – if table file(s) are missing
RuntimeError – if CSV or PARQUET file is newer than PKL file
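The PKL freshness rule can be sketched with the standard library. This is a rough illustration, not the code used by load(): the helper pkl_is_usable is hypothetical, and the lower-case extensions .csv, .parquet and .pkl are an assumption.

```python
import os


def pkl_is_usable(path):
    """Check whether the PKL file for `path` (given without extension) can be used.

    Returns True if the PKL file exists and no CSV or PARQUET file is newer,
    False if there is no PKL file, and raises RuntimeError otherwise.
    """
    pkl_file = path + ".pkl"
    if not os.path.exists(pkl_file):
        return False
    pkl_mtime = os.path.getmtime(pkl_file)
    for ext in (".csv", ".parquet"):
        other = path + ext
        # A newer CSV or PARQUET file means the PKL file may be outdated
        if os.path.exists(other) and os.path.getmtime(other) > pkl_mtime:
            raise RuntimeError(
                f"{other} is newer than {pkl_file}; delete one of the files"
            )
    return True
```

Comparing modification times keeps the check cheap, since no file content has to be read to decide whether the PKL file might be stale.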
media
- MiscTable.media
Media object.
- Returns:
media object or None if not available
media_id
- MiscTable.media_id
Media ID
meta
- MiscTable.meta
Dictionary with meta fields
pick_columns()
pick_index()
- MiscTable.pick_index(index, *, inplace=False)
Pick rows from index.
- Parameters:
- Return type:
Self
- Returns:
new object if inplace=False, otherwise self
- Raises:
ValueError – if levels and dtypes of index do not match the table index
save()
- MiscTable.save(path, *, storage_format='parquet', update_other_formats=True)
Save table data to disk.
Existing files will be overwritten.
When using "parquet" as storage_format, a hash based on the content of the table is stored under the key b"hash" in the metadata of the schema of the parquet file. This provides a deterministic hash for the file, as md5 sums of parquet files containing identical information often differ. Reasons include the library that wrote the parquet file, the chosen compression codec, and metadata written by the library.
The hash can be accessed with pyarrow by:
pyarrow.parquet.read_schema(f"{path}.parquet").metadata[b"hash"].decode()
The hash is used by audb when publishing a database to track changes of database files.
- Parameters:
path (str) – file path without extension
storage_format (str) – storage format of table. See audformat.define.TableStorageFormat for available formats
update_other_formats (bool) – if True, it will not only save to the given storage_format, but also update all files stored in other storage formats
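Why a content-based hash is more stable than the md5 sum of the written file can be shown with a small analogy that uses only the standard library. It does not involve parquet and does not reproduce the hash that audformat computes; zlib stands in for the compression codec, and content_hash and file_md5 are hypothetical helpers.

```python
import hashlib
import zlib

# Identical table content, written twice with different compression settings
rows = [("f1", "f2", True), ("f1", "f3", False), ("f2", "f3", True)]


def content_hash(rows):
    """Hash the table content itself, independent of any file encoding."""
    return hashlib.md5(repr(rows).encode()).hexdigest()


def file_md5(rows, level):
    """md5 sum of the bytes a writer would store at the given compression level."""
    payload = repr(rows).encode()
    return hashlib.md5(zlib.compress(payload, level)).hexdigest()


fast = file_md5(rows, 1)   # writer that favours speed
small = file_md5(rows, 9)  # writer that favours size
stable = content_hash(rows)
```

The two file-level md5 sums differ although both files hold the same rows, while the content hash depends only on the rows themselves.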
set()
- MiscTable.set(values, *, index=None)
Set labels.
By default, all labels of the table are replaced; use index to select a subset. If a column is assigned to a Scheme, values will be automatically converted to match its dtype.
Examples are provided with the table specifications.
split
- MiscTable.split
Split object.
- Returns:
split object or None if not available
split_id
- MiscTable.split_id
Split ID
to_dict()
update()
- MiscTable.update(others, *, overwrite=False)
Update table with other table(s).
The table that calls update() to combine tables must be assigned to a database. For all tables, media and split must match.
Columns that are not yet part of the table will be added, and referenced schemes or raters are copied. For overlapping columns, schemes and raters must match.
Columns with the same identifier are combined to a single column. This requires that both columns have the same dtype and, if overwrite is set to False, that values in places where the indices overlap match or one column contains NaN. If overwrite is set to True, the value of the last table in the list is kept.
The index type of the table must not change.
- Parameters:
- Return type:
Self
- Returns:
the updated table
- Raises:
RuntimeError – if table is not assigned to a database
ValueError – if split or media does not match
ValueError – if overlapping columns reference different schemes or raters
ValueError – if a missing scheme or rater cannot be copied because a different object with the same ID exists
ValueError – if values in same position overlap
ValueError – if level and dtypes of table indices do not match
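The role of overwrite when values overlap can be sketched in plain Python. This is a simplified model and not audformat's implementation: update_column is a hypothetical helper, columns are dicts that map index keys to values, and the sketch assumes that a missing value never replaces an existing one.

```python
import math


def _missing(value):
    """Return True for None or a float NaN."""
    return value is None or (isinstance(value, float) and math.isnan(value))


def update_column(column, others, overwrite=False):
    """Update `column` ({index_key: value}) with the columns in `others`."""
    updated = dict(column)
    for other in others:
        for key, value in other.items():
            if key not in updated or _missing(updated[key]):
                # New row or empty cell: take the incoming value
                updated[key] = value
            elif _missing(value) or updated[key] == value:
                # Nothing to change (assumption: missing values never overwrite)
                continue
            elif overwrite:
                # Later tables in the list win
                updated[key] = value
            else:
                raise ValueError(f"values at {key!r} do not match")
    return updated


base = {"a": 1, "b": None}
others = [{"a": 2, "b": 5}, {"a": 3, "c": 4}]
result = update_column(base, others, overwrite=True)
```

With overwrite=True the value from the last column in the list is kept for "a"; with the default overwrite=False the same call would raise ValueError because 1 and 2 disagree.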