Changelog
All notable changes to this project will be documented in this file.
The format is based on Keep a Changelog, and this project adheres to Semantic Versioning.
Version 1.3.1 (2024-09-16)
Changed: replace unmaintained iso-639 dependency with iso639-lang
Fixed: ensure poetry can manage audformat
Version 1.3.0 (2024-07-18)
Added: strict argument to audformat.utils.hash(). If set to True, the order of the data and its level/column names are taken into account when calculating the hash
Changed: store tables by default as parquet files, by changing the default value of storage_format to "parquet" in audformat.Table.save() and audformat.Database.save()
Fixed: load csv tables with pandas.read_csv(), if pyarrow.csv.read_csv() fails
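The difference between an order-independent and an order-sensitive (strict) hash can be illustrated with a small standard-library sketch. This is not how audformat.utils.hash() is implemented; toy_hash and its arguments are hypothetical names used only to show the idea.

```python
import hashlib


def toy_hash(values, names=(), strict=False):
    """Hypothetical helper, not audformat's implementation."""
    digest = hashlib.sha256()
    if strict:
        # Names and the original order both feed into the digest
        for name in names:
            digest.update(repr(name).encode("utf-8"))
        items = list(values)
    else:
        # Sorting makes the digest independent of the order of the data
        items = sorted(values)
    for item in items:
        digest.update(repr(item).encode("utf-8"))
    return digest.hexdigest()


a = [1, 2, 3]
b = [3, 2, 1]
same_when_relaxed = toy_hash(a) == toy_hash(b)
same_when_strict = toy_hash(a, strict=True) == toy_hash(b, strict=True)
```

In this sketch the two lists hash identically in the relaxed mode and differently in the strict mode, which mirrors the behaviour described for the strict argument.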
Version 1.2.0 (2024-06-25)
Added: expand format specifications to allow parquet files as table files
Added: support for storing tables as parquet files by adding "parquet" (audformat.define.TableStorageFormat.PARQUET) as an option for the storage_format argument of audformat.Table.save() and audformat.Database.save()
Added: support for numpy>=2.0
Added: mention text files as potential media files in the documentation
Added: mention in the documentation of audformat.utils.hash() that column/level names do not influence its hash value
Added: warn in the documentation of audformat.utils.hash() that the hash of a dataframe or series, containing "Int64" as data type, changes with pandas>=2.2.0
Fixed: ensure "boolean" data type is always used in indices of misc tables that store boolean values
Version 1.1.4 (2024-05-15)
Fixed: audformat.Database.get(), if its argument additional_schemes contains a non-existent scheme
Version 1.1.3 (2024-04-26)
Added: as_dataframe argument to audformat.utils.read_csv()
Fixed: audformat.utils.read_csv() now treats float/integer values in start, end columns as seconds
Version 1.1.2 (2024-02-02)
Fixed: audformat.Database.load() when loading databases with a misc table that has an assigned split
Version 1.1.1 (2024-01-25)
Changed: depend on audeer>=2.0.0
Fixed: pandas deprecation warnings
Version 1.1.0 (2023-11-30)
Added: audformat.Database.get() method to retrieve labels based on their schemes and independent of the tables in which they are stored
Added: aggregate_function and aggregate_strategy arguments to audformat.utils.concat() to support overlapping values in the objects that should be concatenated
Changed: audformat.Column.get(map=...) now returns dtype of labels
Changed: audformat.Column.get(map=...) no longer raises an error if some of the mapped values are not available when stored in a dictionary as scheme labels
Fixed: avoid deprecation warning by replacing pkg_resources internally with importlib.metadata
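A common reason to move from pkg_resources to importlib.metadata is looking up the version of an installed package. The changelog does not say which lookup was replaced, so the following is only a sketch of that typical pattern; installed_version is a hypothetical helper name.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package: str) -> str:
    """Return the installed version of ``package`` or "unknown"."""
    try:
        # Standard-library replacement for
        # pkg_resources.get_distribution(package).version
        return version(package)
    except PackageNotFoundError:
        # Raised when no distribution with that name is installed
        return "unknown"


missing = installed_version("surely-not-an-installed-package-xyz")
```

importlib.metadata is part of the standard library since Python 3.8, so this approach needs no setuptools import at runtime.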
Version 1.0.3 (2023-10-11)
Fixed: audformat.utils.hash() for pandas>=2.1.0
Fixed: remove upper limit of pandas dependency
Version 1.0.2 (2023-10-09)
Fixed: require pandas<2.1.0 as pandas>=2.1.0 introduced a bug in calculating the hash of an index
Removed: deprecated root argument from audformat.testing.create_audio_files()
Version 1.0.1 (2023-05-08)
Fixed: ensure audformat.utils.to_segmented_index() and audformat.Table.get() with as_segmented=True use the same precision for end values as audformat.segmented_index()
Version 1.0.0 (2023-04-27)
Added: audformat.Scheme.labels_as_list property to list all scheme labels
Added: example to the documentation of audformat.utils.to_filewise_index()
Changed: convert dates to UTC timezone in audformat.Column.set() when using a scheme of type "date"
Fixed: support pandas>=2.0.0
Fixed: mention author, license, license_url, organization in the specification documentation of the database header
Fixed: missing Raises section in the documentation of audformat.Database.load() and audformat.Database.attachments
Fixed: when the root argument of audformat.utils.expand_file_path() is a relative path it is no longer expanded to an absolute path
Version 0.16.1 (2023-03-29)
Added: copy_attachments argument to audformat.Database.update()
Changed: preserve dtypes when audformat.Table.get() is called with an index
Changed: speed up audformat.utils.union()
Changed: allow saving a database with missing attachments
Version 0.16.0 (2023-01-12)
Added: audformat.Attachment to store any kind of files/folders as part of the database
Added: support for Python 3.10
Added: support for Python 3.11
Changed: require audeer>=1.19.0
Changed: split API documentation into sub-pages for each function
Fixed: support "meta" as key in meta dictionaries like the one passed as meta argument to audformat.Database
Version 0.15.4 (2022-11-01)
Fixed: avoid FutureWarning when setting values in place for a series in audformat.Column.set()
Fixed: improve sketches in the specifications section of the documentation
Version 0.15.3 (2022-09-19)
Changed: audformat.Column.set() now lists values not matching the scheme of the column in the corresponding error message
Fixed: audformat.Column.set() checking of values for a scheme with minimum and/or maximum when input values are given as np.array and contain NaN or None
Fixed: audformat.Column.set() checking of values for a scheme with minimum and/or maximum when minimum or maximum is 0
Version 0.15.2 (2022-08-17)
Added: audformat.Table.map_files()
Fixed: audformat.Database.load() for databases that contain a scheme with labels stored in a misc table that is using schemes for its columns. Previously, it could fail if the schemes were not loaded in the correct order
Fixed: audformat.Table.drop_index() and audformat.MiscTable.drop_index() when the provided index to drop contains entries not present in the index of the table. Previously, it extended the table by those entries in addition to dropping the overlapping entries
Version 0.15.1 (2022-08-11)
Added: audformat.Scheme.uses_table to indicate if the scheme uses a misc table to store its labels
Added: usage example to docstring of audformat.utils.to_segmented_index()
Changed: forbid nesting of misc tables as scheme labels
Fixed: support for pd.Index and pd.Series in audformat.utils.to_filewise_index()
Fixed: description of audformat.Schemes.labels in API documentation
Version 0.15.0 (2022-08-05)
Added: audformat.MiscTable which can store data not associated with media files
Added: store scheme labels in a misc table
Added: dictionary audformat.Database.misc_tables holding misc tables of a database
Added: audformat.utils.difference() for finding index entries that are only part of a single index for a given sequence of indices
Added: audformat.utils.is_index_alike() for checking if a sequence of indices has the same number of levels, level names, and matching dtypes
Added: audformat.define.DataType.OBJECT
Added: audformat.utils.set_index_dtypes() to change dtypes of an index
Added: audformat.testing.add_misc_table()
Added: audformat.Database.__iter__ iterates through all (misc) tables, e.g. a user can do list(db) to get a list of all (misc) tables
Changed: audformat.Database.update() can now join schemes with different labels
Changed: audformat.utils.union(), audformat.utils.intersect(), and audformat.utils.concat() now support any kind of index
Changed: audformat.utils.intersect() no longer removes segments from a segmented index that are contained in a filewise index
Changed: require pandas>=1.4.1
Changed: use pandas dtype "string" instead of "object" for storing audformat dtype "str" entries
Changed: use a misc table to store the "speaker" scheme labels in the emodb example in the documentation
Changed: audformat.utils.join_labels() raises ValueError if labels are of different dtype
Fixed: ensure column IDs are different from index level names
Fixed: make sure audformat.Column.set() converts data to dtype of scheme before checking if values are in min-max-range of scheme
Fixed: links to pandas API in the documentation
Fixed: include methods to_dict(), from_dict(), dump(), and attributes description, meta in the documentation for the classes audformat.Column, audformat.Database, audformat.Media, audformat.Rater, audformat.Scheme, audformat.Split, audformat.Table
Fixed: type hint of argument dtype in the documentation of audformat.Scheme
Removed: support for Python 3.7
Version 0.14.3 (2022-06-01)
Added: audformat.utils.map_country()
Changed: improve speed of audformat.Table.drop_files() for segmented tables
Version 0.14.2 (2022-04-29)
Added: audformat.utils.index_has_overlap()
Added: audformat.utils.iter_index_by_file()
Changed: store categories with integers as int64 instead of Int64
Changed: require audeer>=1.18.0
Changed: support pandas>=1.4.0
Version 0.14.1 (2022-03-03)
Added: audformat.utils.map_file_path()
Version 0.14.0 (2022-02-24)
Changed: ensure audformat.testing.create_database() uses Unix path separators
Changed: don’t allow \ path entries in a portable database
Changed: mark deprecated root argument of audformat.testing.create_audio_files() to be removed in version 1.0.0
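The two path-related changes above can be pictured with a small standard-library sketch. The helpers to_unix_separators and has_backslash are hypothetical names for illustration; they are not part of audformat, and its actual portability check may cover more cases.

```python
def to_unix_separators(path: str) -> str:
    """Replace Windows path separators with Unix ones."""
    return path.replace("\\", "/")


def has_backslash(path: str) -> bool:
    """Return True if the path contains a backslash."""
    return "\\" in path


windows_style = "audio\\001.wav"
converted = to_unix_separators(windows_style)
```

Storing only forward slashes keeps file entries identical across operating systems, which is presumably why backslashes are rejected in a portable database.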
Version 0.13.3 (2022-02-07)
Fixed: conversion of pickle protocol 5 files to pickle protocol 4 in cache
Version 0.13.2 (2022-01-27)
Fixed: reintroduce sorting the output of audformat.Database.files and audformat.Database.segments
Version 0.13.1 (2022-01-26)
Fixed: changelog for 0.13.0
Version 0.13.0 (2022-01-26)
Changed: audformat.utils.union() no longer sorts levels
Changed: audformat.Table.save() forces pickle protocol 4
Changed: clean up test requirements
Changed: require pandas < 1.4.0
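Pinning the pickle protocol matters because protocol 5 was introduced with Python 3.8, so files written with it cannot be read by older interpreters. The following standard-library sketch shows how a protocol can be forced and verified; the example data is made up, and how audformat passes the protocol on to pandas is not shown here.

```python
import pickle

# Made-up stand-in for table content
data = {"file": ["a.wav", "b.wav"], "label": [0, 1]}

# Force protocol 4 instead of relying on the interpreter default
payload = pickle.dumps(data, protocol=4)

# Byte 0 is the PROTO opcode (0x80), byte 1 holds the protocol number
protocol_used = payload[1]

restored = pickle.loads(payload)
```

Reading back works with any Python version that supports protocol 4, which is Python 3.4 and later.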
Version 0.12.4 (2022-01-12)
Changed: the API documentation on the language argument of audformat.Database is more verbose now
Changed: the difference between audformat.define.DataType.TIME and audformat.define.DataType.DATE is now discussed in the API documentation
Fixed: saving a not loaded table to CSV when a PKL file is present
Fixed: pandas deprecation warnings
Version 0.12.3 (2022-01-03)
Removed: Python 3.6 support
Version 0.12.2 (2021-11-18)
Added: audformat.assert_no_duplicates()
Changed: audformat.assert_index() no longer checks for duplicates
Version 0.12.1 (2021-11-17)
Added: audformat.utils.hash()
Added: audformat.utils.expand_file_path()
Added: audformat.utils.replace_file_extension()
Changed: use yaml.CLoader for faster header reading
Version 0.12.0 (2021-11-10)
Added: as_segmented, allow_nat, root, num_workers arguments to audformat.Table.get()
Added: as_segmented, allow_nat, root, num_workers arguments to audformat.Column.get()
Added: files_duration argument to audformat.utils.to_segmented_index()
Added: audformat.Database.files_duration()
Changed: set default value of load_data argument in audformat.Database.load() to False
Changed: speed up audformat.Database.files and audformat.Database.segments
Fixed: re-add support for pandas>=1.3
Version 0.11.6 (2021-08-20)
Added: support for Python 3.9
Fixed: speed up audformat.utils.union()
Fixed: audformat.Column.set() with pd.Series and np.array for a scheme with fixed labels and containing NaN values
Version 0.11.5 (2021-08-09)
Removed: duration scheme and column from conventions and emodb example
Version 0.11.4 (2021-08-05)
Added: custom BadKeyError when key is not found
Changed: limit to pandas <1.3 until it works again for newer pandas versions
Changed: remove the <1.0.0 limit for audiofile as a stable release is available and the API has not changed
Version 0.11.3 (2021-06-10)
Added: audformat.utils.duration
Fixed: description of audformat.Database.is_portable in documentation
Version 0.11.2 (2021-05-12)
Added: audformat.utils.join_schemes
Version 0.11.1 (2021-05-11)
Added: Database.is_portable
Added: copy_media argument to Database.update()
Changed: remove root argument from testing.create_audio_files() and instead use Database.root
Fixed: utils.concat() converts to nullable dtype
Fixed: utils.concat() returns DataFrame if input contains at least one DataFrame
Version 0.11.0 (2021-05-06)
Note: tables stored from this version upwards cannot be loaded with older versions
Added: Database.root
Added: utils.join_labels()
Added: Scheme.replace_labels()
Changed: set dependency to pandas>=1.1.5
Changed: do not compress pickled table files
Version 0.10.2 (2021-04-22)
Changed: allow_nat argument to utils.to_segmented_index()
Version 0.10.1 (2021-03-31)
Fixed: audformat.assert_index() checks for correct dtypes
Version 0.10.0 (2021-03-18)
Added: audformat.Database.update()
Added: audformat.Table.update()
Added: overwrite argument to audformat.utils.concat()
Changed: result of audformat.Table.__add__() is no longer assigned to an audformat.Database
Version 0.9.8 (2021-02-23)
Added: audformat.Database.license
Added: audformat.Database.license_url
Added: audformat.Database.author
Added: audformat.Database.organization
Added: audformat.utils.intersect() for index objects
Added: audformat.utils.union() for index objects
Changed: Database.load() raises an error if a table file is missing
Changed: forbid duplicates in audformat conform indices
Fixed: audformat.Table.__add__() returned wrong values for some index combinations
Version 0.9.7 (2021-02-01)
Added: update_other_formats argument to audformat.Table.save() to make sure existing files in other formats are updated as well
Changed: use round_trip argument when loading CSV files to ensure dataframes are equal after storing and loading again
Version 0.9.6 (2021-01-28)
Fixed: implemented audformat.Database.__eq__ and return True for identical databases
Version 0.9.5 (2021-01-14)
Changed: use the nullable pandas type "boolean" for bool schemes
Fixed: Scheme.draw() generates boolean values if scheme is bool
Version 0.9.4 (2021-01-11)
Changed: add arguments num_workers and verbose to audformat.Database.load()
Version 0.9.3 (2021-01-07)
Fixed: avoid sphinx syntax in CHANGELOG
Version 0.9.2 (2021-01-07)
Changed: add arguments num_workers and verbose to audformat.Database.drop_files(), audformat.Database.map_files(), audformat.Database.pick_files(), audformat.Database.save()
Changed: audformat.segmented_index() supports int and float, which will be interpreted as seconds
Fixed: audformat.utils.to_segmented_index() returns correct index type for NaT
Version 0.9.1 (2020-12-21)
Fixed: add column name to HTML Series output in docs
Fixed: removed mentioning of NotConformToUnifiedFormat error and RedundantArgumentError error
Fixed: add missing errors to docstring of audformat.Table.set() and audformat.Column.set()
Version 0.9.0 (2020-12-18)
Added: initial public release