Augment¶

class auglib.Augment(transform, *, sampling_rate=None, resample=False, channels=None, mixdown=False, keep_nat=False, num_workers=1, multiprocessing=False, seed=None, verbose=False)¶

Augmentation interface.

Wraps an auglib.Transform in an object that can be applied to a signal, to one or more files, or to an audformat database. If the input has multiple channels, each channel is augmented individually, so when randomized arguments are used the augmentation can differ between channels. More details are discussed under Usage.

augment = auglib.Augment(transform)
# Apply on signal, returns np.ndarray
signal = augment(signal, sampling_rate)
# Apply on signal, file, files, database index.
# Returns a column holding the augmented signals
y = augment.process_signal(signal, sampling_rate)
y = augment.process_file(file)
y = augment.process_files(files)
y = augment.process_index(index)
# Apply on index, table, column.
# Writes results to disk
# and returns index, table, column
# pointing to augmented files
index = augment.augment(index)
y = augment.augment(y)
df = augment.augment(df)

auglib.Augment inherits from audobject.Object, which means you can serialize the class to a YAML file and instantiate it from one. By setting a seed, you can further ensure that re-running the augmentation loaded from a YAML file will create the same output. Have a look at audobject.Object to see all available methods. The following arguments are not serialized: keep_nat, multiprocessing, num_workers, verbose. For more information, see the section on hidden arguments.

Parameters

transform (auglib.core.transform.Base) – transformation object
sampling_rate (typing.Optional[int]) – sampling rate in Hz. If None, process_func is called with the actual sampling rate of the signal.
resample (bool) – if True enforces given sampling rate by resampling
channels (typing.Union[int, typing.Sequence[int], None]) – channel selection, see audresample.remix()
mixdown (bool) – apply mono mix-down on selection
keep_nat (bool) – if the end of a segment is set to NaT, do not replace it with the file duration in the result
num_workers (typing.Optional[int]) – number of parallel jobs or 1 for sequential processing. If None, it is set to the number of processors on the machine multiplied by 5 in case of multithreading, and to the number of processors in case of multiprocessing. If seed is not None, the value is always set to 1
multiprocessing (bool) – use multiprocessing instead of multithreading
seed (typing.Optional[int]) – if not None calls auglib.seed() with the given value when object is constructed. This will automatically set num_workers to 1
verbose (bool) – show debug messages

Raises

ValueError – if resample = True, but sampling_rate = None
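
The rules stated above for num_workers, multiprocessing, seed and the resample check can be sketched in plain Python. This is only an illustration of the documented behaviour, not auglib's implementation; the helper name resolve_settings and the error message are hypothetical.

```python
import os


def resolve_settings(sampling_rate=None, resample=False, num_workers=1,
                     multiprocessing=False, seed=None):
    """Hypothetical helper that returns the effective number of workers."""
    # Resampling needs a target sampling rate.
    if resample and sampling_rate is None:
        raise ValueError("sampling_rate has to be set if resample=True")
    # With a seed, the documented value of num_workers is always 1.
    if seed is not None:
        return 1
    # None means: derive the value from the number of processors.
    if num_workers is None:
        cpus = os.cpu_count() or 1
        return cpus if multiprocessing else cpus * 5
    return num_workers
```

For instance, resolve_settings(num_workers=8, seed=0) returns 1, because the seed takes precedence over the requested number of workers.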

Examples

>>> import audb
>>> import audiofile
>>> import auglib
>>> db = audb.load(
...     "emodb",
...     version="1.4.1",
...     media=["wav/03a01Fa.wav", "wav/03a01Nc.wav", "wav/03a01Wa.wav"],
...     verbose=False,
... )
>>> transform = auglib.transform.WhiteNoiseUniform()
>>> augment = auglib.Augment(transform)
>>> # Augment a numpy array
>>> signal, sampling_rate = audiofile.read(db.files[0])
>>> signal_augmented = augment(signal, sampling_rate)
>>> # Augment (parts of) a database
>>> df = db.get("emotion")
>>> df_augmented = augment.augment(
...     df,
...     cache_root="cache",
...     remove_root=db.root,
... )
>>> label = df_augmented.iloc[0, 0]
>>> file = df_augmented.index[0][0]
>>> file, label
('...03a01Fa.wav', 'happiness')

__call__()¶

Augment.__call__(signal, sampling_rate)¶

Apply processing to signal.

This function processes the signal without transforming the output into a pd.Series. Instead, it returns the raw processed signal. However, if channel selection, mixdown and/or resampling is enabled, the signal is first remixed, and it is resampled if the input sampling rate does not match the expected sampling rate.

Parameters

signal (numpy.ndarray) – signal values
sampling_rate (int) – sampling rate in Hz

Return type

typing.Any

Returns

Processed signal

Raises

RuntimeError – if sampling rates do not match
RuntimeError – if channel selection is invalid

arguments¶

Augment.arguments¶

Returns arguments that are serialized.

Returns: Dictionary of arguments and their values.
Raises: RuntimeError – if arguments are found that are not assigned to attributes of the same name

Examples

>>> import audobject.testing
>>> o = audobject.testing.TestObject('test', point=(1, 1))
>>> o.arguments
{'name': 'test', 'point': (1, 1)}

augment()¶

Augment.augment(data, cache_root=None, *, data_root=None, remove_root=None, modified_only=True, num_variants=1, force=False)¶

Augment an index, column, or table conforming to audformat.

Creates num_variants copies of the segments referenced in the index and augments them. Files of augmented segments are stored as <cache_root>/<short_id>/<index_id>/<variant>/<original_path>. The <index_id> is the identifier of the index of data. Note that the <index_id> of a filewise index is the same as that of its corresponding segmented index with start=0 and end=NaT, but differs from that of its corresponding segmented index when end is set to the file durations. The path can be shortened by setting remove_root to a directory that should be removed from <original_path> (e.g. the audb cache folder). If more than one segment of the same file is augmented, a counter is added at the end of the filename.

The result is an index, column or table that references the augmented files. If the input is a column or table, the original content is kept. If num_variants > 1 the data is duplicated accordingly. If modified_only is set to False the original index is also included.
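
The storage layout described above can be illustrated with a small path-building sketch. It assumes POSIX-style paths and uses a hypothetical helper augmented_path; it is not the code auglib uses, and it ignores the counter added when several segments of the same file are augmented.

```python
import posixpath


def augmented_path(cache_root, short_id, index_id, variant,
                   original_path, remove_root=None):
    """Hypothetical helper building
    <cache_root>/<short_id>/<index_id>/<variant>/<original_path>."""
    path = original_path
    if remove_root is not None:
        # Strip remove_root only if it is a leading directory of the path.
        root = remove_root.rstrip("/") + "/"
        if path.startswith(root):
            path = path[len(root):]
    # Make the remaining path relative so it can be joined below.
    path = path.lstrip("/")
    return posixpath.join(cache_root, short_id, index_id, str(variant), path)


print(augmented_path("cache", "abcd1234", "idx", 0,
                     "/data/db/wav/a.wav", remove_root="/data/db"))
# cache/abcd1234/idx/0/wav/a.wav
```

Without remove_root, the full original path would be appended below the variant folder, which is why removing a common root such as a database cache folder shortens the result.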

Parameters

data (typing.Union[pandas.core.indexes.base.Index, pandas.core.series.Series, pandas.core.frame.DataFrame]) – index, column or table conforming to audformat
cache_root (typing.Optional[str]) – directory to cache augmented files, if None defaults to auglib.config.CACHE_ROOT
data_root (typing.Optional[str]) – if index contains relative files, set to root directory where files are stored
remove_root (typing.Optional[str]) – directory that should be removed from the beginning of the original file path before joining with cache_root
modified_only (bool) – return only modified segments, otherwise combine original and modified segments
num_variants (int) – number of variations that are created for every segment
force (bool) – overwrite existing files

Return type

typing.Union[pandas.core.indexes.base.Index, pandas.core.series.Series, pandas.core.frame.DataFrame]

Returns

index, column or table including augmented files

Raises

RuntimeError – if sampling rates of file and transformation do not match
RuntimeError – if modified_only=False but resampling or remixing is turned on

borrowed_arguments¶

Augment.borrowed_arguments¶

Returns borrowed arguments.

Returns: Dictionary with borrowed arguments.

channels¶

Augment.channels¶: Channel selection.

from_dict()¶

static Augment.from_dict(d, root=None, **kwargs)¶

Return type: audobject.core.object.Object

from_yaml()¶

static Augment.from_yaml(path_or_stream, **kwargs)¶

Return type: audobject.core.object.Object

from_yaml_s()¶

static Augment.from_yaml_s(yaml_string, **kwargs)¶

Return type: audobject.core.object.Object

hidden_arguments¶

Augment.hidden_arguments¶

Returns hidden arguments.

Returns: List with names of hidden arguments.

hop_dur¶

Augment.hop_dur¶: Hop duration.

id¶

Augment.id¶

Object identifier.

The ID of an object is created from its non-hidden arguments.

Returns: object identifier

Examples

>>> from audobject import Object
>>> class Foo(Object):
...    def __init__(self, bar: str):
...        self.bar = bar
>>> foo1 = Foo('I am unique!')
>>> foo1.id
'893df240-babe-d796-cdf1-c436171b7a96'
>>> foo2 = Foo('I am different!')
>>> foo2.id
'9303f2a5-bfc9-e5ff-0ffa-a9846e2d2190'
>>> foo3 = Foo('I am unique!')
>>> foo1.id == foo3.id
True
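
The property shown in this example, equal non-hidden arguments giving equal IDs, can be sketched with the standard library alone. The hashing scheme below is an assumption made for illustration and will not reproduce the IDs printed above; object_id is a hypothetical helper, not part of audobject.

```python
import hashlib
import uuid


def object_id(arguments, hidden=()):
    """Hypothetical ID: a UUID derived from the non-hidden arguments."""
    # Drop hidden arguments and sort by name to get a stable representation.
    visible = {k: v for k, v in sorted(arguments.items()) if k not in hidden}
    digest = hashlib.md5(repr(visible).encode("utf-8")).hexdigest()
    # An MD5 hex digest has 32 hex characters, which is a valid UUID.
    return str(uuid.UUID(digest))


id1 = object_id({"bar": "I am unique!", "verbose": True}, hidden=("verbose",))
id2 = object_id({"bar": "I am different!"})
id3 = object_id({"bar": "I am unique!", "verbose": False}, hidden=("verbose",))
print(id1 == id3, id1 == id2)
# True False
```

Because hidden arguments are excluded before hashing, changing a hidden argument such as verbose leaves the ID untouched in this sketch.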

is_loaded_from_dict¶

Augment.is_loaded_from_dict¶

Check if object was loaded from a dictionary.

Returns True if object was initialized from a dictionary, e.g. after loading it from a YAML file.

Returns

True if object was loaded from a dictionary, otherwise False

keep_nat¶

Augment.keep_nat¶: Keep NaT in results.

max_signal_dur¶

Augment.max_signal_dur¶: Maximum signal length.

min_signal_dur¶

Augment.min_signal_dur¶: Minimum signal length.

mixdown¶

Augment.mixdown¶: Mono mixdown.

multiprocessing¶

Augment.multiprocessing¶: Use multiprocessing.

num_workers¶

Augment.num_workers¶: Number of workers.

process_file()¶

Augment.process_file(file, *, start=None, end=None, root=None, process_func_args=None)¶

Process the content of an audio file.

Parameters

file (str) – file path
start (typing.Union[float, int, str, pandas._libs.tslibs.timedeltas.Timedelta, None]) – start processing at this position. If value is a float or integer it is treated as seconds. See audinterface.utils.to_timedelta() for further options
end (typing.Union[float, int, str, pandas._libs.tslibs.timedeltas.Timedelta, None]) – end processing at this position. If value is a float or integer it is treated as seconds. See audinterface.utils.to_timedelta() for further options
root (typing.Optional[str]) – root folder to expand relative file path
process_func_args (typing.Optional[typing.Dict[str, typing.Any]]) – (keyword) arguments passed on to the processing function. They will temporarily overwrite the ones stored in audinterface.Process.process_func_args

Return type

pandas.core.series.Series

Returns

Series with processed file conforming to audformat

Raises

RuntimeError – if sampling rates do not match
RuntimeError – if channel selection is invalid
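
The statement that float or integer values for start and end are treated as seconds can be sketched as follows. This is a simplified stand-in written with the standard library; the helper seconds_to_timedelta is hypothetical and does not handle the string formats that audinterface.utils.to_timedelta() is documented to accept.

```python
from datetime import timedelta


def seconds_to_timedelta(value):
    """Hypothetical helper: interpret numbers as seconds."""
    # None and ready-made timedeltas are passed through unchanged.
    if value is None or isinstance(value, timedelta):
        return value
    if isinstance(value, (int, float)):
        return timedelta(seconds=value)
    raise TypeError(f"unsupported type: {type(value).__name__}")


print(seconds_to_timedelta(1.5))
# 0:00:01.500000
```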

process_files()¶

Augment.process_files(files, *, starts=None, ends=None, root=None, process_func_args=None)¶

Process a list of files.

Parameters

files (typing.Sequence[str]) – list of file paths
starts (typing.Union[float, int, str, pandas._libs.tslibs.timedeltas.Timedelta, typing.Sequence[typing.Union[float, int, str, pandas._libs.tslibs.timedeltas.Timedelta]], None]) – segment start positions. Time values given as float or integers are treated as seconds. See audinterface.utils.to_timedelta() for further options. If a scalar is given, it is applied to all files
ends (typing.Union[float, int, str, pandas._libs.tslibs.timedeltas.Timedelta, typing.Sequence[typing.Union[float, int, str, pandas._libs.tslibs.timedeltas.Timedelta]], None]) – segment end positions. Time values given as float or integers are treated as seconds. See audinterface.utils.to_timedelta() for further options. If a scalar is given, it is applied to all files
root (typing.Optional[str]) – root folder to expand relative file paths
process_func_args (typing.Optional[typing.Dict[str, typing.Any]]) – (keyword) arguments passed on to the processing function. They will temporarily overwrite the ones stored in audinterface.Process.process_func_args

Return type

pandas.core.series.Series

Returns

Series with processed files conforming to audformat

Raises

RuntimeError – if sampling rates do not match
RuntimeError – if channel selection is invalid

process_folder()¶

Augment.process_folder(root, *, filetype='wav', include_root=True, process_func_args=None)¶

Process files in a folder.

Note

Sub-folders are currently not scanned.

Parameters

root (str) – root folder
filetype (str) – file extension
include_root (bool) – if True the file paths are absolute in the index of the returned result
process_func_args (typing.Optional[typing.Dict[str, typing.Any]]) – (keyword) arguments passed on to the processing function. They will temporarily overwrite the ones stored in audinterface.Process.process_func_args

Return type

pandas.core.series.Series

Returns

Series with processed files conforming to audformat

Raises

FileNotFoundError – if folder does not exist
RuntimeError – if sampling rates do not match
RuntimeError – if channel selection is invalid

process_func¶

Augment.process_func¶: Processing function.

process_func_args¶

Augment.process_func_args¶: Additional keyword arguments to processing function.

process_func_is_mono¶

Augment.process_func_is_mono¶: Process channels individually.

process_index()¶

Augment.process_index(index, *, preserve_index=False, root=None, cache_root=None, process_func_args=None)¶

Process segments from an index conforming to audformat.

If cache_root is not None, a hash value is created from the index using audformat.utils.hash() and the result is stored as <cache_root>/<hash>.pkl. When called again with the same index, results will be read from the cached file.
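
The caching behaviour described above follows a common pattern: hash the input, store the result under that hash, and read it back on the next call. The sketch below shows the pattern with the standard library only. It hashes the repr of a list of keys with MD5 as a stand-in for audformat.utils.hash(), and cached_process is a hypothetical helper, so file names will differ from those produced by the library.

```python
import hashlib
import os
import pickle


def cached_process(keys, process, cache_root):
    """Hypothetical helper caching results as <cache_root>/<hash>.pkl."""
    # Stand-in hash: MD5 of the textual representation of the keys.
    digest = hashlib.md5(repr(list(keys)).encode("utf-8")).hexdigest()
    path = os.path.join(cache_root, f"{digest}.pkl")
    # Same keys as before: read the stored result instead of recomputing.
    if os.path.exists(path):
        with open(path, "rb") as fp:
            return pickle.load(fp)
    result = [process(key) for key in keys]
    os.makedirs(cache_root, exist_ok=True)
    with open(path, "wb") as fp:
        pickle.dump(result, fp)
    return result
```

Calling cached_process twice with the same keys and cache folder runs process only during the first call; the second call loads the pickled result.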

Parameters

index (pandas.core.indexes.base.Index) – index with segment information
preserve_index (bool) – if True and audinterface.Process.segment is None, the returned index has the same type as the original one; otherwise a segmented index is always returned
root (typing.Optional[str]) – root folder to expand relative file paths
cache_root (typing.Optional[str]) – cache folder (see description)
process_func_args (typing.Optional[typing.Dict[str, typing.Any]]) – (keyword) arguments passed on to the processing function. They will temporarily overwrite the ones stored in audinterface.Process.process_func_args

Return type

pandas.core.series.Series

Returns

Series with processed segments conforming to audformat

Raises

RuntimeError – if sampling rates do not match
RuntimeError – if channel selection is invalid

process_signal()¶

Augment.process_signal(signal, sampling_rate, *, file=None, start=None, end=None, process_func_args=None)¶

Process audio signal and return result.

Note

If a file is given, the index of the returned frame has levels file, start and end. Otherwise, it consists only of start and end.

Parameters

signal (numpy.ndarray) – signal values
sampling_rate (int) – sampling rate in Hz
file (typing.Optional[str]) – file path
start (typing.Union[float, int, str, pandas._libs.tslibs.timedeltas.Timedelta, None]) – start processing at this position. If value is a float or integer it is treated as seconds. See audinterface.utils.to_timedelta() for further options
end (typing.Union[float, int, str, pandas._libs.tslibs.timedeltas.Timedelta, None]) – end processing at this position. If value is a float or integer it is treated as seconds. See audinterface.utils.to_timedelta() for further options
process_func_args (typing.Optional[typing.Dict[str, typing.Any]]) – (keyword) arguments passed on to the processing function. They will temporarily overwrite the ones stored in audinterface.Process.process_func_args

Return type

pandas.core.series.Series

Returns

Series with processed signal conforming to audformat

Raises

RuntimeError – if sampling rates do not match
RuntimeError – if channel selection is invalid

process_signal_from_index()¶

Augment.process_signal_from_index(signal, sampling_rate, index, process_func_args=None)¶

Split a signal into segments and process each segment.

Parameters

signal (numpy.ndarray) – signal values
sampling_rate (int) – sampling rate in Hz
index (pandas.core.indexes.base.Index) – a segmented index conforming to audformat or a pandas.MultiIndex with two levels named start and end that hold start and end positions as pandas.Timedelta objects. See also audinterface.utils.signal_index()
process_func_args (typing.Optional[typing.Dict[str, typing.Any]]) – (keyword) arguments passed on to the processing function. They will temporarily overwrite the ones stored in audinterface.Process.process_func_args

Return type

pandas.core.series.Series

Returns

Series with processed segments conforming to audformat

Raises

RuntimeError – if sampling rates do not match
RuntimeError – if channel selection is invalid
ValueError – if index contains duplicates
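
Splitting a signal into segments and processing each one can be sketched without any audio libraries. The example below assumes a mono signal stored as a plain list and segments given as (start, end) pairs in seconds instead of a pandas index; process_segments is a hypothetical helper, not the method documented here.

```python
def process_segments(signal, sampling_rate, segments, process):
    """Hypothetical helper: apply process to every (start, end) segment."""
    # Mirror the documented ValueError for duplicated index entries.
    if len(set(segments)) != len(segments):
        raise ValueError("index contains duplicates")
    results = {}
    for start, end in segments:
        # Convert positions in seconds to sample offsets.
        first = int(round(start * sampling_rate))
        last = int(round(end * sampling_rate))
        results[(start, end)] = process(signal[first:last])
    return results


signal = list(range(10))  # one second of "audio" at 10 Hz
print(process_segments(signal, 10, [(0.0, 0.5), (0.5, 1.0)], sum))
# {(0.0, 0.5): 10, (0.5, 1.0): 35}
```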

resample¶

Augment.resample¶: Resample signal.

resolvers¶

Augment.resolvers¶

Return resolvers.

Returns: Dictionary with resolvers.

sampling_rate¶

Augment.sampling_rate¶: Sampling rate in Hz.

seed¶

Augment.seed¶: Random seed to initialize the random number generator.

segment¶

Augment.segment¶: Segmentation object.

short_id¶

Augment.short_id¶

Short flavor ID.

This just truncates the ID to its last eight characters.
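
The truncation amounts to a single slice. A minimal sketch, using the first ID from the id example on this page; the function name short_id is used here only for illustration.

```python
def short_id(full_id):
    """Keep only the last eight characters of an ID."""
    return full_id[-8:]


print(short_id("893df240-babe-d796-cdf1-c436171b7a96"))
# 171b7a96
```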

to_dict()¶

Augment.to_dict(*, include_version=True, flatten=False, root=None)¶

Converts object to a dictionary.

Includes items from audobject.Object.arguments. If an argument has a resolver, its value is encoded. Usually, the object can be re-instantiated using audobject.Object.from_dict(). However, if flatten=True, this is not possible.

Parameters

include_version (bool) – add version to class name
flatten (bool) – flatten the dictionary
root (typing.Optional[str]) – if file is written to disk, set to target directory

Return type

typing.Dict[str, typing.Union[bool, datetime.datetime, dict, float, int, list, None, str]]

Returns

dictionary that represents the object

Examples

>>> import audobject.testing
>>> o = audobject.testing.TestObject('test', point=(1, 1))
>>> o.to_dict(include_version=False)
{'$audobject.core.testing.TestObject': {'name': 'test', 'point': [1, 1]}}
>>> o.to_dict(flatten=True)
{'name': 'test', 'point.0': 1, 'point.1': 1}
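
The flattened output above uses dotted keys for nested values. A recursive sketch that produces the same shape for this example is given below. It is an illustration only, and flatten is a hypothetical helper rather than the routine audobject uses. It also shows why a flattened dictionary cannot be used to re-instantiate the object: the nesting and the container types are no longer recorded.

```python
def flatten(value, prefix=""):
    """Hypothetical helper: flatten nested dicts and sequences."""
    if isinstance(value, dict):
        items = value.items()
    elif isinstance(value, (list, tuple)):
        items = enumerate(value)
    else:
        # Leaf value: store it under the accumulated dotted key.
        return {prefix: value}
    flat = {}
    for key, item in items:
        name = f"{prefix}.{key}" if prefix else str(key)
        flat.update(flatten(item, name))
    return flat


print(flatten({"name": "test", "point": [1, 1]}))
# {'name': 'test', 'point.0': 1, 'point.1': 1}
```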

to_yaml()¶

Augment.to_yaml(path_or_stream, *, include_version=True)¶

Save object to YAML file.

Parameters

path_or_stream (typing.Union[str, typing.IO]) – file path or stream
include_version (bool) – add version to class name

to_yaml_s()¶

Augment.to_yaml_s(*, include_version=True)¶

Convert object to YAML string.

Parameters: include_version (bool) – add version to class name
Return type: str
Returns: YAML string

Examples

>>> import audobject.testing
>>> o = audobject.testing.TestObject('test', point=(1, 1))
>>> print(o.to_yaml_s(include_version=False))
$audobject.core.testing.TestObject:
  name: test
  point:
  - 1
  - 1

transform¶

Augment.transform¶: The transformation object.

verbose¶

Augment.verbose¶: Show debug messages.

win_dur¶

Augment.win_dur¶: Window duration.
