Process

class audinterface.Process(*, process_func=None, process_func_args=None, process_func_is_mono=False, sampling_rate=None, resample=False, channels=None, mixdown=False, win_dur=None, hop_dur=None, min_signal_dur=None, max_signal_dur=None, segment=None, keep_nat=False, num_workers=1, multiprocessing=False, verbose=False)

Processing interface.

Parameters:
  • process_func (Optional[Callable[..., object]]) – processing function, which expects the two positional arguments signal and sampling_rate and any number of additional keyword arguments (see process_func_args). The arguments 'idx', 'file' and 'root' are special: if the function expects them, but they are not specified in process_func_args, they are filled in with a running index, the currently processed file and the root folder, respectively. There is no restriction on the return type of the function

  • process_func_args (Optional[dict[str, object]]) – (keyword) arguments passed on to the processing function

  • process_func_is_mono (bool) – if set to True and the input signal has multiple channels, process_func will be applied to every channel individually

  • sampling_rate (Optional[int]) – sampling rate in Hz. If None, process_func is called with the actual sampling rate of the signal

  • resample (bool) – if True enforces given sampling rate by resampling

  • channels (Union[int, Sequence[int], None]) – channel selection, see audresample.remix()

  • mixdown (bool) – apply mono mix-down on selection

  • win_dur (Union[float, int, str, Timedelta, None]) – window duration, if processing should be applied on a sliding window. If value is a float or integer it is treated as seconds. See audinterface.utils.to_timedelta() for further options

  • hop_dur (Union[float, int, str, Timedelta, None]) – hop duration, if processing should be applied on a sliding window. This defines the shift between two windows. If value is a float or integer it is treated as seconds. See audinterface.utils.to_timedelta() for further options. Defaults to win_dur / 2

  • min_signal_dur (Union[float, int, str, Timedelta, None]) – minimum signal length required by process_func. If value is a float or integer it is treated as seconds. See audinterface.utils.to_timedelta() for further options. If provided signal is shorter, it will be zero padded at the end

  • max_signal_dur (Union[float, int, str, Timedelta, None]) – maximum signal length required by process_func. If value is a float or integer it is treated as seconds. See audinterface.utils.to_timedelta() for further options. If provided signal is longer, it will be cut at the end

  • segment (Optional[Segment]) – when an audinterface.Segment object is provided, it is used to find a segmentation of the input signal. Processing is then applied to each segment

  • keep_nat (bool) – if the end of a segment is set to NaT, do not replace it with the file duration in the result

  • num_workers (int | None) – number of parallel jobs, or 1 for sequential processing. If None, it is set to the number of processors on the machine multiplied by 5 in case of multithreading, and to the number of processors in case of multiprocessing

  • multiprocessing (bool) – use multiprocessing instead of multithreading

  • verbose (bool) – show debug messages
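
The handling of the special arguments 'idx', 'file' and 'root' described for process_func can be sketched with the standard library alone. This is an illustration of the documented rule, not audinterface's code: fill_special_args and describe are hypothetical names introduced here.

```python
import inspect


def fill_special_args(func, func_args, *, idx, file, root):
    """Return keyword arguments for func, adding special ones if expected.

    Hypothetical helper for illustration, not part of audinterface.
    """
    expected = inspect.signature(func).parameters
    kwargs = dict(func_args)
    for name, value in (("idx", idx), ("file", file), ("root", root)):
        # Fill in only arguments that the function declares
        # and that the caller did not set explicitly
        if name in expected and name not in kwargs:
            kwargs[name] = value
    return kwargs


def describe(signal, sampling_rate, idx, file, gain=1.0):
    # Expects 'idx' and 'file', but not 'root'
    return f"{idx}:{file}:{gain}"


kwargs = fill_special_args(
    describe,
    {"gain": 2.0},
    idx=0,
    file="a.wav",
    root="/data",
)
result = describe([0.0, 1.0], 16000, **kwargs)
```

Because describe does not declare root, that value is not passed on, while idx and file are added next to the user-supplied gain.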

Raises:
  • ValueError – if resample = True, but sampling_rate = None

  • ValueError – if hop_dur is specified, but not win_dur

Examples

>>> import numpy as np
>>> from audinterface import Process
>>> def mean(signal, sampling_rate):
...     return float(signal.mean())
>>> interface = Process(process_func=mean)
>>> signal = np.array([1.0, 2.0, 3.0])
>>> interface(signal, sampling_rate=3)
2.0
>>> interface.process_signal(signal, sampling_rate=3)
start   end
0 days  0 days 00:00:01   2.0
dtype: float64
>>> # Apply interface to an audformat-conform index of a dataframe
>>> import audb
>>> db = audb.load(
...     "emodb",
...     version="1.3.0",
...     media="wav/03a01Fa.wav",
...     full_path=False,
...     verbose=False,
... )
>>> index = db["emotion"].index
>>> interface.process_index(index, root=db.root)
file             start   end
wav/03a01Fa.wav  0 days  0 days 00:00:01.898250    -0.000311
dtype: float64
>>> interface.process_index(index, root=db.root, preserve_index=True)
file
wav/03a01Fa.wav  -0.000311
dtype: float64
>>> # Apply interface with a sliding window
>>> interface = Process(
...     process_func=mean,
...     win_dur=1.0,
...     hop_dur=0.5,
... )
>>> interface.process_index(index, root=db.root)
file             start                   end
wav/03a01Fa.wav  0 days 00:00:00         0 days 00:00:01          -0.000329
                 0 days 00:00:00.500000  0 days 00:00:01.500000   -0.000285
dtype: float64
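
The window positions in the sliding-window output above can be reproduced with a small sketch. It assumes that only windows fitting completely into the signal are kept, which is consistent with the two rows shown for the 1.89825 s file but is not confirmed by this page; window_bounds is a hypothetical helper, not an audinterface function.

```python
def window_bounds(signal_dur, win_dur, hop_dur=None):
    """List (start, end) pairs in seconds for windows that fit the signal.

    Hypothetical helper; assumes incomplete trailing windows are dropped.
    """
    if hop_dur is None:
        hop_dur = win_dur / 2  # documented default for hop_dur
    if hop_dur <= 0:
        raise ValueError("hop_dur must be positive")
    bounds = []
    n = 0
    while n * hop_dur + win_dur <= signal_dur:
        start = n * hop_dur
        bounds.append((start, start + win_dur))
        n += 1
    return bounds


# Duration of wav/03a01Fa.wav as shown in the example output
bounds = window_bounds(1.89825, win_dur=1.0, hop_dur=0.5)
```

With win_dur=1.0 and hop_dur=0.5, a third window would end at 2.0 s, beyond the end of the file, so only two windows remain.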

__call__()

Process.__call__(signal, sampling_rate)

Apply processing to signal.

This method processes the signal without converting the output into a pd.Series; it returns the raw result of the processing function. If channel selection, mixdown or resampling is enabled, the signal is first remixed, and it is resampled if its sampling rate does not match the expected sampling rate.

Parameters:
  • signal (ndarray) – signal values

  • sampling_rate (int) – sampling rate in Hz

Return type:

object

Returns:

Processed signal

Raises:

channels

Process.channels

Channel selection.

hop_dur

Process.hop_dur

Hop duration.

keep_nat

Process.keep_nat

Keep NaT in results.

max_signal_dur

Process.max_signal_dur

Maximum signal length.

min_signal_dur

Process.min_signal_dur

Minimum signal length.
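
The constructor documentation states that a signal shorter than min_signal_dur is zero padded at the end and a signal longer than max_signal_dur is cut at the end. A minimal sketch of that rule on a plain list of samples follows. fit_duration is a hypothetical helper, and converting durations to sample counts by truncation is an assumption made here.

```python
def fit_duration(samples, sampling_rate, min_dur=None, max_dur=None):
    """Cut or zero pad a mono signal at its end.

    Hypothetical helper; durations are given in seconds.
    """
    samples = list(samples)
    if max_dur is not None:
        max_len = int(max_dur * sampling_rate)
        # Drop everything after the maximum length
        samples = samples[:max_len]
    if min_dur is not None:
        min_len = int(min_dur * sampling_rate)
        # A negative difference yields an empty list, so nothing is added
        samples += [0.0] * (min_len - len(samples))
    return samples


padded = fit_duration([1.0, 2.0, 3.0], 4, min_dur=1.0)
cut = fit_duration([1.0, 2.0, 3.0], 4, max_dur=0.5)
```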

mixdown

Process.mixdown

Mono mixdown.

multiprocessing

Process.multiprocessing

Use multiprocessing.

num_workers

Process.num_workers

Number of workers.
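
The documented default for num_workers=None can be written down as a short sketch. default_num_workers is a hypothetical helper that restates the rule from the constructor documentation; how the library itself determines the processor count is not shown on this page, so os.cpu_count() is an assumption.

```python
import os


def default_num_workers(multiprocessing=False):
    """Number of workers used when num_workers is None.

    Hypothetical helper restating the documented rule.
    """
    cpus = os.cpu_count() or 1  # cpu_count() may return None
    # Threads: 5 per processor; processes: 1 per processor
    return cpus if multiprocessing else cpus * 5


threads = default_num_workers(multiprocessing=False)
processes = default_num_workers(multiprocessing=True)
```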

process_file()

Process.process_file(file, *, start=None, end=None, root=None, process_func_args=None)

Process the content of an audio file.

Parameters:
Return type:

Series

Returns:

Series with processed file conforming to audformat

Raises:

process_files()

Process.process_files(files, *, starts=None, ends=None, root=None, process_func_args=None)

Process a list of files.

Parameters:
Return type:

Series

Returns:

Series with processed files conforming to audformat

Raises:

process_folder()

Process.process_folder(root, *, filetype='wav', include_root=True, process_func_args=None)

Process files in a folder.

Note

Sub-folders are currently not scanned.

Parameters:
  • root (str) – root folder

  • filetype (str) – file extension

  • include_root (bool) – if True, the file paths in the index of the returned result are absolute

  • process_func_args (Optional[dict[str, object]]) – (keyword) arguments passed on to the processing function. They will temporarily overwrite the ones stored in audinterface.Process.process_func_args

Return type:

Series

Returns:

Series with processed files conforming to audformat

Raises:

process_func

Process.process_func

Processing function.

process_func_args

Process.process_func_args

Additional keyword arguments to processing function.

process_func_is_mono

Process.process_func_is_mono

Process channels individually.
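
When process_func_is_mono is True and the signal has multiple channels, the processing function is applied to every channel individually. The following sketch shows the idea on nested lists. apply_per_channel and peak are hypothetical names, and the exact shape in which audinterface hands each channel to process_func, as well as how it collects the results, is an assumption here.

```python
def apply_per_channel(func, channels, sampling_rate):
    """Call func once per channel and collect the results in order.

    Hypothetical helper; channels is a list of per-channel sample lists.
    """
    return [func(channel, sampling_rate) for channel in channels]


def peak(signal, sampling_rate):
    # A function written for a single channel
    return max(abs(sample) for sample in signal)


stereo = [[0.1, -0.5, 0.2], [0.3, 0.0, -0.9]]
peaks = apply_per_channel(peak, stereo, 16000)
```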

process_index()

Process.process_index(index, *, preserve_index=False, root=None, cache_root=None, process_func_args=None)

Process segments from an index conforming to audformat.

If cache_root is not None, a hash value is created from the index using audformat.utils.hash() and the result is stored as <cache_root>/<hash>.pkl. When called again with the same index, results will be read from the cached file.
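
The caching scheme described above (hash the index, store the result as <cache_root>/<hash>.pkl, read it back on the next call) can be sketched as follows. This is not the library's implementation: cached and compute are hypothetical names, and an MD5 digest of repr() stands in for audformat.utils.hash().

```python
import hashlib
import os
import pickle
import tempfile


def cached(cache_root, index_rows, compute):
    """Return compute(index_rows), reusing a pickled result if present.

    Hypothetical helper for illustration only.
    """
    # Stand-in hash; the library hashes the index with audformat.utils.hash()
    digest = hashlib.md5(repr(index_rows).encode("utf-8")).hexdigest()
    path = os.path.join(cache_root, f"{digest}.pkl")
    if os.path.exists(path):
        with open(path, "rb") as fp:
            return pickle.load(fp)
    result = compute(index_rows)
    os.makedirs(cache_root, exist_ok=True)
    with open(path, "wb") as fp:
        pickle.dump(result, fp)
    return result


calls = []


def compute(rows):
    # Record every real computation to make cache hits visible
    calls.append(rows)
    return [len(file) for file, start, end in rows]


rows = [("a.wav", 0.0, 1.0), ("bb.wav", 0.0, 2.0)]
with tempfile.TemporaryDirectory() as cache_root:
    first = cached(cache_root, rows, compute)
    second = cached(cache_root, rows, compute)
```

The second call finds the pickle file for the same index and returns its content, so compute runs only once.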

Parameters:
Return type:

Series

Returns:

Series with processed segments conforming to audformat

Raises:

process_signal()

Process.process_signal(signal, sampling_rate, *, file=None, start=None, end=None, process_func_args=None)

Process audio signal and return result.

Note

If a file is given, the index of the returned series has the levels file, start and end. Otherwise, it consists only of start and end.

Parameters:
Return type:

Series

Returns:

Series with processed signal conforming to audformat

Raises:

process_signal_from_index()

Process.process_signal_from_index(signal, sampling_rate, index, process_func_args=None)

Split a signal into segments and process each segment.

Parameters:
Return type:

Series

Returns:

Series with processed segments conforming to audformat

Raises:

resample

Process.resample

Resample signal.

sampling_rate

Process.sampling_rate

Sampling rate in Hz.

segment

Process.segment

Segmentation object.

verbose

Process.verbose

Show debug messages.

win_dur

Process.win_dur

Window duration.