Feature

class audinterface.Feature(feature_names, *, name=None, params=None, process_func=None, process_func_args=None, process_func_is_mono=False, process_func_applies_sliding_window=False, sampling_rate=None, resample=False, channels=0, mixdown=False, win_dur=None, hop_dur=None, min_signal_dur=None, max_signal_dur=None, segment=None, keep_nat=False, num_workers=1, multiprocessing=False, verbose=False)

Feature extraction interface.

The features are returned as a pandas.DataFrame. If your input signal is of size (num_channels, num_time_steps), the returned object has num_channels * num_features columns. It will have one row per file or signal.
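As a rough illustration of this column layout (a sketch with plain NumPy, not the library's own code), assume a hypothetical two-channel signal and the two features mean and std computed per channel:

```python
import numpy as np

# Hypothetical input: 2 channels, 4 time steps
signal = np.array(
    [
        [1.0, 2.0, 3.0, 4.0],
        [0.0, 0.0, 2.0, 2.0],
    ]
)
feature_names = ["mean", "std"]

# One feature vector per channel: shape (num_channels, num_features)
features = np.stack([signal.mean(axis=1), signal.std(axis=1)], axis=1)

# Flatten to a single row with num_channels * num_features values,
# ordered channel by channel
row = features.reshape(-1)
columns = [
    (channel, name)
    for channel in range(signal.shape[0])
    for name in feature_names
]
print(columns)
print(row)
```

The list of (channel, name) pairs mirrors the two-level column index that appears in the multi-channel example on this page.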

If features are extracted using a sliding window, each window will be stored as one row. If win_dur is specified, start and end indices are inferred from the original start and end arguments and the window positions. If win_dur is None, the original start and end indices are kept. If process_func_applies_sliding_window is set to True, the processing function is responsible for applying the sliding window. Otherwise, the sliding window is applied before the processing function is called.
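The window positions can be sketched as follows. This is an illustration under stated assumptions, not the library's implementation: the helper window_positions is hypothetical, only complete windows are kept (which matches the four rows of the sliding-window example on this page), and the sampling rate of 16000 Hz is assumed. For a segment with a non-zero start, the start offset would presumably be added to each position.

```python
def window_positions(num_samples, sampling_rate, win_dur, hop_dur):
    """Return (start, end) in seconds for every complete window."""
    win = round(win_dur * sampling_rate)  # window length in samples
    hop = round(hop_dur * sampling_rate)  # shift between windows in samples
    if num_samples < win:
        return []
    num_frames = (num_samples - win) // hop + 1
    return [
        (n * hop / sampling_rate, (n * hop + win) / sampling_rate)
        for n in range(num_frames)
    ]


# 30372 samples at an assumed 16000 Hz correspond to 1.89825 s
positions = window_positions(30372, 16000, win_dur=1.0, hop_dur=0.25)
for start, end in positions:
    print(start, end)
```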

If the arguments win_dur and hop_dur are not specified in process_func_args, but process_func expects them, they are passed on automatically.
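One way such automatic passing could work is to inspect the signature of the processing function. The following is a minimal sketch using only the standard library; fill_window_args, framewise_energy, and mean_only are hypothetical names, and the actual mechanism inside audinterface may differ.

```python
import inspect


def fill_window_args(process_func, process_func_args, win_dur, hop_dur):
    """Add win_dur / hop_dur if the function expects them and they are missing."""
    args = dict(process_func_args or {})
    expected = inspect.signature(process_func).parameters
    for key, value in (("win_dur", win_dur), ("hop_dur", hop_dur)):
        if key in expected and key not in args:
            args[key] = value
    return args


def framewise_energy(signal, sampling_rate, win_dur, hop_dur):
    ...


def mean_only(signal, sampling_rate):
    ...


# The function expects both arguments, so both are filled in
print(fill_window_args(framewise_energy, None, win_dur=1.0, hop_dur=0.5))
# The function expects neither argument, so nothing is added
print(fill_window_args(mean_only, None, win_dur=1.0, hop_dur=0.5))
```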

Parameters
  • feature_names (Union[str, Sequence[str]]) – features are stored as columns in a data frame, where feature_names defines the names of the columns. If len(channels) > 1, the data frame has a multi-column index with channel ID as first level and feature_names as second level

  • name (Optional[str]) – name of the feature set, e.g. 'stft'

  • params (Optional[Dict]) – parameters that describe the feature set, e.g. {'win_size': 512, 'hop_size': 256, 'num_fft': 512}. With the parameters you can differentiate different flavors of the same feature set

  • process_func (Optional[Callable[..., Any]]) – feature extraction function, which expects the two positional arguments signal and sampling_rate and any number of additional keyword arguments (see process_func_args). There are the following special arguments: 'idx', 'file', 'root'. If expected by the function, but not specified in process_func_args, they will be replaced with: a running index, the currently processed file, the root folder. The function must return features in the shape of (num_features), (num_channels, num_features), (num_features, num_frames), or (num_channels, num_features, num_frames)

  • process_func_args (Optional[Dict[str, Any]]) – (keyword) arguments passed on to the processing function

  • process_func_is_mono (bool) – apply process_func to every channel individually

  • process_func_applies_sliding_window (bool) – if True the processing function receives whole files or segments and is responsible for applying a sliding window itself. If False, the sliding window is applied internally and the processing function receives individual frames instead. Applies only if features are extracted in a framewise manner (see win_dur and hop_dur)

  • sampling_rate (Optional[int]) – sampling rate in Hz. If None it will call process_func with the actual sampling rate of the signal

  • resample (bool) – if True enforces given sampling rate by resampling

  • channels (Union[int, Sequence[int]]) – channel selection, see audresample.remix()

  • win_dur (Union[float, int, str, Timedelta, None]) – window duration, if features are extracted with a sliding window. If value is a float or integer it is treated as seconds. See audinterface.utils.to_timedelta() for further options

  • hop_dur (Union[float, int, str, Timedelta, None]) – hop duration, if features are extracted with a sliding window. This defines the shift between two windows. If value is a float or integer it is treated as seconds. See audinterface.utils.to_timedelta() for further options. Defaults to win_dur / 2

  • min_signal_dur (Union[float, int, str, Timedelta, None]) – minimum signal duration required by process_func. If value is a float or integer it is treated as seconds. See audinterface.utils.to_timedelta() for further options. If provided signal is shorter, it will be zero padded at the end

  • max_signal_dur (Union[float, int, str, Timedelta, None]) – maximum signal duration required by process_func. If value is a float or integer it is treated as seconds. See audinterface.utils.to_timedelta() for further options. If provided signal is longer, it will be cut at the end

  • mixdown (bool) – apply mono mix-down on selection

  • segment (Optional[Segment]) – when an audinterface.Segment object is provided, it will be used to find a segmentation of the input signal. Afterwards processing is applied to each segment

  • keep_nat (bool) – if the end of a segment is set to NaT, do not replace it with the file duration in the result

  • num_workers (Optional[int]) – number of parallel jobs or 1 for sequential processing. If None, it will be set to the number of processors on the machine multiplied by 5 in case of multithreading, and to the number of processors in case of multiprocessing

  • multiprocessing (bool) – use multiprocessing instead of multithreading

  • verbose (bool) – show debug messages
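The effect of min_signal_dur and max_signal_dur can be sketched with NumPy. The helper fit_duration is hypothetical and not part of audinterface; it assumes durations given in seconds, a signal of shape (num_channels, num_samples), and truncation when converting seconds to samples.

```python
import numpy as np


def fit_duration(signal, sampling_rate, min_signal_dur=None, max_signal_dur=None):
    """Cut or zero pad a (num_channels, num_samples) signal at its end."""
    if max_signal_dur is not None:
        max_samples = int(max_signal_dur * sampling_rate)
        signal = signal[:, :max_samples]  # cut at the end
    if min_signal_dur is not None:
        min_samples = int(min_signal_dur * sampling_rate)
        missing = min_samples - signal.shape[1]
        if missing > 0:
            # np.pad fills with zeros by default
            signal = np.pad(signal, ((0, 0), (0, missing)))
    return signal


signal = np.ones((1, 6))  # 1.5 s at 4 Hz
padded = fit_duration(signal, sampling_rate=4, min_signal_dur=2.0)
cut = fit_duration(signal, sampling_rate=4, max_signal_dur=1.0)
print(padded.shape, cut.shape)
```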

Raises
  • ValueError – if win_dur or hop_dur are given in samples and sampling_rate is None

  • ValueError – if hop_dur is specified, but not win_dur
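The relation between win_dur and hop_dur (the default of win_dur / 2 and the second ValueError above) can be summarised in a small sketch. The helper resolve_durations is hypothetical and only handles numeric durations in seconds; it does not cover string or sample-based values.

```python
def resolve_durations(win_dur=None, hop_dur=None):
    """Return (win_dur, hop_dur), with hop_dur defaulting to win_dur / 2."""
    if win_dur is None:
        if hop_dur is not None:
            raise ValueError("hop_dur is specified, but win_dur is not")
        return None, None
    if hop_dur is None:
        hop_dur = win_dur / 2
    return win_dur, hop_dur


print(resolve_durations(win_dur=1.0))
print(resolve_durations(win_dur=1.0, hop_dur=0.25))
```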

Examples

>>> def mean_std(signal, sampling_rate):
...     return [signal.mean(), signal.std()]
>>> interface = Feature(["mean", "std"], process_func=mean_std)
>>> signal = np.array([1.0, 2.0, 3.0])
>>> interface(signal, sampling_rate=3)
array([[[2.        ],
        [0.81649658]]])
>>> interface.process_signal(signal, sampling_rate=3)
                        mean       std
start  end
0 days 0 days 00:00:01   2.0  0.816497
>>> # Apply interface on an audformat conform index of a dataframe
>>> import audb
>>> db = audb.load(
...     "emodb",
...     version="1.3.0",
...     media="wav/03a01Fa.wav",
...     full_path=False,
...     verbose=False,
... )
>>> index = db["emotion"].index
>>> interface.process_index(index, root=db.root)
                                                   mean       std
file            start  end
wav/03a01Fa.wav 0 days 0 days 00:00:01.898250 -0.000311  0.082317
>>> interface.process_index(index, root=db.root, preserve_index=True)
                mean       std
file
wav/03a01Fa.wav -0.000311  0.082317
>>> # Apply interface with a sliding window
>>> interface = Feature(
...     ["mean", "std"],
...     process_func=mean_std,
...     win_dur=1.0,
...     hop_dur=0.25,
... )
>>> interface.process_index(index, root=db.root)
                                                                   mean       std
file            start                  end
wav/03a01Fa.wav 0 days 00:00:00        0 days 00:00:01        -0.000329  0.098115
                0 days 00:00:00.250000 0 days 00:00:01.250000 -0.000405  0.087917
                0 days 00:00:00.500000 0 days 00:00:01.500000 -0.000285  0.067042
                0 days 00:00:00.750000 0 days 00:00:01.750000 -0.000187  0.063677
>>> # Apply the same process function on all channels
>>> # of a multi-channel signal
>>> import audeer
>>> import audiofile
>>> signal, sampling_rate = audiofile.read(
...     audeer.path(db.root, db.files[0]),
...     always_2d=True,
... )
>>> signal_multi_channel = np.concatenate(
...     [
...         signal - 0.5,
...         signal + 0.5,
...     ],
... )
>>> interface = Feature(
...     ["mean", "std"],
...     process_func=mean_std,
...     process_func_is_mono=True,
...     channels=[0, 1],
... )
>>> interface.process_signal(
...     signal_multi_channel,
...     sampling_rate,
... )
                                      0                   1
                                   mean       std      mean       std
start  end
0 days 0 days 00:00:01.898250 -0.500311  0.082317  0.499689  0.082317

__call__()

Feature.__call__(signal, sampling_rate)

Apply processing to signal.

This function processes the signal without transforming the output into a pandas.DataFrame. Instead, it returns the raw processed signal. However, if channel selection, mixdown, and/or resampling is enabled, the signal is first remixed, and it is resampled if the input sampling rate does not match the expected sampling rate.

Parameters
  • signal (ndarray) – signal values

  • sampling_rate (int) – sampling rate in Hz

Return type

ndarray

Returns

feature array with shape (num_channels, num_features, num_frames)

Raises

column_names

Feature.column_names

Feature column names.

feature_names

Feature.feature_names

Feature names.

hop_dur

Feature.hop_dur

Hop duration.

name

Feature.name

Name of the feature set.

num_channels

Feature.num_channels

Expected number of channels.

num_features

Feature.num_features

Number of features.

params

Feature.params

Dictionary of parameters describing the feature set.

process

Feature.process

Processing object.

process_file()

Feature.process_file(file, *, start=None, end=None, root=None, process_func_args=None)

Extract features from an audio file.

Parameters
Raises
Return type

DataFrame

process_files()

Feature.process_files(files, *, starts=None, ends=None, root=None, process_func_args=None)

Extract features for a list of files.

Parameters
Raises
Return type

DataFrame

process_folder()

Feature.process_folder(root, *, filetype='wav', include_root=True, process_func_args=None)

Extract features from files in a folder.

Note

Sub-folders are currently not scanned.
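If files in sub-folders are needed, one option is to collect the paths yourself and hand the list to process_files(). The following sketch uses only the standard library; list_audio_files is a hypothetical helper, and the temporary folder only serves to make the example self-contained.

```python
import tempfile
from pathlib import Path


def list_audio_files(root, filetype="wav", recursive=False):
    """Return sorted file paths below root with the given extension."""
    pattern = f"**/*.{filetype}" if recursive else f"*.{filetype}"
    return sorted(str(path) for path in Path(root).glob(pattern))


with tempfile.TemporaryDirectory() as root:
    (Path(root) / "sub").mkdir()
    (Path(root) / "a.wav").touch()
    (Path(root) / "sub" / "b.wav").touch()
    top_level = list_audio_files(root)  # only files directly in root
    all_files = list_audio_files(root, recursive=True)  # includes sub-folders

print(len(top_level), len(all_files))
```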

Parameters
  • root (str) – root folder

  • filetype (str) – file extension

  • include_root (bool) – if True the file paths are absolute in the index of the returned result

  • process_func_args (Optional[Dict[str, Any]]) – (keyword) arguments passed on to the processing function. They will temporarily overwrite the ones stored in audinterface.Feature.process.process_func_args

Raises
Return type

DataFrame

process_func_applies_sliding_window

Feature.process_func_applies_sliding_window

Controls if processing function applies sliding window.

process_index()

Feature.process_index(index, *, preserve_index=False, root=None, cache_root=None, process_func_args=None)

Extract features from an index conforming to audformat.

If cache_root is not None, a hash value is created from the index using audformat.utils.hash() and the result is stored as <cache_root>/<hash>.pkl. When called again with the same index, features will be read from the cached file.
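The caching pattern described here can be sketched with the standard library. This is only an illustration: the helper cached is hypothetical, a SHA-256 digest of a string stands in for audformat.utils.hash() applied to the index, and a plain dictionary stands in for the feature frame.

```python
import hashlib
import os
import pickle
import tempfile


def cached(cache_root, key, compute):
    """Return the result for key, computing and storing it on first use."""
    digest = hashlib.sha256(key.encode("utf-8")).hexdigest()  # stand-in hash
    path = os.path.join(cache_root, f"{digest}.pkl")
    if os.path.exists(path):
        with open(path, "rb") as fp:
            return pickle.load(fp)
    result = compute()
    os.makedirs(cache_root, exist_ok=True)
    with open(path, "wb") as fp:
        pickle.dump(result, fp)
    return result


calls = []


def extract():
    calls.append(1)  # count how often the expensive step runs
    return {"mean": 2.0, "std": 0.5}


with tempfile.TemporaryDirectory() as cache_root:
    first = cached(cache_root, "wav/03a01Fa.wav", extract)
    second = cached(cache_root, "wav/03a01Fa.wav", extract)  # read from cache

print(first == second, len(calls))
```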

Parameters
  • index (Index) – index with segment information

  • preserve_index (bool) – if True and audinterface.Feature.process.segment is None, the returned index will be of the same type as the original one; otherwise a segmented index is always returned

  • root (Optional[str]) – root folder to expand relative file paths

  • cache_root (Optional[str]) – cache folder (see description)

  • process_func_args (Optional[Dict[str, Any]]) – (keyword) arguments passed on to the processing function. They will temporarily overwrite the ones stored in audinterface.Feature.process.process_func_args

Raises
Return type

DataFrame

process_signal()

Feature.process_signal(signal, sampling_rate, *, file=None, start=None, end=None, process_func_args=None)

Extract features for an audio signal.

Note

If a file is given, the index of the returned frame has levels file, start and end. Otherwise, it consists only of start and end.

Parameters
Raises
  • RuntimeError – if sampling rates do not match

  • RuntimeError – if channel selection is invalid

  • RuntimeError – if dimension of extracted features is greater than three

  • RuntimeError – if feature extractor uses sliding window, but self.win_dur is not specified

  • RuntimeError – if number of features does not match number of feature names

  • RuntimeError – if multiple frames are returned, but win_dur is not set

Return type

DataFrame

process_signal_from_index()

Feature.process_signal_from_index(signal, sampling_rate, index, process_func_args=None)

Split a signal into segments and extract features for each segment.

Parameters
Raises
Return type

DataFrame

to_numpy()

Feature.to_numpy(frame)

Return feature values as a numpy array.

The returned numpy.ndarray has the original shape, i.e. (channels, features, time).

Parameters

frame (DataFrame) – feature frame

Return type

ndarray
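The reshaping involved can be sketched with NumPy alone. This is an illustration, not the library's code: a plain 2D array stands in for the values of the feature frame, with one row per window and columns assumed to be ordered channel by channel, as in the multi-channel example on this page.

```python
import numpy as np

num_channels, num_features, num_frames = 2, 2, 3

# Stand-in for the frame values: one row per window,
# columns ordered as (channel 0: mean, std), (channel 1: mean, std)
values = np.array(
    [
        [0.0, 1.0, 10.0, 11.0],
        [0.1, 1.1, 10.1, 11.1],
        [0.2, 1.2, 10.2, 11.2],
    ]
)

# Move time to the last axis, then split the columns into channels and features
features = values.T.reshape(num_channels, num_features, num_frames)
print(features.shape)
print(features[1, 0])  # first feature of channel 1 over time
```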

verbose

Feature.verbose

Show debug messages.

win_dur

Feature.win_dur

Window duration.