Feature
- class audinterface.Feature(feature_names, *, name=None, params=None, process_func=None, process_func_args=None, process_func_is_mono=False, process_func_applies_sliding_window=False, sampling_rate=None, resample=False, channels=0, mixdown=False, win_dur=None, hop_dur=None, min_signal_dur=None, max_signal_dur=None, segment=None, keep_nat=False, num_workers=1, multiprocessing=False, verbose=False)
Feature extraction interface.
The features are returned as a pandas.DataFrame. If your input signal is of size (num_channels, num_time_steps), the returned object has num_channels * num_features columns. It will have one row per file or signal.
If features are extracted using a sliding window, each window will be stored as one row. If win_dur is specified, start and end indices are derived from the original start and end arguments and the window positions. If win_dur is None, the original start and end indices are kept. If process_func_applies_sliding_window is set to True, the processing function is responsible for applying the sliding window. Otherwise, the sliding window is applied before the processing function is called.
If the arguments win_dur and hop_dur are not specified in process_func_args, but process_func expects them, they are passed on automatically.
- Parameters
feature_names (Union[str, Sequence[str]]) – features are stored as columns in a data frame, where feature_names defines the names of the columns. If len(channels) > 1, the data frame has a multi-column index with channel ID as first level and feature_names as second level
params (Optional[Dict]) – parameters that describe the feature set, e.g. {'win_size': 512, 'hop_size': 256, 'num_fft': 512}. With the parameters you can differentiate different flavors of the same feature set
process_func (Optional[Callable[..., Any]]) – feature extraction function, which expects the two positional arguments signal and sampling_rate and any number of additional keyword arguments (see process_func_args). There are the following special arguments: 'idx', 'file', 'root'. If expected by the function, but not specified in process_func_args, they will be replaced with: a running index, the currently processed file, the root folder. The function must return features in the shape of (num_features), (num_channels, num_features), (num_features, num_frames), or (num_channels, num_features, num_frames)
process_func_args (Optional[Dict[str, Any]]) – (keyword) arguments passed on to the processing function
process_func_is_mono (bool) – apply process_func to every channel individually
process_func_applies_sliding_window (bool) – if True the processing function receives whole files or segments and is responsible for applying a sliding window itself. If False, the sliding window is applied internally and the processing function receives individual frames instead. Applies only if features are extracted in a framewise manner (see win_dur and hop_dur)
sampling_rate (Optional[int]) – sampling rate in Hz. If None it will call process_func with the actual sampling rate of the signal
resample (bool) – if True enforces given sampling rate by resampling
channels (Union[int, Sequence[int]]) – channel selection, see audresample.remix()
win_dur (Union[float, int, str, Timedelta, None]) – window duration, if features are extracted with a sliding window. If value is a float or integer it is treated as seconds. See audinterface.utils.to_timedelta() for further options
hop_dur (Union[float, int, str, Timedelta, None]) – hop duration, if features are extracted with a sliding window. This defines the shift between two windows. If value is a float or integer it is treated as seconds. See audinterface.utils.to_timedelta() for further options. Defaults to win_dur / 2
min_signal_dur (Union[float, int, str, Timedelta, None]) – minimum signal duration required by process_func. If value is a float or integer it is treated as seconds. See audinterface.utils.to_timedelta() for further options. If provided signal is shorter, it will be zero padded at the end
max_signal_dur (Union[float, int, str, Timedelta, None]) – maximum signal duration required by process_func. If value is a float or integer it is treated as seconds. See audinterface.utils.to_timedelta() for further options. If provided signal is longer, it will be cut at the end
mixdown (bool) – apply mono mix-down on selection
segment (Optional[Segment]) – when an audinterface.Segment object is provided, it will be used to find a segmentation of the input signal. Afterwards processing is applied to each segment
keep_nat (bool) – if the end of segment is set to NaT do not replace with file duration in the result
num_workers (Optional[int]) – number of parallel jobs or 1 for sequential processing. If None will be set to the number of processors on the machine multiplied by 5 in case of multithreading and number of processors in case of multiprocessing
multiprocessing (bool) – use multiprocessing instead of multithreading
verbose (bool) – show debug messages
- Raises
ValueError – if win_dur or hop_dur are given in samples and sampling_rate is None
ValueError – if hop_dur is specified, but not win_dur
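The relation between win_dur, hop_dur and the resulting window positions can be illustrated with a small standard-library sketch. It is not the library's implementation: the helper name window_positions is hypothetical, datetime.timedelta stands in for Timedelta, and it assumes that only windows fitting completely into the signal are kept, which agrees with the sliding-window example in this documentation.

```python
from datetime import timedelta


def window_positions(signal_dur, win_dur=None, hop_dur=None, start=timedelta(0)):
    """Return (start, end) pairs of all windows that fit into the signal.

    Hypothetical helper for illustration only.
    Assumption: incomplete windows at the end of the signal are dropped.
    """
    if win_dur is None:
        if hop_dur is not None:
            # Mirrors the documented ValueError
            raise ValueError("hop_dur is specified, but not win_dur")
        # Without a window the original start and end are kept
        return [(start, start + signal_dur)]
    if hop_dur is None:
        # Documented default: half of the window duration
        hop_dur = win_dur / 2
    if hop_dur <= timedelta(0):
        raise ValueError("hop_dur must be positive")
    positions = []
    offset = timedelta(0)
    while offset + win_dur <= signal_dur:
        positions.append((start + offset, start + offset + win_dur))
        offset += hop_dur
    return positions


windows = window_positions(
    signal_dur=timedelta(seconds=1.89825),
    win_dur=timedelta(seconds=1.0),
    hop_dur=timedelta(seconds=0.25),
)
for win_start, win_end in windows:
    print(win_start, win_end)
```

With a signal of 1.89825 s, a window of 1 s and a hop of 0.25 s this yields four windows, starting at 0 s, 0.25 s, 0.5 s and 0.75 s.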
Examples
>>> import audeer
>>> import numpy as np
>>> from audinterface import Feature
>>> def mean_std(signal, sampling_rate):
...     return [signal.mean(), signal.std()]
>>> interface = Feature(["mean", "std"], process_func=mean_std)
>>> signal = np.array([1.0, 2.0, 3.0])
>>> interface(signal, sampling_rate=3)
array([[[2.        ],
        [0.81649658]]])
>>> interface.process_signal(signal, sampling_rate=3)
                        mean       std
start  end
0 days 0 days 00:00:01   2.0  0.816497
>>> # Apply interface on an audformat conform index of a dataframe
>>> import audb
>>> db = audb.load(
...     "emodb",
...     version="1.3.0",
...     media="wav/03a01Fa.wav",
...     full_path=False,
...     verbose=False,
... )
>>> index = db["emotion"].index
>>> interface.process_index(index, root=db.root)
                                                   mean       std
file            start  end
wav/03a01Fa.wav 0 days 0 days 00:00:01.898250 -0.000311  0.082317
>>> interface.process_index(index, root=db.root, preserve_index=True)
                     mean       std
file
wav/03a01Fa.wav -0.000311  0.082317
>>> # Apply interface with a sliding window
>>> interface = Feature(
...     ["mean", "std"],
...     process_func=mean_std,
...     win_dur=1.0,
...     hop_dur=0.25,
... )
>>> interface.process_index(index, root=db.root)
                                                                   mean       std
file            start                  end
wav/03a01Fa.wav 0 days 00:00:00        0 days 00:00:01        -0.000329  0.098115
                0 days 00:00:00.250000 0 days 00:00:01.250000 -0.000405  0.087917
                0 days 00:00:00.500000 0 days 00:00:01.500000 -0.000285  0.067042
                0 days 00:00:00.750000 0 days 00:00:01.750000 -0.000187  0.063677
>>> # Apply the same process function on all channels
>>> # of a multi-channel signal
>>> import audiofile
>>> signal, sampling_rate = audiofile.read(
...     audeer.path(db.root, db.files[0]),
...     always_2d=True,
... )
>>> signal_multi_channel = np.concatenate(
...     [
...         signal - 0.5,
...         signal + 0.5,
...     ],
... )
>>> interface = Feature(
...     ["mean", "std"],
...     process_func=mean_std,
...     process_func_is_mono=True,
...     channels=[0, 1],
... )
>>> interface.process_signal(
...     signal_multi_channel,
...     sampling_rate,
... )
                                      0                   1
                                   mean       std      mean       std
start  end
0 days 0 days 00:00:01.898250 -0.500311  0.082317  0.499689  0.082317
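The special arguments 'idx', 'file' and 'root' described for process_func are only filled in when the function expects them and the user has not set them in process_func_args. One way such a mechanism could work is sketched below with inspect.signature. This is an illustration under assumptions, not the library's code: the helper call_with_special_args and the function describe are hypothetical names.

```python
import inspect


def call_with_special_args(
    process_func,
    signal,
    sampling_rate,
    *,
    idx,
    file,
    root,
    process_func_args=None,
):
    """Call process_func and fill in special arguments it expects.

    Hypothetical helper for illustration only.
    """
    kwargs = dict(process_func_args or {})
    expected = inspect.signature(process_func).parameters
    special = {"idx": idx, "file": file, "root": root}
    for name, value in special.items():
        # Only fill in what the function asks for
        # and what the user has not set explicitly
        if name in expected and name not in kwargs:
            kwargs[name] = value
    return process_func(signal, sampling_rate, **kwargs)


def describe(signal, sampling_rate, idx, file):
    # Expects 'idx' and 'file', but not 'root'
    return f"{idx}: {file} ({len(signal)} samples at {sampling_rate} Hz)"


result = call_with_special_args(
    describe,
    [0.0, 0.1, 0.2],
    16000,
    idx=0,
    file="a.wav",
    root="/data",
)
print(result)
```

Because describe does not declare a root argument, the root folder is not passed to it, while a value given in process_func_args would take precedence over the automatic one.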
__call__()
- Feature.__call__(signal, sampling_rate)
Apply processing to signal.
This function processes the signal without transforming the output into a pandas.DataFrame. Instead, it will return the raw processed signal. However, if channel selection, mixdown and/or resampling is enabled, the signal will be first remixed and resampled if the input sampling rate does not fit the expected sampling rate.
- Parameters
signal (ndarray) – signal values
sampling_rate (int) – sampling rate in Hz
- Return type
ndarray
- Returns
feature array with shape (num_channels, num_features, num_frames)
- Raises
RuntimeError – if sampling rates do not match
RuntimeError – if channel selection is invalid
RuntimeError – if multiple frames are returned, but win_dur is not set
process_file()
- Feature.process_file(file, *, start=None, end=None, root=None, process_func_args=None)
Extract features from an audio file.
- Parameters
file (str) – file path
start (Union[float, int, str, Timedelta, None]) – start processing at this position. If value is a float or integer it is treated as seconds. See audinterface.utils.to_timedelta() for further options
end (Union[float, int, str, Timedelta, None]) – end processing at this position. If value is a float or integer it is treated as seconds. See audinterface.utils.to_timedelta() for further options
root (Optional[str]) – root folder to expand relative file path
process_func_args (Optional[Dict[str, Any]]) – (keyword) arguments passed on to the processing function. They will temporarily overwrite the ones stored in audinterface.Feature.process.process_func_args
- Raises
RuntimeError – if sampling rates do not match
RuntimeError – if channel selection is invalid
RuntimeError – if multiple frames are returned, but win_dur is not set
- Return type
DataFrame
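How start and end values given as numbers translate into a range of samples can be sketched as follows. This is a simplified stand-in and not audinterface.utils.to_timedelta(): the helper names as_timedelta and sample_range are hypothetical, datetime.timedelta replaces Timedelta, and string values are deliberately not handled.

```python
from datetime import timedelta


def as_timedelta(value):
    """Interpret floats and integers as seconds (hypothetical helper)."""
    if value is None or isinstance(value, timedelta):
        return value
    if isinstance(value, (int, float)):
        return timedelta(seconds=value)
    raise TypeError(f"Unsupported time value in this sketch: {value!r}")


def sample_range(start, end, sampling_rate, num_samples):
    """Return the first and last sample index covered by start and end."""
    start = as_timedelta(start)
    end = as_timedelta(end)
    first = 0 if start is None else round(start.total_seconds() * sampling_rate)
    last = num_samples if end is None else round(end.total_seconds() * sampling_rate)
    # Never read beyond the end of the signal
    return first, min(last, num_samples)


print(sample_range(0.5, 1.5, sampling_rate=16000, num_samples=32000))
```

Passing None for both values selects the whole signal, which corresponds to the defaults start=None and end=None.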
process_files()
- Feature.process_files(files, *, starts=None, ends=None, root=None, process_func_args=None)
Extract features for a list of files.
- Parameters
starts (Union[float, int, str, Timedelta, Sequence[Union[float, int, str, Timedelta]], None]) – segment start positions. Time values given as float or integers are treated as seconds. See audinterface.utils.to_timedelta() for further options. If a scalar is given, it is applied to all files
ends (Union[float, int, str, Timedelta, Sequence[Union[float, int, str, Timedelta]], None]) – segment end positions. Time values given as float or integers are treated as seconds. See audinterface.utils.to_timedelta() for further options. If a scalar is given, it is applied to all files
root (Optional[str]) – root folder to expand relative file paths
process_func_args (Optional[Dict[str, Any]]) – (keyword) arguments passed on to the processing function. They will temporarily overwrite the ones stored in audinterface.Feature.process.process_func_args
- Raises
RuntimeError – if sampling rates do not match
RuntimeError – if channel selection is invalid
RuntimeError – if multiple frames are returned, but win_dur is not set
- Return type
DataFrame
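The rule that a scalar starts or ends value is applied to all files can be pictured with the following sketch. The helper expand_to_files is a hypothetical name, and raising a ValueError on a length mismatch is a choice of this sketch, not documented behavior.

```python
def expand_to_files(value, num_files):
    """Repeat a scalar for every file, keep a sequence as it is.

    Hypothetical helper for illustration only.
    """
    if value is None or isinstance(value, (int, float, str)):
        # Scalar (or None): use the same value for every file
        return [value] * num_files
    values = list(value)
    if len(values) != num_files:
        # Assumption of this sketch, not documented behavior
        raise ValueError("Need one value per file")
    return values


files = ["a.wav", "b.wav", "c.wav"]
starts = expand_to_files(0.5, len(files))
ends = expand_to_files([1.0, 2.0, 3.0], len(files))
for file, start, end in zip(files, starts, ends):
    print(file, start, end)
```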
process_folder()
- Feature.process_folder(root, *, filetype='wav', include_root=True, process_func_args=None)
Extract features from files in a folder.
Note
At the moment does not scan in sub-folders!
- Parameters
root (str) – root folder
filetype (str) – file extension
include_root (bool) – if True the file paths are absolute in the index of the returned result
process_func_args (Optional[Dict[str, Any]]) – (keyword) arguments passed on to the processing function. They will temporarily overwrite the ones stored in audinterface.Feature.process.process_func_args
- Raises
FileNotFoundError – if folder does not exist
RuntimeError – if sampling rates do not match
RuntimeError – if channel selection is invalid
RuntimeError – if multiple frames are returned, but win_dur is not set
- Return type
DataFrame
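The effect of filetype, include_root and the missing sub-folder scan can be reproduced with pathlib. The sketch below only lists files and does not extract features. The helper list_audio_files is a hypothetical name, and returning bare file names for include_root=False is an assumption about what a relative path means here.

```python
import tempfile
from pathlib import Path


def list_audio_files(root, filetype="wav", include_root=True):
    """List files with the given extension directly inside root.

    Hypothetical helper for illustration only.
    """
    root = Path(root)
    if not root.is_dir():
        raise FileNotFoundError(f"Folder does not exist: {root}")
    # glob() without '**' does not descend into sub-folders
    files = sorted(p for p in root.glob(f"*.{filetype}") if p.is_file())
    if include_root:
        return [str(p.resolve()) for p in files]
    return [p.name for p in files]


with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "a.wav").touch()
    (Path(tmp) / "notes.txt").touch()
    (Path(tmp) / "sub").mkdir()
    (Path(tmp) / "sub" / "b.wav").touch()
    relative = list_audio_files(tmp, include_root=False)
    absolute = list_audio_files(tmp, include_root=True)

print(relative)
```

Only a.wav is found: notes.txt has the wrong extension and sub/b.wav sits in a sub-folder.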
process_func_applies_sliding_window
- Feature.process_func_applies_sliding_window
Controls if processing function applies sliding window.
process_index()
- Feature.process_index(index, *, preserve_index=False, root=None, cache_root=None, process_func_args=None)
Extract features from an index conforming to audformat.
If cache_root is not None, a hash value is created from the index using audformat.utils.hash() and the result is stored as <cache_root>/<hash>.pkl. When called again with the same index, features will be read from the cached file.
- Parameters
index (Index) – index with segment information
preserve_index (bool) – if True and audinterface.Feature.process.segment is None the returned index will be of the same type as the original one, otherwise a segmented index is always returned
root (Optional[str]) – root folder to expand relative file paths
process_func_args (Optional[Dict[str, Any]]) – (keyword) arguments passed on to the processing function. They will temporarily overwrite the ones stored in audinterface.Feature.process.process_func_args
- Raises
RuntimeError – if sampling rates do not match
RuntimeError – if channel selection is invalid
RuntimeError – if multiple frames are returned, but win_dur is not set
ValueError – if index does not conform to audformat
- Return type
DataFrame
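The caching behavior described for cache_root follows a common pattern: derive a hash from the index, and reuse <cache_root>/<hash>.pkl if it exists. A minimal sketch of that pattern is given below. It is not the library's code: hashlib.sha256 over the repr of a plain list stands in for audformat.utils.hash(), and cached_features and fake_extract are hypothetical names.

```python
import hashlib
import pickle
import tempfile
from pathlib import Path


def cached_features(index, extract, cache_root=None):
    """Run extract(index) or load its result from a cache file.

    Hypothetical helper for illustration only.
    """
    if cache_root is None:
        return extract(index)
    # Stand-in for audformat.utils.hash()
    digest = hashlib.sha256(repr(list(index)).encode("utf-8")).hexdigest()
    cache_path = Path(cache_root) / f"{digest}.pkl"
    if cache_path.exists():
        with open(cache_path, "rb") as fp:
            return pickle.load(fp)
    features = extract(index)
    cache_path.parent.mkdir(parents=True, exist_ok=True)
    with open(cache_path, "wb") as fp:
        pickle.dump(features, fp)
    return features


calls = []


def fake_extract(index):
    # Records every real extraction so cache hits become visible
    calls.append(len(index))
    return {entry: 0.0 for entry in index}


index = [("a.wav", 0.0, 1.0), ("b.wav", 0.0, 2.0)]
with tempfile.TemporaryDirectory() as cache_root:
    first = cached_features(index, fake_extract, cache_root)
    second = cached_features(index, fake_extract, cache_root)

print(len(calls))
```

The second call finds the pickle file written by the first one, so the extraction function runs only once.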
process_signal()
- Feature.process_signal(signal, sampling_rate, *, file=None, start=None, end=None, process_func_args=None)
Extract features for an audio signal.
Note
If a file is given, the index of the returned frame has levels file, start and end. Otherwise, it consists only of start and end.
- Parameters
signal (ndarray) – signal values
sampling_rate (int) – sampling rate in Hz
start (Union[float, int, str, Timedelta, None]) – start processing at this position. If value is a float or integer it is treated as seconds. See audinterface.utils.to_timedelta() for further options
end (Union[float, int, str, Timedelta, None]) – end processing at this position. If value is a float or integer it is treated as seconds. See audinterface.utils.to_timedelta() for further options
process_func_args (Optional[Dict[str, Any]]) – (keyword) arguments passed on to the processing function. They will temporarily overwrite the ones stored in audinterface.Feature.process.process_func_args
- Raises
RuntimeError – if sampling rates do not match
RuntimeError – if channel selection is invalid
RuntimeError – if dimension of extracted features is greater than three
RuntimeError – if feature extractor uses sliding window, but self.win_dur is not specified
RuntimeError – if number of features does not match number of feature names
RuntimeError – if multiple frames are returned, but win_dur is not set
- Return type
DataFrame
process_signal_from_index()
- Feature.process_signal_from_index(signal, sampling_rate, index, process_func_args=None)
Split a signal into segments and extract features for each segment.
- Parameters
signal (ndarray) – signal values
sampling_rate (int) – sampling rate in Hz
index (MultiIndex) – a pandas.MultiIndex with two levels named start and end that hold start and end positions as pandas.Timedelta objects. See also audinterface.utils.signal_index()
process_func_args (Optional[Dict[str, Any]]) – (keyword) arguments passed on to the processing function. They will temporarily overwrite the ones stored in audinterface.Feature.process.process_func_args
- Raises
RuntimeError – if sampling rates do not match
RuntimeError – if channel selection is invalid
RuntimeError – if multiple frames are returned, but win_dur is not set
ValueError – if index contains duplicates
- Return type
DataFrame
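Splitting a signal by an index of start and end positions amounts to converting each pair into sample indices and processing the resulting slices. The sketch below shows the idea with plain Python objects: a list of (start, end) tuples of datetime.timedelta stands in for the pandas.MultiIndex, a flat list stands in for the signal array, and process_segments and mean are hypothetical names.

```python
from datetime import timedelta


def process_segments(signal, sampling_rate, index, process_func):
    """Apply process_func to every (start, end) segment of the signal.

    Hypothetical helper for illustration only.
    """
    if len(set(index)) != len(index):
        # Mirrors the documented ValueError
        raise ValueError("index contains duplicates")
    results = {}
    for start, end in index:
        first = round(start.total_seconds() * sampling_rate)
        last = round(end.total_seconds() * sampling_rate)
        results[(start, end)] = process_func(signal[first:last], sampling_rate)
    return results


def mean(segment, sampling_rate):
    return sum(segment) / len(segment)


signal = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0]
index = [
    (timedelta(seconds=0), timedelta(seconds=1)),
    (timedelta(seconds=1), timedelta(seconds=2)),
]
features = process_segments(signal, 4, index, mean)
for (start, end), value in features.items():
    print(start, end, value)
```

With a sampling rate of 4 Hz, each one-second segment covers four samples, giving the means 1.5 and 5.5.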
to_numpy()
- Feature.to_numpy(frame)
Return feature values as a numpy array.
The returned numpy.ndarray has the original shape, i.e. (channels, features, time).
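The mapping from a feature table back to the shape (channels, features, time) can be sketched without any dependencies. The sketch assumes one row per frame and num_channels * num_features columns ordered by channel first and feature second, as described for the returned data frame. Nested lists stand in for pandas.DataFrame and numpy.ndarray, and rows_to_array is a hypothetical name.

```python
def rows_to_array(rows, num_channels, num_features):
    """Rearrange frame-wise rows into [channel][feature][frame].

    Hypothetical helper for illustration only.
    Assumption: columns are ordered by channel first, feature second.
    """
    num_frames = len(rows)
    return [
        [
            [rows[t][c * num_features + f] for t in range(num_frames)]
            for f in range(num_features)
        ]
        for c in range(num_channels)
    ]


# 3 frames, 2 channels, 2 features per channel
rows = [
    [1, 2, 3, 4],
    [5, 6, 7, 8],
    [9, 10, 11, 12],
]
array = rows_to_array(rows, num_channels=2, num_features=2)
print(array)
```

The first feature of the first channel, array[0][0], then holds the values 1, 5 and 9 across the three frames.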