SegmentWithFeature
- class audinterface.SegmentWithFeature(feature_names, *, name=None, params=None, process_func=None, process_func_args=None, sampling_rate=None, resample=False, channels=0, mixdown=False, min_signal_dur=None, max_signal_dur=None, keep_nat=False, num_workers=1, multiprocessing=False, verbose=False)
Segmentation with feature interface.
Interface for functions that apply a segmentation to the input signal, and also compute features for those segments at the same time, e.g. a speech recognition model that recognizes speech and also provides the time stamps of that speech.
The features are returned as a pandas.DataFrame with num_features columns and one row per detected segment.
- Parameters
feature_names (str | Sequence[str]) – features are stored as columns in a data frame, where feature_names defines the names of the columns.
params (Optional[dict[str, object]]) – parameters that describe the feature set, e.g. {'win_size': 512, 'hop_size': 256, 'num_fft': 512}. With the parameters you can differentiate different flavors of the same feature set
process_func (Optional[Callable[..., Series]]) – segmentation with feature function, which expects the two positional arguments signal and sampling_rate and any number of additional keyword arguments (see process_func_args). There are the following special arguments: 'idx', 'file', 'root'. If expected by the function, but not specified in process_func_args, they will be replaced with: a running index, the currently processed file, the root folder. Must return a pandas.Series with a pandas.MultiIndex with two levels named start and end that hold start and end positions as pandas.Timedelta objects, and with elements in the shape of (num_features).
process_func_args (Optional[dict[str, object]]) – (keyword) arguments passed on to the processing function
sampling_rate (Optional[int]) – sampling rate in Hz. If None it will call process_func with the actual sampling rate of the signal
resample (bool) – if True enforces given sampling rate by resampling
channels (int | Sequence[int]) – channel selection, see audresample.remix()
mixdown (bool) – apply mono mix-down on selection
min_signal_dur (Union[float, int, str, Timedelta, None]) – minimum signal length required by process_func. If value is a float or integer it is treated as seconds. See audinterface.utils.to_timedelta() for further options. If provided signal is shorter, it will be zero padded at the end
max_signal_dur (Union[float, int, str, Timedelta, None]) – maximum signal length required by process_func. If value is a float or integer it is treated as seconds. See audinterface.utils.to_timedelta() for further options. If provided signal is longer, it will be cut at the end
keep_nat (bool) – if the end of segment is set to NaT do not replace with file duration in the result
num_workers (Optional[int]) – number of parallel jobs or 1 for sequential processing. If None will be set to the number of processors on the machine multiplied by 5 in case of multithreading and number of processors in case of multiprocessing
multiprocessing (bool) – use multiprocessing instead of multithreading
verbose (bool) – show debug messages
- Raises
ValueError – if resample = True, but sampling_rate = None
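The padding and cutting behaviour that the min_signal_dur and max_signal_dur parameters describe can be sketched with NumPy. The helper name fit_duration is hypothetical, and rounding durations to whole samples is an assumption; the library's own implementation may differ.

```python
import numpy as np


def fit_duration(signal, sampling_rate, min_dur=None, max_dur=None):
    """Hypothetical helper: zero pad or cut a signal at its end.

    Durations are given in seconds; rounding to samples is an assumption.
    """
    signal = np.atleast_2d(signal)  # shape (channels, samples)
    if max_dur is not None:
        max_samples = int(round(max_dur * sampling_rate))
        # Cut at the end if the signal is too long
        signal = signal[:, :max_samples]
    if min_dur is not None:
        min_samples = int(round(min_dur * sampling_rate))
        missing = min_samples - signal.shape[1]
        if missing > 0:
            # Zero pad at the end if the signal is too short
            signal = np.pad(signal, ((0, 0), (0, missing)))
    return signal


padded = fit_duration(np.ones(3), sampling_rate=2, min_dur=2.0)
cut = fit_duration(np.ones(10), sampling_rate=2, max_dur=2.0)
```

Both results hold four samples per channel, which corresponds to 2 seconds at a sampling rate of 2 Hz.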
Examples
>>> import numpy as np
>>> import pandas as pd
>>> from audinterface import SegmentWithFeature, utils
>>> def segment_with_mean_std(signal, sampling_rate, *, win_size=1.0, hop_size=1.0):
...     size = signal.shape[1] / sampling_rate
...     starts = pd.to_timedelta(
...         np.arange(0, size - win_size + (1 / sampling_rate), hop_size),
...         unit="s",
...     )
...     ends = pd.to_timedelta(
...         np.arange(win_size, size + (1 / sampling_rate), hop_size), unit="s"
...     )
...     # Get windows of shape (channels, samples, frames)
...     frames = utils.sliding_window(signal, sampling_rate, win_size, hop_size)
...     means = frames.mean(axis=(0, 1))
...     stds = frames.std(axis=(0, 1))
...     index = pd.MultiIndex.from_tuples(zip(starts, ends), names=["start", "end"])
...     features = list(np.stack((means, stds), axis=-1))
...     return pd.Series(data=features, index=index)
>>> interface = SegmentWithFeature(
...     feature_names=["mean", "std"], process_func=segment_with_mean_std
... )
>>> signal = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
>>> interface(signal, sampling_rate=2)
start            end
0 days 00:00:00  0 days 00:00:01    [1.5, 0.5]
0 days 00:00:01  0 days 00:00:02    [3.5, 0.5]
0 days 00:00:02  0 days 00:00:03    [5.5, 0.5]
dtype: object
>>> interface.process_signal(signal, sampling_rate=2)
                                 mean  std
start           end
0 days 00:00:00 0 days 00:00:01   1.5  0.5
0 days 00:00:01 0 days 00:00:02   3.5  0.5
0 days 00:00:02 0 days 00:00:03   5.5  0.5
>>> # Apply interface on an audformat conform index of a dataframe
>>> import audb
>>> db = audb.load(
...     "emodb",
...     version="1.3.0",
...     media="wav/03a01Fa.wav",
...     full_path=False,
...     verbose=False,
... )
>>> index = db["emotion"].index
>>> interface.process_index(index, root=db.root)
                                            mean       std
file            start  end
wav/03a01Fa.wav 0 days 0 days 00:00:01 -0.000329  0.098115
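The example shows two views of the same result: a pandas.Series whose elements have shape (num_features), and a pandas.DataFrame with one column per feature name. The relation between the two can be sketched with pandas alone. This is only an illustration of the data layout, not the library's internal conversion, and the values are copied from the example output.

```python
import numpy as np
import pandas as pd

feature_names = ["mean", "std"]

# Series as a process_func is expected to return it:
# (start, end) MultiIndex of Timedelta values, one feature vector per segment
starts = pd.to_timedelta([0.0, 1.0, 2.0], unit="s")
ends = pd.to_timedelta([1.0, 2.0, 3.0], unit="s")
index = pd.MultiIndex.from_arrays([starts, ends], names=["start", "end"])
series = pd.Series(
    data=[np.array([1.5, 0.5]), np.array([3.5, 0.5]), np.array([5.5, 0.5])],
    index=index,
)

# Stack the per-segment vectors into a (num_segments, num_features) matrix
# and label the columns with the feature names
frame = pd.DataFrame(
    np.stack(series.to_list()),
    index=series.index,
    columns=feature_names,
)
```

The resulting frame has one row per segment and the columns mean and std, matching the layout of the process_signal() output in the example.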
__call__()
- SegmentWithFeature.__call__(signal, sampling_rate)
Apply processing to signal.
This function processes the signal without transforming the output into a pd.DataFrame. Instead, it will return the raw processed signal. However, if channel selection, mixdown and/or resampling is enabled, the signal will be first remixed and resampled if the input sampling rate does not fit the expected sampling rate.
- Parameters
- Return type
- Returns
Processed signal
- Raises
RuntimeError – if sampling rates do not match
RuntimeError – if channel selection is invalid
ValueError – if the process function doesn’t return a pd.Series with index conform to audformat and elements of shape (num_features)
process_file()
- SegmentWithFeature.process_file(file, *, start=None, end=None, root=None, process_func_args=None)
Segment the content of an audio file and extract features.
- Parameters
file (str) – file path
start (Union[float, int, str, Timedelta, None]) – start processing at this position. If value is a float or integer it is treated as seconds. See audinterface.utils.to_timedelta() for further options
end (Union[float, int, str, Timedelta, None]) – end processing at this position. If value is a float or integer it is treated as seconds. See audinterface.utils.to_timedelta() for further options
root (Optional[str]) – root folder to expand relative file path
process_func_args (Optional[dict[str, object]]) – (keyword) arguments passed on to the processing function. They will temporarily overwrite the ones stored in audinterface.SegmentWithFeature.process.process_func_args
- Return type
- Returns
pandas.DataFrame with segmented index conform to audformat
- Raises
RuntimeError – if sampling rates do not match
RuntimeError – if channel selection is invalid
ValueError – if the process function doesn’t return a pd.Series with index conform to audformat and elements of shape (num_features)
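How start and end restrict processing to a part of the signal can be sketched as follows. The helper crop is hypothetical, it only handles numbers (treated as seconds) and pandas.Timedelta values, and rounding to whole samples is an assumption. See audinterface.utils.to_timedelta() for the options the library actually accepts.

```python
import numpy as np
import pandas as pd


def crop(signal, sampling_rate, start=None, end=None):
    """Hypothetical helper: select the samples between start and end."""

    def to_samples(value, default):
        if value is None:
            return default
        if isinstance(value, pd.Timedelta):
            value = value.total_seconds()
        # Floats and integers are treated as seconds
        return int(round(value * sampling_rate))

    signal = np.atleast_2d(signal)  # shape (channels, samples)
    first = to_samples(start, 0)
    last = to_samples(end, signal.shape[1])
    return signal[:, first:last]


signal = np.arange(8.0)
part = crop(signal, sampling_rate=2, start=1, end=pd.Timedelta(seconds=3))
```

With a sampling rate of 2 Hz, a start of 1 second and an end of 3 seconds select the samples at positions 2 to 5.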
process_files()
- SegmentWithFeature.process_files(files, *, starts=None, ends=None, root=None, process_func_args=None)
Segment and extract features for a list of files.
- Parameters
starts (Union[float, int, str, Timedelta, Sequence[Union[float, int, str, Timedelta]], None]) – segment start positions. Time values given as float or integers are treated as seconds. See audinterface.utils.to_timedelta() for further options. If a scalar is given, it is applied to all files
ends (Union[float, int, str, Timedelta, Sequence[Union[float, int, str, Timedelta]], None]) – segment end positions. Time values given as float or integers are treated as seconds. See audinterface.utils.to_timedelta() for further options. If a scalar is given, it is applied to all files
root (Optional[str]) – root folder to expand relative file paths
process_func_args (Optional[dict[str, object]]) – (keyword) arguments passed on to the processing function. They will temporarily overwrite the ones stored in audinterface.SegmentWithFeature.process.process_func_args
- Return type
- Returns
pandas.DataFrame with segmented index conform to audformat
- Raises
RuntimeError – if sampling rates do not match
RuntimeError – if channel selection is invalid
ValueError – if the process function doesn’t return a pd.Series with index conform to audformat and elements of shape (num_features)
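The rule that a scalar starts or ends value is applied to all files can be pictured as a simple broadcast. The helper broadcast below is hypothetical, and raising a ValueError on a length mismatch is an assumption that the documentation above does not state.

```python
def broadcast(value, num_files):
    """Hypothetical helper: expand a scalar to one entry per file."""
    if isinstance(value, (list, tuple)):
        values = list(value)
        if len(values) != num_files:
            # Assumed behaviour, not taken from the library
            raise ValueError("need exactly one value per file")
        return values
    # Scalars (including None) are repeated for every file
    return [value] * num_files


files = ["a.wav", "b.wav", "c.wav"]
starts = broadcast(0.5, len(files))
ends = broadcast([1.0, 2.0, 3.0], len(files))
segments = list(zip(files, starts, ends))
```

Every file then has its own start and end position, here 0.5 seconds as the common start and an individual end per file.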
process_folder()
- SegmentWithFeature.process_folder(root, *, filetype='wav', include_root=True, process_func_args=None)
Segment and extract features for files in a folder.
Note
Sub-folders are currently not scanned.
- Parameters
root (str) – root folder
filetype (str) – file extension
include_root (bool) – if True the file paths are absolute in the index of the returned result
process_func_args (Optional[dict[str, object]]) – (keyword) arguments passed on to the processing function. They will temporarily overwrite the ones stored in audinterface.SegmentWithFeature.process.process_func_args
- Return type
- Returns
pandas.DataFrame with segmented index conform to audformat
- Raises
FileNotFoundError – if folder does not exist
RuntimeError – if sampling rates do not match
RuntimeError – if channel selection is invalid
ValueError – if the process function doesn’t return a pd.Series with index conform to audformat and elements of shape (num_features)
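Which files a non-recursive folder scan picks up, and what include_root changes, can be sketched with the standard library. The helper list_files is hypothetical and only mirrors the documented behaviour (no sub-folders, FileNotFoundError for a missing folder). The sorting and the extension matching are assumptions.

```python
import os
import tempfile


def list_files(root, filetype="wav", include_root=True):
    """Hypothetical helper: list files of one type directly inside root."""
    if not os.path.isdir(root):
        raise FileNotFoundError(root)
    names = sorted(
        entry.name
        for entry in os.scandir(root)  # does not descend into sub-folders
        if entry.is_file() and entry.name.endswith("." + filetype)
    )
    if include_root:
        return [os.path.join(os.path.abspath(root), name) for name in names]
    return names


with tempfile.TemporaryDirectory() as folder:
    os.makedirs(os.path.join(folder, "sub"))
    for path in ["a.wav", "b.txt", os.path.join("sub", "c.wav")]:
        open(os.path.join(folder, path), "w").close()
    relative = list_files(folder, include_root=False)
    absolute = list_files(folder)
```

Only a.wav is found: b.txt has the wrong extension and sub/c.wav lies in a sub-folder.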
process_index()
- SegmentWithFeature.process_index(index, *, root=None, cache_root=None, process_func_args=None)
Segment and extract features for files or segments from an index.
If cache_root is not None, a hash value is created from the index using audformat.utils.hash() and the result is stored as <cache_root>/<hash>.pkl. When called again with the same index, results will be read from the cached file.
- Parameters
- Return type
- Returns
pandas.DataFrame with segmented index conform to audformat
- Raises
RuntimeError – if sampling rates do not match
RuntimeError – if channel selection is invalid
ValueError – if the process function doesn’t return a pd.Series with index conform to audformat and elements of shape (num_features)
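The caching pattern behind cache_root, a result stored under <cache_root>/<hash>.pkl and read back on the next call, can be sketched with the standard library. The function cached is hypothetical, and the MD5 digest of repr(index) is only a stand-in for audformat.utils.hash(), whose hashing scheme is not reproduced here.

```python
import hashlib
import os
import pickle
import tempfile


def cached(index, cache_root, compute):
    """Hypothetical helper: compute a result once per index and cache it."""
    # Stand-in for audformat.utils.hash(); not the library's hash
    digest = hashlib.md5(repr(index).encode("utf-8")).hexdigest()
    path = os.path.join(cache_root, digest + ".pkl")
    if os.path.exists(path):
        # Same index as before: read the result from the cached file
        with open(path, "rb") as fp:
            return pickle.load(fp)
    result = compute(index)
    os.makedirs(cache_root, exist_ok=True)
    with open(path, "wb") as fp:
        pickle.dump(result, fp)
    return result


calls = []


def compute(index):
    calls.append(index)  # record how often the expensive step runs
    return {file: len(file) for file in index}


with tempfile.TemporaryDirectory() as cache_root:
    index = ("a.wav", "bb.wav")
    first = cached(index, cache_root, compute)
    second = cached(index, cache_root, compute)
```

The second call returns the same result without running compute again.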
process_signal()
- SegmentWithFeature.process_signal(signal, sampling_rate, *, file=None, start=None, end=None, process_func_args=None)
Segment and extract features for audio signal.
Note
If a file is given, the index of the returned frame has levels file, start and end. Otherwise, it consists only of start and end.
- Parameters
signal (ndarray) – signal values
sampling_rate (int) – sampling rate in Hz
start (Union[float, int, str, Timedelta, None]) – start processing at this position. If value is a float or integer it is treated as seconds. See audinterface.utils.to_timedelta() for further options
end (Union[float, int, str, Timedelta, None]) – end processing at this position. If value is a float or integer it is treated as seconds. See audinterface.utils.to_timedelta() for further options
process_func_args (Optional[dict[str, object]]) – (keyword) arguments passed on to the processing function. They will temporarily overwrite the ones stored in audinterface.SegmentWithFeature.process.process_func_args
- Return type
- Returns
pandas.DataFrame with segmented index conform to audformat
- Raises
RuntimeError – if sampling rates do not match
RuntimeError – if channel selection is invalid
ValueError – if the process function doesn’t return a pd.Series with index conform to audformat and elements of shape (num_features)
process_signal_from_index()
- SegmentWithFeature.process_signal_from_index(signal, sampling_rate, index, process_func_args=None)
Segment and extract features for parts of a signal.
- Parameters
signal (ndarray) – signal values
sampling_rate (int) – sampling rate in Hz
index (Index) – a segmented index conform to audformat or a pandas.MultiIndex with two levels named start and end that hold start and end positions as pandas.Timedelta objects. See also audinterface.utils.signal_index()
process_func_args (Optional[dict[str, object]]) – (keyword) arguments passed on to the processing function. They will temporarily overwrite the ones stored in audinterface.SegmentWithFeature.process.process_func_args
- Return type
- Returns
pandas.DataFrame with segmented index conform to audformat
- Raises
RuntimeError – if sampling rates do not match
RuntimeError – if channel selection is invalid
ValueError – if index contains duplicates
ValueError – if the process function doesn’t return a pd.Series with index conform to audformat and elements of shape (num_features)
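An index of the expected form, and a way to avoid the ValueError for duplicates, can be sketched with pandas alone. The helper require_unique is hypothetical and only imitates the documented error condition. The library's own helper audinterface.utils.signal_index() is not used here.

```python
import pandas as pd


def require_unique(index):
    """Hypothetical check mirroring the documented ValueError."""
    if index.has_duplicates:
        raise ValueError("index contains duplicates")
    return index


# MultiIndex with levels start and end holding Timedelta values;
# the first and the last segment are identical
starts = pd.to_timedelta([0.0, 0.5, 0.0], unit="s")
ends = pd.to_timedelta([0.5, 1.0, 0.5], unit="s")
index = pd.MultiIndex.from_arrays([starts, ends], names=["start", "end"])

# Dropping the duplicated segment yields an index that passes the check
unique_index = require_unique(index.drop_duplicates())
```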
process_table()
- SegmentWithFeature.process_table(table, *, root=None, cache_root=None, process_func_args=None)
Segment and extract features for files or segments from a table.
The labels of the table are reassigned to the new segments. The columns of the table may not overlap with the audinterface.SegmentWithFeature.feature_names.
If cache_root is not None, a hash value is created from the index using audformat.utils.hash() and the result is stored as <cache_root>/<hash>.pkl. When called again with the same index, results will be read from the cached file.
- Parameters
table (Series | DataFrame) – pandas.Series or pandas.DataFrame with an index conform to audformat
root (Optional[str]) – root folder to expand relative file paths
process_func_args (Optional[dict[str, object]]) – (keyword) arguments passed on to the processing function. They will temporarily overwrite the ones stored in audinterface.SegmentWithFeature.process.process_func_args
- Return type
- Returns
pandas.DataFrame with segmented index conform to audformat
- Raises
ValueError – if table is not a pandas.Series or a pandas.DataFrame
ValueError – if the table columns and the extracted feature columns overlap
ValueError – if the process function doesn’t return a pd.Series with index conform to audformat and elements of shape (num_features)
RuntimeError – if sampling rates do not match
RuntimeError – if channel selection is invalid
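The two table-related error conditions listed under Raises can be checked before calling process_table(). The helper check_table below is hypothetical. Treating the name of a pandas.Series as its single column is an assumption made for this sketch.

```python
import pandas as pd


def check_table(table, feature_names):
    """Hypothetical pre-check for the documented ValueError cases."""
    if not isinstance(table, (pd.Series, pd.DataFrame)):
        raise ValueError("table has to be a pandas.Series or pandas.DataFrame")
    if isinstance(table, pd.Series):
        # Assumption: a series counts as a single column named after it
        table = table.to_frame()
    overlap = sorted(set(table.columns) & set(feature_names))
    if overlap:
        raise ValueError(f"columns overlap with feature names: {overlap}")


feature_names = ["mean", "std"]
table = pd.DataFrame(
    {"speaker": ["s1"], "mean": [0.0]},
    index=pd.Index(["a.wav"], name="file"),
)

try:
    check_table(table, feature_names)
    overlap_detected = False
except ValueError:
    overlap_detected = True

# Dropping or renaming the clashing column resolves the conflict
check_table(table.rename(columns={"mean": "label_mean"}), feature_names)
```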