Segment
- class audinterface.Segment(*, process_func=None, process_func_args=None, invert=False, sampling_rate=None, resample=False, channels=None, mixdown=False, min_signal_dur=None, max_signal_dur=None, keep_nat=False, num_workers=1, multiprocessing=False, verbose=False)
Segmentation interface.
Interface for models that apply a segmentation to the input signal, e.g. a voice activity model that detects speech regions.
- Parameters
process_func (Optional[Callable[..., MultiIndex]]) – segmentation function, which expects the two positional arguments signal and sampling_rate and any number of additional keyword arguments (see process_func_args). There are the following special arguments: 'idx', 'file', 'root'. If expected by the function, but not specified in process_func_args, they will be replaced with: a running index, the currently processed file, the root folder. Must return a pandas.MultiIndex with two levels named start and end that hold start and end positions as pandas.Timedelta objects
process_func_args (Optional[Dict[str, Any]]) – (keyword) arguments passed on to the processing function
invert (bool) – invert the segmentation
sampling_rate (Optional[int]) – sampling rate in Hz. If None it will call process_func with the actual sampling rate of the signal
resample (bool) – if True enforces given sampling rate by resampling
channels (Union[int, Sequence[int], None]) – channel selection, see audresample.remix()
mixdown (bool) – apply mono mix-down on selection
min_signal_dur (Union[float, int, str, Timedelta, None]) – minimum signal length required by process_func. If value is a float or integer it is treated as seconds. See audinterface.utils.to_timedelta() for further options. If provided signal is shorter, it will be zero padded at the end
max_signal_dur (Union[float, int, str, Timedelta, None]) – maximum signal length required by process_func. If value is a float or integer it is treated as seconds. See audinterface.utils.to_timedelta() for further options. If provided signal is longer, it will be cut at the end
keep_nat (bool) – if the end of segment is set to NaT do not replace with file duration in the result
num_workers (Optional[int]) – number of parallel jobs or 1 for sequential processing. If None will be set to the number of processors on the machine multiplied by 5 in case of multithreading and number of processors in case of multiprocessing
multiprocessing (bool) – use multiprocessing instead of multithreading
verbose (bool) – show debug messages
- Raises
ValueError – if resample = True, but sampling_rate = None
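The contract for process_func described above (two positional arguments, optional keyword arguments, and a pandas.MultiIndex with Timedelta levels start and end as return value) can be illustrated with a simple amplitude detector. This is a minimal sketch, not a real voice activity model: the function name detect_active and its threshold argument are hypothetical, and the signal is assumed to have the shape (channels, samples).

```python
import numpy as np
import pandas as pd


def detect_active(signal, sampling_rate, *, threshold=0.1):
    """Return regions where the absolute amplitude exceeds a threshold.

    Illustrative stand-in for a voice activity model (hypothetical).
    """
    # Use the first channel of a (channels, samples) signal
    active = np.abs(np.atleast_2d(signal)[0]) > threshold
    # Pad with False so every active run has a rising and a falling edge
    padded = np.concatenate([[False], active, [False]])
    edges = np.diff(padded.astype(int))
    starts = np.flatnonzero(edges == 1)  # first active sample of each run
    ends = np.flatnonzero(edges == -1)  # first inactive sample after each run
    return pd.MultiIndex.from_arrays(
        [
            pd.to_timedelta(starts / sampling_rate, unit="s"),
            pd.to_timedelta(ends / sampling_rate, unit="s"),
        ],
        names=["start", "end"],
    )


signal = np.array([[0.0, 0.0, 0.5, 0.6, 0.0, 0.0, 0.7, 0.0]])
index = detect_active(signal, sampling_rate=4)
```

A function of this shape could be passed as process_func, with threshold supplied through process_func_args.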
Examples
>>> import numpy as np
>>> import pandas as pd
>>> from audinterface import Segment
>>> def segment(signal, sampling_rate, *, win_size=0.2, hop_size=0.1):
...     size = signal.shape[1] / sampling_rate
...     starts = pd.to_timedelta(np.arange(0, size - win_size, hop_size), unit="s")
...     ends = pd.to_timedelta(np.arange(win_size, size, hop_size), unit="s")
...     return pd.MultiIndex.from_tuples(zip(starts, ends), names=["start", "end"])
>>> interface = Segment(process_func=segment)
>>> signal = np.array([1.0, 2.0, 3.0])
>>> interface(signal, sampling_rate=3)
MultiIndex([(       '0 days 00:00:00', '0 days 00:00:00.200000'),
            ('0 days 00:00:00.100000', '0 days 00:00:00.300000'),
            ('0 days 00:00:00.200000', '0 days 00:00:00.400000'),
            ('0 days 00:00:00.300000', '0 days 00:00:00.500000'),
            ('0 days 00:00:00.400000', '0 days 00:00:00.600000'),
            ('0 days 00:00:00.500000', '0 days 00:00:00.700000'),
            ('0 days 00:00:00.600000', '0 days 00:00:00.800000'),
            ('0 days 00:00:00.700000', '0 days 00:00:00.900000')],
           names=['start', 'end'])
>>> # Apply the interface to an index conforming to audformat
>>> import audb
>>> db = audb.load(
...     "emodb",
...     version="1.3.0",
...     media="wav/03a01Fa.wav",
...     full_path=False,
...     verbose=False,
... )
>>> interface = Segment(
...     process_func=segment,
...     process_func_args={"win_size": 0.5, "hop_size": 0.25},
... )
>>> interface.process_index(db["emotion"].index, root=db.root)
MultiIndex([('wav/03a01Fa.wav',        '0 days 00:00:00', ...),
            ('wav/03a01Fa.wav', '0 days 00:00:00.250000', ...),
            ('wav/03a01Fa.wav', '0 days 00:00:00.500000', ...),
            ('wav/03a01Fa.wav', '0 days 00:00:00.750000', ...),
            ('wav/03a01Fa.wav',        '0 days 00:00:01', ...),
            ('wav/03a01Fa.wav', '0 days 00:00:01.250000', ...)],
           names=['file', 'start', 'end'])
__call__()
- Segment.__call__(signal, sampling_rate)
Apply processing to signal.
This function processes the signal without transforming the output into a pd.MultiIndex. Instead, it will return the raw processed signal. However, if channel selection, mixdown and/or resampling is enabled, the signal will be first remixed and resampled if the input sampling rate does not fit the expected sampling rate.
- Parameters
- Return type
- Returns
Processed signal
- Raises
RuntimeError – if sampling rates do not match
RuntimeError – if channel selection is invalid
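The remixing step mentioned above (channel selection followed by an optional mono mix-down) can be pictured with plain NumPy. This is only a sketch under the assumption of a (channels, samples) array; the helper select_channels is hypothetical and does not reproduce audresample.remix().

```python
import numpy as np


def select_channels(signal, channels=None, mixdown=False):
    """Select channels of a (channels, samples) signal, optionally mix down.

    Hypothetical helper; it does not reproduce audresample.remix().
    """
    signal = np.atleast_2d(signal)
    if channels is not None:
        channels = [channels] if isinstance(channels, int) else list(channels)
        if max(channels) >= signal.shape[0]:
            # Requested channel does not exist in the signal
            raise RuntimeError("Channel selection is invalid.")
        signal = signal[channels, :]
    if mixdown:
        # Average the selected channels into a single channel
        signal = signal.mean(axis=0, keepdims=True)
    return signal


stereo = np.array([[1.0, 1.0, 1.0], [3.0, 3.0, 3.0]])
right = select_channels(stereo, channels=1)
mono = select_channels(stereo, mixdown=True)
```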
process_file()
- Segment.process_file(file, *, start=None, end=None, root=None, process_func_args=None)
Segment the content of an audio file.
- Parameters
file (str) – file path
start (Union[float, int, str, Timedelta, None]) – start processing at this position. If value is a float or integer it is treated as seconds. See audinterface.utils.to_timedelta() for further options
end (Union[float, int, str, Timedelta, None]) – end processing at this position. If value is a float or integer it is treated as seconds. See audinterface.utils.to_timedelta() for further options
root (Optional[str]) – root folder to expand relative file path
process_func_args (Optional[Dict[str, Any]]) – (keyword) arguments passed on to the processing function. They will temporarily overwrite the ones stored in audinterface.Segment.process.process_func_args
- Return type
- Returns
Segmented index conforming to audformat
- Raises
RuntimeError – if sampling rates do not match
RuntimeError – if channel selection is invalid
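How start and end values given as numbers (seconds), strings or Timedelta objects translate into a sample range can be sketched as follows. The helper to_sample_range is hypothetical, and it relies on pandas.to_timedelta() as a stand-in; the further options of audinterface.utils.to_timedelta() are not covered here.

```python
import pandas as pd


def to_sample_range(start, end, sampling_rate, num_samples):
    """Map start/end times to a (first, last) sample range.

    Hypothetical helper; numbers are treated as seconds,
    None means beginning or end of the signal.
    """

    def to_samples(value, default):
        if value is None:
            return default
        if isinstance(value, (int, float)):
            value = pd.to_timedelta(value, unit="s")
        else:
            value = pd.to_timedelta(value)
        return int(round(value.total_seconds() * sampling_rate))

    first = to_samples(start, 0)
    # Never read past the end of the signal
    last = min(to_samples(end, num_samples), num_samples)
    return first, last


# 2 s of audio at 16 kHz, restricted to 0.5 s ... 1.5 s
sample_range = to_sample_range(0.5, "1.5s", 16000, 32000)
```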
process_files()
- Segment.process_files(files, *, starts=None, ends=None, root=None, process_func_args=None)
Segment a list of files.
- Parameters
starts (Union[float, int, str, Timedelta, Sequence[Union[float, int, str, Timedelta]], None]) – segment start positions. Time values given as float or integers are treated as seconds. See audinterface.utils.to_timedelta() for further options. If a scalar is given, it is applied to all files
ends (Union[float, int, str, Timedelta, Sequence[Union[float, int, str, Timedelta]], None]) – segment end positions. Time values given as float or integers are treated as seconds. See audinterface.utils.to_timedelta() for further options. If a scalar is given, it is applied to all files
root (Optional[str]) – root folder to expand relative file paths
process_func_args (Optional[Dict[str, Any]]) – (keyword) arguments passed on to the processing function. They will temporarily overwrite the ones stored in audinterface.Segment.process.process_func_args
- Return type
- Returns
Segmented index conforming to audformat
- Raises
RuntimeError – if sampling rates do not match
RuntimeError – if channel selection is invalid
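The rule that a scalar starts or ends value is applied to all files can be pictured as a simple expansion step. The helper expand below is hypothetical; in particular, the length check is an assumption of this sketch and not documented behaviour.

```python
from collections.abc import Sequence


def expand(values, num_files):
    """Repeat a scalar for every file, or pass a sequence through.

    Hypothetical helper for illustration.
    """
    is_scalar = (
        values is None
        or isinstance(values, str)
        or not isinstance(values, Sequence)
    )
    if is_scalar:
        return [values] * num_files
    if len(values) != num_files:
        # Assumption of this sketch: one value per file is required
        raise ValueError("Need one value per file.")
    return list(values)


files = ["a.wav", "b.wav", "c.wav"]
starts = expand(0.5, len(files))  # same start for every file
ends = expand([1.0, 2.0, 1.5], len(files))  # individual end per file
```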
process_folder()
- Segment.process_folder(root, *, filetype='wav', include_root=True, process_func_args=None)
Segment files in a folder.
Note
Sub-folders are currently not scanned.
- Parameters
root (str) – root folder
filetype (str) – file extension
include_root (bool) – if True the file paths are absolute in the index of the returned result
process_func_args (Optional[Dict[str, Any]]) – (keyword) arguments passed on to the processing function. They will temporarily overwrite the ones stored in audinterface.Segment.process.process_func_args
- Return type
- Returns
Segmented index conforming to audformat
- Raises
FileNotFoundError – if folder does not exist
RuntimeError – if sampling rates do not match
RuntimeError – if channel selection is invalid
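The folder handling described above (files of one type directly inside root, no sub-folders, absolute or relative paths depending on include_root) can be sketched with the standard library. The helper list_files is hypothetical and only mirrors the documented behaviour; it is not the library's implementation.

```python
import os
import tempfile


def list_files(root, filetype="wav", include_root=True):
    """List files with the given extension directly inside root.

    Hypothetical helper; sub-folders are not scanned.
    """
    if not os.path.isdir(root):
        raise FileNotFoundError(root)
    names = sorted(
        name
        for name in os.listdir(root)
        if name.endswith(f".{filetype}")
        and os.path.isfile(os.path.join(root, name))
    )
    if include_root:
        # Absolute paths, including the root folder
        return [os.path.abspath(os.path.join(root, name)) for name in names]
    return names


with tempfile.TemporaryDirectory() as root:
    os.mkdir(os.path.join(root, "sub"))
    for path in ["a.wav", "b.txt", os.path.join("sub", "c.wav")]:
        open(os.path.join(root, path), "w").close()
    relative = list_files(root, include_root=False)
    absolute = list_files(root)
```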
process_index()
- Segment.process_index(index, *, root=None, cache_root=None, process_func_args=None)
Segment files or segments from an index.
If cache_root is not None, a hash value is created from the index using audformat.utils.hash() and the result is stored as <cache_root>/<hash>.pkl. When called again with the same index, results will be read from the cached file.
- Parameters
- Return type
- Returns
Segmented index conforming to audformat
- Raises
RuntimeError – if sampling rates do not match
RuntimeError – if channel selection is invalid
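The caching scheme described above (hash the input, store the result as <cache_root>/<hash>.pkl, read it back on the next call) can be sketched with the standard library. The helper cached is hypothetical, and hashlib.md5 over a string key is used here only as a stand-in for audformat.utils.hash(), which hashes the index itself.

```python
import hashlib
import os
import pickle
import tempfile


def cached(key, compute, cache_root):
    """Return the cached result for key, computing and storing it if missing.

    Hypothetical helper; hashlib.md5 stands in for audformat.utils.hash().
    """
    digest = hashlib.md5(key.encode("utf-8")).hexdigest()
    path = os.path.join(cache_root, f"{digest}.pkl")
    if os.path.exists(path):
        # Same key as before: read the stored result
        with open(path, "rb") as fp:
            return pickle.load(fp)
    result = compute()
    os.makedirs(cache_root, exist_ok=True)
    with open(path, "wb") as fp:
        pickle.dump(result, fp)
    return result


calls = []


def expensive():
    """Stand-in for the actual segmentation; counts how often it runs."""
    calls.append(1)
    return [("f.wav", 0.0, 0.5)]


with tempfile.TemporaryDirectory() as cache_root:
    first = cached("f.wav|0.0|1.0", expensive, cache_root)
    second = cached("f.wav|0.0|1.0", expensive, cache_root)  # read from cache
```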
process_signal()
- Segment.process_signal(signal, sampling_rate, *, file=None, start=None, end=None, process_func_args=None)
Segment audio signal.
Note
If a file is given, the index of the returned frame has levels file, start and end. Otherwise, it consists only of start and end.
- Parameters
signal (ndarray) – signal values
sampling_rate (int) – sampling rate in Hz
start (Union[float, int, str, Timedelta, None]) – start processing at this position. If value is a float or integer it is treated as seconds. See audinterface.utils.to_timedelta() for further options
end (Union[float, int, str, Timedelta, None]) – end processing at this position. If value is a float or integer it is treated as seconds. See audinterface.utils.to_timedelta() for further options
process_func_args (Optional[Dict[str, Any]]) – (keyword) arguments passed on to the processing function. They will temporarily overwrite the ones stored in audinterface.Segment.process.process_func_args
- Return type
- Returns
Segmented index conforming to audformat
- Raises
RuntimeError – if sampling rates do not match
RuntimeError – if channel selection is invalid
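The two index layouts mentioned in the note (with and without a file level) can be built with pandas as follows. This is a minimal sketch; the helper build_index is hypothetical and takes start and end positions in seconds.

```python
import pandas as pd


def build_index(starts, ends, file=None):
    """Build a segment index, with a file level only if file is given.

    Hypothetical helper; starts and ends are in seconds.
    """
    arrays = [
        pd.to_timedelta(starts, unit="s"),
        pd.to_timedelta(ends, unit="s"),
    ]
    names = ["start", "end"]
    if file is not None:
        # Prepend the file name as first level
        arrays.insert(0, [file] * len(starts))
        names.insert(0, "file")
    return pd.MultiIndex.from_arrays(arrays, names=names)


without_file = build_index([0.0, 1.0], [0.5, 1.5])
with_file = build_index([0.0, 1.0], [0.5, 1.5], file="speech.wav")
```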
process_signal_from_index()
- Segment.process_signal_from_index(signal, sampling_rate, index, process_func_args=None)
Segment parts of a signal.
- Parameters
signal (ndarray) – signal values
sampling_rate (int) – sampling rate in Hz
index (Index) – a segmented index conforming to audformat or a pandas.MultiIndex with two levels named start and end that hold start and end positions as pandas.Timedelta objects. See also audinterface.utils.signal_index()
process_func_args (Optional[Dict[str, Any]]) – (keyword) arguments passed on to the processing function. They will temporarily overwrite the ones stored in audinterface.Segment.process.process_func_args
- Return type
- Returns
Segmented index conforming to audformat
- Raises
RuntimeError – if sampling rates do not match
RuntimeError – if channel selection is invalid
ValueError – if index contains duplicates
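Processing parts of a signal means cutting out the samples between each start and end of the index before the segmentation function sees them. The sketch below shows this cutting step together with the duplicate check listed under Raises. The helper slice_segments is hypothetical, assumes a (channels, samples) signal, and does not handle NaT end values.

```python
import numpy as np
import pandas as pd


def slice_segments(signal, sampling_rate, index):
    """Cut a (channels, samples) signal into the segments of index.

    Hypothetical helper; index has Timedelta levels start and end.
    """
    if index.has_duplicates:
        raise ValueError("Index contains duplicates.")
    parts = []
    for start, end in zip(
        index.get_level_values("start"),
        index.get_level_values("end"),
    ):
        first = int(round(start.total_seconds() * sampling_rate))
        last = int(round(end.total_seconds() * sampling_rate))
        parts.append(signal[:, first:last])
    return parts


signal = np.arange(8, dtype=float).reshape(1, 8)  # 2 s at 4 Hz
index = pd.MultiIndex.from_arrays(
    [
        pd.to_timedelta([0.0, 1.0], unit="s"),
        pd.to_timedelta([0.5, 2.0], unit="s"),
    ],
    names=["start", "end"],
)
parts = slice_segments(signal, 4, index)
```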
process_table()
- Segment.process_table(table, *, root=None, cache_root=None, process_func_args=None)
Segment files or segments from a table.
The labels of the table are reassigned to the new segments.
If cache_root is not None, a hash value is created from the index using audformat.utils.hash() and the result is stored as <cache_root>/<hash>.pkl. When called again with the same index, results will be read from the cached file.
- Parameters
table (Union[Series, DataFrame]) – pandas.Series or pandas.DataFrame with an index conforming to audformat
root (Optional[str]) – root folder to expand relative file paths
process_func_args (Optional[Dict[str, Any]]) – (keyword) arguments passed on to the processing function. They will temporarily overwrite the ones stored in audinterface.Segment.process.process_func_args
- Return type
- Returns
Segmented table with an index conforming to audformat
- Raises
ValueError – if table is not a pandas.Series or a pandas.DataFrame
RuntimeError – if sampling rates do not match
RuntimeError – if channel selection is invalid
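Reassigning the labels of a table to new segments can be pictured as copying each row's label to every segment found for that row. The sketch below assumes a Series indexed by file and a precomputed mapping from file to segments; the helper reassign_labels and the segments mapping are hypothetical and stand in for the output of the segmentation function.

```python
import pandas as pd


def reassign_labels(table, segments):
    """Copy each file's label to the segments found in that file.

    Hypothetical helper. table is a Series indexed by file,
    segments maps a file to (start, end) pairs in seconds.
    """
    rows = []
    labels = []
    for file, label in table.items():
        for start, end in segments.get(file, []):
            rows.append(
                (file, pd.Timedelta(seconds=start), pd.Timedelta(seconds=end))
            )
            labels.append(label)
    index = pd.MultiIndex.from_tuples(rows, names=["file", "start", "end"])
    return pd.Series(labels, index=index, name=table.name)


table = pd.Series(
    ["happy", "sad"],
    index=pd.Index(["a.wav", "b.wav"], name="file"),
    name="emotion",
)
segments = {"a.wav": [(0.0, 0.5), (1.0, 1.5)], "b.wav": [(0.2, 0.9)]}
result = reassign_labels(table, segments)
```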