to_segmented_index()
- audformat.utils.to_segmented_index(obj, *, allow_nat=True, files_duration=None, root=None, num_workers=1, verbose=False)
Convert to segmented index.
If the input is a filewise table, start and end will be added as new levels to the index. By default, start will be set to 0 and end to NaT.
If allow_nat is set to False, all occurrences of end=NaT are replaced with the duration of the file. This requires that the referenced file exists, or that the durations are provided with files_duration. If file names in the index are relative, the root argument can be used to provide the location where the files are stored.
- Parameters
obj (Index | Series | DataFrame) – object conforming to table specifications
allow_nat (bool) – if set to False, end=NaT is replaced with file duration
files_duration (Optional[MutableMapping[str, Timedelta]]) – mapping from file to duration. If not None, used to look up durations. If no entry is found for a file, it is added to the mapping. Expects absolute file names and durations as pd.Timedelta objects. Only relevant if allow_nat is set to False
root (Optional[str]) – root directory under which the files referenced in the index are stored
num_workers (Optional[int]) – number of parallel jobs. If None, will be set to the number of processors on the machine multiplied by 5
verbose (bool) – show progress bar
- Return type
- Returns
object with segmented index
- Raises
ValueError – if object does not conform to table specifications
FileNotFoundError – if file is not found
Examples
>>> index = filewise_index(["f1", "f2"])
>>> to_segmented_index(index)
MultiIndex([('f1', '0 days', NaT),
            ('f2', '0 days', NaT)],
           names=['file', 'start', 'end'])
>>> to_segmented_index(
...     index,
...     allow_nat=False,
...     files_duration={
...         "f1": pd.to_timedelta(1.1, unit="s"),
...         "f2": pd.to_timedelta(2.2, unit="s"),
...     },
... )
MultiIndex([('f1', '0 days', '0 days 00:00:01.100000'),
            ('f2', '0 days', '0 days 00:00:02.200000')],
           names=['file', 'start', 'end'])
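To make the shape of the result more concrete, the following is a minimal pandas-only sketch of the conversion described above. It is not audformat's implementation: the helper name sketch_to_segmented_index and its durations argument are hypothetical, it does not read files from disk, and it skips the validation, root handling, and parallelism that the real function provides.

```python
import pandas as pd


def sketch_to_segmented_index(index, durations=None):
    """Hypothetical sketch: add 'start' and 'end' levels to a filewise index.

    This only mimics the documented result shape and is an assumption,
    not the code used by audformat.
    """
    files = list(index)
    # Every segment starts at 0, as in the documented default
    starts = pd.to_timedelta([0] * len(files), unit="s")
    if durations is None:
        # Corresponds to allow_nat=True: segment ends stay open (NaT)
        ends = pd.to_timedelta([pd.NaT] * len(files))
    else:
        # Corresponds to allow_nat=False with durations supplied by the caller
        ends = pd.to_timedelta([durations[file] for file in files])
    return pd.MultiIndex.from_arrays(
        [files, starts, ends],
        names=["file", "start", "end"],
    )


index = pd.Index(["f1", "f2"], name="file")
open_ended = sketch_to_segmented_index(index)
closed = sketch_to_segmented_index(
    index,
    durations={"f1": pd.Timedelta("1.1s"), "f2": pd.Timedelta("2.2s")},
)
```

Here open_ended mirrors the first documented example (end is NaT for every file), while closed mirrors the second one, where each end equals the duration given for that file.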