union()

audformat.utils.union(objs)

Create union of index objects.

If all index objects conform to the table specifications and at least one object is segmented, the output is a segmented index. Otherwise, the levels and dtypes of all objects must match, see audformat.utils.is_index_alike(). Integer dtypes do not have to match, but the result is always of dtype Int64. When a pandas.Index is combined with a single-level pandas.MultiIndex, the result is a pandas.Index.

The order of the resulting index depends on the order of objs. If you require audformat.utils.union() to be commutative, you have to sort its output.
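The order dependence can be illustrated with a minimal pure-Python sketch. It assumes that entries keep the position of their first appearance, which matches the ordering in the documented examples; the helper ordered_union() is hypothetical and is not audformat's implementation. Plain tuples stand in for index entries.

```python
from typing import Hashable, Iterable, List, Sequence


def ordered_union(objs: Sequence[Iterable[Hashable]]) -> List[Hashable]:
    """Hypothetical stand-in for an order-preserving union.

    Each entry keeps the position of its first appearance.
    This is an illustration only, not audformat's implementation.
    """
    seen = set()
    result = []
    for obj in objs:
        for entry in obj:
            if entry not in seen:
                seen.add(entry)
                result.append(entry)
    return result


# Tuples of (file, start, end) model entries of a segmented index
a = [("f2", 0, 1)]
b = [("f1", 0, 1), ("f2", 1, 2)]

forward = ordered_union([a, b])
backward = ordered_union([b, a])

# Same entries, but the order follows the order of the inputs
assert set(forward) == set(backward)
assert forward != backward

# Sorting the output removes the dependence on the input order
assert sorted(forward) == sorted(backward)
```

With real index objects, the analogous step would presumably be to call sort_values() on the index returned by union().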

Parameters: objs (Sequence[Index]) – index objects
Return type: Index
Returns: union of index objects
Raises: ValueError – if levels and dtypes of objects do not match

Examples

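>>> import pandas as pd
>>> from audformat import filewise_index, segmented_index
>>> from audformat.utils import union
>>> union(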
...     [
...         pd.Index([0, 1], name="idx"),
...         pd.Index([1, 2], dtype="Int64", name="idx"),
...     ]
... )
Index([0, 1, 2], dtype='Int64', name='idx')
>>> union(
...     [
...         pd.Index([0, 1], name="idx"),
...         pd.MultiIndex.from_arrays([[1, 2]], names=["idx"]),
...     ]
... )
Index([0, 1, 2], dtype='Int64', name='idx')
>>> union(
...     [
...         pd.MultiIndex.from_arrays(
...             [["a", "b", "c"], [0, 1, 2]],
...             names=["idx1", "idx2"],
...         ),
...         pd.MultiIndex.from_arrays(
...             [["b", "c"], [1, 3]],
...             names=["idx1", "idx2"],
...         ),
...     ]
... )
MultiIndex([('a', 0),
            ('b', 1),
            ('c', 2),
            ('c', 3)],
           names=['idx1', 'idx2'])
>>> union(
...     [
...         filewise_index(["f1", "f2", "f3"]),
...         filewise_index(["f2", "f3", "f4"]),
...     ]
... )
Index(['f1', 'f2', 'f3', 'f4'], dtype='string', name='file')
>>> union(
...     [
...         segmented_index(["f2"], [0], [1]),
...         segmented_index(["f1", "f2"], [0, 1], [1, 2]),
...     ]
... )
MultiIndex([('f2', '0 days 00:00:00', '0 days 00:00:01'),
            ('f1', '0 days 00:00:00', '0 days 00:00:01'),
            ('f2', '0 days 00:00:01', '0 days 00:00:02')],
           names=['file', 'start', 'end'])
>>> union(
...     [
...         filewise_index(["f1", "f2"]),
...         segmented_index(["f1", "f2"], [0, 0], [1, 1]),
...     ]
... )
MultiIndex([('f1', '0 days',               NaT),
            ('f2', '0 days',               NaT),
            ('f1', '0 days', '0 days 00:00:01'),
            ('f2', '0 days', '0 days 00:00:01')],
           names=['file', 'start', 'end'])