Usage
auglib lets you augment audio data. It provides transformations in its sub-module auglib.transform. auglib.Augment can then apply those transformations to a signal, a file, or a whole dataset.
See how to include external solutions and look at the examples for further inspiration, or continue reading here to see how to apply the augmentations to different inputs.
import auglib
transform = auglib.transform.Compose(
    [
        auglib.transform.HighPass(cutoff=5000),
        auglib.transform.Clip(),
        auglib.transform.NormalizeByPeak(peak_db=-3),
    ]
)
augment = auglib.Augment(transform)
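To make the chain above more concrete, here is a minimal pure-Python sketch of its last two steps, clipping and peak normalization. It is not auglib's implementation: the helper names `clip`, `normalize_by_peak`, and `compose` are hypothetical, and the sketch assumes that a threshold of 0 dB corresponds to full scale (±1.0) and that a peak of -3 dB means the largest absolute sample is scaled to 10^(-3/20), roughly 0.708.

```python
from functools import reduce


def clip(samples, threshold_db=0.0):
    """Hard-clip samples to +/- the linear threshold (hypothetical helper)."""
    limit = 10 ** (threshold_db / 20)
    return [max(-limit, min(limit, s)) for s in samples]


def normalize_by_peak(samples, peak_db=-3.0):
    """Scale samples so the largest magnitude equals peak_db (hypothetical helper)."""
    peak = max(abs(s) for s in samples)
    if peak == 0:
        return list(samples)  # silence stays silence
    target = 10 ** (peak_db / 20)
    return [s * target / peak for s in samples]


def compose(*steps):
    """Return a function that applies the given steps from left to right."""
    return lambda samples: reduce(lambda acc, step: step(acc), steps, samples)


chain = compose(clip, normalize_by_peak)
# The second sample exceeds full scale, so it is clipped before normalization.
result = chain([0.5, -1.5, 0.25])
```

The order of the steps matters: normalizing before clipping would leave nothing above the threshold to clip.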
Augment a signal
We now load a signal from emodb and apply our augmentation to it.
import audb
import audiofile
files = audb.load_media(
    "emodb",
    "wav/03a01Fa.wav",
    version="1.4.1",
    verbose=False,
)
signal, sampling_rate = audiofile.read(files[0])
signal_augmented = augment(signal, sampling_rate)
Augment files in memory
auglib.Augment can apply the augmentation to a list of files. We load three files from emodb, and augment them using auglib.Augment.process_files().
files = audb.load_media(
    "emodb",
    ["wav/03a01Fa.wav", "wav/03a01Nc.wav", "wav/03a01Wa.wav"],
    version="1.4.1",
    verbose=False,
)
y_augmented = augment.process_files(files)
y_augmented
file | start | end | |
---|---|---|---|
/home/runner/audb/emodb/1.4.1/d3b62a9b/wav/03a01Fa.wav | 0 days | 0 days 00:00:01.898250 | [[0.0012283714, 0.0041666315, -0.0018338256, -... |
/home/runner/audb/emodb/1.4.1/d3b62a9b/wav/03a01Nc.wav | 0 days | 0 days 00:00:01.611250 | [[0.0003378813, -0.00029246297, 0.00043359815,... |
/home/runner/audb/emodb/1.4.1/d3b62a9b/wav/03a01Wa.wav | 0 days | 0 days 00:00:01.877812500 | [[0.0, 2.8740513e-05, -0.00017815993, 0.000150... |
All process_*() methods return a series (pd.Series) holding the augmented signals with a pd.MultiIndex containing the levels file, start, end (segmented index).
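As a rough illustration of that return type, the sketch below builds a small series by hand with the same three index levels and iterates over it. The file names and sample values are made up, and plain lists stand in for the arrays that auglib returns.

```python
import pandas as pd

# Hand-made stand-in for the result of a process_*() call;
# file names and sample values are invented for illustration.
index = pd.MultiIndex.from_arrays(
    [
        ["a.wav", "b.wav"],
        pd.to_timedelta([0, 0], unit="s"),
        pd.to_timedelta([1.5, 2.0], unit="s"),
    ],
    names=["file", "start", "end"],
)
y = pd.Series([[0.1, 0.2], [0.3, 0.4]], index=index)

# Each index entry is a (file, start, end) tuple.
durations = {}
for (file, start, end), signal in y.items():
    durations[file] = (end - start).total_seconds()

# Select one level of the index, e.g. all file names.
files = y.index.get_level_values("file").tolist()
```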
Augment a dataset to disk
auglib.Augment.augment() augments a dataset to a cache folder on disk. It takes an index, series, or dataframe as input. The index needs either a single level named file that holds file paths (filewise index), or the three levels file, start, and end, where start and end mark segments of the file as pd.Timedelta (segmented index).
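The two accepted index layouts can be sketched with plain pandas. The file names, segment times, and labels below are invented; in practice you would usually take the index from a database, as in the example that follows.

```python
import pandas as pd

# Filewise index: a single level named "file".
filewise = pd.Index(["a.wav", "b.wav"], name="file")

# Segmented index: levels "file", "start", "end",
# with start and end given as pd.Timedelta.
segmented = pd.MultiIndex.from_arrays(
    [
        ["a.wav", "a.wav", "b.wav"],
        pd.to_timedelta([0.0, 1.0, 0.0], unit="s"),
        pd.to_timedelta([1.0, 2.5, 0.8], unit="s"),
    ],
    names=["file", "start", "end"],
)

# A series or dataframe carries such an index,
# here with a made-up label per segment.
labels = pd.Series(["happy", "sad", "neutral"], index=segmented, name="emotion")
```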
auglib.Augment.augment() returns an index, series, or dataframe with a segmented index that points to the augmented files.
The next example loads the emodb dataset, limited to three files. It then uses audformat.Database.files to get a filewise index pointing to the files of the dataset.
db = audb.load(
    "emodb",
    version="1.4.1",
    media=["wav/03a01Fa.wav", "wav/03a01Nc.wav", "wav/03a01Wa.wav"],
    verbose=False,
)
index = db.files
index_augmented = augment.augment(index, cache_root="cache")
index_augmented
| | file | start | end |
---|---|---|---|
0 | /home/runner/work/auglib/auglib/cache/bd491f1a... | 0 days | 0 days 00:00:01.898250 |
1 | /home/runner/work/auglib/auglib/cache/bd491f1a... | 0 days | 0 days 00:00:01.611250 |
2 | /home/runner/work/auglib/auglib/cache/bd491f1a... | 0 days | 0 days 00:00:01.877812500 |
The augmented files are stored inside the cache_root folder. If auglib.Augment.augment() is called again on the same index, it detects the requested augmentation in the cache and returns its result directly. If you don’t specify cache_root, the default value of $HOME/auglib will be used.
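The caching behavior described above can be sketched in a few lines of standard-library Python. This is not auglib's implementation: the function `cached_augment`, the hashing scheme, and the folder layout are assumptions chosen for illustration. The idea is only that a fingerprint of the augmentation settings selects a sub-folder of the cache root, and existing results are reused instead of recomputed.

```python
import hashlib
import json
import os
import tempfile


def cached_augment(settings, files, cache_root, process):
    """Return cached output paths, computing only what is missing.

    Hypothetical helper: `settings` is a JSON-serializable dict describing
    the augmentation, and `process(src, dst)` writes the augmented file.
    """
    # A fingerprint of the settings selects the cache sub-folder.
    payload = json.dumps(settings, sort_keys=True).encode()
    uid = hashlib.sha256(payload).hexdigest()[:8]
    folder = os.path.join(cache_root, uid)
    os.makedirs(folder, exist_ok=True)
    outputs = []
    for src in files:
        dst = os.path.join(folder, os.path.basename(src))
        if not os.path.exists(dst):  # cache miss: compute and store
            process(src, dst)
        outputs.append(dst)
    return outputs


calls = []


def fake_process(src, dst):
    """Stand-in for real audio processing; records the call and writes a file."""
    calls.append(src)
    with open(dst, "w") as fp:
        fp.write("augmented " + src)


cache_root = tempfile.mkdtemp()
settings = {"transform": "HighPass", "cutoff": 5000}
first = cached_augment(settings, ["a.wav", "b.wav"], cache_root, fake_process)
# The second call finds both files in the cache and does no processing.
second = cached_augment(settings, ["a.wav", "b.wav"], cache_root, fake_process)
```

Changing any value in `settings` yields a different fingerprint and therefore a separate sub-folder, so results of different augmentations do not overwrite each other.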
If we pass a series instead of an index, a series will be returned:
y = db["files"]["speaker"].get()
y_augmented = augment.augment(y, cache_root="cache")
y_augmented
file | start | end | |
---|---|---|---|
/home/runner/work/auglib/auglib/cache/bd491f1a/8464945703565270270/0/home/runner/audb/emodb/1.4.1/d3b62a9b/wav/03a01Fa.wav | 0 days | 0 days 00:00:01.898250 | 3 |
/home/runner/work/auglib/auglib/cache/bd491f1a/8464945703565270270/0/home/runner/audb/emodb/1.4.1/d3b62a9b/wav/03a01Nc.wav | 0 days | 0 days 00:00:01.611250 | 3 |
/home/runner/work/auglib/auglib/cache/bd491f1a/8464945703565270270/0/home/runner/audb/emodb/1.4.1/d3b62a9b/wav/03a01Wa.wav | 0 days | 0 days 00:00:01.877812500 | 3 |
Finally, we augment a dataframe, this time keeping the original files in the result and augmenting every file twice.
df = db["files"].get()
df_augmented = augment.augment(
    df,
    cache_root="cache",
    modified_only=False,
    num_variants=2,
)
df_augmented
file | start | end | duration | speaker | transcription |
---|---|---|---|---|---|
/home/runner/audb/emodb/1.4.1/d3b62a9b/wav/03a01Fa.wav | 0 days | 0 days 00:00:01.898250 | 0 days 00:00:01.898250 | 3 | a01 |
/home/runner/audb/emodb/1.4.1/d3b62a9b/wav/03a01Nc.wav | 0 days | 0 days 00:00:01.611250 | 0 days 00:00:01.611250 | 3 | a01 |
/home/runner/audb/emodb/1.4.1/d3b62a9b/wav/03a01Wa.wav | 0 days | 0 days 00:00:01.877812500 | 0 days 00:00:01.877812500 | 3 | a01 |
/home/runner/work/auglib/auglib/cache/bd491f1a/8464945703565270270/0/home/runner/audb/emodb/1.4.1/d3b62a9b/wav/03a01Fa.wav | 0 days | 0 days 00:00:01.898250 | 0 days 00:00:01.898250 | 3 | a01 |
/home/runner/work/auglib/auglib/cache/bd491f1a/8464945703565270270/0/home/runner/audb/emodb/1.4.1/d3b62a9b/wav/03a01Nc.wav | 0 days | 0 days 00:00:01.611250 | 0 days 00:00:01.611250 | 3 | a01 |
/home/runner/work/auglib/auglib/cache/bd491f1a/8464945703565270270/0/home/runner/audb/emodb/1.4.1/d3b62a9b/wav/03a01Wa.wav | 0 days | 0 days 00:00:01.877812500 | 0 days 00:00:01.877812500 | 3 | a01 |
/home/runner/work/auglib/auglib/cache/bd491f1a/8464945703565270270/1/home/runner/audb/emodb/1.4.1/d3b62a9b/wav/03a01Fa.wav | 0 days | 0 days 00:00:01.898250 | 0 days 00:00:01.898250 | 3 | a01 |
/home/runner/work/auglib/auglib/cache/bd491f1a/8464945703565270270/1/home/runner/audb/emodb/1.4.1/d3b62a9b/wav/03a01Nc.wav | 0 days | 0 days 00:00:01.611250 | 0 days 00:00:01.611250 | 3 | a01 |
/home/runner/work/auglib/auglib/cache/bd491f1a/8464945703565270270/1/home/runner/audb/emodb/1.4.1/d3b62a9b/wav/03a01Wa.wav | 0 days | 0 days 00:00:01.877812500 | 0 days 00:00:01.877812500 | 3 | a01 |
Serialize
It’s possible to serialize an auglib.Augment object to YAML.
print(augment.to_yaml_s())
$auglib.core.interface.Augment==1.0.4:
  transform:
    $auglib.core.transform.Compose==1.0.4:
      transforms:
      - $auglib.core.transform.HighPass==1.0.4:
          cutoff: 5000
          order: 1
          design: butter
          preserve_level: false
          bypass_prob: null
      - $auglib.core.transform.Clip==1.0.4:
          threshold: 0.0
          soft: false
          normalize: false
          preserve_level: false
          bypass_prob: null
      - $auglib.core.transform.NormalizeByPeak==1.0.4:
          peak_db: -3
          clip: false
          preserve_level: false
          bypass_prob: null
      preserve_level: false
      bypass_prob: null
  sampling_rate: null
  resample: false
  channels: null
  mixdown: false
  seed: null
We can save it to a file and re-instantiate it from there.
import audobject
file = "transform.yaml"
augment.to_yaml(file)
augment_from_yaml = audobject.from_yaml(file)
augment_from_yaml(signal, sampling_rate)
array([[ 1.2283714e-03, 4.1666315e-03, -1.8338256e-03, ...,
-4.1018693e-06, 8.3834183e-04, -1.6675657e-04]], dtype=float32)
The new object creates the exact same augmentation.
To make an augmentation that includes random behavior reproducible, we have to set the seed argument.
transform = auglib.transform.PinkNoise(gain_db=-5)
augment = auglib.Augment(transform, seed=0)
augment(signal, sampling_rate)
array([[0.2510293 , 0.19091995, 0.23076648, ..., 0.12397236, 0.15249531,
0.16985646]], dtype=float32)
When we serialize the object, the seed will be stored to YAML and used to re-initialize the random number generator when the object is loaded.
augment.to_yaml(file)
augment_from_yaml = audobject.from_yaml(file)
augment_from_yaml(signal, sampling_rate)
array([[0.2510293 , 0.19091995, 0.23076648, ..., 0.12397236, 0.15249531,
0.16985646]], dtype=float32)
If we want a different random seed, we can override the stored value when loading the object.
augment_other_seed = audobject.from_yaml(file, override_args={"seed": 1})
augment_other_seed(signal, sampling_rate)
array([[-0.01383871, -0.10363714, -0.12082221, ..., -0.21219613,
-0.08782648, -0.14412443]], dtype=float32)
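Why storing the seed is enough to reproduce a random augmentation can be seen in a small standard-library sketch. It does not use auglib or audobject: the `NoiseAdder` class and its `to_dict` and `from_dict` methods are hypothetical stand-ins for a serializable augmentation object, and uniform noise replaces pink noise for brevity.

```python
import random


class NoiseAdder:
    """Toy augmentation that adds random noise (hypothetical, not auglib)."""

    def __init__(self, gain=0.1, seed=None):
        self.gain = gain
        self.seed = seed
        # A fresh generator is created from the seed on every construction.
        self.rng = random.Random(seed)

    def __call__(self, samples):
        return [s + self.gain * self.rng.uniform(-1, 1) for s in samples]

    def to_dict(self):
        """Serialize only the constructor arguments, including the seed."""
        return {"gain": self.gain, "seed": self.seed}

    @classmethod
    def from_dict(cls, config, **override):
        """Rebuild the object, optionally overriding stored arguments."""
        return cls(**{**config, **override})


signal = [0.0, 0.5, -0.5]
original = NoiseAdder(seed=0)
out_original = original(signal)

# Restoring from the stored arguments re-creates the generator
# in the same state, so the output is identical.
restored = NoiseAdder.from_dict(original.to_dict())
out_restored = restored(signal)

# Overriding the seed on load gives a different noise sequence.
other = NoiseAdder.from_dict(original.to_dict(), seed=1)
out_other = other(signal)
```

Note that reproducibility here refers to the first call after construction; calling the same object a second time continues the random sequence and produces different noise.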