External Solutions
The following augmentation examples
show how to combine auglib
with external augmentation solutions.
Let’s start by loading an example file to augment.
import audb
import audiofile
import auglib


files = audb.load_media(
    "emodb",
    "wav/03a01Fa.wav",
    version="1.4.1",
    verbose=False,
)
signal, sampling_rate = audiofile.read(files[0])
Pedalboard
Pedalboard is a Python package from Spotify that provides a collection of useful and fast augmentations. It also allows you to include VST plugins in your augmentation pipeline.
The documentation of Pedalboard
does not discuss all parameters
of its augmentations.
For the value range
and an explanation
of the parameters,
you might want to look
at the corresponding documentation
of the underlying JUCE C++ code.
For pedalboard.Reverb, for example,
it is located at
https://docs.juce.com/master/structReverb_1_1Parameters.html
In the following example,
we use the compressor,
chorus,
phaser,
and reverb
from pedalboard,
as part of our auglib augmentation chain
with the help of the auglib.transform.Function class.
def pedalboard_transform(signal, sampling_rate):
    r"""Custom augmentation using pedalboard."""
    import pedalboard

    board = pedalboard.Pedalboard(
        [
            pedalboard.Compressor(threshold_db=-50, ratio=25),
            pedalboard.Chorus(),
            pedalboard.Phaser(),
            pedalboard.Reverb(room_size=0.25),
        ],
    )
    return board(signal, sampling_rate)


transform = auglib.transform.Compose(
    [
        auglib.transform.Function(pedalboard_transform),
        auglib.transform.NormalizeByPeak(),
    ]
)
augment = auglib.Augment(transform)
signal_augmented = augment(signal, sampling_rate)
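To make the threshold_db and ratio arguments of the compressor more tangible, the following sketch computes the gain of an idealised hard-knee compressor for a few input levels. It is a simplified static model for illustration only: the function name static_compressor_gain_db is hypothetical, attack and release smoothing are ignored, and it is not the code that Pedalboard or JUCE actually run.

```python
import numpy as np


def static_compressor_gain_db(level_db, threshold_db=-50.0, ratio=25.0):
    """Gain change in dB of an idealised hard-knee compressor.

    Illustration only: no attack/release smoothing,
    and not the implementation used by Pedalboard or JUCE.
    """
    level_db = np.asarray(level_db, dtype=float)
    # Amount by which the input level exceeds the threshold (0 below it)
    overshoot_db = np.maximum(level_db - threshold_db, 0.0)
    # Above the threshold the overshoot is divided by the ratio,
    # so the gain reduction is the part of the overshoot that is removed
    return -overshoot_db * (1.0 - 1.0 / ratio)


levels_db = np.array([-60.0, -50.0, -25.0, 0.0])
gain_db = static_compressor_gain_db(levels_db)
output_db = levels_db + gain_db
print(output_db)
```

Under this model, levels at or below the threshold pass unchanged, while with ratio=25 a signal at 0 dB ends up only 2 dB above the threshold of -50 dB. This suggests why the example chain applies auglib.transform.NormalizeByPeak afterwards.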
Audiomentations
Audiomentations is another Python library
for audio data augmentation,
originally inspired by albumentations.
It provides additional transformations
such as pitch shifting and time stretching,
or MP3 compression to
simulate lower audio quality.
It also includes spectrogram transformations
(not supported by auglib).
For GPU support,
the package torch-audiomentations
is available.
In the following example,
we combine Gaussian noise,
time stretching,
and pitch shifting.
As in auglib,
a probability controls whether
a transformation is applied or bypassed.
Again,
we use auglib.transform.Function
to include transforms from audiomentations
into our auglib
augmentation chain.
def audiomentations_transform(signal, sampling_rate, p):
    r"""Custom augmentation using audiomentations."""
    import audiomentations

    compose = audiomentations.Compose([
        audiomentations.AddGaussianNoise(min_amplitude=0.001, max_amplitude=0.015, p=p),
        audiomentations.TimeStretch(min_rate=0.8, max_rate=1.25, p=p),
        audiomentations.PitchShift(min_semitones=-4, max_semitones=4, p=p),
    ])
    return compose(signal, sampling_rate)


transform = auglib.transform.Compose(
    [
        auglib.transform.Function(audiomentations_transform, {"p": 1.0}),
        auglib.transform.NormalizeByPeak(),
    ]
)
augment = auglib.Augment(transform)
signal_augmented = augment(signal, sampling_rate)
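To get a feeling for the ranges used above, the following sketch translates semitone shifts into frequency ratios and stretch rates into resulting durations. The helper names semitones_to_ratio and stretched_duration are hypothetical and not part of audiomentations. The sketch assumes twelve-tone equal temperament and that the rate acts as a speed factor, so that a rate below 1 slows the signal down.

```python
def semitones_to_ratio(semitones):
    """Frequency ratio for a pitch shift in twelve-tone equal temperament."""
    return 2.0 ** (semitones / 12.0)


def stretched_duration(duration, rate):
    """Duration in seconds after time stretching.

    Assumes rate is a speed factor: rate < 1 slows down, rate > 1 speeds up.
    """
    return duration / rate


# Pitch shift limits from the example: -4 and +4 semitones
lowest = semitones_to_ratio(-4)
highest = semitones_to_ratio(4)

# Time stretch limits from the example, applied to a 2 s signal
longest = stretched_duration(2.0, 0.8)
shortest = stretched_duration(2.0, 1.25)

print(f"frequency ratio between {lowest:.3f} and {highest:.3f}")
print(f"duration between {shortest:.2f} s and {longest:.2f} s")
```

Under these assumptions, the pitch varies by roughly 20 to 26 percent in either direction, and a 2 s signal ends up between 1.6 s and 2.5 s long.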
Sox
Sox provides a large variety of effects, so-called Transformers, that might be useful for augmentation. Here, we shift the pitch by two semitones and apply a flanger effect.
def sox_transform(signal, sampling_rate):
    r"""Custom augmentation using sox."""
    import sox

    tfm = sox.Transformer()
    tfm.pitch(2)
    tfm.flanger()
    return tfm.build_array(
        input_array=signal.squeeze(),
        sample_rate_in=sampling_rate,
    )


transform = auglib.transform.Compose(
    [
        auglib.transform.Function(sox_transform),
        auglib.transform.NormalizeByPeak(),
    ]
)
augment = auglib.Augment(transform)
signal_augmented = augment(signal, sampling_rate)
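The call to signal.squeeze() in the example above works for a mono signal. For multi-channel signals, a layout conversion might be needed. The following sketch assumes that auglib passes the signal with shape (channels, samples), and that sox expects (samples,) or (samples, channels). The helper names to_samples_first and to_channels_first are hypothetical, and both assumptions should be checked against the documentation of the installed versions.

```python
import numpy as np


def to_samples_first(signal):
    """Convert (channels, samples) to (samples,) for mono, else (samples, channels)."""
    signal = np.atleast_2d(signal)
    if signal.shape[0] == 1:
        # Mono: drop the channel dimension
        return signal[0]
    return signal.T


def to_channels_first(signal):
    """Convert (samples,) or (samples, channels) back to (channels, samples)."""
    signal = np.asarray(signal)
    if signal.ndim == 1:
        # Mono: add the channel dimension back
        return signal[np.newaxis, :]
    return signal.T


# Silent stand-in signals, one second at 16000 Hz
mono = np.zeros((1, 16000), dtype=np.float32)
stereo = np.zeros((2, 16000), dtype=np.float32)

print(to_samples_first(mono).shape)
print(to_samples_first(stereo).shape)
print(to_channels_first(to_samples_first(stereo)).shape)
```

In a custom function such as sox_transform, helpers like these could wrap the call to the external library, converting the input before and the output after processing.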