External Solutions

The following examples show how to combine auglib with external augmentation solutions.

Let’s start by loading an example file to augment.

import audb
import audiofile
import auglib

files = audb.load_media(
    "emodb",
    "wav/03a01Fa.wav",
    version="1.4.1",
    verbose=False,
)
signal, sampling_rate = audiofile.read(files[0])

Pedalboard

Pedalboard is a Python package from Spotify that provides a collection of useful and fast augmentations. It also allows you to include VST plugins in your augmentation pipeline.

The Pedalboard documentation does not describe all parameters of its augmentations. For the value range and an explanation of a parameter, you might want to look at the documentation of the underlying JUCE C++ code. For example, the parameters of pedalboard.Reverb are documented at https://docs.juce.com/master/structReverb_1_1Parameters.html

In the following example, we use the compressor, chorus, phaser, and reverb from pedalboard as part of our auglib augmentation chain with the help of the auglib.transform.Function class.

def pedalboard_transform(signal, sampling_rate):
    r"""Custom augmentation using pedalboard."""
    import pedalboard

    # Effects are applied in the listed order
    board = pedalboard.Pedalboard(
        [
            pedalboard.Compressor(threshold_db=-50, ratio=25),
            pedalboard.Chorus(),
            pedalboard.Phaser(),
            pedalboard.Reverb(room_size=0.25),
        ],
    )
    return board(signal, sampling_rate)

transform = auglib.transform.Compose(
    [
        auglib.transform.Function(pedalboard_transform),
        auglib.transform.NormalizeByPeak(),
    ]
)
augment = auglib.Augment(transform)
signal_augmented = augment(signal, sampling_rate)
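
To make the mechanics of the chain above more concrete, the following sketch mimics it with plain Python lists instead of NumPy arrays. It is a simplified stand-in under stated assumptions, not auglib's implementation: FunctionSketch, normalize_by_peak, compose, and halve are hypothetical names. It assumes that auglib.transform.Function calls the wrapped function with the signal, the sampling rate, and any extra keyword arguments, and that peak normalization scales the signal so that its largest absolute sample reaches a target value (here 1.0).

```python
class FunctionSketch:
    """Hypothetical stand-in for auglib.transform.Function."""

    def __init__(self, function, function_args=None):
        self.function = function
        # Extra keyword arguments forwarded to the wrapped function
        self.function_args = function_args or {}

    def __call__(self, signal, sampling_rate):
        return self.function(signal, sampling_rate, **self.function_args)


def normalize_by_peak(signal, sampling_rate, peak=1.0):
    """Scale the signal so its largest absolute sample equals ``peak``."""
    current_peak = max(abs(sample) for sample in signal)
    if current_peak == 0:
        # A silent signal cannot be scaled, return a copy unchanged
        return list(signal)
    return [sample * peak / current_peak for sample in signal]


def compose(transforms):
    """Apply the given transforms one after another."""

    def apply(signal, sampling_rate):
        for transform in transforms:
            signal = transform(signal, sampling_rate)
        return signal

    return apply


def halve(signal, sampling_rate, factor=0.5):
    """Toy augmentation standing in for an external library call."""
    return [sample * factor for sample in signal]


chain = compose([FunctionSketch(halve, {"factor": 0.5}), normalize_by_peak])
toy_signal = [0.0, 0.2, -0.4, 0.1]
toy_augmented = chain(toy_signal, 16000)
```

Whatever the external effect does to the level of the signal, the final normalization step brings the peak back to the target value.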

Audiomentations

Audiomentations is another Python library for audio data augmentation, originally inspired by albumentations. It provides additional transformations such as pitch shifting and time stretching, or MP3 compression to simulate lower audio quality. It also includes spectrogram transformations (not supported by auglib). For GPU support, the package torch-audiomentations is available.

In the following example, we combine Gaussian noise, time stretching, and pitch shifting. Similar to auglib, a probability controls whether a transformation is applied or bypassed. Again, we use auglib.transform.Function to include transforms from audiomentations in our auglib augmentation chain.

def audiomentations_transform(signal, sampling_rate, p):
    r"""Custom augmentation using audiomentations."""
    import audiomentations
    compose = audiomentations.Compose([
        audiomentations.AddGaussianNoise(min_amplitude=0.001, max_amplitude=0.015, p=p),
        audiomentations.TimeStretch(min_rate=0.8, max_rate=1.25, p=p),
        audiomentations.PitchShift(min_semitones=-4, max_semitones=4, p=p),
    ])
    return compose(signal, sampling_rate)

transform = auglib.transform.Compose(
    [
        auglib.transform.Function(audiomentations_transform, {"p": 1.0}),
        auglib.transform.NormalizeByPeak(),
    ]
)
augment = auglib.Augment(transform)
signal_augmented = augment(signal, sampling_rate)
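
The role of the probability p can be illustrated without either library. The sketch below uses hypothetical helpers (apply_with_probability and invert are not part of auglib or audiomentations) and assumes the usual semantics: draw a uniform random number, apply the transformation only if the number falls below p, and otherwise return the signal unchanged.

```python
import random


def apply_with_probability(transform, signal, p, rng):
    """Apply ``transform`` with probability ``p``, otherwise bypass it."""
    if rng.random() < p:
        return transform(signal)
    return signal


def invert(signal):
    """Toy transformation: flip the polarity of every sample."""
    return [-sample for sample in signal]


rng = random.Random(0)  # seeded for reproducibility
toy_signal = [0.1, -0.2, 0.3]

always = apply_with_probability(invert, toy_signal, 1.0, rng)  # always applied
never = apply_with_probability(invert, toy_signal, 0.0, rng)  # always bypassed
```

With p set to 1.0, as in the example above, every transformation in the chain is applied on each call. Lower values leave some signals untouched by individual transformations, which adds variety to the augmented data.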

Sox

Sox provides a large variety of effects that might be useful for augmentation. In the Python package sox, they are available as methods of the Transformer class. Here, we shift the pitch by two semitones and apply a flanger effect.

def sox_transform(signal, sampling_rate):
    r"""Custom augmentation using sox."""
    import sox
    tfm = sox.Transformer()
    tfm.pitch(2)
    tfm.flanger()
    return tfm.build_array(
        # Drop the singleton channel dimension of the mono signal
        input_array=signal.squeeze(),
        sample_rate_in=sampling_rate,
    )

transform = auglib.transform.Compose(
    [
        auglib.transform.Function(sox_transform),
        auglib.transform.NormalizeByPeak(),
    ]
)
augment = auglib.Augment(transform)
signal_augmented = augment(signal, sampling_rate)
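
For orientation, a shift of n semitones corresponds to a frequency ratio of 2^(n/12) in twelve-tone equal temperament. The helper below is illustrative only (semitones_to_ratio is a hypothetical name, not part of sox) and shows what a shift of two semitones means for a 440 Hz tone.

```python
def semitones_to_ratio(n_semitones):
    """Frequency ratio for a pitch shift in equal temperament."""
    return 2 ** (n_semitones / 12)


ratio = semitones_to_ratio(2)  # roughly 1.12
shifted_hz = 440.0 * ratio  # roughly 493.9 Hz, one whole tone above 440 Hz
```

Twelve semitones give a ratio of exactly 2, that is, one octave.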