read()

audiofile.read(file, duration=None, offset=None, always_2d=False, dtype='float32', **kwargs)

Read audio file.

It uses soundfile.read() for WAV, FLAC, MP3, and OGG files. All other audio files are first converted to WAV by sox or ffmpeg.

duration and offset accept all formats supported by audmath.duration_in_seconds(), such as '2 ms' or pd.to_timedelta(2, 's'), with one exception: float and integer values are always interpreted as seconds, and strings without a unit are always interpreted as samples. Negative values of duration and/or offset are interpreted from right to left; if offset=None, a negative duration starts from the end of the signal. If the signal is shorter than the requested duration and/or offset, only the part of the signal that overlaps the requested range is returned. For example, for a file containing the signal [0, 1, 2], duration=2, offset=-4 returns [0].
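The selection rule described above can be sketched in plain Python. This is only an illustration of the documented behaviour, not the library's implementation: resolve_range() is a hypothetical helper, and all of its arguments are assumed to be already converted to samples.

```python
from typing import Optional, Tuple


def resolve_range(
    num_samples: int,
    duration: Optional[int] = None,
    offset: Optional[int] = None,
) -> Tuple[int, int]:
    """Return the (start, end) sample indices selected by duration and offset.

    Hypothetical helper for illustration; all values are in samples.
    """
    if offset is None:
        # Without an offset, a negative duration is anchored at the signal end
        anchor = num_samples if (duration is not None and duration < 0) else 0
    elif offset < 0:
        # Negative offsets count from the end of the signal
        anchor = num_samples + offset
    else:
        anchor = offset

    if duration is None:
        # No duration: read from the anchor to the end of the signal
        start, end = anchor, num_samples
    elif duration < 0:
        # Negative durations extend to the left of the anchor
        start, end = anchor + duration, anchor
    else:
        start, end = anchor, anchor + duration

    # Keep only the part that overlaps with the signal
    start = min(max(start, 0), num_samples)
    end = min(max(end, 0), num_samples)
    return start, end


signal = [0, 1, 2]
start, end = resolve_range(len(signal), duration=2, offset=-4)
print(signal[start:end])  # [0]
```

Under these assumptions, the requested range starts one sample before the signal and ends after its first sample, so only [0] overlaps with the file content.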

duration and offset are evenly rounded after conversion to samples.
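As a rough sketch of the conversion to samples, the following assumes that "evenly rounded" means rounding halves to the nearest even integer, which is what Python's built-in round() does. to_samples() is a hypothetical helper that only handles plain numbers, strings without a unit, and the units 'ms' and 's'; audmath.duration_in_seconds() supports more formats.

```python
from typing import Union


def to_samples(value: Union[int, float, str], sampling_rate: int) -> int:
    """Convert a duration or offset to samples (hypothetical helper)."""
    if isinstance(value, str):
        text = value.strip()
        # Check 'ms' before 's', as both end with the letter 's'
        for unit, factor in (("ms", 0.001), ("s", 1.0)):
            if text.endswith(unit):
                seconds = float(text[: -len(unit)]) * factor
                break
        else:
            # Strings without a unit are taken as samples
            return int(text)
    else:
        # Float and integer values are taken as seconds
        seconds = float(value)
    # Assumption: halves are rounded to the nearest even integer
    return round(seconds * sampling_rate)


print(to_samples(0.5, 8000))     # 4000
print(to_samples("4000", 8000))  # 4000
print(to_samples(2.5, 1))        # 2
```

The last call shows the assumed rounding rule: 2.5 samples become 2 rather than 3.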

Parameters

file (str) – file name of input audio file
duration (Union[float, int, str, timedelta64, None]) – return only the specified duration
offset (Union[float, int, str, timedelta64, None]) – start reading at offset
always_2d (bool) – if True it always returns a two-dimensional signal even for mono sound files
dtype (str) – data type of returned signal, select from 'float64', 'float32', 'int32', 'int16'
kwargs – pass on further arguments to soundfile.read()

Return type

tuple[array, int]

Returns

a two-dimensional array in the form [channels, samples]. If the sound file has only one channel and always_2d=False, a one-dimensional array is returned
sample rate of the audio file

Raises

FileNotFoundError – if ffmpeg binary is needed, but cannot be found
RuntimeError – if file is missing, broken or format is not supported
ValueError – if duration is a string that does not match a valid ‘<value><unit>’ pattern or the provided unit is not supported
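One possible way to handle the exceptions listed above is sketched below. try_read() is a hypothetical wrapper, not part of audiofile; the reading function is passed in as an argument so that the sketch stays self-contained.

```python
from typing import Any, Callable, Optional, Tuple


def try_read(
    file: str,
    reader: Callable[..., Any],
    **kwargs: Any,
) -> Tuple[Optional[Any], Optional[str]]:
    """Call reader(file, **kwargs) and report the documented errors as text.

    Hypothetical wrapper; returns (result, None) on success
    and (None, message) on failure.
    """
    try:
        return reader(file, **kwargs), None
    except FileNotFoundError as error:
        # Raised if the ffmpeg binary is needed but cannot be found
        return None, f"converter binary not found: {error}"
    except RuntimeError as error:
        # Raised if the file is missing, broken, or in an unsupported format
        return None, f"file missing, broken, or unsupported: {error}"
    except ValueError as error:
        # Raised if a duration string has an invalid pattern or unit
        return None, f"invalid duration string: {error}"
```

With audiofile installed, it could be called as try_read("mono.wav", audiofile.read, duration=0.5). On success, the first element of the result is the (signal, sampling_rate) tuple returned by read().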

Examples

>>> import numpy as np
>>> from audiofile import read
>>> signal, sampling_rate = read("mono.wav", always_2d=True)
>>> sampling_rate
8000
>>> signal.shape
(1, 12000)
>>> signal, sampling_rate = read("mono.wav")
>>> signal.shape
(12000,)
>>> import audplot
>>> audplot.waveform(signal)

>>> signal, sampling_rate = read("mono.wav", duration=0.5)
>>> # Extend signal to original length
>>> signal = np.pad(signal, (0, 8000))
>>> audplot.waveform(signal)

>>> signal, sampling_rate = read("mono.wav", duration=-0.5)
>>> # Extend signal to original length
>>> signal = np.pad(signal, (8000, 0))
>>> audplot.waveform(signal)

>>> signal, sampling_rate = read("mono.wav", offset="4000", duration="4000")
>>> # Extend signal to original length
>>> signal = np.pad(signal, (4000, 4000))
>>> audplot.waveform(signal)

>>> # Use audresample for resampling and remixing
>>> import audresample
>>> signal, sampling_rate = read("stereo.wav")
>>> signal.shape
(2, 12000)
>>> target_rate = 16000
>>> signal = audresample.resample(signal, sampling_rate, target_rate)
>>> signal.shape
(2, 24000)
>>> signal = audresample.remix(signal, mixdown=True)
>>> signal.shape
(1, 24000)
>>> audplot.waveform(signal)