read()

audiofile.read(file, duration=None, offset=None, always_2d=False, dtype='float32', **kwargs)

Read audio file.

It uses soundfile.read() for WAV, FLAC, MP3, and OGG files. All other audio files are first converted to WAV by sox or ffmpeg.

duration and offset support all formats mentioned in audmath.duration_in_seconds(), like '2 ms' or pd.to_timedelta(2, 's'). The exception is that float and integer values are always interpreted as seconds, and strings without a unit are always interpreted as samples. If duration and/or offset are negative, they are interpreted from right to left; for offset=None, a negative duration starts from the end of the signal. If the signal is shorter than the requested duration and/or offset, only the part of the signal that overlaps with the requested range is returned. For example, for a file containing the signal [0, 1, 2], duration=2, offset=-4 will return [0].
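The selection rules above can be sketched in plain Python. This is a minimal illustration that works directly in sample indices on a list; it is not audiofile's actual implementation. The helper name select_samples is hypothetical, and the handling of each sign combination is one reading of the paragraph above.

```python
def select_samples(signal, duration=None, offset=None):
    """Return the part of ``signal`` selected by duration/offset in samples.

    Hypothetical helper that models the documented rules on a plain list.
    """
    n = len(signal)

    # The anchor is the position the offset refers to:
    # counted from the start if non-negative, from the end if negative.
    if offset is None:
        # Without an offset, a negative duration is anchored at the end.
        anchor = n if (duration is not None and duration < 0) else 0
    elif offset >= 0:
        anchor = offset
    else:
        anchor = n + offset

    # A positive duration extends to the right of the anchor,
    # a negative duration to the left of it.
    if duration is None:
        start, end = anchor, n
    elif duration >= 0:
        start, end = anchor, anchor + duration
    else:
        start, end = anchor + duration, anchor

    # Keep only the part that overlaps with the signal.
    start = min(max(start, 0), n)
    end = min(max(end, 0), n)
    return signal[start:end]


print(select_samples([0, 1, 2], duration=2, offset=-4))  # [0]
print(select_samples([0, 1, 2], duration=-2))            # [1, 2]
```

In the first call the requested range covers sample positions -1 to 1, so only the first sample overlaps with the signal.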

duration and offset are evenly rounded after conversion to samples.
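As a small illustration of the conversion to samples, the sketch below assumes that "evenly rounded" means rounding halves to the nearest even integer, which is what Python's built-in round() does. The helper name seconds_to_samples is hypothetical and not part of audiofile.

```python
def seconds_to_samples(seconds, sampling_rate):
    """Convert seconds to a whole number of samples (hypothetical helper).

    Assumes round-half-to-even, the behavior of Python's built-in round().
    """
    return round(seconds * sampling_rate)


# 1.25 s and 1.75 s at 2 Hz are exactly 2.5 and 3.5 samples
print(seconds_to_samples(1.25, 2))    # 2
print(seconds_to_samples(1.75, 2))    # 4
print(seconds_to_samples(0.5, 8000))  # 4000
```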

Parameters
  • file (str) – file name of input audio file

  • duration (Union[float, int, str, timedelta64, None]) – return only the specified duration

  • offset (Union[float, int, str, timedelta64, None]) – start reading at offset

  • always_2d (bool) – if True, a two-dimensional signal is always returned, even for mono sound files

  • dtype (str) – data type of returned signal, select from 'float64', 'float32', 'int32', 'int16'

  • kwargs – pass on further arguments to soundfile.read()

Return type

tuple[array, int]

Returns

  • a two-dimensional array in the form [channels, samples]. If the sound file has only one channel and always_2d=False, a one-dimensional array is returned

  • sample rate of the audio file

Raises
  • FileNotFoundError – if ffmpeg binary is needed, but cannot be found

  • RuntimeError – if file is missing, broken or format is not supported

  • ValueError – if duration is a string that does not match a valid ‘<value><unit>’ pattern or the provided unit is not supported

Examples

>>> import numpy as np
>>> from audiofile import read
>>> signal, sampling_rate = read("mono.wav", always_2d=True)
>>> sampling_rate
8000
>>> signal.shape
(1, 12000)
>>> signal, sampling_rate = read("mono.wav")
>>> signal.shape
(12000,)
>>> import audplot
>>> audplot.waveform(signal)
>>> signal, sampling_rate = read("mono.wav", duration=0.5)
>>> # Extend signal to original length
>>> signal = np.pad(signal, (0, 8000))
>>> audplot.waveform(signal)
>>> signal, sampling_rate = read("mono.wav", duration=-0.5)
>>> # Extend signal to original length
>>> signal = np.pad(signal, (8000, 0))
>>> audplot.waveform(signal)
>>> signal, sampling_rate = read("mono.wav", offset="4000", duration="4000")
>>> # Extend signal to original length
>>> signal = np.pad(signal, (4000, 4000))
>>> audplot.waveform(signal)
>>> # Use audresample for resampling and remixing
>>> import audresample
>>> signal, sampling_rate = read("stereo.wav")
>>> signal.shape
(2, 12000)
>>> target_rate = 16000
>>> signal = audresample.resample(signal, sampling_rate, target_rate)
>>> signal.shape
(2, 24000)
>>> signal = audresample.remix(signal, mixdown=True)
>>> signal.shape
(1, 24000)
>>> audplot.waveform(signal)