stream()
- audb.stream(name, table, *, version=None, map=None, batch_size=16, shuffle=False, buffer_size=100000, only_metadata=False, bit_depth=None, channels=None, format=None, mixdown=False, sampling_rate=None, full_path=True, cache_root=None, num_workers=1, timeout=86400, verbose=True)
Stream table and media files of a database.
Loads only the first batch_size rows of a table into memory, and downloads only the related media files, if any media files are requested.
By setting bit_depth, channels, format, mixdown, and sampling_rate we can request a specific flavor of the database. In that case media files are automatically converted to the desired properties (see also audb.Flavor).
- Parameters
name (str) – name of database
table (str) – name of table
map (Optional[dict[str, str | Sequence[str]]]) – map scheme or scheme fields to column values. For example, if your table holds a column speaker with speaker IDs, which is assigned to a scheme that contains a dict mapping speaker IDs to age and gender entries, map={'speaker': ['age', 'gender']} will replace the column with two new columns that map ID values to age and gender, respectively. To also keep the original column with speaker IDs, you can do map={'speaker': ['speaker', 'age', 'gender']}
batch_size (int) – number of table rows to return in one iteration
shuffle (bool) – if True, first reads buffer_size rows from the table and randomly selects batch_size rows from them
buffer_size (int) – number of table rows to be loaded when shuffle is True
only_metadata (bool) – load only header and tables of database
channels (Union[int, Sequence[int], None]) – channel selection, see audresample.remix(). Note that media files with too few channels will first be upmixed by repeating the existing channels. E.g. channels=[0, 1] upmixes all mono files to stereo, and channels=[1] returns the second channel of all multi-channel files and the single channel of all mono files
mixdown (bool) – apply mono mix-down
sampling_rate (Optional[int]) – sampling rate in Hz, one of 8000, 16000, 22050, 24000, 44100, 48000
full_path (bool) – replace relative with absolute file paths
cache_root (Optional[str]) – cache folder where databases are stored. If not set, audb.default_cache_root() is used
num_workers (Optional[int]) – number of parallel jobs or 1 for sequential processing. If None, it will be set to the number of processors on the machine multiplied by 5
timeout (float) – maximum time in seconds to wait for a lock on the database cache folder before giving up. None is returned in this case
verbose (bool) – show debug messages
- Return type
- Returns
database object
- Raises
ValueError – if a table is requested that is not part of the database
ValueError – if a non-supported bit_depth, format, or sampling_rate is requested
RuntimeError – if a flavor is requested, but the database contains media files that don’t contain audio, e.g. text files
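The effect of the map parameter can be illustrated without audb. The following is a minimal sketch in plain Python, assuming table rows are represented as dictionaries and the scheme labels as a nested dictionary. The helper apply_map and all label values are hypothetical and made up for illustration; they are not part of the audb API and do not reflect how the library implements the mapping internally.

```python
from typing import Any, Dict, List, Sequence, Union


def apply_map(
    rows: List[Dict[str, Any]],
    scheme_labels: Dict[str, Dict[Any, Dict[str, Any]]],
    mapping: Dict[str, Union[str, Sequence[str]]],
) -> List[Dict[str, Any]]:
    """Hypothetical helper: replace mapped columns by scheme fields.

    rows: table rows as {column: value}
    scheme_labels: {column: {label: {field: value}}}
    mapping: same shape as the ``map`` argument, e.g.
        {'speaker': ['age', 'gender']}
    """
    result = []
    for row in rows:
        new_row = {}
        for column, value in row.items():
            if column not in mapping:
                # Columns that are not mapped are kept unchanged
                new_row[column] = value
                continue
            fields = mapping[column]
            if isinstance(fields, str):
                fields = [fields]
            for field in fields:
                if field == column:
                    # Listing the column itself keeps the original values
                    new_row[field] = value
                else:
                    # Look up the field in the label entry for this value
                    new_row[field] = scheme_labels[column][value].get(field)
        result.append(new_row)
    return result


# Made-up labels and rows, for illustration only
labels = {
    "speaker": {
        3: {"age": 31, "gender": "male"},
        8: {"age": 34, "gender": "female"},
    }
}
rows = [
    {"file": "a.wav", "speaker": 3},
    {"file": "b.wav", "speaker": 8},
]
print(apply_map(rows, labels, {"speaker": ["age", "gender"]}))
print(apply_map(rows, labels, {"speaker": ["speaker", "age", "gender"]}))
```

The first call drops the speaker column in favor of age and gender, while the second keeps it because speaker is listed among the target fields.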
Examples
>>> import audb
>>> import numpy as np
>>> np.random.seed(1)
>>> db = audb.stream(
...     "emodb",
...     "files",
...     version="1.4.1",
...     batch_size=4,
...     shuffle=True,
...     only_metadata=True,
...     full_path=False,
...     verbose=False,
... )
>>> next(db)
                                 duration  speaker  transcription
file
wav/14a05Fb.wav 0 days 00:00:03.128687500       14            a05
wav/15a05Eb.wav 0 days 00:00:03.993562500       15            a05
wav/12a05Nd.wav    0 days 00:00:03.185875       12            a05
wav/13a07Na.wav 0 days 00:00:01.911687500       13            a07
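How batch_size, shuffle, and buffer_size interact can be sketched in plain Python. The generator below is a hypothetical illustration, not audb's implementation: stream_batches and its seed argument are invented names, and it assumes that a shuffled buffer is fully drained in batches before the next buffer_size rows are read, which the description above does not state explicitly.

```python
import random
from itertools import islice
from typing import Any, Iterable, Iterator, List, Optional


def stream_batches(
    rows: Iterable[Any],
    batch_size: int = 16,
    shuffle: bool = False,
    buffer_size: int = 100_000,
    seed: Optional[int] = None,
) -> Iterator[List[Any]]:
    """Hypothetical sketch of batched reading with optional buffered shuffling."""
    rng = random.Random(seed)
    it = iter(rows)
    # Without shuffling only batch_size rows are held in memory at a time;
    # with shuffling a buffer of buffer_size rows is read first.
    chunk_size = buffer_size if shuffle else batch_size
    while True:
        chunk = list(islice(it, chunk_size))
        if not chunk:
            return
        if shuffle:
            # Randomize the order inside the buffer only
            rng.shuffle(chunk)
        # Assumption: hand out the buffer in batches until it is empty
        for start in range(0, len(chunk), batch_size):
            yield chunk[start:start + batch_size]


# Sequential reading keeps the table order
for batch in stream_batches(range(10), batch_size=4):
    print(batch)

# Shuffled reading mixes rows only within each buffer of 8 rows
for batch in stream_batches(range(10), batch_size=4, shuffle=True, buffer_size=8, seed=1):
    print(batch)
```

Under these assumptions, rows are only ever mixed with other rows from the same buffer, so a small buffer_size on a table sorted by speaker or label would yield batches that are far from uniformly random. A larger buffer improves the mixing at the cost of memory.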