publish()
- audb.publish(db_root, version, repository, *, archives=None, previous_version='latest', cache_root=None, num_workers=1, verbose=True)
Publish database.
Publishes a database conforming to audformat, stored in the db_root folder.

A database can have dependencies on media files and tables of an older version. E.g. you might alter an existing table by adding labels for new media files to it and publish it as a new version.
audb.publish() will then upload new and altered files and update the dependencies accordingly.

To update a database, you first have to load the version that the new version should depend on with audb.load_to() to db_root. Media files that are not altered can be omitted, so it is recommended to set only_metadata=True in audb.load_to(). Afterwards you make your changes to that folder and run audb.publish(). To remove media files from a database, make sure they are no longer referenced in the tables.

Setting previous_version=None allows you to start from scratch and upload all files even if older versions exist. In this case you don’t call audb.load_to() before running audb.publish().

Handling of audio formats is based on the file extension in audb. This means the file extension must be lowercase and should match the audio format of the file, e.g. .wav.

When canceling audb.publish() during publication you can restart it afterwards. It will continue from the current state, but you might need overwrite permissions in addition to write permissions on the backend.

audb uses md5 hashes of the database files to check if they have changed. Be aware that for certain file formats, like parquet, md5 hashes might differ for files with identical content. Reasons include the library that wrote the file, the compression codecs involved, or additional metadata written by the library. For files stored in parquet format, audb.publish() will first look for a hash stored in the file's metadata under the b"hash" key. For parquet tables, this deterministic hash is automatically added by audformat.

Tables stored only as pickle files are converted to parquet files before publication. If a table is stored as both a parquet and a csv file, the csv file is ignored, and the parquet file is published.
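The update workflow described above (load the previous version with only_metadata=True, change the folder, publish) could look roughly like the following sketch. The helper name publish_update, its arguments, and the apply_changes callback are hypothetical and only serve the illustration; the keyword arguments passed to audb.load_to() are assumed to match its current signature and should be checked against the audb.load_to() documentation.

```python
def publish_update(db_root, name, previous_version, new_version,
                   repository, apply_changes):
    """Sketch: publish `new_version` of database `name` based on `previous_version`.

    `apply_changes` is a hypothetical callback that edits the files in
    `db_root` (e.g. adds media files and extends tables).
    """
    # Imported lazily so the sketch can be read and tested without audb installed
    import audb

    # Step 1: load only header and tables of the version we depend on;
    # unaltered media files do not need to be present in db_root
    audb.load_to(db_root, name, version=previous_version, only_metadata=True)

    # Step 2: make the changes to the database folder
    apply_changes(db_root)

    # Step 3: upload new and altered files and update the dependencies
    return audb.publish(
        db_root,
        new_version,
        repository,
        previous_version=previous_version,
    )
```

When starting from scratch instead, step 1 would be skipped and previous_version=None passed to audb.publish().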
- Parameters
db_root (str) – root directory of database
version (str) – version string
repository (Repository) – name of repository
archives (Optional[Mapping[str, str]]) – dictionary mapping files to archive names. Can be used to bundle files into archives, which will speed up communication with the server if the database contains many small files. Archive names must not include an extension
previous_version (Optional[str]) – specifies the version this publication should be based on. If 'latest', it will automatically use the latest published version, or None if no version was published. If None, it assumes you start from scratch
cache_root (Optional[str]) – cache folder where databases are stored. If not set, audb.default_cache_root() is used. Only used to read the dependencies of the previous version
num_workers (Optional[int]) – number of parallel jobs, or 1 for sequential processing. If None, it will be set to the number of processors on the machine multiplied by 5
verbose (bool) – show debug messages
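As an illustration of the archives argument, the following sketch builds a mapping from media files to archive names by grouping files per folder. The helper archives_by_folder and the naming scheme are assumptions made for this example and are not part of audb; the file paths are assumed to be given relative to db_root with forward slashes, as they would appear in the tables.

```python
from pathlib import PurePosixPath


def archives_by_folder(files):
    """Map every file to an archive named after its parent folder.

    Hypothetical helper: files in the same folder end up in the same
    archive, which bundles many small files into fewer uploads.
    """
    archives = {}
    for file in files:
        parts = PurePosixPath(file).parent.parts
        # Join folder names with "-" and replace dots,
        # so the archive name cannot be mistaken for having an extension
        name = "-".join(parts).replace(".", "_") or "root"
        archives[file] = name
    return archives


files = ["audio/spk1/a.wav", "audio/spk1/b.wav", "audio/spk2/c.wav"]
archives = archives_by_folder(files)
```

The resulting dictionary could then be passed as archives=archives to audb.publish().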
- Return type
Dependencies
- Returns
dependency object
- Raises
RuntimeError – if version already exists
RuntimeError – if database tables reference non-existing files
RuntimeError – if database attachment path does not exist, is a symlink, is empty, or contains an empty sub-folder
RuntimeError – if database in db_root depends on a version other than the one indicated by previous_version
RuntimeError – if database is not portable, see audformat.Database.is_portable()
RuntimeError – if non-standard formats like MP3 and MP4 are published, but sox and/or mediafile is not installed
RuntimeError – if the type of a database file changes, e.g. from media to attachment
RuntimeError – if a new media file has an uppercase letter in its file extension
RuntimeError – if database contains tables, misc tables, or attachments that are stored under an ID using a char not in '[A-Za-z0-9._-]'
ValueError – if version or previous_version cannot be parsed by audeer.StrictVersion
ValueError – if previous_version >= version
ValueError – if repository has artifactory as backend in Python>=3.12
ValueError – if repository has a non-supported backend
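Two of the conditions listed above, uppercase letters in a file extension and IDs with characters outside '[A-Za-z0-9._-]', can be checked locally before calling audb.publish(). The following is a minimal sketch of such a pre-flight check using only the standard library; the function names are hypothetical, and the checks only approximate what audb does internally.

```python
import os
import re

# Character set taken from the error description above
ALLOWED_ID = re.compile(r"[A-Za-z0-9._-]+")


def has_lowercase_extension(path):
    """Return True if the file extension contains no uppercase letter."""
    extension = os.path.splitext(path)[1]
    return extension == extension.lower()


def is_valid_id(identifier):
    """Return True if the ID consists only of allowed characters."""
    return ALLOWED_ID.fullmatch(identifier) is not None


def preflight(media_files, ids):
    """Collect human-readable problems instead of raising on the first one."""
    problems = []
    for path in media_files:
        if not has_lowercase_extension(path):
            problems.append(f"uppercase letter in file extension: {path}")
    for identifier in ids:
        if not is_valid_id(identifier):
            problems.append(f"invalid character in ID: {identifier!r}")
    return problems
```

An empty list returned by preflight() does not guarantee that publication succeeds, since the remaining conditions are still checked by audb.publish() itself.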