Maven

class audbackend.interface.Maven(backend, *, extensions=[], regex=False)

Interface for Maven style versioned file access.

Use this interface if you want to version files the way Maven does. For each file on the backend path, one or more versions may exist.

Files are stored under ".../<name-wo-ext>/<version>/<name-wo-ext>-<version>.<ext>". By default, the extension <ext> is the string after the last dot. For example, the backend path ".../file.tar.gz" translates into ".../file.tar/1.0.0/file.tar-1.0.0.gz". Passing a list of custom extensions overrides the default behavior for those extensions. For example, extensions=["tar.gz"] ensures that "tar.gz" is recognized as an extension, so the backend path ".../file.tar.gz" translates into ".../file/1.0.0/file-1.0.0.tar.gz". If regex is set to True, the extensions are treated as regular expressions.
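The translation rule above can be sketched in plain Python. This is only an illustration of the documented mapping, not the library's implementation: the function name maven_path is hypothetical, and the handling of names without any dot is an assumption of this sketch.

```python
import re


def maven_path(path, version, extensions=(), regex=False):
    """Map a backend path to a Maven-style versioned path (illustrative only)."""
    root, _, basename = path.rpartition("/")
    name, ext = basename, ""
    for candidate in extensions:
        # Entries are literal strings unless regex=True
        pattern = candidate if regex else re.escape(candidate)
        match = re.search(rf"\.({pattern})$", basename)
        if match:
            name, ext = basename[: match.start()], match.group(1)
            break
    else:
        # No custom extension matched: use the string after the last dot.
        # Assumption of this sketch: a name without a dot has no extension.
        if "." in basename:
            name, _, ext = basename.rpartition(".")
    suffix = f".{ext}" if ext else ""
    return f"{root}/{name}/{version}/{name}-{version}{suffix}"


print(maven_path("/file.tar.gz", "1.0.0"))
# /file.tar/1.0.0/file.tar-1.0.0.gz
print(maven_path("/file.tar.gz", "1.0.0", extensions=["tar.gz"]))
# /file/1.0.0/file-1.0.0.tar.gz
```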

Parameters
  • backend (Base) – file storage backend

  • extensions (Sequence[str]) – list of file extensions to support that include a "." (e.g. "tar.gz"). By default, only the part after the last "." is considered the file extension

  • regex (bool) – if True, extensions entries are treated as regular expressions. E.g. "\d+.tar.gz" will match "1.tar.gz", "2.tar.gz", … as extensions

Examples

>>> import audbackend
>>> from audbackend.interface import Maven
>>> file = "src.txt"
>>> backend = audbackend.backend.FileSystem("host", "repo")
>>> backend.open()
>>> interface = Maven(backend)
>>> interface.put_archive(".", "/sub/archive.zip", "1.0.0", files=[file])
>>> for version in ["1.0.0", "2.0.0"]:
...     interface.put_file(file, "/file.txt", version)
>>> interface.ls()
[('/file.txt', '1.0.0'), ('/file.txt', '2.0.0'), ('/sub/archive.zip', '1.0.0')]
>>> interface.get_file("/file.txt", "dst.txt", "2.0.0")
'...dst.txt'

backend

Maven.backend

Backend object.

Returns

backend object

Examples

>>> interface.backend
audbackend.backend.FileSystem('host', 'repo')

checksum()

Maven.checksum(path, version)

MD5 checksum for file on backend.

Parameters
  • path (str) – path to file on backend

  • version (str) – version string

Return type

str

Returns

MD5 checksum

Raises
  • BackendError – if an error is raised on the backend, e.g. path does not exist

  • ValueError – if path does not start with '/', ends on '/', or does not match '[A-Za-z0-9/._-]+'

  • ValueError – if version is empty or does not match '[A-Za-z0-9._-]+'

  • RuntimeError – if backend was not opened
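The two ValueError conditions listed above recur for most methods of this interface. A minimal sketch of such checks, using the patterns quoted in the Raises list, could look as follows. The helper names check_path and check_version are hypothetical and are not part of the documented API.

```python
import re

# Patterns quoted from the Raises list
PATH_PATTERN = re.compile(r"[A-Za-z0-9/._-]+")
VERSION_PATTERN = re.compile(r"[A-Za-z0-9._-]+")


def check_path(path):
    """Raise ValueError if path is not a valid path to a file on the backend."""
    if not path.startswith("/"):
        raise ValueError(f"path must start with '/': {path!r}")
    if path.endswith("/"):
        raise ValueError(f"path must not end on '/': {path!r}")
    if not PATH_PATTERN.fullmatch(path):
        raise ValueError(f"path contains invalid characters: {path!r}")
    return path


def check_version(version):
    """Raise ValueError if version is empty or contains invalid characters."""
    if not version:
        raise ValueError("version must not be empty")
    if not VERSION_PATTERN.fullmatch(version):
        raise ValueError(f"version contains invalid characters: {version!r}")
    return version
```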

Examples

>>> file = "src.txt"
>>> import audeer
>>> audeer.md5(file)
'd41d8cd98f00b204e9800998ecf8427e'
>>> interface.put_file(file, "/file.txt", "1.0.0")
>>> interface.checksum("/file.txt", "1.0.0")
'd41d8cd98f00b204e9800998ecf8427e'
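The value in the example is the MD5 digest of empty input, which suggests that src.txt is an empty file there. For reference, an MD5 checksum of a local file can be computed with the standard library alone. The helper name md5_of_file is hypothetical; this is a sketch, not how the backend computes its checksum.

```python
import hashlib


def md5_of_file(path, chunk_size=8192):
    """Return the MD5 hex digest of a local file, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as fp:
        # Read fixed-size chunks so large files need not fit into memory
        for chunk in iter(lambda: fp.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```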

copy_file()

Maven.copy_file(src_path, dst_path, *, version=None, validate=False, verbose=False)

Copy file on backend.

If version is None all versions of src_path will be copied.

If dst_path exists and has a different checksum, it is overwritten. Otherwise, the operation is silently skipped.

If validate is set to True, a final check is performed to assert that src_path and dst_path have the same checksum. If it fails, dst_path is removed and an InterruptedError is raised.
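The skip, overwrite, and validate rules described above can be illustrated with an in-memory stand-in for the backend. Everything in this sketch is hypothetical: the store is a plain dict mapping paths to bytes, and copy_in_store is not a function of the library.

```python
import hashlib


def md5(data):
    return hashlib.md5(data).hexdigest()


def copy_in_store(store, src_path, dst_path, *, validate=False):
    """Copy bytes within a dict-based store; return True if data was written."""
    if dst_path in store and md5(store[dst_path]) == md5(store[src_path]):
        # Same checksum: the operation is silently skipped
        return False
    # Missing destination or different checksum: (over)write
    store[dst_path] = store[src_path]
    if validate and md5(store[dst_path]) != md5(store[src_path]):
        # Final check failed: remove the destination again
        del store[dst_path]
        raise InterruptedError(f"validation failed for {dst_path!r}")
    return True
```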

Parameters
  • src_path (str) – source path to file on backend

  • dst_path (str) – destination path to file on backend

  • validate (bool) – verify file was successfully copied

  • version (Optional[str]) – version string

  • verbose (bool) – show debug messages

Raises
  • BackendError – if an error is raised on the backend

  • InterruptedError – if validation fails

  • ValueError – if src_path or dst_path does not start with '/', ends on '/', or does not match '[A-Za-z0-9/._-]+'

  • ValueError – if version is empty or does not match '[A-Za-z0-9._-]+'

  • RuntimeError – if backend was not opened

Examples

>>> file = "src.txt"
>>> interface.put_file(file, "/file.txt", "1.0.0")
>>> interface.exists("/copy.txt", "1.0.0")
False
>>> interface.copy_file("/file.txt", "/copy.txt", version="1.0.0")
>>> interface.exists("/copy.txt", "1.0.0")
True

date()

Maven.date(path, version)

Last modification date of file on backend.

If the date cannot be determined, an empty string is returned.

Parameters
  • path (str) – path to file on backend

  • version (str) – version string

Return type

str

Returns

date in format 'yyyy-mm-dd'

Raises
  • BackendError – if an error is raised on the backend, e.g. path does not exist

  • ValueError – if path does not start with '/', ends on '/', or does not match '[A-Za-z0-9/._-]+'

  • ValueError – if version is empty or does not match '[A-Za-z0-9._-]+'

  • RuntimeError – if backend was not opened

Examples

>>> file = "src.txt"
>>> interface.put_file(file, "/file.txt", "1.0.0")
>>> interface.date("/file.txt", "1.0.0")
'1991-02-20'

exists()

Maven.exists(path, version, *, suppress_backend_errors=False)

Check if file exists on backend.

Parameters
  • path (str) – path to file on backend

  • version (str) – version string

  • suppress_backend_errors (bool) – if set to True, silently catch errors raised on the backend and return False

Return type

bool

Returns

True if file exists

Raises
  • BackendError – if suppress_backend_errors is False and an error is raised on the backend, e.g. due to a connection timeout

  • ValueError – if path does not start with '/', ends on '/', or does not match '[A-Za-z0-9/._-]+'

  • ValueError – if version is empty or does not match '[A-Za-z0-9._-]+'

  • RuntimeError – if backend was not opened
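The effect of suppress_backend_errors can be pictured as a small wrapper around the call that talks to the backend. The sketch below uses a hypothetical stand-in class named BackendError and a hypothetical helper call_on_backend; it only illustrates the documented behavior (return a fallback value instead of raising) and makes no claim about the library's internals.

```python
class BackendError(Exception):
    """Hypothetical stand-in for the error raised by the interface."""

    def __init__(self, exception):
        super().__init__(str(exception))
        self.exception = exception  # keep the original error for inspection


def call_on_backend(function, *, suppress_backend_errors=False, fallback=None):
    """Run function(); on failure either return fallback or raise BackendError."""
    try:
        return function()
    except Exception as ex:
        if suppress_backend_errors:
            return fallback
        raise BackendError(ex) from ex
```

With this wrapper, an exists-style check would pass fallback=False, whereas a listing call would pass an empty list.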

Examples

>>> file = "src.txt"
>>> interface.exists("/file.txt", "1.0.0")
False
>>> interface.put_file(file, "/file.txt", "1.0.0")
>>> interface.exists("/file.txt", "1.0.0")
True

get_archive()

Maven.get_archive(src_path, dst_root, version, *, tmp_root=None, validate=False, verbose=False)

Get archive from backend and extract.

The archive type is derived from the extension of src_path. See audeer.extract_archive() for supported extensions.

If dst_root does not exist, it is created.

If validate is set to True, a final check is performed to assert that src_path and the retrieved archive have the same checksum. If it fails, the retrieved archive is removed and an InterruptedError is raised.

Parameters
  • src_path (str) – path to archive on backend

  • dst_root (str) – local destination directory

  • version (str) – version string

  • tmp_root (Optional[str]) – directory under which archive is temporarily extracted. Defaults to temporary directory of system

  • validate (bool) – verify archive was successfully retrieved from the backend

  • verbose (bool) – show debug messages

Return type

list[str]

Returns

extracted files

Raises

Examples

>>> import os
>>> file = "src.txt"
>>> interface.put_archive(".", "/sub/archive.zip", "1.0.0", files=[file])
>>> os.remove(file)
>>> interface.get_archive("/sub/archive.zip", ".", "1.0.0")
['src.txt']

get_file()

Maven.get_file(src_path, dst_path, version, *, validate=False, verbose=False)

Get file from backend.

If the folder of dst_path does not exist, it is created.

If dst_path exists and has a different checksum, it is overwritten. Otherwise, the operation is silently skipped.

If validate is set to True, a final check is performed to assert that src_path and dst_path have the same checksum. If it fails, dst_path is removed and an InterruptedError is raised.

Parameters
  • src_path (str) – path to file on backend

  • dst_path (str) – destination path to local file

  • version (str) – version string

  • validate (bool) – verify file was successfully retrieved from the backend

  • verbose (bool) – show debug messages

Return type

str

Returns

full path to local file

Raises
  • BackendError – if an error is raised on the backend, e.g. src_path does not exist

  • InterruptedError – if validation fails

  • IsADirectoryError – if dst_path points to an existing folder

  • PermissionError – if the user lacks write permissions for dst_path

  • ValueError – if src_path does not start with '/', ends on '/', or does not match '[A-Za-z0-9/._-]+'

  • ValueError – if version is empty or does not match '[A-Za-z0-9._-]+'

  • RuntimeError – if backend was not opened

Examples

>>> import os
>>> file = "src.txt"
>>> interface.put_file(file, "/file.txt", "1.0.0")
>>> os.path.exists("dst.txt")
False
>>> _ = interface.get_file("/file.txt", "dst.txt", "1.0.0")
>>> os.path.exists("dst.txt")
True

host

Maven.host

Host path.

Returns

host path

Examples

>>> interface.host
'host'

join()

Maven.join(path, *paths)

Join path components into a path on the backend.

Parameters
  • path (str) – first part of path

  • *paths – additional parts of path

Return type

str

Returns

path joined by Backend.sep

Raises

ValueError – if path contains invalid character or does not start with '/', or if joined path contains invalid character

Examples

>>> interface.join("/", "file.txt")
'/file.txt'
>>> interface.join("/sub", "file.txt")
'/sub/file.txt'
>>> interface.join("//sub//", "/", "", None, "/file.txt")
'/sub/file.txt'
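The examples above suggest that empty and None parts are dropped and that repeated separators are collapsed. A minimal sketch that reproduces these three results is shown below. The function join_parts is hypothetical, and the check for invalid characters is left out for brevity.

```python
def join_parts(path, *paths, sep="/"):
    """Join path parts with sep, dropping empty parts and repeated separators."""
    if not path.startswith(sep):
        raise ValueError(f"path must start with {sep!r}: {path!r}")
    # Drop None and empty strings before joining
    parts = [path] + [part for part in paths if part]
    joined = sep.join(parts)
    # Collapse runs of separators into a single one
    while sep + sep in joined:
        joined = joined.replace(sep + sep, sep)
    return joined


print(join_parts("//sub//", "/", "", None, "/file.txt"))
# /sub/file.txt
```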

latest_version()

Maven.latest_version(path)

Latest version of a file.

Parameters

path (str) – path to file on backend

Return type

str

Returns

version string

Raises
  • BackendError – if an error is raised on the backend, e.g. path does not exist

  • ValueError – if path does not start with '/', ends on '/', or does not match '[A-Za-z0-9/._-]+'

  • RuntimeError – if backend was not opened

Examples

>>> file = "src.txt"
>>> interface.put_file(file, "/file.txt", "1.0.0")
>>> interface.put_file(file, "/file.txt", "2.0.0")
>>> interface.latest_version("/file.txt")
'2.0.0'

ls()

Maven.ls(path='/', *, latest_version=False, pattern=None, suppress_backend_errors=False)

List files on backend.

Returns a sorted list of tuples with path and version. If a full path (e.g. /sub/file.ext) is provided, all versions of the path are returned. If a sub-path (e.g. /sub/) is provided, all files that start with the sub-path are returned. When path is set to '/' a (possibly empty) list with all files on the backend is returned.

Parameters
  • path (str) – path or sub-path (if it ends with '/') on backend

  • latest_version (bool) – if multiple versions of a file exist, only include the latest

  • pattern (Optional[str]) – if not None, return only files matching the pattern string, see fnmatch.fnmatch()

  • suppress_backend_errors (bool) – if set to True, silently catch errors raised on the backend and return an empty list

Return type

list[tuple[str, str]]

Returns

list of tuples (path, version)

Raises
  • BackendError – if suppress_backend_errors is False and an error is raised on the backend, e.g. path does not exist

  • ValueError – if path does not start with '/' or does not match '[A-Za-z0-9/._-]+'

  • RuntimeError – if backend was not opened

Examples

>>> file = "src.txt"
>>> interface.put_archive(".", "/sub/archive.zip", "1.0.0", files=[file])
>>> for version in ["1.0.0", "2.0.0"]:
...     interface.put_file(file, "/file.txt", version)
>>> interface.ls()
[('/file.txt', '1.0.0'), ('/file.txt', '2.0.0'), ('/sub/archive.zip', '1.0.0')]
>>> interface.ls(latest_version=True)
[('/file.txt', '2.0.0'), ('/sub/archive.zip', '1.0.0')]
>>> interface.ls("/file.txt")
[('/file.txt', '1.0.0'), ('/file.txt', '2.0.0')]
>>> interface.ls(pattern="*.txt")
[('/file.txt', '1.0.0'), ('/file.txt', '2.0.0')]
>>> interface.ls(pattern="archive.*")
[('/sub/archive.zip', '1.0.0')]
>>> interface.ls("/sub/")
[('/sub/archive.zip', '1.0.0')]
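The selection rules shown in these examples can be reproduced on a plain list of (path, version) tuples. In the sketch below the function select is hypothetical. Matching the pattern against the basename only is an assumption inferred from the example in which "archive.*" matches "/sub/archive.zip".

```python
import fnmatch
import posixpath


def select(entries, path="/", *, pattern=None):
    """Filter (path, version) tuples by full path or sub-path and by pattern."""
    if path.endswith("/"):
        # Sub-path: keep every file that starts with it
        selected = [entry for entry in entries if entry[0].startswith(path)]
    else:
        # Full path: keep all versions of exactly this file
        selected = [entry for entry in entries if entry[0] == path]
    if pattern is not None:
        # Assumption: the pattern is matched against the basename
        selected = [
            entry
            for entry in selected
            if fnmatch.fnmatch(posixpath.basename(entry[0]), pattern)
        ]
    return sorted(selected)


entries = [
    ("/sub/archive.zip", "1.0.0"),
    ("/file.txt", "2.0.0"),
    ("/file.txt", "1.0.0"),
]
print(select(entries, "/sub/"))
# [('/sub/archive.zip', '1.0.0')]
```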

move_file()

Maven.move_file(src_path, dst_path, *, version=None, validate=False, verbose=False)

Move file on backend.

If version is None all versions of src_path will be moved.

If dst_path exists and has a different checksum, it is overwritten. Otherwise, src_path is removed and the operation is silently skipped.

If validate is set to True, a final check is performed to assert that src_path and dst_path have the same checksum. If it fails, dst_path is removed and an InterruptedError is raised. To ensure src_path still exists in this case it is first copied and only removed when the check has successfully passed.
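The ordering described above (copy first, check, and only then remove the source) is what keeps src_path intact when validation fails. The following in-memory sketch illustrates that ordering. The dict-based store, the function move_in_store, and the transfer hook used to simulate a faulty copy are all hypothetical.

```python
import hashlib


def md5(data):
    return hashlib.md5(data).hexdigest()


def move_in_store(store, src_path, dst_path, *, validate=False, transfer=bytes):
    """Move bytes within a dict-based store, removing the source last."""
    if dst_path in store and md5(store[dst_path]) == md5(store[src_path]):
        # Destination already holds the same content: only remove the source
        del store[src_path]
        return
    # Step 1: copy (transfer can be replaced to simulate a faulty copy)
    store[dst_path] = transfer(store[src_path])
    # Step 2: optional final check; on failure the source is left untouched
    if validate and md5(store[dst_path]) != md5(store[src_path]):
        del store[dst_path]
        raise InterruptedError(f"validation failed for {dst_path!r}")
    # Step 3: remove the source only after the check has passed
    del store[src_path]
```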

Parameters
  • src_path (str) – source path to file on backend

  • dst_path (str) – destination path to file on backend

  • version (Optional[str]) – version string

  • validate (bool) – verify file was successfully moved

  • verbose (bool) – show debug messages

Raises
  • BackendError – if an error is raised on the backend

  • InterruptedError – if validation fails

  • ValueError – if src_path or dst_path does not start with '/', ends on '/', or does not match '[A-Za-z0-9/._-]+'

  • ValueError – if version is empty or does not match '[A-Za-z0-9._-]+'

  • RuntimeError – if backend was not opened

Examples

>>> file = "src.txt"
>>> interface.put_file(file, "/file.txt", "1.0.0")
>>> interface.exists("/move.txt", "1.0.0")
False
>>> interface.move_file("/file.txt", "/move.txt", version="1.0.0")
>>> interface.exists("/move.txt", "1.0.0")
True
>>> interface.exists("/file.txt", "1.0.0")
False

owner()

Maven.owner(path, version)

Owner of file on backend.

If the owner of the file cannot be determined, an empty string is returned.

Parameters
  • path (str) – path to file on backend

  • version (str) – version string

Return type

str

Returns

owner

Raises
  • BackendError – if an error is raised on the backend, e.g. path does not exist

  • ValueError – if path does not start with '/', ends on '/', or does not match '[A-Za-z0-9/._-]+'

  • ValueError – if version is empty or does not match '[A-Za-z0-9._-]+'

  • RuntimeError – if backend was not opened

Examples

>>> file = "src.txt"
>>> interface.put_file(file, "/file.txt", "1.0.0")
>>> interface.owner("/file.txt", "1.0.0")
'doctest'

put_archive()

Maven.put_archive(src_root, dst_path, version, *, files=None, tmp_root=None, validate=False, verbose=False)

Create archive and put on backend.

The archive type is derived from the extension of dst_path. See audeer.create_archive() for supported extensions.

The operation is silently skipped if an archive with the same checksum already exists on the backend.

If validate is set to True, a final check is performed to assert that the local archive and dst_path have the same checksum. If it fails, dst_path is removed and an InterruptedError is raised.
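Before anything is put on the backend, a local archive has to be created from src_root, optionally restricted to files. The sketch below shows one way to do this for the zip case with the standard library only. The function create_zip is hypothetical and is not the routine the library uses, which is audeer.create_archive() according to the text above.

```python
import os
import zipfile


def create_zip(src_root, archive_path, files=None):
    """Create a zip archive from files below src_root (illustrative only)."""
    src_root = os.path.abspath(src_root)
    if files is None:
        # Default: include every file below src_root
        files = [
            os.path.relpath(os.path.join(folder, name), src_root)
            for folder, _, names in os.walk(src_root)
            for name in names
        ]
    elif isinstance(files, str):
        files = [files]
    # Check all files before writing, so that no partial archive is left behind
    members = []
    for file in sorted(files):
        full = os.path.abspath(os.path.join(src_root, file))
        if os.path.commonpath([src_root, full]) != src_root:
            raise RuntimeError(f"{file!r} is not below {src_root!r}")
        if not os.path.exists(full):
            raise FileNotFoundError(full)
        members.append(full)
    with zipfile.ZipFile(archive_path, "w") as archive:
        for full in members:
            archive.write(full, arcname=os.path.relpath(full, src_root))
    return archive_path
```

In this sketch the archive should be written outside of src_root (for example below a temporary directory), so that it does not end up inside itself when files is None.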

Parameters
  • src_root (str) – local root directory where files are located. By default, all files below src_root will be included into the archive. Use files to select specific files

  • dst_path (str) – path to archive on backend

  • version (str) – version string

  • files (Union[str, Sequence[str], None]) – file(s) to include into the archive. Must exist within src_root

  • tmp_root (Optional[str]) – directory under which archive is temporarily created. Defaults to temporary directory of system

  • validate (bool) – verify archive was successfully put on the backend

  • verbose (bool) – show debug messages

Raises
  • BackendError – if an error is raised on the backend

  • FileNotFoundError – if src_root, tmp_root, or one or more files do not exist

  • InterruptedError – if validation fails

  • NotADirectoryError – if src_root is not a folder

  • RuntimeError – if dst_path does not end with zip or tar.gz, or if a file in files is not below src_root

  • ValueError – if dst_path does not start with '/', ends on '/', or does not match '[A-Za-z0-9/._-]+'

  • ValueError – if version is empty or does not match '[A-Za-z0-9._-]+'

  • RuntimeError – if backend was not opened

Examples

>>> file = "src.txt"
>>> interface.exists("/sub/archive.tar.gz", "1.0.0")
False
>>> interface.put_archive(".", "/sub/archive.tar.gz", "1.0.0")
>>> interface.exists("/sub/archive.tar.gz", "1.0.0")
True

put_file()

Maven.put_file(src_path, dst_path, version, *, validate=False, verbose=False)

Put file on backend.

The operation is silently skipped if a file with the same checksum already exists on the backend.

If validate is set to True, a final check is performed to assert that src_path and dst_path have the same checksum. If it fails, dst_path is removed and an InterruptedError is raised.

Parameters
  • src_path (str) – path to local file

  • dst_path (str) – path to file on backend

  • version (str) – version string

  • validate (bool) – verify file was successfully put on the backend

  • verbose (bool) – show debug messages

Returns

file path on backend

Raises

Examples

>>> file = "src.txt"
>>> interface.exists("/file.txt", "3.0.0")
False
>>> interface.put_file(file, "/file.txt", "3.0.0")
>>> interface.exists("/file.txt", "3.0.0")
True

remove_file()

Maven.remove_file(path, version)

Remove file from backend.

Parameters
  • path (str) – path to file on backend

  • version (str) – version string

Raises
  • BackendError – if an error is raised on the backend, e.g. path does not exist

  • ValueError – if path does not start with '/', ends on '/', or does not match '[A-Za-z0-9/._-]+'

  • ValueError – if version is empty or does not match '[A-Za-z0-9._-]+'

  • RuntimeError – if backend was not opened

Examples

>>> file = "src.txt"
>>> interface.put_file(file, "/file.txt", "1.0.0")
>>> interface.exists("/file.txt", "1.0.0")
True
>>> interface.remove_file("/file.txt", "1.0.0")
>>> interface.exists("/file.txt", "1.0.0")
False

repository

Maven.repository

Repository name.

Returns

repository name

Examples

>>> interface.repository
'repo'

sep

Maven.sep

File separator on backend.

Returns

file separator

Examples

>>> interface.sep
'/'

split()

Maven.split(path)

Split path on backend into sub-path and basename.

Parameters

path (str) – path containing Backend.sep as separator

Return type

tuple[str, str]

Returns

tuple containing (root, basename)

Raises

ValueError – if path does not start with '/' or does not match '[A-Za-z0-9/._-]+'

Examples

>>> interface.split("/")
('/', '')
>>> interface.split("/file.txt")
('/', 'file.txt')
>>> interface.split("/sub/")
('/sub/', '')
>>> interface.split("/sub//file.txt")
('/sub/', 'file.txt')
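The four results above are consistent with collapsing repeated separators and then splitting at the last separator. A minimal sketch that reproduces them follows. The function split_path is hypothetical, and the check for invalid characters is omitted.

```python
def split_path(path, sep="/"):
    """Split a backend path into (root, basename), with root ending on sep."""
    if not path.startswith(sep):
        raise ValueError(f"path must start with {sep!r}: {path!r}")
    # Collapse runs of separators into a single one
    while sep + sep in path:
        path = path.replace(sep + sep, sep)
    # Everything up to and including the last separator is the root
    root, _, basename = path.rpartition(sep)
    return root + sep, basename


print(split_path("/sub//file.txt"))
# ('/sub/', 'file.txt')
```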

versions()

Maven.versions(path, *, suppress_backend_errors=False)

Versions of a file.

Parameters
  • path (str) – path to file on backend

  • suppress_backend_errors (bool) – if set to True, silently catch errors raised on the backend and return an empty list

Return type

list[str]

Returns

list of versions in ascending order

Raises
  • BackendError – if suppress_backend_errors is False and an error is raised on the backend, e.g. path does not exist

  • ValueError – if path does not start with '/', ends on '/', or does not match '[A-Za-z0-9/._-]+'

  • RuntimeError – if backend was not opened

Examples

>>> file = "src.txt"
>>> interface.put_file(file, "/file.txt", "1.0.0")
>>> interface.put_file(file, "/file.txt", "2.0.0")
>>> interface.versions("/file.txt")
['1.0.0', '2.0.0']
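This page does not state how versions are compared. The sketch below assumes purely numeric, dot-separated versions and shows why ascending order is not the same as plain string order once a component has more than one digit. The helpers version_key and sort_versions are hypothetical.

```python
def version_key(version):
    """Turn '10.0.1' into (10, 0, 1) so that components compare as numbers."""
    # Assumption: every component is an integer
    return tuple(int(part) for part in version.split("."))


def sort_versions(versions):
    """Return versions in ascending numeric order."""
    return sorted(versions, key=version_key)


versions = ["10.0.0", "2.0.0", "1.0.0"]
print(sorted(versions))
# ['1.0.0', '10.0.0', '2.0.0']
print(sort_versions(versions))
# ['1.0.0', '2.0.0', '10.0.0']
```

Under the same assumption, the latest version is the last element of the sorted list.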