checksum()

audbackend.checksum(file)

Checksum of file.

This function is used by backends to get the checksum of local files, using audeer.md5().

Parquet files are an exception: their "hash" metadata entry is used as the checksum if the entry is available and pyarrow is installed. Otherwise, the MD5 checksum is used for parquet files as well.
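The dispatch described above can be pictured with a minimal sketch. It is not the library's actual implementation: the name checksum_sketch is hypothetical, hashlib stands in for audeer.md5(), and the extension check and the lookup of the "hash" key in the parquet schema metadata are assumptions about how such a check could be done.

```python
import hashlib
import os


def checksum_sketch(file: str) -> str:
    """Hypothetical sketch of the behavior described above."""
    if not os.path.exists(file):
        raise FileNotFoundError(file)

    # Assumption: parquet files are recognized by their extension
    if os.path.splitext(file)[1] == ".parquet":
        try:
            import pyarrow.parquet as pq  # optional dependency
        except ImportError:
            pq = None
        if pq is not None:
            # Schema metadata is None if the file carries no metadata
            metadata = pq.read_schema(file).metadata or {}
            # pyarrow stores metadata keys and values as bytes
            if b"hash" in metadata:
                return metadata[b"hash"].decode()

    # Fallback: MD5 of the file content, read in chunks
    # (hashlib is used here in place of audeer.md5())
    md5 = hashlib.md5()
    with open(file, "rb") as fp:
        for chunk in iter(lambda: fp.read(8192), b""):
            md5.update(chunk)
    return md5.hexdigest()
```

In this sketch, a parquet file without a "hash" entry, or an environment without pyarrow, simply falls through to the MD5 branch, which mirrors the two conditions named in the description.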

Parameters

file (str) – file path with extension

Return type

str

Returns

MD5 checksum of file, or the "hash" metadata entry for parquet files that provide one

Raises

FileNotFoundError – if file does not exist

Examples

>>> checksum("src.txt")
'd41d8cd98f00b204e9800998ecf8427e'
>>> import audeer
>>> import audformat
>>> import pandas as pd
>>> import pyarrow as pa
>>> import pyarrow.parquet as pq
>>> df = pd.DataFrame([0, 1], columns=["a"])
>>> hash = audformat.utils.hash(df, strict=True)
>>> hash
'9021a9b6e1e696ba9de4fe29346319b2'
>>> parquet_file = audeer.path("file.parquet")
>>> table = pa.Table.from_pandas(df)
>>> table = table.replace_schema_metadata({"hash": hash})
>>> pq.write_table(table, parquet_file, compression="snappy")
>>> checksum(parquet_file)
'9021a9b6e1e696ba9de4fe29346319b2'