unweighted_average_bias()

audmetric.unweighted_average_bias(truth, prediction, protected_variable, labels=None, *, subgroups=None, metric=<function fscore_per_class>, reduction=<function std>)[source]

Unweighted average bias of protected variable.

The bias is measured in terms of equalized odds, which requires the classifier to perform identically for all classes independent of a protected variable such as race. The performance of the classifier on its different classes can be assessed with standard metrics such as recall or precision. The difference in performance between subgroups, denoted as score divergence, can also be computed in different ways: for two subgroups the (absolute) difference is a standard choice, for more than two subgroups the score divergence can be estimated by the standard deviation of the scores.
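A minimal conceptual sketch of this computation could look as follows, assuming the per-class metric is evaluated separately on each subgroup, the per-subgroup scores are reduced per class, and the resulting divergences are averaged over the classes. The helper bias_sketch is hypothetical and not audmetric's implementation; edge cases such as zero divisions inside the metric may be handled differently by the library.

import math
import statistics

from audmetric import fscore_per_class


def bias_sketch(
    truth,
    prediction,
    protected_variable,
    metric=fscore_per_class,
    reduction=statistics.pstdev,  # stands in for the std default in the signature
):
    # Conceptual sketch, not audmetric's code:
    # collect the per-class score of every subgroup ...
    scores = {}  # class label -> scores of the individual subgroups
    for subgroup in sorted(set(protected_variable)):
        idx = [i for i, g in enumerate(protected_variable) if g == subgroup]
        per_class = metric(
            [truth[i] for i in idx],
            [prediction[i] for i in idx],
        )
        for label, score in per_class.items():
            scores.setdefault(label, []).append(score)
    # ... reduce the scores per class and average over the classes
    divergences = []
    for values in scores.values():
        values = [v for v in values if not math.isnan(v)]
        if len(values) >= 2:  # see Note: classes with fewer than two scores are ignored
            divergences.append(reduction(values))
    if not divergences:
        return float("nan")
    return sum(divergences) / len(divergences)  # unweighted average over classes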

Note

If fewer than two subgroups exhibit a performance score for a class, that class is ignored in the bias computation. This occurs if a subgroup has no sample of that class, e.g. no negative (class label) female (subgroup of sex) samples.

Parameters
  • truth (Sequence[Any]) – ground truth classes

  • prediction (Sequence[Any]) – predicted classes

  • protected_variable (Sequence[Any]) – manifestations of protected variable such as subgroups “male” and “female” of variable “sex”

  • labels (Optional[Sequence[Any]]) – included labels in preferred ordering. The bias is computed only on the specified labels. If no labels are supplied, they will be inferred from prediction and truth and ordered alphabetically.

  • subgroups (Optional[Sequence[Any]]) – included subgroups in preferred ordering. The direction of the bias is determined by the ordering of the subgroups. In addition, the bias is computed only on the specified subgroups. If no subgroups are supplied, they will be inferred from protected_variable and ordered alphanumerically.

  • metric (Callable[[Sequence[Any], Sequence[Any], Optional[Sequence[str]]], Dict[str, float]]) – metric with which equalized odds are measured. Typical choices are audmetric.recall_per_class(), audmetric.precision_per_class(), or audmetric.fscore_per_class()

  • reduction (Callable[[Sequence[float]], float]) – reduction operation used to measure the divergence between the scores of the subgroups of the protected variable for each class. Typical choices are the (absolute) difference between the scores of two subgroups or the standard deviation of the scores of more than two subgroups; a sketch of a custom reduction follows this list.
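As an illustration of such a reduction callable, one could, for instance, take the range of the subgroup scores (a hypothetical choice, not an audmetric default):

def score_range(scores):
    # Hypothetical reduction: spread between the best and the worst subgroup score.
    return max(scores) - min(scores)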

Return type

float

Returns

unweighted average bias

Raises
  • ValueError – if truth, prediction and protected_variable have different lengths

  • ValueError – if subgroups contains values not contained in protected_variable

Examples

>>> unweighted_average_bias([1, 1], [1, 0], ["male", "female"])
0.5
>>> unweighted_average_bias(
...     [1, 1],
...     [1, 0],
...     ["male", "female"],
...     subgroups=["female", "male"],
...     reduction=lambda x: x[0] - x[1],
... )
-1.0
>>> unweighted_average_bias([0, 1], [1, 0], ["male", "female"], metric=recall_per_class)
nan
>>> unweighted_average_bias(
...     [0, 0, 0, 0],
...     [1, 1, 0, 0],
...     ["a", "b", "c", "d"],
...     metric=recall_per_class,
... )
0.5
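A further usage sketch with more than two subgroups and a custom reduction (output omitted, since the value depends on the chosen metric and reduction; recall_per_class is assumed to be imported as in the examples above):

>>> bias = unweighted_average_bias(
...     [0, 1, 0, 1, 0, 1],
...     [0, 1, 1, 1, 0, 0],
...     ["a", "a", "b", "b", "c", "c"],
...     metric=recall_per_class,
...     reduction=lambda scores: max(scores) - min(scores),
... )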