word_error_rate()

audmetric.word_error_rate(truth, prediction, *, norm='truth')

Word error rate based on edit distance.

The word error rate is the mean of the normalized edit distances over all (truth, prediction) pairs: the normalized distances are summed and the sum is divided by the number of pairs.

The normalized edit distance of each (truth, prediction)-pair is computed as the edit distance divided by a normalization factor n. This represents the average editing cost per sequence item. The value of n depends on the norm parameter.
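Written out for N pairs, with d denoting the word-level edit distance and n_i the normalization factor of pair i (N, d and the index i are notation introduced here for illustration, not taken from the library):

$\text{WER} = \frac{1}{N} \sum_{i=1}^{N} \frac{d(t_i, p_i)}{n_i}$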

If norm is "truth", n is set to the number of words in the reference (truth), following the Wikipedia formulation. With this choice the WER can be greater than 1 if the prediction is longer than the reference:

$n = \text{len}(t)$

If norm is "longest", n is set to the larger of the truth and prediction lengths:

$n = \max(\text{len}(t), \text{len}(p))$
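The computation described above can be sketched in plain Python. This is a minimal illustration, not the audmetric implementation: the names `edit_distance` and `wer_sketch` are hypothetical, the edit distance is assumed to be a word-level Levenshtein distance with unit costs, and the handling of empty inputs is an assumption that the text does not specify.

```python
from typing import Sequence


def edit_distance(truth: Sequence[str], prediction: Sequence[str]) -> int:
    """Word-level Levenshtein distance with unit costs (assumed)."""
    # previous[j]: distance between the truth prefix processed so far
    # and the first j words of the prediction
    previous = list(range(len(prediction) + 1))
    for i, t_word in enumerate(truth, start=1):
        current = [i]
        for j, p_word in enumerate(prediction, start=1):
            cost = 0 if t_word == p_word else 1
            current.append(
                min(
                    previous[j] + 1,         # deletion of a truth word
                    current[j - 1] + 1,      # insertion of a predicted word
                    previous[j - 1] + cost,  # match or substitution
                )
            )
        previous = current
    return previous[-1]


def wer_sketch(
    truth: Sequence[Sequence[str]],
    prediction: Sequence[Sequence[str]],
    *,
    norm: str = "truth",
) -> float:
    """Mean normalized edit distance over all (truth, prediction) pairs."""
    if norm not in ("truth", "longest"):
        raise ValueError("norm must be one of 'truth', 'longest'")
    if len(truth) != len(prediction):
        raise ValueError("truth and prediction differ in length")
    if not truth:
        return 0.0  # assumption: no pairs means no error
    total = 0.0
    for t, p in zip(truth, prediction):
        n = len(t) if norm == "truth" else max(len(t), len(p))
        # assumption: avoid division by zero for empty sequences
        total += edit_distance(t, p) / max(n, 1)
    return total / len(truth)
```

Called as `wer_sketch([["hello", "world"]], [["xyz", "moon", "abc"]])`, the sketch divides an edit distance of 3 by the truth length 2; with `norm="longest"` it divides by 3 instead.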

Parameters:

truth (Sequence[Sequence[str]]) – ground truth, given as one sequence of words per sample
prediction (Sequence[Sequence[str]]) – predictions, given as one sequence of words per sample
norm (str) – normalization method, either "truth" or "longest". "truth" normalizes by the truth length, "longest" by the larger of the truth and prediction lengths

Return type:

float

Returns:

word error rate

Raises:

ValueError – if truth and prediction contain a different number of sequences
ValueError – if norm is not one of "truth", "longest"

Examples

>>> from audmetric import word_error_rate
>>> truth = [["lorem", "ipsum"], ["north", "wind", "and", "sun"]]
>>> prediction = [["lorm", "ipsum"], ["north", "wind"]]
>>> word_error_rate(truth, prediction)
0.5
>>> truth = [["hello", "world"]]
>>> prediction = [["xyz", "moon", "abc"]]
>>> word_error_rate(truth, prediction)
1.5
>>> word_error_rate(truth, prediction, norm="longest")
1.0
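
The values above can be checked by hand, assuming a word-level edit distance with unit costs. In the first example the two pairs have edit distances 1 (one substitution) and 2 (two deletions), normalized by the truth lengths 2 and 4:

$\frac{1}{2}\left(\frac{1}{2} + \frac{2}{4}\right) = 0.5$

In the second example the edit distance is 3 (two substitutions and one insertion). The truth has 2 words and the longer of the two sequences has 3, which gives the results for norm="truth" and norm="longest" respectively:

$\frac{3}{2} = 1.5 \qquad \frac{3}{3} = 1.0$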