word_error_rate()

audmetric.word_error_rate(truth, prediction, *, norm='truth')

Word error rate based on edit distance.

The word error rate is the mean of the normalized edit distances over all (truth, prediction)-pairs: the per-pair scores are summed and the sum is divided by the number of pairs.

The normalized edit distance of each (truth, prediction)-pair is computed as the edit distance divided by a normalization factor n. This represents the average editing cost per sequence item. The value of n depends on the norm parameter.

If norm is "truth", n is the number of words in the reference (truth), following the Wikipedia formulation. With this normalization, WER can exceed 1 when the prediction is longer than the reference:

n = \text{len}(t)

If norm is "longest", n is set to the maximum length between truth and prediction:

n = \max(\text{len}(t), \text{len}(p))
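
The computation described above can be sketched in plain Python. This is a minimal illustration, not audmetric's actual implementation: the names `edit_distance_sketch` and `wer_sketch` are hypothetical, the edit distance is a standard word-level Levenshtein distance, and the handling of empty sequences (treating n = 0 as 1) is an assumption of this sketch.

```python
from typing import Sequence


def edit_distance_sketch(truth: Sequence[str], prediction: Sequence[str]) -> int:
    """Word-level Levenshtein distance (hypothetical helper)."""
    # previous[j]: distance between the truth prefix processed so far
    # and the first j words of the prediction
    previous = list(range(len(prediction) + 1))
    for i, t_word in enumerate(truth, start=1):
        current = [i]
        for j, p_word in enumerate(prediction, start=1):
            cost = 0 if t_word == p_word else 1
            current.append(
                min(
                    previous[j] + 1,         # word missing from prediction
                    current[j - 1] + 1,      # extra word in prediction
                    previous[j - 1] + cost,  # match or substitution
                )
            )
        previous = current
    return previous[-1]


def wer_sketch(
    truth: Sequence[Sequence[str]],
    prediction: Sequence[Sequence[str]],
    *,
    norm: str = "truth",
) -> float:
    """Mean normalized edit distance over all (truth, prediction)-pairs."""
    if norm not in ("truth", "longest"):
        raise ValueError("norm must be one of 'truth', 'longest'")
    if len(truth) != len(prediction):
        raise ValueError("truth and prediction differ in length")
    total = 0.0
    for t, p in zip(truth, prediction):
        n = len(t) if norm == "truth" else max(len(t), len(p))
        # Assumption of this sketch: avoid division by zero for empty sequences
        total += edit_distance_sketch(t, p) / max(n, 1)
    # Assumption of this sketch: an empty input yields 0.0
    return total / max(len(truth), 1)


print(wer_sketch([["hello", "world"]], [["xyz", "moon", "abc"]]))
print(wer_sketch([["hello", "world"]], [["xyz", "moon", "abc"]], norm="longest"))
```

Note that words are compared as whole tokens, so a single-character typo inside a word counts as one full substitution.
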
Parameters
  • truth (Sequence[Sequence[str]]) – ground truth, given as one sequence of words per sample

  • prediction (Sequence[Sequence[str]]) – predictions, given as one sequence of words per sample

  • norm (str) – normalization method, either "truth" or "longest". "truth" normalizes by the truth length, "longest" by the maximum of the truth and prediction lengths

Return type

float

Returns

word error rate

Raises
  • ValueError – if truth and prediction contain a different number of sequences

  • ValueError – if norm is not one of "truth", "longest"

Examples

>>> truth = [["lorem", "ipsum"], ["north", "wind", "and", "sun"]]
>>> prediction = [["lorm", "ipsum"], ["north", "wind"]]
>>> word_error_rate(truth, prediction)
0.5
>>> truth = [["hello", "world"]]
>>> prediction = [["xyz", "moon", "abc"]]
>>> word_error_rate(truth, prediction)
1.5
>>> word_error_rate(truth, prediction, norm="longest")
1.0
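
The results above can be checked by hand, assuming a standard word-level edit distance. In the first example, "lorm" differs from "lorem" as a whole word, giving an edit distance of 1 over 2 reference words; in the second pair, 2 of the 4 reference words are missing from the prediction:

\text{WER} = \frac{1}{2}\left(\frac{1}{2} + \frac{2}{4}\right) = 0.5

In the second example no words match, so the edit distance is 3 (two substitutions and one insertion), which is normalized by the reference length 2 or by the longest length 3:

\frac{3}{2} = 1.5 \qquad \frac{3}{\max(2, 3)} = 1.0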