w2v2-b vs. w2v2-L vs. w2v2-L-robust vs. w2v2-L-xls-r vs. w2v2-L-vox

This page compares the models w2v2-b, w2v2-L, w2v2-L-robust, w2v2-L-xls-r, and w2v2-L-vox with one another.

Tests overview

Passed tests by topic and model:

| Topic | w2v2-b | w2v2-L | w2v2-L-robust | w2v2-L-xls-r | w2v2-L-vox |
| --- | --- | --- | --- | --- | --- |
| Overall Score | 84.9% (332 passed / 59 failed) | 81.7% (294 passed / 66 failed) | 84.4% (304 passed / 56 failed) | 83.6% (301 passed / 59 failed) | 85.1% (338 passed / 59 failed) |
| Correctness consistency | 51.2% | 55.8% | 51.2% | 53.5% | 51.2% |
| Correctness distribution | 66.7% | 66.7% | 66.7% | 66.7% | 66.7% |
| Correctness regression | 66.7% | 66.7% | 66.7% | 66.7% | 66.7% |
| Correctness speaker average | 100.0% | 100.0% | 100.0% | 100.0% | 100.0% |
| Correctness speaker ranking | 100.0% | 100.0% | 100.0% | 100.0% | 100.0% |
| Fairness accent | 98.4% | 93.5% | 89.2% | 97.8% | 94.4% |
| Fairness language | 79.2% | 66.7% | 79.2% | 79.2% | 83.3% |
| Fairness linguistic sentiment | 100.0% | 97.2% | 93.1% | 100.0% | 100.0% |
| Fairness pitch | 100.0% | 93.3% | 100.0% | 93.3% | 100.0% |
| Fairness sex | 90.6% | 96.9% | 100.0% | 96.9% | 93.8% |
| Robustness background noise | 50.0% | 50.0% | 75.0% | 45.8% | 54.2% |
| Robustness low quality phone | 75.0% | 25.0% | 100.0% | 25.0% | 100.0% |
| Robustness recording condition | 0.0% | 0.0% | 50.0% | 50.0% | 50.0% |
| Robustness simulated recording condition | 0.0% | 16.7% | 50.0% | 0.0% | 33.3% |
| Robustness small changes | 90.0% | 90.0% | 95.0% | 90.0% | 90.0% |
| Robustness spectral tilt | 87.5% | 87.5% | 100.0% | 87.5% | 75.0% |
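
The Overall Score percentages are consistent with the ratio passed / (passed + failed), rounded to one decimal place. The sketch below reproduces them from the passed and failed counts reported above. The function name `pass_rate` and the dictionary layout are hypothetical helpers for illustration, and the formula is an assumption inferred from the reported numbers, not a description of how the report itself is generated.

```python
def pass_rate(passed: int, failed: int) -> float:
    """Return the percentage of passed tests, rounded to one decimal place."""
    total = passed + failed
    if total == 0:
        raise ValueError("no tests were run")
    return round(100.0 * passed / total, 1)


# (passed, failed) counts copied from the Overall Score entries above.
overall_counts = {
    "w2v2-b": (332, 59),
    "w2v2-L": (294, 66),
    "w2v2-L-robust": (304, 56),
    "w2v2-L-xls-r": (301, 59),
    "w2v2-L-vox": (338, 59),
}

# Assumed formula: score = passed / (passed + failed), in percent.
overall_scores = {
    model: pass_rate(passed, failed)
    for model, (passed, failed) in overall_counts.items()
}

for model, score in overall_scores.items():
    print(f"{model}: {score:.1f}%")
```

Note that the totals differ between models (391 tests for w2v2-b, 360 for the three middle models, 397 for w2v2-L-vox), so the overall percentages are not computed over the same number of tests.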