w2v2-L-cat vs. hubert-L-cat vs. wavlm-cat vs. data2vec-cat

This page compares the models w2v2-L-cat, hubert-L-cat, wavlm-cat, and data2vec-cat with one another.

Tests overview

Topic                                     Passed tests
                                          w2v2-L-cat    hubert-L-cat  wavlm-cat     data2vec-cat
------------------------------------------------------------------------------------------------
Overall Score                             74.5%         79.5%         78.5%         74.3%
  passed / failed                         458 / 157     489 / 126     483 / 132     457 / 158
Correctness classification                52.0%         64.0%         72.0%         51.0%
Correctness distribution                  70.0%         62.5%         67.5%         70.0%
Correctness speaker average               33.3%         66.7%         58.3%         58.3%
Correctness speaker ranking               50.0%         62.5%         50.0%         37.5%
Fairness accent                           100.0%        100.0%        99.2%         100.0%
Fairness language                         75.0%         91.7%         79.2%         83.3%
Fairness linguistic sentiment             88.5%         85.4%         85.4%         86.5%
Fairness pitch                            100.0%        96.3%         92.6%         96.3%
Fairness sex                              94.4%         100.0%        94.4%         100.0%
Robustness background noise               33.3%         41.7%         46.7%         38.3%
Robustness low quality phone              90.0%         80.0%         90.0%         90.0%
Robustness recording condition            0.0%          100.0%        100.0%        50.0%
Robustness simulated recording condition  33.3%         66.7%         33.3%         33.3%
Robustness small changes                  66.0%         78.0%         62.0%         56.0%
Robustness spectral tilt                  90.0%         95.0%         90.0%         80.0%
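
The page does not say how the Overall Score percentages are computed. The reported figures are consistent with the share of passed tests, passed / (passed + failed), rounded to one decimal place. The short Python sketch below rechecks that arithmetic under this assumption. The dictionary `counts` and the helper `pass_rate` are hypothetical names introduced for illustration only; the numbers are copied from the Overall Score entries.

```python
# Passed/failed counts copied from the Overall Score entries on this page.
counts = {
    "w2v2-L-cat": (458, 157),
    "hubert-L-cat": (489, 126),
    "wavlm-cat": (483, 132),
    "data2vec-cat": (457, 158),
}


def pass_rate(passed: int, failed: int) -> float:
    """Return the share of passed tests in percent.

    Assumption: the overall score is passed / (passed + failed).
    The page itself does not state the formula.
    """
    return 100.0 * passed / (passed + failed)


# Round to one decimal place, as the page reports the scores.
scores = {model: round(pass_rate(p, f), 1) for model, (p, f) in counts.items()}

for model, score in scores.items():
    print(f"{model}: {score:.1f}%")
```

Under this assumption the recomputed values match the reported 74.5%, 79.5%, 78.5%, and 74.3%. Each model was evaluated on the same total of 615 tests.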