Correctness speaker ranking

Overall scores

Model       Overall Score
--------    ------------------------------------------
w2v2-L      100.0% passed tests (2 passed / 0 failed).
hubert-L    100.0% passed tests (2 passed / 0 failed).
wavlm       100.0% passed tests (2 passed / 0 failed).
data2vec    100.0% passed tests (2 passed / 0 failed).
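The overall score can be related to the per-test results with a short sketch. It takes the Spearman's rho values reported in this section and the stated threshold of 0.7, and assumes that a test passes when its value is greater than or equal to the threshold. The comparison operator and the helper name `overall_score` are assumptions made for illustration, not the benchmark's actual implementation.

```python
# Spearman's rho per model for test-1 and test-2, copied from this section.
RHO = {
    "w2v2-L": [0.91, 0.86],
    "hubert-L": [0.92, 0.82],
    "wavlm": [0.94, 0.72],
    "data2vec": [0.92, 0.78],
}
THRESHOLD = 0.7  # threshold stated for the Spearman's rho test


def overall_score(values, threshold=THRESHOLD):
    """Summarize how many test results reach the threshold.

    Assumption: a test passes when its value is >= threshold.
    """
    passed = sum(value >= threshold for value in values)
    failed = len(values) - passed
    percent = 100.0 * passed / len(values)
    return f"{percent:.1f}% passed tests ({passed} passed / {failed} failed)."


for model, values in RHO.items():
    print(f"{model}: {overall_score(values)}")
```

Under this assumption every model passes both tests, including wavlm on test-2, where 0.72 is only just above the threshold.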

Spearman's Rho

Threshold: 0.7

                                                            Spearman's Rho
Data                                                        w2v2-L    hubert-L    wavlm    data2vec
--------------------------------------------------------    ------    --------    -----    --------
msppodcast-2.6.1-emotion.dimensions.test-1.gold_standard    0.91      0.92        0.94     0.92
msppodcast-2.6.1-emotion.dimensions.test-2.gold_standard    0.86      0.82        0.72     0.78
mean                                                        0.89      0.87        0.83     0.85
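As a rough illustration of the metric, the following sketch computes Spearman's rho between a gold-standard speaker ranking and a predicted one. It assumes that each speaker is ranked by the mean of their per-sample values, which the results above do not state. The toy data and all function names are hypothetical.

```python
from statistics import mean


def average_ranks(values):
    """Return 1-based ranks, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the group while the next value ties with the current one.
        while (end + 1 < len(order)
               and values[order[end + 1]] == values[order[start]]):
            end += 1
        tied_rank = (start + end) / 2 + 1
        for position in range(start, end + 1):
            ranks[order[position]] = tied_rank
        start = end + 1
    return ranks


def spearman_rho(truth, prediction):
    """Spearman's rho: Pearson correlation of the two rank sequences."""
    rank_t = average_ranks(truth)
    rank_p = average_ranks(prediction)
    mean_t, mean_p = mean(rank_t), mean(rank_p)
    covariance = sum((t - mean_t) * (p - mean_p) for t, p in zip(rank_t, rank_p))
    spread_t = sum((t - mean_t) ** 2 for t in rank_t) ** 0.5
    spread_p = sum((p - mean_p) ** 2 for p in rank_p) ** 0.5
    return covariance / (spread_t * spread_p)


def speaker_means(values, speakers):
    """Average the per-sample values of each speaker (assumed aggregation)."""
    grouped = {}
    for value, speaker in zip(values, speakers):
        grouped.setdefault(speaker, []).append(value)
    return {speaker: mean(group) for speaker, group in grouped.items()}


# Hypothetical toy data: per-sample gold standard, predictions, speaker IDs.
speakers = ["a", "a", "b", "b", "c", "c", "d", "d"]
truth = [0.2, 0.3, 0.5, 0.6, 0.8, 0.9, 0.4, 0.4]
prediction = [0.3, 0.3, 0.4, 0.6, 0.7, 0.8, 0.6, 0.5]

truth_by_speaker = speaker_means(truth, speakers)
prediction_by_speaker = speaker_means(prediction, speakers)
names = sorted(truth_by_speaker)
rho = spearman_rho(
    [truth_by_speaker[name] for name in names],
    [prediction_by_speaker[name] for name in names],
)
print(f"rho = {rho:.2f}, reaches threshold 0.7: {rho >= 0.7}")
```

Because rho depends only on ranks, a model can score well here even if its predicted values are offset or scaled relative to the gold standard, as long as the ordering of speakers is preserved.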

Visualization

The plots visualize the precision of predicting whether a speaker falls in the Top 25% or Bottom 25% of all speakers. Green dots mark correctly classified speakers. Red markers mark false positives, and red squares among them mark confusions between Top 25% and Bottom 25% speakers. The remaining grey data points lie outside the range of interest. They include false negatives, which should have been predicted in the Top 25% or Bottom 25% of speakers but were not, and true negatives, which belong to neither group and were predicted as such.

[Figures: visualization plots for msppodcast-2.6.1-emotion.dimensions.test-1.gold_standard and msppodcast-2.6.1-emotion.dimensions.test-2.gold_standard, one panel per model: w2v2-L, hubert-L, wavlm, data2vec.]
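The categories described for the plots can be sketched in code. This is a minimal illustration, assuming that the Top 25% and Bottom 25% groups each contain a quarter of the speakers (rounded down) and that precision is the share of correct speakers among those predicted into either group. The function names and toy scores are hypothetical.

```python
def quartile_groups(scores):
    """Map each speaker to 'top', 'bottom', or 'middle' by score rank.

    Assumption: each outer group holds len(scores) // 4 speakers.
    """
    ranked = sorted(scores, key=scores.get)
    size = len(ranked) // 4
    groups = {speaker: "middle" for speaker in ranked}
    for speaker in ranked[:size]:
        groups[speaker] = "bottom"
    for speaker in ranked[len(ranked) - size:]:
        groups[speaker] = "top"
    return groups


def categorize(truth, prediction):
    """Label each speaker with the category used in the plots."""
    true_group = quartile_groups(truth)
    predicted_group = quartile_groups(prediction)
    labels = {}
    for speaker, predicted in predicted_group.items():
        actual = true_group[speaker]
        if predicted == "middle":
            # Grey points: predicted outside the range of interest.
            if actual == "middle":
                labels[speaker] = "true negative"
            else:
                labels[speaker] = "false negative"
        elif predicted == actual:
            labels[speaker] = "correct"  # green
        elif actual == "middle":
            labels[speaker] = "false positive"  # red
        else:
            labels[speaker] = "confusion"  # red square: top and bottom swapped
    return labels


def precision(labels):
    """Share of speakers predicted as Top/Bottom 25% that were correct."""
    flagged = [
        label for label in labels.values()
        if label in ("correct", "false positive", "confusion")
    ]
    return flagged.count("correct") / len(flagged) if flagged else 0.0


# Hypothetical per-speaker scores (gold standard and model prediction).
truth = {"s1": 0.1, "s2": 0.2, "s3": 0.3, "s4": 0.4,
         "s5": 0.5, "s6": 0.6, "s7": 0.7, "s8": 0.8}
prediction = {"s1": 0.15, "s2": 0.9, "s3": 0.1, "s4": 0.45,
              "s5": 0.5, "s6": 0.55, "s7": 0.6, "s8": 0.85}

labels = categorize(truth, prediction)
for speaker in sorted(labels):
    print(speaker, labels[speaker])
print("precision:", precision(labels))
```

In this toy example a confusion counts against precision in the same way as any other false positive. It is tracked separately only because swapping a Top 25% speaker with a Bottom 25% speaker is the more severe error.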