Correctness speaker ranking

Overall scores

| | CNN14 | w2v2-b | hubert-b | axlstm |
|---|---|---|---|---|
| Overall Score | 0.0% passed tests (0 passed / 2 failed). | 0.0% passed tests (0 passed / 2 failed). | 50.0% passed tests (1 passed / 1 failed). | 0.0% passed tests (0 passed / 2 failed). |
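The summary strings above follow a simple pattern: the share of passed tests, then the passed and failed counts. A minimal sketch of that arithmetic follows, assuming each test yields a boolean outcome. `overall_score` is a hypothetical helper name and not part of the benchmark's code.

```python
def overall_score(outcomes):
    """Format a pass-rate summary from a list of boolean test outcomes.

    Hypothetical helper: it only mirrors the wording of the summary
    strings shown in the table above.
    """
    passed = sum(1 for ok in outcomes if ok)
    failed = len(outcomes) - passed
    percent = 100.0 * passed / len(outcomes)
    return f"{percent:.1f}% passed tests ({passed} passed / {failed} failed)."
```

With one passed and one failed test, as reported for hubert-b, this yields 50.0%.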

Spearmans Rho

Threshold: 0.7

Spearmans Rho per data set and model:

| Data | CNN14 | w2v2-b | hubert-b | axlstm |
|---|---|---|---|---|
| msppodcast-2.6.1-emotion.dimensions.test-1.gold_standard | 0.20 | 0.70 | 0.82 | 0.53 |
| msppodcast-2.6.1-emotion.dimensions.test-2.gold_standard | -0.07 | 0.22 | 0.22 | -0.03 |
| mean | 0.07 | 0.46 | 0.52 | 0.25 |
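This page does not show how the correlation is computed. As a rough sketch, one plausible approach is to average ground truth and prediction per speaker and then take Spearman's rho between the two sets of speaker means. Aggregating by the per-speaker mean is an assumption, and all function names below are hypothetical.

```python
from collections import defaultdict
from statistics import mean


def average_ranks(values):
    """Rank values from 1 to n; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rho as the Pearson correlation of the ranks.

    Undefined (division by zero) if either input is constant.
    """
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


def speaker_means(speakers, values):
    """Average the per-sample values of each speaker (assumed aggregation)."""
    grouped = defaultdict(list)
    for speaker, value in zip(speakers, values):
        grouped[speaker].append(value)
    return {speaker: mean(vals) for speaker, vals in grouped.items()}


def speaker_ranking_rho(speakers, truth, prediction):
    """Correlate the speaker ranking by truth with the ranking by prediction."""
    t = speaker_means(speakers, truth)
    p = speaker_means(speakers, prediction)
    names = sorted(t)
    return spearman_rho([t[n] for n in names], [p[n] for n in names])
```

The "mean" row is the arithmetic mean of the per-data-set values, for example (0.82 + 0.22) / 2 = 0.52 for hubert-b.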

Visualization

The plots visualize the precision of predicting which speakers fall in the Top 25% or Bottom 25% of all speakers. Green dots mark correctly classified speakers. Red markers are false positives, and red squares are the false positives that confuse a Top 25% speaker with a Bottom 25% speaker or vice versa. The remaining grey data points lie outside the range of interest. They include false negatives, which are speakers that should have been predicted in the Top 25% or Bottom 25% but were not, and true negatives, which are speakers that belong to neither group and were predicted as such.
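The categories described above can be sketched as follows. This is a minimal illustration, not the benchmark's implementation. It assumes each group holds a quarter of the speakers by rank (rounded down, at least one) and takes one score per speaker as input. The function names and category labels are hypothetical.

```python
def quartile_groups(scores):
    """Map each speaker to 'top', 'bottom', or 'middle' by score rank."""
    ordered = sorted(scores, key=scores.get)
    k = max(1, len(ordered) // 4)  # assumed group size: a quarter of all speakers
    groups = {speaker: "middle" for speaker in ordered}
    for speaker in ordered[:k]:
        groups[speaker] = "bottom"
    for speaker in ordered[-k:]:
        groups[speaker] = "top"
    return groups


def categorize(truth, prediction):
    """Assign each speaker one of the categories shown in the plots."""
    true_group = quartile_groups(truth)
    pred_group = quartile_groups(prediction)
    result = {}
    for speaker in truth:
        t, p = true_group[speaker], pred_group[speaker]
        if p == "middle":
            # Grey points: predicted outside the range of interest.
            result[speaker] = "true negative" if t == "middle" else "false negative"
        elif t == p:
            result[speaker] = "correct"  # green dot
        elif t == "middle":
            result[speaker] = "false positive"  # red dot
        else:
            result[speaker] = "confusion"  # red square: top and bottom swapped
    return result


def precision(categories):
    """Share of correct speakers among those predicted as top or bottom."""
    predicted = [c for c in categories.values()
                 if c in ("correct", "false positive", "confusion")]
    return predicted.count("correct") / len(predicted)
```

Confusions count against precision here just like other false positives, in line with the description of red squares as a special case of red markers.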

| Data | CNN14 | w2v2-b | hubert-b | axlstm |
|---|---|---|---|---|
| msppodcast-2.6.1-emotion.dimensions.test-1.gold_standard | ../../../_images/visualization_msppodcast-2.6.1-emotion.dimensions.test-1.gold_standard144.png | ../../../_images/visualization_msppodcast-2.6.1-emotion.dimensions.test-1.gold_standard145.png | ../../../_images/visualization_msppodcast-2.6.1-emotion.dimensions.test-1.gold_standard146.png | ../../../_images/visualization_msppodcast-2.6.1-emotion.dimensions.test-1.gold_standard147.png |
| msppodcast-2.6.1-emotion.dimensions.test-2.gold_standard | ../../../_images/visualization_msppodcast-2.6.1-emotion.dimensions.test-2.gold_standard100.png | ../../../_images/visualization_msppodcast-2.6.1-emotion.dimensions.test-2.gold_standard101.png | ../../../_images/visualization_msppodcast-2.6.1-emotion.dimensions.test-2.gold_standard102.png | ../../../_images/visualization_msppodcast-2.6.1-emotion.dimensions.test-2.gold_standard103.png |