Correctness speaker average¶

Overall scores¶
	CNN14-cat	w2v2-b-cat	hubert-b-cat	axlstm-cat
Overall Score	41.7% passed tests (5 passed / 7 failed).	41.7% passed tests (5 passed / 7 failed).	33.3% passed tests (4 passed / 8 failed).	58.3% passed tests (7 passed / 5 failed).

Class Proportion Mean Absolute Error¶

Threshold: 0.1¶
Data	anger				happiness				neutral				sadness
Data	CNN14-cat	w2v2-b-cat	hubert-b-cat	axlstm-cat	CNN14-cat	w2v2-b-cat	hubert-b-cat	axlstm-cat	CNN14-cat	w2v2-b-cat	hubert-b-cat	axlstm-cat	CNN14-cat	w2v2-b-cat	hubert-b-cat	axlstm-cat
iemocap-2.3.0-full	0.04	0.09	0.11	0.18	0.09	0.10	0.03	0.05	0.19	0.21	0.27	0.09	0.31	0.19	0.37	0.08
meld-1.3.1-emotion.categories.test.gold_standard	0.35	0.02	0.02	0.03	0.09	0.42	0.29	0.39	0.48	0.47	0.43	0.33	0.09	0.07	0.16	0.04
msppodcast-2.6.0-emotion.categories.test-1.gold_standard	0.12	0.05	0.07	0.09	0.15	0.11	0.05	0.09	0.08	0.23	0.19	0.25	0.11	0.09	0.15	0.19
mean	0.17	0.05	0.07	0.10	0.11	0.21	0.12	0.18	0.25	0.30	0.30	0.22	0.17	0.12	0.23	0.10
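
The overall scores can be cross-checked against the table above. The following is a minimal sketch, not the test suite's own code. It assumes that every dataset/class cell counts as one test (12 per model) and that a test passes when the rounded value shown in the table is strictly below the threshold. Both rules are assumptions, chosen because they are consistent with the reported counts. The names `ERRORS` and `summarize` are hypothetical.

```python
THRESHOLD = 0.1
MODELS = ["CNN14-cat", "w2v2-b-cat", "hubert-b-cat", "axlstm-cat"]

# Values copied from the table (rounded to two decimals), in MODELS order.
ERRORS = {
    "iemocap-2.3.0-full": {
        "anger": [0.04, 0.09, 0.11, 0.18],
        "happiness": [0.09, 0.10, 0.03, 0.05],
        "neutral": [0.19, 0.21, 0.27, 0.09],
        "sadness": [0.31, 0.19, 0.37, 0.08],
    },
    "meld-1.3.1-emotion.categories.test.gold_standard": {
        "anger": [0.35, 0.02, 0.02, 0.03],
        "happiness": [0.09, 0.42, 0.29, 0.39],
        "neutral": [0.48, 0.47, 0.43, 0.33],
        "sadness": [0.09, 0.07, 0.16, 0.04],
    },
    "msppodcast-2.6.0-emotion.categories.test-1.gold_standard": {
        "anger": [0.12, 0.05, 0.07, 0.09],
        "happiness": [0.15, 0.11, 0.05, 0.09],
        "neutral": [0.08, 0.23, 0.19, 0.25],
        "sadness": [0.11, 0.09, 0.15, 0.19],
    },
}


def summarize(errors, models, threshold):
    """Return {model: (passed, failed, percent_passed)}.

    Assumption: a test passes when its error is strictly below the threshold.
    """
    summary = {}
    for i, model in enumerate(models):
        # Collect this model's value from every dataset/class cell.
        values = [
            per_class[i]
            for per_dataset in errors.values()
            for per_class in per_dataset.values()
        ]
        passed = sum(value < threshold for value in values)
        failed = len(values) - passed
        summary[model] = (passed, failed, round(100 * passed / len(values), 1))
    return summary


scores = summarize(ERRORS, MODELS, THRESHOLD)
for model, (passed, failed, percent) in scores.items():
    print(f"{model}: {percent}% passed tests ({passed} passed / {failed} failed)")
```

Under these assumptions the counts match the Overall scores table: 5/7, 5/7, 4/8, and 7/5 passed/failed. The w2v2-b-cat happiness value on iemocap-2.3.0-full is shown as 0.10, so it counts as a failure only because the comparison is strict; the unrounded value may differ.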

Visualization¶

The plots show, for each class, the proportion of samples predicted as that class alongside the true proportion of the class. The plots use a slightly higher absolute-error threshold than the Class Proportion Difference test, because they are meant to highlight only large deviations.

Figure: predicted and true class proportions, one panel per model (CNN14-cat, w2v2-b-cat, hubert-b-cat, axlstm-cat).
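
To make the quantity behind the table and plots concrete, here is a minimal sketch of how a per-class proportion error averaged over speakers could be computed. The source does not spell out the exact definition, so this is an assumption: for each speaker, take the absolute difference between the predicted and true proportion of each class, then average those differences over speakers. The function names and the toy labels are hypothetical.

```python
from collections import Counter, defaultdict

CLASSES = ["anger", "happiness", "neutral", "sadness"]


def class_proportions(labels, classes):
    """Fraction of samples carrying each class label."""
    counts = Counter(labels)
    return {c: counts[c] / len(labels) for c in classes}


def proportion_mae_per_class(truth, prediction, speakers, classes):
    """Assumed metric: per-speaker absolute proportion error, averaged over speakers."""
    # Group true and predicted labels by speaker.
    by_speaker = defaultdict(lambda: ([], []))
    for t, p, s in zip(truth, prediction, speakers):
        by_speaker[s][0].append(t)
        by_speaker[s][1].append(p)

    totals = {c: 0.0 for c in classes}
    for true_labels, predicted_labels in by_speaker.values():
        true_prop = class_proportions(true_labels, classes)
        pred_prop = class_proportions(predicted_labels, classes)
        for c in classes:
            totals[c] += abs(pred_prop[c] - true_prop[c])
    return {c: totals[c] / len(by_speaker) for c in classes}


# Toy data: two speakers with four samples each.
speakers = ["s1"] * 4 + ["s2"] * 4
truth = ["anger", "anger", "neutral", "sadness",
         "happiness", "happiness", "neutral", "neutral"]
prediction = ["anger", "neutral", "neutral", "sadness",
              "happiness", "neutral", "neutral", "neutral"]

result = proportion_mae_per_class(truth, prediction, speakers, CLASSES)
print(result)
```

In this toy case speaker s1 is predicted with too little anger and too much neutral, and speaker s2 with too little happiness and too much neutral, so neutral accumulates the largest error while sadness has none.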