Correctness classification

Overall scores
	CNN14-cat	w2v2-b-cat	hubert-b-cat	axlstm-cat
Overall Score	38.0% passed tests (38 passed / 62 failed).	49.0% passed tests (49 passed / 51 failed).	52.0% passed tests (52 passed / 48 failed).	40.0% passed tests (40 passed / 60 failed).

Precision Per Class

Threshold: 0.5
Data	anger				happiness				neutral				sadness
Data	CNN14-cat	w2v2-b-cat	hubert-b-cat	axlstm-cat	CNN14-cat	w2v2-b-cat	hubert-b-cat	axlstm-cat	CNN14-cat	w2v2-b-cat	hubert-b-cat	axlstm-cat	CNN14-cat	w2v2-b-cat	hubert-b-cat	axlstm-cat
crema-d-1.2.0-emotion.categories.test.gold_standard	0.56	0.49	0.54	0.58	0.45	0.30	1.00	0.35	0.78	0.85	0.45	0.77	0.07	0.09	0.07	0.08
danish-emotional-speech-1.1.1-emotion.test	0.55	0.60	0.57	0.77	0.77	0.52	0.50	0.46	0.26	0.40	0.33	0.27	0.36	0.44	0.37	0.34
emodb-1.2.0-emotion.categories.test.gold_standard	0.62	0.73	0.72	0.71	0.11	0.39	0.50	0.56	0.77	0.75	0.87	0.78	0.72	0.80	0.78	0.62
emovo-1.2.1-emotion.test	0.54	0.50	0.48	0.57	0.21	0.45	0.38	0.45	0.57	0.48	0.51	0.38	0.45	0.78	0.77	0.55
iemocap-2.3.0-emotion.categories.test.gold_standard	0.79	0.77	0.87	0.89	0.43	0.23	0.28	0.21	0.43	0.65	0.74	0.52	0.30	0.45	0.38	0.50
meld-1.3.1-emotion.categories.test.gold_standard	0.23	0.33	0.38	0.37	0.16	0.22	0.23	0.20	0.70	0.69	0.73	0.66	0.19	0.22	0.17	0.30
msppodcast-2.6.0-emotion.categories.test-1.gold_standard	0.24	0.35	0.40	0.27	0.66	0.65	0.77	0.61	0.65	0.78	0.72	0.68	0.14	0.16	0.14	0.11
msppodcast-2.6.0-emotion.categories.test-2.gold_standard	0.09	0.10	0.13	0.14	0.52	0.53	0.62	0.47	0.66	0.73	0.70	0.67	0.06	0.09	0.07	0.06
polish-emotional-speech-1.1.1-emotion.categories.test.gold_standard	0.47	0.78	0.68	0.59	0.00	0.54	0.87	0.68	0.52	0.71	0.58	0.60	0.48	0.78	0.88	0.60
ravdess-1.1.2-emotion.speech.test	0.56	0.62	0.64	0.70	0.00	0.00	0.00	0.47	0.16	0.20	0.00	0.18	0.45	0.47	0.38	0.44
mean	0.47	0.53	0.54	0.56	0.33	0.38	0.52	0.45	0.55	0.62	0.56	0.55	0.32	0.43	0.40	0.36

Recall Per Class

Threshold: 0.5
Data	anger				happiness				neutral				sadness
Data	CNN14-cat	w2v2-b-cat	hubert-b-cat	axlstm-cat	CNN14-cat	w2v2-b-cat	hubert-b-cat	axlstm-cat	CNN14-cat	w2v2-b-cat	hubert-b-cat	axlstm-cat	CNN14-cat	w2v2-b-cat	hubert-b-cat	axlstm-cat
crema-d-1.2.0-emotion.categories.test.gold_standard	0.76	0.60	0.63	0.43	0.07	0.23	0.07	0.16	0.04	0.20	0.02	0.27	0.92	0.90	0.95	0.85
danish-emotional-speech-1.1.1-emotion.test	0.35	0.54	0.33	0.19	0.19	0.23	0.04	0.12	0.27	0.48	0.33	0.67	0.75	0.63	0.87	0.33
emodb-1.2.0-emotion.categories.test.gold_standard	1.00	0.65	0.93	0.93	0.04	0.56	0.19	0.19	0.37	0.67	0.74	0.52	0.67	0.74	0.93	0.85
emovo-1.2.1-emotion.test	0.82	0.82	0.82	0.75	0.07	0.30	0.06	0.15	0.24	0.39	0.55	0.39	0.79	0.69	0.82	0.71
iemocap-2.3.0-emotion.categories.test.gold_standard	0.69	0.57	0.56	0.30	0.11	0.47	0.33	0.28	0.20	0.29	0.26	0.68	0.74	0.76	0.93	0.58
meld-1.3.1-emotion.categories.test.gold_standard	0.74	0.29	0.32	0.32	0.21	0.73	0.60	0.63	0.09	0.10	0.17	0.27	0.37	0.39	0.47	0.18
msppodcast-2.6.0-emotion.categories.test-1.gold_standard	0.55	0.63	0.67	0.50	0.48	0.73	0.59	0.56	0.45	0.44	0.50	0.37	0.55	0.52	0.67	0.54
msppodcast-2.6.0-emotion.categories.test-2.gold_standard	0.47	0.43	0.47	0.32	0.19	0.45	0.32	0.32	0.54	0.48	0.49	0.46	0.34	0.43	0.53	0.38
polish-emotional-speech-1.1.1-emotion.categories.test.gold_standard	0.80	0.72	0.80	0.65	0.00	0.80	0.50	0.48	0.60	0.50	0.92	0.60	0.55	0.70	0.57	0.72
ravdess-1.1.2-emotion.speech.test	0.75	0.91	0.56	0.44	0.00	0.00	0.00	0.44	0.25	0.06	0.00	0.31	0.62	0.88	1.00	0.47
mean	0.69	0.62	0.61	0.48	0.14	0.45	0.27	0.33	0.31	0.36	0.40	0.45	0.63	0.66	0.77	0.56

Unweighted Average Precision

Threshold: 0.5
Data	Unweighted Average Precision
Data	CNN14-cat	w2v2-b-cat	hubert-b-cat	axlstm-cat
crema-d-1.2.0-emotion.categories.test.gold_standard	0.46	0.43	0.51	0.45
danish-emotional-speech-1.1.1-emotion.test	0.48	0.49	0.44	0.46
emodb-1.2.0-emotion.categories.test.gold_standard	0.55	0.67	0.72	0.67
emovo-1.2.1-emotion.test	0.45	0.55	0.54	0.49
iemocap-2.3.0-emotion.categories.test.gold_standard	0.49	0.52	0.57	0.53
meld-1.3.1-emotion.categories.test.gold_standard	0.32	0.36	0.38	0.38
msppodcast-2.6.0-emotion.categories.test-1.gold_standard	0.42	0.49	0.51	0.42
msppodcast-2.6.0-emotion.categories.test-2.gold_standard	0.33	0.36	0.38	0.33
polish-emotional-speech-1.1.1-emotion.categories.test.gold_standard	0.37	0.70	0.75	0.62
ravdess-1.1.2-emotion.speech.test	0.29	0.32	0.26	0.45
mean	0.42	0.49	0.51	0.48

Unweighted Average Recall

Threshold: 0.5
Data	Unweighted Average Recall
Data	CNN14-cat	w2v2-b-cat	hubert-b-cat	axlstm-cat
crema-d-1.2.0-emotion.categories.test.gold_standard	0.45	0.48	0.42	0.43
danish-emotional-speech-1.1.1-emotion.test	0.39	0.47	0.39	0.33
emodb-1.2.0-emotion.categories.test.gold_standard	0.52	0.65	0.69	0.62
emovo-1.2.1-emotion.test	0.48	0.55	0.56	0.50
iemocap-2.3.0-emotion.categories.test.gold_standard	0.44	0.52	0.52	0.46
meld-1.3.1-emotion.categories.test.gold_standard	0.35	0.38	0.39	0.35
msppodcast-2.6.0-emotion.categories.test-1.gold_standard	0.51	0.58	0.61	0.49
msppodcast-2.6.0-emotion.categories.test-2.gold_standard	0.38	0.45	0.45	0.37
polish-emotional-speech-1.1.1-emotion.categories.test.gold_standard	0.49	0.68	0.70	0.61
ravdess-1.1.2-emotion.speech.test	0.41	0.46	0.39	0.41
mean	0.44	0.52	0.51	0.46
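
The unweighted averages reported above appear to be the plain mean of the four per-class values (anger, happiness, neutral, sadness), with every class weighted equally regardless of how many samples it has. A minimal sketch of that check follows, using the emodb per-class recall values of CNN14-cat copied from the per-class table. The helper name `unweighted_average` and the dictionary layout are illustrative assumptions, not part of the evaluation code behind this report. Because the published per-class values are rounded to two decimals, recomputed averages for other rows may differ from the tables by about 0.01.

```python
from statistics import mean


def unweighted_average(per_class_scores):
    """Return the plain (macro) mean over classes; each class counts equally."""
    return mean(per_class_scores.values())


# Per-class recall of CNN14-cat on emodb, copied from the per-class table above.
# The variable name is hypothetical and used only for this illustration.
emodb_cnn14_recall = {
    "anger": 1.00,
    "happiness": 0.04,
    "neutral": 0.37,
    "sadness": 0.67,
}

uar = unweighted_average(emodb_cnn14_recall)
print(f"{uar:.2f}")  # prints 0.52, matching the emodb / CNN14-cat recall entry
```

The same function applies to the per-class precision values to recover the unweighted average precision.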