Correctness regression

Overall scores
	w2v2-b	w2v2-L	w2v2-L-robust	w2v2-L-xls-r	w2v2-L-vox
Overall Score	66.7% passed tests (6 passed / 3 failed).	88.9% passed tests (8 passed / 1 failed).	88.9% passed tests (8 passed / 1 failed).	66.7% passed tests (6 passed / 3 failed).	66.7% passed tests (6 passed / 3 failed).
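
The percentages above follow from simple counting. The sketch below assumes each model runs nine tests (three metrics on three test sets, which is inferred from the tables on this page rather than stated anywhere). `overall_score` is a hypothetical helper for illustration, not part of the benchmark tooling:

```python
def overall_score(passed: int, failed: int) -> str:
    """Format a pass-rate summary in the style used by the table above."""
    total = passed + failed
    if total == 0:
        raise ValueError("at least one test is required")
    percent = 100.0 * passed / total
    return f"{percent:.1f}% passed tests ({passed} passed / {failed} failed)."


# Hypothetical usage with the counts reported for w2v2-b and w2v2-L.
summary_b = overall_score(6, 3)
summary_large = overall_score(8, 1)
```

Pass or fail is presumably decided on unrounded metric values, so a table cell that displays exactly the threshold (for example 0.50 or 0.10) can fall on either side.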

Concordance Correlation Coeff

Threshold: 0.5
Data	Concordance Correlation Coeff
Data	w2v2-b	w2v2-L	w2v2-L-robust	w2v2-L-xls-r	w2v2-L-vox
iemocap-2.3.0-emotion.dimensions.test.gold_standard	0.60	0.66	0.68	0.67	0.67
msppodcast-2.6.1-emotion.dimensions.test-1.gold_standard	0.72	0.72	0.74	0.72	0.74
msppodcast-2.6.1-emotion.dimensions.test-2.gold_standard	0.47	0.50	0.51	0.49	0.48
mean	0.60	0.63	0.64	0.63	0.63
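
For reference, the concordance correlation coefficient can be sketched in a few lines using its standard definition: twice the covariance, divided by the sum of both variances and the squared difference of the means. This is a pure-Python illustration under that definition, not necessarily the implementation used to produce the numbers above, and `concordance_cc` is a hypothetical name:

```python
def concordance_cc(truth, prediction):
    """Concordance correlation coefficient of two equal-length sequences."""
    if len(truth) != len(prediction) or len(truth) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(truth)
    mean_t = sum(truth) / n
    mean_p = sum(prediction) / n
    # Population (biased) variances and covariance.
    var_t = sum((t - mean_t) ** 2 for t in truth) / n
    var_p = sum((p - mean_p) ** 2 for p in prediction) / n
    cov = sum((t - mean_t) * (p - mean_p) for t, p in zip(truth, prediction)) / n
    denominator = var_t + var_p + (mean_t - mean_p) ** 2
    if denominator == 0:
        # Both inputs are constant and equal; the coefficient is undefined.
        return float("nan")
    return 2 * cov / denominator


# A constant offset keeps the Pearson correlation at 1 but lowers the CCC.
perfect = concordance_cc([0.0, 1.0, 2.0], [0.0, 1.0, 2.0])
shifted = concordance_cc([0.0, 1.0, 2.0], [1.0, 2.0, 3.0])
```

Here `shifted` evaluates to 4/7 (about 0.57) even though the two sequences are perfectly linearly related. The CCC penalizes bias and scale mismatch, so its magnitude never exceeds that of the Pearson correlation on the same data.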

Mean Absolute Error

Threshold: 0.1
Data	Mean Absolute Error
Data	w2v2-b	w2v2-L	w2v2-L-robust	w2v2-L-xls-r	w2v2-L-vox
iemocap-2.3.0-emotion.dimensions.test.gold_standard	0.14	0.12	0.11	0.12	0.11
msppodcast-2.6.1-emotion.dimensions.test-1.gold_standard	0.10	0.09	0.08	0.09	0.09
msppodcast-2.6.1-emotion.dimensions.test-2.gold_standard	0.11	0.10	0.09	0.10	0.10
mean	0.12	0.10	0.10	0.10	0.10

Pearson Correlation Coeff

Threshold: 0.5
Data	Pearson Correlation Coeff
Data	w2v2-b	w2v2-L	w2v2-L-robust	w2v2-L-xls-r	w2v2-L-vox
iemocap-2.3.0-emotion.dimensions.test.gold_standard	0.62	0.66	0.68	0.67	0.67
msppodcast-2.6.1-emotion.dimensions.test-1.gold_standard	0.73	0.73	0.75	0.72	0.74
msppodcast-2.6.1-emotion.dimensions.test-2.gold_standard	0.51	0.51	0.53	0.49	0.51
mean	0.62	0.63	0.65	0.63	0.64