Robustness low quality phone

Overall scores
	CNN14-cat	w2v2-b-cat	hubert-b-cat	axlstm-cat
Overall Score	90.0% passed tests (9 passed / 1 failed).	100.0% passed tests (10 passed / 0 failed).	60.0% passed tests (6 passed / 4 failed).	80.0% passed tests (8 passed / 2 failed).
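
The overall score appears to count one test per metric and dataset, which gives 10 tests per model (2 metrics × 5 datasets). As a minimal sketch under that assumption, the tally for CNN14-cat can be reproduced from the rounded values in the two tables of this section. The strict "greater than threshold" pass rule is also an assumption, not something the report states.

```python
# Values copied from the CNN14-cat column of the two tables in this section
# (rounded to two decimals, as reported).
change_uar = [-0.01, -0.02, -0.04, 0.01, -0.06]
unchanged = [0.90, 0.73, 0.82, 0.64, 0.64]

# Assumption: a result passes when it lies strictly above its threshold.
CHANGE_UAR_THRESHOLD = -0.05
UNCHANGED_THRESHOLD = 0.5

results = [v > CHANGE_UAR_THRESHOLD for v in change_uar]
results += [v > UNCHANGED_THRESHOLD for v in unchanged]

passed = sum(results)
failed = len(results) - passed
summary = (
    f"{100 * passed / len(results):.1f}% passed tests "
    f"({passed} passed / {failed} failed)."
)
print(summary)
```

Because the tables show rounded values, a result sitting exactly on a threshold (such as -0.05) cannot be classified reliably from the table alone.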

Change UAR Low Quality Phone

Threshold: -0.05
Data	Change UAR Low Quality Phone
Data	CNN14-cat	w2v2-b-cat	hubert-b-cat	axlstm-cat
crema-d-1.2.0-emotion.categories.test.gold_standard	-0.01	-0.04	-0.08	-0.12
emovo-1.2.1-emotion.test	-0.02	-0.04	-0.08	-0.05
iemocap-2.3.0-emotion.categories.test.gold_standard	-0.04	-0.03	-0.06	-0.00
meld-1.3.1-emotion.categories.test.gold_standard	0.01	-0.01	-0.02	0.01
msppodcast-2.6.0-emotion.categories.test-1.gold_standard	-0.06	-0.04	-0.09	-0.04
mean	-0.02	-0.03	-0.07	-0.04
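
To illustrate what the values above measure, the following sketch computes UAR (unweighted average recall, the mean of the per-class recalls) on the original and on the degraded predictions and takes the difference. The function names, the toy labels, and the strict pass rule are assumptions for illustration only. They are not taken from the code that produced this report.

```python
from collections import defaultdict


def uar(truth, prediction):
    """Unweighted average recall: mean of the per-class recalls."""
    hits = defaultdict(int)
    totals = defaultdict(int)
    for t, p in zip(truth, prediction):
        totals[t] += 1
        if t == p:
            hits[t] += 1
    return sum(hits[c] / totals[c] for c in totals) / len(totals)


def change_uar(truth, pred_original, pred_degraded):
    """UAR on degraded audio minus UAR on original audio."""
    return uar(truth, pred_degraded) - uar(truth, pred_original)


# Toy data (hypothetical): two samples for each of four classes.
truth = ["anger", "anger", "happiness", "happiness",
         "neutral", "neutral", "sadness", "sadness"]
pred_original = ["anger", "anger", "happiness", "neutral",
                 "neutral", "neutral", "sadness", "sadness"]
pred_degraded = ["anger", "neutral", "happiness", "neutral",
                 "neutral", "neutral", "sadness", "sadness"]

THRESHOLD = -0.05  # threshold stated in this section
change = change_uar(truth, pred_original, pred_degraded)
# Assumption: the test passes when the change is above the threshold.
test_passed = change > THRESHOLD
print(change, test_passed)
```

In this toy case the degraded predictions lose one additional "anger" sample, so UAR drops from 0.875 to 0.75 and the test would fail.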

Threshold: 0.5
Data	Percentage Unchanged Predictions Low Quality Phone
Data	CNN14-cat	w2v2-b-cat	hubert-b-cat	axlstm-cat
crema-d-1.2.0-emotion.categories.test.gold_standard	0.90	0.70	0.87	0.68
emovo-1.2.1-emotion.test	0.73	0.67	0.61	0.63
iemocap-2.3.0-emotion.categories.test.gold_standard	0.82	0.71	0.79	0.68
meld-1.3.1-emotion.categories.test.gold_standard	0.64	0.66	0.71	0.64
msppodcast-2.6.0-emotion.categories.test-1.gold_standard	0.64	0.68	0.67	0.64
mean	0.75	0.68	0.73	0.65
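
The values above are reported as fractions between 0 and 1 rather than percentages. A minimal sketch of such a measure follows: it is the share of samples whose predicted label is the same for the original and the degraded audio. The function name, the toy labels, and the strict pass rule are assumptions for illustration.

```python
def fraction_unchanged(pred_original, pred_degraded):
    """Share of samples whose predicted label did not change."""
    if len(pred_original) != len(pred_degraded):
        raise ValueError("prediction lists must have the same length")
    if not pred_original:
        raise ValueError("prediction lists must not be empty")
    same = sum(a == b for a, b in zip(pred_original, pred_degraded))
    return same / len(pred_original)


# Toy data (hypothetical): 6 of 8 predictions stay the same.
pred_original = ["anger", "anger", "happiness", "neutral",
                 "neutral", "neutral", "sadness", "sadness"]
pred_degraded = ["anger", "neutral", "happiness", "neutral",
                 "neutral", "sadness", "sadness", "sadness"]

THRESHOLD = 0.5  # threshold stated in this section
unchanged = fraction_unchanged(pred_original, pred_degraded)
# Assumption: the test passes when the fraction is above the threshold.
test_passed = unchanged > THRESHOLD
print(unchanged, test_passed)
```

Unlike the change in UAR, this measure needs no ground truth labels. It only compares the model's outputs before and after the degradation.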

Confusion Matrix showing the shift from the predictions of the original audio to the predictions of the low quality phone audio.