Robustness low quality phone

Overall scores
	w2v2-L-cat	hubert-L-cat	wavlm-cat	data2vec-cat
Overall Score	90.0% passed tests (9 passed / 1 failed).	80.0% passed tests (8 passed / 2 failed).	90.0% passed tests (9 passed / 1 failed).	90.0% passed tests (9 passed / 1 failed).

Change UAR Low Quality Phone

Threshold: -0.05
Data	Change UAR Low Quality Phone
Data	w2v2-L-cat	hubert-L-cat	wavlm-cat	data2vec-cat
crema-d-1.2.0-emotion.categories.test.gold_standard	-0.10	-0.10	-0.09	-0.06
emovo-1.2.1-emotion.test	0.01	-0.02	-0.01	0.05
iemocap-2.3.0-emotion.categories.test.gold_standard	-0.02	-0.03	-0.03	-0.03
meld-1.3.1-emotion.categories.test.gold_standard	-0.01	-0.01	-0.01	-0.00
msppodcast-2.6.0-emotion.categories.test-1.gold_standard	-0.02	-0.05	-0.01	-0.03
mean	-0.03	-0.04	-0.03	-0.01
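
A minimal sketch of how a change in UAR could be computed and checked against the threshold. It assumes that UAR is the unweighted average recall (the mean of the per-class recalls), that the change is the UAR on the low quality phone audio minus the UAR on the original audio, and that a test passes when the change is strictly greater than the threshold. The function names and the toy labels are hypothetical and are not taken from the benchmark code.

```python
from collections import defaultdict

THRESHOLD = -0.05  # from "Threshold: -0.05" above


def uar(truth, prediction):
    """Unweighted average recall: mean of the per-class recalls."""
    total = defaultdict(int)
    correct = defaultdict(int)
    for t, p in zip(truth, prediction):
        total[t] += 1
        if t == p:
            correct[t] += 1
    return sum(correct[c] / total[c] for c in total) / len(total)


def change_uar(truth, pred_original, pred_degraded):
    """UAR on the degraded audio minus UAR on the original audio (assumed sign)."""
    return uar(truth, pred_degraded) - uar(truth, pred_original)


# Hypothetical toy data, not taken from any of the listed datasets
truth = ["anger", "anger", "happiness", "happiness", "neutral", "neutral"]
pred_original = ["anger", "anger", "happiness", "neutral", "neutral", "neutral"]
pred_degraded = ["anger", "neutral", "happiness", "neutral", "neutral", "neutral"]

delta = change_uar(truth, pred_original, pred_degraded)
passed = delta > THRESHOLD  # assumed pass criterion (strict comparison)
print(f"change in UAR: {delta:.2f}, passed: {passed}")
```

With these toy labels the UAR drops by about 0.17, so the check would fail.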

Threshold: 0.5
Data	Percentage Unchanged Predictions Low Quality Phone
Data	w2v2-L-cat	hubert-L-cat	wavlm-cat	data2vec-cat
crema-d-1.2.0-emotion.categories.test.gold_standard	0.71	0.75	0.75	0.68
emovo-1.2.1-emotion.test	0.72	0.74	0.81	0.69
iemocap-2.3.0-emotion.categories.test.gold_standard	0.77	0.78	0.81	0.76
meld-1.3.1-emotion.categories.test.gold_standard	0.74	0.70	0.78	0.69
msppodcast-2.6.0-emotion.categories.test-1.gold_standard	0.70	0.79	0.83	0.76
mean	0.73	0.75	0.80	0.72
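
The sketch below shows one way the share of unchanged predictions could be computed, and how the overall scores can be recovered from the rounded table values. It assumes that each dataset row in either table is one test and that a test passes when the value is strictly greater than its threshold. Under that assumption the counts match the overall scores reported above. The helper names are hypothetical, and the real tests may compare unrounded values.

```python
THRESHOLD_UAR = -0.05       # from "Threshold: -0.05"
THRESHOLD_UNCHANGED = 0.5   # from "Threshold: 0.5"


def fraction_unchanged(pred_original, pred_degraded):
    """Share of samples whose predicted label is identical on both versions."""
    same = sum(o == d for o, d in zip(pred_original, pred_degraded))
    return same / len(pred_original)


MODELS = ["w2v2-L-cat", "hubert-L-cat", "wavlm-cat", "data2vec-cat"]

# Rounded values copied from the two tables.
# Row order: crema-d, emovo, iemocap, meld, msppodcast.
CHANGE_UAR = [
    [-0.10, -0.10, -0.09, -0.06],
    [0.01, -0.02, -0.01, 0.05],
    [-0.02, -0.03, -0.03, -0.03],
    [-0.01, -0.01, -0.01, -0.00],
    [-0.02, -0.05, -0.01, -0.03],
]
UNCHANGED = [
    [0.71, 0.75, 0.75, 0.68],
    [0.72, 0.74, 0.81, 0.69],
    [0.77, 0.78, 0.81, 0.76],
    [0.74, 0.70, 0.78, 0.69],
    [0.70, 0.79, 0.83, 0.76],
]


def tally(rows, threshold):
    """Per model column, count the rows that strictly exceed the threshold."""
    return [sum(row[i] > threshold for row in rows) for i in range(len(MODELS))]


passed_uar = tally(CHANGE_UAR, THRESHOLD_UAR)
passed_unchanged = tally(UNCHANGED, THRESHOLD_UNCHANGED)
n_tests = len(CHANGE_UAR) + len(UNCHANGED)

overall = {
    model: (u + c, n_tests - u - c)
    for model, u, c in zip(MODELS, passed_uar, passed_unchanged)
}
for model, (passed, failed) in overall.items():
    print(f"{model}: {passed} passed / {failed} failed")
```

All failures come from the change in UAR table: every model fails on crema-d, and hubert-L-cat additionally fails on msppodcast, where the rounded value equals the threshold.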

Confusion matrix showing the shift from the predictions on the original audio to the predictions on the low quality phone audio.