Robustness background noise

Overall scores
	w2v2-b	w2v2-L	w2v2-L-robust	w2v2-L-xls-r	w2v2-L-vox
Overall Score	29.2% passed tests (7 passed / 17 failed).	16.7% passed tests (4 passed / 20 failed).	41.7% passed tests (10 passed / 14 failed).	37.5% passed tests (9 passed / 15 failed).	41.7% passed tests (10 passed / 14 failed).
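The overall scores can be cross-checked against the per-dataset values in the tables of this section. The sketch below is not the benchmark's own code. It assumes that each test/dataset pair counts as one test (12 tests × 2 datasets = 24) and that a test passes when the rounded table value is strictly greater than its threshold. Under those assumptions the tally reproduces the pass counts reported above. All names (`MODELS`, `CHANGE_CCC`, `UNCHANGED`, `count_passed`) are hypothetical.

```python
# Values copied from the tables in this section.
# Column order: w2v2-b, w2v2-L, w2v2-L-robust, w2v2-L-xls-r, w2v2-L-vox.
# Each test has two rows: iemocap first, msppodcast second.
MODELS = ["w2v2-b", "w2v2-L", "w2v2-L-robust", "w2v2-L-xls-r", "w2v2-L-vox"]

CHANGE_CCC = {  # threshold: -0.05
    "babble": [[-0.02, -0.06, -0.03, -0.02, -0.04],
               [-0.00, -0.07, -0.01, -0.02, -0.02]],
    "coughing": [[-0.12, -0.14, -0.06, -0.11, -0.08],
                 [-0.13, -0.12, -0.05, -0.11, -0.04]],
    "environmental": [[-0.02, -0.05, -0.02, -0.04, -0.03],
                      [-0.01, -0.03, -0.00, -0.02, -0.01]],
    "music": [[-0.01, -0.04, -0.02, -0.02, -0.03],
              [-0.00, -0.03, -0.01, -0.02, -0.01]],
    "sneezing": [[-0.09, -0.10, -0.04, -0.03, -0.03],
                 [-0.09, -0.10, -0.03, -0.04, -0.02]],
    "white": [[-0.03, -0.06, -0.04, -0.09, -0.05],
              [-0.06, -0.03, -0.02, -0.03, -0.01]],
}

UNCHANGED = {  # threshold: 0.9
    "babble": [[0.81, 0.65, 0.75, 0.81, 0.75],
               [0.67, 0.55, 0.76, 0.69, 0.65]],
    "coughing": [[0.36, 0.43, 0.53, 0.39, 0.54],
                 [0.26, 0.41, 0.58, 0.32, 0.51]],
    "environmental": [[0.77, 0.75, 0.79, 0.72, 0.79],
                      [0.71, 0.70, 0.85, 0.64, 0.77]],
    "music": [[0.82, 0.79, 0.84, 0.81, 0.81],
              [0.74, 0.71, 0.86, 0.68, 0.76]],
    "sneezing": [[0.47, 0.51, 0.55, 0.60, 0.65],
                 [0.26, 0.41, 0.61, 0.50, 0.57]],
    "white": [[0.53, 0.66, 0.46, 0.52, 0.60],
              [0.45, 0.63, 0.64, 0.34, 0.54]],
}


def count_passed(tables, threshold):
    """Count, per model column, the values strictly above the threshold."""
    passed = [0] * len(MODELS)
    for rows in tables.values():
        for row in rows:
            for i, value in enumerate(row):
                if value > threshold:  # assumption: strict comparison
                    passed[i] += 1
    return passed


ccc_passed = count_passed(CHANGE_CCC, -0.05)
unchanged_passed = count_passed(UNCHANGED, 0.9)
total_tests = 2 * (len(CHANGE_CCC) + len(UNCHANGED))  # two datasets per test

scores = {}
for i, model in enumerate(MODELS):
    passed = ccc_passed[i] + unchanged_passed[i]
    scores[model] = (passed, total_tests - passed)
    print(f"{model}: {100 * passed / total_tests:.1f}% passed tests "
          f"({passed} passed / {total_tests - passed} failed)")
```

Under these assumptions every pass comes from a Change CCC test, because no model reaches the 0.9 threshold for unchanged predictions on any dataset or noise type.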

Change CCC Babble Noise

Threshold: -0.05
Data	w2v2-b	w2v2-L	w2v2-L-robust	w2v2-L-xls-r	w2v2-L-vox
iemocap-2.3.0-emotion.dimensions.test.gold_standard	-0.02	-0.06	-0.03	-0.02	-0.04
msppodcast-2.6.1-emotion.dimensions.test-1.gold_standard	-0.00	-0.07	-0.01	-0.02	-0.02
mean	-0.01	-0.07	-0.02	-0.02	-0.03
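As a rough illustration of what a Change CCC test measures, the sketch below computes the concordance correlation coefficient (CCC) and its change when noise is added. It assumes that the change is the CCC of the noisy-audio predictions minus the CCC of the clean-audio predictions, both against the gold standard, and that the test passes when the change stays above the threshold. The page does not spell out this definition, and the function names and toy values are hypothetical.

```python
from statistics import fmean


def ccc(truth, prediction):
    """Concordance correlation coefficient of two equal-length sequences."""
    n = len(truth)
    mean_t, mean_p = fmean(truth), fmean(prediction)
    var_t = sum((t - mean_t) ** 2 for t in truth) / n
    var_p = sum((p - mean_p) ** 2 for p in prediction) / n
    cov = sum((t - mean_t) * (p - mean_p)
              for t, p in zip(truth, prediction)) / n
    return 2 * cov / (var_t + var_p + (mean_t - mean_p) ** 2)


def change_ccc_test(truth, pred_clean, pred_noisy, threshold=-0.05):
    """Return the CCC change and whether it stays above the threshold."""
    # Assumption: change = CCC(noisy) - CCC(clean), both against the truth.
    change = ccc(truth, pred_noisy) - ccc(truth, pred_clean)
    return change, change > threshold


# Toy values for illustration only, not benchmark data.
truth = [0.1, 0.4, 0.5, 0.8]
pred_clean = [0.2, 0.4, 0.5, 0.7]
pred_noisy = [0.3, 0.4, 0.5, 0.6]

change, passed = change_ccc_test(truth, pred_clean, pred_noisy)
print(f"change in CCC: {change:.2f}, passed: {passed}")
```

A negative change means the model agrees less with the gold standard once noise is added. The threshold of -0.05 tolerates a small drop.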

Change CCC Coughing

Threshold: -0.05
Data	w2v2-b	w2v2-L	w2v2-L-robust	w2v2-L-xls-r	w2v2-L-vox
iemocap-2.3.0-emotion.dimensions.test.gold_standard	-0.12	-0.14	-0.06	-0.11	-0.08
msppodcast-2.6.1-emotion.dimensions.test-1.gold_standard	-0.13	-0.12	-0.05	-0.11	-0.04
mean	-0.12	-0.13	-0.06	-0.11	-0.06

Change CCC Environmental Noise

Threshold: -0.05
Data	w2v2-b	w2v2-L	w2v2-L-robust	w2v2-L-xls-r	w2v2-L-vox
iemocap-2.3.0-emotion.dimensions.test.gold_standard	-0.02	-0.05	-0.02	-0.04	-0.03
msppodcast-2.6.1-emotion.dimensions.test-1.gold_standard	-0.01	-0.03	-0.00	-0.02	-0.01
mean	-0.01	-0.04	-0.01	-0.03	-0.02

Change CCC Music

Threshold: -0.05
Data	w2v2-b	w2v2-L	w2v2-L-robust	w2v2-L-xls-r	w2v2-L-vox
iemocap-2.3.0-emotion.dimensions.test.gold_standard	-0.01	-0.04	-0.02	-0.02	-0.03
msppodcast-2.6.1-emotion.dimensions.test-1.gold_standard	-0.00	-0.03	-0.01	-0.02	-0.01
mean	-0.01	-0.04	-0.01	-0.02	-0.02

Change CCC Sneezing

Threshold: -0.05
Data	w2v2-b	w2v2-L	w2v2-L-robust	w2v2-L-xls-r	w2v2-L-vox
iemocap-2.3.0-emotion.dimensions.test.gold_standard	-0.09	-0.10	-0.04	-0.03	-0.03
msppodcast-2.6.1-emotion.dimensions.test-1.gold_standard	-0.09	-0.10	-0.03	-0.04	-0.02
mean	-0.09	-0.10	-0.04	-0.04	-0.03

Change CCC White Noise

Threshold: -0.05
Data	w2v2-b	w2v2-L	w2v2-L-robust	w2v2-L-xls-r	w2v2-L-vox
iemocap-2.3.0-emotion.dimensions.test.gold_standard	-0.03	-0.06	-0.04	-0.09	-0.05
msppodcast-2.6.1-emotion.dimensions.test-1.gold_standard	-0.06	-0.03	-0.02	-0.03	-0.01
mean	-0.04	-0.04	-0.03	-0.06	-0.03

Percentage Unchanged Predictions Babble Noise

Threshold: 0.9
Data	w2v2-b	w2v2-L	w2v2-L-robust	w2v2-L-xls-r	w2v2-L-vox
iemocap-2.3.0-emotion.dimensions.test.gold_standard	0.81	0.65	0.75	0.81	0.75
msppodcast-2.6.1-emotion.dimensions.test-1.gold_standard	0.67	0.55	0.76	0.69	0.65
mean	0.74	0.60	0.76	0.75	0.70
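The unchanged-predictions metric can be sketched as the share of samples whose prediction moves by less than an allowed difference when noise is added. The example assumes \(\delta = 0.05\), taken from the visualization description in this section, and a strict comparison against the 0.9 threshold. The function name and toy values are hypothetical, and the benchmark's actual implementation may differ.

```python
def fraction_unchanged(pred_clean, pred_noisy, delta=0.05):
    """Share of samples whose prediction changes by less than delta."""
    unchanged = sum(abs(c - n) < delta
                    for c, n in zip(pred_clean, pred_noisy))
    return unchanged / len(pred_clean)


# Toy values for illustration only, not benchmark data.
pred_clean = [0.20, 0.40, 0.50, 0.70, 0.90]
pred_noisy = [0.22, 0.48, 0.51, 0.60, 0.88]

share = fraction_unchanged(pred_clean, pred_noisy)
passed = share > 0.9  # assumption: strict comparison against the threshold
print(f"unchanged: {share:.2f}, passed: {passed}")
```

Unlike the CCC change, this metric needs no gold standard. It compares only the model's own predictions on clean and noisy versions of the same audio.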

Percentage Unchanged Predictions Coughing

Threshold: 0.9
Data	w2v2-b	w2v2-L	w2v2-L-robust	w2v2-L-xls-r	w2v2-L-vox
iemocap-2.3.0-emotion.dimensions.test.gold_standard	0.36	0.43	0.53	0.39	0.54
msppodcast-2.6.1-emotion.dimensions.test-1.gold_standard	0.26	0.41	0.58	0.32	0.51
mean	0.31	0.42	0.55	0.35	0.53

Percentage Unchanged Predictions Environmental Noise

Threshold: 0.9
Data	w2v2-b	w2v2-L	w2v2-L-robust	w2v2-L-xls-r	w2v2-L-vox
iemocap-2.3.0-emotion.dimensions.test.gold_standard	0.77	0.75	0.79	0.72	0.79
msppodcast-2.6.1-emotion.dimensions.test-1.gold_standard	0.71	0.70	0.85	0.64	0.77
mean	0.74	0.72	0.82	0.68	0.78

Percentage Unchanged Predictions Music

Threshold: 0.9
Data	w2v2-b	w2v2-L	w2v2-L-robust	w2v2-L-xls-r	w2v2-L-vox
iemocap-2.3.0-emotion.dimensions.test.gold_standard	0.82	0.79	0.84	0.81	0.81
msppodcast-2.6.1-emotion.dimensions.test-1.gold_standard	0.74	0.71	0.86	0.68	0.76
mean	0.78	0.75	0.85	0.75	0.79

Percentage Unchanged Predictions Sneezing

Threshold: 0.9
Data	w2v2-b	w2v2-L	w2v2-L-robust	w2v2-L-xls-r	w2v2-L-vox
iemocap-2.3.0-emotion.dimensions.test.gold_standard	0.47	0.51	0.55	0.60	0.65
msppodcast-2.6.1-emotion.dimensions.test-1.gold_standard	0.26	0.41	0.61	0.50	0.57
mean	0.36	0.46	0.58	0.55	0.61

Percentage Unchanged Predictions White Noise

Threshold: 0.9
Data	w2v2-b	w2v2-L	w2v2-L-robust	w2v2-L-xls-r	w2v2-L-vox
iemocap-2.3.0-emotion.dimensions.test.gold_standard	0.53	0.66	0.46	0.52	0.60
msppodcast-2.6.1-emotion.dimensions.test-1.gold_standard	0.45	0.63	0.64	0.34	0.54
mean	0.49	0.65	0.55	0.43	0.57

Visualization Babble Noise

Difference between the predictions for clean audio and for the same audio with added babble noise. The allowed prediction difference \(\delta < 0.05\) is highlighted in green in the upper plot. The lower plot shows the distributions of the two sets of predictions.

Plots, one column per model: w2v2-b, w2v2-L, w2v2-L-robust, w2v2-L-xls-r, w2v2-L-vox