Robustness small changes

Overall scores
	CNN14-cat	w2v2-b-cat	hubert-b-cat	axlstm-cat
Overall Score	74.0% passed tests (37 passed / 13 failed).	60.0% passed tests (30 passed / 20 failed).	76.0% passed tests (38 passed / 12 failed).	32.0% passed tests (16 passed / 34 failed).

Percentage Unchanged Predictions Additive Tone

Threshold: 0.95
Data	CNN14-cat	w2v2-b-cat	hubert-b-cat	axlstm-cat
crema-d-1.2.0-emotion.categories.test.gold_standard	0.99	0.96	0.98	0.93
emovo-1.2.1-emotion.test	0.93	0.95	0.95	0.89
iemocap-2.3.0-emotion.categories.test.gold_standard	0.97	0.93	0.95	0.86
meld-1.3.1-emotion.categories.test.gold_standard	0.98	0.98	0.97	0.95
msppodcast-2.6.0-emotion.categories.test-1.gold_standard	0.93	0.95	0.95	0.86
mean	0.96	0.95	0.96	0.90
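
The metric name suggests that each value in these tables is the share of test samples whose predicted emotion category stays the same after the perturbation is applied, reported as a fraction between 0 and 1. The following is a minimal sketch of that computation under this assumption; the function name `percent_unchanged` and the example labels are hypothetical and are not taken from the benchmark code.

```python
def percent_unchanged(original, perturbed):
    """Fraction of samples whose predicted label is identical
    before and after a perturbation (assumed metric definition)."""
    if len(original) != len(perturbed):
        raise ValueError("Both prediction lists must have the same length.")
    if not original:
        raise ValueError("At least one prediction is required.")
    unchanged = sum(a == b for a, b in zip(original, perturbed))
    return unchanged / len(original)


# Hypothetical predictions on clean audio and on the same audio
# after a small change (for example an added tone).
ORIGINAL = ["anger", "happiness", "neutral", "sadness", "neutral"]
PERTURBED = ["anger", "happiness", "neutral", "neutral", "neutral"]

RATE = percent_unchanged(ORIGINAL, PERTURBED)
print(f"{RATE:.2f}")
```

In this toy example four of five predictions are unchanged, so the value would fall below the 0.95 threshold used in this report.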

Percentage Unchanged Predictions Append Zeros

Threshold: 0.95
Data	CNN14-cat	w2v2-b-cat	hubert-b-cat	axlstm-cat
crema-d-1.2.0-emotion.categories.test.gold_standard	0.99	0.98	0.99	0.89
emovo-1.2.1-emotion.test	0.97	0.98	0.98	0.90
iemocap-2.3.0-emotion.categories.test.gold_standard	0.97	0.98	0.98	0.92
meld-1.3.1-emotion.categories.test.gold_standard	0.97	0.98	0.97	0.89
msppodcast-2.6.0-emotion.categories.test-1.gold_standard	0.98	0.99	0.99	0.92
mean	0.98	0.98	0.98	0.90

Percentage Unchanged Predictions Clip

Threshold: 0.95
Data	CNN14-cat	w2v2-b-cat	hubert-b-cat	axlstm-cat
crema-d-1.2.0-emotion.categories.test.gold_standard	0.99	0.97	1.00	0.97
emovo-1.2.1-emotion.test	0.97	0.98	0.98	0.98
iemocap-2.3.0-emotion.categories.test.gold_standard	0.98	0.96	0.98	0.94
meld-1.3.1-emotion.categories.test.gold_standard	0.98	0.98	0.98	0.97
msppodcast-2.6.0-emotion.categories.test-1.gold_standard	0.98	0.98	0.99	0.96
mean	0.98	0.97	0.99	0.96

Percentage Unchanged Predictions Crop Beginning

Threshold: 0.95
Data	CNN14-cat	w2v2-b-cat	hubert-b-cat	axlstm-cat
crema-d-1.2.0-emotion.categories.test.gold_standard	0.99	0.90	0.98	0.81
emovo-1.2.1-emotion.test	0.95	0.90	0.93	0.82
iemocap-2.3.0-emotion.categories.test.gold_standard	0.96	0.92	0.96	0.86
meld-1.3.1-emotion.categories.test.gold_standard	0.94	0.90	0.93	0.84
msppodcast-2.6.0-emotion.categories.test-1.gold_standard	0.96	0.94	0.96	0.83
mean	0.96	0.91	0.95	0.83

Percentage Unchanged Predictions Crop End

Threshold: 0.95
Data	CNN14-cat	w2v2-b-cat	hubert-b-cat	axlstm-cat
crema-d-1.2.0-emotion.categories.test.gold_standard	0.99	0.99	1.00	0.93
emovo-1.2.1-emotion.test	0.99	0.98	0.98	0.93
iemocap-2.3.0-emotion.categories.test.gold_standard	0.99	0.98	0.99	0.95
meld-1.3.1-emotion.categories.test.gold_standard	0.98	0.97	0.96	0.92
msppodcast-2.6.0-emotion.categories.test-1.gold_standard	0.99	0.99	0.99	0.95
mean	0.99	0.98	0.98	0.94

Percentage Unchanged Predictions Gain

Threshold: 0.95
Data	CNN14-cat	w2v2-b-cat	hubert-b-cat	axlstm-cat
crema-d-1.2.0-emotion.categories.test.gold_standard	0.98	1.00	1.00	1.00
emovo-1.2.1-emotion.test	0.96	1.00	1.00	0.99
iemocap-2.3.0-emotion.categories.test.gold_standard	0.96	0.99	0.99	0.98
meld-1.3.1-emotion.categories.test.gold_standard	0.91	1.00	1.00	1.00
msppodcast-2.6.0-emotion.categories.test-1.gold_standard	0.93	1.00	1.00	0.99
mean	0.95	1.00	1.00	0.99

Percentage Unchanged Predictions Highpass Filter

Threshold: 0.95
Data	CNN14-cat	w2v2-b-cat	hubert-b-cat	axlstm-cat
crema-d-1.2.0-emotion.categories.test.gold_standard	0.99	0.95	0.99	0.96
emovo-1.2.1-emotion.test	0.98	0.96	0.97	0.97
iemocap-2.3.0-emotion.categories.test.gold_standard	0.98	0.95	0.98	0.96
meld-1.3.1-emotion.categories.test.gold_standard	0.97	0.96	0.97	0.97
msppodcast-2.6.0-emotion.categories.test-1.gold_standard	0.99	0.97	0.98	0.97
mean	0.98	0.96	0.98	0.97

Percentage Unchanged Predictions Lowpass Filter

Threshold: 0.95
Data	CNN14-cat	w2v2-b-cat	hubert-b-cat	axlstm-cat
crema-d-1.2.0-emotion.categories.test.gold_standard	0.98	0.95	0.99	0.94
emovo-1.2.1-emotion.test	0.98	0.98	0.98	0.96
iemocap-2.3.0-emotion.categories.test.gold_standard	0.99	0.98	0.99	0.98
meld-1.3.1-emotion.categories.test.gold_standard	0.95	0.97	0.97	0.95
msppodcast-2.6.0-emotion.categories.test-1.gold_standard	0.97	0.98	0.99	0.92
mean	0.97	0.97	0.98	0.95

Percentage Unchanged Predictions Prepend Zeros

Threshold: 0.95
Data	CNN14-cat	w2v2-b-cat	hubert-b-cat	axlstm-cat
crema-d-1.2.0-emotion.categories.test.gold_standard	0.98	0.90	0.98	0.80
emovo-1.2.1-emotion.test	0.95	0.90	0.93	0.80
iemocap-2.3.0-emotion.categories.test.gold_standard	0.95	0.91	0.95	0.82
meld-1.3.1-emotion.categories.test.gold_standard	0.93	0.89	0.92	0.82
msppodcast-2.6.0-emotion.categories.test-1.gold_standard	0.96	0.94	0.96	0.82
mean	0.95	0.91	0.95	0.81

Percentage Unchanged Predictions White Noise

Threshold: 0.95
Data	CNN14-cat	w2v2-b-cat	hubert-b-cat	axlstm-cat
crema-d-1.2.0-emotion.categories.test.gold_standard	0.98	0.94	0.96	0.90
emovo-1.2.1-emotion.test	0.87	0.90	0.89	0.82
iemocap-2.3.0-emotion.categories.test.gold_standard	0.93	0.88	0.88	0.84
meld-1.3.1-emotion.categories.test.gold_standard	0.96	0.96	0.95	0.91
msppodcast-2.6.0-emotion.categories.test-1.gold_standard	0.88	0.92	0.88	0.82
mean	0.92	0.92	0.91	0.86
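
The overall scores at the top of this page can be related to the per-dataset tables: each model has 50 tests (10 perturbations times 5 datasets), and the `mean` rows are not counted. The sketch below recounts passes from the displayed two-decimal values, assuming a test passes only when its value is strictly greater than the 0.95 threshold. That strict comparison is an inference from the reported totals, not something the report states; the underlying tests may instead compare unrounded values. The names `TABLES` and `overall_scores` are hypothetical.

```python
MODELS = ["CNN14-cat", "w2v2-b-cat", "hubert-b-cat", "axlstm-cat"]
THRESHOLD = 0.95

# One entry per perturbation. Each tuple is one dataset row, in the order
# crema-d, emovo, iemocap, meld, msppodcast; columns follow MODELS.
TABLES = {
    "Additive Tone": [(0.99, 0.96, 0.98, 0.93), (0.93, 0.95, 0.95, 0.89),
                      (0.97, 0.93, 0.95, 0.86), (0.98, 0.98, 0.97, 0.95),
                      (0.93, 0.95, 0.95, 0.86)],
    "Append Zeros": [(0.99, 0.98, 0.99, 0.89), (0.97, 0.98, 0.98, 0.90),
                     (0.97, 0.98, 0.98, 0.92), (0.97, 0.98, 0.97, 0.89),
                     (0.98, 0.99, 0.99, 0.92)],
    "Clip": [(0.99, 0.97, 1.00, 0.97), (0.97, 0.98, 0.98, 0.98),
             (0.98, 0.96, 0.98, 0.94), (0.98, 0.98, 0.98, 0.97),
             (0.98, 0.98, 0.99, 0.96)],
    "Crop Beginning": [(0.99, 0.90, 0.98, 0.81), (0.95, 0.90, 0.93, 0.82),
                       (0.96, 0.92, 0.96, 0.86), (0.94, 0.90, 0.93, 0.84),
                       (0.96, 0.94, 0.96, 0.83)],
    "Crop End": [(0.99, 0.99, 1.00, 0.93), (0.99, 0.98, 0.98, 0.93),
                 (0.99, 0.98, 0.99, 0.95), (0.98, 0.97, 0.96, 0.92),
                 (0.99, 0.99, 0.99, 0.95)],
    "Gain": [(0.98, 1.00, 1.00, 1.00), (0.96, 1.00, 1.00, 0.99),
             (0.96, 0.99, 0.99, 0.98), (0.91, 1.00, 1.00, 1.00),
             (0.93, 1.00, 1.00, 0.99)],
    "Highpass Filter": [(0.99, 0.95, 0.99, 0.96), (0.98, 0.96, 0.97, 0.97),
                        (0.98, 0.95, 0.98, 0.96), (0.97, 0.96, 0.97, 0.97),
                        (0.99, 0.97, 0.98, 0.97)],
    "Lowpass Filter": [(0.98, 0.95, 0.99, 0.94), (0.98, 0.98, 0.98, 0.96),
                       (0.99, 0.98, 0.99, 0.98), (0.95, 0.97, 0.97, 0.95),
                       (0.97, 0.98, 0.99, 0.92)],
    "Prepend Zeros": [(0.98, 0.90, 0.98, 0.80), (0.95, 0.90, 0.93, 0.80),
                      (0.95, 0.91, 0.95, 0.82), (0.93, 0.89, 0.92, 0.82),
                      (0.96, 0.94, 0.96, 0.82)],
    "White Noise": [(0.98, 0.94, 0.96, 0.90), (0.87, 0.90, 0.89, 0.82),
                    (0.93, 0.88, 0.88, 0.84), (0.96, 0.96, 0.95, 0.91),
                    (0.88, 0.92, 0.88, 0.82)],
}


def overall_scores(tables, models, threshold):
    """Count passed and failed tests per model.

    Assumption: a test passes when its value is strictly above the threshold.
    """
    scores = {}
    for col, model in enumerate(models):
        values = [row[col] for rows in tables.values() for row in rows]
        passed = sum(value > threshold for value in values)
        scores[model] = (passed, len(values) - passed)
    return scores


SCORES = overall_scores(TABLES, MODELS, THRESHOLD)
for model, (passed, failed) in SCORES.items():
    share = 100 * passed / (passed + failed)
    print(f"{model}: {share:.1f}% passed tests ({passed} passed / {failed} failed)")
```

Under this assumption the recount reproduces the reported totals for all four models, which also means that every entry displayed as exactly 0.95 is counted as failed.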