Robustness small changes

Overall scores

| | CNN14-cat | w2v2-b-cat | hubert-b-cat | axlstm-cat |
|---|---|---|---|---|
| Overall Score | 74.0% passed tests (37 passed / 13 failed) | 60.0% passed tests (30 passed / 20 failed) | 76.0% passed tests (38 passed / 12 failed) | 32.0% passed tests (16 passed / 34 failed) |
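The overall score is the share of passed tests among all tests run for a model. The following minimal sketch reproduces the percentages from the reported pass and fail counts. The function name `pass_rate` and the dictionary layout are illustrative choices and are not taken from the benchmark code.

```python
def pass_rate(passed: int, failed: int) -> float:
    """Return the percentage of passed tests among all tests."""
    total = passed + failed
    if total == 0:
        raise ValueError("no tests were run")
    return 100.0 * passed / total


# (passed, failed) counts as reported in the overall scores
counts = {
    "CNN14-cat": (37, 13),
    "w2v2-b-cat": (30, 20),
    "hubert-b-cat": (38, 12),
    "axlstm-cat": (16, 34),
}

for model, (passed, failed) in counts.items():
    rate = pass_rate(passed, failed)
    print(f"{model}: {rate:.1f}% passed tests ({passed} passed / {failed} failed)")
```

Each model has 50 tests in total, which is consistent with the ten perturbation tests on this page being evaluated on five datasets each.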

Percentage Unchanged Predictions Additive Tone

Threshold: 0.95

Percent Unchanged Pred Additive Tone

| Data | CNN14-cat | w2v2-b-cat | hubert-b-cat | axlstm-cat |
|---|---|---|---|---|
| crema-d-1.2.0-emotion.categories.test.gold_standard | 0.99 | 0.96 | 0.98 | 0.93 |
| emovo-1.2.1-emotion.test | 0.93 | 0.95 | 0.95 | 0.89 |
| iemocap-2.3.0-emotion.categories.test.gold_standard | 0.97 | 0.93 | 0.95 | 0.86 |
| meld-1.3.1-emotion.categories.test.gold_standard | 0.98 | 0.98 | 0.97 | 0.95 |
| msppodcast-2.6.0-emotion.categories.test-1.gold_standard | 0.93 | 0.95 | 0.95 | 0.86 |
| mean | 0.96 | 0.95 | 0.96 | 0.90 |
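As a rough illustration of the metric, the sketch below computes the fraction of samples whose predicted class stays the same after a perturbation and compares it with the threshold of 0.95. The function name, the toy labels, and the use of `>=` for the comparison are assumptions; the source does not state how the benchmark compares against the threshold. The table values are rounded to two decimals, so an entry shown as 0.95 cannot be classified as passed or failed from the table alone.

```python
THRESHOLD = 0.95  # threshold reported for this test


def percent_unchanged(original, perturbed):
    """Fraction of samples whose predicted class did not change."""
    if len(original) != len(perturbed):
        raise ValueError("prediction lists must have the same length")
    if not original:
        raise ValueError("prediction lists must not be empty")
    unchanged = sum(o == p for o, p in zip(original, perturbed))
    return unchanged / len(original)


# Hypothetical predictions on clean audio (20 samples)
original = ["happy", "sad", "angry", "neutral"] * 5

# Hypothetical predictions on perturbed audio: two samples flip their class
perturbed = list(original)
perturbed[0] = "neutral"  # was "happy"
perturbed[5] = "angry"    # was "sad"

score = percent_unchanged(original, perturbed)
passed = score >= THRESHOLD  # assumed comparison
print(f"unchanged: {score:.2f}, passed: {passed}")
```

With 18 of 20 predictions unchanged, the toy example scores 0.90 and falls below the threshold.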

Percentage Unchanged Predictions Append Zeros

Threshold: 0.95

Percent Unchanged Pred Append Zeros

| Data | CNN14-cat | w2v2-b-cat | hubert-b-cat | axlstm-cat |
|---|---|---|---|---|
| crema-d-1.2.0-emotion.categories.test.gold_standard | 0.99 | 0.98 | 0.99 | 0.89 |
| emovo-1.2.1-emotion.test | 0.97 | 0.98 | 0.98 | 0.90 |
| iemocap-2.3.0-emotion.categories.test.gold_standard | 0.97 | 0.98 | 0.98 | 0.92 |
| meld-1.3.1-emotion.categories.test.gold_standard | 0.97 | 0.98 | 0.97 | 0.89 |
| msppodcast-2.6.0-emotion.categories.test-1.gold_standard | 0.98 | 0.99 | 0.99 | 0.92 |
| mean | 0.98 | 0.98 | 0.98 | 0.90 |

Percentage Unchanged Predictions Clip

Threshold: 0.95

Percent Unchanged Pred Clip

| Data | CNN14-cat | w2v2-b-cat | hubert-b-cat | axlstm-cat |
|---|---|---|---|---|
| crema-d-1.2.0-emotion.categories.test.gold_standard | 0.99 | 0.97 | 1.00 | 0.97 |
| emovo-1.2.1-emotion.test | 0.97 | 0.98 | 0.98 | 0.98 |
| iemocap-2.3.0-emotion.categories.test.gold_standard | 0.98 | 0.96 | 0.98 | 0.94 |
| meld-1.3.1-emotion.categories.test.gold_standard | 0.98 | 0.98 | 0.98 | 0.97 |
| msppodcast-2.6.0-emotion.categories.test-1.gold_standard | 0.98 | 0.98 | 0.99 | 0.96 |
| mean | 0.98 | 0.97 | 0.99 | 0.96 |

Percentage Unchanged Predictions Crop Beginning

Threshold: 0.95

Percent Unchanged Pred Crop Beginning

| Data | CNN14-cat | w2v2-b-cat | hubert-b-cat | axlstm-cat |
|---|---|---|---|---|
| crema-d-1.2.0-emotion.categories.test.gold_standard | 0.99 | 0.90 | 0.98 | 0.81 |
| emovo-1.2.1-emotion.test | 0.95 | 0.90 | 0.93 | 0.82 |
| iemocap-2.3.0-emotion.categories.test.gold_standard | 0.96 | 0.92 | 0.96 | 0.86 |
| meld-1.3.1-emotion.categories.test.gold_standard | 0.94 | 0.90 | 0.93 | 0.84 |
| msppodcast-2.6.0-emotion.categories.test-1.gold_standard | 0.96 | 0.94 | 0.96 | 0.83 |
| mean | 0.96 | 0.91 | 0.95 | 0.83 |

Percentage Unchanged Predictions Crop End

Threshold: 0.95

Percent Unchanged Pred Crop End

| Data | CNN14-cat | w2v2-b-cat | hubert-b-cat | axlstm-cat |
|---|---|---|---|---|
| crema-d-1.2.0-emotion.categories.test.gold_standard | 0.99 | 0.99 | 1.00 | 0.93 |
| emovo-1.2.1-emotion.test | 0.99 | 0.98 | 0.98 | 0.93 |
| iemocap-2.3.0-emotion.categories.test.gold_standard | 0.99 | 0.98 | 0.99 | 0.95 |
| meld-1.3.1-emotion.categories.test.gold_standard | 0.98 | 0.97 | 0.96 | 0.92 |
| msppodcast-2.6.0-emotion.categories.test-1.gold_standard | 0.99 | 0.99 | 0.99 | 0.95 |
| mean | 0.99 | 0.98 | 0.98 | 0.94 |

Percentage Unchanged Predictions Gain

Threshold: 0.95

Percent Unchanged Pred Gain

| Data | CNN14-cat | w2v2-b-cat | hubert-b-cat | axlstm-cat |
|---|---|---|---|---|
| crema-d-1.2.0-emotion.categories.test.gold_standard | 0.98 | 1.00 | 1.00 | 1.00 |
| emovo-1.2.1-emotion.test | 0.96 | 1.00 | 1.00 | 0.99 |
| iemocap-2.3.0-emotion.categories.test.gold_standard | 0.96 | 0.99 | 0.99 | 0.98 |
| meld-1.3.1-emotion.categories.test.gold_standard | 0.91 | 1.00 | 1.00 | 1.00 |
| msppodcast-2.6.0-emotion.categories.test-1.gold_standard | 0.93 | 1.00 | 1.00 | 0.99 |
| mean | 0.95 | 1.00 | 1.00 | 0.99 |

Percentage Unchanged Predictions Highpass Filter

Threshold: 0.95

Percent Unchanged Pred Highpass Filter

| Data | CNN14-cat | w2v2-b-cat | hubert-b-cat | axlstm-cat |
|---|---|---|---|---|
| crema-d-1.2.0-emotion.categories.test.gold_standard | 0.99 | 0.95 | 0.99 | 0.96 |
| emovo-1.2.1-emotion.test | 0.98 | 0.96 | 0.97 | 0.97 |
| iemocap-2.3.0-emotion.categories.test.gold_standard | 0.98 | 0.95 | 0.98 | 0.96 |
| meld-1.3.1-emotion.categories.test.gold_standard | 0.97 | 0.96 | 0.97 | 0.97 |
| msppodcast-2.6.0-emotion.categories.test-1.gold_standard | 0.99 | 0.97 | 0.98 | 0.97 |
| mean | 0.98 | 0.96 | 0.98 | 0.97 |

Percentage Unchanged Predictions Lowpass Filter

Threshold: 0.95

Percent Unchanged Pred Lowpass Filter

| Data | CNN14-cat | w2v2-b-cat | hubert-b-cat | axlstm-cat |
|---|---|---|---|---|
| crema-d-1.2.0-emotion.categories.test.gold_standard | 0.98 | 0.95 | 0.99 | 0.94 |
| emovo-1.2.1-emotion.test | 0.98 | 0.98 | 0.98 | 0.96 |
| iemocap-2.3.0-emotion.categories.test.gold_standard | 0.99 | 0.98 | 0.99 | 0.98 |
| meld-1.3.1-emotion.categories.test.gold_standard | 0.95 | 0.97 | 0.97 | 0.95 |
| msppodcast-2.6.0-emotion.categories.test-1.gold_standard | 0.97 | 0.98 | 0.99 | 0.92 |
| mean | 0.97 | 0.97 | 0.98 | 0.95 |

Percentage Unchanged Predictions Prepend Zeros

Threshold: 0.95

Percent Unchanged Pred Prepend Zeros

| Data | CNN14-cat | w2v2-b-cat | hubert-b-cat | axlstm-cat |
|---|---|---|---|---|
| crema-d-1.2.0-emotion.categories.test.gold_standard | 0.98 | 0.90 | 0.98 | 0.80 |
| emovo-1.2.1-emotion.test | 0.95 | 0.90 | 0.93 | 0.80 |
| iemocap-2.3.0-emotion.categories.test.gold_standard | 0.95 | 0.91 | 0.95 | 0.82 |
| meld-1.3.1-emotion.categories.test.gold_standard | 0.93 | 0.89 | 0.92 | 0.82 |
| msppodcast-2.6.0-emotion.categories.test-1.gold_standard | 0.96 | 0.94 | 0.96 | 0.82 |
| mean | 0.95 | 0.91 | 0.95 | 0.81 |

Percentage Unchanged Predictions White Noise

Threshold: 0.95

Percent Unchanged Pred White Noise

| Data | CNN14-cat | w2v2-b-cat | hubert-b-cat | axlstm-cat |
|---|---|---|---|---|
| crema-d-1.2.0-emotion.categories.test.gold_standard | 0.98 | 0.94 | 0.96 | 0.90 |
| emovo-1.2.1-emotion.test | 0.87 | 0.90 | 0.89 | 0.82 |
| iemocap-2.3.0-emotion.categories.test.gold_standard | 0.93 | 0.88 | 0.88 | 0.84 |
| meld-1.3.1-emotion.categories.test.gold_standard | 0.96 | 0.96 | 0.95 | 0.91 |
| msppodcast-2.6.0-emotion.categories.test-1.gold_standard | 0.88 | 0.92 | 0.88 | 0.82 |
| mean | 0.92 | 0.92 | 0.91 | 0.86 |
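The perturbation names in the tests above suggest simple signal operations. The sketch below shows one plausible reading of several of them on a list of float samples. It is not the benchmark's implementation: the source gives no parameters, so the number of zeros, the crop length, the gain in dB, the clipping limit, and the signal-to-noise ratio are all hypothetical values, as are the function names.

```python
import math
import random


def append_zeros(signal, n):
    """Add n zero samples at the end."""
    return list(signal) + [0.0] * n


def prepend_zeros(signal, n):
    """Add n zero samples at the beginning."""
    return [0.0] * n + list(signal)


def crop_beginning(signal, n):
    """Remove the first n samples."""
    return list(signal[n:])


def crop_end(signal, n):
    """Remove the last n samples."""
    return list(signal[: len(signal) - n])


def gain(signal, db):
    """Scale the amplitude by a gain given in decibels."""
    factor = 10 ** (db / 20)
    return [x * factor for x in signal]


def clip(signal, limit):
    """Limit all samples to the range [-limit, limit]."""
    return [max(-limit, min(limit, x)) for x in signal]


def white_noise(signal, snr_db, seed=0):
    """Add Gaussian white noise at the requested signal-to-noise ratio."""
    rng = random.Random(seed)
    signal_power = sum(x * x for x in signal) / len(signal)
    noise_power = signal_power / (10 ** (snr_db / 10))
    sigma = math.sqrt(noise_power)
    return [x + rng.gauss(0.0, sigma) for x in signal]


# Toy input: one second of a 440 Hz sine at 16 kHz (assumed sampling rate)
rate = 16000
tone = [0.5 * math.sin(2 * math.pi * 440 * t / rate) for t in range(rate)]

perturbed = {
    "append_zeros": append_zeros(tone, 100),
    "prepend_zeros": prepend_zeros(tone, 100),
    "crop_beginning": crop_beginning(tone, 100),
    "crop_end": crop_end(tone, 100),
    "gain": gain(tone, 6.0),
    "clip": clip(tone, 0.4),
    "white_noise": white_noise(tone, 20.0),
}

for name, samples in perturbed.items():
    print(f"{name}: {len(samples)} samples")
```

A robustness test of this kind would run the model on both `tone` and each perturbed copy and then count how often the predicted class stays the same.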

Visualization

| CNN14-cat | w2v2-b-cat | hubert-b-cat | axlstm-cat |
|---|---|---|---|
| ../../../_images/visualization_crema-d-1.2.0-emotion.categories.test.gold_standard51.png | ../../../_images/visualization_crema-d-1.2.0-emotion.categories.test.gold_standard37.png | ../../../_images/visualization_crema-d-1.2.0-emotion.categories.test.gold_standard52.png | ../../../_images/visualization_crema-d-1.2.0-emotion.categories.test.gold_standard53.png |
| ../../../_images/visualization_emovo-1.2.1-emotion.test51.png | ../../../_images/visualization_emovo-1.2.1-emotion.test37.png | ../../../_images/visualization_emovo-1.2.1-emotion.test52.png | ../../../_images/visualization_emovo-1.2.1-emotion.test53.png |
| ../../../_images/visualization_iemocap-2.3.0-emotion.categories.test.gold_standard51.png | ../../../_images/visualization_iemocap-2.3.0-emotion.categories.test.gold_standard37.png | ../../../_images/visualization_iemocap-2.3.0-emotion.categories.test.gold_standard52.png | ../../../_images/visualization_iemocap-2.3.0-emotion.categories.test.gold_standard53.png |
| ../../../_images/visualization_meld-1.3.1-emotion.categories.test.gold_standard67.png | ../../../_images/visualization_meld-1.3.1-emotion.categories.test.gold_standard47.png | ../../../_images/visualization_meld-1.3.1-emotion.categories.test.gold_standard68.png | ../../../_images/visualization_meld-1.3.1-emotion.categories.test.gold_standard69.png |
| ../../../_images/visualization_msppodcast-2.6.0-emotion.categories.test-1.gold_standard45.png | ../../../_images/visualization_msppodcast-2.6.0-emotion.categories.test-1.gold_standard25.png | ../../../_images/visualization_msppodcast-2.6.0-emotion.categories.test-1.gold_standard46.png | ../../../_images/visualization_msppodcast-2.6.0-emotion.categories.test-1.gold_standard47.png |