Robustness small changes

Overall scores

| | w2v2-b-cat | w2v2-L-cat | w2v2-L-robust-cat | w2v2-L-vox-cat | w2v2-L-xls-r-cat |
|---|---|---|---|---|---|
| Overall Score | 60.0% passed tests (30 passed / 20 failed). | 66.0% passed tests (33 passed / 17 failed). | 72.0% passed tests (36 passed / 14 failed). | 80.0% passed tests (40 passed / 10 failed). | 70.0% passed tests (35 passed / 15 failed). |
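The Overall Score entries above follow directly from the pass and fail counts. As a minimal sketch (the function name `format_overall_score` is hypothetical and not taken from the benchmark tooling), the percentage can be reproduced like this:

```python
def format_overall_score(passed: int, failed: int) -> str:
    """Format pass/fail counts in the style of the Overall Score entries."""
    total = passed + failed
    if total == 0:
        raise ValueError("no tests were run")
    percentage = 100.0 * passed / total
    return f"{percentage:.1f}% passed tests ({passed} passed / {failed} failed)."


# Counts reported for w2v2-b-cat.
print(format_overall_score(30, 20))
# → 60.0% passed tests (30 passed / 20 failed).
```

Each model is evaluated on 50 tests in total, which matches 10 perturbation tests on each of the 5 data sets listed on this page.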

Percentage Unchanged Predictions Additive Tone

Threshold: 0.95

Percent Unchanged Pred Additive Tone

| Data | w2v2-b-cat | w2v2-L-cat | w2v2-L-robust-cat | w2v2-L-vox-cat | w2v2-L-xls-r-cat |
|---|---|---|---|---|---|
| crema-d-1.2.0-emotion.categories.test.gold_standard | 0.96 | 0.96 | 0.96 | 0.99 | 0.99 |
| emovo-1.2.1-emotion.test | 0.95 | 0.93 | 0.94 | 0.95 | 0.94 |
| iemocap-2.3.0-emotion.categories.test.gold_standard | 0.93 | 0.92 | 0.96 | 0.95 | 0.93 |
| meld-1.3.1-emotion.categories.test.gold_standard | 0.98 | 0.97 | 0.98 | 0.99 | 0.98 |
| msppodcast-2.6.0-emotion.categories.test-1.gold_standard | 0.95 | 0.95 | 0.97 | 0.96 | 0.94 |
| mean | 0.95 | 0.95 | 0.96 | 0.97 | 0.96 |
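The metric name suggests the fraction of samples whose predicted class stays the same after the perturbation is applied. The sketch below illustrates that reading; the function names and the example labels are hypothetical. The pass rule is also an assumption: treating a value as passed only when it is strictly greater than the threshold appears consistent with the pass counts under Overall scores when applied to the two-decimal values on this page, so entries shown as 0.95 count as failed under this reading.

```python
from typing import Sequence


def percentage_unchanged(original: Sequence[str], perturbed: Sequence[str]) -> float:
    """Fraction of samples whose predicted label is identical before and after perturbation."""
    if len(original) != len(perturbed):
        raise ValueError("prediction lists must have the same length")
    if not original:
        raise ValueError("need at least one prediction")
    unchanged = sum(a == b for a, b in zip(original, perturbed))
    return unchanged / len(original)


def passes(value: float, threshold: float = 0.95) -> bool:
    # Assumption: only values strictly above the threshold pass.
    return value > threshold


# Hypothetical predictions for four utterances, before and after perturbation.
before = ["anger", "happiness", "neutral", "sadness"]
after = ["anger", "happiness", "neutral", "neutral"]
score = percentage_unchanged(before, after)
print(score, passes(score))
# → 0.75 False
```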

Percentage Unchanged Predictions Append Zeros

Threshold: 0.95

Percent Unchanged Pred Append Zeros

| Data | w2v2-b-cat | w2v2-L-cat | w2v2-L-robust-cat | w2v2-L-vox-cat | w2v2-L-xls-r-cat |
|---|---|---|---|---|---|
| crema-d-1.2.0-emotion.categories.test.gold_standard | 0.98 | 0.99 | 0.98 | 1.00 | 1.00 |
| emovo-1.2.1-emotion.test | 0.98 | 0.98 | 0.99 | 0.99 | 1.00 |
| iemocap-2.3.0-emotion.categories.test.gold_standard | 0.98 | 0.98 | 0.98 | 0.99 | 1.00 |
| meld-1.3.1-emotion.categories.test.gold_standard | 0.98 | 0.98 | 0.98 | 0.99 | 0.99 |
| msppodcast-2.6.0-emotion.categories.test-1.gold_standard | 0.99 | 0.99 | 0.99 | 0.99 | 0.99 |
| mean | 0.98 | 0.98 | 0.98 | 0.99 | 1.00 |
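The mean entries appear to be the unweighted average over the five data sets. As a small check under that assumption, the Append Zeros values for w2v2-L-xls-r-cat can be averaged directly. Note that the inputs here are already rounded to two decimals, so a recomputation like this may in general differ from the underlying result in the last digit.

```python
from statistics import mean

# Append Zeros values for w2v2-L-xls-r-cat, one per data set, as listed above.
values = [1.00, 1.00, 1.00, 0.99, 0.99]

row_mean = mean(values)
print(f"{row_mean:.2f}")
# → 1.00
```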

Percentage Unchanged Predictions Clip

Threshold: 0.95

Percent Unchanged Pred Clip

| Data | w2v2-b-cat | w2v2-L-cat | w2v2-L-robust-cat | w2v2-L-vox-cat | w2v2-L-xls-r-cat |
|---|---|---|---|---|---|
| crema-d-1.2.0-emotion.categories.test.gold_standard | 0.97 | 0.98 | 0.98 | 0.99 | 0.99 |
| emovo-1.2.1-emotion.test | 0.98 | 0.99 | 0.99 | 0.99 | 0.98 |
| iemocap-2.3.0-emotion.categories.test.gold_standard | 0.96 | 0.98 | 0.98 | 0.98 | 0.97 |
| meld-1.3.1-emotion.categories.test.gold_standard | 0.98 | 0.98 | 0.98 | 0.99 | 0.97 |
| msppodcast-2.6.0-emotion.categories.test-1.gold_standard | 0.98 | 0.99 | 0.99 | 0.99 | 0.98 |
| mean | 0.97 | 0.98 | 0.98 | 0.99 | 0.98 |

Percentage Unchanged Predictions Crop Beginning

Threshold: 0.95

Percent Unchanged Pred Crop Beginning

| Data | w2v2-b-cat | w2v2-L-cat | w2v2-L-robust-cat | w2v2-L-vox-cat | w2v2-L-xls-r-cat |
|---|---|---|---|---|---|
| crema-d-1.2.0-emotion.categories.test.gold_standard | 0.90 | 0.92 | 0.93 | 0.99 | 0.98 |
| emovo-1.2.1-emotion.test | 0.90 | 0.91 | 0.95 | 0.97 | 0.94 |
| iemocap-2.3.0-emotion.categories.test.gold_standard | 0.92 | 0.93 | 0.95 | 0.96 | 0.93 |
| meld-1.3.1-emotion.categories.test.gold_standard | 0.90 | 0.90 | 0.92 | 0.94 | 0.90 |
| msppodcast-2.6.0-emotion.categories.test-1.gold_standard | 0.94 | 0.96 | 0.96 | 0.96 | 0.94 |
| mean | 0.91 | 0.92 | 0.94 | 0.96 | 0.94 |

Percentage Unchanged Predictions Crop End

Threshold: 0.95

Percent Unchanged Pred Crop End

| Data | w2v2-b-cat | w2v2-L-cat | w2v2-L-robust-cat | w2v2-L-vox-cat | w2v2-L-xls-r-cat |
|---|---|---|---|---|---|
| crema-d-1.2.0-emotion.categories.test.gold_standard | 0.99 | 0.99 | 0.99 | 1.00 | 1.00 |
| emovo-1.2.1-emotion.test | 0.98 | 0.98 | 0.99 | 1.00 | 0.99 |
| iemocap-2.3.0-emotion.categories.test.gold_standard | 0.98 | 0.98 | 0.99 | 0.99 | 0.99 |
| meld-1.3.1-emotion.categories.test.gold_standard | 0.97 | 0.97 | 0.97 | 0.98 | 0.98 |
| msppodcast-2.6.0-emotion.categories.test-1.gold_standard | 0.99 | 0.98 | 0.99 | 0.99 | 0.99 |
| mean | 0.98 | 0.98 | 0.99 | 0.99 | 0.99 |

Percentage Unchanged Predictions Gain

Threshold: 0.95

Percent Unchanged Pred Gain

| Data | w2v2-b-cat | w2v2-L-cat | w2v2-L-robust-cat | w2v2-L-vox-cat | w2v2-L-xls-r-cat |
|---|---|---|---|---|---|
| crema-d-1.2.0-emotion.categories.test.gold_standard | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 |
| emovo-1.2.1-emotion.test | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 |
| iemocap-2.3.0-emotion.categories.test.gold_standard | 0.99 | 0.99 | 0.99 | 0.99 | 0.99 |
| meld-1.3.1-emotion.categories.test.gold_standard | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 |
| msppodcast-2.6.0-emotion.categories.test-1.gold_standard | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 |
| mean | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 |
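The Gain test presumably scales the waveform amplitude by a fixed factor. The page does not state which gain values were used, so the following is only an illustrative sketch: `apply_gain`, the example waveform, and the 20 dB value are all assumptions chosen for readability.

```python
from typing import List


def apply_gain(signal: List[float], gain_db: float) -> List[float]:
    """Scale every sample by a gain given in decibels."""
    factor = 10.0 ** (gain_db / 20.0)  # convert dB to a linear amplitude factor
    return [sample * factor for sample in signal]


# Hypothetical waveform; +20 dB corresponds to an amplitude factor of 10.
signal = [0.01, -0.02, 0.03]
louder = apply_gain(signal, gain_db=20.0)
print(louder)
```

A pure gain change leaves the shape of the waveform untouched, which may be why the values in the Gain test are the highest on this page.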

Percentage Unchanged Predictions Highpass Filter

Threshold: 0.95

Percent Unchanged Pred Highpass Filter

| Data | w2v2-b-cat | w2v2-L-cat | w2v2-L-robust-cat | w2v2-L-vox-cat | w2v2-L-xls-r-cat |
|---|---|---|---|---|---|
| crema-d-1.2.0-emotion.categories.test.gold_standard | 0.95 | 0.97 | 0.96 | 0.99 | 0.99 |
| emovo-1.2.1-emotion.test | 0.96 | 0.94 | 0.97 | 0.93 | 0.97 |
| iemocap-2.3.0-emotion.categories.test.gold_standard | 0.95 | 0.96 | 0.97 | 0.95 | 0.96 |
| meld-1.3.1-emotion.categories.test.gold_standard | 0.96 | 0.96 | 0.97 | 0.96 | 0.96 |
| msppodcast-2.6.0-emotion.categories.test-1.gold_standard | 0.97 | 0.98 | 0.98 | 0.95 | 0.96 |
| mean | 0.96 | 0.96 | 0.97 | 0.96 | 0.97 |

Percentage Unchanged Predictions Lowpass Filter

Threshold: 0.95

Percent Unchanged Pred Lowpass Filter

| Data | w2v2-b-cat | w2v2-L-cat | w2v2-L-robust-cat | w2v2-L-vox-cat | w2v2-L-xls-r-cat |
|---|---|---|---|---|---|
| crema-d-1.2.0-emotion.categories.test.gold_standard | 0.95 | 0.97 | 0.96 | 0.98 | 0.96 |
| emovo-1.2.1-emotion.test | 0.98 | 0.97 | 0.99 | 0.99 | 0.98 |
| iemocap-2.3.0-emotion.categories.test.gold_standard | 0.98 | 0.99 | 0.99 | 0.99 | 0.99 |
| meld-1.3.1-emotion.categories.test.gold_standard | 0.97 | 0.97 | 0.98 | 0.98 | 0.97 |
| msppodcast-2.6.0-emotion.categories.test-1.gold_standard | 0.98 | 0.98 | 0.98 | 0.98 | 0.96 |
| mean | 0.97 | 0.98 | 0.98 | 0.98 | 0.97 |

Percentage Unchanged Predictions Prepend Zeros

Threshold: 0.95

Percent Unchanged Pred Prepend Zeros

| Data | w2v2-b-cat | w2v2-L-cat | w2v2-L-robust-cat | w2v2-L-vox-cat | w2v2-L-xls-r-cat |
|---|---|---|---|---|---|
| crema-d-1.2.0-emotion.categories.test.gold_standard | 0.90 | 0.93 | 0.90 | 0.98 | 0.97 |
| emovo-1.2.1-emotion.test | 0.90 | 0.89 | 0.92 | 0.97 | 0.94 |
| iemocap-2.3.0-emotion.categories.test.gold_standard | 0.91 | 0.93 | 0.93 | 0.96 | 0.93 |
| meld-1.3.1-emotion.categories.test.gold_standard | 0.89 | 0.90 | 0.90 | 0.95 | 0.90 |
| msppodcast-2.6.0-emotion.categories.test-1.gold_standard | 0.94 | 0.95 | 0.95 | 0.96 | 0.94 |
| mean | 0.91 | 0.92 | 0.92 | 0.96 | 0.94 |
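The Append Zeros, Prepend Zeros, Crop Beginning, and Crop End tests all change the signal only at its edges. The page does not give the number of samples added or removed, so the helpers below are a hypothetical sketch of the four operations on a plain list of samples, not the benchmark's implementation.

```python
from typing import List


def _check(num_samples: int) -> None:
    if num_samples < 0:
        raise ValueError("num_samples must be non-negative")


def prepend_zeros(signal: List[float], num_samples: int) -> List[float]:
    """Add silence in front of the signal."""
    _check(num_samples)
    return [0.0] * num_samples + list(signal)


def append_zeros(signal: List[float], num_samples: int) -> List[float]:
    """Add silence after the signal."""
    _check(num_samples)
    return list(signal) + [0.0] * num_samples


def crop_beginning(signal: List[float], num_samples: int) -> List[float]:
    """Remove samples from the start of the signal."""
    _check(num_samples)
    return list(signal[num_samples:])


def crop_end(signal: List[float], num_samples: int) -> List[float]:
    """Remove samples from the end of the signal."""
    _check(num_samples)
    keep = max(len(signal) - num_samples, 0)  # avoid negative slice bounds
    return list(signal[:keep])


# Hypothetical four-sample signal.
signal = [0.1, 0.2, 0.3, 0.4]
print(prepend_zeros(signal, 2))   # → [0.0, 0.0, 0.1, 0.2, 0.3, 0.4]
print(append_zeros(signal, 1))    # → [0.1, 0.2, 0.3, 0.4, 0.0]
print(crop_beginning(signal, 1))  # → [0.2, 0.3, 0.4]
print(crop_end(signal, 1))        # → [0.1, 0.2, 0.3]
```

In the tables on this page, the two operations that alter the start of the signal (Crop Beginning and Prepend Zeros) have lower mean values than their counterparts at the end of the signal for every model.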

Percentage Unchanged Predictions White Noise

Threshold: 0.95

Percent Unchanged Pred White Noise

| Data | w2v2-b-cat | w2v2-L-cat | w2v2-L-robust-cat | w2v2-L-vox-cat | w2v2-L-xls-r-cat |
|---|---|---|---|---|---|
| crema-d-1.2.0-emotion.categories.test.gold_standard | 0.94 | 0.94 | 0.93 | 0.97 | 0.98 |
| emovo-1.2.1-emotion.test | 0.90 | 0.87 | 0.89 | 0.85 | 0.88 |
| iemocap-2.3.0-emotion.categories.test.gold_standard | 0.88 | 0.87 | 0.92 | 0.90 | 0.95 |
| meld-1.3.1-emotion.categories.test.gold_standard | 0.96 | 0.97 | 0.96 | 0.97 | 0.95 |
| msppodcast-2.6.0-emotion.categories.test-1.gold_standard | 0.92 | 0.92 | 0.92 | 0.90 | 0.90 |
| mean | 0.92 | 0.91 | 0.92 | 0.92 | 0.93 |
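The White Noise test presumably mixes low-level noise into the signal. The noise level used for these results is not stated on this page, so the sketch below assumes Gaussian white noise added at a chosen signal-to-noise ratio; `add_white_noise`, `measured_snr_db`, the 20 dB value, and the sine-wave input are all hypothetical.

```python
import math
import random
from typing import List


def add_white_noise(signal: List[float], snr_db: float, seed: int = 0) -> List[float]:
    """Add Gaussian white noise so the result has roughly the requested SNR."""
    if not signal:
        raise ValueError("signal must not be empty")
    rng = random.Random(seed)  # seeded for reproducible noise
    signal_power = sum(s * s for s in signal) / len(signal)
    noise_power = signal_power / (10.0 ** (snr_db / 10.0))
    noise_std = math.sqrt(noise_power)
    return [s + rng.gauss(0.0, noise_std) for s in signal]


def measured_snr_db(clean: List[float], noisy: List[float]) -> float:
    """Estimate the SNR in dB from a clean signal and its noisy version."""
    signal_power = sum(c * c for c in clean) / len(clean)
    noise_power = sum((n - c) ** 2 for c, n in zip(clean, noisy)) / len(clean)
    return 10.0 * math.log10(signal_power / noise_power)


# Hypothetical input: one second of a 440 Hz sine at 16 kHz.
sample_rate = 16000
clean = [0.5 * math.sin(2 * math.pi * 440 * n / sample_rate) for n in range(sample_rate)]
noisy = add_white_noise(clean, snr_db=20.0)
print(round(measured_snr_db(clean, noisy), 1))
```

The measured SNR only approximates the requested value because the noise is a finite random sample.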

Visualization

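| Data | Model | Image |
|---|---|---|
| crema-d-1.2.0-emotion.categories.test.gold_standard | w2v2-b-cat | ../../../_images/visualization_crema-d-1.2.0-emotion.categories.test.gold_standard37.png |
| crema-d-1.2.0-emotion.categories.test.gold_standard | w2v2-L-cat | ../../../_images/visualization_crema-d-1.2.0-emotion.categories.test.gold_standard38.png |
| crema-d-1.2.0-emotion.categories.test.gold_standard | w2v2-L-robust-cat | ../../../_images/visualization_crema-d-1.2.0-emotion.categories.test.gold_standard39.png |
| crema-d-1.2.0-emotion.categories.test.gold_standard | w2v2-L-vox-cat | ../../../_images/visualization_crema-d-1.2.0-emotion.categories.test.gold_standard40.png |
| crema-d-1.2.0-emotion.categories.test.gold_standard | w2v2-L-xls-r-cat | ../../../_images/visualization_crema-d-1.2.0-emotion.categories.test.gold_standard41.png |
| emovo-1.2.1-emotion.test | w2v2-b-cat | ../../../_images/visualization_emovo-1.2.1-emotion.test37.png |
| emovo-1.2.1-emotion.test | w2v2-L-cat | ../../../_images/visualization_emovo-1.2.1-emotion.test38.png |
| emovo-1.2.1-emotion.test | w2v2-L-robust-cat | ../../../_images/visualization_emovo-1.2.1-emotion.test39.png |
| emovo-1.2.1-emotion.test | w2v2-L-vox-cat | ../../../_images/visualization_emovo-1.2.1-emotion.test40.png |
| emovo-1.2.1-emotion.test | w2v2-L-xls-r-cat | ../../../_images/visualization_emovo-1.2.1-emotion.test41.png |
| iemocap-2.3.0-emotion.categories.test.gold_standard | w2v2-b-cat | ../../../_images/visualization_iemocap-2.3.0-emotion.categories.test.gold_standard37.png |
| iemocap-2.3.0-emotion.categories.test.gold_standard | w2v2-L-cat | ../../../_images/visualization_iemocap-2.3.0-emotion.categories.test.gold_standard38.png |
| iemocap-2.3.0-emotion.categories.test.gold_standard | w2v2-L-robust-cat | ../../../_images/visualization_iemocap-2.3.0-emotion.categories.test.gold_standard39.png |
| iemocap-2.3.0-emotion.categories.test.gold_standard | w2v2-L-vox-cat | ../../../_images/visualization_iemocap-2.3.0-emotion.categories.test.gold_standard40.png |
| iemocap-2.3.0-emotion.categories.test.gold_standard | w2v2-L-xls-r-cat | ../../../_images/visualization_iemocap-2.3.0-emotion.categories.test.gold_standard41.png |
| meld-1.3.1-emotion.categories.test.gold_standard | w2v2-b-cat | ../../../_images/visualization_meld-1.3.1-emotion.categories.test.gold_standard47.png |
| meld-1.3.1-emotion.categories.test.gold_standard | w2v2-L-cat | ../../../_images/visualization_meld-1.3.1-emotion.categories.test.gold_standard48.png |
| meld-1.3.1-emotion.categories.test.gold_standard | w2v2-L-robust-cat | ../../../_images/visualization_meld-1.3.1-emotion.categories.test.gold_standard49.png |
| meld-1.3.1-emotion.categories.test.gold_standard | w2v2-L-vox-cat | ../../../_images/visualization_meld-1.3.1-emotion.categories.test.gold_standard50.png |
| meld-1.3.1-emotion.categories.test.gold_standard | w2v2-L-xls-r-cat | ../../../_images/visualization_meld-1.3.1-emotion.categories.test.gold_standard51.png |
| msppodcast-2.6.0-emotion.categories.test-1.gold_standard | w2v2-b-cat | ../../../_images/visualization_msppodcast-2.6.0-emotion.categories.test-1.gold_standard25.png |
| msppodcast-2.6.0-emotion.categories.test-1.gold_standard | w2v2-L-cat | ../../../_images/visualization_msppodcast-2.6.0-emotion.categories.test-1.gold_standard26.png |
| msppodcast-2.6.0-emotion.categories.test-1.gold_standard | w2v2-L-robust-cat | ../../../_images/visualization_msppodcast-2.6.0-emotion.categories.test-1.gold_standard27.png |
| msppodcast-2.6.0-emotion.categories.test-1.gold_standard | w2v2-L-vox-cat | ../../../_images/visualization_msppodcast-2.6.0-emotion.categories.test-1.gold_standard28.png |
| msppodcast-2.6.0-emotion.categories.test-1.gold_standard | w2v2-L-xls-r-cat | ../../../_images/visualization_msppodcast-2.6.0-emotion.categories.test-1.gold_standard29.png |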