Robustness simulated recording condition

Overall scores

| | w2v2-L | hubert-L | wavlm | data2vec |
|---|---|---|---|---|
| Overall Score | 33.3% passed tests (2 passed / 4 failed) | 16.7% passed tests (1 passed / 5 failed) | 33.3% passed tests (2 passed / 4 failed) | 16.7% passed tests (1 passed / 5 failed) |

Percentage Unchanged Predictions Simulated Position

Threshold: 0.8

| Data | w2v2-L | hubert-L | wavlm | data2vec |
|---|---|---|---|---|
| emovo-1.2.1-emotion.test | 0.59 | 0.70 | 0.79 | 0.62 |
| imda-nsc-read-speech-balanced-2.6.0-headset | 0.81 | 0.82 | 0.90 | 0.80 |
| timit-1.4.1-files | 0.86 | 0.77 | 0.92 | 0.84 |
| mean | 0.75 | 0.76 | 0.87 | 0.75 |

Percentage Unchanged Predictions Simulated Room

Threshold: 0.8

| Data | w2v2-L | hubert-L | wavlm | data2vec |
|---|---|---|---|---|
| emovo-1.2.1-emotion.test | 0.35 | 0.61 | 0.59 | 0.48 |
| imda-nsc-read-speech-balanced-2.6.0-headset | 0.44 | 0.72 | 0.66 | 0.65 |
| timit-1.4.1-files | 0.55 | 0.72 | 0.64 | 0.70 |
| mean | 0.45 | 0.68 | 0.63 | 0.61 |
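The overall scores appear to be a tally of the six table cells per model (three datasets for each of the two simulated conditions) that exceed the 0.8 threshold. The sketch below recomputes the tally and the mean rows from the rounded values in the tables. The names `POSITION`, `ROOM`, `column_means`, and `tally` are hypothetical, and the strict comparison is an assumption: data2vec's 0.80 on imda-nsc-read-speech-balanced-2.6.0-headset is counted as failed in the overall score, so either the comparison is strict or the unrounded value lies just below 0.8.

```python
# Threshold stated in the report.
THRESHOLD = 0.8

MODELS = ["w2v2-L", "hubert-L", "wavlm", "data2vec"]

# Rounded values copied from the two tables, one row per dataset,
# columns in the order given by MODELS.
POSITION = {
    "emovo-1.2.1-emotion.test": [0.59, 0.70, 0.79, 0.62],
    "imda-nsc-read-speech-balanced-2.6.0-headset": [0.81, 0.82, 0.90, 0.80],
    "timit-1.4.1-files": [0.86, 0.77, 0.92, 0.84],
}
ROOM = {
    "emovo-1.2.1-emotion.test": [0.35, 0.61, 0.59, 0.48],
    "imda-nsc-read-speech-balanced-2.6.0-headset": [0.44, 0.72, 0.66, 0.65],
    "timit-1.4.1-files": [0.55, 0.72, 0.64, 0.70],
}


def column_means(table):
    """Mean over datasets for each model, rounded to two decimals."""
    rows = list(table.values())
    return [round(sum(col) / len(col), 2) for col in zip(*rows)]


def tally(tables, threshold=THRESHOLD):
    """Count, per model, how many (condition, dataset) cells pass."""
    passed = {model: 0 for model in MODELS}
    total = 0
    for table in tables:
        for row in table.values():
            total += 1
            for model, value in zip(MODELS, row):
                # Assumption: strict comparison on the rounded values.
                if value > threshold:
                    passed[model] += 1
    return passed, total


if __name__ == "__main__":
    passed, total = tally([POSITION, ROOM])
    for model in MODELS:
        n = passed[model]
        print(f"{model}: {100 * n / total:.1f}% passed tests "
              f"({n} passed / {total - n} failed)")
    print("position means:", column_means(POSITION))
    print("room means:", column_means(ROOM))
```

Under these assumptions the tally matches the reported overall scores, and all passing cells come from the simulated-position test; no model reaches the threshold on any dataset in the simulated-room test.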

Visualization Simulated Position

The upper plot shows the difference between the predictions for audio with a baseline simulated position and the predictions for audio with a different simulated position; the allowed prediction difference \(\delta < 0.05\) is highlighted in green. The lower plot shows the distributions of the two predictions.

[Figures, simulated position: one plot per model (columns: w2v2-L, hubert-L, wavlm, data2vec) and dataset (rows: emovo-1.2.1-emotion.test, imda-nsc-read-speech-balanced-2.6.0-headset, timit-1.4.1-files).]

[Figures, simulated room: one plot per model and dataset, in the same layout.]
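One plausible reading of the "Percentage Unchanged Predictions" metric is the fraction of samples whose prediction moves by less than the allowed difference \(\delta\) when the recording condition is changed. The following is a minimal sketch under that reading, not the benchmark's actual implementation: the function name `percentage_unchanged` is hypothetical, the predictions are assumed to be continuous values, the sample data are made up for illustration, and the result is expressed as a fraction in [0, 1] to match the tables.

```python
def percentage_unchanged(baseline, perturbed, delta=0.05):
    """Fraction of samples with |perturbed - baseline| < delta.

    baseline:  predictions for audio in the baseline condition
    perturbed: predictions for the same audio in a changed condition
    delta:     allowed prediction difference (0.05 in the report)
    """
    if len(baseline) != len(perturbed):
        raise ValueError("baseline and perturbed must have the same length")
    if len(baseline) == 0:
        raise ValueError("at least one prediction is required")
    unchanged = sum(
        abs(p - b) < delta for b, p in zip(baseline, perturbed)
    )
    return unchanged / len(baseline)


# Made-up predictions for five audio files (illustration only).
baseline = [0.10, 0.40, 0.55, 0.80, 0.30]
perturbed = [0.12, 0.47, 0.56, 0.70, 0.33]

score = percentage_unchanged(baseline, perturbed)
# Assumption: a test passes when the score exceeds the 0.8 threshold.
passed = score > 0.8
print(f"unchanged: {score:.2f}, passed: {passed}")
```

In this made-up example three of the five differences stay below \(\delta\), so the score falls short of the threshold and the test would fail.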