Robustness simulated recording condition

Overall scores

Model      Overall Score
---------  -----------------------------------------
w2v2-L     16.7% passed tests (1 passed / 5 failed).
hubert-L   0.0% passed tests (0 passed / 6 failed).
wavlm      33.3% passed tests (2 passed / 4 failed).
data2vec   16.7% passed tests (1 passed / 5 failed).

Percentage Unchanged Predictions Simulated Position

Threshold: 0.8

Data                                         w2v2-L  hubert-L  wavlm  data2vec
-------------------------------------------  ------  --------  -----  --------
emovo-1.2.1-emotion.test                     0.81    0.65      0.72   0.58
imda-nsc-read-speech-balanced-2.6.0-headset  0.79    0.68      0.86   0.75
timit-1.4.1-files                            0.65    0.75      0.92   0.82
mean                                         0.75    0.69      0.83   0.72
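The mean row can be checked against the per-dataset values. The following is a minimal sketch, assuming the mean is the unweighted average over the three datasets; the dictionary name `position` and the helper `column_mean` are illustrative and not part of the original report. Because the inputs are already rounded to two decimals, the recomputed means could in principle differ from the reported ones in the last digit.

```python
# Values copied from the Simulated Position table (rounded to two decimals).
# Row order: emovo, imda-nsc-read-speech-balanced, timit.
position = {
    "w2v2-L":   [0.81, 0.79, 0.65],
    "hubert-L": [0.65, 0.68, 0.75],
    "wavlm":    [0.72, 0.86, 0.92],
    "data2vec": [0.58, 0.75, 0.82],
}


def column_mean(values):
    """Unweighted mean over datasets (assumed aggregation)."""
    return sum(values) / len(values)


for model, values in position.items():
    print(f"{model}: {column_mean(values):.2f}")
```

For these values the recomputed means agree with the reported mean row.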

Percentage Unchanged Predictions Simulated Room

Threshold: 0.8

Data                                         w2v2-L  hubert-L  wavlm  data2vec
-------------------------------------------  ------  --------  -----  --------
emovo-1.2.1-emotion.test                     0.70    0.58      0.61   0.53
imda-nsc-read-speech-balanced-2.6.0-headset  0.75    0.66      0.68   0.72
timit-1.4.1-files                            0.73    0.64      0.46   0.80
mean                                         0.73    0.63      0.58   0.68
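The overall scores can be reproduced from the two tables. The sketch below assumes that each dataset entry counts as one test (six per model, the mean row excluded) and that a test passes when the rounded value is strictly greater than the threshold of 0.8. Both points are assumptions: the report does not state the comparison operator, and the underlying unrounded values may differ. The names `position`, `room`, and `overall_score` are illustrative.

```python
THRESHOLD = 0.8

# Per-dataset values copied from the two tables (mean row excluded).
position = {
    "w2v2-L":   [0.81, 0.79, 0.65],
    "hubert-L": [0.65, 0.68, 0.75],
    "wavlm":    [0.72, 0.86, 0.92],
    "data2vec": [0.58, 0.75, 0.82],
}
room = {
    "w2v2-L":   [0.70, 0.75, 0.73],
    "hubert-L": [0.58, 0.66, 0.64],
    "wavlm":    [0.61, 0.68, 0.46],
    "data2vec": [0.53, 0.72, 0.80],
}


def overall_score(model):
    """Return (passed, failed, percent passed) over both tests."""
    results = position[model] + room[model]
    # Strict comparison is an assumption, see the note above.
    passed = sum(value > THRESHOLD for value in results)
    failed = len(results) - passed
    return passed, failed, 100 * passed / len(results)


for model in position:
    passed, failed, percent = overall_score(model)
    print(f"{model}: {percent:.1f}% passed tests "
          f"({passed} passed / {failed} failed)")
```

Under these assumptions the counts match the reported overall scores, which suggests that the 0.80 entry for data2vec on timit-1.4.1-files does not count as a pass.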

Visualization Simulated Position

Each figure shows the difference between the predictions for audio with a baseline simulated position and the predictions for audio with a different simulated position. The allowed prediction difference \(\delta < 0.05\) is highlighted in green in the upper plot. The lower plot shows the distributions of the two predictions.
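The link between the allowed difference \(\delta\) and the share of unchanged predictions can be sketched as follows. This is a hedged illustration, not the benchmark's implementation: it assumes that a prediction counts as unchanged when the absolute difference to the baseline prediction is below \(\delta = 0.05\), and that the reported value is the fraction of such samples. The function name `percentage_unchanged` and the example predictions are hypothetical.

```python
def percentage_unchanged(baseline, changed, delta=0.05):
    """Fraction of samples whose prediction moved by less than delta.

    baseline: predictions for audio at the baseline simulated position
    changed:  predictions for the same audio at a different position
    """
    if len(baseline) != len(changed):
        raise ValueError("both prediction lists must have the same length")
    unchanged = sum(abs(c - b) < delta for b, c in zip(baseline, changed))
    return unchanged / len(baseline)


# Hypothetical predictions for four audio files.
baseline = [0.50, 0.62, 0.30, 0.80]
changed = [0.52, 0.70, 0.31, 0.74]

score = percentage_unchanged(baseline, changed)
print(f"{score:.2f}")  # two of the four predictions stay within delta
```

With a threshold of 0.8, this hypothetical model would need at least four out of five predictions to stay within \(\delta\) to pass.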

Figures for each dataset, one per model, listed in the column order w2v2-L, hubert-L, wavlm, data2vec:

emovo-1.2.1-emotion.test
../../../_images/visualization-simulated-position_emovo-1.2.1-emotion.test37.png
../../../_images/visualization-simulated-position_emovo-1.2.1-emotion.test41.png
../../../_images/visualization-simulated-position_emovo-1.2.1-emotion.test42.png
../../../_images/visualization-simulated-position_emovo-1.2.1-emotion.test43.png

imda-nsc-read-speech-balanced-2.6.0-headset
../../../_images/visualization-simulated-position_imda-nsc-read-speech-balanced-2.6.0-headset37.png
../../../_images/visualization-simulated-position_imda-nsc-read-speech-balanced-2.6.0-headset41.png
../../../_images/visualization-simulated-position_imda-nsc-read-speech-balanced-2.6.0-headset42.png
../../../_images/visualization-simulated-position_imda-nsc-read-speech-balanced-2.6.0-headset43.png

timit-1.4.1-files
../../../_images/visualization-simulated-position_timit-1.4.1-files37.png
../../../_images/visualization-simulated-position_timit-1.4.1-files41.png
../../../_images/visualization-simulated-position_timit-1.4.1-files42.png
../../../_images/visualization-simulated-position_timit-1.4.1-files43.png

Visualization Simulated Room

Figures for each dataset, one per model, listed in the column order w2v2-L, hubert-L, wavlm, data2vec:

emovo-1.2.1-emotion.test
../../../_images/visualization-simulated-room_emovo-1.2.1-emotion.test37.png
../../../_images/visualization-simulated-room_emovo-1.2.1-emotion.test41.png
../../../_images/visualization-simulated-room_emovo-1.2.1-emotion.test42.png
../../../_images/visualization-simulated-room_emovo-1.2.1-emotion.test43.png

imda-nsc-read-speech-balanced-2.6.0-headset
../../../_images/visualization-simulated-room_imda-nsc-read-speech-balanced-2.6.0-headset37.png
../../../_images/visualization-simulated-room_imda-nsc-read-speech-balanced-2.6.0-headset41.png
../../../_images/visualization-simulated-room_imda-nsc-read-speech-balanced-2.6.0-headset42.png
../../../_images/visualization-simulated-room_imda-nsc-read-speech-balanced-2.6.0-headset43.png

timit-1.4.1-files
../../../_images/visualization-simulated-room_timit-1.4.1-files37.png
../../../_images/visualization-simulated-room_timit-1.4.1-files41.png
../../../_images/visualization-simulated-room_timit-1.4.1-files42.png
../../../_images/visualization-simulated-room_timit-1.4.1-files43.png