Fairness linguistic sentiment

Overall scores

| | w2v2-L | hubert-L | wavlm | data2vec |
|---|---|---|---|---|
| Overall Score | 97.2% passed tests (70 passed / 2 failed). | 91.7% passed tests (66 passed / 6 failed). | 84.7% passed tests (61 passed / 11 failed). | 87.5% passed tests (63 passed / 9 failed). |
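Each overall score is the share of passed tests among all tests run for a model. As a minimal sketch (the function name `overall_score` is hypothetical and not part of the test suite), the percentages can be reproduced from the pass/fail counts listed above:

```python
def overall_score(passed: int, failed: int) -> str:
    """Format a pass rate in the same style as the overall scores."""
    total = passed + failed
    percent = 100 * passed / total
    return f"{percent:.1f}% passed tests ({passed} passed / {failed} failed)."


# Pass/fail counts per model, copied from the overall scores.
counts = {
    "w2v2-L": (70, 2),
    "hubert-L": (66, 6),
    "wavlm": (61, 11),
    "data2vec": (63, 9),
}

summary = {model: overall_score(p, f) for model, (p, f) in counts.items()}
for model, line in summary.items():
    print(f"{model}: {line}")
```

Every model is evaluated on the same 72 tests, so the percentages are directly comparable.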

Bin Proportion Shift Difference Negative Sentiment

The test score is the shift in bin proportions for negative sentiment for a specific language minus the average of that shift over all languages. The full expression leading to the test score is displayed in parentheses. Bins with too few samples are skipped.

Threshold: 0.075
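To make the score concrete, the following sketch recomputes one column. It assumes that the score is the per-language shift minus the plain average over the eight languages, and that a test passes when the absolute score does not exceed the threshold. The pass rule is an assumption, although it agrees with the failure counts in the overall scores. The helper names are hypothetical, and the inputs are the rounded shifts shown in parentheses for hubert-L in the first value column of the data below, so results match the report only up to rounding.

```python
THRESHOLD = 0.075  # threshold stated for the bin proportion tests

# Rounded per-language shifts for hubert-L (first value column of the data).
shifts = {
    "de": 0.00, "en": 0.20, "es": 0.01, "fr": -0.01,
    "it": -0.00, "ja": 0.00, "pt": -0.02, "zh": -0.00,
}


def shift_difference(shift_per_language):
    """Subtract the cross-language average from every language's shift."""
    average = sum(shift_per_language.values()) / len(shift_per_language)
    return {lang: s - average for lang, s in shift_per_language.items()}


def passes(score, threshold):
    """Assumed pass rule: absolute score within the threshold."""
    return abs(score) <= threshold


scores = shift_difference(shifts)
failed = [lang for lang, score in scores.items() if not passes(score, THRESHOLD)]

print({lang: round(score, 2) for lang, score in scores.items()})
print("failed:", failed)
```

With these inputs only English exceeds the threshold, in line with the 0.18 reported for hubert-L.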

Bins: (-inf, 0.25], (0.25, 0.5], (0.5, 0.75], (0.75, inf]. Columns follow bin order, with w2v2-L, hubert-L, wavlm, data2vec inside each bin; only bins with values appear.

| Data | w2v2-L | hubert-L | wavlm | data2vec | w2v2-L | hubert-L | wavlm | data2vec |
|---|---|---|---|---|---|---|---|---|
| checklist-synth-1.0.0-words-in-context-de | -0.00 (-0.00 - 0.00) | -0.02 (0.00 - 0.02) | -0.03 (0.01 - 0.04) | -0.01 (0.01 - 0.01) | 0.00 (0.00 - -0.00) | 0.02 (-0.00 - -0.02) | 0.04 (0.00 - -0.04) | 0.00 (-0.01 - -0.01) |
| checklist-synth-1.0.0-words-in-context-en | -0.01 (-0.01 - 0.00) | 0.18 (0.20 - 0.02) | 0.19 (0.22 - 0.04) | 0.10 (0.11 - 0.01) | 0.01 (0.01 - -0.00) | -0.18 (-0.20 - -0.02) | -0.19 (-0.22 - -0.04) | -0.10 (-0.11 - -0.01) |
| checklist-synth-1.0.0-words-in-context-es | -0.01 (-0.00 - 0.00) | -0.02 (0.01 - 0.02) | -0.04 (0.00 - 0.04) | -0.01 (0.01 - 0.01) | 0.01 (0.00 - -0.00) | 0.02 (-0.01 - -0.02) | 0.04 (-0.00 - -0.04) | 0.01 (-0.01 - -0.01) |
| checklist-synth-1.0.0-words-in-context-fr | 0.03 (0.03 - 0.00) | -0.03 (-0.01 - 0.02) | -0.00 (0.04 - 0.04) | -0.00 (0.01 - 0.01) | -0.03 (-0.03 - -0.00) | 0.03 (0.01 - -0.02) | -0.00 (-0.04 - -0.04) | 0.00 (-0.01 - -0.01) |
| checklist-synth-1.0.0-words-in-context-it | 0.00 (0.00 - 0.00) | -0.02 (-0.00 - 0.02) | -0.05 (-0.01 - 0.04) | -0.01 (-0.00 - 0.01) | -0.00 (-0.00 - -0.00) | 0.02 (0.00 - -0.02) | 0.05 (0.01 - -0.04) | 0.01 (0.00 - -0.01) |
| checklist-synth-1.0.0-words-in-context-ja | -0.00 (0.00 - 0.00) | -0.02 (0.00 - 0.02) | -0.01 (0.03 - 0.04) | -0.01 (0.00 - 0.01) | 0.00 (-0.00 - -0.00) | 0.02 (-0.00 - -0.02) | 0.01 (-0.03 - -0.04) | 0.02 (0.00 - -0.01) |
| checklist-synth-1.0.0-words-in-context-pt | -0.01 (-0.01 - 0.00) | -0.04 (-0.02 - 0.02) | -0.02 (0.02 - 0.04) | -0.05 (-0.03 - 0.01) | 0.01 (0.01 - -0.00) | 0.04 (0.02 - -0.02) | 0.02 (-0.02 - -0.04) | 0.05 (0.03 - -0.01) |
| checklist-synth-1.0.0-words-in-context-zh | -0.00 (0.00 - 0.00) | -0.02 (-0.00 - 0.02) | -0.04 (-0.00 - 0.04) | -0.01 (0.00 - 0.01) | 0.00 (-0.00 - -0.00) | 0.02 (0.00 - -0.02) | 0.04 (-0.00 - -0.04) | 0.01 (0.00 - -0.01) |
| mean | -0.00 | 0.00 | 0.00 | 0.00 | 0.00 | -0.00 | 0.00 | -0.00 |

Bin Proportion Shift Difference Neutral Sentiment

The test score is the shift in bin proportions for neutral sentiment for a specific language minus the average of that shift over all languages. The full expression leading to the test score is displayed in parentheses. Bins with too few samples are skipped.

Threshold: 0.075

Bins: (-inf, 0.25], (0.25, 0.5], (0.5, 0.75], (0.75, inf]. Columns follow bin order, with w2v2-L, hubert-L, wavlm, data2vec inside each bin; only bins with values appear.

| Data | w2v2-L | hubert-L | wavlm | data2vec | w2v2-L | hubert-L | wavlm | data2vec |
|---|---|---|---|---|---|---|---|---|
| checklist-synth-1.0.0-words-in-context-de | 0.01 (0.00 - -0.01) | 0.01 (0.00 - -0.00) | 0.01 (0.01 - -0.00) | -0.01 (0.02 - 0.03) | -0.01 (-0.00 - 0.01) | -0.01 (-0.01 - 0.00) | -0.01 (0.00 - 0.01) | 0.01 (-0.02 - -0.03) |
| checklist-synth-1.0.0-words-in-context-en | 0.04 (0.03 - -0.01) | 0.01 (0.00 - -0.00) | 0.15 (0.14 - -0.00) | 0.12 (0.15 - 0.03) | -0.04 (-0.03 - 0.01) | -0.01 (-0.00 - 0.00) | -0.15 (-0.14 - 0.01) | -0.12 (-0.15 - -0.03) |
| checklist-synth-1.0.0-words-in-context-es | 0.03 (0.02 - -0.01) | -0.05 (-0.06 - -0.00) | -0.11 (-0.11 - -0.00) | -0.01 (0.02 - 0.03) | -0.03 (-0.02 - 0.01) | 0.05 (0.06 - 0.00) | 0.11 (0.11 - 0.01) | 0.01 (-0.02 - -0.03) |
| checklist-synth-1.0.0-words-in-context-fr | -0.03 (-0.04 - -0.01) | -0.01 (-0.01 - -0.00) | -0.02 (-0.02 - -0.00) | 0.04 (0.07 - 0.03) | 0.03 (0.04 - 0.01) | 0.01 (0.01 - 0.00) | 0.02 (0.02 - 0.01) | -0.04 (-0.07 - -0.03) |
| checklist-synth-1.0.0-words-in-context-it | 0.00 (-0.01 - -0.01) | 0.02 (0.01 - -0.00) | -0.03 (-0.04 - -0.00) | -0.03 (0.00 - 0.03) | -0.00 (0.01 - 0.01) | -0.01 (-0.01 - 0.00) | 0.03 (0.04 - 0.01) | 0.03 (-0.00 - -0.03) |
| checklist-synth-1.0.0-words-in-context-ja | 0.01 (-0.00 - -0.01) | 0.01 (0.00 - -0.00) | -0.01 (-0.02 - -0.00) | -0.03 (0.00 - 0.03) | -0.01 (0.00 - 0.01) | -0.00 (-0.00 - 0.00) | 0.01 (0.02 - 0.01) | 0.03 (-0.00 - -0.03) |
| checklist-synth-1.0.0-words-in-context-pt | -0.08 (-0.09 - -0.01) | 0.02 (0.01 - -0.00) | 0.00 (-0.00 - -0.00) | -0.05 (-0.02 - 0.03) | 0.08 (0.09 - 0.01) | -0.02 (-0.01 - 0.00) | -0.00 (0.00 - 0.01) | 0.05 (0.02 - -0.03) |
| checklist-synth-1.0.0-words-in-context-zh | 0.01 (-0.00 - -0.01) | 0.01 (0.00 - -0.00) | 0.00 (-0.00 - -0.00) | -0.03 (-0.00 - 0.03) | -0.01 (0.00 - 0.01) | -0.01 (-0.00 - 0.00) | -0.00 (0.00 - 0.01) | 0.04 (0.01 - -0.03) |
| mean | -0.00 | 0.00 | -0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 |

Bin Proportion Shift Difference Positive Sentiment

The test score is the shift in bin proportions for positive sentiment for a specific language minus the average of that shift over all languages. The full expression leading to the test score is displayed in parentheses. Bins with too few samples are skipped.

Threshold: 0.075

Bins: (-inf, 0.25], (0.25, 0.5], (0.5, 0.75], (0.75, inf]. Columns follow bin order, with w2v2-L, hubert-L, wavlm, data2vec inside each bin; only bins with values appear.

| Data | w2v2-L | hubert-L | wavlm | data2vec | w2v2-L | hubert-L | wavlm | data2vec |
|---|---|---|---|---|---|---|---|---|
| checklist-synth-1.0.0-words-in-context-de | -0.00 (0.00 - 0.00) | 0.02 (-0.00 - -0.02) | 0.02 (-0.01 - -0.04) | 0.01 (-0.02 - -0.03) | 0.00 (-0.00 - -0.00) | -0.02 (0.00 - 0.02) | -0.04 (0.00 - 0.04) | -0.01 (0.02 - 0.03) |
| checklist-synth-1.0.0-words-in-context-en | -0.00 (-0.00 - 0.00) | -0.19 (-0.21 - -0.02) | -0.26 (-0.29 - -0.04) | -0.15 (-0.18 - -0.03) | 0.00 (0.00 - -0.00) | 0.19 (0.21 - 0.02) | 0.26 (0.29 - 0.04) | 0.16 (0.18 - 0.03) |
| checklist-synth-1.0.0-words-in-context-es | -0.01 (-0.00 - 0.00) | 0.04 (0.01 - -0.02) | 0.08 (0.04 - -0.04) | 0.01 (-0.02 - -0.03) | 0.01 (0.00 - -0.00) | -0.04 (-0.01 - 0.02) | -0.08 (-0.04 - 0.04) | -0.01 (0.02 - 0.03) |
| checklist-synth-1.0.0-words-in-context-fr | -0.02 (-0.02 - 0.00) | 0.03 (0.01 - -0.02) | 0.01 (-0.03 - -0.04) | -0.01 (-0.04 - -0.03) | 0.02 (0.02 - -0.00) | -0.03 (-0.01 - 0.02) | -0.00 (0.03 - 0.04) | 0.01 (0.04 - 0.03) |
| checklist-synth-1.0.0-words-in-context-it | -0.00 (0.00 - 0.00) | 0.02 (-0.00 - -0.02) | 0.07 (0.03 - -0.04) | 0.03 (-0.00 - -0.03) | 0.00 (-0.00 - -0.00) | -0.02 (0.00 - 0.02) | -0.07 (-0.03 - 0.04) | -0.02 (0.00 - 0.03) |
| checklist-synth-1.0.0-words-in-context-ja | -0.00 (-0.00 - 0.00) | 0.02 (-0.00 - -0.02) | 0.01 (-0.02 - -0.04) | 0.03 (0.00 - -0.03) | 0.00 (0.00 - -0.00) | -0.02 (0.00 - 0.02) | -0.01 (0.02 - 0.04) | -0.03 (-0.00 - 0.03) |
| checklist-synth-1.0.0-words-in-context-pt | 0.04 (0.04 - 0.00) | 0.04 (0.01 - -0.02) | 0.02 (-0.02 - -0.04) | 0.07 (0.04 - -0.03) | -0.04 (-0.04 - -0.00) | -0.04 (-0.01 - 0.02) | -0.02 (0.02 - 0.04) | -0.07 (-0.04 - 0.03) |
| checklist-synth-1.0.0-words-in-context-zh | -0.00 (0.00 - 0.00) | 0.02 (-0.00 - -0.02) | 0.04 (0.00 - -0.04) | 0.03 (-0.00 - -0.03) | 0.00 (-0.00 - -0.00) | -0.02 (0.00 - 0.02) | -0.04 (0.00 - 0.04) | -0.03 (-0.01 - 0.03) |
| mean | 0.00 | 0.00 | -0.00 | 0.00 | -0.00 | 0.00 | 0.00 | -0.00 |

Mean Shift Difference Negative Sentiment

The test score is the mean shift for negative sentiment for a specific language minus the average of the mean shift for negative sentiment over all languages. The full expression leading to the test score is displayed in parentheses.

Threshold: 0.025
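Because every score is a deviation from the cross-language average, the scores of one model sum to zero across languages, which is why the mean row reads 0.00. The sketch below illustrates this with the rounded hubert-L mean shifts for negative sentiment, taken from the parentheses in the data that follows. Variable names are illustrative, and the pass rule (absolute score within the threshold) is an assumption that agrees with the reported failure counts.

```python
THRESHOLD = 0.025  # threshold stated for the mean shift tests

languages = ["de", "en", "es", "fr", "it", "ja", "pt", "zh"]
# Rounded hubert-L mean shifts for negative sentiment, one per language.
mean_shifts = [0.00, -0.03, -0.00, -0.00, 0.00, -0.00, 0.00, -0.00]

# Average over all languages, then the deviation of each language from it.
average = sum(mean_shifts) / len(mean_shifts)
scores = {lang: shift - average for lang, shift in zip(languages, mean_shifts)}

# Deviations from an average cancel out, so their mean is zero
# (up to floating-point error).
mean_of_scores = sum(scores.values()) / len(scores)

# Assumed pass rule: absolute score within the threshold.
failing = [lang for lang, score in scores.items() if abs(score) > THRESHOLD]

print(f"average shift: {average:.5f}")
print(f"mean of scores: {mean_of_scores:.5f}")
print("failing languages:", failing)
```

A single outlier therefore also moves the scores of all other languages slightly in the opposite direction, since the average it is compared against includes the outlier itself.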

| Data | w2v2-L | hubert-L | wavlm | data2vec |
|---|---|---|---|---|
| checklist-synth-1.0.0-words-in-context-de | 0.00 (0.00 - -0.00) | 0.01 (0.00 - -0.00) | 0.01 (0.00 - -0.00) | 0.01 (0.00 - -0.01) |
| checklist-synth-1.0.0-words-in-context-en | 0.00 (0.00 - -0.00) | -0.03 (-0.03 - -0.00) | -0.02 (-0.03 - -0.00) | -0.03 (-0.03 - -0.01) |
| checklist-synth-1.0.0-words-in-context-es | 0.00 (0.00 - -0.00) | 0.00 (-0.00 - -0.00) | 0.00 (0.00 - -0.00) | 0.01 (0.00 - -0.01) |
| checklist-synth-1.0.0-words-in-context-fr | -0.00 (-0.00 - -0.00) | 0.00 (-0.00 - -0.00) | 0.00 (-0.00 - -0.00) | 0.00 (-0.00 - -0.01) |
| checklist-synth-1.0.0-words-in-context-it | 0.00 (0.00 - -0.00) | 0.01 (0.00 - -0.00) | 0.01 (0.00 - -0.00) | 0.00 (-0.00 - -0.01) |
| checklist-synth-1.0.0-words-in-context-ja | -0.00 (-0.00 - -0.00) | 0.00 (-0.00 - -0.00) | 0.00 (-0.00 - -0.00) | -0.00 (-0.01 - -0.01) |
| checklist-synth-1.0.0-words-in-context-pt | 0.00 (0.00 - -0.00) | 0.01 (0.00 - -0.00) | 0.00 (-0.00 - -0.00) | 0.01 (0.00 - -0.01) |
| checklist-synth-1.0.0-words-in-context-zh | -0.00 (-0.00 - -0.00) | 0.00 (-0.00 - -0.00) | 0.00 (-0.00 - -0.00) | 0.00 (-0.00 - -0.01) |
| mean | 0.00 | 0.00 | 0.00 | 0.00 |

Mean Shift Difference Neutral Sentiment

The test score is the mean shift for neutral sentiment for a specific language minus the average of the mean shift for neutral sentiment over all languages. The full expression leading to the test score is displayed in parentheses.

Threshold: 0.025

| Data | w2v2-L | hubert-L | wavlm | data2vec |
|---|---|---|---|---|
| checklist-synth-1.0.0-words-in-context-de | -0.01 (-0.00 - 0.00) | -0.00 (-0.00 - -0.00) | 0.00 (0.00 - -0.00) | 0.00 (-0.00 - -0.00) |
| checklist-synth-1.0.0-words-in-context-en | -0.00 (-0.00 - 0.00) | -0.01 (-0.01 - -0.00) | -0.02 (-0.02 - -0.00) | -0.04 (-0.05 - -0.00) |
| checklist-synth-1.0.0-words-in-context-es | -0.01 (-0.00 - 0.00) | 0.01 (0.01 - -0.00) | 0.01 (0.01 - -0.00) | 0.00 (-0.00 - -0.00) |
| checklist-synth-1.0.0-words-in-context-fr | 0.00 (0.01 - 0.00) | -0.00 (-0.00 - -0.00) | 0.00 (0.00 - -0.00) | 0.00 (-0.00 - -0.00) |
| checklist-synth-1.0.0-words-in-context-it | -0.00 (-0.00 - 0.00) | 0.00 (-0.00 - -0.00) | 0.00 (0.00 - -0.00) | 0.00 (-0.00 - -0.00) |
| checklist-synth-1.0.0-words-in-context-ja | 0.01 (0.01 - 0.00) | 0.01 (0.00 - -0.00) | -0.00 (-0.00 - -0.00) | 0.02 (0.01 - -0.00) |
| checklist-synth-1.0.0-words-in-context-pt | 0.00 (0.01 - 0.00) | -0.00 (-0.00 - -0.00) | 0.00 (0.00 - -0.00) | 0.00 (-0.00 - -0.00) |
| checklist-synth-1.0.0-words-in-context-zh | 0.00 (0.01 - 0.00) | -0.00 (-0.00 - -0.00) | 0.01 (0.01 - -0.00) | 0.01 (0.00 - -0.00) |
| mean | -0.00 | 0.00 | 0.00 | -0.00 |

Mean Shift Difference Positive Sentiment

The test score is the mean shift for positive sentiment for a specific language minus the average of the mean shift for positive sentiment over all languages. The full expression leading to the test score is displayed in parentheses.

Threshold: 0.025

| Data | w2v2-L | hubert-L | wavlm | data2vec |
|---|---|---|---|---|
| checklist-synth-1.0.0-words-in-context-de | -0.00 (-0.00 - -0.00) | -0.00 (-0.00 - 0.00) | -0.01 (-0.00 - 0.00) | -0.01 (-0.00 - 0.01) |
| checklist-synth-1.0.0-words-in-context-en | 0.00 (0.00 - -0.00) | 0.03 (0.04 - 0.00) | 0.03 (0.03 - 0.00) | 0.04 (0.05 - 0.01) |
| checklist-synth-1.0.0-words-in-context-es | -0.00 (-0.00 - -0.00) | -0.01 (-0.00 - 0.00) | -0.01 (-0.00 - 0.00) | -0.01 (-0.00 - 0.01) |
| checklist-synth-1.0.0-words-in-context-fr | 0.00 (0.00 - -0.00) | -0.00 (0.00 - 0.00) | -0.00 (0.00 - 0.00) | -0.00 (0.00 - 0.01) |
| checklist-synth-1.0.0-words-in-context-it | -0.00 (-0.00 - -0.00) | -0.01 (-0.00 - 0.00) | -0.01 (-0.00 - 0.00) | -0.00 (0.00 - 0.01) |
| checklist-synth-1.0.0-words-in-context-ja | 0.00 (-0.00 - -0.00) | -0.00 (0.00 - 0.00) | 0.00 (0.01 - 0.00) | -0.01 (0.00 - 0.01) |
| checklist-synth-1.0.0-words-in-context-pt | -0.00 (-0.00 - -0.00) | -0.01 (-0.00 - 0.00) | -0.00 (0.00 - 0.00) | -0.01 (-0.00 - 0.01) |
| checklist-synth-1.0.0-words-in-context-zh | 0.00 (0.00 - -0.00) | -0.00 (0.00 - 0.00) | -0.00 (-0.00 - 0.00) | -0.00 (0.00 - 0.01) |
| mean | 0.00 | -0.00 | -0.00 | -0.00 |

Visualization

| w2v2-L | hubert-L | wavlm | data2vec |
|---|---|---|---|
| ../../../_images/visualization_checklist-synth-1.0.0-words-in-context-de15.png | ../../../_images/visualization_checklist-synth-1.0.0-words-in-context-de19.png | ../../../_images/visualization_checklist-synth-1.0.0-words-in-context-de20.png | ../../../_images/visualization_checklist-synth-1.0.0-words-in-context-de21.png |
| ../../../_images/visualization_checklist-synth-1.0.0-words-in-context-en15.png | ../../../_images/visualization_checklist-synth-1.0.0-words-in-context-en19.png | ../../../_images/visualization_checklist-synth-1.0.0-words-in-context-en20.png | ../../../_images/visualization_checklist-synth-1.0.0-words-in-context-en21.png |
| ../../../_images/visualization_checklist-synth-1.0.0-words-in-context-es15.png | ../../../_images/visualization_checklist-synth-1.0.0-words-in-context-es19.png | ../../../_images/visualization_checklist-synth-1.0.0-words-in-context-es20.png | ../../../_images/visualization_checklist-synth-1.0.0-words-in-context-es21.png |
| ../../../_images/visualization_checklist-synth-1.0.0-words-in-context-fr15.png | ../../../_images/visualization_checklist-synth-1.0.0-words-in-context-fr19.png | ../../../_images/visualization_checklist-synth-1.0.0-words-in-context-fr20.png | ../../../_images/visualization_checklist-synth-1.0.0-words-in-context-fr21.png |
| ../../../_images/visualization_checklist-synth-1.0.0-words-in-context-it15.png | ../../../_images/visualization_checklist-synth-1.0.0-words-in-context-it19.png | ../../../_images/visualization_checklist-synth-1.0.0-words-in-context-it20.png | ../../../_images/visualization_checklist-synth-1.0.0-words-in-context-it21.png |
| ../../../_images/visualization_checklist-synth-1.0.0-words-in-context-ja15.png | ../../../_images/visualization_checklist-synth-1.0.0-words-in-context-ja19.png | ../../../_images/visualization_checklist-synth-1.0.0-words-in-context-ja20.png | ../../../_images/visualization_checklist-synth-1.0.0-words-in-context-ja21.png |
| ../../../_images/visualization_checklist-synth-1.0.0-words-in-context-pt15.png | ../../../_images/visualization_checklist-synth-1.0.0-words-in-context-pt19.png | ../../../_images/visualization_checklist-synth-1.0.0-words-in-context-pt20.png | ../../../_images/visualization_checklist-synth-1.0.0-words-in-context-pt21.png |
| ../../../_images/visualization_checklist-synth-1.0.0-words-in-context-zh15.png | ../../../_images/visualization_checklist-synth-1.0.0-words-in-context-zh19.png | ../../../_images/visualization_checklist-synth-1.0.0-words-in-context-zh20.png | ../../../_images/visualization_checklist-synth-1.0.0-words-in-context-zh21.png |