Fairness linguistic sentiment

Overall scores

| | CNN14 | w2v2-b | hubert-b | axlstm |
|---|---|---|---|---|
| Overall Score | 99.0% passed tests (95 passed / 1 failed). | 100.0% passed tests (72 passed / 0 failed). | 94.8% passed tests (91 passed / 5 failed). | 97.9% passed tests (94 passed / 2 failed). |
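The percentages in the overall scores follow directly from the passed and failed counts. A minimal sketch of that arithmetic, assuming the percentage is rounded to one decimal place (the function name `overall_score` is hypothetical and not part of the report tooling):

```python
def overall_score(passed: int, failed: int) -> str:
    """Format a pass rate in the same style as the overall scores."""
    total = passed + failed
    percentage = 100 * passed / total
    return f"{percentage:.1f}% passed tests ({passed} passed / {failed} failed)."


# Counts as reported for the four models.
for model, (passed, failed) in {
    "CNN14": (95, 1),
    "w2v2-b": (72, 0),
    "hubert-b": (91, 5),
    "axlstm": (94, 2),
}.items():
    print(model, overall_score(passed, failed))
```

Note that w2v2-b is scored on 72 tests rather than 96, so its percentage is not computed over the same number of tests as the other models.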

Bin Proportion Shift Difference Negative Sentiment

The test score is the shift in bin proportions for negative sentiment for a specific language minus the average of the shift in bin proportions for negative sentiment over all languages. The full expression leading to the test score is displayed in parentheses. Bins with too few samples are skipped.

Threshold: 0.075

Data

(-inf, 0.25]

(0.25, 0.5]

(0.5, 0.75]

(0.75, inf]

CNN14

w2v2-b

hubert-b

axlstm

CNN14

w2v2-b

hubert-b

axlstm

CNN14

w2v2-b

hubert-b

axlstm

CNN14

w2v2-b

hubert-b

axlstm

checklist-synth-1.0.0-words-in-context-de

-0.06 (-0.07 - -0.01)

-0.01 (-0.02 - -0.01)

-0.01 (-0.00 - 0.00)

0.06 (0.07 - 0.01)

-0.02 (-0.01 - 0.00)

0.00 (0.02 - 0.02)

0.01 (0.00 - -0.01)

0.00 (0.00 - -0.00)

0.02 (0.01 - -0.01)

0.01 (0.00 - -0.01)

-0.00 (0.00 - 0.00)

checklist-synth-1.0.0-words-in-context-en

0.01 (0.00 - -0.01)

0.01 (0.00 - -0.01)

-0.01 (-0.00 - 0.00)

-0.01 (0.00 - 0.01)

-0.00 (-0.00 - 0.00)

-0.02 (-0.00 - 0.02)

0.01 (0.00 - -0.01)

-0.00 (-0.00 - -0.00)

0.02 (0.01 - -0.01)

0.01 (0.00 - -0.01)

-0.01 (-0.01 - 0.00)

checklist-synth-1.0.0-words-in-context-es

-0.01 (-0.02 - -0.01)

-0.03 (-0.04 - -0.01)

-0.01 (-0.00 - 0.00)

0.00 (0.02 - 0.01)

-0.00 (-0.00 - 0.00)

0.02 (0.04 - 0.02)

0.01 (0.00 - -0.01)

0.00 (0.00 - -0.00)

0.01 (0.00 - -0.01)

0.01 (-0.00 - -0.01)

-0.00 (0.00 - 0.00)

checklist-synth-1.0.0-words-in-context-fr

0.01 (0.00 - -0.01)

0.00 (-0.01 - -0.01)

0.02 (0.03 - 0.00)

-0.01 (0.00 - 0.01)

-0.00 (-0.00 - 0.00)

-0.01 (0.01 - 0.02)

-0.02 (-0.03 - -0.01)

-0.01 (-0.01 - -0.00)

-0.01 (-0.02 - -0.01)

0.01 (0.00 - -0.01)

0.01 (0.02 - 0.00)

checklist-synth-1.0.0-words-in-context-it

0.01 (0.00 - -0.01)

0.01 (0.00 - -0.01)

-0.01 (-0.00 - 0.00)

-0.02 (-0.01 - 0.01)

0.01 (0.01 - 0.00)

-0.02 (0.00 - 0.02)

0.00 (-0.00 - -0.01)

0.01 (0.01 - -0.00)

-0.01 (-0.02 - -0.01)

0.01 (-0.00 - -0.01)

-0.00 (0.00 - 0.00)

checklist-synth-1.0.0-words-in-context-ja

0.01 (0.00 - -0.01)

0.01 (0.00 - -0.01)

0.00 (0.01 - 0.00)

-0.01 (0.00 - 0.01)

0.02 (0.02 - 0.00)

-0.01 (0.00 - 0.02)

-0.00 (-0.01 - -0.01)

-0.00 (-0.00 - -0.00)

-0.01 (-0.02 - -0.01)

0.00 (-0.00 - -0.01)

-0.00 (0.00 - 0.00)

checklist-synth-1.0.0-words-in-context-pt

0.01 (-0.00 - -0.01)

-0.01 (-0.02 - -0.01)

0.00 (0.01 - 0.00)

-0.02 (-0.01 - 0.01)

-0.01 (-0.00 - 0.00)

0.00 (0.02 - 0.02)

-0.00 (-0.01 - -0.01)

0.01 (0.01 - -0.00)

-0.02 (-0.02 - -0.01)

0.01 (0.00 - -0.01)

0.02 (0.03 - 0.00)

checklist-synth-1.0.0-words-in-context-zh

0.01 (0.00 - -0.01)

0.01 (0.00 - -0.01)

-0.00 (0.00 - 0.00)

0.00 (0.01 - 0.01)

0.01 (0.01 - 0.00)

0.03 (0.05 - 0.02)

0.00 (-0.00 - -0.01)

-0.01 (-0.01 - -0.00)

0.00 (-0.01 - -0.01)

-0.04 (-0.05 - -0.01)

-0.01 (-0.00 - 0.00)

mean

-0.00

-0.00

-0.00

-0.00

0.00

-0.00

0.00

0.00

0.00

0.00

0.00
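The bin proportion test can be sketched in a few lines. This section does not say how the shift itself is measured, so the sketch assumes it is the change in the share of predictions per bin between a hypothetical reference set and a sentiment set. The cut-off `MIN_SAMPLES`, the comparison of the absolute score against the threshold, and all function names and numbers are likewise assumptions made for illustration.

```python
from bisect import bisect_left

EDGES = [0.25, 0.5, 0.75]  # right-closed edges of the four bins in the table header
THRESHOLD = 0.075          # threshold stated for this test
MIN_SAMPLES = 2            # hypothetical cut-off for "too few samples"


def bin_counts(predictions):
    """Count predictions in (-inf, 0.25], (0.25, 0.5], (0.5, 0.75], (0.75, inf]."""
    counts = [0] * (len(EDGES) + 1)
    for value in predictions:
        counts[bisect_left(EDGES, value)] += 1
    return counts


def proportion_shift(reference, sentiment):
    """Per-bin change in proportion; None marks a bin that is skipped."""
    shifts = []
    for ref, sent in zip(bin_counts(reference), bin_counts(sentiment)):
        if ref + sent < MIN_SAMPLES:
            shifts.append(None)
        else:
            shifts.append(sent / len(sentiment) - ref / len(reference))
    return shifts


def shift_difference(shift_per_language):
    """Language shift minus the average shift over all languages, for one bin."""
    average = sum(shift_per_language.values()) / len(shift_per_language)
    return {lang: shift - average for lang, shift in shift_per_language.items()}


# Hypothetical predictions for one language, not taken from the report.
reference = [0.1, 0.2, 0.3, 0.4, 0.6, 0.7]
sentiment = [0.1, 0.3, 0.4, 0.45, 0.6, 0.7]
shifts = proportion_shift(reference, sentiment)

# Hypothetical first-bin shifts for three languages.
scores = shift_difference({"de": -0.07, "en": 0.00, "es": 0.04})
passed = {lang: abs(score) <= THRESHOLD for lang, score in scores.items()}
print(shifts, scores, passed)
```

With these made-up numbers the across-language average is -0.01, so the score for `de` is -0.07 - (-0.01) = -0.06, which has the same form as the expressions shown in parentheses in the table.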

Bin Proportion Shift Difference Neutral Sentiment

The test score is the shift in bin proportions for neutral sentiment for a specific language minus the average of the shift in bin proportions for neutral sentiment over all languages. The full expression leading to the test score is displayed in parentheses. Bins with too few samples are skipped.

Threshold: 0.075

Data

(-inf, 0.25]

(0.25, 0.5]

(0.5, 0.75]

(0.75, inf]

CNN14

w2v2-b

hubert-b

axlstm

CNN14

w2v2-b

hubert-b

axlstm

CNN14

w2v2-b

hubert-b

axlstm

CNN14

w2v2-b

hubert-b

axlstm

checklist-synth-1.0.0-words-in-context-de

0.08 (0.09 - 0.01)

0.06 (0.08 - 0.02)

0.01 (0.00 - -0.01)

-0.09 (-0.09 - 0.00)

0.01 (-0.01 - -0.02)

-0.05 (-0.08 - -0.03)

-0.01 (-0.00 - 0.01)

0.01 (0.00 - -0.01)

-0.01 (0.00 - 0.02)

-0.02 (0.00 - 0.02)

0.00 (0.00 - 0.00)

checklist-synth-1.0.0-words-in-context-en

-0.01 (-0.00 - 0.01)

0.03 (0.05 - 0.02)

0.02 (0.01 - -0.01)

0.10 (0.10 - 0.00)

0.02 (-0.00 - -0.02)

-0.02 (-0.05 - -0.03)

-0.02 (-0.01 - 0.01)

-0.09 (-0.10 - -0.01)

-0.01 (0.00 - 0.02)

-0.02 (0.00 - 0.02)

-0.00 (-0.00 - 0.00)

checklist-synth-1.0.0-words-in-context-es

-0.02 (-0.01 - 0.01)

-0.04 (-0.02 - 0.02)

0.02 (0.01 - -0.01)

0.01 (0.01 - 0.00)

0.01 (-0.01 - -0.02)

0.06 (0.02 - -0.03)

-0.01 (0.00 - 0.01)

0.01 (0.00 - -0.01)

-0.02 (-0.01 - 0.02)

-0.02 (-0.00 - 0.02)

0.02 (0.02 - 0.00)

checklist-synth-1.0.0-words-in-context-fr

-0.01 (0.00 - 0.01)

-0.03 (-0.02 - 0.02)

-0.01 (-0.02 - -0.01)

0.00 (0.01 - 0.00)

0.02 (0.00 - -0.02)

0.05 (0.02 - -0.03)

0.01 (0.02 - 0.01)

0.01 (-0.01 - -0.01)

-0.00 (0.02 - 0.02)

-0.02 (0.00 - 0.02)

-0.02 (-0.02 - 0.00)

checklist-synth-1.0.0-words-in-context-it

-0.01 (0.00 - 0.01)

-0.02 (-0.00 - 0.02)

0.01 (-0.00 - -0.01)

-0.01 (-0.01 - 0.00)

-0.02 (-0.04 - -0.02)

0.03 (-0.00 - -0.03)

-0.01 (0.00 - 0.01)

0.02 (0.01 - -0.01)

0.01 (0.02 - 0.02)

-0.01 (0.00 - 0.02)

0.02 (0.02 - 0.00)

checklist-synth-1.0.0-words-in-context-ja

-0.01 (0.00 - 0.01)

-0.01 (0.00 - 0.02)

0.00 (-0.01 - -0.01)

-0.01 (-0.00 - 0.00)

-0.03 (-0.04 - -0.02)

0.00 (-0.03 - -0.03)

-0.01 (0.01 - 0.01)

0.01 (0.00 - -0.01)

0.00 (0.02 - 0.02)

0.01 (0.03 - 0.02)

0.02 (0.02 - 0.00)

checklist-synth-1.0.0-words-in-context-pt

-0.01 (-0.00 - 0.01)

0.03 (0.05 - 0.02)

-0.07 (-0.08 - -0.01)

0.02 (0.02 - 0.00)

0.03 (0.01 - -0.02)

-0.01 (-0.05 - -0.03)

0.06 (0.08 - 0.01)

-0.01 (-0.02 - -0.01)

0.07 (0.08 - 0.02)

-0.02 (0.00 - 0.02)

-0.09 (-0.09 - 0.00)

checklist-synth-1.0.0-words-in-context-zh

-0.01 (0.00 - 0.01)

-0.02 (-0.00 - 0.02)

0.01 (-0.01 - -0.01)

-0.02 (-0.02 - 0.00)

-0.02 (-0.04 - -0.02)

-0.07 (-0.10 - -0.03)

-0.01 (0.01 - 0.01)

0.03 (0.02 - -0.01)

-0.03 (-0.02 - 0.02)

0.08 (0.10 - 0.02)

0.06 (0.06 - 0.00)

mean

0.00

-0.00

-0.00

0.00

0.00

-0.00

-0.00

-0.00

0.00

-0.00

0.00

Bin Proportion Shift Difference Positive Sentiment

The test score is the shift in bin proportions for positive sentiment for a specific language minus the average of the shift in bin proportions for positive sentiment over all languages. The full expression leading to the test score is displayed in parentheses. Bins with too few samples are skipped.

Threshold: 0.075

Data

(-inf, 0.25]

(0.25, 0.5]

(0.5, 0.75]

(0.75, inf]

CNN14

w2v2-b

hubert-b

axlstm

CNN14

w2v2-b

hubert-b

axlstm

CNN14

w2v2-b

hubert-b

axlstm

CNN14

w2v2-b

hubert-b

axlstm

checklist-synth-1.0.0-words-in-context-de

0.03 (0.03 - 0.01)

-0.02 (-0.02 - 0.00)

0.00 (0.00 - -0.00)

-0.02 (-0.03 - -0.01)

0.01 (0.02 - 0.00)

0.02 (0.02 - -0.00)

-0.00 (-0.00 - 0.00)

-0.01 (0.00 - 0.01)

-0.02 (-0.01 - 0.00)

-0.00 (0.00 - 0.00)

0.00 (-0.00 - -0.01)

checklist-synth-1.0.0-words-in-context-en

-0.01 (0.00 - 0.01)

-0.03 (-0.02 - 0.00)

-0.00 (-0.00 - -0.00)

-0.03 (-0.04 - -0.01)

-0.00 (0.00 - 0.00)

0.03 (0.02 - -0.00)

0.00 (0.00 - 0.00)

0.04 (0.04 - 0.01)

-0.01 (-0.01 - 0.00)

-0.00 (0.00 - 0.00)

0.02 (0.01 - -0.01)

checklist-synth-1.0.0-words-in-context-es

0.01 (0.02 - 0.01)

0.05 (0.05 - 0.00)

0.00 (0.00 - -0.00)

-0.01 (-0.02 - -0.01)

0.00 (0.01 - 0.00)

-0.05 (-0.05 - -0.00)

-0.00 (-0.00 - 0.00)

-0.01 (0.00 - 0.01)

-0.00 (0.00 - 0.00)

0.00 (0.00 - 0.00)

-0.00 (-0.01 - -0.01)

checklist-synth-1.0.0-words-in-context-fr

-0.01 (-0.00 - 0.01)

0.01 (0.01 - 0.00)

-0.02 (-0.02 - -0.00)

0.00 (-0.01 - -0.01)

-0.00 (0.00 - 0.00)

-0.01 (-0.01 - -0.00)

0.02 (0.02 - 0.00)

0.01 (0.01 - 0.01)

0.01 (0.01 - 0.00)

-0.00 (0.00 - 0.00)

-0.01 (-0.01 - -0.01)

checklist-synth-1.0.0-words-in-context-it

-0.01 (0.00 - 0.01)

-0.00 (0.00 - 0.00)

0.00 (0.00 - -0.00)

0.03 (0.01 - -0.01)

0.00 (0.00 - 0.00)

0.00 (-0.00 - -0.00)

-0.00 (0.00 - 0.00)

-0.02 (-0.01 - 0.01)

0.01 (0.01 - 0.00)

-0.00 (-0.00 - 0.00)

-0.01 (-0.01 - -0.01)

checklist-synth-1.0.0-words-in-context-ja

-0.01 (0.00 - 0.01)

-0.00 (-0.00 - 0.00)

-0.00 (-0.00 - -0.00)

0.01 (-0.00 - -0.01)

-0.01 (-0.00 - 0.00)

0.01 (0.01 - -0.00)

0.00 (0.00 - 0.00)

-0.00 (0.00 - 0.01)

0.01 (0.01 - 0.00)

-0.01 (-0.01 - 0.00)

-0.00 (-0.01 - -0.01)

checklist-synth-1.0.0-words-in-context-pt

-0.01 (0.00 - 0.01)

-0.00 (0.00 - 0.00)

0.02 (0.02 - -0.00)

0.02 (0.00 - -0.01)

-0.00 (0.00 - 0.00)

0.00 (-0.00 - -0.00)

-0.02 (-0.02 - 0.00)

-0.01 (-0.01 - 0.01)

-0.01 (-0.01 - 0.00)

-0.00 (0.00 - 0.00)

0.01 (0.01 - -0.01)

checklist-synth-1.0.0-words-in-context-zh

-0.01 (0.00 - 0.01)

-0.00 (-0.00 - 0.00)

-0.00 (-0.00 - -0.00)

0.00 (-0.01 - -0.01)

0.00 (0.01 - 0.00)

-0.00 (-0.01 - -0.00)

0.00 (0.00 - 0.00)

0.00 (0.01 - 0.01)

0.01 (0.01 - 0.00)

0.01 (0.01 - 0.00)

-0.01 (-0.02 - -0.01)

mean

-0.00

0.00

0.00

-0.00

0.00

-0.00

0.00

0.00

0.00

0.00

0.00

Mean Shift Difference Negative Sentiment

The test score is the mean shift for negative sentiment for a specific language minus the average of the mean shift for negative sentiment over all languages. The full expression leading to the test score is displayed in parentheses.

Threshold: 0.025

| Data | CNN14 | w2v2-b | hubert-b | axlstm |
|---|---|---|---|---|
| checklist-synth-1.0.0-words-in-context-de | 0.00 (0.00 - 0.00) | 0.01 (0.01 - -0.00) | 0.01 (0.01 - 0.00) | 0.00 (0.00 - 0.00) |
| checklist-synth-1.0.0-words-in-context-en | -0.00 (0.00 - 0.00) | 0.00 (0.00 - -0.00) | 0.00 (0.00 - 0.00) | -0.00 (0.00 - 0.00) |
| checklist-synth-1.0.0-words-in-context-es | 0.01 (0.01 - 0.00) | 0.00 (0.00 - -0.00) | 0.00 (0.00 - 0.00) | 0.00 (0.00 - 0.00) |
| checklist-synth-1.0.0-words-in-context-fr | -0.00 (-0.00 - 0.00) | -0.01 (-0.01 - -0.00) | -0.01 (-0.01 - 0.00) | 0.00 (0.00 - 0.00) |
| checklist-synth-1.0.0-words-in-context-it | 0.00 (0.00 - 0.00) | 0.00 (0.00 - -0.00) | 0.00 (0.00 - 0.00) | -0.00 (0.00 - 0.00) |
| checklist-synth-1.0.0-words-in-context-ja | 0.00 (0.00 - 0.00) | -0.00 (-0.01 - -0.00) | -0.00 (-0.00 - 0.00) | -0.00 (-0.00 - 0.00) |
| checklist-synth-1.0.0-words-in-context-pt | 0.00 (0.00 - 0.00) | -0.00 (-0.00 - -0.00) | 0.00 (0.00 - 0.00) | 0.01 (0.01 - 0.00) |
| checklist-synth-1.0.0-words-in-context-zh | -0.01 (-0.01 - 0.00) | -0.00 (-0.00 - -0.00) | -0.01 (-0.01 - 0.00) | -0.01 (-0.00 - 0.00) |
| mean | 0.00 | 0.00 | -0.00 | 0.00 |
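The CNN14 scores of the negative sentiment mean shift test can be recomputed from the rounded values shown in parentheses. The sketch below is only an illustration: it works on the two-decimal values from the report rather than the unrounded ones, and it assumes that a test passes when the absolute score does not exceed the threshold. The name `mean_shift_difference` is hypothetical.

```python
THRESHOLD = 0.025  # threshold stated for the mean shift tests

# Rounded per-language mean shifts for CNN14 (first term in parentheses).
cnn14_shift = {
    "de": 0.00, "en": 0.00, "es": 0.01, "fr": -0.00,
    "it": 0.00, "ja": 0.00, "pt": 0.00, "zh": -0.01,
}


def mean_shift_difference(shift_per_language):
    """Subtract the across-language average from each language's mean shift."""
    average = sum(shift_per_language.values()) / len(shift_per_language)
    return {lang: shift - average for lang, shift in shift_per_language.items()}


scores = mean_shift_difference(cnn14_shift)
failed = [lang for lang, score in scores.items() if abs(score) > THRESHOLD]
print(scores)
print("failed:", failed)
```

Because the average is subtracted from every language, the scores sum to zero over languages, which is consistent with the mean row showing only values of 0.00 or -0.00.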

Mean Shift Difference Neutral Sentiment

The test score is the mean shift for neutral sentiment for a specific language minus the average of the mean shift for neutral sentiment over all languages. The full expression leading to the test score is displayed in parentheses.

Threshold: 0.025

| Data | CNN14 | w2v2-b | hubert-b | axlstm |
|---|---|---|---|---|
| checklist-synth-1.0.0-words-in-context-de | -0.01 (-0.01 - 0.00) | -0.01 (-0.00 - 0.00) | -0.01 (-0.01 - -0.00) | -0.00 (0.00 - 0.00) |
| checklist-synth-1.0.0-words-in-context-en | -0.01 (-0.00 - 0.00) | -0.02 (-0.01 - 0.00) | -0.03 (-0.03 - -0.00) | -0.01 (-0.01 - 0.00) |
| checklist-synth-1.0.0-words-in-context-es | -0.00 (0.00 - 0.00) | 0.00 (0.00 - 0.00) | 0.00 (0.00 - -0.00) | -0.00 (-0.00 - 0.00) |
| checklist-synth-1.0.0-words-in-context-fr | 0.01 (0.01 - 0.00) | 0.01 (0.01 - 0.00) | -0.01 (-0.01 - -0.00) | -0.00 (-0.00 - 0.00) |
| checklist-synth-1.0.0-words-in-context-it | 0.00 (0.00 - 0.00) | -0.00 (0.00 - 0.00) | 0.01 (0.00 - -0.00) | 0.02 (0.02 - 0.00) |
| checklist-synth-1.0.0-words-in-context-ja | 0.01 (0.01 - 0.00) | 0.00 (0.01 - 0.00) | 0.02 (0.02 - -0.00) | 0.02 (0.02 - 0.00) |
| checklist-synth-1.0.0-words-in-context-pt | -0.01 (-0.01 - 0.00) | 0.01 (0.01 - 0.00) | -0.00 (-0.00 - -0.00) | -0.03 (-0.03 - 0.00) |
| checklist-synth-1.0.0-words-in-context-zh | 0.01 (0.02 - 0.00) | 0.00 (0.01 - 0.00) | 0.02 (0.01 - -0.00) | 0.02 (0.03 - 0.00) |
| mean | 0.00 | -0.00 | 0.00 | 0.00 |

Mean Shift Difference Positive Sentiment

The test score is the mean shift for positive sentiment for a specific language minus the average of the mean shift for positive sentiment over all languages. The full expression leading to the test score is displayed in parentheses.

Threshold: 0.025

| Data | CNN14 | w2v2-b | hubert-b | axlstm |
|---|---|---|---|---|
| checklist-synth-1.0.0-words-in-context-de | 0.00 (0.00 - -0.00) | -0.01 (-0.01 - -0.00) | -0.01 (-0.01 - 0.00) | 0.00 (-0.00 - -0.00) |
| checklist-synth-1.0.0-words-in-context-en | 0.00 (0.00 - -0.00) | 0.00 (0.00 - -0.00) | 0.01 (0.01 - 0.00) | 0.01 (0.00 - -0.00) |
| checklist-synth-1.0.0-words-in-context-es | -0.01 (-0.01 - -0.00) | -0.00 (-0.00 - -0.00) | -0.00 (-0.00 - 0.00) | 0.00 (-0.00 - -0.00) |
| checklist-synth-1.0.0-words-in-context-fr | 0.00 (-0.00 - -0.00) | 0.00 (0.00 - -0.00) | 0.01 (0.01 - 0.00) | -0.00 (-0.00 - -0.00) |
| checklist-synth-1.0.0-words-in-context-it | -0.00 (-0.00 - -0.00) | -0.00 (-0.00 - -0.00) | -0.01 (-0.00 - 0.00) | -0.01 (-0.01 - -0.00) |
| checklist-synth-1.0.0-words-in-context-ja | -0.00 (-0.01 - -0.00) | 0.00 (0.00 - -0.00) | -0.01 (-0.01 - 0.00) | -0.00 (-0.00 - -0.00) |
| checklist-synth-1.0.0-words-in-context-pt | 0.00 (-0.00 - -0.00) | -0.00 (-0.00 - -0.00) | -0.00 (-0.00 - 0.00) | 0.00 (0.00 - -0.00) |
| checklist-synth-1.0.0-words-in-context-zh | 0.00 (0.00 - -0.00) | 0.00 (0.00 - -0.00) | 0.00 (0.00 - 0.00) | -0.00 (-0.01 - -0.00) |
| mean | -0.00 | -0.00 | -0.00 | 0.00 |

Visualization

| CNN14 | w2v2-b | hubert-b | axlstm |
|---|---|---|---|
| ../../../_images/visualization_checklist-synth-1.0.0-words-in-context-de11.png | ../../../_images/visualization_checklist-synth-1.0.0-words-in-context-de12.png | ../../../_images/visualization_checklist-synth-1.0.0-words-in-context-de13.png | ../../../_images/visualization_checklist-synth-1.0.0-words-in-context-de14.png |
| ../../../_images/visualization_checklist-synth-1.0.0-words-in-context-en11.png | ../../../_images/visualization_checklist-synth-1.0.0-words-in-context-en12.png | ../../../_images/visualization_checklist-synth-1.0.0-words-in-context-en13.png | ../../../_images/visualization_checklist-synth-1.0.0-words-in-context-en14.png |
| ../../../_images/visualization_checklist-synth-1.0.0-words-in-context-es11.png | ../../../_images/visualization_checklist-synth-1.0.0-words-in-context-es12.png | ../../../_images/visualization_checklist-synth-1.0.0-words-in-context-es13.png | ../../../_images/visualization_checklist-synth-1.0.0-words-in-context-es14.png |
| ../../../_images/visualization_checklist-synth-1.0.0-words-in-context-fr11.png | ../../../_images/visualization_checklist-synth-1.0.0-words-in-context-fr12.png | ../../../_images/visualization_checklist-synth-1.0.0-words-in-context-fr13.png | ../../../_images/visualization_checklist-synth-1.0.0-words-in-context-fr14.png |
| ../../../_images/visualization_checklist-synth-1.0.0-words-in-context-it11.png | ../../../_images/visualization_checklist-synth-1.0.0-words-in-context-it12.png | ../../../_images/visualization_checklist-synth-1.0.0-words-in-context-it13.png | ../../../_images/visualization_checklist-synth-1.0.0-words-in-context-it14.png |
| ../../../_images/visualization_checklist-synth-1.0.0-words-in-context-ja11.png | ../../../_images/visualization_checklist-synth-1.0.0-words-in-context-ja12.png | ../../../_images/visualization_checklist-synth-1.0.0-words-in-context-ja13.png | ../../../_images/visualization_checklist-synth-1.0.0-words-in-context-ja14.png |
| ../../../_images/visualization_checklist-synth-1.0.0-words-in-context-pt11.png | ../../../_images/visualization_checklist-synth-1.0.0-words-in-context-pt12.png | ../../../_images/visualization_checklist-synth-1.0.0-words-in-context-pt13.png | ../../../_images/visualization_checklist-synth-1.0.0-words-in-context-pt14.png |
| ../../../_images/visualization_checklist-synth-1.0.0-words-in-context-zh11.png | ../../../_images/visualization_checklist-synth-1.0.0-words-in-context-zh12.png | ../../../_images/visualization_checklist-synth-1.0.0-words-in-context-zh13.png | ../../../_images/visualization_checklist-synth-1.0.0-words-in-context-zh14.png |