Mapping the Multilingual Margins: Intersectional Biases of Sentiment
Analysis Systems in English, Spanish, and Arabic
- URL: http://arxiv.org/abs/2204.03558v1
- Date: Thu, 7 Apr 2022 16:33:15 GMT
- Title: Mapping the Multilingual Margins: Intersectional Biases of Sentiment
Analysis Systems in English, Spanish, and Arabic
- Authors: António Câmara, Nina Taneja, Tamjeed Azad, Emily Allaway, Richard Zemel
- Abstract summary: We introduce four multilingual Equity Evaluation Corpora, supplementary test sets designed to measure social biases, and a novel statistical framework for studying unisectional and intersectional social biases in natural language processing.
We use these tools to measure gender, racial, ethnic, and intersectional social biases across five models trained on emotion regression tasks in English, Spanish, and Arabic.
- Score: 3.3458760961317635
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: As natural language processing systems become more widespread, it is
necessary to address fairness issues in their implementation and deployment to
ensure that their negative impacts on society are understood and minimized.
However, there is limited work that studies fairness using a multilingual and
intersectional framework or on downstream tasks. In this paper, we introduce
four multilingual Equity Evaluation Corpora, supplementary test sets designed
to measure social biases, and a novel statistical framework for studying
unisectional and intersectional social biases in natural language processing.
We use these tools to measure gender, racial, ethnic, and intersectional social
biases across five models trained on emotion regression tasks in English,
Spanish, and Arabic. We find that many systems demonstrate statistically
significant unisectional and intersectional social biases.
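To make the evaluation setup concrete, below is a minimal sketch of an Equity Evaluation Corpus (EEC)-style probe for a single unisectional bias axis, assuming a generic emotion-regression scorer. The templates, name lists, and the toy `score_intensity` stand-in are illustrative placeholders, not the paper's actual corpora, models, or statistical framework.

```python
# Minimal sketch of an Equity Evaluation Corpus (EEC)-style bias probe.
# The templates, name lists, and toy scorer below are illustrative
# placeholders, NOT the paper's actual corpora or models.
from itertools import product
from statistics import mean
from zlib import crc32

from scipy.stats import ttest_rel  # paired t-test over matched sentence pairs

# Templates with a single identity slot; filling the slot with names from
# different demographic groups yields minimally different sentence pairs.
TEMPLATES = [
    "{person} feels angry.",
    "{person} made me feel sad.",
    "The conversation with {person} was heartbreaking.",
]
GROUP_A = ["Ebony", "Latisha", "Shaniqua"]   # African American female names
GROUP_B = ["Amanda", "Courtney", "Heather"]  # European American female names


def score_intensity(sentence: str) -> float:
    """Toy deterministic stand-in so the sketch runs end to end.

    Replace with a trained emotion-regression model (sentence -> intensity
    in [0, 1]) to perform a real audit.
    """
    return (crc32(sentence.encode("utf-8")) % 1000) / 1000.0


def paired_scores(group_a, group_b):
    """Score sentence pairs that differ only in the identity term."""
    a_scores, b_scores = [], []
    for template, (name_a, name_b) in product(TEMPLATES, zip(group_a, group_b)):
        a_scores.append(score_intensity(template.format(person=name_a)))
        b_scores.append(score_intensity(template.format(person=name_b)))
    return a_scores, b_scores


if __name__ == "__main__":
    a, b = paired_scores(GROUP_A, GROUP_B)
    # A statistically significant gap in mean predicted intensity across
    # matched pairs is evidence of unisectional bias on this axis; crossing
    # two axes in the templates (e.g., gender x race) probes intersectionality.
    t_stat, p_value = ttest_rel(a, b)
    print(f"mean gap = {mean(a) - mean(b):+.4f}, t = {t_stat:.3f}, p = {p_value:.4f}")
```

The paper's statistical framework goes beyond a single paired test, but this pair-and-compare loop is the core mechanic that EEC-style bias audits build on.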
Related papers
- Quantifying the Dialect Gap and its Correlates Across Languages [69.18461982439031]
This work lays a foundation for dialectal NLP by documenting clear performance disparities and identifying possible pathways for addressing them through mindful data collection.
arXiv Detail & Related papers (2023-10-23T17:42:01Z)
- Bias and Fairness in Large Language Models: A Survey [73.87651986156006]
We present a comprehensive survey of bias evaluation and mitigation techniques for large language models (LLMs).
We first consolidate, formalize, and expand notions of social bias and fairness in natural language processing.
We then unify the literature by proposing three intuitive taxonomies: two for bias evaluation and one for mitigation.
arXiv Detail & Related papers (2023-09-02T00:32:55Z)
- Evaluating Biased Attitude Associations of Language Models in an Intersectional Context [2.891314299138311]
Language models are trained on large-scale corpora that embed implicit biases documented in psychology.
We study biases related to age, education, gender, height, intelligence, literacy, race, religion, sex, sexual orientation, social class, and weight.
We find that language models exhibit the most biased attitudes against gender identity, social class, and sexual orientation signals in language.
arXiv Detail & Related papers (2023-07-07T03:01:56Z)
- On Evaluating and Mitigating Gender Biases in Multilingual Settings [5.248564173595024]
We investigate some of the challenges with evaluating and mitigating biases in multilingual settings.
We first create a benchmark for evaluating gender biases in pre-trained masked language models.
We extend various debiasing methods to work beyond English and evaluate their effectiveness for SOTA massively multilingual models.
arXiv Detail & Related papers (2023-07-04T06:23:04Z)
- Bias Beyond English: Counterfactual Tests for Bias in Sentiment Analysis in Four Languages [13.694445396757162]
Sentiment analysis systems are deployed in many products across hundreds of languages.
Gender and racial biases are well studied in English sentiment analysis systems but understudied in other languages.
We build a counterfactual evaluation corpus for gender and racial/migrant bias in four languages.
arXiv Detail & Related papers (2023-05-19T13:38:53Z)
- Comparing Biases and the Impact of Multilingual Training across Multiple Languages [70.84047257764405]
We present a bias analysis across Italian, Chinese, English, Hebrew, and Spanish on the downstream sentiment analysis task.
We adapt existing sentiment bias templates in English to Italian, Chinese, Hebrew, and Spanish for four attributes: race, religion, nationality, and gender.
Our results reveal similarities in bias expression such as favoritism of groups that are dominant in each language's culture.
arXiv Detail & Related papers (2023-05-18T18:15:07Z)
- Fairness in Language Models Beyond English: Gaps and Challenges [11.62418844341466]
This paper presents a survey of fairness in multilingual and non-English contexts.
It highlights the shortcomings of current research and the difficulties faced by methods designed for English.
arXiv Detail & Related papers (2023-02-24T11:25:50Z)
- Towards Understanding and Mitigating Social Biases in Language Models [107.82654101403264]
Large-scale pretrained language models (LMs) can manifest undesirable and potentially harmful representational biases.
We propose steps towards mitigating social biases during text generation.
Our empirical results and human evaluation demonstrate effectiveness in mitigating bias while retaining crucial contextual information.
arXiv Detail & Related papers (2021-06-24T17:52:43Z)
- Towards Debiasing Sentence Representations [109.70181221796469]
We show that Sent-Debias is effective in removing biases, and at the same time, preserves performance on sentence-level downstream tasks.
We hope that our work will inspire future research on characterizing and removing social biases from widely adopted sentence representations for fairer NLP.
arXiv Detail & Related papers (2020-07-16T04:22:30Z)
- Gender Bias in Multilingual Embeddings and Cross-Lingual Transfer [101.58431011820755]
We study gender bias in multilingual embeddings and how it affects transfer learning for NLP applications.
We create a multilingual dataset for bias analysis and propose several ways for quantifying bias in multilingual representations.
arXiv Detail & Related papers (2020-05-02T04:34:37Z)