Blacks is to Anger as Whites is to Joy? Understanding Latent Affective Bias in Large Pre-trained Neural Language Models
- URL: http://arxiv.org/abs/2301.09003v1
- Date: Sat, 21 Jan 2023 20:23:09 GMT
- Title: Blacks is to Anger as Whites is to Joy? Understanding Latent Affective Bias in Large Pre-trained Neural Language Models
- Authors: Anoop Kadan, Deepak P., Sahely Bhadra, Manjary P. Gangan, Lajish V. L
- Abstract summary: "Affective Bias" is the biased association of emotions with a particular gender, race, or religion.
We show the existence of statistically significant affective bias in PLM-based emotion detection systems.
- Score: 3.5278693565908137
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Groundbreaking inventions and significant performance improvements in deep learning-based Natural Language Processing have been achieved through the development of transformer-based large Pre-trained Language Models (PLMs). The wide availability of unlabeled data within the deluge of human-generated data, together with self-supervised learning strategies, has accelerated the success of large PLMs in language generation, language understanding, etc. At the same time, latent historical bias/unfairness in human minds towards a particular gender, race, etc., encoded intentionally or unintentionally into these corpora, harms and calls into question the utility and efficacy of large PLMs in many real-world applications, particularly for protected groups. In this paper, we present an extensive investigation into the existence of "Affective Bias" in large PLMs, to unveil any biased association of emotions such as anger, fear, joy, etc., with a particular gender, race, or religion with respect to the downstream task of textual emotion detection. We begin our exploration at the corpus level, searching for imbalanced distributions of affective words within a domain in the large-scale corpora used to pre-train and fine-tune PLMs. We then quantify affective bias in model predictions through an extensive set of class-based and intensity-based evaluations using various bias evaluation corpora. Our results show the existence of statistically significant affective bias in PLM-based emotion detection systems, indicating a biased association of certain emotions with a particular gender, race, and religion.
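
The abstract describes two stages of analysis: a corpus-level search for imbalanced distributions of affective words around demographic terms, and a class-based evaluation that checks whether a PLM-based emotion detector assigns emotions differently across demographic groups, with statistical significance testing. The Python sketch below only illustrates the general shape of both steps; the toy affect lexicon, group terms, templates, and the `classify` stand-in are illustrative assumptions and do not reproduce the paper's evaluation corpora, models, or code.

```python
from collections import Counter
from itertools import product

from scipy.stats import chi2_contingency  # chi-square test of independence

# --- (1) Corpus-level check: affective words co-occurring with group terms ---
# Toy affect lexicon and group terms; the paper's actual lexica and corpora
# are not reproduced here.
AFFECT_LEXICON = {"furious": "anger", "terrified": "fear", "delighted": "joy"}
GROUP_TERMS = {"group_a": {"black"}, "group_b": {"white"}}

def affect_counts(sentences):
    """Count affective words in sentences that mention each group."""
    counts = {g: Counter() for g in GROUP_TERMS}
    for sentence in sentences:
        tokens = {t.strip(".,!?").lower() for t in sentence.split()}
        for group, terms in GROUP_TERMS.items():
            if terms & tokens:
                for tok in tokens:
                    if tok in AFFECT_LEXICON:
                        counts[group][AFFECT_LEXICON[tok]] += 1
    return counts

# --- (2) Class-based evaluation of an emotion classifier ---------------------
# Neutral templates with a <group> slot; `classify` is any callable mapping
# text to an emotion label and stands in for a fine-tuned PLM detector.
TEMPLATES = [
    "The <group> person walked into the room.",
    "My neighbour, a <group> man, called me yesterday.",
    "I had a long talk with the <group> woman next door.",
]
GROUPS = {"group_a": "Black", "group_b": "White"}

def class_based_bias(classify):
    """Tabulate predicted emotions per group and test for independence."""
    table = {g: Counter() for g in GROUPS}
    for template, (group, term) in product(TEMPLATES, GROUPS.items()):
        table[group][classify(template.replace("<group>", term))] += 1
    labels = sorted({lab for c in table.values() for lab in c})
    matrix = [[table[g][lab] for lab in labels] for g in GROUPS]
    _, p_value, _, _ = chi2_contingency(matrix)
    return table, p_value

if __name__ == "__main__":
    print(affect_counts([
        "Reports described the black crowd as furious.",
        "The white family looked delighted at the news.",
    ]))
    # Dummy classifier so the sketch runs end to end; swap in a real model.
    dummy = lambda text: "anger" if "Black" in text else "joy"
    table, p = class_based_bias(dummy)
    print(table, f"p-value={p:.3f}")
```

Swapping the dummy classifier for a fine-tuned PLM and the toy resources for real lexica and bias evaluation corpora would yield per-group emotion label counts, where a small p-value flags a statistically significant difference in the predicted emotion distributions; an intensity-based variant would instead compare predicted emotion intensity scores across groups.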
Related papers
- The Root Shapes the Fruit: On the Persistence of Gender-Exclusive Harms in Aligned Language Models [58.130894823145205]
We center transgender, nonbinary, and other gender-diverse identities to investigate how alignment procedures interact with pre-existing gender-diverse bias.
Our findings reveal that DPO-aligned models are particularly sensitive to supervised finetuning.
We conclude with recommendations tailored to DPO and broader alignment practices.
arXiv Detail & Related papers (2024-11-06T06:50:50Z)
- Revealing and Reducing Gender Biases in Vision and Language Assistants (VLAs) [82.57490175399693]
We study gender bias in 22 popular image-to-text vision-language assistants (VLAs).
Our results show that VLAs replicate human biases likely present in the data, such as real-world occupational imbalances.
To eliminate the gender bias in these models, we find that finetuning-based debiasing methods achieve the best tradeoff between debiasing and retaining performance on downstream tasks.
arXiv Detail & Related papers (2024-10-25T05:59:44Z)
- The Devil is in the Neurons: Interpreting and Mitigating Social Biases in Pre-trained Language Models [78.69526166193236]
Pre-trained Language Models (PLMs) have been acknowledged to contain harmful information, such as social biases.
We propose Social Bias Neurons to accurately pinpoint units (i.e., neurons) in a language model that can be attributed to undesirable behavior, such as social bias.
As measured by prior metrics from StereoSet, our model achieves a higher degree of fairness while maintaining language modeling ability with low cost.
arXiv Detail & Related papers (2024-06-14T15:41:06Z)
- Locating and Mitigating Gender Bias in Large Language Models [40.78150878350479]
Large language models (LLMs) are pre-trained on extensive corpora to learn facts and aspects of human cognition, which contain human preferences.
This process can inadvertently lead to these models acquiring the biases and prevalent stereotypes present in society.
We propose the LSDM (Least Square Debias Method), a knowledge-editing based method for mitigating gender bias in occupational pronouns.
arXiv Detail & Related papers (2024-03-21T13:57:43Z)
- Detecting Bias in Large Language Models: Fine-tuned KcBERT [0.0]
We define such harm as societal bias and assess ethnic, gender, and racial biases in a model fine-tuned with Korean comments.
Our contribution lies in demonstrating that societal bias exists in Korean language models due to language-dependent characteristics.
arXiv Detail & Related papers (2024-03-16T02:27:19Z)
- Towards an Enhanced Understanding of Bias in Pre-trained Neural Language Models: A Survey with Special Emphasis on Affective Bias [2.6304695993930594]
We present a survey to comprehend bias in large pre-trained language models, analyze the stages at which it arises, and review the various ways in which these biases can be quantified and mitigated.
Considering the wide applicability of textual affective computing downstream tasks in real-world systems such as business, healthcare, and education, we place special emphasis on investigating bias in the context of affect (emotion), i.e., Affective Bias.
We present a summary of various bias evaluation corpora that can aid future research and discuss challenges in the research on bias in pre-trained language models.
arXiv Detail & Related papers (2022-04-21T18:51:19Z)
- Balancing out Bias: Achieving Fairness Through Training Reweighting [58.201275105195485]
Bias in natural language processing arises from models learning characteristics of the author such as gender and race.
Existing methods for mitigating and measuring bias do not directly account for correlations between author demographics and linguistic variables.
This paper introduces a very simple but highly effective method for countering bias using instance reweighting.
arXiv Detail & Related papers (2021-09-16T23:40:28Z)
- Towards Understanding and Mitigating Social Biases in Language Models [107.82654101403264]
Large-scale pretrained language models (LMs) can be potentially dangerous in manifesting undesirable representational biases.
We propose steps towards mitigating social biases during text generation.
Our empirical results and human evaluation demonstrate effectiveness in mitigating bias while retaining crucial contextual information.
arXiv Detail & Related papers (2021-06-24T17:52:43Z)
- Towards Socially Responsible AI: Cognitive Bias-Aware Multi-Objective Learning [24.522730093209262]
Human society has a long history of suffering from cognitive biases that lead to social prejudice and mass injustice.
We propose a bias-aware multi-objective learning framework that learns to reduce the frequency of predicting certain biased combinations of identity attributes and prediction outcomes.
arXiv Detail & Related papers (2020-05-14T17:01:53Z)
- Towards Controllable Biases in Language Generation [87.89632038677912]
We develop a method to induce societal biases in generated text when input prompts contain mentions of specific demographic groups.
We analyze two scenarios: 1) inducing negative biases for one demographic and positive biases for another demographic, and 2) equalizing biases between demographics.
arXiv Detail & Related papers (2020-05-01T08:25:11Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information it contains and is not responsible for any consequences arising from its use.