Bias Against 93 Stigmatized Groups in Masked Language Models and
Downstream Sentiment Classification Tasks
- URL: http://arxiv.org/abs/2306.05550v1
- Date: Thu, 8 Jun 2023 20:46:09 GMT
- Title: Bias Against 93 Stigmatized Groups in Masked Language Models and
Downstream Sentiment Classification Tasks
- Authors: Katelyn X. Mei, Sonia Fereidooni, Aylin Caliskan
- Abstract summary: This study extends the focus of bias evaluation in extant work by examining bias against social stigmas on a large scale.
It focuses on 93 stigmatized groups in the United States, including a wide range of conditions related to disease, disability, drug use, mental illness, religion, sexuality, socioeconomic status, and other relevant factors.
We investigate bias against these groups in English pre-trained Masked Language Models (MLMs) and their downstream sentiment classification tasks.
- Score: 2.5690340428649323
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: The rapid deployment of artificial intelligence (AI) models demands a
thorough investigation of biases and risks inherent in these models to
understand their impact on individuals and society. This study extends the
focus of bias evaluation in extant work by examining bias against social
stigmas on a large scale. It focuses on 93 stigmatized groups in the United
States, including a wide range of conditions related to disease, disability,
drug use, mental illness, religion, sexuality, socioeconomic status, and other
relevant factors. We investigate bias against these groups in English
pre-trained Masked Language Models (MLMs) and their downstream sentiment
classification tasks. To evaluate the presence of bias against 93 stigmatized
conditions, we identify 29 non-stigmatized conditions to conduct a comparative
analysis. Building upon a psychology scale of social rejection, the Social
Distance Scale, we prompt six MLMs: RoBERTa-base, RoBERTa-large, XLNet-large,
BERTweet-base, BERTweet-large, and DistilBERT. We use human annotations to
analyze the predicted words from these models, with which we measure the extent
of bias against stigmatized groups. When prompts include stigmatized
conditions, the probability of MLMs predicting negative words is approximately
20 percent higher than when prompts have non-stigmatized conditions. In the
sentiment classification tasks, when sentences include stigmatized conditions
related to diseases, disability, education, and mental illness, they are more
likely to be classified as negative. We also observe a strong correlation
between bias in MLMs and their downstream sentiment classifiers (r = 0.79). The
evidence indicates that MLMs and their downstream sentiment classification
tasks exhibit biases against socially stigmatized groups.
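The evaluation recipe described in the abstract can be sketched briefly. The Python snippet below is a minimal illustration, not the authors' released code: it assumes the Hugging Face transformers pipelines, uses one of the six MLMs (RoBERTa-base) plus a generic sentiment classifier standing in for the paper's downstream classifiers, and replaces the paper's human-annotated materials with a handful of hypothetical placeholder conditions, an illustrative Social-Distance-Scale-style template, and a toy negative-word lookup.

```python
# Minimal sketch of the evaluation idea (not the authors' code or materials).
# Part 1: prompt a masked language model and count negative completions.
# Part 2: score the same conditions with a downstream sentiment classifier.
# Part 3: correlate the two per-condition bias signals, as the paper does.
import numpy as np
from transformers import pipeline

fill_mask = pipeline("fill-mask", model="roberta-base")  # one of the six MLMs studied
sentiment = pipeline(
    "sentiment-analysis",
    model="distilbert-base-uncased-finetuned-sst-2-english",  # generic stand-in classifier
)
MASK = fill_mask.tokenizer.mask_token

# Hypothetical example conditions; the study covers 93 stigmatized and
# 29 non-stigmatized conditions.
conditions = [
    "are treated for a mental illness",   # stigmatized (example)
    "are experiencing homelessness",      # stigmatized (example)
    "wear glasses",                       # non-stigmatized (example)
    "drink coffee every morning",         # non-stigmatized (example)
]

# Illustrative Social-Distance-Scale-flavored template (not the paper's wording).
TEMPLATE = "I would feel {word} about having people who {condition} as my neighbors."
# Toy polarity lookup standing in for the paper's human annotations.
NEGATIVE_WORDS = {"bad", "uncomfortable", "unsafe", "nervous", "terrible", "worried"}

def mlm_negative_rate(condition, top_k=10):
    """Fraction of the top-k mask predictions judged negative."""
    prompt = TEMPLATE.format(word=MASK, condition=condition)
    preds = fill_mask(prompt, top_k=top_k)
    return sum(p["token_str"].strip().lower() in NEGATIVE_WORDS for p in preds) / top_k

def classifier_negative_score(condition):
    """Probability mass assigned to NEGATIVE for a sentence mentioning the condition."""
    result = sentiment(f"My neighbors {condition}.")[0]
    return result["score"] if result["label"] == "NEGATIVE" else 1.0 - result["score"]

mlm_bias = np.array([mlm_negative_rate(c) for c in conditions])
clf_bias = np.array([classifier_negative_score(c) for c in conditions])

for c, m, s in zip(conditions, mlm_bias, clf_bias):
    print(f"{c:35s}  MLM negative rate={m:.2f}  classifier negative score={s:.2f}")

# Pearson correlation between the two bias signals (only meaningful with the
# full condition set, not this four-item toy list).
print("Pearson r:", np.corrcoef(mlm_bias, clf_bias)[0, 1])
```

In the paper itself, the predicted words are judged by human annotators rather than a fixed lexicon, and the reported correlation (r = 0.79) is computed over the full set of 93 stigmatized and 29 non-stigmatized conditions rather than a toy list like this one.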
Related papers
- Spoken Stereoset: On Evaluating Social Bias Toward Speaker in Speech Large Language Models [50.40276881893513]
This study introduces Spoken Stereoset, a dataset specifically designed to evaluate social biases in Speech Large Language Models (SLLMs)
By examining how different models respond to speech from diverse demographic groups, we aim to identify these biases.
The findings indicate that while most models show minimal bias, some still exhibit slightly stereotypical or anti-stereotypical tendencies.
arXiv Detail & Related papers (2024-08-14T16:55:06Z)
- VLBiasBench: A Comprehensive Benchmark for Evaluating Bias in Large Vision-Language Model [72.13121434085116]
VLBiasBench is a benchmark aimed at evaluating biases in Large Vision-Language Models (LVLMs)
We construct a dataset encompassing nine distinct categories of social bias, namely age, disability status, gender, nationality, physical appearance, race, religion, profession, and socioeconomic status, plus two intersectional categories (race x gender and race x socioeconomic status).
We conduct extensive evaluations on 15 open-source models as well as one advanced closed-source model, providing new insights into the biases revealed by these models.
arXiv Detail & Related papers (2024-06-20T10:56:59Z)
- The Devil is in the Neurons: Interpreting and Mitigating Social Biases in Pre-trained Language Models [78.69526166193236]
Pre-trained Language models (PLMs) have been acknowledged to contain harmful information, such as social biases.
We propose Social Bias Neurons to accurately pinpoint units (i.e., neurons) in a language model that can be attributed to undesirable behavior, such as social bias.
As measured by prior metrics from StereoSet, our model achieves a higher degree of fairness while maintaining language modeling ability with low cost.
arXiv Detail & Related papers (2024-06-14T15:41:06Z)
- Seeds of Stereotypes: A Large-Scale Textual Analysis of Race and Gender Associations with Diseases in Online Sources [1.8259644946867188]
The study analyzed the context in which various diseases are discussed alongside markers of race and gender.
We found that demographic terms are disproportionately associated with specific disease concepts in online texts.
We find widespread disparities in the associations of specific racial and gender terms with the 18 diseases analyzed.
arXiv Detail & Related papers (2024-05-08T13:38:56Z)
- Detecting Bias in Large Language Models: Fine-tuned KcBERT [0.0]
We define such harm as societal bias and assess ethnic, gender, and racial biases in a model fine-tuned with Korean comments.
Our contribution lies in demonstrating that societal bias exists in Korean language models due to language-dependent characteristics.
arXiv Detail & Related papers (2024-03-16T02:27:19Z)
- SocialStigmaQA: A Benchmark to Uncover Stigma Amplification in Generative Language Models [8.211129045180636]
We introduce a benchmark meant to capture the amplification of social bias, via stigmas, in generative language models.
Our benchmark, SocialStigmaQA, contains roughly 10K prompts, with a variety of prompt styles, carefully constructed to test for both social bias and model robustness.
We find that the proportion of socially biased output ranges from 45% to 59% across a variety of decoding strategies and prompting styles.
arXiv Detail & Related papers (2023-12-12T18:27:44Z)
- Social Bias Probing: Fairness Benchmarking for Language Models [38.180696489079985]
This paper proposes a novel framework for probing language models for social biases by assessing disparate treatment.
We curate SoFa, a large-scale benchmark designed to address the limitations of existing fairness collections.
We show that biases within language models are more nuanced than acknowledged, indicating a broader scope of encoded biases than previously recognized.
arXiv Detail & Related papers (2023-11-15T16:35:59Z)
- Investigating Subtler Biases in LLMs: Ageism, Beauty, Institutional, and Nationality Bias in Generative Models [0.0]
This paper investigates bias along less-studied but still consequential, dimensions, such as age and beauty.
We ask whether LLMs hold wide-reaching biases of positive or negative sentiment for specific social groups similar to the "what is beautiful is good" bias found in people in experimental psychology.
arXiv Detail & Related papers (2023-09-16T07:07:04Z)
- Bias and Fairness in Large Language Models: A Survey [73.87651986156006]
We present a comprehensive survey of bias evaluation and mitigation techniques for large language models (LLMs)
We first consolidate, formalize, and expand notions of social bias and fairness in natural language processing.
We then unify the literature by proposing three intuitive taxonomies: two for bias evaluation and one for mitigation.
arXiv Detail & Related papers (2023-09-02T00:32:55Z)
- Auditing Algorithmic Fairness in Machine Learning for Health with Severity-Based LOGAN [70.76142503046782]
We propose supplementing bias audits of machine learning-based (ML) healthcare tools with SLOGAN, an automatic tool for capturing local biases in a clinical prediction task.
SLOGAN adapts an existing tool, LOcal Group biAs detectioN (LOGAN), by contextualizing group bias detection in patient illness severity and past medical history.
On average, SLOGAN identifies larger fairness disparities in over 75% of patient groups than LOGAN while maintaining clustering quality.
arXiv Detail & Related papers (2022-11-16T08:04:12Z)
- Towards Understanding and Mitigating Social Biases in Language Models [107.82654101403264]
Large-scale pretrained language models (LMs) can be potentially dangerous in manifesting undesirable representational biases.
We propose steps towards mitigating social biases during text generation.
Our empirical results and human evaluation demonstrate effectiveness in mitigating bias while retaining crucial contextual information.
arXiv Detail & Related papers (2021-06-24T17:52:43Z)
This list is automatically generated from the titles and abstracts of the papers on this site.