Evaluating Large Language Models through Gender and Racial Stereotypes
- URL: http://arxiv.org/abs/2311.14788v1
- Date: Fri, 24 Nov 2023 18:41:16 GMT
- Title: Evaluating Large Language Models through Gender and Racial Stereotypes
- Authors: Ananya Malik
- Abstract summary: We conduct a qualitative comparative study and establish a framework to evaluate language models with respect to two kinds of bias: gender and race.
We find that while gender bias is greatly reduced in newer models compared to older ones, racial bias still persists.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Language models have ushered in a new age of AI, gaining traction within the NLP
community as well as among the general population. AI's ability to make predictions and
generate text, together with its applications in sensitive decision-making scenarios,
makes it all the more important to study these models for biases that may exist and that
may be amplified. We conduct a qualitative comparative study and establish a framework to
evaluate language models with respect to two kinds of bias, gender and racial, in a
professional setting. We find that while gender bias is greatly reduced in newer models
compared to older ones, racial bias still persists.
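As a rough illustration only (this is not the paper's framework), a profession-based bias probe of this kind can be sketched by comparing a masked language model's gendered-pronoun probabilities in occupational contexts; the model choice and templates below are assumptions:

```python
# Minimal sketch of a profession-based gender-bias probe; NOT the paper's
# framework, just one common way to surface occupational gender associations.
from transformers import pipeline

unmasker = pipeline("fill-mask", model="bert-base-uncased")

professions = ["doctor", "nurse", "engineer", "teacher"]
for job in professions:
    # Restrict the fill-mask candidates to "he"/"she" and compare their scores.
    results = unmasker(f"The {job} said that [MASK] would be late.", targets=["he", "she"])
    scores = {r["token_str"]: round(r["score"], 4) for r in results}
    print(job, scores)
```

A full framework would extend such probes to racially associated names and to generative rather than masked models, which is closer to the setting the paper evaluates.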
Related papers
- Spoken Stereoset: On Evaluating Social Bias Toward Speaker in Speech Large Language Models [50.40276881893513]
This study introduces Spoken Stereoset, a dataset specifically designed to evaluate social biases in Speech Large Language Models (SLLMs).
By examining how different models respond to speech from diverse demographic groups, we aim to identify these biases.
The findings indicate that while most models show minimal bias, some still exhibit slightly stereotypical or anti-stereotypical tendencies.
arXiv Detail & Related papers (2024-08-14T16:55:06Z)
- Investigating Gender Bias in Turkish Language Models [3.100560442806189]
We investigate the significance of gender bias in Turkish language models.
We build upon existing bias evaluation frameworks and extend them to the Turkish language.
In addition, we evaluate Turkish language models for their embedded ethnic bias toward Kurdish people.
arXiv Detail & Related papers (2024-04-17T20:24:41Z)
- Generalizing Fairness to Generative Language Models via Reformulation of Non-discrimination Criteria [4.738231680800414]
This paper studies how to uncover and quantify the presence of gender biases in generative language models.
We derive generative AI analogues of three well-known non-discrimination criteria from classification, namely independence, separation and sufficiency.
Our results address the presence of occupational gender bias within such conversational language models.
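For reference, the three classification-side criteria that the paper reformulates are the standard ones from the fairness literature; in the usual notation (protected attribute A, ground-truth label Y, prediction Ŷ), they read:

```latex
% Standard non-discrimination criteria for classifiers, which the paper
% adapts to generative language models (textbook notation, not necessarily
% the paper's own).
\begin{align*}
\text{Independence:} \quad & \hat{Y} \perp A \\
\text{Separation:}   \quad & \hat{Y} \perp A \mid Y \\
\text{Sufficiency:}  \quad & Y \perp A \mid \hat{Y}
\end{align*}
```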
arXiv Detail & Related papers (2024-03-13T14:19:08Z)
- Multilingual Text-to-Image Generation Magnifies Gender Stereotypes and Prompt Engineering May Not Help You [64.74707085021858]
We show that multilingual models suffer from significant gender biases just as monolingual models do.
We propose a novel benchmark, MAGBIG, intended to foster research on gender bias in multilingual models.
Our results show that not only do models exhibit strong gender biases but they also behave differently across languages.
arXiv Detail & Related papers (2024-01-29T12:02:28Z)
- Towards Auditing Large Language Models: Improving Text-based Stereotype Detection [5.3634450268516565]
This work introduces the Multi-Grain Stereotype dataset, which includes 52,751 instances of gender, race, profession and religion stereotypic text.
We design several experiments to rigorously test the proposed model trained on the novel dataset.
Experiments show that training the model in a multi-class setting can outperform the one-vs-all binary counterpart.
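As a rough sketch of the multi-class versus one-vs-all comparison (illustrative only; the toy texts, labels and classifier below are assumptions, not the paper's model or data):

```python
# Illustrative comparison of multi-class vs. one-vs-all stereotype classification.
# The texts/labels are toy placeholders, not the Multi-Grain Stereotype dataset.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.multiclass import OneVsRestClassifier
from sklearn.pipeline import make_pipeline

texts = ["women are bad drivers", "men never cry",
         "immigrants are taking our jobs", "lawyers are all liars"]
labels = ["gender", "gender", "race", "profession"]

# One shared multi-class classifier over all stereotype categories.
multiclass = make_pipeline(TfidfVectorizer(), LogisticRegression(max_iter=1000))
multiclass.fit(texts, labels)

# One-vs-all baseline: an independent binary classifier per category.
one_vs_all = make_pipeline(TfidfVectorizer(),
                           OneVsRestClassifier(LogisticRegression(max_iter=1000)))
one_vs_all.fit(texts, labels)

query = ["nurses are always women"]
print(multiclass.predict(query), one_vs_all.predict(query))
```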
arXiv Detail & Related papers (2023-11-23T17:47:14Z)
- Exposing Bias in Online Communities through Large-Scale Language Models [3.04585143845864]
This work leverages the tendency of language models to absorb bias from their training data in order to explore the biases of six different online communities.
The bias of the resulting models is evaluated by prompting the models with different demographics and comparing the sentiment and toxicity values of these generations.
This work not only affirms how easily bias is absorbed from training data but also presents a scalable method to identify and compare the bias of different datasets or communities.
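A minimal sketch of this kind of probe, assuming an off-the-shelf generator and sentiment classifier in place of the paper's community-fine-tuned models and toxicity scoring:

```python
# Rough sketch: generate continuations for demographic-templated prompts and
# score them with a sentiment classifier. The paper instead fine-tunes a model
# per online community and also compares toxicity scores of the generations.
from transformers import pipeline

generator = pipeline("text-generation", model="gpt2")
sentiment = pipeline("sentiment-analysis")

for group in ["women", "men", "immigrants", "elderly people"]:
    prompt = f"In my experience, {group} are"
    text = generator(prompt, max_new_tokens=30, num_return_sequences=1)[0]["generated_text"]
    label = sentiment(text)[0]
    print(group, label["label"], round(label["score"], 3))
```

Comparing these distributions across models fine-tuned on different communities is what allows the biases of the communities themselves to be contrasted.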
arXiv Detail & Related papers (2023-06-04T08:09:26Z)
- The Birth of Bias: A case study on the evolution of gender bias in an English language model [1.6344851071810076]
We use a relatively small language model with an LSTM architecture, trained on an English Wikipedia corpus.
We find that the representation of gender is dynamic and identify different phases during training.
We show that gender information is represented increasingly locally in the input embeddings of the model.
arXiv Detail & Related papers (2022-07-21T00:59:04Z)
- Towards Understanding and Mitigating Social Biases in Language Models [107.82654101403264]
Large-scale pretrained language models (LMs) can be potentially dangerous in manifesting undesirable representational biases.
We propose steps towards mitigating social biases during text generation.
Our empirical results and human evaluation demonstrate effectiveness in mitigating bias while retaining crucial contextual information.
arXiv Detail & Related papers (2021-06-24T17:52:43Z)
- How True is GPT-2? An Empirical Analysis of Intersectional Occupational Biases [50.591267188664666]
Downstream applications are at risk of inheriting biases contained in natural language models.
We analyze the occupational biases of a popular generative language model, GPT-2.
For a given job, GPT-2 reflects the societal skew of gender and ethnicity in the US, and in some cases, pulls the distribution towards gender parity.
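A crude version of such an occupational probe (not the paper's protocol, which relies on carefully templated prompts and validated attribute annotation) might count gendered pronouns in sampled continuations:

```python
# Crude sketch: sample GPT-2 continuations for a job template and count gendered
# pronouns as a proxy for the gender skew the paper measures more rigorously.
from collections import Counter
from transformers import pipeline

generator = pipeline("text-generation", model="gpt2")

job = "software engineer"
counts = Counter()
outputs = generator(f"The {job} said that", max_new_tokens=20,
                    num_return_sequences=20, do_sample=True)
for out in outputs:
    tokens = [t.strip(".,!?") for t in out["generated_text"].lower().split()]
    counts["male"] += sum(t in {"he", "him", "his"} for t in tokens)
    counts["female"] += sum(t in {"she", "her", "hers"} for t in tokens)

print(job, dict(counts))
```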
arXiv Detail & Related papers (2021-02-08T11:10:27Z)
- Multi-Dimensional Gender Bias Classification [67.65551687580552]
Machine learning models can inadvertently learn socially undesirable patterns when training on gender biased text.
We propose a general framework that decomposes gender bias in text along several pragmatic and semantic dimensions.
Using this fine-grained framework, we automatically annotate eight large scale datasets with gender information.
arXiv Detail & Related papers (2020-05-01T21:23:20Z)
- Towards Controllable Biases in Language Generation [87.89632038677912]
We develop a method to induce societal biases in generated text when input prompts contain mentions of specific demographic groups.
We analyze two scenarios: 1) inducing negative biases for one demographic and positive biases for another demographic, and 2) equalizing biases between demographics.
arXiv Detail & Related papers (2020-05-01T08:25:11Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.