Social Bias in Large Language Models For Bangla: An Empirical Study on Gender and Religious Bias
- URL: http://arxiv.org/abs/2407.03536v3
- Date: Fri, 13 Dec 2024 10:55:37 GMT
- Title: Social Bias in Large Language Models For Bangla: An Empirical Study on Gender and Religious Bias
- Authors: Jayanta Sadhu, Maneesha Rani Saha, Rifat Shahriyar
- Abstract summary: It is important to assess the influence of different types of biases embedded in Large Language Models to ensure fair use in sensitive fields.
Although there has been extensive work on bias assessment in English, such efforts remain scarce for a major language like Bangla.
To the best of our knowledge, this is the first work of its kind on bias assessment of LLMs for Bangla.
- Score: 2.98683507969764
- Abstract: The rapid growth of Large Language Models (LLMs) has put forward the study of biases as a crucial field. It is important to assess the influence of different types of biases embedded in LLMs to ensure fair use in sensitive fields. Although there has been extensive work on bias assessment in English, such efforts remain scarce for a major language like Bangla. In this work, we examine two types of social biases in LLM-generated outputs for the Bangla language. Our main contributions are: (1) bias studies on two different social biases for Bangla, (2) a curated dataset for bias measurement benchmarking, and (3) testing two different probing techniques for bias detection in the context of Bangla. To the best of our knowledge, this is the first work of its kind involving bias assessment of LLMs for Bangla. All our code and resources are publicly available to support bias-related research in Bangla NLP.
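The abstract refers to probing techniques for bias detection without detailing them here. The following is a minimal, hypothetical sketch of what a template-based probing loop over LLM-generated outputs can look like; the templates, persona lists, and helper names are illustrative assumptions only, not the authors' actual probes or dataset.

```python
# Hypothetical sketch of template-based bias probing for LLM outputs.
# The templates, persona lists, and trait handling below are illustrative only;
# they are NOT the probes or dataset used in the paper. Real probes for Bangla
# would use curated Bangla-language templates.
from collections import Counter
from typing import Callable, Dict, List

TEMPLATES: List[str] = [
    "Complete the sentence with one adjective: {persona} is very",
    "Describe {persona} in a single word:",
]

GENDER_PERSONAS: Dict[str, List[str]] = {
    "female": ["the woman", "the girl"],
    "male": ["the man", "the boy"],
}

def probe_bias(generate_fn: Callable[[str], str]) -> Dict[str, Counter]:
    """Run every template against every persona and tally the model's
    one-word completions per demographic group."""
    tallies: Dict[str, Counter] = {g: Counter() for g in GENDER_PERSONAS}
    for group, personas in GENDER_PERSONAS.items():
        for persona in personas:
            for template in TEMPLATES:
                prompt = template.format(persona=persona)
                completion = generate_fn(prompt).strip().lower().split()[0]
                tallies[group][completion] += 1
    return tallies

if __name__ == "__main__":
    # Stand-in for a real LLM call (e.g. an API client); returns a fixed word
    # so the script runs end to end without network access.
    dummy_llm = lambda prompt: "kind"
    print(probe_bias(dummy_llm))
```

Comparing the resulting completion tallies across demographic groups (by frequency, sentiment, or trait category) then yields a rough bias signal; the paper's actual probing setups and metrics are defined in the full text.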
Related papers
- LIBRA: Measuring Bias of Large Language Model from a Local Context [9.612845616659776]
Large Language Models (LLMs) have significantly advanced natural language processing applications.
Yet their widespread use raises concerns regarding inherent biases that may reduce their utility or cause harm to particular social groups.
This research addresses these limitations with a Local Integrated Bias Recognition and Assessment Framework (LIBRA) for measuring bias.
arXiv Detail & Related papers (2025-02-02T04:24:57Z) - How far can bias go? -- Tracing bias from pretraining data to alignment [54.51310112013655]
This study examines the correlation between gender-occupation bias in pre-training data and its manifestation in LLMs.
Our findings reveal that biases present in pre-training data are amplified in model outputs.
arXiv Detail & Related papers (2024-11-28T16:20:25Z) - An Empirical Study of Gendered Stereotypes in Emotional Attributes for Bangla in Multilingual Large Language Models [2.98683507969764]
Women are associated with emotions like empathy, fear, and guilt, while men are linked to anger, bravado, and authority.
We show the existence of gender bias in the context of emotions in Bangla through analytical methods.
All of our resources including code and data are made publicly available to support future research on Bangla NLP.
arXiv Detail & Related papers (2024-07-08T22:22:15Z) - An Empirical Study on the Characteristics of Bias upon Context Length Variation for Bangla [4.494043534116323]
We create a dataset for intrinsic gender bias measurement in Bangla.
We discuss necessary adaptations to apply existing bias measurement methods for Bangla.
We examine the impact of context length variation on bias measurement.
arXiv Detail & Related papers (2024-06-25T08:49:11Z) - White Men Lead, Black Women Help? Benchmarking Language Agency Social Biases in LLMs [58.27353205269664]
Social biases can manifest in language agency.
We introduce the novel Language Agency Bias Evaluation benchmark.
We unveil language agency social biases in content generated by 3 recent Large Language Models (LLMs).
arXiv Detail & Related papers (2024-04-16T12:27:54Z) - GPTBIAS: A Comprehensive Framework for Evaluating Bias in Large Language Models [83.30078426829627]
Large language models (LLMs) have gained popularity and are being widely adopted by a large user community.
The existing evaluation methods have many constraints, and their results exhibit a limited degree of interpretability.
We propose a bias evaluation framework named GPTBIAS that leverages the high performance of LLMs to assess bias in models.
arXiv Detail & Related papers (2023-12-11T12:02:14Z) - "Kelly is a Warm Person, Joseph is a Role Model": Gender Biases in LLM-Generated Reference Letters [97.11173801187816]
Large Language Models (LLMs) have recently emerged as an effective tool to assist individuals in writing various types of content.
This paper critically examines gender biases in LLM-generated reference letters.
arXiv Detail & Related papers (2023-10-13T16:12:57Z) - Language-Agnostic Bias Detection in Language Models with Bias Probing [22.695872707061078]
Pretrained language models (PLMs) are key components in NLP, but they contain strong social biases.
We propose LABDet, a robust and language-agnostic bias probing technique for evaluating social bias in PLMs.
We find consistent patterns of nationality bias across monolingual PLMs in six languages that align with historical and political context.
arXiv Detail & Related papers (2023-05-22T17:58:01Z) - Causally Testing Gender Bias in LLMs: A Case Study on Occupational Bias [33.99768156365231]
We introduce a causal formulation for bias measurement in generative language models.
We propose a benchmark called OccuGender, with a bias-measuring procedure to investigate occupational gender bias.
The results show that these models exhibit substantial occupational gender bias.
arXiv Detail & Related papers (2022-12-20T22:41:24Z) - The SAME score: Improved cosine based bias score for word embeddings [49.75878234192369]
We introduce SAME, a novel bias score for semantic bias in embeddings.
We show that SAME is capable of measuring semantic bias and of identifying potential causes of social bias in downstream tasks (a generic cosine-based bias score is sketched after this list).
arXiv Detail & Related papers (2022-03-28T09:28:13Z) - Towards Controllable Biases in Language Generation [87.89632038677912]
We develop a method to induce societal biases in generated text when input prompts contain mentions of specific demographic groups.
We analyze two scenarios: 1) inducing negative biases for one demographic and positive biases for another demographic, and 2) equalizing biases between demographics.
arXiv Detail & Related papers (2020-05-01T08:25:11Z)
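The SAME entry above concerns a cosine-based bias score for word embeddings. As background, here is a minimal sketch of a generic cosine-similarity association score in that family (a WEAT-style mean-difference measure); the formula and the random stand-in vectors are generic illustrations under that assumption, not the SAME definition itself.

```python
# Generic cosine-similarity bias score for word embeddings (WEAT-style sketch).
# This illustrates the family of cosine-based scores; it is NOT the SAME score.
import numpy as np

def cosine(u: np.ndarray, v: np.ndarray) -> float:
    """Cosine similarity between two embedding vectors."""
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

def association(w: np.ndarray, A: list, B: list) -> float:
    """Mean cosine similarity of word w to attribute set A minus attribute set B."""
    return float(np.mean([cosine(w, a) for a in A]) - np.mean([cosine(w, b) for b in B]))

def bias_score(targets_X: list, targets_Y: list, attrs_A: list, attrs_B: list) -> float:
    """Difference in mean association between two target groups
    (e.g. X = female terms, Y = male terms; A/B = career/family attributes)."""
    return float(np.mean([association(x, attrs_A, attrs_B) for x in targets_X])
                 - np.mean([association(y, attrs_A, attrs_B) for y in targets_Y]))

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    dim = 50
    # Random stand-in embeddings; real use would load pretrained word vectors.
    X = [rng.normal(size=dim) for _ in range(3)]
    Y = [rng.normal(size=dim) for _ in range(3)]
    A = [rng.normal(size=dim) for _ in range(4)]
    B = [rng.normal(size=dim) for _ in range(4)]
    print(f"bias score: {bias_score(X, Y, A, B):+.4f}")  # near zero for random vectors
```

A score near zero indicates comparable associations for both target groups; SAME is presented as an improvement over such basic cosine-based scores.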