Intersectional Bias in Japanese Large Language Models from a Contextualized Perspective
- URL: http://arxiv.org/abs/2506.12327v2
- Date: Sun, 27 Jul 2025 07:14:26 GMT
- Title: Intersectional Bias in Japanese Large Language Models from a Contextualized Perspective
- Authors: Hitomi Yanaka, Xinqi He, Jie Lu, Namgi Han, Sunjin Oh, Ryoma Kumon, Yuma Matsuoka, Katsuhiko Watabe, Yuko Itatsu
- Abstract summary: We construct a benchmark, inter-JBBQ, to evaluate intersectional bias in large language models (LLMs) in a question-answering setting. Using inter-JBBQ to analyze GPT-4o and Swallow, we find that biased output varies according to context even for the same combination of social attributes.
- Score: 19.168850702678125
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: An increasing number of studies have examined the social bias of rapidly developed large language models (LLMs). Although most of these studies have focused on bias occurring in a single social attribute, research in social science has shown that social bias often occurs in the form of intersectionality -- a constitutive and contextualized form of bias arising from combinations of social attributes. In this study, we construct the Japanese benchmark inter-JBBQ, designed to evaluate intersectional bias in LLMs in a question-answering setting. Using inter-JBBQ to analyze GPT-4o and Swallow, we find that biased output varies with context even for the same combination of social attributes.
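Benchmarks of this kind (e.g. BBQ and JBBQ, listed below) typically pose multiple-choice questions over short contexts that mention people with given combinations of social attributes, including an "Unknown" option as the safe answer when the context is ambiguous. As a rough illustration only, the Python sketch below shows how such a question-answering bias evaluation might be scored; the `QAExample` fields, the `query_model` helper, and the bias-score formula are illustrative assumptions, not the actual inter-JBBQ schema or metric.

```python
# Minimal sketch of a BBQ-style QA bias evaluation.
# Assumes a generic query_model() helper that returns the index of the
# chosen answer option; field names and scoring are illustrative only.
from dataclasses import dataclass
from typing import Callable, List

@dataclass
class QAExample:
    context: str                 # e.g. an ambiguous scenario mentioning two people
    question: str                # e.g. "Who is bad at math?"
    options: List[str]           # answer choices, including an "Unknown" option
    unknown_idx: int             # index of the "Unknown" choice
    stereotyped_idx: int         # choice matching the social stereotype

def evaluate_bias(examples: List[QAExample],
                  query_model: Callable[[str], int]) -> dict:
    """Count how often the model abstains vs. picks the stereotyped target."""
    n_unknown = n_stereotyped = n_anti = 0
    for ex in examples:
        prompt = f"{ex.context}\n{ex.question}\n" + "\n".join(
            f"({i}) {opt}" for i, opt in enumerate(ex.options))
        choice = query_model(prompt)          # index of the model's chosen option
        if choice == ex.unknown_idx:
            n_unknown += 1                    # safe answer for ambiguous contexts
        elif choice == ex.stereotyped_idx:
            n_stereotyped += 1                # bias-aligned answer
        else:
            n_anti += 1                       # counter-stereotypical answer
    answered = n_stereotyped + n_anti
    bias_score = (n_stereotyped - n_anti) / answered if answered else 0.0
    return {"unknown_rate": n_unknown / len(examples), "bias_score": bias_score}
```

Under this scoring, a higher unknown_rate means the model abstains more often on ambiguous contexts, and a bias_score near +1 means that, when it does answer, it leans toward the stereotyped target.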
Related papers
- Spoken Stereoset: On Evaluating Social Bias Toward Speaker in Speech Large Language Models [50.40276881893513]
This study introduces Spoken Stereoset, a dataset specifically designed to evaluate social biases in Speech Large Language Models (SLLMs).
By examining how different models respond to speech from diverse demographic groups, we aim to identify these biases.
The findings indicate that while most models show minimal bias, some still exhibit slightly stereotypical or anti-stereotypical tendencies.
arXiv Detail & Related papers (2024-08-14T16:55:06Z) - VLBiasBench: A Comprehensive Benchmark for Evaluating Bias in Large Vision-Language Model [72.13121434085116]
We introduce VLBiasBench, a benchmark to evaluate biases in Large Vision-Language Models (LVLMs). VLBiasBench features a dataset covering nine distinct categories of social bias -- age, disability status, gender, nationality, physical appearance, race, religion, profession, and socioeconomic status -- as well as two intersectional bias categories: race x gender and race x socioeconomic status. We conduct extensive evaluations on 15 open-source models as well as two advanced closed-source models, yielding new insights into the biases present in these models.
arXiv Detail & Related papers (2024-06-20T10:56:59Z) - Ask LLMs Directly, "What shapes your bias?": Measuring Social Bias in Large Language Models [11.132360309354782]
Social bias is shaped by the accumulation of social perceptions towards targets across various demographic identities.
We propose a novel strategy to intuitively quantify social perceptions and suggest metrics that can evaluate the social biases within large language models.
arXiv Detail & Related papers (2024-06-06T13:32:09Z) - JBBQ: Japanese Bias Benchmark for Analyzing Social Biases in Large Language Models [24.351580958043595]
We construct the Japanese Bias Benchmark dataset for Question Answering (JBBQ) based on the English bias benchmark BBQ. We show that while current open Japanese LLMs with more parameters show improved accuracy on JBBQ, their bias scores increase. Prompts with a warning about social biases and chain-of-thought prompting reduce the effect of biases in model outputs.
arXiv Detail & Related papers (2024-06-04T07:31:06Z) - Social Bias Probing: Fairness Benchmarking for Language Models [38.180696489079985]
This paper proposes a novel framework for probing language models for social biases by assessing disparate treatment.
We curate SoFa, a large-scale benchmark designed to address the limitations of existing fairness collections.
We show that biases within language models are more nuanced than acknowledged, indicating a broader scope of encoded biases than previously recognized.
arXiv Detail & Related papers (2023-11-15T16:35:59Z) - Evaluating Biased Attitude Associations of Language Models in an Intersectional Context [2.891314299138311]
Language models are trained on large-scale corpora that embed implicit biases documented in psychology.
We study biases related to age, education, gender, height, intelligence, literacy, race, religion, sex, sexual orientation, social class, and weight.
We find that language models exhibit the most biased attitudes against gender identity, social class, and sexual orientation signals in language.
arXiv Detail & Related papers (2023-07-07T03:01:56Z) - The Tail Wagging the Dog: Dataset Construction Biases of Social Bias Benchmarks [75.58692290694452]
We compare social biases with non-social biases stemming from choices made during dataset construction that might not even be discernible to the human eye.
We observe that these shallow modifications have a surprising effect on the resulting degree of bias across various models.
arXiv Detail & Related papers (2022-10-18T17:58:39Z) - Toward Understanding Bias Correlations for Mitigation in NLP [34.956581421295]
This work aims to provide a first systematic study toward understanding bias correlations in mitigation.
We examine bias mitigation in two common NLP tasks -- toxicity detection and word embeddings.
Our findings suggest that biases are correlated and present scenarios in which independent debiasing approaches may be insufficient.
arXiv Detail & Related papers (2022-05-24T22:48:47Z) - The SAME score: Improved cosine based bias score for word embeddings [49.75878234192369]
We introduce SAME, a novel bias score for semantic bias in embeddings.
We show that SAME is capable of measuring semantic bias and identify potential causes of social bias in downstream tasks (a generic cosine-based association score in this spirit is sketched after this list).
arXiv Detail & Related papers (2022-03-28T09:28:13Z) - Towards Understanding and Mitigating Social Biases in Language Models [107.82654101403264]
Large-scale pretrained language models (LMs) can be potentially dangerous in manifesting undesirable representational biases.
We propose steps towards mitigating social biases during text generation.
Our empirical results and human evaluation demonstrate effectiveness in mitigating bias while retaining crucial contextual information.
arXiv Detail & Related papers (2021-06-24T17:52:43Z) - Towards Debiasing Sentence Representations [109.70181221796469]
We show that Sent-Debias is effective in removing biases, and at the same time, preserves performance on sentence-level downstream tasks.
We hope that our work will inspire future research on characterizing and removing social biases from widely adopted sentence representations for fairer NLP.
arXiv Detail & Related papers (2020-07-16T04:22:30Z)
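Several of the entries above score bias with cosine similarities between word embeddings (e.g. the SAME score). The following is a minimal sketch of a generic cosine-based association measure in that spirit; it is not the SAME formula itself, and `embed` is a hypothetical lookup returning a vector for a word.

```python
# Generic cosine-based embedding bias sketch (WEAT-style association
# difference), for illustration only; not the exact SAME formula.
import numpy as np
from typing import Callable, Sequence

def cosine(u: np.ndarray, v: np.ndarray) -> float:
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

def association_bias(target_words: Sequence[str],
                     attr_a: Sequence[str],
                     attr_b: Sequence[str],
                     embed: Callable[[str], np.ndarray]) -> float:
    """Mean difference in cosine association of targets with two attribute sets.
    Positive values indicate the targets lean toward attribute set A."""
    def mean_assoc(word: str, attrs: Sequence[str]) -> float:
        w = embed(word)
        return float(np.mean([cosine(w, embed(a)) for a in attrs]))
    diffs = [mean_assoc(t, attr_a) - mean_assoc(t, attr_b) for t in target_words]
    return float(np.mean(diffs))
```

For instance, association_bias(["nurse", "engineer"], ["she", "her"], ["he", "him"], embed) reports whether the target words sit closer, on average, to the first or the second attribute set in embedding space.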