Bias in, Bias out: Annotation Bias in Multilingual Large Language Models
- URL: http://arxiv.org/abs/2511.14662v1
- Date: Tue, 18 Nov 2025 17:02:12 GMT
- Title: Bias in, Bias out: Annotation Bias in Multilingual Large Language Models
- Authors: Xia Cui, Ziyi Huang, Naeemeh Adel
- Abstract summary: Annotation bias in NLP datasets remains a major challenge for developing multilingual Large Language Models (LLMs). We propose a comprehensive framework for understanding annotation bias, distinguishing among instruction bias, annotator bias, and contextual and cultural bias.
- Score: 4.032367157209129
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Annotation bias in NLP datasets remains a major challenge for developing multilingual Large Language Models (LLMs), particularly in culturally diverse settings. Bias from task framing, annotator subjectivity, and cultural mismatches can distort model outputs and exacerbate social harms. We propose a comprehensive framework for understanding annotation bias, distinguishing among instruction bias, annotator bias, and contextual and cultural bias. We review detection methods (including inter-annotator agreement, model disagreement, and metadata analysis) and highlight emerging techniques such as multilingual model divergence and cultural inference. We further outline proactive and reactive mitigation strategies, including diverse annotator recruitment, iterative guideline refinement, and post-hoc model adjustments. Our contributions include: (1) a typology of annotation bias; (2) a synthesis of detection metrics; (3) an ensemble-based bias mitigation approach adapted for multilingual settings; and (4) an ethical analysis of annotation processes. Together, these insights aim to inform more equitable and culturally grounded annotation pipelines for LLMs.
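As a concrete illustration of the first detection method named in the abstract, the minimal sketch below computes Cohen's kappa, a standard inter-annotator agreement statistic; the annotators, task, and labels are invented for illustration and are not from the paper.

```python
# Minimal sketch: inter-annotator agreement via Cohen's kappa.
# The labels are invented (1 = offensive, 0 = not offensive);
# scikit-learn supplies the statistic.
from sklearn.metrics import cohen_kappa_score

annotator_a = [1, 0, 1, 1, 0, 0, 1, 0]
annotator_b = [1, 0, 0, 1, 0, 1, 1, 0]

kappa = cohen_kappa_score(annotator_a, annotator_b)
print(f"Cohen's kappa: {kappa:.3f}")  # 1.0 = perfect agreement, ~0 = chance level
```

Low agreement on a particular slice of the data (for example, items from one dialect or cultural context) is one signal that the guidelines or the annotator pool deserve a second look.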
Related papers
- When Meanings Meet: Investigating the Emergence and Quality of Shared Concept Spaces during Multilingual Language Model Training [57.230355403478995]
We investigate the development of language-agnostic concept spaces during pretraining of EuroLLM. We find that shared concept spaces emerge early and continue to refine, but that alignment with them is language-dependent. In contrast to prior work, our fine-grained manual analysis reveals that some apparent gains in translation quality reflect shifts in behavior.
arXiv Detail & Related papers (2026-01-30T11:23:01Z)
- Mitigating Biases in Language Models via Bias Unlearning [27.565946855618368]
We propose BiasUnlearn, a novel model debiasing framework which achieves targeted debiasing via dual-pathway unlearning mechanisms. The results show that BiasUnlearn outperforms existing methods in mitigating bias in language models while retaining language modeling capabilities.
arXiv Detail & Related papers (2025-09-30T02:15:12Z)
- SESGO: Spanish Evaluation of Stereotypical Generative Outputs [1.1549572298362782]
This paper addresses the critical gap in evaluating bias in multilingual Large Language Models (LLMs). Current evaluations remain predominantly US-English-centric, leaving potential harms in other linguistic and cultural contexts largely underexamined. We introduce a novel, culturally grounded framework for detecting social biases in instruction-tuned LLMs.
arXiv Detail & Related papers (2025-09-03T14:04:51Z)
- Social Bias in Multilingual Language Models: A Survey [5.756606441319472]
This systematic review analyzes emerging research that extends bias evaluation and mitigation approaches into multilingual and non-English contexts. We examine these studies with respect to linguistic diversity, cultural awareness, and their choice of evaluation metrics and mitigation techniques.
arXiv Detail & Related papers (2025-08-27T18:25:32Z)
- MyCulture: Exploring Malaysia's Diverse Culture under Low-Resource Language Constraints [7.822567458977689]
MyCulture is a benchmark designed to comprehensively evaluate Large Language Models (LLMs) on Malaysian culture. Unlike conventional benchmarks, MyCulture employs a novel open-ended multiple-choice question format without predefined options. We analyze structural bias by comparing model performance on structured versus free-form outputs, and assess language bias through multilingual prompt variations.
arXiv Detail & Related papers (2025-08-07T14:17:43Z)
- Beyond Early-Token Bias: Model-Specific and Language-Specific Position Effects in Multilingual LLMs [50.07451351559251]
We present a study across five typologically distinct languages (English, Russian, German, Hindi, and Vietnamese). We examine how position bias interacts with prompt strategies and affects output entropy.
arXiv Detail & Related papers (2025-05-22T02:23:00Z)
- Uncovering Biases with Reflective Large Language Models [2.5200794639628032]
Biases and errors in human-labeled data present significant challenges for machine learning.
We present the Reflective LLM Dialogue Framework (RLDF), which leverages structured adversarial dialogues to uncover diverse perspectives.
Experiments show RLDF successfully identifies potential biases in public content while exposing limitations in human-labeled data.
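For intuition, here is a rough sketch of a reflective adversarial dialogue for surfacing label bias, loosely inspired by the RLDF idea; it is not the paper's exact protocol, and `query_llm` is a hypothetical stand-in for any chat-completion call.

```python
# Rough sketch: alternate 'critic' and 'defender' turns over a human label.
# `query_llm` is a hypothetical stand-in for any chat-completion API.
from typing import Callable

def adversarial_review(text: str, label: str,
                       query_llm: Callable[[str], str],
                       rounds: int = 2) -> list[str]:
    """Collect critic/defender exchanges about a human-assigned label."""
    transcript: list[str] = []
    context = f"Text: {text}\nHuman label: {label}"
    for _ in range(rounds):
        critique = query_llm(
            f"{context}\nAs a critic, argue that this label may reflect annotator bias."
        )
        defense = query_llm(
            f"{context}\nCritique: {critique}\nAs a defender, justify the label."
        )
        transcript += [critique, defense]
        context += f"\nCritique: {critique}\nDefense: {defense}"
    # Persistent disagreement in the transcript flags items for re-annotation.
    return transcript
```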
arXiv Detail & Related papers (2024-08-24T04:48:32Z)
- Spoken Stereoset: On Evaluating Social Bias Toward Speaker in Speech Large Language Models [50.40276881893513]
This study introduces Spoken Stereoset, a dataset specifically designed to evaluate social biases in Speech Large Language Models (SLLMs).
By examining how different models respond to speech from diverse demographic groups, we aim to identify these biases.
The findings indicate that while most models show minimal bias, some still exhibit slightly stereotypical or anti-stereotypical tendencies.
arXiv Detail & Related papers (2024-08-14T16:55:06Z)
- Exploring the Jungle of Bias: Political Bias Attribution in Language Models via Dependency Analysis [86.49858739347412]
Large Language Models (LLMs) have sparked intense debate regarding the prevalence of bias in these models and its mitigation.
We propose a prompt-based method for the extraction of confounding and mediating attributes which contribute to the decision process.
We find that the observed disparate treatment can at least in part be attributed to confounding and mediating attributes and model misalignment.
arXiv Detail & Related papers (2023-11-15T00:02:25Z)
- SeaEval for Multilingual Foundation Models: From Cross-Lingual Alignment to Cultural Reasoning [44.53966523376327]
SeaEval is a benchmark for multilingual foundation models.
We characterize how these models understand and reason with natural language.
We also investigate how well they comprehend cultural practices, nuances, and values.
arXiv Detail & Related papers (2023-09-09T11:42:22Z)
- Bias and Fairness in Large Language Models: A Survey [73.87651986156006]
We present a comprehensive survey of bias evaluation and mitigation techniques for large language models (LLMs).
We first consolidate, formalize, and expand notions of social bias and fairness in natural language processing.
We then unify the literature by proposing three intuitive taxonomies: two for bias evaluation and one for mitigation.
arXiv Detail & Related papers (2023-09-02T00:32:55Z)
- An Analysis of Social Biases Present in BERT Variants Across Multiple Languages [0.0]
We investigate the bias present in monolingual BERT models across a diverse set of languages.
We propose a template-based method to measure any kind of bias, based on sentence pseudo-likelihood.
We conclude that current methods of probing for bias are highly language-dependent.
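For intuition, the minimal sketch below scores a sentence by pseudo-likelihood under a masked LM, in the spirit of such template-based probes; the model choice and the template pair are illustrative assumptions, not the paper's setup.

```python
# Minimal sketch: sentence pseudo-log-likelihood under a masked LM,
# computed by masking one token at a time and summing its log-probability.
import torch
from transformers import AutoModelForMaskedLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-multilingual-cased")
model = AutoModelForMaskedLM.from_pretrained("bert-base-multilingual-cased")
model.eval()

def pseudo_log_likelihood(sentence: str) -> float:
    """Sum log P(token | rest of sentence) over all content positions."""
    ids = tokenizer(sentence, return_tensors="pt")["input_ids"][0]
    total = 0.0
    for i in range(1, len(ids) - 1):  # skip [CLS] and [SEP]
        masked = ids.clone()
        masked[i] = tokenizer.mask_token_id
        with torch.no_grad():
            logits = model(input_ids=masked.unsqueeze(0)).logits[0, i]
        total += torch.log_softmax(logits, dim=-1)[ids[i]].item()
    return total

# Paired templates differing only in the pronoun; a systematic score gap
# across many such pairs is read as a bias signal.
print(pseudo_log_likelihood("The doctor said he was busy."))
print(pseudo_log_likelihood("The doctor said she was busy."))
```

Comparing such paired scores per language is one way to observe the language dependence the authors report.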
arXiv Detail & Related papers (2022-11-25T23:38:08Z)
- Analyzing the Limits of Self-Supervision in Handling Bias in Language [52.26068057260399]
We evaluate how well language models capture the semantics of four bias-related tasks: diagnosis, identification, extraction, and rephrasing.
Our analyses indicate that language models are capable of performing these tasks to widely varying degrees across different bias dimensions, such as gender and political affiliation.
arXiv Detail & Related papers (2021-12-16T05:36:08Z)