Social Debiasing for Fair Multi-modal LLMs
- URL: http://arxiv.org/abs/2408.06569v2
- Date: Wed, 20 Aug 2025 07:24:46 GMT
- Title: Social Debiasing for Fair Multi-modal LLMs
- Authors: Harry Cheng, Yangyang Guo, Qingpei Guo, Ming Yang, Tian Gan, Weili Guan, Liqiang Nie,
- Abstract summary: Multi-modal Large Language Models (MLLMs) have dramatically advanced the research field and delivered powerful vision-language understanding capabilities.<n>These models often inherit deep-rooted social biases from their training data, leading to uncomfortable responses with respect to attributes such as race and gender.<n>This paper addresses the issue of social biases in MLLMs by introducing a comprehensive counterfactual dataset with multiple social concepts.
- Score: 59.61512883471714
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Multi-modal Large Language Models (MLLMs) have dramatically advanced the research field and delivered powerful vision-language understanding capabilities. However, these models often inherit deep-rooted social biases from their training data, leading to uncomfortable responses with respect to attributes such as race and gender. This paper addresses the issue of social biases in MLLMs by i) introducing a comprehensive counterfactual dataset with multiple social concepts (CMSC), which complements existing datasets by providing 18 diverse and balanced social concepts; and ii) proposing a counter-stereotype debiasing (CSD) strategy that mitigates social biases in MLLMs by leveraging the opposites of prevalent stereotypes. CSD incorporates both a novel bias-aware data sampling method and a loss rescaling method, enabling the model to effectively reduce biases. We conduct extensive experiments with four prevalent MLLM architectures. The results demonstrate the advantage of the CMSC dataset and the edge of CSD strategy in reducing social biases compared to existing competing methods, without compromising the overall performance on general multi-modal reasoning benchmarks.
Related papers
- Interpretable Debiasing of Vision-Language Models for Social Fairness [55.85977929985967]
We introduce an interpretable, model-agnostic bias mitigation framework, DeBiasLens, that localizes social attribute neurons in Vision-Language models.<n>We train SAEs on facial image or caption datasets without corresponding social attribute labels to uncover neurons highly responsive to specific demographics.<n>Our research lays the groundwork for future auditing tools, prioritizing social fairness in emerging real-world AI systems.
arXiv Detail & Related papers (2026-02-27T13:37:11Z) - Addressing Bias in LLMs: Strategies and Application to Fair AI-based Recruitment [49.81946749379338]
This work seeks to analyze the capacity of Transformers-based systems to learn demographic biases present in the data.<n>We propose a privacy-enhancing framework to reduce gender information from the learning pipeline as a way to mitigate biased behaviors in the final tools.
arXiv Detail & Related papers (2025-06-13T15:29:43Z) - MLLMs are Deeply Affected by Modality Bias [158.64371871084478]
Recent advances in Multimodal Large Language Models (MLLMs) have shown promising results in integrating diverse modalities such as texts and images.<n>MLLMs are heavily influenced by modality bias, often relying on language while under-utilizing other modalities like visual inputs.<n>This paper argues that MLLMs are deeply affected by modality bias, highlighting its manifestations across various tasks.
arXiv Detail & Related papers (2025-05-24T11:49:31Z) - Distilling Transitional Pattern to Large Language Models for Multimodal Session-based Recommendation [67.84581846180458]
Session-based recommendation (SBR) predicts the next item based on anonymous sessions.
Recent Multimodal SBR methods utilize simplistic pre-trained models for modality learning but have limitations in semantic richness.
We propose a multimodal LLM-enhanced framework TPAD, which extends a distillation paradigm to decouple and align transitional patterns for promoting MSBR.
arXiv Detail & Related papers (2025-04-13T07:49:08Z) - SB-Bench: Stereotype Bias Benchmark for Large Multimodal Models [42.54907891780377]
Stereotype biases in Large Multimodal Models (LMMs) perpetuate harmful societal prejudices.<n>Existing datasets evaluating stereotype biases in LMMs often lack diversity and rely on synthetic images.<n>We introduce the Stereotype Bias Benchmark (SB-bench), the most comprehensive framework to date for assessing stereotype biases.
arXiv Detail & Related papers (2025-02-12T20:41:53Z) - Understanding and Mitigating the Bias Inheritance in LLM-based Data Augmentation on Downstream Tasks [24.706895491806794]
This work presents the first systematic investigation in understanding, analyzing, and mitigating bias inheritance.
We analyze how 6 different types of biases manifest at varying bias ratios.
We propose three mitigation strategies: token-based, mask-based, and loss-based approaches.
arXiv Detail & Related papers (2025-02-06T15:20:58Z) - Fair In-Context Learning via Latent Concept Variables [17.216196320585922]
Large language models (LLMs) can inherit social bias and discrimination from their pre-training data.
We design data augmentation strategies that reduce correlation between predictive outcomes and sensitive variables helping to promote fairness during latent concept learning.
arXiv Detail & Related papers (2024-11-04T23:10:05Z) - Towards Implicit Bias Detection and Mitigation in Multi-Agent LLM Interactions [25.809599403713506]
Large Language Models (LLMs) are increasingly being employed in numerous studies to simulate societies and execute diverse social tasks.
LLMs are susceptible to societal biases due to their exposure to human-generated data.
This study investigates the presence of implicit gender biases in multi-agent LLM interactions and proposes two strategies to mitigate these biases.
arXiv Detail & Related papers (2024-10-03T15:28:05Z) - A Multi-LLM Debiasing Framework [85.17156744155915]
Large Language Models (LLMs) are powerful tools with the potential to benefit society immensely, yet, they have demonstrated biases that perpetuate societal inequalities.
Recent research has shown a growing interest in multi-LLM approaches, which have been demonstrated to be effective in improving the quality of reasoning.
We propose a novel multi-LLM debiasing framework aimed at reducing bias in LLMs.
arXiv Detail & Related papers (2024-09-20T20:24:50Z) - CEB: Compositional Evaluation Benchmark for Fairness in Large Language Models [58.57987316300529]
Large Language Models (LLMs) are increasingly deployed to handle various natural language processing (NLP) tasks.
To evaluate the biases exhibited by LLMs, researchers have recently proposed a variety of datasets.
We propose CEB, a Compositional Evaluation Benchmark that covers different types of bias across different social groups and tasks.
arXiv Detail & Related papers (2024-07-02T16:31:37Z) - Understanding Intrinsic Socioeconomic Biases in Large Language Models [4.276697874428501]
We introduce a novel dataset of one million English sentences to quantify socioeconomic biases.
Our findings reveal pervasive socioeconomic biases in both established models like GPT-2 and state-of-the-art models like Llama 2 and Falcon.
arXiv Detail & Related papers (2024-05-28T23:54:44Z) - Prompting Fairness: Integrating Causality to Debias Large Language Models [19.76215433424235]
Large language models (LLMs) are susceptible to generating biased and discriminatory responses.<n>We propose a causality-guided debiasing framework to tackle social biases.
arXiv Detail & Related papers (2024-03-13T17:46:28Z) - The Common Stability Mechanism behind most Self-Supervised Learning
Approaches [64.40701218561921]
We provide a framework to explain the stability mechanism of different self-supervised learning techniques.
We discuss the working mechanism of contrastive techniques like SimCLR, non-contrastive techniques like BYOL, SWAV, SimSiam, Barlow Twins, and DINO.
We formulate different hypotheses and test them using the Imagenet100 dataset.
arXiv Detail & Related papers (2024-02-22T20:36:24Z) - Large Language Model (LLM) Bias Index -- LLMBI [0.0]
The Large Language Model Bias Index (LLMBI) is a pioneering approach designed to quantify and address biases inherent in large language models (LLMs)
We formulated LLMBI using a composite scoring system incorporating multiple dimensions of bias, including but not limited to age, gender, and racial biases.
Our empirical analysis, conducted using responses from OpenAI's API, employs advanced sentiment analysis as a representative method for bias detection.
arXiv Detail & Related papers (2023-12-22T15:38:13Z) - Bias and Fairness in Large Language Models: A Survey [73.87651986156006]
We present a comprehensive survey of bias evaluation and mitigation techniques for large language models (LLMs)
We first consolidate, formalize, and expand notions of social bias and fairness in natural language processing.
We then unify the literature by proposing three intuitive, two for bias evaluation, and one for mitigation.
arXiv Detail & Related papers (2023-09-02T00:32:55Z) - Non-Invasive Fairness in Learning through the Lens of Data Drift [88.37640805363317]
We show how to improve the fairness of Machine Learning models without altering the data or the learning algorithm.
We use a simple but key insight: the divergence of trends between different populations, and, consecutively, between a learned model and minority populations, is analogous to data drift.
We explore two strategies (model-splitting and reweighing) to resolve this drift, aiming to improve the overall conformance of models to the underlying data.
arXiv Detail & Related papers (2023-03-30T17:30:42Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.