Interpreting Social Bias in LVLMs via Information Flow Analysis and Multi-Round Dialogue Evaluation
- URL: http://arxiv.org/abs/2505.21106v1
- Date: Tue, 27 May 2025 12:28:44 GMT
- Title: Interpreting Social Bias in LVLMs via Information Flow Analysis and Multi-Round Dialogue Evaluation
- Authors: Zhengyang Ji, Yifan Jia, Shang Gao, Yutao Yue
- Abstract summary: Large Vision Language Models (LVLMs) have achieved remarkable progress in multimodal tasks, yet they also exhibit notable social biases. We propose an explanatory framework that combines information flow analysis with multi-round dialogue evaluation. Experiments reveal that LVLMs exhibit systematic disparities in information usage when processing images of different demographic groups.
- Score: 1.7997395646080083
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Large Vision Language Models (LVLMs) have achieved remarkable progress in multimodal tasks, yet they also exhibit notable social biases. These biases often manifest as unintended associations between neutral concepts and sensitive human attributes, leading to disparate model behaviors across demographic groups. While existing studies primarily focus on detecting and quantifying such biases, they offer limited insight into the underlying mechanisms within the models. To address this gap, we propose an explanatory framework that combines information flow analysis with multi-round dialogue evaluation, aiming to understand the origin of social bias from the perspective of imbalanced internal information utilization. Specifically, we first identify high-contribution image tokens involved in the model's reasoning process for neutral questions via information flow analysis. Then, we design a multi-turn dialogue mechanism to evaluate the extent to which these key tokens encode sensitive information. Extensive experiments reveal that LVLMs exhibit systematic disparities in information usage when processing images of different demographic groups, suggesting that social bias is deeply rooted in the model's internal reasoning dynamics. Furthermore, we complement our findings from a textual modality perspective, showing that the model's semantic representations already display biased proximity patterns, thereby offering a cross-modal explanation of bias formation.
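The abstract describes locating high-contribution image tokens via information flow analysis before probing them in follow-up dialogue turns. The paper's exact procedure is not reproduced here; the snippet below is a minimal sketch, assuming the attention maps from an LVLM forward pass serve as the information-flow signal, of how image tokens might be ranked by the attention they receive from answer tokens. The tensor shapes, token positions, and toy inputs are illustrative assumptions, not the authors' implementation.

```python
import torch

def image_token_contributions(attentions, image_slice, answer_slice):
    """attentions: list of per-layer tensors of shape [batch, heads, seq, seq].
    Returns one contribution score per image-token position."""
    per_layer = []
    for layer_attn in attentions:
        # Attention mass flowing from answer positions to image positions.
        flow = layer_attn[:, :, answer_slice, image_slice]
        per_layer.append(flow.mean(dim=(0, 1, 2)))  # average over batch, heads, answer tokens
    return torch.stack(per_layer).mean(dim=0)       # average over layers

# Toy example: 2 layers, batch of 1, 4 heads, a 10-token sequence with
# image tokens at positions 1-6 and answer tokens at positions 8-9.
attentions = [torch.rand(1, 4, 10, 10).softmax(dim=-1) for _ in range(2)]
scores = image_token_contributions(attentions, slice(1, 7), slice(8, 10))
top_tokens = torch.topk(scores, k=3).indices  # candidate high-contribution image tokens
print(scores.tolist(), top_tokens.tolist())
```

In the framework the abstract describes, tokens selected this way would then be probed in later dialogue turns to test whether they encode sensitive attributes; the aggregation above (mean over heads and layers) is only one plausible choice of information-flow heuristic.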
Related papers
- Assessing the Reliability of LLMs Annotations in the Context of Demographic Bias and Model Explanation [5.907945985868999]
This study investigates the extent to which annotator demographic features influence labeling decisions compared to text content. Using a Generalized Linear Mixed Model, we quantify this influence, finding that demographic factors account for a minor fraction (8%) of the observed variance. We then assess the reliability of Generative AI (GenAI) models as annotators, specifically evaluating if guiding them with demographic personas improves alignment with human judgments.
arXiv Detail & Related papers (2025-07-17T14:00:13Z) - Biases Propagate in Encoder-based Vision-Language Models: A Systematic Analysis From Intrinsic Measures to Zero-shot Retrieval Outcomes [14.331322509462419]
Social-group biases intrinsic to foundational encoder-based vision-language models (VLMs) manifest as biases in downstream tasks. We introduce a controlled framework to measure this propagation by correlating intrinsic measures of bias in the representational space with measures of bias in zero-shot text-to-image (TTI) and image-to-text (ITT) retrieval. Results show substantial correlations between intrinsic and extrinsic bias, with an average ρ = 0.83 ± 0.10. Notably, we find that larger/better-performing models exhibit greater bias propagation, a finding that raises concerns.
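As a toy illustration of the correlational analysis this entry reports (not the authors' code or data), one could compute Spearman's ρ between intrinsic bias scores and retrieval bias scores measured over the same group/concept pairs. The values below are made up for the sketch; real inputs would come from an embedding-association measure and a TTI/ITT retrieval-skew measure.

```python
from scipy.stats import spearmanr

# Hypothetical per-pair bias scores (intrinsic representational bias vs.
# extrinsic zero-shot retrieval bias) for six group/concept pairs.
intrinsic_bias = [0.12, 0.35, 0.08, 0.51, 0.27, 0.44]
retrieval_bias = [0.10, 0.40, 0.05, 0.47, 0.30, 0.39]

rho, p_value = spearmanr(intrinsic_bias, retrieval_bias)
print(f"Spearman rho = {rho:.2f} (p = {p_value:.3f})")
```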
arXiv Detail & Related papers (2025-06-06T20:01:32Z) - A Closer Look at Bias and Chain-of-Thought Faithfulness of Large (Vision) Language Models [53.18562650350898]
Chain-of-thought (CoT) reasoning enhances the performance of large language models. We present the first comprehensive study of CoT faithfulness in large vision-language models.
arXiv Detail & Related papers (2025-05-29T18:55:05Z) - VisuLogic: A Benchmark for Evaluating Visual Reasoning in Multi-modal Large Language Models [121.03333569013148]
We introduce VisuLogic: a benchmark of 1,000 human-verified problems across six categories. These questions assess the visual reasoning capabilities of MLLMs from multiple perspectives. Most models score below 30% accuracy, only slightly above the 25% random baseline and far below the 51.4% achieved by humans.
arXiv Detail & Related papers (2025-04-21T17:59:53Z) - VERIFY: A Benchmark of Visual Explanation and Reasoning for Investigating Multimodal Reasoning Fidelity [34.29409506366145]
VERIFY is a benchmark designed to isolate and rigorously evaluate the visual reasoning capabilities of state-of-the-art MLLMs. Each problem is accompanied by a human-annotated reasoning path, making it the first to provide in-depth evaluation of model decision-making processes. We propose novel metrics that assess visual reasoning fidelity beyond mere accuracy, highlighting critical imbalances in current model reasoning patterns.
arXiv Detail & Related papers (2025-03-14T16:26:11Z) - Fine-Grained Bias Detection in LLM: Enhancing detection mechanisms for nuanced biases [0.0]
This study presents a detection framework to identify nuanced biases in Large Language Models (LLMs). The approach integrates contextual analysis, interpretability via attention mechanisms, and counterfactual data augmentation to capture hidden biases. Results show improvements in detecting subtle biases compared to conventional methods.
arXiv Detail & Related papers (2025-03-08T04:43:01Z) - Measuring Agreeableness Bias in Multimodal Models [0.3529736140137004]
This paper examines a phenomenon in multimodal language models where pre-marked options in question images can influence model responses.
We present models with images of multiple-choice questions, which they initially answer correctly, then expose the same models to versions with pre-marked options.
Our findings reveal a significant shift in the models' responses towards the pre-marked option, even when it contradicts their answers in the neutral settings.
arXiv Detail & Related papers (2024-08-17T06:25:36Z) - Towards Context-Aware Emotion Recognition Debiasing from a Causal Demystification Perspective via De-confounded Training [14.450673163785094]
Context-Aware Emotion Recognition (CAER) provides valuable semantic cues for recognizing the emotions of target persons.
Current approaches invariably focus on designing sophisticated structures to extract perceptually critical representations from contexts.
We present a Contextual Causal Intervention Module (CCIM) to de-confound the confounder.
arXiv Detail & Related papers (2024-07-06T05:29:02Z) - Images Speak Louder than Words: Understanding and Mitigating Bias in Vision-Language Model from a Causal Mediation Perspective [13.486497323758226]
Vision-language models pre-trained on extensive datasets can inadvertently learn biases by correlating gender information with objects or scenarios.
We propose a framework that incorporates causal mediation analysis to measure and map the pathways of bias generation and propagation.
arXiv Detail & Related papers (2024-07-03T05:19:45Z) - VALOR-EVAL: Holistic Coverage and Faithfulness Evaluation of Large Vision-Language Models [57.43276586087863]
Large Vision-Language Models (LVLMs) suffer from hallucination issues, wherein the models generate plausible-sounding but factually incorrect outputs.
Existing benchmarks are often limited in scope, focusing mainly on object hallucinations.
We introduce a multi-dimensional benchmark covering objects, attributes, and relations, with challenging images selected based on associative biases.
arXiv Detail & Related papers (2024-04-22T04:49:22Z) - Exploring the Jungle of Bias: Political Bias Attribution in Language Models via Dependency Analysis [86.49858739347412]
Large Language Models (LLMs) have sparked intense debate regarding the prevalence of bias in these models and its mitigation.
We propose a prompt-based method for the extraction of confounding and mediating attributes which contribute to the decision process.
We find that the observed disparate treatment can, at least in part, be attributed to confounding and mediating attributes and to model misalignment.
arXiv Detail & Related papers (2023-11-15T00:02:25Z) - Bias and Fairness in Large Language Models: A Survey [73.87651986156006]
We present a comprehensive survey of bias evaluation and mitigation techniques for large language models (LLMs).
We first consolidate, formalize, and expand notions of social bias and fairness in natural language processing.
We then unify the literature by proposing three intuitive taxonomies: two for bias evaluation and one for mitigation.
arXiv Detail & Related papers (2023-09-02T00:32:55Z) - Causal Reasoning Meets Visual Representation Learning: A Prospective Study [117.08431221482638]
The lack of interpretability, robustness, and out-of-distribution generalization is becoming a challenge for existing visual models.
Inspired by the strong inference ability of human-level agents, recent years have witnessed great effort in developing causal reasoning paradigms.
This paper aims to provide a comprehensive overview of this emerging field, attract attention, encourage discussion, and bring to the forefront the urgency of developing novel causal reasoning methods.
arXiv Detail & Related papers (2022-04-26T02:22:28Z)