Biases Propagate in Encoder-based Vision-Language Models: A Systematic Analysis From Intrinsic Measures to Zero-shot Retrieval Outcomes
- URL: http://arxiv.org/abs/2506.06506v1
- Date: Fri, 06 Jun 2025 20:01:32 GMT
- Title: Biases Propagate in Encoder-based Vision-Language Models: A Systematic Analysis From Intrinsic Measures to Zero-shot Retrieval Outcomes
- Authors: Kshitish Ghate, Tessa Charlesworth, Mona Diab, Aylin Caliskan
- Abstract summary: Social-group biases intrinsic to foundational encoder-based vision-language models (VLMs) manifest in biases in downstream tasks. We introduce a controlled framework to measure this propagation by correlating intrinsic measures of bias in the representational space with measures of bias in zero-shot text-to-image (TTI) and image-to-text (ITT) retrieval. Results show substantial correlations between intrinsic and extrinsic bias, with an average $\rho = 0.83 \pm 0.10$. Notably, we find that larger/better-performing models exhibit greater bias propagation, a finding that raises concerns.
- Score: 14.331322509462419
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: To build fair AI systems we need to understand how social-group biases intrinsic to foundational encoder-based vision-language models (VLMs) manifest in biases in downstream tasks. In this study, we demonstrate that intrinsic biases in VLM representations systematically "carry over" or propagate into zero-shot retrieval tasks, revealing how deeply rooted biases shape a model's outputs. We introduce a controlled framework to measure this propagation by correlating (a) intrinsic measures of bias in the representational space with (b) extrinsic measures of bias in zero-shot text-to-image (TTI) and image-to-text (ITT) retrieval. Results show substantial correlations between intrinsic and extrinsic bias, with an average $\rho = 0.83 \pm 0.10$. This pattern is consistent across 114 analyses, both retrieval directions, six social groups, and three distinct VLMs. Notably, we find that larger/better-performing models exhibit greater bias propagation, a finding that raises concerns given the trend towards increasingly complex AI models. Our framework introduces baseline evaluation tasks to measure the propagation of group and valence signals. Investigations reveal that underrepresented groups experience less robust propagation, further skewing their model-related outcomes.
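As a concrete illustration of the correlational setup the abstract describes, the sketch below computes Spearman's $\rho$ between per-analysis intrinsic and extrinsic bias scores. The scores are synthetic placeholders standing in for embedding-association effect sizes and retrieval-skew measurements; they are not the paper's data.

```python
# Minimal sketch of correlating intrinsic bias with extrinsic
# retrieval bias; all scores below are hypothetical placeholders.
import numpy as np
from scipy.stats import spearmanr

rng = np.random.default_rng(0)

# Hypothetical intrinsic bias: embedding-association effect sizes,
# one per social-group/attribute analysis.
intrinsic_bias = rng.normal(0.5, 0.3, size=20)

# Hypothetical extrinsic bias: retrieval skew in zero-shot TTI/ITT,
# e.g. the gap in retrieval rates between two social groups.
extrinsic_bias = 0.8 * intrinsic_bias + rng.normal(0, 0.1, size=20)

# Spearman's rho quantifies how tightly intrinsic bias predicts
# downstream retrieval bias; the paper reports an average of 0.83.
rho, p_value = spearmanr(intrinsic_bias, extrinsic_bias)
print(f"Spearman rho = {rho:.2f} (p = {p_value:.3g})")
```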
Related papers
- Interpreting Social Bias in LVLMs via Information Flow Analysis and Multi-Round Dialogue Evaluation [1.7997395646080083]
Large Vision Language Models (LVLMs) have achieved remarkable progress in multimodal tasks, yet they also exhibit notable social biases. We propose an explanatory framework that combines information flow analysis with multi-round dialogue evaluation. Experiments reveal that LVLMs exhibit systematic disparities in information usage when processing images of different demographic groups.
arXiv Detail & Related papers (2025-05-27T12:28:44Z)
- Prismatic Synthesis: Gradient-based Data Diversification Boosts Generalization in LLM Reasoning [77.120955854093]
We show that data diversity can be a strong predictor of generalization in language models. We introduce G-Vendi, a metric that quantifies diversity via the entropy of model-induced gradients. We present Prismatic Synthesis, a framework for generating diverse synthetic data.
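The abstract does not spell out how G-Vendi is computed; the sketch below is one plausible reading in the spirit of Vendi-style diversity scores, taking the exponentiated entropy of the eigenvalues of a normalized gradient-similarity kernel. The function name and every detail here are assumptions, not the paper's definition.

```python
# Hypothetical entropy-based diversity score over per-example
# gradients; the exact metric in the paper may differ.
import numpy as np

def gradient_diversity(grads: np.ndarray) -> float:
    """grads: (n_examples, n_params) matrix of model-induced gradients."""
    # Cosine-normalize rows so the kernel entries lie in [-1, 1].
    g = grads / np.linalg.norm(grads, axis=1, keepdims=True)
    k = (g @ g.T) / len(g)                  # normalized similarity kernel, trace 1
    eigs = np.linalg.eigvalsh(k)
    eigs = np.clip(eigs, 1e-12, None)       # guard against tiny negative eigenvalues
    entropy = -np.sum(eigs * np.log(eigs))  # entropy of the eigenvalue spectrum
    return float(np.exp(entropy))           # "effective number" of gradient modes

grads = np.random.default_rng(1).normal(size=(50, 128))
print(gradient_diversity(grads))            # ranges from 1 (redundant) to 50 (diverse)
```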
arXiv Detail & Related papers (2025-05-26T16:05:10Z)
- BiasConnect: Investigating Bias Interactions in Text-to-Image Models [73.76853483463836]
We introduce BiasConnect, a novel tool designed to analyze and quantify bias interactions in Text-to-Image models. Our method provides empirical estimates that indicate how other bias dimensions shift toward or away from an ideal distribution when a given bias is modified. We demonstrate the utility of BiasConnect for selecting optimal bias mitigation axes, comparing different TTI models on the dependencies they learn, and understanding the amplification of intersectional societal biases in TTI models.
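The abstract leaves the interaction estimate abstract; one simple way to realize the idea, sketched below with made-up numbers, is to intervene on one bias axis and measure how far another axis sits from an ideal (here uniform) distribution before versus after.

```python
# Toy bias-interaction estimate: does fixing one bias axis move
# another axis toward uniform? All counts are hypothetical.
import numpy as np

def tv_from_uniform(counts) -> float:
    """Total-variation distance between observed frequencies and uniform."""
    p = np.asarray(counts, dtype=float)
    p /= p.sum()
    return 0.5 * np.abs(p - 1.0 / len(p)).sum()

# Age distribution of generated images before and after a prompt-level
# intervention that balances gender (young / middle-aged / old).
age_before = [700, 250, 50]
age_after = [520, 330, 150]

# A positive value suggests balancing gender also moved age toward uniform.
print(tv_from_uniform(age_before) - tv_from_uniform(age_after))
```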
arXiv Detail & Related papers (2025-03-12T19:01:41Z)
- Exploring Bias in over 100 Text-to-Image Generative Models [49.60774626839712]
We investigate bias trends in text-to-image generative models over time, focusing on the increasing availability of models through open platforms like Hugging Face. We assess bias across three key dimensions: (i) distribution bias, (ii) generative hallucination, and (iii) generative miss-rate. Our findings indicate that artistic and style-transferred models exhibit significant bias, whereas foundation models, benefiting from broader training distributions, are becoming progressively less biased.
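The three dimensions are only named in the abstract; as an illustration of the first, the snippet below scores distribution bias as the KL divergence between observed demographic frequencies and a uniform reference. The definition, group labels, and counts are assumptions for the example.

```python
# Hypothetical "distribution bias" score: how far the demographic
# distribution of generated images drifts from a uniform reference.
import numpy as np

def distribution_bias(counts: dict) -> float:
    """KL divergence between observed group frequencies and uniform."""
    p = np.array(list(counts.values()), dtype=float)
    p /= p.sum()
    u = np.full_like(p, 1.0 / len(p))         # ideal uniform reference
    return float(np.sum(p * np.log(p / u)))   # 0.0 means no skew

# e.g. gender counts over 1000 images from a hypothetical prompt
print(distribution_bias({"woman": 310, "man": 690}))
```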
arXiv Detail & Related papers (2025-03-11T03:40:44Z)
- Intrinsic Bias is Predicted by Pretraining Data and Correlates with Downstream Performance in Vision-Language Encoders [13.474737752636608]
We present the largest comprehensive analysis to date of how the upstream pre-training factors and downstream performance of CLIP models relate to intrinsic biases. We study 131 unique CLIP models, trained on 26 datasets, using 55 architectures, and in a variety of sizes. We find that the choice of pre-training dataset is the most significant upstream predictor of bias, whereas architectural variations have minimal impact.
arXiv Detail & Related papers (2025-02-11T21:11:47Z)
- Images Speak Louder than Words: Understanding and Mitigating Bias in Vision-Language Model from a Causal Mediation Perspective [13.486497323758226]
Vision-language models pre-trained on extensive datasets can inadvertently learn biases by correlating gender information with objects or scenarios. We propose a framework that incorporates causal mediation analysis to measure and map the pathways of bias generation and propagation.
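To make the mediation idea concrete, here is a toy linear decomposition of a bias effect into direct and mediated components on synthetic data; the paper's actual analysis intervenes on model components rather than running this stylized regression.

```python
# Toy causal mediation: split a bias effect into the part flowing
# directly from an input attribute to the output and the part routed
# through an intermediate component. All data are synthetic.
import numpy as np

rng = np.random.default_rng(3)
n = 5000
gender = rng.integers(0, 2, n).astype(float)   # treatment (image attribute)
mediator = 0.9 * gender + rng.normal(0, 1, n)  # e.g. a hypothetical attention pathway
outcome = 0.3 * gender + 0.5 * mediator + rng.normal(0, 1, n)

# Fit the two linear stages by least squares.
a = np.polyfit(gender, mediator, 1)[0]         # gender -> mediator slope
X = np.column_stack([gender, mediator, np.ones(n)])
b_direct, b_med, _ = np.linalg.lstsq(X, outcome, rcond=None)[0]

print(f"direct effect   ~ {b_direct:.2f}")     # ~0.30
print(f"indirect effect ~ {a * b_med:.2f}")    # ~0.9 * 0.5 = 0.45
```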
arXiv Detail & Related papers (2024-07-03T05:19:45Z)
- Benchmarking Zero-Shot Robustness of Multimodal Foundation Models: A Pilot Study [61.65123150513683]
Multimodal foundation models, such as CLIP, produce state-of-the-art zero-shot results.
It is reported that these models close the robustness gap by matching the performance of supervised models trained on ImageNet.
We show that CLIP leads to a significant robustness drop compared to supervised ImageNet models on our benchmark.
arXiv Detail & Related papers (2024-03-15T17:33:49Z)
- Debiasing Vision-Language Models via Biased Prompts [79.04467131711775]
We propose a general approach for debiasing vision-language foundation models by projecting out biased directions in the text embedding.
We show that debiasing only the text embedding with a calibrated projection matrix suffices to yield robust classifiers and fair generative models.
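A minimal sketch of the projection idea, assuming the biased directions are given by embedding differences of contrasting prompt pairs (the paper additionally calibrates the projection, which this sketch omits):

```python
# Debias text embeddings by projecting out a biased subspace.
# How the biased directions are chosen and calibrated in the paper
# differs; here they are random stand-ins for prompt-pair differences.
import numpy as np

def projection_matrix(bias_dirs: np.ndarray) -> np.ndarray:
    """bias_dirs: (k, d) matrix whose rows span the biased subspace."""
    # Orthonormalize the directions, then project them out:
    # P = I - V V^T maps any embedding onto the orthogonal complement.
    v = np.linalg.qr(bias_dirs.T)[0]          # (d, k) orthonormal basis
    return np.eye(bias_dirs.shape[1]) - v @ v.T

rng = np.random.default_rng(2)
d = 512
# e.g. emb("a photo of a male doctor") - emb("a photo of a female doctor")
bias_dirs = rng.normal(size=(2, d))
P = projection_matrix(bias_dirs)

text_emb = rng.normal(size=d)
debiased = P @ text_emb
print(np.abs(bias_dirs @ debiased).max())     # ~0: bias components removed
```

Because only `P` changes the text side, image embeddings stay untouched, which matches the abstract's claim that debiasing the text embedding alone suffices.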
arXiv Detail & Related papers (2023-01-31T20:09:33Z)
- General Greedy De-bias Learning [163.65789778416172]
We propose a General Greedy De-bias learning framework (GGD), which greedily trains the biased models and the base model like gradient descent in functional space.
GGD can learn a more robust base model under the settings of both task-specific biased models with prior knowledge and self-ensemble biased model without prior knowledge.
arXiv Detail & Related papers (2021-12-20T14:47:32Z)