PRISM: Reducing Spurious Implicit Biases in Vision-Language Models with LLM-Guided Embedding Projection
- URL: http://arxiv.org/abs/2507.08979v1
- Date: Fri, 11 Jul 2025 19:24:58 GMT
- Title: PRISM: Reducing Spurious Implicit Biases in Vision-Language Models with LLM-Guided Embedding Projection
- Authors: Mahdiyar Molahasani, Azadeh Motamedi, Michael Greenspan, Il-Min Kim, Ali Etemad
- Abstract summary: We introduce Projection-based Reduction of Implicit Spurious bias in vision-language Models (PRISM). PRISM is a new data-free and task-agnostic solution for bias mitigation in vision-language models.
- Score: 21.96645957850079
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We introduce Projection-based Reduction of Implicit Spurious bias in vision-language Models (PRISM), a new data-free and task-agnostic solution for bias mitigation in VLMs like CLIP. VLMs often inherit and amplify biases in their training data, leading to skewed predictions. PRISM is designed to debias VLMs without relying on predefined bias categories or additional external data. It operates in two stages: first, an LLM is prompted with simple class prompts to generate scene descriptions that contain spurious correlations. Next, PRISM uses our novel contrastive-style debiasing loss to learn a projection that maps the embeddings onto a latent space that minimizes spurious correlations while preserving the alignment between image and text embeddings. Extensive experiments demonstrate that PRISM outperforms current debiasing methods on the commonly used Waterbirds and CelebA datasets. We make our code public at: https://github.com/MahdiyarMM/PRISM.
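The two-stage procedure in the abstract can be sketched numerically. Below is a minimal, hypothetical analogue: toy random vectors stand in for CLIP embeddings and for the LLM-generated spurious-scene descriptions, and a plain squared-similarity penalty stands in for PRISM's contrastive-style loss (the paper's actual objective, data, and learned projection differ):

```python
import numpy as np

rng = np.random.default_rng(0)
d, n, k = 16, 8, 2  # toy sizes; real CLIP embeddings are 512-d or larger

def normalize(x):
    return x / np.linalg.norm(x, axis=-1, keepdims=True)

# Hypothetical stand-ins for the real inputs: class text embeddings, their
# paired image embeddings, and embeddings of LLM-generated scene descriptions
# carrying the spurious correlate (e.g. "a bird over a water background").
T = normalize(rng.normal(size=(n, d)))             # text embeddings
I_ = normalize(T + 0.1 * rng.normal(size=(n, d)))  # aligned image embeddings
S = normalize(rng.normal(size=(k, d)))             # spurious-description embeddings

# Learn a linear projection W that keeps text/image embeddings close to the
# originals while suppressing text-spurious similarity -- a simplified,
# non-contrastive analogue of PRISM's debiasing loss, not the paper's exact one.
W = np.eye(d)
lr, lam = 0.005, 5.0
for _ in range(3000):
    A, B = T @ W, S @ W            # projected text / spurious embeddings
    M = A @ B.T                    # text-spurious similarity matrix
    grad = (2 * T.T @ (A - T)                     # preserve text embeddings
            + 2 * I_.T @ (I_ @ W - I_)            # preserve image embeddings
            + 2 * lam * (T.T @ M @ B + S.T @ M.T @ A))  # shrink similarity
    W -= lr * grad

before = np.abs(T @ S.T).max()
after = np.abs(normalize(T @ W) @ normalize(S @ W).T).max()
print(f"max |text-spurious cosine|: {before:.3f} -> {after:.3f}")
```

After training, projected image-text pairs stay aligned while cosine similarity to the spurious directions drops; PRISM's learned projection plays this role inside CLIP's embedding space.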
Related papers
- DIF: A Framework for Benchmarking and Verifying Implicit Bias in LLMs [1.89915151018241]
We argue that implicit bias in Large Language Models (LLMs) is not only an ethical but also a technical issue. We developed a method for calculating an easily interpretable benchmark, DIF (Demographic Implicit Fairness).
arXiv Detail & Related papers (2025-05-15T06:53:37Z) - Implicit Bias in LLMs: A Survey [2.07180164747172]
This paper provides a comprehensive review of the existing literature on implicit bias in Large Language Models. We begin by introducing key concepts, theories, and methods related to implicit bias in psychology. We categorize detection methods into three primary approaches: word association, task-oriented text generation, and decision-making.
arXiv Detail & Related papers (2025-03-04T16:49:37Z) - Preference Leakage: A Contamination Problem in LLM-as-a-judge [69.96778498636071]
Large Language Models (LLMs) as judges and LLM-based data synthesis have emerged as two fundamental LLM-driven data annotation methods. In this work, we expose preference leakage, a contamination problem in LLM-as-a-judge caused by the relatedness between the synthetic data generators and LLM-based evaluators.
arXiv Detail & Related papers (2025-02-03T17:13:03Z) - Differentially Private Steering for Large Language Model Alignment [55.30573701583768]
We present the first study of aligning Large Language Models with private datasets. Our work proposes the Private Steering for LLM Alignment (PSA) algorithm to edit activations with differential privacy guarantees. Our results show that PSA achieves DP guarantees for LLM alignment with minimal loss in performance.
arXiv Detail & Related papers (2025-01-30T17:58:36Z) - RAZOR: Sharpening Knowledge by Cutting Bias with Unsupervised Text Rewriting [16.633948320306832]
Biases prevalent in manually constructed datasets can introduce spurious correlations between tokens and labels. Existing debiasing methods often rely on prior knowledge of specific dataset biases. We propose RAZOR, a novel, unsupervised, and data-focused debiasing approach based on text rewriting for shortcut mitigation.
arXiv Detail & Related papers (2024-12-10T17:02:58Z) - BendVLM: Test-Time Debiasing of Vision-Language Embeddings [31.033058277888234]
Vision-language model (VLM) embeddings have been shown to encode biases present in their training data.
Debiasing approaches that fine-tune the VLM often suffer from catastrophic forgetting.
We propose Bend-VLM, a nonlinear, fine-tuning-free approach for VLM embedding debiasing.
arXiv Detail & Related papers (2024-11-07T04:16:15Z) - Identifying and Mitigating Social Bias Knowledge in Language Models [52.52955281662332]
We propose a novel debiasing approach, Fairness Stamp (FAST), which enables fine-grained calibration of individual social biases. FAST surpasses state-of-the-art baselines with superior debiasing performance. This highlights the potential of fine-grained debiasing strategies to achieve fairness in large language models.
arXiv Detail & Related papers (2024-08-07T17:14:58Z) - Refining Skewed Perceptions in Vision-Language Contrastive Models through Visual Representations [0.033483662989441935]
Large vision-language contrastive models (VLCMs) have become foundational, demonstrating remarkable success across a variety of downstream tasks. Despite their advantages, these models inherit biases from the disproportionate distribution of real-world data, leading to misconceptions about the actual environment. This study presents an investigation into how a simple linear probe can effectively distill task-specific core features from CLIP's embedding for downstream applications.
arXiv Detail & Related papers (2024-05-22T22:03:11Z) - Prompting Fairness: Integrating Causality to Debias Large Language Models [19.76215433424235]
Large language models (LLMs) are susceptible to generating biased and discriminatory responses. We propose a causality-guided debiasing framework to tackle social biases.
arXiv Detail & Related papers (2024-03-13T17:46:28Z) - Debiasing Multimodal Large Language Models [61.6896704217147]
Large Vision-Language Models (LVLMs) have become indispensable tools in computer vision and natural language processing.
Our investigation reveals a noteworthy bias in the generated content, where the output is influenced primarily by the prior of the underlying Large Language Model (LLM) rather than by the input image.
To rectify these biases and redirect the model's focus toward vision information, we introduce two simple, training-free strategies.
arXiv Detail & Related papers (2024-03-08T12:35:07Z) - ChatGPT Based Data Augmentation for Improved Parameter-Efficient Debiasing of LLMs [65.9625653425636]
Large Language Models (LLMs) exhibit harmful social biases.
This work introduces a novel approach utilizing ChatGPT to generate synthetic training data.
arXiv Detail & Related papers (2024-02-19T01:28:48Z) - Self-Supervised Position Debiasing for Large Language Models [39.261233221850155]
We propose a self-supervised position debiasing (SOD) framework to mitigate position bias for large language models (LLMs).
Experiments on eight datasets and five tasks show that SOD consistently outperforms existing methods in mitigating three types of position biases.
arXiv Detail & Related papers (2024-01-02T14:12:41Z) - Debiasing Vision-Language Models via Biased Prompts [79.04467131711775]
We propose a general approach for debiasing vision-language foundation models by projecting out biased directions in the text embedding.
We show that debiasing only the text embedding with a calibrated projection matrix suffices to yield robust classifiers and fair generative models.
arXiv Detail & Related papers (2023-01-31T20:09:33Z)
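The last entry's approach, projecting out biased directions in the text embedding, can be written down directly as an orthogonal projection that removes the span of a set of biased prompt embeddings. A minimal sketch with toy random vectors standing in for real CLIP text embeddings (the paper's calibrated projection matrix adds a calibration step not shown here):

```python
import numpy as np

rng = np.random.default_rng(1)
d = 16  # toy embedding dimension

def normalize(x):
    return x / np.linalg.norm(x, axis=-1, keepdims=True)

# Hypothetical stand-ins: embeddings of biased prompts (e.g. "a photo of a
# man", "a photo of a woman") whose span we remove from class embeddings.
biased = normalize(rng.normal(size=(2, d)))
class_emb = normalize(rng.normal(size=(5, d)))

# Orthogonal projection onto the complement of the biased subspace:
# P = I - V (V^T V)^{-1} V^T, with the biased embeddings as columns of V.
V = biased.T
P = np.eye(d) - V @ np.linalg.inv(V.T @ V) @ V.T

debiased = normalize(class_emb @ P)  # P is symmetric, so P == P.T

# Debiased embeddings are now orthogonal to every biased direction.
print(np.abs(debiased @ biased.T).max())  # numerically ~0
```

A zero-shot classifier built from `debiased` then scores images without resolving the biased attribute, which is the sense in which debiasing only the text embedding suffices.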
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.