Counterfactually Measuring and Eliminating Social Bias in
Vision-Language Pre-training Models
- URL: http://arxiv.org/abs/2207.01056v1
- Date: Sun, 3 Jul 2022 14:39:32 GMT
- Title: Counterfactually Measuring and Eliminating Social Bias in
Vision-Language Pre-training Models
- Authors: Yi Zhang, Junyang Wang, Jitao Sang
- Abstract summary: We introduce a counterfactual-based bias measurement, CounterBias, to quantify the social bias in Vision-Language Pre-training models.
We also construct a novel VL-Bias dataset including 24K image-text pairs for measuring gender bias.
- Score: 13.280828458515062
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Vision-Language Pre-training (VLP) models have achieved state-of-the-art
performance in numerous cross-modal tasks. Since they are optimized to capture
the statistical properties of intra- and inter-modality data, there remains a
risk of learning the social biases present in the data as well. In this work,
we (1) introduce a counterfactual-based bias measurement, CounterBias, to
quantify the social bias in VLP models by comparing the [MASK]ed prediction
probabilities of factual and counterfactual samples; (2) construct a novel
VL-Bias dataset including 24K image-text pairs for measuring gender bias in
VLP models, from which we observe that significant gender bias is prevalent;
and (3) propose a debiasing method, FairVLP, which minimizes the difference
in the [MASK]ed prediction probabilities between factual and counterfactual
image-text pairs. Although CounterBias and FairVLP focus on social bias, they
are generalizable tools that can provide new insights for probing and
regularizing other knowledge in VLP models.
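As a concrete illustration of the mechanism described above, the sketch below computes a CounterBias-style score and a FairVLP-style objective, assuming the model's masked gender-word probabilities for each factual/counterfactual pair have already been extracted; the tensors and the L1 form of the penalty are illustrative assumptions, not the authors' released code.

```python
import torch

def counterbias_score(p_fact: torch.Tensor, p_cf: torch.Tensor) -> torch.Tensor:
    # CounterBias compares the [MASK]ed prediction probability of a gender
    # word (e.g. "man" in "The [MASK] is cooking.") under the factual sample
    # with the probability under its gender-swapped counterfactual; the gap
    # is the per-pair bias signal.
    return p_fact - p_cf

def fairvlp_loss(p_fact: torch.Tensor, p_cf: torch.Tensor) -> torch.Tensor:
    # FairVLP-style objective: train the model to shrink that gap so factual
    # and counterfactual pairs are predicted symmetrically. The L1 penalty
    # here is one natural choice; the paper's exact formulation may differ.
    return (p_fact - p_cf).abs().mean()

# Toy usage with made-up masked-prediction probabilities for two pairs.
p_fact = torch.tensor([0.81, 0.64])  # P([MASK]="man" | factual image-text)
p_cf   = torch.tensor([0.55, 0.30])  # P([MASK]="man" | counterfactual pair)
print(counterbias_score(p_fact, p_cf))  # per-pair bias gaps
print(fairvlp_loss(p_fact, p_cf))       # scalar debiasing loss
```

The intuition is symmetry: an unbiased model should find the gender word equally predictable whether the paired evidence depicts a man or a woman.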
Related papers
- Joint Vision-Language Social Bias Removal for CLIP [16.954442426379913]
We propose a novel V-L debiasing framework that first aligns image and text biases and then removes them from both modalities.
We believe this work will offer new insights and guidance for future studies addressing the social bias problem in CLIP.
arXiv Detail & Related papers (2024-11-19T10:14:26Z)
- Scaling Laws for Predicting Downstream Performance in LLMs [75.28559015477137]
This work focuses on the pre-training loss as a more efficient metric for performance estimation.
We extend the power law analytical function to predict domain-specific pre-training loss based on FLOPs across data sources.
We employ a two-layer neural network to model the non-linear relationship between multiple domain-specific losses and downstream performance (see the sketch below).
arXiv Detail & Related papers (2024-10-11T04:57:48Z)
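A rough sketch of that two-step recipe, under stated assumptions: the FLOPs/loss numbers are made up, the power-law fit uses `scipy.optimize.curve_fit`, and the two-layer network is an untrained stand-in; none of this is the paper's actual setup.

```python
import numpy as np
from scipy.optimize import curve_fit
import torch
import torch.nn as nn

# Step 1: fit L(C) = a * C^(-b) + c per data source, mapping compute (FLOPs,
# rescaled for numerical stability) to that domain's pre-training loss.
def power_law(C, a, b, c):
    return a * C ** (-b) + c

flops = np.array([1e18, 1e19, 1e20, 1e21]) / 1e18   # rescaled compute
loss = np.array([3.2, 2.7, 2.35, 2.1])              # made-up domain losses
(a, b, c), _ = curve_fit(power_law, flops, loss, p0=(1.5, 0.3, 1.8), maxfev=10000)
print(power_law(1e22 / 1e18, a, b, c))              # extrapolated loss

# Step 2: a two-layer network maps each checkpoint's domain-specific losses
# to a downstream performance score, capturing their non-linear relationship.
net = nn.Sequential(nn.Linear(4, 16), nn.ReLU(), nn.Linear(16, 1))
domain_losses = torch.randn(32, 4)                  # 32 checkpoints, 4 domains
print(net(domain_losses).shape)                     # torch.Size([32, 1])
```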
- Editable Fairness: Fine-Grained Bias Mitigation in Language Models [52.66450426729818]
We propose a novel debiasing approach, Fairness Stamp (FAST), which enables fine-grained calibration of individual social biases.
FAST surpasses state-of-the-art baselines in debiasing performance.
This highlights the potential of fine-grained debiasing strategies to achieve fairness in large language models.
arXiv Detail & Related papers (2024-08-07T17:14:58Z)
- GenderBias-VL: Benchmarking Gender Bias in Vision Language Models via Counterfactual Probing [72.0343083866144]
This paper introduces the GenderBias-VL benchmark to evaluate occupation-related gender bias in Large Vision-Language Models.
Using our benchmark, we extensively evaluate 15 commonly used open-source LVLMs and state-of-the-art commercial APIs.
Our findings reveal widespread gender biases in existing LVLMs (a toy version of the probe is sketched below).
arXiv Detail & Related papers (2024-06-30T05:55:15Z)
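In the spirit of that benchmark, here is a toy counterfactual probe: ask the same occupation question about an image and its gender-swapped counterfactual, and count how often the answer flips. The `ask` callable and the canned answers are hypothetical stand-ins for a real LVLM call, purely for illustration.

```python
def occupation_flip_rate(pairs, ask, question="What is this person's occupation?"):
    # Fraction of counterfactual image pairs whose answer changes when only
    # the perceived gender changes; a higher rate signals occupation bias.
    flips = sum(ask(fact, question) != ask(cf, question) for fact, cf in pairs)
    return flips / len(pairs)

# Dummy stand-in for an LVLM, just to make the sketch executable.
canned = {"doctor_m": "doctor", "doctor_f": "nurse",
          "chef_m": "chef", "chef_f": "chef"}
ask = lambda image, q: canned[image]

pairs = [("doctor_m", "doctor_f"), ("chef_m", "chef_f")]
print(occupation_flip_rate(pairs, ask))  # 0.5: one of two answers flipped
```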
- Test-time Distribution Learning Adapter for Cross-modal Visual Reasoning [16.998833621046117]
We propose the Test-Time Distribution LearNing Adapter (TT-DNA), which operates directly at test time.
Specifically, we estimate Gaussian distributions to model visual features of the few-shot support images to capture the knowledge from the support set.
Our extensive experiments on visual reasoning for human-object interaction demonstrate that TT-DNA outperforms existing state-of-the-art methods by large margins (the Gaussian modeling step is sketched below).
arXiv Detail & Related papers (2024-03-10T01:34:45Z)
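A minimal sketch of the Gaussian modeling step, assuming pre-extracted visual features (for example from a frozen CLIP image encoder): fit one diagonal Gaussian per class over the few-shot support features and score queries by log-likelihood. The feature dimension, the diagonal restriction, and the variance floor are illustrative choices, not the paper's exact design.

```python
import torch

def fit_gaussians(support_feats, labels, n_cls):
    # One diagonal Gaussian per class over the support set's visual features.
    means, vars_ = [], []
    for c in range(n_cls):
        f = support_feats[labels == c]                   # (n_shot, dim)
        means.append(f.mean(0))
        vars_.append(f.var(0, unbiased=False) + 1e-4)    # floor for stability
    return torch.stack(means), torch.stack(vars_)

def gaussian_logits(query, means, vars_):
    # Log-likelihood of each query feature under each class Gaussian,
    # usable as test-time logits (possibly blended with CLIP's own logits).
    diff = query[:, None, :] - means[None]               # (B, n_cls, dim)
    return -0.5 * ((diff ** 2 / vars_[None]) + vars_[None].log()).sum(-1)

# Toy usage: 2 classes, 5 support shots each, 512-d features.
feats = torch.randn(10, 512)
labels = torch.tensor([0] * 5 + [1] * 5)
means, vars_ = fit_gaussians(feats, labels, n_cls=2)
print(gaussian_logits(torch.randn(3, 512), means, vars_).shape)  # (3, 2)
```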
- Survey of Social Bias in Vision-Language Models [65.44579542312489]
This survey aims to provide researchers with high-level insight into the similarities and differences of social bias studies in pre-trained models across NLP, CV, and VL.
The findings and recommendations presented here can benefit the ML community, fostering the development of fairer, less biased AI models.
arXiv Detail & Related papers (2023-09-24T15:34:56Z)
- Probing Cross-modal Semantics Alignment Capability from the Textual Perspective [52.52870614418373]
Aligning cross-modal semantics is claimed to be one of the essential capabilities of vision and language pre-training models.
We propose a new probing method based on image captioning to empirically study the cross-modal semantics alignment of VLP models.
arXiv Detail & Related papers (2022-10-18T02:55:58Z)
- VL-CheckList: Evaluating Pre-trained Vision-Language Models with Objects, Attributes and Relations [28.322824790738768]
Vision-Language Pretraining models have successfully facilitated many cross-modal downstream tasks.
Most existing works evaluate their systems by comparing fine-tuned downstream task performance.
Inspired by CheckList for testing natural language processing, we propose VL-CheckList, a novel evaluation framework.
arXiv Detail & Related papers (2022-07-01T06:25:53Z)
- VLUE: A Multi-Task Benchmark for Evaluating Vision-Language Models [21.549122658275383]
Recent advances in vision-language pre-training have demonstrated impressive performance in a range of vision-language tasks.
We introduce the Vision-Language Understanding Evaluation (VLUE) benchmark, a multi-task, multi-dimensional benchmark for evaluating the generalization capabilities and the efficiency-performance trade-offs of VLP models.
arXiv Detail & Related papers (2022-05-30T16:52:30Z)
- Towards More Fine-grained and Reliable NLP Performance Prediction [85.78131503006193]
We make two contributions to improving performance prediction for NLP tasks.
First, we examine performance predictors for holistic measures of accuracy like F1 or BLEU.
Second, we propose methods to understand the reliability of a performance prediction model from two angles: confidence intervals and calibration (a toy version of both is sketched below).
arXiv Detail & Related papers (2021-02-10T15:23:20Z)
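To illustrate those two reliability angles, the sketch below bootstraps a confidence interval for a predictor's mean absolute error and runs a crude coverage check against a nominal interval width; the predicted/observed scores and the ±2 half-width are made-up numbers, not results from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative setup: a performance predictor's outputs vs. the scores
# actually observed on held-out task/model pairs (made-up numbers).
predicted = np.array([71.0, 68.5, 82.3, 64.8, 77.1])
observed = np.array([69.2, 70.1, 80.0, 66.5, 75.9])

# Angle 1: bootstrap a confidence interval for the predictor's mean
# absolute error.
errors = np.abs(predicted - observed)
boot = [rng.choice(errors, size=errors.size, replace=True).mean()
        for _ in range(10_000)]
lo, hi = np.percentile(boot, [2.5, 97.5])
print(f"MAE 95% CI: [{lo:.2f}, {hi:.2f}]")

# Angle 2: calibration. A well-calibrated predictor's nominal intervals
# should cover the observed score about as often as they claim.
halfwidth = 2.0  # assumed predictor-reported +/- 2 points
coverage = np.mean(errors <= halfwidth)
print(f"empirical coverage at +/-{halfwidth}: {coverage:.0%}")
```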
This list is automatically generated from the titles and abstracts of the papers in this site.