Counterfactually Measuring and Eliminating Social Bias in
Vision-Language Pre-training Models
- URL: http://arxiv.org/abs/2207.01056v1
- Date: Sun, 3 Jul 2022 14:39:32 GMT
- Title: Counterfactually Measuring and Eliminating Social Bias in
Vision-Language Pre-training Models
- Authors: Yi Zhang, Junyang Wang, Jitao Sang
- Abstract summary: We introduce a counterfactual-based bias measurement, CounterBias, to quantify the social bias in Vision-Language Pre-training models.
We also construct a novel VL-Bias dataset including 24K image-text pairs for measuring gender bias.
- Score: 13.280828458515062
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Vision-Language Pre-training (VLP) models have achieved state-of-the-art
performance in numerous cross-modal tasks. Since they are optimized to capture
statistical properties within and across modalities, there remains a risk that
they also learn the social biases present in the data. In this work, we (1)
introduce a counterfactual-based bias measurement, CounterBias, to
quantify the social bias in VLP models by comparing the [MASK]ed prediction
probabilities of factual and counterfactual samples; (2) construct a novel
VL-Bias dataset of 24K image-text pairs for measuring gender bias in VLP
models, from which we observe that significant gender bias is prevalent in VLP
models; and (3) propose a VLP debiasing method, FairVLP, that minimizes the
difference in the [MASK]ed prediction probabilities between factual and
counterfactual image-text pairs. Although CounterBias and
FairVLP focus on social bias, they generalize as tools that
provide new insights for probing and regularizing other knowledge in VLP models.
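The comparison the abstract describes can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the `Probs` type, and the toy probabilities are hypothetical, and the exact CounterBias formula and FairVLP objective in the paper may differ. The sketch only assumes that a VLP model yields a probability for a target word at the [MASK] position, once for a factual image-text pair and once for its counterfactual.

```python
from typing import Dict, List, Tuple

# Hypothetical container: candidate word -> P(word at [MASK] | image, text).
Probs = Dict[str, float]


def counterfactual_gap(factual: Probs, counterfactual: Probs, target: str) -> float:
    """Signed change in the [MASK] probability of `target` between the
    factual sample and its counterfactual."""
    return factual[target] - counterfactual[target]


def mean_gap(pairs: List[Tuple[Probs, Probs]], target: str) -> float:
    """Average signed gap over (factual, counterfactual) pairs.
    A value far from zero suggests the prediction depends on the
    attribute that was changed to build the counterfactual."""
    gaps = [counterfactual_gap(f, c, target) for f, c in pairs]
    return sum(gaps) / len(gaps)


def gap_penalty(pairs: List[Tuple[Probs, Probs]], target: str) -> float:
    """Average absolute gap: the kind of quantity a debiasing objective
    could add to the training loss and minimize (assumed form)."""
    return sum(abs(counterfactual_gap(f, c, target)) for f, c in pairs) / len(pairs)


# Toy, made-up probabilities for illustration only (not results from the paper).
toy_pairs = [
    ({"shopping": 0.30}, {"shopping": 0.50}),
    ({"shopping": 0.20}, {"shopping": 0.40}),
]
print(mean_gap(toy_pairs, "shopping"))
print(gap_penalty(toy_pairs, "shopping"))
```

In this toy setup, a signed gap shows the direction of the shift, while the absolute gap is the quantity one would drive toward zero during debiasing.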
Related papers
- GenderBias-VL: Benchmarking Gender Bias in Vision Language Models via Counterfactual Probing [72.0343083866144]
This paper introduces the GenderBias-VL benchmark to evaluate occupation-related gender bias in Large Vision-Language Models.
Using our benchmark, we extensively evaluate 15 commonly used open-source LVLMs and state-of-the-art commercial APIs.
Our findings reveal widespread gender biases in existing LVLMs.
arXiv Detail & Related papers (2024-06-30T05:55:15Z)
- Probabilistic Contrastive Learning for Long-Tailed Visual Recognition [78.70453964041718]
Long-tailed distributions frequently emerge in real-world data, where a large number of minority categories contain a limited number of samples.
Recent investigations have revealed that supervised contrastive learning exhibits promising potential in alleviating the data imbalance.
We propose a novel probabilistic contrastive (ProCo) learning algorithm that estimates the data distribution of the samples from each class in the feature space.
arXiv Detail & Related papers (2024-03-11T13:44:49Z)
- Test-time Distribution Learning Adapter for Cross-modal Visual Reasoning [16.998833621046117]
We propose Test-Time Distribution LearNing Adapter (TT-DNA) which directly works during the testing period.
Specifically, we estimate Gaussian distributions to model visual features of the few-shot support images to capture the knowledge from the support set.
Our extensive experimental results on visual reasoning for human object interaction demonstrate that our proposed TT-DNA outperforms existing state-of-the-art methods by large margins.
arXiv Detail & Related papers (2024-03-10T01:34:45Z)
- Survey of Social Bias in Vision-Language Models [65.44579542312489]
This survey aims to provide researchers with high-level insight into the similarities and differences of social bias studies in pre-trained models across NLP, CV, and VL.
The findings and recommendations presented here can benefit the ML community, fostering the development of fairer and non-biased AI models.
arXiv Detail & Related papers (2023-09-24T15:34:56Z)
- Probing Cross-modal Semantics Alignment Capability from the Textual Perspective [52.52870614418373]
Aligning cross-modal semantics is claimed to be one of the essential capabilities of vision and language pre-training models.
We propose a new probing method that is based on image captioning to first empirically study the cross-modal semantics alignment of VLP models.
arXiv Detail & Related papers (2022-10-18T02:55:58Z)
- VL-CheckList: Evaluating Pre-trained Vision-Language Models with Objects, Attributes and Relations [28.322824790738768]
Vision-Language Pretraining models have successfully facilitated many cross-modal downstream tasks.
Most existing works evaluated their systems by comparing the fine-tuned downstream task performance.
Inspired by the CheckList for testing natural language processing, we introduce VL-CheckList, a novel evaluation framework.
arXiv Detail & Related papers (2022-07-01T06:25:53Z)
- VLUE: A Multi-Task Benchmark for Evaluating Vision-Language Models [21.549122658275383]
Recent advances in vision-language pre-training have demonstrated impressive performance in a range of vision-language tasks.
We introduce the Vision-Language Understanding Evaluation benchmark, a multi-task multi-dimension benchmark for evaluating the generalization capabilities and the efficiency-performance trade-off.
arXiv Detail & Related papers (2022-05-30T16:52:30Z)
- Towards Debiasing Temporal Sentence Grounding in Video [59.42702544312366]
The temporal sentence grounding in video (TSGV) task is to locate a temporal moment in an untrimmed video that matches a language query.
Without considering bias in moment annotations, many models tend to capture statistical regularities of the moment annotations.
We propose two debiasing strategies, data debiasing and model debiasing, to "force" a TSGV model to capture cross-modal interactions.
arXiv Detail & Related papers (2021-11-08T08:18:25Z)
- Towards More Fine-grained and Reliable NLP Performance Prediction [85.78131503006193]
We make two contributions to improving performance prediction for NLP tasks.
First, we examine performance predictors for holistic measures of accuracy like F1 or BLEU.
Second, we propose methods to understand the reliability of a performance prediction model from two angles: confidence intervals and calibration.
arXiv Detail & Related papers (2021-02-10T15:23:20Z)