Dataset Bias Mitigation in Multiple-Choice Visual Question Answering and
Beyond
- URL: http://arxiv.org/abs/2310.14670v2
- Date: Tue, 31 Oct 2023 20:49:11 GMT
- Title: Dataset Bias Mitigation in Multiple-Choice Visual Question Answering and
Beyond
- Authors: Zhecan Wang, Long Chen, Haoxuan You, Keyang Xu, Yicheng He, Wenhao Li,
Noel Codella, Kai-Wei Chang, Shih-Fu Chang
- Abstract summary: Vision-language (VL) understanding tasks evaluate models' comprehension of complex visual scenes through multiple-choice questions.
We have identified two dataset biases that models can exploit as shortcuts to resolve various VL tasks correctly without proper understanding.
We propose Adversarial Data Synthesis (ADS) to generate synthetic training and debiased evaluation data.
We then introduce Intra-sample Counterfactual Training (ICT) to assist models in utilizing the synthesized training data, particularly the counterfactual data, by focusing on intra-sample differentiation.
- Score: 93.96982273042296
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Vision-language (VL) understanding tasks evaluate models' comprehension of
complex visual scenes through multiple-choice questions. However, we have
identified two dataset biases that models can exploit as shortcuts to resolve
various VL tasks correctly without proper understanding. The first type of
dataset bias is \emph{Unbalanced Matching} bias, where the correct answer
overlaps the question and image more than the incorrect answers. The second
type of dataset bias is \emph{Distractor Similarity} bias, where incorrect
answers are overly dissimilar to the correct answer but significantly similar
to other incorrect answers within the same sample. To address these dataset
biases, we first propose Adversarial Data Synthesis (ADS) to generate synthetic
training and debiased evaluation data. We then introduce Intra-sample
Counterfactual Training (ICT) to assist models in utilizing the synthesized
training data, particularly the counterfactual data, by focusing on
intra-sample differentiation. Extensive experiments demonstrate the
effectiveness of ADS and ICT in consistently improving model performance across
different benchmarks, even in domain-shifted scenarios.
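Both biases lend themselves to simple lexical probes. Below is a minimal sketch, not the authors' ADS/ICT implementation: token overlap stands in for the question/image-answer matching signal, and Jaccard similarity between answer options stands in for answer similarity; all function names are illustrative.

```python
# Heuristic probes for the two dataset biases described in the abstract.

def tokens(text: str) -> set[str]:
    return set(text.lower().split())

def overlap(a: str, b: str) -> float:
    ta, tb = tokens(a), tokens(b)
    return len(ta & tb) / max(len(ta), 1)

def jaccard(a: str, b: str) -> float:
    ta, tb = tokens(a), tokens(b)
    return len(ta & tb) / max(len(ta | tb), 1)

def probe_sample(question: str, answers: list[str], correct_idx: int) -> dict:
    """Flag Unbalanced Matching (UM) and Distractor Similarity (DS) bias."""
    correct = answers[correct_idx]
    distractors = [a for i, a in enumerate(answers) if i != correct_idx]

    # UM bias: the correct answer shares more tokens with the question
    # than every distractor does.
    um_biased = all(
        overlap(correct, question) > overlap(d, question) for d in distractors
    )

    # DS bias: distractors resemble each other more than they resemble
    # the correct answer.
    inter_distractor = [
        jaccard(d1, d2)
        for i, d1 in enumerate(distractors)
        for d2 in distractors[i + 1:]
    ]
    to_correct = [jaccard(d, correct) for d in distractors]
    ds_biased = (
        sum(inter_distractor) / max(len(inter_distractor), 1)
        > sum(to_correct) / max(len(to_correct), 1)
    )
    return {"unbalanced_matching": um_biased, "distractor_similarity": ds_biased}
```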
Related papers
- Feature-Level Debiased Natural Language Understanding [86.8751772146264]
Existing natural language understanding (NLU) models often rely on dataset biases to achieve high performance on specific datasets.
We propose debiasing contrastive learning (DCT) to mitigate biased latent features while accounting for the dynamic nature of bias.
DCT outperforms state-of-the-art baselines on out-of-distribution datasets while maintaining in-distribution performance.
arXiv Detail & Related papers (2022-12-11T06:16:14Z) - Mitigating Representation Bias in Action Recognition: Algorithms and
Benchmarks [76.35271072704384]
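The entry above describes a contrastive debiasing objective. A minimal sketch, assuming an InfoNCE-style formulation with a low-bias positive and bias-aligned negatives; the sampling strategy and names are assumptions, not the paper's exact method.

```python
import torch
import torch.nn.functional as F

def debiasing_contrastive_loss(anchor, positive, negatives, temperature=0.1):
    """InfoNCE-style loss: pull the anchor toward a low-bias positive and
    push it away from bias-aligned negatives.

    anchor:    (d,)   feature of the training example
    positive:  (d,)   feature of a same-label, low-bias example
    negatives: (k, d) features of bias-aligned examples
    """
    anchor = F.normalize(anchor, dim=-1)
    positive = F.normalize(positive, dim=-1)
    negatives = F.normalize(negatives, dim=-1)

    pos_logit = anchor @ positive / temperature    # scalar
    neg_logits = negatives @ anchor / temperature  # (k,)
    logits = torch.cat([pos_logit.unsqueeze(0), neg_logits])
    # The positive sits at index 0; cross-entropy recovers InfoNCE.
    return F.cross_entropy(logits.unsqueeze(0), torch.zeros(1, dtype=torch.long))
```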
- Mitigating Representation Bias in Action Recognition: Algorithms and Benchmarks [76.35271072704384]
Deep learning models perform poorly when applied to videos with rare scenes or objects.
We tackle this problem from two different angles: algorithm and dataset.
We show that the debiased representation can generalize better when transferred to other datasets and tasks.
arXiv Detail & Related papers (2022-09-20T00:30:35Z) - Unbiased Math Word Problems Benchmark for Mitigating Solving Bias [72.8677805114825]
Current solvers exhibit solving bias, which consists of data bias and learning bias caused by biased datasets and improper training strategies.
Our experiments verify that MWP solvers are easily biased by training datasets that do not cover diverse questions for each problem narrative.
An MWP can be naturally solved by multiple equivalent equations while current datasets take only one of the equivalent equations as ground truth.
arXiv Detail & Related papers (2022-05-17T06:07:04Z) - Generating Data to Mitigate Spurious Correlations in Natural Language
Inference Datasets [27.562256973255728]
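One way to honor the multiple-equivalent-equations observation in the entry above is to score a predicted equation as correct when it is numerically equivalent to the ground truth. A hedged sketch by random evaluation; the function name and tolerance are illustrative, not from the paper.

```python
import random

def equivalent(expr_a: str, expr_b: str, variables: list[str],
               trials: int = 100, tol: float = 1e-9) -> bool:
    """Treat two answer expressions as equivalent if they agree on many
    random variable assignments."""
    for _ in range(trials):
        env = {v: random.uniform(1.0, 10.0) for v in variables}
        try:
            if abs(eval(expr_a, {}, env) - eval(expr_b, {}, env)) > tol:
                return False
        except ZeroDivisionError:
            continue  # skip unlucky assignments
    return True

# "x * (y + z)" and "x * y + x * z" should count as the same answer.
print(equivalent("x * (y + z)", "x * y + x * z", ["x", "y", "z"]))  # True
```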
- Generating Data to Mitigate Spurious Correlations in Natural Language Inference Datasets [27.562256973255728]
Natural language processing models often exploit spurious correlations between task-independent features and labels in datasets to perform well only within the distributions they are trained on.
We propose to tackle this problem by generating a debiased version of a dataset, which can then be used to train a debiased, off-the-shelf model.
Our approach consists of 1) a method for training data generators to generate high-quality, label-consistent data samples; and 2) a filtering mechanism for removing data points that contribute to spurious correlations.
arXiv Detail & Related papers (2022-03-24T09:08:05Z) - Does Data Repair Lead to Fair Models? Curating Contextually Fair Data To
Reduce Model Bias [10.639605996067534]
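The entry above pairs a data generator with a filter that drops examples reinforcing feature-label shortcuts. A minimal sketch of such a filter, assuming a z-statistic test on word/label co-occurrence as the measure of spurious correlation; the word-presence feature and thresholds are assumptions.

```python
from collections import Counter

def zstat(p_hat: float, p0: float, n: int) -> float:
    # One-proportion z-test against the chance rate p0.
    return (p_hat - p0) * (n ** 0.5) / max((p0 * (1 - p0)) ** 0.5, 1e-8)

def spurious_words(dataset, label, num_labels, z_threshold=3.0):
    """Words whose presence predicts `label` far above chance (1/num_labels).

    dataset: iterable of (text, gold_label) pairs.
    """
    word_total, word_with_label = Counter(), Counter()
    for text, y in dataset:
        for w in set(text.lower().split()):
            word_total[w] += 1
            word_with_label[w] += (y == label)
    p0 = 1.0 / num_labels
    return {
        w for w, n in word_total.items()
        if n >= 20 and zstat(word_with_label[w] / n, p0, n) > z_threshold
    }
```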
- Does Data Repair Lead to Fair Models? Curating Contextually Fair Data To Reduce Model Bias [10.639605996067534]
Contextual information is a valuable cue for Deep Neural Networks (DNNs) to learn better representations and improve accuracy.
In COCO, many object categories have a much higher co-occurrence with men compared to women, which can bias a DNN's prediction in favor of men.
We introduce a data repair algorithm using the coefficient of variation, which can curate fair and contextually balanced data for a protected class.
arXiv Detail & Related papers (2021-10-20T06:00:03Z) - Greedy Gradient Ensemble for Robust Visual Question Answering [163.65789778416172]
- Greedy Gradient Ensemble for Robust Visual Question Answering [163.65789778416172]
We stress the language bias in Visual Question Answering (VQA) that comes from two aspects, i.e., distribution bias and shortcut bias.
We propose a new de-bias framework, Greedy Gradient Ensemble (GGE), which combines multiple biased models for unbiased base model learning.
GGE forces the biased models to over-fit the biased data distribution in priority, thus makes the base model pay more attention to examples that are hard to solve by biased models.
arXiv Detail & Related papers (2021-07-27T08:02:49Z) - Mitigating the Position Bias of Transformer Models in Passage Re-Ranking [12.526786110360622]
- Mitigating the Position Bias of Transformer Models in Passage Re-Ranking [12.526786110360622]
Supervised machine learning models and their evaluation strongly depend on the quality of the underlying dataset.
We observe a bias in the position of the correct answer in the text in two popular Question Answering datasets used for passage re-ranking.
We demonstrate that by mitigating the position bias, Transformer-based re-ranking models are equally effective on a biased and debiased dataset.
arXiv Detail & Related papers (2021-01-18T10:38:03Z) - Improving QA Generalization by Concurrent Modeling of Multiple Biases [61.597362592536896]
- Improving QA Generalization by Concurrent Modeling of Multiple Biases [61.597362592536896]
Existing NLP datasets contain various biases that models can easily exploit to achieve high performance on the corresponding evaluation sets.
We propose a general framework for improving the performance on both in-domain and out-of-domain datasets by concurrent modeling of multiple biases in the training data.
We extensively evaluate our framework on extractive question answering with training data from various domains with multiple biases of different strengths.
arXiv Detail & Related papers (2020-10-07T11:18:49Z) - Towards Accuracy-Fairness Paradox: Adversarial Example-based Data
Augmentation for Visual Debiasing [15.689539491203373]
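A sketch of concurrent multi-bias modeling via example reweighting, assuming each of K bias models emits a confidence on the gold label. The "1 minus mean bias confidence" weighting is one simple combination, not necessarily the paper's.

```python
import torch
import torch.nn.functional as F

def multi_bias_weighted_loss(logits, y, bias_confidences):
    """Down-weight examples that several bias models answer confidently.

    logits:           (B, C) task-model predictions
    y:                (B,)   gold labels
    bias_confidences: (B, K) probability each of K bias models assigns
                      to the gold label
    """
    weights = 1.0 - bias_confidences.mean(dim=1)          # (B,)
    per_example = F.cross_entropy(logits, y, reduction="none")
    return (weights * per_example).mean()
```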
- Towards Accuracy-Fairness Paradox: Adversarial Example-based Data Augmentation for Visual Debiasing [15.689539491203373]
Machine learning fairness concerns biases toward certain protected or sensitive groups of people when models address their target tasks.
This paper studies the debiasing problem in the context of image classification tasks.
arXiv Detail & Related papers (2020-07-27T15:17:52Z)