Towards Robustifying NLI Models Against Lexical Dataset Biases
- URL: http://arxiv.org/abs/2005.04732v2
- Date: Wed, 13 May 2020 23:44:17 GMT
- Title: Towards Robustifying NLI Models Against Lexical Dataset Biases
- Authors: Xiang Zhou, Mohit Bansal
- Abstract summary: This paper explores both data-level and model-level debiasing methods to robustify models against lexical dataset biases.
First, we debias the dataset through data augmentation and enhancement, but show that the model bias cannot be fully removed via this method.
The second approach employs a bag-of-words sub-model to capture the features that are likely to exploit the bias and prevents the original model from learning these biased features.
- Score: 94.79704960296108
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: While deep learning models are making fast progress on the task of Natural Language Inference, recent studies have also shown that these models achieve high accuracy by exploiting several dataset biases, and without deep understanding of the language semantics. Using contradiction-word bias and word-overlapping bias as our two bias examples, this paper explores both data-level and model-level debiasing methods to robustify models against lexical dataset biases. First, we debias the dataset through data augmentation and enhancement, but show that the model bias cannot be fully removed via this method. Next, we also compare two ways of directly debiasing the model without knowing what the dataset biases are in advance. The first approach aims to remove the label bias at the embedding level. The second approach employs a bag-of-words sub-model to capture the features that are likely to exploit the bias and prevents the original model from learning these biased features by forcing orthogonality between these two sub-models. We performed evaluations on new balanced datasets extracted from the original MNLI dataset as well as the NLI stress tests, and show that the orthogonality approach is better at debiasing the model while maintaining competitive overall accuracy. Our code and data are available at: https://github.com/owenzx/LexicalDebias-ACL2020
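To make the orthogonality approach concrete, here is a minimal sketch in PyTorch: a bag-of-words (BoW) sub-model is trained on the same NLI labels, and the main model's representation is penalized whenever it aligns with the BoW representation. The layer sizes, the mean-pooled BoW encoder, and the squared-cosine penalty are illustrative assumptions, not the authors' exact implementation (see the linked repository for that).

```python
# Minimal sketch of the orthogonality-based debiasing idea (illustrative, not
# the authors' exact implementation): a bag-of-words sub-model captures shallow
# lexical features, and the main model is penalized when its representation
# aligns with the bag-of-words representation.
import torch
import torch.nn as nn
import torch.nn.functional as F

VOCAB_SIZE, EMB_DIM, HIDDEN_DIM, NUM_LABELS = 30000, 300, 256, 3  # assumed sizes


class BowSubModel(nn.Module):
    """Order-insensitive encoder: mean word embeddings of premise/hypothesis -> MLP."""

    def __init__(self):
        super().__init__()
        self.emb = nn.Embedding(VOCAB_SIZE, EMB_DIM, padding_idx=0)
        self.proj = nn.Linear(2 * EMB_DIM, HIDDEN_DIM)
        self.clf = nn.Linear(HIDDEN_DIM, NUM_LABELS)

    def forward(self, premise_ids, hypothesis_ids):
        p = self.emb(premise_ids).mean(dim=1)              # (batch, EMB_DIM)
        h = self.emb(hypothesis_ids).mean(dim=1)
        rep = torch.tanh(self.proj(torch.cat([p, h], dim=-1)))
        return rep, self.clf(rep)


def orthogonality_penalty(main_rep, bow_rep):
    """Squared cosine similarity between the two sentence-pair representations."""
    return (F.cosine_similarity(main_rep, bow_rep, dim=-1) ** 2).mean()


def debiased_loss(main_rep, main_logits, bow_rep, bow_logits, labels, lam=1.0):
    """Task loss for both sub-models plus a penalty keeping their reps orthogonal.

    main_rep/main_logits come from whatever main NLI encoder is used; main_rep is
    assumed to have HIDDEN_DIM dimensions so the cosine penalty is well defined.
    """
    loss_main = F.cross_entropy(main_logits, labels)
    loss_bow = F.cross_entropy(bow_logits, labels)
    # Detach the BoW representation so the penalty only pushes the main encoder away.
    loss_orth = orthogonality_penalty(main_rep, bow_rep.detach())
    return loss_main + loss_bow + lam * loss_orth
```

In practice the main representation would come from the full NLI encoder (e.g., a sentence-pair BiLSTM or Transformer), and the weight lam trades overall accuracy against debiasing strength.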
Related papers
- Fighting Bias with Bias: Promoting Model Robustness by Amplifying Dataset Biases [5.997909991352044]
Recent work sought to develop robust, unbiased models by filtering biased examples from training sets.
We argue that such filtering can obscure the true capabilities of models to overcome biases.
We introduce an evaluation framework defined by a bias-amplified training set and an anti-biased test set.
arXiv Detail & Related papers (2023-05-30T10:10:42Z)
- Echoes: Unsupervised Debiasing via Pseudo-bias Labeling in an Echo Chamber [17.034228910493056]
This paper presents experimental analyses revealing that existing biased models overfit to bias-conflicting samples in the training data.
We propose a straightforward and effective method called Echoes, which trains a biased model and a target model with a different strategy.
Our approach achieves superior debiasing results compared to the existing baselines on both synthetic and real-world datasets.
arXiv Detail & Related papers (2023-05-06T13:13:18Z)
- Debiasing Vision-Language Models via Biased Prompts [79.04467131711775]
We propose a general approach for debiasing vision-language foundation models by projecting out biased directions in the text embedding.
We show that debiasing only the text embedding with a calibrated projection matrix suffices to yield robust classifiers and fair generative models.
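As a rough illustration of this projection idea, the sketch below removes a set of bias directions from text embeddings with an orthogonal projection. Estimating the directions as differences of prompt embeddings, and the uncalibrated projection matrix used here, are assumptions for the example rather than the paper's calibrated procedure.

```python
# Illustrative projection-style debiasing: remove bias directions from text
# embeddings with an orthogonal projection. Estimating directions from prompt
# embedding differences (and skipping the paper's calibration) is an assumption
# made for this example.
import numpy as np

def projection_matrix(bias_directions):
    """P = I - B (B^T B)^{-1} B^T removes the span of the given bias directions."""
    B = np.stack([d / np.linalg.norm(d) for d in bias_directions], axis=1)  # (dim, k)
    return np.eye(B.shape[0]) - B @ np.linalg.inv(B.T @ B) @ B.T

def debias(embeddings, P):
    """Project each row embedding onto the subspace orthogonal to the bias directions."""
    return embeddings @ P.T  # P is symmetric, so P.T == P

# Usage: two hypothetical prompt embeddings assumed to differ in a protected attribute.
rng = np.random.default_rng(0)
e_a, e_b = rng.normal(size=512), rng.normal(size=512)
P = projection_matrix([e_a - e_b])
debiased = debias(np.stack([e_a, e_b]), P)  # the rows coincide after projection:
                                            # their entire difference lay along the removed direction
```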
arXiv Detail & Related papers (2023-01-31T20:09:33Z)
- Feature-Level Debiased Natural Language Understanding [86.8751772146264]
Existing natural language understanding (NLU) models often rely on dataset biases to achieve high performance on specific datasets.
We propose debiasing contrastive learning (DCT) to mitigate biased latent features while accounting for the dynamic nature of bias, which prior debiasing methods neglect.
DCT outperforms state-of-the-art baselines on out-of-distribution datasets while maintaining in-distribution performance.
arXiv Detail & Related papers (2022-12-11T06:16:14Z)
- Generating Data to Mitigate Spurious Correlations in Natural Language Inference Datasets [27.562256973255728]
Natural language processing models often exploit spurious correlations between task-independent features and labels in datasets to perform well only within the distributions they are trained on.
We propose to tackle this problem by generating a debiased version of a dataset, which can then be used to train a debiased, off-the-shelf model.
Our approach consists of 1) a method for training data generators to generate high-quality, label-consistent data samples; and 2) a filtering mechanism for removing data points that contribute to spurious correlations.
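The filtering step can be illustrated with a simple word-label co-occurrence heuristic; this is an assumed stand-in for demonstration, not necessarily the paper's exact mechanism.

```python
# Simple word-label co-occurrence filter (an assumed stand-in for illustration,
# not necessarily the paper's exact mechanism): estimate p(label | word) on the
# generated data, then drop examples containing a word that almost always
# co-occurs with that example's own label.
from collections import Counter, defaultdict

def word_label_stats(examples):
    """examples: iterable of (tokens, label) pairs. Returns {word: {label: p(label | word)}}."""
    word_label = defaultdict(Counter)
    for tokens, label in examples:
        for word in set(tokens):
            word_label[word][label] += 1
    return {
        word: {label: count / sum(counts.values()) for label, count in counts.items()}
        for word, counts in word_label.items()
    }

def filter_spurious(examples, probs, threshold=0.9):
    """Keep only examples in which no word predicts the example's own label above threshold."""
    return [
        (tokens, label)
        for tokens, label in examples
        if all(probs.get(word, {}).get(label, 0.0) < threshold for word in tokens)
    ]

# Usage: stats = word_label_stats(data); cleaned = filter_spurious(data, stats)
```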
arXiv Detail & Related papers (2022-03-24T09:08:05Z)
- Pseudo Bias-Balanced Learning for Debiased Chest X-ray Classification [57.53567756716656]
We study the problem of developing debiased chest X-ray diagnosis models without knowing exactly the bias labels.
We propose a novel algorithm, pseudo bias-balanced learning, which first captures and predicts per-sample bias labels.
Our proposed method achieved consistent improvements over other state-of-the-art approaches.
arXiv Detail & Related papers (2022-03-18T11:02:18Z)
- Debiasing Methods in Natural Language Understanding Make Bias More Accessible [28.877572447481683]
Recent debiasing methods in natural language understanding (NLU) improve performance on out-of-distribution datasets by pressuring models into making unbiased predictions.
We propose a general probing-based framework that allows for post-hoc interpretation of biases in language models.
We show that, counter-intuitively, the more a language model is pushed towards a debiased regime, the more bias is actually encoded in its inner representations.
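A generic linear-probe sketch along these lines (an illustrative stand-in, not the authors' exact probing framework) fits a linear classifier on frozen model representations to predict a bias attribute; higher held-out probe accuracy suggests the bias is more accessible in the representations.

```python
# Generic linear-probe sketch (an illustrative stand-in, not the authors' exact
# probing framework): fit a linear classifier on frozen model representations to
# predict a bias attribute; held-out accuracy measures how accessible the bias is.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

def probe_bias(representations: np.ndarray, bias_labels: np.ndarray) -> float:
    """Return held-out accuracy of a linear probe for the bias attribute."""
    x_tr, x_te, y_tr, y_te = train_test_split(
        representations, bias_labels, test_size=0.2, random_state=0
    )
    probe = LogisticRegression(max_iter=1000).fit(x_tr, y_tr)
    return probe.score(x_te, y_te)
```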
arXiv Detail & Related papers (2021-09-09T08:28:22Z)
- A Generative Approach for Mitigating Structural Biases in Natural Language Inference [24.44419010439227]
In this work, we reformulate the NLI task as a generative task, where a model is conditioned on the biased subset of the input and the label.
We show that this approach is highly robust to large amounts of bias.
However, we find that generative models are difficult to train and generally perform worse than discriminative baselines.
arXiv Detail & Related papers (2021-08-31T17:59:45Z)
- Learning from others' mistakes: Avoiding dataset biases without modeling them [111.17078939377313]
State-of-the-art natural language processing (NLP) models often learn to model dataset biases and surface form correlations instead of features that target the intended task.
Previous work has demonstrated effective methods to circumvent these issues when knowledge of the bias is available.
We show a method for training models that learn to ignore these problematic correlations.
arXiv Detail & Related papers (2020-12-02T16:10:54Z)
- Improving Robustness by Augmenting Training Sentences with Predicate-Argument Structures [62.562760228942054]
Existing approaches to improve robustness against dataset biases mostly focus on changing the training objective.
We propose to augment the input sentences in the training data with their corresponding predicate-argument structures.
We show that without targeting a specific bias, our sentence augmentation improves the robustness of transformer models against multiple biases.
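A minimal sketch of this kind of augmentation, assuming the predicate-argument frames are already extracted (e.g., by a semantic role labeler) and using an assumed flattened "predicate(role: span)" encoding appended after a separator:

```python
# Illustrative augmentation of an input sentence with predicate-argument frames.
# The frames are assumed precomputed (e.g., by a semantic role labeler); the
# flattened "predicate(role: span)" format and the [SEP] separator are assumed
# encodings for this example, not necessarily the paper's exact format.
def augment_with_predicate_args(sentence: str, frames: list[dict]) -> str:
    """frames: [{"predicate": "sold", "args": {"ARG0": "The man", "ARG1": "a car"}}, ...]"""
    flattened = []
    for frame in frames:
        args = "; ".join(f"{role}: {span}" for role, span in frame["args"].items())
        flattened.append(f"{frame['predicate']}({args})")
    return sentence + " [SEP] " + " | ".join(flattened)

# Usage with a hand-written frame:
print(augment_with_predicate_args(
    "The man sold a car.",
    [{"predicate": "sold", "args": {"ARG0": "The man", "ARG1": "a car"}}],
))
```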
arXiv Detail & Related papers (2020-10-23T16:22:05Z)
This list is automatically generated from the titles and abstracts of the papers on this site.