HypoNLI: Exploring the Artificial Patterns of Hypothesis-only Bias in
Natural Language Inference
- URL: http://arxiv.org/abs/2003.02756v2
- Date: Mon, 15 Mar 2021 02:47:53 GMT
- Title: HypoNLI: Exploring the Artificial Patterns of Hypothesis-only Bias in
Natural Language Inference
- Authors: Tianyu Liu, Xin Zheng, Baobao Chang and Zhifang Sui
- Abstract summary: We derive adversarial examples that target the hypothesis-only bias.
We investigate two debiasing approaches that exploit artificial-pattern modeling to mitigate this bias.
- Score: 38.14399396661415
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Many recent studies have shown that models trained on natural
language inference (NLI) datasets can make correct predictions by looking only
at the hypothesis while completely ignoring the premise. In this work, we
derive adversarial examples that target this hypothesis-only bias and explore
effective ways to mitigate it. Specifically, we extract various phrases from
the hypotheses in the training sets ("artificial patterns") and show that they
are strong indicators of specific labels. We then identify "hard" and "easy"
instances in the original test sets, whose labels are respectively opposite to
or consistent with those indications. We also set up baselines including both
pretrained models (BERT, RoBERTa, XLNet) and competitive non-pretrained models
(InferSent, DAM, ESIM). Beyond the benchmark and baselines, we investigate two
debiasing approaches that exploit artificial-pattern modeling to mitigate
hypothesis-only bias: down-sampling and adversarial training. We believe these
methods can serve as competitive baselines for NLI debiasing.
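As a concrete illustration of the pattern-extraction step described in the abstract, the sketch below counts hypothesis n-grams per label and ranks them by the conditional label probability p(label | pattern); patterns with probability near 1.0 are the kind of "artificial patterns" the paper mines. The toy data and function names are hypothetical, not taken from the paper's code.

```python
from collections import Counter, defaultdict

# Hypothetical toy NLI training set: (hypothesis, label) pairs.
examples = [
    ("a man is sleeping", "contradiction"),
    ("nobody is outside", "contradiction"),
    ("a person is outdoors", "entailment"),
    ("the man is very tall", "neutral"),
]

def ngrams(tokens, n):
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

# Count how often each hypothesis n-gram co-occurs with each label.
pattern_label = defaultdict(Counter)
for hyp, label in examples:
    toks = hyp.lower().split()
    for n in (1, 2):
        for gram in set(ngrams(toks, n)):
            pattern_label[gram][label] += 1

# Rank patterns by p(label | pattern); high values flag candidate
# artificial patterns that leak the label.
scored = []
for gram, counts in pattern_label.items():
    total = sum(counts.values())
    label, c = counts.most_common(1)[0]
    scored.append((c / total, gram, label))

for prob, gram, label in sorted(scored, reverse=True)[:5]:
    print(f"p({label} | '{gram}') = {prob:.2f}")
```

In this view, a test instance whose gold label contradicts a strong pattern indication would count as "hard", and one that agrees with it as "easy".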
Related papers
- Automatically Identifying Semantic Bias in Crowdsourced Natural Language
Inference Datasets [78.6856732729301]
We introduce a model-driven, unsupervised technique to find "bias clusters" in a learned embedding space of hypotheses in NLI datasets.
Interventions and additional rounds of labeling can then be performed to ameliorate the semantic bias of a dataset's hypothesis distribution.
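A minimal sketch of the bias-cluster idea just described: cluster hypothesis representations and flag clusters whose label distribution is strongly skewed. Here TF-IDF vectors stand in for the learned embedding space, and the data is a hypothetical toy set.

```python
from sklearn.cluster import KMeans
from sklearn.feature_extraction.text import TfidfVectorizer

# Hypothetical toy hypotheses with gold labels.
hyps = ["nobody is outside", "no one is sleeping", "a person is outdoors",
        "the dog runs through the park", "a man is very old",
        "someone is eating"]
labels = ["contradiction", "contradiction", "entailment",
          "entailment", "neutral", "neutral"]

# TF-IDF stands in for a learned sentence embedding; any encoder
# could be substituted without changing the clustering step.
X = TfidfVectorizer().fit_transform(hyps)
cluster_ids = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(X)

# A cluster whose majority-label share is near 1.0 is a candidate
# "bias cluster" worth re-labeling or intervening on.
for c in range(3):
    members = [labels[i] for i, cid in enumerate(cluster_ids) if cid == c]
    if not members:
        continue
    counts = {l: members.count(l) for l in set(members)}
    print(f"cluster {c}: {counts}, majority share = "
          f"{max(counts.values()) / len(members):.2f}")
```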
arXiv Detail & Related papers (2021-12-16T22:49:01Z)
- A Generative Approach for Mitigating Structural Biases in Natural Language Inference [24.44419010439227]
In this work, we reformulate the NLI task as a generative task, where a model is conditioned on the biased subset of the input and the label.
We show that this approach is highly robust to large amounts of bias.
We also find that generative models are difficult to train and generally perform worse than discriminative baselines.
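A rough sketch of the prediction rule this generative reformulation implies: score each label by how likely the unbiased part of the input (here, the premise) is under a model conditioned on the biased part and that label. The scoring function below is a toy stand-in for a trained conditional language model.

```python
LABELS = ["entailment", "neutral", "contradiction"]

def premise_log_prob(premise, hypothesis, label):
    """Hypothetical stand-in for log p(premise | hypothesis, label)
    from a trained conditional generative model; this toy heuristic
    exists only so the sketch runs end to end."""
    overlap = len(set(premise.split()) & set(hypothesis.split()))
    bonus = {"entailment": 0.5, "neutral": 0.0, "contradiction": -0.5}[label]
    return overlap + bonus

def predict(premise, hypothesis):
    # Pick the label under which the premise is most likely; a
    # log p(label) prior term could be added if labels are imbalanced.
    return max(LABELS, key=lambda l: premise_log_prob(premise, hypothesis, l))

print(predict("a man naps on a couch", "a man is sleeping"))
```

Since the biased part of the input is supplied under every candidate label, its influence largely cancels out of the comparison, which is the intuition behind the robustness claim.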
arXiv Detail & Related papers (2021-08-31T17:59:45Z)
- Bayesian analysis of the prevalence bias: learning and predicting from imbalanced data [10.659348599372944]
This paper lays the theoretical and computational framework for training models, and for prediction, in the presence of prevalence bias.
It offers an alternative to principled training losses and complements test-time procedures based on selecting an operating point from summary curves.
It integrates seamlessly in the current paradigm of (deep) learning using backpropagation and naturally with Bayesian models.
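One widely used correction in this vein, sketched below, shifts the classifier's logits by the log-ratio of the deployment prior to the training prior. This is a generic illustration of prior adjustment, not the paper's exact Bayesian procedure.

```python
import numpy as np

def adjust_for_prevalence(logits, train_prior, test_prior):
    # Shift logits by log(test prior / train prior) so the implied
    # posterior reflects deployment-time class prevalence.
    return logits + np.log(test_prior) - np.log(train_prior)

logits = np.array([2.0, 0.5, -1.0])        # raw class scores
train_prior = np.array([0.7, 0.2, 0.1])    # imbalanced training data
test_prior = np.array([1/3, 1/3, 1/3])     # balanced deployment setting

adjusted = adjust_for_prevalence(logits, train_prior, test_prior)
probs = np.exp(adjusted) / np.exp(adjusted).sum()
print(probs.round(3))
```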
arXiv Detail & Related papers (2021-07-31T14:36:33Z)
- Double Perturbation: On the Robustness of Robustness and Counterfactual Bias Evaluation [109.06060143938052]
We propose a "double perturbation" framework to uncover model weaknesses beyond the test dataset.
We apply this framework to study two perturbation-based approaches that are used to analyze models' robustness and counterfactual bias in English.
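A toy sketch of the "double perturbation" intuition: first apply the perturbation under evaluation, then apply a second, nominally meaning-preserving edit and check whether the verdict survives. The model, word lists, and sentences are all hypothetical.

```python
def model_predict(sentence):
    # Stand-in for a real classifier under analysis.
    return "negative" if "terrible" in sentence else "positive"

def apply_edits(sentence, edits):
    return " ".join(edits.get(w, w) for w in sentence.split())

first_edit = {"good": "great"}     # the perturbation being evaluated
second_edit = {"movie": "film"}    # meaning-preserving neighborhood edit

base = "a good movie"
probe = apply_edits(base, first_edit)
neighbor = apply_edits(probe, second_edit)

# If the second edit flips the prediction, the original robustness
# or bias verdict was fragile beyond the fixed test set.
if model_predict(probe) != model_predict(neighbor):
    print("verdict flipped under a meaning-preserving edit")
else:
    print("verdict stable:", model_predict(neighbor))
```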
arXiv Detail & Related papers (2021-04-12T06:57:36Z)
- Exploring Lexical Irregularities in Hypothesis-Only Models of Natural Language Inference [5.283529004179579]
Natural Language Inference (NLI) or Recognizing Textual Entailment (RTE) is the task of predicting the entailment relation between a pair of sentences.
Models that understand entailment should encode both the premise and the hypothesis.
Experiments by Poliak et al. revealed that these models strongly prefer patterns observed only in the hypothesis.
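The hypothesis-only diagnostic from Poliak et al. can be sketched in a few lines: train a classifier that never sees the premise and compare it to the majority-class baseline; any gap is label signal leaking through the hypothesis alone. The data below is a hypothetical toy set.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Hypothetical toy training data: the model sees hypotheses only.
hypotheses = ["nobody is outside", "a person is outdoors",
              "the man is very tall", "no one is sleeping"]
labels = ["contradiction", "entailment", "neutral", "contradiction"]

# Bag-of-n-grams over hypotheses; premises are deliberately withheld.
clf = make_pipeline(CountVectorizer(ngram_range=(1, 2)), LogisticRegression())
clf.fit(hypotheses, labels)

# Held-out accuracy well above the majority class would indicate
# hypothesis-only artifacts in the dataset.
print(clf.predict(["nobody is eating"]))
```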
arXiv Detail & Related papers (2021-01-19T01:08:06Z)
- Understanding Classifier Mistakes with Generative Models [88.20470690631372]
Deep neural networks are effective on supervised learning tasks, but have been shown to be brittle.
In this paper, we leverage generative models to identify and characterize instances where classifiers fail to generalize.
Our approach is agnostic to class labels from the training set, which makes it applicable to models trained in a semi-supervised way.
arXiv Detail & Related papers (2020-10-05T22:13:21Z)
- L2R2: Leveraging Ranking for Abductive Reasoning [65.40375542988416]
The abductive natural language inference task (αNLI) is proposed to evaluate the abductive reasoning ability of a learning system.
A novel L2R2 approach is proposed under the learning-to-rank framework.
Experiments on the ART dataset achieve state-of-the-art results on the public leaderboard.
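A minimal sketch of the learning-to-rank objective behind this approach: the plausible hypothesis should outscore every competing candidate by a margin. This is a generic pairwise hinge loss, not necessarily the exact loss used in L2R2.

```python
import numpy as np

def pairwise_ranking_loss(scores, gold_index, margin=1.0):
    # Penalize any candidate that comes within `margin` of
    # (or exceeds) the gold candidate's score.
    gold = scores[gold_index]
    losses = [max(0.0, margin - (gold - s))
              for i, s in enumerate(scores) if i != gold_index]
    return float(np.mean(losses))

# Hypothetical plausibility scores for three candidate explanations.
scores = np.array([0.2, 1.4, 0.9])
print(pairwise_ranking_loss(scores, gold_index=1))
```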
arXiv Detail & Related papers (2020-05-22T15:01:23Z)
- Towards Robustifying NLI Models Against Lexical Dataset Biases [94.79704960296108]
This paper explores both data-level and model-level debiasing methods to robustify models against lexical dataset biases.
First, we debias the dataset through data augmentation and enhancement, but show that the model bias cannot be fully removed via this method.
The second approach employs a bag-of-words sub-model to capture the features that are likely to exploit the bias and prevents the original model from learning these biased features.
arXiv Detail & Related papers (2020-05-10T17:56:10Z)
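One common way to realize this second approach, where a weak sub-model absorbs the bias so the main model need not learn it, is product-of-experts training, sketched below with illustrative numbers. The combination rule here is a generic product-of-experts recipe, not necessarily the paper's exact formulation.

```python
import numpy as np

def softmax(z):
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

# Logits from the main premise+hypothesis model and from a frozen
# bag-of-words bias model that sees only the biased features.
main_logits = np.array([1.2, 0.3, -0.5])
bias_logits = np.array([2.5, -1.0, -1.5])

# Product of experts: the training loss is computed on the combined
# distribution, so examples the bias model already gets right yield
# little gradient for the main model.
combined = np.log(softmax(main_logits)) + np.log(softmax(bias_logits))
print(softmax(combined).round(3))

# At test time, predictions come from the main model alone.
print(int(np.argmax(main_logits)))
```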