More Bang for Your Buck: Natural Perturbation for Robust Question
Answering
- URL: http://arxiv.org/abs/2004.04849v2
- Date: Tue, 6 Oct 2020 07:10:00 GMT
- Title: More Bang for Your Buck: Natural Perturbation for Robust Question
Answering
- Authors: Daniel Khashabi, Tushar Khot, Ashish Sabharwal
- Abstract summary: We propose an alternative to the standard approach of constructing training sets of completely new examples.
Our approach involves first collecting a set of seed examples and then applying human-driven natural perturbations.
We find that when natural perturbations are moderately cheaper to create, it is more effective to train models using them.
- Score: 49.83269677507831
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: While recent models have achieved human-level scores on many NLP datasets, we
observe that they are considerably sensitive to small changes in input. As an
alternative to the standard approach of addressing this issue by constructing
training sets of completely new examples, we propose doing so via minimal
perturbation of examples. Specifically, our approach involves first collecting
a set of seed examples and then applying human-driven natural perturbations (as
opposed to rule-based machine perturbations), which often change the gold label
as well. Local perturbations have the advantage of being relatively easier (and
hence cheaper) to create than writing out completely new examples. To evaluate
the impact of this phenomenon, we consider a recent question-answering dataset
(BoolQ) and study the benefit of our approach as a function of the perturbation
cost ratio, the relative cost of perturbing an existing question vs. creating a
new one from scratch. We find that when natural perturbations are moderately
cheaper to create, it is more effective to train models using them: such models
exhibit higher robustness and better generalization, while retaining
performance on the original BoolQ dataset.
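As a rough illustration of the cost-ratio trade-off described in the abstract (this is not code from the paper; the `split_budget` function, the budget, the unit costs, and the choice of 4 perturbations per seed are all hypothetical), the sketch below shows how a fixed annotation budget could be split between newly written seed questions and cheaper natural perturbations of those seeds.

```python
# Illustrative sketch (not from the paper): splitting a fixed annotation
# budget between brand-new seed examples and natural perturbations of them,
# given a perturbation cost ratio r = cost(perturbation) / cost(new example).

def split_budget(budget: float, cost_new: float, cost_ratio: float,
                 perturbations_per_seed: int) -> tuple[int, int]:
    """Return (num_seeds, num_perturbations) affordable under the budget.

    Each seed costs `cost_new`; each of its perturbations costs
    `cost_ratio * cost_new`. We count how many "seed + perturbations"
    groups fit into the budget.
    """
    cost_perturb = cost_ratio * cost_new
    cost_per_seed_group = cost_new + perturbations_per_seed * cost_perturb
    num_seeds = int(budget // cost_per_seed_group)
    return num_seeds, num_seeds * perturbations_per_seed

# Example with made-up numbers: a budget of 1000 units, a new question costing
# 1 unit, perturbations at 30% of that cost, and 4 perturbations per seed buy
# 454 seeds plus 1816 perturbed variants (2270 examples in total), versus
# 1000 examples if everything were written from scratch.
seeds, perturbed = split_budget(budget=1000, cost_new=1.0,
                                cost_ratio=0.3, perturbations_per_seed=4)
print(seeds, perturbed)
```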
Related papers
- Reducing Bias in Pre-trained Models by Tuning while Penalizing Change [8.862970622361747]
Deep models trained on large amounts of data often incorporate implicit biases present during training time.
New data is often expensive and hard to come by in areas such as autonomous driving or medical decision-making.
We present a method based on change penalization that takes a pre-trained model and adapts the weights to mitigate a previously detected bias (a generic sketch of such a change penalty appears after this list).
arXiv Detail & Related papers (2024-04-18T16:12:38Z) - Language Model Cascades: Token-level uncertainty and beyond [65.38515344964647]
Recent advances in language models (LMs) have led to significant improvements in quality on complex NLP tasks.
Cascading offers a simple strategy to achieve more favorable cost-quality tradeoffs.
We show that incorporating token-level uncertainty through learned post-hoc deferral rules can significantly outperform simple aggregation strategies.
arXiv Detail & Related papers (2024-04-15T21:02:48Z) - An Efficient Rehearsal Scheme for Catastrophic Forgetting Mitigation during Multi-stage Fine-tuning [55.467047686093025]
A common approach to alleviate such forgetting is to rehearse samples from prior tasks during fine-tuning.
We propose a sampling scheme, mix-cd, that prioritizes rehearsal of "collateral damage" samples.
Our approach is computationally efficient, easy to implement, and outperforms several leading continual learning methods in compute-constrained settings.
arXiv Detail & Related papers (2024-02-12T22:32:12Z) - SCENE: Self-Labeled Counterfactuals for Extrapolating to Negative
Examples [23.77077091225583]
Self-labeled Counterfactuals for Extrapolating to Negative Examples (SCENE) is an automatic method for synthesizing training data.
With access to only answerable training examples, SCENE can close 69.6% of the performance gap on SQuAD 2.0.
arXiv Detail & Related papers (2023-05-13T19:30:58Z) - Smoothly Giving up: Robustness for Simple Models [30.56684535186692]
Examples of algorithms to train such models include logistic regression and boosting.
We use a tunable family of joint convex loss functions, which interpolates between canonical convex losses, to robustly train such models.
We also provide results for logistic regression and boosting on a COVID-19 dataset, highlighting the efficacy of the approach across multiple relevant domains.
arXiv Detail & Related papers (2023-02-17T19:48:11Z) - Bias Mimicking: A Simple Sampling Approach for Bias Mitigation [57.17709477668213]
We introduce a new class-conditioned sampling method: Bias Mimicking.
Bias Mimicking improves the average accuracy of sampling methods on underrepresented groups by 3% over four benchmarks.
arXiv Detail & Related papers (2022-09-30T17:33:00Z) - Generalization of Neural Combinatorial Solvers Through the Lens of
Adversarial Robustness [68.97830259849086]
Most datasets only capture a simpler subproblem and likely suffer from spurious features.
We study adversarial robustness - a local generalization property - to reveal hard, model-specific instances and spurious features.
Unlike in other applications, where perturbation models are designed around subjective notions of imperceptibility, our perturbation models are efficient and sound.
Surprisingly, with such perturbations, a sufficiently expressive neural solver does not suffer from the limitations of the accuracy-robustness trade-off common in supervised learning.
arXiv Detail & Related papers (2021-10-21T07:28:11Z) - Adversarially Robust Classifier with Covariate Shift Adaptation [25.39995678746662]
Existing adversarially trained models typically perform inference on test examples independently from each other.
We show that a simple adaptive batch normalization (BN) technique can significantly improve the robustness of these models to random perturbations.
We further demonstrate that the adaptive BN technique significantly improves robustness against common corruptions, while often enhancing performance against adversarial attacks (a minimal sketch of test-time BN adaptation appears after this list).
arXiv Detail & Related papers (2021-02-09T19:51:56Z) - Towards Understanding the Regularization of Adversarial Robustness on
Neural Networks [46.54437309608066]
We study the standard-accuracy degradation caused by adversarial robustness (AR) training from a regularization perspective.
We find that AR is achieved by regularizing/biasing neural networks towards less confident solutions.
arXiv Detail & Related papers (2020-11-15T08:32:09Z) - Effective Distant Supervision for Temporal Relation Extraction [49.20329405920023]
A principal barrier to training temporal relation extraction models in new domains is the lack of varied, high quality examples.
We present a method of automatically collecting distantly-supervised examples of temporal relations.
arXiv Detail & Related papers (2020-10-24T03:17:31Z) - Improving Robustness by Augmenting Training Sentences with
Predicate-Argument Structures [62.562760228942054]
Existing approaches to improve robustness against dataset biases mostly focus on changing the training objective.
We propose to augment the input sentences in the training data with their corresponding predicate-argument structures.
We show that without targeting a specific bias, our sentence augmentation improves the robustness of transformer models against multiple biases.
arXiv Detail & Related papers (2020-10-23T16:22:05Z)
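The "Tuning while Penalizing Change" entry above describes adapting a pre-trained model while penalizing deviation from its original weights. The following is a minimal sketch of one common way to implement such a penalty in PyTorch; the cited paper's exact penalty and training setup may differ, and `model`, `loader`, and `lam` are placeholders.

```python
# Minimal sketch of fine-tuning with a penalty on weight change (PyTorch).
# Generic L2 anchoring toward the pre-trained weights; the cited paper's
# exact formulation may differ. `model`, `loader`, and `lam` are placeholders.
import copy
import torch

def finetune_with_change_penalty(model, loader, lam=0.1, lr=1e-4, epochs=1):
    reference = copy.deepcopy(model)            # frozen copy of pre-trained weights
    for p in reference.parameters():
        p.requires_grad_(False)
    optim = torch.optim.AdamW(model.parameters(), lr=lr)
    loss_fn = torch.nn.CrossEntropyLoss()
    for _ in range(epochs):
        for inputs, targets in loader:
            task_loss = loss_fn(model(inputs), targets)
            # Penalize how far the adapted weights drift from the pre-trained ones.
            change = sum(((p - q) ** 2).sum()
                         for p, q in zip(model.parameters(), reference.parameters()))
            loss = task_loss + lam * change
            optim.zero_grad()
            loss.backward()
            optim.step()
    return model
```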
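The "Adversarially Robust Classifier with Covariate Shift Adaptation" entry refers to adapting batch normalization at test time. Below is a minimal sketch of test-time BN adaptation in PyTorch, in which BN layers use statistics from the current test batch instead of the stored running averages; the paper's exact procedure may differ, and `model` and `test_batch` are placeholders.

```python
# Minimal sketch of adaptive batch normalization at test time (PyTorch).
# BN layers are switched to use statistics of the (unlabeled) test batch
# rather than training-time running averages. Details in the cited paper
# may differ; `model` and `test_batch` are placeholders.
import torch

@torch.no_grad()
def predict_with_adaptive_bn(model, test_batch):
    # Keep the model in eval mode overall, but put only the BN layers in
    # training mode so they normalize with current-batch statistics.
    model.eval()
    for module in model.modules():
        if isinstance(module, torch.nn.modules.batchnorm._BatchNorm):
            module.train()
            module.track_running_stats = False   # do not update running stats
    logits = model(test_batch)
    return logits.argmax(dim=-1)
```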