Distilling Robustness into Natural Language Inference Models with Domain-Targeted Augmentation
- URL: http://arxiv.org/abs/2305.13067v3
- Date: Wed, 24 Jul 2024 18:54:53 GMT
- Title: Distilling Robustness into Natural Language Inference Models with Domain-Targeted Augmentation
- Authors: Joe Stacey, Marek Rei
- Abstract summary: We investigate two complementary methods for improving the robustness of the resulting student models on out-of-distribution domains.
The first approach augments the distillation with generated unlabelled examples that match the target distribution.
The second method upsamples data points among the training set that are similar to the target distribution.
- Score: 12.512147282842175
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Knowledge distillation optimises a smaller student model to behave similarly to a larger teacher model, retaining some of the performance benefits. While this method can improve results on in-distribution examples, it does not necessarily generalise to out-of-distribution (OOD) settings. We investigate two complementary methods for improving the robustness of the resulting student models on OOD domains. The first approach augments the distillation with generated unlabelled examples that match the target distribution. The second method upsamples data points among the training set that are similar to the target distribution. When applied on the task of natural language inference (NLI), our experiments on MNLI show that distillation with these modifications outperforms previous robustness solutions. We also find that these methods improve performance on OOD domains even beyond the target domain.
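The two methods above lend themselves to a compact illustration: the distillation (KL) term can be applied to teacher predictions on generated unlabelled examples from the target domain, while the supervised term can be weighted to upsample training points that resemble the target distribution. Below is a minimal PyTorch-style sketch under those assumptions; the HuggingFace-style model interface (`.logits`), the per-example `weights` field, and all hyperparameters are illustrative and not the authors' exact implementation.

```python
# Hedged sketch: knowledge distillation with (1) generated unlabelled examples from the
# target domain and (2) similarity-based upsampling of labelled training points.
# Model interfaces, field names, and hyperparameters are illustrative assumptions.
import torch
import torch.nn.functional as F

def distillation_step(student, teacher, batch, unlabelled_batch, optimizer,
                      temperature=2.0, alpha=0.5):
    """One step: weighted cross-entropy on gold labels plus KL to the teacher,
    where the KL term also covers generated unlabelled target-domain examples."""
    teacher.eval()
    optimizer.zero_grad()

    # Labelled (source-domain) examples, upsampled via per-example weights.
    logits_s = student(batch["input_ids"], attention_mask=batch["attention_mask"]).logits
    with torch.no_grad():
        logits_t = teacher(batch["input_ids"], attention_mask=batch["attention_mask"]).logits

    ce = F.cross_entropy(logits_s, batch["labels"], reduction="none")
    # batch["weights"]: larger for training points similar to the target distribution,
    # e.g. precomputed from embedding similarity between each example and the target domain.
    ce = (ce * batch["weights"]).mean()

    kd = F.kl_div(F.log_softmax(logits_s / temperature, dim=-1),
                  F.softmax(logits_t / temperature, dim=-1),
                  reduction="batchmean") * temperature ** 2

    # Generated unlabelled examples matching the target domain: teacher-only supervision.
    logits_s_u = student(unlabelled_batch["input_ids"],
                         attention_mask=unlabelled_batch["attention_mask"]).logits
    with torch.no_grad():
        logits_t_u = teacher(unlabelled_batch["input_ids"],
                             attention_mask=unlabelled_batch["attention_mask"]).logits
    kd_u = F.kl_div(F.log_softmax(logits_s_u / temperature, dim=-1),
                    F.softmax(logits_t_u / temperature, dim=-1),
                    reduction="batchmean") * temperature ** 2

    loss = alpha * ce + (1 - alpha) * (kd + kd_u)
    loss.backward()
    optimizer.step()
    return loss.item()
```

In this sketch the `weights` field stands in for the similarity-based upsampling; in practice such weights (or duplicated examples) would be computed offline, for instance from sentence-embedding similarity between each MNLI example and the target domain.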
Related papers
- Distributionally Robust Reinforcement Learning with Human Feedback [13.509499718691016]
We introduce a distributionally robust RLHF approach for fine-tuning large language models.
Our goal is to ensure that a fine-tuned model retains its performance even when the distribution of prompts differs significantly from that seen during training.
We show that our robust training improves the accuracy of the learned reward models on average, and markedly on some tasks, such as reasoning.
arXiv Detail & Related papers (2025-03-01T15:43:39Z) - Joint Training for Selective Prediction [5.662924503089369]
Selective prediction methods determine when to adopt a classifier's output and when to defer to a human.
One previous method involves learning a deferral model based on engineered features.
We introduce a novel joint-training approach that simultaneously optimizes the learned representations used by the classifier module and a learned deferral policy.
arXiv Detail & Related papers (2024-10-31T15:28:26Z) - Comprehensive OOD Detection Improvements [46.46252643210326]
Out-of-distribution (OOD) detection methods identify test inputs that differ from the training distribution.
We employ dimensionality reduction on feature embeddings in representation-based methods for both time speedups and improved performance.
We demonstrate the effectiveness of our methods on the OpenOODv1.5 benchmark framework.
arXiv Detail & Related papers (2024-01-18T18:05:35Z) - Towards Calibrated Robust Fine-Tuning of Vision-Language Models [97.19901765814431]
This work proposes a robust fine-tuning method that simultaneously improves both OOD accuracy and confidence calibration in vision-language models.
We show that both OOD classification and OOD calibration errors share an upper bound consisting of two terms computed on in-distribution (ID) data.
Based on this insight, we design a novel framework that conducts fine-tuning with a constrained multimodal contrastive loss enforcing a larger smallest singular value.
arXiv Detail & Related papers (2023-11-03T05:41:25Z) - Discrete Diffusion Modeling by Estimating the Ratios of the Data Distribution [67.9215891673174]
We propose score entropy as a novel loss that naturally extends score matching to discrete spaces.
We test our Score Entropy Discrete Diffusion models on standard language modeling tasks.
arXiv Detail & Related papers (2023-10-25T17:59:12Z) - Self-training through Classifier Disagreement for Cross-Domain Opinion Target Extraction [62.41511766918932]
Opinion target extraction (OTE) or aspect extraction (AE) is a fundamental task in opinion mining.
Recent work focuses on cross-domain OTE, which is typically encountered in real-world scenarios.
We propose a new SSL approach that selects unlabelled target-domain samples on which the outputs of a domain-specific teacher network and a student network disagree (a minimal sketch of this selection step appears after this list).
arXiv Detail & Related papers (2023-02-28T16:31:17Z) - Guide the Learner: Controlling Product of Experts Debiasing Method Based on Token Attribution Similarities [17.082695183953486]
A popular workaround for dataset biases is to train a robust model by re-weighting training examples based on a secondary biased model.
Here, the underlying assumption is that the biased model resorts to shortcut features.
We introduce a fine-tuning strategy that incorporates the similarity between the main and biased model attribution scores in a Product of Experts loss function (a minimal sketch of the base PoE loss appears after this list).
arXiv Detail & Related papers (2023-02-06T15:21:41Z) - Distributional Shift Adaptation using Domain-Specific Features [41.91388601229745]
In open-world scenarios, streaming big data can be out-of-distribution (OOD).
We propose a simple yet effective approach that relies on correlations in general, regardless of whether the features are invariant or not.
Our approach uses the most confidently predicted samples identified by an OOD base model to train a new model that effectively adapts to the target domain.
arXiv Detail & Related papers (2022-11-09T04:16:21Z) - From Distillation to Hard Negative Sampling: Making Sparse Neural IR Models More Effective [15.542082655342476]
We build on SPLADE -- a sparse expansion-based retriever -- and show to what extent it is able to benefit from the same training improvements as dense models.
We study the link between effectiveness and efficiency in in-domain and zero-shot settings.
arXiv Detail & Related papers (2022-05-10T08:08:43Z) - Distributionally Robust Recurrent Decoders with Random Network Distillation [93.10261573696788]
We propose a method based on OOD detection with Random Network Distillation to allow an autoregressive language model to disregard OOD context during inference.
We apply our method to a GRU architecture, demonstrating improvements on multiple language modeling (LM) datasets.
arXiv Detail & Related papers (2021-10-25T19:26:29Z) - Enhancing the Generalization for Intent Classification and Out-of-Domain Detection in SLU [70.44344060176952]
Intent classification is a major task in spoken language understanding (SLU).
Recent works have shown that using extra data and labels can improve the OOD detection performance.
This paper proposes to train a model with only IND data while supporting both IND intent classification and OOD detection.
arXiv Detail & Related papers (2021-06-28T08:27:38Z)
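For the "Self-training through Classifier Disagreement for Cross-Domain Opinion Target Extraction" entry above, the referenced selection step can be sketched as follows. This is a hedged, simplified illustration (sequence-level classification rather than token-level tagging); the model interface and names are assumptions rather than the paper's implementation.

```python
# Hedged sketch: select unlabelled target-domain samples on which a domain-specific
# teacher and the student disagree, keeping the teacher's predictions as pseudo-labels.
import torch

@torch.no_grad()
def select_disagreement_samples(teacher, student, unlabelled_loader):
    """Return inputs where teacher and student predictions differ, with teacher pseudo-labels."""
    teacher.eval()
    student.eval()
    selected_inputs, pseudo_labels = [], []
    for batch in unlabelled_loader:
        t_pred = teacher(batch["input_ids"],
                         attention_mask=batch["attention_mask"]).logits.argmax(dim=-1)
        s_pred = student(batch["input_ids"],
                         attention_mask=batch["attention_mask"]).logits.argmax(dim=-1)
        disagree = t_pred != s_pred  # boolean mask over the batch
        if disagree.any():
            selected_inputs.append(batch["input_ids"][disagree])
            pseudo_labels.append(t_pred[disagree])
    return selected_inputs, pseudo_labels
```

The selected samples would then be added to the student's training data for the next self-training round.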
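The "Guide the Learner" entry builds on a Product of Experts (PoE) debiasing loss, in which the main model is trained through the combination of its own log-probabilities and those of a frozen biased model, so that it gains little from shortcut features the biased model already captures. The sketch below covers only this base PoE loss; the paper's token-attribution-similarity extension is not reproduced here, and the function name is an assumption.

```python
# Hedged sketch of a base Product of Experts (PoE) debiasing loss: the main model is
# trained on the combined (main + biased) log-probabilities; the biased model is frozen.
import torch
import torch.nn.functional as F

def poe_loss(main_logits: torch.Tensor, biased_logits: torch.Tensor,
             labels: torch.Tensor) -> torch.Tensor:
    """Cross-entropy on the element-wise product of the two expert distributions."""
    combined = F.log_softmax(main_logits, dim=-1) + F.log_softmax(biased_logits.detach(), dim=-1)
    # F.cross_entropy renormalises `combined`, yielding the PoE distribution.
    return F.cross_entropy(combined, labels)
```

Only the main model receives gradients (the biased logits are detached); at test time the main model is used on its own.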
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.