An Empirical Study on Model-agnostic Debiasing Strategies for Robust
Natural Language Inference
- URL: http://arxiv.org/abs/2010.03777v2
- Date: Sat, 17 Oct 2020 14:57:09 GMT
- Authors: Tianyu Liu, Xin Zheng, Xiaoan Ding, Baobao Chang and Zhifang Sui
- Abstract summary: We focus on the model-agnostic debiasing strategies and explore how to make the NLI models robust to multiple adversarial attacks.
We first benchmark prevailing neural NLI models including pretrained ones on various adversarial datasets.
We then try to combat distinct known biases by modifying a mixture of experts (MoE) ensemble method.
- Score: 37.420864237437804
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Prior work on natural language inference (NLI) debiasing mainly
targets one or a few known biases while not necessarily making the models more
robust. In this paper, we focus on model-agnostic debiasing strategies and
explore how to (or whether it is possible to) make NLI models robust to multiple
distinct adversarial attacks while keeping or even strengthening the models'
generalization power. We first benchmark prevailing neural NLI models,
including pretrained ones, on various adversarial datasets. We then try to
combat distinct known biases by modifying a mixture of experts (MoE) ensemble
method and show that it's nontrivial to mitigate multiple NLI biases at the
same time, and that model-level ensemble method outperforms MoE ensemble
method. We also perform data augmentation including text swap, word
substitution and paraphrase, and show its effectiveness in combating various
(though not all) adversarial attacks at the same time. Finally, we investigate
several methods to merge heterogeneous training data (1.35M) and perform model
ensembling, which are straightforward but effective to strengthen NLI models.
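The text-swap and word-substitution augmentations mentioned in the abstract can be sketched as follows. The synonym table and the example sentences are illustrative assumptions, not from the paper; a real pipeline would draw substitutions from WordNet or a paraphrase model.

```python
def text_swap(premise, hypothesis):
    # Swap premise and hypothesis; entailment labels do not transfer in
    # general, so swapped pairs are typically relabeled (e.g. as neutral).
    return hypothesis, premise

# Illustrative synonym table (an assumption for this sketch).
SYNONYMS = {"man": "person", "guitar": "banjo"}

def word_substitution(sentence, table):
    # Replace every word found in the table with its listed substitute.
    return " ".join(table.get(w, w) for w in sentence.split())

premise = "A man is playing a guitar on stage"
hypothesis = "A man is performing music"
print(text_swap(premise, hypothesis))
print(word_substitution(premise, SYNONYMS))  # → A person is playing a banjo on stage
```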
Related papers
- Dynamic Post-Hoc Neural Ensemblers [55.15643209328513]
In this study, we explore employing neural networks as ensemble methods.
Motivated by the risk of learning low-diversity ensembles, we propose regularizing the model by randomly dropping base model predictions.
We demonstrate this approach lower bounds the diversity within the ensemble, reducing overfitting and improving generalization capabilities.
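The dropout-style regularizer described above can be sketched as randomly removing whole base-model predictions before aggregation; the example probabilities are hypothetical.

```python
import random

def drop_base_predictions(preds, drop_prob, rng):
    # preds: list of per-model class-probability lists. Randomly drop whole
    # base-model predictions (dropout over base learners), average the rest.
    kept = [p for p in preds if rng.random() >= drop_prob]
    if not kept:                    # always keep at least one model
        kept = [rng.choice(preds)]
    n = len(kept)
    return [sum(col) / n for col in zip(*kept)]

rng = random.Random(0)
preds = [[0.7, 0.2, 0.1],
         [0.5, 0.3, 0.2],
         [0.1, 0.1, 0.8]]
print(drop_base_predictions(preds, drop_prob=0.5, rng=rng))
```

With `drop_prob=0.0` this reduces to a plain mean ensemble; higher values force the ensembler not to rely on any single base model.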
arXiv Detail & Related papers (2024-10-06T15:25:39Z)
- Determine-Then-Ensemble: Necessity of Top-k Union for Large Language Model Ensembling [23.447466392929712]
Large language models (LLMs) exhibit varying strengths and weaknesses across different tasks.
Existing LLM ensembling methods often overlook model compatibility and struggle with inefficient alignment of probabilities.
We introduce Union Top-$k$ Ensembling (UniTE), a novel approach that efficiently combines models by focusing on the union of the top-k tokens from each model.
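The union-of-top-k idea can be sketched as below: restrict ensembling to the union of each model's top-k tokens, then average and renormalize. This is a simplification under assumed shared vocabularies; the actual UniTE alignment procedure is more involved.

```python
def unite_ensemble(model_probs, k):
    # model_probs: list of dicts mapping token -> probability, one per model.
    union = set()
    for probs in model_probs:
        topk = sorted(probs, key=probs.get, reverse=True)[:k]
        union.update(topk)
    # Average probabilities over the union only, then renormalize.
    avg = {t: sum(p.get(t, 0.0) for p in model_probs) / len(model_probs)
           for t in union}
    z = sum(avg.values())
    return {t: v / z for t, v in avg.items()}

# Hypothetical next-token distributions from two models.
m1 = {"cat": 0.6, "dog": 0.3, "fish": 0.1}
m2 = {"dog": 0.5, "cat": 0.4, "bird": 0.1}
print(unite_ensemble([m1, m2], k=2))
```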
arXiv Detail & Related papers (2024-10-03T08:42:38Z)
- Enhancing adversarial robustness in Natural Language Inference using explanations [41.46494686136601]
We cast the spotlight on the underexplored task of Natural Language Inference (NLI).
We validate the usage of natural language explanation as a model-agnostic defence strategy through extensive experimentation.
We research the correlation of widely used language generation metrics with human perception, in order for them to serve as a proxy towards robust NLI models.
arXiv Detail & Related papers (2024-09-11T17:09:49Z)
- Addressing Bias Through Ensemble Learning and Regularized Fine-Tuning [0.2812395851874055]
This paper proposes a comprehensive approach using multiple methods to remove bias in AI models.
We train multiple models with the counter-bias of the pre-trained model through data splitting, local training, and regularized fine-tuning.
We conclude our solution with knowledge distillation that results in a single unbiased neural network.
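The final distillation step can be sketched with the standard knowledge-distillation loss: cross-entropy between temperature-softened teacher and student distributions. The logits and temperature here are illustrative, not the paper's settings.

```python
import math

def softmax(logits, temperature=1.0):
    exps = [math.exp(l / temperature) for l in logits]
    z = sum(exps)
    return [e / z for e in exps]

def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    # Cross-entropy between softened teacher and student distributions.
    t = softmax(teacher_logits, temperature)
    s = softmax(student_logits, temperature)
    return -sum(ti * math.log(si) for ti, si in zip(t, s))

print(distillation_loss([2.0, 0.5, -1.0], [1.8, 0.6, -0.9]))
```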
arXiv Detail & Related papers (2024-02-01T09:24:36Z)
- Leveraging Biases in Large Language Models: "bias-kNN" for Effective Few-Shot Learning [36.739829839357995]
This study introduces a novel methodology named "bias-kNN".
This approach capitalizes on the biased outputs, harnessing them as primary features for kNN and supplementing with gold labels.
Our comprehensive evaluations, spanning diverse domain text classification datasets and different GPT-2 model sizes, indicate the adaptability and efficacy of the "bias-kNN" method.
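A minimal sketch of the bias-kNN idea: treat a frozen model's (biased) output probabilities as feature vectors, and classify new inputs by majority vote among the nearest stored vectors with gold labels. The feature vectors and labels below are hypothetical.

```python
def knn_predict(query_feat, stored_feats, stored_labels, k):
    # Majority vote among the k nearest stored feature vectors (Euclidean).
    dists = sorted(
        (sum((a - b) ** 2 for a, b in zip(query_feat, f)) ** 0.5, lab)
        for f, lab in zip(stored_feats, stored_labels)
    )
    votes = [lab for _, lab in dists[:k]]
    return max(set(votes), key=votes.count)

# Hypothetical biased output distributions over two classes, with gold labels.
feats = [[0.9, 0.1], [0.8, 0.2], [0.2, 0.8], [0.3, 0.7]]
labels = ["pos", "pos", "neg", "neg"]
print(knn_predict([0.85, 0.15], feats, labels, k=3))  # → pos
```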
arXiv Detail & Related papers (2024-01-18T08:05:45Z)
- Universal Semi-supervised Model Adaptation via Collaborative Consistency Training [92.52892510093037]
We introduce a realistic and challenging domain adaptation problem called Universal Semi-supervised Model Adaptation (USMA).
We propose a collaborative consistency training framework that regularizes the prediction consistency between two models.
Experimental results demonstrate the effectiveness of our method on several benchmark datasets.
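One simple way to regularize prediction consistency between two models, as described above, is a symmetric KL divergence between their predictive distributions; the distributions here are illustrative.

```python
import math

def consistency_loss(p, q):
    # Symmetric KL divergence between two models' predictive distributions.
    kl = lambda a, b: sum(ai * math.log(ai / bi) for ai, bi in zip(a, b))
    return 0.5 * (kl(p, q) + kl(q, p))

print(consistency_loss([0.7, 0.3], [0.6, 0.4]))
```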
arXiv Detail & Related papers (2023-07-07T08:19:40Z)
- In and Out-of-Domain Text Adversarial Robustness via Label Smoothing [64.66809713499576]
We study the adversarial robustness provided by various label smoothing strategies in foundational models for diverse NLP tasks.
Our experiments show that label smoothing significantly improves adversarial robustness in pre-trained models like BERT, against various popular attacks.
We also analyze the relationship between prediction confidence and robustness, showing that label smoothing reduces over-confident errors on adversarial examples.
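Label smoothing itself is a one-line transform: mix the one-hot target with the uniform distribution, $y' = (1 - \epsilon)\,y + \epsilon / K$. The epsilon below is illustrative, not a value from the paper.

```python
def smooth_labels(one_hot, epsilon):
    # Mix a one-hot target with the uniform distribution over K classes.
    k = len(one_hot)
    return [(1 - epsilon) * y + epsilon / k for y in one_hot]

print(smooth_labels([0.0, 1.0, 0.0], epsilon=0.1))
```

Training against these softened targets penalizes over-confident predictions, which is the mechanism the paper connects to robustness.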
arXiv Detail & Related papers (2022-12-20T14:06:50Z)
- Dataless Knowledge Fusion by Merging Weights of Language Models [51.8162883997512]
Fine-tuning pre-trained language models has become the prevalent paradigm for building downstream NLP models.
This creates a barrier to fusing knowledge across individual models to yield a better single model.
We propose a dataless knowledge fusion method that merges models in their parameter space.
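The simplest instance of merging in parameter space is a (weighted) average of the models' parameters; the toy state dicts below stand in for real checkpoints, and the paper's actual merging scheme is more sophisticated than plain averaging.

```python
def merge_weights(state_dicts, coeffs=None):
    # state_dicts: list of dicts mapping parameter name -> list of floats.
    # Merge fine-tuned models by a (weighted) average of their parameters.
    n = len(state_dicts)
    coeffs = coeffs or [1.0 / n] * n
    merged = {}
    for name in state_dicts[0]:
        merged[name] = [
            sum(c * sd[name][i] for c, sd in zip(coeffs, state_dicts))
            for i in range(len(state_dicts[0][name]))
        ]
    return merged

a = {"w": [1.0, 2.0], "b": [0.0]}
b = {"w": [3.0, 4.0], "b": [1.0]}
print(merge_weights([a, b]))  # → {'w': [2.0, 3.0], 'b': [0.5]}
```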
arXiv Detail & Related papers (2022-12-19T20:46:43Z)
- Defense against Adversarial Attacks in NLP via Dirichlet Neighborhood Ensemble [163.3333439344695]
Dirichlet Neighborhood Ensemble (DNE) is a randomized smoothing method for training a robust model to defend against substitution-based attacks.
DNE forms virtual sentences by sampling embedding vectors for each word in an input sentence from a convex hull spanned by the word and its synonyms, and it augments them with the training data.
We demonstrate through extensive experimentation that our method consistently outperforms recently proposed defense methods by a significant margin across different network architectures and multiple data sets.
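The core sampling step can be sketched as drawing a random convex combination of a word's embedding and its synonyms' embeddings (Dirichlet weights via normalized exponential draws). The 2-d embeddings below are hypothetical stand-ins for real word vectors.

```python
import random

def sample_from_convex_hull(vectors, rng):
    # Dirichlet(1,...,1) weights via normalized exponential draws, then a
    # convex combination of the given embedding vectors.
    raw = [rng.expovariate(1.0) for _ in vectors]
    z = sum(raw)
    weights = [r / z for r in raw]         # non-negative, sums to 1
    dim = len(vectors[0])
    return [sum(w * v[d] for w, v in zip(weights, vectors)) for d in range(dim)]

rng = random.Random(0)
# Hypothetical embeddings for a word and two of its synonyms.
word_and_syns = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
print(sample_from_convex_hull(word_and_syns, rng))
```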
arXiv Detail & Related papers (2020-06-20T18:01:16Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.