SMoA: Sparse Mixture of Adapters to Mitigate Multiple Dataset Biases
- URL: http://arxiv.org/abs/2302.14413v1
- Date: Tue, 28 Feb 2023 08:47:20 GMT
- Title: SMoA: Sparse Mixture of Adapters to Mitigate Multiple Dataset Biases
- Authors: Yanchen Liu, Jing Yan, Yan Chen, Jing Liu, Hua Wu
- Abstract summary: We propose a new debiasing method, Sparse Mixture-of-Adapters (SMoA), which can mitigate multiple dataset biases effectively and efficiently.
Experiments on Natural Language Inference and Paraphrase Identification tasks demonstrate that SMoA outperforms full fine-tuning, adapter-tuning baselines, and prior strong debiasing methods.
- Score: 27.56143777363971
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recent studies reveal that various biases exist in different NLP tasks, and over-reliance on biases results in poor generalization ability and low adversarial robustness. To mitigate dataset biases, previous works have proposed many debiasing techniques that each tackle a specific bias; these perform well on the corresponding adversarial set but fail to mitigate other biases. In this paper, we propose a new debiasing method, Sparse Mixture-of-Adapters (SMoA), which can mitigate multiple dataset biases effectively and efficiently. Experiments on Natural Language Inference and Paraphrase Identification tasks demonstrate that SMoA outperforms full fine-tuning, adapter-tuning baselines, and prior strong debiasing methods. Further analysis indicates the interpretability of SMoA: each sub-adapter captures a specific pattern from the training data and specializes in handling the corresponding bias.
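The abstract describes SMoA only at a high level. As a minimal PyTorch sketch, assuming top-1 token-level routing over bottleneck adapters (the class and parameter names such as `SparseMixtureOfAdapters`, `num_adapters`, and `bottleneck_dim` are illustrative, not the authors' code):

```python
import torch
import torch.nn as nn

class SparseMixtureOfAdapters(nn.Module):
    """Illustrative sparse mixture-of-adapters layer (not the paper's code).

    A router picks one bottleneck adapter per token; only that adapter's
    output is used, so different adapters can specialize to different
    patterns while the backbone stays frozen.
    """

    def __init__(self, hidden_dim: int, bottleneck_dim: int = 64, num_adapters: int = 4):
        super().__init__()
        self.router = nn.Linear(hidden_dim, num_adapters)
        self.down = nn.ModuleList(nn.Linear(hidden_dim, bottleneck_dim) for _ in range(num_adapters))
        self.up = nn.ModuleList(nn.Linear(bottleneck_dim, hidden_dim) for _ in range(num_adapters))

    def forward(self, hidden: torch.Tensor) -> torch.Tensor:
        # hidden: (batch, seq_len, hidden_dim)
        gate = torch.softmax(self.router(hidden), dim=-1)  # routing probabilities
        top_prob, top_idx = gate.max(dim=-1)               # top-1 adapter per token
        out = torch.zeros_like(hidden)
        # For clarity every adapter processes all tokens and is then masked;
        # a real implementation would gather only the routed tokens.
        for i, (down, up) in enumerate(zip(self.down, self.up)):
            mask = (top_idx == i).unsqueeze(-1)            # tokens routed to adapter i
            out = out + mask * up(torch.relu(down(hidden)))
        # Scale by the gate probability so the router receives gradient,
        # and keep a residual connection to the frozen backbone.
        return hidden + top_prob.unsqueeze(-1) * out
```

In SMoA-style training, layers like this would sit inside a frozen pretrained encoder with only the adapters and router updated; the sparse routing is what lets individual sub-adapters specialize.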
Related papers
- Revisiting the Dataset Bias Problem from a Statistical Perspective [72.94990819287551]
We study the "dataset bias" problem from a statistical standpoint.
We identify the main cause of the problem as the strong correlation between a class attribute $u$ and a non-class attribute $b$.
We propose to mitigate dataset bias by either weighting the objective of each sample $n$ by $\frac{1}{p(u_n \mid b_n)}$ or sampling that sample with probability proportional to $\frac{1}{p(u_n \mid b_n)}$.
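A hedged sketch of the weighting variant, assuming $p(u_n \mid b_n)$ has already been estimated upstream (for example, by a classifier trained on the bias attribute alone); the helper name and the mean-normalization step are additions, not from the paper:

```python
import torch
import torch.nn.functional as F

def debiased_loss(logits, labels, p_u_given_b):
    """Weight each sample's loss by 1 / p(u_n | b_n) (illustrative sketch).

    p_u_given_b: estimated probability of the class attribute given the
    non-class (bias) attribute, one value per sample in the batch.
    """
    per_sample = F.cross_entropy(logits, labels, reduction="none")
    weights = 1.0 / p_u_given_b.clamp_min(1e-6)   # rare (u, b) pairs are upweighted
    weights = weights / weights.mean()            # keep loss scale stable (a choice, not the paper's)
    return (weights * per_sample).mean()
```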
arXiv Detail & Related papers (2024-02-05T22:58:06Z)
- Improving Bias Mitigation through Bias Experts in Natural Language Understanding [10.363406065066538]
We propose a new debiasing framework that introduces binary classifiers between the auxiliary model and the main model.
Our proposed strategy improves the bias identification ability of the auxiliary model.
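The summary does not spell out how the experts' outputs are used; one plausible reading, sketched below under that assumption, is that a binary bias expert's confidence on a sample's gold class down-weights the main model's loss, in a product-of-experts flavor. All names are illustrative:

```python
import torch
import torch.nn.functional as F

def expert_weighted_loss(main_logits, labels, expert_probs):
    """Down-weight examples the bias experts solve confidently (sketch only).

    expert_probs[n]: output of the binary bias expert for the gold class of
    sample n; a high value suggests the example is bias-aligned.
    """
    per_sample = F.cross_entropy(main_logits, labels, reduction="none")
    weights = 1.0 - expert_probs          # confidently-biased examples contribute less
    return (weights * per_sample).mean()
```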
arXiv Detail & Related papers (2023-12-06T16:15:00Z)
- Causality and Independence Enhancement for Biased Node Classification [56.38828085943763]
We propose a novel Causality and Independence Enhancement (CIE) framework applicable to various graph neural networks (GNNs).
Our approach estimates causal and spurious features at the node representation level and mitigates the influence of spurious correlations.
CIE not only significantly enhances the performance of GNNs but also outperforms state-of-the-art debiased node classification methods.
arXiv Detail & Related papers (2023-10-14T13:56:24Z)
- Delving into Identify-Emphasize Paradigm for Combating Unknown Bias [52.76758938921129]
We propose an effective bias-conflicting scoring method (ECS) to boost the identification accuracy.
We also propose gradient alignment (GA) to balance the contributions of the mined bias-aligned and bias-conflicting samples.
Experiments are conducted on multiple datasets in various settings, demonstrating that the proposed solution can mitigate the impact of unknown biases.
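The summary leaves GA unspecified; below is a hedged sketch of one way to balance the two groups' contributions, using mean loss magnitude as a proxy for gradient contribution (an assumption, not the paper's exact rule):

```python
import torch
import torch.nn.functional as F

def gradient_aligned_loss(logits, labels, is_conflicting):
    """Balance bias-aligned vs bias-conflicting contributions (illustrative).

    is_conflicting: boolean mask from an upstream scoring method such as ECS.
    The aligned group's loss is rescaled so both groups contribute
    comparably, instead of the majority group dominating the gradient.
    """
    per_sample = F.cross_entropy(logits, labels, reduction="none")
    conflict = per_sample[is_conflicting]
    aligned = per_sample[~is_conflicting]
    if conflict.numel() == 0 or aligned.numel() == 0:
        return per_sample.mean()
    scale = (conflict.mean() / aligned.mean()).detach()  # match average magnitudes
    return conflict.mean() + scale * aligned.mean()
```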
arXiv Detail & Related papers (2023-02-22T14:50:24Z)
- Feature-Level Debiased Natural Language Understanding [86.8751772146264]
Existing natural language understanding (NLU) models often rely on dataset biases to achieve high performance on specific datasets.
We propose debiasing contrastive learning (DCT) to mitigate biased latent features while accounting for the dynamic nature of bias.
DCT outperforms state-of-the-art baselines on out-of-distribution datasets while maintaining in-distribution performance.
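A minimal InfoNCE-style sketch of what such a contrastive debiasing objective could look like, assuming positives with the same label but dissimilar bias features are selected upstream (that sampling rule is the paper's contribution and is not reproduced here):

```python
import torch
import torch.nn.functional as F

def debiasing_contrastive_loss(features, pos_features, temperature=0.1):
    """InfoNCE-style sketch: each anchor's positive shares its label but was
    chosen (upstream) to have dissimilar bias features, pushing the encoder
    to rely on label-relevant rather than bias features.
    """
    anchors = F.normalize(features, dim=-1)         # (batch, dim)
    positives = F.normalize(pos_features, dim=-1)   # (batch, dim)
    logits = anchors @ positives.t() / temperature  # other positives act as negatives
    targets = torch.arange(anchors.size(0), device=anchors.device)
    return F.cross_entropy(logits, targets)
```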
arXiv Detail & Related papers (2022-12-11T06:16:14Z)
- Discover and Mitigate Unknown Biases with Debiasing Alternate Networks [42.89260385194433]
We propose Debiasing Alternate Networks (DebiAN), which comprises two networks -- a discoverer and a classifier.
DebiAN aims at unlearning the biases identified by the discoverer.
While previous works evaluate debiasing results in terms of a single bias, we create Multi-Color MNIST dataset to better benchmark mitigation of multiple biases.
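A hedged sketch of the alternating update implied by the two-network setup; `discover_loss` and `debias_loss` are placeholders standing in for DebiAN's actual objectives:

```python
import torch

def train_debian_step(discoverer, classifier, opt_d, opt_c, batch,
                      discover_loss, debias_loss):
    """One alternating update (sketch): the discoverer hunts for a bias the
    classifier still exploits; the classifier then unlearns it.
    """
    x, y = batch
    # 1) update the discoverer to expose a bias in the current classifier
    opt_d.zero_grad()
    discover_loss(discoverer, classifier, x, y).backward()
    opt_d.step()
    # 2) update the classifier to be invariant to the discovered bias
    # (in practice the discoverer would be frozen during this step)
    opt_c.zero_grad()
    debias_loss(discoverer, classifier, x, y).backward()
    opt_c.step()
```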
arXiv Detail & Related papers (2022-07-20T17:59:51Z)
- General Greedy De-bias Learning [163.65789778416172]
We propose a General Greedy De-bias learning framework (GGD), which greedily trains the biased models and the base model, analogous to gradient descent in functional space.
GGD learns a more robust base model in both settings: task-specific biased models with prior knowledge, and a self-ensemble biased model without prior knowledge.
arXiv Detail & Related papers (2021-12-20T14:47:32Z)
- Learning Debiased Models with Dynamic Gradient Alignment and Bias-conflicting Sample Mining [39.00256193731365]
Deep neural networks notoriously suffer from dataset biases which are detrimental to model robustness, generalization and fairness.
We propose a two-stage debiasing scheme to combat intractable unknown biases.
arXiv Detail & Related papers (2021-11-25T14:50:10Z)
- Are Bias Mitigation Techniques for Deep Learning Effective? [24.84797949716142]
We introduce an improved evaluation protocol, sensible metrics, and a new dataset.
We evaluate seven state-of-the-art algorithms using the same network architecture.
We find that algorithms exploit hidden biases, are unable to scale to multiple forms of bias, and are highly sensitive to the choice of tuning set.
arXiv Detail & Related papers (2021-04-01T00:14:45Z)
- Improving Robustness by Augmenting Training Sentences with Predicate-Argument Structures [62.562760228942054]
Existing approaches to improve robustness against dataset biases mostly focus on changing the training objective.
We propose to augment the input sentences in the training data with their corresponding predicate-argument structures.
We show that without targeting a specific bias, our sentence augmentation improves the robustness of transformer models against multiple biases.
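A toy sketch of the augmentation step, assuming predicate-argument frames come from an upstream semantic role labeler; the serialization format (`PRED:`, `ARG0:`, `[SEP]`) is illustrative only, not the paper's exact scheme:

```python
def augment_with_predicate_arguments(sentence: str, frames) -> str:
    """Append a flattened predicate-argument string to the input (sketch).

    `frames` is assumed to come from an SRL tagger, e.g.
    [("sold", {"ARG0": "The company", "ARG1": "its shares"})].
    The augmented text is what the transformer is fine-tuned on.
    """
    parts = []
    for predicate, args in frames:
        spans = " ".join(f"{role}: {span}" for role, span in args.items())
        parts.append(f"PRED: {predicate} {spans}")
    return sentence + " [SEP] " + " ; ".join(parts)

# Example:
# augment_with_predicate_arguments(
#     "The company sold its shares.",
#     [("sold", {"ARG0": "The company", "ARG1": "its shares"})],
# )
```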
arXiv Detail & Related papers (2020-10-23T16:22:05Z)