Making Attention Mechanisms More Robust and Interpretable with Virtual
Adversarial Training for Semi-Supervised Text Classification
- URL: http://arxiv.org/abs/2104.08763v1
- Date: Sun, 18 Apr 2021 07:51:45 GMT
- Authors: Shunsuke Kitada, Hitoshi Iyatomi
- Score: 9.13755431537592
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We propose a new general training technique for attention mechanisms based on
virtual adversarial training (VAT). In a semi-supervised setting, VAT computes
adversarial perturbations from unlabeled data for attention mechanisms, which
previous studies have reported to be vulnerable to such perturbations.
Empirical experiments reveal that our technique (1) provides significantly
better prediction performance compared to not only conventional adversarial
training-based techniques but also VAT-based techniques in a semi-supervised
setting, (2) demonstrates a stronger correlation with the word importance and
better agreement with evidence provided by humans, and (3) gains in performance
with increasing amounts of unlabeled data.
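The core idea, following Miyato et al.'s VAT, is to find the perturbation of the attention scores that most changes the model's output distribution on an unlabeled example, then penalize that divergence. A minimal NumPy sketch of this loop under assumed toy dimensions (all names and shapes are illustrative, and the finite-difference gradient stands in for the backward pass an autograd framework would use; this is not the paper's implementation):

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(z, axis=-1):
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def kl(p, q, tiny=1e-12):
    """KL divergence between two class distributions."""
    return float(np.sum(p * (np.log(p + tiny) - np.log(q + tiny))))

# Toy attention classifier over one unlabeled sentence (shapes illustrative).
n_tok, d, n_cls = 5, 8, 3
x = rng.normal(size=(n_tok, d))      # token embeddings
w_att = rng.normal(size=d)           # attention scoring vector
W_out = rng.normal(size=(d, n_cls))  # output layer

def predict(r):
    """Class distribution with perturbation r added to the attention scores."""
    a = softmax(x @ w_att + r)       # attention weights over tokens
    return softmax((a @ x) @ W_out)  # attended context -> class probabilities

p_clean = predict(np.zeros(n_tok))

# Virtual adversarial direction: gradient of KL(p_clean || p(r)) at a small
# random r, estimated here by central finite differences (xi is larger than
# VAT's usual 1e-6 to keep the finite-difference signal above numerical noise).
xi, eps_adv, h = 1e-2, 1.0, 1e-4
r0 = rng.normal(size=n_tok)
r0 *= xi / np.linalg.norm(r0)
g = np.zeros(n_tok)
for i in range(n_tok):
    e_i = np.zeros(n_tok)
    e_i[i] = h
    g[i] = (kl(p_clean, predict(r0 + e_i))
            - kl(p_clean, predict(r0 - e_i))) / (2 * h)

r_adv = eps_adv * g / (np.linalg.norm(g) + 1e-12)
vat_loss = kl(p_clean, predict(r_adv))  # added to the supervised loss
```

Because `vat_loss` depends only on model outputs and not on a label, it can be computed on unlabeled text, which is what makes the technique applicable in the semi-supervised setting.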
Related papers
- Adversarial Training: A Survey [130.89534734092388]
Adversarial training (AT) refers to integrating adversarial examples into the training process.
Recent studies have demonstrated the effectiveness of AT in improving the robustness of deep neural networks against diverse adversarial attacks.
arXiv Detail & Related papers (2024-10-19T08:57:35Z)
- The Pitfalls and Promise of Conformal Inference Under Adversarial Attacks [90.52808174102157]
In safety-critical applications such as medical imaging and autonomous driving, it is imperative to maintain both high adversarial robustness and reliable uncertainty quantification.
A notable knowledge gap remains concerning the uncertainty inherent in adversarially trained models.
This study investigates the uncertainty of deep learning models by examining the performance of conformal prediction (CP) in the context of standard adversarial attacks.
arXiv Detail & Related papers (2024-05-14T18:05:19Z)
- Doubly Robust Instance-Reweighted Adversarial Training [107.40683655362285]
We propose a novel doubly-robust instance reweighted adversarial framework.
Our importance weights are obtained by optimizing the KL-divergence regularized loss function.
Our proposed approach outperforms related state-of-the-art baseline methods in terms of average robust performance.
arXiv Detail & Related papers (2023-08-01T06:16:18Z)
- Implicit Counterfactual Data Augmentation for Robust Learning [24.795542869249154]
This study proposes an Implicit Counterfactual Data Augmentation method to remove spurious correlations and make stable predictions.
Experiments have been conducted across various biased learning scenarios covering both image and text datasets.
arXiv Detail & Related papers (2023-04-26T10:36:40Z)
- TWINS: A Fine-Tuning Framework for Improved Transferability of Adversarial Robustness and Generalization [89.54947228958494]
This paper focuses on the fine-tuning of an adversarially pre-trained model in various classification tasks.
We propose a novel statistics-based approach, the Two-WIng NormaliSation (TWINS) fine-tuning framework.
TWINS is shown to be effective on a wide range of image classification datasets in terms of both generalization and robustness.
arXiv Detail & Related papers (2023-03-20T14:12:55Z)
- Relation Extraction with Weighted Contrastive Pre-training on Distant Supervision [22.904752492573504]
We propose a weighted contrastive learning method by leveraging the supervised data to estimate the reliability of pre-training instances.
Experimental results on three supervised datasets demonstrate the advantages of our proposed weighted contrastive learning approach.
arXiv Detail & Related papers (2022-05-18T07:45:59Z)
- Enhancing Adversarial Training with Feature Separability [52.39305978984573]
We introduce the concept of an adversarial training graph (ATG), with which the proposed adversarial training with feature separability (ATFS) boosts intra-class feature similarity and increases inter-class feature variance.
Through comprehensive experiments, we demonstrate that the proposed ATFS framework significantly improves both clean and robust performance.
arXiv Detail & Related papers (2022-05-02T04:04:23Z)
- Robust Pre-Training by Adversarial Contrastive Learning [120.33706897927391]
Recent work has shown that, when integrated with adversarial training, self-supervised pre-training can lead to state-of-the-art robustness.
We improve robustness-aware self-supervised pre-training by learning representations consistent under both data augmentations and adversarial perturbations.
arXiv Detail & Related papers (2020-10-26T04:44:43Z)
- Attention Meets Perturbations: Robust and Interpretable Attention with Adversarial Training [7.106986689736828]
We propose a general training technique for natural language processing tasks, including AT for attention (Attention AT) and more interpretable AT for attention (Attention iAT).
The proposed techniques improve prediction performance and model interpretability by training the attention mechanisms with AT.
arXiv Detail & Related papers (2020-09-25T07:26:45Z)
- Adversarial Learning for Supervised and Semi-supervised Relation Extraction in Biomedical Literature [2.8881198461098894]
Adversarial training is a technique for improving model performance by including adversarial examples in the training process.
In this paper, we investigate adversarial training with multiple adversarial examples to benefit the relation extraction task.
We also apply the adversarial training technique in semi-supervised scenarios to utilize unlabeled data.
arXiv Detail & Related papers (2020-05-08T20:19:26Z)
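Several of the entries above build on the same primitive that the last one defines: generating an adversarial example by perturbing the input along the loss gradient and training on it. A minimal NumPy sketch of one FGSM-style step on a toy linear classifier over a document embedding (all shapes, values, and the linear model are illustrative, not any listed paper's implementation):

```python
import numpy as np

rng = np.random.default_rng(1)

def softmax(z):
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

# Toy setup: one "document" embedding and a linear classifier.
d, n_cls = 16, 4
x = rng.normal(size=d)                 # input embedding
W = rng.normal(size=(d, n_cls))        # classifier weights
y = 2                                  # gold label

def loss_and_grad_x(x):
    """Cross-entropy loss and its gradient w.r.t. the input embedding."""
    p = softmax(W.T @ x)
    onehot = np.eye(n_cls)[y]
    return -float(np.log(p[y])), W @ (p - onehot)  # dL/dx = W (p - onehot)

loss, g = loss_and_grad_x(x)
eps = 0.5
x_adv = x + eps * np.sign(g)           # FGSM: one signed-gradient step
adv_loss, _ = loss_and_grad_x(x_adv)
# An adversarial training step would then mix loss(x) and loss(x_adv).
```

Because the cross-entropy of a linear model is convex in the input, the signed-gradient step is guaranteed to increase the loss here; VAT (above) replaces the label-dependent gradient with a divergence-based one so the same idea extends to unlabeled data.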
This list is automatically generated from the titles and abstracts of the papers in this site.