MPAT: Building Robust Deep Neural Networks against Textual Adversarial
Attacks
- URL: http://arxiv.org/abs/2402.18792v1
- Date: Thu, 29 Feb 2024 01:49:18 GMT
- Title: MPAT: Building Robust Deep Neural Networks against Textual Adversarial
Attacks
- Authors: Fangyuan Zhang, Huichi Zhou, Shuangjiao Li, Hongtao Wang
- Abstract summary: We propose a malicious perturbation based adversarial training method (MPAT) for building robust deep neural networks against adversarial attacks.
Specifically, we construct a multi-level malicious example generation strategy to generate adversarial examples with malicious perturbations.
We employ a novel training objective function to ensure achieving the defense goal without compromising the performance on the original task.
- Score: 4.208423642716679
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Deep neural networks have been proven to be vulnerable to adversarial
examples and various methods have been proposed to defend against adversarial
attacks for natural language processing tasks. However, previous defense
methods have limitations in maintaining effective defense while ensuring the
performance of the original task. In this paper, we propose a malicious
perturbation based adversarial training method (MPAT) for building robust deep
neural networks against textual adversarial attacks. Specifically, we construct
a multi-level malicious example generation strategy to generate adversarial
examples with malicious perturbations, which are used instead of original
inputs for model training. Additionally, we employ a novel training objective
function to ensure achieving the defense goal without compromising the
performance on the original task. We conduct comprehensive experiments to
evaluate our defense method by attacking five victim models on three benchmark
datasets. The results demonstrate that our method is more effective against
malicious adversarial attacks compared with previous defense methods while
maintaining or further improving the performance on the original task.
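As a rough illustration of the idea in the abstract (train on perturbed inputs while keeping predictions consistent with the original task), the following is a minimal sketch and not the authors' implementation. The two perturbation levels shown (word and character), the `SYNONYMS` table, and the form of `combined_loss` (cross-entropy on the perturbed input plus a KL consistency term weighted by `beta`) are all assumptions made for illustration; the abstract does not specify them.

```python
import math
import random

# Hypothetical synonym table (an assumption); a real system would use a lexical resource.
SYNONYMS = {"good": ["great", "fine"], "movie": ["film"]}


def word_level_perturb(tokens, rng):
    """Replace every token that has an entry in SYNONYMS with a random synonym."""
    return [rng.choice(SYNONYMS[t]) if t in SYNONYMS else t for t in tokens]


def char_level_perturb(tokens, rng):
    """Swap two adjacent inner characters of one randomly chosen token of length >= 4."""
    candidates = [i for i, t in enumerate(tokens) if len(t) >= 4]
    if not candidates:
        return list(tokens)
    i = rng.choice(candidates)
    t = tokens[i]
    j = rng.randrange(1, len(t) - 2)  # j and j + 1 are both inner positions
    out = list(tokens)
    out[i] = t[:j] + t[j + 1] + t[j] + t[j + 2:]
    return out


def cross_entropy(probs, label):
    """Negative log-probability assigned to the true label."""
    return -math.log(probs[label])


def kl_divergence(p, q):
    """KL(p || q) for discrete distributions; assumes q is strictly positive."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def combined_loss(probs_clean, probs_adv, label, beta=1.0):
    """Assumed objective: fit the perturbed input and stay close to the clean prediction."""
    return cross_entropy(probs_adv, label) + beta * kl_divergence(probs_clean, probs_adv)


if __name__ == "__main__":
    rng = random.Random(0)
    tokens = ["a", "good", "movie", "overall"]
    perturbed = char_level_perturb(word_level_perturb(tokens, rng), rng)
    print(perturbed)
    # The probabilities below are made-up classifier outputs, for illustration only.
    print(combined_loss([0.9, 0.1], [0.6, 0.4], label=0, beta=1.0))
```

In this sketch the KL term is zero when the model gives the same prediction on the clean and perturbed inputs, so the loss reduces to ordinary cross-entropy on the perturbed example.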
Related papers
- GenFighter: A Generative and Evolutive Textual Attack Removal [6.044610337297754]
Adversarial attacks pose significant challenges to deep neural networks (DNNs) such as Transformer models in natural language processing (NLP).
This paper introduces a novel defense strategy, called GenFighter, which enhances adversarial robustness by learning and reasoning on the training classification distribution.
We show that GenFighter outperforms state-of-the-art defenses in accuracy under attack and attack success rate metrics.
arXiv Detail & Related papers (2024-04-17T16:32:13Z) - Mutual-modality Adversarial Attack with Semantic Perturbation [81.66172089175346]
We propose a novel approach that generates adversarial attacks in a mutual-modality optimization scheme.
Our approach outperforms state-of-the-art attack methods and can be readily deployed as a plug-and-play solution.
arXiv Detail & Related papers (2023-12-20T05:06:01Z) - Model-Agnostic Meta-Attack: Towards Reliable Evaluation of Adversarial
Robustness [53.094682754683255]
We propose a Model-Agnostic Meta-Attack (MAMA) approach to discover stronger attack algorithms automatically.
Our method learns the optimizer in adversarial attacks parameterized by a recurrent neural network.
We develop a model-agnostic training algorithm to improve the generalization ability of the learned optimizer when attacking unseen defenses.
arXiv Detail & Related papers (2021-10-13T13:54:24Z) - TREATED: Towards Universal Defense against Textual Adversarial Attacks [28.454310179377302]
We propose TREATED, a universal adversarial detection method that can defend against attacks of various perturbation levels without making any assumptions.
Extensive experiments on three competitive neural networks and two widely used datasets show that our method achieves better detection performance than baselines.
arXiv Detail & Related papers (2021-09-13T03:31:20Z) - Searching for an Effective Defender: Benchmarking Defense against
Adversarial Word Substitution [83.84968082791444]
Deep neural networks are vulnerable to intentionally crafted adversarial examples.
Various methods have been proposed to defend against adversarial word-substitution attacks for neural NLP models.
arXiv Detail & Related papers (2021-08-29T08:11:36Z) - Improving the Adversarial Robustness for Speaker Verification by Self-Supervised Learning [95.60856995067083]
This work is among the first to perform adversarial defense for ASV without knowing the specific attack algorithms.
We propose to perform adversarial defense from two perspectives: 1) adversarial perturbation purification and 2) adversarial perturbation detection.
Experimental results show that our detection module effectively shields the ASV by detecting adversarial samples with an accuracy of around 80%.
arXiv Detail & Related papers (2021-06-01T07:10:54Z) - Mitigating Gradient-based Adversarial Attacks via Denoising and
Compression [7.305019142196582]
Gradient-based adversarial attacks on deep neural networks pose a serious threat.
They can be deployed by adding imperceptible perturbations to the test data of any network.
Denoising and dimensionality reduction are two distinct methods that have been investigated to combat such attacks.
arXiv Detail & Related papers (2021-04-03T22:57:01Z) - Online Alternate Generator against Adversarial Attacks [144.45529828523408]
Deep learning models are notoriously sensitive to adversarial examples which are synthesized by adding quasi-perceptible noises on real images.
We propose a portable defense method, online alternate generator, which does not need to access or modify the parameters of the target networks.
The proposed method works by online synthesizing another image from scratch for an input image, instead of removing or destroying adversarial noises.
arXiv Detail & Related papers (2020-09-17T07:11:16Z) - Robust Tracking against Adversarial Attacks [69.59717023941126]
We first attempt to generate adversarial examples on top of video sequences to improve the tracking robustness against adversarial attacks.
We apply the proposed adversarial attack and defense approaches to state-of-the-art deep tracking algorithms.
arXiv Detail & Related papers (2020-07-20T08:05:55Z)