Adversarial Attack and Defense of Structured Prediction Models
- URL: http://arxiv.org/abs/2010.01610v2
- Date: Sun, 18 Oct 2020 09:39:58 GMT
- Title: Adversarial Attack and Defense of Structured Prediction Models
- Authors: Wenjuan Han, Liwen Zhang, Yong Jiang, Kewei Tu
- Abstract summary: In this paper, we investigate attacks and defenses for structured prediction tasks in NLP.
The structured output of structured prediction models is sensitive to small perturbations in the input.
We propose a novel and unified framework that learns to attack a structured prediction model using a sequence-to-sequence model.
- Score: 58.49290114755019
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Building an effective adversarial attacker and elaborating on countermeasures
for adversarial attacks for natural language processing (NLP) have attracted a
lot of research in recent years. However, most of the existing approaches focus
on classification problems. In this paper, we investigate attacks and defenses
for structured prediction tasks in NLP. Besides the difficulty of perturbing
discrete words and the sentence fluency problem faced by attackers in any NLP
tasks, there is a specific challenge to attackers of structured prediction
models: the structured output of structured prediction models is sensitive to
small perturbations in the input. To address these problems, we propose a novel
and unified framework that learns to attack a structured prediction model using
a sequence-to-sequence model with feedbacks from multiple reference models of
the same structured prediction task. Based on the proposed attack, we further
reinforce the victim model with adversarial training, making its prediction
more robust and accurate. We evaluate the proposed framework in dependency
parsing and part-of-speech tagging. Automatic and human evaluations show that
our proposed framework succeeds in both attacking state-of-the-art structured
prediction models and boosting them with adversarial training.
Related papers
- MirrorCheck: Efficient Adversarial Defense for Vision-Language Models [55.73581212134293]
We propose a novel, yet elegantly simple approach for detecting adversarial samples in Vision-Language Models.
Our method leverages Text-to-Image (T2I) models to generate images based on captions produced by target VLMs.
Empirical evaluations conducted on different datasets validate the efficacy of our approach.
arXiv Detail & Related papers (2024-06-13T15:55:04Z) - Problem space structural adversarial attacks for Network Intrusion Detection Systems based on Graph Neural Networks [8.629862888374243]
We propose the first formalization of adversarial attacks specifically tailored for GNN in network intrusion detection.
We outline and model the problem space constraints that attackers need to consider to carry out feasible structural attacks in real-world scenarios.
Our findings demonstrate the increased robustness of the models against classical feature-based adversarial attacks.
arXiv Detail & Related papers (2024-03-18T14:40:33Z) - Fooling the Textual Fooler via Randomizing Latent Representations [13.77424820701913]
adversarial word-level perturbations are well-studied and effective attack strategies.
We propose a lightweight and attack-agnostic defense whose main goal is to perplex the process of generating an adversarial example.
We empirically demonstrate near state-of-the-art robustness of AdvFooler against representative adversarial word-level attacks.
arXiv Detail & Related papers (2023-10-02T06:57:25Z) - Adversarial Attacks Against Uncertainty Quantification [10.655660123083607]
This work focuses on a different adversarial scenario in which the attacker is still interested in manipulating the uncertainty estimate.
In particular, the goal is to undermine the use of machine-learning models when their outputs are consumed by a downstream module or by a human operator.
arXiv Detail & Related papers (2023-09-19T12:54:09Z) - COVER: A Heuristic Greedy Adversarial Attack on Prompt-based Learning in
Language Models [4.776465250559034]
We propose a prompt-based adversarial attack on manual templates in black box scenarios.
First of all, we design character-level and word-level approaches to break manual templates separately.
And we present a greedy algorithm for the attack based on the above destructive approaches.
arXiv Detail & Related papers (2023-06-09T03:53:42Z) - In and Out-of-Domain Text Adversarial Robustness via Label Smoothing [64.66809713499576]
We study the adversarial robustness provided by various label smoothing strategies in foundational models for diverse NLP tasks.
Our experiments show that label smoothing significantly improves adversarial robustness in pre-trained models like BERT, against various popular attacks.
We also analyze the relationship between prediction confidence and robustness, showing that label smoothing reduces over-confident errors on adversarial examples.
arXiv Detail & Related papers (2022-12-20T14:06:50Z) - Autoregressive Structured Prediction with Language Models [73.11519625765301]
We describe an approach to model structures as sequences of actions in an autoregressive manner with PLMs.
Our approach achieves the new state-of-the-art on all the structured prediction tasks we looked at.
arXiv Detail & Related papers (2022-10-26T13:27:26Z) - A Unified Evaluation of Textual Backdoor Learning: Frameworks and
Benchmarks [72.7373468905418]
We develop an open-source toolkit OpenBackdoor to foster the implementations and evaluations of textual backdoor learning.
We also propose CUBE, a simple yet strong clustering-based defense baseline.
arXiv Detail & Related papers (2022-06-17T02:29:23Z) - Model-Agnostic Meta-Attack: Towards Reliable Evaluation of Adversarial
Robustness [53.094682754683255]
We propose a Model-Agnostic Meta-Attack (MAMA) approach to discover stronger attack algorithms automatically.
Our method learns the in adversarial attacks parameterized by a recurrent neural network.
We develop a model-agnostic training algorithm to improve the ability of the learned when attacking unseen defenses.
arXiv Detail & Related papers (2021-10-13T13:54:24Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.