Fraud's Bargain Attack: Generating Adversarial Text Samples via Word Manipulation Process
- URL: http://arxiv.org/abs/2303.01234v2
- Date: Wed, 27 Dec 2023 17:54:38 GMT
- Title: Fraud's Bargain Attack: Generating Adversarial Text Samples via Word Manipulation Process
- Authors: Mingze Ni, Zhensu Sun and Wei Liu
- Abstract summary: This study proposes a new method called the Fraud's Bargain Attack.
It uses a randomization mechanism to expand the search space and produce high-quality adversarial examples.
It outperforms other methods in terms of success rate, imperceptibility and sentence quality.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Recent research has revealed that natural language processing (NLP) models
are vulnerable to adversarial examples. However, the current techniques for
generating such examples rely on deterministic heuristic rules, which fail to
produce optimal adversarial examples. In response, this study proposes a new
method called the Fraud's Bargain Attack (FBA), which uses a randomization
mechanism to expand the search space and produce high-quality adversarial
examples with a higher probability of success. FBA uses the Metropolis-Hastings
sampler, a type of Markov Chain Monte Carlo sampler, to improve the selection
of adversarial examples from all candidates generated by a customized
stochastic process called the Word Manipulation Process (WMP). The WMP method
modifies individual words in a contextually-aware manner through insertion,
removal, or substitution. Through extensive experiments, this study
demonstrates that FBA outperforms other methods in terms of attack success
rate, imperceptibility and sentence quality.
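To make the abstract's two ingredients concrete, here is a minimal sketch of the FBA loop: a WMP-style proposal that inserts, removes, or substitutes one word, wrapped in a Metropolis-Hastings accept/reject step. This is not the authors' implementation; victim_loss and VOCAB are toy placeholders (a real attack queries the target classifier and draws context-aware candidates from a masked language model), and the proposal-asymmetry correction in the acceptance ratio is omitted for brevity.

```python
import math
import random

# Toy placeholders (hypothetical, not from the paper's code): a real attack
# would query the victim classifier and draw context-aware candidates from
# a masked language model.
VOCAB = ["good", "bad", "film", "plot", "truly", "rather"]

def victim_loss(tokens):
    """Stand-in for the victim classifier's loss on the true label;
    higher loss means more adversarial."""
    return (sum(len(t) for t in tokens) % 7) / 7.0  # dummy score in [0, 1)

def wmp_proposal(tokens):
    """One WMP-style stochastic edit: insert, remove, or substitute a
    single word (the paper conditions these edits on context)."""
    tokens = list(tokens)
    op = random.choice(["insert", "remove", "substitute"])
    i = random.randrange(len(tokens))
    if op == "insert":
        tokens.insert(i, random.choice(VOCAB))
    elif op == "remove" and len(tokens) > 1:
        tokens.pop(i)
    else:
        tokens[i] = random.choice(VOCAB)
    return tokens

def fba_attack(tokens, steps=200, temperature=0.05):
    """Metropolis-Hastings over WMP candidates: accept a proposal with
    probability min(1, exp((cand_loss - cur_loss) / temperature)), so
    more adversarial candidates are favored while occasional downhill
    moves let the chain escape local optima."""
    current, cur_loss = list(tokens), victim_loss(tokens)
    for _ in range(steps):
        cand = wmp_proposal(current)
        cand_loss = victim_loss(cand)
        if random.random() < min(1.0, math.exp((cand_loss - cur_loss) / temperature)):
            current, cur_loss = cand, cand_loss
    return current

print(" ".join(fba_attack("the film was truly good".split())))
```

In the paper the target distribution also rewards semantic similarity and fluency, which is what keeps accepted samples imperceptible; the dummy loss above only illustrates the sampler's mechanics.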
Related papers
- Balancing Diversity and Risk in LLM Sampling: How to Select Your Method and Parameter for Open-Ended Text Generation
We propose a systematic way to estimate the intrinsic capacity of a truncation sampling method by considering the trade-off between diversity and risk at each decoding step.
Our work provides a comprehensive comparison between existing truncation sampling methods, as well as their recommended parameters as a guideline for users.
arXiv Detail & Related papers (2024-08-24T14:14:32Z) - A Constraint-Enforcing Reward for Adversarial Attacks on Text Classifiers [10.063169009242682]
We train an encoder-decoder paraphrase model to generate adversarial examples.
We adopt a reinforcement learning algorithm and propose a constraint-enforcing reward.
We show how key design choices impact the generated examples and discuss the strengths and weaknesses of the proposed approach.
arXiv Detail & Related papers (2024-05-20T09:33:43Z) - Reversible Jump Attack to Textual Classifiers with Modification Reduction [8.247761405798874]
Reversible Jump Attack (RJA) and Metropolis-Hastings Modification Reduction (MMR) are proposed.
RJA-MMR outperforms current state-of-the-art methods in attack performance, imperceptibility, fluency and grammatical correctness.
arXiv Detail & Related papers (2024-03-21T04:54:31Z) - Token-Level Adversarial Prompt Detection Based on Perplexity Measures
and Contextual Information [67.78183175605761]
Large Language Models are susceptible to adversarial prompt attacks.
This vulnerability underscores a significant concern regarding the robustness and reliability of LLMs.
We introduce a novel approach to detecting adversarial prompts at a token level (a schematic sketch of this surprisal-based flagging appears after this list).
arXiv Detail & Related papers (2023-11-20T03:17:21Z) - Context-aware Adversarial Attack on Named Entity Recognition [15.049160192547909]
We study context-aware adversarial attack methods to examine the model's robustness.
Specifically, we propose perturbing the most informative words for recognizing entities to create adversarial examples.
Experiments and analyses show that our methods are more effective in deceiving the model into making wrong predictions than strong baselines.
arXiv Detail & Related papers (2023-09-16T14:04:23Z) - In and Out-of-Domain Text Adversarial Robustness via Label Smoothing [64.66809713499576]
We study the adversarial robustness provided by various label smoothing strategies in foundational models for diverse NLP tasks.
Our experiments show that label smoothing significantly improves adversarial robustness in pre-trained models like BERT, against various popular attacks.
We also analyze the relationship between prediction confidence and robustness, showing that label smoothing reduces over-confident errors on adversarial examples.
arXiv Detail & Related papers (2022-12-20T14:06:50Z) - ADDMU: Detection of Far-Boundary Adversarial Examples with Data and
Model Uncertainty Estimation [125.52743832477404]
Adversarial Examples Detection (AED) is a crucial defense technique against adversarial attacks.
We propose a new technique, ADDMU, which combines two types of uncertainty estimation for both regular and far-boundary (FB) adversarial example detection.
Our new method outperforms previous methods by 3.6 and 6.0 AUC points under each scenario.
arXiv Detail & Related papers (2022-10-22T09:11:12Z) - Phrase-level Adversarial Example Generation for Neural Machine
Translation [75.01476479100569]
We propose a phrase-level adversarial example generation (PAEG) method to enhance the robustness of the model.
We verify our method on three benchmarks, including LDC Chinese-English, IWSLT14 German-English, and WMT14 English-German tasks.
arXiv Detail & Related papers (2022-01-06T11:00:49Z) - Randomized Substitution and Vote for Textual Adversarial Example
Detection [6.664295299367366]
A line of work has shown that natural language processing models are vulnerable to adversarial examples.
We propose a novel textual adversarial example detection method, termed Randomized Substitution and Vote (RS&V); a toy version of the voting idea appears after this list.
Empirical evaluations on three benchmark datasets demonstrate that RS&V could detect the textual adversarial examples more successfully than the existing detection methods.
arXiv Detail & Related papers (2021-09-13T04:17:58Z) - Contextualized Perturbation for Textual Adversarial Attack [56.370304308573274]
Adversarial examples expose the vulnerabilities of natural language processing (NLP) models.
This paper presents CLARE, a ContextuaLized AdversaRial Example generation model that produces fluent and grammatical outputs.
arXiv Detail & Related papers (2020-09-16T06:53:15Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the accuracy of the information presented and is not responsible for any consequences of its use.