Adversarial Text Generation with Dynamic Contextual Perturbation
- URL: http://arxiv.org/abs/2506.09148v1
- Date: Tue, 10 Jun 2025 18:02:37 GMT
- Title: Adversarial Text Generation with Dynamic Contextual Perturbation
- Authors: Hetvi Waghela, Jaydip Sen, Sneha Rakshit, Subhasis Dasgupta
- Abstract summary: Adversarial attacks on Natural Language Processing (NLP) models expose vulnerabilities by introducing subtle perturbations to input text. We propose a novel adversarial text attack scheme named Dynamic Contextual Perturbation (DCP). DCP generates context-aware perturbations across sentences, paragraphs, and documents, ensuring semantic fidelity and fluency.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Adversarial attacks on Natural Language Processing (NLP) models expose vulnerabilities by introducing subtle perturbations to input text, often leading to misclassification while maintaining human readability. Existing methods typically focus on word-level or local text segment alterations, overlooking the broader context, which results in detectable or semantically inconsistent perturbations. We propose a novel adversarial text attack scheme named Dynamic Contextual Perturbation (DCP). DCP dynamically generates context-aware perturbations across sentences, paragraphs, and documents, ensuring semantic fidelity and fluency. Leveraging the capabilities of pre-trained language models, DCP iteratively refines perturbations through an adversarial objective function that balances the dual objectives of inducing model misclassification and preserving the naturalness of the text. This comprehensive approach allows DCP to produce more sophisticated and effective adversarial examples that better mimic natural language patterns. Our experimental results, conducted on various NLP models and datasets, demonstrate the efficacy of DCP in challenging the robustness of state-of-the-art NLP systems. By integrating dynamic contextual analysis, DCP significantly enhances the subtlety and impact of adversarial attacks. This study highlights the critical role of context in adversarial attacks and lays the groundwork for creating more robust NLP systems capable of withstanding sophisticated adversarial strategies.
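The objective function the abstract refers to is not given in closed form here, so the snippet below is only a minimal sketch of how a trade-off between inducing misclassification and preserving naturalness is commonly scored. The function name `attack_score`, the language-model log-probability fluency proxy, and the `fluency_weight` hyperparameter are illustrative assumptions rather than details taken from the paper.

```python
import torch
import torch.nn.functional as F

def attack_score(victim_logits, true_label, lm_token_logprobs, fluency_weight=1.0):
    """Score a perturbed text for the attacker: higher is better.

    victim_logits     : (num_classes,) logits of the victim classifier on the perturbed text
    true_label        : int, the label the attacker wants the victim to drop
    lm_token_logprobs : (seq_len,) token log-probabilities of the perturbed text
                        under a pretrained language model (assumed fluency proxy)
    fluency_weight    : assumed trade-off weight, not taken from the paper
    """
    # Misclassification term: cross-entropy on the true class grows as the
    # victim becomes less confident in the correct label.
    misclassification = F.cross_entropy(
        victim_logits.unsqueeze(0), torch.tensor([true_label])
    )
    # Naturalness term: low language-model likelihood means the edit reads
    # unnaturally, so it is penalized.
    unnaturalness = -lm_token_logprobs.mean()
    return misclassification - fluency_weight * unnaturalness
```

An iterative attack in this spirit would repeatedly propose context-aware edits at the sentence, paragraph, or document level and keep the candidate with the highest score.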
Related papers
- Compromising Embodied Agents with Contextual Backdoor Attacks [69.71630408822767]
Large language models (LLMs) have transformed the development of embodied intelligence.
This paper uncovers a significant backdoor security threat within this process.
By poisoning just a few contextual demonstrations, attackers can covertly compromise the contextual environment of a black-box LLM.
arXiv Detail & Related papers (2024-08-06T01:20:12Z)
- Enhance Robustness of Language Models Against Variation Attack through Graph Integration [28.291064456532517]
We propose a novel method, CHinese vAriatioN Graph Enhancement, to increase the robustness of language models against character variation attacks.
CHANGE essentially enhances PLMs' interpretation of adversarially manipulated text.
Experiments conducted on a multitude of NLP tasks show that CHANGE outperforms current language models in combating adversarial attacks.
arXiv Detail & Related papers (2024-04-18T09:04:39Z)
- Semantic Stealth: Adversarial Text Attacks on NLP Using Several Methods [0.0]
A text adversarial attack involves the deliberate manipulation of input text to mislead the predictions of the model.
The BERT-on-BERT attack, the PWWS attack, and Fraud Bargain's Attack (FBA) are explored in this paper.
PWWS emerges as the most potent adversary, consistently outperforming other methods across multiple evaluation scenarios.
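For context, the snippet below sketches the generic greedy word-substitution loop that saliency-based attacks such as PWWS build on: score each word by how much replacing it lowers the victim's confidence, then apply the most damaging substitutions first. The `predict_true_prob` and `synonyms` callbacks and the 0.5 decision threshold are simplified placeholders, not the paper's exact formulation.

```python
def greedy_word_substitution(words, predict_true_prob, synonyms, threshold=0.5):
    """words: list of tokens; predict_true_prob(list[str]) -> probability of the
    true class; synonyms(word) -> list of candidate replacements (assumed)."""
    base = predict_true_prob(words)
    proposals = []
    for i, word in enumerate(words):
        best_drop, best_sub = 0.0, None
        for candidate in synonyms(word):
            trial = words[:i] + [candidate] + words[i + 1:]
            drop = base - predict_true_prob(trial)
            if drop > best_drop:
                best_drop, best_sub = drop, candidate
        if best_sub is not None:
            proposals.append((best_drop, i, best_sub))
    # Apply the most damaging substitutions first, stopping once the victim's
    # confidence in the true class falls below the assumed decision threshold.
    adversarial = list(words)
    for _, i, substitute in sorted(proposals, reverse=True):
        adversarial[i] = substitute
        if predict_true_prob(adversarial) < threshold:
            break
    return adversarial
```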
arXiv Detail & Related papers (2024-04-08T02:55:01Z)
- Few-Shot Adversarial Prompt Learning on Vision-Language Models [62.50622628004134]
The vulnerability of deep neural networks to imperceptible adversarial perturbations has attracted widespread attention.
Previous efforts achieved zero-shot adversarial robustness by aligning adversarial visual features with text supervision.
We propose a few-shot adversarial prompt framework in which adapting input sequences with limited data yields significant improvements in adversarial robustness.
arXiv Detail & Related papers (2024-03-21T18:28:43Z)
- Improving Adversarial Transferability of Vision-Language Pre-training Models through Collaborative Multimodal Interaction [22.393624206051925]
Existing work rarely studies the transferability of attacks on Vision-Language Pre-training models.
We propose a novel attack, called the Collaborative Multimodal Interaction Attack (CMI-Attack).
CMI-Attack raises the transfer success rates from ALBEF to TCL, $\text{CLIP}_\text{ViT}$, and $\text{CLIP}_\text{CNN}$ by 8.11%-16.75% over state-of-the-art methods.
arXiv Detail & Related papers (2024-03-16T10:32:24Z)
- SA-Attack: Improving Adversarial Transferability of Vision-Language Pre-training Models via Self-Augmentation [56.622250514119294]
In contrast to white-box adversarial attacks, transfer attacks are more reflective of real-world scenarios.
We propose a self-augment-based transfer attack method, termed SA-Attack.
arXiv Detail & Related papers (2023-12-08T09:08:50Z)
- In and Out-of-Domain Text Adversarial Robustness via Label Smoothing [64.66809713499576]
We study the adversarial robustness provided by various label smoothing strategies in foundational models for diverse NLP tasks.
Our experiments show that label smoothing significantly improves adversarial robustness in pre-trained models like BERT against various popular attacks.
We also analyze the relationship between prediction confidence and robustness, showing that label smoothing reduces over-confident errors on adversarial examples.
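As a concrete reference point, the snippet below shows the standard label smoothing loss (here via PyTorch's built-in option) that such a defence fine-tunes with; the smoothing factor 0.1 and the toy tensors are illustrative, not the paper's settings.

```python
import torch
import torch.nn as nn

logits = torch.randn(4, 3)               # a toy batch of 4 examples, 3 classes
targets = torch.tensor([0, 2, 1, 0])

# label_smoothing=0.1 mixes the one-hot target with a uniform distribution:
# (1 - 0.1) * one_hot + 0.1 / num_classes, discouraging over-confident predictions.
criterion = nn.CrossEntropyLoss(label_smoothing=0.1)
loss = criterion(logits, targets)
print(loss.item())
```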
arXiv Detail & Related papers (2022-12-20T14:06:50Z)
- TextGrad: Advancing Robustness Evaluation in NLP by Gradient-Driven Optimization [35.8795761863398]
We propose TextGrad, a new attack generator using gradient-driven optimization, supporting high-accuracy and high-quality assessment of adversarial robustness in NLP.
We develop an effective convex relaxation method to co-optimize the continuously-relaxed site selection and perturbation variables.
As a first-order attack generation method, TextGrad can be baked into adversarial training to further improve the robustness of NLP models.
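The snippet below is a heavily simplified stand-in for the idea of continuously relaxing the discrete "perturb this site?" decision and optimizing it by gradient ascent; the single-candidate mixing, the clamp-based projection, and all names are assumptions for illustration, not TextGrad's actual procedure.

```python
import torch
import torch.nn.functional as F

def relaxed_attack_step(orig_emb, subst_emb, site_weights, model, label, lr=0.1):
    """One gradient-ascent step on relaxed site-selection variables.

    orig_emb     : (L, d) embeddings of the original tokens
    subst_emb    : (L, d) embeddings of one candidate substitute per position
    site_weights : (L,) tensor in [0, 1] with requires_grad=True; a relaxed
                   version of the binary "perturb this position?" decision
    model        : callable mapping a (1, L, d) embedding sequence to class logits
    label        : int, gold label whose loss the attacker wants to increase
    """
    w = site_weights.unsqueeze(-1)                  # (L, 1)
    mixed = (1 - w) * orig_emb + w * subst_emb      # convex combination of inputs
    loss = F.cross_entropy(model(mixed.unsqueeze(0)), torch.tensor([label]))
    loss.backward()
    with torch.no_grad():
        site_weights += lr * site_weights.grad      # ascend the attack loss
        site_weights.clamp_(0.0, 1.0)               # project back onto [0, 1]
        site_weights.grad.zero_()
    return site_weights
```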
arXiv Detail & Related papers (2022-12-19T05:55:58Z)
- Phrase-level Textual Adversarial Attack with Label Preservation [34.42846737465045]
We propose Phrase-Level Textual Adversarial aTtack (PLAT), which generates adversarial samples through phrase-level perturbations.
PLAT has superior attack effectiveness as well as better label consistency than strong baselines.
arXiv Detail & Related papers (2022-05-22T02:22:38Z)
- Learning-based Hybrid Local Search for the Hard-label Textual Attack [53.92227690452377]
We consider a rarely investigated but more rigorous setting, namely the hard-label attack, in which the attacker can only access the prediction label.
We propose a novel hard-label attack, the Learning-based Hybrid Local Search (LHLS) algorithm.
Our LHLS significantly outperforms existing hard-label attacks in terms of both attack performance and adversary quality.
arXiv Detail & Related papers (2022-01-20T14:16:07Z)
- Evaluating Deception Detection Model Robustness To Linguistic Variation [10.131671217810581]
We propose an analysis of model robustness against linguistic variation in the setting of deceptive news detection.
We consider two prediction tasks and compare three state-of-the-art embeddings to highlight consistent trends in model performance.
We find that character or mixed ensemble models are the most effective defenses and that character perturbation-based attack tactics are more successful.
arXiv Detail & Related papers (2021-04-23T17:25:38Z)
- Contextualized Perturbation for Textual Adversarial Attack [56.370304308573274]
Adversarial examples expose the vulnerabilities of natural language processing (NLP) models.
This paper presents CLARE, a ContextuaLized AdversaRial Example generation model that produces fluent and grammatical outputs.
arXiv Detail & Related papers (2020-09-16T06:53:15Z)
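As a rough illustration of the mask-and-infill style of contextualized perturbation that CLARE-like attacks rely on, the snippet below asks a pretrained masked language model for fluent, in-context replacements. The roberta-base checkpoint and the example sentence are arbitrary choices, and a real attack would additionally filter candidates by whether they flip the victim model's prediction.

```python
from transformers import pipeline

# Mask a position and let a pretrained masked language model propose fluent,
# in-context replacements for it.
fill = pipeline("fill-mask", model="roberta-base")

sentence = "The movie was <mask> and I would not recommend it."
for candidate in fill(sentence, top_k=5):
    print(candidate["token_str"], round(candidate["score"], 3))
```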