An Extensive Study on Adversarial Attack against Pre-trained Models of
Code
- URL: http://arxiv.org/abs/2311.07553v2
- Date: Thu, 23 Nov 2023 11:20:39 GMT
- Title: An Extensive Study on Adversarial Attack against Pre-trained Models of
Code
- Authors: Xiaohu Du, Ming Wen, Zichao Wei, Shangwen Wang, Hai Jin
- Abstract summary: Transformer-based pre-trained models of code (PTMC) have been widely utilized and have achieved state-of-the-art performance in many mission-critical applications.
They can be vulnerable to adversarial attacks through identifier substitution or coding style transformation.
This study systematically analyzes five state-of-the-art adversarial attack approaches from three perspectives.
- Score: 14.948361027395748
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Transformer-based pre-trained models of code (PTMC) have been widely utilized
and have achieved state-of-the-art performance in many mission-critical
applications. However, they can be vulnerable to adversarial attacks through
identifier substitution or coding style transformation, which can significantly
degrade accuracy and may further incur security concerns. Although several
approaches have been proposed to generate adversarial examples for PTMC, the
effectiveness and efficiency of such approaches, especially on different code
intelligence tasks, have not been well understood. To bridge this gap, this
study systematically analyzes five state-of-the-art adversarial attack
approaches from three perspectives: effectiveness, efficiency, and the quality
of generated examples. The results show that none of the five approaches
balances all these perspectives. In particular, approaches with a high attack
success rate tend to be time-consuming, and the adversarial code they generate
often lacks naturalness, and vice versa. To address this limitation, we explore
the impact of perturbing identifiers under different contexts and find that
identifier substitution within for and if statements is the most effective.
Based on these findings, we propose a new approach that prioritizes different
types of statements for various tasks and further utilizes beam search to
generate adversarial examples. Evaluation results show that it outperforms the
state-of-the-art ALERT in terms of both effectiveness and efficiency while
preserving the naturalness of the generated adversarial examples.
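The attack sketched in the abstract combines context-aware identifier substitution (prioritizing identifiers inside for and if statements) with beam search. The following is a minimal sketch of that idea, assuming Python source code and a black-box score_fn(code) that returns the victim model's confidence on the ground-truth label; score_fn, candidates, and the other names are illustrative, and the paper's substitute generation and per-task statement priorities are not reproduced here.

```python
# A minimal, illustrative sketch (not the authors' implementation): rename
# identifiers that occur inside `for`/`if` statements and keep the variants
# on which the victim model is least confident, using beam search.
import ast
import itertools
import re


def identifiers_in_for_if(source: str) -> set:
    """Collect identifier names that occur inside `for` and `if` statements."""
    names = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, (ast.For, ast.If)):
            for child in ast.walk(node):
                if isinstance(child, ast.Name):
                    names.add(child.id)
    return names


def rename(source: str, old: str, new: str) -> str:
    """Whole-word textual rename; a real attack would rename only true
    identifier tokens, not occurrences inside strings or comments."""
    return re.sub(rf"\b{re.escape(old)}\b", new, source)


def beam_search_attack(source, score_fn, candidates, beam_width=4, max_edits=3):
    """Beam search over identifier substitutions.

    `score_fn(code)` is assumed to return the victim model's confidence on
    the ground-truth label; the search stops once it drops below 0.5,
    i.e. the prediction is assumed to have flipped.
    """
    beam = [(score_fn(source), source)]
    for _ in range(max_edits):
        expanded = []
        for _, code in beam:
            targets = sorted(identifiers_in_for_if(code))
            for old, new in itertools.product(targets, candidates):
                if old != new:
                    variant = rename(code, old, new)
                    expanded.append((score_fn(variant), variant))
        beam = sorted(beam + expanded, key=lambda p: p[0])[:beam_width]
        if beam[0][0] < 0.5:
            break
    return beam[0]  # (lowest confidence found, corresponding adversarial code)
```

In practice, score_fn would wrap the victim PTMC (for example, a fine-tuned CodeBERT classifier) and candidates could be naturalness-aware substitute names produced by a masked language model, much as attacks like ALERT do; both choices are assumptions here rather than details taken from the paper.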
Related papers
- Countering Backdoor Attacks in Image Recognition: A Survey and Evaluation of Mitigation Strategies [10.801476967873173]
We present a review of existing mitigation strategies designed to counter backdoor attacks in image recognition.
We conduct an extensive benchmarking of sixteen state-of-the-art approaches against eight distinct backdoor attacks.
Our results, derived from 122,236 individual experiments, indicate that while many approaches provide some level of protection, their performance can vary considerably.
arXiv Detail & Related papers (2024-11-17T23:30:01Z)
- Simple Perturbations Subvert Ethereum Phishing Transactions Detection: An Empirical Analysis [12.607077453567594]
We investigate the impact of various adversarial attack strategies on model performance metrics, such as accuracy, precision, recall, and F1-score.
We also examine the effectiveness of different mitigation strategies, including adversarial training and enhanced feature selection, in improving model robustness.
arXiv Detail & Related papers (2024-08-06T20:40:20Z)
- MirrorCheck: Efficient Adversarial Defense for Vision-Language Models [55.73581212134293]
We propose a novel, yet elegantly simple approach for detecting adversarial samples in Vision-Language Models.
Our method leverages Text-to-Image (T2I) models to generate images based on captions produced by target VLMs.
Empirical evaluations conducted on different datasets validate the efficacy of our approach.
arXiv Detail & Related papers (2024-06-13T15:55:04Z)
- Meta Invariance Defense Towards Generalizable Robustness to Unknown Adversarial Attacks [62.036798488144306]
Current defenses mainly focus on known attacks, while adversarial robustness to unknown attacks is seriously overlooked.
We propose an attack-agnostic defense method named Meta Invariance Defense (MID).
We show that MID simultaneously achieves robustness to the imperceptible adversarial perturbations in high-level image classification and attack-suppression in low-level robust image regeneration.
arXiv Detail & Related papers (2024-04-04T10:10:38Z)
- Enhancing Adversarial Attacks: The Similar Target Method [6.293148047652131]
Deep neural networks are vulnerable to adversarial examples, which pose a threat to the models' applications and raise security concerns.
We propose a similar targeted attack method named Similar Target (ST).
arXiv Detail & Related papers (2023-08-21T14:16:36Z)
- Doubly Robust Instance-Reweighted Adversarial Training [107.40683655362285]
We propose a novel doubly-robust instance reweighted adversarial framework.
Our importance weights are obtained by optimizing the KL-divergence regularized loss function.
Our proposed approach outperforms related state-of-the-art baseline methods in terms of average robust performance.
arXiv Detail & Related papers (2023-08-01T06:16:18Z)
- When Measures are Unreliable: Imperceptible Adversarial Perturbations toward Top-$k$ Multi-Label Learning [83.8758881342346]
A novel loss function is devised to generate adversarial perturbations that could achieve both visual and measure imperceptibility.
Experiments on large-scale benchmark datasets demonstrate the superiority of our proposed method in attacking the top-$k$ multi-label systems.
arXiv Detail & Related papers (2023-07-27T13:18:47Z)
- Adversarial Examples Detection with Enhanced Image Difference Features based on Local Histogram Equalization [20.132066800052712]
We propose an adversarial example detection framework based on a high-frequency information enhancement strategy.
This framework can effectively extract and amplify the feature differences between adversarial examples and normal examples.
arXiv Detail & Related papers (2023-05-08T03:14:01Z)
- MEAD: A Multi-Armed Approach for Evaluation of Adversarial Examples Detectors [24.296350262025552]
We propose a novel framework, called MEAD, for evaluating detectors based on several attack strategies.
In particular, we make use of three new objectives to generate attacks.
The proposed performance metric is based on the worst-case scenario.
arXiv Detail & Related papers (2022-06-30T17:05:45Z)
- Model-Agnostic Meta-Attack: Towards Reliable Evaluation of Adversarial Robustness [53.094682754683255]
We propose a Model-Agnostic Meta-Attack (MAMA) approach to discover stronger attack algorithms automatically.
Our method learns an optimizer for adversarial attacks, parameterized by a recurrent neural network.
We develop a model-agnostic training algorithm to improve the generalization ability of the learned optimizer when attacking unseen defenses.
arXiv Detail & Related papers (2021-10-13T13:54:24Z)
- A Hamiltonian Monte Carlo Method for Probabilistic Adversarial Attack and Learning [122.49765136434353]
We present an effective method, called Hamiltonian Monte Carlo with Accumulated Momentum (HMCAM), aiming to generate a sequence of adversarial examples.
We also propose a new generative method called Contrastive Adversarial Training (CAT), which approaches the equilibrium distribution of adversarial examples.
Both quantitative and qualitative analyses on several natural image datasets and practical systems have confirmed the superiority of the proposed algorithm (a generic HMC sketch is given after this list).
arXiv Detail & Related papers (2020-10-15T16:07:26Z)
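For the Hamiltonian Monte Carlo entry above, the following is a generic, heavily simplified HMC sampler over adversarial perturbations, not the paper's HMCAM (its accumulated momentum and the CAT training procedure are omitted); model, x (a batched image tensor in [0, 1]), y, and the L_inf budget eps are assumed inputs rather than the paper's actual setup.

```python
# Generic HMC over perturbations: the target density is proportional to
# exp(CE(model(x + delta), y) / T), so high-loss (adversarial) perturbations
# are the most probable states. Clamping to the L_inf ball makes the chain
# approximate rather than exact. Illustrative only.
import torch
import torch.nn.functional as F


def hmc_adversarial_chain(model, x, y, eps=8 / 255, step=1e-2,
                          leapfrog=10, n_samples=5, temperature=1.0):
    def grad_log_p(delta):
        delta = delta.detach().requires_grad_(True)
        loss = F.cross_entropy(model(x + delta), y) / temperature
        return torch.autograd.grad(loss, delta)[0]

    def log_p(delta):
        with torch.no_grad():
            return F.cross_entropy(model(x + delta), y) / temperature

    delta, chain = torch.zeros_like(x), []
    for _ in range(n_samples):
        momentum = torch.randn_like(delta)
        d, m = delta.clone(), momentum.clone()
        # Leapfrog integration of the Hamiltonian dynamics.
        m = m + 0.5 * step * grad_log_p(d)
        for i in range(leapfrog):
            d = (d + step * m).clamp(-eps, eps)   # stay inside the budget
            if i < leapfrog - 1:
                m = m + step * grad_log_p(d)
        m = m + 0.5 * step * grad_log_p(d)
        # Metropolis correction with the usual acceptance ratio.
        h_old = -log_p(delta) + 0.5 * momentum.pow(2).sum()
        h_new = -log_p(d) + 0.5 * m.pow(2).sum()
        if torch.rand(()) < torch.exp(h_old - h_new):
            delta = d.detach()
        chain.append((x + delta).clamp(0, 1))   # assumes inputs in [0, 1]
    return chain
```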
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the content (including all information) and is not responsible for any consequences.