An Extensive Study on Adversarial Attack against Pre-trained Models of
Code
- URL: http://arxiv.org/abs/2311.07553v2
- Date: Thu, 23 Nov 2023 11:20:39 GMT
- Title: An Extensive Study on Adversarial Attack against Pre-trained Models of
Code
- Authors: Xiaohu Du, Ming Wen, Zichao Wei, Shangwen Wang, Hai Jin
- Abstract summary: Transformer-based pre-trained models of code (PTMC) have been widely utilized and have achieved state-of-the-art performance in many mission-critical applications.
They can be vulnerable to adversarial attacks through identifier substitution or coding style transformation.
This study systematically analyzes five state-of-the-art adversarial attack approaches from three perspectives.
- Score: 14.948361027395748
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Transformer-based pre-trained models of code (PTMC) have been widely utilized
and have achieved state-of-the-art performance in many mission-critical
applications. However, they can be vulnerable to adversarial attacks through
identifier substitution or coding style transformation, which can significantly
degrade accuracy and may further incur security concerns. Although several
approaches have been proposed to generate adversarial examples for PTMC, the
effectiveness and efficiency of such approaches, especially on different code
intelligence tasks, have not been well understood. To bridge this gap, this
study systematically analyzes five state-of-the-art adversarial attack
approaches from three perspectives: effectiveness, efficiency, and the quality
of generated examples. The results show that none of the five approaches
balances all these perspectives. In particular, approaches with a high attack
success rate tend to be time-consuming, and the adversarial code they generate
often lacks naturalness, and vice versa. To address this limitation, we explore
the impact of perturbing identifiers under different contexts and find that
identifier substitution within for and if statements is the most effective.
Based on these findings, we propose a new approach that prioritizes different
types of statements for various tasks and further utilizes beam search to
generate adversarial examples. Evaluation results show that it outperforms the
state-of-the-art ALERT in terms of both effectiveness and efficiency while
preserving the naturalness of the generated adversarial examples.
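The attack sketched in the abstract combines context-aware identifier substitution (prioritizing identifiers inside for and if statements) with beam search. The following is a minimal sketch of that idea, assuming Python source code and a black-box score_fn(code) that returns the victim model's confidence on the ground-truth label; score_fn, candidates, and the other names are illustrative, and the paper's substitute generation and per-task statement priorities are not reproduced here.

```python
# A minimal, illustrative sketch (not the authors' implementation): rename
# identifiers that occur inside `for`/`if` statements and keep the variants
# on which the victim model is least confident, using beam search.
import ast
import itertools
import re


def identifiers_in_for_if(source: str) -> set:
    """Collect identifier names that occur inside `for` and `if` statements."""
    names = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, (ast.For, ast.If)):
            for child in ast.walk(node):
                if isinstance(child, ast.Name):
                    names.add(child.id)
    return names


def rename(source: str, old: str, new: str) -> str:
    """Whole-word textual rename; a real attack would rename only true
    identifier tokens, not occurrences inside strings or comments."""
    return re.sub(rf"\b{re.escape(old)}\b", new, source)


def beam_search_attack(source, score_fn, candidates, beam_width=4, max_edits=3):
    """Beam search over identifier substitutions.

    `score_fn(code)` is assumed to return the victim model's confidence on
    the ground-truth label; the search stops once it drops below 0.5,
    i.e. the prediction is assumed to have flipped.
    """
    beam = [(score_fn(source), source)]
    for _ in range(max_edits):
        expanded = []
        for _, code in beam:
            targets = sorted(identifiers_in_for_if(code))
            for old, new in itertools.product(targets, candidates):
                if old != new:
                    variant = rename(code, old, new)
                    expanded.append((score_fn(variant), variant))
        beam = sorted(beam + expanded, key=lambda p: p[0])[:beam_width]
        if beam[0][0] < 0.5:
            break
    return beam[0]  # (lowest confidence found, corresponding adversarial code)
```

In practice, score_fn would wrap the victim PTMC (for example, a fine-tuned CodeBERT classifier) and candidates could be naturalness-aware substitute names produced by a masked language model, much as attacks like ALERT do; both choices are assumptions here rather than details taken from the paper.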
Related papers
- Countering Backdoor Attacks in Image Recognition: A Survey and Evaluation of Mitigation Strategies [10.801476967873173]
We present a review of existing mitigation strategies designed to counter backdoor attacks in image recognition.
We conduct an extensive benchmarking of sixteen state-of-the-art approaches against eight distinct backdoor attacks.
Our results, derived from 122,236 individual experiments, indicate that while many approaches provide some level of protection, their performance can vary considerably.
arXiv Detail & Related papers (2024-11-17T23:30:01Z)
- Simple Perturbations Subvert Ethereum Phishing Transactions Detection: An Empirical Analysis [12.607077453567594]
We investigate the impact of various adversarial attack strategies on model performance metrics, such as accuracy, precision, recall, and F1-score.
We also examine the effectiveness of different mitigation strategies, including adversarial training and enhanced feature selection, in improving model robustness.
arXiv Detail & Related papers (2024-08-06T20:40:20Z)
- MirrorCheck: Efficient Adversarial Defense for Vision-Language Models [55.73581212134293]
We propose a novel, yet elegantly simple approach for detecting adversarial samples in Vision-Language Models.
Our method leverages Text-to-Image (T2I) models to generate images based on captions produced by target VLMs.
Empirical evaluations conducted on different datasets validate the efficacy of our approach.
arXiv Detail & Related papers (2024-06-13T15:55:04Z)
- Meta Invariance Defense Towards Generalizable Robustness to Unknown Adversarial Attacks [62.036798488144306]
Current defenses mainly focus on known attacks, while adversarial robustness to unknown attacks is seriously overlooked.
We propose an attack-agnostic defense method named Meta Invariance Defense (MID).
We show that MID simultaneously achieves robustness to the imperceptible adversarial perturbations in high-level image classification and attack-suppression in low-level robust image regeneration.
arXiv Detail & Related papers (2024-04-04T10:10:38Z)
- Enhancing Adversarial Attacks: The Similar Target Method [6.293148047652131]
Deep neural networks are vulnerable to adversarial examples, which pose a threat to the models' applications and raise security concerns.
We propose a similar targeted attack method named Similar Target (ST).
arXiv Detail & Related papers (2023-08-21T14:16:36Z)
- Doubly Robust Instance-Reweighted Adversarial Training [107.40683655362285]
We propose a novel doubly-robust instance reweighted adversarial framework.
Our importance weights are obtained by optimizing the KL-divergence regularized loss function.
Our proposed approach outperforms related state-of-the-art baseline methods in terms of average robust performance.
arXiv Detail & Related papers (2023-08-01T06:16:18Z)
- When Measures are Unreliable: Imperceptible Adversarial Perturbations toward Top-$k$ Multi-Label Learning [83.8758881342346]
A novel loss function is devised to generate adversarial perturbations that could achieve both visual and measure imperceptibility.
Experiments on large-scale benchmark datasets demonstrate the superiority of our proposed method in attacking the top-$k$ multi-label systems.
arXiv Detail & Related papers (2023-07-27T13:18:47Z)
- Adversarial Examples Detection with Enhanced Image Difference Features based on Local Histogram Equalization [20.132066800052712]
We propose an adversarial example detection framework based on a high-frequency information enhancement strategy.
This framework can effectively extract and amplify the feature differences between adversarial examples and normal examples.
arXiv Detail & Related papers (2023-05-08T03:14:01Z)
- MEAD: A Multi-Armed Approach for Evaluation of Adversarial Examples Detectors [24.296350262025552]
We propose a novel framework, called MEAD, for evaluating detectors based on several attack strategies.
In particular, we make use of three new objectives to generate attacks.
The proposed performance metric is based on the worst-case scenario.
arXiv Detail & Related papers (2022-06-30T17:05:45Z)
- Model-Agnostic Meta-Attack: Towards Reliable Evaluation of Adversarial Robustness [53.094682754683255]
We propose a Model-Agnostic Meta-Attack (MAMA) approach to discover stronger attack algorithms automatically.
Our method learns an optimizer for adversarial attacks, parameterized by a recurrent neural network.
We develop a model-agnostic training algorithm to improve the generalization ability of the learned optimizer when attacking unseen defenses.
arXiv Detail & Related papers (2021-10-13T13:54:24Z)
- A Hamiltonian Monte Carlo Method for Probabilistic Adversarial Attack and Learning [122.49765136434353]
We present an effective method, called Hamiltonian Monte Carlo with Accumulated Momentum (HMCAM), aiming to generate a sequence of adversarial examples.
We also propose a new generative method called Contrastive Adversarial Training (CAT), which approaches the equilibrium distribution of adversarial examples.
Both quantitative and qualitative analyses on several natural image datasets and practical systems have confirmed the superiority of the proposed algorithm (a generic HMC sketch is given after this list).
arXiv Detail & Related papers (2020-10-15T16:07:26Z)
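For the Hamiltonian Monte Carlo entry above, the following is a generic, heavily simplified HMC sampler over adversarial perturbations, not the paper's HMCAM (its accumulated momentum and the CAT training procedure are omitted); model, x (a batched image tensor in [0, 1]), y, and the L_inf budget eps are assumed inputs rather than the paper's actual setup.

```python
# Generic HMC over perturbations: the target density is proportional to
# exp(CE(model(x + delta), y) / T), so high-loss (adversarial) perturbations
# are the most probable states. Clamping to the L_inf ball makes the chain
# approximate rather than exact. Illustrative only.
import torch
import torch.nn.functional as F


def hmc_adversarial_chain(model, x, y, eps=8 / 255, step=1e-2,
                          leapfrog=10, n_samples=5, temperature=1.0):
    def grad_log_p(delta):
        delta = delta.detach().requires_grad_(True)
        loss = F.cross_entropy(model(x + delta), y) / temperature
        return torch.autograd.grad(loss, delta)[0]

    def log_p(delta):
        with torch.no_grad():
            return F.cross_entropy(model(x + delta), y) / temperature

    delta, chain = torch.zeros_like(x), []
    for _ in range(n_samples):
        momentum = torch.randn_like(delta)
        d, m = delta.clone(), momentum.clone()
        # Leapfrog integration of the Hamiltonian dynamics.
        m = m + 0.5 * step * grad_log_p(d)
        for i in range(leapfrog):
            d = (d + step * m).clamp(-eps, eps)   # stay inside the budget
            if i < leapfrog - 1:
                m = m + step * grad_log_p(d)
        m = m + 0.5 * step * grad_log_p(d)
        # Metropolis correction with the usual acceptance ratio.
        h_old = -log_p(delta) + 0.5 * momentum.pow(2).sum()
        h_new = -log_p(d) + 0.5 * m.pow(2).sum()
        if torch.rand(()) < torch.exp(h_old - h_new):
            delta = d.detach()
        chain.append((x + delta).clamp(0, 1))   # assumes inputs in [0, 1]
    return chain
```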
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the content (including all information) and is not responsible for any consequences.