Versatile Weight Attack via Flipping Limited Bits
- URL: http://arxiv.org/abs/2207.12405v1
- Date: Mon, 25 Jul 2022 03:24:58 GMT
- Title: Versatile Weight Attack via Flipping Limited Bits
- Authors: Jiawang Bai, Baoyuan Wu, Zhifeng Li, and Shu-tao Xia
- Abstract summary: We study a novel attack paradigm, which modifies model parameters in the deployment stage.
Considering the effectiveness and stealthiness goals, we provide a general formulation to perform the bit-flip based weight attack.
We present two cases of the general formulation with different malicious purposes, i.e., single sample attack (SSA) and triggered samples attack (TSA).
- Score: 68.45224286690932
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: To explore the vulnerability of deep neural networks (DNNs), many attack
paradigms have been well studied, such as the poisoning-based backdoor attack
in the training stage and the adversarial attack in the inference stage. In
this paper, we study a novel attack paradigm, which modifies model parameters
in the deployment stage. Considering the effectiveness and stealthiness goals,
we provide a general formulation to perform the bit-flip based weight attack,
where the effectiveness term could be customized depending on the attacker's
purpose. Furthermore, we present two cases of the general formulation with
different malicious purposes, i.e., single sample attack (SSA) and triggered
samples attack (TSA). To this end, we formulate this problem as a mixed integer
programming (MIP) to jointly determine the state of the binary bits (0 or 1) in
the memory and learn the sample modification. Utilizing the latest technique in
integer programming, we equivalently reformulate this MIP problem as a
continuous optimization problem, which can be effectively and efficiently
solved using the alternating direction method of multipliers (ADMM).
Consequently, the flipped critical bits can be easily determined through
optimization, rather than using a heuristic strategy. Extensive experiments
demonstrate the superiority of SSA and TSA in attacking DNNs.
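For orientation, the optimization described in the abstract can be sketched as follows; the symbols below (effectiveness loss, stealthiness loss, bit budget k) are illustrative and not the paper's exact notation.

```latex
% Illustrative form of the bit-flip weight attack (assumed notation, not the paper's):
%   b       : binary memory bits of the quantized weights
%   \hat{b} : their original values, k : budget of flippable bits
%   \delta  : optional sample modification (used by TSA, absent for SSA)
\min_{b,\,\delta}\;
  \mathcal{L}_{\mathrm{eff}}\bigl(f_{b}(x+\delta)\bigr)
  \;+\; \lambda\,\mathcal{L}_{\mathrm{stealth}}(f_{b})
\qquad \text{s.t.}\quad
  b \in \{0,1\}^{n},\;\;
  d_{H}\bigl(b,\hat{b}\bigr) \le k .
```

The binary constraint is what makes this a mixed integer program. A common equivalent continuous reformulation (assumed here for illustration, not necessarily the paper's exact construction) replaces $b \in \{0,1\}^{n}$ with $b \in [0,1]^{n} \cap \{v : \|v - \tfrac{1}{2}\mathbf{1}\|_{2}^{2} = \tfrac{n}{4}\}$, after which ADMM alternates between a descent step on the loss and projections onto the box and the sphere.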
Related papers
- DALA: A Distribution-Aware LoRA-Based Adversarial Attack against
Language Models [64.79319733514266]
Adversarial attacks can introduce subtle perturbations to input data.
Recent attack methods can achieve a relatively high attack success rate (ASR).
We propose a Distribution-Aware LoRA-based Adversarial Attack (DALA) method.
arXiv Detail & Related papers (2023-11-14T23:43:47Z)
- Hyperparameter Learning under Data Poisoning: Analysis of the Influence
of Regularization via Multiobjective Bilevel Optimization [3.3181276611945263]
Machine Learning (ML) algorithms are vulnerable to poisoning attacks, where a fraction of the training data is manipulated to deliberately degrade the algorithms' performance.
Optimal attacks can be formulated as bilevel optimization problems and help to assess the robustness of learning algorithms in worst-case scenarios.
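As a generic illustration (assumed notation, not taken from the paper), the optimal poisoning problem has the bilevel form:

```latex
% Generic bilevel formulation of an optimal poisoning attack (illustrative notation):
%   D_p : poisoned points, D_tr / D_val : clean training / validation data
\max_{D_{p}}\;\; \mathcal{A}\bigl(\theta^{\star}(D_{p});\, D_{\mathrm{val}}\bigr)
\qquad \text{s.t.}\quad
\theta^{\star}(D_{p}) \in \arg\min_{\theta}\;
  \mathcal{L}\bigl(\theta;\, D_{\mathrm{tr}} \cup D_{p}\bigr) + \Omega(\theta),
```

where $\Omega(\theta)$ stands for the regularization term whose influence on worst-case robustness the paper analyzes.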
arXiv Detail & Related papers (2023-06-02T15:21:05Z)
- Modeling Adversarial Attack on Pre-trained Language Models as Sequential
Decision Making [10.425483543802846]
Research on the adversarial attack task has found that pre-trained language models (PLMs) are vulnerable to small perturbations.
In this paper, we model the adversarial attack task on PLMs as a sequential decision-making problem.
We propose to use reinforcement learning to find an appropriate sequential attack path for generating adversarial examples, a method named SDM-Attack.
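Viewed abstractly (the notation here is assumed, not SDM-Attack's), the sequential attack amounts to searching for a perturbation policy with maximal expected attack reward:

```latex
% Illustrative MDP view of a sequential text attack (assumed notation):
%   s_t : current (partially perturbed) text, a_t : which token to attack and how
\max_{\pi}\;
  \mathbb{E}_{a_{t}\sim\pi(\cdot\mid s_{t})}
  \Bigl[\, \textstyle\sum_{t=1}^{T} r(s_{t}, a_{t}) \Bigr],
\qquad
  s_{t+1} = \mathrm{edit}(s_{t}, a_{t}),
```

where the reward $r$ would combine attack success against the victim PLM with fluency or semantic-similarity terms.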
arXiv Detail & Related papers (2023-05-27T10:33:53Z)
- Sampling Attacks on Meta Reinforcement Learning: A Minimax Formulation
and Complexity Analysis [20.11993437283895]
This paper provides a game-theoretical underpinning for understanding this type of security risk.
We define the sampling attack model as a Stackelberg game between the attacker and the agent, which yields a minimax formulation.
We observe that a minor effort by the attacker can significantly degrade the learning performance.
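In generic notation (assumed here), such a Stackelberg game between the sampling attacker and the learning agent yields a minimax problem:

```latex
% Illustrative minimax form of the sampling attack (assumed notation):
%   \phi : attacker's perturbation of the sampled trajectories, within budget \Phi
%   J    : the agent's meta-learning objective under the corrupted samples
\min_{\phi \in \Phi}\; \max_{\theta}\;
  J\bigl(\theta;\, \mathcal{T}_{\phi}\bigr),
```

where the leader (attacker) commits to a sampling perturbation and the follower (agent) best-responds, so even a small attack budget $\Phi$ can sharply limit the achievable objective $J$.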
arXiv Detail & Related papers (2022-07-29T21:29:29Z)
- Distributed Adversarial Training to Robustify Deep Neural Networks at
Scale [100.19539096465101]
Current deep neural networks (DNNs) are vulnerable to adversarial attacks, where adversarial perturbations to the inputs can change or manipulate classification.
To defend against such attacks, an effective approach known as adversarial training (AT) has been shown to improve model robustness.
We propose a large-batch adversarial training framework implemented over multiple machines.
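The objective being scaled up is the usual adversarial-training min-max problem, written here in generic form rather than the paper's exact notation:

```latex
% Standard adversarial-training objective (generic form):
\min_{\theta}\;
  \mathbb{E}_{(x,y)\sim\mathcal{D}}
  \Bigl[\; \max_{\|\delta\|_{\infty}\le\epsilon}
    \mathcal{L}\bigl(f_{\theta}(x+\delta),\, y\bigr) \Bigr],
```

where, in a distributed large-batch setting, each worker solves the inner maximization on its shard of the batch and the outer gradients are averaged across machines.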
arXiv Detail & Related papers (2022-06-13T15:39:43Z)
- Sparse and Imperceptible Adversarial Attack via a Homotopy Algorithm [93.80082636284922]
Sparse adversarial attacks can fool deep neural networks (DNNs) by only perturbing a few pixels.
Recent efforts additionally bound the perturbation magnitudes with an l_infty constraint.
We propose a homotopy algorithm to jointly tackle the sparsity and the perturbation bound in one unified framework.
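A generic statement of the problem such a homotopy algorithm targets (assumed notation) is:

```latex
% Generic sparse-and-bounded adversarial attack problem (illustrative notation):
\min_{\delta}\;
  \mathcal{L}_{\mathrm{adv}}\bigl(f(x+\delta),\, y\bigr) + \lambda\,\|\delta\|_{0}
\qquad \text{s.t.}\quad \|\delta\|_{\infty} \le \epsilon,
```

where the $\ell_0$ term limits how many pixels are touched and the $\ell_\infty$ bound keeps each change small; a homotopy method handles both within one optimization by gradually varying the penalty along a homotopy path.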
arXiv Detail & Related papers (2021-06-10T20:11:36Z)
- Adaptive Feature Alignment for Adversarial Training [56.17654691470554]
CNNs are typically vulnerable to adversarial attacks, which pose a threat to security-sensitive applications.
We propose adaptive feature alignment (AFA) to generate features of arbitrary attacking strengths, and the method is trained to automatically align these features.
arXiv Detail & Related papers (2021-05-31T17:01:05Z)
- Targeted Attack against Deep Neural Networks via Flipping Limited Weight
Bits [55.740716446995805]
We study a novel attack paradigm, which modifies model parameters in the deployment stage for malicious purposes.
Our goal is to misclassify a specific sample into a target class without any sample modification.
By utilizing the latest technique in integer programming, we equivalently reformulate this binary integer programming (BIP) problem as a continuous optimization problem.
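To make the "BIP to continuous optimization" step concrete, the toy sketch below (not the authors' code; it uses a made-up quadratic objective and an l_p-box-style ADMM splitting) shows how a relaxed variable can be driven toward {0,1}^n using only box and sphere projections:

```python
# Illustrative l_p-box ADMM sketch for a binary (bit-flip style) problem.
# Not the authors' implementation: it solves a toy objective
#   min_b ||A b - y||^2  subject to  b in {0,1}^n
# by relaxing the binary constraint into the box [0,1]^n intersected
# with a sphere, then alternating projections within ADMM.
import numpy as np

def lp_box_admm(A, y, rho=5.0, iters=300, seed=0):
    rng = np.random.default_rng(seed)
    n = A.shape[1]
    b = rng.uniform(0.0, 1.0, n)       # relaxed binary variable
    z1 = b.copy()                      # box copy of b
    z2 = b.copy()                      # sphere copy of b
    u1 = np.zeros(n)                   # scaled dual for b = z1
    u2 = np.zeros(n)                   # scaled dual for b = z2
    AtA, Aty = A.T @ A, A.T @ y
    H = 2.0 * AtA + 2.0 * rho * np.eye(n)   # Hessian of the b-subproblem
    for _ in range(iters):
        # b-step: closed-form minimizer of the quadratic augmented Lagrangian
        rhs = 2.0 * Aty + rho * (z1 - u1) + rho * (z2 - u2)
        b = np.linalg.solve(H, rhs)
        # z1-step: project onto the box [0,1]^n
        z1 = np.clip(b + u1, 0.0, 1.0)
        # z2-step: project onto the sphere ||v - 0.5||_2 = sqrt(n)/2
        v = b + u2 - 0.5
        z2 = 0.5 + (np.sqrt(n) / 2.0) * v / (np.linalg.norm(v) + 1e-12)
        # dual updates
        u1 += b - z1
        u2 += b - z2
    return (b > 0.5).astype(int)       # round the near-binary solution

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    A = rng.normal(size=(30, 12))
    b_true = rng.integers(0, 2, 12)
    y = A @ b_true
    b_hat = lp_box_admm(A, y)
    print("recovered bits match:", np.array_equal(b_hat, b_true))
```

In the actual attack, the toy quadratic loss would be replaced by the attacker's effectiveness and stealthiness losses over the flippable weight bits.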
arXiv Detail & Related papers (2021-02-21T03:13:27Z)
- Regularisation Can Mitigate Poisoning Attacks: A Novel Analysis Based on
Multiobjective Bilevel Optimisation [3.3181276611945263]
Machine Learning (ML) algorithms are vulnerable to poisoning attacks, where a fraction of the training data is manipulated to deliberately degrade the algorithms' performance.
Optimal poisoning attacks, which can be formulated as bilevel problems, help to assess the robustness of learning algorithms in worst-case scenarios.
We show that this approach leads to an overly pessimistic view of the robustness of the algorithms.
arXiv Detail & Related papers (2020-02-28T19:46:10Z)