Versatile Weight Attack via Flipping Limited Bits
- URL: http://arxiv.org/abs/2207.12405v1
- Date: Mon, 25 Jul 2022 03:24:58 GMT
- Title: Versatile Weight Attack via Flipping Limited Bits
- Authors: Jiawang Bai, Baoyuan Wu, Zhifeng Li, and Shu-tao Xia
- Abstract summary: We study a novel attack paradigm, which modifies model parameters in the deployment stage.
Considering the effectiveness and stealthiness goals, we provide a general formulation to perform the bit-flip based weight attack.
We present two cases of the general formulation with different malicious purposes, i.e., single sample attack (SSA) and triggered samples attack (TSA).
- Score: 68.45224286690932
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: To explore the vulnerability of deep neural networks (DNNs), many attack
paradigms have been well studied, such as the poisoning-based backdoor attack
in the training stage and the adversarial attack in the inference stage. In
this paper, we study a novel attack paradigm, which modifies model parameters
in the deployment stage. Considering the effectiveness and stealthiness goals,
we provide a general formulation to perform the bit-flip based weight attack,
where the effectiveness term could be customized depending on the attacker's
purpose. Furthermore, we present two cases of the general formulation with
different malicious purposes, i.e., single sample attack (SSA) and triggered
samples attack (TSA). To this end, we formulate this problem as a mixed integer
programming (MIP) to jointly determine the state of the binary bits (0 or 1) in
the memory and learn the sample modification. Utilizing the latest technique in
integer programming, we equivalently reformulate this MIP problem as a
continuous optimization problem, which can be effectively and efficiently
solved using the alternating direction method of multipliers (ADMM).
Consequently, the flipped critical bits can be easily determined through
optimization, rather than using a heuristic strategy. Extensive experiments
demonstrate the superiority of SSA and TSA in attacking DNNs.
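For orientation, the optimization described in the abstract can be sketched as follows; the symbols below (effectiveness loss, stealthiness loss, bit budget k) are illustrative and not the paper's exact notation.

```latex
% Illustrative form of the bit-flip weight attack (assumed notation, not the paper's):
%   b       : binary memory bits of the quantized weights
%   \hat{b} : their original values, k : budget of flippable bits
%   \delta  : optional sample modification (used by TSA, absent for SSA)
\min_{b,\,\delta}\;
  \mathcal{L}_{\mathrm{eff}}\bigl(f_{b}(x+\delta)\bigr)
  \;+\; \lambda\,\mathcal{L}_{\mathrm{stealth}}(f_{b})
\qquad \text{s.t.}\quad
  b \in \{0,1\}^{n},\;\;
  d_{H}\bigl(b,\hat{b}\bigr) \le k .
```

The binary constraint is what makes this a mixed integer program. A common equivalent continuous reformulation (assumed here for illustration, not necessarily the paper's exact construction) replaces $b \in \{0,1\}^{n}$ with $b \in [0,1]^{n} \cap \{v : \|v - \tfrac{1}{2}\mathbf{1}\|_{2}^{2} = \tfrac{n}{4}\}$, after which ADMM alternates between a descent step on the loss and projections onto the box and the sphere.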
Related papers
- DALA: A Distribution-Aware LoRA-Based Adversarial Attack against
Language Models [64.79319733514266]
Adversarial attacks can introduce subtle perturbations to input data.
Recent attack methods can achieve a relatively high attack success rate (ASR).
We propose a Distribution-Aware LoRA-based Adversarial Attack (DALA) method.
arXiv Detail & Related papers (2023-11-14T23:43:47Z)
- Hyperparameter Learning under Data Poisoning: Analysis of the Influence
of Regularization via Multiobjective Bilevel Optimization [3.3181276611945263]
Machine Learning (ML) algorithms are vulnerable to poisoning attacks, where a fraction of the training data is manipulated to deliberately degrade the algorithms' performance.
Optimal attacks can be formulated as bilevel optimization problems and help to assess the robustness of learning algorithms in worst-case scenarios.
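As a generic illustration (assumed notation, not taken from the paper), the optimal poisoning problem has the bilevel form:

```latex
% Generic bilevel formulation of an optimal poisoning attack (illustrative notation):
%   D_p : poisoned points, D_tr / D_val : clean training / validation data
\max_{D_{p}}\;\; \mathcal{A}\bigl(\theta^{\star}(D_{p});\, D_{\mathrm{val}}\bigr)
\qquad \text{s.t.}\quad
\theta^{\star}(D_{p}) \in \arg\min_{\theta}\;
  \mathcal{L}\bigl(\theta;\, D_{\mathrm{tr}} \cup D_{p}\bigr) + \Omega(\theta),
```

where $\Omega(\theta)$ stands for the regularization term whose influence on worst-case robustness the paper analyzes.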
arXiv Detail & Related papers (2023-06-02T15:21:05Z)
- Modeling Adversarial Attack on Pre-trained Language Models as Sequential
Decision Making [10.425483543802846]
Research on the adversarial attack task has found that pre-trained language models (PLMs) are vulnerable to small perturbations.
In this paper, we model the adversarial attack task on PLMs as a sequential decision-making problem.
We propose to use reinforcement learning to find an appropriate sequential attack path for generating adversarial examples, a method named SDM-Attack.
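Viewed abstractly (the notation here is assumed, not SDM-Attack's), the sequential attack amounts to searching for a perturbation policy with maximal expected attack reward:

```latex
% Illustrative MDP view of a sequential text attack (assumed notation):
%   s_t : current (partially perturbed) text, a_t : which token to attack and how
\max_{\pi}\;
  \mathbb{E}_{a_{t}\sim\pi(\cdot\mid s_{t})}
  \Bigl[\, \textstyle\sum_{t=1}^{T} r(s_{t}, a_{t}) \Bigr],
\qquad
  s_{t+1} = \mathrm{edit}(s_{t}, a_{t}),
```

where the reward $r$ would combine attack success against the victim PLM with fluency or semantic-similarity terms.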
arXiv Detail & Related papers (2023-05-27T10:33:53Z)
- Sampling Attacks on Meta Reinforcement Learning: A Minimax Formulation
and Complexity Analysis [20.11993437283895]
This paper provides a game-theoretical underpinning for understanding this type of security risk.
We define the sampling attack model as a Stackelberg game between the attacker and the agent, which yields a minimax formulation.
We observe that a minor effort by the attacker can significantly degrade the learning performance.
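In generic notation (assumed here), such a Stackelberg game between the sampling attacker and the learning agent yields a minimax problem:

```latex
% Illustrative minimax form of the sampling attack (assumed notation):
%   \phi : attacker's perturbation of the sampled trajectories, within budget \Phi
%   J    : the agent's meta-learning objective under the corrupted samples
\min_{\phi \in \Phi}\; \max_{\theta}\;
  J\bigl(\theta;\, \mathcal{T}_{\phi}\bigr),
```

where the leader (attacker) commits to a sampling perturbation and the follower (agent) best-responds, so even a small attack budget $\Phi$ can sharply limit the achievable objective $J$.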
arXiv Detail & Related papers (2022-07-29T21:29:29Z)
- Distributed Adversarial Training to Robustify Deep Neural Networks at
Scale [100.19539096465101]
Current deep neural networks (DNNs) are vulnerable to adversarial attacks, where adversarial perturbations to the inputs can change or manipulate classification.
To defend against such attacks, an effective approach known as adversarial training (AT) has been shown to improve model robustness.
We propose a large-batch adversarial training framework implemented over multiple machines.
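The objective being scaled up is the usual adversarial-training min-max problem, written here in generic form rather than the paper's exact notation:

```latex
% Standard adversarial-training objective (generic form):
\min_{\theta}\;
  \mathbb{E}_{(x,y)\sim\mathcal{D}}
  \Bigl[\; \max_{\|\delta\|_{\infty}\le\epsilon}
    \mathcal{L}\bigl(f_{\theta}(x+\delta),\, y\bigr) \Bigr],
```

where, in a distributed large-batch setting, each worker solves the inner maximization on its shard of the batch and the outer gradients are averaged across machines.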
arXiv Detail & Related papers (2022-06-13T15:39:43Z)
- Sparse and Imperceptible Adversarial Attack via a Homotopy Algorithm [93.80082636284922]
Sparse adversarial attacks can fool deep neural networks (DNNs) by only perturbing a few pixels.
Recent efforts additionally bound the perturbation magnitudes with an l_infty constraint.
We propose a homotopy algorithm to jointly tackle the sparsity and the perturbation bound in one unified framework.
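A generic statement of the problem such a homotopy algorithm targets (assumed notation) is:

```latex
% Generic sparse-and-bounded adversarial attack problem (illustrative notation):
\min_{\delta}\;
  \mathcal{L}_{\mathrm{adv}}\bigl(f(x+\delta),\, y\bigr) + \lambda\,\|\delta\|_{0}
\qquad \text{s.t.}\quad \|\delta\|_{\infty} \le \epsilon,
```

where the $\ell_0$ term limits how many pixels are touched and the $\ell_\infty$ bound keeps each change small; a homotopy method handles both within one optimization by gradually varying the penalty along a homotopy path.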
arXiv Detail & Related papers (2021-06-10T20:11:36Z)
- Adaptive Feature Alignment for Adversarial Training [56.17654691470554]
CNNs are typically vulnerable to adversarial attacks, which pose a threat to security-sensitive applications.
We propose adaptive feature alignment (AFA) to generate features of arbitrary attacking strengths, and the method is trained to automatically align these features.
arXiv Detail & Related papers (2021-05-31T17:01:05Z)
- Targeted Attack against Deep Neural Networks via Flipping Limited Weight
Bits [55.740716446995805]
We study a novel attack paradigm, which modifies model parameters in the deployment stage for malicious purposes.
Our goal is to misclassify a specific sample into a target class without any sample modification.
By utilizing the latest technique in integer programming, we equivalently reformulate this binary integer programming (BIP) problem as a continuous optimization problem.
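To make the "BIP to continuous optimization" step concrete, the toy sketch below (not the authors' code; it uses a made-up quadratic objective and an l_p-box-style ADMM splitting) shows how a relaxed variable can be driven toward {0,1}^n using only box and sphere projections:

```python
# Illustrative l_p-box ADMM sketch for a binary (bit-flip style) problem.
# Not the authors' implementation: it solves a toy objective
#   min_b ||A b - y||^2  subject to  b in {0,1}^n
# by relaxing the binary constraint into the box [0,1]^n intersected
# with a sphere, then alternating projections within ADMM.
import numpy as np

def lp_box_admm(A, y, rho=5.0, iters=300, seed=0):
    rng = np.random.default_rng(seed)
    n = A.shape[1]
    b = rng.uniform(0.0, 1.0, n)       # relaxed binary variable
    z1 = b.copy()                      # box copy of b
    z2 = b.copy()                      # sphere copy of b
    u1 = np.zeros(n)                   # scaled dual for b = z1
    u2 = np.zeros(n)                   # scaled dual for b = z2
    AtA, Aty = A.T @ A, A.T @ y
    H = 2.0 * AtA + 2.0 * rho * np.eye(n)   # Hessian of the b-subproblem
    for _ in range(iters):
        # b-step: closed-form minimizer of the quadratic augmented Lagrangian
        rhs = 2.0 * Aty + rho * (z1 - u1) + rho * (z2 - u2)
        b = np.linalg.solve(H, rhs)
        # z1-step: project onto the box [0,1]^n
        z1 = np.clip(b + u1, 0.0, 1.0)
        # z2-step: project onto the sphere ||v - 0.5||_2 = sqrt(n)/2
        v = b + u2 - 0.5
        z2 = 0.5 + (np.sqrt(n) / 2.0) * v / (np.linalg.norm(v) + 1e-12)
        # dual updates
        u1 += b - z1
        u2 += b - z2
    return (b > 0.5).astype(int)       # round the near-binary solution

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    A = rng.normal(size=(30, 12))
    b_true = rng.integers(0, 2, 12)
    y = A @ b_true
    b_hat = lp_box_admm(A, y)
    print("recovered bits match:", np.array_equal(b_hat, b_true))
```

In the actual attack, the toy quadratic loss would be replaced by the attacker's effectiveness and stealthiness losses over the flippable weight bits.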
arXiv Detail & Related papers (2021-02-21T03:13:27Z)
- Regularisation Can Mitigate Poisoning Attacks: A Novel Analysis Based on
Multiobjective Bilevel Optimisation [3.3181276611945263]
Machine Learning (ML) algorithms are vulnerable to poisoning attacks, where a fraction of the training data is manipulated to deliberately degrade the algorithms' performance.
Optimal poisoning attacks, which can be formulated as bilevel problems, help to assess the robustness of learning algorithms in worst-case scenarios.
We show that this approach leads to an overly pessimistic view of the robustness of the algorithms.
arXiv Detail & Related papers (2020-02-28T19:46:10Z)