Versatile Defense Against Adversarial Attacks on Image Recognition
- URL: http://arxiv.org/abs/2403.08170v1
- Date: Wed, 13 Mar 2024 01:48:01 GMT
- Title: Versatile Defense Against Adversarial Attacks on Image Recognition
- Authors: Haibo Zhang, Zhihua Yao, Kouichi Sakurai
- Abstract summary: Defending against adversarial attacks in a real-life setting can be compared to the way antivirus software works.
It appears that a defense method based on image-to-image translation may be capable of this.
The trained model has successfully improved the classification accuracy from nearly zero to an average of 86%.
- Score: 2.9980620769521513
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Adversarial attacks present a significant security risk to image recognition
tasks. Defending against these attacks in a real-life setting can be compared
to the way antivirus software works, with a key consideration being how well
the defense can adapt to new and evolving attacks. Another important factor is
the resources involved in terms of time and cost for training defense models
and updating the model database. Training many models that are specific to each
type of attack can be time-consuming and expensive. Ideally, we should be able
to train one single model that can handle a wide range of attacks. It appears
that a defense method based on image-to-image translation may be capable of
this. The proposed versatile defense approach in this paper only requires
training one model to effectively resist various unknown adversarial attacks.
The trained model has successfully improved the classification accuracy from
nearly zero to an average of 86%, performing better than other defense methods
proposed in prior studies. When facing the PGD attack and the MI-FGSM attack,
the versatile defense model even outperforms the attack-specific models trained
based on these two attacks. The robustness check also shows that our versatile
defense model performs stably regardless of the attack strength.
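The abstract frames the defense as a single image-to-image translation model that "purifies" adversarial inputs before they reach the classifier, but it does not give the architecture, loss, or training schedule. The following is a minimal, hypothetical sketch of that idea only: it assumes a small convolutional encoder-decoder purifier and uses a standard PGD attack (one of the attacks named above) to generate adversarial training pairs; none of these choices are taken from the paper.

```python
# Illustrative sketch only: the paper's actual architecture, losses, and
# training procedure are not given in the abstract. The `Purifier` network
# and the use of PGD here are assumptions made for demonstration.
import torch
import torch.nn as nn
import torch.nn.functional as F

class Purifier(nn.Module):
    """A tiny encoder-decoder that maps (possibly adversarial) images
    back toward their clean counterparts."""
    def __init__(self, channels=3):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(channels, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, channels, 3, padding=1), nn.Sigmoid(),  # outputs in [0, 1]
        )

    def forward(self, x):
        return self.net(x)

def pgd_attack(model, x, y, eps=8 / 255, alpha=2 / 255, steps=10):
    """Standard L-infinity PGD, used here only to create adversarial
    training/evaluation examples (the paper also evaluates MI-FGSM and others)."""
    x_adv = (x + torch.empty_like(x).uniform_(-eps, eps)).clamp(0, 1).detach()
    for _ in range(steps):
        x_adv.requires_grad_(True)
        loss = F.cross_entropy(model(x_adv), y)
        grad = torch.autograd.grad(loss, x_adv)[0]
        x_adv = x_adv.detach() + alpha * grad.sign()
        x_adv = torch.min(torch.max(x_adv, x - eps), x + eps).clamp(0, 1)
    return x_adv.detach()

def train_purifier(purifier, classifier, loader, epochs=10, device="cuda"):
    """Train one purifier on adversarial/clean pairs so that a single model
    can later be reused against attacks not seen during training."""
    opt = torch.optim.Adam(purifier.parameters(), lr=1e-3)
    classifier.eval()
    for _ in range(epochs):
        for x, y in loader:
            x, y = x.to(device), y.to(device)
            x_adv = pgd_attack(classifier, x, y)   # attacked input
            loss = F.l1_loss(purifier(x_adv), x)   # reconstruct the clean image
            opt.zero_grad()
            loss.backward()
            opt.step()
    return purifier
```

At inference time the purifier is simply prepended to the classifier, e.g. `logits = classifier(purifier(x_adv))`, and the defended accuracy is measured on the purified images; because only this one model is trained, it can be applied unchanged to attacks it never saw during training.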
Related papers
- Meta Invariance Defense Towards Generalizable Robustness to Unknown Adversarial Attacks [62.036798488144306]
Current defenses mainly focus on known attacks, while adversarial robustness to unknown attacks is seriously overlooked.
We propose an attack-agnostic defense method named Meta Invariance Defense (MID).
We show that MID simultaneously achieves robustness to the imperceptible adversarial perturbations in high-level image classification and attack-suppression in low-level robust image regeneration.
arXiv Detail & Related papers (2024-04-04T10:10:38Z)
- Learn to Disguise: Avoid Refusal Responses in LLM's Defense via a Multi-agent Attacker-Disguiser Game [28.33029508522531]
Malicious attackers induce large models to jailbreak and generate content that is illegal or privacy-invasive.
Large models counter these attacks using techniques such as safety alignment.
We propose a multi-agent attacker-disguiser game approach to achieve a weak defense mechanism that allows the large model to both safely reply to the attacker and hide the defense intent.
arXiv Detail & Related papers (2024-04-03T07:43:11Z)
- Efficient Defense Against Model Stealing Attacks on Convolutional Neural Networks [0.548924822963045]
Model stealing attacks can lead to intellectual property theft and other security and privacy risks.
Current state-of-the-art defenses against model stealing attacks suggest adding perturbations to the prediction probabilities.
We propose a simple yet effective and efficient defense alternative.
arXiv Detail & Related papers (2023-09-04T22:25:49Z)
- MultiRobustBench: Benchmarking Robustness Against Multiple Attacks [86.70417016955459]
We present the first unified framework for considering multiple attacks against machine learning (ML) models.
Our framework is able to model different levels of the learner's knowledge about the test-time adversary.
We evaluate the performance of 16 defended models for robustness against a set of 9 different attack types.
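The evaluation summarized here boils down to measuring each defended model's accuracy under every attack in a suite and reporting per-attack and worst-case numbers. Below is a rough, hypothetical sketch of such a loop; the benchmark's actual API and metrics are not given in this summary, and `attacks` is assumed to be a dict mapping attack names to callables that perturb a batch.

```python
# Hypothetical multi-attack evaluation loop; MultiRobustBench's real
# interface and metrics may differ from this sketch.
import torch

@torch.no_grad()
def accuracy(model, x, y):
    return (model(x).argmax(dim=1) == y).float().mean().item()

def evaluate_multi_attack(model, loader, attacks, device="cuda"):
    """Return per-attack accuracy and the worst-case accuracy over all attacks."""
    per_attack = {name: [] for name in attacks}
    for x, y in loader:
        x, y = x.to(device), y.to(device)
        for name, attack in attacks.items():
            x_adv = attack(model, x, y)   # attacks need gradients, so not wrapped in no_grad
            per_attack[name].append(accuracy(model, x_adv, y))
    summary = {name: sum(v) / len(v) for name, v in per_attack.items()}
    summary["worst_case"] = min(summary.values())
    return summary
```

With 16 defended models and 9 attack types, this yields a 16-by-9 grid of accuracies, and the worst-case entry captures each model's robustness against its strongest attack.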
arXiv Detail & Related papers (2023-02-21T20:26:39Z)
- BagFlip: A Certified Defense against Data Poisoning [15.44806926189642]
BagFlip is a model-agnostic certified approach that can effectively defend against both trigger-less and backdoor attacks.
We evaluate BagFlip on image classification and malware detection datasets.
arXiv Detail & Related papers (2022-05-26T21:09:24Z)
- What Doesn't Kill You Makes You Robust(er): Adversarial Training against Poisons and Backdoors [57.040948169155925]
We extend the adversarial training framework to defend against (training-time) poisoning and backdoor attacks.
Our method desensitizes networks to the effects of poisoning by creating poisons during training and injecting them into training batches.
We show that this defense withstands adaptive attacks, generalizes to diverse threat models, and incurs a better performance trade-off than previous defenses.
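As described, this defense crafts poisoned samples on the fly and mixes them into each training batch, in the spirit of adversarial training. A minimal sketch of that idea follows; `craft_poison` is a hypothetical placeholder, since the paper's actual poison-crafting procedure is not described in this summary.

```python
# Minimal sketch of adversarial training against poisoning: craft poisoned
# samples during training and inject them into each batch. `craft_poison`
# is a placeholder and not the paper's actual procedure.
import torch
import torch.nn.functional as F

def train_with_injected_poisons(model, loader, craft_poison, epochs=10,
                                poison_fraction=0.25, device="cuda"):
    opt = torch.optim.SGD(model.parameters(), lr=0.1, momentum=0.9)
    model.train()
    for _ in range(epochs):
        for x, y in loader:
            x, y = x.to(device), y.to(device)
            k = max(1, int(poison_fraction * x.size(0)))
            # Craft poisons from part of the batch and substitute them in,
            # so the network learns to be insensitive to poisoned inputs.
            x_poison = craft_poison(model, x[:k], y[:k])
            x_mixed = torch.cat([x_poison, x[k:]], dim=0)
            loss = F.cross_entropy(model(x_mixed), y)
            opt.zero_grad()
            loss.backward()
            opt.step()
    return model
```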
arXiv Detail & Related papers (2021-02-26T17:54:36Z)
- "What's in the box?!": Deflecting Adversarial Attacks by Randomly Deploying Adversarially-Disjoint Models [71.91835408379602]
Adversarial examples have long been considered a real threat to machine learning models.
We propose an alternative deployment-based defense paradigm that goes beyond the traditional white-box and black-box threat models.
arXiv Detail & Related papers (2021-02-09T20:07:13Z)
- Learning to Attack: Towards Textual Adversarial Attacking in Real-world Situations [81.82518920087175]
Adversarial attacks aim to fool deep neural networks with adversarial examples.
We propose a reinforcement learning based attack model, which can learn from attack history and launch attacks more efficiently.
arXiv Detail & Related papers (2020-09-19T09:12:24Z)
- Adversarial Imitation Attack [63.76805962712481]
A practical adversarial attack should require as little knowledge of the attacked models as possible.
Current substitute attacks need pre-trained models to generate adversarial examples.
In this study, we propose a novel adversarial imitation attack.
arXiv Detail & Related papers (2020-03-28T10:02:49Z)
- An Analysis of Adversarial Attacks and Defenses on Autonomous Driving Models [15.007794089091616]
Convolutional neural networks (CNNs) are a key component in autonomous driving.
Previous work shows CNN-based classification models are vulnerable to adversarial attacks.
This paper presents an in-depth analysis of five adversarial attacks and four defense methods on three driving models.
arXiv Detail & Related papers (2020-02-06T09:49:16Z)