Adversarial Attacks on Image Classification Models: Analysis and Defense
- URL: http://arxiv.org/abs/2312.16880v1
- Date: Thu, 28 Dec 2023 08:08:23 GMT
- Title: Adversarial Attacks on Image Classification Models: Analysis and Defense
- Authors: Jaydip Sen, Abhiraj Sen, and Ananda Chatterjee
- Abstract summary: Adversarial attacks on image classification models based on convolutional neural networks (CNNs) are introduced.
The fast gradient sign method (FGSM) is explored and its adverse effects on the performance of image classification models are examined.
A defense mechanism against the FGSM attack is proposed, based on a modified defensive distillation approach.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The notion of adversarial attacks on image classification models based on
convolutional neural networks (CNN) is introduced in this work. To classify
images, deep learning models called CNNs are frequently used. However, when the
networks are subject to adversarial attacks, extremely potent and previously
trained CNN models that perform quite effectively on image datasets for image
classification tasks may perform poorly. In this work, one well-known
adversarial attack known as the fast gradient sign method (FGSM) is explored
and its adverse effects on the performance of image classification models are
examined. The FGSM attack is simulated on three pre-trained image classifier
CNN architectures, ResNet-101, AlexNet, and RegNetY-400MF, using randomly chosen
images from the ImageNet dataset. The classification accuracies of the models
are computed in the absence and presence of the attack to demonstrate the
detrimental effect of the attack on the performance of the classifiers.
Finally, a mechanism is proposed to defend against the FGSM attack based on a
modified defensive distillation-based approach. Extensive results are presented
for the validation of the proposed scheme.
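As a companion to the abstract above, the following is a minimal PyTorch sketch of an FGSM attack on a pre-trained ImageNet classifier. The choice of ResNet-101 weights, the epsilon value, the image path, and the use of the model's own prediction as the attack label are illustrative assumptions, not the paper's exact experimental configuration.

```python
# Hedged sketch: craft an FGSM adversarial example against a pre-trained
# torchvision ResNet-101 (AlexNet or RegNetY-400MF can be swapped in).
# Epsilon, the image path, and the label choice are placeholder assumptions.
import torch
import torch.nn.functional as F
from torchvision import models
from PIL import Image

weights = models.ResNet101_Weights.IMAGENET1K_V2
model = models.resnet101(weights=weights).eval()
preprocess = weights.transforms()

def fgsm_attack(model, image, label, epsilon=0.03):
    """Return image + epsilon * sign(grad_image of the loss)."""
    image = image.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(image), label)
    loss.backward()
    # Note: for simplicity the perturbation is applied in the normalized
    # input space; a full pipeline would usually perturb raw pixels and
    # clamp the result to the valid [0, 1] range.
    return (image + epsilon * image.grad.sign()).detach()

# Usage on a single image ("example.jpg" is a placeholder path).
x = preprocess(Image.open("example.jpg")).unsqueeze(0)
with torch.no_grad():
    y = model(x).argmax(dim=1)          # model's own prediction as the label
x_adv = fgsm_attack(model, x, y)
print("clean:", y.item(), "adversarial:", model(x_adv).argmax(dim=1).item())
```

Running this over a batch of randomly chosen ImageNet images and comparing top-1 accuracy with and without the perturbation gives the kind of before/after comparison the abstract describes.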
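On the defense side, the sketch below illustrates the classical defensive distillation training step that the proposed "modified defensive distillation-based approach" builds on: a student network is trained to match the teacher's temperature-softened outputs. The temperature, the models, and the training interface are assumptions; the paper's specific modification is not reproduced here.

```python
# Hedged sketch of one classical defensive distillation step (Papernot et al.):
# the student learns from the teacher's softmax outputs at high temperature.
import torch
import torch.nn.functional as F

def distillation_step(teacher, student, optimizer, images, temperature=20.0):
    """Train the student for one batch on the teacher's soft labels."""
    with torch.no_grad():
        soft_targets = F.softmax(teacher(images) / temperature, dim=1)
    log_probs = F.log_softmax(student(images) / temperature, dim=1)
    # KL divergence between student and teacher distributions; the loss is
    # often scaled by temperature**2 to keep gradient magnitudes comparable.
    loss = F.kl_div(log_probs, soft_targets, reduction="batchmean")
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```

At deployment the student is used at temperature 1, which saturates the softmax and leaves FGSM with far less informative gradients.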
Related papers
- Undermining Image and Text Classification Algorithms Using Adversarial Attacks [0.0]
Our study addresses the gap by training various machine learning models and using GANs and SMOTE to generate additional data points aimed at attacking text classification models.
Our experiments reveal a significant vulnerability in classification models. Specifically, we observe a 20% decrease in accuracy for the top-performing text classification models post-attack, along with a 30% decrease in facial recognition accuracy.
arXiv Detail & Related papers (2024-11-03T18:44:28Z) - Reinforcing Pre-trained Models Using Counterfactual Images [54.26310919385808]
This paper proposes a novel framework to reinforce classification models using language-guided generated counterfactual images.
We identify model weaknesses by testing the model using the counterfactual image dataset.
We employ the counterfactual images as an augmented dataset to fine-tune and reinforce the classification model.
arXiv Detail & Related papers (2024-06-19T08:07:14Z) - MirrorCheck: Efficient Adversarial Defense for Vision-Language Models [55.73581212134293]
We propose a novel, yet elegantly simple approach for detecting adversarial samples in Vision-Language Models.
Our method leverages Text-to-Image (T2I) models to generate images based on captions produced by target VLMs.
Empirical evaluations conducted on different datasets validate the efficacy of our approach.
arXiv Detail & Related papers (2024-06-13T15:55:04Z) - Counterfactual Image Generation for adversarially robust and
interpretable Classifiers [1.3859669037499769]
We propose a unified framework leveraging image-to-image translation Generative Adversarial Networks (GANs) to produce counterfactual samples.
This is achieved by combining the classifier and discriminator into a single model that attributes real images to their respective classes and flags generated images as "fake".
We show how the model exhibits improved robustness to adversarial attacks, and we show how the discriminator's "fakeness" value serves as an uncertainty measure of the predictions.
arXiv Detail & Related papers (2023-10-01T18:50:29Z) - Adversarial Attacks on Image Classification Models: FGSM and Patch
Attacks and their Impact [0.0]
This chapter introduces the concept of adversarial attacks on image classification models built on convolutional neural networks (CNNs).
CNNs are very popular deep-learning models which are used in image classification tasks.
Two very well-known adversarial attacks are discussed and their impact on the performance of image classifiers is analyzed.
arXiv Detail & Related papers (2023-07-05T06:40:08Z) - Semantic Image Attack for Visual Model Diagnosis [80.36063332820568]
In practice, metric analysis on a specific train and test dataset does not guarantee reliable or fair ML models.
This paper proposes Semantic Image Attack (SIA), a method based on the adversarial attack that provides semantic adversarial images.
arXiv Detail & Related papers (2023-03-23T03:13:04Z) - Attackar: Attack of the Evolutionary Adversary [0.0]
This paper introduces Attackar, an evolutionary, score-based, black-box attack.
Attackar is based on a novel objective function that can be used in gradient-free optimization problems.
Our results demonstrate the superior performance of Attackar, both in terms of accuracy score and query efficiency.
arXiv Detail & Related papers (2022-08-17T13:57:23Z) - Diffusion Models for Adversarial Purification [69.1882221038846]
Adversarial purification refers to a class of defense methods that remove adversarial perturbations using a generative model.
We propose DiffPure that uses diffusion models for adversarial purification.
Our method achieves the state-of-the-art results, outperforming current adversarial training and adversarial purification methods.
arXiv Detail & Related papers (2022-05-16T06:03:00Z) - Practical No-box Adversarial Attacks with Training-free Hybrid Image
Transformation [123.33816363589506]
We show the existence of a training-free adversarial perturbation under the no-box threat model.
Motivated by our observation that the high-frequency component (HFC) dominates in low-level features, we attack an image mainly by manipulating its frequency components.
Our method is even competitive to mainstream transfer-based black-box attacks.
arXiv Detail & Related papers (2022-03-09T09:51:00Z) - BreakingBED -- Breaking Binary and Efficient Deep Neural Networks by
Adversarial Attacks [65.2021953284622]
We study robustness of CNNs against white-box and black-box adversarial attacks.
Results are shown for distilled CNNs, agent-based state-of-the-art pruned models, and binarized neural networks.
arXiv Detail & Related papers (2021-03-14T20:43:19Z) - Leveraging Siamese Networks for One-Shot Intrusion Detection Model [0.0]
Supervised Machine Learning (ML) to enhance Intrusion Detection Systems has been the subject of significant research.
However, retraining the models in situ renders the network susceptible to attacks owing to the time window required to acquire a sufficient volume of data.
Here, a complementary approach referred to as 'One-Shot Learning' is evaluated, whereby a limited number of examples of a new attack class is used to identify that class.
A Siamese Network is trained to differentiate between classes based on pair similarities, rather than features, allowing it to identify new and previously unseen attacks (a minimal sketch follows this entry).
arXiv Detail & Related papers (2020-06-27T11:40:01Z)
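To make the pair-similarity idea in the one-shot intrusion detection entry above concrete, here is a minimal hedged sketch of a Siamese embedding trained with a contrastive loss; the layer sizes, feature dimensionality, and margin are arbitrary illustrative choices, not the paper's architecture.

```python
# Hedged sketch: a shared encoder embeds two inputs, and their distance is
# trained to be small for same-class pairs and large for different-class pairs.
import torch
import torch.nn as nn
import torch.nn.functional as F

class SiameseEncoder(nn.Module):
    def __init__(self, in_features=64, embed_dim=32):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(in_features, 128), nn.ReLU(),
            nn.Linear(128, embed_dim),
        )

    def forward(self, a, b):
        # The same weights embed both inputs; the distance scores similarity.
        return F.pairwise_distance(self.net(a), self.net(b))

def contrastive_loss(distance, same_class, margin=1.0):
    """Pull same-class pairs together; push different-class pairs apart."""
    return torch.mean(
        same_class * distance.pow(2)
        + (1.0 - same_class) * F.relu(margin - distance).pow(2)
    )
```

A new attack class can then be flagged by comparing an incoming sample against a handful of stored reference examples and thresholding the learned distance.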