Tailoring Adversarial Attacks on Deep Neural Networks for Targeted Class
Manipulation Using DeepFool Algorithm
- URL: http://arxiv.org/abs/2310.13019v3
- Date: Fri, 17 Nov 2023 19:39:43 GMT
- Title: Tailoring Adversarial Attacks on Deep Neural Networks for Targeted Class
Manipulation Using DeepFool Algorithm
- Authors: S. M. Fazle Rabby Labib, Joyanta Jyoti Mondal, Meem Arafat Manab
- Abstract summary: DeepFool, an algorithm proposed by Moosavi-Dezfooli et al., finds minimal perturbations to misclassify input images.
DeepFool lacks a targeted approach, making it less effective in specific attack scenarios.
We propose Enhanced Targeted DeepFool, an augmented version of DeepFool that allows targeting specific classes for misclassification.
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Deep neural networks (DNNs) have significantly advanced various domains, but
their vulnerability to adversarial attacks poses serious concerns.
Understanding these vulnerabilities and developing effective defense mechanisms
is crucial. DeepFool, an algorithm proposed by Moosavi-Dezfooli et al. (2016),
finds minimal perturbations to misclassify input images. However, DeepFool
lacks a targeted approach, making it less effective in specific attack
scenarios. Moreover, previous related works primarily focus on attack success,
without considering how much an image is distorted, how well image quality is
preserved, or how confident the resulting misclassification is. In this
paper, we propose Enhanced Targeted DeepFool, an augmented version of DeepFool
that allows targeting specific classes for misclassification, and we introduce
a minimum confidence score requirement hyperparameter to enhance flexibility.
Our experiments demonstrate the effectiveness and efficiency of the proposed
method across different deep neural network architectures while preserving
image integrity as much as possible while keeping the perturbation rate as low
as possible. By using our
approach, the behavior of models can be manipulated arbitrarily using the
perturbed images, as we can specify both the target class and the associated
confidence score, unlike other DeepFool-derivative works, such as Targeted
DeepFool by Gajjar et al. (2022). Results show that one of the deep
convolutional neural network architectures, AlexNet, and one of the
state-of-the-art models, the Vision Transformer, exhibit high robustness to
being fooled. This approach can have broader implications, as tuning the
confidence level can expose the robustness of image recognition models. Our code will be
made public upon acceptance of the paper.
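As a concrete illustration of the idea described in the abstract, the following
is a minimal PyTorch sketch of a targeted DeepFool-style update with a minimum
confidence score requirement. It is a sketch under stated assumptions (a
standard classifier returning logits for a 1xCxHxW input); the function name,
parameters, and stopping heuristic are illustrative and not taken from the
authors' released implementation.

```python
import torch
import torch.nn.functional as F

def targeted_deepfool(model, image, target_class, min_confidence=0.95,
                      overshoot=0.02, max_iter=50):
    """Targeted DeepFool-style attack with a minimum confidence requirement.

    Illustrative sketch only: iteratively takes the minimal linearized step
    toward the decision boundary between the currently predicted class and
    the target class, stopping once the model predicts `target_class` with
    softmax probability >= `min_confidence`.
    """
    model.eval()
    x = image.clone().detach().requires_grad_(True)
    total_pert = torch.zeros_like(image)

    for _ in range(max_iter):
        logits = model(x)
        probs = F.softmax(logits, dim=1)
        pred = logits.argmax(dim=1).item()

        # Stop when the target class is predicted confidently enough.
        if pred == target_class and probs[0, target_class] >= min_confidence:
            break

        # If the target is already predicted but not confidently enough,
        # push it away from the runner-up class instead.
        competitor = pred
        if pred == target_class:
            competitor = logits.topk(2, dim=1).indices[0, 1].item()

        # Linearized margin to the target-vs-competitor decision boundary.
        diff = logits[0, target_class] - logits[0, competitor]
        model.zero_grad()
        if x.grad is not None:
            x.grad.zero_()
        diff.backward()
        w = x.grad.detach()

        # Minimal step (plus a small overshoot) across that boundary.
        r = (diff.abs().detach() + 1e-7) / (w.norm() ** 2 + 1e-7) * w
        total_pert = total_pert + (1 + overshoot) * r
        x = (image + total_pert).clone().detach().requires_grad_(True)

    return x.detach(), total_pert
```

For example, `adv, pert = targeted_deepfool(model, img, target_class=3,
min_confidence=0.9)` would keep perturbing until the classifier assigns at
least 90% softmax probability to class 3 (or the iteration budget runs out),
which is the kind of confidence-tuned targeting the abstract describes.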
Related papers
- Adv-Diffusion: Imperceptible Adversarial Face Identity Attack via Latent
Diffusion Model [61.53213964333474]
We propose a unified framework Adv-Diffusion that can generate imperceptible adversarial identity perturbations in the latent space but not the raw pixel space.
Specifically, we propose the identity-sensitive conditioned diffusion generative model to generate semantic perturbations in the surroundings.
The designed adaptive strength-based adversarial perturbation algorithm can ensure both attack transferability and stealthiness.
arXiv Detail & Related papers (2023-12-18T15:25:23Z) - Exploring Geometry of Blind Spots in Vision Models [56.47644447201878]
We study the phenomenon of under-sensitivity in vision models such as CNNs and Transformers.
We propose a Level Set Traversal algorithm that iteratively explores regions of high confidence with respect to the input space.
We estimate the extent of these connected higher-dimensional regions over which the model maintains a high degree of confidence.
arXiv Detail & Related papers (2023-10-30T18:00:33Z) - Dual Adversarial Resilience for Collaborating Robust Underwater Image
Enhancement and Perception [54.672052775549]
In this work, we introduce a collaborative adversarial resilience network, dubbed CARNet, for underwater image enhancement and subsequent detection tasks.
We propose a synchronized attack training strategy with both visual-driven and perception-driven attacks enabling the network to discern and remove various types of attacks.
Experiments demonstrate that the proposed method outputs visually appealing enhanced images and achieves, on average, 6.71% higher detection mAP than state-of-the-art methods.
arXiv Detail & Related papers (2023-09-03T06:52:05Z) - Mitigating Adversarial Attacks in Deepfake Detection: An Exploration of
Perturbation and AI Techniques [1.0718756132502771]
Adversarial examples are subtle perturbations artfully injected into clean images or videos.
Deepfakes have emerged as a potent tool to manipulate public opinion and tarnish the reputations of public figures.
This article delves into the multifaceted world of adversarial examples, elucidating the underlying principles behind their capacity to deceive deep learning algorithms.
arXiv Detail & Related papers (2023-02-22T23:48:19Z) - Minimum Noticeable Difference based Adversarial Privacy Preserving Image
Generation [44.2692621807947]
We develop a framework to generate adversarial privacy preserving images that have minimum perceptual difference from the clean ones but are able to attack deep learning models.
To the best of our knowledge, this is the first work on exploring quality-preserving adversarial image generation based on the MND concept for privacy preserving.
arXiv Detail & Related papers (2022-06-17T09:02:12Z) - Robust Sensible Adversarial Learning of Deep Neural Networks for Image
Classification [6.594522185216161]
We introduce sensible adversarial learning and demonstrate the synergistic effect between pursuits of standard natural accuracy and robustness.
Specifically, we define a sensible adversary which is useful for learning a robust model while keeping high natural accuracy.
We propose a novel and efficient algorithm that trains a robust model using implicit loss truncation.
arXiv Detail & Related papers (2022-05-20T22:57:44Z) - On the Robustness of Quality Measures for GANs [136.18799984346248]
This work evaluates the robustness of quality measures of generative models such as Inception Score (IS) and Fréchet Inception Distance (FID).
We show that such metrics can also be manipulated by additive pixel perturbations.
arXiv Detail & Related papers (2022-01-31T06:43:09Z) - Deep Bayesian Image Set Classification: A Defence Approach against
Adversarial Attacks [32.48820298978333]
Deep neural networks (DNNs) are susceptible to being fooled with high confidence by an adversary.
In practice, the vulnerability of deep learning systems to carefully perturbed images, known as adversarial examples, poses a dire security threat in physical-world applications.
We propose a robust deep Bayesian image set classification as a defence framework against a broad range of adversarial attacks.
arXiv Detail & Related papers (2021-08-23T14:52:44Z) - Deep Model Intellectual Property Protection via Deep Watermarking [122.87871873450014]
Deep neural networks are exposed to serious IP infringement risks.
Given a target deep model, if the attacker has full knowledge of it, the model can easily be stolen by fine-tuning.
We propose a new model watermarking framework for protecting deep networks trained for low-level computer vision or image processing tasks.
arXiv Detail & Related papers (2021-03-08T18:58:21Z) - Towards Transferable Adversarial Attack against Deep Face Recognition [58.07786010689529]
Deep convolutional neural networks (DCNNs) have been found to be vulnerable to adversarial examples.
Transferable adversarial examples can severely hinder the robustness of DCNNs.
We propose DFANet, a dropout-based method used in convolutional layers, which can increase the diversity of surrogate models.
We generate a new set of adversarial face pairs that can successfully attack four commercial APIs without any queries.
arXiv Detail & Related papers (2020-04-13T06:44:33Z) - Adversarial Attacks on Convolutional Neural Networks in Facial
Recognition Domain [2.4704085162861693]
Adversarial attacks that render Deep Neural Network (DNN) classifiers vulnerable in real life represent a serious threat in autonomous vehicles, malware filters, or biometric authentication systems.
We apply the Fast Gradient Sign Method (FGSM) to introduce perturbations to a facial image dataset and then test the output on a different classifier; a minimal FGSM sketch appears after this list.
We craft a variety of different black-box attack algorithms on a facial image dataset assuming minimal adversarial knowledge.
arXiv Detail & Related papers (2020-01-30T00:25:05Z)
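To make the black-box transfer setup mentioned in the last related paper
concrete, here is a minimal single-step FGSM sketch in PyTorch: an adversarial
image is crafted against one (surrogate) classifier and then evaluated on a
different one. The surrogate/target model names and variables are hypothetical
illustrations, not code from that paper.

```python
import torch
import torch.nn.functional as F

def fgsm_attack(model, image, label, epsilon=0.03):
    """Single-step FGSM: move the input in the sign of the loss gradient.
    Illustrative sketch; pixel values are assumed to live in [0, 1]."""
    model.eval()
    x = image.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(x), label)
    loss.backward()
    adv = x + epsilon * x.grad.sign()
    return adv.clamp(0.0, 1.0).detach()

# Transfer test: craft on a surrogate classifier, evaluate on a different one.
# adv = fgsm_attack(surrogate_model, image, label, epsilon=0.03)
# transferred_pred = target_model(adv).argmax(dim=1)
```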