LEAT: Towards Robust Deepfake Disruption in Real-World Scenarios via Latent Ensemble Attack
- URL: http://arxiv.org/abs/2307.01520v1
- Date: Tue, 4 Jul 2023 07:00:37 GMT
- Title: LEAT: Towards Robust Deepfake Disruption in Real-World Scenarios via Latent Ensemble Attack
- Authors: Joonkyo Shim, Hyunsoo Yoon
- Abstract summary: Deepfakes, malicious visual contents created by generative models, pose an increasingly harmful threat to society.
To proactively mitigate deepfake damages, recent studies have employed adversarial perturbation to disrupt deepfake model outputs.
We propose a simple yet effective disruption method called Latent Ensemble ATtack (LEAT), which attacks the independent latent encoding process.
- Score: 11.764601181046496
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Deepfakes, malicious visual contents created by generative models, pose an
increasingly harmful threat to society. To proactively mitigate deepfake
damages, recent studies have employed adversarial perturbation to disrupt
deepfake model outputs. However, previous approaches primarily focus on
generating distorted outputs based only on predetermined target attributes,
leading to a lack of robustness in real-world scenarios where target attributes
are unknown. Additionally, the transferability of perturbations between two
prominent generative models, Generative Adversarial Networks (GANs) and
Diffusion Models, remains unexplored. In this paper, we emphasize the
importance of target attribute-transferability and model-transferability for
achieving robust deepfake disruption. To address this challenge, we propose a
simple yet effective disruption method called Latent Ensemble ATtack (LEAT),
which attacks the independent latent encoding process. By disrupting the latent
encoding process, it generates distorted output images in subsequent generation
processes, regardless of the given target attributes. This target
attribute-agnostic attack ensures robust disruption even when the target
attributes are unknown. Additionally, we introduce a Normalized Gradient
Ensemble strategy that effectively aggregates gradients for iterative gradient
attacks, enabling simultaneous attacks on various types of deepfake models,
involving both GAN-based and Diffusion-based models. Moreover, we demonstrate
the insufficiency of evaluating disruption quality solely based on pixel-level
differences. As a result, we propose an alternative protocol for
comprehensively evaluating the success of defense. Extensive experiments
confirm the efficacy of our method in disrupting deepfakes in real-world
scenarios, reporting a higher defense success rate compared to previous
methods.
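A minimal PyTorch sketch of the core idea, under stated assumptions: the attack pushes the latent encoding of the perturbed image away from that of the clean image for each surrogate model, and the per-model gradients are L2-normalized before being combined (the Normalized Gradient Ensemble), so GAN-based and Diffusion-based surrogates with very different gradient scales contribute comparably. The `encoders` interface, the MSE disruption loss, and the PGD-style update below are illustrative assumptions, not the paper's exact implementation.

```python
# Illustrative sketch of a latent-ensemble disruption attack (assumed interfaces).
import torch
import torch.nn.functional as F

def latent_ensemble_attack(x, encoders, eps=8 / 255, alpha=2 / 255, steps=10):
    """Craft a protective perturbation for an image batch `x` with values in [0, 1].

    `encoders` is a list of callables, each mapping an image batch to the
    latent tensor of one surrogate deepfake model.
    """
    x = x.detach()
    with torch.no_grad():
        clean_latents = [enc(x).detach() for enc in encoders]

    delta = torch.zeros_like(x).uniform_(-eps, eps).requires_grad_(True)
    for _ in range(steps):
        grad_sum = torch.zeros_like(x)
        for enc, z_clean in zip(encoders, clean_latents):
            # Target-attribute-agnostic loss: push the latent of the perturbed
            # image away from the latent of the clean image.
            z_adv = enc(torch.clamp(x + delta, 0.0, 1.0))
            loss = F.mse_loss(z_adv, z_clean)
            grad = torch.autograd.grad(loss, delta)[0]
            # Normalized Gradient Ensemble: rescale each model's gradient to
            # unit L2 norm before summing, so no surrogate dominates the update.
            norm = grad.flatten(1).norm(dim=1).view(-1, 1, 1, 1) + 1e-12
            grad_sum = grad_sum + grad / norm
        # Gradient ascent on the disruption loss, projected onto the eps-ball.
        delta = (delta + alpha * grad_sum.sign()).clamp(-eps, eps)
        delta = delta.detach().requires_grad_(True)

    return torch.clamp(x + delta, 0.0, 1.0).detach()
```

In practice each element of `encoders` would wrap the latent or conditioning encoder of one surrogate deepfake model (e.g. a GAN inversion encoder or a diffusion model's image encoder); because the loss never references any target attribute, the resulting perturbation remains attribute-agnostic.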
Related papers
- Transferable Adversarial Attacks on SAM and Its Downstream Models [87.23908485521439]
This paper explores the feasibility of adversarial attacking various downstream models fine-tuned from the segment anything model (SAM)
To enhance the effectiveness of the adversarial attack towards models fine-tuned on unknown datasets, we propose a universal meta-initialization (UMI) algorithm.
arXiv Detail & Related papers (2024-10-26T15:04:04Z)
- MirrorCheck: Efficient Adversarial Defense for Vision-Language Models [55.73581212134293]
We propose a novel, yet elegantly simple approach for detecting adversarial samples in Vision-Language Models.
Our method leverages Text-to-Image (T2I) models to generate images based on captions produced by target VLMs.
Empirical evaluations conducted on different datasets validate the efficacy of our approach.
arXiv Detail & Related papers (2024-06-13T15:55:04Z)
- Multi-granular Adversarial Attacks against Black-box Neural Ranking Models [111.58315434849047]
We create high-quality adversarial examples by incorporating multi-granular perturbations.
We transform the multi-granular attack into a sequential decision-making process.
Our attack method surpasses prevailing baselines in both attack effectiveness and imperceptibility.
arXiv Detail & Related papers (2024-04-02T02:08:29Z)
- Improving the Robustness of Object Detection and Classification AI models against Adversarial Patch Attacks [2.963101656293054]
We analyze attack techniques and propose a robust defense approach.
We successfully reduce model confidence by over 20% using adversarial patch attacks that exploit object shape, texture and position.
Our inpainting defense approach significantly enhances model resilience, achieving high accuracy and reliable localization despite the adversarial attacks.
arXiv Detail & Related papers (2024-03-04T13:32:48Z)
- Adv-Diffusion: Imperceptible Adversarial Face Identity Attack via Latent Diffusion Model [61.53213964333474]
We propose a unified framework Adv-Diffusion that can generate imperceptible adversarial identity perturbations in the latent space but not the raw pixel space.
Specifically, we propose the identity-sensitive conditioned diffusion generative model to generate semantic perturbations in the surroundings.
The designed adaptive strength-based adversarial perturbation algorithm can ensure both attack transferability and stealthiness.
arXiv Detail & Related papers (2023-12-18T15:25:23Z)
- Model Stealing Attack against Graph Classification with Authenticity, Uncertainty and Diversity [80.16488817177182]
GNNs are vulnerable to model stealing attacks, in which an adversary duplicates the target model using query access.
We introduce three model stealing attacks to adapt to different actual scenarios.
arXiv Detail & Related papers (2023-12-18T05:42:31Z)
- AdvART: Adversarial Art for Camouflaged Object Detection Attacks [7.7889972735711925]
We propose a novel approach to generate naturalistic and inconspicuous adversarial patches.
Our technique is based on directly manipulating the pixel values in the patch, which gives higher flexibility and a larger search space.
Our attack achieves success rates of up to 91.19% in the digital world and 72% when deployed in smart cameras at the edge.
arXiv Detail & Related papers (2023-03-03T06:28:05Z)
- Adv-Attribute: Inconspicuous and Transferable Adversarial Attack on Face Recognition [111.1952945740271]
Adversarial Attributes (Adv-Attribute) is designed to generate inconspicuous and transferable attacks on face recognition.
Experiments on the FFHQ and CelebA-HQ datasets show that the proposed Adv-Attribute method achieves the state-of-the-art attacking success rates.
arXiv Detail & Related papers (2022-10-13T09:56:36Z)
- Resisting Adversarial Attacks in Deep Neural Networks using Diverse Decision Boundaries [12.312877365123267]
Deep learning systems are vulnerable to crafted adversarial examples, which may be imperceptible to the human eye, but can lead the model to misclassify.
We develop a new ensemble-based solution that constructs defender models with diverse decision boundaries with respect to the original model.
We present extensive experimentations using standard image classification datasets, namely MNIST, CIFAR-10 and CIFAR-100 against state-of-the-art adversarial attacks.
arXiv Detail & Related papers (2022-08-18T08:19:26Z)
- Unreasonable Effectiveness of Last Hidden Layer Activations [0.5156484100374058]
We show that using some widely known activation functions in the output layer of the model with high temperature values has the effect of zeroing out the gradients for both targeted and untargeted attack cases.
We've experimentally verified the efficacy of our approach on MNIST (Digit), CIFAR10 datasets.
arXiv Detail & Related papers (2022-02-15T12:02:59Z)