OGAN: Disrupting Deepfakes with an Adversarial Attack that Survives Training
- URL: http://arxiv.org/abs/2006.12247v2
- Date: Wed, 25 Nov 2020 11:47:30 GMT
- Title: OGAN: Disrupting Deepfakes with an Adversarial Attack that Survives Training
- Authors: Eran Segalis, Eran Galili
- Abstract summary: We introduce a class of adversarial attacks that can disrupt face-swapping autoencoders.
We propose the Oscillating GAN (OGAN) attack, a novel attack optimized to be training-resistant.
These results demonstrate the existence of training-resistant adversarial attacks, potentially applicable to a wide range of domains.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recent advances in autoencoders and generative models have given rise to
effective video forgery methods, used for generating so-called "deepfakes".
Mitigation research is mostly focused on post-factum deepfake detection and not
on prevention. We complement these efforts by introducing a novel class of
adversarial attacks---training-resistant attacks---which can disrupt
face-swapping autoencoders whether or not their adversarial images have been
included in the training set of said autoencoders. We propose the Oscillating
GAN (OGAN) attack, a novel attack optimized to be training-resistant, which
introduces spatial-temporal distortions to the output of face-swapping
autoencoders. To implement OGAN, we construct a bilevel optimization problem,
where we train a generator and a face-swapping model instance against each
other. Specifically, we pair each input image with a target distortion, and
feed them into a generator that produces an adversarial image. This image will
exhibit the distortion when a face-swapping autoencoder is applied to it. We
solve the optimization problem by training the generator and the face-swapping
model simultaneously using an iterative process of alternating optimization.
Next, we analyze the previously published Distorting Attack and show it is
training-resistant, though it is outperformed by our suggested OGAN. Finally,
we validate both attacks using a popular implementation of FaceSwap, and show
that they transfer across different target models and target faces, including
faces the adversarial attacks were not trained on. More broadly, these results
demonstrate the existence of training-resistant adversarial attacks,
potentially applicable to a wide range of domains.
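The alternating optimization described in the abstract can be sketched as follows. This is a minimal illustrative sketch in PyTorch, not the authors' implementation: the toy Generator and FaceSwapAE architectures, the L1 losses, the perturbation bound, and the learning rates are all assumptions, and the oscillating spatial-temporal structure of the OGAN distortions is omitted.
```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Hypothetical stand-ins for the paper's components; architectures are illustrative only.
class Generator(nn.Module):
    """Maps (input image, paired target distortion) -> adversarial image."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(6, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 3, 3, padding=1), nn.Tanh(),
        )

    def forward(self, image, distortion):
        # Condition the generator on the image and its paired target distortion.
        x = torch.cat([image, distortion], dim=1)
        # Small bounded perturbation keeps the adversarial image close to the input.
        return torch.clamp(image + 0.05 * self.net(x), 0.0, 1.0)

class FaceSwapAE(nn.Module):
    """Toy autoencoder standing in for a face-swapping model instance."""
    def __init__(self):
        super().__init__()
        self.encoder = nn.Sequential(nn.Conv2d(3, 32, 3, padding=1), nn.ReLU())
        self.decoder = nn.Sequential(nn.Conv2d(32, 3, 3, padding=1), nn.Sigmoid())

    def forward(self, x):
        return self.decoder(self.encoder(x))

def alternating_training(images, distortions, steps=1000):
    """Alternating optimization: the generator learns perturbations that make the
    autoencoder's output exhibit the paired distortion, while the autoencoder keeps
    training on those adversarial images as if they were ordinary training data."""
    gen, ae = Generator(), FaceSwapAE()
    opt_gen = torch.optim.Adam(gen.parameters(), lr=1e-4)
    opt_ae = torch.optim.Adam(ae.parameters(), lr=1e-4)

    for _ in range(steps):
        # Generator step: push the autoencoder output toward the distorted target
        # (one plausible way to impose the target distortion; the paper's exact loss may differ).
        adv = gen(images, distortions)
        distorted_target = torch.clamp(images + distortions, 0.0, 1.0)
        gen_loss = F.l1_loss(ae(adv), distorted_target)
        opt_gen.zero_grad()
        gen_loss.backward()
        opt_gen.step()

        # Autoencoder step: reconstruct the adversarial images, simulating a deepfake
        # model whose training set contains the attack.
        adv = gen(images, distortions).detach()
        ae_loss = F.l1_loss(ae(adv), images)
        opt_ae.zero_grad()
        ae_loss.backward()
        opt_ae.step()

    return gen, ae
```
The property this setup targets is the one claimed in the abstract: the generator's perturbations should remain effective even though the face-swapping autoencoder is trained on the adversarial images themselves, i.e. the attack survives training.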
Related papers
- Imperceptible Face Forgery Attack via Adversarial Semantic Mask [59.23247545399068]
We propose an Adversarial Semantic Mask Attack framework (ASMA) which can generate adversarial examples with good transferability and invisibility.
Specifically, we propose a novel adversarial semantic mask generative model, which can constrain generated perturbations in local semantic regions for good stealthiness.
arXiv Detail & Related papers (2024-06-16T10:38:11Z)
- GenFighter: A Generative and Evolutive Textual Attack Removal [6.044610337297754]
Adversarial attacks pose significant challenges to deep neural networks (DNNs) such as Transformer models in natural language processing (NLP).
This paper introduces a novel defense strategy, called GenFighter, which enhances adversarial robustness by learning and reasoning on the training classification distribution.
We show that GenFighter outperforms state-of-the-art defenses in accuracy under attack and attack success rate metrics.
arXiv Detail & Related papers (2024-04-17T16:32:13Z)
- Poisoned Forgery Face: Towards Backdoor Attacks on Face Forgery Detection [62.595450266262645]
This paper introduces a novel and previously unrecognized threat in face forgery detection scenarios caused by backdoor attack.
By embedding backdoors into models, attackers can deceive detectors into producing erroneous predictions for forged faces.
We propose the Poisoned Forgery Face framework, which enables clean-label backdoor attacks on face forgery detectors.
arXiv Detail & Related papers (2024-02-18T06:31:05Z)
- Detecting Adversarial Faces Using Only Real Face Self-Perturbations [36.26178169550577]
Adversarial attacks aim to disturb the functionality of a target system by adding specific noise to the input samples.
Existing defense techniques achieve high accuracy in detecting some specific adversarial faces (adv-faces).
New attack methods, especially GAN-based attacks with completely different noise patterns, circumvent them and reach a higher attack success rate.
arXiv Detail & Related papers (2023-04-22T09:55:48Z)
- Guidance Through Surrogate: Towards a Generic Diagnostic Attack [101.36906370355435]
We develop a guided mechanism to avoid local minima during attack optimization, leading to a novel attack dubbed Guided Projected Gradient Attack (G-PGA).
Our modified attack does not require random restarts, a large number of attack iterations, or a search for an optimal step size.
More than an effective attack, G-PGA can be used as a diagnostic tool to reveal elusive robustness due to gradient masking in adversarial defenses.
arXiv Detail & Related papers (2022-12-30T18:45:23Z)
- Adversarial Pixel Restoration as a Pretext Task for Transferable Perturbations [54.1807206010136]
Transferable adversarial attacks optimize adversaries from a pretrained surrogate model and known label space to fool the unknown black-box models.
We propose Adversarial Pixel Restoration as a self-supervised alternative to train an effective surrogate model from scratch.
Our training approach is based on a min-max objective which reduces overfitting via an adversarial objective.
arXiv Detail & Related papers (2022-07-18T17:59:58Z)
- Restricted Black-box Adversarial Attack Against DeepFake Face Swapping [70.82017781235535]
We introduce a practical adversarial attack that does not require any queries to the facial image forgery model.
Our method is built on a substitute model pursuing face reconstruction, and it then transfers adversarial examples from the substitute model directly to inaccessible black-box DeepFake models.
arXiv Detail & Related papers (2022-04-26T14:36:06Z)
- Online Alternate Generator against Adversarial Attacks [144.45529828523408]
Deep learning models are notoriously sensitive to adversarial examples which are synthesized by adding quasi-perceptible noises on real images.
We propose a portable defense method, online alternate generator, which does not need to access or modify the parameters of the target networks.
The proposed method works by online synthesizing another image from scratch for an input image, instead of removing or destroying adversarial noises.
arXiv Detail & Related papers (2020-09-17T07:11:16Z)
- Defending against GAN-based Deepfake Attacks via Transformation-aware Adversarial Faces [36.87244915810356]
Deepfake represents a category of face-swapping attacks that leverage machine learning models.
We propose to use novel transformation-aware adversarially perturbed faces as a defense against Deepfake attacks.
We also propose to use an ensemble-based approach to enhance the defense robustness against GAN-based Deepfake variants.
arXiv Detail & Related papers (2020-06-12T18:51:57Z)