An Analysis of Recent Advances in Deepfake Image Detection in an Evolving Threat Landscape
- URL: http://arxiv.org/abs/2404.16212v1
- Date: Wed, 24 Apr 2024 21:21:50 GMT
- Title: An Analysis of Recent Advances in Deepfake Image Detection in an Evolving Threat Landscape
- Authors: Sifat Muhammad Abdullah, Aravind Cheruvu, Shravya Kanchi, Taejoong Chung, Peng Gao, Murtuza Jadliwala, Bimal Viswanath
- Abstract summary: Deepfake or synthetic images produced using deep generative models pose serious risks to online platforms. We study 8 state-of-the-art detectors and argue that they are far from being ready for deployment.
- Score: 11.45988746286973
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Deepfake or synthetic images produced using deep generative models pose serious risks to online platforms. This has triggered several research efforts to accurately detect deepfake images, achieving excellent performance on publicly available deepfake datasets. In this work, we study 8 state-of-the-art detectors and argue that they are far from being ready for deployment due to two recent developments. First, the emergence of lightweight methods to customize large generative models can enable an attacker to create many customized generators (to create deepfakes), thereby substantially increasing the threat surface. We show that existing defenses fail to generalize well to such user-customized generative models that are publicly available today. We discuss new machine learning approaches based on content-agnostic features and ensemble modeling to improve generalization performance against user-customized models. Second, the emergence of vision foundation models -- machine learning models trained on broad data that can be easily adapted to several downstream tasks -- can be misused by attackers to craft adversarial deepfakes that evade existing defenses. We propose a simple adversarial attack that leverages existing foundation models to craft adversarial samples without adding any adversarial noise, through careful semantic manipulation of the image content. We highlight the vulnerabilities of several defenses against our attack, and explore directions leveraging advanced foundation models and adversarial training to defend against this new threat.
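The content-agnostic-features-plus-ensemble idea from the abstract can be illustrated with a minimal sketch (not the paper's implementation): embeddings from a frozen foundation-model image encoder (a CLIP-style encoder is assumed here) stand in for content-agnostic features, one lightweight classifier is trained per known generator family, and their fake-probabilities are averaged at test time. All names and the random stand-in features below are illustrative assumptions.

```python
# Minimal sketch, NOT the paper's detectors: an ensemble of per-generator
# binary classifiers over features from a frozen image encoder. Feature
# extraction is abstracted away; a CLIP-style foundation-model encoder is
# one plausible (assumed) source of content-agnostic features.
import numpy as np
from sklearn.linear_model import LogisticRegression


class DeepfakeEnsemble:
    """Averages fake-probabilities of classifiers trained per generator family."""

    def __init__(self):
        self.members = []

    def add_member(self, real_feats, fake_feats):
        # One member per (user-customized) generator: label 0 = real, 1 = fake.
        X = np.vstack([real_feats, fake_feats])
        y = np.concatenate([np.zeros(len(real_feats)), np.ones(len(fake_feats))])
        self.members.append(LogisticRegression(max_iter=1000).fit(X, y))

    def predict_fake_prob(self, feats):
        # Average member probabilities; any member whose generator's artifacts
        # transfer to the unseen generator pulls the score upward.
        probs = np.stack([m.predict_proba(feats)[:, 1] for m in self.members])
        return probs.mean(axis=0)


# Usage with random stand-in features (replace with real encoder embeddings).
rng = np.random.default_rng(0)
ensemble = DeepfakeEnsemble()
for _ in range(3):  # e.g., three known customized generators
    ensemble.add_member(rng.normal(0.0, 1.0, (100, 512)),
                        rng.normal(0.5, 1.0, (100, 512)))
print(ensemble.predict_fake_prob(rng.normal(0.0, 1.0, (5, 512))))
```

Averaging per-generator members is only one possible combination rule; the point of the sketch is that covering a newly observed user-customized generator means adding a member rather than retraining a monolithic detector.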
Related papers
- AdvIRL: Reinforcement Learning-Based Adversarial Attacks on 3D NeRF Models [1.7205106391379021]
AdvIRL generates adversarial noise that remains robust under diverse 3D transformations.
Our approach is validated across a wide range of scenes, from small objects (e.g., bananas) to large environments (e.g., lighthouses).
arXiv Detail & Related papers (2024-12-18T01:01:30Z)
- Transpose Attack: Stealing Datasets with Bidirectional Training [4.166238443183223]
We show that adversaries can exfiltrate datasets from protected learning environments under the guise of legitimate models.
We propose a novel approach for detecting infected models.
arXiv Detail & Related papers (2023-11-13T15:14:50Z)
- Streamlining Attack Tree Generation: A Fragment-Based Approach [39.157069600312774]
We present a novel fragment-based attack graph generation approach that utilizes information from publicly available information security databases.
We also propose a domain-specific language for attack modeling, which we employ in the proposed attack graph generation approach.
arXiv Detail & Related papers (2023-10-01T12:41:38Z)
- Careful What You Wish For: on the Extraction of Adversarially Trained Models [2.707154152696381]
Recent attacks on Machine Learning (ML) models pose several security and privacy threats.
We propose a framework to assess extraction attacks on adversarially trained models.
We show that adversarially trained models are more vulnerable to extraction attacks than models obtained under natural training circumstances.
arXiv Detail & Related papers (2022-07-21T16:04:37Z)
- Deepfake Forensics via An Adversarial Game [99.84099103679816]
We advocate adversarial training for improving the generalization ability to both unseen facial forgeries and unseen image/video qualities.
Considering that AI-based face manipulation often leaves high-frequency artifacts that models can easily spot yet struggle to generalize from, we propose a new adversarial training method that attempts to blur out these specific artifacts.
arXiv Detail & Related papers (2021-03-25T02:20:08Z)
- Learning to Attack: Towards Textual Adversarial Attacking in Real-world Situations [81.82518920087175]
Adversarial attacking aims to fool deep neural networks with adversarial examples.
We propose a reinforcement learning based attack model, which can learn from attack history and launch attacks more efficiently.
arXiv Detail & Related papers (2020-09-19T09:12:24Z)
- Online Alternate Generator against Adversarial Attacks [144.45529828523408]
Deep learning models are notoriously sensitive to adversarial examples, which are synthesized by adding quasi-perceptible noise to real images.
We propose a portable defense method, online alternate generator, which does not need to access or modify the parameters of the target networks.
The proposed method works by online synthesizing another image from scratch for an input image, instead of removing or destroying adversarial noises.
arXiv Detail & Related papers (2020-09-17T07:11:16Z)
- Artificial Fingerprinting for Generative Models: Rooting Deepfake Attribution in Training Data [64.65952078807086]
Photorealistic image generation has reached a new level of quality due to the breakthroughs of generative adversarial networks (GANs).
Yet, the dark side of such deepfakes, the malicious use of generated media, raises concerns about visual misinformation.
We seek a proactive and sustainable solution on deepfake detection by introducing artificial fingerprints into the models.
arXiv Detail & Related papers (2020-07-16T16:49:55Z)
- Orthogonal Deep Models As Defense Against Black-Box Attacks [71.23669614195195]
We study the inherent weakness of deep models in black-box settings where the attacker may develop the attack using a model similar to the targeted model.
We introduce a novel gradient regularization scheme that encourages the internal representation of a deep model to be orthogonal to another.
We verify the effectiveness of our technique on a variety of large-scale models.
arXiv Detail & Related papers (2020-06-26T08:29:05Z)
- Model Watermarking for Image Processing Networks [120.918532981871]
How to protect the intellectual property of deep models is a very important but seriously under-researched problem.
We propose the first model watermarking framework for protecting image processing models.
arXiv Detail & Related papers (2020-02-25T18:36:18Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.