Passive Deepfake Detection Across Multi-modalities: A Comprehensive Survey
- URL: http://arxiv.org/abs/2411.17911v2
- Date: Sat, 05 Apr 2025 18:48:12 GMT
- Title: Passive Deepfake Detection Across Multi-modalities: A Comprehensive Survey
- Authors: Hong-Hanh Nguyen-Le, Van-Tuan Tran, Dinh-Thuc Nguyen, Nhien-An Le-Khac
- Abstract summary: Deepfakes (DFs) have been utilized for malicious purposes, such as individual impersonation, misinformation spreading, and artists' style imitation. This survey offers researchers and practitioners a comprehensive resource for understanding the current landscape, methodological approaches, and promising future directions in this rapidly evolving field.
- Score: 1.7811840395202345
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: In recent years, deepfakes (DFs) have been utilized for malicious purposes, such as individual impersonation, misinformation spreading, and artists' style imitation, raising ethical and security concerns. In this survey, we provide a comprehensive review and comparison of passive DF detection across multiple modalities, including image, video, audio, and multi-modal, to explore the inter-modality relationships between them. Beyond detection accuracy, we extend our analysis to encompass crucial performance dimensions essential for real-world deployment: generalization capabilities across novel generation techniques, robustness against adversarial manipulations and postprocessing techniques, attribution precision in identifying generation sources, and resilience under real-world operational conditions. Additionally, we analyze the advantages and limitations of existing datasets, benchmarks, and evaluation metrics for passive DF detection. Finally, we propose future research directions that address these unexplored and emerging issues in the field of passive DF detection. This survey offers researchers and practitioners a comprehensive resource for understanding the current landscape, methodological approaches, and promising future directions in this rapidly evolving field.
Related papers
- Deep Learning Advancements in Anomaly Detection: A Comprehensive Survey [43.75849983150303]
As datasets become more complex, traditional anomaly detection methods struggle to capture intricate patterns.
Deep learning has made AD methods more powerful and adaptable, improving their ability to handle high-dimensional and unstructured data.
This review bridges gaps in existing literature and serves as a valuable resource for researchers and practitioners seeking to enhance AD techniques using deep learning.
arXiv Detail & Related papers (2025-03-17T14:04:48Z)
- Survey on AI-Generated Media Detection: From Non-MLLM to MLLM [51.91311158085973]
Methods for detecting AI-generated media have evolved rapidly.
General-purpose detectors based on MLLMs integrate authenticity verification, explainability, and localization capabilities.
Ethical and security considerations have emerged as critical global concerns.
arXiv Detail & Related papers (2025-02-07T12:18:20Z)
- Unsupervised Model Diagnosis [49.36194740479798]
This paper proposes Unsupervised Model Diagnosis (UMO) to produce semantic counterfactual explanations without any user guidance.
Our approach identifies and visualizes changes in semantics, and then matches these changes to attributes from wide-ranging text sources.
arXiv Detail & Related papers (2024-10-08T17:59:03Z)
- Underwater Object Detection in the Era of Artificial Intelligence: Current, Challenge, and Future [119.88454942558485]
Underwater object detection (UOD) aims to identify and localise objects in underwater images or videos.
In recent years, artificial intelligence (AI) based methods, especially deep learning methods, have shown promising performance in UOD.
arXiv Detail & Related papers (2024-10-08T00:25:33Z)
- FFAA: Multimodal Large Language Model based Explainable Open-World Face Forgery Analysis Assistant [59.2438504610849]
We introduce FFAA: Face Forgery Analysis Assistant, consisting of a fine-tuned Multimodal Large Language Model (MLLM) and Multi-answer Intelligent Decision System (MIDS).
Our method not only provides user-friendly and explainable results but also significantly boosts accuracy and robustness compared to previous methods.
arXiv Detail & Related papers (2024-08-19T15:15:20Z)
- Multimodal Prompt Learning with Missing Modalities for Sentiment Analysis and Emotion Recognition [52.522244807811894]
We propose a novel multimodal Transformer framework using prompt learning to address the issue of missing modalities.
Our method introduces three types of prompts: generative prompts, missing-signal prompts, and missing-type prompts.
Through prompt learning, we achieve a substantial reduction in the number of trainable parameters.
arXiv Detail & Related papers (2024-07-07T13:55:56Z)
- Evolving from Single-modal to Multi-modal Facial Deepfake Detection: Progress and Challenges [40.11614155244292]
This survey traces the evolution of deepfake detection from early single-modal methods to sophisticated multi-modal approaches.
We present a structured taxonomy of detection techniques and analyze the transition from GAN-based to diffusion model-driven deepfakes.
arXiv Detail & Related papers (2024-06-11T05:48:04Z)
- A Comprehensive Library for Benchmarking Multi-class Visual Anomaly Detection [52.228708947607636]
This paper introduces a comprehensive visual anomaly detection benchmark, ADer, which is a modular framework for new methods.
The benchmark includes multiple datasets from industrial and medical domains, implementing fifteen state-of-the-art methods and nine comprehensive metrics.
We objectively reveal the strengths and weaknesses of different methods and provide insights into the challenges and future directions of multi-class visual anomaly detection.
arXiv Detail & Related papers (2024-06-05T13:40:07Z)
- Video Anomaly Detection in 10 Years: A Survey and Outlook [10.143205531474907]
Video anomaly detection (VAD) holds immense importance across diverse domains such as surveillance, healthcare, and environmental monitoring.
This survey explores deep learning-based VAD, expanding beyond traditional supervised training paradigms to encompass emerging weakly supervised, self-supervised, and unsupervised approaches.
arXiv Detail & Related papers (2024-05-29T17:56:31Z)
- A Timely Survey on Vision Transformer for Deepfake Detection [11.410817278428533]
Vision Transformer (ViT)-based approaches showcase superior performance in generality and efficiency.
This survey aims to equip researchers with a nuanced understanding of ViT's pivotal role in deepfake detection.
arXiv Detail & Related papers (2024-05-14T09:33:04Z)
- Audio Anti-Spoofing Detection: A Survey [7.3348524333159]
Deep learning has given rise to sophisticated algorithms capable of manipulating or creating multimedia fake content, known as Deepfake.
Audio anti-spoofing detection challenges have been organized to foster the development of anti-spoofing countermeasures.
This survey paper presents a comprehensive review of every component within the detection pipeline, including algorithm architectures, optimization techniques, application generalizability, evaluation metrics, performance comparisons, available datasets, and open-source availability.
arXiv Detail & Related papers (2024-04-22T06:52:12Z)
- Deepfake Generation and Detection: A Benchmark and Survey [134.19054491600832]
Deepfake is a technology dedicated to creating highly realistic facial images and videos under specific conditions.
This survey comprehensively reviews the latest developments in deepfake generation and detection.
We focus on researching four representative deepfake fields: face swapping, face reenactment, talking face generation, and facial attribute editing.
arXiv Detail & Related papers (2024-03-26T17:12:34Z)
- CrossDF: Improving Cross-Domain Deepfake Detection with Deep Information Decomposition [53.860796916196634]
We propose a Deep Information Decomposition (DID) framework to enhance the performance of Cross-dataset Deepfake Detection (CrossDF).
Unlike most existing deepfake detection methods, our framework prioritizes high-level semantic features over specific visual artifacts.
It adaptively decomposes facial features into deepfake-related and irrelevant information, only using the intrinsic deepfake-related information for real/fake discrimination.
arXiv Detail & Related papers (2023-09-30T12:30:25Z)
- A Comprehensive Study on the Robustness of Image Classification and Object Detection in Remote Sensing: Surveying and Benchmarking [17.012502610423006]
Deep neural networks (DNNs) have found widespread applications in interpreting remote sensing (RS) imagery.
It has been demonstrated in previous works that DNNs are vulnerable to different types of noises, particularly adversarial noises.
This study represents the first comprehensive examination of both natural robustness and adversarial robustness in RS tasks.
arXiv Detail & Related papers (2023-06-21T08:52:35Z)
- Survey of Network Intrusion Detection Methods from the Perspective of the Knowledge Discovery in Databases Process [63.75363908696257]
We review the methods that have been applied to network data with the purpose of developing an intrusion detector.
We discuss the techniques used for the capture, preparation, and transformation of the data, as well as the data mining and evaluation methods.
As a result of this literature review, we investigate some open issues which will need to be considered for further research in the area of network security.
arXiv Detail & Related papers (2020-01-27T11:21:05Z)