Passive Deepfake Detection Across Multi-modalities: A Comprehensive Survey
- URL: http://arxiv.org/abs/2411.17911v1
- Date: Tue, 26 Nov 2024 22:04:49 GMT
- Title: Passive Deepfake Detection Across Multi-modalities: A Comprehensive Survey
- Authors: Hong-Hanh Nguyen-Le, Van-Tuan Tran, Dinh-Thuc Nguyen, Nhien-An Le-Khac
- Abstract summary: Deepfakes (DFs) have been utilized for malicious purposes, such as individual impersonation, misinformation spreading, and artists' style imitation.
This survey explores passive approaches across multiple modalities, including image, video, audio, and multi-modal domains.
- Score: 1.7811840395202345
- Abstract: In recent years, deepfakes (DFs) have been utilized for malicious purposes, such as individual impersonation, misinformation spreading, and artists' style imitation, raising ethical and security concerns. However, existing surveys have focused on the detection accuracy of passive DF detection approaches for single modalities, such as image, video, or audio. This comprehensive survey explores passive approaches across multiple modalities, including image, video, audio, and multi-modal domains, and extends the discussion beyond detection accuracy to include generalization, robustness, attribution, and interpretability. Additionally, we discuss threat models for passive approaches, including potential adversarial strategies and different levels of adversary knowledge and capabilities. We also highlight current challenges in DF detection, including the lack of generalization across different generative models, the need for comprehensive trustworthiness evaluation, and the limitations of existing multi-modal approaches. Finally, we propose future research directions that address these unexplored and emerging issues in the field of passive DF detection, such as adaptive learning, dynamic benchmarks, holistic trustworthiness evaluation, and multi-modal detectors for talking-face video generation.
Related papers
- Survey on AI-Generated Media Detection: From Non-MLLM to MLLM [51.91311158085973]
Methods for detecting AI-generated media have evolved rapidly.
General-purpose detectors based on MLLMs integrate authenticity verification, explainability, and localization capabilities.
Ethical and security considerations have emerged as critical global concerns.
arXiv Detail & Related papers (2025-02-07T12:18:20Z)
- A Review Paper of the Effects of Distinct Modalities and ML Techniques to Distracted Driving Detection [3.6248657646376707]
Distracted driving remains a significant global challenge with severe human and economic repercussions.
This systematic review addresses critical gaps by providing a comprehensive analysis of machine learning (ML) and deep learning (DL) techniques applied across various data modalities.
arXiv Detail & Related papers (2025-01-20T21:35:34Z)
- Unsupervised Model Diagnosis [49.36194740479798]
This paper proposes Unsupervised Model Diagnosis (UMO) to produce semantic counterfactual explanations without any user guidance.
Our approach identifies and visualizes changes in semantics, and then matches these changes to attributes from wide-ranging text sources.
arXiv Detail & Related papers (2024-10-08T17:59:03Z)
- Deep Learning for Video Anomaly Detection: A Review [52.74513211976795]
Video anomaly detection (VAD) aims to discover behaviors or events deviating from the normality in videos.
In the era of deep learning, a great variety of deep-learning-based methods continue to emerge for the VAD task.
This review covers the spectrum of five different categories, namely, semi-supervised, weakly supervised, fully supervised, unsupervised and open-set supervised VAD.
arXiv Detail & Related papers (2024-09-09T07:31:16Z)
- FFAA: Multimodal Large Language Model based Explainable Open-World Face Forgery Analysis Assistant [59.2438504610849]
We introduce FFAA: Face Forgery Analysis Assistant, consisting of a fine-tuned Multimodal Large Language Model (MLLM) and a Multi-answer Intelligent Decision System (MIDS).
Our method not only provides user-friendly and explainable results but also significantly boosts accuracy and robustness compared to previous methods.
arXiv Detail & Related papers (2024-08-19T15:15:20Z)
- Multimodal Prompt Learning with Missing Modalities for Sentiment Analysis and Emotion Recognition [52.522244807811894]
We propose a novel multimodal Transformer framework using prompt learning to address the issue of missing modalities.
Our method introduces three types of prompts: generative prompts, missing-signal prompts, and missing-type prompts.
Through prompt learning, we achieve a substantial reduction in the number of trainable parameters.
arXiv Detail & Related papers (2024-07-07T13:55:56Z)
- Evolving from Single-modal to Multi-modal Facial Deepfake Detection: A Survey [40.11614155244292]
As AI-generated media become more realistic, the risk of misuse to spread misinformation and commit identity fraud increases.
This work traces the evolution from traditional single-modality methods to sophisticated multi-modal approaches that handle audio-visual and text-visual scenarios.
To our knowledge, this is the first survey of its kind.
arXiv Detail & Related papers (2024-06-11T05:48:04Z)
- A Comprehensive Library for Benchmarking Multi-class Visual Anomaly Detection [52.228708947607636]
This paper introduces a comprehensive visual anomaly detection benchmark, ADer, which is a modular framework for new methods.
The benchmark includes multiple datasets from industrial and medical domains, implementing fifteen state-of-the-art methods and nine comprehensive metrics.
We objectively reveal the strengths and weaknesses of different methods and provide insights into the challenges and future directions of multi-class visual anomaly detection.
arXiv Detail & Related papers (2024-06-05T13:40:07Z)
- Video Anomaly Detection in 10 Years: A Survey and Outlook [10.143205531474907]
Video anomaly detection (VAD) holds immense importance across diverse domains such as surveillance, healthcare, and environmental monitoring.
This survey explores deep learning-based VAD, expanding beyond traditional supervised training paradigms to encompass emerging weakly supervised, self-supervised, and unsupervised approaches.
arXiv Detail & Related papers (2024-05-29T17:56:31Z)
- Audio Anti-Spoofing Detection: A Survey [7.3348524333159]
Deep learning has given rise to sophisticated algorithms capable of manipulating or creating multimedia fake content, known as Deepfake.
Audio anti-spoofing detection challenges have been organized to foster the development of anti-spoofing countermeasures.
This survey paper presents a comprehensive review of every component within the detection pipeline, including algorithm architectures, optimization techniques, application generalizability, evaluation metrics, performance comparisons, available datasets, and open-source availability.
arXiv Detail & Related papers (2024-04-22T06:52:12Z)
- A Comprehensive Study on the Robustness of Image Classification and Object Detection in Remote Sensing: Surveying and Benchmarking [17.012502610423006]
Deep neural networks (DNNs) have found widespread applications in interpreting remote sensing (RS) imagery.
Previous work has demonstrated that DNNs are vulnerable to different types of noise, particularly adversarial noise.
This study represents the first comprehensive examination of both natural robustness and adversarial robustness in RS tasks.
arXiv Detail & Related papers (2023-06-21T08:52:35Z)
This list is automatically generated from the titles and abstracts of the papers in this site.