Related papers: Passive Deepfake Detection Across Multi-modalities: A Comprehensive Survey

Passive Deepfake Detection Across Multi-modalities: A Comprehensive Survey

URL: http://arxiv.org/abs/2411.17911v1
Date: Tue, 26 Nov 2024 22:04:49 GMT
Title: Passive Deepfake Detection Across Multi-modalities: A Comprehensive Survey
Authors: Hong-Hanh Nguyen-Le, Van-Tuan Tran, Dinh-Thuc Nguyen, Nhien-An Le-Khac,
Abstract summary: Deepfakes (DFs) have been utilized for malicious purposes, such as individual impersonation, misinformation spreading, and artists' style imitation.<n>This survey explores passive approaches across multiple modalities, including image, video, audio, and multi-modal domains.
Score: 1.7811840395202345
License: http://creativecommons.org/licenses/by-nc-sa/4.0/
Abstract: In recent years, deepfakes (DFs) have been utilized for malicious purposes, such as individual impersonation, misinformation spreading, and artists' style imitation, raising questions about ethical and security concerns. However, existing surveys have focused on accuracy performance of passive DF detection approaches for single modalities, such as image, video or audio. This comprehensive survey explores passive approaches across multiple modalities, including image, video, audio, and multi-modal domains, and extend our discussion beyond detection accuracy, including generalization, robustness, attribution, and interpretability. Additionally, we discuss threat models for passive approaches, including potential adversarial strategies and different levels of adversary knowledge and capabilities. We also highlights current challenges in DF detection, including the lack of generalization across different generative models, the need for comprehensive trustworthiness evaluation, and the limitations of existing multi-modal approaches. Finally, we propose future research directions that address these unexplored and emerging issues in the field of passive DF detection, such as adaptive learning, dynamic benchmark, holistic trustworthiness evaluation, and multi-modal detectors for talking-face video generation.

Related papers

Deep Learning Advancements in Anomaly Detection: A Comprehensive Survey [43.75849983150303]
As datasets become more complex, traditional anomaly detection methods struggle to capture intricate patterns. Deep learning has made AD methods more powerful and adaptable, improving their ability to handle high-dimensional and unstructured data. This review bridges gaps in existing literature and serves as a valuable resource for researchers and practitioners seeking to enhance AD techniques using deep learning.
arXiv Detail & Related papers (2025-03-17T14:04:48Z)
Survey on AI-Generated Media Detection: From Non-MLLM to MLLM [51.91311158085973]
Methods for detecting AI-generated media have evolved rapidly. General-purpose detectors based on MLLMs integrate authenticity verification, explainability, and localization capabilities. Ethical and security considerations have emerged as critical global concerns.
arXiv Detail & Related papers (2025-02-07T12:18:20Z)
Unsupervised Model Diagnosis [49.36194740479798]
This paper proposes Unsupervised Model Diagnosis (UMO) to produce semantic counterfactual explanations without any user guidance. Our approach identifies and visualizes changes in semantics, and then matches these changes to attributes from wide-ranging text sources.
arXiv Detail & Related papers (2024-10-08T17:59:03Z)
Underwater Object Detection in the Era of Artificial Intelligence: Current, Challenge, and Future [119.88454942558485]
Underwater object detection (UOD) aims to identify and localise objects in underwater images or videos. In recent years, artificial intelligence (AI) based methods, especially deep learning methods, have shown promising performance in UOD.
arXiv Detail & Related papers (2024-10-08T00:25:33Z)
FFAA: Multimodal Large Language Model based Explainable Open-World Face Forgery Analysis Assistant [59.2438504610849]
We introduce FFAA: Face Forgery Analysis Assistant, consisting of a fine-tuned Multimodal Large Language Model (MLLM) and Multi-answer Intelligent Decision System (MIDS) Our method not only provides user-friendly and explainable results but also significantly boosts accuracy and robustness compared to previous methods.
arXiv Detail & Related papers (2024-08-19T15:15:20Z)
Multimodal Prompt Learning with Missing Modalities for Sentiment Analysis and Emotion Recognition [52.522244807811894]
We propose a novel multimodal Transformer framework using prompt learning to address the issue of missing modalities. Our method introduces three types of prompts: generative prompts, missing-signal prompts, and missing-type prompts. Through prompt learning, we achieve a substantial reduction in the number of trainable parameters.
arXiv Detail & Related papers (2024-07-07T13:55:56Z)
Evolving from Single-modal to Multi-modal Facial Deepfake Detection: Progress and Challenges [40.11614155244292]
This survey traces the evolution of deepfake detection from early single-modal methods to sophisticated multi-modal approaches. We present a structured taxonomy of detection techniques and analyze the transition from GAN-based to diffusion model-driven deepfakes.
arXiv Detail & Related papers (2024-06-11T05:48:04Z)
A Comprehensive Library for Benchmarking Multi-class Visual Anomaly Detection [52.228708947607636]
This paper introduces a comprehensive visual anomaly detection benchmark, ADer, which is a modular framework for new methods. The benchmark includes multiple datasets from industrial and medical domains, implementing fifteen state-of-the-art methods and nine comprehensive metrics. We objectively reveal the strengths and weaknesses of different methods and provide insights into the challenges and future directions of multi-class visual anomaly detection.
arXiv Detail & Related papers (2024-06-05T13:40:07Z)
Video Anomaly Detection in 10 Years: A Survey and Outlook [10.143205531474907]
Video anomaly detection (VAD) holds immense importance across diverse domains such as surveillance, healthcare, and environmental monitoring. This survey explores deep learning-based VAD, expanding beyond traditional supervised training paradigms to encompass emerging weakly supervised, self-supervised, and unsupervised approaches.
arXiv Detail & Related papers (2024-05-29T17:56:31Z)
A Timely Survey on Vision Transformer for Deepfake Detection [11.410817278428533]
Vision Transformer (ViT)-based approaches showcase superior performance in generality and efficiency. This survey aims to equip researchers with a nuanced understanding of ViT's pivotal role in deepfake detection.
arXiv Detail & Related papers (2024-05-14T09:33:04Z)
Audio Anti-Spoofing Detection: A Survey [7.3348524333159]
Deep learning has given rise to sophisticated algorithms capable of manipulating or creating multimedia fake content, known as Deepfake. Audio anti-spoofing detection challenges have been organized to foster the development of anti-spoofing countermeasures. This survey paper presents a comprehensive review of every component within the detection pipeline, including algorithm architectures, optimization techniques, application generalizability, evaluation metrics, performance comparisons, available datasets, and open-source availability.
arXiv Detail & Related papers (2024-04-22T06:52:12Z)
Deepfake Generation and Detection: A Benchmark and Survey [134.19054491600832]
Deepfake is a technology dedicated to creating highly realistic facial images and videos under specific conditions. This survey comprehensively reviews the latest developments in deepfake generation and detection. We focus on researching four representative deepfake fields: face swapping, face reenactment, talking face generation, and facial attribute editing.
arXiv Detail & Related papers (2024-03-26T17:12:34Z)
CrossDF: Improving Cross-Domain Deepfake Detection with Deep Information Decomposition [53.860796916196634]
We propose a Deep Information Decomposition (DID) framework to enhance the performance of Cross-dataset Deepfake Detection (CrossDF) Unlike most existing deepfake detection methods, our framework prioritizes high-level semantic features over specific visual artifacts. It adaptively decomposes facial features into deepfake-related and irrelevant information, only using the intrinsic deepfake-related information for real/fake discrimination.
arXiv Detail & Related papers (2023-09-30T12:30:25Z)
A Comprehensive Study on the Robustness of Image Classification and Object Detection in Remote Sensing: Surveying and Benchmarking [17.012502610423006]
Deep neural networks (DNNs) have found widespread applications in interpreting remote sensing (RS) imagery. It has been demonstrated in previous works that DNNs are vulnerable to different types of noises, particularly adversarial noises. This study represents the first comprehensive examination of both natural robustness and adversarial robustness in RS tasks.
arXiv Detail & Related papers (2023-06-21T08:52:35Z)
Survey of Network Intrusion Detection Methods from the Perspective of the Knowledge Discovery in Databases Process [63.75363908696257]
We review the methods that have been applied to network data with the purpose of developing an intrusion detector. We discuss the techniques used for the capture, preparation and transformation of the data, as well as, the data mining and evaluation methods. As a result of this literature review, we investigate some open issues which will need to be considered for further research in the area of network security.
arXiv Detail & Related papers (2020-01-27T11:21:05Z)

This list is automatically generated from the titles and abstracts of the papers in this site.