Related papers: Toward Generalized Detection of Synthetic Media: Limitations, Challenges, and the Path to Multimodal Solutions

URL: http://arxiv.org/abs/2511.11116v1
Date: Fri, 14 Nov 2025 09:44:44 GMT
Title: Toward Generalized Detection of Synthetic Media: Limitations, Challenges, and the Path to Multimodal Solutions
Authors: Redwan Hussain, Mizanur Rahman, Prithwiraj Bhattacharjee
Abstract summary: This study reviews twenty-four recent works on AI-generated media detection. It concludes that multimodal deep learning models have the potential to provide more robust and generalized detection.
Score: 5.8251644521379164
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Artificial intelligence (AI) in media has advanced rapidly over the last decade. The introduction of Generative Adversarial Networks (GANs) improved the quality of photorealistic image generation. Diffusion models later brought a new era of generative media. These advances made it difficult to separate real and synthetic content. The rise of deepfakes demonstrated how these tools could be misused to spread misinformation and political conspiracies, violate privacy, and commit fraud. For this reason, many detection models have been developed. They often use deep learning methods such as Convolutional Neural Networks (CNNs) and Vision Transformers (ViTs). These models search for visual, spatial, or temporal anomalies. However, such approaches often fail to generalize to unseen data and struggle with content from different generative models. In addition, existing approaches are ineffective on multimodal data and highly modified content. This study reviews twenty-four recent works on AI-generated media detection. Each study was examined individually to identify its contributions and weaknesses. The review then summarizes the common limitations and key challenges faced by current approaches. Based on this analysis, a research direction is suggested with a focus on multimodal deep learning models. Such models have the potential to provide more robust and generalized detection. The review offers future researchers a clear starting point for building stronger defenses against harmful synthetic media.

Related papers

Towards Unified Multimodal Misinformation Detection in Social Media: A Benchmark Dataset and Baseline [56.790045049514326]
Two major forms of deception dominate: human-crafted misinformation and AI-generated content. We propose Unified Multimodal Fake Content Detection (UMFDet), a framework designed to handle both forms of deception. UMFDet achieves robust and consistent performance across both misinformation types, outperforming specialized baselines.
arXiv Detail & Related papers (2025-09-30T09:26:32Z)
Seeing What Matters: Generalizable AI-generated Video Detection with Forensic-Oriented Augmentation [31.737159092430108]
We study different generative architectures, searching for and identifying discriminative features that are unbiased, robust to impairments, and shared across models. We introduce a novel data augmentation strategy that is based on wavelet decomposition and replaces specific frequency-related bands to drive the model to exploit more relevant forensic cues. Our method achieves a significant accuracy improvement over state-of-the-art detectors and obtains excellent results even on very recent generative models.
arXiv Detail & Related papers (2025-06-20T07:36:59Z)
BusterX: MLLM-Powered AI-Generated Video Forgery Detection and Explanation [77.55074597806035]
GenBuster-200K is a large-scale, high-quality AI-generated video dataset featuring 200K high-resolution video clips. BusterX is a novel AI-generated video detection and explanation framework leveraging a multimodal large language model (MLLM) and reinforcement learning.
arXiv Detail & Related papers (2025-05-19T02:06:43Z)
Methods and Trends in Detecting AI-Generated Images: A Comprehensive Review [0.17188280334580194]
Generative Adversarial Networks (GANs), Diffusion Models, and Variational Autoencoders (VAEs) have enabled the synthesis of high-quality multimedia data. These advancements have also raised significant concerns regarding adversarial attacks, unethical usage, and societal harm. This survey provides a comprehensive review of state-of-the-art techniques for detecting and classifying synthetic images generated by advanced generative AI models.
arXiv Detail & Related papers (2025-02-21T03:16:18Z)
Adaptive Meta-Learning for Robust Deepfake Detection: A Multi-Agent Framework to Data Drift and Model Generalization [6.589206192038365]
This paper proposes an adversarial meta-learning algorithm using task-specific adaptive sample synthesis and consistency regularization. It boosts both the robustness and the generalization of the model. Experimental results demonstrate the model's consistent performance across various datasets, outperforming the compared models.
arXiv Detail & Related papers (2024-11-12T19:55:07Z)
Evolving from Single-modal to Multi-modal Facial Deepfake Detection: Progress and Challenges [40.11614155244292]
This survey traces the evolution of deepfake detection from early single-modal methods to sophisticated multi-modal approaches. We present a structured taxonomy of detection techniques and analyze the transition from GAN-based to diffusion model-driven deepfakes.
arXiv Detail & Related papers (2024-06-11T05:48:04Z)
Deepfake Generation and Detection: A Benchmark and Survey [134.19054491600832]
Deepfake is a technology dedicated to creating highly realistic facial images and videos under specific conditions. This survey comprehensively reviews the latest developments in deepfake generation and detection. We focus on researching four representative deepfake fields: face swapping, face reenactment, talking face generation, and facial attribute editing.
arXiv Detail & Related papers (2024-03-26T17:12:34Z)
On the Challenges and Opportunities in Generative AI [155.030542942979]
We argue that current large-scale generative AI models exhibit several fundamental shortcomings that hinder their widespread adoption across domains. We aim to provide researchers with insights for exploring fruitful research directions, thus fostering the development of more robust and accessible generative AI solutions.
arXiv Detail & Related papers (2024-02-28T15:19:33Z)
Beyond the Spectrum: Detecting Deepfakes via Re-Synthesis [69.09526348527203]
Deep generative models have led to highly realistic media, known as deepfakes, that are commonly indistinguishable from real media to human eyes. We propose a novel fake detection approach that is designed to re-synthesize testing images and extract visual cues for detection. We demonstrate the improved effectiveness, cross-GAN generalization, and robustness against perturbations of our approach in a variety of detection scenarios.
arXiv Detail & Related papers (2021-05-29T21:22:24Z)
Artificial Fingerprinting for Generative Models: Rooting Deepfake Attribution in Training Data [64.65952078807086]
Photorealistic image generation has reached a new level of quality due to the breakthroughs of generative adversarial networks (GANs). Yet, the dark side of such deepfakes, the malicious use of generated media, raises concerns about visual misinformation. We seek a proactive and sustainable solution for deepfake detection by introducing artificial fingerprints into the models.
arXiv Detail & Related papers (2020-07-16T16:49:55Z)

This list is automatically generated from the titles and abstracts of the papers on this site.