Related papers: ForensicFlow: A Tri-Modal Adaptive Network for Robust Deepfake Detection

URL: http://arxiv.org/abs/2511.14554v1
Date: Tue, 18 Nov 2025 14:56:34 GMT
Title: ForensicFlow: A Tri-Modal Adaptive Network for Robust Deepfake Detection
Authors: Mohammad Romani
Abstract summary: We introduce ForensicFlow, a tri-modal forensic framework that fuses RGB, texture, and frequency evidence for video Deepfake detection. Trained on Celeb-DF (v2) with Focal Loss, ForensicFlow achieves AUC 0.9752, F1-Score 0.9408, and accuracy 0.9208, outperforming single-stream baselines.
Score: 0.0
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Deepfakes generated by advanced GANs and autoencoders severely threaten information integrity and societal stability. Single-stream CNNs fail to capture multi-scale forgery artifacts across spatial, texture, and frequency domains, limiting robustness and generalization. We introduce ForensicFlow, a tri-modal forensic framework that synergistically fuses RGB, texture, and frequency evidence for video Deepfake detection. The RGB branch (ConvNeXt-tiny) extracts global visual inconsistencies; the texture branch (Swin Transformer-tiny) detects fine-grained blending artifacts; the frequency branch (CNN + SE) identifies periodic spectral noise. Attention-based temporal pooling dynamically prioritizes high-evidence frames, while adaptive attention fusion balances branch contributions. Trained on Celeb-DF (v2) with Focal Loss, ForensicFlow achieves AUC 0.9752, F1-Score 0.9408, and accuracy 0.9208, outperforming single-stream baselines. Ablation validates branch synergy; Grad-CAM confirms forensic focus. This comprehensive feature fusion provides superior resilience against subtle forgeries.
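Illustrative sketch (not the authors' code): the abstract names attention-based pooling over frames, attention-based fusion over branches, and Focal Loss, but gives no implementation details. The pure-Python example below shows one common way such pieces are built. The dot-product scoring vector, the function names, and the `gamma` and `alpha` defaults (taken from the original focal loss formulation, not from ForensicFlow) are all assumptions made for illustration.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention_pool(features, scoring_vector):
    """Attention-weighted average of feature vectors.

    `features` is a list of equal-length vectors (one per frame, or one
    per branch). Each vector is scored by a dot product with
    `scoring_vector`, a hypothetical stand-in for a learned scoring
    layer; the softmax of the scores gives the averaging weights.
    """
    scores = [sum(f * w for f, w in zip(vec, scoring_vector)) for vec in features]
    weights = softmax(scores)
    dim = len(features[0])
    pooled = [sum(w * vec[d] for w, vec in zip(weights, features)) for d in range(dim)]
    return pooled, weights

def focal_loss(p, y, gamma=2.0, alpha=0.25, eps=1e-12):
    """Binary focal loss for a single prediction.

    `p` is the predicted probability of the positive class and `y` is
    the label (0 or 1). The (1 - p_t) ** gamma factor down-weights
    examples the model already classifies confidently.
    """
    p_t = p if y == 1 else 1.0 - p
    p_t = min(max(p_t, eps), 1.0)  # clamp to avoid log(0)
    alpha_t = alpha if y == 1 else 1.0 - alpha
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)

# Toy usage: three frame vectors; the scoring vector favours the first dimension.
frames = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
video_vector, frame_weights = attention_pool(frames, [1.0, 0.0])
```

The same `attention_pool` function can be applied twice under these assumptions: first over frames within each branch, then over the three per-branch vectors to fuse them. In a trained model the scoring vector would be learned rather than fixed.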

Related papers

Deepfake Forensics Adapter: A Dual-Stream Network for Generalizable Deepfake Detection [22.889849855283355]
Deepfake Forensics Adapter (DFA) is a novel dual-stream framework that synergizes vision-language foundation models with targeted forensics analysis. Our approach integrates a pre-trained CLIP model with three core components to achieve specialized deepfake detection. Our framework not only demonstrates state-of-the-art performance, but also points out a feasible and effective direction for developing a robust deepfake detection system.
arXiv Detail & Related papers (2026-03-02T04:58:00Z)
Enhanced Portable Ultra Low-Field Diffusion Tensor Imaging with Bayesian Artifact Correction and Deep Learning-Based Super-Resolution [3.95277369791128]
Portable, ultra-low-field (ULF) magnetic resonance imaging has the potential to expand access to neuroimaging. It currently suffers from coarse spatial and angular resolutions and low signal-to-noise ratios. We introduce a nine-direction, single-shell ULF DTI sequence, as well as a companion Bayesian bias field correction algorithm.
arXiv Detail & Related papers (2026-02-11T23:50:48Z)
Multi-modal Deepfake Detection and Localization with FPN-Transformer [21.022230340898556]
We introduce a multi-modal deepfake detection and localization framework based on a Feature Pyramid-Transformer (FPN-Transformer). A multi-scale feature pyramid is constructed through R-TLM blocks with localized attention mechanisms, enabling joint analysis of cross-context temporal dependencies. We evaluate our approach on the test set of the IJCAI'25 DDL-AV benchmark, showing good performance with a final score of 0.7535.
arXiv Detail & Related papers (2025-11-11T09:33:39Z)
A Hybrid Deep Learning and Forensic Approach for Robust Deepfake Detection [0.0]
Existing deepfake detection methods either rely on deep learning, which suffers from poor generalization and vulnerability to distortions, or forensic analysis, which is interpretable but limited against new manipulation techniques. This study proposes a hybrid framework that fuses forensic features, including noise residuals, JPEG compression traces, and frequency-domain descriptors, with deep learning representations from CNNs and vision transformers.
arXiv Detail & Related papers (2025-10-31T11:32:52Z)
HyperFake: Hyperspectral Reconstruction and Attention-Guided Analysis for Advanced Deepfake Detection [2.198430261120653]
Deepfakes pose a significant threat to digital media security. Current detection methods struggle to generalize across different manipulation techniques. We introduce HyperFake, a novel deepfake detection pipeline that reconstructs 31-channel hyperspectral data from standard RGB videos.
arXiv Detail & Related papers (2025-05-24T08:28:55Z)
Faster Than Lies: Real-time Deepfake Detection using Binary Neural Networks [0.0]
Deepfake detection aims to counter the spread of deep-generated media that undermines trust in online content. We introduce a novel deepfake detection approach on images using Binary Neural Networks (BNNs) for fast inference with minimal accuracy loss.
arXiv Detail & Related papers (2024-06-07T13:37:36Z)
Rethinking the Up-Sampling Operations in CNN-based Generative Network for Generalizable Deepfake Detection [86.97062579515833]
We introduce the concept of Neighboring Pixel Relationships (NPR) as a means to capture and characterize the generalized structural artifacts stemming from up-sampling operations. A comprehensive analysis is conducted on an open-world dataset, comprising samples generated by 28 distinct generative models. This analysis culminates in the establishment of a novel state-of-the-art performance, showcasing a remarkable 11.6% improvement over existing methods.
arXiv Detail & Related papers (2023-12-16T14:27:06Z)
Learning Heavily-Degraded Prior for Underwater Object Detection [59.5084433933765]
This paper seeks transferable prior knowledge from detector-friendly images. It is based on the statistical observation that the heavily degraded regions of detector-friendly (DFUI) and underwater images have evident feature distribution gaps. Our method, with higher speed and fewer parameters, still performs better than transformer-based detectors.
arXiv Detail & Related papers (2023-08-24T12:32:46Z)
CDDFuse: Correlation-Driven Dual-Branch Feature Decomposition for Multi-Modality Image Fusion [138.40422469153145]
We propose a novel Correlation-Driven feature Decomposition Fusion (CDDFuse) network. We show that CDDFuse achieves promising results in multiple fusion tasks, including infrared-visible image fusion and medical image fusion.
arXiv Detail & Related papers (2022-11-26T02:40:28Z)
Spatial-Temporal Frequency Forgery Clue for Video Forgery Detection in VIS and NIR Scenario [87.72258480670627]
Existing frequency-domain face forgery detection methods find that GAN-forged images show obvious grid-like visual artifacts in the frequency spectrum compared to real images. This paper proposes a Cosine Transform-based Forgery Clue Augmentation Network (FCAN-DCT) to achieve a more comprehensive spatial-temporal feature representation.
arXiv Detail & Related papers (2022-07-05T09:27:53Z)
Beyond the Spectrum: Detecting Deepfakes via Re-Synthesis [69.09526348527203]
Deep generative models have led to highly realistic media, known as deepfakes, that are often indistinguishable from real media to the human eye. We propose a novel fake detection method that re-synthesizes testing images and extracts visual cues for detection. We demonstrate the improved effectiveness, cross-GAN generalization, and robustness against perturbations of our approach in a variety of detection scenarios.
arXiv Detail & Related papers (2021-05-29T21:22:24Z)
Frame-rate Up-conversion Detection Based on Convolutional Neural Network for Learning Spatiotemporal Features [7.895528973776606]
This paper proposes a frame-rate conversion detection network (FCDNet) that learns forensic features caused by FRUC in an end-to-end fashion. FCDNet uses a stack of consecutive frames as the input and effectively learns artifacts using network blocks to learn features.
arXiv Detail & Related papers (2021-03-25T08:47:46Z)
Generalizing Face Forgery Detection with High-frequency Features [63.33397573649408]
Current CNN-based detectors tend to overfit to method-specific color textures and thus fail to generalize. We propose to utilize the high-frequency noises for face forgery detection. The first is the multi-scale high-frequency feature extraction module that extracts high-frequency noises at multiple scales. The second is the residual-guided spatial attention module that guides the low-level RGB feature extractor to concentrate more on forgery traces from a new perspective.
arXiv Detail & Related papers (2021-03-23T08:19:21Z)

This list is automatically generated from the titles and abstracts of the papers on this site.