ForensicFlow: A Tri-Modal Adaptive Network for Robust Deepfake Detection
- URL: http://arxiv.org/abs/2511.14554v1
- Date: Tue, 18 Nov 2025 14:56:34 GMT
- Title: ForensicFlow: A Tri-Modal Adaptive Network for Robust Deepfake Detection
- Authors: Mohammad Romani
- Abstract summary: We introduce ForensicFlow, a tri-modal forensic framework that fuses RGB, texture, and frequency evidence for video Deepfake detection. Trained on Celeb-DF (v2) with Focal Loss, ForensicFlow achieves AUC 0.9752, F1-Score 0.9408, and accuracy 0.9208, outperforming single-stream baselines.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Deepfakes generated by advanced GANs and autoencoders severely threaten information integrity and societal stability. Single-stream CNNs fail to capture multi-scale forgery artifacts across spatial, texture, and frequency domains, limiting robustness and generalization. We introduce ForensicFlow, a tri-modal forensic framework that synergistically fuses RGB, texture, and frequency evidence for video Deepfake detection. The RGB branch (ConvNeXt-tiny) extracts global visual inconsistencies; the texture branch (Swin Transformer-tiny) detects fine-grained blending artifacts; the frequency branch (CNN + SE) identifies periodic spectral noise. Attention-based temporal pooling dynamically prioritizes high-evidence frames, while adaptive attention fusion balances branch contributions. Trained on Celeb-DF (v2) with Focal Loss, ForensicFlow achieves AUC 0.9752, F1-Score 0.9408, and accuracy 0.9208, outperforming single-stream baselines. Ablation validates branch synergy; Grad-CAM confirms forensic focus. This comprehensive feature fusion provides superior resilience against subtle forgeries.
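As a rough illustration of two ingredients named in the abstract, the sketch below shows softmax-weighted temporal pooling over per-frame features and a binary focal loss in NumPy. It is not the authors' implementation: the scoring vector `w`, the feature shapes, and the `gamma`/`alpha` defaults are assumptions, since the abstract does not specify them.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    shifted = x - np.max(x, axis=axis, keepdims=True)
    e = np.exp(shifted)
    return e / np.sum(e, axis=axis, keepdims=True)


def attention_temporal_pool(frame_feats, w):
    """Pool per-frame features into one clip-level vector.

    frame_feats: array of shape (T, D), one feature vector per frame.
    w: array of shape (D,), a hypothetical learned scoring vector.
    Returns the pooled (D,) vector and the (T,) attention weights.
    """
    scores = frame_feats @ w        # one scalar score per frame
    weights = softmax(scores)       # non-negative, sums to 1 over frames
    pooled = weights @ frame_feats  # attention-weighted average of frames
    return pooled, weights


def binary_focal_loss(p, y, gamma=2.0, alpha=0.25, eps=1e-7):
    """Mean binary focal loss.

    p: predicted probability of the positive class, shape (N,).
    y: labels in {0, 1}, shape (N,).
    gamma, alpha: assumed defaults, not values reported in the abstract.
    """
    p = np.clip(p, eps, 1.0 - eps)
    p_t = np.where(y == 1, p, 1.0 - p)              # probability of the true class
    alpha_t = np.where(y == 1, alpha, 1.0 - alpha)  # class-balancing weight
    # (1 - p_t) ** gamma down-weights examples that are already well classified.
    return float(np.mean(-alpha_t * (1.0 - p_t) ** gamma * np.log(p_t)))
```

With `w` set to zeros the pooling reduces to a plain frame average; a learned `w` would shift weight toward frames that score higher. Setting `gamma=0` turns the focal loss into an alpha-weighted binary cross-entropy.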
Related papers
- Deepfake Forensics Adapter: A Dual-Stream Network for Generalizable Deepfake Detection [22.889849855283355]
Deepfake Forensics Adapter (DFA) is a novel dual-stream framework that synergizes vision-language foundation models with targeted forensics analysis. Our approach integrates a pre-trained CLIP model with three core components to achieve specialized deepfake detection. Our framework not only demonstrates state-of-the-art performance, but also points out a feasible and effective direction for developing a robust deepfake detection system.
arXiv Detail & Related papers (2026-03-02T04:58:00Z)
- Enhanced Portable Ultra Low-Field Diffusion Tensor Imaging with Bayesian Artifact Correction and Deep Learning-Based Super-Resolution [3.95277369791128]
Portable, ultra-low-field (ULF) magnetic resonance imaging has the potential to expand access to neuroimaging. However, it currently suffers from coarse spatial and angular resolutions and low signal-to-noise ratios. We introduce a nine-direction, single-shell ULF DTI sequence, as well as a companion Bayesian bias field correction algorithm.
arXiv Detail & Related papers (2026-02-11T23:50:48Z)
- Multi-modal Deepfake Detection and Localization with FPN-Transformer [21.022230340898556]
We introduce a multi-modal deepfake detection and localization framework based on a Feature Pyramid-Transformer (FPN-Transformer). A multi-scale feature pyramid is constructed through R-TLM blocks with localized attention mechanisms, enabling joint analysis of cross-context temporal dependencies. We evaluate our approach on the test set of the IJCAI'25 DDL-AV benchmark, showing a good performance with a final score of 0.7535.
arXiv Detail & Related papers (2025-11-11T09:33:39Z)
- A Hybrid Deep Learning and Forensic Approach for Robust Deepfake Detection [0.0]
Existing deepfake detection methods either rely on deep learning, which suffers from poor generalization and vulnerability to distortions, or forensic analysis, which is interpretable but limited against new manipulation techniques. This study proposes a hybrid framework that fuses forensic features, including noise residuals, JPEG compression traces, and frequency-domain descriptors, with deep learning representations from CNNs and vision transformers.
arXiv Detail & Related papers (2025-10-31T11:32:52Z)
- HyperFake: Hyperspectral Reconstruction and Attention-Guided Analysis for Advanced Deepfake Detection [2.198430261120653]
Deepfakes pose a significant threat to digital media security. Current detection methods struggle to generalize across different manipulation techniques. We introduce HyperFake, a novel deepfake detection pipeline that reconstructs 31-channel hyperspectral data from standard RGB videos.
arXiv Detail & Related papers (2025-05-24T08:28:55Z)
- Faster Than Lies: Real-time Deepfake Detection using Binary Neural Networks [0.0]
Deepfake detection aims to counter the spread of deep-generated media that undermines trust in online content.
We introduce a novel deepfake detection approach on images using Binary Neural Networks (BNNs) for fast inference with minimal accuracy loss.
arXiv Detail & Related papers (2024-06-07T13:37:36Z)
- Rethinking the Up-Sampling Operations in CNN-based Generative Network for Generalizable Deepfake Detection [86.97062579515833]
We introduce the concept of Neighboring Pixel Relationships (NPR) as a means to capture and characterize the generalized structural artifacts stemming from up-sampling operations.
A comprehensive analysis is conducted on an open-world dataset comprising samples generated by 28 distinct generative models.
This analysis establishes a new state-of-the-art performance, with an 11.6% improvement over existing methods.
arXiv Detail & Related papers (2023-12-16T14:27:06Z)
- Learning Heavily-Degraded Prior for Underwater Object Detection [59.5084433933765]
This paper seeks transferable prior knowledge from detector-friendly images.
It is based on the statistical observation that the heavily degraded regions of detector-friendly images (DFUI) and underwater images have evident feature distribution gaps.
Our method, with higher speed and fewer parameters, still performs better than transformer-based detectors.
arXiv Detail & Related papers (2023-08-24T12:32:46Z)
- CDDFuse: Correlation-Driven Dual-Branch Feature Decomposition for Multi-Modality Image Fusion [138.40422469153145]
We propose a novel Correlation-Driven feature Decomposition Fusion (CDDFuse) network.
We show that CDDFuse achieves promising results in multiple fusion tasks, including infrared-visible image fusion and medical image fusion.
arXiv Detail & Related papers (2022-11-26T02:40:28Z)
- Spatial-Temporal Frequency Forgery Clue for Video Forgery Detection in VIS and NIR Scenario [87.72258480670627]
Existing frequency-domain face forgery detection methods find that GAN-forged images show obvious grid-like visual artifacts in the frequency spectrum compared with real images.
This paper proposes a Cosine Transform-based Forgery Clue Augmentation Network (FCAN-DCT) to achieve a more comprehensive spatial-temporal feature representation.
arXiv Detail & Related papers (2022-07-05T09:27:53Z)
- Beyond the Spectrum: Detecting Deepfakes via Re-Synthesis [69.09526348527203]
Deep generative models have led to highly realistic media, known as deepfakes, that are commonly indistinguishable from real to human eyes.
We propose a novel fake detection method that re-synthesizes testing images and extracts visual cues for detection.
We demonstrate the improved effectiveness, cross-GAN generalization, and robustness against perturbations of our approach in a variety of detection scenarios.
arXiv Detail & Related papers (2021-05-29T21:22:24Z)
- Frame-rate Up-conversion Detection Based on Convolutional Neural Network for Learning Spatiotemporal Features [7.895528973776606]
This paper proposes a frame-rate conversion detection network (FCDNet) that learns forensic features caused by FRUC in an end-to-end fashion.
FCDNet takes a stack of consecutive frames as input and learns artifact features through its network blocks.
arXiv Detail & Related papers (2021-03-25T08:47:46Z)
- Generalizing Face Forgery Detection with High-frequency Features [63.33397573649408]
Current CNN-based detectors tend to overfit to method-specific color textures and thus fail to generalize.
We propose to utilize high-frequency noise for face forgery detection.
One component is a multi-scale high-frequency feature extraction module that extracts high-frequency noise at multiple scales.
Another is a residual-guided spatial attention module that guides the low-level RGB feature extractor to concentrate more on forgery traces from a new perspective.
arXiv Detail & Related papers (2021-03-23T08:19:21Z)
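Several entries in this list rely on frequency-domain cues such as spectral artifacts and high-frequency noise. The following NumPy sketch shows one simple way to inspect such cues: a log-magnitude spectrum and the share of spectral energy above a radial cutoff. The function names and the `cutoff` value are hypothetical and are not taken from any of the listed papers.

```python
import numpy as np


def log_magnitude_spectrum(gray):
    """Centered log-magnitude 2D spectrum of a grayscale image (H, W)."""
    spectrum = np.fft.fftshift(np.fft.fft2(gray))
    return np.log1p(np.abs(spectrum))


def high_freq_energy_ratio(gray, cutoff=0.25):
    """Fraction of spectral power at radial frequency above `cutoff`.

    Frequencies are in cycles per pixel, so the per-axis maximum is 0.5.
    `cutoff` is an illustrative threshold, not a tuned value.
    """
    h, w = gray.shape
    power = np.abs(np.fft.fftshift(np.fft.fft2(gray))) ** 2
    # Frequency coordinates shifted the same way as the power spectrum.
    fy = np.fft.fftshift(np.fft.fftfreq(h))[:, None]
    fx = np.fft.fftshift(np.fft.fftfreq(w))[None, :]
    radius = np.sqrt(fx ** 2 + fy ** 2)
    total = power.sum()
    if total == 0:
        return 0.0
    return float(power[radius > cutoff].sum() / total)
```

A constant image puts all of its energy at zero frequency, so its ratio is near 0, while a pixel-level checkerboard concentrates energy at the highest representable frequency and gives a ratio near 1. Periodic artifacts would show up between these extremes as isolated peaks in the log-magnitude spectrum.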