A Novel Local Focusing Mechanism for Deepfake Detection Generalization
- URL: http://arxiv.org/abs/2508.17029v1
- Date: Sat, 23 Aug 2025 14:06:30 GMT
- Title: A Novel Local Focusing Mechanism for Deepfake Detection Generalization
- Authors: Mingliang Li, Lin Yuanbo Wu, Changhong Liu, Hanxi Li,
- Abstract summary: Deepfake generation techniques have intensified the need for robust and generalizable detection methods.<n>We propose a novel Local Focus Mechanism (LFM) that explicitly attends to discriminative local features for differentiating fake from real images.<n>LFM achieves a 3.7 improvement in accuracy and a 2.8 increase in average precision over the state-of-the-art Neighboring Pixel Relationships (NPR) method.
- Score: 10.223643897131192
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The rapid advancement of deepfake generation techniques has intensified the need for robust and generalizable detection methods. Existing approaches based on reconstruction learning typically leverage deep convolutional networks to extract differential features. However, these methods show poor generalization across object categories (e.g., from faces to cars) and generation domains (e.g., from GANs to Stable Diffusion), due to intrinsic limitations of deep CNNs. First, models trained on a specific category tend to overfit to semantic feature distributions, making them less transferable to other categories, especially as network depth increases. Second, Global Average Pooling (GAP) compresses critical local forgery cues into a single vector, thus discarding discriminative patterns vital for real-fake classification. To address these issues, we propose a novel Local Focus Mechanism (LFM) that explicitly attends to discriminative local features for differentiating fake from real images. LFM integrates a Salience Network (SNet) with a task-specific Top-K Pooling (TKP) module to select the K most informative local patterns. To mitigate potential overfitting introduced by Top-K pooling, we introduce two regularization techniques: Rank-Based Linear Dropout (RBLD) and Random-K Sampling (RKS), which enhance the model's robustness. LFM achieves a 3.7 improvement in accuracy and a 2.8 increase in average precision over the state-of-the-art Neighboring Pixel Relationships (NPR) method, while maintaining exceptional efficiency at 1789 FPS on a single NVIDIA A6000 GPU. Our approach sets a new benchmark for cross-domain deepfake detection. The source code are available in https://github.com/lmlpy/LFM.git
Related papers
- Deep Homography Estimation for Visual Place Recognition [49.235432979736395]
We propose a transformer-based deep homography estimation (DHE) network.
It takes the dense feature map extracted by a backbone network as input and fits homography for fast and learnable geometric verification.
Experiments on benchmark datasets show that our method can outperform several state-of-the-art methods.
arXiv Detail & Related papers (2024-02-25T13:22:17Z) - LAA-Net: Localized Artifact Attention Network for Quality-Agnostic and Generalizable Deepfake Detection [12.567069964305265]
This paper introduces a novel approach for high-quality deepfake detection called Localized Artifact Attention Network (LAA-Net)
Existing methods for high-quality deepfake detection are mainly based on a supervised binary classifier coupled with an implicit attention mechanism.
Experiments performed on several benchmarks show the superiority of our approach in terms of Area Under the Curve (AUC) and Average Precision (AP)
arXiv Detail & Related papers (2024-01-24T23:42:08Z) - Activate and Reject: Towards Safe Domain Generalization under Category
Shift [71.95548187205736]
We study a practical problem of Domain Generalization under Category Shift (DGCS)
It aims to simultaneously detect unknown-class samples and classify known-class samples in the target domains.
Compared to prior DG works, we face two new challenges: 1) how to learn the concept of unknown'' during training with only source known-class samples, and 2) how to adapt the source-trained model to unseen environments.
arXiv Detail & Related papers (2023-10-07T07:53:12Z) - DETR Doesn't Need Multi-Scale or Locality Design [69.56292005230185]
This paper presents an improved DETR detector that maintains a "plain" nature.
It uses a single-scale feature map and global cross-attention calculations without specific locality constraints.
We show that two simple technologies are surprisingly effective within a plain design to compensate for the lack of multi-scale feature maps and locality constraints.
arXiv Detail & Related papers (2023-08-03T17:59:04Z) - ALOFT: A Lightweight MLP-like Architecture with Dynamic Low-frequency
Transform for Domain Generalization [15.057335610188545]
Domain Domain (DG) aims to learn a model that generalizes well to unseen target domains utilizing multiple source domains without re-training.
Most existing DG works are based on convolutional neural networks (CNNs)
arXiv Detail & Related papers (2023-03-21T08:36:34Z) - Divide and Contrast: Source-free Domain Adaptation via Adaptive
Contrastive Learning [122.62311703151215]
Divide and Contrast (DaC) aims to connect the good ends of both worlds while bypassing their limitations.
DaC divides the target data into source-like and target-specific samples, where either group of samples is treated with tailored goals.
We further align the source-like domain with the target-specific samples using a memory bank-based Maximum Mean Discrepancy (MMD) loss to reduce the distribution mismatch.
arXiv Detail & Related papers (2022-11-12T09:21:49Z) - Denoised Non-Local Neural Network for Semantic Segmentation [18.84185406522064]
We propose a Denoised Non-Local Network (Denoised NL) to eliminate the inter-class and intra-class noises respectively.
Our proposed NL can achieve the state-of-the-art performance of 83.5% and 46.69% mIoU on Cityscapes and ADE20K, respectively.
arXiv Detail & Related papers (2021-10-27T06:16:31Z) - Markov Localisation using Heatmap Regression and Deep Convolutional
Odometry [59.33322623437816]
We present a novel CNN-based localisation approach that can leverage modern deep learning hardware.
We create a hybrid CNN that can perform image-based localisation and odometry-based likelihood propagation within a single neural network.
arXiv Detail & Related papers (2021-06-01T10:28:49Z) - Focus on Local: Detecting Lane Marker from Bottom Up via Key Point [10.617793053931964]
We propose a novel lane marker detection solution, FOLOLane, that focuses on modeling local patterns and achieving prediction of global structures.
Specifically, the CNN models lowcomplexity local patterns with two separate heads, the first one predicts the existence of key points, and the second refines the location of key points in the local range and correlates key points of the same lane line.
arXiv Detail & Related papers (2021-05-28T08:59:14Z) - Generalized Focal Loss: Learning Qualified and Distributed Bounding
Boxes for Dense Object Detection [85.53263670166304]
One-stage detector basically formulates object detection as dense classification and localization.
Recent trend for one-stage detectors is to introduce an individual prediction branch to estimate the quality of localization.
This paper delves into the representations of the above three fundamental elements: quality estimation, classification and localization.
arXiv Detail & Related papers (2020-06-08T07:24:33Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.