Generalizable Deepfake Detection via Effective Local-Global Feature Extraction
- URL: http://arxiv.org/abs/2501.15253v1
- Date: Sat, 25 Jan 2025 15:53:57 GMT
- Title: Generalizable Deepfake Detection via Effective Local-Global Feature Extraction
- Authors: Jiazhen Yan, Ziqiang Li, Ziwen He, Zhangjie Fu,
- Abstract summary: GANs and diffusion models have led to the generation of increasingly realistic fake images.
Deepfake detection has become a pressing issue in today's world.
We propose a novel method that effectively combines local and global features.
- Score: 5.221473306027505
- License:
- Abstract: The rapid advancement of GANs and diffusion models has led to the generation of increasingly realistic fake images, posing significant hidden dangers and threats to society. Consequently, deepfake detection has become a pressing issue in today's world. While some existing methods focus on forgery features from either a local or global perspective, they often overlook the complementary nature of these features. Other approaches attempt to incorporate both local and global features but rely on simplistic strategies, such as cropping, which fail to capture the intricate relationships between local features. To address these limitations, we propose a novel method that effectively combines local spatial-frequency domain features with global frequency domain information, capturing detailed and holistic forgery traces. Specifically, our method uses Discrete Wavelet Transform (DWT) and sliding windows to tile forged features and leverages attention mechanisms to extract local spatial-frequency domain information. Simultaneously, the phase component of the Fast Fourier Transform (FFT) is integrated with attention mechanisms to extract global frequency domain information, complementing the local features and ensuring the integrity of forgery detection. Comprehensive evaluations on open-world datasets generated by 34 distinct generative models demonstrate a significant improvement of 2.9% over existing state-of-the-art methods.
Related papers
- Object Style Diffusion for Generalized Object Detection in Urban Scene [69.04189353993907]
We introduce a novel single-domain object detection generalization method, named GoDiff.
By integrating pseudo-target domain data with source domain data, we diversify the training dataset.
Experimental results demonstrate that our method not only enhances the generalization ability of existing detectors but also functions as a plug-and-play enhancement for other single-domain generalization methods.
arXiv Detail & Related papers (2024-12-18T13:03:00Z) - SuperGF: Unifying Local and Global Features for Visual Localization [13.869227429939423]
SuperGF is a transformer-based aggregation model that operates directly on image-matching-specific local features.
We provide implementations of SuperGF using various types of local features, including dense and sparse learning-based or hand-crafted descriptors.
arXiv Detail & Related papers (2022-12-23T13:48:07Z) - Cross-Domain Local Characteristic Enhanced Deepfake Video Detection [18.430287055542315]
Deepfake detection has attracted increasing attention due to security concerns.
Many detectors cannot achieve accurate results when detecting unseen manipulations.
We propose a novel pipeline, Cross-Domain Local Forensics, for more general deepfake video detection.
arXiv Detail & Related papers (2022-11-07T07:44:09Z) - Adaptive Local-Component-aware Graph Convolutional Network for One-shot
Skeleton-based Action Recognition [54.23513799338309]
We present an Adaptive Local-Component-aware Graph Convolutional Network for skeleton-based action recognition.
Our method provides a stronger representation than the global embedding and helps our model reach state-of-the-art.
arXiv Detail & Related papers (2022-09-21T02:33:07Z) - Delving into Sequential Patches for Deepfake Detection [64.19468088546743]
Recent advances in face forgery techniques produce nearly untraceable deepfake videos, which could be leveraged with malicious intentions.
Previous studies has identified the importance of local low-level cues and temporal information in pursuit to generalize well across deepfake methods.
We propose the Local- & Temporal-aware Transformer-based Deepfake Detection framework, which adopts a local-to-global learning protocol.
arXiv Detail & Related papers (2022-07-06T16:46:30Z) - Federated and Generalized Person Re-identification through Domain and
Feature Hallucinating [88.77196261300699]
We study the problem of federated domain generalization (FedDG) for person re-identification (re-ID)
We propose a novel method, called "Domain and Feature Hallucinating (DFH)", to produce diverse features for learning generalized local and global models.
Our method achieves the state-of-the-art performance for FedDG on four large-scale re-ID benchmarks.
arXiv Detail & Related papers (2022-03-05T09:15:13Z) - An Entropy-guided Reinforced Partial Convolutional Network for Zero-Shot
Learning [77.72330187258498]
We propose a novel Entropy-guided Reinforced Partial Convolutional Network (ERPCNet)
ERPCNet extracts and aggregates localities based on semantic relevance and visual correlations without human-annotated regions.
It not only discovers global-cooperative localities dynamically but also converges faster for policy gradient optimization.
arXiv Detail & Related papers (2021-11-03T11:13:13Z) - Local Relation Learning for Face Forgery Detection [73.73130683091154]
We propose a novel perspective of face forgery detection via local relation learning.
Specifically, we propose a Multi-scale Patch Similarity Module (MPSM), which measures the similarity between features of local regions.
We also propose an RGB-Frequency Attention Module (RFAM) to fuse information in both RGB and frequency domains for more comprehensive local feature representation.
arXiv Detail & Related papers (2021-05-06T10:44:32Z) - Video Salient Object Detection via Adaptive Local-Global Refinement [7.723369608197167]
Video salient object detection (VSOD) is an important task in many vision applications.
We propose an adaptive local-global refinement framework for VSOD.
We show that our weighting methodology can further exploit the feature correlations, thus driving the network to learn more discriminative feature representation.
arXiv Detail & Related papers (2021-04-29T14:14:11Z) - Gait Recognition via Effective Global-Local Feature Representation and
Local Temporal Aggregation [28.721376937882958]
Gait recognition is one of the most important biometric technologies and has been applied in many fields.
Recent gait recognition frameworks represent each gait frame by descriptors extracted from either global appearances or local regions of humans.
We propose a novel feature extraction and fusion framework to achieve discriminative feature representations for gait recognition.
arXiv Detail & Related papers (2020-11-03T04:07:13Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.