Multi-Scale Wavelet Transformer for Face Forgery Detection
- URL: http://arxiv.org/abs/2210.03899v1
- Date: Sat, 8 Oct 2022 03:39:36 GMT
- Title: Multi-Scale Wavelet Transformer for Face Forgery Detection
- Authors: Jie Liu, Jingjing Wang, Peng Zhang, Chunmao Wang, Di Xie, Shiliang Pu
- Abstract summary: We propose a multi-scale wavelet transformer framework for face forgery detection.
Frequency-based spatial attention is designed to guide the spatial feature extractor to concentrate more on forgery traces.
Cross-modality attention is proposed to fuse the frequency features with the spatial features.
- Score: 43.33712402517951
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Currently, many face forgery detection methods aggregate spatial and
frequency features to enhance the generalization ability and gain promising
performance under the cross-dataset scenario. However, these methods only
leverage one level of frequency information, which limits their expressive ability.
To overcome these limitations, we propose a multi-scale wavelet transformer
framework for face forgery detection. Specifically, to take full advantage of
the multi-scale and multi-frequency wavelet representation, we gradually
aggregate the multi-scale wavelet representation at different stages of the
backbone network. To better fuse the frequency feature with the spatial
features, frequency-based spatial attention is designed to guide the spatial
feature extractor to concentrate more on forgery traces. Meanwhile,
cross-modality attention is proposed to fuse the frequency features with the
spatial features. These two attention modules are calculated through a unified
transformer block for efficiency. A wide variety of experiments demonstrate
that the proposed method is efficient and effective in both within-dataset and
cross-dataset settings.
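As a rough illustration of the multi-scale, multi-frequency wavelet representation the abstract refers to, the sketch below computes a multi-level 2D Haar decomposition with NumPy. The Haar wavelet, the function names `haar_dwt2` and `multiscale_haar`, and the default of three levels are assumptions made for illustration; the abstract does not say which wavelet or how many levels the authors use, and this is not their implementation.

```python
import numpy as np


def haar_dwt2(x):
    """One level of an orthonormal 2D Haar transform.

    Assumes x is a 2D array with even height and width (illustrative helper,
    not taken from the paper).
    """
    a = x[0::2, 0::2]  # top-left sample of each 2x2 block
    b = x[0::2, 1::2]  # top-right
    c = x[1::2, 0::2]  # bottom-left
    d = x[1::2, 1::2]  # bottom-right
    ll = (a + b + c + d) / 2.0  # low-pass along both axes (approximation)
    lh = (a + b - c - d) / 2.0  # low-pass across columns, high-pass across rows
    hl = (a - b + c - d) / 2.0  # high-pass across columns, low-pass across rows
    hh = (a - b - c + d) / 2.0  # high-pass along both axes
    return ll, (lh, hl, hh)


def multiscale_haar(x, levels=3):
    """Apply haar_dwt2 repeatedly to the approximation band.

    Returns the coarsest approximation and a list of detail-band triples,
    ordered from the finest scale to the coarsest.
    """
    current = np.asarray(x, dtype=np.float64)
    details = []
    for _ in range(levels):
        current, bands = haar_dwt2(current)
        details.append(bands)
    return current, details


if __name__ == "__main__":
    image = np.random.default_rng(0).random((64, 64))
    approx, details = multiscale_haar(image, levels=3)
    print("approximation:", approx.shape)
    for level, (lh, hl, hh) in enumerate(details, start=1):
        print("level", level, "detail bands:", lh.shape, hl.shape, hh.shape)
```

Each level halves the spatial resolution, so the detail bands at successive levels could in principle be matched to backbone stages of the same resolution, which is one way to read the "gradually aggregate" description above.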
Related papers
- Multiple Contexts and Frequencies Aggregation Network for Deepfake Detection [5.65128683992597]
Deepfake detection faces increasing challenges since the fast growth of generative models in developing massive and diverse Deepfake technologies.
Recent advances rely on introducing features from spatial or frequency domains rather than modeling general forgery features within backbones.
We propose an efficient network for face forgery detection named MkfaNet, which consists of two core modules.
arXiv Detail & Related papers (2024-08-03T05:34:53Z)
- Wavelet-based Bi-dimensional Aggregation Network for SAR Image Change Detection [53.842568573251214]
Experimental results on three SAR datasets demonstrate that our WBANet significantly outperforms contemporary state-of-the-art methods.
Our WBANet achieves a percentage of correct classification (PCC) of 98.33%, 96.65%, and 96.62% on the respective datasets.
arXiv Detail & Related papers (2024-07-18T04:36:10Z)
- Frequency-Aware Deepfake Detection: Improving Generalizability through Frequency Space Learning [81.98675881423131]
This research addresses the challenge of developing a universal deepfake detector that can effectively identify unseen deepfake images.
Existing frequency-based paradigms have relied on frequency-level artifacts introduced during the up-sampling in GAN pipelines to detect forgeries.
We introduce a novel frequency-aware approach called FreqNet, centered around frequency domain learning, specifically designed to enhance the generalizability of deepfake detectors.
arXiv Detail & Related papers (2024-03-12T01:28:00Z)
- Learning Spatial-Frequency Transformer for Visual Object Tracking [15.750739748843744]
Recent trackers adopt the Transformer to combine or replace the widely used ResNet as their new backbone network.
We believe these operations ignore the spatial prior of the target object which may lead to sub-optimal results.
We propose a unified Spatial-Frequency Transformer that models the Gaussian spatial Prior and High-frequency emphasis Attention (GPHA) simultaneously.
arXiv Detail & Related papers (2022-08-18T13:46:12Z)
- Adaptive Frequency Learning in Two-branch Face Forgery Detection [66.91715092251258]
We propose to Adaptively learn Frequency information in a two-branch Detection framework, dubbed AFD.
We liberate our network from the fixed frequency transforms, and achieve better performance with our data- and task-dependent transform layers.
arXiv Detail & Related papers (2022-03-27T14:25:52Z)
- Generalizing Face Forgery Detection with High-frequency Features [63.33397573649408]
Current CNN-based detectors tend to overfit to method-specific color textures and thus fail to generalize.
We propose to utilize the high-frequency noises for face forgery detection.
The first is the multi-scale high-frequency feature extraction module that extracts high-frequency noises at multiple scales.
The second is the residual-guided spatial attention module that guides the low-level RGB feature extractor to concentrate more on forgery traces from a new perspective.
arXiv Detail & Related papers (2021-03-23T08:19:21Z)
- Spatial-Phase Shallow Learning: Rethinking Face Forgery Detection in Frequency Domain [88.7339322596758]
We present a novel Spatial-Phase Shallow Learning (SPSL) method, which combines spatial image and phase spectrum to capture the up-sampling artifacts of face forgery.
SPSL can achieve the state-of-the-art performance on cross-datasets evaluation as well as multi-class classification and obtain comparable results on single dataset evaluation.
arXiv Detail & Related papers (2021-03-02T16:45:08Z)