Related papers: WMamba: Wavelet-based Mamba for Face Forgery Detection

WMamba: Wavelet-based Mamba for Face Forgery Detection

URL: http://arxiv.org/abs/2501.09617v1
Date: Thu, 16 Jan 2025 15:44:24 GMT
Title: WMamba: Wavelet-based Mamba for Face Forgery Detection
Authors: Siran Peng, Tianshuo Zhang, Li Gao, Xiangyu Zhu, Haoyuan Zhang, Kai Pang, Zhen Lei,
Abstract summary: Wavelet analysis can uncover subtle forgery artifacts that remain imperceptible in the spatial domain.<n>We introduce WMamba, a novel wavelet-based feature extractor built upon the Mamba architecture.<n>We show that WMamba achieves state-of-the-art (SOTA) performance, highlighting its effectiveness and superiority in face forgery detection.
Score: 34.216401304665816
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: With the rapid advancement of deepfake generation technologies, the demand for robust and accurate face forgery detection algorithms has become increasingly critical. Recent studies have demonstrated that wavelet analysis can uncover subtle forgery artifacts that remain imperceptible in the spatial domain. Wavelets effectively capture important facial contours, which are often slender, fine-grained, and global in nature. However, existing wavelet-based approaches fail to fully leverage these unique characteristics, resulting in sub-optimal feature extraction and limited generalizability. To address this challenge, we introduce WMamba, a novel wavelet-based feature extractor built upon the Mamba architecture. WMamba maximizes the utility of wavelet information through two key innovations. First, we propose Dynamic Contour Convolution (DCConv), which employs specially crafted deformable kernels to adaptively model slender facial contours. Second, by leveraging the Mamba architecture, our method captures long-range spatial relationships with linear computational complexity. This efficiency allows for the extraction of fine-grained, global forgery artifacts from small image patches. Extensive experimental results show that WMamba achieves state-of-the-art (SOTA) performance, highlighting its effectiveness and superiority in face forgery detection.

Related papers

Towards Scalable and Deep Graph Neural Networks via Noise Masking [59.058558158296265]
Graph Neural Networks (GNNs) have achieved remarkable success in many graph mining tasks. scaling them to large graphs is challenging due to the high computational and storage costs. We present random walk with noise masking (RMask), a plug-and-play module compatible with the existing model-simplification works.
arXiv Detail & Related papers (2024-12-19T07:48:14Z)
Understanding and Improving Training-Free AI-Generated Image Detections with Vision Foundation Models [68.90917438865078]
Deepfake techniques for facial synthesis and editing pose serious risks for generative models.<n>In this paper, we investigate how detection performance varies across model backbones, types, and datasets.<n>We introduce Contrastive Blur, which enhances performance on facial images, and MINDER, which addresses noise type bias, balancing performance across domains.
arXiv Detail & Related papers (2024-11-28T13:04:45Z)
DiMSUM: Diffusion Mamba -- A Scalable and Unified Spatial-Frequency Method for Image Generation [4.391439322050918]
We introduce a novel state-space architecture for diffusion models. We harness spatial and frequency information to enhance the inductive bias towards local features in input images.
arXiv Detail & Related papers (2024-11-06T18:59:17Z)
Wavelet-Driven Generalizable Framework for Deepfake Face Forgery Detection [0.0]
Wavelet-CLIP is a deepfake detection framework that integrates wavelet transforms with features derived from the ViT-L/14 architecture, pre-trained in the CLIP fashion.<n>Our method showcases outstanding performance, achieving an average AUC of 0.749 for cross-data generalization and 0.893 for robustness against unseen deepfakes.
arXiv Detail & Related papers (2024-09-26T21:16:51Z)
MambaVT: Spatio-Temporal Contextual Modeling for robust RGB-T Tracking [51.28485682954006]
We propose a pure Mamba-based framework (MambaVT) to fully exploit intrinsic-temporal contextual modeling for robust visible-thermal tracking. Specifically, we devise the long-range cross-frame integration component to globally adapt to target appearance variations. Experiments show the significant potential of vision Mamba for RGB-T tracking, with MambaVT achieving state-of-the-art performance on four mainstream benchmarks.
arXiv Detail & Related papers (2024-08-15T02:29:00Z)
WaveMamba: Spatial-Spectral Wavelet Mamba for Hyperspectral Image Classification [3.5302264121619094]
This paper introduces WaveMamba, a novel approach that integrates wavelet transformation with the spatial-spectral Mamba architecture to enhance HSI classification. WaveMamba surpasses existing models, achieving an accuracy improvement of 4.5% on the University of Houston dataset and a 2.0% increase on the Pavia University dataset.
arXiv Detail & Related papers (2024-08-02T12:44:07Z)
Cross-Scan Mamba with Masked Training for Robust Spectral Imaging [51.557804095896174]
We propose the Cross-Scanning Mamba, named CS-Mamba, that employs a Spatial-Spectral SSM for global-local balanced context encoding.<n>Experiment results show that our CS-Mamba achieves state-of-the-art performance and the masked training method can better reconstruct smooth features to improve the visual quality.
arXiv Detail & Related papers (2024-08-01T15:14:10Z)
HSIMamba: Hyperpsectral Imaging Efficient Feature Learning with Bidirectional State Space for Classification [16.742768644585684]
HSIMamba is a novel framework that uses bidirectional reversed convolutional neural network pathways to extract spectral features more efficiently. Our approach combines the operational efficiency of CNNs with the dynamic feature extraction capability of attention mechanisms found in Transformers. This approach improves classification accuracy beyond current benchmarks and addresses computational inefficiencies encountered with advanced models like Transformers.
arXiv Detail & Related papers (2024-03-30T07:27:36Z)
GenFace: A Large-Scale Fine-Grained Face Forgery Benchmark and Cross Appearance-Edge Learning [50.7702397913573]
The rapid advancement of photorealistic generators has reached a critical juncture where the discrepancy between authentic and manipulated images is increasingly indistinguishable. Although there have been a number of publicly available face forgery datasets, the forgery faces are mostly generated using GAN-based synthesis technology. We propose a large-scale, diverse, and fine-grained high-fidelity dataset, namely GenFace, to facilitate the advancement of deepfake detection.
arXiv Detail & Related papers (2024-02-03T03:13:50Z)
Wavelet Prior Attention Learning in Axial Inpainting Network [35.06912946192495]
We propose a novel model -- Wavelet prior attention learning in Axial Inpainting Network (WAIN) The WPA guides the high-level feature aggregation in the multi-scale frequency domain, alleviating the textual artifacts. Stacked ATs employ unmasked clues to help model reasonable features along with low-level features of horizontal and vertical axes.
arXiv Detail & Related papers (2022-06-07T08:45:27Z)
Generalizing Face Forgery Detection with High-frequency Features [63.33397573649408]
Current CNN-based detectors tend to overfit to method-specific color textures and thus fail to generalize. We propose to utilize the high-frequency noises for face forgery detection. The first is the multi-scale high-frequency feature extraction module that extracts high-frequency noises at multiple scales. The second is the residual-guided spatial attention module that guides the low-level RGB feature extractor to concentrate more on forgery traces from a new perspective.
arXiv Detail & Related papers (2021-03-23T08:19:21Z)

This list is automatically generated from the titles and abstracts of the papers in this site.