MC-LCR: Multi-modal contrastive classification by locally correlated
representations for effective face forgery detection
- URL: http://arxiv.org/abs/2110.03290v1
- Date: Thu, 7 Oct 2021 09:24:12 GMT
- Title: MC-LCR: Multi-modal contrastive classification by locally correlated
representations for effective face forgery detection
- Authors: Gaojian Wang, Qian Jiang, Xin Jin, Wei Li and Xiaohui Cui
- Abstract summary: We propose a novel framework named Multi-modal Contrastive Classification by Locally Correlated Representations.
Our MC-LCR aims to amplify implicit local discrepancies between authentic and forged faces from both spatial and frequency domains.
We achieve state-of-the-art performance and demonstrate the robustness and generalization of our method.
- Score: 11.124150983521158
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: As the remarkable development of facial manipulation technologies is
accompanied by severe security concerns, face forgery detection has become a
recent research hotspot. Most existing detection methods train a binary
classifier under global supervision to judge real or fake. However, advanced
manipulations only perform small-scale tampering, which makes it challenging to
comprehensively capture subtle and local forgery artifacts, especially in high
compression settings and cross-dataset scenarios. To address such limitations,
we propose a novel framework named Multi-modal Contrastive Classification by
Locally Correlated Representations (MC-LCR), for effective face forgery
detection. Instead of specific appearance features, our MC-LCR aims to amplify
implicit local discrepancies between authentic and forged faces from both
spatial and frequency domains. Specifically, we design the shallow style
representation block that measures the pairwise correlation of shallow feature
maps, which encodes local style information to extract more discriminative
features in the spatial domain. Moreover, we make a key observation that subtle
forgery artifacts can be further exposed in the patch-wise phase and amplitude
spectrum and exhibit different clues. According to the complementarity of
amplitude and phase information, we develop a patch-wise amplitude and phase
dual attention module to capture locally correlated inconsistencies with each
other in the frequency domain. Besides the above two modules, we further
introduce the collaboration of supervised contrastive loss with cross-entropy
loss. It helps the network learn more discriminative and generalized
representations. Through extensive experiments and comprehensive studies, we
achieve state-of-the-art performance and demonstrate the robustness and
generalization of our method.
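The abstract names three ingredients: pairwise correlation of shallow feature maps, patch-wise amplitude and phase spectra, and a supervised contrastive loss combined with cross-entropy. The following is a minimal NumPy sketch of those ideas and is not the authors' implementation. All function names, the patch size, the temperature, and the weight `lam` are assumptions made for illustration. The attention module and the network itself are not reproduced here.

```python
import numpy as np


def channel_correlation(features):
    """Pairwise (Gram-style) correlation between the C channels of a (C, H, W) map."""
    c, h, w = features.shape
    flat = features.reshape(c, h * w)
    return flat @ flat.T / (h * w)  # (C, C), symmetric


def patchwise_amplitude_phase(image, patch):
    """Split an (H, W) image into non-overlapping patches and FFT each patch."""
    h, w = image.shape
    rows, cols = h // patch, w // patch
    # Result has shape (rows, cols, patch, patch): one 2-D patch per grid cell.
    patches = (image[:rows * patch, :cols * patch]
               .reshape(rows, patch, cols, patch)
               .transpose(0, 2, 1, 3))
    spectrum = np.fft.fft2(patches, axes=(-2, -1))
    return np.abs(spectrum), np.angle(spectrum)


def cross_entropy(logits, labels):
    """Mean cross-entropy of (N, K) logits against integer labels (N,)."""
    shifted = logits - logits.max(axis=1, keepdims=True)
    log_prob = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    return -log_prob[np.arange(len(labels)), labels].mean()


def supervised_contrastive_loss(embeddings, labels, temperature=0.1):
    """Supervised contrastive loss over L2-normalised embeddings (N, D)."""
    z = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    sim = z @ z.T / temperature
    n = len(labels)
    not_self = ~np.eye(n, dtype=bool)
    # Log-softmax over all *other* samples in the batch (self-pairs masked out).
    sim = sim - sim.max(axis=1, keepdims=True)
    denom = (np.exp(sim) * not_self).sum(axis=1, keepdims=True)
    log_prob = sim - np.log(denom)
    # Positives are other samples that share the anchor's label.
    positives = (labels[:, None] == labels[None, :]) & not_self
    counts = positives.sum(axis=1)
    valid = counts > 0  # skip anchors that have no positive in the batch
    per_anchor = -(log_prob * positives).sum(axis=1)[valid] / counts[valid]
    return per_anchor.mean()


# Toy inputs standing in for shallow features, a grayscale face crop, and a batch.
rng = np.random.default_rng(0)
feats = rng.normal(size=(8, 16, 16))
image = rng.normal(size=(32, 32))
emb = rng.normal(size=(6, 4))
logits = rng.normal(size=(6, 2))
labels = np.array([0, 0, 0, 1, 1, 1])  # e.g. 0 = real, 1 = fake

gram = channel_correlation(feats)
amp, phase = patchwise_amplitude_phase(image, patch=8)
lam = 0.5  # hypothetical weight between the two loss terms
supcon = supervised_contrastive_loss(emb, labels)
total = cross_entropy(logits, labels) + lam * supcon
```

Amplitude and phase together carry the full patch content: `amp * np.exp(1j * phase)` inverts back to the original patch. This is one way to see why the two spectra can be treated as complementary inputs.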
Related papers
- Mixture-of-Noises Enhanced Forgery-Aware Predictor for Multi-Face Manipulation Detection and Localization [52.87635234206178]
This paper proposes a new framework, namely MoNFAP, specifically tailored for multi-face manipulation detection and localization.
The framework incorporates two novel modules: the Forgery-aware Unified Predictor (FUP) Module and the Mixture-of-Noises Module (MNM).
arXiv Detail & Related papers (2024-08-05T08:35:59Z)
- COMICS: End-to-end Bi-grained Contrastive Learning for Multi-face Forgery Detection [56.7599217711363]
Most face forgery recognition methods can only process one face at a time.
We propose COMICS, an end-to-end framework for multi-face forgery detection.
arXiv Detail & Related papers (2023-08-03T03:37:13Z)
- Attention Consistency Refined Masked Frequency Forgery Representation for Generalizing Face Forgery Detection [96.539862328788]
Existing forgery detection methods suffer from unsatisfactory generalization ability to determine the authenticity in the unseen domain.
We propose a novel Attention Consistency Refined masked frequency forgery representation model (ACMF) for generalizing face forgery detection.
Experiment results on several public face forgery datasets demonstrate the superior performance of the proposed method compared with the state-of-the-art methods.
arXiv Detail & Related papers (2023-07-21T08:58:49Z)
- Cross-Domain Local Characteristic Enhanced Deepfake Video Detection [18.430287055542315]
Deepfake detection has attracted increasing attention due to security concerns.
Many detectors cannot achieve accurate results when detecting unseen manipulations.
We propose a novel pipeline, Cross-Domain Local Forensics, for more general deepfake video detection.
arXiv Detail & Related papers (2022-11-07T07:44:09Z)
- Dual Contrastive Learning for General Face Forgery Detection [64.41970626226221]
We propose a novel face forgery detection framework, named Dual Contrastive Learning (DCL), which constructs positive and negative paired data.
To explore the essential discrepancies, Intra-Instance Contrastive Learning (Intra-ICL) is introduced to focus on the local content inconsistencies prevalent in the forged faces.
arXiv Detail & Related papers (2021-12-27T05:44:40Z)
- Learnable Multi-level Frequency Decomposition and Hierarchical Attention Mechanism for Generalized Face Presentation Attack Detection [7.324459578044212]
Face presentation attack detection (PAD) is attracting a lot of attention and playing a key role in securing face recognition systems.
We propose a dual-stream convolutional neural network (CNN) framework to deal with unseen scenarios.
We validate the design of our proposed PAD solution in a step-wise ablation study.
arXiv Detail & Related papers (2021-09-16T13:06:43Z)
- Local Relation Learning for Face Forgery Detection [73.73130683091154]
We propose a novel perspective of face forgery detection via local relation learning.
Specifically, we propose a Multi-scale Patch Similarity Module (MPSM), which measures the similarity between features of local regions.
We also propose an RGB-Frequency Attention Module (RFAM) to fuse information in both RGB and frequency domains for more comprehensive local feature representation.
arXiv Detail & Related papers (2021-05-06T10:44:32Z)
- Spatial-Phase Shallow Learning: Rethinking Face Forgery Detection in Frequency Domain [88.7339322596758]
We present a novel Spatial-Phase Shallow Learning (SPSL) method, which combines spatial image and phase spectrum to capture the up-sampling artifacts of face forgery.
SPSL achieves state-of-the-art performance in cross-dataset evaluation and multi-class classification, and obtains comparable results in single-dataset evaluation.
arXiv Detail & Related papers (2021-03-02T16:45:08Z)
This list is automatically generated from the titles and abstracts of the papers in this site.