Discrepancy-Guided Reconstruction Learning for Image Forgery Detection
- URL: http://arxiv.org/abs/2304.13349v2
- Date: Wed, 3 May 2023 12:50:10 GMT
- Title: Discrepancy-Guided Reconstruction Learning for Image Forgery Detection
- Authors: Zenan Shi, Haipeng Chen, Long Chen and Dong Zhang
- Abstract summary: We first propose a Discrepancy-Guided Encoder (DisGE) to extract forgery-sensitive visual patterns.
We then introduce a Double-Head Reconstruction (DouHR) module to enhance genuine compact visual patterns in different granular spaces.
Under DouHR, we further introduce a Discrepancy-Aggregation Detector (DisAD) to aggregate these genuine compact visual patterns.
- Score: 10.221066530624373
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this paper, we propose a novel image forgery detection paradigm for
boosting the model learning capacity on both forgery-sensitive and genuine
compact visual patterns. Compared to existing methods that focus only on
discrepancy-specific patterns (e.g., noises, textures, and frequencies), our
method generalizes better. Specifically, we first propose a
Discrepancy-Guided Encoder (DisGE) to extract forgery-sensitive visual
patterns. DisGE consists of two branches, where the mainstream backbone branch
is used to extract general semantic features, and the accessorial discrepant
external attention branch is used to extract explicit forgery cues. Besides, a
Double-Head Reconstruction (DouHR) module is proposed to enhance genuine
compact visual patterns in different granular spaces. Under DouHR, we further
introduce a Discrepancy-Aggregation Detector (DisAD) to aggregate these genuine
compact visual patterns, such that the forgery detection capability on unknown
patterns can be improved. Extensive experimental results on four challenging
datasets validate the effectiveness of our proposed method against
state-of-the-art competitors.
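As a rough illustration of the paradigm, below is a minimal PyTorch sketch of how the three described components might be wired together. Every module internal here is an assumption: the abstract specifies only the role of each component (DisGE as a two-branch encoder, DouHR as two reconstruction heads in different granular spaces, DisAD as an aggregating detector), not its design.

```python
# Minimal sketch; all internals are assumptions based on the stated roles.
import torch
import torch.nn as nn

class DisGE(nn.Module):
    """Discrepancy-Guided Encoder: a backbone branch for general semantics
    plus an accessorial external-attention branch for explicit forgery cues."""
    def __init__(self, dim=256):
        super().__init__()
        self.backbone = nn.Sequential(      # stand-in for a real backbone
            nn.Conv2d(3, dim, 7, stride=4, padding=3), nn.ReLU(),
            nn.Conv2d(dim, dim, 3, stride=2, padding=1), nn.ReLU())
        self.ext_attn = nn.Conv2d(3, dim, 7, stride=8, padding=3)  # placeholder cue branch

    def forward(self, x):
        sem = self.backbone(x)
        cues = torch.sigmoid(self.ext_attn(x))
        return sem + sem * cues             # cue-modulated fusion (assumed)

class DouHR(nn.Module):
    """Double-Head Reconstruction: two heads enhance genuine patterns in
    different granular spaces (here: fine and coarse)."""
    def __init__(self, dim=256):
        super().__init__()
        self.fine = nn.Conv2d(dim, dim, 1)
        self.coarse = nn.Sequential(nn.AvgPool2d(2), nn.Conv2d(dim, dim, 1))

    def forward(self, f):
        return self.fine(f), self.coarse(f)

class DisAD(nn.Module):
    """Discrepancy-Aggregation Detector: pools and fuses the two granular
    pattern maps into a single real/fake logit."""
    def __init__(self, dim=256):
        super().__init__()
        self.cls = nn.Linear(2 * dim, 1)

    def forward(self, fine, coarse):
        v = torch.cat([fine.mean(dim=(2, 3)), coarse.mean(dim=(2, 3))], dim=1)
        return self.cls(v)

x = torch.randn(2, 3, 256, 256)
enc, rec, det = DisGE(), DouHR(), DisAD()
logit = det(*rec(enc(x)))                   # shape: (2, 1)
```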
Related papers
- Dual-Image Enhanced CLIP for Zero-Shot Anomaly Detection [58.228940066769596]
We introduce a Dual-Image Enhanced CLIP approach, leveraging a joint vision-language scoring system.
Our method processes pairs of images, using each as a visual reference for the other, thereby enriching the inference process with visual context.
Our approach exploits the potential of joint vision-language anomaly detection and performs comparably to current SOTA methods across various datasets.
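A minimal sketch of the dual-image idea using the Hugging Face CLIP API: each image in a pair is scored against text prompts and against the other image as a visual reference. The prompts and the fusion of the two scores below are assumptions, not the paper's exact rule.

```python
# Hedged sketch of dual-image scoring with CLIP (Hugging Face transformers).
import torch
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")
prompts = ["a photo of a flawless object", "a photo of a damaged object"]  # assumed prompts

def dual_image_anomaly_score(img_a: Image.Image, img_b: Image.Image) -> float:
    """Score img_a for anomaly, using img_b as its visual reference."""
    inputs = processor(text=prompts, images=[img_a, img_b],
                       return_tensors="pt", padding=True)
    with torch.no_grad():
        out = model(**inputs)
    img = out.image_embeds / out.image_embeds.norm(dim=-1, keepdim=True)
    txt = out.text_embeds / out.text_embeds.norm(dim=-1, keepdim=True)
    text_score = (img @ txt.T).softmax(dim=-1)[0, 1]   # P("damaged") for img_a
    visual_gap = 1.0 - (img[0] @ img[1])               # dissimilarity to the reference
    return (text_score + visual_gap).item() / 2        # assumed fusion rule
```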
arXiv Detail & Related papers (2024-05-08T03:13:20Z)
- Forgery-aware Adaptive Transformer for Generalizable Synthetic Image Detection [106.39544368711427]
We study the problem of generalizable synthetic image detection, aiming to detect forged images produced by diverse generative methods.
We present a novel forgery-aware adaptive transformer approach, namely FatFormer.
Tuned on 4-class ProGAN data, our approach attains an average accuracy of 98% on unseen GANs and, surprisingly, generalizes to unseen diffusion models with 95% accuracy.
arXiv Detail & Related papers (2023-12-27T17:36:32Z)
- Produce Once, Utilize Twice for Anomaly Detection [6.501323305130114]
We derive POUTA, which improves both accuracy and efficiency by reusing the discriminative information already present in the reconstructive network.
POUTA achieves better performance than the state-of-the-art few-shot anomaly detection methods without any special design.
arXiv Detail & Related papers (2023-12-20T10:49:49Z)
- Towards General Visual-Linguistic Face Forgery Detection [95.73987327101143]
Deepfakes are realistic face manipulations that can pose serious threats to security, privacy, and trust.
Existing methods mostly treat this task as binary classification, using digital labels or mask signals to train the detection model.
We propose a novel paradigm named Visual-Linguistic Face Forgery Detection (VLFFD), which uses fine-grained sentence-level prompts as annotations.
arXiv Detail & Related papers (2023-07-31T10:22:33Z)
- SeeABLE: Soft Discrepancies and Bounded Contrastive Learning for Exposing Deepfakes [7.553507857251396]
We propose a novel deepfake detector, called SeeABLE, that formalizes the detection problem as a (one-class) out-of-distribution detection task.
SeeABLE pushes perturbed faces towards predefined prototypes using a novel regression-based bounded contrastive loss.
We show that our model convincingly outperforms competing state-of-the-art detectors, while exhibiting highly encouraging generalization capabilities.
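The summary suggests a regression-to-prototype objective with a bound; a generic sketch of such a loss follows. SeeABLE's actual bounded contrastive loss is defined in the paper, so treat this only as an illustration of the idea.

```python
# Illustrative sketch: regress perturbed-face embeddings toward predefined
# prototypes while bounding the penalty (not SeeABLE's exact loss).
import torch
import torch.nn.functional as F

def bounded_prototype_loss(z, proto_idx, prototypes, bound=2.0):
    """z: (B, D) embeddings of locally perturbed faces.
    proto_idx: (B,) index of the prototype assigned to each perturbation.
    prototypes: (K, D) predefined, fixed prototype vectors."""
    z = F.normalize(z, dim=-1)
    p = F.normalize(prototypes[proto_idx], dim=-1)
    dist = (z - p).pow(2).sum(dim=-1)    # regression toward the prototype
    return dist.clamp(max=bound).mean()  # bound keeps outliers from dominating

# toy usage
z = torch.randn(8, 128, requires_grad=True)
prototypes = torch.randn(4, 128)
loss = bounded_prototype_loss(z, torch.randint(0, 4, (8,)), prototypes)
loss.backward()
```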
arXiv Detail & Related papers (2022-11-21T09:38:30Z)
- Towards Effective Image Manipulation Detection with Proposal Contrastive Learning [61.5469708038966]
We propose Proposal Contrastive Learning (PCL) for effective image manipulation detection.
Our PCL uses a two-stream architecture that extracts two types of global features from the RGB and noise views, respectively.
Our PCL can be easily adapted to unlabeled data in practice, which can reduce manual labeling costs and promote more generalizable features.
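A minimal sketch of the two-stream setup: a noise view produced by a fixed high-pass filter, paired with the RGB view under an InfoNCE-style loss. PCL's actual filters, proposal-level sampling, and loss details are in the paper; the kernel and encoders below are placeholders.

```python
# Hedged sketch of an RGB stream plus a high-pass "noise view" stream.
import torch
import torch.nn as nn
import torch.nn.functional as F

# 3x3 high-pass kernel applied per channel to expose manipulation noise
hp = torch.tensor([[-1., -1., -1.], [-1., 8., -1.], [-1., -1., -1.]]) / 8.0
hp_kernel = hp.view(1, 1, 3, 3).repeat(3, 1, 1, 1)

def noise_view(x):                       # x: (B, 3, H, W)
    return F.conv2d(x, hp_kernel, padding=1, groups=3)

rgb_enc = nn.Sequential(nn.Conv2d(3, 64, 3, 2, 1), nn.ReLU(),
                        nn.AdaptiveAvgPool2d(1), nn.Flatten())
noise_enc = nn.Sequential(nn.Conv2d(3, 64, 3, 2, 1), nn.ReLU(),
                          nn.AdaptiveAvgPool2d(1), nn.Flatten())

def contrastive_loss(x, tau=0.1):
    """InfoNCE between matching RGB/noise features of the same image."""
    a = F.normalize(rgb_enc(x), dim=-1)
    b = F.normalize(noise_enc(noise_view(x)), dim=-1)
    logits = a @ b.T / tau               # positives sit on the diagonal
    return F.cross_entropy(logits, torch.arange(len(x)))

x = torch.rand(4, 3, 128, 128)
loss = contrastive_loss(x)
```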
arXiv Detail & Related papers (2022-10-16T13:30:13Z)
- MC-LCR: Multi-modal contrastive classification by locally correlated representations for effective face forgery detection [11.124150983521158]
We propose a novel framework named Multi-modal Contrastive Classification by Locally Correlated Representations.
Our MC-LCR aims to amplify implicit local discrepancies between authentic and forged faces in both the spatial and frequency domains.
We achieve state-of-the-art performance and demonstrate the robustness and generalization of our method.
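A small sketch of how a frequency-domain view could be produced to complement the spatial one, via the 2-D FFT; MC-LCR's locally correlated representations and contrastive heads are not reproduced here.

```python
# Sketch: derive a frequency-domain view to pair with the spatial view.
import torch

def frequency_view(x):                    # x: (B, C, H, W), values in [0, 1]
    spec = torch.fft.fft2(x, norm="ortho")
    amp = torch.log1p(spec.abs())         # log-amplitude spectrum
    phase = torch.angle(spec)             # phase spectrum
    return torch.cat([amp, phase], dim=1) # (B, 2C, H, W) frequency features

x = torch.rand(2, 3, 128, 128)
freq = frequency_view(x)                  # input to a frequency-domain branch
```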
arXiv Detail & Related papers (2021-10-07T09:24:12Z)
- Generalizing Face Forgery Detection with High-frequency Features [63.33397573649408]
Current CNN-based detectors tend to overfit to method-specific color textures and thus fail to generalize.
We propose to utilize high-frequency noises for face forgery detection, via two complementary modules.
The first is a multi-scale high-frequency feature extraction module that extracts high-frequency noise at multiple scales.
The second is the residual-guided spatial attention module that guides the low-level RGB feature extractor to concentrate more on forgery traces from a new perspective.
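A rough sketch of both described modules: multi-scale high-frequency residuals via blur subtraction (a Laplacian-pyramid-style stand-in for the paper's filters), and a residual-guided spatial attention map in an assumed form.

```python
# Hedged sketch; the paper's actual filters and attention design may differ.
import torch
import torch.nn.functional as F

def high_freq_residuals(x, scales=(1, 2, 4)):
    """Return high-frequency residuals of x at several scales."""
    outs = []
    for s in scales:
        xs = F.avg_pool2d(x, s) if s > 1 else x
        blur = F.avg_pool2d(xs, 3, stride=1, padding=1)  # cheap low-pass
        outs.append(xs - blur)                           # high-frequency residual
    return outs

def residual_attention(feat, residual):
    """Residual-guided spatial attention: reweight RGB features where the
    high-frequency residual energy is large (assumed form)."""
    attn = torch.sigmoid(residual.abs().mean(dim=1, keepdim=True))
    return feat * F.interpolate(attn, size=feat.shape[-2:])

x = torch.rand(2, 3, 256, 256)
feat = torch.rand(2, 64, 64, 64)          # stand-in low-level RGB feature map
out = residual_attention(feat, high_freq_residuals(x)[1])
```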
arXiv Detail & Related papers (2021-03-23T08:19:21Z)
- Gait Recognition using Multi-Scale Partial Representation Transformation with Capsules [22.99694601595627]
We propose a novel deep network that learns to transfer multi-scale partial gait representations using capsules.
Our network first obtains multi-scale partial representations using a state-of-the-art deep partial feature extractor.
It then recurrently learns the correlations and co-occurrences of the patterns among the partial features in forward and backward directions.
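A sketch of the recurrent step only: a bidirectional GRU over per-part features to model forward and backward correlations among parts. The partial feature extractor and the capsule layers from the paper are omitted.

```python
# Sketch: bidirectional recurrence over per-part gait features.
import torch
import torch.nn as nn

parts, dim = 16, 128                       # e.g., 16 horizontal body strips (assumed)
bigru = nn.GRU(dim, dim, batch_first=True, bidirectional=True)

part_feats = torch.randn(4, parts, dim)    # (batch, parts, dim) from the extractor
corr, _ = bigru(part_feats)                # (batch, parts, 2*dim) correlations
gait_repr = corr.mean(dim=1)               # simple pooling into one descriptor
```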
arXiv Detail & Related papers (2020-10-18T19:47:38Z)
- Attention Model Enhanced Network for Classification of Breast Cancer Image [54.83246945407568]
AMEN is formulated in a multi-branch fashion with a pixel-wise attention model and a classification submodule.
To focus on subtle detail information, the sample image is enhanced by the pixel-wise attention map generated by the former branch.
Experiments conducted on three benchmark datasets demonstrate the superiority of the proposed method under various scenarios.
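A sketch of the enhancement step: a pixel-wise attention map produced by one branch reweights the input image before the classification branch sees it. The fusion rule below is an assumption.

```python
# Sketch: pixel-wise attention map enhancing the input image.
import torch
import torch.nn as nn

attn_branch = nn.Sequential(nn.Conv2d(3, 8, 3, padding=1), nn.ReLU(),
                            nn.Conv2d(8, 1, 1), nn.Sigmoid())

def enhance(x):                      # x: (B, 3, H, W) histology image
    attn = attn_branch(x)            # (B, 1, H, W) pixel-wise attention map
    return x * (1.0 + attn)          # amplify subtle detail regions (assumed fusion)

x = torch.rand(2, 3, 224, 224)
x_enhanced = enhance(x)              # input to the classification branch
```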
arXiv Detail & Related papers (2020-10-07T08:44:21Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.