Weakly-supervised deepfake localization in diffusion-generated images
- URL: http://arxiv.org/abs/2311.04584v2
- Date: Mon, 13 Nov 2023 08:32:51 GMT
- Title: Weakly-supervised deepfake localization in diffusion-generated images
- Authors: Dragos Tantaru and Elisabeta Oneata and Dan Oneata
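- Abstract summary: We frame deepfake localization as a weakly-supervised problem and compare three categories of methods (based on explanations, local scores or attention) using the Xception network as the common backbone architecture.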
We show that the best performing detection method (based on local scores) is less sensitive to the looser supervision than to the mismatch in terms of dataset or generator.
- Score: 4.548755617115687
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: The remarkable generative capabilities of denoising diffusion models have
raised new concerns regarding the authenticity of the images we see every day
on the Internet. However, the vast majority of existing deepfake detection
models are tested against previous generative approaches (e.g. GAN) and usually
provide only a "fake" or "real" label per image. We believe a more informative
output would be to augment the per-image label with a localization map
indicating which regions of the input have been manipulated. To this end, we
frame this task as a weakly-supervised localization problem and identify three
main categories of methods (based on either explanations, local scores or
attention), which we compare on an equal footing by using the Xception network
as the common backbone architecture. We provide a careful analysis of all the
main factors that parameterize the design space: choice of method, type of
supervision, dataset and generator used in the creation of manipulated images;
our study is enabled by constructing datasets in which only one of the
components is varied. Our results show that weakly-supervised localization is
attainable, with the best performing detection method (based on local scores)
being less sensitive to the looser supervision than to the mismatch in terms of
dataset or generator.
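
To make the "local scores" idea concrete, the following is a minimal sketch of weak supervision over per-patch scores. It is an illustration under stated assumptions, not the authors' implementation: the abstract does not specify how patch scores are pooled or which loss is used, so the pooling choices, the binary cross-entropy loss, and all function names here (`image_score_from_patches`, `image_level_bce`, `localization_map`) are hypothetical. The patch logits would come from a backbone such as Xception; here they are a hand-made toy array.

```python
import numpy as np


def sigmoid(x):
    """Element-wise logistic function."""
    return 1.0 / (1.0 + np.exp(-x))


def image_score_from_patches(patch_logits, pooling="max"):
    """Pool an (H, W) map of per-patch logits into one image-level logit.

    The pooling operator is an assumption made for illustration only.
    """
    if pooling == "max":
        return float(patch_logits.max())
    if pooling == "mean":
        return float(patch_logits.mean())
    if pooling == "lse":
        # Log-mean-exp: a smooth value between the mean and the max.
        flat = patch_logits.ravel()
        m = flat.max()
        return float(m + np.log(np.mean(np.exp(flat - m))))
    raise ValueError(f"unknown pooling: {pooling}")


def image_level_bce(patch_logits, is_fake, pooling="max"):
    """Binary cross-entropy on the pooled logit.

    Only the image-level label (1 = fake, 0 = real) is used, which is
    what makes the supervision weak: no pixel-level mask is needed.
    """
    z = image_score_from_patches(patch_logits, pooling)
    # Numerically stable form of BCE with logits.
    return float(max(z, 0.0) - z * is_fake + np.log1p(np.exp(-abs(z))))


def localization_map(patch_logits, threshold=0.5):
    """Threshold per-patch probabilities to get a binary localization map."""
    return sigmoid(patch_logits) > threshold


# Toy example: a 4x4 grid of patch logits with a 2x2 "manipulated" region.
toy_logits = np.full((4, 4), -3.0)
toy_logits[1:3, 1:3] = 4.0

toy_mask = localization_map(toy_logits)
toy_loss_fake = image_level_bce(toy_logits, is_fake=1)
toy_loss_real = image_level_bce(toy_logits, is_fake=0)
```

In this toy case `toy_mask` marks exactly the four high-scoring patches, and the loss is lower when the image is labelled fake than when it is labelled real, since the pooled score is driven by the manipulated region.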
Related papers
- Contrasting Deepfakes Diffusion via Contrastive Learning and Global-Local Similarities [88.398085358514]
Contrastive Deepfake Embeddings (CoDE) is a novel embedding space specifically designed for deepfake detection.
CoDE is trained via contrastive learning by additionally enforcing global-local similarities.
arXiv Detail & Related papers (2024-07-29T18:00:10Z)
- FM-OSD: Foundation Model-Enabled One-Shot Detection of Anatomical Landmarks [44.54301473673582]
We propose the first foundation model-enabled one-shot landmark detection (FM-OSD) framework for accurate landmark detection in medical images.
By using solely a single template image, our method demonstrates significant superiority over strong state-of-the-art one-shot landmark detection methods.
arXiv Detail & Related papers (2024-07-07T15:37:02Z)
- Pixel-Inconsistency Modeling for Image Manipulation Localization [63.54342601757723]
Digital image forensics plays a crucial role in image authentication and manipulation localization.
This paper presents a generalized and robust manipulation localization model through the analysis of pixel inconsistency artifacts.
Experiments show that our method successfully extracts inherent pixel-inconsistency forgery fingerprints.
arXiv Detail & Related papers (2023-09-30T02:54:51Z)
- CRADL: Contrastive Representations for Unsupervised Anomaly Detection and Localization [2.8659934481869715]
Unsupervised anomaly detection in medical imaging aims to detect and localize arbitrary anomalies without requiring anomalous data during training.
Most current state-of-the-art methods use latent variable generative models operating directly on the images.
We propose CRADL whose core idea is to model the distribution of normal samples directly in the low-dimensional representation space of an encoder trained with a contrastive pretext-task.
arXiv Detail & Related papers (2023-01-05T16:07:49Z)
- TruFor: Leveraging all-round clues for trustworthy image forgery detection and localization [17.270110456445806]
TruFor is a forensic framework that can be applied to a large variety of image manipulation methods.
We rely on the extraction of both high-level and low-level traces through a transformer-based fusion architecture.
Our method reliably detects and localizes both cheapfake and deepfake manipulations, outperforming state-of-the-art works.
arXiv Detail & Related papers (2022-12-21T11:49:43Z)
- Evaluating the Label Efficiency of Contrastive Self-Supervised Learning for Multi-Resolution Satellite Imagery [0.0]
Self-supervised learning has been applied in the remote sensing domain to exploit readily-available unlabeled data.
In this paper, we study self-supervised visual representation learning through the lens of label efficiency.
arXiv Detail & Related papers (2022-10-13T06:54:13Z)
- AnoViT: Unsupervised Anomaly Detection and Localization with Vision Transformer-based Encoder-Decoder [3.31490164885582]
We propose a vision transformer-based encoder-decoder model, named AnoViT, to reflect normal information by additionally learning the global relationship between image patches.
The proposed model performed better than the convolution-based model on three benchmark datasets.
arXiv Detail & Related papers (2022-03-21T09:01:37Z)
- Attentive Prototypes for Source-free Unsupervised Domain Adaptive 3D Object Detection [85.11649974840758]
3D object detection networks tend to be biased towards the data they are trained on.
We propose a single-frame approach for source-free, unsupervised domain adaptation of lidar-based 3D object detectors.
arXiv Detail & Related papers (2021-11-30T18:42:42Z)
- Low-Rank Subspaces in GANs [101.48350547067628]
This work introduces low-rank subspaces that enable more precise control of GAN generation.
LowRankGAN is able to find the low-dimensional representation of the attribute manifold.
Experiments on state-of-the-art GAN models (including StyleGAN2 and BigGAN) trained on various datasets demonstrate the effectiveness of our LowRankGAN.
arXiv Detail & Related papers (2021-06-08T16:16:32Z)
- Self-supervised Segmentation via Background Inpainting [96.10971980098196]
We introduce a self-supervised detection and segmentation approach that can work with single images captured by a potentially moving camera.
We exploit a self-supervised loss function to train a proposal-based segmentation network.
We apply our method to human detection and segmentation in images that visually depart from those of standard benchmarks and outperform existing self-supervised methods.
arXiv Detail & Related papers (2020-11-11T08:34:40Z)
- Unsupervised Metric Relocalization Using Transform Consistency Loss [66.19479868638925]
Training networks to perform metric relocalization traditionally requires accurate image correspondences.
We propose a self-supervised solution, which exploits a key insight: localizing a query image within a map should yield the same absolute pose, regardless of the reference image used for registration.
We evaluate our framework on synthetic and real-world data, showing our approach outperforms other supervised methods when a limited amount of ground-truth information is available.
arXiv Detail & Related papers (2020-11-01T19:24:27Z)
This list is automatically generated from the titles and abstracts of the papers in this site.