Towards Generic Image Manipulation Detection with Weakly-Supervised
Self-Consistency Learning
- URL: http://arxiv.org/abs/2309.01246v1
- Date: Sun, 3 Sep 2023 19:19:56 GMT
- Title: Towards Generic Image Manipulation Detection with Weakly-Supervised
Self-Consistency Learning
- Authors: Yuanhao Zhai, Tianyu Luan, David Doermann, Junsong Yuan
- Abstract summary: We propose weakly-supervised image manipulation detection.
Such a setting can leverage more training images and has the potential to adapt quickly to new manipulation techniques.
Two consistency properties are learned: multi-source consistency (MSC) and inter-patch consistency (IPC).
- Score: 49.43362803584032
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: As advanced image manipulation techniques emerge, detecting the manipulation
becomes increasingly important. Despite the success of recent learning-based
approaches for image manipulation detection, they typically require expensive
pixel-level annotations to train, while exhibiting degraded performance when
testing on images that are differently manipulated compared with training
images. To address these limitations, we propose weakly-supervised image
manipulation detection, such that only binary image-level labels (authentic or
tampered with) are required for training. Such a weakly-supervised
setting can leverage more training images and has the potential to adapt
quickly to new manipulation techniques. To improve the generalization ability,
we propose weakly-supervised self-consistency learning (WSCL) to leverage the
weakly annotated images. Specifically, two consistency properties are learned:
multi-source consistency (MSC) and inter-patch consistency (IPC). MSC exploits
different content-agnostic information and enables cross-source learning via an
online pseudo label generation and refinement process. IPC performs global
pair-wise patch-patch relationship reasoning to discover a complete region of
manipulation. Extensive experiments validate that our WSCL, even though it is
weakly supervised, exhibits competitive performance compared with its
fully-supervised counterparts under both in-distribution and out-of-distribution
evaluations, as well as reasonable manipulation localization ability.
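
The two consistency ideas named in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the function names (`fuse_pseudo_label`, `patch_affinity`), the plain averaging rule, the 0.5 threshold, and the use of cosine similarity are all assumptions chosen for illustration. The abstract only says that MSC generates and refines pseudo labels across sources and that IPC reasons over pair-wise patch relationships.

```python
import math


def fuse_pseudo_label(source_maps, threshold=0.5):
    """Hypothetical MSC-style step: average per-pixel tamper probabilities
    predicted from several sources, then binarize into a pseudo label.
    The averaging rule and threshold are illustrative assumptions."""
    n_sources = len(source_maps)
    n_pixels = len(source_maps[0])
    mean = [sum(m[i] for m in source_maps) / n_sources for i in range(n_pixels)]
    return [1 if p >= threshold else 0 for p in mean]


def cosine(u, v):
    """Cosine similarity of two feature vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def patch_affinity(patch_features):
    """Hypothetical IPC-style step: similarity between every pair of patches.
    Entry [i][j] is high when patches i and j have similar features, e.g.
    both pristine or both manipulated."""
    return [[cosine(p, q) for q in patch_features] for p in patch_features]


if __name__ == "__main__":
    # Two sources, each giving a tamper probability for two pixels.
    print(fuse_pseudo_label([[0.9, 0.1], [0.7, 0.3]]))
    # Three patches with 2-D features; patches 0 and 2 point the same way.
    for row in patch_affinity([[1.0, 0.0], [0.0, 1.0], [2.0, 0.0]]):
        print(row)
```

In this toy setup the pseudo label marks a pixel as tampered only when the sources agree on average, and the affinity matrix groups patches whose features point in the same direction. A real system would learn both the per-source predictors and the patch features.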
Related papers
- Pre-trained Text-to-Image Diffusion Models Are Versatile Representation Learners for Control [73.6361029556484]
Embodied AI agents require a fine-grained understanding of the physical world mediated through visual and language inputs.
We consider pre-trained text-to-image diffusion models, which are explicitly optimized to generate images from text prompts.
We show that Stable Control Representations enable learning policies that exhibit state-of-the-art performance on OVMM, a difficult open-vocabulary navigation benchmark.
arXiv Detail & Related papers (2024-05-09T15:39:54Z)
- Transformer-based Clipped Contrastive Quantization Learning for
Unsupervised Image Retrieval [15.982022297570108]
Unsupervised image retrieval aims to learn the important visual characteristics, without any given labels, in order to retrieve images similar to a given query image.
In this paper, we propose the TransClippedCLR model, which encodes the global context of an image with a Transformer while capturing local context through patch-based processing.
Results using the proposed clipped contrastive learning are greatly improved on all datasets compared to the same backbone network with vanilla contrastive learning.
arXiv Detail & Related papers (2024-01-27T09:39:11Z) - MOCA: Self-supervised Representation Learning by Predicting Masked Online Codebook Assignments [72.6405488990753]
Self-supervised learning can be used for mitigating the greedy needs of Vision Transformer networks.
We propose a single-stage and standalone method, MOCA, which unifies both desired properties.
We achieve new state-of-the-art results on low-shot settings and strong experimental results in various evaluation protocols.
arXiv Detail & Related papers (2023-07-18T15:46:20Z) - Localized Region Contrast for Enhancing Self-Supervised Learning in
Medical Image Segmentation [27.82940072548603]
We propose a novel contrastive learning framework that integrates Localized Region Contrast (LRC) to enhance existing self-supervised pre-training methods for medical image segmentation.
Our approach identifies superpixels with Felzenszwalb's algorithm and performs local contrastive learning using a novel contrastive sampling loss.
arXiv Detail & Related papers (2023-04-06T22:43:13Z) - Non-Contrastive Learning Meets Language-Image Pre-Training [145.6671909437841]
We study the validity of non-contrastive language-image pre-training (nCLIP)
We introduce xCLIP, a multi-tasking framework combining CLIP and nCLIP, and show that nCLIP aids CLIP in enhancing feature semantics.
arXiv Detail & Related papers (2022-10-17T17:57:46Z) - Towards Effective Image Manipulation Detection with Proposal Contrastive
Learning [61.5469708038966]
We propose Proposal Contrastive Learning (PCL) for effective image manipulation detection.
Our PCL uses a two-stream architecture that extracts two types of global features, one from the RGB view and one from the noise view.
Our PCL can be easily adapted to unlabeled data in practice, which can reduce manual labeling costs and promote more generalizable features.
arXiv Detail & Related papers (2022-10-16T13:30:13Z) - Masked Unsupervised Self-training for Zero-shot Image Classification [98.23094305347709]
Masked Unsupervised Self-Training (MUST) is a new approach which leverages two different and complimentary sources of supervision: pseudo-labels and raw images.
MUST improves upon CLIP by a large margin and narrows the performance gap between unsupervised and supervised classification.
arXiv Detail & Related papers (2022-06-07T02:03:06Z) - AugNet: End-to-End Unsupervised Visual Representation Learning with
Image Augmentation [3.6790362352712873]
We propose AugNet, a new deep learning training paradigm to learn image features from a collection of unlabeled pictures.
Our experiments demonstrate that the method is able to represent the image in low dimensional space.
Unlike many deep-learning-based image retrieval algorithms, our approach does not require access to external annotated datasets.
arXiv Detail & Related papers (2021-06-11T09:02:30Z) - Image Manipulation Detection by Multi-View Multi-Scale Supervision [11.319080833880307]
Key challenge of image manipulation detection is how to learn generalizable features that are sensitive to manipulations in novel data.
In this paper we address both aspects by multi-view feature learning and multi-scale supervision.
We realize these ideas in a new network, which we term MVSS-Net.
arXiv Detail & Related papers (2021-04-14T13:05:58Z)