MVSS-Net: Multi-View Multi-Scale Supervised Networks for Image
Manipulation Detection
- URL: http://arxiv.org/abs/2112.08935v1
- Date: Thu, 16 Dec 2021 15:01:52 GMT
- Authors: Chengbo Dong, Xinru Chen, Ruohan Hu, Juan Cao, Xirong Li
- Abstract summary: The key research question for image manipulation detection is how to learn generalizable features that are sensitive to manipulations in novel data, yet specific enough to prevent false alarms on authentic images.
In this paper we address both aspects by multi-view feature learning and multi-scale supervision.
These ideas are realized in a new network, termed MVSS-Net, and its enhanced version, MVSS-Net++.
- Score: 10.594107680952774
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The key research question for image manipulation detection is how to learn
generalizable features that are sensitive to manipulations in novel data,
whilst specific to prevent false alarms on authentic images. Current research
emphasizes the sensitivity, with the specificity mostly ignored. In this paper
we address both aspects by multi-view feature learning and multi-scale
supervision. By exploiting noise distribution and boundary artifacts
surrounding tampered regions, the former aims to learn semantic-agnostic and
thus more generalizable features. The latter allows us to learn from authentic
images which are nontrivial to be taken into account by the prior art that
relies on a semantic segmentation loss. Our thoughts are realized by a new
network which we term MVSS-Net and its enhanced version MVSS-Net++.
Comprehensive experiments on six public benchmark datasets justify the
viability of the MVSS-Net series for both pixel-level and image-level
manipulation detection.
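The abstract notes that authentic images are hard to use with a pure segmentation loss, and that supervision at more than one scale addresses this. The following is a minimal sketch of that idea, not the authors' implementation: it derives an image-level score from a pixel-level probability map and combines a pixel-level loss with an image-level loss. The use of global max pooling, Dice loss, binary cross-entropy, and the weight `alpha` are assumptions made for illustration; none of them are stated in the abstract, and the boundary (edge) supervision it mentions is omitted.

```python
import math

def global_max_pool(prob_map):
    """Image-level score: the largest per-pixel manipulation probability."""
    return max(max(row) for row in prob_map)

def dice_loss(prob_map, mask, eps=1e-6):
    """Pixel-level Dice loss between a probability map and a binary mask."""
    inter = sum(p * m for pr, mr in zip(prob_map, mask) for p, m in zip(pr, mr))
    p_sq = sum(p * p for pr in prob_map for p in pr)
    m_sq = sum(m * m for mr in mask for m in mr)
    return 1.0 - (2.0 * inter + eps) / (p_sq + m_sq + eps)

def bce_loss(score, label, eps=1e-7):
    """Binary cross-entropy for a single image-level score."""
    score = min(max(score, eps), 1.0 - eps)  # clamp to avoid log(0)
    return -(label * math.log(score) + (1 - label) * math.log(1.0 - score))

def multi_scale_loss(prob_map, mask, alpha=0.5):
    """Hypothetical combined loss; alpha is an illustrative weight.

    The image-level term applies to every image, so authentic images
    (all-zero masks) still contribute a training signal. The pixel-level
    term is applied only when the mask contains manipulated pixels.
    """
    label = 1 if any(m > 0 for row in mask for m in row) else 0
    cls = bce_loss(global_max_pool(prob_map), label)
    seg = dice_loss(prob_map, mask) if label == 1 else 0.0
    return alpha * seg + (1 - alpha) * cls

# Usage: an authentic image with one false-alarm pixel versus a clean prediction.
authentic_mask = [[0, 0], [0, 0]]
clean_pred = [[0.01, 0.02], [0.01, 0.03]]
false_alarm_pred = [[0.01, 0.90], [0.01, 0.03]]
print(multi_scale_loss(clean_pred, authentic_mask))
print(multi_scale_loss(false_alarm_pred, authentic_mask))
```

In this sketch, a single confidently wrong pixel on an authentic image raises the image-level term, which is one way a model could be penalized for false alarms even when no manipulation mask exists.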
Related papers
- Not Just Learning from Others but Relying on Yourself: A New Perspective
on Few-Shot Segmentation in Remote Sensing [14.37799301656178]
Few-shot segmentation (FSS) is proposed to segment unknown class targets with just a few annotated samples.
We develop a Dual-Mining network named DMNet for cross-image mining and self-mining.
Our model with a ResNet-50 backbone achieves mIoU of 49.58% and 51.34% on iSAID under the 1-shot and 5-shot settings, respectively.
arXiv Detail & Related papers (2023-10-19T04:09:10Z)
- CtxMIM: Context-Enhanced Masked Image Modeling for Remote Sensing Image Understanding [38.53988682814626]
We propose a context-enhanced masked image modeling method (CtxMIM) for remote sensing image understanding.
CtxMIM formulates original image patches as a reconstructive template and employs a Siamese framework to operate on two sets of image patches.
With the simple and elegant design, CtxMIM encourages the pre-training model to learn object-level or pixel-level features on a large-scale dataset.
arXiv Detail & Related papers (2023-09-28T18:04:43Z)
- Towards Generic Image Manipulation Detection with Weakly-Supervised
Self-Consistency Learning [49.43362803584032]
We propose weakly-supervised image manipulation detection.
Such a setting can leverage more training images and has the potential to adapt quickly to new manipulation techniques.
Two consistency properties are learned: multi-source consistency (MSC) and inter-patch consistency (IPC).
arXiv Detail & Related papers (2023-09-03T19:19:56Z)
- Multi-spectral Class Center Network for Face Manipulation Detection and Localization [52.569170436393165]
We propose a novel Multi-Spectral Class Center Network (MSCCNet) for face manipulation detection and localization.
Based on the features of different frequency bands, the MSCC module collects multi-spectral class centers and computes pixel-to-class relations.
Applying multi-spectral class-level representations suppresses the semantic information of visual concepts, which is insensitive to the manipulated regions of forged images.
arXiv Detail & Related papers (2023-05-18T08:09:20Z)
- MSMG-Net: Multi-scale Multi-grained Supervised Metworks for Multi-task
Image Manipulation Detection and Localization [1.14219428942199]
A novel multi-scale multi-grained deep network (MSMG-Net) is proposed to automatically identify manipulated regions.
In our MSMG-Net, a parallel multi-scale feature extraction structure is used to extract multi-scale features.
The MSMG-Net can effectively perceive the object-level semantics and encode the edge artifact.
arXiv Detail & Related papers (2022-11-06T14:58:21Z)
- Deep Convolutional Pooling Transformer for Deepfake Detection [54.10864860009834]
We propose a deep convolutional Transformer to incorporate decisive image features both locally and globally.
Specifically, we apply convolutional pooling and re-attention to enrich the extracted features and enhance efficacy.
The proposed solution consistently outperforms several state-of-the-art baselines on both within- and cross-dataset experiments.
arXiv Detail & Related papers (2022-09-12T15:05:41Z)
- Multi-Content Complementation Network for Salient Object Detection in
Optical Remote Sensing Images [108.79667788962425]
Salient object detection in optical remote sensing images (RSI-SOD) remains a challenging emerging topic.
We propose a novel Multi-Content Complementation Network (MCCNet) to explore the complementarity of multiple content for RSI-SOD.
In MCCM, we consider multiple types of features that are critical to RSI-SOD, including foreground features, edge features, background features, and global image-level features.
arXiv Detail & Related papers (2021-12-02T04:46:40Z)
- Semantic-Aware Generation for Self-Supervised Visual Representation
Learning [116.5814634936371]
We advocate for Semantic-aware Generation (SaGe) to facilitate richer semantics rather than details to be preserved in the generated image.
SaGe complements the target network with view-specific features and thus alleviates the semantic degradation brought by intensive data augmentations.
We execute SaGe on ImageNet-1K and evaluate the pre-trained models on five downstream tasks including nearest neighbor test, linear classification, and fine-scaled image recognition.
arXiv Detail & Related papers (2021-11-25T16:46:13Z)
- Image Manipulation Detection by Multi-View Multi-Scale Supervision [11.319080833880307]
The key challenge of image manipulation detection is how to learn generalizable features that are sensitive to manipulations in novel data.
In this paper we address both aspects by multi-view feature learning and multi-scale supervision.
Our thoughts are realized by a new network which we term MVSS-Net.
arXiv Detail & Related papers (2021-04-14T13:05:58Z)
- D-Unet: A Dual-encoder U-Net for Image Splicing Forgery Detection and
Localization [108.8592577019391]
Image splicing forgery detection is a global binary classification task that distinguishes the tampered and non-tampered regions by image fingerprints.
We propose a novel network called dual-encoder U-Net (D-Unet) for image splicing forgery detection, which employs an unfixed encoder and a fixed encoder.
In an experimental comparison study of D-Unet and state-of-the-art methods, D-Unet outperformed the other methods in image-level and pixel-level detection.
arXiv Detail & Related papers (2020-12-03T10:54:02Z)