Related papers: MVSS-Net: Multi-View Multi-Scale Supervised Networks for Image Manipulation Detection

URL: http://arxiv.org/abs/2112.08935v1
Date: Thu, 16 Dec 2021 15:01:52 GMT
Title: MVSS-Net: Multi-View Multi-Scale Supervised Networks for Image Manipulation Detection
Authors: Chengbo Dong, Xinru Chen, Ruohan Hu, Juan Cao, Xirong Li
Abstract summary: The key research question for image manipulation detection is how to learn generalizable features that are sensitive to manipulations in novel data while remaining specific enough to prevent false alarms on authentic images. The paper addresses both sensitivity and specificity through multi-view feature learning and multi-scale supervision. The ideas are realized in a new network, MVSS-Net, and its enhanced version, MVSS-Net++.
Score: 10.594107680952774
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: The key research question for image manipulation detection is how to learn generalizable features that are sensitive to manipulations in novel data, whilst specific to prevent false alarms on authentic images. Current research emphasizes the sensitivity, with the specificity mostly ignored. In this paper we address both aspects by multi-view feature learning and multi-scale supervision. By exploiting noise distribution and boundary artifacts surrounding tampered regions, the former aims to learn semantic-agnostic and thus more generalizable features. The latter allows us to learn from authentic images which are nontrivial to be taken into account by the prior art that relies on a semantic segmentation loss. Our thoughts are realized by a new network which we term MVSS-Net and its enhanced version MVSS-Net++. Comprehensive experiments on six public benchmark datasets justify the viability of the MVSS-Net series for both pixel-level and image-level manipulation detection.
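
The abstract's point about multi-scale supervision, that combining pixel-level and image-level objectives lets a model learn from authentic images, can be illustrated with a toy loss. The following is a minimal sketch under stated assumptions, not the authors' implementation: the choice of Dice and binary cross-entropy terms, the max-pooled image score, the function names, and the weights `w_pixel`, `w_edge`, `w_image` are all hypothetical and are not taken from the abstract.

```python
import math


def dice_loss(pred, target, eps=1e-6):
    """Soft Dice loss between two flat lists of values in [0, 1]."""
    inter = sum(p * t for p, t in zip(pred, target))
    denom = sum(p * p for p in pred) + sum(t * t for t in target)
    return 1.0 - (2.0 * inter + eps) / (denom + eps)


def bce_loss(score, label, eps=1e-7):
    """Binary cross-entropy for a single score in [0, 1]."""
    score = min(max(score, eps), 1.0 - eps)  # clip to avoid log(0)
    return -(label * math.log(score) + (1.0 - label) * math.log(1.0 - score))


def multi_scale_loss(pixel_pred, pixel_mask, edge_pred, edge_mask,
                     w_pixel=0.5, w_edge=0.3, w_image=0.2):
    """Weighted sum of pixel-level, edge-level, and image-level terms.

    Assumptions (hypothetical, for illustration only):
    - the image-level score is the maximum pixel prediction;
    - an image counts as manipulated if any mask pixel is set;
    - the three weights are arbitrary placeholders.
    """
    image_score = max(pixel_pred)
    image_label = 1.0 if any(m > 0.5 for m in pixel_mask) else 0.0
    return (w_pixel * dice_loss(pixel_pred, pixel_mask)
            + w_edge * dice_loss(edge_pred, edge_mask)
            + w_image * bce_loss(image_score, image_label))


# Authentic image (empty mask): a confident false alarm costs more than a
# quiet prediction, mostly because of the image-level term.
empty = [0.0, 0.0, 0.0, 0.0]
false_alarm = multi_scale_loss([0.9, 0.1, 0.1, 0.1], empty, empty, empty)
quiet = multi_scale_loss([0.1, 0.1, 0.1, 0.1], empty, empty, empty)
print(false_alarm > quiet)
```

In this sketch the Dice term stays close to 1 for any nonzero prediction on an authentic image, so it barely separates the two cases; the image-level term supplies most of the difference. That is one plausible reading of why a segmentation loss alone makes authentic images hard to use.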

Related papers

Semi-supervised Semantic Segmentation for Remote Sensing Images via Multi-scale Uncertainty Consistency and Cross-Teacher-Student Attention [59.19580789952102]
This paper proposes a novel semi-supervised Multi-Scale Uncertainty and Cross-Teacher-Student Attention (MUCA) model for RS image semantic segmentation tasks. MUCA constrains the consistency among feature maps at different layers of the network by introducing a multi-scale uncertainty consistency regularization. MUCA also uses a Cross-Teacher-Student attention mechanism to guide the student network in constructing more discriminative feature representations.
arXiv Detail & Related papers (2025-01-18T11:57:20Z)
Not Just Learning from Others but Relying on Yourself: A New Perspective on Few-Shot Segmentation in Remote Sensing [14.37799301656178]
Few-shot segmentation (FSS) is proposed to segment unknown class targets with just a few annotated samples. We develop a Dual-Mining network named DMNet for cross-image mining and self-mining. Our model with a ResNet-50 backbone achieves an mIoU of 49.58% and 51.34% on iSAID under 1-shot and 5-shot settings, respectively.
arXiv Detail & Related papers (2023-10-19T04:09:10Z)
CtxMIM: Context-Enhanced Masked Image Modeling for Remote Sensing Image Understanding [38.53988682814626]
We propose a context-enhanced masked image modeling method (CtxMIM) for remote sensing image understanding. CtxMIM formulates original image patches as a reconstructive template and employs a Siamese framework to operate on two sets of image patches. With the simple and elegant design, CtxMIM encourages the pre-training model to learn object-level or pixel-level features on a large-scale dataset.
arXiv Detail & Related papers (2023-09-28T18:04:43Z)
Towards Generic Image Manipulation Detection with Weakly-Supervised Self-Consistency Learning [49.43362803584032]
We propose weakly-supervised image manipulation detection. Such a setting can leverage more training images and has the potential to adapt quickly to new manipulation techniques. Two consistency properties are learned: multi-source consistency (MSC) and inter-patch consistency (IPC).
arXiv Detail & Related papers (2023-09-03T19:19:56Z)
Multi-spectral Class Center Network for Face Manipulation Detection and Localization [52.569170436393165]
We propose a novel Multi-Spectral Class Center Network (MSCCNet) for face manipulation detection and localization. Based on the features of different frequency bands, the MSCC module collects multi-spectral class centers and computes pixel-to-class relations. Applying multi-spectral class-level representations suppresses the semantic information of the visual concepts which is insensitive to manipulated regions of forgery images.
arXiv Detail & Related papers (2023-05-18T08:09:20Z)
MSMG-Net: Multi-scale Multi-grained Supervised Metworks for Multi-task Image Manipulation Detection and Localization [1.14219428942199]
A novel multi-scale multi-grained deep network (MSMG-Net) is proposed to automatically identify manipulated regions. In our MSMG-Net, a parallel multi-scale feature extraction structure is used to extract multi-scale features. The MSMG-Net can effectively perceive the object-level semantics and encode the edge artifact.
arXiv Detail & Related papers (2022-11-06T14:58:21Z)
Deep Convolutional Pooling Transformer for Deepfake Detection [54.10864860009834]
We propose a deep convolutional Transformer to incorporate decisive image features both locally and globally. Specifically, we apply convolutional pooling and re-attention to enrich the extracted features and enhance efficacy. The proposed solution consistently outperforms several state-of-the-art baselines on both within- and cross-dataset experiments.
arXiv Detail & Related papers (2022-09-12T15:05:41Z)
Multi-Content Complementation Network for Salient Object Detection in Optical Remote Sensing Images [108.79667788962425]
Salient object detection in optical remote sensing images (RSI-SOD) remains a challenging emerging topic. We propose a novel Multi-Content Complementation Network (MCCNet) to explore the complementarity of multiple content for RSI-SOD. In MCCM, we consider multiple types of features that are critical to RSI-SOD, including foreground features, edge features, background features, and global image-level features.
arXiv Detail & Related papers (2021-12-02T04:46:40Z)
Semantic-Aware Generation for Self-Supervised Visual Representation Learning [116.5814634936371]
We advocate for Semantic-aware Generation (SaGe) to facilitate richer semantics rather than details to be preserved in the generated image. SaGe complements the target network with view-specific features and thus alleviates the semantic degradation brought by intensive data augmentations. We execute SaGe on ImageNet-1K and evaluate the pre-trained models on five downstream tasks including nearest neighbor test, linear classification, and fine-scaled image recognition.
arXiv Detail & Related papers (2021-11-25T16:46:13Z)
Image Manipulation Detection by Multi-View Multi-Scale Supervision [11.319080833880307]
The key challenge of image manipulation detection is how to learn generalizable features that are sensitive to manipulations in novel data while remaining specific enough to prevent false alarms on authentic images. The paper addresses both sensitivity and specificity through multi-view feature learning and multi-scale supervision. The ideas are realized in a new network, MVSS-Net.
arXiv Detail & Related papers (2021-04-14T13:05:58Z)
D-Unet: A Dual-encoder U-Net for Image Splicing Forgery Detection and Localization [108.8592577019391]
Image splicing forgery detection is a global binary classification task that distinguishes the tampered and non-tampered regions by image fingerprints. We propose a novel network called dual-encoder U-Net (D-Unet) for image splicing forgery detection, which employs an unfixed encoder and a fixed encoder. In an experimental comparison study of D-Unet and state-of-the-art methods, D-Unet outperformed the other methods in image-level and pixel-level detection.
arXiv Detail & Related papers (2020-12-03T10:54:02Z)

This list is automatically generated from the titles and abstracts of the papers in this site.