Multi-scale Target-Aware Framework for Constrained Image Splicing
Detection and Localization
- URL: http://arxiv.org/abs/2308.09357v2
- Date: Mon, 21 Aug 2023 11:58:14 GMT
- Title: Multi-scale Target-Aware Framework for Constrained Image Splicing
Detection and Localization
- Authors: Yuxuan Tan, Yuanman Li, Limin Zeng, Jiaxiong Ye, Wei Wang, Xia Li
- Abstract summary: We propose a multi-scale target-aware framework to couple feature extraction and correlation matching in a unified pipeline.
Our approach effectively promotes the collaborative learning of related patches and allows feature learning and correlation matching to reinforce each other.
Our experiments demonstrate that our model, which uses a unified pipeline, outperforms state-of-the-art methods on several benchmark datasets.
- Score: 11.803255600587308
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Constrained image splicing detection and localization (CISDL) is a
fundamental task in multimedia forensics that detects the splicing operation
between two suspected images and localizes the spliced region in both images.
Recent works regard it as a deep matching problem and have made significant
progress. However, existing frameworks typically perform feature extraction and
correlation matching as separate processes, which may hinder the model's
ability to learn discriminative features for matching and can be susceptible to
interference from ambiguous background pixels. In this work, we propose a
multi-scale target-aware framework to couple feature extraction and correlation
matching in a unified pipeline. In contrast to previous methods, we design a
target-aware attention mechanism that jointly learns features and performs
correlation matching between the probe and donor images. Our approach
effectively promotes the collaborative learning of related patches and allows
feature learning and correlation matching to reinforce each other. Additionally,
to handle scale transformations, we introduce a multi-scale projection method
that can be readily integrated into our target-aware framework, enabling the
attention process to be conducted between tokens carrying information at
varying scales. Our experiments demonstrate that our model,
which uses a unified pipeline, outperforms state-of-the-art methods on several
benchmark datasets and is robust against scale transformations.
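To make the coupling concrete, here is a minimal PyTorch sketch of the idea, assuming a standard multi-head attention layer: probe tokens attend over probe and donor tokens pooled at several scales, so the attention weights double as correlation matching while the features are refined. All module and parameter names (e.g. `TargetAwareBlock`, `scales`) are illustrative assumptions, not the authors' released code.

```python
# Minimal sketch of a target-aware attention block that couples feature
# learning with probe/donor correlation matching. Illustrative only.
import torch
import torch.nn as nn
import torch.nn.functional as F


class TargetAwareBlock(nn.Module):
    def __init__(self, dim=256, heads=8, scales=(1, 2, 4)):
        super().__init__()
        self.scales = scales  # pooling factors for multi-scale projection
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.norm = nn.LayerNorm(dim)

    def _multi_scale(self, tokens, h, w):
        # Pool the (B, h*w, C) token map at each scale and re-flatten, so the
        # attention step can compare tokens of varying receptive fields.
        b, _, c = tokens.shape
        fmap = tokens.transpose(1, 2).reshape(b, c, h, w)
        pooled = [fmap if s == 1 else F.avg_pool2d(fmap, kernel_size=s)
                  for s in self.scales]
        return torch.cat([p.flatten(2).transpose(1, 2) for p in pooled], dim=1)

    def forward(self, probe, donor, h, w):
        # probe, donor: (B, h*w, C) token sequences from the two images.
        # Keys/values mix both images across scales, so the attention weights
        # double as correlation matching while the probe features are refined.
        kv = torch.cat([self._multi_scale(probe, h, w),
                        self._multi_scale(donor, h, w)], dim=1)
        out, _ = self.attn(self.norm(probe), kv, kv)
        return probe + out  # residual update of the probe tokens
```

A symmetric block with the donor tokens as queries would update the donor side; stacking several such blocks interleaves feature learning with matching, which is the coupling the abstract describes.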
Related papers
- Discriminative Co-Saliency and Background Mining Transformer for Co-Salient Object Detection [111.04994415248736]
We propose a Discriminative co-saliency and background Mining Transformer framework (DMT).
We use two types of pre-defined tokens to mine co-saliency and background information via our proposed contrast-induced pixel-to-token correlation and co-saliency token-to-token correlation modules.
Experimental results on three benchmark datasets demonstrate the effectiveness of our proposed method.
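As a rough, hedged sketch of the pixel-to-token idea (not the DMT implementation; the two-token setup below is an assumption), learnable tokens can be correlated with pixel embeddings to produce per-pixel co-saliency and background scores:

```python
# Illustrative sketch of correlating learnable tokens with pixel features,
# in the spirit of pixel-to-token correlation; not the DMT implementation.
import torch
import torch.nn as nn


class PixelTokenCorrelation(nn.Module):
    def __init__(self, dim=256):
        super().__init__()
        # One co-saliency token and one background token (assumed setup).
        self.tokens = nn.Parameter(torch.randn(2, dim))

    def forward(self, feats):
        # feats: (B, C, H, W) pixel embeddings -> (B, 2, H, W) scores.
        b, c, h, w = feats.shape
        pixels = feats.flatten(2)                 # (B, C, H*W)
        scores = torch.einsum('tc,bcn->btn', self.tokens, pixels)
        return scores.reshape(b, 2, h, w)         # per-pixel correlations
```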
arXiv Detail & Related papers (2023-04-30T15:56:47Z)
- FECANet: Boosting Few-Shot Semantic Segmentation with Feature-Enhanced Context-Aware Network [48.912196729711624]
Few-shot semantic segmentation is the task of learning to locate each pixel of a novel class in a query image with only a few annotated support images.
We propose a Feature-Enhanced Context-Aware Network (FECANet) to suppress the matching noise caused by inter-class local similarity.
In addition, we propose a novel correlation reconstruction module that encodes extra correspondence relations between foreground and background and multi-scale context semantic features.
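The dense correlation that such modules operate on can be sketched generically as a cosine-similarity volume between every query location and every support location (a minimal sketch, not FECANet's released code):

```python
# Generic sketch of a dense correlation volume between query and support
# feature maps (cosine similarity); not FECANet's implementation.
import torch
import torch.nn.functional as F


def correlation_volume(query, support):
    # query: (B, C, Hq, Wq), support: (B, C, Hs, Ws)
    # returns (B, Hq*Wq, Hs*Ws) pairwise cosine similarities.
    q = F.normalize(query.flatten(2), dim=1)     # (B, C, Hq*Wq)
    s = F.normalize(support.flatten(2), dim=1)   # (B, C, Hs*Ws)
    return torch.einsum('bcq,bcs->bqs', q, s)
```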
arXiv Detail & Related papers (2023-01-19T16:31:13Z)
- Single Stage Virtual Try-on via Deformable Attention Flows [51.70606454288168]
Virtual try-on aims to generate a photo-realistic fitting result given an in-shop garment and a reference person image.
We develop a novel Deformable Attention Flow (DAFlow) which applies the deformable attention scheme to multi-flow estimation.
Our proposed method achieves state-of-the-art performance both qualitatively and quantitatively.
arXiv Detail & Related papers (2022-07-19T10:01:31Z)
- Correlation-Aware Deep Tracking [83.51092789908677]
We propose a novel target-dependent feature network inspired by the self-/cross-attention scheme.
Our network deeply embeds cross-image feature correlation in multiple layers of the feature network.
Our model can be flexibly pre-trained on abundant unpaired images, leading to notably faster convergence than existing methods.
arXiv Detail & Related papers (2022-03-03T11:53:54Z)
- Learning Contrastive Representation for Semantic Correspondence [150.29135856909477]
We propose a multi-level contrastive learning approach for semantic matching.
We show that image-level contrastive learning is a key component to encourage the convolutional features to find correspondence between similar objects.
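Image-level contrastive learning of this kind typically reduces to an InfoNCE objective over embeddings of two views of the same image; the sketch below is a generic formulation, not the paper's exact loss:

```python
# Minimal InfoNCE sketch for image-level contrastive learning; a generic
# formulation, not the exact loss used in the paper.
import torch
import torch.nn.functional as F


def info_nce(z1, z2, temperature=0.07):
    # z1, z2: (B, D) embeddings of two views of the same B images.
    z1, z2 = F.normalize(z1, dim=1), F.normalize(z2, dim=1)
    logits = z1 @ z2.t() / temperature          # (B, B) similarity matrix
    targets = torch.arange(z1.size(0), device=z1.device)
    # Matching views are positives (the diagonal); all others are negatives.
    return F.cross_entropy(logits, targets)
```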
arXiv Detail & Related papers (2021-09-22T18:34:14Z)
- Deep Relational Metric Learning [84.95793654872399]
This paper presents a deep relational metric learning framework for image clustering and retrieval.
We learn an ensemble of features that characterizes an image from different aspects to model both interclass and intraclass distributions.
Experiments on the widely-used CUB-200-2011, Cars196, and Stanford Online Products datasets demonstrate that our framework improves existing deep metric learning methods and achieves very competitive results.
arXiv Detail & Related papers (2021-08-23T09:31:18Z)
- Multimodal Contrastive Training for Visual Representation Learning [45.94662252627284]
We develop an approach to learning visual representations that embraces multimodal data.
Our method exploits intrinsic data properties within each modality and semantic information from cross-modal correlation simultaneously.
By including multimodal training in a unified framework, our method can learn more powerful and generic visual features.
arXiv Detail & Related papers (2021-04-26T19:23:36Z)