A Multi-Stream Fusion Network for Image Splicing Localization
- URL: http://arxiv.org/abs/2212.01128v1
- Date: Fri, 2 Dec 2022 12:17:53 GMT
- Title: A Multi-Stream Fusion Network for Image Splicing Localization
- Authors: Maria Siopi, Giorgos Kordopatis-Zilos, Polychronis Charitidis, Ioannis Kompatsiaris, and Symeon Papadopoulos
- Abstract summary: We propose an encoder-decoder architecture that consists of multiple encoder streams.
Each stream is fed with either the tampered image or handcrafted signals and processes them separately to capture relevant information from each one independently.
The extracted features from the multiple streams are fused in the bottleneck of the architecture and propagated to the decoder network that generates the output localization map.
- Score: 18.505512386111985
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this paper, we address the problem of image splicing localization with a
multi-stream network architecture that processes the raw RGB image in parallel
with other handcrafted forensic signals. Unlike previous methods that either
use only the RGB images or stack several signals in a channel-wise manner, we
propose an encoder-decoder architecture that consists of multiple encoder
streams. Each stream is fed with either the tampered image or handcrafted
signals and processes them separately to capture relevant information from each
one independently. Finally, the extracted features from the multiple streams
are fused in the bottleneck of the architecture and propagated to the decoder
network that generates the output localization map. We experiment with two
handcrafted algorithms, i.e., DCT and Splicebuster. Our proposed approach is
benchmarked on three public forensics datasets, demonstrating competitive
performance against several competing methods and achieving state-of-the-art
results, e.g., 0.898 AUC on CASIA.
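The data flow described in the abstract (separate encoder streams, fusion at the bottleneck, a single decoder) can be illustrated with a shape-level sketch. This is not the authors' implementation: the pooling-plus-projection encoder, the choice of channel concatenation as the fusion operator, the upsampling decoder, the channel counts, and the random untrained weights are all assumptions made for illustration, and the DCT and Splicebuster maps are replaced by random stand-in arrays.

```python
import numpy as np

rng = np.random.default_rng(0)

def encoder_stream(x, out_channels=16, factor=4):
    """Hypothetical encoder: average-pool by `factor`, then a random 1x1 projection and ReLU."""
    h, w, c = x.shape
    pooled = x.reshape(h // factor, factor, w // factor, factor, c).mean(axis=(1, 3))
    weights = rng.standard_normal((c, out_channels)) / np.sqrt(c)
    return np.maximum(pooled @ weights, 0.0)

def fuse(features):
    """Bottleneck fusion, assumed here to be concatenation along the channel axis."""
    return np.concatenate(features, axis=-1)

def decoder(z, factor=4):
    """Hypothetical decoder: 1x1 projection to one channel, nearest-neighbour upsampling, sigmoid."""
    weights = rng.standard_normal((z.shape[-1], 1)) / np.sqrt(z.shape[-1])
    logits = (z @ weights)[..., 0]
    upsampled = np.repeat(np.repeat(logits, factor, axis=0), factor, axis=1)
    return 1.0 / (1.0 + np.exp(-upsampled))

def localize(rgb, handcrafted_signals):
    """Encode each input in its own stream, fuse at the bottleneck, decode one map."""
    streams = [encoder_stream(rgb)]
    streams += [encoder_stream(signal) for signal in handcrafted_signals]
    return decoder(fuse(streams))

# Random stand-ins for the tampered image and the two handcrafted signal maps.
rgb = rng.random((64, 64, 3))
dct_map = rng.random((64, 64, 1))
splicebuster_map = rng.random((64, 64, 1))

fused = fuse([encoder_stream(rgb), encoder_stream(dct_map), encoder_stream(splicebuster_map)])
mask = localize(rgb, [dct_map, splicebuster_map])
print(fused.shape, mask.shape)
```

The point of the sketch is the contrast with channel-wise stacking: each input keeps its own encoder until the bottleneck, so the fused tensor carries one block of channels per stream, and only the decoder sees the combined features.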
Related papers
- A TextGCN-Based Decoding Approach for Improving Remote Sensing Image Captioning [0.15346678870160887]
We propose a novel encoder-decoder setup that deploys a Text Graph Convolutional Network (TextGCN) and multi-layer LSTMs.
The embeddings generated by TextGCN enhance the decoder's understanding by capturing the semantic relationships among words at both the sentence and corpus levels.
We present an extensive evaluation of our approach against various other state-of-the-art encoder-decoder frameworks.
arXiv Detail & Related papers (2024-09-27T06:12:31Z)
- Neural Distributed Image Compression with Cross-Attention Feature Alignment [1.2234742322758418]
We consider a pair of stereo images, which have overlapping fields of view, captured by a synchronized and calibrated pair of cameras.
We assume that one image of the pair is to be compressed and transmitted, while the other image is available only at the decoder.
In the proposed architecture, the encoder maps the input image to a latent space using a DNN, quantizes the latent representation, and compresses it losslessly using entropy coding.
arXiv Detail & Related papers (2022-07-18T10:15:04Z)
- Learning Enriched Features for Fast Image Restoration and Enhancement [166.17296369600774]
This paper presents an architecture with the holistic goal of maintaining spatially-precise high-resolution representations through the entire network.
We learn an enriched set of features that combines contextual information from multiple scales, while simultaneously preserving the high-resolution spatial details.
Our approach achieves state-of-the-art results for a variety of image processing tasks, including defocus deblurring, image denoising, super-resolution, and image enhancement.
arXiv Detail & Related papers (2022-04-19T17:59:45Z)
- Adjacent Context Coordination Network for Salient Object Detection in Optical Remote Sensing Images [102.75699068451166]
We propose a novel Adjacent Context Coordination Network (ACCoNet) to explore the coordination of adjacent features in an encoder-decoder architecture for optical RSI-SOD.
The proposed ACCoNet outperforms 22 state-of-the-art methods under nine evaluation metrics, and runs up to 81 fps on a single NVIDIA Titan X GPU.
arXiv Detail & Related papers (2022-03-25T14:14:55Z)
- LoopITR: Combining Dual and Cross Encoder Architectures for Image-Text Retrieval [117.15862403330121]
We propose LoopITR, which combines dual encoders and cross encoders in the same network for joint learning.
Specifically, we let the dual encoder provide hard negatives to the cross encoder, and use the more discriminative cross encoder to distill its predictions back to the dual encoder.
arXiv Detail & Related papers (2022-03-10T16:41:12Z)
- Small Lesion Segmentation in Brain MRIs with Subpixel Embedding [105.1223735549524]
We present a method to segment MRI scans of the human brain into ischemic stroke lesion and normal tissues.
We propose a neural network architecture in the form of a standard encoder-decoder where predictions are guided by a spatial expansion embedding network.
arXiv Detail & Related papers (2021-09-18T00:21:17Z)
- Convolutional Autoencoder for Blind Hyperspectral Image Unmixing [0.0]
Spectral unmixing is a technique to decompose a mixed pixel into two fundamental representatives: endmembers and abundances.
In this paper, a novel architecture is proposed to perform blind unmixing on hyperspectral images.
arXiv Detail & Related papers (2020-11-18T17:41:31Z)
- Two-stream Encoder-Decoder Network for Localizing Image Forgeries [4.982505311411925]
We propose a novel two-stream encoder-decoder network, which utilizes both the high-level and the low-level image features.
We have carried out experimental analysis on multiple standard forensics datasets to evaluate the performance of the proposed method.
arXiv Detail & Related papers (2020-09-27T15:49:17Z)
- Wireless Image Retrieval at the Edge [20.45405359815043]
We study the image retrieval problem at the wireless edge, where an edge device captures an image, which is then used to retrieve similar images from an edge server.
Our goal is to maximize the accuracy of the retrieval task under power and bandwidth constraints over the wireless link.
We propose two alternative schemes based on digital and analog communications, respectively.
arXiv Detail & Related papers (2020-07-21T16:15:40Z)
- Beyond Single Stage Encoder-Decoder Networks: Deep Decoders for Semantic Image Segmentation [56.44853893149365]
Single encoder-decoder methodologies for semantic segmentation are reaching their peak in terms of segmentation quality and efficiency per number of layers.
We propose a new architecture based on a decoder which uses a set of shallow networks for capturing more information content.
In order to further improve the architecture we introduce a weight function which aims to re-balance classes to increase the attention of the networks to under-represented objects.
arXiv Detail & Related papers (2020-07-19T18:44:34Z)
- Learning Enriched Features for Real Image Restoration and Enhancement [166.17296369600774]
Convolutional neural networks (CNNs) have achieved dramatic improvements over conventional approaches for image restoration tasks.
We present a novel architecture with the collective goals of maintaining spatially-precise high-resolution representations through the entire network.
Our approach learns an enriched set of features that combines contextual information from multiple scales, while simultaneously preserving the high-resolution spatial details.
arXiv Detail & Related papers (2020-03-15T11:04:30Z)
This list is automatically generated from the titles and abstracts of the papers on this site.