Related papers: Pixel-Inconsistency Modeling for Image Manipulation Localization

Pixel-Inconsistency Modeling for Image Manipulation Localization

URL: http://arxiv.org/abs/2310.00234v2
Date: Tue, 19 Nov 2024 13:34:58 GMT
Title: Pixel-Inconsistency Modeling for Image Manipulation Localization
Authors: Chenqi Kong, Anwei Luo, Shiqi Wang, Haoliang Li, Anderson Rocha, Alex C. Kot,
Abstract summary: Digital image forensics plays a crucial role in image authentication and manipulation localization. This paper presents a generalized and robust manipulation localization model through the analysis of pixel inconsistency artifacts. Experiments show that our method successfully extracts inherent pixel-inconsistency forgery fingerprints.
Score: 59.968362815126326
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Digital image forensics plays a crucial role in image authentication and manipulation localization. Despite the progress powered by deep neural networks, existing forgery localization methodologies exhibit limitations when deployed to unseen datasets and perturbed images (i.e., lack of generalization and robustness to real-world applications). To circumvent these problems and aid image integrity, this paper presents a generalized and robust manipulation localization model through the analysis of pixel inconsistency artifacts. The rationale is grounded on the observation that most image signal processors (ISP) involve the demosaicing process, which introduces pixel correlations in pristine images. Moreover, manipulating operations, including splicing, copy-move, and inpainting, directly affect such pixel regularity. We, therefore, first split the input image into several blocks and design masked self-attention mechanisms to model the global pixel dependency in input images. Simultaneously, we optimize another local pixel dependency stream to mine local manipulation clues within input forgery images. In addition, we design novel Learning-to-Weight Modules (LWM) to combine features from the two streams, thereby enhancing the final forgery localization performance. To improve the training process, we propose a novel Pixel-Inconsistency Data Augmentation (PIDA) strategy, driving the model to focus on capturing inherent pixel-level artifacts instead of mining semantic forgery traces. This work establishes a comprehensive benchmark integrating 15 representative detection models across 12 datasets. Extensive experiments show that our method successfully extracts inherent pixel-inconsistency forgery fingerprints and achieve state-of-the-art generalization and robustness performances in image manipulation localization.

Related papers

Context-Aware Weakly Supervised Image Manipulation Localization with SAM Refinement [52.15627062770557]
Malicious image manipulation poses societal risks, increasing the importance of effective image manipulation detection methods. Recent approaches in image manipulation detection have largely been driven by fully supervised approaches. We present a novel weakly supervised framework based on a dual-branch Transformer-CNN architecture.
arXiv Detail & Related papers (2025-03-26T07:35:09Z)
Exploring Multi-view Pixel Contrast for General and Robust Image Forgery Localization [4.8454936010479335]
We propose a Multi-view Pixel-wise Contrastive algorithm (MPC) for image forgery localization. Specifically, we first pre-train the backbone network with the supervised contrastive loss. Then the localization head is fine-tuned using the cross-entropy loss, resulting in a better pixel localizer.
arXiv Detail & Related papers (2024-06-19T13:51:52Z)
Learning Invariant Inter-pixel Correlations for Superpixel Generation [12.605604620139497]
Learnable features exhibit constrained discriminative capability, resulting in unsatisfactory pixel grouping performance. We propose the Content Disentangle Superpixel algorithm to selectively separate the invariant inter-pixel correlations and statistical properties. The experimental results on four benchmark datasets demonstrate the superiority of our approach to existing state-of-the-art methods.
arXiv Detail & Related papers (2024-02-28T09:46:56Z)
Pixel Adapter: A Graph-Based Post-Processing Approach for Scene Text Image Super-Resolution [22.60056946339325]
We propose the Pixel Adapter Module (PAM) based on graph attention to address pixel distortion caused by upsampling. The PAM effectively captures local structural information by allowing each pixel to interact with its neighbors and update features. We demonstrate that our proposed method generates high-quality super-resolution images, surpassing existing methods in recognition accuracy.
arXiv Detail & Related papers (2023-09-16T08:12:12Z)
ISSTAD: Incremental Self-Supervised Learning Based on Transformer for Anomaly Detection and Localization [12.975540251326683]
We introduce a novel approach based on the Transformer backbone network. We train a Masked Autoencoder (MAE) model solely on normal images. In the subsequent stage, we apply pixel-level data augmentation techniques to generate corrupted normal images. This process allows the model to learn how to repair corrupted regions and classify the status of each pixel.
arXiv Detail & Related papers (2023-03-30T13:11:26Z)
Towards Effective Image Manipulation Detection with Proposal Contrastive Learning [61.5469708038966]
We propose Proposal Contrastive Learning (PCL) for effective image manipulation detection. Our PCL consists of a two-stream architecture by extracting two types of global features from RGB and noise views respectively. Our PCL can be easily adapted to unlabeled data in practice, which can reduce manual labeling costs and promote more generalizable features.
arXiv Detail & Related papers (2022-10-16T13:30:13Z)
Joint Learning of Deep Texture and High-Frequency Features for Computer-Generated Image Detection [24.098604827919203]
We propose a joint learning strategy with deep texture and high-frequency features for CG image detection. A semantic segmentation map is generated to guide the affine transformation operation. The combination of the original image and the high-frequency components of the original and rendered images are fed into a multi-branch neural network equipped with attention mechanisms.
arXiv Detail & Related papers (2022-09-07T17:30:40Z)
Learning Enriched Features for Fast Image Restoration and Enhancement [166.17296369600774]
This paper presents a holistic goal of maintaining spatially-precise high-resolution representations through the entire network. We learn an enriched set of features that combines contextual information from multiple scales, while simultaneously preserving the high-resolution spatial details. Our approach achieves state-of-the-art results for a variety of image processing tasks, including defocus deblurring, image denoising, super-resolution, and image enhancement.
arXiv Detail & Related papers (2022-04-19T17:59:45Z)
Spatially-Adaptive Image Restoration using Distortion-Guided Networks [51.89245800461537]
We present a learning-based solution for restoring images suffering from spatially-varying degradations. We propose SPAIR, a network design that harnesses distortion-localization information and dynamically adjusts to difficult regions in the image.
arXiv Detail & Related papers (2021-08-19T11:02:25Z)
Learning Spatial and Spatio-Temporal Pixel Aggregations for Image and Video Denoising [104.59305271099967]
We present a pixel aggregation network and learn the pixel sampling and averaging strategies for image denoising. We develop a pixel aggregation network for video denoising to sample pixels across the spatial-temporal space. Our method is able to solve the misalignment issues caused by large motion in dynamic scenes.
arXiv Detail & Related papers (2021-01-26T13:00:46Z)
Learning Enriched Features for Real Image Restoration and Enhancement [166.17296369600774]
convolutional neural networks (CNNs) have achieved dramatic improvements over conventional approaches for image restoration task. We present a novel architecture with the collective goals of maintaining spatially-precise high-resolution representations through the entire network. Our approach learns an enriched set of features that combines contextual information from multiple scales, while simultaneously preserving the high-resolution spatial details.
arXiv Detail & Related papers (2020-03-15T11:04:30Z)

This list is automatically generated from the titles and abstracts of the papers in this site.