SAFIRE: Segment Any Forged Image Region
- URL: http://arxiv.org/abs/2412.08197v1
- Date: Wed, 11 Dec 2024 08:40:37 GMT
- Title: SAFIRE: Segment Any Forged Image Region
- Authors: Myung-Joon Kwon, Wonjun Lee, Seung-Hun Nam, Minji Son, Changick Kim
- Abstract summary: We propose Segment Any Forged Image Region (SAFIRE), which solves forgery localization using point prompting.
Instead of memorizing certain forgery traces, SAFIRE naturally focuses on uniform characteristics within each source region.
This approach leads to more stable and effective learning, achieving superior performance in both the new task and the traditional binary forgery localization.
- Score: 16.97096271263231
- Abstract: Most techniques approach the problem of image forgery localization as a binary segmentation task, training neural networks to label original areas as 0 and forged areas as 1. In contrast, we tackle this issue from a more fundamental perspective by partitioning images according to their originating sources. To this end, we propose Segment Any Forged Image Region (SAFIRE), which solves forgery localization using point prompting. Each point on an image is used to segment the source region containing itself. This allows us to partition images into multiple source regions, a capability achieved for the first time. Additionally, rather than memorizing certain forgery traces, SAFIRE naturally focuses on uniform characteristics within each source region. This approach leads to more stable and effective learning, achieving superior performance in both the new task and the traditional binary forgery localization.
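The abstract's core idea, using each prompted point to segment the source region containing it and thereby partitioning the image into multiple source regions, can be illustrated with a toy routine. Here a hypothetical `segment_from_point` stub (a flood fill over identical pixel values) stands in for SAFIRE's learned point-prompted segmenter; the partitioning loop is the part the sketch is meant to show.

```python
from collections import deque

def segment_from_point(img, y, x):
    """Toy stand-in for a learned point-prompted segmenter:
    flood-fills the connected region sharing img[y][x]'s value."""
    h, w = len(img), len(img[0])
    target = img[y][x]
    mask = [[False] * w for _ in range(h)]
    queue = deque([(y, x)])
    mask[y][x] = True
    while queue:
        cy, cx = queue.popleft()
        for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            ny, nx = cy + dy, cx + dx
            if 0 <= ny < h and 0 <= nx < w and not mask[ny][nx] \
                    and img[ny][nx] == target:
                mask[ny][nx] = True
                queue.append((ny, nx))
    return mask

def partition_sources(img):
    """Partition the image into source regions by prompting every
    still-unassigned pixel and merging the resulting masks."""
    h, w = len(img), len(img[0])
    label = [[-1] * w for _ in range(h)]
    regions = 0
    for y in range(h):
        for x in range(w):
            if label[y][x] == -1:
                mask = segment_from_point(img, y, x)
                for yy in range(h):
                    for xx in range(w):
                        if mask[yy][xx]:
                            label[yy][xx] = regions
                regions += 1
    return label, regions

# A 4x4 "image": a spliced 2x2 patch (value 1) inside an authentic background (0)
img = [[0, 0, 0, 0],
       [0, 1, 1, 0],
       [0, 1, 1, 0],
       [0, 0, 0, 0]]
labels, n = partition_sources(img)
print(n)  # 2 source regions: authentic background and spliced patch
```

In the actual method the per-point segmentation comes from a trained model rather than a flood fill, but the same merge logic yields a multi-region source map instead of a single binary forgery mask.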
Related papers
- Image Copy-Move Forgery Detection via Deep PatchMatch and Pairwise Ranking Learning [39.85737063875394]
This study develops a novel end-to-end CMFD framework that integrates the strengths of conventional and deep learning methods.
Unlike existing deep models, our approach utilizes features extracted from high-resolution scales to seek explicit and reliable point-to-point matching.
By leveraging the strong prior of point-to-point matches, the framework can identify subtle differences and effectively discriminate between source and target regions.
arXiv Detail & Related papers (2024-04-26T10:38:17Z) - CFL-Net: Image Forgery Localization Using Contrastive Learning [16.668334854459143]
We use a contrastive loss to learn a mapping into a feature space where, for each image, the features of untampered and manipulated regions are well separated.
Our method has the advantage of localizing manipulated region without requiring any prior knowledge or assumption about the forgery type.
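The separation idea behind this entry, pulling pixel features with the same label (tampered or pristine) together while pushing the two groups apart, can be sketched with a minimal supervised contrastive loss. The toy features and labels below are illustrative only, not CFL-Net's actual tensors or loss implementation.

```python
import math

def cosine(u, v):
    """Cosine similarity between two feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

def supervised_contrastive_loss(features, labels, tau=0.1):
    """InfoNCE-style loss: for each anchor, same-label pixels are
    positives; all other pixels appear in the normalizer."""
    n = len(features)
    total, count = 0.0, 0
    for i in range(n):
        positives = [j for j in range(n) if j != i and labels[j] == labels[i]]
        if not positives:
            continue
        denom = sum(math.exp(cosine(features[i], features[j]) / tau)
                    for j in range(n) if j != i)
        for j in positives:
            sim = math.exp(cosine(features[i], features[j]) / tau)
            total += -math.log(sim / denom)
            count += 1
    return total / count

# Toy pixel features: manipulated pixels cluster near (1, 0),
# untampered pixels near (0, 1)
feats = [(1.0, 0.1), (0.9, 0.0), (0.1, 1.0), (0.0, 0.9)]
labs = [1, 1, 0, 0]  # 1 = manipulated, 0 = untampered
print(supervised_contrastive_loss(feats, labs))
```

Well-separated features under the correct labels give a lower loss than the same features under shuffled labels, which is exactly the pressure that drives tampered and pristine regions apart in feature space.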
arXiv Detail & Related papers (2022-10-04T15:31:30Z)
- Location-Free Camouflage Generation Network [82.74353843283407]
Camouflage is a common visual phenomenon that refers to hiding foreground objects in background images, making them briefly invisible to the human eye.
This paper proposes a novel Location-free Camouflage Generation Network (LCG-Net) that fuses high-level features of the foreground and background images and generates the result in a single inference pass.
Experiments show that our method matches the state of the art in single-appearance regions, although its results are less likely to be completely invisible, and far exceeds state-of-the-art quality in multi-appearance regions.
arXiv Detail & Related papers (2022-03-18T10:33:40Z)
- Point-Level Region Contrast for Object Detection Pre-Training [147.47349344401806]
We present point-level region contrast, a self-supervised pre-training approach for the task of object detection.
Our approach performs contrastive learning by directly sampling individual point pairs from different regions.
Compared to an aggregated representation per region, our approach is more robust to the change in input region quality.
arXiv Detail & Related papers (2022-02-09T18:56:41Z)
- RegionCLIP: Region-based Language-Image Pretraining [94.29924084715316]
Contrastive language-image pretraining (CLIP) using image-text pairs has achieved impressive results on image classification.
We propose a new method called RegionCLIP that significantly extends CLIP to learn region-level visual representations.
Our method significantly outperforms the state of the art by 3.8 AP50 and 2.2 AP for novel categories on COCO and LVIS datasets.
arXiv Detail & Related papers (2021-12-16T18:39:36Z)
- GLocal: Global Graph Reasoning and Local Structure Transfer for Person Image Generation [2.580765958706854]
We focus on person image generation, namely, generating a person image under various conditions, e.g., corrupted texture or a different pose.
We present a GLocal framework to improve the occlusion-aware texture estimation by globally reasoning the style inter-correlations among different semantic regions.
For local structural information preservation, we further extract the local structure of the source image and regain it in the generated image via local structure transfer.
arXiv Detail & Related papers (2021-12-01T03:54:30Z)
- From Contexts to Locality: Ultra-high Resolution Image Segmentation via Locality-aware Contextual Correlation [43.70432772819461]
We improve the widely used high-resolution image segmentation pipeline.
An ultra-high resolution image is partitioned into regular patches for local segmentation and then the local results are merged into a high-resolution semantic mask.
arXiv Detail & Related papers (2021-09-06T16:26:05Z)
- BaMBNet: A Blur-aware Multi-branch Network for Defocus Deblurring [74.34263243089688]
Convolutional neural networks (CNNs) have been introduced to the defocus deblurring problem and have achieved significant progress.
This study designs a novel blur-aware multi-branch network (BaMBNet) in which regions with different blur amounts are treated differently.
Both quantitative and qualitative experiments demonstrate that our BaMBNet outperforms the state-of-the-art methods.
arXiv Detail & Related papers (2021-05-31T07:55:30Z)
- Rethinking of the Image Salient Object Detection: Object-level Semantic Saliency Re-ranking First, Pixel-wise Saliency Refinement Latter [62.26677215668959]
We propose a lightweight, weakly supervised deep network to coarsely locate semantically salient regions.
We then fuse multiple off-the-shelf deep models on these semantically salient regions as the pixel-wise saliency refinement.
Our method is simple yet effective, and is the first attempt to treat salient object detection mainly as an object-level semantic re-ranking problem.
arXiv Detail & Related papers (2020-08-10T07:12:43Z)
- TriGAN: Image-to-Image Translation for Multi-Source Domain Adaptation [82.52514546441247]
We propose the first approach for Multi-Source Domain Adaptation (MSDA) based on Generative Adversarial Networks.
Our method is inspired by the observation that the appearance of a given image depends on three factors: the domain, the style and the content.
We test our approach using common MSDA benchmarks, showing that it outperforms state-of-the-art methods.
arXiv Detail & Related papers (2020-04-19T05:07:22Z)
This list is automatically generated from the titles and abstracts of the papers in this site.