CamDiff: Camouflage Image Augmentation via Diffusion Model
- URL: http://arxiv.org/abs/2304.05469v1
- Date: Tue, 11 Apr 2023 19:37:47 GMT
- Title: CamDiff: Camouflage Image Augmentation via Diffusion Model
- Authors: Xue-Jing Luo, Shuo Wang, Zongwei Wu, Christos Sakaridis, Yun Cheng,
Deng-Ping Fan, Luc Van Gool
- Abstract summary: CamDiff is a novel approach that leverages the latent diffusion model to synthesize salient objects in camouflaged scenes.
Our approach enables flexible editing and efficient large-scale dataset generation at a low cost.
- Score: 83.35960536063857
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The burgeoning field of camouflaged object detection (COD) seeks to identify
objects that blend into their surroundings. Despite the impressive performance
of recent models, we have identified a limitation in their robustness, where
existing methods may misclassify salient objects as camouflaged ones, despite
these two characteristics being contradictory. This limitation may stem from
lacking multi-pattern training images, leading to less saliency robustness. To
address this issue, we introduce CamDiff, a novel approach inspired by
AI-Generated Content (AIGC) that overcomes the scarcity of multi-pattern
training images. Specifically, we leverage the latent diffusion model to
synthesize salient objects in camouflaged scenes, while using the zero-shot
image classification ability of the Contrastive Language-Image Pre-training
(CLIP) model to prevent synthesis failures and ensure the synthesized object
aligns with the input prompt. Consequently, the synthesized image retains its
original camouflage label while incorporating salient objects, yielding
camouflage samples with richer characteristics. The results of user studies
show that the salient objects in the scenes synthesized by our framework
attract the user's attention more; thus, such samples pose a greater challenge
to the existing COD models. Our approach enables flexible editing and efficient
large-scale dataset generation at a low cost. It significantly enhances COD
baselines' training and testing phases, emphasizing robustness across diverse
domains. Our newly-generated datasets and source code are available at
https://github.com/drlxj/CamDiff.
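The CLIP-based check described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes image and text embeddings have already been computed by some CLIP-like encoder, and every name below (`zero_shot_probs`, `accept_synthesis`, `synthesize_with_retries`, the `generate` and `embed` callables, the 0.5 threshold, the scale of 100) is hypothetical and chosen only for illustration.

```python
import math
from typing import Callable, Optional, Sequence

Vector = Sequence[float]


def cosine(u: Vector, v: Vector) -> float:
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        raise ValueError("embeddings must be non-zero")
    return dot / (norm_u * norm_v)


def zero_shot_probs(image_emb: Vector, text_embs: Sequence[Vector],
                    scale: float = 100.0) -> list:
    """Softmax over scaled cosine similarities (zero-shot classification).

    The scale of 100 is an assumption standing in for a learned logit scale.
    """
    logits = [scale * cosine(image_emb, t) for t in text_embs]
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def accept_synthesis(image_emb: Vector, text_embs: Sequence[Vector],
                     target_index: int, min_prob: float = 0.5) -> bool:
    """Accept only if the prompt's class is the top prediction and confident enough."""
    probs = zero_shot_probs(image_emb, text_embs)
    best = max(range(len(probs)), key=probs.__getitem__)
    return best == target_index and probs[target_index] >= min_prob


def synthesize_with_retries(generate: Callable[[int], object],
                            embed: Callable[[object], Vector],
                            text_embs: Sequence[Vector],
                            target_index: int,
                            max_tries: int = 5) -> Optional[object]:
    """Call a (hypothetical) generator until a candidate passes the check."""
    for attempt in range(max_tries):
        candidate = generate(attempt)
        if accept_synthesis(embed(candidate), text_embs, target_index):
            return candidate
    return None  # every attempt was rejected as a synthesis failure
```

In a real pipeline, `generate` would wrap a diffusion-based inpainting call and `embed` an image encoder. Here they are left as plain callables so the filtering logic can be read and tested on its own.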
Related papers
- BD-Diff: Generative Diffusion Model for Image Deblurring on Unknown Domains with Blur-Decoupled Learning [55.21345354747609]
BD-Diff is a generative-diffusion-based model designed to enhance deblurring performance on unknown domains.
We employ two separate Q-Formers as extractors of structural representations and blur patterns, respectively.
We introduce a reconstruction task to make the structural features and blur patterns complementary.
arXiv Detail & Related papers (2025-02-03T17:00:40Z)
- CGCOD: Class-Guided Camouflaged Object Detection [19.959268087062217]
We introduce class-guided camouflaged object detection (CGCOD), which extends the traditional COD task by incorporating object-specific class knowledge.
We propose a multi-stage framework, CGNet, which incorporates a plug-and-play class prompt generator and a simple yet effective class-guided detector.
This establishes a new paradigm for COD, bridging the gap between contextual understanding and class-guided detection.
arXiv Detail & Related papers (2024-12-25T19:38:32Z)
- Unconstrained Salient and Camouflaged Object Detection [4.698538612738126]
We introduce a benchmark called Unconstrained Salient and Camouflaged Object Detection (USCOD).
USCOD supports the simultaneous detection of salient and camouflaged objects in unconstrained scenes, regardless of their presence.
To address this challenge, we propose USCNet, a baseline model for USCOD that decouples the learning of attribute distinction from mask reconstruction.
arXiv Detail & Related papers (2024-12-14T19:37:17Z)
- ZoomNeXt: A Unified Collaborative Pyramid Network for Camouflaged Object Detection [70.11264880907652]
Recent camouflaged object detection (COD) attempts to segment objects visually blended into their surroundings, which is extremely complex and difficult in real-world scenarios.
We propose an effective unified collaborative pyramid network that mimics human behavior when observing vague images and videos, i.e., zooming in and out.
Our framework consistently outperforms existing state-of-the-art methods in image and video COD benchmarks.
arXiv Detail & Related papers (2023-10-31T06:11:23Z)
- CamoFA: A Learnable Fourier-based Augmentation for Camouflage Segmentation [25.557135220449236]
We propose a learnable augmentation method for camouflaged object detection (COD) and camouflaged instance segmentation (CIS).
Our proposed augmentation method boosts the performance of camouflaged object detectors and instance segmenters by large margins.
arXiv Detail & Related papers (2023-08-29T22:43:46Z)
- Camouflaged Image Synthesis Is All You Need to Boost Camouflaged Detection [65.8867003376637]
We propose a framework for synthesizing camouflage data to enhance the detection of camouflaged objects in natural scenes.
Our approach employs a generative model to produce realistic camouflage images, which can be used to train existing object detection models.
Our framework outperforms the current state-of-the-art method on three datasets.
arXiv Detail & Related papers (2023-08-13T06:55:05Z)
- Person Image Synthesis via Denoising Diffusion Model [116.34633988927429]
We show how denoising diffusion models can be applied for high-fidelity person image synthesis.
Our results on two large-scale benchmarks and a user study demonstrate the photorealism of our proposed approach under challenging scenarios.
arXiv Detail & Related papers (2022-11-22T18:59:50Z)
- High-resolution Iterative Feedback Network for Camouflaged Object Detection [128.893782016078]
Spotting camouflaged objects that are visually assimilated into the background is tricky for object detection algorithms.
We aim to extract the high-resolution texture details to avoid the detail degradation that causes blurred vision in edges and boundaries.
We introduce a novel HitNet to refine the low-resolution representations by high-resolution features in an iterative feedback manner.
arXiv Detail & Related papers (2022-03-22T11:20:21Z)
- Zoom In and Out: A Mixed-scale Triplet Network for Camouflaged Object Detection [0.0]
We propose a mixed-scale triplet network, ZoomNet, which mimics the behavior of humans when observing vague images.
Specifically, our ZoomNet employs the zoom strategy to learn the discriminative mixed-scale semantics by the designed scale integration unit and hierarchical mixed-scale unit.
Our proposed highly task-friendly model consistently surpasses 23 existing state-of-the-art methods on four public datasets.
arXiv Detail & Related papers (2022-03-05T09:13:52Z)