Location-Free Camouflage Generation Network
- URL: http://arxiv.org/abs/2203.09845v1
- Date: Fri, 18 Mar 2022 10:33:40 GMT
- Title: Location-Free Camouflage Generation Network
- Authors: Yangyang Li, Wei Zhai, Yang Cao, Zheng-jun Zha
- Abstract summary: Camouflage is a common visual phenomenon, which refers to hiding foreground objects in background images, making them briefly invisible to the human eye.
This paper proposes a novel Location-free Camouflage Generation Network (LCG-Net) that fuses high-level features of the foreground and background images and generates the result in a single inference.
Experiments show that our method achieves results as satisfactory as the state of the art in single-appearance regions, where hidden objects are less likely to be completely invisible, and far exceeds the quality of the state of the art in multi-appearance regions.
- Score: 82.74353843283407
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Camouflage is a common visual phenomenon, which refers to hiding the
foreground objects into the background images, making them briefly invisible to
the human eye. Previous work has typically been implemented by an iterative
optimization process. However, these methods struggle in 1) efficiently
generating camouflage images using foreground and background with arbitrary
structure; 2) camouflaging foreground objects to regions with multiple
appearances (e.g. the junction of the vegetation and the mountains), which
limit their practical application. To address these problems, this paper
proposes a novel Location-free Camouflage Generation Network (LCG-Net) that
fuses high-level features of the foreground and background images and generates
the result in one inference. Specifically, a Position-aligned Structure Fusion
(PSF) module is devised to guide structure feature fusion based on the
point-to-point structure similarity of the foreground and background, and to
introduce local appearance features point-by-point. To retain the necessary identifiable
features, a new immerse loss is adopted under our pipeline, while a background
patch appearance loss is utilized to ensure that the hidden objects look
continuous and natural in regions with multiple appearances. Experiments show
that our method achieves results as satisfactory as the state of the art in
single-appearance regions, where hidden objects are less likely to be completely
invisible, and far exceeds the quality of the state of the art in
multi-appearance regions. Moreover, our method is hundreds of times faster than
previous methods.
Benefitting from the unique advantages of our method, we provide some
downstream applications for camouflage generation, which show its potential.
The related code and dataset will be released at
https://github.com/Tale17/LCG-Net.
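The abstract's core idea is replacing iterative optimization with a single forward pass that blends high-level foreground and background features. As a rough illustration of what one-pass feature blending can look like, here is a generic AdaIN-style statistic transfer in NumPy. This is a hypothetical sketch, not the paper's actual PSF module: the function name `adain_fuse` and the use of per-channel mean/std matching are assumptions for illustration only.

```python
import numpy as np

def adain_fuse(fg_feat, bg_feat, eps=1e-5):
    """Align foreground feature statistics to the background's.

    fg_feat, bg_feat: arrays of shape (C, H, W).
    A generic AdaIN-style statistic transfer illustrating the
    one-pass idea of blending high-level features instead of
    running an iterative optimization; NOT the paper's PSF module.
    """
    fg_mu = fg_feat.mean(axis=(1, 2), keepdims=True)
    fg_std = fg_feat.std(axis=(1, 2), keepdims=True) + eps
    bg_mu = bg_feat.mean(axis=(1, 2), keepdims=True)
    bg_std = bg_feat.std(axis=(1, 2), keepdims=True) + eps
    # Normalize the foreground per channel, then re-scale it with
    # the background's statistics in a single closed-form step.
    return bg_std * (fg_feat - fg_mu) / fg_std + bg_mu

rng = np.random.default_rng(0)
fg = rng.normal(2.0, 3.0, size=(8, 16, 16))   # stand-in foreground features
bg = rng.normal(0.0, 1.0, size=(8, 16, 16))   # stand-in background features
fused = adain_fuse(fg, bg)
# Per-channel means of the fused map now match the background's.
print(np.allclose(fused.mean(axis=(1, 2)), bg.mean(axis=(1, 2)), atol=1e-6))
```

In the paper's setting, such fusion would additionally be guided point-to-point by structure similarity; the closed-form transfer above only conveys why a single inference can replace hundreds of optimization iterations.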
Related papers
- DiffUHaul: A Training-Free Method for Object Dragging in Images [78.93531472479202]
We propose a training-free method, dubbed DiffUHaul, for the object dragging task.
We first apply attention masking in each denoising step to make the generation more disentangled across different objects.
In the early denoising steps, we interpolate the attention features between source and target images to smoothly fuse new layouts with the original appearance.
arXiv Detail & Related papers (2024-06-03T17:59:53Z) - DiAD: A Diffusion-based Framework for Multi-class Anomaly Detection [55.48770333927732]
We propose a Diffusion-based Anomaly Detection (DiAD) framework for multi-class anomaly detection.
It consists of a pixel-space autoencoder, a latent-space Semantic-Guided (SG) network with a connection to the stable diffusion's denoising network, and a feature-space pre-trained feature extractor.
Experiments on MVTec-AD and VisA datasets demonstrate the effectiveness of our approach.
arXiv Detail & Related papers (2023-12-11T18:38:28Z) - ZoomNeXt: A Unified Collaborative Pyramid Network for Camouflaged Object Detection [70.11264880907652]
Recent camouflaged object detection (COD) attempts to segment objects visually blended into their surroundings, which is extremely complex and difficult in real-world scenarios.
We propose an effective unified collaborative pyramid network that mimics human behavior when observing vague images and camouflaged objects, zooming in and out.
Our framework consistently outperforms existing state-of-the-art methods in image and video COD benchmarks.
arXiv Detail & Related papers (2023-10-31T06:11:23Z) - Rethinking the Localization in Weakly Supervised Object Localization [51.29084037301646]
Weakly supervised object localization (WSOL) is one of the most popular and challenging tasks in computer vision.
Recently, dividing WSOL into two parts (class-agnostic object localization and object classification) has become the state-of-the-art pipeline for this task.
We propose to replace SCR with a binary-class detector (BCD) for localizing multiple objects, where the detector is trained by discriminating the foreground and background.
arXiv Detail & Related papers (2023-08-11T14:38:51Z) - The Art of Camouflage: Few-Shot Learning for Animal Detection and Segmentation [21.047026366450197]
We address the problem of few-shot learning for camouflaged object detection and segmentation.
We propose FS-CDIS, a framework to efficiently detect and segment camouflaged instances.
Our proposed method achieves state-of-the-art performance on the newly collected dataset.
arXiv Detail & Related papers (2023-04-15T01:33:14Z) - Sharp Eyes: A Salient Object Detector Working The Same Way as Human
Visual Characteristics [3.222802562733787]
We propose a sharp eyes network (SENet) that first separates the object from the scene, and then finely segments it.
The proposed method aims to utilize the expanded objects to guide the network to obtain a complete prediction.
arXiv Detail & Related papers (2023-01-18T11:00:45Z) - JPGNet: Joint Predictive Filtering and Generative Network for Image
Inpainting [21.936689731138213]
Image inpainting aims to restore the missing regions and make the recovery results identical to the originally complete image.
Existing works usually regard it as a pure generation problem and employ cutting-edge generative techniques to address it.
In this paper, we formulate image inpainting as a mix of two problems, i.e., predictive filtering and deep generation.
arXiv Detail & Related papers (2021-07-09T07:49:52Z) - In-sample Contrastive Learning and Consistent Attention for Weakly
Supervised Object Localization [18.971497314227275]
Weakly supervised object localization (WSOL) aims to localize the target object using only the image-level supervision.
Recent methods encourage the model to activate feature maps over the entire object by dropping the most discriminative parts.
We consider the background as an important cue that guides the feature activation to cover the sophisticated object region.
arXiv Detail & Related papers (2020-09-25T07:24:46Z) - Image Fine-grained Inpainting [89.17316318927621]
We present a one-stage model that utilizes dense combinations of dilated convolutions to obtain larger and more effective receptive fields.
To better train this efficient generator, in addition to the frequently-used VGG feature matching loss, we design a novel self-guided regression loss.
We also employ a discriminator with local and global branches to ensure local-global contents consistency.
arXiv Detail & Related papers (2020-02-07T03:45:25Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information provided and is not responsible for any consequences.