Foreground Object Search by Distilling Composite Image Feature
- URL: http://arxiv.org/abs/2308.04990v1
- Date: Wed, 9 Aug 2023 14:43:10 GMT
- Title: Foreground Object Search by Distilling Composite Image Feature
- Authors: Bo Zhang and Jiacheng Sui and Li Niu
- Abstract summary: Foreground object search (FOS) aims to find compatible foreground objects for a given background image.
We observe that competitive retrieval performance can be achieved by using a discriminator to predict the compatibility of composite images.
We propose a novel FOS method via distilling composite feature (DiscoFOS).
- Score: 15.771802337102837
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Foreground object search (FOS) aims to find compatible foreground objects for a given background image, producing a realistic composite image. We observe that competitive retrieval performance can be achieved by using a discriminator to predict the compatibility of a composite image, but this approach has an unaffordable time cost. To address this, we propose a novel FOS method via distilling composite feature (DiscoFOS). Specifically, the abovementioned discriminator serves as the teacher network. The student network employs two encoders to extract the foreground feature and the background feature. Their interaction output is enforced to match the composite image feature from the teacher network. Additionally, previous works did not release their datasets, so we contribute two datasets for the FOS task: the S-FOSD dataset with synthetic composite images and the R-FOSD dataset with real composite images. Extensive experiments on our two datasets demonstrate the superiority of the proposed method over previous approaches. The dataset and code are available at https://github.com/bcmi/Foreground-Object-Search-Dataset-FOSD.
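To make the architecture concrete, the following is a minimal PyTorch sketch of the distillation objective described in the abstract. The ResNet-18 backbones, the 512-d feature size, and the concatenation-plus-MLP interaction head are illustrative assumptions; the authors' actual implementation lives in the linked repository.

```python
# Minimal sketch of a DiscoFOS-style distillation objective (assumptions:
# ResNet-18 backbones, 512-d features, concat+MLP as the student "interaction";
# the real architecture is defined in the authors' repository).
import torch
import torch.nn as nn
import torchvision.models as models

def backbone():
    m = models.resnet18(weights=None)
    m.fc = nn.Identity()          # expose the 512-d pooled feature
    return m

class Student(nn.Module):
    def __init__(self, dim=512):
        super().__init__()
        self.fg_enc = backbone()  # encodes the foreground object
        self.bg_enc = backbone()  # encodes the background image
        # interaction head fusing the two features into one embedding
        self.fuse = nn.Sequential(nn.Linear(2 * dim, dim), nn.ReLU(),
                                  nn.Linear(dim, dim))
    def forward(self, fg, bg):
        return self.fuse(torch.cat([self.fg_enc(fg), self.bg_enc(bg)], dim=1))

teacher = backbone()              # stands in for the compatibility discriminator's encoder
teacher.eval()
student = Student()

fg = torch.randn(4, 3, 224, 224)    # foreground crops
bg = torch.randn(4, 3, 224, 224)    # background images
comp = torch.randn(4, 3, 224, 224)  # composites of fg pasted into bg

with torch.no_grad():
    target = teacher(comp)        # composite-image feature to be matched
loss = nn.functional.mse_loss(student(fg, bg), target)
loss.backward()
```

At retrieval time only the student runs, so foreground features for an entire gallery can be pre-extracted and scored against a single background feature, avoiding the cost of compositing and re-scoring every candidate with the discriminator.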
Related papers
- Rethinking Image Super-Resolution from Training Data Perspectives [54.28824316574355]
We investigate the understudied effect of the training data used for image super-resolution (SR).
With this, we propose an automated image evaluation pipeline.
We find that datasets with (i) low compression artifacts, (ii) high within-image diversity as judged by the number of different objects, and (iii) a large number of images from ImageNet or PASS all positively affect SR performance.
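As a rough illustration of criterion (ii), within-image diversity can be proxied by counting the distinct object classes a pretrained detector finds; this sketch only suggests the flavor of such a pipeline and is not the paper's exact implementation.

```python
# Hypothetical diversity proxy: count distinct object classes detected per image.
import torch
from torchvision.io import read_image
from torchvision.models.detection import (fasterrcnn_resnet50_fpn,
                                          FasterRCNN_ResNet50_FPN_Weights)

weights = FasterRCNN_ResNet50_FPN_Weights.DEFAULT
model = fasterrcnn_resnet50_fpn(weights=weights).eval()
preprocess = weights.transforms()

def diversity_score(path, score_thresh=0.5):
    img = read_image(path)                  # uint8 CxHxW tensor
    with torch.no_grad():
        out = model([preprocess(img)])[0]
    keep = out["scores"] > score_thresh     # drop low-confidence detections
    return len(set(out["labels"][keep].tolist()))  # number of distinct classes
```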
arXiv Detail & Related papers (2024-09-01T16:25:04Z)
- DESOBAv2: Towards Large-scale Real-world Dataset for Shadow Generation [19.376935979734714]
In this work, we focus on generating a plausible shadow for the inserted foreground object to make the composite image more realistic.
To supplement the existing small-scale dataset DESOBA, we create a large-scale dataset called DESOBAv2.
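For contrast, a classical hand-tuned drop shadow looks like the sketch below; the fixed offset, blur, and opacity are exactly the kind of assumptions a learned, data-driven shadow generator avoids.

```python
# Naive soft drop-shadow baseline for an inserted object (illustrative only;
# offset, blur, and opacity are hand-set assumptions, not a learned model).
import numpy as np
import cv2

def add_soft_shadow(comp, mask, offset=(25, 15), blur=21, opacity=0.5):
    """comp: HxWx3 uint8 composite; mask: HxW uint8 mask of the inserted object."""
    shadow = np.roll(mask, shift=offset, axis=(0, 1))   # translate the mask
    shadow = cv2.GaussianBlur(shadow, (blur, blur), 0)  # soften the edges
    shadow = (shadow.astype(np.float32) / 255.0) * opacity
    shadow[mask > 0] = 0.0                              # never darken the object itself
    out = comp.astype(np.float32) * (1.0 - shadow[..., None])
    return out.astype(np.uint8)
```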
arXiv Detail & Related papers (2023-08-19T10:21:23Z)
- Scrape, Cut, Paste and Learn: Automated Dataset Generation Applied to Parcel Logistics [58.720142291102135]
We present a fully automated pipeline to generate a synthetic dataset for instance segmentation in four steps.
We first scrape images for the objects of interest from popular image search engines.
We compare three different methods for image selection: Object-agnostic pre-processing, manual image selection and CNN-based image selection.
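The core "cut and paste" step can be sketched as below: paste a masked object crop onto a background and emit the instance mask as the segmentation label. The random placement policy is an illustrative assumption.

```python
# Sketch of cut-and-paste synthesis for instance segmentation labels.
import numpy as np

def paste_object(bg, obj, obj_mask, rng=np.random.default_rng()):
    """bg: HxWx3 uint8; obj: hxwx3 uint8; obj_mask: hxw bool. Returns composite + label mask."""
    H, W = bg.shape[:2]
    h, w = obj.shape[:2]
    y = rng.integers(0, H - h + 1)      # random top-left placement
    x = rng.integers(0, W - w + 1)
    comp, label = bg.copy(), np.zeros((H, W), dtype=np.uint8)
    region = comp[y:y + h, x:x + w]
    region[obj_mask] = obj[obj_mask]    # copy only the object pixels
    label[y:y + h, x:x + w][obj_mask] = 1
    return comp, label
```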
arXiv Detail & Related papers (2022-10-18T12:49:04Z)
- Semantic keypoint-based pose estimation from single RGB frames [64.80395521735463]
We present an approach to estimating the continuous 6-DoF pose of an object from a single RGB image.
The approach combines semantic keypoints predicted by a convolutional network (convnet) with a deformable shape model.
We show that our approach can accurately recover the 6-DoF object pose for both instance- and class-based scenarios.
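Once a convnet has localized the semantic keypoints, recovering the 6-DoF pose reduces to a PnP problem. A minimal OpenCV sketch under a known camera intrinsic matrix K follows; the deformable shape model fitting from the paper is omitted, and the keypoints here are simulated.

```python
# PnP pose recovery from 2D-3D keypoint correspondences with OpenCV.
import numpy as np
import cv2

# 3D semantic keypoints on the object model (here: corners of a unit cube)
model_pts = np.array([[0, 0, 0], [1, 0, 0], [0, 1, 0],
                      [0, 0, 1], [1, 1, 0], [1, 0, 1]], dtype=np.float64)
K = np.array([[800., 0., 320.], [0., 800., 240.], [0., 0., 1.]])

# Simulate convnet detections by projecting with a known ground-truth pose
rvec_gt = np.array([[0.1], [0.2], [0.3]])
tvec_gt = np.array([[0.0], [0.0], [5.0]])
image_pts, _ = cv2.projectPoints(model_pts, rvec_gt, tvec_gt, K, None)

ok, rvec, tvec = cv2.solvePnP(model_pts, image_pts, K, None)
R, _ = cv2.Rodrigues(rvec)   # recovered 3x3 rotation; tvec is the translation
```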
arXiv Detail & Related papers (2022-04-12T15:03:51Z)
- Learning Co-segmentation by Segment Swapping for Retrieval and Discovery [67.6609943904996]
The goal of this work is to efficiently identify visually similar patterns from a pair of images.
We generate synthetic training pairs by selecting object segments in an image and copy-pasting them into another image.
We show our approach provides clear improvements for artwork details retrieval on the Brueghel dataset.
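A minimal sketch of the segment-swapping idea: copy an object segment from image A into image B, yielding a training pair whose ground-truth co-segmentation masks are known by construction. For brevity the segment is pasted at the same location; the actual method can vary placement.

```python
# Synthetic co-segmentation pair via segment swapping (same-location paste).
import numpy as np

def make_pair(img_a, img_b, seg_mask):
    """img_a, img_b: HxWx3 uint8 (same size); seg_mask: HxW bool segment in img_a."""
    swapped = img_b.copy()
    swapped[seg_mask] = img_a[seg_mask]   # paste the segment into the partner image
    mask_a = seg_mask.astype(np.uint8)    # pattern location in image A
    mask_b = seg_mask.astype(np.uint8)    # ... and in the synthesized partner
    return img_a, swapped, mask_a, mask_b
```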
arXiv Detail & Related papers (2021-10-29T16:51:16Z)
- OPA: Object Placement Assessment Dataset [20.791187775546625]
Image composition aims to generate a realistic composite image by inserting an object from one image into another background image.
In this paper, we focus on object placement assessment task, which verifies whether a composite image is plausible in terms of the object placement.
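At its core this verification is binary classification of composites. Below is a minimal baseline sketch with a pretrained ResNet-18 head; the backbone choice and hyperparameters are assumptions, not the OPA paper's model.

```python
# Baseline plausibility classifier over composite images.
import torch
import torch.nn as nn
import torchvision.models as models

clf = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
clf.fc = nn.Linear(clf.fc.in_features, 2)   # plausible vs. implausible placement

composite = torch.randn(8, 3, 224, 224)     # batch of composite images
labels = torch.randint(0, 2, (8,))          # 1 = plausible placement
loss = nn.functional.cross_entropy(clf(composite), labels)
loss.backward()
```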
arXiv Detail & Related papers (2021-07-05T09:23:53Z)
- Deep Image Compositing [0.0]
In image editing, the most common task is pasting objects from one image into another and then adjusting the appearance of the foreground object so that it agrees with the background.
To achieve this, we use Generative Adversarial Networks (GANs).
The GAN learns to decode the color histograms of the foreground and background regions of the image and to blend the foreground object with the background.
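As a point of reference for the color adjustment the GAN learns, a classical non-learned baseline matches the pasted foreground's histogram to the background with scikit-image; this is a baseline sketch, not the paper's network.

```python
# Classical color harmonization via histogram matching (non-learned baseline).
import numpy as np
from skimage.exposure import match_histograms

def harmonize(comp, fg_mask, bg_reference):
    """comp: HxWx3 uint8 composite; fg_mask: HxW bool; bg_reference: HxWx3 background image."""
    matched = match_histograms(comp, bg_reference, channel_axis=-1)
    out = comp.copy()
    out[fg_mask] = matched[fg_mask].astype(out.dtype)  # recolor only the foreground
    return out
```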
arXiv Detail & Related papers (2021-03-29T09:23:37Z)
- Self-Supervised Representation Learning for RGB-D Salient Object Detection [93.17479956795862]
We use Self-Supervised Representation Learning to design two pretext tasks: the cross-modal auto-encoder and the depth-contour estimation.
Our pretext tasks require only a small number of unlabeled RGB-D datasets for pre-training, which enables the network to capture rich semantic contexts.
For the inherent problem of cross-modal fusion in RGB-D SOD, we propose a multi-path fusion module.
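The cross-modal auto-encoder pretext task can be sketched as encoding RGB and decoding the paired depth map, so supervision comes from the sensor rather than from labels. Layer sizes below are illustrative assumptions; the paper's actual networks and its depth-contour task are not reproduced.

```python
# Cross-modal auto-encoder pretext: reconstruct depth from RGB.
import torch
import torch.nn as nn

enc = nn.Sequential(nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),
                    nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU())
dec = nn.Sequential(nn.ConvTranspose2d(64, 32, 4, stride=2, padding=1), nn.ReLU(),
                    nn.ConvTranspose2d(32, 1, 4, stride=2, padding=1))

rgb = torch.randn(4, 3, 128, 128)     # unlabeled RGB inputs
depth = torch.randn(4, 1, 128, 128)   # paired raw depth maps
loss = nn.functional.l1_loss(dec(enc(rgb)), depth)  # depth reconstruction loss
loss.backward()
```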
arXiv Detail & Related papers (2021-01-29T09:16:06Z)
- Real-MFF: A Large Realistic Multi-focus Image Dataset with Ground Truth [58.226535803985804]
We introduce a large and realistic multi-focus dataset called Real-MFF.
The dataset contains 710 pairs of source images with corresponding ground truth images.
We evaluate 10 typical multi-focus algorithms on this dataset for the purpose of illustration.
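For illustration, a simple fusion baseline of the kind such a benchmark evaluates selects, per pixel, the sharper source using a local Laplacian-energy focus measure; this is not one of the 10 algorithms from the paper specifically.

```python
# Naive multi-focus fusion: per-pixel selection by Laplacian energy.
import numpy as np
import cv2

def fuse_pair(img1, img2, ksize=9):
    """img1, img2: HxWx3 uint8 source images focused at different depths."""
    def focus(img):
        gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
        lap = cv2.Laplacian(gray, cv2.CV_64F)
        return cv2.blur(lap ** 2, (ksize, ksize))  # local sharpness map
    choose1 = focus(img1) > focus(img2)
    return np.where(choose1[..., None], img1, img2)
```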
arXiv Detail & Related papers (2020-03-28T12:33:46Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.