AMICO: Amodal Instance Composition
- URL: http://arxiv.org/abs/2210.05828v1
- Date: Tue, 11 Oct 2022 23:23:14 GMT
- Title: AMICO: Amodal Instance Composition
- Authors: Peiye Zhuang, Jia-bin Huang, Ayush Saraf, Xuejian Rong, Changil Kim,
Denis Demandolx
- Abstract summary: Image composition aims to blend multiple objects to form a harmonized image.
We present Amodal Instance Composition for blending imperfect objects onto a target image.
Our results show state-of-the-art performance on public COCOA and KINS benchmarks.
- Score: 40.03865667370814
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Image composition aims to blend multiple objects to form a harmonized image.
Existing approaches often assume precisely segmented and intact objects. Such
assumptions, however, are hard to satisfy in unconstrained scenarios. We
present Amodal Instance Composition for compositing imperfect -- potentially
incomplete and/or coarsely segmented -- objects onto a target image. We first
develop object shape prediction and content completion modules to synthesize
the amodal contents. We then propose a neural composition model to blend the
objects seamlessly. Our primary technical novelty lies in using separate
foreground/background representations and blending mask prediction to alleviate
segmentation errors. Our results show state-of-the-art performance on public
COCOA and KINS benchmarks and attain favorable visual results across diverse
scenes. We demonstrate various image composition applications such as object
insertion and de-occlusion.
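The abstract's core blending idea can be illustrated with a minimal sketch: given amodally completed foreground content, a background, and a predicted soft blending mask, the composite is a per-pixel convex combination. All names below are hypothetical; in the paper the blending mask is predicted by a learned neural composition model precisely to absorb segmentation errors, whereas here it is a given input.

```python
import numpy as np

def composite(fg: np.ndarray, bg: np.ndarray, mask: np.ndarray) -> np.ndarray:
    """Blend a completed foreground onto a background with a soft mask.

    fg, bg: float arrays of shape (H, W, 3) with values in [0, 1].
    mask:   float array of shape (H, W, 1) in [0, 1]. In AMICO this mask
            is network-predicted; here it is supplied directly (sketch only).
    """
    # Per-pixel convex combination; the mask broadcasts over channels.
    return mask * fg + (1.0 - mask) * bg

# Toy example: left column fully foreground, right column fully background.
fg = np.ones((2, 2, 3))            # white foreground
bg = np.zeros((2, 2, 3))           # black background
mask = np.array([[[1.0], [0.0]],
                 [[1.0], [0.0]]])
out = composite(fg, bg, mask)
```

A soft (non-binary) mask lets the model feather object boundaries, which is what makes coarse or imperfect segmentations tolerable.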
Related papers
- Thinking Outside the BBox: Unconstrained Generative Object Compositing [36.86960274923344]
We present a novel problem of unconstrained generative object compositing.
Our first-of-its-kind model is able to generate object effects such as shadows and reflections that go beyond the mask.
Our model outperforms existing object placement and compositing models in various quality metrics and user studies.
arXiv Detail & Related papers (2024-09-06T18:42:30Z)
- FreeCompose: Generic Zero-Shot Image Composition with Diffusion Prior [50.0535198082903]
We offer a novel approach to image composition, which integrates multiple input images into a single, coherent image.
We showcase the potential of utilizing the powerful generative prior inherent in large-scale pre-trained diffusion models to accomplish generic image composition.
arXiv Detail & Related papers (2024-07-06T03:35:43Z)
- Neural Congealing: Aligning Images to a Joint Semantic Atlas [14.348512536556413]
We present a zero-shot self-supervised framework for aligning semantically-common content across a set of images.
Our approach harnesses the power of pre-trained DINO-ViT features.
We show that our method performs favorably compared to a state-of-the-art method that requires extensive training on large-scale datasets.
arXiv Detail & Related papers (2023-02-08T09:26:22Z)
- OccluMix: Towards De-Occlusion Virtual Try-on by Semantically-Guided Mixup [79.3118064406151]
Image virtual try-on aims at replacing the clothes on a person image with a garment image (in-shop clothes).
Prior methods successfully preserve the character of the clothing images.
Occlusion, however, remains a pernicious obstacle to realistic virtual try-on.
arXiv Detail & Related papers (2023-01-03T06:29:11Z)
- DisPositioNet: Disentangled Pose and Identity in Semantic Image Manipulation [83.51882381294357]
DisPositioNet is a model that learns a disentangled representation for each object for the task of image manipulation using scene graphs.
Our framework enables the disentanglement of the variational latent embeddings as well as the feature representation in the graph.
arXiv Detail & Related papers (2022-11-10T11:47:37Z)
- LayoutBERT: Masked Language Layout Model for Object Insertion [3.4806267677524896]
We propose layoutBERT for the object insertion task.
It uses a novel self-supervised masked language model objective and bidirectional multi-head self-attention.
We provide both qualitative and quantitative evaluations on datasets from diverse domains.
arXiv Detail & Related papers (2022-04-30T21:35:38Z)
- Adversarial Image Composition with Auxiliary Illumination [53.89445873577062]
We propose an Adversarial Image Composition Net (AIC-Net) that achieves realistic image composition.
A novel branched generation mechanism is proposed, which disentangles the generation of shadows and the transfer of foreground styles.
Experiments on pedestrian and car composition tasks show that the proposed AIC-Net achieves superior composition performance.
arXiv Detail & Related papers (2020-09-17T12:58:16Z)
- Object-Centric Image Generation with Factored Depths, Locations, and Appearances [30.541425619507184]
We present a generative model of images that explicitly reasons over the set of objects they show.
Our model learns a structured latent representation that separates objects from each other and from the background.
It can be trained from images alone in a purely unsupervised fashion without the need for object masks or depth information.
arXiv Detail & Related papers (2020-04-01T18:00:11Z)
- Object-Centric Image Generation from Layouts [93.10217725729468]
We develop a layout-to-image-generation method to generate complex scenes with multiple objects.
Our method learns representations of the spatial relationships between objects in the scene, which lead to our model's improved layout-fidelity.
We introduce SceneFID, an object-centric adaptation of the popular Fréchet Inception Distance metric that is better suited for multi-object images.
arXiv Detail & Related papers (2020-03-16T21:40:09Z)
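The SceneFID idea above can be sketched as follows: features are extracted from per-object crops (taken from the layout boxes) rather than from whole images, and the Fréchet distance between Gaussians fitted to real and generated crop features is computed. This is an assumption-laden illustration, not the paper's protocol: it assumes diagonal covariances for simplicity, whereas the standard FID uses full covariances and a matrix square root, and both function names are hypothetical.

```python
import numpy as np

def frechet_distance_diag(feats_real: np.ndarray, feats_fake: np.ndarray) -> float:
    """Fréchet distance between Gaussians fitted to two (N, D) feature sets,
    assuming diagonal covariances (a simplification of the standard FID)."""
    mu1, mu2 = feats_real.mean(0), feats_fake.mean(0)
    var1, var2 = feats_real.var(0), feats_fake.var(0)
    # ||mu1 - mu2||^2 + Tr(S1 + S2 - 2 (S1 S2)^{1/2}), diagonal case.
    return float(np.sum((mu1 - mu2) ** 2)
                 + np.sum(var1 + var2 - 2.0 * np.sqrt(var1 * var2)))

def crop_objects(image: np.ndarray, boxes) -> list:
    """Cut each layout box (x0, y0, x1, y1) out of an (H, W, C) image;
    an object-centric metric featurizes these crops, not the full image."""
    return [image[y0:y1, x0:x1] for (x0, y0, x1, y1) in boxes]

# Sanity check: identical feature sets should give a distance of zero.
rng = np.random.default_rng(0)
f = rng.normal(size=(100, 8))
```

Comparing distributions of object crops rather than whole images makes the metric sensitive to per-object realism in cluttered, multi-object scenes.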
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.