Making Images Real Again: A Comprehensive Survey on Deep Image Composition
- URL: http://arxiv.org/abs/2106.14490v5
- Date: Mon, 22 Apr 2024 05:24:45 GMT
- Title: Making Images Real Again: A Comprehensive Survey on Deep Image Composition
- Authors: Li Niu, Wenyan Cong, Liu Liu, Yan Hong, Bo Zhang, Jing Liang, Liqing Zhang
- Abstract summary: The image composition task can be decomposed into multiple sub-tasks, each of which targets one or more issues.
In this paper, we conduct a comprehensive survey of the sub-tasks and the combinatorial task of image composition.
For each one, we summarize the existing methods, available datasets, and common evaluation metrics.
- Score: 34.09380539557308
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: As a common image editing operation, image composition aims to combine the foreground from one image with another background image, resulting in a composite image. However, many issues can make the composite image unrealistic. These issues can be summarized as the inconsistency between foreground and background, which includes appearance inconsistency (e.g., incompatible illumination), geometry inconsistency (e.g., unreasonable size), and semantic inconsistency (e.g., mismatched semantic context). The image composition task can be decomposed into multiple sub-tasks, each of which targets one or more issues. Specifically, object placement aims to find a reasonable scale, location, and shape for the foreground. Image blending aims to address the unnatural boundary between foreground and background. Image harmonization aims to adjust the illumination statistics of the foreground. Shadow generation aims to generate a plausible shadow for the foreground. These sub-tasks can be executed sequentially or in parallel to acquire realistic composite images. To the best of our knowledge, there is no previous survey on image composition. In this paper, we conduct a comprehensive survey of the sub-tasks and the combinatorial task of image composition. For each one, we summarize the existing methods, available datasets, and common evaluation metrics. Datasets and codes for image composition are summarized at https://github.com/bcmi/Awesome-Image-Composition. We have also contributed the first image composition toolbox, libcom (https://github.com/bcmi/libcom), which assembles 10+ image composition related functions (e.g., image blending, image harmonization, object placement, shadow generation, generative composition). The ultimate goal of this toolbox is to solve all the problems related to image composition with a simple `import libcom`.
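The basic cut-and-paste operation that all of these sub-tasks then refine can be sketched as a per-pixel alpha composite. The sketch below is illustrative only and is not part of libcom's API; the function name and array conventions are assumptions:

```python
import numpy as np

def composite(foreground, background, alpha):
    """Naive cut-and-paste composite: blend a foreground onto a
    background using a per-pixel alpha mask. Images are H x W x 3
    float arrays; alpha is H x W in [0, 1], broadcast over channels."""
    if alpha.ndim == 2:
        alpha = alpha[..., None]
    return alpha * foreground + (1.0 - alpha) * background

# Toy example: a white square pasted onto a black background.
bg = np.zeros((4, 4, 3))
fg = np.ones((4, 4, 3))
mask = np.zeros((4, 4))
mask[1:3, 1:3] = 1.0
comp = composite(fg, bg, mask)
```

Such a raw composite is exactly what exhibits the appearance, geometry, and semantic inconsistencies the survey describes; blending, harmonization, and shadow generation each post-process it.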
Related papers
- DepGAN: Leveraging Depth Maps for Handling Occlusions and Transparency in Image Composition [7.693732944239458]
DepGAN is a Generative Adversarial Network that utilizes depth maps and alpha channels to rectify inaccurate occlusions.
Central to our network is a novel loss function called Depth Aware Loss, which quantifies the pixel-wise depth difference.
We enhance our network's learning process by utilizing opacity data, enabling it to effectively manage compositions involving transparent and semi-transparent objects.
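One plausible reading of such a pixel-wise depth loss is a masked mean absolute difference between predicted and reference depth maps. This is a hypothetical sketch, not DepGAN's actual formulation; the function name, masking, and normalization are assumptions:

```python
import numpy as np

def depth_aware_loss(pred_depth, true_depth, object_mask=None):
    """Mean pixel-wise absolute depth difference, optionally
    restricted to the composited object's region via a binary mask."""
    diff = np.abs(pred_depth - true_depth)
    if object_mask is not None:
        return (diff * object_mask).sum() / max(object_mask.sum(), 1e-8)
    return diff.mean()

# Toy example: predicted depth is off by 0.5 inside the object region.
pred = np.full((4, 4), 2.0)
true = np.full((4, 4), 2.0)
mask = np.zeros((4, 4))
mask[1:3, 1:3] = 1.0
pred[1:3, 1:3] = 2.5
loss = depth_aware_loss(pred, true, mask)  # 0.5 inside the masked region
```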
arXiv Detail & Related papers (2024-07-16T16:18:40Z) - DreamCom: Finetuning Text-guided Inpainting Model for Image Composition [24.411003826961686]
We propose DreamCom, which treats image composition as text-guided image inpainting customized for a certain object.
Specifically, we finetune a pretrained text-guided image inpainting model on a few reference images containing the same object.
In practice, the inserted object may be adversely affected by the background, so we propose masked attention mechanisms to avoid negative background interference.
arXiv Detail & Related papers (2023-09-27T09:23:50Z) - DESOBAv2: Towards Large-scale Real-world Dataset for Shadow Generation [19.376935979734714]
In this work, we focus on generating plausible shadow for the inserted foreground object to make the composite image more realistic.
To supplement the existing small-scale dataset DESOBA, we create a large-scale dataset called DESOBAv2.
arXiv Detail & Related papers (2023-08-19T10:21:23Z) - Blind Image Decomposition [53.760745569495825]
We present Blind Image Decomposition (BID), which requires separating a superimposed image into constituent underlying images in a blind setting.
How to decompose superimposed images, like rainy images, into distinct source components is a crucial step towards real-world vision systems.
We propose a simple yet general Blind Image Decomposition Network (BIDeN) to serve as a strong baseline for future work.
arXiv Detail & Related papers (2021-08-25T17:37:19Z) - SSH: A Self-Supervised Framework for Image Harmonization [97.16345684998788]
We propose a novel Self-Supervised Harmonization framework (SSH) that can be trained using just "free" natural images without any manual editing.
Our results show that the proposed SSH outperforms previous state-of-the-art methods in terms of reference metrics, visual quality, and a subjective user study.
arXiv Detail & Related papers (2021-08-15T19:51:33Z) - Shadow Generation for Composite Image in Real-world Scenes [23.532079444113528]
We propose a novel shadow generation network SGRNet, which consists of a shadow mask prediction stage and a shadow filling stage.
In the shadow mask prediction stage, foreground and background information interact thoroughly to generate the foreground shadow mask.
In the shadow filling stage, shadow parameters are predicted to fill the shadow area.
arXiv Detail & Related papers (2021-04-21T03:30:02Z) - Deep Image Compositing [0.0]
In image editing, the most common task is pasting objects from one image into another and then adjusting the appearance of the foreground object to match the background.
To achieve this, we use Generative Adversarial Networks (GANs).
The GAN is able to decode the color histograms of the foreground and background parts of the image, and it also learns to blend the foreground object with the background.
arXiv Detail & Related papers (2021-03-29T09:23:37Z) - Deep Image Compositing [93.75358242750752]
We propose a new method which can automatically generate high-quality image composites without any user input.
Inspired by Laplacian pyramid blending, a dense-connected multi-stream fusion network is proposed to effectively fuse the information from the foreground and background images.
Experiments show that the proposed method can automatically generate high-quality composites and outperforms existing methods both qualitatively and quantitatively.
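The Laplacian-pyramid idea that inspires this fusion network can be sketched in plain NumPy. This is the classical blending algorithm, not the paper's network, and it assumes greyscale images with average-pool downsampling and nearest-neighbour upsampling for brevity:

```python
import numpy as np

def down(img):
    # 2x downsample by average pooling (assumes even dimensions).
    h, w = img.shape
    return img.reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))

def up(img):
    # 2x upsample by nearest-neighbour repetition.
    return img.repeat(2, axis=0).repeat(2, axis=1)

def pyramid_blend(a, b, mask, levels=3):
    """Blend greyscale images a and b with a soft mask at each pyramid
    level, then collapse; sizes must be divisible by 2**levels."""
    la, lb, gm = [], [], [mask]
    ca, cb = a, b
    for _ in range(levels):
        da, db = down(ca), down(cb)
        la.append(ca - up(da))  # Laplacian band of a at this level
        lb.append(cb - up(db))  # Laplacian band of b at this level
        ca, cb = da, db
        gm.append(down(gm[-1]))  # Gaussian pyramid of the mask
    # Blend the coarsest residuals, then add blended bands back in.
    out = gm[levels] * ca + (1 - gm[levels]) * cb
    for i in reversed(range(levels)):
        out = up(out) + gm[i] * la[i] + (1 - gm[i]) * lb[i]
    return out

# Toy example: blend a white and a black image across a vertical seam.
a = np.ones((8, 8))
b = np.zeros((8, 8))
m = np.zeros((8, 8))
m[:, :4] = 1.0
out = pyramid_blend(a, b, m)
```

Blending each frequency band separately is what hides the hard seam a direct alpha composite would leave; the dense multi-stream fusion network in the paper learns a richer version of this band-wise fusion.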
arXiv Detail & Related papers (2020-11-04T06:12:24Z) - Bridging Composite and Real: Towards End-to-end Deep Image Matting [88.79857806542006]
We study the roles of semantics and details for image matting.
We propose a novel Glance and Focus Matting network (GFM), which employs a shared encoder and two separate decoders.
Comprehensive empirical studies have demonstrated that GFM outperforms state-of-the-art methods.
arXiv Detail & Related papers (2020-10-30T10:57:13Z) - Adversarial Image Composition with Auxiliary Illumination [53.89445873577062]
We propose an Adversarial Image Composition Net (AIC-Net) that achieves realistic image composition.
A novel branched generation mechanism is proposed, which disentangles the generation of shadows and the transfer of foreground styles.
Experiments on pedestrian and car composition tasks show that the proposed AIC-Net achieves superior composition performance.
arXiv Detail & Related papers (2020-09-17T12:58:16Z)
This list is automatically generated from the titles and abstracts of the papers in this site.