Semantic-Guided Inpainting Network for Complex Urban Scenes Manipulation
- URL: http://arxiv.org/abs/2010.09334v1
- Date: Mon, 19 Oct 2020 09:17:17 GMT
- Title: Semantic-Guided Inpainting Network for Complex Urban Scenes Manipulation
- Authors: Pierfrancesco Ardino, Yahui Liu, Elisa Ricci, Bruno Lepri and Marco De Nadai
- Abstract summary: In this work, we propose a novel deep learning model to alter a complex urban scene by removing a user-specified portion of the image.
Inspired by recent works on image inpainting, our proposed method leverages semantic segmentation to model the content and structure of the image.
To generate reliable results, we design a new decoder block that combines the semantic segmentation and generation tasks.
- Score: 19.657440527538547
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Manipulating images of complex scenes to reconstruct, insert and/or
remove specific object instances is a challenging task. Complex scenes contain
multiple semantics and objects, which are frequently cluttered or ambiguous,
thus hampering the performance of inpainting models. Conventional techniques
often rely on structural information such as object contours in multi-stage
approaches that generate unreliable results and boundaries. In this work, we
propose a novel deep learning model to alter a complex urban scene by removing
a user-specified portion of the image and coherently inserting a new object
(e.g. a car or a pedestrian) in that scene. Inspired by recent works on image
inpainting, our proposed method leverages semantic segmentation to model the
content and structure of the image, and learns the best shape and location of
the object to insert. To generate reliable results, we design a new decoder
block that combines the semantic segmentation and generation tasks to better
guide the generation of new objects and scenes, which must be semantically
consistent with the image. Our experiments, conducted on two large-scale
datasets of urban scenes (Cityscapes and Indian Driving), show that our
proposed approach successfully addresses the problem of semantically guided
inpainting of complex urban scenes.
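For intuition, here is a minimal sketch of such a joint segmentation-and-generation decoder block, assuming a SPADE-style modulation in which the block's own segmentation logits condition its generation features; module names and channel sizes are illustrative assumptions, not the authors' implementation.

```python
# A minimal sketch (not the authors' code) of a decoder block that couples
# segmentation and generation: the block predicts per-pixel class logits and
# uses them to spatially modulate the image-generation features, so the two
# tasks share one upsampling path.
import torch
import torch.nn as nn
import torch.nn.functional as F

class SemanticGuidedDecoderBlock(nn.Module):
    def __init__(self, in_ch, out_ch, num_classes):
        super().__init__()
        self.up = nn.Upsample(scale_factor=2, mode="nearest")
        self.conv = nn.Conv2d(in_ch, out_ch, 3, padding=1)
        # Segmentation head: predicts class logits at this scale.
        self.seg_head = nn.Conv2d(out_ch, num_classes, 1)
        # Generation branch is modulated by the predicted semantics
        # (SPADE-style scale/shift conditioned on the segmentation).
        self.gamma = nn.Conv2d(num_classes, out_ch, 3, padding=1)
        self.beta = nn.Conv2d(num_classes, out_ch, 3, padding=1)
        self.norm = nn.InstanceNorm2d(out_ch, affine=False)

    def forward(self, x):
        h = F.relu(self.conv(self.up(x)))
        seg_logits = self.seg_head(h)              # semantic prediction
        sem = torch.softmax(seg_logits, dim=1)     # soft layout map
        h = self.norm(h) * (1 + self.gamma(sem)) + self.beta(sem)
        return h, seg_logits                       # features + segmentation

block = SemanticGuidedDecoderBlock(in_ch=256, out_ch=128, num_classes=20)
feats, seg = block(torch.randn(1, 256, 32, 32))
print(feats.shape, seg.shape)  # (1, 128, 64, 64), (1, 20, 64, 64)
```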
Related papers
- Sketch-Guided Scene Image Generation [11.009579131371018]
We propose a sketch-guided scene image generation framework, decomposing scene image generation from sketch inputs into object-level generation and scene-level image construction.
We employ pre-trained diffusion models to convert each single object drawing into an image of the object, inferring additional details while maintaining the sparse sketch structure.
In scene-level image construction, we generate the latent representation of the scene image using the separated background prompts.
arXiv Detail & Related papers (2024-07-09T00:16:45Z)
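A hedged sketch of the scene-composition step this entry describes: object sketches are assumed to have been converted to RGBA object images by a pre-trained diffusion model (stubbed out here), and the objects are alpha-composited onto a background canvas. All function names are illustrative.

```python
import numpy as np

def object_sketch_to_image(sketch: np.ndarray) -> np.ndarray:
    """Placeholder for the diffusion step: returns an RGBA patch the same
    size as the sketch (alpha marks the object's extent)."""
    h, w = sketch.shape[:2]
    rgba = np.zeros((h, w, 4), dtype=np.float32)
    rgba[..., :3] = 0.5                              # dummy appearance
    rgba[..., 3] = (sketch > 0).astype(np.float32)   # alpha from sketch mask
    return rgba

def compose_scene(background: np.ndarray, sketches, positions):
    """Alpha-composite each generated object onto the background canvas."""
    canvas = background.astype(np.float32).copy()
    for sketch, (y, x) in zip(sketches, positions):
        patch = object_sketch_to_image(sketch)
        h, w = patch.shape[:2]
        region = canvas[y:y + h, x:x + w]   # assumes object fits in canvas
        alpha = patch[..., 3:4]
        canvas[y:y + h, x:x + w] = alpha * patch[..., :3] + (1 - alpha) * region
    return canvas

bg = np.ones((256, 256, 3), dtype=np.float32)
obj_sketch = np.zeros((64, 64))
obj_sketch[16:48, 16:48] = 1.0
scene = compose_scene(bg, [obj_sketch], [(100, 100)])
print(scene.shape)  # (256, 256, 3)
```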
- LLM Blueprint: Enabling Text-to-Image Generation with Complex and Detailed Prompts [60.54912319612113]
Diffusion-based generative models have significantly advanced text-to-image generation but encounter challenges when processing lengthy and intricate text prompts.
We present a novel approach leveraging Large Language Models (LLMs) to extract critical components from text prompts.
Our evaluation on complex prompts featuring multiple objects demonstrates a substantial improvement in recall compared to baseline diffusion models.
arXiv Detail & Related papers (2023-10-16T17:57:37Z)
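As a rough illustration of the blueprint idea, the snippet below parses an assumed LLM response, object phrases with normalized bounding boxes, into typed records that a layout-to-image model could consume; the JSON schema is an assumption, not the paper's format.

```python
import json
from dataclasses import dataclass

@dataclass
class ObjectSpec:
    phrase: str     # short object description extracted from the long prompt
    bbox: tuple     # (x0, y0, x1, y1), normalized to [0, 1]

def parse_blueprint(llm_json: str):
    """Convert the (assumed) LLM response into typed object specs."""
    data = json.loads(llm_json)
    return [ObjectSpec(o["phrase"], tuple(o["bbox"])) for o in data["objects"]]

# Example response an LLM might produce for a cluttered multi-object prompt.
response = json.dumps({
    "objects": [
        {"phrase": "a red vintage car", "bbox": [0.05, 0.55, 0.45, 0.95]},
        {"phrase": "a pedestrian with an umbrella", "bbox": [0.6, 0.4, 0.8, 0.95]},
    ]
})
for spec in parse_blueprint(response):
    print(spec.phrase, spec.bbox)
```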
- Localizing Object-level Shape Variations with Text-to-Image Diffusion Models [60.422435066544814]
We present a technique to generate a collection of images that depicts variations in the shape of a specific object.
A particular challenge when generating object variations is accurately localizing the manipulation applied over the object's shape.
To localize the image-space operation, we present two techniques that use the self-attention layers in conjunction with the cross-attention layers.
arXiv Detail & Related papers (2023-03-20T17:45:08Z)
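A toy sketch of the localization mechanism: the cross-attention map of the object's prompt token yields a spatial mask, which then restricts where the shape edit (carried here by generic feature tensors) is applied. Tensor shapes and the threshold are assumptions.

```python
import torch

def object_mask_from_cross_attention(attn, token_idx, threshold=0.3):
    """attn: (heads, H*W, tokens) cross-attention; returns a (H*W,) mask."""
    token_map = attn[:, :, token_idx].mean(dim=0)      # average over heads
    token_map = token_map / (token_map.max() + 1e-8)   # normalize to [0, 1]
    return (token_map > threshold).float()

def localized_blend(feats_edit, feats_orig, mask):
    """Apply edited features only inside the object mask."""
    m = mask.unsqueeze(-1)                             # (H*W, 1)
    return m * feats_edit + (1 - m) * feats_orig

heads, hw, tokens, dim = 8, 64 * 64, 77, 320
cross_attn = torch.rand(heads, hw, tokens)
mask = object_mask_from_cross_attention(cross_attn, token_idx=5)
blended = localized_blend(torch.randn(hw, dim), torch.randn(hw, dim), mask)
print(mask.mean().item(), blended.shape)  # fraction edited, (4096, 320)
```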
- Structure-Guided Image Completion with Image-level and Object-level Semantic Discriminators [97.12135238534628]
We propose a learning paradigm that consists of semantic discriminators and object-level discriminators for improving the generation of complex semantics and objects.
Specifically, the semantic discriminators leverage pretrained visual features to improve the realism of the generated visual concepts.
Our proposed scheme significantly improves the generation quality and achieves state-of-the-art results on various tasks.
arXiv Detail & Related papers (2022-12-13T01:36:56Z)
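The combination of discriminators might look roughly like the sketch below, with an image-level discriminator built on frozen (stand-in) pretrained features and an object-level discriminator scored on object crops; this is an illustration under assumptions, not the paper's architecture.

```python
import torch
import torch.nn as nn

class SemanticDiscriminator(nn.Module):
    def __init__(self, feat_dim=64):
        super().__init__()
        # Stand-in for a frozen, pretrained feature extractor.
        self.backbone = nn.Sequential(
            nn.Conv2d(3, feat_dim, 4, stride=2, padding=1), nn.ReLU())
        for p in self.backbone.parameters():
            p.requires_grad = False
        self.head = nn.Conv2d(feat_dim, 1, 4, stride=2, padding=1)

    def forward(self, img):
        return self.head(self.backbone(img))   # patch-level realism scores

class ObjectDiscriminator(nn.Module):
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, 64, 4, stride=2, padding=1), nn.LeakyReLU(0.2),
            nn.Conv2d(64, 1, 4, stride=2, padding=1))

    def forward(self, crop):
        return self.net(crop)

def generator_adv_loss(d_sem, d_obj, fake_img, fake_crops):
    """Hinge-style generator loss summed over both discriminators."""
    loss = -d_sem(fake_img).mean()
    for crop in fake_crops:                     # per-object crops
        loss = loss - d_obj(crop).mean()
    return loss

d_sem, d_obj = SemanticDiscriminator(), ObjectDiscriminator()
fake = torch.randn(1, 3, 128, 128)
crops = [torch.randn(1, 3, 64, 64)]
print(generator_adv_loss(d_sem, d_obj, fake, crops).item())
```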
- LayoutBERT: Masked Language Layout Model for Object Insertion [3.4806267677524896]
We propose layoutBERT for the object insertion task.
It uses a novel self-supervised masked language model objective and bidirectional multi-head self-attention.
We provide both qualitative and quantitative evaluations on datasets from diverse domains.
arXiv Detail & Related papers (2022-04-30T21:35:38Z)
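A simplified sketch of the masked layout modeling idea: layouts are serialized as discrete tokens (category plus quantized box coordinates), and a bidirectional Transformer encoder fills masked slots, which is how an insertion can be proposed. Vocabulary layout and model sizes are assumptions.

```python
import torch
import torch.nn as nn

VOCAB, MASK_ID, D = 300, 0, 128  # token vocab (classes + coord bins), mask id

class MaskedLayoutModel(nn.Module):
    def __init__(self):
        super().__init__()
        self.emb = nn.Embedding(VOCAB, D)
        self.pos = nn.Embedding(256, D)
        layer = nn.TransformerEncoderLayer(D, nhead=4, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=2)
        self.out = nn.Linear(D, VOCAB)

    def forward(self, tokens):
        pos = torch.arange(tokens.size(1), device=tokens.device)
        h = self.emb(tokens) + self.pos(pos)
        return self.out(self.encoder(h))      # logits for every position

# One object = [category, x, y, w, h] as token ids; mask the box of the
# object we want to insert and let the model propose plausible coordinates.
seq = torch.tensor([[12, 40, 55, 20, 30, 7, MASK_ID, MASK_ID, MASK_ID, MASK_ID]])
model = MaskedLayoutModel()
logits = model(seq)
pred = logits[0, 6:].argmax(dim=-1)           # predicted box tokens
print(pred)
```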
- Fully Context-Aware Image Inpainting with a Learned Semantic Pyramid [102.24539566851809]
Restoring reasonable and realistic content for arbitrary missing regions in images is an important yet challenging task.
Recent image inpainting models have made significant progress in generating vivid visual details, but they can still lead to texture blurring or structural distortions.
We propose the Semantic Pyramid Network (SPN) motivated by the idea that learning multi-scale semantic priors can greatly benefit the recovery of locally missing content in images.
arXiv Detail & Related papers (2021-12-08T04:33:33Z)
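The multi-scale prior fusion could be sketched as below: semantic features at several resolutions are injected into decoder features at matching scales. Module names are illustrative, not SPN's actual design.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class PyramidFusion(nn.Module):
    def __init__(self, feat_ch=64, prior_ch=32, scales=3):
        super().__init__()
        self.fuse = nn.ModuleList([
            nn.Conv2d(feat_ch + prior_ch, feat_ch, 3, padding=1)
            for _ in range(scales)])

    def forward(self, feat, priors):
        # priors: list of semantic prior maps, coarse to fine.
        for conv, prior in zip(self.fuse, priors):
            prior = F.interpolate(prior, size=feat.shape[-2:],
                                  mode="bilinear", align_corners=False)
            feat = F.relu(conv(torch.cat([feat, prior], dim=1)))
        return feat

feat = torch.randn(1, 64, 64, 64)
priors = [torch.randn(1, 32, s, s) for s in (16, 32, 64)]   # pyramid levels
out = PyramidFusion()(feat, priors)
print(out.shape)  # (1, 64, 64, 64)
```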
- Boosting Image Outpainting with Semantic Layout Prediction [18.819765707811904]
We train a GAN to extend regions in the semantic segmentation domain instead of the image domain.
Another GAN model is trained to synthesize real images based on the extended semantic layouts.
Our approach can handle semantic clues more easily and hence works better in complex scenarios.
arXiv Detail & Related papers (2021-10-18T13:09:31Z)
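A minimal two-stage sketch matching this description: stage one completes the layout in the segmentation domain, stage two renders pixels from the completed layout. Both networks are toy stand-ins for the paper's GANs.

```python
import torch
import torch.nn as nn

NUM_CLASSES = 20

class LayoutExtender(nn.Module):          # stage 1: segmentation domain
    def __init__(self):
        super().__init__()
        self.net = nn.Conv2d(NUM_CLASSES, NUM_CLASSES, 3, padding=1)

    def forward(self, padded_layout):
        return self.net(padded_layout)    # logits for the extended layout

class LayoutToImage(nn.Module):           # stage 2: image domain
    def __init__(self):
        super().__init__()
        self.net = nn.Conv2d(NUM_CLASSES, 3, 3, padding=1)

    def forward(self, layout):
        return torch.tanh(self.net(layout))

# Outpaint 32 px on the right: zero-pad the layout logits (standing in for a
# one-hot map), complete them in the semantic domain, then render the image.
layout = torch.randn(1, NUM_CLASSES, 128, 128)
padded = nn.functional.pad(layout, (0, 32, 0, 0))   # (left, right, top, bottom)
extended = LayoutExtender()(padded)
image = LayoutToImage()(extended.softmax(dim=1))
print(image.shape)  # (1, 3, 128, 160)
```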
- Context-Aware Image Inpainting with Learned Semantic Priors [100.99543516733341]
We introduce pretext tasks that are semantically meaningful for estimating the missing content.
We propose a context-aware image inpainting model, which adaptively integrates global semantics and local features.
arXiv Detail & Related papers (2021-06-14T08:09:43Z)
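The "adaptive integration" could be illustrated with a learned gate that mixes global semantic context and local texture features per pixel, as in the assumed sketch below.

```python
import torch
import torch.nn as nn

class AdaptiveFusion(nn.Module):
    def __init__(self, ch=64):
        super().__init__()
        # Per-pixel gate predicted from both feature streams.
        self.gate = nn.Sequential(
            nn.Conv2d(2 * ch, ch, 3, padding=1), nn.Sigmoid())

    def forward(self, global_sem, local_feat):
        g = self.gate(torch.cat([global_sem, local_feat], dim=1))
        return g * global_sem + (1 - g) * local_feat

fused = AdaptiveFusion()(torch.randn(1, 64, 32, 32), torch.randn(1, 64, 32, 32))
print(fused.shape)  # (1, 64, 32, 32)
```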
- Guidance and Evaluation: Semantic-Aware Image Inpainting for Mixed Scenes [54.836331922449666]
We propose a Semantic Guidance and Evaluation Network (SGE-Net) to update the structural priors and the inpainted image.
It utilizes a semantic segmentation map as guidance at each scale of inpainting, under which location-dependent inferences are re-evaluated.
Experiments on real-world images of mixed scenes demonstrate the superiority of our proposed method over state-of-the-art approaches.
arXiv Detail & Related papers (2020-03-15T17:49:20Z)
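A coarse-to-fine sketch of the guidance-and-evaluation loop: at each scale the current segmentation estimate conditions the inpainting features, and the segmentation is then re-predicted (re-evaluated) from the updated features. All modules are toy stand-ins.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

NUM_CLASSES, CH = 20, 64

class ScaleStep(nn.Module):
    def __init__(self):
        super().__init__()
        self.cond = nn.Conv2d(CH + NUM_CLASSES, CH, 3, padding=1)
        self.seg = nn.Conv2d(CH, NUM_CLASSES, 1)

    def forward(self, feat, seg_logits):
        # Condition features on the current segmentation estimate...
        seg_up = F.interpolate(seg_logits, size=feat.shape[-2:],
                               mode="bilinear", align_corners=False)
        feat = F.relu(self.cond(torch.cat([feat, seg_up.softmax(1)], dim=1)))
        # ...then re-evaluate the segmentation from the updated features.
        return feat, self.seg(feat)

feat = torch.randn(1, CH, 16, 16)
seg = torch.randn(1, NUM_CLASSES, 16, 16)
steps = nn.ModuleList([ScaleStep() for _ in range(3)])
for step in steps:                        # 16 -> 32 -> 64 -> 128
    feat = F.interpolate(feat, scale_factor=2)
    feat, seg = step(feat, seg)
print(feat.shape, seg.shape)  # (1, 64, 128, 128), (1, 20, 128, 128)
```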