Towards Full-to-Empty Room Generation with Structure-Aware Feature
Encoding and Soft Semantic Region-Adaptive Normalization
- URL: http://arxiv.org/abs/2112.05396v1
- Date: Fri, 10 Dec 2021 09:00:13 GMT
- Authors: Vasileios Gkitsas, Nikolaos Zioulis, Vladimiros Sterzentsenko,
Alexandros Doumanoglou, Dimitrios Zarpalas
- Abstract summary: We propose a simple yet effective adjusted fully differentiable soft semantic region-adaptive normalization (softSEAN) block.
Besides mitigating training complexity and non-differentiability issues, our approach surpasses the compared methods both quantitatively and qualitatively.
Our softSEAN block can be used as a drop-in module for existing discriminative and generative models.
- Score: 67.64622529651677
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: The task of transforming a furnished room image into a background-only one is
extremely challenging, since it requires making large changes to the
scene context while still preserving the overall layout and style. To
acquire a photo-realistic and structurally consistent background, existing deep
learning methods either employ image inpainting approaches or learn the
scene layout as an individual task and later leverage it in a semantic
region-adaptive normalization module that is not fully differentiable. To
tackle these drawbacks, we treat scene layout generation as a feature linear
transformation problem and propose a simple yet effective adjusted fully
differentiable soft semantic region-adaptive normalization module (softSEAN)
block. We showcase its applicability in diminished reality and depth estimation
tasks, where our approach, besides mitigating training complexity and
non-differentiability issues, surpasses the compared methods
both quantitatively and qualitatively. Our softSEAN block can be used as a
drop-in module for existing discriminative and generative models.
Implementation is available at vcl3d.github.io/PanoDR/.
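The core idea of region-adaptive normalization with soft semantic masks can be sketched in a few lines. The following is a minimal illustrative implementation, not the paper's actual code: the function name, shapes, and numpy formulation are my assumptions. Features are instance-normalized, then modulated by per-region scale/shift parameters blended through a soft (probabilistic) segmentation map rather than a hard one-hot map, which keeps the whole operation differentiable with respect to the layout prediction.

```python
import numpy as np

def soft_region_adaptive_norm(x, soft_masks, gammas, betas, eps=1e-5):
    """Soft semantic region-adaptive normalization (illustrative sketch).

    x          : (C, H, W) feature map.
    soft_masks : (K, H, W) soft semantic masks, summing to 1 over K per pixel.
    gammas     : (K, C) per-region scale parameters.
    betas      : (K, C) per-region shift parameters.
    """
    # Instance-normalize each channel over its spatial extent.
    mu = x.mean(axis=(1, 2), keepdims=True)
    var = x.var(axis=(1, 2), keepdims=True)
    x_norm = (x - mu) / np.sqrt(var + eps)
    # Blend per-region modulation parameters with the soft masks:
    # gamma(h, w) = sum_k m_k(h, w) * gamma_k  (likewise for beta).
    gamma = np.einsum('khw,kc->chw', soft_masks, gammas)
    beta = np.einsum('khw,kc->chw', soft_masks, betas)
    return gamma * x_norm + beta
```

With hard one-hot masks this reduces to standard SEAN-style per-region modulation; using soft masks avoids the non-differentiable hard assignment that the abstract identifies as a drawback of prior work.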
Related papers
- A Spitting Image: Modular Superpixel Tokenization in Vision Transformers [0.0]
Vision Transformer (ViT) architectures traditionally employ a grid-based approach to tokenization independent of the semantic content of an image.
We propose a modular superpixel tokenization strategy which decouples tokenization and feature extraction.
arXiv Detail & Related papers (2024-08-14T17:28:58Z)
- Structure-Guided Image Completion with Image-level and Object-level Semantic Discriminators [97.12135238534628]
We propose a learning paradigm that consists of semantic discriminators and object-level discriminators for improving the generation of complex semantics and objects.
Specifically, the semantic discriminators leverage pretrained visual features to improve the realism of the generated visual concepts.
Our proposed scheme significantly improves the generation quality and achieves state-of-the-art results on various tasks.
arXiv Detail & Related papers (2022-12-13T01:36:56Z)
- Image-Specific Information Suppression and Implicit Local Alignment for Text-based Person Search [61.24539128142504]
Text-based person search (TBPS) is a challenging task that aims to search pedestrian images with the same identity from an image gallery given a query text.
Most existing methods rely on explicitly generated local parts to model fine-grained correspondence between modalities.
We propose an efficient joint Multi-level Alignment Network (MANet) for TBPS, which can learn aligned image/text feature representations between modalities at multiple levels.
arXiv Detail & Related papers (2022-08-30T16:14:18Z)
- Situational Perception Guided Image Matting [16.1897179939677]
We propose a Situational Perception Guided Image Matting (SPG-IM) method that mitigates subjective bias of matting annotations.
SPG-IM can better associate inter-object and object-to-environment saliency and compensate for the subjective nature of image matting.
arXiv Detail & Related papers (2022-04-20T07:35:51Z)
- Retrieval-based Spatially Adaptive Normalization for Semantic Image Synthesis [68.1281982092765]
We propose a novel normalization module, termed REtrieval-based Spatially AdaptIve normaLization (RESAIL).
RESAIL provides pixel level fine-grained guidance to the normalization architecture.
Experiments on several challenging datasets show that our RESAIL performs favorably against state-of-the-art methods in terms of quantitative metrics, visual quality, and subjective evaluation.
arXiv Detail & Related papers (2022-04-06T14:21:39Z)
- Towards Controllable and Photorealistic Region-wise Image Manipulation [11.601157452472714]
We present a generative model with auto-encoder architecture for per-region style manipulation.
We apply a code consistency loss to enforce an explicit disentanglement between content and style latent representations.
The model is constrained by a content alignment loss to ensure that foreground editing does not interfere with background content.
arXiv Detail & Related papers (2021-08-19T13:29:45Z)
- An Adaptive Framework for Learning Unsupervised Depth Completion [59.17364202590475]
We present a method to infer a dense depth map from a color image and associated sparse depth measurements.
We show that regularization and co-visibility are related via the fitness of the model to data and can be unified into a single framework.
arXiv Detail & Related papers (2021-06-06T02:27:55Z)
- Controllable Person Image Synthesis with Spatially-Adaptive Warped Normalization [72.65828901909708]
Controllable person image generation aims to produce realistic human images with desirable attributes.
We introduce a novel Spatially-Adaptive Warped Normalization (SAWN), which integrates a learned flow-field to warp modulation parameters.
We propose a novel self-training part replacement strategy to refine the pretrained model for the texture-transfer task.
arXiv Detail & Related papers (2021-05-31T07:07:44Z)
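The SAWN entry above integrates a learned flow field to warp modulation parameters. As a rough illustration of what "warping modulation parameters by a flow field" means, here is a minimal sketch; the function name, shapes, and nearest-neighbor sampling are my assumptions (the paper learns the flow and presumably uses differentiable bilinear sampling instead):

```python
import numpy as np

def warp_modulation(params, flow):
    """Warp per-pixel modulation parameters (C, H, W) by a flow field
    (2, H, W) of (dy, dx) offsets, using nearest-neighbor sampling.

    Each output pixel (y, x) reads from (y + dy, x + dx), clipped to bounds.
    """
    c, h, w = params.shape
    ys, xs = np.mgrid[0:h, 0:w]
    src_y = np.clip(np.round(ys + flow[0]).astype(int), 0, h - 1)
    src_x = np.clip(np.round(xs + flow[1]).astype(int), 0, w - 1)
    return params[:, src_y, src_x]
```

A zero flow field leaves the parameters unchanged; a spatially varying flow lets the modulation follow, e.g., a deformed pose before the normalization is applied.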
This list is automatically generated from the titles and abstracts of the papers in this site.
The site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.