Survey on Controllable Image Synthesis with Deep Learning
- URL: http://arxiv.org/abs/2307.10275v1
- Date: Tue, 18 Jul 2023 07:02:51 GMT
- Title: Survey on Controllable Image Synthesis with Deep Learning
- Authors: Shixiong Zhang, Jiao Li, Lu Yang
- Abstract summary: We present a survey of some recent works on 3D controllable image synthesis using deep learning.
We first introduce the datasets and evaluation indicators for 3D controllable image synthesis.
The photometrically controllable image synthesis approaches are also reviewed for 3D re-lighting research.
- Score: 15.29961293132048
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Image synthesis has attracted emerging research interests in academic and
industry communities. Deep learning technologies especially the generative
models greatly inspired controllable image synthesis approaches and
applications, which aim to generate particular visual contents with latent
prompts. In order to further investigate low-level controllable image synthesis
problem which is crucial for fine image rendering and editing tasks, we present
a survey of some recent works on 3D controllable image synthesis using deep
learning. We first introduce the datasets and evaluation indicators for 3D
controllable image synthesis. Then, we review the state-of-the-art research for
geometrically controllable image synthesis in two aspects: 1)
Viewpoint/pose-controllable image synthesis; 2) Structure/shape-controllable
image synthesis. Furthermore, the photometrically controllable image synthesis
approaches are also reviewed for 3D re-lighting research. While the emphasis
is on 3D controllable image synthesis algorithms, the related applications,
products and resources are also briefly summarized for practitioners.
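The "evaluation indicators" mentioned in the abstract typically include pixel-level fidelity metrics such as PSNR. As a minimal illustration (the function name and toy images below are not from the survey), PSNR can be computed from the mean squared error between a reference and a rendered image:

```python
import numpy as np

def psnr(reference, rendered, max_val=1.0):
    """Peak signal-to-noise ratio, in dB, for images with values in [0, max_val]."""
    mse = np.mean((reference.astype(np.float64) - rendered.astype(np.float64)) ** 2)
    if mse == 0:
        return float("inf")  # identical images
    return 10.0 * np.log10(max_val ** 2 / mse)

# Toy example: a uniform gray image vs. a slightly noisy copy.
rng = np.random.default_rng(0)
ref = np.full((64, 64, 3), 0.5)
noisy = np.clip(ref + rng.normal(0.0, 0.01, ref.shape), 0.0, 1.0)
score = psnr(ref, noisy)  # higher is better; small noise gives a high score
```

Perceptual metrics such as SSIM, LPIPS, and FID, which several of the related papers report, require structural or learned comparisons rather than plain MSE.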
Related papers
- ORES: Open-vocabulary Responsible Visual Synthesis [104.7572323359984]
We formalize a new task, Open-vocabulary Responsible Visual Synthesis (ORES), where the synthesis model is able to avoid forbidden visual concepts.
To address this problem, we present a Two-stage Intervention (TIN) framework.
By introducing 1) rewriting with learnable instructions through a large-scale language model (LLM) and 2) synthesizing with prompt intervention on a diffusion model, it can effectively synthesize images that avoid forbidden concepts while following the user's query as closely as possible.
arXiv Detail & Related papers (2023-08-26T06:47:34Z)
- ContraNeRF: Generalizable Neural Radiance Fields for Synthetic-to-real Novel View Synthesis via Contrastive Learning [102.46382882098847]
We first investigate the effects of synthetic data in synthetic-to-real novel view synthesis.
We propose to introduce geometry-aware contrastive learning to learn multi-view consistent features with geometric constraints.
Our method can render images with higher quality and better fine-grained details, outperforming existing generalizable novel view synthesis methods in terms of PSNR, SSIM, and LPIPS.
arXiv Detail & Related papers (2023-03-20T12:06:14Z)
- HORIZON: High-Resolution Semantically Controlled Panorama Synthesis [105.55531244750019]
Panorama synthesis endeavors to craft captivating 360-degree visual landscapes, immersing users in the heart of virtual worlds.
Recent breakthroughs in visual synthesis have unlocked the potential for semantic control in 2D flat images, but a direct application of these methods to panorama synthesis yields distorted content.
We unveil an innovative framework for generating high-resolution panoramas, adeptly addressing the issues of spherical distortion and edge discontinuity through sophisticated spherical modeling.
arXiv Detail & Related papers (2022-10-10T09:43:26Z)
- Pathology Synthesis of 3D-Consistent Cardiac MR Images using 2D VAEs and GANs [0.5039813366558306]
We propose a method for generating labeled data for the application of supervised deep-learning (DL) training.
The image synthesis consists of label deformation and label-to-image translation tasks.
We demonstrate that such an approach could provide a solution to diversify and enrich an available database of cardiac MR images.
arXiv Detail & Related papers (2022-09-09T10:17:49Z)
- Realistic Image Synthesis with Configurable 3D Scene Layouts [59.872657806747576]
We propose a novel approach to realistic-looking image synthesis based on a 3D scene layout.
Our approach takes a 3D scene with semantic class labels as input and trains a 3D scene painting network.
With the trained painting network, realistic-looking images for the input 3D scene can be rendered and manipulated.
arXiv Detail & Related papers (2021-08-23T09:44:56Z)
- A Survey on Adversarial Image Synthesis [0.0]
We provide a taxonomy of methods used in image synthesis, review different models for text-to-image synthesis and image-to-image translation, and discuss some evaluation metrics as well as possible future research directions in image synthesis with GAN.
arXiv Detail & Related papers (2021-06-30T13:31:48Z)
- pi-GAN: Periodic Implicit Generative Adversarial Networks for 3D-Aware Image Synthesis [45.51447644809714]
We propose a novel generative model, named Periodic Implicit Generative Adversarial Networks (π-GAN or pi-GAN), for high-quality 3D-aware image synthesis.
The proposed approach obtains state-of-the-art results for 3D-aware image synthesis with multiple real and synthetic datasets.
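pi-GAN represents scenes with a coordinate network that uses periodic (sine) activations, in the spirit of SIREN. The sketch below illustrates only the shape of such a layer: the weights are random placeholders, the network is untrained, and the real model additionally conditions these layers on a latent code via a mapping network, which is omitted here:

```python
import numpy as np

rng = np.random.default_rng(42)

def sine_layer(x, w, b, omega=30.0):
    """One SIREN-style layer: a linear map followed by a scaled sine."""
    return np.sin(omega * (x @ w + b))

# Hypothetical tiny network mapping a 3D coordinate to an RGB value.
w1 = rng.uniform(-1.0 / 3.0, 1.0 / 3.0, (3, 16))
b1 = rng.uniform(-1.0, 1.0, 16)
w2 = rng.uniform(-0.1, 0.1, (16, 3))
b2 = np.zeros(3)

coords = rng.uniform(-1.0, 1.0, (5, 3))   # five sample points in the scene volume
h = sine_layer(coords, w1, b1)            # periodic hidden features
rgb = 0.5 * (np.sin(h @ w2 + b2) + 1.0)   # squash the output into [0, 1]
```

The periodic activations let the network represent fine detail in the implicit radiance field, which is what enables the 3D-aware synthesis the abstract describes.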
arXiv Detail & Related papers (2020-12-02T01:57:46Z)
- Semantic View Synthesis [56.47999473206778]
We tackle a new problem of semantic view synthesis -- generating free-viewpoint rendering of a synthesized scene using a semantic label map as input.
First, we focus on synthesizing the color and depth of the visible surface of the 3D scene.
We then use the synthesized color and depth to impose explicit constraints on the multiple-plane image (MPI) representation prediction process.
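The multiple-plane image (MPI) representation mentioned above renders a view by alpha-compositing a stack of fronto-parallel RGBA planes from back to front. A minimal sketch of that compositing step (array shapes and the function name are illustrative, not taken from the paper):

```python
import numpy as np

def composite_mpi(colors, alphas):
    """Back-to-front 'over' compositing of multiplane-image layers.

    colors: (D, H, W, 3) per-plane RGB, ordered back (index 0) to front.
    alphas: (D, H, W, 1) per-plane opacity in [0, 1].
    """
    out = np.zeros(colors.shape[1:], dtype=np.float64)
    for rgb, a in zip(colors, alphas):
        out = rgb * a + out * (1.0 - a)  # standard alpha 'over' operator
    return out

# Toy example: an opaque red back plane under a half-transparent blue front plane.
colors = np.array([[[[1.0, 0.0, 0.0]]], [[[0.0, 0.0, 1.0]]]])  # (2, 1, 1, 3)
alphas = np.array([[[[1.0]]], [[[0.5]]]])                      # (2, 1, 1, 1)
pixel = composite_mpi(colors, alphas)  # a 50/50 blend of red and blue
```

Imposing the synthesized color and depth as constraints on this per-plane prediction is what lets the method produce consistent free-viewpoint renderings.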
arXiv Detail & Related papers (2020-08-24T17:59:46Z)
- Hierarchy Composition GAN for High-fidelity Image Synthesis [57.32311953820988]
This paper presents an innovative Hierarchy Composition GAN (HIC-GAN).
HIC-GAN incorporates image synthesis in geometry and appearance domains into an end-to-end trainable network.
Experiments on scene text image synthesis, portrait editing and indoor rendering tasks show that the proposed HIC-GAN achieves superior synthesis performance qualitatively and quantitatively.
arXiv Detail & Related papers (2019-05-12T11:11:24Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.