Towards a Neural Graphics Pipeline for Controllable Image Generation
- URL: http://arxiv.org/abs/2006.10569v2
- Date: Mon, 22 Feb 2021 09:18:55 GMT
- Title: Towards a Neural Graphics Pipeline for Controllable Image Generation
- Authors: Xuelin Chen, Daniel Cohen-Or, Baoquan Chen and Niloy J. Mitra
- Abstract summary: We present Neural Graphics Pipeline (NGP), a hybrid generative model that brings together neural and traditional image formation models.
NGP decomposes the image into a set of interpretable appearance feature maps, uncovering direct control handles for controllable image generation.
We demonstrate the effectiveness of our approach on controllable image generation of single-object scenes.
- Score: 96.11791992084551
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this paper, we leverage advances in neural networks towards forming a
neural rendering for controllable image generation, and thereby bypassing the
need for detailed modeling in the conventional graphics pipeline. To this end, we
present Neural Graphics Pipeline (NGP), a hybrid generative model that brings
together neural and traditional image formation models. NGP decomposes the
image into a set of interpretable appearance feature maps, uncovering direct
control handles for controllable image generation. To form an image, NGP
generates coarse 3D models that are fed into neural rendering modules to
produce view-specific interpretable 2D maps, which are then composited into the
final output image using a traditional image formation model. Our approach
offers control over image generation by providing direct handles controlling
illumination and camera parameters, in addition to control over shape and
appearance variations. The key challenge is to learn these controls through
unsupervised training that links generated coarse 3D models with unpaired real
images via neural and traditional (e.g., Blinn-Phong) rendering functions,
without establishing an explicit correspondence between them. We demonstrate
the effectiveness of our approach on controllable image generation of
single-object scenes. We evaluate our hybrid modeling framework, compare with
neural-only generation methods (namely, DCGAN, LSGAN, WGAN-GP, VON, and SRNs),
report improvement in FID scores against real images, and demonstrate that NGP
supports direct controls common in traditional forward rendering. Code is
available at http://geometry.cs.ucl.ac.uk/projects/2021/ngp.
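As a point of reference for the traditional image formation stage described above, the sketch below composites hypothetical per-pixel maps (normals, albedo, specular reflectance, shininess) under a single directional light using the standard Blinn-Phong shading model. It is a minimal illustration of the general technique, not the authors' implementation; the function name, map names, and signature are assumptions.
```python
# Minimal sketch (not the NGP code): Blinn-Phong compositing of per-pixel maps,
#   I = albedo * (ambient + max(n.l, 0)) + specular * max(n.h, 0)^shininess
# All names and the signature below are illustrative assumptions.
import numpy as np

def blinn_phong_composite(normals, albedo, specular, shininess,
                          light_dir, view_dir, ambient=0.1):
    """normals: HxWx3 unit vectors; albedo, specular: HxWx3 in [0, 1];
    shininess: scalar exponent; light_dir, view_dir: 3-vectors."""
    l = light_dir / np.linalg.norm(light_dir)
    v = view_dir / np.linalg.norm(view_dir)
    h = (l + v) / np.linalg.norm(l + v)                      # half vector
    n_dot_l = np.clip(normals @ l, 0.0, None)[..., None]     # diffuse term, HxWx1
    n_dot_h = np.clip(normals @ h, 0.0, None)[..., None]     # specular term, HxWx1
    image = albedo * (ambient + n_dot_l) + specular * n_dot_h ** shininess
    return np.clip(image, 0.0, 1.0)

# Example with random maps standing in for the network-predicted 2D maps.
H, W = 64, 64
rng = np.random.default_rng(0)
normals = rng.normal(size=(H, W, 3))
normals /= np.linalg.norm(normals, axis=-1, keepdims=True)
img = blinn_phong_composite(normals, albedo=rng.random((H, W, 3)),
                            specular=rng.random((H, W, 3)), shininess=32.0,
                            light_dir=np.array([0.0, 1.0, 1.0]),
                            view_dir=np.array([0.0, 0.0, 1.0]))
```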
Related papers
- PerlDiff: Controllable Street View Synthesis Using Perspective-Layout Diffusion Models [55.080748327139176]
PerlDiff is a method for effective street view image generation that fully leverages perspective 3D geometric information.
Our results show that PerlDiff markedly enhances the precision of generation on the NuScenes and KITTI datasets.
arXiv Detail & Related papers (2024-07-08T16:46:47Z)
- Controllable Text-to-3D Generation via Surface-Aligned Gaussian Splatting [9.383423119196408]
We introduce Multi-view ControlNet (MVControl), a novel neural network architecture designed to enhance existing multi-view diffusion models.
MVControl is able to offer 3D diffusion guidance for optimization-based 3D generation.
In pursuit of efficiency, we adopt 3D Gaussians as our representation instead of the commonly used implicit representations.
arXiv Detail & Related papers (2024-03-15T02:57:20Z)
- Controlling the Output of a Generative Model by Latent Feature Vector Shifting [0.0]
We present a novel latent-vector-shifting method for controlled modification of generated images.
Our approach uses a pre-trained StyleGAN3 model that generates images of realistic human faces.
Our latent feature shifter is a neural network trained to shift the latent vectors of a generative model along a specified feature direction (a generic sketch of this idea follows the list below).
arXiv Detail & Related papers (2023-11-15T10:42:06Z)
- Text2Control3D: Controllable 3D Avatar Generation in Neural Radiance Fields using Geometry-Guided Text-to-Image Diffusion Model [39.64952340472541]
We propose a text-to-3D avatar generation method whose facial expression is controllable.
Our main strategy is to construct the 3D avatar in Neural Radiance Fields (NeRF) optimized with a set of controlled viewpoint-aware images.
We present empirical results and discuss the effectiveness of our method.
arXiv Detail & Related papers (2023-09-07T08:14:46Z)
- Training and Tuning Generative Neural Radiance Fields for Attribute-Conditional 3D-Aware Face Generation [66.21121745446345]
We propose a conditional GNeRF model that integrates specific attribute labels as input, thus amplifying the controllability and disentanglement capabilities of 3D-aware generative models.
Our approach builds upon a pre-trained 3D-aware face model, and we introduce a Training as Init and Optimizing for Tuning (TRIOT) method to train a conditional normalizing flow module.
Our experiments substantiate the efficacy of our model, showcasing its ability to generate high-quality edits with enhanced view consistency.
arXiv Detail & Related papers (2022-08-26T10:05:39Z)
- Free-HeadGAN: Neural Talking Head Synthesis with Explicit Gaze Control [54.079327030892244]
Free-HeadGAN is a person-generic neural talking head synthesis system.
We show that modeling faces with sparse 3D facial landmarks is sufficient for achieving state-of-the-art generative performance.
arXiv Detail & Related papers (2022-08-03T16:46:08Z)
- Pixel2Mesh++: 3D Mesh Generation and Refinement from Multi-View Images [82.32776379815712]
We study the problem of shape generation in 3D mesh representation from a small number of color images with or without camera poses.
We further improve shape quality by leveraging cross-view information with a graph convolution network.
Our model is robust to the quality of the initial mesh and to errors in camera pose, and can be combined with a differentiable renderer for test-time optimization.
arXiv Detail & Related papers (2022-04-21T03:42:31Z)
- SMPLpix: Neural Avatars from 3D Human Models [56.85115800735619]
We bridge the gap between classic rendering and the latest generative networks operating in pixel space.
We train a network that directly converts a sparse set of 3D mesh vertices into photorealistic images.
We show the advantage over conventional differentiable renderers both in terms of photorealism and rendering efficiency.
arXiv Detail & Related papers (2020-08-16T10:22:00Z)
This list is automatically generated from the titles and abstracts of the papers on this site.