NeuralLift-360: Lifting An In-the-wild 2D Photo to A 3D Object with
360° Views
- URL: http://arxiv.org/abs/2211.16431v2
- Date: Mon, 3 Apr 2023 05:15:59 GMT
- Title: NeuralLift-360: Lifting An In-the-wild 2D Photo to A 3D Object with
360° Views
- Authors: Dejia Xu, Yifan Jiang, Peihao Wang, Zhiwen Fan, Yi Wang, Zhangyang
Wang
- Abstract summary: In this work, we study the challenging task of lifting a single image to a 3D object.
We demonstrate the ability to generate a plausible 3D object with 360° views that correspond well with a given reference image.
We propose a novel framework, dubbed NeuralLift-360, that utilizes a depth-aware radiance representation.
- Score: 77.93662205673297
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Virtual reality and augmented reality (XR) bring increasing demand for 3D
content. However, creating high-quality 3D content requires tedious work that a
human expert must do. In this work, we study the challenging task of lifting a
single image to a 3D object and, for the first time, demonstrate the ability to
generate a plausible 3D object with 360° views that correspond well with
the given reference image. By conditioning on the reference image, our model
can fulfill the everlasting curiosity for synthesizing novel views of objects
from images. Our technique sheds light on a promising direction of easing the
workflows for 3D artists and XR designers. We propose a novel framework, dubbed
NeuralLift-360, that utilizes a depth-aware neural radiance representation
(NeRF) and learns to craft the scene guided by denoising diffusion models. By
introducing a ranking loss, our NeuralLift-360 can be guided with rough depth
estimation in the wild. We also adopt a CLIP-guided sampling strategy for the
diffusion prior to provide coherent guidance. Extensive experiments demonstrate
that our NeuralLift-360 significantly outperforms existing state-of-the-art
baselines. Project page: https://vita-group.github.io/NeuralLift-360/
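The abstract mentions that a ranking loss lets rough in-the-wild depth estimates guide the radiance field. Below is a minimal sketch of a pairwise depth ranking loss of the kind commonly used for this purpose: only the ordering of the monocular depth prior is trusted, not its absolute scale. The function name, pair count, and margin are illustrative assumptions, not the authors' exact formulation.

import torch

def depth_ranking_loss(rendered_depth, prior_depth, num_pairs=1024, margin=1e-4):
    """Penalize rendered-depth pairs whose ordering disagrees with a rough
    monocular depth prior (illustrative sketch, not the paper's exact loss)."""
    n = rendered_depth.shape[0]
    # Sample random pixel pairs from the flattened ray batch.
    i = torch.randint(0, n, (num_pairs,), device=rendered_depth.device)
    j = torch.randint(0, n, (num_pairs,), device=rendered_depth.device)
    # Ordering implied by the prior: +1 if pixel i is farther than pixel j.
    sign = torch.sign(prior_depth[i] - prior_depth[j])
    # Hinge: rendered depths should respect that ordering by at least `margin`.
    loss = torch.relu(margin - sign * (rendered_depth[i] - rendered_depth[j]))
    return loss.mean()

# Usage sketch: rendered_depth comes from NeRF volume rendering over a ray
# batch; prior_depth is the monocular depth estimate at the same pixels.
rendered_depth = torch.rand(4096, requires_grad=True)
prior_depth = torch.rand(4096)
print(depth_ranking_loss(rendered_depth, prior_depth))

Because the loss depends only on relative ordering, it tolerates the scale and shift ambiguity typical of monocular depth estimators, which is what makes rough "in the wild" depth usable as guidance.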
Related papers
- DreamScene360: Unconstrained Text-to-3D Scene Generation with Panoramic Gaussian Splatting [56.101576795566324]
We present a text-to-3D 360° scene generation pipeline.
Our approach utilizes the generative power of a 2D diffusion model and prompt self-refinement.
Our method offers a globally consistent 3D scene within a 360° perspective.
arXiv Detail & Related papers (2024-04-10T10:46:59Z) - CharNeRF: 3D Character Generation from Concept Art [3.8061090528695543]
We present a novel approach to create volumetric representations of 3D characters from consistent turnaround concept art.
We train the network to make use of these priors for various 3D points through a learnable view-direction-attended multi-head self-attention layer.
Our model is able to generate high-quality 360-degree views of characters.
arXiv Detail & Related papers (2024-02-27T01:22:08Z) - Denoising Diffusion via Image-Based Rendering [54.20828696348574]
We introduce the first diffusion model able to perform fast, detailed reconstruction and generation of real-world 3D scenes.
First, we introduce a new neural scene representation, IB-planes, that can efficiently and accurately represent large 3D scenes.
Second, we propose a denoising-diffusion framework to learn a prior over this novel 3D scene representation, using only 2D images.
arXiv Detail & Related papers (2024-02-05T19:00:45Z) - GO-NeRF: Generating Virtual Objects in Neural Radiance Fields [75.13534508391852]
GO-NeRF is capable of utilizing scene context for high-quality and harmonious 3D object generation within an existing NeRF.
Our method employs a compositional rendering formulation that allows the generated 3D objects to be seamlessly composited into the scene.
arXiv Detail & Related papers (2024-01-11T08:58:13Z) - NeO 360: Neural Fields for Sparse View Synthesis of Outdoor Scenes [59.15910989235392]
We introduce NeO 360, Neural fields for sparse view synthesis of outdoor scenes.
NeO 360 is a generalizable method that reconstructs 360° scenes from a single or a few posed RGB images.
Our representation combines the best of both voxel-based and bird's-eye-view (BEV) representations.
arXiv Detail & Related papers (2023-08-24T17:59:50Z) - DITTO-NeRF: Diffusion-based Iterative Text To Omni-directional 3D Model [15.091263190886337]
We propose a novel pipeline to generate a high-quality 3D NeRF model from a text prompt or a single image.
DITTO-NeRF first constructs a high-quality partial 3D object for limited in-boundary (IB) angles, using the given or text-generated 2D image from the frontal view.
We propose progressive 3D object reconstruction schemes in terms of scales (low to high resolution), angles (IB initially, outer-boundary (OB) later), and masks (object to background) in DITTO-NeRF.
arXiv Detail & Related papers (2023-04-06T02:27:22Z) - DRaCoN -- Differentiable Rasterization Conditioned Neural Radiance
Fields for Articulated Avatars [92.37436369781692]
We present DRaCoN, a framework for learning full-body volumetric avatars.
It exploits the advantages of both the 2D and 3D neural rendering techniques.
Experiments on the challenging ZJU-MoCap and Human3.6M datasets indicate that DRaCoN outperforms state-of-the-art methods.
arXiv Detail & Related papers (2022-03-29T17:59:15Z)