2L3: Lifting Imperfect Generated 2D Images into Accurate 3D
- URL: http://arxiv.org/abs/2401.15841v1
- Date: Mon, 29 Jan 2024 02:30:31 GMT
- Title: 2L3: Lifting Imperfect Generated 2D Images into Accurate 3D
- Authors: Yizheng Chen, Rengan Xie, Qi Ye, Sen Yang, Zixuan Xie, Tianxiao Chen,
Rong Li and Yuchi Huo
- Abstract summary: Multi-view (MV) 3D reconstruction is a promising solution to fuse generated MV images into consistent 3D objects.
However, the generated images usually suffer from inconsistent lighting, misaligned geometry, and sparse views, leading to poor reconstruction quality.
We present a novel 3D reconstruction framework that leverages intrinsic decomposition guidance, transient-mono prior guidance, and view augmentation to cope with the three issues.
- Score: 16.66666619143761
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Reconstructing 3D objects from a single image is an intriguing but
challenging problem. One promising solution is to utilize multi-view (MV) 3D
reconstruction to fuse generated MV images into consistent 3D objects. However,
the generated images usually suffer from inconsistent lighting, misaligned
geometry, and sparse views, leading to poor reconstruction quality. To address
these problems, we present a novel 3D reconstruction framework that leverages
intrinsic decomposition guidance, transient-mono prior guidance, and view
augmentation to cope with the three issues, respectively. Specifically, we first
leverage intrinsic decomposition to decouple the shading information from the
generated images, reducing the impact of inconsistent lighting; then, we
introduce a mono prior with view-dependent transient encoding to enhance the
reconstructed normals; and
finally, we design a view augmentation fusion strategy that minimizes
pixel-level loss in generated sparse views and semantic loss in augmented
random views, resulting in view-consistent geometry and detailed textures. Our
approach, therefore, enables the integration of a pre-trained MV image
generator and a neural network-based volumetric signed distance function (SDF)
representation for a single image to 3D object reconstruction. We evaluate our
framework on various datasets and demonstrate its superior performance in both
quantitative and qualitative assessments, marking a significant advancement in
3D object reconstruction. Compared with the latest state-of-the-art method
SyncDreamer~\cite{liu2023syncdreamer}, we reduce the Chamfer Distance error by
about 36\% and improve PSNR by about 30\%.
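The abstract's fusion strategy and headline metrics can be made concrete with a short sketch. Below is a minimal, hypothetical PyTorch illustration, not the authors' code: `render_view` (renders the volumetric SDF model from a camera pose), `embed_semantic` (e.g., a CLIP-style image encoder), and the weight `lambda_sem` are assumed names, and the metric helpers only show how PSNR and Chamfer Distance are conventionally computed.

```python
# Minimal sketch, assuming hypothetical callables `render_view` and
# `embed_semantic`; NOT the authors' implementation.
import torch
import torch.nn.functional as F


def fusion_loss(render_view, embed_semantic, sparse_views, random_views,
                ref_embedding, lambda_sem=0.1):
    """Pixel loss on generated sparse views + semantic loss on augmented views.

    sparse_views: list of (pose, target_image) pairs from the MV generator.
    random_views: list of randomly sampled camera poses around the object.
    ref_embedding: semantic feature of the input (e.g., a CLIP image embedding).
    """
    # Pixel-level term: renders of the SDF model must match the generated images.
    pixel = torch.stack([F.l1_loss(render_view(pose), target)
                         for pose, target in sparse_views]).mean()
    # Semantic term: renders from random augmented views should stay close to
    # the reference embedding, constraining regions with no generated pixels.
    semantic = torch.stack([
        1.0 - F.cosine_similarity(embed_semantic(render_view(pose)),
                                  ref_embedding, dim=-1).mean()
        for pose in random_views]).mean()
    return pixel + lambda_sem * semantic


def psnr(pred, target):
    """Peak signal-to-noise ratio for images scaled to [0, 1]."""
    return -10.0 * torch.log10(F.mse_loss(pred, target))


def chamfer_distance(p, q):
    """Symmetric Chamfer Distance between (N, 3) and (M, 3) point sets."""
    d = torch.cdist(p, q)  # (N, M) pairwise Euclidean distances
    return d.min(dim=1).values.mean() + d.min(dim=0).values.mean()
```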
Related papers
- GTR: Improving Large 3D Reconstruction Models through Geometry and Texture Refinement [51.97726804507328]
We propose a novel approach for 3D mesh reconstruction from multi-view images.
Our method takes inspiration from large reconstruction models that use a transformer-based triplane generator and a Neural Radiance Field (NeRF) model trained on multi-view images.
arXiv Detail & Related papers (2024-06-09T05:19:24Z)
- LAM3D: Large Image-Point-Cloud Alignment Model for 3D Reconstruction from Single Image [64.94932577552458]
Large Reconstruction Models have made significant strides in the realm of automated 3D content generation from single or multiple input images.
Despite their success, these models often produce 3D meshes with geometric inaccuracies, stemming from the inherent challenges of deducing 3D shapes solely from image data.
We introduce a novel framework, the Large Image and Point Cloud Alignment Model (LAM3D), which utilizes 3D point cloud data to enhance the fidelity of generated 3D meshes.
arXiv Detail & Related papers (2024-05-24T15:09:12Z)
- GEOcc: Geometrically Enhanced 3D Occupancy Network with Implicit-Explicit Depth Fusion and Contextual Self-Supervision [49.839374549646884]
This paper presents GEOcc, a Geometric-Enhanced Occupancy network tailored for vision-only surround-view perception.
Our approach achieves state-of-the-art performance on the Occ3D-nuScenes dataset with the lowest required image resolution and the lightest image backbone.
arXiv Detail & Related papers (2024-05-17T07:31:20Z)
- FlexiDreamer: Single Image-to-3D Generation with FlexiCubes [20.871847154995688]
FlexiDreamer is a novel framework that directly reconstructs high-quality meshes from multi-view generated images.
Our approach can generate high-fidelity 3D meshes for the single image-to-3D task in approximately one minute.
arXiv Detail & Related papers (2024-04-01T08:20:18Z)
- InvertAvatar: Incremental GAN Inversion for Generalized Head Avatars [40.10906393484584]
We propose a novel framework that enhances avatar reconstruction performance using an algorithm designed to increase fidelity by leveraging multiple frames.
Our architecture emphasizes pixel-aligned image-to-image translation, mitigating the need to learn correspondences between observation and canonical spaces.
The proposed paradigm demonstrates state-of-the-art performance on one-shot and few-shot avatar animation tasks.
arXiv Detail & Related papers (2023-06-29T13:28:16Z)
- One-2-3-45: Any Single Image to 3D Mesh in 45 Seconds without Per-Shape Optimization [30.951405623906258]
Single image 3D reconstruction is an important but challenging task that requires extensive knowledge of our natural world.
We propose a novel method that takes a single image of any object as input and generates a full 360-degree 3D textured mesh in a single feed-forward pass.
arXiv Detail & Related papers (2023-06-29T13:28:16Z)
- High-fidelity 3D GAN Inversion by Pseudo-multi-view Optimization [51.878078860524795]
We present a high-fidelity 3D generative adversarial network (GAN) inversion framework that can synthesize photo-realistic novel views.
Our approach enables high-fidelity 3D rendering from a single image, which is promising for various applications of AI-generated 3D content.
arXiv Detail & Related papers (2022-11-28T18:59:52Z)
- Single-view 3D Mesh Reconstruction for Seen and Unseen Categories [69.29406107513621]
Single-view 3D Mesh Reconstruction is a fundamental computer vision task that aims at recovering 3D shapes from single-view RGB images.
This paper tackles Single-view 3D Mesh Reconstruction, to study the model generalization on unseen categories.
We propose an end-to-end two-stage network, GenMesh, to break the category boundaries in reconstruction.
arXiv Detail & Related papers (2022-08-04T14:13:35Z)
- 2D GANs Meet Unsupervised Single-view 3D Reconstruction [21.93671761497348]
Controllable image generation based on pre-trained GANs can benefit a wide range of computer vision tasks.
We propose a novel image-conditioned neural implicit field, which can leverage 2D supervisions from GAN-generated multi-view images.
The effectiveness of our approach is demonstrated through superior single-view 3D reconstruction results of generic objects.
arXiv Detail & Related papers (2022-07-20T20:24:07Z)
- Shape from Blur: Recovering Textured 3D Shape and Motion of Fast Moving Objects [115.71874459429381]
We address the novel task of jointly reconstructing the 3D shape, texture, and motion of an object from a single motion-blurred image.
While previous approaches address the deblurring problem only in the 2D image domain, our proposed rigorous modeling of all object properties in the 3D domain enables the correct description of arbitrary object motion.
arXiv Detail & Related papers (2021-06-16T13:18:08Z)
This list is automatically generated from the titles and abstracts of the papers on this site.