LIRM: Large Inverse Rendering Model for Progressive Reconstruction of Shape, Materials and View-dependent Radiance Fields
- URL: http://arxiv.org/abs/2504.20026v1
- Date: Mon, 28 Apr 2025 17:48:58 GMT
- Title: LIRM: Large Inverse Rendering Model for Progressive Reconstruction of Shape, Materials and View-dependent Radiance Fields
- Authors: Zhengqin Li, Dilin Wang, Ka Chen, Zhaoyang Lv, Thu Nguyen-Phuoc, Milim Lee, Jia-Bin Huang, Lei Xiao, Cheng Zhang, Yufeng Zhu, Carl S. Marshall, Yufeng Ren, Richard Newcombe, Zhao Dong
- Abstract summary: We present Large Inverse Rendering Model (LIRM), a transformer architecture that jointly reconstructs high-quality shape, materials, and radiance fields. Our model builds upon the recent Large Reconstruction Models (LRMs) that achieve state-of-the-art sparse-view reconstruction quality.
- Score: 23.174562444342286
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We present Large Inverse Rendering Model (LIRM), a transformer architecture that jointly reconstructs high-quality shape, materials, and radiance fields with view-dependent effects in less than a second. Our model builds upon the recent Large Reconstruction Models (LRMs) that achieve state-of-the-art sparse-view reconstruction quality. However, existing LRMs struggle to reconstruct unseen parts accurately and cannot recover glossy appearance or generate relightable 3D content that can be consumed by standard graphics engines. To address these limitations, we make three key technical contributions to build a more practical multi-view 3D reconstruction framework. First, we introduce an update model that allows us to progressively add more input views to improve our reconstruction. Second, we propose a hexa-plane neural SDF representation to better recover detailed textures, geometry and material parameters. Third, we develop a novel neural directional-embedding mechanism to handle view-dependent effects. Trained on a large-scale shape and material dataset with a tailored coarse-to-fine training scheme, our model achieves compelling results. It compares favorably to optimization-based dense-view inverse rendering methods in terms of geometry and relighting accuracy, while requiring only a fraction of the inference time.
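The abstract describes the progressive update mechanism only at a high level. Below is a minimal illustrative sketch, not the authors' released code, of how a feed-forward reconstructor with such an update step might be organized: a persistent set of scene tokens is refined by cross-attending to the tokens of each newly added batch of views. All names and hyperparameters (ProgressiveReconstructor, scene_tokens, patch size, token and layer counts) are assumptions made for this example.

```python
# Minimal sketch (assumption, not the authors' code): a feed-forward
# reconstructor whose "update model" folds newly revealed views into a
# persistent set of scene tokens, so earlier views never need re-encoding.
import torch
import torch.nn as nn


class ProgressiveReconstructor(nn.Module):
    def __init__(self, dim=768, num_scene_tokens=1024, depth=6, heads=8, patch=16):
        super().__init__()
        self.patch = patch
        # Learned tokens holding the current 3D estimate; a real system would
        # decode these into planes, an SDF, and material parameters downstream.
        self.scene_tokens = nn.Parameter(torch.randn(num_scene_tokens, dim) * 0.02)
        self.patch_embed = nn.Linear(3 * patch * patch, dim)
        layer = nn.TransformerDecoderLayer(d_model=dim, nhead=heads, batch_first=True)
        # "Update model": cross-attention from scene tokens to new image tokens.
        self.update_model = nn.TransformerDecoder(layer, num_layers=depth)

    def encode_views(self, views):
        # views: (B, V, 3, H, W) -> (B, V * num_patches, dim).
        # A real model would also inject per-view camera embeddings here.
        B, V, C, H, W = views.shape
        p = self.patch
        x = views.unfold(3, p, p).unfold(4, p, p)           # (B, V, C, H/p, W/p, p, p)
        x = x.permute(0, 1, 3, 4, 2, 5, 6).reshape(B, -1, C * p * p)
        return self.patch_embed(x)

    def forward(self, view_batches):
        # view_batches: list of (B, V_i, 3, H, W) tensors revealed one batch
        # at a time; each pass refines the same scene state.
        B = view_batches[0].shape[0]
        state = self.scene_tokens.unsqueeze(0).expand(B, -1, -1)
        for views in view_batches:
            state = self.update_model(state, self.encode_views(views))
        return state


# Usage: reconstruct from 2 views, then refine with 2 more without redoing work.
model = ProgressiveReconstructor()
first_views = torch.randn(1, 2, 3, 64, 64)
extra_views = torch.randn(1, 2, 3, 64, 64)
tokens = model([first_views, extra_views])  # (1, 1024, 768)
```

The point of the loop is that each refinement step consumes only the newly added views, and the resulting tokens would then be decoded into the hexa-plane SDF and material fields the abstract mentions; the exact decoding is not specified in the abstract and is omitted here.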
Related papers
- Taming Feed-forward Reconstruction Models as Latent Encoders for 3D Generative Models [7.485139478358133]
Recent AI-based 3D content creation has largely evolved along two paths: feed-forward image-to-3D reconstruction approaches and 3D generative models trained with 2D or 3D supervision.
We show that existing feed-forward reconstruction methods can serve as effective latent encoders for training 3D generative models, thereby bridging these two paradigms.
arXiv Detail & Related papers (2024-12-31T21:23:08Z)
- PBR-NeRF: Inverse Rendering with Physics-Based Neural Fields [49.6405458373509]
We present an inverse rendering (IR) model capable of jointly estimating scene geometry, materials, and illumination.
Our method is easily adaptable to other inverse rendering and 3D reconstruction frameworks that require material estimation.
arXiv Detail & Related papers (2024-12-12T19:00:21Z)
- Hyperbolic-constraint Point Cloud Reconstruction from Single RGB-D Images [19.23499128175523]
We introduce hyperbolic space to 3D point cloud reconstruction, enabling the model to represent and understand complex hierarchical structures in point clouds with low distortion. Our model outperforms most existing models, and ablation studies demonstrate the significance of our model and its components.
arXiv Detail & Related papers (2024-12-12T08:27:39Z)
- RelitLRM: Generative Relightable Radiance for Large Reconstruction Models [52.672706620003765]
We propose RelitLRM for generating high-quality Gaussian splatting representations of 3D objects under novel illuminations.
Unlike prior inverse rendering methods requiring dense captures and slow optimization, RelitLRM adopts a feed-forward transformer-based model.
We show our sparse-view feed-forward RelitLRM offers competitive relighting results to state-of-the-art dense-view optimization-based baselines.
arXiv Detail & Related papers (2024-10-08T17:40:01Z)
- AniSDF: Fused-Granularity Neural Surfaces with Anisotropic Encoding for High-Fidelity 3D Reconstruction [55.69271635843385]
We present AniSDF, a novel approach that learns fused-granularity neural surfaces with physics-based encoding for high-fidelity 3D reconstruction. Our method substantially improves the quality of SDF-based methods in both geometry reconstruction and novel-view synthesis.
arXiv Detail & Related papers (2024-10-02T03:10:38Z)
- GTR: Improving Large 3D Reconstruction Models through Geometry and Texture Refinement [51.97726804507328]
We propose a novel approach for 3D mesh reconstruction from multi-view images.
Our method takes inspiration from large reconstruction models that use a transformer-based triplane generator and a Neural Radiance Field (NeRF) model trained on multi-view images.
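The triplane representation referenced here is not detailed in the summary. The sketch below illustrates the common formulation used by EG3D-style large reconstruction models, in which a 3D point's feature is gathered by projecting it onto three axis-aligned feature planes, bilinearly sampling each, and summing. It is a generic illustration under that assumption, not GTR's code; the function name sample_triplane and all shapes are hypothetical.

```python
# Generic triplane feature lookup (illustration of the common formulation,
# not GTR's implementation): project a 3D point onto the XY, XZ and YZ
# feature planes, bilinearly sample each, and aggregate.
import torch
import torch.nn.functional as F

def sample_triplane(planes, points):
    # planes: (3, C, R, R) feature planes; points: (N, 3) in [-1, 1]^3.
    xy = points[:, [0, 1]]
    xz = points[:, [0, 2]]
    yz = points[:, [1, 2]]
    feats = []
    for plane, coords in zip(planes, (xy, xz, yz)):
        grid = coords.view(1, -1, 1, 2)                         # (1, N, 1, 2)
        f = F.grid_sample(plane.unsqueeze(0), grid,
                          mode="bilinear", align_corners=True)  # (1, C, N, 1)
        feats.append(f.squeeze(0).squeeze(-1).t())              # (N, C)
    # Summed features would typically be decoded by a small MLP into
    # density/color (NeRF) or SDF/material values.
    return sum(feats)

planes = torch.randn(3, 32, 128, 128)
pts = torch.rand(4096, 3) * 2 - 1
features = sample_triplane(planes, pts)  # (4096, 32)
```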
arXiv Detail & Related papers (2024-06-09T05:19:24Z)
- CRM: Single Image to 3D Textured Mesh with Convolutional Reconstruction Model [37.75256020559125]
We present a high-fidelity feed-forward single image-to-3D generative model.
We highlight the necessity of integrating geometric priors into network design.
Our model delivers a high-fidelity textured mesh from an image in just 10 seconds, without any test-time optimization.
arXiv Detail & Related papers (2024-03-08T04:25:29Z)
- Learning Versatile 3D Shape Generation with Improved AR Models [91.87115744375052]
Auto-regressive (AR) models have achieved impressive results in 2D image generation by modeling joint distributions in the grid space.
We propose the Improved Auto-regressive Model (ImAM) for 3D shape generation, which applies discrete representation learning based on a latent vector instead of volumetric grids.
arXiv Detail & Related papers (2023-03-26T12:03:18Z)
- Delicate Textured Mesh Recovery from NeRF via Adaptive Surface Refinement [78.48648360358193]
We present a novel framework that generates textured surface meshes from images.
Our approach begins by efficiently initializing the geometry and view-dependent appearance with a NeRF.
We jointly refine the appearance with geometry and bake it into texture images for real-time rendering.
arXiv Detail & Related papers (2023-03-03T17:14:44Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.