Learning-based Inverse Rendering of Complex Indoor Scenes with
Differentiable Monte Carlo Raytracing
- URL: http://arxiv.org/abs/2211.03017v1
- Date: Sun, 6 Nov 2022 03:34:26 GMT
- Title: Learning-based Inverse Rendering of Complex Indoor Scenes with
Differentiable Monte Carlo Raytracing
- Authors: Jingsen Zhu, Fujun Luan, Yuchi Huo, Zihao Lin, Zhihua Zhong, Dianbing
Xi, Jiaxiang Zheng, Rui Tang, Hujun Bao, Rui Wang
- Abstract summary: This work presents an end-to-end, learning-based inverse rendering framework incorporating differentiable Monte Carlo raytracing with importance sampling.
The framework takes a single image as input to jointly recover the underlying geometry, spatially-varying lighting, and photorealistic materials.
- Score: 27.96634370355241
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Indoor scenes typically exhibit complex, spatially-varying appearance from
global illumination, making inverse rendering a challenging ill-posed problem.
This work presents an end-to-end, learning-based inverse rendering framework
incorporating differentiable Monte Carlo raytracing with importance sampling.
The framework takes a single image as input to jointly recover the underlying
geometry, spatially-varying lighting, and photorealistic materials.
Specifically, we introduce a physically-based differentiable rendering layer
with screen-space ray tracing, resulting in more realistic specular reflections
that match the input photo. In addition, we create a large-scale,
photorealistic indoor scene dataset with significantly richer details like
complex furniture and dedicated decorations. Further, we design a novel
out-of-view lighting network with uncertainty-aware refinement leveraging
hypernetwork-based neural radiance fields to predict lighting outside the view
of the input photo. Through extensive evaluations on common benchmark datasets,
we demonstrate superior inverse rendering quality of our method compared to
state-of-the-art baselines, enabling various applications such as complex
object insertion and material editing with high fidelity. Code and data will be
made available at https://jingsenzhu.github.io/invrend.
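To make the headline component concrete, below is a minimal sketch, in PyTorch, of a differentiable Monte Carlo shading estimator with cosine-weighted importance sampling. It is not the authors' rendering layer: the Lambertian-only BRDF, the env_light placeholder, and all function names are illustrative assumptions (the paper's layer additionally performs screen-space ray tracing for specular reflections).

```python
# Minimal sketch under the assumptions above, NOT the paper's implementation.
import torch

def sample_cosine_hemisphere(n):
    # Cosine-weighted directions in a local frame where z is the normal.
    u1, u2 = torch.rand(n), torch.rand(n)
    r, phi = torch.sqrt(u1), 2.0 * torch.pi * u2
    dirs = torch.stack([r * torch.cos(phi), r * torch.sin(phi),
                        torch.sqrt(1.0 - u1)], dim=-1)   # (n, 3)
    pdf = dirs[:, 2] / torch.pi                          # p(w) = cos(theta)/pi
    return dirs, pdf

def env_light(dirs):
    # Hypothetical incident lighting: brighter toward +z ("sky").
    return 0.5 + 0.5 * dirs[:, 2:3].clamp(min=0.0)

def estimate_radiance(albedo, n_samples=4096):
    # L_o = integral of f_r * L_i * cos(theta) dw, estimated as
    # (1/N) * sum of f_r * L_i * cos(theta) / pdf  (importance sampling).
    dirs, pdf = sample_cosine_hemisphere(n_samples)
    f_r = albedo / torch.pi                              # Lambertian BRDF
    contrib = f_r * env_light(dirs) * dirs[:, 2:3] / pdf.unsqueeze(-1)
    return contrib.mean(dim=0)

# Gradients flow through the estimator to the material parameters,
# which is what lets a photometric loss supervise the recovered BRDF.
albedo = torch.tensor([0.6, 0.4, 0.3], requires_grad=True)
loss = (estimate_radiance(albedo) - 0.3).pow(2).sum()
loss.backward()
print(albedo.grad)
```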
Related papers
- MAIR++: Improving Multi-view Attention Inverse Rendering with Implicit Lighting Representation [17.133440382384578]
A framework called Multi-view Attention Inverse Rendering (MAIR) was recently introduced to improve the quality of scene-level inverse rendering.
Building on it, we propose a scene-level inverse rendering framework that uses multi-view images to decompose the scene into geometry, SVBRDF, and 3D spatially-varying lighting.
arXiv Detail & Related papers (2024-08-13T08:04:23Z)
- Holistic Inverse Rendering of Complex Facade via Aerial 3D Scanning [38.72679977945778]
We use multi-view aerial images to reconstruct the geometry, lighting, and material of facades using neural signed distance fields (SDFs).
Experiments demonstrate the superior quality of our method on holistic facade inverse rendering, novel view synthesis, and scene editing compared to state-of-the-art baselines.
arXiv Detail & Related papers (2023-11-20T15:03:56Z)
- Spatiotemporally Consistent HDR Indoor Lighting Estimation [66.26786775252592]
We propose a physically-motivated deep learning framework to solve the indoor lighting estimation problem.
Given a single LDR image with a depth map, our method predicts spatially consistent lighting at any given image position.
Our framework achieves photorealistic lighting prediction with higher quality compared to state-of-the-art single-image or video-based methods.
arXiv Detail & Related papers (2023-05-07T20:36:29Z)
- Neural Fields meet Explicit Geometric Representation for Inverse Rendering of Urban Scenes [62.769186261245416]
We present a novel inverse rendering framework for large urban scenes capable of jointly reconstructing the scene geometry, spatially-varying materials, and HDR lighting from a set of posed RGB images with optional depth.
Specifically, we use a neural field to account for the primary rays, and use an explicit mesh (reconstructed from the underlying neural field) for modeling secondary rays that produce higher-order lighting effects such as cast shadows.
arXiv Detail & Related papers (2023-04-06T17:51:54Z)
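To illustrate that primary/secondary split, here is a toy sketch: the primary ray is sphere-traced through an analytic SDF (standing in for the learned neural field), and the secondary shadow ray is tested against one explicit triangle (standing in for the mesh extracted from the field). Every function and scene value here is hypothetical.

```python
import numpy as np

def sdf_sphere(p):
    # Analytic unit sphere at the origin; a learned neural field in the paper.
    return np.linalg.norm(p) - 1.0

def sphere_trace(origin, direction, max_steps=128, eps=1e-4):
    # Primary ray: march along the SDF until the surface is reached.
    t = 0.0
    for _ in range(max_steps):
        d = sdf_sphere(origin + t * direction)
        if d < eps:
            return origin + t * direction
        t += d
        if t > 20.0:
            break
    return None

def hits_triangle(origin, direction, v0, v1, v2, eps=1e-8):
    # Secondary ray: Moller-Trumbore intersection against an explicit
    # triangle, standing in for the mesh extracted from the field.
    e1, e2 = v1 - v0, v2 - v0
    h = np.cross(direction, e2)
    a = np.dot(e1, h)
    if abs(a) < eps:
        return False
    f = 1.0 / a
    s = origin - v0
    u = f * np.dot(s, h)
    if u < 0.0 or u > 1.0:
        return False
    q = np.cross(s, e1)
    v = f * np.dot(direction, q)
    if v < 0.0 or u + v > 1.0:
        return False
    return f * np.dot(e2, q) > eps

cam, pixel_dir = np.array([0.0, 0.0, 5.0]), np.array([0.0, 0.0, -1.0])
light = np.array([0.0, 5.0, 0.0])
occluder = [np.array([-2.0, 2.0, -2.0]), np.array([2.0, 2.0, -2.0]),
            np.array([0.0, 2.0, 2.0])]

hit = sphere_trace(cam, pixel_dir)                 # primary: the "field"
if hit is not None:
    shadow_dir = (light - hit) / np.linalg.norm(light - hit)
    shadowed = hits_triangle(hit + 1e-3 * shadow_dir, shadow_dir, *occluder)
    print("hit:", hit.round(3), "shadowed:", shadowed)
```

The control flow is the point: a cheap, differentiable query answers primary visibility, while explicit geometry handles the higher-order visibility tests such as cast shadows.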
- IRISformer: Dense Vision Transformers for Single-Image Inverse Rendering in Indoor Scenes [99.76677232870192]
We show how a dense vision transformer, IRISformer, excels at both single-task and multi-task reasoning required for inverse rendering.
Specifically, we propose a transformer architecture to simultaneously estimate depths, normals, spatially-varying albedo, roughness and lighting from a single image of an indoor scene.
Our evaluations on benchmark datasets demonstrate state-of-the-art results on each of the above tasks, enabling applications like object insertion and material editing in a single unconstrained real image.
arXiv Detail & Related papers (2022-06-16T19:50:55Z)
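As a rough sketch of that single-encoder, multi-head layout, the toy module below runs a small transformer over image patches and decodes depth, normals, albedo, and roughness with one lightweight head each. The dimensions, names, and absence of positional encodings are illustrative simplifications, not the IRISformer architecture.

```python
import torch
import torch.nn as nn

class MultiTaskIntrinsics(nn.Module):
    # Hypothetical stand-in: shared transformer encoder, one head per task.
    def __init__(self, patch=8, dim=64, img=64):
        super().__init__()
        self.patch, self.img = patch, img
        self.embed = nn.Conv2d(3, dim, kernel_size=patch, stride=patch)
        layer = nn.TransformerEncoderLayer(d_model=dim, nhead=4,
                                           batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=2)
        self.channels = {"depth": 1, "normals": 3, "albedo": 3, "roughness": 1}
        # Each head predicts one patch of values per token.
        self.heads = nn.ModuleDict(
            {k: nn.Linear(dim, c * patch * patch)
             for k, c in self.channels.items()})

    def forward(self, x):
        b = x.shape[0]
        tokens = self.embed(x).flatten(2).transpose(1, 2)   # (B, N, dim)
        tokens = self.encoder(tokens)
        side = self.img // self.patch
        out = {}
        for k, head in self.heads.items():
            c = self.channels[k]
            y = head(tokens).transpose(1, 2)                # (B, c*p*p, N)
            y = y.reshape(b, c * self.patch ** 2, side, side)
            out[k] = nn.functional.pixel_shuffle(y, self.patch)  # (B, c, H, W)
        return out

model = MultiTaskIntrinsics()
preds = model(torch.randn(2, 3, 64, 64))
print({k: tuple(v.shape) for k, v in preds.items()})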
- DIB-R++: Learning to Predict Lighting and Material with a Hybrid Differentiable Renderer [78.91753256634453]
We consider the challenging problem of predicting intrinsic object properties from a single image by exploiting differentiable rendering.
In this work, we propose DIBR++, a hybrid differentiable renderer which supports these effects by combining rasterization and ray-tracing.
Compared to more advanced physics-based differentiable renderers, DIBR++ is highly performant due to its compact and expressive model.
arXiv Detail & Related papers (2021-10-30T01:59:39Z)
- OpenRooms: An End-to-End Open Framework for Photorealistic Indoor Scene Datasets [103.54691385842314]
We propose a novel framework for creating large-scale photorealistic datasets of indoor scenes.
Our goal is to make the dataset creation process widely accessible.
This enables important applications in inverse rendering, scene understanding and robotics.
arXiv Detail & Related papers (2020-07-25T06:48:47Z)
- Deep 3D Capture: Geometry and Reflectance from Sparse Multi-View Images [59.906948203578544]
We introduce a novel learning-based method to reconstruct the high-quality geometry and complex, spatially-varying BRDF of an arbitrary object.
We first estimate per-view depth maps using a deep multi-view stereo network.
These depth maps are used to coarsely align the different views.
We propose a novel multi-view reflectance estimation network architecture.
arXiv Detail & Related papers (2020-03-27T21:28:54Z)
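For intuition about that depth-based alignment step, here is a hedged sketch that unprojects one pixel with its predicted depth and reprojects it into a neighboring view. The intrinsics and relative pose are made-up values, not the paper's.

```python
import numpy as np

K = np.array([[500.0, 0.0, 320.0],      # hypothetical shared pinhole
              [0.0, 500.0, 240.0],      # intrinsics for both views
              [0.0, 0.0, 1.0]])
R = np.eye(3)                            # rotation: view 1 -> view 2
t = np.array([0.1, 0.0, 0.0])            # 10 cm baseline along x

def reproject(u, v, depth):
    # Unproject pixel (u, v) at the given depth into the view-1 camera frame.
    p_cam1 = depth * np.linalg.inv(K) @ np.array([u, v, 1.0])
    # Transform into view 2 and project back to pixel coordinates.
    p_cam2 = R @ p_cam1 + t
    uvw = K @ p_cam2
    return uvw[:2] / uvw[2]

print(reproject(320, 240, depth=2.0))    # shifts by 500 * 0.1 / 2.0 = 25 px in u
```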