UrbanIR: Large-Scale Urban Scene Inverse Rendering from a Single Video
- URL: http://arxiv.org/abs/2306.09349v4
- Date: Tue, 14 Jan 2025 22:05:06 GMT
- Title: UrbanIR: Large-Scale Urban Scene Inverse Rendering from a Single Video
- Authors: Chih-Hao Lin, Bohan Liu, Yi-Ting Chen, Kuan-Sheng Chen, David Forsyth, Jia-Bin Huang, Anand Bhattad, Shenlong Wang
- Abstract summary: We present UrbanIR, a new inverse graphics model that enables realistic, free-viewpoint renderings of scenes under various lighting conditions with a single video.
It accurately infers shape, albedo, visibility, and sun and sky illumination from wide-baseline videos, such as those from car-mounted cameras.
In this wide-baseline setting, standard methods often yield subpar geometry and material estimates; UrbanIR addresses these issues with novel losses that reduce errors in inverse graphics inference and rendering artifacts.
- Score: 22.613946218766802
- Abstract: We present UrbanIR (Urban Scene Inverse Rendering), a new inverse graphics model that enables realistic, free-viewpoint renderings of scenes under various lighting conditions with a single video. It accurately infers shape, albedo, visibility, and sun and sky illumination from wide-baseline videos, such as those from car-mounted cameras, differing from NeRF's dense view settings. In this context, standard methods often yield subpar geometry and material estimates, such as inaccurate roof representations and numerous 'floaters'. UrbanIR addresses these issues with novel losses that reduce errors in inverse graphics inference and rendering artifacts. Its techniques allow for precise shadow volume estimation in the original scene. The model's outputs support controllable editing, enabling photorealistic free-viewpoint renderings of night simulations, relit scenes, and inserted objects, marking a significant improvement over existing state-of-the-art methods.
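As a rough illustration of the decomposition described in the abstract (albedo, normals, sun visibility, and sun/sky illumination), the sketch below relights per-pixel inverse-rendering outputs with a Lambertian sun term plus a constant sky term. This is not UrbanIR's implementation; the shading model and all names are illustrative assumptions.

```python
import numpy as np

def relight_pixels(albedo, normals, sun_visibility, sun_dir, sun_radiance, sky_radiance):
    """Toy Lambertian relighting of per-pixel inverse-rendering outputs.

    albedo:         (H, W, 3) diffuse albedo
    normals:        (H, W, 3) unit surface normals
    sun_visibility: (H, W)    1.0 where the sun is unoccluded, 0.0 in shadow
    sun_dir:        (3,)      unit vector pointing toward the sun
    sun_radiance:   (3,)      RGB intensity of the sun
    sky_radiance:   (3,)      constant RGB ambient term standing in for the sky
    """
    # Lambertian cosine term, clamped so back-facing surfaces get no sunlight.
    cos_theta = np.clip(normals @ sun_dir, 0.0, None)                   # (H, W)
    # Direct sunlight is gated by the estimated shadow visibility.
    direct = sun_visibility[..., None] * cos_theta[..., None] * sun_radiance
    # Sky illumination is approximated here by a constant ambient term.
    shading = direct + sky_radiance
    return albedo * shading
```

Under these assumptions, relighting or night simulation amounts to changing `sun_dir`, `sun_radiance`, and `sky_radiance` while keeping the recovered albedo, normals, and visibility fixed.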
Related papers
- DiffusionRenderer: Neural Inverse and Forward Rendering with Video Diffusion Models [83.28670336340608]
We introduce DiffusionRenderer, a neural approach that addresses the dual problem of inverse and forward rendering.
Our model enables practical applications from a single video input, including relighting, material editing, and realistic object insertion.
arXiv Detail & Related papers (2025-01-30T18:59:11Z)
- StreetCrafter: Street View Synthesis with Controllable Video Diffusion Models [59.55232046525733]
We introduce StreetCrafter, a controllable video diffusion model that utilizes LiDAR point cloud renderings as pixel-level conditions.
In addition, the utilization of pixel-level LiDAR conditions allows us to make accurate pixel-level edits to target scenes.
Our model enables flexible control over viewpoint changes and enlarges the regions that can be rendered satisfactorily.
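As a rough sketch of what "LiDAR point cloud renderings as pixel-level conditions" can look like, the code below projects colored LiDAR points into a target camera with a pinhole model and splats them into a condition image; all names and the nearest-point splatting are assumptions, not StreetCrafter's code.

```python
import numpy as np

def render_lidar_condition(points, colors, K, world_to_cam, height, width):
    """Splat colored LiDAR points into a pixel-aligned condition image.

    points:       (N, 3) LiDAR points in world coordinates
    colors:       (N, 3) per-point colors (or intensities broadcast to RGB)
    K:            (3, 3) camera intrinsics of the target viewpoint
    world_to_cam: (4, 4) extrinsics of the target viewpoint
    """
    # Transform points into the camera frame and keep those in front of it.
    pts_h = np.concatenate([points, np.ones((len(points), 1))], axis=1)
    cam = (world_to_cam @ pts_h.T).T[:, :3]
    front = cam[:, 2] > 1e-3
    cam, colors = cam[front], colors[front]

    # Pinhole projection to integer pixel coordinates.
    uv = (K @ cam.T).T
    uv = uv[:, :2] / uv[:, 2:3]
    u, v = uv[:, 0].astype(int), uv[:, 1].astype(int)
    inside = (u >= 0) & (u < width) & (v >= 0) & (v < height)

    # Nearest-point-wins splatting: draw far points first so near ones overwrite.
    order = np.argsort(-cam[inside, 2])
    image = np.zeros((height, width, 3), dtype=np.float32)
    image[v[inside][order], u[inside][order]] = colors[inside][order]
    return image
```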
arXiv Detail & Related papers (2024-12-17T18:58:55Z)
- Sat2Scene: 3D Urban Scene Generation from Satellite Images with Diffusion [77.34078223594686]
We propose a novel architecture for direct 3D scene generation by introducing diffusion models into 3D sparse representations and combining them with neural rendering techniques.
Specifically, our approach first generates texture colors at the point level for a given geometry using a 3D diffusion model, and then transforms them into a scene representation in a feed-forward manner.
Experiments on two city-scale datasets show that our model demonstrates proficiency in generating photo-realistic street-view image sequences and cross-view urban scenes from satellite imagery.
arXiv Detail & Related papers (2024-01-19T16:15:37Z)
- Neural Fields meet Explicit Geometric Representation for Inverse Rendering of Urban Scenes [62.769186261245416]
We present a novel inverse rendering framework for large urban scenes capable of jointly reconstructing the scene geometry, spatially-varying materials, and HDR lighting from a set of posed RGB images with optional depth.
Specifically, we use a neural field to account for the primary rays, and use an explicit mesh (reconstructed from the underlying neural field) for modeling secondary rays that produce higher-order lighting effects such as cast shadows.
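A minimal sketch of the secondary-ray idea, assuming the extracted mesh is queried with `trimesh` ray tests to decide whether each shaded point can see the sun; the interface and names are illustrative, not the paper's implementation.

```python
import numpy as np
import trimesh

def sun_shadow_mask(surface_points, surface_normals, sun_dir, mesh, eps=1e-3):
    """Binary shadow test: cast a ray from each surface point toward the sun.

    surface_points:  (N, 3) points shaded by the primary (neural-field) rays
    surface_normals: (N, 3) unit normals at those points
    sun_dir:         (3,)   unit direction toward the sun
    mesh:            trimesh.Trimesh extracted from the underlying neural field
    Returns a (N,) array: 1.0 where the sun is visible, 0.0 where occluded.
    """
    # Offset origins slightly along the normal to avoid self-intersection.
    origins = surface_points + eps * surface_normals
    directions = np.tile(sun_dir, (len(origins), 1))
    # trimesh reports whether each ray hits any triangle of the mesh.
    occluded = mesh.ray.intersects_any(ray_origins=origins,
                                       ray_directions=directions)
    return (~occluded).astype(np.float32)
```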
arXiv Detail & Related papers (2023-04-06T17:51:54Z)
- Light Field Neural Rendering [47.7586443731997]
Classical light field rendering can reproduce view-dependent effects but requires dense view sampling, while methods based on geometric reconstruction need only sparse views but cannot accurately model non-Lambertian effects.
We introduce a model that combines the strengths and mitigates the limitations of these two directions.
Our model outperforms the state-of-the-art on multiple forward-facing and 360° datasets.
arXiv Detail & Related papers (2021-12-17T18:58:05Z)
- RegNeRF: Regularizing Neural Radiance Fields for View Synthesis from Sparse Inputs [79.00855490550367]
NeRF can produce photorealistic renderings of unseen viewpoints when many input views are available, but its performance drops significantly when only sparse inputs are given.
We address this by regularizing the geometry and appearance of patches rendered from unobserved viewpoints.
Our model outperforms not only other methods that optimize over a single scene, but also conditional models that are extensively pre-trained on large multi-view datasets.
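A minimal sketch of one such regularizer, assuming a depth-smoothness penalty on small patches rendered from randomly sampled unobserved camera poses (the patch size, penalty, and names are assumptions):

```python
import torch

def depth_smoothness_loss(patch_depth):
    """Encourage piecewise-smooth geometry on rendered depth patches.

    patch_depth: (B, S, S) depths of S x S patches rendered from
                 randomly sampled, unobserved camera poses.
    """
    # Penalize squared differences between horizontally and vertically
    # neighboring depth samples within each patch.
    dx = patch_depth[:, :, 1:] - patch_depth[:, :, :-1]
    dy = patch_depth[:, 1:, :] - patch_depth[:, :-1, :]
    return (dx ** 2).mean() + (dy ** 2).mean()

# Usage sketch: sample an unobserved pose, render a small patch with the NeRF,
# and add this term (weighted) to the reconstruction loss on observed views.
```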
arXiv Detail & Related papers (2021-12-01T18:59:46Z)
- Neural Reflectance Fields for Appearance Acquisition [61.542001266380375]
We present Neural Reflectance Fields, a novel deep scene representation that encodes volume density, normal and reflectance properties at any 3D point in a scene.
We combine this representation with a physically-based differentiable ray marching framework that can render images from a neural reflectance field under any viewpoint and light.
arXiv Detail & Related papers (2020-08-09T22:04:36Z)
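A compact sketch of the kind of physically-based differentiable ray marching described above, assuming a Lambertian reflectance model, a single point light, and a `field(x)` callable returning density, normals, and albedo; these assumptions and all names are illustrative, not the paper's implementation.

```python
import torch

def march_reflectance_field(field, ray_o, ray_d, light_pos, light_intensity,
                            near=0.1, far=10.0, n_samples=64):
    """Render one ray by marching through a neural reflectance field.

    `field(x)` is assumed to return (sigma, normal, albedo) at 3D points x:
    volume density (S,), unit normals (S, 3), and diffuse albedo (S, 3).
    """
    t = torch.linspace(near, far, n_samples)
    pts = ray_o + t[:, None] * ray_d                        # (S, 3) samples
    sigma, normal, albedo = field(pts)

    # Simple Lambertian shading under a single point light
    # (shadowing along the light ray is omitted in this sketch).
    to_light = light_pos - pts
    dist2 = (to_light ** 2).sum(-1, keepdim=True)
    l_dir = to_light / dist2.sqrt()
    cos = (normal * l_dir).sum(-1, keepdim=True).clamp(min=0.0)
    radiance = albedo * light_intensity / dist2 * cos       # (S, 3)

    # Standard volume-rendering quadrature (alpha compositing).
    delta = t[1] - t[0]
    alpha = 1.0 - torch.exp(-sigma * delta)                 # (S,)
    trans = torch.cumprod(
        torch.cat([alpha.new_ones(1), 1.0 - alpha + 1e-10], dim=0), dim=0)[:-1]
    weights = alpha * trans                                 # (S,)
    return (weights[:, None] * radiance).sum(dim=0)         # RGB for this ray
```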
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of this information and is not responsible for any consequences of its use.