Related papers: DiffusionRenderer: Neural Inverse and Forward Rendering with Video Diffusion Models

URL: http://arxiv.org/abs/2501.18590v1
Date: Thu, 30 Jan 2025 18:59:11 GMT
Title: DiffusionRenderer: Neural Inverse and Forward Rendering with Video Diffusion Models
Authors: Ruofan Liang, Zan Gojcic, Huan Ling, Jacob Munkberg, Jon Hasselgren, Zhi-Hao Lin, Jun Gao, Alexander Keller, Nandita Vijaykumar, Sanja Fidler, Zian Wang
Abstract summary: We introduce DiffusionRenderer, a neural approach that addresses the dual problem of inverse and forward rendering. Our model enables practical applications from a single video input--including relighting, material editing, and realistic object insertion.
Score: 83.28670336340608
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Understanding and modeling lighting effects are fundamental tasks in computer vision and graphics. Classic physically-based rendering (PBR) accurately simulates light transport, but relies on precise scene representations--explicit 3D geometry, high-quality material properties, and lighting conditions--that are often impractical to obtain in real-world scenarios. Therefore, we introduce DiffusionRenderer, a neural approach that addresses the dual problem of inverse and forward rendering within a holistic framework. Leveraging powerful video diffusion model priors, the inverse rendering model accurately estimates G-buffers from real-world videos, providing an interface for image editing tasks, and training data for the rendering model. Conversely, our rendering model generates photorealistic images from G-buffers without explicit light transport simulation. Experiments demonstrate that DiffusionRenderer effectively approximates inverse and forward rendering, consistently outperforming the state-of-the-art. Our model enables practical applications from a single video input--including relighting, material editing, and realistic object insertion.

Related papers

RenderFlow: Single-Step Neural Rendering via Flow Matching [17.56739408578129]
We present a novel end-to-end, deterministic, single-step neural rendering framework, RenderFlow, built upon a flow matching paradigm. Our method significantly accelerates the rendering process and enhances both the physical plausibility and overall visual quality of the output. The resulting pipeline achieves near real-time performance with photorealistic rendering quality, effectively bridging the gap between the efficiency of modern generative models and the precision of traditional physically based rendering.
arXiv Detail & Related papers (2026-01-11T14:28:46Z)
BEAM: Bridging Physically-based Rendering and Gaussian Modeling for Relightable Volumetric Video [58.97416204208624]
We present BEAM, a novel pipeline that bridges 4D Gaussian representations with physically-based rendering (PBR) to produce high-quality, relightable videos. By offering realistic, lifelike visualizations under diverse lighting conditions, BEAM opens new possibilities for interactive entertainment, storytelling, and creative visualization.
arXiv Detail & Related papers (2025-02-12T10:58:09Z)
Materialist: Physically Based Editing Using Single-Image Inverse Rendering [50.39048790589746]
We present a method combining a learning-based approach with progressive differentiable rendering. Our method achieves more realistic light-material interactions, accurate shadows, and global illumination. We also propose a method for material transparency editing that operates effectively without requiring full scene geometry.
arXiv Detail & Related papers (2025-01-07T11:52:01Z)
GenLit: Reformulating Single-Image Relighting as Video Generation [44.409962561291216]
We introduce GenLit, a framework that distills the ability of a graphics engine to perform light manipulation into a video generation model. We find that a model fine-tuned on only a small synthetic dataset is able to generalize to real images.
arXiv Detail & Related papers (2024-12-15T15:40:40Z)
Photorealistic Object Insertion with Diffusion-Guided Inverse Rendering [56.68286440268329]
Correct insertion of virtual objects in images of real-world scenes requires a deep understanding of the scene's lighting, geometry and materials. We propose using a personalized large diffusion model as guidance to a physically based inverse rendering process. Our method recovers scene lighting and tone-mapping parameters, allowing the photorealistic composition of arbitrary virtual objects in single frames or videos of indoor or outdoor scenes.
arXiv Detail & Related papers (2024-08-19T05:15:45Z)
FaceFolds: Meshed Radiance Manifolds for Efficient Volumetric Rendering of Dynamic Faces [21.946327323788275]
3D rendering of dynamic faces is a challenging problem. We present a novel representation that enables high-quality rendering of an actor's dynamic facial performances.
arXiv Detail & Related papers (2024-04-22T00:44:13Z)
FLARE: Fast Learning of Animatable and Relightable Mesh Avatars [64.48254296523977]
Our goal is to efficiently learn personalized animatable 3D head avatars from videos that are geometrically accurate, realistic, relightable, and compatible with current rendering systems. We introduce FLARE, a technique that enables the creation of animatable and relightable avatars from a single monocular video.
arXiv Detail & Related papers (2023-10-26T16:13:00Z)
Inverse Rendering of Translucent Objects using Physical and Neural Renderers [13.706425832518093]
In this work, we propose an inverse model that estimates 3D shape, spatially-varying reflectance, homogeneous scattering parameters, and an environment illumination jointly from only a pair of captured images of a translucent object. Because two reconstructions are differentiable, we can compute a reconstruction loss to assist parameter estimation. We constructed a large-scale synthetic dataset of translucent objects, which consists of 117K scenes.
arXiv Detail & Related papers (2023-05-15T04:03:11Z)
Relightify: Relightable 3D Faces from a Single Image via Diffusion Models [86.3927548091627]
We present the first approach to use diffusion models as a prior for highly accurate 3D facial BRDF reconstruction from a single image. In contrast to existing methods, we directly acquire the observed texture from the input image, resulting in more faithful and consistent estimation.
arXiv Detail & Related papers (2023-05-10T11:57:49Z)
DIB-R++: Learning to Predict Lighting and Material with a Hybrid Differentiable Renderer [78.91753256634453]
We consider the challenging problem of predicting intrinsic object properties from a single image by exploiting differentiable renderers. In this work, we propose DIB-R++, a hybrid differentiable renderer which supports these effects by combining rasterization and ray-tracing. Compared to more advanced physics-based differentiable renderers, DIB-R++ is highly performant due to its compact and expressive model.
arXiv Detail & Related papers (2021-10-30T01:59:39Z)
Intrinsic Autoencoders for Joint Neural Rendering and Intrinsic Image Decomposition [67.9464567157846]
We propose an autoencoder for joint generation of realistic images from synthetic 3D models while simultaneously decomposing real images into their intrinsic shape and appearance properties. Our experiments confirm that a joint treatment of rendering and decomposition is indeed beneficial and that our approach outperforms state-of-the-art image-to-image translation baselines both qualitatively and quantitatively.
arXiv Detail & Related papers (2020-06-29T12:53:58Z)

This list is automatically generated from the titles and abstracts of the papers on this site.