EndoPBR: Material and Lighting Estimation for Photorealistic Surgical Simulations via Physically-based Rendering
- URL: http://arxiv.org/abs/2502.20669v1
- Date: Fri, 28 Feb 2025 02:50:59 GMT
- Title: EndoPBR: Material and Lighting Estimation for Photorealistic Surgical Simulations via Physically-based Rendering
- Authors: John J. Han, Jie Ying Wu
- Abstract summary: The lack of labeled datasets in 3D vision for surgical scenes inhibits the development of robust 3D reconstruction algorithms. We introduce a differentiable rendering framework for material and lighting estimation from endoscopic images and known geometry. By grounding color predictions in the rendering equation, we can generate photorealistic images at arbitrary camera poses.
- Score: 1.03590082373586
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: The lack of labeled datasets in 3D vision for surgical scenes inhibits the development of robust 3D reconstruction algorithms in the medical domain. Despite the popularity of Neural Radiance Fields and 3D Gaussian Splatting in the general computer vision community, these systems have yet to find consistent success in surgical scenes due to challenges such as non-stationary lighting and non-Lambertian surfaces. As a result, the need for labeled surgical datasets continues to grow. In this work, we introduce a differentiable rendering framework for material and lighting estimation from endoscopic images and known geometry. Compared to previous approaches that model lighting and material jointly as radiance, we explicitly disentangle these scene properties for robust and photorealistic novel view synthesis. To disambiguate the training process, we formulate domain-specific properties inherent in surgical scenes. Specifically, we model the scene lighting as a simple spotlight and material properties as a bidirectional reflectance distribution function, parameterized by a neural network. By grounding color predictions in the rendering equation, we can generate photorealistic images at arbitrary camera poses. We evaluate our method with various sequences from the Colonoscopy 3D Video Dataset and show that our method produces competitive novel view synthesis results compared with other approaches. Furthermore, we demonstrate that synthetic data can be used to develop 3D vision algorithms by finetuning a depth estimation model with our rendered outputs. Overall, we see that the depth estimation performance is on par with fine-tuning with the original real images.
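To make the shading model described in the abstract concrete, the following is a minimal sketch, assuming PyTorch, of how a neural BRDF paired with a spotlight co-located with the camera could be evaluated under the local form of the rendering equation, L_o(x, w_o) = f(x, w_i, w_o) * L_i(x, w_i) * max(0, n . w_i). This is not the authors' implementation; the network architecture, the co-located-light assumption, and the inverse-square falloff are illustrative choices.

```python
# Hedged sketch of neural-BRDF shading with a camera-co-located spotlight.
# All names and constants are assumptions for illustration.
import torch
import torch.nn as nn


class NeuralBRDF(nn.Module):
    """Maps (surface point, light dir, view dir) to an RGB reflectance value."""

    def __init__(self, hidden: int = 128):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(9, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 3), nn.Sigmoid(),  # reflectance in [0, 1]
        )

    def forward(self, x, wi, wo):
        return self.mlp(torch.cat([x, wi, wo], dim=-1))


def shade(points, normals, cam_pos, brdf, light_intensity=10.0):
    """Per-point RGB assuming a spotlight at the camera (e.g. an endoscope tip)."""
    to_cam = cam_pos - points
    dist = to_cam.norm(dim=-1, keepdim=True).clamp(min=1e-6)
    wi = to_cam / dist                               # light direction
    wo = wi                                          # co-located light and camera
    cos_theta = (normals * wi).sum(-1, keepdim=True).clamp(min=0.0)
    irradiance = light_intensity / dist.pow(2)       # inverse-square falloff (assumed)
    return brdf(points, wi, wo) * irradiance * cos_theta


if __name__ == "__main__":
    brdf = NeuralBRDF()
    pts = torch.randn(1024, 3)
    nrm = torch.nn.functional.normalize(torch.randn(1024, 3), dim=-1)
    rgb = shade(pts, nrm, cam_pos=torch.zeros(3), brdf=brdf)
    print(rgb.shape)  # torch.Size([1024, 3])
```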
Related papers
- EVolSplat: Efficient Volume-based Gaussian Splatting for Urban View Synthesis [61.1662426227688]
Existing NeRF and 3DGS-based methods show promising results in achieving photorealistic renderings but require slow, per-scene optimization.
We introduce EVolSplat, an efficient 3D Gaussian Splatting model for urban scenes that works in a feed-forward manner.
arXiv Detail & Related papers (2025-03-26T02:47:27Z)
- VI3DRM: Towards meticulous 3D Reconstruction from Sparse Views via Photo-Realistic Novel View Synthesis [22.493542492218303]
Visual Isotropy 3D Reconstruction Model (VI3DRM) is a sparse views 3D reconstruction model that operates within an ID consistent and perspective-disentangled 3D latent space.
By facilitating the disentanglement of semantic information, color, material properties and lighting, VI3DRM is capable of generating highly realistic images.
arXiv Detail & Related papers (2024-09-12T16:47:57Z)
- EndoSparse: Real-Time Sparse View Synthesis of Endoscopic Scenes using Gaussian Splatting [39.60431471170721]
3D reconstruction of biological tissues from a collection of endoscopic images is a key to unlock various important downstream surgical applications with 3D capabilities.
Existing methods employ various advanced neural rendering techniques for view synthesis, but they often struggle to recover accurate 3D representations when only sparse observations are available.
We propose a framework, dubbed EndoSparse, that leverages prior knowledge from multiple foundation models during the reconstruction process.
arXiv Detail & Related papers (2024-07-01T07:24:09Z)
- RGB-D Mapping and Tracking in a Plenoxel Radiance Field [5.239559610798646]
We present the vital differences between view synthesis models and 3D reconstruction models.
We also comment on why a depth sensor is essential for modeling accurate geometry in general outward-facing scenes.
Our method achieves state-of-the-art results in both mapping and tracking tasks, while also being faster than competing neural network-based approaches.
arXiv Detail & Related papers (2023-07-07T06:05:32Z)
- Extracting Triangular 3D Models, Materials, and Lighting From Images [59.33666140713829]
We present an efficient method for joint optimization of materials and lighting from multi-view image observations.
We leverage meshes with spatially-varying materials and environment lighting that can be deployed in any traditional graphics engine.
arXiv Detail & Related papers (2021-11-24T13:58:20Z)
- A Shading-Guided Generative Implicit Model for Shape-Accurate 3D-Aware Image Synthesis [163.96778522283967]
We propose a shading-guided generative implicit model that is able to learn a starkly improved shape representation.
An accurate 3D shape should also yield a realistic rendering under different lighting conditions.
Our experiments on multiple datasets show that the proposed approach achieves photorealistic 3D-aware image synthesis.
arXiv Detail & Related papers (2021-10-29T10:53:12Z)
- Learning Indoor Inverse Rendering with 3D Spatially-Varying Lighting [149.1673041605155]
We address the problem of jointly estimating albedo, normals, depth and 3D spatially-varying lighting from a single image.
Most existing methods formulate the task as image-to-image translation, ignoring the 3D properties of the scene.
We propose a unified, learning-based inverse rendering framework that formulates 3D spatially-varying lighting.
arXiv Detail & Related papers (2021-09-13T15:29:03Z)
- Neural Reflectance Fields for Appearance Acquisition [61.542001266380375]
We present Neural Reflectance Fields, a novel deep scene representation that encodes volume density, normal and reflectance properties at any 3D point in a scene.
We combine this representation with a physically-based differentiable ray marching framework that can render images from a neural reflectance field under any viewpoint and light (a generic sketch of this rendering style appears after this list).
arXiv Detail & Related papers (2020-08-09T22:04:36Z)
- Intrinsic Autoencoders for Joint Neural Rendering and Intrinsic Image Decomposition [67.9464567157846]
We propose an autoencoder for joint generation of realistic images from synthetic 3D models while simultaneously decomposing real images into their intrinsic shape and appearance properties.
Our experiments confirm that a joint treatment of rendering and decomposition is indeed beneficial and that our approach outperforms state-of-the-art image-to-image translation baselines both qualitatively and quantitatively.
arXiv Detail & Related papers (2020-06-29T12:53:58Z)
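The Neural Reflectance Fields entry above mentions physically-based differentiable ray marching over a field of density, normals, and reflectance. As a rough, generic illustration only, and not that paper's implementation, the sketch below samples points along each ray, shades each sample with a single point light, and alpha-composites the results; the `field` interface, the light model, and all constants are assumptions.

```python
# Hedged sketch of reflectance-field ray marching (generic, not the paper's code).
import torch


def march_rays(field, rays_o, rays_d, light_pos, n_samples=64, near=0.1, far=2.0):
    """field(x) -> (sigma [N,S,1], normal [N,S,3], reflectance [N,S,3]); assumed API."""
    t = torch.linspace(near, far, n_samples)                           # [S]
    pts = rays_o[:, None, :] + rays_d[:, None, :] * t[None, :, None]   # [N,S,3]
    sigma, normal, reflectance = field(pts)

    # Local shading with a single point light (simple Lambertian-style term).
    wi = torch.nn.functional.normalize(light_pos - pts, dim=-1)
    radiance = reflectance * (normal * wi).sum(-1, keepdim=True).clamp(min=0.0)

    # Standard volume-rendering weights: alpha compositing along each ray.
    delta = (far - near) / n_samples
    alpha = 1.0 - torch.exp(-sigma * delta)                            # [N,S,1]
    trans = torch.cumprod(torch.cat(
        [torch.ones_like(alpha[:, :1]), 1.0 - alpha + 1e-10], dim=1), dim=1)[:, :-1]
    weights = alpha * trans
    return (weights * radiance).sum(dim=1)                             # [N,3] colors


if __name__ == "__main__":
    def dummy_field(pts):
        # Constant density, radial normals, arbitrary reflectance: placeholder only.
        sigma = torch.ones(*pts.shape[:-1], 1)
        normal = torch.nn.functional.normalize(pts, dim=-1)
        reflectance = torch.sigmoid(pts)
        return sigma, normal, reflectance

    rays_o = torch.zeros(8, 3)
    rays_d = torch.nn.functional.normalize(torch.randn(8, 3), dim=-1)
    rgb = march_rays(dummy_field, rays_o, rays_d,
                     light_pos=torch.tensor([0.0, 0.0, 2.0]))
    print(rgb.shape)  # torch.Size([8, 3])
```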