Related papers: Render-FM: A Foundation Model for Real-time Photorealistic Volumetric Rendering

URL: http://arxiv.org/abs/2505.17338v1
Date: Thu, 22 May 2025 23:18:30 GMT
Title: Render-FM: A Foundation Model for Real-time Photorealistic Volumetric Rendering
Authors: Zhongpai Gao, Meng Zheng, Benjamin Planche, Anwesa Choudhuri, Terrence Chen, Ziyan Wu
Abstract summary: We propose Render-FM, a novel foundation model for direct, real-time rendering of CT scans. Our approach generates high-quality, real-time interactive 3D visualizations across diverse clinical CT data. Experiments demonstrate that Render-FM achieves visual fidelity comparable or superior to specialized per-scan methods.
Score: 28.764513004699676
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Volumetric rendering of Computed Tomography (CT) scans is crucial for visualizing complex 3D anatomical structures in medical imaging. Current high-fidelity approaches, especially neural rendering techniques, require time-consuming per-scene optimization, limiting clinical applicability due to computational demands and poor generalizability. We propose Render-FM, a novel foundation model for direct, real-time volumetric rendering of CT scans. Render-FM employs an encoder-decoder architecture that directly regresses 6D Gaussian Splatting (6DGS) parameters from CT volumes, eliminating per-scan optimization through large-scale pre-training on diverse medical data. By integrating robust feature extraction with the expressive power of 6DGS, our approach efficiently generates high-quality, real-time interactive 3D visualizations across diverse clinical CT data. Experiments demonstrate that Render-FM achieves visual fidelity comparable or superior to specialized per-scan methods while drastically reducing preparation time from nearly an hour to seconds for a single inference step. This advancement enables seamless integration into real-time surgical planning and diagnostic workflows. The project page is: https://gaozhongpai.github.io/renderfm/.

Related papers

ClipGS: Clippable Gaussian Splatting for Interactive Cinematic Visualization of Volumetric Medical Data [51.095474325541794]
We introduce ClipGS, an innovative Gaussian splatting framework with clipping-plane support for interactive cinematic visualization of medical data. We validate our method on five volumetric medical datasets and reach an average rendering quality of 36.635 PSNR at 156 FPS with a 16.1 MB model size.
arXiv Detail & Related papers (2025-07-09T08:24:28Z)
EVolSplat: Efficient Volume-based Gaussian Splatting for Urban View Synthesis [61.1662426227688]
Existing NeRF and 3DGS-based methods show promising results in achieving photorealistic renderings but require slow, per-scene optimization. We introduce EVolSplat, an efficient 3D Gaussian Splatting model for urban scenes that works in a feed-forward manner.
arXiv Detail & Related papers (2025-03-26T02:47:27Z)
A Fast, Scalable, and Robust Deep Learning-based Iterative Reconstruction Framework for Accelerated Industrial Cone-beam X-ray Computed Tomography [5.104810959579395]
Cone-beam X-ray Computed Tomography (XCT) with large detectors and corresponding large-scale 3D reconstruction plays a pivotal role in micron-scale characterization of materials and parts across various industries. We present a novel deep neural network-based iterative algorithm that integrates an artifact reduction-trained CNN as a prior model with automated regularization parameter selection.
arXiv Detail & Related papers (2025-01-21T19:34:01Z)
Learning Radiance Fields from a Single Snapshot Compressive Image [18.548244681485922]
We explore the Snapshot Compressive Imaging (SCI) technique for recovering the underlying 3D scene structure from a single temporal compressed image. We propose SCINeRF, in which we formulate the physical imaging process of SCI as part of the training of NeRF. We further integrate the popular 3D Gaussian Splatting (3DGS) framework and propose SCISplat to improve 3D scene reconstruction quality and training/rendering speed.
arXiv Detail & Related papers (2024-12-27T06:40:44Z)
TomoGRAF: A Robust and Generalizable Reconstruction Network for Single-View Computed Tomography [3.1209855614927275]
Traditional analytical/iterative CT reconstruction algorithms require hundreds of angular data samplings. We develop a novel TomoGRAF framework incorporating the unique X-ray transportation physics to reconstruct high-quality 3D volumes.
arXiv Detail & Related papers (2024-11-12T20:07:59Z)
Multi-Layer Gaussian Splatting for Immersive Anatomy Visualization [1.0580610673031074]
In medical image visualization, path tracing of volumetric medical data like CT scans produces lifelike visualizations. We propose a novel approach utilizing GS to create an efficient but static intermediate representation of CT scans. Our approach achieves interactive frame rates while preserving anatomical structures, with quality adjustable to the target hardware.
arXiv Detail & Related papers (2024-10-22T12:56:58Z)
3D-CT-GPT: Generating 3D Radiology Reports through Integration of Large Vision-Language Models [51.855377054763345]
This paper introduces 3D-CT-GPT, a Visual Question Answering (VQA)-based medical visual language model for generating radiology reports from 3D CT scans. Experiments on both public and private datasets demonstrate that 3D-CT-GPT significantly outperforms existing methods in terms of report accuracy and quality.
arXiv Detail & Related papers (2024-09-28T12:31:07Z)
MM3DGS SLAM: Multi-modal 3D Gaussian Splatting for SLAM Using Vision, Depth, and Inertial Measurements [59.70107451308687]
We show for the first time that using 3D Gaussians for map representation with unposed camera images and inertial measurements can enable accurate SLAM. Our method, MM3DGS, addresses the limitations of prior neural radiance field-based representations by enabling faster rendering, scale awareness, and improved trajectory tracking. We also release a multi-modal dataset, UT-MM, collected from a mobile robot equipped with a camera and an inertial measurement unit.
arXiv Detail & Related papers (2024-04-01T04:57:41Z)
Intraoperative 2D/3D Image Registration via Differentiable X-ray Rendering [5.617649111108429]
We present DiffPose, a self-supervised approach that leverages patient-specific simulation and differentiable physics-based rendering to achieve accurate 2D/3D registration without relying on manually labeled data. DiffPose achieves sub-millimeter accuracy across surgical datasets at intraoperative speeds, improving upon existing unsupervised methods by an order of magnitude and even outperforming supervised baselines.
arXiv Detail & Related papers (2023-12-11T13:05:54Z)
Geometry-Aware Attenuation Learning for Sparse-View CBCT Reconstruction [53.93674177236367]
Cone Beam Computed Tomography (CBCT) plays a vital role in clinical imaging. Traditional methods typically require hundreds of 2D X-ray projections to reconstruct a high-quality 3D CBCT image. This has led to a growing interest in sparse-view CBCT reconstruction to reduce radiation doses. We introduce a novel geometry-aware encoder-decoder framework to solve this problem.
arXiv Detail & Related papers (2023-03-26T14:38:42Z)
Hierarchical Amortized Training for Memory-efficient High Resolution 3D GAN [52.851990439671475]
We propose a novel end-to-end GAN architecture that can generate high-resolution 3D images. We achieve this goal by using different configurations between training and inference. Experiments on 3D thorax CT and brain MRI demonstrate that our approach outperforms the state of the art in image generation.
arXiv Detail & Related papers (2020-08-05T02:33:04Z)
Intrinsic Autoencoders for Joint Neural Rendering and Intrinsic Image Decomposition [67.9464567157846]
We propose an autoencoder for joint generation of realistic images from synthetic 3D models while simultaneously decomposing real images into their intrinsic shape and appearance properties. Our experiments confirm that a joint treatment of rendering and decomposition is indeed beneficial and that our approach outperforms state-of-the-art image-to-image translation baselines both qualitatively and quantitatively.
arXiv Detail & Related papers (2020-06-29T12:53:58Z)

This list is automatically generated from the titles and abstracts of the papers in this site.