Feature-EndoGaussian: Feature Distilled Gaussian Splatting in Surgical Deformable Scene Reconstruction
- URL: http://arxiv.org/abs/2503.06161v1
- Date: Sat, 08 Mar 2025 10:50:19 GMT
- Title: Feature-EndoGaussian: Feature Distilled Gaussian Splatting in Surgical Deformable Scene Reconstruction
- Authors: Kai Li, Junhao Wang, William Han, Ding Zhao,
- Abstract summary: We introduce Feature-EndoGaussian (FEG), an extension of 3DGS that integrates 2D segmentation cues into 3D rendering to enable real-time semantic and scene reconstruction.<n>FEG achieves superior performance (SSIM of 0.97, PSNR of 39.08, and LPIPS of 0.03) compared to leading methods.
- Score: 26.358467072736524
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Minimally invasive surgery (MIS) has transformed clinical practice by reducing recovery times, minimizing complications, and enhancing precision. Nonetheless, MIS inherently relies on indirect visualization and precise instrument control, posing unique challenges. Recent advances in artificial intelligence have enabled real-time surgical scene understanding through techniques such as image classification, object detection, and segmentation, with scene reconstruction emerging as a key element for enhanced intraoperative guidance. Although neural radiance fields (NeRFs) have been explored for this purpose, their substantial data requirements and slow rendering inhibit real-time performance. In contrast, 3D Gaussian Splatting (3DGS) offers a more efficient alternative, achieving state-of-the-art performance in dynamic surgical scene reconstruction. In this work, we introduce Feature-EndoGaussian (FEG), an extension of 3DGS that integrates 2D segmentation cues into 3D rendering to enable real-time semantic and scene reconstruction. By leveraging pretrained segmentation foundation models, FEG incorporates semantic feature distillation within the Gaussian deformation framework, thereby enhancing both reconstruction fidelity and segmentation accuracy. On the EndoNeRF dataset, FEG achieves superior performance (SSIM of 0.97, PSNR of 39.08, and LPIPS of 0.03) compared to leading methods. Additionally, on the EndoVis18 dataset, FEG demonstrates competitive class-wise segmentation metrics while balancing model size and real-time performance.
Related papers
- IXGS-Intraoperative 3D Reconstruction from Sparse, Arbitrarily Posed Real X-rays [1.2721397985664153]
We extend the $R2$-Gaussian splatting framework to reconstruct consistent 3D volumes under challenging conditions.
We introduce an anatomy-guided radiographic standardization step using style transfer, improving visual consistency across views.
arXiv Detail & Related papers (2025-04-20T18:28:13Z) - Learning to Efficiently Adapt Foundation Models for Self-Supervised Endoscopic 3D Scene Reconstruction from Any Cameras [41.985581990753765]
We introduce Endo3DAC, a unified framework for endoscopic scene reconstruction.
We design an integrated network capable of simultaneously estimating depth maps, relative poses, and camera intrinsic parameters.
Experiments across four endoscopic datasets demonstrate that Endo3DAC significantly outperforms other state-of-the-art methods.
arXiv Detail & Related papers (2025-03-20T07:49:04Z) - T-3DGS: Removing Transient Objects for 3D Scene Reconstruction [83.05271859398779]
Transient objects in video sequences can significantly degrade the quality of 3D scene reconstructions.
We propose T-3DGS, a novel framework that robustly filters out transient distractors during 3D reconstruction using Gaussian Splatting.
arXiv Detail & Related papers (2024-11-29T07:45:24Z) - Event3DGS: Event-Based 3D Gaussian Splatting for High-Speed Robot Egomotion [54.197343533492486]
Event3DGS can reconstruct high-fidelity 3D structure and appearance under high-speed egomotion.
Experiments on multiple synthetic and real-world datasets demonstrate the superiority of Event3DGS compared with existing event-based dense 3D scene reconstruction frameworks.
Our framework also allows one to incorporate a few motion-blurred frame-based measurements into the reconstruction process to further improve appearance fidelity without loss of structural accuracy.
arXiv Detail & Related papers (2024-06-05T06:06:03Z) - R$^2$-Gaussian: Rectifying Radiative Gaussian Splatting for Tomographic Reconstruction [53.19869886963333]
3D Gaussian splatting (3DGS) has shown promising results in rendering image and surface reconstruction.
This paper introduces R2$-Gaussian, the first 3DGS-based framework for sparse-view tomographic reconstruction.
arXiv Detail & Related papers (2024-05-31T08:39:02Z) - Deform3DGS: Flexible Deformation for Fast Surgical Scene Reconstruction with Gaussian Splatting [20.147880388740287]
This work presents a novel fast reconstruction framework, termed Deform3DGS, for deformable tissues during endoscopic surgery.
We introduce 3D Gaussian Splatting, an emerging technology in real-time 3D rendering, into surgical scenes by integrating a point cloud.
We also propose a novel flexible deformation modeling scheme (FDM) to learn tissue deformation dynamics at the level of individual Gaussians.
arXiv Detail & Related papers (2024-05-28T05:14:57Z) - Zero123-6D: Zero-shot Novel View Synthesis for RGB Category-level 6D Pose Estimation [66.3814684757376]
This work presents Zero123-6D, the first work to demonstrate the utility of Diffusion Model-based novel-view-synthesizers in enhancing RGB 6D pose estimation at category-level.
The outlined method shows reduction in data requirements, removal of the necessity of depth information in zero-shot category-level 6D pose estimation task, and increased performance, quantitatively demonstrated through experiments on the CO3D dataset.
arXiv Detail & Related papers (2024-03-21T10:38:18Z) - S^2Former-OR: Single-Stage Bi-Modal Transformer for Scene Graph Generation in OR [50.435592120607815]
Scene graph generation (SGG) of surgical procedures is crucial in enhancing holistically cognitive intelligence in the operating room (OR)
Previous works have primarily relied on multi-stage learning, where the generated semantic scene graphs depend on intermediate processes with pose estimation and object detection.
In this study, we introduce a novel single-stage bi-modal transformer framework for SGG in the OR, termed S2Former-OR.
arXiv Detail & Related papers (2024-02-22T11:40:49Z) - Domain adaptation strategies for 3D reconstruction of the lumbar spine using real fluoroscopy data [9.21828361691977]
This study tackles key obstacles in adopting surgical navigation in orthopedic surgeries.
It shows an approach for generating 3D anatomical models of the spine from only a few fluoroscopic images.
It achieved an 84% F1 score, matching the accuracy of our previous synthetic data-based research.
arXiv Detail & Related papers (2024-01-29T10:22:45Z) - EndoGaussian: Real-time Gaussian Splatting for Dynamic Endoscopic Scene
Reconstruction [36.35631592019182]
We introduce EndoGaussian, a real-time endoscopic scene reconstruction framework built on 3D Gaussian Splatting (3DGS)
Our framework significantly boosts the rendering speed to a real-time level.
Experiments on public datasets demonstrate our efficacy against prior SOTAs in many aspects.
arXiv Detail & Related papers (2024-01-23T08:44:26Z) - GS-SLAM: Dense Visual SLAM with 3D Gaussian Splatting [51.96353586773191]
We introduce textbfGS-SLAM that first utilizes 3D Gaussian representation in the Simultaneous Localization and Mapping system.
Our method utilizes a real-time differentiable splatting rendering pipeline that offers significant speedup to map optimization and RGB-D rendering.
Our method achieves competitive performance compared with existing state-of-the-art real-time methods on the Replica, TUM-RGBD datasets.
arXiv Detail & Related papers (2023-11-20T12:08:23Z) - A New Backbone for Hyperspectral Image Reconstruction [90.48427561874402]
3D hyperspectral image (HSI) reconstruction refers to inverse process of snapshot compressive imaging.
Proposal is for a Spatial/Spectral Invariant Residual U-Net, namely SSI-ResU-Net.
We show that SSI-ResU-Net achieves competing performance with over 77.3% reduction in terms of floating-point operations.
arXiv Detail & Related papers (2021-08-17T16:20:51Z) - Weakly-supervised Learning For Catheter Segmentation in 3D Frustum
Ultrasound [74.22397862400177]
We propose a novel Frustum ultrasound based catheter segmentation method.
The proposed method achieved the state-of-the-art performance with an efficiency of 0.25 second per volume.
arXiv Detail & Related papers (2020-10-19T13:56:22Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.