Endo-G$^{2}$T: Geometry-Guided & Temporally Aware Time-Embedded 4DGS For Endoscopic Scenes
- URL: http://arxiv.org/abs/2511.21367v1
- Date: Wed, 26 Nov 2025 13:12:21 GMT
- Title: Endo-G$^{2}$T: Geometry-Guided & Temporally Aware Time-Embedded 4DGS For Endoscopic Scenes
- Authors: Yangle Liu, Fengze Li, Kan Liu, Jieming Ma
- Abstract summary: Endo-G$^{2}$T is a geometry-guided and temporally aware training scheme for time-embedded 4DGS. It achieves state-of-the-art results among monocular reconstruction baselines on the EndoNeRF and StereoMIS-P1 datasets.
- Score: 3.1445273839174095
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Endoscopic (endo) video exhibits strong view-dependent effects such as specularities, wet reflections, and occlusions. Pure photometric supervision misaligns with geometry and triggers early geometric drift, where erroneous shapes are reinforced during densification and become hard to correct. We ask how to anchor geometry early for 4D Gaussian splatting (4DGS) while maintaining temporal consistency and efficiency in dynamic endoscopic scenes. Thus, we present Endo-G$^{2}$T, a geometry-guided and temporally aware training scheme for time-embedded 4DGS. First, geo-guided prior distillation converts confidence-gated monocular depth into supervision with scale-invariant depth and depth-gradient losses, using a warm-up-to-cap schedule to inject priors softly and avoid early overfitting. Second, a time-embedded Gaussian field represents dynamics in XYZT with a rotor-like rotation parameterization, yielding temporally coherent geometry with lightweight regularization that favors smooth motion and crisp opacity boundaries. Third, keyframe-constrained streaming improves efficiency and long-horizon stability through keyframe-focused optimization under a max-points budget, while non-keyframes advance with lightweight updates. Across EndoNeRF and StereoMIS-P1 datasets, Endo-G$^{2}$T achieves state-of-the-art results among monocular reconstruction baselines.
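The first component in the abstract (confidence-gated depth priors, scale-invariant depth and depth-gradient losses, and a warm-up-to-cap schedule) can be illustrated with a minimal NumPy sketch. The abstract gives no formulas, so everything concrete here is an assumption: the loss is written in the common log-depth scale-invariant form, the gate is a simple threshold `tau` on a per-pixel confidence map, the schedule is a linear ramp, and the function names and default values are hypothetical rather than taken from the paper.

```python
import numpy as np


def scale_invariant_depth_loss(pred, prior, conf, tau=0.5, lam=1.0, eps=1e-6):
    """Log-depth scale-invariant loss over confidence-gated pixels (assumed form)."""
    mask = conf > tau  # keep only pixels where the monocular prior is trusted
    if not mask.any():
        return 0.0
    d = np.log(pred[mask] + eps) - np.log(prior[mask] + eps)
    # With lam=1.0 this is the variance of the log residual, so a global
    # rescaling of `pred` leaves the loss unchanged.
    return float(np.mean(d ** 2) - lam * np.mean(d) ** 2)


def depth_gradient_loss(pred, prior, conf, tau=0.5, eps=1e-6):
    """Mean absolute finite-difference gradient of the log-depth residual."""
    mask = conf > tau
    r = np.log(pred + eps) - np.log(prior + eps)
    gx = np.abs(r[:, 1:] - r[:, :-1])      # horizontal differences
    gy = np.abs(r[1:, :] - r[:-1, :])      # vertical differences
    mx = mask[:, 1:] & mask[:, :-1]        # both neighbours must pass the gate
    my = mask[1:, :] & mask[:-1, :]
    n = int(mx.sum() + my.sum())
    if n == 0:
        return 0.0
    return float((gx[mx].sum() + gy[my].sum()) / n)


def warmup_to_cap(step, warmup_steps=1000, cap=0.1):
    """Linear ramp from 0 to `cap`, then constant (assumed schedule shape)."""
    return cap * min(1.0, step / warmup_steps)


# Toy usage: the rendered depth equals the prior up to an unknown global scale.
rng = np.random.default_rng(0)
prior = rng.uniform(1.0, 5.0, size=(8, 8))
pred = 2.0 * prior
conf = rng.uniform(0.0, 1.0, size=(8, 8))

l_si = scale_invariant_depth_loss(pred, prior, conf)
l_grad = depth_gradient_loss(pred, prior, conf)
w = warmup_to_cap(step=500)
depth_term = w * (l_si + l_grad)  # would be added to the photometric loss
```

In this toy case both depth losses are numerically close to zero because the prediction differs from the prior only by a global scale, which is the ambiguity a monocular depth prior cannot resolve. The ramped weight keeps the prior's influence small in early iterations, one plausible reading of injecting priors "softly".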
Related papers
- TG-Field: Geometry-Aware Radiative Gaussian Fields for Tomographic Reconstruction [16.246538335191982]
Tomographic Geometry Field (TG-Field) is a geometry-aware Gaussian deformation framework for computed tomography (CT) reconstruction. TG-Field consistently outperforms existing methods, achieving state-of-the-art reconstruction accuracy under highly sparse-view conditions.
arXiv Detail & Related papers (2026-02-12T08:33:01Z) - Advancing Structured Priors for Sparse-Voxel Surface Reconstruction [38.315369778574386]
Two promising explicit representations, 3D Gaussian Splatting and sparse voxelization, exhibit complementary strengths and weaknesses. We combine the advantages of both with a voxel method that places voxels at plausible locations and with appropriate levels of detail. Experiments on standard benchmarks demonstrate improvements over prior methods: higher accuracy, better fine-structure recovery, and more complete surfaces.
arXiv Detail & Related papers (2026-01-25T06:49:22Z) - Geometry-Consistent 4D Gaussian Splatting for Sparse-Input Dynamic View Synthesis [17.560425604804305]
GC-4DGS is a novel framework that infuses geometric consistency into 4D Gaussian Splatting (4DGS).
arXiv Detail & Related papers (2025-11-28T10:11:48Z) - 4DSTR: Advancing Generative 4D Gaussians with Spatial-Temporal Rectification for High-Quality and Consistent 4D Generation [28.11338918279445]
We propose a novel 4D generation network called 4DSTR, which modulates generative 4D Gaussian Splatting with spatial-temporal rectification. Experiments demonstrate that our 4DSTR achieves state-of-the-art performance in video-to-4D generation, excelling in reconstruction quality, spatial-temporal consistency, and adaptation to rapid temporal movements.
arXiv Detail & Related papers (2025-11-10T15:57:03Z) - EndoWave: Rational-Wavelet 4D Gaussian Splatting for Endoscopic Reconstruction [18.43808203690038]
Endoscopic scenarios present unique challenges, including photometric inconsistencies, non-rigid tissue motion, and view-dependent highlights. Most 3DGS-based methods rely solely on appearance constraints to optimize 3DGS, which is often insufficient in this context. We present EndoWave, which incorporates an optical flow-based geometric constraint and a multi-resolution rational wavelet supervision.
arXiv Detail & Related papers (2025-10-27T07:45:17Z) - ShapeGen4D: Towards High Quality 4D Shape Generation from Videos [85.45517487721257]
We introduce a native video-to-4D shape generation framework that synthesizes a single dynamic 3D representation end-to-end from the video. Our method accurately captures non-rigid motion, volume changes, and even topological transitions without per-frame optimization.
arXiv Detail & Related papers (2025-10-07T17:58:11Z) - 4D Driving Scene Generation With Stereo Forcing [62.47705572424127]
Current generative models struggle to synthesize dynamic 4D driving scenes that simultaneously support temporal extrapolation and spatial novel view synthesis (NVS) without per-scene optimization. We present PhiGenesis, a unified framework for 4D scene generation that extends video generation techniques with geometric and temporal consistency.
arXiv Detail & Related papers (2025-09-24T15:37:17Z) - VDEGaussian: Video Diffusion Enhanced 4D Gaussian Splatting for Dynamic Urban Scenes Modeling [68.65587507038539]
We present a novel video diffusion-enhanced 4D Gaussian Splatting framework for dynamic urban scene modeling. Our key insight is to distill robust, temporally consistent priors from a test-time adapted video diffusion model. Our method significantly enhances dynamic modeling, especially for fast-moving objects, achieving an approximate PSNR gain of 2 dB.
arXiv Detail & Related papers (2025-08-04T07:24:05Z) - X$^{2}$-Gaussian: 4D Radiative Gaussian Splatting for Continuous-time Tomographic Reconstruction [64.2059940799033]
Current methods discretize temporal resolution into fixed phases with respiratory gating devices. X$^{2}$-Gaussian, a novel framework, enables continuous-time 4DCT reconstruction by integrating dynamic radiative splatting with self-supervised respiratory motion learning.
arXiv Detail & Related papers (2025-03-27T17:59:57Z) - DiST-4D: Disentangled Spatiotemporal Diffusion with Metric Depth for 4D Driving Scene Generation [50.01520547454224]
Current generative models struggle to synthesize 4D driving scenes that simultaneously support temporal extrapolation and spatial novel view synthesis (NVS). We propose DiST-4D, which disentangles the problem into two diffusion processes: DiST-T, which predicts future metric depth and multi-view RGB sequences directly from past observations, and DiST-S, which enables spatial NVS by training only on existing viewpoints while enforcing cycle consistency. Experiments demonstrate that DiST-4D achieves state-of-the-art performance in both temporal prediction and NVS tasks, while also delivering competitive performance in planning-related evaluations.
arXiv Detail & Related papers (2025-03-19T13:49:48Z) - Advancing Dense Endoscopic Reconstruction with Gaussian Splatting-driven Surface Normal-aware Tracking and Mapping [12.027762278121052]
Endo-2DTAM is a real-time endoscopic SLAM system with 2D Gaussian Splatting (2DGS). Our robust tracking module combines point-to-point and point-to-plane distance metrics. Our mapping module utilizes normal consistency and depth distortion to enhance surface reconstruction quality.
arXiv Detail & Related papers (2025-01-31T17:15:34Z) - Real-Time Spatio-Temporal Reconstruction of Dynamic Endoscopic Scenes with 4D Gaussian Splatting [1.7947477507955865]
This paper presents ST-Endo4DGS, a novel framework that models the dynamics of endoscopic scenes. This approach enables precise representation of deformable tissue, capturing spatial and temporal correlations in real time.
arXiv Detail & Related papers (2024-11-02T11:24:27Z) - 2D Gaussian Splatting for Geometrically Accurate Radiance Fields [50.056790168812114]
3D Gaussian Splatting (3DGS) has recently revolutionized radiance field reconstruction, achieving high quality novel view synthesis and fast rendering speed without baking. We present 2D Gaussian Splatting (2DGS), a novel approach to model and reconstruct geometrically accurate radiance fields from multi-view images. We demonstrate that our differentiable terms allow for noise-free and detailed geometry reconstruction while maintaining competitive appearance quality, fast training speed, and real-time rendering.
arXiv Detail & Related papers (2024-03-26T17:21:24Z)
This list is automatically generated from the titles and abstracts of the papers on this site.