H3D-DGS: Exploring Heterogeneous 3D Motion Representation for Deformable 3D Gaussian Splatting
- URL: http://arxiv.org/abs/2408.13036v3
- Date: Tue, 21 Oct 2025 08:03:39 GMT
- Title: H3D-DGS: Exploring Heterogeneous 3D Motion Representation for Deformable 3D Gaussian Splatting
- Authors: Bing He, Yunuo Chen, Guo Lu, Qi Wang, Qunshan Gu, Rong Xie, Li Song, Wenjun Zhang
- Abstract summary: Dynamic scene reconstruction poses a persistent challenge in 3D vision. Deformable 3D Gaussian splatting has emerged as an effective method for this task, offering real-time rendering and high visual fidelity. This approach decomposes a dynamic scene into a static representation in a canonical space and time-varying scene motion. Experiments on the Neu3DV and CMU-Panoptic datasets demonstrate that our method achieves superior performance over state-of-the-art deformable 3D Gaussian splatting techniques.
- Score: 39.2960379257236
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Dynamic scene reconstruction poses a persistent challenge in 3D vision. Deformable 3D Gaussian Splatting has emerged as an effective method for this task, offering real-time rendering and high visual fidelity. This approach decomposes a dynamic scene into a static representation in a canonical space and time-varying scene motion. Scene motion is defined as the collective movement of all Gaussian points, and for compactness, existing approaches commonly adopt implicit neural fields or sparse control points. However, these methods predominantly rely on gradient-based optimization for all motion information. Due to the high degree of freedom, they struggle to converge on real-world datasets exhibiting complex motion. To preserve the compactness of motion representation and address convergence challenges, this paper proposes heterogeneous 3D control points, termed \textbf{H3D control points}, whose attributes are obtained using a hybrid strategy combining optical flow back-projection and gradient-based methods. This design decouples directly observable motion components from those that are geometrically occluded. Specifically, components of 3D motion that project onto the image plane are directly acquired via optical flow back-projection, while unobservable portions are refined through gradient-based optimization. Experiments on the Neu3DV and CMU-Panoptic datasets demonstrate that our method achieves superior performance over state-of-the-art deformable 3D Gaussian splatting techniques. Remarkably, our method converges within just 100 iterations and processes each frame in 2 seconds on a single NVIDIA RTX 4070 GPU.
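To make the hybrid strategy concrete, below is a minimal NumPy sketch of the back-projection idea the abstract describes, not the authors' implementation: the 2D optical flow at a control point's projection is lifted back to 3D through a pinhole camera model, recovering the observable component of the point's motion. All names, the world-to-camera convention (`R`, `t`, intrinsics `K`), and the externally supplied `depth_next` estimate are illustrative assumptions.

```python
import numpy as np

def backproject_flow_to_3d(p_world, flow_2d, depth_next, K, R, t):
    """Lift 2D optical flow at a control point back to a 3D displacement.

    Illustrative sketch only: p_world is one control point, flow_2d the
    optical flow (du, dv) measured at its projection, depth_next an
    estimate of the point's depth in the next frame, K the 3x3 camera
    intrinsics, and (R, t) a world-to-camera pose, p_cam = R @ p_world + t.
    """
    # Project the control point into the current frame.
    p_cam = R @ p_world + t
    uv = (K @ p_cam)[:2] / p_cam[2]          # pixel coordinates (u, v)

    # Displace the projection by the measured optical flow.
    uv_next = uv + flow_2d

    # Back-project the displaced pixel at the estimated next-frame depth.
    ray = np.linalg.inv(K) @ np.array([uv_next[0], uv_next[1], 1.0])
    p_cam_next = ray * depth_next            # camera-space point, z = depth_next
    p_world_next = R.T @ (p_cam_next - t)    # back to world space

    # Observable motion component; the depth-direction part is only as
    # good as depth_next and, per the abstract, is what gradient-based
    # optimization refines.
    return p_world_next - p_world
```

A single view constrains only the image-plane component of the motion, so the residual along the viewing ray stays ambiguous; fixing the observable part analytically leaves far fewer degrees of freedom for the subsequent gradient-based optimization, which is consistent with the reported 100-iteration convergence.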
Related papers
- SAGOnline: Segment Any Gaussians Online [17.33447710659887]
3D Gaussian Splatting (3DGS) has emerged as a powerful paradigm for explicit 3D scene representation. Current methods suffer from prohibitive computational costs, limited 3D spatial reasoning, and an inability to track multiple objects simultaneously. We present Segment Any Gaussians Online (SAGOnline), a lightweight and zero-shot framework for real-time 3D segmentation in Gaussian scenes.
arXiv Detail & Related papers (2025-08-11T17:38:50Z)
- PIG: Physically-based Multi-Material Interaction with 3D Gaussians [14.097146027458368]
PIG is a novel approach that combines 3D object segmentation with high-precision simulation of interacting objects. We show that our method not only outperforms the state-of-the-art (SOTA) in terms of visual quality, but also opens up new directions and pipelines for physically realistic scene generation.
arXiv Detail & Related papers (2025-06-09T11:25:21Z)
- Easi3R: Estimating Disentangled Motion from DUSt3R Without Training [48.87063562819018]
We introduce Easi3R, a simple yet efficient training-free method for 4D reconstruction.
Our approach applies attention adaptation during inference, eliminating the need for from-scratch pre-training or network fine-tuning.
Our experiments on real-world dynamic videos demonstrate that our lightweight attention adaptation significantly outperforms previous state-of-the-art methods.
arXiv Detail & Related papers (2025-03-31T17:59:58Z)
- Disentangled 4D Gaussian Splatting: Towards Faster and More Efficient Dynamic Scene Rendering [12.27734287104036]
Novel-view synthesis (NVS) for dynamic scenes from 2D images presents significant challenges.
We introduce Disentangled 4D Gaussian Splatting (Disentangled4DGS), a novel representation and rendering approach that disentangles temporal and spatial deformations.
Our approach achieves an unprecedented average rendering speed of 343 FPS at a resolution of $1352\times1014$ on a 3090 GPU.
arXiv Detail & Related papers (2025-03-28T05:46:02Z)
- EVolSplat: Efficient Volume-based Gaussian Splatting for Urban View Synthesis [61.1662426227688]
Existing NeRF and 3DGS-based methods show promising results in achieving photorealistic renderings but require slow, per-scene optimization. We introduce EVolSplat, an efficient 3D Gaussian Splatting model for urban scenes that works in a feed-forward manner.
arXiv Detail & Related papers (2025-03-26T02:47:27Z)
- Deblur4DGS: 4D Gaussian Splatting from Blurry Monocular Video [64.38566659338751]
We propose the first 4D Gaussian Splatting framework to reconstruct a high-quality 4D model from blurry monocular video, named Deblur4DGS.
We introduce exposure regularization to avoid trivial solutions, as well as multi-frame and multi-resolution consistency constraints to alleviate artifacts. Beyond novel-view synthesis, Deblur4DGS can be applied to improve blurry video from multiple perspectives, including deblurring, frame synthesis, and video stabilization.
arXiv Detail & Related papers (2024-12-09T12:02:11Z)
- Temporally Compressed 3D Gaussian Splatting for Dynamic Scenes [46.64784407920817]
Temporally Compressed 3D Gaussian Splatting (TC3DGS) is a novel technique designed specifically to compress dynamic 3D Gaussian representations. Our experiments across multiple datasets demonstrate that TC3DGS achieves up to $67\times$ compression with minimal or no degradation in visual quality.
arXiv Detail & Related papers (2024-12-07T17:03:09Z)
- 4D SlingBAG: spatial-temporal coupled Gaussian ball for large-scale dynamic 3D photoacoustic iterative reconstruction [20.286369270523245]
We propose a novel method, named the 4D sliding Gaussian ball adaptive growth (4D SlingBAG) algorithm.
Our method applies spatial-temporal coupled deformation functions to each Gaussian sphere in the point cloud, thus explicitly learning the deformation features of the dynamic 3D PA scene.
Compared to performing reconstruction with the SlingBAG algorithm individually for each frame, our method significantly reduces computational time and maintains extremely low memory consumption.
arXiv Detail & Related papers (2024-12-05T06:15:26Z)
- Dynamics-Aware Gaussian Splatting Streaming Towards Fast On-the-Fly Training for 4D Reconstruction [12.111389926333592]
Current 3DGS-based streaming methods treat the Gaussian primitives uniformly and constantly renew the densified Gaussians.
We propose a novel three-stage pipeline for iterative streamable 4D dynamic spatial reconstruction.
Our method achieves state-of-the-art performance in online 4D reconstruction, demonstrating a 20% improvement in on-the-fly training speed, superior representation quality, and real-time rendering capability.
arXiv Detail & Related papers (2024-11-22T10:47:47Z)
- MotionGS: Exploring Explicit Motion Guidance for Deformable 3D Gaussian Splatting [56.785233997533794]
We propose a novel deformable 3D Gaussian splatting framework called MotionGS.
MotionGS explores explicit motion priors to guide the deformation of 3D Gaussians.
Experiments on monocular dynamic scenes validate that MotionGS surpasses state-of-the-art methods.
arXiv Detail & Related papers (2024-10-10T08:19:47Z)
- Gaussian Splatting Lucas-Kanade [0.11249583407496218]
We propose a novel analytical approach that adapts the classical Lucas-Kanade method to dynamic Gaussian splatting. By leveraging the intrinsic properties of the forward warp field network, we derive an analytical velocity field that, through time integration, facilitates accurate scene flow computation (a minimal sketch of this velocity-integration idea appears after this list). Our method excels in reconstructing highly dynamic scenes with minimal camera movement, as demonstrated through experiments on both synthetic and real-world scenes.
arXiv Detail & Related papers (2024-07-16T01:50:43Z)
- $\textit{S}^3$Gaussian: Self-Supervised Street Gaussians for Autonomous Driving [82.82048452755394]
Photorealistic 3D reconstruction of street scenes is a critical technique for developing real-world simulators for autonomous driving.
Most existing street 3DGS methods require tracked 3D vehicle bounding boxes to decompose the static and dynamic elements.
We propose a self-supervised street Gaussian ($\textit{S}^3$Gaussian) method to decompose dynamic and static elements from 4D consistency.
arXiv Detail & Related papers (2024-05-30T17:57:08Z)
- Vidu4D: Single Generated Video to High-Fidelity 4D Reconstruction with Dynamic Gaussian Surfels [35.27805034331218]
We present Vidu4D, a novel reconstruction model that excels in accurately reconstructing 4D representations from single generated videos.
At the core of Vidu4D is our proposed Dynamic Gaussian Surfels (DGS) technique.
arXiv Detail & Related papers (2024-05-27T04:43:44Z)
- SC4D: Sparse-Controlled Video-to-4D Generation and Motion Transfer [57.506654943449796]
We propose an efficient, sparse-controlled video-to-4D framework named SC4D that decouples motion and appearance.
Our method surpasses existing methods in both quality and efficiency.
We devise a novel application that seamlessly transfers motion onto a diverse array of 4D entities.
arXiv Detail & Related papers (2024-04-04T18:05:18Z)
- GaussianPro: 3D Gaussian Splatting with Progressive Propagation [49.918797726059545]
3DGS relies heavily on the point cloud produced by Structure-from-Motion (SfM) techniques.
We propose a novel method that applies a progressive propagation strategy to guide the densification of the 3D Gaussians.
Our method significantly surpasses 3DGS, exhibiting an improvement of 1.15 dB in PSNR.
arXiv Detail & Related papers (2024-02-22T16:00:20Z)
- 4DGen: Grounded 4D Content Generation with Spatial-temporal Consistency [118.15258850780417]
This work introduces 4DGen, a novel framework for grounded 4D content creation.
We identify static 3D assets and monocular video sequences as key components in constructing the 4D content.
Our pipeline facilitates conditional 4D generation, enabling users to specify geometry (3D assets) and motion (monocular videos).
arXiv Detail & Related papers (2023-12-28T18:53:39Z)
- DreamGaussian4D: Generative 4D Gaussian Splatting [56.49043443452339]
We introduce DreamGaussian4D (DG4D), an efficient 4D generation framework that builds on Gaussian Splatting (GS).
Our key insight is that combining explicit modeling of spatial transformations with static GS makes an efficient and powerful representation for 4D generation.
Video generation methods have the potential to offer valuable spatial-temporal priors, enhancing high-quality 4D generation.
arXiv Detail & Related papers (2023-12-28T17:16:44Z)
- Align Your Gaussians: Text-to-4D with Dynamic 3D Gaussians and Composed Diffusion Models [94.07744207257653]
We focus on the underexplored text-to-4D setting and synthesize dynamic, animated 3D objects.
We combine text-to-image, text-to-video, and 3D-aware multiview diffusion models to provide feedback during 4D object optimization.
arXiv Detail & Related papers (2023-12-21T11:41:02Z)
- Gaussian-Flow: 4D Reconstruction with Dynamic 3D Gaussian Particle [9.082693946898733]
We introduce a novel point-based approach for fast dynamic scene reconstruction and real-time rendering from both multi-view and monocular videos.
In contrast to the prevalent NeRF-based approaches hampered by slow training and rendering speeds, our approach harnesses recent advancements in point-based 3D Gaussian Splatting (3DGS).
Our proposed approach showcases a substantial efficiency improvement, achieving a $5\times$ faster training speed compared to per-frame 3DGS modeling.
arXiv Detail & Related papers (2023-12-06T11:25:52Z)
- GS-SLAM: Dense Visual SLAM with 3D Gaussian Splatting [51.96353586773191]
We introduce GS-SLAM, which is the first to utilize a 3D Gaussian representation in a Simultaneous Localization and Mapping (SLAM) system.
Our method utilizes a real-time differentiable splatting rendering pipeline that offers significant speedup to map optimization and RGB-D rendering.
Our method achieves competitive performance compared with existing state-of-the-art real-time methods on the Replica and TUM-RGBD datasets.
arXiv Detail & Related papers (2023-11-20T12:08:23Z)
- 4D Gaussian Splatting for Real-Time Dynamic Scene Rendering [103.32717396287751]
We propose 4D Gaussian Splatting (4D-GS) as a holistic representation for dynamic scenes.
A neural voxel encoding algorithm inspired by HexPlane is proposed to efficiently build features from 4D neural voxels.
Our 4D-GS method achieves real-time rendering at high resolutions: 82 FPS at $800\times800$ on a 3090 GPU.
arXiv Detail & Related papers (2023-10-12T17:21:41Z)
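As referenced in the Gaussian Splatting Lucas-Kanade entry above, here is a hedged sketch of the velocity-integration idea from that summary, not that paper's API: for any differentiable forward warp field `warp(x0, t)`, the velocity at time `t` is the exact time derivative of the warp, which autograd yields via a Jacobian-vector product; integrating it over an interval produces scene flow. The function names and the midpoint-rule integrator are illustrative assumptions.

```python
import torch

def scene_flow_by_velocity_integration(warp, x0, t0, t1, steps=16):
    """Integrate an analytical velocity field into scene flow.

    warp: any differentiable function mapping (points x0, scalar time t)
    to warped points x_t, e.g. a deformation MLP. The velocity
    d warp / d t is computed exactly with a Jacobian-vector product and
    accumulated over [t0, t1] with a midpoint rule.
    """
    flow = torch.zeros_like(x0)
    dt = (t1 - t0) / steps
    for i in range(steps):
        t_mid = torch.tensor(t0 + (i + 0.5) * dt)
        # d warp / d t at (x0, t_mid): JVP with a unit tangent on scalar t.
        _, velocity = torch.autograd.functional.jvp(
            lambda t: warp(x0, t), (t_mid,), (torch.ones(()),)
        )
        flow = flow + velocity * dt          # midpoint-rule step
    return flow

# Sanity check with a warp whose flow is known in closed form:
# x(t) = x0 + sin(t) * d, so the flow over [0, 1] is (sin 1 - sin 0) * d.
d = torch.tensor([1.0, 0.0, 0.0])
x0 = torch.zeros(3)
flow = scene_flow_by_velocity_integration(
    lambda x, t: x + torch.sin(t) * d, x0, 0.0, 1.0
)
print(flow)  # ~[0.8415, 0.0, 0.0]
```

Integrating an analytical velocity, rather than finite-differencing warped positions, keeps the computed scene flow consistent with the warp field itself, which matches that entry's claim about accurate flow computation.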