Related papers: 4DTAM: Non-Rigid Tracking and Mapping via Dynamic Surface Gaussians

URL: http://arxiv.org/abs/2505.22859v1
Date: Wed, 28 May 2025 20:45:10 GMT
Title: 4DTAM: Non-Rigid Tracking and Mapping via Dynamic Surface Gaussians
Authors: Hidenobu Matsuki, Gwangbin Bae, Andrew J. Davison
Abstract summary: We propose the first 4D tracking and mapping method that jointly performs camera localization and non-rigid surface reconstruction. Our approach captures 4D scenes from an online stream of color images with depth measurements or predictions.
Score: 20.862152067742148
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: We propose the first 4D tracking and mapping method that jointly performs camera localization and non-rigid surface reconstruction via differentiable rendering. Our approach captures 4D scenes from an online stream of color images with depth measurements or predictions by jointly optimizing scene geometry, appearance, dynamics, and camera ego-motion. Although natural environments exhibit complex non-rigid motions, 4D-SLAM remains relatively underexplored due to its inherent challenges; even with 2.5D signals, the problem is ill-posed because of the high dimensionality of the optimization space. To overcome these challenges, we first introduce a SLAM method based on Gaussian surface primitives that leverages depth signals more effectively than 3D Gaussians, thereby achieving accurate surface reconstruction. To further model non-rigid deformations, we employ a warp-field represented by a multi-layer perceptron (MLP) and introduce a novel camera pose estimation technique along with surface regularization terms that facilitate spatio-temporal reconstruction. In addition to these algorithmic challenges, a significant hurdle in 4D SLAM research is the lack of reliable ground truth and evaluation protocols, primarily due to the difficulty of 4D capture using commodity sensors. To address this, we present a novel open synthetic dataset of everyday objects with diverse motions, leveraging large-scale object models and animation modeling. In summary, we open up the modern 4D-SLAM research by introducing a novel method and evaluation protocols grounded in modern vision and rendering techniques.

Related papers

4D Gaussian Splatting SLAM [44.70136817644832]
This paper proposes an efficient architecture that incrementally tracks camera poses and establishes the 4D Gaussian radiance fields in unknown scenarios. In experiments, qualitative and quantitative evaluation results show that the proposed method achieves robust tracking and high-quality view performance in real-world environments.
arXiv Detail & Related papers (2025-03-20T21:08:08Z)
4D Gaussian Splatting: Modeling Dynamic Scenes with Native 4D Primitives [116.2042238179433]
In this paper, we frame dynamic scenes as unconstrained 4D volume learning problems. We represent a target dynamic scene using a collection of 4D Gaussian primitives with explicit geometry and appearance features. This approach can capture relevant information in space and time by fitting the underlying photorealistic-temporal volume. Notably, our 4DGS model is the first solution that supports real-time rendering of high-resolution, novel views for complex dynamic scenes.
arXiv Detail & Related papers (2024-12-30T05:30:26Z)
Dyn-HaMR: Recovering 4D Interacting Hand Motion from a Dynamic Camera [49.82535393220003]
Dyn-HaMR is the first approach to reconstruct 4D global hand motion from monocular videos recorded by dynamic cameras in the wild. We show that our approach significantly outperforms state-of-the-art methods in terms of 4D global mesh recovery. This establishes a new benchmark for hand motion reconstruction from monocular video with moving cameras.
arXiv Detail & Related papers (2024-12-17T12:43:10Z)
4D Gaussian Splatting in the Wild with Uncertainty-Aware Regularization [43.81271239333774]
We propose a novel 4D Gaussian Splatting (4DGS) algorithm for dynamic scenes from casually recorded monocular videos. Our experiments show that the proposed method improves the performance of 4DGS reconstruction from a video captured by a handheld monocular camera.
arXiv Detail & Related papers (2024-11-13T18:56:39Z)
DynaSurfGS: Dynamic Surface Reconstruction with Planar-based Gaussian Splatting [13.762831851385227]
We propose DynaSurfGS to achieve both photorealistic rendering and high-fidelity surface reconstruction of dynamic scenarios. The framework first incorporates Gaussian features from 4D neural voxels with the planar-based Gaussian Splatting to facilitate precise surface reconstruction. It also incorporates the as-rigid-as-possible (ARAP) constraint to maintain the approximate rigidity of local neighborhoods of 3D Gaussians between timesteps.
arXiv Detail & Related papers (2024-08-26T01:36:46Z)
Diffusion4D: Fast Spatial-temporal Consistent 4D Generation via Video Diffusion Models [116.31344506738816]
We present a novel framework, Diffusion4D, for efficient and scalable 4D content generation. We develop a 4D-aware video diffusion model capable of synthesizing orbital views of dynamic 3D assets. Our method surpasses prior state-of-the-art techniques in terms of generation efficiency and 4D geometry consistency.
arXiv Detail & Related papers (2024-05-26T17:47:34Z)
SC4D: Sparse-Controlled Video-to-4D Generation and Motion Transfer [57.506654943449796]
We propose an efficient, sparse-controlled video-to-4D framework named SC4D that decouples motion and appearance. Our method surpasses existing methods in both quality and efficiency. We devise a novel application that seamlessly transfers motion onto a diverse array of 4D entities.
arXiv Detail & Related papers (2024-04-04T18:05:18Z)
LoRD: Local 4D Implicit Representation for High-Fidelity Dynamic Human Modeling [69.56581851211841]
We propose a novel Local 4D implicit Representation for Dynamic clothed humans, named LoRD. Our key insight is to encourage the network to learn the latent codes of local part-level representations. LoRD has a strong capability for representing 4D humans and outperforms state-of-the-art methods in practical applications.
arXiv Detail & Related papers (2022-08-18T03:49:44Z)
H4D: Human 4D Modeling by Learning Neural Compositional Representation [75.34798886466311]
This work presents a novel framework that can effectively learn a compact and compositional representation for dynamic humans. A simple yet effective linear motion model is proposed to provide a rough and regularized motion estimation. Experiments demonstrate that our method is not only effective in recovering dynamic humans with accurate motion and detailed geometry, but also amenable to various 4D human-related tasks.
arXiv Detail & Related papers (2022-03-02T17:10:49Z)
4DComplete: Non-Rigid Motion Estimation Beyond the Observable Surface [7.637832293935966]
We introduce 4DComplete, a novel data-driven approach that estimates the non-rigid motion for the unobserved geometry. For network training, we constructed a large-scale synthetic dataset called DeformingThings4D.
arXiv Detail & Related papers (2021-05-05T07:39:12Z)

This list is automatically generated from the titles and abstracts of the papers on this site.