Related papers: S4D: Streaming 4D Real-World Reconstruction with Gaussians and 3D Control Points

URL: http://arxiv.org/abs/2408.13036v2
Date: Sun, 6 Oct 2024 14:17:11 GMT
Title: S4D: Streaming 4D Real-World Reconstruction with Gaussians and 3D Control Points
Authors: Bing He, Yunuo Chen, Guo Lu, Qi Wang, Qunshan Gu, Rong Xie, Li Song, Wenjun Zhang
Abstract summary: We introduce a novel approach for streaming 4D real-world reconstruction utilizing discrete 3D control points. This method physically models local rays and establishes a motion-decoupling coordinate system. By effectively merging traditional graphics with learnable pipelines, it provides a robust and efficient local 6-degrees-of-freedom (6-DoF) motion representation.
Score: 30.46796069720543
License: http://creativecommons.org/licenses/by-nc-sa/4.0/
Abstract: Dynamic scene reconstruction using Gaussians has recently attracted increased interest. Mainstream approaches typically employ a global deformation field to warp a 3D scene in canonical space. However, the inherent low-frequency nature of implicit neural fields often leads to ineffective representations of complex motions. Moreover, their structural rigidity can hinder adaptation to scenes with varying resolutions and durations. To address these challenges, we introduce a novel approach for streaming 4D real-world reconstruction utilizing discrete 3D control points. This method physically models local rays and establishes a motion-decoupling coordinate system. By effectively merging traditional graphics with learnable pipelines, it provides a robust and efficient local 6-degrees-of-freedom (6-DoF) motion representation. Additionally, we have developed a generalized framework that integrates our control points with Gaussians. Starting from an initial 3D reconstruction, our workflow decomposes the streaming 4D reconstruction into four independent submodules: 3D segmentation, 3D control point generation, object-wise motion manipulation, and residual compensation. Experimental results demonstrate that our method outperforms existing state-of-the-art 4D Gaussian splatting techniques on both the Neu3DV and CMU-Panoptic datasets. Notably, the optimization of our 3D control points is achievable in 100 iterations and within just 2 seconds per frame on a single NVIDIA 4070 GPU.
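
As a rough illustration of what a local 6-DoF motion around a control point can look like, the sketch below rotates a set of 3D points (standing in for Gaussian centers) about a control point and then translates them. This is not the paper's implementation: the function names, the axis-angle parameterization, and the rule that every listed point follows a single control point are assumptions made only for this example.

```python
import math


def rotation_from_axis_angle(axis, angle):
    """Rodrigues' formula: 3x3 rotation matrix for `angle` radians about `axis`."""
    x, y, z = axis
    norm = math.sqrt(x * x + y * y + z * z)
    if norm == 0.0:
        # A zero axis carries no rotation; fall back to the identity.
        return [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
    x, y, z = x / norm, y / norm, z / norm
    c, s = math.cos(angle), math.sin(angle)
    t = 1.0 - c
    return [
        [c + x * x * t, x * y * t - z * s, x * z * t + y * s],
        [y * x * t + z * s, c + y * y * t, y * z * t - x * s],
        [z * x * t - y * s, z * y * t + x * s, c + z * z * t],
    ]


def apply_local_6dof(points, control_point, rotation, translation):
    """Rotate each point about `control_point`, then add `translation`.

    Three rotational plus three translational parameters give the
    6 degrees of freedom. (Hypothetical helper, for illustration only.)
    """
    moved = []
    for p in points:
        # Express the point relative to the control point.
        d = [p[i] - control_point[i] for i in range(3)]
        # Rotate the offset vector.
        r = [sum(rotation[i][j] * d[j] for j in range(3)) for i in range(3)]
        # Move back to world coordinates and apply the translation.
        moved.append([r[i] + control_point[i] + translation[i] for i in range(3)])
    return moved


if __name__ == "__main__":
    R = rotation_from_axis_angle((0.0, 0.0, 1.0), math.pi / 2)
    print(apply_local_6dof([[2.0, 0.0, 0.0]], [1.0, 0.0, 0.0], R, [0.0, 0.0, 0.5]))
```

Rotating about the control point rather than the world origin keeps the motion local: points near the control point move mostly by the translation, while the rotation only changes their arrangement around it.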

Related papers

Easi3R: Estimating Disentangled Motion from DUSt3R Without Training [48.87063562819018]
We introduce Easi3R, a simple yet efficient training-free method for 4D reconstruction. Our approach applies attention adaptation during inference, eliminating the need for from-scratch pre-training or network fine-tuning. Our experiments on real-world dynamic videos demonstrate that our lightweight attention adaptation significantly outperforms previous state-of-the-art methods.
arXiv Detail & Related papers (2025-03-31T17:59:58Z)
Disentangled 4D Gaussian Splatting: Towards Faster and More Efficient Dynamic Scene Rendering [12.27734287104036]
Novel-view synthesis (NVS) for dynamic scenes from 2D images presents significant challenges. We introduce Disentangled 4D Gaussian Splatting (Disentangled4DGS), a novel representation and rendering approach that disentangles temporal and spatial deformations. Our approach achieves an unprecedented average rendering speed of 343 FPS at a resolution of $1352\times1014$ on a 3090 GPU.
arXiv Detail & Related papers (2025-03-28T05:46:02Z)
Deblur4DGS: 4D Gaussian Splatting from Blurry Monocular Video [64.38566659338751]
We propose the first 4D Gaussian Splatting framework to reconstruct a high-quality 4D model from blurry monocular video, named Deblur4DGS. We introduce exposure regularization to avoid trivial solutions, as well as multi-frame and multi-resolution consistency ones to alleviate artifacts. Beyond novel-view, Deblur4DGS can be applied to improve blurry video from multiple perspectives, including deblurring, frame synthesis, and video stabilization.
arXiv Detail & Related papers (2024-12-09T12:02:11Z)
4D SlingBAG: spatial-temporal coupled Gaussian ball for large-scale dynamic 3D photoacoustic iterative reconstruction [20.286369270523245]
We propose a novel method, named the 4D sliding Gaussian ball adaptive growth (4D SlingBAG) algorithm. Our method applies spatial-temporal coupled deformation functions to each Gaussian sphere in the point cloud, thus explicitly learning the deformation features of the dynamic 3D PA scene. Compared to performing reconstructions with the SlingBAG algorithm individually for each frame, our method significantly reduces computational time and keeps an extremely low memory consumption.
arXiv Detail & Related papers (2024-12-05T06:15:26Z)
Dynamics-Aware Gaussian Splatting Streaming Towards Fast On-the-Fly Training for 4D Reconstruction [12.111389926333592]
Current 3DGS-based streaming methods treat the Gaussian primitives uniformly and constantly renew the densified Gaussians. We propose a novel three-stage pipeline for iterative streamable 4D dynamic spatial reconstruction. Our method achieves state-of-the-art performance in online 4D reconstruction, demonstrating a 20% improvement in on-the-fly training speed, superior representation quality, and real-time rendering capability.
arXiv Detail & Related papers (2024-11-22T10:47:47Z)
$\textit{S}^3$Gaussian: Self-Supervised Street Gaussians for Autonomous Driving [82.82048452755394]
Photorealistic 3D reconstruction of street scenes is a critical technique for developing real-world simulators for autonomous driving. Most existing street 3DGS methods require tracked 3D vehicle bounding boxes to decompose the static and dynamic elements. We propose a self-supervised street Gaussian ($\textit{S}^3$Gaussian) method to decompose dynamic and static elements from 4D consistency.
arXiv Detail & Related papers (2024-05-30T17:57:08Z)
Vidu4D: Single Generated Video to High-Fidelity 4D Reconstruction with Dynamic Gaussian Surfels [35.27805034331218]
We present Vidu4D, a novel reconstruction model that excels in accurately reconstructing 4D representations from single generated videos. At the core of Vidu4D is our proposed Dynamic Gaussian Surfels (DGS) technique.
arXiv Detail & Related papers (2024-05-27T04:43:44Z)
SC4D: Sparse-Controlled Video-to-4D Generation and Motion Transfer [57.506654943449796]
We propose an efficient, sparse-controlled video-to-4D framework named SC4D that decouples motion and appearance. Our method surpasses existing methods in both quality and efficiency. We devise a novel application that seamlessly transfers motion onto a diverse array of 4D entities.
arXiv Detail & Related papers (2024-04-04T18:05:18Z)
4DGen: Grounded 4D Content Generation with Spatial-temporal Consistency [118.15258850780417]
This work introduces 4DGen, a novel framework for grounded 4D content creation. We identify static 3D assets and monocular video sequences as key components in constructing the 4D content. Our pipeline facilitates conditional 4D generation, enabling users to specify geometry (3D assets) and motion (monocular videos).
arXiv Detail & Related papers (2023-12-28T18:53:39Z)
DreamGaussian4D: Generative 4D Gaussian Splatting [56.49043443452339]
We introduce DreamGaussian4D (DG4D), an efficient 4D generation framework that builds on Gaussian Splatting (GS). Our key insight is that combining explicit modeling of spatial transformations with static GS makes an efficient and powerful representation for 4D generation. Video generation methods have the potential to offer valuable spatial-temporal priors, enhancing high-quality 4D generation.
arXiv Detail & Related papers (2023-12-28T17:16:44Z)
Align Your Gaussians: Text-to-4D with Dynamic 3D Gaussians and Composed Diffusion Models [94.07744207257653]
We focus on the underexplored text-to-4D setting and synthesize dynamic, animated 3D objects. We combine text-to-image, text-to-video, and 3D-aware multiview diffusion models to provide feedback during 4D object optimization.
arXiv Detail & Related papers (2023-12-21T11:41:02Z)
Gaussian-Flow: 4D Reconstruction with Dynamic 3D Gaussian Particle [9.082693946898733]
We introduce a novel point-based approach for fast dynamic scene reconstruction and real-time rendering from both multi-view and monocular videos. In contrast to the prevalent NeRF-based approaches hampered by slow training and rendering speeds, our approach harnesses recent advancements in point-based 3D Gaussian Splatting (3DGS). Our proposed approach showcases a substantial efficiency improvement, achieving a $5\times$ faster training speed compared to the per-frame 3DGS modeling.
arXiv Detail & Related papers (2023-12-06T11:25:52Z)
4D Gaussian Splatting for Real-Time Dynamic Scene Rendering [103.32717396287751]
We propose 4D Gaussian Splatting (4D-GS) as a holistic representation for dynamic scenes. A neural voxel encoding algorithm inspired by HexPlane is proposed to efficiently build features from 4D neural voxels. Our 4D-GS method achieves real-time rendering under high resolutions, 82 FPS at an 800$\times$800 resolution on a 3090 GPU.
arXiv Detail & Related papers (2023-10-12T17:21:41Z)

This list is automatically generated from the titles and abstracts of the papers on this site.