InfoGaussian: Structure-Aware Dynamic Gaussians through Lightweight Information Shaping
- URL: http://arxiv.org/abs/2406.05897v2
- Date: Mon, 23 Dec 2024 13:50:44 GMT
- Title: InfoGaussian: Structure-Aware Dynamic Gaussians through Lightweight Information Shaping
- Authors: Yunchao Zhang, Guandao Yang, Leonidas Guibas, Yanchao Yang
- Abstract summary: We develop a technique that enforces movement resonance between correlated Gaussians in a motion network.
We develop an efficient contrastive training pipeline with lightweight optimization to shape the motion network.
The proposed technique is evaluated on challenging scenes and demonstrates significant performance improvement.
- Score: 9.703830846219441
- Abstract: 3D Gaussians, as a low-level scene representation, typically involve thousands to millions of Gaussians. This makes it difficult to control the scene in ways that reflect the underlying dynamic structure, where the number of independent entities is typically much smaller. In particular, it can be challenging to animate and move objects in the scene, which requires coordination among many Gaussians. To address this issue, we develop a mutual information shaping technique that enforces movement resonance between correlated Gaussians in a motion network. Such correlations can be learned from putative 2D object masks in different views. By approximating the mutual information with the Jacobians of the motions, our method ensures consistent movements of the Gaussians composing different objects under various perturbations. In particular, we develop an efficient contrastive training pipeline with lightweight optimization to shape the motion network, avoiding the need for re-shaping throughout the motion sequence. Notably, our training only touches a small fraction of all Gaussians in the scene yet attains the desired compositional behavior according to the underlying dynamic structure. The proposed technique is evaluated on challenging scenes and demonstrates significant performance improvement in promoting consistent movements and 3D object segmentation while inducing low computation and memory requirements.
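To make the shaping idea concrete, here is a minimal sketch, assuming a motion network that maps Gaussian centers and a shared latent code to per-Gaussian displacements; the Jacobian of each displacement with respect to the latent stands in for the motion mutual information, and an InfoNCE-style objective aligns Jacobians of Gaussians that share a 2D mask. All names (motion_net, shaping_loss) and the toy network are hypothetical, not the paper's code.
```python
# Minimal sketch: shape a motion network so that Gaussians sharing a 2D mask
# move coherently. The Jacobian of each Gaussian's displacement w.r.t. a
# shared latent serves as a proxy for motion mutual information. All names
# are illustrative assumptions.
import torch
import torch.nn.functional as F

def motion_jacobians(motion_net, centers, z):
    """Per-Gaussian Jacobian of displacement w.r.t. the shared latent z."""
    J = torch.autograd.functional.jacobian(lambda z_: motion_net(centers, z_), z)
    return J.reshape(J.shape[0], -1)                   # (N, 3 * dim_z)

def shaping_loss(J, mask_ids, tau=0.1):
    """Contrastive shaping; assumes every mask id covers >= 2 Gaussians."""
    J = F.normalize(J, dim=-1)
    sim = J @ J.t() / tau                              # pairwise similarity
    same = mask_ids[:, None] == mask_ids[None, :]
    eye = torch.eye(len(J), dtype=torch.bool)
    pos = sim.masked_fill(~same | eye, float('-inf'))  # positives: same mask
    all_ = sim.masked_fill(eye, float('-inf'))
    return (torch.logsumexp(all_, 1) - torch.logsumexp(pos, 1)).mean()

# toy usage: a tiny motion network conditioned on an 8-d latent
net = torch.nn.Sequential(torch.nn.Linear(11, 64), torch.nn.ReLU(),
                          torch.nn.Linear(64, 3))
motion_net = lambda c, z: net(torch.cat([c, z.expand(len(c), -1)], -1))
centers, z = torch.randn(32, 3), torch.randn(8)
mask_ids = torch.randint(0, 4, (32,))
loss = shaping_loss(motion_jacobians(motion_net, centers, z), mask_ids)
```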
Related papers
- RelayGS: Reconstructing Dynamic Scenes with Large-Scale and Complex Motions via Relay Gaussians [20.103270640146]
Reconstructing dynamic scenes with large-scale and complex motions remains a significant challenge.
Recent techniques like Neural Radiance Fields and 3D Gaussian Splatting (3DGS) have shown promise but still struggle with scenes involving substantial movement.
This paper proposes RelayGS, a novel method based on 3DGS to represent and reconstruct highly dynamic scenes.
arXiv Detail & Related papers (2024-12-03T15:08:03Z)
- DynSUP: Dynamic Gaussian Splatting from An Unposed Image Pair [41.78277294238276]
We propose a method that fits Gaussians in dynamic environments from only two images, without known camera poses.
This strategy decomposes dynamic scenes into piece-wise rigid components, and jointly estimates the camera pose and motions of dynamic objects.
Experiments on both synthetic and real-world datasets demonstrate that our method significantly outperforms state-of-the-art approaches.
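The piece-wise rigid decomposition can be sketched as points softly assigned to a few rigid components, each with its own rotation and translation; the axis-angle parameterization and soft assignment below are illustrative assumptions, not DynSUP's implementation.
```python
import torch

def axis_angle_to_matrix(w):
    """Rodrigues' formula: (K, 3) axis-angle vectors -> (K, 3, 3) rotations."""
    theta = w.norm(dim=-1, keepdim=True).clamp(min=1e-8)
    k = w / theta
    S = torch.zeros(w.shape[0], 3, 3)               # skew-symmetric matrices
    S[:, 0, 1], S[:, 0, 2] = -k[:, 2], k[:, 1]
    S[:, 1, 0], S[:, 1, 2] = k[:, 2], -k[:, 0]
    S[:, 2, 0], S[:, 2, 1] = -k[:, 1], k[:, 0]
    sin, cos = torch.sin(theta)[..., None], torch.cos(theta)[..., None]
    return torch.eye(3) + sin * S + (1 - cos) * (S @ S)

def piecewise_rigid(points, logits, w, t):
    """points: (N,3); logits: (N,K) soft part assignment; w, t: (K,3)
    per-part axis-angle rotation and translation."""
    R = axis_angle_to_matrix(w)                          # (K, 3, 3)
    moved = torch.einsum('kij,nj->nki', R, points) + t   # (N, K, 3)
    return (logits.softmax(-1)[..., None] * moved).sum(1)
```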
arXiv Detail & Related papers (2024-12-01T15:25:33Z)
- SADG: Segment Any Dynamic Gaussian Without Object Trackers [39.77468734311312]
SADG, Segment Any Dynamic Gaussian Without Object Trackers, is a novel approach that combines a dynamic Gaussian Splatting representation with semantic information, without reliance on object IDs.
We learn semantically-aware features by leveraging masks generated from the Segment Anything Model (SAM) and utilizing our novel contrastive learning objective based on hard pixel mining.
We evaluate SADG on proposed benchmarks and demonstrate the superior performance of our approach in segmenting objects within dynamic scenes.
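One plausible form of a contrastive objective with hard pixel mining is sketched below: per-pixel features are pulled together within a SAM segment and pushed apart across segments, keeping only the hardest pairs. The sampling scheme, margin, and function names are assumptions, not SADG's released code.
```python
import torch
import torch.nn.functional as F

def hard_mining_contrastive(feats, mask_ids, margin=0.5, k=16):
    """feats: (N, D) sampled per-pixel features; mask_ids: (N,) SAM segment
    labels. Assumes each segment contributes more than k sampled pixels."""
    feats = F.normalize(feats, dim=-1)
    sim = feats @ feats.t()                              # (N, N) cosine sims
    same = mask_ids[:, None] == mask_ids[None, :]
    eye = torch.eye(len(feats), dtype=torch.bool)
    # hardest positives: same segment, lowest similarity
    pos = sim.masked_fill(~same | eye, 2.0).topk(k, dim=1, largest=False).values
    # hardest negatives: different segment, highest similarity
    neg = sim.masked_fill(same, -2.0).topk(k, dim=1, largest=True).values
    # hinge: every hard negative should sit a margin below every hard positive
    return F.relu(neg[:, None, :] - pos[:, :, None] + margin).mean()
```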
arXiv Detail & Related papers (2024-11-28T17:47:48Z)
- MotionGS: Exploring Explicit Motion Guidance for Deformable 3D Gaussian Splatting [56.785233997533794]
We propose a novel deformable 3D Gaussian splatting framework called MotionGS.
MotionGS explores explicit motion priors to guide the deformation of 3D Gaussians.
Experiments on monocular dynamic scenes validate that MotionGS surpasses state-of-the-art methods.
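One way to read "explicit motion guidance" is to penalize disagreement between the 2D motion induced by moving Gaussian centers and a precomputed flow prior; the sketch below assumes a pinhole projection and an L1 penalty, which may differ from MotionGS's actual formulation.
```python
import torch

def flow_guidance_loss(centers_t0, centers_t1, K_intr, flow_prior):
    """centers_*: (N, 3) Gaussian centers at consecutive frames, in camera
    coordinates; K_intr: (3, 3) intrinsics; flow_prior: (N, 2) motion prior
    sampled at each Gaussian's projection."""
    def project(p):
        uv = (K_intr @ p.t()).t()                        # pinhole projection
        return uv[:, :2] / uv[:, 2:3].clamp(min=1e-6)
    induced = project(centers_t1) - project(centers_t0)  # Gaussian-induced flow
    return (induced - flow_prior).abs().mean()           # L1 guidance term
```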
arXiv Detail & Related papers (2024-10-10T08:19:47Z)
- Dynamic Gaussian Marbles for Novel View Synthesis of Casual Monocular Videos [58.22272760132996]
We show that existing 4D Gaussian methods dramatically fail on casual monocular videos because the monocular setting is underconstrained.
We propose Dynamic Gaussian Marbles, which consist of three core modifications that target the difficulties of the monocular setting.
We evaluate on the Nvidia Dynamic Scenes dataset and the DyCheck iPhone dataset, and show that Gaussian Marbles significantly outperforms other Gaussian baselines in quality.
arXiv Detail & Related papers (2024-06-26T19:37:07Z)
- HUGS: Holistic Urban 3D Scene Understanding via Gaussian Splatting [53.6394928681237]
Holistic understanding of urban scenes based on RGB images is a challenging yet important problem.
Our main idea involves the joint optimization of geometry, appearance, semantics, and motion using a combination of static and dynamic 3D Gaussians.
Our approach offers the ability to render new viewpoints in real-time, yielding 2D and 3D semantic information with high accuracy.
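A hedged sketch of such a joint objective follows: photometric, semantic, and motion terms summed with scalar weights. The specific losses and weights are illustrative assumptions, not HUGS's implementation.
```python
import torch.nn.functional as F

def joint_loss(rgb_pred, rgb_gt, sem_logits, sem_gt, flow_pred, flow_gt,
               w_sem=0.1, w_flow=0.05):
    """rgb_*: (B,3,H,W); sem_logits: (B,C,H,W); sem_gt: (B,H,W) class ids;
    flow_*: (B,2,H,W). Weights are illustrative."""
    l_rgb = F.l1_loss(rgb_pred, rgb_gt)             # appearance (photometric)
    l_sem = F.cross_entropy(sem_logits, sem_gt)     # 2D semantic supervision
    l_flow = F.l1_loss(flow_pred, flow_gt)          # motion supervision
    return l_rgb + w_sem * l_sem + w_flow * l_flow
```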
arXiv Detail & Related papers (2024-03-19T13:39:05Z)
- SC-GS: Sparse-Controlled Gaussian Splatting for Editable Dynamic Scenes [59.23385953161328]
Novel view synthesis for dynamic scenes is still a challenging problem in computer vision and graphics.
We propose a new representation that explicitly decomposes the motion and appearance of dynamic scenes into sparse control points and dense Gaussians.
Our method can enable user-controlled motion editing while retaining high-fidelity appearances.
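The control-point idea can be sketched as blending sparse control-point displacements onto dense Gaussians with distance-based weights; the k-nearest-neighbor RBF weighting below is a simple assumption standing in for SC-GS's learned blend weights.
```python
import torch

def blend_from_control_points(gauss_xyz, ctrl_xyz, ctrl_delta, sigma=0.1, k=4):
    """gauss_xyz: (N,3) Gaussian centers; ctrl_xyz: (M,3) control points;
    ctrl_delta: (M,3) control-point displacements for the current frame."""
    d2 = torch.cdist(gauss_xyz, ctrl_xyz) ** 2           # (N, M) sq. distances
    knn = d2.topk(k, dim=1, largest=False)               # k nearest controls
    w = torch.exp(-knn.values / (2 * sigma ** 2))        # RBF blend weights
    w = w / w.sum(dim=1, keepdim=True).clamp(min=1e-8)   # normalize per Gaussian
    delta = ctrl_delta[knn.indices]                      # (N, k, 3)
    return gauss_xyz + (w[..., None] * delta).sum(dim=1)
```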
arXiv Detail & Related papers (2023-12-04T11:57:14Z)
- Dynamic 3D Gaussians: Tracking by Persistent Dynamic View Synthesis [58.5779956899918]
We present a method that simultaneously addresses the tasks of dynamic scene novel-view synthesis and six-degree-of-freedom (6-DOF) tracking of all dense scene elements.
We follow an analysis-by-synthesis framework, inspired by recent work that models scenes as a collection of 3D Gaussians.
We demonstrate a large number of downstream applications enabled by our representation, including first-person view synthesis, dynamic compositional scene synthesis, and 4D video editing.
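Because each Gaussian persists with a fixed identity across frames, 6-DOF tracks fall out of the representation by reading per-frame centers and orientations; the data layout in this sketch is assumed.
```python
import torch

def extract_tracks(centers, quats):
    """centers: (T, N, 3) per-frame Gaussian centers; quats: (T, N, 4) unit
    quaternions. Returns (N, T, 7) pose trajectories (xyz + quaternion)."""
    poses = torch.cat([centers, quats], dim=-1)     # (T, N, 7)
    return poses.transpose(0, 1)                    # one 6-DOF track per Gaussian
```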
arXiv Detail & Related papers (2023-08-18T17:59:21Z)
- Learning to Segment Rigid Motions from Two Frames [72.14906744113125]
We propose a modular network, motivated by a geometric analysis of what independent object motions can be recovered from an egomotion field.
It takes two consecutive frames as input and predicts segmentation masks for the background and multiple rigidly moving objects, which are then parameterized by 3D rigid transformations.
Our method achieves state-of-the-art performance for rigid motion segmentation on KITTI and Sintel.
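Once a segment's correspondences are known, its 3D rigid transformation can be fit in closed form with the standard Kabsch/Procrustes solution, sketched below; this is a textbook fit, not the paper's network.
```python
import torch

def fit_rigid(p0, p1):
    """p0, p1: (N, 3) corresponding 3D points of one segment in two frames.
    Returns R (3,3), t (3,) with p1 approx. equal to p0 @ R.T + t."""
    c0, c1 = p0.mean(0), p1.mean(0)
    H = (p0 - c0).t() @ (p1 - c1)                    # (3, 3) cross-covariance
    U, S, Vt = torch.linalg.svd(H)
    D = torch.eye(3)
    D[2, 2] = torch.sign(torch.det(Vt.t() @ U.t()))  # avoid reflections
    R = Vt.t() @ D @ U.t()
    t = c1 - R @ c0
    return R, t
```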
arXiv Detail & Related papers (2021-01-11T04:20:30Z)