UnIRe: Unsupervised Instance Decomposition for Dynamic Urban Scene Reconstruction
- URL: http://arxiv.org/abs/2504.00763v1
- Date: Tue, 01 Apr 2025 13:15:58 GMT
- Title: UnIRe: Unsupervised Instance Decomposition for Dynamic Urban Scene Reconstruction
- Authors: Yunxuan Mao, Rong Xiong, Yue Wang, Yiyi Liao,
- Abstract summary: We propose UnIRe, a 3D Splatting (3DGS) based approach that decomposes a scene into a static background and individual dynamic instances.<n>At its core, we introduce 4D superpoints, a novel representation that clusters multi-frame LiDAR points in 4D space.<n>Experiments show that our method outperforms existing methods in dynamic scene reconstruction while enabling accurate and flexible instance-level editing.
- Score: 27.334884564978907
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Reconstructing and decomposing dynamic urban scenes is crucial for autonomous driving, urban planning, and scene editing. However, existing methods fail to perform instance-aware decomposition without manual annotations, which is crucial for instance-level scene editing.We propose UnIRe, a 3D Gaussian Splatting (3DGS) based approach that decomposes a scene into a static background and individual dynamic instances using only RGB images and LiDAR point clouds. At its core, we introduce 4D superpoints, a novel representation that clusters multi-frame LiDAR points in 4D space, enabling unsupervised instance separation based on spatiotemporal correlations. These 4D superpoints serve as the foundation for our decomposed 4D initialization, i.e., providing spatial and temporal initialization to train a dynamic 3DGS for arbitrary dynamic classes without requiring bounding boxes or object templates.Furthermore, we introduce a smoothness regularization strategy in both 2D and 3D space, further improving the temporal stability.Experiments on benchmark datasets show that our method outperforms existing methods in decomposed dynamic scene reconstruction while enabling accurate and flexible instance-level editing, making it a practical solution for real-world applications.
Related papers
- D^2USt3R: Enhancing 3D Reconstruction with 4D Pointmaps for Dynamic Scenes [40.371542172080105]
We propose D2USt3R that regresses 4D pointmaps simuliously capture both static and dynamic 3D scene geometry in a feed-forward manner.
By explicitly incorporating both spatial and temporal aspects, our approach successfully encapsulates object-temporal dense correspondence to the proposed 4D pointmaps, enhancing downstream tasks.
arXiv Detail & Related papers (2025-04-08T17:59:50Z) - Free4D: Tuning-free 4D Scene Generation with Spatial-Temporal Consistency [49.875459658889355]
Free4D is a tuning-free framework for 4D scene generation from a single image.<n>Our key insight is to distill pre-trained foundation models for consistent 4D scene representation.<n>The resulting 4D representation enables real-time, controllable rendering.
arXiv Detail & Related papers (2025-03-26T17:59:44Z) - CoDa-4DGS: Dynamic Gaussian Splatting with Context and Deformation Awareness for Autonomous Driving [12.006435326659526]
We introduce a novel 4D Gaussian Splatting (4DGS) approach to improve dynamic scene rendering.<n> Specifically, we employ a 2D semantic segmentation foundation model to self-supervise the 4D semantic features of Gaussians.<n>By aggregating and encoding both semantic and temporal deformation features, each Gaussian is equipped with cues for potential deformation compensation.
arXiv Detail & Related papers (2025-03-09T19:58:51Z) - 4D Gaussian Splatting: Modeling Dynamic Scenes with Native 4D Primitives [116.2042238179433]
In this paper, we frame dynamic scenes as unconstrained 4D volume learning problems.<n>We represent a target dynamic scene using a collection of 4D Gaussian primitives with explicit geometry and appearance features.<n>This approach can capture relevant information in space and time by fitting the underlying photorealistic-temporal volume.<n> Notably, our 4DGS model is the first solution that supports real-time rendering of high-resolution, novel views for complex dynamic scenes.
arXiv Detail & Related papers (2024-12-30T05:30:26Z) - Driv3R: Learning Dense 4D Reconstruction for Autonomous Driving [116.10577967146762]
We propose Driv3R, a framework that directly regresses per-frame point maps from multi-view image sequences.<n>We employ a 4D flow predictor to identify moving objects within the scene to direct our network focus more on reconstructing these dynamic regions.<n>Driv3R outperforms previous frameworks in 4D dynamic scene reconstruction, achieving 15x faster inference speed.
arXiv Detail & Related papers (2024-12-09T18:58:03Z) - UrbanGS: Semantic-Guided Gaussian Splatting for Urban Scene Reconstruction [86.4386398262018]
UrbanGS uses 2D semantic maps and an existing dynamic Gaussian approach to distinguish static objects from the scene.<n>For potentially dynamic objects, we aggregate temporal information using learnable time embeddings.<n>Our approach outperforms state-of-the-art methods in reconstruction quality and efficiency.
arXiv Detail & Related papers (2024-12-04T16:59:49Z) - T-3DGS: Removing Transient Objects for 3D Scene Reconstruction [83.05271859398779]
Transient objects in video sequences can significantly degrade the quality of 3D scene reconstructions.
We propose T-3DGS, a novel framework that robustly filters out transient distractors during 3D reconstruction using Gaussian Splatting.
arXiv Detail & Related papers (2024-11-29T07:45:24Z) - Event-boosted Deformable 3D Gaussians for Dynamic Scene Reconstruction [50.873820265165975]
We introduce the first approach combining event cameras, which capture high-temporal-resolution, continuous motion data, with deformable 3D-GS for dynamic scene reconstruction.<n>We propose a GS-Threshold Joint Modeling strategy, creating a mutually reinforcing process that greatly improves both 3D reconstruction and threshold modeling.<n>We contribute the first event-inclusive 4D benchmark with synthetic and real-world dynamic scenes, on which our method achieves state-of-the-art performance.
arXiv Detail & Related papers (2024-11-25T08:23:38Z) - S4D: Streaming 4D Real-World Reconstruction with Gaussians and 3D Control Points [30.46796069720543]
We introduce a novel approach for streaming 4D real-world reconstruction utilizing discrete 3D control points.
This method physically models local rays and establishes a motion-decoupling coordinate system.
By effectively merging traditional graphics with learnable pipelines, it provides a robust and efficient local 6-degrees-of-freedom (6 DoF) motion representation.
arXiv Detail & Related papers (2024-08-23T12:51:49Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.