Proactive Scene Decomposition and Reconstruction
- URL: http://arxiv.org/abs/2510.16272v1
- Date: Fri, 17 Oct 2025 23:57:33 GMT
- Title: Proactive Scene Decomposition and Reconstruction
- Authors: Baicheng Li, Zike Yan, Dong Wu, Hongbin Zha
- Abstract summary: This paper formalizes a new task of proactive scene decomposition and reconstruction. We use human-object interactions to iteratively disassemble and reconstruct the environment. The proposed system effectively integrates multiple tasks in dynamic environments.
- Score: 22.845403993200932
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Human behaviors are the major causes of scene dynamics and inherently contain rich cues regarding the dynamics. This paper formalizes a new task of proactive scene decomposition and reconstruction, an online approach that leverages human-object interactions to iteratively disassemble and reconstruct the environment. By observing these intentional interactions, we can dynamically refine the decomposition and reconstruction process, addressing inherent ambiguities in static object-level reconstruction. The proposed system effectively integrates multiple tasks in dynamic environments such as accurate camera and object pose estimation, instance decomposition, and online map updating, capitalizing on cues from human-object interactions in egocentric live streams for a flexible, progressive alternative to conventional object-level reconstruction methods. Aided by the Gaussian splatting technique, accurate and consistent dynamic scene modeling is achieved with photorealistic and efficient rendering. The efficacy is validated in multiple real-world scenarios with promising advantages.
Related papers
- MeshMimic: Geometry-Aware Humanoid Motion Learning through 3D Scene Reconstruction [54.36564144414704]
MeshMimic is an innovative framework that bridges 3D scene reconstruction and embodied intelligence to enable humanoid robots to learn coupled "motion-terrain" interactions directly from video. By leveraging state-of-the-art 3D vision models, our framework precisely segments and reconstructs both human trajectories and the underlying 3D geometry of terrains and objects.
arXiv Detail & Related papers (2026-02-17T17:09:45Z)
- Interaction-Aware 4D Gaussian Splatting for Dynamic Hand-Object Interaction Reconstruction [19.178735596766476]
This paper focuses on a challenging setting of simultaneously modeling geometry and appearance of hand-object interaction scenes without any object priors. We present interaction-aware hand-object Gaussians with newly introduced optimizable parameters aiming to adopt a piecewise linear hypothesis for clearer structural representation. Experiments show that our approach surpasses existing dynamic 3D-GS-based methods and achieves state-of-the-art performance in reconstructing dynamic hand-object interaction.
arXiv Detail & Related papers (2025-11-18T14:44:04Z)
- Dynamic Avatar-Scene Rendering from Human-centric Context [75.95641456716373]
We propose the Separate-then-Map (StM) strategy to bridge separately defined and optimized models. StM significantly outperforms existing state-of-the-art methods in both visual quality and rendering accuracy.
arXiv Detail & Related papers (2025-11-13T17:39:06Z)
- DynaSplat: Dynamic-Static Gaussian Splatting with Hierarchical Motion Decomposition for Scene Reconstruction [9.391616497099422]
We present DynaSplat, an approach that extends Gaussian Splatting to dynamic scenes. We classify scene elements as static or dynamic through a novel fusion of deformation offset statistics and 2D motion flow consistency. We then introduce a hierarchical motion modeling strategy that captures both coarse global transformations and fine-grained local movements.
arXiv Detail & Related papers (2025-06-11T15:13:35Z)
- Robust Visual Localization via Semantic-Guided Multi-Scale Transformer [3.3106947066548167]
We propose a framework that combines multi-scale feature learning with semantic scene understanding. Our approach employs a hierarchical Transformer with cross-scale attention to fuse geometric details and contextual cues, preserving spatial precision while adapting to environmental changes.
arXiv Detail & Related papers (2025-06-10T07:44:53Z)
- ReMatching Dynamic Reconstruction Flow [55.272357926111454]
This work introduces the ReMatching framework, designed to improve reconstruction quality by incorporating deformation priors into dynamic reconstruction models. Our approach advocates for velocity-field based priors, for which we suggest a matching procedure that can seamlessly supplement existing dynamic reconstruction pipelines. Our evaluations on popular benchmarks involving both synthetic and real-world dynamic scenes demonstrate that augmenting current state-of-the-art methods with our approach leads to a clear improvement in reconstruction accuracy.
arXiv Detail & Related papers (2024-11-01T16:09:33Z)
- Learn to Memorize and to Forget: A Continual Learning Perspective of Dynamic SLAM [17.661231232206028]
Simultaneous localization and mapping (SLAM) with implicit neural representations has received extensive attention.
We propose a novel SLAM framework for dynamic environments.
arXiv Detail & Related papers (2024-07-18T09:35:48Z)
- EgoGaussian: Dynamic Scene Understanding from Egocentric Video with 3D Gaussian Splatting [95.44545809256473]
EgoGaussian is a method capable of simultaneously reconstructing 3D scenes and dynamically tracking 3D object motion from RGB egocentric input alone.
We show significant improvements in terms of both dynamic object and background reconstruction quality compared to the state-of-the-art.
arXiv Detail & Related papers (2024-06-28T10:39:36Z)
- SMORE: Simultaneous Map and Object REconstruction [66.66729715211642]
We present a method for dynamic surface reconstruction of large-scale urban scenes from LiDAR. We take a holistic perspective and optimize a compositional model of a dynamic scene that decomposes the world into rigidly-moving objects and the background.
arXiv Detail & Related papers (2024-06-19T23:53:31Z)
- RobustFusion: Robust Volumetric Performance Reconstruction under Human-object Interactions from Monocular RGBD Stream [27.600873320989276]
High-quality 4D reconstruction of human performance with complex interactions to various objects is essential in real-world scenarios.
Recent advances still fail to provide reliable performance reconstruction.
We propose RobustFusion, a robust volumetric performance reconstruction system for human-object interaction scenarios.
arXiv Detail & Related papers (2021-04-30T08:41:45Z)
- Reconstructing Interactive 3D Scenes by Panoptic Mapping and CAD Model Alignments [81.38641691636847]
We rethink the problem of scene reconstruction from an embodied agent's perspective.
We reconstruct an interactive scene using an RGB-D data stream.
This reconstructed scene replaces the object meshes in the dense panoptic map with part-based articulated CAD models.
arXiv Detail & Related papers (2021-03-30T05:56:58Z)