DymSLAM: 4D Dynamic Scene Reconstruction Based on Geometrical Motion Segmentation
- URL: http://arxiv.org/abs/2003.04569v1
- Date: Tue, 10 Mar 2020 08:25:21 GMT
- Title: DymSLAM: 4D Dynamic Scene Reconstruction Based on Geometrical Motion Segmentation
- Authors: Chenjie Wang and Bin Luo and Yun Zhang and Qing Zhao and Lu Yin and
Wei Wang and Xin Su and Yajun Wang and Chengyuan Li
- Abstract summary: DymSLAM is a dynamic stereo visual SLAM system capable of reconstructing a 4D (3D + time) dynamic scene with rigid moving objects.
The proposed system allows the robot to be employed for high-level tasks, such as obstacle avoidance for dynamic objects.
- Score: 22.444657614883084
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Most SLAM algorithms are based on the assumption that the scene is
static. In practice, however, most scenes are dynamic and usually contain
moving objects, for which these methods are not suitable. In this paper, we
introduce DymSLAM, a dynamic stereo visual SLAM system capable of
reconstructing a 4D (3D + time) dynamic scene with rigid moving objects. The
only input of DymSLAM is stereo video, and its output includes a dense map of
the static environment, 3D models of the moving objects, and the trajectories
of the camera and the moving objects. We first detect and match interest
points between successive frames using traditional SLAM methods. The interest
points belonging to different motion models (the ego-motion and the motions of
the rigid moving objects) are then segmented by a multi-model fitting approach.
From the interest points belonging to the ego-motion, we estimate the camera
trajectory and reconstruct the static background. The interest points belonging
to the rigid moving objects are then used to estimate their motions relative to
the camera and to reconstruct their 3D models. We then transform these relative
motions into the trajectories of the moving objects in the global reference
frame. Finally, we fuse the 3D models of the moving objects into the 3D map of
the environment according to their trajectories, obtaining a 4D (3D + time)
sequence. DymSLAM obtains information about the dynamic objects instead of
ignoring them and is suitable for unknown rigid objects. Hence, the proposed
system allows the robot to be employed for high-level tasks, such as obstacle
avoidance for dynamic objects. We conducted experiments in a real-world
environment where both the camera and the objects moved over a wide range.
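As a rough illustration of the geometric steps described in the abstract, the sketch below shows how 3D feature points triangulated from the stereo pairs of two successive frames could be segmented into rigid-motion models with a sequential RANSAC-style fit, and how an object's motion relative to the camera composes with the camera pose to give its trajectory in the global frame. This is not the authors' implementation: the function names, thresholds, and the greedy sequential RANSAC (standing in for the paper's multi-model fitting approach) are assumptions made here for illustration.

```python
# Minimal sketch, not DymSLAM's code: segment stereo feature points into
# rigid-motion models and place object motion in the global frame.
import numpy as np

def fit_rigid_motion(P, Q):
    """Least-squares rigid transform (R, t) with q ~ R @ p + t (Kabsch/Umeyama)."""
    cP, cQ = P.mean(axis=0), Q.mean(axis=0)
    H = (P - cP).T @ (Q - cQ)
    U, _, Vt = np.linalg.svd(H)
    d = np.sign(np.linalg.det(Vt.T @ U.T))          # guard against reflections
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    return R, cQ - R @ cP

def segment_motions(P, Q, thresh=0.05, min_inliers=20, iters=200, seed=0):
    """Greedy sequential RANSAC over rigid motions between two frames.

    P, Q: (N, 3) 3D positions of the same matched interest points, triangulated
    from the stereo pairs at frame k and frame k+1. The largest model found is
    taken as the ego-motion; each remaining model is one rigid moving object.
    (Thresholds are illustrative; the paper uses a multi-model fitting approach.)
    """
    rng = np.random.default_rng(seed)
    remaining = np.arange(len(P))
    models = []
    while len(remaining) >= min_inliers:
        best = None
        for _ in range(iters):
            sample = rng.choice(remaining, size=3, replace=False)
            R, t = fit_rigid_motion(P[sample], Q[sample])
            err = np.linalg.norm(Q[remaining] - (P[remaining] @ R.T + t), axis=1)
            inliers = remaining[err < thresh]
            if best is None or len(inliers) > len(best):
                best = inliers
        if len(best) < min_inliers:
            break
        R, t = fit_rigid_motion(P[best], Q[best])   # refit on all inliers
        models.append({"R": R, "t": t, "inliers": best})
        remaining = np.setdiff1d(remaining, best)
    return models

def object_pose_in_world(T_world_cam, T_cam_obj):
    """Compose 4x4 homogeneous transforms: the camera's world pose with an
    object's pose relative to the camera, giving the object's world pose.
    Applied per frame, this yields the object's trajectory in the global frame."""
    return T_world_cam @ T_cam_obj
```

In the paper's pipeline the segmented motions are used both to reconstruct each object's 3D model and to fuse it, frame by frame, into the static map to produce the 4D sequence; the greedy peeling above only illustrates the segmentation geometry.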
Related papers
- EMD: Explicit Motion Modeling for High-Quality Street Gaussian Splatting [22.590036750925627]
Photorealistic reconstruction of street scenes is essential for developing real-world simulators in autonomous driving.
Recent methods based on 3D/4D Gaussian Splatting (GS) have demonstrated promising results, but they still encounter challenges in complex street scenes due to the unpredictable motion of dynamic objects.
We propose Explicit Motion Decomposition (EMD), which models the motions of dynamic objects by introducing learnable motion embeddings to the Gaussians.
arXiv Detail & Related papers (2024-11-23T15:10:04Z)
- V3D-SLAM: Robust RGB-D SLAM in Dynamic Environments with 3D Semantic Geometry Voting [1.3493547928462395]
Simultaneous localization and mapping (SLAM) in highly dynamic environments is challenging due to the correlation between moving objects and the camera pose.
We propose a robust method, V3D-SLAM, to remove moving objects via two lightweight re-evaluation stages.
Our experiments on the dynamic sequences of the TUM RGB-D benchmark with ground-truth camera trajectories showed that our method outperforms the most recent state-of-the-art SLAM methods.
arXiv Detail & Related papers (2024-10-15T21:08:08Z)
- EgoGaussian: Dynamic Scene Understanding from Egocentric Video with 3D Gaussian Splatting [95.44545809256473]
EgoGaussian is a method capable of simultaneously reconstructing 3D scenes and dynamically tracking 3D object motion from RGB egocentric input alone.
We show significant improvements in terms of both dynamic object and background reconstruction quality compared to the state-of-the-art.
arXiv Detail & Related papers (2024-06-28T10:39:36Z)
- DO3D: Self-supervised Learning of Decomposed Object-aware 3D Motion and Depth from Monocular Videos [76.01906393673897]
We propose a self-supervised method to jointly learn 3D motion and depth from monocular videos.
Our system contains a depth estimation module to predict depth, and a new decomposed object-wise 3D motion (DO3D) estimation module to predict ego-motion and 3D object motion.
Our model delivers superior performance in all evaluated settings.
arXiv Detail & Related papers (2024-03-09T12:22:46Z)
- Delving into Motion-Aware Matching for Monocular 3D Object Tracking [81.68608983602581]
We find that the motion cue of objects along different time frames is critical in 3D multi-object tracking.
We propose MoMA-M3T, a framework that mainly consists of three motion-aware components.
We conduct extensive experiments on the nuScenes and KITTI datasets to demonstrate our MoMA-M3T achieves competitive performance against state-of-the-art methods.
arXiv Detail & Related papers (2023-08-22T17:53:58Z)
- Class-agnostic Reconstruction of Dynamic Objects from Videos [127.41336060616214]
We introduce REDO, a class-agnostic framework to REconstruct the Dynamic Objects from RGBD or calibrated videos.
We develop two novel modules. First, we introduce a canonical 4D implicit function which is pixel-aligned with aggregated temporal visual cues.
Second, we develop a 4D transformation module which captures object dynamics to support temporal propagation and aggregation.
arXiv Detail & Related papers (2021-12-03T18:57:47Z)
- NeuralDiff: Segmenting 3D objects that move in egocentric videos [92.95176458079047]
We study the problem of decomposing the observed 3D scene into a static background and a dynamic foreground.
This task is reminiscent of the classic background subtraction problem, but is significantly harder because all parts of the scene, static and dynamic, generate a large apparent motion.
In particular, we consider egocentric videos and further separate the dynamic component into objects and the actor that observes and moves them.
arXiv Detail & Related papers (2021-10-19T12:51:35Z)
- Attentive and Contrastive Learning for Joint Depth and Motion Field Estimation [76.58256020932312]
Estimating the motion of the camera together with the 3D structure of the scene from a monocular vision system is a complex task.
We present a self-supervised learning framework for 3D object motion field estimation from monocular videos.
arXiv Detail & Related papers (2021-10-13T16:45:01Z)
- AirDOS: Dynamic SLAM benefits from Articulated Objects [9.045690662672659]
Dynamic object-aware SLAM (DOS) exploits object-level information to enable robust motion estimation in dynamic environments.
AirDOS is the first dynamic object-aware SLAM system demonstrating that camera pose estimation can be improved by incorporating dynamic articulated objects.
arXiv Detail & Related papers (2021-09-21T01:23:48Z)
This list is automatically generated from the titles and abstracts of the papers in this site.