DOT: Dynamic Object Tracking for Visual SLAM
- URL: http://arxiv.org/abs/2010.00052v1
- Date: Wed, 30 Sep 2020 18:36:28 GMT
- Title: DOT: Dynamic Object Tracking for Visual SLAM
- Authors: Irene Ballester, Alejandro Fontan, Javier Civera, Klaus H. Strobl,
Rudolph Triebel
- Abstract summary: DOT combines instance segmentation and multi-view geometry to generate masks for dynamic objects.
To determine which objects are actually moving, DOT first segments instances of potentially dynamic objects and then, using the estimated camera motion, tracks these objects by minimizing the photometric reprojection error.
Our results show that our approach significantly improves the accuracy and robustness of ORB-SLAM 2, especially in highly dynamic scenes.
- Score: 83.69544718120167
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this paper we present DOT (Dynamic Object Tracking), a front-end
that, added to existing SLAM systems, can significantly improve their robustness
and accuracy in highly dynamic environments. DOT combines instance segmentation
and multi-view geometry to generate masks for dynamic objects, allowing SLAM
systems based on rigid scene models to exclude such image areas from their
optimizations.
To determine which objects are actually moving, DOT first segments instances
of potentially dynamic objects and then, using the estimated camera motion,
tracks these objects by minimizing the photometric reprojection error. This
short-term tracking improves the accuracy of the segmentation with respect to
other approaches, so that masks are generated only for objects that are
actually moving. We have evaluated DOT with ORB-SLAM 2 on three public
datasets. Our results show that our approach significantly improves the
accuracy and robustness of ORB-SLAM 2, especially in highly dynamic scenes.
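The tracking step described above can be sketched as follows. This is an illustrative reconstruction, not DOT's actual implementation: the function name, the nearest-neighbour intensity lookup, and the simple dense SE(3) warp are all assumptions made for brevity.

```python
import numpy as np

def photometric_error(I_ref, I_cur, depth_ref, K, T, mask):
    """Mean squared intensity difference after warping masked pixels
    from the reference frame into the current frame under motion T.
    I_ref, I_cur: grayscale images; depth_ref: per-pixel depth in the
    reference frame; K: 3x3 camera intrinsics; T: 4x4 SE(3) candidate
    motion; mask: boolean map of the object (or region) being tracked."""
    h, w = I_ref.shape
    K_inv = np.linalg.inv(K)
    err, count = 0.0, 0
    for v, u in zip(*np.nonzero(mask)):
        # Back-project pixel (u, v) with its reference depth
        p = depth_ref[v, u] * (K_inv @ np.array([u, v, 1.0]))
        # Apply the candidate rigid-body motion
        p_t = (T @ np.append(p, 1.0))[:3]
        # Project into the current image
        q = K @ p_t
        u2, v2 = q[0] / q[2], q[1] / q[2]
        if 0 <= u2 < w - 1 and 0 <= v2 < h - 1:
            # Nearest-neighbour lookup keeps the sketch short;
            # real systems interpolate bilinearly
            r = I_ref[v, u] - I_cur[int(round(v2)), int(round(u2))]
            err += r * r
            count += 1
    return err / max(count, 1)
```

Minimizing this error over T (e.g. with Gauss-Newton on the SE(3) parameters) recovers an object's short-term motion; a segment whose best-fit motion differs from the camera's is then flagged as dynamic and masked out.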
Related papers
- NID-SLAM: Neural Implicit Representation-based RGB-D SLAM in dynamic environments [9.706447888754614]
We present NID-SLAM, which significantly improves the performance of neural SLAM in dynamic environments.
We propose a new approach to enhance inaccurate regions in semantic masks, particularly in marginal areas.
We also introduce a selection strategy for dynamic scenes, which enhances camera tracking robustness against large-scale objects.
arXiv Detail & Related papers (2024-01-02T12:35:03Z)
- EmerNeRF: Emergent Spatial-Temporal Scene Decomposition via Self-Supervision [85.17951804790515]
EmerNeRF is a simple yet powerful approach for learning spatial-temporal representations of dynamic driving scenes.
It simultaneously captures scene geometry, appearance, motion, and semantics via self-bootstrapping.
Our method achieves state-of-the-art performance in sensor simulation.
arXiv Detail & Related papers (2023-11-03T17:59:55Z)
- DORT: Modeling Dynamic Objects in Recurrent for Multi-Camera 3D Object Detection and Tracking [67.34803048690428]
We propose to model Dynamic Objects in RecurrenT (DORT) to tackle this problem.
DORT extracts object-wise local volumes for motion estimation that also alleviates the heavy computational burden.
It is flexible and practical, and can be plugged into most camera-based 3D object detectors.
arXiv Detail & Related papers (2023-03-29T12:33:55Z)
- Using Detection, Tracking and Prediction in Visual SLAM to Achieve Real-time Semantic Mapping of Dynamic Scenarios [70.70421502784598]
RDS-SLAM can build object-level semantic maps of dynamic scenarios in real time using only a commonly available Intel Core i7 CPU.
We evaluate RDS-SLAM on the TUM RGB-D dataset, and experimental results show that it runs at 30.3 ms per frame in dynamic scenarios.
arXiv Detail & Related papers (2022-10-10T11:03:32Z)
- MOTSLAM: MOT-assisted monocular dynamic SLAM using single-view depth estimation [5.33931801679129]
MOTSLAM is a dynamic visual SLAM system in a monocular configuration that tracks both the poses and bounding boxes of dynamic objects.
Our experiments on the KITTI dataset demonstrate that our system achieves the best performance on both camera ego-motion and object tracking among monocular dynamic SLAM systems.
arXiv Detail & Related papers (2022-10-05T06:07:10Z)
- DL-SLOT: Dynamic Lidar SLAM and Object Tracking Based On Graph Optimization [2.889268075288957]
Ego-pose estimation and dynamic object tracking are two key issues in an autonomous driving system.
In this paper, DL-SLOT, a dynamic Lidar SLAM and object tracking method, is proposed.
We perform SLAM and object tracking simultaneously in this framework, which significantly improves the robustness and accuracy of SLAM in highly dynamic road scenarios.
arXiv Detail & Related papers (2022-02-23T11:22:43Z)
- AirDOS: Dynamic SLAM benefits from Articulated Objects [9.045690662672659]
Dynamic Object-aware SLAM (DOS) exploits object-level information to enable robust motion estimation in dynamic environments.
AirDOS is the first dynamic object-aware SLAM system demonstrating that camera pose estimation can be improved by incorporating dynamic articulated objects.
arXiv Detail & Related papers (2021-09-21T01:23:48Z)
- Learning Monocular Depth in Dynamic Scenes via Instance-Aware Projection Consistency [114.02182755620784]
We present an end-to-end joint training framework that explicitly models 6-DoF motion of multiple dynamic objects, ego-motion and depth in a monocular camera setup without supervision.
Our framework is shown to outperform the state-of-the-art depth and motion estimation methods.
arXiv Detail & Related papers (2021-02-04T14:26:42Z)
- Learning to Segment Rigid Motions from Two Frames [72.14906744113125]
We propose a modular network, motivated by a geometric analysis of what independent object motions can be recovered from an egomotion field.
It takes two consecutive frames as input and predicts segmentation masks for the background and multiple rigidly moving objects, which are then parameterized by 3D rigid transformations.
Our method achieves state-of-the-art performance for rigid motion segmentation on KITTI and Sintel.
arXiv Detail & Related papers (2021-01-11T04:20:30Z)
- Dynamic Object Tracking and Masking for Visual SLAM [1.37013665345905]
In dynamic environments, the performance of visual SLAM techniques can be impaired by visual features taken from moving objects.
This paper presents a pipeline that uses deep neural networks, extended Kalman filters and visual SLAM to improve both localization and mapping.
arXiv Detail & Related papers (2020-07-31T20:37:14Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.