Unified Multi-Modal Landmark Tracking for Tightly Coupled
Lidar-Visual-Inertial Odometry
- URL: http://arxiv.org/abs/2011.06838v3
- Date: Wed, 17 Feb 2021 11:42:25 GMT
- Title: Unified Multi-Modal Landmark Tracking for Tightly Coupled
Lidar-Visual-Inertial Odometry
- Authors: David Wisth, Marco Camurri, Sandipan Das, Maurice Fallon
- Abstract summary: We present an efficient multi-sensor odometry system for mobile platforms that jointly optimizes visual, lidar, and inertial information.
A new method to extract 3D line and planar primitives from lidar point clouds is presented.
The system has been tested on a variety of platforms and scenarios, including underground exploration with a legged robot and outdoor scanning with a dynamically moving handheld device.
- Score: 5.131684964386192
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We present an efficient multi-sensor odometry system for mobile platforms
that jointly optimizes visual, lidar, and inertial information within a single
integrated factor graph. This runs in real-time at full framerate using fixed
lag smoothing. To perform such tight integration, a new method to extract 3D
line and planar primitives from lidar point clouds is presented. This approach
overcomes the suboptimality of typical frame-to-frame tracking methods by
treating the primitives as landmarks and tracking them over multiple scans.
True integration of lidar features with standard visual features and IMU is
made possible using a subtle passive synchronization of lidar and camera
frames. The lightweight formulation of the 3D features allows for real-time
execution on a single CPU. Our proposed system has been tested on a variety of
platforms and scenarios, including underground exploration with a legged robot
and outdoor scanning with a dynamically moving handheld device, for a total
duration of 96 min and 2.4 km traveled distance. In these test sequences, using
only one exteroceptive sensor leads to failure due to either underconstrained
geometry (affecting lidar) or textureless areas caused by aggressive lighting
changes (affecting vision). In these conditions, our factor graph naturally
uses the best information available from each sensor modality without any hard
switches.
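
As a rough illustration of the planar-primitive idea described in the abstract, the sketch below (not the authors' implementation; the function names, the numpy-only formulation, and the synthetic data are assumptions made purely for illustration) fits a plane to a segmented patch of lidar points with a least-squares SVD fit and evaluates the point-to-plane residuals that a planar-landmark factor could minimize when the primitive is tracked over multiple scans.

```python
import numpy as np

def fit_plane(points: np.ndarray):
    """Least-squares plane fit to an (N, 3) array of lidar points.

    Returns a unit normal n and offset d such that n . x + d ~= 0 for
    points x on the plane (hypothetical helper, for illustration only).
    """
    centroid = points.mean(axis=0)
    # The right singular vector with the smallest singular value of the
    # centered points is the direction of least variance, i.e. the normal.
    _, _, vt = np.linalg.svd(points - centroid)
    normal = vt[-1]
    d = -normal @ centroid
    return normal, d

def point_to_plane_residuals(points: np.ndarray, normal: np.ndarray, d: float):
    """Signed point-to-plane distances: the kind of residual a planar
    landmark factor could penalize across multiple scans."""
    return points @ normal + d

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Synthetic, roughly planar lidar segment with a little noise.
    xy = rng.uniform(-1.0, 1.0, size=(200, 2))
    z = 0.1 * xy[:, 0] - 0.2 * xy[:, 1] + 0.5 + 0.01 * rng.standard_normal(200)
    segment = np.column_stack([xy, z])

    n, d = fit_plane(segment)
    res = point_to_plane_residuals(segment, n, d)
    print("plane normal:", n, "offset:", d)
    print("RMS point-to-plane residual:", np.sqrt(np.mean(res ** 2)))
```

In the system described in the abstract, residuals of this kind would enter the factor graph alongside visual reprojection and IMU preintegration factors; the sketch covers only the geometric measurement model for a single planar landmark.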
Related papers
- MM3DGS SLAM: Multi-modal 3D Gaussian Splatting for SLAM Using Vision, Depth, and Inertial Measurements [59.70107451308687]
We show for the first time that using 3D Gaussians for map representation with unposed camera images and inertial measurements can enable accurate SLAM.
Our method, MM3DGS, addresses the limitations of prior representations by enabling faster rendering, scale awareness, and improved trajectory tracking.
We also release a multi-modal dataset, UT-MM, collected from a mobile robot equipped with a camera and an inertial measurement unit.
arXiv Detail & Related papers (2024-04-01T04:57:41Z)
- Spatio-Temporal Bi-directional Cross-frame Memory for Distractor Filtering Point Cloud Single Object Tracking [2.487142846438629]
3D single object tracking within LiDAR point clouds is a pivotal task in computer vision.
Existing methods, which depend solely on appearance matching via networks or utilize information from successive frames, encounter significant challenges.
We design an innovative cross-frame bi-temporal motion tracker, named STMD-Tracker, to mitigate these challenges.
arXiv Detail & Related papers (2024-03-23T13:15:44Z)
- LEAP-VO: Long-term Effective Any Point Tracking for Visual Odometry [52.131996528655094]
We present the Long-term Effective Any Point Tracking (LEAP) module.
LEAP innovatively combines visual, inter-track, and temporal cues with mindfully selected anchors for dynamic track estimation.
Based on these traits, we develop LEAP-VO, a robust visual odometry system adept at handling occlusions and dynamic scenes.
arXiv Detail & Related papers (2024-01-03T18:57:27Z)
- Bi-directional Adapter for Multi-modal Tracking [67.01179868400229]
We propose a novel multi-modal visual prompt tracking model based on a universal bi-directional adapter.
We develop a simple but effective light feature adapter to transfer modality-specific information from one modality to another.
Our model achieves superior tracking performance in comparison with both the full fine-tuning methods and the prompt learning-based methods.
arXiv Detail & Related papers (2023-12-17T05:27:31Z)
- UniTR: A Unified and Efficient Multi-Modal Transformer for Bird's-Eye-View Representation [113.35352122662752]
We present an efficient multi-modal backbone for outdoor 3D perception named UniTR.
UniTR processes a variety of modalities with unified modeling and shared parameters.
UniTR is also a fundamentally task-agnostic backbone that naturally supports different 3D perception tasks.
arXiv Detail & Related papers (2023-08-15T12:13:44Z)
- Modeling Continuous Motion for 3D Point Cloud Object Tracking [54.48716096286417]
This paper presents a novel approach that views each tracklet as a continuous stream.
At each timestamp, only the current frame is fed into the network to interact with multi-frame historical features stored in a memory bank.
To enhance the utilization of multi-frame features for robust tracking, a contrastive sequence enhancement strategy is proposed.
arXiv Detail & Related papers (2023-03-14T02:58:27Z)
- DSVT: Dynamic Sparse Voxel Transformer with Rotated Sets [95.84755169585492]
We present Dynamic Sparse Voxel Transformer (DSVT), a single-stride window-based voxel Transformer backbone for outdoor 3D perception.
Our model achieves state-of-the-art performance with a broad range of 3D perception tasks.
arXiv Detail & Related papers (2023-01-15T09:31:58Z)
- Balancing the Budget: Feature Selection and Tracking for Multi-Camera Visual-Inertial Odometry [3.441021278275805]
We present a multi-camera visual-inertial odometry system based on factor graph optimization.
We focus on motion tracking in challenging environments such as in narrow corridors and dark spaces with aggressive motions and abrupt lighting changes.
arXiv Detail & Related papers (2021-09-13T13:53:09Z)
- DEFT: Detection Embeddings for Tracking [3.326320568999945]
We propose an efficient joint detection and tracking model named DEFT.
Our approach relies on an appearance-based object matching network jointly-learned with an underlying object detection network.
DEFT has comparable accuracy and speed to the top methods on 2D online tracking leaderboards.
arXiv Detail & Related papers (2021-02-03T20:00:44Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the accuracy of this information and is not responsible for any consequences arising from its use.