DSLO: Deep Sequence LiDAR Odometry Based on Inconsistent Spatio-temporal Propagation
- URL: http://arxiv.org/abs/2409.00744v1
- Date: Sun, 1 Sep 2024 15:12:48 GMT
- Title: DSLO: Deep Sequence LiDAR Odometry Based on Inconsistent Spatio-temporal Propagation
- Authors: Huixin Zhang, Guangming Wang, Xinrui Wu, Chenfeng Xu, Mingyu Ding, Masayoshi Tomizuka, Wei Zhan, Hesheng Wang
- Abstract summary: This paper introduces DSLO, a 3D point cloud sequence learning model for LiDAR odometry based on inconsistent spatio-temporal propagation.
It consists of a pyramid structure with a sequential pose initialization module, a gated hierarchical pose refinement module, and a temporal feature propagation module.
- Score: 66.8732965660931
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: This paper introduces a 3D point cloud sequence learning model based on inconsistent spatio-temporal propagation for LiDAR odometry, termed DSLO. It consists of a pyramid structure with a spatial information reuse strategy, a sequential pose initialization module, a gated hierarchical pose refinement module, and a temporal feature propagation module. First, spatial features are encoded using a point feature pyramid, with features reused in successive pose estimations to reduce computational overhead. Second, a sequential pose initialization method is introduced, leveraging the high-frequency sampling characteristic of LiDAR to initialize the LiDAR pose. Then, a gated hierarchical pose refinement mechanism refines poses from coarse to fine by selectively retaining or discarding motion information from different layers based on gate estimations. Finally, temporal feature propagation is proposed to incorporate the historical motion information from point cloud sequences, and address the spatial inconsistency issue when transmitting motion information embedded in point clouds between frames. Experimental results on the KITTI odometry dataset and Argoverse dataset demonstrate that DSLO outperforms state-of-the-art methods, achieving at least a 15.67% improvement on RTE and a 12.64% improvement on RRE, while also achieving a 34.69% reduction in runtime compared to baseline methods. Our implementation will be available at https://github.com/IRMVLab/DSLO.
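The sequential initialization and gated coarse-to-fine refinement described in the abstract can be illustrated with a minimal Python sketch. This is not the DSLO implementation. It assumes a pose stored as a 6-element list (translation plus a small-angle rotation vector), assumes the previous frame's motion serves as the initial guess, and assumes each pyramid layer supplies a residual and a scalar gate in [0, 1]. The function names `initialize_pose` and `gated_refinement` are hypothetical, and in the paper the gates and residuals are predicted by the network rather than supplied by hand.

```python
def initialize_pose(previous_motion):
    # Assumption: LiDAR scans arrive at a high rate, so motion changes little
    # between consecutive frames and the last motion is a usable first guess.
    return list(previous_motion)


def gated_refinement(initial_pose, layer_residuals, layer_gates):
    # Walk the layers from coarse to fine. A gate near 1 retains that layer's
    # residual motion; a gate near 0 discards it.
    pose = list(initial_pose)
    for residual, gate in zip(layer_residuals, layer_gates):
        gate = min(max(gate, 0.0), 1.0)  # clamp the gate to [0, 1]
        # Additive update: a small-angle approximation for the rotation part.
        pose = [p + gate * r for p, r in zip(pose, residual)]
    return pose


# Hypothetical values: [tx, ty, tz, rx, ry, rz] in metres and radians.
previous_motion = [1.00, 0.0, 0.0, 0.0, 0.0, 0.010]
residuals = [
    [0.10, 0.02, 0.0, 0.0, 0.0, 0.004],  # coarse layer, retained
    [0.50, 0.50, 0.0, 0.0, 0.0, 0.100],  # fine layer, discarded by its gate
]
gates = [1.0, 0.0]

refined = gated_refinement(initialize_pose(previous_motion), residuals, gates)
```

With these inputs only the coarse-layer residual is added to the initial guess, because the fine layer's gate is zero.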
Related papers
- Future Does Matter: Boosting 3D Object Detection with Temporal Motion Estimation in Point Cloud Sequences [25.74000325019015]
We introduce a novel LiDAR 3D object detection framework, namely LiSTM, to facilitate spatial-temporal feature learning with cross-frame motion forecasting information.
We have conducted experiments on the Waymo and nuScenes datasets to demonstrate that the proposed framework achieves superior 3D detection performance.
arXiv Detail & Related papers (2024-09-06T16:29:04Z) - Hierarchical Temporal Context Learning for Camera-based Semantic Scene Completion [57.232688209606515]
We present HTCL, a novel Hierarchical Temporal Context Learning paradigm for improving camera-based semantic scene completion.
Our method ranks 1st on the SemanticKITTI benchmark and even surpasses LiDAR-based methods in terms of mIoU.
arXiv Detail & Related papers (2024-07-02T09:11:17Z) - Motion-aware Memory Network for Fast Video Salient Object Detection [15.967509480432266]
We design a space-time memory (STM)-based network, which extracts useful temporal information of the current frame from adjacent frames as the temporal branch of VSOD.
In the encoding stage, we generate high-level temporal features by using high-level features from the current and its adjacent frames.
In the decoding stage, we propose an effective fusion strategy for spatial and temporal branches.
The proposed model does not require optical flow or other preprocessing, and can reach a speed of nearly 100 FPS during inference.
arXiv Detail & Related papers (2022-08-01T15:56:19Z) - LiDARCap: Long-range Marker-less 3D Human Motion Capture with LiDAR Point Clouds [58.402752909624716]
Existing motion capture datasets are largely short-range and cannot yet fit the need of long-range applications.
We propose LiDARHuman26M, a new human motion capture dataset captured by LiDAR at a much longer range to overcome this limitation.
Our dataset also includes the ground truth human motions acquired by the IMU system and the synchronous RGB images.
arXiv Detail & Related papers (2022-03-28T12:52:45Z) - Roadside Lidar Vehicle Detection and Tracking Using Range And Intensity Background Subtraction [0.0]
We present the solution of roadside LiDAR object detection using a combination of two unsupervised learning algorithms.
The method was validated against a commercial traffic data collection platform.
arXiv Detail & Related papers (2022-01-13T00:54:43Z) - Decoupling and Recoupling Spatiotemporal Representation for RGB-D-based Motion Recognition [62.46544616232238]
Previous motion recognition methods have achieved promising performance through the tightly coupled multi-modal spatiotemporal representation.
We propose to decouple and recouple spatiotemporal representation for RGB-D-based motion recognition.
arXiv Detail & Related papers (2021-12-16T18:59:47Z) - Efficient 3D Deep LiDAR Odometry [16.388259779644553]
An efficient 3D point cloud learning architecture, named PWCLO-Net, is first proposed in this paper.
The entire architecture is holistically optimized end-to-end to achieve adaptive learning of cost volume and mask.
arXiv Detail & Related papers (2021-11-03T11:09:49Z) - PWCLO-Net: Deep LiDAR Odometry in 3D Point Clouds Using Hierarchical Embedding Mask Optimization [17.90299648470637]
A novel 3D point cloud learning model for deep LiDAR odometry, named PWCLO-Net, is proposed in this paper.
In this model, the Pyramid, Warping, and Cost volume structure for the LiDAR odometry task is built to refine the estimated pose in a coarse-to-fine approach hierarchically.
Our method outperforms all recent learning-based methods and outperforms the geometry-based approach, LOAM with mapping optimization, on most sequences of the KITTI odometry dataset.
arXiv Detail & Related papers (2020-12-02T05:23:41Z) - ePointDA: An End-to-End Simulation-to-Real Domain Adaptation Framework for LiDAR Point Cloud Segmentation [111.56730703473411]
Training deep neural networks (DNNs) on LiDAR data requires large-scale point-wise annotations.
Simulation-to-real domain adaptation (SRDA) trains a DNN using unlimited synthetic data with automatically generated labels.
ePointDA consists of three modules: self-supervised dropout noise rendering, statistics-invariant and spatially-adaptive feature alignment, and transferable segmentation learning.
arXiv Detail & Related papers (2020-09-07T23:46:08Z) - Quaternion Equivariant Capsule Networks for 3D Point Clouds [58.566467950463306]
We present a 3D capsule module for processing point clouds that is equivariant to 3D rotations and translations.
We connect dynamic routing between capsules to the well-known Weiszfeld algorithm.
Based on our operator, we build a capsule network that disentangles geometry from pose.
arXiv Detail & Related papers (2019-12-27T13:51:17Z)
This list is automatically generated from the titles and abstracts of the papers on this site.