An LSTM Approach to Temporal 3D Object Detection in LiDAR Point Clouds
- URL: http://arxiv.org/abs/2007.12392v1
- Date: Fri, 24 Jul 2020 07:34:15 GMT
- Title: An LSTM Approach to Temporal 3D Object Detection in LiDAR Point Clouds
- Authors: Rui Huang, Wanyue Zhang, Abhijit Kundu, Caroline Pantofaru, David A. Ross, Thomas Funkhouser, Alireza Fathi
- Abstract summary: We propose a sparse LSTM-based multi-frame 3D object detection algorithm.
We use a U-Net style 3D sparse convolution network to extract features for each frame's LiDAR point-cloud.
- Score: 16.658604637005535
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Detecting objects in 3D LiDAR data is a core technology for autonomous
driving and other robotics applications. Although LiDAR data is acquired over
time, most of the 3D object detection algorithms propose object bounding boxes
independently for each frame and neglect the useful information available in
the temporal domain. To address this problem, in this paper we propose a sparse
LSTM-based multi-frame 3D object detection algorithm. We use a U-Net style 3D
sparse convolution network to extract features for each frame's LiDAR
point-cloud. These features are fed to the LSTM module together with the hidden
and memory features from the previous frame to predict the 3D objects in the
current frame, as well as the hidden and memory features that are passed to the next frame.
Experiments on the Waymo Open Dataset show that our algorithm outperforms the
traditional frame-by-frame approach by 7.5% mAP@0.7 and other multi-frame
approaches by 1.2%, while using less memory and computation per frame. To the
best of our knowledge, this is the first work to use an LSTM for 3D object
detection in sparse point clouds.
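The recurrence the abstract describes (per-frame features combined with hidden and memory states carried over from the previous frame) can be sketched with a plain dense LSTM cell. This is only an illustration of the state-passing pattern: the paper's actual method applies sparse 3D convolutions inside the LSTM, and all dimensions, weights, and the random per-frame features below are assumptions for the sake of a runnable example.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x, h, c, W, U, b):
    """One LSTM step: x is the current frame's feature vector;
    (h, c) are the hidden and memory states from the previous frame."""
    z = W @ x + U @ h + b                         # pre-activations for all four gates
    i, f, o, g = np.split(z, 4)
    i, f, o = sigmoid(i), sigmoid(f), sigmoid(o)  # input / forget / output gates
    g = np.tanh(g)                                # candidate memory
    c_next = f * c + i * g                        # updated memory state
    h_next = o * np.tanh(c_next)                  # hidden state, also the per-frame output
    return h_next, c_next

rng = np.random.default_rng(0)
d = 8                                        # feature dimension (assumed)
W = rng.standard_normal((4 * d, d)) * 0.1    # input weights for the 4 gates
U = rng.standard_normal((4 * d, d)) * 0.1    # recurrent weights
b = np.zeros(4 * d)

# Stand-ins for the per-frame features a sparse-conv backbone would extract.
frames = [rng.standard_normal(d) for _ in range(5)]

h, c = np.zeros(d), np.zeros(d)              # zero state before the first frame
for x in frames:
    h, c = lstm_step(x, h, c, W, U, b)       # hidden/memory state flows frame -> frame

print(h.shape)  # (8,)
```

In the paper's setting, `h` would be a sparse 3D feature volume consumed by a detection head for the current frame, while `(h, c)` are forwarded to the next frame; here they are flat vectors purely for illustration.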
Related papers
- VLM-Grounder: A VLM Agent for Zero-Shot 3D Visual Grounding [57.04804711488706]
3D visual grounding is crucial for robots, requiring integration of natural language and 3D scene understanding.
We present VLM-Grounder, a novel framework using vision-language models (VLMs) for zero-shot 3D visual grounding based solely on 2D images.
arXiv Detail & Related papers (2024-10-17T17:59:55Z)
- Sparse Points to Dense Clouds: Enhancing 3D Detection with Limited LiDAR Data [68.18735997052265]
We propose a balanced approach that combines the advantages of monocular and point cloud-based 3D detection.
Our method requires only a small number of 3D points, that can be obtained from a low-cost, low-resolution sensor.
The accuracy of 3D detection improves by 20% compared to the state-of-the-art monocular detection methods.
arXiv Detail & Related papers (2024-04-10T03:54:53Z)
- PTT: Point-Trajectory Transformer for Efficient Temporal 3D Object Detection [66.94819989912823]
We propose a point-trajectory transformer with long short-term memory for efficient temporal 3D object detection.
We use point clouds of current-frame objects and their historical trajectories as input to minimize the memory bank storage requirement.
We conduct extensive experiments on the large-scale dataset to demonstrate that our approach performs well against state-of-the-art methods.
arXiv Detail & Related papers (2023-12-13T18:59:13Z)
- 3DMODT: Attention-Guided Affinities for Joint Detection & Tracking in 3D Point Clouds [95.54285993019843]
We propose a method for joint detection and tracking of multiple objects in 3D point clouds.
Our model exploits temporal information employing multiple frames to detect objects and track them in a single network.
arXiv Detail & Related papers (2022-11-01T20:59:38Z)
- A Lightweight and Detector-free 3D Single Object Tracker on Point Clouds [50.54083964183614]
It is non-trivial to perform accurate target-specific detection since the point cloud of objects in raw LiDAR scans is usually sparse and incomplete.
We propose DMT, a Detector-free Motion prediction based 3D Tracking network that totally removes the usage of complicated 3D detectors.
arXiv Detail & Related papers (2022-03-08T17:49:07Z)
- Temp-Frustum Net: 3D Object Detection with Temporal Fusion [0.0]
3D object detection is a core component of automated driving systems.
Frame-by-frame 3D object detection suffers from noise, field-of-view obstruction, and sparsity.
We propose a novel Temporal Fusion Module to mitigate these problems.
arXiv Detail & Related papers (2021-04-25T09:08:14Z)
- 3D-MAN: 3D Multi-frame Attention Network for Object Detection [22.291051951077485]
3D-MAN is a 3D multi-frame attention network that effectively aggregates features from multiple perspectives.
We show that 3D-MAN achieves state-of-the-art results compared to published single-frame and multi-frame methods.
arXiv Detail & Related papers (2021-03-30T03:44:22Z)
- Relation3DMOT: Exploiting Deep Affinity for 3D Multi-Object Tracking from View Aggregation [8.854112907350624]
3D multi-object tracking plays a vital role in autonomous navigation.
Many approaches detect objects in 2D RGB sequences for tracking, which lacks reliability when localizing objects in 3D space.
We propose a novel convolutional operation, named RelationConv, to better exploit the correlation between each pair of objects in the adjacent frames.
arXiv Detail & Related papers (2020-11-25T16:14:40Z)
- Stereo RGB and Deeper LIDAR Based Network for 3D Object Detection [40.34710686994996]
3D object detection has become an emerging task in autonomous driving scenarios.
Previous works process 3D point clouds using either projection-based or voxel-based models.
We propose the Stereo RGB and Deeper LIDAR framework which can utilize semantic and spatial information simultaneously.
arXiv Detail & Related papers (2020-06-09T11:19:24Z)
- DOPS: Learning to Detect 3D Objects and Predict their 3D Shapes [54.239416488865565]
We propose a fast single-stage 3D object detection method for LIDAR data.
The core novelty of our method is a fast, single-pass architecture that both detects objects in 3D and estimates their shapes.
We find that our proposed method improves over the state-of-the-art by 5% on object detection in ScanNet scenes, and achieves top results by a margin of 3.4% on the Waymo Open Dataset.
arXiv Detail & Related papers (2020-04-02T17:48:50Z)