MSF: Motion-guided Sequential Fusion for Efficient 3D Object Detection
from Point Cloud Sequences
- URL: http://arxiv.org/abs/2303.08316v1
- Date: Wed, 15 Mar 2023 02:10:27 GMT
- Title: MSF: Motion-guided Sequential Fusion for Efficient 3D Object Detection
from Point Cloud Sequences
- Authors: Chenhang He, Ruihuang Li, Yabin Zhang, Shuai Li, Lei Zhang
- Abstract summary: Point cloud sequences are commonly used to accurately detect 3D objects in applications such as autonomous driving.
Current top-performing multi-frame detectors mostly follow a Detect-and-Fuse framework, which extracts features from each frame of the sequence and fuses them to detect the objects in the current frame.
We propose an efficient Motion-guided Sequential Fusion (MSF) method, which exploits the continuity of object motion to mine useful sequential contexts for object detection in the current frame.
- Score: 21.50329070835023
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: Point cloud sequences are commonly used to accurately detect 3D objects in
applications such as autonomous driving. Current top-performing multi-frame
detectors mostly follow a Detect-and-Fuse framework, which extracts features
from each frame of the sequence and fuses them to detect the objects in the
current frame. However, this inevitably leads to redundant computation since
adjacent frames are highly correlated. In this paper, we propose an efficient
Motion-guided Sequential Fusion (MSF) method, which exploits the continuity of
object motion to mine useful sequential contexts for object detection in the
current frame. We first generate 3D proposals on the current frame and
propagate them to preceding frames based on the estimated velocities. The
points-of-interest are then pooled from the sequence and encoded as proposal
features. A novel Bidirectional Feature Aggregation (BiFA) module is further
proposed to facilitate the interactions of proposal features across frames.
Besides, we optimize the point cloud pooling by a voxel-based sampling
technique so that millions of points can be processed in several milliseconds.
The proposed MSF method achieves not only better efficiency than other
multi-frame detectors but also leading accuracy, with 83.12% and 78.30% mAP on
the LEVEL1 and LEVEL2 test sets of Waymo Open Dataset, respectively. Code can
be found at https://github.com/skyhehe123/MSF.
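The core idea in the abstract — generating proposals on the current frame, propagating them backward through the sequence with estimated velocities, and pooling points-of-interest from each frame — can be sketched as follows. This is a minimal illustrative NumPy sketch, not the authors' implementation; all function names, the box parameterization, the constant-velocity assumption, and the axis-aligned pooling are assumptions for clarity (the actual method pools from rotated boxes and uses a voxel-based sampling technique).

```python
import numpy as np

def propagate_proposals(boxes, velocities, num_frames, dt=0.1):
    """Propagate current-frame 3D proposals to preceding frames.

    boxes:      (N, 7) array of [x, y, z, l, w, h, yaw] in the current frame.
    velocities: (N, 2) estimated per-object velocities [vx, vy] in m/s.
    Returns a list of (N, 7) arrays, one per preceding frame, with each box
    center shifted backward in time along the estimated motion.
    """
    propagated = []
    for k in range(1, num_frames + 1):
        shifted = boxes.copy()
        # Constant-velocity assumption: move the center back by k * dt.
        shifted[:, 0] -= velocities[:, 0] * k * dt
        shifted[:, 1] -= velocities[:, 1] * k * dt
        propagated.append(shifted)
    return propagated

def pool_points(points, box, margin=0.5):
    """Gather points-of-interest inside an enlarged box.

    Simplified to an axis-aligned test (ignoring yaw); the margin enlarges
    the box so that points near the boundary are also pooled.
    """
    x, y, z, l, w, h, _ = box
    mask = (
        (np.abs(points[:, 0] - x) <= l / 2 + margin)
        & (np.abs(points[:, 1] - y) <= w / 2 + margin)
        & (np.abs(points[:, 2] - z) <= h / 2 + margin)
    )
    return points[mask]
```

The pooled points from each frame would then be encoded as per-frame proposal features, which the BiFA module exchanges across frames.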
Related papers
- Multiway Point Cloud Mosaicking with Diffusion and Global Optimization [74.3802812773891]
We introduce a novel framework for multiway point cloud mosaicking (named Wednesday).
At the core of our approach is ODIN, a learned pairwise registration algorithm that identifies overlaps and refines attention scores.
Tested on four diverse, large-scale datasets, our method achieves state-of-the-art pairwise and rotation registration results by a large margin on all benchmarks.
arXiv Detail & Related papers (2024-03-30T17:29:13Z)
- PTT: Point-Trajectory Transformer for Efficient Temporal 3D Object Detection [66.94819989912823]
We propose a point-trajectory transformer with long short-term memory for efficient temporal 3D object detection.
We use point clouds of current-frame objects and their historical trajectories as input to minimize the memory bank storage requirement.
We conduct extensive experiments on a large-scale dataset to demonstrate that our approach performs well against state-of-the-art methods.
arXiv Detail & Related papers (2023-12-13T18:59:13Z)
- Modeling Continuous Motion for 3D Point Cloud Object Tracking [54.48716096286417]
This paper presents a novel approach that views each tracklet as a continuous stream.
At each timestamp, only the current frame is fed into the network to interact with multi-frame historical features stored in a memory bank.
To enhance the utilization of multi-frame features for robust tracking, a contrastive sequence enhancement strategy is proposed.
arXiv Detail & Related papers (2023-03-14T02:58:27Z)
- 3DMODT: Attention-Guided Affinities for Joint Detection & Tracking in 3D Point Clouds [95.54285993019843]
We propose a method for joint detection and tracking of multiple objects in 3D point clouds.
Our model exploits temporal information employing multiple frames to detect objects and track them in a single network.
arXiv Detail & Related papers (2022-11-01T20:59:38Z)
- FFPA-Net: Efficient Feature Fusion with Projection Awareness for 3D Object Detection [19.419030878019974]
Unstructured 3D point clouds are filled into the 2D plane, and 3D point cloud features are extracted faster using projection-aware convolution layers.
The corresponding indexes between different sensor signals are established in advance in the data preprocessing.
Two new plug-and-play fusion modules, LiCamFuse and BiLiCamFuse, are proposed.
arXiv Detail & Related papers (2022-09-15T16:13:19Z)
- TransPillars: Coarse-to-Fine Aggregation for Multi-Frame 3D Object Detection [47.941714033657675]
3D object detection using point clouds has attracted increasing attention due to its wide applications in autonomous driving and robotics.
We design TransPillars, a novel transformer-based feature aggregation technique that exploits temporal features of consecutive point cloud frames.
Our proposed TransPillars achieves state-of-the-art performance as compared to existing multi-frame detection approaches.
arXiv Detail & Related papers (2022-08-04T15:41:43Z)
- MPPNet: Multi-Frame Feature Intertwining with Proxy Points for 3D Temporal Object Detection [44.619039588252676]
We present a flexible and high-performance 3D detection framework, named MPPNet, for 3D temporal object detection with point cloud sequences.
We propose a novel three-hierarchy framework with proxy points for multi-frame feature encoding and interactions to achieve better detection.
Our approach outperforms state-of-the-art methods with large margins when applied to both short (e.g., 4-frame) and long (e.g., 16-frame) point cloud sequences.
arXiv Detail & Related papers (2022-05-12T09:38:42Z)
- Segment as Points for Efficient Online Multi-Object Tracking and Segmentation [66.03023110058464]
We propose a highly effective method for learning instance embeddings based on segments by converting the compact image representation to an unordered 2D point cloud representation.
Our method generates a new tracking-by-points paradigm where discriminative instance embeddings are learned from randomly selected points rather than images.
The resulting online MOTS framework, named PointTrack, surpasses all the state-of-the-art methods by large margins.
arXiv Detail & Related papers (2020-07-03T08:29:35Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.