DynStatF: An Efficient Feature Fusion Strategy for LiDAR 3D Object
Detection
- URL: http://arxiv.org/abs/2305.15219v1
- Date: Wed, 24 May 2023 15:00:01 GMT
- Title: DynStatF: An Efficient Feature Fusion Strategy for LiDAR 3D Object
Detection
- Authors: Yao Rong, Xiangyu Wei, Tianwei Lin, Yueyu Wang, Enkelejda Kasneci
- Abstract summary: Augmenting LiDAR input with multiple previous frames provides richer semantic information.
Crowded point clouds in multi-frames can hurt the precise position information due to the motion blur and inaccurate point projection.
We propose a novel feature fusion strategy, DynStaF, which enhances the rich semantic information provided by the multi-frame.
- Score: 21.573784416916546
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Augmenting LiDAR input with multiple previous frames provides richer semantic
information and thus boosts performance in 3D object detection, However,
crowded point clouds in multi-frames can hurt the precise position information
due to the motion blur and inaccurate point projection. In this work, we
propose a novel feature fusion strategy, DynStaF (Dynamic-Static Fusion), which
enhances the rich semantic information provided by the multi-frame (dynamic
branch) with the accurate location information from the current single-frame
(static branch). To effectively extract and aggregate complimentary features,
DynStaF contains two modules, Neighborhood Cross Attention (NCA) and
Dynamic-Static Interaction (DSI), operating through a dual pathway
architecture. NCA takes the features in the static branch as queries and the
features in the dynamic branch as keys (values). When computing the attention,
we address the sparsity of point clouds and take only neighborhood positions
into consideration. NCA fuses two features at different feature map scales,
followed by DSI providing the comprehensive interaction. To analyze our
proposed strategy DynStaF, we conduct extensive experiments on the nuScenes
dataset. On the test set, DynStaF increases the performance of PointPillars in
NDS by a large margin from 57.7% to 61.6%. When combined with CenterPoint, our
framework achieves 61.0% mAP and 67.7% NDS, leading to state-of-the-art
performance without bells and whistles.
Related papers
- FASTC: A Fast Attentional Framework for Semantic Traversability Classification Using Point Cloud [7.711666704468952]
We address the problem of traversability assessment using point clouds.
We propose a pillar feature extraction module that utilizes PointNet to capture features from point clouds organized in vertical volume.
We then propose a newtemporal attention module to fuse multi-frame information, which can properly handle the varying density problem of LIDAR point clouds.
arXiv Detail & Related papers (2024-06-24T12:01:55Z) - PTT: Point-Trajectory Transformer for Efficient Temporal 3D Object Detection [66.94819989912823]
We propose a point-trajectory transformer with long short-term memory for efficient temporal 3D object detection.
We use point clouds of current-frame objects and their historical trajectories as input to minimize the memory bank storage requirement.
We conduct extensive experiments on the large-scale dataset to demonstrate that our approach performs well against state-of-the-art methods.
arXiv Detail & Related papers (2023-12-13T18:59:13Z) - 3DMODT: Attention-Guided Affinities for Joint Detection & Tracking in 3D
Point Clouds [95.54285993019843]
We propose a method for joint detection and tracking of multiple objects in 3D point clouds.
Our model exploits temporal information employing multiple frames to detect objects and track them in a single network.
arXiv Detail & Related papers (2022-11-01T20:59:38Z) - AGO-Net: Association-Guided 3D Point Cloud Object Detection Network [86.10213302724085]
We propose a novel 3D detection framework that associates intact features for objects via domain adaptation.
We achieve new state-of-the-art performance on the KITTI 3D detection benchmark in both accuracy and speed.
arXiv Detail & Related papers (2022-08-24T16:54:38Z) - Ret3D: Rethinking Object Relations for Efficient 3D Object Detection in
Driving Scenes [82.4186966781934]
We introduce a simple, efficient, and effective two-stage detector, termed as Ret3D.
At the core of Ret3D is the utilization of novel intra-frame and inter-frame relation modules.
With negligible extra overhead, Ret3D achieves the state-of-the-art performance.
arXiv Detail & Related papers (2022-08-18T03:48:58Z) - TransPillars: Coarse-to-Fine Aggregation for Multi-Frame 3D Object
Detection [47.941714033657675]
3D object detection using point clouds has attracted increasing attention due to its wide applications in autonomous driving and robotics.
We design TransPillars, a novel transformer-based feature aggregation technique that exploits temporal features of consecutive point cloud frames.
Our proposed TransPillars achieves state-of-art performance as compared to existing multi-frame detection approaches.
arXiv Detail & Related papers (2022-08-04T15:41:43Z) - MPPNet: Multi-Frame Feature Intertwining with Proxy Points for 3D
Temporal Object Detection [44.619039588252676]
We present a flexible and high-performance 3D detection framework, named MPPNet, for 3D temporal object detection with point cloud sequences.
We propose a novel three-hierarchy framework with proxy points for multi-frame feature encoding and interactions to achieve better detection.
Our approach outperforms state-of-the-art methods with large margins when applied to both short (e.g., 4-frame) and long (e.g., 16-frame) point cloud sequences.
arXiv Detail & Related papers (2022-05-12T09:38:42Z) - Background-Aware 3D Point Cloud Segmentationwith Dynamic Point Feature
Aggregation [12.093182949686781]
We propose a novel 3D point cloud learning network, referred to as Dynamic Point Feature Aggregation Network (DPFA-Net)
DPFA-Net has two variants for semantic segmentation and classification of 3D point clouds.
It achieves the state-of-the-art overall accuracy score for semantic segmentation on the S3DIS dataset.
arXiv Detail & Related papers (2021-11-14T05:46:05Z) - Perception-aware Multi-sensor Fusion for 3D LiDAR Semantic Segmentation [59.42262859654698]
3D semantic segmentation is important in scene understanding for many applications, such as auto-driving and robotics.
Existing fusion-based methods may not achieve promising performance due to vast difference between two modalities.
In this work, we investigate a collaborative fusion scheme called perception-aware multi-sensor fusion (PMF) to exploit perceptual information from two modalities.
arXiv Detail & Related papers (2021-06-21T10:47:26Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.