Rethinking Dimensionality Reduction in Grid-based 3D Object Detection
- URL: http://arxiv.org/abs/2209.09464v1
- Date: Tue, 20 Sep 2022 04:51:54 GMT
- Title: Rethinking Dimensionality Reduction in Grid-based 3D Object Detection
- Authors: Dihe Huang, Ying Chen, Yikang Ding, Jinli Liao, Jianlin Liu, Kai Wu,
Qiang Nie, Yong Liu, Chengjie Wang
- Abstract summary: We propose a novel point cloud detection network based on a Multi-level feature dimensionality reduction strategy, called MDRNet.
In MDRNet, the Spatial-aware Dimensionality Reduction (SDR) is designed to dynamically focus on the valuable parts of the object during voxel-to-BEV feature transformation.
Experiments on nuScenes show that the proposed method outperforms the state-of-the-art methods.
- Score: 24.249147412551768
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Bird's eye view (BEV) is widely adopted by current point cloud
detectors because it makes well-explored 2D detection techniques directly applicable.
However, existing methods obtain BEV features by simply collapsing voxel or
point features along the height dimension, which causes the heavy loss of 3D
spatial information. To alleviate the information loss, we propose a novel
point cloud detection network based on a Multi-level feature dimensionality
reduction strategy, called MDRNet. In MDRNet, the Spatial-aware Dimensionality
Reduction (SDR) is designed to dynamically focus on the valuable parts of the
object during voxel-to-BEV feature transformation. Furthermore, the Multi-level
Spatial Residuals (MSR) is proposed to fuse the multi-level spatial information
in the BEV feature maps. Extensive experiments on nuScenes show that the
proposed method outperforms the state-of-the-art methods. The code will be
available upon publication.
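The abstract contrasts naive height-collapse with a spatially weighted voxel-to-BEV reduction. Since MDRNet's code is not yet released, the following is only a minimal numpy sketch of that general idea, not the paper's SDR module: the tensor shapes, the max-pooling baseline, and the toy "attention" scores are all illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Dense voxel features: (C, D, H, W) = (channels, height bins, BEV rows, BEV cols).
C, D, H, W = 4, 8, 16, 16
voxels = rng.standard_normal((C, D, H, W))

# Naive dimensionality reduction: collapse the height axis with a pooling
# operation, discarding where along the vertical column each feature came from.
bev_naive = voxels.max(axis=1)                       # (C, H, W)

# Spatially weighted reduction (illustrative stand-in for SDR): compute a
# per-location softmax over height bins and take a weighted sum, so the
# valuable parts of the vertical column can dominate the BEV feature.
logits = voxels.mean(axis=0)                         # (D, H, W) toy attention scores
weights = np.exp(logits) / np.exp(logits).sum(axis=0, keepdims=True)
bev_weighted = (voxels * weights[None]).sum(axis=1)  # (C, H, W)

print(bev_naive.shape, bev_weighted.shape)           # (4, 16, 16) (4, 16, 16)
```

In a real network the height weights would be predicted by learned layers rather than derived from a channel mean; the sketch only shows why a weighted collapse can retain 3D spatial information that max- or sum-pooling throws away.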
Related papers
- GeoBEV: Learning Geometric BEV Representation for Multi-view 3D Object Detection [36.245654685143016]
Bird's-Eye-View (BEV) representation has emerged as a mainstream paradigm for multi-view 3D object detection.
Existing methods overlook the geometric quality of BEV representation, leaving it in a low-resolution state.
arXiv Detail & Related papers (2024-09-03T11:57:36Z)
- Uplifting Range-View-based 3D Semantic Segmentation in Real-Time with Multi-Sensor Fusion [18.431017678057348]
Range-View(RV)-based 3D point cloud segmentation is widely adopted due to its compact data form.
However, RV-based methods fall short in providing robust segmentation for the occluded points.
We propose LaCRange, a new LiDAR- and camera-based range-view 3D point cloud semantic segmentation method.
In addition to being real-time, the proposed method achieves state-of-the-art results on the nuScenes benchmark.
arXiv Detail & Related papers (2024-07-12T21:41:57Z)
- Towards Unified 3D Object Detection via Algorithm and Data Unification [70.27631528933482]
We build the first unified multi-modal 3D object detection benchmark MM-Omni3D and extend the aforementioned monocular detector to its multi-modal version.
We name the designed monocular and multi-modal detectors as UniMODE and MM-UniMODE, respectively.
arXiv Detail & Related papers (2024-02-28T18:59:31Z)
- VirtualPainting: Addressing Sparsity with Virtual Points and Distance-Aware Data Augmentation for 3D Object Detection [3.5259183508202976]
We present an innovative approach that involves the generation of virtual LiDAR points using camera images.
We also enhance these virtual points with semantic labels obtained from image-based segmentation networks.
Our approach offers a versatile solution that can be seamlessly integrated into various 3D frameworks and 2D semantic segmentation methods.
arXiv Detail & Related papers (2023-12-26T18:03:05Z)
- OA-BEV: Bringing Object Awareness to Bird's-Eye-View Representation for Multi-Camera 3D Object Detection [78.38062015443195]
OA-BEV is a network that can be plugged into the BEV-based 3D object detection framework.
Our method achieves consistent improvements over the BEV-based baselines in terms of both average precision and nuScenes detection score.
arXiv Detail & Related papers (2023-01-13T06:02:31Z)
- AGO-Net: Association-Guided 3D Point Cloud Object Detection Network [86.10213302724085]
We propose a novel 3D detection framework that associates intact features for objects via domain adaptation.
We achieve new state-of-the-art performance on the KITTI 3D detection benchmark in both accuracy and speed.
arXiv Detail & Related papers (2022-08-24T16:54:38Z)
- RAANet: Range-Aware Attention Network for LiDAR-based 3D Object Detection with Auxiliary Density Level Estimation [11.180128679075716]
Range-Aware Attention Network (RAANet) is developed for 3D object detection from LiDAR data for autonomous driving.
RAANet extracts more powerful BEV features and generates superior 3D object detections.
Experiments on the nuScenes dataset demonstrate that our proposed approach outperforms the state-of-the-art methods for LiDAR-based 3D object detection.
arXiv Detail & Related papers (2021-11-18T04:20:13Z)
- Cylindrical and Asymmetrical 3D Convolution Networks for LiDAR-based Perception [122.53774221136193]
State-of-the-art methods for driving-scene LiDAR-based perception often project the point clouds to 2D space and then process them via 2D convolution.
A natural remedy is to utilize the 3D voxelization and 3D convolution network.
We propose a new framework for the outdoor LiDAR segmentation, where cylindrical partition and asymmetrical 3D convolution networks are designed to explore the 3D geometric pattern.
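The cylindrical-partition idea in this entry is concrete enough to sketch: instead of a Cartesian grid, points are binned in (radius, azimuth, z), so voxels grow with distance and better match the radially decreasing density of LiDAR returns. The numpy sketch below is illustrative only; the point set, grid resolution, and ranges are assumptions, not the paper's configuration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy LiDAR sweep: N points in Cartesian coordinates (x, y, z).
pts = rng.uniform(-50, 50, size=(1000, 3))

# Cylindrical partition: convert (x, y, z) to (radius, azimuth, z).
rho = np.hypot(pts[:, 0], pts[:, 1])
phi = np.arctan2(pts[:, 1], pts[:, 0])   # in (-pi, pi]
z = pts[:, 2]

# Illustrative grid resolution (hypothetical, not the paper's settings).
n_rho, n_phi, n_z = 32, 64, 16
rho_idx = np.clip((rho / rho.max() * n_rho).astype(int), 0, n_rho - 1)
phi_idx = np.clip(((phi + np.pi) / (2 * np.pi) * n_phi).astype(int), 0, n_phi - 1)
z_idx = np.clip(((z - z.min()) / (z.max() - z.min() + 1e-9) * n_z).astype(int),
                0, n_z - 1)

# Integer voxel coordinates per point; a sparse 3D conv net would consume these.
voxel_ids = np.stack([rho_idx, phi_idx, z_idx], axis=1)  # (N, 3)
print(voxel_ids.shape)                                   # (1000, 3)
```

Because azimuth bins cover a fixed angle, a far-away bin spans more area than a nearby one, which evens out how many points each voxel receives.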
arXiv Detail & Related papers (2021-09-12T06:25:11Z)
- Shape Prior Non-Uniform Sampling Guided Real-time Stereo 3D Object Detection [59.765645791588454]
The recently introduced RTS3D builds an efficient 4D Feature-Consistency Embedding space as the intermediate representation of objects without depth supervision.
We propose a shape prior non-uniform sampling strategy that performs dense sampling in outer region and sparse sampling in inner region.
Our proposed method achieves a 2.57% improvement in AP3d with almost no extra network parameters.
arXiv Detail & Related papers (2021-06-18T09:14:55Z)
- Cycle and Semantic Consistent Adversarial Domain Adaptation for Reducing Simulation-to-Real Domain Shift in LiDAR Bird's Eye View [110.83289076967895]
We present a BEV domain adaptation method based on CycleGAN that uses prior semantic classification in order to preserve the information of small objects of interest during the domain adaptation process.
The quality of the generated BEVs has been evaluated using a state-of-the-art 3D object detection framework at KITTI 3D Object Detection Benchmark.
arXiv Detail & Related papers (2021-04-22T12:47:37Z)
- SIENet: Spatial Information Enhancement Network for 3D Object Detection from Point Cloud [20.84329063509459]
LiDAR-based 3D object detection has an immense influence on autonomous vehicles.
Due to the intrinsic properties of LiDAR, fewer points are collected from objects farther away from the sensor.
To address the challenge, we propose a novel two-stage 3D object detection framework, named SIENet.
arXiv Detail & Related papers (2021-03-29T07:45:09Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.