Sparse2Dense: Learning to Densify 3D Features for 3D Object Detection
- URL: http://arxiv.org/abs/2211.13067v1
- Date: Wed, 23 Nov 2022 16:01:06 GMT
- Title: Sparse2Dense: Learning to Densify 3D Features for 3D Object Detection
- Authors: Tianyu Wang, Xiaowei Hu, Zhengzhe Liu, Chi-Wing Fu
- Abstract summary: LiDAR-produced point clouds are the major source for most state-of-the-art 3D object detectors.
Small, distant, and incomplete objects with sparse or few points are often hard to detect.
We present Sparse2Dense, a new framework to efficiently boost 3D detection performance by learning to densify point clouds in latent space.
- Score: 85.08249413137558
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: LiDAR-produced point clouds are the major source for most state-of-the-art 3D object detectors. Yet, small, distant, and incomplete objects with sparse or few points are often hard to detect. We present Sparse2Dense, a new framework to efficiently boost 3D detection performance by learning to densify point clouds in latent space. Specifically, we first train a dense-point 3D detector (DDet) with dense point clouds as input, and design a sparse-point 3D detector (SDet) that takes regular point clouds as input. Importantly, we formulate a lightweight plug-in S2D module and a point cloud reconstruction module in SDet to densify the 3D features, and train SDet to produce 3D features that follow the dense 3D features in DDet. Hence, at inference time, SDet can simulate dense 3D features from regular (sparse) point cloud inputs without requiring dense inputs. We evaluate our method on the large-scale Waymo Open Dataset and the Waymo Domain Adaptation Dataset, showing its high performance and efficiency over state-of-the-art methods.
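To make the training recipe concrete, here is a minimal sketch of the feature-distillation idea described in the abstract, assuming a PyTorch-style setup and BEV-style 2D feature maps for simplicity. The module names (DDet, SDet, S2D) follow the abstract, but all internals, shapes, and the plain MSE alignment loss are illustrative assumptions, not the authors' exact design.

```python
import torch
import torch.nn as nn

class S2DModule(nn.Module):
    """Hypothetical plug-in that densifies sparse latent features (illustrative)."""
    def __init__(self, channels: int):
        super().__init__()
        self.densify = nn.Sequential(
            nn.Conv2d(channels, channels, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels, channels, kernel_size=3, padding=1),
        )

    def forward(self, sparse_feat: torch.Tensor) -> torch.Tensor:
        # Residual refinement: predict the missing density on top of the input.
        return sparse_feat + self.densify(sparse_feat)

def distillation_step(ddet, sdet, s2d, dense_points, sparse_points, optimizer):
    """One training step: align densified student features with frozen teacher features."""
    with torch.no_grad():                  # DDet acts as the frozen teacher
        dense_feat = ddet(dense_points)    # target dense 3D features
    sparse_feat = sdet(sparse_points)      # student features from a regular (sparse) scan
    densified = s2d(sparse_feat)           # S2D densifies in latent space
    loss = nn.functional.mse_loss(densified, dense_feat)  # feature alignment loss
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```

At inference, only SDet and the S2D module run on the sparse input, which is why the scheme adds no requirement for dense data at test time.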
Related papers
- ImOV3D: Learning Open-Vocabulary Point Clouds 3D Object Detection from Only 2D Images [19.02348585677397]
Open-vocabulary 3D object detection (OV-3Det) aims to generalize beyond the limited number of base categories labeled during the training phase.
The biggest bottleneck is the scarcity of annotated 3D data, whereas 2D image datasets are abundant and richly annotated.
We propose a novel framework, ImOV3D, that leverages a pseudo multimodal representation containing both images and point clouds to close the modality gap.
arXiv Detail & Related papers (2024-10-31T15:02:05Z)
- Anchor-free 3D Single Stage Detector with Mask-Guided Attention for Point Cloud [79.39041453836793]
We develop a novel single-stage 3D detector for point clouds in an anchor-free manner.
We convert the voxel-based sparse 3D feature volumes into sparse 2D feature maps.
We propose an IoU-based detection confidence re-calibration scheme to improve the correlation between the detection confidence score and the accuracy of the bounding box regression (a toy re-calibration rule is sketched after this entry).
arXiv Detail & Related papers (2021-08-08T13:42:13Z)
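As a concrete illustration of such a re-calibration, the sketch below blends the classification confidence with a predicted localization IoU so that well-localized boxes rank higher at NMS time. The multiplicative fusion rule and the exponent alpha are assumptions for illustration, not the paper's exact scheme.

```python
import numpy as np

def recalibrate_scores(cls_scores: np.ndarray,
                       pred_ious: np.ndarray,
                       alpha: float = 0.5) -> np.ndarray:
    """Blend classification confidence with a predicted localization IoU."""
    pred_ious = np.clip(pred_ious, 0.0, 1.0)
    # Geometric-mean-style fusion: alpha trades off classification vs. localization.
    return cls_scores ** alpha * pred_ious ** (1.0 - alpha)

# Example: a box with strong classification but poor localization gets demoted.
scores = recalibrate_scores(np.array([0.9, 0.6]), np.array([0.3, 0.9]))
```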
- From Multi-View to Hollow-3D: Hallucinated Hollow-3D R-CNN for 3D Object Detection [101.20784125067559]
We propose a new architecture, namely Hallucinated Hollow-3D R-CNN, to address the problem of 3D object detection.
In our approach, we first extract multi-view features by sequentially projecting the point clouds into the perspective view and the bird's-eye view (a toy bird's-eye-view rasterization is sketched after this entry).
The 3D objects are detected via a box refinement module with a novel Hierarchical Voxel RoI Pooling operation.
arXiv Detail & Related papers (2021-07-30T02:00:06Z)
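As a toy illustration of the bird's-eye-view half of this multi-view projection, the sketch below rasterizes a LiDAR point cloud into a binary BEV occupancy grid. The grid extents and resolution are arbitrary assumptions; real pipelines typically also encode height and intensity channels.

```python
import numpy as np

def points_to_bev(points: np.ndarray,
                  x_range=(0.0, 70.4), y_range=(-40.0, 40.0),
                  resolution: float = 0.1) -> np.ndarray:
    """points: (N, 3) array of x, y, z coordinates in the LiDAR frame."""
    width = int((x_range[1] - x_range[0]) / resolution)
    height = int((y_range[1] - y_range[0]) / resolution)
    bev = np.zeros((height, width), dtype=np.float32)
    # Keep only points inside the grid extents.
    mask = ((points[:, 0] >= x_range[0]) & (points[:, 0] < x_range[1]) &
            (points[:, 1] >= y_range[0]) & (points[:, 1] < y_range[1]))
    pts = points[mask]
    cols = ((pts[:, 0] - x_range[0]) / resolution).astype(int)
    rows = ((pts[:, 1] - y_range[0]) / resolution).astype(int)
    bev[rows, cols] = 1.0  # binary occupancy per cell
    return bev
```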
- FGR: Frustum-Aware Geometric Reasoning for Weakly Supervised 3D Vehicle Detection [81.79171905308827]
We propose frustum-aware geometric reasoning (FGR) to detect vehicles in point clouds without any 3D annotations.
Our method consists of two stages: coarse 3D segmentation and 3D bounding box estimation.
It is able to accurately detect objects in 3D space with only 2D bounding boxes and sparse point clouds (a toy frustum point-selection step is sketched after this entry).
arXiv Detail & Related papers (2021-05-17T07:29:55Z)
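To illustrate the frustum idea, here is a toy sketch that keeps only the LiDAR points whose pinhole projection falls inside a given 2D box. The camera-frame convention and intrinsics matrix are assumptions, and the paper's actual geometric reasoning goes well beyond this selection step.

```python
import numpy as np

def frustum_points(points_cam: np.ndarray, K: np.ndarray, box2d) -> np.ndarray:
    """points_cam: (N, 3) in the camera frame; K: (3, 3) intrinsics; box2d: (x1, y1, x2, y2)."""
    pts = points_cam[points_cam[:, 2] > 0]   # keep points in front of the camera
    uv = (K @ pts.T).T
    uv = uv[:, :2] / uv[:, 2:3]              # perspective divide to pixel coordinates
    x1, y1, x2, y2 = box2d
    inside = ((uv[:, 0] >= x1) & (uv[:, 0] <= x2) &
              (uv[:, 1] >= y1) & (uv[:, 1] <= y2))
    return pts[inside]                        # candidate object points in the frustum
```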
- 3D-to-2D Distillation for Indoor Scene Parsing [78.36781565047656]
We present a new approach that leverages 3D features extracted from a large-scale 3D data repository to enhance 2D features extracted from RGB images.
First, we distill 3D knowledge from a pretrained 3D network to supervise a 2D network to learn simulated 3D features from 2D features during training.
Second, we design a two-stage dimension normalization scheme to calibrate the 2D and 3D features for better integration (a toy per-channel calibration is sketched after this entry).
Third, we design a semantic-aware adversarial training model to extend our framework for training with unpaired 3D data.
arXiv Detail & Related papers (2021-04-06T02:22:24Z)
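As a toy stand-in for such a calibration step, the sketch below standardizes each modality's features per channel before fusion, so the 2D and 3D branches live on comparable scales. The paper's actual two-stage scheme may differ substantially; this only illustrates the calibration idea.

```python
import torch

def standardize(feat: torch.Tensor, eps: float = 1e-5) -> torch.Tensor:
    """feat: (N, C) features; returns zero-mean, unit-variance features per channel."""
    mean = feat.mean(dim=0, keepdim=True)
    std = feat.std(dim=0, keepdim=True)
    return (feat - mean) / (std + eps)

def calibrate(feat2d: torch.Tensor, feat3d: torch.Tensor):
    """Bring both modalities to a comparable scale before integration."""
    return standardize(feat2d), standardize(feat3d)
```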
- 3D Object Detection Method Based on YOLO and K-Means for Image and Point Clouds [1.9458156037869139]
LiDAR-based 3D object detection and classification are essential tasks for autonomous driving.
This paper proposes a 3D object detection method based on point clouds and images (a toy clustering step is sketched after this entry).
arXiv Detail & Related papers (2020-04-21T04:32:36Z)
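A minimal sketch of the point-cloud half of such a pipeline: after a 2D detector (e.g., YOLO) localizes an object in the image, K-Means separates the corresponding LiDAR points into clusters, and the cluster nearest the sensor is kept as the object candidate. The choice of k = 2 (object vs. background) and the nearest-cluster heuristic are assumptions for illustration.

```python
import numpy as np
from sklearn.cluster import KMeans

def cluster_object_points(frustum_pts: np.ndarray, k: int = 2) -> np.ndarray:
    """frustum_pts: (N, 3) LiDAR points behind a 2D detection box."""
    labels = KMeans(n_clusters=k, n_init=10).fit_predict(frustum_pts)
    # Pick the cluster whose centroid is closest to the sensor origin.
    dists = [np.linalg.norm(frustum_pts[labels == i].mean(axis=0)) for i in range(k)]
    return frustum_pts[labels == int(np.argmin(dists))]
```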
- Boundary-Aware Dense Feature Indicator for Single-Stage 3D Object Detection from Point Clouds [32.916690488130506]
We propose a universal module that helps 3D detectors focus on the densest region of the point clouds in a boundary-aware manner.
Experiments on the KITTI dataset show that DENFI markedly improves the performance of the baseline single-stage detector.
arXiv Detail & Related papers (2020-04-01T01:21:23Z)
- D3Feat: Joint Learning of Dense Detection and Description of 3D Local Features [51.04841465193678]
We leverage a 3D fully convolutional network to process 3D point clouds.
We propose a novel and practical learning mechanism that densely predicts both a detection score and a description feature for each 3D point (a pointwise stand-in is sketched after this entry).
Our method achieves state-of-the-art results in both indoor and outdoor scenarios.
arXiv Detail & Related papers (2020-03-06T12:51:09Z)
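As a hedged sketch of this joint dense prediction, the stand-in module below assigns every 3D point a unit-length descriptor and a detection (keypoint) score. D3Feat itself builds on a KPConv-based fully convolutional backbone; the pointwise MLP here is merely an illustrative approximation.

```python
import torch
import torch.nn as nn

class DenseDetectDescribe(nn.Module):
    """Per-point joint head: descriptor + detection score for each 3D point."""
    def __init__(self, in_dim: int = 3, feat_dim: int = 32):
        super().__init__()
        self.backbone = nn.Sequential(
            nn.Linear(in_dim, 64), nn.ReLU(),
            nn.Linear(64, feat_dim),
        )
        self.score_head = nn.Sequential(nn.Linear(feat_dim, 1), nn.Sigmoid())

    def forward(self, points: torch.Tensor):
        """points: (N, 3) -> descriptors (N, feat_dim), scores (N, 1)."""
        feats = self.backbone(points)
        desc = nn.functional.normalize(feats, dim=1)  # unit-length descriptors
        score = self.score_head(feats)                # per-point detection score
        return desc, score
```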
This list is automatically generated from the titles and abstracts of the papers on this site.