Paint and Distill: Boosting 3D Object Detection with Semantic Passing
Network
- URL: http://arxiv.org/abs/2207.05497v1
- Date: Tue, 12 Jul 2022 12:35:34 GMT
- Title: Paint and Distill: Boosting 3D Object Detection with Semantic Passing
Network
- Authors: Bo Ju, Zhikang Zou, Xiaoqing Ye, Minyue Jiang, Xiao Tan, Errui Ding,
Jingdong Wang
- Abstract summary: 3D object detection from lidar or camera sensors is essential for autonomous driving.
We propose a novel semantic passing framework, named SPNet, to boost the performance of existing lidar-based 3D detection models.
- Score: 70.53093934205057
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: 3D object detection from lidar or camera sensors is essential for
autonomous driving. Pioneering attempts at multi-modality fusion complement the
sparse lidar point clouds with rich semantic texture information from images at
the cost of extra network designs and overhead. In this work, we propose a
novel semantic passing framework, named SPNet, to boost the performance of
existing lidar-based 3D detection models with the guidance of rich context
painting, with no extra computation cost during inference. Our key design is to
first exploit the potential instructive semantic knowledge within the
ground-truth labels by training a semantic-painted teacher model and then guide
the pure-lidar network to learn the semantic-painted representation via
knowledge passing modules at different granularities: class-wise passing,
pixel-wise passing and instance-wise passing. Experimental results show that
the proposed SPNet can seamlessly cooperate with most existing 3D detection
frameworks, yielding a 1-5% AP gain, and even achieve new state-of-the-art 3D detection
performance on the KITTI test benchmark. Code is available at:
https://github.com/jb892/SPNet.
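The three passing granularities can be read as distillation losses between the semantic-painted teacher's features and the pure-lidar student's features. The sketch below is an illustrative interpretation, not the paper's implementation: the (C, H, W) feature maps, the per-pixel class mask, the axis-aligned box regions, and all function names are assumptions.

```python
import numpy as np


def pixel_wise_loss(teacher, student):
    """Pixel-wise passing: MSE between teacher and student feature maps (C, H, W)."""
    return float(np.mean((teacher - student) ** 2))


def class_wise_loss(teacher, student, class_mask, num_classes):
    """Class-wise passing: align per-class feature prototypes, i.e. the mean
    feature vector over all pixels belonging to each class."""
    loss, used = 0.0, 0
    for c in range(num_classes):
        sel = class_mask == c
        if not sel.any():
            continue  # class absent from this sample
        proto_t = teacher[:, sel].mean(axis=1)
        proto_s = student[:, sel].mean(axis=1)
        loss += float(np.mean((proto_t - proto_s) ** 2))
        used += 1
    return loss / max(used, 1)


def instance_wise_loss(teacher, student, boxes):
    """Instance-wise passing: align features average-pooled inside each
    ground-truth box region, given as (x0, y0, x1, y1) pixel coordinates."""
    loss = 0.0
    for x0, y0, x1, y1 in boxes:
        pooled_t = teacher[:, y0:y1, x0:x1].mean(axis=(1, 2))
        pooled_s = student[:, y0:y1, x0:x1].mean(axis=(1, 2))
        loss += float(np.mean((pooled_t - pooled_s) ** 2))
    return loss / max(len(boxes), 1)
```

In a training loop, these three terms would typically be weighted and added to the student's ordinary detection loss; at inference the teacher and all passing modules are dropped, which is how the method avoids extra runtime cost.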
Related papers
- MonoNext: A 3D Monocular Object Detection with ConvNext [69.33657875725747]
This paper introduces a new multi-task learning approach called MonoNext for 3D object detection.
MonoNext employs a straightforward approach based on the ConvNext network and requires only 3D bounding box data.
In our experiments with the KITTI dataset, MonoNext achieved high precision and performance competitive with state-of-the-art approaches.
arXiv Detail & Related papers (2023-08-01T15:15:40Z)
- Unleash the Potential of Image Branch for Cross-modal 3D Object Detection [67.94357336206136]
We present a new cross-modal 3D object detector, namely UPIDet, which aims to unleash the potential of the image branch from two aspects.
First, UPIDet introduces a new 2D auxiliary task called normalized local coordinate map estimation.
Second, we discover that the representational capability of the point cloud backbone can be enhanced through the gradients backpropagated from the training objectives of the image branch.
arXiv Detail & Related papers (2023-01-22T08:26:58Z)
- ALSO: Automotive Lidar Self-supervision by Occupancy estimation [70.70557577874155]
We propose a new self-supervised method for pre-training the backbone of deep perception models operating on point clouds.
The core idea is to train the model on a pretext task which is the reconstruction of the surface on which the 3D points are sampled.
The intuition is that if the network is able to reconstruct the scene surface, given only sparse input points, then it probably also captures some fragments of semantic information.
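The pretext-target construction described above can be sketched as labeling query points by their distance to the sparse input cloud; the binary near-surface/free labeling and the threshold `tau` below are hypothetical simplifications of ALSO's actual occupancy-estimation scheme.

```python
import numpy as np


def surface_pretext_targets(surface_pts, query_pts, tau=0.1):
    """Build binary targets for a surface-reconstruction pretext task.

    A query point is labeled 1 ("near surface") if it lies within tau of any
    input point, else 0 ("free space"). surface_pts: (N, 3), query_pts: (M, 3).
    This is an illustrative simplification, not the paper's exact formulation.
    """
    # Pairwise Euclidean distances between queries and surface points: (M, N).
    d = np.linalg.norm(query_pts[:, None, :] - surface_pts[None, :, :], axis=-1)
    # A query is "occupied" if its nearest surface point is closer than tau.
    return (d.min(axis=1) < tau).astype(np.int64)
```

A backbone pre-trained to predict such targets from sparse points alone is then fine-tuned for detection, on the intuition quoted above that surface reconstruction forces it to capture semantic fragments.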
arXiv Detail & Related papers (2022-12-12T13:10:19Z)
- PAI3D: Painting Adaptive Instance-Prior for 3D Object Detection [22.41785292720421]
Painting Adaptive Instance-prior for 3D object detection (PAI3D) is a sequential instance-level fusion framework.
It first extracts instance-level semantic information from images.
The extracted information, including each object's categorical label, point-to-object membership, and object position, is then used to augment each LiDAR point in the subsequent 3D detection network.
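PAI3D's fusion appends instance-level semantics to each LiDAR point; the snippet below sketches only the generic project-and-append ("painting") step it shares with PointPainting-style methods. The pinhole intrinsics `K`, the map shapes, and nearest-pixel lookup are illustrative assumptions.

```python
import numpy as np


def paint_points(points, sem_map, K):
    """Append per-pixel semantic scores to LiDAR points (painting-style sketch).

    points:  (N, 3) point coordinates in the camera frame (z > 0).
    sem_map: (H, W, C) per-pixel semantic scores from an image network.
    K:       3x3 pinhole camera intrinsics.
    Returns (N, 3 + C) painted points.
    """
    h, w, _ = sem_map.shape
    uvz = (K @ points.T).T            # project points onto the image plane
    uv = uvz[:, :2] / uvz[:, 2:3]     # perspective divide
    # Nearest-pixel lookup, clamped to the image bounds.
    u = np.clip(np.round(uv[:, 0]).astype(int), 0, w - 1)
    v = np.clip(np.round(uv[:, 1]).astype(int), 0, h - 1)
    return np.concatenate([points, sem_map[v, u]], axis=1)
```

The painted points then feed an unmodified lidar detector, whose input channel count simply grows by the number of semantic channels C.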
arXiv Detail & Related papers (2022-11-15T11:15:25Z)
- AGO-Net: Association-Guided 3D Point Cloud Object Detection Network [86.10213302724085]
We propose a novel 3D detection framework that associates intact features for objects via domain adaptation.
We achieve new state-of-the-art performance on the KITTI 3D detection benchmark in both accuracy and speed.
arXiv Detail & Related papers (2022-08-24T16:54:38Z)
- SemanticVoxels: Sequential Fusion for 3D Pedestrian Detection using LiDAR Point Cloud and Semantic Segmentation [4.350338899049983]
We propose a generalization of PointPainting to be able to apply fusion at different levels.
We show that SemanticVoxels achieves state-of-the-art performance in both 3D and bird's eye view pedestrian detection benchmarks.
arXiv Detail & Related papers (2020-09-25T14:52:32Z)
- Stereo RGB and Deeper LIDAR Based Network for 3D Object Detection [40.34710686994996]
3D object detection has emerged as a key task in autonomous driving scenarios.
Previous works process 3D point clouds using either projection-based or voxel-based models.
We propose the Stereo RGB and Deeper LIDAR framework which can utilize semantic and spatial information simultaneously.
arXiv Detail & Related papers (2020-06-09T11:19:24Z)
- SESS: Self-Ensembling Semi-Supervised 3D Object Detection [138.80825169240302]
We propose SESS, a self-ensembling semi-supervised 3D object detection framework. Specifically, we design a thorough perturbation scheme to enhance the generalization of the network on unlabeled and unseen data.
Our SESS achieves competitive performance compared to the state-of-the-art fully-supervised method by using only 50% labeled data.
arXiv Detail & Related papers (2019-12-26T08:48:04Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information provided and is not responsible for any consequences arising from its use.