SegVoxelNet: Exploring Semantic Context and Depth-aware Features for 3D
Vehicle Detection from Point Cloud
- URL: http://arxiv.org/abs/2002.05316v1
- Date: Thu, 13 Feb 2020 02:42:31 GMT
- Title: SegVoxelNet: Exploring Semantic Context and Depth-aware Features for 3D
Vehicle Detection from Point Cloud
- Authors: Hongwei Yi, Shaoshuai Shi, Mingyu Ding, Jiankai Sun, Kui Xu, Hui Zhou,
Zhe Wang, Sheng Li, Guoping Wang
- Abstract summary: We propose a unified model, SegVoxelNet, to address two problems in LiDAR-based 3D vehicle detection.
A semantic context encoder is proposed to leverage the free-of-charge semantic segmentation masks in the bird's eye view.
A novel depth-aware head is designed to explicitly model the distribution differences across depths, with each part of the head focusing on its own target detection range.
- Score: 39.99118618229583
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: 3D vehicle detection based on point cloud is a challenging task in real-world
applications such as autonomous driving. Although significant progress has
been made, we observe two aspects that can be further improved. First, the
semantic context information in LiDAR, which may help identify ambiguous
vehicles, is seldom explored in previous works. Second, the distribution of
the point cloud over vehicles varies continuously with increasing depth, which
may not be well modeled by a single model. In this work, we propose a unified
model, SegVoxelNet,
to address the above two problems. A semantic context encoder is proposed to
leverage the free-of-charge semantic segmentation masks in the bird's eye view.
This module highlights suspicious vehicle regions while suppressing noisy
ones. To better deal with vehicles at different depths, a novel depth-aware
head is designed to explicitly model the distribution differences, with each
part of the head focusing on its own target detection range. Extensive
experiments on the KITTI dataset show that the proposed method outperforms
state-of-the-art alternatives in both accuracy and efficiency, using only
point clouds as input.
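For illustration, below is a minimal PyTorch sketch of the two ideas in the abstract: a BEV semantic-context encoder that predicts a foreground mask and reweights the features, and a depth-aware head that splits the BEV map into near/mid/far slices with a separate predictor per range. It is not the authors' implementation; the module names, channel sizes, number of depth ranges, and the assumption that depth runs along one BEV axis are all illustrative.

```python
import torch
import torch.nn as nn


class SemanticContextEncoder(nn.Module):
    """Sketch: predict a BEV foreground mask and use it to reweight the BEV features,
    highlighting likely vehicle regions and suppressing background."""

    def __init__(self, channels=128):
        super().__init__()
        self.mask_head = nn.Conv2d(channels, 1, kernel_size=1)          # BEV segmentation logits
        self.fuse = nn.Conv2d(channels * 2, channels, kernel_size=3, padding=1)

    def forward(self, bev_feat):                                        # (B, C, H, W)
        mask = torch.sigmoid(self.mask_head(bev_feat))                  # (B, 1, H, W)
        attended = bev_feat * mask                                      # emphasize foreground cells
        return self.fuse(torch.cat([bev_feat, attended], dim=1)), mask


class DepthAwareHead(nn.Module):
    """Sketch: one lightweight detection head per depth range along the BEV depth axis."""

    def __init__(self, channels=128, num_ranges=3, num_anchors=2):
        super().__init__()
        out = num_anchors * (1 + 7)                                     # score + 7-DoF box per anchor
        self.heads = nn.ModuleList(
            [nn.Conv2d(channels, out, kernel_size=1) for _ in range(num_ranges)]
        )

    def forward(self, bev_feat):                                        # depth assumed along dim 2
        chunks = torch.chunk(bev_feat, len(self.heads), dim=2)          # near / mid / far slices
        preds = [head(chunk) for head, chunk in zip(self.heads, chunks)]
        return torch.cat(preds, dim=2)                                  # reassemble the full BEV map


# Toy usage on a random BEV feature map.
feat = torch.randn(1, 128, 192, 192)
encoder, head = SemanticContextEncoder(), DepthAwareHead()
fused, seg_mask = encoder(feat)
boxes = head(fused)
print(fused.shape, seg_mask.shape, boxes.shape)
```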
Related papers
- Language-Guided 3D Object Detection in Point Cloud for Autonomous
Driving [91.91552963872596]
We propose a new multi-modal visual grounding task, termed LiDAR Grounding.
It jointly learns a LiDAR-based object detector with language features and predicts the targeted region directly from the detector.
Our work offers deeper insight into the LiDAR-based grounding task, and we expect it to present a promising direction for the autonomous driving community.
arXiv Detail & Related papers (2023-05-25T06:22:10Z)
- ALSO: Automotive Lidar Self-supervision by Occupancy estimation [70.70557577874155]
We propose a new self-supervised method for pre-training the backbone of deep perception models operating on point clouds.
The core idea is to train the model on a pretext task which is the reconstruction of the surface on which the 3D points are sampled.
The intuition is that if the network is able to reconstruct the scene surface, given only sparse input points, then it probably also captures some fragments of semantic information.
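As a rough sketch of an occupancy-style pretext task in this spirit (not ALSO's exact reconstruction objective), one can label random 3D queries as near-surface or free-space by their distance to the input points and train a small decoder on that signal; the decoder architecture, the global scene feature, and the distance threshold below are illustrative assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class OccupancyDecoder(nn.Module):
    """Sketch: predict whether a 3D query point lies near the scanned surface,
    conditioned on a global scene feature (a stand-in for the backbone output)."""

    def __init__(self, feat_dim=256, hidden=128):
        super().__init__()
        self.mlp = nn.Sequential(nn.Linear(feat_dim + 3, hidden), nn.ReLU(), nn.Linear(hidden, 1))

    def forward(self, scene_feat, queries):                             # (B, F), (B, Q, 3)
        f = scene_feat[:, None, :].expand(-1, queries.shape[1], -1)
        return self.mlp(torch.cat([f, queries], dim=-1)).squeeze(-1)    # occupancy logits (B, Q)


def pretext_loss(decoder, scene_feat, points, radius=0.2, num_queries=2048):
    """Label random queries as 'near surface' if within `radius` of any input point."""
    b = points.shape[0]
    lo, hi = points.amin(dim=1, keepdim=True), points.amax(dim=1, keepdim=True)
    queries = lo + torch.rand(b, num_queries, 3, device=points.device) * (hi - lo)
    dists = torch.cdist(queries, points).min(dim=-1).values             # (B, Q)
    labels = (dists < radius).float()
    return F.binary_cross_entropy_with_logits(decoder(scene_feat, queries), labels)


# Toy usage: random points and a random global feature standing in for the backbone.
pts = torch.rand(2, 4096, 3) * 50.0
feat = torch.randn(2, 256)
print(pretext_loss(OccupancyDecoder(), feat, pts))
```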
arXiv Detail & Related papers (2022-12-12T13:10:19Z)
- PillarGrid: Deep Learning-based Cooperative Perception for 3D Object Detection from Onboard-Roadside LiDAR [15.195933965761645]
We propose PillarGrid, a novel cooperative perception method fusing information from multiple 3D LiDARs.
PillarGrid consists of four main phases: 1) cooperative preprocessing of point clouds, 2) pillar-wise voxelization and feature extraction, 3) grid-wise deep fusion of features from multiple sensors, and 4) convolutional neural network (CNN)-based augmented 3D object detection.
Extensive experimentation shows that PillarGrid outperforms the SOTA single-LiDAR-based 3D object detection methods with respect to both accuracy and range by a large margin.
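A minimal sketch of the grid-wise fusion step (phase 3), assuming the pillar features from both LiDARs have already been projected into a common, aligned BEV grid; the convolutional fusion operator and channel sizes are illustrative, not taken from the paper.

```python
import torch
import torch.nn as nn


class GridWiseFusion(nn.Module):
    """Sketch: fuse aligned BEV pillar features from an onboard and a roadside LiDAR
    cell by cell with a small convolutional block (phase 3 of the pipeline)."""

    def __init__(self, channels=64):
        super().__init__()
        self.fuse = nn.Sequential(
            nn.Conv2d(channels * 2, channels, kernel_size=3, padding=1),
            nn.BatchNorm2d(channels),
            nn.ReLU(inplace=True),
        )

    def forward(self, bev_onboard, bev_roadside):          # both (B, C, H, W), pre-aligned
        return self.fuse(torch.cat([bev_onboard, bev_roadside], dim=1))


# Toy usage: two aligned 64-channel BEV grids from the two sensors.
onboard, roadside = torch.randn(1, 64, 256, 256), torch.randn(1, 64, 256, 256)
fused = GridWiseFusion()(onboard, roadside)                # feeds the CNN detection head (phase 4)
print(fused.shape)
```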
arXiv Detail & Related papers (2022-03-12T02:28:41Z)
- Fine-Grained Vehicle Perception via 3D Part-Guided Visual Data Augmentation [77.60050239225086]
We propose an effective training data generation process by fitting a 3D car model with dynamic parts to vehicles in real images.
Our approach is fully automatic without any human interaction.
We present a multi-task network for VUS parsing and a multi-stream network for VHI parsing.
arXiv Detail & Related papers (2020-12-15T03:03:38Z)
- LiDAR-based Panoptic Segmentation via Dynamic Shifting Network [56.71765153629892]
LiDAR-based panoptic segmentation aims to parse both objects and scenes in a unified manner.
We propose the Dynamic Shifting Network (DS-Net), which serves as an effective panoptic segmentation framework in the point cloud realm.
Our proposed DS-Net achieves superior accuracy over current state-of-the-art methods.
arXiv Detail & Related papers (2020-11-24T08:44:46Z)
- SemanticVoxels: Sequential Fusion for 3D Pedestrian Detection using LiDAR Point Cloud and Semantic Segmentation [4.350338899049983]
We propose a generalization of PointPainting to be able to apply fusion at different levels.
We show that SemanticVoxels achieves state-of-the-art performance in both 3D and bird's eye view pedestrian detection benchmarks.
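For context, the point-level decoration introduced by PointPainting, which SemanticVoxels generalizes to additional fusion levels, can be sketched roughly as below; the projection matrix, tensor shapes, and helper name are illustrative assumptions.

```python
import torch


def paint_points(points_xyz, seg_scores, lidar2img, img_hw):
    """Sketch of PointPainting-style decoration: project LiDAR points into an image
    segmentation map and append the sampled class scores to each point.
    `lidar2img` (a 3x4 projection matrix) and all shapes are illustrative."""
    n = points_xyz.shape[0]
    homog = torch.cat([points_xyz, torch.ones(n, 1)], dim=1)            # (N, 4) homogeneous coords
    uvw = homog @ lidar2img.T                                           # (N, 3) image-plane coords
    uv = (uvw[:, :2] / uvw[:, 2:3].clamp(min=1e-6)).round().long()      # pixel indices
    h, w = img_hw
    valid = (uvw[:, 2] > 0) & (uv[:, 0] >= 0) & (uv[:, 0] < w) \
        & (uv[:, 1] >= 0) & (uv[:, 1] < h)
    painted = torch.zeros(n, seg_scores.shape[0])                       # (N, num_classes)
    painted[valid] = seg_scores[:, uv[valid, 1], uv[valid, 0]].T        # sample the (C, H, W) map
    return torch.cat([points_xyz, painted], dim=1)                      # decorated points (N, 3 + C)


# Toy usage with random data.
pts = torch.rand(1000, 3) * 40.0
scores = torch.softmax(torch.randn(4, 375, 1242), dim=0)                # 4-class segmentation map
proj = torch.randn(3, 4)                                                # stand-in calibration
print(paint_points(pts, scores, proj, (375, 1242)).shape)               # torch.Size([1000, 7])
```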
arXiv Detail & Related papers (2020-09-25T14:52:32Z)
- InfoFocus: 3D Object Detection for Autonomous Driving with Dynamic Information Modeling [65.47126868838836]
We propose a novel 3D object detection framework with dynamic information modeling.
Coarse predictions are generated in the first stage via a voxel-based region proposal network.
Experiments are conducted on the large-scale nuScenes 3D detection benchmark.
arXiv Detail & Related papers (2020-07-16T18:27:08Z)
- Stereo RGB and Deeper LIDAR Based Network for 3D Object Detection [40.34710686994996]
3D object detection is an emerging task in autonomous driving scenarios.
Previous works process 3D point clouds using either projection-based or voxel-based models.
We propose the Stereo RGB and Deeper LIDAR framework which can utilize semantic and spatial information simultaneously.
arXiv Detail & Related papers (2020-06-09T11:19:24Z)
This list is automatically generated from the titles and abstracts of the papers listed on this site.