Point-to-Voxel Knowledge Distillation for LiDAR Semantic Segmentation
- URL: http://arxiv.org/abs/2206.02099v1
- Date: Sun, 5 Jun 2022 05:28:32 GMT
- Title: Point-to-Voxel Knowledge Distillation for LiDAR Semantic Segmentation
- Authors: Yuenan Hou, Xinge Zhu, Yuexin Ma, Chen Change Loy, and Yikang Li
- Abstract summary: This article addresses the problem of distilling knowledge from a large teacher model to a slim student network for LiDAR semantic segmentation.
We propose the Point-to-Voxel Knowledge Distillation (PVD), which transfers the hidden knowledge from both point level and voxel level.
- Score: 74.67594286008317
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This article addresses the problem of distilling knowledge from a large
teacher model to a slim student network for LiDAR semantic segmentation.
Directly employing previous distillation approaches yields inferior results due
to the intrinsic challenges of point cloud, i.e., sparsity, randomness and
varying density. To tackle the aforementioned problems, we propose the
Point-to-Voxel Knowledge Distillation (PVD), which transfers the hidden
knowledge from both point level and voxel level. Specifically, we first
leverage both the pointwise and voxelwise output distillation to complement the
sparse supervision signals. Then, to better exploit the structural information,
we divide the whole point cloud into several supervoxels and design a
difficulty-aware sampling strategy to more frequently sample supervoxels
containing less-frequent classes and faraway objects. On these supervoxels, we
propose inter-point and inter-voxel affinity distillation, where the similarity
information between points and voxels can help the student model better capture
the structural information of the surrounding environment. We conduct extensive
experiments on two popular LiDAR segmentation benchmarks, i.e., nuScenes and
SemanticKITTI. On both benchmarks, our PVD consistently outperforms previous
distillation approaches by a large margin on three representative backbones,
i.e., Cylinder3D, SPVNAS and MinkowskiNet. Notably, on the challenging nuScenes
and SemanticKITTI datasets, our method can achieve roughly 75% MACs reduction
and 2x speedup on the competitive Cylinder3D model and rank 1st on the
SemanticKITTI leaderboard among all published algorithms. Our code is available
at https://github.com/cardwing/Codes-for-PVKD.
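The sketch below illustrates the two kinds of supervision described in the abstract: point-level/voxel-level output distillation and inter-point/inter-voxel affinity distillation on sampled supervoxels. It is a minimal PyTorch illustration written for this summary, not the authors' implementation; the tensor layouts, the temperature, the loss weights, and the supervoxel index format are assumptions, and the official code at the repository linked above should be consulted for the actual method.

```python
# Minimal sketch of the two distillation signals described in the abstract:
# (1) pointwise/voxelwise output (logit) distillation and
# (2) inter-point / inter-voxel affinity distillation on sampled supervoxels.
# Shapes, temperature, and loss weights are illustrative assumptions, not
# values from https://github.com/cardwing/Codes-for-PVKD.

import torch
import torch.nn.functional as F


def output_distillation(student_logits, teacher_logits, temperature=2.0):
    """KL divergence between softened teacher and student class distributions.

    Works for both pointwise logits (N_points, C) and voxelwise logits
    (N_voxels, C), complementing the sparse ground-truth supervision.
    """
    log_p_student = F.log_softmax(student_logits / temperature, dim=-1)
    p_teacher = F.softmax(teacher_logits / temperature, dim=-1)
    # batchmean + T^2 keeps gradient magnitudes comparable across temperatures.
    return F.kl_div(log_p_student, p_teacher, reduction="batchmean") * temperature ** 2


def affinity_distillation(student_feats, teacher_feats):
    """Match pairwise (inter-point or inter-voxel) similarity structure.

    student_feats / teacher_feats: (M, D) features of the M points (or voxels)
    inside one sampled supervoxel. The cosine-similarity matrices encode how
    elements relate to their neighbours, i.e. the structural information the
    student should inherit from the teacher.
    """
    s = F.normalize(student_feats, dim=-1)
    t = F.normalize(teacher_feats.detach(), dim=-1)
    affinity_s = s @ s.t()  # (M, M) student affinity matrix
    affinity_t = t @ t.t()  # (M, M) teacher affinity matrix
    return F.mse_loss(affinity_s, affinity_t)


def pvd_style_loss(student_out, teacher_out, supervoxels, w_out=1.0, w_aff=1.0):
    """Combine both signals over a batch of sampled supervoxels.

    student_out / teacher_out are dicts holding point- and voxel-level logits
    and features; supervoxels is a list of index tensors selecting the points
    of each sampled supervoxel (a hypothetical container format).
    """
    loss = w_out * (
        output_distillation(student_out["point_logits"], teacher_out["point_logits"])
        + output_distillation(student_out["voxel_logits"], teacher_out["voxel_logits"])
    )
    for idx in supervoxels:
        loss = loss + w_aff * affinity_distillation(
            student_out["point_feats"][idx], teacher_out["point_feats"][idx]
        )
    return loss
```

The difficulty-aware sampling that decides which supervoxels enter the affinity term is not sketched here; per the abstract it over-samples supervoxels containing less-frequent classes and faraway objects.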
Related papers
- Self-Supervised Scene Flow Estimation with Point-Voxel Fusion and Surface Representation [30.355128117680444]
Scene flow estimation aims to generate the 3D motion field of points between two consecutive frames of point clouds.
Existing point-based methods ignore the irregularity of point clouds and have difficulty capturing long-range dependencies.
We propose a point-voxel fusion method, where we utilize a voxel branch based on sparse grid attention and the shifted window strategy to capture long-range dependencies.
arXiv Detail & Related papers (2024-10-17T09:05:15Z)
- LIX: Implicitly Infusing Spatial Geometric Prior Knowledge into Visual Semantic Segmentation for Autonomous Driving [26.319489913682574]
We introduce the Learning to Infuse "X" (LIX) framework, with novel contributions in both logit distillation and feature distillation aspects.
We develop an adaptively-recalibrated feature distillation algorithm, including two technical novelties.
arXiv Detail & Related papers (2024-03-13T03:24:36Z)
- Class-Imbalanced Semi-Supervised Learning for Large-Scale Point Cloud Semantic Segmentation via Decoupling Optimization [64.36097398869774]
Semi-supervised learning (SSL) has been an active research topic for large-scale 3D scene understanding.
Existing SSL-based methods suffer from severe training bias due to class imbalance and the long-tail distribution of point cloud data.
We introduce a new decoupling optimization framework, which disentangles feature representation learning and the classifier in an alternating optimization manner to effectively shift the biased decision boundary.
arXiv Detail & Related papers (2024-01-13T04:16:40Z)
- Joint Learning for Scattered Point Cloud Understanding with Hierarchical Self-Distillation [34.26170741722835]
We propose an end-to-end architecture that compensates for and identifies partial point clouds on the fly.
Hierarchical self-distillation (HSD) can be applied to arbitrary hierarchy-based point cloud methods.
arXiv Detail & Related papers (2023-12-28T08:51:04Z)
- Clustering based Point Cloud Representation Learning for 3D Analysis [80.88995099442374]
We propose a clustering based supervised learning scheme for point cloud analysis.
Unlike the current de facto scene-wise training paradigm, our algorithm conducts within-class clustering in the point embedding space.
Our algorithm shows notable improvements on famous point cloud segmentation datasets.
arXiv Detail & Related papers (2023-07-27T03:42:12Z)
- KECOR: Kernel Coding Rate Maximization for Active 3D Object Detection [48.66703222700795]
We resort to a novel kernel strategy to identify the most informative point clouds to acquire labels.
To accommodate both one-stage (i.e., SECOND) and two-stage detectors, we also incorporate the classification entropy tangent, trading off detection performance against the total number of bounding boxes selected for annotation.
Our results show that approximately 44% box-level annotation costs and 26% computational time are reduced compared to the state-of-the-art method.
arXiv Detail & Related papers (2023-07-16T04:27:03Z)
- Multi-to-Single Knowledge Distillation for Point Cloud Semantic Segmentation [41.02741249858771]
We propose a novel multi-to-single knowledge distillation framework for the 3D point cloud semantic segmentation task.
Instead of directly fusing all the points from multiple scans, only instances that belong to the previously defined hard classes are fused.
arXiv Detail & Related papers (2023-04-28T12:17:08Z)
- Point2Vec for Self-Supervised Representation Learning on Point Clouds [66.53955515020053]
We extend data2vec to the point cloud domain and report encouraging results on several downstream tasks.
We propose point2vec, which unleashes the full potential of data2vec-like pre-training on point clouds.
arXiv Detail & Related papers (2023-03-29T10:08:29Z)
- SelfVoxeLO: Self-supervised LiDAR Odometry with Voxel-based Deep Neural Networks [81.64530401885476]
We propose a self-supervised LiDAR odometry method, dubbed SelfVoxeLO, to tackle these two difficulties.
Specifically, we propose a 3D convolution network to process the raw LiDAR data directly, which extracts features that better encode the 3D geometric patterns.
We evaluate our method's performances on two large-scale datasets, i.e., KITTI and Apollo-SouthBay.
arXiv Detail & Related papers (2020-10-19T09:23:39Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.