Point-to-Voxel Knowledge Distillation for LiDAR Semantic Segmentation
- URL: http://arxiv.org/abs/2206.02099v1
- Date: Sun, 5 Jun 2022 05:28:32 GMT
- Title: Point-to-Voxel Knowledge Distillation for LiDAR Semantic Segmentation
- Authors: Yuenan Hou, Xinge Zhu, Yuexin Ma, Chen Change Loy, and Yikang Li
- Abstract summary: This article addresses the problem of distilling knowledge from a large teacher model to a slim student network for LiDAR semantic segmentation.
We propose the Point-to-Voxel Knowledge Distillation (PVD), which transfers the hidden knowledge from both point level and voxel level.
- Score: 74.67594286008317
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This article addresses the problem of distilling knowledge from a large
teacher model to a slim student network for LiDAR semantic segmentation.
Directly employing previous distillation approaches yields inferior results due
to the intrinsic challenges of point cloud, i.e., sparsity, randomness and
varying density. To tackle the aforementioned problems, we propose the
Point-to-Voxel Knowledge Distillation (PVD), which transfers the hidden
knowledge from both point level and voxel level. Specifically, we first
leverage both the pointwise and voxelwise output distillation to complement the
sparse supervision signals. Then, to better exploit the structural information,
we divide the whole point cloud into several supervoxels and design a
difficulty-aware sampling strategy to more frequently sample supervoxels
containing less-frequent classes and faraway objects. On these supervoxels, we
propose inter-point and inter-voxel affinity distillation, where the similarity
information between points and voxels can help the student model better capture
the structural information of the surrounding environment. We conduct extensive
experiments on two popular LiDAR segmentation benchmarks, i.e., nuScenes and
SemanticKITTI. On both benchmarks, our PVD consistently outperforms previous
distillation approaches by a large margin on three representative backbones,
i.e., Cylinder3D, SPVNAS and MinkowskiNet. Notably, on the challenging nuScenes
and SemanticKITTI datasets, our method can achieve roughly 75% MACs reduction
and 2x speedup on the competitive Cylinder3D model and rank 1st on the
SemanticKITTI leaderboard among all published algorithms. Our code is available
at https://github.com/cardwing/Codes-for-PVKD.
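The sketch below illustrates the two kinds of supervision described in the abstract: point-level/voxel-level output distillation and inter-point/inter-voxel affinity distillation on sampled supervoxels. It is a minimal PyTorch illustration written for this summary, not the authors' implementation; the tensor layouts, the temperature, the loss weights, and the supervoxel index format are assumptions, and the official code at the repository linked above should be consulted for the actual method.

```python
# Minimal sketch of the two distillation signals described in the abstract:
# (1) pointwise/voxelwise output (logit) distillation and
# (2) inter-point / inter-voxel affinity distillation on sampled supervoxels.
# Shapes, temperature, and loss weights are illustrative assumptions, not
# values from https://github.com/cardwing/Codes-for-PVKD.

import torch
import torch.nn.functional as F


def output_distillation(student_logits, teacher_logits, temperature=2.0):
    """KL divergence between softened teacher and student class distributions.

    Works for both pointwise logits (N_points, C) and voxelwise logits
    (N_voxels, C), complementing the sparse ground-truth supervision.
    """
    log_p_student = F.log_softmax(student_logits / temperature, dim=-1)
    p_teacher = F.softmax(teacher_logits / temperature, dim=-1)
    # batchmean + T^2 keeps gradient magnitudes comparable across temperatures.
    return F.kl_div(log_p_student, p_teacher, reduction="batchmean") * temperature ** 2


def affinity_distillation(student_feats, teacher_feats):
    """Match pairwise (inter-point or inter-voxel) similarity structure.

    student_feats / teacher_feats: (M, D) features of the M points (or voxels)
    inside one sampled supervoxel. The cosine-similarity matrices encode how
    elements relate to their neighbours, i.e. the structural information the
    student should inherit from the teacher.
    """
    s = F.normalize(student_feats, dim=-1)
    t = F.normalize(teacher_feats.detach(), dim=-1)
    affinity_s = s @ s.t()  # (M, M) student affinity matrix
    affinity_t = t @ t.t()  # (M, M) teacher affinity matrix
    return F.mse_loss(affinity_s, affinity_t)


def pvd_style_loss(student_out, teacher_out, supervoxels, w_out=1.0, w_aff=1.0):
    """Combine both signals over a batch of sampled supervoxels.

    student_out / teacher_out are dicts holding point- and voxel-level logits
    and features; supervoxels is a list of index tensors selecting the points
    of each sampled supervoxel (a hypothetical container format).
    """
    loss = w_out * (
        output_distillation(student_out["point_logits"], teacher_out["point_logits"])
        + output_distillation(student_out["voxel_logits"], teacher_out["voxel_logits"])
    )
    for idx in supervoxels:
        loss = loss + w_aff * affinity_distillation(
            student_out["point_feats"][idx], teacher_out["point_feats"][idx]
        )
    return loss
```

The difficulty-aware sampling that decides which supervoxels enter the affinity term is not sketched here; per the abstract it over-samples supervoxels containing less-frequent classes and faraway objects.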
Related papers
- Self-Supervised Scene Flow Estimation with Point-Voxel Fusion and Surface Representation [30.355128117680444]
Scene flow estimation aims to generate the 3D motion field of points between two consecutive frames of point clouds.
Existing point-based methods ignore the irregularity of point clouds and have difficulty capturing long-range dependencies.
We propose a point-voxel fusion method, where we utilize a voxel branch based on sparse grid attention and the shifted window strategy to capture long-range dependencies.
arXiv Detail & Related papers (2024-10-17T09:05:15Z)
- LIX: Implicitly Infusing Spatial Geometric Prior Knowledge into Visual Semantic Segmentation for Autonomous Driving [26.319489913682574]
We introduce the Learning to Infuse "X" (LIX) framework, with novel contributions in both logit distillation and feature distillation aspects.
We develop an adaptively-recalibrated feature distillation algorithm, including two technical novelties.
arXiv Detail & Related papers (2024-03-13T03:24:36Z)
- Class-Imbalanced Semi-Supervised Learning for Large-Scale Point Cloud Semantic Segmentation via Decoupling Optimization [64.36097398869774]
Semi-supervised learning (SSL) has been an active research topic for large-scale 3D scene understanding.
Existing SSL-based methods suffer from severe training bias due to class imbalance and the long-tail distribution of point cloud data.
We introduce a new decoupling optimization framework, which disentangles feature representation learning and the classifier in an alternating optimization manner to effectively shift the biased decision boundary.
arXiv Detail & Related papers (2024-01-13T04:16:40Z)
- Joint Learning for Scattered Point Cloud Understanding with Hierarchical Self-Distillation [34.26170741722835]
We propose an end-to-end architecture that compensates for and identifies partial point clouds on the fly.
Hierarchical self-distillation (HSD) can be applied to arbitrary hierarchy-based point cloud methods.
arXiv Detail & Related papers (2023-12-28T08:51:04Z)
- Clustering based Point Cloud Representation Learning for 3D Analysis [80.88995099442374]
We propose a clustering based supervised learning scheme for point cloud analysis.
Unlike the current de facto scene-wise training paradigm, our algorithm conducts within-class clustering in the point embedding space.
Our algorithm shows notable improvements on famous point cloud segmentation datasets.
arXiv Detail & Related papers (2023-07-27T03:42:12Z)
- KECOR: Kernel Coding Rate Maximization for Active 3D Object Detection [48.66703222700795]
We resort to a novel kernel strategy to identify the most informative point clouds to acquire labels.
To accommodate both one-stage (i.e., SECOND) and two-stage detectors, we also incorporate the classification entropy tangent, trading off detection performance against the total number of bounding boxes selected for annotation.
Our results show that approximately 44% box-level annotation costs and 26% computational time are reduced compared to the state-of-the-art method.
arXiv Detail & Related papers (2023-07-16T04:27:03Z)
- Multi-to-Single Knowledge Distillation for Point Cloud Semantic Segmentation [41.02741249858771]
We propose a novel multi-to-single knowledge distillation framework for the 3D point cloud semantic segmentation task.
Instead of directly fusing all the points from multiple scans, only instances that belong to the previously defined hard classes are fused.
arXiv Detail & Related papers (2023-04-28T12:17:08Z)
- Point2Vec for Self-Supervised Representation Learning on Point Clouds [66.53955515020053]
We extend data2vec to the point cloud domain and report encouraging results on several downstream tasks.
We propose point2vec, which unleashes the full potential of data2vec-like pre-training on point clouds.
arXiv Detail & Related papers (2023-03-29T10:08:29Z)
- SelfVoxeLO: Self-supervised LiDAR Odometry with Voxel-based Deep Neural Networks [81.64530401885476]
We propose a self-supervised LiDAR odometry method, dubbed SelfVoxeLO, to tackle these two difficulties.
Specifically, we propose a 3D convolution network to process the raw LiDAR data directly, which extracts features that better encode the 3D geometric patterns.
We evaluate our method's performances on two large-scale datasets, i.e., KITTI and Apollo-SouthBay.
arXiv Detail & Related papers (2020-10-19T09:23:39Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.