KECOR: Kernel Coding Rate Maximization for Active 3D Object Detection
- URL: http://arxiv.org/abs/2307.07942v1
- Date: Sun, 16 Jul 2023 04:27:03 GMT
- Title: KECOR: Kernel Coding Rate Maximization for Active 3D Object Detection
- Authors: Yadan Luo, Zhuoxiao Chen, Zhen Fang, Zheng Zhang, Zi Huang, Mahsa Baktashmotlagh
- Abstract summary: We propose a novel kernel coding rate maximization (KECOR) strategy to identify the most informative point clouds for which to acquire labels.
To accommodate both one-stage (e.g., SECOND) and two-stage detectors, we incorporate classification entropy maximization and balance detection performance against the total number of bounding boxes selected for annotation.
Our results show that approximately 44% box-level annotation costs and 26% computational time are reduced compared to the state-of-the-art method.
- Score: 48.66703222700795
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Achieving a reliable LiDAR-based object detector in autonomous driving is
paramount, but its success hinges on obtaining large amounts of precise 3D
annotations. Active learning (AL) seeks to mitigate the annotation burden
through algorithms that use fewer labels and can attain performance comparable
to fully supervised learning. Although AL has shown promise, current approaches
prioritize the selection of unlabeled point clouds with high uncertainty and/or
diversity, leading to the selection of more instances for labeling and reduced
computational efficiency. In this paper, we resort to a novel kernel coding
rate maximization (KECOR) strategy which aims to identify the most informative
point clouds to acquire labels through the lens of information theory. Greedy
search is applied to seek desired point clouds that can maximize the minimal
number of bits required to encode the latent features. To determine the
uniqueness and informativeness of the selected samples from the model
perspective, we construct a proxy network of the 3D detector head and compute
the outer product of Jacobians from all proxy layers to form the empirical
neural tangent kernel (NTK) matrix. To accommodate both one-stage (e.g.,
SECOND) and two-stage detectors (e.g., PVRCNN), we further incorporate
classification entropy maximization and strike a good trade-off between detection
performance and the total number of bounding boxes selected for annotation.
Extensive experiments conducted on two 3D benchmarks and a 2D detection dataset
evidence the superiority and versatility of the proposed approach. Our results
show that approximately 44% box-level annotation costs and 26% computational
time are reduced compared to the state-of-the-art AL method, without
compromising detection performance.
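
The greedy selection described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes the standard rate-distortion form of the coding rate, R = 1/2 log det(I + d / (n ε²) K), which the abstract does not spell out, and it builds the kernel K = J Jᵀ from a matrix `J` whose rows stand in for flattened per-sample Jacobians of a proxy head. The names `coding_rate`, `greedy_select`, `eps`, and `dim` are hypothetical, and the entropy term mentioned in the abstract is left out.

```python
import numpy as np


def coding_rate(K, eps=0.5, dim=None):
    """Coding rate 0.5 * logdet(I + dim / (n * eps^2) * K) of an n x n PSD kernel K.

    Assumption: this is the standard rate-distortion form; the exact objective
    used in the paper is not given in the abstract.
    """
    n = K.shape[0]
    if n == 0:
        return 0.0
    d = n if dim is None else dim
    scale = d / (n * eps ** 2)
    _, logdet = np.linalg.slogdet(np.eye(n) + scale * K)
    return 0.5 * logdet


def greedy_select(K, budget, eps=0.5, dim=None):
    """Greedily pick `budget` indices whose kernel sub-matrix has the largest coding rate."""
    selected = []
    remaining = list(range(K.shape[0]))
    for _ in range(budget):
        best, best_rate = None, -np.inf
        for i in remaining:
            idx = selected + [i]
            rate = coding_rate(K[np.ix_(idx, idx)], eps, dim)
            if rate > best_rate:  # strict comparison: ties keep the earliest index
                best, best_rate = i, rate
        selected.append(best)
        remaining.remove(best)
    return selected


# Hypothetical stand-in for per-sample Jacobians: samples 0 and 1 are duplicates.
J = np.array([[1.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
K = J @ J.T  # outer product of Jacobians, an empirical-NTK-style kernel
picked = greedy_select(K, budget=2, dim=2)
```

In this toy case the second pick skips the duplicate sample, because adding a redundant row barely increases the number of bits needed to encode the selected set.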
Related papers
- Dual-Perspective Knowledge Enrichment for Semi-Supervised 3D Object Detection [55.210991151015534]
We present a novel Dual-Perspective Knowledge Enrichment approach named DPKE for semi-supervised 3D object detection.
Our DPKE enriches the knowledge of limited training data, particularly unlabeled data, from two perspectives: data-perspective and feature-perspective.
arXiv Detail & Related papers (2024-01-10T08:56:07Z)
- Exploring Active 3D Object Detection from a Generalization Perspective [58.597942380989245]
Uncertainty-based active learning policies fail to balance the trade-off between point cloud informativeness and box-level annotation costs.
We propose CRB, which hierarchically filters out the point clouds of redundant 3D bounding box labels.
Experiments show that the proposed approach outperforms existing active learning strategies.
arXiv Detail & Related papers (2023-01-23T02:43:03Z)
- Semi-supervised 3D Object Detection with Proficient Teachers [114.54835359657707]
Dominant point cloud-based 3D object detectors in autonomous driving scenarios rely heavily on huge amounts of accurately labeled samples.
Pseudo-labeling is commonly used in SSL frameworks; however, low-quality predictions from the teacher model have seriously limited its performance.
We propose a new Pseudo-Labeling framework for semi-supervised 3D object detection, by enhancing the teacher model to a proficient one with several necessary designs.
arXiv Detail & Related papers (2022-07-26T04:54:03Z)
- Open-Set Semi-Supervised Learning for 3D Point Cloud Understanding [62.17020485045456]
It is commonly assumed in semi-supervised learning (SSL) that the unlabeled data are drawn from the same distribution as that of the labeled ones.
We propose to selectively utilize unlabeled data through sample weighting, so that only conducive unlabeled data would be prioritized.
arXiv Detail & Related papers (2022-05-02T16:09:17Z)
- Semi-supervised 3D Object Detection via Adaptive Pseudo-Labeling [18.209409027211404]
3D object detection is an important task in computer vision.
Most existing methods require a large number of high-quality 3D annotations, which are expensive to collect.
We propose a novel semi-supervised framework based on pseudo-labeling for outdoor 3D object detection tasks.
arXiv Detail & Related papers (2021-08-15T02:58:43Z)
- SA-Det3D: Self-Attention Based Context-Aware 3D Object Detection [9.924083358178239]
We propose two variants of self-attention for contextual modeling in 3D object detection.
We first incorporate the pairwise self-attention mechanism into the current state-of-the-art BEV, voxel and point-based detectors.
Next, we propose a self-attention variant that samples a subset of the most representative features by learning deformations over randomly sampled locations.
arXiv Detail & Related papers (2021-01-07T18:30:32Z)
- Scan-based Semantic Segmentation of LiDAR Point Clouds: An Experimental Study [2.6205925938720833]
State of the art methods use deep neural networks to predict semantic classes for each point in a LiDAR scan.
A powerful and efficient way to process LiDAR measurements is to use two-dimensional, image-like projections.
We demonstrate various techniques to boost the performance and to improve runtime as well as memory constraints.
arXiv Detail & Related papers (2020-04-06T11:08:12Z)
- EHSOD: CAM-Guided End-to-end Hybrid-Supervised Object Detection with Cascade Refinement [53.69674636044927]
We present EHSOD, an end-to-end hybrid-supervised object detection system.
It can be trained in one shot on both fully and weakly-annotated data.
It achieves comparable results on multiple object detection benchmarks with only 30% fully-annotated data.
arXiv Detail & Related papers (2020-02-18T08:04:58Z)
This list is automatically generated from the titles and abstracts of the papers in this site.