DQS3D: Densely-matched Quantization-aware Semi-supervised 3D Detection
- URL: http://arxiv.org/abs/2304.13031v2
- Date: Fri, 11 Aug 2023 09:46:37 GMT
- Title: DQS3D: Densely-matched Quantization-aware Semi-supervised 3D Detection
- Authors: Huan-ang Gao, Beiwen Tian, Pengfei Li, Hao Zhao, Guyue Zhou
- Abstract summary: We study the problem of semi-supervised 3D object detection, which is of great importance considering the high annotation cost for cluttered 3D indoor scenes.
We resort to the robust and principled framework of self-teaching, which has triggered notable progress for semi-supervised learning recently.
We propose the first semi-supervised 3D detection algorithm that works in a single-stage manner and allows spatially dense training signals.
- Score: 6.096961718434965
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: In this paper, we study the problem of semi-supervised 3D object detection,
which is of great importance considering the high annotation cost for cluttered
3D indoor scenes. We resort to the robust and principled framework of
self-teaching, which has triggered notable progress for semi-supervised learning
recently. While this paradigm is natural for image-level or pixel-level
prediction, adapting it to the detection problem is challenged by the issue of
proposal matching. Prior methods are based upon two-stage pipelines, matching
heuristically selected proposals generated in the first stage and resulting in
spatially sparse training signals. In contrast, we propose the first
semi-supervised 3D detection algorithm that works in a single-stage manner and
allows spatially dense training signals. A fundamental issue of this new design
is the quantization error caused by point-to-voxel discretization, which
inevitably leads to misalignment between two transformed views in the voxel
domain. To this end, we derive and implement closed-form rules that compensate
for this misalignment on the fly. Our results are significant, e.g., promoting
ScanNet mAP@0.5 from 35.2% to 48.5% using 20% annotation. Codes and data will
be publicly available.
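To make the quantization issue concrete, here is a minimal NumPy sketch (not the authors' implementation; all values are illustrative) of why voxelizing a scene and a translated copy of it independently leaves a sub-voxel misalignment:

```python
import numpy as np

def voxelize(points, voxel_size):
    # Point-to-voxel discretization; floor() is the quantization step.
    return np.floor(points / voxel_size).astype(np.int64)

rng = np.random.default_rng(0)
points = rng.uniform(0.0, 5.0, size=(4, 3))  # hypothetical scene points
shift = np.array([0.13, -0.27, 0.08])        # hypothetical augmentation translation
voxel_size = 0.05

v_orig = voxelize(points, voxel_size)
v_shifted = voxelize(points + shift, voxel_size)

# Naively one might expect v_shifted == v_orig + shift / voxel_size, but the
# shift is generally not an integer number of voxels, so a sub-voxel residual
# remains between the two voxelized views; dense matching must account for it.
residual = (v_orig + shift / voxel_size) - v_shifted
print(residual)  # each entry lies strictly inside (-1, 1)
```

The residual is exactly the kind of misalignment that a closed-form compensation rule can remove, since it is fully determined by the transform and the voxel size.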
Related papers
- Infinite 3D Landmarks: Improving Continuous 2D Facial Landmark Detection [9.633565294243173]
We show how a combination of specific architectural modifications can improve the accuracy and temporal stability of landmark detectors.
We analyze the use of a spatial transformer network that is trained alongside the landmark detector in an unsupervised manner.
We show that modifying the output head of the landmark predictor to infer landmarks in a canonical 3D space can further improve accuracy.
arXiv Detail & Related papers (2024-05-30T14:54:26Z)
- OriCon3D: Effective 3D Object Detection using Orientation and Confidence [0.0]
We propose an advanced methodology for the detection of 3D objects from a single image.
We use a deep convolutional neural network-based 3D object weighted orientation regression paradigm.
Our approach significantly improves the accuracy of 3D object pose determination, surpassing baseline methodologies.
arXiv Detail & Related papers (2023-04-27T19:52:47Z)
- SSDA3D: Semi-supervised Domain Adaptation for 3D Object Detection from Point Cloud [125.9472454212909]
We present a novel Semi-Supervised Domain Adaptation method for 3D object detection (SSDA3D)
SSDA3D includes an Inter-domain Adaptation stage and an Intra-domain Generalization stage.
Experiments show that, with only 10% labeled target data, our SSDA3D can surpass the fully-supervised oracle model with 100% target label.
arXiv Detail & Related papers (2022-12-06T09:32:44Z)
- VIBUS: Data-efficient 3D Scene Parsing with VIewpoint Bottleneck and Uncertainty-Spectrum Modeling [2.0624279915507047]
Training 3D scene parsing models with sparse supervision is an intriguing alternative.
We term this task as data-efficient 3D scene parsing.
We propose an effective two-stage framework named VIBUS to resolve it.
arXiv Detail & Related papers (2022-10-20T17:59:57Z)
- Delving into Localization Errors for Monocular 3D Object Detection [85.77319416168362]
Estimating 3D bounding boxes from monocular images is an essential component in autonomous driving.
In this work, we quantify the impact introduced by each sub-task and find that localization error is the vital factor restricting monocular 3D detection.
arXiv Detail & Related papers (2021-03-30T10:38:01Z)
- ST3D: Self-training for Unsupervised Domain Adaptation on 3D Object Detection [78.71826145162092]
We present a new domain adaptive self-training pipeline, named ST3D, for unsupervised domain adaptation on 3D object detection from point clouds.
Our ST3D achieves state-of-the-art performance on all evaluated datasets and even surpasses fully supervised results on KITTI 3D object detection benchmark.
arXiv Detail & Related papers (2021-03-09T10:51:24Z)
- 3DIoUMatch: Leveraging IoU Prediction for Semi-Supervised 3D Object Detection [76.42897462051067]
3DIoUMatch is a novel semi-supervised method for 3D object detection applicable to both indoor and outdoor scenes.
We leverage a teacher-student mutual learning framework to propagate information from the labeled to the unlabeled train set in the form of pseudo-labels.
Our method consistently improves state-of-the-art methods on both ScanNet and SUN-RGBD benchmarks by significant margins under all label ratios.
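Such teacher-student mutual learning is commonly implemented as an exponential moving average (EMA) of the student's weights; a minimal sketch with illustrative names and momentum value (not taken from the paper):

```python
import numpy as np

def ema_update(teacher, student, momentum=0.99):
    # Teacher parameters track an exponential moving average of the student's;
    # the teacher then produces the pseudo-labels for the unlabeled set.
    return {k: momentum * teacher[k] + (1.0 - momentum) * student[k]
            for k in teacher}

student = {"w": np.ones(3)}   # hypothetical student weights after some training
teacher = {"w": np.zeros(3)}  # teacher starts from a different state
for _ in range(100):
    teacher = ema_update(teacher, student)
# After 100 updates the teacher has moved 1 - 0.99**100 of the way to the student.
print(teacher["w"])
```

The high momentum makes the teacher a slowly varying, temporally smoothed copy of the student, which stabilizes the pseudo-labels it emits.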
arXiv Detail & Related papers (2020-12-08T11:06:26Z)
- Unsupervised Object Detection with LiDAR Clues [70.73881791310495]
We present the first practical method for unsupervised object detection with the aid of LiDAR clues.
In our approach, candidate object segments based on 3D point clouds are firstly generated.
Then, an iterative segment labeling process is conducted to assign segment labels and to train a segment labeling network.
The labeling process is carefully designed so as to mitigate the issue of long-tailed and open-ended distribution.
arXiv Detail & Related papers (2020-11-25T18:59:54Z)
- Reinforced Axial Refinement Network for Monocular 3D Object Detection [160.34246529816085]
Monocular 3D object detection aims to extract the 3D position and properties of objects from a 2D input image.
Conventional approaches sample 3D bounding boxes from the space and infer the relationship between the target object and each of them; however, the probability of effective samples is relatively small in the 3D space.
We propose to start with an initial prediction and refine it gradually towards the ground truth, with only one 3D parameter changed in each step.
This requires designing a policy which gets a reward after several steps, and thus we adopt reinforcement learning to optimize it.
arXiv Detail & Related papers (2020-08-31T17:10:48Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information it contains and is not responsible for any consequences of its use.