Weakly Supervised 3D Object Detection with Multi-Stage Generalization
- URL: http://arxiv.org/abs/2306.05418v2
- Date: Tue, 6 Feb 2024 11:27:57 GMT
- Title: Weakly Supervised 3D Object Detection with Multi-Stage Generalization
- Authors: Jiawei He, Yuqi Wang, Yuntao Chen, Zhaoxiang Zhang
- Abstract summary: We introduce BA$^2$-Det, encompassing pseudo label generation and multi-stage generalization.
We develop three stages of generalization: progressing from complete to partial, static to dynamic, and close to distant.
BA$^2$-Det can achieve a 20% relative improvement on the KITTI dataset.
- Score: 62.96670547848691
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: With the rapid development of large models, the need for data has become
increasingly crucial. Especially in 3D object detection, costly manual
annotations have hindered further advancements. To reduce the burden of
annotation, we study the problem of achieving 3D object detection solely based
on 2D annotations. Thanks to advanced 3D reconstruction techniques, it is now
feasible to reconstruct the overall static 3D scene. However, extracting
precise object-level annotations from the entire scene and generalizing these
limited annotations to the entire scene remain challenges. In this paper, we
introduce a novel paradigm called BA$^2$-Det, encompassing pseudo label
generation and multi-stage generalization. We devise the DoubleClustering
algorithm to obtain object clusters from reconstructed scene-level points, and
further enhance the model's detection capabilities by developing three stages
of generalization: progressing from complete to partial, static to dynamic, and
close to distant. Experiments conducted on the large-scale Waymo Open Dataset
show that the performance of BA$^2$-Det is on par with the fully-supervised
methods using 10% annotations. Additionally, using large raw videos for
pretraining, BA$^2$-Det can achieve a 20% relative improvement on the KITTI
dataset. The method also has great potential for detecting open-set 3D objects
in complex scenes. Project page: https://ba2det.site.
Related papers
- General Geometry-aware Weakly Supervised 3D Object Detection [62.26729317523975]
A unified framework is developed for learning 3D object detectors from RGB images and associated 2D boxes.
Experiments on KITTI and SUN-RGBD datasets demonstrate that our method yields surprisingly high-quality 3D bounding boxes with only 2D annotation.
arXiv Detail & Related papers (2024-07-18T17:52:08Z) - LASA: Instance Reconstruction from Real Scans using A Large-scale
Aligned Shape Annotation Dataset [17.530432165466507]
We present a novel Cross-Modal Shape Reconstruction (DisCo) method and an Occupancy-Guided 3D Object Detection (OccGOD) method.
Our methods achieve state-of-the-art performance in both instance-level scene reconstruction and 3D object detection tasks.
arXiv Detail & Related papers (2023-12-19T18:50:10Z) - DatasetNeRF: Efficient 3D-aware Data Factory with Generative Radiance Fields [68.94868475824575]
This paper introduces a novel approach capable of generating infinite, high-quality 3D-consistent 2D annotations alongside 3D point cloud segmentations.
We leverage the strong semantic prior within a 3D generative model to train a semantic decoder.
Once trained, the decoder efficiently generalizes across the latent space, enabling the generation of infinite data.
arXiv Detail & Related papers (2023-11-18T21:58:28Z) - U3DS$^3$: Unsupervised 3D Semantic Scene Segmentation [19.706172244951116]
This paper presents U3DS$^3$ as a step towards completely unsupervised point cloud segmentation for any holistic 3D scene.
The initial step of our proposed approach involves generating superpoints based on the geometric characteristics of each scene.
We then train the model with a spatial clustering-based method, followed by iterative training on pseudo-labels generated from the cluster centroids.
arXiv Detail & Related papers (2023-11-10T12:05:35Z) - Point2Seq: Detecting 3D Objects as Sequences [58.63662049729309]
We present a simple and effective framework, named Point2Seq, for 3D object detection from point clouds.
We view each 3D object as a sequence of words and reformulate the 3D object detection task as decoding words from 3D scenes in an auto-regressive manner.
arXiv Detail & Related papers (2022-03-25T00:20:31Z) - RandomRooms: Unsupervised Pre-training from Synthetic Shapes and
Randomized Layouts for 3D Object Detection [138.2892824662943]
A promising solution is to make better use of the synthetic dataset, which consists of CAD object models, to boost the learning on real datasets.
Recent work on 3D pre-training fails when transferring features learned on synthetic objects to real-world applications.
In this work, we put forward a new method called RandomRooms to accomplish this objective.
arXiv Detail & Related papers (2021-08-17T17:56:12Z) - SESS: Self-Ensembling Semi-Supervised 3D Object Detection [138.80825169240302]
We propose SESS, a self-ensembling semi-supervised 3D object detection framework. Specifically, we design a thorough perturbation scheme to enhance generalization of the network on unlabeled and new unseen data.
Our SESS achieves competitive performance compared to the state-of-the-art fully-supervised method by using only 50% labeled data.
arXiv Detail & Related papers (2019-12-26T08:48:04Z)