Weakly Supervised Monocular 3D Object Detection using Multi-View
Projection and Direction Consistency
- URL: http://arxiv.org/abs/2303.08686v1
- Date: Wed, 15 Mar 2023 15:14:00 GMT
- Title: Weakly Supervised Monocular 3D Object Detection using Multi-View
Projection and Direction Consistency
- Authors: Runzhou Tao, Wencheng Han, Zhongying Qiu, Cheng-zhong Xu and Jianbing
Shen
- Abstract summary: Monocular 3D object detection has become a mainstream approach in autonomous driving because it is easy to deploy.
Most current methods still rely on 3D point cloud data for labeling the ground truths used in the training phase.
We propose a new weakly supervised monocular 3D object detection method, which can train the model with only 2D labels marked on images.
- Score: 78.76508318592552
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Monocular 3D object detection has become a mainstream approach in
autonomous driving because it is easy to deploy. A prominent advantage is that
it does not need LiDAR point clouds during inference. However, most current
methods still rely on 3D point cloud data for labeling the ground truths used
in the training phase. This inconsistency between training and inference makes
it hard to utilize large-scale feedback data and increases data collection
expenses. To bridge this gap, we propose a new weakly supervised monocular 3D
object detection method that can train the model with only 2D labels marked on
images. Specifically, we explore three types of consistency in this task,
i.e., projection, multi-view, and direction consistency, and design a
weakly supervised architecture based on these consistencies. Moreover, we
propose a new 2D direction labeling method for this task to guide the model
toward accurate rotation direction prediction. Experiments show that our
weakly supervised method achieves performance comparable to some fully
supervised methods. When used as a pre-training method, our model can
significantly outperform the corresponding fully supervised baseline with only
one third of the 3D labels. https://github.com/weakmono3d/weakmono3d
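The abstract does not give implementation details, but the projection-consistency idea can be illustrated with a short sketch: the eight corners of a predicted 3D box are projected through the camera intrinsics, and the tight 2D box around those projections is compared against the annotated 2D box. The following is a minimal, hypothetical PyTorch-style sketch, assuming a pinhole camera and KITTI-style boxes parameterized as (x, y, z, w, h, l, yaw); all function and variable names are illustrative and are not taken from the authors' repository.

```python
# Minimal, hypothetical sketch of a projection-consistency term (not the
# authors' code): project the 8 corners of a predicted 3D box into the image
# with the camera intrinsics and compare the tight 2D box around them with
# the annotated 2D box.
import torch


def box3d_corners(box):
    """8 corners (8, 3) of a KITTI-style box (x, y, z, w, h, l, yaw) in camera coords."""
    x, y, z, w, h, l, yaw = box
    # Corner offsets in the object frame (y is the vertical axis, pointing down).
    dx = torch.tensor([1., 1., -1., -1., 1., 1., -1., -1.]) * (l / 2)
    dy = torch.tensor([0., 0., 0., 0., -1., -1., -1., -1.]) * h
    dz = torch.tensor([1., -1., -1., 1., 1., -1., -1., 1.]) * (w / 2)
    # Rotate around the vertical axis by the yaw angle, then translate.
    cos_y, sin_y = torch.cos(yaw), torch.sin(yaw)
    cx = cos_y * dx + sin_y * dz
    cz = -sin_y * dx + cos_y * dz
    return torch.stack([cx + x, dy + y, cz + z], dim=1)  # (8, 3)


def project_to_image(points, K):
    """Project 3D points (N, 3) to pixels (N, 2) with pinhole intrinsics K (3, 3)."""
    uvw = points @ K.T
    return uvw[:, :2] / uvw[:, 2:3]  # perspective division


def projection_consistency_loss(pred_box3d, gt_box2d, K):
    """L1 distance between the annotated 2D box (x1, y1, x2, y2) and the
    tight 2D box around the projected corners of the predicted 3D box."""
    uv = project_to_image(box3d_corners(pred_box3d), K)
    proj_box = torch.cat([uv.min(dim=0).values, uv.max(dim=0).values])
    return torch.abs(proj_box - gt_box2d).mean()
```

Because every step is differentiable, gradients from the 2D label reach the 3D box parameters directly, for example:

```python
K = torch.tensor([[721.5, 0., 609.6], [0., 721.5, 172.9], [0., 0., 1.]])  # illustrative intrinsics
pred = torch.tensor([2.0, 1.5, 20.0, 1.6, 1.5, 3.9, 0.1], requires_grad=True)  # (x, y, z, w, h, l, yaw)
gt2d = torch.tensor([520., 150., 650., 230.])  # annotated 2D box (x1, y1, x2, y2)
loss = projection_consistency_loss(pred, gt2d, K)
loss.backward()  # gradients flow back to the predicted 3D box
```

The multi-view and direction consistencies would contribute additional terms, e.g. agreement between boxes predicted from adjacent views and between the predicted yaw and the 2D direction label, but their exact formulations are given in the paper and are not reproduced here.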
Related papers
- ALPI: Auto-Labeller with Proxy Injection for 3D Object Detection using 2D Labels Only [5.699475977818167]
3D object detection plays a crucial role in various applications such as autonomous vehicles, robotics and augmented reality.
We propose a weakly supervised 3D annotator that relies solely on 2D bounding box annotations from images, along with size priors.
arXiv Detail & Related papers (2024-07-24T11:58:31Z)
- Approaching Outside: Scaling Unsupervised 3D Object Detection from 2D Scene [22.297964850282177]
We propose LiDAR-2D Self-paced Learning (LiSe) for unsupervised 3D detection.
RGB images serve as a valuable complement to LiDAR data, offering precise 2D localization cues.
Our framework devises a self-paced learning pipeline that incorporates adaptive sampling and weak model aggregation strategies.
arXiv Detail & Related papers (2024-07-11T14:58:49Z)
- Weakly Supervised 3D Object Detection via Multi-Level Visual Guidance [72.6809373191638]
We propose a framework to study how to leverage constraints between 2D and 3D domains without requiring any 3D labels.
First, we design a feature-level constraint to align LiDAR and image features based on object-aware regions.
Second, an output-level constraint is developed to enforce the overlap between 2D and projected 3D box estimations.
Third, a training-level constraint is applied by producing accurate and consistent 3D pseudo-labels that align with the visual data.
arXiv Detail & Related papers (2023-12-12T18:57:25Z)
- MonoSKD: General Distillation Framework for Monocular 3D Object Detection via Spearman Correlation Coefficient [11.48914285491747]
Existing monocular 3D detection knowledge distillation methods usually project the LiDAR point cloud onto the image plane and train the teacher network accordingly.
We propose MonoSKD, a novel Knowledge Distillation framework for Monocular 3D detection based on Spearman correlation coefficient.
Our framework achieves state-of-the-art performance as of submission, with no additional inference cost.
arXiv Detail & Related papers (2023-10-17T14:48:02Z)
- An Empirical Study of Pseudo-Labeling for Image-based 3D Object Detection [72.30883544352918]
We investigate whether pseudo-labels can provide effective supervision for the baseline models under varying settings.
We achieve 20.23 AP on the moderate level of the KITTI-3D test set without bells and whistles, improving the baseline model by 6.03 AP.
We hope this work can provide insights for the image-based 3D detection community under a semi-supervised setting.
arXiv Detail & Related papers (2022-08-15T12:17:46Z)
- Homography Loss for Monocular 3D Object Detection [54.04870007473932]
A differentiable loss function, termed the Homography Loss, is proposed, which exploits both 2D and 3D information.
Our method outperforms the other state-of-the-art approaches by a large margin on the KITTI 3D dataset.
arXiv Detail & Related papers (2022-04-02T03:48:03Z)
- MonoDistill: Learning Spatial Features for Monocular 3D Object Detection [80.74622486604886]
We propose a simple and effective scheme to introduce the spatial information from LiDAR signals to the monocular 3D detectors.
We use the resulting data to train a 3D detector with the same architecture as the baseline model.
Experimental results show that the proposed method can significantly boost the performance of the baseline model.
arXiv Detail & Related papers (2022-01-26T09:21:41Z)
- Is Pseudo-Lidar needed for Monocular 3D Object detection? [32.772699246216774]
We propose an end-to-end, single stage, monocular 3D object detector, DD3D, that can benefit from depth pre-training like pseudo-lidar methods, but without their limitations.
Our architecture is designed for effective information transfer between depth estimation and 3D detection, allowing us to scale with the amount of unlabeled pre-training data.
arXiv Detail & Related papers (2021-08-13T22:22:51Z)
- SESS: Self-Ensembling Semi-Supervised 3D Object Detection [138.80825169240302]
We propose SESS, a self-ensembling semi-supervised 3D object detection framework. Specifically, we design a thorough perturbation scheme to enhance generalization of the network on unlabeled and new unseen data.
Our SESS achieves competitive performance compared to the state-of-the-art fully-supervised method using only 50% of the labeled data.
arXiv Detail & Related papers (2019-12-26T08:48:04Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.