Pre-Training LiDAR-Based 3D Object Detectors Through Colorization
- URL: http://arxiv.org/abs/2310.14592v2
- Date: Sun, 25 Feb 2024 21:56:37 GMT
- Title: Pre-Training LiDAR-Based 3D Object Detectors Through Colorization
- Authors: Tai-Yu Pan, Chenyang Ma, Tianle Chen, Cheng Perng Phoo, Katie Z Luo,
Yurong You, Mark Campbell, Kilian Q. Weinberger, Bharath Hariharan, and
Wei-Lun Chao
- Abstract summary: We introduce an innovative pre-training approach, Grounded Point Colorization (GPC), to bridge the gap between data and labels.
GPC teaches the model to colorize LiDAR point clouds, equipping it with valuable semantic cues.
Experimental results on the KITTI and Waymo datasets demonstrate GPC's remarkable effectiveness.
- Score: 65.03659880456048
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Accurate 3D object detection and understanding for self-driving cars
rely heavily on LiDAR point clouds, which require large amounts of labeled data
for training. In this work, we introduce an innovative pre-training approach, Grounded
Point Colorization (GPC), to bridge the gap between data and labels by teaching
the model to colorize LiDAR point clouds, equipping it with valuable semantic
cues. To tackle challenges arising from color variations and selection bias, we
incorporate color as "context" by providing ground-truth colors as hints during
colorization. Experimental results on the KITTI and Waymo datasets demonstrate
GPC's remarkable effectiveness. Even with limited labeled data, GPC
significantly improves fine-tuning performance; notably, on just 20% of the
KITTI dataset, GPC outperforms training from scratch with the entire dataset.
In sum, we introduce a fresh perspective on pre-training for 3D object
detection, aligning the objective with the model's intended role and ultimately
advancing the accuracy and efficiency of 3D object detection for autonomous
vehicles.
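As a concrete illustration, here is a minimal PyTorch sketch of the colorization pretext task described above: ground-truth colors for a random subset of points are fed back in as hints (the "context"), and the model must colorize the remaining points. The `PointBackbone` interface, hint ratio, and MSE objective are illustrative assumptions, not the authors' exact implementation.

```python
# Hedged sketch of Grounded Point Colorization (GPC) pre-training:
# the backbone predicts each LiDAR point's color, with a random subset
# of ground-truth colors revealed as hints. `backbone` is a placeholder
# for any point-wise feature extractor mapping (B, N, 3+3) -> (B, N, C).
import torch
import torch.nn as nn

class GPCPretrainer(nn.Module):
    def __init__(self, backbone: nn.Module, feat_dim: int = 64):
        super().__init__()
        self.backbone = backbone
        self.color_head = nn.Linear(feat_dim, 3)  # per-point RGB prediction

    def forward(self, xyz, rgb, hint_ratio: float = 0.5):
        # Randomly choose points whose true color is revealed as a hint.
        hint_mask = torch.rand(xyz.shape[:2], device=xyz.device) < hint_ratio
        hints = torch.where(hint_mask.unsqueeze(-1), rgb, torch.zeros_like(rgb))
        feats = self.backbone(torch.cat([xyz, hints], dim=-1))
        pred = self.color_head(feats)
        # Supervise only the points whose color was hidden.
        return nn.functional.mse_loss(pred[~hint_mask], rgb[~hint_mask])
```

After pre-training, the color head would be discarded and the backbone fine-tuned for detection, which is how the abstract's fine-tuning gains would be realized.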
Related papers
- On Deep Learning for Geometric and Semantic Scene Understanding Using On-Vehicle 3D LiDAR [4.606106768645647]
3D LiDAR point cloud data is crucial for scene perception in computer vision, robotics, and autonomous driving.
We present DurLAR, the first high-fidelity 128-channel 3D LiDAR dataset featuring panoramic ambient (near infrared) and reflectivity imagery.
To improve the segmentation accuracy, we introduce Range-Aware Pointwise Distance Distribution (RAPiD) features and the associated RAPiD-Seg architecture.
arXiv Detail & Related papers (2024-11-01T14:01:54Z)
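Before the next entry, a hedged sketch of the range-aware distance-distribution idea named above (RAPiD): per-point histograms of neighbor distances, normalized by each point's range from the sensor. The neighborhood size, bin count, and normalization are assumptions for illustration, not the paper's exact feature definition.

```python
# Illustrative range-aware pointwise distance-distribution feature.
import torch

def rapid_like_features(xyz: torch.Tensor, k: int = 16, bins: int = 8):
    # xyz: (N, 3) points in the sensor frame, N > k.
    d = torch.cdist(xyz, xyz)                         # (N, N) pairwise distances
    knn = d.topk(k + 1, largest=False).values[:, 1:]  # drop the self-distance
    rng = xyz.norm(dim=1, keepdim=True)               # range of each point
    knn = knn / (rng + 1e-6)                          # range-aware normalization
    # Binned distance distribution per point (hard assignment).
    edges = torch.linspace(0, knn.max().item(), bins + 1, device=xyz.device)
    hist = torch.stack([((knn >= lo) & (knn < hi)).float().sum(dim=1)
                        for lo, hi in zip(edges[:-1], edges[1:])], dim=1)
    return hist / k                                   # (N, bins) histogram
```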
- Study of Dropout in PointPillars with 3D Object Detection [0.0]
3D object detection is critical for autonomous driving, leveraging deep learning techniques to interpret LiDAR data.
This study analyzes how the PointPillars model performs under various dropout rates.
arXiv Detail & Related papers (2024-09-01T09:30:54Z)
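A minimal sketch of the kind of experiment the dropout entry above describes: the same detection head built with different dropout rates. The two-layer head is a stand-in for PointPillars' actual SSD head, not its real architecture.

```python
# Illustrative dropout-rate sweep over a toy detection head.
import torch.nn as nn

def make_head(in_ch: int, num_anchors: int, p: float) -> nn.Sequential:
    return nn.Sequential(
        nn.Conv2d(in_ch, in_ch, 3, padding=1),
        nn.ReLU(inplace=True),
        nn.Dropout2d(p),                       # the rate under study
        nn.Conv2d(in_ch, num_anchors * 7, 1),  # 7 box parameters per anchor
    )

heads = {p: make_head(384, 2, p) for p in (0.0, 0.1, 0.3, 0.5)}
```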
- Shelf-Supervised Cross-Modal Pre-Training for 3D Object Detection [52.66283064389691]
State-of-the-art 3D object detectors are often trained on massive labeled datasets.
Recent works demonstrate that self-supervised pre-training with unlabeled data can improve detection accuracy with limited labels.
We propose a shelf-supervised approach for generating zero-shot 3D bounding boxes from paired RGB and LiDAR data.
arXiv Detail & Related papers (2024-06-14T15:21:57Z)
- SPOT: Scalable 3D Pre-training via Occupancy Prediction for Learning Transferable 3D Representations [76.45009891152178]
The pretraining-finetuning approach alleviates the labeling burden: a backbone pre-trained once can be fine-tuned across various downstream datasets and tasks.
We show, for the first time, that general representation learning can be achieved through the task of occupancy prediction.
Our findings will facilitate the understanding of LiDAR points and pave the way for future advancements in LiDAR pre-training.
arXiv Detail & Related papers (2023-09-19T11:13:01Z)
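A hedged sketch of occupancy prediction as a pre-training objective, in the spirit of the SPOT entry above: a voxel backbone plus a 1x1x1 classifier trained with binary cross-entropy to mark each voxel occupied or free. All module names, shapes, and the binary label scheme are illustrative assumptions.

```python
# Illustrative occupancy-prediction pre-training head.
import torch
import torch.nn as nn

class OccupancyPretrainer(nn.Module):
    def __init__(self, backbone: nn.Module, feat_dim: int = 128):
        super().__init__()
        self.backbone = backbone                     # -> (B, C, Z, Y, X) features
        self.occ_head = nn.Conv3d(feat_dim, 1, kernel_size=1)

    def forward(self, voxels, occ_target):
        # occ_target: (B, Z, Y, X) float tensor of 0/1 occupancy labels.
        logits = self.occ_head(self.backbone(voxels)).squeeze(1)
        return nn.functional.binary_cross_entropy_with_logits(logits, occ_target)
```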
- Point-GCC: Universal Self-supervised 3D Scene Pre-training via Geometry-Color Contrast [9.14535402695962]
Geometry and color information provided by point clouds are crucial for 3D scene understanding.
We propose a universal 3D scene pre-training framework via Geometry-Color Contrast (Point-GCC).
Point-GCC aligns geometry and color information using a Siamese network.
arXiv Detail & Related papers (2023-05-31T07:44:03Z)
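A minimal sketch of the geometry-color alignment in the Point-GCC entry above: two Siamese branches embed the same points' geometry and color, and a symmetric InfoNCE loss pulls matching points together. The encoders, temperature, and pointwise matching are illustrative assumptions, not the paper's exact objective.

```python
# Illustrative pointwise geometry-color contrastive loss.
import torch
import torch.nn.functional as F

def geometry_color_contrast(geo_feat, col_feat, tau: float = 0.07):
    # geo_feat, col_feat: (N, D) per-point embeddings from the two branches.
    g = F.normalize(geo_feat, dim=1)
    c = F.normalize(col_feat, dim=1)
    logits = g @ c.t() / tau                    # (N, N) similarity matrix
    target = torch.arange(g.shape[0], device=g.device)
    # Point i's geometry should match point i's color, and vice versa.
    return 0.5 * (F.cross_entropy(logits, target) +
                  F.cross_entropy(logits.t(), target))
```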
- View-to-Label: Multi-View Consistency for Self-Supervised 3D Object Detection [46.077668660248534]
We propose a novel approach that self-supervises 3D object detection from RGB sequences alone.
Our experiments on KITTI 3D dataset demonstrate performance on par with state-of-the-art self-supervised methods.
arXiv Detail & Related papers (2023-05-29T09:30:39Z)
- Pattern-Aware Data Augmentation for LiDAR 3D Object Detection [7.394029879643516]
We propose pattern-aware ground truth sampling, a data augmentation technique that downsamples an object's point cloud based on the LiDAR's characteristics.
We improve the performance of PV-RCNN on the car class by more than 0.7 percent on the KITTI validation split at distances greater than 25 m.
arXiv Detail & Related papers (2021-11-30T19:14:47Z)
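A hedged sketch of the pattern-aware sampling idea above: when a ground-truth object is pasted at a farther range during augmentation, its points are subsampled toward the density a scanning LiDAR would plausibly record there. The 1/d^2 density model follows from fixed angular resolution but is a simplifying assumption, not the paper's exact procedure.

```python
# Illustrative range-dependent downsampling for ground-truth sampling.
import torch

def resample_for_range(pts: torch.Tensor, d_src: float, d_dst: float):
    # pts: (N, 3) points of one object originally observed at distance d_src.
    keep_frac = min(1.0, (d_src / d_dst) ** 2)   # angular-resolution argument
    n_keep = max(1, int(pts.shape[0] * keep_frac))
    idx = torch.randperm(pts.shape[0], device=pts.device)[:n_keep]
    return pts[idx]
```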
- Learnable Online Graph Representations for 3D Multi-Object Tracking [156.58876381318402]
We propose a unified, learning-based approach to the 3D MOT problem.
We employ a fully trainable Neural Message Passing network for data association.
We show the merit of the proposed approach on the publicly available nuScenes dataset by achieving state-of-the-art performance of 65.6% AMOTA and 58% fewer ID-switches.
arXiv Detail & Related papers (2021-04-23T17:59:28Z)
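A minimal sketch of learned data association in the spirit of the tracking entry above: detections and tracks form a bipartite graph, and one round of neural message passing refines edge embeddings into association logits. All dimensions and module names are illustrative; the paper's network is more elaborate.

```python
# Illustrative single round of edge-centric message passing.
import torch
import torch.nn as nn

class EdgeMessagePassing(nn.Module):
    def __init__(self, node_dim: int = 64, edge_dim: int = 32):
        super().__init__()
        self.edge_mlp = nn.Sequential(
            nn.Linear(2 * node_dim + edge_dim, edge_dim), nn.ReLU(),
            nn.Linear(edge_dim, edge_dim))
        self.score = nn.Linear(edge_dim, 1)

    def forward(self, det, trk, edge):
        # det: (D, node_dim), trk: (T, node_dim), edge: (D, T, edge_dim)
        D, T = det.shape[0], trk.shape[0]
        pair = torch.cat([det[:, None].expand(D, T, -1),
                          trk[None, :].expand(D, T, -1), edge], dim=-1)
        edge = self.edge_mlp(pair)               # updated edge embeddings
        return self.score(edge).squeeze(-1)      # (D, T) association logits
```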
- ST3D: Self-training for Unsupervised Domain Adaptation on 3D Object Detection [78.71826145162092]
We present a new domain adaptive self-training pipeline, named ST3D, for unsupervised domain adaptation on 3D object detection from point clouds.
Our ST3D achieves state-of-the-art performance on all evaluated datasets and even surpasses fully supervised results on the KITTI 3D object detection benchmark.
arXiv Detail & Related papers (2021-03-09T10:51:24Z)
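A hedged sketch of the self-training loop behind the ST3D entry above: a source-trained detector pseudo-labels target-domain scans, only confident boxes are kept, and the detector is re-trained on them. The detector interface, threshold, and trainer are assumptions; the actual pipeline iterates with additional refinements.

```python
# Illustrative pseudo-labeling step for self-training domain adaptation.
import torch

@torch.no_grad()
def pseudo_label(detector, target_scans, score_thr: float = 0.7):
    # Label unlabeled target-domain scans with the current detector,
    # keeping only confident boxes as pseudo ground truth.
    detector.eval()
    labels = []
    for scan in target_scans:
        boxes, scores = detector(scan)   # assumed (M, 7) boxes, (M,) scores
        labels.append(boxes[scores > score_thr])
    return labels

# One self-training round (the real pipeline iterates and refines):
#   labels = pseudo_label(model, target_scans)
#   fine_tune(model, target_scans, labels)   # hypothetical trainer
```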
- SESS: Self-Ensembling Semi-Supervised 3D Object Detection [138.80825169240302]
We propose SESS, a self-ensembling semi-supervised 3D object detection framework.
Specifically, we design a thorough perturbation scheme to enhance the generalization of the network on unlabeled and unseen data.
Our SESS achieves competitive performance compared to the state-of-the-art fully-supervised method using only 50% of the labeled data.
arXiv Detail & Related papers (2019-12-26T08:48:04Z)
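Finally, a minimal sketch of the self-ensembling scheme in the SESS entry above: an EMA "teacher" labels a perturbed copy of an unlabeled scan and the student is trained toward consistency with it. The toy perturbation and MSE consistency term are stand-ins for the paper's detailed scheme.

```python
# Illustrative mean-teacher consistency training.
import copy
import torch

def ema_update(teacher, student, m: float = 0.99):
    # Exponential moving average of student weights into the teacher.
    for pt, ps in zip(teacher.parameters(), student.parameters()):
        pt.data.mul_(m).add_(ps.data, alpha=1 - m)

def consistency_step(student, teacher, scan: torch.Tensor):
    perturbed = scan + 0.01 * torch.randn_like(scan)   # toy perturbation
    with torch.no_grad():
        target = teacher(perturbed)                    # teacher's prediction
    return torch.nn.functional.mse_loss(student(perturbed), target)

# Illustrative wiring:
#   teacher = copy.deepcopy(student)
#   call ema_update(teacher, student) after each optimizer step
```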
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the content (including all information) and is not responsible for any consequences.