AD-PT: Autonomous Driving Pre-Training with Large-scale Point Cloud
Dataset
- URL: http://arxiv.org/abs/2306.00612v3
- Date: Thu, 26 Oct 2023 15:20:31 GMT
- Title: AD-PT: Autonomous Driving Pre-Training with Large-scale Point Cloud
Dataset
- Authors: Jiakang Yuan, Bo Zhang, Xiangchao Yan, Tao Chen, Botian Shi, Yikang
Li, Yu Qiao
- Abstract summary: It is a long-term vision for the Autonomous Driving (AD) community that the perception models can learn from a large-scale point cloud dataset.
We formulate the point-cloud pre-training task as a semi-supervised problem, which leverages the few-shot labeled and massive unlabeled point-cloud data.
We achieve significant performance gains on a series of downstream perception benchmarks including Waymo, nuScenes, and KITTI, under different baseline models.
- Score: 25.935496432142976
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: It is a long-term vision for the Autonomous Driving (AD) community that the
perception models can learn from a large-scale point cloud dataset, to obtain
unified representations that can achieve promising results on different tasks
or benchmarks. Previous works mainly focus on the self-supervised pre-training
pipeline, meaning that they perform the pre-training and fine-tuning on the
same benchmark, which makes it difficult to attain performance scalability and
cross-dataset application for the pre-training checkpoint. In this paper, for
the first time, we are committed to building a large-scale pre-training
point-cloud dataset with diverse data distribution, and meanwhile learning
generalizable representations from such a diverse pre-training dataset. We
formulate the point-cloud pre-training task as a semi-supervised problem, which
leverages the few-shot labeled and massive unlabeled point-cloud data to
generate the unified backbone representations that can be directly applied to
many baseline models and benchmarks, decoupling the AD-related pre-training
process and downstream fine-tuning task. During the period of backbone
pre-training, by enhancing the scene- and instance-level distribution diversity
and exploiting the backbone's ability to learn from unknown instances, we
achieve significant performance gains on a series of downstream perception
benchmarks including Waymo, nuScenes, and KITTI, under different baseline
models like PV-RCNN++, SECOND, CenterPoint.
Related papers
- Point Cloud Pre-training with Diffusion Models [62.12279263217138]
We propose a novel pre-training method called Point cloud Diffusion pre-training (PointDif).
PointDif achieves substantial improvement across various real-world datasets for diverse downstream tasks such as classification, segmentation and detection.
arXiv Detail & Related papers (2023-11-25T08:10:05Z)
- Pushing the Limits of Pre-training for Time Series Forecasting in the CloudOps Domain [54.67888148566323]
We introduce three large-scale time series forecasting datasets from the cloud operations domain.
We show it is a strong zero-shot baseline and benefits from further scaling, both in model and dataset size.
Accompanying these datasets and results is a suite of comprehensive benchmark results comparing classical and deep learning baselines to our pre-trained method.
arXiv Detail & Related papers (2023-10-08T08:09:51Z)
- In-Domain Self-Supervised Learning Improves Remote Sensing Image Scene Classification [5.323049242720532]
Self-supervised learning has emerged as a promising approach for remote sensing image classification.
We present a study of different self-supervised pre-training strategies and evaluate their effect across 14 downstream datasets.
arXiv Detail & Related papers (2023-07-04T10:57:52Z)
- SEPT: Towards Scalable and Efficient Visual Pre-Training [11.345844145289524]
Self-supervised pre-training has shown great potential in leveraging large-scale unlabeled data to improve downstream task performance.
We build a task-specific self-supervised pre-training framework based on a simple hypothesis that pre-training on the unlabeled samples with similar distribution to the target task can bring substantial performance gains.
arXiv Detail & Related papers (2022-12-11T11:02:11Z)
- Self-Distillation for Further Pre-training of Transformers [83.84227016847096]
We propose self-distillation as a regularization for a further pre-training stage.
We empirically validate the efficacy of self-distillation on a variety of benchmark datasets for image and text classification tasks.
arXiv Detail & Related papers (2022-09-30T02:25:12Z)
- Beyond Transfer Learning: Co-finetuning for Action Localisation [64.07196901012153]
We propose co-finetuning -- simultaneously training a single model on multiple "upstream" and "downstream" tasks.
We demonstrate that co-finetuning outperforms traditional transfer learning when using the same total amount of data.
We also show how we can easily extend our approach to multiple "upstream" datasets to further improve performance.
arXiv Detail & Related papers (2022-07-08T10:25:47Z)
- Self-Supervised Pre-Training for Transformer-Based Person Re-Identification [54.55281692768765]
Transformer-based supervised pre-training achieves great performance in person re-identification (ReID).
Due to the domain gap between ImageNet and ReID datasets, it usually needs a larger pre-training dataset to boost the performance.
This work aims to mitigate the gap between the pre-training and ReID datasets from the perspective of data and model structure.
arXiv Detail & Related papers (2021-11-23T18:59:08Z)
- Omni-supervised Facial Expression Recognition via Distilled Data [120.11782405714234]
We propose omni-supervised learning to exploit reliable samples in a large amount of unlabeled data for network training.
We experimentally verify that the new dataset can significantly improve the ability of the learned FER model.
To tackle this, we propose to apply a dataset distillation strategy to compress the created dataset into several informative class-wise images.
arXiv Detail & Related papers (2020-05-18T09:36:51Z)
This list is automatically generated from the titles and abstracts of the papers on this site.