DINO Pre-training for Vision-based End-to-end Autonomous Driving
- URL: http://arxiv.org/abs/2407.10803v1
- Date: Mon, 15 Jul 2024 15:18:57 GMT
- Title: DINO Pre-training for Vision-based End-to-end Autonomous Driving
- Authors: Shubham Juneja, Povilas Daniušis, Virginijus Marcinkevičius
- Abstract summary: We propose pre-training the visual encoder of a driving agent using the self-distillation with no labels (DINO) method.
Our experiments in the CARLA environment, conducted in accordance with the Leaderboard benchmark, reveal that the proposed pre-training is more efficient than classification-based pre-training.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In this article, we focus on the pre-training of visual autonomous driving agents in the context of imitation learning. Current methods often rely on classification-based pre-training, which we hypothesise holds back the agent's capacity for implicit image understanding. We propose pre-training the visual encoder of a driving agent using the self-distillation with no labels (DINO) method, which relies on a self-supervised learning paradigm. Our experiments in the CARLA environment, conducted in accordance with the Leaderboard benchmark, reveal that the proposed pre-training is more efficient than classification-based pre-training, and is on par with the recently proposed pre-training based on visual place recognition (VPRPre).
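For concreteness, the following is a minimal sketch of a DINO-style self-distillation objective applied to a visual encoder, assuming a generic PyTorch setup; the stand-in backbone, projection head, hyper-parameters, and centering update are illustrative assumptions, not the authors' implementation.

```python
# Minimal sketch of DINO-style self-distillation for a visual encoder (PyTorch).
# Illustrative only: module names and hyper-parameters are assumptions.
import copy
import torch
import torch.nn as nn
import torch.nn.functional as F

class ProjectionHead(nn.Module):
    """Small MLP head mapping encoder features to prototype logits."""
    def __init__(self, in_dim: int, out_dim: int = 4096):
        super().__init__()
        self.mlp = nn.Sequential(nn.Linear(in_dim, 512), nn.GELU(), nn.Linear(512, out_dim))
    def forward(self, x):
        return self.mlp(x)

def dino_loss(student_logits, teacher_logits, center, tau_s=0.1, tau_t=0.04):
    """Cross-entropy between the sharpened teacher and the student distributions."""
    t = F.softmax((teacher_logits - center) / tau_t, dim=-1).detach()
    s = F.log_softmax(student_logits / tau_s, dim=-1)
    return -(t * s).sum(dim=-1).mean()

@torch.no_grad()
def ema_update(student: nn.Module, teacher: nn.Module, momentum: float = 0.996):
    """Teacher weights track the student via an exponential moving average."""
    for ps, pt in zip(student.parameters(), teacher.parameters()):
        pt.mul_(momentum).add_(ps, alpha=1.0 - momentum)

# --- toy training step on two augmented views of the same driving frames ---
encoder = nn.Sequential(nn.Flatten(), nn.Linear(3 * 64 * 64, 256))  # stand-in backbone
student = nn.Sequential(encoder, ProjectionHead(256))
teacher = copy.deepcopy(student)          # no gradients flow through the teacher
for p in teacher.parameters():
    p.requires_grad_(False)

opt = torch.optim.AdamW(student.parameters(), lr=1e-4)
center = torch.zeros(4096)                # running center of teacher outputs

view1, view2 = torch.rand(8, 3, 64, 64), torch.rand(8, 3, 64, 64)
s1, s2 = student(view1), student(view2)
with torch.no_grad():
    t1, t2 = teacher(view1), teacher(view2)

loss = 0.5 * (dino_loss(s1, t2, center) + dino_loss(s2, t1, center))
opt.zero_grad(); loss.backward(); opt.step()
ema_update(student, teacher)
center = 0.9 * center + 0.1 * torch.cat([t1, t2]).mean(dim=0)  # update center
```

Gradients flow only through the student; the teacher follows by EMA, which is what makes the procedure label-free and suitable for pre-training the driving agent's encoder on raw camera frames.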
Related papers
- Prior Learning in Introspective VAEs [24.271671383057598]
Variational Autoencoders (VAEs) are a popular framework for unsupervised learning and data generation.
In this study, we focus on the Soft-IntroVAE and investigate the implication of incorporating a multimodal and learnable prior into this framework.
arXiv Detail & Related papers (2024-08-25T10:54:25Z)
- Stable Distillation: Regularizing Continued Pre-training for Low-Resource Automatic Speech Recognition [54.9235160379917]
Stable Distillation is a simple and novel approach for SSL-based continued pre-training.
It boosts ASR performance in the target domain where both labeled and unlabeled data are limited.
arXiv Detail & Related papers (2023-12-20T06:02:12Z)
- Value Explicit Pretraining for Learning Transferable Representations [11.069853883599102]
We propose a method that learns generalizable representations for transfer reinforcement learning.
We learn new tasks that share objectives similar to previously learned tasks by learning an encoder for objective-conditioned representations.
Experiments using a realistic navigation simulator and Atari benchmark show that the pretrained encoder produced by our method outperforms current SoTA pretraining methods.
arXiv Detail & Related papers (2023-12-19T17:12:35Z)
- PILOT: A Pre-Trained Model-Based Continual Learning Toolbox [71.63186089279218]
This paper introduces a pre-trained model-based continual learning toolbox known as PILOT.
On the one hand, PILOT implements some state-of-the-art class-incremental learning algorithms based on pre-trained models, such as L2P, DualPrompt, and CODA-Prompt.
On the other hand, PILOT fits typical class-incremental learning algorithms within the context of pre-trained models to evaluate their effectiveness.
arXiv Detail & Related papers (2023-09-13T17:55:11Z)
- Self-Distillation for Further Pre-training of Transformers [83.84227016847096]
We propose self-distillation as a regularization for a further pre-training stage.
We empirically validate the efficacy of self-distillation on a variety of benchmark datasets for image and text classification tasks.
arXiv Detail & Related papers (2022-09-30T02:25:12Z)
- Preserve Pre-trained Knowledge: Transfer Learning With Self-Distillation For Action Recognition [8.571437792425417]
We propose a novel transfer learning approach that incorporates self-distillation into fine-tuning to preserve knowledge from the pre-trained model learned on the large-scale dataset.
Specifically, we fix the encoder from the last epoch as the teacher model to guide the training of the encoder from the current epoch in the transfer learning.
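As a rough illustration of this fixed-teacher scheme, the sketch below freezes a copy of the encoder at the start of each epoch and uses it to regularise the current encoder during fine-tuning; the feature-level MSE loss and its weight are assumptions, not the paper's exact formulation.

```python
# Hedged sketch: the previous epoch's encoder acts as a fixed teacher
# whose features regularise the current epoch's encoder during fine-tuning.
import copy
import torch
import torch.nn as nn
import torch.nn.functional as F

encoder = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 128))  # stand-in backbone
classifier = nn.Linear(128, 10)
opt = torch.optim.SGD(list(encoder.parameters()) + list(classifier.parameters()), lr=1e-2)

x, y = torch.rand(16, 3, 32, 32), torch.randint(0, 10, (16,))
for epoch in range(3):
    # Snapshot of the encoder from the last epoch serves as the fixed teacher.
    teacher = copy.deepcopy(encoder).eval()
    for p in teacher.parameters():
        p.requires_grad_(False)

    feats = encoder(x)
    task_loss = F.cross_entropy(classifier(feats), y)
    with torch.no_grad():
        teacher_feats = teacher(x)
    distill_loss = F.mse_loss(feats, teacher_feats)  # stay close to last epoch's encoder
    loss = task_loss + 0.1 * distill_loss            # 0.1 is an assumed weight
    opt.zero_grad(); loss.backward(); opt.step()
```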
arXiv Detail & Related papers (2022-05-01T16:31:25Z)
- Reinforcement Learning with Action-Free Pre-Training from Videos [95.25074614579646]
We introduce a framework that learns representations useful for understanding the dynamics via generative pre-training on videos.
Our framework significantly improves both the final performance and sample efficiency of vision-based reinforcement learning.
arXiv Detail & Related papers (2022-03-25T19:44:09Z)
- Just Label What You Need: Fine-Grained Active Selection for Perception and Prediction through Partially Labeled Scenes [78.23907801786827]
We introduce generalizations that ensure our approach is both cost-aware and capable of fine-grained selection of examples through partially labeled scenes.
Our experiments on a real-world, large-scale self-driving dataset suggest that fine-grained selection can improve the performance across perception, prediction, and downstream planning tasks.
arXiv Detail & Related papers (2021-04-08T17:57:41Z)
- Bootstrapped Self-Supervised Training with Monocular Video for Semantic Segmentation and Depth Estimation [11.468537169201083]
We formalize a bootstrapped self-supervised learning problem where a system is initially bootstrapped with supervised training on a labeled dataset.
In this work, we leverage temporal consistency between frames in monocular video to perform this bootstrapped self-supervised training.
In addition, we show that the bootstrapped self-supervised training framework can help a network learn depth estimation better than pure supervised training or self-supervised training.
arXiv Detail & Related papers (2021-03-19T21:28:58Z)
- Coarse-to-Fine Pre-training for Named Entity Recognition [26.00489191164784]
We propose a NER-specific pre-training framework to inject coarse-to-fine automatically mined entity knowledge into pre-trained models.
Our framework achieves significant improvements against several pre-trained baselines, establishing the new state-of-the-art performance on three benchmarks.
arXiv Detail & Related papers (2020-10-16T07:39:20Z)
- Recall and Learn: Fine-tuning Deep Pretrained Language Models with Less Forgetting [66.45372974713189]
We propose a recall and learn mechanism, which adopts the idea of multi-task learning and jointly learns pretraining tasks and downstream tasks.
Experiments show that our method achieves state-of-the-art performance on the GLUE benchmark.
We provide open-source RecAdam, which integrates the proposed mechanisms into the Adam optimizer to facilitate adoption by the NLP community.
arXiv Detail & Related papers (2020-04-27T08:59:57Z)
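As a rough reading of the recall-and-learn idea above, the sketch below mixes the downstream loss with a quadratic penalty pulling parameters back toward their pretrained values, with an annealed mixing weight; the schedule and coefficients are illustrative assumptions and this is not the actual RecAdam implementation.

```python
# Hedged sketch of a "recall" regularizer in the spirit of Recall-and-Learn:
# the fine-tuning loss is mixed with a quadratic penalty toward the pretrained
# weights, and the mixing weight is annealed over training steps.
import math
import torch
import torch.nn as nn

def recall_penalty(model: nn.Module, pretrained_state: dict) -> torch.Tensor:
    """Sum of squared distances between current and pretrained parameters."""
    penalty = torch.zeros(())
    for name, p in model.named_parameters():
        penalty = penalty + ((p - pretrained_state[name]) ** 2).sum()
    return penalty

def mixing_weight(step: int, k: float = 0.05, t0: int = 100) -> float:
    """Sigmoid annealing (assumed schedule): early steps emphasise recalling
    pretrained knowledge, later steps emphasise the downstream task."""
    return 1.0 / (1.0 + math.exp(-k * (step - t0)))

model = nn.Linear(16, 2)  # stand-in for a pretrained language model
pretrained = {n: p.detach().clone() for n, p in model.named_parameters()}
opt = torch.optim.Adam(model.parameters(), lr=1e-3)

x, y = torch.randn(32, 16), torch.randint(0, 2, (32,))
for step in range(10):
    lam = mixing_weight(step)
    task_loss = nn.functional.cross_entropy(model(x), y)
    loss = lam * task_loss + (1.0 - lam) * recall_penalty(model, pretrained)
    opt.zero_grad(); loss.backward(); opt.step()
```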
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.