Pre-Train Your Loss: Easy Bayesian Transfer Learning with Informative Priors
- URL: http://arxiv.org/abs/2205.10279v1
- Date: Fri, 20 May 2022 16:19:30 GMT
- Title: Pre-Train Your Loss: Easy Bayesian Transfer Learning with Informative Priors
- Authors: Ravid Shwartz-Ziv, Micah Goldblum, Hossein Souri, Sanyam Kapoor, Chen Zhu, Yann LeCun, Andrew Gordon Wilson
- Abstract summary: We show that we can learn highly informative posteriors from the source task, through supervised or self-supervised approaches.
This simple modular approach enables significant performance gains and more data-efficient learning on a variety of downstream classification and segmentation tasks.
- Score: 59.93972277761501
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Deep learning is increasingly moving towards a transfer learning paradigm
whereby large foundation models are fine-tuned on downstream tasks, starting
from an initialization learned on the source task. But an initialization
contains relatively little information about the source task. Instead, we show
that we can learn highly informative posteriors from the source task, through
supervised or self-supervised approaches, which then serve as the basis for
priors that modify the whole loss surface on the downstream task. This simple
modular approach enables significant performance gains and more data-efficient
learning on a variety of downstream classification and segmentation tasks,
serving as a drop-in replacement for standard pre-training strategies. These
highly informative priors also can be saved for future use, similar to
pre-trained weights, and stand in contrast to the zero-mean isotropic
uninformative priors that are typically used in Bayesian deep learning.
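To make the recipe concrete, here is a minimal sketch (not the authors' code) of one way a posterior learned on the source task could re-enter the downstream objective as a prior. It assumes a diagonal Gaussian fit to flattened weight snapshots collected during source-task training; the helper names and the `prior_scale` knob are illustrative, and the paper's learned posterior need not be this simple diagonal Gaussian.

```python
# Minimal sketch, assuming a diagonal Gaussian posterior over flattened weights.
import torch


@torch.no_grad()
def fit_diag_gaussian_posterior(weight_snapshots):
    """weight_snapshots: list of flattened parameter vectors collected while
    training on the source task. Returns (mean, variance) of a diagonal Gaussian."""
    stacked = torch.stack(weight_snapshots)                 # [num_snapshots, num_params]
    return stacked.mean(dim=0), stacked.var(dim=0) + 1e-6   # small jitter for stability


def downstream_loss(model, inputs, targets, task_loss_fn,
                    prior_mean, prior_var, prior_scale=1.0):
    """Downstream objective = task loss + negative log-density of the learned prior.
    Unlike a zero-mean isotropic prior (plain weight decay), this term pulls the
    weights toward the source-task posterior and reshapes the whole loss surface."""
    task_loss = task_loss_fn(model(inputs), targets)
    flat = torch.cat([p.reshape(-1) for p in model.parameters()])
    neg_log_prior = 0.5 * ((flat - prior_mean) ** 2 / prior_var).sum()
    return task_loss + prior_scale * neg_log_prior
```

The (mean, variance) pair can be saved and shipped alongside the pre-trained weights, which is what makes the learned prior reusable in the same way as a standard checkpoint.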
Related papers
- Data Adaptive Traceback for Vision-Language Foundation Models in Image Classification [34.37262622415682]
We propose a new adaptation framework called Data Adaptive Traceback.
Specifically, we utilize a zero-shot-based method to extract the most downstream task-related subset of the pre-training data.
We adopt a pseudo-label-based semi-supervised technique to reuse the pre-training images and a vision-language contrastive learning method to address the confirmation bias issue in semi-supervised learning.
arXiv Detail & Related papers (2024-07-11T18:01:58Z)
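A rough sketch of the zero-shot extraction step described above, under assumptions the summary does not spell out: pre-training images are scored against text prompts built from the downstream class names using a CLIP-like dual encoder, and the top-scoring subset is kept. The encoder handles, prompt template, and `k` are placeholders.

```python
# Hedged sketch of zero-shot subset extraction: rank pre-training images by
# similarity to downstream class prompts and keep the most task-related ones.
# image_encoder / text_encoder stand in for a CLIP-like dual encoder.
import torch
import torch.nn.functional as F


@torch.no_grad()
def traceback_subset(image_encoder, text_encoder, pretrain_images, class_names, k):
    """pretrain_images: [N, 3, H, W]; returns indices of the k most related images."""
    prompts = [f"a photo of a {name}" for name in class_names]
    text_feats = F.normalize(text_encoder(prompts), dim=-1)            # [C, D]
    image_feats = F.normalize(image_encoder(pretrain_images), dim=-1)  # [N, D]
    scores = (image_feats @ text_feats.t()).max(dim=-1).values         # best-class similarity per image
    return scores.topk(k).indices
```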
- Transfer Learning with Informative Priors: Simple Baselines Better than Previously Reported [4.453137996095194]
We compare transfer learning with and without source task informed priors across 5 datasets.
For the scenario of 5-300 examples per class, we find negative or negligible gains on 2 datasets, modest gains on 2 other datasets, and substantial gains on one dataset.
arXiv Detail & Related papers (2024-05-24T14:12:23Z)
- Continual Learning with Pretrained Backbones by Tuning in the Input Space [44.97953547553997]
The intrinsic difficulty in adapting deep learning models to non-stationary environments limits the applicability of neural networks to real-world tasks.
We propose a novel strategy to make the fine-tuning procedure more effective: we avoid updating the pre-trained part of the network and learn not only the usual classification head but also a set of newly introduced learnable parameters.
arXiv Detail & Related papers (2023-06-05T15:11:59Z)
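As a rough illustration of tuning in the input space (the concrete parameterization below is an assumption, not taken from the paper): the pre-trained backbone stays frozen while an additive input-space perturbation and the classification head are the only trainable parameters.

```python
# Illustrative sketch: freeze the pre-trained backbone; train only a set of
# input-space parameters (here, an additive perturbation) and the new head.
import torch
import torch.nn as nn


class InputSpaceTuner(nn.Module):
    def __init__(self, backbone, feat_dim, num_classes, image_shape=(3, 224, 224)):
        super().__init__()
        self.backbone = backbone
        for p in self.backbone.parameters():
            p.requires_grad_(False)                       # pre-trained part is never updated
        self.input_params = nn.Parameter(torch.zeros(1, *image_shape))
        self.head = nn.Linear(feat_dim, num_classes)

    def forward(self, x):
        x = x + self.input_params                         # learnable shift in the input space
        feats = self.backbone(x)                          # frozen weights; grads still reach input_params
        return self.head(feats)
```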
- Task Residual for Tuning Vision-Language Models [69.22958802711017]
We propose a new efficient tuning approach for vision-language models (VLMs) named Task Residual Tuning (TaskRes).
TaskRes explicitly decouples the prior knowledge of the pre-trained models from the new knowledge required for a target task.
TaskRes is simple yet effective and significantly outperforms previous methods on 11 benchmark datasets.
arXiv Detail & Related papers (2022-11-18T15:09:03Z)
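A minimal sketch of the task-residual idea: the base classifier obtained from the pre-trained model (e.g., text embeddings of the class names) is kept frozen, and only a residual added on top of it is learned. The `alpha` scaling and zero initialization are assumptions for illustration.

```python
# Sketch of a task-residual head: frozen base classifier + learnable residual.
import torch
import torch.nn as nn
import torch.nn.functional as F


class TaskResidualHead(nn.Module):
    def __init__(self, base_weights, alpha=0.5):
        super().__init__()
        self.register_buffer("base", base_weights)                     # prior knowledge, frozen, [C, D]
        self.residual = nn.Parameter(torch.zeros_like(base_weights))   # new, task-specific knowledge
        self.alpha = alpha

    def forward(self, image_features):
        weights = self.base + self.alpha * self.residual               # decoupled old + new knowledge
        weights = F.normalize(weights, dim=-1)
        image_features = F.normalize(image_features, dim=-1)
        return image_features @ weights.t()                            # class logits
```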
- Self-Distillation for Further Pre-training of Transformers [83.84227016847096]
We propose self-distillation as a regularization for a further pre-training stage.
We empirically validate the efficacy of self-distillation on a variety of benchmark datasets for image and text classification tasks.
arXiv Detail & Related papers (2022-09-30T02:25:12Z)
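A hedged sketch of what self-distillation as a regularizer for further pre-training could look like: a frozen copy of the model taken before this stage acts as its own teacher, and a distillation term is added to the usual pre-training loss. The temperature, weighting, and logit-level matching are assumptions.

```python
# Hedged sketch: further pre-training regularized by distillation from a frozen
# copy of the same model (teacher = copy.deepcopy(model) taken before this stage).
import torch
import torch.nn.functional as F


def further_pretrain_step(model, teacher, inputs, pretrain_loss_fn,
                          distill_weight=1.0, temperature=2.0):
    loss = pretrain_loss_fn(model, inputs)                 # the usual pre-training objective
    with torch.no_grad():
        teacher_logits = teacher(inputs)                   # teacher is frozen
    student_logits = model(inputs)
    distill = F.kl_div(
        F.log_softmax(student_logits / temperature, dim=-1),
        F.softmax(teacher_logits / temperature, dim=-1),
        reduction="batchmean",
    ) * temperature ** 2
    return loss + distill_weight * distill                 # regularized further pre-training loss
```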
- PAC-Net: A Model Pruning Approach to Inductive Transfer Learning [16.153557870191488]
PAC-Net is a simple yet effective approach for transfer learning based on pruning.
PAC-Net consists of three steps: Prune, Allocate, and Calibrate.
Across an extensive set of inductive transfer learning experiments, we show that our method achieves state-of-the-art performance by a large margin.
arXiv Detail & Related papers (2022-06-12T09:45:16Z)
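A rough sketch of the Prune / Allocate / Calibrate recipe as described above: magnitude pruning marks the important source-task weights, those weights are kept fixed (the allocation bookkeeping is implicit here), and only the pruned weights are calibrated on the target task via gradient masking. The keep ratio and masking mechanics are assumptions.

```python
# Hedged sketch: freeze large-magnitude source weights, fine-tune only the
# pruned (small) weights on the target task via gradient masking.
import torch


def magnitude_masks(model, keep_ratio=0.5):
    """Prune: mark the top keep_ratio fraction of weights (by magnitude) as source weights."""
    masks = {}
    for name, p in model.named_parameters():
        k = max(1, int(keep_ratio * p.numel()))
        threshold = p.detach().abs().flatten().kthvalue(p.numel() - k + 1).values
        masks[name] = p.detach().abs() >= threshold        # True = frozen source weight
    return masks


def calibrate_step(model, masks, loss, optimizer):
    """Calibrate: update only the pruned weights; source weights stay fixed."""
    optimizer.zero_grad()
    loss.backward()
    for name, p in model.named_parameters():
        if p.grad is not None:
            p.grad[masks[name]] = 0.0                      # block updates to source weights
    optimizer.step()
```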
- Knowledge Distillation as Efficient Pre-training: Faster Convergence, Higher Data-efficiency, and Better Transferability [53.27240222619834]
Knowledge Distillation as Efficient Pre-training aims to efficiently transfer the learned feature representation from pre-trained models to new student models for future downstream tasks.
Our method performs comparably with supervised pre-training counterparts on 3 downstream tasks and 9 downstream datasets while requiring 10x less data and 5x less pre-training time.
arXiv Detail & Related papers (2022-03-10T06:23:41Z)
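A hedged sketch of distillation-as-pre-training: rather than pre-training the student from scratch, its features are trained to mimic a frozen, already pre-trained teacher on a (much smaller) unlabeled set, after which the student is fine-tuned downstream as usual. The projector and the normalized-MSE objective are assumptions.

```python
# Hedged sketch of feature-level distillation used as the pre-training stage.
import torch
import torch.nn.functional as F


def feature_distill_loss(student, teacher, projector, images):
    """projector maps student features to the teacher's feature dimension (assumed)."""
    with torch.no_grad():
        t_feats = teacher(images)                          # frozen pre-trained teacher
    s_feats = projector(student(images))
    return F.mse_loss(F.normalize(s_feats, dim=-1), F.normalize(t_feats, dim=-1))
```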
- Parrot: Data-Driven Behavioral Priors for Reinforcement Learning [79.32403825036792]
We propose a method for pre-training behavioral priors that can capture complex input-output relationships observed in successful trials.
We show how this learned prior can be used for rapidly learning new tasks without impeding the RL agent's ability to try out novel behaviors.
arXiv Detail & Related papers (2020-11-19T18:47:40Z)
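As a simplified stand-in for a learned behavioral prior (the paper itself uses an invertible, observation-conditioned model rather than the plain Gaussian below): fit an observation-conditioned action distribution by maximum likelihood on (observation, action) pairs from successful trials, then sample from it to propose behaviors when learning a new task.

```python
# Simplified stand-in, not the paper's model: a Gaussian behavioral prior fit to
# successful source-task trials and sampled from during downstream RL.
import torch
import torch.nn as nn


class BehavioralPrior(nn.Module):
    def __init__(self, obs_dim, act_dim, hidden=256):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(obs_dim, hidden), nn.ReLU(),
                                 nn.Linear(hidden, 2 * act_dim))

    def forward(self, obs):
        mean, log_std = self.net(obs).chunk(2, dim=-1)
        return torch.distributions.Normal(mean, log_std.clamp(-5, 2).exp())

    def loss(self, obs, actions):
        # Maximize likelihood of actions observed in successful trials.
        return -self.forward(obs).log_prob(actions).sum(-1).mean()

    @torch.no_grad()
    def propose(self, obs):
        # During RL on a new task, sample from the prior to guide exploration.
        return self.forward(obs).sample()
```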
- Adversarially-Trained Deep Nets Transfer Better: Illustration on Image Classification [53.735029033681435]
Transfer learning is a powerful methodology for adapting pre-trained deep neural networks on image recognition tasks to new domains.
In this work, we demonstrate that adversarially-trained models transfer better than non-adversarially-trained models.
arXiv Detail & Related papers (2020-07-11T22:48:42Z)
- A Survey on Self-supervised Pre-training for Sequential Transfer Learning in Neural Networks [1.1802674324027231]
Self-supervised pre-training for transfer learning is becoming an increasingly popular technique to improve state-of-the-art results using unlabeled data.
We provide an overview of the taxonomy for self-supervised learning and transfer learning, and highlight some prominent methods for designing pre-training tasks across different domains.
arXiv Detail & Related papers (2020-07-01T22:55:48Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of this information and is not responsible for any consequences of its use.