Enhancing Transfer Learning with Flexible Nonparametric Posterior Sampling
- URL: http://arxiv.org/abs/2403.07282v1
- Date: Tue, 12 Mar 2024 03:26:58 GMT
- Title: Enhancing Transfer Learning with Flexible Nonparametric Posterior Sampling
- Authors: Hyungi Lee, Giung Nam, Edwin Fong, Juho Lee
- Abstract summary: This paper introduces nonparametric transfer learning (NPTL), a flexible posterior sampling method to address the distribution shift issue.
NPTL is suited to transfer learning scenarios that may involve a distribution shift between upstream and downstream tasks.
- Score: 22.047309973614876
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Transfer learning has recently shown significant performance across various
tasks involving deep neural networks. In these transfer learning scenarios, the
prior distribution for downstream data becomes crucial in Bayesian model
averaging (BMA). While previous works proposed the prior over the neural
network parameters centered around the pre-trained solution, such strategies
have limitations when dealing with distribution shifts between upstream and
downstream data. This paper introduces nonparametric transfer learning (NPTL),
a flexible posterior sampling method to address the distribution shift issue
within the context of nonparametric learning. The nonparametric learning (NPL)
method is a recent approach that employs a nonparametric prior for posterior
sampling and efficiently accounts for model misspecification, making it
suitable for transfer learning scenarios that may involve a distribution
shift between upstream and downstream tasks. Through extensive empirical
validations, we demonstrate that our approach surpasses other baselines in BMA
performance.
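To make the abstract's pipeline concrete, here is a minimal sketch of nonparametric posterior sampling feeding Bayesian model averaging (BMA), assuming a Bayesian-bootstrap-style NPL loop: each posterior sample re-fits the model under Dirichlet-distributed data weights, and predictions are averaged across samples. The toy model, data, and hyperparameters are illustrative and not the paper's actual setup.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy downstream task: 1-D logistic regression with label noise.
n = 200
X = rng.normal(size=(n, 1))
y = (X[:, 0] + 0.3 * rng.normal(size=n) > 0).astype(float)

def fit_weighted_logreg(X, y, w, lr=0.5, steps=300):
    """Fit logistic regression by minimizing the w-weighted negative log-likelihood."""
    Xb = np.hstack([X, np.ones((len(X), 1))])  # append a bias column
    theta = np.zeros(Xb.shape[1])
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-Xb @ theta))
        grad = Xb.T @ (w * (p - y))            # gradient of the weighted NLL
        theta -= lr * grad / w.sum()
    return theta

def npl_posterior_samples(X, y, n_samples=50):
    """Posterior samples a la Bayesian bootstrap: Dirichlet(1,...,1) data weights."""
    samples = []
    for _ in range(n_samples):
        w = rng.dirichlet(np.ones(len(X)))     # one weight per data point
        samples.append(fit_weighted_logreg(X, y, w))
    return samples

def bma_predict(samples, X_new):
    """Bayesian model averaging: mean predictive probability over posterior samples."""
    Xb = np.hstack([X_new, np.ones((len(X_new), 1))])
    probs = [1.0 / (1.0 + np.exp(-Xb @ theta)) for theta in samples]
    return np.mean(probs, axis=0)

samples = npl_posterior_samples(X, y)
p = bma_predict(samples, np.array([[2.0], [-2.0]]))
```

The Dirichlet-weighted refits play the role of the nonparametric posterior here; NPTL's contribution concerns how the nonparametric prior is centered given a pre-trained upstream solution, which this sketch does not model.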
Related papers
- Non-Stationary Learning of Neural Networks with Automatic Soft Parameter Reset [98.52916361979503]
We introduce a novel learning approach that automatically models and adapts to non-stationarity.
We show empirically that our approach performs well in non-stationary supervised and off-policy reinforcement learning settings.
arXiv Detail & Related papers (2024-11-06T16:32:40Z)
- A Bayesian Approach to Data Point Selection [24.98069363998565]
Data point selection (DPS) is becoming a critical topic in deep learning.
Existing approaches to DPS are predominantly based on a bi-level optimisation (BLO) formulation.
We propose a novel Bayesian approach to DPS.
arXiv Detail & Related papers (2024-11-06T09:04:13Z)
- Sparse Orthogonal Parameters Tuning for Continual Learning [34.462967722928724]
Continual learning methods based on pre-trained models (PTM), which adapt to successive downstream tasks without catastrophic forgetting, have recently gained attention.
We propose a novel yet effective method called SoTU (Sparse Orthogonal Parameters TUning).
arXiv Detail & Related papers (2024-11-05T05:19:09Z)
- Understanding Optimal Feature Transfer via a Fine-Grained Bias-Variance Analysis [10.79615566320291]
We explore transfer learning with the goal of optimizing downstream performance.
We introduce a simple linear model that takes as input an arbitrary pretrained feature.
We identify the optimal pretrained representation by minimizing the downstream risk averaged over an ensemble of downstream tasks.
arXiv Detail & Related papers (2024-04-18T19:33:55Z)
- Diffusion Generative Flow Samplers: Improving learning signals through partial trajectory optimization [87.21285093582446]
Diffusion Generative Flow Samplers (DGFS) is a sampling-based framework where the learning process can be tractably broken down into short partial trajectory segments.
Our method takes inspiration from the theory developed for generative flow networks (GFlowNets).
arXiv Detail & Related papers (2023-10-04T09:39:05Z)
- Variational Density Propagation Continual Learning [0.0]
Deep Neural Networks (DNNs) deployed to the real world are regularly subject to out-of-distribution (OoD) data.
This paper proposes a framework for adapting to data distribution drift modeled by benchmark Continual Learning datasets.
arXiv Detail & Related papers (2023-08-22T21:51:39Z)
- Learning to Learn with Generative Models of Neural Network Checkpoints [71.06722933442956]
We construct a dataset of neural network checkpoints and train a generative model on the parameters.
We find that our approach successfully generates parameters for a wide range of loss prompts.
We apply our method to different neural network architectures and tasks in supervised and reinforcement learning.
arXiv Detail & Related papers (2022-09-26T17:59:58Z)
- Transfer Learning with Gaussian Processes for Bayesian Optimization [9.933956770453438]
We provide a unified view on hierarchical GP models for transfer learning, which allows us to analyze the relationship between methods.
We develop a novel closed-form boosted GP transfer model that fits between existing approaches in terms of complexity.
We evaluate the performance of the different approaches in large-scale experiments and highlight strengths and weaknesses of the different transfer-learning methods.
arXiv Detail & Related papers (2021-11-22T14:09:45Z)
- A Variational Bayesian Approach to Learning Latent Variables for Acoustic Knowledge Transfer [55.20627066525205]
We propose a variational Bayesian (VB) approach to learning distributions of latent variables in deep neural network (DNN) models.
Our proposed VB approach can obtain good improvements on target devices, and consistently outperforms 13 state-of-the-art knowledge transfer algorithms.
arXiv Detail & Related papers (2021-10-16T15:54:01Z)
- Sampling-free Variational Inference for Neural Networks with Multiplicative Activation Noise [51.080620762639434]
We propose a more efficient parameterization of the posterior approximation for sampling-free variational inference.
Our approach yields competitive results for standard regression problems and scales well to large-scale image classification tasks.
arXiv Detail & Related papers (2021-03-15T16:16:18Z)
- On Robustness and Transferability of Convolutional Neural Networks [147.71743081671508]
Modern deep convolutional networks (CNNs) are often criticized for not generalizing under distributional shifts.
We study the interplay between out-of-distribution and transfer performance of modern image classification CNNs for the first time.
We find that increasing both the training set and model sizes significantly improves robustness to distributional shift.
arXiv Detail & Related papers (2020-07-16T18:39:04Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.