Formal Bayesian Transfer Learning via the Total Risk Prior
- URL: http://arxiv.org/abs/2507.23768v1
- Date: Thu, 31 Jul 2025 17:55:16 GMT
- Title: Formal Bayesian Transfer Learning via the Total Risk Prior
- Authors: Nathan Wycoff, Ali Arab, Lisa O. Singh
- Abstract summary: We show how a particular instantiation of our prior leads to a Bayesian Lasso in a transformed coordinate system. We also demonstrate that recently proposed minimax-frequentist transfer learning techniques may be viewed as an approximate Maximum a Posteriori approach to our model.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In analyses with severe data limitations, augmenting the target dataset with information from ancillary datasets in the application domain, called source datasets, can lead to significantly improved statistical procedures. However, existing methods for this transfer learning struggle to deal with situations where the source datasets are also limited and not guaranteed to be well-aligned with the target dataset. A typical strategy is to use the empirical loss minimizer on the source data as a prior mean for the target parameters, which places the estimation of source parameters outside of the Bayesian formalism. Our key conceptual contribution is to use a risk minimizer conditional on source parameters instead. This allows us to construct a single joint prior distribution for all parameters from the source datasets as well as the target dataset. As a consequence, we benefit from full Bayesian uncertainty quantification and can perform model averaging via Gibbs sampling over indicator variables governing the inclusion of each source dataset. We show how a particular instantiation of our prior leads to a Bayesian Lasso in a transformed coordinate system and discuss computational techniques to scale our approach to moderately sized datasets. We also demonstrate that recently proposed minimax-frequentist transfer learning techniques may be viewed as an approximate Maximum a Posteriori approach to our model. Finally, we demonstrate superior predictive performance relative to the frequentist baseline on a genetics application, especially when the source data are limited.
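To make the mechanics concrete, here is a minimal sketch of Gibbs sampling over source-inclusion indicators in a Bayesian transfer-learning linear model. It is not the paper's Total Risk Prior: for brevity it treats the source coefficients as fixed estimates and centers an isotropic Gaussian prior on the average of the included sources, whereas the paper places a single joint prior over source and target parameters. All function names and hyperparameters below are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def gibbs_transfer(X, y, source_betas, sigma2=1.0, tau2=1.0, n_iter=2000):
    """Sample target coefficients beta and source-inclusion flags z.

    X            : (n, p) target design matrix
    y            : (n,) target responses
    source_betas : (K, p) fixed coefficient estimates, one row per source
    """
    n, p = X.shape
    K = source_betas.shape[0]
    XtX, Xty = X.T @ X, X.T @ y

    def prior_mean(z):
        # Prior mean: average of the included sources' coefficients (0 if none).
        return source_betas[z].mean(axis=0) if z.any() else np.zeros(p)

    # Posterior precision of beta does not depend on z, so precompute it.
    prec = XtX / sigma2 + np.eye(p) / tau2
    cov = np.linalg.inv(prec)

    z = np.ones(K, dtype=bool)
    beta_draws, z_draws = [], []
    for _ in range(n_iter):
        # beta | z, y: conjugate Gaussian update.
        mean = cov @ (Xty / sigma2 + prior_mean(z) / tau2)
        beta = rng.multivariate_normal(mean, cov)

        # z_k | beta, z_{-k}: Bernoulli(1/2) prior times the Gaussian prior
        # density of beta under each choice of z_k.
        for k in range(K):
            logp = np.empty(2)
            for val in (False, True):
                z[k] = val
                d = beta - prior_mean(z)
                logp[int(val)] = -0.5 * d @ d / tau2
            z[k] = rng.random() < 1.0 / (1.0 + np.exp(logp[0] - logp[1]))

        beta_draws.append(beta)
        z_draws.append(z.copy())
    return np.array(beta_draws), np.array(z_draws)
```

In this simplified setting, the posterior frequency of each z_k acts as a data-driven weight on that source, which is the model averaging over source inclusion described in the abstract.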
Related papers
- Using Scaling Laws for Data Source Utility Estimation in Domain-Specific Pre-Training [4.90288999217624]
We introduce a framework for optimizing domain-specific dataset construction in foundation model training. Our approach extends the usual point-estimate approaches, known as micro-annealing, to estimating scaling laws. We validate our approach through experiments on a pre-trained model with 7 billion parameters.
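As a rough illustration of what estimating a scaling law can mean here, the sketch below fits the common power-law form L(N) = a·N^(-b) + c to synthetic loss measurements from hypothetical small training runs and extrapolates; the paper's exact functional form, data, and fitting procedure may differ.

```python
import numpy as np
from scipy.optimize import curve_fit

def power_law(n, a, b, c):
    # Loss as a function of training tokens n.
    return a * n ** (-b) + c

# Synthetic loss measurements from hypothetical small training runs.
tokens = np.array([1e8, 3e8, 1e9, 3e9, 1e10])
loss = np.array([3.9, 3.4, 3.0, 2.7, 2.5])

(a, b, c), _ = curve_fit(power_law, tokens, loss, p0=(10.0, 0.1, 2.0), maxfev=10000)
print(f"extrapolated loss at 1e11 tokens: {power_law(1e11, a, b, c):.3f}")
```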
arXiv Detail & Related papers (2025-07-29T21:56:45Z)
- Transfer Learning for Matrix Completion [0.0]
We propose a transfer learning procedure given prior information on which source datasets are favorable. When the source matrices are close enough to the target matrix, our method outperforms the traditional method that uses the target data alone.
arXiv Detail & Related papers (2025-07-03T02:43:40Z)
- Meta-Statistical Learning: Supervised Learning of Statistical Inference [59.463430294611626]
This work demonstrates that the tools and principles driving the success of large language models (LLMs) can be repurposed to tackle distribution-level tasks. We propose meta-statistical learning, a framework inspired by multi-instance learning that reformulates statistical inference tasks as supervised learning problems.
arXiv Detail & Related papers (2025-02-17T18:04:39Z)
- Domain Adaptation for Offline Reinforcement Learning with Limited Samples [2.3674123304219816]
Offline reinforcement learning learns effective policies from a static target dataset. Its performance depends on the quality and size of that dataset, and it degrades when only limited target samples are available. This paper proposes the first framework that theoretically explores the impact of the weight assigned to each dataset on the performance of offline RL.
arXiv Detail & Related papers (2024-08-22T05:38:48Z)
- Source-Free Domain-Invariant Performance Prediction [68.39031800809553]
We propose a source-free approach centred on uncertainty-based estimation, using a generative model for calibration in the absence of source data.
Our experiments on benchmark object recognition datasets reveal that existing source-based methods fall short with limited source sample availability.
Our approach significantly outperforms the current state-of-the-art source-free and source-based methods, affirming its effectiveness in domain-invariant performance estimation.
arXiv Detail & Related papers (2024-08-05T03:18:58Z)
- Geometry-Aware Instrumental Variable Regression [56.16884466478886]
We propose a transport-based IV estimator that takes into account the geometry of the data manifold through data-derivative information.
We provide a simple plug-and-play implementation of our method that performs on par with related estimators in standard settings.
arXiv Detail & Related papers (2024-05-19T17:49:33Z)
- Federated Learning with Projected Trajectory Regularization [65.6266768678291]
Federated learning enables joint training of machine learning models from distributed clients without sharing their local data.
One key challenge in federated learning is to handle non-identically distributed data across the clients.
We propose a novel federated learning framework with projected trajectory regularization (FedPTR) for tackling this data heterogeneity issue.
arXiv Detail & Related papers (2023-12-22T02:12:08Z)
- Uncertainty-guided Source-free Domain Adaptation [77.3844160723014]
Source-free domain adaptation (SFDA) aims to adapt a classifier to an unlabelled target data set by only using a pre-trained source model.
We propose quantifying the uncertainty in the source model predictions and utilizing it to guide the target adaptation.
arXiv Detail & Related papers (2022-08-16T08:03:30Z)
- Leveraging Unlabeled Data to Predict Out-of-Distribution Performance [63.740181251997306]
Real-world machine learning deployments are characterized by mismatches between the source (training) and target (test) distributions.
In this work, we investigate methods for predicting the target domain accuracy using only labeled source data and unlabeled target data.
We propose Average Thresholded Confidence (ATC), a practical method that learns a threshold on the model's confidence and predicts target accuracy as the fraction of unlabeled examples whose confidence exceeds that threshold.
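A minimal sketch of the ATC rule just described; the function name, the use of a scalar confidence score, and the quantile-based threshold search are illustrative assumptions rather than the authors' exact procedure.

```python
import numpy as np

def atc_predict_accuracy(src_conf, src_correct, tgt_conf):
    """Predict target accuracy from unlabeled target confidences.

    src_conf    : (n_s,) model confidences on labeled source examples
    src_correct : (n_s,) 0/1 indicators of correct source predictions
    tgt_conf    : (n_t,) model confidences on unlabeled target examples
    """
    src_acc = src_correct.mean()
    # Choose the threshold so that the fraction of source examples
    # above it matches the source accuracy.
    t = np.quantile(src_conf, 1.0 - src_acc)
    return float((tgt_conf >= t).mean())
```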
arXiv Detail & Related papers (2022-01-11T23:01:12Z)
- Meta-analysis of heterogeneous data: integrative sparse regression in high-dimensions [21.162280861396205]
We consider the task of meta-analysis in high-dimensional settings in which the data sources are similar but non-identical.
We introduce a global parameter that emphasizes interpretability and statistical efficiency in the presence of heterogeneity.
We demonstrate the benefits of our approach on a large-scale drug treatment dataset involving several different cancer cell-lines.
arXiv Detail & Related papers (2019-12-26T20:30:57Z)