Proxy-informed Bayesian transfer learning with unknown sources
- URL: http://arxiv.org/abs/2411.03263v2
- Date: Thu, 13 Feb 2025 16:28:07 GMT
- Title: Proxy-informed Bayesian transfer learning with unknown sources
- Authors: Sabina J. Sloman, Julien Martinelli, Samuel Kaski
- Abstract summary: Generalization outside the scope of one's training data requires leveraging prior knowledge about the effects that transfer. Negative transfer can stem from misspecified prior information about non-transferable causes of the source data. Our proposed method, proxy-informed robust method for probabilistic transfer learning (PROMPT), does not require prior knowledge of the source data.
- Score: 18.264598332579748
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Generalization outside the scope of one's training data requires leveraging prior knowledge about the effects that transfer, and the effects that don't, between different data sources. Transfer learning is a framework for specifying and refining this knowledge about sets of source (training) and target (prediction) data. A challenging open problem is addressing the empirical phenomenon of negative transfer, whereby the transfer learner performs worse on the target data after taking the source data into account than before. We first introduce a Bayesian perspective on negative transfer, and then a method to address it. The key insight from our formulation is that negative transfer can stem from misspecified prior information about non-transferable causes of the source data. Our proposed method, proxy-informed robust method for probabilistic transfer learning (PROMPT), does not require prior knowledge of the source data (the data sources may be "unknown"). PROMPT is thus applicable when differences between tasks are unobserved, such as in the presence of latent confounders. Moreover, the learner need not have access to observations in the target task (cannot "fine-tune"), and instead makes use of proxy (indirect) information. Our theoretical results show that the threat of negative transfer does not depend on the informativeness of the proxy information, highlighting the usefulness of PROMPT in cases where only noisy indirect information, such as human feedback, is available.
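As a rough, self-contained illustration of the setting (not the paper's algorithm), the sketch below contrasts naive transfer with a tempered source likelihood combined with noisy proxy observations in a toy Gaussian model; the tempering weight `w`, the noise scales, and all names are assumptions made purely for illustration.

```python
# Toy Gaussian illustration of proxy-informed transfer (NOT the paper's PROMPT
# algorithm): the source data carry a non-transferable shift, so a naive
# Bayesian update is biased; tempering the source likelihood and folding in
# noisy proxy observations of the target recovers a better estimate.
import numpy as np

rng = np.random.default_rng(0)

theta_true = 1.0    # transferable effect we want to learn
shift = 2.0         # non-transferable cause of the source data (unknown to us)
y_src = theta_true + shift + rng.normal(0.0, 0.5, size=50)  # source data
y_prox = theta_true + rng.normal(0.0, 2.0, size=20)         # noisy proxy info

def posterior(mu0, tau0, obs, sigma, w=1.0):
    """Conjugate normal update with an optionally tempered (weight w) likelihood."""
    prec = 1.0 / tau0**2 + w * len(obs) / sigma**2
    mean = (mu0 / tau0**2 + w * obs.sum() / sigma**2) / prec
    return mean, np.sqrt(1.0 / prec)

# Naive transfer: trust the source fully -> biased toward theta_true + shift.
mu_naive, _ = posterior(0.0, 10.0, y_src, 0.5)

# Robust transfer: heavily temper the source, then update on the proxy data.
# (w = 0.01 is hand-picked here purely for illustration.)
mu_tmp, tau_tmp = posterior(0.0, 10.0, y_src, 0.5, w=0.01)
mu_robust, _ = posterior(mu_tmp, tau_tmp, y_prox, 2.0)

print(f"true theta = {theta_true}, naive = {mu_naive:.2f}, robust = {mu_robust:.2f}")
```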
Related papers
- Data-Driven Knowledge Transfer in Batch $Q^*$ Learning [5.6665432569907646]
We explore knowledge transfer in dynamic decision-making by concentrating on batch stationary environments.
We propose a framework of Transferred Fitted $Q$-Iteration algorithm with general function approximation.
We show that the final learning error of the $Q^*$ function is significantly improved over the single-task rate.
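A minimal sketch of the flavor of transferred fitted $Q$-iteration: warm-start the target $Q$-estimate from a source task and pull the regression step toward it. The tabular setting and the penalty weight `lam` are assumptions for illustration, not the paper's general-function-approximation algorithm.

```python
# Illustrative sketch only: fitted Q-iteration on a batch of target-task
# transitions, warm-started by a source-task Q-estimate and ridge-pulled
# toward it during each regression step.
import numpy as np

n_states, n_actions, gamma, lam = 5, 2, 0.9, 0.5
rng = np.random.default_rng(1)

# Batch of logged target-task transitions (s, a, r, s').
batch = [(rng.integers(n_states), rng.integers(n_actions),
          rng.normal(), rng.integers(n_states)) for _ in range(200)]

q_source = rng.normal(size=(n_states, n_actions))  # Q learned on the source task
q = q_source.copy()                                # warm start from the source

for _ in range(50):                                # fitted Q-iteration sweeps
    targets = np.zeros_like(q)
    counts = np.zeros_like(q)
    for s, a, r, s2 in batch:
        targets[s, a] += r + gamma * q[s2].max()
        counts[s, a] += 1
    seen = counts > 0
    # Ridge regression step with a pull toward the source estimate.
    q[seen] = (targets[seen] + lam * q_source[seen]) / (counts[seen] + lam)

print(q.round(2))
```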
arXiv Detail & Related papers (2024-04-01T02:20:09Z)
- Covariate-Elaborated Robust Partial Information Transfer with Conditional Spike-and-Slab Prior [1.111488407653005]
We propose a novel Bayesian transfer learning method named "CONCERT" to allow robust partial information transfer.
A conditional spike-and-slab prior is introduced in the joint distribution of target and source parameters for information transfer.
In contrast to existing work, CONCERT is a one-step procedure that achieves variable selection and information transfer simultaneously.
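A minimal sketch of the spike-and-slab transfer idea: each source coefficient either matches the target (spike) or deviates freely (slab). The hyperparameters below are illustrative assumptions; CONCERT's conditional construction is in the paper.

```python
# Minimal sketch of a spike-and-slab transfer prior: each source coefficient
# either equals the target coefficient (spike at delta = 0) or deviates
# freely (slab). All hyperparameters are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(2)
p, pi_spike, slab_sd = 8, 0.7, 1.0

beta_target = rng.normal(size=p)        # target-task coefficients
# Transfer (spike) with probability pi_spike, otherwise deviate (slab).
is_transferable = rng.random(p) < pi_spike
delta = np.where(is_transferable, 0.0, rng.normal(0.0, slab_sd, size=p))
beta_source = beta_target + delta

print("transferable coords:", np.flatnonzero(is_transferable))
print("source deviations:  ", delta.round(2))
```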
arXiv Detail & Related papers (2024-03-30T07:32:58Z)
- Selectivity Drives Productivity: Efficient Dataset Pruning for Enhanced Transfer Learning [66.20311762506702]
Dataset pruning (DP) has emerged as an effective way to improve data efficiency.
We propose two new DP methods, label mapping and feature mapping, for supervised and self-supervised pretraining settings.
We show that source data classes can be pruned by 40% to 80% without sacrificing downstream performance.
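A toy sketch of score-based source-class pruning; the random relevance scores below are stand-ins for the paper's label-mapping and feature-mapping scores computed from a pretrained model.

```python
# Toy sketch of pruning source classes before transfer: score each source
# class for target relevance, then drop the lowest-scoring fraction.
import numpy as np

rng = np.random.default_rng(8)
n_src_classes, prune_frac = 100, 0.4

relevance = rng.random(n_src_classes)       # stand-in relevance scores
n_keep = int((1 - prune_frac) * n_src_classes)
kept = np.argsort(relevance)[-n_keep:]      # keep the most relevant classes

print(f"pruned {n_src_classes - n_keep} of {n_src_classes} source classes")
```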
arXiv Detail & Related papers (2023-10-13T00:07:49Z)
- Towards Estimating Transferability using Hard Subsets [25.86053764521497]
We propose HASTE, a new strategy to estimate the transferability of a source model to a particular target task using only a harder subset of target data.
We show that HASTE can be used with any existing transferability metric to improve their reliability.
Our experimental results across multiple source model architectures, target datasets, and transfer learning tasks show that HASTE-modified metrics are consistently better than or on par with state-of-the-art transferability metrics.
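A toy sketch of the hard-subset idea: compute a transferability score only on the target examples the source model is least confident about. The hardness rule (lowest max softmax probability) and the stand-in metric are illustrative assumptions; HASTE pairs the subset with existing transferability metrics.

```python
# Sketch: score transferability on the hardest target examples only.
import numpy as np

rng = np.random.default_rng(3)
n, n_classes, frac_hard = 500, 10, 0.3

probs = rng.dirichlet(np.ones(n_classes), size=n)  # source-model softmax on target data
labels = rng.integers(n_classes, size=n)           # target labels

hardness = 1.0 - probs.max(axis=1)                 # low confidence = hard
hard_idx = np.argsort(hardness)[-int(frac_hard * n):]

def toy_metric(p, y):
    """Mean log-likelihood of the true label: a stand-in transferability score."""
    return np.log(p[np.arange(len(y)), y]).mean()

print("metric on all data: ", toy_metric(probs, labels).round(3))
print("metric on hard set: ", toy_metric(probs[hard_idx], labels[hard_idx]).round(3))
```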
arXiv Detail & Related papers (2023-01-17T14:50:18Z)
- Estimation and inference for transfer learning with high-dimensional quantile regression [3.4510296013600374]
We propose a transfer learning procedure in the framework of high-dimensional quantile regression models.
We establish error bounds for the transfer learning estimator based on delicately selected transferable source domains.
By adopting a data-splitting technique, we advocate a transferability detection approach that is guaranteed to circumvent negative transfer.
arXiv Detail & Related papers (2022-11-26T14:40:19Z)
- An Exploration of Data Efficiency in Intra-Dataset Task Transfer for Dialog Understanding [65.75873687351553]
This study explores the effects of varying quantities of target task training data on sequential transfer learning in the dialog domain.
Counterintuitively, our data show that the size of the target-task training data often has minimal effect on how sequential transfer learning performs compared to the same model without transfer learning.
arXiv Detail & Related papers (2022-10-21T04:36:46Z)
- A Data-Based Perspective on Transfer Learning [76.30206800557411]
We take a closer look at the role of the source dataset's composition in transfer learning.
Our framework gives rise to new capabilities such as pinpointing transfer learning brittleness.
arXiv Detail & Related papers (2022-07-12T17:58:28Z)
- When does Bias Transfer in Transfer Learning? [89.22641454588278]
Using transfer learning to adapt a pre-trained "source model" to a downstream "target task" can dramatically increase performance with seemingly no downside.
We demonstrate that there can exist a downside after all: bias transfer, or the tendency for biases of the source model to persist even after adapting the model to the target class.
arXiv Detail & Related papers (2022-07-06T17:58:07Z)
- Agree to Disagree: Diversity through Disagreement for Better Transferability [54.308327969778155]
We propose D-BAT (Diversity-By-disAgreement Training), which enforces agreement among the models on the training data but disagreement on out-of-distribution data.
We show how D-BAT naturally emerges from the notion of generalized discrepancy.
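A sketch of a D-BAT-style objective for a pair of binary classifiers: fit the training data while disagreeing on out-of-distribution inputs. The disagreement term and the weight `alpha` below are illustrative assumptions; the paper's exact objective may differ.

```python
# Sketch of a D-BAT-style loss for the second model of a pair: match the
# labels in-distribution, but disagree with the first model on OOD inputs.
import numpy as np

def bce(p, y):
    return -np.mean(y * np.log(p) + (1 - y) * np.log(1 - p))

def disagreement(p1, p2):
    # Low when the two models make opposite predictions on the same input.
    return -np.mean(np.log(p1 * (1 - p2) + (1 - p1) * p2))

def dbat_loss(p2_train, y_train, p1_ood, p2_ood, alpha=0.1):
    return bce(p2_train, y_train) + alpha * disagreement(p1_ood, p2_ood)

# Toy check: perfect fit + perfect OOD disagreement -> near-zero loss.
print(dbat_loss(np.array([0.99, 0.01]), np.array([1, 0]),
                np.array([0.99, 0.01]), np.array([0.01, 0.99])))
```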
arXiv Detail & Related papers (2022-02-09T12:03:02Z)
- A Bayesian Approach to (Online) Transfer Learning: Theory and Algorithms [6.193838300896449]
We study transfer learning from a Bayesian perspective, where a parametric statistical model is used.
Specifically, we study three variants of the transfer learning problem: instantaneous, online, and time-variant transfer learning.
For each problem, we define an appropriate objective function, and provide either exact expressions or upper bounds on the learning performance.
Examples show that the derived bounds are accurate even for small sample sizes.
arXiv Detail & Related papers (2021-09-03T08:43:29Z)
- Learning Bias-Invariant Representation by Cross-Sample Mutual Information Minimization [77.8735802150511]
We propose a cross-sample adversarial debiasing (CSAD) method to remove the bias information misused by the target task.
The correlation measurement plays a critical role in adversarial debiasing and is conducted by a cross-sample neural mutual information estimator.
We conduct thorough experiments on publicly available datasets to validate the advantages of the proposed method over state-of-the-art approaches.
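A sketch of the mutual-information ingredient: a Donsker-Varadhan (MINE-style) lower bound estimated by contrasting aligned pairs with shuffled ("cross-sample") pairs. The fixed toy critic below stands in for CSAD's learned neural estimator.

```python
# Sketch of a Donsker-Varadhan lower bound on I(Z; B):
#   I(Z; B) >= E_joint[T(z, b)] - log E_marginal[exp(T(z, b))],
# estimated from aligned pairs versus shuffled ("cross-sample") pairs.
import numpy as np

rng = np.random.default_rng(4)
n = 2000
bias = rng.normal(size=n)                   # bias attribute b
z = 0.8 * bias + 0.6 * rng.normal(size=n)   # representation correlated with b

def critic(z, b):
    return 0.5 * z * b    # fixed toy critic T(z, b); kept small so exp(T) is stable

joint_term = critic(z, bias).mean()                                       # aligned
marginal_term = np.log(np.exp(critic(z, rng.permutation(bias))).mean())   # shuffled

print("MI lower-bound estimate:", round(joint_term - marginal_term, 3))
# A debiasing method would train the encoder to *minimize* such an estimate.
```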
arXiv Detail & Related papers (2021-08-11T21:17:02Z)
- Online Transfer Learning: Negative Transfer and Effect of Prior Knowledge [6.193838300896449]
We study the online transfer learning problem, where the source samples are given offline while the target samples arrive sequentially.
We define the expected regret of the online transfer learning problem and provide upper bounds on the regret using information-theoretic quantities.
Examples show that the derived bounds are accurate even for small sample sizes.
arXiv Detail & Related papers (2021-05-04T12:12:14Z)
- Source Data-absent Unsupervised Domain Adaptation through Hypothesis Transfer and Labeling Transfer [137.36099660616975]
Unsupervised domain adaptation (UDA) aims to transfer knowledge from a related but different well-labeled source domain to a new unlabeled target domain.
Most existing UDA methods require access to the source data, and thus are not applicable when the data are confidential and not shareable due to privacy concerns.
This paper aims to tackle a realistic setting with only a trained classification model available, instead of access to the source data.
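A generic sketch of the source-data-free setting (a stand-in, not the paper's exact procedure): keep the source classifier head frozen and pseudo-label unlabeled target features via prediction-weighted class centroids.

```python
# Sketch: adapt without source data by self-labeling target features with a
# frozen source classifier head and prediction-weighted class centroids.
import numpy as np

rng = np.random.default_rng(7)
n, d, k = 300, 8, 3
feats = rng.normal(size=(n, d))   # target features (no labels, no source data)
W = rng.normal(size=(d, k))       # frozen classifier head from the source model

logits = feats @ W
probs = np.exp(logits - logits.max(axis=1, keepdims=True))
probs /= probs.sum(axis=1, keepdims=True)

# Prediction-weighted class centroids, then pseudo-label by nearest centroid.
centroids = (probs.T @ feats) / probs.sum(axis=0)[:, None]
dists = ((feats[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=-1)
pseudo_labels = dists.argmin(axis=1)

print("pseudo-label counts per class:", np.bincount(pseudo_labels, minlength=k))
```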
arXiv Detail & Related papers (2020-12-14T07:28:50Z)
- Towards Accurate Knowledge Transfer via Target-awareness Representation Disentanglement [56.40587594647692]
We propose a novel transfer learning algorithm introducing the idea of Target-awareness REpresentation Disentanglement (TRED).
TRED disentangles the knowledge relevant to the target task from the original source model and uses it as a regularizer while fine-tuning the target model.
Experiments on various real-world datasets show that our method stably improves on standard fine-tuning by more than 2% on average.
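A sketch of fine-tuning with a TRED-style regularizer: a task loss plus a penalty tying the fine-tuned features to the target-relevant part of the source features. The boolean mask stands in for the paper's disentanglement step; `lam` and all values are illustrative.

```python
# Sketch: task loss + a penalty pulling fine-tuned features toward the
# target-relevant source features (mask stands in for disentanglement).
import numpy as np

rng = np.random.default_rng(5)
d, lam = 16, 0.1

feat_source = rng.normal(size=d)   # features from the frozen source model
feat_target = rng.normal(size=d)   # current features of the fine-tuned model
relevant = rng.random(d) < 0.5     # stand-in for TRED's relevance selection
task_loss = 0.37                   # placeholder value for the task loss

reg = np.sum(relevant * (feat_target - feat_source) ** 2)
total = task_loss + lam * reg
print(f"task {task_loss:.2f} + {lam} * reg {reg:.2f} = total {total:.2f}")
```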
arXiv Detail & Related papers (2020-10-16T17:45:08Z)
- Exploring and Predicting Transferability across NLP Tasks [115.6278033699853]
We study the transferability between 33 NLP tasks across three broad classes of problems.
Our results show that transfer learning is more beneficial than previously thought.
We also develop task embeddings that can be used to predict the most transferable source tasks for a given target task.
arXiv Detail & Related papers (2020-05-02T09:39:36Z)
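A toy sketch of how such task embeddings could be used: rank candidate source tasks by embedding similarity to the target task. The embeddings and task names below are random stand-ins; the paper derives its embeddings from the models and data themselves.

```python
# Sketch: pick source tasks by cosine similarity of task embeddings.
import numpy as np

rng = np.random.default_rng(6)
names = ["MNLI", "SQuAD", "SST-2", "CoLA"]   # hypothetical source tasks
task_emb = {name: rng.normal(size=32) for name in names}
target_emb = rng.normal(size=32)

def cosine(u, v):
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

ranking = sorted(names, key=lambda n: cosine(task_emb[n], target_emb), reverse=True)
print("predicted best source tasks:", ranking)
```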