Adaptive transfer learning
- URL: http://arxiv.org/abs/2106.04455v1
- Date: Tue, 8 Jun 2021 15:39:43 GMT
- Title: Adaptive transfer learning
- Authors: Henry W. J. Reeve, Timothy I. Cannings, Richard J. Samworth
- Abstract summary: We introduce a flexible framework for transfer learning in the context of binary classification.
We show that the optimal rate can be achieved by an algorithm that adapts to key aspects of the unknown transfer relationship.
- Score: 6.574517227976925
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In transfer learning, we wish to make inference about a target population
when we have access to data both from the distribution itself, and from a
different but related source distribution. We introduce a flexible framework
for transfer learning in the context of binary classification, allowing for
covariate-dependent relationships between the source and target distributions
that are not required to preserve the Bayes decision boundary. Our main
contributions are to derive the minimax optimal rates of convergence (up to
poly-logarithmic factors) in this problem, and show that the optimal rate can
be achieved by an algorithm that adapts to key aspects of the unknown transfer
relationship, as well as the smoothness and tail parameters of our
distributional classes. This optimal rate turns out to have several regimes,
depending on the interplay between the relative sample sizes and the strength
of the transfer relationship, and our algorithm achieves optimality by careful,
decision tree-based calibration of local nearest-neighbour procedures.
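As a purely illustrative, non-adaptive sketch of the kind of local nearest-neighbour procedure the abstract refers to, the snippet below pools labelled target and source samples and down-weights source votes by a fixed, hypothetical weight. This is not the paper's algorithm, which instead calibrates how much to trust the source locally via a decision tree and adapts to the unknown transfer relationship.

```python
import numpy as np

def knn_transfer_predict(x, X_tgt, y_tgt, X_src, y_src, k=5, source_weight=0.5):
    """Classify point x by a weighted vote among its k nearest neighbours
    drawn from the pooled target and source samples.  `source_weight` is a
    fixed, hypothetical discount on source votes; the paper's adaptive
    algorithm calibrates this trust locally rather than fixing it."""
    X = np.vstack([X_tgt, X_src])
    y = np.concatenate([y_tgt, y_src])
    w = np.concatenate([np.ones(len(X_tgt)),
                        source_weight * np.ones(len(X_src))])
    dists = np.linalg.norm(X - x, axis=1)        # distances to all pooled points
    nn = np.argsort(dists)[:k]                   # indices of the k nearest
    vote = np.sum(w[nn] * (2 * y[nn] - 1))       # labels assumed to be in {0, 1}
    return int(vote > 0)

# Toy usage on synthetic data.
rng = np.random.default_rng(0)
X_tgt, y_tgt = rng.normal(size=(30, 2)), rng.integers(0, 2, size=30)
X_src, y_src = rng.normal(size=(300, 2)), rng.integers(0, 2, size=300)
print(knn_transfer_predict(np.zeros(2), X_tgt, y_tgt, X_src, y_src))
```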
Related papers
- Uncertainty Quantification via Stable Distribution Propagation [60.065272548502]
We propose a new approach for propagating stable probability distributions through neural networks.
Our method is based on local linearization, which we show to be an optimal approximation in terms of total variation distance for the ReLU non-linearity.
arXiv Detail & Related papers (2024-02-13T09:40:19Z) - A Class-aware Optimal Transport Approach with Higher-Order Moment
Matching for Unsupervised Domain Adaptation [33.712557972990034]
Unsupervised domain adaptation (UDA) aims to transfer knowledge from a labeled source domain to an unlabeled target domain.
We introduce a novel approach called class-aware optimal transport (OT), which measures the OT distance between the target distribution and a distribution over the source class-conditional distributions.
arXiv Detail & Related papers (2024-01-29T08:27:31Z) - Distributed Markov Chain Monte Carlo Sampling based on the Alternating
Direction Method of Multipliers [143.6249073384419]
In this paper, we propose a distributed sampling scheme based on the alternating direction method of multipliers.
We provide both theoretical guarantees of our algorithm's convergence and experimental evidence of its superiority to the state-of-the-art.
In simulation, we deploy our algorithm on linear and logistic regression tasks and illustrate its fast convergence compared to existing gradient-based methods.
arXiv Detail & Related papers (2024-01-29T02:08:40Z) - Robust Transfer Learning with Unreliable Source Data [13.276850367115333]
We introduce a novel quantity called the "ambiguity level" that measures the discrepancy between the target and source regression functions.
We propose a simple transfer learning procedure, and establish a general theorem that shows how this new quantity is related to the transferability of learning.
arXiv Detail & Related papers (2023-10-06T21:50:21Z) - Compressed Regression over Adaptive Networks [58.79251288443156]
We derive the performance achievable by a network of distributed agents that solve, adaptively and in the presence of communication constraints, a regression problem.
We devise an optimized allocation strategy where the parameters necessary for the optimization can be learned online by the agents.
arXiv Detail & Related papers (2023-04-07T13:41:08Z) - Variance-Reduced Heterogeneous Federated Learning via Stratified Client
Selection [31.401919362978017]
We propose a novel stratified client selection scheme to reduce the variance for the pursuit of better convergence and higher accuracy.
We present an optimized sample size allocation scheme that accounts for the differences in variability across strata.
Experimental results confirm that our approach not only allows for better performance relative to state-of-the-art methods but also is compatible with prevalent FL algorithms.
arXiv Detail & Related papers (2022-01-15T05:41:36Z) - A Variational Bayesian Approach to Learning Latent Variables for
Acoustic Knowledge Transfer [55.20627066525205]
We propose a variational Bayesian (VB) approach to learning distributions of latent variables in deep neural network (DNN) models.
Our proposed VB approach obtains good improvements on target devices and consistently outperforms 13 state-of-the-art knowledge transfer algorithms.
arXiv Detail & Related papers (2021-10-16T15:54:01Z) - Score-based Generative Neural Networks for Large-Scale Optimal Transport [15.666205208594565]
In certain cases, the optimal transport plan takes the form of a one-to-one mapping from the source support to the target support.
We study instead the Sinkhorn problem, a regularized form of optimal transport whose solutions are couplings between the source and the target distribution.
We introduce a novel framework for learning the Sinkhorn coupling between two distributions in the form of a score-based generative model (a minimal Sinkhorn sketch appears after this list).
arXiv Detail & Related papers (2021-10-07T07:45:39Z) - Variational Refinement for Importance Sampling Using the Forward
Kullback-Leibler Divergence [77.06203118175335]
Variational Inference (VI) is a popular alternative to exact sampling in Bayesian inference.
Importance sampling (IS) is often used to fine-tune and de-bias the estimates of approximate Bayesian inference procedures.
We propose a novel combination of optimization and sampling techniques for approximate Bayesian inference.
arXiv Detail & Related papers (2021-06-30T11:00:24Z) - Learning Calibrated Uncertainties for Domain Shift: A Distributionally
Robust Learning Approach [150.8920602230832]
We propose a framework for learning calibrated uncertainties under domain shifts.
In particular, the estimated density ratio reflects the closeness of a target (test) sample to the source (training) distribution.
We show that our proposed method generates calibrated uncertainties that benefit downstream tasks.
arXiv Detail & Related papers (2020-10-08T02:10:54Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
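The score-based generative OT entry above refers to the Sinkhorn problem, an entropically regularised form of optimal transport whose solution is a coupling between the source and target distributions. For context only (this is not that paper's score-based framework, which targets continuous distributions), a minimal discrete Sinkhorn iteration looks like the sketch below.

```python
import numpy as np

def sinkhorn_coupling(a, b, C, eps=0.1, n_iters=500):
    """Sinkhorn iterations for entropically regularised optimal transport:
    returns a coupling P with (approximate) row marginals a and column
    marginals b, trading off transport cost <P, C> against entropy via
    the regularisation strength eps."""
    K = np.exp(-C / eps)                 # Gibbs kernel
    u = np.ones_like(a)
    for _ in range(n_iters):
        v = b / (K.T @ u)                # rescale to match column marginal b
        u = a / (K @ v)                  # rescale to match row marginal a
    return u[:, None] * K * v[None, :]

# Toy usage: coupling between two small empirical distributions.
rng = np.random.default_rng(0)
x, y = rng.normal(size=(5, 1)), rng.normal(size=(4, 1)) + 1.0
C = (x - y.T) ** 2                        # squared-distance cost matrix (5 x 4)
a, b = np.full(5, 0.2), np.full(4, 0.25)  # uniform marginals
P = sinkhorn_coupling(a, b, C)
print(P.sum(axis=1), P.sum(axis=0))       # approximately equal to a and b
```

Note that for very small eps the kernel entries underflow; practical implementations run the same recursion in the log domain.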