Transfer Learning for Contextual Multi-armed Bandits
- URL: http://arxiv.org/abs/2211.12612v2
- Date: Thu, 25 Jan 2024 02:31:43 GMT
- Title: Transfer Learning for Contextual Multi-armed Bandits
- Authors: Changxiao Cai, T. Tony Cai, Hongzhe Li
- Abstract summary: We study the problem of transfer learning for nonparametric contextual multi-armed bandits under the covariate shift model.
A novel transfer learning algorithm that attains the minimax regret is proposed.
A simulation study is carried out to illustrate the benefits of utilizing the data from the auxiliary source domains for learning in the target domain.
- Score: 8.97013379960904
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Motivated by a range of applications, we study in this paper the problem of
transfer learning for nonparametric contextual multi-armed bandits under the
covariate shift model, where we have data collected on source bandits before
the start of the target bandit learning. The minimax rate of convergence for
the cumulative regret is established and a novel transfer learning algorithm
that attains the minimax regret is proposed. The results quantify the
contribution of the data from the source domains for learning in the target
domain in the context of nonparametric contextual multi-armed bandits.
In view of the general impossibility of adaptation to unknown smoothness, we
develop a data-driven algorithm that achieves near-optimal statistical
guarantees (up to a logarithmic factor) while automatically adapting to the
unknown parameters over a large collection of parameter spaces under an
additional self-similarity assumption. A simulation study is carried out to
illustrate the benefits of utilizing the data from the auxiliary source domains
for learning in the target domain.
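The setting described in the abstract can be illustrated with a small simulation. The sketch below is not the paper's algorithm: it is a plain binned UCB rule, chosen here only to show how reward data logged on a source bandit (whose covariates follow a different distribution) can warm-start learning on the target bandit. The reward functions, the Beta(2, 5) source covariate distribution, the bin count, and all function names are assumptions made for illustration.

```python
import math
import random

N_ARMS = 2


def mean_reward(arm, x):
    """Hypothetical smooth reward functions, identical in source and target."""
    slope = 0.6 if arm == 0 else -0.6
    return 0.5 + slope * (x - 0.5)


def bin_of(x, n_bins):
    """Index of the equal-width bin of [0, 1] that contains x."""
    return min(int(x * n_bins), n_bins - 1)


def empty_stats(n_bins):
    """Per-bin, per-arm [pull count, reward sum], all zero."""
    return [[[0, 0.0] for _ in range(N_ARMS)] for _ in range(n_bins)]


def add_source_data(stats, n_source, rng):
    """Log source pulls whose covariates follow Beta(2, 5), not Uniform(0, 1)."""
    n_bins = len(stats)
    for _ in range(n_source):
        x = rng.betavariate(2, 5)          # assumed source covariate law
        arm = rng.randrange(N_ARMS)        # assumed uniform logging policy
        reward = 1.0 if rng.random() < mean_reward(arm, x) else 0.0
        cell = stats[bin_of(x, n_bins)][arm]
        cell[0] += 1
        cell[1] += reward


def run_target(stats, horizon, rng):
    """Binned UCB on Uniform(0, 1) covariates; returns cumulative regret."""
    n_bins = len(stats)
    regret = 0.0
    for t in range(horizon):
        x = rng.random()
        cells = stats[bin_of(x, n_bins)]

        def ucb(arm):
            count, total = cells[arm]
            if count == 0:
                return float("inf")        # force one pull of unseen arms
            return total / count + math.sqrt(2 * math.log(t + 2) / count)

        arm = max(range(N_ARMS), key=ucb)
        reward = 1.0 if rng.random() < mean_reward(arm, x) else 0.0
        cells[arm][0] += 1
        cells[arm][1] += reward
        best = max(mean_reward(a, x) for a in range(N_ARMS))
        regret += best - mean_reward(arm, x)
    return regret


if __name__ == "__main__":
    cold = empty_stats(8)                  # target data only
    warm = empty_stats(8)                  # source data added first
    add_source_data(warm, 2000, random.Random(0))
    print("cold start regret:", run_target(cold, 1000, random.Random(1)))
    print("warm start regret:", run_target(warm, 1000, random.Random(1)))
```

Because the source covariates concentrate on the lower part of [0, 1], the warm start mostly helps in the bins the source actually covers; how much source data helps under a given covariate shift is the kind of question the paper's minimax results quantify.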
Related papers
- Non-Stationary Learning of Neural Networks with Automatic Soft Parameter Reset [98.52916361979503]
We introduce a novel learning approach that automatically models and adapts to non-stationarity.
We show empirically that our approach performs well in non-stationary supervised and off-policy reinforcement learning settings.
arXiv Detail & Related papers (2024-11-06T16:32:40Z)
- Robust Transfer Learning with Unreliable Source Data [13.276850367115333]
We introduce a novel quantity called the "ambiguity level" that measures the discrepancy between the target and source regression functions.
We propose a simple transfer learning procedure, and establish a general theorem that shows how this new quantity is related to the transferability of learning.
arXiv Detail & Related papers (2023-10-06T21:50:21Z)
- Federated Learning for Heterogeneous Bandits with Unobserved Contexts [0.0]
We study the problem of federated multi-arm contextual bandits with unknown contexts.
We propose an elimination-based algorithm and prove the regret bound for linearly parametrized reward functions.
arXiv Detail & Related papers (2023-03-29T22:06:24Z)
- MAPS: A Noise-Robust Progressive Learning Approach for Source-Free Domain Adaptive Keypoint Detection [76.97324120775475]
Cross-domain keypoint detection methods always require accessing the source data during adaptation.
This paper considers source-free domain adaptive keypoint detection, where only the well-trained source model is provided to the target domain.
arXiv Detail & Related papers (2023-02-09T12:06:08Z)
- Estimation and inference for transfer learning with high-dimensional quantile regression [3.4510296013600374]
We propose a transfer learning procedure in the framework of high-dimensional quantile regression models.
We establish error bounds for the transfer learning estimator based on carefully selected transferable source domains.
By adopting a data-splitting technique, we propose a transferability detection approach that is guaranteed to avoid negative transfer.
arXiv Detail & Related papers (2022-11-26T14:40:19Z)
- On Generalizing Beyond Domains in Cross-Domain Continual Learning [91.56748415975683]
Deep neural networks often suffer from catastrophic forgetting of previously learned knowledge after learning a new task.
Our proposed approach learns new tasks under domain shift with accuracy boosts up to 10% on challenging datasets such as DomainNet and OfficeHome.
arXiv Detail & Related papers (2022-03-08T09:57:48Z)
- Risk Minimization from Adaptively Collected Data: Guarantees for Supervised and Policy Learning [57.88785630755165]
Empirical risk minimization (ERM) is the workhorse of machine learning, but its model-agnostic guarantees can fail when we use adaptively collected data.
We study a generic importance sampling weighted ERM algorithm for using adaptively collected data to minimize the average of a loss function over a hypothesis class.
For policy learning, we provide rate-optimal regret guarantees that close an open gap in the existing literature whenever exploration decays to zero.
arXiv Detail & Related papers (2021-06-03T09:50:13Z)
- Learning Invariant Representations and Risks for Semi-supervised Domain Adaptation [109.73983088432364]
We propose the first method that aims to simultaneously learn invariant representations and risks under the setting of semi-supervised domain adaptation (Semi-DA).
We introduce the LIRR algorithm for jointly Learning Invariant Representations and Risks.
arXiv Detail & Related papers (2020-10-09T15:42:35Z)
- Adversarial Weighting for Domain Adaptation in Regression [4.34858896385326]
We present a novel instance-based approach to handle regression tasks in the context of supervised domain adaptation.
We develop an adversarial network algorithm which learns both the source weighting scheme and the task in one feed-forward gradient descent.
arXiv Detail & Related papers (2020-06-15T09:44:04Z)
- Unsupervised Transfer Learning with Self-Supervised Remedy [60.315835711438936]
Generalising deep networks to novel domains without manual labels is a challenge for deep learning.
Pre-learned knowledge does not transfer well without making strong assumptions about the learned and the novel domains.
In this work, we aim to learn a discriminative latent space of the unlabelled target data in a novel domain by knowledge transfer from labelled related domains.
arXiv Detail & Related papers (2020-06-08T16:42:17Z)