Multi-Environment Meta-Learning in Stochastic Linear Bandits
- URL: http://arxiv.org/abs/2205.06326v1
- Date: Thu, 12 May 2022 19:31:28 GMT
- Title: Multi-Environment Meta-Learning in Stochastic Linear Bandits
- Authors: Ahmadreza Moradipari, Mohammad Ghavamzadeh, Taha Rajabzadeh, Christos
Thrampoulidis, Mahnoosh Alizadeh
- Abstract summary: We consider the feasibility of meta-learning when task parameters are drawn from a mixture distribution instead of a single environment.
We propose a regularized version of the OFUL algorithm that achieves low regret on a new task without requiring knowledge of the environment from which the new task originates.
- Score: 49.387421094105136
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In this work we investigate meta-learning (or learning-to-learn) approaches
in multi-task linear stochastic bandit problems that can originate from
multiple environments. Inspired by the work of [1] on meta-learning in a
sequence of linear bandit problems whose parameters are sampled from a single
distribution (i.e., a single environment), here we consider the feasibility of
meta-learning when task parameters are drawn from a mixture distribution
instead. For this problem, we propose a regularized version of the OFUL
algorithm that, when trained on tasks with labeled environments, achieves low
regret on a new task without requiring knowledge of the environment from which
the new task originates. Specifically, our regret bound for the new algorithm
captures the effect of environment misclassification and highlights the
benefits over learning each task separately or meta-learning without
recognition of the distinct mixture components.
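  For concreteness, below is a minimal sketch of one round of such a regularized (biased) OFUL update: a ridge estimate shrunk toward a bias vector b, plus an optimistic action choice. The confidence width beta, the regularization weight lam, and the name biased_oful_step are illustrative assumptions, not the paper's exact construction; in the multi-environment setting, b would be the learner's estimate of the mean parameter of the environment it attributes the current task to.

```python
import numpy as np

def biased_oful_step(X, r, arms, b, lam=1.0, beta=1.0):
    """One round of OFUL with ridge regularization toward a bias vector b.

    A minimal sketch, not the paper's exact algorithm.
    X    : (t, d) contexts played so far
    r    : (t,)   observed rewards
    arms : (K, d) candidate actions for this round
    b    : (d,)   bias vector toward which the estimate is shrunk
    """
    d = arms.shape[1]
    V = X.T @ X + lam * np.eye(d)              # regularized design matrix
    V_inv = np.linalg.inv(V)
    theta_hat = V_inv @ (X.T @ r + lam * b)    # ridge estimate shrunk toward b
    # optimistic index: estimated reward plus an ellipsoidal exploration bonus
    bonus = beta * np.sqrt(np.einsum("kd,de,ke->k", arms, V_inv, arms))
    ucb = arms @ theta_hat + bonus
    return int(np.argmax(ucb)), theta_hat
```

  With b = 0 this reduces to standard OFUL; the paper's regret bound quantifies how a poorly chosen b, e.g., due to environment misclassification, degrades performance.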
Related papers
- Meta-Learning with Heterogeneous Tasks [42.695853959923625]
Heterogeneous Tasks Robust Meta-learning (HeTRoM) addresses meta-learning when the tasks are heterogeneous.
It relies on an efficient iterative optimization algorithm based on bi-level optimization.
Results demonstrate that the method provides flexibility, enabling users to adapt to diverse task settings.
arXiv Detail & Related papers (2024-10-24T16:32:23Z)
- Algorithm Design for Online Meta-Learning with Task Boundary Detection [63.284263611646]
We propose a novel algorithm for task-agnostic online meta-learning in non-stationary environments.
We first propose two simple but effective mechanisms for detecting task switches and distribution shift.
We show that a sublinear task-averaged regret can be achieved for our algorithm under mild conditions.
arXiv Detail & Related papers (2023-02-02T04:02:49Z)
- ImpressLearn: Continual Learning via Combined Task Impressions [0.0]
This work proposes a new method to sequentially train a deep neural network on multiple tasks without suffering catastrophic forgetting.
We show that simply learning a linear combination of a small number of task-specific masks on a randomly initialized backbone network is sufficient both to retain accuracy on previously learned tasks and to achieve high accuracy on new tasks.
arXiv Detail & Related papers (2022-10-05T02:28:25Z)
- New Tight Relaxations of Rank Minimization for Multi-Task Learning [161.23314844751556]
We propose two novel multi-task learning formulations based on two regularization terms.
We show that our methods can correctly recover the low-rank structure shared across tasks, and outperform related multi-task learning methods.
arXiv Detail & Related papers (2021-12-09T07:29:57Z)
- Dynamic Regret Analysis for Online Meta-Learning [0.0]
The online meta-learning framework has arisen as a powerful tool for the continual lifelong learning setting.
This formulation involves two levels: an outer level that learns meta-learners and an inner level that learns task-specific models.
We establish performance in terms of dynamic regret, which handles changing environments from a global perspective.
We carry out our analyses in expectation and prove a logarithmic local dynamic regret that depends explicitly on the total number of iterations.
arXiv Detail & Related papers (2021-09-29T12:12:59Z)
- MetaKernel: Learning Variational Random Features with Limited Labels [120.90737681252594]
Few-shot learning deals with the fundamental and challenging problem of learning from a few annotated samples, while being able to generalize well on new tasks.
We propose to meta-learn kernels with random Fourier features for few-shot learning, which we call MetaKernel.
arXiv Detail & Related papers (2021-05-08T21:24:09Z)
- A Distribution-Dependent Analysis of Meta-Learning [13.24264919706183]
A key problem in the theory of meta-learning is to understand how the task distributions influence transfer risk.
In this paper, we give distribution-dependent lower bounds on the transfer risk of any algorithm.
We show that a novel, weighted version of the so-called biased regularized regression method is able to match these lower bounds up to a fixed constant factor.
arXiv Detail & Related papers (2020-10-31T19:36:15Z)
- Adaptive Task Sampling for Meta-Learning [79.61146834134459]
The key idea of meta-learning for few-shot classification is to mimic the few-shot situations faced at test time.
We propose an adaptive task sampling method to improve the generalization performance.
arXiv Detail & Related papers (2020-07-17T03:15:53Z)
- Meta-learning with Stochastic Linear Bandits [120.43000970418939]
We consider a class of bandit algorithms that implement a regularized version of the well-known OFUL algorithm, where the regularization is a squared Euclidean distance to a bias vector.
We show both theoretically and experimentally, that when the number of tasks grows and the variance of the task-distribution is small, our strategies have a significant advantage over learning the tasks in isolation.
arXiv Detail & Related papers (2020-05-18T08:41:39Z)
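  Concretely, the squared-Euclidean-distance regularization described above corresponds to a biased ridge estimator. A sketch in assumed notation (contexts x_s, rewards r_s, bias vector b, regularization weight lambda):

```latex
\hat{\theta}_t \in \arg\min_{\theta \in \mathbb{R}^d}
  \sum_{s=1}^{t} \bigl( r_s - \langle x_s, \theta \rangle \bigr)^2
  + \lambda \, \lVert \theta - b \rVert_2^2
\quad\Longrightarrow\quad
\hat{\theta}_t = \bigl( X_t^\top X_t + \lambda I \bigr)^{-1}
  \bigl( X_t^\top r_{1:t} + \lambda b \bigr)
```

  Setting b to an estimate of the mean of the task distribution is what yields the advantage over learning each task in isolation when the variance of the task distribution is small, the regime highlighted in this paper's results.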