Multi-task Representation Learning for Pure Exploration in Bilinear Bandits
- URL: http://arxiv.org/abs/2311.00327v1
- Date: Wed, 1 Nov 2023 06:30:45 GMT
- Title: Multi-task Representation Learning for Pure Exploration in Bilinear Bandits
- Authors: Subhojyoti Mukherjee, Qiaomin Xie, Josiah P. Hanna, Robert Nowak
- Abstract summary: We study multi-task representation learning for the problem of pure exploration in bilinear bandits.
In bilinear bandits, an action takes the form of a pair of arms from two different entity types.
- Score: 13.773838574776338
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We study multi-task representation learning for the problem of pure
exploration in bilinear bandits. In bilinear bandits, an action takes the form
of a pair of arms from two different entity types and the reward is a bilinear
function of the known feature vectors of the arms. In the \textit{multi-task
bilinear bandit problem}, we aim to find optimal actions for multiple tasks
that share a common low-dimensional linear representation. The objective is to
leverage this characteristic to expedite the process of identifying the best
pair of arms for all tasks. We propose the algorithm GOBLIN that uses an
experimental design approach to optimize sample allocations for learning the
global representation as well as minimize the number of samples needed to
identify the optimal pair of arms in individual tasks. To the best of our
knowledge, this is the first study to give sample complexity analysis for pure
exploration in bilinear bandits with shared representation. Our results
demonstrate that by learning the shared representation across tasks, we achieve
significantly improved sample complexity compared to the traditional approach
of solving tasks independently.
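
To make the setting concrete, below is a minimal Python sketch of the multi-task bilinear bandit model described above. It is a simulation under assumed notation, not the paper's GOBLIN implementation: every name (d1, d2, k, the arm-set sizes, the function pull) is illustrative. Each task m has a hidden parameter matrix Theta_m = B1 @ S_m @ B2.T, where the column spaces B1 and B2 are shared across tasks and S_m is a small task-specific k x k core; pulling an arm pair (x, z) returns the bilinear reward x^T Theta_m z plus noise.

import numpy as np

rng = np.random.default_rng(0)

def orthonormal(d, k, rng):
    # Random d x k matrix with orthonormal columns: a k-dim subspace of R^d.
    q, _ = np.linalg.qr(rng.standard_normal((d, k)))
    return q

# Illustrative sizes: ambient feature dimensions d1, d2 for the two entity
# types, and a shared low-rank dimension k << min(d1, d2).
d1, d2, k, n_tasks = 10, 8, 2, 5

# Shared low-dimensional representation common to all tasks.
B1 = orthonormal(d1, k, rng)
B2 = orthonormal(d2, k, rng)

# Task-specific cores S_m give Theta_m = B1 @ S_m @ B2.T for each task m.
Thetas = [B1 @ rng.standard_normal((k, k)) @ B2.T for _ in range(n_tasks)]

# Finite arm sets for the two entity types (one arm per row).
X = rng.standard_normal((20, d1))  # left arms
Z = rng.standard_normal((15, d2))  # right arms

def pull(task, i, j, noise=0.1):
    # Noisy bilinear reward for the arm pair (x_i, z_j) on the given task.
    return X[i] @ Thetas[task] @ Z[j] + noise * rng.standard_normal()

# Pure-exploration target per task: the pair argmax_{i,j} x_i^T Theta_m z_j.
# Here we read it off from the true Theta_m; a bandit algorithm must
# estimate it from noisy pulls instead.
best = [np.unravel_index(np.argmax(X @ Th @ Z.T), (len(X), len(Z)))
        for Th in Thetas]
print("best (left, right) arm pair per task:", best)

Because every Theta_m lies in the same k-dimensional subspaces, samples pooled across all tasks can estimate B1 and B2 once; after that, each task only needs enough pulls to pin down its small k x k core. This is the intuition behind the sample-complexity savings claimed above, with GOBLIN's experimental-design step choosing which arm pairs to pull at each stage.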
Related papers
- The Power of Active Multi-Task Learning in Reinforcement Learning from Human Feedback [12.388205905012423]
Reinforcement learning from human feedback has contributed to performance improvements in large language models.
We formulate RLHF as the contextual dueling bandit problem and assume a common linear representation.
We prove that to achieve $\varepsilon$-optimality, the sample complexity of the source tasks can be significantly reduced.
arXiv Detail & Related papers (2024-05-18T08:29:15Z)
- Distribution Matching for Multi-Task Learning of Classification Tasks: a Large-Scale Study on Faces & Beyond [62.406687088097605]
Multi-Task Learning (MTL) is a framework, where multiple related tasks are learned jointly and benefit from a shared representation space.
We show that MTL can be successful with classification tasks with little or non-overlapping annotations.
We propose a novel approach, where knowledge exchange is enabled between the tasks via distribution matching.
arXiv Detail & Related papers (2024-01-02T14:18:11Z)
- Active Representation Learning for General Task Space with Applications in Robotics [44.36398212117328]
We propose an algorithmic framework for \textit{active representation learning}, where the learner optimally chooses which source tasks to sample from.
We provide several instantiations under this framework, from bilinear and feature-based nonlinear to general nonlinear cases.
Our algorithms outperform baselines by $20\%-70\%$ on average.
arXiv Detail & Related papers (2023-06-15T08:27:50Z)
- Multi-task Representation Learning for Pure Exploration in Linear Bandits [34.67303292713379]
We study multi-task representation learning for best arm identification in linear bandits (RepBAI-LB) and best policy identification in contextual linear bandits (RepBPI-CLB).
In these two problems, all tasks share a common low-dimensional linear representation, and our goal is to leverage this feature to accelerate the best arm (policy) identification process for all tasks.
We show that by learning the common representation among tasks, our sample complexity is significantly better than that of the naive approach which solves tasks independently.
arXiv Detail & Related papers (2023-02-09T05:14:48Z)
- On the Sample Complexity of Representation Learning in Multi-task Bandits with Global and Local structure [77.60508571062958]
We investigate the sample complexity of learning the optimal arm for multi-task bandit problems.
Arms consist of two components: one that is shared across tasks (that we call representation) and one that is task-specific (that we call predictor).
We devise an algorithm OSRL-SC whose sample complexity approaches the lower bound, and scales at most as $H(G\log(1/\delta_G) + X\log(1/\delta_H))$, with $X, G, H$ being, respectively, the number of tasks, representations and predictors.
arXiv Detail & Related papers (2022-11-28T08:40:12Z)
- Synergies between Disentanglement and Sparsity: Generalization and Identifiability in Multi-Task Learning [79.83792914684985]
We prove a new identifiability result that provides conditions under which maximally sparse base-predictors yield disentangled representations.
Motivated by this theoretical result, we propose a practical approach to learn disentangled representations based on a sparsity-promoting bi-level optimization problem.
arXiv Detail & Related papers (2022-11-26T21:02:09Z)
- Active Multi-Task Representation Learning [50.13453053304159]
We give the first formal study on resource task sampling by leveraging the techniques from active learning.
We propose an algorithm that iteratively estimates the relevance of each source task to the target task and samples from each source task based on the estimated relevance.
arXiv Detail & Related papers (2022-02-02T08:23:24Z)
- Non-Stationary Representation Learning in Sequential Linear Bandits [22.16801879707937]
We study representation learning for multi-task decision-making in non-stationary environments.
We propose an online algorithm that facilitates efficient decision-making by learning and transferring non-stationary representations in an adaptive fashion.
arXiv Detail & Related papers (2022-01-13T06:13:03Z)
- Metadata-based Multi-Task Bandits with Bayesian Hierarchical Models [7.458639397686894]
How to explore efficiently is a central problem in multi-armed bandits.
We introduce the metadata-based multi-task bandit problem.
We propose to capture task relations through the lens of Bayesian hierarchical models.
arXiv Detail & Related papers (2021-08-13T22:45:05Z)
- Efficient Pure Exploration for Combinatorial Bandits with Semi-Bandit Feedback [51.21673420940346]
Combinatorial bandits generalize multi-armed bandits, where the agent chooses sets of arms and observes a noisy reward for each arm contained in the chosen set.
We focus on the pure-exploration problem of identifying the best arm with fixed confidence, as well as a more general setting, where the structure of the answer set differs from the one of the action set.
Our algorithm, based on a projection-free online learning algorithm for finite polytopes, is the first computationally efficient one that is asymptotically optimal and has competitive empirical performance.
arXiv Detail & Related papers (2021-01-21T10:35:09Z)
- Multi-task Supervised Learning via Cross-learning [102.64082402388192]
We consider a problem known as multi-task learning, consisting of fitting a set of regression functions intended for solving different tasks.
In our novel formulation, we couple the parameters of these functions, so that they learn in their task specific domains while staying close to each other.
This facilitates cross-fertilization, in which data collected across different domains helps improve learning performance on every other task.
arXiv Detail & Related papers (2020-10-24T21:35:57Z)