On the Sample Complexity of Representation Learning in Multi-task
Bandits with Global and Local structure
- URL: http://arxiv.org/abs/2211.15129v1
- Date: Mon, 28 Nov 2022 08:40:12 GMT
- Title: On the Sample Complexity of Representation Learning in Multi-task
Bandits with Global and Local structure
- Authors: Alessio Russo, Alexandre Proutiere
- Abstract summary: We investigate the sample complexity of learning the optimal arm for multi-task bandit problems.
Arms consist of two components: one that is shared across tasks (that we call representation) and one that is task-specific (that we call predictor).
We devise an algorithm OSRL-SC whose sample complexity approaches the lower bound, and scales at most as $H(G\log(1/\delta_G) + X\log(1/\delta_H))$, with $X, G, H$ being, respectively, the number of tasks, representations and predictors.
- Score: 77.60508571062958
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We investigate the sample complexity of learning the optimal arm for
multi-task bandit problems. Arms consist of two components: one that is shared
across tasks (that we call representation) and one that is task-specific (that
we call predictor). The objective is to learn the optimal (representation,
predictor)-pair for each task, under the assumption that the optimal
representation is common to all tasks. Within this framework, efficient
learning algorithms should transfer knowledge across tasks. We consider the
best-arm identification problem for a fixed confidence, where, in each round,
the learner actively selects both a task and an arm, and observes the
corresponding reward. We derive instance-specific sample complexity lower
bounds satisfied by any $(\delta_G,\delta_H)$-PAC algorithm (such an algorithm
identifies the best representation with probability at least $1-\delta_G$, and
the best predictor for a task with probability at least $1-\delta_H$). We
devise an algorithm OSRL-SC whose sample complexity approaches the lower bound,
and scales at most as $H(G\log(1/\delta_G)+ X\log(1/\delta_H))$, with $X,G,H$
being, respectively, the number of tasks, representations and predictors. By
comparison, this scaling is significantly better than that of the classical
best-arm identification algorithm, which scales as $HGX\log(1/\delta)$.
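To see the gap concretely, here is a small back-of-the-envelope Python sketch that plugs purely illustrative values of $X$, $G$, $H$ and the confidence levels (assumptions, not numbers from the paper) into the two scalings quoted above:

```python
import math

# Illustrative, assumed problem sizes (not taken from the paper):
# X tasks, G candidate representations, H candidate predictors.
X, G, H = 50, 20, 10
delta_G = delta_H = delta = 0.05

# OSRL-SC scaling quoted in the abstract (up to problem-dependent constants):
# H * (G*log(1/delta_G) + X*log(1/delta_H))
osrl_sc = H * (G * math.log(1 / delta_G) + X * math.log(1 / delta_H))

# Classical best-arm identification over all H*G*X (task, arm) combinations:
# H * G * X * log(1/delta)
classical = H * G * X * math.log(1 / delta)

print(f"OSRL-SC scaling   ~ {osrl_sc:.0f}")
print(f"classical scaling ~ {classical:.0f}")
print(f"ratio             ~ {classical / osrl_sc:.1f}x")
```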
Related papers
- Efficiently Learning One-Hidden-Layer ReLU Networks via Schur
Polynomials [50.90125395570797]
We study the problem of PAC learning a linear combination of $k$ ReLU activations under the standard Gaussian distribution on $\mathbb{R}^d$ with respect to the square loss.
Our main result is an efficient algorithm for this learning task with sample and computational complexity $(dk/\epsilon)^{O(k)}$, where $\epsilon>0$ is the target accuracy.
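As a concrete reading of the function class above, the following Python sketch draws standard Gaussian inputs in $\mathbb{R}^d$, evaluates a linear combination of $k$ ReLU activations, and measures square loss; all dimensions and weights are illustrative assumptions, and the learning algorithm itself is not shown:

```python
import numpy as np

# Sketch of the target class only (not the paper's learning algorithm):
# f(x) = sum_i a_i * ReLU(<w_i, x>) with x ~ N(0, I_d), evaluated under square loss.
rng = np.random.default_rng(0)
d, k, n = 8, 3, 1000               # input dimension, number of ReLUs, sample size (assumed)

W = rng.normal(size=(k, d))        # hidden-unit weight vectors (assumed)
a = rng.normal(size=k)             # combination coefficients (assumed)

X = rng.normal(size=(n, d))        # x ~ N(0, I_d)
y = np.maximum(X @ W.T, 0.0) @ a   # targets f(x)

def square_loss(pred, target):
    return float(np.mean((pred - target) ** 2))

# Baseline: the trivial zero predictor, just to exercise the loss.
print("square loss of the zero predictor:", square_loss(np.zeros(n), y))
```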
arXiv Detail & Related papers (2023-07-24T14:37:22Z) - Active Representation Learning for General Task Space with Applications
in Robotics [44.36398212117328]
We propose an algorithmic framework for active representation learning, where the learner optimally chooses which source tasks to sample from.
We provide several instantiations under this framework, from bilinear and feature-based nonlinear to general nonlinear cases.
Our algorithms outperform baselines by 20%-70% on average.
arXiv Detail & Related papers (2023-06-15T08:27:50Z) - Improved Active Multi-Task Representation Learning via Lasso [44.607652031235716]
In this paper, we show the dominance of the L1-regularized relevance-based ($\nu_1$) strategy by giving a lower bound for the $\nu_2$-based strategy.
We also characterize the potential of our $\nu_1$-based strategy in sample-cost-sensitive settings.
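The sketch below illustrates the general idea behind an L1-regularized ($\nu_1$-style) relevance estimate, under the assumption that relevance is read off from sparse regression coefficients; the synthetic features, penalty level, and proportional allocation rule are illustrative choices, not the paper's exact procedure:

```python
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(1)
n_samples, n_source_tasks = 200, 10

# Synthetic per-source-task features and a target signal that truly depends
# on only a few source tasks (sparse ground truth, assumed for illustration).
source_feats = rng.normal(size=(n_samples, n_source_tasks))
true_relevance = np.array([1.5, 0.0, 0.8, 0, 0, 0, 0, 0, 0, 0])
target_signal = source_feats @ true_relevance + 0.1 * rng.normal(size=n_samples)

# L1-regularized regression; coefficient magnitudes serve as relevance scores.
lasso = Lasso(alpha=0.05).fit(source_feats, target_signal)
relevance = np.abs(lasso.coef_)

# One possible use: allocate the sampling budget proportionally to relevance.
budget_share = relevance / relevance.sum()
print("estimated budget share per source task:", np.round(budget_share, 2))
```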
arXiv Detail & Related papers (2023-06-05T03:08:29Z) - Globally Optimal Algorithms for Fixed-Budget Best Arm Identification [16.500749121196986]
We characterize the optimal rate as a result of global optimization over all possible parameters.
We show that this rate is indeed achievable by introducing a conceptual algorithm called delayed optimal tracking (DOT).
arXiv Detail & Related papers (2022-06-09T17:42:19Z) - Approximate Function Evaluation via Multi-Armed Bandits [51.146684847667125]
We study the problem of estimating the value of a known smooth function $f$ at an unknown point $\boldsymbol{\mu} \in \mathbb{R}^n$, where each component $\mu_i$ can be sampled via a noisy oracle.
We design an instance-adaptive algorithm that learns to sample according to the importance of each coordinate, and with probability at least $1-\delta$ returns an $\epsilon$-accurate estimate of $f(\boldsymbol{\mu})$.
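The following toy, non-adaptive Python illustration captures the importance idea only: for an assumed linear $f(x) = \langle w, x \rangle$, the sampling budget is split in proportion to each coordinate's sensitivity $|w_i|$ before plugging in the empirical means. It is a caricature, not the paper's instance-adaptive algorithm:

```python
import numpy as np

rng = np.random.default_rng(2)
w = np.array([5.0, 1.0, 0.2, 0.2])    # known smooth f(x) = <w, x> (assumed)
mu = np.array([1.0, -2.0, 0.5, 3.0])  # unknown point, reachable only via a noisy oracle
budget = 4000

def oracle(i):
    # noisy sample of coordinate mu_i with unit-variance Gaussian noise
    return mu[i] + rng.normal(scale=1.0)

# Allocate samples proportionally to coordinate importance |w_i|.
alloc = np.maximum((budget * np.abs(w) / np.abs(w).sum()).astype(int), 1)
mu_hat = np.array([np.mean([oracle(i) for _ in range(alloc[i])]) for i in range(len(mu))])

print("plug-in estimate:", float(w @ mu_hat))
print("true f(mu)      :", float(w @ mu))
```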
arXiv Detail & Related papers (2022-03-18T18:50:52Z) - Mean-based Best Arm Identification in Stochastic Bandits under Reward
Contamination [80.53485617514707]
This paper proposes two algorithms for best arm identification in sub-Gaussian bandits: a gap-based algorithm and one based on successive elimination.
Specifically, the sample complexity of the gap-based algorithm is optimal up to constant factors, while that of successive elimination is optimal up to logarithmic factors.
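For reference, here is a minimal sketch of vanilla successive elimination with 1-sub-Gaussian rewards; the paper's gap-based and elimination algorithms additionally handle adversarial reward contamination, which this uncontaminated sketch does not attempt:

```python
import numpy as np

def successive_elimination(means, delta=0.05, seed=3):
    """Return the index of the arm identified as best (vanilla, uncontaminated)."""
    rng = np.random.default_rng(seed)
    K = len(means)
    active = list(range(K))
    emp, counts, t = np.zeros(K), np.zeros(K), 0
    while len(active) > 1:
        t += 1
        for a in active:                       # pull every surviving arm once per round
            r = means[a] + rng.normal()        # 1-sub-Gaussian reward
            counts[a] += 1
            emp[a] += (r - emp[a]) / counts[a]
        radius = np.sqrt(2 * np.log(4 * K * t * t / delta) / t)
        best = max(active, key=lambda a: emp[a])
        active = [a for a in active if emp[best] - emp[a] <= 2 * radius]
    return active[0]

print("identified arm:", successive_elimination(np.array([0.2, 0.6, 0.4, 0.1])))
```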
arXiv Detail & Related papers (2021-11-14T21:49:58Z) - Fast Learning for Renewal Optimization in Online Task Scheduling [18.935043109084543]
This paper considers online optimization of a renewal-reward system.
The probability distribution for the task type vector is unknown.
The proposed algorithm uses an auxiliary variable that is updated according to a classic Robbins-Monro iteration.
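The Robbins-Monro iteration mentioned above is the classic stochastic-approximation update $\theta_{n+1} = \theta_n - a_n\,G(\theta_n, W_n)$ with step sizes $a_n = c/n$; the sketch below applies it to a toy root-finding problem (the target function, noise, and step-size constant are assumptions, not the paper's scheduling objective):

```python
import numpy as np

rng = np.random.default_rng(4)

def noisy_G(theta):
    # Toy mean field: E[G(theta, W)] = theta - 2, so the root is theta* = 2.
    return (theta - 2.0) + rng.normal(scale=0.5)

theta, c = 0.0, 1.0
for n in range(1, 5001):
    theta -= (c / n) * noisy_G(theta)   # Robbins-Monro update with a_n = c/n

print("estimated root:", round(theta, 3))   # should be close to 2.0
```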
arXiv Detail & Related papers (2020-07-18T22:44:13Z) - A No-Free-Lunch Theorem for MultiTask Learning [19.645741778058227]
We consider a seemingly favorable classification scenario where all tasks $P_t$ share a common optimal classifier $h^*$.
We show that, even though such regimes admit minimax rates accounting for both $n$ and $N$, no adaptive algorithm exists.
arXiv Detail & Related papers (2020-06-29T03:03:29Z) - On the Theory of Transfer Learning: The Importance of Task Diversity [114.656572506859]
We consider $t+1$ tasks parameterized by functions of the form $f_j \circ h$ in a general function class $\mathcal{F} \circ \mathcal{H}$.
We show that for diverse training tasks the sample complexity needed to learn the shared representation across the first $t$ training tasks scales as $C(\mathcal{H}) + t\,C(\mathcal{F})$.
arXiv Detail & Related papers (2020-06-20T20:33:59Z) - FANOK: Knockoffs in Linear Time [73.5154025911318]
We describe a series of algorithms that efficiently implement Gaussian model-X knockoffs to control the false discovery rate on large scale feature selection problems.
We test our methods on problems with $p$ as large as $500,000$.
arXiv Detail & Related papers (2020-06-15T21:55:34Z)