Robust Meta-learning for Mixed Linear Regression with Small Batches
- URL: http://arxiv.org/abs/2006.09702v2
- Date: Thu, 18 Jun 2020 18:46:30 GMT
- Title: Robust Meta-learning for Mixed Linear Regression with Small Batches
- Authors: Weihao Kong, Raghav Somani, Sham Kakade, Sewoong Oh
- Abstract summary: We study a fundamental question: can abundant small-data tasks compensate for the lack of big-data tasks?
Existing approaches show that such a trade-off is efficiently achievable, with the help of medium-sized tasks with $\Omega(k^{1/2})$ examples each.
We introduce a spectral approach that is simultaneously robust under both scenarios.
- Score: 34.94138630547603
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: A common challenge faced in practical supervised learning, such as medical
image processing and robotic interactions, is that there are plenty of tasks
but each task cannot afford to collect enough labeled examples to be learned in
isolation. However, by exploiting the similarities across those tasks, one can
hope to overcome such data scarcity. Under a canonical scenario where each task
is drawn from a mixture of k linear regressions, we study a fundamental
question: can abundant small-data tasks compensate for the lack of big-data
tasks? Existing second moment based approaches show that such a trade-off is
efficiently achievable, with the help of medium-sized tasks with
$\Omega(k^{1/2})$ examples each. However, this algorithm is brittle in two
important scenarios. The predictions can be arbitrarily bad (i) even with only
a few outliers in the dataset; or (ii) even if the medium-sized tasks are
slightly smaller with $o(k^{1/2})$ examples each. We introduce a spectral
approach that is simultaneously robust under both scenarios. To this end, we
first design a novel outlier-robust principal component analysis algorithm that
achieves an optimal accuracy. This is followed by a sum-of-squares algorithm to
exploit the information from higher order moments. Together, this approach is
robust against outliers and achieves a graceful statistical trade-off; the lack
of $\Omega(k^{1/2})$-size tasks can be compensated for with smaller tasks,
which can now be as small as $O(\log k)$.
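As a rough illustration of the second-moment idea underlying such spectral approaches (not the paper's outlier-robust PCA or its sum-of-squares refinement), the sketch below pools pairwise statistics from many small tasks to estimate the span of the mixture's regression vectors. It assumes Gaussian covariates, and all function and variable names are illustrative.
```python
import numpy as np

def estimate_regressor_subspace(tasks, k):
    """Illustrative second-moment estimate of span{beta_1, ..., beta_k}.

    tasks: list of (X, y) with X of shape (t, d), rows drawn i.i.d. N(0, I_d),
           and y = X @ beta_c + noise for a mixture component c fixed per task.
    Returns an orthonormal (d, k) basis for the estimated subspace.
    """
    d = tasks[0][0].shape[1]
    M = np.zeros((d, d))
    num_pairs = 0
    for X, y in tasks:
        t = len(y)
        if t < 2:
            continue  # the pairwise statistic needs at least two examples per task
        yx = X * y[:, None]          # rows are y_i * x_i
        s = yx.sum(axis=0)           # sum_i y_i x_i
        # Sum over ordered pairs i != j of y_i y_j x_i x_j^T; each term is an
        # unbiased estimate of beta_c beta_c^T under the Gaussian design.
        M += np.outer(s, s) - yx.T @ yx
        num_pairs += t * (t - 1)
    M /= num_pairs
    # The top-k eigenspace of sum_c w_c beta_c beta_c^T spans the regressors.
    _, eigvecs = np.linalg.eigh((M + M.T) / 2)
    return eigvecs[:, -k:]
```
Projecting onto the returned basis reduces each task to a $k$-dimensional regression; per the abstract, the paper's approach makes this spectral step robust to outliers and to tasks with as few as $O(\log k)$ examples.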
Related papers
- Provable Multi-Task Representation Learning by Two-Layer ReLU Neural Networks [69.38572074372392]
We present the first results proving that feature learning occurs during training with a nonlinear model on multiple tasks.
Our key insight is that multi-task pretraining induces a pseudo-contrastive loss that favors representations that align points that typically have the same label across tasks.
arXiv Detail & Related papers (2023-07-13T16:39:08Z)
- On the Sample Complexity of Representation Learning in Multi-task Bandits with Global and Local structure [77.60508571062958]
We investigate the sample complexity of learning the optimal arm for multi-task bandit problems.
Arms consist of two components: one that is shared across tasks (that we call the representation) and one that is task-specific (that we call the predictor).
We devise an algorithm OSRL-SC whose sample complexity approaches the lower bound, and scales at most as $H(G\log(\delta_G)+X\log(\delta_H))$, with $X, G, H$ being, respectively, the number of tasks, representations, and predictors.
arXiv Detail & Related papers (2022-11-28T08:40:12Z)
- Meta Learning for High-dimensional Ising Model Selection Using $\ell_1$-regularized Logistic Regression [28.776950569604026]
We consider the meta learning problem for estimating the graphs associated with high-dimensional Ising models.
Our goal is to use the information learned from the auxiliary tasks when learning the novel task, so as to reduce its sufficient sample complexity.
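As a hedged sketch of the single-task building block (classical $\ell_1$-regularized neighborhood selection, not this paper's meta-learning procedure), a node's neighborhood in an Ising model can be estimated by an $\ell_1$-penalized logistic regression of its spin on all other spins; the function name, regularization value, and threshold below are illustrative.
```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def ising_neighborhood(samples, node, lam=0.1):
    """Estimate the neighbors of `node` by l1-regularized logistic regression
    of that node's spin on the remaining spins (neighborhood selection)."""
    y = (samples[:, node] > 0).astype(int)   # target spin as a 0/1 label
    X = np.delete(samples, node, axis=1)     # all other spins as features
    clf = LogisticRegression(penalty="l1", C=1.0 / lam, solver="liblinear")
    clf.fit(X, y)
    # Nonzero coefficients mark estimated neighbors (indices skip `node`).
    return np.flatnonzero(np.abs(clf.coef_[0]) > 1e-6)
```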
arXiv Detail & Related papers (2022-08-19T20:28:39Z)
- Sample Efficient Linear Meta-Learning by Alternating Minimization [74.40553081646995]
We study a simple alternating minimization method (MLLAM) which alternately learns the low-dimensional subspace and the regressors.
We show that for a constant subspace dimension MLLAM obtains nearly-optimal estimation error, despite requiring only $\Omega(\log d)$ samples per task.
We propose a novel task subset selection scheme that ensures the same strong statistical guarantee as MLLAM.
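A generic alternating-minimization loop of this flavor can be sketched as follows (an illustration under the shared-subspace model $\beta_t = U w_t$, not the MLLAM algorithm or its task-subset selection; all names are illustrative).
```python
import numpy as np

def alt_min_meta_learning(tasks, r, n_iters=50, seed=0):
    """Illustrative alternating minimization for linear meta-learning.

    tasks: list of (X, y) with X of shape (n_t, d); tasks are assumed to share
    regressors of the form beta_t = U @ w_t with U a (d, r) subspace basis.
    """
    d = tasks[0][0].shape[1]
    rng = np.random.default_rng(seed)
    U, _ = np.linalg.qr(rng.standard_normal((d, r)))   # random orthonormal init
    for _ in range(n_iters):
        # With U fixed, each task solves a small r-dimensional regression.
        W = [np.linalg.lstsq(X @ U, y, rcond=None)[0] for X, y in tasks]
        # With all w_t fixed, update U via one joint least-squares problem,
        # using X @ U @ w = kron(w, X) @ vec(U) (column-major vectorization).
        A = np.vstack([np.kron(w, X) for (X, _), w in zip(tasks, W)])
        b = np.concatenate([y for _, y in tasks])
        u = np.linalg.lstsq(A, b, rcond=None)[0]
        U, _ = np.linalg.qr(u.reshape((d, r), order="F"))  # re-orthonormalize
    return U, W
```
Each iteration alternates a cheap $r$-dimensional regression per task with a single joint least-squares update of the shared subspace.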
arXiv Detail & Related papers (2021-05-18T06:46:48Z)
- How to distribute data across tasks for meta-learning? [59.608652082495624]
We show that the optimal number of data points per task depends on the budget, but it converges to a unique constant value for large budgets.
Our results suggest a simple and efficient procedure for data collection.
arXiv Detail & Related papers (2021-03-15T15:38:47Z)
- Breaking the Sample Size Barrier in Model-Based Reinforcement Learning with a Generative Model [50.38446482252857]
This paper is concerned with the sample efficiency of reinforcement learning, assuming access to a generative model (or simulator).
We first consider $\gamma$-discounted infinite-horizon Markov decision processes (MDPs) with state space $\mathcal{S}$ and action space $\mathcal{A}$.
We prove that a plain model-based planning algorithm suffices to achieve minimax-optimal sample complexity given any target accuracy level.
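A minimal sketch of generic plug-in (model-based) planning with a generative model, as an illustration of the setting rather than this paper's analysis: draw samples from the simulator for every state-action pair, build an empirical transition model, and run value iteration on it. All names and parameter values below are assumptions.
```python
import numpy as np

def plug_in_value_iteration(sample_next_state, rewards, n_states, n_actions,
                            gamma=0.9, samples_per_pair=100, n_iters=1000):
    """Illustrative plug-in planning: estimate the MDP, then plan on the estimate.

    sample_next_state(s, a): simulator returning one next state for (s, a).
    rewards: (n_states, n_actions) array of known rewards.
    """
    # Empirical transition model from samples_per_pair draws per (s, a).
    P_hat = np.zeros((n_states, n_actions, n_states))
    for s in range(n_states):
        for a in range(n_actions):
            for _ in range(samples_per_pair):
                P_hat[s, a, sample_next_state(s, a)] += 1
    P_hat /= samples_per_pair
    # Value iteration on the empirical MDP.
    V = np.zeros(n_states)
    for _ in range(n_iters):
        Q = rewards + gamma * P_hat @ V   # Q[s, a] = r(s, a) + gamma * E[V(s')]
        V = Q.max(axis=1)                 # Bellman optimality backup
    return V, Q.argmax(axis=1)            # value estimate and greedy policy
```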
arXiv Detail & Related papers (2020-05-26T17:53:18Z)
- Meta-learning for mixed linear regression [44.27602704497616]
In modern supervised learning, there are a large number of tasks, but many of them are associated with only a small amount of labeled data.
We study a fundamental question of interest: When can abundant tasks with small data compensate for lack of tasks with big data?
We show that we can efficiently utilize small data tasks with the help of $\tilde\Omega(k^{3/2})$ medium data tasks, each with $\tilde\Omega(k^{1/2})$ examples.
arXiv Detail & Related papers (2020-02-20T18:34:28Z)
- Task-Robust Model-Agnostic Meta-Learning [42.27488241647739]
We introduce the notion of "task-robustness" by reformulating the popular Model-Agnostic Meta-Learning (MAML) objective.
The solution to this novel formulation is task-robust in the sense that it places equal importance on even the most difficult and/or rare tasks.
arXiv Detail & Related papers (2020-02-12T02:20:51Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.