Task-Robust Model-Agnostic Meta-Learning
- URL: http://arxiv.org/abs/2002.04766v2
- Date: Fri, 19 Jun 2020 03:06:25 GMT
- Title: Task-Robust Model-Agnostic Meta-Learning
- Authors: Liam Collins, Aryan Mokhtari, Sanjay Shakkottai
- Abstract summary: We introduce the notion of "task-robustness" by reformulating the popular Model-Agnostic Meta-Learning (MAML) objective.
The solution to this novel formulation is task-robust in the sense that it places equal importance on even the most difficult and/or rare tasks.
- Score: 42.27488241647739
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Meta-learning methods have shown an impressive ability to train models that
rapidly learn new tasks. However, these methods only aim to perform well in
expectation over tasks coming from some particular distribution that is
typically equivalent across meta-training and meta-testing, rather than
considering worst-case task performance. In this work we introduce the notion
of "task-robustness" by reformulating the popular Model-Agnostic Meta-Learning
(MAML) objective [Finn et al. 2017] such that the goal is to minimize the
maximum loss over the observed meta-training tasks. The solution to this novel
formulation is task-robust in the sense that it places equal importance on even
the most difficult and/or rare tasks. This also means that it performs well
over all distributions of the observed tasks, making it robust to shifts in the
task distribution between meta-training and meta-testing. We present an
algorithm to solve the proposed min-max problem, and show that it converges to
an $\epsilon$-accurate point at the optimal rate of $\mathcal{O}(1/\epsilon^2)$
in the convex setting and to an $(\epsilon, \delta)$-stationary point at the
rate of $\mathcal{O}(\max\{1/\epsilon^5, 1/\delta^5\})$ in nonconvex settings.
We also provide an upper bound on the new task generalization error that
captures the advantage of minimizing the worst-case task loss, and demonstrate
this advantage in sinusoid regression and image classification experiments.
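The core of the proposed formulation is a min-max problem: find a meta-initialization that minimizes the maximum post-adaptation loss over the observed meta-training tasks. Below is a minimal, hypothetical Python/NumPy sketch of one standard way to optimize such an objective: simultaneous gradient descent on the model together with exponentiated-gradient (mirror) ascent on a simplex weighting over tasks, using a first-order approximation of the MAML gradient on synthetic linear-regression tasks. This is not the authors' algorithm or code; all names, task setup, and hyperparameters here are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical setup: linear regression tasks y = <a_i, x> + noise, with a
# shared meta-initialization w adapted by one inner gradient step per task.
# Illustrative min-max variant of first-order MAML, not the paper's method.
d, n_tasks, n_support = 5, 8, 10
true_params = rng.normal(size=(n_tasks, d))      # one regressor per task
inner_lr, outer_lr, simplex_lr = 0.05, 0.05, 0.5

def task_batch(i, n):
    X = rng.normal(size=(n, d))
    y = X @ true_params[i] + 0.01 * rng.normal(size=n)
    return X, y

def loss_and_grad(w, X, y):
    r = X @ w - y
    return 0.5 * np.mean(r ** 2), X.T @ r / len(y)

w = np.zeros(d)                    # meta-initialization (min player)
p = np.ones(n_tasks) / n_tasks     # task weights on the simplex (max player)

for step in range(2000):
    task_losses = np.zeros(n_tasks)
    meta_grads = np.zeros((n_tasks, d))
    for i in range(n_tasks):
        Xs, ys = task_batch(i, n_support)        # support set: adapt
        _, g_in = loss_and_grad(w, Xs, ys)
        w_i = w - inner_lr * g_in                # one-step adaptation
        Xq, yq = task_batch(i, n_support)        # query set: evaluate
        task_losses[i], g_out = loss_and_grad(w_i, Xq, yq)
        meta_grads[i] = g_out                    # first-order approximation

    # Descent on w against the p-weighted adapted loss, and exponentiated-
    # gradient ascent on p toward the worst-case task weighting.
    w -= outer_lr * meta_grads.T @ p
    p *= np.exp(simplex_lr * task_losses)
    p /= p.sum()

print("final per-task losses:", np.round(task_losses, 4))
print("task weights p:", np.round(p, 3))
```

In this sketch, the weights p concentrate on the tasks with the highest adapted loss, so the meta-initialization is pushed to improve its worst-case task rather than only the average, which mirrors the task-robustness goal described in the abstract.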
Related papers
- Multi-Level Contrastive Learning for Dense Prediction Task [59.591755258395594]
We present Multi-Level Contrastive Learning for Dense Prediction Task (MCL), an efficient self-supervised method for learning region-level feature representation for dense prediction tasks.
Our method is motivated by the three key factors in detection: localization, scale consistency and recognition.
Our method consistently outperforms the recent state-of-the-art methods on various datasets with significant margins.
arXiv Detail & Related papers (2023-04-04T17:59:04Z) - New Tight Relaxations of Rank Minimization for Multi-Task Learning [161.23314844751556]
We propose two novel multi-task learning formulations based on two regularization terms.
We show that our methods can correctly recover the low-rank structure shared across tasks, and outperform related multi-task learning methods.
arXiv Detail & Related papers (2021-12-09T07:29:57Z) - Meta-learning with an Adaptive Task Scheduler [93.63502984214918]
Existing meta-learning algorithms randomly sample meta-training tasks with a uniform probability.
With a limited number of meta-training tasks, it is likely that some sampled tasks are detrimental because they are noisy or imbalanced.
We propose an adaptive task scheduler (ATS) for the meta-training process.
arXiv Detail & Related papers (2021-10-26T22:16:35Z) - Sample Efficient Linear Meta-Learning by Alternating Minimization [74.40553081646995]
We study a simple alternating minimization method (MLLAM) which alternately learns the low-dimensional subspace and the regressors.
We show that for a constant subspace dimension MLLAM obtains nearly-optimal estimation error, despite requiring only $\Omega(\log d)$ samples per task.
We propose a novel task subset selection scheme that ensures the same strong statistical guarantee as MLLAM.
arXiv Detail & Related papers (2021-05-18T06:46:48Z) - Generalization of Model-Agnostic Meta-Learning Algorithms: Recurring and
Unseen Tasks [33.055672018805645]
We study the generalization properties of Model-Agnostic Meta-Learning (MAML) algorithms for supervised learning problems.
Our proof techniques rely on the connections between algorithmic stability and generalization bounds of algorithms.
arXiv Detail & Related papers (2021-02-07T16:16:23Z) - Meta-Regularization by Enforcing Mutual-Exclusiveness [0.8057006406834467]
We propose a regularization technique for meta-learning models that gives the model designer more control over the information flow during meta-training.
Our proposed regularization function shows an accuracy boost of $\sim 36\%$ on the Omniglot dataset.
arXiv Detail & Related papers (2021-01-24T22:57:19Z) - Robust Meta-learning for Mixed Linear Regression with Small Batches [34.94138630547603]
We study a fundamental question: can abundant small-data tasks compensate for the lack of big-data tasks?
Existing approaches show that such a trade-off is efficiently achievable, with the help of medium-sized tasks with $\Omega(k^{1/2})$ examples each.
We introduce a spectral approach that is simultaneously robust under both scenarios.
arXiv Detail & Related papers (2020-06-17T07:59:05Z) - Meta Cyclical Annealing Schedule: A Simple Approach to Avoiding
Meta-Amortization Error [50.83356836818667]
We develop a novel meta-regularization objective using a cyclical annealing schedule and a maximum mean discrepancy (MMD) criterion.
The experimental results show that our approach substantially outperforms standard meta-learning algorithms.
arXiv Detail & Related papers (2020-03-04T04:43:16Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences.