Modeling and Optimization Trade-off in Meta-learning
- URL: http://arxiv.org/abs/2010.12916v2
- Date: Tue, 13 Apr 2021 20:03:56 GMT
- Title: Modeling and Optimization Trade-off in Meta-learning
- Authors: Katelyn Gao and Ozan Sener
- Abstract summary: We introduce and rigorously define the trade-off between accurate modeling and optimization ease in meta-learning.
Taking MAML as a representative meta-learning algorithm, we theoretically characterize the trade-off for general non-convex risk functions as well as linear regression.
We also empirically study this trade-off for meta-reinforcement learning benchmarks.
- Score: 23.381986209234164
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: By searching for shared inductive biases across tasks, meta-learning promises
to accelerate learning on novel tasks, but with the cost of solving a complex
bilevel optimization problem. We introduce and rigorously define the trade-off
between accurate modeling and optimization ease in meta-learning. At one end,
classic meta-learning algorithms account for the structure of meta-learning but
solve a complex optimization problem, while at the other end domain randomized
search (otherwise known as joint training) ignores the structure of
meta-learning and solves a single level optimization problem. Taking MAML as
the representative meta-learning algorithm, we theoretically characterize the
trade-off for general non-convex risk functions as well as linear regression,
for which we are able to provide explicit bounds on the errors associated with
modeling and optimization. We also empirically study this trade-off for
meta-reinforcement learning benchmarks.
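As a concrete illustration of the two ends of this trade-off, here is a minimal sketch (not the paper's code; the task distribution, learning rates, iteration counts, and the finite-difference approximation of the MAML gradient are all illustrative assumptions) comparing MAML-style bilevel training against joint training (domain randomized search) on synthetic 1-D linear regression tasks, the setting for which the abstract reports explicit error bounds.

```python
# Sketch: bilevel MAML vs. single-level joint training on toy 1-D regression tasks.
# Everything here (task distribution, hyperparameters) is an illustrative assumption.
import numpy as np

rng = np.random.default_rng(0)
n_tasks, n_points, inner_lr, outer_lr = 20, 10, 0.1, 0.05

# Each task is y = w_t * x + noise with a task-specific slope w_t.
tasks = []
for w_t in rng.normal(loc=2.0, scale=0.5, size=n_tasks):
    x = rng.normal(size=n_points)
    tasks.append((x, w_t * x + 0.1 * rng.normal(size=n_points)))

def loss(w, x, y):
    """Mean squared error of the scalar model y_hat = w * x."""
    return 0.5 * np.mean((w * x - y) ** 2)

def grad(w, x, y):
    """Gradient of the loss with respect to the scalar parameter w."""
    return np.mean((w * x - y) * x)

w_maml, w_joint = 0.0, 0.0
for _ in range(500):
    g_maml = g_joint = 0.0
    for x, y in tasks:
        # Joint training ("domain randomized search"): single-level objective,
        # gradient of the task loss taken directly at the shared parameter.
        g_joint += grad(w_joint, x, y)

        # MAML: one inner adaptation step, then differentiate the post-adaptation
        # loss with respect to the initialization (finite differences keep the
        # sketch dependency-free; autodiff would be used in practice).
        post_adapt = lambda w: loss(w - inner_lr * grad(w, x, y), x, y)
        eps = 1e-5
        g_maml += (post_adapt(w_maml + eps) - post_adapt(w_maml - eps)) / (2 * eps)

    w_joint -= outer_lr * g_joint / n_tasks
    w_maml -= outer_lr * g_maml / n_tasks

print("joint-training parameter:", round(w_joint, 3))
print("MAML meta-initialization:", round(w_maml, 3))
```

On such a symmetric toy problem the two estimates need not differ much; the point of the sketch is the structural difference between the single-level joint objective and the bilevel MAML objective, which differentiates through the inner adaptation step that joint training ignores.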
Related papers
- Fast Adaptation with Kernel and Gradient based Meta Learning [4.763682200721131]
We propose two algorithms to improve both the inner and outer loops of Model-Agnostic Meta-Learning (MAML).
Our first algorithm redefines the optimization problem in the function space to update the model using closed-form solutions.
In the outer loop, the second algorithm adjusts the learning of the meta-learner by assigning weights to the losses from each task of the inner loop.
arXiv Detail & Related papers (2024-11-01T07:05:03Z)
- Rethinking Meta-Learning from a Learning Lens [17.00587250127854]
We focus on the more fundamental "learning to learn" strategy of meta-learning to explore what causes errors and how to eliminate these errors without changing the environment.
We propose incorporating task relations into the optimization process of meta-learning and introduce a plug-and-play method called Task Relation Learner (TRLearner) to achieve this goal.
arXiv Detail & Related papers (2024-09-13T02:00:16Z)
- End-to-End Meta-Bayesian Optimisation with Transformer Neural Processes [52.818579746354665]
This paper proposes the first end-to-end differentiable meta-BO framework that generalises neural processes to learn acquisition functions via transformer architectures.
We enable this end-to-end framework with reinforcement learning (RL) to tackle the lack of labelled acquisition data.
arXiv Detail & Related papers (2023-05-25T10:58:46Z)
- Learning Large-scale Neural Fields via Context Pruned Meta-Learning [60.93679437452872]
We introduce an efficient optimization-based meta-learning technique for large-scale neural field training.
We show how gradient re-scaling at meta-test time allows the learning of extremely high-quality neural fields.
Our framework is model-agnostic, intuitive, straightforward to implement, and shows significant reconstruction improvements for a wide range of signals.
arXiv Detail & Related papers (2023-02-01T17:32:16Z)
- General-Purpose In-Context Learning by Meta-Learning Transformers [45.63069059498147]
We show that Transformers and other black-box models can be meta-trained to act as general-purpose in-context learners.
We characterize transitions between algorithms that generalize, algorithms that memorize, and algorithms that fail to meta-train at all.
We propose practical interventions such as biasing the training distribution that improve the meta-training and meta-generalization of general-purpose in-context learning algorithms.
arXiv Detail & Related papers (2022-12-08T18:30:22Z)
- Meta Mirror Descent: Optimiser Learning for Fast Convergence [85.98034682899855]
We take a different perspective starting from mirror descent rather than gradient descent, and meta-learning the corresponding Bregman divergence.
Within this paradigm, we formalise a novel meta-learning objective of minimising the regret bound of learning.
Unlike many meta-learned optimisers, it also supports convergence and generalisation guarantees and uniquely does so without requiring validation data.
arXiv Detail & Related papers (2022-03-05T11:41:13Z)
- Meta-Learning with Neural Tangent Kernels [58.06951624702086]
We propose the first meta-learning paradigm in the Reproducing Kernel Hilbert Space (RKHS) induced by the meta-model's Neural Tangent Kernel (NTK).
Within this paradigm, we introduce two meta-learning algorithms, which no longer need a sub-optimal iterative inner-loop adaptation as in the MAML framework.
We achieve this goal by 1) replacing the adaptation with a fast-adaptive regularizer in the RKHS; and 2) solving the adaptation analytically based on the NTK theory (a generic closed-form adaptation sketch appears after this list).
arXiv Detail & Related papers (2021-02-07T20:53:23Z)
- On the Global Optimality of Model-Agnostic Meta-Learning [133.16370011229776]
Model-agnostic meta-learning (MAML) formulates meta-learning as a bilevel optimization problem, where the inner level solves each subtask based on a shared prior.
We characterize the optimality of the stationary points attained by MAML for both reinforcement learning and supervised learning, where the inner-level and outer-level problems are solved via first-order optimization methods.
arXiv Detail & Related papers (2020-06-23T17:33:14Z)
- Theoretical Convergence of Multi-Step Model-Agnostic Meta-Learning [63.64636047748605]
We develop a new theoretical framework to provide convergence guarantee for the general multi-step MAML algorithm.
In particular, our results suggest that the inner-stage step size needs to be chosen inversely proportional to the number $N$ of inner-stage steps in order for $N$-step MAML to have guaranteed convergence (a minimal sketch of this step-size rule appears after this list).
arXiv Detail & Related papers (2020-02-18T19:17:54Z)
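Two of the related papers above (the kernel/gradient-based fast-adaptation work and the NTK meta-learning paradigm) replace iterative inner-loop adaptation with closed-form solutions. The snippet below is a generic, hedged sketch of that idea using plain ridge regression on a fixed toy feature map; the feature map, regularisation constant, and all names are assumptions, not those papers' implementations.

```python
# Sketch: analytic (closed-form) task adaptation instead of inner gradient steps.
import numpy as np

def analytic_adapt(phi, x_support, y_support, reg=1e-3):
    """Closed-form ridge-regression head for one task, replacing the inner loop."""
    F = phi(x_support)                          # (n, d) features from the shared model
    A = F.T @ F + reg * np.eye(F.shape[1])      # regularised Gram matrix
    return np.linalg.solve(A, F.T @ y_support)  # adapted task-specific head weights

# Toy fixed feature map standing in for a meta-learned representation (an assumption).
phi = lambda x: np.stack([x, x ** 2, np.ones_like(x)], axis=1)

x_s = np.linspace(-1.0, 1.0, 8)       # support inputs for one task
y_s = 2.0 * x_s + 0.5                 # support targets: slope 2, intercept 0.5
print(analytic_adapt(phi, x_s, y_s))  # approximately [2.0, 0.0, 0.5]
```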
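The multi-step MAML convergence result summarised last in the list suggests scaling the inner-loop step size inversely with the number of inner steps $N$. The following toy snippet (the constant c and the quadratic task loss are illustrative assumptions) shows how the rule alpha = c / N keeps the total inner-loop movement bounded as N grows.

```python
def inner_adapt(w0, grad_fn, n_inner_steps, c=0.5):
    """Run N inner gradient steps with step size alpha = c / N (c is illustrative)."""
    alpha = c / n_inner_steps   # inner step size inversely proportional to N
    w = w0
    for _ in range(n_inner_steps):
        w = w - alpha * grad_fn(w)
    return w

# Toy quadratic task loss 0.5 * (w - 3)^2, whose gradient is (w - 3).
for n in (1, 5, 50):
    print(n, inner_adapt(0.0, lambda w: w - 3.0, n))
```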
This list is automatically generated from the titles and abstracts of the papers on this site.