A contrastive rule for meta-learning
- URL: http://arxiv.org/abs/2104.01677v1
- Date: Sun, 4 Apr 2021 19:45:41 GMT
- Title: A contrastive rule for meta-learning
- Authors: Nicolas Zucchet and Simon Schug and Johannes von Oswald and Dominic
Zhao and Jo\~ao Sacramento
- Abstract summary: Meta-learning algorithms leverage regularities that are present on a set of tasks to speed up and improve the performance of a subsidiary learning process.
We present a gradient-based meta-learning algorithm based on equilibrium propagation.
We establish theoretical bounds on its performance and present experiments on a set of standard benchmarks and neural network architectures.
- Score: 1.3124513975412255
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Meta-learning algorithms leverage regularities that are present on a set of
tasks to speed up and improve the performance of a subsidiary learning process.
Recent work on deep neural networks has shown that prior gradient-based
learning of meta-parameters can greatly improve the efficiency of subsequent
learning. Here, we present a gradient-based meta-learning algorithm based on
equilibrium propagation. Instead of explicitly differentiating the learning
process, our contrastive meta-learning rule estimates meta-parameter gradients
by executing the subsidiary process more than once. This avoids reversing the
learning dynamics in time and computing second-order derivatives. In spite of
this, and unlike previous first-order methods, our rule recovers an arbitrarily
accurate meta-parameter update given enough compute. As such, contrastive
meta-learning is a candidate rule for biologically-plausible meta-learning. We
establish theoretical bounds on its performance and present experiments on a
set of standard benchmarks and neural network architectures.
Related papers
- Meta-Value Learning: a General Framework for Learning with Learning
Awareness [1.4323566945483497]
We propose to judge joint policies by their long-term prospects as measured by the meta-value.
We apply a form of Q-learning to the meta-game of optimization, in a way that avoids the need to explicitly represent the continuous action space of policy updates.
arXiv Detail & Related papers (2023-07-17T21:40:57Z) - Continuous-Time Meta-Learning with Forward Mode Differentiation [65.26189016950343]
We introduce Continuous Meta-Learning (COMLN), a meta-learning algorithm where adaptation follows the dynamics of a gradient vector field.
Treating the learning process as an ODE offers the notable advantage that the length of the trajectory is now continuous.
We show empirically its efficiency in terms of runtime and memory usage, and we illustrate its effectiveness on a range of few-shot image classification problems.
arXiv Detail & Related papers (2022-03-02T22:35:58Z) - One Step at a Time: Pros and Cons of Multi-Step Meta-Gradient
Reinforcement Learning [61.662504399411695]
We introduce a novel method mixing multiple inner steps that enjoys a more accurate and robust meta-gradient signal.
When applied to the Snake game, the mixing meta-gradient algorithm can cut the variance by a factor of 3 while achieving similar or higher performance.
arXiv Detail & Related papers (2021-10-30T08:36:52Z) - Meta-Regularization: An Approach to Adaptive Choice of the Learning Rate
in Gradient Descent [20.47598828422897]
We propose textit-Meta-Regularization, a novel approach for the adaptive choice of the learning rate in first-order descent methods.
Our approach modifies the objective function by adding a regularization term, and casts the joint process parameters.
arXiv Detail & Related papers (2021-04-12T13:13:34Z) - Large-Scale Meta-Learning with Continual Trajectory Shifting [76.29017270864308]
We show that allowing the meta-learners to take a larger number of inner gradient steps better captures the structure of heterogeneous and large-scale tasks.
In order to increase the frequency of meta-updates, we propose to estimate the required shift of the task-specific parameters.
We show that the algorithm largely outperforms the previous first-order meta-learning methods in terms of both generalization performance and convergence.
arXiv Detail & Related papers (2021-02-14T18:36:33Z) - Meta-Learning with Neural Tangent Kernels [58.06951624702086]
We propose the first meta-learning paradigm in the Reproducing Kernel Hilbert Space (RKHS) induced by the meta-model's Neural Tangent Kernel (NTK)
Within this paradigm, we introduce two meta-learning algorithms, which no longer need a sub-optimal iterative inner-loop adaptation as in the MAML framework.
We achieve this goal by 1) replacing the adaptation with a fast-adaptive regularizer in the RKHS; and 2) solving the adaptation analytically based on the NTK theory.
arXiv Detail & Related papers (2021-02-07T20:53:23Z) - Fast Few-Shot Classification by Few-Iteration Meta-Learning [173.32497326674775]
We introduce a fast optimization-based meta-learning method for few-shot classification.
Our strategy enables important aspects of the base learner objective to be learned during meta-training.
We perform a comprehensive experimental analysis, demonstrating the speed and effectiveness of our approach.
arXiv Detail & Related papers (2020-10-01T15:59:31Z) - La-MAML: Look-ahead Meta Learning for Continual Learning [14.405620521842621]
We propose Look-ahead MAML (La-MAML), a fast optimisation-based meta-learning algorithm for online-continual learning, aided by a small episodic memory.
La-MAML achieves performance superior to other replay-based, prior-based and meta-learning based approaches for continual learning on real-world visual classification benchmarks.
arXiv Detail & Related papers (2020-07-27T23:07:01Z) - Meta-Gradient Reinforcement Learning with an Objective Discovered Online [54.15180335046361]
We propose an algorithm based on meta-gradient descent that discovers its own objective, flexibly parameterised by a deep neural network.
Because the objective is discovered online, it can adapt to changes over time.
On the Atari Learning Environment, the meta-gradient algorithm adapts over time to learn with greater efficiency.
arXiv Detail & Related papers (2020-07-16T16:17:09Z) - Gradient-EM Bayesian Meta-learning [6.726255259929496]
Key idea behind Bayesian meta-learning is empirical Bayes inference of hierarchical model.
In this work, we extend this framework to include a variety of existing methods, before proposing our variant based on gradient-EM algorithm.
Experiments on sinusoidal regression, few-shot image classification, and policy-based reinforcement learning show that our method not only achieves better accuracy with less computation cost, but is also more robust to uncertainty.
arXiv Detail & Related papers (2020-06-21T10:52:59Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.