Learning where to learn: Gradient sparsity in meta and continual learning
- URL: http://arxiv.org/abs/2110.14402v1
- Date: Wed, 27 Oct 2021 12:54:36 GMT
- Title: Learning where to learn: Gradient sparsity in meta and continual learning
- Authors: Johannes von Oswald, Dominic Zhao, Seijin Kobayashi, Simon Schug,
Massimo Caccia, Nicolas Zucchet, João Sacramento
- Abstract summary: We show that meta-learning can be improved by letting the learning algorithm decide which weights to change.
We find that patterned sparsity emerges from this process, with the pattern of sparsity varying on a problem-by-problem basis.
Our results shed light on an ongoing debate on whether meta-learning can discover adaptable features and suggest that learning by sparse gradient descent is a powerful inductive bias for meta-learning systems.
- Score: 4.845285139609619
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Finding neural network weights that generalize well from small datasets is
difficult. A promising approach is to learn a weight initialization such that a
small number of weight changes results in low generalization error. We show
that this form of meta-learning can be improved by letting the learning
algorithm decide which weights to change, i.e., by learning where to learn. We
find that patterned sparsity emerges from this process, with the pattern of
sparsity varying on a problem-by-problem basis. This selective sparsity results
in better generalization and less interference in a range of few-shot and
continual learning problems. Moreover, we find that sparse learning also
emerges in a more expressive model where learning rates are meta-learned. Our
results shed light on an ongoing debate on whether meta-learning can discover
adaptable features and suggest that learning by sparse gradient descent is a
powerful inductive bias for meta-learning systems.
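The abstract describes the mechanism only at a high level. Below is a minimal JAX sketch of the general idea, not the authors' exact algorithm: alongside the weight initialization, a per-parameter score is meta-learned, and its sigmoid gates the inner-loop gradient step so that the outer loop decides which weights are allowed to adapt. The toy linear model, the soft sigmoid gate, the parameter names, and the step sizes are all assumptions made here for illustration.

```python
import jax
import jax.numpy as jnp

def predict(params, x):
    # toy linear model: y = x @ w + b (illustrative stand-in for a network)
    return x @ params["w"] + params["b"]

def inner_loss(params, batch):
    x, y = batch
    return jnp.mean((predict(params, x) - y) ** 2)

def adapt(theta, scores, support, alpha=0.1):
    # one "sparse" inner-loop step: each parameter's update is gated by a
    # meta-learned sigmoid mask (a soft stand-in for a binary mask)
    grads = jax.grad(inner_loss)(theta, support)
    return {k: theta[k] - alpha * jax.nn.sigmoid(scores[k]) * grads[k]
            for k in theta}

def outer_loss(meta_params, task):
    theta, scores = meta_params
    support, query = task
    adapted = adapt(theta, scores, support)
    return inner_loss(adapted, query)

@jax.jit
def meta_step(meta_params, task, beta=0.01):
    # meta-update: differentiate the query loss w.r.t. both the initialization
    # and the mask scores (nested jax.grad gives the second-order terms)
    loss, grads = jax.value_and_grad(outer_loss)(meta_params, task)
    new_params = jax.tree_util.tree_map(lambda p, g: p - beta * g, meta_params, grads)
    return new_params, loss

# hypothetical usage on a single toy regression task
k0, k1, k2 = jax.random.split(jax.random.PRNGKey(0), 3)
theta = {"w": jax.random.normal(k0, (5,)), "b": jnp.zeros(())}
scores = jax.tree_util.tree_map(jnp.zeros_like, theta)   # sigmoid(0)=0.5: gates start half-open
x_s = jax.random.normal(k1, (10, 5)); y_s = x_s.sum(-1)  # support set
x_q = jax.random.normal(k2, (10, 5)); y_q = x_q.sum(-1)  # query set
(theta, scores), loss = meta_step((theta, scores), ((x_s, y_s), (x_q, y_q)))
```

Replacing the sigmoid gate with an unconstrained per-parameter step size gives the more expressive meta-learned learning-rate variant mentioned in the abstract; consult the paper for the exact sparsification scheme it uses.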
Related papers
- Ticketed Learning-Unlearning Schemes [57.89421552780526]
We propose a new ticketed model for learning-unlearning.
We provide space-efficient ticketed learning-unlearning schemes for a broad family of concept classes.
arXiv Detail & Related papers (2023-06-27T18:54:40Z)
- Towards Scalable Adaptive Learning with Graph Neural Networks and Reinforcement Learning [0.0]
We introduce a flexible and scalable approach to the problem of learning path personalization.
Our model is a sequential recommender system based on a graph neural network.
Our results demonstrate that it can learn to make good recommendations in the small-data regime.
arXiv Detail & Related papers (2023-05-10T18:16:04Z)
- Continual Learning by Modeling Intra-Class Variation [33.30614232534283]
It has been observed that neural networks perform poorly when the data or tasks are presented sequentially.
Unlike humans, neural networks suffer greatly from catastrophic forgetting, making it impossible to perform life-long learning.
We examine memory-based continual learning and identify that large variation in the representation space is crucial for avoiding catastrophic forgetting.
arXiv Detail & Related papers (2022-10-11T12:17:43Z)
- Learning an Explicit Hyperparameter Prediction Function Conditioned on Tasks [62.63852372239708]
Meta-learning aims to learn the learning methodology for machine learning from observed tasks, so as to generalize to new query tasks.
We interpret this learning methodology as learning an explicit hyperparameter prediction function shared by all training tasks.
Such setting guarantees that the meta-learned learning methodology is able to flexibly fit diverse query tasks.
arXiv Detail & Related papers (2021-07-06T04:05:08Z)
- Variable-Shot Adaptation for Online Meta-Learning [123.47725004094472]
We study the problem of learning new tasks from a small, fixed number of examples, by meta-learning across static data from a set of previous tasks.
We find that meta-learning solves the full task set with fewer overall labels and achieves greater cumulative performance than standard supervised methods.
These results suggest that meta-learning is an important ingredient for building learning systems that continuously learn and improve over a sequence of problems.
arXiv Detail & Related papers (2020-12-14T18:05:24Z)
- Deep Reinforcement Learning for Adaptive Learning Systems [4.8685842576962095]
We formulate the problem of how to find an individualized learning plan based on the learner's latent traits.
We apply a model-free deep reinforcement learning algorithm that can effectively find the optimal learning policy.
We also develop a transition model estimator that emulates the learner's learning process using neural networks.
arXiv Detail & Related papers (2020-04-17T18:04:03Z)
- Meta-Meta Classification for One-Shot Learning [11.27833234287093]
We present a new approach, called meta-meta classification, to learning in small-data settings.
In this approach, one uses a large set of learning problems to design an ensemble of learners, where each learner has high bias and low variance.
We evaluate the approach on a one-shot, one-class-versus-all classification task and show that it is able to outperform traditional meta-learning as well as ensembling approaches.
arXiv Detail & Related papers (2020-04-17T07:05:03Z)
- Meta Cyclical Annealing Schedule: A Simple Approach to Avoiding Meta-Amortization Error [50.83356836818667]
We develop a novel meta-regularization objective using a cyclical annealing schedule and a maximum mean discrepancy (MMD) criterion.
The experimental results show that our approach substantially outperforms standard meta-learning algorithms.
arXiv Detail & Related papers (2020-03-04T04:43:16Z)
- Provable Meta-Learning of Linear Representations [114.656572506859]
We provide fast, sample-efficient algorithms to address the dual challenges of learning a common set of features from multiple, related tasks, and transferring this knowledge to new, unseen tasks.
We also provide information-theoretic lower bounds on the sample complexity of learning these linear features.
arXiv Detail & Related papers (2020-02-26T18:21:34Z)
- Revisiting Meta-Learning as Supervised Learning [69.2067288158133]
We aim to provide a principled, unifying framework by revisiting and strengthening the connection between meta-learning and traditional supervised learning.
By treating pairs of task-specific data sets and target models as (feature, label) samples, we can reduce many meta-learning algorithms to instances of supervised learning.
This view not only unifies meta-learning into an intuitive and practical framework but also allows us to transfer insights from supervised learning directly to improve meta-learning.
arXiv Detail & Related papers (2020-02-03T06:13:01Z)
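The reduction described in the last entry above can be pictured with a toy sketch (an illustration under strong simplifying assumptions, not the paper's construction: the noisy linear-regression task family, the hand-made support-set summary, and the linear map are all choices made here). Each meta-training sample pairs a summary of a task's small support set (the "feature") with that task's target model (the "label"), and an ordinary supervised regressor maps one to the other.

```python
import jax
import jax.numpy as jnp

def make_task(key, n=10, d=3):
    # hypothetical task family: noisy linear regression with a task-specific w_true
    kw, kx, kn = jax.random.split(key, 3)
    w_true = jax.random.normal(kw, (d,))
    x = jax.random.normal(kx, (n, d))
    y = x @ w_true + 0.1 * jax.random.normal(kn, (n,))
    return x, y, w_true

def summarise(x, y):
    # crude permutation-invariant summary of a support set (the "feature")
    return jnp.concatenate([x.mean(0), (x.T @ y) / len(y),
                            jnp.array([y.mean(), y.std()])])

keys = jax.random.split(jax.random.PRNGKey(0), 500)
tasks = [make_task(k) for k in keys]

# (feature, label) pairs: support-set summary -> that task's target model
feats = jnp.stack([summarise(x, y) for x, y, _ in tasks])
labels = jnp.stack([w for _, _, w in tasks])

# ordinary supervised learning: fit a linear map from summaries to model weights
A = jnp.linalg.lstsq(feats, labels)[0]

# meta-test: a new support set is mapped directly to predicted weights,
# with no inner-loop optimisation
x_new, y_new, w_new = make_task(jax.random.PRNGKey(1))
w_pred = summarise(x_new, y_new) @ A
```

In practice the summary would be a learned set encoder and the map a richer model; the sketch only shows the shape of the reduction: datasets in, models out, trained exactly like supervised learning.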
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.