Accelerating Meta-Learning by Sharing Gradients
- URL: http://arxiv.org/abs/2312.08398v1
- Date: Wed, 13 Dec 2023 04:34:48 GMT
- Title: Accelerating Meta-Learning by Sharing Gradients
- Authors: Oscar Chang, Hod Lipson
- Abstract summary: We show that gradient sharing enables meta-learning under bigger inner loop learning rates and can accelerate the meta-training process by up to 134%.
- Score: 12.090942406595637
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The success of gradient-based meta-learning is primarily attributed to its
ability to leverage related tasks to learn task-invariant information. However,
the absence of interactions between different tasks in the inner loop leads to
task-specific over-fitting in the initial phase of meta-training. While this is
eventually corrected by the presence of these interactions in the outer loop,
it comes at a significant cost of slower meta-learning. To address this
limitation, we explicitly encode task relatedness via an inner loop
regularization mechanism inspired by multi-task learning. Our algorithm shares
gradient information from previously encountered tasks as well as concurrent
tasks in the same task batch, and scales their contribution with meta-learned
parameters. We show using two popular few-shot classification datasets that
gradient sharing enables meta-learning under bigger inner loop learning rates
and can accelerate the meta-training process by up to 134%.
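The abstract describes the mechanism only in words, so the following is a minimal sketch of what an inner-loop update with shared gradients might look like, not the authors' algorithm. The function name `shared_inner_step`, the equal-weight mix of batch and history gradients, the running-average `momentum`, and the fixed scalar `lam` (standing in for the meta-learned scaling parameters) are all assumptions made for illustration.

```python
from typing import List, Tuple

Vector = List[float]


def mean_vector(vectors: List[Vector]) -> Vector:
    """Element-wise mean of equally sized vectors."""
    return [sum(column) / len(vectors) for column in zip(*vectors)]


def shared_inner_step(
    theta: Vector,
    task_grads: List[Vector],
    history: Vector,
    lr: float = 0.1,
    lam: float = 0.5,
    momentum: float = 0.9,
) -> Tuple[List[Vector], Vector]:
    """Take one inner-loop step per task using a gradient mixed with shared information.

    theta      -- shared initialization (the meta-parameters)
    task_grads -- one gradient per task in the current task batch
    history    -- running average of gradients from earlier task batches (assumed form)
    lam        -- hypothetical stand-in for the meta-learned scaling; 0 disables sharing
    """
    # Gradient information from concurrent tasks in the same batch.
    batch_mean = mean_vector(task_grads)
    # Combine with previously encountered tasks (equal weighting is an assumption).
    shared = mean_vector([batch_mean, history])

    adapted = []
    for grad in task_grads:
        # Regularize the task's own gradient toward the shared direction.
        mixed = [(1.0 - lam) * g + lam * s for g, s in zip(grad, shared)]
        adapted.append([t - lr * m for t, m in zip(theta, mixed)])

    # Update the running average that later task batches will see.
    new_history = [
        momentum * h + (1.0 - momentum) * b for h, b in zip(history, batch_mean)
    ]
    return adapted, new_history


theta = [0.0, 0.0]
task_grads = [[1.0, 0.0], [0.0, 1.0]]
adapted, history = shared_inner_step(theta, task_grads, history=[0.0, 0.0], lam=0.5)
```

With `lam=0.0` the step reduces to an ordinary per-task gradient step; with `lam=1.0` every task in the batch moves in the same shared direction. In the paper the scaling is meta-learned in the outer loop rather than fixed as it is here.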
Related papers
- ForkMerge: Mitigating Negative Transfer in Auxiliary-Task Learning [59.08197876733052]
Auxiliary-Task Learning (ATL) aims to improve the performance of the target task by leveraging the knowledge obtained from related tasks.
Sometimes, learning multiple tasks simultaneously results in lower accuracy than learning only the target task, known as negative transfer.
ForkMerge is a novel approach that periodically forks the model into multiple branches and automatically searches for the varying task weights.
arXiv Detail & Related papers (2023-01-30T02:27:02Z)
- TaskMix: Data Augmentation for Meta-Learning of Spoken Intent Understanding [0.0]
We show that a state-of-the-art data augmentation method worsens this problem of overfitting when the task diversity is low.
We propose a simple method, TaskMix, which synthesizes new tasks by linearly interpolating existing tasks.
We show that TaskMix outperforms baselines, alleviates overfitting when task diversity is low, and does not degrade performance even when it is high.
arXiv Detail & Related papers (2022-09-26T00:37:40Z)
- Set-based Meta-Interpolation for Few-Task Meta-Learning [79.4236527774689]
We propose a novel domain-agnostic task augmentation method, Meta-Interpolation, to densify the meta-training task distribution.
We empirically validate the efficacy of Meta-Interpolation on eight datasets spanning various domains.
arXiv Detail & Related papers (2022-05-20T06:53:03Z)
- Skill-based Meta-Reinforcement Learning [65.31995608339962]
We devise a method that enables meta-learning on long-horizon, sparse-reward tasks.
Our core idea is to leverage prior experience extracted from offline datasets during meta-learning.
arXiv Detail & Related papers (2022-04-25T17:58:19Z)
- Meta-Learning with Fewer Tasks through Task Interpolation [67.03769747726666]
Current meta-learning algorithms require a large number of meta-training tasks, which may not be accessible in real-world scenarios.
By meta-learning with task interpolation (MLTI), our approach effectively generates additional tasks by randomly sampling a pair of tasks and interpolating the corresponding features and labels.
Empirically, in our experiments on eight datasets from diverse domains, we find that the proposed general MLTI framework is compatible with representative meta-learning algorithms and consistently outperforms other state-of-the-art strategies.
arXiv Detail & Related papers (2021-06-04T20:15:34Z)
- Large-Scale Meta-Learning with Continual Trajectory Shifting [76.29017270864308]
We show that allowing the meta-learners to take a larger number of inner gradient steps better captures the structure of heterogeneous and large-scale tasks.
In order to increase the frequency of meta-updates, we propose to estimate the required shift of the task-specific parameters.
We show that the algorithm largely outperforms the previous first-order meta-learning methods in terms of both generalization performance and convergence.
arXiv Detail & Related papers (2021-02-14T18:36:33Z)
- Meta-learning the Learning Trends Shared Across Tasks [123.10294801296926]
Gradient-based meta-learning algorithms excel at quick adaptation to new tasks with limited data.
Existing meta-learning approaches only depend on the current task information during the adaptation.
We propose a 'Path-aware' model-agnostic meta-learning approach.
arXiv Detail & Related papers (2020-10-19T08:06:47Z)
- Multitask Learning with Single Gradient Step Update for Task Balancing [4.330814031477772]
We propose an algorithm to balance between tasks at the gradient level by applying gradient-based meta-learning to multitask learning.
We apply the proposed method to various multitask computer vision problems and achieve state-of-the-art performance.
arXiv Detail & Related papers (2020-05-20T08:34:20Z)
- Information-Theoretic Generalization Bounds for Meta-Learning and Applications [42.275148861039895]
A key performance measure for meta-learning is the meta-generalization gap.
This paper presents novel information-theoretic upper bounds on the meta-generalization gap.
arXiv Detail & Related papers (2020-05-09T05:48:01Z)
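Several entries in the list above (TaskMix and MLTI) generate new tasks by linearly interpolating existing ones. The sketch below shows that idea in its simplest input-level form, assuming each task is a pair of equally shaped feature and one-hot label matrices. The names `interpolate_tasks` and `sample_interpolated_task` and the uniform sampling of the mixing weight are assumptions for illustration and are not taken from either paper.

```python
import random
from typing import List, Tuple

Matrix = List[List[float]]
Task = Tuple[Matrix, Matrix]  # (features, one-hot labels), one row per example


def mix(a: Matrix, b: Matrix, lam: float) -> Matrix:
    """Element-wise convex combination lam * a + (1 - lam) * b."""
    return [
        [lam * x + (1.0 - lam) * y for x, y in zip(row_a, row_b)]
        for row_a, row_b in zip(a, b)
    ]


def interpolate_tasks(task_a: Task, task_b: Task, lam: float) -> Task:
    """Build a synthetic task by mixing features and labels with the same weight."""
    (features_a, labels_a), (features_b, labels_b) = task_a, task_b
    return mix(features_a, features_b, lam), mix(labels_a, labels_b, lam)


def sample_interpolated_task(tasks: List[Task], rng: random.Random) -> Task:
    """Pick two distinct tasks at random and interpolate them."""
    task_a, task_b = rng.sample(tasks, 2)
    lam = rng.random()  # uniform weight is an assumption made for this sketch
    return interpolate_tasks(task_a, task_b, lam)


task_a = ([[1.0, 0.0]], [[1.0, 0.0]])
task_b = ([[0.0, 1.0]], [[0.0, 1.0]])
new_task = sample_interpolated_task([task_a, task_b], random.Random(0))
```

Using the same weight for features and labels keeps each synthetic example consistent with its soft label. The published methods differ in where and how they interpolate, so this should be read as a simplified illustration only.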