Large-Scale Meta-Learning with Continual Trajectory Shifting
- URL: http://arxiv.org/abs/2102.07215v1
- Date: Sun, 14 Feb 2021 18:36:33 GMT
- Title: Large-Scale Meta-Learning with Continual Trajectory Shifting
- Authors: Jaewoong Shin and Hae Beom Lee and Boqing Gong and Sung Ju Hwang
- Abstract summary: We show that allowing the meta-learners to take a larger number of inner gradient steps better captures the structure of heterogeneous and large-scale tasks.
In order to increase the frequency of meta-updates, we propose to estimate the required shift of the task-specific parameters.
We show that the algorithm largely outperforms the previous first-order meta-learning methods in terms of both generalization performance and convergence.
- Score: 76.29017270864308
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Meta-learning of shared initialization parameters has been shown to be
highly effective in solving few-shot learning tasks. However, extending the framework
to many-shot scenarios, which may further enhance its practicality, has been
relatively overlooked due to the technical difficulties of meta-learning over
long chains of inner-gradient steps. In this paper, we first show that allowing
the meta-learners to take a larger number of inner gradient steps better
captures the structure of heterogeneous and large-scale task distributions, and
thus yields better initialization points. Further, in order to
increase the frequency of meta-updates even with the excessively long
inner-optimization trajectories, we propose to estimate the required shift of
the task-specific parameters with respect to the change of the initialization
parameters. By doing so, we can arbitrarily increase the frequency of
meta-updates and thus greatly improve the meta-level convergence as well as the
quality of the learned initializations. We validate our method on a
heterogeneous set of large-scale tasks and show that the algorithm largely
outperforms the previous first-order meta-learning methods in terms of both
generalization performance and convergence, as well as multi-task learning and
fine-tuning baselines.
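To make the abstract's core idea concrete, here is a minimal, hypothetical sketch (not the authors' exact algorithm): each task takes inner gradient steps from a shared initialization, a Reptile-style first-order meta-update is applied every few steps, and the in-progress task parameters are then shifted by the change in the initialization instead of restarting their inner trajectories. The toy quadratic tasks, names, and hyperparameters below are assumptions for illustration only.

```python
import numpy as np

# Hypothetical toy setup: each "task" k is a quadratic loss with its own optimum c_k,
# so its inner gradient is simply (theta - c_k). All values here are illustrative.
rng = np.random.default_rng(0)
task_optima = [rng.normal(size=5) for _ in range(8)]   # one optimum per task
phi = np.zeros(5)                                      # shared initialization (meta-parameters)
theta = [phi.copy() for _ in task_optima]              # per-task parameters, start at phi

inner_lr, meta_lr = 0.1, 0.5
num_steps, meta_every = 200, 5                         # meta-update every few inner steps

for step in range(1, num_steps + 1):
    # One inner gradient step per task on its own loss.
    for k, c in enumerate(task_optima):
        grad_k = theta[k] - c                          # gradient of 0.5 * ||theta - c||^2
        theta[k] = theta[k] - inner_lr * grad_k

    if step % meta_every == 0:
        # First-order (Reptile-style) meta-update toward the current task parameters.
        phi_old = phi.copy()
        phi = phi + meta_lr * np.mean([t - phi for t in theta], axis=0)

        # "Trajectory shifting": move each in-progress inner trajectory by the same
        # amount the initialization moved, rather than restarting inner optimization.
        delta = phi - phi_old
        theta = [t + delta for t in theta]

print("learned initialization:", np.round(phi, 3))
print("mean task optimum (for comparison):", np.round(np.mean(task_optima, axis=0), 3))
```

The shift step is what lets the meta-parameters be updated far more often than once per completed inner trajectory, which is the frequency gain the abstract describes.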
Related papers
- XB-MAML: Learning Expandable Basis Parameters for Effective Meta-Learning with Wide Task Coverage [12.38102349597265]
We introduce XB-MAML, which learns expandable basis parameters, where they are linearly combined to form an effective initialization to a given task.
XB-MAML observes the discrepancy between the vector space spanned by the basis and fine-tuned parameters to decide whether to expand the basis.
arXiv Detail & Related papers (2024-03-11T14:37:57Z)
- Learning Large-scale Neural Fields via Context Pruned Meta-Learning [60.93679437452872]
We introduce an efficient optimization-based meta-learning technique for large-scale neural field training.
We show how gradient re-scaling at meta-test time allows the learning of extremely high-quality neural fields.
Our framework is model-agnostic, intuitive, straightforward to implement, and shows significant reconstruction improvements for a wide range of signals.
arXiv Detail & Related papers (2023-02-01T17:32:16Z)
- Gradient-Based Meta-Learning Using Uncertainty to Weigh Loss for Few-Shot Learning [5.691930884128995]
Model-Agnostic Meta-Learning (MAML) is one of the most successful meta-learning techniques for few-shot learning.
A new method is proposed in which the task-specific learner adaptively learns to select parameters that minimize the loss on new tasks.
Method 1 generates weights by comparing meta-loss differences to improve accuracy when there are few classes.
Method 2 introduces the homoscedastic uncertainty of each task to weight the multiple losses within the original gradient descent (a minimal sketch of this weighting appears after the list below).
arXiv Detail & Related papers (2022-08-17T08:11:51Z)
- One Step at a Time: Pros and Cons of Multi-Step Meta-Gradient Reinforcement Learning [61.662504399411695]
We introduce a novel method that mixes multiple inner steps to obtain a more accurate and robust meta-gradient signal.
When applied to the Snake game, the mixing meta-gradient algorithm can cut the variance by a factor of 3 while achieving similar or higher performance.
arXiv Detail & Related papers (2021-10-30T08:36:52Z)
- A contrastive rule for meta-learning [1.3124513975412255]
Meta-learning algorithms leverage regularities that are present across a set of tasks to speed up and improve the performance of a subsidiary learning process.
We present a gradient-based meta-learning algorithm based on equilibrium propagation.
We establish theoretical bounds on its performance and present experiments on a set of standard benchmarks and neural network architectures.
arXiv Detail & Related papers (2021-04-04T19:45:41Z)
- Variable-Shot Adaptation for Online Meta-Learning [123.47725004094472]
We study the problem of learning new tasks from a small, fixed number of examples, by meta-learning across static data from a set of previous tasks.
We find that meta-learning solves the full task set with fewer overall labels and greater cumulative performance, compared to standard supervised methods.
These results suggest that meta-learning is an important ingredient for building learning systems that continuously learn and improve over a sequence of problems.
arXiv Detail & Related papers (2020-12-14T18:05:24Z)
- Meta-learning the Learning Trends Shared Across Tasks [123.10294801296926]
Gradient-based meta-learning algorithms excel at quick adaptation to new tasks with limited data.
Existing meta-learning approaches depend only on the current task's information during adaptation.
We propose a 'Path-aware' model-agnostic meta-learning approach.
arXiv Detail & Related papers (2020-10-19T08:06:47Z)
- Improving Generalization in Meta-learning via Task Augmentation [69.83677015207527]
We propose two task augmentation methods, including MetaMix and Channel Shuffle.
Both MetaMix and Channel Shuffle outperform state-of-the-art results by a large margin across many datasets.
arXiv Detail & Related papers (2020-07-26T01:50:42Z)
- TaskNorm: Rethinking Batch Normalization for Meta-Learning [43.01116858195183]
We evaluate a range of approaches to batch normalization for meta-learning scenarios, and develop a novel approach that we call TaskNorm.
Experiments on fourteen datasets demonstrate that the choice of batch normalization has a dramatic effect on both classification accuracy and training time.
We provide a set of best practices for normalization that will allow fair comparison of meta-learning algorithms.
arXiv Detail & Related papers (2020-03-06T15:43:27Z)
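As referenced from the Gradient-Based Meta-Learning Using Uncertainty entry above, the sketch below illustrates the standard homoscedastic-uncertainty loss weighting (in the style of Kendall et al.) that the abstract alludes to. How that paper integrates this weighting into MAML's inner and outer loops is not specified here, so the function name and values are illustrative assumptions only.

```python
import numpy as np

# Minimal sketch: weight several task losses by learned homoscedastic uncertainty,
# parameterized as s_i = log(sigma_i**2). Hypothetical values; the referenced
# paper's exact formulation and its use inside MAML may differ.
def uncertainty_weighted_loss(task_losses, log_vars):
    total = 0.0
    for loss_i, s_i in zip(task_losses, log_vars):
        # exp(-s_i) down-weights tasks with high predicted uncertainty;
        # the additive s_i term penalizes inflating the uncertainty itself.
        total += np.exp(-s_i) * loss_i + s_i
    return total

task_losses = [0.8, 0.2, 1.5]      # e.g., losses from three sampled tasks
log_vars = np.zeros(3)             # learnable log-variances, initialized to 0

print(uncertainty_weighted_loss(task_losses, log_vars))   # 2.5 with unit weights
```

In practice the log-variances would be trained jointly with the model parameters, so each task's weight adapts as its loss scale and difficulty become apparent.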