Meta Cyclical Annealing Schedule: A Simple Approach to Avoiding
Meta-Amortization Error
- URL: http://arxiv.org/abs/2003.01889v1
- Date: Wed, 4 Mar 2020 04:43:16 GMT
- Title: Meta Cyclical Annealing Schedule: A Simple Approach to Avoiding
Meta-Amortization Error
- Authors: Yusuke Hayashi and Taiji Suzuki
- Abstract summary: We develop a novel meta-regularization objective using a cyclical annealing schedule and the maximum mean discrepancy (MMD) criterion.
The experimental results show that our approach substantially outperforms standard meta-learning algorithms.
- Score: 50.83356836818667
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The ability to learn new concepts with small amounts of data is a crucial
aspect of intelligence that has proven challenging for deep learning methods.
Meta-learning for few-shot learning offers a potential solution to this
problem: by learning to learn across data from many previous tasks, few-shot
learning algorithms can discover the structure among tasks to enable fast
learning of new tasks. However, a critical challenge in few-shot learning is
task ambiguity: even when a powerful prior can be meta-learned from a large
number of prior tasks, a small dataset for a new task can simply be too
ambiguous to identify a single model for that task. Bayesian meta-learning
models can naturally resolve this problem by placing a sophisticated prior
distribution and letting the posterior be well regularized through Bayesian
decision theory. However, currently known Bayesian meta-learning procedures
such as VERSA suffer from the so-called {\it information preference problem}:
the posterior distribution degenerates to a single point and is far from the
exact one. To address this challenge, we design a novel meta-regularization
objective using {\it cyclical annealing schedule} and {\it maximum mean
discrepancy} (MMD) criterion. The cyclical annealing schedule is quite
effective at avoiding such degenerate solutions. This procedure would require a
KL-divergence estimate that is difficult to compute, but we resolve the issue by
employing MMD instead of the KL-divergence. The experimental results show that our approach
substantially outperforms standard meta-learning algorithms.
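To make the two ingredients concrete, here is a minimal sketch (not the authors' released code) of a cyclical annealing weight and an RBF-kernel MMD penalty; the function names, the linear-ramp cycle shape, and the fixed kernel bandwidth are illustrative assumptions.

```python
import numpy as np

def cyclical_beta(step, total_steps, n_cycles=4, ramp_ratio=0.5):
    """Cyclical annealing weight: within each cycle the weight ramps
    linearly from 0 to 1 over the first `ramp_ratio` of the cycle,
    then stays at 1. Restarting from 0 each cycle is what discourages
    the posterior from collapsing onto a single point."""
    cycle_len = total_steps / n_cycles
    pos = (step % cycle_len) / cycle_len   # position within the cycle, in [0, 1)
    return min(pos / ramp_ratio, 1.0)

def mmd_rbf(x, y, bandwidth=1.0):
    """V-statistic estimate of squared MMD between sample sets x (n, d)
    and y (m, d) under an RBF kernel; this stands in for the
    hard-to-estimate KL term when pulling the approximate posterior
    toward the prior."""
    def gram(a, b):
        d2 = ((a[:, None, :] - b[None, :, :]) ** 2).sum(-1)  # pairwise sq. distances
        return np.exp(-d2 / (2.0 * bandwidth ** 2))
    return gram(x, x).mean() + gram(y, y).mean() - 2.0 * gram(x, y).mean()

# Schematic per-step objective:
#   loss = nll + cyclical_beta(step, total_steps) * mmd_rbf(posterior_samples, prior_samples)
```

Because the weight repeatedly returns to 0, the model is re-exposed to the unregularized reconstruction objective at the start of every cycle before the MMD penalty is annealed back in.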
Related papers
- Rethinking Meta-Learning from a Learning Lens [17.00587250127854]
We focus on the more fundamental "learning to learn" strategy of meta-learning to explore what causes errors and how to eliminate these errors without changing the environment.
We propose introducing task relations into the optimization process of meta-learning, and present a plug-and-play method called Task Relation Learner (TRLearner) to achieve this goal.
arXiv Detail & Related papers (2024-09-13T02:00:16Z)
- Data-CUBE: Data Curriculum for Instruction-based Sentence Representation Learning [85.66907881270785]
We propose a data curriculum method, namely Data-CUBE, that arranges the order of all the multi-task data for training.
At the task level, we aim to find the optimal task order to minimize the total cross-task interference risk.
At the instance level, we measure the difficulty of all instances per task, then divide them into easy-to-difficult mini-batches for training (a sketch follows this entry).
arXiv Detail & Related papers (2024-01-07T18:12:20Z)
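A rough illustration of the instance-level step: the difficulty scores are assumed to be supplied externally, and the function name and batching rule are this sketch's assumptions rather than Data-CUBE's actual implementation.

```python
import numpy as np

def easy_to_difficult_batches(instances, difficulty, batch_size):
    """Sort one task's instances by a precomputed difficulty score and
    slice them into mini-batches, so training sees the easiest first."""
    order = np.argsort(difficulty)            # indices in ascending difficulty
    ordered = [instances[i] for i in order]
    return [ordered[i:i + batch_size]
            for i in range(0, len(ordered), batch_size)]
```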
- Clustering-based Domain-Incremental Learning [4.835091081509403]
A key challenge in continual learning is the so-called "catastrophic forgetting" problem.
We propose an online clustering-based approach operating on a dynamically updated finite pool of samples or gradients (a sketch follows this entry).
We demonstrate the effectiveness of the proposed strategy and its promising performance compared to state-of-the-art methods.
arXiv Detail & Related papers (2023-09-21T13:49:05Z)
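The summary does not specify the clustering rule, so the following is only a minimal sketch of one online possibility (a sequential, MacQueen-style k-means update over whatever the finite pool currently holds); all names here are hypothetical.

```python
import numpy as np

def online_kmeans_step(centroids, counts, x):
    """Assign a new sample (or gradient) vector x to its nearest
    centroid and move that centroid toward x as a running mean."""
    j = int(np.argmin(((centroids - x) ** 2).sum(axis=1)))
    counts[j] += 1
    centroids[j] += (x - centroids[j]) / counts[j]  # incremental mean update
    return j
```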
- Towards Robust Continual Learning with Bayesian Adaptive Moment Regularization [51.34904967046097]
Continual learning seeks to overcome the challenge of catastrophic forgetting, where a model forgets previously learnt information.
We introduce a novel prior-based method that better constrains parameter growth, reducing catastrophic forgetting.
Results show that BAdam achieves state-of-the-art performance for prior-based methods on challenging single-headed class-incremental experiments.
arXiv Detail & Related papers (2023-09-15T17:10:51Z)
- Algorithm Design for Online Meta-Learning with Task Boundary Detection [63.284263611646]
We propose a novel algorithm for task-agnostic online meta-learning in non-stationary environments.
We first propose two simple but effective mechanisms for detecting task switches and distribution shift.
We show that a sublinear task-averaged regret can be achieved for our algorithm under mild conditions.
arXiv Detail & Related papers (2023-02-02T04:02:49Z)
- Gradient-Based Meta-Learning Using Uncertainty to Weigh Loss for Few-Shot Learning [5.691930884128995]
Model-Agnostic Meta-Learning (MAML) is one of the most successful meta-learning techniques for few-shot learning.
A new method is proposed in which the task-specific learner adaptively learns to select parameters that minimize the loss of new tasks.
Method 1 generates weights by comparing meta-loss differences to improve the accuracy when there are few classes.
Method 2 introduces the homoscedastic uncertainty of each task to weigh the multiple losses within the original gradient-descent update (a sketch follows this entry).
arXiv Detail & Related papers (2022-08-17T08:11:51Z)
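Homoscedastic-uncertainty weighting is usually implemented with a learned log-variance per task, in the spirit of Kendall et al.; the parameterization below is that conventional form, assumed here for illustration rather than taken from the paper.

```python
import numpy as np

def uncertainty_weighted_loss(task_losses, log_vars):
    """Combine per-task losses with learned homoscedastic uncertainty:
    each loss is scaled by exp(-s) (i.e. 1 / sigma^2), and the +s term
    keeps the model from inflating uncertainty to dodge the loss."""
    task_losses = np.asarray(task_losses, dtype=float)
    log_vars = np.asarray(log_vars, dtype=float)
    return float(np.sum(np.exp(-log_vars) * task_losses + log_vars))
```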
- On the Effectiveness of Fine-tuning Versus Meta-reinforcement Learning [71.55412580325743]
We show that multi-task pretraining with fine-tuning on new tasks performs as well as, or better than, meta-pretraining with meta test-time adaptation.
This is encouraging for future research, as multi-task pretraining tends to be simpler and computationally cheaper than meta-RL.
arXiv Detail & Related papers (2022-06-07T13:24:00Z)
- Variable-Shot Adaptation for Online Meta-Learning [123.47725004094472]
We study the problem of learning new tasks from a small, fixed number of examples, by meta-learning across static data from a set of previous tasks.
We find that meta-learning solves the full task set with fewer overall labels and greater cumulative performance, compared to standard supervised methods.
These results suggest that meta-learning is an important ingredient for building learning systems that continuously learn and improve over a sequence of problems.
arXiv Detail & Related papers (2020-12-14T18:05:24Z)