Meta-Learning with Versatile Loss Geometries for Fast Adaptation Using
Mirror Descent
- URL: http://arxiv.org/abs/2312.13486v2
- Date: Thu, 18 Jan 2024 20:42:48 GMT
- Title: Meta-Learning with Versatile Loss Geometries for Fast Adaptation Using
Mirror Descent
- Authors: Yilang Zhang, Bingcong Li, Georgios B. Giannakis
- Abstract summary: A fundamental challenge in meta-learning is how to quickly "adapt" the extracted prior in order to train a task-specific model.
Existing approaches deal with this challenge using a preconditioner that enhances convergence of the per-task training process.
The present contribution addresses this limitation by learning a nonlinear mirror map, which induces a versatile distance metric.
- Score: 44.56938629818211
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Utilizing task-invariant prior knowledge extracted from related tasks,
meta-learning is a principled framework that empowers learning a new task
especially when data records are limited. A fundamental challenge in
meta-learning is how to quickly "adapt" the extracted prior in order to train a
task-specific model within a few optimization steps. Existing approaches deal
with this challenge using a preconditioner that enhances convergence of the
per-task training process. Though effective in representing locally a quadratic
training loss, these simple linear preconditioners can hardly capture complex
loss geometries. The present contribution addresses this limitation by learning
a nonlinear mirror map, which induces a versatile distance metric to enable
capturing and optimizing a wide range of loss geometries, hence facilitating
the per-task training. Numerical tests on few-shot learning datasets
demonstrate the superior expressiveness and convergence of the advocated
approach.
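The paper itself includes no code, but the role of a learned mirror map in the per-task adaptation loop can be illustrated with a rough sketch. The cosh-type potential, the toy quadratic task loss, and all hyperparameters below are assumptions made for illustration only; the actual method meta-learns the mirror map rather than fixing it.

```python
# Hypothetical sketch of a mirror-descent inner loop for task adaptation.
# The per-coordinate "temperature" tau stands in for a (meta-)learned mirror
# map; here it is simply fixed for illustration.
import numpy as np

def grad_phi(theta, tau):
    """Mirror map: gradient of phi(theta) = tau^2 * cosh(theta / tau)."""
    return tau * np.sinh(theta / tau)

def grad_phi_inv(z, tau):
    """Inverse mirror map (gradient of the convex conjugate of phi)."""
    return tau * np.arcsinh(z / tau)

def task_grad(theta, A, b):
    """Gradient of a toy quadratic task loss 0.5 * theta^T A theta - b^T theta."""
    return A @ theta - b

def mirror_descent_adapt(theta0, A, b, tau, lr=0.1, steps=5):
    """A few inner-loop adaptation steps taken in the dual space induced by phi."""
    theta = theta0.copy()
    for _ in range(steps):
        dual = grad_phi(theta, tau)           # map parameters to dual space
        dual -= lr * task_grad(theta, A, b)   # plain gradient step in dual space
        theta = grad_phi_inv(dual, tau)       # map back to primal space
    return theta

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    d = 4
    M = rng.normal(size=(d, d))
    A = M @ M.T + np.eye(d)        # positive-definite quadratic geometry
    b = rng.normal(size=d)
    theta0 = np.zeros(d)
    tau = np.full(d, 2.0)          # per-coordinate geometry; meta-learned in the paper
    print(mirror_descent_adapt(theta0, A, b, tau))
```

In the large-tau limit, tau*sinh(theta/tau) approaches theta and the update reduces to ordinary gradient descent, which illustrates how a nonlinear mirror map extends the usual Euclidean (linear-preconditioner) geometry.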
Related papers
- Rethinking Meta-Learning from a Learning Lens [17.00587250127854]
We focus on the more fundamental "learning to learn" strategy of meta-learning to explore what causes errors and how to eliminate these errors without changing the environment.
We propose incorporating task relations into the optimization process of meta-learning and introduce a plug-and-play method called Task Relation Learner (TRLearner) to achieve this goal.
arXiv Detail & Related papers (2024-09-13T02:00:16Z) - Clustering-based Domain-Incremental Learning [4.835091081509403]
A key challenge in continual learning is the so-called "catastrophic forgetting" problem.
We propose an online clustering-based approach operating on a dynamically updated finite pool of samples or gradients.
We demonstrate the effectiveness of the proposed strategy and its promising performance compared to state-of-the-art methods.
arXiv Detail & Related papers (2023-09-21T13:49:05Z) - Towards Robust Continual Learning with Bayesian Adaptive Moment Regularization [51.34904967046097]
Continual learning seeks to overcome the challenge of catastrophic forgetting, where a model forgets previously learnt information.
We introduce a novel prior-based method that better constrains parameter growth, reducing catastrophic forgetting.
Results show that BAdam achieves state-of-the-art performance for prior-based methods on challenging single-headed class-incremental experiments.
arXiv Detail & Related papers (2023-09-15T17:10:51Z) - Complementary Learning Subnetworks for Parameter-Efficient
Class-Incremental Learning [40.13416912075668]
We propose a rehearsal-free CIL approach that learns continually via the synergy between two Complementary Learning Subnetworks.
Our method achieves competitive results against state-of-the-art methods, especially in accuracy gain, memory cost, training efficiency, and task-order robustness.
arXiv Detail & Related papers (2023-06-21T01:43:25Z) - A Representation Learning Perspective on the Importance of
Train-Validation Splitting in Meta-Learning [14.720411598827365]
This work studies the practice of splitting the data from each task into train and validation sets during meta-training; a minimal sketch of such a support/query split appears after this list.
We argue that the train-validation split encourages the learned representation to be low-rank without compromising on expressivity.
Since sample efficiency benefits from low-rankness, the splitting strategy will require very few samples to solve unseen test tasks.
arXiv Detail & Related papers (2021-06-29T17:59:33Z) - Conditional Meta-Learning of Linear Representations [57.90025697492041]
Standard meta-learning for representation learning aims to find a common representation to be shared across multiple tasks.
In this work we overcome the limitation of a single shared representation by inferring a conditioning function, mapping the tasks' side information into a representation tailored to the task at hand.
We propose a meta-algorithm capable of leveraging this advantage in practice.
arXiv Detail & Related papers (2021-03-30T12:02:14Z) - Overcoming Catastrophic Forgetting via Direction-Constrained
Optimization [43.53836230865248]
We study a new design of the optimization algorithm for training deep learning models with a fixed architecture of the classification network in a continual learning framework.
We present our direction-constrained optimization (DCO) method, where for each task we introduce a linear autoencoder to approximate its corresponding top forbidden principal directions.
We demonstrate that our algorithm performs favorably compared to other state-of-the-art regularization-based continual learning methods.
arXiv Detail & Related papers (2020-11-25T08:45:21Z) - Expert Training: Task Hardness Aware Meta-Learning for Few-Shot
Classification [62.10696018098057]
We propose an easy-to-hard expert meta-training strategy to arrange the training tasks properly.
A task hardness aware module is designed and integrated into the training procedure to estimate the hardness of a task.
Experimental results on the miniImageNet and tieredImageNetSketch datasets show that the meta-learners can obtain better results with our expert training strategy.
arXiv Detail & Related papers (2020-07-13T08:49:00Z) - Meta Cyclical Annealing Schedule: A Simple Approach to Avoiding
Meta-Amortization Error [50.83356836818667]
We develop a novel meta-regularization objective using a cyclical annealing schedule and the maximum mean discrepancy (MMD) criterion; a minimal MMD sketch appears after this list.
The experimental results show that our approach substantially outperforms standard meta-learning algorithms.
arXiv Detail & Related papers (2020-03-04T04:43:16Z) - Provable Meta-Learning of Linear Representations [114.656572506859]
We provide fast, sample-efficient algorithms to address the dual challenges of learning a common set of features from multiple, related tasks, and transferring this knowledge to new, unseen tasks.
We also provide information-theoretic lower bounds on the sample complexity of learning these linear features.
arXiv Detail & Related papers (2020-02-26T18:21:34Z)
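As a companion to the train-validation splitting entry above, here is a minimal sketch of the per-task support/query split used in episodic meta-training; the set sizes, array shapes, and function names are illustrative assumptions, not taken from that paper.

```python
# Minimal sketch of the per-task train/validation ("support/query") split used
# during meta-training; sizes and names are illustrative only.
import numpy as np

def split_task(x, y, n_support, rng):
    """Randomly split one task's examples into support (train) and query (validation) sets."""
    idx = rng.permutation(len(x))
    s, q = idx[:n_support], idx[n_support:]
    return (x[s], y[s]), (x[q], y[q])

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x = rng.normal(size=(20, 8))        # 20 examples with 8 features for one task
    y = rng.integers(0, 5, size=20)     # 5-way labels
    (xs, ys), (xq, yq) = split_task(x, y, n_support=5, rng=rng)
    # The inner loop adapts on (xs, ys); the outer (meta) loss is evaluated on (xq, yq).
    print(xs.shape, xq.shape)
```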
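For the cyclical-annealing/MMD entry above, the maximum mean discrepancy can be estimated from two sample sets as sketched below; the RBF kernel, bandwidth, and toy data are illustrative assumptions, and the cyclical annealing schedule itself is not shown.

```python
# Minimal sketch of a (biased) RBF-kernel estimator of squared MMD between two
# sample sets; bandwidth and toy data are illustrative assumptions.
import numpy as np

def rbf_kernel(a, b, bandwidth):
    """Pairwise Gaussian kernel values between rows of a and rows of b."""
    d2 = np.sum((a[:, None, :] - b[None, :, :]) ** 2, axis=-1)
    return np.exp(-d2 / (2.0 * bandwidth ** 2))

def mmd2(x, y, bandwidth=1.0):
    """Biased (V-statistic) estimate of the squared maximum mean discrepancy."""
    kxx = rbf_kernel(x, x, bandwidth).mean()
    kyy = rbf_kernel(y, y, bandwidth).mean()
    kxy = rbf_kernel(x, y, bandwidth).mean()
    return kxx + kyy - 2.0 * kxy

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x = rng.normal(0.0, 1.0, size=(100, 3))
    y = rng.normal(0.5, 1.0, size=(100, 3))
    print(mmd2(x, y))   # grows as the two sample sets differ more in distribution
```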