MAML is a Noisy Contrastive Learner
- URL: http://arxiv.org/abs/2106.15367v1
- Date: Tue, 29 Jun 2021 12:52:26 GMT
- Title: MAML is a Noisy Contrastive Learner
- Authors: Chia-Hsiang Kao, Wei-Chen Chiu and Pin-Yu Chen
- Abstract summary: Model-agnostic meta-learning (MAML) is one of the most popular and widely-adopted meta-learning algorithms nowadays.
We provide a new perspective on the working mechanism of MAML and discover that MAML is analogous to a meta-learner using a supervised contrastive objective function.
We propose a simple but effective technique, the zeroing trick, to alleviate such interference.
- Score: 72.04430033118426
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Model-agnostic meta-learning (MAML) is one of the most popular and
widely adopted meta-learning algorithms, achieving remarkable success in
various learning problems. Yet, with its unique design of nested inner-loop and
outer-loop updates, which respectively govern task-specific and
meta-model-centric learning, the underlying learning objective of MAML remains
implicit, which impedes a more straightforward understanding of it. In this
paper, we provide a new perspective on the working mechanism of MAML and
discover that MAML is analogous to a meta-learner using a supervised
contrastive objective function, in which query features are pulled towards the
support features of the same class and pushed away from those of different
classes; this contrastiveness is verified experimentally via an analysis based
on cosine similarity. Moreover, our analysis reveals that the vanilla MAML
algorithm has an undesirable interference term originating from random
initialization and cross-task interaction. We therefore propose a simple but
effective technique, the zeroing trick, to alleviate such interference.
Extensive experiments on both miniImagenet and Omniglot datasets demonstrate
the consistent improvement brought by our proposed technique, validating its
effectiveness.
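
To make the described mechanism concrete, here is a minimal, hedged sketch of a MAML outer-loop step in PyTorch in which only the linear head is adapted in the inner loop and the zeroing trick is applied by resetting that head to zero before each task. The names (`encoder`, `task_batch`, `n_way`, `feat_dim`) are illustrative assumptions, not the authors' reference implementation.

```python
import torch
import torch.nn.functional as F

def meta_train_step(encoder, task_batch, n_way, feat_dim,
                    inner_lr=0.01, inner_steps=5):
    """One MAML outer-loop step over a batch of few-shot tasks (sketch)."""
    meta_loss = 0.0
    for support_x, support_y, query_x, query_y in task_batch:
        # Zeroing trick (as described in the abstract): start the inner loop
        # from a zeroed linear head instead of a randomly initialized one,
        # removing the interference attributed to random initialization.
        w = torch.zeros(n_way, feat_dim, requires_grad=True)
        for _ in range(inner_steps):                 # inner loop on the support set
            logits = encoder(support_x) @ w.t()
            loss = F.cross_entropy(logits, support_y)
            (gw,) = torch.autograd.grad(loss, w, create_graph=True)
            w = w - inner_lr * gw
        # Outer (query) loss: its gradient w.r.t. the encoder is what the paper
        # reads as a noisy supervised contrastive signal, pulling query features
        # toward same-class support features and away from other classes.
        meta_loss = meta_loss + F.cross_entropy(encoder(query_x) @ w.t(), query_y)
    # Caller backpropagates this through the encoder and steps a meta-optimizer.
    return meta_loss / len(task_batch)
```

The contrastive reading can then be checked, as the paper does, by tracking cosine similarities between query features and same-class versus different-class support features during training.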
Related papers
- A New First-Order Meta-Learning Algorithm with Convergence Guarantees [37.85411810113886] (2024-09-05)
Gradient-based meta-learning, especially MAML, has emerged as a viable approach to learning models that adapt quickly to new tasks.
One problem MAML encounters is the computational and memory burden of computing its meta-gradients.
We propose a new first-order variant of MAML that provably converges to a stationary point of the MAML objective, unlike other first-order variants (a generic first-order sketch follows below).
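
For illustration, a generic first-order MAML-style step (in the spirit of FOMAML) looks like the sketch below: inner-loop gradients are treated as constants, so no second-order terms are stored or backpropagated. This is only a sketch of the general idea, not the specific provably convergent variant proposed in that paper; `loss_fn` and the task structure are placeholder assumptions.

```python
import copy
import torch

def fomaml_step(model, loss_fn, task_batch, inner_lr=0.01, inner_steps=5):
    """Accumulate first-order meta-gradients; the caller then runs its meta-optimizer."""
    meta_grads = [torch.zeros_like(p) for p in model.parameters()]
    for support, query in task_batch:
        adapted = copy.deepcopy(model)                      # task-specific copy
        opt = torch.optim.SGD(adapted.parameters(), lr=inner_lr)
        for _ in range(inner_steps):                        # inner loop (plain SGD)
            opt.zero_grad()
            loss_fn(adapted, support).backward()
            opt.step()
        grads = torch.autograd.grad(loss_fn(adapted, query),
                                    tuple(adapted.parameters()))
        for mg, g in zip(meta_grads, grads):                # first-order approximation:
            mg.add_(g / len(task_batch))                    # reuse gradients at the adapted weights
    for p, mg in zip(model.parameters(), meta_grads):
        p.grad = mg                                         # then e.g. meta_opt.step()
```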
- Understanding Masked Autoencoders From a Local Contrastive Perspective [80.57196495601826] (2023-10-03)
Masked AutoEncoder (MAE) has revolutionized the field of self-supervised learning with its simple yet effective masking and reconstruction strategies.
We introduce a new empirical framework, called Local Contrastive MAE, to analyze both the reconstructive and contrastive aspects of MAE (a generic masking sketch follows below).
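
As a point of reference, the random patch masking at the core of MAE-style pretraining can be sketched as below; this is a generic illustration under assumed tensor shapes, not the Local Contrastive MAE analysis framework itself.

```python
import torch

def random_mask(patches, mask_ratio=0.75):
    """patches: (batch, num_patches, dim) -> (visible patches, their indices)."""
    b, n, d = patches.shape
    n_keep = int(n * (1 - mask_ratio))
    # Random permutation of patch positions for every sample; keep the first n_keep.
    keep_idx = torch.rand(b, n).argsort(dim=1)[:, :n_keep]
    visible = torch.gather(patches, 1, keep_idx.unsqueeze(-1).expand(-1, -1, d))
    return visible, keep_idx   # the decoder later reconstructs the masked patches
```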
- Model-Agnostic Multitask Fine-tuning for Few-shot Vision-Language Transfer Learning [59.38343286807997] (2022-03-09)
We propose Model-Agnostic Multitask Fine-tuning (MAMF) for vision-language models on unseen tasks.
Compared with model-agnostic meta-learning (MAML), MAMF discards the bi-level optimization and uses only first-order gradients.
We show that MAMF consistently outperforms the classical fine-tuning method for few-shot transfer learning on five benchmark datasets (a rough sketch of the single-level update follows below).
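
The contrast with MAML's bi-level loop can be seen in a rough single-level sketch like the one below, where losses from a batch of sampled tasks are simply averaged and one ordinary gradient step is taken; this is an assumed simplification, not the exact MAMF procedure.

```python
def multitask_finetune_step(model, loss_fn, task_batch, optimizer):
    """Single-level multitask update: no inner loop, only first-order gradients."""
    optimizer.zero_grad()
    loss = sum(loss_fn(model, task) for task in task_batch) / len(task_batch)
    loss.backward()
    optimizer.step()
    return loss.item()
```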
- Multi-Task Learning on Networks [0.0] (2021-12-07)
Multi-objective optimization problems arising in the multi-task learning context have specific features and require ad hoc methods.
In this thesis, solutions in the input space are represented as probability distributions encapsulating the knowledge contained in the function evaluations.
In this space of probability distributions, endowed with the Wasserstein distance as a metric, a new algorithm, MOEA/WST, can be designed in which the model is not built directly on the objective function (a minimal Wasserstein example follows below).
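
For readers unfamiliar with the metric, the Wasserstein distance in its simplest one-dimensional, empirical-sample form can be computed with SciPy as below; MOEA/WST itself operates on distributions over the input space and is not reproduced here.

```python
import numpy as np
from scipy.stats import wasserstein_distance

rng = np.random.default_rng(0)
a = rng.normal(0.0, 1.0, size=1000)   # samples from N(0, 1)
b = rng.normal(0.5, 1.0, size=1000)   # samples from N(0.5, 1)
# For two Gaussians with equal variance, the 1-Wasserstein distance is the
# absolute difference of the means, so this prints a value close to 0.5.
print(wasserstein_distance(a, b))
```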
- How Fine-Tuning Allows for Effective Meta-Learning [50.17896588738377] (2021-05-05)
We present a theoretical framework for analyzing representations derived from a MAML-like algorithm.
We provide risk bounds on the best predictor found by fine-tuning via gradient descent, demonstrating that the algorithm can provably leverage the shared structure.
This separation from methods that keep the representation frozen underscores the benefit of fine-tuning-based approaches, such as MAML, over "frozen representation" objectives in few-shot learning.
- On Fast Adversarial Robustness Adaptation in Model-Agnostic Meta-Learning [100.14809391594109] (2021-02-20)
Model-agnostic meta-learning (MAML) has emerged as one of the most successful meta-learning techniques in few-shot learning.
Despite the generalization power of the meta-model, it remains unclear how adversarial robustness can be maintained by MAML in few-shot learning.
We propose a general but easily optimized robustness-regularized meta-learning framework that allows the use of unlabeled data augmentation, fast adversarial attack generation, and computationally light fine-tuning (a single-step attack sketch follows below).
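
As an illustration of the "fast adversarial attack generation" ingredient, a single-step FGSM perturbation and a robustness-regularized query loss could look like the sketch below; the regularization weight `lam` and the overall structure are assumptions, not the cited framework's exact formulation.

```python
import torch
import torch.nn.functional as F

def fgsm_perturb(model, x, y, eps=8 / 255):
    """One-step FGSM: move the input along the sign of the loss gradient."""
    x = x.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(x), y)
    (grad,) = torch.autograd.grad(loss, x)
    return (x + eps * grad.sign()).detach()

def robust_query_loss(model, query_x, query_y, lam=1.0):
    clean = F.cross_entropy(model(query_x), query_y)
    adv = F.cross_entropy(model(fgsm_perturb(model, query_x, query_y)), query_y)
    return clean + lam * adv   # clean loss plus a robustness-regularization term
```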
- B-SMALL: A Bayesian Neural Network approach to Sparse Model-Agnostic Meta-Learning [2.9189409618561966] (2021-01-01)
We propose a Bayesian neural network-based MAML algorithm, which we refer to as the B-SMALL algorithm.
We demonstrate the performance of B-SMALL on classification and regression tasks, and highlight that training a sparsifying BNN using MAML indeed improves the parameter footprint of the model.
- Theoretical Convergence of Multi-Step Model-Agnostic Meta-Learning [63.64636047748605] (2020-02-18)
We develop a new theoretical framework to provide convergence guarantees for the general multi-step MAML algorithm.
In particular, our results suggest that the inner-stage step size needs to be chosen inversely proportional to the number $N$ of inner-stage steps in order for $N$-step MAML to have guaranteed convergence (illustrated in the sketch below).
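
The suggested scaling can be stated as a rule of thumb: for an $N$-step inner loop, pick the inner-loop step size as $c/N$ for some constant $c$, as in the purely illustrative sketch below (`grad_fn` is a placeholder for the per-task gradient).

```python
def inner_loop(params, grad_fn, n_steps, c=0.4):
    """N-step inner loop with step size chosen inversely proportional to N."""
    alpha = c / n_steps   # alpha ~ 1/N, so the total inner-loop movement stays bounded as N grows
    for _ in range(n_steps):
        params = [p - alpha * g for p, g in zip(params, grad_fn(params))]
    return params
```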
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.