Exploring the Similarity of Representations in Model-Agnostic
Meta-Learning
- URL: http://arxiv.org/abs/2105.05757v1
- Date: Wed, 12 May 2021 16:20:40 GMT
- Title: Exploring the Similarity of Representations in Model-Agnostic
Meta-Learning
- Authors: Thomas Goerttler and Klaus Obermayer
- Abstract summary: Model-agnostic meta-learning (MAML) has been one of the most promising approaches in meta-learning.
Recent work proposes that MAML works by feature reuse rather than rapid learning.
We apply representation similarity analysis (RSA), a well-established method in neuroscience, to the few-shot learning instantiation of MAML.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In recent years, model-agnostic meta-learning (MAML) has been one of the
most promising approaches in meta-learning. It can be applied to different kinds of
problems, e.g., reinforcement learning, and also shows good results on few-shot
learning tasks. Despite its tremendous success in these tasks, it has not yet been
fully revealed why it works so well. Recent work proposes that MAML works by feature
reuse rather than rapid learning. In this paper, we aim to inspire a deeper
understanding of this question by analyzing MAML's representations. We apply
representation similarity analysis (RSA), a well-established method in neuroscience,
to the few-shot learning instantiation of MAML. Although part of our analysis supports
the general finding that feature reuse is predominant, we also reveal arguments
against this conclusion: the similarity increase in layers closer to the input arises
from the learning task itself and not from the model, and the inner gradient steps
change the representation more broadly than the changes made during meta-training.
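As a rough illustration of the analysis method, below is a minimal Python sketch of representation similarity analysis (RSA): for the same set of inputs, a representational dissimilarity matrix (RDM) is computed per layer from pairwise correlation distances, and two representations are compared via the Spearman correlation of their RDMs. The activations, layer size, and the specific distance and comparison choices are illustrative assumptions, not necessarily the exact configuration used in the paper.

```python
import numpy as np
from scipy.spatial.distance import pdist
from scipy.stats import spearmanr

def rdm(activations):
    """Condensed representational dissimilarity matrix:
    1 - Pearson correlation for every pair of inputs."""
    return pdist(activations, metric="correlation")

def rsa_similarity(acts_a, acts_b):
    """Second-order similarity of two representations of the same inputs:
    Spearman correlation between their RDMs."""
    rho, _ = spearmanr(rdm(acts_a), rdm(acts_b))
    return rho

# Hypothetical activations of one layer for 20 inputs of an episode,
# recorded before and after the inner-loop gradient steps.
before = np.random.randn(20, 64)
after = before + 0.1 * np.random.randn(20, 64)
print(rsa_similarity(before, after))
```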
Related papers
- LLAVADI: What Matters For Multimodal Large Language Models Distillation [77.73964744238519]
In this work, we do not propose a new efficient model structure or train small-scale MLLMs from scratch.
Our studies involve training strategies, model choices, and distillation algorithms in the knowledge distillation process.
With the proper strategy, evaluated across different benchmarks, even a small 2.7B model can perform on par with larger models of 7B or 13B parameters.
arXiv Detail & Related papers (2024-07-28T06:10:47Z)
- MinT: Boosting Generalization in Mathematical Reasoning via Multi-View Fine-Tuning [53.90744622542961]
Reasoning in mathematical domains remains a significant challenge for small language models (LMs).
We introduce a new method that exploits existing mathematical problem datasets with diverse annotation styles.
Experimental results show that our strategy enables a LLaMA-7B model to outperform prior approaches.
arXiv Detail & Related papers (2023-07-16T05:41:53Z)
- MAML and ANIL Provably Learn Representations [60.17417686153103]
We prove that two well-known meta-learning methods, MAML and ANIL, are capable of learning a common representation across a set of given tasks.
Specifically, in the well-known multi-task linear representation learning setting, they are able to recover the ground-truth representation at an exponentially fast rate.
Our analysis shows that the driving force causing MAML and ANIL to recover the underlying representation is the adaptation of the final layer of the model.
arXiv Detail & Related papers (2022-02-07T19:43:02Z)
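To make the head-adaptation argument above concrete, here is a hypothetical PyTorch sketch contrasting a MAML-style inner step, which adapts all parameters, with an ANIL-style inner step, which adapts only the final layer. The two-layer model, dimensions, and learning rate are illustrative assumptions, not the setting analyzed in that paper.

```python
import torch
import torch.nn.functional as F

def inner_adapt(body, head, x, y, lr=0.01, adapt_body=True):
    """One inner-loop gradient step on a support set.
    adapt_body=True  -> MAML-style step (all parameters adapted)
    adapt_body=False -> ANIL-style step (only the final layer adapted)"""
    logits = F.linear(F.relu(F.linear(x, body)), head)
    loss = F.cross_entropy(logits, y)
    params = (body, head) if adapt_body else (head,)
    grads = torch.autograd.grad(loss, params, create_graph=True)
    if adapt_body:
        return body - lr * grads[0], head - lr * grads[1]
    return body, head - lr * grads[0]

# Hypothetical 5-way support set with 8-dimensional inputs.
body = torch.randn(16, 8, requires_grad=True)   # hidden-layer weights ("body")
head = torch.randn(5, 16, requires_grad=True)   # classification layer ("head")
x, y = torch.randn(5, 8), torch.arange(5)

maml_body, maml_head = inner_adapt(body, head, x, y, adapt_body=True)   # MAML
anil_body, anil_head = inner_adapt(body, head, x, y, adapt_body=False)  # ANIL
assert anil_body is body  # ANIL leaves the body untouched in the inner loop
```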
- The Effect of Diversity in Meta-Learning [79.56118674435844]
Few-shot learning aims to learn representations that can tackle novel tasks given a small number of examples.
Recent studies show that task distribution plays a vital role in the model's performance.
We study different task distributions on a myriad of models and datasets to evaluate the effect of task diversity on meta-learning algorithms.
arXiv Detail & Related papers (2022-01-27T19:39:07Z)
- Does MAML Only Work via Feature Re-use? A Data Centric Perspective [19.556093984142418]
We provide empirical results that shed some light on how meta-learned MAML representations function.
We show that it is possible to define a family of synthetic benchmarks that result in a low degree of feature re-use.
We conjecture that the core challenge of re-thinking meta-learning lies in the design of few-shot learning datasets and benchmarks.
arXiv Detail & Related papers (2021-12-24T20:18:38Z)
- Meta-learning Amidst Heterogeneity and Ambiguity [11.061517140668961]
We devise a novel meta-learning framework, called Meta-learning Amidst Heterogeneity and Ambiguity (MAHA).
Through extensive experiments on regression and classification, we demonstrate the validity of our model.
arXiv Detail & Related papers (2021-07-05T18:54:31Z)
- MAML is a Noisy Contrastive Learner [72.04430033118426]
Model-agnostic meta-learning (MAML) is one of the most popular and widely-adopted meta-learning algorithms nowadays.
We provide a new perspective on the working mechanism of MAML and show that MAML is analogous to a meta-learner using a supervised contrastive objective function.
We propose a simple but effective technique, zeroing trick, to alleviate such interference.
arXiv Detail & Related papers (2021-06-29T12:52:26Z)
- BOIL: Towards Representation Change for Few-shot Learning [20.23766569940024]
We study the necessity of representation change for the ultimate goal of few-shot learning, which is solving domain-agnostic tasks.
We propose a novel meta-learning algorithm, called BOIL, which updates only the body of the model and freezes the head during inner loop updates.
BOIL empirically shows significant performance improvement over MAML, particularly on cross-domain tasks.
arXiv Detail & Related papers (2020-08-20T10:52:23Z)
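BOIL's inner loop is the mirror image of ANIL's: the head is frozen and only the body is adapted. Below is a minimal, hypothetical PyTorch sketch of such an inner loop (model size, learning rate, and number of steps are assumptions for illustration); the outer meta-update is omitted.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Toy two-layer network: model[:2] is the "body", model[2] is the "head".
model = nn.Sequential(nn.Linear(8, 16), nn.ReLU(), nn.Linear(16, 5))
body, head = model[:2], model[2]

for p in head.parameters():                              # freeze the head ...
    p.requires_grad_(False)
inner_opt = torch.optim.SGD(body.parameters(), lr=0.5)   # ... adapt only the body

x, y = torch.randn(5, 8), torch.arange(5)                # hypothetical 5-way support set
for _ in range(5):                                       # BOIL-style inner-loop steps
    inner_opt.zero_grad()
    F.cross_entropy(model(x), y).backward()
    inner_opt.step()
```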
- Unraveling Meta-Learning: Understanding Feature Representations for Few-Shot Tasks [55.66438591090072]
We develop a better understanding of the underlying mechanics of meta-learning and the difference between models trained using meta-learning and models trained classically.
We develop a regularizer which boosts the performance of standard training routines for few-shot classification.
arXiv Detail & Related papers (2020-02-17T03:18:45Z)