A Representation Learning Perspective on the Importance of
Train-Validation Splitting in Meta-Learning
- URL: http://arxiv.org/abs/2106.15615v1
- Date: Tue, 29 Jun 2021 17:59:33 GMT
- Title: A Representation Learning Perspective on the Importance of
Train-Validation Splitting in Meta-Learning
- Authors: Nikunj Saunshi, Arushi Gupta, Wei Hu
- Abstract summary: This work studies splitting the data from each task into train and validation sets during meta-training.
We argue that the train-validation split encourages the learned representation to be low-rank without compromising on expressivity.
Since sample efficiency benefits from low-rankness, the splitting strategy will require very few samples to solve unseen test tasks.
- Score: 14.720411598827365
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: An effective approach in meta-learning is to utilize multiple "train tasks"
to learn a good initialization for model parameters that can help solve unseen
"test tasks" with very few samples by fine-tuning from this initialization.
Although successful in practice, theoretical understanding of such methods is
limited. This work studies an important aspect of these methods: splitting the
data from each task into train (support) and validation (query) sets during
meta-training. Inspired by recent work (Raghu et al., 2020), we view such
meta-learning methods through the lens of representation learning and argue
that the train-validation split encourages the learned representation to be
low-rank without compromising on expressivity, as opposed to the non-splitting
variant that encourages high-rank representations. Since sample efficiency
benefits from low-rankness, the splitting strategy will require very few
samples to solve unseen test tasks. We present theoretical results that
formalize this idea for linear representation learning on a subspace
meta-learning instance, and experimentally verify this practical benefit of
splitting in simulations and on standard meta-learning benchmarks.
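To make the comparison concrete, below is a minimal sketch of the two objectives on a synthetic subspace instance with linear regression tasks and a ridge-regression head fit on top of a learned linear representation. The helper names (make_task, inner_head, meta_loss) and all dimensions are illustrative assumptions, not the authors' code; the sketch only sets up the train-val and train-train objectives and leaves the outer optimization of the representation (e.g., gradient descent on Phi) to the reader.

```python
# Minimal sketch (assumptions, not the paper's implementation): train-val vs
# train-train meta-objectives for linear representation learning on a
# subspace meta-learning instance.
import numpy as np

d, k, n_tasks, n_support, n_query = 20, 2, 200, 5, 5
rng = np.random.default_rng(0)
B_star = np.linalg.qr(rng.normal(size=(d, k)))[0]  # ground-truth low-dim subspace

def make_task():
    # Linear regression task whose parameter vector lies in the shared subspace.
    w = B_star @ rng.normal(size=k)
    X = rng.normal(size=(n_support + n_query, d))
    y = X @ w + 0.1 * rng.normal(size=n_support + n_query)
    return X, y

def inner_head(Phi, X, y, lam=1e-3):
    # Task-specific ridge-regression head fit on top of the representation Phi.
    Z = X @ Phi
    return np.linalg.solve(Z.T @ Z + lam * np.eye(Phi.shape[1]), Z.T @ y)

def meta_loss(Phi, tasks, split=True):
    # split=True: fit the head on the support half, evaluate on the query half
    # (train-val). split=False: fit and evaluate on the same data (train-train).
    loss = 0.0
    for X, y in tasks:
        if split:
            Xs, ys, Xq, yq = X[:n_support], y[:n_support], X[n_support:], y[n_support:]
        else:
            Xs, ys, Xq, yq = X, y, X, y
        w = inner_head(Phi, Xs, ys)
        loss += np.mean((Xq @ Phi @ w - yq) ** 2)
    return loss / len(tasks)

tasks = [make_task() for _ in range(n_tasks)]
Phi = rng.normal(size=(d, d))  # over-parameterized (full-width) representation
print("train-val objective:   ", meta_loss(Phi, tasks, split=True))
print("train-train objective: ", meta_loss(Phi, tasks, split=False))
```

Per the paper's argument, minimizing the train-val objective encourages the learned representation to collapse toward the low-rank ground-truth subspace, whereas the train-train objective also admits high-rank minimizers.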
Related papers
- Rethinking Meta-Learning from a Learning Lens [17.00587250127854]
We focus on the more fundamental "learning to learn" strategy of meta-learning to explore what causes errors and how to eliminate these errors without changing the environment.
We propose incorporating task relations into the optimization process of meta-learning and introduce a plug-and-play method called Task Relation Learner (TRLearner) to achieve this goal.
arXiv Detail & Related papers (2024-09-13T02:00:16Z) - First-order ANIL provably learns representations despite overparametrization [21.74339210788053]
This work shows that first-order ANIL with a linear two-layer network architecture successfully learns linear shared representations.
Having a width larger than the dimension of the shared representations results in an asymptotically low-rank solution.
Overall, this illustrates how well model-agnostic methods such as first-order ANIL can learn shared representations.
arXiv Detail & Related papers (2023-03-02T15:13:37Z) - The Effect of Diversity in Meta-Learning [79.56118674435844]
Few-shot learning aims to learn representations that can tackle novel tasks given a small number of examples.
Recent studies show that task distribution plays a vital role in the model's performance.
We study different task distributions on a myriad of models and datasets to evaluate the effect of task diversity on meta-learning algorithms.
arXiv Detail & Related papers (2022-01-27T19:39:07Z) - Conditional Meta-Learning of Linear Representations [57.90025697492041]
Standard meta-learning for representation learning aims to find a common representation to be shared across multiple tasks, but a single fixed representation can be limiting when the tasks are heterogeneous.
In this work we overcome this issue by inferring a conditioning function, mapping the tasks' side information into a representation tailored to the task at hand.
We propose a meta-algorithm capable of leveraging this advantage in practice.
arXiv Detail & Related papers (2021-03-30T12:02:14Z) - Lessons from Chasing Few-Shot Learning Benchmarks: Rethinking the
Evaluation of Meta-Learning Methods [9.821362920940631]
We introduce a simple baseline for meta-learning, FIX-ML.
We explore two possible goals of meta-learning: to develop methods that generalize (i) to the same task distribution that generates the training set (in-distribution), or (ii) to new, unseen task distributions (out-of-distribution).
Our results highlight that in order to reason about progress in this space, it is necessary to provide a clearer description of the goals of meta-learning, and to develop more appropriate evaluation strategies.
arXiv Detail & Related papers (2021-02-23T05:34:30Z) - Few-shot Action Recognition with Prototype-centered Attentive Learning [88.10852114988829]
The Prototype-centered Attentive Learning (PAL) model is composed of two novel components.
First, a prototype-centered contrastive learning loss is introduced to complement the conventional query-centered learning objective.
Second, PAL integrates an attentive hybrid learning mechanism that can minimize the negative impacts of outliers.
arXiv Detail & Related papers (2021-01-20T11:48:12Z) - How Important is the Train-Validation Split in Meta-Learning? [155.5088631672781]
A common practice in meta-learning is to perform a train-validation split (the train-val method), where the prior adapts to the task on one split of the data and the resulting predictor is evaluated on another split.
Despite its prevalence, the importance of the train-validation split is not well understood either in theory or in practice.
We show that the train-train method can indeed outperform the train-val method on both simulations and real meta-learning tasks.
arXiv Detail & Related papers (2020-10-12T16:48:42Z) - SML: Semantic Meta-learning for Few-shot Semantic Segmentation [27.773396307292497]
We propose a novel meta-learning framework, Semantic Meta-Learning, which incorporates class-level semantic descriptions in the generated prototypes for this problem.
In addition, we propose to use the well-established technique of ridge regression not only to bring in the class-level semantic information, but also to effectively utilise the information available from multiple images in the training data for prototype computation.
arXiv Detail & Related papers (2020-09-14T18:26:46Z) - Learning Diverse Representations for Fast Adaptation to Distribution
Shift [78.83747601814669]
We present a method for learning multiple models, incorporating an objective that pressures each to learn a distinct way to solve the task.
We demonstrate our framework's ability to facilitate rapid adaptation to distribution shift.
arXiv Detail & Related papers (2020-06-12T12:23:50Z) - Meta-Baseline: Exploring Simple Meta-Learning for Few-Shot Learning [79.25478727351604]
We explore a simple process: meta-learning over a whole-classification pre-trained model on its evaluation metric.
We observe this simple method achieves competitive performance to state-of-the-art methods on standard benchmarks.
arXiv Detail & Related papers (2020-03-09T20:06:36Z)