Comparing Transfer and Meta Learning Approaches on a Unified Few-Shot
Classification Benchmark
- URL: http://arxiv.org/abs/2104.02638v1
- Date: Tue, 6 Apr 2021 16:17:51 GMT
- Title: Comparing Transfer and Meta Learning Approaches on a Unified Few-Shot
Classification Benchmark
- Authors: Vincent Dumoulin, Neil Houlsby, Utku Evci, Xiaohua Zhai, Ross
Goroshin, Sylvain Gelly, Hugo Larochelle
- Abstract summary: A cross-family study of the best transfer and meta learners on a large-scale meta-learning benchmark and a transfer learning benchmark.
We find that, on average, large-scale transfer methods (Big Transfer, BiT) outperform competing approaches on MD, even when trained only on ImageNet.
We reveal a number of discrepancies in evaluation norms and study some of these in light of the performance gap.
- Score: 44.530605715850506
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Meta and transfer learning are two successful families of approaches to
few-shot learning. Despite highly related goals, state-of-the-art advances in
each family are measured largely in isolation of each other. As a result of
diverging evaluation norms, a direct or thorough comparison of different
approaches is challenging. To bridge this gap, we perform a cross-family study
of the best transfer and meta learners on both a large-scale meta-learning
benchmark (Meta-Dataset, MD), and a transfer learning benchmark (Visual Task
Adaptation Benchmark, VTAB). We find that, on average, large-scale transfer
methods (Big Transfer, BiT) outperform competing approaches on MD, even when
trained only on ImageNet. In contrast, meta-learning approaches struggle to
compete on VTAB when trained and validated on MD. However, BiT is not without
limitations, and pushing for scale does not improve performance on highly
out-of-distribution MD tasks. In performing this study, we reveal a number of
discrepancies in evaluation norms and study some of these in light of the
performance gap. We hope that this work facilitates sharing of insights from
each community, and accelerates progress on few-shot learning.
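To make the shared evaluation protocol concrete, here is a minimal numpy sketch of how a transfer-style learner is typically scored on an episodic benchmark like MD: a frozen pre-trained backbone embeds the images, a simple per-episode readout is fit on the support set, and accuracy is reported on the query set. Everything here is a stand-in: the "backbone" is a fixed random projection, the data is synthetic, and a nearest-centroid readout substitutes for whatever per-episode head BiT actually uses.
```python
import numpy as np

rng = np.random.default_rng(0)
W = rng.standard_normal((64, 32))  # frozen "backbone": a fixed random projection

def embed(x):
    """Stand-in for a frozen pre-trained feature extractor (e.g. BiT)."""
    return x @ W

def evaluate_episode(support_x, support_y, query_x, query_y):
    """Fit a nearest-centroid readout on the support set; score on the query set."""
    s, q = embed(support_x), embed(query_x)
    classes = np.unique(support_y)
    centroids = np.stack([s[support_y == c].mean(axis=0) for c in classes])
    dists = ((q[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=-1)
    return (classes[dists.argmin(axis=1)] == query_y).mean()

# One synthetic 5-way 5-shot episode with 15 queries per class.
n_way, k_shot, n_query = 5, 5, 15
support_y = np.repeat(np.arange(n_way), k_shot)
query_y = np.repeat(np.arange(n_way), n_query)
support_x = rng.standard_normal((len(support_y), 64)) + 3 * support_y[:, None]
query_x = rng.standard_normal((len(query_y), 64)) + 3 * query_y[:, None]
print(f"episode accuracy: {evaluate_episode(support_x, support_y, query_x, query_y):.2f}")
```
Benchmark scores are then averages of this episode-level accuracy over many sampled tasks, which is what makes cross-family comparisons sensitive to how episodes are sampled.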
Related papers
- Weighted Ensemble Self-Supervised Learning [67.24482854208783]
Ensembling has proven to be a powerful technique for boosting model performance.
We develop a framework that permits data-dependent weighted cross-entropy losses.
Our method outperforms both on multiple evaluation metrics on ImageNet-1K.
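A minimal numpy sketch of the general shape of a data-dependent weighted cross-entropy ensemble loss: each ensemble member's per-sample loss is scaled by a per-member, per-sample weight. The weights here are arbitrary inputs; how the paper derives them from the data is not reproduced, so treat this only as an illustration of the loss form.
```python
import numpy as np

def cross_entropy(logits, labels):
    """Per-sample cross-entropy via a numerically stable log-softmax."""
    z = logits - logits.max(axis=1, keepdims=True)
    log_p = z - np.log(np.exp(z).sum(axis=1, keepdims=True))
    return -log_p[np.arange(len(labels)), labels]

def weighted_ensemble_loss(member_logits, labels, weights):
    """member_logits: (M, B, C) logits from M ensemble members;
    weights: (M, B) data-dependent per-member, per-sample weights."""
    losses = np.stack([cross_entropy(l, labels) for l in member_logits])  # (M, B)
    return (weights * losses).sum(axis=0).mean()

# Toy usage: two members, four samples, three classes, uniform weights.
rng = np.random.default_rng(1)
logits = rng.standard_normal((2, 4, 3))
labels = np.array([0, 1, 2, 1])
weights = np.full((2, 4), 0.5)
print(weighted_ensemble_loss(logits, labels, weights))
```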
arXiv Detail & Related papers (2022-11-18T02:00:17Z)
- Meta-Learning with Self-Improving Momentum Target [72.98879709228981]
We propose Self-improving Momentum Target (SiMT) to improve the performance of a meta-learner.
SiMT generates the target model by adapting from the temporal ensemble of the meta-learner.
We show that SiMT brings a significant performance gain when combined with a wide range of meta-learning methods.
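The temporal-ensemble ingredient can be sketched in a few lines: the target is an exponential moving average of the meta-learner's parameters. How SiMT then adapts and distills from this target is not captured here; this is only the momentum update itself.
```python
import numpy as np

def update_momentum_target(target, learner, tau=0.995):
    """Temporal ensemble of the meta-learner: target <- tau*target + (1-tau)*learner."""
    return {k: tau * target[k] + (1.0 - tau) * learner[k] for k in target}

# Toy usage: the target trails the learner, smoothing over training noise.
learner = {"w": np.ones(3)}
target = {"w": np.zeros(3)}
for _ in range(100):
    target = update_momentum_target(target, learner)
print(target["w"])  # slowly approaches [1, 1, 1]
```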
arXiv Detail & Related papers (2022-10-11T06:45:15Z)
- Few-Shot Classification with Contrastive Learning [10.236150550121163]
We propose a novel contrastive learning-based framework that seamlessly integrates contrastive learning into both the pre-training and meta-training stages.
In the meta-training stage, we propose a cross-view episodic training mechanism to perform the nearest centroid classification on two different views of the same episode.
These strategies force the model to overcome the bias between views and promote the transferability of representations.
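A compact numpy sketch of the cross-view idea, under the assumption that the two views are two augmentations of the same episode: class centroids computed from one view classify queries from the other view (and vice versa), so the episodic loss only decreases if the representation is stable across views. The framework's contrastive terms are not modeled here.
```python
import numpy as np

def centroid_logits(support, support_y, query, n_way):
    """Negative squared distances from queries to per-class support centroids."""
    c = np.stack([support[support_y == k].mean(axis=0) for k in range(n_way)])
    return -((query[:, None, :] - c[None, :, :]) ** 2).sum(axis=-1)

def cross_view_loss(sup_a, sup_b, qry_a, qry_b, sup_y, qry_y, n_way):
    """Classify view-B queries with view-A centroids, and vice versa."""
    loss = 0.0
    for logits in (centroid_logits(sup_a, sup_y, qry_b, n_way),
                   centroid_logits(sup_b, sup_y, qry_a, n_way)):
        z = logits - logits.max(axis=1, keepdims=True)
        log_p = z - np.log(np.exp(z).sum(axis=1, keepdims=True))
        loss += -log_p[np.arange(len(qry_y)), qry_y].mean()
    return loss / 2.0

# Toy usage: two noisy "views" of the same 3-way episode.
rng = np.random.default_rng(2)
sup_y, qry_y = np.repeat(np.arange(3), 5), np.repeat(np.arange(3), 4)
make = lambda y: rng.standard_normal((len(y), 8)) + y[:, None]
print(cross_view_loss(make(sup_y), make(sup_y), make(qry_y), make(qry_y), sup_y, qry_y, 3))
```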
arXiv Detail & Related papers (2022-09-17T02:39:09Z)
- The Curse of Low Task Diversity: On the Failure of Transfer Learning to Outperform MAML and Their Empirical Equivalence [20.965759895300327]
We propose a novel metric -- the diversity coefficient -- to measure the diversity of tasks in a few-shot learning benchmark.
Using the diversity coefficient, we show that the popular MiniImageNet and CIFAR-FS few-shot learning benchmarks have low diversity.
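The blurb does not define the coefficient. As a rough proxy, one can measure the mean pairwise cosine distance between vector representations of tasks; the paper itself builds on Task2Vec embeddings, which this sketch does not reproduce, so the helper below is an assumption-laden stand-in.
```python
import numpy as np

def diversity_coefficient(task_embeddings):
    """Mean pairwise cosine distance between task embeddings.
    A crude proxy: the paper derives its embeddings from Task2Vec."""
    e = task_embeddings / np.linalg.norm(task_embeddings, axis=1, keepdims=True)
    i, j = np.triu_indices(len(e), k=1)
    return (1.0 - (e @ e.T)[i, j]).mean()

# Near-duplicate tasks yield a low coefficient; spread-out tasks a higher one.
rng = np.random.default_rng(3)
low = rng.standard_normal((10, 16)) * 0.01 + 1.0   # tasks clustered together
high = rng.standard_normal((10, 16))               # tasks spread out
print(diversity_coefficient(low), diversity_coefficient(high))
```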
arXiv Detail & Related papers (2022-08-02T15:49:11Z)
- The Curse of Zero Task Diversity: On the Failure of Transfer Learning to Outperform MAML and Their Empirical Equivalence [19.556093984142418]
A transfer learning solution might be all we need to solve many few-shot learning benchmarks. To test this, we propose a metric, which we name the diversity coefficient of a few-shot learning benchmark. We show that, in a fair comparison between MAML-learned solutions and transfer learning, both achieve identical meta-test accuracy.
arXiv Detail & Related papers (2021-12-24T18:42:58Z)
- Faster Meta Update Strategy for Noise-Robust Deep Learning [62.08964100618873]
We introduce a novel Faster Meta Update Strategy (FaMUS) to replace the most expensive step in the meta gradient with a faster layer-wise approximation.
We show that our method saves two-thirds of the training time while maintaining comparable, or even achieving better, generalization performance.
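The mechanism is only hinted at in the blurb. Below is a heavily hedged illustration of the general flavor of a layer-wise approximation: instead of using the full (expensive) meta gradient over all layers, keep the contribution from a small subset of layers. Note the assumption flagged in the comment: FaMUS itself learns which layer-wise gradients to gather, whereas this toy simply samples them at random.
```python
import numpy as np

rng = np.random.default_rng(4)

def approx_meta_grad(layer_grads, k=2):
    """Keep the meta gradient for k randomly chosen layers, zero the rest.
    A toy stand-in: FaMUS uses a learned layer-wise gather, not sampling."""
    names = list(layer_grads)
    keep = set(rng.choice(len(names), size=k, replace=False))
    return {n: (g if i in keep else np.zeros_like(g))
            for i, (n, g) in enumerate(layer_grads.items())}

# Toy usage: only two of six layers contribute to this meta update.
grads = {f"layer{i}": rng.standard_normal(5) for i in range(6)}
print({n: np.abs(g).sum().round(2) for n, g in approx_meta_grad(grads).items()})
```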
arXiv Detail & Related papers (2021-04-30T16:19:07Z)
- Lessons from Chasing Few-Shot Learning Benchmarks: Rethinking the Evaluation of Meta-Learning Methods [9.821362920940631]
We introduce a simple baseline for meta-learning, FIX-ML.
We explore two possible goals of meta-learning: to develop methods that generalize (i) to the same task distribution that generates the training set (in-distribution), or (ii) to new, unseen task distributions (out-of-distribution).
Our results highlight that in order to reason about progress in this space, it is necessary to provide a clearer description of the goals of meta-learning, and to develop more appropriate evaluation strategies.
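The blurb does not spell FIX-ML out. Assuming the fixed-episode reading suggested by the name (pre-sample a pool of training tasks once and reuse it, rather than drawing fresh tasks every iteration), a baseline's data pipeline might look like the sketch below; treat the construction as an assumption, not the paper's definition.
```python
import numpy as np

rng = np.random.default_rng(5)

def sample_episode():
    """Draw a toy 5-way episode (class identities only, for illustration)."""
    return {"classes": rng.choice(20, size=5, replace=False)}

# Standard meta-training draws a fresh episode every step ...
fresh = [sample_episode() for _ in range(1000)]

# ... while a FIX-ML-style baseline pre-samples a fixed pool and reuses it.
pool = [sample_episode() for _ in range(50)]
fixed = [pool[rng.integers(len(pool))] for _ in range(1000)]
print(len({id(e) for e in fixed}))  # at most 50 distinct episodes ever seen
```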
arXiv Detail & Related papers (2021-02-23T05:34:30Z)
- Few-shot Action Recognition with Prototype-centered Attentive Learning [88.10852114988829]
We propose a Prototype-centered Attentive Learning (PAL) model composed of two novel components.
First, a prototype-centered contrastive learning loss is introduced to complement the conventional query-centered learning objective.
Second, PAL integrates an attentive hybrid learning mechanism that minimizes the negative impact of outliers.
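One plausible reading of "prototype-centered", sketched below: where a conventional query-centered loss normalizes each query's similarities over prototypes, a prototype-centered loss normalizes each prototype's similarities over queries, so every prototype must attract its own queries against all others. This is an assumed formulation, and the attentive hybrid mechanism is not modeled.
```python
import numpy as np

def log_softmax(x, axis):
    z = x - x.max(axis=axis, keepdims=True)
    return z - np.log(np.exp(z).sum(axis=axis, keepdims=True))

def prototype_centered_loss(sims, query_y):
    """sims: (Q, C) query-to-prototype similarities; query_y: (Q,) labels.
    Normalize over queries (axis 0) instead of over prototypes (axis 1)."""
    log_p = log_softmax(sims, axis=0)
    onehot = np.eye(sims.shape[1])[query_y]            # (Q, C) class membership
    per_proto = (onehot * log_p).sum(axis=0) / onehot.sum(axis=0)
    return -per_proto.mean()

# Toy usage: 12 queries, 3 prototypes, 4 queries per class.
rng = np.random.default_rng(6)
sims = rng.standard_normal((12, 3))
print(prototype_centered_loss(sims, np.repeat(np.arange(3), 4)))
```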
arXiv Detail & Related papers (2021-01-20T11:48:12Z)
- Meta-Baseline: Exploring Simple Meta-Learning for Few-Shot Learning [79.25478727351604]
We explore a simple process: meta-learning over a whole-classification pre-trained model on its evaluation metric.
We observe this simple method achieves competitive performance to state-of-the-art methods on standard benchmarks.
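The evaluation metric in question is the standard few-shot head: cosine nearest-centroid. A minimal sketch of such a head follows; the fixed temperature `tau` and the synthetic class directions are assumptions for illustration.
```python
import numpy as np

def cosine_centroid_logits(support, support_y, query, n_way, tau=10.0):
    """Cosine nearest-centroid head: compare normalized query embeddings to
    normalized class centroids, scaled by a temperature tau (assumed value)."""
    c = np.stack([support[support_y == k].mean(axis=0) for k in range(n_way)])
    c /= np.linalg.norm(c, axis=1, keepdims=True)
    q = query / np.linalg.norm(query, axis=1, keepdims=True)
    return tau * (q @ c.T)

# Toy usage: each class points along its own axis, so cosine separates them.
rng = np.random.default_rng(7)
sup_y = np.repeat(np.arange(5), 5)
means = 4 * np.eye(5, 16)
sup = rng.standard_normal((25, 16)) + means[sup_y]
qry = rng.standard_normal((5, 16)) + means[np.arange(5)]
print(cosine_centroid_logits(sup, sup_y, qry, 5).argmax(axis=1))  # expect [0 1 2 3 4]
```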
arXiv Detail & Related papers (2020-03-09T20:06:36Z)