Boosting Meta-Training with Base Class Information for Few-Shot Learning
- URL: http://arxiv.org/abs/2403.03472v1
- Date: Wed, 6 Mar 2024 05:13:23 GMT
- Title: Boosting Meta-Training with Base Class Information for Few-Shot Learning
- Authors: Weihao Jiang, Guodong Liu, Di He, Kun He
- Abstract summary: We propose an end-to-end training paradigm consisting of two alternative loops.
In the outer loop, we compute the cross-entropy loss on the entire training set while updating only the final linear layer.
This training paradigm not only converges quickly but also outperforms existing baselines, indicating that information from the overall training set and the meta-learning training paradigm can reinforce one another.
- Score: 35.144099160883606
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Few-shot learning, a challenging task in machine learning, aims to learn a
classifier adaptable to recognize new, unseen classes with limited labeled
examples. Meta-learning has emerged as a prominent framework for few-shot
learning. Its training framework was originally a task-level learning method,
exemplified by Model-Agnostic Meta-Learning (MAML) and Prototypical Networks. A
recently proposed training paradigm called Meta-Baseline, which consists of
sequential pre-training and meta-training stages, achieves state-of-the-art
performance. However, as a non-end-to-end training method, in which the
meta-training stage can only begin after pre-training is complete, Meta-Baseline
suffers from higher training cost and suboptimal performance due
to the inherent conflict between the two training stages. To address these
limitations, we propose an end-to-end training paradigm consisting of two
alternating loops. In the outer loop, we compute the cross-entropy loss on the
entire training set while updating only the final linear layer. In the inner
loop, we employ the original meta-learning training mode to calculate the loss
and incorporate gradients from the outer loss to guide the parameter updates.
This training paradigm not only converges quickly but also outperforms existing
baselines, indicating that information from the overall training set and the
meta-learning training paradigm can reinforce one another. Moreover,
being model-agnostic, our framework achieves significant performance gains,
surpassing the baseline systems by approximately 1%.
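The following is a minimal PyTorch-style sketch of the two alternating loops described above, assuming a prototypical-network-style episodic inner loop and a simple weighted sum as one plausible way to inject the outer-loop signal; names such as `encoder`, `classifier`, the episode layout, and `outer_loss_weight` are illustrative assumptions, not details taken from the paper.

```python
import torch
import torch.nn.functional as F

# Hypothetical sketch of the alternating outer/inner loops from the abstract.
# All module/function names and the loss-mixing weight are assumptions.

def outer_step(encoder, classifier, loader, optimizer_fc):
    """Outer loop: cross-entropy over the whole base-class training set,
    updating only the final linear layer (backbone kept fixed here)."""
    xb, yb = next(iter(loader))          # one batch from the full training set
    with torch.no_grad():
        feats = encoder(xb)              # frozen backbone features (assumption)
    loss = F.cross_entropy(classifier(feats), yb)
    optimizer_fc.zero_grad()             # optimizer_fc holds only classifier params
    loss.backward()
    optimizer_fc.step()
    return loss.detach()

def inner_step(encoder, classifier, episode, optimizer_all, outer_loss_weight=0.1):
    """Inner loop: episodic meta-learning loss, plus a weighted
    whole-classification term standing in for the outer-loop gradients."""
    support_x, support_y, query_x, query_y, base_x, base_y = episode
    # Prototypical-network-style episodic loss; episode labels are assumed
    # to be remapped to 0..n_way-1.
    support_feat = encoder(support_x)
    prototypes = torch.stack([support_feat[support_y == c].mean(0)
                              for c in torch.unique(support_y)])
    dists = torch.cdist(encoder(query_x), prototypes)
    meta_loss = F.cross_entropy(-dists, query_y)   # negative distance as logit
    # Outer-loop signal: cross-entropy over base classes through the linear head.
    outer_loss = F.cross_entropy(classifier(encoder(base_x)), base_y)
    loss = meta_loss + outer_loss_weight * outer_loss
    optimizer_all.zero_grad()            # optimizer_all holds encoder + classifier
    loss.backward()
    optimizer_all.step()
    return loss.detach()
```

In this sketch the two steps would simply be called in alternation throughout training, so the whole-classification and episodic signals are interleaved end to end rather than run as two separate stages, which is the point of contrast with Meta-Baseline's sequential pre-training and meta-training.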
Related papers
- Joint or Disjoint: Mixing Training Regimes for Early-Exit Models [3.052154851421859]
Early exits significantly reduce the amount of computation required in deep neural networks.
Most early exit methods employ a training strategy that either simultaneously trains the backbone network and the exit heads or trains the exit heads separately.
We propose a training approach where the backbone is initially trained on its own, followed by a phase where both the backbone and the exit heads are trained together.
arXiv Detail & Related papers (2024-07-19T13:56:57Z) - Architecture, Dataset and Model-Scale Agnostic Data-free Meta-Learning [119.70303730341938]
We propose ePisode cUrriculum inveRsion (ECI) during data-free meta training and invErsion calibRation following inner loop (ICFIL) during meta testing.
ECI adaptively increases the difficulty level of pseudo episodes according to the real-time feedback of the meta model.
We formulate the optimization process of meta training with ECI as an adversarial form in an end-to-end manner.
arXiv Detail & Related papers (2023-03-20T15:10:41Z) - MetaICL: Learning to Learn In Context [87.23056864536613]
We introduce MetaICL, a new meta-training framework for few-shot learning where a pretrained language model is tuned to do in-context learning on a large set of training tasks.
We show that MetaICL approaches (and sometimes beats) the performance of models fully finetuned on the target task training data, and outperforms much bigger models with nearly 8x more parameters.
arXiv Detail & Related papers (2021-10-29T17:42:08Z) - Long-term Cross Adversarial Training: A Robust Meta-learning Method for
Few-shot Classification Tasks [10.058068783476598]
This paper proposes a meta-learning method for adversarially robust neural networks called Long-term Cross Adversarial Training (LCAT).
Due to cross adversarial training, LCAT needs only half as many adversarial training epochs as AQ, resulting in low adversarial training cost.
Experimental results show that LCAT achieves superior performance in both clean and adversarial few-shot classification accuracy.
arXiv Detail & Related papers (2021-06-22T06:31:16Z) - Trainable Class Prototypes for Few-Shot Learning [5.481942307939029]
We propose trainable prototypes for the distance measure, instead of artificial ones, within the meta-training and task-training framework.
To avoid the disadvantages brought by episodic meta-training, we adopt non-episodic meta-training based on self-supervised learning.
Our method achieves state-of-the-art performance in a variety of established few-shot tasks on the standard few-shot visual classification dataset.
arXiv Detail & Related papers (2021-06-21T04:19:56Z) - Fast Few-Shot Classification by Few-Iteration Meta-Learning [173.32497326674775]
We introduce a fast optimization-based meta-learning method for few-shot classification.
Our strategy enables important aspects of the base learner objective to be learned during meta-training.
We perform a comprehensive experimental analysis, demonstrating the speed and effectiveness of our approach.
arXiv Detail & Related papers (2020-10-01T15:59:31Z) - Pre-training Text Representations as Meta Learning [113.3361289756749]
We introduce a learning algorithm that directly optimizes the model's ability to learn text representations for effective learning of downstream tasks.
We show that there is an intrinsic connection between multi-task pre-training and model-agnostic meta-learning with a sequence of meta-train steps.
arXiv Detail & Related papers (2020-04-12T09:05:47Z) - Meta-Baseline: Exploring Simple Meta-Learning for Few-Shot Learning [79.25478727351604]
We explore a simple process: meta-learning over a whole-classification pre-trained model on its evaluation metric.
We observe this simple method achieves competitive performance to state-of-the-art methods on standard benchmarks.
arXiv Detail & Related papers (2020-03-09T20:06:36Z) - Incremental Meta-Learning via Indirect Discriminant Alignment [118.61152684795178]
We develop a notion of incremental learning during the meta-training phase of meta-learning.
Our approach performs favorably at test time as compared to training a model with the full meta-training set.
arXiv Detail & Related papers (2020-02-11T01:39:12Z)