Embedding Adaptation is Still Needed for Few-Shot Learning
- URL: http://arxiv.org/abs/2104.07255v1
- Date: Thu, 15 Apr 2021 06:00:04 GMT
- Title: Embedding Adaptation is Still Needed for Few-Shot Learning
- Authors: Sébastien M. R. Arnold and Fei Sha
- Abstract summary: ATG is a principled clustering method for defining train and test tasksets without additional human knowledge.
We empirically demonstrate the effectiveness of ATG in generating tasksets that are easier, in-between, or harder than existing benchmarks.
We leverage our generated tasksets to shed new light on few-shot classification: gradient-based methods can outperform metric-based ones when transfer is most challenging.
- Score: 25.4156194645678
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Constructing new and more challenging tasksets is a fruitful methodology to
analyse and understand few-shot classification methods. Unfortunately, existing
approaches to building those tasksets are somewhat unsatisfactory: they either
assume train and test task distributions to be identical -- which leads to
overly optimistic evaluations -- or take a "worst-case" philosophy -- which
typically requires additional human labor such as obtaining semantic class
relationships. We propose ATG, a principled clustering method for defining train
and test tasksets without additional human knowledge. ATG models train and test
task distributions while requiring them to share a predefined amount of
information. We empirically demonstrate the effectiveness of ATG in generating
tasksets that are easier, in-between, or harder than existing benchmarks,
including those that rely on semantic information. Finally, we leverage our
generated tasksets to shed new light on few-shot classification:
gradient-based methods -- previously believed to underperform -- can outperform
metric-based ones when transfer is most challenging.
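The abstract describes ATG only at a high level, so the snippet below is a minimal sketch of the general idea (splitting classes into train and test tasksets by clustering class embeddings), not the paper's actual algorithm; the `class_embeddings` input, the cluster-to-split assignment, and the `swap_fraction` knob that loosely stands in for the "predefined amount of shared information" are all illustrative assumptions.

```python
# Minimal sketch (not the paper's ATG algorithm): split classes into train and
# test tasksets by clustering per-class embeddings, so the two sides are
# structurally dissimilar, then optionally swap a few classes back to loosen
# the split. All names and defaults here are illustrative assumptions.
import numpy as np
from sklearn.cluster import KMeans

def split_classes(class_embeddings, n_clusters=2, swap_fraction=0.0, seed=0):
    """class_embeddings: (n_classes, dim) array, e.g. per-class mean features."""
    rng = np.random.default_rng(seed)
    labels = KMeans(n_clusters=n_clusters, n_init=10, random_state=seed).fit_predict(class_embeddings)
    clusters = [np.flatnonzero(labels == c) for c in range(n_clusters)]

    # Whole clusters go to one side or the other, which makes transfer harder.
    train = np.concatenate(clusters[: n_clusters // 2])
    test = np.concatenate(clusters[n_clusters // 2:])

    # Crude stand-in for "sharing a predefined amount of information":
    # swap a fraction of classes across the split to make transfer easier again.
    n_swap = int(swap_fraction * min(len(train), len(test)))
    if n_swap:
        i = rng.choice(len(train), n_swap, replace=False)
        j = rng.choice(len(test), n_swap, replace=False)
        train[i], test[j] = test[j].copy(), train[i].copy()
    return train, test  # arrays of class indices for the train/test tasksets
```

In ATG itself, the shared information between the train and test task distributions is controlled in a principled way rather than by a random swap; the knob above only mimics that control very crudely.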
Related papers
- Active Instruction Tuning: Improving Cross-Task Generalization by
Training on Prompt Sensitive Tasks [101.40633115037983]
Instruction tuning (IT) achieves impressive zero-shot generalization results by training large language models (LLMs) on a massive amount of diverse tasks with instructions.
How to select new tasks to improve the performance and generalizability of IT models remains an open question.
We propose active instruction tuning based on prompt uncertainty, a novel framework to identify informative tasks, and then actively tune the models on the selected tasks.
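The summary does not spell out how prompt uncertainty is computed, so the snippet below is only a plausible reading: score each task by how much the model's outputs disagree across perturbed prompts, then keep the most uncertain tasks for tuning. The `predict` callback, the perturbation list, and the `top_k` selection are all hypothetical.

```python
# Hedged sketch of uncertainty-driven task selection (a plausible reading of
# "active instruction tuning based on prompt uncertainty", not the paper's code).
# `predict(prompt, example)` is a hypothetical callback returning a model answer.

def prompt_uncertainty(predict, prompts, examples):
    """Fraction of (example, perturbed-prompt) pairs whose answer differs from
    the answer under the first (original) prompt."""
    disagreements, total = 0, 0
    for ex in examples:
        baseline = predict(prompts[0], ex)
        for prompt in prompts[1:]:
            disagreements += int(predict(prompt, ex) != baseline)
            total += 1
    return disagreements / max(total, 1)

def select_tasks(tasks, predict, top_k=10):
    """tasks: list of (task_name, prompts, examples); returns the top_k most uncertain names."""
    scored = [(prompt_uncertainty(predict, prompts, examples), name)
              for name, prompts, examples in tasks]
    return [name for _, name in sorted(scored, reverse=True)[:top_k]]
```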
arXiv Detail & Related papers (2023-11-01T04:40:05Z)
- TaskMix: Data Augmentation for Meta-Learning of Spoken Intent
Understanding [0.0]
We show that a state-of-the-art data augmentation method worsens overfitting when task diversity is low.
We propose a simple method, TaskMix, which synthesizes new tasks by linearly interpolating existing tasks.
We show that TaskMix outperforms baselines, alleviates overfitting when task diversity is low, and does not degrade performance even when it is high.
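Since the summary describes TaskMix only as linearly interpolating existing tasks, the snippet below is a toy mixup-style sketch at the task level; the dictionary layout and the Beta-distributed mixing coefficient are assumptions borrowed from standard mixup, not the paper's implementation.

```python
# Toy sketch of interpolating two tasks, mixup-style (illustrative only; the
# paper's TaskMix targets meta-learning tasks for spoken intent understanding).
# Assumes both tasks have matching 'features' (n, d) and one-hot 'labels' (n, c).
import numpy as np

def task_mix(task_a, task_b, alpha=0.4, seed=0):
    rng = np.random.default_rng(seed)
    lam = rng.beta(alpha, alpha)  # mixing coefficient, as in standard mixup
    return {
        "features": lam * task_a["features"] + (1 - lam) * task_b["features"],
        "labels": lam * task_a["labels"] + (1 - lam) * task_b["labels"],
    }
```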
arXiv Detail & Related papers (2022-09-26T00:37:40Z)
- On the Effectiveness of Fine-tuning Versus Meta-reinforcement Learning [71.55412580325743]
We show that multi-task pretraining with fine-tuning on new tasks performs as well as, or better than, meta-pretraining with meta test-time adaptation.
This is encouraging for future research, as multi-task pretraining tends to be simpler and computationally cheaper than meta-RL.
arXiv Detail & Related papers (2022-06-07T13:24:00Z)
- SUPERB-SG: Enhanced Speech processing Universal PERformance Benchmark
for Semantic and Generative Capabilities [76.97949110580703]
We introduce SUPERB-SG, a new benchmark to evaluate pre-trained models across various speech tasks.
We use a lightweight methodology to test the robustness of representations learned by pre-trained models under shifts in data domain.
We also show that the task diversity of SUPERB-SG coupled with limited task supervision is an effective recipe for evaluating the generalizability of model representation.
arXiv Detail & Related papers (2022-03-14T04:26:40Z)
- Adaptive Task Sampling for Meta-Learning [79.61146834134459]
The key idea of meta-learning for few-shot classification is to mimic the few-shot situations faced at test time.
We propose an adaptive task sampling method to improve the generalization performance.
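The summary gives no details of the adaptive sampler itself, so the sketch below only shows the standard episodic (N-way, K-shot) sampling it builds on; the optional `class_weights` argument is a hypothetical stand-in for biasing sampling toward classes the learner currently finds hard.

```python
# Generic N-way K-shot episode sampler (not the paper's adaptive method).
# `class_weights` is a hypothetical knob for favouring currently-hard classes.
import numpy as np

def sample_episode(data_by_class, n_way=5, k_shot=1, k_query=15,
                   class_weights=None, seed=None):
    """data_by_class: dict mapping class id -> array of examples of that class."""
    rng = np.random.default_rng(seed)
    classes = list(data_by_class)
    p = None
    if class_weights is not None:
        w = np.array([class_weights[c] for c in classes], dtype=float)
        p = w / w.sum()
    chosen = rng.choice(len(classes), size=n_way, replace=False, p=p)

    support, query = [], []
    for episode_label, ci in enumerate(chosen):
        c = classes[ci]
        idx = rng.permutation(len(data_by_class[c]))[: k_shot + k_query]
        examples = data_by_class[c][idx]
        support += [(x, episode_label) for x in examples[:k_shot]]
        query += [(x, episode_label) for x in examples[k_shot:]]
    return support, query  # adapt on `support`, evaluate on `query`
```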
arXiv Detail & Related papers (2020-07-17T03:15:53Z)
- Meta-Reinforcement Learning Robust to Distributional Shift via Model
Identification and Experience Relabeling [126.69933134648541]
We present a meta-reinforcement learning algorithm that is both efficient and extrapolates well when faced with out-of-distribution tasks at test time.
Our method is based on a simple insight: we recognize that dynamics models can be adapted efficiently and consistently with off-policy data.
arXiv Detail & Related papers (2020-06-12T13:34:46Z)
- MC-BERT: Efficient Language Pre-Training via a Meta Controller [96.68140474547602]
Large-scale pre-training is computationally expensive.
ELECTRA, an early attempt to accelerate pre-training, trains a discriminative model that predicts whether each input token was replaced by a generator.
We propose a novel meta-learning framework, MC-BERT, to achieve better efficiency and effectiveness.
arXiv Detail & Related papers (2020-06-10T09:22:19Z)
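For context on the ELECTRA-style objective the summary refers to (a discriminator predicting, per token, whether a small generator replaced it), here is a toy sketch of how the replaced-token-detection labels are built; MC-BERT's meta-controller itself is not shown, and all names below are illustrative.

```python
# Toy sketch of ELECTRA-style replaced token detection (background for MC-BERT,
# not the MC-BERT meta-controller). A small generator proposes tokens at masked
# positions; the discriminator learns to predict, per position, whether the
# token it sees is the original or a replacement.
import numpy as np

def make_rtd_example(tokens, generator_sample, mask_prob=0.15, seed=0):
    """tokens: list of token ids; generator_sample(pos) -> a plausible token id."""
    rng = np.random.default_rng(seed)
    corrupted, is_replaced = [], []
    for pos, tok in enumerate(tokens):
        if rng.random() < mask_prob:
            new_tok = generator_sample(pos)            # generator fills in the slot
            corrupted.append(new_tok)
            is_replaced.append(int(new_tok != tok))    # label 0 if it sampled the original
        else:
            corrupted.append(tok)
            is_replaced.append(0)
    return corrupted, is_replaced  # discriminator is trained to predict `is_replaced`
```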
This list is automatically generated from the titles and abstracts of the papers on this site.