TaskNorm: Rethinking Batch Normalization for Meta-Learning
- URL: http://arxiv.org/abs/2003.03284v2
- Date: Sun, 28 Jun 2020 14:09:28 GMT
- Title: TaskNorm: Rethinking Batch Normalization for Meta-Learning
- Authors: John Bronskill, Jonathan Gordon, James Requeima, Sebastian Nowozin,
Richard E. Turner
- Abstract summary: We evaluate a range of approaches to batch normalization for meta-learning scenarios, and develop a novel approach that we call TaskNorm.
Experiments on fourteen datasets demonstrate that the choice of batch normalization has a dramatic effect on both classification accuracy and training time.
We provide a set of best practices for normalization that will allow fair comparison of meta-learning algorithms.
- Score: 43.01116858195183
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Modern meta-learning approaches for image classification rely on increasingly
deep networks to achieve state-of-the-art performance, making batch
normalization an essential component of meta-learning pipelines. However, the
hierarchical nature of the meta-learning setting presents several challenges
that can render conventional batch normalization ineffective, giving rise to
the need to rethink normalization in this setting. We evaluate a range of
approaches to batch normalization for meta-learning scenarios, and develop a
novel approach that we call TaskNorm. Experiments on fourteen datasets
demonstrate that the choice of batch normalization has a dramatic effect on
both classification accuracy and training time for both gradient based and
gradient-free meta-learning approaches. Importantly, TaskNorm is found to
consistently improve performance. Finally, we provide a set of best practices
for normalization that will allow fair comparison of meta-learning algorithms.
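The abstract describes replacing conventional batch normalization, which pools statistics across the meta-batch, with normalization based on each task's own data. A minimal illustrative sketch of that idea follows; the function names and the simple context-set-moment scheme are assumptions for illustration, not the paper's exact TaskNorm formulation:

```python
import math

def moments(xs):
    # Per-feature mean and variance over a list of feature vectors.
    n, d = len(xs), len(xs[0])
    mean = [sum(x[j] for x in xs) / n for j in range(d)]
    var = [sum((x[j] - mean[j]) ** 2 for x in xs) / n for j in range(d)]
    return mean, var

def task_normalize(context, target, eps=1e-5):
    # Normalize BOTH the context (support) and target (query) features
    # with moments computed from this task's context set only -- unlike
    # conventional batch norm, no statistics leak across the tasks that
    # happen to share a meta-batch.
    mean, var = moments(context)
    def norm(x):
        return [(x[j] - mean[j]) / math.sqrt(var[j] + eps)
                for j in range(len(x))]
    return [norm(x) for x in context], [norm(x) for x in target]
```

Because the target set is normalized with context-set moments, the scheme stays valid at meta-test time, when only a small context set is available per task.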
Related papers
- Contrastive Knowledge-Augmented Meta-Learning for Few-Shot Classification [28.38744876121834]
We introduce CAML (Contrastive Knowledge-Augmented Meta Learning), a novel approach for knowledge-enhanced few-shot learning.
We evaluate the performance of CAML in different few-shot learning scenarios.
arXiv Detail & Related papers (2022-07-25T17:01:29Z) - Meta-Learning with Fewer Tasks through Task Interpolation [67.03769747726666]
Current meta-learning algorithms require a large number of meta-training tasks, which may not be accessible in real-world scenarios.
By meta-learning with task interpolation (MLTI), our approach effectively generates additional tasks by randomly sampling a pair of tasks and interpolating the corresponding features and labels.
Empirically, in our experiments on eight datasets from diverse domains, we find that the proposed general MLTI framework is compatible with representative meta-learning algorithms and consistently outperforms other state-of-the-art strategies.
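The MLTI summary above describes synthesizing new tasks by interpolating the features and labels of two sampled tasks. A mixup-style sketch of that step, assuming paired one-hot labels and a fixed mixing coefficient (the paper's exact sampling scheme is not specified here):

```python
def interpolate_tasks(task_a, task_b, alpha=0.5):
    # Blend corresponding (feature, label) pairs from two sampled tasks
    # to synthesize a new training task. In practice the coefficient is
    # often drawn from a Beta distribution rather than fixed.
    lam = alpha
    mixed = []
    for (xa, ya), (xb, yb) in zip(task_a, task_b):
        x = [lam * a + (1 - lam) * b for a, b in zip(xa, xb)]
        y = [lam * a + (1 - lam) * b for a, b in zip(ya, yb)]
        mixed.append((x, y))
    return mixed
```

Each synthesized task lies between the two source tasks in both feature and label space, densifying the task distribution the meta-learner trains on.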
arXiv Detail & Related papers (2021-06-04T20:15:34Z) - Large-Scale Meta-Learning with Continual Trajectory Shifting [76.29017270864308]
We show that allowing the meta-learners to take a larger number of inner gradient steps better captures the structure of heterogeneous and large-scale tasks.
In order to increase the frequency of meta-updates, we propose to estimate the required shift of the task-specific parameters.
We show that the algorithm largely outperforms the previous first-order meta-learning methods in terms of both generalization performance and convergence.
arXiv Detail & Related papers (2021-02-14T18:36:33Z) - A Primal-Dual Subgradient Approach for Fair Meta Learning [23.65344558042896]
Few-shot meta-learning is well known for its fast adaptation and accurate generalization to unseen tasks.
We propose a Primal-Dual Fair Meta-learning framework, namely PDFM, which learns to train fair machine learning models using only a few examples.
arXiv Detail & Related papers (2020-09-26T19:47:38Z) - Improving Generalization in Meta-learning via Task Augmentation [69.83677015207527]
We propose two task augmentation methods, MetaMix and Channel Shuffle.
Both MetaMix and Channel Shuffle outperform state-of-the-art results by a large margin across many datasets.
arXiv Detail & Related papers (2020-07-26T01:50:42Z) - Regularizing Meta-Learning via Gradient Dropout [102.29924160341572]
Meta-learning models are prone to overfitting when there are insufficient training tasks for the meta-learners to generalize.
We introduce a simple yet effective method to alleviate the risk of overfitting for gradient-based meta-learning.
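The summary above describes regularizing gradient-based meta-learning by applying dropout to gradients. A simplified stand-in for that idea, assuming element-wise zeroing of the inner-loop gradient (the function name and details are illustrative, not the paper's exact scheme):

```python
import random

def dropout_gradient(grad, p=0.2, rng=None):
    # Randomly zero a fraction p of the inner-loop gradient entries,
    # injecting noise into task adaptation so the meta-learner cannot
    # overfit to the exact gradients of a small task set.
    rng = rng or random.Random(0)
    return [0.0 if rng.random() < p else g for g in grad]
```

With `p=0` the update is unchanged, so the regularizer can be annealed or disabled at meta-test time.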
arXiv Detail & Related papers (2020-04-13T10:47:02Z) - Incremental Meta-Learning via Indirect Discriminant Alignment [118.61152684795178]
We develop a notion of incremental learning during the meta-training phase of meta-learning.
Our approach performs favorably at test time as compared to training a model with the full meta-training set.
arXiv Detail & Related papers (2020-02-11T01:39:12Z)
This list is automatically generated from the titles and abstracts of the papers in this site.