Contextual Gradient Scaling for Few-Shot Learning
- URL: http://arxiv.org/abs/2110.10353v1
- Date: Wed, 20 Oct 2021 03:05:58 GMT
- Title: Contextual Gradient Scaling for Few-Shot Learning
- Authors: Sanghyuk Lee, Seunghyun Lee, Byung Cheol Song
- Abstract summary: We propose contextual gradient scaling (CxGrad) for model-agnostic meta-learning (MAML).
CxGrad scales gradient norms of the backbone to facilitate learning task-specific knowledge in the inner-loop.
Experimental results show that CxGrad effectively encourages the backbone to learn task-specific knowledge in the inner-loop.
- Score: 24.19934081878197
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Model-agnostic meta-learning (MAML) is a well-known optimization-based
meta-learning algorithm that works well in various computer vision tasks, e.g.,
few-shot classification. MAML learns an initialization from which a model can
adapt to a new task in a few gradient steps. However, since the gradient norm of the
classifier (head) is much larger than those of the backbone layers, the model
focuses on learning the decision boundary of the classifier while the
representations remain similar. Furthermore, gradient norms of high-level layers are
smaller than those of the other layers. So, the backbone of MAML usually learns
task-generic features, which results in deteriorated adaptation performance in
the inner-loop. To resolve or mitigate this problem, we propose contextual
gradient scaling (CxGrad), which scales gradient norms of the backbone to
facilitate learning task-specific knowledge in the inner-loop. Since the
scaling factors are generated from task-conditioned parameters, gradient norms
of the backbone can be scaled in a task-wise fashion. Experimental results show
that CxGrad effectively encourages the backbone to learn task-specific
knowledge in the inner-loop and improves the performance of MAML by a
significant margin in both same- and cross-domain few-shot classification.
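As an illustration of the mechanism described above, here is a minimal, hypothetical sketch of a single MAML inner-loop step in which the backbone gradients are rescaled by factors produced from a task-conditioned input. The toy two-layer backbone, the small scale-generating network, and the use of the mean support feature as the task embedding are assumptions made for this example, not the paper's exact design.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)

# Functional backbone (two weight matrices) and a linear head, kept as explicit
# tensors so the adapted parameters stay differentiable w.r.t. the initialization,
# as in MAML.
backbone = [torch.randn(16, 8, requires_grad=True),    # layer-1 weight
            torch.randn(16, 16, requires_grad=True)]   # layer-2 weight
head = [torch.randn(5, 16, requires_grad=True)]        # 5-way classifier

# Small network mapping a task embedding to one positive scale per backbone layer
# (an assumed form of the task-conditioned scaling-factor generator).
scale_net = torch.nn.Sequential(torch.nn.Linear(16, 32), torch.nn.ReLU(),
                                torch.nn.Linear(32, len(backbone)))

def forward(backbone, head, x):
    h = F.relu(x @ backbone[0].t())
    h = F.relu(h @ backbone[1].t())
    return h, h @ head[0].t()          # features, logits

def inner_step(backbone, head, x, y, lr=0.01):
    feats, logits = forward(backbone, head, x)
    loss = F.cross_entropy(logits, y)
    grads = torch.autograd.grad(loss, backbone + head, create_graph=True)
    g_bb, g_head = grads[:len(backbone)], grads[len(backbone):]

    # Task-wise scales generated from the mean support feature (assumed embedding).
    scales = F.softplus(scale_net(feats.mean(dim=0)))

    # Backbone gradients are rescaled per layer; the head takes a plain SGD step.
    new_backbone = [p - lr * s * g for p, s, g in zip(backbone, scales, g_bb)]
    new_head = [p - lr * g for p, g in zip(head, g_head)]
    return new_backbone, new_head

# One adaptation step on a fake 5-way, 1-shot support set.
support_x, support_y = torch.randn(5, 8), torch.arange(5)
adapted_backbone, adapted_head = inner_step(backbone, head, support_x, support_y)
```

In full meta-training, scale_net would be meta-learned in the outer loop together with the initialization, which is why the adaptation step keeps the computation graph (create_graph=True).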
Related papers
- Classifier-guided Gradient Modulation for Enhanced Multimodal Learning [50.7008456698935]
Classifier-guided Gradient Modulation (CGGM) is a novel method to balance multimodal learning with gradients.
We conduct extensive experiments on four multimodal datasets: UPMC-Food 101, CMU-MOSI, IEMOCAP and BraTS.
CGGM outperforms all the baselines and other state-of-the-art methods consistently.
arXiv Detail & Related papers (2024-11-03T02:38:43Z) - Masked Image Modeling with Local Multi-Scale Reconstruction [54.91442074100597]
Masked Image Modeling (MIM) achieves outstanding success in self-supervised representation learning.
Existing MIM models conduct the reconstruction task only at the top layer of the encoder.
We design local multi-scale reconstruction, where the lower and upper layers reconstruct fine-scale and coarse-scale supervision signals respectively.
arXiv Detail & Related papers (2023-03-09T13:42:04Z) - Scaling Forward Gradient With Local Losses [117.22685584919756]
Forward learning is a biologically plausible alternative to backprop for learning deep neural networks.
We show that it is possible to substantially reduce the variance of the forward gradient by applying perturbations to activations rather than weights.
Our approach matches backprop on MNIST and CIFAR-10 and significantly outperforms previously proposed backprop-free algorithms on ImageNet.
arXiv Detail & Related papers (2022-10-07T03:52:27Z) - HyperMAML: Few-Shot Adaptation of Deep Models with Hypernetworks [0.0]
Few-Shot learning aims to train models which can easily adapt to previously unseen tasks.
Model-Agnostic Meta-Learning (MAML) is one of the most popular Few-Shot learning approaches.
In this paper, we propose HyperMAML, where the training of the update procedure is also part of the model.
arXiv Detail & Related papers (2022-05-31T12:31:21Z) - Model-Agnostic Multitask Fine-tuning for Few-shot Vision-Language
Transfer Learning [59.38343286807997]
We propose Model-Agnostic Multitask Fine-tuning (MAMF) for vision-language models on unseen tasks.
Compared with model-agnostic meta-learning (MAML), MAMF discards the bi-level optimization and uses only first-order gradients.
We show that MAMF consistently outperforms the classical fine-tuning method for few-shot transfer learning on five benchmark datasets.
arXiv Detail & Related papers (2022-03-09T17:26:53Z) - Continuous-Time Meta-Learning with Forward Mode Differentiation [65.26189016950343]
We introduce Continuous-Time Meta-Learning (COMLN), a meta-learning algorithm where adaptation follows the dynamics of a gradient vector field.
Treating the learning process as an ODE offers the notable advantage that the length of the trajectory is now continuous.
We show empirically its efficiency in terms of runtime and memory usage, and we illustrate its effectiveness on a range of few-shot image classification problems.
arXiv Detail & Related papers (2022-03-02T22:35:58Z) - B-SMALL: A Bayesian Neural Network approach to Sparse Model-Agnostic
Meta-Learning [2.9189409618561966]
We propose a Bayesian neural network based MAML algorithm, which we refer to as the B-SMALL algorithm.
We demonstrate the performance of B-SMALL using classification and regression tasks, and highlight that training a sparsifying BNN using MAML indeed improves the parameter footprint of the model.
arXiv Detail & Related papers (2021-01-01T09:19:48Z) - How Does the Task Landscape Affect MAML Performance? [42.27488241647739]
We show that Model-Agnostic Meta-Learning (MAML) is more difficult to optimize than non-adaptive learning (NAL).
We analytically address this issue in a linear regression setting consisting of a mixture of easy and hard tasks.
We also give numerical and analytical results suggesting that these insights apply to two-layer neural networks.
arXiv Detail & Related papers (2020-10-27T23:54:44Z) - Regularizing Meta-Learning via Gradient Dropout [102.29924160341572]
Meta-learning models are prone to overfitting when there are not enough training tasks for the meta-learner to generalize.
We introduce a simple yet effective method, dropout applied to inner-loop gradients, to alleviate the risk of overfitting for gradient-based meta-learning; a minimal sketch follows this list.
arXiv Detail & Related papers (2020-04-13T10:47:02Z)
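Since the entry above only names the mechanism, the following is a minimal, hypothetical sketch of dropout applied to inner-loop gradients; the Bernoulli mask and drop rate are illustrative assumptions, not the paper's exact scheme.

```python
import torch

def dropped_inner_step(params, grads, lr=0.01, drop_rate=0.2):
    """One inner-loop update with per-entry dropout applied to the gradients."""
    adapted = []
    for p, g in zip(params, grads):
        mask = (torch.rand_like(g) > drop_rate).float()  # keep each entry w.p. 1 - drop_rate
        adapted.append(p - lr * mask * g)
    return adapted

# Tiny usage example on a single linear layer.
w = torch.randn(4, 3, requires_grad=True)
x, y = torch.randn(8, 3), torch.randn(8, 4)
loss = ((x @ w.t() - y) ** 2).mean()
(g,) = torch.autograd.grad(loss, [w], create_graph=True)
(w_adapted,) = dropped_inner_step([w], [g])
```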
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.