Rethinking the Number of Shots in Robust Model-Agnostic Meta-Learning
- URL: http://arxiv.org/abs/2211.15180v1
- Date: Mon, 28 Nov 2022 09:47:13 GMT
- Title: Rethinking the Number of Shots in Robust Model-Agnostic Meta-Learning
- Authors: Xiaoyue Duan, Guoliang Kang, Runqi Wang, Shumin Han, Song Xue, Tian
Wang, Baochang Zhang
- Abstract summary: We propose a simple strategy, i.e., increasing the number of training shots, to mitigate the loss of intrinsic dimension caused by robustness-promoting regularization.
Our method remarkably improves the clean accuracy of MAML without much loss of robustness, producing a robust yet accurate model.
- Score: 26.02974754702544
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Robust Model-Agnostic Meta-Learning (MAML) is usually adopted to train a
meta-model which can quickly adapt to novel classes with only a few exemplars
while remaining robust to adversarial attacks. The conventional solution for
robust MAML is to introduce robustness-promoting regularization during the
meta-training stage. With such a regularization, previous robust MAML methods
simply follow the typical MAML practice that the number of training shots
should match the number of test shots to achieve optimal adaptation
performance. However, although the robustness can be largely improved, previous
methods sacrifice clean accuracy substantially. In this paper, we observe that
introducing robustness-promoting regularization into MAML reduces the intrinsic
dimension of clean sample features, which results in a lower capacity of clean
representations. This may explain why the clean accuracy of previous robust
MAML methods drops severely. Based on this observation, we propose a simple
strategy, i.e., increasing the number of training shots, to mitigate the loss
of intrinsic dimension caused by robustness-promoting regularization. Though
simple, our method remarkably improves the clean accuracy of MAML without much
loss of robustness, producing a robust yet accurate model. Extensive
experiments demonstrate that our method outperforms prior art in achieving a
better trade-off between accuracy and robustness. Besides, we observe that our
method is less sensitive to the number of fine-tuning steps during
meta-training, which allows for a reduced number of fine-tuning steps to
improve training efficiency.
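The "number of training shots" the abstract refers to is the size K of the support set used for inner-loop adaptation in MAML. As a rough illustration of where that knob sits, here is a minimal first-order MAML sketch on toy linear-regression tasks; the task family, model, and hyperparameters are illustrative assumptions, not the paper's setup, and the robustness-promoting regularization is omitted.

```python
import numpy as np

rng = np.random.default_rng(0)

def make_task(k_shots, q_shots=10):
    """Sample a toy task y = a*x + b, returning a K-shot support set
    and a query set drawn from the same task (illustrative, not the paper's benchmark)."""
    a = rng.uniform(-1.0, 1.0)
    b = rng.uniform(2.0, 4.0)
    def sample(n):
        x = rng.uniform(-5.0, 5.0, size=n)
        return x, a * x + b
    return sample(k_shots), sample(q_shots)

def loss_grad(w, x, y):
    """MSE loss and gradient for the linear model y_hat = w[0]*x + w[1]."""
    err = w[0] * x + w[1] - y
    loss = np.mean(err ** 2)
    grad = np.array([np.mean(2.0 * err * x), np.mean(2.0 * err)])
    return loss, grad

def maml_train(k_shots, meta_steps=300, inner_lr=0.01, outer_lr=0.05,
               tasks_per_batch=4):
    """First-order MAML: one inner gradient step on the K support shots,
    then a meta-update with the post-adaptation query-set gradient
    (the first-order approximation ignores d(w_adapted)/dw)."""
    w = np.zeros(2)  # meta-parameters
    for _ in range(meta_steps):
        meta_grad = np.zeros_like(w)
        for _ in range(tasks_per_batch):
            (x_s, y_s), (x_q, y_q) = make_task(k_shots)
            _, g_s = loss_grad(w, x_s, y_s)
            w_adapted = w - inner_lr * g_s           # inner-loop fine-tuning on K shots
            _, g_q = loss_grad(w_adapted, x_q, y_q)  # query loss after adaptation
            meta_grad += g_q
        w -= outer_lr * meta_grad / tasks_per_batch  # outer (meta) update
    return w

meta_w = maml_train(k_shots=5)  # k_shots is the knob the paper proposes increasing
```

In this sketch `k_shots` only changes how many support examples the inner step sees; the paper's argument is that, once robustness-promoting regularization shrinks the intrinsic dimension of clean features, raising K at meta-training time compensates, even when the test-time shot count stays small.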
Related papers
- What Do Learning Dynamics Reveal About Generalization in LLM Reasoning? [83.83230167222852]
We find that a model's generalization behavior can be effectively characterized by a training metric we call pre-memorization train accuracy.
By connecting a model's learning behavior to its generalization, pre-memorization train accuracy can guide targeted improvements to training strategies.
arXiv Detail & Related papers (2024-11-12T09:52:40Z)
- PUMA: margin-based data pruning [51.12154122266251]
We focus on data pruning, where some training samples are removed based on their distance to the model's classification boundary (i.e., the margin).
We propose PUMA, a new data pruning strategy that computes the margin using DeepFool.
We show that PUMA can be used on top of the current state-of-the-art methodology in robustness, and it is able to significantly improve the model performance unlike the existing data pruning strategies.
arXiv Detail & Related papers (2024-05-10T08:02:20Z)
- Zero-Shot Sharpness-Aware Quantization for Pre-trained Language Models [88.80146574509195]
Quantization is a promising approach for reducing memory overhead and accelerating inference.
We propose a Zero-Shot Sharpness-Aware Quantization (ZSAQ) framework for the zero-shot quantization of various PLMs.
arXiv Detail & Related papers (2023-10-20T07:09:56Z)
- Model-Agnostic Multitask Fine-tuning for Few-shot Vision-Language Transfer Learning [59.38343286807997]
We propose Model-Agnostic Multitask Fine-tuning (MAMF) for vision-language models on unseen tasks.
Compared with model-agnostic meta-learning (MAML), MAMF discards the bi-level optimization and uses only first-order gradients.
We show that MAMF consistently outperforms the classical fine-tuning method for few-shot transfer learning on five benchmark datasets.
arXiv Detail & Related papers (2022-03-09T17:26:53Z)
- Repurposing Pretrained Models for Robust Out-of-domain Few-Shot Learning [23.135033752967598]
We consider the novel problem of repurposing pretrained MAML checkpoints to solve new few-shot classification tasks.
Because of the potential distribution mismatch, the original MAML steps may no longer be optimal.
We propose an alternative meta-testing procedure that combines adversarial training with uncertainty-based step-size adaptation.
arXiv Detail & Related papers (2021-03-16T12:53:09Z)
- Robust MAML: Prioritization task buffer with adaptive learning process for model-agnostic meta-learning [15.894925018423665]
Model agnostic meta-learning (MAML) is a popular state-of-the-art meta-learning algorithm.
This paper proposes a more robust MAML based on an adaptive learning scheme and a prioritization task buffer.
Experimental results on meta reinforcement learning environments demonstrate a substantial performance gain.
arXiv Detail & Related papers (2021-03-15T09:34:34Z)
- On Fast Adversarial Robustness Adaptation in Model-Agnostic Meta-Learning [100.14809391594109]
Model-agnostic meta-learning (MAML) has emerged as one of the most successful meta-learning techniques in few-shot learning.
Despite the generalization power of the meta-model, it remains elusive how adversarial robustness can be maintained by MAML in few-shot learning.
We propose a general but easily-optimized robustness-regularized meta-learning framework, which allows the use of unlabeled data augmentation, fast adversarial attack generation, and computationally-light fine-tuning.
arXiv Detail & Related papers (2021-02-20T22:03:04Z)
- Meta-Learning with Adaptive Hyperparameters [55.182841228303225]
We focus on a complementary factor in the MAML framework: inner-loop optimization (or fast adaptation).
We propose a new weight update rule that greatly enhances the fast adaptation process.
arXiv Detail & Related papers (2020-10-31T08:05:34Z)
- How Does the Task Landscape Affect MAML Performance? [42.27488241647739]
We show that Model-Agnostic Meta-Learning (MAML) is more difficult to optimize than non-adaptive learning (NAL).
We analytically address this issue in a linear regression setting consisting of a mixture of easy and hard tasks.
We also give numerical and analytical results suggesting that these insights apply to two-layer neural networks.
arXiv Detail & Related papers (2020-10-27T23:54:44Z)
This list is automatically generated from the titles and abstracts of the papers in this site.