Perturbing the Gradient for Alleviating Meta Overfitting
- URL: http://arxiv.org/abs/2405.12299v2
- Date: Sun, 10 Nov 2024 13:46:36 GMT
- Title: Perturbing the Gradient for Alleviating Meta Overfitting
- Authors: Manas Gogoi, Sambhavi Tiwari, Shekhar Verma,
- Abstract summary: This paper proposes a number of solutions to tackle meta-overfitting on few-shot learning settings.
Our proposed approaches demonstrate improved generalization performance compared to state-of-the-art baselines for learning in a non-mutually exclusive task setting.
- Score: 6.6157730528755065
- License:
- Abstract: The reason for Meta Overfitting can be attributed to two factors: Mutual Non-exclusivity and the Lack of diversity, consequent to which a single global function can fit the support set data of all the meta-training tasks and fail to generalize to new unseen tasks. This issue is evidenced by low error rates on the meta-training tasks, but high error rates on new tasks. However, there can be a number of novel solutions to this problem keeping in mind any of the two objectives to be attained, i.e. to increase diversity in the tasks and to reduce the confidence of the model for some of the tasks. In light of the above, this paper proposes a number of solutions to tackle meta-overfitting on few-shot learning settings, such as few-shot sinusoid regression and few shot classification. Our proposed approaches demonstrate improved generalization performance compared to state-of-the-art baselines for learning in a non-mutually exclusive task setting. Overall, this paper aims to provide insights into tackling overfitting in meta-learning and to advance the field towards more robust and generalizable models.
Related papers
- Towards Task Sampler Learning for Meta-Learning [37.02030832662183]
Meta-learning aims to learn general knowledge with diverse training tasks conducted from limited data, and then transfer it to new tasks.
It is commonly believed that increasing task diversity will enhance the generalization ability of meta-learning models.
This paper challenges this view through empirical and theoretical analysis.
arXiv Detail & Related papers (2023-07-18T01:53:18Z) - ForkMerge: Mitigating Negative Transfer in Auxiliary-Task Learning [59.08197876733052]
Auxiliary-Task Learning (ATL) aims to improve the performance of the target task by leveraging the knowledge obtained from related tasks.
Sometimes, learning multiple tasks simultaneously results in lower accuracy than learning only the target task, known as negative transfer.
ForkMerge is a novel approach that periodically forks the model into multiple branches, automatically searches the varying task weights.
arXiv Detail & Related papers (2023-01-30T02:27:02Z) - Learning Transferable Adversarial Robust Representations via Multi-view
Consistency [57.73073964318167]
We propose a novel meta-adversarial multi-view representation learning framework with dual encoders.
We demonstrate the effectiveness of our framework on few-shot learning tasks from unseen domains.
arXiv Detail & Related papers (2022-10-19T11:48:01Z) - Learning to generate imaginary tasks for improving generalization in
meta-learning [12.635773307074022]
The success of meta-learning on existing benchmarks is predicated on the assumption that the distribution of meta-training tasks covers meta-testing tasks.
Recent solutions have pursued augmentation of meta-training tasks, while it is still an open question to generate both correct and sufficiently imaginary tasks.
In this paper, we seek an approach that up-samples meta-training tasks from the task representation via a task up-sampling network. Besides, the resulting approach named Adversarial Task Up-sampling (ATU) suffices to generate tasks that can maximally contribute to the latest meta-learner by maximizing an adversarial loss.
arXiv Detail & Related papers (2022-06-09T08:21:05Z) - Set-based Meta-Interpolation for Few-Task Meta-Learning [79.4236527774689]
We propose a novel domain-agnostic task augmentation method, Meta-Interpolation, to densify the meta-training task distribution.
We empirically validate the efficacy of Meta-Interpolation on eight datasets spanning across various domains.
arXiv Detail & Related papers (2022-05-20T06:53:03Z) - Conflict-Averse Gradient Descent for Multi-task Learning [56.379937772617]
A major challenge in optimizing a multi-task model is the conflicting gradients.
We introduce Conflict-Averse Gradient descent (CAGrad) which minimizes the average loss function.
CAGrad balances the objectives automatically and still provably converges to a minimum over the average loss.
arXiv Detail & Related papers (2021-10-26T22:03:51Z) - Meta-Learning with Task-Adaptive Loss Function for Few-Shot Learning [50.59295648948287]
In few-shot learning scenarios, the challenge is to generalize and perform well on new unseen examples.
We introduce a new meta-learning framework with a loss function that adapts to each task.
Our proposed framework, named Meta-Learning with Task-Adaptive Loss Function (MeTAL), demonstrates the effectiveness and the flexibility across various domains.
arXiv Detail & Related papers (2021-10-08T06:07:21Z) - Meta-Regularization by Enforcing Mutual-Exclusiveness [0.8057006406834467]
We propose a regularization technique for meta-learning models that gives the model designer more control over the information flow during meta-training.
Our proposed regularization function shows an accuracy boost of $sim$ $36%$ on the Omniglot dataset.
arXiv Detail & Related papers (2021-01-24T22:57:19Z) - Is Support Set Diversity Necessary for Meta-Learning? [14.231486872262531]
We propose a modification to traditional meta-learning approaches in which we keep the support sets fixed across tasks, thus reducing task diversity.
Surprisingly, we find that not only does this modification not result in adverse effects, it almost always improves the performance for a variety of datasets and meta-learning methods.
arXiv Detail & Related papers (2020-11-28T02:28:42Z) - Improving Generalization in Meta-learning via Task Augmentation [69.83677015207527]
We propose two task augmentation methods, including MetaMix and Channel Shuffle.
Both MetaMix and Channel Shuffle outperform state-of-the-art results by a large margin across many datasets.
arXiv Detail & Related papers (2020-07-26T01:50:42Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.