Faster Meta Update Strategy for Noise-Robust Deep Learning
- URL: http://arxiv.org/abs/2104.15092v1
- Date: Fri, 30 Apr 2021 16:19:07 GMT
- Title: Faster Meta Update Strategy for Noise-Robust Deep Learning
- Authors: Youjiang Xu, Linchao Zhu, Lu Jiang, Yi Yang
- Abstract summary: We introduce a novel Faster Meta Update Strategy (FaMUS) to replace the most expensive step in the meta gradient with a faster layer-wise approximation.
We show our method is able to save two-thirds of the training time while maintaining comparable or even better generalization performance.
- Score: 62.08964100618873
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: It has been shown that deep neural networks are prone to overfitting on
biased training data. Towards addressing this issue, meta-learning employs a
meta model for correcting the training bias. Despite their promising
performance, extremely slow training is currently the bottleneck of meta-learning
approaches. In this paper, we introduce a novel Faster Meta Update
Strategy (FaMUS) to replace the most expensive step in the meta gradient
computation with a faster layer-wise approximation. We empirically find that
FaMUS yields not only a reasonably accurate but also a low-variance
approximation of the meta gradient. We conduct extensive experiments to verify
the proposed method on two tasks. We show our method is able to save two-thirds
of the training time while maintaining comparable or even better generalization
performance. In particular, our method achieves the
state-of-the-art performance on both synthetic and realistic noisy labels, and
obtains promising performance on long-tailed recognition on standard
benchmarks.
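To make the layer-wise idea above concrete, below is a minimal, hypothetical PyTorch sketch of one-step meta re-weighting in which the meta gradient for the per-example weights is propagated only through the last layer's virtual update, standing in for the expensive full meta-gradient step. The toy model, data sizes, and the specific "last layer only" shortcut are illustrative assumptions, not the authors' implementation.

```python
import torch

torch.manual_seed(0)

# Toy data: a noisy training batch and a small clean meta (validation) batch.
x_train = torch.randn(32, 10)
y_train = torch.randint(0, 2, (32,))
x_meta = torch.randn(8, 10)
y_meta = torch.randint(0, 2, (8,))

# Two-layer model kept as raw tensors so we can form "virtual" updated weights.
w1 = torch.randn(10, 16, requires_grad=True)
w2 = torch.randn(16, 2, requires_grad=True)
lr = 0.1

def forward(x, w1_, w2_):
    return torch.relu(x @ w1_) @ w2_

# Per-example weights whose meta gradient indicates which samples help.
eps = torch.zeros(32, requires_grad=True)

# Inner step: weighted training loss for a one-step "virtual" update.
per_example = torch.nn.functional.cross_entropy(
    forward(x_train, w1, w2), y_train, reduction="none")
inner_loss = (eps * per_example).mean()

# Layer-wise shortcut: only the last layer's gradient is needed for the
# virtual update, so the expensive terms for earlier layers are never built.
g2 = torch.autograd.grad(inner_loss, w2, create_graph=True)[0]
w1_virtual = w1              # earlier layer left untouched
w2_virtual = w2 - lr * g2    # one virtual SGD step on the last layer only

# Meta loss on the clean batch, evaluated with the virtual parameters.
meta_loss = torch.nn.functional.cross_entropy(
    forward(x_meta, w1_virtual, w2_virtual), y_meta)

# Meta gradient w.r.t. the per-example weights, flowing only through g2.
eps_grad = torch.autograd.grad(meta_loss, eps)[0]

# Samples whose up-weighting would lower the clean meta loss get larger weights.
sample_weights = torch.clamp(-eps_grad, min=0)
sample_weights = sample_weights / (sample_weights.sum() + 1e-8)
print(sample_weights)
```

In a deep network, skipping the virtual updates of the earlier layers is what removes most of the backward-through-backward cost; the sketch only mirrors that trade-off at toy scale.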
Related papers
- FREE: Faster and Better Data-Free Meta-Learning [77.90126669914324]
Data-Free Meta-Learning (DFML) aims to extract knowledge from a collection of pre-trained models without requiring the original data.
We introduce the Faster and Better Data-Free Meta-Learning framework, which contains: (i) a meta-generator for rapidly recovering training tasks from pre-trained models; and (ii) a meta-learner for generalizing to new unseen tasks.
arXiv Detail & Related papers (2024-05-02T03:43:19Z)
- Efficient Few-Shot Object Detection via Knowledge Inheritance [62.36414544915032]
Few-shot object detection (FSOD) aims at learning a generic detector that can adapt to unseen tasks with scarce training samples.
We present an efficient pretrain-transfer framework (PTF) baseline with no computational increment.
We also propose an adaptive length re-scaling (ALR) strategy to alleviate the vector length inconsistency between the predicted novel weights and the pretrained base weights (a toy sketch of this re-scaling step follows the related-papers list below).
arXiv Detail & Related papers (2022-03-23T06:24:31Z)
- One Step at a Time: Pros and Cons of Multi-Step Meta-Gradient Reinforcement Learning [61.662504399411695]
We introduce a novel method mixing multiple inner steps that enjoys a more accurate and robust meta-gradient signal.
When applied to the Snake game, the mixing meta-gradient algorithm can cut the variance by a factor of 3 while achieving similar or higher performance.
arXiv Detail & Related papers (2021-10-30T08:36:52Z)
- Accelerating Gradient-based Meta Learner [2.1349209400003932]
We propose various acceleration techniques to speed up meta-learning algorithms such as MAML (Model-Agnostic Meta-Learning).
We introduce a novel method of training tasks in clusters, which not only accelerates the meta-learning process but also improves model accuracy.
arXiv Detail & Related papers (2021-10-27T14:27:36Z)
- Adapting Stepsizes by Momentumized Gradients Improves Optimization and Generalization [89.66571637204012]
AdaMomentum is evaluated on vision tasks and achieves state-of-the-art results consistently on other tasks including language processing.
arXiv Detail & Related papers (2021-06-22T03:13:23Z)
- A contrastive rule for meta-learning [1.3124513975412255]
Meta-learning algorithms leverage regularities that are present in a set of tasks to speed up and improve the performance of a subsidiary learning process.
We present a gradient-based meta-learning algorithm based on equilibrium propagation.
We establish theoretical bounds on its performance and present experiments on a set of standard benchmarks and neural network architectures.
arXiv Detail & Related papers (2021-04-04T19:45:41Z)
- Fast Few-Shot Classification by Few-Iteration Meta-Learning [173.32497326674775]
We introduce a fast optimization-based meta-learning method for few-shot classification.
Our strategy enables important aspects of the base learner objective to be learned during meta-training.
We perform a comprehensive experimental analysis, demonstrating the speed and effectiveness of our approach.
arXiv Detail & Related papers (2020-10-01T15:59:31Z)
- Gradient-EM Bayesian Meta-learning [6.726255259929496]
The key idea behind Bayesian meta-learning is empirical Bayes inference of a hierarchical model.
In this work, we extend this framework to include a variety of existing methods, before proposing our variant based on gradient-EM algorithm.
Experiments on sinusoidal regression, few-shot image classification, and policy-based reinforcement learning show that our method not only achieves better accuracy with less computation cost, but is also more robust to uncertainty.
arXiv Detail & Related papers (2020-06-21T10:52:59Z)
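The adaptive length re-scaling idea mentioned in the few-shot detection entry above can be illustrated with a tiny, hypothetical sketch: predicted novel-class classifier weights are rescaled so their vector lengths match the average length of the pretrained base-class weights. The tensor shapes and variable names are assumptions for illustration, not the paper's implementation.

```python
import torch

# Hypothetical shapes: 60 pretrained base-class rows, 5 predicted novel rows.
base_weights = torch.randn(60, 512)
novel_weights = torch.randn(5, 512)

# Rescale each predicted novel weight vector to the base weights' mean length,
# removing the length inconsistency between the two sets of classifier weights.
target_norm = base_weights.norm(dim=1).mean()
novel_rescaled = novel_weights * (target_norm / novel_weights.norm(dim=1, keepdim=True))

print(novel_rescaled.norm(dim=1))  # every row now has the base weights' mean norm
```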