Faster Meta Update Strategy for Noise-Robust Deep Learning
- URL: http://arxiv.org/abs/2104.15092v1
- Date: Fri, 30 Apr 2021 16:19:07 GMT
- Title: Faster Meta Update Strategy for Noise-Robust Deep Learning
- Authors: Youjiang Xu, Linchao Zhu, Lu Jiang, Yi Yang
- Abstract summary: We introduce a novel Faster Meta Update Strategy (FaMUS) to replace the most expensive step in the meta gradient with a faster layer-wise approximation.
We show our method is able to save two-thirds of the training time while maintaining comparable or even better generalization performance.
- Score: 62.08964100618873
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: It has been shown that deep neural networks are prone to overfitting on
biased training data. Towards addressing this issue, meta-learning employs a
meta model for correcting the training bias. Despite their promising
performance, extremely slow training is currently the bottleneck of meta-learning
approaches. In this paper, we introduce a novel Faster Meta Update
Strategy (FaMUS) to replace the most expensive step in the meta gradient
computation with a faster layer-wise approximation. We empirically find that
FaMUS yields not only a reasonably accurate but also a low-variance
approximation of the meta gradient. We conduct extensive experiments to verify
the proposed method on two tasks. We show our method is able to save two-thirds
of the training time while maintaining comparable or even better generalization
performance. In particular, our method achieves the
state-of-the-art performance on both synthetic and realistic noisy labels, and
obtains promising performance on long-tailed recognition on standard
benchmarks.
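To make the layer-wise idea above concrete, below is a minimal, hypothetical PyTorch sketch of one-step meta re-weighting in which the meta gradient for the per-example weights is propagated only through the last layer's virtual update, standing in for the expensive full meta-gradient step. The toy model, data sizes, and the specific "last layer only" shortcut are illustrative assumptions, not the authors' implementation.

```python
import torch

torch.manual_seed(0)

# Toy data: a noisy training batch and a small clean meta (validation) batch.
x_train = torch.randn(32, 10)
y_train = torch.randint(0, 2, (32,))
x_meta = torch.randn(8, 10)
y_meta = torch.randint(0, 2, (8,))

# Two-layer model kept as raw tensors so we can form "virtual" updated weights.
w1 = torch.randn(10, 16, requires_grad=True)
w2 = torch.randn(16, 2, requires_grad=True)
lr = 0.1

def forward(x, w1_, w2_):
    return torch.relu(x @ w1_) @ w2_

# Per-example weights whose meta gradient indicates which samples help.
eps = torch.zeros(32, requires_grad=True)

# Inner step: weighted training loss for a one-step "virtual" update.
per_example = torch.nn.functional.cross_entropy(
    forward(x_train, w1, w2), y_train, reduction="none")
inner_loss = (eps * per_example).mean()

# Layer-wise shortcut: only the last layer's gradient is needed for the
# virtual update, so the expensive terms for earlier layers are never built.
g2 = torch.autograd.grad(inner_loss, w2, create_graph=True)[0]
w1_virtual = w1              # earlier layer left untouched
w2_virtual = w2 - lr * g2    # one virtual SGD step on the last layer only

# Meta loss on the clean batch, evaluated with the virtual parameters.
meta_loss = torch.nn.functional.cross_entropy(
    forward(x_meta, w1_virtual, w2_virtual), y_meta)

# Meta gradient w.r.t. the per-example weights, flowing only through g2.
eps_grad = torch.autograd.grad(meta_loss, eps)[0]

# Samples whose up-weighting would lower the clean meta loss get larger weights.
sample_weights = torch.clamp(-eps_grad, min=0)
sample_weights = sample_weights / (sample_weights.sum() + 1e-8)
print(sample_weights)
```

In a deep network, skipping the virtual updates of the earlier layers is what removes most of the backward-through-backward cost; the sketch only mirrors that trade-off at toy scale.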
Related papers
- FREE: Faster and Better Data-Free Meta-Learning [77.90126669914324]
Data-Free Meta-Learning (DFML) aims to extract knowledge from a collection of pre-trained models without requiring the original data.
We introduce the Faster and Better Data-Free Meta-Learning framework, which contains: (i) a meta-generator for rapidly recovering training tasks from pre-trained models; and (ii) a meta-learner for generalizing to new unseen tasks.
arXiv Detail & Related papers (2024-05-02T03:43:19Z)
- Efficient Few-Shot Object Detection via Knowledge Inheritance [62.36414544915032]
Few-shot object detection (FSOD) aims at learning a generic detector that can adapt to unseen tasks with scarce training samples.
We present an efficient pretrain-transfer framework (PTF) baseline with no computational increment.
We also propose an adaptive length re-scaling (ALR) strategy to alleviate the vector length inconsistency between the predicted novel weights and the pretrained base weights (a toy sketch of this re-scaling step follows the related-papers list below).
arXiv Detail & Related papers (2022-03-23T06:24:31Z)
- One Step at a Time: Pros and Cons of Multi-Step Meta-Gradient Reinforcement Learning [61.662504399411695]
We introduce a novel method mixing multiple inner steps that enjoys a more accurate and robust meta-gradient signal.
When applied to the Snake game, the mixing meta-gradient algorithm can cut the variance by a factor of 3 while achieving similar or higher performance.
arXiv Detail & Related papers (2021-10-30T08:36:52Z)
- Accelerating Gradient-based Meta Learner [2.1349209400003932]
We propose various acceleration techniques to speed up meta-learning algorithms such as MAML (Model-Agnostic Meta-Learning).
We introduce a novel method of training tasks in clusters, which not only accelerates the meta-learning process but also improves model accuracy.
arXiv Detail & Related papers (2021-10-27T14:27:36Z)
- Adapting Stepsizes by Momentumized Gradients Improves Optimization and Generalization [89.66571637204012]
AdaMomentum is evaluated on vision tasks and achieves state-of-the-art results consistently on other tasks including language processing.
arXiv Detail & Related papers (2021-06-22T03:13:23Z)
- A contrastive rule for meta-learning [1.3124513975412255]
Meta-learning algorithms leverage regularities that are present in a set of tasks to speed up and improve the performance of a subsidiary learning process.
We present a gradient-based meta-learning algorithm based on equilibrium propagation.
We establish theoretical bounds on its performance and present experiments on a set of standard benchmarks and neural network architectures.
arXiv Detail & Related papers (2021-04-04T19:45:41Z)
- Fast Few-Shot Classification by Few-Iteration Meta-Learning [173.32497326674775]
We introduce a fast optimization-based meta-learning method for few-shot classification.
Our strategy enables important aspects of the base learner objective to be learned during meta-training.
We perform a comprehensive experimental analysis, demonstrating the speed and effectiveness of our approach.
arXiv Detail & Related papers (2020-10-01T15:59:31Z)
- Gradient-EM Bayesian Meta-learning [6.726255259929496]
The key idea behind Bayesian meta-learning is empirical Bayes inference of a hierarchical model.
In this work, we extend this framework to include a variety of existing methods, before proposing our variant based on gradient-EM algorithm.
Experiments on sinusoidal regression, few-shot image classification, and policy-based reinforcement learning show that our method not only achieves better accuracy with less computation cost, but is also more robust to uncertainty.
arXiv Detail & Related papers (2020-06-21T10:52:59Z)
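The adaptive length re-scaling idea mentioned in the few-shot detection entry above can be illustrated with a tiny, hypothetical sketch: predicted novel-class classifier weights are rescaled so their vector lengths match the average length of the pretrained base-class weights. The tensor shapes and variable names are assumptions for illustration, not the paper's implementation.

```python
import torch

# Hypothetical shapes: 60 pretrained base-class rows, 5 predicted novel rows.
base_weights = torch.randn(60, 512)
novel_weights = torch.randn(5, 512)

# Rescale each predicted novel weight vector to the base weights' mean length,
# removing the length inconsistency between the two sets of classifier weights.
target_norm = base_weights.norm(dim=1).mean()
novel_rescaled = novel_weights * (target_norm / novel_weights.norm(dim=1, keepdim=True))

print(novel_rescaled.norm(dim=1))  # every row now has the base weights' mean norm
```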