Is the Meta-Learning Idea Able to Improve the Generalization of Deep
Neural Networks on the Standard Supervised Learning?
- URL: http://arxiv.org/abs/2002.12455v1
- Date: Thu, 27 Feb 2020 21:29:54 GMT
- Title: Is the Meta-Learning Idea Able to Improve the Generalization of Deep
Neural Networks on the Standard Supervised Learning?
- Authors: Xiang Deng and Zhongfei Zhang
- Abstract summary: We propose a novel meta-learning based training procedure (MLTP) for deep neural networks (DNNs).
MLTP simulates the meta-training process by considering a batch of training samples as a task.
The experimental results show consistently improved generalization performance on DNNs of different sizes.
- Score: 34.00378876525579
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Substantial efforts have been made to improve the generalization abilities
of deep neural networks (DNNs) in order to obtain better performance without
introducing more parameters. On the other hand, meta-learning approaches
exhibit powerful generalization on new tasks in few-shot learning. Intuitively,
few-shot learning is more challenging than the standard supervised learning, as
each target class has very few or no training samples. The natural
question that arises is whether the meta-learning idea can be used for
improving the generalization of DNNs on the standard supervised learning. In
this paper, we propose a novel meta-learning based training procedure (MLTP)
for DNNs and demonstrate that the meta-learning idea can indeed improve the
generalization abilities of DNNs. MLTP simulates the meta-training process by
considering a batch of training samples as a task. The key idea is that the
gradient descent step for improving the current task performance should also
improve a new task performance, which is ignored by the current standard
procedure for training neural networks. MLTP also benefits from all the
existing training techniques such as dropout, weight decay, and batch
normalization. We evaluate MLTP by training a variety of small and large neural
networks on three benchmark datasets, i.e., CIFAR-10, CIFAR-100, and Tiny
ImageNet. The experimental results show a consistently improved generalization
performance on all the DNNs with different sizes, which verifies the promise of
MLTP and demonstrates that the meta-learning idea is indeed able to improve the
generalization of DNNs on the standard supervised learning.
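The core mechanism can be sketched in a few lines. The following is a minimal, hypothetical PyTorch rendering of the idea described above, not the authors' released implementation: each mini-batch is treated as a task, one simulated gradient step is taken on the current batch, and the update is additionally required to lower the loss on a second batch. The function name mltp_step, the inner learning rate, and the plain sum of the two losses are illustrative assumptions; torch.func.functional_call requires PyTorch 2.x.
```python
import torch


def mltp_step(model, loss_fn, optimizer, batch_cur, batch_new, inner_lr=0.01):
    """One hypothetical MLTP-style update treating each mini-batch as a task."""
    x_cur, y_cur = batch_cur
    x_new, y_new = batch_new

    # Loss on the current "task" (the current mini-batch).
    loss_cur = loss_fn(model(x_cur), y_cur)

    # Simulate one SGD step on the current task, keeping the graph so the
    # adapted weights remain differentiable w.r.t. the original weights.
    params = dict(model.named_parameters())
    grads = torch.autograd.grad(loss_cur, tuple(params.values()), create_graph=True)
    adapted = {n: p - inner_lr * g for (n, p), g in zip(params.items(), grads)}

    # Loss of the adapted weights on a different mini-batch (the "new task").
    preds_new = torch.func.functional_call(model, adapted, (x_new,))
    loss_new = loss_fn(preds_new, y_new)

    # Optimize both: the step that helps the current task should also help the
    # new task, which plain mini-batch SGD ignores.
    optimizer.zero_grad()
    (loss_cur + loss_new).backward()
    optimizer.step()
    return loss_cur.item(), loss_new.item()
```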
Related papers
- Theoretical Characterization of the Generalization Performance of
Overfitted Meta-Learning [70.52689048213398]
This paper studies the performance of overfitted meta-learning under a linear regression model with Gaussian features.
We find new and interesting properties that do not exist in single-task linear regression.
Our analysis suggests that benign overfitting is more significant and easier to observe when the noise and the diversity/fluctuation of the ground truth of each training task are large.
arXiv Detail & Related papers (2023-04-09T20:36:13Z)
- Improving Representational Continuity via Continued Pretraining [76.29171039601948]
A method from the transfer learning community, LP-FT (linear probing then fine-tuning), outperforms naive training and other continual learning methods.
LP-FT also reduces forgetting on a real-world satellite remote sensing dataset (FMoW).
A variant of LP-FT achieves state-of-the-art accuracy on an NLP continual learning benchmark.
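As a rough illustration of the summarized recipe, a minimal LP-FT loop can be sketched as below, assuming a pretrained backbone and a fresh linear head; the helper train_one_epoch, the epoch counts, and the learning rates are assumptions, not values from the paper.
```python
import torch


def train_one_epoch(model, head, loader, loss_fn, opt):
    for x, y in loader:
        opt.zero_grad()
        loss_fn(head(model(x)), y).backward()
        opt.step()


def lp_ft(model, head, train_loader, loss_fn, probe_epochs=5, ft_epochs=20):
    # Phase 1: linear probing -- freeze the pretrained backbone, train only the head.
    for p in model.parameters():
        p.requires_grad_(False)
    probe_opt = torch.optim.SGD(head.parameters(), lr=1e-2)
    for _ in range(probe_epochs):
        train_one_epoch(model, head, train_loader, loss_fn, probe_opt)

    # Phase 2: fine-tuning -- unfreeze everything and train end-to-end with a
    # smaller learning rate, starting from the probed head.
    for p in model.parameters():
        p.requires_grad_(True)
    ft_opt = torch.optim.SGD(
        list(model.parameters()) + list(head.parameters()), lr=1e-3
    )
    for _ in range(ft_epochs):
        train_one_epoch(model, head, train_loader, loss_fn, ft_opt)
```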
arXiv Detail & Related papers (2023-02-26T10:39:38Z)
- Neural Routing in Meta Learning [9.070747377130472]
We aim to improve the model performance of the current meta learning algorithms by selectively using only parts of the model conditioned on the input tasks.
In this work, we describe an approach that investigates task-dependent dynamic neuron selection in deep convolutional neural networks (CNNs) by leveraging the scaling factor in the batch normalization layer.
We find that the proposed approach, neural routing in meta learning (NRML), outperforms one of the well-known existing meta learning baselines on few-shot classification tasks.
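The summary above describes neuron selection driven by the batch-normalization scale. One plausible, simplified reading is sketched below: channels whose learned |gamma| is small are masked out of the forward pass. This static gate only illustrates the mechanism; the actual NRML routing is conditioned on the input task, and the class name GatedBNBlock and the keep_ratio value are illustrative assumptions.
```python
import torch
import torch.nn as nn


class GatedBNBlock(nn.Module):
    def __init__(self, in_ch, out_ch, keep_ratio=0.5):
        super().__init__()
        self.conv = nn.Conv2d(in_ch, out_ch, kernel_size=3, padding=1)
        self.bn = nn.BatchNorm2d(out_ch)
        self.keep_ratio = keep_ratio

    def forward(self, x):
        h = torch.relu(self.bn(self.conv(x)))
        # Rank channels by the magnitude of the BN scale gamma and keep only the
        # top fraction; the remaining channels are routed away (zeroed).
        gamma = self.bn.weight.detach().abs()
        k = max(1, int(self.keep_ratio * gamma.numel()))
        threshold = gamma.topk(k).values.min()
        mask = (gamma >= threshold).float().view(1, -1, 1, 1)
        return h * mask
```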
arXiv Detail & Related papers (2022-10-14T16:31:24Z)
- Meta-Learning with Self-Improving Momentum Target [72.98879709228981]
We propose Self-improving Momentum Target (SiMT) to improve the performance of a meta-learner.
SiMT generates the target model by adapting from the temporal ensemble of the meta-learner.
We show that SiMT brings a significant performance gain when combined with a wide range of meta-learning methods.
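A temporal-ensemble target of this kind is commonly realized as an exponential moving average of the learner's parameters. The sketch below shows only that generic mechanism, not SiMT's full adaptation and distillation pipeline; the decay value and function name are assumptions. The target would start as a copy of the meta-learner (e.g. via copy.deepcopy) and be refreshed after every meta-update.
```python
import torch


@torch.no_grad()
def update_momentum_target(meta_learner, target, decay=0.995):
    # Target parameters track an exponential moving average (temporal ensemble)
    # of the meta-learner's parameters.
    for p_t, p_m in zip(target.parameters(), meta_learner.parameters()):
        p_t.mul_(decay).add_(p_m, alpha=1.0 - decay)
```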
arXiv Detail & Related papers (2022-10-11T06:45:15Z)
- On the Effectiveness of Fine-tuning Versus Meta-reinforcement Learning [71.55412580325743]
We show that multi-task pretraining with fine-tuning on new tasks performs equally as well, or better, than meta-pretraining with meta test-time adaptation.
This is encouraging for future research, as multi-task pretraining tends to be simpler and computationally cheaper than meta-RL.
arXiv Detail & Related papers (2022-06-07T13:24:00Z)
- What Matters For Meta-Learning Vision Regression Tasks? [19.373532562905208]
This paper makes two main contributions that help understand this barely explored area.
First, we design two new types of cross-category level vision regression tasks, namely object discovery and pose estimation.
Second, we propose the addition of functional contrastive learning (FCL) over the task representations in Conditional Neural Processes (CNPs) and train in an end-to-end fashion.
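The FCL component is contrastive learning applied to task representations. The sketch below is a generic InfoNCE-style loss over pairs of task representations and is meant only to convey the flavour, not the paper's exact formulation; z1 and z2 are assumed to be representations of the same tasks computed from two disjoint sample sets, and the temperature is an assumption.
```python
import torch
import torch.nn.functional as F


def task_contrastive_loss(z1, z2, temperature=0.1):
    # z1, z2: (num_tasks, dim) representations of the same tasks from two views.
    z1 = F.normalize(z1, dim=-1)
    z2 = F.normalize(z2, dim=-1)
    logits = z1 @ z2.t() / temperature                    # all task-pair similarities
    labels = torch.arange(z1.size(0), device=z1.device)   # positives on the diagonal
    return F.cross_entropy(logits, labels)
```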
arXiv Detail & Related papers (2022-03-09T17:28:16Z)
- S2-BNN: Bridging the Gap Between Self-Supervised Real and 1-bit Neural Networks via Guided Distribution Calibration [74.5509794733707]
We present a novel guided learning paradigm that distills binary networks from real-valued networks on the final prediction distribution.
Our proposed method can boost the simple contrastive learning baseline by an absolute gain of 5.515% on BNNs.
Our method achieves substantial improvement over the simple contrastive learning baseline, and is even comparable to many mainstream supervised BNN methods.
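Distilling on the final prediction distribution is typically implemented as a KL loss between the softened teacher and student outputs. The sketch below shows that generic form, assuming a real-valued self-supervised teacher and a binary student; it is not the authors' code, and the temperature is an assumption.
```python
import torch
import torch.nn.functional as F


def distribution_distill_loss(student_logits, teacher_logits, temperature=1.0):
    # Soften both output distributions and push the binary student toward the
    # real-valued teacher via KL divergence.
    log_p_student = F.log_softmax(student_logits / temperature, dim=-1)
    p_teacher = F.softmax(teacher_logits / temperature, dim=-1)
    return F.kl_div(log_p_student, p_teacher, reduction="batchmean") * temperature ** 2
```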
arXiv Detail & Related papers (2021-02-17T18:59:28Z)
- Generalising via Meta-Examples for Continual Learning in the Wild [24.09600678738403]
We develop a novel strategy to deal with neural networks that "learn in the wild".
We equip it with MEML - Meta-Example Meta-Learning - a new module that alleviates catastrophic forgetting.
We extend it by adopting a technique that creates various augmented tasks and optimises over the hardest.
arXiv Detail & Related papers (2021-01-28T15:51:54Z)
- Meta Learning Backpropagation And Improving It [4.061135251278187]
We show that simple weight-sharing and sparsity in an NN are sufficient to express powerful learning algorithms (LAs) in a reusable fashion.
A simple implementation of VS-ML called VS-ML RNN allows for implementing the backpropagation LA solely by running an RNN in forward-mode.
It can even meta-learn new LAs that improve upon backpropagation and generalize to datasets outside of the meta training distribution.
arXiv Detail & Related papers (2020-12-29T18:56:10Z)
- Meta-learning the Learning Trends Shared Across Tasks [123.10294801296926]
Gradient-based meta-learning algorithms excel at quick adaptation to new tasks with limited data.
Existing meta-learning approaches only depend on the current task information during the adaptation.
We propose a 'Path-aware' model-agnostic meta-learning approach.
arXiv Detail & Related papers (2020-10-19T08:06:47Z)
- Meta-Learning with Network Pruning [40.07436648243748]
We propose a network pruning based meta-learning approach for overfitting reduction via explicitly controlling the capacity of network.
We have implemented our approach on top of Reptile combined with two network pruning routines: Dense-Sparse-Dense (DSD) and Iterative Hard Thresholding (IHT).
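Of the two routines, iterative hard thresholding is the simpler to sketch: periodically keep only the largest-magnitude weights in each weight tensor and zero the rest, then continue training. The sparsity level and the rule of skipping one-dimensional parameters below are illustrative assumptions, not the paper's settings.
```python
import torch


@torch.no_grad()
def hard_threshold(model, sparsity=0.5):
    # Keep the top (1 - sparsity) fraction of weights by magnitude per tensor.
    for p in model.parameters():
        if p.dim() < 2:  # skip biases and normalization parameters
            continue
        k = max(1, int((1.0 - sparsity) * p.numel()))
        threshold = p.abs().flatten().topk(k).values.min()
        p.mul_((p.abs() >= threshold).to(p.dtype))
```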
arXiv Detail & Related papers (2020-07-07T06:13:11Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.