Interleaving Learning, with Application to Neural Architecture Search
- URL: http://arxiv.org/abs/2103.07018v1
- Date: Fri, 12 Mar 2021 00:54:22 GMT
- Title: Interleaving Learning, with Application to Neural Architecture Search
- Authors: Hao Ban, Pengtao Xie
- Abstract summary: We propose a novel machine learning framework referred to as interleaving learning (IL).
In our framework, a set of models collaboratively learn a data encoder in an interleaving fashion.
We apply interleaving learning to search neural architectures for image classification on CIFAR-10, CIFAR-100, and ImageNet.
- Score: 12.317568257671427
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Interleaving learning is a human learning technique where a learner
interleaves the studies of multiple topics, which increases long-term retention
and improves ability to transfer learned knowledge. Inspired by the
interleaving learning technique of humans, in this paper we explore whether
this learning methodology is beneficial for improving the performance of
machine learning models as well. We propose a novel machine learning framework
referred to as interleaving learning (IL). In our framework, a set of models
collaboratively learn a data encoder in an interleaving fashion: the encoder is
trained by model 1 for a while, then passed to model 2 for further training,
then model 3, and so on; after being trained by all models, the encoder returns
to model 1 and is trained again, then moves on to models 2, 3, etc. This process
repeats for multiple rounds. Our framework is based on multi-level optimization
consisting of multiple inter-connected learning stages. An efficient
gradient-based algorithm is developed to solve the multi-level optimization
problem. We apply interleaving learning to search neural architectures for
image classification on CIFAR-10, CIFAR-100, and ImageNet. The effectiveness of
our method is strongly demonstrated by the experimental results.
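The round-robin schedule described in the abstract can be illustrated with a minimal sketch. It is not the paper's multi-level optimization or its gradient-based algorithm: it uses plain sequential gradient descent on a toy problem, with a scalar "encoder" shared by all learners and one scalar head per learner. All names (`Task`, `train_on_task`, `interleave`), the toy data, and the hyperparameters are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class Task:
    """One learner: a scalar task-specific head plus its own (x, y) pairs."""
    name: str
    data: list
    head: float = 1.0


def loss(encoder, task):
    """Mean squared error of the prediction head * encoder * x."""
    return sum((task.head * encoder * x - y) ** 2 for x, y in task.data) / len(task.data)


def train_on_task(encoder, task, steps, lr):
    """Run gradient descent on the shared encoder and this task's head."""
    n = len(task.data)
    for _ in range(steps):
        # Residuals are computed once so both gradients use the same parameters.
        residuals = [(task.head * encoder * x - y, x) for x, y in task.data]
        grad_encoder = sum(2 * r * task.head * x for r, x in residuals) / n
        grad_head = sum(2 * r * encoder * x for r, x in residuals) / n
        encoder -= lr * grad_encoder
        task.head -= lr * grad_head
    return encoder


def interleave(encoder, tasks, rounds, steps_per_task, lr=0.05):
    """Pass the encoder from learner to learner, repeating for several rounds."""
    schedule = []
    for round_index in range(rounds):
        for task in tasks:
            encoder = train_on_task(encoder, task, steps_per_task, lr)
            schedule.append((round_index, task.name))
    return encoder, schedule


# Toy data (an assumption): each learner fits y = slope * x with its own slope.
xs = [0.5, 1.0, 1.5]
tasks = [
    Task("model_1", [(x, 2.0 * x) for x in xs]),
    Task("model_2", [(x, 3.0 * x) for x in xs]),
]
encoder, schedule = interleave(1.0, tasks, rounds=3, steps_per_task=20)
print(schedule)                  # order in which learners trained the encoder
print(loss(encoder, tasks[-1]))  # loss of the learner that trained last
```

In this sketch each learner keeps its own head between rounds while the encoder value is handed on, which mirrors the "trained by model 1 for a while, then passed to model 2" description. How the actual framework couples the stages is defined by the paper's multi-level formulation, not by this loop.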
Related papers
- Accelerating Deep Learning with Fixed Time Budget [2.190627491782159]
This paper proposes an effective technique for training arbitrary deep learning models within fixed time constraints.
The proposed method is extensively evaluated in both classification and regression tasks in computer vision.
arXiv Detail & Related papers (2024-10-03T21:18:04Z)
- EfficientTrain++: Generalized Curriculum Learning for Efficient Visual Backbone Training [79.96741042766524]
We reformulate the training curriculum as a soft-selection function.
We show that exposing the contents of natural images can be readily achieved by the intensity of data augmentation.
The resulting method, EfficientTrain++, is simple, general, yet surprisingly effective.
arXiv Detail & Related papers (2024-05-14T17:00:43Z)
- PILOT: A Pre-Trained Model-Based Continual Learning Toolbox [71.63186089279218]
This paper introduces a pre-trained model-based continual learning toolbox known as PILOT.
On the one hand, PILOT implements some state-of-the-art class-incremental learning algorithms based on pre-trained models, such as L2P, DualPrompt, and CODA-Prompt.
On the other hand, PILOT fits typical class-incremental learning algorithms within the context of pre-trained models to evaluate their effectiveness.
arXiv Detail & Related papers (2023-09-13T17:55:11Z)
- CodeGen2: Lessons for Training LLMs on Programming and Natural Languages [116.74407069443895]
We unify encoder and decoder-based models into a single prefix-LM.
For learning methods, we explore the claim of a "free lunch" hypothesis.
For data distributions, the effect of a mixture distribution and multi-epoch training of programming and natural languages on model performance is explored.
arXiv Detail & Related papers (2023-05-03T17:55:25Z)
- EfficientTrain: Exploring Generalized Curriculum Learning for Training Visual Backbones [80.662250618795]
This paper presents a new curriculum learning approach for the efficient training of visual backbones (e.g., vision Transformers).
As an off-the-shelf method, it reduces the wall-time training cost of a wide variety of popular models by >1.5x on ImageNet-1K/22K without sacrificing accuracy.
arXiv Detail & Related papers (2022-11-17T17:38:55Z)
- Learning from Mistakes based on Class Weighting with Application to Neural Architecture Search [12.317568257671427]
We propose a simple and effective multi-level optimization framework called learning from mistakes (LFM).
The primary objective is to train a model to perform effectively on target tasks by using a re-weighting technique to prevent similar mistakes in the future.
In this formulation, we learn the class weights by minimizing the validation loss of the model and re-train the model with the synthetic data from the image generator weighted by class-wise performance and real data.
arXiv Detail & Related papers (2021-12-01T04:56:49Z)
- An Approach for Combining Multimodal Fusion and Neural Architecture Search Applied to Knowledge Tracing [6.540879944736641]
We propose a sequential model based optimization approach that combines multimodal fusion and neural architecture search within one framework.
We evaluate our methods on two public real datasets showing the discovered model is able to achieve superior performance.
arXiv Detail & Related papers (2021-11-08T13:43:46Z)
- Gone Fishing: Neural Active Learning with Fisher Embeddings [55.08537975896764]
There is an increasing need for active learning algorithms that are compatible with deep neural networks.
This article introduces BAIT, a practical, tractable, and high-performing active learning algorithm for neural networks.
arXiv Detail & Related papers (2021-06-17T17:26:31Z)
- Small-Group Learning, with Application to Neural Architecture Search [17.86826990290058]
In human learning, a small group of students work together towards the same learning objective, where they express their understanding of a topic to their peers, compare their ideas, and help each other to trouble-shoot problems.
In this paper, we aim to investigate whether this human learning method can be borrowed to train better machine learning models, by developing a novel ML framework -- small-group learning (SGL).
SGL is formulated as a multi-level optimization framework consisting of three learning stages: each learner trains a model independently and uses this model to perform pseudo-labeling; each learner trains another model using datasets pseudo-
arXiv Detail & Related papers (2020-12-23T05:56:47Z)
- Fast Few-Shot Classification by Few-Iteration Meta-Learning [173.32497326674775]
We introduce a fast optimization-based meta-learning method for few-shot classification.
Our strategy enables important aspects of the base learner objective to be learned during meta-training.
We perform a comprehensive experimental analysis, demonstrating the speed and effectiveness of our approach.
arXiv Detail & Related papers (2020-10-01T15:59:31Z)
- Learning to Rank Learning Curves [15.976034696758148]
We present a new method that saves computational budget by terminating poor configurations early on in the training.
We show that our model is able to effectively rank learning curves without having to observe many or very long learning curves.
arXiv Detail & Related papers (2020-06-05T10:49:52Z)
This list is automatically generated from the titles and abstracts of the papers on this site.