Efficient Architecture Search for Continual Learning
- URL: http://arxiv.org/abs/2006.04027v2
- Date: Tue, 9 Jun 2020 04:54:11 GMT
- Title: Efficient Architecture Search for Continual Learning
- Authors: Qiang Gao, Zhipeng Luo, Diego Klabjan
- Abstract summary: Continual learning with neural networks aims to learn a sequence of tasks well.
It is often confronted with three challenges: (1) overcoming the catastrophic forgetting problem, (2) adapting the current network to new tasks, and (3) controlling its model complexity.
We propose a novel approach named Continual Learning with Efficient Architecture Search, or CLEAS for short.
- Score: 36.998565674813285
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Continual learning with neural networks is an important learning framework in
AI that aims to learn a sequence of tasks well. However, it is often confronted
with three challenges: (1) overcoming the catastrophic forgetting problem, (2)
adapting the current network to new tasks, and (3) controlling its model
complexity. To reach these goals, we propose a novel approach named
Continual Learning with Efficient Architecture Search, or CLEAS for short. CLEAS
works closely with neural architecture search (NAS), which leverages
reinforcement learning techniques to search for the best neural architecture
that fits a new task. In particular, we design a neuron-level NAS controller
that decides which old neurons from previous tasks should be reused (knowledge
transfer), and which new neurons should be added (to learn new knowledge). Such
a fine-grained controller allows one to find a very concise architecture that
can fit each new task well. Meanwhile, since we do not alter the weights of the
reused neurons, we exactly preserve the knowledge learned from previous
tasks. We evaluate CLEAS on numerous sequential classification tasks, and the
results demonstrate that CLEAS outperforms other state-of-the-art alternative
methods, achieving higher classification accuracy while using simpler neural
architectures.
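To make the neuron-level reuse/expand idea more concrete, the following is a minimal, hypothetical PyTorch sketch (not the authors' code): neurons trained on earlier tasks are kept frozen, a controller-style mask decides which of them to reuse for the new task, and a few new trainable neurons are appended. All class names, sizes, and the sampling scheme are illustrative assumptions.

```python
# Minimal sketch of the neuron-level reuse/expand idea behind CLEAS.
# This is an illustration, not the authors' implementation; names and sizes are assumed.
import torch
import torch.nn as nn


class ExpandableLayer(nn.Module):
    """Linear layer whose old neurons are frozen and whose new neurons are trainable."""

    def __init__(self, in_features, old_out, new_out):
        super().__init__()
        self.old = nn.Linear(in_features, old_out)   # weights learned on earlier tasks
        self.new = nn.Linear(in_features, new_out)   # neurons added for the new task
        for p in self.old.parameters():              # freezing preserves old knowledge exactly
            p.requires_grad_(False)

    def forward(self, x, reuse_mask):
        # reuse_mask: 0/1 vector produced by the NAS controller, one entry per old neuron
        old_h = self.old(x) * reuse_mask             # keep only the reused old neurons
        new_h = self.new(x)                          # plus the newly added neurons
        return torch.cat([old_h, new_h], dim=-1)


# Toy stand-in for the controller: sample per-neuron reuse decisions; in the full
# method these decisions are rewarded with the child network's validation accuracy.
layer = ExpandableLayer(in_features=16, old_out=8, new_out=4)
reuse_probs = torch.sigmoid(torch.zeros(8))          # controller outputs, here uniform 0.5
reuse_mask = torch.bernoulli(reuse_probs)            # sampled architecture decision
out = layer(torch.randn(2, 16), reuse_mask)
print(out.shape)                                     # torch.Size([2, 12])
```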
Related papers
- Simple and Effective Transfer Learning for Neuro-Symbolic Integration [50.592338727912946]
A potential solution to this issue is Neuro-Symbolic Integration (NeSy), where neural approaches are combined with symbolic reasoning.
Most of these methods exploit a neural network to map perceptions to symbols and a logical reasoner to predict the output of the downstream task.
They suffer from several issues, including slow convergence, learning difficulties with complex perception tasks, and convergence to local minima.
This paper proposes a simple yet effective method to ameliorate these problems.
arXiv Detail & Related papers (2024-02-21T15:51:01Z) - Enhancing Efficient Continual Learning with Dynamic Structure Development of Spiking Neural Networks [6.407825206595442]
Children possess the ability to learn multiple cognitive tasks sequentially.
Existing continual learning frameworks are usually applicable to Deep Neural Networks (DNNs).
We propose Dynamic Structure Development of Spiking Neural Networks (DSD-SNN) for efficient and adaptive continual learning.
arXiv Detail & Related papers (2023-08-09T07:36:40Z) - The Clock and the Pizza: Two Stories in Mechanistic Explanation of Neural Networks [59.26515696183751]
We show that algorithm discovery in neural networks is sometimes more complex: even simple learning problems can admit a surprising diversity of solutions.
arXiv Detail & Related papers (2023-06-30T17:59:13Z) - Neural Routing in Meta Learning [9.070747377130472]
We aim to improve the model performance of the current meta learning algorithms by selectively using only parts of the model conditioned on the input tasks.
In this work, we describe an approach that investigates task-dependent dynamic neuron selection in deep convolutional neural networks (CNNs) by leveraging the scaling factor in the batch normalization layer (a rough sketch of this routing idea appears after the list below).
We find that the proposed approach, neural routing in meta learning (NRML), outperforms one of the well-known existing meta learning baselines on few-shot classification tasks.
arXiv Detail & Related papers (2022-10-14T16:31:24Z) - CogNGen: Constructing the Kernel of a Hyperdimensional Predictive Processing Cognitive Architecture [79.07468367923619]
We present a new cognitive architecture that combines two neurobiologically plausible computational models.
We aim to develop a cognitive architecture that has the power of modern machine learning techniques.
arXiv Detail & Related papers (2022-03-31T04:44:28Z) - Neural Architecture Search for Dense Prediction Tasks in Computer Vision [74.9839082859151]
Deep learning has led to a rising demand for neural network architecture engineering.
Neural architecture search (NAS) aims at automatically designing neural network architectures in a data-driven manner rather than manually.
NAS has become applicable to a much wider range of problems in computer vision.
arXiv Detail & Related papers (2022-02-15T08:06:50Z) - Improving the sample-efficiency of neural architecture search with reinforcement learning [0.0]
In this work, we would like to contribute to the area of Automated Machine Learning (AutoML).
Our focus is on one of the most promising research directions, reinforcement learning.
The validation accuracies of the child networks serve as a reward signal for training the controller.
We propose to modify this to a more modern and complex algorithm, PPO, which has been shown to be faster and more stable in other environments (a minimal sketch of this controller-reward loop appears after the list below).
arXiv Detail & Related papers (2021-10-13T14:30:09Z) - Efficient and robust multi-task learning in the brain with modular task primitives [2.6166087473624318]
We show that a modular network endowed with task primitives can learn multiple tasks well while keeping parameter counts and updates low.
We also show that the skills acquired with our approach are more robust to a broad range of perturbations compared to those acquired with other multi-task learning strategies.
arXiv Detail & Related papers (2021-05-28T21:07:54Z) - Self-Constructing Neural Networks Through Random Mutation [0.0]
This paper presents a simple method for learning neural architecture through random mutation.
It demonstrates that 1) neural architecture may be learned during the agent's lifetime, 2) neural architecture may be constructed over a single lifetime without any initial connections or neurons, and 3) architectural modifications enable rapid adaptation to dynamic and novel task scenarios.
arXiv Detail & Related papers (2021-03-29T15:27:38Z) - MS-RANAS: Multi-Scale Resource-Aware Neural Architecture Search [94.80212602202518]
We propose Multi-Scale Resource-Aware Neural Architecture Search (MS-RANAS).
We employ a one-shot architecture search approach to reduce the search cost.
We achieve state-of-the-art results in terms of accuracy-speed trade-off.
arXiv Detail & Related papers (2020-09-29T11:56:01Z)
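The "Neural Routing in Meta Learning" entry above selects neurons per task using the batch-normalization scaling factor. The sketch below is a rough, assumed illustration of that idea, not the paper's code: channels whose BN scale (gamma) is small are gated off for the current task; the threshold and shapes are arbitrary.

```python
# Rough sketch (assumed, not the paper's code) of batch-norm-based neuron routing:
# channels with a small BN scaling factor (gamma) are switched off for the current task.
import torch
import torch.nn as nn

conv = nn.Conv2d(3, 8, kernel_size=3, padding=1)
bn = nn.BatchNorm2d(8)

x = torch.randn(4, 3, 32, 32)
h = bn(conv(x))

gamma = bn.weight.detach()                 # per-channel BN scaling factors
mask = (gamma.abs() > 0.5).float()         # task-dependent selection; threshold is assumed
h = h * mask.view(1, -1, 1, 1)             # zero out the unselected channels
print("active channels:", int(mask.sum().item()))
```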
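The sample-efficiency entry above describes the standard RL-based NAS loop, in which a controller samples child architectures and the children's validation accuracies serve as the reward. Below is a hedged, simplified illustration of that loop using a plain REINFORCE update (the paper itself proposes PPO); evaluate_child and the candidate widths are hypothetical placeholders.

```python
# Hedged sketch of an RL-based NAS controller loop: sample an architecture choice,
# treat the child network's validation accuracy as the reward, and update the
# controller by policy gradient. REINFORCE is used here for brevity; the related
# paper above proposes PPO instead. evaluate_child() is a placeholder.
import torch


def evaluate_child(width: int) -> float:
    # Placeholder: a real system would train a child network of this width
    # and return its validation accuracy.
    return min(1.0, 0.5 + 0.005 * width)


choices = [16, 32, 64, 128]                               # candidate layer widths (assumed)
logits = torch.zeros(len(choices), requires_grad=True)    # controller parameters
opt = torch.optim.Adam([logits], lr=0.1)

for step in range(20):
    dist = torch.distributions.Categorical(logits=logits)
    action = dist.sample()                                # controller picks an architecture
    reward = evaluate_child(choices[int(action)])         # validation accuracy as reward
    loss = -dist.log_prob(action) * reward                # REINFORCE objective
    opt.zero_grad()
    loss.backward()
    opt.step()

print("preferred width:", choices[int(torch.argmax(logits))])
```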