SpaceNet: Make Free Space For Continual Learning
- URL: http://arxiv.org/abs/2007.07617v3
- Date: Wed, 14 Apr 2021 08:39:33 GMT
- Title: SpaceNet: Make Free Space For Continual Learning
- Authors: Ghada Sokar, Decebal Constantin Mocanu, Mykola Pechenizkiy
- Abstract summary: We propose a novel architectural-based method, referred to as SpaceNet, for the class incremental learning scenario.
SpaceNet trains sparse deep neural networks from scratch in an adaptive way that compresses the sparse connections of each task in a compact number of neurons.
Experimental results show the robustness of our proposed method against catastrophic forgetting of old tasks and the efficiency of SpaceNet in utilizing the available capacity of the model.
- Score: 15.914199054779438
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The continual learning (CL) paradigm aims to enable neural networks to learn
tasks continually in a sequential fashion. The fundamental challenge in this
learning paradigm is catastrophic forgetting of previously learned tasks when the
model is optimized for a new task, especially when their data is not
accessible. Current architectural-based methods aim at alleviating the
catastrophic forgetting problem but at the expense of expanding the capacity of
the model. Regularization-based methods maintain a fixed model capacity;
however, previous studies showed the huge performance degradation of these
methods when the task identity is not available during inference (e.g. the class
incremental learning scenario). In this work, we propose a novel
architectural-based method, referred to as SpaceNet, for the class incremental learning
scenario where we utilize the available fixed capacity of the model
intelligently. SpaceNet trains sparse deep neural networks from scratch in an
adaptive way that compresses the sparse connections of each task in a compact
number of neurons. The adaptive training of the sparse connections results in
sparse representations that reduce the interference between the tasks.
Experimental results show the robustness of our proposed method against
catastrophic forgetting of old tasks and the efficiency of SpaceNet in utilizing
the available capacity of the model, leaving space for more tasks to be
learned. In particular, when SpaceNet is tested on the well-known benchmarks
for CL: split MNIST, split Fashion-MNIST, and CIFAR-10/100, it outperforms
regularization-based methods by a big performance gap. Moreover, it achieves
better performance than architectural-based methods without model expansion and
achieves comparable results with rehearsal-based methods, while offering a huge
memory reduction.
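The idea of keeping model capacity fixed while giving each task its own sparse set of connections can be illustrated with a small sketch. This is not the authors' algorithm: it omits SpaceNet's adaptive redistribution of connections and only shows per-task sparse masks inside a fixed-size weight matrix, with connections frozen once their task is done. The helper names (`allocate_task_mask`, `train_step`), the layer size, the per-task budget of 10 connections, and the random stand-in gradients are all assumptions made for illustration.

```python
import random


def allocate_task_mask(free, n_connections, rng):
    """Pick n_connections (row, col) slots from the still-unused positions."""
    if n_connections > len(free):
        raise ValueError("not enough free capacity for this task")
    # sorted() gives random.sample a sequence and keeps the draw reproducible
    return set(rng.sample(sorted(free), n_connections))


def train_step(weights, grads, task_mask, lr=0.1):
    """Update only connections owned by the current task; the rest stay frozen."""
    for i, j in task_mask:
        weights[i][j] -= lr * grads[i][j]


n_in, n_out = 8, 6                      # hypothetical layer size (fixed capacity)
rng = random.Random(0)
weights = [[0.0] * n_out for _ in range(n_in)]
free = {(i, j) for i in range(n_in) for j in range(n_out)}
masks = []

for task in range(3):
    mask = allocate_task_mask(free, n_connections=10, rng=rng)
    free -= mask                        # these slots now belong to this task
    for _ in range(5):
        # Stand-in for real gradients; a real model would backpropagate a loss.
        grads = [[rng.uniform(-1, 1) for _ in range(n_out)] for _ in range(n_in)]
        train_step(weights, grads, mask)
    masks.append(mask)

used = sum(len(m) for m in masks)
print(f"used {used} of {n_in * n_out} connections, {len(free)} still free")
# prints: used 30 of 48 connections, 18 still free
```

Because each task writes only to its own mask, later tasks cannot overwrite weights learned earlier, and the unallocated slots are the "free space" left for future tasks.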
Related papers
- Towards Scalable and Versatile Weight Space Learning [51.78426981947659]
This paper introduces the SANE approach to weight-space learning.
Our method extends the idea of hyper-representations towards sequential processing of subsets of neural network weights.
arXiv Detail & Related papers (2024-06-14T13:12:07Z)
- Dynamic Sparse Learning: A Novel Paradigm for Efficient Recommendation [20.851925464903804]
This paper introduces a novel learning paradigm, Dynamic Sparse Learning, tailored for recommendation models.
DSL innovatively trains a lightweight sparse model from scratch, periodically evaluating and dynamically adjusting each weight's significance.
Our experimental results underline DSL's effectiveness, significantly reducing training and inference costs while delivering comparable recommendation performance.
arXiv Detail & Related papers (2024-02-05T10:16:20Z)
- Concrete Subspace Learning based Interference Elimination for Multi-task Model Fusion [86.6191592951269]
Merging models fine-tuned from a common, extensively pretrained large model but specialized for different tasks has been demonstrated as a cheap and scalable strategy to construct a multitask model that performs well across diverse tasks.
We propose the CONtinuous relaxation of disCRETE (Concrete) subspace learning method to identify a common low-dimensional subspace and utilize its shared information to track the interference problem without sacrificing performance.
arXiv Detail & Related papers (2023-12-11T07:24:54Z)
- Lightweight Diffusion Models with Distillation-Based Block Neural Architecture Search [55.41583104734349]
We propose to automatically remove structural redundancy in diffusion models with Diffusion Distillation-based Block-wise Neural Architecture Search (DiffNAS).
Given a larger pretrained teacher, we leverage DiffNAS to search for the smallest architecture which can achieve on-par or even better performance than the teacher.
Different from previous block-wise NAS methods, DiffNAS contains a block-wise local search strategy and a retraining strategy with a joint dynamic loss.
arXiv Detail & Related papers (2023-11-08T12:56:59Z)
- Towards Robust Continual Learning with Bayesian Adaptive Moment Regularization [51.34904967046097]
Continual learning seeks to overcome the challenge of catastrophic forgetting, where a model forgets previously learnt information.
We introduce a novel prior-based method that better constrains parameter growth, reducing catastrophic forgetting.
Results show that BAdam achieves state-of-the-art performance for prior-based methods on challenging single-headed class-incremental experiments.
arXiv Detail & Related papers (2023-09-15T17:10:51Z)
- Complementary Learning Subnetworks for Parameter-Efficient Class-Incremental Learning [40.13416912075668]
We propose a rehearsal-free CIL approach that learns continually via the synergy between two Complementary Learning Subnetworks.
Our method achieves competitive results against state-of-the-art methods, especially in accuracy gain, memory cost, training efficiency, and task-order.
arXiv Detail & Related papers (2023-06-21T01:43:25Z)
- Dense Network Expansion for Class Incremental Learning [61.00081795200547]
State-of-the-art approaches use a dynamic architecture based on network expansion (NE), in which a task expert is added per task.
A new NE method, dense network expansion (DNE), is proposed to achieve a better trade-off between accuracy and model complexity.
It outperforms the previous SOTA methods by a margin of 4% in terms of accuracy, with similar or even smaller model scale.
arXiv Detail & Related papers (2023-03-22T16:42:26Z)
- Neural Weight Search for Scalable Task Incremental Learning [6.413209417643468]
Task incremental learning aims to enable a system to maintain its performance on previously learned tasks while learning new tasks, solving the problem of catastrophic forgetting.
One promising approach is to build an individual network or sub-network for future tasks.
This leads to ever-growing memory usage because extra weights are saved for each new task, and how to address this issue has remained an open problem in task incremental learning.
arXiv Detail & Related papers (2022-11-24T23:30:23Z)
- Task-Adaptive Neural Network Retrieval with Meta-Contrastive Learning [34.27089256930098]
We propose a novel neural network retrieval method, which retrieves the most optimal pre-trained network for a given task.
We train this framework by meta-learning a cross-modal latent space with contrastive loss, to maximize the similarity between a dataset and a network.
We validate the efficacy of our method on ten real-world datasets, against existing NAS baselines.
arXiv Detail & Related papers (2021-03-02T06:30:51Z)
- Incremental Embedding Learning via Zero-Shot Translation [65.94349068508863]
Current state-of-the-art incremental learning methods tackle the catastrophic forgetting problem in traditional classification networks.
We propose a novel class-incremental method for embedding networks, named the zero-shot translation class-incremental method (ZSTCI).
In addition, ZSTCI can easily be combined with existing regularization-based incremental learning methods to further improve the performance of embedding networks.
arXiv Detail & Related papers (2020-12-31T08:21:37Z)
- Neuromodulated Neural Architectures with Local Error Signals for Memory-Constrained Online Continual Learning [4.2903672492917755]
We develop a biologically inspired lightweight neural network architecture that incorporates local learning and neuromodulation.
We demonstrate the efficacy of our approach in both single-task and continual learning settings.
arXiv Detail & Related papers (2020-07-16T07:41:23Z)
This list is automatically generated from the titles and abstracts of the papers on this site.