Dense Network Expansion for Class Incremental Learning
- URL: http://arxiv.org/abs/2303.12696v1
- Date: Wed, 22 Mar 2023 16:42:26 GMT
- Title: Dense Network Expansion for Class Incremental Learning
- Authors: Zhiyuan Hu, Yunsheng Li, Jiancheng Lyu, Dashan Gao, Nuno Vasconcelos
- Abstract summary: State-of-the-art approaches use a dynamic architecture based on network expansion (NE), in which a task expert is added per task.
A new NE method, dense network expansion (DNE), is proposed to achieve a better trade-off between accuracy and model complexity.
It outperforms the previous SOTA methods by a margin of 4% in terms of accuracy, with similar or even smaller model scale.
- Score: 61.00081795200547
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: The problem of class incremental learning (CIL) is considered.
State-of-the-art approaches use a dynamic architecture based on network
expansion (NE), in which a task expert is added per task. While effective from
a computational standpoint, these methods lead to models that grow quickly with
the number of tasks. A new NE method, dense network expansion (DNE), is
proposed to achieve a better trade-off between accuracy and model complexity.
This is accomplished by introducing dense connections between the
intermediate layers of the task expert networks, which enable the transfer of
knowledge from old to new tasks via feature sharing and reuse. This sharing
is implemented with a cross-task attention mechanism, based on a new task
attention block (TAB), that fuses information across tasks. Unlike traditional
attention mechanisms, TAB operates at the level of feature mixing and is
decoupled from spatial attention. This is shown to be more effective than a joint
spatial-and-task attention for CIL. The proposed DNE approach can strictly
maintain the feature space of old classes while growing the network and feature
scale at a much slower rate than previous methods. As a result, it outperforms
the previous SOTA methods by a margin of 4% in terms of accuracy, with similar
or even smaller model scale.
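As a rough illustration of the cross-task feature-sharing idea, a minimal NumPy sketch follows. This is not the authors' exact TAB: the projection matrices are hypothetical (random here, learned in practice), and shapes are simplified. The point it demonstrates is attention that is decoupled from spatial attention: at every spatial location, the new task's expert queries all experts (old and new) and mixes their feature vectors, with no attention computed across spatial positions.

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def cross_task_attention(new_feat, old_feats, rng):
    """Sketch of cross-task attention decoupled from spatial attention.

    new_feat:  (C, H, W) features from the current task's expert.
    old_feats: list of (C, H, W) features from frozen old-task experts.
    At each spatial location, the new task attends over the experts only;
    no mixing happens across spatial positions.
    """
    C = new_feat.shape[0]
    # Hypothetical learned projections, drawn at random for this sketch.
    Wq = rng.standard_normal((C, C)) / np.sqrt(C)
    Wk = rng.standard_normal((C, C)) / np.sqrt(C)
    Wv = rng.standard_normal((C, C)) / np.sqrt(C)

    experts = np.stack(old_feats + [new_feat])         # (T, C, H, W)
    q = np.einsum('dc,chw->dhw', Wq, new_feat)         # (C, H, W)
    k = np.einsum('dc,tchw->tdhw', Wk, experts)        # (T, C, H, W)
    v = np.einsum('dc,tchw->tdhw', Wv, experts)        # (T, C, H, W)

    # Per-location similarity between the new task's query and each expert.
    scores = np.einsum('chw,tchw->thw', q, k) / np.sqrt(C)  # (T, H, W)
    attn = softmax(scores, axis=0)                     # over tasks only
    return np.einsum('thw,tchw->chw', attn, v)         # (C, H, W)

rng = np.random.default_rng(0)
old = [rng.standard_normal((8, 4, 4)) for _ in range(2)]
new = rng.standard_normal((8, 4, 4))
out = cross_task_attention(new, old, rng)
print(out.shape)  # (8, 4, 4)
```

Because the softmax runs over the task axis only, each spatial location independently decides how much to borrow from each old expert, which is the feature-reuse behavior the dense connections are meant to enable.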
Related papers
- Joint Input and Output Coordination for Class-Incremental Learning [84.36763449830812]
We propose a joint input and output coordination (JIOC) mechanism to address these issues.
This mechanism assigns different weights to different categories of data according to the gradient of the output score.
It can be incorporated into different incremental learning approaches that use memory storage.
arXiv Detail & Related papers (2024-09-09T13:55:07Z)
- Towards Scalable and Versatile Weight Space Learning [51.78426981947659]
This paper introduces the SANE approach to weight-space learning.
Our method extends the idea of hyper-representations towards sequential processing of subsets of neural network weights.
arXiv Detail & Related papers (2024-06-14T13:12:07Z)
- TaE: Task-aware Expandable Representation for Long Tail Class Incremental Learning [42.630413950957795]
We introduce a novel Task-aware Expandable (TaE) framework to learn diverse representations from each incremental task.
TaE achieves state-of-the-art performance.
arXiv Detail & Related papers (2024-02-08T16:37:04Z)
- TRGP: Trust Region Gradient Projection for Continual Learning [39.99577526417276]
Catastrophic forgetting is one of the major challenges in continual learning.
We propose Trust Region Gradient Projection (TRGP) to facilitate forward knowledge transfer.
Our approach achieves significant improvement over related state-of-the-art methods.
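The summary above mentions gradient projection for continual learning. As a hedged sketch of the general idea behind such methods (not TRGP's specific trust-region construction), the snippet below removes the component of a new task's gradient that lies in a subspace important to old tasks, so the update cannot disturb directions those tasks rely on. The basis matrix here is hypothetical and random; real methods estimate it from old-task data.

```python
import numpy as np

def project_gradient(grad, old_basis):
    """Project a gradient to be orthogonal to an old-task subspace.

    grad:      (d,) gradient for the new task.
    old_basis: (d, k) matrix with orthonormal columns spanning directions
               important to old tasks, or None if no old tasks exist.
    """
    if old_basis is None:
        return grad
    # g_proj = g - B (B^T g): strip the component lying in span(B).
    return grad - old_basis @ (old_basis.T @ grad)

rng = np.random.default_rng(1)
# Orthonormal basis of an "old task" subspace (random for this sketch).
B, _ = np.linalg.qr(rng.standard_normal((6, 2)))
g = rng.standard_normal(6)
g_proj = project_gradient(g, B)
# The projected gradient is orthogonal to every old-task direction.
print(np.allclose(B.T @ g_proj, 0))  # True
```

Pure orthogonal projection like this blocks interference but also blocks transfer; trust-region variants relax the constraint to allow controlled reuse of old-task directions.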
arXiv Detail & Related papers (2022-02-07T04:21:54Z)
- GROWN: GRow Only When Necessary for Continual Learning [39.56829374809613]
Catastrophic forgetting is a notorious issue in deep learning, referring to the fact that Deep Neural Networks (DNNs) can forget the knowledge about earlier tasks when learning new tasks.
To address this issue, continual learning has been developed to learn new tasks sequentially and perform knowledge transfer from the old tasks to the new ones without forgetting.
GROWN is a novel end-to-end continual learning framework to dynamically grow the model only when necessary.
arXiv Detail & Related papers (2021-10-03T02:31:04Z)
- Efficient Feature Transformations for Discriminative and Generative Continual Learning [98.10425163678082]
We propose a simple task-specific feature map transformation strategy for continual learning.
These transformations provide powerful flexibility for learning new tasks, achieved with minimal parameters added to the base architecture.
We demonstrate the efficacy and efficiency of our method with an extensive set of experiments in discriminative (CIFAR-100 and ImageNet-1K) and generative sequences of tasks.
arXiv Detail & Related papers (2021-03-25T01:48:14Z)
- Context Decoupling Augmentation for Weakly Supervised Semantic Segmentation [53.49821324597837]
Weakly supervised semantic segmentation is a challenging problem that has been deeply studied in recent years.
We present a Context Decoupling Augmentation (CDA) method to change the inherent context in which the objects appear.
To validate the effectiveness of the proposed method, extensive experiments on PASCAL VOC 2012 dataset with several alternative network architectures demonstrate that CDA can boost various popular WSSS methods to the new state-of-the-art by a large margin.
arXiv Detail & Related papers (2021-03-02T15:05:09Z)
- Incremental Embedding Learning via Zero-Shot Translation [65.94349068508863]
Current state-of-the-art incremental learning methods tackle the catastrophic forgetting problem in traditional classification networks.
We propose a novel class-incremental method for embedding networks, named the zero-shot translation class-incremental method (ZSTCI).
In addition, ZSTCI can easily be combined with existing regularization-based incremental learning methods to further improve the performance of embedding networks.
arXiv Detail & Related papers (2020-12-31T08:21:37Z)
- SpaceNet: Make Free Space For Continual Learning [15.914199054779438]
We propose a novel architecture-based method, referred to as SpaceNet, for the class-incremental learning scenario.
SpaceNet trains sparse deep neural networks from scratch in an adaptive way that compresses the sparse connections of each task into a compact number of neurons.
Experimental results show the robustness of our proposed method against catastrophic forgetting of old tasks and the efficiency of SpaceNet in utilizing the available capacity of the model.
arXiv Detail & Related papers (2020-07-15T11:21:31Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this information and is not responsible for any consequences of its use.