E2-AEN: End-to-End Incremental Learning with Adaptively Expandable
Network
- URL: http://arxiv.org/abs/2207.06754v1
- Date: Thu, 14 Jul 2022 09:04:51 GMT
- Title: E2-AEN: End-to-End Incremental Learning with Adaptively Expandable
Network
- Authors: Guimei Cao, Zhanzhan Cheng, Yunlu Xu, Duo Li, Shiliang Pu, Yi Niu and
Fei Wu
- Abstract summary: We propose an end-to-end trainable adaptively expandable network named E2-AEN.
It dynamically generates lightweight structures for new tasks without any accuracy drop on previous tasks.
E2-AEN reduces cost and can be built upon any feed-forward architecture in an end-to-end manner.
- Score: 57.87240860624937
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Expandable networks have demonstrated their advantages in dealing with the
catastrophic forgetting problem in incremental learning. Considering that different tasks
may need different structures, recent methods design dynamic structures adapted to each
task via sophisticated techniques. Their routine is to first search for an expandable
structure and then train on the new task, which breaks each task into multiple training
stages and leads to suboptimal performance or excessive computational cost. In this
paper, we propose an end-to-end trainable adaptively expandable network named E2-AEN,
which dynamically generates lightweight structures for new tasks without any accuracy
drop on previous tasks. Specifically, the network contains a series of powerful feature
adapters that augment previously learned representations for new tasks while avoiding
task interference. These adapters are controlled via an adaptive gate-based pruning
strategy that decides whether the expanded structures can be pruned, making the network
structure dynamically adjustable to the complexity of each new task. Moreover, we
introduce a novel sparsity-activation regularization to encourage the model to learn
discriminative features with limited parameters. E2-AEN reduces cost and can be built
upon any feed-forward architecture in an end-to-end manner. Extensive experiments on
both classification (i.e., CIFAR and VDD) and detection (i.e., COCO, VOC and the ICCV
2021 SSLAD challenge) benchmarks demonstrate the effectiveness of the proposed method,
which achieves remarkable new results.
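As a rough illustration of the kind of mechanism the abstract describes, the sketch below (plain PyTorch, not the authors' released code) augments a frozen backbone block with a lightweight residual adapter whose contribution is scaled by a learnable gate; an L1-style penalty on the gate activations stands in for the sparsity-activation regularization, so adapters whose gates stay near zero can be pruned after training. The names GatedAdapter and sparsity_loss, the reduction factor, and the loss weight are illustrative assumptions, not taken from the paper.

```python
# Illustrative sketch of a gated, prunable feature adapter in the spirit of
# E2-AEN (not the authors' implementation). A frozen backbone block is
# augmented with a lightweight task-specific adapter; a learnable gate scales
# the adapter output, and a sparsity penalty on the gate encourages pruning.
import torch
import torch.nn as nn


class GatedAdapter(nn.Module):
    """Lightweight residual adapter whose contribution is scaled by a gate."""

    def __init__(self, channels: int, reduction: int = 4):
        super().__init__()
        hidden = max(channels // reduction, 1)
        self.adapter = nn.Sequential(
            nn.Conv2d(channels, hidden, kernel_size=1, bias=False),
            nn.ReLU(inplace=True),
            nn.Conv2d(hidden, channels, kernel_size=1, bias=False),
        )
        # Gate parameter; sigmoid(gate) near 0 means the adapter can be pruned.
        self.gate = nn.Parameter(torch.zeros(1))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        g = torch.sigmoid(self.gate)
        return x + g * self.adapter(x)

    def gate_value(self) -> torch.Tensor:
        return torch.sigmoid(self.gate)


def sparsity_loss(adapters) -> torch.Tensor:
    """Sparsity-style regularizer: pushes gate activations toward zero so that
    adapters whose gate stays below a threshold can be removed after training."""
    return torch.stack([a.gate_value() for a in adapters]).sum()


if __name__ == "__main__":
    # Frozen backbone stage (stands in for any feed-forward architecture).
    backbone_block = nn.Conv2d(64, 64, kernel_size=3, padding=1)
    for p in backbone_block.parameters():
        p.requires_grad_(False)

    adapter = GatedAdapter(channels=64)
    x = torch.randn(2, 64, 32, 32)
    y = adapter(backbone_block(x))

    # Total loss = task loss + lambda * gate sparsity (task loss is a dummy here).
    loss = y.mean() + 1e-2 * sparsity_loss([adapter])
    loss.backward()
    print("gate value:", adapter.gate_value().item())
```

After training, adapters whose gate remains below a small threshold could be dropped entirely, which is the sense in which the expanded structure stays prunable while the whole pipeline remains trainable end to end.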
Related papers
- LW2G: Learning Whether to Grow for Prompt-based Continual Learning [15.766350352592331]
Recent Prompt-based Continual Learning (PCL) has achieved remarkable performance with Pre-Trained Models (PTMs).
We propose a plug-in module in the former stage to Learn Whether to Grow (LW2G) based on the disparities between tasks.
Inspired by Gradient Projection Continual Learning, our LW2G develops a metric called Hinder Forward Capability (HFC) to measure the hindrance imposed on learning new tasks; a rough sketch of such a gradient-projection score follows below.
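The sketch below is a loose illustration of a gradient-projection-style hindrance measure; the exact HFC definition is given in the LW2G paper, and the function name hindrance_score, the subspace construction, and the 0.5 threshold are assumptions made only for this example.

```python
# Hypothetical sketch of a gradient-projection-style "hindrance" score, loosely
# in the spirit of LW2G's Hinder Forward Capability (illustration only). Old
# tasks span a gradient subspace; the larger the share of the new task's
# gradient that a projection away from that subspace would remove, the more the
# old constraints hinder the new task, suggesting the prompt set should grow.
import torch


def hindrance_score(new_grad: torch.Tensor, old_basis: torch.Tensor) -> float:
    """new_grad: flattened gradient for the new task, shape (d,).
    old_basis: orthonormal basis of the old tasks' gradient subspace, shape (d, k).
    Returns the fraction of the new gradient's norm lying inside the old subspace."""
    coeffs = old_basis.T @ new_grad      # components along the old directions
    projected = old_basis @ coeffs       # projection onto the old subspace
    return (projected.norm() / new_grad.norm()).item()


if __name__ == "__main__":
    d, k = 1000, 5
    # Orthonormal basis from a random matrix (stand-in for stored old-task gradients).
    old_basis, _ = torch.linalg.qr(torch.randn(d, k))
    new_grad = torch.randn(d)
    score = hindrance_score(new_grad, old_basis)
    grow = score > 0.5   # illustrative threshold: grow if the new task is heavily hindered
    print(f"hindrance score = {score:.3f}, grow new prompt: {grow}")
```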
arXiv Detail & Related papers (2024-09-27T15:55:13Z)
- Dynamic Integration of Task-Specific Adapters for Class Incremental Learning [31.67570086108542]
Non-exemplar Class Incremental Learning (NECIL) enables models to continuously acquire new classes without retraining from scratch or storing old task exemplars.
We propose a novel framework called Dynamic Integration of task-specific Adapters (DIA), which comprises two key components: Task-Specific Adapter Integration (TSAI) and Patch-Level Model Alignment.
arXiv Detail & Related papers (2024-09-23T13:01:33Z)
- Fast Inference and Transfer of Compositional Task Structures for Few-shot Task Generalization [101.72755769194677]
We formulate few-shot task generalization as a few-shot reinforcement learning problem in which a task is characterized by a subtask graph.
Our multi-task subtask graph inferencer (MTSGI) first infers the common high-level task structure in terms of the subtask graph from the training tasks.
Our experiment results on 2D grid-world and complex web navigation domains show that the proposed method can learn and leverage the common underlying structure of the tasks for faster adaptation to the unseen tasks.
arXiv Detail & Related papers (2022-05-25T10:44:25Z)
- Controllable Dynamic Multi-Task Architectures [92.74372912009127]
We propose a controllable multi-task network that dynamically adjusts its architecture and weights to match the desired task preference as well as the resource constraints.
We propose a disentangled training of two hypernetworks, by exploiting task affinity and a novel branching regularized loss, to take input preferences and accordingly predict tree-structured models with adapted weights.
arXiv Detail & Related papers (2022-03-28T17:56:40Z)
- GROWN: GRow Only When Necessary for Continual Learning [39.56829374809613]
Catastrophic forgetting is a notorious issue in deep learning, referring to the fact that Deep Neural Networks (DNNs) can forget knowledge about earlier tasks when learning new ones.
To address this issue, continual learning has been developed to learn new tasks sequentially and perform knowledge transfer from the old tasks to the new ones without forgetting.
GROWN is a novel end-to-end continual learning framework to dynamically grow the model only when necessary.
arXiv Detail & Related papers (2021-10-03T02:31:04Z)
- Multi-Task Learning with Sequence-Conditioned Transporter Networks [67.57293592529517]
We aim to solve multi-task learning through the lens of sequence-conditioning and weighted sampling.
First, we propose MultiRavens, a new benchmark suite aimed at compositional tasks, which allows defining custom task combinations.
Second, we propose a vision-based end-to-end system architecture, Sequence-Conditioned Transporter Networks, which augments Goal-Conditioned Transporter Networks with sequence-conditioning and weighted sampling.
arXiv Detail & Related papers (2021-09-15T21:19:11Z)
- A Novel Approach to Lifelong Learning: The Plastic Support Structure [0.0]
We propose a novel approach to lifelong learning, introducing a compact encapsulated support structure which endows a network with the capability to expand its capacity as needed to learn new tasks.
This is achieved by splitting neurons with high semantic drift and constructing an adjacent network to encode the new tasks at hand.
We call this the Plastic Support Structure (PSS): a compact structure for learning new tasks that cannot be efficiently encoded in the network's existing structure (a toy sketch of drift-based neuron selection follows below).
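The sketch below illustrates only the selection step, under the assumption that "semantic drift" can be approximated by how far a neuron's incoming weights move while training on the new task; the helper drifted_neurons and the top-k choice are illustrative, not the paper's procedure.

```python
# Toy illustration (not the PSS paper's procedure) of selecting neurons with
# high "semantic drift": neurons whose incoming weights moved furthest from a
# snapshot taken before the new task are flagged for splitting, i.e. their old
# weights would be kept while a duplicated neuron learns the new task.
import torch
import torch.nn as nn


def drifted_neurons(layer: nn.Linear, snapshot: torch.Tensor, top_k: int):
    """Return indices of the top_k output neurons whose incoming weight vectors
    drifted most (L2 distance) from the pre-task snapshot."""
    drift = (layer.weight.detach() - snapshot).norm(dim=1)   # one score per neuron
    return torch.topk(drift, top_k).indices


if __name__ == "__main__":
    layer = nn.Linear(64, 32)
    snapshot = layer.weight.detach().clone()      # weights before the new task

    # Simulate training on a new task by perturbing a few neurons strongly.
    with torch.no_grad():
        layer.weight[:4] += 0.5 * torch.randn(4, 64)

    split_ids = drifted_neurons(layer, snapshot, top_k=4)
    print("neurons selected for splitting:", split_ids.tolist())
```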
arXiv Detail & Related papers (2021-06-11T10:34:37Z)
- Efficient Feature Transformations for Discriminative and Generative Continual Learning [98.10425163678082]
We propose a simple task-specific feature map transformation strategy for continual learning.
These transformations provide powerful flexibility for learning new tasks, with minimal parameters added to the base architecture (a toy per-task transform is sketched below).
We demonstrate the efficacy and efficiency of our method with an extensive set of experiments in discriminative (CIFAR-100 and ImageNet-1K) and generative sequences of tasks.
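The following is a minimal sketch of the general idea of task-specific feature-map transformations, assuming a per-task channel-wise affine transform on top of a frozen shared layer; the class TaskFeatureTransform and its parameterization are illustrative assumptions, not the paper's exact formulation.

```python
# Minimal illustration (not the paper's exact method) of task-specific
# feature-map transformations: a shared, frozen convolutional layer is followed
# by a tiny per-task channel-wise affine transform, so each new task adds only
# a handful of parameters to the base architecture.
import torch
import torch.nn as nn


class TaskFeatureTransform(nn.Module):
    """Per-task channel-wise scale and shift applied to shared feature maps."""

    def __init__(self, channels: int, num_tasks: int):
        super().__init__()
        self.scale = nn.Parameter(torch.ones(num_tasks, channels))
        self.shift = nn.Parameter(torch.zeros(num_tasks, channels))

    def forward(self, feats: torch.Tensor, task_id: int) -> torch.Tensor:
        s = self.scale[task_id].view(1, -1, 1, 1)
        b = self.shift[task_id].view(1, -1, 1, 1)
        return feats * s + b


if __name__ == "__main__":
    shared = nn.Conv2d(3, 16, kernel_size=3, padding=1)
    for p in shared.parameters():
        p.requires_grad_(False)          # the shared backbone stays fixed
    transform = TaskFeatureTransform(channels=16, num_tasks=3)

    x = torch.randn(4, 3, 32, 32)
    out_task0 = transform(shared(x), task_id=0)
    out_task2 = transform(shared(x), task_id=2)
    print(out_task0.shape, out_task2.shape)
```

In this toy setup each additional task costs only 2 x channels parameters, which is the sense in which new tasks add minimal parameters to the base architecture.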
arXiv Detail & Related papers (2021-03-25T01:48:14Z)
- Reparameterizing Convolutions for Incremental Multi-Task Learning without Task Interference [75.95287293847697]
Two common challenges in developing multi-task models are often overlooked in the literature.
First, enabling the model to be inherently incremental, continuously incorporating information from new tasks without forgetting the previously learned ones (incremental learning).
Second, eliminating adverse interactions amongst tasks, which have been shown to significantly degrade single-task performance in a multi-task setup (task interference).
arXiv Detail & Related papers (2020-07-24T14:44:46Z)