Lifelong Sequence Generation with Dynamic Module Expansion and
Adaptation
- URL: http://arxiv.org/abs/2310.09886v4
- Date: Wed, 22 Nov 2023 06:44:16 GMT
- Title: Lifelong Sequence Generation with Dynamic Module Expansion and
Adaptation
- Authors: Chengwei Qin, Chen Chen, Shafiq Joty
- Abstract summary: Lifelong sequence generation (LSG) aims to continually train a model on a sequence of generation tasks to learn constantly emerging new generation patterns.
Inspired by the learning paradigm of humans, we propose Dynamic Module Expansion and Adaptation (DMEA)
DMEA enables the model to dynamically determine the architecture for acquiring new knowledge based on task correlation and select the most similar previous tasks to facilitate adaptation to new tasks.
- Score: 39.886149621730915
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Lifelong sequence generation (LSG), a problem in continual learning, aims to
continually train a model on a sequence of generation tasks to learn constantly
emerging new generation patterns while avoiding the forgetting of previous
knowledge. Existing LSG methods mainly focus on maintaining old knowledge while
paying little attention to knowledge transfer across tasks. In contrast, humans
can better learn new tasks by leveraging previously acquired knowledge from
similar tasks. Inspired by the learning paradigm of humans, we propose Dynamic
Module Expansion and Adaptation (DMEA), which enables the model to dynamically
determine the architecture for acquiring new knowledge based on task
correlation and select the most similar previous tasks to facilitate adaptation
to new tasks. In addition, as the learning process can easily be biased towards
the current task which might cause more severe forgetting of previously learned
knowledge, we propose dynamic gradient scaling to balance the learning of the
current task and replayed tasks. With extensive experiments, we demonstrate
that DMEA can consistently outperform existing methods in different LSG
settings.
Related papers
- CODE-CL: COnceptor-Based Gradient Projection for DEep Continual Learning [7.573297026523597]
We introduce COnceptor-based gradient projection for DEep Continual Learning (CODE-CL)
CODE-CL encodes directional importance within the input space of past tasks, allowing new knowledge integration in directions modulated by $1-S$.
We analyze task overlap using conceptor-based representations to identify highly correlated tasks.
arXiv Detail & Related papers (2024-11-21T22:31:06Z) - Continual Task Learning through Adaptive Policy Self-Composition [54.95680427960524]
CompoFormer is a structure-based continual transformer model that adaptively composes previous policies via a meta-policy network.
Our experiments reveal that CompoFormer outperforms conventional continual learning (CL) methods, particularly in longer task sequences.
arXiv Detail & Related papers (2024-11-18T08:20:21Z) - Online Continual Learning via the Knowledge Invariant and Spread-out
Properties [4.109784267309124]
Key challenge in continual learning is catastrophic forgetting.
We propose a new method, named Online Continual Learning via the Knowledge Invariant and Spread-out Properties (OCLKISP)
We empirically evaluate our proposed method on four popular benchmarks for continual learning: Split CIFAR 100, Split SVHN, Split CUB200 and Split Tiny-Image-Net.
arXiv Detail & Related papers (2023-02-02T04:03:38Z) - Beyond Not-Forgetting: Continual Learning with Backward Knowledge
Transfer [39.99577526417276]
In continual learning (CL) an agent can improve the learning performance of both a new task and old' tasks.
Most existing CL methods focus on addressing catastrophic forgetting in neural networks by minimizing the modification of the learnt model for old tasks.
We propose a new CL method with Backward knowlEdge tRansfer (CUBER) for a fixed capacity neural network without data replay.
arXiv Detail & Related papers (2022-11-01T23:55:51Z) - Anti-Retroactive Interference for Lifelong Learning [65.50683752919089]
We design a paradigm for lifelong learning based on meta-learning and associative mechanism of the brain.
It tackles the problem from two aspects: extracting knowledge and memorizing knowledge.
It is theoretically analyzed that the proposed learning paradigm can make the models of different tasks converge to the same optimum.
arXiv Detail & Related papers (2022-08-27T09:27:36Z) - Fully Online Meta-Learning Without Task Boundaries [80.09124768759564]
We study how meta-learning can be applied to tackle online problems of this nature.
We propose a Fully Online Meta-Learning (FOML) algorithm, which does not require any ground truth knowledge about the task boundaries.
Our experiments show that FOML was able to learn new tasks faster than the state-of-the-art online learning methods.
arXiv Detail & Related papers (2022-02-01T07:51:24Z) - AFEC: Active Forgetting of Negative Transfer in Continual Learning [37.03139674884091]
We show that biological neural networks can actively forget the old knowledge that conflicts with the learning of a new experience.
Inspired by the biological active forgetting, we propose to actively forget the old knowledge that limits the learning of new tasks to benefit continual learning.
arXiv Detail & Related papers (2021-10-23T10:03:19Z) - Lifelong Learning of Few-shot Learners across NLP Tasks [45.273018249235705]
We study the challenge of lifelong learning to few-shot learn over a sequence of diverse NLP tasks.
We propose a continual meta-learning approach which learns to generate adapter weights from a few examples.
We demonstrate our approach preserves model performance over training tasks and leads to positive knowledge transfer when the future tasks are learned.
arXiv Detail & Related papers (2021-04-18T10:41:56Z) - Meta-learning the Learning Trends Shared Across Tasks [123.10294801296926]
Gradient-based meta-learning algorithms excel at quick adaptation to new tasks with limited data.
Existing meta-learning approaches only depend on the current task information during the adaptation.
We propose a 'Path-aware' model-agnostic meta-learning approach.
arXiv Detail & Related papers (2020-10-19T08:06:47Z) - Automated Relational Meta-learning [95.02216511235191]
We propose an automated relational meta-learning framework that automatically extracts the cross-task relations and constructs the meta-knowledge graph.
We conduct extensive experiments on 2D toy regression and few-shot image classification and the results demonstrate the superiority of ARML over state-of-the-art baselines.
arXiv Detail & Related papers (2020-01-03T07:02:25Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.