Continual Learning in Low-rank Orthogonal Subspaces
- URL: http://arxiv.org/abs/2010.11635v2
- Date: Tue, 8 Dec 2020 15:23:37 GMT
- Title: Continual Learning in Low-rank Orthogonal Subspaces
- Authors: Arslan Chaudhry, Naeemullah Khan, Puneet K. Dokania, Philip H. S. Torr
- Abstract summary: In continual learning (CL), a learner is faced with a sequence of tasks, arriving one after the other, and the goal is to remember all the tasks once the learning experience is finished.
The prior art in CL uses episodic memory, parameter regularization or network structures to reduce interference among tasks, but in the end, all the approaches learn different tasks in a joint vector space.
We propose to learn tasks in different (low-rank) vector subspaces that are kept orthogonal to each other in order to minimize interference.
- Score: 86.36417214618575
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In continual learning (CL), a learner is faced with a sequence of tasks,
arriving one after the other, and the goal is to remember all the tasks once
the continual learning experience is finished. The prior art in CL uses
episodic memory, parameter regularization or extensible network structures to
reduce interference among tasks, but in the end, all the approaches learn
different tasks in a joint vector space. We believe this invariably leads to
interference among different tasks. We propose to learn tasks in different
(low-rank) vector subspaces that are kept orthogonal to each other in order to
minimize interference. Further, to keep the gradients of different tasks coming
from these subspaces orthogonal to each other, we learn isometric mappings by
posing network training as an optimization problem over the Stiefel manifold.
To the best of our understanding, we report, for the first time, strong results
over experience-replay baseline with and without memory on standard
classification benchmarks in continual learning. The code is made publicly
available.
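Below is a minimal sketch, in PyTorch, of the two ideas described in the abstract: each task is assigned a low-rank projection onto its own subspace, with the subspaces kept mutually orthogonal, and weight matrices are kept orthonormal by retracting them onto the Stiefel manifold after an update. This is not the authors' released implementation; the helper names (`make_task_projectors`, `qr_retraction`) and the choice of a QR retraction are illustrative assumptions.

```python
import torch

def make_task_projectors(feature_dim: int, num_tasks: int, rank: int):
    """Split the columns of one random orthonormal basis into disjoint
    rank-`rank` blocks, one per task, so the task subspaces are
    orthogonal to each other by construction."""
    assert num_tasks * rank <= feature_dim
    basis, _ = torch.linalg.qr(torch.randn(feature_dim, feature_dim))
    return [basis[:, t * rank:(t + 1) * rank] for t in range(num_tasks)]

def project_features(features: torch.Tensor, proj: torch.Tensor) -> torch.Tensor:
    """Project a batch of features (B x D) into one task's rank-r subspace (B x r)."""
    return features @ proj

def qr_retraction(weight: torch.Tensor) -> torch.Tensor:
    """Retract a tall weight matrix back onto the Stiefel manifold
    (orthonormal columns) via QR decomposition; one common retraction,
    used here as a stand-in for the manifold optimization in the paper."""
    q, r = torch.linalg.qr(weight)
    # Fix column signs so the retraction stays close to the input matrix.
    return q * torch.sign(torch.diagonal(r)).unsqueeze(0)

# Example: 3 tasks, 512-d backbone features, rank-8 task subspaces.
projectors = make_task_projectors(feature_dim=512, num_tasks=3, rank=8)
feats = torch.randn(32, 512)                # features for a batch of 32
z = project_features(feats, projectors[1])  # 32 x 8, lives in task 1's subspace

W = qr_retraction(torch.randn(512, 128))    # orthonormal columns: W.T @ W ≈ I
print(z.shape, torch.allclose(W.T @ W, torch.eye(128), atol=1e-4))
```

Because the task projectors are disjoint column blocks of a single orthonormal basis, representations routed through different tasks' subspaces are orthogonal by construction, which mirrors the interference-minimization argument made in the abstract.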
Related papers
- Infinite dSprites for Disentangled Continual Learning: Separating Memory Edits from Generalization [36.23065731463065]
We introduce Infinite dSprites, a parsimonious tool for creating continual classification benchmarks of arbitrary length.
We show that over a sufficiently long time horizon, the performance of all major types of continual learning methods deteriorates on this simple benchmark.
In a simple setting with direct supervision on the generative factors, we show how learning class-agnostic transformations offers a way to circumvent catastrophic forgetting.
arXiv Detail & Related papers (2023-12-27T22:05:42Z)
- Online Continual Learning via the Knowledge Invariant and Spread-out Properties [4.109784267309124]
A key challenge in continual learning is catastrophic forgetting.
We propose a new method, named Online Continual Learning via the Knowledge Invariant and Spread-out Properties (OCLKISP).
We empirically evaluate our proposed method on four popular benchmarks for continual learning: Split CIFAR-100, Split SVHN, Split CUB-200 and Split Tiny-ImageNet.
arXiv Detail & Related papers (2023-02-02T04:03:38Z)
- Fast Inference and Transfer of Compositional Task Structures for Few-shot Task Generalization [101.72755769194677]
We formulate it as a few-shot reinforcement learning problem where a task is characterized by a subtask graph.
Our multi-task subtask graph inferencer (MTSGI) first infers the common high-level task structure in terms of the subtask graph from the training tasks.
Our experiment results on 2D grid-world and complex web navigation domains show that the proposed method can learn and leverage the common underlying structure of the tasks for faster adaptation to the unseen tasks.
arXiv Detail & Related papers (2022-05-25T10:44:25Z)
- Continual Object Detection via Prototypical Task Correlation Guided Gating Mechanism [120.1998866178014]
We present a flexible framework for continual object detection via pRotOtypical taSk corrElaTion guided gaTing mechAnism (ROSETTA).
Concretely, a unified framework is shared by all tasks while task-aware gates are introduced to automatically select sub-models for specific tasks.
Experiments on COCO-VOC, KITTI-Kitchen, class-incremental detection on VOC and sequential learning of four tasks show that ROSETTA yields state-of-the-art performance.
arXiv Detail & Related papers (2022-05-06T07:31:28Z)
- On Steering Multi-Annotations per Sample for Multi-Task Learning [79.98259057711044]
The study of multi-task learning has drawn great attention from the community.
Despite the remarkable progress, the challenge of optimally learning different tasks simultaneously remains to be explored.
Previous works attempt to modify the gradients from different tasks, but these methods rely on subjective assumptions about the relationships between tasks, and the modified gradients may be less accurate.
In this paper, we introduce Stochastic Task Allocation (STA), a mechanism that addresses this issue through a task allocation approach in which each sample is randomly allocated a subset of tasks.
For further progress, we propose Interleaved Stochastic Task Allocation (ISTA) to iteratively allocate all tasks.
arXiv Detail & Related papers (2022-03-06T11:57:18Z)
- Learning Bayesian Sparse Networks with Full Experience Replay for Continual Learning [54.7584721943286]
Continual Learning (CL) methods aim to enable machine learning models to learn new tasks without catastrophic forgetting of those that have been previously mastered.
Existing CL approaches often keep a buffer of previously-seen samples, perform knowledge distillation, or use regularization techniques towards this goal.
We propose to activate and select only a sparse set of neurons for learning the current and past tasks at any stage.
arXiv Detail & Related papers (2022-02-21T13:25:03Z)
- Linear Mode Connectivity in Multitask and Continual Learning [46.98656798573886]
We investigate whether multitask and continual solutions are similarly connected by low-loss linear paths.
We propose an effective algorithm that constrains the sequentially learned minima to behave as the multitask solution.
arXiv Detail & Related papers (2020-10-09T10:53:25Z)
- Adversarial Continual Learning [99.56738010842301]
We propose a hybrid continual learning framework that learns a disjoint representation for task-invariant and task-specific features.
Our model combines architecture growth to prevent forgetting of task-specific skills and an experience replay approach to preserve shared skills.
arXiv Detail & Related papers (2020-03-21T02:08:17Z)