On the Effectiveness of Equivariant Regularization for Robust Online
Continual Learning
- URL: http://arxiv.org/abs/2305.03648v1
- Date: Fri, 5 May 2023 16:10:31 GMT
- Title: On the Effectiveness of Equivariant Regularization for Robust Online
Continual Learning
- Authors: Lorenzo Bonicelli, Matteo Boschini, Emanuele Frascaroli, Angelo
Porrello, Matteo Pennisi, Giovanni Bellitto, Simone Palazzo, Concetto
Spampinato, Simone Calderara
- Abstract summary: Continual Learning (CL) approaches seek to bridge the gap between incremental human learning and catastrophic forgetting in neural networks by facilitating the transfer of knowledge to both previous tasks and future ones.
Recent research has shown that self-supervision can produce versatile models that can generalize well to diverse downstream tasks.
We propose Continual Learning via Equivariant Regularization (CLER), an OCL approach that leverages equivariant tasks for self-supervision.
- Score: 17.995662644298974
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Humans can learn incrementally, whereas neural networks forget previously
acquired information catastrophically. Continual Learning (CL) approaches seek
to bridge this gap by facilitating the transfer of knowledge to both previous
tasks (backward transfer) and future ones (forward transfer) during training.
Recent research has shown that self-supervision can produce versatile models
that can generalize well to diverse downstream tasks. However, contrastive
self-supervised learning (CSSL), a popular self-supervision technique, has
limited effectiveness in online CL (OCL). OCL only permits one iteration of the
input dataset, and CSSL's low sample efficiency hinders its use on the input
data-stream.
In this work, we propose Continual Learning via Equivariant Regularization
(CLER), an OCL approach that leverages equivariant tasks for self-supervision,
avoiding CSSL's limitations. Our method represents the first attempt at
combining equivariant knowledge with CL and can be easily integrated with
existing OCL methods. Extensive ablations shed light on how equivariant pretext
tasks affect the network's information flow and on their impact on CL dynamics.
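To make the idea concrete, the following is a minimal sketch, assuming a PyTorch-style setup, of how an equivariant pretext task such as rotation prediction can be attached as a regularizer to an experience-replay OCL step. The names `model`, `rot_head`, `buffer`, and `lambda_eq` are illustrative placeholders; this is not the paper's exact implementation, only an example of the general technique it builds on.

```python
# Illustrative sketch only: equivariant self-supervision (rotation prediction)
# added to an experience-replay OCL step. All names are hypothetical and not
# taken from the CLER paper.
import torch
import torch.nn.functional as F

def rotate_batch(x):
    """Rotate each image by 0/90/180/270 degrees and return rotation labels."""
    rotations = torch.randint(0, 4, (x.size(0),), device=x.device)
    x_rot = torch.stack([torch.rot90(img, int(k), dims=(1, 2))
                         for img, k in zip(x, rotations)])
    return x_rot, rotations

def ocl_step(model, rot_head, opt, stream_x, stream_y, buffer, lambda_eq=0.5):
    # Mix the incoming stream batch with a replay batch (standard OCL practice).
    buf_x, buf_y = buffer.sample(stream_x.size(0))
    x = torch.cat([stream_x, buf_x])
    y = torch.cat([stream_y, buf_y])

    feats = model.features(x)                       # backbone features
    ce = F.cross_entropy(model.classifier(feats), y)

    # Equivariant pretext task: predict which rotation was applied.
    x_rot, rot_labels = rotate_batch(x)
    eq_loss = F.cross_entropy(rot_head(model.features(x_rot)), rot_labels)

    loss = ce + lambda_eq * eq_loss
    opt.zero_grad()
    loss.backward()
    opt.step()

    buffer.add(stream_x, stream_y)                  # reservoir-style update
    return loss.item()
```

Because rotation prediction needs only a single forward pass per view and no large batch of negatives, it fits the one-pass, small-batch constraints of OCL better than contrastive objectives do.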
Related papers
- Continual Task Learning through Adaptive Policy Self-Composition [54.95680427960524]
CompoFormer is a structure-based continual transformer model that adaptively composes previous policies via a meta-policy network.
Our experiments reveal that CompoFormer outperforms conventional continual learning (CL) methods, particularly in longer task sequences.
arXiv Detail & Related papers (2024-11-18T08:20:21Z)
- Investigating the Pre-Training Dynamics of In-Context Learning: Task Recognition vs. Task Learning [99.05401042153214]
In-context learning (ICL) is potentially attributed to two major abilities: task recognition (TR) and task learning (TL).
We take the first step by examining the pre-training dynamics of the emergence of ICL.
We propose a simple yet effective method to better integrate these two abilities for ICL at inference time.
arXiv Detail & Related papers (2024-06-20T06:37:47Z)
- What Makes CLIP More Robust to Long-Tailed Pre-Training Data? A Controlled Study for Transferable Insights [67.72413262980272]
Severe data imbalance naturally exists among web-scale vision-language datasets.
We find CLIP pre-trained thereupon exhibits notable robustness to the data imbalance compared to supervised learning.
The robustness and discriminability of CLIP improve with more descriptive language supervision, larger data scale, and broader open-world concepts.
arXiv Detail & Related papers (2024-05-31T17:57:24Z)
- Plasticity-Optimized Complementary Networks for Unsupervised Continual Learning [22.067640536948545]
Continuous unsupervised representation learning (CURL) research has greatly benefited from improvements in self-supervised learning (SSL) techniques.
Existing CURL methods using SSL can learn high-quality representations without any labels, but with a notable performance drop when learning on a many-tasks data stream.
We propose to train an expert network that is relieved of the duty of keeping the previous knowledge and can focus on performing optimally on the new tasks.
arXiv Detail & Related papers (2023-09-12T09:31:34Z)
- CBA: Improving Online Continual Learning via Continual Bias Adaptor [44.1816716207484]
We propose a Continual Bias Adaptor (CBA) that augments the classifier network to adapt to catastrophic distribution changes during training.
In the testing stage, CBA can be removed which introduces no additional cost and memory overhead.
We theoretically reveal the reason why the proposed method can effectively alleviate catastrophic distribution shifts.
arXiv Detail & Related papers (2023-08-14T04:03:51Z)
- Mitigating Forgetting in Online Continual Learning via Contrasting Semantically Distinct Augmentations [22.289830907729705]
Online continual learning (OCL) aims to enable model learning from a non-stationary data stream to continuously acquire new knowledge as well as retain the learnt one.
The main challenge comes from the "catastrophic forgetting" issue -- the inability to retain previously learnt knowledge while learning new knowledge.
arXiv Detail & Related papers (2022-11-10T05:29:43Z)
- Beyond Supervised Continual Learning: a Review [69.9674326582747]
Continual Learning (CL) is a flavor of machine learning where the usual assumption of a stationary data distribution is relaxed or omitted.
Changes in the data distribution can cause the so-called catastrophic forgetting (CF) effect: an abrupt loss of previous knowledge.
This article reviews literature that studies CL in other settings, such as learning with reduced supervision, fully unsupervised learning, and reinforcement learning.
arXiv Detail & Related papers (2022-08-30T14:44:41Z)
- Online Continual Learning with Contrastive Vision Transformer [67.72251876181497]
This paper proposes a framework Contrastive Vision Transformer (CVT) to achieve a better stability-plasticity trade-off for online CL.
Specifically, we design a new external attention mechanism for online CL that implicitly captures previous tasks' information.
Based on the learnable focuses, we design a focal contrastive loss to rebalance contrastive learning between new and past classes and consolidate previously learned representations.
arXiv Detail & Related papers (2022-07-24T08:51:02Z)
- Using Representation Expressiveness and Learnability to Evaluate Self-Supervised Learning Methods [61.49061000562676]
We introduce Cluster Learnability (CL) to assess learnability.
CL is measured in terms of the performance of a KNN trained to predict labels obtained by clustering the representations with K-means.
We find that CL better correlates with in-distribution model performance than other competing recent evaluation schemes; a minimal sketch of this measure appears after the list below.
arXiv Detail & Related papers (2022-06-02T19:05:13Z)
- Generalized Variational Continual Learning [33.194866396158005]
Two main approaches to continual learning are Online Elastic Weight Consolidation and Variational Continual Learning.
We show that applying this modification recovers Online EWC as a limiting case, allowing interpolation between the two approaches.
To mitigate the observed overpruning effect of VI, we take inspiration from a common multi-task architecture, augmenting neural networks with task-specific FiLM layers.
arXiv Detail & Related papers (2020-11-24T19:07:39Z)
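The Cluster Learnability measure referenced in the "Using Representation Expressiveness and Learnability" entry above is concrete enough to sketch: cluster the representations with K-means, then train a KNN to predict the resulting cluster assignments and report its held-out accuracy. The version below is an illustrative scikit-learn implementation; the number of clusters, neighbors, and the train/test split are arbitrary choices, not the cited paper's exact protocol.

```python
# Illustrative sketch of a cluster-learnability style score (not the cited
# paper's exact protocol): K-means pseudo-labels predicted by a KNN.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.neighbors import KNeighborsClassifier
from sklearn.model_selection import train_test_split

def cluster_learnability(representations, n_clusters=10, n_neighbors=5, seed=0):
    # Pseudo-labels from K-means over the learned representations.
    labels = KMeans(n_clusters=n_clusters, random_state=seed,
                    n_init=10).fit_predict(representations)

    # A KNN must recover those pseudo-labels on held-out points: the easier
    # this is, the more "learnable" the representation space.
    x_tr, x_te, y_tr, y_te = train_test_split(
        representations, labels, test_size=0.2, random_state=seed)
    knn = KNeighborsClassifier(n_neighbors=n_neighbors).fit(x_tr, y_tr)
    return knn.score(x_te, y_te)

# Example: random features should score close to chance for balanced clusters.
print(cluster_learnability(np.random.randn(1000, 64)))
```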
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.