Plasticity-Optimized Complementary Networks for Unsupervised Continual Learning
- URL: http://arxiv.org/abs/2309.06086v1
- Date: Tue, 12 Sep 2023 09:31:34 GMT
- Title: Plasticity-Optimized Complementary Networks for Unsupervised Continual Learning
- Authors: Alex Gomez-Villa, Bartlomiej Twardowski, Kai Wang, Joost van de Weijer
- Abstract summary: Continuous unsupervised representation learning (CURL) research has greatly benefited from improvements in self-supervised learning (SSL) techniques.
Existing CURL methods using SSL can learn high-quality representations without any labels, but with a notable performance drop when learning on a many-tasks data stream.
We propose to train an expert network that is relieved of the duty of retaining previous knowledge and can focus on performing optimally on the new tasks.
- Score: 22.067640536948545
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Continuous unsupervised representation learning (CURL) research has greatly
benefited from improvements in self-supervised learning (SSL) techniques. As a
result, existing CURL methods using SSL can learn high-quality representations
without any labels, but with a notable performance drop when learning on a
many-tasks data stream. We hypothesize that this is caused by the
regularization losses that are imposed to prevent forgetting, leading to a
suboptimal plasticity-stability trade-off: they either do not adapt fully to
the incoming data (low plasticity), or incur significant forgetting when
allowed to fully adapt to a new SSL pretext-task (low stability). In this work,
we propose to first train an expert network that is relieved of the duty of
retaining previous knowledge and can focus on performing optimally on the new
tasks (optimizing plasticity). In a second phase, we combine this new knowledge
with the previous network in an adaptation-retrospection phase that avoids
forgetting, and we initialize the next expert with the knowledge of the old
network.
We perform several experiments showing that our proposed approach outperforms
other CURL exemplar-free methods in few- and many-task split settings.
Furthermore, we show how to adapt our approach to semi-supervised continual
learning (Semi-SCL) and show that we surpass the accuracy of other
exemplar-free Semi-SCL methods and reach the results of some others that use
exemplars.
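A minimal sketch may help picture the two-phase scheme described in the abstract. The PyTorch-style code below is only an illustrative reading of that description: the Encoder, ssl_loss, and distill_loss helpers are toy stand-ins assumed here, not the authors' implementation, and details such as projection heads, the exact losses, and the data used for retrospection are omitted.

```python
# Hedged sketch of the two-phase scheme: phase 1 trains an unconstrained expert on the
# new task (plasticity); phase 2 consolidates expert and old knowledge into the main
# network via distillation (adaptation-retrospection), then re-initializes the expert.
import copy
import torch
import torch.nn as nn
import torch.nn.functional as F


class Encoder(nn.Module):
    """Toy backbone standing in for the SSL encoder (illustrative assumption)."""
    def __init__(self, dim_in=32, dim_out=16):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(dim_in, 64), nn.ReLU(), nn.Linear(64, dim_out))

    def forward(self, x):
        return self.net(x)


def ssl_loss(z1, z2):
    """Placeholder SSL pretext loss: pull two augmented views together."""
    return F.mse_loss(F.normalize(z1, dim=-1), F.normalize(z2, dim=-1))


def distill_loss(student_z, teacher_z):
    """Feature distillation: match (detached) teacher features."""
    return F.mse_loss(student_z, teacher_z.detach())


def train_task(main_net, expert, task_loader, epochs=1, lr=1e-3):
    # Phase 1: the expert adapts freely to the new task, with no anti-forgetting terms.
    opt_e = torch.optim.SGD(expert.parameters(), lr=lr)
    for _ in range(epochs):
        for v1, v2 in task_loader:              # two augmented views per sample
            loss = ssl_loss(expert(v1), expert(v2))
            opt_e.zero_grad()
            loss.backward()
            opt_e.step()

    # Phase 2: adaptation-retrospection -- the main network absorbs the expert's new
    # knowledge while also distilling from its own frozen previous state.
    old_main = copy.deepcopy(main_net).eval()
    opt_m = torch.optim.SGD(main_net.parameters(), lr=lr)
    for _ in range(epochs):
        for v1, _ in task_loader:
            z = main_net(v1)
            loss = distill_loss(z, expert(v1)) + distill_loss(z, old_main(v1))
            opt_m.zero_grad()
            loss.backward()
            opt_m.step()

    # The next expert starts from the consolidated main network.
    expert.load_state_dict(main_net.state_dict())


# Toy usage with random two-view batches:
main, exp = Encoder(), Encoder()
loader = [(torch.randn(8, 32), torch.randn(8, 32)) for _ in range(5)]
train_task(main, exp, loader)
```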
Related papers
- Continual Task Learning through Adaptive Policy Self-Composition [54.95680427960524]
CompoFormer is a structure-based continual transformer model that adaptively composes previous policies via a meta-policy network.
Our experiments reveal that CompoFormer outperforms conventional continual learning (CL) methods, particularly in longer task sequences.
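One plausible reading of "adaptively composes previous policies via a meta-policy network" is sketched below; the architecture, names, and dimensions are illustrative assumptions, not CompoFormer's actual design.

```python
# Hedged illustration: a small meta-network produces mixing weights over frozen,
# previously learned policy heads plus one new head, so old policies are reused
# rather than overwritten when a new task arrives.
import torch
import torch.nn as nn
import torch.nn.functional as F


class ComposedPolicy(nn.Module):
    def __init__(self, obs_dim, act_dim, prev_policies):
        super().__init__()
        self.prev = nn.ModuleList(prev_policies)
        for p in self.prev.parameters():                      # previous policies stay frozen
            p.requires_grad_(False)
        self.new_head = nn.Linear(obs_dim, act_dim)           # trainable head for the new task
        self.meta = nn.Linear(obs_dim, len(prev_policies) + 1)  # meta-network: mixing weights

    def forward(self, obs):
        candidates = [p(obs) for p in self.prev] + [self.new_head(obs)]
        weights = F.softmax(self.meta(obs), dim=-1)            # (batch, n_policies)
        stacked = torch.stack(candidates, dim=-1)              # (batch, act_dim, n_policies)
        return (stacked * weights.unsqueeze(1)).sum(-1)


prev = [nn.Linear(8, 2) for _ in range(3)]    # stand-ins for earlier task policies
policy = ComposedPolicy(obs_dim=8, act_dim=2, prev_policies=prev)
print(policy(torch.randn(4, 8)).shape)        # torch.Size([4, 2])
```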
arXiv Detail & Related papers (2024-11-18T08:20:21Z)
- Context-Aware Predictive Coding: A Representation Learning Framework for WiFi Sensing [0.0]
WiFi sensing is an emerging technology that utilizes wireless signals for various sensing applications.
In this paper, we introduce a novel SSL framework called Context-Aware Predictive Coding (CAPC).
CAPC effectively learns from unlabelled data and adapts to diverse environments.
Our evaluations demonstrate that CAPC not only outperforms other SSL methods and supervised approaches, but also achieves superior generalization capabilities.
arXiv Detail & Related papers (2024-09-16T17:59:49Z)
- Beyond Prompt Learning: Continual Adapter for Efficient Rehearsal-Free Continual Learning [22.13331870720021]
We propose an approach that goes beyond prompt learning for the rehearsal-free continual learning (RFCL) task, called Continual Adapter (C-ADA).
C-ADA flexibly extends specific weights in CAL to learn new knowledge for each task and freezes old weights to preserve prior knowledge.
Our approach achieves significantly improved performance and training speed, outperforming the current state-of-the-art (SOTA) method.
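A rough sketch of the extend-new-weights, freeze-old-weights idea is given below; the growing low-rank adapter here is a toy construction assumed for illustration, not the paper's actual Continual Adapter layer.

```python
# Hedged sketch: keep a list of per-task low-rank adapter chunks, train only the
# newest chunk, and freeze all chunks learned for previous tasks.
import torch
import torch.nn as nn


class GrowingAdapter(nn.Module):
    def __init__(self, dim, rank=4):
        super().__init__()
        self.dim, self.rank = dim, rank
        self.downs = nn.ModuleList()    # one low-rank pair per task
        self.ups = nn.ModuleList()

    def add_task(self):
        # New trainable chunk for the incoming task.
        self.downs.append(nn.Linear(self.dim, self.rank, bias=False))
        self.ups.append(nn.Linear(self.rank, self.dim, bias=False))
        # Freeze everything learned for previous tasks.
        for layer in list(self.downs[:-1]) + list(self.ups[:-1]):
            for p in layer.parameters():
                p.requires_grad_(False)

    def forward(self, x):
        # Old and new chunks all contribute to the residual adapter output.
        out = x
        for down, up in zip(self.downs, self.ups):
            out = out + up(down(x))
        return out


adapter = GrowingAdapter(dim=16)
adapter.add_task()                          # task 1: chunk 0 trainable
adapter.add_task()                          # task 2: chunk 0 frozen, chunk 1 trainable
print(adapter(torch.randn(2, 16)).shape)    # torch.Size([2, 16])
```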
arXiv Detail & Related papers (2024-07-14T17:40:40Z)
- Self-Supervision for Tackling Unsupervised Anomaly Detection: Pitfalls and Opportunities [50.231837687221685]
Self-supervised learning (SSL) has transformed machine learning and its many real world applications.
Unsupervised anomaly detection (AD) has also capitalized on SSL by self-generating pseudo-anomalies.
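A generic illustration of the self-generated pseudo-anomaly idea (not any specific method from the paper): corrupt normal samples with a synthetic transform, train a classifier to tell them apart from real data, and reuse its score as an anomaly score.

```python
# Hedged toy example: the corruption, model, and data are illustrative assumptions.
import torch
import torch.nn as nn


def make_pseudo_anomaly(x, noise_scale=2.0):
    # Toy corruption: heavily perturb a random subset of feature dimensions.
    x = x.clone()
    idx = torch.randperm(x.shape[1])[: x.shape[1] // 4]
    x[:, idx] = x[:, idx] + noise_scale * torch.randn_like(x[:, idx])
    return x


scorer = nn.Sequential(nn.Linear(32, 64), nn.ReLU(), nn.Linear(64, 1))
opt = torch.optim.Adam(scorer.parameters(), lr=1e-3)
bce = nn.BCEWithLogitsLoss()

for _ in range(50):                                 # toy training loop on random "normal" data
    normal = torch.randn(64, 32)
    pseudo = make_pseudo_anomaly(normal)
    x = torch.cat([normal, pseudo])
    y = torch.cat([torch.zeros(64, 1), torch.ones(64, 1)])  # 1 = pseudo-anomaly
    loss = bce(scorer(x), y)
    opt.zero_grad(); loss.backward(); opt.step()

anomaly_score = torch.sigmoid(scorer(torch.randn(5, 32)))   # higher = more anomalous
```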
arXiv Detail & Related papers (2023-08-28T07:55:01Z)
- Mitigating Forgetting in Online Continual Learning via Contrasting Semantically Distinct Augmentations [22.289830907729705]
Online continual learning (OCL) aims to enable model learning from a non-stationary data stream to continuously acquire new knowledge as well as retain the learnt one.
The main challenge comes from the "catastrophic forgetting" issue: the inability to retain previously learnt knowledge while acquiring new knowledge.
arXiv Detail & Related papers (2022-11-10T05:29:43Z)
- Effective Self-supervised Pre-training on Low-compute Networks without Distillation [6.530011859253459]
Reported performance of self-supervised learning has trailed behind standard supervised pre-training by a large margin.
Most prior works attribute this poor performance to the capacity bottleneck of the low-compute networks.
We take a closer look at the detrimental factors causing these practical limitations, and ask whether they are intrinsic to the self-supervised low-compute setting.
arXiv Detail & Related papers (2022-10-06T10:38:07Z)
- Decoupled Adversarial Contrastive Learning for Self-supervised Adversarial Robustness [69.39073806630583]
Adversarial training (AT) for robust representation learning and self-supervised learning (SSL) for unsupervised representation learning are two active research fields.
We propose a two-stage framework termed Decoupled Adversarial Contrastive Learning (DeACL).
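A hedged sketch of such a two-stage decoupling is shown below: stage 1 runs plain SSL pretraining, and stage 2 freezes that model as a teacher and trains a student to match the teacher's features on adversarially perturbed inputs. The toy losses and the one-step perturbation are assumptions for illustration, not the paper's exact recipe.

```python
import copy
import torch
import torch.nn as nn
import torch.nn.functional as F

encoder = nn.Sequential(nn.Linear(32, 64), nn.ReLU(), nn.Linear(64, 16))

# Stage 1: standard (non-adversarial) SSL pretraining with a toy view-matching loss.
opt = torch.optim.Adam(encoder.parameters(), lr=1e-3)
for _ in range(50):
    x = torch.randn(64, 32)
    v1, v2 = x + 0.1 * torch.randn_like(x), x + 0.1 * torch.randn_like(x)
    loss = F.mse_loss(F.normalize(encoder(v1), dim=-1), F.normalize(encoder(v2), dim=-1))
    opt.zero_grad(); loss.backward(); opt.step()

# Stage 2: adversarial self-distillation from the frozen SSL teacher.
teacher = copy.deepcopy(encoder).eval()
student = copy.deepcopy(encoder)
opt = torch.optim.Adam(student.parameters(), lr=1e-3)
eps = 0.03
for _ in range(50):
    x = torch.randn(64, 32)
    x_adv = x.clone().requires_grad_(True)
    grad, = torch.autograd.grad(F.mse_loss(student(x_adv), teacher(x).detach()), x_adv)
    x_adv = (x + eps * grad.sign()).detach()      # one-step (FGSM-style) perturbation
    loss = F.mse_loss(student(x_adv), teacher(x).detach())
    opt.zero_grad(); loss.backward(); opt.step()
```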
arXiv Detail & Related papers (2022-07-22T06:30:44Z)
- DATA: Domain-Aware and Task-Aware Pre-training [94.62676913928831]
We present DATA, a simple yet effective neural architecture search (NAS) approach specialized for self-supervised learning (SSL).
Our method achieves promising results across a wide range of computation costs on downstream tasks, including image classification, object detection and semantic segmentation.
arXiv Detail & Related papers (2022-03-17T02:38:49Z)
- Improving Self-supervised Learning with Hardness-aware Dynamic Curriculum Learning: An Application to Digital Pathology [2.2742357407157847]
Self-supervised learning (SSL) has recently shown tremendous potential to learn generic visual representations useful for many image analysis tasks.
Existing SSL methods fail to generalize to downstream tasks when the number of labeled training instances is small or when the domain shift between the pretraining and transfer domains is significant.
This paper attempts to improve self-supervised pretrained representations through the lens of curriculum learning.
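As an illustrative reading (not the paper's exact method), a hardness-aware curriculum for SSL pretraining can be sketched as: score each sample by its current pretext loss and gradually admit harder samples as training progresses. The scoring rule, schedule, and toy data below are assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

encoder = nn.Sequential(nn.Linear(32, 64), nn.ReLU(), nn.Linear(64, 16))
opt = torch.optim.Adam(encoder.parameters(), lr=1e-3)
data = torch.randn(512, 32)
epochs = 10

for epoch in range(epochs):
    v1 = data + 0.1 * torch.randn_like(data)
    v2 = data + 0.1 * torch.randn_like(data)
    # Per-sample pretext loss used as a hardness score.
    with torch.no_grad():
        hardness = (F.normalize(encoder(v1), dim=-1)
                    - F.normalize(encoder(v2), dim=-1)).pow(2).mean(-1)
    # Curriculum: start with the easiest fraction and grow it linearly to the full set.
    frac = 0.3 + 0.7 * epoch / max(epochs - 1, 1)
    keep = hardness.argsort()[: int(frac * len(data))]
    loss = F.mse_loss(F.normalize(encoder(v1[keep]), dim=-1),
                      F.normalize(encoder(v2[keep]), dim=-1))
    opt.zero_grad(); loss.backward(); opt.step()
```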
arXiv Detail & Related papers (2021-08-16T15:44:48Z)
- Incremental Embedding Learning via Zero-Shot Translation [65.94349068508863]
Current state-of-the-art incremental learning methods tackle the catastrophic forgetting problem in traditional classification networks.
We propose a novel class-incremental method for embedding networks, named zero-shot translation class-incremental method (ZSTCI).
In addition, ZSTCI can easily be combined with existing regularization-based incremental learning methods to further improve the performance of embedding networks.
arXiv Detail & Related papers (2020-12-31T08:21:37Z)
- Understanding Self-supervised Learning with Dual Deep Networks [74.92916579635336]
We propose a novel framework to understand contrastive self-supervised learning (SSL) methods that employ dual pairs of deep ReLU networks.
We prove that in each SGD update of SimCLR with various loss functions, the weights at each layer are updated by a covariance operator.
To further study what role the covariance operator plays and which features are learned in such a process, we model data generation and augmentation processes through a hierarchical latent tree model (HLTM).
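Read schematically, as a hedged paraphrase of that claim rather than the paper's exact theorem, the statement says each layer's weights evolve under SGD roughly as
\[
W_l \;\leftarrow\; W_l + \eta\,\mathrm{OP}_l\, W_l,
\]
where $\mathrm{OP}_l$ denotes a covariance operator built from the network's intermediate activations over the data and augmentation distributions, and $\eta$ is the learning rate.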
arXiv Detail & Related papers (2020-10-01T17:51:49Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of this list (including all information) and is not responsible for any consequences of its use.