Online Continual Learning on a Contaminated Data Stream with Blurry Task
Boundaries
- URL: http://arxiv.org/abs/2203.15355v2
- Date: Wed, 30 Mar 2022 05:51:20 GMT
- Authors: Jihwan Bang, Hyunseo Koh, Seulki Park, Hwanjun Song, Jung-Woo Ha,
Jonghyun Choi
- Abstract summary: A large body of continual learning (CL) methods assumes data streams with clean labels, while online learning under noisy data streams remains underexplored.
We consider a more practical CL setup: online learning from a blurry data stream with corrupted labels, where existing CL methods struggle.
We propose a novel strategy that manages and uses the memory through a unified approach combining label-noise-aware diverse sampling with robust, semi-supervised learning.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Learning under a continuously changing data distribution with incorrect
labels is a desirable yet challenging real-world problem. A large body of
continual learning (CL) methods, however, assumes data streams with clean
labels, and online learning under noisy data streams remains
underexplored. We consider a more practical CL setup: online learning
from a blurry data stream with corrupted labels, where existing CL methods
struggle. To address the task, we first argue the importance of both diversity
and purity of examples in the episodic memory of continual learning models. To
balance diversity and purity in the episodic memory, we propose a novel
strategy that manages and uses the memory through a unified approach combining
label-noise-aware diverse sampling with robust, semi-supervised learning. Our
empirical validation on four datasets with real-world or synthetic noise (CIFAR10
and CIFAR100, mini-WebVision, and Food-101N) shows that our method significantly
outperforms prior art in this realistic and challenging continual learning
scenario. Code and data splits are available at
https://github.com/clovaai/puridiver.
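To make the diversity/purity trade-off in the abstract concrete, here is a minimal sketch of one way an episodic memory could be filled under both criteria. It is not the authors' algorithm (see the linked repository for that). The function name `select_memory`, the weight `alpha`, the use of a small per-sample loss as a purity proxy, and the use of Euclidean feature distance as a diversity proxy are all assumptions made for illustration.

```python
import math


def select_memory(features, losses, k, alpha=0.5):
    """Greedily pick k sample indices, trading off purity and diversity.

    Assumptions (illustrative only, not taken from the paper):
      - purity proxy: a smaller loss suggests a cleaner label;
      - diversity proxy: Euclidean distance to the nearest sample
        that is already in the memory.
    """
    n = len(features)
    if k <= 0:
        return []
    if k >= n:
        return list(range(n))

    # Map losses to purity scores in [0, 1]; the lowest loss gets 1.0.
    lo, hi = min(losses), max(losses)
    span = (hi - lo) or 1.0
    purity = [1.0 - (loss - lo) / span for loss in losses]

    # Seed the memory with the sample that looks cleanest.
    selected = [max(range(n), key=lambda i: purity[i])]
    min_dist = [math.dist(features[i], features[selected[0]]) for i in range(n)]

    while len(selected) < k:
        candidates = [i for i in range(n) if i not in selected]
        # Normalize distances so both terms of the score lie in [0, 1].
        scale = max(min_dist[i] for i in candidates) or 1.0
        best = max(
            candidates,
            key=lambda i: alpha * purity[i] + (1 - alpha) * min_dist[i] / scale,
        )
        selected.append(best)
        # Update each sample's distance to its nearest memory entry.
        min_dist = [
            min(min_dist[i], math.dist(features[i], features[best]))
            for i in range(n)
        ]
    return selected


# Toy stream: two tight clusters; the last sample has a suspiciously high loss.
features = [[0.0, 0.0], [0.1, 0.0], [5.0, 5.0], [5.1, 5.0]]
losses = [0.1, 0.2, 0.15, 3.0]
print(select_memory(features, losses, k=2))  # [0, 2]
```

In this toy example the memory takes one sample from each cluster and skips the high-loss sample. Setting `alpha=0.0` ignores purity after the first pick, so the second slot goes to the farthest sample even though its label looks noisy, which is the failure mode that motivates balancing the two criteria.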
Related papers
- RanDumb: A Simple Approach that Questions the Efficacy of Continual Representation Learning [68.42776779425978]
We show that existing online continually trained deep networks produce inferior representations compared to simple pre-defined random transforms.
We then train a simple linear classifier on top without storing any exemplars, processing one sample at a time in an online continual learning setting.
Our study reveals the significant limitations of representation learning, particularly in low-exemplar and online continual learning scenarios.
arXiv Detail & Related papers (2024-02-13T22:07:29Z)
- Continual Learning with Deep Streaming Regularized Discriminant Analysis [0.0]
We propose a streaming version of regularized discriminant analysis as a solution to this challenge.
We combine our algorithm with a convolutional neural network and demonstrate that it outperforms both batch learning and existing streaming learning algorithms.
arXiv Detail & Related papers (2023-09-15T12:25:42Z)
- CTP: Towards Vision-Language Continual Pretraining via Compatible Momentum Contrast and Topology Preservation [128.00940554196976]
Vision-Language Pretraining (VLP) has shown impressive results on diverse downstream tasks by offline training on large-scale datasets.
To support the study of Vision-Language Continual Pretraining (VLCP), we first contribute a comprehensive and unified benchmark dataset P9D.
The data from each industry as an independent task supports continual learning and conforms to the real-world long-tail nature to simulate pretraining on web data.
arXiv Detail & Related papers (2023-08-14T13:53:18Z)
- MILD: Modeling the Instance Learning Dynamics for Learning with Noisy Labels [19.650299232829546]
We propose an iterative selection approach based on the Weibull mixture model to identify clean data.
In particular, we measure how difficult each instance is to memorize via the transition times between being misclassified and being memorized.
Our strategy outperforms existing noisy-label learning methods.
arXiv Detail & Related papers (2023-06-20T14:26:53Z)
- Nonstationary data stream classification with online active learning and siamese neural networks [11.501721946030779]
There is an emerging need for online learning methods that train predictive models on-the-fly.
A series of open challenges, however, hinder their deployment in practice.
We propose the ActiSiamese algorithm, which addresses these challenges by combining online active learning, siamese networks, and a multi-queue memory.
arXiv Detail & Related papers (2022-10-03T17:16:03Z)
- vCLIMB: A Novel Video Class Incremental Learning Benchmark [53.90485760679411]
We introduce vCLIMB, a novel video continual learning benchmark.
vCLIMB is a standardized test-bed to analyze catastrophic forgetting of deep models in video continual learning.
We propose a temporal consistency regularization that can be applied on top of memory-based continual learning methods.
arXiv Detail & Related papers (2022-01-23T22:14:17Z)
- Online Continual Learning with Natural Distribution Shifts: An Empirical Study with Visual Data [101.6195176510611]
"Online" continual learning enables evaluating both information retention and online learning efficacy.
In online continual learning, each incoming small batch of data is first used for testing and then added to the training set, making the problem truly online.
We introduce a new benchmark for online continual visual learning that exhibits large scale and natural distribution shifts.
arXiv Detail & Related papers (2021-08-20T06:17:20Z)
- Rainbow Memory: Continual Learning with a Memory of Diverse Samples [14.520337285540148]
We argue the importance of diversity of samples in an episodic memory.
We propose a novel memory management strategy based on per-sample classification uncertainty and data augmentation.
We show that the proposed method significantly improves the accuracy in blurry continual learning setups.
arXiv Detail & Related papers (2021-03-31T17:28:29Z)
- ORDisCo: Effective and Efficient Usage of Incremental Unlabeled Data for Semi-supervised Continual Learning [52.831894583501395]
Continual learning assumes the incoming data are fully labeled, which might not be applicable in real applications.
We propose deep Online Replay with Discriminator Consistency (ORDisCo) to interdependently learn a classifier with a conditional generative adversarial network (GAN).
We show ORDisCo achieves significant performance improvement on various semi-supervised learning benchmark datasets for SSCL.
arXiv Detail & Related papers (2021-01-02T09:04:14Z)
- Bilevel Continual Learning [76.50127663309604]
We present a novel framework of continual learning named "Bilevel Continual Learning" (BCL).
Our experiments on continual learning benchmarks demonstrate the efficacy of the proposed BCL compared to many state-of-the-art methods.
arXiv Detail & Related papers (2020-07-30T16:00:23Z)
- Neuromodulated Neural Architectures with Local Error Signals for Memory-Constrained Online Continual Learning [4.2903672492917755]
We develop a biologically-inspired lightweight neural network architecture that incorporates local learning and neuromodulation.
We demonstrate the efficacy of our approach in both single-task and continual learning settings.
arXiv Detail & Related papers (2020-07-16T07:41:23Z)
This list is automatically generated from the titles and abstracts of the papers on this site.