Can Continual Learning Improve Long-Tailed Recognition? Toward a Unified
Framework
- URL: http://arxiv.org/abs/2306.13275v1
- Date: Fri, 23 Jun 2023 03:05:33 GMT
- Title: Can Continual Learning Improve Long-Tailed Recognition? Toward a Unified
Framework
- Authors: Mahdiyar Molahasani, Michael Greenspan, Ali Etemad
- Abstract summary: Long-Tailed Recognition methods aim to accurately learn a dataset comprising both a larger Head set and a smaller Tail set.
We show that Continual Learning (CL) methods can effectively update the weights of the learner to learn the Tail without forgetting the Head.
We also assess the applicability of CL techniques on real-world data by exploring CL on the naturally imbalanced Caltech256 dataset.
- Score: 16.457778420360537
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The Long-Tailed Recognition (LTR) problem emerges in the context of learning
from highly imbalanced datasets, in which the number of samples among different
classes is heavily skewed. LTR methods aim to accurately learn a dataset
comprising both a larger Head set and a smaller Tail set. We propose a theorem
stating that, under the assumption of strong convexity of the loss function, the
weights of a learner trained on the full dataset lie within a bounded distance of
the weights of the same learner trained strictly on the Head. Next, we assert
that by treating the learning of the Head and Tail as two separate and
sequential steps, Continual Learning (CL) methods can effectively update the
weights of the learner to learn the Tail without forgetting the Head. First, we
validate our theoretical findings with various experiments on the toy MNIST-LT
dataset. We then evaluate the efficacy of several CL strategies on multiple
imbalanced variations of two standard LTR benchmarks (CIFAR100-LT and
CIFAR10-LT), and show that standard CL methods achieve strong performance gains
in comparison to baselines and approach solutions that have been tailor-made
for LTR. We also assess the applicability of CL techniques on real-world data
by exploring CL on the naturally imbalanced Caltech256 dataset and demonstrate
its superiority over state-of-the-art classifiers. Our work not only unifies
LTR and CL but also paves the way for leveraging advances in CL methods to
tackle the LTR challenge more effectively.
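
For intuition, the display below gives one generic form such a strong-convexity bound can take; it is an illustrative reconstruction from the abstract, not necessarily the paper's exact statement or constants. Write the full loss as a Head term plus a Tail term, L_full(w) = L_H(w) + L_T(w); let w_H minimize L_H, let w_F minimize L_full, and assume L_full is mu-strongly convex. Because the gradient of L_H vanishes at w_H,

    \[
      \mu \,\| w_F - w_H \| \;\le\; \| \nabla L_{\mathrm{full}}(w_H) \|
      \;=\; \| \nabla L_T(w_H) \|
      \quad\Longrightarrow\quad
      \| w_F - w_H \| \;\le\; \tfrac{1}{\mu}\, \| \nabla L_T(w_H) \|,
    \]

so the full-data solution stays within a bounded distance of the Head-only solution, with the gap controlled by how strongly the Tail gradient pulls at the Head weights.

The sketch below then makes the two-step Head-then-Tail recipe concrete, using Elastic Weight Consolidation (EWC) as one example of a standard CL method of the kind the paper evaluates; the random data, the small MLP, the train helper, and the penalty weight lam are illustrative assumptions, not the authors' exact setup.

    import torch
    import torch.nn as nn
    import torch.nn.functional as F
    from torch.utils.data import DataLoader, TensorDataset

    # Hypothetical Head/Tail splits: random stand-ins for the frequent and rare
    # classes of an imbalanced benchmark such as CIFAR100-LT.
    torch.manual_seed(0)
    head_x, head_y = torch.randn(2000, 64), torch.randint(0, 8, (2000,))
    tail_x, tail_y = torch.randn(200, 64), torch.randint(8, 10, (200,))

    model = nn.Sequential(nn.Linear(64, 128), nn.ReLU(), nn.Linear(128, 10))

    def train(loader, epochs, penalty=None):
        opt = torch.optim.SGD(model.parameters(), lr=0.01)
        for _ in range(epochs):
            for x, y in loader:
                loss = F.cross_entropy(model(x), y)
                if penalty is not None:
                    loss = loss + penalty()  # add the CL regularizer, if any
                opt.zero_grad()
                loss.backward()
                opt.step()

    # Step 1: learn the Head.
    head_loader = DataLoader(TensorDataset(head_x, head_y), batch_size=64, shuffle=True)
    train(head_loader, epochs=5)

    # Snapshot the Head solution and a diagonal Fisher estimate for EWC.
    anchor = {n: p.detach().clone() for n, p in model.named_parameters()}
    fisher = {n: torch.zeros_like(p) for n, p in model.named_parameters()}
    for x, y in head_loader:
        model.zero_grad()
        F.cross_entropy(model(x), y).backward()
        for n, p in model.named_parameters():
            fisher[n] += p.grad.detach() ** 2 / len(head_loader)

    def ewc_penalty(lam=100.0):
        # Quadratic pull toward the Head weights, weighted by parameter importance.
        return lam * sum(
            (fisher[n] * (p - anchor[n]) ** 2).sum() for n, p in model.named_parameters()
        )

    # Step 2: learn the Tail without forgetting the Head.
    tail_loader = DataLoader(TensorDataset(tail_x, tail_y), batch_size=32, shuffle=True)
    train(tail_loader, epochs=5, penalty=ewc_penalty)

Under this scheme, the EWC term keeps the parameters that matter for the Head close to their Head-trained values while the remaining capacity adapts to the Tail classes, which is the "learn the Tail without forgetting the Head" behaviour the abstract describes.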
Related papers
- ICL-TSVD: Bridging Theory and Practice in Continual Learning with Pre-trained Models [103.45785408116146]
Continual learning (CL) aims to train a model that can solve multiple tasks presented sequentially.
Recent CL approaches have achieved strong performance by leveraging large pre-trained models that generalize well to downstream tasks.
However, such methods lack theoretical guarantees, making them prone to unexpected failures.
We bridge this gap by integrating an empirically strong approach into a principled framework, designed to prevent forgetting.
arXiv Detail & Related papers (2024-10-01T12:58:37Z)
- What Makes CLIP More Robust to Long-Tailed Pre-Training Data? A Controlled Study for Transferable Insights [67.72413262980272]
Severe data imbalance naturally exists among web-scale vision-language datasets.
We find CLIP pre-trained thereupon exhibits notable robustness to the data imbalance compared to supervised learning.
The robustness and discriminability of CLIP improve with more descriptive language supervision, larger data scale, and broader open-world concepts.
arXiv Detail & Related papers (2024-05-31T17:57:24Z)
- Continual Learning on a Diet: Learning from Sparsely Labeled Streams Under Constrained Computation [123.4883806344334]
We study a realistic Continual Learning setting where learning algorithms are granted a restricted computational budget per time step while training.
We apply this setting to large-scale semi-supervised Continual Learning scenarios with sparse label rates.
Our extensive analysis and ablations demonstrate that DietCL remains stable across the full spectrum of label sparsity levels, computational budgets, and other ablation settings.
arXiv Detail & Related papers (2024-04-19T10:10:39Z)
- TRACE: A Comprehensive Benchmark for Continual Learning in Large Language Models [52.734140807634624]
Aligned large language models (LLMs) demonstrate exceptional capabilities in task-solving, following instructions, and ensuring safety.
Existing continual learning benchmarks lack sufficient challenge for leading aligned LLMs.
We introduce TRACE, a novel benchmark designed to evaluate continual learning in LLMs.
arXiv Detail & Related papers (2023-10-10T16:38:49Z)
- A dual-branch model with inter- and intra-branch contrastive loss for long-tailed recognition [7.225494453600985]
Models trained on long-tailed datasets have poor adaptability to tail classes and the decision boundaries are ambiguous.
We propose a simple yet effective model, named Dual-Branch Long-Tailed Recognition (DB-LTR), which includes an imbalanced learning branch and a Contrastive Learning Branch (CoLB).
CoLB can improve the capability of the model in adapting to tail classes and assist the imbalanced learning branch to learn a well-represented feature space and discriminative decision boundary.
arXiv Detail & Related papers (2023-09-28T03:31:11Z)
- Unbiased and Efficient Self-Supervised Incremental Contrastive Learning [31.763904668737304]
We propose a self-supervised Incremental Contrastive Learning (ICL) framework consisting of a novel Incremental InfoNCE (NCE-II) loss function.
ICL achieves up to 16.7x training speedup and 16.8x faster convergence with competitive results.
arXiv Detail & Related papers (2023-01-28T06:11:31Z)
- A Study of Continual Learning Methods for Q-Learning [78.6363825307044]
We present an empirical study on the use of continual learning (CL) methods in a reinforcement learning (RL) scenario.
Our results show that dedicated CL methods can significantly improve learning when compared to the baseline technique of "experience replay".
arXiv Detail & Related papers (2022-06-08T14:51:52Z)
- Continual Learning in Recurrent Neural Networks [67.05499844830231]
We evaluate the effectiveness of continual learning methods for processing sequential data with recurrent neural networks (RNNs).
We shed light on the particularities that arise when applying weight-importance methods, such as elastic weight consolidation, to RNNs.
We show that the performance of weight-importance methods is not directly affected by the length of the processed sequences, but rather by high working memory requirements.
arXiv Detail & Related papers (2020-06-22T10:05:12Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences arising from its use.